Capital One Data Breach: A Step-by-Step Analysis
On July 29, 2019, Capital One announced that more than 100 million individuals in the US and Canada were affected by a data breach. According to Capital One, ‘the information included personal information Capital One routinely collects at the time it receives credit card applications, including names, addresses, post codes, phone numbers, email addresses, dates of birth, and self-reported income’.
What we know is largely based on our review of a federal criminal complaint filed against the alleged hacker known as ‘Erratic’. According to the complaint, Erratic is alleged to have broken into a Capital One server running on Amazon Web Services (‘AWS’) because of a firewall misconfiguration. As a result, the hacker obtained privileges to further access and exfiltrate data.
Fortunately, Capital One learned of the breach when it was notified that the alleged hacker had posted the exfiltrated data on GitHub. At the same time, she had been bragging about her exploits on social media, describing in some detail how she accessed and exfiltrated the data.
According to Erratic, she successfully circumvented multiple security controls that could otherwise have prevented her unauthorised access. We’ve reviewed the information that’s publicly available and offer the following analysis:
The Firewall Was Misconfigured
The criminal complaint says ‘a firewall misconfiguration permitted commands to reach and be executed by the server’. We believe the misconfigured firewall rules allowed the server to receive traffic, such as SSH (which lets users interact with the server remotely), directly from the Internet. If the AWS-managed firewall (i.e., security groups) was misconfigured, the monitoring tools AWS provides, had they been set up correctly, should have alerted Capital One to the problem. It’s not clear whether Capital One used those tools, whether it received such alerts or, if it did, why it failed to act.
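To illustrate the kind of check that can catch such a misconfiguration, here is a minimal Python sketch that scans security group rules for administrative ports open to the whole Internet. It is not a description of any AWS tool or of Capital One's setup. The function name `find_world_open_admin_ports` and the choice of ports 22 and 3389 are our own assumptions; the input is assumed to follow the shape of the `SecurityGroups` list returned by the EC2 `DescribeSecurityGroups` API.

```python
from typing import Any, Dict, List, Tuple

# Assumption for this sketch: SSH (22) and RDP (3389) count as administrative ports.
ADMIN_PORTS = (22, 3389)
# CIDR blocks that mean "anyone on the Internet" for IPv4 and IPv6.
WORLD_CIDRS = {"0.0.0.0/0", "::/0"}


def find_world_open_admin_ports(
    security_groups: List[Dict[str, Any]]
) -> List[Tuple[str, int, str]]:
    """Return (group_id, port, cidr) for each admin port reachable from anywhere."""
    findings = []
    for group in security_groups:
        for rule in group.get("IpPermissions", []):
            cidrs = [r.get("CidrIp") for r in rule.get("IpRanges", [])]
            cidrs += [r.get("CidrIpv6") for r in rule.get("Ipv6Ranges", [])]
            world = [c for c in cidrs if c in WORLD_CIDRS]
            if not world:
                continue
            protocol = rule.get("IpProtocol")
            if protocol == "-1":
                # "-1" means all protocols and ports, so every admin port is exposed.
                exposed = list(ADMIN_PORTS)
            elif protocol == "tcp":
                low, high = rule.get("FromPort"), rule.get("ToPort")
                if low is None or high is None:
                    continue
                exposed = [p for p in ADMIN_PORTS if low <= p <= high]
            else:
                continue
            for cidr in world:
                for port in exposed:
                    findings.append((group["GroupId"], port, cidr))
    return findings


# Hypothetical example data, not taken from any real account.
example_groups = [
    {"GroupId": "sg-a", "IpPermissions": [
        {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}], "Ipv6Ranges": []}]},
    {"GroupId": "sg-b", "IpPermissions": [
        {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
         "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}]},
    {"GroupId": "sg-c", "IpPermissions": [
        {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
         "IpRanges": [{"CidrIp": "10.0.0.0/8"}]}]},
]
print(find_world_open_admin_ports(example_groups))
```

In a real account the input could come from boto3's `describe_security_groups()` call, and a check like this would more likely run as a scheduled job or a managed configuration rule than as an ad hoc script.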
Capital One’s Server Was Vulnerable Without Proper Hardening or Patching
Even if the firewall was misconfigured, that fact alone should not have enabled the server to be compromised. However, the combination of a misconfigured firewall and the failure to properly configure, promptly patch and secure the server could have enabled a hacker to access the data.
Did Capital One Violate the Least Privilege Principle?
The compromised server appears to have had identity and access management (‘IAM’) roles assigned that allowed anyone with shell access on the server to access AWS resources, such as the S3 bucket where the Capital One data appears to have been stored, without any further user ID and password authentication. This feature is not a problem in itself, as it lets servers access AWS resources without storing credentials on the server. However, it’s essential to take a least privilege approach so that a role, and anyone acting through it, cannot access information it doesn’t need. Based on Erratic’s social media postings, our assessment is that she had access that would have enabled her to abuse SSM (AWS Systems Manager, an operations automation tool that can install software on managed EC2 instances), giving her the ability to deploy software on other EC2 servers and launch new instances.
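As a rough illustration of what a least privilege review looks for, the following Python sketch flags `Allow` statements in an IAM policy document that use wildcard actions or a wildcard resource. The function name `find_broad_statements`, the bucket name and the example policy are hypothetical and are not taken from the complaint; the policy layout follows the standard IAM JSON policy format, where `Statement`, `Action` and `Resource` may each be a single value or a list.

```python
def _as_list(value):
    """IAM allows a single value or a list; normalise to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def find_broad_statements(policy):
    """Return (statement_index, reasons) for Allow statements that use wildcards."""
    findings = []
    for index, statement in enumerate(_as_list(policy.get("Statement"))):
        if statement.get("Effect") != "Allow":
            continue  # Deny statements restrict access, so they are not flagged.
        reasons = []
        for action in _as_list(statement.get("Action")):
            # "*" grants everything; "service:*" grants every action in a service.
            if action == "*" or action.endswith(":*"):
                reasons.append("wildcard action " + action)
        if "*" in _as_list(statement.get("Resource")):
            reasons.append("wildcard resource *")
        if reasons:
            findings.append((index, reasons))
    return findings


# Hypothetical policy attached to a server role, for illustration only.
example_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "s3:GetObject",
         "Resource": "arn:aws:s3:::example-app-bucket/uploads/*"},
        {"Effect": "Allow", "Action": ["s3:*", "ssm:SendCommand"],
         "Resource": "*"},
        {"Effect": "Deny", "Action": "*", "Resource": "*"},
    ],
}
print(find_broad_statements(example_policy))
```

The first statement is scoped to one action on one bucket prefix and passes; the second would let anyone with shell access on the server reach every bucket and is flagged. A wildcard is not always wrong, so findings like these are prompts for review rather than proof of a violation.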
Why Did it Take Months to Detect?
The attack started in March and remained undetected until July, when the alleged hacker’s bragging about her exploits on social media was reported to Capital One. According to the criminal complaint, the connection was initiated from Tor (an anonymity network). Because the hacker apparently accessed AWS from Tor, the AWS tools GuardDuty and Macie, if enabled, could have flagged the connection and led to early detection of the breach. It’s also possible those tools were enabled but the alerts were overlooked. In our view, automated security incident detection and diligent triage and handling of security alerts are essential to mitigate the risk of unauthorised access to AWS resources.
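The idea behind this kind of detection can be sketched in a few lines of Python: compare the source address of each API call against a list of known Tor exit nodes. This is only an illustration of the principle, not how GuardDuty or Macie is implemented. The function name `flag_tor_events` is our own, the addresses are reserved documentation addresses rather than real Tor exits, and the events are assumed to carry the `sourceIPAddress` and `eventName` fields found in CloudTrail records.

```python
import ipaddress


def flag_tor_events(events, tor_exit_ips):
    """Return the events whose source address is in the given Tor exit list."""
    exits = {ipaddress.ip_address(ip) for ip in tor_exit_ips}
    flagged = []
    for event in events:
        source = event.get("sourceIPAddress", "")
        try:
            address = ipaddress.ip_address(source)
        except ValueError:
            # Calls made by AWS services carry a host name here, not an IP address.
            continue
        if address in exits:
            flagged.append(event)
    return flagged


# Hypothetical events and exit list, using reserved documentation addresses.
example_events = [
    {"eventName": "ListBuckets", "sourceIPAddress": "203.0.113.7"},
    {"eventName": "GetObject", "sourceIPAddress": "198.51.100.20"},
    {"eventName": "AssumeRole", "sourceIPAddress": "ec2.amazonaws.com"},
]
example_tor_exits = ["203.0.113.7"]

for event in flag_tor_events(example_events, example_tor_exits):
    print("ALERT: API call from Tor exit node:", event["eventName"])
```

Matching addresses is the easy part. As the timeline above suggests, the harder part is making sure someone sees the alert and acts on it.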
Encryption is the Last Line of Defense
Although the information of more than 100 million individuals was affected, according to Capital One only 140,000 Social Security numbers and 80,000 bank account numbers were disclosed. It is likely that the impact of the breach was mitigated because Capital One employed some level of encryption in its services. In our view, had all the data been properly encrypted, none of it would have been compromised even though all other controls failed.
What Roostify Does to Protect Our Customers’ Data
At Roostify, we employ the following controls to mitigate, to the maximum extent practicable, the risk of an event like this:
We use a least privilege approach to ensure that users have access only to the services necessary to do their job.
We use AWS GuardDuty, Macie and other security detection tools, and monitor any alarms in real time.
We use tools such as AWS Config and CloudWatch to monitor for possible misconfigurations in security groups (the AWS-managed firewall) as well as any security group changes.
We harden and scan AMIs (Amazon Machine Images) before deploying an instance to ensure server configuration consistency and security.
We use application-level encryption to encrypt all personal information before it is pushed to an S3 bucket or Aurora database and, at a customer’s request, will keep the master encryption key in AWS CloudHSM, a cloud-based hardware security module. In this way, even in the unlikely event that S3 files or database tables were exfiltrated, the data would remain inaccessible without access to the master encryption key.
We employ additional security tools and measures that are not publicly disclosed except under an NDA.