David Gibson, vice president at Varonis, on how the NHS can protect patients from data breaches
The effects of leaked personal health information can be exceptionally damaging, and a single breach of one individual’s privacy can thrust a healthcare institution, and the victim, into international headlines. In this article, David Gibson, vice president at Varonis, explores some of the necessary lessons of compliance and protection against internal and external data threats in the changing UK healthcare industry.
The UK’s NHS will soon require GPs to enter patient data into a centralised health database called care.data. After being anonymised, this clinical information will be used primarily for analysis of medical outcomes, as well as for drug research by academics and healthcare companies.
Large-scale collection and distribution of health data, even de-identified data, raises privacy concerns and, at a minimum, requires closer attention to compliance obligations under the UK’s existing Data Protection Act (DPA) and NHS rules.
In April this year, an advisory commission to the NHS approved the care.data programme to make medical data more widely available to the medical research community. There are obvious ‘big data’ benefits to analysing millions of health records, but this comes with considerable privacy implications. Fortunately, though, the NHS is not starting with a blank slate.
Since 1998, the UK’s DPA has required organisations to have security controls in place when collecting and processing consumer personal data - names, addresses, and other data that ‘relates to an individual’. The NHS falls under most, but not all, of the DPA’s security and personal privacy regulations. For example, patient data is protected and requires an explicit opt-in before it can be shared with others, but in cases of epidemics and other health emergencies the NHS can access and distribute patient data to investigators.
But when personal data is stripped from data sets, the DPA requirements no longer apply, and the data can be more openly used. This is the approach that the NHS has taken with its care.data repository. Allowing anonymised patient information to be made available makes great sense, in that researchers and others can publish results, papers, and important breakthroughs without compromising patient identities.
The DPA does have an exception for ‘sensitive’ data - which includes ethnicity, religion, political belief, and medical information. Even though it is not an identifier in the strict sense, sensitive data still needs to be treated with great care. It turns out that sensitive and other patient medical data raise more of an issue than the DPA regulations suggest.
Anonymisation falls short
Though the UK’s data protection regulations, along with those of other EU countries, set a high privacy standard, advocates have openly voiced their concerns about the Government’s plans to make patient medical data more widely available.
Ross Anderson, professor in security engineering at the University of Cambridge Computer Laboratory, notes that conventional ideas about identifiers in personal data are outdated: ‘Computer scientists realised about 30 years ago that protecting privacy using anonymity is a lot harder than it looks.’
Consumer data often contains information that was formerly considered inconsequential but that, because of public online forums, the expanding use of social media, and improved computing resources, can now be used as a new kind of personal identifier. The re-identification potential of this ‘grey’ personal data has been known for over a decade.
In a well-publicised incident in the late 1990s, Latanya Sweeney, then an MIT graduate student, managed to identify the medical records of the then governor of Massachusetts, William Weld, in ‘anonymous’ records released by the state’s Group Insurance Commission. By matching three quasi-identifiers in the records (zip code, full birthdate, and gender) against public voting rolls, Sweeney was able to recover Weld’s diagnosis and prescriptions.
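The matching described above can be illustrated with a minimal sketch. All records, names, and field names below are invented for illustration and are not taken from any real dataset; the only idea carried over from the account above is joining two tables on zip code, birthdate, and gender, and treating a match as a re-identification when exactly one named person shares those three values.

```python
from collections import defaultdict

# Hypothetical "anonymised" medical records: names removed, quasi-identifiers kept.
medical_records = [
    {"zip": "02138", "dob": "1950-01-15", "sex": "M", "diagnosis": "condition A"},
    {"zip": "02139", "dob": "1962-07-04", "sex": "F", "diagnosis": "condition B"},
    {"zip": "02139", "dob": "1962-07-04", "sex": "F", "diagnosis": "condition C"},
]

# Hypothetical public voter roll: names listed alongside the same three fields.
voter_roll = [
    {"name": "Voter One", "zip": "02138", "dob": "1950-01-15", "sex": "M"},
    {"name": "Voter Two", "zip": "02139", "dob": "1962-07-04", "sex": "F"},
    {"name": "Voter Three", "zip": "02139", "dob": "1962-07-04", "sex": "F"},
    {"name": "Voter Four", "zip": "02140", "dob": "1971-03-22", "sex": "M"},
]

QUASI_IDENTIFIERS = ("zip", "dob", "sex")


def quasi_key(row):
    """Return the tuple of quasi-identifier values used as the join key."""
    return tuple(row[field] for field in QUASI_IDENTIFIERS)


def link(records, roll):
    """Return (name, diagnosis) pairs where exactly one voter matches a record."""
    names_by_key = defaultdict(list)
    for voter in roll:
        names_by_key[quasi_key(voter)].append(voter["name"])

    matches = []
    for record in records:
        names = names_by_key.get(quasi_key(record), [])
        if len(names) == 1:  # an unambiguous match is a re-identification
            matches.append((names[0], record["diagnosis"]))
    return matches


reidentified = link(medical_records, voter_roll)
print(reidentified)
```

In this toy data only the first record is re-identified; the other two share their quasi-identifiers with two voters each, so the match stays ambiguous. That ambiguity is exactly what removing or coarsening quasi-identifiers is meant to create.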
Sweeney has also pointed out that ethnicity, considered sensitive data by the DPA, when coupled with location information can act as a quasi-identifier. More recently, researchers have shown it’s possible to re-identify anonymous web search histories and movie preferences, matching results to individuals.
The larger point is that the divide between what has historically been seen as personal data and what has been seen as anonymous data has become blurred since we started to share so much of it on social networking sites.
In the UK, this issue of re-identifying data came to a head in the 1990s when John Major's government built a database of hospital records with names removed, but with postcode and date of birth and other quasi-identifiers still present, so most patients were easy to identify.
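One common way to quantify how exposed such a dataset is, not mentioned in this article but widely known as k-anonymity, is to count how many records share each combination of quasi-identifiers. The sketch below uses invented postcodes and dates of birth, and the function names are hypothetical; a smallest group size of 1 means at least one patient is unique on those fields alone.

```python
from collections import Counter

# Hypothetical records with names removed but postcode and date of birth kept.
records = [
    {"postcode": "AB1 2CD", "dob": "1948-02-11"},
    {"postcode": "AB1 2CD", "dob": "1948-02-11"},
    {"postcode": "AB1 2CD", "dob": "1975-09-30"},
    {"postcode": "EF3 4GH", "dob": "1990-12-01"},
]

FIELDS = ("postcode", "dob")


def group_sizes(rows, fields):
    """Count how many rows share each combination of the given fields."""
    return Counter(tuple(row[f] for f in fields) for row in rows)


def smallest_group(rows, fields):
    """The 'k' in k-anonymity: size of the rarest combination."""
    return min(group_sizes(rows, fields).values())


def unique_rows(rows, fields):
    """Number of rows whose combination of fields appears exactly once."""
    return sum(1 for size in group_sizes(rows, fields).values() if size == 1)


k = smallest_group(records, FIELDS)
n_unique = unique_rows(records, FIELDS)
print(f"k = {k}, unique records = {n_unique} of {len(records)}")
```

Here two of the four records are unique on postcode and date of birth, so removing names alone would not protect those patients.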
After the British Medical Association (BMA) objected, a committee led by Dame Fiona Caldicott was established in 1997 to look into the problem.
One of the results of the Caldicott committee was a set of principles focused on limiting medical data collection, usage, and retention. We may view them today as ‘privacy by design’, an important concept in current data security research.
The Caldicott principles, coupled with DPA rules, are still important, and we have Dame Caldicott to thank for the ‘Caldicott guardians’ who oversee patient data. However, they don’t directly address the re-identification and other privacy issues now posed by the NHS’s centralised databases.
More recently, Dame Caldicott was commissioned by the Chief Medical Officer (CMO) to investigate how patient information should be used in the new NHS data collection and sharing system. The report, Information: To Share or Not to Share, was made public in March. Besides updating the original Caldicott principles and making new recommendations on breach notifications, the report directly addresses quasi-identifiers, which it describes as a ‘grey area’ of data identifiers.
The report’s recommendations specifically call for setting up separate data ‘safe haven’ environments in which quasi-identifiers can be reviewed and processed by outside researchers. The report goes further than the DPA, and explicitly requires a ‘published register of data flowing into or out of the safe haven’, appropriate techniques for ‘re-identification risk management’, ‘use of role-based access controls’, and ‘auditable standards’. In effect, the safe havens act as a kind of data clean room, where data that carries a risk of re-identification is segregated from the other patient data, with controls and governance in place over its usage.
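Two of the safe-haven requirements quoted above, role-based access controls and a register of data flows, can be sketched together. This is an illustrative assumption about how such controls might be wired up, not the report’s or the NHS’s design; the role names, permission names, dataset name, and the `SafeHaven` class are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical mapping of roles to the actions each role may perform.
ROLE_PERMISSIONS = {
    "safe_haven_analyst": {"read_quasi_identifiers", "read_deidentified"},
    "external_researcher": {"read_deidentified"},
}


@dataclass
class SafeHaven:
    """Grants or denies requests by role and records every decision."""

    register: list = field(default_factory=list)

    def request(self, user, role, action, dataset):
        # Unknown roles get an empty permission set, so they are denied.
        allowed = action in ROLE_PERMISSIONS.get(role, set())
        # Every request is logged, whether allowed or not, so the
        # register can be audited later.
        self.register.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "role": role,
            "action": action,
            "dataset": dataset,
            "allowed": allowed,
        })
        return allowed


haven = SafeHaven()
analyst_ok = haven.request(
    "alice", "safe_haven_analyst", "read_quasi_identifiers", "hospital_episodes")
researcher_ok = haven.request(
    "bob", "external_researcher", "read_quasi_identifiers", "hospital_episodes")
print(analyst_ok, researcher_ok, len(haven.register))
```

The design choice worth noting is that denied requests are written to the register as well; an audit trail that records only successful access cannot show who tried to reach data they were not entitled to.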
The report doesn’t supersede the DPA’s regulations: GPs and hospitals are still considered ‘data controllers’ and are responsible for protecting patient personal data. The Caldicott report does, however, outline in great detail an approach to sharing data between medical providers and the NHS that protects sensitive data and other quasi-identifiers while still enabling medical researchers to share and publish their work.
Even with the positive changes to data protection and privacy brought about by the original Caldicott committee and the DPA’s overall security regulations, in the 12 months to the end of June 2012, 186 serious data breaches were notified to the Department of Health. Caldicott points out that these were all about data losses and breaches of the Data Protection Act, not sharing.
Data breaches, however, continue to occur. A new survey by the Health Care Compliance Association and the Society of Corporate Compliance and Ethics reveals that 59% of those surveyed had an incident in the past year and that 37% experienced multiple breach incidents: 17% of respondents reported two or three incidents, while 20% reported four or more.
Even with new Caldicott recommended controls, there’s simply no fool-proof system. With the wider sharing of medical data in the UK, there will be opportunities for data to be hacked, misused, misplaced, and accessed by unauthorised users.
So what ultimately leads to breaches, and how can they be avoided?
There are a few lessons to be learned from medical information security cases. For example, in incidents of unauthorised access where the same logins were used multiple times, better file-level auditing and alerting might have detected the activity much earlier, rather than allowing three weeks of unlimited access. At least one such breach might have been prevented with a combination of policy and technology that restricted access to certain users and/or certain devices.
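As a rough illustration of the kind of file-level auditing and alerting described above, the sketch below scans an access log and flags any account that touches files more often than a threshold within a sliding time window. The log format, the threshold of three accesses per hour, and the function name are assumptions made for this example, not a description of any particular product.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def find_alerts(events, max_accesses=3, window=timedelta(hours=1)):
    """Return the users with more than max_accesses events inside any window."""
    times_by_user = defaultdict(list)
    for event in sorted(events, key=lambda e: e["time"]):
        times_by_user[event["user"]].append(event["time"])

    alerts = set()
    for user, times in times_by_user.items():
        start = 0
        for end in range(len(times)):
            # Shrink the window from the left until it spans at most `window`.
            while times[end] - times[start] > window:
                start += 1
            if end - start + 1 > max_accesses:
                alerts.add(user)
                break
    return alerts


# Hypothetical audit log: u1 opens a file five times in eight minutes,
# u2 opens it twice, three hours apart.
base = datetime(2013, 1, 1, 9, 0)
events = [
    {"user": "u1", "file": "records.csv", "time": base + timedelta(minutes=2 * i)}
    for i in range(5)
]
events += [
    {"user": "u2", "file": "records.csv", "time": base},
    {"user": "u2", "file": "records.csv", "time": base + timedelta(hours=3)},
]

alerts = find_alerts(events)
print(alerts)
```

A real deployment would tune the threshold to each role’s normal behaviour and would also alert on accesses from unapproved devices, but even a simple rule like this one turns weeks of unnoticed access into something flagged on the first day.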
There are a few questions you may want to ask when assessing your liabilities with respect to medical data:
If there are any of the questions above you cannot answer, you may have some deficiencies with respect to data governance. However, don’t be too surprised. Data governance is one of the biggest initiatives in IT today to bolster data defences. Data governance encompasses the people, processes, and procedures to create a consistent, enterprise-wide view of a company’s data, to increase consistency in entitlement decision making, decrease the risk of data misuse, and improve data security.
The business case for data governance is clear. If you can ensure that the right people have access to only the data they need to perform their jobs, that all use is monitored, and that abuse is flagged, you will improve security, reduce risk, and take a significant step toward sustainable access compliance. While this may mean we have to cross some bureaucratic bridges, as long as we play our part, the eventual goal of preventing healthcare information getting into the wrong hands can be achieved.