Big Data, big problems?
Big Data is here to stay, whether we like it or not. It has transformed the way businesses operate and continues to streamline everyday life for individuals. However, Big Data has big implications for privacy; this blog considers the highs and lows of Big Data and what the law is doing to intervene.
The term "Big Data" refers to the increasingly widespread practice of collecting, storing and analysing vast, complex data sets to extract value. Back in 2001, analyst Doug Laney first identified the three elements of Big Data: Volume, Velocity and Variety. Big Data offered new opportunities for business, most notably the creation of the digital advertising industry as we know it. Digital advertising relies on analysing data generated by individuals, such as browsing patterns, to identify appropriate targets for advertisements. For instance, people who search for kitchen knives often also buy recipe books – companies can use this information to spend their advertising budget more effectively. Of course, where Big Data really added value was in uncovering less obvious associations, allowing companies to capitalise where competitors did not.
Development of Big Data
Since the birth of Big Data, data production rates have risen exponentially; a mobile phone alone can log keystrokes and gather data through its camera, microphone, GPS and motion sensors. With this rapid generation, we've seen far more sophisticated use of Big Data, both by businesses and by individuals. The market for Big Data software and hardware is estimated to grow to £128bn in 2018.
One major area of development has been healthcare analytics, where data can be used to reduce the cost of treatment, predict outbreaks of epidemics and make early diagnoses. In Paris, a number of hospitals have trialled the use of data from a range of sources to predict patterns in admissions. This has allowed hospital management to distribute staff and resources more effectively, which has resulted in better patient care.
While the potential of Big Data is undeniably exciting, concerns about privacy cannot be ignored. Big Data does not necessarily comprise the kind of personal data we are naturally wary of sharing, such as addresses, dates of birth, race, religion or sexuality. Instead, it often consists of more nuanced information that we don't typically notice we are providing, but which amounts to personal data nonetheless.
On the face of it, Big Data preserves privacy by detaching information from individuals and repurposing it. However, by taking multiple anonymised data sets and triangulating them, you can begin to break down that anonymity. For instance, take information about all the journeys that people in London have taken over the past year from a taxi service. This data alone is not necessarily sensitive, but if you combine it with venue information and social media, you could conceivably make assumptions about an individual who, say, ended or began journeys at a known LGBT destination.
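To make the triangulation idea concrete, here is a minimal sketch in Python. Everything in it is invented for illustration: the rider IDs, usernames, venues, field names and the 15-minute matching window are all assumptions, and a real re-identification exercise would involve far larger and messier data sets. It simply shows how an "anonymised" journey record can be matched to a public check-in by place and time.

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M"

# Hypothetical "anonymised" taxi journeys: names replaced by opaque rider IDs.
journeys = [
    {"rider": "r1", "dropoff": "Venue A", "arrived": "2017-03-04 22:10"},
    {"rider": "r2", "dropoff": "Office B", "arrived": "2017-03-04 08:55"},
]

# Hypothetical public social media check-ins.
checkins = [
    {"user": "alice", "place": "Venue A", "posted": "2017-03-04 22:15"},
    {"user": "bob", "place": "Cafe C", "posted": "2017-03-04 09:00"},
]

def link_riders(journeys, checkins, window=timedelta(minutes=15)):
    """Guess which public username may sit behind each opaque rider ID."""
    matches = {}
    for j in journeys:
        arrived = datetime.strptime(j["arrived"], FMT)
        for c in checkins:
            posted = datetime.strptime(c["posted"], FMT)
            same_place = j["dropoff"] == c["place"]
            close_in_time = abs(posted - arrived) <= window
            if same_place and close_in_time:
                matches.setdefault(j["rider"], set()).add(c["user"])
    return matches

print(link_riders(journeys, checkins))
```

Neither data set names the rider on its own; it is only the combination that points to a person, which is why the anonymity of a single data set offers limited comfort.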
A real-life example of this invasion of privacy occurred in 2012 when it was revealed that Target, the US supermarket chain, used customers’ shopping habits to identify a list of products which, when analysed together, assigned each shopper a “pregnancy prediction” score. Purchases of unscented lotion, supplements and cotton wool balls indicated a shopper may be pregnant and Target were even able to predict a woman’s due date to within a small window, allowing them to send her coupons timed to specific stages of pregnancy. The system worked so well that Target were able to predict a teenager’s pregnancy and send coupons for baby clothes to her home before even her father was aware she was pregnant.
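The general idea of scoring a shopper from a basket of otherwise unremarkable products can be sketched in a few lines of Python. This is not Target's model, whose details are not public here: the weights below are invented, the threshold-free sum is an assumption, and the function name `prediction_score` is hypothetical. The product names are the ones mentioned above.

```python
# Hypothetical weights, invented for illustration only (not Target's model).
SIGNAL_WEIGHTS = {
    "unscented lotion": 30,
    "supplements": 30,
    "cotton wool balls": 20,
}

def prediction_score(basket):
    """Add up the weights of every signal product found in the basket."""
    return sum(weight for item, weight in SIGNAL_WEIGHTS.items() if item in basket)

# Each item alone says little; together they push the score up.
print(prediction_score({"unscented lotion", "supplements", "cotton wool balls", "bread"}))
print(prediction_score({"bread", "milk"}))
```

The point of the sketch is that no single purchase is sensitive, yet a simple combination of them can produce an inference that very much is.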
The General Data Protection Regulation (GDPR) will apply from 25 May 2018 and aims to strike a balance between protecting our privacy and encouraging valuable Big Data use. Businesses which currently make use of Big Data will be forced to review their systems and ensure they have the relevant consents and/or protections in place.
One of the principal focuses of the GDPR is strengthening the rights of data subjects. Under the GDPR, individuals will have the right not to be subject to decisions based solely on automated processing, including profiling, where those decisions have a legal or similarly significant effect on them, unless certain exceptions apply. The Regulation also introduces the concept of ‘pseudonymised’ data: data which cannot be attributed to a particular person without using additional information. Such data is still treated as ‘personal’ where it could be re-attributed to an individual, but pseudonymisation is recognised as a safeguard that reduces risk, provided the additional identifiers are stored separately and securely.
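For readers wondering what pseudonymisation might look like in practice, the following Python sketch shows one common approach: replacing a direct identifier with a keyed hash. It is an illustration under assumptions, not a statement of what the GDPR requires. The function name `pseudonymise`, the sample record and the key are all hypothetical; the secret key plays the role of the "additional information" that would need to be stored separately and securely.

```python
import hashlib
import hmac

def pseudonymise(identifier: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed hash (HMAC-SHA256).

    Without the secret key, the token cannot readily be tied back
    to the original identifier.
    """
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()

# Hypothetical key: in practice it would be kept apart from the data set.
SECRET_KEY = b"example-key-stored-elsewhere"

# Hypothetical record, invented for illustration.
record = {"name": "Jane Example", "journeys_last_year": 42}

pseudonymised_record = {
    "subject_token": pseudonymise(record["name"], SECRET_KEY),
    "journeys_last_year": record["journeys_last_year"],
}

print(pseudonymised_record)
```

The same name always produces the same token under the same key, so records can still be analysed together, but anyone holding only the pseudonymised data set lacks the key needed to reverse the substitution.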
The tension between protecting privacy and allowing Big Data to keep improving the way we live, without quashing innovation, is unlikely to be resolved easily. It remains to be seen how effective the GDPR will be in achieving either of these aims. One thing that’s certain is that our data production will not slow down and neither will the development of new ways to use it – legislators face an uphill battle to keep pace.