Recent revelations that phone firms have been selling data on their million-plus users to public and private organisations, and that the Bloomberg News unit was using its data extraction technology to tap confidential information that Wall Street traders rely on every day, confirm that data breaches have become a regular phenomenon. They also reinforce two thoughts: that data is the new oil, and that big data science is not working.
But before we look at why data is really the new oil, let’s quickly recap: is abusing big data the only way for companies to gain a competitive advantage, or are there other ways too? The answer is that there are many fair practices that companies can use to obtain customer data, such as offering free services like unlimited mobile data, running free content generation networks (Facebook, Twitter, YouTube) or using relatively cheap or free content aggregation tools (Digg, Hootsuite). Cloud computing, combined with open source data processing tools such as Hadoop and NoSQL databases or with social media data aggregators such as Gnip and Datasift, has made it relatively easy for businesses to collect all this information and then use it for decision making.
However, despite all these advances, we hear almost every other day that companies are violating data protection laws or using unlawful techniques to obtain, buy or sell data. This made us wonder: is it sheer greed, or is something missing from big data science that pushes businesses towards an unlawful route?
We have come up with six main reasons that cause or tempt companies to breach data privacy, i.e. to go beyond normal big data science practices in order to gain a greater competitive advantage.
1. Data within dark social media cannot be accessed legally
Despite the many open social media networks, 80% of communication still takes place via email, SMS or private messaging apps, and this is information that companies cannot legitimately access. This tempts them to look for ways of using technology, such as buying cookies or spying on these dark social media tools, to get hold of that information.
2. Inability to decode intuition or customer buying intent via big data analysis
Many studies suggest that 60% of the time we buy on impulse, that 80% of the time we follow word of mouth, or else that we simply Google before we buy something. These studies also suggest that existing technologies cannot decode sentiment from the data we have. This means that, despite having all this social interaction, engagement and sentiment data, companies are continuously looking for real data on consumer buying behaviour, and in practice this data sits either with Google or within individual companies’ systems. Big companies keep this valuable data in centralised data storage centres which, despite both digital and physical security, are always prone to leaks or to break-in attempts by competitors; this is particularly true for data centres located in Far Eastern countries.
3. ROI is very low on advertising or subscription models for free data services
We all know why mobile phone companies offer unlimited data packages and why websites such as Facebook, Twitter or Google give away free space to upload images, text, video and audio: the aim is to build an advertising or subscription revenue model on the back of the huge amounts of data collected through these services. However, apart from Google, very few companies make a substantial amount of money from these models, so they are forced to look at other avenues to monetise their data. The most obvious route is licensing or selling data, which is very prone to breaches of data law, as in the case of EE trying to make some money by selling its mobile user data.
4. Hacking can still tempt corporates to get big data illegally
Hacking is no longer the hobby of technologists who like to break into robust systems to prove their technical superiority. It has become a business in which even large corporations have broken into their competitors’ data systems to obtain exclusive information and thereby gain a competitive advantage. The recent hacking of the Wall Street trading system by the Bloomberg News unit, whether done knowingly or unknowingly, could be categorised as this kind of hacking.
5. The expense of data scientists and processing forces companies to buy cheap data without a sanity check
The abundance of information has created a new breed of data scientists and managed service companies who charge businesses a fortune to collect, process and present information for competitive benefit. Few companies can afford this and, even when they can, it is a time-consuming process. Businesses can take a shorter route by buying in data, but in the process they usually forget to run a sanity check on the source of the information.
6. Inconsistent and loosely coupled data privacy laws create loopholes for data breaches
In practical terms, despite the many regulatory bodies, there are no consistent worldwide data privacy and protection laws. For example, Google, Microsoft and many other big companies currently have issues with the European Commission under EU data privacy laws, yet these same companies run their business in the rest of the world using the same data privacy policies. As another example, in the USA every breach of an organisation’s systems must be reported to the authorities and made public, whereas in countries like India this is not mandatory. In other words, inconsistent laws around the globe mean that companies can be tempted either to obtain or to block competitive data from different parts of the world.
Overall, the main reasons for big data privacy breaches are the continuous advance of technology (hacking); a lack of consistent laws internationally; the inability to extract and contextualise big data; and the failure to find a way to monetise data collected via free services.
In this era of open source, crowdfunding and cloud storage, I don’t think that the situation is going to improve very soon. Closing the loop is tough for all the above reasons and, for the same reasons, I and many others are forced to accept that data is the new oil, causing mayhem everywhere, from Wall Street to top websites, from the banking industry to public services.