Is your data really yours?

Photo by Markus Spiske on Unsplash

In the early days of the internet, services were siloed: Flickr provided photo hosting, Hotmail provided email, and Yahoo provided email and news. As the internet matured, providers started offering more services. Microsoft, which was already offering news, acquired Hotmail and included it in its MSN suite. Later, Yahoo acquired Flickr to add to its suite of products. This is not an exhaustive list, but the idea was for big corporations to offer a suite of products to their customers. Once a customer was on-boarded to one service in the suite, on-boarding them to the other products was a matter of marketing. Cybersecurity was not yet a big concern, and data collection had only just started.

Single Sign-on

As the acquisitions continued, it became difficult for users to manage multiple accounts and for platform owners to track who was using their services. To address this, single sign-on solutions started to crop up. With single sign-on, users had to remember just one user ID and password, and once logged on they could visit all the other properties on the platform. What users did not realize was that although single sign-on made their lives easier, they would end up losing a lot more in return. How so? Read on.

Data is the new soil, not oil

That single sign-on capability made it easy for platform owners to track logged-on users’ activities across the platform. Once the platform owners collected all the logs and fed them through an event logging engine, they could track anything the user did. This information can be used for any number of purposes, further reinforcing the “data is the new soil” paradigm. What do I mean by that?
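
As a rough illustration of why a single login makes this kind of tracking easy, the sketch below groups log events from several services by one shared user ID. The service names, the event fields, and the build_activity_timeline function are all hypothetical and invented for this example; this is not any platform's actual logging pipeline.

```python
from collections import defaultdict

# Hypothetical log events from several services on one platform.
# Every field name and value here is made up for illustration.
EVENTS = [
    {"user_id": "u42", "service": "mail", "action": "opened_inbox", "ts": "2019-03-01T08:05:00"},
    {"user_id": "u42", "service": "video", "action": "watched_clip", "ts": "2019-03-01T12:30:00"},
    {"user_id": "u77", "service": "news", "action": "read_article", "ts": "2019-03-01T09:00:00"},
    {"user_id": "u42", "service": "maps", "action": "searched_route", "ts": "2019-03-01T07:45:00"},
]


def build_activity_timeline(events):
    """Group events by user ID and sort each user's events by timestamp."""
    timelines = defaultdict(list)
    for event in events:
        timelines[event["user_id"]].append(event)
    for user_events in timelines.values():
        # ISO 8601 timestamps in the same format sort correctly as strings.
        user_events.sort(key=lambda e: e["ts"])
    return dict(timelines)


timelines = build_activity_timeline(EVENTS)
for event in timelines["u42"]:
    print(event["ts"], event["service"], event["action"])
```

Because every service records the same user ID, no guesswork is needed to stitch one person's day together; without a shared login, the platform would first have to work out which accounts belong to the same person.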

Once the data has been collected by the platform owner, they can harvest all kinds of information from it, and if they sell the data on, the new owners can extract yet more information from it. As data gets collected, data mining companies can correlate it with other data elements to get richer information. Let’s say one data set has a user’s email address, name, and phone number. Another source might have the user’s email address, name, and mailing address. Once the two are correlated, the information about the user has become richer. This entity can then sell the data onward, and the new owner can correlate it with the data they already hold and make it richer still. The cycle does not end, because we leave log data every time we use the internet; this data can also be called breadcrumbs. When someone has enough of these breadcrumbs, they have pretty much reconstructed a digital version of you.
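
The email, phone, and mailing address example above can be sketched in a few lines. This is a minimal illustration that assumes the two data sets share an email address as the common key; the records, field names, and the correlate_by_email function are invented for this example and do not describe any real data broker's method.

```python
# Two hypothetical data sets that share one field: the email address.
# All records below are invented for illustration.
SOURCE_A = [
    {"email": "john@example.com", "name": "John Doe", "phone": "555-0100"},
    {"email": "jane@example.com", "name": "Jane Roe", "phone": "555-0101"},
]
SOURCE_B = [
    {"email": "John@Example.com", "name": "John Doe", "mailing_address": "1 Main St"},
]


def correlate_by_email(*sources):
    """Merge records from several sources into one profile per email."""
    profiles = {}
    for source in sources:
        for record in source:
            # Normalize the key so "John@Example.com" matches "john@example.com".
            key = record["email"].strip().lower()
            profile = profiles.setdefault(key, {})
            for field, value in record.items():
                if field == "email":
                    profile["email"] = key
                else:
                    # Keep the first value seen; later sources only add new fields.
                    profile.setdefault(field, value)
    return profiles


profiles = correlate_by_email(SOURCE_A, SOURCE_B)
print(profiles["john@example.com"])
```

Neither source alone held John's phone number and mailing address together, but the merged profile does. Each additional source that shares a key adds more fields in the same way.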

Platforms as data aggregators

If you are using the platforms offered by the likes of GAFA (Google, Apple, Facebook, and Amazon), your data is already centralized. While using these platforms, users create data that stays in the platform ecosystem, enriching the data that is already there. This data gets even richer as data from other sources is amalgamated. Let’s pick on one of the GAFA companies and see what data they might have on you.


Google has a lot of platforms in its ecosystem, the most common being:

  • Gmail
  • Google News
  • Google Voice
  • YouTube
  • Drive
  • Photos
  • Maps
  • Android
  • Google Play Store

As you go about your normal daily activities, Google is analyzing all of this data:

  • Where you are located
  • Who sends you emails
  • What kind of emails you receive
  • The contents of those emails
  • Your video consumption patterns: what you watch, when you watch, and the comments you make
  • What kind of files you store on Google Drive and their contents
  • Photos can already identify a dog, cat, baby, etc. Once you tag folks, they know exactly what John looks like, and where the picture was taken
  • If you are on mobile and your location is turned on, all your travels are being tracked
  • If you use Google Voice, they already know what you sound like, and if you have your device configured to listen for “Ok Google”, it is always listening and activates when it hears that key phrase


Facebook is not far behind in terms of data collection; some of its platforms are:

  • Facebook
  • Messenger
  • Instagram
  • Pages
  • Groups
  • WhatsApp

While Messenger and WhatsApp appear to provide end-to-end encryption, Facebook is still able to identify the parties exchanging messages. As with Google, users on Facebook upload their photos, tag them, check in at locations, create posts, and tag folks as friends, spouses, brothers, sisters, cousins, and so on. The information that Facebook has is arguably richer than what Google has. It is just a matter of how it is mined.

When a user starts consuming products from an ecosystem owned by one entity, the user, in spite of getting free services, is the biggest loser. This is because as we use the platform in our daily lives, we keep adding to our existing data and, as a result, making it richer.

Data Rush

Remember the Gold Rush? I have coined the term “Data Rush” to draw out the similarity: everyone is after your data, just as folks back then were after gold. Gold made people rich, but once they exchanged it for money it was gone, and once they spent the money they were left with nothing. Data is different. Big organizations that hold data about you don’t “give” it to their clients and end up with nothing; they share certain attributes about you while still holding the original data. Once an organization has collected user data, it keeps on giving like a high-interest savings account, and users keep adding to it as they create more data by using the platforms in the ecosystem.

“Data is the new oil” was a common phrase throughout 2017 and 2018, but the comparison does not hold up: oil is gone once it is consumed, whereas data keeps on giving. Hence the more appropriate phrase is “data is the new soil”. As long as you have the soil, you till it (data mining) and it keeps producing crops (more information about data subjects after correlation).


Leaving too much data about ourselves with vendors makes us susceptible to data breaches, which end up impacting our privacy. The Equifax breach in 2017 exposed the data of roughly 147.9 million consumers in the US and Canada, the vast majority of them in the US. This data was passed on to Equifax by the organizations that we chose to do business with.

“Laws related to credit reporting give us rights to our credit information if it’s reported. But there’s no legislation requiring lenders to take the step of reporting in the first place.”

We are responsible for our data, so we have to think twice about what we share and who we share it with. If in doubt, ask.

Cyber Resilience

Resilience is the ability of a system to recover from failures. Cyber resilience is the ability of an organization and its users to recover from cyber attacks, which are ultimately about data. As we share more data, we make ourselves more susceptible to identity theft. There are two ways to address this risk. One, very drastic, is to stop using electronic media altogether, which is not realistic in this day and age. The better approach is to minimize our digital footprint.

We will discuss ways of minimizing our digital footprint in the next post.