Data as a liability
Software is eating the world and pooping out data. These days, everything we do leaves a trail of data, and that data has proven to be quite valuable to businesses big and small. So much so that it has become natural for business leaders to want to collect as much data as they can from their users, even if there is no immediate need for that data. I can hear the conversation in the meeting room now...
Engineering: What fields should we collect on the signup page other than an email address?
Marketing: Get their full contact information, family tree, medical history, and school transcripts.
Security: That is an awful lot of personal information to collect. What do we need it all for?
Marketing: We aren't sure yet, but it will probably be valuable some day.
Engineering: If we think we might need it, we should go ahead and design the system to collect the information now. It will be easier than modifying the system to collect the data in the future.
Security: We really should be more mindful about the data that we collect and retain.
VP: Everybody knows how valuable data is these days, right? The more we collect, the more it will be worth! Besides, I agree that we should collect it up front while we have the chance and eliminate the cost of modifying the system later to collect data that we didn't know we would need. Let's move forward with Marketing's suggestion. Thank you, everyone.
My grandparents were hoarders. Legit, put them on the TV show, hoarders. A common refrain when questioned about the need to keep something was that someone might use it or that it might become valuable... someday. With the exception of very few items, someday never came. What happened instead is that they created a severe health, safety, and fire hazard in their home, and the house fell into disrepair because they could not access what needed to be fixed due to the mountains of junk. Those things that might be valuable... someday... turned into a giant liability.
The data that a business collects can be a double-edged sword. The right data is valuable, as it is necessary to operate the business and provide goods or services to customers, but the wrong data can be a liability. By being mindful of the data that we collect, and of when and why we collect it, we can avoid becoming data hoarders, keeping mountains of customer data in hopes that it will be valuable... someday.
Before deciding whether or not to collect data from users, business leaders can perform some analysis to determine if that data will be an asset or a liability. While the formula itself is straightforward, calculating the values to feed into it is not. The calculations will require the involvement of multiple functions across the business if the exercise is to be useful.
  Potential business value of the proposed data
- Incremental cost to retain and protect the data
- Incremental expected cost of a breach of the data
= Residual value of the proposed data
Potential business value of the proposed data
The business needs to take an honest look at why it needs to collect the data, how the data will be used, and what its true potential business value will be. A concerted effort should be made to place a realistic estimate on the value of the data, taking into account the market and business strategy of the organization, as well as the competitive and cooperative landscape in the industry. The sales, marketing, and business development functions can help to bring this estimate into focus.
Incremental cost to retain and protect the data
The business must understand the additional costs to collect and retain the data. Incremental is an important word here. There will be certain fixed overhead costs associated with things like IT infrastructure and operations that will not necessarily change based upon the decision to collect and store additional data. Only the variable costs are relevant to the decision. These variable costs could involve additional storage, new IT systems, licenses, people, backups, bandwidth, insurance, subscriptions, audits, and so on. Regulatory and compliance requirements will inform these costs. Under new data privacy laws that are emerging, businesses must also consider the costs associated with processing data subject requests from their users. The IT, security, legal, and finance teams should be able to help estimate these costs.
Incremental expected cost of a breach of the data
In finance, expected gains or losses are used to inform decisions about whether or not to make an investment. We can use the same concept to determine the expected losses associated with a breach by multiplying the estimated statistical probability of a breach by the potential loss associated with that breach. As before, only the incremental difference associated with a breach of the proposed data is relevant to the decision. For instance, the damage to brand reputation may be consistent regardless of whether this additional data is included. On the other hand, legal consequences such as fines may increase substantially. The finance, legal, risk management, and security teams can help to estimate these costs.
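To make the arithmetic concrete, here is a minimal Python sketch of the expected loss calculation. The function names and every number in it are hypothetical, chosen only for illustration; real probabilities and loss estimates would have to come from the teams mentioned above.

```python
def expected_loss(probability: float, loss: float) -> float:
    """Expected loss: probability of a breach times the loss if it occurs."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return probability * loss


def incremental_expected_breach_cost(
    probability_without: float,
    loss_without: float,
    probability_with: float,
    loss_with: float,
) -> float:
    """Expected breach cost attributable only to the proposed data."""
    return expected_loss(probability_with, loss_with) - expected_loss(
        probability_without, loss_without
    )


# Hypothetical inputs: a 5% chance of a breach either way, with the potential
# loss growing from 2,000,000 to 3,500,000 once the new data is stored.
incremental = incremental_expected_breach_cost(0.05, 2_000_000, 0.05, 3_500_000)
print(f"Incremental expected breach cost: {incremental:,.0f}")
```

With these made-up inputs, the incremental expected cost comes to roughly 75,000. The sketch lets the probability differ between the two scenarios as well, since holding more sensitive data could plausibly make the business a more attractive target.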
Residual value of the proposed data
The residual value of the proposed data is what remains. It is the net value that the business can expect to realize by collecting and retaining that data. Business leaders can use it to decide if the proposed data will be an asset or a liability. If the number is positive, the business is likely to benefit from collecting the data, but if the number is negative, it should serve as a warning that the data will likely cost the business money in the long run. If the number is near zero, there may not be much benefit. Management will have to decide at what point they are willing to move ahead with the proposal.
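As a rough illustration, the whole formula fits in a few lines of Python. This is only a sketch: the `DataProposal` class, the `classify` helper, the `near_zero_band` threshold, and the figures are all hypothetical, and where "near zero" begins is a judgment call for management.

```python
from dataclasses import dataclass


@dataclass
class DataProposal:
    """Estimates for one proposed class of data (all values hypothetical)."""
    name: str
    potential_value: float
    incremental_retention_cost: float
    incremental_expected_breach_cost: float

    def residual_value(self) -> float:
        # Potential value minus the two incremental costs.
        return (
            self.potential_value
            - self.incremental_retention_cost
            - self.incremental_expected_breach_cost
        )


def classify(proposal: DataProposal, near_zero_band: float = 0.0) -> str:
    """Label a proposal as an asset, a liability, or marginal."""
    residual = proposal.residual_value()
    if abs(residual) <= near_zero_band:
        return "marginal"
    return "asset" if residual > 0 else "liability"


medical_history = DataProposal(
    name="medical history",
    potential_value=50_000,
    incremental_retention_cost=40_000,
    incremental_expected_breach_cost=75_000,
)
print(medical_history.residual_value(), classify(medical_history, near_zero_band=5_000))
```

In this invented example the residual value is negative, so the data would be flagged as a liability.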
This model is most useful at the macro level to determine whether or not the business should collect and retain a new class of data. It is far less useful in making decisions about collecting additional pieces of data for an existing classification. For the characters in the conference room, the incremental costs and expected losses associated with collecting a user's name, address, and phone number in addition to an email address may be negligible. However, when considering the addition of each new class of information, such as the user's family tree, medical records, and school transcripts, the team could use this model to help them make a better decision.
It is worth noting that as certain types of information age, they may become less valuable. For instance, addresses, employers, credit scores, personal interests, and income levels change over time, sometimes very quickly, and are much less useful to the business if they are inaccurate. If the business is considering data that may become stale over time, this depreciation of value should also be accounted for in the calculations to determine the expected residual value of the data.
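One simple way to account for staleness, sketched below in Python, is to assume the data loses a fixed fraction of its value each year and to add up the shrinking yearly values over the period the business expects to keep it. The constant decay rate is my assumption for illustration, not a rule; the function name and the numbers are hypothetical.

```python
def depreciating_value(first_year_value: float, annual_decay: float, years: int) -> float:
    """Total value over `years`, with each year's value shrinking by `annual_decay`."""
    if not 0.0 <= annual_decay <= 1.0:
        raise ValueError("annual_decay must be between 0 and 1")
    if years < 0:
        raise ValueError("years must not be negative")
    # Year 0 is worth the full amount; each later year is worth a fraction less.
    return sum(first_year_value * (1 - annual_decay) ** year for year in range(years))


# Hypothetical: data worth 1,000 in its first year, kept for three years.
fresh = depreciating_value(1_000, annual_decay=0.0, years=3)  # never goes stale
stale = depreciating_value(1_000, annual_decay=0.5, years=3)  # halves every year
print(fresh, stale)
```

The depreciated total, rather than the optimistic one, is the figure that would feed into the potential business value when estimating the residual value.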
What do you think? Was this helpful? Am I off track? Click the "follow" button at the top right corner of the page to follow me on Twitter and start a conversation!