Your organization’s most expensive datasets might be the ones you aren’t even buying
The Rising Importance of Data in Business Decisions
Data is increasingly an ingredient in every part of your business. Whether you are deciding where to open a new store, calculating the value of a potential customer, or analyzing a multibillion-dollar investment, having the right data is crucial. As demand for unique data grows, overall data budgets at firms are under significant upward pressure. Yet many firms are incurring an even greater cost from the data they aren’t even buying. How is this possible?
Data Productionization: Understanding the Costs
Productionizing data in a business process has many costs, only one of which is the cost of the data itself. The journey from business need to productionized data solution is a long one.
Steps to Convert Business Needs to Data Solutions
Once the need is identified, you must embark on a multi-step process: sourcing potential datasets, filling out NDAs or trial agreements to begin testing, assigning data engineers to build ingestion pipelines, assigning data scientists to build test models, negotiating and signing purchase agreements, and building production pipelines and models. All of this happens before you can begin making decisions or building your product feature. The cost of wages and compute can often dwarf the cost of the data itself.
The Oversight in Business Leaders' Data Strategy
Many business leaders already treat these additional costs as a cost of doing business. What they haven’t accounted for is repeating all of those steps for a dataset the firm has already evaluated. Many firms don’t document the datasets they are already purchasing. Almost none of the firms we’ve spoken to document the data their teams evaluated and chose not to purchase.
The Consequences of Duplicate Data Evaluation
At Nomad Data, we speak to data leaders across a multitude of industries. What we’ve heard from our clients again and again is that they often discover multiple teams have gone through all the data discovery and testing steps for the same dataset, each deciding against a purchase because of the same underlying issue with it. These teams have no knowledge of the firm’s previous testing history and have ultimately wasted thousands of employee hours testing and retesting the same datasets. In the most extreme cases, we’ve seen firms purchase the same data over and over with no knowledge of purchases elsewhere in the organization.
Wasted Efforts & Resources
The result of all this testing and retesting is that some of the lowest-quality data, which isn’t even being purchased, ends up creating the largest cost to the organization. Typically, your cost for a dataset you purchase is bounded. With data you aren’t purchasing but are testing over and over, the costs are uncapped. This duplicate effort wastes the time of engineers, data scientists, lawyers and business stakeholders. It also comes with significant compute costs to house, transform and analyze the data repeatedly.
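The difference between a bounded purchase cost and an uncapped evaluation cost can be made concrete with a small calculation. The sketch below is only an illustration: the cost categories mirror the ones named above, but every dollar figure, along with the names `EvaluationCost`, `cumulative_cost` and `license_fee`, is a hypothetical assumption rather than data from any real firm.

```python
from dataclasses import dataclass


@dataclass
class EvaluationCost:
    """Hypothetical cost of ONE trial of a dataset, in dollars."""
    engineering: float   # data engineers building ingestion pipelines
    data_science: float  # data scientists building test models
    legal: float         # NDAs and trial agreements
    compute: float       # housing, transforming and analyzing the data

    def total(self) -> float:
        return self.engineering + self.data_science + self.legal + self.compute


def cumulative_cost(per_evaluation: EvaluationCost, evaluations: int) -> float:
    """Total spend on a dataset that is tested `evaluations` times and never bought."""
    return per_evaluation.total() * evaluations


# Illustrative numbers only; substitute your own estimates.
trial = EvaluationCost(engineering=20_000, data_science=30_000, legal=5_000, compute=5_000)
license_fee = 100_000  # hypothetical fixed price of a dataset the firm does buy

for n in range(1, 5):
    spent = cumulative_cost(trial, n)
    marker = "exceeds" if spent > license_fee else "is below"
    print(f"{n} evaluation(s): {spent:,.0f} {marker} the license fee")
```

Under these assumed figures, the second repeat of the trial already costs more than buying a comparable dataset outright, and nothing stops the total from growing with each further repeat.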
The Impact on Return on Investment for Data
This waste is most pronounced in the fastest-growing data categories, where new providers frequently come to market. As more providers enter a market, each evaluation takes more time, effort and expense because teams have to look at more data and deal with more paperwork. This further compresses the useful data budget and puts downward pressure on the return on investment for data within the firm.
Data Relationship Management (DRM): An Introduction
The solution to this problem is Data Relationship Management (DRM). A DRM software platform ties together the efforts of all stakeholders in an organization around their data. A DRM catalogs every data provider the firm has interacted with, not only those it purchases from. Whether it’s legal paperwork, documentation, purchase contracts, meeting notes, test results, or production issues and observations, the information is available to all stakeholders at all times.
The Benefits of DRM
With properly implemented DRM software, teams no longer repurchase or even retest the same data unknowingly. Stakeholders immediately see the history of all the firm’s dealings with a data provider and its products. A team beginning a test can see the results of previous tests and get in contact with the internal stakeholders who are already up to speed.
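To make the idea of a shared history concrete, here is a minimal sketch of the kind of lookup a team might run before starting a test. It is not how any particular DRM product works: the class and field names (`DataRelationshipRegistry`, `Interaction`, `log`, `history`) and the provider "Acme Foot Traffic" are all hypothetical, chosen only to illustrate recording every interaction with a provider, including trials that did not lead to a purchase.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Interaction:
    """One recorded dealing with a data provider (hypothetical schema)."""
    team: str
    kind: str      # e.g. "trial", "purchase", "meeting"
    when: date
    outcome: str   # what was learned or decided
    contact: str   # internal stakeholder who is up to speed


@dataclass
class ProviderRecord:
    name: str
    interactions: list[Interaction] = field(default_factory=list)


class DataRelationshipRegistry:
    """In-memory catalog of every provider the firm has dealt with."""

    def __init__(self) -> None:
        self._providers: dict[str, ProviderRecord] = {}

    @staticmethod
    def _key(provider: str) -> str:
        # Normalize names so "Acme" and " acme " map to the same record.
        return provider.strip().lower()

    def log(self, provider: str, interaction: Interaction) -> None:
        record = self._providers.setdefault(
            self._key(provider), ProviderRecord(name=provider)
        )
        record.interactions.append(interaction)

    def history(self, provider: str) -> list[Interaction]:
        record = self._providers.get(self._key(provider))
        return list(record.interactions) if record else []


# Hypothetical usage: one team logs a rejected trial, another checks first.
registry = DataRelationshipRegistry()
registry.log(
    "Acme Foot Traffic",
    Interaction(
        team="Retail Analytics",
        kind="trial",
        when=date(2023, 3, 1),
        outcome="Rejected: coverage gaps in rural stores",
        contact="retail-analytics@example.com",
    ),
)

for past in registry.history("acme foot traffic"):
    print(f"{past.when} {past.team} ({past.kind}): {past.outcome}; ask {past.contact}")
```

The design choice that matters is that rejected trials are logged alongside purchases, so a second team checking the provider finds the earlier outcome and the person to ask before any new paperwork or pipeline work begins.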
The Multiplier Effect on Your Data Business
Beyond saving time, a DRM frees up significant budget that can be spent on the data your firm actually wants to purchase to grow the business and reduce costs. As a result, the investment in a DRM has a multiplier effect on your data business.
Data Budget Efficiency: Its Importance For Your Business
As data becomes an increasingly important raw material for businesses, tracking its use and waste will be a necessity. Very few businesses have a data budget large enough to buy everything they’d like. By reducing wasted spending, organizations can stretch their data budgets further. As the return on investment strengthens, that benefit is likely to translate into larger data budgets over time. As in any other area of business, those that spend efficiently will drive outsized returns versus those that do not.