Big Data and Digital Transformation

By Dr. Michael Valivullah, CTO, NASS/USDA

Enterprises collect and store lots of data but analyze only a fraction of it. They have discovered that data is the new currency and that there is a lot of value hiding in their data. To extract value from these ‘data treasure troves’, they are using data science and big data analytical tools, which is helping them in their ‘digital transformation’. Some organizations (e.g., Amazon, Google and Facebook) have been very successful in this endeavor and continue to innovate, gain market share and/or add value; others are trying to follow suit.

About five years ago, ‘big data and analytics’ started getting people’s attention, helped by a seminal paper published by the McKinsey Global Institute in May 2011, entitled ‘Big data: The next frontier for innovation, competition and productivity’. The ‘big data and analytics fever’ then grew and, based on Google’s trend analysis (which shows the relative level of search interest over time for a keyword phrase), reached its peak in June 2016. Cloud computing, by comparison, has sustained high interest since reaching its own peak, because more and more enterprises continue to implement cloud computing technologies to increase business agility and operational resiliency, improve performance and gain efficiency.

One may wonder what will become of ‘big data and analytics’ after reaching its peak last year. Enterprises, both private and public, will pursue big data and analytics as long as there is value in it, as indicated by published customer surveys, vendor interest, analyst reports and revenue generation. A recent Gartner survey (2016) reported that the total dollar amount invested in big data and analytics has been increasing at a steady pace for the past five years, but that interest in future investments seemed to decline slightly. This may be a pause to assess the actual benefits of these investments, because another Gartner report (2016) indicated that only about 12 percent of big data projects have yielded successful, measurable results. However, increased adoption and use of social media, the Internet of Things (IoT), smartphones, mobile devices, game gear, wearables, sensors, drones, remote monitors, precision medicine, precision agriculture, smart cities, smart buildings, autonomous vehicles, remote vehicles, etc. will generate mountains of data that must be collected, aggregated and analyzed before they become useful and valuable for making decisions. It will be impossible to analyze this data manually using traditional methods and legacy systems. The potential future value from big data and analytics runs into billions and even trillions of US dollars per year, and this is considered a conservative estimate. Since McKinsey’s 2011 publication, only a fraction of the potential value of big data has been captured. Only location-based data has seen a high level of adoption and value capture, at 50-60 percent, followed by the US retail industry at 30-40 percent (both of these industries are digital natives), manufacturing at 20-30 percent, US healthcare at 10-20 percent and the EU public sector at 10-20 percent. Therefore, interest and investment in big data and analytics are bound to increase in almost all sectors to capture the value hidden in big data. I expect a sustained interest in big data, similar to that in the cloud, in the years to come.

Data Security

As more and more data is collected, aggregated, analyzed and used to make decisions that impact our lives, data security is of utmost concern. Data governance needs to take center stage in dealing with the mountains of data gathered from different sources and the risks involved in managing these data elements. Federal, state, city and local government agencies and other non-profit public service organizations need to meet strict confidentiality, integrity and availability (CIA) rules and also provide good governance, meet compliance requirements and manage risk (GCR).

A common myth is that an organization needs to have a lot of structured and unstructured data collected from different sources, including external sources (which require validation and risk assessment), to start analytics. One need not have large data to start an analytics project. One could start with the ‘gold standard data’ one already has and consider using that data alone, or combining it with other internal datasets, to solve a business problem as a proof of concept and get buy-in from decision makers. An organization could analyze variables that have not been looked at before to identify correlations, causations and predictors, while taking care to spot and discard coincidences. This is where domain knowledge and expertise come into play. With available and affordable computing power, storage and network capacity, one could analyze more data relatively easily to see the patterns and probabilities hiding in it. Analytics could be used for descriptive, diagnostic, predictive and prescriptive purposes, depending on the business need. IoT, sensors, operational technology, equipment maintenance, precision medicine, power grids, shipping, logistics, law enforcement and precision agriculture are increasingly using these types of analytics to deal with one or more business problems and/or provide solutions as needed.

Demand for Big Data

Big data means different things to different people. Information technology analysts, business leaders, consultants, academic researchers and standards organizations have each defined big data from their own perspective to include volume, velocity, variety, veracity, complexity, etc. Though there is no clear consensus on the definition, one common theme emerges: big data is too much for an organization’s existing capacity to handle efficiently in terms of people, process and technology. In terms of big data and analytics implementation, ‘people’ is the hardest part. There is organizational inertia, lack of support from decision makers and difficulty finding the right ‘data scientists’ who have a good understanding of analytics, data and the business domain. Along with data scientists, there is also a big shortage of big data analysts. Many schools around the world are offering new courses in data science and analytics to meet this increasing demand.

As the big data field is new and it is hard to find experts, the so-called ‘big data experts’ or ‘data scientists’ are attracted to big financial firms on Wall Street, big banks, credit rating and credit card companies, and the likes of Google, Facebook, LinkedIn, Yahoo, Microsoft and Amazon, because these employers offer big salaries, stock options and ‘sexy’ projects to work on. Federal, state, city and local governments and non-profit organizations are at a disadvantage when competing for the same talent. However, some creative government organizations have successfully recruited good big data scientists.

Overcome the Shortfalls

To overcome the data scientist shortage, many enterprises are building data science teams that combine people who have knowledge and expertise in big data analytics with two or more experts in areas such as IT and the business domain. Together they can supplement each other’s expertise, collaborate and come up with solutions to business problems. An important characteristic of a successful big data analytics team is the ability to tell the story in business terms and generate striking data visualizations that need very little explanation. This is a very special skill that requires selling skills to close the deal. These abilities help build the credibility of the data science team, or big data and analytics team, so that it can win senior managers’ support and expand analytics from one business area to another and eventually to the whole organization or enterprise. These folks are the ‘translators’ who can take the results obtained from data analytics and put them in business terms so that the enterprise can understand and adapt. Digital transformation needs to take place at the organizational level to be effective and become a permanent way of operating. Otherwise, it will end up as a middle school science project. Big data and analytics is an integral part of the digital transformation of a private or public enterprise. Therefore, many organizations have embarked on a digital transformation journey to unleash the value hidden in big data using analytics. More are likely to follow.
