“Big data is high volume, high velocity, and/or high variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization.” –Gartner.
The current state of play could probably be summed up as “If the data isn’t big, it’s worthless.” But before you fall under the spell of what might just be the latest ‘management fad’, think a little deeper. Big data has been variously described as “The next frontier for innovation, competition, and productivity” (McKinsey) and “Hubris” (researchers at Harvard).
It is true that there is much more data out there than ever before, thanks to the Internet and modern computing power, but ‘data’ isn’t ‘information’, ‘knowledge’ or ‘wisdom’. To be transformed into those, data needs to be processed somehow, by machines and humans, and then translated into terms that make actions possible. It is in this area that advocates and critics of Big Data clash.
The pro-Big Data lobby suggests that analysis could, among other things, “Segment populations to customise actions,” “Replace/support human decision making with algorithms,” and “Innovate new business models, products & services.”
Big Data’s critics, meanwhile, suggest that what started as a way to speedily store, retrieve and process large volumes of machine-generated data has taken on a life of its own, becoming an end in itself rather than a means of answering life’s questions. They see this as a symptom of the well-known ‘Hype Cycle’ effect, in which a technology’s early proponents make overly grandiose claims that are rarely met in the short term.
Critics also note that there is always inherent bias in data, especially when it comes from different sources, each collected for its own reasons. Crashing together vast data sets from diverse sources, they say, increases the risk of ‘spurious correlations’: associations that look statistically robust but arise only by chance. Humans would question the association between, for example, per capita cheese consumption and the number of deaths from becoming tangled in bed sheets, or between the age of Miss America and murders by steam, hot vapours and hot objects. Both are clearly spurious.
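For readers who want to see the critics’ point in action, here is a minimal sketch in Python. It is our own illustration, not anything drawn from the critics’ work: it generates sets of completely unrelated random series and counts how many pairs nonetheless show a strong correlation. The function names (`pearson`, `count_strong_pairs`), the series length of 10 and the 0.8 threshold are all assumptions chosen for illustration.

```python
import itertools
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def count_strong_pairs(num_series, length, threshold, seed=0):
    """Count pairs of unrelated random series whose |r| exceeds threshold.

    Returns (number of strongly correlated pairs, total number of pairs).
    """
    rng = random.Random(seed)
    # Every series is independent noise, so any correlation is pure chance.
    series = [[rng.gauss(0, 1) for _ in range(length)]
              for _ in range(num_series)]
    pairs = list(itertools.combinations(series, 2))
    strong = sum(1 for a, b in pairs if abs(pearson(a, b)) > threshold)
    return strong, len(pairs)


if __name__ == "__main__":
    # Hypothetical sizes: more series means many more pairs to compare.
    for num_series in (10, 100, 500):
        strong, total = count_strong_pairs(num_series, length=10, threshold=0.8)
        print(f"{num_series} series: {strong} of {total} pairs have |r| > 0.8")
```

The number of pairs grows roughly with the square of the number of series, so the more data sets are thrown together, the more chance ‘findings’ there are to stumble across, even though nothing here is related to anything else.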
It seems to us that today’s dependence on the algorithm (“In mathematics and computer science, an algorithm is a self-contained sequence of actions to be performed.”) is at fault. The rise in AI and Robotics notwithstanding (possibly the subject of a future article), secondary data (Big or Small) can tell you lots about what, who, when, where and how; but it can never tell you ‘why’. When it comes to customer choice behaviour, which is directly correlated with your revenues and profits, understanding customers’ motivation is crucial.
Add that to a growing ‘segment’ of people voicing privacy concerns over the use of their data without consent and we might be looking at the ‘trough of disillusionment’ before things get better.