Jan Bosch is a research center director, professor, consultant and angel investor in start-ups. You can contact him at jan@janbosch.com.


AI is NOT big data analytics

Reading time: 3 minutes

During the big data era, a key tenet of a successful big data strategy was to create a central data warehouse or data lake where all data was stored. Data analysts could then run their analyses to their hearts’ content and find relevant correlations, outliers, predictive patterns and the like. In this scenario, everyone contributes their data to the data lake, after which a central data science department uses it to provide decision support, typically to executives (Figure 1).

Although this looks great in theory, the reality in many companies is, of course, quite a bit different. We see at least four challenges. First, analyzing data from products and customers in the field often requires significant domain knowledge that data scientists in a central department typically lack. This easily results in incorrect interpretations of data and, consequently, inaccurate results.

Second, different departments and groups that collect data often do so in different ways, resulting in similar-looking data with different semantics. The differences can be minor, such as the frequency of data generation (e.g. seconds, minutes, hours or days), or much larger, such as data concerning individual products in the field vs. similar data concerning an entire product family in a specific category. As data scientists in a central department often seek to relate data from different sources, this easily leads to incorrect conclusions.
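To make the frequency mismatch concrete, here is a minimal Python sketch. Everything in it is hypothetical and invented for illustration: the `Reading` record, the two teams and the error counts. It assumes two groups report the same underlying error rate, one per minute and one per hour. A naive average over the raw records suggests a sixtyfold difference, while normalizing to a common time base shows the rates are identical.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    """One record in the data lake (hypothetical schema)."""
    source: str
    interval_seconds: int  # how much time this single record covers
    error_count: int       # errors observed during that interval


def naive_mean(readings):
    """Average errors per record, ignoring what a record means."""
    return sum(r.error_count for r in readings) / len(readings)


def errors_per_hour(readings):
    """Errors normalized to a common time base of one hour."""
    total_errors = sum(r.error_count for r in readings)
    total_seconds = sum(r.interval_seconds for r in readings)
    return total_errors * 3600 / total_seconds


# Team A logs once per minute: 60 records covering one hour, 1 error each.
team_a = [Reading("team_a", 60, 1) for _ in range(60)]
# Team B logs once per hour: 1 record covering the same hour, 60 errors.
team_b = [Reading("team_b", 3600, 60)]

naive_a = naive_mean(team_a)
naive_b = naive_mean(team_b)
rate_a = errors_per_hour(team_a)
rate_b = errors_per_hour(team_b)

print(f"naive mean per record: A={naive_a}, B={naive_b}")
print(f"errors per hour:       A={rate_a}, B={rate_b}")
```

The normalization only works because each record carries its own interval length. That is exactly the kind of semantic detail the collecting team knows and a central department may not.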
