He said, “We have a bunch of software tools now, but they are all built for traditional programming … Most of the errors end up being in the data rather than in the code, and the tools we have aren’t as strong.”
Instead of describing a process or tracking other things, the data is raw material that is mined for important signals. Here was Peter Norvig of Google, a practitioner at a leading AI and data analytics firm, suggesting that we don’t know how to manage data in the machine learning and AI space. The assertion puzzled me.
So, what could he mean? It’s obvious that people at firms of all types know how to move data around, manage it regardless of volume, and assign metadata to it so that it can be categorized and processed. Once these steps are taken, those firms can apply algorithms to the data and get analytic outputs. Given how many businesses are comfortable with these processes, what exactly was he referring to? What is it about managing this data that we still find challenging?