This short article covers seven common mistakes I have seen machine learning engineers make, from juniors to seniors. I've collected these mistakes from conversations with colleagues, questions during my lectures, and discussions on social media.
1. AI overselling and overpromising creates unrealistic expectations (or: why people imagine the Terminator…)
A mistake typically made during the sales process is describing AI as an all-powerful being that only the seller can control. This framing has led many to confuse AI with magic.
In some cases, the Data Science (DS) team is asked to meet an unrealistic goal, which dooms the project from the get-go. To increase the chances of success, break the goal down into small business questions, each supported by valid, quality data.
Consider further involving your analysts in the process so that they know both the business and the data backward and forward. Unfortunately, it is easy to trust data that looks relevant when, in practice, it often turns out to be unusable. Let your analysts lead the way. Knowing your data is much more important than knowing the latest state-of-the-art algorithm.
2. AI is a "new" field, which means most specialists become experts through academic research
Developing expertise in academia usually leads to two negative phenomena:
A. Sandboxes. In academic research, the data lives in its own little bubble. Researchers usually test their work on the same data sets, and in most cases that data is a simplification of real life. What works in the lab (or the sandbox) will not necessarily work in the real process or system it is meant for.
B. Data is alive. In the lab, the data is static, and no new data is accumulated. But real data comes from a changing process. For example, a model of user behavior that assumes users never change is incorrect. Data changes all the time, and so should the model, something that academic research usually disregards.
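One way to notice that "live" data has moved away from the data a model was built on is to compare a feature's distribution in an old window against a recent one. The sketch below is only an illustration under stated assumptions: it uses the two-sample Kolmogorov-Smirnov statistic as the comparison, synthetic Gaussian data in place of a real feature, and a hypothetical alert threshold of 0.2 that would need tuning for any real system.

```python
import random
from bisect import bisect_right


def ks_statistic(reference, recent):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between
    the empirical CDFs of the two samples (0 = identical, 1 = disjoint)."""
    ref_sorted, rec_sorted = sorted(reference), sorted(recent)
    largest_gap = 0.0
    # The gap between two step functions can only peak at an observed value.
    for x in ref_sorted + rec_sorted:
        cdf_ref = bisect_right(ref_sorted, x) / len(ref_sorted)
        cdf_rec = bisect_right(rec_sorted, x) / len(rec_sorted)
        largest_gap = max(largest_gap, abs(cdf_ref - cdf_rec))
    return largest_gap


DRIFT_THRESHOLD = 0.2  # hypothetical cut-off, chosen only for this example

rng = random.Random(0)
# Synthetic stand-ins for one feature at training time and later on.
training_window = [rng.gauss(0.0, 1.0) for _ in range(500)]
stable_window = [rng.gauss(0.0, 1.0) for _ in range(500)]   # same process
drifted_window = [rng.gauss(1.5, 1.0) for _ in range(500)]  # behavior shifted

stable_score = ks_statistic(training_window, stable_window)
drifted_score = ks_statistic(training_window, drifted_window)

for name, score in [("stable", stable_score), ("drifted", drifted_score)]:
    flag = "drift suspected" if score > DRIFT_THRESHOLD else "looks stable"
    print(f"{name}: KS = {score:.3f} ({flag})")
```

A check like this says nothing about why the data changed. It only signals that the model may need to be re-evaluated or retrained on newer data.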
3. Machine learning developers prefer "state of the art" rather than fundamentals…
I remember an advanced tutorial I once gave at a professional conference, and some "experts" in the audience were displeased that I started with some basic terminology. They kept asking about advanced material and architecture, showing much knowledge of the latest published papers. But when I got to explain cross-validation (k-fold), a basic concept in machine learning and statistics, suddenly all the "experts" had tons of questions to ask…
Why did my audience know the latest papers and trends but not cross-validation? Skipping the basics is dangerous, but more and more people getting involved in machine learning lack thorough knowledge of algorithms and data structures.
The fundamental problem of machine learning is "generalization", meaning how well your model performs on data it has not seen before. Cross-validation is one of the most effective tools to test "generalization". How can one apply machine learning without such an essential tool? When problems occur and processes fail, it is the basic knowledge and tools, not the recent trends, that allow an in-depth understanding of any new technology.
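For readers who have not met k-fold cross-validation, here is a minimal sketch in plain Python. The data is split into k folds, and each fold takes one turn as the held-out test set while the model is fit on the rest. The toy data, the straight-line model, and the helper names (k_fold_indices, fit_line, cross_val_mse) are assumptions made for illustration. In practice you would normally use a library implementation.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs, one per fold."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)       # shuffle once, reproducibly
    folds = [indices[i::k] for i in range(k)]  # k folds of near-equal size
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


def fit_line(xs, ys):
    """Least-squares fit of y = a * x + b (a toy stand-in for a real model)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, y_bar - a * x_bar


def cross_val_mse(xs, ys, k=5):
    """Return the mean squared error on the held-out part of each fold."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(xs), k):
        a, b = fit_line([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        errors = [(ys[i] - (a * xs[i] + b)) ** 2 for i in test_idx]
        scores.append(mean(errors))
    return scores


# Synthetic data: y = 2x + 1 plus Gaussian noise, for illustration only.
rng = random.Random(42)
xs = [i / 10 for i in range(100)]
ys = [2 * x + 1 + rng.gauss(0, 0.5) for x in xs]

fold_scores = cross_val_mse(xs, ys, k=5)
print("MSE per fold:", [round(s, 3) for s in fold_scores])
print("Mean MSE:", round(mean(fold_scores), 3))
```

Every sample is used for testing exactly once and never appears in the training set of its own fold. That is why the averaged score is a fairer estimate of generalization than the error on the training data.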
4. When the model fails, what's needed is more data, not improved parameters
After the pipeline is ready and you get some results, I see people struggling with hyperparameter tuning (hopefully automated). Better algorithm parameters can improve your outcome, but usually, all that is needed is a modest change that allows you to trust your model.
Instead of improving parameters, you should take what you have so far, try it in the real world, and clean up some more data to enhance your model, especially when it fails.
5. Skipping Exploratory Data Analysis (EDA) and going directly to modeling
Running AI algorithms on the data is probably the most fun part of the project, where you apply sophisticated models. But rarely is this the most important task.
I argue that data exploration is the most critical step of any project. True, cleaning and mining your data might be draining. Still, it is the only way to make sure, before committing to a task, that you have quality data and a model with statistical significance for your business theory.
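A first pass at data exploration does not have to be elaborate. The sketch below assumes the data is a small list of records and reports, for each column, how many values are missing, how many distinct values there are, and basic statistics for numeric columns. The column names and values are made up for the example, and summarize is a hypothetical helper, not part of any library.

```python
from statistics import mean, median


def summarize(rows):
    """Return, per column, missing and distinct counts plus numeric stats."""
    columns = sorted({key for row in rows for key in row})
    report = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        present = [v for v in values if v is not None]
        numeric = [
            v for v in present
            if isinstance(v, (int, float)) and not isinstance(v, bool)
        ]
        info = {
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
        }
        # Only add numeric statistics when every present value is a number.
        if numeric and len(numeric) == len(present):
            info.update(
                min=min(numeric),
                max=max(numeric),
                mean=mean(numeric),
                median=median(numeric),
            )
        report[col] = info
    return report


# Made-up records, including gaps and one implausible value.
rows = [
    {"age": 34, "plan": "basic", "monthly_spend": 20.0},
    {"age": 29, "plan": "pro", "monthly_spend": 55.0},
    {"age": None, "plan": "basic", "monthly_spend": 18.5},
    {"age": 230, "plan": "basic", "monthly_spend": None},  # implausible age
    {"age": 41, "plan": None, "monthly_spend": 60.0},
]

report = summarize(rows)
for col, info in report.items():
    print(col, info)
```

Even a summary this small surfaces the questions worth taking to an analyst, such as why an age of 230 exists, or how a customer can have no plan.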
6. Committing to answering questions instead of asking the right business question
Most AI experts are engineering-oriented, not business-oriented. When I started, I was eager to solve the complex problems given to me and did not ask how the outcome would serve the organization or what would be done with the results.
In one case, I was given an interesting problem: to predict that a driver would crash the car a few minutes before it actually happened. I was given data from the vehicle's IoT sensors, and I built a model that did a fairly good job. But then we hit a wall. How could we use that model without playing "Big Brother" to the company's customers, which would send them straight into the arms of the competitors?
We solved a challenging data problem, but we had nothing of value to show for it. Later we found other uses for a similar model, but that was a big lesson: the business should lead the research!
7. Preferring Computer Experts over Domain Experts
Every business and industry has its own unique knowledge and data. Finding a data scientist with the organization's specific knowledge can be challenging and expensive. Training one can take a long time. Most machine learning developers come with a background in computer science, and adapting to a specific domain can take anywhere from a few months to years. Consider instead how to empower the analysts already in the organization with predictive modeling capabilities.
These are common mistakes I find among junior and senior AI experts. To save time and avoid costly mistakes, find the right balance between carefully selected systems, outsourced expertise, and in-house talent.