Why this article?
There is a lot of talk about Artificial Intelligence (AI), Cloud Computing, Internet-of-Things (IoT), Edge Computing and all that jazz. But I believe that most of the world (or the markets, or what have you) underestimates the gravity of the situation. I’m usually not one for hyperbole or shock value, but I believe that civilization really is at an inflection point. Data – and its use – will increase many times over in the next few years. Chances are that it will be used to improve most (if not all) facets of our lives. Here are just a few things that will change dramatically:
- Mobility – through autonomous cars and “smart cities”.
- Factory Automation – it’s not hard to imagine robots doing a lot of painful human jobs.
- Healthcare – detection, prevention and cure of most diseases will see massive improvements.
The key to all this happening is a massive improvement in computing power. And it is happening, even though there are roadblocks. If the last 3 years are any indication of our trajectory, over the next 10 years we’re in for one hell of a ride. This article isn’t about the specifics of computing. That’s a separate discussion. This is about the inflection point at which we live.
I don’t need to go into a detailed investigation about whether there is indeed a data explosion. Digitization is the obvious reason. Consider this:
- Computers became mainstream just over 30 years ago.
- The Internet became mainstream just over 20 years ago.
- Smartphones became mainstream just over 10 years ago.
This giant river of information is widening rapidly and is now approaching a delta. In the words of the venerable Catwoman from the movie The Dark Knight Rises, “Everything we do is collated and quantified. Everything sticks.” This is creepy, but true. Things that never used to be collected, collated and quantified are now stored in cyberspace for eternity. Every time she likes a cat video or buys Purina cat food, it is a piece of data that somebody (probably companies whose names rhyme with Frugal and Spacebook) uses for something. Quite simply, for them and many others, it’s raining gold.
Most estimates suggest that data doubles every 2 years. I have no evidence to confirm or refute that claim, but I suspect the snowball effect is underestimated. All this data – public and private – will keep changing our civilization at a pace we’ve never experienced. The scary part, however, is the second-order effect of all this – it’s called Metcalfe’s Law.
Metcalfe’s Law states that the value of a telecommunications network is proportional to the square of the number of nodes in that network. The law is used to depict the value of networks like the Internet or Facebook or Bitcoin. I use it to depict the value of the “network of data”, with some caveats:
- First, I don’t think it’s a law, even for telecommunications. I don’t think it’s a mathematical certainty that the value of a network equals the square of the number of nodes. The trend, I believe, is correct; it passes the common-sense test.
- The term “value” is vague. I’m not sure what it means in a telecom network. Maybe it refers to the satisfaction of being connected? Economic value? In the case of data, I believe it means insights that translate into economic value. It’s still vague, but this is what most CEOs and entrepreneurs are betting on. This is where Artificial Intelligence (AI) steps in.
- Metcalfe’s Law seems to treat “nodes” as commoditized, homogeneous entities. In the world of communications networks, maybe homogeneity makes sense. In the world of data, I think the quality of nodes matters. Quality of data matters. I’ve argued that the quality of data used to “train” an AI system will be one of the main factors that determine whether an AI is effective or not. I realized this from my experience in analyzing large datasets in Finance. If I used a certain “sample” dataset to formulate a model, the quality of the historical data I chose mattered a lot. Just choosing ALL historical data was rarely useful. The model would fail spectacularly in an “out-of-sample” test or, worse, in the real world. I had to use judgment and common sense to choose a relevant “training sample”. In other words, a model based on tons of past data may not be applicable in the future. More is not necessarily better. Better is better. Of course, more and better are loosely correlated.
Whether the “value” from all this data increases according to a precise mathematical formula or not is irrelevant. The point is this: more data, used judiciously, means more insights. I’ll take that one step further: better data means better insights. And that’s what matters. Apart from processing data, discerning good data from bad is what AI can do at lightning-fast speeds and at an epic scale. As more data nodes are available, an effective AI system will be able to judge which nodes are most relevant for a particular query. And then it shoots back an insight in a second. AI could make Metcalfe’s Law a reality. It could even outpace it.
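The caveats above can be sketched in a few lines of Python. The square law itself is standard Metcalfe; the quality-weighted variant (and its 0.5 threshold) is purely my own illustrative assumption about why “better is better”:

```python
# Classic Metcalfe's Law: network value grows with the square of the node count.
def metcalfe_value(nodes: int) -> int:
    return nodes * nodes

# A hypothetical quality-aware variant (my assumption, not part of the law):
# only "good" data nodes, above an arbitrary quality threshold, contribute.
def weighted_value(node_qualities: list[float], threshold: float = 0.5) -> int:
    good = [q for q in node_qualities if q >= threshold]
    return len(good) ** 2

# Doubling the node count quadruples the classic value...
print(metcalfe_value(100))  # 10000
print(metcalfe_value(200))  # 40000

# ...but if the 100 new nodes are all low-quality, the gain vanishes.
print(weighted_value([0.9] * 100 + [0.2] * 100))  # 10000
```

The gap between the last two numbers is the whole point: more nodes only pay off at the Metcalfe rate if they are good nodes.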
Neural Network Computing
The most popular phrase in AI seems to be “Neural Networks”, which gives us a mental picture of how an AI system uses data. I thought through this in some detail earlier, but the gist is that AI systems are designed with logic that is “convoluted” in various degrees, as opposed to a rather plain-vanilla “if…then…else” type logic that dominates computing today. I encourage you to check out McKinsey’s mind-map on AI. It may be the best thing to have ever come out of their Research division.
To put it simply, the difference between computing today and this new paradigm called AI comes down to the number of data nodes that a computer can process simultaneously. For most of our computing needs now, data is processed in a sequential, linear fashion. In AI, data processing, if depicted, would look more like a 3D latticework, with no care for sequence or linearity. This is massive.
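To make that contrast concrete, here is a minimal sketch of a single neural-network layer in pure Python. The weights are made up purely for illustration; the point is that every output mixes all the inputs at once, rather than consulting them one branch at a time:

```python
def relu(x: float) -> float:
    """A common neural-network activation: pass positives, zero out negatives."""
    return max(0.0, x)

def layer(inputs: list[float], weights: list[list[float]]) -> list[float]:
    """One dense layer: each output combines ALL inputs simultaneously --
    the 'latticework' pattern, as opposed to if/then/else branching."""
    return [relu(sum(w * x for w, x in zip(row, inputs))) for row in weights]

# Hypothetical inputs and weights, chosen only for illustration.
inputs = [1.0, -2.0, 0.5]
weights = [[0.2, -0.1, 0.4],
           [-0.3, 0.5, 0.1]]
print(layer(inputs, weights))
```

Nothing in that computation depends on processing the inputs in order, which is exactly why this style of workload parallelizes so well.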
If Metcalfe’s Law wasn’t a reality with data nodes earlier, it could be now with AI, at least for a while. But I don’t think Metcalfe’s Law is sustainable forever. I don’t think the “value” from more data will increase disproportionately forever. At some point, there will probably be diminishing marginal returns from each extra node of data. So, the curve may look like a 45-degree angled “S”. In the Math world, it’s known as a Logistic Function. If you recall your middle-school algebra, it looks a bit like Y = X² for a while, but after an inflection point, the curve gradually flattens out. In these early innings of the AI revolution, we’re probably at the part of the curve that looks like Y = X². Sounds awesome, but let’s not get too excited. There is a giant roadblock.
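That S-shaped curve can be written down directly. A minimal sketch of the standard logistic function, with arbitrary illustrative parameters:

```python
import math

def logistic(x: float, L: float = 1.0, k: float = 1.0, x0: float = 0.0) -> float:
    """Standard logistic function: L / (1 + e^(-k(x - x0))).
    Grows fast at first, then flattens toward the ceiling L."""
    return L / (1.0 + math.exp(-k * (x - x0)))

# Before the inflection point x0, each step adds a lot; far past it,
# each extra step adds almost nothing (diminishing marginal returns).
for x in [-4, -2, 0, 2, 4, 8]:
    print(f"x={x:+d}  value={logistic(x):.3f}")
```

The midpoint x0 is the inflection point; on the left of it the curve climbs steeply, and on the right it creeps toward its ceiling.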
We can’t get much value out of all that ever-growing volume of data if we can’t process it in a reasonable amount of time. Processing means running some mathematical operations on all those nodes of data within computer chips in a computer or server. If all the data that’s now collated and quantified is to be used, we’re looking at multiples of the workload computers have been used to thus far. As we move more towards harnessing the power of AI, we need super-processors that can do a hell of a lot more than most of the existing crop. This is the bottleneck.
The problem is captured elegantly in another equation called Moore’s Law. In 1965, Gordon Moore predicted that the number of transistors that engineers can pack into an Integrated Circuit would double roughly every 2 years. For a long time, his theory panned out to be true (more or less) and it was rightly given the moniker of “law”. It turns out that engineers are reaching the physical limits of this law. They can’t pack many more transistors into an Integrated Circuit. This is the problem.
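Moore’s Law is just compounding, which a few lines make plain. The starting figure below is roughly the transistor count of the Intel 4004 from 1971, used only as a ballpark anchor:

```python
# Moore's Law as compounding: transistor counts double roughly every 2 years.
def transistors(start: float, years: int, doubling_period: int = 2) -> float:
    return start * 2 ** (years / doubling_period)

base = 2_300  # roughly the Intel 4004 (1971), as a ballpark starting point
print(f"{transistors(base, 20):,.0f}")  # 20 years = 10 doublings = 1024x
```

Ten doublings is a thousandfold increase, which is why even “approaching the limit” of this curve matters so much: there is no gentle way off an exponential.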
It takes two to Tango.
In my mind, we’re at a momentous place in this data vs. computing power juxtaposition. As I’ve said before, it’s hard to overemphasize its significance. So, here is my rough sketch to show you where I think we are.
There we have it – data is increasing exponentially, and we can write “intelligent” programs to make sense of all that good stuff. But we’re reaching the limits of computer processing power. The impending end of Moore’s Law is a party-pooper.
At the moment, we’ve already bitten off more than we can chew. But it’s always good to step back and ask why this is a problem. So, why chew? Because companies that sell stuff want to. The laws of Schumpeterian Creative Destruction will ensure that the movers and shakers of the world will do whatever they can to use this exponentially increasing fuel called Data. It’s the new Oil. And companies out there want the best, fastest motor car to get to the promised land of sustainable competitive advantage. They’re not going to let a pesky little thing like Moore’s Law get in the way. It’s raining gold, and they will find a bucket.
OK. Let’s Tango.
There are two broad ways to tackle this problem:
- Change the foundational principles of “traditional computing”.
- Work around Moore’s Law.
#1 is complicated. The Googles, Microsofts and IBMs are building something called “Quantum Computers”. I don’t know what they are, but they sound cool. I don’t think I’ll understand a whole lot without a PhD in Computer Science. At the moment, #2 is where most of the work is being done, and most of it has happened over the last 2-3 years. The party is just getting started.
Most of the processing in our computers and smartphones is done on a Central Processing Unit (CPU). CPUs are good for the relatively light, less data-intensive tasks that satisfy most of our computing needs. But as more data is collected, collated and quantified, and as companies out there want to harness it to sell us stuff, CPUs won’t cut it. They aren’t designed for Neural Network Computing. They aren’t designed for AI. And Moore’s Law suggests that we’re already approaching the limits of CPU processing power anyway.
The know-somethings of the computing world have worked around this problem. They’ve started using Graphics Processing Units (GPUs) to offload these complicated algorithms and terabytes of data from the CPU. It turned out that GPUs, which had long been used in videogame consoles, were good at the “parallel computing” that any “intelligent” system would need. GPUs are a good first step, but they were originally built for graphics, not for AI. With the advent of Machine Learning and AI, GPU specialists like Nvidia started producing “enhanced GPUs” for this new, lucrative market. In doing so, they outgunned CPU giants like Intel and AMD in the AI game.
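The parallel-computing idea behind GPUs can be sketched in miniature: split one big job into many independent pieces and work on them side by side. This toy uses CPU threads purely as an illustration; a real GPU schedules thousands of lightweight hardware threads, which this sketch does not attempt to model:

```python
# Toy data-parallel sketch: independent chunks processed side by side.
from concurrent.futures import ThreadPoolExecutor

def scale(chunk: list[float]) -> list[float]:
    """The per-chunk work: here, just doubling every element."""
    return [2.0 * x for x in chunk]

data = [float(i) for i in range(1_000)]
chunks = [data[i:i + 250] for i in range(0, len(data), 250)]

with ThreadPoolExecutor() as pool:
    # Each chunk is independent of the others, so order doesn't matter --
    # that independence is what makes the workload parallelizable.
    results = list(pool.map(scale, chunks))

flat = [x for chunk in results for x in chunk]
print(flat[:3], flat[-1])  # [0.0, 2.0, 4.0] 1998.0
```

Neural-network math is full of exactly this kind of element-wise, order-independent work, which is why graphics hardware turned out to be such a good fit.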
But those giants are making a comeback. While Nvidia leapfrogged them in “GPUs for AI”, the Intels of the world may have a leg up in the latest innovations that go beyond GPUs. These are things like Field Programmable Gate Arrays (FPGAs) and Application-Specific Integrated Circuits (ASICs) designed specifically for AI. Both are being used by the big vendors of AI – Google, IBM, Amazon and Microsoft. FPGAs and ASICs tend to be faster and more power-efficient than GPUs. These new chips are, of course, much more expensive to design and produce. And they can’t really be used for computing needs other than Machine Learning and Neural Network Computing. They will probably be placed right beside the CPU, which can carry out the more “traditional computing” duties.
Chips for AI are a fascinating topic that needs to be tackled separately. But here, the point I’m making is this: a lot of work is being done to solve the problem of the data bomb at a time when processing power is maxing out. It’s fascinating to me that this is where we are. Computing power isn’t keeping up with data. And data from the Internet-of-Things and Virtual Reality hasn’t even begun trickling in. This is a problem we’ve never faced before. But I suspect we’ll find a way. Our civilization will work it out. Whether we survive a post-AI world is a separate matter altogether.
Hey, maybe AI will take us places we can’t even imagine.