In a previous post, I said I looked at energy use per capita to gauge how developed an economy is. I highlighted this because it looked like computing digital data was going to cause an inflection in energy use in North America - the “digital world eating the real world”.
I tried to analyse the digital world in the same way. I started by asking ChatGPT how much digital data the average person generates - and it gave me the answer of 1.7MB of data per second - or roughly 146GB per day. ChatGPT gives a forecast of 149 zettabytes of data generated in 2024, which I suspect is taken from the same Statista source seen below.
Converting the Statista figure to gigabytes per person per day gives 2024 data production of 69GB per day, rather than 146GB. We could assume the difference comes from penetration rates into the digital world, which are around 60%.
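For anyone who wants to check the arithmetic, here is a minimal sketch of the unit conversions. It assumes decimal units (1GB = 1,000MB, 1ZB = 10^12 GB) and a 366-day year for 2024. The function names are my own illustrative choices, and the headcount behind the 69GB figure is not stated above, so the last function simply backs out the number of people that figure would imply.

```python
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400 seconds
MB_PER_GB = 1_000                # assumption: decimal (SI) units
GB_PER_ZB = 10**12               # 1 zettabyte = 10^21 bytes = 10^12 GB


def mb_per_second_to_gb_per_day(mb_per_second):
    """Convert a continuous data rate in MB/s into GB per day."""
    return mb_per_second * SECONDS_PER_DAY / MB_PER_GB


def gb_per_person_per_day(zb_per_year, people, days_in_year=366):
    """Spread a yearly global total (in ZB) across a headcount and a year."""
    return zb_per_year * GB_PER_ZB / days_in_year / people


def implied_people(zb_per_year, gb_per_day_each, days_in_year=366):
    """Back out the headcount implied by a per-person daily figure."""
    return zb_per_year * GB_PER_ZB / days_in_year / gb_per_day_each


if __name__ == "__main__":
    # 1.7MB per second, as quoted by ChatGPT
    print(f"{mb_per_second_to_gb_per_day(1.7):.1f} GB per day")
    # Headcount implied by 149ZB per year and 69GB per person per day
    print(f"{implied_people(149, 69) / 1e9:.1f} billion people")
```

On these assumptions, 1.7MB per second works out to about 146.9GB per day, and 149 zettabytes at 69GB per person per day implies a base of roughly 5.9 billion people. Changing the headcount passed to `gb_per_person_per_day` shows how sensitive the per-person figure is to who you count as a data producer.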
What is odd is that the data generated has started to grow much faster than the number of users of social media. Social media use is looking ex-growth in recent years.
But when we look at data created per social media account, it skyrockets in 2020. This could be a Covid-related change - where people changed habits overnight, and never went back.
Or it could be a TikTok-inspired change. Certainly TikTok caused more people to upload videos of themselves (and cats and dogs).
If TikTok is driving the data explosion, then it is upending the traditional way innovation and technology spread. TikTok is far more popular in emerging markets than it is in developed markets. Japan, the UK, France and Germany are not top markets for TikTok.
This suggests two things. First, that the typical emerging market-developed market divide in “digital data creation” - at least in social media terms - is far smaller than you would think. If we were using the energy analysis - it would seemingly imply that in the digital world, there are no real developing nations. If you are connected, you are a data producer.
Secondly, as far as data harvesting is concerned, TikTok may well be more dominant than Facebook, Google or Apple. To my surprise, this implies that the US and China may well be closer to parity on harvesting global data - with both acting to protect their home markets from the other. If China and the US have parity in data harvesting, then the real fight is in data analysis. To use an energy analogy, it would be like finding that everyone could produce crude oil, but only one nation could refine it. And the US monopoly on refining is built on controlling the supply of Nvidia chips.
What does this mean? Well, unsurprisingly, the market has got things pretty much right. The digital world is a global phenomenon, with connected users generating similar amounts of data wherever they are - as long as they have a smartphone and are connected to social media - particularly TikTok or Instagram. The US is looking to maintain a monopoly in the distilling of that data into useful forms, and the sheer growth and size of the data industry is pushing semiconductor usage and pricing to extreme levels. But the TikTok data above suggests Chinese versions of ChatGPT will be particularly good in emerging markets. I thought Chinese influence could maybe be contained to Asia - but in the digital world, Chinese influence looks to be much more global. The big issue is whether Chinese technology can work around the restrictions on high-end chips. If it can, a digital divide between the West and the rest beckons - as Chinese AI tech will have access to the most data for training emerging market LLMs.