The History of Data Science
I had the opportunity to talk about this topic two weeks ago at SOTEDx, an internal TED-inspired talk at HP Singapore. I thought it would be good to write it up as an article, and hence you are here.
Whenever I come across something trending, I generally go back to its history to see how it evolved. Looking at history helps me demystify the trend and think about it logically. I did the same for Data Science, to understand which events in history directly or indirectly influenced it to become pervasive in the business world today.
Broadly, the story involves two camps: industry, business and the corporate world on one side, and engineers, computer hobbyists and academia on the other.
By 1955, Taylorism, or Scientific Management, which had emerged during the second industrial revolution, was extensively adopted across industries. Through his theory, Taylor argued that if a process workflow is analyzed and data is collected on all the variables involved, the process can be improved over time to achieve economic efficiency. Around the same time, computers and AI were being researched heavily in the wake of World War II, in which computing had played a crucial role in shortening the war, thanks to Alan Turing and his code-breaking machine that helped break Enigma.
The mid-1960s was when HP developed one of the world's first desktop computers, and on the corporate side Peter Drucker was popularizing the idea of the 'knowledge worker', a forerunner of what is now called 'Knowledge Management'. Drucker argued that people who think for a living - accountants, consultants and scientists - are the most important resource for any organization.
The mid-1970s was when academia recognized the importance of data analysis, exploration and visualization, thanks to the statistician John Tukey and his book 'Exploratory Data Analysis'. Meanwhile, on the industry front, the principle of Kaizen, which builds on the principles of Taylorism, was adopted in the Toyota Production System. It yielded continuous quality improvement and gained a lot of traction across the manufacturing landscape.
Around 1985, graphical operating systems such as Microsoft Windows debuted, after personal computers had grown by leaps and bounds over the preceding decade. This was also the first time corporate folks started embracing software like Excel and Word. A few traditional businesses started to digitize during this decade.
The internet went mainstream around 1995, and it was the crucial development that shaped the Data Science we have today. The internet was such a game changer that business people and computer folks started to work together.
Around the turn of the millennium, dot-com companies proliferated on the internet. Almost every type of business, from florists to restaurants, had an online service, and it was like a gold rush. Many of these companies also went public quickly with skyrocketing valuations, which led to the dot-com bubble, and the bubble burst in a hurry.
By 2005, after a decade of the internet, things had stabilized, and companies like Amazon and Netflix that had survived the dot-com scare were focused on expanding their user base. Google and Facebook also made their entry during this decade. Since all these businesses were entirely digital, they were able to collect data and tune their services to 'exactly' meet their customers' needs. Jeff Bezos calls it 'Customer Obsession'.
Over the next five years, with the introduction of smartphones, the concept of 'customer obsession' only intensified. Businesses were able to seamlessly track and target customers for specific services. Desktops had put a digital device in every household, whereas smartphones made it possible to track every person through their digital footprint.
This further paved the way for on-demand companies like Uber, Airbnb and Deliveroo, where consumer demand is fulfilled in a 'digital' marketplace. These companies also collected tonnes of data through their smartphone apps to better understand their customers and provide them with the best service.
By 2015, so much data was being generated by digital devices that large-scale data analysis became a necessity. Machine learning algorithms had evolved significantly by then, and GPU processing made AI operational.
So, this is how the Data Science we have today evolved. As you may observe, data science, or 'learning from data', is not something 'new'; it has been around for many decades, but in different flavours. The seed for this idea was planted by Frederick Taylor's scientific management theory. When companies today track their customers or employees to improve their service offerings, some describe it as 'Digital Taylorism'. Only the terminology has changed over the decades (and will change again), but the methodology of improving through data will prevail. Hence, this is not the time to fear data science taking over but to embrace it, as it will touch every aspect of our lives and influence every decision we make in the future.
Last but not least, thanks to HP for creating SOTEDx, which provided a forum to share my thoughts.