Perhaps you are reading this article because of its intriguing title: “Data science is the most exciting specialty in the world.” But, dear reader, this striking title is not my invention; it comes from the prestigious Harvard Business Review.
What is data science? Why is it the most exciting discipline of the twenty-first century? That's what we'll cover in today's article… Buckle up and get ready for a journey that will change your thinking and your entire life.
What is data science?
Data science is the discipline that deals with data in all its forms, from collecting and organizing it to analyzing it, reviewing it, and extracting meaningful indicators from it. It combines several fields, including computer science, statistics, and a number of other important modern sciences.
You didn't understand anything, did you? Then think of data science as a combination of storytelling and fortune-telling: the data scientist tells the story he reads in the numbers in front of him, then uses his crystal ball to predict how the story will end and what other scenarios might unfold.
Data science is also the foundation on which many of today's popular disciplines are built, such as machine learning, deep learning, and big data, which makes it one of the most sought-after specialties in the labor market of the twenty-first century.
Although humans have needed and practiced data science since ancient times, the term “data science” gained its current momentum only in recent decades, especially after the Internet reached into every area of our lives and produced a volume of data many times greater than everything mankind had produced throughout its earlier existence.
Why is data science so important?
In October 2012, Harvard Business Review published an article titled “Data Scientist: The Sexiest Job of the 21st Century,” marking the start of the Fourth Industrial Revolution, known globally as the “Data Revolution.”
This revolution is led by programmers, data scientists, and artificial intelligence (AI) developers, who are changing our world at breakneck speed while we generate massive amounts of data for them to analyze and study.
You may find it hard to imagine how much data we produce. I will not give you the figure per day, but per minute:
41,666,667 messages are sent through the WhatsApp application.
147,000 photos are posted on Facebook.
404,444 hours of movies and series are watched on Netflix.
6,659 packages are shipped through Amazon.
And there is much more data whose size the human mind cannot even imagine.
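To get a feel for the scale, the per-minute figures above can be scaled to a full day by multiplying by 1,440 minutes. The following is a minimal sketch of that arithmetic; the variable and function names are illustrative assumptions, and the counts are simply the ones quoted in this article.

```python
# Per-minute figures quoted in the article; names are illustrative only.
PER_MINUTE = {
    "WhatsApp messages": 41_666_667,
    "Facebook photos": 147_000,
    "Netflix hours watched": 404_444,
    "Amazon packages shipped": 6_659,
}

MINUTES_PER_DAY = 24 * 60  # 1,440 minutes in a day


def per_day(per_minute: dict[str, int]) -> dict[str, int]:
    """Scale each per-minute count to a full day."""
    return {name: count * MINUTES_PER_DAY for name, count in per_minute.items()}


if __name__ == "__main__":
    for name, total in per_day(PER_MINUTE).items():
        print(f"{name}: {total:,} per day")
```

At this rate, the WhatsApp figure alone works out to roughly 60 billion messages per day.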
Major technology companies use this data to understand our behavior, study it, and then predict our actions. It also helps companies make decisions, forecast the future, and reduce risks, and it is sometimes sold as a source of revenue.
What is data science all about?
Data science involves a number of essential operations that can be summarized as asking a question and then answering it, although in a more complex way: the goal of the question may be to predict a certain phenomenon, classify some data, identify a pattern in the data, generate recommendations before taking an important step, or even support quality control and maintenance.
There are many examples, such as predicting the number of Hyundai buyers in April 2021, finding out what Amazon shoppers look for in refrigerators, measuring the speed of weather forecasting algorithms, and many other applications that touch almost everything in our lives.
This is done through several stages known as the data scientist's journey.
I will explain these stages and apply them to a simplified example:
1. Understand the problem and ask the right questions
At the start, the data scientist studies the problem until he understands it well enough to ask a number of pivotal questions about the factors that most strongly influence and control the phenomenon under study.
Suppose a distinguished data scientist was reading one of our articles and became very curious about the quality of Seo7u's articles. He identified the key questions that would help him assess that quality:
What is the average time readers spend on the site?
What percentage of website visitors come back to Google to find another source of information?
What percentage of visitors go on to another page on the site (evidence that they like the content and want more of it)?
What is the average number of comments on the content?
What are the averages for the previous metrics on other sites?
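Once visit data is available, the questions above reduce to a few averages and percentages. Here is a minimal sketch of how they could be computed; the `Visit` record, its field names, and the sample values are all hypothetical, since a real analytics export would look different.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Visit:
    """One visit to an article. Every field is a hypothetical stand-in."""
    seconds_on_site: float
    returned_to_search: bool   # went back to Google for another source
    viewed_another_page: bool  # clicked through to another page on the site
    comments_left: int


def summarize(visits: list[Visit]) -> dict[str, float]:
    """Compute the averages and percentages the questions ask for."""
    total = len(visits)
    return {
        "avg_seconds_on_site": mean(v.seconds_on_site for v in visits),
        "pct_returned_to_search": 100 * sum(v.returned_to_search for v in visits) / total,
        "pct_viewed_another_page": 100 * sum(v.viewed_another_page for v in visits) / total,
        "avg_comments": mean(v.comments_left for v in visits),
    }


# Made-up sample data, for illustration only.
SAMPLE_VISITS = [
    Visit(120.0, False, True, 1),
    Visit(30.0, True, False, 0),
    Visit(300.0, False, True, 3),
    Visit(150.0, False, False, 0),
]

if __name__ == "__main__":
    for metric, value in summarize(SAMPLE_VISITS).items():
        print(f"{metric}: {value:.1f}")
```

Running the same function on data from other sites would give the comparison that the last question asks for.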
2. Collect data related to this question
In the next step, the data scientist collects all possible data to answer this question from various sources, such as databases, Internet servers, questionnaires or opinion polls he conducts and other sources.
This data scientist could then create a survey and publish it online, asking visitors whether they know Seo7u and, if they do, to rate the following factors from 1 to 5:
Quality of site articles.
Seo7u's articles are superior to other sites that write on the same topics.
How free the articles are of spelling and grammatical errors.
The impact of Seo7u's articles on you.
Then he asks them about their age, country and gender.
3. Prepare and process the data
Then comes the most difficult stage: preparing the data. The data scientist cleans the data, removing or correcting anything that might undermine the validity of his results, then gathers the remaining data and organizes it in a practical form that he can use to answer his question.
The data scientist also takes stock of what he has, identifying the main features of the data and the variables most important to answering his question, so that he can choose the best data analysis model.
The data scientist then collects the results of a survey that 50,000 people have filled out.
He excludes people who do not know Seo7u, then removes duplicate and incomplete answers, leaving him with, say, 48,000 answers. He realizes that the most important variables to focus on are people's opinion of the quality of the articles and the extent of the articles' impact on them, and he chooses the most appropriate data analysis model accordingly.
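The cleaning rules in this example (drop people who do not know the site, drop incomplete answers, drop duplicates) can be sketched in a few lines. This is only an illustration under assumptions: the record layout, the field names such as `respondent_id` and `knows_site`, and the sample answers are all hypothetical.

```python
# Hypothetical names for the four rated factors in the survey.
RATING_FIELDS = ("quality", "vs_other_sites", "language", "impact")


def clean_responses(responses: list[dict]) -> list[dict]:
    """Keep one complete answer per respondent who knows the site."""
    seen_ids = set()
    kept = []
    for response in responses:
        if not response.get("knows_site"):
            continue  # respondent does not know Seo7u
        if any(response.get(field) is None for field in RATING_FIELDS):
            continue  # incomplete answer
        if response["respondent_id"] in seen_ids:
            continue  # duplicate submission
        seen_ids.add(response["respondent_id"])
        kept.append(response)
    return kept


# Made-up sample answers, for illustration only.
SAMPLE_RESPONSES = [
    {"respondent_id": 1, "knows_site": True, "quality": 5,
     "vs_other_sites": 4, "language": 5, "impact": 5},
    {"respondent_id": 1, "knows_site": True, "quality": 5,
     "vs_other_sites": 4, "language": 5, "impact": 5},
    {"respondent_id": 2, "knows_site": False, "quality": None,
     "vs_other_sites": None, "language": None, "impact": None},
    {"respondent_id": 3, "knows_site": True, "quality": 4,
     "vs_other_sites": None, "language": 5, "impact": 4},
    {"respondent_id": 4, "knows_site": True, "quality": 5,
     "vs_other_sites": 5, "language": 4, "impact": 5},
]

if __name__ == "__main__":
    cleaned = clean_responses(SAMPLE_RESPONSES)
    print(f"kept {len(cleaned)} of {len(SAMPLE_RESPONSES)} answers")
```

In this sketch a duplicate means a second submission with the same respondent identifier; a real survey might need a different rule for spotting duplicates.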
4. Analyze the data and use the results to answer his question
The data scientist is nearing the end of his strenuous search and will finally find the answers to his questions, but he still has to analyze the data and read the results in order to tell his story and answer the question he asked at the beginning.
After analyzing the data, the data scientist found that 99% of these readers gave the quality of the content a 5 and that 89% of them rated the impact of the articles on them as 5, so he concluded that Seo7u's articles are of high quality.
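The percentages in this example come from a simple count: how many of the cleaned answers gave a particular score. A minimal sketch, with a hypothetical function name and made-up ratings (not the survey figures above):

```python
def pct_with_rating(ratings: list[int], target: int = 5) -> float:
    """Percentage of ratings equal to the target score."""
    if not ratings:
        raise ValueError("no ratings to analyze")
    return 100 * sum(1 for rating in ratings if rating == target) / len(ratings)


# Made-up ratings on the 1-to-5 scale, for illustration only.
SAMPLE_QUALITY = [5, 5, 5, 4, 5, 5, 3, 5, 5, 5]
SAMPLE_IMPACT = [5, 4, 5, 5, 3, 5, 4, 5, 5, 2]

if __name__ == "__main__":
    print(f"quality rated 5: {pct_with_rating(SAMPLE_QUALITY):.0f}%")
    print(f"impact rated 5: {pct_with_rating(SAMPLE_IMPACT):.0f}%")
```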
Data Science: Humanity's Magic Wand
Just think how many questions data science will answer for us, and imagine what we can do with all those answers. Better yet, don't bother thinking too much: just look around and see how life is becoming simpler and more comfortable, from Netflix's wonderful recommendations to the development of healthcare around the world and the reduced danger from the spread of the Covid-19 pandemic.
Data science is not limited to entertainment and industry; it is also a driving force behind medical progress around the world, working its magic on biological and medical data in what is known as bioinformatics or medical informatics.
Without our knowledge of data science, we would not have been able to contain the coronavirus, nor would we be able to reduce the number of cancer patients in the world.
But dear reader, do not think that data science is used only in these areas; it is used in almost every field. Any field or industry that needs or uses data in any way can benefit from it, from the humanities to the missile industry, by way of everything you or I can imagine and even things I do not know exist.
The data scientist is the luckiest man in the world
According to many statistics, data science specialists are the most wanted people in the world: wanted not by the law, of course, but by the labor market. Data science has been the most in-demand specialty of recent years, with demand far exceeding supply.
That is why data scientist salaries are very high, with the average salary for data science professionals in the US at $113,000 per year, according to the well-known website Glassdoor.
To put it in numbers: in 2020, the United States alone had a shortage of about 250,000 data science specialists, while the job market grew at a remarkable rate, reaching 39% in 2019, and salaries rose by 14% in 2020 because of the need for more data science specialists.
And if this is not enough for you, forecasts indicate that the shortage will reach the millions in the next few decades.
You may be surprised by these numbers and by such steep increases in salary and demand at a time when the world is suffering from major economic problems, and you have every right to be. But data science and other relatively modern technological disciplines now run, control, and change the world, so no matter what happens, we will still need these specialists' skills and their seemingly superhuman abilities with technology.
Experts predict that the number of available jobs will increase in the coming years, and in the meantime you can work at any time through freelance platforms on projects related to data science and machine learning.
The hourly rate for data science professionals on Upwork in 2019 ranged from around $36 to $200, depending on skill and experience.
Best Data Science Study Resources
Before I mention sources for studying data science, I should point out that while most jobs in this field require you to have at least a bachelor's degree in computer science or equivalent, anyone can study data science and use it in their field.
If you are a doctor or pharmacist, you can study data science to work in bioinformatics or medical informatics, which deal with biological and medical data.
And if you work in business or finance, studying data science will allow you to work in areas such as risk analysis, user analytics, and detecting financial manipulation. The same holds whatever your major, even if it is in the humanities.
As for studying data science, a data scientist has to master a number of topics, chief among them programming, statistics, machine learning, algorithms, and databases. Fortunately, the Internet is full of excellent resources for learning this important field, and some of them start completely from scratch.
Here is a selection of the best data science learning resources:
1. IBM Data Science Professional Certificate from Coursera
It is a program offered by IBM that runs for about 12 months at a pace of 4 hours of study per week. It is one of the most popular programs on the Coursera platform, rated 4.6 out of 5 based on 45,000 reviews, and it stands out for including a graduation project at the end.
2. UCSD MicroMasters Program in Data Science on the edX platform
It is a set of excellent courses taught by professors with a strong command of their subject. The program takes about 10 months at 9 to 11 hours of study per week and costs approximately $1,400.
3. Data Science Specialization from Johns Hopkins University on Coursera
It is also one of the most popular programs and is recommended by many.
It consists of nine courses plus a graduation project and takes about 11 months to complete at 7 hours of study per week.
4. MicroMasters Program in Statistics and Data Science from MIT on the edX platform
It is a very strong set of courses, as is the case with all MIT courses, and it is taught by a variety of experts. It takes about 14 months to complete at 10 to 14 hours of study per week. Like the other edX MicroMasters programs, however, it is expensive, costing about $1,500.
It is one of the most popular data science courses on the platform, with approximately 750,000 enrolled students and a rating of 4.5 based on approximately 140,000 reviews. It contains 322 lectures totaling about 44 hours.
The course is very good, and I advise beginners to start with it, as it is relatively cheap: there is usually a big discount on it, reaching $67.
It is a very distinctive site, and many people prefer to study data science there. It costs only about $12.50 per month for courses in all branches of data science.
It is one of the most popular programming sites ever and is well liked among data science students. It offers a limited free plan, a monthly plan for $36, or an annual plan for $348.
Final words on data science
I would like to end the article with actual numbers to show you how important data science is. All the data that humans produced from the beginning of their existence until 2005 was estimated at 130 exabytes (1 exabyte equals 1,073,741,824 gigabytes).
But thanks to the Internet revolution, the amount of data grew in just 10 years to 7,900 exabytes, about 61 times as much, a number almost too large to imagine.
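The figures above are easy to check. The sketch below recomputes the growth factor from the two totals and also derives the yearly growth rate that such a jump would imply over 10 years. That yearly rate is an added calculation that assumes steady compound growth; it is not a figure from the sources behind this article.

```python
# The text uses the binary convention: 1 exabyte = 2**30 gigabytes.
GIGABYTES_PER_EXABYTE = 2 ** 30  # 1,073,741,824

DATA_UNTIL_2005_EB = 130         # exabytes produced up to 2005
DATA_TEN_YEARS_LATER_EB = 7_900  # exabytes ten years later
YEARS = 10

# How many times larger the later total is than the earlier one.
growth_factor = DATA_TEN_YEARS_LATER_EB / DATA_UNTIL_2005_EB

# Implied compound annual growth rate, assuming steady growth each year
# (an illustrative assumption, not a figure from the article).
annual_growth_rate = growth_factor ** (1 / YEARS) - 1

if __name__ == "__main__":
    print(f"1 exabyte = {GIGABYTES_PER_EXABYTE:,} gigabytes")
    print(f"growth factor: about {growth_factor:.0f}x")
    print(f"implied yearly growth: about {annual_growth_rate:.0%}")
```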
And here is one more figure for you: the amount of data in 2019 reached about 30,000 exabytes, and it continues to multiply every moment, so the world needs more and more data scientists to handle these huge amounts of data and harness them to serve humanity.
Before I finish, I would also like to advise you: if you find yourself passionate about this field, start learning it immediately, even if it takes a year or two. If you start, you will arrive, even if late; if you never start, you will never arrive. And who knows, perhaps you will be the next data genius.