Data Science

By Pritha Banerjee, L&T MHPS

‘Data Science’! Isn’t that the term the entire world has been talking about recently? What is this buzz all about? What is data? How is it related to science? What does Data Science mean as a whole?
Is it a new field, or old wine packaged in a new bottle?
Let us dive deep into the world of Data Science.

  • Data

Data can be simply defined as raw facts, in the form of text and/or numbers, without any context. Broadly, data can be classified as qualitative (categories or descriptions, such as a product colour) or quantitative (measurable numbers, such as a price).

“The world’s most valuable resource is no longer oil, but data” – The Economist
  • What is the source of this data?

Data is everywhere, and it is generated from many different sources. Any question-and-answer session can create data. Any experiment that uses sensors and instruments can create data. Our social media profiles create large volumes of unstructured data in the form of images and sounds. Every e-commerce site creates data. Organizations hold data in the form of financial logs, text files and multimedia, to name a few.

Different sources of Data

Recently, the amount of unstructured data has grown immensely compared to structured data. So, we need a science to study this data and a technology to handle it and make it useful. This is where Data Science comes into the picture.

Difference between Structured & Unstructured Data
  • Data Science:

Data Science can be simply described as a field that combines mathematics, statistics and programming. More formally, it is a multi-disciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from structured and unstructured data in various forms.

Data Science Process
  • Is Data Science something new?

No. The term ‘data science’ has been used in different contexts over the past thirty years, but it has become an established term only in recent times.

  • Purpose of Data Science

Data science serves many different purposes today. A few are mentioned below:

  • Today, every business has become data-driven. The right methodology, technology and resources can provide a better understanding of the business, and the data a business generates and acquires can inform a better business model that leads to outcomes in line with its vision. With consumable data easily available, many tools can be explored and applied to build a sophisticated data analytics solution that provides better insights.
  • In this competitive market, every organization is eager to bring in more business. Based on your customers' past browsing history and demographic details, you can get a much clearer idea of their requirements. With the vast quantity and variety of data available, we can train models more effectively for customer recommendations.
  • Using advanced machine learning algorithms, we can train decision-making models on data from sensors, cameras, radars and lasers to follow a particular map at a controlled speed, and thus create a self-driving car.
  • Using predictive analysis models, we can generate weather forecasts or predict the occurrence of natural calamities.
  • The list goes on!
Priority Areas of Data Science
  • Techniques of Data Science:

A Data Scientist first performs exploratory analysis of the available data to discover insights from it, and then uses various advanced machine learning algorithms to make predictions or support decisions. The basic techniques are as follows:

Different methods of Analytics

Descriptive Analysis: A model that summarizes the data in a meaningful way.
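
As a minimal sketch of descriptive analysis, the snippet below summarizes a small list of numbers using Python's standard `statistics` module. The `summarize` helper and the `daily_orders` figures are made up for illustration and are not taken from any real dataset.

```python
import statistics

def summarize(values):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }

# Hypothetical daily order counts for one week (illustrative data only)
daily_orders = [12, 15, 11, 20, 18, 25, 11]
summary = summarize(daily_orders)

for name, value in summary.items():
    print(f"{name}: {value}")
```

A summary like this describes what has already happened; it does not, by itself, predict anything.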

Predictive Analysis: A model that can predict the likelihood of a particular event in the future. For example, whether a borrower will default on his/her payment or not.
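
One common way to build such a model is logistic regression. The sketch below is only an illustration under stated assumptions: the single feature (a debt-to-income ratio), the eight training records and the function names are all hypothetical, and the model is fitted with plain gradient descent rather than a production library.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic_regression(xs, ys, learning_rate=0.5, epochs=5000):
    """Fit p(y=1 | x) = sigmoid(w * x + b) by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(w * x + b) - y  # predicted probability minus label
            grad_w += error * x
            grad_b += error
        w -= learning_rate * grad_w / n
        b -= learning_rate * grad_b / n
    return w, b

# Hypothetical history: debt-to-income ratio, and whether the loan defaulted (1) or not (0)
ratios = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
defaulted = [0, 0, 0, 0, 1, 1, 1, 1]

w, b = train_logistic_regression(ratios, defaulted)

def default_probability(ratio):
    """Estimated probability of default for a new applicant."""
    return sigmoid(w * ratio + b)

for ratio in (0.15, 0.85):
    print(f"ratio={ratio:.2f} -> estimated default probability {default_probability(ratio):.2f}")
```

The output is a probability rather than a yes/no answer, so the business can choose its own cut-off for what counts as too risky.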

Prescriptive Analysis: An intelligent model that is capable of making its own decisions and modifying them as parameters change dynamically. For example, a self-driving car.

Machine Learning: In supervised learning, a model is trained on the available labelled data and then predicts future occurrences. For example, a fraud detection model can be trained using a finance company's past transaction data, which includes records of fraudulent purchases.

Similarly, machine learning can also be used for pattern recognition. Such a model can find hidden patterns in the dataset to make meaningful predictions. Algorithms that group similar records in this way are called clustering algorithms, a form of unsupervised learning.
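
To make the idea of clustering concrete, here is a minimal sketch of k-means, one widely used clustering algorithm, written for one-dimensional data. The `k_means_1d` function, the choice of the first k points as starting centroids, and the `monthly_spend` values are illustrative assumptions; real projects would normally rely on a tested library implementation.

```python
def k_means_1d(points, k, max_iterations=100):
    """Group 1-D points into k clusters with a basic k-means loop."""
    centroids = list(points[:k])  # simple deterministic start: the first k points
    labels = []
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid
        labels = [
            min(range(k), key=lambda c: abs(p - centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            new_centroids.append(sum(members) / len(members) if members else centroids[c])
        if new_centroids == centroids:
            break  # no centroid moved, so the clusters are stable
        centroids = new_centroids
    return labels, centroids

# Hypothetical monthly spend per customer (illustrative data only)
monthly_spend = [10, 12, 11, 13, 50, 52, 49, 51]
labels, centroids = k_means_1d(monthly_spend, k=2)

print("cluster labels:", labels)
print("cluster centres:", centroids)
```

No labels were supplied here: the algorithm separates low spenders from high spenders purely from the shape of the data.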

Machine Learning techniques
  • Various Tools Used during the entire life cycle of Data Science
Data Science Life Cycle

Step 1: Discovery or data understanding

Step 2: Data preparation – explore, pre-process and condition data prior to modelling.
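
Two typical preparation tasks are filling in missing values and bringing numbers onto a common scale. The sketch below shows one simple way to do each; the helper names and the `ages` list are hypothetical, and mean imputation and min-max scaling are only two of many possible choices.

```python
def fill_missing_with_mean(values):
    """Replace None entries with the mean of the values that are present."""
    present = [v for v in values if v is not None]
    mean = sum(present) / len(present)
    return [mean if v is None else v for v in values]

def min_max_scale(values):
    """Rescale values linearly so the smallest becomes 0 and the largest 1."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]  # all values identical, nothing to scale
    return [(v - low) / (high - low) for v in values]

# Hypothetical customer ages with two missing entries (illustrative data only)
ages = [25, None, 35, 45, None]

filled = fill_missing_with_mean(ages)
scaled = min_max_scale(filled)

print("filled:", filled)
print("scaled:", scaled)
```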

Step 3: Model Planning – apply Exploratory Data Analysis (EDA) using various statistical formulas and visualization tools to determine the relationships between variables. Tools such as R, SQL and SAS/ACCESS can be used for this purpose, while Tableau and Power BI can help with data visualization.
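
One of the simplest statistical checks for a relationship between two variables is the Pearson correlation coefficient, which ranges from -1 (perfect inverse relationship) to +1 (perfect direct relationship). The sketch below computes it in plain Python as an illustration; the `ad_spend` and `sales` figures are invented, and the tools named in this step offer the same calculation built in.

```python
import math

def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)

# Hypothetical advertising spend and sales for five months (illustrative data only)
ad_spend = [10, 20, 30, 40, 50]
sales = [25, 41, 62, 79, 103]

r = pearson_correlation(ad_spend, sales)
print(f"correlation between ad spend and sales: {r:.3f}")
```

A value close to +1 suggests that the two variables move together, which makes ad spend a candidate input for the model built in the next step. Correlation alone does not prove that one causes the other.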

Step 4: Model Building – develop training and test datasets and apply techniques such as regression, classification or clustering to build the models.
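
The sketch below illustrates this step with the simplest possible case: the data is split into training and test sets, a straight line is fitted to the training set by least squares, and the error is measured on the held-out test set. The function names, the 80/20 split, the fixed random seed and the synthetic dataset (which follows y = 2x + 1 exactly) are all assumptions made for illustration.

```python
import random

def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle the rows reproducibly and hold out a fraction for testing."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    test_size = max(1, int(len(rows) * test_fraction))
    return rows[test_size:], rows[:test_size]

def fit_line(rows):
    """Least-squares fit of y = slope * x + intercept to (x, y) pairs."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    denominator = sum((x - mean_x) ** 2 for x, _ in rows)
    slope = numerator / denominator
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Synthetic (x, y) pairs that follow y = 2x + 1 exactly (illustrative data only)
dataset = [(x, 2 * x + 1) for x in range(1, 11)]

train_rows, test_rows = train_test_split(dataset)
slope, intercept = fit_line(train_rows)

# Evaluate on rows the model has never seen
mean_abs_error = sum(abs((slope * x + intercept) - y) for x, y in test_rows) / len(test_rows)

print(f"fitted line: y = {slope:.2f} * x + {intercept:.2f}")
print(f"mean absolute error on test set: {mean_abs_error:.4f}")
```

Keeping the test rows out of training is what makes the error figure an honest estimate of how the model will behave on new data.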

Step 5: Operationalize – run the trained model on the required data and obtain the results.

Step 6: Communicate results – identify all the key findings and communicate them to the stakeholders to determine whether the model is successful or not.

To conclude, we can confidently say that the future belongs to Data Science. More and more data will provide opportunities to drive key business decisions and will soon change the way we look at the data-overloaded world around us.

Hope you enjoyed reading this article! Please share your valuable feedback.

Note: The technical details and pictures used here were sourced from Google searches and various blogs.

One source that deserves mention is www.edureka.co/blog

Madhulika

Well Presented Pritha…

Anonymous

Very informative! The writer has simplified and presented the topic in such a way that a layman can understand. A good read indeed. 🙂

Sangeeta Maity

Very informative and practical. Exactly what we need every moment in today's life. Thank you for sharing.