Re: data science tutorials

AudioGames . net Forum — Off-topic room : visualstudio via Audiogames-reflector Mon, 24 Feb 2020 05:15:52 -0800

introduction to data science

data science, as its name says, is the science of retrieving, preprocessing and using data in different machine learning contexts
for example, we can use data to predict how house prices change over time
in theory, machine learning falls into 2 categories

supervised learning
unsupervised learning

there is also a third type of machine learning called reinforcement learning, but I might not cover it.

supervised learning

let me start with an example
suppose you have a dataset of emails, each labelled as spam or not spam (we call it ham)
now, when we receive a new email, we want to classify it as spam or ham
so, we feed in the data, which is called x, and we receive y, which is its class (first we need to train the model by showing it which emails are spam and which aren't, and the model learns from our data)
since there are only 2 categories here (spam or ham, encoded as 1 or 0), this is also called binary classification.
an example of a classification algorithm is naive Bayes classification, which scikit-learn supports (we will cover that as well).
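
here is a minimal sketch of the spam/ham idea using naive Bayes in scikit-learn. the tiny email list is made up for illustration (it is not a real dataset), and CountVectorizer plus MultinomialNB is just one reasonable combination I am assuming here, not the only way to do it.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# hypothetical toy data: 1 = spam, 0 = ham
emails = [
    "win money now",
    "free prize claim now",
    "meeting at noon tomorrow",
    "lunch with the team",
]
labels = [1, 1, 0, 0]

# turn each email into word counts (this is our x)
vectorizer = CountVectorizer()
x = vectorizer.fit_transform(emails)

# train the classifier on x and the known classes (our y)
clf = MultinomialNB()
clf.fit(x, labels)

# classify emails the model has not seen before
new_emails = ["claim your free money", "team meeting tomorrow"]
predictions = clf.predict(vectorizer.transform(new_emails))
print(predictions)
```

with this toy data the first new email should come out as spam (1) and the second as ham (0). a real spam filter would need far more training emails than this.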
another example:
suppose that we want to predict the prices of houses in a city
so, this is not a classification task. we want to calculate a quantity, which is the price
this is called regression
the simplest regression algorithm is linear regression, which is available in scikit-learn
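
as a rough sketch of the house price idea, here is linear regression on a few made-up points. the areas and prices are hypothetical: I generated them from price = 50000 + 1000 * area, so the model should recover roughly those two numbers.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# hypothetical data: house area in square metres and its price,
# generated from price = 50000 + 1000 * area (made up for illustration)
area = np.array([[50.0], [80.0], [100.0], [150.0]])
price = 50000 + 1000 * area.ravel()

# fit a straight line through the points
model = LinearRegression()
model.fit(area, price)

# the learned slope and intercept
print(model.coef_[0], model.intercept_)

# predict the price of a house the model has not seen
predicted = model.predict(np.array([[120.0]]))[0]
print(predicted)
```

real prices never sit exactly on a line like this, so on real data the model gives the best straight-line approximation instead of an exact answer.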

simple example of linear regression

suppose this function:

def calc_bmi(weight, height):
    return weight/(height**2)

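this is a simple function that maps features to a continuous value, which is what a regression model learns to do (strictly speaking, bmi is not linear in height, so a linear model can only approximate it)
given the features (weight, height), it calculates bmi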
now, download this data and we will play with it a little bit
create a python file and put the following in it:

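import pandas as pd
import joblib
from sklearn.linear_model import LinearRegression


def calc_bmi(weight, height):
    # bmi is defined with weight in kilograms and height in metres;
    # if your Davis.csv stores height in centimetres, convert it first
    return weight/(height**2)

# load the dataset and look at the first few rows
df = pd.read_csv("Davis.csv")
print(df.head())

# features (x) and target (y); calc_bmi works on whole columns at once
x = df[["weight", "height"]]
y = calc_bmi(df["weight"], df["height"])

# fit the linear model and save it to disk
model = LinearRegression()
model.fit(x, y)
joblib.dump(model, "bmi.pkl")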

this is a simple linear regression based on a dataset. I know this is not how you would normally do it (here y is computed with a formula instead of being measured), but it is an example to show you what regression is.
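
to see what the saved model is good for, here is a small self-contained sketch that loads a model back with joblib and compares its prediction with the real formula. I am assuming a made-up grid of weights (kg) and heights (metres) instead of the Davis data, and a temporary folder for the file, so it runs without any download.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.linear_model import LinearRegression


def calc_bmi(weight, height):
    return weight / (height ** 2)


# hypothetical grid of weights (kg) and heights (m), not the Davis data
weights, heights = np.meshgrid(np.linspace(50, 100, 11), np.linspace(1.5, 2.0, 11))
x = np.column_stack([weights.ravel(), heights.ravel()])
y = calc_bmi(x[:, 0], x[:, 1])

model = LinearRegression().fit(x, y)

# save and reload, using a temporary folder so nothing is left behind
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "bmi.pkl")
    joblib.dump(model, path)
    loaded = joblib.load(path)

# compare the model's guess with the real formula for one person
sample = np.array([[70.0, 1.75]])
predicted = loaded.predict(sample)[0]
actual = calc_bmi(70.0, 1.75)
print(predicted, actual, abs(predicted - actual))
```

the prediction should land close to the real bmi but not exactly on it, because the formula divides by height squared and a linear model can only approximate that.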
so, in classification, we predict discrete values (1, 2, etc., which are our classes), while in regression, we predict continuous values (0, 0.0000000001, 0.0000000002, etc.).
have a nice time

