Hi everyone,
I have data with the following attributes: (Latitude, Longitude). I am now
clustering this data using DBSCAN. I have the following questions:
1. How can I add data to the data set of the package?
2. How can I calculate the Rand index for my data?
3. How to use make_blobs command fo
Hi developers,
I am excited to have the opportunity to work with you!
I am Yizheng Zhao, a graduate student at Carnegie Mellon University majoring in
Software Engineering. I received my Bachelor’s degree in Math from Jilin
University in 2016.
I love Python and machine learning, and that is why I wann
Hi, Shuchi,
> 1. How can I add data to the data set of the package?
You don’t need to add your dataset to the dataset module to run your analysis.
A convenient way to load it into a NumPy array would be via pandas. E.g.,
import pandas as pd
df = pd.read_csv('your_data.txt', delimiter=r"\s+")
X
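To make the loading step concrete, here is a minimal sketch. It assumes a whitespace-separated text file with a header row; the column names and sample values are made up for illustration, and an in-memory string stands in for the real file so the snippet runs on its own.

```python
import io

import pandas as pd

# Hypothetical file contents; in practice you would pass a file path instead.
sample_text = """Latitude Longitude
40.4433 -79.9436
40.4440 -79.9420
43.8623 125.3320
"""

# Read whitespace-separated columns; the first row is used as the header.
df = pd.read_csv(io.StringIO(sample_text), delimiter=r"\s+")

# Convert the DataFrame into a plain NumPy array of shape (n_samples, 2).
X = df.to_numpy()
print(X.shape)
```

With your own data, replacing io.StringIO(sample_text) with the path to your file should be the only change needed, provided the layout matches.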
Since you're using lat/long coords, you'll also want to convert them
to radians and specify 'haversine' as your distance metric, i.e.:
import numpy as np
coords = np.vstack([lats.ravel(), longs.ravel()]).T
coords *= np.pi / 180.  # degrees to radians
...and:
db = DBSCAN(eps=0.3, min_samples=10, metric='haversi
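As a rough end-to-end sketch of the haversine approach: with metric='haversine', scikit-learn measures distances in radians on a unit sphere, so eps is also in radians. One way to pick it is to divide a distance in kilometres by the Earth's radius. The coordinates, the 5 km radius, and min_samples=2 below are assumptions chosen only for illustration.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Hypothetical (latitude, longitude) pairs in degrees: two distant groups.
coords_deg = np.array([
    [40.4433, -79.9436],
    [40.4440, -79.9420],
    [40.4450, -79.9500],
    [43.8623, 125.3320],
    [43.8630, 125.3300],
    [43.8610, 125.3350],
])

# The haversine metric expects [latitude, longitude] in radians.
coords_rad = np.radians(coords_deg)

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (approximation)
eps_km = 5.0              # assumed neighbourhood radius in kilometres

db = DBSCAN(
    eps=eps_km / EARTH_RADIUS_KM,  # convert km to radians on the unit sphere
    min_samples=2,
    metric="haversine",
    algorithm="ball_tree",
)
labels = db.fit_predict(coords_rad)
print(labels)
```

Points labelled -1 are treated as noise by DBSCAN; every other label is a cluster index.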
Thank you so much for your quick reply. I have one more question. The statement
below is used to calculate the adjusted Rand score:
metrics.adjusted_rand_score(labels_true, labels_pred)
In my case, what would labels_true and labels_pred be, and how do I
calculate labels_pred?
With Best Regards,
Shuchi Mala
Resea
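On the labels_true / labels_pred question above: labels_pred is the array of cluster assignments returned by the fitted clustering model, while labels_true is a reference grouping that has to come from outside the algorithm. If no such reference labels exist for the data, the adjusted Rand index cannot be computed. The sketch below uses made-up points and made-up reference labels purely to show the mechanics.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.metrics import adjusted_rand_score

# Hypothetical 2-D points forming two well-separated groups.
X = np.array([
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
    [5.0, 5.0], [5.1, 5.0], [5.0, 5.1],
])

# labels_true: a known reference grouping (assumed to be available here).
labels_true = np.array([0, 0, 0, 1, 1, 1])

# labels_pred: the cluster assignment produced by DBSCAN for each sample.
labels_pred = DBSCAN(eps=0.5, min_samples=2).fit_predict(X)

# Compare the two labelings; the score ignores how clusters are numbered.
score = adjusted_rand_score(labels_true, labels_pred)
print(score)
```

After fitting with fit() rather than fit_predict(), the same predicted labels are available as the labels_ attribute of the estimator.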