On Fri, May 3, 2019 at 7:29 PM, Uri Goren <ugo...@gmail.com> wrote:
> I usually use clustering to save costs on labelling.
> I like to apply hierarchical clustering, and then label a small sample and
> fine-tune the clustering algorithm.
>
> That way, you can evaluate the effectiveness in terms of cluster purity
> (how many clusters contain mixed labels)
>
> See example with sklearn here:
> https://youtu.be/GM8L324MuHc?list=PLqkckaeDLF4IDdKltyBwx8jLaz5nwDPQU

But if my dataset is too large to load into memory, will it work?
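
For reference, here is a minimal sketch of the workflow described in the quoted message, as I understand it: cluster hierarchically, label only a small sample, then check how mixed the clusters are. The synthetic data, the sample size of 30, and the helper names `purity` and `mixed_cluster_fraction` are my own assumptions for illustration and are not taken from the linked video. Note that it fits on an in-memory array, so it does not by itself address the out-of-memory case.

```python
import numpy as np
from collections import Counter
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_blobs

# Synthetic stand-in data (assumption): y_true plays the role of the labels
# we would otherwise have to pay annotators for.
X, y_true = make_blobs(
    n_samples=300,
    centers=[[0, 0], [10, 10], [-10, 10]],
    cluster_std=0.5,
    random_state=0,
)

# Hierarchical (agglomerative) clustering on the full, in-memory array.
cluster_ids = AgglomerativeClustering(n_clusters=3, linkage="ward").fit_predict(X)

# "Label" only a small random sample (here we just look up y_true).
rng = np.random.default_rng(0)
sample_idx = rng.choice(len(X), size=30, replace=False)
sample_clusters = cluster_ids[sample_idx]
sample_labels = y_true[sample_idx]


def mixed_cluster_fraction(clusters, labels):
    """Fraction of sampled clusters that contain more than one distinct label."""
    labels_by_cluster = {}
    for c, l in zip(clusters, labels):
        labels_by_cluster.setdefault(c, set()).add(l)
    mixed = sum(1 for s in labels_by_cluster.values() if len(s) > 1)
    return mixed / len(labels_by_cluster)


def purity(clusters, labels):
    """Share of sampled points that carry their cluster's majority label."""
    counts_by_cluster = {}
    for c, l in zip(clusters, labels):
        counts_by_cluster.setdefault(c, Counter())[l] += 1
    majority_total = sum(max(cnt.values()) for cnt in counts_by_cluster.values())
    return majority_total / len(labels)


mixed_fraction = mixed_cluster_fraction(sample_clusters, sample_labels)
purity_score = purity(sample_clusters, sample_labels)
print(f"mixed clusters: {mixed_fraction:.2f}, purity: {purity_score:.2f}")
```

If the purity on the labelled sample is low, one could change `n_clusters` or `linkage` and re-check on the same sample, which is how I read "fine-tune the clustering algorithm" above.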
_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn