Re: [scikit-learn] Can I evaluate clustering efficiency incrementally?

2019-05-14 Thread Joel Nothman
Evaluating on large datasets is easy if the sufficient statistics are just the contingency matrix.

On Tue., 14 May 2019, 11:19 pm Tom Augspurger wrote:
> If anyone is interested in implementing these, dask-ml would welcome
> additional metrics that work well with Dask arrays:
> https://github
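To illustrate the point about sufficient statistics: a metric that depends only on the contingency matrix can be evaluated batch by batch, because the matrix is just a table of counts and counts add up across batches. The sketch below is only an illustration under stated assumptions, not scikit-learn's or dask-ml's API. The function names `update_contingency` and `adjusted_rand_from_contingency` are hypothetical, the contingency matrix is stored sparsely as a `Counter` keyed by (true label, predicted label), and the adjusted Rand index is used as one example of a metric that needs nothing beyond those counts.

```python
from collections import Counter
from math import comb


def update_contingency(counts, labels_true, labels_pred):
    """Add one batch of (true, predicted) label pairs to the running counts.

    `counts` is a Counter acting as a sparse contingency matrix
    (hypothetical helper, not part of scikit-learn).
    """
    counts.update(zip(labels_true, labels_pred))
    return counts


def adjusted_rand_from_contingency(counts):
    """Compute the adjusted Rand index from the contingency counts alone."""
    # Row and column marginals of the contingency matrix.
    row_totals = Counter()
    col_totals = Counter()
    for (true_label, pred_label), n in counts.items():
        row_totals[true_label] += n
        col_totals[pred_label] += n

    n_samples = sum(counts.values())
    total_pairs = comb(n_samples, 2)
    if total_pairs == 0:
        # Fewer than two samples: treat the clusterings as trivially equal.
        return 1.0

    sum_cells = sum(comb(n, 2) for n in counts.values())
    sum_rows = sum(comb(n, 2) for n in row_totals.values())
    sum_cols = sum(comb(n, 2) for n in col_totals.values())

    expected = sum_rows * sum_cols / total_pairs
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        # Degenerate case (e.g. a single cluster on both sides).
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


# Usage: stream the data in batches, keeping only the counts in memory.
counts = Counter()
update_contingency(counts, [0, 0], [0, 0])  # first batch
update_contingency(counts, [1, 1], [1, 2])  # second batch
score = adjusted_rand_from_contingency(counts)
```

Only the counts are kept between batches, so memory use grows with the number of distinct (true, predicted) label pairs rather than with the number of samples.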

Re: [scikit-learn] Can I evaluate clustering efficiency incrementally?

2019-05-14 Thread Tom Augspurger
If anyone is interested in implementing these, dask-ml would welcome additional metrics that work well with Dask arrays: https://github.com/dask/dask-ml/issues/213.

On Tue, May 14, 2019 at 2:09 AM Uri Goren wrote:
> Sounds like you need to use spark,
> this project looks promising:
> https://git

Re: [scikit-learn] Can I evaluate clustering efficiency incrementally?

2019-05-14 Thread Uri Goren
Sounds like you need to use Spark; this project looks promising: https://github.com/xiaocai00/SparkPinkMST

On Tue, May 14, 2019 at 5:12 AM lampahome wrote:
>
> Uri Goren wrote on Fri, May 3, 2019 at 7:29 PM:
>
>> I usually use clustering to save costs on labelling.
>> I like to apply hierarchical clustering