I am trying to use Birch to cluster time-series data incrementally. Because I do not have enough memory to hold the whole dataset, I train it batch by batch: 50 batches of 1000 samples each.
When I train on only the first batch, it clusters well. After that, I train on the following batches with the same model, using partial_fit. I found that the clustering result gets worse the more batches I train on, right up to the last one. Some samples end up in a cluster whose other members look very different from them. Is there any way to improve this? Or am I using the wrong method?
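
For reference, a minimal sketch of the workflow described above, assuming scikit-learn's Birch. The synthetic data generator (make_batch), the feature count, the threshold, and the final n_clusters value are illustrative assumptions, not the actual settings. The sketch builds the CF tree with n_clusters=None and runs the global clustering step only once, after the last batch, by calling partial_fit() with no data.

```python
import numpy as np
from sklearn.cluster import Birch

rng = np.random.default_rng(0)
N_FEATURES = 4  # assumption: each sample is a fixed-length feature vector


def make_batch(n_samples=1000):
    # Hypothetical stand-in for one batch of time-series feature vectors:
    # three well-separated Gaussian blobs.
    centers = np.array([[0.0] * N_FEATURES,
                        [5.0] * N_FEATURES,
                        [-5.0] * N_FEATURES])
    idx = rng.integers(0, len(centers), size=n_samples)
    return centers[idx] + rng.normal(scale=0.5, size=(n_samples, N_FEATURES))


# n_clusters=None: only build the CF tree while streaming; subclusters are
# returned as-is and no global clustering is applied per batch.
model = Birch(threshold=2.0, n_clusters=None)

for _ in range(50):  # 50 batches of 1000 samples
    model.partial_fit(make_batch())

# Run the global clustering step once, on the final set of subclusters.
model.set_params(n_clusters=3)
model.partial_fit()  # no X: only the global clustering step is performed

labels = model.predict(make_batch())
```

With the default n_clusters=3, the global clustering step is re-run at the end of every partial_fit call, so cluster assignments can shift from batch to batch. Deferring that step to the end is a design choice of this sketch; whether it helps with the degradation described above will depend on the data and on the threshold.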
_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn