I am using Birch to cluster my data, which is roughly time-series data. I don't know the actual number of clusters in advance, and I need to process a large dataset incrementally (online learning), so I chose Birch rather than MiniBatchKMeans.
From what I have read, the critical parameters seem to be branching_factor and threshold, and threshold clearly affects the number of clusters I end up with. Is there a way to estimate a suitable threshold for Birch? Paper suggestions are welcome too. Thanks.
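
To make the question concrete, here is a minimal sketch of how threshold changes the number of subclusters when Birch is fed data in batches. It is only an illustration under assumptions: the data is a synthetic stand-in generated with make_blobs (not my real time series), and the helper name count_subclusters, the batch size, and the list of threshold values are all made up for the example. Setting n_clusters=None skips the final global clustering step, so the count reflects the CF-tree leaves directly.

```python
import numpy as np
from sklearn.cluster import Birch
from sklearn.datasets import make_blobs

# Synthetic stand-in data (assumption): three well-separated 2-D blobs.
X, _ = make_blobs(
    n_samples=3000,
    centers=[[0, 0], [10, 10], [-10, 10]],
    cluster_std=0.5,
    random_state=0,
)


def count_subclusters(X, threshold, branching_factor=50, batch_size=500):
    """Fit Birch incrementally and return the number of leaf subclusters.

    Hypothetical helper for illustration only.
    """
    # n_clusters=None: no final global clustering, keep raw subclusters.
    model = Birch(
        threshold=threshold,
        branching_factor=branching_factor,
        n_clusters=None,
    )
    # Feed the data in chunks, as one would in an online setting.
    for start in range(0, len(X), batch_size):
        model.partial_fit(X[start:start + batch_size])
    return len(model.subcluster_centers_)


# Candidate thresholds (arbitrary values chosen for the example).
thresholds = [0.25, 0.5, 1.0, 2.0, 4.0]
counts = {t: count_subclusters(X, t) for t in thresholds}

for t, n in counts.items():
    print(f"threshold={t:<5} subclusters={n}")
```

A sweep like this on a sample of the data shows how sensitive the subcluster count is to threshold, but it does not say which value is right, which is what I am asking about.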
_______________________________________________ scikit-learn mailing list scikit-learn@python.org https://mail.python.org/mailman/listinfo/scikit-learn