Hello scikit-learn developers. I've noticed a somewhat unexpected difference between the behaviour of the KMeans class and the MiniBatchKMeans class.
When the 'n_init' argument is given, I'd expect both of these classes to run the corresponding algorithm (Lloyd and mini-batch k-means, respectively) 'n_init' times on the data to be fitted, each time with a different initialization, and then select the result which gives the smallest inertia.

However, while this expectation is met by the KMeans class, it is not met by the MiniBatchKMeans class: the latter only executes the *initialization* of centroids 'n_init' times, then selects the initialization that gives the smallest inertia, and runs the mini-batch k-means algorithm only once, with that initialization. This difference in behaviour is not made apparent in the documentation, either.

So, my question is: is this a bug, or is it intended behaviour? And if it's intended behaviour, should the documentation be adjusted to mention it explicitly?

Best regards (and BTW, thank you for your great software!),
Stefano
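
P.S. To make the difference concrete, here is a minimal toy sketch of the two strategies described above. It is pure Python on 1-D data and is not scikit-learn's code: the function names (fit_restarts, fit_best_init, lloyd, inertia) are made up for illustration, and plain Lloyd iterations stand in for the mini-batch updates in the second strategy.

```python
import random

def inertia(points, centers):
    """Sum of squared distances from each point to its nearest center."""
    return sum(min((p - c) ** 2 for c in centers) for p in points)

def lloyd(points, centers, n_iter=20):
    """Plain Lloyd iterations on 1-D data, starting from the given centers."""
    centers = list(centers)
    for _ in range(n_iter):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda j: (p - centers[j]) ** 2)
            clusters[nearest].append(p)
        # Keep the old center if a cluster ends up empty.
        centers = [sum(c) / len(c) if c else centers[j] for j, c in enumerate(clusters)]
    return centers

def fit_restarts(points, k, n_init, rng):
    """What I expected: run the full algorithm n_init times, keep the lowest inertia."""
    best = None
    for _ in range(n_init):
        centers = lloyd(points, rng.sample(points, k))
        score = inertia(points, centers)
        if best is None or score < best[0]:
            best = (score, centers)
    return best

def fit_best_init(points, k, n_init, rng):
    """What I observed: score n_init initializations, then run the algorithm once."""
    inits = [rng.sample(points, k) for _ in range(n_init)]
    start = min(inits, key=lambda c: inertia(points, c))
    centers = lloyd(points, start)
    return inertia(points, centers), centers

# Toy data: three 1-D blobs (assumed values, for illustration only).
data_rng = random.Random(0)
points = [data_rng.gauss(mu, 0.5) for mu in (0.0, 5.0, 10.0) for _ in range(50)]

score_a, _ = fit_restarts(points, k=3, n_init=10, rng=random.Random(1))
score_b, _ = fit_best_init(points, k=3, n_init=10, rng=random.Random(1))
print(f"restarts: {score_a:.2f}  best-init-then-run-once: {score_b:.2f}")
```

The first function pays for n_init full runs, while the second pays for n_init inertia evaluations plus a single run, which is presumably the motivation for the cheaper variant; the two can end up in different local minima.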
