The clusters produced by your examples are actually the same (despite the different labels).
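A quick way to check this (a sketch; it just feeds the label sequences from your two runs, quoted below, into adjusted_rand_score from sklearn.metrics, which compares partitions independently of how the labels are numbered):

>>> from sklearn.metrics import adjusted_rand_score
>>> run1 = [1, 0, 0, 0, 0]  # your labels 1-a followed by 1-b (fit, then partial_fit)
>>> run2 = [0, 1, 1, 1, 1]  # your labels 2-a followed by 2-b (partial_fit twice)
>>> adjusted_rand_score(run1, run2)  # 1.0 means identical partitions up to relabeling
1.0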
I'd guess that "fit" and "partial_fit" draw a different amount of random_numbers before actually assigning a label to the first (randomly drawn) sample from "x" (in your code). This is why the labeling is permutated. Best regards Christian Am Mo., 10. Juni 2019 um 04:12 Uhr schrieb lampahome <pahome.c...@mirlab.org >: > > > federico vaggi <vaggi.feder...@gmail.com> 於 2019年6月7日 週五 上午1:08寫道: > >> k-means isn't a convex problem, unless you freeze the initialization, you >> are going to get very different solutions (depending on the dataset) with >> different initializations. >> >> > Nope, I specify the random_state=0. u can try it. > > >>> x = np.array([[1,2],[2,3]]) > >>> y = np.array([[3,4],[4,5],[5,6]]) > >>> z = np.append(x,y, axis=0) > >>> from sklearn.cluster import MiniBatchKMeans as MBK > >>> m = MBK(random_state=0, n_clusters=2) > >>> m.fit(x) ; m.labels_ > array([1,0], dtype=int32) <-- (1-a) > >>> m.partial_fit(y) ; m.labels_ > array([0,0,0], dtype=int32) <-- (1-b) > > >>> m = MBK(random_state=0, n_clusters=2) > >>> m.partial_fit(x) ; m.labels_ > array([0,1], dtype=int32) <-- (2-a) > >>> m.partial_fit(y) ; m.labels_ > array([1,1,1], dtype=int32) <-- (2-b) > > 1-a,1-b and 2-a, 2-b are all different, especially the members of each > cluster. > I'm just confused about what usage of partial_fit and fit is the > suitable(reasonable?) way to cluster incrementally? > > thx > _______________________________________________ > scikit-learn mailing list > scikit-learn@python.org > https://mail.python.org/mailman/listinfo/scikit-learn >
_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn