[scikit-learn] Question about Kmeans implementation in sklearn

serafim loukas Mon, 05 Aug 2019 10:59:15 -0700

Dear Sklearn community,


I have a simple question concerning the implementation of the KMeans 
clustering algorithm.
Two of its input arguments are “n_init” and “random_state”.

Consider the case where “n_init=10” and “random_state=0”.

Looking at the source code 
(https://github.com/scikit-learn/scikit-learn/blob/1495f69242646d239d89a5713982946b8ffcf9d9/sklearn/cluster/k_means_.py#L187),
we find the following loop:

for it in range(n_init):
    # run a k-means once
    labels, inertia, centers, n_iter_ = kmeans_single(
        X, sample_weight, n_clusters, max_iter=max_iter, init=init,
        verbose=verbose, precompute_distances=precompute_distances,
        tol=tol, x_squared_norms=x_squared_norms,
        random_state=random_state)


My question is: why are the results not the same for all `n_init` 
iterations, given that `random_state` is fixed?
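
One possible explanation can be sketched as follows. This is only an 
illustration, not the scikit-learn code: it assumes that the integer seed is 
converted once into a stateful generator object before the loop (which is what 
`sklearn.utils.check_random_state` does with an integer seed), so that each 
call consumes fresh random numbers from the same stream. The helper 
`init_centers` is hypothetical, and Python's standard-library `random.Random` 
is used as a stand-in for NumPy's `RandomState`.

```python
import random


def init_centers(rng, n_points=100, k=3):
    # Hypothetical stand-in for a k-means initialisation step:
    # pick k distinct point indices using the supplied generator.
    return rng.sample(range(n_points), k)


seed = 0
n_init = 10

# Case 1: the seed is turned into ONE generator that is shared by all runs.
# Each run advances the generator's internal state, so runs differ.
shared_rng = random.Random(seed)
shared_runs = [init_centers(shared_rng) for _ in range(n_init)]

# Case 2: a fresh generator is built from the same seed for every run.
# Every run starts from the same state, so all runs are identical.
fresh_runs = [init_centers(random.Random(seed)) for _ in range(n_init)]

# Repeating case 1 from scratch reproduces the whole sequence of runs,
# which is the sense in which a fixed seed makes the result reproducible.
shared_rng_again = random.Random(seed)
shared_runs_again = [init_centers(shared_rng_again) for _ in range(n_init)]

print("shared generator, distinct runs:", len({tuple(r) for r in shared_runs}))
print("fresh generator, distinct runs: ", len({tuple(r) for r in fresh_runs}))
print("shared sequence reproducible:   ", shared_runs == shared_runs_again)
```

Under this assumption, a fixed seed would make the whole sequence of `n_init` 
runs reproducible across program executions, while the individual runs within 
one call would still start from different initialisations.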


Best,
Makis

