I agree with you. However, just for clarification: do you know why, in this extreme case (where class 0 is much bigger than class 1), the false positive rate might be pretty high, if not 1?
Thank you,

________________________________________
From: Andy [[email protected]]
Sent: Wednesday, November 05, 2014 1:58 PM
To: [email protected]
Subject: Re: [Scikit-learn-general] k-means with unbalanced clusters

On 11/05/2014 01:10 AM, Sturla Molden wrote:
> "Pagliari, Roberto" <[email protected]> wrote:
>
>> If that's the case, why is it that the underlying implementation of k-means
>> does not take this into account?
>
> Because then it would be the "classification EM algorithm" (often called
> CEM) instead of k-means. By definition, k-means is CEM constrained to
> equal cluster sizes and equal, spherical covariance matrices.

If you want different-sized clusters, you might want to look into GMMs, which learn a covariance structure. In KMeans, the cluster structure is always given by the Voronoi cells of the means, which means the border between two clusters lies exactly at the midpoint between their centers.

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
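To make the point above concrete, here is a small sketch (not from the thread; the dataset and parameters are made up for illustration) comparing scikit-learn's KMeans, whose Voronoi boundary sits at the midpoint between the two centers, with GaussianMixture, which learns per-component covariances and can accommodate clusters of very different sizes and spreads:

```python
# Sketch: k-means vs. a Gaussian mixture on two clusters of very
# different sizes. Assumes scikit-learn and NumPy are available;
# the synthetic data below is illustrative only.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture

rng = np.random.RandomState(0)
# A large, wide cluster and a small, tight one nearby.
big = rng.normal(loc=0.0, scale=1.0, size=(950, 2))
small = rng.normal(loc=4.0, scale=0.5, size=(50, 2))
X = np.vstack([big, small])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
gmm = GaussianMixture(n_components=2, random_state=0).fit(X)

# KMeans cuts space at the perpendicular bisector between the two
# centers, so the tail of the big cluster can spill into the small
# one; the GMM's learned covariances let its boundary shift instead.
km_sizes = np.bincount(km.labels_, minlength=2)
gmm_sizes = np.bincount(gmm.predict(X), minlength=2)
print("KMeans cluster sizes:", sorted(km_sizes))
print("GMM cluster sizes:   ", sorted(gmm_sizes))
```

On data like this, the GMM's smaller component typically stays close to the true 50 points, while k-means tends to hand some of the big cluster's tail to the small one, since its boundary ignores cluster size and spread.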
