It depends on what you want to learn from the clustering. Is this
unsupervised, with no a priori idea of how many clusters to expect? Most
standard algorithms give reasonable results, so sometimes just picking one
is sufficient, unless you have more specific criteria. To "evaluate which
clustering method is suitable" you essentially have to try several and see
which one you like best, unless you can define more objective criteria for
judging the clusters (or unless you can accept 'any' reasonably good
clustering of the data, so that it doesn't have to be 'the best'). One of
the simplest algorithms is k-means, which I believe goes something like
this:

i. Position k points (arbitrarily) in the data space; these define the
initial group centroids.

ii. Assign each data point to the group that has the closest centroid.

iii. Once all objects are assigned, recalculate the position of each of the
k centroids as the mean of the points assigned to it.

Repeat (ii) and (iii) until the centroids stop moving. With the usual cost
function (the sum of squared distances from each point to its centroid) this
is guaranteed to happen after a finite number of iterations, because neither
step can increase the cost. You can then compute the value of the cost
function over the groups thus defined and use it to compare solutions. The
procedure will always produce a result; however, it may not be 'optimal'.
The algorithm is sensitive to the initial selection of cluster centers, so
you are finding a local minimum of the cost function, not necessarily the
global minimum. Running it several times from different starting points and
keeping the lowest-cost solution helps.
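
For what it's worth, here is a minimal sketch of the procedure in plain
Python. It assumes squared Euclidean distance as the cost function and picks
the initial centroids from among the data points (one common choice, not the
only one). The names kmeans, best_of_restarts and squared_distance are made
up for illustration and do not come from any package.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centroids):
    """Step (ii): label each point with the index of its closest centroid."""
    return [
        min(range(len(centroids)), key=lambda j: squared_distance(p, centroids[j]))
        for p in points
    ]


def kmeans(points, k, seed=0, max_iter=100):
    """Run k-means once; returns (centroids, labels, cost)."""
    rng = random.Random(seed)
    # Step (i): assumption -- use k distinct data points as starting centroids.
    centroids = [tuple(p) for p in rng.sample(points, k)]
    for _ in range(max_iter):
        labels = assign(points, centroids)
        # Step (iii): move each centroid to the mean of its assigned points.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                dim = len(members[0])
                mean = tuple(sum(m[d] for m in members) / len(members)
                             for d in range(dim))
                new_centroids.append(mean)
            else:
                # An empty group keeps its old centroid (one simple convention).
                new_centroids.append(centroids[j])
        if new_centroids == centroids:  # centroids stopped moving
            break
        centroids = new_centroids
    labels = assign(points, centroids)
    # Cost: sum of squared distances from each point to its own centroid.
    cost = sum(squared_distance(p, centroids[lab])
               for p, lab in zip(points, labels))
    return centroids, labels, cost


def best_of_restarts(points, k, n_restarts=10):
    """Rerun from different random starts and keep the lowest-cost result."""
    runs = [kmeans(points, k, seed=s) for s in range(n_restarts)]
    return min(runs, key=lambda run: run[2])


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    centroids, labels, cost = best_of_restarts(data, k=2)
    print(centroids, labels, cost)
```

The restart wrapper is only a partial remedy for the local-minimum problem:
it makes a poor solution less likely, but it still does not guarantee the
global minimum.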

p

----- Original Message ----- 
From: "Brian" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Friday, May 07, 2004 2:27 PM
Subject: [edstat] How to evaluate which clustering method is suitable?


> I would like to cluster a set of data. However, I don't know which
> clustering method is the best. Would you please suggest a way to decide
> which clustering method is the best?
>
> Thanks a lot!
>
>
> .
> .
> =================================================================
> Instructions for joining and leaving this list, remarks about the
> problem of INAPPROPRIATE MESSAGES, and archives are available at:
> .                  http://jse.stat.ncsu.edu/                    .
> =================================================================
>
