It depends on what you want to interpret from the clustering. Do you know a priori how many clusters to expect, or do you have no idea? Most of the common algorithms are fairly good, so if you can accept 'any' reasonably good clustering of the data (it doesn't have to be 'the best'), just picking one is often sufficient. Otherwise, to "evaluate which clustering method is suitable" you either try several and see which result you like best, or you define more objective criteria by which to judge the clusters. One of the simplest algorithms is k-means, which I believe goes something like this:
i. Place k points (arbitrarily) in the data space; these are the initial group centroids.
ii. Assign each data point to the group that has the closest centroid.
iii. Once all points are assigned to the nearest centroid, recalculate the position of each of the k centroids as the mean of the points assigned to it.

Repeat (ii) and (iii) until the centroids stop moving. With the usual cost function (the sum of squared distances from each point to its centroid), neither step can increase the cost, so the centroids are guaranteed to settle after a finite number of iterations. You can then compute the value of the chosen cost function over the groups thus defined and use it to compare one run against another.

The procedure will always produce a result; however, it may not be 'optimal'. The algorithm is sensitive to the initial selection of random cluster centers, so you're finding a local minimum of the cost function, not necessarily the global minimum. Rerunning from several different starting positions and keeping the lowest-cost result helps.

p

----- Original Message -----
From: "Brian" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Friday, May 07, 2004 2:27 PM
Subject: [edstat] How to evaluate which clustering method is suitable?

> I would like to cluster a set of data. However, I don't know which
> clustering method is the best. Would you please suggest a way to decide
> which clustering method is the best?
>
> Thanks a lot!
>
> =================================================================
> Instructions for joining and leaving this list, remarks about the
> problem of INAPPROPRIATE MESSAGES, and archives are available at:
> http://jse.stat.ncsu.edu/
> =================================================================
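P.S. For what it's worth, the k-means steps (i) to (iii) above can be sketched in a few lines of Python. This is only a minimal illustration under my own assumptions, not a reference implementation: the function names (kmeans, best_of), the use of squared Euclidean distance as the cost, starting from k randomly chosen data points, and the tiny example data set are all made up for the example.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iter=100, seed=0):
    """Plain k-means on a list of equal-length tuples.

    Returns (centroids, labels, cost), where cost is the sum of squared
    distances from each point to its assigned centroid.
    """
    rng = random.Random(seed)
    # Step i: use k distinct data points as the initial centroids
    # (an assumption; any arbitrary starting positions would do).
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Step ii: assign each point to the group with the closest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids no longer move
        labels = new_labels
        # Step iii: move each centroid to the mean of its assigned points.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty group keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    cost = sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, cost


def best_of(points, k, restarts=10):
    """Rerun from several random starts and keep the lowest-cost result."""
    runs = (kmeans(points, k, seed=s) for s in range(restarts))
    return min(runs, key=lambda run: run[2])


# Hypothetical data: two well-separated groups of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, labels, cost = best_of(points, 2)
print(labels, cost)
```

The restart loop in best_of is there because of the local-minimum problem mentioned above: each run can only promise a local minimum, so comparing the cost across several starts is a cheap way to avoid a bad one.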
