Hi all, I am looking for an R function, or a metric that I could code myself, that compares the results of a clustering exercise with a given solution key.
An example: let's say four elements are clustered, and the number of clusters is unknown a priori. For my guess and for the solution I have one matrix each, with two columns: the first column gives the cluster id, the second the element id.

    guess <- cbind(c(1,1,2,3), c(1,2,3,4))
    solution <- cbind(c(1,2,3,3), c(1,2,3,4))
    colnames(guess) <- colnames(solution) <- c("cluster.id", "element.id")
    guess
    solution

So here the guess is wrong in two ways. The guess claims elements 3 & 4 belong to distinct clusters, but in the solution we see that they belong to the same cluster. Also, the guess claims elements 1 & 2 belong to one cluster, but in the solution we see that they belong to distinct clusters.

What I am looking for is a function, or a metric that I could code up myself, that defines a sensible distance between the guess and the solution. There are various ways to do this, but I am wondering whether there is a standard way of doing it in one of the cluster analysis packages.

Thanks very much,
Tom
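For illustration, one of the "various ways" mentioned above could be a pair-counting distance: the fraction of element pairs that the guess and the solution treat differently (together in one, apart in the other). This equals one minus the Rand index. The sketch below is only an assumption about what such a metric might look like; the function name `pair_distance` is made up here and does not come from any package.

```r
# Hypothetical helper (not from any package): fraction of element pairs on
# which two clusterings disagree, i.e. 1 minus the Rand index.
# Both arguments are two-column matrices with columns "cluster.id" and
# "element.id", as in the example above.
pair_distance <- function(guess, solution) {
  # Align both labelings by element id so position i refers to the same element
  g <- guess[order(guess[, "element.id"]), "cluster.id"]
  s <- solution[order(solution[, "element.id"]), "cluster.id"]
  stopifnot(length(g) == length(s))

  # TRUE where a pair of elements is placed in the same cluster
  same_g <- outer(g, g, "==")
  same_s <- outer(s, s, "==")

  # Count each unordered pair once (upper triangle, diagonal excluded)
  pairs <- upper.tri(same_g)
  mean(same_g[pairs] != same_s[pairs])
}

guess <- cbind(c(1, 1, 2, 3), c(1, 2, 3, 4))
solution <- cbind(c(1, 2, 3, 3), c(1, 2, 3, 4))
colnames(guess) <- colnames(solution) <- c("cluster.id", "element.id")

pair_distance(guess, solution)
```

For the example data, two of the six pairs disagree (elements 1 & 2 and elements 3 & 4), so the distance is 2/6. The measure does not depend on how the clusters are numbered, which matters here because the cluster ids in the guess and in the solution need not correspond.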