> -----Original Message-----
> From: Weiguang Shi
>
> Thanks again Andy.
>
> The definition of AC is understood, yet I have trouble
> picturing the amount of "clear clustering structure"
> it measures. To put things into perspective, for two
> series
>     1,2,1000,1001
> and
>     1,2,3,1000
> agnes(x, method="single") generates ac values of
> 0.998998 and 0.7492477 respectively, yet it seems to
> me that both have fairly clear clustering structures.
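For reference, the two values quoted above can be reproduced by hand. This is only a sketch, assuming the definition given in the cluster package documentation: for each observation i, m(i) is its dissimilarity to the first cluster it is merged with, divided by the dissimilarity of the final merge, and AC is the average of 1 - m(i). The helper name ac_by_hand is made up for illustration, and base R's hclust() stands in for agnes(), which should agree for single linkage on inputs like these.

```r
# Hypothetical helper: agglomerative coefficient computed from the definition,
# using stats::hclust instead of cluster::agnes (an assumption of this sketch).
ac_by_hand <- function(x, method = "single") {
  hc <- hclust(dist(x), method = method)
  n <- length(x)
  first_height <- numeric(n)
  for (step in seq_len(nrow(hc$merge))) {
    # Negative entries in hc$merge are original observations joining at this step.
    singles <- -hc$merge[step, ][hc$merge[step, ] < 0]
    first_height[singles] <- hc$height[step]
  }
  # m(i): height of each observation's first merge relative to the final merge.
  m <- first_height / hc$height[length(hc$height)]
  mean(1 - m)
}

ac_by_hand(c(1, 2, 1000, 1001))  # about 0.998998
ac_by_hand(c(1, 2, 3, 1000))     # about 0.7492477
```

In the second series the point 1000 is first merged only at the final step, so its m(i) is 1 and it contributes 0 to the average; the other three points each contribute 1 - 1/997, which gives 0.75 * 996/997.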
It has to do with sample sizes. Consider the following:

testAC <- function(prop1=0.5, x=rnorm(50), center=c(0, 100), ...) {
    stopifnot(require(cluster))
    n <- length(x)
    n1 <- ceiling(n * prop1)   # size of the first group
    n2 <- n - n1               # size of the second group
    ## shift the first n1 points by center[1] and the rest by center[2]
    agnes(x + rep(center, c(n1, n2)), ...)$ac
}

Now some tests (x is a numeric vector already defined in the session):

> sapply(c(.25, .5), testAC, x=x[1:4], method="single")
[1] 0.7427591 0.9862944
> sapply(1:5 / 10, testAC, x=x[1:10], method="single")
[1] 0.8977139 0.9974224 0.9950061 0.9946366 0.9946366
> sapply(1:5 / 10, testAC, x=x, method="single")
[1] 0.9982955 0.9969757 0.9971114 0.9971127 0.9975111

So it seems that AC does not count isolated singletons as cluster
structure. This is only discernible in small samples, though.

Andy

> --- "Liaw, Andy" <[EMAIL PROTECTED]> wrote:
> > BTW, I checked the book. You're not going to find much
> > more than that.
> >
> Thanks for checking.
>
> Weiguang

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html