I'm not sure why you'd expect Euclidean distance and squared Euclidean distance to give the same results.
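
A quick way to see the difference for yourself is sketched below. The toy matrix and the names toy, d, d_sq and fit_sq are made up for illustration and are not from your analysis.

```r
# Toy data, made up for illustration: 4 individuals, 3 binary variables coded 1/2
toy <- rbind(c(1, 1, 1),
             c(1, 1, 2),
             c(2, 2, 1),
             c(2, 2, 2))

d    <- dist(toy, method = "euclidean")  # unsquared Euclidean distances
d_sq <- d^2                              # squared Euclidean distances

as.matrix(d)[1, 4]     # sqrt(1 + 1 + 1) = sqrt(3)
as.matrix(d_sq)[1, 4]  # 3, up to floating-point rounding

# Cluster on the squared distances. method = "ward" is the spelling used in
# the original post (R 2.9.2); newer R versions call this "ward.D".
fit_sq <- hclust(d_sq, method = "ward")
```

The two distance objects hold different numbers for the same pair of rows, so there is no reason to expect clustering on one to match clustering on the other.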

Euclidean distance is the square root of the sum of squared differences
for each variable, and that's exactly what dist() returns.
http://en.wikipedia.org/wiki/Euclidean_distance

On a map, it's the length of the hypotenuse, and you can measure it with
a ruler and get the same number. Euclidean distance has a specific
geometric meaning.

Squared Euclidean distance is not the same thing, and it is not the
standard definition of Euclidean distance, which is what you seem to be
expecting. If squared distance is what you want, then square the output
of dist() before you perform the clustering.

Sarah

On Mon, Apr 26, 2010 at 8:37 AM, Jeoffrey Gaspard <jeoffrey.gasp...@gmail.com> wrote:
> Hello everyone!
>
> My data is composed of 277 individuals measured on 8 binary variables
> (1=yes, 2=no).
>
> I did two similar cluster analyses, one on SPSS 18.0 and one on R 2.9.2. The
> objective is to have the means for each variable per retained cluster.
>
> 1) the R analysis ran as followed:
>
>> call data
>> dist=dist(data,method="euclidean")
>> cluster=hclust(dist,method="ward")
>> cluster
>
> Call:
> hclust(d = dist, method = "ward")
>
> Cluster method : ward
> Distance : euclidean
> Number of objects: 277
>
>> plot(cluster)
>> rect.hclust(cluster, k=4, border="red")
>> x=rect.hclust(cluster, k=4, border="red")
>> sapply(x, function(i) colMeans(data[i,]))
>> round(sapply(x, function(i) colMeans(data[i,])),2)
>
> 2) The SPSS analysis ran as follows:
>
> Analysis --> Classify --> Hierarchical cluster analysis --> Cluster method=
> Ward's method and Distance measure= Interval: Squared Euclidean distance.
> After that, I computed the means of each variable for each cluster.
>
> The problem is I have different results between the two analyses (different
> clusters and means).
>
> However, when I use the "Euclidean distance" (unsquared) in SPSS, I have the
> same results!
>
> I thought the R "euclidean" command meant the "usual square distance between
> the two vectors (2 norm)" as specified in the documentation, no the
> unsquared distance. Did it not?
>
> Thanks for the comment!
>
> Jeffrey
>

-- 
Sarah Goslee
http://www.functionaldiversity.org