The easiest workaround is the one you included in your original posting: 
specify k= rather than h=. Examine the dendrogram and decide how many clusters 
there are at the level you want. You can add guidelines to the dendrogram with 
abline() to make it easier to count the clusters at various heights.

plot(hc)
abline(h=c(20, 40, 60, 80, 100, 120), lty=3)
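
Putting the pieces together, here is a minimal sketch of that workaround. It 
rebuilds the data from the original posting so that it runs on its own; the 
set.seed() call and the choice k = 3 are my assumptions (3 is simply the number 
of groups the data were simulated with), not something the dendrogram dictates.

```r
# Rebuild the example data from the original posting
# (seed added here only so the run is reproducible)
set.seed(1)
x <- c(rnorm(100, 50, 10), rnorm(100, 200, 25), rnorm(100, 80, 15))
y <- c(rnorm(100, 50, 10), rnorm(100, 200, 25), rnorm(100, 150, 25))
df <- data.frame(x, y)

hc <- hclust(dist(df, method = "euclidean"), method = "centroid")

# Draw the tree with guide lines, then read off how many clusters you want
plot(hc, labels = FALSE)
abline(h = c(20, 40, 60, 80, 100, 120), lty = 3)

# Cut by number of clusters; cutree() only checks for sorted heights
# when h= is supplied, so k= works for centroid trees
k <- 3  # assumption: chosen by eye from the dendrogram
memb <- cutree(hc, k = k)
table(memb)
```

If the groups look wrong, change k and cut again; the tree itself does not need 
to be recomputed.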

David C

From: Johannes Radinger [mailto:johannesradin...@gmail.com]
Sent: Friday, July 11, 2014 3:24 AM
To: David L Carlson; R help
Subject: Re: [R] Cutting hierarchical cluster tree at specific height fails

Hi,

@David: Thanks for explaining why this does not work. Of course, this
makes sense theoretically.

However in a recent discussion
(http://stats.stackexchange.com/questions/107448/spatial-distance-between-cluster-means)
it was stated that "the 'reversals problem' of  centroid method is
not a serious reason to deactivate the option of 'tree cut'". Instead
a warning message should be provided rather than a deactivation.

So does anyone know how a tree that was created with "centroid" can still
be cut at a specific height? I tried the package "dynamicTreeCut", but this
also relies on cutree and consequently raises an error when used for cutting
"centroid" trees.

Does anyone know a workaround and can provide a minimal working example?

/Johannes

On Wed, Jul 9, 2014 at 4:58 PM, David L Carlson <dcarl...@tamu.edu> wrote:
To cut the tree, the clustering algorithm must produce consistently increasing 
height values with no reversals. You used one of the two options in hclust that 
does not do this. Note the following from the hclust manual page:

"Note however, that methods "median" and "centroid" are not leading to a 
monotone distance measure, or equivalently the resulting dendrograms can have 
so called inversions (which are hard to interpret)."

The cutree manual page:

"Cutting trees at a given height is only possible for ultrametric trees (with 
monotone clustering heights)."

Use a different method (but not median).
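
As a rough sketch of what that looks like in practice: the code below checks 
each tree for inversions with is.unsorted() and then cuts an average-linkage 
tree at a height. The choice of "average" is only an illustration (any of the 
monotone methods would do), and the seed and the object names hc_centroid and 
hc_average are assumptions added so the example runs on its own.

```r
# Same simulated data as in the original posting (seed added for reproducibility)
set.seed(1)
x <- c(rnorm(100, 50, 10), rnorm(100, 200, 25), rnorm(100, 80, 15))
y <- c(rnorm(100, 50, 10), rnorm(100, 200, 25), rnorm(100, 150, 25))
df <- data.frame(x, y)
d <- dist(df, method = "euclidean")

hc_centroid <- hclust(d, method = "centroid")
hc_average  <- hclust(d, method = "average")

# TRUE here means the merge heights contain inversions,
# which is the condition cutree(h = ...) refuses to work with
is.unsorted(hc_centroid$height)
is.unsorted(hc_average$height)

# Average linkage gives monotone heights, so cutting at a height is allowed
memb <- cutree(hc_average, h = 60)
table(memb)
```

Note that heights from different linkage methods are not on the same scale, so 
h = 60 will not necessarily be a sensible cut for the new tree; look at the new 
dendrogram before picking a value.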

-------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Johannes Radinger
Sent: Wednesday, July 9, 2014 7:07 AM
To: R help
Subject: [R] Cutting hierarchical cluster tree at specific height fails

Hi,

I'd like to cut a hierarchical cluster tree calculated with hclust at a
specific height.
However, I always get the following error message:
"Error in cutree(hc, h = 60) :
  the 'height' component of 'tree' is not sorted (increasingly)"


Here is a working example showing that the code fails when a height is
specified in cutree(). In contrast, specifying the number of clusters in
cutree() works.
What exactly is the problem, and how can I solve it?

# Simulate three groups of 100 points each
x <- c(rnorm(100, 50, 10), rnorm(100, 200, 25), rnorm(100, 80, 15))
y <- c(rnorm(100, 50, 10), rnorm(100, 200, 25), rnorm(100, 150, 25))
df <- data.frame(x, y)
plot(df)

# Centroid linkage on Euclidean distances
hc <- hclust(dist(df, method = "euclidean"), method = "centroid")
plot(hc)

try(df$memb <- cutree(hc, h = 60)) # this does not work
df$memb <- cutree(hc, k = 3) # this works!

# Plot the points colored by cluster membership
plot(df$x, df$y, col = df$memb)


Thank you for your hints!

Best regards,
Johannes

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

