Re: [R] Cutting hierarchical cluster tree at specific height fails

2014-07-15 Thread Peter Langfelder
Hi Johannes,

You mentioned dynamicTreeCut - the dynamic hybrid method works fine on
your data. Just supply the dissimilarity matrix as well. I use the
function plotDendroAndColors from WGCNA to show the results; if you
don't want to use WGCNA, load dynamicTreeCut instead and leave out the
last call.

library(WGCNA)

set.seed(42)
x <- c(rnorm(100,50,10),rnorm(100,200,25),rnorm(100,80,15))
y <- c(rnorm(100,50,10),rnorm(100,200,25),rnorm(100,150,25))
df <- data.frame(x,y)
hc <- hclust(dist(df,method = "euclidean"), method="centroid")
dm = as.matrix(dist(df,method = "euclidean"))
plot(hc)
labels = cutreeDynamic(hc, distM = dm, deepSplit = 2)
# ..cutHeight not given, setting it to 115  ===>  99% of the (truncated) height range in dendro.
# ..done.
plotDendroAndColors(hc, labels)

As you see, the algorithm found 3 clusters that seem right based on
the dendrogram.

Please look carefully at the help file for cutreeDynamic since the
defaults may not be what you want.

If you absolutely want to cut at a given height, it can be done as
well, but the arguments will need some massaging.

Best,

Peter


Re: [R] Cutting hierarchical cluster tree at specific height fails

2014-07-15 Thread David L Carlson
I believe you can accomplish this without modifying cutree by analyzing the
cluster heights before calling cutree. First generate the data, setting the
random seed so we are all looking at the same thing:

set.seed(42)
x <- c(rnorm(100,50,10),rnorm(100,200,25),rnorm(100,80,15))
y <- c(rnorm(100,50,10),rnorm(100,200,25),rnorm(100,150,25))
df <- data.frame(x,y)
hc <- hclust(dist(df,method = "euclidean"), method="centroid")

Now figure where to cut the tree by finding the first place that 60 is 
exceeded. Because of the reversals, there could be more than one solution, so 
we take the one with more clusters:

hval <- 60
nclust <- (nrow(df)-1):1 # Number of clusters after each merge step
# Number of clusters just before the first merge that reaches hval
k <- max(nclust[which(diff(hc$height < hval) == -1)])
df$memb <- cutree(hc, k = k)

David C


Re: [R] Cutting hierarchical cluster tree at specific height fails

2014-07-15 Thread jlh.membership
Hi Johannes,

Looking at the code for cutree(...), if h is provided but not k, then 
cutree(...) calculates k from h and calls a C function with k
to cut the tree (but all this only if height is sorted). So we can short 
circuit the test this way:

cutree.h <- function(tree, h) {
  # this line adapted from cutree(...) code
  k <- nrow(tree$merge) + 2L - apply(outer(c(tree$height, Inf), h, ">"), 2, which.max)
  return(cutree(tree, k = k))
}
df$memb <- cutree.h(hc, h = 60) # this *does* work
plot(df$x, df$y, col = df$memb)

This does work, at least for your example, but note that you didn't set the 
seed so the plot will be different from your original
question.
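
One way to gain some confidence in this height-to-k conversion is to compare
it with cutree(h = ...) on a tree where cutree does accept h. The sketch below
is only an illustration: the helper name k_from_height, the built-in USArrests
data, and the choice of average linkage and h = 50 are assumptions made here,
not part of the original posts.

```r
# Hypothetical helper: number of clusters obtained by cutting just below
# the first merge whose height exceeds h (same formula as inside cutree()).
k_from_height <- function(tree, h) {
  n <- nrow(tree$merge) + 1L                    # number of observations
  n + 1L - which.max(c(tree$height, Inf) > h)   # first merge above h
}

# Average linkage gives monotone heights, so cutree(h = ) is allowed here
hc_avg <- hclust(dist(USArrests), method = "average")
h <- 50

by_height <- cutree(hc_avg, h = h)
by_k      <- cutree(hc_avg, k = k_from_height(hc_avg, h))

# On a monotone tree both routes should give the same memberships
same <- identical(by_height, by_k)
print(same)
```

If the two agree on monotone trees, the remaining question for a "centroid"
tree is only which of several possible crossings of h to use; the formula
above takes the first one.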

John Howard
Prism Marketing Group
http://www.prismmg.com



Re: [R] Cutting hierarchical cluster tree at specific height fails

2014-07-14 Thread Johannes Radinger
Of course, manually checking the number of clusters that are cut at a
specific height (e.g. with abline()) is one possibility. However, this only
makes sense for single trees; it is not a feasible approach for multiple
model runs in which hundreds of trees are built, each with many cluster
branches.

Thus, it'd be nice if somebody knew a more programmatic approach or another
package that allows cutting "centroid" trees.

/Johannes



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Cutting hierarchical cluster tree at specific height fails

2014-07-11 Thread David L Carlson
The easiest workaround is the one you included in your original posting. 
Specify k= and not h=. Examine the dendrogram and decide how many clusters are 
at the level you want. You could add guidelines to the dendrogram with abline() 
to make it easier to see the number of clusters at various heights.

plot(hc)
abline(h=c(20, 40, 60, 80, 100, 120), lty=3)

David C



Re: [R] Cutting hierarchical cluster tree at specific height fails

2014-07-11 Thread Johannes Radinger
Hi,

@David: Thanks for the explanation of why this does not work. This of
course makes sense theoretically.

However, in a recent discussion
(http://stats.stackexchange.com/questions/107448/spatial-distance-between-cluster-means)
it was stated that "the 'reversals problem' of centroid method is
not a serious reason to deactivate the option of 'tree cut'". Instead,
a warning message should be provided rather than a deactivation.

So does anyone know how a tree that was created with "centroid" can still
be cut at a specific height? I tried the package "dynamicTreeCut", but this
also relies on cutree and consequently raises an error when used for cutting
"centroid" trees.

Does anyone know a workaround and can provide a minimal working example?

/Johannes




Re: [R] Cutting hierarchical cluster tree at specific height fails

2014-07-09 Thread David L Carlson
To cut the tree, the clustering algorithm must produce consistently increasing 
height values with no reversals. You used one of the two options in hclust that 
does not do this. Note the following from the hclust manual page:

"Note however, that methods "median" and "centroid" are not leading to a 
monotone distance measure, or equivalently the resulting dendrograms can have 
so called inversions (which are hard to interpret)."

The cutree manual page:

"Cutting trees at a given height is only possible for ultrametric trees (with 
monotone clustering heights)."

Use a different method (but not median).
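
The inversions described above can be detected directly from the height
component before attempting a cut. A minimal sketch, assuming three
hand-picked points that form a nearly equilateral triangle (the data and the
variable names are illustrative only, not taken from the original post):

```r
# Three points forming a nearly equilateral triangle
pts <- rbind(c(0, 0), c(2, 0), c(1, 1.8))
d   <- dist(pts)

hc_cen <- hclust(d, method = "centroid")
hc_avg <- hclust(d, method = "average")

# is.unsorted() is the check cutree() applies when only h is given
inv_cen <- is.unsorted(hc_cen$height)  # TRUE: second merge is lower than the first
inv_avg <- is.unsorted(hc_avg$height)  # FALSE: average linkage is monotone

# Cutting by height fails on the centroid tree; keep the error message
res <- tryCatch(cutree(hc_cen, h = 1.8),
                error = function(e) conditionMessage(e))

# Cutting by number of clusters still works on the same tree
memb <- cutree(hc_cen, k = 2)

print(hc_cen$height)
print(res)
print(memb)
```

Running such a check first makes it possible to decide per tree whether
cutree(h = ...) can be used or whether k has to be supplied instead.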

-
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352



[R] Cutting hierarchical cluster tree at specific height fails

2014-07-09 Thread Johannes Radinger
Hi,

I'd like to cut a hierarchical cluster tree calculated with hclust at a
specific height.
However, I get the following error message:
"Error in cutree(hc, h = 60) :
  the 'height' component of 'tree' is not sorted (increasingly)"


Here is a working example showing that the code fails when a height is
specified in cutree(). In contrast, specifying the number of clusters in
cutree() works.
What is the exact problem and how can I solve it?

x <- c(rnorm(100,50,10),rnorm(100,200,25),rnorm(100,80,15))
y <- c(rnorm(100,50,10),rnorm(100,200,25),rnorm(100,150,25))
df <- data.frame(x,y)
plot(df)

hc <- hclust(dist(df,method = "euclidean"), method="centroid")
plot(hc)

df$memb <- cutree(hc, h = 60) # this does not work
df$memb <- cutree(hc, k = 3) # this works!

plot(df$x,df$y,col=df$memb)


Thank you for your hints!

Best regards,
Johannes
