[R] Cluster analysis: hclust manipulation possible?

Jopi Harri Mon, 16 Nov 2009 06:32:36 -0800

I am doing cluster analysis [hclust(Dist, method="average")] on
data that potentially contains redundant objects. As expected,
including redundant objects affects the clustering result: the data
a1 = a2 = a3, b, c, d, e1 = e2 is likely to cluster differently
from the same data without the redundancy, i.e., a1, b, c, d, e1.
This is apparent when the outcome is
visualized as a dendrogram.
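
A small sketch of the effect, using made-up one-dimensional data (the
object names and values below are assumptions chosen only for
illustration, not my real data). With a1 present three times, average
linkage counts it three times when averaging distances, and here the
two-group cut changes:

```r
# Hypothetical 1-D positions; names and values are illustrative only.
x_unique <- c(a1 = 0, b = 1, c = 4, d = 7.6)
x_redund <- c(a1 = 0, a2 = 0, a3 = 0, b = 1, c = 4, d = 7.6)

hc_unique <- hclust(dist(x_unique), method = "average")
hc_redund <- hclust(dist(x_redund), method = "average")

# Cut both trees into two groups and compare where "c" ends up.
grp_unique <- cutree(hc_unique, k = 2)
grp_redund <- cutree(hc_redund, k = 2)

print(grp_unique)
print(grp_redund)
```

Without the redundancy, c is grouped with a1 and b; with it, c is
grouped with d.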


Now, it seems that the clustering obtained after eliminating the
redundancy is more robust for the present assignment than the one
obtained from the redundant data. Naturally, the elimination itself
is no problem: just exclude the redundant objects from Dist.
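
For concreteness, a sketch of that exclusion step, assuming that
"redundant" means "at distance zero from an earlier object" (the data
and the names first_zero, keep and Dist_unique are illustrative
assumptions):

```r
# Hypothetical data in which a1 = a2 = a3 and e1 = e2.
x <- c(a1 = 0, a2 = 0, a3 = 0, b = 1, c = 4, d = 7.6, e1 = 9, e2 = 9)
Dist <- dist(x)
m <- as.matrix(Dist)

# For each object, the index of the first object at distance zero from it
# (this is the object itself when it has no earlier duplicate).
first_zero <- apply(m == 0, 1, function(r) which(r)[1])

# Keep only the first object of every group of redundant objects.
keep <- first_zero == seq_along(first_zero)
Dist_unique <- as.dist(m[keep, keep])

hc <- hclust(Dist_unique, method = "average")
```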

However, it would be very convenient to be able to include the
redundant objects in the *dendrogram* by attaching them as
0-level branches to the subtrees, i.e.:

1.0........-------........
0.5....___|__...._|_......
0.0.._|_..|..|..|.._|_....
....|.|.|.|..|..|.|...|...
...a1a2a3.b..c..d.e1.e2...

instead of

1.0........-------........
0.5....___|__...._|_......
0.0...|...|..|..|...|.....
......a1..b..c..d..e1.....

The question: Can this be accomplished in the *dendrogram plot*
by manipulating the resulting hclust data structure or by some
other means, and if yes, how?
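
To make the idea concrete, here is a rough sketch of the kind of
manipulation I have in mind, under stated assumptions rather than as a
verified solution. An hclust object stores the tree in merge (negative
entries are single objects, positive entries refer to earlier rows),
height and order, so the redundant objects might be spliced in as extra
height-0 merges placed before the original rows. The helper
attach_duplicates, the names rep_idx and keep, and the data are all
hypothetical. Since hclust records only binary merges, a1, a2, a3 are
joined by two merges, both at height 0.

```r
# Hypothetical data: a1 = a2 = a3 and e1 = e2 are redundant objects.
x <- c(a1 = 0, a2 = 0, a3 = 0, b = 1, c = 4, d = 7.6, e1 = 9, e2 = 9)
m <- as.matrix(dist(x))

# rep_idx[i]: index of the first object at distance zero from object i.
rep_idx <- unname(apply(m == 0, 1, function(r) which(r)[1]))
keep <- which(rep_idx == seq_along(rep_idx))

# Cluster the non-redundant objects only.
hc <- hclust(as.dist(m[keep, keep]), method = "average")

# Hypothetical helper: rebuild an hclust object on all objects, joining each
# group of redundant objects by binary merges at height 0.
attach_duplicates <- function(hc, rep_idx, labels) {
  keep <- which(rep_idx == seq_along(rep_idx))
  # members[[i]]: indices (in the full data) represented by leaf i of hc.
  members <- lapply(keep, function(k) which(rep_idx == k))
  zero_rows <- list()                 # new height-0 merge rows
  leaf_ref <- integer(length(keep))   # how later rows refer to each old leaf
  for (i in seq_along(keep)) {
    ref <- -members[[i]][1]           # a lone object stays a negative entry
    for (j in members[[i]][-1]) {
      zero_rows[[length(zero_rows) + 1L]] <- c(ref, -j)
      ref <- length(zero_rows)        # row index of the merge just added
    }
    leaf_ref[i] <- ref
  }
  n0 <- length(zero_rows)
  new_merge <- hc$merge
  is_leaf <- hc$merge < 0
  new_merge[is_leaf] <- leaf_ref[-hc$merge[is_leaf]]   # remap old leaves
  new_merge[!is_leaf] <- hc$merge[!is_leaf] + n0       # shift old row indices
  structure(
    list(merge = rbind(do.call(rbind, zero_rows), new_merge),
         height = c(rep(0, n0), hc$height),
         order = unlist(members[hc$order]),
         labels = labels,
         method = hc$method,
         call = hc$call,
         dist.method = hc$dist.method),
    class = "hclust")
}

hc_full <- attach_duplicates(hc, rep_idx, labels = names(x))
if (interactive()) plot(hc_full)
```

In this sketch the heights above 0 are taken unchanged from the
clustering of the non-redundant objects, so only the leaves differ.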

Jopi Harri

