jgaspard wrote:
Hi all!

I'm new to R and don't know much about it. Because it is free, I have managed
to learn a little of it.

Here is my problem: I did a cluster analysis on 30 observations and 16
variables (monde, figaro, liberation, etc.). Here is the .txt data file:

"monde","figaro","liberation","yespeople","nopeople","bxl","europe","ue","union_eur","other","yesmeto","nometo","yesfonc","nofonc","yestone","notone"
1,0,0,0,1,0,0,0,1,0,0,1,1,0,1,0
1,0,0,0,1,0,0,0,1,0,0,1,1,0,1,0
1,0,0,0,1,0,0,0,1,0,1,0,1,0,1,0
0,1,0,0,1,0,0,0,1,0,0,1,1,0,0,1
1,0,0,0,1,0,0,0,1,0,0,1,1,0,0,1
1,0,0,0,1,0,0,0,0,1,0,1,1,0,1,0
1,0,0,0,1,0,0,0,0,1,0,1,1,0,1,0
1,0,0,0,1,0,0,0,1,0,0,1,0,1,1,0
0,1,0,0,1,0,0,0,1,0,0,1,0,1,1,0
0,1,0,0,1,0,0,0,0,1,0,1,0,1,1,0
1,0,0,0,1,0,1,0,0,0,0,1,0,1,0,1
0,1,0,0,1,0,0,1,0,0,0,1,1,0,1,0
0,0,1,0,1,0,0,1,0,0,0,1,0,1,1,0
1,0,0,0,1,0,0,1,0,0,0,1,0,1,1,0
0,1,0,0,1,0,0,0,1,0,0,1,1,0,1,0
0,0,1,0,1,0,0,1,0,0,0,1,0,1,1,0
0,1,0,1,0,0,1,0,0,0,0,1,0,1,1,0
0,1,0,0,1,1,0,0,0,0,1,0,0,1,1,0
0,1,0,0,1,1,0,0,0,0,1,0,0,1,1,0
0,1,0,0,1,1,0,0,0,0,1,0,0,1,1,0
0,1,0,0,1,1,0,0,0,0,1,0,0,1,1,0
0,1,0,0,1,1,0,0,0,0,1,0,0,1,0,1
0,1,0,0,1,1,0,0,0,0,1,0,1,0,1,0
0,1,0,0,1,1,0,0,0,0,1,0,1,0,1,0
0,1,0,0,1,1,0,0,0,0,1,0,1,0,1,0
0,1,0,0,1,1,0,0,0,0,1,0,1,0,1,0
0,1,0,0,1,1,0,0,0,0,1,0,1,0,1,0
0,1,0,0,1,1,0,0,0,0,1,0,1,0,1,0
0,1,0,0,1,1,0,0,0,0,1,0,1,0,1,0
1,0,0,0,1,1,0,0,0,0,1,0,1,0,1,0


These are the steps I took:

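# Read the comma-separated file; the first row holds the variable names
data=read.table("/data.csv", header=TRUE, sep=",")
data
# Euclidean distances between the 30 observations
d=dist(data,method="euclidean")
d
# Ward clustering (this method was called "ward" before R 3.1.0)
cluster=hclust(d,method="ward.D")
cluster
plot(cluster)
# Outline the 4 clusters on the dendrogram
rect.hclust(cluster, k=4, border="red")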

I extracted 4 clusters from the data. My question is: is it possible to
produce a summary of the mean value of each variable within each of the 4
clusters?


Well, I think this is not what you want.
You probably want to use Manhattan distance (rather than Euclidean) for 0/1 data, and you want to know the number of 1s and the total number of observations in each cluster.
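
A minimal sketch of that suggestion, assuming a small made-up 0/1 data frame in place of the real file (the six rows, the average linkage, and k = 2 are illustrative choices, not taken from the post). On 0/1 data the Manhattan distance between two rows is simply the number of variables on which they differ:

```r
# Toy 0/1 data standing in for the poster's file (hypothetical values)
data <- data.frame(
  monde  = c(1, 1, 1, 0, 0, 0),
  figaro = c(0, 0, 0, 1, 1, 1),
  bxl    = c(0, 0, 1, 1, 1, 1)
)

# Manhattan distance: counts the variables on which two rows differ
d <- dist(data, method = "manhattan")

# Average linkage is used here only for illustration
cluster <- hclust(d, method = "average")

# Cluster membership for k = 2 groups; no plot is needed for this
groups <- cutree(cluster, k = 2)

# Number of 1s per variable in each cluster, and the cluster sizes
ones  <- rowsum(data, group = groups)
sizes <- table(groups)

print(ones)
print(sizes)
```

Dividing each row of `ones` by the matching entry of `sizes` would give the proportion of 1s per variable, which for 0/1 data is the same as the cluster mean.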

Anyway, to answer your question, assign the result of rect.hclust() at the end, like this:

x <- rect.hclust(cluster, k=4, border="red")
sapply(x, function(i) colMeans(data[i,]))
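
Note that rect.hclust() draws on an existing dendrogram, so plot(cluster) has to be called first. As a possible alternative that does not depend on a plot, here is a sketch using cutree() and aggregate(). The toy data and k = 2 are assumptions made to keep the example self-contained; with the real data you would use your own data frame and k = 4:

```r
# Toy 0/1 data standing in for the poster's file (hypothetical values)
data <- data.frame(
  monde  = c(1, 1, 1, 0, 0, 0),
  figaro = c(0, 0, 0, 1, 1, 1),
  bxl    = c(0, 0, 1, 1, 1, 1)
)

# Same kind of clustering as in the question: Euclidean distance, Ward method
cluster <- hclust(dist(data, method = "euclidean"), method = "ward.D")

# Integer cluster label for every observation
groups <- cutree(cluster, k = 2)

# One row per cluster, one column per variable, holding the cluster means
means <- aggregate(data, by = list(cluster = groups), FUN = mean)
print(means)
```

The result is a data frame rather than the matrix returned by sapply(), which can be easier to inspect or export.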

Uwe Ligges



Thanks a lot in advance,

Jeoffrey





______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
