Dear WizaRds,

        My goal is to program the VS-KM algorithm by Brusco and Cradit (2001), and I
have come to a complete stop in my efforts. Maybe somebody is willing to follow
my thoughts and offer some help.
        In a first step, I want to use a single variable for the partitioning
process. As the center matrix I use the objects that belong to the clusters I
found with the hierarchical Ward algorithm. Then I have to take all possible
variable pairs and apply kmeans again, which is quite confusing to me. Here is
what I do:

##      0. data
mat <- matrix( c(6,7,8,2,3,4,12,14,14, 14,15,13,3,1,2,3,4,2,
15,3,10,5,11,7,13,6,1, 15,4,10,6,12,8,12,7,1), ncol=9, byrow=TRUE )
rownames(mat) <- paste("v", 1:4, sep="" )
tmat <- t(mat)

##      1. Provide clusters via Ward:
ward    <- hclust(d=dist(tmat), method = "ward", members=NULL)

##      2. Compute cluster centers and create center-matrix for kmeans:
groups  <- cutree(ward, k = 3, h = NULL)

centroids       <- vector(mode="numeric", length=3)
obj             <- vector(mode="list", length=3)

for (i in 1:3){
        where <- which(groups==i) # which object belongs to which group?
        centroids[i] <- mean( tmat[ where, ] )
        obj[[i]] <- tmat[where,]
}
P       <- vector(mode="numeric", dim(mat)[2] )
pj      <- vector(mode="list", length=dim(mat)[1])

for (i in 1:dim(mat)[1]){
        pj[[i]] <- kmeans( tmat[,i], centers=centroids, iter.max=10, 
algorithm="MacQueen")
        P <- rbind(P, pj[[i]]$cluster)
}
P       <- P[-1,]

##      gives a matrix of partitions using each single variable
##      (I'm sure, P can be programmed much easier)
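
A shorter way to build P might look like the sketch below. It keeps the same starting centers as the loop above (the overall group means), so only the bookkeeping changes. The name P2 is made up for this sketch, and "ward.D" is the name current R versions use for the old "ward" method.

```r
# Data and Ward groups as in steps 0 to 2.
mat <- matrix(c(6,7,8,2,3,4,12,14,14, 14,15,13,3,1,2,3,4,2,
                15,3,10,5,11,7,13,6,1, 15,4,10,6,12,8,12,7,1),
              ncol = 9, byrow = TRUE)
rownames(mat) <- paste("v", 1:4, sep = "")
tmat <- t(mat)
groups <- cutree(hclust(dist(tmat), method = "ward.D"), k = 3)

# Same centers as in the loop: one overall mean per Ward group.
centroids <- sapply(split(seq_len(nrow(tmat)), groups),
                    function(idx) mean(tmat[idx, ]))

# One kmeans run per variable. sapply() returns one column per variable,
# so t() gives one row per variable, like P above.
P2 <- t(sapply(seq_len(ncol(tmat)), function(i) {
  kmeans(tmat[, i], centers = centroids,
         iter.max = 10, algorithm = "MacQueen")$cluster
}))
rownames(P2) <- colnames(tmat)
```

P2["v1", ] would then be the partition obtained from variable 1 alone.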

##      3. kmeans using all possible pairs of variables, here just e.g.
##      variables 1 and 3:
wjk     <- kmeans(tmat[,c(1,3)], centers=centroids, iter.max=10, 
algorithm="MacQueen")

###
        which, of course, gives an error message since "centroids" is not a
matrix of the cluster centers. How on earth do I correctly construct a matrix
of centers corresponding to the pairwise variables? Is it always the same
matrix no matter which pair of variables I choose?
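
One reading I can think of, as a sketch only and not checked against the Brusco and Cradit paper: compute the Ward group means once for every variable, which gives a 3 x 4 matrix, and for each pair pick the two matching columns. The starting centers would then differ from pair to pair, although they all come from the same full matrix. The names centers_all, var_pairs and fits are made up for this sketch, and "ward.D" is the name current R versions use for the old "ward" method.

```r
# Data and Ward groups as in steps 0 to 2.
mat <- matrix(c(6,7,8,2,3,4,12,14,14, 14,15,13,3,1,2,3,4,2,
                15,3,10,5,11,7,13,6,1, 15,4,10,6,12,8,12,7,1),
              ncol = 9, byrow = TRUE)
rownames(mat) <- paste("v", 1:4, sep = "")
tmat <- t(mat)
groups <- cutree(hclust(dist(tmat), method = "ward.D"), k = 3)

# Assumption: starting centers are the per-group means of each variable.
# rowsum() adds up the rows within each group; dividing by the group sizes
# turns the sums into means. Result: 3 groups x 4 variables.
centers_all <- rowsum(tmat, groups) / as.vector(table(groups))

# All variable pairs, one pair per column (6 pairs for 4 variables).
var_pairs <- combn(ncol(tmat), 2)

# One kmeans run per pair, started from the matching columns of centers_all.
fits <- lapply(seq_len(ncol(var_pairs)), function(m) {
  jk <- var_pairs[, m]
  kmeans(tmat[, jk], centers = centers_all[, jk],
         iter.max = 10, algorithm = "MacQueen")
})
names(fits) <- apply(var_pairs, 2, function(jk) {
  paste(colnames(tmat)[jk], collapse = "-")
})
```

With this, fits[["v1-v3"]]$cluster would be the partition for variables 1 and 3.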
        I apologize for my lack of clustering knowledge and expertise - any help
is welcome. Thank you very much.

Many greetings
mark

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
