-----Original Message-----
From: Sarah Goslee [mailto:sarah.gos...@gmail.com]
Sent: Tue 30/12/2008 13.42
To: mau...@alice.it
Cc: r-help@r-project.org
Subject: Re: [R] I would appreciate some help with clustering

Is this homework? If so, you should discuss it with the instructor, not us.
Regardless, the methods you suggested are a reasonable place to start, and you perhaps should have done so first before asking here. You may well have gotten the results you needed without the delay of waiting for an answer on the list.

Sarah

On Tue, Dec 30, 2008 at 6:59 AM, <mau...@alice.it> wrote:
> I have a binary vector whose length is known.
> Such a vector contains an unspecified number of 1s.
> My goal is
> 1. to generate as many clusters as the number of 1s
> 2. to place the 1 as much as possible at the center of its own cluster
>
> Example. Say I have the following binary vector:
> v <- c(0,0,1,0,0,0,0,1,0,1,0,0)
> Then I have to get 3 clusters.
> I can generate a matrix containing the distance of each element from each one of the
> clusters center (the 1s):
>
>      1st_1  2nd_1  3rd_1
> ---|---------------------
> 0  |   2      7      9
> 0  |   1      6      8
> 1  |   0      5      7
> 0  |   1      4      6
> 0  |   2      3      5
> 0  |   3      2      4
> 0  |   4      1      3
> 1  |   5      0      2
> 0  |   6      1      1
> 1  |   7      2      0
> 0  |   8      3      1
> 0  |   9      4      2
>
> Should I input such matrix to R function "dist" and then use for instance PAM or KMEAN
> to get the expected 3 clusters ?
> I would greatly appreciate some help.
>
> Thank you so much.
> Maura
>

--
Sarah Goslee
http://www.functionaldiversity.org

It is not homework. It is part of a project where a binary matrix, whose 1s represent the positions of the highest DWT coefficient energy, is used as a template to extract signal features. The approach I am following requires each row of the binary matrix (corresponding to a DWT scale level) to be clustered separately, subject to the requirements of generating as many clusters as the number of 1s and having the 1s at the centers of their respective clusters.

I simply do not know how to organize binary vector data and which function to feed it to in order to achieve my goal.

I tried to input the matrix I posted to R function "dist" and got a 3x3 distance matrix.
I wonder what the meaning of doing that is, since I had already calculated the distances of each 0 from each 1, although I cannot call what I posted a "distance matrix" because it is neither symmetric nor square.

I tried calling both KMEANS and PAM, passing the 3x3 distance matrix, and got no more than 2 clusters, whereas I need 3.
I tried calling both KMEANS and PAM, passing the matrix I posted. Again I only got 2 clusters.
I do not know how to specify the cluster centers. KMEANS accepts them, but there is no example showing how to do that.

I deduce that my approach is wrong rather than a reasonable starting place.

Best regards,
Maura
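
The goal stated in the thread (as many clusters as 1s, with each 1 at the center of its own cluster) can be sketched without PAM or k-means by assigning every position to its nearest 1. This is only a minimal sketch under that assumption, not a method proposed by anyone in the thread. The names `centers`, `d` and `cluster` are illustrative, ties are sent to the leftmost center, and the vector is assumed to contain at least one 1.

```r
# Example vector from the original message
v <- c(0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0)

# Positions of the 1s; these are treated as fixed cluster centers
centers <- which(v == 1)

# Distance of every position from every center
# (rows: positions in v, columns: centers), as in the posted table
d <- abs(outer(seq_along(v), centers, "-"))

# Assign each position to its nearest center;
# which.min returns the first minimum, so ties go to the leftmost center
cluster <- apply(d, 1, which.min)

# One element of the list per cluster, holding the positions it covers
split(seq_along(v), cluster)
```

Because the centers never move, each 1 stays inside its own cluster by construction. By contrast, `kmeans` does accept a set of initial centers through its `centers` argument, but it then iterates and relocates them, so the final centers need not coincide with the 1s.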