[R] About clustering techniques

2008-07-29 Thread pacomet
Hello R users, I have been working with a dataset for some time, trying to do some cluster analysis. The data set consists of 14 columns, geographical coordinates and monthly temperatures, in annual files: latitude - longitude - temperature 1 - ... - temperature 12. I have some missing values in some cases

Re: [R] About clustering techniques

2008-07-29 Thread ctu
Hi Paco, I had the same problem as you before, so I just imputed the missing values. For example: newdata <- as.matrix(impute(olddata, fun="random")) After that, I believe you can analyze your data. Hopefully it helps. Chunhao Quoting pacomet <[EMAIL PROTECTED]>: Hello R users It's some ti
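The message does not say which package impute() comes from; Hmisc provides a function of that name that accepts fun = "random". To make the idea concrete without depending on any add-on package, here is a minimal base-R sketch of random imputation, where each missing value in a column is replaced by a randomly drawn observed value from the same column. The helper random_impute() and the toy olddata are hypothetical, and this is a stand-in for illustration, not Hmisc's implementation.

```r
set.seed(1)  # only so the illustration is reproducible

# Hypothetical helper: fill each NA with a random draw from the
# observed (non-missing) values of the same vector.
random_impute <- function(x) {
  miss <- is.na(x)
  if (any(miss) && any(!miss)) {
    obs <- x[!miss]
    # sample.int avoids the pitfall of sample() on a length-one vector
    x[miss] <- obs[sample.int(length(obs), sum(miss), replace = TRUE)]
  }
  x
}

# Hypothetical toy data standing in for two monthly temperature columns
olddata <- data.frame(
  t1 = c(10.2, NA, 10.5, 6.1),
  t2 = c(11.0, 11.2, NA, 7.0)
)

# Impute column by column, then convert to a matrix as in the message above
newdata <- as.matrix(as.data.frame(lapply(olddata, random_impute)))
newdata
```

Because the filled-in values are drawn at random, repeated runs without a fixed seed give different data sets, and any clustering built on them inherits that variability.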

Re: [R] About clustering techniques

2008-07-29 Thread Christian Hennig
Dear Paco, in order to use the methods in the cluster package (including pam), look up the help page of daisy, which is able to compute dissimilarity matrices handling missing values appropriately (in most situations). A good reference is the Kaufman and Rousseeuw book cited on that help page.
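As a rough sketch of the suggestion above, the following computes a dissimilarity matrix with daisy() on data containing missing values and passes it to pam(). The toy data frame, its column names and the choice of k = 2 are assumptions made for illustration only; they are not Paco's data.

```r
library(cluster)  # provides daisy() and pam()

# Hypothetical toy data: 6 locations with coordinates and three
# monthly temperatures, some of which are missing.
toy <- data.frame(
  lat = c(39.1, 39.2, 39.3, 41.0, 41.1, 41.2),
  lon = c(-0.5, -0.4, -0.3, 2.0, 2.1, 2.2),
  t1  = c(10.2, NA, 10.5, 6.1, 6.3, NA),
  t2  = c(11.0, 11.2, NA, 7.0, 7.1, 7.3),
  t3  = c(13.5, 13.7, 13.6, NA, 9.4, 9.6)
)

# daisy() leaves a missing variable out of the dissimilarities that
# involve the affected row, instead of failing on NA.
d <- daisy(toy, metric = "euclidean", stand = FALSE)

# pam() accepts the dissimilarity object directly; k = 2 is an
# arbitrary choice for this toy example.
fit <- pam(d, k = 2)
fit$clustering
```

If two rows had no variable observed in common, their dissimilarity would itself be NA, so it is worth checking anyNA(d) before clustering.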

Re: [R] About clustering techniques

2008-07-29 Thread Christian Hennig
A quick comment on this: imputation is an option to make things technically work, but it is not necessarily good. Imputation always introduces some noise, i.e., it fakes information that is not really there. Whether it is good depends strongly on the data, the situation and the imputation metho

Re: [R] About clustering techniques

2008-07-30 Thread pacomet
Hi Christian, I've been reading about daisy and think I need to do something like: > mydaisydata <- daisy(mydata,metric=c("euclidean"),stand=FALSE) Error en vector("double", length) : tamaño del vector especificado es muy grande (which means: specified vector size is too big) mydata is an
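An error of this kind is presumably related to the size of the result: a dissimilarity object for n rows stores n(n-1)/2 pairwise values, each an 8-byte double. The sketch below estimates that size for a given number of rows. The helper diss_size() and the row count of 100000 are hypothetical, since the size of mydata is not visible in the message.

```r
# Hypothetical helper: number of pairwise dissimilarities for n rows
# and the memory needed to hold them as doubles (8 bytes each).
diss_size <- function(n) {
  n <- as.numeric(n)          # avoid integer overflow in n * (n - 1)
  n_pairs <- n * (n - 1) / 2
  list(pairs = n_pairs, gib = n_pairs * 8 / 2^30)
}

# Example with an assumed row count; roughly 37 GiB would be needed
diss_size(100000)
```

The storage grows quadratically with the number of rows, so doubling the rows roughly quadruples the memory needed.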