Re: [R] Cluster analysis with missing data
On Mon, 2009-07-13 at 23:42 -0700, Hollix wrote:
> Hi folks,
>
> I tried hclust for the first time. Unfortunately, with missing data in my
> data file, it doesn't seem to work. I found no information about how to
> consider missing data.
>
> Omission of all missings is not really an option as I would lose too many
> cases.

Holger,

hclust takes a dissimilarity matrix as input, not your data, so the problem is one of finding an appropriate dissimilarity/distance coefficient that handles missing data. One such measure is Gower's coefficient, which is implemented in function 'daisy' in the recommended package 'cluster'. Try:

    require(cluster)
    ?daisy

to read about it.

Also, 'vegdist' in package 'vegan' can ignore missing values on a pairwise basis. See ?vegdist after loading 'vegan' and, in particular, the 'na.rm' argument.

Whether either of these (i.e. the resulting dissimilarities) makes sense for your particular problem is another matter...

HTH

G

--
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
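For readers of the archive, the daisy() route suggested in the reply above might look roughly like the following. This is only a minimal sketch: the toy data frame `dat`, the average linkage, and the cut into two groups are illustrative assumptions, not anything taken from Holger's data or from the advice itself.

```r
library(cluster)

# Hypothetical toy data: 6 cases, 3 numeric variables, some values missing
dat <- data.frame(
  x = c(1.0, 1.2, NA, 5.1, 5.3, 4.9),
  y = c(2.0, NA, 2.1, 8.0, 7.9, NA),
  z = c(0.5, 0.4, 0.6, NA, 3.1, 3.0)
)

# Gower's coefficient: for each pair of cases, a variable is left out
# of the calculation when either of the two values is NA
d <- daisy(dat, metric = "gower")

# The resulting dissimilarity object can be passed straight to hclust()
hc <- hclust(d, method = "average")   # linkage choice is an assumption
groups <- cutree(hc, k = 2)           # k = 2 is an assumption
```

If two cases have no variable observed in common, daisy() returns NA for that pair and hclust() will then fail, so it is worth checking anyNA(d) before clustering.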
Re: [R] Cluster analysis with missing data
vegdist() in the vegan package optionally allows pairwise deletion of missing values when computing dissimilarities. The result can be used as the first argument to hclust(). ('Caveat emptor', of course.)

From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of Hollix [holger.steinm...@web.de]
Sent: 14 July 2009 16:42
To: r-help@r-project.org
Subject: [R] Cluster analysis with missing data

Hi folks,

I tried hclust for the first time. Unfortunately, with missing data in my data file, it doesn't seem to work. I found no information about how to consider missing data.

Omission of all missings is not really an option as I would lose too many cases.

Thanks in advance
Holger

--
View this message in context: http://www.nabble.com/Cluster-analysis-with-missing-data-tp24474486p24474486.html
Sent from the R help mailing list archive at Nabble.com.
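The vegdist() suggestion in the reply above could be tried along these lines. Again this is a hedged sketch rather than a recommendation: the toy data frame `dat`, the Euclidean method, and the average linkage are assumptions made for illustration, and the most suitable `method` depends on the data.

```r
library(vegan)

# Hypothetical toy data: 6 cases, 3 numeric variables, some values missing
dat <- data.frame(
  x = c(1.0, 1.2, NA, 5.1, 5.3, 4.9),
  y = c(2.0, NA, 2.1, 8.0, 7.9, NA),
  z = c(0.5, 0.4, 0.6, NA, 3.1, 3.0)
)

# na.rm = TRUE asks vegdist() for pairwise deletion of missing values;
# method = "euclidean" is an assumption (the default is "bray")
d <- vegdist(dat, method = "euclidean", na.rm = TRUE)

# The "dist" object is used as the first argument to hclust()
hc <- hclust(d, method = "average")   # linkage choice is an assumption
```

Because each dissimilarity is then based only on the variables observed for both cases, pairs with many missing values rest on less information than the others; ?vegdist notes that some dissimilarities may still come out as NA.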
[R] Cluster analysis with missing data
Hi folks,

I tried hclust for the first time. Unfortunately, with missing data in my data file, it doesn't seem to work. I found no information about how to consider missing data.

Omission of all missings is not really an option as I would lose too many cases.

Thanks in advance
Holger