Re: [R] Looking for categorization method/module in R
On Dec 15, 2009, at 7:19 AM, James Mcininch wrote:

> All,
>
> I'm relatively new to using R, having used it thus far for some simple statistics and plotting. However, I'm not new to programming by any measure. I've been looking at the various modules available for clustering, factor analysis, etc. and find that I need advice on which modules I should be focusing on and their application.

The list is not really advertised as offering general statistical advice, but is more responsive to focussed questions on R use. There is the option of reviewing the Task Views:

http://cran.r-project.org/web/views/

> I have a data set comprised of columns of both quantitative and qualitative / non-numeric attributes. I would like to perform two operations on this data: identify correlations between attributes, and cluster the records by attribute.
>
> All of the clustering algorithms that I've looked at so far are based on numerical distance functions, and it's not clear to me how I'd apply them to qualitative attributes. It's not appropriate to simply convert discrete qualitative attributes (e.g., native language) to numerical values or independent columns with binary values. Is there a module that provides such an algorithm or that can be adapted to this purpose?
>
> I can wrap my head around the problem of looking for cross-correlation between the attributes, but would appreciate any insight in how to do it most efficiently and present the results.
>
> Thank you.

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
[R] Looking for categorization method/module in R
All,

I'm relatively new to using R, having used it thus far for some simple statistics and plotting. However, I'm not new to programming by any measure. I've been looking at the various modules available for clustering, factor analysis, etc. and find that I need advice on which modules I should be focusing on and their application.

I have a data set comprised of columns of both quantitative and qualitative / non-numeric attributes. I would like to perform two operations on this data: identify correlations between attributes, and cluster the records by attribute.

All of the clustering algorithms that I've looked at so far are based on numerical distance functions, and it's not clear to me how I'd apply them to qualitative attributes. It's not appropriate to simply convert discrete qualitative attributes (e.g., native language) to numerical values or independent columns with binary values. Is there a module that provides such an algorithm or that can be adapted to this purpose?

I can wrap my head around the problem of looking for cross-correlation between the attributes, but would appreciate any insight in how to do it most efficiently and present the results.

Thank you.
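For the mixed quantitative/qualitative clustering question above, one possible starting point (not something proposed in this thread) is a Gower-type dissimilarity, which handles numeric and factor columns together without recoding the factors as numbers. The sketch below assumes the `cluster` package is installed; the data frame `records` and its columns are made up purely for illustration, and average-linkage hierarchical clustering is just one of several methods that could consume the resulting dissimilarities.

```r
library(cluster)  # provides daisy(); a "recommended" package in standard R installs

# Hypothetical toy data: one numeric column and two qualitative columns.
records <- data.frame(
  age      = c(23, 25, 47, 52, 46, 24),
  language = factor(c("English", "English", "French",
                      "French", "Spanish", "English")),
  smoker   = factor(c("no", "no", "yes", "yes", "yes", "no"))
)

# Gower dissimilarity: numeric columns are scaled by their range,
# factor columns contribute 0 for a match and 1 for a mismatch.
d <- daisy(records, metric = "gower")

# Feed the dissimilarities to a distance-based method;
# here, average-linkage hierarchical clustering cut into two groups.
fit    <- hclust(as.dist(d), method = "average")
groups <- cutree(fit, k = 2)

print(table(groups))
```

Whether Gower's coefficient is appropriate depends on the data; the choice of `k = 2` here is arbitrary and would normally be guided by inspecting the dendrogram with `plot(fit)`.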