Re: [R] exploratory analysis of large categorical datasets

2010-11-13 Thread Kjetil Halvorsen
You can also look at correspondence analysis, which is implemented
in several CRAN packages, for instance MASS, ade4 and others.
See the Multivariate task view on CRAN.
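
As a rough illustration of the correspondence-analysis route, here is a minimal sketch using mca() from MASS (multiple correspondence analysis). The simulated matrix m, its size, and the object names df and fit are assumptions made for the example, not part of the original data.

```r
library(MASS)  # recommended package that ships with R

set.seed(1)
# Hypothetical stand-in for the multistate data: 50 rows, 4 variables, 3 states each
m <- matrix(sample(1:3, 200, replace = TRUE), nrow = 50,
            dimnames = list(paste0("group", 1:50), paste0("V.", 1:4)))

# mca() expects a data frame in which every column is a factor
df <- as.data.frame(lapply(as.data.frame(m), factor))

fit <- mca(df, nf = 2)   # multiple correspondence analysis, 2 dimensions
head(fit$rs)             # coordinates of the rows (groups)
fit$cs                   # coordinates of each level of each variable
plot(fit, rows = FALSE)  # map of the variable levels only
```

Levels that plot close together tend to occur in the same rows, which gives a first visual hint of which variables are associated.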

Kjetil


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] exploratory analysis of large categorical datasets

2010-11-11 Thread Lara Poplarski
Dear List,


I am looking to perform exploratory analyses of two (relatively) large
datasets of categorical data. The first one is a binary 80x100 matrix, in
the form:


matrix(sample(c(0,1),25,replace=TRUE), nrow = 5, ncol=5, dimnames = list(c(
"group1", "group2", "group3", "group4", "group5"), c("V.1", "V.2", "V.3",
"V.4", "V.5")))


and the second one is a multistate 750x1500 matrix, with up to 15
*unordered* states per variable, in the form:


matrix(sample(c(1:15),25,replace=TRUE), nrow = 5, ncol=5, dimnames = list(c(
"group1", "group2", "group3", "group4", "group5"), c("V.1", "V.2", "V.3",
"V.4", "V.5")))


Specifically, I am looking to see which pairs of variables are correlated.
For continuous data, I would use cor() and cov() to generate the correlation
matrix and the variance-covariance matrix, which I would then visualize with
symnum() or image(). However, it is not clear to me whether this approach is
suitable for categorical data of this kind.
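
One possible analogue of the cor() workflow for unordered categories is a matrix of pairwise Cramér's V values, computed from the chi-squared statistic of each two-way table. The sketch below is only an illustration under assumptions: the simulated matrices m and b, their sizes, and the helper cramers_v() are hypothetical and not part of base R.

```r
set.seed(42)
# Hypothetical stand-in for the multistate data: 200 rows, 6 variables, 4 states
m <- matrix(sample(1:4, 200 * 6, replace = TRUE), nrow = 200,
            dimnames = list(NULL, paste0("V.", 1:6)))

# Cramer's V for two categorical vectors: sqrt(chi^2 / (n * (min(r, c) - 1)))
cramers_v <- function(x, y) {
  tab <- table(x, y)
  k <- min(dim(tab)) - 1
  if (k < 1) return(NA_real_)  # undefined when one variable is constant
  chi2 <- suppressWarnings(chisq.test(tab, correct = FALSE)$statistic)
  unname(sqrt(chi2 / (sum(tab) * k)))
}

# Fill a symmetric matrix with the pairwise values
p <- ncol(m)
V <- diag(1, p)
dimnames(V) <- list(colnames(m), colnames(m))
for (i in seq_len(p - 1)) {
  for (j in (i + 1):p) {
    V[i, j] <- V[j, i] <- cramers_v(m[, i], m[, j])
  }
}

round(V, 2)
symnum(V)                                   # same display as for a correlation matrix
image(seq_len(p), seq_len(p), V, zlim = c(0, 1),
      xlab = "", ylab = "", axes = FALSE)   # heat map of the associations

# For 0/1 data, cor() gives the phi coefficient, whose absolute value
# matches Cramer's V for the same pair of columns
b <- matrix(sample(0:1, 200 * 2, replace = TRUE), ncol = 2)
abs(cor(b[, 1], b[, 2]))
cramers_v(b[, 1], b[, 2])
```

Cramér's V ranges from 0 to 1 and carries no sign, so it shows the strength of an association but not its direction. With 1500 variables there are over a million pairs, so the double loop may be slow and many sparse tables will give unreliable chi-squared values.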


Since I am new to R, I would greatly appreciate any input on how to approach
this task and on efficient visualization of the results.


Many thanks in advance,

Lara



Re: [R] exploratory analysis of large categorical datasets

2010-11-11 Thread Dennis Murphy
Hi:

A good place to start would be package vcd and its suite of demos and
vignettes, as well as the vcdExtra package, which adds a few more goodies
and a very nice introductory vignette by Michael Friendly. You can't fault
the package for a lack of documentation :)
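
For a quick taste of vcd, here is a small sketch, assuming the package is installed, that applies assocstats() and mosaic() to a single two-way table. The simulated variables x and y and the table name tab are made up for the example.

```r
set.seed(7)
# Hypothetical pair of categorical variables with 3 states each
x <- factor(sample(1:3, 300, replace = TRUE))
y <- factor(sample(1:3, 300, replace = TRUE))
tab <- table(V.1 = x, V.2 = y)

if (requireNamespace("vcd", quietly = TRUE)) {
  print(vcd::assocstats(tab))     # chi-squared tests, contingency coefficient, Cramer's V
  vcd::mosaic(tab, shade = TRUE)  # mosaic plot shaded by Pearson residuals
} else {
  message("Package 'vcd' is not installed; try install.packages('vcd').")
}
```

The mosaic plot is most useful for inspecting a handful of interesting pairs rather than all pairs at once.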

You might also find the following link useful:  http://www.datavis.ca/R/
Scroll down to 'vcd and vcdExtra', and further down to 'tableplot', which
was recently released on CRAN.

HTH,
Dennis

