On Jul 28, 2011, at 4:24 PM, David Warren wrote:
Hi all,
I'm working with a sizable dataset that I'd like to summarize, but I
can't find a tool or function that will do quite what I'd like.
Basically, I'd like to summarize the data by fully crossing three
variables and getting a count of the number of observations for every
level of that 3-way interaction. For example, if factors A, B, and C
each have 3 levels (all of which were observed someplace in the
dataset), I'd like to know how many times A1, B1, and C1 co-occurred in
the dataset. Functions like aggregate and summaryBy do a decent job when
I sum a vector of ones of the same length as the original dataset, but
I'm getting stuck on the fact that neither will return 0-count
combinations of the three variables in question.
I think that may depend on what functions and arguments you use.
I understand that this is a desirable outcome (if A1, B1, C2 didn't
occur, it shouldn't be counted and isn't), but I need to know both when
these combinations of factors did and did not occur. I'm stuck on this
one, and would really appreciate any help. Thanks in advance!
?xtabs
Dave Warren
PS A functional solution would be best; the original dataset contains
about 2.3 million observations, so any looping is going to be very slow.
In general tabulations like these are very efficient.
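A minimal sketch of the xtabs approach, assuming the three variables are
stored as factors whose levels are all declared. The data frame `dat`,
its columns A, B, C, and the toy values are made up for illustration and
are not from the original dataset:

```r
# Hypothetical toy data: three factors, each declared with 3 levels.
# Level C3 is declared but never observed, so some cells must be zero.
dat <- data.frame(
  A = factor(c("A1", "A1", "A2", "A3", "A1", "A2"),
             levels = c("A1", "A2", "A3")),
  B = factor(c("B1", "B1", "B2", "B3", "B2", "B2"),
             levels = c("B1", "B2", "B3")),
  C = factor(c("C1", "C1", "C2", "C1", "C2", "C2"),
             levels = c("C1", "C2", "C3"))
)

# Cross-tabulate all three factors; unused levels are kept by default
# (drop.unused.levels = FALSE), so unobserved combinations appear as 0.
tab <- xtabs(~ A + B + C, data = dat)

# Long format: one row per A x B x C combination, with a Freq column.
counts <- as.data.frame(tab)

nrow(counts)               # 27 combinations (3 * 3 * 3)
sum(counts$Freq == 0)      # 23 combinations that never occurred
```

The zero cells only appear for levels the factor knows about. If a
column is a character vector, or a factor whose unobserved levels were
dropped, those levels would have to be set explicitly with
factor(x, levels = ...) before tabulating.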
--
David Winsemius, MD
West Hartford, CT
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.