Hi Marc, many thanks, that is exactly what I was looking for.
Best,
Sven

----- Original Message ----
From: Marc Schwartz <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Date: 29.07.2008 17:15
Subject: Re: [R] Most often pairs of chars across grouping variable

> on 07/29/2008 09:51 AM [EMAIL PROTECTED] wrote:
> > Hi list,
> >
> > is there a package or function to compute the frequencies of pairs of
> > chars in a variable across a grouping variable? Eg:
> >
> > d <- data.frame(ID=gl(2,3), F=c("A","B","C","A","C","D"))
> > > d
> >   ID F
> > 1  1 A
> > 2  1 B
> > 3  1 C
> > 4  2 A
> > 5  2 C
> > 6  2 D
> >
> > Now I want to summarize the frequencies of all pairs A-B, A-C, A-D,
> > B-C, B-D, C-D across ID:
> >
> >     A B C D
> >   A - 1 2 1
> >   B - - 1 0
> >   C - - - 1
> >
> > here, the combination A-C is most frequent. The real problem behind
> > that is that 'F' codes diagnoses and I search for the most often
> > pairs of diagnoses.
> >
> > Thanks, Sven
>
> I suspect that there might be something over in Bioconductor, but here
> is one approach:
>
> > table(data.frame(t(do.call(cbind,
>                      tapply(d$F, d$ID,
>                             function(x) combn(as.character(x), 2))))))
>    X2
> X1  B C D
>   A 1 2 1
>   B 0 1 0
>   C 0 0 1
>
> See ?combn to create the initial pairs from the data. This is done on a
> per ID basis using tapply. The result is transposed into a data frame
> and then table() is used to create the cross tabulation of the results.
>
> HTH,
>
> Marc Schwartz
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
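
A step-by-step variant of the approach quoted above, as a rough sketch rather than a tested solution. The helper name pair_counts is made up for illustration and is not part of the thread or of any package. It rests on two assumptions that the quoted one-liner does not make: the codes within each ID are sorted and de-duplicated first, so that a pair is always counted as A-C and never as C-A, and IDs with fewer than two distinct codes are skipped, because combn() fails when asked for pairs from a single element.

```r
# Example data from the original question
d <- data.frame(ID = gl(2, 3), F = c("A", "B", "C", "A", "C", "D"))

# pair_counts() is a hypothetical helper, not from the thread.
pair_counts <- function(values, groups) {
  values <- as.character(values)
  lev <- sort(unique(values))

  # One matrix of pairs (one pair per row) for each group
  per_group <- lapply(split(values, groups), function(x) {
    x <- sort(unique(x))              # assumption: order and repeats are irrelevant
    if (length(x) < 2) return(NULL)   # combn() needs at least two elements
    t(combn(x, 2))
  })

  pairs <- do.call(rbind, unname(per_group))
  if (is.null(pairs)) pairs <- matrix(character(0), ncol = 2)

  # Shared factor levels keep the table square, as in the question
  table(X1 = factor(pairs[, 1], levels = lev),
        X2 = factor(pairs[, 2], levels = lev))
}

tab <- pair_counts(d$F, d$ID)
tab

# Row/column position(s) of the most frequent pair
which(tab == max(tab), arr.ind = TRUE)
```

With the example data this counts A-C twice and each of A-B, A-D, B-C and C-D once, which agrees with the table in the quoted reply. Only the upper triangle is filled, since every pair is stored in sorted order.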