Hi Stefan,

Thanks for the solutions. Just to add one more:

f.a<-table(a); f.b<-table(b)
c(f.a[!names(f.a)%in%names(f.b)],
  f.b[!names(f.b)%in%names(f.a)],
  xtabs(f.a[names(f.a)%in%names(f.b)]+f.b[names(f.b)%in%names(f.a)]~
        names(f.a[names(f.a)%in%names(f.b)])))
#e i j a b d f g
#1 1 1 3 1 3 5 5

A.K.

----- Original Message -----
From: Stefan Th. Gries <stgr...@gmail.com>
To: mce...@lightminersystems.com
Cc: r-help@r-project.org
Sent: Thursday, September 20, 2012 10:57 AM
Subject: [R] (no subject)

From my book on corpus linguistics with R:

# (10) Imagine you have two vectors a and b such that
a<-c("d", "d", "j", "f", "e", "g", "f", "f", "i", "g")
b<-c("a", "g", "d", "f", "g", "a", "f", "a", "b", "g")

# Of these vectors, you can create frequency lists by writing
freq.list.a<-table(a); freq.list.b<-table(b)

# How do you merge these two frequency lists without merging the two
# vectors first? More specifically, if I delete a and b from your memory,
rm(a); rm(b)
# how do you generate the following table only from freq.list.a and
# freq.list.b, i.e., without any reference to a and b themselves? Before
# you complain about this question as being unrealistic, consider the
# possibility that you generated the frequency lists of two corpora (here,
# a and b) that are so large that you cannot combine them into one
# (a.and.b<-c(a, b)) and generate a frequency list of that combined vector
# (table(a.and.b)) ...

# joint.freqs
# a b d e f g i j
# 3 1 3 1 5 5 1 1

joint.freqs<-vector(length=length(sort(unique(c(names(freq.list.a), names(freq.list.b))))))
# You generate an empty vector joint.freqs (i) that is as long as there are
# different types in both a and b (but note that, as requested, this
# information is not taken from a or b, but from their frequency lists) ...
names(joint.freqs)<-sort(unique(c(names(freq.list.a), names(freq.list.b))))
# ... and (ii) whose elements have these different types as names.
joint.freqs[names(freq.list.a)]<-freq.list.a
# The elements of the new vector joint.freqs that have the same names as
# the frequencies in the first frequency list are assigned the respective
# frequencies.
joint.freqs[names(freq.list.b)]<-joint.freqs[names(freq.list.b)]+freq.list.b
# The elements of the new vector joint.freqs that have the same names as
# the frequencies in the second frequency list are assigned the sum of the
# values they already have (either the ones from the first frequency list
# or just zeroes) and the respective frequencies.
joint.freqs # look at the result

# Another shorter and more elegant solution was proposed by Claire Crawford
# (but uses a function which will only be introduced later in the book)
freq.list.a.b<-c(freq.list.a, freq.list.b)
# first the two frequency lists are merged into a single vector ...
joint.freqs<-as.table(tapply(freq.list.a.b, names(freq.list.a.b), sum))
# ... and then the sums of all numbers that share the same names are computed
joint.freqs # look at the result

# The shortest, but certainly not memory-efficient way to do this involves
# just using the frequency lists to create one big vector with all elements
# and tabulate that.
table(c(rep(names(freq.list.a), freq.list.a), rep(names(freq.list.b), freq.list.b)))
# kind of cheating but possible with short vectors ...

HTH,
STG
--
Stefan Th. Gries
-----------------------------------------------
University of California, Santa Barbara
http://www.linguistics.ucsb.edu/faculty/stgries

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
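
For anyone who wants to reuse the tapply() idea from the thread on more than two frequency lists, here is a minimal sketch. The function name merge.freq.lists is hypothetical (it does not come from the book or from either post), and the comparison against table(c(a, b)) is only there as a sanity check on the small example vectors; with large corpora that reference table is exactly what you could not build.

```r
# Hypothetical helper (not from the original posts): sum any number of
# frequency lists by type name, using only the lists themselves.
merge.freq.lists <- function(...) {
  all.freqs <- c(...)  # concatenate the lists; type names are kept
  as.table(tapply(all.freqs, names(all.freqs), sum))
}

# The example vectors from the thread
a <- c("d", "d", "j", "f", "e", "g", "f", "f", "i", "g")
b <- c("a", "g", "d", "f", "g", "a", "f", "a", "b", "g")
freq.list.a <- table(a); freq.list.b <- table(b)

joint.freqs <- merge.freq.lists(freq.list.a, freq.list.b)

# Sanity check: only feasible here because a and b are tiny
reference <- table(c(a, b))
stopifnot(identical(names(joint.freqs), names(reference)),
          all(joint.freqs == reference))

joint.freqs # look at the result
```

Because the helper takes its arguments through ..., a third list can simply be passed as merge.freq.lists(freq.list.a, freq.list.b, freq.list.c), assuming each argument is a named vector or one-dimensional table of counts.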