I have a large file-backed big.matrix with millions of rows and 20 columns.
The columns contain data that I simply need to tabulate. There are a few dozen unique values, and I just want a frequency count.

Test code with a small "big" matrix:

library(bigmemory)
library(bigtabulate)

test <- big.matrix(nrow = 100, ncol = 10)
test[, 1:3]  <- sample(150)
test[, 4:6]  <- sample(100)
test[, 7:10] <- sample(100)

## So we have a sample big.matrix. It's not file backed, but it will do
## for testing. The result we want is the one you would get if you could
## run table() on the big.matrix; that's emulated in this example by
## coercing the big.matrix to an ordinary matrix. In the real application
## that is not possible, because of RAM limits.

P <- table(as.matrix(test))

## The package bigtabulate has a version of table() called bigtable().
## You can run it on an individual column. I want to run it on all the
## columns -- basically, combine the results of running it on the
## individual columns. If you specify multiple columns you get a
## contingency table, and if you use too many columns you will hang your
## system hard, so don't try the line below. Well, at least I hung my
## system.

# Ouch <- bigtable(test, ccols = seq(1, 10))

So, is there a simple way to get the answer as emulated by
P <- table(as.matrix(test)) without coercing to a matrix?

TIA

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
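For what it's worth, one possible approach (a sketch, untested on a real file-backed big.matrix) is to tabulate one column at a time and merge the per-column counts by value name: a single-column tabulation returns a named count vector, so the totals can be accumulated without ever materialising the whole matrix. The merge step can be demonstrated with base R's table() on an ordinary matrix:

```r
## Merge per-column frequency counts by value name. With a file-backed
## big.matrix you would replace table(m[, j]) by bigtable(test, ccols = j)
## (assuming, as the docs suggest, that a single-column bigtable() also
## returns named counts); the merge logic itself stays the same.
set.seed(1)
m <- matrix(sample(5, 60, replace = TRUE), nrow = 20, ncol = 3)

## one named count vector per column
col_counts <- lapply(seq_len(ncol(m)), function(j) table(m[, j]))

## union of all observed values, then accumulate counts by name
vals  <- sort(unique(unlist(lapply(col_counts, names))))
total <- setNames(integer(length(vals)), vals)
for (ct in col_counts) total[names(ct)] <- total[names(ct)] + ct

## total now matches table(m) computed on the whole matrix at once
```

Processing one column at a time keeps the memory footprint at a single column's worth of counts, which is why this avoids the coercion to a full matrix.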