Hello everyone,
I'm trying to construct bins for each row in a matrix, using apply() in combination with hist(). Binning a 10K-by-50 matrix takes about 5 seconds, but a 1K-by-500 matrix with the same number of elements takes only 0.5 seconds. This suggests the bottleneck is accessing rows in apply() rather than the calculations going on inside hist().

My initial idea is to process as many columns at once as makes sense for the intended use. However, I still have a great many rows to process, and I would appreciate any feedback on how to speed this up.

Any thoughts?

Thanks,
Ariel

Here is the illustration:

# create data: both matrices hold 500,000 values
m1 <- matrix(10*rnorm(50*10^4), ncol=50)   # 10,000 rows x 50 columns
m2 <- matrix(10*rnorm(50*10^4), ncol=500)  # 1,000 rows x 500 columns

# compute bins: unit-width breaks from -100 to 100
bins <- seq(-100,100,1)
system.time({
  out1 <- t(apply(m1,1, function(x) hist(x,breaks=bins, plot=FALSE)$counts))
})
system.time({
  out2 <- t(apply(m2,1, function(x) hist(x,breaks=bins, plot=FALSE)$counts))
})

---
Ariel Ortiz-Bobea
Fellow
Resources for the Future
1616 P Street, N.W.
Washington, DC 20036
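
One possible way to avoid calling hist() once per row, offered only as a sketch and not as a tested answer to the timings above: bin the whole matrix in a single findInterval() call and count with tabulate(). The helper name row_hist_counts() is hypothetical. The sketch assumes no NA values, strictly increasing breaks, and R >= 3.3.0 (for the left.open argument). It follows hist()'s default convention of right-closed intervals with the lowest break included, though values within a tiny tolerance of a break could land differently, because hist() applies a small fuzz to its breaks.

```r
# Hypothetical helper (name and signature are illustrative, not from the post).
# For every row of `m`, counts how many values fall in each bin of `breaks`,
# using right-closed intervals (a, b] with the lowest break included.
row_hist_counts <- function(m, breaks) {
  nb <- length(breaks) - 1L
  # Bin index of every element in one vectorised call. With left.open = TRUE,
  # rightmost.closed = TRUE closes the leftmost interval instead.
  idx <- findInterval(m, breaks, left.open = TRUE, rightmost.closed = TRUE)
  # hist() stops on values outside the breaks; do the same here.
  stopifnot(all(idx >= 1L & idx <= nb))
  # Map each (row, bin) pair to a cell of an nrow-by-nb column-major matrix.
  cell <- (idx - 1L) * nrow(m) + as.vector(row(m))
  matrix(tabulate(cell, nbins = nrow(m) * nb), nrow = nrow(m))
}

# Small check against the apply()/hist() approach on simulated data
set.seed(1)
m <- matrix(10 * rnorm(200 * 20), ncol = 20)
bins <- seq(-100, 100, 1)
counts <- row_hist_counts(m, bins)
ref <- t(apply(m, 1, function(x) hist(x, breaks = bins, plot = FALSE)$counts))
all(counts == ref)
```

Wrapping each approach in system.time() on m1 and m2 would show whether this helps for a given matrix shape.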