Hi, I have been trying to use the new .parallel argument with the most recent version of plyr [1] to speed up some tasks. I can run the example in the NEWS file [1], and it seems to be working correctly. However, R will only use a single core when I try to apply this same approach with ddply().
1. http://cran.r-project.org/web/packages/plyr/NEWS

Watching my CPUs I see that in both cases only a single core is used, and they take about the same amount of time. Is there a limitation with how ddply() dispatches parallel jobs, or is this task not suitable for parallel computing?

Cheers,
Dylan

Here is an example:

library(plyr)
library(doMC)
registerDoMC(cores=2)

# example data
d <- data.frame(y=rnorm(1000), id=rep(letters[1:4], each=500))

# function that wastes some time
f <- function(x) {
  m <- vector(length=10000)
  for(i in 1:10000) {
    m[i] <- mean(sample(x$y, 100))
  }
  mean(m)
}

system.time(ddply(d, .(id), .fun=f, .parallel=FALSE))
#    user  system elapsed
#   2.740   0.016   2.766

system.time(ddply(d, .(id), .fun=f, .parallel=TRUE))
#    user  system elapsed
#   2.720   0.000   2.726

--
Dylan Beaudette
Soil Resource Laboratory
http://casoilresource.lawr.ucdavis.edu/
University of California at Davis
530.754.7341

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.