Hi,

I am trying to understand why code that uses the parallel package seems to
be slower when it runs inside a function. For example:

# data
don <- lapply(1:150, function(x){
  data.frame(a = rnorm(100000), b = rnorm(100000))
})

# inline test
t0 <- Sys.time()

require(parallel)
cl <- makeCluster(4)
res <- parLapplyLB(cl, don, function(x){1})
stopCluster(cl)

Sys.time() - t0 # 3.5 sec, each thread up to 90 MB

# using function
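parF <- function(data){

  require(parallel)
  cl <- makeCluster(4)

  # note: parLapply here, parLapplyLB in the inline test above
  result <- parLapply(cl, data, function(x){1})

  stopCluster(cl)
  result # return the result, not stopCluster()'s value
}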

system.time(res2 <- parF(don)) # 9.5 sec, each thread up to 320 MB ...!


It seems that running the same code inside a function:

   - is about 3x slower...
   - loads more data into each thread...!
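
One way to look into the second point might be to compare how many bytes a
closure takes when it is serialized, since that is roughly what gets sent to
each worker. The sketch below is only an illustration under that assumption:
make_local() is a hypothetical helper, not part of parallel, and it imitates
parF() by creating the function inside a frame that also holds a large
object. serialize() writes out a closure's enclosing environment unless that
environment is the global one, which is recorded by reference only.

```r
# Hypothetical helper: builds a closure inside a function whose frame
# also holds a large object, as parF() does with its `data` argument.
make_local <- function(data) {
  force(data)          # evaluate the argument so its value lives in this frame
  function(x) 1
}

big <- rnorm(1e6)      # about 8 MB of doubles
f_local <- make_local(big)

# Same function, but with its enclosing environment reset to the global
# environment, which serialize() records by reference only.
f_reset <- f_local
environment(f_reset) <- globalenv()

size_local <- length(serialize(f_local, NULL))
size_reset <- length(serialize(f_reset, NULL))

cat("closure created inside a function:", size_local, "bytes\n")
cat("same closure, global environment: ", size_reset, "bytes\n")
```

If the first number is much larger than the second, the extra memory seen in
each thread could come from the function's frame travelling along with the
function itself.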

Thanks.
-- 


Benoit Thieurmel  +33 6 69 04 06 11 10 place de la Madeleine - 75008 Paris


______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
