Re: [R] R parallel / foreach - aggregation of results
If you return just the row that the foreach procedure produces, instead of the entire matrix containing that row, and use .combine=rbind, then you will end up with the matrix of interest. E.g.,

Simpar3a <- function(n1) {
    L2distance <- matrix(NA, ncol = n1, nrow = n1)  # retained from the original, but unused below
    data <- rnorm(n1)
    diag(L2distance) <- 0
    cl <- makeCluster(4)
    registerDoParallel(cl)
    x <- foreach(j = 1:n1, .combine = rbind) %dopar% {
        library(np)
        datj <- data[j]
        rowJ <- numeric(n1)
        for (k in j:n1) {
            rowJ[k] <- k * datj
        }
        rowJ   # return the row; foreach stacks the rows with rbind
    }
    stopCluster(cl)
    x
}

Bill Dunlap
TIBCO Software
wdunlap tibco.com
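A smaller self-contained illustration of the same .combine = rbind pattern, with toy numbers in place of the np-based computation (a sketch, not from the thread):

library(doParallel)
cl <- makeCluster(2)
registerDoParallel(cl)
m <- foreach(j = 1:4, .combine = rbind) %dopar% {
    row <- numeric(4)
    row[j:4] <- (j:4) * j   # fill the upper-triangular part, as in Simpar3a
    row
}
stopCluster(cl)
dim(m)   # 4 x 4: one row per iteration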
Re: [R] R parallel / foreach - aggregation of results
You can always just pull the last one off the list. But when running things in parallel, what does "the last one" mean? Do you want the last result from each of the parallel workers, or just the last element of the list? You might want to put a flag on the data being returned so you can determine which result you want to process.

Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
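A sketch of the flag idea (the names here are made up): tag each result with its iteration index so a particular one can be picked out afterwards.

library(doParallel)
cl <- makeCluster(2)
registerDoParallel(cl)
x <- foreach(j = 1:8) %dopar% {
    list(j = j, value = rnorm(1))   # tag the result with its index
}
stopCluster(cl)
idx <- sapply(x, `[[`, "j")
last <- x[[which.max(idx)]]   # the result produced for the largest j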
Re: [R] R parallel / foreach - aggregation of results
Dear Jim,

Thank you very much for your response. It seems to work now, but the return value is not the required matrix but a list of matrices (one for each repetition j).
Any idea how it is possible to return only the last matrix and not all of them?

Thanks and best,

Martin
Re: [R] R parallel / foreach - aggregation of results
Try this change to actually return values:

library(doParallel)

Simpar3 <- function(n1) {
    L2distance <- matrix(NA, ncol = n1, nrow = n1)
    data <- rnorm(n1)
    diag(L2distance) <- 0
    cl <- makeCluster(4)
    registerDoParallel(cl)
    x <- foreach(j = 1:n1) %dopar% {
        library(np)
        datj <- data[j]
        for (k in j:n1) {
            L2distance[j, k] <- k * datj
        }
        L2distance   # return the value
    }
    stopCluster(cl)
    return(x)
}

Res <- Simpar3(100)

Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
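If you keep this version, which returns one full matrix copy per iteration, the copies can afterwards be merged by filling the NAs of each from the next. This is a sketch only; returning just the computed row with .combine = rbind (see Bill Dunlap's reply above) is the simpler fix.

merged <- Reduce(function(a, b) { a[is.na(a)] <- b[is.na(a)]; a }, x)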
Re: [R] R parallel / foreach - aggregation of results
Martin,

I think the main problem is that you are trying to assign your results to the result matrix inside the foreach loop. Parallel functions in R are generally not able to update parts of a matrix in the calling session from the different workers in this way. Instead, with foreach, each iteration has to return a vector, and the vectors can be combined into a result matrix. Something like:

L2distance <- foreach(j = 1:n1, .combine = cbind) %dopar% {
    res <- rep(NA, n1)   # length n1, not a hard-coded constant
    for (k in j:n1) res[k] <- k * data[j]
    res
}
L2distance

(Note that with .combine = cbind each iteration becomes a column, so this builds the transpose of your original matrix; use .combine = rbind to get rows.) I am not sure what the np library is used for here, but you should make sure it is loaded on the workers after creating the cluster, e.g. via foreach's .packages argument, as sketched below; clusterExport is for variables rather than packages.

Best wishes,
Jon

--
Jon Olav Skøien
Joint Research Centre - European Commission
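Concretely, a sketch of the package loading, assuming n1 and data exist as in Simpar3:

x <- foreach(j = 1:n1, .combine = rbind, .packages = "np") %dopar% {
    # .packages attaches np in each worker, replacing library(np) in the loop body
    res <- rep(NA, n1)
    for (k in j:n1) res[k] <- k * data[j]
    res
}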
[R] R parallel / foreach - aggregation of results
Dear all,

when I am running the code attached below, it seems that no results are returned, only the predefined NAs. What mistake do I make?
Any comments and help are highly appreciated.

Thanks and best,

Martin

Simpar3 <- function(n1) {
    L2distance <- matrix(NA, ncol = n1, nrow = n1)
    data <- rnorm(n1)
    diag(L2distance) <- 0
    cl <- makeCluster(4)
    registerDoParallel(cl)
    foreach(j = 1:n1) %dopar% {
        library(np)
        datj <- data[j]
        for (k in j:n1) {
            L2distance[j, k] <- k * datj
        }
    }
    stopCluster(cl)
    return(L2distance)
}

Res <- Simpar3(100)
Re: [R] R parallel - slow speed
Thank you very much to you both for your help.

I knew that parallelizing has some additional "overhead" costs, but I was surprised by the order of magnitude (it was 10 times slower). Therefore I thought I had made some mistake or that there was a more clever way to do it.

Best,

Martin
Re: [R] R parallel - slow speed
Thank you very much for your help.

I tried it under Unix, and there the parallel version was faster than under Windows (but still slower than the non-parallel version). This is an important point to keep in mind. Thanks for this.

Best,

Martin
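On Unix, forking can also avoid much of the data shipping: mclapply() forks the master at call time, so large objects such as Weights are visible to the children without being serialized and sent to them. A sketch, assuming the objects from the benchmark script are in scope:

library(parallel)
# Unix/macOS only; on Windows mc.cores must be 1
res <- unlist(mclapply(seq_len(nrow(Xeval)), function(i)
    npnewpar(y, Xc, Xd, Weights, h = 0.5, xeval = Xeval[i, ]),
    mc.cores = 4))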
Re: [R] R parallel - slow speed
I ran a test on my Windows box with 4 CPUs. There were 4 RScript processes started in response to the request for a cluster of 4. Each of these ran for an elapsed time of around 23 seconds, making the median time around 0.2 seconds per iteration for the 100 iterations, as reported by microbenchmark. The 'apply' version only takes about 0.003 seconds for a single iteration, again what microbenchmark is reporting. The 4 RScript processes each use about 3 CPU seconds in the 23 seconds of elapsed time; most of that is probably the communication and startup time for the processes and the reporting of results.

So, as was pointed out previously, there is overhead in running in parallel. You probably have to have at least several seconds of heavy computation per iteration to make it worth trying to parallelize something. You should also investigate exactly what is happening on your system so that you can account for the time being spent.

Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
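A toy illustration of the trade-off (a sketch, not from the thread): when each task takes seconds, the cluster wins; when each task takes microseconds, dispatch overhead dominates.

library(parallel)
cl <- makeCluster(4)
system.time(parSapply(cl, 1:4, function(i) Sys.sleep(2)))   # ~2 s elapsed
system.time(for (i in 1:4) Sys.sleep(2))                    # ~8 s elapsed
system.time(parSapply(cl, 1:1e4, sqrt))                     # slower than plain sqrt(1:1e4)
stopCluster(cl)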
Re: [R] R parallel - slow speed
Parallelizing comes at a price... and there is no guarantee that you can afford it. Vectorizing your algorithms is often a better approach. Microbenchmarking is usually overkill for evaluating parallelizing.

You assume 4 cores... but many CPUs have 2 cores and use hyperthreading to make each core look like two.

The operating system can make a difference also... Windows processes are more expensive to start and communicate between than *nix processes are. In particular, Windows seems to require duplicated RAM pages while *nix can share process RAM (at least until it is written to), so you end up needing more memory, and disk paging of virtual memory becomes more likely.

---
Jeff Newmiller
Sent from my phone. Please excuse my brevity.
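For this particular function, the kernel step can in fact be vectorised over all evaluation points at once. A sketch, assuming (as in the posted code) that Weights[xd, Xd] is a valid row/column lookup, i.e. that xd is a positive row index:

npnewpar_vec <- function(y, Xc, Xd, Weights, h, Xeval) {
    # n1 x n kernel weights; dnorm is symmetric, so the sign of the
    # difference does not matter
    K <- dnorm(outer(Xeval[, 1], Xc, "-") / h)
    # n1 x n categorical weights: row i is Weights[Xeval[i, 2], Xd]
    L <- Weights[Xeval[, 2], Xd]
    W <- K * L
    drop(W %*% y) / rowSums(W)   # ghat for every evaluation point
}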
[R] R parallel - slow speed
Dear all,

I am trying to parallelize the function npnewpar given below. When I compare an application of "apply" with "parApply", the parallelized version seems to be much slower (cf. output below). Therefore I would like to ask how the function could be parallelized more efficiently. (With increasing sample size the difference becomes smaller, but I was wondering about this big difference and how it could be improved.)

Thank you very much for help in advance!

Best,

Martin

library(microbenchmark)
library(doParallel)

n <- 500
y <- rnorm(n)
Xc <- rnorm(n)
Xd <- sample(c(0, 1), n, replace = TRUE)
Weights <- diag(n)
n1 <- 50
Xeval <- cbind(rnorm(n1), sample(c(0, 1), n1, replace = TRUE))

detectCores()
cl <- makeCluster(4)
registerDoParallel(cl)
microbenchmark(
    apply(Xeval, 1, npnewpar, y = y, Xc = Xc, Xd = Xd, Weights = Weights, h = 0.5),
    parApply(cl, Xeval, 1, npnewpar, y = y, Xc = Xc, Xd = Xd, Weights = Weights, h = 0.5),
    times = 100)
stopCluster(cl)

Unit: milliseconds
 expr                            min         lq       mean     median         uq        max  neval
 apply(Xeval, 1, ...)       4.674914   4.726463   5.455323   4.771016   4.843324   57.01519    100
 parApply(cl, Xeval, ...)  34.168250  35.434829  56.553296  39.438899  49.777265  347.77887    100

npnewpar <- function(y, Xc, Xd, Weights, h, xeval) {
    xc <- xeval[1]
    xd <- xeval[2]
    l <- function(x, X) {
        w <- Weights[x, X]
        return(w)
    }
    u <- (Xc - xc) / h
    # K <- kernel(u)
    K <- dnorm(u)
    L <- l(xd, Xd)
    nom <- sum(y * K * L)
    denom <- sum(K * L)
    ghat <- nom / denom
    return(ghat)
}
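One way to reduce the communication cost in the benchmark above is to ship the large objects to the workers once rather than with every call. A sketch using clusterExport, with the object names taken from the script above:

library(parallel)
cl <- makeCluster(4)
# export the big objects and the function to each worker once
clusterExport(cl, c("y", "Xc", "Xd", "Weights", "npnewpar"))
ghat <- parApply(cl, Xeval, 1, function(xeval)
    npnewpar(y, Xc, Xd, Weights, h = 0.5, xeval = xeval))
stopCluster(cl)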
Re: [R] R Parallel question
On 11.02.2012 23:12, slbfelix wrote:
> How can I set the seeds on parallel workers to get the same result as
> sequential mode?

I don't think you can easily do it, since each node in a cluster has its own RNG stream. Reproducing the sequential results on a different number of nodes would mean jumping around in the RNG stream, which would be slow (and would probably need some programming work).

For a start, read ?RNG

Uwe Ligges
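A sketch of the repeatable (though not sequential-identical) alternative, using the base 'parallel' package rather than snowfall:

library(parallel)
cl <- makeCluster(2)
clusterSetRNGStream(cl, iseed = 42)   # L'Ecuyer-CMRG substream per worker
r1 <- parSapply(cl, c(1, 1), rnorm)
clusterSetRNGStream(cl, iseed = 42)   # reset the streams
r2 <- parSapply(cl, c(1, 1), rnorm)
identical(r1, r2)   # TRUE: repeatable across runs, but not equal to rnorm(2)
stopCluster(cl)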
[R] R Parallel question
Hi All,

I have a question about R parallel computing using snowfall. How can I set the seeds on parallel workers to get the same result as in sequential mode? For example:

> sfSapply(c(1,1), rnorm)
[1]  1.823082 -2.222052
> rnorm(2)
[1] -0.5179967 -1.0807196

How can I get identical results?

Thanks.

Libo Sun
Graduate Student, Department of Statistics,
Colorado State University
Fort Collins, CO
Re: [R] R/parallel
On Thu, 8 Dec 2011, Scott Raynaud wrote:

> Looks like this requires use of foreach and lots of extra coding.
> R/parallel only requires that the loop be bracketed with the code to
> start the parallel processing - about 4-5 lines. Seems a lot easier to
> me. If I were to go to the trouble of writing a lot of new code, it
> seems that recompiling with BLAS and pnmath would be a better option.

On Windows, that will be man-days of work unless you know the insides intimately (and no guarantees that it will eventually give a speed increase). There is no support for parallel BLAS nor OpenMP nor pthreads in the current R sources/binaries for Windows.

> My main question is how to handle the random number generation when the
> child processes are spawned. That's a problem no matter what method I
> choose to create the threads.

A caveat: the R interpreter is not thread-safe: don't assume that you can run R code in parallel threads.

I don't know what you mean by 'R/parallel'. However, R has a 'parallel' package, and its vignette discusses all this (including RNG). Adding parallel support using package 'parallel' is simple (and well-documented), not least as it comes ready-to-roll with R.

If you meant the unfortunately named project at www.rparallel.org, a few comments:

1) It has not been updated in 3 years, and the pre-compiled Windows binaries are not going to work with recent R (that is, R >= 2.10.0).

2) It seems a lot less mature than the 'parallel' package and makes several restrictive assumptions.

3) Part of that lack of maturity is lack of documentation (including of the restrictive assumptions).

4) Inter-process communication seems to be by files. That is going to be slow, especially on Windows. Package 'parallel' uses sockets and pipes.

--
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford
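For foreach-style loops, the doRNG package (an assumption on my part; it is not mentioned in this thread) makes each iteration's random numbers reproducible:

library(doParallel)
library(doRNG)   # assumed add-on package, not part of base R
cl <- makeCluster(2)
registerDoParallel(cl)
set.seed(123)
r1 <- foreach(i = 1:4, .combine = c) %dorng% rnorm(1)
set.seed(123)
r2 <- foreach(i = 1:4, .combine = c) %dorng% rnorm(1)
identical(r1, r2)   # TRUE: each iteration gets its own reproducible substream
stopCluster(cl)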
Re: [R] R/parallel
Looks like this requires use of foreach and lots of extra coding. R/parallel only requires that the loop be bracketed with the code to start the parallel processing - about 4-5 lines. Seems a lot easier to me. If I were to go to the trouble of writing a lot of new code, it seems that recompiling with BLAS and pnmath would be a better option.

My main question is how to handle the random number generation when the child processes are spawned. That's a problem no matter what method I choose to create the threads.
Re: [R] R/parallel
Hi Scott,

Why not use the doSMP package from REvolution?
http://www.r-statistics.com/2010/04/parallel-multicore-processing-with-r-on-windows/

Tal
[R] R/parallel
I want to take advantage of my multicore CPU to speed up a loop in a simulation program. I didn’t write the code, but the iterations appear independent to me, at least in the sense that the results of one loop do not depend on previous ones. Right now I’m relegated to a box that runs Windows 7. These appear to be the options:

Pnmath - appears to parallelize non-BLAS routines but requires a special build
Fork - UNIX only
Romp - looks like this hasn’t advanced past the developmental stage
Multicore - use on Windows at your own risk
R/parallel - seems like the best option if I don’t want to recompile

Has anyone ever used R/parallel? What kind of results did you have? One difficulty with my simulation is that the loop includes code to generate random numbers. If this loop is split into different threads, then I suspect the randomness of the numbers is not assured. What can I do about that?

I can provide the loop code, but it’s fairly long, say 75-100 lines.

If R/parallel is not feasible, then a recompile with BLAS and pnmath appears to be the next best option.
Re: [R] R/Parallel
Hi,

have a look at Dirk's tutorial from useR! 2008. This should be a good starting point:
http://www.statistik.uni-dortmund.de/useR-2008/tutorials/eddelbuettel.html

Markus

--
Dipl.-Tech. Math. Markus Schmidberger
Ludwig-Maximilians-Universität München
IBE - Institut für medizinische Informationsverarbeitung, Biometrie und Epidemiologie
[R] R/Parallel
Hi there,

I am looking for the R/parallel package or some other package that would speed up my analysis. I am working with computationally intensive data, so any suggestions would be really helpful. Kindly let me know of any.