Re: [R] parallel computing with foreach()
Thanks Peter. I failed to realize earlier that one of the functions I used
came from a package. Adding the .packages argument solved the problem:

foreach(i = 1:length(splist)) %:%
  foreach(j = 1:length(covset), .packages = c("raster")) %dopar% {
    ...
  }

On Thu, Dec 7, 2017 at 1:52 AM, Peter Langfelder wrote:
> Your code generates an error that has nothing to do with dopar. I have
> no idea what your function stack is supposed to do; you may be
> inadvertently calling utils::stack, which would produce this kind of
> error:
>
> > stack(1:25, RAT = FALSE)
> Error in data.frame(values = unlist(unname(x)), ind, stringsAsFactors =
> FALSE) :
>   arguments imply differing number of rows: 25, 0
>
> HTH,
>
> Peter
>
> On Wed, Dec 6, 2017 at 10:03 PM, Kumar Mainali wrote:
> > I have used foreach() for parallel computing but in the current
> > problem, it is not working. [...]

--
Postdoctoral Associate
Department of Biology
University of Maryland, College Park

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
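For completeness, the fix boils down to one argument on the inner loop. A minimal self-contained sketch, assuming a doParallel backend (the thread does not show which backend was registered) and with a placeholder body standing in for the real raster work:

```r
library(foreach)
library(doParallel)

cl <- makeCluster(2)
registerDoParallel(cl)

splist <- c("juoc", "juos")
covset <- c("PEN", "Thorn")

# .packages = "raster" attaches the package on every worker, so stack()
# inside the loop resolves to raster::stack() rather than utils::stack()
res <- foreach(i = seq_along(splist)) %:%
  foreach(j = seq_along(covset), .packages = "raster") %dopar% {
    paste(splist[i], covset[j])  # placeholder for the raster stacking step
  }

stopCluster(cl)
```

The same effect can be had without .packages by calling raster::stack() explicitly inside the loop, which sidesteps the masking by utils::stack() altogether.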
Re: [R] parallel computing with foreach()
Your code generates an error that has nothing to do with dopar. I have no
idea what your function stack is supposed to do; you may be inadvertently
calling utils::stack, which would produce this kind of error:

> stack(1:25, RAT = FALSE)
Error in data.frame(values = unlist(unname(x)), ind, stringsAsFactors =
FALSE) :
  arguments imply differing number of rows: 25, 0

HTH,

Peter

On Wed, Dec 6, 2017 at 10:03 PM, Kumar Mainali wrote:
> I have used foreach() for parallel computing but in the current problem,
> it is not working. [...]
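Peter's diagnosis is easy to verify in base R alone: utils::stack() is a reshaping utility for named lists and data frames, so handing it a bare vector of length 25 (plus an argument it does not understand) yields exactly the row-count mismatch in the error message:

```r
# utils::stack() reshapes a named list into a two-column data frame;
# it knows nothing about raster files or a RAT argument
x <- list(a = 1:3, b = 4:6)
s <- utils::stack(x)
nrow(s)   # 6: one row per value
names(s)  # "values" "ind"

# By contrast, utils::stack(1:25, RAT = FALSE) fails with
# "arguments imply differing number of rows: 25, 0" --
# the same message the failing foreach task reports.
```

So when raster is not attached on the workers, the bare call stack(rasterc, RAT = FALSE) silently resolves to utils::stack() and the raster paths never get stacked.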
[R] parallel computing with foreach()
I have used foreach() for parallel computing but in the current problem, it
is not working. Given the volume and type of the data involved in the
analysis, I will give below the complete code without a reproducible
example.

In short, each R environment will draw a set of separate files, perform the
analysis and dump the results in separate folders.

splist <- c("juoc", "juos", "jusc", "pico", "pifl", "pipo", "pire", "psme")
covset <- c("PEN", "Thorn")

foreach(i = 1:length(splist)) %:%
  foreach(j = 1:length(covset)) %dopar% {

    spname <- splist[i]; spname
    myTorP <- covset[j]; myTorP

    DataSpecies = data.frame(prsabs = rep(1, 10), lon = rep(30, 10),
                             lat = rep(80, 10))
    myResp = as.numeric(DataSpecies[, 1])
    myRespXY = DataSpecies[, c("lon", "lat")]
    # directory of a bunch of raster files specific to each R environment
    rastdir <- paste0(rootdir, "Current/", myTorP); rastdir
    rasterc = list.files(rastdir, pattern = "\\.tif$", full.names = TRUE)
    print(rasterc)
    myExplc = stack(rasterc, RAT = FALSE)
  }

I get the following error message, which is most likely generated while
stacking the rasters, because there are 25 rasters in the folder of each
environment. Also, in a normal for loop, this all reads fine.

Error in { :
  task 1 failed - "arguments imply differing number of rows: 25, 0"

Thank you.
Re: [R] parallel computing with 'foreach'
Hello Stacey,

I do not know whether my answer comes late or not; I just came across your
post now. I had a similar problem...

First: you might want to think about whether to parallelize the thing at
all. Unless coxph takes several minutes, parallelizing is probably of no
great help, because there is overhead associated with it: all workers need
to be taught about the environment (the functions and variables they need
to know), and some coordination work is necessary as well. So if every
iteration of the loop takes a long time, you may want to use foreach;
otherwise there is no great benefit (probably).

What you could do is save only the functions you need in a separate R file
and have the workers initialize just those. So you split your source code
in two parts - one containing the functions you need inside the loop, and
one that controls how those functions work together... You can try:

## declare a function that loads only the libraries and functions
## necessary inside the loop
mysource <- function(envir, filename) source("source.R")
## tell the program to have every worker execute that function
smpopts <- list(initEnvir = mysource)
## have it executed with the foreach loop
foreach(..., .options.smp = smpopts) %dopar% { ... }

Hope that helps...

Best
Lui

2011/7/1 Uwe Ligges lig...@statistik.tu-dortmund.de:
> Type ?foreach and read the whole help page - as the posting guide asked
> you to do before posting. You will find the line describing the argument
> .packages.
>
> Uwe Ligges
>
> On 28.06.2011 21:17, Stacey Wood wrote:
> > Hi all,
> > I would like to parallelize some R code and would like to use the
> > 'foreach' package with a foreach loop. [...]
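For readers finding this thread today: doSMP (and with it the .options.smp/initEnvir hook Lui describes) has since been withdrawn from CRAN. The same once-per-worker initialization can be done on a doParallel cluster with parallel::clusterEvalQ(). A hedged sketch of the idea:

```r
library(foreach)
library(doParallel)  # attaches the parallel package as well

cl <- makeCluster(2)
registerDoParallel(cl)

# Run once per worker, before the loop: the analogue of initEnvir in
# Lui's .options.smp list. Here we attach survival on each worker.
clusterEvalQ(cl, library(survival))

res <- foreach(i = 1:3) %dopar% {
  exists("coxph")  # survival is already attached on every worker
}

stopCluster(cl)
```

This keeps the per-task work small: the package (or a sourced file of helper functions) is loaded once per worker rather than once per iteration.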
Re: [R] parallel computing with 'foreach'
Type ?foreach and read the whole help page - as the posting guide asked you
to do before posting. You will find the line describing the argument
.packages.

Uwe Ligges

On 28.06.2011 21:17, Stacey Wood wrote:
> Hi all,
> I would like to parallelize some R code and would like to use the
> 'foreach' package with a foreach loop. However, whenever I call a
> function from an enabled package outside of MASS, I get an error message
> that a number of the functions aren't recognized (even though the
> functions should be defined). [...]
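Spelled out, Uwe's answer is a single extra argument to the foreach() call. A sketch using Stacey's own test case, with the doParallel backend substituted for the now-withdrawn doSMP:

```r
library(foreach)
library(doParallel)
library(survival)

cl <- makeCluster(2)
registerDoParallel(cl)

# Create the simplest test data set
test1 <- list(time = c(4, 3, 1, 1, 2, 2, 3),
              status = c(1, 1, 1, 0, 1, 1, 0),
              x = c(0, 2, 1, 1, 1, 0, 0),
              sex = c(0, 0, 0, 0, 1, 1, 1))

# .packages = "survival" attaches survival on each worker,
# so coxph() is found without a library() call inside the loop
coefs <- foreach(i = 1:3, .combine = c, .packages = "survival") %dopar% {
  fit <- coxph(Surv(time, status) ~ x + strata(sex), test1)
  summary(fit)$coef[i]
}

stopCluster(cl)
```

For the package-development case Stacey raises, the usual approach is to pass the package's own name to .packages once the package is installed; during development, .export can ship individual functions to the workers instead.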
[R] parallel computing with 'foreach'
Hi all,

I would like to parallelize some R code and would like to use the 'foreach'
package with a foreach loop. However, whenever I call a function from an
enabled package outside of MASS, I get an error message that a number of
the functions aren't recognized (even though the functions should be
defined). For example:

library(foreach)
library(doSMP)
library(survival)

# Create the simplest test data set
test1 <- list(time = c(4,3,1,1,2,2,3),
              status = c(1,1,1,0,1,1,0),
              x = c(0,2,1,1,1,0,0),
              sex = c(0,0,0,0,1,1,1))
# Fit a stratified model
coxph(Surv(time, status) ~ x + strata(sex), test1)

w <- startWorkers()
registerDoSMP(w)
foreach(i = 1:3) %dopar% {
  # Fit a stratified model
  fit <- coxph(Surv(time, status) ~ x + strata(sex), test1)
  summary(fit)$coef[i]
}
stopWorkers(w)

Error message:
Error in { :
  task 1 failed - "could not find function coxph"

If I call library(survival) inside the foreach loop, everything runs
properly. I don't think that I should have to call the package iteratively
inside the loop. I would like to use a foreach loop inside code for my own
package, but this is a problem since I can't call my own package in the
source code for the package itself! Any advice would be appreciated.

Thanks,
Stacey