Re: [Bioc-devel] BiocParallel: windows vs. mac/linux behavior

2018-01-31 Thread Martin Morgan



On 01/31/2018 06:39 PM, Ludwig Geistlinger wrote:

Hi,


I am currently considering the following snippet:



data.ids <- paste0("d", 1:5)
f <- function(x) paste("dataset", x, sep=" = ")
res <- BiocParallel::bplapply(data.ids, function(d) f(d))



Using a recent R-devel on both a Linux machine and a Mac machine, this works 
fine.


However, on a Windows R-devel this throws:


Error: BiocParallel errors

   element index: 1, 2, 3, 4, 5 

   first error: could not find function "f"


Is this a bug or is this related to the different ways in which parallel (for 
windows here serial) computation is carried out?



Windows runs brand-new processes, whereas Linux / Mac use forked 
processes that share the original process's memory. So the Windows 
workers don't know about the variables defined in the global 
environment. You can mimic this on non-Windows using SnowParam(), e.g.,


> library(BiocParallel)
> register(bpstart(SnowParam(2)))  ## set up a cluster for the duration...

> res <- BiocParallel::bplapply(data.ids, function(d) f(d))
Error: BiocParallel errors
  element index: 1, 2, 3, 4, 5
  first error: could not find function "f"

The rule is that the environment of the FUN gets serialized to the 
worker, up to (but not including) the global environment. So


local({
    f <- function(x) paste("dataset", x, sep=" = ")
    BiocParallel::bplapply(data.ids, function(d) f(d))
})

works (FUN is defined in the local environment, the local environment 
contains f and is serialized to the worker). This also implies that 
BiocParallel will find functions that are in the same package as FUN.
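
As a rough illustration of the rule above, base R's serialize() / 
unserialize() can stand in for the trip to a worker. This is only a 
sketch under that assumption: BiocParallel's actual transport is not 
shown, and FUN_local and shipped are names made up for the example.

```r
# A closure whose enclosing environment is a local environment holding f.
FUN_local <- local({
    f <- function(x) paste("dataset", x, sep = " = ")
    function(d) f(d)
})

# Stand-in for sending FUN to a worker: serialize, then unserialize.
shipped <- unserialize(serialize(FUN_local, NULL))

# The enclosing environment was copied along with the closure, f included.
ls(environment(shipped))   # "f"
shipped("d1")              # "dataset = d1"

# The global environment is stored only as a reference, not by content,
# so objects defined there do not travel with a serialized function.
identical(unserialize(serialize(globalenv(), NULL)), globalenv())   # TRUE
```

The same reasoning suggests that defining f inside the function that 
calls bplapply() should behave like the local() version, since FUN's 
enclosing environment is then that function's environment rather than 
the global one.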


A tweak

  BiocParallel::bplapply(data.ids, f)

also works, because f (but not the .GlobalEnv in which it was defined) 
is serialized.


You said 'for Windows here serial' but a SerialParam() just runs lapply 
and would not have problems finding f.
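
A minimal sketch of that last point, assuming BiocParallel is 
installed: the original snippet with SerialParam() passed explicitly 
as BPPARAM. Everything is evaluated in the current R session, so f 
should be found in the global environment by ordinary lexical lookup.

```r
library(BiocParallel)

data.ids <- paste0("d", 1:5)
f <- function(x) paste("dataset", x, sep = " = ")

# SerialParam() evaluates FUN in the current session (no separate worker
# process), so the f defined above is visible to the anonymous function.
res <- bplapply(data.ids, function(d) f(d), BPPARAM = SerialParam())
```

If this runs cleanly while the same call under SnowParam() reports 
'could not find function "f"', the difference is down to the separate 
worker processes rather than to bplapply() itself.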


Martin



Thanks,

Ludwig



--
Dr. Ludwig Geistlinger
CUNY School of Public Health


___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel






[Bioc-devel] BiocParallel: windows vs. mac/linux behavior

2018-01-31 Thread Ludwig Geistlinger
Hi,


I am currently considering the following snippet:


> data.ids <- paste0("d", 1:5)
> f <- function(x) paste("dataset", x, sep=" = ")
> res <- BiocParallel::bplapply(data.ids, function(d) f(d))


Using a recent R-devel on both a Linux machine and a Mac machine, this works 
fine.


However, on a Windows R-devel this throws:


Error: BiocParallel errors

  element index: 1, 2, 3, 4, 5 

  first error: could not find function "f"


Is this a bug or is this related to the different ways in which parallel (for 
windows here serial) computation is carried out?


Thanks,

Ludwig



--
Dr. Ludwig Geistlinger
CUNY School of Public Health
