On 03/30/2015 02:51 PM, Simon Urbanek wrote:

On Mar 30, 2015, at 4:40 PM, Valerie Obenchain <voben...@fredhutch.org> wrote:

On 03/25/2015 07:48 PM, Simon Urbanek wrote:
On Mar 25, 2015, at 3:46 PM, Valerie Obenchain <voben...@fredhutch.org> wrote:

Hi Simon,

I'm having trouble with nested parallel workers, specifically, forking inside 
socket connections.


You simply can't, by definition: when you fork, *all* the workers share the same
connection inherited from the parent, so you cannot use any I/O operations that
you didn't start in the worker, since reading in one worker affects all the
workers.


Sorry if I'm missing the obvious here.
I thought that, since the fork workers were shut down by the time the SOCK worker
returned to its master, conflicting I/O wouldn't be a problem.


If the workers are done and don't use I/O then all is well. However, it's not
easy to guarantee that they don't use I/O, since they all already come with
active sockets, so e.g. on exit they may flush the socket buffers, which would
confuse the recipient. Interestingly, your example works fine on OS X but fails
on Linux. I'll try to dig deeper in a quiet minute. In principle it should
be sufficient to close all FDs right away, which you can do when using
mcparallel() but not when using mclapply().


I see. Thanks for the explanation.

Valerie


Cheers,
Simon



There are quite a few examples floating around where SOCK workers are spawned
on a cluster and multicore workers are called within them. If I understand
correctly, this should not be done (or at least not encouraged). Instead, nested
parallelism should only be done with distributed-memory workers (SOCK, MPI, etc.).

Thanks.
Valerie
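
As a rough illustration of the alternative described above (nesting with
socket workers rather than forking inside a socket worker), here is a minimal
sketch. It is not taken from the thread: the names nested_fun, outer_cl and
inner_cl, the "PSOCK" type and the inner cluster size of 2 are all assumptions
made for the example, and it presumes both clusters can open local ports.

```r
library(parallel)

## Hypothetical helper (not from the thread): runs on the outer worker and
## starts its own socket cluster instead of forking with mclapply().
nested_fun <- function(i) {
  library(parallel)
  inner_cl <- makeCluster(2, type = "PSOCK")  # assumed size and type
  on.exit(stopCluster(inner_cl))              # always shut the inner workers down
  parLapply(inner_cl, 1:2, sqrt)
}

outer_cl <- makeCluster(1, type = "PSOCK")
res <- clusterApply(outer_cl, 1, nested_fun)
stopCluster(outer_cl)
str(res)
```

Because each inner worker is a fresh R process with its own connection, none
of them inherits the outer worker's socket to the master. The price is the
startup cost of new R processes on every call.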


Cheers,
Simon


When mclapply is called inside a SOCK, PSOCK or FORK worker I get an
error in unserialize().

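library(parallel)

## type "SOCK" is handed off to the snow package (see sessionInfo() below)
cl <- makeCluster(1, "SOCK")

## runs on the SOCK worker and forks children via mclapply()
fun <- function(i) {
  library(parallel)
  mclapply(1:2, sqrt)
}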

Failure occurs after multiple calls to clusterApply:

clusterApply(cl, 1, fun)
[[1]]
[[1]][[1]]
[1] 1

[[1]][[2]]
[1] 1.414214

clusterApply(cl, 1, fun)
[[1]]
[[1]][[1]]
[1] 1

[[1]][[2]]
[1] 1.414214

clusterApply(cl, 1, fun)
Error in unserialize(node$con) : error reading from connection


This example is from Martin and may be a different problem.

~/tmp >cat test1.R
## like mclapply
## should run 'forever' but terminates semi-randomly
library(parallel)
children <- parallel:::children

while (TRUE) {
    n <- 8            ## n == detectCores()
    jobs <- lapply(seq_len(n), function(i) mcparallel(Sys.sleep(20)))
    mccollect(children(jobs), FALSE)
    parallel:::mckill(children(jobs), tools::SIGTERM)
    leni <- length(mccollect(children(jobs)))
    message("leni: ", leni)
}

~/tmp >R-dev --vanilla --slave -f test1.R
leni: 6
leni: 7
leni: 7
leni: 7
leni: 7
leni: 7
leni: 7
leni: 7
leni: 8
leni: 7
leni: 7
leni: 7
~/tmp >


Thanks.
Valerie


sessionInfo()
R Under development (unstable) (2015-03-18 r68009)
Platform: x86_64-unknown-linux-gnu (64-bit)
Running under: Fedora 21 (Twenty One)

locale:
[1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8       LC_NAME=C
[9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods
[8] base

loaded via a namespace (and not attached):
[1] snow_0.3-13


--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, Seattle, WA 98109

Email: voben...@fredhutch.org
Phone: (206) 667-3158

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel




