Re: [Bioc-devel] BiocParallel load balancing and runtime

2023-08-08 Thread Jiefei Wang
Hello Anna, The speed of parallel computing depends on many factors. To avoid any potential confounders, please try to use this code for timing (assuming you still have all the variables you used in your example) ``` parallel_param <- SnowParam(workers = ncores, type = "SOCK", tasks =

Re: [Bioc-devel] BiocParallel load balancing and runtime

2023-08-08 Thread Anna Plaxienko
My motivation for using distributed memory was that my package is also accessible on Windows. Is it better to use shared memory as default but check the user's system and then switch to socket only if necessary? Regarding the real data. I have 68 samples (rows) of methylation EPIC array data

Re: [Bioc-devel] BiocParallel load balancing and runtime

2023-08-08 Thread Waldir Leoncio Netto
Dear Anna, According to the documentation of "BiocParallelParam", SnowParam() is a subclass suitable for distributed memory (e.g. cluster) computing. If you're running your code on a simpler machine with shared memory (e.g. your PC), you're probably better off using MulticoreParam() instead.
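
The platform check Anna asks about could be sketched as follows. This is a minimal sketch that assumes only the BiocParallel constructors named in this thread; `choose_param` is a hypothetical helper, not part of the package.

```r
library(BiocParallel)

# Hypothetical helper: forked workers where the OS supports them,
# socket workers otherwise (e.g. on Windows).
choose_param <- function(workers = 2L) {
  if (.Platform$OS.type == "windows") {
    SnowParam(workers = workers, type = "SOCK")
  } else {
    MulticoreParam(workers = workers)
  }
}

param <- choose_param(2L)
res <- bplapply(1:4, sqrt, BPPARAM = param)
```

A package would typically expose the param as a `BPPARAM` argument so that users can still override whatever default is chosen this way.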

[Bioc-devel] BiocParallel load balancing and runtime

2023-08-08 Thread Anna Plaxienko
Hi all! I'm switching from the base R *parallel* package to *BiocParallel* for my Bioconductor submission and I have two questions. First, I wanted advice on whether I've implemented load balancing correctly. Second, I've noticed that the running time is about 15% longer with BiocParallel. Any

Re: [Bioc-devel] BiocParallel and Shiny

2022-07-07 Thread Martin Morgan
nice day > > > > Giulia > > > > From: Martin Morgan > Date: Thursday, July 7, 2022 at 14:28 > To: Giulia Pais , Henrik Bengtsson > > Cc: bioc-devel@r-project.org > Subject: Re: [Bioc-devel] BiocParallel and Shiny > > I think it should be straight-forw

Re: [Bioc-devel] BiocParallel and Shiny

2022-07-07 Thread Giulia Pais
, sorry. Thank you From: Vincent Carey Date: Thursday, July 7, 2022 at 11:40 To: Giulia Pais Cc: bioc-devel@r-project.org Subject: Re: [Bioc-devel] BiocParallel and Shiny Interesting question. Have you looked at https://shiny.rstudio.com/articles/progress.html ...? There is also a file called

[Bioc-devel] BiocParallel and Shiny

2022-07-07 Thread Giulia Pais
Hello, I have a question on the use of BiocParallel with Shiny: I would like to show a progress bar on the UI much like the standard progress bar that can be set in functions like bplapply() - is it possible to do it and how? I haven't found anything on the topic in the documentation

Re: [Bioc-devel] BiocParallel Variable Not Found

2020-03-17 Thread Dario Strbenac
Good day, I am not sure how to fix my package properly, even with the good example. A link to the specific part of my function is https://github.com/DarioS/ClassifyR/blob/e35899caceb401691990136387a517f4c3b57d5e/R/runTests.R#L567 and the example in the help page of runTestsEasyHard function

Re: [Bioc-devel] BiocParallel Variable Not Found

2020-03-17 Thread Martin Morgan
The question is a bit abstract for me to understand and it might be better to point to actual code in a git repository or similar... Inside a package, something like fun = function(x, y, ...) { c(x, y, length(as.list(...))) } user_visible <- function(x, ...) { y = 1

Re: [Bioc-devel] BiocParallel Variable Not Found

2020-03-17 Thread Dario Strbenac
Good day, Thanks for the examples which demonstrate the issue. Do you have other recommendations if, inside the loop, another function in the package is being called and the variable being passed is the ellipsis? There are only a couple of variables which might be provided by the user
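
One way to handle the ellipsis case is to forward `...` through bplapply() so that every value the worker needs arrives as an argument. The sketch below is an assumption about the shape of such code, not the code from the thread; `inner` and `user_visible` are hypothetical names, and SerialParam() is used only to keep the example cheap to run.

```r
library(BiocParallel)

# Hypothetical worker function: everything it needs arrives as an argument.
inner <- function(x, y, ...) {
  c(x, y, length(list(...)))
}

# Hypothetical user-facing function: forwards `...` through bplapply(),
# which passes extra arguments on to the applied function.
user_visible <- function(x, ..., BPPARAM = SerialParam()) {
  y <- 1
  bplapply(x, inner, y = y, ..., BPPARAM = BPPARAM)
}

res <- user_visible(1:2, a = "p", b = "q")
```

Because nothing is looked up in an enclosing or global environment, the same call should behave the same way whether the workers are forked or separate processes.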

Re: [Bioc-devel] BiocParallel Variable Not Found

2020-03-17 Thread Martin Morgan
Windows uses separate processes that do not share memory (SnowParam()), whereas linux / mac by default use forked processes that share the original memory (MulticoreParam()). So > y = 1 > param = MulticoreParam() > res = bplapply(1:2, function(x) y, BPPARAM=param) works because the function

[Bioc-devel] BiocParallel Variable Not Found

2020-03-17 Thread Dario Strbenac
Good day, I have a loop in a function of my R package which by default uses bpparam() to set the framework used for parallelisation. On Windows, I see the error Error: BiocParallel errors element index: 1, 2, 3, 4, 5, 6, ... first error: object 'selParams' not found This error does not

Re: [Bioc-devel] BiocParallel on Windows Never Ends

2018-06-14 Thread Martin Morgan
yes it would be useful to post this to R-devel as a 'using parallel::makeCluster()' question, removing BiocParallel from the equation, where some general insight might be had... Martin On 06/13/2018 05:00 PM, Dario Strbenac wrote: Good day, I couldn't get a working param object. It never

Re: [Bioc-devel] BiocParallel on Windows Never Ends

2018-06-13 Thread Dario Strbenac
Good day, I couldn't get a working param object. It never completes the command param = bpstart(SnowParam(2, manager.hostname = "144.130.152.1", manager.port = 2559)) I obtained the IP address by typing "My IP address" into Google and it gave me the address shown. I used netstat -an and

Re: [Bioc-devel] BiocParallel on Windows Never Ends

2018-06-13 Thread Martin Morgan
It's more likely that it never starts, probably because it tries to create socket connections on ports that are not available, or perhaps because the file path to the installed location of the BiocParallel package is on a network share, or the 'master' node needs to be specified with an IP

[Bioc-devel] BiocParallel on Windows Never Ends

2018-06-12 Thread Dario Strbenac
Good day, I was interested in how my package performs on a 32-bit Windows computer because I'm going to give a workshop about it soon and some people might bring old laptops. I found that using SnowParam with workers set to more than 1 never finishes. The minimal code to cause the

Re: [Bioc-devel] BiocParallel: windows vs. mac/linux behavior

2018-01-31 Thread Martin Morgan
On 01/31/2018 06:39 PM, Ludwig Geistlinger wrote: Hi, I am currently considering the following snippet: data.ids <- paste0("d", 1:5) f <- function(x) paste("dataset", x, sep=" = ") res <- BiocParallel::bplapply(data.ids, function(d) f(d)) Using a recent R-devel on both a Linux machine

[Bioc-devel] BiocParallel: windows vs. mac/linux behavior

2018-01-31 Thread Ludwig Geistlinger
Hi, I am currently considering the following snippet: > data.ids <- paste0("d", 1:5) > f <- function(x) paste("dataset", x, sep=" = ") > res <- BiocParallel::bplapply(data.ids, function(d) f(d)) Using a recent R-devel on both a Linux machine and a Mac machine, this works fine. However, on
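
A platform-independent variant of this snippet passes the helper to the workers explicitly. This is a sketch under the assumption that the problem is the worker not finding `f`; the argument name `helper` is made up for illustration.

```r
library(BiocParallel)

data.ids <- paste0("d", 1:5)
f <- function(x) paste("dataset", x, sep = " = ")

# Pass `f` to the workers as an argument instead of relying on each
# worker finding it in its own global environment.
res <- bplapply(data.ids, function(d, helper) helper(d), helper = f,
                BPPARAM = SnowParam(workers = 2))
```

Using SnowParam() here exercises the separate-process case on any operating system, which makes it a convenient check before testing on Windows.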

Re: [Bioc-devel] BiocParallel and AnnotationDbi: database disk image is malformed

2018-01-19 Thread Ludwig Geistlinger
ary 19, 2018 4:10 PM To: Ludwig Geistlinger; Gabe Becker; Vincent Carey Cc: bioc-devel@r-project.org Subject: Re: [Bioc-devel] BiocParallel and AnnotationDbi: database disk image is malformed On 01/19/2018 02:24 PM, Ludwig Geistlinger wrote: > I apologize if I haven't been specific enough - h

Re: [Bioc-devel] BiocParallel and AnnotationDbi: database disk image is malformed

2018-01-19 Thread Martin Morgan
Public Health From: Bioc-devel <bioc-devel-boun...@r-project.org> on behalf of Martin Morgan <martin.mor...@roswellpark.org> Sent: Friday, January 19, 2018 1:54 PM To: Gabe Becker; Vincent Carey Cc: bioc-devel@r-project.org Subject: Re: [Bioc-

Re: [Bioc-devel] BiocParallel and AnnotationDbi: database disk image is malformed

2018-01-19 Thread Ludwig Geistlinger
.@roswellpark.org> Sent: Friday, January 19, 2018 1:54 PM To: Gabe Becker; Vincent Carey Cc: bioc-devel@r-project.org Subject: Re: [Bioc-devel] BiocParallel and AnnotationDbi: database disk image is malformed On 01/19/2018 12:37 PM, Gabe Becker wrote: > It seems like you could also force a copy

Re: [Bioc-devel] BiocParallel and AnnotationDbi: database disk image is malformed

2018-01-19 Thread Martin Morgan
On 01/19/2018 12:23 PM, Vincent Carey wrote: good question some of the discussion on http://sqlite.1065341.n5.nabble.com/Parallel-access-to-read-only-in-memory-database-td91814.html seems relevant. converting the relatively small annotation package content to pure R read-only tables on the

Re: [Bioc-devel] BiocParallel and AnnotationDbi: database disk image is malformed

2018-01-19 Thread Gabe Becker
It seems like you could also force a copy of the reference object via $copy() and then force a refresh of the conn slot by assigning a new db connection into it. I'm having trouble confirming that this would work, however, because I actually can't reproduce the error. The naive way works for me

Re: [Bioc-devel] BiocParallel and AnnotationDbi: database disk image is malformed

2018-01-19 Thread Vincent Carey
good question some of the discussion on http://sqlite.1065341.n5.nabble.com/Parallel-access-to-read-only-in-memory-database-td91814.html seems relevant. converting the relatively small annotation package content to pure R read-only tables on the master before parallelizing might be very
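
The idea of converting annotation content to plain R tables before parallelizing might look like the sketch below. It is an illustration only: `map_probes` is a hypothetical helper, and the lookup table is toy data standing in for one extracted from the annotation package on the master.

```r
library(BiocParallel)

# Hypothetical helper: map each dataset's probe IDs to genes using a plain
# named character vector, so workers never touch a database connection.
map_probes <- function(probe_sets, probe2gene, BPPARAM = SerialParam()) {
  bplapply(probe_sets, function(ids, lookup) unname(lookup[ids]),
           lookup = probe2gene, BPPARAM = BPPARAM)
}

# Toy lookup table standing in for one built once on the master, e.g. with
# AnnotationDbi::mapIds(hgu133a.db, keys, "SYMBOL", "PROBEID").
probe2gene <- c(p1 = "GENE_A", p2 = "GENE_B", p3 = "GENE_C")
res <- map_probes(list(c("p1", "p3"), "p2"), probe2gene)
```

The design choice is that only ordinary R vectors are sent to the workers, so no SQLite connection is shared or copied between processes.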

[Bioc-devel] BiocParallel and AnnotationDbi: database disk image is malformed

2018-01-19 Thread Ludwig Geistlinger
Hi, Within a package I am developing, I would like to enable parallel probe to gene mapping for a compendium of microarray datasets. This accordingly makes use of annotation packages such as hgu133a.db, which in turn connect to the SQLite database via AnnotationDbi. When running in multi-core

Re: [Bioc-devel] BiocParallel: fine-grained progress bar

2017-12-31 Thread Martin Morgan
On 12/30/2017 04:08 PM, Ludwig Geistlinger wrote: Hi, I'm currently playing around with progress bars in BiocParallel - which is a great package! ;-) For demonstration, I'm using the example code from DESeq2::DESeq. library(DESeq2) library(BiocParallel) f <- function(mu) { cnts <-

[Bioc-devel] BiocParallel: fine-grained progress bar

2017-12-30 Thread Ludwig Geistlinger
Hi, I'm currently playing around with progress bars in BiocParallel - which is a great package! ;-) For demonstration, I'm using the example code from DESeq2::DESeq. library(DESeq2) library(BiocParallel) f <- function(mu) { cnts <- matrix(rnbinom(n=1000, mu=mu, size=1/0.5), ncol=10)
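
Progress granularity can be explored with the `tasks` and `progressbar` arguments of the param constructors. The sketch below assumes the bar advances once per completed task, so asking for one task per element should give the finest updates; the toy function is a placeholder for real work.

```r
library(BiocParallel)

X <- 1:10

# One task per element, so the bar can advance after each element
# rather than once per worker-sized chunk.
param <- SnowParam(workers = 2, tasks = length(X), progressbar = TRUE)

res <- bplapply(X, function(i) {
  Sys.sleep(0.1)  # stand-in for a slow computation
  i^2
}, BPPARAM = param)
```

More tasks mean more communication between manager and workers, so this trades some throughput for a smoother bar.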

[Bioc-devel] BiocParallel on macosx: socketConnection failures with MulticoreParam

2017-11-22 Thread Vincent Carey
from example(bplapply), after register(MulticoreParam(2, timeout=5)) bplppl> bplapply(1:10, fun) Error in socketConnection(host, port, TRUE, TRUE, "a+b", timeout = timeout) : cannot open the connection In addition: Warning message: In socketConnection(host, port, TRUE, TRUE, "a+b",

Re: [Bioc-devel] BiocParallel::bpvec() and DNAStringSet objects, problem

2015-10-01 Thread Robert Castelo
rote: Hi Robert, Thanks for reporting the bug. The problem was with how 'X' was split before dispatching to bplapply() and affected both SerialParam and SnowParam. Now fixed in release (1.2.21) and devel (1.3.52). Valerie - Forwarded Message - From: "Robert Castelo"<robert.cast...@up

Re: [Bioc-devel] BiocParallel::bpvec() and DNAStringSet objects, problem

2015-09-04 Thread Robert Castelo
). Valerie - Forwarded Message - From: "Robert Castelo" <robert.cast...@upf.edu> To: bioc-devel@r-project.org Sent: Wednesday, September 2, 2015 8:12:33 AM Subject: [Bioc-devel] BiocParallel::bpvec() and DNAStringSet objects, problem hi, I have encountered a problem when

Re: [Bioc-devel] BiocParallel::bpvec() and DNAStringSet objects, problem

2015-09-04 Thread Obenchain, Valerie
" <robert.cast...@upf.edu> > To: bioc-devel@r-project.org > Sent: Wednesday, September 2, 2015 8:12:33 AM > Subject: [Bioc-devel] BiocParallel::bpvec() and DNAStringSet objects, problem > > hi, > > I have encountered a problem when using the bpvec() function from the &g

[Bioc-devel] BiocParallel::bpvec() and DNAStringSet objects, problem

2015-09-02 Thread Robert Castelo
hi, I have encountered a problem when using the bpvec() function from the BiocParallel package with DNAStringSet objects and the "SerialParam" backend: library(Biostrings) library(BiocParallel) ## all correct when using the multicore backend bpvec(X=DNAStringSet(c("AC", "GT")),

Re: [Bioc-devel] BiocParallel-devel error

2014-11-20 Thread Thomas Girke
Hi Valerie, Excellent. In addition to collecting log outputs, I have a few more suggestions that may be worth considering: - Collecting the results from parallel computing tasks directly in an R object is a great convenience, which I like a lot. However, in the context of slow computations

Re: [Bioc-devel] BiocParallel-devel error

2014-11-20 Thread Vincent Carey
On Thu, Nov 20, 2014 at 12:17 PM, Thomas Girke thomas.gi...@ucr.edu wrote: Hi Valerie, Excellent. In addition to collecting log outputs, I have a few more suggestions that may be worth considering: - Collecting the results from parallel computing tasks directly in an R object is a great

Re: [Bioc-devel] BiocParallel-devel error

2014-11-19 Thread Thomas Girke
Hi Valerie, Michel and others, Finally, I freed up some time to revisit this problem. As it turns out, it is related to the use of a module system on our cluster. If I add in the template file for Torque (torque.tmpl) an explicit module load line for the specific R version I am using on the

Re: [Bioc-devel] BiocParallel-devel error

2014-09-29 Thread Valerie Obenchain
Hi Michel, In BiocParallel 0.99.24 .convertToSimpleError() now checks for NULL and converts to NA_character_. I'm testing with BatchJobs 1.4, BiocParallel 0.99.24 and SLURM. I'm still not getting an informative error message: xx <- bplapply(1:2, FUN) SubmitJobs

Re: [Bioc-devel] BiocParallel-devel error

2014-09-26 Thread Michel Lang
This was a bug in BatchJobs::waitForJobs(). We now throw an error if jobs disappear due to a faulty template file. I'd appreciate it if you could confirm that this is now correctly caught and handled on your system. I furthermore suggest replacing NULL with NA_character_ in .convertToSimpleError().

Re: [Bioc-devel] BiocParallel-devel error

2014-09-23 Thread Valerie Obenchain
Hi, Martin and I looked into this a bit. It looks like a problem with handling an 'undefined error' returned from a worker (i.e., job did not run). When there is a problem executing the tmpl script no error message is sent back. The NULL is coerced to simpleError and becomes a problem

Re: [Bioc-devel] BiocParallel-devel error

2014-09-22 Thread Valerie Obenchain
Hi Thomas, Just wanted to let you know I saw this and am looking into it. Valerie On 09/20/2014 02:54 PM, Thomas Girke wrote: Hi Martin, Micheal and Vincent, If I run the following code, with the release version of BiocParallel then it works (took me some time to actually realize that), but

[Bioc-devel] BiocParallel: flattening iteration

2013-11-14 Thread Michael Lawrence
Hi guys, We often need to iterate over the cartesian product of two dimensions, like sample X chromosome. This is preferable to nested iteration, which is complicated. I've been using expand.grid and bpmapply for this, but it seems like this could be made easier. Like bpmapply could gain a
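
The expand.grid plus bpmapply pattern described here can be sketched as below. The sample and chromosome names and the `job` function are placeholders, and SerialParam() is used only so the example runs anywhere.

```r
library(BiocParallel)

samples <- c("s1", "s2")
chroms <- c("chr1", "chr2", "chr3")

# One row per (sample, chromosome) pair; the first factor varies fastest.
grid <- expand.grid(sample = samples, chrom = chroms,
                    stringsAsFactors = FALSE)

# Hypothetical per-pair job; a real one would do the actual work.
job <- function(sample, chrom) paste(sample, chrom, sep = ":")

res <- bpmapply(job, grid$sample, grid$chrom,
                SIMPLIFY = FALSE, USE.NAMES = FALSE,
                BPPARAM = SerialParam())
```

Each row of the grid becomes one call to `job`, so the nested loop is flattened into a single parallel map over six pairs.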

Re: [Bioc-devel] BiocParallel: flattening iteration

2013-11-14 Thread Vincent Carey
Streamer package has DAGTeam/DAGParam components that I believe are relevant. An abstraction of the reduction plan for a parallelized task would seem to have a natural home in BatchJobs. On Thu, Nov 14, 2013 at 8:15 AM, Michael Lawrence lawrence.mich...@gene.com wrote: Hi guys, We often

Re: [Bioc-devel] BiocParallel: flattening iteration

2013-11-14 Thread Michel Lang
We use a design iterator in BatchExperiments::makeDesign for a cartesian product. I found an old version of designIterator (cf. https://github.com/tudo-r/BatchExperiments/blob/master/R/designs.R) w/o the optional data.frame input which is easier to read: https://gist.github.com/mllg/7469844.

Re: [Bioc-devel] BiocParallel: flattening iteration

2013-11-14 Thread Michael Lawrence
I like the general idea of having iterators; was just checking out the itertools package after not having looked at it for a while. I could see having a BiocIterators package, and a bpiterate(iterator, FUN, ..., BPPARAM). My suggestion was simpler though. Right now, bpmapply runs a single job per

Re: [Bioc-devel] BiocParallel: Best standards for passing locally assigned variables/functions, e.g. a bpExport()?

2013-11-06 Thread Martin Morgan
On 11/04/2013 11:34 AM, Michael Lawrence wrote: The dynamic nature of R limits the extent of these checks. But as Ryan has noted, a simple sanity check goes a long way. If what he has done could be extended to the rest of the search path (people always forget to attach packages), I think we've

Re: [Bioc-devel] BiocParallel: Best standards for passing locally assigned variables/functions, e.g. a bpExport()?

2013-11-05 Thread luke-tierney
The 'foreach' framework does this sort of analysis using codetools at least in part. You may be able to build on what they have. luke On Mon, 4 Nov 2013, Ryan wrote: On 11/4/13, 11:05 AM, Gabriel Becker wrote: As a side note, I'm not sure that existence of a symbol is sufficient (it

Re: [Bioc-devel] BiocParallel: Best standards for passing locally assigned variables/functions, e.g. a bpExport()?

2013-11-04 Thread Ryan
Actually, the check that I proposed is only supposed to check for usage of user-defined variables, not variables from packages. Truthfully, though, I guess I'm not the right person to work on this, since in practice I use forked processes for the vast majority of my inside-R parallelization,

Re: [Bioc-devel] BiocParallel: Best standards for passing locally assigned variables/functions, e.g. a bpExport()?

2013-11-04 Thread Gabriel Becker
Weird, I guess it needs to be logged in or something. I don't know if the issue is that it's in a non-master branch or what. The repo is fully public and the forCRAN_0.3.5 in branch definitely exists on github. I started chrome (where I'm not logged into github) and got the same 404 error but

Re: [Bioc-devel] BiocParallel: Best standards for passing locally assigned variables/functions, e.g. a bpExport()?

2013-11-04 Thread Ryan Thompson
The code that I wrote intentionally avoids checking for package variables, since I consider that a separate problem. Package variables can be provided to the child by loading the package, whereas user-defined variables must be serialized in the parent and sent to the child. I think I could fairly

Re: [Bioc-devel] BiocParallel: Best standards for passing locally assigned variables/functions, e.g. a bpExport()?

2013-11-04 Thread Gabriel Becker
Ryan, I agree that in some sense it is a different problem, but my point is with a different approach we can easily answer both. The code I posted returns a named character vector of symbol names with package name being the name. This makes it a trivial lookup to determine both a) what symbols

Re: [Bioc-devel] BiocParallel: Best standards for passing locally assigned variables/functions, e.g. a bpExport()?

2013-11-04 Thread Ryan
On 11/4/13, 11:05 AM, Gabriel Becker wrote: As a side note, I'm not sure that existence of a symbol is sufficient (it certainly is necessary). What about situations where the symbol exists but is stale compared to the value in the parent? Are we sure that can never happen? I think this is a

[Bioc-devel] BiocParallel: Best standards for passing locally assigned variables/functions, e.g. a bpExport()?

2013-11-03 Thread Henrik Bengtsson
Hi, in BiocParallel, are there suggested (or planned) best standards for making *locally* assigned variables (e.g. functions) available to the applied function when it runs in a separate R process (which will be the most common use case)? I understand that avoiding local variables should be

Re: [Bioc-devel] BiocParallel: Best standards for passing locally assigned variables/functions, e.g. a bpExport()?

2013-11-03 Thread Michael Lawrence
An analog to clusterExport is a good idea. To make it even easier, we could have a dynamic environment based on object tables that would catch missing symbols and download them from the parent thread. But maybe there's some benefit to being explicit? Michael On Sun, Nov 3, 2013 at 12:39 PM,
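
For reference, the explicit export that a BiocParallel analog would mirror looks like this with the base *parallel* package. This sketch shows parallel::clusterExport() itself, not any BiocParallel function; `offset` is a placeholder variable.

```r
library(parallel)

cl <- makeCluster(2)
offset <- 10

# Copy `offset` from this environment into every worker's global environment.
clusterExport(cl, "offset", envir = environment())

res <- parLapply(cl, 1:3, function(x) x + offset)
stopCluster(cl)
```

The export step is explicit and by name, which is the trade-off raised in this message: more typing, but no hidden transfer of objects to the workers.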

Re: [Bioc-devel] BiocParallel: Best standards for passing locally assigned variables/functions, e.g. a bpExport()?

2013-11-03 Thread Henrik Bengtsson
On Sun, Nov 3, 2013 at 1:29 PM, Michael Lawrence lawrence.mich...@gene.com wrote: An analog to clusterExport is a good idea. To make it even easier, we could have a dynamic environment based on object tables that would catch missing symbols and download them from the parent thread. But maybe

Re: [Bioc-devel] BiocParallel: Best standards for passing locally assigned variables/functions, e.g. a bpExport()?

2013-11-03 Thread Ryan
Here's an easy thing we can add to BiocParallel in the short term. The following code defines a wrapper function withBPExtraErrorText that simply appends an additional message to the end of any error that looks like it is about a missing variable. We could wrap every evaluation in a similar
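
A wrapper of the kind described might be sketched as follows. This is not the code posted in the thread; `with_missing_variable_hint` is a hypothetical name, and the pattern match assumes R's English "object '...' not found" message, so it would not fire in other locales.

```r
# Hypothetical wrapper: evaluate `expr` and, if it fails with what looks
# like a missing-variable error, re-raise it with an extra hint appended.
with_missing_variable_hint <- function(expr) {
  tryCatch(expr, error = function(e) {
    msg <- conditionMessage(e)
    if (grepl("object '.*' not found", msg)) {
      msg <- paste0(
        msg,
        "\nHint: the worker may not see variables from your workspace; ",
        "pass them to the function as an explicit argument."
      )
    }
    stop(msg, call. = FALSE)
  })
}

# Expressions that succeed are returned unchanged.
ok <- with_missing_variable_hint(1 + 1)
```

Errors that do not match the pattern are re-raised with their original message, so the wrapper only adds text and never hides a failure.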

Re: [Bioc-devel] BiocParallel: Best standards for passing locally assigned variables/functions, e.g. a bpExport()?

2013-11-03 Thread Ryan
Another potential easy step we can do is that if FUN is a function in the user's workspace, we automatically export that function under the same name in the children. This would make recursive functions just work, but it might be a bit too magical. On 11/3/13, 2:38 PM, Ryan wrote: Here's an easy

Re: [Bioc-devel] BiocParallel: Best standards for passing locally assigned variables/functions, e.g. a bpExport()?

2013-11-03 Thread Gabriel Becker
Henrik, See https://github.com/duncantl/CodeDepends (as used by https://github.com/gmbecker/RCacheSuite). It will identify necessarily defined symbols (input variables) for code that is not doing certain tricks (e.g. get(), mixing data.frame columns and global variables in formulas, etc.).

Re: [Bioc-devel] BiocParallel: Best standards for passing locally assigned variables/functions, e.g. a bpExport()?

2013-11-03 Thread Ryan
I guess all we need to do is to detect whether a function would try to access a free variable in the user's workspace, and warn/error if so. It looks like CodeDepends could do that. I could try to come up with an implementation. I guess we would add CodeDepends as an optional dependency for

Re: [Bioc-devel] BiocParallel: Best standards for passing locally assigned variables/functions, e.g. a bpExport()?

2013-11-03 Thread Gabriel Becker
Ryan (et al), FYI: f function() { x = rnorm(x) x } findGlobals(f) [1] = { rnorm x should be in the list of globals but it isn't. ~G sessionInfo() R version 3.0.2 (2013-09-25) Platform: x86_64-pc-linux-gnu (64-bit) locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C [3]

Re: [Bioc-devel] BiocParallel: Best standards for passing locally assigned variables/functions, e.g. a bpExport()?

2013-11-03 Thread Ryan
Ok, here is my attempt at a function to get the list of user-defined free variables that a function refers to: https://gist.github.com/DarwinAwardWinner/7298557 It uses codetools, so it is subject to the limitations of that package, but for simple examples, it successfully detects when a
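
The general approach can be illustrated with codetools::findGlobals(). The sketch below is not the code from the gist; `user_globals` is a hypothetical helper that keeps only the free variables defined directly in a given environment, and it inherits the limitations of codetools mentioned in this thread.

```r
# Hypothetical helper: names of free variables used by `fun` that are
# defined directly in `envir` (e.g. the user's workspace).
user_globals <- function(fun, envir = globalenv()) {
  vars <- codetools::findGlobals(fun, merge = FALSE)$variables
  defined <- vapply(vars, exists, logical(1), envir = envir, inherits = FALSE)
  vars[defined]
}

# Toy workspace with one user-defined variable.
workspace <- new.env()
assign("offset", 10, envir = workspace)

f <- function(x) x + offset
found <- user_globals(f, envir = workspace)
```

Here `found` contains "offset", which is the kind of variable that would need to be exported or passed as an argument before `f` runs in a separate process.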

Re: [Bioc-devel] BiocParallel status

2013-09-03 Thread Martin Morgan
On 09/03/2013 05:25 AM, Hahne, Florian wrote: Hi List, Martin, I just wanted to quickly ask about the status of the BiocParallel package and the cluster support in particular. Is this project finished? And are there plans to having BiocParallel as a proper package again, or will it remain a GIT

Re: [Bioc-devel] BiocParallel: BatchJobs backend (Was: Re: BiocParallel)

2013-06-27 Thread Michel Lang
Hi Florian, Yes you're absolutely right. The fork currently depends on some functions which are not yet included in the CRAN build. For now you can get the latest development version on http://batchjobs.googlecode.com. We'll upload a new version of BatchJobs soon. I've documented this as an issue

Re: [Bioc-devel] BiocParallel: BatchJobs backend (Was: Re: BiocParallel)

2013-06-06 Thread Dan Tenenbaum
On Thu, Jun 6, 2013 at 1:56 PM, Henrik Bengtsson h...@biostat.ucsf.edu wrote: Hi, I'd like to pick up the discussion on a BatchJobs backend for BiocParallel where it was left back in Dec 2012 (Bioc-devel thread 'BiocParallel'

Re: [Bioc-devel] BiocParallel: BatchJobs backend (Was: Re: BiocParallel)

2013-06-06 Thread Michael Lawrence
And here is the on-going development of the backend: https://github.com/mllg/BiocParallel/tree/batchjobs Not sure how well it's been tested. Kudos to Michel Lang for making so much progress so quickly. Michael On Thu, Jun 6, 2013 at 1:59 PM, Dan Tenenbaum dtene...@fhcrc.org wrote: On Thu,

Re: [Bioc-devel] BiocParallel -- update

2012-12-04 Thread Ryan C. Thompson
On Tue 04 Dec 2012 11:31:59 AM PST, Michael Lawrence wrote: The name pvec is not very intuitive. What about bpchunk? And since the function passed to bpvectorize is already vectorized, maybe bpvectorize should be bparallelize? I know everyone has different intuitions/preferences when it comes to

Re: [Bioc-devel] BiocParallel

2012-11-17 Thread Ryan C. Thompson
In reply to: On 11/16/2012 09:45 PM, Steve Lianoglou wrote: But then you have the situation of multi-machines w/ multiple cores -- is this (2) or (3) here? How do you explicitly write code for that w/ foreach mojo? I guess the answer to that is that you let your grid engine (or whatever your

Re: [Bioc-devel] BiocParallel

2012-11-16 Thread Michael Lawrence
This sounds very useful when mixing batch jobs with an interactive session. In fact, it's something I was planning to do, since I noticed their execution model is completely asynchronous. Is it actually a new cluster backend for the parallel package? Michael On Fri, Nov 16, 2012 at 12:18 AM,

Re: [Bioc-devel] BiocParallel

2012-11-16 Thread Michael Lawrence
I'm not sure I understand the appeal of foreach. Why not do this within the functional paradigm, i.e, parLapply? Michael On Fri, Nov 16, 2012 at 9:41 AM, Ryan C. Thompson r...@thompsonclan.org wrote: You could write a %dopar% backend for the foreach package, which would allow any code using

Re: [Bioc-devel] BiocParallel

2012-11-16 Thread Ryan C. Thompson
To be more specific, instead of: library(parallel) cl <- ... # Make a cluster parLapply(cl, X, fun, ...) you can do: library(parallel) library(doParallel) library(plyr) cl <- ... registerDoParallel(cl) llply(X, fun, ..., .parallel=TRUE) On Fri 16 Nov 2012 11:44:06 AM PST, Ryan C. Thompson

Re: [Bioc-devel] BiocParallel

2012-11-16 Thread Michael Lawrence
On Fri, Nov 16, 2012 at 11:44 AM, Ryan C. Thompson r...@thompsonclan.org wrote: You don't have to use foreach directly. I use foreach almost exclusively through the plyr package, which uses foreach internally to implement parallelism. Like you, I'm not particularly fond of the foreach syntax

Re: [Bioc-devel] BiocParallel

2012-11-15 Thread Martin Morgan
On 11/15/2012 6:21 AM, Kasper Daniel Hansen wrote: I'll second Ryan's patch (at least in principle). When I parallelize across multiple cores, I have always found mc.preschedule to be an important option to expose (that, and the number of cores, is all I use routinely). Yes, Ryan provided a

Re: [Bioc-devel] BiocParallel

2012-11-15 Thread Tim Triche, Jr.
Personally, having used memcached in the past for distributed shared memory caching, I am most interested in 3) and doRedis. Many cluster/batch processing systems are a colossal PITA, and a worker queue would go a long way towards fixing that. Less checkpointing, more results... I hope. As an

Re: [Bioc-devel] BiocParallel

2012-11-15 Thread Vincent Carey
should approaches to fault-tolerance/recovery/debugging be a topic here? On Thu, Nov 15, 2012 at 1:53 PM, Henrik Bengtsson h...@biostat.ucsf.edu wrote: Is there any write up/discussion/plans on the various types of parallel computations out there: (1) one machine / multi-core/multi-threaded

Re: [Bioc-devel] BiocParallel

2012-11-15 Thread Michael Lawrence
On Thu, Nov 15, 2012 at 11:00 AM, Martin Morgan mtmor...@fhcrc.org wrote: On 11/15/2012 10:53 AM, Henrik Bengtsson wrote: Is there any write up/discussion/plans on the various types of parallel computations out there: (1) one machine / multi-core/multi-threaded (2) multiple machines /

Re: [Bioc-devel] BiocParallel

2012-11-14 Thread Michael Lawrence
On Wed, Nov 14, 2012 at 12:23 PM, Martin Morgan mtmor...@fhcrc.org wrote: Interested developers -- I added the start of a BiocParallel package to the Bioconductor subversion repository and build system. The package is mirrored on github to allow for social coding; I encourage people to

Re: [Bioc-devel] BiocParallel

2012-11-14 Thread Martin Morgan
On 11/14/2012 03:43 PM, Ryan C. Thompson wrote: Here are two alternative implementations of pvec. pvec2 is just a simple rewrite of pvec to use mclapply. pvec3 then extends pvec2 to accept a specified chunk size or a specified number of chunks. If the number of chunks exceeds the number of
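
The chunking idea behind these pvec variants can be sketched with mclapply(). This is an illustration, not the pvec2/pvec3 code from the message; `pvec_chunked` is a hypothetical name, and `cores` defaults to 1 so the example also runs where forking is unavailable.

```r
library(parallel)

# Hypothetical pvec-like helper: split `v` into `n_chunks` contiguous
# pieces, apply the vectorized `FUN` to each piece, and concatenate.
pvec_chunked <- function(v, FUN, n_chunks = 2L, cores = 1L) {
  idx <- splitIndices(length(v), n_chunks)
  parts <- mclapply(idx, function(i) FUN(v[i]), mc.cores = cores)
  do.call(c, parts)
}

res <- pvec_chunked(1:10, sqrt, n_chunks = 3L)
```

Asking for more chunks than cores gives the scheduler smaller units to hand out, which is one way to approach load balancing when chunk run times differ.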

Re: [Bioc-devel] BiocParallel

2012-11-14 Thread Ryan C. Thompson
I just submitted a pull request. I'll add tests shortly if I can figure out how to write them. On Wed 14 Nov 2012 03:50:36 PM PST, Martin Morgan wrote: On 11/14/2012 03:43 PM, Ryan C. Thompson wrote: Here are two alternative implementations of pvec. pvec2 is just a simple rewrite of pvec to