[Bioc-devel] Size limits of packages when including other executables

2013-06-06 Thread Thomas Dybdal Pedersen
Hi

I'm developing a wrapper for the peptide-identification tool MS GF+. The 
algorithm is implemented in Java and the .jar file is around 20 MB in size.

For the user's convenience, I think it would make sense to bundle the Java code 
with the wrapper (this has been cleared with the MS GF+ developer); it 
would have the added benefit of guaranteeing version compatibility. On the 
other hand, this exceeds the size limits imposed in the guidelines several 
times over.

What is the opinion on this? Are the size limits set in stone, or can they be 
lifted for certain cases besides annotation packages?

best

Thomas Pedersen, PhD student at DTU, Denmark
___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


Re: [Bioc-devel] BiocParallel: BatchJobs backend (Was: Re: BiocParallel)

2013-06-06 Thread Henrik Bengtsson
Hi, I'd like to pick up the discussion on a BatchJobs backend for
BiocParallel where it left off back in Dec 2012 (Bioc-devel thread
'BiocParallel'
[https://stat.ethz.ch/pipermail/bioc-devel/2012-December/003918.html]).

Florian, would you mind sharing your BatchJobs backend code?  Is it
independent of BiocParallel and/or have you tried it with the most
recent BiocParallel implementation
[https://github.com/Bioconductor/BiocParallel/]?

/Henrik

On Tue, Dec 4, 2012 at 12:38 PM, Henrik Bengtsson wrote:
> Thanks.
>
> On Tue, Dec 4, 2012 at 3:47 AM, Vincent Carey wrote:
>> I have been booked up so no chance to deploy but I do have access to SGE and
>> LSF so will try and will report ASAP.
>
> ...and I'll try it out on PBS (... but I most likely won't have time
> to do this until the end of the year).
>
> Henrik
>
>>
>>
>> On Tue, Dec 4, 2012 at 4:08 AM, Hahne, Florian wrote:
>>>
>>> Hi Henrik,
>>> I have now come up with a relatively generic version of this
>>> SGEcluster approach. It does indeed use BatchJobs under the hood and
>>> should thus support all available cluster queues, assuming that the
>>> necessary BatchJobs routines are available. I could only test this on our
>>> SGE cluster, but Vince wanted to try other queuing systems. Not sure how
>>> far he got. For now the code is wrapped in a little package called
>>> Qcluster with some documentation. If you want, I can send you a version
>>> in a separate mail. It would be good to test this on other systems, and I
>>> am sure there remain some bugs that need to be ironed out. In particular
>>> the fault tolerance you mentioned needs to be addressed properly. Currently
>>> the code may leave unwanted garbage if things fail in the wrong places
>>> because all the communication is file-based.
>>> Martin, I'll send you my updated version in case you want to include this
>>> in BiocParallel for others to contribute.
>>> Florian
>>> --
>>>
>>>
>>>
>>>
>>>
>>>
>>> On 12/4/12 5:46 AM, "Henrik Bengtsson"  wrote:
>>>
>>> >Picking up this thread for lack of another place (= where should
>>> >BiocParallel be discussed?)
>>> >
>>> >I saw Martin's updates on BiocParallel - great.  Florian's SGE
>>> >scheduler was also mentioned; is that one built on top of BatchJobs?
>>> >If so I'd be interested in looking into that/generalizing that to work
>>> >with any BatchJobs scheduler.
>>> >
>>> >I believe there is going to be a new release of BatchJobs rather soon,
>>> >so it's probably worth waiting until that is available.
>>> >
>>> >The main use case I'm interested in is to launch batch jobs on a
>>> >PBS/Torque cluster, and then use multicore processing on each compute
>>> >node.  It would be nice to be able to do this using the BiocParallel
>>> >model, but maybe it is too optimistic to get everything to work under
>>> >the same model.  Also, as Vince hinted, fault tolerance etc. needs to be
>>> >addressed, and addressed differently in the different setups.
>>> >
>>> >/Henrik
>>> >
>>> >On Tue, Nov 20, 2012 at 6:59 AM, Ramon Diaz-Uriarte wrote:
>>> >>
>>> >>
>>> >>
>>> >> On Sat, 17 Nov 2012 13:05:29 -0800, "Ryan C. Thompson" wrote:
>>> >>
>>> >>> On 11/17/2012 02:39 AM, Ramon Diaz-Uriarte wrote:
>>> >>> > In addition to Steve's comment, is it really a good thing that "all
>>> >>> > code stays the same."?  I mean, multiple machines vs. multiple cores
>>> >>> > are, often, _very_ different things: for instance, shared vs.
>>> >>> > distributed memory, communication overhead differences, whether or
>>> >>> > not you can assume packages and objects to be automagically present
>>> >>> > in the slaves/child process, etc. So, given they are different
>>> >>> > situations, I think it sometimes makes sense to want to write
>>> >>> > different code for each situation (I often do); not to mention
>>> >>> > Steve's hybrid cases ;-).
>>> >>> >
>>> >>> > Since BiocParallel seems to be a major undertaking, maybe it would
>>> >>> > be appropriate to provide a flexible approach, instead of hard
>>> >>> > wiring the foreach approach.
>>> >>>
>>> >>> Of course there are cases where the same code simply can't work for
>>> >>> both multicore and multi-machine situations, but those generally
>>> >>> don't fall into the category of things that can be done using lapply.
>>> >>> Lapply and all of its parallelized buddies like mclapply, parLapply,
>>> >>> and foreach are designed for data-parallel operations with no
>>> >>> interdependence between results, and these kinds of operations
>>> >>> generally parallelize as well across machines as across cores, unless
>>> >>> your network is not fast enough (in which case you would choose not
>>> >>> to use multi-machine parallelism). If you want a parallel algorithm
>>> >>> for something like the disjoin method of GRanges, you might need to
>>> >>> write some special purpose code, and that code might be very
>>> >>> different for multicore vs multi-machine.
>>> >
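
Ryan's point in the quoted discussion, that lapply-style code is data-parallel and therefore largely indifferent to where it runs, can be sketched roughly as follows. This is only an illustration, written in Python rather than R: `gc_content`, `apply_serial`, and `apply_parallel` are hypothetical names, and a thread pool merely stands in for the multicore or multi-machine backend that mclapply or parLapply would provide in R.

```python
from concurrent.futures import ThreadPoolExecutor


def gc_content(seq):
    """Fraction of G/C bases in one sequence; uses nothing but its own input."""
    return sum(base in "GC" for base in seq) / len(seq)


def apply_serial(func, items):
    """Plain lapply-style map: one result per item, computed in order."""
    return [func(item) for item in items]


def apply_parallel(func, items, workers=2):
    """Same contract as apply_serial, but the calls are spread over a pool.

    Assumption: a thread pool is used only as a stand-in backend here.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in input order, like the serial version.
        return list(pool.map(func, items))


seqs = ["GATTACA", "GGCC", "ATAT"]
serial = apply_serial(gc_content, seqs)
parallel = apply_parallel(gc_content, seqs)
```

Because no call depends on another call's result, the caller's code is the same in both cases and only the backend differs. An operation whose results interact, such as the disjoin example mentioned above, would not fit this pattern.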

Re: [Bioc-devel] BiocParallel: BatchJobs backend (Was: Re: BiocParallel)

2013-06-06 Thread Dan Tenenbaum
On Thu, Jun 6, 2013 at 1:56 PM, Henrik Bengtsson  wrote:
> Hi, I'd like to pick up the discussion on a BatchJobs backend for
> BiocParallel where it was left back in Dec 2012 (Bioc-devel thread
> 'BiocParallel' 
> [https://stat.ethz.ch/pipermail/bioc-devel/2012-December/003918.html]).
>
> Florian, would you mind sharing your BatchJobs backend code?  Is it
> independent of BiocParallel and/or have you tried it with the most
> recent BiocParallel implementation
> [https://github.com/Bioconductor/BiocParallel/]?
>

You should be aware that there is a Google Summer of Code project in
progress to address this.

http://www.bioconductor.org/developers/gsoc2013/ (towards the bottom)

Dan



Re: [Bioc-devel] BiocParallel: BatchJobs backend (Was: Re: BiocParallel)

2013-06-06 Thread Michael Lawrence
And here is the ongoing development of the backend:
https://github.com/mllg/BiocParallel/tree/batchjobs

Not sure how well it's been tested.

Kudos to Michel Lang for making so much progress so quickly.

Michael


Re: [Bioc-devel] BiocParallel: BatchJobs backend (Was: Re: BiocParallel)

2013-06-06 Thread Henrik Bengtsson
Great - this looks promising already.

What are your test systems, beyond standard SSH and multicore
clusters?  I'm on a Torque/PBS system.

I'm happy to test, give feedback etc.  I don't see an 'Issues' tab on
the GitHub page.  Michel, how do you prefer to get feedback?

/Henrik



Re: [Bioc-devel] BiocParallel: BatchJobs backend (Was: Re: BiocParallel)

2013-06-06 Thread Michael Lawrence
We're regularly running BatchJobs itself on an LSF cluster. Works great.

