On Wed, Feb 15, 2012 at 5:08 PM, Dannon Baker <dannonba...@me.com> wrote:
> It's definitely an experimental feature at this point, and there's no wiki,
> but basic support for breaking jobs into tasks does exist.  It needs a lot
> more work and can go in a few different directions to make it better,

Not what I was hoping to hear, but a promising start :)

> but check out the wrappers with <parallelism> defined, and enable
> use_tasked_jobs in your universe_wsgi.ini and restart.  That's all it
> should take from a fresh galaxy install to get, iirc, at least BWA and
> a few other tools working.  If you want a super trivial example to play
> with, change the tool .xml for text tool like "change case" to have
> <parallelism method="basic"></parallelism> and give that a shot.

Excellent - that saved me searching blindly.

$ cd tools
$ grep parallelism */*.xml
samtools/sam_bitwise_flag_filter.xml:  <parallelism method="basic"></parallelism>
sr_mapping/bowtie_wrapper.xml:  <parallelism method="basic"></parallelism>
sr_mapping/bwa_color_wrapper.xml:  <parallelism method="basic"></parallelism>
sr_mapping/bwa_wrapper.xml:  <parallelism method="basic"></parallelism>

Are those four tools being used on Galaxy Main already with
this basic parallelism in place?

Looking at the code in lib/galaxy/jobs/splitters/basic.py, its
comments suggest it only works on tools with one input and
one output file (although that seems a bit fuzzy, as you could
be using BWA with a FASTA history item as the reference -
would that fail?).

I also see interesting things in lib/galaxy/jobs/splitters/multi.py.
Is that even more experimental? It looks like it could be used
to say that BWA's read file should be split, but the reference
file shared.
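
To make the "split the reads, share the reference" idea concrete,
here is a minimal sketch of what I have in mind. This is not the
code in multi.py (I have not checked how it works); the function
names, the dictionary layout and the assumption that every FASTQ
record is exactly four lines are all mine, purely for illustration.

```python
def split_records(lines, lines_per_record, records_per_chunk):
    """Group lines into chunks without cutting through a record."""
    step = lines_per_record * records_per_chunk
    return [lines[i:i + step] for i in range(0, len(lines), step)]


def plan_tasks(read_lines, reference_path, records_per_chunk=2):
    """Pair each chunk of reads with the same (unsplit) reference.

    Assumes four-line FASTQ records; both names are hypothetical.
    """
    chunks = split_records(read_lines, 4, records_per_chunk)
    return [{"reads": chunk, "reference": reference_path}
            for chunk in chunks]
```

Each task would then run the tool on its own chunk of reads, with
every task pointing at the one shared reference file.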

Regarding the merging of the output, I see there is a default merge
method in lib/galaxy/datatypes/data.py which just concatenates
the files. I am surprised at that - it seems like a very bad idea in
general - consider many binary files, or XML. Why not make this
the default only for text and subclasses thereof?

There is also one example where the merge method gets
overridden, in lib/galaxy/datatypes/tabular.py, which avoids the
repetition of any headers when merging SAM files.

That should be enough clues to implement other customized
merge code for other datatypes.
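
For the record, the difference between the two merge behaviours
can be sketched roughly as below. This is my own illustration and
not the code from data.py or tabular.py: the function names are
hypothetical, the parts are plain strings rather than files, and
I am assuming the SAM header is the leading block of lines that
start with "@".

```python
def merge_concatenate(parts):
    """Naive merge: join the parts end to end."""
    return "".join(parts)


def merge_sam(parts):
    """Keep the header from the first part only, then all records."""
    merged = []
    for index, part in enumerate(parts):
        in_header = True
        for line in part.splitlines(keepends=True):
            if in_header and line.startswith("@"):
                # Header line: keep it only for the first part.
                if index == 0:
                    merged.append(line)
                continue
            in_header = False
            merged.append(line)
    return "".join(merged)
```

Plain concatenation repeats the header once per part, which is
exactly the problem for SAM. For binary or XML formats a
datatype specific merge would be needed in the same way.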

> If you decide to try this out, do keep in mind that this feature is not
> at all complete and while there's a long list of things we still want
> to experiment with along these lines suggestions (and especially
> contributions) are absolutely welcome.

OK then, I hope to have a play with this shortly.

Thanks,

Peter
