Re: What do you use GNU Parallel for?

Matt Oates (Home) Wed, 22 Aug 2012 01:28:01 -0700

Hi Ole,

On 22 August 2012 07:08, Ole Tange <[email protected]> wrote:
> So please write a few lines about the tasks you use it for -
> especially if you have reason to believe you are one of the few doing
> that kind of thing. If you want to be anonymous you can write me
> directly, but otherwise use the mailing list.


Good luck with the talk!

I use parallel to parallelise the external loop of most bioinformatics
software, especially HMMER3. Many pieces of software have no built-in
parallelisation, so given a long list of inputs they work through it
serially. I work with quite large datasets: 1,765 genomes, each with
1-10 thousand protein sequences. With five 24-core desktops I can really
cut back how long something takes. We even have an internal script that
bridges parallel with the EC2 compute cloud, so if I need to do
something extra big I just go wider and hand the list of EC2 machine
names to parallel.
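
For readers unfamiliar with the idea, parallelising the external loop
can be sketched in a few lines of Python. This is only an illustration
of the pattern, not the setup described above: the file names are
hypothetical, and a stand-in command that echoes its argument takes the
place of a real tool such as HMMER3.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_one(cmd):
    """Run one external command; return (command, exit code, stdout)."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    return cmd, result.returncode, result.stdout


def run_all(inputs, make_cmd, jobs=4):
    """Run make_cmd(item) for every input, at most `jobs` at a time.

    Results come back in the same order as `inputs`.
    """
    commands = [make_cmd(item) for item in inputs]
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        return list(pool.map(run_one, commands))


if __name__ == "__main__":
    # Hypothetical input files; replace with a real list of genomes.
    genomes = ["genome_a.fa", "genome_b.fa", "genome_c.fa"]

    # Stand-in for a serial tool: a child Python process that prints
    # the file name it was given.
    def stand_in(path):
        return [sys.executable, "-c", "import sys; print(sys.argv[1])", path]

    for cmd, code, out in run_all(genomes, stand_in, jobs=2):
        print(code, out.strip())
```

Threads are enough here because each worker only waits on a child
process; the heavy lifting happens in the external program.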

More day to day, I frequently use parallel to transform large files
(hundreds of gigabytes per file) between text-based file formats, so
parallel perl/sed. I also use the --pipe feature a lot to split files:
something like the FASTA format is splittable with parallel, and I can
pipe the data straight into another program.
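
What makes FASTA splittable is that every record starts with a '>'
header line, so a stream can be cut into blocks at record boundaries
(in GNU Parallel this is typically done with --pipe together with
--recstart '>'). A rough Python sketch of that record-aware splitting
follows; the function name and the default block size are assumptions
made for illustration, and this is not how parallel itself is
implemented.

```python
def fasta_chunks(lines, block_size=1_000_000):
    """Yield strings holding whole FASTA records.

    A chunk is closed only when a new '>' header arrives and the chunk
    has already reached block_size characters, so no record is ever
    split across two chunks.
    """
    chunk, size = [], 0
    for line in lines:
        if line.startswith(">") and size >= block_size:
            yield "".join(chunk)
            chunk, size = [], 0
        chunk.append(line)
        size += len(line)
    if chunk:
        # Emit whatever is left at end of input.
        yield "".join(chunk)
```

Each chunk can then be handed to a separate worker or written to the
standard input of another program, for example by iterating over
fasta_chunks(open(path)) for some FASTA file path.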

I think you would do well to publish a short paper somewhere in the
bioinformatics field about the speed-ups you can get using parallel
with older non-parallel software.

Best,
Matt.

---
http://www.mattoates.co.uk
http://bccs.bris.ac.uk
