Giuseppe, I was referring to both of you. My apologies for not being clear; I
had my head stuck in Perl while writing the first email.

My suggestion to both of you is the same: do not use parallel for your
respective use cases.

Giuseppe,

You should use an extra script. Your problem is that you are timing out
while trying to submit all those jobs. The timeout happens because of the
sheer number of jobs you are submitting: LSF cannot write the job
descriptions to disk fast enough, times out because the action does not
complete, and then stays in that state.
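
As a rough sketch of that "extra script" idea (the chunk naming, database
name and output names below are placeholders; the blastp options and the
queue are taken from your own script): split the fasta into one-record
chunks first, then write one bsub line per chunk and submit that script
once.

#!/bin/bash
# Split the input into numbered one-record chunks (fasta records start with ">").
awk '/^>/ { if (out) close(out); out = sprintf("chunk_%d.fasta", ++n) } { print > out }' sequences.fasta

# Write one bsub line per chunk, then hand the whole batch to LSF in one go.
for f in chunk_*.fasta
do
    echo "bsub -q large blastp -evalue 1e-05 -outfmt 6 -db my_db -query $f -out $f.out"
done > submit_all.sh

bsub < submit_all.sh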

----------------
Martin,

You could use parallel to submit jobs, but it is a very bad idea, due to the
limitations of the software (see my earlier mails quoted below). Use batch
scripts and job arrays when possible; a sketch follows.
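
For instance, a job array with a slot limit already gives you the throttling
you were getting from parallel -j 100. This is only a sketch: "my_jobs" is a
placeholder name, smalljobs is the example queue from my earlier mail, and
my_script is the wrapper from your own mail quoted below.

# 2000 array elements, at most 100 running at any time. LSF sets
# $LSB_JOBINDEX to the element index on the execution host, and %J/%I
# in the output file name expand to the job ID and the index.
bsub -J "my_jobs[1-2000]%100" -q smalljobs -o "%J_%I.out" "my_script \$LSB_JOBINDEX"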

----------------

So, given my suggestion above, I think our discussion is off-topic for this
list. We could continue here, if Ole and the list put up with us, but I think
we should take this to private email or move it to the Debian Med mailing
list ( https://en.wikipedia.org/wiki/Debian-Med ).

Let me know which option is better for you.

As for Martin: as I said above, he should not use parallel for submitting
jobs to LSF.


Ciao,

George

On Wed, Apr 15, 2015 at 10:50 PM, Giuseppe Aprea <[email protected]>
wrote:

> Hi George!
>
> I am not sure who you are talking to, Martin or me? Let me remind you that
> the original topic is about using blast under parallel with LSF.
> Martin's problem sounds like something off-topic.
>
> You have both sysadmin and bioinformatics experience so I would really
> appreciate your help!
>
> I am working on a cluster, so I must use LSF to get slots, and I would
> prefer to use parallel as well, since it splits the input automatically
> with --recstart (which is quite nice :D otherwise I would have to use
> another script for that). I see I could do better with the chunk size (I
> have 1 record at a time in my example), but that is a secondary problem
> now. First I have the "lsb_launch(): Failed while waiting for tasks to
> finish." issue to solve.
>
> cheers,
>
> g
>
>
>
>
> On Wed, Apr 15, 2015 at 7:44 PM, George Marselis <[email protected]> wrote:
>
>> By the way, LSF and GNU parallel do almost the same thing, so using one
>> of the two defeats the purpose of using the other.
>>
>> In the same way, you could have used LSF to submit your jobs to LSF:
>>
>> bsub < script.sh
>>
>> where script.sh was
>>
>> bsub -J amoeba -q smalljobs  qfasta file1
>> bsub -J amoeba -q smalljobs  qfasta file2
>> ...
>> bsub -J amoeba -q smalljobs  qfasta file2000
>>
>> On Wed, Apr 15, 2015 at 8:39 PM, George Marselis <[email protected]>
>> wrote:
>>
>>> Hi. LSF/Openlava sysadmin in bioinformatics and parallel user here.
>>>
>>> I have seen this a couple of times before: you are trying to use GNU
>>> parallel to submit the jobs to all nodes.
>>>
>>> That's not the way to do things: you should not submit jobs on *all*
>>> your nodes. Please don't do that, as bsub was not designed to take in
>>> large batches of jobs. bsub writes the job descriptions to your home
>>> directory, so if your storage is not designed for a lot of writes, you
>>> are going to blow the cluster out of the water.
>>>
>>> What you want to do is look up either:
>>>
>>> 1. bsub scripts
>>> https://rc.fas.harvard.edu/resources/documentation/legacy-lsf/lsf-submit-an-lsf-job/
>>>
>>> or
>>>
>>> 2. job arrays
>>> https://rc.fas.harvard.edu/resources/documentation/legacy-lsf/lsf-submitting-lots-of-short-jobs-job-arrays/
>>>
>>> Both bsub scripts and job arrays are useful to you. bsub scripts can be
>>> submitted as part of a pipeline: you can generate the bsub script as the
>>> output of your pipeline and then submit it to bsub. So, instead of
>>> submitting your job 2000 times, as in
>>>
>>> bsub job0
>>> bsub job1
>>>
>>> ....
>>>
>>> bsub job1999
>>>
>>> you just submit "bsub < scriptname", where scriptname contains 2000 lines
>>> describing your jobs, and you are done. The rest is handled by bsub/LSF.
>>>
>>>
>>> Now, if your jobs are similar enough that you just increment a counter
>>> (as in most bioinformatics jobs), use job arrays.
>>>
>>> bsub -J JOBNAME[0-1999], where JOBNAME is a string you would like to
>>> name your job, e.g. "fasta files alignment"
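>>>
>>> A minimal one-liner sketch (the job name "fasta_align" is a placeholder,
>>> and smalljobs/qfasta stand for your own queue and command): LSF expands
>>> $LSB_JOBINDEX to the element index on the execution host, so a single
>>> bsub call replaces all 2000 separate submissions.
>>>
>>> bsub -J "fasta_align[1-2000]" -q smalljobs "qfasta file\$LSB_JOBINDEX"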
>>>
>>>
>>> These techniques are useful because you can submit all 2000 jobs in less
>>> than a second, you can do it from a single node and you will not have to
>>> deal with a grumpy sysadmin or grumpy colleagues who cannot use the
>>> cluster. Just make sure you use the appropriate queue.
>>>
>>> Let me know if you have any questions.
>>>
>>> Best Regards,
>>>
>>> George Marselis
>>>
>>> On Wed, Apr 15, 2015 at 6:48 PM, Martin d'Anjou <
>>> [email protected]> wrote:
>>>
>>>>  Hi,
>>>>
>>>> Thanks for clarifying. I want to use GNU Parallel to bsub jobs. This
>>>> way I can use GNU Parallel to throttle the number of jobs that are
>>>> submitted to LSF, and it is easier than writing a loop.
>>>>
>>>> parallel -j 100 my_script [bsub options] ::: {1..2000}
>>>>
>>>> my_script (pseudo-code):
>>>> #!/bin/bash
>>>> ...
>>>> bsub [bsub options] command ...
>>>> post-process data
>>>>
>>>> This way I can submit jobs, say 100 at a time. When I submit all 2000
>>>> jobs at once, it gets problematic and I start hitting limits on file
>>>> descriptors, etc.
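>>>>
>>>> For what it's worth, here is a minimal sketch of what my_script could
>>>> look like (my_command, post_process and the file names are placeholders,
>>>> not my real ones): bsub -K makes the submission block until the job
>>>> completes, so the post-processing only runs once LSF is done with it.
>>>>
>>>> #!/bin/bash
>>>> # $1 is the index handed in by parallel (1..2000).
>>>> # -K submits the job and waits for it to finish before returning.
>>>> bsub -K -q queue_name -oo "out_$1.log" my_command "input_$1"
>>>> post_process "input_$1"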
>>>>
>>>> Thanks for sharing,
>>>> Martin
>>>>
>>>>
>>>> On 15-04-15 11:35 AM, Giuseppe Aprea wrote:
>>>>
>>>> Hi Martin,
>>>>
>>>>  I am not sure I understand. As far as I can see, things work exactly
>>>> the opposite way: you have an LSF script which launches GNU Parallel on
>>>> some hosts provided by LSF. Something like:
>>>>
>>>>
>>>> -------------------------------------------------------------------------------
>>>>
>>>> -------------------------------------------------------------------------------
>>>> #!/bin/bash
>>>>
>>>>  #BSUB -J gnuParallel_blast_test      # Name of the job.
>>>> #BSUB -o %J.out                      # Appends std output to file %J.out (%J is the job ID).
>>>> #BSUB -e %J.err                      # Appends std error to file %J.err.
>>>> #BSUB -q large                       # Queue name.
>>>> #BSUB -n 30                          # Number of CPUs.
>>>>
>>>>  module load 4.8.3/ncbi/12.0.0
>>>> module load 4.8.3/parallel/20150122
>>>>
>>>>  SLOTS=`cat ${LSB_DJOB_HOSTFILE} |wc -l`
>>>>
>>>>  SERVER=""
>>>>
>>>>  for i in `cat ${LSB_DJOB_HOSTFILE}| sort`
>>>>  do
>>>>  echo "/afs/enea.it/software/bin/blaunch.sh ${i}" >> servers
>>>> done
>>>>
>>>> cat absolute_path_to_sequences.fasta | \
>>>>     parallel --no-notice -vv -j ${SLOTS} --slf servers --plain \
>>>>     --recstart '>' -N 1 --pipe \
>>>>     blastp -evalue 1e-05 -outfmt 6 -db absolute_path_to_db_file \
>>>>     -query - -out absolute_path_to_result_file_{%}
>>>>
>>>> -------------------------------------------------------------------------------
>>>>
>>>> -------------------------------------------------------------------------------
>>>>
>>>> LSF is the one that gives you the execution hosts, so if you are
>>>> launching bsub from GNU parallel, how do you know how to set the --slf
>>>> option?
>>>>
>>>>
>>>>  g
>>>>
>>>>
>>>>
>>>>   On Wed, Apr 15, 2015 at 4:24 PM, Martin d'Anjou <
>>>> [email protected]> wrote:
>>>>
>>>>> On 15-04-15 09:34 AM, Giuseppe Aprea wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I would like to ask you, please, for some help in using parallel with
>>>>>> the blast alignment software.
>>>>>>
>>>>>>
>>>>>> I am trying to use GNU parallel v. 20150122 with blast for a very
>>>>>> large sequence alignment. I am using Parallel on a cluster which uses
>>>>>> LSF as its queue system.
>>>>>>
>>>>>
>>>>>  Hello Giuseppe,
>>>>>
>>>>> I am an avid LSF user, and I want to use GNU Parallel to dispatch jobs
>>>>> to LSF. Could you please explain a little bit to me how GNU Parallel works
>>>>> with LSF? I do not see it in the on-line tutorials. For example, I would
>>>>> like to understand how to pass "bsub" options like -oo, -q queue_name, 
>>>>> etc.
>>>>> to LSF from GNU Parallel.
>>>>>
>>>>> Thanks,
>>>>> Martin
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>
