Hi Aaron,

I'm using mpiBLAST 1.4.0 with the NCBI Toolbox from June 2005.
I'm sorry, I have no clue what file it was trying to write when it failed. I'm running the same job again to see if I get the same error. Perhaps this error is just a side effect of a different problem.

The query dataset is 131,813 nucleotide sequences averaging about 100 bases each; the file is 17 MB. The database has 1,015,652 nucleotide sequences averaging about 300 bases each, for a total size of 897 MB. I thought this was the 4 GB database, but I was confused. The sequence headers in the database are not all in FASTA defline format; some are and some aren't. When I formatted the database I used the '-o F' option, since not all header lines were in the defline format:

/usr/local/mpiblast/bin/mpiformatdb -i mgel_tigr -p F --nfrags 32 -o F

I hope that gives you a bit more insight.

Thanks for your help,
Stephen

On Tue, 2006-01-10 at 18:10, Aaron Darling wrote:
> Hi Stephen,
>
> How large is the "large" query data set that you are trying to process?
> What version of mpiBLAST and toolbox release are you using? The 1.4.0
> release with the Oct-2004 toolbox is the most thoroughly tested combination.
> The mpiblast master process filters query sequences and computes
> statistics during startup for the effective search space calculation.
> Doing so can be rather time-consuming for large sequences. Adding -F F
> to disable low-complexity filtering may speed it up somewhat.
> Regardless, the error message you're reporting is something I've never
> seen, even on large query/db sets. The error format indicates that it's
> an error from within the NCBI code. Any idea what "file" it's trying to
> write? As far as I know, mpiblast/NCBI lib shouldn't write files during
> the startup phase. As Joe mentioned, perhaps this is related to -a 2?
>
> -Aaron
>
>
> Stephen Ficklin wrote:
>
> >Hi Joe,
> >
> >My machines file is a list of the nodes in the cluster, all 32 of them,
> >each on an individual line. I purposely removed the path from the command
> >below, but generally I specify the machine file with the full path.
> >
> >The Grid Engine is able to start an instance of mpiblast on each node. I
> >have verified that when I run the job with the --debug flag I do get
> >responses from the other nodes. They are sending messages back. But it
> >just doesn't seem like they do much of anything else. They have a very
> >low load average, which isn't what I expected.
> >
> >I'll try the job without the -a 2 switch and let you know.
> >
> >When I submit a job on SGE I do use the qsub command with -pe mpich 32
> >just like you suggested below, but I do so in a script and submit the
> >script. Here's an example of the script I would use:
> >
> >#!/bin/csh -f
> >#$ -N Job
> >#$ -pe mpich 32
> >#$ -o /scratch/mpiblast.out
> >#$ -e /scratch/mpiblast.err
> >#$ -V
> >#$ -r y
> >/usr/local/mpi/bin/mpirun -np 32 -nolocal -machinefile machines /usr/local/mpiblast/bin/mpiblast -p blastn -i PT_7G4_00005_fil.fas -d tigr -m 7 -v 3 -b 3 -a 2
> >
> >Thanks for your speedy responses :)
> >
> >Stephen
> >
> >On Tue, 2006-01-10 at 10:53, Joe Landman wrote:
> >
> >>Stephen Ficklin wrote:
> >>
> >>>Hi Joe,
> >>>
> >>>Here's an example of one of the commands I've used:
> >>>
> >>>/usr/local/mpi/bin/mpirun -np 32 -nolocal -machinefile machines /usr/local/mpiblast/bin/mpiblast -p blastn -i PT_7G4_00005_fil.fas -d tigr -m 7 -v 3 -b 3 -a 2
> >>
> >>Ok, what is in your machines file?
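For reference, the MPICH-style machines file being asked about here is just a plain-text file with one hostname per line, matching the -np count given to mpirun. The node names below are made-up placeholders, not Stephen's actual hosts; a 32-node file would simply list all 32 names, one per line:

    node01
    node02
    node03
    ...
    node32

mpirun reads this file via -machinefile and, with -np 32, starts one mpiblast process on each listed host.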
> >>
> >>>Generally I'll submit this to SGE but I seem to get the same response
> >>>whether I run it through SGE or straight on the command line.
> >>
> >>Ok. For SGE, you want to use $TMPDIR/machines as the machines file, and
> >>submit it with
> >>
> >>  qsub -pe mpich 32 ... rest of your command line with the -machinefile
> >>$TMPDIR/machines ...
> >>
> >>Also, don't use -a 2. This sets the number of threads to 2, and this
> >>could be problematic for mpiblast. I am not sure if Aaron and Lucas
> >>have used the -a NCPU switch, or what will happen. There may be some
> >>odd interactions with the MPI libraries. Many MPIs are not thread safe
> >>unless built with the threading options.
> >>
> >>Start by taking off the -a 2 (just omit the -a switch entirely). Also
> >>let us know what is in your machine file.
> >>
> >>Joe
> >>
> >>>The tigr database is 4 GB, divided up using mpiformatdb into 32 pieces.
> >>>
> >>>Thanks,
> >>>Stephen
> >>>
> >>>On Tue, 2006-01-10 at 10:17, Joe Landman wrote:
> >>>
> >>>>Stephen Ficklin wrote:
> >>>>
> >>>>>I may be wrong in my assessment, but it appears that when I try to run
> >>>>>an mpiblast, the master node (chosen by the algorithm) does all the
> >>>>>work. I'll get a constant load average of 1 on that node while the
> >>>>>program is running. On the other nodes I barely register any activity.
> >>>>>For small database searches I will get results, but for larger ones it
> >>>>>takes too long and patience gives out, or it finishes with errors. The
> >>>>>last large job I ran ended with this message after giving a few results:
> >>>>>
> >>>>>NULL_Caption] FATAL ERROR: CoreLib [002.005] 000049_0498_0531: File
> >>>>>write error 0 34645.5 Bailing out with signal -1
> >>>>>[0] MPI Abort by user Aborting program !
> >>>>>[0] Aborting program!
> >>>>>
> >>>>>In any case it always seems to overload the master node while the
> >>>>>workers seem to be doing nothing. I've compiled mpiBLAST for OS X,
> >>>>>Linux and Solaris and I get the same response on all three platforms.
> >>>>>Before I try any debugging I just wanted to check to see if anyone had
> >>>>>experienced something similar.
> >>>>
> >>>>Hi Stephen:
> >>>>
> >>>>  Could you tell us how you are launching the job?
> >>>>
> >>>>Joe
> >>>>
> >>>>--
> >>>>Joseph Landman, Ph.D
> >>>>Founder and CEO
> >>>>Scalable Informatics LLC,
> >>>>email: [EMAIL PROTECTED]
> >>>>web  : http://www.scalableinformatics.com
> >>>>phone: +1 734 786 8423
> >>>>fax  : +1 734 786 8452
> >>>>cell : +1 734 612 4615
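Pulling the suggestions in this thread together, a revised version of Stephen's submission script might look like the sketch below. This is only an illustrative, untested sketch: it keeps Stephen's original paths, query and database names, switches to the $TMPDIR/machines host file that SGE's mpich parallel environment provides (as Joe suggests), drops the -a 2 thread switch, and adds the -F F switch Aaron mentions.

    #!/bin/csh -f
    #$ -N Job
    #$ -pe mpich 32
    #$ -o /scratch/mpiblast.out
    #$ -e /scratch/mpiblast.err
    #$ -V
    #$ -r y
    # Use the host file written by SGE's mpich parallel environment for this
    # job instead of a hand-maintained machines file; omit -a (no threading)
    # and add -F F to skip low-complexity filtering during master startup.
    /usr/local/mpi/bin/mpirun -np 32 -nolocal -machinefile $TMPDIR/machines \
      /usr/local/mpiblast/bin/mpiblast -p blastn -i PT_7G4_00005_fil.fas \
      -d tigr -m 7 -v 3 -b 3 -F F

Whether -F F is appropriate depends on the data; it only addresses the slow startup Aaron describes, not the file-write error itself.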
