Dear all,

A quick follow-up in aid of Google.

Upgrading the Intel compilers made no difference to the error message. I
contacted the researcher who wrote RAxML, who told me that the problem was
likely the Intel compilers over-optimising the code, and suggested using
GCC, which worked.
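For anyone who finds this thread later, the rebuild was nothing exotic. A
minimal sketch of what it looks like (I'm assuming the stock Makefile.MPI
that ships in the RAxML source tree; check yours for the exact name):

  # Point Open MPI's wrapper compiler at gcc instead of icc,
  # then rebuild the MPI binary from a clean tree.
  cd RAxML-7.0.4
  make -f Makefile.MPI clean
  OMPI_CC=gcc make -f Makefile.MPI CC=mpicc

OMPI_CC is Open MPI's documented way of overriding the compiler behind
mpicc. If you'd rather stay with icc, reducing the optimisation level in
the Makefile's CFLAGS (e.g. -O1, or icc's -fp-model precise) would be the
other thing I'd try, though I haven't tested that myself.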
He also pointed me in the direction of new versions of RAxML, which are
available at http://wwwkramer.in.tum.de/exelixis/software.html

Nick

2009/11/6 Nick Holway <nick.hol...@gmail.com>:
> Hi,
>
> Thank you for the information. I'm going to try the new Intel
> compilers, which I'm downloading now, but as they're taking so long to
> download I don't think I'll be able to look into this again until
> after the weekend. BTW, using their Java-based downloader is a bit
> less painful than their normal download.
>
> In the meantime, if anyone else has suggestions, please let me know.
>
> Thanks
>
> Nick
>
> 2009/11/5 Jeff Squyres <jsquy...@cisco.com>:
>> FWIW, I think Intel released 11.1.059 earlier today (I've been trying
>> to download it all morning). I doubt it's an issue in this case, but
>> I thought I'd mention it as a public service announcement. ;-)
>>
>> Seg faults are *usually* an application issue (never say "never", but
>> they *usually* are). You might want to first contact the RAxML team
>> to see if there are any known issues with their software and Open MPI
>> 1.3.3...? (Sorry, I'm totally unfamiliar with RAxML.)
>>
>> On Nov 5, 2009, at 12:30 PM, Nick Holway wrote:
>>
>>> Dear all,
>>>
>>> I'm trying to run RAxML 7.0.4 on my 64-bit Rocks 5.1 cluster (i.e.
>>> CentOS 5.2). I compiled Open MPI 1.3.3 with the Intel compilers
>>> v11.1.056 using:
>>>
>>>   ./configure CC=icc CXX=icpc F77=ifort FC=ifort --with-sge \
>>>     --prefix=/usr/prog/mpi/openmpi/1.3.3/x86_64-no-mem-man \
>>>     --with-memory-manager=none
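>>>
>>> (As a sanity check on the build itself, the ompi_info from that
>>> install should confirm both the compiler and the memory-manager
>>> setting; something along these lines:
>>>
>>>   /usr/prog/mpi/openmpi/1.3.3/x86_64-no-mem-man/bin/ompi_info | grep -i compiler
>>>   /usr/prog/mpi/openmpi/1.3.3/x86_64-no-mem-man/bin/ompi_info | grep -i memory
>>>
>>> I'd expect the first to report icc/icpc/ifort, and with
>>> --with-memory-manager=none I'd expect no ptmalloc2 memory component
>>> to show up in the second.)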
>>>
>>> When I run RAxML in a qlogin session using
>>>
>>>   /usr/prog/mpi/openmpi/1.3.3/x86_64-no-mem-man/bin/mpirun -np 8 \
>>>     /usr/prog/bioinformatics/RAxML/7.0.4/x86_64/RAxML-7.0.4/raxmlHPC-MPI \
>>>     -f a -x 12345 -p12345 -# 10 -m GTRGAMMA \
>>>     -s /users/holwani1/jay/ornodko-1582 -n mpitest39
>>>
>>> I get the following output:
>>>
>>> This is the RAxML MPI Worker Process Number: 1
>>> This is the RAxML MPI Worker Process Number: 3
>>>
>>> This is the RAxML MPI Master process
>>>
>>> This is the RAxML MPI Worker Process Number: 7
>>>
>>> This is the RAxML MPI Worker Process Number: 4
>>>
>>> This is the RAxML MPI Worker Process Number: 5
>>>
>>> This is the RAxML MPI Worker Process Number: 2
>>>
>>> This is the RAxML MPI Worker Process Number: 6
>>>
>>> IMPORTANT WARNING: Alignment column 1695 contains only undetermined
>>> values which will be treated as missing data
>>>
>>> IMPORTANT WARNING: Sequences A4_H10 and A3ii_E11 are exactly identical
>>>
>>> IMPORTANT WARNING: Sequences A2_A08 and A9_C10 are exactly identical
>>>
>>> IMPORTANT WARNING: Sequences A3ii_B03 and A3ii_C06 are exactly identical
>>>
>>> IMPORTANT WARNING: Sequences A9_D08 and A9_F10 are exactly identical
>>>
>>> IMPORTANT WARNING: Sequences A3ii_F07 and A9_C08 are exactly identical
>>>
>>> IMPORTANT WARNING: Sequences A6_F05 and A6_F11 are exactly identical
>>>
>>> IMPORTANT WARNING
>>> Found 6 sequences that are exactly identical to other sequences in
>>> the alignment.
>>> Normally they should be excluded from the analysis.
>>>
>>> IMPORTANT WARNING
>>> Found 1 column that contains only undetermined values which will be
>>> treated as missing data.
>>> Normally these columns should be excluded from the analysis.
>>>
>>> An alignment file with undetermined columns and sequence duplicates
>>> removed has already been printed to file
>>> /users/holwani1/jay/ornodko-1582.reduced
>>>
>>> You are using RAxML version 7.0.4 released by Alexandros Stamatakis
>>> in April 2008
>>>
>>> Alignment has 1280 distinct alignment patterns
>>>
>>> Proportion of gaps and completely undetermined characters in this
>>> alignment: 0.124198
>>>
>>> RAxML rapid bootstrapping and subsequent ML search
>>>
>>> Executing 10 rapid bootstrap inferences and thereafter a thorough ML
>>> search
>>>
>>> All free model parameters will be estimated by RAxML
>>> GAMMA model of rate heteorgeneity, ML estimate of alpha-parameter
>>> GAMMA Model parameters will be estimated up to an accuracy of
>>> 0.1000000000 Log Likelihood units
>>>
>>> Partition: 0
>>> Name: No Name Provided
>>> DataType: DNA
>>> Substitution Matrix: GTR
>>> Empirical Base Frequencies:
>>> pi(A): 0.261129 pi(C): 0.228570 pi(G): 0.315946 pi(T): 0.194354
>>>
>>> Switching from GAMMA to CAT for rapid Bootstrap, final ML search will
>>> be conducted under the GAMMA model you specified
>>> Bootstrap[10]: Time 44.442728 bootstrap likelihood -inf, best
>>> rearrangement setting 5
>>> Bootstrap[0]: Time 44.814948 bootstrap likelihood -inf, best
>>> rearrangement setting 5
>>> Bootstrap[6]: Time 46.470371 bootstrap likelihood -inf, best
>>> rearrangement setting 6
>>> [compute-0-11:08698] *** Process received signal ***
>>> [compute-0-11:08698] Signal: Segmentation fault (11)
>>> [compute-0-11:08698] Signal code: Address not mapped (1)
>>> [compute-0-11:08698] Failing at address: 0x408
>>> [compute-0-11:08698] [ 0] /lib64/libpthread.so.0 [0x3fb580de80]
>>> [compute-0-11:08698] [ 1]
>>> /usr/prog/bioinformatics/RAxML/7.0.4/x86_64/RAxML-7.0.4/raxmlHPC-MPI(hookup+0)
>>> [0x413ca0]
>>> [compute-0-11:08698] [ 2]
>>> /usr/prog/bioinformatics/RAxML/7.0.4/x86_64/RAxML-7.0.4/raxmlHPC-MPI(restoreTL+0xd9)
>>> [0x442c09]
>>> [compute-0-11:08698] [ 3]
>>> /usr/prog/bioinformatics/RAxML/7.0.4/x86_64/RAxML-7.0.4/raxmlHPC-MPI
>>> [0x42c968]
>>> [compute-0-11:08698] [ 4]
>>> /usr/prog/bioinformatics/RAxML/7.0.4/x86_64/RAxML-7.0.4/raxmlHPC-MPI(doAllInOne+0x91a)
>>> [0x42b21a]
>>> [compute-0-11:08698] [ 5]
>>> /usr/prog/bioinformatics/RAxML/7.0.4/x86_64/RAxML-7.0.4/raxmlHPC-MPI(main+0xc25)
>>> [0x4063f5]
>>> [compute-0-11:08698] [ 6] /lib64/libc.so.6(__libc_start_main+0xf4)
>>> [0x3fb501d8b4]
>>> [compute-0-11:08698] [ 7]
>>> /usr/prog/bioinformatics/RAxML/7.0.4/x86_64/RAxML-7.0.4/raxmlHPC-MPI
>>> [0x405719]
>>> [compute-0-11:08698] *** End of error message ***
>>> Bootstrap[1]: Time 8.400332 bootstrap likelihood -inf, best
>>> rearrangement setting 5
>>> --------------------------------------------------------------------------
>>> mpirun noticed that process rank 1 with PID 8698 on node
>>> compute-0-11.local exited on signal 11 (Segmentation fault).
>>> --------------------------------------------------------------------------
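>>>
>>> (If a backtrace with symbols would help, this is roughly how I'd pull
>>> one out of a core file; core-file naming varies by system, and the
>>> binary needs to be built with -g to get useful line numbers, so both
>>> of those are assumptions about the setup:
>>>
>>>   ulimit -c unlimited    # in the same shell, before mpirun
>>>   gdb /usr/prog/bioinformatics/RAxML/7.0.4/x86_64/RAxML-7.0.4/raxmlHPC-MPI core.8698
>>>   (gdb) bt full
>>>
>>> The frames in hookup() and restoreTL() above are inside raxmlHPC-MPI
>>> itself rather than any MPI library, which already points at the
>>> application.)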
>>>
>>> My $PATH is:
>>>
>>> /usr/prog/mpi/openmpi/1.3.3/x86_64-no-mem-man/bin/:/usr/prog/mpi/openmpi/1.3.3/x86_64/bin/:/usr/prog/intel/ifort/11.1.056/bin/intel64:/usr/prog/intel/icc/11.1.056//bin/intel64:/usr/prog/intel/ifort/11.1.056/bin/intel64:/usr/prog/intel/icc/11.1.056//bin/intel64:/opt/gridengine/bin/lx26-amd64:/usr/kerberos/sbin:/usr/kerberos/bin:/opt/gridengine/bin/lx26-amd64:/usr/java/latest/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/opt/ganglia/bin:/opt/ganglia/sbin:/opt/rocks/bin:/opt/rocks/sbin:/root/bin
>>>
>>> My $LD_LIBRARY_PATH is:
>>>
>>> /usr/prog/mpi/openmpi/1.3.3/x86_64-no-mem-man/lib/:/usr/prog/mpi/openmpi/1.3.3/x86_64/lib/:/usr/prog/intel/ifort/11.1.056/lib/intel64:/usr/prog/intel/ifort/11.1.056/mkl/lib/em64t:/usr/prog/intel/icc/11.1.056//lib/intel64:/usr/prog/intel/icc/11.1.056//ipp/em64t/sharedlib:/usr/prog/intel/icc/11.1.056//mkl/lib/em64t:/usr/prog/intel/icc/11.1.056//tbb/intel64/cc4.1.0_libc2.4_kernel2.6.16.21/lib:/usr/prog/intel/ifort/11.1.056/lib/intel64:/usr/prog/intel/ifort/11.1.056/mkl/lib/em64t:/usr/prog/intel/icc/11.1.056//lib/intel64:/usr/prog/intel/icc/11.1.056//ipp/em64t/sharedlib:/usr/prog/intel/icc/11.1.056//mkl/lib/em64t:/usr/prog/intel/icc/11.1.056//tbb/intel64/cc4.1.0_libc2.4_kernel2.6.16.21/lib:/opt/gridengine/lib/lx26-amd64:/opt/gridengine/lib/lx26-amd64
>>>
>>> Although I'm only running this on one node, it may be helpful to know
>>> that the nodes have InfiniBand with Voltaire OFED v1.4. The MPIs from
>>> Rocks' HPC roll are not installed. I've tried running the above on
>>> multiple nodes but still see the same error. I've attached the
>>> config.log and ompi_info output to this email.
>>>
>>> I believe the input is OK, as I can run the serial gcc-compiled RAxML
>>> on the data with no problems. I tried compiling Open MPI with
>>> --with-memory-manager=none, as a quick Google search
>>> (http://osdir.com/ml/clustering.open-mpi.user/2008-07/msg00201.html)
>>> suggested it could help, but it made no difference. Google also
>>> suggested the problem could be the compile environment differing from
>>> the runtime environment; to test this, I compiled and ran RAxML
>>> immediately after compiling Open MPI in the same session, again with
>>> no joy.
>>>
>>> Does anyone know how I can fix this?
>>>
>>> Thanks
>>>
>>> Nick
>>>
>>> <config.tar.gz><ompi-info.tar.gz><ATT2831213.txt>
>>
>>
>> --
>> Jeff Squyres
>> jsquy...@cisco.com
>>
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>