Hello,
On 2012-05-12 17:58, Keith Robison wrote:
Hello! I've run into a roadblock.
If I run the following command in the background, the assembler seems
to stall, with the last output being the citation for the assembler.
It is likely a problem with MPI, not Ray, because I think no messages
are sent at this point.
mpirun -hostfile hostfile.actinode34 -np 48 -stdin /dev/null \
/home/krobison/packages/Ray-v2.0-ReleaseCandidate5/Ray -i part.8.fasta \
-o ray.part.8.actinode34.c 1> ray.part.8.actinode34.c.out \
2> ray.part.8.actinode34.c.err
Where hostfile.actinode34 reads:
actinode03 slots=24
actinode04 slots=24
If instead I run with a hostfile with only one host (either one of
them) and -np 24, but otherwise the same command line, the assembler
seems to be off and running.
So there seems to be a problem when establishing connections.
Which machine are you on when you launch mpirun/mpiexec?
My .bashrc has
export PATH=$PATH:/act/openmpi/gnu/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/act/openmpi/gnu/lib
(the cluster vendor put the code in /act)
Any suggestions for what might be triggering this behavior?
In Open-MPI:
Communication between a core and itself is done with memory copying.
Communication between different cores within a machine is done with
shared memory by default.
Communication between two cores on different machines can be done using
various byte transfer layers.
If you have TCP/IP and nothing else, then Open-MPI will use tcp. In this
case, one possible problem is that each host has more than one interface
(excluding the loopback) and that the wrong one is used.
Can you ping actinode04 from actinode03?
Command:
ssh actinode03 ping actinode04
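If the ping works but each node has more than one interface, it may help to
tell Open-MPI which one to use for tcp. The sketch below only builds and prints
a test command. The interface name eth0 is an assumption, not something taken
from your cluster; btl_tcp_if_include is the Open-MPI MCA parameter that limits
the tcp byte transfer layer to the listed interfaces.

```shell
# Assumption: eth0 is the interface that connects actinode03 and actinode04.
# Replace it with the name reported by `ip addr` (or `ifconfig`) on the nodes.
iface=eth0

# Build the test command; btl_tcp_if_include limits the tcp layer
# to the given interface.
cmd="mpiexec --mca btl_tcp_if_include $iface -n 48 -hostfile hostfile.actinode34 date"

# Print the command for review before running it on the launch node.
echo "$cmd"
```

If date returns from all 48 ranks with the interface pinned, the same
--mca option can be added to the Ray command line.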
If you have Infiniband, then Open-MPI will use openib. In this case, one
possible problem is that the daemon that computes routes between
Infiniband endpoints (the subnet manager) died or is acting strangely.
Does it also hang if you launch this command?
mpiexec -n 48 -hostfile hostfile.actinode34 \
date
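To tell a hang from a slow start without watching the terminal, the test can be
wrapped in a time limit. This is a minimal sketch: check_hang is a hypothetical
helper name, the 60 second limit is arbitrary, and it assumes the GNU coreutils
timeout command, which exits with status 124 when the limit expires.

```shell
# Run a command under a time limit and report whether it finished or hung.
# Assumes GNU coreutils `timeout` (exit status 124 means the limit expired).
check_hang() {
  limit=$1
  shift
  timeout "$limit" "$@"
  status=$?
  if [ "$status" -eq 124 ]; then
    echo "HUNG after ${limit}s: $*"
  else
    echo "finished with status $status: $*"
  fi
  return "$status"
}

# Example (run on the node you launch from):
# check_hang 60 mpiexec -n 48 -hostfile hostfile.actinode34 date
```

Running it once with the two-host hostfile and once with a single-host hostfile
and -n 24 should show whether only the cross-machine case hangs.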
Can you provide the following output:
ompi_info -a
This is a network problem, I think.
Sébastien
_______________________________________________
Denovoassembler-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/denovoassembler-users