Sorry for the delay in replying; this mail slipped by me in my inbox.
On Apr 26, 2009, at 11:50 PM, Rangesh Gupta wrote:
Hi all,
I am facing a problem while running OpenFOAM 1.5. The executable is
sonicTurbFoam, run with Open MPI. It hangs after some time, and
every time it hangs at a different place. The MPI command is:
mpirun --mca btl_openib_if_include ib0 \
    -mca btl_tcp_if_exclude lo,eth0,eth1 \
    --mca btl_openib_ib_timeout 40 \
    -n $NO_OF_PROCESS -machinefile $MYHOSTS $1 -parallel
FWIW, if you're submitting via slurm, the -machinefile and -n options
shouldn't be necessary -- it should get those directly from SLURM.
We are using 64 processors on 8 nodes.
I am submitting the job through the LSF scheduler, which internally
uses SLURM as its resource manager.
Error:
[n112][0,1,41][btl_tcp_frag.c:202:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed with errno=110
[n112][0,1,43][btl_tcp_frag.c:202:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed with errno=110
errno=110 is ETIMEDOUT (connection timed out) on Linux. Do you happen
to have a firewall enabled on your compute nodes? OMPI needs to be able
to use random TCP ports to connect between all of the processes in an
MPI job.
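
If it helps with the firewall question, here is a minimal Python sketch (not part of Open MPI) that decodes an errno value and tries a plain TCP connection between two nodes. The function names describe_errno and can_connect are hypothetical, and the host name and port in the usage note are placeholders you would replace with your own. It only tests the single port you give it, so a success does not prove that every port OMPI might pick is open.

```python
import errno
import os
import socket


def describe_errno(code):
    """Return (symbolic name, message) for an errno value on this platform."""
    return errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)


def can_connect(host, port, timeout=5.0):
    """Try one TCP connection to host:port.

    Returns (True, None) on success, or (False, reason) if the attempt
    is refused, filtered, or times out.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, None
    except OSError as exc:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False, str(exc)


if __name__ == "__main__":
    # The value reported in the error message above.
    print(describe_errno(110))
```

To use it, start a listener on one compute node on a high, unprivileged port of your choosing, then call can_connect("that-node", that_port) from another node. A timeout rather than an immediate refusal would be consistent with packets being dropped by a firewall.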
--
Jeff Squyres
Cisco Systems