Sent: Thursday, April 28, 2011 9:03 AM
To: Open MPI Users
Subject: Re: [OMPI users] OpenMPI out of band TCP retry exceeded
On Apr 28, 2011, at 6:56 AM, Sindhi, Waris PW wrote:
> Yes, the procgroup file has more than 128 applications in it.
>
> % wc -l procgroup
> 239 procgroup
>
The --prefix directory is a typo and no longer exists on our system.
We are running version 1.4-4 of OpenMPI:
% /opt/openmpi/x86_64/bin/ompi_info
Package: Open MPI
mockbu...@x86-004.build.bos.redhat.com Distribution
Open MPI: 1.4
Open MPI SVN revision: r22285
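
A note on the `wc -l procgroup` check quoted above: wc counts every line,
including blank lines and comments, so 239 lines need not mean 239
application contexts. The sketch below counts only the lines that would
define an application. It assumes that each remaining line is one
application context and that `#` starts a comment; the function name
and the sample entries are hypothetical, not taken from the poster's
procgroup file.

```python
def count_app_contexts(lines):
    """Count lines that are neither blank nor pure '#' comments.

    Assumption: one application context per remaining line, and '#'
    introduces a comment that runs to the end of the line.
    """
    count = 0
    for line in lines:
        # Drop anything after '#', then surrounding whitespace.
        stripped = line.split("#", 1)[0].strip()
        if stripped:
            count += 1
    return count


# Hypothetical appfile content, for illustration only.
sample = [
    "# hypothetical appfile, not the poster's procgroup",
    "-np 1 ./master",
    "",
    "-np 1 ./slave",
    "-np 1 ./slave  # trailing comment",
]

print(count_app_contexts(sample))
```

To check a real file, pass an open file object instead of the list, for
example `count_app_contexts(open("procgroup"))`.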
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
Behalf Of Ralph Castain
Sent: Wednesday, April 27, 2011 8:09 PM
To: Open MPI Users
Subject: Re: [OMPI users] OpenMPI out of band TCP retry exceeded
On Apr 27, 2011, at 1:31 PM, Sindhi, Waris PW wrote:
> No we do not have a f
blish TCP communications with the daemon on ln10.
On Apr 27, 2011, at 11:58 AM, Sindhi, Waris PW wrote:
> Hi,
> I am getting an "oob-tcp: Communication retries exceeded" error
> message when I run a code with 238 MPI slaves
>
>
> /opt/openmpi/i386/bin/mpirun -mca btl_op
Hi,
I am getting an "oob-tcp: Communication retries exceeded" error
message when I run a code with 238 MPI slaves
/opt/openmpi/i386/bin/mpirun -mca btl_openib_verbose 1 --mca btl ^tcp
--mca pls_ssh_agent ssh -mca oob_tcp_peer_retries 1000 --prefix
/usr/lib/openmpi/1.2.8-gcc/bin -np 239 --app procgr
Hi,
I am having trouble using the --app option with OpenMPI's mpirun
command. The MPI processes started with the --app option are launched
on the Linux node where the mpirun command itself is executed.
The same MPI executable works when specified on the command line using
the -np option.
Please let