[OMPI users] Failed to register memory (openmpi 2.0.2)

2017-10-18 Thread Mark Dixon
Hi, We're intermittently seeing messages (below) about failing to register memory with openmpi 2.0.2 on centos7 / Mellanox FDR Connect-X 3 and the vanilla IB stack as shipped by centos. We're not using any mlx4_core module tweaks at the moment. On earlier machines we used to set registered m

[OMPI users] IpV6 Openmpi mpirun failed

2017-10-18 Thread Mukkie
Hi, I have two IPv6-only machines. I configured and built OMPI version 3.0 with --enable-ipv6, and I want to verify a simple MPI communication call over TCP/IP between these two machines. I am using the ring_c and connectivity_c examples. Issuing from one of the host machines… [mselvam@ipv-rhel73 examp

Re: [OMPI users] Failed to register memory (openmpi 2.0.2)

2017-10-18 Thread r...@open-mpi.org
Put “oob=tcp” in your default MCA param file > On Oct 18, 2017, at 9:00 AM, Mark Dixon wrote: > > Hi, > > We're intermittently seeing messages (below) about failing to register memory > with openmpi 2.0.2 on centos7 / Mellanox FDR Connect-X 3 and the vanilla IB > stack as shipped by centos. >

Re: [OMPI users] IpV6 Openmpi mpirun failed

2017-10-18 Thread Mukkie
Adding a verbose output. Please check for failed and advise. Thank you. [mselvam@ipv-rhel73 examples]$ mpirun -hostfile host --mca oob_base_verbose 100 --mca btl tcp,self ring_c [ipv-rhel73:10575] mca_base_component_repository_open: unable to open mca_plm_tm: libtorque.so.2: cannot open shared obj

Re: [OMPI users] IpV6 Openmpi mpirun failed

2017-10-18 Thread r...@open-mpi.org
Looks like there is a firewall or something blocking communication between those nodes? > On Oct 18, 2017, at 1:29 PM, Mukkie wrote: > > Adding a verbose output. Please check for failed and advise. Thank you. > > [mselvam@ipv-rhel73 examples]$ mpirun -hostfile host --mca oob_base_verbose > 10

Re: [OMPI users] IpV6 Openmpi mpirun failed

2017-10-18 Thread Mukkie
Thanks for your suggestion. However, my firewalls are already disabled on both machines. Cordially, Muku. On Wed, Oct 18, 2017 at 2:38 PM, r...@open-mpi.org wrote: > Looks like there is a firewall or something blocking communication between > those nodes? > > On Oct 18, 2017, at 1:29 PM, Mu