Hi Brendan,

What type of Ethernet device (is this a Mellanox HCA?) and Ethernet switch
are you using?  The mpirun options look correct to me.  Is it possible that
you have all the MPI processes on a single node?
It should be pretty obvious from the IMB Sendrecv test whether you're using
RoCE: the large-message bandwidth will be much better than if you are going
through the TCP BTL.
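
A rough way to check (borrowing the hostfile and IMB path from your command
below) is to run just the Sendrecv benchmark once over openib and once forced
onto the TCP BTL, and compare the large-message numbers:

  # openib (RoCE) run
  mpirun --mca btl openib,self,sm --mca btl_openib_cpc_include rdmacm \
      -np 4 -hostfile mpi-hosts-ce /usr/local/bin/IMB-MPI1 Sendrecv

  # TCP-only run for comparison
  mpirun --mca btl tcp,self,sm \
      -np 4 -hostfile mpi-hosts-ce /usr/local/bin/IMB-MPI1 Sendrecv

If the first run isn't dramatically faster at large message sizes, the openib
BTL probably isn't carrying the traffic.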

If you're using Mellanox cards, you might want to do a sanity check using
the MXM libraries.  You'd want to set the MXM_TLS environment variable to
"self,shm,rc".  We got close to 90 Gb/sec bandwidth using ConnectX-4 + the
MXM MTL on a cluster earlier this year.
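
A minimal sketch of such a run, assuming your Open MPI build includes MXM
support (hostfile and IMB path borrowed from your command below):

  mpirun --mca pml cm --mca mtl mxm -x MXM_TLS=self,shm,rc \
      -np 4 -hostfile mpi-hosts-ce /usr/local/bin/IMB-MPI1 Sendrecv

If that gets you bandwidth in the expected RoCE range, the cards and fabric
are fine and the problem is on the openib BTL side.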

Howard



2016-11-08 15:15 GMT-07:00 Brendan Myers <brendan.my...@soft-forge.com>:

> Hello,
>
> I am trying to figure out how I can verify that the Open MPI traffic is
> actually being transmitted over the RoCE fabric connecting my cluster.  My
> MPI job runs quickly and error-free, but I cannot seem to verify that
> significant amounts of data are being transferred to the other endpoint in
> my RoCE fabric.  When I remove the OOB exclusion from my command, I am able
> to see what I believe to be the OOB traffic on my RoCE interface using the
> tools listed below.
>
> Software:
>
> - CentOS 7.2
> - Open MPI 2.0.1
>
> Command:
>
> - mpirun --mca btl openib,self,sm --mca oob_tcp_if_exclude eth3
>   --mca btl_openib_receive_queues P,65536,120,64,32 --mca
>   btl_openib_cpc_include rdmacm -np 4 -hostfile mpi-hosts-ce
>   /usr/local/bin/IMB-MPI1
>
>   - eth3 is my RoCE interface
>   - The RoCE interfaces of the 2 nodes involved are defined in my
>     mpi-hosts-ce file
>
> Ways I have tried to verify the data transfer:
>
> - The port counters on my RoCE switch: they show data being sent when
>   using ib_write_bw, but not when using Open MPI
> - ibdump: shows data being sent when using ib_write_bw, but not when
>   using Open MPI
> - Wireshark: shows data being sent when using ib_write_bw, but not when
>   using Open MPI
>
>
>
> I do not have much experience with Open MPI and apologize if I have left
> out necessary information.  I will respond with any data requested.  I
> appreciate the time spent to read and respond to this.
>
>
>
>
>
> Thank you,
>
>
>
> Brendan T. W. Myers
>
> brendan.my...@soft-forge.com
>
> Software Forge Inc
>
>
>
_______________________________________________
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users
