Hello Brendan,

This helps some, but looks like we need more debug output.
Could you build a debug version of Open MPI by adding --enable-debug to the configure options, then rerun the test with the breakout cable setup, keeping the --mca btl_base_verbose 100 command-line option?

Thanks,

Howard

2017-01-23 8:23 GMT-07:00 Brendan Myers <brendan.my...@soft-forge.com>:

> Hello Howard,
>
> Thank you for looking into this. Attached is the output you requested.
> Also, I am using Open MPI 2.0.1.
>
> Thank you,
> Brendan
>
> From: users [mailto:users-boun...@lists.open-mpi.org] On Behalf Of Howard Pritchard
> Sent: Friday, January 20, 2017 6:35 PM
> To: Open MPI Users <users@lists.open-mpi.org>
> Subject: Re: [OMPI users] Open MPI over RoCE using breakout cable and switch
>
> Hi Brendan
>
> I doubt this kind of config has gotten any testing with OMPI. Could you
> rerun with
>
> --mca btl_base_verbose 100
>
> added to the command line and post the output to the list?
>
> Howard
>
> Brendan Myers <brendan.my...@soft-forge.com> wrote on Fri, Jan 20, 2017
> at 15:04:
>
> Hello,
>
> I am attempting to get Open MPI to run over 2 nodes using a switch and a
> single breakout cable with this design:
>
> (100GbE) QSFP <-> 2x (50GbE) QSFP
>
> Hardware Layout:
>
> Breakout cable module A connects to switch (100GbE)
> Breakout cable module B1 connects to node 1 RoCE NIC (50GbE)
> Breakout cable module B2 connects to node 2 RoCE NIC (50GbE)
> Switch is Mellanox SN 2700 100GbE RoCE switch
>
> - I am able to pass RDMA traffic between the nodes with perftest
>   (ib_write_bw) when using the breakout cable as the IC from both nodes
>   to the switch.
> - When attempting to run a job using the breakout cable as the IC, Open
>   MPI aborts with "failure to initialize open fabrics device" errors.
> - If I replace the breakout cable with 2 standard QSFP cables, the Open
>   MPI job completes correctly.
>
> This is the command I use. It works unless I attempt a run with the
> breakout cable used as the IC:
>
> mpirun --mca btl openib,self,sm --mca btl_openib_receive_queues P,65536,120,64,32 --mca btl_openib_cpc_include rdmacm -hostfile mpi-hosts-ce /usr/local/bin/IMB-MPI1
>
> If anyone has any idea as to why using a breakout cable is causing my
> jobs to fail, please let me know.
>
> Thank you,
>
> Brendan T. W. Myers
> brendan.my...@soft-forge.com
> Software Forge Inc
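For reference, the debug rebuild and rerun requested above could look roughly like the following sketch. It is not taken from the thread: the install prefix ($HOME/ompi-debug), the run/build_debug/rerun_test helper names, and the DRY_RUN switch are all hypothetical, and it assumes the script is started from an unpacked Open MPI source tree. Only --enable-debug, --mca btl_base_verbose 100, and the original mpirun arguments come from the messages above. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch only: prints the commands by default; set DRY_RUN=0 to execute them.
# Assumes the current directory is an unpacked Open MPI source tree.

PREFIX="${PREFIX:-$HOME/ompi-debug}"   # hypothetical install prefix
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Configure with debugging enabled, then build and install.
build_debug() {
    run ./configure --prefix="$PREFIX" --enable-debug &&
    run make all &&
    run make install
}

# Same mpirun command as in the original report, plus verbose BTL output.
rerun_test() {
    run "$PREFIX/bin/mpirun" \
        --mca btl openib,self,sm \
        --mca btl_openib_receive_queues P,65536,120,64,32 \
        --mca btl_openib_cpc_include rdmacm \
        --mca btl_base_verbose 100 \
        -hostfile mpi-hosts-ce /usr/local/bin/IMB-MPI1
}

build_debug && rerun_test
```

With DRY_RUN=0 the same script runs the commands instead of printing them. The debug install presumably has to be reachable under the same prefix on both nodes, and the verbose output should be captured to a file so it can be posted to the list.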
_______________________________________________
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users