Going to put the RFC out today with a timeout of about 2 weeks. This will give me some time to talk with other Open MPI developers face-to-face at SC14.
If the RFC fails I will still bring that and a couple of other fixes into
the master.

-Nathan

On Tue, Nov 04, 2014 at 04:06:45PM -0600, Steve Wise wrote:
> Ok, sounds like I should let you continue the good work! :) When do you
> plan to merge this into ompi proper?
>
> On 11/4/2014 3:58 PM, Nathan Hjelm wrote:
> > That certainly addresses part of the problem. I am working on a
> > complete revamp of the btl RDMA interface. It contains this fix:
> >
> > https://github.com/hjelmn/ompi/commit/66fa429e306beb9fca59da0a4554e9b98d788316
> >
> > -Nathan
> >
> > On Tue, Nov 04, 2014 at 03:27:23PM -0600, Steve Wise wrote:
> > > I found the bug. Here is the fix:
> > >
> > > [root@stevo1 openib]# git diff
> > > diff --git a/opal/mca/btl/openib/btl_openib_component.c b/opal/mca/btl/openib/btl_openib_component.c
> > > index d876e21..8a5ea82 100644
> > > --- a/opal/mca/btl/openib/btl_openib_component.c
> > > +++ b/opal/mca/btl/openib/btl_openib_component.c
> > > @@ -1960,9 +1960,8 @@ static int init_one_device(opal_list_t *btl_list, struct ibv_device* ib_dev)
> > >      }
> > >
> > >      /* If the MCA param was specified, skip all the checks */
> > > -    if ( MCA_BASE_VAR_SOURCE_COMMAND_LINE ||
> > > -         MCA_BASE_VAR_SOURCE_ENV ==
> > > -         mca_btl_openib_component.receive_queues_source) {
> > > +    if (MCA_BASE_VAR_SOURCE_COMMAND_LINE == mca_btl_openib_component.receive_queues_source ||
> > > +        MCA_BASE_VAR_SOURCE_ENV == mca_btl_openib_component.receive_queues_source) {
> > >          goto good;
> > >      }
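(The fix makes the intent of the check explicit. In the original condition,
MCA_BASE_VAR_SOURCE_COMMAND_LINE appears bare rather than being compared
against receive_queues_source; since it is presumably a nonzero enum
constant, the || expression is true for every device, and the `goto good`
skips the per-device receive_queues setup unconditionally. A minimal
standalone sketch of the same bug pattern, using hypothetical enum values
rather than the real definitions from opal/mca/base:

    #include <stdio.h>

    /* Hypothetical stand-ins for mca_base_var_source_t; the only property
     * that matters here is that VAR_SOURCE_COMMAND_LINE is nonzero. */
    enum var_source {
        VAR_SOURCE_DEFAULT = 0,
        VAR_SOURCE_COMMAND_LINE,
        VAR_SOURCE_ENV,
    };

    int main(void)
    {
        enum var_source source = VAR_SOURCE_DEFAULT; /* param NOT specified */

        /* Buggy form: the first operand is the bare constant (nonzero,
         * hence true), so || short-circuits and the branch always fires. */
        if (VAR_SOURCE_COMMAND_LINE || VAR_SOURCE_ENV == source) {
            printf("buggy check: always skips the .ini-based setup\n");
        }

        /* Fixed form: both constants are compared against the source. */
        if (VAR_SOURCE_COMMAND_LINE == source || VAR_SOURCE_ENV == source) {
            printf("fixed check: skip\n");
        } else {
            printf("fixed check: falls through to the .ini-based setup\n");
        }
        return 0;
    }

Clang, for what it's worth, has a -Wconstant-logical-operand warning aimed
at exactly this kind of constant operand to && or ||.)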
> > > On 11/4/2014 3:08 PM, Nathan Hjelm wrote:
> > > > I have run into the issue as well. I will open a pull request for
> > > > 1.8.4 as part of a patch fixing the coalescing issues.
> > > >
> > > > -Nathan
> > > >
> > > > On Tue, Nov 04, 2014 at 02:50:30PM -0600, Steve Wise wrote:
> > > > > On 11/4/2014 2:09 PM, Steve Wise wrote:
> > > > > > Hi,
> > > > > >
> > > > > > I'm running ompi top-of-tree from github and seeing an openib btl
> > > > > > issue where the qp/srq configuration is incorrect for the given
> > > > > > device id. This works fine in 1.8.4rc1, but I see the problem in
> > > > > > top-of-tree. A simple 2-node IMB-MPI1 pingpong fails to get the
> > > > > > ranks set up. I see this logged:
> > > > > >
> > > > > > /opt/ompi-trunk/bin/mpirun --allow-run-as-root --np 2 --host stevo1,stevo2 --mca btl openib,sm,self /opt/ompi-trunk/bin/IMB-MPI1 pingpong
> > > > > >
> > > > > > Adding this works around the issue:
> > > > > >
> > > > > > --mca btl_openib_receive_queues P,65536,64
> > > > > >
> > > > > > I also confirmed that opal_btl_openib_ini_query() is getting the
> > > > > > correct receive_queues string from the .ini file on both nodes
> > > > > > for the cxgb4 device...
> > > > > >
> > > > > > <snip>
> > > > > >
> > > > > > --------------------------------------------------------------------------
> > > > > > The Open MPI receive queue configuration for the OpenFabrics devices
> > > > > > on two nodes are incompatible, meaning that MPI processes on two
> > > > > > specific nodes were unable to communicate with each other. This
> > > > > > generally happens when you are using OpenFabrics devices from
> > > > > > different vendors on the same network. You should be able to use the
> > > > > > mca_btl_openib_receive_queues MCA parameter to set a uniform receive
> > > > > > queue configuration for all the devices in the MPI job, and therefore
> > > > > > be able to run successfully.
> > > > > >
> > > > > >   Local host:      stevo2
> > > > > >   Local adapter:   cxgb4_0 (vendor 0x1425, part ID 21520)
> > > > > >   Local queues:    P,128,256,192,128:S,2048,1024,1008,64:S,12288,1024,1008,64:S,65536,1024,1008,64
> > > > > >
> > > > > >   Remote host:     stevo1
> > > > > >   Remote adapter:  (vendor 0x1425, part ID 21520)
> > > > > >   Remote queues:   P,65536,64
> > > > > > --------------------------------------------------------------------------
> > > > > >
> > > > > > The stevo1 rank has the correct queue settings: P,65536,64. For
> > > > > > some reason, stevo2 has the wrong settings, even though it has
> > > > > > the correct device id info.
> > > > > >
> > > > > > Any suggestions on debugging this? Like where to dig in the src
> > > > > > to see if somehow the .ini parsing is broken...
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Steve.
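(For context on the strings in that error message: btl_openib_receive_queues
is a colon-separated list of queue specifications, each starting with the
queue type (P = per-peer, S = shared receive queue, X = XRC) followed by the
buffer size in bytes and the number of buffers; the trailing fields are
flow-control tuning. Both sides of a connection must agree on the list,
which is why the mismatch above is fatal. A throwaway sketch, not Open MPI's
parser, and with field meanings past the first three assumed, that tallies
the receive buffering described by stevo2's default string:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main(void)
    {
        /* stevo2's queue string from the error output above. */
        char spec[] = "P,128,256,192,128:S,2048,1024,1008,64:"
                      "S,12288,1024,1008,64:S,65536,1024,1008,64";
        char *save_qp = NULL;

        /* Outer pass splits on ':' into queue specs; inner pass splits
         * each spec on ',' into its fields. */
        for (char *qp = strtok_r(spec, ":", &save_qp); qp != NULL;
             qp = strtok_r(NULL, ":", &save_qp)) {
            char *save_field = NULL;
            char *type  = strtok_r(qp, ",", &save_field);
            char *size  = strtok_r(NULL, ",", &save_field);
            char *count = strtok_r(NULL, ",", &save_field);

            if (type != NULL && size != NULL && count != NULL) {
                long bytes = atol(size) * atol(count);
                printf("%s queue: %s buffers x %s bytes = %ld KiB\n",
                       type, count, size, bytes / 1024);
            }
        }
        return 0;
    }

By the same arithmetic, the cxgb4 .ini value P,65536,64 describes a single
per-peer queue with 64 x 64 KiB receive buffers, about 4 MiB per peer.)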