On Mon, Mar 10, 2008 at 10:03:27AM -0500, Jeff Squyres wrote:
> On Mar 10, 2008, at 9:50 AM, Steve Wise wrote:
>
> > (just thinking out loud here): The OMPi code could be designed to _not_
> > assume recv's are posted until the CPC indicates they are ready. IE sort
> > of asynchronous behavior. When the recvs are ready, the CPC could
> > up-call the btl and then the credits could be updated. This sounds
> > painful though :)
>
> That's the way it works, but only for the initial credits. The CPC is
> not involved beyond that.
>
> So it's likely that you'll still have this problem after initial
> wireup for OMPI PP QP's (except as I noted below, if we only allow
> that chelsio rnic to only have one PP QP and it has to be qp 0).
>
> > On the single-QP angle, Can I just run OMPI with only specifying 1 QP?
> > Or will that require coding changes?
>
> No coding changes required; just change the value of
> mca_btl_openib_receive_queues.
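For anyone following along, here is a rough sketch of what "only 1 PP QP" means in terms of the receive_queues value. It assumes the usual colon-separated list of QP specs, each starting with P (per-peer) or S (shared receive queue). The helper names are hypothetical and not part of Open MPI, and the buffer sizes and counts are illustrative only.

```python
# Hypothetical helpers, not part of Open MPI: inspect a receive_queues string.
# Assumed format: QP specs separated by ":", fields separated by ",",
# first field is the QP type ("P" = per-peer, "S" = shared receive queue).

def parse_receive_queues(spec):
    """Split a receive_queues value into (type, [int params]) tuples."""
    queues = []
    for entry in spec.split(":"):
        fields = entry.strip().split(",")
        qp_type = fields[0].upper()
        if qp_type not in ("P", "S"):
            raise ValueError(f"unknown QP type in {entry!r}")
        queues.append((qp_type, [int(f) for f in fields[1:]]))
    return queues


def is_single_pp_qp(spec):
    """True if the spec defines exactly one QP and it is per-peer (so it is QP 0)."""
    queues = parse_receive_queues(spec)
    return len(queues) == 1 and queues[0][0] == "P"


# Example values; the numbers are placeholders, not tuned settings.
multi_qp = "P,128,256,192,128:S,65536,256,128,32"
single_pp = "P,65536,256,192,128"

print(is_single_pp_qp(multi_qp))   # False
print(is_single_pp_qp(single_pp))  # True
```

On the command line this would presumably be passed as something like `mpirun --mca btl_openib_receive_queues P,65536,256,192,128 ...`, again with placeholder numbers.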
Specifying only 1 PP QP via the command line seems to be working. It now
passes a test that failed 100% of the time with the credit issue on my
2-node cluster. Further tests on a larger setup are still pending, but
this looks like a good workaround.

I think adding an additional field to the mca-btl-openib-hca-params.ini
file to have the 1 PP QP by default would be a good long(er) term
solution to this. This way those adapters that have this deficiency can
specify it and should work "out of the box".

Thoughts?

Thanks,
Jon

>
> --
> Jeff Squyres
> Cisco Systems