On Mon, Mar 10, 2008 at 10:03:27AM -0500, Jeff Squyres wrote:
> On Mar 10, 2008, at 9:50 AM, Steve Wise wrote:
> 
> > (just thinking out loud here): The OMPI code could be designed to _not_
> > assume recvs are posted until the CPC indicates they are ready, i.e. sort
> > of asynchronous behavior.  When the recvs are ready, the CPC could
> > up-call the btl and then the credits could be updated.  This sounds
> > painful though :)
> 
> That's the way it works, but only for the initial credits.  The CPC is  
> not involved beyond that.
> 
> So it's likely that you'll still have this problem after initial
> wireup for OMPI PP QPs (except, as I noted below, if we allow that
> Chelsio RNIC to have only one PP QP, and it has to be QP 0).
> 
> > On the single-QP angle, Can I just run OMPI with only specifying 1 QP?
> > Or will that require coding changes?
> 
> 
> No coding changes required; just change the value of  
> mca_btl_openib_receive_queues.

Specifying only 1 PP QP via the command line seems to be working.  It now
passes a test that failed 100% of the time with the credit issue on my
2-node cluster.  Further tests on a larger setup are still pending, but
this looks like a good workaround.
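
For anyone wanting to try the same workaround, here is a minimal sketch of
how such a command line could be put together.  It assumes the usual
per-peer entry layout for btl_openib_receive_queues (P, buffer size,
number of buffers, then optional low watermark and credit window).  The
sizes and counts, the process count, and the ./ring_test binary are
illustrative assumptions, not the values used in the test above.

```python
import shlex

def pp_queue_spec(size, num_buffers, low_watermark=None, window=None):
    """Build one per-peer ("P") entry for btl_openib_receive_queues."""
    fields = ["P", str(size), str(num_buffers)]
    # Optional fields are positional, so stop at the first one left unset.
    for value in (low_watermark, window):
        if value is None:
            break
        fields.append(str(value))
    return ",".join(fields)

def mpirun_command(queue_spec, nprocs, program):
    """Return the argv for an mpirun limited to the given receive queues."""
    return ["mpirun",
            "--mca", "btl", "openib,self",
            "--mca", "btl_openib_receive_queues", queue_spec,
            "-np", str(nprocs), program]

# Illustrative values only: a single PP QP and nothing else.
spec = pp_queue_spec(65536, 256, 192, 128)
print(shlex.join(mpirun_command(spec, 2, "./ring_test")))
```

Because the spec contains a single P entry and no other queues, the openib
BTL would be limited to one PP QP for the run.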

I think adding a field to the mca-btl-openib-hca-params.ini file that
makes 1 PP QP the default for a given adapter would be a good long(er)
term solution to this.  That way, adapters that have this deficiency can
specify it and should work "out of the box".

Thoughts?

Thanks,
Jon

> 
> -- 
> Jeff Squyres
> Cisco Systems
> 
