On Wed, Feb 13, 2008 at 09:05:24AM -0500, Jeff Squyres wrote:
> Actually, we should then also print out a different error message when
> RNR occurs in PP QPs, too.  It should be something along the lines of
> "flow control problem occurred; this shouldn't happen..." (right now
> it says RNR happened, and goes into detail about what that means -- but
> that's not the real problem).
> 
Good point.

> I'll do that as well.
Thanks!

> 
> 
> On Feb 13, 2008, at 12:59 AM, Gleb Natapov wrote:
> 
> > On Tue, Feb 12, 2008 at 05:41:13PM -0500, Jeff Squyres wrote:
> >> I see that in the OOB CPC for the openib BTL, when setting up the send
> >> side of the QP, we set the rnr_retry value depending on whether the
> >> remote receive queue is per-peer (PP) or SRQ:
> >>
> >> - SRQ: btl_openib_rnr_retry MCA param value
> >> - PP: 0
> >>
> >> The rationale given in a comment is that setting the RNR to 0 is a
> >> good way to find bugs in our flow control.
> >>
> >> Do we really want this in production builds?  Or do we want 0 for
> >> developer builds and the same btl_openib_rnr_retry value for PP queues?
> >>
> > The comment is mine and IMO it should stay that way for production
> > builds. SW flow control either works or it doesn't, and if it doesn't I
> > prefer to know about it immediately. Setting PP to some value greater
> > than 0 just delays the manifestation of the problem, and in the case of
> > iWarp such a possibility doesn't even exist.
> >
> > --
> >                     Gleb.
> > _______________________________________________
> > devel mailing list
> > de...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/devel
> 
> 
> -- 
> Jeff Squyres
> Cisco Systems
> 

--
                        Gleb.
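
For reference, the policy discussed in this thread can be sketched roughly
as follows. This is only an illustration under stated assumptions, not the
actual openib BTL code: the enum, both function names, and the message
strings are hypothetical. In a real verbs program the chosen value would go
into the rnr_retry field of struct ibv_qp_attr (with IBV_QP_RNR_RETRY set in
the attribute mask) when the QP is moved to RTS.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type for illustration; not the openib BTL's own definition. */
typedef enum {
    RECV_QUEUE_PER_PEER,   /* PP: software flow control should prevent RNR */
    RECV_QUEUE_SRQ         /* SRQ: RNR can legitimately occur */
} recv_queue_type_t;

/*
 * Pick the RNR retry count for the send side of a QP, based on the type of
 * the remote receive queue. mca_rnr_retry stands in for the value of the
 * btl_openib_rnr_retry MCA parameter. The verbs rnr_retry field is 3 bits
 * wide, so valid values are 0..7 (7 means "retry indefinitely").
 */
static uint8_t choose_rnr_retry(recv_queue_type_t remote_type,
                                uint8_t mca_rnr_retry)
{
    assert(mca_rnr_retry <= 7);

    if (RECV_QUEUE_PER_PEER == remote_type) {
        /* Never retry: any RNR here exposes a flow control bug at once. */
        return 0;
    }
    return mca_rnr_retry;
}

/*
 * Pick the message to print when a send completes with an "RNR retry
 * exceeded" error. On a PP QP the RNR is only a symptom; the underlying
 * problem is broken flow control, so say that instead.
 */
static const char *rnr_error_message(recv_queue_type_t remote_type)
{
    if (RECV_QUEUE_PER_PEER == remote_type) {
        return "flow control problem occurred; this shouldn't happen";
    }
    return "receiver not ready (RNR) retry count exceeded";
}
```

With this split, a PP QP always gets 0 regardless of the MCA parameter, and
an SRQ QP gets whatever the user configured.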
