Re: Potential lost receive WCs (was "[PATCH WIP 38/43]")

Chuck Lever Wed, 29 Jul 2015 14:20:07 -0700

On Jul 29, 2015, at 5:15 PM, Jason Gunthorpe <jguntho...@obsidianresearch.com> wrote:


> On Wed, Jul 29, 2015 at 04:47:59PM -0400, Chuck Lever wrote:
> 
>> Apparently this is true for some providers, and not for others, and
>> I misunderstood that when I put this together last year.
> 
> Really? In kernel providers? Interesting, those are probably wrong...
> 
>>> The idea that you can completely drain the CQ during the upcall is
>>> inherently racy, so this cannot be the answer to whatever the problem
>>> is.
> 
> This comment was directed toward using a complete drain to cover up a
> driver bug.
> 
> A full drain to guarantee ULP progress is OK and the driver must make
> sure that case isn't racy.
> 
> Which is done via:
> 
>> I thought IB_CQ_REPORT_MISSED_EVENTS was supposed to close the race
>> windows here.
> 
> Basically:
> * Don't call ib_req_notify_cq unless you think the CQ is empty
> * Don't expect an upcall until you call ib_req_notify_cq
> * Call ib_req_notify_cq last
> 
>> And Section 8.2.5 of draft-hilland-rddp-verbs recommends dequeuing
>> all existing CQEs.
> 
> The drivers we have that don't dequeue all the CQEs are doing
> something like NAPI polling and have other mechanisms to guarantee
> progress. Don't copy something like budget without copying the other
> mechanisms :)

OK, that makes total sense. Thanks for clarifying.
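
For reference, the three rules quoted above (poll until the CQ looks
empty, request notification last, and poll again if the request reports
possibly missed events) can be sketched in user space roughly as below.
This is only an illustration under stated assumptions: struct mock_cq and
every mock_* function are hypothetical stand-ins, with mock_poll_cq()
modelling ib_poll_cq() for a single entry and mock_req_notify_cq()
modelling ib_req_notify_cq() called with
IB_CQ_NEXT_COMP | IB_CQ_REPORT_MISSED_EVENTS. It is not code from any
provider or ULP.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MOCK_CQ_DEPTH 16

/* Hypothetical stand-in for a completion queue; not a kernel type. */
struct mock_cq {
    int entries[MOCK_CQ_DEPTH];
    size_t head;
    size_t count;
    bool armed;            /* true once notification has been requested */
    int race_completions;  /* completions that land just before the re-arm */
};

/* Queue one completion, as the (simulated) hardware would. */
static void mock_post(struct mock_cq *cq, int wc)
{
    assert(cq->count < MOCK_CQ_DEPTH);
    cq->entries[(cq->head + cq->count) % MOCK_CQ_DEPTH] = wc;
    cq->count++;
}

/* Models a one-entry poll: returns 1 and stores a completion, 0 if empty. */
static int mock_poll_cq(struct mock_cq *cq, int *wc)
{
    if (cq->count == 0)
        return 0;
    *wc = cq->entries[cq->head];
    cq->head = (cq->head + 1) % MOCK_CQ_DEPTH;
    cq->count--;
    return 1;
}

/*
 * Models a notify request that reports missed events: arms the mock CQ
 * and returns > 0 if completions are already sitting in the queue.
 */
static int mock_req_notify_cq(struct mock_cq *cq)
{
    /* Simulate the race: completions arrive between the last poll and the arm. */
    while (cq->race_completions > 0) {
        mock_post(cq, 1000 + cq->race_completions);
        cq->race_completions--;
    }
    cq->armed = true;
    return cq->count > 0 ? 1 : 0;
}

/* Upcall body: returns the number of completions handled. */
static int mock_cq_upcall(struct mock_cq *cq)
{
    int wc;
    int handled = 0;

    cq->armed = false;  /* the notification that fired was one-shot */
    do {
        /* Rule 1: only ask for notification once the CQ looks empty. */
        while (mock_poll_cq(cq, &wc))
            handled++;
        /* Rule 3: the notify request comes last; repeat if it reports a miss. */
    } while (mock_req_notify_cq(cq) > 0);

    /* Rule 2: no further upcall is expected until the request above was made. */
    return handled;
}
```

In this sketch, completions that slip in just before the re-arm are
picked up by the second pass through the loop, so the upcall returns
only with the queue empty and notification requested.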


--
Chuck Lever


