During some other testing I found that when a completion upcall
returns to the provider leaving CQEs still on the completion queue,
there is a non-zero probability that a completion will be lost.
>>>
>>> What does lost mean?
>>
>> Lost means a WC in the CQ is skipped by ib_poll_cq
On Fri, Jul 24, 2015 at 04:26:00PM -0400, Chuck Lever wrote:
> Basically RPC work flow stopped because an RPC reply never
> arrived.
Oh, that is what I expect to see.. Remember the cq upcall is edge
triggered, so if you leave stuff in the cq then you don't get another
upcall until another CQE is a
Hi Jason-
On Jul 24, 2015, at 4:46 PM, Jason Gunthorpe wrote:
> On Fri, Jul 24, 2015 at 04:26:00PM -0400, Chuck Lever wrote:
>> Basically RPC work flow stopped because an RPC reply never
>> arrived.
>
> Oh, that is what I expect to see.. Remember the cq upcall is edge
> triggered, so if you l
On Wed, Jul 29, 2015 at 04:47:59PM -0400, Chuck Lever wrote:
> Apparently this is true for some providers, and not for others, and
> I misunderstood that when I put this together last year.
Really? In kernel providers? Interesting, those are probably wrong...
> > The idea that you can completely
On Jul 29, 2015, at 5:15 PM, Jason Gunthorpe wrote:
> On Wed, Jul 29, 2015 at 04:47:59PM -0400, Chuck Lever wrote:
>
>> Apparently this is true for some providers, and not for others, and
>> I misunderstood that when I put this together last year.
>
> Really? In kernel providers? Interesting,
The drivers we have that don't dequeue all the CQEs are doing
something like NAPI polling and have other mechanisms to guarantee
progress. Don't copy something like budget without copying the other
mechanisms :)
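
To make the point above concrete, here is a minimal sketch in plain C of a toy edge-triggered completion queue. It is a simulation only, not the kernel verbs API: `toy_cq`, `toy_poll`, `toy_rearm`, `toy_post`, and both handlers are hypothetical names invented for illustration. `toy_rearm` is modelled loosely on the way `ib_req_notify_cq()` with `IB_CQ_REPORT_MISSED_EVENTS` can report that CQEs were already queued when the CQ was re-armed. The sketch assumes that an upcall fires only when a new CQE arrives on an armed CQ, which is the edge-triggered behavior described in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of an edge-triggered CQ (hypothetical, not the verbs API). */
struct toy_cq {
    int pending;   /* CQEs queued and not yet polled */
    bool armed;    /* true if the next arriving CQE raises an upcall */
    int upcalls;   /* number of upcalls delivered so far */
    int handled;   /* number of CQEs consumed by the handler */
};

/* Dequeue up to max CQEs and return how many were dequeued. */
static int toy_poll(struct toy_cq *cq, int max)
{
    int n = cq->pending < max ? cq->pending : max;

    cq->pending -= n;
    cq->handled += n;
    return n;
}

/* Re-arm the CQ. Returns true if CQEs are already queued, meaning
 * no new edge will announce them (a "missed event"). */
static bool toy_rearm(struct toy_cq *cq)
{
    cq->armed = true;
    return cq->pending > 0;
}

/* Handler A: polls one budget's worth, re-arms, and returns without
 * checking for leftovers or scheduling any further polling. */
static void upcall_budget_only(struct toy_cq *cq, int budget)
{
    toy_poll(cq, budget);
    (void)toy_rearm(cq);   /* missed-event indication is ignored */
}

/* Handler B: polls in budget-sized batches until the CQ is empty,
 * re-arms, and polls again if the re-arm reports queued CQEs. */
static void upcall_drain(struct toy_cq *cq, int budget)
{
    do {
        while (toy_poll(cq, budget) > 0)
            ;
    } while (toy_rearm(cq));
}

/* "Hardware" posts n CQEs. An upcall fires only if the CQ is armed,
 * and firing disarms it: that is the edge-triggered assumption. */
static void toy_post(struct toy_cq *cq, int n,
                     void (*upcall)(struct toy_cq *, int), int budget)
{
    cq->pending += n;
    if (cq->armed) {
        cq->armed = false;
        cq->upcalls++;
        upcall(cq, budget);
    }
}

/* Post one burst, then let traffic stop. Returns the number of CQEs
 * left on the queue with nothing scheduled to poll them. */
static int stranded_after(void (*upcall)(struct toy_cq *, int),
                          int posted, int budget)
{
    struct toy_cq cq = { .pending = 0, .armed = true,
                         .upcalls = 0, .handled = 0 };

    toy_post(&cq, posted, upcall, budget);
    return cq.pending;
}
```

With a burst of 5 CQEs and a budget of 2, `stranded_after(upcall_budget_only, 5, 2)` leaves 3 CQEs on the queue once traffic stops, while `stranded_after(upcall_drain, 5, 2)` leaves none. A handler that really does need to stop at a budget would have to arrange its own follow-up polling (for example by rescheduling itself), which is the "other mechanisms" part that this toy model leaves out.
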
OK, that makes total sense. Thanks for clarifying.
IIRC NAPI is soft-IRQ which c
On Jul 30, 2015, at 3:00 AM, Sagi Grimberg wrote:
>
>>> The drivers we have that don't dequeue all the CQEs are doing
>>> something like NAPI polling and have other mechanisms to guarantee
>>> progress. Don't copy something like budget without copying the other
>>> mechanisms :)
>>
>> OK, that
On Thu, Jul 30, 2015 at 10:00:08AM +0300, Sagi Grimberg wrote:
> I still think that draining the CQ without respecting a quota is
> wrong, even if driverX has a glitch there.
Sure, but you can't just return from the CQ upcall after doing a
budget and expect to be called again in the future. That