From: Krishna Kumar2 <[EMAIL PROTECTED]>
Date: Thu, 20 Sep 2007 11:24:01 +0530
> Ran 4/16/64 thread iperf on latest bits with this patch and no issues after
> 30 mins. I used to consistently get the bug within 1-2 mins with just 4
> threads prior to this patch.
>
> Tested-by: Krishna Kumar <[EM
From: Krishna Kumar2 <[EMAIL PROTECTED]>
Date: Thu, 20 Sep 2007 10:48:15 +0530
> About the "list deletion occurs", isn't the race I mentioned still present?
> If done < budget, the driver does netif_rx_complete (at which time some
> other cpu can add this NAPI to their list). But the first cpu mig
Ran 4/16/64 thread iperf on latest bits with this patch and no issues after
30 mins. I used to consistently get the bug within 1-2 mins with just 4
threads prior to this patch.
Tested-by: Krishna Kumar <[EMAIL PROTECTED]>
(if any value in that)
thanks,
- KK
David Miller <[EMAIL PROTECTED]> wrot
Hi Dave,
David Miller <[EMAIL PROTECTED]> wrote on 09/19/2007 09:35:57 PM:
> The NAPI_STATE_SCHED flag bit should provide all of the necessary
> synchronization.
>
> Only the setter of that bit should add the NAPI instance to the
> polling list.
>
> The polling loop runs atomically on the cpu whe
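The rule quoted above (only the setter of NAPI_STATE_SCHED adds the NAPI instance to the polling list) can be illustrated with a small user-space sketch. This is not the kernel code: it uses C11 atomics in place of the kernel's bit operations, and every name prefixed with sketch_ or SKETCH_ is hypothetical. It assumes a single-linked list per CPU and that the instance being completed sits at the head of its owner's list.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the NAPI_STATE_SCHED bit. */
#define SKETCH_STATE_SCHED 1u

/* Hypothetical, simplified NAPI instance. */
struct sketch_napi {
    atomic_uint state;          /* holds SKETCH_STATE_SCHED */
    struct sketch_napi *next;   /* link in one CPU's poll list */
};

/* Hypothetical per-CPU poll list. */
struct sketch_poll_list {
    struct sketch_napi *head;
    size_t len;
};

/* Atomic test-and-set: true only for the caller that flipped the bit 0 -> 1. */
static bool sketch_schedule_prep(struct sketch_napi *n)
{
    unsigned int old = atomic_fetch_or(&n->state, SKETCH_STATE_SCHED);
    return (old & SKETCH_STATE_SCHED) == 0;
}

/* Only the winner of the test-and-set links the instance into its list. */
static bool sketch_schedule(struct sketch_poll_list *list, struct sketch_napi *n)
{
    if (!sketch_schedule_prep(n))
        return false;           /* someone else owns it; do not touch the list */
    n->next = list->head;
    list->head = n;
    list->len++;
    return true;
}

/*
 * The owner unlinks the instance first and clears the bit last, so no
 * other CPU can win the test-and-set while the instance is still linked.
 */
static void sketch_complete(struct sketch_poll_list *list, struct sketch_napi *n)
{
    assert(atomic_load(&n->state) & SKETCH_STATE_SCHED);
    assert(list->head == n);    /* assumption of this sketch: n is at the head */
    list->head = n->next;
    n->next = NULL;
    list->len--;
    atomic_fetch_and(&n->state, ~SKETCH_STATE_SCHED);
}
```

In this sketch a second call to sketch_schedule() on an already scheduled instance returns false and leaves both lists untouched. Once the owner has called sketch_complete(), the instance can be scheduled again, possibly onto another CPU's list.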
From: Krishna Kumar2 <[EMAIL PROTECTED]>
Date: Thu, 20 Sep 2007 10:40:33 +0530
> I like the clean changes made by Dave to fix this, and will test it
> today (if I can get my crashed system to come up).
I would very much appreciate this testing, as I'm rather sure we've
plugged up the most serious
Hi Jan-Bernd,
Jan-Bernd Themann <[EMAIL PROTECTED]> wrote on 09/19/2007 06:53:48 PM:
> If I understood it right, the problem you describe (quota update in
> __napi_schedule) can cause further problems when you choose the
> following numbers:
>
> CPU1: A. process 99 pkts
> CPU1: B. netif_rx_complete
From: Krishna Kumar <[EMAIL PROTECTED]>
Date: Wed, 19 Sep 2007 17:24:03 +0530
> Note: during steps F-H and C-E, priv/napi is read/modified by both CPUs,
> which is another bug relating to the same race.
>
> I guess the above patch is not required if this bug (in IPoIB) is fixed?
The NAPI_S
Hi,
On Wednesday 19 September 2007 13:54, Krishna Kumar wrote:
> CPU#1: ipoib_poll(budget=100)
> {
> A. process 100 skbs
> B. netif_rx_complete()
> CPU#2>
> F. ib_req_notify_cq() (no missed completions, do nothing)
> G. return 100
> H. return to net_rx_a
Hi Dave,
After applying Roland's NAPI patch, the system panics when I run
multi-threaded iperf (no stack trace at this time, but it shows that the
panic is in net_tx_action).
I think the problem is:
In the "done < budget" case, ipoib_poll calls netif_rx_complete()
netif_rx_complete()
__netif_rx