Re: Packet gets stuck in NOLOCK pfifo_fast qdisc

2021-04-06 Thread Yunsheng Lin
On 2021/4/6 15:31, Michal Kubecek wrote: > On Tue, Apr 06, 2021 at 10:46:29AM +0800, Yunsheng Lin wrote: >> On 2021/4/6 9:49, Cong Wang wrote: >>> On Sat, Apr 3, 2021 at 5:23 AM Jiri Kosina wrote: I am still planning to have Yunsheng Lin's (CCing) fix [1] tested in the coming days.

Re: Packet gets stuck in NOLOCK pfifo_fast qdisc

2021-04-06 Thread Yunsheng Lin
On 2021/4/6 18:13, Juergen Gross wrote: > On 06.04.21 09:06, Michal Kubecek wrote: >> On Tue, Apr 06, 2021 at 08:55:41AM +0800, Yunsheng Lin wrote: >>> >>> Hi, Jiri >>> Do you have a reproducer that can be shared here? >>> With reproducer, I can debug and test it myself too. >> >> I'm afraid we

Re: Packet gets stuck in NOLOCK pfifo_fast qdisc

2021-04-06 Thread Juergen Gross
On 06.04.21 09:06, Michal Kubecek wrote: On Tue, Apr 06, 2021 at 08:55:41AM +0800, Yunsheng Lin wrote: Hi, Jiri Do you have a reproducer that can be shared here? With reproducer, I can debug and test it myself too. I'm afraid we are not aware of a simple reproducer. As mentioned in the

Re: Packet gets stuck in NOLOCK pfifo_fast qdisc

2021-04-06 Thread Michal Kubecek
On Tue, Apr 06, 2021 at 10:46:29AM +0800, Yunsheng Lin wrote: > On 2021/4/6 9:49, Cong Wang wrote: > > On Sat, Apr 3, 2021 at 5:23 AM Jiri Kosina wrote: > >> > >> I am still planning to have Yunsheng Lin's (CCing) fix [1] tested in the > >> coming days. If it works, then we can consider

Re: Packet gets stuck in NOLOCK pfifo_fast qdisc

2021-04-06 Thread Michal Kubecek
On Tue, Apr 06, 2021 at 08:55:41AM +0800, Yunsheng Lin wrote: > > Hi, Jiri > Do you have a reproducer that can be shared here? > With reproducer, I can debug and test it myself too. I'm afraid we are not aware of a simple reproducer. As mentioned in the original discussion, the race window is

Re: Packet gets stuck in NOLOCK pfifo_fast qdisc

2021-04-05 Thread Yunsheng Lin
On 2021/4/6 9:49, Cong Wang wrote: > On Sat, Apr 3, 2021 at 5:23 AM Jiri Kosina wrote: >> >> I am still planning to have Yunsheng Lin's (CCing) fix [1] tested in the >> coming days. If it works, then we can consider proceeding with it, >> otherwise I am all for reverting the whole NOLOCK stuff.

Re: Packet gets stuck in NOLOCK pfifo_fast qdisc

2021-04-05 Thread Cong Wang
On Sat, Apr 3, 2021 at 5:23 AM Jiri Kosina wrote: > > I am still planning to have Yunsheng Lin's (CCing) fix [1] tested in the > coming days. If it works, then we can consider proceeding with it, > otherwise I am all for reverting the whole NOLOCK stuff. > > [1] >

Re: Packet gets stuck in NOLOCK pfifo_fast qdisc

2021-04-05 Thread Yunsheng Lin
On 2021/4/3 20:23, Jiri Kosina wrote: > On Sat, 3 Apr 2021, Hillf Danton wrote: > > Sure. Seems they crept in over time. I had some plans to write a > lockless HTB implementation. But with fq+EDT with BPF it seems that > it is no longer needed, we have a more generic/better solution.

Re: Packet gets stuck in NOLOCK pfifo_fast qdisc

2021-04-03 Thread Jiri Kosina
On Sat, 3 Apr 2021, Hillf Danton wrote: > >>> Sure. Seems they crept in over time. I had some plans to write a > >>> lockless HTB implementation. But with fq+EDT with BPF it seems that > >>> it is no longer needed, we have a more generic/better solution. So > >>> I dropped it. Also most folks

Re: Packet gets stuck in NOLOCK pfifo_fast qdisc

2021-04-02 Thread Josh Hunt
On 4/2/21 12:25 PM, Jiri Kosina wrote: On Thu, 3 Sep 2020, John Fastabend wrote: At this point I fear we could consider reverting the NOLOCK stuff. I personally would hate doing so, but it looks like NOLOCK benefits are outweighed by its issues. I agree, NOLOCK brings more pains than gains.

Re: Packet gets stuck in NOLOCK pfifo_fast qdisc

2021-04-02 Thread Jiri Kosina
On Thu, 3 Sep 2020, John Fastabend wrote: > > > At this point I fear we could consider reverting the NOLOCK stuff. > > > I personally would hate doing so, but it looks like NOLOCK benefits are > > > outweighed by its issues. > > > > I agree, NOLOCK brings more pains than gains. There are many

Re: Packet gets stuck in NOLOCK pfifo_fast qdisc

2020-09-17 Thread Kehuan Feng
Sorry, guys, the experiment environment no longer exists. We finally used fq_codel for the online product. On Fri, Sep 18, 2020 at 3:52 AM Cong Wang wrote: > > On Sun, Sep 13, 2020 at 7:10 PM Yunsheng Lin wrote: > > > > On 2020/9/11 4:19, Cong Wang wrote: > > > On Thu, Sep 3, 2020 at 8:21 PM Kehuan Feng

Re: Packet gets stuck in NOLOCK pfifo_fast qdisc

2020-09-17 Thread Cong Wang
On Sun, Sep 13, 2020 at 7:10 PM Yunsheng Lin wrote: > > On 2020/9/11 4:19, Cong Wang wrote: > > On Thu, Sep 3, 2020 at 8:21 PM Kehuan Feng wrote: > >> I also tried Cong's patch (shown below on my tree) and it could avoid > >> the issue (stressing for 30 minutes three times and no jitter >

Re: Packet gets stuck in NOLOCK pfifo_fast qdisc

2020-09-13 Thread Yunsheng Lin
On 2020/9/11 4:19, Cong Wang wrote: > On Thu, Sep 3, 2020 at 8:21 PM Kehuan Feng wrote: >> I also tried Cong's patch (shown below on my tree) and it could avoid >> the issue (stressing for 30 minutes three times and no jitter >> observed). > > Thanks for verifying it! > >> >> ---

Re: Packet gets stuck in NOLOCK pfifo_fast qdisc

2020-09-10 Thread Paolo Abeni
On Thu, 2020-09-10 at 14:07 -0700, John Fastabend wrote: > Cong Wang wrote: > > On Thu, Sep 3, 2020 at 10:08 PM John Fastabend > > wrote: > > > Maybe this would unlock us, > > > > > > diff --git a/net/core/dev.c b/net/core/dev.c > > > index 7df6c9617321..9b09429103f1 100644 > > > ---

Re: Packet gets stuck in NOLOCK pfifo_fast qdisc

2020-09-10 Thread John Fastabend
Cong Wang wrote: > On Thu, Sep 3, 2020 at 10:08 PM John Fastabend > wrote: > > Maybe this would unlock us, > > > > diff --git a/net/core/dev.c b/net/core/dev.c > > index 7df6c9617321..9b09429103f1 100644 > > --- a/net/core/dev.c > > +++ b/net/core/dev.c > > @@ -3749,7 +3749,7 @@ static inline

Re: Packet gets stuck in NOLOCK pfifo_fast qdisc

2020-09-10 Thread Cong Wang
On Thu, Sep 3, 2020 at 8:21 PM Kehuan Feng wrote: > I also tried Cong's patch (shown below on my tree) and it could avoid > the issue (stressing for 30 minutes three times and no jitter > observed). Thanks for verifying it! > > --- ./include/net/sch_generic.h.orig 2020-08-21

Re: Packet gets stuck in NOLOCK pfifo_fast qdisc

2020-09-10 Thread Cong Wang
On Thu, Sep 3, 2020 at 10:08 PM John Fastabend wrote: > Maybe this would unlock us, > > diff --git a/net/core/dev.c b/net/core/dev.c > index 7df6c9617321..9b09429103f1 100644 > --- a/net/core/dev.c > +++ b/net/core/dev.c > @@ -3749,7 +3749,7 @@ static inline int __dev_xmit_skb(struct sk_buff

Re: Packet gets stuck in NOLOCK pfifo_fast qdisc

2020-09-03 Thread John Fastabend
Cong Wang wrote: > On Thu, Sep 3, 2020 at 1:40 AM Paolo Abeni wrote: > > > > On Wed, 2020-09-02 at 22:01 -0700, Cong Wang wrote: > > > Can you test the attached one-line fix? I think we are overthinking, > > > probably all > > > we need here is a busy wait. > > > > I think that will solve, but I

Re: Packet gets stuck in NOLOCK pfifo_fast qdisc

2020-09-03 Thread Kehuan Feng
Hi Hillf, Cong, Paolo, Sorry for the late reply due to other urgent task. I tried Hillf's patch (shown below on my tree) and it doesn't help and the jitter shows up very quickly. --- ./include/net/sch_generic.h.orig 2020-08-21 15:13:51.787952710 +0800 +++ ./include/net/sch_generic.h 2020-09-04

Re: Packet gets stuck in NOLOCK pfifo_fast qdisc

2020-09-03 Thread Cong Wang
On Thu, Sep 3, 2020 at 1:40 AM Paolo Abeni wrote: > > On Wed, 2020-09-02 at 22:01 -0700, Cong Wang wrote: > > Can you test the attached one-line fix? I think we are overthinking, > > probably all > > we need here is a busy wait. > > I think that will solve, but I also think that will kill NOLOCK

Re: Packet gets stuck in NOLOCK pfifo_fast qdisc

2020-09-03 Thread Paolo Abeni
On Wed, 2020-09-02 at 22:01 -0700, Cong Wang wrote: > Can you test the attached one-line fix? I think we are overthinking, > probably all > we need here is a busy wait. I think that will solve, but I also think that will kill NOLOCK performances due to really increased contention. At this point

Re: Packet gets stuck in NOLOCK pfifo_fast qdisc

2020-09-02 Thread Cong Wang
Hello, Kehuan Can you test the attached one-line fix? I think we are overthinking, probably all we need here is a busy wait. Thanks. diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h index d60e7c39d60c..fc1bacdb102b 100644 --- a/include/net/sch_generic.h +++
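The patch itself is cut off in this preview, so the following is only a sketch of the idea behind "all we need here is a busy wait", under the assumption that it means waiting for the qdisc lock instead of giving up when a try-lock fails. It is a user-space toy, not kernel code: `send_while_owner_holds`, the `"pkt"` payload, and the event names are all hypothetical. Events force the same interleaving in both modes so the outcome is deterministic.

```python
import threading
from collections import deque


def send_while_owner_holds(blocking):
    """Enqueue one packet while another thread holds the lock.

    blocking=False models a sender that gives up on a failed try-lock.
    blocking=True models a sender that waits for the lock (assumed reading
    of "busy wait"). Returns (packets sent, packets left in the queue).
    """
    queue = deque()
    lock = threading.Lock()
    owner_holds = threading.Event()
    sender_done = threading.Event()
    sent = []

    def owner():
        # The owner has already found the queue empty; it keeps the lock
        # only long enough for the sender's enqueue to land meanwhile.
        with lock:
            owner_holds.set()
            sender_done.wait()

    def sender():
        owner_holds.wait()
        queue.append("pkt")
        if blocking:
            sender_done.set()   # owner may now release
            lock.acquire()      # wait until the lock is free
        else:
            got = lock.acquire(blocking=False)  # owner still holds it
            sender_done.set()
            if not got:
                return          # nobody drains the queue afterwards
        try:
            while queue:
                sent.append(queue.popleft())
        finally:
            lock.release()

    threads = [threading.Thread(target=owner), threading.Thread(target=sender)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sent, list(queue)
```

In this toy, the try-lock sender leaves its packet behind, while the waiting sender drains it itself, at the price of every sender contending for the same lock.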

Re: Packet gets stuck in NOLOCK pfifo_fast qdisc

2020-08-27 Thread Kehuan Feng
Hi Hillf, Unfortunately, the above memory barriers don't help. The issue shows up within 1 minute ... On Thu, Aug 27, 2020 at 8:58 PM Hillf Danton wrote: > > > On Thu, 27 Aug 2020 14:56:31 +0800 Kehuan Feng wrote: > > > > > Let's see if TCQ_F_NOLOCK is making fq_codel different in your testing. > > > > I assume you

Re: Packet gets stuck in NOLOCK pfifo_fast qdisc

2020-08-27 Thread Kehuan Feng
Hi Hillf, > Let's see if TCQ_F_NOLOCK is making fq_codel different in your testing. I assume you meant disabling NOLOCK for pfifo_fast. Here is the modification, --- ./net/sched/sch_generic.c.orig 2020-08-24 22:02:04.589830751 +0800 +++ ./net/sched/sch_generic.c 2020-08-27

Re: Packet gets stuck in NOLOCK pfifo_fast qdisc

2020-08-25 Thread Kehuan Feng
Hi Hillf, Thanks for the patch. I just tried it and it looks better than the previous one. The issue appeared only once over ~30 mins of stressing (without the patch, it usually shows up within 1 min, so I feel like we are getting close to the final fix) (pasted the modifications on my tree in case

Re: Packet gets stuck in NOLOCK pfifo_fast qdisc

2020-08-25 Thread Fengkehuan Feng
Hi Hillf, I just tried the updated version and the system can boot up now. It does mitigate the issue a lot but still doesn't get rid of it completely. It seems to me like the effect of Cong's patch. On Tue, Aug 25, 2020 at 11:23 AM Hillf Danton wrote: > > > Hi Feng, > > On Tue, 25 Aug 2020 10:18:05 +0800

Re: Packet gets stuck in NOLOCK pfifo_fast qdisc

2020-08-24 Thread Fengkehuan Feng
Hillf, With the latest version (attached what I have changed on my tree), the system failed to start up, with a CPU stall. On Sat, Aug 22, 2020 at 11:30 AM Hillf Danton wrote: > > > On Thu, 20 Aug 2020 20:43:17 +0800 Hillf Danton wrote: > > Hi Jike, > > > > On Thu, 20 Aug 2020 15:43:17 +0800 Jike Song wrote:

Re: Packet gets stuck in NOLOCK pfifo_fast qdisc

2020-08-20 Thread Josh Hunt
Hi Jike On 8/20/20 12:43 AM, Jike Song wrote: Hi Josh, We met possibly the same problem when testing nvidia/mellanox's GPUDirect RDMA product, we found that changing NET_SCH_DEFAULT to DEFAULT_FQ_CODEL mitigated the problem, having no idea why. Maybe you can also have a try? We also did

Re: Packet gets stuck in NOLOCK pfifo_fast qdisc

2020-08-20 Thread Jike Song
Hi Josh, On Fri, Jul 3, 2020 at 2:14 AM Josh Hunt wrote: {snip} > Initial results with Cong's patch look promising, so far no stalls. We > will let it run over the long weekend and report back on Tuesday. > > Paolo - I have concerns about possible performance regression with the > change as

Re: Packet gets stuck in NOLOCK pfifo_fast qdisc

2020-07-09 Thread Paolo Abeni
On Wed, 2020-07-08 at 13:16 -0700, Cong Wang wrote: > On Tue, Jul 7, 2020 at 7:18 AM Paolo Abeni wrote: > > So the regression with 2 pktgen threads is still relevant. 'perf' shows > > relevant time spent into net_tx_action() and __netif_schedule(). > > So, touching the __QDISC_STATE_SCHED bit in

Re: Packet gets stuck in NOLOCK pfifo_fast qdisc

2020-07-08 Thread Zhivich, Michael
On 7/2/20, 2:08 PM, "Josh Hunt" wrote: > > On 7/2/20 2:45 AM, Paolo Abeni wrote: > > Hi all, > > > > On Thu, 2020-07-02 at 08:14 +0200, Jonas Bonn wrote: > >> Hi Cong, > >> > >> On 01/07/2020 21:58, Cong Wang wrote: > >>> On Wed, Jul 1, 2020 at 9:05 AM Cong Wang wrote: > On Tue, Jun 30,

Re: Packet gets stuck in NOLOCK pfifo_fast qdisc

2020-07-08 Thread Cong Wang
On Tue, Jul 7, 2020 at 7:18 AM Paolo Abeni wrote: > So the regression with 2 pktgen threads is still relevant. 'perf' shows > relevant time spent into net_tx_action() and __netif_schedule(). So, touching the __QDISC_STATE_SCHED bit in __dev_xmit_skb() is not a good idea. Let me see if there is

Re: Packet gets stuck in NOLOCK pfifo_fast qdisc

2020-07-07 Thread Paolo Abeni
On Thu, 2020-07-02 at 11:08 -0700, Josh Hunt wrote: > On 7/2/20 2:45 AM, Paolo Abeni wrote: > > Hi all, > > > > On Thu, 2020-07-02 at 08:14 +0200, Jonas Bonn wrote: > > > Hi Cong, > > > > > > On 01/07/2020 21:58, Cong Wang wrote: > > > > On Wed, Jul 1, 2020 at 9:05 AM Cong Wang > > > > wrote:

Re: Packet gets stuck in NOLOCK pfifo_fast qdisc

2020-07-02 Thread Josh Hunt
On 7/2/20 2:45 AM, Paolo Abeni wrote: Hi all, On Thu, 2020-07-02 at 08:14 +0200, Jonas Bonn wrote: Hi Cong, On 01/07/2020 21:58, Cong Wang wrote: On Wed, Jul 1, 2020 at 9:05 AM Cong Wang wrote: On Tue, Jun 30, 2020 at 2:08 PM Josh Hunt wrote: Do either of you know if there's been any

Re: Packet gets stuck in NOLOCK pfifo_fast qdisc

2020-07-02 Thread Paolo Abeni
Hi all, On Thu, 2020-07-02 at 08:14 +0200, Jonas Bonn wrote: > Hi Cong, > > On 01/07/2020 21:58, Cong Wang wrote: > > On Wed, Jul 1, 2020 at 9:05 AM Cong Wang wrote: > > > On Tue, Jun 30, 2020 at 2:08 PM Josh Hunt wrote: > > > > Do either of you know if there's been any development on a fix

Re: Packet gets stuck in NOLOCK pfifo_fast qdisc

2020-07-02 Thread Jonas Bonn
Hi Cong, On 01/07/2020 21:58, Cong Wang wrote: On Wed, Jul 1, 2020 at 9:05 AM Cong Wang wrote: On Tue, Jun 30, 2020 at 2:08 PM Josh Hunt wrote: Do either of you know if there's been any development on a fix for this issue? If not we can propose something. If you have a reproducer, I can

Re: Packet gets stuck in NOLOCK pfifo_fast qdisc

2020-07-01 Thread Josh Hunt
On 7/1/20 12:58 PM, Cong Wang wrote: On Wed, Jul 1, 2020 at 9:05 AM Cong Wang wrote: On Tue, Jun 30, 2020 at 2:08 PM Josh Hunt wrote: Do either of you know if there's been any development on a fix for this issue? If not we can propose something. If you have a reproducer, I can look into

Re: Packet gets stuck in NOLOCK pfifo_fast qdisc

2020-07-01 Thread Cong Wang
On Wed, Jul 1, 2020 at 9:05 AM Cong Wang wrote: > > On Tue, Jun 30, 2020 at 2:08 PM Josh Hunt wrote: > > Do either of you know if there's been any development on a fix for this > > issue? If not we can propose something. > > If you have a reproducer, I can look into this. Does the attached

Re: Packet gets stuck in NOLOCK pfifo_fast qdisc

2020-07-01 Thread Cong Wang
On Tue, Jun 30, 2020 at 2:08 PM Josh Hunt wrote: > Do either of you know if there's been any development on a fix for this > issue? If not we can propose something. If you have a reproducer, I can look into this. Thanks.

Re: Packet gets stuck in NOLOCK pfifo_fast qdisc

2020-07-01 Thread Jonas Bonn
On 30/06/2020 21:14, Josh Hunt wrote: On 6/23/20 6:42 AM, Michael Zhivich wrote: From: Jonas Bonn To: Paolo Abeni , "net...@vger.kernel.org" , LKML , "David S . Miller" , John Fastabend Subject: Re: Packet gets stuck in NOLOCK pfifo_fast qdisc Date: Fr

Re: Packet gets stuck in NOLOCK pfifo_fast qdisc

2020-06-30 Thread Josh Hunt
On 6/23/20 6:42 AM, Michael Zhivich wrote: From: Jonas Bonn To: Paolo Abeni , "net...@vger.kernel.org" , LKML , "David S . Miller" , John Fastabend Subject: Re: Packet gets stuck in NOLOCK pfifo_fast qdisc Date: Fri, 11 Oct 2019 02:39

Re: Packet gets stuck in NOLOCK pfifo_fast qdisc

2020-06-23 Thread Michael Zhivich
> From: Jonas Bonn > To: Paolo Abeni , > "net...@vger.kernel.org" , > LKML , > "David S . Miller" , > John Fastabend > Subject: Re: Packet gets stuck in NOLOCK pfifo_fast qdisc > Date: Fri, 11 Oct 2019 02:39:48 +0200 >

Re: Packet gets stuck in NOLOCK pfifo_fast qdisc

2019-10-10 Thread Jonas Bonn
Hi Paolo, On 09/10/2019 21:14, Paolo Abeni wrote: Something like the following code - completely untested - can possibly address the issue, but it's a bit rough and I would prefer not adding additional complexity to the lockless qdiscs, can you please have a spin at it? We've tested a couple

Re: Packet gets stuck in NOLOCK pfifo_fast qdisc

2019-10-10 Thread Jonas Bonn
Hi Paolo, On 09/10/2019 21:14, Paolo Abeni wrote: On Wed, 2019-10-09 at 08:46 +0200, Jonas Bonn wrote: Hi, The lockless pfifo_fast qdisc has an issue with packets getting stuck in the queue. What appears to happen is: i) Thread 1 holds the 'seqlock' on the qdisc and dequeues packets. ii)

Re: Packet gets stuck in NOLOCK pfifo_fast qdisc

2019-10-09 Thread Paolo Abeni
On Wed, 2019-10-09 at 08:46 +0200, Jonas Bonn wrote: > Hi, > > The lockless pfifo_fast qdisc has an issue with packets getting stuck in > the queue. What appears to happen is: > > i) Thread 1 holds the 'seqlock' on the qdisc and dequeues packets. > ii) Thread 1 dequeues the last packet in

Packet gets stuck in NOLOCK pfifo_fast qdisc

2019-10-09 Thread Jonas Bonn
Hi, The lockless pfifo_fast qdisc has an issue with packets getting stuck in the queue. What appears to happen is: i) Thread 1 holds the 'seqlock' on the qdisc and dequeues packets. ii) Thread 1 dequeues the last packet in the queue. iii) Thread 1 iterates through the qdisc->dequeue
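The preview above is cut off after step iii), so the sketch below is only one plausible way a packet could end up stuck, not a reconstruction of the report. It assumes that a second sender enqueues while the first still holds the lock, fails its try-lock, and leaves, and that the owner releases without looking at the queue again. `ToyQdisc`, `xmit`, and the packet names are hypothetical; the interleaving is replayed step by step in a single thread so the result is deterministic.

```python
from collections import deque


class ToyQdisc:
    """Toy model of a queue guarded by a try-lock; not kernel code."""

    def __init__(self):
        self.queue = deque()
        self.running = False  # stands in for the 'seqlock' in the report

    def try_begin(self):
        # Non-blocking acquire: succeeds only if nobody is draining.
        if self.running:
            return False
        self.running = True
        return True

    def end(self):
        self.running = False

    def drain(self):
        # Dequeue until the queue looks empty.
        sent = []
        while self.queue:
            sent.append(self.queue.popleft())
        return sent

    def xmit(self, pkt):
        # Enqueue, then drain only if the lock happened to be free.
        self.queue.append(pkt)
        if not self.try_begin():
            return []  # assumes the current owner will send the packet
        sent = self.drain()
        self.end()
        return sent


def stuck_interleaving():
    q = ToyQdisc()
    q.queue.append("pkt-A")
    assert q.try_begin()      # thread 1 takes the lock
    sent = q.drain()          # thread 1 sends the last packet, sees empty
    sent += q.xmit("pkt-B")   # thread 2 enqueues; its try-lock fails
    q.end()                   # thread 1 releases without re-checking
    return sent, list(q.queue)
```

In this model `pkt-B` stays queued until some later `xmit` call happens to take the lock and drain it.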