Re: regression with napi/softirq ?
On Thu, Jul 18, 2019 at 4:08 PM Eric Dumazet wrote:
>
> On 7/18/19 2:55 PM, Sudip Mukherjee wrote:
> > Thanks Eric. But there is no improvement in delay between
> > softirq_raise and softirq_entry with this change.
> > But moving to a later kernel (linus master branch? ) like Thomas has
> > said in the other mail might be difficult atm. I can definitely
> > move to v4.14.133 if that helps. Thomas ?
>
> If you are tracking max latency then I guess you have to tweak
> SOFTIRQ_NOW_MASK to include NET_RX_SOFTIRQ.
>
> The patch I gave earlier would only lower the probability of events, not
> completely get rid of them.
>
> diff --git a/kernel/softirq.c b/kernel/softirq.c
> index 0427a86743a46b7e1891f7b6c1ff585a8a1695f5..302046dd8d7e6740e466c422954f22565fe19e69 100644
> --- a/kernel/softirq.c
> +++ b/kernel/softirq.c
> @@ -81,7 +81,7 @@ static void wakeup_softirqd(void)
>   * right now. Let ksoftirqd handle this at its own rate, to get fairness,
>   * unless we're doing some of the synchronous softirqs.
>   */
> -#define SOFTIRQ_NOW_MASK ((1 << HI_SOFTIRQ) | (1 << TASKLET_SOFTIRQ))
> +#define SOFTIRQ_NOW_MASK ((1 << HI_SOFTIRQ) | (1 << TASKLET_SOFTIRQ) | (1 << NET_RX_SOFTIRQ))
>  static bool ksoftirqd_running(unsigned long pending)
>  {
>         struct task_struct *tsk = __this_cpu_read(ksoftirqd);

Thanks Eric, this looks better than the hack that tglx gave. :) Though the
hack was good for testing.

But my original problem was a drop in network packets, and till now I was
thinking that the delay in processing the softirq was causing it. With the
hack tglx gave, the latency has decreased but my problem is still there.
So, I am looking into it again now.

Thanks again for the patch.

--
Regards
Sudip
Re: regression with napi/softirq ?
On 7/18/19 2:55 PM, Sudip Mukherjee wrote:
> Thanks Eric. But there is no improvement in delay between
> softirq_raise and softirq_entry with this change.
> But moving to a later kernel (linus master branch? ) like Thomas has
> said in the other mail might be difficult atm. I can definitely
> move to v4.14.133 if that helps. Thomas ?

If you are tracking max latency then I guess you have to tweak
SOFTIRQ_NOW_MASK to include NET_RX_SOFTIRQ.

The patch I gave earlier would only lower the probability of events, not
completely get rid of them.

diff --git a/kernel/softirq.c b/kernel/softirq.c
index 0427a86743a46b7e1891f7b6c1ff585a8a1695f5..302046dd8d7e6740e466c422954f22565fe19e69 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -81,7 +81,7 @@ static void wakeup_softirqd(void)
  * right now. Let ksoftirqd handle this at its own rate, to get fairness,
  * unless we're doing some of the synchronous softirqs.
  */
-#define SOFTIRQ_NOW_MASK ((1 << HI_SOFTIRQ) | (1 << TASKLET_SOFTIRQ))
+#define SOFTIRQ_NOW_MASK ((1 << HI_SOFTIRQ) | (1 << TASKLET_SOFTIRQ) | (1 << NET_RX_SOFTIRQ))
 static bool ksoftirqd_running(unsigned long pending)
 {
        struct task_struct *tsk = __this_cpu_read(ksoftirqd);
Re: regression with napi/softirq ?
On Thu, Jul 18, 2019 at 12:42 PM Eric Dumazet wrote:
>
> On 7/18/19 1:18 PM, Sudip Mukherjee wrote:
> > Hi Eric,
> >
> > On Thu, Jul 18, 2019 at 7:58 AM Eric Dumazet wrote:
> >>
> >> On 7/17/19 11:52 PM, Thomas Gleixner wrote:
> >>> Sudip,
> >>>
> >>> On Wed, 17 Jul 2019, Sudip Mukherjee wrote:
> >>>> On Wed, Jul 17, 2019 at 9:53 PM Thomas Gleixner wrote:
> >>>>> You can hack ksoftirqd_running() to return always false to avoid this, but
> >>>>> that might cause application starvation and a huge packet buffer backlog
> >>>>> when the amount of incoming packets makes the CPU do nothing else than
> >>>>> softirq processing.
> >>>>
> >>>> I tried that now, it is better but still not as good as v3.8.
> >>>> Now I am getting 375.9usec as the maximum time between raising the softirq
> >>>> and it starting to execute, and packet drops are still there.
> >>>>
> >>>> And just a thought, do you think there should be a CONFIG_ option for
> >>>> this feature of ksoftirqd_running() so that it can be disabled if needed
> >>>> by users like us?
> >>>
> >>> If at all then a sysctl to allow runtime control.
> >>
> >> ksoftirqd might be spuriously scheduled from the tx path, when
> >> __qdisc_run() also reacts to need_resched().
> >>
> >> By raising NET_TX while we are processing NET_RX (say we send a TCP ACK packet
> >> in response to an incoming packet), we force __do_softirq() to perform
> >> another loop, but before doing another round it will also check need_resched()
> >> and eventually call wakeup_softirqd().
> >>
> >> I wonder if the following patch makes any difference.
> >>
> >> diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
> >> index 11c03cf4aa74b44663c74e0e3284140b0c75d9c4..ab736e974396394ae6ba409868aaea56a50ad57b 100644
> >> --- a/net/sched/sch_generic.c
> >> +++ b/net/sched/sch_generic.c
> >> @@ -377,6 +377,8 @@ void __qdisc_run(struct Qdisc *q)
> >>         int packets;
> >>
> >>         while (qdisc_restart(q, &packets)) {
> >> +               if (qdisc_is_empty(q))
> >> +                       break;
> >
> > unfortunately it is v4.14.55 and qdisc_is_empty() is not yet introduced.
> > And I can not backport 28cff537ef2e ("net: sched: add empty status
> > flag for NOLOCK qdisc") as TCQ_F_NOLOCK is not there. :(
>
> On old kernels, you can simply use
>
> static inline bool qdisc_is_empty(struct Qdisc *q)
> {
>         return !qdisc_qlen(q);
> }

Thanks Eric. But there is no improvement in delay between
softirq_raise and softirq_entry with this change.
But moving to a later kernel (linus master branch? ) like Thomas has
said in the other mail might be difficult atm. I can definitely
move to v4.14.133 if that helps. Thomas ?

--
Regards
Sudip
Re: regression with napi/softirq ?
On Thu, 18 Jul 2019, Sudip Mukherjee wrote:
> On Thu, Jul 18, 2019 at 7:58 AM Eric Dumazet wrote:
> > ksoftirqd might be spuriously scheduled from the tx path, when
> > __qdisc_run() also reacts to need_resched().
> >
> > By raising NET_TX while we are processing NET_RX (say we send a TCP ACK packet
> > in response to an incoming packet), we force __do_softirq() to perform
> > another loop, but before doing another round it will also check need_resched()
> > and eventually call wakeup_softirqd().
> >
> > I wonder if the following patch makes any difference.
> >
> > diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
> > index 11c03cf4aa74b44663c74e0e3284140b0c75d9c4..ab736e974396394ae6ba409868aaea56a50ad57b 100644
> > --- a/net/sched/sch_generic.c
> > +++ b/net/sched/sch_generic.c
> > @@ -377,6 +377,8 @@ void __qdisc_run(struct Qdisc *q)
> >         int packets;
> >
> >         while (qdisc_restart(q, &packets)) {
> > +               if (qdisc_is_empty(q))
> > +                       break;
>
> unfortunately it is v4.14.55 and qdisc_is_empty() is not yet introduced.
> And I can not backport 28cff537ef2e ("net: sched: add empty status
> flag for NOLOCK qdisc") as TCQ_F_NOLOCK is not there. :(

Then please run the test on a current kernel. Backports can be discussed
once the problem is understood.

Thanks,

	tglx
Re: regression with napi/softirq ?
On 7/18/19 1:18 PM, Sudip Mukherjee wrote:
> Hi Eric,
>
> On Thu, Jul 18, 2019 at 7:58 AM Eric Dumazet wrote:
>>
>> On 7/17/19 11:52 PM, Thomas Gleixner wrote:
>>> Sudip,
>>>
>>> On Wed, 17 Jul 2019, Sudip Mukherjee wrote:
>>>> On Wed, Jul 17, 2019 at 9:53 PM Thomas Gleixner wrote:
>>>>> You can hack ksoftirqd_running() to return always false to avoid this, but
>>>>> that might cause application starvation and a huge packet buffer backlog
>>>>> when the amount of incoming packets makes the CPU do nothing else than
>>>>> softirq processing.
>>>>
>>>> I tried that now, it is better but still not as good as v3.8.
>>>> Now I am getting 375.9usec as the maximum time between raising the softirq
>>>> and it starting to execute, and packet drops are still there.
>>>>
>>>> And just a thought, do you think there should be a CONFIG_ option for
>>>> this feature of ksoftirqd_running() so that it can be disabled if needed
>>>> by users like us?
>>>
>>> If at all then a sysctl to allow runtime control.
>>
>> ksoftirqd might be spuriously scheduled from the tx path, when
>> __qdisc_run() also reacts to need_resched().
>>
>> By raising NET_TX while we are processing NET_RX (say we send a TCP ACK packet
>> in response to an incoming packet), we force __do_softirq() to perform
>> another loop, but before doing another round it will also check need_resched()
>> and eventually call wakeup_softirqd().
>>
>> I wonder if the following patch makes any difference.
>>
>> diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
>> index 11c03cf4aa74b44663c74e0e3284140b0c75d9c4..ab736e974396394ae6ba409868aaea56a50ad57b 100644
>> --- a/net/sched/sch_generic.c
>> +++ b/net/sched/sch_generic.c
>> @@ -377,6 +377,8 @@ void __qdisc_run(struct Qdisc *q)
>>         int packets;
>>
>>         while (qdisc_restart(q, &packets)) {
>> +               if (qdisc_is_empty(q))
>> +                       break;
>
> unfortunately it is v4.14.55 and qdisc_is_empty() is not yet introduced.
> And I can not backport 28cff537ef2e ("net: sched: add empty status
> flag for NOLOCK qdisc") as TCQ_F_NOLOCK is not there. :(

On old kernels, you can simply use

static inline bool qdisc_is_empty(struct Qdisc *q)
{
        return !qdisc_qlen(q);
}
Re: regression with napi/softirq ?
Hi Eric,

On Thu, Jul 18, 2019 at 7:58 AM Eric Dumazet wrote:
>
> On 7/17/19 11:52 PM, Thomas Gleixner wrote:
> > Sudip,
> >
> > On Wed, 17 Jul 2019, Sudip Mukherjee wrote:
> >> On Wed, Jul 17, 2019 at 9:53 PM Thomas Gleixner wrote:
> >>> You can hack ksoftirqd_running() to return always false to avoid this, but
> >>> that might cause application starvation and a huge packet buffer backlog
> >>> when the amount of incoming packets makes the CPU do nothing else than
> >>> softirq processing.
> >>
> >> I tried that now, it is better but still not as good as v3.8.
> >> Now I am getting 375.9usec as the maximum time between raising the softirq
> >> and it starting to execute, and packet drops are still there.
> >>
> >> And just a thought, do you think there should be a CONFIG_ option for
> >> this feature of ksoftirqd_running() so that it can be disabled if needed
> >> by users like us?
> >
> > If at all then a sysctl to allow runtime control.
>
> ksoftirqd might be spuriously scheduled from the tx path, when
> __qdisc_run() also reacts to need_resched().
>
> By raising NET_TX while we are processing NET_RX (say we send a TCP ACK packet
> in response to an incoming packet), we force __do_softirq() to perform
> another loop, but before doing another round it will also check need_resched()
> and eventually call wakeup_softirqd().
>
> I wonder if the following patch makes any difference.
>
> diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
> index 11c03cf4aa74b44663c74e0e3284140b0c75d9c4..ab736e974396394ae6ba409868aaea56a50ad57b 100644
> --- a/net/sched/sch_generic.c
> +++ b/net/sched/sch_generic.c
> @@ -377,6 +377,8 @@ void __qdisc_run(struct Qdisc *q)
>         int packets;
>
>         while (qdisc_restart(q, &packets)) {
> +               if (qdisc_is_empty(q))
> +                       break;

unfortunately it is v4.14.55 and qdisc_is_empty() is not yet introduced.
And I can not backport 28cff537ef2e ("net: sched: add empty status
flag for NOLOCK qdisc") as TCQ_F_NOLOCK is not there. :(

--
Regards
Sudip
Re: regression with napi/softirq ?
On 7/17/19 11:52 PM, Thomas Gleixner wrote:
> Sudip,
>
> On Wed, 17 Jul 2019, Sudip Mukherjee wrote:
>> On Wed, Jul 17, 2019 at 9:53 PM Thomas Gleixner wrote:
>>> You can hack ksoftirqd_running() to return always false to avoid this, but
>>> that might cause application starvation and a huge packet buffer backlog
>>> when the amount of incoming packets makes the CPU do nothing else than
>>> softirq processing.
>>
>> I tried that now, it is better but still not as good as v3.8.
>> Now I am getting 375.9usec as the maximum time between raising the softirq
>> and it starting to execute, and packet drops are still there.
>>
>> And just a thought, do you think there should be a CONFIG_ option for
>> this feature of ksoftirqd_running() so that it can be disabled if needed
>> by users like us?
>
> If at all then a sysctl to allow runtime control.
>
>> Can you please think of anything else that might have changed which I still
>> need to change to make the time comparable to v3.8..
>
> Something within that small range of:
>
>  63592 files changed, 13783320 insertions(+), 5155492 deletions(-)
>
> :)
>
> Seriously, that can be anything.
>
> Can you please test with Linus' head of tree and add some more
> instrumentation, so we can see what holds off softirqs from being
> processed. If the ksoftirqd enforcement is disabled, then the only reason
> can be a long lasting softirq disabled region. Tracing should tell.

ksoftirqd might be spuriously scheduled from the tx path, when
__qdisc_run() also reacts to need_resched().

By raising NET_TX while we are processing NET_RX (say we send a TCP ACK packet
in response to an incoming packet), we force __do_softirq() to perform
another loop, but before doing another round it will also check need_resched()
and eventually call wakeup_softirqd().

I wonder if the following patch makes any difference.

diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 11c03cf4aa74b44663c74e0e3284140b0c75d9c4..ab736e974396394ae6ba409868aaea56a50ad57b 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -377,6 +377,8 @@ void __qdisc_run(struct Qdisc *q)
        int packets;

        while (qdisc_restart(q, &packets)) {
+               if (qdisc_is_empty(q))
+                       break;
                /*
                 * Ordered by possible occurrence: Postpone processing if
                 * 1. we've exceeded packet quota
Re: regression with napi/softirq ?
Sudip,

On Wed, 17 Jul 2019, Sudip Mukherjee wrote:
> On Wed, Jul 17, 2019 at 9:53 PM Thomas Gleixner wrote:
> > You can hack ksoftirqd_running() to return always false to avoid this, but
> > that might cause application starvation and a huge packet buffer backlog
> > when the amount of incoming packets makes the CPU do nothing else than
> > softirq processing.
>
> I tried that now, it is better but still not as good as v3.8.
> Now I am getting 375.9usec as the maximum time between raising the softirq
> and it starting to execute, and packet drops are still there.
>
> And just a thought, do you think there should be a CONFIG_ option for
> this feature of ksoftirqd_running() so that it can be disabled if needed
> by users like us?

If at all then a sysctl to allow runtime control.

> Can you please think of anything else that might have changed which I still
> need to change to make the time comparable to v3.8..

Something within that small range of:

 63592 files changed, 13783320 insertions(+), 5155492 deletions(-)

:)

Seriously, that can be anything.

Can you please test with Linus' head of tree and add some more
instrumentation, so we can see what holds off softirqs from being
processed. If the ksoftirqd enforcement is disabled, then the only reason
can be a long lasting softirq disabled region. Tracing should tell.

Thanks,

	tglx
Re: regression with napi/softirq ?
Hi,

On Wed, Jul 17, 2019 at 9:53 PM Thomas Gleixner wrote:
>
> On Wed, 17 Jul 2019, Sudip Mukherjee wrote:
> > I am using v4.14.55 on an Intel Atom based board and I am seeing network
> > packet drops frequently on wireshark logs. After lots of debugging it
> > seems that when this happens softirq is taking huge time to start after
> > it has been raised. This is a small snippet from ftrace:
> >
> >    <...>-2110    [001] dNH1 466.634916: irq_handler_entry: irq=126 name=eth0-TxRx-0
> >    <...>-2110    [001] dNH1 466.634917: softirq_raise: vec=3 [action=NET_RX]
> >    <...>-2110    [001] dNH1 466.634918: irq_handler_exit: irq=126 ret=handled
> >    ksoftirqd/1-15 [001] ..s. 466.635826: softirq_entry: vec=3 [action=NET_RX]
> >    ksoftirqd/1-15 [001] ..s. 466.635852: softirq_exit: vec=3 [action=NET_RX]
> >    ksoftirqd/1-15 [001] d.H. 466.635856: irq_handler_entry: irq=126 name=eth0-TxRx-0
> >    ksoftirqd/1-15 [001] d.H. 466.635857: softirq_raise: vec=3 [action=NET_RX]
> >    ksoftirqd/1-15 [001] d.H. 466.635858: irq_handler_exit: irq=126 ret=handled
> >    ksoftirqd/1-15 [001] ..s. 466.635860: softirq_entry: vec=3 [action=NET_RX]
> >    ksoftirqd/1-15 [001] ..s. 466.635863: softirq_exit: vec=3 [action=NET_RX]
> >
> > So, softirq was raised at 466.634917 but it started at 466.635826, almost
> > 909 usec after it was raised.
>
> This is a situation where the network softirq decided to delegate softirq
> processing to ksoftirqd. That happens when too much work is available while
> processing softirqs on return from interrupt.
>
> That means that softirq processing happens under scheduler control. So if
> there are other runnable tasks on the same CPU ksoftirqd can be delayed
> until their time slice expired. As a consequence ksoftirqd might not be
> able to catch up with the incoming packet flood and the NIC starts to drop.

Yes, and I see in the ftrace that there are many other userspace
processes getting scheduled in that time.

> You can hack ksoftirqd_running() to return always false to avoid this, but
> that might cause application starvation and a huge packet buffer backlog
> when the amount of incoming packets makes the CPU do nothing else than
> softirq processing.

I tried that now, it is better but still not as good as v3.8.
Now I am getting 375.9usec as the maximum time between raising the softirq
and it starting to execute, and packet drops are still there.

And just a thought, do you think there should be a CONFIG_ option for
this feature of ksoftirqd_running() so that it can be disabled if needed
by users like us?

Can you please think of anything else that might have changed which I still
need to change to make the time comparable to v3.8..

--
Regards
Sudip
Re: regression with napi/softirq ?
On Wed, 17 Jul 2019, Sudip Mukherjee wrote:
> I am using v4.14.55 on an Intel Atom based board and I am seeing network
> packet drops frequently on wireshark logs. After lots of debugging it
> seems that when this happens softirq is taking huge time to start after
> it has been raised. This is a small snippet from ftrace:
>
>    <...>-2110    [001] dNH1 466.634916: irq_handler_entry: irq=126 name=eth0-TxRx-0
>    <...>-2110    [001] dNH1 466.634917: softirq_raise: vec=3 [action=NET_RX]
>    <...>-2110    [001] dNH1 466.634918: irq_handler_exit: irq=126 ret=handled
>    ksoftirqd/1-15 [001] ..s. 466.635826: softirq_entry: vec=3 [action=NET_RX]
>    ksoftirqd/1-15 [001] ..s. 466.635852: softirq_exit: vec=3 [action=NET_RX]
>    ksoftirqd/1-15 [001] d.H. 466.635856: irq_handler_entry: irq=126 name=eth0-TxRx-0
>    ksoftirqd/1-15 [001] d.H. 466.635857: softirq_raise: vec=3 [action=NET_RX]
>    ksoftirqd/1-15 [001] d.H. 466.635858: irq_handler_exit: irq=126 ret=handled
>    ksoftirqd/1-15 [001] ..s. 466.635860: softirq_entry: vec=3 [action=NET_RX]
>    ksoftirqd/1-15 [001] ..s. 466.635863: softirq_exit: vec=3 [action=NET_RX]
>
> So, softirq was raised at 466.634917 but it started at 466.635826, almost
> 909 usec after it was raised.

This is a situation where the network softirq decided to delegate softirq
processing to ksoftirqd. That happens when too much work is available while
processing softirqs on return from interrupt.

That means that softirq processing happens under scheduler control. So if
there are other runnable tasks on the same CPU ksoftirqd can be delayed
until their time slice expired. As a consequence ksoftirqd might not be
able to catch up with the incoming packet flood and the NIC starts to drop.

You can hack ksoftirqd_running() to return always false to avoid this, but
that might cause application starvation and a huge packet buffer backlog
when the amount of incoming packets makes the CPU do nothing else than
softirq processing.

Thanks,

	tglx