Re: [PATCH] net: ethernet: ti: cpsw: fix net watchdog timeout
From: Grygorii Strashko
Date: Thu, 8 Feb 2018 10:04:31 -0600

> Could this be marked as stable material 4.9+?

Sure, queued up.
Re: [PATCH] net: ethernet: ti: cpsw: fix net watchdog timeout
On 02/07/2018 08:57 PM, David Miller wrote:
> From: Grygorii Strashko
> Date: Tue, 6 Feb 2018 19:17:06 -0600
>
>> It was discovered that a simple program which indefinitely sends 200b UDP
>> packets and runs on a TI AM574x SoC (SMP) under an RT kernel triggers a
>> network watchdog timeout in the TI CPSW driver (<6 hours run). The network
>> watchdog timeout is triggered due to a race between cpsw_ndo_start_xmit()
>> and cpsw_tx_handler() [NAPI]:
>>
>> cpsw_ndo_start_xmit()
>>         if (unlikely(!cpdma_check_free_tx_desc(txch))) {
>>                 txq = netdev_get_tx_queue(ndev, q_idx);
>>                 netif_tx_stop_queue(txq);
>>
>> ^^ as per [1], a barrier has to be used after set_bit(), otherwise the new
>> value might not be visible to other CPUs
>>         }
>>
>> cpsw_tx_handler()
>>         if (unlikely(netif_tx_queue_stopped(txq)))
>>                 netif_tx_wake_queue(txq);
>>
>> and when it happens the ndev TX queue becomes disabled forever while the
>> driver's HW TX queue is empty.
>>
>> Fix this by adding smp_mb__after_atomic() after the netif_tx_stop_queue()
>> calls, and double-check for free TX descriptors after stopping the ndev
>> TX queue - if there are free TX descriptors, wake up the ndev TX queue.
>>
>> [1] https://www.kernel.org/doc/html/latest/core-api/atomic_ops.html
>>
>> Signed-off-by: Grygorii Strashko
>
> Applied, thanks.

Thank you, David. Could this be marked as stable material 4.9+?

--
regards,
-grygorii
Re: [PATCH] net: ethernet: ti: cpsw: fix net watchdog timeout
From: Grygorii Strashko
Date: Tue, 6 Feb 2018 19:17:06 -0600

> It was discovered that a simple program which indefinitely sends 200b UDP
> packets and runs on a TI AM574x SoC (SMP) under an RT kernel triggers a
> network watchdog timeout in the TI CPSW driver (<6 hours run). The network
> watchdog timeout is triggered due to a race between cpsw_ndo_start_xmit()
> and cpsw_tx_handler() [NAPI]:
>
> cpsw_ndo_start_xmit()
>         if (unlikely(!cpdma_check_free_tx_desc(txch))) {
>                 txq = netdev_get_tx_queue(ndev, q_idx);
>                 netif_tx_stop_queue(txq);
>
> ^^ as per [1], a barrier has to be used after set_bit(), otherwise the new
> value might not be visible to other CPUs
>         }
>
> cpsw_tx_handler()
>         if (unlikely(netif_tx_queue_stopped(txq)))
>                 netif_tx_wake_queue(txq);
>
> and when it happens the ndev TX queue becomes disabled forever while the
> driver's HW TX queue is empty.
>
> Fix this by adding smp_mb__after_atomic() after the netif_tx_stop_queue()
> calls, and double-check for free TX descriptors after stopping the ndev
> TX queue - if there are free TX descriptors, wake up the ndev TX queue.
>
> [1] https://www.kernel.org/doc/html/latest/core-api/atomic_ops.html
>
> Signed-off-by: Grygorii Strashko

Applied, thanks.
Re: [PATCH] net: ethernet: ti: cpsw: fix net watchdog timeout
Rechecked once again; it seems to cover every case at this moment, so:

Reviewed-by: Ivan Khoronzhuk

--
Regards,
Ivan Khoronzhuk
Re: [PATCH] net: ethernet: ti: cpsw: fix net watchdog timeout
On Wed, Feb 07, 2018 at 12:19:11PM -0600, Grygorii Strashko wrote:
> On 02/07/2018 07:31 AM, Ivan Khoronzhuk wrote:
>> On Wed, Feb 07, 2018 at 05:03:19AM +0200, Ivan Khoronzhuk wrote:
>>> On Tue, Feb 06, 2018 at 07:17:06PM -0600, Grygorii Strashko wrote:
>>>> It was discovered that a simple program which indefinitely sends 200b
>>>> UDP packets and runs on a TI AM574x SoC (SMP) under an RT kernel
>>>> triggers a network watchdog timeout in the TI CPSW driver (<6 hours
>>>> run). The network watchdog timeout is triggered due to a race between
>>>> cpsw_ndo_start_xmit() and cpsw_tx_handler() [NAPI]:
>>>>
>>>> cpsw_ndo_start_xmit()
>>>>         if (unlikely(!cpdma_check_free_tx_desc(txch))) {
>>>>                 txq = netdev_get_tx_queue(ndev, q_idx);
>>>>                 netif_tx_stop_queue(txq);
>>>>
>>>> ^^ as per [1], a barrier has to be used after set_bit(), otherwise the
>>>> new value might not be visible to other CPUs
>>>>         }
>>>>
>>>> cpsw_tx_handler()
>>>>         if (unlikely(netif_tx_queue_stopped(txq)))
>>>>                 netif_tx_wake_queue(txq);
>>>>
>>>> and when it happens the ndev TX queue becomes disabled forever while
>>>> the driver's HW TX queue is empty.
>>>
>>> I'm sure it fixes the test case somehow, but there is some strangeness
>>> (I've thought about this some X months ago):
>>> 1. If there is no free desc, then there is a bunch of descs on the
>>> queue ready to be sent.
>>> 2. If one of these descs was missed while this process ran, then the
>>> next one will wake the queue, because there is a bunch of them on the
>>> fly. So, if the desc on top of the sent queue missed enabling the
>>> queue, then the next one will more likely enable it anyway... then how
>>> could it happen? The described race is possible only on the last
>>> descriptor; yes, packets are small and the speed is high, so the
>>> possibility is very small... but then the next situation is also
>>> possible:
>>> - packets are sent fast
>>> - all packets were sent, but no descriptors are freed yet by the sw
>>> interrupt (NAPI)
>>> - when the interrupt had started NAPI, the queue was enabled; all
>>> further interrupts are throttled while NAPI has not finished its work
>>> yet
>>> - when a new packet is submitted, no free descs are present yet (NAPI
>>> has not freed any yet), but all packets are sent, so no one can wake
>>> the tx queue, as no interrupt will arise once NAPI has started to free
>>> the first descriptor - interrupts are disabled... because the h/w
>>> queue to be sent is empty
>>> - how can it happen, if submitting a packet and handling a packet are
>>> operations under the channel lock? Not exactly: the period between
>>> handling a descriptor and freeing it to the pool is not under the
>>> channel lock, here:
>>>
>>>         spin_unlock_irqrestore(&chan->lock, flags);
>>>         if (unlikely(status & CPDMA_DESC_TD_COMPLETE))
>>>                 cb_status = -ENOSYS;
>>>         else
>>>                 cb_status = status;
>>>
>>>         __cpdma_chan_free(chan, desc, outlen, cb_status);
>>>         return status;
>>>
>>> unlock_ret:
>>>         spin_unlock_irqrestore(&chan->lock, flags);
>>>         return status;
>>>
>>> And:
>>> __cpdma_chan_free(chan, desc, outlen, cb_status);
>>>         -> cpdma_desc_free(pool, desc, 1);
>>>
>>> As a result, the queue deadlocks as you've described.
>>> Just a thought, not checked, but theoretically possible.
>>> What do you think?
>
> Yep. And if this happens, desc_alloc_fail should be >0, while I usually
> see it 0 when the watchdog triggers.

It has to be 0 for the situation I am trying to describe. See the example
below, to keep it all in one place.

> I was able to reproduce this situation once, but with a bunch of debug
> code added, which changes timings:

I also unintentionally observed it several times, but usually was thinking
that it was connected with my other experiments. It was with am572x. But
now it seems it actually can happen with plain sources. And maybe more
than one fix is needed here; that's my concern.

> Prerequisite: only one free desc available.
>
> cpsw_ndo_start_xmit1                    cpsw_tx_poll
> --------------------                    ------------
> prepare and send packet
> -> free desc queue is empty at this moment
> if (unlikely(!cpdma_check_free_tx_desc(txch)))
>         netif_tx_stop_queue(txq);
> --- tx queue stopped
>                                         cpdma_chan_process()
>                                                 spin_lock_irqsave(&chan->lock, flags);
>                                                 chan->count--;
>                                                 spin_unlock_irqrestore(&chan->lock, flags);
>                                         cpsw_tx_handler()
>                                                 if (unlikely(netif_tx_queue_stopped(txq)))
>                                                         netif_tx_wake_queue(txq);
> --- tx queue is woken up
> cpsw_ndo_start_xmit2
> prepare packet
> ret = cpsw_tx_packet_submit(priv, skb, txch);
> // fails, as the desc has not been returned to the pool yet
> if (unlikely(ret != 0)) {
>         cpsw_err(priv, tx_err, "desc submit failed\n");
>         goto fail;
> }
>                                         cpdma_desc_free()
> fail:
>         ndev->stats.tx_dropped++;
>         netif_tx_stop_queue(txq);  // oops.
>
> That's why I added the double check and barrier in the fail path also.

Seems to be reasonable. It
Re: [PATCH] net: ethernet: ti: cpsw: fix net watchdog timeout
On 02/07/2018 07:31 AM, Ivan Khoronzhuk wrote:
> On Wed, Feb 07, 2018 at 05:03:19AM +0200, Ivan Khoronzhuk wrote:
>> On Tue, Feb 06, 2018 at 07:17:06PM -0600, Grygorii Strashko wrote:
>>> It was discovered that a simple program which indefinitely sends 200b
>>> UDP packets and runs on a TI AM574x SoC (SMP) under an RT kernel
>>> triggers a network watchdog timeout in the TI CPSW driver (<6 hours
>>> run). The network watchdog timeout is triggered due to a race between
>>> cpsw_ndo_start_xmit() and cpsw_tx_handler() [NAPI]:
>>>
>>> cpsw_ndo_start_xmit()
>>>         if (unlikely(!cpdma_check_free_tx_desc(txch))) {
>>>                 txq = netdev_get_tx_queue(ndev, q_idx);
>>>                 netif_tx_stop_queue(txq);
>>>
>>> ^^ as per [1], a barrier has to be used after set_bit(), otherwise the
>>> new value might not be visible to other CPUs
>>>         }
>>>
>>> cpsw_tx_handler()
>>>         if (unlikely(netif_tx_queue_stopped(txq)))
>>>                 netif_tx_wake_queue(txq);
>>>
>>> and when it happens the ndev TX queue becomes disabled forever while
>>> the driver's HW TX queue is empty.
>>
>> I'm sure it fixes the test case somehow, but there is some strangeness
>> (I've thought about this some X months ago):
>> 1. If there is no free desc, then there is a bunch of descs on the queue
>> ready to be sent.
>> 2. If one of these descs was missed while this process ran, then the
>> next one will wake the queue, because there is a bunch of them on the
>> fly. So, if the desc on top of the sent queue missed enabling the queue,
>> then the next one will more likely enable it anyway... then how could it
>> happen? The described race is possible only on the last descriptor; yes,
>> packets are small and the speed is high, so the possibility is very
>> small... but then the next situation is also possible:
>> - packets are sent fast
>> - all packets were sent, but no descriptors are freed yet by the sw
>> interrupt (NAPI)
>> - when the interrupt had started NAPI, the queue was enabled; all
>> further interrupts are throttled while NAPI has not finished its work
>> yet
>> - when a new packet is submitted, no free descs are present yet (NAPI
>> has not freed any yet), but all packets are sent, so no one can wake the
>> tx queue, as no interrupt will arise once NAPI has started to free the
>> first descriptor - interrupts are disabled... because the h/w queue to
>> be sent is empty
>> - how can it happen, if submitting a packet and handling a packet are
>> operations under the channel lock? Not exactly: the period between
>> handling a descriptor and freeing it to the pool is not under the
>> channel lock, here:
>>
>>         spin_unlock_irqrestore(&chan->lock, flags);
>>         if (unlikely(status & CPDMA_DESC_TD_COMPLETE))
>>                 cb_status = -ENOSYS;
>>         else
>>                 cb_status = status;
>>
>>         __cpdma_chan_free(chan, desc, outlen, cb_status);
>>         return status;
>>
>> unlock_ret:
>>         spin_unlock_irqrestore(&chan->lock, flags);
>>         return status;
>>
>> And:
>> __cpdma_chan_free(chan, desc, outlen, cb_status);
>>         -> cpdma_desc_free(pool, desc, 1);
>>
>> As a result, the queue deadlocks as you've described.
>> Just a thought, not checked, but theoretically possible.
>> What do you think?

Yep. And if this happens, desc_alloc_fail should be >0, while I usually
see it 0 when the watchdog triggers.

I was able to reproduce this situation once, but with a bunch of debug
code added, which changes timings:

Prerequisite: only one free desc available.

cpsw_ndo_start_xmit1                    cpsw_tx_poll
--------------------                    ------------
prepare and send packet
-> free desc queue is empty at this moment
if (unlikely(!cpdma_check_free_tx_desc(txch)))
        netif_tx_stop_queue(txq);
--- tx queue stopped
                                        cpdma_chan_process()
                                                spin_lock_irqsave(&chan->lock, flags);
                                                chan->count--;
                                                spin_unlock_irqrestore(&chan->lock, flags);
                                        cpsw_tx_handler()
                                                if (unlikely(netif_tx_queue_stopped(txq)))
                                                        netif_tx_wake_queue(txq);
--- tx queue is woken up
cpsw_ndo_start_xmit2
prepare packet
ret = cpsw_tx_packet_submit(priv, skb, txch);
// fails, as the desc has not been returned to the pool yet
if (unlikely(ret != 0)) {
        cpsw_err(priv, tx_err, "desc submit failed\n");
        goto fail;
}
                                        cpdma_desc_free()
fail:
        ndev->stats.tx_dropped++;
        netif_tx_stop_queue(txq);  // oops.

That's why I added the double check and barrier in the fail path also.

> Better explanation, for rare race:
>
> start conditions:
> - all descs are submitted, except the last one, and the queue is not
> stopped
> - no desc was returned to the pool yet (but packets can be sent)
>
> time
> ||
> \/
>
> submit process
Re: [PATCH] net: ethernet: ti: cpsw: fix net watchdog timeout
On Wed, Feb 07, 2018 at 05:03:19AM +0200, Ivan Khoronzhuk wrote:
> On Tue, Feb 06, 2018 at 07:17:06PM -0600, Grygorii Strashko wrote:
> > It was discovered that a simple program which indefinitely sends 200b
> > UDP packets and runs on a TI AM574x SoC (SMP) under an RT kernel
> > triggers a network watchdog timeout in the TI CPSW driver (<6 hours
> > run). The network watchdog timeout is triggered due to a race between
> > cpsw_ndo_start_xmit() and cpsw_tx_handler() [NAPI]:
> >
> > cpsw_ndo_start_xmit()
> >         if (unlikely(!cpdma_check_free_tx_desc(txch))) {
> >                 txq = netdev_get_tx_queue(ndev, q_idx);
> >                 netif_tx_stop_queue(txq);
> >
> > ^^ as per [1], a barrier has to be used after set_bit(), otherwise the
> > new value might not be visible to other CPUs
> >         }
> >
> > cpsw_tx_handler()
> >         if (unlikely(netif_tx_queue_stopped(txq)))
> >                 netif_tx_wake_queue(txq);
> >
> > and when it happens the ndev TX queue becomes disabled forever while
> > the driver's HW TX queue is empty.
> I'm sure it fixes the test case somehow, but there is some strangeness
> (I've thought about this some X months ago):
> 1. If there is no free desc, then there is a bunch of descs on the queue
> ready to be sent.
> 2. If one of these descs was missed while this process ran, then the next
> one will wake the queue, because there is a bunch of them on the fly. So,
> if the desc on top of the sent queue missed enabling the queue, then the
> next one will more likely enable it anyway... then how could it happen?
> The described race is possible only on the last descriptor; yes, packets
> are small and the speed is high, so the possibility is very small... but
> then the next situation is also possible:
> - packets are sent fast
> - all packets were sent, but no descriptors are freed yet by the sw
> interrupt (NAPI)
> - when the interrupt had started NAPI, the queue was enabled; all further
> interrupts are throttled while NAPI has not finished its work yet
> - when a new packet is submitted, no free descs are present yet (NAPI has
> not freed any yet), but all packets are sent, so no one can wake the tx
> queue, as no interrupt will arise once NAPI has started to free the first
> descriptor - interrupts are disabled... because the h/w queue to be sent
> is empty
> - how can it happen, if submitting a packet and handling a packet are
> operations under the channel lock? Not exactly: the period between
> handling a descriptor and freeing it to the pool is not under the channel
> lock, here:
>
>         spin_unlock_irqrestore(&chan->lock, flags);
>         if (unlikely(status & CPDMA_DESC_TD_COMPLETE))
>                 cb_status = -ENOSYS;
>         else
>                 cb_status = status;
>
>         __cpdma_chan_free(chan, desc, outlen, cb_status);
>         return status;
>
> unlock_ret:
>         spin_unlock_irqrestore(&chan->lock, flags);
>         return status;
>
> And:
> __cpdma_chan_free(chan, desc, outlen, cb_status);
>         -> cpdma_desc_free(pool, desc, 1);
>
> As a result, the queue deadlocks as you've described.
> Just a thought, not checked, but theoretically possible.
> What do you think?

Better explanation, for rare race:

start conditions:
- all descs are submitted, except the last one, and the queue is not stopped
- no desc was returned to the pool yet (but packets can be sent)

time
||
\/

submit process                          NAPI poll process
--------------                          -----------------
new packet is scheduled for submission
stated that no free descs (with locks)
                                        lock is freed
                                        returned all descs to the pool,
                                        and the queue is enabled
                                        interrupt enabled, poll exit
queue is disabled
submit exit

Result:
- all descs are returned to the pool, submission to the queue is disabled
- NAPI cannot wake the queue, as all descs were handled already

According to the packet size of 200B:
Data size, in bits: 200B * 63 descs * 10 = 128000 bits, roughly
Time until all of them are sent: 128000 / 1Gb = 128us

That's enough for the CPU to be occupied by another process, even under RT.
> >
> > Fix this by adding smp_mb__after_atomic() after the netif_tx_stop_queue()
> > calls, and double-check for free TX descriptors after stopping the ndev
> > TX queue - if there are free TX descriptors, wake up the ndev TX queue.
> >
> > [1] https://www.kernel.org/doc/html/latest/core-api/atomic_ops.html
> > Signed-off-by: Grygorii Strashko
> > ---
> >  drivers/net/ethernet/ti/cpsw.c | 16 ++++++++++++++--
> >  1 file changed, 14 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
> > index 10d7cbe..3805b13 100644
> > --- a/drivers/net/ethernet/ti/cpsw.c
> > +++ b/drivers/net/ethernet/ti/cpsw.c
> > @@ -1638,6 +1638,7 @@ static netdev_tx_t cpsw_ndo_start_xmit(struct sk_buff *skb,
> >  	q_idx = q_idx % cpsw->tx_ch_num;
> >
> >  	txch = cpsw->txv[q_idx].ch;
> > +	txq = netdev_get_tx_queue(ndev, q_idx);
> >  	ret =
Re: [PATCH] net: ethernet: ti: cpsw: fix net watchdog timeout
On Tue, Feb 06, 2018 at 07:17:06PM -0600, Grygorii Strashko wrote:
> It was discovered that a simple program which indefinitely sends 200b UDP
> packets and runs on a TI AM574x SoC (SMP) under an RT kernel triggers a
> network watchdog timeout in the TI CPSW driver (<6 hours run). The network
> watchdog timeout is triggered due to a race between cpsw_ndo_start_xmit()
> and cpsw_tx_handler() [NAPI]:
>
> cpsw_ndo_start_xmit()
>         if (unlikely(!cpdma_check_free_tx_desc(txch))) {
>                 txq = netdev_get_tx_queue(ndev, q_idx);
>                 netif_tx_stop_queue(txq);
>
> ^^ as per [1], a barrier has to be used after set_bit(), otherwise the new
> value might not be visible to other CPUs
>         }
>
> cpsw_tx_handler()
>         if (unlikely(netif_tx_queue_stopped(txq)))
>                 netif_tx_wake_queue(txq);
>
> and when it happens the ndev TX queue becomes disabled forever while the
> driver's HW TX queue is empty.

I'm sure it fixes the test case somehow, but there is some strangeness
(I've thought about this some X months ago):
1. If there is no free desc, then there is a bunch of descs on the queue
ready to be sent.
2. If one of these descs was missed while this process ran, then the next
one will wake the queue, because there is a bunch of them on the fly. So,
if the desc on top of the sent queue missed enabling the queue, then the
next one will more likely enable it anyway... then how could it happen?
The described race is possible only on the last descriptor; yes, packets
are small and the speed is high, so the possibility is very small... but
then the next situation is also possible:
- packets are sent fast
- all packets were sent, but no descriptors are freed yet by the sw
interrupt (NAPI)
- when the interrupt had started NAPI, the queue was enabled; all further
interrupts are throttled while NAPI has not finished its work yet
- when a new packet is submitted, no free descs are present yet (NAPI has
not freed any yet), but all packets are sent, so no one can wake the tx
queue, as no interrupt will arise once NAPI has started to free the first
descriptor - interrupts are disabled... because the h/w queue to be sent
is empty
- how can it happen, if submitting a packet and handling a packet are
operations under the channel lock? Not exactly: the period between
handling a descriptor and freeing it to the pool is not under the channel
lock, here:

        spin_unlock_irqrestore(&chan->lock, flags);
        if (unlikely(status & CPDMA_DESC_TD_COMPLETE))
                cb_status = -ENOSYS;
        else
                cb_status = status;

        __cpdma_chan_free(chan, desc, outlen, cb_status);
        return status;

unlock_ret:
        spin_unlock_irqrestore(&chan->lock, flags);
        return status;

And:
__cpdma_chan_free(chan, desc, outlen, cb_status);
        -> cpdma_desc_free(pool, desc, 1);

As a result, the queue deadlocks as you've described.
Just a thought, not checked, but theoretically possible.
What do you think?

>
> Fix this by adding smp_mb__after_atomic() after the netif_tx_stop_queue()
> calls, and double-check for free TX descriptors after stopping the ndev
> TX queue - if there are free TX descriptors, wake up the ndev TX queue.
>
> [1] https://www.kernel.org/doc/html/latest/core-api/atomic_ops.html
> Signed-off-by: Grygorii Strashko
> ---
>  drivers/net/ethernet/ti/cpsw.c | 16 ++++++++++++++--
>  1 file changed, 14 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
> index 10d7cbe..3805b13 100644
> --- a/drivers/net/ethernet/ti/cpsw.c
> +++ b/drivers/net/ethernet/ti/cpsw.c
> @@ -1638,6 +1638,7 @@ static netdev_tx_t cpsw_ndo_start_xmit(struct sk_buff *skb,
>  	q_idx = q_idx % cpsw->tx_ch_num;
>
>  	txch = cpsw->txv[q_idx].ch;
> +	txq = netdev_get_tx_queue(ndev, q_idx);
>  	ret = cpsw_tx_packet_submit(priv, skb, txch);
>  	if (unlikely(ret != 0)) {
>  		cpsw_err(priv, tx_err, "desc submit failed\n");
> @@ -1648,15 +1649,26 @@ static netdev_tx_t cpsw_ndo_start_xmit(struct sk_buff *skb,
>  	 * tell the kernel to stop sending us tx frames.
>  	 */
>  	if (unlikely(!cpdma_check_free_tx_desc(txch))) {
> -		txq = netdev_get_tx_queue(ndev, q_idx);
>  		netif_tx_stop_queue(txq);
> +
> +		/* Barrier, so that stop_queue visible to other cpus */
> +		smp_mb__after_atomic();
> +
> +		if (cpdma_check_free_tx_desc(txch))
> +			netif_tx_wake_queue(txq);
>  	}
>
>  	return NETDEV_TX_OK;
>  fail:
>  	ndev->stats.tx_dropped++;
> -	txq = netdev_get_tx_queue(ndev, skb_get_queue_mapping(skb));
>  	netif_tx_stop_queue(txq);
> +
> +	/* Barrier, so that stop_queue visible to other cpus */
> +	smp_mb__after_atomic();
> +
> +	if (cpdma_check_free_tx_desc(txch))
> +		netif_tx_wake_queue(txq);
> +
>  	return NETDEV_TX_BUSY;
>  }
>
> --
> 2.10.5
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-omap" in
> the body of a message to majord...@vger.kernel.org
[PATCH] net: ethernet: ti: cpsw: fix net watchdog timeout
It was discovered that a simple program which indefinitely sends 200b UDP
packets and runs on a TI AM574x SoC (SMP) under an RT kernel triggers a
network watchdog timeout in the TI CPSW driver (<6 hours run). The network
watchdog timeout is triggered due to a race between cpsw_ndo_start_xmit()
and cpsw_tx_handler() [NAPI]:

cpsw_ndo_start_xmit()
        if (unlikely(!cpdma_check_free_tx_desc(txch))) {
                txq = netdev_get_tx_queue(ndev, q_idx);
                netif_tx_stop_queue(txq);

^^ as per [1], a barrier has to be used after set_bit(), otherwise the new
value might not be visible to other CPUs
        }

cpsw_tx_handler()
        if (unlikely(netif_tx_queue_stopped(txq)))
                netif_tx_wake_queue(txq);

and when it happens the ndev TX queue becomes disabled forever while the
driver's HW TX queue is empty.

Fix this by adding smp_mb__after_atomic() after the netif_tx_stop_queue()
calls, and double-check for free TX descriptors after stopping the ndev TX
queue - if there are free TX descriptors, wake up the ndev TX queue.

[1] https://www.kernel.org/doc/html/latest/core-api/atomic_ops.html

Signed-off-by: Grygorii Strashko
---
 drivers/net/ethernet/ti/cpsw.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
index 10d7cbe..3805b13 100644
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -1638,6 +1638,7 @@ static netdev_tx_t cpsw_ndo_start_xmit(struct sk_buff *skb,
 	q_idx = q_idx % cpsw->tx_ch_num;
 
 	txch = cpsw->txv[q_idx].ch;
+	txq = netdev_get_tx_queue(ndev, q_idx);
 	ret = cpsw_tx_packet_submit(priv, skb, txch);
 	if (unlikely(ret != 0)) {
 		cpsw_err(priv, tx_err, "desc submit failed\n");
@@ -1648,15 +1649,26 @@ static netdev_tx_t cpsw_ndo_start_xmit(struct sk_buff *skb,
 	 * tell the kernel to stop sending us tx frames.
 	 */
 	if (unlikely(!cpdma_check_free_tx_desc(txch))) {
-		txq = netdev_get_tx_queue(ndev, q_idx);
 		netif_tx_stop_queue(txq);
+
+		/* Barrier, so that stop_queue visible to other cpus */
+		smp_mb__after_atomic();
+
+		if (cpdma_check_free_tx_desc(txch))
+			netif_tx_wake_queue(txq);
 	}
 
 	return NETDEV_TX_OK;
 fail:
 	ndev->stats.tx_dropped++;
-	txq = netdev_get_tx_queue(ndev, skb_get_queue_mapping(skb));
 	netif_tx_stop_queue(txq);
+
+	/* Barrier, so that stop_queue visible to other cpus */
+	smp_mb__after_atomic();
+
+	if (cpdma_check_free_tx_desc(txch))
+		netif_tx_wake_queue(txq);
+
 	return NETDEV_TX_BUSY;
 }
 
-- 
2.10.5