Re: [PATCH v2] ethernet:arc: Fix racing of TX ring buffer

2016-06-05 Thread Shuyu Wei
On Mon, May 30, 2016 at 11:41:22PM +0200, Lino Sanfilippo wrote:
> 
> Did you see the same issues with the patch before (the one that, as you wrote,
> survived a whole night of stress testing)?
> 
> Lino

Hi Lino,
Sorry for my late reply. I retested the previous patch, it did have
the same issue. However it seems that the possibility of stuck is lower
(just my instinct, no evidence).


Re: [PATCH v2] ethernet:arc: Fix racing of TX ring buffer

2016-05-27 Thread Shuyu Wei
On Wed, May 25, 2016 at 01:56:20AM +0200, Lino Sanfilippo wrote:
> Francois, Shuyu,
> 
> this is the patch with the discussed changes.
> 
> Shuyu it would be great if you could test this one. If it passes
> and there are no further objections I will resend it as a regular patch
> (including commit message, etc.) to the mailing list.
> 
> 
> diff --git a/drivers/net/ethernet/arc/emac_main.c 
> b/drivers/net/ethernet/arc/emac_main.c
> index a3a9392..ec656b3 100644
> --- a/drivers/net/ethernet/arc/emac_main.c
> +++ b/drivers/net/ethernet/arc/emac_main.c
> @@ -153,18 +153,29 @@ static void arc_emac_tx_clean(struct net_device *ndev)
>  {
>   struct arc_emac_priv *priv = netdev_priv(ndev);
>   struct net_device_stats *stats = &ndev->stats;
> + unsigned int curr = priv->txbd_curr;
>   unsigned int i;
>  
> + /* Make sure buffers and txbd_curr are consistent */
> + smp_rmb();
> +
>   for (i = 0; i < TX_BD_NUM; i++) {
>   unsigned int *txbd_dirty = &priv->txbd_dirty;
>   struct arc_emac_bd *txbd = &priv->txbd[*txbd_dirty];
>   struct buffer_state *tx_buff = &priv->tx_buff[*txbd_dirty];
> - struct sk_buff *skb = tx_buff->skb;
> - unsigned int info = le32_to_cpu(txbd->info);
> + unsigned int info;
> + struct sk_buff *skb;
>  
> - if ((info & FOR_EMAC) || !txbd->data || !skb)
> + if (*txbd_dirty == curr)
>   break;
>  
> + info = le32_to_cpu(txbd->info);
> +
> + if (info & FOR_EMAC)
> + break;
> +
> + skb = tx_buff->skb;
> +
>   if (unlikely(info & (DROP | DEFR | LTCL | UFLO))) {
>   stats->tx_errors++;
>   stats->tx_dropped++;
> @@ -195,8 +206,8 @@ static void arc_emac_tx_clean(struct net_device *ndev)
>   *txbd_dirty = (*txbd_dirty + 1) % TX_BD_NUM;
>   }
>  
> - /* Ensure that txbd_dirty is visible to tx() before checking
> -  * for queue stopped.
> + /* Ensure that txbd_dirty is visible to tx() and we see the most recent
> +  * value for txbd_curr.
>*/
>   smp_mb();
>  
> @@ -680,27 +691,24 @@ static int arc_emac_tx(struct sk_buff *skb, struct 
> net_device *ndev)
>   dma_unmap_len_set(&priv->tx_buff[*txbd_curr], len, len);
>  
>   priv->txbd[*txbd_curr].data = cpu_to_le32(addr);
> -
> - /* Make sure pointer to data buffer is set */
> - wmb();
> + priv->tx_buff[*txbd_curr].skb = skb;
>  
>   skb_tx_timestamp(skb);
>  
>   *info = cpu_to_le32(FOR_EMAC | FIRST_OR_LAST_MASK | len);
>  
> - /* Make sure info word is set */
> + /* 1. Make sure that with respect to tx_clean everything is set up
> +  * properly before we advance txbd_curr.
> +  * 2. Make sure writes to DMA descriptors are completed before we inform
> +  * the hardware.
> +  */
>   wmb();
>  
> - priv->tx_buff[*txbd_curr].skb = skb;
> -
>   /* Increment index to point to the next BD */
>   *txbd_curr = (*txbd_curr + 1) % TX_BD_NUM;
>  
> - /* Ensure that tx_clean() sees the new txbd_curr before
> -  * checking the queue status. This prevents an unneeded wake
> -  * of the queue in tx_clean().
> -  */
> - smp_mb();
> + /* Ensure tx_clean() sees the updated value of txbd_curr */
> + smp_wmb();
>  
>   if (!arc_emac_tx_avail(priv)) {
>   netif_stop_queue(ndev);

After some stress testing, it worked well most of the time.
But there is a chance that it may get stuck when I use 2 nc process
to send TCP packects at full speed.  Only when a new rx packet 
arrive can trigger it to run again. This happens only once per several
hours. No problem in UDP mode.  I'm not sure if it's related to tx code in
the driver.


Re: [PATCH v2] ethernet:arc: Fix racing of TX ring buffer

2016-05-23 Thread Shuyu Wei
On Sun, May 22, 2016 at 01:30:27PM +0200, Lino Sanfilippo wrote:
> 
> Thanks for testing. However that extra check for skb not being NULL should 
> not be
> necessary if the code were correct. The changes I suggested were all about 
> having
> skb and info consistent with txbd_curr.
> But I just realized that there is still a big flaw in the last changes. While
> tx() looks correct now (we first set up the descriptor and assign the skb and 
> _then_
> advance txbd_curr) tx_clean still is not:
> 
> We _first_ have to read tx_curr and _then_ read the corresponding descriptor 
> and its skb.
> (The last patch implemented just the reverse - and thus wrong - order, first 
> get skb and 
> descriptor and then read tx_curr).
> 
> So the patch below hopefully handles also tx_clean correctly. Could you 
> please do once more a test
> with this one?

Hi Lino, 
This patch worked after a whole night of stress testing.

> > 
> > After further test, my patch to barrier timestamp() didn't work.
> > Just like the original code in the tree, the emac still got stuck under
> > high load, even if I changed the smp_wmb() to dma_wmb(). So the original
> > code do have race somewhere.
> 
> So to make this clear: with the current code in net-next you still see a 
> problem (lockup), right?
Yes, I mean the mainline kernel, which should be the same as net-next.


> > ... and why Francois' fix worked. Please be patient with me :-).
> 
> So which fix(es) exactly work for you and solve your lockup issue?
I mean the patch below, starting this thread.

diff --git a/drivers/net/ethernet/arc/emac_main.c 
b/drivers/net/ethernet/arc/emac_main.c
index a3a9392..df3dfef 100644
--- a/drivers/net/ethernet/arc/emac_main.c
+++ b/drivers/net/ethernet/arc/emac_main.c
@@ -153,9 +153,8 @@ static void arc_emac_tx_clean(struct net_device *ndev)
 {
struct arc_emac_priv *priv = netdev_priv(ndev);
struct net_device_stats *stats = &ndev->stats;
-   unsigned int i;
 
-   for (i = 0; i < TX_BD_NUM; i++) {
+   while (priv->txbd_dirty != priv->txbd_curr) {
unsigned int *txbd_dirty = &priv->txbd_dirty;
struct arc_emac_bd *txbd = &priv->txbd[*txbd_dirty];
struct buffer_state *tx_buff = &priv->tx_buff[*txbd_dirty];
@@ -685,13 +684,15 @@ static int arc_emac_tx(struct sk_buff *skb, struct 
net_device *ndev)
wmb();
 
skb_tx_timestamp(skb);
+   priv->tx_buff[*txbd_curr].skb = skb;
+
+   dma_wmb();
 
*info = cpu_to_le32(FOR_EMAC | FIRST_OR_LAST_MASK | len);
 
/* Make sure info word is set */
wmb();
 
-   priv->tx_buff[*txbd_curr].skb = skb;
 
/* Increment index to point to the next BD */
*txbd_curr = (*txbd_curr + 1) % TX_BD_NUM;




Re: [PATCH v2] ethernet:arc: Fix racing of TX ring buffer

2016-05-22 Thread Shuyu Wei
On Sun, May 22, 2016 at 12:58:55AM +0200, Lino Sanfilippo wrote:
> --- a/drivers/net/ethernet/arc/emac_main.c
> +++ b/drivers/net/ethernet/arc/emac_main.c
> @@ -162,7 +162,13 @@ static void arc_emac_tx_clean(struct net_device *ndev)
>   struct sk_buff *skb = tx_buff->skb;
>   unsigned int info = le32_to_cpu(txbd->info);
>  
> - if ((info & FOR_EMAC) || !txbd->data || !skb)
> + if (info & FOR_EMAC)
> + break;
> +
> + /* Make sure curr pointer is consistent with info */
> + rmb();
> +
> + if (*txbd_dirty == priv->txbd_curr)
>   break;
>  
>   if (unlikely(info & (DROP | DEFR | LTCL | UFLO))) {
> @@ -195,8 +201,8 @@ static void arc_emac_tx_clean(struct net_device *ndev)
>   *txbd_dirty = (*txbd_dirty + 1) % TX_BD_NUM;
>   }
>  
> - /* Ensure that txbd_dirty is visible to tx() before checking
> -  * for queue stopped.
> + /* Ensure that txbd_dirty is visible to tx() and we see the most recent
> +  * value for txbd_curr.
>*/
>   smp_mb();
>  
> @@ -680,35 +686,29 @@ static int arc_emac_tx(struct sk_buff *skb, struct 
> net_device *ndev)
>   dma_unmap_len_set(&priv->tx_buff[*txbd_curr], len, len);
>  
>   priv->txbd[*txbd_curr].data = cpu_to_le32(addr);
> -
> - /* Make sure pointer to data buffer is set */
> - wmb();
> + priv->tx_buff[*txbd_curr].skb = skb;
>  
>   skb_tx_timestamp(skb);
>  
>   *info = cpu_to_le32(FOR_EMAC | FIRST_OR_LAST_MASK | len);
>  
> - /* Make sure info word is set */
> + /* 1. Make sure that with respect to tx_clean everything is set up
> +  * properly before we advance txbd_curr.
> +  * 2. Make sure writes to DMA descriptors are completed before we inform
> +  * the hardware.
> +  */
>   wmb();
>  
> - priv->tx_buff[*txbd_curr].skb = skb;
> -
>   /* Increment index to point to the next BD */
>   *txbd_curr = (*txbd_curr + 1) % TX_BD_NUM;
>  
> - /* Ensure that tx_clean() sees the new txbd_curr before
> -  * checking the queue status. This prevents an unneeded wake
> -  * of the queue in tx_clean().
> + /* Ensure we see the most recent value of txbd_dirty and tx_clean() sees
> +  * the updated value of txbd_curr.
>*/
>   smp_mb();
>  
> - if (!arc_emac_tx_avail(priv)) {
> + if (!arc_emac_tx_avail(priv))
>   netif_stop_queue(ndev);
> - /* Refresh tx_dirty */
> - smp_mb();
> - if (arc_emac_tx_avail(priv))
> - netif_start_queue(ndev);
> - }
>  
>   arc_reg_set(priv, R_STATUS, TXPL_MASK);
>  

Hi Lino,

I tested this patch, it still got panic under stress.
Just wget 2 large files simultaneously and it failed.

Looks like the problem comes from the if statement in tx_clean().
I changed your patch to use 

-   if (info & FOR_EMAC)
+   if ((info & FOR_EMAC) || !txbd->data || !skb)

and it worked. 

After further test, my patch to barrier timestamp() didn't work.
Just like the original code in the tree, the emac still got stuck under
high load, even if I changed the smp_wmb() to dma_wmb(). So the original
code do have race somewhere. 

I'm new to kernel development, and still trying to understand how memory
barrier works and why Francois' fix worked. Please be patient with me :-).

---

[  138.501355] Unable to handle kernel NULL pointer dereference at virtual 
address 00a8
[  138.509482] pgd = c0004000
[  138.512200] [00a8] *pgd=
[  138.515850] Internal error: Oops: 5 [#1] SMP ARM
[  138.520476] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.6.0+ #98
[  138.526477] Hardware name: Rockchip (Device Tree)
[  138.531180] task: c0804e00 ti: c080 task.ti: c080
[  138.536588] PC is at __dev_kfree_skb_irq+0xc/0x8c
[  138.541295] LR is at arc_emac_poll+0x94/0x594
[  138.545653] pc : []lr : []psr: 60050113
[  138.545653] sp : c0801d88  ip : c0801da0  fp : c0801d9c
[  138.557118] r10: ee0a  r9 : ee0a1000  r8 : 01f8
[  138.562340] r7 :   r6 : 00fc  r5 : f08d0400  r4 : 03f0
[  138.568861] r3 : 0001  r2 : 0042  r1 : 0001  r0 : 
[  138.575385] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
[  138.582515] Control: 10c5387d  Table: 8da4004a  DAC: 0051
[  138.588256] Process swapper/0 (pid: 0, stack limit = 0xc0800210)
[  138.594258] Stack: (0xc0801d88 to 0xc0802000)
[  138.598617] 1d80:   03f0 f08d0400 c0801df4 c0801da0 
c03c5e64 c046f590
[  138.606794] 1da0:  c0171050 c0801ddc ee0a04a8 c016504c 0028 
c0806614 f08d05f8
[  138.614969] 1dc0: ee0a02a8 0080 f0803100 ee0a04a8 c03c5dd0 0028 
012c 623d
[  138.623144] 1de0: c0802100 c0801e18 c0801e54 c0801df8 c0471c04 c03c5ddc 
c0801eb4 c06e5ae8
[  138.631318] 1e00: c08029f4 c08029f4 c083d821 2e86f000 c07486c0 eefb76c0 
c0801e18 c0801e18
[  138.6

Re: [PATCH v2] ethernet:arc: Fix racing of TX ring buffer

2016-05-21 Thread Shuyu Wei
Looks like I got it wrong in the first place.

priv->tx_buff is not for the device, so there's no need to move it.
The race has been fixed by commit c278c253f3d9, I forgot to check
it out. That's my fault.

I do find another problem. We need to use a barrier to make sure
skb_tx_timestamp() is called before setting the FOR_EMAC flag.

According to the comment(include/linux/skbuff.h):
>/**
> * skb_tx_timestamp() - Driver hook for transmit timestamping
> *
> * Ethernet MAC Drivers should call this function in their hard_xmit()
> * function immediately before giving the sk_buff to the MAC hardware.
> *
> * Specifically, one should make absolutely sure that this function is
> * called before TX completion of this packet can trigger.  Otherwise
> * the packet could potentially already be freed.
> *
> * @skb: A socket buffer.
> */

---

diff --git a/drivers/net/ethernet/arc/emac_main.c 
b/drivers/net/ethernet/arc/emac_main.c
index a3a9392..c2447b0 100644
--- a/drivers/net/ethernet/arc/emac_main.c
+++ b/drivers/net/ethernet/arc/emac_main.c
@@ -686,6 +686,9 @@ static int arc_emac_tx(struct sk_buff *skb, struct 
net_device *ndev)
 
skb_tx_timestamp(skb);
 
+   /* Make sure timestamp is set */
+   smp_wmb();
+
*info = cpu_to_le32(FOR_EMAC | FIRST_OR_LAST_MASK | len);
 
/* Make sure info word is set */



Re: [PATCH v2] ethernet:arc: Fix racing of TX ring buffer

2016-05-21 Thread Shuyu Wei
On Thu, May 19, 2016 at 11:15:56PM +0200, Lino Sanfilippo wrote:
>  drivers/net/ethernet/arc/emac_main.c | 33 +
>  1 file changed, 17 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/net/ethernet/arc/emac_main.c 
> b/drivers/net/ethernet/arc/emac_main.c
> index a3a9392..b986499 100644
> --- a/drivers/net/ethernet/arc/emac_main.c
> +++ b/drivers/net/ethernet/arc/emac_main.c
> @@ -162,8 +162,13 @@ static void arc_emac_tx_clean(struct net_device *ndev)
>   struct sk_buff *skb = tx_buff->skb;
>   unsigned int info = le32_to_cpu(txbd->info);
>  
> - if ((info & FOR_EMAC) || !txbd->data || !skb)
> + if (info & FOR_EMAC) {
> + /* Make sure we see consistent values for info, skb
> +  * and data.
> +  */
> + rmb();
>   break;
> + }
>  
>   if (unlikely(info & (DROP | DEFR | LTCL | UFLO))) {
>   stats->tx_errors++;
> @@ -680,35 +685,31 @@ static int arc_emac_tx(struct sk_buff *skb, struct 
> net_device *ndev)
>   dma_unmap_len_set(&priv->tx_buff[*txbd_curr], len, len);
>  
>   priv->txbd[*txbd_curr].data = cpu_to_le32(addr);
> -
> - /* Make sure pointer to data buffer is set */
> - wmb();
> + priv->tx_buff[*txbd_curr].skb = skb;
>  
>   skb_tx_timestamp(skb);
>  
> - *info = cpu_to_le32(FOR_EMAC | FIRST_OR_LAST_MASK | len);
> -
> - /* Make sure info word is set */
> + /* Make sure info is set after data and skb with respect to
> +  * tx_clean().
> +  */
>   wmb();
>  
> - priv->tx_buff[*txbd_curr].skb = skb;
> + *info = cpu_to_le32(FOR_EMAC | FIRST_OR_LAST_MASK | len);
>  
>   /* Increment index to point to the next BD */
>   *txbd_curr = (*txbd_curr + 1) % TX_BD_NUM;
>  
> - /* Ensure that tx_clean() sees the new txbd_curr before
> + /* 1.Ensure that tx_clean() sees the new txbd_curr before
>* checking the queue status. This prevents an unneeded wake
>* of the queue in tx_clean().
> +  * 2.Ensure that all values are written to RAM and to DMA
> +  * before hardware is informed.
> +  * 3.Ensure we see the most recent value for tx_dirty.
>*/
> - smp_mb();
> + mb();
>  
> - if (!arc_emac_tx_avail(priv)) {
> + if (!arc_emac_tx_avail(priv))
>   netif_stop_queue(ndev);
> - /* Refresh tx_dirty */
> - smp_mb();
> - if (arc_emac_tx_avail(priv))
> - netif_start_queue(ndev);
> - }
>  
>   arc_reg_set(priv, R_STATUS, TXPL_MASK);
> 
> 
Hi Lino,
I applied your patch and tested on my devices, it directly went to
panic. So there must be something wrong.


[   13.084412] rockchip_emac 10204000.ethernet eth0: Link is Up - 100Mbps/Full 
- flow control rx/tx
[   13.093360] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[   13.105915] Unable to handle kernel NULL pointer dereference at virtual 
address 00a8
[   13.114173] pgd = c0004000
[   13.117069] [00a8] *pgd=
[   13.120778] Internal error: Oops: 5 [#1] SMP ARM
[   13.125404] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.6.0+ #88
[   13.131404] Hardware name: Rockchip (Device Tree)
[   13.136108] task: c0804e00 ti: c080 task.ti: c080
[   13.141515] PC is at __dev_kfree_skb_irq+0xc/0x8c
[   13.146223] LR is at arc_emac_poll+0x94/0x584
[   13.150581] pc : []lr : []psr: 60080113
[   13.150581] sp : c0801d88  ip : c0801da0  fp : c0801d9c
[   13.162046] r10: ee092000  r9 : ee093000  r8 : 0008
[   13.167268] r7 :   r6 : 0004  r5 : f08d0400  r4 : 0010
[   13.173789] r3 : 0001  r2 :   r1 : 0001  r0 : 
[   13.180312] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
[   13.187441] Control: 10c5387d  Table: 8d24c04a  DAC: 0051
[   13.193184] Process swapper/0 (pid: 0, stack limit = 0xc0800210)
[   13.199187] Stack: (0xc0801d88 to 0xc0802000)
[   13.203548] 1d80:   0010 f08d0400 c0801df4 c0801da0 
c03c5e64 c046f588
[   13.211723] 1da0:  ee0cc154  ee0924a8 c0801dcc 0028 
c0806614 f08d0408
[   13.219897] 1dc0: ee0922a8 007f  ee0924a8 c03c5dd0 0028 
012c fffee77c
[   13.228071] 1de0: c0802100 c0801e18 c0801e54 c0801df8 c0471bfc c03c5ddc 
c0801e3c c06e5ae8
[   13.236245] 1e00: c08029f4 c08029f4 c083d821 2e86f000 c07486c0 eefb76c0 
c0801e18 c0801e18
[   13.244419] 1e20: c0801e20 c0801e20 ee841900  c080208c c080 
0100 0003
[   13.252594] 1e40: c0802080 4003 c0801eb4 c0801e58 c012810c c04719fc 
0001 ee808000
[   13.260768] 1e60: c08024a8 0020 c0802100 fffee77b 000a c06017ec 
c083f940 c07432f8
[   13.268941] 1e80: c0802080 c0801e58 c0801eb4 c07463cc   
0001 ee808000
[   13.277115] 1ea0: c08024a8 eefe01c0 c0801ec4 c0801eb8 c01284f8 c0127ff4 
c0801eec c0801ec8
[   

[PATCH v2] ethernet:arc: Fix racing of TX ring buffer

2016-05-17 Thread Shuyu Wei
Setting the FOR_EMAC flag should be the last step of modifying the buffer
descriptor, or possible racing will occur.

The loop counter i in tx_clean() is not needed, and we need to make sure
it does not clear the finished txbds.

Signed-off-by: Shuyu Wei 
---
Changes in v2:
- Remove loop counter in tx_clean and check for unfinished txbds (Ueimor)
- Use dma_wmb() to sync writes (Ueimor)

diff --git a/drivers/net/ethernet/arc/emac_main.c 
b/drivers/net/ethernet/arc/emac_main.c
index a3a9392..df3dfef 100644
--- a/drivers/net/ethernet/arc/emac_main.c
+++ b/drivers/net/ethernet/arc/emac_main.c
@@ -153,9 +153,8 @@ static void arc_emac_tx_clean(struct net_device *ndev)
 {
struct arc_emac_priv *priv = netdev_priv(ndev);
struct net_device_stats *stats = &ndev->stats;
-   unsigned int i;
 
-   for (i = 0; i < TX_BD_NUM; i++) {
+   while (priv->txbd_dirty != priv->txbd_curr) {
unsigned int *txbd_dirty = &priv->txbd_dirty;
struct arc_emac_bd *txbd = &priv->txbd[*txbd_dirty];
struct buffer_state *tx_buff = &priv->tx_buff[*txbd_dirty];
@@ -685,13 +684,15 @@ static int arc_emac_tx(struct sk_buff *skb, struct 
net_device *ndev)
wmb();
 
skb_tx_timestamp(skb);
+   priv->tx_buff[*txbd_curr].skb = skb;
+
+   dma_wmb();
 
*info = cpu_to_le32(FOR_EMAC | FIRST_OR_LAST_MASK | len);
 
/* Make sure info word is set */
wmb();
 
-   priv->tx_buff[*txbd_curr].skb = skb;
 
/* Increment index to point to the next BD */
*txbd_curr = (*txbd_curr + 1) % TX_BD_NUM;



Re: [PATCH] ethernet:arc: Fix racing of TX ring buffer

2016-05-15 Thread Shuyu Wei
On Sun, May 15, 2016 at 11:19:53AM +0200, Francois Romieu wrote:
> 
> static void arc_emac_tx_clean(struct net_device *ndev)
> {
> [...]
> for (i = 0; i < TX_BD_NUM; i++) {
> unsigned int *txbd_dirty = &priv->txbd_dirty;
> struct arc_emac_bd *txbd = &priv->txbd[*txbd_dirty];
> struct buffer_state *tx_buff = &priv->tx_buff[*txbd_dirty];
> struct sk_buff *skb = tx_buff->skb;
> unsigned int info = le32_to_cpu(txbd->info);
> 
> if ((info & FOR_EMAC) || !txbd->data || !skb)
> break;
> ^
> 
> -> the "break" statement prevents reading all txbds. At most one extra
>descriptor is read and this driver isn't in the Mpps business.
> 

You are right, I forgot the break statement.

> > I tried your advice, Tx throughput can only reach 5.52MB/s.
> 
> Even with the original code above ?

Yes, I left tx_clean unmodified, and took your advice below.
I tested it again just now, this time throughput do reach 9.8MB/s,
Maybe last time cpu is not idle.

I still have a question, is it possible that tx_clean() run
between   priv->tx_buff[*txbd_curr].skb = skb   and   dma_wmb()?

--- a/drivers/net/ethernet/arc/emac_main.c
+++ b/drivers/net/ethernet/arc/emac_main.c
@@ -685,13 +685,15 @@ static int arc_emac_tx(struct sk_buff *skb, struct 
net_device *ndev)
wmb();
 
skb_tx_timestamp(skb);
+   priv->tx_buff[*txbd_curr].skb = skb;
+
+   dma_wmb();
 
*info = cpu_to_le32(FOR_EMAC | FIRST_OR_LAST_MASK | len);
 
/* Make sure info word is set */
wmb();
 
-   priv->tx_buff[*txbd_curr].skb = skb;
 
/* Increment index to point to the next BD */
*txbd_curr = (*txbd_curr + 1) % TX_BD_NUM;



Re: [PATCH] ethernet:arc: Fix racing of TX ring buffer

2016-05-14 Thread Shuyu Wei
Sorry, the last two lines is wrong, ignore it.

I mean I intended to ignore one or two packets.
It's just a trade-off for performance, but it
doesn't cause any memory leak.


Re: [PATCH] ethernet:arc: Fix racing of TX ring buffer

2016-05-14 Thread Shuyu Wei
On Sat, May 14, 2016 at 10:03:56PM +0200, Francois Romieu wrote:
> Shuyu Wei  :
> > The tail of the ring buffer(txbd_dirty) should never go ahead of the
> > head(txbd_curr) or the ring buffer will corrupt. 
> > 
> > This is the root cause of racing.
> 
> No (see below).
> 
> It may suffer from some barrier illness though.
> 
> > Besides, setting the FOR_EMAC flag should be the last step of modifying
> > the buffer descriptor, or possible racing will occur.
> 
> (s/Besides//)
> 
> Yes. Good catch.
> 
> > Signed-off-by: Shuyu Wei 
> > ---
> > 
> > diff --git a/drivers/net/ethernet/arc/emac_main.c 
> > b/drivers/net/ethernet/arc/emac_main.c
> > index a3a9392..5ece05b 100644
> > --- a/drivers/net/ethernet/arc/emac_main.c
> > +++ b/drivers/net/ethernet/arc/emac_main.c
> > @@ -155,7 +155,7 @@ static void arc_emac_tx_clean(struct net_device *ndev)
> > struct net_device_stats *stats = &ndev->stats;
> > unsigned int i;
> >  
> > -   for (i = 0; i < TX_BD_NUM; i++) {
> > +   for (i = priv->txbd_dirty; i != priv->txbd_curr; i = (i + 1) % 
> > TX_BD_NUM) {
> > unsigned int *txbd_dirty = &priv->txbd_dirty;
> > struct arc_emac_bd *txbd = &priv->txbd[*txbd_dirty];
> > struct buffer_state *tx_buff = &priv->tx_buff[*txbd_dirty];
> 
> "i" is only used as a loop counter in arc_emac_tx_clean. It is not even
> used as an index to dereference an array or whatever. Only "priv->txbd_dirty"
> is used.
> 
> arc_emac_tx_clean() checks FOR_EMAC, skb, and dirty tx data. It takes care of
> clearing those itself. Thus, (memory / io barrier considerations apart) it can
> only proceed beyond its own "if ((info & FOR_EMAC) || !txbd->data || !skb)"
> check if arc_emac_tx wrote all of those.
> 
> Where they are used as loop counters, both TX_BD_NUM and txbd_curr - 
> txbd_dirty
> can be considered as hints (please note that unsigned arithmetic can replace
> the "%" sludgehammer here).
> 
> > @@ -686,12 +686,12 @@ static int arc_emac_tx(struct sk_buff *skb, struct 
> > net_device *ndev)
> >  
> > skb_tx_timestamp(skb);
> 
> > +   priv->tx_buff[*txbd_curr].skb = skb;
> 
>   dma_wmb();
> 
> (sync writes to memory before releasing descriptor)
> 
> > *info = cpu_to_le32(FOR_EMAC | FIRST_OR_LAST_MASK | len);
> >  
> > /* Make sure info word is set */
> > wmb();
> 
> arc_emac_tx_clean can run from this point.
> 
> txbd_curr is still not set (and it does need to). So, if you insist
> on txbd_curr appearing in arc_emac_tx_clean::for(...), it's perfectly
> possible to ignore a sent packet.
> 
> I ignored arc_reg_set() at the end of arc_emac_tx(). I have no idea
> if it is posted nor if it forces the chipset to read the descriptors
> (synchronously ?) so part of the sentence above could be wrong.
> 
> You have found a big offender in arc_emac_tx() but the arc_emac_tx_clean()
> part is imho useless, incorrectly understood or misworded.
> 
> -- 
> Ueimor


Hi, Ueimor. Thanks for your reply.

I don't think taking txbd_curr and txbd_dirty only as hints is a good idea.
That could be a big waste, since tx_clean have to go through all the txbds.
I tried your advice, Tx throughput can only reach 5.52MB/s.

Leaving one sent packet in tx_clean is acceptable if we respect to txbd_curr
and txbd_dirty, since the ignored packet will be cleaned when new packets
arrive.

>  for (i = priv->txbd_dirty; i != priv->txbd_curr; i = (i + 1) % TX_BD_NUM) {
In fact, the loop above will always ignore one or two sent packet, the loop
below can free all packets or leave one if txbd_curr is not updated. I
use the above one since it is clearer.

  for (i = priv->txbd_dirty; (i + 1) % TX_BD_NUM != priv->txbd_curr; i = (i + 
1) % TX_BD_NUM) {


[PATCH] ethernet:arc: Fix racing of TX ring buffer

2016-05-14 Thread Shuyu Wei
The tail of the ring buffer(txbd_dirty) should never go ahead of the
head(txbd_curr) or the ring buffer will corrupt. 

This is the root cause of racing.

Besides, setting the FOR_EMAC flag should be the last step of modifying
the buffer descriptor, or possible racing will occur.

Signed-off-by: Shuyu Wei 
---

diff --git a/drivers/net/ethernet/arc/emac_main.c 
b/drivers/net/ethernet/arc/emac_main.c
index a3a9392..5ece05b 100644
--- a/drivers/net/ethernet/arc/emac_main.c
+++ b/drivers/net/ethernet/arc/emac_main.c
@@ -155,7 +155,7 @@ static void arc_emac_tx_clean(struct net_device *ndev)
struct net_device_stats *stats = &ndev->stats;
unsigned int i;
 
-   for (i = 0; i < TX_BD_NUM; i++) {
+   for (i = priv->txbd_dirty; i != priv->txbd_curr; i = (i + 1) % 
TX_BD_NUM) {
unsigned int *txbd_dirty = &priv->txbd_dirty;
struct arc_emac_bd *txbd = &priv->txbd[*txbd_dirty];
struct buffer_state *tx_buff = &priv->tx_buff[*txbd_dirty];
@@ -686,12 +686,12 @@ static int arc_emac_tx(struct sk_buff *skb, struct 
net_device *ndev)
 
skb_tx_timestamp(skb);
 
+   priv->tx_buff[*txbd_curr].skb = skb;
*info = cpu_to_le32(FOR_EMAC | FIRST_OR_LAST_MASK | len);
 
/* Make sure info word is set */
wmb();
 
-   priv->tx_buff[*txbd_curr].skb = skb;
 
/* Increment index to point to the next BD */
*txbd_curr = (*txbd_curr + 1) % TX_BD_NUM;