Re: NETDEV WATCHDOG: eth0 (dwc-eth-dwmac): transmit queue 1 timed out
OK Bhadram, thanks for this check. I was afraid that the HW FIFO had some issues.

Best Regards
Peppe

On 12/1/2017 4:39 PM, Bhadram Varka wrote:
> Hi Giuseppe,
>
> I don't see any issue if we run the "ping -s 1400" case. I believe TSO is not triggered in this case.
>
> Thanks,
> Bhadram.
RE: NETDEV WATCHDOG: eth0 (dwc-eth-dwmac): transmit queue 1 timed out
Hi Lars,

> -----Original Message-----
> From: netdev-ow...@vger.kernel.org [mailto:netdev-ow...@vger.kernel.org] On Behalf Of Lars P (Mailing List Account)
> Sent: Friday, December 01, 2017 9:05 PM
> To: Bhadram Varka
> Cc: joao.pi...@synopsys.com; peppe.cavall...@st.com; linux-netdev
> Subject: Re: NETDEV WATCHDOG: eth0 (dwc-eth-dwmac): transmit queue 1 timed out
>
> Hi Bhadram,
>
> Does the Tegra by any chance have TSO enabled on multiple TX-DMA channels?

Yes. TSO is enabled for multiple TX DMA channels.

> I recently noticed a second TSO bug in the stmmac while making the patch
> "stmmac: reset last TSO segment size after device open".
>
> The last-used MSS setting in TSO is tracked as a device-global variable and
> not per TX queue. Using TSO on tx queue 0 will record mss to priv->mss and if
> we later use TSO on tx queue 1 with the same gso_size then the driver will not
> use a context descriptor to set the MSS size for this queue. This probably
> means that the TSO controller in channel 1 goes nuts with an undefined mss
> setting.

I believe it would be better to track the MSS on a per-queue basis instead of through a global variable.

Thanks,
Bhadram.

> BR,
> Lars Persson
RE: NETDEV WATCHDOG: eth0 (dwc-eth-dwmac): transmit queue 1 timed out
Hi Giuseppe,

I don't see any issue if we run the "ping -s 1400" case. I believe TSO is not triggered in this case.

Thanks,
Bhadram.

-----Original Message-----
From: Giuseppe CAVALLARO [mailto:peppe.cavall...@st.com]
Sent: Thursday, November 23, 2017 11:58 AM
To: Bhadram Varka; joao.pi...@synopsys.com
Cc: linux-netdev
Subject: Re: NETDEV WATCHDOG: eth0 (dwc-eth-dwmac): transmit queue 1 timed out

Hi Bhadram,

you said that in the normal ping scenario this is not observed. I wonder if you could try, for example, ping with -s 1400. If that still fails, I think the issue could be the FIFO tuning, and I would expect overflow on the RX MMC counters.

Let me know.

Regards,
Peppe
Re: NETDEV WATCHDOG: eth0 (dwc-eth-dwmac): transmit queue 1 timed out
Hi Bhadram,

Does the Tegra by any chance have TSO enabled on multiple TX-DMA channels?

I recently noticed a second TSO bug in the stmmac while making the patch "stmmac: reset last TSO segment size after device open".

The last-used MSS setting in TSO is tracked as a device-global variable and not per TX queue. Using TSO on tx queue 0 will record mss to priv->mss and if we later use TSO on tx queue 1 with the same gso_size then the driver will not use a context descriptor to set the MSS size for this queue. This probably means that the TSO controller in channel 1 goes nuts with an undefined mss setting.

BR,
Lars Persson

On Mon, Nov 20, 2017 at 7:38 AM, Bhadram Varka wrote:
> Hi Joao/Peppe,
>
> Observed this issue more frequently with multi-channel case. Am I missing
> something in DT?
> Please help here to understand the issue.
>
> Thanks,
> Bhadram
Re: NETDEV WATCHDOG: eth0 (dwc-eth-dwmac): transmit queue 1 timed out
Hi Bhadram,

you said that in the normal ping scenario this is not observed. I wonder if you could try, for example, ping with -s 1400. If that still fails, I think the issue could be the FIFO tuning, and I would expect overflow on the RX MMC counters.

Let me know.

Regards,
Peppe

On 11/20/2017 3:22 PM, Bhadram Varka wrote:
> Hi Giuseppe,
>
> Thanks for responding.
>
> Actually I am using the net-next tree for making the changes. The patches
> below are already present in the code base.
>
> a0daae1 net: stmmac: Disable flow ctrl for RX AVB queues and really enable TX AVB queues
> 52a7623 net: stmmac: Use correct values in TQS/RQS fields
>
> Thanks,
> Bhadram.
RE: NETDEV WATCHDOG: eth0 (dwc-eth-dwmac): transmit queue 1 timed out
Hi Giuseppe,

Thanks for responding.

Actually I am using the net-next tree for making the changes. The patches below are already present in the code base.

a0daae1 net: stmmac: Disable flow ctrl for RX AVB queues and really enable TX AVB queues
52a7623 net: stmmac: Use correct values in TQS/RQS fields

Thanks,
Bhadram.

-----Original Message-----
From: Giuseppe CAVALLARO [mailto:peppe.cavall...@st.com]
Sent: Monday, November 20, 2017 6:37 PM
To: Bhadram Varka; joao.pi...@synopsys.com
Cc: linux-netdev
Subject: Re: NETDEV WATCHDOG: eth0 (dwc-eth-dwmac): transmit queue 1 timed out

Hello Bhadram,

there are some new patches in the net/net-next repo that you should have; for example:

[PATCH net-next v2 0/2] net: stmmac: Improvements for multi-queuing and for AVB

Let me know if these help you.

Regards
Peppe
Re: NETDEV WATCHDOG: eth0 (dwc-eth-dwmac): transmit queue 1 timed out
Hello Bhadram,

there are some new patches in the net/net-next repo that you should have; for example:

[PATCH net-next v2 0/2] net: stmmac: Improvements for multi-queuing and for AVB

Let me know if these help you.

Regards
Peppe

On 11/20/2017 7:38 AM, Bhadram Varka wrote:
> Hi Joao/Peppe,
>
> Observed this issue more frequently with multi-channel case. Am I missing
> something in DT?
> Please help here to understand the issue.
>
> Thanks,
> Bhadram
RE: NETDEV WATCHDOG: eth0 (dwc-eth-dwmac): transmit queue 1 timed out
Hi Joao/Peppe,

Observed this issue more frequently with multi-channel case. Am I missing something in DT?
Please help here to understand the issue.

Thanks,
Bhadram

-----Original Message-----
From: Bhadram Varka
Sent: Thursday, November 16, 2017 9:41 AM
To: linux-netdev
Subject: NETDEV WATCHDOG: eth0 (dwc-eth-dwmac): transmit queue 1 timed out

Hi,

I am trying to enable multi-queue in Tegra186 EQOS (which has support for 4 channels). Observed below netdev watchdog warning. Its easily reproable with iperf test.
In normal ping scenario this is not observed. I did not observe any issue if we disable TSO. Looks like issue in stmmac_tso_xmit() in multi-channel scenario.

[...]
NETDEV WATCHDOG: eth0 (dwc-eth-dwmac): transmit queue 1 timed out
Hi,

I am trying to enable multi-queue in Tegra186 EQOS (which has support for 4 channels). I observed the netdev watchdog warning below. It is easily reproducible with an iperf test.
In the normal ping scenario this is not observed. I did not observe any issue if we disable TSO. It looks like an issue in stmmac_tso_xmit() in the multi-channel scenario.

[ 88.801672] NETDEV WATCHDOG: eth0 (dwc-eth-dwmac): transmit queue 0 timed out
[ 88.808818] ------------[ cut here ]------------
[ 88.813435] WARNING: CPU: 5 PID: 0 at net/sched/sch_generic.c:320 dev_watchdog+0x2cc/0x2d8
[ 88.821681] Modules linked in: dwmac_dwc_qos_eth stmmac_platform crc32_ce crct10dif_ce stmmac ip_tables x_tables ipv6
[ 88.832290] CPU: 5 PID: 0 Comm: swapper/5 Tainted: G S 4.14.0-rc7-01956-g9395db5-dirty #21
[ 88.841663] Hardware name: NVIDIA Tegra186 P2771- Development Board (DT)
[ 88.848697] task: 8001ec8fd400 task.stack: 09e38000
[ 88.854606] PC is at dev_watchdog+0x2cc/0x2d8
[ 88.858952] LR is at dev_watchdog+0x2cc/0x2d8
[ 88.863300] pc : [] lr : [] pstate: 2145
[ 88.870678] sp : 0802bd80
[ 88.873983] x29: 0802bd80 x28: 00a0
[ 88.879287] x27: x26: 8001eae2c3b0
[ 88.884589] x25: 0005 x24: 8001ecb6be80
[ 88.889891] x23: 8001eae2c39c x22: 8001eae2bfb0
[ 88.895192] x21: 8001eae2c000 x20: 08fe7000
[ 88.900493] x19: 0001 x18: 0010
[ 88.905795] x17: x16:
[ 88.911098] x15: x14: 756f2064656d6974
[ 88.916399] x13: 2031206575657571 x12: 08fe9df0
[ 88.921699] x11: 08586180 x10: 642d6874652d6377
[ 88.927000] x9 : 0016 x8 : 3a474f4448435441
[ 88.932301] x7 : 572056454454454e x6 : 014f
[ 88.937602] x5 : 0020 x4 :
[ 88.942902] x3 : x2 : 08fec4c0
[ 88.948203] x1 : 8001ec8fd400 x0 : 0041
[ 88.953504] Call trace:
[ 88.955944] Exception stack(0x0802bc40 to 0x0802bd80)
[ 88.962371] bc40: 0041 8001ec8fd400 08fec4c0
[ 88.970184] bc60: 0020 014f 572056454454454e
[ 88.977998] bc80: 3a474f4448435441 0016 642d6874652d6377 08586180
[ 88.985811] bca0: 08fe9df0 2031206575657571 756f2064656d6974
[ 88.993624] bcc0: 0010 0001
[ 89.001439] bce0: 08fe7000 8001eae2c000 8001eae2bfb0 8001eae2c39c
[ 89.009252] bd00: 8001ecb6be80 0005 8001eae2c3b0
[ 89.017065] bd20: 00a0 0802bd80 0894a76c 0802bd80
[ 89.024879] bd40: 0894a76c 2145 00b67570 0001
[ 89.032693] bd60: 0001 8001ecb6b200 0802bd80 0894a76c
[ 89.040508] [] dev_watchdog+0x2cc/0x2d8
[ 89.045900] [] call_timer_fn.isra.5+0x24/0x80
[ 89.051809] [] expire_timers+0xa4/0xb0
[ 89.057111] [] run_timer_softirq+0x140/0x170
[ 89.062933] [] __do_softirq+0x12c/0x228
[ 89.068323] [] irq_exit+0xd0/0x108
[ 89.073278] [] __handle_domain_irq+0x60/0xb8
[ 89.079098] [] gic_handle_irq+0x58/0xa8
[ 89.084484] Exception stack(0x09e3be20 to 0x09e3bf60)
[ 89.090910] be20: 0001
[ 89.098724] be40: 09e3bf60 8001ecffd000 0001
[ 89.106537] be60: 0002 09e3bee0 0a00
[ 89.114351] be80: 0001 001c3dfbd9959589 1daf5b7a4860
[ 89.122164] bea0: 0825b000 c0311284 08fc5000
[ 89.129978] bec0: 08fe9000 08fe9000 08fd04a0 08fe9e90
[ 89.137792] bee0: 8001ec8fd400
[ 89.145605] bf00: 09e3bf60 0808548c 09e3bf60
[ 89.153418] bf20: 08085490 0145
[ 89.161231] bf40: 081409c4 09e3bf60 08085490
[ 89.169044] [] el1_irq+0xb0/0x124
[ 89.173912] [] arch_cpu_idle+0x10/0x18
[ 89.179213] [] do_idle+0x120/0x1e0
[ 89.184166] [] cpu_startup_entry+0x24/0x28
[ 89.189814] [] secondary_start_kernel+0x110/0x120
[ 89.196067] ---[ end trace 039d403d63546b77 ]---

Below are the DT changes -

diff --git a/arch/arm64/boot/dts/nvidia/tegra186.dtsi b/arch/arm64/boot/dts/nvidia/tegra186.dtsi
index 0b0552c..ffe1b80 100644
--- a/arch/arm64/boot/dts/nvidia/tegra186.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra186.dtsi
@@ -27,21 +27,40 @@
 #gpio-cells = <2>;
 gpio-controller;
 };
+
+ mtl_tx_setup: tx-queues-config {
+