Hi Flavio,

I am new to the community and currently working on the e1000e driver. I
found something strange in this issue; please see below.
Thanks,
Michael Wang

On 10/18/2011 10:42 PM, Flavio Leitner wrote:
> On Mon, 17 Oct 2011 11:48:22 -0700
> Jesse Brandeburg <jesse.brandeb...@intel.com> wrote:
>
>> On Fri, 14 Oct 2011 10:04:26 -0700
>> Flavio Leitner <f...@redhat.com> wrote:
>>
>>> Hi,
>>>
>>> I got a few reports so far that 82571EB models are having the
>>> "Detected Hardware Unit Hang" issue after upgrading the kernel.
>>>
>>> Further debugging with an instrumented kernel revealed that the
>>> socket buffer time stamp matches the last time e1000_xmit_frame()
>>> was called, and that the time stamp of the last e1000_clean_tx_irq()
>>> run is prior to the one in the socket buffer.
>>>
>>> However, ~1 second later, an interrupt is fired and the old entry
>>> is found. Sometimes, the scheduled print_hang_task dumps the
>>> information _after_ the old entry is sent (it shows an empty ring),
>>> indicating that the HW TX unit isn't really stuck and apparently
>>> just missed the signal to initiate the transmission.
>>>
>>> Order of events:
>>> (1)  skb is pushed down
>>> (2)  e1000_xmit_frame() is called
>>> (3)  ring is filled with one entry
>>> (4)  TDT is updated
>>> (5)  nothing happens for a little more than 1 second
>>> (6)  interrupt is fired
>>> (7)  e1000_clean_tx_irq() is called
>>> (8)  finds the entry not ready with an old time stamp,
>>>      schedules print_hang_task and stops the TX queue
>>> (9)  print_hang_task runs, dumps the info, but the old entry is now sent
>>> (10) apparently the TX queue is back
>>
>> Flavio, thanks for the detailed info, please be sure to supply us the
>> bugzilla number.
>
> It was buried at the end of the first email:
> https://bugzilla.redhat.com/show_bug.cgi?id=746272
>
>> TDH is probably not moving due to the writeback threshold settings in
>> TXDCTL. netperf UDP_RR test is likely a good way to test this.
>
> Yeah, makes sense. I haven't heard about new events since I removed
> the flag FLAG2_DMA_BURST.
> Unfortunately, I don't have access to the exact same hardware and I
> haven't reproduced the issue in-house yet with another 82571EB. See
> below for the interface statistics from sar.
>
>> I don't think the sequence is quite what you said. We are going to
>> work with the hardware team to get a sequence that works right, and we
>> should have a fix for you soon.
>
> Yeah, the sequence might not be exact, but it gives us a good idea of
> what could be happening.
>
> There are two events right after one another:
>
> Oct 9 05:45:23 kernel: TDH<48>
> Oct 9 05:45:23 kernel: TDT<49>
> Oct 9 05:45:23 kernel: next_to_use<49>
> Oct 9 05:45:23 kernel: next_to_clean<48>
> Oct 9 05:45:23 kernel: buffer_info[next_to_clean]:
> Oct 9 05:45:23 kernel: time_stamp<102338ca6>
> Oct 9 05:45:23 kernel: next_to_watch<48>
> Oct 9 05:45:23 kernel: jiffies<102338dc1>
> Oct 9 05:45:23 kernel: next_to_watch.status<0>
> Oct 9 05:45:23 kernel: MAC Status<80383>
> Oct 9 05:45:23 kernel: PHY Status<792d>
> Oct 9 05:45:23 kernel: PHY 1000BASE-T Status<3800>
> Oct 9 05:45:23 kernel: PHY Extended Status<3000>
> Oct 9 05:45:23 kernel: PCI Status<10>
>
> Oct 9 05:51:54 kernel: e1000e 0000:22:00.1: eth7: Detected Hardware Unit Hang:
> Oct 9 05:51:54 kernel: TDH<55>
> Oct 9 05:51:54 kernel: TDT<56>
> Oct 9 05:51:54 kernel: next_to_use<56>
> Oct 9 05:51:54 kernel: next_to_clean<55>
> Oct 9 05:51:54 kernel: buffer_info[next_to_clean]:
> Oct 9 05:51:54 kernel: time_stamp<102350986>
> Oct 9 05:51:54 kernel: next_to_watch<55>
> Oct 9 05:51:54 kernel: jiffies<102350b07>
> Oct 9 05:51:54 kernel: next_to_watch.status<0>
> Oct 9 05:51:54 kernel: MAC Status<80383>
> Oct 9 05:51:54 kernel: PHY Status<792d>
> Oct 9 05:51:54 kernel: PHY 1000BASE-T Status<3800>
> Oct 9 05:51:54 kernel: PHY Extended Status<3000>
> Oct 9 05:51:54 kernel: PCI Status<10>

I see the hang check is:

    time_after(jiffies, tx_ring->buffer_info[i].time_stamp
                        + (adapter->tx_timeout_factor * HZ))

which means a hang is reported when the current jiffies minus the buffer's
time stamp exceeds (adapter->tx_timeout_factor * HZ). Since
tx_timeout_factor is at least 1, on x86 with the usual HZ=1000 the delta
(jiffies - time_stamp) should be over 1000 before the message appears, but
in the dumps above it is only around 300:
0x102338dc1 - 0x102338ca6 = 0x11b = 283, and
0x102350b07 - 0x102350986 = 0x181 = 385.
Could you please check the HZ value of your platform?

> This is the sar report; the interface was idling.
>
> 00:00:01   IFACE  rxpck/s  txpck/s  rxbyt/s  txbyt/s  rxcmp/s  txcmp/s  rxmcst/s
> 05:40:01    eth7     1.13     0.03   944.69     4.14     0.00     0.00      0.87
> 05:50:01    eth7     1.25     0.03   952.37     4.13     0.00     0.00      0.87
> 06:00:01    eth7     1.14     0.03   947.26     4.14     0.00     0.00      0.87
>
> 00:00:01   IFACE  rxerr/s  txerr/s  coll/s  rxdrop/s  txdrop/s  txcarr/s  rxfram/s  rxfifo/s  txfifo/s
> 05:40:01    eth7     0.00     0.00    0.00      0.00      0.00      0.00      0.00      0.00      0.00
> 05:50:01    eth7     0.00     0.00    0.00      0.00      0.00      0.00      0.00      0.00      0.00
> 06:00:01    eth7     0.00     0.00    0.00      0.00      0.00      0.00      0.00      0.00      0.00
>
> ethtool -i eth7:
> driver: e1000e
> version: 1.3.10-k2
> firmware-version: 5.12-2
> bus-info: 0000:22:00.1
>
> 22:00.1 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (Copper) (rev 06)
> 22:00.1 0200: 8086:10bc (rev 06)
> (the rest of the lspci output is in the first email, same thing...)
>
> One of the customers has a test system reproducing this issue, so
> let me know what else I can do to help you guys.
>
> thanks,
> fbl

>>> The following commit seems to be related to the symptoms seen above:
>>> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=3a3b75860527a11ba5035c6aa576079245d09e2a
>>>
>>> From: Jesse Brandeburg <jesse.brandeb...@intel.com>
>>> Date: Wed, 29 Sep 2010 21:38:49 +0000 (+0000)
>>> Subject: e1000e: use hardware writeback batching
>>> X-Git-Tag: v2.6.37-rc1~147^2~299
>>> X-Git-Url: http://git.kernel.org/?p=linux%2Fkernel%2Fgit%2Ftorvalds%2Flinux-2.6.git;a=commitdiff_plain;h=3a3b75860527a11ba5035c6aa576079245d09e2a
>>>
>>> e1000e: use hardware writeback batching
>>>
>>> Most e1000e parts support batching writebacks.
>>> The problem with this is that when some of the TADV or TIDV timers
>>> are not set, Tx can sit forever.
>>>
>>> This is solved in this patch with write flushes using the Flush
>>> Partial Descriptors (FPD) bit in TIDV and RDTR.
>>>
>>> This improves bus utilization and removes partial writes on e1000e,
>>> particularly from 82571 parts in S5500 chipset based machines.
>>>
>>> Only ES2LAN and 82571/2 parts are included in this optimization, to
>>> reduce testing load.
>>>
>>> We have modified the instrumented kernel to include the following patch
>>> disabling the writeback batching feature to narrow down the problem:
>>>
>>> --- debug/drivers/net/e1000e/82571.c.orig	2011-10-11 14:00:44.000000000 -0300
>>> +++ debug/drivers/net/e1000e/82571.c	2011-10-11 15:02:51.000000000 -0300
>>> @@ -2028,8 +2028,7 @@ struct e1000_info e1000_82571_info = {
>>>  				  | FLAG_RESET_OVERWRITES_LAA /* errata */
>>>  				  | FLAG_TARC_SPEED_MODE_BIT /* errata */
>>>  				  | FLAG_APME_CHECK_PORT_B,
>>> -	.flags2			= FLAG2_DISABLE_ASPM_L1 /* errata 13 */
>>> -				  | FLAG2_DMA_BURST,
>>> +	.flags2			= FLAG2_DISABLE_ASPM_L1, /* errata 13 */
>>>  	.pba			= 38,
>>>  	.max_hw_frame_size	= DEFAULT_JUMBO,
>>>
>>> and the customer confirmed that the issue has disappeared since then.
>>>
>>> Board info:
>>> 1e:00.0 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (Copper) (rev 06)
>>>
>>> 1e:00.0 0200: 8086:10bc (rev 06)
>>> 	Subsystem: 103c:704b
>>> 	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR- FastB2B- DisINTx+
>>> 	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>>> 	Latency: 0, Cache Line Size: 64 bytes
>>> 	Interrupt: pin B routed to IRQ 224
>>> 	Region 0: Memory at fd4e0000 (32-bit, non-prefetchable) [size=128K]
>>> 	Region 1: Memory at fd400000 (32-bit, non-prefetchable) [size=512K]
>>> 	Region 2: I/O ports at 7000 [size=32]
>>> 	Capabilities: [c8] Power Management version 2
>>> 		Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
>>> 		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
>>> 	Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
>>> 		Address: 00000000fee00000  Data: 4073
>>> 	Capabilities: [e0] Express (v1) Endpoint, MSI 00
>>> 		DevCap:	MaxPayload 256 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
>>> 			ExtTag- AttnBtn- AttnInd- PwrInd- RBE- FLReset-
>>> 		DevCtl:	Report errors: Correctable- Non-Fatal+ Fatal+ Unsupported-
>>> 			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
>>> 			MaxPayload 256 bytes, MaxReadReq 4096 bytes
>>> 		DevSta:	CorrErr- UncorrErr- FatalErr+ UnsuppReq+ AuxPwr+ TransPend-
>>> 		LnkCap:	Port #2, Speed 2.5GT/s, Width x4, ASPM L0s, Latency L0 <4us, L1 <64us
>>> 			ClockPM- Surprise- LLActRep- BwNot-
>>> 		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk-
>>> 			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
>>> 		LnkSta:	Speed 2.5GT/s, Width x4, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
>>> 	Kernel driver in use: e1000e
>>> 	Kernel modules: e1000e
>>>
>>> I am checking the NIC dev specs again but so far I couldn't find
>>> a reason for this to happen.
>>>
>>> Any ideas?
>>> It happens with the 5.7 kernel (2.6.18-274.el5) and it seems to be
>>> happening with 6.1 as well, though I am waiting for the instrumented
>>> kernel outputs to confirm.
>>>
>>> This is related to
>>> https://bugzilla.redhat.com/show_bug.cgi?id=746272
>
> ------------------------------------------------------------------------------
> All the data continuously generated in your IT infrastructure contains a
> definitive record of customers, application performance, security
> threats, fraudulent activity and more. Splunk takes this data and makes
> sense of it. Business sense. IT sense. Common sense.
> http://p.sf.net/sfu/splunk-d2d-oct
> _______________________________________________
> E1000-devel mailing list
> E1000-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/e1000-devel
> To learn more about Intel® Ethernet, visit
> http://communities.intel.com/community/wired