Re: 2.6.17-mm6
On 3/07/2006 10:03 p.m., Andrew Morton wrote:
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17/2.6.17-mm6/
>
> - A major update to the e1000 driver.
>
> - 1394 updates

Some minor breakage in the e1000...

Fedora Core release 5.90 (Test)
Kernel 2.6.17-mm6 on an x86_64

tornado.reub.net login: e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
  Tx Queue             0
  TDH                  a
  TDT                  1c
  next_to_use          1c
  next_to_clean        8
buffer_info[next_to_clean]
  time_stamp           100027f1a
  next_to_watch        a
  jiffies              1000281d4
  next_to_watch.status 0
e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
  Tx Queue             0
  TDH                  a
  TDT                  1c
  next_to_use          1c
  next_to_clean        8
buffer_info[next_to_clean]
  time_stamp           100027f1a
  next_to_watch        a
  jiffies              1000283c8
  next_to_watch.status 0
e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
  Tx Queue             0
  TDH                  a
  TDT                  1c
  next_to_use          1c
  next_to_clean        8
buffer_info[next_to_clean]
  time_stamp           100027f1a
  next_to_watch        a
  jiffies              1000285bc
  next_to_watch.status 0
e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
  Tx Queue             0
  TDH                  a
  TDT                  1c
  next_to_use          1c
  next_to_clean        8
buffer_info[next_to_clean]
  time_stamp           100027f1a
  next_to_watch        a
  jiffies              1000287b0
  next_to_watch.status 0
e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
  Tx Queue             0
  TDH                  a
  TDT                  1c
  next_to_use          1c
  next_to_clean        8
buffer_info[next_to_clean]
  time_stamp           100027f1a
  next_to_watch        a
  jiffies              1000289a4
  next_to_watch.status 0

A look through my switch logs and kernel logs over the last few days shows
these messages and layer 2/link down disconnections every few hours or so,
but of very short duration (I hadn't noticed until now).

This output above was under virtually no load.

Both the e1000 and switch port on the other end are doing RX and TX flow
control.

The controller is a built in chip on an Intel D945GNT board.

01:00.0 Ethernet controller: Intel Corporation 82573V Gigabit Ethernet Controller (Copper) (rev 03)
        Subsystem: Intel Corporation Unknown device 3094
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast TAbort- TAbort- MAbort- SERR- PERR-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 313
        Region 0: Memory at 4800 (32-bit, non-prefetchable) [size=128K]
        Region 2: I/O ports at 2000 [size=32]
        Capabilities: [c8] Power Management version 2
                Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
                Status: D0 PME-Enable- DSel=0 DScale=1 PME-
        Capabilities: [d0] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable+
                Address: fee0100c  Data: 4142
        Capabilities: [e0] Express Endpoint IRQ 0
                Device: Supported: MaxPayload 256 bytes, PhantFunc 0, ExtTag-
                Device: Latency L0s 512ns, L1 64us
                Device: AtnBtn- AtnInd- PwrInd-
                Device: Errors: Correctable- Non-Fatal- Fatal- Unsupported-
                Device: RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
                Device: MaxPayload 128 bytes, MaxReadReq 512 bytes
                Link: Supported Speed 2.5Gb/s, Width x1, ASPM unknown, Port 0
                Link: Latency L0s 128ns, L1 64us
                Link: ASPM Disabled RCB 64 bytes CommClk+ ExtSynch-
                Link: Speed 2.5Gb/s, Width x1

[EMAIL PROTECTED] log]# ethtool -i eth0
driver: e1000
version: 7.1.9-k2-NAPI
firmware-version: 1.0-5
bus-info: :01:00.0
[EMAIL PROTECTED] log]#

Where can I go from here to help debug this further?

reuben

-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
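
One way to read the hang reports above is to turn the raw hex fields into two numbers: how long the oldest unreclaimed Tx buffer has been waiting, and how many descriptors sit between the hardware head (TDH) and the tail (TDT). The following is only a rough sketch for doing that arithmetic by hand. The helper names are made up for illustration and are not part of the e1000 driver, and the ring size passed to `tx_pending` is an assumption, since the report does not state it.

```c
#include <assert.h>
#include <stdint.h>

/* Age of the oldest unreclaimed Tx buffer, in jiffies.
 * Both arguments are the hex values printed in the hang report. */
static uint64_t tx_stale_jiffies(uint64_t jiffies, uint64_t time_stamp)
{
    return jiffies - time_stamp;
}

/* Number of descriptors from head up to (not including) tail, allowing
 * for ring wraparound. ring_size is an assumption supplied by the caller;
 * it does not appear in the log. */
static unsigned int tx_pending(unsigned int head, unsigned int tail,
                               unsigned int ring_size)
{
    assert(ring_size > 0 && head < ring_size && tail < ring_size);
    return (tail + ring_size - head) % ring_size;
}
```

With the first report, `tx_stale_jiffies(0x1000281d4, 0x100027f1a)` gives 698 jiffies, and each later report is 500 jiffies further on while time_stamp, TDH and TDT stay the same, so the same buffer appears to be stuck throughout. Converting jiffies to seconds needs the kernel's HZ setting, which is not given in the report. Assuming a 256-entry ring, `tx_pending(0xa, 0x1c, 256)` gives 18 descriptors not yet fetched by the hardware.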
Re: 2.6.17-mm6
On 7/4/2006 10:01 PM, Arjan van de Ven wrote:
> this is one for the networking people, and thus netdev

It's actually ieee1394 using net infrastructure for purposes which are
unrelated to networking. Furthermore...

> On Tue, 2006-07-04 at 21:53 +0200, Rafael J. Wysocki wrote:
>> On Monday 03 July 2006 12:03, Andrew Morton wrote:
>>> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17/2.6.17-mm6/
>>>
>>> - A major update to the e1000 driver.
>>>
>>> - 1394 updates

...I believe it is unrelated to the 1394 updates new to -mm6.

>> Just found this in dmesg:
>>
>> =
>> [ INFO: inconsistent lock state ]
>> -
>> inconsistent {in-hardirq-W} - {hardirq-on-W} usage.
>> nscd/4929 [HC0[0]:SC0[1]:HE1:SE0] takes:
>>  (skb_queue_lock_key){++..}, at: [8044fe40] udp_ioctl+0x50/0xa0
>> {in-hardirq-W} state was registered at:
>>   [8024b4fa] lock_acquire+0x8a/0xc0
>>   [80476e3f] _spin_lock_irqsave+0x3f/0x60
>>   [80408c25] skb_queue_tail+0x25/0x60
>
> ok so skb_queue_lock is used in a hardirq context
>
>>   [881c9517] queue_packet_complete+0x27/0x40 [ieee1394]
>>   [881c9d6b] hpsb_packet_sent+0xab/0x100 [ieee1394]
>>   [8822a4b5] dma_trm_reset+0x115/0x140 [ohci1394]
>>   [8822c512] ohci_devctl+0x1c2/0x540 [ohci1394]
>>   [881c9673] hpsb_bus_reset+0x43/0xb0 [ieee1394]
>>   [8822d7f6] ohci_irq_handler+0x416/0x830 [ohci1394]
>>   [802631ab] handle_IRQ_event+0x2b/0x70
>>   [80264dd4] handle_level_irq+0xc4/0x130
>>   [8020c762] do_IRQ+0x112/0x130
>>   [80209d90] common_interrupt+0x64/0x65
>> irq event stamp: 4280
>> hardirqs last enabled at (4279): [8047606a] trace_hardirqs_on_thunk+0x35/0x37
>> hardirqs last disabled at (4278): [804760a1] trace_hardirqs_off_thunk+0x35/0x67
>> softirqs last enabled at (4258): [804065b5] release_sock+0xd5/0xe0
>> softirqs last disabled at (4280): [804764d1] _spin_lock_bh+0x11/0x50
>>
>> other info that might help us debug this:
>> no locks held by nscd/4929.
>>
>> stack backtrace:
>>
>> Call Trace:
>>  [8020ab9f] show_trace+0x9f/0x240
>>  [8020af75] dump_stack+0x15/0x20
>>  [80249e52] print_usage_bug+0x272/0x290
>>  [8024a0d7] mark_lock+0x267/0x5f0
>>  [8024a9a6] __lock_acquire+0x546/0xd10
>>  [8024b4fb] lock_acquire+0x8b/0xc0
>>  [804764f4] _spin_lock_bh+0x34/0x50
>>  [8044fe40] udp_ioctl+0x50/0xa0
>
> yet udp_ioctl takes it only for _bh
>
>>  [80457359] inet_ioctl+0x69/0x70
>>  [804033ac] sock_ioctl+0x22c/0x270
>>  [802a32b1] do_ioctl+0x31/0xa0
>>  [802a35db] vfs_ioctl+0x2bb/0x2e0
>>  [802a366a] sys_ioctl+0x6a/0xa0
>>  [8020985a] system_call+0x7e/0x83
>>  [2b2d76ab98a9]
>
> is this a real scenario, or is this a case of firewire is special and
> needs its own rules?

Well, firewire is special, but that should already be addressed by this
patch:

  lockdep: annotate ieee1394 skb-queue-head locking
  http://kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff_plain;h=d378834840907326ac9d448056d957d13cc3718f

Why is there still a lockdep warning?

(Ieee1394 core's usage of the skb_* API is entirely unrelated to
networking; even if eth1394 was used.)
-- 
Stefan Richter
-=-=-==- -=== --=-=
http://arcgraph.de/sr/
Re: 2.6.17-mm6
I wrote:
> (Ieee1394 core's usage of the skb_* API is entirely unrelated to
> networking; even if eth1394 was used.)

PS: I wonder if it wouldn't be better to migrate ieee1394 core away from
skb_*. I didn't look thoroughly at it yet but the benefit of using this
API appears quite low to me.

We use it to keep track of IEEE 1394 transactions [ = outgoing request
(incoming response || expiry)], with completion of transactions often
in-order due to mostly single-threaded usage, but sometimes out-of-order
(may happen regardless of multithreaded or single-threaded usage).
-- 
Stefan Richter
-=-=-==- -=== --=-=
http://arcgraph.de/sr/
Re: 2.6.17-mm6
* Stefan Richter [EMAIL PROTECTED] wrote:

> I wrote:
> > (Ieee1394 core's usage of the skb_* API is entirely unrelated to
> > networking; even if eth1394 was used.)
>
> PS: I wonder if it wouldn't be better to migrate ieee1394 core away
> from skb_*. I didn't look thoroughly at it yet but the benefit of
> using this API appears quite low to me.

yeah, it seems to be the wrong abstraction to use. It's also more
expensive than necessary: e.g. skb-heads have a qlen field that is
maintained in every list op - but the ieee1394 code does not make use of
the queue-length information. Using list.h plus a spinlock should do the
trick?

	Ingo
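
To make the "list.h plus a spinlock" idea a little more concrete, here is a rough userspace sketch, not kernel code and not the ieee1394 implementation. It assumes a hypothetical `struct txn` record and uses a C11 `atomic_flag` as a stand-in for a kernel spinlock; every name in it is invented for illustration. The points it tries to show are the ones raised above: the queue head carries no length counter, and an entry can be unlinked from anywhere in the list, so in-order and out-of-order completion cost the same.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Minimal circular doubly linked list node, modelled on the idea behind
 * the kernel's list.h (this is not the kernel header itself). */
struct list_node {
    struct list_node *prev, *next;
};

/* Hypothetical transaction record with the list node embedded in it. */
struct txn {
    int label;
    struct list_node link;
};

/* Queue head plus its lock. There is deliberately no length field. */
struct txn_queue {
    struct list_node head;
    atomic_flag lock;   /* userspace stand-in for a kernel spinlock */
};

static void txn_queue_init(struct txn_queue *q)
{
    q->head.prev = q->head.next = &q->head;
    atomic_flag_clear(&q->lock);
}

static void txn_queue_lock(struct txn_queue *q)
{
    while (atomic_flag_test_and_set_explicit(&q->lock, memory_order_acquire))
        ;   /* spin until the flag was previously clear */
}

static void txn_queue_unlock(struct txn_queue *q)
{
    atomic_flag_clear_explicit(&q->lock, memory_order_release);
}

/* Append a new outgoing transaction at the tail. */
static void txn_queue_add_tail(struct txn_queue *q, struct txn *t)
{
    assert(t != NULL);
    txn_queue_lock(q);
    t->link.prev = q->head.prev;
    t->link.next = &q->head;
    q->head.prev->next = &t->link;
    q->head.prev = &t->link;
    txn_queue_unlock(q);
}

/* Unlink a completed or expired transaction from any position. */
static void txn_queue_remove(struct txn_queue *q, struct txn *t)
{
    assert(t != NULL);
    txn_queue_lock(q);
    t->link.prev->next = t->link.next;
    t->link.next->prev = t->link.prev;
    t->link.prev = t->link.next = &t->link;
    txn_queue_unlock(q);
}

/* Peek at the oldest pending transaction, or NULL if the queue is empty. */
static struct txn *txn_queue_first(struct txn_queue *q)
{
    struct txn *t = NULL;

    txn_queue_lock(q);
    if (q->head.next != &q->head)
        t = (struct txn *)((char *)q->head.next - offsetof(struct txn, link));
    txn_queue_unlock(q);
    return t;
}
```

In the kernel the lock would also have to be taken with interrupts disabled wherever the queue is shared with a hardirq handler, which is the situation the lockdep report in this thread is about; the sketch leaves that out because it has no userspace equivalent.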
Re: 2.6.17-mm6
this is one for the networking people, and thus netdev

On Tue, 2006-07-04 at 21:53 +0200, Rafael J. Wysocki wrote:
> On Monday 03 July 2006 12:03, Andrew Morton wrote:
>> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17/2.6.17-mm6/
>>
>> - A major update to the e1000 driver.
>>
>> - 1394 updates
>
> Just found this in dmesg:
>
> =
> [ INFO: inconsistent lock state ]
> -
> inconsistent {in-hardirq-W} - {hardirq-on-W} usage.
> nscd/4929 [HC0[0]:SC0[1]:HE1:SE0] takes:
>  (skb_queue_lock_key){++..}, at: [8044fe40] udp_ioctl+0x50/0xa0
> {in-hardirq-W} state was registered at:
>   [8024b4fa] lock_acquire+0x8a/0xc0
>   [80476e3f] _spin_lock_irqsave+0x3f/0x60
>   [80408c25] skb_queue_tail+0x25/0x60

ok so skb_queue_lock is used in a hardirq context

>   [881c9517] queue_packet_complete+0x27/0x40 [ieee1394]
>   [881c9d6b] hpsb_packet_sent+0xab/0x100 [ieee1394]
>   [8822a4b5] dma_trm_reset+0x115/0x140 [ohci1394]
>   [8822c512] ohci_devctl+0x1c2/0x540 [ohci1394]
>   [881c9673] hpsb_bus_reset+0x43/0xb0 [ieee1394]
>   [8822d7f6] ohci_irq_handler+0x416/0x830 [ohci1394]
>   [802631ab] handle_IRQ_event+0x2b/0x70
>   [80264dd4] handle_level_irq+0xc4/0x130
>   [8020c762] do_IRQ+0x112/0x130
>   [80209d90] common_interrupt+0x64/0x65
> irq event stamp: 4280
> hardirqs last enabled at (4279): [8047606a] trace_hardirqs_on_thunk+0x35/0x37
> hardirqs last disabled at (4278): [804760a1] trace_hardirqs_off_thunk+0x35/0x67
> softirqs last enabled at (4258): [804065b5] release_sock+0xd5/0xe0
> softirqs last disabled at (4280): [804764d1] _spin_lock_bh+0x11/0x50
>
> other info that might help us debug this:
> no locks held by nscd/4929.
>
> stack backtrace:
>
> Call Trace:
>  [8020ab9f] show_trace+0x9f/0x240
>  [8020af75] dump_stack+0x15/0x20
>  [80249e52] print_usage_bug+0x272/0x290
>  [8024a0d7] mark_lock+0x267/0x5f0
>  [8024a9a6] __lock_acquire+0x546/0xd10
>  [8024b4fb] lock_acquire+0x8b/0xc0
>  [804764f4] _spin_lock_bh+0x34/0x50
>  [8044fe40] udp_ioctl+0x50/0xa0

yet udp_ioctl takes it only for _bh

>  [80457359] inet_ioctl+0x69/0x70
>  [804033ac] sock_ioctl+0x22c/0x270
>  [802a32b1] do_ioctl+0x31/0xa0
>  [802a35db] vfs_ioctl+0x2bb/0x2e0
>  [802a366a] sys_ioctl+0x6a/0xa0
>  [8020985a] system_call+0x7e/0x83
>  [2b2d76ab98a9]

is this a real scenario, or is this a case of firewire is special and
needs its own rules?