Bug#600281: linux-image-2.6.26-2-sparc64: networks halt after a few weeks on TULIP DAVICOM DM9102, NETDEV WATCHDOG: eth0: transmit timed out
Control: tags -1 + moreinfo

Hi,

On Tue, Aug 20, 2013 at 06:08:41PM +0200, Moritz Muehlenhoff wrote:
> reassign 600281 src:linux
> thanks
>
> On Thu, Oct 21, 2010 at 03:27:37AM +0200, Rémi Bouhl wrote:
> > 2010/10/18, dann frazier:
> > >
> > > Does /var/log/kern.log contain anything interesting before these
> > > messages? (oops message, tulip driver messages, etc)?
> >
> > Ah, yes. Here it is:
> >
> > Oct 13 06:25:02 titine kernel: imklog 3.18.6, log source = /proc/kmsg started.
> > Oct 13 18:38:17 titine kernel: [7015022.568660] NETDEV WATCHDOG: eth0: transmit timed out
> > Oct 13 18:38:17 titine kernel: [7015022.637393] [ cut here ]
> > Oct 13 18:38:17 titine kernel: [7015022.700285] WARNING: at net/sched/sch_generic.c:222 dev_watchdog+0xb4/0x118()
> > Oct 13 18:38:17 titine kernel: [7015022.796421] Modules linked in: nfsd auth_rpcgss exportfs nfs lockd nfs_acl sunrpc tun xt_multiport xt_tcpudp xt_state iptable_filter ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack ip_tables x_tables ipv6 ext3 jbd raid1 raid0 md_mod ide_cd_mod cdrom ide_disk alim15x3 ide_pci_generic tulip ata_generic libata scsi_mod ohci_hcd dmfe [last unloaded: scsi_wait_scan]
> > Oct 13 18:38:17 titine kernel: [7015023.238823] Call Trace:
> > Oct 13 18:38:17 titine kernel: [7015023.273110] [006235fc] dev_watchdog+0xbc/0x118
> > Oct 13 18:38:17 titine kernel: [7015023.343979] [0045f924] run_timer_softirq+0x178/0x1e8
> > Oct 13 18:38:17 titine kernel: [7015023.421699] [0045bc48] __do_softirq+0x48/0xb8
> > Oct 13 18:38:17 titine kernel: [7015023.491422] [0042d2b4] do_softirq+0x58/0x7c
> > Oct 13 18:38:17 titine kernel: [7015023.558858] [0045b8a4] irq_exit+0x40/0x8c
> > Oct 13 18:38:17 titine kernel: [7015023.624004] [004319a4] timer_interrupt+0x74/0x84
> > Oct 13 18:38:17 titine kernel: [7015023.697155] [004209d4] tl0_irq14+0x1c/0x20
> > Oct 13 18:38:17 titine kernel: [7015023.763447] [00427674] cpu_idle+0x9c/0xc4
> > Oct 13 18:38:17 titine kernel: [7015023.828599] [0078296c] start_kernel+0x318/0x324
> > Oct 13 18:38:17 titine kernel: [7015023.900605] [0067d260] auxio_probe+0x0/0xd0
> > Oct 13 18:38:17 titine kernel: [7015023.968039] [] 0x8
> > Oct 13 18:38:17 titine kernel: [7015024.016044] ---[ end trace 8945f4f399a29df3 ]---
> > Oct 13 18:38:51 titine kernel: [7015058.568656] NETDEV WATCHDOG: eth0: transmit timed out
> >
> > Then it repeats the "NETDEV WATCHDOG" message.
> >
> > >> Saw a similar bug, number 522592. It's closed now, with no solution or
> > >> fix.
> > >
> > > As suggested in that bug report, can you try the 2.6.32 kernel from
> > > squeeze and see if it has better results?
> > >
> > I can do this, but would it not be useful to help find the bug
> > on the stable version?
> > This server is not critical, I don't mind if 2.6.26 or 2.6.32 is
> > running on it. Just tell me which one is better to use for debugging
> > purposes, and I'll use it.
>
> Does this work with current kernels, e.g. Wheezy?

I'm going to assume we can close this bug, as there was no followup and it is possibly no longer easy to reproduce. If the problem still occurs, please feel free to reopen the bug.

Regards,
Salvatore
Bug#600281: linux-image-2.6.26-2-sparc64: networks halt after a few weeks on TULIP DAVICOM DM9102, NETDEV WATCHDOG: eth0: transmit timed out
reassign 600281 src:linux
thanks

On Thu, Oct 21, 2010 at 03:27:37AM +0200, Rémi Bouhl wrote:
> 2010/10/18, dann frazier da...@debian.org:
> > Does /var/log/kern.log contain anything interesting before these
> > messages? (oops message, tulip driver messages, etc)?
>
> Ah, yes. Here it is:
>
> Oct 13 06:25:02 titine kernel: imklog 3.18.6, log source = /proc/kmsg started.
> Oct 13 18:38:17 titine kernel: [7015022.568660] NETDEV WATCHDOG: eth0: transmit timed out
> Oct 13 18:38:17 titine kernel: [7015022.637393] [ cut here ]
> Oct 13 18:38:17 titine kernel: [7015022.700285] WARNING: at net/sched/sch_generic.c:222 dev_watchdog+0xb4/0x118()
> Oct 13 18:38:17 titine kernel: [7015022.796421] Modules linked in: nfsd auth_rpcgss exportfs nfs lockd nfs_acl sunrpc tun xt_multiport xt_tcpudp xt_state iptable_filter ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack ip_tables x_tables ipv6 ext3 jbd raid1 raid0 md_mod ide_cd_mod cdrom ide_disk alim15x3 ide_pci_generic tulip ata_generic libata scsi_mod ohci_hcd dmfe [last unloaded: scsi_wait_scan]
> Oct 13 18:38:17 titine kernel: [7015023.238823] Call Trace:
> Oct 13 18:38:17 titine kernel: [7015023.273110] [006235fc] dev_watchdog+0xbc/0x118
> Oct 13 18:38:17 titine kernel: [7015023.343979] [0045f924] run_timer_softirq+0x178/0x1e8
> Oct 13 18:38:17 titine kernel: [7015023.421699] [0045bc48] __do_softirq+0x48/0xb8
> Oct 13 18:38:17 titine kernel: [7015023.491422] [0042d2b4] do_softirq+0x58/0x7c
> Oct 13 18:38:17 titine kernel: [7015023.558858] [0045b8a4] irq_exit+0x40/0x8c
> Oct 13 18:38:17 titine kernel: [7015023.624004] [004319a4] timer_interrupt+0x74/0x84
> Oct 13 18:38:17 titine kernel: [7015023.697155] [004209d4] tl0_irq14+0x1c/0x20
> Oct 13 18:38:17 titine kernel: [7015023.763447] [00427674] cpu_idle+0x9c/0xc4
> Oct 13 18:38:17 titine kernel: [7015023.828599] [0078296c] start_kernel+0x318/0x324
> Oct 13 18:38:17 titine kernel: [7015023.900605] [0067d260] auxio_probe+0x0/0xd0
> Oct 13 18:38:17 titine kernel: [7015023.968039] [] 0x8
> Oct 13 18:38:17 titine kernel: [7015024.016044] ---[ end trace 8945f4f399a29df3 ]---
> Oct 13 18:38:51 titine kernel: [7015058.568656] NETDEV WATCHDOG: eth0: transmit timed out
>
> Then it repeats the "NETDEV WATCHDOG" message.
>
> >> Saw a similar bug, number 522592. It's closed now, with no solution or
> >> fix.
> >
> > As suggested in that bug report, can you try the 2.6.32 kernel from
> > squeeze and see if it has better results?
>
> I can do this, but would it not be useful to help find the bug
> on the stable version?
> This server is not critical, I don't mind if 2.6.26 or 2.6.32 is
> running on it. Just tell me which one is better to use for debugging
> purposes, and I'll use it.

Does this work with current kernels, e.g. Wheezy?

Cheers,
Moritz
Bug#600281: linux-image-2.6.26-2-sparc64: networks halt after a few weeks on TULIP DAVICOM DM9102, NETDEV WATCHDOG: eth0: transmit timed out
2010/10/18, dann frazier da...@debian.org:
> Does /var/log/kern.log contain anything interesting before these
> messages? (oops message, tulip driver messages, etc)?

Ah, yes. Here it is:

Oct 13 06:25:02 titine kernel: imklog 3.18.6, log source = /proc/kmsg started.
Oct 13 18:38:17 titine kernel: [7015022.568660] NETDEV WATCHDOG: eth0: transmit timed out
Oct 13 18:38:17 titine kernel: [7015022.637393] [ cut here ]
Oct 13 18:38:17 titine kernel: [7015022.700285] WARNING: at net/sched/sch_generic.c:222 dev_watchdog+0xb4/0x118()
Oct 13 18:38:17 titine kernel: [7015022.796421] Modules linked in: nfsd auth_rpcgss exportfs nfs lockd nfs_acl sunrpc tun xt_multiport xt_tcpudp xt_state iptable_filter ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack ip_tables x_tables ipv6 ext3 jbd raid1 raid0 md_mod ide_cd_mod cdrom ide_disk alim15x3 ide_pci_generic tulip ata_generic libata scsi_mod ohci_hcd dmfe [last unloaded: scsi_wait_scan]
Oct 13 18:38:17 titine kernel: [7015023.238823] Call Trace:
Oct 13 18:38:17 titine kernel: [7015023.273110] [006235fc] dev_watchdog+0xbc/0x118
Oct 13 18:38:17 titine kernel: [7015023.343979] [0045f924] run_timer_softirq+0x178/0x1e8
Oct 13 18:38:17 titine kernel: [7015023.421699] [0045bc48] __do_softirq+0x48/0xb8
Oct 13 18:38:17 titine kernel: [7015023.491422] [0042d2b4] do_softirq+0x58/0x7c
Oct 13 18:38:17 titine kernel: [7015023.558858] [0045b8a4] irq_exit+0x40/0x8c
Oct 13 18:38:17 titine kernel: [7015023.624004] [004319a4] timer_interrupt+0x74/0x84
Oct 13 18:38:17 titine kernel: [7015023.697155] [004209d4] tl0_irq14+0x1c/0x20
Oct 13 18:38:17 titine kernel: [7015023.763447] [00427674] cpu_idle+0x9c/0xc4
Oct 13 18:38:17 titine kernel: [7015023.828599] [0078296c] start_kernel+0x318/0x324
Oct 13 18:38:17 titine kernel: [7015023.900605] [0067d260] auxio_probe+0x0/0xd0
Oct 13 18:38:17 titine kernel: [7015023.968039] [] 0x8
Oct 13 18:38:17 titine kernel: [7015024.016044] ---[ end trace 8945f4f399a29df3 ]---
Oct 13 18:38:51 titine kernel: [7015058.568656] NETDEV WATCHDOG: eth0: transmit timed out

Then it repeats the "NETDEV WATCHDOG" message.

>> Saw a similar bug, number 522592. It's closed now, with no solution or
>> fix.
>
> As suggested in that bug report, can you try the 2.6.32 kernel from
> squeeze and see if it has better results?

I can do this, but would it not be useful to help find the bug on the stable version?
This server is not critical, I don't mind if 2.6.26 or 2.6.32 is running on it. Just tell me which one is better to use for debugging purposes, and I'll use it.

Remi.
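For scale, the bracketed number in each kernel line of the trace can be read as uptime. The sketch below assumes that value is seconds since boot (the usual meaning of printk timestamps); the helper name `uptime_from_line` is hypothetical and only illustrates the arithmetic on the first timeout line quoted in this thread.

```python
import re
from datetime import timedelta

# Matches the bracketed printk timestamp that follows "kernel:" in a syslog line.
TIMESTAMP = re.compile(r"kernel: \[\s*(\d+\.\d+)\]")


def uptime_from_line(line: str) -> timedelta:
    """Return the kernel timestamp of a syslog line as a timedelta.

    Assumption: the bracketed value counts seconds since boot.
    """
    match = TIMESTAMP.search(line)
    if match is None:
        raise ValueError("no kernel timestamp found in line")
    return timedelta(seconds=float(match.group(1)))


# First watchdog line from the log quoted in this thread.
first_timeout = (
    "Oct 13 18:38:17 titine kernel: [7015022.568660] "
    "NETDEV WATCHDOG: eth0: transmit timed out"
)

uptime = uptime_from_line(first_timeout)
print(f"{uptime.days} days, {uptime.seconds // 3600} hours after boot")
# prints "81 days, 4 hours after boot"
```

Under that assumption, the first logged timeout came roughly 81 days after boot.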
Bug#600281: linux-image-2.6.26-2-sparc64: networks halt after a few weeks on TULIP DAVICOM DM9102, NETDEV WATCHDOG: eth0: transmit timed out
On Fri, Oct 15, 2010 at 03:48:27PM +0200, Remi Bouhl wrote:
> Package: linux-image-2.6.26-2-sparc64
> Version: 2.6.26-22lenny1
> Severity: important
>
> I am using a Sun Fire V100 with Debian Lenny. It was OK for a few weeks,
> then suddenly it was not possible to access it from SSH: connection timed
> out. The only thing I could get over the network was the Apache default
> index page. No way to download a file. It all looked as if the network
> was over logging.
> I couldn't get physical access, so asked someone to reboot it.
> After that I had a look at syslog; there are many lines like this:
>
> [...]
> Oct 14 15:02:11 titine kernel: [7088458.574525] NETDEV WATCHDOG: eth0: transmit timed out
> Oct 14 15:02:27 titine kernel: [7088474.574528] NETDEV WATCHDOG: eth0: transmit timed out
> Oct 14 15:02:47 titine kernel: [7088494.574530] NETDEV WATCHDOG: eth0: transmit timed out
> Oct 14 15:03:03 titine kernel: [7088510.574533] NETDEV WATCHDOG: eth0: transmit timed out
> Oct 14 15:03:19 titine kernel: [7088526.574532] NETDEV WATCHDOG: eth0: transmit timed out
> Oct 14 15:03:39 titine kernel: [7088546.574534] NETDEV WATCHDOG: eth0: transmit timed out
> Oct 14 15:03:55 titine kernel: [7088562.574532] NETDEV WATCHDOG: eth0: transmit timed out
> Oct 14 15:04:11 titine kernel: [7088578.574537] NETDEV WATCHDOG: eth0: transmit timed out
> Oct 14 15:04:23 titine kernel: [7088590.574540] NETDEV WATCHDOG: eth0: transmit timed out
> Oct 14 15:04:47 titine kernel: [7088614.574541] NETDEV WATCHDOG: eth0: transmit timed out
> Oct 14 15:05:27 titine kernel: [7088654.574546] NETDEV WATCHDOG: eth0: transmit timed out
> Oct 14 15:06:03 titine kernel: [7088690.574542] NETDEV WATCHDOG: eth0: transmit timed out
> [...]

Does /var/log/kern.log contain anything interesting before these messages? (oops message, tulip driver messages, etc)?

> Saw a similar bug, number 522592. It's closed now, with no solution or fix.

As suggested in that bug report, can you try the 2.6.32 kernel from squeeze and see if it has better results?
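The spacing of the watchdog messages quoted above can be checked with a short script. This is only a sketch: the function name `timeout_gaps` is hypothetical, the sample is the excerpt from the report (which is itself elided with "[...]" on both sides), and the regular expression assumes the syslog line format shown in this thread.

```python
import re
from typing import List

# Captures the whole-second part of the kernel timestamp and the interface name.
WATCHDOG = re.compile(
    r"kernel: \[\s*(\d+)\.\d+\] NETDEV WATCHDOG: (\w+): transmit timed out"
)


def timeout_gaps(log_text: str, interface: str = "eth0") -> List[int]:
    """Return whole-second gaps between consecutive watchdog timeouts."""
    stamps = [
        int(m.group(1))
        for m in WATCHDOG.finditer(log_text)
        if m.group(2) == interface
    ]
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]


# Excerpt copied from the bug report; the real log continues before and after.
sample = """\
Oct 14 15:02:11 titine kernel: [7088458.574525] NETDEV WATCHDOG: eth0: transmit timed out
Oct 14 15:02:27 titine kernel: [7088474.574528] NETDEV WATCHDOG: eth0: transmit timed out
Oct 14 15:02:47 titine kernel: [7088494.574530] NETDEV WATCHDOG: eth0: transmit timed out
Oct 14 15:03:03 titine kernel: [7088510.574533] NETDEV WATCHDOG: eth0: transmit timed out
Oct 14 15:03:19 titine kernel: [7088526.574532] NETDEV WATCHDOG: eth0: transmit timed out
Oct 14 15:03:39 titine kernel: [7088546.574534] NETDEV WATCHDOG: eth0: transmit timed out
Oct 14 15:03:55 titine kernel: [7088562.574532] NETDEV WATCHDOG: eth0: transmit timed out
Oct 14 15:04:11 titine kernel: [7088578.574537] NETDEV WATCHDOG: eth0: transmit timed out
Oct 14 15:04:23 titine kernel: [7088590.574540] NETDEV WATCHDOG: eth0: transmit timed out
Oct 14 15:04:47 titine kernel: [7088614.574541] NETDEV WATCHDOG: eth0: transmit timed out
Oct 14 15:05:27 titine kernel: [7088654.574546] NETDEV WATCHDOG: eth0: transmit timed out
Oct 14 15:06:03 titine kernel: [7088690.574542] NETDEV WATCHDOG: eth0: transmit timed out
"""

gaps = timeout_gaps(sample)
print(gaps)
# prints [16, 20, 16, 16, 20, 16, 16, 12, 24, 40, 36]
```

In this excerpt the timeouts recur every 12 to 40 seconds, which fits the report that the interface stayed unusable until the reboot rather than recovering after a single reset.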
Bug#600281: linux-image-2.6.26-2-sparc64: networks halt after a few weeks on TULIP DAVICOM DM9102, NETDEV WATCHDOG: eth0: transmit timed out
Package: linux-image-2.6.26-2-sparc64
Version: 2.6.26-22lenny1
Severity: important

I am using a Sun Fire V100 with Debian Lenny. It was OK for a few weeks, then suddenly it was not possible to access it from SSH: connection timed out. The only thing I could get over the network was the Apache default index page. No way to download a file. It all looked as if the network was over logging.
I couldn't get physical access, so asked someone to reboot it.
After that I had a look at syslog; there are many lines like this:

[...]
Oct 14 15:02:11 titine kernel: [7088458.574525] NETDEV WATCHDOG: eth0: transmit timed out
Oct 14 15:02:27 titine kernel: [7088474.574528] NETDEV WATCHDOG: eth0: transmit timed out
Oct 14 15:02:47 titine kernel: [7088494.574530] NETDEV WATCHDOG: eth0: transmit timed out
Oct 14 15:03:03 titine kernel: [7088510.574533] NETDEV WATCHDOG: eth0: transmit timed out
Oct 14 15:03:19 titine kernel: [7088526.574532] NETDEV WATCHDOG: eth0: transmit timed out
Oct 14 15:03:39 titine kernel: [7088546.574534] NETDEV WATCHDOG: eth0: transmit timed out
Oct 14 15:03:55 titine kernel: [7088562.574532] NETDEV WATCHDOG: eth0: transmit timed out
Oct 14 15:04:11 titine kernel: [7088578.574537] NETDEV WATCHDOG: eth0: transmit timed out
Oct 14 15:04:23 titine kernel: [7088590.574540] NETDEV WATCHDOG: eth0: transmit timed out
Oct 14 15:04:47 titine kernel: [7088614.574541] NETDEV WATCHDOG: eth0: transmit timed out
Oct 14 15:05:27 titine kernel: [7088654.574546] NETDEV WATCHDOG: eth0: transmit timed out
Oct 14 15:06:03 titine kernel: [7088690.574542] NETDEV WATCHDOG: eth0: transmit timed out
[...]

Saw a similar bug, number 522592. It's closed now, with no solution or fix.

Network settings:

mii-tool eth0 -vv
Using SIOCGMIIPHY=0x8947
eth0: negotiated 100baseTx-FD, link ok
  registers for MII PHY 1: 1000 782d 0181 b840 01e1 41e1 0001 8018 7800 1000 0001
  product info: vendor 00:60:6e, model 4 rev 0
  basic mode: autonegotiation enabled
  basic status: autonegotiation complete, link ok
  capabilities: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
  advertising: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
  link partner: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD

Module info:

modinfo tulip
filename:       /lib/modules/2.6.26-2-sparc64/kernel/drivers/net/tulip/tulip.ko
version:        1.1.15-NAPI
license:        GPL
description:    Digital 21*4* Tulip ethernet driver
author:         The Linux Kernel Team
srcversion:     A033E0667282FCC8CABF3C6
alias:          pci:v1414d0002sv*sd*bc*sc*i*
alias:          pci:v14EAdAB08sv*sd*bc*sc*i*
alias:          pci:v10B7d9300sv*sd*bc*sc*i*
alias:          pci:v17B3dAB08sv*sd*bc*sc*i*
alias:          pci:v1737dAB08sv*sd*bc*sc*i*
alias:          pci:v1737dAB09sv*sd*bc*sc*i*
alias:          pci:v1626d8410sv*sd*bc*sc*i*
alias:          pci:v14F1d1803sv*sd*bc*sc*i*
alias:          pci:v1186d1591sv*sd*bc*sc*i*
alias:          pci:v1186d1561sv*sd*bc*sc*i*
alias:          pci:v1186d1541sv*sd*bc*sc*i*
alias:          pci:v1113d9511sv*sd*bc*sc*i*
alias:          pci:v1113d1217sv*sd*bc*sc*i*
alias:          pci:v1113d1216sv*sd*bc*sc*i*
alias:          pci:v1282d9102sv*sd*bc*sc*i*
alias:          pci:v1282d9100sv*sd*bc*sc*i*
alias:          pci:v8086d0039sv*sd*bc*sc*i*
alias:          pci:v11F6d9881sv*sd*bc*sc*i*
alias:          pci:v1259dA120sv*sd*bc*sc*i*
alias:          pci:v104Ad2774sv*sd*bc*sc*i*
alias:          pci:v104Ad0981sv*sd*bc*sc*i*
alias:          pci:v13D1dAB08sv*sd*bc*sc*i*
alias:          pci:v13D1dAB03sv*sd*bc*sc*i*
alias:          pci:v13D1dAB02sv*sd*bc*sc*i*
alias:          pci:v1317d9511sv*sd*bc*sc*i*
alias:          pci:v1317d1985sv*sd*bc*sc*i*
alias:          pci:v1317d0985sv*sd*bc*sc*i*
alias:          pci:v1317d0981sv*sd*bc*sc*i*
alias:          pci:v11ADdC115sv*sd*bc*sc*i*
alias:          pci:v125Bd1400sv*sd*bc*sc*i*
alias:          pci:v10D9d0531sv*sd*bc*sc*i*
alias:          pci:v10D9d0512sv*sd*bc*sc*i*
alias:          pci:v11ADd0002sv*sd*bc*sc*i*
alias:          pci:v1011d0019sv*sd*bc*sc*i*
alias:          pci:v1011d0009sv*sd*bc*sc*i*
depends:
vermagic:       2.6.26-2-sparc64 mod_unload modversions
parm:           tulip_debug:int
parm:           max_interrupt_work:int
parm:           rx_copybreak:int
parm:           csr0:int
parm:           options:array of int
parm:           full_duplex:array of int

Are there any tools or options I could enable to get more information the next time this bug appears (if it does)?

Remi.

-- System Information:
Debian Release: 5.0.4
  APT prefers stable
  APT policy: (500,