Bug#600281: linux-image-2.6.26-2-sparc64: networks halt after a few weeks on TULIP DAVICOM DM9102, NETDEV WATCHDOG: eth0: transmit timed out

2021-04-23 Thread Salvatore Bonaccorso
Control: tags -1 + moreinfo

Hi,

On Tue, Aug 20, 2013 at 06:08:41PM +0200, Moritz Muehlenhoff wrote:
> reassign 600281 src:linux
> thanks
> 
> On Thu, Oct 21, 2010 at 03:27:37AM +0200, Rémi Bouhl wrote:
> > 2010/10/18, dann frazier :
> > >
> > > Does /var/log/kern.log contain anything interesting before these
> > > messages? (oops message, tulip driver messages, etc)?
> > 
> > Ah, yes. Here it is:
> > 
> > Oct 13 06:25:02 titine kernel: imklog 3.18.6, log source = /proc/kmsg 
> > started.
> > Oct 13 18:38:17 titine kernel: [7015022.568660] NETDEV WATCHDOG: eth0:
> > transmit timed out
> > > Oct 13 18:38:17 titine kernel: [7015022.637393] ------------[ cut here ]------------
> > Oct 13 18:38:17 titine kernel: [7015022.700285] WARNING: at
> > net/sched/sch_generic.c:222 dev_watchdog+0xb4/0x118()
> > Oct 13 18:38:17 titine kernel: [7015022.796421] Modules linked in:
> > nfsd auth_rpcgss exportfs nfs lockd nfs_acl sunrpc tun xt_multiport
> > xt_tcpudp xt_state iptable_filter ipt_MASQUERADE iptable_nat nf_nat
> > nf_conntrack_ipv4 nf_conntrack ip_tables x_tables ipv6 ext3 jbd raid1
> > raid0 md_mod ide_cd_mod cdrom ide_disk alim15x3 ide_pci_generic tulip
> > ata_generic libata scsi_mod ohci_hcd dmfe [last unloaded:
> > scsi_wait_scan]
> > Oct 13 18:38:17 titine kernel: [7015023.238823] Call Trace:
> > Oct 13 18:38:17 titine kernel: [7015023.273110]  [006235fc]
> > dev_watchdog+0xbc/0x118
> > Oct 13 18:38:17 titine kernel: [7015023.343979]  [0045f924]
> > run_timer_softirq+0x178/0x1e8
> > Oct 13 18:38:17 titine kernel: [7015023.421699]  [0045bc48]
> > __do_softirq+0x48/0xb8
> > Oct 13 18:38:17 titine kernel: [7015023.491422]  [0042d2b4]
> > do_softirq+0x58/0x7c
> > Oct 13 18:38:17 titine kernel: [7015023.558858]  [0045b8a4]
> > irq_exit+0x40/0x8c
> > Oct 13 18:38:17 titine kernel: [7015023.624004]  [004319a4]
> > timer_interrupt+0x74/0x84
> > Oct 13 18:38:17 titine kernel: [7015023.697155]  [004209d4]
> > tl0_irq14+0x1c/0x20
> > Oct 13 18:38:17 titine kernel: [7015023.763447]  [00427674]
> > cpu_idle+0x9c/0xc4
> > Oct 13 18:38:17 titine kernel: [7015023.828599]  [0078296c]
> > start_kernel+0x318/0x324
> > Oct 13 18:38:17 titine kernel: [7015023.900605]  [0067d260]
> > auxio_probe+0x0/0xd0
> > Oct 13 18:38:17 titine kernel: [7015023.968039]  [] 0x8
> > Oct 13 18:38:17 titine kernel: [7015024.016044] ---[ end trace
> > 8945f4f399a29df3 ]---
> > Oct 13 18:38:51 titine kernel: [7015058.568656] NETDEV WATCHDOG: eth0:
> > transmit timed out
> > 
> > > Then it repeats the "NETDEV WATCHDOG" message.
> > 
> > >
> > >> Saw a similar bug, number 522592. It's closed now, with no solution or
> > >> fix.
> > >
> > > As suggested in that bug report, can you try the 2.6.32 kernel from
> > > squeeze and see if it has better results?
> > >
> > > I can do this, but would it not be useful to help find the bug
> > > in the stable version?
> > > This server is not critical, I don't mind if 2.6.26 or 2.6.32 is
> > > running on it. Just tell me which one is better to use for debugging
> > > purposes, and I'll use it.
> 
> Does this work with current kernels, e.g. Wheezy?

I'm going to assume we can close this bug, as there was no follow-up
and it is possibly no longer easy to reproduce.

If the problem still occurs, please feel free to reopen the bug.

Regards,
Salvatore



Bug#600281: linux-image-2.6.26-2-sparc64: networks halt after a few weeks on TULIP DAVICOM DM9102, NETDEV WATCHDOG: eth0: transmit timed out

2013-08-20 Thread Moritz Muehlenhoff
reassign 600281 src:linux
thanks

On Thu, Oct 21, 2010 at 03:27:37AM +0200, Rémi Bouhl wrote:
 2010/10/18, dann frazier da...@debian.org:
 
  Does /var/log/kern.log contain anything interesting before these
  messages? (oops message, tulip driver messages, etc)?
 
 Ah, yes. Here it is:
 
 Oct 13 06:25:02 titine kernel: imklog 3.18.6, log source = /proc/kmsg started.
 Oct 13 18:38:17 titine kernel: [7015022.568660] NETDEV WATCHDOG: eth0:
 transmit timed out
 Oct 13 18:38:17 titine kernel: [7015022.637393] ------------[ cut here ]------------
 Oct 13 18:38:17 titine kernel: [7015022.700285] WARNING: at
 net/sched/sch_generic.c:222 dev_watchdog+0xb4/0x118()
 Oct 13 18:38:17 titine kernel: [7015022.796421] Modules linked in:
 nfsd auth_rpcgss exportfs nfs lockd nfs_acl sunrpc tun xt_multiport
 xt_tcpudp xt_state iptable_filter ipt_MASQUERADE iptable_nat nf_nat
 nf_conntrack_ipv4 nf_conntrack ip_tables x_tables ipv6 ext3 jbd raid1
 raid0 md_mod ide_cd_mod cdrom ide_disk alim15x3 ide_pci_generic tulip
 ata_generic libata scsi_mod ohci_hcd dmfe [last unloaded:
 scsi_wait_scan]
 Oct 13 18:38:17 titine kernel: [7015023.238823] Call Trace:
 Oct 13 18:38:17 titine kernel: [7015023.273110]  [006235fc]
 dev_watchdog+0xbc/0x118
 Oct 13 18:38:17 titine kernel: [7015023.343979]  [0045f924]
 run_timer_softirq+0x178/0x1e8
 Oct 13 18:38:17 titine kernel: [7015023.421699]  [0045bc48]
 __do_softirq+0x48/0xb8
 Oct 13 18:38:17 titine kernel: [7015023.491422]  [0042d2b4]
 do_softirq+0x58/0x7c
 Oct 13 18:38:17 titine kernel: [7015023.558858]  [0045b8a4]
 irq_exit+0x40/0x8c
 Oct 13 18:38:17 titine kernel: [7015023.624004]  [004319a4]
 timer_interrupt+0x74/0x84
 Oct 13 18:38:17 titine kernel: [7015023.697155]  [004209d4]
 tl0_irq14+0x1c/0x20
 Oct 13 18:38:17 titine kernel: [7015023.763447]  [00427674]
 cpu_idle+0x9c/0xc4
 Oct 13 18:38:17 titine kernel: [7015023.828599]  [0078296c]
 start_kernel+0x318/0x324
 Oct 13 18:38:17 titine kernel: [7015023.900605]  [0067d260]
 auxio_probe+0x0/0xd0
 Oct 13 18:38:17 titine kernel: [7015023.968039]  [] 0x8
 Oct 13 18:38:17 titine kernel: [7015024.016044] ---[ end trace
 8945f4f399a29df3 ]---
 Oct 13 18:38:51 titine kernel: [7015058.568656] NETDEV WATCHDOG: eth0:
 transmit timed out
 
 Then it repeats the NETDEV WATCHDOG message.
 
 
  Saw a similar bug, number 522592. It's closed now, with no solution or
  fix.
 
  As suggested in that bug report, can you try the 2.6.32 kernel from
  squeeze and see if it has better results?
 
 I can do this, but would it not be useful to help find the bug
 in the stable version?
 This server is not critical, I don't mind if 2.6.26 or 2.6.32 is
 running on it. Just tell me which one is better to use for debugging
 purposes, and I'll use it.

Does this work with current kernels, e.g. Wheezy?

Cheers,
Moritz


-- 
To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org



Bug#600281: linux-image-2.6.26-2-sparc64: networks halt after a few weeks on TULIP DAVICOM DM9102, NETDEV WATCHDOG: eth0: transmit timed out

2010-10-20 Thread Rémi Bouhl
2010/10/18, dann frazier da...@debian.org:

 Does /var/log/kern.log contain anything interesting before these
 messages? (oops message, tulip driver messages, etc)?

Ah, yes. Here it is:

Oct 13 06:25:02 titine kernel: imklog 3.18.6, log source = /proc/kmsg started.
Oct 13 18:38:17 titine kernel: [7015022.568660] NETDEV WATCHDOG: eth0:
transmit timed out
Oct 13 18:38:17 titine kernel: [7015022.637393] ------------[ cut here ]------------
Oct 13 18:38:17 titine kernel: [7015022.700285] WARNING: at
net/sched/sch_generic.c:222 dev_watchdog+0xb4/0x118()
Oct 13 18:38:17 titine kernel: [7015022.796421] Modules linked in:
nfsd auth_rpcgss exportfs nfs lockd nfs_acl sunrpc tun xt_multiport
xt_tcpudp xt_state iptable_filter ipt_MASQUERADE iptable_nat nf_nat
nf_conntrack_ipv4 nf_conntrack ip_tables x_tables ipv6 ext3 jbd raid1
raid0 md_mod ide_cd_mod cdrom ide_disk alim15x3 ide_pci_generic tulip
ata_generic libata scsi_mod ohci_hcd dmfe [last unloaded:
scsi_wait_scan]
Oct 13 18:38:17 titine kernel: [7015023.238823] Call Trace:
Oct 13 18:38:17 titine kernel: [7015023.273110]  [006235fc]
dev_watchdog+0xbc/0x118
Oct 13 18:38:17 titine kernel: [7015023.343979]  [0045f924]
run_timer_softirq+0x178/0x1e8
Oct 13 18:38:17 titine kernel: [7015023.421699]  [0045bc48]
__do_softirq+0x48/0xb8
Oct 13 18:38:17 titine kernel: [7015023.491422]  [0042d2b4]
do_softirq+0x58/0x7c
Oct 13 18:38:17 titine kernel: [7015023.558858]  [0045b8a4]
irq_exit+0x40/0x8c
Oct 13 18:38:17 titine kernel: [7015023.624004]  [004319a4]
timer_interrupt+0x74/0x84
Oct 13 18:38:17 titine kernel: [7015023.697155]  [004209d4]
tl0_irq14+0x1c/0x20
Oct 13 18:38:17 titine kernel: [7015023.763447]  [00427674]
cpu_idle+0x9c/0xc4
Oct 13 18:38:17 titine kernel: [7015023.828599]  [0078296c]
start_kernel+0x318/0x324
Oct 13 18:38:17 titine kernel: [7015023.900605]  [0067d260]
auxio_probe+0x0/0xd0
Oct 13 18:38:17 titine kernel: [7015023.968039]  [] 0x8
Oct 13 18:38:17 titine kernel: [7015024.016044] ---[ end trace
8945f4f399a29df3 ]---
Oct 13 18:38:51 titine kernel: [7015058.568656] NETDEV WATCHDOG: eth0:
transmit timed out

Then it repeats the NETDEV WATCHDOG message.


 Saw a similar bug, number 522592. It's closed now, with no solution or
 fix.

 As suggested in that bug report, can you try the 2.6.32 kernel from
 squeeze and see if it has better results?

I can do this, but would it not be useful to help find the bug
in the stable version?
This server is not critical, I don't mind if 2.6.26 or 2.6.32 is
running on it. Just tell me which one is better to use for debugging
purposes, and I'll use it.

Remi.






Bug#600281: linux-image-2.6.26-2-sparc64: networks halt after a few weeks on TULIP DAVICOM DM9102, NETDEV WATCHDOG: eth0: transmit timed out

2010-10-18 Thread dann frazier
On Fri, Oct 15, 2010 at 03:48:27PM +0200, Remi Bouhl wrote:
 Package: linux-image-2.6.26-2-sparc64
 Version: 2.6.26-22lenny1
 Severity: important
 
 
 I am using a Sun Fire V100 with Debian Lenny. It was OK for a few weeks, then
 suddenly it was no longer possible to access it over SSH: connection timed
 out. The only thing I could get over the network was the Apache default
 index page. No way to download a file. It all looked as if the
 network was over logging.
 
 I couldn't get physical access, so I asked someone to reboot it. After that I
 had a look at syslog; there are many lines like this:
 [...]
 Oct 14 15:02:11 titine kernel: [7088458.574525] NETDEV WATCHDOG: eth0: 
 transmit timed out
 Oct 14 15:02:27 titine kernel: [7088474.574528] NETDEV WATCHDOG: eth0: 
 transmit timed out
 Oct 14 15:02:47 titine kernel: [7088494.574530] NETDEV WATCHDOG: eth0: 
 transmit timed out
 Oct 14 15:03:03 titine kernel: [7088510.574533] NETDEV WATCHDOG: eth0: 
 transmit timed out
 Oct 14 15:03:19 titine kernel: [7088526.574532] NETDEV WATCHDOG: eth0: 
 transmit timed out
 Oct 14 15:03:39 titine kernel: [7088546.574534] NETDEV WATCHDOG: eth0: 
 transmit timed out
 Oct 14 15:03:55 titine kernel: [7088562.574532] NETDEV WATCHDOG: eth0: 
 transmit timed out
 Oct 14 15:04:11 titine kernel: [7088578.574537] NETDEV WATCHDOG: eth0: 
 transmit timed out
 Oct 14 15:04:23 titine kernel: [7088590.574540] NETDEV WATCHDOG: eth0: 
 transmit timed out
 Oct 14 15:04:47 titine kernel: [7088614.574541] NETDEV WATCHDOG: eth0: 
 transmit timed out
 Oct 14 15:05:27 titine kernel: [7088654.574546] NETDEV WATCHDOG: eth0: 
 transmit timed out
 Oct 14 15:06:03 titine kernel: [7088690.574542] NETDEV WATCHDOG: eth0: 
 transmit timed out
 [...]

Does /var/log/kern.log contain anything interesting before these
messages? (oops message, tulip driver messages, etc)?

 Saw a similar bug, number 522592. It's closed now, with no solution or fix.

As suggested in that bug report, can you try the 2.6.32 kernel from
squeeze and see if it has better results?






Bug#600281: linux-image-2.6.26-2-sparc64: networks halt after a few weeks on TULIP DAVICOM DM9102, NETDEV WATCHDOG: eth0: transmit timed out

2010-10-15 Thread Remi Bouhl
Package: linux-image-2.6.26-2-sparc64
Version: 2.6.26-22lenny1
Severity: important


I am using a Sun Fire V100 with Debian Lenny. It was OK for a few weeks, then
suddenly it was no longer possible to access it over SSH: connection timed
out. The only thing I could get over the network was the Apache default
index page. No way to download a file. It all looked as if the
network was over logging.

I couldn't get physical access, so I asked someone to reboot it. After that I had
a look at syslog; there are many lines like this:
[...]
Oct 14 15:02:11 titine kernel: [7088458.574525] NETDEV WATCHDOG: eth0: transmit 
timed out
Oct 14 15:02:27 titine kernel: [7088474.574528] NETDEV WATCHDOG: eth0: transmit 
timed out
Oct 14 15:02:47 titine kernel: [7088494.574530] NETDEV WATCHDOG: eth0: transmit 
timed out
Oct 14 15:03:03 titine kernel: [7088510.574533] NETDEV WATCHDOG: eth0: transmit 
timed out
Oct 14 15:03:19 titine kernel: [7088526.574532] NETDEV WATCHDOG: eth0: transmit 
timed out
Oct 14 15:03:39 titine kernel: [7088546.574534] NETDEV WATCHDOG: eth0: transmit 
timed out
Oct 14 15:03:55 titine kernel: [7088562.574532] NETDEV WATCHDOG: eth0: transmit 
timed out
Oct 14 15:04:11 titine kernel: [7088578.574537] NETDEV WATCHDOG: eth0: transmit 
timed out
Oct 14 15:04:23 titine kernel: [7088590.574540] NETDEV WATCHDOG: eth0: transmit 
timed out
Oct 14 15:04:47 titine kernel: [7088614.574541] NETDEV WATCHDOG: eth0: transmit 
timed out
Oct 14 15:05:27 titine kernel: [7088654.574546] NETDEV WATCHDOG: eth0: transmit 
timed out
Oct 14 15:06:03 titine kernel: [7088690.574542] NETDEV WATCHDOG: eth0: transmit 
timed out
[...]

Saw a similar bug, number 522592. It's closed now, with no solution or fix.

Network settings:

mii-tool eth0 -vv
Using SIOCGMIIPHY=0x8947
eth0: negotiated 100baseTx-FD, link ok
  registers for MII PHY 1: 
1000 782d 0181 b840 01e1 41e1 0001 
       
 8018 7800 1000 0001   
       
  product info: vendor 00:60:6e, model 4 rev 0
  basic mode:   autonegotiation enabled
  basic status: autonegotiation complete, link ok
  capabilities: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
  advertising:  100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
  link partner: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD

Module info:

modinfo tulip
filename:   /lib/modules/2.6.26-2-sparc64/kernel/drivers/net/tulip/tulip.ko
version:    1.1.15-NAPI
license:    GPL
description: Digital 21*4* Tulip ethernet driver
author:     The Linux Kernel Team
srcversion: A033E0667282FCC8CABF3C6
alias:  pci:v1414d0002sv*sd*bc*sc*i*
alias:  pci:v14EAdAB08sv*sd*bc*sc*i*
alias:  pci:v10B7d9300sv*sd*bc*sc*i*
alias:  pci:v17B3dAB08sv*sd*bc*sc*i*
alias:  pci:v1737dAB08sv*sd*bc*sc*i*
alias:  pci:v1737dAB09sv*sd*bc*sc*i*
alias:  pci:v1626d8410sv*sd*bc*sc*i*
alias:  pci:v14F1d1803sv*sd*bc*sc*i*
alias:  pci:v1186d1591sv*sd*bc*sc*i*
alias:  pci:v1186d1561sv*sd*bc*sc*i*
alias:  pci:v1186d1541sv*sd*bc*sc*i*
alias:  pci:v1113d9511sv*sd*bc*sc*i*
alias:  pci:v1113d1217sv*sd*bc*sc*i*
alias:  pci:v1113d1216sv*sd*bc*sc*i*
alias:  pci:v1282d9102sv*sd*bc*sc*i*
alias:  pci:v1282d9100sv*sd*bc*sc*i*
alias:  pci:v8086d0039sv*sd*bc*sc*i*
alias:  pci:v11F6d9881sv*sd*bc*sc*i*
alias:  pci:v1259dA120sv*sd*bc*sc*i*
alias:  pci:v104Ad2774sv*sd*bc*sc*i*
alias:  pci:v104Ad0981sv*sd*bc*sc*i*
alias:  pci:v13D1dAB08sv*sd*bc*sc*i*
alias:  pci:v13D1dAB03sv*sd*bc*sc*i*
alias:  pci:v13D1dAB02sv*sd*bc*sc*i*
alias:  pci:v1317d9511sv*sd*bc*sc*i*
alias:  pci:v1317d1985sv*sd*bc*sc*i*
alias:  pci:v1317d0985sv*sd*bc*sc*i*
alias:  pci:v1317d0981sv*sd*bc*sc*i*
alias:  pci:v11ADdC115sv*sd*bc*sc*i*
alias:  pci:v125Bd1400sv*sd*bc*sc*i*
alias:  pci:v10D9d0531sv*sd*bc*sc*i*
alias:  pci:v10D9d0512sv*sd*bc*sc*i*
alias:  pci:v11ADd0002sv*sd*bc*sc*i*
alias:  pci:v1011d0019sv*sd*bc*sc*i*
alias:  pci:v1011d0009sv*sd*bc*sc*i*
depends:
vermagic:   2.6.26-2-sparc64 mod_unload modversions 
parm:   tulip_debug:int
parm:   max_interrupt_work:int
parm:   rx_copybreak:int
parm:   csr0:int
parm:   options:array of int
parm:   full_duplex:array of int

Are there any tools or options I could enable to get more information
the next time this bug appears (if it does)?
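
One low-tech way to get a little more out of the log excerpts above is to measure how the watchdog messages are spaced, using the kernel timestamps in square brackets. The following is only a minimal sketch, not a diagnostic tool: the function name `watchdog_intervals` and the regular expression are illustrative assumptions, and it only looks at lines that carry both the timestamp and the "NETDEV WATCHDOG" text.

```python
import re

# Matches the kernel timestamp and the interface name in a watchdog line,
# e.g. "[7088458.574525] NETDEV WATCHDOG: eth0: transmit timed out".
WATCHDOG_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+NETDEV WATCHDOG:\s+(\w+):")


def watchdog_intervals(lines):
    """Return {interface: [seconds between consecutive watchdog messages]}."""
    last_seen = {}
    intervals = {}
    for line in lines:
        match = WATCHDOG_RE.search(line)
        if not match:
            continue  # not a watchdog line (or a wrapped continuation line)
        stamp, iface = float(match.group(1)), match.group(2)
        if iface in last_seen:
            gap = round(stamp - last_seen[iface], 3)
            intervals.setdefault(iface, []).append(gap)
        last_seen[iface] = stamp
    return intervals


# Three lines taken from the syslog excerpt in this report.
sample = [
    "Oct 14 15:02:11 titine kernel: [7088458.574525] NETDEV WATCHDOG: eth0: transmit timed out",
    "Oct 14 15:02:27 titine kernel: [7088474.574528] NETDEV WATCHDOG: eth0: transmit timed out",
    "Oct 14 15:02:47 titine kernel: [7088494.574530] NETDEV WATCHDOG: eth0: transmit timed out",
]
print(watchdog_intervals(sample))
```

To run it against a real log, something like `watchdog_intervals(open("/var/log/kern.log"))` should work, assuming the file is readable. The spacing by itself does not identify the cause; it only shows whether the timeouts recur at a steady rate.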

Remi.

-- System Information:
Debian Release: 5.0.4
  APT prefers stable
  APT policy: (500,