Re: em0: watchdog timeout -- resetting (6.1-STABLE)
Hello,

Just wanted to send a "me too" on this issue. Whenever it happens I can see our Cisco switch reporting the interface (line protocol) going down and up as well.

FreeBSD localhost.localdomain 6.2-PRERELEASE FreeBSD 6.2-PRERELEASE #1: Wed Sep 13 00:10:04 CEST 2006 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/PT i386

em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        options=b<RXCSUM,TXCSUM,VLAN_MTU>
        media: Ethernet autoselect (1000baseTX <full-duplex>)
        status: active

[EMAIL PROTECTED]:11:0: class=0x02 card=0x10048086 chip=0x10048086 rev=0x02 hdr=0x00
    vendor = 'Intel Corporation'
    device = '82543GC Gigabit Ethernet Controller (Copper)'
    class = network
    subclass = ethernet

(This is an add-in 64-bit PCI card.)

I am stress-testing -STABLE on a spare server to help make 6.2 as bug-free as possible. It is set up as an NFS server with two Linux NFS clients that are concurrently extracting 5 copies of /usr/src to it and running a program that creates millions of files with random UIDs to test for QUOTA issues. On the server I repeatedly dump the exported filesystem with snapshot and cache enabled (dump -L -C 32 -af /dev/null ...).

I'm building today's -STABLE on a different server with SMP and two onboard em NICs, and will start similar tests on it to see if I can reproduce the watchdog timeouts there as well.

--
Frode Nordahl

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: em0: watchdog timeout -- resetting (6.1-STABLE)
I'm also seeing these on a Supermicro PDSMi board with a recent -STABLE. Please tell me what debugging info is needed to fix this.

/Martin

FreeBSD mailbox 6.2-PRERELEASE FreeBSD 6.2-PRERELEASE #1: Sun Sep 10 17:43:15 CEST 2006 [EMAIL PROTECTED]:/usr/obj-local/usr/src/sys/SMP amd64

lspci -v output:

04:00.0 Ethernet controller: Intel Corporation 82573E Gigabit Ethernet Controller (Copper) (rev 03)
        Subsystem: Super Micro Computer Inc Unknown device 108c
        Flags: bus master, fast devsel, latency 0, IRQ 16
        Memory at ed20 (32-bit, non-prefetchable)
        I/O ports at 4000
        Capabilities: [c8] Power Management version 2
        Capabilities: [d0] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-
        Capabilities: [e0] Express Endpoint IRQ 0

05:00.0 Ethernet controller: Intel Corporation 82573L Gigabit Ethernet Controller
        Subsystem: Super Micro Computer Inc Unknown device 109a
        Flags: bus master, fast devsel, latency 0, IRQ 17
        Memory at ed30 (32-bit, non-prefetchable)
        I/O ports at 5000
        Capabilities: [c8] Power Management version 2
        Capabilities: [d0] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-
        Capabilities: [e0] Express Endpoint IRQ 0
Re: em0: watchdog timeout -- resetting (6.1-STABLE)
On Thu, Sep 14, 2006 at 02:27:29AM +0200, Ronald Klop wrote:
> The manual page em(4) mentions trying another cable when the watchdog
> timeout happens, so I tried that. But it didn't help.
> Is there anything I can test to (help) debug this?
> It happens a lot when my machine is under load (100% CPU).
> Is it possible that it happens since I upgraded the memory from 1 GB to 2 GB?

I don't think it's the cable. I started getting these recently as well (starting about a week ago), always when there's a lot of CPU and disk I/O load. Sometimes my USB keyboard would also become unresponsive at about the same time (under high load). Sometimes it would stutter and act as if a key were being held down for a second or two.

I built a new kernel (6.2-PRE now) on 9/12. The keyboard problem seems to be gone, but I still get the em watchdog timeouts occasionally.

Craig
Re: em0: watchdog timeout -- resetting (6.1-STABLE)
Something with em0 is really wrong.

I don't get timeouts, but before the cvsup I had 6.0-PRERELEASE and didn't have a problem. Now I have FreeBSD 6.2-PRERELEASE #8: Fri Sep 15 03:44:49 MSD 2006, and the problem is as follows. (The machine has LARGE_NAT, em0, em1, em2.)

On a freshly booted system, a ping to www.ru from a client computer (which goes to the Internet via NAT) takes 3-5 ms. After a few hours (I see it at night, when traffic is lower), a ping to www.ru takes 11-12 ms. Why? After a reboot it is good again for a few hours.

This is a FreeBSD/amd64 kernel with:

options DEVICE_POLLING
options HZ=2500

With HZ=1000 and without DEVICE_POLLING nothing changes: it still goes to 11-12 ms after a few hours.

P.S. Should I downgrade to 6.0-RELEASE or earlier, or could tonight's cvsup updates resolve the problem? (The files sound TCP-related.)

Checkout src/sys/contrib/ipfilter/netinet/ip_nat.h
Edit src/sys/netinet/in_pcb.c
Edit src/sys/netinet/tcp_input.c
Edit src/sys/netinet/tcp_subr.c
Edit src/sys/netinet/tcp_timer.c
Edit src/sys/netinet/tcp_timer.h
Edit src/sys/netinet/tcp_var.h
Edit src/sys/sys/param.h
Edit src/usr.sbin/pkg_install/add/main.c

P.P.S. I am rebuilding kernels now and will see tomorrow night.
Re: em0: watchdog timeout -- resetting (6.1-STABLE)
On 9/14/06, David C. Myers [EMAIL PROTECTED] wrote:
>> Watchdogs mean that the transmit ring is not being cleaned, so the
>> question is what your machine is doing at 100% CPU. If it's that
>> busy, the network watchdogs may just be a side effect and not the
>> real problem.
>
> I get them with a completely idle machine. My home directory is
> mounted via NFS (from FreeBSD 6.1 on an amd64 machine), and with the
> kernel from earlier this week, the machine would just hang for 30
> seconds to a couple of minutes. A slew of watchdog timeout messages
> would appear. Then I'd get a moment's responsiveness out of the
> machine, then another long wait, then a moment's responsiveness, then
> a long wait... The machine would never recover from this cycle (at
> least, so far as I was patient enough to wait).
>
> Going back to a kernel dated late July resolved everything.
>
> Someone else asked me for the hardware version of my em0 board...
>
> [EMAIL PROTECTED]:10:0: class=0x02 card=0x002e8086 chip=0x100e8086 rev=0x02 hdr=0x00
>     vendor = 'Intel Corporation'
>     device = '82540EM Gigabit Ethernet Controller'
>     class = network
>     subclass = ethernet

Could you perhaps go back to the kernel you say was stable and then drop in the latest em driver? Or, if that has issues building, do it the other way around: take the em driver from the build that gave you no problems and put it into the kernel you are running now.

It would be helpful to know if this is a driver problem or something in the stack.

Cheers,

Jack
Re: em0: watchdog timeout -- resetting (6.1-STABLE)
On 9/15/06, Martin Nilsson [EMAIL PROTECTED] wrote:
> I'm also seeing these on a Supermicro PDSMi board with a recent -STABLE.
> Please tell me what debugging info is needed to fix this.
>
> /Martin
>
> FreeBSD mailbox 6.2-PRERELEASE FreeBSD 6.2-PRERELEASE #1: Sun Sep 10 17:43:15 CEST 2006 [EMAIL PROTECTED]:/usr/obj-local/usr/src/sys/SMP amd64
>
> lspci -v output:
>
> 04:00.0 Ethernet controller: Intel Corporation 82573E Gigabit Ethernet Controller (Copper) (rev 03)
>         Subsystem: Super Micro Computer Inc Unknown device 108c
>         Flags: bus master, fast devsel, latency 0, IRQ 16
>         Memory at ed20 (32-bit, non-prefetchable)
>         I/O ports at 4000
>         Capabilities: [c8] Power Management version 2
>         Capabilities: [d0] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-
>         Capabilities: [e0] Express Endpoint IRQ 0
>
> 05:00.0 Ethernet controller: Intel Corporation 82573L Gigabit Ethernet Controller
>         Subsystem: Super Micro Computer Inc Unknown device 109a
>         Flags: bus master, fast devsel, latency 0, IRQ 17
>         Memory at ed30 (32-bit, non-prefetchable)
>         I/O ports at 5000
>         Capabilities: [c8] Power Management version 2
>         Capabilities: [d0] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable-
>         Capabilities: [e0] Express Endpoint IRQ 0

Martin, do you see similar problems using either port? I ask because this system may be similar to one that Yahoo has, where there was a problem with only one port and not the other. Can you check this out, please?

Jack
Re: em0: watchdog timeout -- resetting (6.1-STABLE)
Jack Vogel wrote:
> On 9/13/06, Ronald Klop [EMAIL PROTECTED] wrote:
>> ...
>> The manual page em(4) mentions trying another cable when the watchdog
>> timeout happens, so I tried that. But it didn't help.
>> Is there anything I can test to (help) debug this?
>> It happens a lot when my machine is under load (100% CPU).
>> Is it possible that it happens since I upgraded the memory from 1 GB to 2 GB?
>
> Watchdogs mean that the transmit ring is not being cleaned, so the
> question is what your machine is doing at 100% CPU. If it's that
> busy, the network watchdogs may just be a side effect and not the
> real problem.
>
> Jack

I see these too when installing packages over NFS on my laptop. If I run with a low level of network traffic, i.e. ssh compile, and peg the CPU with a benchmark such as flops, I don't see these timeouts.

6.1-STABLE FreeBSD 6.1-STABLE #0: Sat Aug 26 14:45:40 CDT 2006

[EMAIL PROTECTED]:1:0: class=0x02 card=0x05491014 chip=0x101e8086 rev=0x03 hdr=0x00
    vendor = 'Intel Corporation'
    device = '82540EP Gigabit Ethernet Controller (Mobile)'
    class = network

Any suggestions?

Thanks,
Dan
Re: em0: watchdog timeout -- resetting (6.1-STABLE)
> Watchdogs mean that the transmit ring is not being cleaned, so the
> question is what your machine is doing at 100% CPU. If it's that
> busy, the network watchdogs may just be a side effect and not the
> real problem.

I get them with a completely idle machine. My home directory is mounted via NFS (from FreeBSD 6.1 on an amd64 machine), and with the kernel from earlier this week, the machine would just hang for 30 seconds to a couple of minutes. A slew of watchdog timeout messages would appear. Then I'd get a moment's responsiveness out of the machine, then another long wait, then a moment's responsiveness, then a long wait... The machine would never recover from this cycle (at least, so far as I was patient enough to wait).

Going back to a kernel dated late July resolved everything.

Someone else asked me for the hardware version of my em0 board...

[EMAIL PROTECTED]:10:0: class=0x02 card=0x002e8086 chip=0x100e8086 rev=0x02 hdr=0x00
    vendor = 'Intel Corporation'
    device = '82540EM Gigabit Ethernet Controller'
    class = network
    subclass = ethernet

-David.
Re: em0: watchdog timeout -- resetting (6.1-STABLE)
On Fri, 15 Sep 2006 02:06:08 +0200, David C. Myers [EMAIL PROTECTED] wrote:
>> Watchdogs mean that the transmit ring is not being cleaned, so the
>> question is what your machine is doing at 100% CPU. If it's that
>> busy, the network watchdogs may just be a side effect and not the
>> real problem.
>
> I get them with a completely idle machine. My home directory is
> mounted via NFS (from FreeBSD 6.1 on an amd64 machine), and with the
> kernel from earlier this week, the machine would just hang for 30
> seconds to a couple of minutes. A slew of watchdog timeout messages
> would appear. Then I'd get a moment's responsiveness out of the
> machine, then another long wait, then a moment's responsiveness, then
> a long wait... The machine would never recover from this cycle (at
> least, so far as I was patient enough to wait).
>
> Going back to a kernel dated late July resolved everything.
>
> Someone else asked me for the hardware version of my em0 board...
>
> [EMAIL PROTECTED]:10:0: class=0x02 card=0x002e8086 chip=0x100e8086 rev=0x02 hdr=0x00
>     vendor = 'Intel Corporation'
>     device = '82540EM Gigabit Ethernet Controller'
>     class = network
>     subclass = ethernet
>
> -David.

This sounds similar to my problem. I solved it today by enabling polling. I know it's a workaround.

--
Ronald Klop
Amsterdam, The Netherlands
Re: em0: watchdog timeout -- resetting (6.1-STABLE)
On Tue, 05 Sep 2006 23:52:05 +0200, Ronald Klop [EMAIL PROTECTED] wrote:
> Hello,
>
> I get these errors a lot.
>
> Sep 5 11:55:12 ronald kernel: em0: watchdog timeout -- resetting
> Sep 5 11:55:12 ronald kernel: em0: link state changed to DOWN
> Sep 5 11:55:14 ronald kernel: em0: link state changed to UP
> Sep 5 12:00:37 ronald kernel: em0: watchdog timeout -- resetting
> Sep 5 12:00:37 ronald kernel: em0: link state changed to DOWN
> Sep 5 12:00:39 ronald kernel: em0: link state changed to UP
>
> I tried turning off rxcsum/txcsum and setting these sysctls:
>
> dev.em.0.rx_int_delay: 0
> dev.em.0.tx_int_delay: 0 (default 66)
>
> But the error is still there.
> Searching the Internet and the list turns up more of the same
> problems, but I didn't find an answer.
>
> My dmesg is attached. Is there any info I need to provide to debug
> this, or can I try patches?

The manual page em(4) mentions trying another cable when the watchdog timeout happens, so I tried that. But it didn't help.

Is there anything I can test to (help) debug this? It happens a lot when my machine is under load (100% CPU). Is it possible that it happens since I upgraded the memory from 1 GB to 2 GB?

(dmesg was attached to my previous mail, but I can provide it again.)

Ronald.

--
Ronald Klop
Amsterdam, The Netherlands
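For anyone wanting to quantify how often the resets occur and how long the link stays down, the log excerpt above can be summarized with a short script. This is only a sketch: the sample lines are copied from the report, while the regular expression, the function names `parse` and `summarize`, and the assumed year are illustrative choices and not part of any FreeBSD tool.

```python
import re
from datetime import datetime

# Sample lines copied from the report above.
LOG = """\
Sep 5 11:55:12 ronald kernel: em0: watchdog timeout -- resetting
Sep 5 11:55:12 ronald kernel: em0: link state changed to DOWN
Sep 5 11:55:14 ronald kernel: em0: link state changed to UP
Sep 5 12:00:37 ronald kernel: em0: watchdog timeout -- resetting
Sep 5 12:00:37 ronald kernel: em0: link state changed to DOWN
Sep 5 12:00:39 ronald kernel: em0: link state changed to UP
"""

# Assumed layout: "<Mon> <day> <HH:MM:SS> <host> kernel: <ifname>: <message>"
LINE = re.compile(r"^(\w{3}\s+\d+ \d\d:\d\d:\d\d) \S+ kernel: (\w+): (.*)$")


def parse(text, year=2006):
    """Yield (timestamp, interface, message) for each matching kernel line.

    syslog omits the year, so it is supplied by the caller (assumption).
    """
    for line in text.splitlines():
        m = LINE.match(line)
        if m:
            stamp = datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
            yield stamp, m.group(2), m.group(3)


def summarize(text, ifname="em0"):
    """Return (reset_times, outage_seconds) for one interface."""
    resets, outages, down_at = [], [], None
    for stamp, dev, msg in parse(text):
        if dev != ifname:
            continue
        if msg.startswith("watchdog timeout"):
            resets.append(stamp)
        elif msg.endswith("to DOWN"):
            down_at = stamp
        elif msg.endswith("to UP") and down_at is not None:
            outages.append((stamp - down_at).total_seconds())
            down_at = None
    return resets, outages


resets, outages = summarize(LOG)
# Seconds between consecutive watchdog resets.
gaps = [(b - a).total_seconds() for a, b in zip(resets, resets[1:])]
print(len(resets), "resets; outages (s):", outages, "; gaps (s):", gaps)
```

Run against a full /var/log/messages, the same summary would show whether the resets cluster around periods of high load, which is the question raised in this thread.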
Re: em0: watchdog timeout -- resetting (6.1-STABLE)
> Sep 5 11:55:12 ronald kernel: em0: watchdog timeout -- resetting

I got a bazillion of these, and a completely unusable machine, when I upgraded to 6.1-STABLE sources as of two days ago. The machine would simply freeze for minutes at a time. Going back to my previous kernel (dating from late July) made everything just fine.

So something got seriously broken in the em driver in the last few weeks.

-David.
Re: em0: watchdog timeout -- resetting (6.1-STABLE)
At 10:20 PM 9/13/2006, David Myers wrote:
>> Sep 5 11:55:12 ronald kernel: em0: watchdog timeout -- resetting
>
> I got a bazillion of these, and a completely unusable machine, when I
> upgraded to 6.1-STABLE sources as of two days ago. The machine would
> simply freeze for minutes at a time. Going back to my previous kernel
> (dating from late July) made everything just fine.
>
> So something got seriously broken in the em driver in the last few weeks.

Which version of the NIC do you have? (pciconf -lv)

---Mike
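When comparing the pciconf -lv output that people post, it helps to know that the chip= value packs the 16-bit device ID in the high half and the 16-bit vendor ID in the low half (0x8086 is Intel). The sketch below decodes one such line; the sample line and the device names in the table are taken from reports in this thread, while the function names `split_id` and `decode` and the table itself are illustrative assumptions, not part of pciconf.

```python
import re

INTEL = 0x8086  # Intel's PCI vendor ID

# Device IDs and names exactly as reported by posters in this thread.
SEEN_IN_THREAD = {
    0x1004: "82543GC Gigabit Ethernet Controller (Copper)",
    0x100E: "82540EM Gigabit Ethernet Controller",
    0x101E: "82540EP Gigabit Ethernet Controller (Mobile)",
}


def split_id(value):
    """Split a 32-bit pciconf ID into (device, vendor): high and low 16 bits."""
    return value >> 16, value & 0xFFFF


def decode(line):
    """Extract chip and rev from the first line of a pciconf -lv entry."""
    fields = dict(re.findall(r"(\w+)=0x([0-9a-fA-F]+)", line))
    device, vendor = split_id(int(fields["chip"], 16))
    return {
        "vendor": vendor,
        "device": device,
        "rev": int(fields["rev"], 16),
        # Only look up names for Intel parts; unknown IDs yield None.
        "name": SEEN_IN_THREAD.get(device) if vendor == INTEL else None,
    }


info = decode("class=0x02 card=0x002e8086 chip=0x100e8086 rev=0x02 hdr=0x00")
print(f"vendor=0x{info['vendor']:04x} device=0x{info['device']:04x} "
      f"rev=0x{info['rev']:02x} {info['name']}")
```

Grouping reports by device ID and revision in this way makes it easier to see whether the timeouts are confined to particular controller generations.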
Re: em0: watchdog timeout -- resetting (6.1-STABLE)
On 9/13/06, Ronald Klop [EMAIL PROTECTED] wrote:
> ...
> The manual page em(4) mentions trying another cable when the watchdog
> timeout happens, so I tried that. But it didn't help.
> Is there anything I can test to (help) debug this?
> It happens a lot when my machine is under load (100% CPU).
> Is it possible that it happens since I upgraded the memory from 1 GB to 2 GB?

Watchdogs mean that the transmit ring is not being cleaned, so the question is what your machine is doing at 100% CPU. If it's that busy, the network watchdogs may just be a side effect and not the real problem.

Jack
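To illustrate what "the transmit ring is not being cleaned" means, here is a toy model of a transmit ring guarded by a watchdog timer. It is a simplified sketch, not the em driver's code: the class `TxRing`, the ring size, the timeout value, and the choice to drop all pending descriptors on reset are assumptions made purely for illustration.

```python
class TxRing:
    """Toy transmit ring: descriptors are queued by the stack and
    reclaimed ("cleaned") once the hardware reports them as sent."""

    def __init__(self, size=8, timeout=5):
        self.size = size        # hypothetical number of descriptors
        self.timeout = timeout  # hypothetical watchdog period, in ticks
        self.pending = 0        # descriptors queued but not yet cleaned
        self.timer = 0          # 0 means the watchdog is disarmed
        self.resets = 0

    def enqueue(self):
        """Queue one packet and arm the watchdog; False if the ring is full."""
        if self.pending == self.size:
            return False
        self.pending += 1
        self.timer = self.timeout
        return True

    def clean(self, done):
        """Reclaim up to `done` descriptors; progress re-arms the timer,
        and an empty ring disarms it."""
        done = min(done, self.pending)
        if done:
            self.pending -= done
            self.timer = self.timeout if self.pending else 0

    def tick(self):
        """One watchdog tick; reset the ring if the timer runs out."""
        if self.timer == 0:
            return
        self.timer -= 1
        if self.timer == 0:
            self.resets += 1   # corresponds to "watchdog timeout -- resetting"
            self.pending = 0   # assumption: a reset discards queued packets


healthy, stalled = TxRing(), TxRing()
for _ in range(20):
    for ring in (healthy, stalled):
        ring.enqueue()
        ring.tick()
    healthy.clean(1)  # completions are processed on every tick
    # stalled.clean() is never called, so its ring fills and the timer expires

print("healthy resets:", healthy.resets, "stalled resets:", stalled.resets)
```

In this model the watchdog only fires on the ring whose completions are never processed, which is why the timeouts can be a symptom of something else (a starved interrupt handler or taskqueue, for example) and not necessarily a fault in the NIC or cable.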
Re: em0: watchdog timeout -- resetting (6.1-STABLE)
On Tuesday 05 September 2006 14:52, Ronald Klop wrote:
> Hello,
>
> I get these errors a lot.
>
> Sep 5 11:55:12 ronald kernel: em0: watchdog timeout -- resetting
> Sep 5 11:55:12 ronald kernel: em0: link state changed to DOWN
> Sep 5 11:55:14 ronald kernel: em0: link state changed to UP
> Sep 5 12:00:37 ronald kernel: em0: watchdog timeout -- resetting
> Sep 5 12:00:37 ronald kernel: em0: link state changed to DOWN
> Sep 5 12:00:39 ronald kernel: em0: link state changed to UP

So am I, especially when I transfer a GB or two from Windows XP to 6.1-STABLE. I use the FreeBSD machine as a backup for digital photos and my ripped MP3 files. A photo session is usually in excess of 1 GB, and the transfer can hang with the watchdog timeout.

Kent

> I tried turning off rxcsum/txcsum and setting these sysctls:
>
> dev.em.0.rx_int_delay: 0
> dev.em.0.tx_int_delay: 0 (default 66)
>
> But the error is still there.
> Searching the Internet and the list turns up more of the same
> problems, but I didn't find an answer.
>
> My dmesg is attached. Is there any info I need to provide to debug
> this, or can I try patches?
>
> Ronald.

--
Kent Stewart
Richland, WA

http://www.soyandina.com/ I am Andean project.
http://users.owt.com/kstewart/index.html