Re: 7.1-STABLE Sun Mar 29 01:06:46 ADT 2009 Locks up ...
On Saturday 09 May 2009 6:43:16 pm Marc G. Fournier wrote:
> On Tue, 28 Apr 2009, Gavin Atkinson wrote:
> > On Fri, 2009-04-24 at 20:39 +0200, Martin Schmidt wrote:
> > > Hi Marc and List,
> > >
> > > I had similar issues with FreeBSD 7.2-PRERELEASE. The server
> > > (zfs, nfs) seems to hang in intervals of about 8 hours. The kernel
> > > is still there, but no connections can be made to nfs/ssh, and
> > > login on the local console doesn't seem to work due to incredible
> > > slowness. Breaking to the debugger takes a moment but works.
> > > (Compiling the kernel with WITNESS didn't help.) The server had
> > > been solid before with a 7-STABLE kernel from around 19 October
> > > 2008.
> > >
> > > I now added these lines to /boot/loader.conf
> > >
> > >     hw.pci.enable_msi=0
> > >     hw.pci.enable_msix=0
> > >
> > > to disable Message Signaled Interrupts, which are used by the
> > > 3ware twa driver and igb network driver on our server.
> >
> > If you are willing to test further on your server, it may be helpful
> > if you could determine which of those two lines in loader.conf fixes
> > the problem for you. It would also be useful to provide a dmesg from
> > the machine when both msi and msix are enabled.
> >
> > FWIW, looking at the vmstat -i output it appears that only the igb
> > driver is using MSI/MSIX, unless you have a reason to suspect
> > otherwise?
>
> How do you tell that, about igb? Looking at the server I have the igb
> device on, it doesn't seem to say anything about that ...

IRQs >= 256 are MSI/MSI-X.

--
John Baldwin
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
RE: 7.1-STABLE Sun Mar 29 01:06:46 ADT 2009 Locks up ...
On Tue, 28 Apr 2009, Gavin Atkinson wrote:
> On Fri, 2009-04-24 at 20:39 +0200, Martin Schmidt wrote:
> > Hi Marc and List,
> >
> > I had similar issues with FreeBSD 7.2-PRERELEASE. The server
> > (zfs, nfs) seems to hang in intervals of about 8 hours. The kernel
> > is still there, but no connections can be made to nfs/ssh, and
> > login on the local console doesn't seem to work due to incredible
> > slowness. Breaking to the debugger takes a moment but works.
> > (Compiling the kernel with WITNESS didn't help.) The server had
> > been solid before with a 7-STABLE kernel from around 19 October
> > 2008.
> >
> > I now added these lines to /boot/loader.conf
> >
> >     hw.pci.enable_msi=0
> >     hw.pci.enable_msix=0
> >
> > to disable Message Signaled Interrupts, which are used by the
> > 3ware twa driver and igb network driver on our server.
>
> If you are willing to test further on your server, it may be helpful
> if you could determine which of those two lines in loader.conf fixes
> the problem for you. It would also be useful to provide a dmesg from
> the machine when both msi and msix are enabled.
>
> FWIW, looking at the vmstat -i output it appears that only the igb
> driver is using MSI/MSIX, unless you have a reason to suspect
> otherwise?

How do you tell that, about igb? Looking at the server I have the igb
device on, it doesn't seem to say anything about that ...
# vmstat -i
interrupt                          total       rate
irq1: atkbd0                         162          0
irq30: twa0                    402647215        187
cpu0: timer                   4284778818       1999
irq256: igb0                  1282945461        598
irq257: igb0                   215507100        100
irq258: igb0                   417702261        194
irq259: igb0                   314601966        146
irq260: igb0                   568062067        265
irq261: igb0                           3          0
cpu5: timer                       428475       1999
cpu6: timer                   4284731466       1999
cpu7: timer                   4284724508       1999
cpu1: timer                   4284893874       1999
cpu3: timer                   4284899807       1999
cpu2: timer                   4284892325       1999
cpu4: timer                   4284897264       1999
Total                        37480028742      17493

The server(s) that I am experiencing the hangs on, vmstat -i shows:

# vmstat -i
interrupt                          total       rate
irq1: atkbd0                           2          0
irq3: sio1                             8          0
irq25: bge0                      4614816        213
irq72: ciss0                     1835763         85
cpu0: timer                     43113685       1997
cpu1: timer                     43116889       1997
Total                           92681163       4293

Are any of these similarly using MSI/MSIX?

Marc G. Fournier        Hub.Org Networking Services (http://www.hub.org)
Email . scra...@hub.org                          MSN . scra...@hub.org
Yahoo . yscrappy          Skype: hub.org         ICQ . 7615664
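The question of which sources in a vmstat -i listing are on MSI/MSI-X can be checked mechanically. What follows is only a minimal Python sketch, going by John Baldwin's reply in this thread and assuming it means that interrupt sources numbered irq256 and higher are MSI/MSI-X on these 7.x systems. The names MSI_FIRST_IRQ and msi_sources are made up for illustration, and the sample lines are a few rows taken from the two outputs above.

```python
import re

# Assumption taken from the thread: MSI/MSI-X vectors show up as irq256 and up.
MSI_FIRST_IRQ = 256


def msi_sources(vmstat_text):
    """Return sorted (irq, device) pairs whose IRQ number is >= MSI_FIRST_IRQ."""
    found = set()
    # Match entries such as "irq256: igb0"; cpuN timer rows are skipped.
    for m in re.finditer(r"irq(\d+):\s*([A-Za-z_]+\d+)", vmstat_text):
        irq, dev = int(m.group(1)), m.group(2)
        if irq >= MSI_FIRST_IRQ:
            found.add((irq, dev))
    return sorted(found)


# A few rows from the igb server's output.
sample_igb = """\
irq1: atkbd0 162 0
irq30: twa0 402647215 187
irq256: igb0 1282945461 598
irq261: igb0 3 0
"""

# A few rows from one of the bge servers.
sample_bge = """\
irq3: sio1 8 0
irq25: bge0 4614816 213
irq72: ciss0 1835763 85
"""

print(msi_sources(sample_igb))  # [(256, 'igb0'), (261, 'igb0')]
print(msi_sources(sample_bge))  # []
```

By this rule only igb0 is on MSI/MSI-X in the first listing, and nothing in the second listing is numbered 256 or higher.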
RE: 7.1-STABLE Sun Mar 29 01:06:46 ADT 2009 Locks up ...
'k, based on grep'ng the source files, turns out that the if_bge device
driver uses msi, while, as you point out, the igb uses msix ...

I have disabled msi on the two servers with bge devices, and msix on the
one with igb ... all three have given the same sort of problem after
varying periods of time ... let's see if I can get to 30 days uptime
with this ...

On Tue, 28 Apr 2009, Gavin Atkinson wrote:
> On Fri, 2009-04-24 at 20:39 +0200, Martin Schmidt wrote:
> > Hi Marc and List,
> >
> > I had similar issues with FreeBSD 7.2-PRERELEASE. The server
> > (zfs, nfs) seems to hang in intervals of about 8 hours. The kernel
> > is still there, but no connections can be made to nfs/ssh, and
> > login on the local console doesn't seem to work due to incredible
> > slowness. Breaking to the debugger takes a moment but works.
> > (Compiling the kernel with WITNESS didn't help.) The server had
> > been solid before with a 7-STABLE kernel from around 19 October
> > 2008.
> >
> > I now added these lines to /boot/loader.conf
> >
> >     hw.pci.enable_msi=0
> >     hw.pci.enable_msix=0
> >
> > to disable Message Signaled Interrupts, which are used by the
> > 3ware twa driver and igb network driver on our server.
>
> If you are willing to test further on your server, it may be helpful
> if you could determine which of those two lines in loader.conf fixes
> the problem for you. It would also be useful to provide a dmesg from
> the machine when both msi and msix are enabled.
>
> FWIW, looking at the vmstat -i output it appears that only the igb
> driver is using MSI/MSIX, unless you have a reason to suspect
> otherwise?
>
> Gavin

Marc G. Fournier        Hub.Org Networking Services (http://www.hub.org)
Email . scra...@hub.org                          MSN . scra...@hub.org
Yahoo . yscrappy          Skype: hub.org         ICQ . 7615664
RE: 7.1-STABLE Sun Mar 29 01:06:46 ADT 2009 Locks up ...
On Fri, 2009-04-24 at 20:39 +0200, Martin Schmidt wrote:
> Hi Marc and List,
>
> I had similar issues with FreeBSD 7.2-PRERELEASE. The server
> (zfs, nfs) seems to hang in intervals of about 8 hours. The kernel
> is still there, but no connections can be made to nfs/ssh, and
> login on the local console doesn't seem to work due to incredible
> slowness. Breaking to the debugger takes a moment but works.
> (Compiling the kernel with WITNESS didn't help.) The server had
> been solid before with a 7-STABLE kernel from around 19 October
> 2008.
>
> I now added these lines to /boot/loader.conf
>
>     hw.pci.enable_msi=0
>     hw.pci.enable_msix=0
>
> to disable Message Signaled Interrupts, which are used by the
> 3ware twa driver and igb network driver on our server.

If you are willing to test further on your server, it may be helpful
if you could determine which of those two lines in loader.conf fixes
the problem for you. It would also be useful to provide a dmesg from
the machine when both msi and msix are enabled.

FWIW, looking at the vmstat -i output it appears that only the igb
driver is using MSI/MSIX, unless you have a reason to suspect
otherwise?

Gavin
RE: 7.1-STABLE Sun Mar 29 01:06:46 ADT 2009 Locks up ...
Hi Marc and List,

I had similar issues with FreeBSD 7.2-PRERELEASE. The server (zfs, nfs)
seems to hang in intervals of about 8 hours. The kernel is still there,
but no connections can be made to nfs/ssh, and login on the local
console doesn't seem to work due to incredible slowness. Breaking to the
debugger takes a moment but works. (Compiling the kernel with WITNESS
didn't help.) The server had been solid before with a 7-STABLE kernel
from around 19 October 2008.

I now added these lines to /boot/loader.conf

    hw.pci.enable_msi=0
    hw.pci.enable_msix=0

to disable Message Signaled Interrupts, which are used by the 3ware twa
driver and igb network driver on our server.

With this the server ran for 3 days with no hangs. I then enabled msi
again and had a hang within 24 hours. I disabled it again, and the
server has now been online without an issue for 6 days.

I'm not 100% sure yet whether this really is the sole source of the
problems (e.g. workload might be another factor), but I guess it's worth
a try to check if it might help you too.

If this is a known problem, or there are any other hints to solve it, or
if the server configuration just seems wrong, I would appreciate the
feedback.
regards,
Martin

pciconf (with msi):

hos...@pci0:0:0:0: class=0x06 card=0xa28015d9 chip=0x40038086 rev=0x20 hdr=0x00
    cap 01[50] = powerspec 3 supports D0 D3 current D0
    cap 05[58] = MSI supports 2 messages
    cap 10[6c] = PCI-Express 2 root port
pc...@pci0:0:1:0: class=0x060400 card=0xa28015d9 chip=0x40218086 rev=0x20 hdr=0x01
    cap 01[50] = powerspec 3 supports D0 D3 current D0
    cap 05[58] = MSI supports 2 messages
    cap 10[6c] = PCI-Express 2 root port
    cap 0d[b0] = PCI Bridge card=0xa28015d9
pc...@pci0:0:3:0: class=0x060400 card=0xa28015d9 chip=0x40238086 rev=0x20 hdr=0x01
    cap 01[50] = powerspec 3 supports D0 D3 current D0
    cap 05[58] = MSI supports 2 messages
    cap 10[6c] = PCI-Express 2 root port
    cap 0d[b0] = PCI Bridge card=0xa28015d9
pc...@pci0:0:5:0: class=0x060400 card=0xa28015d9 chip=0x40258086 rev=0x20 hdr=0x01
    cap 01[50] = powerspec 3 supports D0 D3 current D0
    cap 05[58] = MSI supports 2 messages
    cap 10[6c] = PCI-Express 2 root port
    cap 0d[b0] = PCI Bridge card=0xa28015d9
pc...@pci0:0:7:0: class=0x060400 card=0xa28015d9 chip=0x40278086 rev=0x20 hdr=0x01
    cap 01[50] = powerspec 3 supports D0 D3 current D0
    cap 05[58] = MSI supports 2 messages
    cap 10[6c] = PCI-Express 2 root port
    cap 0d[b0] = PCI Bridge card=0xa28015d9
pc...@pci0:0:9:0: class=0x060400 card=0xa28015d9 chip=0x40298086 rev=0x20 hdr=0x01
    cap 01[50] = powerspec 3 supports D0 D3 current D0
    cap 05[58] = MSI supports 2 messages
    cap 10[6c] = PCI-Express 2 root port
    cap 0d[b0] = PCI Bridge card=0xa28015d9
no...@pci0:0:15:0: class=0x088000 card=0xa28015d9 chip=0x402f8086 rev=0x20 hdr=0x00
    cap 01[50] = powerspec 3 supports D0 D3 current D0
    cap 11[58] = MSI-X supports 4 messages in map 0x10
    cap 10[6c] = PCI-Express 2 type 0
hos...@pci0:0:16:0: class=0x06 card=0xa28015d9 chip=0x40308086 rev=0x20 hdr=0x00
hos...@pci0:0:16:1: class=0x06 card=0xa28015d9 chip=0x40308086 rev=0x20 hdr=0x00
hos...@pci0:0:16:2: class=0x06 card=0xa28015d9 chip=0x40308086 rev=0x20 hdr=0x00
hos...@pci0:0:16:3: class=0x06 card=0xa28015d9 chip=0x40308086 rev=0x20 hdr=0x00
hos...@pci0:0:16:4: class=0x06 card=0xa28015d9 chip=0x40308086 rev=0x20 hdr=0x00
hos...@pci0:0:17:0: class=0x06 card=0xa28015d9 chip=0x40318086 rev=0x20 hdr=0x00
hos...@pci0:0:21:0: class=0x06 card=0xa28015d9 chip=0x40358086 rev=0x20 hdr=0x00
hos...@pci0:0:21:1: class=0x06 card=0xa28015d9 chip=0x40358086 rev=0x20 hdr=0x00
hos...@pci0:0:22:0: class=0x06 card=0xa28015d9 chip=0x40368086 rev=0x20 hdr=0x00
host...@pci0:0:22:1: class=0x06 card=0xa28015d9 chip=0x40368086 rev=0x20 hdr=0x00
pc...@pci0:0:28:0: class=0x060400 card=0xa28015d9 chip=0x26908086 rev=0x09 hdr=0x01
    cap 10[40] = PCI-Express 1 root port
    cap 05[80] = MSI supports 1 message
    cap 0d[90] = PCI Bridge card=0xa28015d9
    cap 01[a0] = powerspec 2 supports D0 D3 current D0
uh...@pci0:0:29:0: class=0x0c0300 card=0xa28015d9 chip=0x26888086 rev=0x09 hdr=0x00
uh...@pci0:0:29:1: class=0x0c0300 card=0xa28015d9 chip=0x26898086 rev=0x09 hdr=0x00
uh...@pci0:0:29:2: class=0x0c0300 card=0xa28015d9 chip=0x268a8086 rev=0x09 hdr=0x00
eh...@pci0:0:29:7: class=0x0c0320 card=0xa28015d9 chip=0x268c8086 rev=0x09 hdr=0x00
    cap 01[50] = powerspec 2 supports D0 D3 current D0
    cap 0a[58] = EHCI Debug Port at offset 0xa0 in map 0x14
pci...@pci0:0:30:0: class=0x060401 card=0xa28015d9 chip=0x244e8086 rev=0xd9 hdr=0x01
    cap 0d[50] = PCI Bridge card=0xa28015d9
is...@pci0:0:31:0: class=0x060100 card=0xa28015d9 chip=0x26708086 rev=0x09 hdr=0x00
atap...@pci0:0:31:1: class=0x01018a card=0xa28015d9 chip=0x269e8086 rev=0x09 hdr=0x00
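
The capability entries in a listing like this one can also be scanned mechanically to see which devices advertise MSI or MSI-X. A minimal Python sketch follows, assuming the text layout shown in this message (a `name@pciN:B:S:F:` selector followed by `cap XX[YY] = ...` entries). The names msi_capable, SELECTOR and CAP are made up for illustration and are not part of pciconf; the device names in the sample are elided exactly as they appear in the listing. Capability IDs 05 and 11 are the standard PCI IDs for MSI and MSI-X.

```python
import re

# Standard PCI capability IDs as printed in hex: 05 = MSI, 11 = MSI-X.
MSI_CAPS = {"05": "MSI", "11": "MSI-X"}

# A device selector such as "pc...@pci0:0:28:0:" (trailing colon not captured).
SELECTOR = re.compile(r"(\S+@pci\d+:\d+:\d+:\d+):")
# A capability entry such as "cap 05[80] = MSI supports 1 message".
CAP = re.compile(r"cap ([0-9a-f]{2})\[")


def msi_capable(pciconf_text):
    """Map each device selector to the MSI-type capabilities it advertises."""
    result = {}
    matches = list(SELECTOR.finditer(pciconf_text))
    for i, m in enumerate(matches):
        # Everything up to the next selector belongs to this device.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(pciconf_text)
        body = pciconf_text[m.end():end]
        kinds = [MSI_CAPS[c] for c in CAP.findall(body) if c in MSI_CAPS]
        if kinds:
            result[m.group(1)] = kinds
    return result


# Three entries copied from the listing in this message.
sample = """\
uh...@pci0:0:29:0: class=0x0c0300 card=0xa28015d9 chip=0x26888086 rev=0x09 hdr=0x00
pc...@pci0:0:28:0: class=0x060400 card=0xa28015d9 chip=0x26908086 rev=0x09 hdr=0x01
    cap 10[40] = PCI-Express 1 root port
    cap 05[80] = MSI supports 1 message
no...@pci0:0:15:0: class=0x088000 card=0xa28015d9 chip=0x402f8086 rev=0x20 hdr=0x00
    cap 11[58] = MSI-X supports 4 messages in map 0x10
"""

print(msi_capable(sample))
# {'pc...@pci0:0:28:0': ['MSI'], 'no...@pci0:0:15:0': ['MSI-X']}
```

This only reports what each device advertises. Whether a driver has actually allocated MSI or MSI-X vectors is a separate question, and the irq numbers in vmstat -i are the better guide there.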