Re: panic during work with jailed postgresql8.4
On Apr 2, 2010, at 4:12 PM, Bjoern A. Zeeb wrote:

> On Fri, 2 Apr 2010, Oleg Lomaka wrote:
>
>>>> uname -a
>>>> FreeBSD cerberus.regredi.com 8.0-STABLE FreeBSD 8.0-STABLE #7 r206031: Thu
>>>> Apr 1 13:43:57 EEST 2010
>>>> r...@cerberus.regredi.com:/usr/obj/usr/src/sys/GENERIC amd64
>>>>
>>>> Link to dmesg.boot:
>>>> http://docs.google.com/leaf?id=0B-irbkAqk9i7OGY2ZWJiODgtOWJmMy00NDQ1LTliZDctZjU3N2YwNmMxNjZl&hl=en
>>>>
>>>> Link to kernel core backtrace:
>>>> http://docs.google.com/Doc?docid=0AeirbkAqk9i7ZGc5Yzc2ZndfM2M4NzYydmRw&hl=en
>>>>
>>>> Can I help to spot this trouble by providing additional info?
>>>
>>> Looking at the info I doubt it's related to jails or Pg in first
>>> place. Have you been running that same setup already before your Apr
>>> 1st, r206031, kernel? If so, from when was your last kernel?
>>
>> Yes, this configuration works on another server fine (8.0-STABLE FreeBSD
>> 8.0-STABLE #3 r205202)
>>
>> Made few more tests. All tests I make using psql command (as it is 100%
>> reproducible, may be now try spot it using telnet/netcat, without involving
>> pg). psql accomplish login operation fine, panic appears after i run any
>> command like \d, so I think it depends on packet size.
>>
>> Current picture is:
>> 1. When connect from host machine - works fine.
>> 2. When I connect from other server - works fine.
>> 3. When connect from another jail on the same box as db's jail (tried from
>> few jails) - kernel fault.
>>
>> Also tried security.jail.allow_raw_sockets on/off - nothing changes.
>
> In addition to the private mail I have just sent you, the first thing
> you might try is to update again; I hadn't realized before that your
> r206031 seems to be in the middle of a multi-commit merge from two
> people.
>
> It would be worth updating to the latest stable/8 and trying again
> first.

That's it. r206088 works fine. Thank you for your help.
___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: panic during work with jailed postgresql8.4
On Apr 2, 2010, at 3:02 PM, Bjoern A. Zeeb wrote:

> On Thu, 1 Apr 2010, Oleg Lomaka wrote:
>> I have a kernel panic when connect to postgresql8.4 server installed in one
>> of jails from another jail. It's 100% reproducible.
>> Also I have tried to connect from host machine to jailed pg server. That way
>> it works fine without crash.
>>
>> Server configuration uses geli and zfs. Four disks encrypted using geli. And
>> raidz2 is using ad8.eli, ad10.eli, ad12.eli, ad14.eli providers. All jails
>> located at this raidz2 pool.
>>
>> Also I use ezjail for jails management. And it uses NFS to mount directories
>> with base system.
>>
>> Fatal double fault
>> rip = 0x8063510a
>> rsp = 0xff80eaec5f50
>> rbp = 0xff80eaec6040
>> cpuid = 1; apic id = 02
>> panic: double fault
>> cpuid = 1
>> Uptime: 7m11s
>> Physical memory: 8169 MB
>>
>> uname -a
>> FreeBSD cerberus.regredi.com 8.0-STABLE FreeBSD 8.0-STABLE #7 r206031: Thu
>> Apr 1 13:43:57 EEST 2010
>> r...@cerberus.regredi.com:/usr/obj/usr/src/sys/GENERIC amd64
>>
>> Link to dmesg.boot:
>> http://docs.google.com/leaf?id=0B-irbkAqk9i7OGY2ZWJiODgtOWJmMy00NDQ1LTliZDctZjU3N2YwNmMxNjZl&hl=en
>>
>> Link to kernel core backtrace:
>> http://docs.google.com/Doc?docid=0AeirbkAqk9i7ZGc5Yzc2ZndfM2M4NzYydmRw&hl=en
>>
>> Can I help to spot this trouble by providing additional info?
>
> Looking at the info I doubt it's related to jails or Pg in first
> place. Have you been running that same setup already before your Apr
> 1st, r206031, kernel? If so, from when was your last kernel?

Yes, this configuration works fine on another server (8.0-STABLE FreeBSD 8.0-STABLE #3 r205202).

I made a few more tests. All of them use the psql command (the panic is 100% reproducible with it; I may now try to trigger it with telnet/netcat, without involving pg). psql completes the login fine; the panic appears after I run any command such as \d, so I think it depends on packet size.

Current picture is:
1. When I connect from the host machine - works fine.
2. When I connect from another server - works fine.
3. When I connect from another jail on the same box as the db's jail (tried from a few jails) - kernel fault.

I also tried security.jail.allow_raw_sockets on/off - nothing changes.
Re: panic during work with jailed postgresql8.4
On Apr 2, 2010, at 4:52 AM, pluknet wrote:

> On 1 April 2010 22:18, Oleg Lomaka wrote:
>>
>> I have a kernel panic when connect to postgresql8.4 server installed in one
>> of jails from another jail. It's 100% reproducible.
>> Also I have tried to connect from host machine to jailed pg server. That way
>> it works fine without crash.
>>
>> Server configuration uses geli and zfs. Four disks encrypted using geli. And
>> raidz2 is using ad8.eli, ad10.eli, ad12.eli, ad14.eli providers. All jails
>> located at this raidz2 pool.
>>
>> Also I use ezjail for jails management. And it uses NFS to mount directories
>> with base system.
>>
>> Fatal double fault
>> rip = 0x8063510a
>> rsp = 0xff80eaec5f50
>> rbp = 0xff80eaec6040
>> cpuid = 1; apic id = 02
>> panic: double fault
>> cpuid = 1
>> Uptime: 7m11s
>> Physical memory: 8169 MB
>>
>> uname -a
>> FreeBSD cerberus.regredi.com 8.0-STABLE FreeBSD 8.0-STABLE #7 r206031: Thu
>> Apr 1 13:43:57 EEST 2010
>> r...@cerberus.regredi.com:/usr/obj/usr/src/sys/GENERIC amd64
>>
>> Link to dmesg.boot:
>> http://docs.google.com/leaf?id=0B-irbkAqk9i7OGY2ZWJiODgtOWJmMy00NDQ1LTliZDctZjU3N2YwNmMxNjZl&hl=en
>>
>> Link to kernel core backtrace:
>> http://docs.google.com/Doc?docid=0AeirbkAqk9i7ZGc5Yzc2ZndfM2M4NzYydmRw&hl=en
>
> Looking at backtrace, I wonder whether tp->t_maxseg changes in
> tcp_mtudisc() at all.
> You should be able to extract its value on each 2*n frame in that big
> recursive call.

You are right, tp->t_maxseg doesn't change:

(kgdb) frame 9
#9 0x807097e8 in tcp_mtudisc (inp=0xff00193c53f0, errno=Variable "errno" is not available.
) at tcp_offload.h:282
282 return (tcp_output(tp));
(kgdb) p tp->t_maxseg
$1 = 14336
(kgdb) frame 11
#11 0x807097e8 in tcp_mtudisc (inp=0xff00193c53f0, errno=Variable "errno" is not available.
) at tcp_offload.h:282
282 return (tcp_output(tp));
(kgdb) p tp->t_maxseg
$2 = 14336
...
(full log at http://docs.google.com/Doc?docid=0AeirbkAqk9i7ZGc5Yzc2ZndfNGQ4cWpia2dz&hl=en )
(kgdb) frame 81
#81 0x807097e8 in tcp_mtudisc (inp=0xff00193c53f0, errno=Variable "errno" is not available.
) at tcp_offload.h:282
282 return (tcp_output(tp));
(kgdb) p tp->t_maxseg
$37 = 14336
(kgdb)
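For readers following the kgdb session, here is a toy model of why an unchanged t_maxseg matters. It is only a sketch under assumptions: it supposes that tcp_output() and tcp_mtudisc() call each other in alternation, as the backtrace suggests, and the names simulate, path_mtu and mtudisc_lowers_maxseg are hypothetical. The limit of 1500 is an arbitrary illustrative value; only 14336 comes from the session above. This is not kernel code.

```python
def simulate(maxseg, path_mtu, mtudisc_lowers_maxseg, max_depth=50):
    """Toy model of the tcp_output() <-> tcp_mtudisc() call chain.

    Returns the call depth at which a segment finally fits, or None if
    max_depth frames are reached first (standing in for stack exhaustion).
    """
    depth = 0
    while depth < max_depth:
        depth += 1                  # frame: tcp_output() tries to send
        if maxseg <= path_mtu:
            return depth            # segment fits, the call chain unwinds
        depth += 1                  # frame: tcp_mtudisc() reacts to the failure
        if mtudisc_lowers_maxseg:
            maxseg = path_mtu       # healthy case: shrink to what fits
        # otherwise maxseg is unchanged and the next tcp_output() fails again
    return None


if __name__ == "__main__":
    print("maxseg lowered:  ", simulate(14336, 1500, True))
    print("maxseg unchanged:", simulate(14336, 1500, False))
```

In the model, lowering the segment size ends the chain after a few frames, while leaving it at 14336 never terminates on its own, which would be consistent with the same value showing up in every other frame of the backtrace.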
panic during work with jailed postgresql8.4
Hello,

I get a kernel panic when I connect to a postgresql8.4 server installed in one jail from another jail. It's 100% reproducible.
I have also tried connecting from the host machine to the jailed pg server. That way it works fine, without a crash.

The server configuration uses geli and zfs. Four disks are encrypted using geli, and the raidz2 pool uses the ad8.eli, ad10.eli, ad12.eli and ad14.eli providers. All jails are located on this raidz2 pool.

I also use ezjail for jail management, and it uses NFS to mount the directories with the base system.

Fatal double fault
rip = 0x8063510a
rsp = 0xff80eaec5f50
rbp = 0xff80eaec6040
cpuid = 1; apic id = 02
panic: double fault
cpuid = 1
Uptime: 7m11s
Physical memory: 8169 MB

uname -a
FreeBSD cerberus.regredi.com 8.0-STABLE FreeBSD 8.0-STABLE #7 r206031: Thu Apr 1 13:43:57 EEST 2010
r...@cerberus.regredi.com:/usr/obj/usr/src/sys/GENERIC amd64

Link to dmesg.boot:
http://docs.google.com/leaf?id=0B-irbkAqk9i7OGY2ZWJiODgtOWJmMy00NDQ1LTliZDctZjU3N2YwNmMxNjZl&hl=en

Link to kernel core backtrace:
http://docs.google.com/Doc?docid=0AeirbkAqk9i7ZGc5Yzc2ZndfM2M4NzYydmRw&hl=en

Can I help to track this down by providing additional info?

Thanks.
Re: any hope for nfe/msk?
Hello, Pyun YongHyeon wrote: On Thu, Nov 01, 2007 at 10:59:48AM +0200, Oleg Lomaka wrote: > Hello, > > Pyun YongHyeon wrote: > >On Tue, Oct 30, 2007 at 04:01:04PM +0200, Oleg Lomaka wrote: > > > >[...] > > > > > I had RxFIFO overrun again :( > > > from dmest: > > > msk0: Rx FIFO overrun! > > > >[...] > > > >Please try attached patch again. Sorry for the trouble. > >After applying the patch show me verbosed dmesg output related with > >msk(4)/PHY driver. > > > >Thanks for testing. > > > pcib1: irq 16 at device 28.0 on pci0 > pcib1: domain0 > pcib1: secondary bus 2 > pcib1: subordinate bus 2 > pcib1: I/O decode0x2000-0x2fff > pcib1: memory decode 0xd010-0xd01f > pcib1: no prefetched decode > pci2: on pcib1 > pci2: domain=0, physical bus=2 > found-> vendor=0x11ab, dev=0x4352, revid=0x14 >domain=0, bus=2, slot=0, func=0 >class=02-00-00, hdrtype=0x00, mfdev=0 >cmdreg=0x0007, statreg=0x4010, cachelnsz=16 (dwords) >lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) >intpin=a, irq=11 >powerspec 2 supports D0 D1 D2 D3 current D0 >MSI supports 2 messages, 64 bit >map[10]: type Memory, range 64, base 0xd010, size 14, enabled > pcib1: requested memory range 0xd010-0xd0103fff: good >map[18]: type I/O Port, range 32, base 0x2000, size 8, enabled > pcib1: requested I/O range 0x2000-0x20ff: in range > pcib1: slot 0 INTA routed to irq 16 > mskc0: port 0x2000-0x20ff mem > 0xd010-0xd0103fff irq 16 at device 0.0 on pci2 > mskc0: Reserved 0x4000 bytes for rid 0x10 type 3 at 0xd010 > mskc0: MSI count : 2 > mskc0: RAM buffer size : 4KB > mskc0: Port 0 : Rx Queue 2KB(0x:0x07ff) > mskc0: Port 0 : Tx Queue 2KB(0x0800:0x0fff) > msk0: on mskc0 > msk0: bpf attached > msk0: Ethernet address: 00:1b:24:0e:bc:26 > miibus0: on msk0 > e1000phy0: PHY 0 on miibus0 > e1000phy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto > ioapic0: routing intpin 16 (PCI IRQ 16) to vector 49 > mskc0: [MPSAFE] > mskc0: [FILTER] > So far all looks good to me. 
If you encounter watchdog timeouts or Rx FIFO overruns let me know.

Got it again:
msk0: Rx FIFO overrun!

I believe this happens under heavy CPU usage. I had firefox compiling while viewing pictures on a remote Windows box using rdesktop, and after a few minutes the network froze. But it looks like I didn't get any packet loss :). Take a look at the ping statistics... funny...

tdevil% ping 10.1.1.254
PING 10.1.1.254 (10.1.1.254): 56 data bytes
64 bytes from 10.1.1.254: icmp_seq=0 ttl=64 time=35926.404 ms
64 bytes from 10.1.1.254: icmp_seq=1 ttl=64 time=34925.694 ms
64 bytes from 10.1.1.254: icmp_seq=2 ttl=64 time=33924.729 ms
64 bytes from 10.1.1.254: icmp_seq=3 ttl=64 time=32923.814 ms
64 bytes from 10.1.1.254: icmp_seq=4 ttl=64 time=31922.833 ms
64 bytes from 10.1.1.254: icmp_seq=5 ttl=64 time=30921.878 ms
64 bytes from 10.1.1.254: icmp_seq=6 ttl=64 time=29920.923 ms
64 bytes from 10.1.1.254: icmp_seq=7 ttl=64 time=28919.960 ms
64 bytes from 10.1.1.254: icmp_seq=8 ttl=64 time=27919.009 ms
64 bytes from 10.1.1.254: icmp_seq=9 ttl=64 time=26918.042 ms
64 bytes from 10.1.1.254: icmp_seq=10 ttl=64 time=25917.078 ms
64 bytes from 10.1.1.254: icmp_seq=11 ttl=64 time=24916.115 ms
64 bytes from 10.1.1.254: icmp_seq=12 ttl=64 time=23915.144 ms
64 bytes from 10.1.1.254: icmp_seq=13 ttl=64 time=22914.192 ms
64 bytes from 10.1.1.254: icmp_seq=14 ttl=64 time=21913.214 ms
64 bytes from 10.1.1.254: icmp_seq=15 ttl=64 time=20912.278 ms
64 bytes from 10.1.1.254: icmp_seq=16 ttl=64 time=19911.330 ms
64 bytes from 10.1.1.254: icmp_seq=17 ttl=64 time=18910.375 ms
64 bytes from 10.1.1.254: icmp_seq=18 ttl=64 time=17909.419 ms
64 bytes from 10.1.1.254: icmp_seq=19 ttl=64 time=16853.821 ms
64 bytes from 10.1.1.254: icmp_seq=20 ttl=64 time=15854.710 ms
64 bytes from 10.1.1.254: icmp_seq=21 ttl=64 time=14701.312 ms
64 bytes from 10.1.1.254: icmp_seq=22 ttl=64 time=13701.003 ms
64 bytes from 10.1.1.254: icmp_seq=23 ttl=64 time=12700.052 ms
64 bytes from 10.1.1.254: icmp_seq=24 ttl=64 time=11699.098 ms
64 bytes from 10.1.1.254: icmp_seq=25 ttl=64 time=10698.148 ms
64 bytes from 10.1.1.254: icmp_seq=36 ttl=64 time=0.463 ms
64 bytes from 10.1.1.254: icmp_seq=37 ttl=64 time=0.379 ms
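The ping output above can be read as a queue being flushed all at once: each reply's round-trip time is about one second shorter than the previous one. A rough sketch of that arithmetic follows, assuming ping sent one request per second (its default interval; the actual interval used is not stated in the message). The RTT values are copied from the output; the names samples, SEND_INTERVAL_MS and arrival_times are illustrative only.

```python
# (icmp_seq, round-trip time in ms) pairs copied from the ping output.
samples = [
    (0, 35926.404), (1, 34925.694), (2, 33924.729),
    (18, 17909.419), (19, 16853.821), (21, 14701.312), (25, 10698.148),
]

SEND_INTERVAL_MS = 1000.0  # assumption: default one-second ping interval


def arrival_times(samples, interval_ms=SEND_INTERVAL_MS):
    """Estimate when each reply arrived, relative to the first request."""
    return [seq * interval_ms + rtt for seq, rtt in samples]


arrivals = arrival_times(samples)
spread_ms = max(arrivals) - min(arrivals)
print(f"replies sent up to 25 s apart arrived within {spread_ms:.0f} ms of each other")
```

Under that assumption, requests spread over 25 seconds were all answered within roughly a quarter of a second of each other, which fits the picture of the interface stalling for about 36 seconds and then delivering everything that had been queued.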
Re: any hope for nfe/msk?
Hello, Pyun YongHyeon wrote: On Tue, Oct 30, 2007 at 04:01:04PM +0200, Oleg Lomaka wrote: [...] > I had RxFIFO overrun again :( > from dmest: > msk0: Rx FIFO overrun! [...] Please try attached patch again. Sorry for the trouble. After applying the patch show me verbosed dmesg output related with msk(4)/PHY driver. Thanks for testing. pcib1: irq 16 at device 28.0 on pci0 pcib1: domain0 pcib1: secondary bus 2 pcib1: subordinate bus 2 pcib1: I/O decode0x2000-0x2fff pcib1: memory decode 0xd010-0xd01f pcib1: no prefetched decode pci2: on pcib1 pci2: domain=0, physical bus=2 found-> vendor=0x11ab, dev=0x4352, revid=0x14 domain=0, bus=2, slot=0, func=0 class=02-00-00, hdrtype=0x00, mfdev=0 cmdreg=0x0007, statreg=0x4010, cachelnsz=16 (dwords) lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) intpin=a, irq=11 powerspec 2 supports D0 D1 D2 D3 current D0 MSI supports 2 messages, 64 bit map[10]: type Memory, range 64, base 0xd010, size 14, enabled pcib1: requested memory range 0xd010-0xd0103fff: good map[18]: type I/O Port, range 32, base 0x2000, size 8, enabled pcib1: requested I/O range 0x2000-0x20ff: in range pcib1: slot 0 INTA routed to irq 16 mskc0: port 0x2000-0x20ff mem 0xd010-0xd0103fff irq 16 at device 0.0 on pci2 mskc0: Reserved 0x4000 bytes for rid 0x10 type 3 at 0xd010 mskc0: MSI count : 2 mskc0: RAM buffer size : 4KB mskc0: Port 0 : Rx Queue 2KB(0x:0x07ff) mskc0: Port 0 : Tx Queue 2KB(0x0800:0x0fff) msk0: on mskc0 msk0: bpf attached msk0: Ethernet address: 00:1b:24:0e:bc:26 miibus0: on msk0 e1000phy0: PHY 0 on miibus0 e1000phy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto ioapic0: routing intpin 16 (PCI IRQ 16) to vector 49 mskc0: [MPSAFE] mskc0: [FILTER] ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Re: any hope for nfe/msk?
Pyun YongHyeon wrote:
On Thu, Oct 25, 2007 at 05:30:32PM +0900, To Oleg Lomaka wrote:

[...]

> > tdevil% grep -iE "msk|phy" /var/run/dmesg.boot
> > pci0: domain=0, physical bus=0
> > pci2: domain=0, physical bus=2
> > mskc0: port 0x2000-0x20ff mem
> > 0xd010-0xd0103fff irq 16 at device 0.0 on pci2
> > mskc0: Reserved 0x4000 bytes for rid 0x10 type 3 at 0xd010
> > mskc0: MSI count : 2
> > mskc0: RAM buffer size : 16KB
> > mskc0: Port 0 : Rx Queue 10KB(0x:0x27ff)
> > mskc0: Port 0 : Tx Queue 10KB(0x2800:0x4fff)
> > msk0: on mskc0
> > msk0: bpf attached
> > msk0: Ethernet address: 00:1b:24:0e:bc:26
> > miibus0: on msk0
> > e1000phy0: PHY 0 on miibus0
> > e1000phy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
> > ukphy0: PHY 3 on miibus0
> > ukphy0: OUI 0x001000, model 0x0004, rev. 0
> > ukphy0: no media present
> > ukphy1: PHY 6 on miibus0
> > ukphy1: OUI 0x004400, model 0x0011, rev. 0
> > ukphy1: no media present
> > mskc0: [MPSAFE]
> > mskc0: [FILTER]
> > pci3: domain=0, physical bus=3
> > pci4: domain=0, physical bus=4
> > pci5: domain=0, physical bus=5
> > pci10: domain=0, physical bus=10
> >
>
> Thanks for the info. Would please try attached patch?
>

Any progress here?
I guess it's very important to fix the bug as it would affect all
Yukon FE based NIC.

I applied your patch again yesterday. There have been no halts for a few hours so far (after a ports cvs up and other network/CPU load). I'll let you know in a day or two if there are no further troubles.
Thanks for your help.

--
Oleg Lomaka, System Administrator
Kiev Zoral Development Center
Tel: +380-44-4928018
ALEK-RIPE, ALEK-UANIC
Re: any hope for nfe/msk?
Pyun YongHyeon wrote: On Tue, Oct 30, 2007 at 10:42:33AM +0200, Oleg Lomaka wrote: > Pyun YongHyeon wrote: > >On Thu, Oct 25, 2007 at 05:30:32PM +0900, To Oleg Lomaka wrote: > > > >[...] > > > > > > tdevil% grep -iE "msk|phy" /var/run/dmesg.boot > > > > pci0: domain=0, physical bus=0 > > > > pci2: domain=0, physical bus=2 > > > > mskc0: port 0x2000-0x20ff > > mem > > 0xd010-0xd0103fff irq 16 at device 0.0 on pci2 > > > > mskc0: Reserved 0x4000 bytes for rid 0x10 type 3 at 0xd010 > > > > mskc0: MSI count : 2 > > > > mskc0: RAM buffer size : 16KB > > > > mskc0: Port 0 : Rx Queue 10KB(0x:0x27ff) > > > > mskc0: Port 0 : Tx Queue 10KB(0x2800:0x4fff) > > > > msk0: on > > mskc0 > > > > msk0: bpf attached > > > > msk0: Ethernet address: 00:1b:24:0e:bc:26 > > > > miibus0: on msk0 > > > > e1000phy0: PHY 0 on > > miibus0 > > > > e1000phy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto > > > > ukphy0: PHY 3 on miibus0 > > > > ukphy0: OUI 0x001000, model 0x0004, rev. 0 > > > > ukphy0: no media present > > > > ukphy1: PHY 6 on miibus0 > > > > ukphy1: OUI 0x004400, model 0x0011, rev. 0 > > > > ukphy1: no media present > > > > mskc0: [MPSAFE] > > > > mskc0: [FILTER] > > > > pci3: domain=0, physical bus=3 > > > > pci4: domain=0, physical bus=4 > > > > pci5: domain=0, physical bus=5 > > > > pci10: domain=0, physical bus=10 > > > > > > > > > > Thanks for the info. Would please try attached patch? > > > > > > >Any progress here? > >I guess it's very important to fix the bug as it would affect all > >Yukon FE based NIC. > > > > > I've applied your patch again yesterday. There was no halts for few > hours already (after ports cvs up and other network/cpu loads). I'll > give you a note in a day or two if there will no be any troubles. > Thanks for your help. > Glad to hear that. Would you show me the verbosed boot messages related with msk(4)? According to your dmesg output I guess you have phantom PHYs attached to msk(4) too. So I'd also like to know the output of "devinfo -rv". 
I had an Rx FIFO overrun again :(

from dmesg:
msk0: Rx FIFO overrun!
pid 1245 (gnome-vfs-daemon), uid 1001: exited on signal 11
msk0: watchdog timeout (missed Tx interrupts) -- recovering

from boot log:
pci2: on pcib1
pci2: domain=0, physical bus=2
found-> vendor=0x11ab, dev=0x4352, revid=0x14
 domain=0, bus=2, slot=0, func=0
 class=02-00-00, hdrtype=0x00, mfdev=0
 cmdreg=0x0007, statreg=0x4010, cachelnsz=16 (dwords)
 lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns)
 intpin=a, irq=11
 powerspec 2 supports D0 D1 D2 D3 current D0
 MSI supports 2 messages, 64 bit
 map[10]: type Memory, range 64, base 0xd010, size 14, enabled
pcib1: requested memory range 0xd010-0xd0103fff: good
 map[18]: type I/O Port, range 32, base 0x2000, size 8, enabled
pcib1: requested I/O range 0x2000-0x20ff: in range
pcib1: slot 0 INTA routed to irq 16
mskc0: port 0x2000-0x20ff mem 0xd010-0xd0103fff irq 16 at device 0.0 on pci2
mskc0: Reserved 0x4000 bytes for rid 0x10 type 3 at 0xd010
mskc0: MSI count : 2
mskc0: RAM buffer size : 4KB
mskc0: Port 0 : Rx Queue 10KB(0x:0x27ff)
mskc0: Port 0 : Tx Queue -6KB(0x2800:0x0fff)
msk0: on mskc0
msk0: bpf attached
msk0: Ethernet address: 00:1b:24:0e:bc:26
miibus0: on msk0
e1000phy0: PHY 0 on miibus0
e1000phy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
ukphy0: PHY 3 on miibus0
ukphy0: OUI 0x001000, model 0x0004, rev. 0
ukphy0: no media present
ukphy1: PHY 6 on miibus0
ukphy1: OUI 0x004400, model 0x0011, rev. 0
ukphy1: no media present
ioapic0: routing intpin 16 (PCI IRQ 16) to vector 49
mskc0: [MPSAFE]
mskc0: [FILTER]
pcib2: irq 17 at device 28.1 on pci0
pcib2: domain0
pcib2: secondary bus 3
pcib2: subordinate bus 3
pcib2: I/O decode 0xf000-0xfff
pcib2: memory decode 0xd000-0xd

and devinfo:
tdevil% devinfo -rv
nexus0
  cryptosoft0
  apic0
      I/O memory addresses:
          0xfec0-0xfec0001f
          0xfee0-0xfee003ff
  legacy0
    cpu0
    pcib0
      pci0
        hostb0 pnpinfo vendor=0x8086 device=0x27a0 subvendor=0x1025 subdevice=0x0110 class=0x06 at slot=0 function=0
        vgapci0 pnpinfo vendor=0x8086 device=0x27a2 subvendor=0x1025 subdevice=0x0110 class=0x03 at slot=2 function=0
            I/O ports:
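The "Tx Queue -6KB(0x2800:0x0fff)" line in the boot log stands out: the RAM buffer is reported as 4KB, yet the Tx queue still starts at 0x2800, right after a 10KB Rx queue. A small sketch of the arithmetic, assuming the reported size is simply the length of the inclusive start:end range in KB (that is an assumption about how the message is computed, not something taken from the msk(4) source); queue_kb is a hypothetical helper.

```python
def queue_kb(start, end):
    """Size in KB of the inclusive address range [start, end]."""
    return (end - start + 1) // 1024


# Ranges copied from the mskc0 boot messages quoted in this thread.
print("Tx queue, 4KB buffer, this boot log:", queue_kb(0x2800, 0x0FFF), "KB")
print("Tx queue, 16KB buffer, earlier log: ", queue_kb(0x2800, 0x4FFF), "KB")
print("Tx queue, 4KB buffer, later log:    ", queue_kb(0x0800, 0x0FFF), "KB")
```

Under that assumption, the end address 0x0fff matches a 4KB buffer while the start address 0x2800 matches the 16KB layout, so the range runs backwards and the size comes out negative. The later boot log in the thread (Tx queue 0x0800:0x0fff) gives a consistent 2KB.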
Re: any hope for nfe/msk?
Pyun YongHyeon wrote: On Thu, Oct 25, 2007 at 09:59:15AM +0300, Oleg Lomaka wrote: > Hello, > > Pyun YongHyeon wrote: > >On Wed, Oct 24, 2007 at 05:12:44PM +0300, Oleg Lomaka wrote: > > > Pyun YongHyeon wrote: > > > >On Wed, Oct 24, 2007 at 09:33:48AM +0200, Danny Braniss wrote: > > > > > Hi, > > > > > these drivers don't work under 7.0 > > > > > As soon as some mild preasure is applied, they start loosing > > > > interrupts, and > > > > > in my case the hosts come to a total stand-still, since they are > > > > diskless > > > > > and rely on the network. > > > > > This happens at 1gb and at 100mg. > > > > > > > > > > Maybe the problem is with the shared interrups? > > > > > > > > > > irq16: mskc0 uhci0 3308351 13 > > > > > or > > > > > irq21: nfe0 ohci01584415 24 > > > > > > > > > > but I have no idea how to uncouple this > > > > > > > > > > > > >If you see watchdog timeout errors on your console, shared interrupt > > > >would be culprit. > > > >For msk(4) set hw.msk.legacy_intr="1" in loader.conf or use kenv(1) > > > >to set it before loading msk(4) kernel module. > > > >For nfe(4) you can switch to polling(4). > > > > > > > > > > > I have some msk troubles too. On my laptop (acer travelmate 2483wxmi) > > > under heavy cpu & network load msk periodically stops working for few > > > minutes. > > > >If that happens msk(4) recover from the non-working state? > > > Yes, some times in few seconds, some times in 5 - 10 minutes, but always > recovers. > > > sysctl -a|grep msk > > > <118>msk0: no link ... > > > <118>DHCPREQUEST on msk0 to 255.255.255.255 port 67 > > > <118>DHCPREQUEST on msk0 to 255.255.255.255 port 67 > > > <118>DHCPDISCOVER on msk0 to 255.255.255.255 port 67 interval 3 > > > <118>DHCPREQUEST on msk0 to 255.255.255.255 port 67 > > > <118>msk0: flags=8843 metric 0 > > > mtu 1500 > > > msk0: watchdog timeout (missed Tx interrupts) -- recovering > > > msk0: watchdog timeout (missed Tx interrupts) -- recovering > > > msk0: Rx FIFO overrun! 
> > > >This looks bad. Would you show me verbosed boot messages related with > >msk(4) and PHY driver as well as "vmstat -i" output. > > > > > Here are values from just booted laptop. If it will halt msk today > again, I'll resend. > > tdevil% vmstat -i > interrupt total rate > irq1: atkbd03275 1 > irq12: psm011157 6 > irq14: ata022500 13 > irq15: ata1 85 0 > irq16: mskc0 uhci+ 17334 10 > irq18: uhci2 1 0 > irq22: pcm046530 27 > irq23: uhci0 ehci0 95882 57 > cpu0: timer 3322705 1999 > Total3519469 2117 > > > tdevil% grep -iE "msk|phy" /var/run/dmesg.boot > pci0: domain=0, physical bus=0 > pci2: domain=0, physical bus=2 > mskc0: port 0x2000-0x20ff mem > 0xd010-0xd0103fff irq 16 at device 0.0 on pci2 > mskc0: Reserved 0x4000 bytes for rid 0x10 type 3 at 0xd010 > mskc0: MSI count : 2 > mskc0: RAM buffer size : 16KB > mskc0: Port 0 : Rx Queue 10KB(0x:0x27ff) > mskc0: Port 0 : Tx Queue 10KB(0x2800:0x4fff) > msk0: on mskc0 > msk0: bpf attached > msk0: Ethernet address: 00:1b:24:0e:bc:26 > miibus0: on msk0 > e1000phy0: PHY 0 on miibus0 > e1000phy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto > ukphy0: PHY 3 on miibus0 > ukphy0: OUI 0x001000, model 0x0004, rev. 0 > ukphy0: no media present > ukphy1: PHY 6 on miibus0 > ukphy1: OUI 0x004400, model 0x0011, rev. 0 > ukphy1: no media present > mskc0: [MPSAFE] > mskc0: [FILTER] > pci3: domain=0, physical bus=3 > pci4: domain=0, physical bus=4 > pci5: domain=0, physical bus=5 > pci10: domain=0, physical bus=10 > Thanks for the info. Would please try attached patch? After kldunload/kldload i've got following and had to revert to original one (1.1
Re: any hope for nfe/msk?
Hello, Pyun YongHyeon wrote: On Wed, Oct 24, 2007 at 05:12:44PM +0300, Oleg Lomaka wrote: > Pyun YongHyeon wrote: > >On Wed, Oct 24, 2007 at 09:33:48AM +0200, Danny Braniss wrote: > > > Hi, > > > these drivers don't work under 7.0 > > > As soon as some mild preasure is applied, they start loosing > > interrupts, and > > > in my case the hosts come to a total stand-still, since they are > > diskless > > > and rely on the network. > > > This happens at 1gb and at 100mg. > > > > > > Maybe the problem is with the shared interrups? > > > > > > irq16: mskc0 uhci0 3308351 13 > > > or > > > irq21: nfe0 ohci01584415 24 > > > > > > but I have no idea how to uncouple this > > > > > > >If you see watchdog timeout errors on your console, shared interrupt > >would be culprit. > >For msk(4) set hw.msk.legacy_intr="1" in loader.conf or use kenv(1) > >to set it before loading msk(4) kernel module. > >For nfe(4) you can switch to polling(4). > > > > > I have some msk troubles too. On my laptop (acer travelmate 2483wxmi) > under heavy cpu & network load msk periodically stops working for few > minutes. If that happens msk(4) recover from the non-working state? Yes, some times in few seconds, some times in 5 - 10 minutes, but always recovers. > sysctl -a|grep msk > <118>msk0: no link ... > <118>DHCPREQUEST on msk0 to 255.255.255.255 port 67 > <118>DHCPREQUEST on msk0 to 255.255.255.255 port 67 > <118>DHCPDISCOVER on msk0 to 255.255.255.255 port 67 interval 3 > <118>DHCPREQUEST on msk0 to 255.255.255.255 port 67 > <118>msk0: flags=8843 metric 0 > mtu 1500 > msk0: watchdog timeout (missed Tx interrupts) -- recovering > msk0: watchdog timeout (missed Tx interrupts) -- recovering > msk0: Rx FIFO overrun! This looks bad. Would you show me verbosed boot messages related with msk(4) and PHY driver as well as "vmstat -i" output. Here are values from just booted laptop. If it will halt msk today again, I'll resend. 
tdevil% vmstat -i interrupt total rate irq1: atkbd03275 1 irq12: psm011157 6 irq14: ata022500 13 irq15: ata1 85 0 irq16: mskc0 uhci+ 17334 10 irq18: uhci2 1 0 irq22: pcm046530 27 irq23: uhci0 ehci0 95882 57 cpu0: timer 3322705 1999 Total3519469 2117 tdevil% grep -iE "msk|phy" /var/run/dmesg.boot pci0: domain=0, physical bus=0 pci2: domain=0, physical bus=2 mskc0: port 0x2000-0x20ff mem 0xd010-0xd0103fff irq 16 at device 0.0 on pci2 mskc0: Reserved 0x4000 bytes for rid 0x10 type 3 at 0xd010 mskc0: MSI count : 2 mskc0: RAM buffer size : 16KB mskc0: Port 0 : Rx Queue 10KB(0x:0x27ff) mskc0: Port 0 : Tx Queue 10KB(0x2800:0x4fff) msk0: on mskc0 msk0: bpf attached msk0: Ethernet address: 00:1b:24:0e:bc:26 miibus0: on msk0 e1000phy0: PHY 0 on miibus0 e1000phy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto ukphy0: PHY 3 on miibus0 ukphy0: OUI 0x001000, model 0x0004, rev. 0 ukphy0: no media present ukphy1: PHY 6 on miibus0 ukphy1: OUI 0x004400, model 0x0011, rev. 0 ukphy1: no media present mskc0: [MPSAFE] mskc0: [FILTER] pci3: domain=0, physical bus=3 pci4: domain=0, physical bus=4 pci5: domain=0, physical bus=5 pci10: domain=0, physical bus=10 > msk0: watchdog timeout (missed Tx interrupts) -- recovering > msk0: watchdog timeout (missed Tx interrupts) -- recovering > msk0: watchdog timeout (missed Tx interrupts) -- recovering > dev.mskc.0.%desc: Marvell Yukon 88E8038 Gigabit Ethernet > dev.mskc.0.%driver: mskc > dev.mskc.0.%location: slot=0 function=0 > dev.mskc.0.%pnpinfo: vendor=0x11ab device=0x4352 subvendor=0x1025 > subdevice=0x0110 class=0x02 > dev.mskc.0.%parent: pci2 > dev.mskc.0.process_limit: 128 > dev.msk.0.%desc: Marvell Technology Group Ltd. Yukon FE Id 0xb7 Rev 0x01 > dev.msk.0.%driver: msk > dev.msk.0.%parent: mskc0 > dev.miibus.0.%parent: msk0 > > Not sure if it is connected to previous issue. 
>
> uname -a
> FreeBSD tdevil.lomaka.org.ua 7.0-BETA1 FreeBSD 7.0-BETA1 #0: Mon Oct 22 18:32:01 EEST 2007
> [EMAIL PROTECTED]:/usr/obj/usr/src/sys/TDEVIL-7.kernconf i386
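Since the thread keeps coming back to shared interrupts, here is a minimal sketch of how the "vmstat -i" output could be scanned for interrupt lines that carry more than one device. Only rows that survived intact in the message are used as sample input. The helper name shared_irqs is hypothetical, and reading a trailing "+" as "the device list was truncated" is an assumption.

```python
import re

# Rows copied from the "vmstat -i" output quoted in the thread.
VMSTAT_LINES = """\
irq15: ata1 85 0
irq16: mskc0 uhci+ 17334 10
irq18: uhci2 1 0
irq23: uhci0 ehci0 95882 57
"""


def shared_irqs(text):
    """Return {irq: [devices]} for interrupt lines with more than one device."""
    shared = {}
    for line in text.splitlines():
        # Layout: "irqN: dev [dev ...] total rate"
        match = re.match(r"(irq\d+):\s+(.*?)\s+(\d+)\s+(\d+)$", line.strip())
        if not match:
            continue
        irq, devices = match.group(1), match.group(2).split()
        # Assumption: a trailing '+' marks a device list that was cut short.
        if len(devices) > 1 or any(d.endswith("+") for d in devices):
            shared[irq] = devices
    return shared


for irq, devices in shared_irqs(VMSTAT_LINES).items():
    print(irq, "is shared by", ", ".join(devices))
```

On the sample rows this flags irq16 (mskc0 together with a uhci controller) and irq23, which is the sharing discussed in the messages above.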
Re: any hope for nfe/msk?
Pyun YongHyeon wrote: On Wed, Oct 24, 2007 at 09:33:48AM +0200, Danny Braniss wrote: > Hi, > these drivers don't work under 7.0 > As soon as some mild preasure is applied, they start loosing interrupts, and > in my case the hosts come to a total stand-still, since they are diskless > and rely on the network. > This happens at 1gb and at 100mg. > > Maybe the problem is with the shared interrups? > > irq16: mskc0 uhci0 3308351 13 > or > irq21: nfe0 ohci01584415 24 > > but I have no idea how to uncouple this > If you see watchdog timeout errors on your console, shared interrupt would be culprit. For msk(4) set hw.msk.legacy_intr="1" in loader.conf or use kenv(1) to set it before loading msk(4) kernel module. For nfe(4) you can switch to polling(4). I have some msk troubles too. On my laptop (acer travelmate 2483wxmi) under heavy cpu & network load msk periodically stops working for few minutes. sysctl -a|grep msk <118>msk0: no link ... <118>DHCPREQUEST on msk0 to 255.255.255.255 port 67 <118>DHCPREQUEST on msk0 to 255.255.255.255 port 67 <118>DHCPDISCOVER on msk0 to 255.255.255.255 port 67 interval 3 <118>DHCPREQUEST on msk0 to 255.255.255.255 port 67 <118>msk0: flags=8843 metric 0 mtu 1500 msk0: watchdog timeout (missed Tx interrupts) -- recovering msk0: watchdog timeout (missed Tx interrupts) -- recovering msk0: Rx FIFO overrun! msk0: watchdog timeout (missed Tx interrupts) -- recovering msk0: watchdog timeout (missed Tx interrupts) -- recovering msk0: watchdog timeout (missed Tx interrupts) -- recovering dev.mskc.0.%desc: Marvell Yukon 88E8038 Gigabit Ethernet dev.mskc.0.%driver: mskc dev.mskc.0.%location: slot=0 function=0 dev.mskc.0.%pnpinfo: vendor=0x11ab device=0x4352 subvendor=0x1025 subdevice=0x0110 class=0x02 dev.mskc.0.%parent: pci2 dev.mskc.0.process_limit: 128 dev.msk.0.%desc: Marvell Technology Group Ltd. Yukon FE Id 0xb7 Rev 0x01 dev.msk.0.%driver: msk dev.msk.0.%parent: mskc0 dev.miibus.0.%parent: msk0 Not sure if it is connected to previous issue. 
uname -a
FreeBSD tdevil.lomaka.org.ua 7.0-BETA1 FreeBSD 7.0-BETA1 #0: Mon Oct 22 18:32:01 EEST 2007
[EMAIL PROTECTED]:/usr/obj/usr/src/sys/TDEVIL-7.kernconf i386