Re: important NFS client patch for FreeBSD8.n
Greetings, and thank you for the "heads up".

On Mon, January 10, 2011 2:22 pm, Rick Macklem wrote:
> I just commited a patch (r217242) to head. Anyone who is using client
> side NFS on FreeBSD8.n should apply this patch. It is also available at:
> http://people.freebsd.org/~rmacklem/krpc.patch
>
> It fixes a problem where the kernel rpc assumes that 4 bytes of data
> exists in the first mbuf without checking. If the data straddles multiple
> mbufs, it uses garbage and then a typical case will wedge for a minute or
> so until it times out and establishes a new TCP connection. It also
> replaces m_pullup() with m_copydata(), since m_pullup() can fail for rare
> cases when there is data available. (m_pullup() uses MGET(, M_DONTWAIT,)
> which can fail when mbuf allocation is constrainted, for example.)
>
> Thanks to john.gemignani at isilon.com for spotting this problem, rick

I just fired a message off to @amd64 && @net because I am seeing
messages like:

nfe0: tx v2 error 0x6204

on a recent 8.1/amd64 install which is connected to an 8.0/i386 via NFS.
They both run NFS client && server, and they both utilize mount points on
each other. They are only 2 of several interconnected servers. The others
are all 7x/i386. But I only see these messages on the 8.1/amd64, and only
when it is connected to, and utilizing mounts on, the 8.0/i386, and even
then, only when the data exceeds ~1.5Mb.

I guess I'm asking if the messages I'm receiving are related to the
corrections your patch provides, or should I keep looking for the cause
of the messages I am seeing?

Thank you for all your time and consideration.

--Chris

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: ZFS - hot spares : automatic or not?
On 1/4/2011 11:52 AM, John Hawkes-Reed wrote:
> On 04/01/2011 03:08, Dan Langille wrote:
>> Hello folks,
>>
>> I'm trying to discover if ZFS under FreeBSD will automatically pull in a
>> hot spare if one is required.
>>
>> This raised the issue back in March 2010, and refers to a PR opened in
>> May 2009
>>
>> * http://lists.freebsd.org/pipermail/freebsd-fs/2010-March/007943.html
>> * http://www.freebsd.org/cgi/query-pr.cgi?pr=134491
>>
>> In turn, the PR refers to this March 2010 post referring to using devd
>> to accomplish this task.
>>
>> http://lists.freebsd.org/pipermail/freebsd-stable/2010-March/055686.html
>>
>> Does the above represent the the current state?
>>
>> I ask because I just ordered two more HDD to use as spares. Whether they
>> sit on the shelf or in the box is open to discussion.
>
> As far as our testing could discover, it's not automatic.
>
> I wrote some Ugly Perl that's called by devd when it spots a drive-fail
> event, which seemed to DTRT when simulating a failure by pulling a drive.

Without such a script, what is the value in creating hot spares?

--
Dan Langille - http://langille.org/
Re: Enabling DDB prevent kernel from panicing
On Mon, Jan 10, 2011 at 9:13 PM, Jeremy Chadwick wrote:
> On Mon, Jan 10, 2011 at 07:42:21PM -0500, Mark Saad wrote:
>> On Mon, Jan 10, 2011 at 6:59 PM, wrote:
>> > Hello, Mark
>> >
>> > 2011/1/11 Mark Saad :
>> >> All
>> >> This was originally posted to hackers@
>> >>
>> >> I have a good question that I cant find an answer for. I believe
>> >> found a kernel bug in 7.3-RELEASE that prevents me from booting 64-bit
>> >> kernels on HP's DL360 G4p . The kernel dies with "Fatal trap 12: page
>> >> fault while in kernel mode " . The hardware works fine in 7.2-RELEASE
>> >> amd64, 7.1-RELEASE amd64, and 6.4-RELEASE amd64 .
>> >>
>> >> In 7.3-RELEASE amd64 I can not boot from cd or pxe correctly using the
>> >> stock 7.3-RELEASE amd64 kernel however i386 works fine. To see if this
>> >> issue was some how fixed in 7.3-RELEASE-p4 amd64 I rebuilt a GENERIC
>> >> kernel using patches sources and tried to boot and I got the same
>> >> crash.
>> >>
>> >> Next I rebuilt the kernel with KDB and DDB to see if I could get a
>> >> core-dump of the system. I also set loader.conf to
>> >>
>> >> kernel="kernel.DEBUG"
>> >> kern.dumpdev="/dev/da0s1b"
>> >>
>> >> Next I pxebooted the box and the system does not crash on boot up, it
>> >> will easily load a nfs root and work fine. So I copied my debug
>> >> kernel, and loader.conf to the local disk and rebooted and it boots
>> >> fine from the local disk .
>> >
>> > Looks like a race condition.
>> > Well, you don't need to compile KDB and DDB, just add
>> >
>> > makeoptions DEBUG=-g
>> >
>> > into your kernel config file and rebuild kernel.
>> >
>> > Then after you got a crash dump you can easy debug it (see FreeBSD
>> > Developers Handbok):
>> > http://www.freebsd.org/doc/en/books/developers-handbook/kerneldebug-gdb.html
>> >
>> >
>> > wbr,
>> > Nickolas
>> >
>>
>> Sorry let me clarify the issue, When you install a generic
>> 7.3-RELEASE amd64 on some of the HP servers I use, the kernel panics
>> in boot up
>> when it probes the sio driver . Here is a part of my dmesg.boot file
>>
>> atkbd0: [ITHREAD]
>> psm0: irq 12 on atkbdc0
>> psm0: [GIANT-LOCKED]
>> psm0: [ITHREAD]
>> psm0: model Generic PS/2 mouse, device ID 0
>> sio0: configured irq 4 not in bitmap of probed irqs 0
>> sio0: port may not be enabled
>> sio0: configured irq 4 not in bitmap of probed irqs 0
>> sio0: port may not be enabled
>> sio0: port 0x3f8-0x3ff irq 4 on acpi0
>> sio0: type 16550A
>> sio0: [FILTER]
>> Say about here in the boot up , is where the box crashes with the
>> above noted error.
>>
>> If I then boot the same box off a 7.1-RELEASE amd64 netboot server ,
>> mount the local disks of the 7.3-RELEASE install and edit the
>> /boot/device.hints and comment out the sio hints like this
>>
>> hint.vga.0.at="isa"
>> hint.sc.0.at="isa"
>> hint.sc.0.flags="0x100"
>> #hint.sio.0.at="isa"
>> #hint.sio.0.port="0x3F8"
>> #hint.sio.0.flags="0x10"
>> #hint.sio.0.irq="4"
>> #hint.sio.1.at="isa"
>> #hint.sio.1.port="0x2F8"
>> #hint.sio.1.irq="3"
>> #hint.sio.2.at="isa"
>> #hint.sio.2.disabled="1"
>> #hint.sio.2.port="0x3E8"
>> #hint.sio.2.irq="5"
>> #hint.sio.3.at="isa"
>> #hint.sio.3.disabled="1"
>> #hint.sio.3.port="0x2E8"
>> #hint.sio.3.irq="9"
>> hint.ppc.0.at="isa"
>> hint.ppc.0.irq="7"
>>
>> then boot the server off the local disks , the server boots correctly.
>>
>> The odd thing was, I rebuilt a debug 7.3-RELEASE amd64 kernel on
>> another working server, and installed it on the broken server and
>> booted it off the local disks, with out any changes to the hints file
>> and the server booted correctly and I was able to manually break out
>> into the debugger , but nothing looked wrong .
>
> The sio(4) driver has been deprecated in RELENG_8, which uses uart(4).
> uart(4) is better in a lot of regards, and should also be available for
> use on RELENG_7 but you'll need to adjust /etc/ttys to refer to the new
> device names (ttyuX vs. ttydX), plus add the uart entries to
> /boot/device.hints.
>

I found that too, and I was thinking about the change, but it's going to
require a source build of the kernel to fix that, along with a bunch of
manual work on my side that I would rather not do.

> I'm mentioning this as a workaround.
>
> Also worth considering is that the sio(4) ISA probe may be touching
> something Bad(tm) as a result, so you might try adding the following
> lines to your loader.conf (not a typo) to disable sio(4) entries
> entirely:
>
> hint.sio.0.disabled="1"
> hint.sio.1.disabled="1"
>
> And see if that improves things. If it does, remove the sio.1.disabled
> entry and see if that suffices.

I'll try disabling the hints, but how is that different from removing the
hint outright?

>
>> So to sum this up there is something broken in 7.3-RELEASE but I cant
>> figure out what. This server works with a generic install of
>> 7.1-RELEASE 7.2-RELEASE , 6.1-RELEASE, 6.2-RELEASE and 6.4-RELEASE in
>> both amd64 and i386 , but not 7.3-RELEASE in amd64 . It also worked in
>> 7.4-RC1 .
>>
>> avg recomme
Re: Enabling DDB prevent kernel from panicing
On Mon, Jan 10, 2011 at 07:42:21PM -0500, Mark Saad wrote:
> On Mon, Jan 10, 2011 at 6:59 PM, wrote:
> > Hello, Mark
> >
> > 2011/1/11 Mark Saad :
> >> All
> >> This was originally posted to hackers@
> >>
> >> I have a good question that I cant find an answer for. I believe
> >> found a kernel bug in 7.3-RELEASE that prevents me from booting 64-bit
> >> kernels on HP's DL360 G4p . The kernel dies with "Fatal trap 12: page
> >> fault while in kernel mode " . The hardware works fine in 7.2-RELEASE
> >> amd64, 7.1-RELEASE amd64, and 6.4-RELEASE amd64 .
> >>
> >> In 7.3-RELEASE amd64 I can not boot from cd or pxe correctly using the
> >> stock 7.3-RELEASE amd64 kernel however i386 works fine. To see if this
> >> issue was some how fixed in 7.3-RELEASE-p4 amd64 I rebuilt a GENERIC
> >> kernel using patches sources and tried to boot and I got the same
> >> crash.
> >>
> >> Next I rebuilt the kernel with KDB and DDB to see if I could get a
> >> core-dump of the system. I also set loader.conf to
> >>
> >> kernel="kernel.DEBUG"
> >> kern.dumpdev="/dev/da0s1b"
> >>
> >> Next I pxebooted the box and the system does not crash on boot up, it
> >> will easily load a nfs root and work fine. So I copied my debug
> >> kernel, and loader.conf to the local disk and rebooted and it boots
> >> fine from the local disk .
> >
> > Looks like a race condition.
> > Well, you don't need to compile KDB and DDB, just add
> >
> > makeoptions DEBUG=-g
> >
> > into your kernel config file and rebuild kernel.
> >
> > Then after you got a crash dump you can easy debug it (see FreeBSD
> > Developers Handbok):
> > http://www.freebsd.org/doc/en/books/developers-handbook/kerneldebug-gdb.html
> >
> >
> > wbr,
> > Nickolas
> >
>
> Sorry let me clarify the issue, When you install a generic
> 7.3-RELEASE amd64 on some of the HP servers I use, the kernel panics
> in boot up
> when it probes the sio driver . Here is a part of my dmesg.boot file
>
> atkbd0: [ITHREAD]
> psm0: irq 12 on atkbdc0
> psm0: [GIANT-LOCKED]
> psm0: [ITHREAD]
> psm0: model Generic PS/2 mouse, device ID 0
> sio0: configured irq 4 not in bitmap of probed irqs 0
> sio0: port may not be enabled
> sio0: configured irq 4 not in bitmap of probed irqs 0
> sio0: port may not be enabled
> sio0: port 0x3f8-0x3ff irq 4 on acpi0
> sio0: type 16550A
> sio0: [FILTER]
> Say about here in the boot up , is where the box crashes with the
> above noted error.
>
> If I then boot the same box off a 7.1-RELEASE amd64 netboot server ,
> mount the local disks of the 7.3-RELEASE install and edit the
> /boot/device.hints and comment out the sio hints like this
>
> hint.vga.0.at="isa"
> hint.sc.0.at="isa"
> hint.sc.0.flags="0x100"
> #hint.sio.0.at="isa"
> #hint.sio.0.port="0x3F8"
> #hint.sio.0.flags="0x10"
> #hint.sio.0.irq="4"
> #hint.sio.1.at="isa"
> #hint.sio.1.port="0x2F8"
> #hint.sio.1.irq="3"
> #hint.sio.2.at="isa"
> #hint.sio.2.disabled="1"
> #hint.sio.2.port="0x3E8"
> #hint.sio.2.irq="5"
> #hint.sio.3.at="isa"
> #hint.sio.3.disabled="1"
> #hint.sio.3.port="0x2E8"
> #hint.sio.3.irq="9"
> hint.ppc.0.at="isa"
> hint.ppc.0.irq="7"
>
> then boot the server off the local disks , the server boots correctly.
>
> The odd thing was, I rebuilt a debug 7.3-RELEASE amd64 kernel on
> another working server, and installed it on the broken server and
> booted it off the local disks, with out any changes to the hints file
> and the server booted correctly and I was able to manually break out
> into the debugger , but nothing looked wrong .

The sio(4) driver has been deprecated in RELENG_8, which uses uart(4).
uart(4) is better in a lot of regards, and should also be available for
use on RELENG_7 but you'll need to adjust /etc/ttys to refer to the new
device names (ttyuX vs. ttydX), plus add the uart entries to
/boot/device.hints.

I'm mentioning this as a workaround.

Also worth considering is that the sio(4) ISA probe may be touching
something Bad(tm) as a result, so you might try adding the following
lines to your loader.conf (not a typo) to disable sio(4) entries
entirely:

hint.sio.0.disabled="1"
hint.sio.1.disabled="1"

And see if that improves things. If it does, remove the sio.1.disabled
entry and see if that suffices.

> So to sum this up there is something broken in 7.3-RELEASE but I cant
> figure out what. This server works with a generic install of
> 7.1-RELEASE 7.2-RELEASE , 6.1-RELEASE, 6.2-RELEASE and 6.4-RELEASE in
> both amd64 and i386 , but not 7.3-RELEASE in amd64 . It also worked in
> 7.4-RC1 .
>
> avg recommended I see what changed from r212964 to r212994 I am
> currently looking into this . Has anyone seen this before ?

If the server works fine with 7.4-PRERELEASE/RC1, why are you caring
about 7.3? Upgrade. :-)

--
| Jeremy Chadwick j...@parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, USA |
| Making life hard for others since 1977.
Re: Enabling DDB prevent kernel from panicing
On Mon, Jan 10, 2011 at 6:59 PM, wrote:
> Hello, Mark
>
> 2011/1/11 Mark Saad :
>> All
>> This was originally posted to hackers@
>>
>> I have a good question that I cant find an answer for. I believe
>> found a kernel bug in 7.3-RELEASE that prevents me from booting 64-bit
>> kernels on HP's DL360 G4p . The kernel dies with "Fatal trap 12: page
>> fault while in kernel mode " . The hardware works fine in 7.2-RELEASE
>> amd64, 7.1-RELEASE amd64, and 6.4-RELEASE amd64 .
>>
>> In 7.3-RELEASE amd64 I can not boot from cd or pxe correctly using the
>> stock 7.3-RELEASE amd64 kernel however i386 works fine. To see if this
>> issue was some how fixed in 7.3-RELEASE-p4 amd64 I rebuilt a GENERIC
>> kernel using patches sources and tried to boot and I got the same
>> crash.
>>
>> Next I rebuilt the kernel with KDB and DDB to see if I could get a
>> core-dump of the system. I also set loader.conf to
>>
>> kernel="kernel.DEBUG"
>> kern.dumpdev="/dev/da0s1b"
>>
>> Next I pxebooted the box and the system does not crash on boot up, it
>> will easily load a nfs root and work fine. So I copied my debug
>> kernel, and loader.conf to the local disk and rebooted and it boots
>> fine from the local disk .
>
> Looks like a race condition.
> Well, you don't need to compile KDB and DDB, just add
>
> makeoptions DEBUG=-g
>
> into your kernel config file and rebuild kernel.
>
> Then after you got a crash dump you can easy debug it (see FreeBSD
> Developers Handbok):
> http://www.freebsd.org/doc/en/books/developers-handbook/kerneldebug-gdb.html
>
>
> wbr,
> Nickolas
>

Sorry, let me clarify the issue. When you install a generic
7.3-RELEASE amd64 on some of the HP servers I use, the kernel panics
during boot when it probes the sio driver. Here is a part of my
dmesg.boot file:

atkbd0: [ITHREAD]
psm0: irq 12 on atkbdc0
psm0: [GIANT-LOCKED]
psm0: [ITHREAD]
psm0: model Generic PS/2 mouse, device ID 0
sio0: configured irq 4 not in bitmap of probed irqs 0
sio0: port may not be enabled
sio0: configured irq 4 not in bitmap of probed irqs 0
sio0: port may not be enabled
sio0: port 0x3f8-0x3ff irq 4 on acpi0
sio0: type 16550A
sio0: [FILTER]

About here in the boot is where the box crashes with the error noted
above.

If I then boot the same box off a 7.1-RELEASE amd64 netboot server,
mount the local disks of the 7.3-RELEASE install, edit
/boot/device.hints and comment out the sio hints like this:

hint.vga.0.at="isa"
hint.sc.0.at="isa"
hint.sc.0.flags="0x100"
#hint.sio.0.at="isa"
#hint.sio.0.port="0x3F8"
#hint.sio.0.flags="0x10"
#hint.sio.0.irq="4"
#hint.sio.1.at="isa"
#hint.sio.1.port="0x2F8"
#hint.sio.1.irq="3"
#hint.sio.2.at="isa"
#hint.sio.2.disabled="1"
#hint.sio.2.port="0x3E8"
#hint.sio.2.irq="5"
#hint.sio.3.at="isa"
#hint.sio.3.disabled="1"
#hint.sio.3.port="0x2E8"
#hint.sio.3.irq="9"
hint.ppc.0.at="isa"
hint.ppc.0.irq="7"

and then boot the server off the local disks, the server boots correctly.

The odd thing was, I rebuilt a debug 7.3-RELEASE amd64 kernel on
another working server, installed it on the broken server and
booted it off the local disks without any changes to the hints file,
and the server booted correctly. I was able to manually break out
into the debugger, but nothing looked wrong.

So to sum this up, there is something broken in 7.3-RELEASE but I can't
figure out what. This server works with a generic install of
7.1-RELEASE, 7.2-RELEASE, 6.1-RELEASE, 6.2-RELEASE and 6.4-RELEASE in
both amd64 and i386, but not 7.3-RELEASE in amd64. It also worked in
7.4-RC1.

avg recommended I see what changed from r212964 to r212994; I am
currently looking into this. Has anyone seen this before?

--
mark saad | nones...@longcount.org
Re: Enabling DDB prevent kernel from panicing
Hello, Mark

2011/1/11 Mark Saad :
> All
> This was originally posted to hackers@
>
> I have a good question that I cant find an answer for. I believe
> found a kernel bug in 7.3-RELEASE that prevents me from booting 64-bit
> kernels on HP's DL360 G4p . The kernel dies with "Fatal trap 12: page
> fault while in kernel mode " . The hardware works fine in 7.2-RELEASE
> amd64, 7.1-RELEASE amd64, and 6.4-RELEASE amd64 .
>
> In 7.3-RELEASE amd64 I can not boot from cd or pxe correctly using the
> stock 7.3-RELEASE amd64 kernel however i386 works fine. To see if this
> issue was some how fixed in 7.3-RELEASE-p4 amd64 I rebuilt a GENERIC
> kernel using patches sources and tried to boot and I got the same
> crash.
>
> Next I rebuilt the kernel with KDB and DDB to see if I could get a
> core-dump of the system. I also set loader.conf to
>
> kernel="kernel.DEBUG"
> kern.dumpdev="/dev/da0s1b"
>
> Next I pxebooted the box and the system does not crash on boot up, it
> will easily load a nfs root and work fine. So I copied my debug
> kernel, and loader.conf to the local disk and rebooted and it boots
> fine from the local disk .

Looks like a race condition.
Well, you don't need to compile KDB and DDB, just add

makeoptions DEBUG=-g

into your kernel config file and rebuild the kernel.

Then, after you get a crash dump, you can easily debug it (see the FreeBSD
Developers' Handbook):
http://www.freebsd.org/doc/en/books/developers-handbook/kerneldebug-gdb.html

wbr,
Nickolas
Enabling DDB prevent kernel from panicing
All
This was originally posted to hackers@

I have a good question that I can't find an answer for. I believe I
found a kernel bug in 7.3-RELEASE that prevents me from booting 64-bit
kernels on HP's DL360 G4p. The kernel dies with "Fatal trap 12: page
fault while in kernel mode". The hardware works fine in 7.2-RELEASE
amd64, 7.1-RELEASE amd64, and 6.4-RELEASE amd64.

In 7.3-RELEASE amd64 I cannot boot from cd or pxe correctly using the
stock 7.3-RELEASE amd64 kernel; however, i386 works fine. To see if this
issue was somehow fixed in 7.3-RELEASE-p4 amd64 I rebuilt a GENERIC
kernel using patched sources and tried to boot, and I got the same
crash.

Next I rebuilt the kernel with KDB and DDB to see if I could get a
core-dump of the system. I also set loader.conf to

kernel="kernel.DEBUG"
kern.dumpdev="/dev/da0s1b"

Next I pxebooted the box and the system does not crash on boot up; it
will easily load an nfs root and work fine. So I copied my debug
kernel and loader.conf to the local disk and rebooted, and it boots
fine from the local disk.

Rebooting the server and running off the local disks and debug kernel,
I can't find any issues. Reboot the box into a GENERIC 7.3-RELEASE-p4
kernel and it crashes with this error:

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address = 0x0
fault code = supervisor write data, page not present
instruction pointer = 0x8:0x800070fa
stack pointer = 0x10:0x8153cbe0
frame pointer = 0x10:0x8153cc50
code segment = base 0x0, limit 0xf, type 0x1b
             = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 0 (swapper)
[thread pid 0 tid 10 ]
Stopped at bzero+0xa: repe stosq %es:(%rdi)

It was recommended to comment out the sio hints in /boot/device.hints.
I did this and I can properly boot a GENERIC 7.3-RELEASE kernel. I reran
this same test using 7.4-RC1; the system boots without any changes to
anything.

So my question: does anyone know what changed in stable/7 after the
creation of 7.3-RELEASE that could have fixed this, or does anyone know
what could be causing this issue? The sio code does not look like it's
been changed in a long while. Do we still need the hints for the sio
ports anyway? Does omitting them from the hints file cause any major
issues? I can use the serial port for a console and to connect to
other serial devices without any issues.

--
mark saad | nones...@longcount.org
Re: tmpfs regression in recent -STABLE
On Mon, Jan 10, 2011 at 11:14:24PM +0100, Ulrich Spörlein wrote:
> On Mon, 10.01.2011 at 16:49:14 -0500, John Baldwin wrote:
> > On Monday, January 10, 2011 4:40:04 pm Ulrich Spörlein wrote:
> > > Hey,
> > >
> > > the following line in fstab used to work just fine for my /tmp:
> > >
> > > tmpfs /tmp tmpfs rw,size=1g,mode=1777 0 0
> >
> > I thought there was a thread recently about tmpfs not supporting things
> > like "1g" for size?
>
> Nah, this must be some leak of another kind. Luckily I could bandaid
> this by unionfs mounting an mfs disk over /tmp so programs continue to
> run.
>
> But, tmpfs really is out of resources, as I cannot create new tmpfs's
> for example:
>
> r...@elmar: ~# mount -t tmpfs tmpfs /media
> mount: tmpfs : No space left on device
>
> And besides, the /tmp mount comes up fine and shows enough free space (I
> checked this the last time, after I had rebooted the box).

Are you using ZFS on the same machine? If so, ZFS and tmpfs don't play
well together, don't use tmpfs.

Please search the below page for "tmpfs runs out of space" for all
relevant posts:

http://lists.freebsd.org/pipermail/freebsd-stable/2011-January/thread.html

--
| Jeremy Chadwick j...@parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, USA |
| Making life hard for others since 1977. PGP 4BD6C0CB |
important NFS client patch for FreeBSD8.n
I just committed a patch (r217242) to head. Anyone who is using client
side NFS on FreeBSD8.n should apply this patch. It is also available at:
http://people.freebsd.org/~rmacklem/krpc.patch

It fixes a problem where the kernel rpc assumes that 4 bytes of data
exist in the first mbuf without checking. If the data straddles multiple
mbufs, it uses garbage and then a typical case will wedge for a minute or
so until it times out and establishes a new TCP connection. It also
replaces m_pullup() with m_copydata(), since m_pullup() can fail in rare
cases even when there is data available. (m_pullup() uses
MGET(, M_DONTWAIT,) which can fail when mbuf allocation is constrained,
for example.)

Thanks to john.gemignani at isilon.com for spotting this problem, rick
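
The failure mode described above can be illustrated outside the kernel. What follows is only a userland sketch in Python, not the code from r217242: the buffer chain is modelled as a plain list of byte strings, and the names `first_word_naive`, `copy_bytes` and `first_word_safe` are hypothetical. The message does not say how the 4 bytes are interpreted, so reading them as a big-endian integer is an assumption, and zero padding stands in for the "garbage" the kernel would have read. `copy_bytes` plays the role that m_copydata() plays in the patch: it gathers bytes across buffers without allocating a new one.

```python
import struct


def first_word_naive(chain):
    """Mimic the bug: trust that the first buffer alone holds 4 bytes."""
    head = chain[0]
    # Zero padding stands in for whatever garbage follows a short buffer.
    return struct.unpack(">I", head[:4].ljust(4, b"\x00"))[0]


def copy_bytes(chain, offset, length):
    """Gather `length` bytes starting at `offset`, crossing buffers as needed."""
    out = bytearray()
    for buf in chain:
        if offset >= len(buf):
            # The requested range starts in a later buffer; skip this one.
            offset -= len(buf)
            continue
        out += buf[offset:offset + (length - len(out))]
        offset = 0
        if len(out) == length:
            return bytes(out)
    raise ValueError("chain holds fewer bytes than requested")


def first_word_safe(chain):
    """Read the 4-byte word no matter how it is split across buffers."""
    return struct.unpack(">I", copy_bytes(chain, 0, 4))[0]


# A word that straddles two buffers: 2 bytes in the first, 2 in the second.
chain = [b"\x80\x00", b"\x00\x2c", b"rest of the data"]
print(hex(first_word_naive(chain)), hex(first_word_safe(chain)))
```

With the straddling chain above, the two functions disagree, which is the kind of mismatch that would make a receiver act on a bogus value.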
Re: tmpfs regression in recent -STABLE
On Mon, 10.01.2011 at 16:49:14 -0500, John Baldwin wrote:
> On Monday, January 10, 2011 4:40:04 pm Ulrich Spörlein wrote:
> > Hey,
> >
> > the following line in fstab used to work just fine for my /tmp:
> >
> > tmpfs /tmp tmpfs rw,size=1g,mode=1777 0 0
>
> I thought there was a thread recently about tmpfs not supporting things like
> "1g" for size?

Nah, this must be some leak of another kind. Luckily I could bandaid
this by unionfs mounting an mfs disk over /tmp so programs continue to
run.

But, tmpfs really is out of resources, as I cannot create new tmpfs's
for example:

r...@elmar: ~# mount -t tmpfs tmpfs /media
mount: tmpfs : No space left on device

And besides, the /tmp mount comes up fine and shows enough free space (I
checked this the last time, after I had rebooted the box).

Cheers,
Uli
Re: NFS performance
> >
> > So, did the patch get rid of the 1min + stalls you reported earlier?
> >
> Yes. The stalls (and the "server not responding" log messages are
> gone. Thanks! -- George
>

Ok, that's a start anyhow. Maybe someday we can explain the slow read
rates you are still observing.

Thanks for letting us know, rick
Re: tmpfs regression in recent -STABLE
On Monday, January 10, 2011 4:40:04 pm Ulrich Spörlein wrote:
> Hey,
>
> the following line in fstab used to work just fine for my /tmp:
>
> tmpfs /tmp tmpfs rw,size=1g,mode=1777 0 0

I thought there was a thread recently about tmpfs not supporting things like
"1g" for size?

--
John Baldwin
RE: Supermicro Bladeserver
We attempted to repro this problem with the 82566DM (ich8 btw) in house and
failed, it worked correctly for my testers.

Oh, and just so the mailing lists have an update, the SM Blade problem was
not an issue in the driver, it was a local change in the loader.conf that
caused the problem.

Regards,

Jack

-Original Message-
From: TAKAHASHI Yoshihiro [mailto:n...@freebsd.org]
Sent: Friday, January 07, 2011 7:40 PM
To: jfvo...@gmail.com
Cc: freebsd-...@freebsd.org; freebsd-stable@freebsd.org; Vogel, Jack
Subject: Re: Supermicro Bladeserver

In article Jack Vogel writes:

> I am trying to track down a problem being experienced at icir.org using
> SuperMicro bladeservers, the SERDES 82575 interfaces are having
> connectivity or perhaps autoneg problems, resulting in link transitions
> and watchdog resets.
>
> The closest hardware my org at Intel has is a Fujitsu server who's blades
> also have this device, but testing on that has failed to repro the problem.
>
> I was wondering if anyone else out there has this hardware, if so could you
> let me know your experience, have you had problems or not, etc etc?

My machine has the following em(4) device and it has a autoneg problem.
When I was using 8-stable kernel at 2010/11/01, it has no problem.
But I update to 8-stable at 2010/12/01, the kernel is only linked up as 10M.

e...@pci0:0:25:0: class=0x02 card=0x13d510cf chip=0x104a8086 rev=0x02 hdr=0x00
    vendor = 'Intel Corporation'
    device = '82566DM Gigabit Network Connection'
    class = network
    subclass = ethernet

---
TAKAHASHI Yoshihiro
Re: New ZFSv28 patchset for 8-STABLE
On 12/16/2010 01:44 PM, Martin Matuska wrote:
> Hi everyone,
>
> following the announcement of Pawel Jakub Dawidek (p...@freebsd.org) I am
> providing a ZFSv28 testing patch for 8-STABLE.
>
> Link to the patch:
>
> http://people.freebsd.org/~mm/patches/zfs/v28/stable-8-zfsv28-20101215.patch.xz
>
> Link to mfsBSD ISO files for testing (i386 and amd64):
> http://mfsbsd.vx.sk/iso/zfs-v28/8.2-beta-zfsv28-amd64.iso
> http://mfsbsd.vx.sk/iso/zfs-v28/8.2-beta-zfsv28-i386.iso
>
> The root password for the ISO files: "mfsroot"
> The ISO files work on real systems and in virtualbox.
> They conatin a full install of FreeBSD 8.2-PRERELEASE with ZFS v28,
> simply use the provided "zfsinstall" script.
>
> The patch is against FreeBSD 8-STABLE as of 2010-12-15.
>
> When applying the patch be sure to use correct options for patch(1)
> and make sure the file sys/cddl/compat/opensolaris/sys/sysmacros.h gets
> deleted:
>
> # cd /usr/src
> # fetch http://people.freebsd.org/~mm/patches/zfs/v28/stable-8-zfsv28-20101215.patch.xz
> # xz -d stable-8-zfsv28-20101215.patch.xz
> # patch -E -p0< stable-8-zfsv28-20101215.patch
> # rm sys/cddl/compat/opensolaris/sys/sysmacros.h

I've just got a panic:
http://people.fsn.hu/~bra/freebsd/20110101-zfsv28-fbsd/IMAGE_006.jpg

The panic line for google:
panic: solaris assert: task->ost_magic == TASKQ_MAGIC, file:
/usr/src/sys/modules/zfs/../../cddl/compat/opensolaris/kern/opensolaris_taskq.c,
line: 150

I hope this is enough for debugging, if it's not yet otherwise known. If
not, I will try to catch it again and make a dump.

Thanks,
tmpfs regression in recent -STABLE
Hey,

the following line in fstab used to work just fine for my /tmp:

tmpfs /tmp tmpfs rw,size=1g,mode=1777 0 0

But since I upgraded to 8.2-PRERELEASE, /tmp will soon run out of space
(usually after leaving the box overnight).

% df /tmp
Filesystem 1K-blocks Used Avail Capacity Mounted on
tmpfs             12   12     0     100% /tmp

Yes, what you see here, is not "stuff" filling up the /tmp partition,
*BUT* the /tmp partition shrinking to a ridiculous size.

/tmp only has the usual stuff on it, as I can now no longer create
temporary files there:

% du /tmp
4       /tmp/.X11-unix
0       /tmp/.XIM-unix
0       /tmp/.ICE-unix
0       /tmp/.font-unix
4       /tmp/ssh-tEgl0QxQHp
4       /tmp/ksocket-uqs
12      /tmp/kde-uqs
4       /tmp/fam-uqs
8       /tmp/.vbox-uqs-ipc
0       /tmp/worker-uqs
44      /tmp

Anything I could try?

Uli
Re: nfsd stuck in *rc_lock state
> Hello Rick,
>
> On 11.11.2010 23:54, Rick Macklem wrote:
> > That patch is "self contained", so I think it should be fine to
> > apply it to an 8.0 server.
> >
> > You might also want
> >
> > http://people.freebsd.org/~rmacklem/freebsd8.0-patches/freebsd8-svc-mbufleak.patch
> > which plugged an mbuf leak in the regular FreeBSD8.0 server.
> >
> > Good luck with it, rick
>
> the patch fixes the 100% cpu utilization, but we now had two times the
> issue, that all boxes lost connection to the nfs server (/home not
> responding), but nfsd was at about 1%.
>
> Top did not show a strange behaviour here:
>
>   PID USERNAME  PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
>   703 root       55    0  4772K  1384K RUN     5 329:12  1.37% {nfsd: service}
>   703 root       56    0  4772K  1384K rpcsvc  0 326:41  0.59% {nfsd: service}
>   703 root       52    0  4772K  1384K rpcsvc  6 326:28  0.29% {nfsd: service}
>   703 root       60    0  4772K  1384K rpcsvc  5 328:42  0.00% {nfsd: master}
>   703 root       54    0  4772K  1384K rpcsvc  0 327:44  0.00% {nfsd: service}
>   703 root       53    0  4772K  1384K rpcsvc  1 327:37  0.00% {nfsd: service}
>   703 root       54    0  4772K  1384K rpcsvc  6 326:51  0.00% {nfsd: service}
>   703 root       57    0  4772K  1384K rpcsvc  2 326:44  0.00% {nfsd: service}
>   703 root       50    0  4772K  1384K rpcsvc  1 326:20  0.00% {nfsd: service}
>   703 root       71    0  4772K  1384K rpcsvc  2 323:11  0.00% {nfsd: service}
>   703 root       47    0  4772K  1384K rpcsvc  7 321:11  0.00% {nfsd: service}
>   703 root       46    0  4772K  1384K tx->tx  2 320:00  0.00% {nfsd: service}
>
> there was nothing special in the logfiles, too.
> How to debug such a situation?
>

First off, I hope you don't mind me adding the mailing list as a cc. I'd like this stuff captured in the archive for others to see. (If people don't like the noise, I'll take the heat:-)

Ok, I'm sure others have better techniques, but here's how I would start trying to resolve the above, done when the server is stuck.

1 - Make sure the network is still functioning for other things like ssh.
2 - Do a "ps axHlww" and look at all the nfsd threads. I am primarily interested in the MWCHAN field. If it is:
    rpcsvc - the thread is just waiting for an RPC-->normal
    ufs or zfs - waiting for a vnode lock on the underlying file system
    anything else - I need to look in the kernel sources for the "sleep" with that argument.
    If I can't easily explain what all the nfsd threads are waiting for, wading through a "procstat -ka" is my next step. (I find this rather painful, so I tend to delay doing this as long as possible.:-)
3 - Do a "nfsstat -s" repeatedly and see if any of the counters are increasing.
4 - Fire up a "tcpdump" and see if there is any NFS traffic. (If there is, I'll capture it and put it in wireshark.)
5 - Do a "vmstat -z | fgrep mbuf" and look at the mbuf allocation. (If the machine is running out of mbufs, all sorts of quirky behaviour is possible.)

What top shows above isn't much, although I'd wonder what mbuf usage looks like. If you haven't applied the patch mentioned in the above message, you should do that.

I don't know if this helps, but... rick
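Step 2 above can be partly automated. What follows is only a sketch and not something from the original thread: the function name classify_nfsd_wchan is made up for illustration, it finds the MWCHAN column by its header name instead of assuming a fixed position, and it treats a "-" entry as "not sleeping", which is an assumption about how ps prints running threads.

```sh
#!/bin/sh
# Summarise what the nfsd threads are waiting on, following the triage
# rules in step 2. Reads "ps axHlww" output on stdin.
# classify_nfsd_wchan is a hypothetical helper name, not an existing tool.
classify_nfsd_wchan() {
    awk '
        NR == 1 {
            # Locate the MWCHAN column from the header line.
            for (i = 1; i <= NF; i++) {
                if ($i == "MWCHAN") { col = i }
            }
            next
        }
        col && /nfsd/ {
            w = $col
            if (w == "rpcsvc") {
                note = "idle, waiting for an RPC (normal)"
            } else if (w == "ufs" || w == "zfs") {
                note = "waiting for a vnode lock on the file system"
            } else if (w == "-") {
                # Assumption: "-" means the thread is not sleeping.
                note = "not sleeping"
            } else {
                note = "unexplained, look up the sleep in the kernel sources"
            }
            count[w "\t" note]++
        }
        END {
            # One line per wait channel: count, channel, interpretation.
            for (k in count) { printf "%d\t%s\n", count[k], k }
        }
    ' | sort -rn
}
```

On a stuck server it would be run as "ps axHlww | classify_nfsd_wchan". Anything reported as unexplained is a candidate for the "procstat -ka" step.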
Re: Hang in VOP_LOCK1_APV on 8-STABLE with NFS.
> >
> > Hi,
> >
> > I have got the first steps set up. No solution yet.
> > 1. With the patch OpenOffice opens my homedir (yeah!), but it gives
> > an I/O error when saving a file and everything hangs after that.
>
> Hmm, I don't think you mentioned what server you were using. It
> wouldn't happen to be a FreeBSD one exported ZFS? If so, make
> sure you have this patch in it:
> http://people.freebsd.org/~rmacklem/freebsd8.0-patches/freebsd8-nfsserver-estale.patch
> (With it a stale file handle can result in EIO from a server exporting

Oops, I meant "Without the patch a stale file handle...", rick
Re: Hang in VOP_LOCK1_APV on 8-STABLE with NFS.
>
> Hi,
>
> I have got the first steps set up. No solution yet.
> 1. With the patch OpenOffice opens my homedir (yeah!), but it gives an
> I/O error when saving a file and everything hangs after that.

Hmm, I don't think you mentioned what server you were using. It wouldn't happen to be a FreeBSD one exporting ZFS? If so, make sure you have this patch in it:
http://people.freebsd.org/~rmacklem/freebsd8.0-patches/freebsd8-nfsserver-estale.patch
(With it a stale file handle can result in EIO from a server exporting ZFS and that can make the client loop around, retrying the RPC.)

> 2. I have dumps and stuff. I will mail some links in private e-mail.

I'll take a look at some point.

> 3. Didn't work. It mounts, but ls -l /home gives "Operation not
> permitted".
>

It should work. This hints at a server issue.

Anyhow, I'll look at the dumps at some point, rick
Re: Hang in VOP_LOCK1_APV on 8-STABLE with NFS.
On Fri, 07 Jan 2011 20:52:57 +0100, Kostik Belousov wrote:

On Fri, Jan 07, 2011 at 02:37:25PM -0500, Rick Macklem wrote:

> Hi,
>
> OpenOffice hangs on NFS when I try to save a file or even when I try to
> open the save dialog in this case.
>
> $ 17:25:35 ron...@ronald [~]
> procstat -kk 85575
> PID TID COMM TDNAME KSTACK
> 85575 100322 soffice.bin initial thread mi_switch+0x176 sleepq_wait+0x3b __lockmgr_args+0x655 vop_stdlock+0x39 VOP_LOCK1_APV+0x46 _vn_lock+0x44 vget+0x67 vfs_hash_get+0xeb nfs_nget+0xa8 nfs_lookup+0x65e VOP_LOOKUP_APV+0x40 lookup+0x48a namei+0x518 kern_statat_vnhook+0x82 kern_statat+0x15 lstat+0x22 syscallenter+0x186 syscall+0x40
> 85575 100502 soffice.bin - mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _sleep+0x1a0 do_cv_wait+0x639 __umtx_op_cv_wait+0x51 syscallenter+0x186 syscall+0x40 Xfast_syscall+0xe2
> 85575 100576 soffice.bin - mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _sleep+0x1a0 do_cv_wait+0x639 __umtx_op_cv_wait+0x51 syscallenter+0x186 syscall+0x40 Xfast_syscall+0xe2
> 85575 100577 soffice.bin - mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_wait_sig+0xc _sleep+0x25d kern_accept+0x19c accept+0xfe syscallenter+0x186 syscall+0x40 Xfast_syscall+0xe2
> 85575 100578 soffice.bin - mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_wait_sig+0xc _cv_wait_sig+0x10e seltdwait+0xed poll+0x457 syscallenter+0x186 syscall+0x40 Xfast_syscall+0xe2
> 85575 100579 soffice.bin - mi_switch+0x176 sleepq_catch_signals+0x309 sleepq_timedwait_sig+0x12 _cv_timedwait_sig+0x11d seltdwait+0x79 poll+0x457 syscallenter+0x186 syscall+0x40 Xfast_syscall+0xe2
>
> $ 17:25:35 ron...@ronald [~]
> uname -a
> FreeBSD ronald.office.base.nl 8.2-PRERELEASE FreeBSD 8.2-PRERELEASE #6: Mon Dec 27 23:49:30 CET 2010 r...@ronald.office.base.nl:/usr/obj/usr/src/sys/GENERIC amd64
>

I think all the above tells us is that the thread is waiting for a vnode lock. The question then becomes "what is holding a lock on that vnode and why?".

> It is not possible to exit or kill soffice.bin. I had a slightly different
> procstat stack before, but that was fixed a couple of days ago.

Yea, it will be in an uninterruptible sleep when waiting for a vnode lock.

> Any thoughts? Enabling local locks in NFS doesn't fix it.

Here's some things you could try:

1 - apply the attached patch. It fixes a known problem w.r.t. the client side of the krpc. Not likely to fix this, but I can hope:-)

1a - Look around for other processes in the uninterruptible sleep state; quite possibly, one of them also owns the lock that openoffice is waiting for. Also see
http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug-deadlocks.html
Of particular interest are the witness output and backtraces for all threads that are reported by witness as owning the vnode locks.

2 - If #1 doesn't fix the problem:
    - before making it hang, start capturing packets via:
      # tcpdump -s 0 -w xxx host server
    - then make it hang, kill the above and
      # procstat -ka
      # ps axHlww
      and capture the output of both of these.
    Hopefully these 2 commands will indicate what is holding the vnode lock and maybe, why. The "xxx" file can be looked at in wireshark to see what/if any NFS traffic is happening.
    If you aren't comfortable looking at the above, you can email them to me and I'll take a stab at them someday.

3 - Try the experimental client to see if it behaves differently. The mount command is:
      # mount -t newnfs -o nfsv3, server:/path /mntpath
    (This might identify if the regular client has an infrequently executed code path that forgets to unlock the vnode, since it uses a somewhat different RPC layer. The buffer cache handling etc are almost the same, but the RPC stuff is fairly different.)

> The nfs server is an up-to-date Linux Debian 5 with kernel 2.6.26.
>

I'm afraid I can't blame Linux (at least not until we have more info;-).

> If more info is needed. I can easily reproduce this.

See above #2.

Good luck with it and let us know how it goes, rick

Hi,

I have got the first steps set up. No solution yet.

1. With the patch OpenOffice opens my homedir (yeah!), but it gives an I/O error when saving a file and everything hangs after that.
2. I have dumps and stuff. I will mail some links in private e-mail.
3. Didn't work. It mounts, but ls -l /home gives "Operation not permitted".

I didn't see other processes in uninterruptible state. But maybe you guys see more than I do.

If you don't see anything in wireshark I will try WITNESS and friends later this week. I have already spent 2 hours on this during work hours.

Ronald.
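For the "procstat -ka" output in step 2, picking out the threads that are sleeping on a vnode lock can be scripted. This is a rough sketch only: the function name vnode_lock_waiters is invented for illustration, and it assumes each thread is printed on a single line with the PID, TID and command name in the first three columns, as in the "procstat -kk" output quoted above. It lists the waiters, not the holder of the lock, so it only narrows down where to look.

```sh
#!/bin/sh
# Print "PID TID COMM" for every thread whose kernel stack passes through
# _vn_lock, i.e. threads sleeping while trying to take a vnode lock.
# Matches both "procstat -kk" frames (_vn_lock+0x44) and plain "-k"
# frames (_vn_lock). vnode_lock_waiters is a hypothetical helper name.
vnode_lock_waiters() {
    awk '/(^| )_vn_lock(\+| |$)/ { print $1, $2, $3 }'
}
```

Typical use would be "procstat -ka | vnode_lock_waiters" once the hang has been reproduced.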
Re: New ZFSv28 patchset for 8-STABLE
On 01/10/2011 09:57 AM, Pawel Jakub Dawidek wrote:
> On Sun, Jan 09, 2011 at 12:52:56PM +0100, Attila Nagy wrote:
> [...]
> > I've finally found the time to read the v28 patch and figured out the
> > problem: vfs.zfs.l2arc_noprefetch was changed to 1, so it doesn't use
> > the prefetched data on the L2ARC devices.
> > This is a major hit in my case. Enabling this again restored the
> > previous hit rates and lowered the load on the hard disks significantly.
>
> Well, not storing prefetched data on L2ARC vdevs is the default in
> Solaris. For some reason it was changed by kmacy@ in r205231. Not sure
> why and we can't ask him now, I'm afraid.

What happened to him?

> I just sent an e-mail to Brendan Gregg from Oracle who originally
> implemented L2ARC in ZFS why this is turned off by default. Once I get
> answer we can think about turning it on again.

I think it makes some sense as a stupid form of preferring random IO in the L2ARC instead of sequential. But if I rely on auto tuning and leave prefetch enabled, even a busy mailserver will prefetch a lot of blocks, and I think that's a fine example of random IO (also, it makes the system unusable, but that's another story).

Having this choice is good, and in this case enabling this makes sense for me. I don't know of any reason why you wouldn't use all of your L2ARC space (apart from sparing the quickly wearing flash space and moving disk heads instead), but I'm sure Brendan made this choice for a good reason.

If you get an answer, please tell us. :)

Thanks,
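For anyone who wants to check the setting discussed above on their own machine, here is a minimal sketch. The function name show_l2arc_noprefetch is made up, and whether the OID can be changed at runtime on a v28-patched 8-STABLE kernel is an assumption not confirmed in this thread.

```sh
#!/bin/sh
# Report the current value of vfs.zfs.l2arc_noprefetch, or say so if this
# kernel does not expose it. show_l2arc_noprefetch is a hypothetical name.
show_l2arc_noprefetch() {
    if val=$(sysctl -n vfs.zfs.l2arc_noprefetch 2>/dev/null); then
        echo "vfs.zfs.l2arc_noprefetch=$val"
    else
        echo "vfs.zfs.l2arc_noprefetch not available"
    fi
}

# Assumption: if the OID is writable at runtime, the old behaviour
# (prefetched data is also stored on the L2ARC) would be restored with:
#   sysctl vfs.zfs.l2arc_noprefetch=0
```

A value of 1 corresponds to the new default described in the message, where prefetched data is kept off the L2ARC devices.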
Re: New ZFSv28 patchset for 8-STABLE
On 01/10/2011 10:02 AM, Pawel Jakub Dawidek wrote:
> On Sun, Jan 09, 2011 at 12:49:27PM +0100, Attila Nagy wrote:
> > No, it's not related. One of the disks in the RAIDZ2 pool went bad:
> > (da4:arcmsr0:0:4:0): READ(6). CDB: 8 0 2 10 10 0
> > (da4:arcmsr0:0:4:0): CAM status: SCSI Status Error
> > (da4:arcmsr0:0:4:0): SCSI status: Check Condition
> > (da4:arcmsr0:0:4:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)
> > and it seems it froze the whole zpool. Removing the disk by hand solved
> > the problem.
> > I've seen this previously on other machines with ciss.
> > I wonder why ZFS didn't throw it out of the pool.
>
> Such hangs happen when I/O never returns. ZFS doesn't timeout I/O
> requests on its own, this is driver's responsibility. It is still strange
> that the driver didn't pass I/O error up to ZFS or it might as well be
> ZFS bug, but I don't think so.

Indeed, it may be a controller/driver bug. The firmware released last December says something about a similar problem. I've upgraded; we'll see whether it helps the next time a drive goes awry.

I've only seen these errors in dmesg, not in zpool status; there everything was clear (all zeroes).

BTW, I've swapped those bad drives: da4, which reported the above errors, and da16, which didn't report anything to the OS but was just plain bad according to the controller firmware. After its deletion I could offline da4, so it seems da16 is the real cause (see my previous e-mail). I ran zpool replace on da4 first, but after some seconds of thinking all IO on all disks ceased. After waiting some minutes it was still the same, so I rebooted. Then I noticed that a scrub was going on, so I stopped it. After that the zpool replace of da4 went fine and it started to resilver the disk. But another zpool replace (for da16) causes the same error: some seconds of IO, then nothing, and it stays stuck.

Has anybody tried replacing two drives simultaneously with the zfs v28 patch? (This is a stripe of two raidz2s, and da4 and da16 are in different raidz2s.)
Re: NFS performance
>
> So, did the patch get rid of the 1min + stalls you reported earlier?
>

Yes. The stalls (and the "server not responding" log messages) are gone. Thanks!

-- George
Re: 8.2-PRERELEASE: live deadlock, almost all processes in "pfault" state
On 08/01/2011 20:42, Lev Serebryakov wrote:
> Hello, Kostik.
> You wrote on January 8, 2011, 22:02:32:
>
> > If I am guessing right, this creature has a classic deadlock
> > when bio processing requires memory allocation. It seems that tid 100079
> > is sleeping not even due to the free page shortage, but due to address
> > space exhaustion. As a result, read/write requests are stalled.
>
> I want to say, that ZFS, for example, could allocate much more
> memory, and, yes, it had problems on i386 with this, but not on amd64,
> AFAIK... So, I'm (geom_raid5) doing something wrong...

geom_raid5 (I'm assuming you're talking about the module that was written some time ago by an external developer) does several things wrong - that's why it wasn't included in FreeBSD. IIRC, one of those things is that it aggressively caches writes below the file system layer, which is a no-no.
Re: 8.2-PRERELEASE: live deadlock, almost all processes in "pfault" state
On 08/01/2011 23:06, Lev Serebryakov wrote:
> I need to look how raid3 and vinum/raid5 lives with that situation.

One other standard solution is to spawn a thread and offload the job to that thread, instead of doing it within GEOM start(). This is what most current complex GEOM classes do.
Re: New ZFSv28 patchset for 8-STABLE
On Sat, Dec 18, 2010 at 10:00:11AM +0100, Krzysztof Dajka wrote:
> Hi,
> I applied patch against evening 2010-12-16 STABLE. I did what Martin asked:
>
> On Thu, Dec 16, 2010 at 1:44 PM, Martin Matuska wrote:
> > # cd /usr/src
> > # fetch http://people.freebsd.org/~mm/patches/zfs/v28/stable-8-zfsv28-20101215.patch.xz
> > # xz -d stable-8-zfsv28-20101215.patch.xz
> > # patch -E -p0 < stable-8-zfsv28-20101215.patch
> > # rm sys/cddl/compat/opensolaris/sys/sysmacros.h
> >
> Patch applied cleanly.
>
> #make buildworld
> #make buildkernel
> #make installkernel
> Reboot into single user mode.
> #mergemaster -p
> #make installworld
> #mergemaster
> Reboot.
>
> Rebooting with old world and new kernel went fine. But after reboot
> with new world I got:
> ZFS: zfs_alloc()/zfs_free() mismatch
> Just before loading kernel modules, after that my system hangs.

Could you tell me more about your pool configuration? 'zpool status' output might be helpful.

-- 
Pawel Jakub Dawidek                       http://www.wheelsystems.com
p...@freebsd.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!
8.2-BETA1 / 8.2-RC1 ACPI and other errors in dmesg after upgrade from 7.2
Hi,

I have a few Sun Fire X2100 M2 machines. I upgraded them from FreeBSD 7.2 to 8.2-BETA1 and 8.2-RC1, and now I see the following errors in dmesg:

acpi0: on motherboard
ACPI Error: Invalid type (Alias) for target of Scope operator [CPU1] (Cannot override) (20101013/dswload-324)
ACPI Exception: AE_AML_OPERAND_TYPE, During name lookup/catalog (20101013/psloop-326)
.
acpi0: Power Button (fixed)
acpi0: reservation of 0, a (3) failed
acpi0: reservation of 10, 7ff0 (3) failed
.
.
uhub_reattach_port: port 1 reset failed, error=USB_ERR_TIMEOUT
uhub_reattach_port: device problem (USB_ERR_TIMEOUT), disabling port 1

And then in /var/log/messages:

pid 3802 (sshd) is using legacy pty devices - not logging anymore
pid 38796 (try), uid 0: exited on signal 10 (core dumped)

Apart from these messages, the system and services are running fine, so this is just a report of something suspicious.

Full dmesg:

Copyright (c) 1992-2010 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 8.2-RC1 #0: Wed Dec 22 17:34:20 UTC 2010 r...@mason.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC amd64 Timecounter "i8254" frequency 1193182 Hz quality 0 CPU: Dual-Core AMD Opteron(tm) Processor 1210 (1811.10-MHz K8-class CPU) Origin = "AuthenticAMD" Id = 0x40f33 Family = f Model = 43 Stepping = 3 Features=0x178bfbff Features2=0x2001 AMD Features=0xea500800 AMD Features2=0x1f real memory = 4294967296 (4096 MB) avail memory = 4114534400 (3923 MB) ACPI APIC Table: FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs FreeBSD/SMP: 1 package(s) x 2 core(s) cpu0 (BSP): APIC ID: 0 cpu1 (AP): APIC ID: 1 ioapic0 irqs 0-23 on motherboard kbd1 at kbdmux0 acpi0: on motherboard ACPI Error: Invalid type (Alias) for target of Scope operator [CPU1] (Cannot override) (20101013/dswload-324) ACPI Exception: AE_AML_OPERAND_TYPE, During name lookup/catalog (20101013/psloop-326) acpi0: [ITHREAD] acpi0: Power Button (fixed) acpi0: reservation of 0, a (3) failed acpi0: reservation of 10, dff0 (3) failed Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000 acpi_timer0: <24-bit timer at 3.579545MHz> port 0x2008-0x200b on acpi0 cpu0: on acpi0 cpu1: on acpi0 pcib0: port 0xcf8-0xcff on acpi0 pci0: on pcib0 pci0: at device 0.0 (no driver attached) isab0: at device 1.0 on pci0 isa0: on isab0 pci0: at device 1.1 (no driver attached) ohci0: mem 0xfcffb000-0xfcffbfff irq 21 at device 2.0 on pci0 ohci0: [ITHREAD] usbus0: on ohci0 ehci0: mem 0xfcffac00-0xfcffacff irq 22 at device 2.1 on pci0 ehci0: [ITHREAD] usbus1: EHCI version 1.0 usbus1: on ehci0 atapci0: port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xffa0-0xffaf at device 4.0 on pci0 ata0: on atapci0 ata0: [ITHREAD] ata1: on atapci0 ata1: [ITHREAD] atapci1: port 0xd480-0xd487,0xd400-0xd403,0xd080-0xd087,0xd000-0xd003,0xcc00-0xcc0f mem 0xfcff9000-0xfcff9fff irq 23 at device 5.0 on pci0 atapci1: [ITHREAD] ata2: on atapci1 ata2: [ITHREAD] ata3: on atapci1 ata3: [ITHREAD] pcib1: at device 6.0 on pci0 pci1: on pcib1 vgapci0: port 0xec00-0xec7f 
mem 0xfd00-0xfd7f,0xfdee-0xfdef irq 16 at device 5.0 on pci1 nfe0: port 0xc880-0xc887 mem 0xfcff8000-0xfcff8fff,0xfcffa800-0xfcffa8ff,0xfcffa400-0xfcffa40f irq 20 at device 8.0 on pci0 miibus0: on nfe0 e1000phy0: PHY 2 on miibus0 e1000phy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow nfe0: Ethernet address: 00:1b:24:bd:e2:0f nfe0: [FILTER] nfe0: [FILTER] nfe0: [FILTER] nfe0: [FILTER] nfe0: [FILTER] nfe0: [FILTER] nfe0: [FILTER] nfe0: [FILTER] nfe1: port 0xc800-0xc807 mem 0xfcff7000-0xfcff7fff,0xfcffa000-0xfcffa0ff,0xfcff6c00-0xfcff6c0f irq 21 at device 9.0 on pci0 miibus1: on nfe1 e1000phy1: PHY 3 on miibus1 e1000phy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow nfe1: Ethernet address: 00:1b:24:bd:e2:10 nfe1: [FILTER] nfe1: [FILTER] nfe1: [FILTER] nfe1: [FILTER] nfe1: [FILTER] nfe1: [FILTER] nfe1: [FILTER] nfe1: [FILTER] pcib2: at device 10.0 on pci0 pci2: on pcib2 pcib3: at device 11.0 on pci0 pci3: on pcib3 pcib4: at device 12.0 on pci0 pci4: on pcib4 pcib5: at device 13.0 on pci0 pci5: on pcib5 pcib6: at device 0.0 on pci5 pci6: on pcib6 bge0: 0x009003> mem 0xfdff-0xfdff,0xfdfe-0xfdfe irq 17 at device 4.0 on pci6 bge0: CHIP ID 0x9003; ASIC REV 0x09; CHIP REV 0x90; PCI-X miibus2: on bge0 brgphy0: PHY 1 on miibus2 brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow bge0: Ethernet address: 00:1b:24:bd:e2:0d bge0: [ITHREAD] bge1: 0x009003> mem 0xfdfc-0xfdfc,0xfdfb-0xfdfb irq 18 at device 4.1 o
Re: New ZFSv28 patchset for 8-STABLE
On Sun, Jan 09, 2011 at 12:52:56PM +0100, Attila Nagy wrote:
[...]
> I've finally found the time to read the v28 patch and figured out the
> problem: vfs.zfs.l2arc_noprefetch was changed to 1, so it doesn't use
> the prefetched data on the L2ARC devices.
> This is a major hit in my case. Enabling this again restored the
> previous hit rates and lowered the load on the hard disks significantly.

Well, not storing prefetched data on L2ARC vdevs is the default in Solaris. For some reason it was changed by kmacy@ in r205231. Not sure why and we can't ask him now, I'm afraid. I just sent an e-mail to Brendan Gregg from Oracle, who originally implemented L2ARC in ZFS, asking why this is turned off by default. Once I get an answer we can think about turning it on again.

-- 
Pawel Jakub Dawidek                       http://www.wheelsystems.com
p...@freebsd.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!
Re: New ZFSv28 patchset for 8-STABLE
On Sun, Jan 09, 2011 at 12:49:27PM +0100, Attila Nagy wrote:
> No, it's not related. One of the disks in the RAIDZ2 pool went bad:
> (da4:arcmsr0:0:4:0): READ(6). CDB: 8 0 2 10 10 0
> (da4:arcmsr0:0:4:0): CAM status: SCSI Status Error
> (da4:arcmsr0:0:4:0): SCSI status: Check Condition
> (da4:arcmsr0:0:4:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)
> and it seems it froze the whole zpool. Removing the disk by hand solved
> the problem.
> I've seen this previously on other machines with ciss.
> I wonder why ZFS didn't throw it out of the pool.

Such hangs happen when I/O never returns. ZFS doesn't time out I/O requests on its own; this is the driver's responsibility. It is still strange that the driver didn't pass the I/O error up to ZFS. It might as well be a ZFS bug, but I don't think so.

-- 
Pawel Jakub Dawidek                       http://www.wheelsystems.com
p...@freebsd.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!
classes and kernel_cookie was Re: Specifying root mount options on diskless boot.
...
> I note that the response to your message from "danny" offers the ability
> to pass arguments to the nfs mount command, but also seems to offer a fix
> for the fact that "classes" are not supported under PXE:
>
> http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/90368
>
> I hope "danny" will offer a patch to mainline code - it would be an
> important improvement (and already promised in the documentation).
...

I'm willing to try and add the missing pieces, but I need a better explanation of what they are. For example, I have no clue what the kernel_cookie is used for, nor what the ${class} is all about.

BTW, it would be kind if this line in pxeboot(8):
    As PXE is still in its infancy ...
could be changed :-)

"danny"