Re: New ZFSv28 patchset for 8-STABLE
On Mon, Jan 10, 2011 at 06:30:39PM +0100, Attila Nagy wrote:
> > why and we can't ask him now, I'm afraid. I just sent an e-mail to
>
> What happened to him?

Oops, I was thinking of something else.
http://valleywag.gawker.com/383763/freebsd-developer-kip-macy-arrested-for-tormenting-tenants

Marcus
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: New ZFSv28 patchset for 8-STABLE
On 12/16/2010 01:44 PM, Martin Matuska wrote:
> Hi everyone,
>
> following the announcement of Pawel Jakub Dawidek (p...@freebsd.org) I am providing a ZFSv28 testing patch for 8-STABLE.
>
> Link to the patch:
> http://people.freebsd.org/~mm/patches/zfs/v28/stable-8-zfsv28-20101215.patch.xz
>
> Link to mfsBSD ISO files for testing (i386 and amd64):
> http://mfsbsd.vx.sk/iso/zfs-v28/8.2-beta-zfsv28-amd64.iso
> http://mfsbsd.vx.sk/iso/zfs-v28/8.2-beta-zfsv28-i386.iso
>
> The root password for the ISO files: "mfsroot"
> The ISO files work on real systems and in virtualbox. They contain a full install of FreeBSD 8.2-PRERELEASE with ZFS v28; simply use the provided "zfsinstall" script.
>
> The patch is against FreeBSD 8-STABLE as of 2010-12-15. When applying the patch be sure to use correct options for patch(1) and make sure the file sys/cddl/compat/opensolaris/sys/sysmacros.h gets deleted:
>
> # cd /usr/src
> # fetch http://people.freebsd.org/~mm/patches/zfs/v28/stable-8-zfsv28-20101215.patch.xz
> # xz -d stable-8-zfsv28-20101215.patch.xz
> # patch -E -p0 < stable-8-zfsv28-20101215.patch
> # rm sys/cddl/compat/opensolaris/sys/sysmacros.h

I've just got a panic:
http://people.fsn.hu/~bra/freebsd/20110101-zfsv28-fbsd/IMAGE_006.jpg

The panic line for google:
panic: solaris assert: task->ost_magic == TASKQ_MAGIC, file: /usr/src/sys/modules/zfs/../../cddl/compat/opensolaris/kern/opensolaris_taskq.c, line: 150

I hope this is enough for debugging, if it's not yet otherwise known. If not, I will try to catch it again and make a dump.

Thanks,
Re: New ZFSv28 patchset for 8-STABLE
On 01/10/2011 09:57 AM, Pawel Jakub Dawidek wrote:
> On Sun, Jan 09, 2011 at 12:52:56PM +0100, Attila Nagy wrote:
> [...]
> > I've finally found the time to read the v28 patch and figured out the problem: vfs.zfs.l2arc_noprefetch was changed to 1, so it doesn't use the prefetched data on the L2ARC devices. This is a major hit in my case. Enabling this again restored the previous hit rates and lowered the load on the hard disks significantly.
>
> Well, not storing prefetched data on L2ARC vdevs is the default in Solaris. For some reason it was changed by kmacy@ in r205231. Not sure why and we can't ask him now, I'm afraid. I just sent an e-mail to

What happened to him?

> Brendan Gregg from Oracle who originally implemented L2ARC in ZFS why this is turned off by default. Once I get an answer we can think about turning it on again. I think it makes some sense as a stupid form of preferring random IO in the L2ARC instead of sequential.

But if I rely on auto tuning and leave prefetch enabled, even a busy mailserver will prefetch a lot of blocks, and I think that's a fine example of random IO (also, it makes the system unusable, but that's another story). Having this choice is good, and in this case enabling this makes sense for me. I don't know any reason why you wouldn't use all of your L2ARC space (apart from sparing the quickly wearing-out flash space and moving disk heads instead), but I'm sure Brendan made this choice for a good reason. If you get an answer, please tell us. :)

Thanks,
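If anyone wants to flip the tunable Pawel mentions without waiting for a patch, it can be done at runtime or at boot; a minimal sketch (the sysctl name is the one from this thread; treat the exact loader.conf spelling as an assumption on my part):

```shell
# vfs.zfs.l2arc_noprefetch: 1 = do not store prefetched data on L2ARC
# vdevs (the new v28 default); 0 = cache prefetched data as before.

# At runtime:
sysctl vfs.zfs.l2arc_noprefetch=0

# Persistently, in /boot/loader.conf:
vfs.zfs.l2arc_noprefetch="0"
```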
Re: New ZFSv28 patchset for 8-STABLE
On 01/10/2011 10:02 AM, Pawel Jakub Dawidek wrote:
> On Sun, Jan 09, 2011 at 12:49:27PM +0100, Attila Nagy wrote:
> > No, it's not related. One of the disks in the RAIDZ2 pool went bad:
> > (da4:arcmsr0:0:4:0): READ(6). CDB: 8 0 2 10 10 0
> > (da4:arcmsr0:0:4:0): CAM status: SCSI Status Error
> > (da4:arcmsr0:0:4:0): SCSI status: Check Condition
> > (da4:arcmsr0:0:4:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)
> > and it seems it froze the whole zpool. Removing the disk by hand solved the problem. I've seen this previously on other machines with ciss. I wonder why ZFS didn't throw it out of the pool.
>
> Such hangs happen when I/O never returns. ZFS doesn't timeout I/O requests on its own; this is the driver's responsibility. It is still strange that the driver didn't pass the I/O error up to ZFS, or it might as well be a ZFS bug, but I don't think so.

Indeed, it may be a controller/driver bug. The newly released (last December) firmware says something about a similar problem. I've upgraded; we'll see whether it helps next time a drive goes awry. I've only seen these errors in dmesg, not in zpool status, where everything was clear (all zeroes).

BTW, I've swapped those bad drives (da4, which reported the above errors, and da16, which didn't report anything to the OS, it was just plain bad according to the controller firmware - and after its deletion I could offline da4, so it seems it's the real cause, see my previous e-mail), and zpool replaced first da4, but after some seconds of thinking, all IO on all disks ceased. After waiting some minutes, it was still the same, so I rebooted. Then I noticed that a scrub was going on, so I stopped it. Then the zpool replace of da4 went fine; it started to resilver the disk. But another zpool replace (for da16) causes the same error: some seconds of IO, then nothing, and it stays stuck there. Has anybody tried replacing two drives simultaneously with the ZFS v28 patch?
(this is a stripe of two raidz2s and da4 and da16 are in different raidz2)
Re: New ZFSv28 patchset for 8-STABLE
On Sat, Dec 18, 2010 at 10:00:11AM +0100, Krzysztof Dajka wrote:
> Hi,
> I applied the patch against evening 2010-12-16 STABLE. I did what Martin asked:
>
> On Thu, Dec 16, 2010 at 1:44 PM, Martin Matuska wrote:
> > # cd /usr/src
> > # fetch http://people.freebsd.org/~mm/patches/zfs/v28/stable-8-zfsv28-20101215.patch.xz
> > # xz -d stable-8-zfsv28-20101215.patch.xz
> > # patch -E -p0 < stable-8-zfsv28-20101215.patch
> > # rm sys/cddl/compat/opensolaris/sys/sysmacros.h
>
> Patch applied cleanly.
> # make buildworld
> # make buildkernel
> # make installkernel
> Reboot into single user mode.
> # mergemaster -p
> # make installworld
> # mergemaster
> Reboot.
>
> Rebooting with old world and new kernel went fine. But after reboot with new world I got:
> ZFS: zfs_alloc()/zfs_free() mismatch
> just before loading kernel modules; after that my system hangs.

Could you tell me more about your pool configuration? 'zpool status' output might be helpful.

-- 
Pawel Jakub Dawidek http://www.wheelsystems.com
p...@freebsd.org http://www.FreeBSD.org
FreeBSD committer Am I Evil? Yes, I Am!
Re: New ZFSv28 patchset for 8-STABLE
On Sun, Jan 09, 2011 at 12:52:56PM +0100, Attila Nagy wrote:
[...]
> I've finally found the time to read the v28 patch and figured out the problem: vfs.zfs.l2arc_noprefetch was changed to 1, so it doesn't use the prefetched data on the L2ARC devices.
> This is a major hit in my case. Enabling this again restored the previous hit rates and lowered the load on the hard disks significantly.

Well, not storing prefetched data on L2ARC vdevs is the default in Solaris. For some reason it was changed by kmacy@ in r205231. Not sure why and we can't ask him now, I'm afraid. I just sent an e-mail to Brendan Gregg from Oracle who originally implemented L2ARC in ZFS why this is turned off by default. Once I get an answer we can think about turning it on again.

-- 
Pawel Jakub Dawidek http://www.wheelsystems.com
p...@freebsd.org http://www.FreeBSD.org
FreeBSD committer Am I Evil? Yes, I Am!
Re: New ZFSv28 patchset for 8-STABLE
On Sun, Jan 09, 2011 at 12:49:27PM +0100, Attila Nagy wrote:
> No, it's not related. One of the disks in the RAIDZ2 pool went bad:
> (da4:arcmsr0:0:4:0): READ(6). CDB: 8 0 2 10 10 0
> (da4:arcmsr0:0:4:0): CAM status: SCSI Status Error
> (da4:arcmsr0:0:4:0): SCSI status: Check Condition
> (da4:arcmsr0:0:4:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error)
> and it seems it froze the whole zpool. Removing the disk by hand solved the problem.
> I've seen this previously on other machines with ciss.
> I wonder why ZFS didn't throw it out of the pool.

Such hangs happen when I/O never returns. ZFS doesn't timeout I/O requests on its own; this is the driver's responsibility. It is still strange that the driver didn't pass the I/O error up to ZFS, or it might as well be a ZFS bug, but I don't think so.

-- 
Pawel Jakub Dawidek http://www.wheelsystems.com
p...@freebsd.org http://www.FreeBSD.org
FreeBSD committer Am I Evil? Yes, I Am!
Re: New ZFSv28 patchset for 8-STABLE
On Sun, Jan 09, 2011 at 01:42:13PM +0100, Attila Nagy wrote: > On 01/09/2011 01:18 PM, Jeremy Chadwick wrote: > >On Sun, Jan 09, 2011 at 12:49:27PM +0100, Attila Nagy wrote: > >> On 01/09/2011 10:00 AM, Attila Nagy wrote: > >>>On 12/16/2010 01:44 PM, Martin Matuska wrote: > Hi everyone, > > following the announcement of Pawel Jakub Dawidek (p...@freebsd.org) I am > providing a ZFSv28 testing patch for 8-STABLE. > > Link to the patch: > > http://people.freebsd.org/~mm/patches/zfs/v28/stable-8-zfsv28-20101215.patch.xz > > > >>>I've got an IO hang with dedup enabled (not sure it's related, > >>>I've started to rewrite all data on pool, which makes a heavy > >>>load): > >>> > >>>The processes are in various states: > >>>65747 1001 1 54 10 28620K 24360K tx->tx 0 6:58 0.00% cvsup > >>>80383 1001 1 54 10 40616K 30196K select 1 5:38 0.00% rsync > >>>1501 www 1 44 0 7304K 2504K zio->i 0 2:09 0.00% nginx > >>>1479 www 1 44 0 7304K 2416K zio->i 1 2:03 0.00% nginx > >>>1477 www 1 44 0 7304K 2664K zio->i 0 2:02 0.00% nginx > >>>1487 www 1 44 0 7304K 2376K zio->i 0 1:40 0.00% nginx > >>>1490 www 1 44 0 7304K 1852K zfs 0 1:30 0.00% nginx > >>>1486 www 1 44 0 7304K 2400K zfsvfs 1 1:05 0.00% nginx > >>> > >>>And everything which wants to touch the pool is/becomes dead. > >>> > >>>Procstat says about one process: > >>># procstat -k 1497 > >>> PID TID COMM TDNAME KSTACK > >>>1497 100257 nginx - mi_switch > >>>sleepq_wait __lockmgr_args vop_stdlock VOP_LOCK1_APV null_lock > >>>VOP_LOCK1_APV _vn_lock nullfs_root lookup namei vn_open_cred > >>>kern_openat syscallenter syscall Xfast_syscall > >>No, it's not related. One of the disks in the RAIDZ2 pool went bad: > >>(da4:arcmsr0:0:4:0): READ(6). CDB: 8 0 2 10 10 0 > >>(da4:arcmsr0:0:4:0): CAM status: SCSI Status Error > >>(da4:arcmsr0:0:4:0): SCSI status: Check Condition > >>(da4:arcmsr0:0:4:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered > >>read error) > >>and it seems it froze the whole zpool. 
Removing the disk by hand > >>solved the problem. > >>I've seen this previously on other machines with ciss. > >>I wonder why ZFS didn't throw it out of the pool. > >Hold on a minute. An unrecoverable read error does not necessarily mean > >the drive is bad, it could mean that the individual LBA that was > >attempted to be read resulted in ASC 0x11 (MEDIUM ERROR) (e.g. a bad > >block was encountered). I would check SMART stats on the disk (since > >these are probably SATA given use of arcmsr(4)) and provide those. > >*That* will tell you if the disk is bad. I'll help you decode the > >attribute values if you provide them. > You are right, and I gave incorrect information. There are a lot > more errors for that disk in the logs, and the zpool was frozen. > I tried to offline the given disk. That helped in the ciss case, > where the symptom is the same, or something similar, like there is > no IO for ages, then something small and nothing for long > seconds/minutes, and there are no errors logged. zpool status > reported no errors, and the dmesg was clear too. > There I could find the bad disk by watching gstat output, and there I > saw, when the very small amount of IO was done, that there was one disk > with response times well above a second, while the others responded > quickly. > There the zpool offline helped. Here it didn't; the command just hung, > like everything else. > So what I did then: got into the areca-cli and searched for errors. > One disk was set to failed and it seemed to be the cause. I > removed it (and did a camcontrol rescan, but I'm not sure whether that was > necessary), and suddenly the zpool offline finished and > everything went back to normal. > But there are two controllers in the system and now I see that the > above disk is on ctrl 1, while the one I have removed is on ctrl 2. > I was misled by their identical positions. 
> So now I have an offlined disk (which produces read errors, but I couldn't see them in the zpool output) and another, which is shown as failed in the RAID controller and got removed by hand (and solved the situation):
>
>   NAME                 STATE     READ WRITE CKSUM
>   data                 DEGRADED     0     0     0
>     raidz2-0           DEGRADED     0     0     0
>       label/disk20-01  ONLINE       0     0     0
>       label/disk20-02  ONLINE       0     0     0
>       label/disk20-03  ONLINE       0     0     0
>       label/disk20-04  ONLINE       0     0     0
>       label/disk20-05  OFFLINE      0     0     0
>       label/disk20-06  ONLINE       0     0     0
>       label/disk20-07  ONLINE       0     0     0
>       label/disk20-08  ONLINE       0     0     0
>       label/disk20-09  ONLINE       0     0     0
>       label/disk20-10  ONLINE       0     0     0
Re: New ZFSv28 patchset for 8-STABLE
Once upon a time, this was a known problem with the arcmsr driver not correctly interacting with ZFS, resulting in this behavior. Since I'm presuming that the arcmsr driver update which was intended to fix this behavior (in my case, at least) is in your nightly build, it's probably worth pinging the arcmsr driver maintainer about this. - Rich On Sun, Jan 9, 2011 at 7:18 AM, Jeremy Chadwick wrote: > On Sun, Jan 09, 2011 at 12:49:27PM +0100, Attila Nagy wrote: >> On 01/09/2011 10:00 AM, Attila Nagy wrote: >> > On 12/16/2010 01:44 PM, Martin Matuska wrote: >> >>Hi everyone, >> >> >> >>following the announcement of Pawel Jakub Dawidek (p...@freebsd.org) I am >> >>providing a ZFSv28 testing patch for 8-STABLE. >> >> >> >>Link to the patch: >> >> >> >>http://people.freebsd.org/~mm/patches/zfs/v28/stable-8-zfsv28-20101215.patch.xz >> >> >> >> >> >I've got an IO hang with dedup enabled (not sure it's related, >> >I've started to rewrite all data on pool, which makes a heavy >> >load): >> > >> >The processes are in various states: >> >65747 1001 1 54 10 28620K 24360K tx->tx 0 6:58 0.00% cvsup >> >80383 1001 1 54 10 40616K 30196K select 1 5:38 0.00% rsync >> > 1501 www 1 44 0 7304K 2504K zio->i 0 2:09 0.00% nginx >> > 1479 www 1 44 0 7304K 2416K zio->i 1 2:03 0.00% nginx >> > 1477 www 1 44 0 7304K 2664K zio->i 0 2:02 0.00% nginx >> > 1487 www 1 44 0 7304K 2376K zio->i 0 1:40 0.00% nginx >> > 1490 www 1 44 0 7304K 1852K zfs 0 1:30 0.00% nginx >> > 1486 www 1 44 0 7304K 2400K zfsvfs 1 1:05 0.00% nginx >> > >> >And everything which wants to touch the pool is/becomes dead. >> > >> >Procstat says about one process: >> ># procstat -k 1497 >> > PID TID COMM TDNAME KSTACK >> > 1497 100257 nginx - mi_switch >> >sleepq_wait __lockmgr_args vop_stdlock VOP_LOCK1_APV null_lock >> >VOP_LOCK1_APV _vn_lock nullfs_root lookup namei vn_open_cred >> >kern_openat syscallenter syscall Xfast_syscall >> No, it's not related. 
One of the disks in the RAIDZ2 pool went bad: >> (da4:arcmsr0:0:4:0): READ(6). CDB: 8 0 2 10 10 0 >> (da4:arcmsr0:0:4:0): CAM status: SCSI Status Error >> (da4:arcmsr0:0:4:0): SCSI status: Check Condition >> (da4:arcmsr0:0:4:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered >> read error) >> and it seems it froze the whole zpool. Removing the disk by hand >> solved the problem. >> I've seen this previously on other machines with ciss. >> I wonder why ZFS didn't throw it out of the pool. > > Hold on a minute. An unrecoverable read error does not necessarily mean > the drive is bad, it could mean that the individual LBA that was > attempted to be read resulted in ASC 0x11 (MEDIUM ERROR) (e.g. a bad > block was encountered). I would check SMART stats on the disk (since > these are probably SATA given use of arcmsr(4)) and provide those. > *That* will tell you if the disk is bad. I'll help you decode the > attribute values if you provide them. > > My understanding is that a single LBA read failure should not warrant > ZFS marking the disk UNAVAIL in the pool. It should have incremented > the READ error counter and that's it. Did you receive a *single* error > for the disk and then things went catatonic? > > If the entire system got wedged (a soft wedge, e.g. kernel is still > alive but nothing's happening in userland), that could be a different > problem -- either with ZFS or arcmsr(4). Does ZFS have some sort of > timeout value internal to itself where it will literally mark a disk > UNAVAIL in the case that repeated I/O transactions take "too long"? > What is its error recovery methodology? > > Speaking strictly about Solaris 10 and ZFS: I have seen many, many times > a system "soft wedge" after repeated I/O errors (read or write) are > spewed out on the console for a single SATA disk (via AHCI), but only > when the disk is used as a sole root filesystem disk (no mirror/raidz). > My impression is that ZFS isn't the problem in this scenario. 
In most > cases, post-mortem debugging on my part shows that disks encountered > some CRC errors (indicating cabling issues, etc.), sometimes as few as > 2, but "something else" went crazy -- or possibly ZFS couldn't mark the > disk UNAVAIL (if it has that logic) because it's a single disk > associated with root. Hardware in this scenario are Hitachi SATA disks > with an ICH ESB2 controller, software is Solaris 10 (Generic_142901-06) > with ZFS v15. > > -- > | Jeremy Chadwick j...@parodius.com | > | Parodius Networking http://www.parodius.com/ | > | UNIX Systems Administrator Mountain View, CA, USA | > | Making life hard for others since 1977. PGP 4BD6C0CB |
Re: New ZFSv28 patchset for 8-STABLE
On 01/09/2011 01:18 PM, Jeremy Chadwick wrote: On Sun, Jan 09, 2011 at 12:49:27PM +0100, Attila Nagy wrote: On 01/09/2011 10:00 AM, Attila Nagy wrote: On 12/16/2010 01:44 PM, Martin Matuska wrote: Hi everyone, following the announcement of Pawel Jakub Dawidek (p...@freebsd.org) I am providing a ZFSv28 testing patch for 8-STABLE. Link to the patch: http://people.freebsd.org/~mm/patches/zfs/v28/stable-8-zfsv28-20101215.patch.xz I've got an IO hang with dedup enabled (not sure it's related, I've started to rewrite all data on pool, which makes a heavy load): The processes are in various states: 65747 1001 1 54 10 28620K 24360K tx->tx 0 6:58 0.00% cvsup 80383 1001 1 54 10 40616K 30196K select 1 5:38 0.00% rsync 1501 www 1 44 0 7304K 2504K zio->i 0 2:09 0.00% nginx 1479 www 1 44 0 7304K 2416K zio->i 1 2:03 0.00% nginx 1477 www 1 44 0 7304K 2664K zio->i 0 2:02 0.00% nginx 1487 www 1 44 0 7304K 2376K zio->i 0 1:40 0.00% nginx 1490 www 1 44 0 7304K 1852K zfs 0 1:30 0.00% nginx 1486 www 1 44 0 7304K 2400K zfsvfs 1 1:05 0.00% nginx And everything which wants to touch the pool is/becomes dead. Procstat says about one process: # procstat -k 1497 PID TID COMM TDNAME KSTACK 1497 100257 nginx - mi_switch sleepq_wait __lockmgr_args vop_stdlock VOP_LOCK1_APV null_lock VOP_LOCK1_APV _vn_lock nullfs_root lookup namei vn_open_cred kern_openat syscallenter syscall Xfast_syscall No, it's not related. One of the disks in the RAIDZ2 pool went bad: (da4:arcmsr0:0:4:0): READ(6). CDB: 8 0 2 10 10 0 (da4:arcmsr0:0:4:0): CAM status: SCSI Status Error (da4:arcmsr0:0:4:0): SCSI status: Check Condition (da4:arcmsr0:0:4:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error) and it seems it froze the whole zpool. Removing the disk by hand solved the problem. I've seen this previously on other machines with ciss. I wonder why ZFS didn't throw it out of the pool. Hold on a minute. 
An unrecoverable read error does not necessarily mean the drive is bad, it could mean that the individual LBA that was attempted to be read resulted in ASC 0x11 (MEDIUM ERROR) (e.g. a bad block was encountered). I would check SMART stats on the disk (since these are probably SATA given use of arcmsr(4)) and provide those. *That* will tell you if the disk is bad. I'll help you decode the attribute values if you provide them. You are right, and I gave incorrect information. There are a lot more errors for that disk in the logs, and the zpool was frozen. I tried to offline the given disk. That helped in the ciss case, where the symptom is the same, or something similar, like there is no IO for ages, then something small and nothing for long seconds/minutes, and there are no errors logged. zpool status reported no errors, and the dmesg was clear too. There I could find the bad disk by watching gstat output, and there I saw, when the very small amount of IO was done, that there was one disk with response times well above a second, while the others responded quickly. There the zpool offline helped. Here it didn't; the command just hung, like everything else. So what I did then: got into the areca-cli and searched for errors. One disk was set to failed and it seemed to be the cause. I removed it (and did a camcontrol rescan, but I'm not sure whether that was necessary), and suddenly the zpool offline finished and everything went back to normal. But there are two controllers in the system and now I see that the above disk is on ctrl 1, while the one I have removed is on ctrl 2. I was misled by their identical positions. 
So now I have an offlined disk (which produces read errors, but I couldn't see them in the zpool output) and another, which is shown as failed in the RAID controller and got removed by hand (and solved the situation):

  NAME                 STATE     READ WRITE CKSUM
  data                 DEGRADED     0     0     0
    raidz2-0           DEGRADED     0     0     0
      label/disk20-01  ONLINE       0     0     0
      label/disk20-02  ONLINE       0     0     0
      label/disk20-03  ONLINE       0     0     0
      label/disk20-04  ONLINE       0     0     0
      label/disk20-05  OFFLINE      0     0     0
      label/disk20-06  ONLINE       0     0     0
      label/disk20-07  ONLINE       0     0     0
      label/disk20-08  ONLINE       0     0     0
      label/disk20-09  ONLINE       0     0     0
      label/disk20-10  ONLINE       0     0     0
      label/disk20-11  ONLINE       0     0     0
      label/disk20-12  ONLINE       0     0     0
    raidz2-1           DEGRADED     0     0     0
      label/disk21-01  ONLINE       0     0     0
      label/disk21-02  ONLINE       0     0     0
      label/disk21-03  ONLINE       0     0     0
      label/disk21-04  ONLINE       0     0
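For anyone who wants to spot non-ONLINE vdevs mechanically rather than by eye, status output of this shape can be filtered with a one-liner; a minimal sketch using a trimmed, embedded copy of the output above (on a live system you would pipe in `zpool status data` instead; the device names are just the ones from this thread):

```shell
# Print every leaf vdev whose state is not ONLINE, from sample
# 'zpool status' output shaped like the one in this thread.
cat <<'EOF' |
  NAME                 STATE     READ WRITE CKSUM
  data                 DEGRADED     0     0     0
    raidz2-0           DEGRADED     0     0     0
      label/disk20-04  ONLINE       0     0     0
      label/disk20-05  OFFLINE      0     0     0
      label/disk20-06  ONLINE       0     0     0
EOF
awk '$1 ~ /^label\// && $2 != "ONLINE" { print $1, $2 }'
```

This only relies on the two leading columns of the status table, so it should survive extra columns or notes that zpool appends on real systems.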
Re: New ZFSv28 patchset for 8-STABLE
On Sun, Jan 09, 2011 at 12:49:27PM +0100, Attila Nagy wrote: > On 01/09/2011 10:00 AM, Attila Nagy wrote: > > On 12/16/2010 01:44 PM, Martin Matuska wrote: > >>Hi everyone, > >> > >>following the announcement of Pawel Jakub Dawidek (p...@freebsd.org) I am > >>providing a ZFSv28 testing patch for 8-STABLE. > >> > >>Link to the patch: > >> > >>http://people.freebsd.org/~mm/patches/zfs/v28/stable-8-zfsv28-20101215.patch.xz > >> > >> > >I've got an IO hang with dedup enabled (not sure it's related, > >I've started to rewrite all data on pool, which makes a heavy > >load): > > > >The processes are in various states: > >65747 1001 1 54 10 28620K 24360K tx->tx 0 6:58 0.00% cvsup > >80383 1001 1 54 10 40616K 30196K select 1 5:38 0.00% rsync > > 1501 www 1 44 0 7304K 2504K zio->i 0 2:09 0.00% nginx > > 1479 www 1 44 0 7304K 2416K zio->i 1 2:03 0.00% nginx > > 1477 www 1 44 0 7304K 2664K zio->i 0 2:02 0.00% nginx > > 1487 www 1 44 0 7304K 2376K zio->i 0 1:40 0.00% nginx > > 1490 www 1 44 0 7304K 1852K zfs 0 1:30 0.00% nginx > > 1486 www 1 44 0 7304K 2400K zfsvfs 1 1:05 0.00% nginx > > > >And everything which wants to touch the pool is/becomes dead. > > > >Procstat says about one process: > ># procstat -k 1497 > > PID TID COMM TDNAME KSTACK > > 1497 100257 nginx - mi_switch > >sleepq_wait __lockmgr_args vop_stdlock VOP_LOCK1_APV null_lock > >VOP_LOCK1_APV _vn_lock nullfs_root lookup namei vn_open_cred > >kern_openat syscallenter syscall Xfast_syscall > No, it's not related. One of the disks in the RAIDZ2 pool went bad: > (da4:arcmsr0:0:4:0): READ(6). CDB: 8 0 2 10 10 0 > (da4:arcmsr0:0:4:0): CAM status: SCSI Status Error > (da4:arcmsr0:0:4:0): SCSI status: Check Condition > (da4:arcmsr0:0:4:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered > read error) > and it seems it froze the whole zpool. Removing the disk by hand > solved the problem. > I've seen this previously on other machines with ciss. > I wonder why ZFS didn't throw it out of the pool. Hold on a minute. 
An unrecoverable read error does not necessarily mean the drive is bad, it could mean that the individual LBA that was attempted to be read resulted in ASC 0x11 (MEDIUM ERROR) (e.g. a bad block was encountered). I would check SMART stats on the disk (since these are probably SATA given use of arcmsr(4)) and provide those. *That* will tell you if the disk is bad. I'll help you decode the attribute values if you provide them. My understanding is that a single LBA read failure should not warrant ZFS marking the disk UNAVAIL in the pool. It should have incremented the READ error counter and that's it. Did you receive a *single* error for the disk and then things went catatonic? If the entire system got wedged (a soft wedge, e.g. kernel is still alive but nothing's happening in userland), that could be a different problem -- either with ZFS or arcmsr(4). Does ZFS have some sort of timeout value internal to itself where it will literally mark a disk UNAVAIL in the case that repeated I/O transactions take "too long"? What is its error recovery methodology? Speaking strictly about Solaris 10 and ZFS: I have seen many, many times a system "soft wedge" after repeated I/O errors (read or write) are spewed out on the console for a single SATA disk (via AHCI), but only when the disk is used as a sole root filesystem disk (no mirror/raidz). My impression is that ZFS isn't the problem in this scenario. In most cases, post-mortem debugging on my part shows that disks encountered some CRC errors (indicating cabling issues, etc.), sometimes as few as 2, but "something else" went crazy -- or possibly ZFS couldn't mark the disk UNAVAIL (if it has that logic) because it's a single disk associated with root. Hardware in this scenario are Hitachi SATA disks with an ICH ESB2 controller, software is Solaris 10 (Generic_142901-06) with ZFS v15. 
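Jeremy's advice - check the SMART attributes before condemning the drive - can be partly automated. A minimal sketch that flags the attributes most indicative of genuine media failure; the attribute table below is made-up sample data, and on a live system it would come from something like `smartctl -a /dev/da4` (smartmontools can also reach disks behind Areca controllers with `-d areca,N`):

```shell
# Flag SMART attributes that suggest real media trouble: non-zero raw
# values for reallocated (5), pending (197), or uncorrectable (198)
# sectors. The here-doc holds hypothetical sample attribute rows;
# replace it with real smartctl output on an actual system.
cat <<'EOF' |
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       12
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       3
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
EOF
awk '$1 == 5 || $1 == 197 || $1 == 198 { if ($10 > 0) print $2, "raw =", $10 }'
```

A non-zero 199 (CRC errors) would instead point at cabling, matching Jeremy's Solaris anecdote, which is why it is deliberately excluded from the media-failure check.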
-- 
| Jeremy Chadwick j...@parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, USA |
| Making life hard for others since 1977. PGP 4BD6C0CB |
Re: New ZFSv28 patchset for 8-STABLE
On 01/01/2011 08:09 PM, Artem Belevich wrote:
> On Sat, Jan 1, 2011 at 10:18 AM, Attila Nagy wrote:
> > What I see:
> > - increased CPU load
> > - decreased L2 ARC hit rate, decreased SSD (ad[46]), therefore increased hard disk load (IOPS graph)
> > ...
> > Any ideas on what could cause these? I haven't upgraded the pool version and nothing was changed in the pool or in the file system.
>
> The fact that L2 ARC is full does not mean that it contains the right data. Initial L2ARC warm-up happens at a much higher rate than the rate L2ARC is updated after it's been filled initially. Even accelerated warm-up took almost a day in your case. In order for L2ARC to warm up properly you may have to wait quite a bit longer. My guess is that it should slowly improve over the next few days as data goes through L2ARC and those bits that are hit more often take residence there. The larger your data set, the longer it will take for L2ARC to catch the right data.
>
> Do you have similar graphs from the pre-patch system just after reboot? I suspect that it may show similarly abysmal L2ARC hit rates initially, too.

I've finally found the time to read the v28 patch and figured out the problem: vfs.zfs.l2arc_noprefetch was changed to 1, so it doesn't use the prefetched data on the L2ARC devices. This is a major hit in my case. Enabling this again restored the previous hit rates and lowered the load on the hard disks significantly.
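Since the whole exchange revolves around L2ARC hit rates, it may help to note how the rate can be computed from the ARC kstats; a minimal sketch feeding invented sample counter values through the arithmetic (on FreeBSD the real numbers would come from the kstat.zfs.misc.arcstats sysctls, as in the commented command):

```shell
# On a live FreeBSD system the input would come from:
#   sysctl kstat.zfs.misc.arcstats.l2_hits kstat.zfs.misc.arcstats.l2_misses
# Here sample values (600 hits, 400 misses) demonstrate the arithmetic:
# hit rate = 100 * hits / (hits + misses).
printf '%s\n' \
  'kstat.zfs.misc.arcstats.l2_hits: 600' \
  'kstat.zfs.misc.arcstats.l2_misses: 400' |
awk -F': ' '
  /l2_hits/   { hits = $2 }
  /l2_misses/ { miss = $2 }
  END { printf "L2ARC hit rate: %.1f%%\n", 100 * hits / (hits + miss) }'
```

Note the kstats are cumulative since boot, so for graphs like the munin ones in the thread you would difference two samples before dividing.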
Re: New ZFSv28 patchset for 8-STABLE
On 01/09/2011 10:00 AM, Attila Nagy wrote: On 12/16/2010 01:44 PM, Martin Matuska wrote: Hi everyone, following the announcement of Pawel Jakub Dawidek (p...@freebsd.org) I am providing a ZFSv28 testing patch for 8-STABLE. Link to the patch: http://people.freebsd.org/~mm/patches/zfs/v28/stable-8-zfsv28-20101215.patch.xz I've got an IO hang with dedup enabled (not sure it's related, I've started to rewrite all data on pool, which makes a heavy load): The processes are in various states: 65747 1001 1 54 10 28620K 24360K tx->tx 0 6:58 0.00% cvsup 80383 1001 1 54 10 40616K 30196K select 1 5:38 0.00% rsync 1501 www 1 44 0 7304K 2504K zio->i 0 2:09 0.00% nginx 1479 www 1 44 0 7304K 2416K zio->i 1 2:03 0.00% nginx 1477 www 1 44 0 7304K 2664K zio->i 0 2:02 0.00% nginx 1487 www 1 44 0 7304K 2376K zio->i 0 1:40 0.00% nginx 1490 www 1 44 0 7304K 1852K zfs 0 1:30 0.00% nginx 1486 www 1 44 0 7304K 2400K zfsvfs 1 1:05 0.00% nginx And everything which wants to touch the pool is/becomes dead. Procstat says about one process: # procstat -k 1497 PID TID COMM TDNAME KSTACK 1497 100257 nginx - mi_switch sleepq_wait __lockmgr_args vop_stdlock VOP_LOCK1_APV null_lock VOP_LOCK1_APV _vn_lock nullfs_root lookup namei vn_open_cred kern_openat syscallenter syscall Xfast_syscall No, it's not related. One of the disks in the RAIDZ2 pool went bad: (da4:arcmsr0:0:4:0): READ(6). CDB: 8 0 2 10 10 0 (da4:arcmsr0:0:4:0): CAM status: SCSI Status Error (da4:arcmsr0:0:4:0): SCSI status: Check Condition (da4:arcmsr0:0:4:0): SCSI sense: MEDIUM ERROR asc:11,0 (Unrecovered read error) and it seems it froze the whole zpool. Removing the disk by hand solved the problem. I've seen this previously on other machines with ciss. I wonder why ZFS didn't throw it out of the pool.
Re: New ZFSv28 patchset for 8-STABLE
On 12/16/2010 01:44 PM, Martin Matuska wrote: Hi everyone, following the announcement of Pawel Jakub Dawidek (p...@freebsd.org) I am providing a ZFSv28 testing patch for 8-STABLE. Link to the patch: http://people.freebsd.org/~mm/patches/zfs/v28/stable-8-zfsv28-20101215.patch.xz I've got an IO hang with dedup enabled (not sure it's related, I've started to rewrite all data on pool, which makes a heavy load): The processes are in various states: 65747 1001 1 54 10 28620K 24360K tx->tx 0 6:58 0.00% cvsup 80383 1001 1 54 10 40616K 30196K select 1 5:38 0.00% rsync 1501 www 1 44 0 7304K 2504K zio->i 0 2:09 0.00% nginx 1479 www 1 44 0 7304K 2416K zio->i 1 2:03 0.00% nginx 1477 www 1 44 0 7304K 2664K zio->i 0 2:02 0.00% nginx 1487 www 1 44 0 7304K 2376K zio->i 0 1:40 0.00% nginx 1490 www 1 44 0 7304K 1852K zfs 0 1:30 0.00% nginx 1486 www 1 44 0 7304K 2400K zfsvfs 1 1:05 0.00% nginx And everything which wants to touch the pool is/becomes dead. Procstat says about one process: # procstat -k 1497 PID TID COMM TDNAME KSTACK 1497 100257 nginx - mi_switch sleepq_wait __lockmgr_args vop_stdlock VOP_LOCK1_APV null_lock VOP_LOCK1_APV _vn_lock nullfs_root lookup namei vn_open_cred kern_openat syscallenter syscall Xfast_syscall
Re: New ZFSv28 patchset for 8-STABLE
On 01/03/2011 10:35 PM, Bob Friesenhahn wrote:
>> After four days, the L2 hit rate is still hovering around 10-20 percent (it was between 60-90), so I think it's clearly a regression in the ZFSv28 patch... And the massive growth in CPU usage can also very nicely be seen... I've updated the graphs (the switch time can be checked on the zfs-mem graph):
>> http://people.fsn.hu/~bra/freebsd/20110101-zfsv28-fbsd/
>> There is a new phenomenon: the large IOPS peaks. I use this munin script on a lot of machines and have never seen anything like this... I'm not sure whether it's related or not.
> It is not so clear that there is a problem. I am not sure what you are using this server for but it is wise

The IO pattern has changed radically, so for me it's a problem.

> to consider that this is the funny time when a new year starts, SPAM delivery goes through the roof, and employees and customers behave differently. You chose the worst time of the year to implement the change and observe behavior.

It's a free software mirror, ftp.fsn.hu, and I'm sure that it (the very low hit rate and the increased CPU usage) is not related to the time when I made the switch.

> CPU use is indeed increased somewhat. A lower loading of the l2arc is not necessarily a problem. The l2arc is usually bandwidth limited compared with main store so if bulk data can not be cached in RAM, then it is best left in main store. A smarter l2arc algorithm could put only the data producing the expensive IOPS (the ones requiring a seek) in the l2arc, lessening the amount of data cached on the device.

That would make sense if I didn't have 100-120 IOPS on the disks (for 7k2 RPM disks that's about their max, and gstat tells me the same) and as low as 10 percent L2 hit rate. What's smarter: having a 60-90% hit rate from the SSDs and moving the slow disk heads less, or having a 10-20 percent hit rate and killing the disks with random IO? If you are right, ZFS tries to be too smart and falls on its face with this kind of workload.
BTW, I've checked the v15-v28 patch for arc.c, and I can't see any L2ARC-related change there. I'm not sure whether the hypothetical logic would be there or in a different file; I haven't read it end to end.
Re: New ZFSv28 patchset for 8-STABLE
> After four days, the L2 hit rate is still hovering around 10-20 percent (it was between 60-90), so I think it's clearly a regression in the ZFSv28 patch... And the massive growth in CPU usage can also very nicely be seen... I've updated the graphs (the switch time can be checked on the zfs-mem graph):
> http://people.fsn.hu/~bra/freebsd/20110101-zfsv28-fbsd/
> There is a new phenomenon: the large IOPS peaks. I use this munin script on a lot of machines and have never seen anything like this... I'm not sure whether it's related or not.

It is not so clear that there is a problem. I am not sure what you are using this server for, but it is wise to consider that this is the funny time when a new year starts, SPAM delivery goes through the roof, and employees and customers behave differently. You chose the worst time of the year to implement the change and observe behavior.

CPU use is indeed increased somewhat. A lower loading of the l2arc is not necessarily a problem. The l2arc is usually bandwidth limited compared with main store, so if bulk data can not be cached in RAM, then it is best left in main store. A smarter l2arc algorithm could put only the data producing the expensive IOPS (the ones requiring a seek) in the l2arc, lessening the amount of data cached on the device.

Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
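[Editor's note: the 10-20 percent figure discussed above comes from the kstat counters munin samples. As a quick sanity check, the hit rate can be recomputed from kstat.zfs.misc.arcstats.l2_hits and l2_misses; a minimal sketch follows, with invented sample values (not numbers from this thread):

```python
# Sketch: compute the L2ARC hit rate from the arcstats kstat counters.
# The sysctl names referenced in the comments are the ones FreeBSD
# exposes under kstat.zfs.misc.arcstats; the sample values are made up.

def l2arc_hit_rate(l2_hits, l2_misses):
    """Return the L2ARC hit rate as a percentage of L2-eligible reads."""
    total = l2_hits + l2_misses
    if total == 0:
        return 0.0
    return 100.0 * l2_hits / total

# Values as they might be read with:
#   sysctl -n kstat.zfs.misc.arcstats.l2_hits
#   sysctl -n kstat.zfs.misc.arcstats.l2_misses
sample = {"l2_hits": 1_200_000, "l2_misses": 6_800_000}

rate = l2arc_hit_rate(sample["l2_hits"], sample["l2_misses"])
print(f"L2ARC hit rate: {rate:.1f}%")  # 15.0% with these sample counters
```

Note the counters are cumulative since boot, so for a graph like the ones linked above you would difference two samples before dividing.]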
Re: New ZFSv28 patchset for 8-STABLE
On 01/01/2011 08:09 PM, Artem Belevich wrote:
> On Sat, Jan 1, 2011 at 10:18 AM, Attila Nagy wrote:
>> What I see:
>> - increased CPU load
>> - decreased L2 ARC hit rate, decreased SSD (ad[46]) load, therefore increased hard disk load (IOPS graph)
>> ...
>> Any ideas on what could cause these? I haven't upgraded the pool version and nothing was changed in the pool or in the file system.
> The fact that L2 ARC is full does not mean that it contains the right data. Initial L2ARC warm-up happens at a much higher rate than the rate at which L2ARC is updated after it's been filled initially. Even accelerated warm-up took almost a day in your case. In order for L2ARC to warm up properly you may have to wait quite a bit longer. My guess is that it should slowly improve over the next few days as data goes through L2ARC and those bits that are hit more often take residence there. The larger your data set, the longer it will take for L2ARC to catch the right data.
> Do you have similar graphs from the pre-patch system just after reboot? I suspect that it may show similarly abysmal L2ARC hit rates initially, too.

After four days, the L2 hit rate is still hovering around 10-20 percent (it was between 60-90), so I think it's clearly a regression in the ZFSv28 patch... And the massive growth in CPU usage can also very nicely be seen... I've updated the graphs (the switch time can be checked on the zfs-mem graph):
http://people.fsn.hu/~bra/freebsd/20110101-zfsv28-fbsd/
There is a new phenomenon: the large IOPS peaks. I use this munin script on a lot of machines and have never seen anything like this... I'm not sure whether it's related or not.
Re: New ZFSv28 patchset for 8-STABLE
On 01/02/2011 03:45, Attila Nagy wrote:
> On 01/02/2011 05:06 AM, J. Hellenthal wrote:
>> On 01/01/2011 13:18, Attila Nagy wrote:
>>> On 12/16/2010 01:44 PM, Martin Matuska wrote:
>>>> Link to the patch:
>>>> http://people.freebsd.org/~mm/patches/zfs/v28/stable-8-zfsv28-20101215.patch.xz
>>> I've used this:
>>> http://people.freebsd.org/~mm/patches/zfs/v28/stable-8-zfsv28-20101223-nopython.patch.xz
>>> on a server with amd64, 8 G RAM, acting as a file server on ftp/http/rsync, the content being mounted read-only with nullfs in jails, and the daemons use sendfile (ftp and http).
>>> The effects can be seen here:
>>> http://people.fsn.hu/~bra/freebsd/20110101-zfsv28-fbsd/
>>> the exact moment of the switch can be seen on zfs_mem-week.png, where the L2 ARC has been discarded.
>>> What I see:
>>> - increased CPU load
>>> - decreased L2 ARC hit rate, decreased SSD (ad[46]) load, therefore increased hard disk load (IOPS graph)
>>> Maybe I could accept the higher system load as normal, because a lot of things changed between v15 and v28 (but I was hoping that if I use the same feature set, it will require less CPU), but dropping the L2ARC hit rate so radically seems to be a major issue somewhere.
>>> As you can see from the memory stats, I have enough kernel memory to hold the L2 headers, so the L2 devices got filled up to their maximum capacity.
>>> Any ideas on what could cause these? I haven't upgraded the pool version and nothing was changed in the pool or in the file system.
>> Running arc_summary.pl[1] -p4 should print a summary about your l2arc, and you should also notice in that section a high number of "SPA Mismatch"; mine usually grew to around 172k before I would notice a crash, and I could reliably trigger this while in scrub.
>> Whatever is causing this needs desperate attention!
>> I emailed mm@ privately off-list when I noticed this going on but have not received any feedback as of yet.
> It's at zero currently (2 days of uptime):
> kstat.zfs.misc.arcstats.l2_write_spa_mismatch: 0

Right, but do you have a 'cache' (l2arc) vdev attached to any pool in the system? This suggests to me that you do not at this time. If not, can you attach a cache vdev, run a scrub on it, and monitor the value of that MIB?

--
Regards, jhell,v
JJH48-ARIN
Re: New ZFSv28 patchset for 8-STABLE
On 01/02/2011 05:06 AM, J. Hellenthal wrote:
> On 01/01/2011 13:18, Attila Nagy wrote:
>> On 12/16/2010 01:44 PM, Martin Matuska wrote:
>>> Link to the patch:
>>> http://people.freebsd.org/~mm/patches/zfs/v28/stable-8-zfsv28-20101215.patch.xz
>> I've used this:
>> http://people.freebsd.org/~mm/patches/zfs/v28/stable-8-zfsv28-20101223-nopython.patch.xz
>> on a server with amd64, 8 G RAM, acting as a file server on ftp/http/rsync, the content being mounted read-only with nullfs in jails, and the daemons use sendfile (ftp and http).
>> The effects can be seen here:
>> http://people.fsn.hu/~bra/freebsd/20110101-zfsv28-fbsd/
>> the exact moment of the switch can be seen on zfs_mem-week.png, where the L2 ARC has been discarded.
>> What I see:
>> - increased CPU load
>> - decreased L2 ARC hit rate, decreased SSD (ad[46]) load, therefore increased hard disk load (IOPS graph)
>> Maybe I could accept the higher system load as normal, because a lot of things changed between v15 and v28 (but I was hoping that if I use the same feature set, it will require less CPU), but dropping the L2ARC hit rate so radically seems to be a major issue somewhere.
>> As you can see from the memory stats, I have enough kernel memory to hold the L2 headers, so the L2 devices got filled up to their maximum capacity.
>> Any ideas on what could cause these? I haven't upgraded the pool version and nothing was changed in the pool or in the file system.
> Running arc_summary.pl[1] -p4 should print a summary about your l2arc, and you should also notice in that section a high number of "SPA Mismatch"; mine usually grew to around 172k before I would notice a crash, and I could reliably trigger this while in scrub.
> Whatever is causing this needs desperate attention!
> I emailed mm@ privately off-list when I noticed this going on but have not received any feedback as of yet.
It's at zero currently (2 days of uptime):
kstat.zfs.misc.arcstats.l2_write_spa_mismatch: 0
Re: New ZFSv28 patchset for 8-STABLE
On 01/01/2011 13:18, Attila Nagy wrote:
> On 12/16/2010 01:44 PM, Martin Matuska wrote:
>> Link to the patch:
>> http://people.freebsd.org/~mm/patches/zfs/v28/stable-8-zfsv28-20101215.patch.xz
> I've used this:
> http://people.freebsd.org/~mm/patches/zfs/v28/stable-8-zfsv28-20101223-nopython.patch.xz
> on a server with amd64, 8 G RAM, acting as a file server on ftp/http/rsync, the content being mounted read-only with nullfs in jails, and the daemons use sendfile (ftp and http).
> The effects can be seen here:
> http://people.fsn.hu/~bra/freebsd/20110101-zfsv28-fbsd/
> the exact moment of the switch can be seen on zfs_mem-week.png, where the L2 ARC has been discarded.
> What I see:
> - increased CPU load
> - decreased L2 ARC hit rate, decreased SSD (ad[46]) load, therefore increased hard disk load (IOPS graph)
> Maybe I could accept the higher system load as normal, because a lot of things changed between v15 and v28 (but I was hoping that if I use the same feature set, it will require less CPU), but dropping the L2ARC hit rate so radically seems to be a major issue somewhere.
> As you can see from the memory stats, I have enough kernel memory to hold the L2 headers, so the L2 devices got filled up to their maximum capacity.
> Any ideas on what could cause these? I haven't upgraded the pool version and nothing was changed in the pool or in the file system.

Running arc_summary.pl[1] -p4 should print a summary about your l2arc, and you should also notice in that section a high number of "SPA Mismatch"; mine usually grew to around 172k before I would notice a crash, and I could reliably trigger this while in scrub.

Whatever is causing this needs desperate attention!

I emailed mm@ privately off-list when I noticed this going on but have not received any feedback as of yet.
[1] http://bit.ly/fdRiYT

--
Regards, jhell,v
JJH48-ARIN
Re: New ZFSv28 patchset for 8-STABLE
On 01/01/2011 08:09 PM, Artem Belevich wrote:
> On Sat, Jan 1, 2011 at 10:18 AM, Attila Nagy wrote:
>> What I see:
>> - increased CPU load
>> - decreased L2 ARC hit rate, decreased SSD (ad[46]) load, therefore increased hard disk load (IOPS graph)
>> ...
>> Any ideas on what could cause these? I haven't upgraded the pool version and nothing was changed in the pool or in the file system.
> The fact that L2 ARC is full does not mean that it contains the right data. Initial L2ARC warm-up happens at a much higher rate than the rate at which L2ARC is updated after it's been filled initially. Even accelerated warm-up took almost a day in your case. In order for L2ARC to warm up properly you may have to wait quite a bit longer. My guess is that it should slowly improve over the next few days as data goes through L2ARC and those bits that are hit more often take residence there. The larger your data set, the longer it will take for L2ARC to catch the right data.
> Do you have similar graphs from the pre-patch system just after reboot? I suspect that it may show similarly abysmal L2ARC hit rates initially, too.

Sadly no, but I remember that I saw increasing hit rates as the cache grew; that's why I wrote the email after only one and a half days. Currently it's at the same level it was at right after the reboot... We'll see after a few days.
Re: New ZFSv28 patchset for 8-STABLE
On Sat, Jan 1, 2011 at 10:18 AM, Attila Nagy wrote:
> What I see:
> - increased CPU load
> - decreased L2 ARC hit rate, decreased SSD (ad[46]) load, therefore increased hard disk load (IOPS graph)
> ...
> Any ideas on what could cause these? I haven't upgraded the pool version and nothing was changed in the pool or in the file system.

The fact that L2 ARC is full does not mean that it contains the right data. Initial L2ARC warm-up happens at a much higher rate than the rate at which L2ARC is updated after it's been filled initially. Even accelerated warm-up took almost a day in your case. In order for L2ARC to warm up properly you may have to wait quite a bit longer. My guess is that it should slowly improve over the next few days as data goes through L2ARC and those bits that are hit more often take residence there. The larger your data set, the longer it will take for L2ARC to catch the right data.

Do you have similar graphs from the pre-patch system just after reboot? I suspect that it may show similarly abysmal L2ARC hit rates initially, too.

--Artem
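[Editor's note: the warm-up behaviour described above can be put in rough numbers. The L2ARC feed thread writes at most vfs.zfs.l2arc_write_max bytes per feed interval, plus vfs.zfs.l2arc_write_boost extra while the device is still cold; the 8 MB defaults used below are assumptions from memory, so verify them with sysctl on your own system. A minimal back-of-the-envelope sketch:

```python
# Estimate how long a single fill pass of an L2ARC device takes at the
# accelerated (warm-up) feed rate. Tunable names are FreeBSD's
# vfs.zfs.l2arc_write_max / l2arc_write_boost; the 8 MB defaults here
# are assumed, not taken from the thread.

def warmup_hours(cache_bytes, write_max=8 << 20, write_boost=8 << 20,
                 feed_secs=1):
    """Hours to write cache_bytes once at write_max + write_boost per interval."""
    rate = (write_max + write_boost) / feed_secs  # bytes per second
    return cache_bytes / rate / 3600.0

# A 32 GB cache slice, as used elsewhere in this thread:
print(f"{warmup_hours(32 << 30):.1f} hours")  # ~0.6 hours at these defaults
```

Under these assumed defaults a 32 GB device could be filled once in well under an hour, so a day-long warm-up suggests the feed thread only rarely finds eligible buffers to copy, rather than being short of bandwidth.]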
Re: New ZFSv28 patchset for 8-STABLE
On 12/16/2010 01:44 PM, Martin Matuska wrote:
> Link to the patch:
> http://people.freebsd.org/~mm/patches/zfs/v28/stable-8-zfsv28-20101215.patch.xz

I've used this:
http://people.freebsd.org/~mm/patches/zfs/v28/stable-8-zfsv28-20101223-nopython.patch.xz
on a server with amd64, 8 G RAM, acting as a file server on ftp/http/rsync, the content being mounted read-only with nullfs in jails, and the daemons use sendfile (ftp and http).

The effects can be seen here:
http://people.fsn.hu/~bra/freebsd/20110101-zfsv28-fbsd/
the exact moment of the switch can be seen on zfs_mem-week.png, where the L2 ARC has been discarded.

What I see:
- increased CPU load
- decreased L2 ARC hit rate, decreased SSD (ad[46]) load, therefore increased hard disk load (IOPS graph)

Maybe I could accept the higher system load as normal, because a lot of things changed between v15 and v28 (but I was hoping that if I use the same feature set, it will require less CPU), but dropping the L2ARC hit rate so radically seems to be a major issue somewhere. As you can see from the memory stats, I have enough kernel memory to hold the L2 headers, so the L2 devices got filled up to their maximum capacity.

Any ideas on what could cause these? I haven't upgraded the pool version and nothing was changed in the pool or in the file system.

Thanks,
Re: New ZFSv28 patchset for 8-STABLE: Kernel Panic
On Wednesday, 29 December 2010, jhell wrote:
> Another note too, I think I read that you mentioned using the L2ARC and slog device on the same disk. You simply shouldn't do this; it could be contributing to the real cause, and there is absolutely no gain in either sanity or performance, and you will end up bottle-necking your system.

And why would that be? I've read so much conflicting information on the matter over the past few days that I'm starting to wonder if there's an actual definitive answer, or if anyone has a clue regarding what they're talking about. It ranges from "you should only use raw disks" to "FreeBSD isn't Solaris, so slices are fine"; "don't use slices because they can't be read by another OS, use partitions"; "it doesn't apply to SSDs"; and so on.

The way I look at it, the only thing that would bottleneck access to that SSD drive is the SATA interface itself. So whether I use two drives, or two partitions on the same drive, I can't see how it would make much difference, if any, other than the traditional "I think I know" argument. Surely latency as we know it with hard drives does not apply to SSDs.

Even within Sun's official documentation there is contradictory information, starting with the commands for how to add and remove cache/log devices. It seems to me that tuning ZFS is very much like black magic: everyone has their own idea about what to do, and not once did I get to read conclusive evidence about what is best, or find information people actually agree on.

As for using unofficial code, sure, I accept that risk now. I made a conscious decision to use it; there's now no way to go back, and I accept that. At the end of the day, it's the only thing that will make that code suitable for real-world conditions: testing. If that particular code isn't put under any actual stress, how else are you going to know if it's good or not?
I don't really like reading between the lines of your post that I shouldn't be surprised should anything break, or that it doesn't matter if it crashes. There's a deadlock occurring somewhere: it needs to be found. I know nothing about the ZFS code, so I could only do what I'm capable of under the circumstances: find a way to reproduce the problem consistently, and report as much information as I have so someone more clued-in will know what to do with it.

Hope that makes sense.
Jean-Yves
Re: New ZFSv28 patchset for 8-STABLE: Kernel Panic
On 12/28/2010 18:20, Martin Matuska wrote:
> Please don't consider these patches as production-ready.
> What we want to do is find and resolve as many bugs as possible.

I completely agree with Martin here. If you're running it, then you're willing to lose what you have if you haven't taken precautions to save your data somewhere else. Even though, with that said, ZFS does a pretty fine job of ensuring that nothing happens to it, it is still best practice to have a copy somewhere other than "IN THAT BOX" ;)

Another note too: I think I read that you mentioned using the L2ARC and slog device on the same disk. You simply shouldn't do this; it could be contributing to the real cause, and there is absolutely no gain in either sanity or performance, and you will end up bottle-necking your system.

> To help us fix these bugs, a way to reproduce the bug from a clean start (e.g. in virtualbox) would be great and would speed up finding the cause of the problem.
> Your problem looks like some sort of deadlock. In your case, when you experience the hang, try running "procstat -k -k PID" in another shell (console). That will give us valuable information.

Martin, I agree with the above that it may be some sort of live- or deadlock problem in this case. It would be useful to know what the values of the following sysctl(8) MIBs are, and how this reacts when they are set to the opposite of their current values:

vfs.zfs.l2arc_noprefetch
vfs.zfs.dedup.prefetch
vfs.zfs.prefetch_disable

The reason why I say this is that on one of my personal systems that I toy with, the box cannot make it very long with prefetch enabled, on either v15 or v28, after some 'unknown' commit to world on stable/8. Now this may actually be just a contributing factor that makes it happen sooner than it normally would, but it probably also directly relates to the exact problem.
I would love to see this go away, as I had been using the L2ARC with prefetch enabled for a long time and now all of a sudden it plainly does not work correctly. I also have about 19 core.txt.NN files from when this started happening, with various stack traces. If you would like these, just let me know and I'll privately mail them to you.

Regards,
--
jhell,v - JJH48-ARIN
Re: New ZFSv28 patchset for 8-STABLE: Kernel Panic
Hi,

On Wednesday, 29 December 2010, Martin Matuska wrote:
> Please don't consider these patches as production-ready.
> What we want to do is find and resolve as many bugs as possible.
>
> To help us fix these bugs, a way to reproduce the bug from a clean start (e.g. in virtualbox) would be great and would speed up finding the cause of the problem.
>
> Your problem looks like some sort of deadlock. In your case, when you experience the hang, try running "procstat -k -k PID" in another shell (console). That will give us valuable information.

I am away until next week now (hopefully no problem will occur until then); I will try to reproduce the issue then.

I have to say that v28 massively increased write performance over samba, over 3 times that of v14 or v15. How do you disable the ZIL with v28? I wanted to test performance in the case I'm trying to troubleshoot. Writing our file over samba:

v14-v15: 55s
v14-v15, ZIL disabled: 6s
v28: 16s (with or without a separate log drive: SSD Intel X25-M 40GB)
UFS boot drive: 14s

Playing with the only ZIL parameter showing in sysctl made no difference whatsoever. Sequential read shows over 280MB/s from that raidz array, similar with writes.

I started a thread in the freebsd forum:
http://forums.freebsd.org/showthread.php?t=20476

And finally, which patch should I try on your site?

Thanks
Re: New ZFSv28 patchset for 8-STABLE: Kernel Panic
Please don't consider these patches as production-ready. What we want to do is find and resolve as many bugs as possible.

To help us fix these bugs, a way to reproduce the bug from a clean start (e.g. in virtualbox) would be great and would speed up finding the cause of the problem.

Your problem looks like some sort of deadlock. In your case, when you experience the hang, try running "procstat -k -k PID" in another shell (console). That will give us valuable information.

Cheers,
mm

On 28.12.2010 18:39, Jean-Yves Avenard wrote:
> On 29 December 2010 03:15, Jean-Yves Avenard wrote:
>> # zpool import
>> load: 0.00 cmd: zpool 405 [spa_namespace_lock] 15.11r 0.00u 0.03s 0% 2556k
>> load: 0.00 cmd: zpool 405 [spa_namespace_lock] 15.94r 0.00u 0.03s 0% 2556k
>> load: 0.00 cmd: zpool 405 [spa_namespace_lock] 16.57r 0.00u 0.03s 0% 2556k
>> load: 0.00 cmd: zpool 405 [spa_namespace_lock] 16.95r 0.00u 0.03s 0% 2556k
>> load: 0.00 cmd: zpool 405 [spa_namespace_lock] 32.19r 0.00u 0.03s 0% 2556k
>> load: 0.00 cmd: zpool 405 [spa_namespace_lock] 32.72r 0.00u 0.03s 0% 2556k
>> load: 0.00 cmd: zpool 405 [spa_namespace_lock] 40.13r 0.00u 0.03s 0% 2556k
>>
>> ah ah! it's not the separate log that makes zpool hang, it's the cache!
>> Having the cache in prevents importing the pool again.
>>
>> rebooting: same deal... can't access the pool any longer!
>>
>> Hopefully this is enough of a hint for someone to track down the bug...
>
> More details, as I was crazy enough to try various things.
>
> The problem of zpool being stuck in spa_namespace_lock only occurs if you are using both the cache and the log at the same time. Use one or the other: then there's no issue.
>
> But the instant you add both log and cache to the pool, it becomes unusable.
>
> Now, I haven't tried using cache and log from a different disk. The motherboard on the server has 8 SATA ports, and I have no free port to add another disk.
> So my only option to have both a log and cache device in my zfs pool is to use two slices on the same disk.
>
> Hope this helps..
> Jean-Yves
Re: New ZFSv28 patchset for 8-STABLE: Kernel Panic
On Tue, Dec 28, 2010 at 9:39 AM, Jean-Yves Avenard wrote:
> Now, I haven't tried using cache and log from a different disk. The motherboard on the server has 8 SATA ports, and I have no free port to add another disk. So my only option to have both a log and cache device in my zfs pool is to use two slices on the same disk.

For testing, you can always just connect a USB stick and use that for the cache. I've done this on large ZFS systems (24 drives) and on my home server (5 drives). Works nicely.

That should narrow it down to either "can't use cache/log on same device" or "can't use cache and log at the same time".

--
Freddie Cash
fjwc...@gmail.com
Re: New ZFSv28 patchset for 8-STABLE: ARRRGG HELP !!
On Tue, Dec 28, 2010 at 8:58 AM, Jean-Yves Avenard wrote:
> On 28 December 2010 08:56, Freddie Cash wrote:
>> Is that a typo, or the actual command you used? You have an extra "s" in there. Should be "log" and not "logs". However, I don't think that command is correct either.
>> I believe you want to use the "detach" command, not "remove".
>> # zpool detach pool label/zil
>
> well, I tried the detach command:
>
> server4# zpool detach pool ada1s1
> cannot detach ada1s1: only applicable to mirror and replacing vdevs
>
> server4# zpool remove pool ada1s1
> server4#
>
> so you need to use remove, and adding log (or cache) makes no difference whatsoever..

Interesting, thanks for the confirmation. I don't have any ZFS systems using log devices, so I can only go by what's in the docs.

May want to make a note that the man page (at least for ZFSv15 in FreeBSD 8.1) includes several sections that use "zpool detach" when talking about log devices. If that's still in the man page for ZFSv28, it'll need to be cleaned up.

--
Freddie Cash
fjwc...@gmail.com
Re: New ZFSv28 patchset for 8-STABLE: Kernel Panic
On 29 December 2010 03:15, Jean-Yves Avenard wrote:
> # zpool import
> load: 0.00 cmd: zpool 405 [spa_namespace_lock] 15.11r 0.00u 0.03s 0% 2556k
> load: 0.00 cmd: zpool 405 [spa_namespace_lock] 15.94r 0.00u 0.03s 0% 2556k
> load: 0.00 cmd: zpool 405 [spa_namespace_lock] 16.57r 0.00u 0.03s 0% 2556k
> load: 0.00 cmd: zpool 405 [spa_namespace_lock] 16.95r 0.00u 0.03s 0% 2556k
> load: 0.00 cmd: zpool 405 [spa_namespace_lock] 32.19r 0.00u 0.03s 0% 2556k
> load: 0.00 cmd: zpool 405 [spa_namespace_lock] 32.72r 0.00u 0.03s 0% 2556k
> load: 0.00 cmd: zpool 405 [spa_namespace_lock] 40.13r 0.00u 0.03s 0% 2556k
>
> ah ah! it's not the separate log that makes zpool hang, it's the cache!
> Having the cache in prevents importing the pool again.
>
> rebooting: same deal... can't access the pool any longer!
>
> Hopefully this is enough of a hint for someone to track down the bug...

More details, as I was crazy enough to try various things.

The problem of zpool being stuck in spa_namespace_lock only occurs if you are using both the cache and the log at the same time. Use one or the other: then there's no issue. But the instant you add both log and cache to the pool, it becomes unusable.

Now, I haven't tried using cache and log from a different disk. The motherboard on the server has 8 SATA ports, and I have no free port to add another disk. So my only option to have both a log and cache device in my zfs pool is to use two slices on the same disk.

Hope this helps..
Jean-Yves
Re: New ZFSv28 patchset for 8-STABLE: ARRRGG HELP !!
On 28 December 2010 08:56, Freddie Cash wrote:
> Is that a typo, or the actual command you used? You have an extra "s" in there. Should be "log" and not "logs". However, I don't think that command is correct either.
>
> I believe you want to use the "detach" command, not "remove".
>
> # zpool detach pool label/zil

well, I tried the detach command:

server4# zpool detach pool ada1s1
cannot detach ada1s1: only applicable to mirror and replacing vdevs

server4# zpool remove pool ada1s1
server4#

so you need to use remove, and adding log (or cache) makes no difference whatsoever..
Re: New ZFSv28 patchset for 8-STABLE: Kernel Panic
Hi,

On 27 December 2010 16:04, jhell wrote:
> 1) Set vfs.zfs.recover=1 at the loader prompt (OK set vfs.zfs.recover=1)
> 2) Boot into single user mode without opensolaris.ko and zfs.ko loaded
> 3) ( mount -w / ) to make sure you can remove and also write a new zpool.cache as needed
> 4) Remove /boot/zfs/zpool.cache
> 5) kldload both zfs and opensolaris, i.e. ( kldload zfs ) should do the trick
> 6) Verify that vfs.zfs.recover=1 is set, then ( zpool import pool )
> 7) Give it a little bit; monitor activity using Ctrl+T to see activity

Ok.. I've got into the same situation again, no idea why this time. I've followed your instructions, and sure enough I could import my pool again. However, I wanted to find out what was going on, so I did:

zpool export pool

followed by

zpool import

And guess what... it hung zpool again. I can't Ctrl-C it; I have to reboot. So here we go again. I rebooted as above.

zpool import pool -> ok

This time, I decided that maybe what was screwing things up was the cache.
zpool remove pool ada1s2 -> ok

# zpool status
  pool: pool
 state: ONLINE
 scan: scrub repaired 0 in 18h20m with 0 errors on Tue Dec 28 10:28:05 2010
config:

	NAME          STATE   READ WRITE CKSUM
	pool          ONLINE     0     0     0
	  raidz1-0    ONLINE     0     0     0
	    ada2      ONLINE     0     0     0
	    ada3      ONLINE     0     0     0
	    ada4      ONLINE     0     0     0
	    ada5      ONLINE     0     0     0
	    ada6      ONLINE     0     0     0
	    ada7      ONLINE     0     0     0
	logs
	  ada1s1      ONLINE     0     0     0

errors: No known data errors

# zpool export pool -> ok
# zpool import pool -> ok
# zpool add pool cache /dev/ada1s2 -> ok

# zpool status
  pool: pool
 state: ONLINE
 scan: scrub repaired 0 in 18h20m with 0 errors on Tue Dec 28 10:28:05 2010
config:

	NAME          STATE   READ WRITE CKSUM
	pool          ONLINE     0     0     0
	  raidz1-0    ONLINE     0     0     0
	    ada2      ONLINE     0     0     0
	    ada3      ONLINE     0     0     0
	    ada4      ONLINE     0     0     0
	    ada5      ONLINE     0     0     0
	    ada6      ONLINE     0     0     0
	    ada7      ONLINE     0     0     0
	logs
	  ada1s1      ONLINE     0     0     0
	cache
	  ada1s2      ONLINE     0     0     0

errors: No known data errors

# zpool export pool -> ok
# zpool import
load: 0.00 cmd: zpool 405 [spa_namespace_lock] 15.11r 0.00u 0.03s 0% 2556k
load: 0.00 cmd: zpool 405 [spa_namespace_lock] 15.94r 0.00u 0.03s 0% 2556k
load: 0.00 cmd: zpool 405 [spa_namespace_lock] 16.57r 0.00u 0.03s 0% 2556k
load: 0.00 cmd: zpool 405 [spa_namespace_lock] 16.95r 0.00u 0.03s 0% 2556k
load: 0.00 cmd: zpool 405 [spa_namespace_lock] 32.19r 0.00u 0.03s 0% 2556k
load: 0.00 cmd: zpool 405 [spa_namespace_lock] 32.72r 0.00u 0.03s 0% 2556k
load: 0.00 cmd: zpool 405 [spa_namespace_lock] 40.13r 0.00u 0.03s 0% 2556k

Ah ah! It's not the separate log that makes zpool crash, it's the cache!

Having the cache in prevents the pool from being imported again.

Rebooting: same deal... can't access the pool any longer!

Hopefully this is enough of a hint for someone to track down the bug...
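jhell's recovery steps (quoted earlier in this thread) can be written out as one sketch. The tunable name, paths and the pool name "pool" are as given in the thread; this assumes a FreeBSD system booted to single user mode with the ZFS modules not yet loaded:

```shell
# Recovery sketch for a pool stuck in [spa_namespace_lock].
# Beforehand, at the loader prompt:  OK set vfs.zfs.recover=1

mount -w /                   # make the root filesystem writable
rm /boot/zfs/zpool.cache     # drop the stale pool cache
kldload zfs                  # pulls in opensolaris.ko as a dependency
sysctl vfs.zfs.recover       # verify it reads 1 before importing
zpool import pool            # re-import; press Ctrl+T to watch progress
```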
Re: New ZFSv28 patchset for 8-STABLE: ARRRGG HELP !!
Well. Today I added the log device:

zpool add pool log /dev/ada1s1 (an 8GB slice on an Intel X25 SSD)

then added the cache (32GB):

zpool add pool cache /dev/ada1s2

So far so good; zpool status -> all good.

Reboot: it hangs.

Booted in single user mode, zpool status:

ZFS filesystem version 5
ZFS storage pool version 28

and that's it, nothing more. Just like before, when I thought that removing the log disk had failed. This time no error, nothing... just a nasty hang and an unusable system again... :(
Re: New ZFSv28 patchset for 8-STABLE: ARRRGG HELP !!
Hi

On Tuesday, 28 December 2010, Freddie Cash wrote:
> On Sun, Dec 26, 2010 at 4:43 PM, Jean-Yves Avenard wrote:
>> On 27 December 2010 09:55, Jean-Yves Avenard wrote:
>>> Hi there.
>>>
>>> I used stable-8-zfsv28-20101223-nopython.patch.xz from
>>> http://people.freebsd.org/~mm/patches/zfs/v28/
>>
>> I did the following:
>>
>> # zpool status
>>   pool: pool
>>  state: ONLINE
>>  scan: none requested
>> config:
>>
>>   NAME           STATE   READ WRITE CKSUM
>>   pool           ONLINE     0     0     0
>>     raidz1-0     ONLINE     0     0     0
>>       ada2       ONLINE     0     0     0
>>       ada3       ONLINE     0     0     0
>>       ada4       ONLINE     0     0     0
>>       ada5       ONLINE     0     0     0
>>       ada6       ONLINE     0     0     0
>>       ada7       ONLINE     0     0     0
>>   cache
>>     label/zcache ONLINE     0     0     0
>>
>> errors: No known data errors
>>
>> so far so good
>>
>> [r...@server4 /pool/home/jeanyves_avenard]# zpool add pool log /dev/label/zil
>> [r...@server4 /pool/home/jeanyves_avenard]# zpool status
>>   pool: pool
>>  state: ONLINE
>>  scan: none requested
>> config:
>>
>>   NAME           STATE   READ WRITE CKSUM
>>   pool           ONLINE     0     0     0
>>     raidz1-0     ONLINE     0     0     0
>>       ada2       ONLINE     0     0     0
>>       ada3       ONLINE     0     0     0
>>       ada4       ONLINE     0     0     0
>>       ada5       ONLINE     0     0     0
>>       ada6       ONLINE     0     0     0
>>       ada7       ONLINE     0     0     0
>>   logs
>>     label/zil    ONLINE     0     0     0
>>   cache
>>     label/zcache ONLINE     0     0     0
>>
>> errors: No known data errors
>>
>> so far so good:
>>
>> # zpool remove pool logs label/zil
>> cannot remove logs: no such device in pool
>
> Is that a typo, or the actual command you used? You have an extra "s"
> in there. Should be "log" and not "logs". However, I don't think
> that command is correct either.
>
> I believe you want to use the "detach" command, not "remove".
> # zpool detach pool label/zil
>
> --
> Freddie Cash
> fjwc...@gmail.com

It was a typo; it should have been "log" (according to Sun's docs). As the status output was showing "logs", that is what I typed. According to Sun's docs, the command is "zpool remove pool" followed by the log or cache device.

A typo should never have resulted in what happened. Showing an error, for sure; but zpool hanging and a kernel panic?
Re: New ZFSv28 patchset for 8-STABLE: ARRRGG HELP !!
On Sun, Dec 26, 2010 at 4:43 PM, Jean-Yves Avenard wrote: > On 27 December 2010 09:55, Jean-Yves Avenard wrote: >> Hi there. >> >> I used stable-8-zfsv28-20101223-nopython.patch.xz from >> http://people.freebsd.org/~mm/patches/zfs/v28/ > > I did the following: > > # zpool status > pool: pool > state: ONLINE > scan: none requested > config: > > NAME STATE READ WRITE CKSUM > pool ONLINE 0 0 0 > raidz1-0 ONLINE 0 0 0 > ada2 ONLINE 0 0 0 > ada3 ONLINE 0 0 0 > ada4 ONLINE 0 0 0 > ada5 ONLINE 0 0 0 > ada6 ONLINE 0 0 0 > ada7 ONLINE 0 0 0 > cache > label/zcache ONLINE 0 0 0 > > errors: No known data errors > > so far so good > > [r...@server4 /pool/home/jeanyves_avenard]# zpool add pool log > /dev/label/zil [r...@server4 /pool/home/jeanyves_avenard]# zpool > status > pool: pool > state: ONLINE > scan: none requested > config: > > NAME STATE READ WRITE CKSUM > pool ONLINE 0 0 0 > raidz1-0 ONLINE 0 0 0 > ada2 ONLINE 0 0 0 > ada3 ONLINE 0 0 0 > ada4 ONLINE 0 0 0 > ada5 ONLINE 0 0 0 > ada6 ONLINE 0 0 0 > ada7 ONLINE 0 0 0 > logs > label/zil ONLINE 0 0 0 > cache > label/zcache ONLINE 0 0 0 > > errors: No known data errors > > so far so good: > > # zpool remove pool logs label/zil > cannot remove logs: no such device in pool Is that a typo, or the actual command you used? You have an extra "s" in there. Should be "log" and not "logs". However, I don't think that command is correct either. I believe you want to use the "detach" command, not "remove". # zpool detach pool label/zil -- Freddie Cash fjwc...@gmail.com ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: New ZFSv28 patchset for 8-STABLE: Kernel Panic
Hi

On 27 December 2010 16:04, jhell wrote:
>
> Before anything else can you: (in FreeBSD)
>
> 1) Set vfs.zfs.recover=1 at the loader prompt (OK set vfs.zfs.recover=1)
> 2) Boot into single user mode without opensolaris.ko and zfs.ko loaded
> 3) ( mount -w / ) to make sure you can remove and also write a new
> zpool.cache as needed.
> 4) Remove /boot/zfs/zpool.cache
> 5) kldload both zfs and opensolaris i.e. ( kldload zfs ) should do the trick
> 6) verify that vfs.zfs.recover=1 is set then ( zpool import pool )
> 7) Give it a little bit; monitor activity using Ctrl+T.
>
> You should have your pool back to a working condition after this. The
> reason why oi_127 can't work with your pool is because it cannot see
> FreeBSD generic labels. The only way to work around this for oi_127
> would be to either point it directly at the replacing device or to use
> actual slices or partitions for your slogs and other such devices.
>
> Use adaNsN or gpt or gptid labels for working with your pool if you plan on
> using other OS's for recovery efforts.

Hi. Thank you for your response; I will keep it somewhere safe should this ever occur again.

Let me explain why I used labels. It all started when I was trying to solve a serious performance issue with ZFS: http://forums.freebsd.org/showthread.php?t=20476

One of the steps in troubleshooting the latency problem was to use AHCI. I had always thought that activating AHCI in the BIOS was sufficient to get it going on FreeBSD; it turned out that was not the case and that I needed to load ahci.ko as well. After doing so, my system wouldn't boot any more, as it was trying to use /dev/ad0, which no longer existed and was now named /dev/ada0.

So I put a label on the boot disk to ensure I would never encounter that problem again. In the same mindset, I used labels for the cache and log devices I later added to the pool... I have to say, however, that ZFS had no issue using the labels until I tried to remove one.
I had rebooted several times without any problems; zpool status never hung.

It all started to play up when I ran the command:

zpool remove pool log label/zil

zpool never came back from that command (I let it run for a good 30 minutes, during which I was fearing the worst; once I rebooted and nothing worked, suicide looked like an appealing alternative).

It is very disappointing, however, that with the pool in a non-working state, none of the commands available to troubleshoot the problem would actually work (which I'm guessing is related to zpool looking for a device name that it can never find, it being a label).

I also can't explain why FreeBSD would kernel panic once it was finally in a state where it could do an import.

I have to say, unfortunately, that if I hadn't had OpenIndiana, I would probably still be crying underneath my desk right now...

Thanks again for your email. I have no doubt that it would have worked; in my situation I got your answer in just 2 hours, which is better than any paid support could provide!

Jean-Yves

PS: saving my 5MB files over the network went from 40-55s with v15 to a constant 16s with v28... I can't test with the ZIL completely disabled; it seems that vfs.zfs.zil_disable has been removed, and so has vfs.zfs.write_limit_override.
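A quick, generic way to check which ZFS tunables a given build still exposes (the PS above notes vfs.zfs.zil_disable and vfs.zfs.write_limit_override disappearing under the v28 patchset). These are harmless read-only commands; the two sysctl names are the ones mentioned in the thread:

```shell
# List every vfs.zfs.* knob the loaded module actually exposes...
sysctl vfs.zfs | sort

# ...then probe the two tunables in question; an "unknown oid" error
# means they no longer exist in this build.
sysctl vfs.zfs.zil_disable
sysctl vfs.zfs.write_limit_override
```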
Re: New ZFSv28 patchset for 8-STABLE: Kernel Panic
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 12/26/2010 23:17, Jean-Yves Avenard wrote: > Responding to myself again :P > > On 27 December 2010 13:28, Jean-Yves Avenard wrote: >> tried to force a zpool import >> >> got a kernel panic: >> panic: solaris assert: weight >= space && weight <= 2 * space, file: >> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c, >> line: 793 >> >> cpuid = 5 >> KDB: stack backtrace >> #0: 0xff805f64be at kdb_backtrace >> #1 .. panic+0x187 >> #2 .. metaslab_weight+0xe1 >> #3: metaslab_sync_done+0x21e >> #4: vdev_sync_done >> #5: spa_sync+0x6a2 >> #6 txg_sync_thread+0x147 >> #7: fork_exit+0x118 >> #8: fork_trampoline+0xe >> >> uptime 2m25s.. >> > > Command used to import in FreeBSD was: > zpool import -fF -R / pool > which told me that zil was missing, and to use -m > > I booted openindiana (which is the only distribution I could ifnd with > a live CD supporting zpool v28) > > Doing a zpool import actually made it show that the pool had > successfully been repaired by the command above. > It did think that the pool was in use (and it was, as I didn't do a > zpool export). > > So I run zpool import -f pool in openindiana, and luckily, all my > files were there. Not sure if anything was lost... > > in openindiana, I then ran zpool export and rebooted into FreeBSD. > > I ran zpool import there, and got the same original behaviour of a > zpool import hanging, I can't sigbreak it nothing. Only left with the > option of rebooting. > > Back into openindiana, tried to remove the log drive, but no luck. > Always end up with the message: > cannot remove log: no such device in pool > > Googling that error seems to be a common issue when trying to remove a > ZIL but while that message is displayed, the log drive is actually > removed. > Not in my case.. 
> > So I tried something brave: > In Open Indiana > zpool export pool > > rebooted the PC, disconnected the SSD drive I had use and rebooted > into openindiana > ran zpool import -fF -R / pool (complained that log device was > missing) and again zpool import -fF -m -R / pool > > zfs status showed that logs device being unavailable this time. > > ran zpool remove pool log hex_number_showing_in_place > > It showed the error "cannot remove log: no such device in pool" > but zpool status showed that everything was allright > > zpool export pool , then reboot into FreeBSD > > zpool import this time didn't hang and successfully imported my pool. > All data seems to be there. > > > Summary: v28 is still buggy when it comes to removing the log > device... And once something is screwed, zpool utility becomes > hopeless as it hangs. > > So better have a OpenIndiana live CD to repair things :( > > But I won't be trying to remove the log device for a long time ! at > least the data can be recovered when it happens.. > > Could it be that this is related to the v28 patch I used > (http://people.freebsd.org/~mm/patches/zfs/v28/stable-8-zfsv28-20101223-nopython.patch.xz > and should have stuck to the standard one). > Before anything else can you: (in FreeBSD) 1) Set vfs.zfs.recover=1 at the loader prompt (OK set vfs.zfs.recover=1) 2) Boot into single user mode without opensolaris.ko and zfs.ko loaded 3) ( mount -w / ) to make sure you can remove and also write new zpool.cache as needed. 3) Remove /boot/zfs/zpool.cache 4) kldload both zfs and opensolaris i.e. ( kldload zfs ) should do the trick 5) verify that vfs.zfs.recover=1 is set then ( zpool import pool ) 6) Give it a little bit monitor activity using Ctrl+T to see activity. You should have your pool back to a working condition after this. The reason why oi_127 can't work with your pool is because it cannot see FreeBSD generic labels. 
The only way to work around this for oi_127 would be to either point it directly at the replacing device or to use actual slices or partitions for your slogs and other such devices. Use adaNsN or gpt or gptid for working with your pool if you plan on using other OS's for recovery effects. Regards, - -- jhell,v -BEGIN PGP SIGNATURE- iQEcBAEBAgAGBQJNGB5QAAoJEJBXh4mJ2FR+rUAH/1HhzfnDI1jTICrA2Oiwyk12 BLXac0HoTY+NVUrdieMUWPh781oiB0eOuzjnOprev1D2uTqrmKvivnWdzuT/5Kfi vWSSnIqWiNbtvA5ocgWs7IPtcaD5pZS06oToihvLlsEiRyYXTSh2XD7JOsLbQMNb uKTfAvGI/XnNX0OY3RNI+OOa031GfpdHEWon8oi5aFBYdsDsv3Wn8Z45qCp8yfI+ WZlI+P+uunrmfgZdSzDbpAxeByhTB+8ntnB6QC4d0GRXKwqTVrFmIw5yuuqRAIf8 oCJYDhH6AUi+cxAGDExhLz2e75mEZNHAqB2nkxTaWbwL/rGjBnVidNm1aj7WnWw= =FlmB -END PGP SIGNATURE- ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: New ZFSv28 patchset for 8-STABLE: Kernel Panic
Responding to myself again :P

On 27 December 2010 13:28, Jean-Yves Avenard wrote:
> tried to force a zpool import
>
> got a kernel panic:
> panic: solaris assert: weight >= space && weight <= 2 * space, file:
> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c,
> line: 793
>
> cpuid = 5
> KDB: stack backtrace
> #0: 0xff805f64be at kdb_backtrace
> #1: panic+0x187
> #2: metaslab_weight+0xe1
> #3: metaslab_sync_done+0x21e
> #4: vdev_sync_done
> #5: spa_sync+0x6a2
> #6: txg_sync_thread+0x147
> #7: fork_exit+0x118
> #8: fork_trampoline+0xe
>
> uptime 2m25s..

The command used to import in FreeBSD was:

zpool import -fF -R / pool

which told me that the zil was missing, and to use -m.

I booted OpenIndiana (the only distribution I could find with a live CD supporting zpool v28).

Doing a zpool import actually showed that the pool had been successfully repaired by the command above. It did think that the pool was in use (and it was, as I didn't do a zpool export).

So I ran zpool import -f pool in OpenIndiana, and luckily all my files were there. Not sure if anything was lost...

In OpenIndiana, I then ran zpool export and rebooted into FreeBSD.

I ran zpool import there, and got the same original behaviour of zpool import hanging; I can't sigbreak it, nothing. Only left with the option of rebooting.

Back into OpenIndiana, I tried to remove the log drive, but no luck. I always end up with the message:

cannot remove log: no such device in pool

Googling that error, it seems to be a common issue when trying to remove a ZIL that, while the message is displayed, the log drive is actually removed anyway. Not in my case..

So I tried something brave. In OpenIndiana:

zpool export pool

rebooted the PC, disconnected the SSD drive I had used, and rebooted into OpenIndiana.

Ran zpool import -fF -R / pool (it complained that the log device was missing) and then zpool import -fF -m -R / pool.

zpool status showed the log device as unavailable this time.
ran zpool remove pool log hex_number_showing_in_place

It showed the error "cannot remove log: no such device in pool", but zpool status showed that everything was all right.

zpool export pool, then reboot into FreeBSD.

This time zpool import didn't hang and successfully imported my pool. All the data seems to be there.

Summary: v28 is still buggy when it comes to removing the log device... and once something is screwed, the zpool utility becomes hopeless, as it hangs.

So better have an OpenIndiana live CD handy to repair things :(

But I won't be trying to remove the log device again for a long time! At least the data can be recovered when it happens..

Could it be that this is related to the v28 patch I used (http://people.freebsd.org/~mm/patches/zfs/v28/stable-8-zfsv28-20101223-nopython.patch.xz) and I should have stuck to the standard one?

Jean-Yves

Breathing again!
Re: New ZFSv28 patchset for 8-STABLE: Kernel Panic
Tried to force a zpool import and got a kernel panic:

panic: solaris assert: weight >= space && weight <= 2 * space, file:
/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c,
line: 793

cpuid = 5
KDB: stack backtrace
#0: 0xff805f64be at kdb_backtrace
#1: panic+0x187
#2: metaslab_weight+0xe1
#3: metaslab_sync_done+0x21e
#4: vdev_sync_done
#5: spa_sync+0x6a2
#6: txg_sync_thread+0x147
#7: fork_exit+0x118
#8: fork_trampoline+0xe

uptime 2m25s.. sorry for not writing down all the RAM addresses in the backtrace...

Starting to smell very poorly :(
Re: New ZFSv28 patchset for 8-STABLE: ARRRGG HELP !!
Rebooting in single-user mode:

zpool status pool

or

zpool scrub pool

hang just the same... and there's no disk activity either...

Will download a live CD of OpenIndiana; hopefully it will show me what's wrong :(

Jean-Yves
Re: New ZFSv28 patchset for 8-STABLE: ARRRGG HELP !!
On 27 December 2010 09:55, Jean-Yves Avenard wrote:
> Hi there.
>
> I used stable-8-zfsv28-20101223-nopython.patch.xz from
> http://people.freebsd.org/~mm/patches/zfs/v28/

I did the following:

# zpool status
  pool: pool
 state: ONLINE
 scan: none requested
config:

	NAME           STATE   READ WRITE CKSUM
	pool           ONLINE     0     0     0
	  raidz1-0     ONLINE     0     0     0
	    ada2       ONLINE     0     0     0
	    ada3       ONLINE     0     0     0
	    ada4       ONLINE     0     0     0
	    ada5       ONLINE     0     0     0
	    ada6       ONLINE     0     0     0
	    ada7       ONLINE     0     0     0
	cache
	  label/zcache ONLINE     0     0     0

errors: No known data errors

so far so good

[r...@server4 /pool/home/jeanyves_avenard]# zpool add pool log /dev/label/zil
[r...@server4 /pool/home/jeanyves_avenard]# zpool status
  pool: pool
 state: ONLINE
 scan: none requested
config:

	NAME           STATE   READ WRITE CKSUM
	pool           ONLINE     0     0     0
	  raidz1-0     ONLINE     0     0     0
	    ada2       ONLINE     0     0     0
	    ada3       ONLINE     0     0     0
	    ada4       ONLINE     0     0     0
	    ada5       ONLINE     0     0     0
	    ada6       ONLINE     0     0     0
	    ada7       ONLINE     0     0     0
	logs
	  label/zil    ONLINE     0     0     0
	cache
	  label/zcache ONLINE     0     0     0

errors: No known data errors

so far so good:

# zpool remove pool logs label/zil
cannot remove logs: no such device in pool
^C

Great... now nothing responds.

Rebooting the box, I can boot in single user mode, but doing zpool status gives me:

ZFS filesystem version 5
ZFS storage pool version 28

and it hangs there forever...

What should I do :( ?
Re: New ZFSv28 patchset for 8-STABLE
Hi there.

I used stable-8-zfsv28-20101223-nopython.patch.xz from http://people.freebsd.org/~mm/patches/zfs/v28/ simply because it was the most recent at that location. Is this the one to use?

Just asking because the file server I installed it on stopped responding this morning, and a remote power cycle didn't work. So I've got to get to the office and see what went on :( I suspect a kernel panic of some kind.

Jean-Yves
Re: New ZFSv28 patchset for 8-STABLE
Hi Martin, List,

Patched up to ZFSv28 20101218 and it is working as expected. Great job! There seem to be some assertion errors left to be fixed yet, for example:

Panic String: solaris assert: vd->vdev_stat.vs_alloc == 0 (0x18a000 == 0x0), file: /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c, line: 4623

#3 0x84caca35 in spa_vdev_remove (spa=0x84dba000, guid=2330662286000872312, unspare=0)
    at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c:4623
4623    ASSERT3U(vd->vdev_stat.vs_alloc, ==, 0);
(kgdb) list
4618
4619    /*
4620     * The evacuation succeeded. Remove any remaining MOS metadata
4621     * associated with this vdev, and wait for these changes to sync.
4622     */
4623    ASSERT3U(vd->vdev_stat.vs_alloc, ==, 0);
4624    txg = spa_vdev_config_enter(spa);
4625    vd->vdev_removing = B_TRUE;
4626    vdev_dirty(vd, 0, NULL, txg);
4627    vdev_config_dirty(vd);

This happens on i386 upon ( zpool remove pool ). Also, if it is of any relevance, this happens during ``offline'' too.

If further information is needed I still have the cores and kernel; just let me know what you need.

Regards,

--
jhell,v
Re: Updated py-zfs ? Re: New ZFSv28 patchset for 8-STABLE
Thanks, I'm going to check it out! On 23 Dec 2010, at 9:58, Martin Matuska wrote: > I have updated the py-zfs port right now so it should work with v28, > too. The problem was a non-existing solaris.misc module, I had to patch > and remove references to this module. > > Cheers, > mm > Regards, Ruben ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: Updated py-zfs ? Re: New ZFSv28 patchset for 8-STABLE
I have updated the py-zfs port right now so it should work with v28, too. The problem was a non-existing solaris.misc module, I had to patch and remove references to this module. Cheers, mm Dňa 23.12.2010 09:27, Ruben van Staveren wrote / napísal(a): > Hi, > > On 16 Dec 2010, at 13:44, Martin Matuska wrote: > >> Hi everyone, >> >> following the announcement of Pawel Jakub Dawidek (p...@freebsd.org) I am >> providing a ZFSv28 testing patch for 8-STABLE. > > Where can I find an updated py-zfs so that zfs (un)allow/userspace/groupspace > can be tested ? > > Regards, > Ruben > > ___ > freebsd-stable@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-stable > To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org" ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Updated py-zfs ? Re: New ZFSv28 patchset for 8-STABLE
Hi, On 16 Dec 2010, at 13:44, Martin Matuska wrote: > Hi everyone, > > following the announcement of Pawel Jakub Dawidek (p...@freebsd.org) I am > providing a ZFSv28 testing patch for 8-STABLE. Where can I find an updated py-zfs so that zfs (un)allow/userspace/groupspace can be tested ? Regards, Ruben ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: New ZFSv28 patchset for 8-STABLE
Ok,

On 16 Dec 2010, at 13:44, Martin Matuska wrote:
> Please test, test, test. Chances are this is the last patchset before
> v28 going to HEAD (finally) and after a reasonable testing period into
> 8-STABLE.
> Especially test new changes, like boot support and sendfile(2) support.
> Also be sure to verify if you can import your existing ZFS pools
> (v13-v15) when running v28 or boot from your existing pools.
>
> Please test the (v13-v15) compatibility layer as well:
> Old userland + new kernel / old kernel + new userland

Using a v28 kernel+userland seems to work on FreeBSD/amd64. I didn't dare mix userland and kernel, as that is ill advised by itself when there are major changes like this one.

I can't seem to use zfs allow/userspace/groupspace. The old py-zfs just dumped core on those commands; recompiling gave me warnings about a missing solaris.misc module, which persisted even after an upgrade to py26-zfs-1_1.

Thanks for keeping up the good work on ZFS in FreeBSD!

Best Regards,
Ruben
Re: New ZFSv28 patchset for 8-STABLE
On Sat, Dec 18, 2010 at 7:30 PM, Martin Matuska wrote:
> The information about pools is stored in /boot/zfs/zpool.cache.
> If this file doesn't contain correct information, your system pools will
> not be discovered.
>
> In v28, importing a pool with the "altroot" option does not touch the
> cache file (it has to be specified manually with an option to zpool import).
>
> Regarding rollback - rolling back a live root file system is not
> recommended.
>
> Dňa 18.12.2010 19:43, Krzysztof Dajka wrote / napísal(a):
>> t my system working again. I did:
>> zpool import -o altroot=/tank tank
>> chroot /tank
>> reboot
>>
>> Can anyone explain why chroot to /tank is needed?

I used the 8.2-BETA1 memstick image to import and rollback. Thanks for the explanation; a few moments ago I would have argued that zpool(1) didn't mention cachefile :)
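Martin's point above can be sketched as follows: under v28, an import with "altroot" leaves /boot/zfs/zpool.cache untouched, so if you want the pool discovered at the next boot you have to name the cache file explicitly via the cachefile property. Pool name and paths follow the thread:

```shell
# Import under an altroot for repair work; the boot-time cache file is
# NOT updated (v28 behaviour described above).
zpool import -o altroot=/tank tank

# To also record the pool in the boot-time cache, name it explicitly:
zpool import -o altroot=/tank -o cachefile=/boot/zfs/zpool.cache tank
```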
Re: New ZFSv28 patchset for 8-STABLE
Hi,

I applied the patch against the evening 2010-12-16 STABLE. I did what Martin asked:

On Thu, Dec 16, 2010 at 1:44 PM, Martin Matuska wrote:
> # cd /usr/src
> # fetch http://people.freebsd.org/~mm/patches/zfs/v28/stable-8-zfsv28-20101215.patch.xz
> # xz -d stable-8-zfsv28-20101215.patch.xz
> # patch -E -p0 < stable-8-zfsv28-20101215.patch
> # rm sys/cddl/compat/opensolaris/sys/sysmacros.h

The patch applied cleanly.

# make buildworld
# make buildkernel
# make installkernel

Reboot into single user mode.

# mergemaster -p
# make installworld
# mergemaster

Reboot.

Rebooting with the old world and new kernel went fine. But after the reboot with the new world I got:

ZFS: zfs_alloc()/zfs_free() mismatch

just before loading the kernel modules, after which my system hangs.
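The steps above, collected into one script for reference. This is a hedged sketch of the standard FreeBSD source-upgrade sequence, not a supported procedure: it assumes /usr/src holds 8-STABLE from around 2010-12-15 and a GENERIC kernel (adjust KERNCONF for a custom config), and the install half must be run from single-user mode as described.

```shell
# Patch the tree (per Martin's announcement) and rebuild.
cd /usr/src
fetch http://people.freebsd.org/~mm/patches/zfs/v28/stable-8-zfsv28-20101215.patch.xz
xz -d stable-8-zfsv28-20101215.patch.xz
patch -E -p0 < stable-8-zfsv28-20101215.patch
rm sys/cddl/compat/opensolaris/sys/sysmacros.h   # must be deleted by hand

make buildworld
make buildkernel KERNCONF=GENERIC
make installkernel KERNCONF=GENERIC
# reboot into single-user mode, then:
mergemaster -p
make installworld
mergemaster
# reboot
```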
Re: New ZFSv28 patchset for 8-STABLE
17.12.2010 12:12, Romain Garbage пишет: following the announcement of Pawel Jakub Dawidek (p...@freebsd.org) I am providing a ZFSv28 testing patch for 8-STABLE. Link to the patch: http://people.freebsd.org/~mm/patches/zfs/v28/stable-8-zfsv28-20101215.patch.xz Link to mfsBSD ISO files for testing (i386 and amd64): http://mfsbsd.vx.sk/iso/zfs-v28/8.2-beta-zfsv28-amd64.iso http://mfsbsd.vx.sk/iso/zfs-v28/8.2-beta-zfsv28-i386.iso The root password for the ISO files: "mfsroot" The ISO files work on real systems and in virtualbox. They conatin a full install of FreeBSD 8.2-PRERELEASE with ZFS v28, simply use the provided "zfsinstall" script. The patch is against FreeBSD 8-STABLE as of 2010-12-15. When applying the patch be sure to use correct options for patch(1) and make sure the file sys/cddl/compat/opensolaris/sys/sysmacros.h gets deleted: # cd /usr/src # fetch http://people.freebsd.org/~mm/patches/zfs/v28/stable-8-zfsv28-20101215.patch.xz # xz -d stable-8-zfsv28-20101215.patch.xz # patch -E -p0< stable-8-zfsv28-20101215.patch # rm sys/cddl/compat/opensolaris/sys/sysmacros.h Patch seemed to apply fine against yesterday evening (2010-12-16) 8-STABLE, world and kernel compiled fine, and booting from mirrored pool v15 was also fine. ...booting from RAIDZ2 pool v15 was bad... :( --- Best regards, Andrei Lavreniyuk. ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
Re: New ZFSv28 patchset for 8-STABLE
2010/12/16 Martin Matuska : > Hi everyone, > > following the announcement of Pawel Jakub Dawidek (p...@freebsd.org) I am > providing a ZFSv28 testing patch for 8-STABLE. > > Link to the patch: > > http://people.freebsd.org/~mm/patches/zfs/v28/stable-8-zfsv28-20101215.patch.xz > > Link to mfsBSD ISO files for testing (i386 and amd64): > http://mfsbsd.vx.sk/iso/zfs-v28/8.2-beta-zfsv28-amd64.iso > http://mfsbsd.vx.sk/iso/zfs-v28/8.2-beta-zfsv28-i386.iso > > The root password for the ISO files: "mfsroot" > The ISO files work on real systems and in virtualbox. > They conatin a full install of FreeBSD 8.2-PRERELEASE with ZFS v28, > simply use the provided "zfsinstall" script. > > The patch is against FreeBSD 8-STABLE as of 2010-12-15. > > When applying the patch be sure to use correct options for patch(1) > and make sure the file sys/cddl/compat/opensolaris/sys/sysmacros.h gets > deleted: > > # cd /usr/src > # fetch > http://people.freebsd.org/~mm/patches/zfs/v28/stable-8-zfsv28-20101215.patch.xz > # xz -d stable-8-zfsv28-20101215.patch.xz > # patch -E -p0 < stable-8-zfsv28-20101215.patch > # rm sys/cddl/compat/opensolaris/sys/sysmacros.h Patch seemed to apply fine against yesterday evening (2010-12-16) 8-STABLE, world and kernel compiled fine, and booting from mirrored pool v15 was also fine. Cheers, Romain ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"