Re: [zfs-discuss] ZFS Not Offlining Disk on SCSI Sense Error (X4500)
Hi Eric,

Hard to say. I'll use MDB next time it happens for more info. The
applications using any zpool lock up.

-J

On Jan 3, 2008 3:33 PM, Eric Schrock <[EMAIL PROTECTED]> wrote:
> When you say "starts throwing sense errors," does that mean every I/O to
> the drive will fail, or some arbitrary percentage of I/Os will fail? If
> it's the latter, ZFS is trying to do the right thing by recognizing
> these as transient errors, but eventually the ZFS diagnosis should kick
> in. What does '::spa -ve' in 'mdb -k' show in one of these situations?
> How about '::zio_state'?
>
> - Eric
Re: [zfs-discuss] [osol-help] ZFS woes
Scott L. Burson wrote:
> Hi,
>
> This is in build 74, on x64, on a Tyan S2882-D with dual Opteron 275 and
> 24GB of ECC DRAM.

Not an answer, but zfs-discuss is probably the best place to ask, so I've
taken the liberty of CCing that list.

> I seem to have lost the entire contents of a ZFS raidz pool. The pool is in
> a state where, if ZFS looks at it, I get a kernel panic. To make it possible
> to boot the machine, I had to boot into safe mode and rename
> `/etc/zfs/zpool.cache' (fortunately, this was my only pool on the machine).
>
> Okay, from the beginning. I bought the drives in October: three 500GB
> Western Digital WD5000ABYS SATA drives, installed them in the box in place
> of three 250GB Seagates I had been using, and created the raidz pool. For
> the first couple of months everything was hunky dory. Then, a couple of
> weeks ago, I moved the machine to a different location in the building,
> which wouldn't even be worth mentioning except that that's when I started to
> have problems. The first time I powered it up, one of the SATA drives didn't
> show up; I reseated the drive connectors and tried again, and it seemed
> fine. I thought that was odd, since I hadn't had one of those connectors
> come loose on me before, but I scrubbed the pool, cleared the errors on the
> drive, and thought that was the end of it.
>
> It wasn't. `zpool status' continued to report errors, only now they were
> write and read errors, and spread across all three drives. I started to copy
> the most critical parts of the filesystem contents onto other machines (very
> fortunately, as it turned out). After a while, the drive that had previously
> not shown up was marked faulted, and the other two were marked degraded.
> Then, yesterday, there was a much larger number of errors -- over 3000 read
> errors -- on a different drive, and that drive was marked faulted and the
> other two (i.e. including the one that had previously been faulted) were
> marked degraded. Also, `zpool status' told me I had lost some "files"; these
> turned out to be all, or mostly, directories, some containing substantial
> trees.
>
> By this point I had already concluded I was going to have to replace a
> drive, and had picked up a replacement. I installed it in place of the drive
> that was now marked faulted, and powered up. I was met with repeated panics
> and reboots. I managed to copy down part of the backtrace:
>
>   unix:die+c8
>   unix:trap+1351
>   unix:cmntrap+e9
>   unix:mutex_enter+b
>   zfs:metaslab_free+97
>   zfs:zio_dva_free+29
>   zfs:zio_next_stage+b3
>   zfs:zio_gang_pipeline+??
>
> (This may contain typos, and I didn't get the offset on that last frame.)
>
> At this point I tried replacing the drive I had just removed (removing the
> new, blank drive), but that didn't help. So, as mentioned above, I tried
> booting into safe mode and renaming `/etc/zfs/zpool.cache' -- just on a
> hunch, but I figured there had to be some such way to make ZFS forget about
> the pool -- and that allowed me to boot.
>
> I used good old `format' to run read tests on the drives overnight -- no bad
> blocks were detected.
>
> So, there are a couple lines of discussion here. On the one hand, it seems I
> have a hardware problem, but I haven't yet diagnosed it. More on this below.
> On the other, even in the face of hardware problems, I have to report some
> disappointment with ZFS. I had really been enjoying the warm fuzzy feeling
> ZFS gave me (and I was talking it up to my colleagues; I'm the only one here
> using it). Now I'm in a worse state than I would probably be with UFS on
> RAID, where `fsck' would probably have managed to salvage a lot of the
> filesystem (I would certainly be able to mount it! -- unless the drives were
> all failing catastrophically, which doesn't seem to be happening).
>
> One could say, there are two aspects to filesystem robustness: integrity
> checking and recovery. ZFS, with its block checksums, gets an A in integrity
> checking, but now appears to do very poorly in recovering in the face of
> substantial but not total hardware degradation, when that degradation is
> sufficiently severe that the redundancy of the pool can't correct for it.
>
> Perhaps this is a vanishingly rare case and I am just very unlucky.
> Nonetheless I would like to make some suggestions. (1) It would still be
> nice to have a salvager. (2) I think it would make sense, at least as an
> option, to add even more redundancy to ZFS's on-disk layout; for instance,
> it could keep copies of all directories.
>
> Okay, back to my hardware problems. I know you're going to tell me I
> probably have a bad power supply, and I can't rule that out, but it's an
> expensive PSU and generously sized for the box; and the box had been rock
> stable for a good 18 months before this happened. I'm naturally more
> inclined to suspect the new components, which are the SATA drives. (I
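As an aside for readers who hit the same panic-at-boot loop: the workaround
Scott describes -- moving the cache file aside so ZFS does not try to open the
pool at boot -- amounts to roughly the sketch below. This is a minimal,
hedged example; the pool name 'tank' is made up, and re-importing the pool may
simply reproduce the panic.

    # From safe mode / single user: stop ZFS from auto-opening the pool at boot
    mv /etc/zfs/zpool.cache /etc/zfs/zpool.cache.bad
    reboot

    # Later, once the hardware is sorted out, locate and re-import the pool
    # deliberately:
    zpool import          # scans attached devices and lists importable pools
    zpool import tank     # attempts the import (may re-trigger the panic)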
Re: [zfs-discuss] [zones-discuss] ZFS shared /home between zones
James C. McPherson wrote:
>
> The ws command hates it - "hmm, the underlying device for
> /scratch is /scratch maybe if I loop around stat()ing
> it it'll turn into a pumpkin"
>
> :-)

As does dmake, which is a real PITA for a developer!

Ian
Re: [zfs-discuss] zfs panic on boot
> space_map_add+0xdb(ff014c1a21b8, 472785000, 1000)
> space_map_load+0x1fc(ff014c1a21b8, fbd52568, 1, ff014c1a1e88, ff0149c88c30)
> running snv79.

Hmm.. did you spend any time in snv_74 or snv_75 that might have gotten
http://bugs.opensolaris.org/view_bug.do?bug_id=6603147

zdb -e would be interesting, but the damage might have been done.

Rob
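For anyone wanting to try Rob's suggestion, a minimal sketch of running zdb
against an unimported pool; 'tank' stands in for the real pool name, and the
exact option combination is only an example:

    # -e makes zdb read the pool configuration from the device labels rather
    # than /etc/zfs/zpool.cache, so it works on a pool that is not imported
    zdb -e tank

    # Walk and verify allocated blocks (can take a long time on a big pool)
    zdb -e -bb tank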
Re: [zfs-discuss] rename(2) (mv(1)) between ZFS filesystems in the same zpool
Joerg Schilling wrote:
> Carsten Bormann <[EMAIL PROTECTED]> wrote:
>> On Dec 29 2007, at 08:33, Jonathan Loran wrote:
>>> We snapshot the file as it exists at the time of the mv in the old file
>>> system until all referring file handles are closed, then destroy the
>>> single file snap. I know, not easy to implement, but that is the correct
>>> behavior, I believe.
>> Exactly.
>
> This is an interesting problem. Your proposal would imply that a file may
> have different identities in different filesystems:
>
>   - different st_dev
>   - different st_ino
>   - different link count
>
> This cannot be implemented with a single "inode data" anymore.

At first, as I mentioned in my earlier email, I was thinking we needed to
emulate the cross-fs rename/link/etc. behavior as it is currently implemented,
where a file appears to actually be copied. But now I'm not so sure.

In Unixland, the ideal has always been to have the whole file system, kit and
caboodle, singly rooted at /. Heck, even devices are in the file system. Of
course, reality has required that, programmatically, we be aware of which
file system our cwd is in; at a minimum, it's returned in the various stat
structs (st_dev).

I can see I'm getting long winded, but I'm thinking: what is the value of
having different behavior for a file move across ZFS filesystems within the
same pool than for a move between directories? I'm not addressing the previous
discussion about how to treat file handles, etc., but rather the sharing of
open file blocks, linked across ZFS boundaries, before and after such a mv. I
think the test is this: can we find a scenario where something would break if
we did share the file blocks across ZFS boundaries after such a mv? For every
example I've been able to think of, when I ask "what if I had moved the file
from one directory to another, instead of across ZFS boundaries -- would it
have been different?", the answer has been no.

Comments please.

Jon

--
Jonathan Loran
IT Manager
Space Sciences Laboratory, UC Berkeley
(510) 643-5146  [EMAIL PROTECTED]
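As a concrete reference point for the thread, this is the behavior being
debated: mv(1) first tries rename(2), and across ZFS dataset boundaries --
even inside one pool -- the kernel returns EXDEV, so mv quietly falls back to
copy-and-unlink. A hedged sketch (dataset and file names are made up, and the
truss output line is approximate):

    # Two datasets in the same pool (names are illustrative)
    zfs create tank/src
    zfs create tank/dst
    mkfile 64m /tank/src/bigfile

    # Trace the syscalls: rename(2) fails with EXDEV, then mv copies the data
    truss -f -t rename,unlink mv /tank/src/bigfile /tank/dst/
    #   rename("/tank/src/bigfile", "/tank/dst/bigfile") Err#18 EXDEV
    #   (followed by a full data copy and an unlink of the original)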
Re: [zfs-discuss] ZFS Not Offlining Disk on SCSI Sense Error (X4500)
When you say "starts throwing sense errors," does that mean every I/O to the
drive will fail, or some arbitrary percentage of I/Os will fail? If it's the
latter, ZFS is trying to do the right thing by recognizing these as transient
errors, but eventually the ZFS diagnosis should kick in. What does '::spa -ve'
in 'mdb -k' show in one of these situations? How about '::zio_state'?

- Eric

On Thu, Jan 03, 2008 at 03:11:39PM -0700, Jason J. W. Williams wrote:
> Hi Albert,
>
> Thank you for the link. ZFS isn't offlining the disk in b77.
>
> -J

--
Eric Schrock, FishWorks                    http://blogs.sun.com/eschrock
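For anyone who wants to gather the data Eric is asking for while the hang is
in progress, a minimal sketch of the mdb session (run as root against the live
kernel; the dcmd names are exactly as Eric gives them, the comments are only
my gloss):

    # Inspect live kernel ZFS state during the hang
    mdb -k
    > ::spa -ve        # per-pool state with vdevs and error counts
    > ::zio_state      # outstanding zios and where in the pipeline they sit
    > $q               # quit mdb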
Re: [zfs-discuss] ZFS Not Offlining Disk on SCSI Sense Error (X4500)
This should be pretty much fixed on build 77. It will lock up for the duration
of a single command timeout, but ZFS should recover quickly without queueing
up additional commands. Since the default timeout is 60 seconds, and we retry
3 times, and we do a probe afterwards, you may see hangs of up to 6 minutes.
Unfortunately there's not much we can do, since that's the minimum amount of
time to do two I/O operations to a single drive (one that fails and one to do
a basic probe of the disk). You can tune down 'sd_io_time' to a more
reasonable value to get shorter command timeouts, but this may break slow
things (like powered down CD-ROM drives).

Other options at the ZFS level could be imagined, but would require per-pool
tunables:

  1. Allowing I/O to complete as soon as it was on enough devices, instead
     of replicating to all devices.

  2. Inventing a per-pool tunable that controlled timeouts independent
     of SCSI timeouts.

Neither of these is trivial, and both potentially compromise data integrity,
hence the lack of such features. There's no easy solution to the problem, but
we're happy to hear ideas.

- Eric

On Thu, Jan 03, 2008 at 02:57:08PM -0700, Jason J. W. Williams wrote:
> Hello,
>
> There seems to be a persistent issue we have with ZFS where one of the
> SATA disks in a zpool on a Thumper starts throwing sense errors, ZFS
> does not offline the disk and instead hangs all zpools across the
> system. If it is not caught soon enough, application data ends up in
> an inconsistent state. We've had this issue with b54 through b77 (as
> of last night).
>
> We don't seem to be the only folks with this issue reading through the
> archives. Are there any plans to fix this behavior? It really makes
> ZFS less than desirable/reliable.
>
> Best Regards,
> Jason

--
Eric Schrock, FishWorks                    http://blogs.sun.com/eschrock
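For reference, the sd_io_time tuning Eric mentions is normally set in
/etc/system (or patched on the running kernel with mdb). A hedged sketch --
the 20-second value is only an example, and as noted above, shortening the
timeout can break legitimately slow devices:

    * In /etc/system (takes effect at the next boot; value is in seconds):
    set sd:sd_io_time = 0x14

    # Or on the running kernel, as root, effective immediately:
    echo "sd_io_time/W 0t20" | mdb -kw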
Re: [zfs-discuss] ZFS Not Offlining Disk on SCSI Sense Error (X4500)
Hi Eric,

I'd really like to suggest a helpful idea, but all I can suggest is an end
result. Running ZFS on top of STK arrays doing the RAID, they offline their
bad disks very quickly and the applications never notice. In the X4500s, ZFS
times out and locks up the applications. If ZFS is going to be able to compete
with the more traditional arrays, it seems the failure behavior has to be just
as seamless.

-J

On Jan 3, 2008 3:03 PM, Eric Schrock <[EMAIL PROTECTED]> wrote:
> This should be pretty much fixed on build 77. It will lock up for the
> duration of a single command timeout, but ZFS should recover quickly
> without queueing up additional commands. [...]
Re: [zfs-discuss] ZFS Not Offlining Disk on SCSI Sense Error (X4500)
Hi Albert,

Thank you for the link. ZFS isn't offlining the disk in b77.

-J

On Jan 3, 2008 3:07 PM, Albert Chin <[EMAIL PROTECTED]> wrote:
> http://blogs.sun.com/eschrock/entry/zfs_and_fma
>
> FMA For ZFS Phase 2 (PSARC/2007/283) was integrated in b68:
> http://www.opensolaris.org/os/community/arc/caselog/2007/283/
> http://www.opensolaris.org/os/community/on/flag-days/all/
>
> --
> albert chin ([EMAIL PROTECTED])
Re: [zfs-discuss] ZFS Not Offlining Disk on SCSI Sense Error (X4500)
On Thu, Jan 03, 2008 at 02:57:08PM -0700, Jason J. W. Williams wrote:
> There seems to be a persistent issue we have with ZFS where one of the
> SATA disks in a zpool on a Thumper starts throwing sense errors, ZFS
> does not offline the disk and instead hangs all zpools across the
> system. If it is not caught soon enough, application data ends up in
> an inconsistent state. We've had this issue with b54 through b77 (as
> of last night).
>
> We don't seem to be the only folks with this issue reading through the
> archives. Are there any plans to fix this behavior? It really makes
> ZFS less than desirable/reliable.

http://blogs.sun.com/eschrock/entry/zfs_and_fma

FMA For ZFS Phase 2 (PSARC/2007/283) was integrated in b68:
http://www.opensolaris.org/os/community/arc/caselog/2007/283/
http://www.opensolaris.org/os/community/on/flag-days/all/

--
albert chin ([EMAIL PROTECTED])
[zfs-discuss] ZFS Not Offlining Disk on SCSI Sense Error (X4500)
Hello,

There seems to be a persistent issue we have with ZFS where one of the SATA
disks in a zpool on a Thumper starts throwing sense errors, ZFS does not
offline the disk and instead hangs all zpools across the system. If it is not
caught soon enough, application data ends up in an inconsistent state. We've
had this issue with b54 through b77 (as of last night).

We don't seem to be the only folks with this issue reading through the
archives. Are there any plans to fix this behavior? It really makes ZFS less
than desirable/reliable.

Best Regards,
Jason
Re: [zfs-discuss] Bugid 6535160
We loaded Nevada_78 on a peer T2000 unit. Imported the same ZFS pool. I
didn't even upgrade the pool since we wanted to be able to move it back to
10u4. Cut 'n paste of my colleague's email with the results:

Here's the latest Pepsi Challenge results. Sol10u4 vs Nevada78. Same tuning
options, same zpool, same storage, same SAN switch - you get the idea. The
only difference is the OS.

Sol10u4:

4984: 82.878: Per-Operation Breakdown
closefile4           404ops/s    0.0mb/s    0.0ms/op     19us/op-cpu
readfile4            404ops/s    6.3mb/s    0.1ms/op    109us/op-cpu
openfile4            404ops/s    0.0mb/s    0.1ms/op    112us/op-cpu
closefile3           404ops/s    0.0mb/s    0.0ms/op     25us/op-cpu
fsyncfile3           404ops/s    0.0mb/s   18.7ms/op   1168us/op-cpu
appendfilerand3      404ops/s    6.3mb/s    0.2ms/op    192us/op-cpu
readfile3            404ops/s    6.3mb/s    0.1ms/op    111us/op-cpu
openfile3            404ops/s    0.0mb/s    0.1ms/op    111us/op-cpu
closefile2           404ops/s    0.0mb/s    0.0ms/op     24us/op-cpu
fsyncfile2           404ops/s    0.0mb/s   19.0ms/op   1162us/op-cpu
appendfilerand2      404ops/s    6.3mb/s    0.2ms/op    173us/op-cpu
createfile2          404ops/s    0.0mb/s    0.3ms/op    334us/op-cpu
deletefile1          404ops/s    0.0mb/s    0.2ms/op    173us/op-cpu

4984: 82.879: IO Summary: 318239 ops 5251.8 ops/s, (808/808 r/w) 25.2mb/s,
1228us cpu/op, 9.7ms latency

Nevada78:

1107: 82.554: Per-Operation Breakdown
closefile4          1223ops/s    0.0mb/s    0.0ms/op     22us/op-cpu
readfile4           1223ops/s   19.4mb/s    0.1ms/op    112us/op-cpu
openfile4           1223ops/s    0.0mb/s    0.1ms/op    128us/op-cpu
closefile3          1223ops/s    0.0mb/s    0.0ms/op     29us/op-cpu
fsyncfile3          1223ops/s    0.0mb/s    4.6ms/op    256us/op-cpu
appendfilerand3     1223ops/s   19.1mb/s    0.2ms/op    191us/op-cpu
readfile3           1223ops/s   19.9mb/s    0.1ms/op    116us/op-cpu
openfile3           1223ops/s    0.0mb/s    0.1ms/op    127us/op-cpu
closefile2          1223ops/s    0.0mb/s    0.0ms/op     28us/op-cpu
fsyncfile2          1223ops/s    0.0mb/s    4.4ms/op    239us/op-cpu
appendfilerand2     1223ops/s   19.1mb/s    0.1ms/op    159us/op-cpu
createfile2         1223ops/s    0.0mb/s    0.5ms/op    389us/op-cpu
deletefile1         1223ops/s    0.0mb/s    0.2ms/op    198us/op-cpu

1107: 82.581: IO Summary: 954637 ops 15903.4 ops/s, (2447/2447 r/w) 77.5mb/s,
590us cpu/op, 2.6ms latency

That's a 3-4x improvement in ops/sec and average fsync time. Here are the
results from our UFS software mirror for comparison:

4984: 211.056: Per-Operation Breakdown
closefile4           465ops/s    0.0mb/s    0.0ms/op     23us/op-cpu
readfile4            465ops/s   12.6mb/s    0.1ms/op    142us/op-cpu
openfile4            465ops/s    0.0mb/s    0.1ms/op     83us/op-cpu
closefile3           465ops/s    0.0mb/s    0.0ms/op     24us/op-cpu
fsyncfile3           465ops/s    0.0mb/s    6.0ms/op    498us/op-cpu
appendfilerand3      465ops/s    7.3mb/s    1.7ms/op    282us/op-cpu
readfile3            465ops/s   11.1mb/s    0.1ms/op    132us/op-cpu
openfile3            465ops/s    0.0mb/s    0.1ms/op     84us/op-cpu
closefile2           465ops/s    0.0mb/s    0.0ms/op     26us/op-cpu
fsyncfile2           465ops/s    0.0mb/s    5.9ms/op    445us/op-cpu
appendfilerand2      465ops/s    7.3mb/s    1.1ms/op    231us/op-cpu
createfile2          465ops/s    0.0mb/s    2.2ms/op    443us/op-cpu
deletefile1          465ops/s    0.0mb/s    2.0ms/op    269us/op-cpu

4984: 211.057: IO Summary: 366557 ops 6049.2 ops/s, (931/931 r/w) 38.2mb/s,
912us cpu/op, 4.8ms latency

So either we're hitting a pretty serious zfs bug, or they're purposely holding
back performance in Solaris 10 so that we all have a good reason to upgrade to
11. ;)

-Nick
Re: [zfs-discuss] zfs panic on boot
I'm seeing this too. Nothing unusual happened before the panic. Just a
shutdown (init 5) and later startup. I have the crashdump and copy of the
problem zpool (on swan). Here's the stack trace:

> $C
ff0004463680 vpanic()
ff00044636b0 vcmn_err+0x28(3, f792ecf0, ff0004463778)
ff00044637a0 zfs_panic_recover+0xb6()
ff0004463830 space_map_add+0xdb(ff014c1a21b8, 472785000, 1000)
ff00044638e0 space_map_load+0x1fc(ff014c1a21b8, fbd52568, 1,
    ff014c1a1e88, ff0149c88c30)
ff0004463920 metaslab_activate+0x66(ff014c1a1e80, 4000)
ff00044639e0 metaslab_group_alloc+0x24e(ff014bdeb000, 4000, 3a6734, 1435b,
    ff014baa9840, 2)
ff0004463ab0 metaslab_alloc_dva+0x1da(ff01477880c0, ff014beefa70, 4000,
    ff014baa9840, 2, 0, 3a6734, 0)
ff0004463b50 metaslab_alloc+0x82(ff01477880c0, ff014beefa70, 4000,
    ff014baa9840, 3, 3a6734, 0, 0)
ff0004463ba0 zio_dva_allocate+0x62(ff014934c458)
ff0004463bd0 zio_execute+0x7f(ff014934c458)
ff0004463c60 taskq_thread+0x1a7(ff014bfb77a0)
ff0004463c70 thread_start+8()

This is on a Ferrari laptop (AMD x64) running snv79. I'd love to rescue my
zpool. Any suggestions?

Thanks,
Gordon
Re: [zfs-discuss] [zones-discuss] ZFS shared /home between zones
In general you should not allow a Solaris system to be both an NFS server and
NFS client for the same filesystem, irrespective of whether zones are
involved. Among other problems, you can run into kernel deadlocks in some
(rare) circumstances. This is documented in the NFS administration docs.

A loopback mount is definitely the recommended approach.
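A minimal sketch of the recommended loopback approach, for concreteness; the
zone name, mount point, and backing path are all made up:

    # Share /export/home from the global zone into a non-global zone via lofs
    zonecfg -z myzone
    zonecfg:myzone> add fs
    zonecfg:myzone:fs> set dir=/home
    zonecfg:myzone:fs> set special=/export/home
    zonecfg:myzone:fs> set type=lofs
    zonecfg:myzone:fs> end
    zonecfg:myzone> verify
    zonecfg:myzone> commit
    zonecfg:myzone> exit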
Re: [zfs-discuss] hot spare and resilvering problem
Hi,

> Do you have snapshots taking place (like in a cron job) during the
> resilver process? If so, you may be hitting a bug that the resilver
> will restart from the beginning whenever a new snapshot occurs. If
> you disable the snapshots during the resilver then it should complete
> to 100%.

No, I don't have snapshots taking place. I found that when I query the pool
with "zpool status" it restarts the resilvering process, which is strange...
Anyway, after ~10 days the resilvering has finally completed to 100%:

  resilver completed with 0 errors on Wed Jan 2 12:46:10 2008

The filesystem is still slow, however. When I try to run zpool iostat it takes
a few hours to produce output; same with zfs create. I can't even post the
output of "zpool status -v", as it takes that long to complete. We have 11
disks (+1 hot spare) in a raidz config, so why is the filesystem so slow even
now that the hot spare has replaced the faulty disk?

thanks,
Maciej
Re: [zfs-discuss] rename(2) (mv(1)) between ZFS filesystems in the same zpool
Carsten Bormann <[EMAIL PROTECTED]> wrote:

> On Dec 29 2007, at 08:33, Jonathan Loran wrote:
>
> > We snapshot the file as it exists at the time of
> > the mv in the old file system until all referring file handles are
> > closed, then destroy the single file snap. I know, not easy to
> > implement, but that is the correct behavior, I believe.
>
> Exactly.
>
> Note that apart from open descriptors, there may be other links to the
> file on the old FS; it has to be clear whether writes to the file in
> the new FS change the file in the old FS or not. I'd rather say they
> shouldn't.
> Yes, this would be different from the normal rename(2) semantics with
> respect to multiply linked files. And yes, the semantics of link(2)
> should also be consistent with this.

This is an interesting problem. Your proposal would imply that a file may
have different identities in different filesystems:

  - different st_dev
  - different st_ino
  - different link count

This cannot be implemented with a single "inode data" anymore. Well, it is
not impossible, as my WOFS (mentioned before) implements hardlinks via
"inode relative symlinks". In order to allow this, a file would need a
storage-pool-global serial number that allows matching the different inode
sets for the file.

Jörg

--
EMail: [EMAIL PROTECTED] (home)  Jörg Schilling  D-13353 Berlin
       [EMAIL PROTECTED] (uni)
       [EMAIL PROTECTED] (work)
Blog:  http://schily.blogspot.com/
URL:   http://cdrecord.berlios.de/old/private/
       ftp://ftp.berlios.de/pub/schily
Re: [zfs-discuss] rename(2) (mv(1)) between ZFS filesystems in the same zpool
On Dec 29 2007, at 08:33, Jonathan Loran wrote:

> We snapshot the file as it exists at the time of
> the mv in the old file system until all referring file handles are
> closed, then destroy the single file snap. I know, not easy to
> implement, but that is the correct behavior, I believe.

Exactly.

Note that apart from open descriptors, there may be other links to the file
on the old FS; it has to be clear whether writes to the file in the new FS
change the file in the old FS or not. I'd rather say they shouldn't.

Yes, this would be different from the normal rename(2) semantics with respect
to multiply linked files. And yes, the semantics of link(2) should also be
consistent with this.

Gruesse, Carsten