Re: [zfs-discuss] Can the ZFS "copies" attribute substitute HW disk redundancy?
On Thu, Aug 2, 2012 at 3:39 PM, Richard Elling wrote:
> On Aug 1, 2012, at 8:30 AM, Nigel W wrote:
>
>> Yes. +1
>>
>> The L2ARC as it is currently implemented is not terribly useful for
>> storing the DDT anyway, because each DDT entry is 376 bytes but the
>> L2ARC reference is 176 bytes. Best case, you get just over double the
>> DDT entries in the L2ARC compared with what you would get into the ARC,
>> but then you also have no ARC left for anything else :(.
>
> You are making the assumption that each DDT table entry consumes one
> metadata update. This is not the case. The DDT is implemented as an AVL
> tree. As per other metadata in ZFS, the data is compressed. So you cannot
> make a direct correlation between the DDT entry size and the effect on
> the stored metadata in disk sectors.
> -- richard

It's compressed even when in the ARC?
Re: [zfs-discuss] unable to import the zpool
Hi, can you post the output of "zpool history"?

Regards

Sent from my iPad

On Aug 2, 2012, at 7:47, Suresh Kumar wrote:
> Hi Hung-sheng,
>
> It is not displaying any output, like the following:
>
> bash-3.2# zpool import -nF tXstpool
> bash-3.2#
>
> Thanks & Regards,
> Suresh.
Re: [zfs-discuss] unable to import the zpool
My experience has always been that ZFS tries hard to keep you from doing something wrong when devices are failing or otherwise unavailable. With mirrors, it will import with a device missing from a mirror vdev. I don't use cache or log devices in my main storage pools, so I've not seen a failure with a "required" device like that missing. But I have seen problems with a raid-z device missing and the pool not coming online.

As Richard says, it would seem there is a cache or log vdev missing, since the status shows only one of the two mirrored devices in that vdev as faulted, yet the pool is still complaining about a missing device. The older OS and ZFS version may in fact misbehave because some error condition is not being handled correctly.

Gregg Wonderly

On Aug 2, 2012, at 4:49 PM, Richard Elling wrote:
> On Aug 1, 2012, at 12:21 AM, Suresh Kumar wrote:
>> Dear ZFS-Users,
>>
>> I am using Solaris x86 10u10. All the devices which belong to my zpool
>> are in the available state, but I am unable to import the zpool.
>>
>> #zpool import tXstpool
>> cannot import 'tXstpool': one or more devices is currently unavailable
>> ==
>> bash-3.2# zpool import
>>   pool: tXstpool
>>     id: 13623426894836622462
>>  state: UNAVAIL
>> status: One or more devices are missing from the system.
>> action: The pool cannot be imported. Attach the missing
>>         devices and try again.
>>    see: http://www.sun.com/msg/ZFS-8000-6X
>> config:
>>
>>         tXstpool                     UNAVAIL  missing device
>>           mirror-0                   DEGRADED
>>             c2t210100E08BB2FC85d0s0  FAULTED  corrupted data
>>             c2t21E08B92FC85d2        ONLINE
>>
>> Additional devices are known to be part of this pool, though their
>> exact configuration cannot be determined.
>
> This message is your clue. The pool is missing a device. In most of the
> cases where I've seen this, it occurs on older ZFS implementations and
> the missing device is an auxiliary device: cache or spare.
> -- richard
Re: [zfs-discuss] encfs on top of zfs
On Jul 31, 2012, at 8:05 PM, opensolarisisdeadlongliveopensolaris wrote:
>> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
>> boun...@opensolaris.org] On Behalf Of Richard Elling
>>
>> I believe what you meant to say was "dedup with HDDs sux." If you had
>> used fast SSDs instead of HDDs, you would find dedup to be quite fast.
>> -- richard
>
> Yes, but this is a linear scale.

No, it is definitely NOT a linear scale. Study Amdahl's law a little more carefully.

> Suppose an SSD without dedup is 100x faster than an HDD without dedup.
> And suppose dedup slows down a system by a factor of 10x. Now your SSD
> with dedup is only 10x faster than the HDD without dedup. So "quite
> fast" is a relative term.

Of course it is.

> The SSD with dedup is still faster than the HDD without dedup, but it's
> also slower than the SSD without dedup.

duh. With dedup you are trading IOPS for space. In general, HDDs have lots of space and terrible IOPS. SSDs have less space, but more IOPS. Obviously, as you point out, the best solution is lots of space and lots of IOPS.

> The extent of fibbing I'm doing is this: in reality, an SSD is about
> equally fast as an HDD for sequential operations, and about 100x faster
> for random IO. It just so happens that the dedup performance hit is
> almost purely random IO, so it's right in the sweet spot of what SSDs
> handle well.

In the vast majority of modern systems, there are no sequential I/O workloads. That is a myth propagated by people who still think HDDs can be fast.

> You can't use an overly simplified linear model like I described above -
> in reality, there's a grain of truth in what Richard said, and also a
> grain of truth in what I said. The real truth is somewhere in between
> what he said and what I said.

But closer to my truth :-)

> No, the SSD will not perform as well with dedup as it does without
> dedup. But the "suppose dedup slows down by 10x" that I described above
> is not accurate. Depending on what you're doing, dedup might slow down
> an HDD by 20x, and it might only slow down an SSD by 4x doing the same
> workload. Highly variable, and highly dependent on the specifics of
> your workload.

You are making the assumption that the system is not bandwidth limited. This is a good assumption for the HDD case, because the media bandwidth is much less than the interconnect bandwidth. For SSDs, this assumption is not necessarily true. There are SSDs that are bandwidth constrained on the interconnect, and in those cases your model fails.
-- richard
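To make the non-linearity concrete: treating the 20x and 4x figures above as the dedup penalty on just the random-IO portion of a workload, here is a minimal awk sketch of how Amdahl's law combines them into whole-workload slowdowns. The 70% random fraction is an assumed number, not anything measured in this thread.

    awk 'BEGIN {
      f   = 0.70   # fraction of the workload that is random IO (assumed)
      hdd = 20     # dedup penalty on the random portion, HDD (figure above)
      ssd = 4      # dedup penalty on the random portion, SSD (figure above)
      # Amdahl: overall slowdown = (1 - f) + f * penalty
      printf "HDD overall slowdown: %.1fx\n", (1 - f) + f * hdd
      printf "SSD overall slowdown: %.1fx\n", (1 - f) + f * ssd
    }'

With these inputs the HDD comes out around 14x slower overall and the SSD around 3x, which is why a single "dedup costs Nx" factor cannot be carried from one device type to the other.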
Re: [zfs-discuss] Can the ZFS "copies" attribute substitute HW disk redundancy?
On 2012-Aug-02 18:30:01 +0530, opensolarisisdeadlongliveopensolaris wrote:
> Ok, so the point is, in some cases, somebody might want redundancy on
> a device that has no redundancy. They're willing to pay for it by
> halving their performance.

This isn't quite true - write performance will be at least halved (possibly worse due to additional seeking), but read performance could potentially improve (more copies means, on average, less seeking to get a copy than if there were only one). And non-IO performance is unaffected.

> The only situation I'll acknowledge is
> the laptop situation, and I'll say, present day very few people would
> be willing to pay *that* much for this limited use-case redundancy.

My guess is that, for most people, the overall performance impact would be minimal because disk write performance isn't the limiting factor in most laptop usage scenarios.

> The solution that I as an IT person would recommend and deploy would
> be to run without "copies" and instead cover your bum by doing backups.

You need backups in any case, but backups won't help you if you can't conveniently access them. Before giving a blanket recommendation, you need to consider how the person uses their laptop. Consider the following scenario: you're in the middle of a week-long business trip and your laptop develops a bad sector in an inconvenient spot. Do you:
a) Let ZFS automagically repair the sector thanks to copies=2.
b) Attempt to rebuild your laptop and restore from backups (left securely at home) via the dodgy hotel wifi.

-- 
Peter Jeremy
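For anyone wanting option (a), enabling it is a one-liner. A hedged sketch follows (the dataset name is hypothetical); note that the property only affects blocks written after it is set, so existing data keeps its original copy count:

    # Keep two copies of every block on a single-disk laptop pool.
    # Existing blocks are NOT rewritten; only new writes get two copies.
    zfs set copies=2 rpool/export/home
    zfs get copies rpool/export/home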
Re: [zfs-discuss] unable to import the zpool
On Aug 1, 2012, at 12:21 AM, Suresh Kumar wrote:
> Dear ZFS-Users,
>
> I am using Solaris x86 10u10. All the devices which belong to my zpool
> are in the available state, but I am unable to import the zpool.
>
> #zpool import tXstpool
> cannot import 'tXstpool': one or more devices is currently unavailable
> ==
> bash-3.2# zpool import
>   pool: tXstpool
>     id: 13623426894836622462
>  state: UNAVAIL
> status: One or more devices are missing from the system.
> action: The pool cannot be imported. Attach the missing
>         devices and try again.
>    see: http://www.sun.com/msg/ZFS-8000-6X
> config:
>
>         tXstpool                     UNAVAIL  missing device
>           mirror-0                   DEGRADED
>             c2t210100E08BB2FC85d0s0  FAULTED  corrupted data
>             c2t21E08B92FC85d2        ONLINE
>
> Additional devices are known to be part of this pool, though their
> exact configuration cannot be determined.

This message is your clue. The pool is missing a device. In most of the cases where I've seen this, it occurs on older ZFS implementations and the missing device is an auxiliary device: cache or spare.
-- richard
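If the missing device really is an auxiliary one, it can help to point the import scan explicitly at the directory holding the device nodes. A sketch, using the stock Solaris device path:

    # Rescan /dev/dsk for every importable pool and show their configs
    zpool import -d /dev/dsk
    # Then attempt the import using only the devices found there
    zpool import -d /dev/dsk tXstpool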
Re: [zfs-discuss] Can the ZFS "copies" attribute substitute HW disk redundancy?
On Aug 1, 2012, at 8:30 AM, Nigel W wrote:
> On Wed, Aug 1, 2012 at 8:33 AM, Sašo Kiselkov wrote:
>> On 08/01/2012 04:14 PM, Jim Klimov wrote:
>>> chances are that
>>> some blocks of userdata might be more popular than a DDT block and
>>> would push it out of L2ARC as well...
>>
>> Which is why I plan on investigating implementing some tunable policy
>> module that would allow the administrator to get around this problem.
>> E.g. the administrator dedicates 50G of ARC space to metadata (which
>> includes the DDT) or only to the DDT specifically. My idea is still a
>> bit fuzzy, but it revolves primarily around allocating and policing min
>> and max quotas for a given ARC entry type. I'll start a separate
>> discussion thread for this later on once I have everything organized in
>> my mind about where I plan on taking this.
>
> Yes. +1
>
> The L2ARC as it is currently implemented is not terribly useful for
> storing the DDT anyway, because each DDT entry is 376 bytes but the
> L2ARC reference is 176 bytes. Best case, you get just over double the
> DDT entries in the L2ARC compared with what you would get into the ARC,
> but then you also have no ARC left for anything else :(.

You are making the assumption that each DDT table entry consumes one metadata update. This is not the case. The DDT is implemented as an AVL tree. As per other metadata in ZFS, the data is compressed. So you cannot make a direct correlation between the DDT entry size and the effect on the stored metadata in disk sectors.
-- richard
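For what it's worth, Nigel's "just over double" estimate is simple arithmetic on the two per-entry sizes he quotes. A back-of-the-envelope sketch (the 8 GiB ARC figure is assumed, and both entry sizes vary between releases):

    ARC_BYTES=$((8 * 1024 * 1024 * 1024))    # assume 8 GiB of ARC for DDT
    echo "DDT entries held in ARC directly:  $((ARC_BYTES / 376))"
    echo "DDT entries addressable via L2ARC: $((ARC_BYTES / 176))"
    # 376/176 is roughly 2.1 -- "just over double" -- but now the ARC is
    # full of L2ARC headers, leaving no room for anything else.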
Re: [zfs-discuss] Can the ZFS "copies" attribute substitute HW disk redundancy?
On Aug 1, 2012, at 2:41 PM, Peter Jeremy wrote:
> On 2012-Aug-01 21:00:46 +0530, Nigel W wrote:
>> I think a fantastic idea for dealing with the DDT (and all other
>> metadata for that matter) would be an option to put (a copy of)
>> metadata exclusively on an SSD.
>
> This is on my wishlist as well. I believe ZEVO supports it, so possibly
> it'll be available in ZFS in the near future.

ZEVO does not. The only ZFS vendor I'm aware of with a separate top-level vdev for metadata is Tegile, and it is available today.
-- richard
[zfs-discuss] single-disk pool - Re: Can the ZFS "copies" attribute substitute HW disk redundancy?
On 01/08/12 3:34 PM, opensolarisisdeadlongliveopensolaris wrote:
>> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
>> boun...@opensolaris.org] On Behalf Of Jim Klimov
>>
>> Well, there is at least a couple of failure scenarios where copies>1
>> are good:
>>
>> 1) A single-disk pool, as in a laptop. Noise on the bus, media
>> degradation, or any other reason to misread or miswrite a block can
>> result in a failed pool.
>
> How does mac/win/lin handle this situation? (Not counting btrfs.)

Is this a trick question? :)

--Toby

> Such noise might result in a temporarily faulted pool (blue screen of
> death) that is fully recovered after reboot. Meanwhile you're always
> paying for it in terms of performance, and it's all solvable via pool
> redundancy.
Re: [zfs-discuss] Can the ZFS "copies" attribute substitute HW disk redundancy?
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Jim Klimov
>
> In some of my cases I was "lucky" enough to get a corrupted /sbin/init
> or something like that once, and the box had no other BEs yet, so the
> OS could not do anything reasonable after boot. It is different from a
> "corrupted zpool", but ended in a useless OS image due to one broken
> sector nonetheless.

That's very annoying, but if "copies" could have saved you, then pool redundancy could have also saved you.

> For a single-disk box, "copies" IS the redundancy. ;)

Ok, so the point is, in some cases, somebody might want redundancy on a device that has no redundancy. They're willing to pay for it by halving their performance. The only situation I'll acknowledge is the laptop situation, and I'll say, present day, very few people would be willing to pay *that* much for this limited use-case redundancy. The solution that I as an IT person would recommend and deploy would be to run without "copies" and instead cover your bum by doing backups.
Re: [zfs-discuss] Can the ZFS "copies" attribute substitute HW disk redundancy?
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Jim Klimov
>
> On 2012-08-01 23:40, opensolarisisdeadlongliveopensolaris wrote:
>> Agreed, ARC/L2ARC help in finding the DDT, but whenever you've got a
>> snapshot destroy (happens every 15 minutes) you've got a lot of entries
>> you need to write. Those are all scattered about the pool... Even if
>> you can find them fast, it's still a bear.
>
> No, these entries you need to update are scattered around your
> SSD (be it ARC or a hypothetical SSD-based copy of metadata
> which I also "campaigned" for some time ago).

If they were scattered around the hypothetical dedicated DDT SSD, I would say: no problem. But in reality, they're scattered in your main pool. DDT writes don't get coalesced. Is this simply because they're sync writes? Or is it because they're metadata, which is even lower level than sync writes? I know, for example, that you can disable the ZIL on your pool, but the system will still flush the buffer after certain operations, such as writing the uberblock. I have not seen the code that flushes the buffer after DDT writes, but I have seen the performance evidence.
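As an aside, "disabling the ZIL" is a per-dataset property on builds that have it (the sync property replaced the old zil_disable tunable). A hedged sketch with a hypothetical dataset name, noting that it trades sync-write safety for speed and does not change metadata or uberblock flushing:

    zfs set sync=disabled tank/scratch   # sync writes treated as async
    zfs get sync tank/scratch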
Re: [zfs-discuss] unable to import the zpool
So zpool import -F and zpool import -f are both not working?

Regards

Sent from my iPad

On Aug 2, 2012, at 7:47, Suresh Kumar wrote:
> Hi Hung-sheng,
>
> It is not displaying any output, like the following:
>
> bash-3.2# zpool import -nF tXstpool
> bash-3.2#
>
> Thanks & Regards,
> Suresh.
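For the archive, the usual escalation order here (each step is riskier than the last; -n makes -F a dry run, as used earlier in this thread):

    zpool import -f tXstpool    # force: override a foreign/stale hostid
    zpool import -nF tXstpool   # dry run: report what recovery would change
    zpool import -F tXstpool    # recovery: rewind to an older consistent txg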
Re: [zfs-discuss] unable to import the zpool
Hi Hung-sheng,

It is not displaying any output, like the following:

bash-3.2# zpool import -nF tXstpool
bash-3.2#

Thanks & Regards,
Suresh.
Re: [zfs-discuss] unable to import the zpool
http://docs.oracle.com/cd/E19963-01/html/821-1448/gbbwl.html

What is the output of zpool import -nF tXstpool?

On 8/2/2012 2:21 AM, Suresh Kumar wrote:
> Hi Hung-sheng,
>
> Thanks for your response. I tried to import the zpool using
> "zpool import -nF tXstpool"; please consider the output below.
>
> bash-3.2# zpool import -nF tXstpool
> bash-3.2#
> bash-3.2# zpool status tXstpool
> cannot open 'tXstpool': no such pool
>
> I got these messages when I ran the command under truss:
>
> truss -aefo /zpool.txt zpool import -F tXstpool
>
> 742 14582: ioctl(3, ZFS_IOC_POOL_STATS, 0x08041F40)     Err#2 ENOENT
> 743 14582: ioctl(3, ZFS_IOC_POOL_TRYIMPORT, 0x08041F90) = 0
> 744 14582: sysinfo(SI_HW_SERIAL, "75706560", 11)        = 9
> 745 14582: ioctl(3, ZFS_IOC_POOL_IMPORT, 0x08041C40)    Err#6 ENXIO
> 746 14582: fstat64(2, 0x08040C70)                       = 0
> 747 14582: write(2, " c a n n o t   i m p o r".., 24)   = 24
> 748 14582: write(2, " : ", 2)                           = 2
> 749 14582: write(2, " o n e   o r   m o r e  ".., 44)   = 44
> 750 14582: write(2, "\n", 1)                            = 1
>
> Thanks & Regards,
> Suresh