[zfs-discuss] do zfs filesystems isolate corruption?
In the old days of UFS, one might occasionally create multiple file systems (using multiple partitions) on a large LUN if filesystem corruption was a concern. It didn’t happen often, but filesystem corruption has happened. So, if filesystem X was corrupt, filesystem Y would be just fine.

With ZFS, does the same logic hold true for two filesystems coming from the same pool?

Said slightly differently, I’m assuming that if the pool becomes mangled somehow then all filesystems will be toast … but is it possible to have one filesystem be corrupted while the other filesystems are fine?

Hmmm, does the answer depend on whether the filesystems are nested?

ex: 1
/my_fs_1
/my_fs_2

ex: 2
/home_dirs
/home_dirs/chris

TIA!

This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] do zfs filesystems isolate corruption?
Is it possible that a faulty disk controller could cause corruption to a zpool? I think I had this experience recently when doing a 'zpool replace' with both the old and new devices attached to a controller that I later discovered was faulty (I got data checksum errors, and had to dig for backups).

Blake

On 8/11/07, Richard L. Hamilton [EMAIL PROTECTED] wrote:
> > In the old days of UFS, one might occasionally create multiple file systems (using multiple partitions) on a large LUN if filesystem corruption was a concern. It didn't happen often, but filesystem corruption has happened. So, if filesystem X was corrupt, filesystem Y would be just fine.
> >
> > With ZFS, does the same logic hold true for two filesystems coming from the same pool?
> >
> > Said slightly differently, I'm assuming that if the pool becomes mangled somehow then all filesystems will be toast … but is it possible to have one filesystem be corrupted while the other filesystems are fine?
> >
> > Hmmm, does the answer depend on whether the filesystems are nested?
> >
> > ex: 1
> > /my_fs_1
> > /my_fs_2
> >
> > ex: 2
> > /home_dirs
> > /home_dirs/chris
> >
> > TIA!
>
> If they're always consistent on-disk, and the checksumming catches storage subsystem errors with almost 100% certainty, then the only corruption can come from bugs in the code, or perhaps from uncaught non-storage (i.e. CPU, memory) faults.
>
> So I suppose the answer would depend on where in the code things went astray, but you probably could not expect any sort of isolation or even sanity at that point. If privileged code is running amok, anything could happen, and that would be true with two distinct ufs filesystems too, I would think.
>
> Perhaps one might guess that corruption is more likely not to be isolated to a single zfs filesystem (given how lightweight a zfs filesystem is). OTOH, since zfs catches errors other filesystems don't, think of how many ufs filesystems may well be corrupt for a very long time before causing a panic and having that get discovered by fsck.
>
> Ideally, if zfs code passes its test suites, you're safer with it than with most anything else, even if it isn't perfect.
>
> But I'm way out on a limb here; no doubt the experts will correct and amend what I've said...
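The checksum errors Blake mentions surface in the CKSUM column of `zpool status`. As a minimal sketch (not something posted in the thread), the filter below lists the config rows whose CKSUM counter is non-zero. The function name `cksum_errors` is hypothetical, the sample input is invented for illustration, and the parsing assumes the five-column NAME/STATE/READ/WRITE/CKSUM layout that `zpool status` prints.

```shell
#!/bin/sh
# cksum_errors: read `zpool status` text on stdin and print
# "<name> <count>" for every config row with a non-zero CKSUM column.
# Assumes the config table starts at the NAME/STATE header row and
# ends at the "errors:" line.
cksum_errors() {
    awk '
        $1 == "NAME" && $2 == "STATE" { in_cfg = 1; next }
        $1 == "errors:"               { in_cfg = 0 }
        in_cfg && NF >= 5 && $5 != "0" { print $1, $5 }
    '
}

# Invented sample input (not real output from any system in this thread).
cksum_errors <<'EOF'
config:

        NAME        STATE     READ WRITE CKSUM
        mypool      ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c1t2d0  ONLINE       0     0     0
            c1t3d0  ONLINE       0     0     7

errors: No known data errors
EOF
# prints: c1t3d0 7
```

On a live system the same filter would typically be fed by `zpool scrub mypool`, followed, once the scrub has finished, by `zpool status mypool | cksum_errors`.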
Re: [zfs-discuss] zfs on entire disk?
On 8/11/07, Russ Petruzzelli [EMAIL PROTECTED] wrote:
> Is it possible/recommended to create a zpool and zfs setup such that the OS itself (in root /) is in its own zpool?

Yes. You're looking for ZFS root, and it's easiest if your installer sets it up for you. At least the latest Nexenta unstable installs ZFS root by default.
Re: [zfs-discuss] do zfs filesystems isolate corruption?
Chris,

> In the old days of UFS, one might occasionally create multiple file systems (using multiple partitions) on a large LUN if filesystem corruption was a concern. It didn’t happen often, but filesystem corruption has happened. So, if filesystem X was corrupt, filesystem Y would be just fine.
>
> With ZFS, does the same logic hold true for two filesystems coming from the same pool?

For the purposes of isolating corruption, separating two or more filesystems that come from the same ZFS storage pool does not help. The entire ZFS storage pool is the unit of I/O consistency, as all ZFS filesystems created within a single storage pool share the same physical storage.

When configuring a ZFS storage pool, the [poor] decision to choose a non-redundant pool (a single disk or a concatenation of disks) over a redundant one (mirror, raidz, raidz2) leaves ZFS with no means of automatically recovering from some forms of corruption.

Even with a redundant storage pool, there are scenarios in which this is not good enough. This is where filesystem needs turn into availability needs, such as when the loss or inaccessibility of two or more disks renders mirroring or raidz ineffective.

As of Solaris Express build 68, Availability Suite [http://www.opensolaris.org/os/project/avs/] is part of base Solaris, offering both local snapshots and remote mirrors, both of which work with ZFS.

Locally, on a single Solaris host, snapshots of the entire ZFS storage pool can be taken at intervals of one's choosing, and with multiple snapshots of a single master, collections of snapshots, say at intervals of one hour, can be retained. Options allow for 100% independent snapshots (much like your UFS analogy above), dependent snapshots where only the Copy-On-Write data is retained, or compact dependent snapshots where the snapshot's physical storage is some percentage of the master.

Remotely, between two or more Solaris hosts, remote mirrors of the entire ZFS storage pool can be configured, where synchronous replication can offer zero data loss and asynchronous replication can offer near-zero data loss, with both offering write-order, on-disk consistency.

A key aspect of remote replication with Availability Suite is that the replicated ZFS storage pool can be quiesced on the remote node and accessed or, in a disaster recovery scenario, take over instantly where the primary left off. When the primary site is restored, the MTTR (Mean Time To Recovery) is essentially zero, since Availability Suite supports on-demand pull: yet-to-be-replicated blocks are retrieved synchronously, allowing the ZFS filesystem and applications to be resumed without waiting for a potentially lengthy resynchronization.

Jim Dunham
Solaris, Storage Software Group
Sun Microsystems, Inc.
1617 Southwood Drive
Nashua, NH 03063
Email: [EMAIL PROTECTED]
http://blogs.sun.com/avs
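To make the point about shared storage concrete, here is a minimal sketch of the two layouts from the original question, flat siblings and a nested parent/child, created in a single pool. The pool name `mypool` and the helper names `run` and `create_layouts` are placeholders, not anything from Jim's reply. The script only prints the commands unless `RUN=1` is set.

```shell
#!/bin/sh
# Sketch only: prints the commands instead of running them unless RUN=1.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Create the flat and the nested layout from the question in one pool.
create_layouts() {
    pool=$1
    run zfs create "$pool/my_fs_1"          # example 1: sibling datasets
    run zfs create "$pool/my_fs_2"
    run zfs create "$pool/home_dirs"        # example 2: parent dataset
    run zfs create "$pool/home_dirs/chris"  #            and its child
    # List all datasets with their space usage; both layouts draw on
    # the same pool-wide storage.
    run zfs list -r -o name,used,available "$pool"
}

create_layouts mypool
```

Either way, every dataset lives on the same vdevs, which is why nesting does not change the answer about corruption isolation.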
[zfs-discuss] zpool upgrade to more storage
Hello everyone,

I am slowly running out of space in my zpool, so I want to replace it with a different zpool. My current zpool is:

    zpool list
    NAME     SIZE   USED  AVAIL   CAP  HEALTH  ALTROOT
    mypool   278G   263G  14.7G   94%  ONLINE  -

    zpool status mypool
      pool: mypool
     state: ONLINE
    status: One or more devices is currently being resilvered.  The pool will
            continue to function, possibly in a degraded state.
    action: Wait for the resilver to complete.
     scrub: resilver in progress, 11.37% done, 10h0m to go
    config:

            NAME        STATE     READ WRITE CKSUM
            mypool      ONLINE       0     0     0
              mirror    ONLINE       0     0     0
                c1t2d0  ONLINE       0     0     0
                c1t3d0  ONLINE       0     0     0

    errors: No known data errors

(Yes, I know it's resilvering one of the disks...)

Anyway, that is a simple mirror zpool. I would like to create another pool, let's say mypool2, with a few more disks and use raidz2 instead. What would be my options for this transfer?

I don't think I can attach disks to the existing pool, because it is a mirror and not raidz2. Can I create a raidz2 and just add it to mypool using the zpool add option? And once it is added, is there any way to remove the original mirror from the pool? The tricky part is that I have lots of snapshots on mypool and I would like to keep them.

Another option I think I have is to create mypool2 the way I want it, as raidz2, then use zfs send and receive to move the data over, and destroy the original mirror once the new pool has replaced it.

What do you think? What would you recommend? With the second option I would probably need to take the system offline, and I don't even know whether the first option would work, where I would add the newly created raidz2 to mypool and then remove the original mirror from it.

Regards,
Chris
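The second option described above (a new raidz2 pool plus zfs send and receive) could look roughly like the sketch below. It rests on several assumptions that are not in the post: the new pool already exists (for example, created with `zpool create mypool2 raidz2` and the new disks), and the installed ZFS release supports recursive snapshots (`zfs snapshot -r`) and replication streams (`zfs send -R`), which is what carries the existing snapshots across. The helper names `run` and `migrate_pool` and the snapshot name `migrate` are hypothetical. On the first option, ZFS releases of this period could not, as far as I know, remove a top-level vdev once it had been added with zpool add.

```shell
#!/bin/sh
# Sketch only: prints the commands instead of running them unless RUN=1.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        sh -c "$1"
    else
        echo "$1"
    fi
}

# migrate_pool SRC DST [SNAPNAME]
# Snapshot every dataset in SRC, then replicate the whole hierarchy,
# including older snapshots, into DST.
migrate_pool() {
    src=$1
    dst=$2
    snap=${3:-migrate}
    run "zfs snapshot -r ${src}@${snap}"
    # -R sends descendants, properties and snapshots;
    # -F lets the receive overwrite the (new, empty) destination;
    # -d keeps the dataset paths below the pool name.
    run "zfs send -R ${src}@${snap} | zfs receive -F -d ${dst}"
}

migrate_pool mypool mypool2
```

Anything written to the old pool after the snapshot is taken is not included in the stream, so writers would need to be stopped first or a later incremental stream sent, and the old mirror would only be destroyed after the copy has been verified.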
Re: [zfs-discuss] do zfs filesystems isolate corruption?
Thanks for the info, folks.

In addition to the 2 replies shown above, I got the following very knowledgeable reply from Jim Dunham (for some reason it has not shown up here yet, so I'm going to paste it in).

Chris,

For the purposes of isolating corruption, separating two or more filesystems that come from the same ZFS storage pool does not help. The entire ZFS storage pool is the unit of I/O consistency, as all ZFS filesystems created within a single storage pool share the same physical storage.

When configuring a ZFS storage pool, the [poor] decision to choose a non-redundant pool (a single disk or a concatenation of disks) over a redundant one (mirror, raidz, raidz2) leaves ZFS with no means of automatically recovering from some forms of corruption.

Even with a redundant storage pool, there are scenarios in which this is not good enough. This is where filesystem needs turn into availability needs, such as when the loss or inaccessibility of two or more disks renders mirroring or raidz ineffective.

As of Solaris Express build 68, Availability Suite [http://www.opensolaris.org/os/project/avs/] is part of base Solaris, offering both local snapshots and remote mirrors, both of which work with ZFS.

Locally, on a single Solaris host, snapshots of the entire ZFS storage pool can be taken at intervals of one's choosing, and with multiple snapshots of a single master, collections of snapshots, say at intervals of one hour, can be retained. Options allow for 100% independent snapshots (much like your UFS analogy above), dependent snapshots where only the Copy-On-Write data is retained, or compact dependent snapshots where the snapshot's physical storage is some percentage of the master.

Remotely, between two or more Solaris hosts, remote mirrors of the entire ZFS storage pool can be configured, where synchronous replication can offer zero data loss and asynchronous replication can offer near-zero data loss, with both offering write-order, on-disk consistency.

A key aspect of remote replication with Availability Suite is that the replicated ZFS storage pool can be quiesced on the remote node and accessed or, in a disaster recovery scenario, take over instantly where the primary left off. When the primary site is restored, the MTTR (Mean Time To Recovery) is essentially zero, since Availability Suite supports on-demand pull: yet-to-be-replicated blocks are retrieved synchronously, allowing the ZFS filesystem and applications to be resumed without waiting for a potentially lengthy resynchronization.

Thanks Jim!
Re: [zfs-discuss] do zfs filesystems isolate corruption?
On 8/11/07, Stan Seibert [EMAIL PROTECTED] wrote:
> I'm not sure if that answers the question you were asking, but generally I found that damage to a zpool was very well confined.

But you can't count on it. I currently have an open case where a zpool became corrupt and put the system into a panic loop. As the case has progressed, I found that the panic loop is not present in any released version of S10 tested (S10U3 + 118833-36, 125100-07, 125100-10) but does exist in snv69.

The test is whether zpool import (no pool name) causes the system to panic. If it does, I am assuming that having the corresponding zpool.cache in place would cause a panic on every boot.

Oddly enough, I know I can't blame the storage subsystem for this - it is ZFS as well. :) It goes like this:

HDS 99xx
T2000 primary ldom
    S10u3 with a file on zfs presented as a block device for an ldom
T2000 guest ldom
    zpool on slice 3 of block device mentioned above

Depending on the OS running on the guest LDOM, zpool import gives different results:

S10U3 118833-36 - 125100-10: zpool is corrupt, restore from backups
S10u4 Beta, snv69 and I think snv59: panic - S10u4 backtrace is very different from snv*

--
Mike Gerdts
http://mgerdts.blogspot.com/