On Sep 16, 2008, at 5:39 PM, Miles Nordin wrote:

>>>>>> "jd" == Jim Dunham <[EMAIL PROTECTED]> writes:
>
> jd> If at the time the SNDR replica is deleted the set was
> jd> actively replicating, along with ZFS actively writing to the
> jd> ZFS storage pool, I/O consistency will be lost, leaving ZFS
> jd> storage pool in an indeterministic state on the remote node.
>
> jd> To address this issue, prior to deleting the replicas, the
> jd> replica should be placed into logging mode first.
>
> What if you stop the replication by breaking the network connection
> between primary and replica?  consistent or inconsistent?

Consistent.

> it sounds fishy, like ``we're always-consistent-on-disk with ZFS, but
> please use 'zpool offline' to avoid disastrous pool corruption.''

This is not the case at all. Maintaining I/O consistency of all volumes
in a single I/O consistency group is an attribute of replication. The
instant an SNDR replica is deleted, that volume is no longer being
replicated, and it becomes inconsistent with all other write-order
volumes. By placing all volumes in the I/O consistency group in logging
mode (not 'zpool offline'), and then deleting the replica, there is no
means for any of the remote volumes to become I/O inconsistent.

Yes, one will note that there is a group disable command
"sndradm -g <group-name> -d", but it was implemented for ease of
administration, not for performing a write-order coordinated disable.
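As a rough sketch of that sequence (the group name 'tank' is only
illustrative, and the exact options should be confirmed against the
sndradm man page):

    # place every set in the I/O consistency group into logging mode,
    # preserving write order across all volumes in the group
    sndradm -g tank -l

    # with the group logging, the replicas can now be deleted (disabled)
    # without leaving the remote volumes I/O inconsistent
    sndradm -g tank -d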
> jd> ndr_ii. This is an automatic snapshot taken before
> jd> resynchronization starts,
>
> yeah that sounds fine, possibly better than DRBD in one way because it
> might allow the resync to go faster.
>
> From the PDF's it sounds like async replication isn't done the same
> way as the resync, it's done safely, and that it's even possible for
> async replication to accumulate hours of backlog in a ``disk queue''
> without losing write ordering so long as you use the ``blocking mode''
> variant of async.

Correct reading of the documentation.

> ii might also be good for debugging a corrupt ZFS, so you can tinker
> with it but still roll back to the original corrupt copy. I'll read
> about it---I'm guessing I will need to prepare ahead of time if I want
> ii available in the toolbox after a disaster.
>
> jd> AVS has the concept of I/O consistency groups, where all disks
> jd> of a multi-volume filesystem (ZFS, QFS) or database (Oracle,
> jd> Sybase) are kept write-order consistent when using either sync
> jd> or async replication.
>
> Awesome, so long as people know to use it. so I guess that's the
> answer for the OP: use consistency groups!

I use the name of the ZFS storage pool as the name of the SNDR I/O
consistency group.

> The one thing I worry about is, before, AVS was used between RAID and
> filesystem, which is impossible now because that inter-layer area no
> longer exists. If you put the individual device members of a
> redundant zpool vdev into an AVS consistency group, what will AVS do
> when one of the devices fails?

Nothing, as it is ZFS that reacts to the failed device.

> Does it continue replicating the working devices and ignore the
> failed one?

In this scenario ZFS knows the device failed, which means ZFS will stop
writing to the disk, and thus to the replica.

> This would sacrifice redundancy at the DR site.  UFS-AVS-RAID
> would not do that in the same situation.
>
> Or hide the failed device from ZFS and slow things down by sending all
> read/writes of the failed device to the remote mirror?  This would
> slow down the primary site.  UFS-AVS-RAID would not do that in the
> same situation.
>
> The latter ZFS-AVS behavior might be rescueable, if ZFS had the
> statistical read-preference feature.  but writes would still be
> massively slowed with this scenario, while in UFS-AVS-RAID they would
> not be.  To get back the level of control one used to have for writes,
> you'd need a different zpool-level way to achieve the intent of the
> AVS sync/async option.  Maybe just a slog which is not AVS-replicated
> would be enough, modulo other ZFS fixes for hiding slow devices.

ZFS-AVS is not UFS-AVS-RAID, and although one can foresee some downsides
to replicating ZFS with AVS, there are some big wins:

 - Place SNDR in logging mode, zpool scrub the secondary volumes for
   consistency, then resume replication (see the sketch just below).
 - Compressed ZFS storage pools result in compressed replication.
 - Encrypted ZFS storage pools result in encrypted replication.
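A rough sketch of that verify-and-resume cycle, assuming the pool and
the I/O consistency group are both named 'tank' (names are illustrative,
and the exact sndradm behavior should be confirmed against the man page):

    # on the primary: drop the group into logging mode
    sndradm -g tank -l

    # on the secondary: import the now-quiescent pool, scrub it, export it
    zpool import tank
    zpool scrub tank
    zpool status tank      # wait for the scrub to finish, check for errors
    zpool export tank

    # back on the primary: resume replication with an update resync;
    # a full resync (sndradm -g tank -m) is the conservative alternative
    sndradm -g tank -u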
Jim Dunham
Engineering Manager
Storage Platform Software Group
Sun Microsystems, Inc.