>>>>> "nw" == Nicolas Williams <nicolas.willi...@sun.com> writes: >>>>> "tt" == Toby Thain <t...@telegraphics.com.au> writes: >>>>> "jh" == Johan Hartzenberg <jhart...@gmail.com> writes:
    nw> If you can fully trust the SAN then there's no reason not to
    nw> run ZFS on top of it with no ZFS mirrors and no RAID-Z.

The best practice as I understand it is currently to use zpool-layer redundancy, with a SAN even more so than with single-spindle local storage, because of (1) the new corruption problems people are having with ZFS on single-LUN SANs that they didn't have when using UFS and vxfs on the same SAN, and (2) the new severity of the problem: losing the whole pool instead of the few files you lose to UFS corruption, or that you're supposed to lose to random bit flips on ZFS. (A concrete sketch of what I mean is at the end of this message.)

The problems do not sound like random bit-flips. They're corruption of every ueberblock. The best-guess explanation, AIUI, is not FC checksum gremlins. It's that write access to the SAN is lost and then comes back (e.g. the SAN target loses power or fabric access but the ZFS host doesn't reboot), and either the storage stack is misreporting the failure or ZFS isn't correctly responding to the errors. See the posts I referenced. Apparently the layering is not as simple in practice as one might imagine.

Even if you ignore the post-mortem analysis of the corrupt pools and look only at the symptom: if it were random corruption from DRAM and FC checksum gremlins, we should see mostly reports of a few files lost to checksum errors on single-LUN SANs, reported in 'zpool status', much more often than whole zpools lost. Yet exactly the opposite is happening.

    jh> The only bit that I understand about why HW raid "might" be
    jh> bad is that if it had access to the disks behind a HW RAID
    jh> LUN, then _IF_ zfs were to encounter corrupted data in a read,

In at least one case it's certain there are no reported latent sector errors from the SAN on the corrupt LUN: 'dd if=<..lun..> of=/dev/null' worked for at least one person who lost a single-LUN zpool. It doesn't sound to me like random bit-flips causing the problem, since all copies of the ueberblock are corrupt, and that's a bit far-fetched to happen randomly on a LUN that scrubs almost clean when mounted with the second-newest ueberblock.

    jh> ZFS' notorious instability during error conditions.

Right, availability is a reason to use RAID below the ZFS layer. It might or might not be related to the SAN problems. Maybe yes, if the corruption happens during a path failover or a temporary connectivity interruption. But the symptom here is different from the one in the timeout/availability thread: a corrupt, unmountable pool. The hang discussion was about frozen systems where the pool imports fine after reboot, which is a different symptom.
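For concreteness, here is the minimal sketch of what I mean above by zpool-layer redundancy on a SAN. The device names are made up; assume two LUNs exported by the array, ideally reached over different controllers or paths:

    # Single-LUN pool: ZFS detects corruption via checksums but has no
    # redundant copy to repair from, so a bad set of ueberblocks can
    # cost you the whole pool.
    zpool create tank c4t600A0B8000112233d0

    # zpool-layer redundancy: mirror two SAN LUNs, so every block (and
    # every label/ueberblock copy) exists on both LUNs and ZFS can
    # self-heal a bad read from the good side.
    zpool create tank mirror c4t600A0B8000112233d0 c4t600A0B8000445566d0

    # Periodic scrub plus 'zpool status -v' to surface checksum errors.
    zpool scrub tank
    zpool status -v tank

RAID-Z across three or more LUNs serves the same purpose here; the point is only that ZFS itself holds a second copy it can repair from, rather than trusting the array alone.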
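And a hedged sketch of the post-mortem checks described above, if anyone else hits a dead single-LUN pool (device path is again hypothetical):

    # Read the raw LUN end to end to rule out reported latent sector
    # errors -- this is the check that came back clean in the case above.
    dd if=/dev/rdsk/c4t600A0B8000112233d0s0 of=/dev/null bs=1024k

    # Dump the four on-disk ZFS labels; the ueberblock copies live in
    # the label areas, so corruption of every copy shows up here.
    zdb -l /dev/rdsk/c4t600A0B8000112233d0s0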