>>>>> "t" == Tim <t...@tcsac.net> writes:

     t> Uhhh, S10 box that provide zfs backed iSCSI is NOT fine.  Cite
     t> the plethora of examples on this list of how the fault
     t> management stack takes so long to respond it's basically
     t> unusable as it stands today.
well...if we are talking about reliability (whether or not you lose the
whole pool when some network element or disk target reboots), that's
separate from availability (do your final applications experience
glitches and outages, or are they insulated from failures that happen
far enough underneath the layering?).  The insulation right now seems
pretty poor compared to other SAN products, but that's not the same
problem as the single-LUN reliability issue we've been discussing
recently.

if you make a zpool mirror out of two S10 boxes providing iSCSI
targets instead of one, that is better.  If you make a zpool from one
S10 box providing iSCSI, it does not matter whether the iSCSI target
software is serving the one LUN from a zvol or a disk or an SVM slice:
it is not fine, for reliability, to have a single-LUN iSCSI vdev.  You
must have ZFS-layer redundancy on the overall pool, above the iSCSI.

if you change from:

           client
            NFS
             |
 +------------------------------+            +------------+
 |                              |            |    SAN     |
 | NFS                      ZFS |--lun0------|  iSCSI/FC  |
 |                              |            |            |
 +------------------------------+            +------------+
                                               |    |    |
                                +--------------+    |    +--------------+
                                |                   |                   |
                         +------------+      +------------+      +------------+
                         |    disk    |      |    disk    |      |    disk    |
                         |   shelf    |      |   shelf    |      |   shelf    |
                         |            |      |            |      |            |
                         +------------+      +------------+      +------------+

to using a non-ZFS filesystem above the iSCSI:

 [client | notZFS]
        |
   -----+-------------+-------------+
        |             |             |
 +------------++------------++------------+
 |iSCSI target||iSCSI target||iSCSI target|
 |    ZFS     ||    ZFS     ||    ZFS     |
 | local disk || local disk || local disk |
 +------------++------------++------------+

then that is probably more okay.  But for me the appeal of ZFS is to
aggregate spread-out storage into really huge pools.  Maybe it's hard
to find a good notZFS to use in that diagram.
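To make the "ZFS-layer redundancy above the iSCSI" point concrete, here is a minimal sketch of creating a mirrored pool from two iSCSI LUNs served by two separate boxes, versus the single-LUN vdev being warned against.  The device names are illustrative only, not real paths from any system in this thread:

```shell
# ZFS-layer redundancy above the iSCSI transport: one mirror vdev
# built from two LUNs, each backed by a *different* target box.
# (Device names below are hypothetical examples.)
zpool create tank mirror \
    c2t600144F0AAAAAAAAd0 \
    c2t600144F0BBBBBBBBd0

# What the text says is NOT fine for reliability -- a single-LUN
# vdev with no ZFS-layer redundancy above the iSCSI:
#
#   zpool create tank c2t600144F0AAAAAAAAd0
```

With the mirror, ZFS can detect and repair corruption and survive one target disappearing; with the single LUN, any problem underneath can take out the whole pool.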
My understanding so far is, this is also better:

           client
            NFS
             |
 +------------------------------+            +------------+
 |                              |---lun0-----|    SAN     |
 | NFS                      ZFS |---lun1-----|  iSCSI/FC  |
 |                              |            |            |
 +------------------------------+            +------------+
                                               |    |    |
                                +--------------+    |    +--------------+
                                |                   |                   |
                         +------------+      +------------+      +------------+
                         |    disk    |      |    disk    |      |    disk    |
                         |   shelf    |      |   shelf    |      |   shelf    |
                         |            |      |            |      |            |
                         +------------+      +------------+      +------------+

where lun0 and lun1 make up a mirrored vdev in ZFS.  It does not
matter if lun0 and lun1 are carried on the same physical cable or
connected to the same SAN controller, but obviously they DO need to be
backed by separate storage, so you're wasting a lot of disk to do
this.  Even if some maintenance event reboots lun0 and lun1 at the
same time, my understanding so far is that people have found the
configuration above less likely to lose whole pools than running on a
single LUN.  For example, see the thread including message-id
<b5eb902a18810e43800f26a02db4279a2e6f1...@cnjexchange.composers.caxton.com>
around 2008-08-13.

Another alternative is to go ahead and use single-LUN vdevs, but keep
the data mirrored across multiple zpools using 'zfs send | zfs recv'
or rsync, in case you lose a whole pool.  That's appealing to me
because the backup pool can be built from cheaper, slower pieces than
the main pool, instead of burning up double the amount of expensive
main-pool storage; it protects from problems/bugs/mistakes that a vdev
mirror does not; and it allows changing pool geometry, removing slogs,
and so on.  The downside to planning on restoring from backup is that,
if you lose your single-LUN pool relatively often, these big
heavily-aggregated pools can take days to restore.  It's a type of
offline maintenance, which was always against the original ZFS
kool-aid philosophy, because offline maintenance puts a de-facto cap
on maximum pool size.
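The 'zfs send | zfs recv' alternative above can be sketched as follows.  This assumes a main pool called "tank" and a cheaper backup pool called "backup"; the pool and snapshot names are illustrative, and the -R/-i/-F flags are the recursive, incremental, and rollback options as documented in zfs(1M):

```shell
# First full replication of the main pool to the backup pool:
zfs snapshot -r tank@backup-1
zfs send -R tank@backup-1 | zfs recv -Fdu backup

# Later, send only the changes between two snapshots:
zfs snapshot -r tank@backup-2
zfs send -R -i tank@backup-1 tank@backup-2 | zfs recv -Fdu backup
```

In practice the pipe would usually run over ssh to a separate machine, e.g. 'zfs send ... | ssh backuphost zfs recv ...', so losing the primary box does not take the backup pool with it.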
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss