Erik Trimble wrote:
> ZFS no longer has the issue where loss of a single device (even intermittently) causes pool corruption. That's been fixed.

Erik, it does not help at all when you talk about some issue being fixed but do not provide the corresponding CR number. Without it an interested observer cannot go and look at what exactly the issue was and how it has been fixed, nor track its presence or absence in other releases.

So could you please provide the CR number for the issue you are talking about?


> That is, there used to be an issue in this scenario:
>
> (1) zpool constructed from a single LUN on a SAN device
> (2) SAN experiences temporary outage, while ZFS host remains running.
> (3) zpool is permanently corrupted, even if no I/O occurred during outage
>
> This is fixed. (around b101, IIRC)

You see - you cannot tell exactly when it was fixed yourself. Besides, in the scenario you describe above a whole lot can be hidden behind "SAN experiences temporary outage". It can be as simple as the wrong fibre cable being unplugged, or as complex as a storage array failing, rebooting and losing its entire cache content as a result.

In the former case I do not see how it could badly affect a ZFS pool. It may cause a panic if 'failmode' is set to panic (or if the software release is too old and does not support this property), and it may require administrator intervention in the form of 'zpool clear'.
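
For illustration, on releases that do have the property, the relevant commands look roughly like this ('tank' is just a placeholder pool name):

  # Check which failure mode the pool is configured for (wait | continue | panic)
  zpool get failmode tank

  # Have the pool block I/O and wait for the device to come back, rather than panic
  zpool set failmode=wait tank

  # Once the path to the LUN is back, clear the error state and resume normal operation
  zpool clear tank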

In the latter case the consequences can really be bad - the pool may be corrupted and unopenable. There are several examples of this in the archives, as well as stories of successful recovery.

And there is a recovery project underway to provide support for recovering pools from such corruption.
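
(As a rough sketch only, assuming a build where that recovery support has already integrated: it is exposed through 'zpool import', though the exact options depend on your build, so check the man page first.)

  # Dry run: ask whether the pool could be made importable by discarding
  # the last few transactions, without actually modifying anything
  zpool import -Fn tank

  # Attempt the recovery import, rolling the pool back to a consistent state
  zpool import -F tank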

> However, ZFS remains much more sensitive to loss of the underlying
> LUN than UFS, and has a tendency to mark such a LUN as defective
> during any such SAN outage. It's much more recoverable nowadays,
> though. Just to be clear, this occasionally occurs when something such
> as a SAN switch dies, or there is a temporary hiccup in the SAN
> infrastructure, causing some small (i.e. < a minute) loss of
> connectivity to the underlying LUN.

Again, SANs are very complex structures, and a perceived small loss of connectivity may in reality be a very complex event with difficult-to-predict consequences.

With non-COW filesystems (like UFS) you are indeed less likely to experience the consequences of a small outage immediately (though they can still manifest themselves much, much later).

ZFS tends to uncover the presence of such consequences much earlier (immediately?). But that does not automatically mean there is an issue with ZFS itself. There may be an issue somewhere within the SAN infrastructure, even if it was only unavailable for less than a minute.
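
That is also why, after any suspicious SAN event, it is worth explicitly asking ZFS whether anything was damaged instead of waiting for the next read to stumble over it. A quick way to do that (again assuming a pool named 'tank'):

  # Walk every allocated block in the pool and verify its checksum
  zpool scrub tank

  # Review the result; -v also lists files affected by unrecoverable errors
  zpool status -v tank

A filesystem without end-to-end checksums has no equivalent of this check, which is part of why its problems tend to surface much later.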

> RAIDZ and mirrored zpools are still the preferred method of arranging things in ZFS, even with hardware RAID backing the underlying LUN (whether the LUN is from a SAN or local HBA doesn't matter).

I fully support this - without redundancy at the ZFS level there is no such benefit as self-healing...
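
As a minimal sketch of the difference (device names are placeholders for LUNs, whether from a SAN or a local HBA; pick one layout):

  # No ZFS-level redundancy: checksum errors are detected but cannot be repaired
  zpool create tank c2t0d0

  # ZFS-level mirror across two LUNs: a block that fails its checksum on one
  # side is rewritten from the good copy automatically (self-healing)
  zpool create tank mirror c2t0d0 c3t0d0

  # raidz is the other common choice when three or more LUNs are available
  zpool create tank raidz c2t0d0 c3t0d0 c4t0d0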

regards,
victor
