Eric Schrock writes:
> A better option would be to not use this to perform FMA diagnosis, but
> instead work into the mirror child selection code.  This has already
> been alluded to before, but it would be cool to keep track of latency
> over time, and use this to both a) prefer one drive over another when
> selecting the child and b) proactively timeout/ignore results from one
> child and select the other if it's taking longer than some historical
> standard deviation.  This keeps away from diagnosing drives as faulty,
> but does allow ZFS to make better choices and maintain response times.
> It shouldn't be hard to keep track of the average and/or standard
> deviation and use it for selection; proactively timing out the slow I/Os
> is much trickier. 
This would be a good solution for the remote iSCSI mirror configuration.
I've been working through this situation with a client (we have been
comparing ZFS with Cleversafe) and we'd love to be able to get the read
performance of the local drives from such a pool.
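
To make the read-side idea concrete, here is a rough sketch of per-child latency tracking and selection. It is not ZFS code: `ChildLatency`, `pick_child` and the multiplier `k` are names I made up for illustration, and it uses Welford's running mean/variance as one possible way to keep the statistics.

```python
import math


class ChildLatency:
    """Running mean and standard deviation of read latency for one mirror child.

    Hypothetical helper, not part of ZFS. Uses Welford's online algorithm so
    no history of samples has to be kept.
    """

    def __init__(self, name):
        self.name = name
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def record(self, latency_ms):
        """Fold one completed read's latency into the running statistics."""
        self.count += 1
        delta = latency_ms - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (latency_ms - self.mean)

    def stddev(self):
        """Sample standard deviation; 0.0 until there are two samples."""
        if self.count < 2:
            return 0.0
        return math.sqrt(self._m2 / (self.count - 1))

    def slow_threshold(self, k=3.0):
        """Latency beyond which a read from this child counts as an outlier."""
        return self.mean + k * self.stddev()


def pick_child(children):
    """Prefer children with no samples yet, then the lowest mean latency."""
    return min(children, key=lambda c: (c.count > 0, c.mean))


local = ChildLatency("local")
remote = ChildLatency("remote-iscsi")
for ms in (1.0, 2.0, 3.0):
    local.record(ms)
for ms in (20.0, 30.0, 40.0):
    remote.record(ms)

preferred = pick_child([local, remote])
```

With these made-up samples the local child is preferred, and a read from it that exceeds `local.slow_threshold()` would be the candidate for reissuing to the other child. As Eric says, actually timing out the slow I/O is the hard part and is not shown here.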

> As others have mentioned, things get more difficult with writes.  If I
> issue a write to both halves of a mirror, should I return when the first
> one completes, or when both complete?  One possibility is to expose this
> as a tunable, but any such "best effort RAS" is a little dicey because
> you have very little visibility into the state of the pool in this
> scenario - "is my data protected?" becomes a very difficult question to
> answer. 
One solution (again, to be used with a remote mirror) is the three-way
mirror.  If two devices are local and one remote, data is safe once the two
local writes return.  I guess the issue then changes from "is my data safe"
to "how safe is my data".  I would be reluctant to deploy a remote mirror
device without local redundancy, so this probably won't be an uncommon
setup.  There would have to be an acceptable window of risk during which
locally written data hasn't yet been replicated to the remote device.
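
A small sketch of what I mean, assuming the write is acknowledged once every local child has completed and the remote child is allowed to lag. Again this is not ZFS code; `MirrorWrite`, `complete` and `exposure` are hypothetical names, and the timestamps are passed in explicitly to keep the example simple.

```python
class MirrorWrite:
    """Tracks one write issued to local and remote mirror children.

    Hypothetical illustration of "acknowledge after local writes": the write
    is acknowledged when all local children finish, and the time until the
    remote children finish is the window of risk.
    """

    def __init__(self, local_children, remote_children):
        self.pending_local = set(local_children)
        self.pending_remote = set(remote_children)
        self.acked_at = None        # when the last local child completed
        self.replicated_at = None   # when every child had completed

    def complete(self, child, now):
        """Record that `child` finished its copy of the write at time `now`."""
        self.pending_local.discard(child)
        self.pending_remote.discard(child)
        if self.acked_at is None and not self.pending_local:
            self.acked_at = now
        if (self.replicated_at is None and not self.pending_local
                and not self.pending_remote):
            self.replicated_at = now

    def acknowledged(self):
        """True once the caller could be told the write is locally redundant."""
        return self.acked_at is not None

    def exposure(self, now):
        """Time spent acknowledged but not yet on the remote children."""
        if self.acked_at is None:
            return 0.0
        end = self.replicated_at if self.replicated_at is not None else now
        return end - self.acked_at


w = MirrorWrite({"local0", "local1"}, {"remote0"})
w.complete("local0", now=1.0)
w.complete("local1", now=2.0)   # both local copies done: acknowledge here
lag_so_far = w.exposure(now=5.0)
w.complete("remote0", now=7.0)
final_lag = w.exposure(now=10.0)
```

The value returned by `exposure` is one way to put a number on "how safe is my data": a policy could alarm, or fall back to waiting for the remote write, once it passes whatever window the site considers acceptable.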

