Tejun Heo wrote:
Ric Wheeler wrote:
I think that we do handle a failure in the case that you outline above
since the FS will be able to notice the error before it sends a commit
down (and that commit is wrapped in the barrier flush calls). This is
the easy case since we still have the context for the IO.

I'm no FS guy, but for that to be true the FS should be waiting for all
the outstanding IOs to finish before issuing a barrier, and then it
actually doesn't need barriers at all - it can do the same with flush_cache.

Waiting for the target to ack an IO is not sufficient, since the target ack does not (with write cache enabled) mean that the data is on persistent storage.

The key is to make your transaction commit ensure that the commit block itself is not written out of sequence, without first flushing the dependent IO from the transaction.

If we disable the write cache, then file systems effectively do exactly the right thing today as you describe :-)
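
To make the ordering concrete, here is a minimal userspace sketch of the sequence described above: write the dependent blocks, flush, write the commit block, flush again. It is an illustration only, not how any particular file system implements its commit. The function name commit_transaction and the 4096-byte block size are made up for the example, and os.fsync() stands in for a cache flush on the assumption that the underlying file system and device actually pass the flush down to the drive.

```python
import os
import tempfile

def commit_transaction(fd, blocks, commit_block, block_size=4096):
    """Write the dependent blocks, flush, then write the commit block.

    Hypothetical helper for illustration; returns the offset just past
    the commit block.
    """
    offset = 0
    for block in blocks:
        # Pad each block to a fixed size so offsets are predictable.
        os.pwrite(fd, block.ljust(block_size, b"\0"), offset)
        offset += block_size

    # First flush: a completed write only means the data was accepted,
    # not that it is on persistent storage. The dependent blocks must be
    # flushed before the commit block is issued.
    # Assumption: fsync() results in a cache flush on the target.
    os.fsync(fd)

    os.pwrite(fd, commit_block.ljust(block_size, b"\0"), offset)

    # Second flush: make the commit block itself durable.
    os.fsync(fd)
    return offset + block_size

if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    try:
        end = commit_transaction(fd, [b"data-1", b"data-2"], b"COMMIT")
        print("transaction ends at offset", end)
    finally:
        os.close(fd)
        os.unlink(path)
```

With the write cache disabled, the first flush becomes redundant, since an acknowledged write is then already on persistent storage.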

It is more challenging (and kind of related) if the IO done in (4) has
been ack'ed by drive, the drive later destages (not as part of the
flush) its write cache and then an error happens. In this case, there is
nothing waiting on the initiator side to receive the IO error. We have
effectively lost the context for that IO.

IIUC, it should be detectable from FLUSH whether the destaging
occurred as part of the flush or not, no?

I am not sure what happens to a write that fails to get destaged from cache. It probably depends on the target firmware, but I imagine that the target cannot hold onto it forever (or all subsequent flushes would always fail).

The only way to detect this is on replay (if the journal has checksums
enabled, or if the error is flagged as a media error).

If it's not reported on FLUSH, it basically amounts to silent data
corruption and only checksums can help.

Thanks.


Agreed - checksums (or proper handling of media errors) are the only way to detect this.
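
As a rough sketch of what checksum-based detection on replay could look like: each journal record carries a CRC32 of its payload, and replay stops at the first record that does not verify. The record layout (length plus CRC32 header) and the names pack_record and replay are invented for this example and do not correspond to any real journal format.

```python
import struct
import zlib

# Hypothetical record header: payload length, then CRC32 of the payload.
HEADER = struct.Struct("<II")

def pack_record(payload: bytes) -> bytes:
    """Prefix the payload with its length and CRC32."""
    return HEADER.pack(len(payload), zlib.crc32(payload)) + payload

def replay(journal: bytes):
    """Return the payloads that verify, stopping at the first bad record."""
    records = []
    pos = 0
    while pos + HEADER.size <= len(journal):
        length, crc = HEADER.unpack_from(journal, pos)
        start = pos + HEADER.size
        payload = journal[start:start + length]
        # A truncated or corrupted record means the write never made it
        # to media intact, even though it may have been acknowledged.
        if len(payload) != length or zlib.crc32(payload) != crc:
            break
        records.append(payload)
        pos = start + length
    return records

if __name__ == "__main__":
    journal = pack_record(b"block A") + pack_record(b"block B")
    damaged = journal[:-1] + b"X"   # simulate a lost or corrupted write
    print(len(replay(journal)), len(replay(damaged)))
```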

Ric

