Chris Mason wrote:
On Wed, 2008-10-22 at 09:38 -0400, Ric Wheeler wrote:
Chris Mason wrote:
On Wed, 2008-10-22 at 22:15 +0900, Tejun Heo wrote:
Ric Wheeler wrote:
I think that we do handle a failure in the case that you outline above
since the FS will be able to notice the error before it sends a commit
down (and that commit is wrapped in the barrier flush calls). This is
the easy case since we still have the context for the IO.
I'm no FS guy, but for that to be true the FS should be waiting for all
the outstanding IOs to finish before issuing a barrier - and then it
doesn't actually need barriers at all; it can do the same with flush_cache.

We wait and then barrier.  If the barrier returned status that a
previously ack'd IO had actually failed, we could do something to make
sure the FS was consistent.
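
In user-space terms the pattern being described is roughly the following
(a purely illustrative sketch - the file name and sizes are made up, and it
assumes fdatasync() actually reports a deferred write failure, which is
exactly the behaviour in question below):

/*
 * Minimal user-space sketch of the "wait, then flush, then check" ordering
 * discussed above.  Assumptions: the file name is made up, and we rely on
 * fdatasync() returning an error if a previously acknowledged write failed
 * to reach stable storage.
 */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    const char *path = "datafile";          /* hypothetical file */
    char buf[4096];
    memset(buf, 0xab, sizeof(buf));

    int fd = open(path, O_WRONLY | O_CREAT, 0644);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    /* Step 1: issue the "data" writes.  write() success only means the
     * page cache accepted them, not that they are on the platter. */
    if (write(fd, buf, sizeof(buf)) != (ssize_t)sizeof(buf)) {
        perror("write");
        return 1;
    }

    /* Step 2: wait for the data and flush it out.  On file systems that
     * implement barriers/flushes, fdatasync() is the point where a destage
     * failure could be reported back to us. */
    if (fdatasync(fd) != 0) {
        perror("fdatasync");            /* corrective action goes here */
        return 1;
    }

    /* Step 3: only now write the "commit record" (here just a second
     * write + flush) so it cannot be ordered ahead of the data. */
    if (write(fd, "commit", 6) != 6 || fdatasync(fd) != 0) {
        perror("commit");
        return 1;
    }

    close(fd);
    return 0;
}
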
As I mentioned in a reply to Tejun, I am not sure that we can count on the barrier op giving us status for IOs that failed to destage cleanly.

Waiting and then doing the FLUSH seems to give us the best coverage for normal failures (and your own testing shows that it is hugely effective in reducing some types of corruption at least :-)).

If you look at the types of common drive failures, I would break them into two big groups. The first group would be transient errors - i.e., this IO fails (usually a read), but a subsequent IO will succeed with or without a sector remapping happening. Causes might be:

(1) just a bad read due to dirt on the surface of the drive - the read will always fail, but a write might clean the surface and restore it to useful life.
(2) vibrations (dropping your laptop, rolling a big machine down the data center, passing trains :-))
(3) adjacent sector writes - hot spotting on drives can degrade the data on adjacent tracks. This causes IO errors on reads for data that was successfully written before, but the track itself is still perfectly fine.


(4) transient conditions such as heat or other problems that make the drive
give errors.

Yes, heat is an issue (as well as severe cold) since drives have parts that expand and contract :-).
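
Case (1) above is also the one where software can actively help: drives will
typically refresh or remap a sector on the next write to it. Something like
the fragment below is the idea - it is destructive (it overwrites the sector
with zeros), the device name and sector number are placeholders, and it only
makes sense when the data can be rebuilt from a mirror or backup:

/*
 * Illustrative only: force a drive to refresh/remap one unreadable sector
 * by writing over it.  This DESTROYS the sector's contents, so it is only
 * sane when the data can be rebuilt from elsewhere.  Assumes 512-byte
 * logical sectors; device name and sector number are placeholders.
 */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define SECTOR 512

int main(void)
{
    const char *dev = "/dev/sdX";        /* hypothetical device */
    off_t bad_lba = 123456;              /* hypothetical bad sector */
    void *buf;

    if (posix_memalign(&buf, SECTOR, SECTOR)) {
        perror("posix_memalign");
        return 1;
    }
    memset(buf, 0, SECTOR);

    int fd = open(dev, O_RDWR | O_DIRECT | O_SYNC);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    /* Writing the sector gives the drive the chance to either rewrite it
     * in place or remap it to a spare; either way later reads should work. */
    if (pwrite(fd, buf, SECTOR, bad_lba * SECTOR) != SECTOR) {
        perror("pwrite");
        return 1;
    }

    close(fd);
    return 0;
}
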
Combine your matrix with the single drive install vs the mirrored
configuration and we get a lot of variables.  What I'd love to have is a
rehab tool for drives that works it over and decides if it should stay
or go.

That would be a really nice thing to have and not really that difficult to sketch out. MD has some of that built in, but this is also something that we could do pretty easily up in user space.
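
A bare-bones, read-only sketch of such a user-space scan might look like the
following (the device path, chunk size and the stay-or-go threshold are all
made-up placeholders; a real tool would also consult SMART data and handle
the rewrite/retry side):

/*
 * Read-only sketch of a user-space "rehab" scan: walk a block device
 * sequentially and count media errors.  Everything tunable here is an
 * assumption for illustration.
 */
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define CHUNK   (1 << 20)       /* 1 MiB per read */
#define MAX_BAD 16              /* arbitrary "replace the drive" threshold */

int main(int argc, char **argv)
{
    const char *dev = argc > 1 ? argv[1] : "/dev/sdX";   /* hypothetical */
    void *buf;
    long long bad = 0;
    off_t off = 0;

    if (posix_memalign(&buf, 4096, CHUNK)) {
        perror("posix_memalign");
        return 1;
    }

    /* O_DIRECT so we hit the media instead of the page cache. */
    int fd = open(dev, O_RDONLY | O_DIRECT);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    for (;;) {
        ssize_t n = pread(fd, buf, CHUNK, off);
        if (n == 0)
            break;                        /* end of device */
        if (n < 0) {
            if (errno == EIO) {
                /* Suspect region: log it and skip ahead.  A rewrite pass
                 * (as in the earlier fragment) could go here for mirrored
                 * setups. */
                fprintf(stderr, "read error at offset %lld\n",
                        (long long)off);
                bad++;
                off += CHUNK;
                continue;
            }
            perror("pread");
            break;
        }
        off += n;
    }

    close(fd);
    printf("%lld bad chunks: drive should %s\n",
           bad, bad > MAX_BAD ? "go" : "stay");
    return 0;
}
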
It is somewhat difficult to run the rehab on a mounted single disk
install, but we can start with the multi-device config and work our way
out from there.

Scanning with read-verify or object level signature checking can be done on mounted file systems...
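
For the read-verify half, a trivial sketch is just to walk the tree and read
everything back, reporting any IO error (purely illustrative; object level
signature checking would additionally compare stored checksums, and data
already in the page cache will not touch the media, so it is best-effort):

/*
 * Sketch of an online read-verify scan of a mounted file system: read every
 * regular file so that latent media errors surface now rather than later.
 */
#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <ftw.h>
#include <stdio.h>
#include <unistd.h>

static long long files, errors;

static int verify(const char *path, const struct stat *sb,
                  int type, struct FTW *ftw)
{
    (void)sb; (void)ftw;
    if (type != FTW_F)
        return 0;                      /* only regular files */

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return 0;

    char buf[1 << 16];
    ssize_t n;
    while ((n = read(fd, buf, sizeof(buf))) > 0)
        ;                              /* discard data, we only want errors */
    if (n < 0) {
        fprintf(stderr, "read error: %s\n", path);
        errors++;
    }
    files++;
    close(fd);
    return 0;                          /* keep walking */
}

int main(int argc, char **argv)
{
    const char *root = argc > 1 ? argv[1] : "/";   /* mount point to scan */
    nftw(root, verify, 64, FTW_PHYS);
    printf("scanned %lld files, %lld read errors\n", files, errors);
    return 0;
}
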
For the barrier flush, IO errors reported back by the flush would
allow us to know when corrective action was required.

-chris



As I mentioned before, this would be great, but I am not sure that it would work that way (certainly not consistently across devices).

ric

