[zfs-code] correcting single-bit errors in fletcher4 checksums

Jonathan Adams Mon, 4 May 2009 10:03:59 -0700

On Sun, May 03, 2009 at 12:53:31PM -0700, paul wrote:
> Very nice; and ultimately will be very interesting to see what
> percentage of checksum errors within a particular deployment turn out
> to most likely be correctable single bit errors. (And thereby possibly
> even measurably help improve the integrity of non-redundant array
> configurations, short of the catastrophic failure of a sector or drive
> itself.)


The other thing I'm working on is getting better FMA ereports for checksum
errors.  One thing that's currently missing, in the case of a mirrored
or raid-z configuration, is information about how the bad content differs
from the correct content.

That way, we'll have a better idea of what's actually happening, and the
FMA responses may also get better.
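
As a rough illustration of the kind of difference information I mean, here
is a minimal Python sketch.  It assumes we have both the good copy (from
the other side of the mirror, or reconstructed from parity) and the bad
copy in hand.  The function name and the field names are made up for this
example; they are not the actual ereport payload.

```python
def diff_summary(good: bytes, bad: bytes) -> dict:
    """Summarize how `bad` differs from `good` (hypothetical helper).

    Bit offsets are counted from the start of the buffer, with bit 0
    being the least significant bit of byte 0.
    """
    if len(good) != len(bad):
        raise ValueError("buffers must be the same length")

    bits_set = []      # 0 in the good copy, 1 in the bad copy
    bits_cleared = []  # 1 in the good copy, 0 in the bad copy
    bad_bytes = 0      # number of bytes that differ at all

    for i, (g, b) in enumerate(zip(good, bad)):
        x = g ^ b
        if x:
            bad_bytes += 1
        for k in range(8):
            if x & (1 << k):
                if b & (1 << k):
                    bits_set.append(i * 8 + k)
                else:
                    bits_cleared.append(i * 8 + k)

    return {
        "bad_bytes": bad_bytes,
        "bits_set": bits_set,
        "bits_cleared": bits_cleared,
    }
```

A summary like this would let us tell a lone flipped bit apart from, say,
a whole range of zeroed bytes.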

> After reviewing the code (and presuming you intended "if (base->a
> less-than bad->a)"), I can't quite seem to convince myself the
> implementation is immune from misdiagnosing a double/triple bit
> error as a single bit error in general (although likely staring me
> in the face; as all correct, single, and double bit error checksums
> are warranted to be unique; as should also be all 4 and 5 bit
> error checksums for a corrected fletcher4 implementation to my
> understanding)?

Let me work on the math some and get back to you.
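
In the meantime, one way to poke at the question empirically is a
brute-force search.  The Python sketch below is not the code under review.
It assumes fletcher4 is four running sums over little-endian 32-bit words,
kept in 64-bit accumulators, and that the buffer length is a multiple of
4 bytes.  find_single_bit_fixes() is a hypothetical helper that tries
every single-bit flip and reports the ones that make the checksum match.

```python
import struct

MASK64 = (1 << 64) - 1


def fletcher4(buf: bytes) -> tuple:
    """Four running sums over 32-bit LE words, each truncated to 64 bits."""
    a = b = c = d = 0
    for (w,) in struct.iter_unpack("<I", buf):
        a = (a + w) & MASK64
        b = (b + a) & MASK64
        c = (c + b) & MASK64
        d = (d + c) & MASK64
    return (a, b, c, d)


def find_single_bit_fixes(bad: bytes, expected: tuple) -> list:
    """Return every bit offset whose flip makes `bad` match `expected`.

    Brute force, O(bits * words); only meant for small test buffers.
    Bit 0 is the least significant bit of byte 0.
    """
    data = bytearray(bad)
    hits = []
    for bit in range(len(data) * 8):
        data[bit // 8] ^= 1 << (bit % 8)      # flip the candidate bit
        if fletcher4(bytes(data)) == expected:
            hits.append(bit)
        data[bit // 8] ^= 1 << (bit % 8)      # restore it
    return hits
```

Injecting a two- or three-bit error into a known-good buffer and checking
whether this returns any hit would show a misdiagnosis directly, although
only for the buffers actually tried.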

Cheers,
- jonathan
