https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=288881

            Bug ID: 288881
           Summary: ZFS checksum error at `software' level
           Product: Base System
           Version: 14.3-RELEASE
          Hardware: amd64
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: [email protected]
          Reporter: [email protected]

On one of our systems we regularly get checksum errors in one specific place,
always at exactly the same offset:

        Aug  2 08:52:43 hostname ZFS[26329]: checksum mismatch, zpool=zroot
path=/dev/ada0p3 offset=1793168560128 size=32768
        Aug  8 05:52:38 hostname ZFS[62461]: checksum mismatch, zpool=zroot
path=/dev/ada0p3 offset=1793168560128 size=32768

With zpool status each time confirming:

        NAME        STATE     READ WRITE CKSUM
        zroot       ONLINE       0     0     0
          raidz3-0  ONLINE       0     0     0
            ada0p3  ONLINE       0     0     2
            ada1p3  ONLINE       0     0     0
            ....

Each time a zpool scrub fixes the issue without further errors, and the issue
does not get worse if left unattended for a few months.

We assumed that this was a hardware error and recently replaced the disk,
swapping an HGST HMS5C4040BLE640 for an ST4000NM0245.

From the kernel messages:

       # bzcat messages*bz2 | grep ada0 | grep Serial
       Aug  5 18:45:57 hostname kernel: ada0: Serial Number PL2331LAGUP9BJ
       Aug  9 10:26:33 hostname kernel: ada0: Serial Number ZC112BE5

We can see that the swap was successful, and that the new disk is indeed ada0.
Resilvering went without error. However, we now see the very same `error'
appearing again:

         Aug 14 11:50:59 hostname ZFS[83330]: checksum mismatch, zpool=zroot
path=/dev/ada0p3 offset=1793168592896 size=32768
         Aug 14 11:50:59 hostname ZFS[83810]: checksum mismatch, zpool=zroot
path=/dev/ada0p3 offset=1793168560128 size=32768
         Aug 15 03:56:04 hostname root[23278]: hostname - ZFS pool - HEALTH
fault
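
As a rough cross-check of the log lines quoted in this report, the offsets can
be tallied with a small Python sketch. This is only an illustration: the regex
and the helper name parse_mismatches are assumptions made for this example and
not part of any ZFS tooling, and the log lines are copied from above with the
mail line-wrapping undone. It shows that the original offset recurs after the
disk swap, and that the one new offset lies exactly one 32768-byte block after
it.

```python
import re
from collections import Counter

# Log lines copied from this report, with the mail line-wrapping undone.
LOG = """\
Aug  2 08:52:43 hostname ZFS[26329]: checksum mismatch, zpool=zroot path=/dev/ada0p3 offset=1793168560128 size=32768
Aug  8 05:52:38 hostname ZFS[62461]: checksum mismatch, zpool=zroot path=/dev/ada0p3 offset=1793168560128 size=32768
Aug 14 11:50:59 hostname ZFS[83330]: checksum mismatch, zpool=zroot path=/dev/ada0p3 offset=1793168592896 size=32768
Aug 14 11:50:59 hostname ZFS[83810]: checksum mismatch, zpool=zroot path=/dev/ada0p3 offset=1793168560128 size=32768
"""

# Hypothetical pattern matching only the message format seen in this report.
PATTERN = re.compile(
    r"checksum mismatch, zpool=(\S+) path=(\S+) offset=(\d+) size=(\d+)"
)


def parse_mismatches(text):
    """Return (pool, path, offset, size) for each checksum-mismatch line."""
    return [(m[1], m[2], int(m[3]), int(m[4])) for m in PATTERN.finditer(text)]


events = parse_mismatches(LOG)

# Count how often each offset was reported.
by_offset = Counter(offset for _, _, offset, _ in events)
offsets = sorted(by_offset)

# Distance in bytes between the two distinct offsets.
gap = offsets[1] - offsets[0]

for offset in offsets:
    print(offset, "reported", by_offset[offset], "time(s)")
print("gap between the two offsets:", gap, "bytes")
```

The same tally could of course be done with grep and awk on the bzcat output;
Python is used here only to make the arithmetic explicit.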

So I am now starting to doubt that this is a hardware issue, and am wondering
whether it is a software issue and what can be done to narrow this down.
