Chris,

See notes inline.

On Thu, 2016-05-12 at 19:41 -0600, Chris Murphy wrote:
> On Thu, May 12, 2016 at 11:49 AM, Richard A. Lochner <lochner@clone1.com> wrote:
> >
> > I suspected, and I still suspect that the error occurred upon a
> > metadata update that corrupted the checksum for the file, probably
> > due to silent memory corruption.  If the checksum was silently
> > corrupted, it would be simply written to both drives causing this
> > type of error.
>
> Metadata is checksummed independently of data. So if the data isn't
> updated, its checksum doesn't change, only metadata checksum is
> changed.
>
> > btrfs dmesg(s):
> >
> > [16510.334020] BTRFS warning (device sdb1): checksum error at logical 3037444042752 on dev /dev/sdb1, sector 4988789496, root 259, inode 1437377, offset 75754369024, length 4096, links 1 (path: Rick/sda4.img)
> > [16510.334043] BTRFS error (device sdb1): bdev /dev/sdb1 errs: wr 0, rd 0, flush 0, corrupt 5, gen 0
> > [16510.345662] BTRFS error (device sdb1): unable to fixup (regular) error at logical 3037444042752 on dev /dev/sdb1
> >
> > [17606.978439] BTRFS warning (device sdb1): checksum error at logical 3037444042752 on dev /dev/sdc1, sector 4988750584, root 259, inode 1437377, offset 75754369024, length 4096, links 1 (path: Rick/sda4.img)
> > [17606.978460] BTRFS error (device sdb1): bdev /dev/sdc1 errs: wr 0, rd 13, flush 0, corrupt 4, gen 0
> > [17606.989497] BTRFS error (device sdb1): unable to fixup (regular) error at logical 3037444042752 on dev /dev/sdc1
>
> This is confusing. Are these the same boot? The later time has a lower
> corrupt count. Can you just 'dd if=sda4.img of=/dev/null' and report
> all (new) messages in dmesg? It seems to me there should be pretty
> much all the same monotonic-time for the problem with both devices.

My apologies, they were from different boots.

After the dd, I get these:

[109479.550836] BTRFS warning (device sdb1): csum failed ino 1437377 off 75754369024 csum 1689728329 expected csum 2165338402
[109479.596626] BTRFS warning (device sdb1): csum failed ino 1437377 off 75754369024 csum 1689728329 expected csum 2165338402
[109479.601969] BTRFS warning (device sdb1): csum failed ino 1437377 off 75754369024 csum 1689728329 expected csum 2165338402
[109479.602189] BTRFS warning (device sdb1): csum failed ino 1437377 off 75754369024 csum 1689728329 expected csum 2165338402
[109479.602323] BTRFS warning (device sdb1): csum failed ino 1437377 off 75754369024 csum 1689728329 expected csum 2165338402

> Also what do you get for these for each device:
>
> smartctl scterc -l /dev/sdX
> cat /sys/block/sdX/device/timeout

# smartctl -l scterc /dev/sdb
smartctl 6.4 2015-06-04 r4109 [x86_64-linux-4.4.8-300.fc23.x86_64] (local build)
Copyright (C) 2002-15, Bruce Allen, Christian Franke, www.smartmontools.org

SCT Error Recovery Control:
           Read:     70 (7.0 seconds)
          Write:     70 (7.0 seconds)

# smartctl -l scterc /dev/sdc
smartctl 6.4 2015-06-04 r4109 [x86_64-linux-4.4.8-300.fc23.x86_64] (local build)
Copyright (C) 2002-15, Bruce Allen, Christian Franke, www.smartmontools.org

SCT Error Recovery Control:
           Read:     70 (7.0 seconds)
          Write:     70 (7.0 seconds)

# cat /sys/block/sdb/device/timeout
30
# cat /sys/block/sdc/device/timeout
30

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
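
A note on the two timeout figures above. As I understand it, the usual reason for comparing them is that the drive's SCT error recovery limit should be shorter than the kernel's command timeout, so that a bad sector comes back as a read error rather than as a timed-out command. With 7.0 seconds against 30 seconds, both drives look fine on that score. Below is a minimal Python sketch of that comparison. The function names (parse_scterc, erc_within_timeout) are my own and not part of smartmontools or the kernel, the sample text is copied from the output above, and treating a missing numeric value (for example, ERC reported as disabled) as a failed check is an assumption.

```python
import re

# Sample text copied from the output above. In practice it would come from
# `smartctl -l scterc /dev/sdX` and /sys/block/sdX/device/timeout.
SCTERC_OUTPUT = """\
SCT Error Recovery Control:
           Read:     70 (7.0 seconds)
          Write:     70 (7.0 seconds)
"""
KERNEL_TIMEOUT = "30\n"


def parse_scterc(text):
    """Return the read/write ERC limits in seconds (hypothetical helper).

    smartctl prints the limits in tenths of a second, e.g. "Read: 70".
    Lines without a numeric value are skipped.
    """
    limits = {}
    for kind, deciseconds in re.findall(r"(Read|Write):\s+(\d+)\s+\(", text):
        limits[kind.lower()] = int(deciseconds) / 10.0
    return limits


def erc_within_timeout(scterc_text, timeout_text):
    """True if every reported ERC limit is below the kernel timeout (seconds)."""
    limits = parse_scterc(scterc_text)
    timeout = int(timeout_text.strip())
    # Assumption: no numeric limits found means the check fails.
    return bool(limits) and max(limits.values()) < timeout


if __name__ == "__main__":
    print(parse_scterc(SCTERC_OUTPUT))
    print(erc_within_timeout(SCTERC_OUTPUT, KERNEL_TIMEOUT))
```

The same check would apply to sdc, since its output is identical to sdb's.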