On Mon, Oct 30, 2017 at 2:57 AM, Zak Kohler <y...@y2kbugger.com> wrote:
> $ sudo btrfs scrub start --offline --progress /dev/disk/by-id/WD-XX1
> Scrub result:
> Tree bytes scrubbed: 5234425856
> Tree extents scrubbed: 638968
> Data bytes scrubbed: 4353723670528
> Data extents scrubbed: 374300
> Data bytes without csum: 533200896
> Read error: 0
> Verify error: 0
> Csum error: 150
>
> $ sudo btrfs scrub start --offline --progress /dev/disk/by-id/WD-XX2
> Scrub result:
> Tree bytes scrubbed: 5234425856
> Tree extents scrubbed: 638967
> Data bytes scrubbed: 4353723314176
> Data extents scrubbed: 374300
> Data bytes without csum: 533200896
> Read error: 0
> Verify error: 0
> Csum error: 238
>
> $ sudo btrfs scrub start --offline --progress /dev/disk/by-id/WD-XX3
> Scrub result:
> Tree bytes scrubbed: 5234491392
> Tree extents scrubbed: 638975
> Data bytes scrubbed: 4353723572224
> Data extents scrubbed: 374300
> Data bytes without csum: 533200896
> Read error: 0
> Verify error: 0
> Csum error: 175 #first run
> Csum error: 112 #second run...
> Csum error: 285 #third run...
>
> But I ran the /dev/disk/by-id/WD-XX3 device three times and you can
> see the result...

I expect these commands are the same, and involve all three drives in
the offline scrub each time. So you have five different results, but
all five involve csum errors. So the errors have a certain transience
to them, hence the inconsistent results. But the online scrub
consistently reports zero errors.

That to me sounds like a bug in the offline scrub code. Maybe it's
confused, and reports data without csums (nodatacow) as csum errors?
That does not explain the inconsistency though.

And then you're getting a consistent failure, but at an inconsistent
location, with Btrfs send, ostensibly due to an IO error, which sounds
like it's hitting a bad csum check. It is entirely possible to get
transient errors like this somewhere in a storage stack that are
otherwise not reported by the error detection code in that layer.
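
If it helps to compare runs mechanically, here is a rough Python sketch
that pulls the "Csum error" counts out of saved scrub output and checks
whether they agree. It is only an illustration: the helper names
(csum_errors, is_consistent) are made up for this mail and are not part
of btrfs-progs, and the parsing assumes the "Csum error: N" line format
shown in the quoted output above.

```python
import re
from typing import Dict, List

# Assumes report lines of the form "Csum error: 150", as in the quoted output.
# Leading "> " quote markers and trailing "#first run" notes are tolerated.
CSUM_RE = re.compile(r"Csum error:\s*(\d+)")


def csum_errors(scrub_output: str) -> List[int]:
    """Return every 'Csum error' count found in a block of scrub output."""
    return [int(n) for n in CSUM_RE.findall(scrub_output)]


def is_consistent(counts: List[int]) -> bool:
    """True when all runs report the same count (an empty list counts as consistent)."""
    return len(set(counts)) <= 1


# Counts copied from the quoted scrub results, keyed by the device path suffix.
reported: Dict[str, List[int]] = {
    "WD-XX1": [150],
    "WD-XX2": [238],
    "WD-XX3": [175, 112, 285],
}

all_counts = [c for counts in reported.values() for c in counts]

if __name__ == "__main__":
    print("counts:", sorted(all_counts))
    print("all runs agree:", is_consistent(all_counts))
    print("WD-XX3 runs agree:", is_consistent(reported["WD-XX3"]))
```

If every command really scrubs the same three-device filesystem, all
five counts should match, so a check like this failing points at
something transient rather than at fixed bad blocks.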

The thing I really don't understand is how you're getting zero errors
with conventional online scrub, every time. On the tiny 23G
installation I'm traveling with, I get the same results with all three
scrub methods on an NVMe drive. Zero errors. The slightly larger
spinning rust drives are not with me, so I can't check them for a
while.

-- 
Chris Murphy