Chris Murphy posted on Thu, 11 Aug 2016 14:43:56 -0600 as excerpted:

> On Thu, Aug 11, 2016 at 1:07 PM, Duncan <1i5t5.dun...@cox.net> wrote:
>> The compression-related problem is this:  Btrfs is considerably less
>> tolerant of checksum-related errors on btrfs-compressed data,
> 
> Why? The data is the data. And why would it matter if it's application
> compressed data vs Btrfs compressed data? If there's an error, Btrfs is
> intolerant. I don't see how there's a checksum error that Btrfs
> tolerates.

Apparently the code path for compressed data is different enough that 
when there's a burst of checksum errors, even on raid1 -- where it 
should (and, with scrub, does) fetch the correct second copy -- it will 
crash the system.  This is my experience and that of others.  I had 
thought it was standard btrfs behavior; since I use compression on all 
my btrfs, I didn't know it was a compression-specific bug until someone 
told me.

When the btrfs compression option hasn't been used on that filesystem, 
or presumably when none of the errored blocks belong to btrfs-
compressed files, it grabs the second copy and uses it as it should, 
and there's no crash.  Others report the same, including people who 
tested with and without btrfs-compressed files and found it crashed 
only when the files were btrfs-compressed; when they weren't, it worked 
as expected, fetching the valid second copy.  A crude way to test this 
yourself is sketched just below.
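
For what it's worth, here's the sort of crude test I mean, as a Python 
sketch.  It's untested and the paths, sizes, and corruption offset are 
all made up for illustration; the offset in particular is a guess and 
may need several tries to land on the test file's data extents rather 
than metadata.  Throwaway loop devices only, run as root, everything on 
them gets destroyed:

#!/usr/bin/env python3
# Rough, UNTESTED reproduction sketch.  Hypothetical paths/offsets;
# destroys everything on the two loop devices.  Run as root.
import os
import subprocess

def run(*cmd):
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

IMG_A, IMG_B = "/tmp/btrfs-a.img", "/tmp/btrfs-b.img"  # made-up paths
MNT = "/mnt/btrfs-test"

for img in (IMG_A, IMG_B):
    with open(img, "wb") as f:
        f.truncate(1 << 30)                    # 1 GiB sparse backing file

loop_a = subprocess.check_output(
    ["losetup", "--find", "--show", IMG_A]).decode().strip()
loop_b = subprocess.check_output(
    ["losetup", "--find", "--show", IMG_B]).decode().strip()

run("mkfs.btrfs", "-f", "-d", "raid1", "-m", "raid1", loop_a, loop_b)
run("btrfs", "device", "scan")                 # let the kernel see both
os.makedirs(MNT, exist_ok=True)
run("mount", "-o", "compress=zlib", loop_a, MNT)

# Highly compressible data, so the extents really do get btrfs-compressed.
with open(os.path.join(MNT, "testfile"), "wb") as f:
    f.write(b"A" * (64 << 20))
run("umount", MNT)

# Crude corruption: scribble over a region of ONE device, well past the
# primary superblock at 64 KiB.  The offset is a guess; adjust as needed.
with open(loop_a, "r+b") as dev:
    dev.seek(256 << 20)
    dev.write(os.urandom(16 << 20))

run("mount", "-o", "compress=zlib", loop_a, MNT)
# Reading back should hit csum errors; on raid1 btrfs is supposed to
# quietly fall back to the good copy on the second device.  Per the
# reports above, with compressed extents this can crash instead.
with open(os.path.join(MNT, "testfile"), "rb") as f:
    while f.read(1 << 20):
        pass
print("read back OK -- fallback to the second copy worked")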

I'd assume this is why this particular bug has remained unsquashed for 
so long.  The devs are likely testing compression, and repair of bad-
checksum data from the second copy, but probably not bad-checksum 
repair on compressed data, so the problem doesn't show up in their 
tests.  Between that and the relatively few people running raid1 with 
the compression option who see enough bad shutdowns to notice, it has 
mostly flown under the radar.  For a long time I myself thought it was 
just the way btrfs behaved with bursts of checksum errors, until 
someone pointed out that it did /not/ behave that way on a btrfs 
without any compressed files when the checksum errors occurred.

> But also I don't know if the checksum is computed on compressed data
> or uncompressed data - does the scrub blindly read compressed data,
> checksum it, and compare to the previously recorded csum? Or does the
> scrub read compressed data, decompress it, checksum it, then compare?
> And does compression compress metadata? I don't think it does, going
> by some of the squashfs testing of the same set of binary files on
> ext4 vs btrfs uncompressed vs btrfs compressed. The difference is
> explained by inline data being compressed (which it is), so I don't
> think the fs itself gets compressed.

As I'm not a coder I can't actually tell you from reading the code, but 
AFAIK both the 128 KiB compression block size and the checksum apply to 
the uncompressed data -- compression takes place after checksumming.  
(A purely conceptual sketch of that ordering follows below.)

And I don't believe metadata, whether the metadata proper or inline 
file data, is compressed by btrfs' transparent compression.
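
To make that ordering concrete, here's a toy Python sketch of what I 
mean -- emphatically /not/ btrfs code (the real logic lives in 
fs/btrfs/ in the kernel, and btrfs uses crc32c rather than the plain 
crc32 below); it just illustrates checksum-on-uncompressed-data with 
compression happening afterward:

import zlib

BLOCK = 128 * 1024  # the 128 KiB compression block size mentioned above

def write_block(data: bytes):
    """Model of the write side: checksum first, then compress."""
    assert len(data) <= BLOCK
    csum = zlib.crc32(data)           # checksum the *uncompressed* bytes
    compressed = zlib.compress(data)  # then compress for the on-disk extent
    return csum, compressed

def read_block(csum: int, compressed: bytes) -> bytes:
    """Model of the read side: decompress, then verify."""
    data = zlib.decompress(compressed)
    if zlib.crc32(data) != csum:      # mismatch: raid1 would try the 2nd copy
        raise IOError("checksum mismatch")
    return data

# Round-trip demo.
c, blob = write_block(b"x" * BLOCK)
assert read_block(c, blob) == b"x" * BLOCK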

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman
