On Sat, May 12, 2018 at 8:09 PM, Chris Murphy <li...@colorremedies.com> wrote:
> On Sat, May 12, 2018 at 6:10 PM, james harvey <jamespharve...@gmail.com> wrote:
>> On Sat, May 12, 2018 at 3:51 AM, Martin Steigerwald <mar...@lichtvoll.de> wrote:
>>> Hey James.
>>>
>>> james harvey - 12.05.18, 07:08:
>>>> 100% reproducible, booting from disk, or even Arch installation ISO.
>>>> Kernel 4.16.7. btrfs-progs v4.16.
>>>>
>>>> Reading one of two journalctl files causes a kernel oops. Initially
>>>> ran into it from "journalctl --list-boots", but cat'ing the file does
>>>> it too. I believe this shows there's compressed data that is invalid,
>>>> but its btrfs checksum is invalid. I've cat'ed every file on the
>>>> disk, and luckily have the problems narrowed down to only these 2
>>>> files in /var/log/journal.
>>>>
>>>> This volume has always been mounted with lzo compression.
>>>>
>>>> scrub has never found anything, and have ran it since the oops.
>>>>
>>>> Found a user a few years ago who also ran into this, without
>>>> resolution, at:
>>>> https://www.spinics.net/lists/linux-btrfs/msg52218.html
>>>>
>>>> 1. Cat'ing a (non-essential) file shouldn't be able to bring down the
>>>> system.
>>>>
>>>> 2. If this is infact invalid compressed data, there should be a way to
>>>> check for that. Btrfs check and scrub pass.
>>>
>>> I think systemd-journald sets those files to nocow on BTRFS in order to
>>> reduce fragmentation: That means no checksums, no snapshots, no nothing.
>>> I just removed /var/log/journal and thus disabled journalling to disk.
>>> Its sufficient for me to have the recent state in /run/journal.
>>>
>>> Can you confirm nocow being set via lsattr on those files?
>>>
>>> Still they should be decompressible just fine.
>>>
>>>> Hardware is fine. Passes memtest86+ in SMP mode. Works fine on all
>>>> other files.
>>>>
>>>> [ 381.869940] BUG: unable to handle kernel paging request at 0000000000390e50
>>>> [ 381.870881] BTRFS: decompress failed
>>> […]
>>> --
>>> Martin
>>>
>>
>> You're right, everything in /var/log/journal has the NoCOW attribute.
>>
>> This is on a 3 device btrfs RAID1. If I mount ro,degraded with disks
>> 1&2 or 1&3, and read the file, I get a crash. With disks 2&3, it
>> reads fine
>
> Unmounted with all three available, you can use btrfs-map-logical to
> extract copy 1 and copy 2 to compare; but it might crash also if one
> copy is corrupt. But it's another way to test.
>
>>
>> Does this mean that although I've never had a corrupted disk bit
>> before on COW/checksummed data, one somehow happened on the small
>> fraction of my storage which is NoCOW? Seems unlikely, but I don't
>> know what other explanation there would be.
>
> Usually nocow also means no compression. But in the archives is a
> thread where I found that compression can be forced on nocow if the
> file is fragment and either the volume is mounted with compression or
> the file has inherited chattr +c (I don't remember which or possibly
> both).
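
For checking the nocow and chattr +c attributes on the journal files without eyeballing lsattr output, here is a minimal sketch. It reads the inode flags with the FS_IOC_GETFLAGS ioctl and tests the FS_NOCOW_FL and FS_COMPR_FL bits (what lsattr shows as 'C' and 'c'). The function names decode_flags and read_flags are made up for illustration, and the ioctl number below assumes 64-bit Linux. It only reports the per-inode attributes; it says nothing about whether individual extents were actually written compressed.

```python
import fcntl
import os
import struct
import sys

# Values from <linux/fs.h>. The ioctl number assumes a 64-bit Linux build.
FS_IOC_GETFLAGS = 0x80086601
FS_COMPR_FL = 0x00000004   # lsattr 'c': compress file
FS_NOCOW_FL = 0x00800000   # lsattr 'C': do not copy-on-write


def decode_flags(flags):
    """Return the two attributes discussed in this thread as booleans."""
    return {
        "nocow": bool(flags & FS_NOCOW_FL),
        "compress": bool(flags & FS_COMPR_FL),
    }


def read_flags(path):
    """Fetch the inode flags of `path` via FS_IOC_GETFLAGS (hypothetical helper)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        buf = bytearray(8)                      # room for a long
        fcntl.ioctl(fd, FS_IOC_GETFLAGS, buf)   # kernel fills the buffer in place
        return struct.unpack_from("i", buf)[0]  # the kernel stores an int
    finally:
        os.close(fd)


if __name__ == "__main__":
    for path in sys.argv[1:]:
        info = decode_flags(read_flags(path))
        print(f"{path}: nocow={info['nocow']} compress={info['compress']}")
```

Pass it the files under /var/log/journal as arguments. A file that reports both nocow and compress set would be worth a closer look given the above.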
"file is fragment" should be "file is (submitted for) defragmentation"

-- 
Chris Murphy