On Thu, Oct 25, 2018 at 9:56 AM, Dmitry Katsubo <dm...@mail.ru> wrote:
> Dear btrfs community,
>
> My apologies for the dumps from a rather old kernel (4.9.25); nevertheless,
> I would like your opinion on the kernel crashes reported below.
>
> As far as I understand the situation (correct me if I am wrong), some data
> block became corrupted, which resulted in the following kernel trace
> during boot:
>
> kernel BUG at /build/linux-fB36Cv/linux-4.9.25/fs/btrfs/extent_io.c:2318!
> invalid opcode: 0000 [#1] SMP
> Call Trace:
> [<f8c63739>] ? end_bio_extent_readpage+0x4e9/0x680 [btrfs]
> [<f8c951eb>] ? end_compressed_bio_read+0x3b/0x2d0 [btrfs]
> [<f8c771de>] ? btrfs_scrubparity_helper+0xce/0x2d0 [btrfs]
> [<de07ebb1>] ? process_one_work+0x141/0x380
> [<de07ee31>] ? worker_thread+0x41/0x460
> [<de0840e4>] ? kthread+0xb4/0xd0
> [<de07edf0>] ? process_one_work+0x380/0x380
> [<de084030>] ? kthread_park+0x50/0x50
> [<de5aae03>] ? ret_from_fork+0x1b/0x28
>
> The problematic file turned out to be the one used by systemd-journald,
> /var/log/journal/c496cea41ebc4700a0dfaabf64a21be4/system.journal
> systemd-journald was trying to read it (or append to it) during boot, and
> that was causing the system crash (see attached bootN_dmesg.txt).
>
> I rebooted in safe mode and tried to copy the data from this partition to
> another location using btrfs-restore; however, the kernel crashed as well,
> with a slightly different symptom (see attached copyN_dmesg.txt):
>
> Call Trace:
> [<f8b4c760>] ? lzo_decompress_biovec+0x1b0/0x2b0 [btrfs]
> [<d71a8828>] ? vmalloc+0x38/0x40
> [<f8b4d415>] ? end_compressed_bio_read+0x265/0x2d0 [btrfs]
> [<f8b2f1de>] ? btrfs_scrubparity_helper+0xce/0x2d0 [btrfs]
> [<d707ebb1>] ? process_one_work+0x141/0x380
> [<d707ee31>] ? worker_thread+0x41/0x460
> [<d70840e4>] ? kthread+0xb4/0xd0
> [<d75aae03>] ? ret_from_fork+0x1b/0x28
>
> Just to keep away from the problem, I've removed this file and also removed
> the "compress=lzo" mount option.
>
> Are there any updates / fixes in that area? Is the lzo option safe to use?

It should be safe even with that kernel. I'm not sure this is
compression related. There is a corruption bug related to inline
extents that had been fairly elusive, but I think it's fixed now. I
haven't run into it myself, though.

No matter what, if you're using an older kernel, the first step is to
boot a current Fedora or Arch live or install media, mount the Btrfs,
try to read the problem files, and see if the problem still happens. I
can't even begin to estimate the tens of thousands of lines changed
since kernel 4.9.

What profile are you using for this Btrfs? Is this a raid56? What do
you get for 'btrfs fi us <mountpoint>'?

-- 
Chris Murphy
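
For the "try to read the problem files" step, here is a minimal sketch of
one way to do it from the live environment, assuming the filesystem is
already mounted (read-only, e.g. with 'mount -o ro', is the cautious
choice) at a mount point of your choosing. The function name read_all and
the 1 MiB chunk size are illustrative only and not part of any btrfs
tooling. It simply reads every regular file to the end and collects the
I/O errors it hits. It cannot catch a kernel BUG like the ones quoted
above, so keep an eye on dmesg while it runs.

```python
import os

CHUNK = 1 << 20  # read in 1 MiB pieces (arbitrary choice)


def read_all(root):
    """Read every regular file under root.

    Returns a list of (path, error message) pairs for everything that
    could not be listed or read. An empty list means all reads succeeded.
    """
    failures = []

    def on_walk_error(exc):
        # Called by os.walk when a directory cannot be listed.
        failures.append((exc.filename, str(exc)))

    for dirpath, _dirnames, filenames in os.walk(root, onerror=on_walk_error):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and special files (FIFOs, sockets, devices).
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                with open(path, "rb") as f:
                    while f.read(CHUNK):
                        pass  # discard the data; only the read matters
            except OSError as exc:
                failures.append((path, str(exc)))
    return failures


# Hypothetical usage, with /mnt standing in for the real mount point:
#   for path, err in read_all("/mnt"):
#       print(f"{path}: {err}")
```

If the list comes back empty and dmesg stays quiet on the newer kernel,
that points toward a problem already fixed since 4.9 rather than damage
that is still on disk.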