On Thu, Mar 18, 2021 at 10:53 AM Stuart Shelton <srcshel...@gmail.com> wrote:
>
> Hi all,
>
> I recently migrated an existing ext4 fs using btrfs-convert (setting nodesize
> to 32k and enabling optional features `extref`, `skinny-metadata` and
> `no-holes` - the first two of which I believe are now the default in any
> case?), but I’m subsequently seeing very frequent BUGs being output by the
> kernel:
>
> [ 821.843637] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:281
> [ 821.843641] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 28214, name: podman
> [ 821.843644] CPU: 3 PID: 28214 Comm: podman Tainted: G W 5.11.6 #15
> [ 821.843646] Hardware name: Dell Inc. PowerEdge R330/084XW4, BIOS 2.11.0 12/08/2020
> [ 821.843647] Call Trace:
> [ 821.843650] dump_stack+0xa1/0xfb
> [ 821.843656] ___might_sleep+0x144/0x160
> [ 821.843659] mutex_lock+0x17/0x40
> [ 821.843662] kernfs_remove_by_name_ns+0x1f/0x80
> [ 821.843666] sysfs_remove_group+0x7d/0xe0
> [ 821.843668] sysfs_remove_groups+0x28/0x40
> [ 821.843670] kobject_del+0x2a/0x80
> [ 821.843672] btrfs_sysfs_del_one_qgroup+0x2b/0x40 [btrfs]
> [ 821.843685] __del_qgroup_rb+0x12/0x150 [btrfs]
> [ 821.843696] btrfs_remove_qgroup+0x288/0x2a0 [btrfs]
> [ 821.843707] btrfs_ioctl+0x3129/0x36a0 [btrfs]
> [ 821.843717] ? __mod_lruvec_page_state+0x5e/0xb0
> [ 821.843719] ? page_add_new_anon_rmap+0xbc/0x150
> [ 821.843723] ? kfree+0x1b4/0x300
> [ 821.843725] ? mntput_no_expire+0x55/0x330
> [ 821.843728] __x64_sys_ioctl+0x5a/0xa0
> [ 821.843731] do_syscall_64+0x33/0x70
> [ 821.843733] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [ 821.843736] RIP: 0033:0x4cd3fb
> [ 821.843739] Code: fa ff eb bd e8 86 8b fa ff e9 61 ff ff ff cc e8 fb 55 fa ff 48 8b 7c 24 10 48 8b 74 24 18 48 8b 54 24 20 48 8b 44 24 08 0f 05 <48> 3d 01 f0 ff ff 76 20 48 c7 44 24 28 ff ff ff ff 48 c7 44 24 30
> [ 821.843741] RSP: 002b:000000c000906b20 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
> [ 821.843744] RAX: ffffffffffffffda RBX: 000000c000050000 RCX: 00000000004cd3fb
> [ 821.843745] RDX: 000000c000906b98 RSI: 000000004010942a RDI: 000000000000000f
> [ 821.843747] RBP: 000000c000907cd0 R08: 000000c000622901 R09: 0000000000000000
> [ 821.843748] R10: 000000c000d992c0 R11: 0000000000000206 R12: 000000000000012d
> [ 821.843749] R13: 000000000000012c R14: 0000000000000200 R15: 0000000000000049
>
> The system starts 24 containers on boot via `podman`, and by the time this
> process is complete there were (on the last power-cycle) 10 such BUG reports
> logged.
>
> Is this a recognised issue?

Ah, it's taking a mutex while holding a spinlock.

I just sent a fix for this:

https://lore.kernel.org/linux-btrfs/206d121e2e2b609ffe31217e6d90bfabe1c4e121.1616066404.git.fdman...@suse.com/

Thanks for the report.

>
>
> Support information:
>
> uname:
> Linux dellr330 5.11.6 #15 SMP Wed Mar 17 15:18:52 GMT 2021 x86_64 Intel(R) Xeon(R) CPU E3-1240L v5 @ 2.10GHz GenuineIntel GNU/Linux
>
> version:
> btrfs-progs v5.10.1
>
> btrfs fi:
> Label: 'space' uuid: 94cc0dca-4a1f-4d18-bdf8-943982d1b6ff
> Total devices 1 FS bytes used 163.44GiB
> devid 1 size 1.56TiB used 231.24GiB path /dev/mapper/storage-space
>
> btrfs df:
> Data, single: total=221.16GiB, used=154.74GiB
> System, single: total=4.00MiB, used=384.00KiB
> Metadata, single: total=10.08GiB, used=8.70GiB
> GlobalReserve, single: total=512.00MiB, used=0.00B
>
> fstab entry:
> LABEL=space /space btrfs noatime,compress-force=zstd:2,user_subvol_rm_allowed,nofail 0 2
>
> Other dmesg entries:
> [ 61.973985] Btrfs loaded, crc32c=crc32c-intel, zoned=yes
> [ 63.310454] BTRFS: device label space devid 1 transid 24453 /dev/mapper/storage-space scanned by btrfs (6546)
> [ 64.471111] BTRFS info (device dm-1): force zstd compression, level 2
> [ 64.471126] BTRFS info (device dm-1): disk space caching is enabled
> [ 64.471130] BTRFS info (device dm-1): has skinny extents
> [ 81.247002] BTRFS info (device dm-1): checking UUID tree
> [ 104.987371] BTRFS error (device dm-1): qgroup scan failed with -4
> [ 106.615043] BTRFS error (device dm-1): qgroup scan failed with -4
> [ 107.258435] BTRFS error (device dm-1): qgroup scan failed with -4
> [ 107.962191] BTRFS error (device dm-1): qgroup scan failed with -4
> [ 118.289293] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:281
> [ 118.289296] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 9003, name: podman
> [ 118.289298] CPU: 4 PID: 9003 Comm: podman Not tainted 5.11.6 #15
> [ 118.289301] Hardware name: Dell Inc. PowerEdge R330/084XW4, BIOS 2.11.0 12/08/2020
> [ 118.289301] Call Trace:
> [ 118.289303] dump_stack+0xa1/0xfb
> [ 118.289308] ___might_sleep+0x144/0x160
> [ 118.289310] mutex_lock+0x17/0x40
> [ 118.289313] kernfs_remove_by_name_ns+0x1f/0x80
> [ 118.289317] sysfs_remove_group+0x7d/0xe0
> [ 118.289319] sysfs_remove_groups+0x28/0x40
> [ 118.289320] kobject_del+0x2a/0x80
> [ 118.289322] btrfs_sysfs_del_one_qgroup+0x2b/0x40 [btrfs]
> [ 118.289334] __del_qgroup_rb+0x12/0x150 [btrfs]
> [ 118.289343] btrfs_remove_qgroup+0x288/0x2a0 [btrfs]
> [ 118.289352] btrfs_ioctl+0x3129/0x36a0 [btrfs]
> [ 118.289361] ? __mod_lruvec_page_state+0x5e/0xb0
> [ 118.289363] ? page_add_new_anon_rmap+0xbc/0x150
> [ 118.289366] ? kfree+0x1b4/0x300
> [ 118.289368] ? mntput_no_expire+0x55/0x330
> [ 118.289371] __x64_sys_ioctl+0x5a/0xa0
> [ 118.289374] do_syscall_64+0x33/0x70
> [ 118.289375] entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [ 118.289378] RIP: 0033:0x4cd3fb
> [ 118.289380] Code: fa ff eb bd e8 86 8b fa ff e9 61 ff ff ff cc e8 fb 55 fa ff 48 8b 7c 24 10 48 8b 74 24 18 48 8b 54 24 20 48 8b 44 24 08 0f 05 <48> 3d 01 f0 ff ff 76 20 48 c7 44 24 28 ff ff ff ff 48 c7 44 24 30
> [ 118.289382] RSP: 002b:000000c0005e2b20 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
> [ 118.289384] RAX: ffffffffffffffda RBX: 000000c000050000 RCX: 00000000004cd3fb
> [ 118.289385] RDX: 000000c0005e2b98 RSI: 000000004010942a RDI: 0000000000000012
> [ 118.289386] RBP: 000000c0005e3cd0 R08: 000000c000582c01 R09: 0000000000000000
> [ 118.289387] R10: 000000c000708b70 R11: 0000000000000206 R12: 00000000000000b8
> [ 118.289388] R13: 00000000000000b7 R14: 0000000000000200 R15: 0000000000000049
> [ 498.003691] BTRFS info (device dm-1): qgroup scan completed (inconsistency flag cleared)
> [ 499.522376] BTRFS error (device dm-1): qgroup scan failed with -4
> [ 499.975886] BTRFS error (device dm-1): qgroup scan failed with -4
>
>

--
Filipe David Manana,

“Whether you think you can, or you think you can't — you're right.”