On Sun, 31 Jul 2022, Chris Hofstaedtler wrote:
> I can't see a difference that should matter from userspace. I have
> stared a bit at the kernel code... there have been quite some changes
> and fixes in this area. Which kernel version were you running when
> testing this? Could you retry on something >= 5.9? I.e. some version
> with patch 08fc1ab6d748ab1a690fd483f41e2938984ce353.

Dear Chris,

I believe that I was running 5.10 (bullseye). It looks like 5.18 (from backports) does not show the issue (i.e. it works).

I have now tried again. Some more details:

host: linux-image-5.10.0-16-amd64 5.10.127-2
      mdadm 4.2-1~bpo11+1
chroot: mdadm 4.1-11

This time I did get some dmesg BUG output as well (attached). The backtrace does not seem to be the same in the two occurrences. I also noticed that the BUG: report in dmesg does not appear directly when running 'mdadm --examine --scan --config=partitions'. Rather, it occurs when some activity happens on the host filesystem, e.g. a 'touch /root/a' command.

host: linux-image-5.18.0-0.bpo.1-amd64 5.18.2-1~bpo11+1
(did not re-install anything else, except upgraded zfs, also from backports, since the pure bullseye version would not compile with 5.18)

Does not exhibit the problem.

I have tried with both kernels several times, and it was repeatable that 5.10 got stuck while 5.18 does not show issues.

Reminder: to trigger the issue, /dev/ must not be mounted in the chroot. With /dev/ mounted, 5.10 also works.

Best regards,
Håkan
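
The reproduction steps described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the chroot path, the `CHROOT_DIR` and `DRY_RUN` variables, and the `run`/`repro` helper names are hypothetical and do not come from the report. By default the script only prints the commands, since on an affected kernel the real run may oops the kernel and leave the host filesystem read-only.

```shell
#!/bin/sh
# Hypothetical reproduction sketch. CHROOT_DIR, DRY_RUN, run and repro
# are illustrative names, not part of the original report.
# WARNING: with DRY_RUN=0 on an affected 5.10 kernel this may trigger a
# kernel BUG and a read-only remount. Use a disposable machine only.

CHROOT_DIR="${CHROOT_DIR:-/srv/chroot/bullseye}"  # assumed path
DRY_RUN="${DRY_RUN:-1}"                           # 1 = only print commands

run() {
    # Print the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

repro() {
    # Step 1: scan from inside the chroot, with /dev/ NOT mounted there.
    run chroot "$CHROOT_DIR" mdadm --examine --scan --config=partitions
    # Step 2: cause some activity on the host filesystem.
    run touch /root/a
    # Step 3: look for a BUG: report in the kernel log.
    run dmesg -T
}

repro
```

Running it with `DRY_RUN=0` as root would actually execute the three steps; leaving the default in place just lists them.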

[mån aug 1 15:53:08 2022] BUG: kernel NULL pointer dereference, address: 0000000000000010
[mån aug 1 15:53:08 2022] #PF: supervisor read access in kernel mode
[mån aug 1 15:53:08 2022] #PF: error_code(0x0000) - not-present page
[mån aug 1 15:53:08 2022] PGD 0 P4D 0
[mån aug 1 15:53:08 2022] Oops: 0000 [#1] SMP PTI
[mån aug 1 15:53:08 2022] CPU: 2 PID: 284256 Comm: cron Tainted: P OE 5.10.0-16-amd64 #1 Debian 5.10.127-2
[mån aug 1 15:53:08 2022] Hardware name: Dell Computer Corporation PowerEdge 2850/0T7971, BIOS A04 09/22/2005
[mån aug 1 15:53:08 2022] RIP: 0010:__ext4_journal_get_write_access+0x29/0x120 [ext4]
[mån aug 1 15:53:08 2022] Code: 00 0f 1f 44 00 00 41 57 41 56 41 89 f6 41 55 41 54 49 89 d4 55 48 89 cd 53 48 83 ec 10 48 89 3c 24 e8 ab d7 bb e1 48 8b 45 30 <4c> 8b 78 10 4d 85 ff 74 2f 49 8b 87 e0 00 00 00 49 8b 9f 88 03 00
[mån aug 1 15:53:08 2022] RSP: 0018:ffffae27c059fd60 EFLAGS: 00010246
[mån aug 1 15:53:08 2022] RAX: 0000000000000000 RBX: ffff9d1b94505480 RCX: ffff9d1bc52e5e38
[mån aug 1 15:53:08 2022] RDX: ffff9d1bc13782d8 RSI: 0000000000000c14 RDI: ffffffffc096feb0
[mån aug 1 15:53:08 2022] RBP: ffff9d1bc52e5e38 R08: ffff9d1be04d5230 R09: 0000000000000001
[mån aug 1 15:53:08 2022] R10: ffff9d1bc985f000 R11: 000000000000001d R12: ffff9d1bc13782d8
[mån aug 1 15:53:08 2022] R13: ffff9d1be04d5000 R14: 0000000000000c14 R15: ffff9d1bc13782d8
[mån aug 1 15:53:08 2022] FS: 00007fed5ecb1840(0000) GS:ffff9d1cd7c80000(0000) knlGS:0000000000000000
[mån aug 1 15:53:08 2022] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[mån aug 1 15:53:08 2022] CR2: 0000000000000010 CR3: 00000001a46d8000 CR4: 00000000000006e0
[mån aug 1 15:53:08 2022] Call Trace:
[mån aug 1 15:53:08 2022] ext4_orphan_del+0x23f/0x290 [ext4]
[mån aug 1 15:53:08 2022] ext4_evict_inode+0x31f/0x630 [ext4]
[mån aug 1 15:53:08 2022] evict+0xd1/0x1a0
[mån aug 1 15:53:08 2022] __dentry_kill+0xe4/0x180
[mån aug 1 15:53:08 2022] dput+0x149/0x2f0
[mån aug 1 15:53:08 2022] __fput+0xe4/0x240
[mån aug 1 15:53:08 2022] task_work_run+0x65/0xa0
[mån aug 1 15:53:08 2022] exit_to_user_mode_prepare+0x111/0x120
[mån aug 1 15:53:08 2022] syscall_exit_to_user_mode+0x28/0x140
[mån aug 1 15:53:08 2022] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[mån aug 1 15:53:08 2022] RIP: 0033:0x7fed5eea2d77
[mån aug 1 15:53:08 2022] Code: 44 00 00 48 8b 15 19 a1 0c 00 f7 d8 64 89 02 b8 ff ff ff ff eb bc 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 01 c3 48 8b 15 e9 a0 0c 00 f7 d8 64 89 02 b8
[mån aug 1 15:53:08 2022] RSP: 002b:00007ffd50452818 EFLAGS: 00000202 ORIG_RAX: 0000000000000003
[mån aug 1 15:53:08 2022] RAX: 0000000000000000 RBX: 000055dab4578910 RCX: 00007fed5eea2d77
[mån aug 1 15:53:08 2022] RDX: 00007fed5ef6e8a0 RSI: 0000000000000000 RDI: 0000000000000006
[mån aug 1 15:53:08 2022] RBP: 0000000000000000 R08: 0000000000000000 R09: 00007fed5ef6dbe0
[mån aug 1 15:53:08 2022] R10: 000000000000006f R11: 0000000000000202 R12: 00007fed5ef6f4a0
[mån aug 1 15:53:08 2022] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000001
[mån aug 1 15:53:08 2022] Modules linked in: msr autofs4 nfsd auth_rpcgss nfsv3 nfs_acl nfs lockd grace sunrpc nfs_ssc fscache xt_mac xt_length xt_recent xt_multiport xt_tcpudp xt_state xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter ip_tables x_tables loop dcdbas radeon zfs(POE) zunicode(POE) zzstd(OE) ttm zlua(OE) zavl(POE) icp(POE) drm_kms_helper iTCO_wdt intel_pmc_bxt cec iTCO_vendor_support zcommon(POE) watchdog znvpair(POE) intel_powerclamp ipmi_si drm pcspkr spl(OE) ipmi_devintf serio_raw ipmi_msghandler rng_core i2c_algo_bit sg evdev e752x_edac button overlay ext4 crc16 mbcache jbd2 btrfs blake2b_generic raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_generic raid0 multipath linear raid1 sd_mod sr_mod cdrom ata_generic md_mod mptspi mptscsih ata_piix libata mptbase scsi_transport_spi nvme ehci_pci uhci_hcd nvme_core ehci_hcd t10_pi scsi_mod lpc_ich crc_t10dif crct10dif_generic psmouse usbcore e1000 crct10dif_common
[mån aug 1 15:53:08 2022] usb_common video
[mån aug 1 15:53:08 2022] CR2: 0000000000000010
[mån aug 1 15:53:08 2022] ---[ end trace 4fd9ed73d190bc2a ]---
[mån aug 1 15:53:08 2022] RIP: 0010:__ext4_journal_get_write_access+0x29/0x120 [ext4]
[mån aug 1 15:53:08 2022] Code: 00 0f 1f 44 00 00 41 57 41 56 41 89 f6 41 55 41 54 49 89 d4 55 48 89 cd 53 48 83 ec 10 48 89 3c 24 e8 ab d7 bb e1 48 8b 45 30 <4c> 8b 78 10 4d 85 ff 74 2f 49 8b 87 e0 00 00 00 49 8b 9f 88 03 00
[mån aug 1 15:53:08 2022] RSP: 0018:ffffae27c059fd60 EFLAGS: 00010246
[mån aug 1 15:53:08 2022] RAX: 0000000000000000 RBX: ffff9d1b94505480 RCX: ffff9d1bc52e5e38
[mån aug 1 15:53:08 2022] RDX: ffff9d1bc13782d8 RSI: 0000000000000c14 RDI: ffffffffc096feb0
[mån aug 1 15:53:08 2022] RBP: ffff9d1bc52e5e38 R08: ffff9d1be04d5230 R09: 0000000000000001
[mån aug 1 15:53:08 2022] R10: ffff9d1bc985f000 R11: 000000000000001d R12: ffff9d1bc13782d8
[mån aug 1 15:53:08 2022] R13: ffff9d1be04d5000 R14: 0000000000000c14 R15: ffff9d1bc13782d8
[mån aug 1 15:53:08 2022] FS: 00007fed5ecb1840(0000) GS:ffff9d1cd7c80000(0000) knlGS:0000000000000000
[mån aug 1 15:53:08 2022] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[mån aug 1 15:53:08 2022] CR2: 0000000000000010 CR3: 00000001a46d8000 CR4: 00000000000006e0

[mån aug 1 18:57:57 2022] BUG: kernel NULL pointer dereference, address: 0000000000000010
[mån aug 1 18:57:57 2022] #PF: supervisor read access in kernel mode
[mån aug 1 18:57:57 2022] #PF: error_code(0x0000) - not-present page
[mån aug 1 18:57:57 2022] PGD 0 P4D 0
[mån aug 1 18:57:57 2022] Oops: 0000 [#1] SMP PTI
[mån aug 1 18:57:57 2022] CPU: 2 PID: 4427 Comm: touch Tainted: P OE 5.10.0-16-amd64 #1 Debian 5.10.127-2
[mån aug 1 18:57:57 2022] Hardware name: Dell Computer Corporation PowerEdge 2850/0T7971, BIOS A04 09/22/2005
[mån aug 1 18:57:57 2022] RIP: 0010:__ext4_journal_get_write_access+0x29/0x120 [ext4]
[mån aug 1 18:57:57 2022] Code: 00 0f 1f 44 00 00 41 57 41 56 41 89 f6 41 55 41 54 49 89 d4 55 48 89 cd 53 48 83 ec 10 48 89 3c 24 e8 ab 57 e9 e5 48 8b 45 30 <4c> 8b 78 10 4d 85 ff 74 2f 49 8b 87 e0 00 00 00 49 8b 9f 88 03 00
[mån aug 1 18:57:57 2022] RSP: 0018:ffffc2b08062fb78 EFLAGS: 00010246
[mån aug 1 18:57:57 2022] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff9daed0440068
[mån aug 1 18:57:57 2022] RDX: ffff9daec0fb53b8 RSI: 0000000000000469 RDI: ffffffffc0896c80
[mån aug 1 18:57:57 2022] RBP: ffff9daed0440068 R08: ffff9daed07f7138 R09: 0000000000000000
[mån aug 1 18:57:57 2022] R10: ffff9daec4c2ef08 R11: 0000000000000000 R12: ffff9daec0fb53b8
[mån aug 1 18:57:57 2022] R13: ffff9daee013d800 R14: 0000000000000469 R15: ffff9daee013d800
[mån aug 1 18:57:57 2022] FS: 00007febc0a915c0(0000) GS:ffff9dafd7c80000(0000) knlGS:0000000000000000
[mån aug 1 18:57:57 2022] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[mån aug 1 18:57:57 2022] CR2: 0000000000000010 CR3: 0000000106616000 CR4: 00000000000006e0
[mån aug 1 18:57:57 2022] Call Trace:
[mån aug 1 18:57:57 2022] ? __ext4_handle_dirty_metadata+0x51/0x1a0 [ext4]
[mån aug 1 18:57:57 2022] __ext4_new_inode+0x925/0x1690 [ext4]
[mån aug 1 18:57:57 2022] ext4_create+0x106/0x1b0 [ext4]
[mån aug 1 18:57:57 2022] path_openat+0xde1/0x1080
[mån aug 1 18:57:57 2022] do_filp_open+0x88/0x130
[mån aug 1 18:57:57 2022] ? getname_flags.part.0+0x29/0x1a0
[mån aug 1 18:57:57 2022] ? __check_object_size+0x136/0x150
[mån aug 1 18:57:57 2022] do_sys_openat2+0x97/0x150
[mån aug 1 18:57:57 2022] __x64_sys_openat+0x54/0x90
[mån aug 1 18:57:57 2022] do_syscall_64+0x33/0x80
[mån aug 1 18:57:57 2022] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[mån aug 1 18:57:57 2022] RIP: 0033:0x7febc09b9be7
[mån aug 1 18:57:57 2022] Code: 25 00 00 41 00 3d 00 00 41 00 74 47 64 8b 04 25 18 00 00 00 85 c0 75 6b 44 89 e2 48 89 ee bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 3d 00 f0 ff ff 0f 87 95 00 00 00 48 8b 4c 24 28 64 48 2b 0c 25
[mån aug 1 18:57:57 2022] RSP: 002b:00007ffedb21a7f0 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
[mån aug 1 18:57:57 2022] RAX: ffffffffffffffda RBX: 00007ffedb21aaa8 RCX: 00007febc09b9be7
[mån aug 1 18:57:57 2022] RDX: 0000000000000941 RSI: 00007ffedb21ae94 RDI: 00000000ffffff9c
[mån aug 1 18:57:57 2022] RBP: 00007ffedb21ae94 R08: 0000000000000000 R09: 0000000000000000
[mån aug 1 18:57:57 2022] R10: 00000000000001b6 R11: 0000000000000246 R12: 0000000000000941
[mån aug 1 18:57:57 2022] R13: 00007ffedb21ae94 R14: 0000000000000000 R15: 0000000000000000
[mån aug 1 18:57:57 2022] Modules linked in: msr autofs4 nfsd auth_rpcgss nfsv3 nfs_acl nfs lockd grace sunrpc nfs_ssc fscache xt_mac xt_length xt_recent xt_multiport xt_tcpudp xt_state xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter ip_tables x_tables loop radeon zfs(POE) ttm zunicode(POE) zzstd(OE) zlua(OE) zavl(POE) drm_kms_helper iTCO_wdt cec icp(POE) intel_pmc_bxt dcdbas iTCO_vendor_support ipmi_si watchdog zcommon(POE) znvpair(POE) intel_powerclamp drm spl(OE) ipmi_devintf pcspkr ipmi_msghandler i2c_algo_bit sg serio_raw rng_core e752x_edac evdev button overlay ext4 crc16 mbcache jbd2 btrfs blake2b_generic raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_generic raid0 multipath linear raid1 sd_mod sr_mod cdrom ata_generic md_mod ata_piix libata nvme mptspi mptscsih nvme_core uhci_hcd ehci_pci e1000 ehci_hcd t10_pi crc_t10dif psmouse mptbase usbcore crct10dif_generic scsi_transport_spi scsi_mod lpc_ich crct10dif_common
[mån aug 1 18:57:57 2022] usb_common video
[mån aug 1 18:57:57 2022] CR2: 0000000000000010
[mån aug 1 18:57:57 2022] ---[ end trace 284590a68ce9a232 ]---
[mån aug 1 18:57:57 2022] RIP: 0010:__ext4_journal_get_write_access+0x29/0x120 [ext4]
[mån aug 1 18:57:57 2022] Code: 00 0f 1f 44 00 00 41 57 41 56 41 89 f6 41 55 41 54 49 89 d4 55 48 89 cd 53 48 83 ec 10 48 89 3c 24 e8 ab 57 e9 e5 48 8b 45 30 <4c> 8b 78 10 4d 85 ff 74 2f 49 8b 87 e0 00 00 00 49 8b 9f 88 03 00
[mån aug 1 18:57:57 2022] RSP: 0018:ffffc2b08062fb78 EFLAGS: 00010246
[mån aug 1 18:57:57 2022] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff9daed0440068
[mån aug 1 18:57:57 2022] RDX: ffff9daec0fb53b8 RSI: 0000000000000469 RDI: ffffffffc0896c80
[mån aug 1 18:57:57 2022] RBP: ffff9daed0440068 R08: ffff9daed07f7138 R09: 0000000000000000
[mån aug 1 18:57:57 2022] R10: ffff9daec4c2ef08 R11: 0000000000000000 R12: ffff9daec0fb53b8
[mån aug 1 18:57:57 2022] R13: ffff9daee013d800 R14: 0000000000000469 R15: ffff9daee013d800
[mån aug 1 18:57:57 2022] FS: 00007febc0a915c0(0000) GS:ffff9dafd7c80000(0000) knlGS:0000000000000000
[mån aug 1 18:57:57 2022] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[mån aug 1 18:57:57 2022] CR2: 0000000000000010 CR3: 0000000106616000 CR4: 00000000000006e0

[mån aug 1 19:24:19 2022] EXT4-fs error (device md127): ext4_validate_inode_bitmap:105: comm touch: Corrupt inode bitmap - block_group = 0, inode_bitmap = 494
[mån aug 1 19:24:19 2022] Aborting journal on device md127-8.
[mån aug 1 19:24:19 2022] EXT4-fs (md127): Remounting filesystem read-only