On 10 Nov 2025, at 11:06, Lorenzo Stoakes wrote:

> On Mon, Nov 10, 2025 at 01:22:16PM +0000, Lorenzo Stoakes wrote:
>> On Mon, Nov 10, 2025 at 06:37:58PM +0530, Garg, Shivank wrote:
>>>
>>>
>>> On 11/10/2025 5:31 PM, Lorenzo Stoakes wrote:
>>>> On Mon, Nov 10, 2025 at 11:32:53AM +0000, Shivank Garg wrote:
>>>>> When MADV_COLLAPSE is called on file-backed mappings (e.g., executable
>>>
>>>>> ---
>>>>> Applies cleanly on:
>>>>> 6.18-rc5
>>>>> mm-stable:e9a6fb0bc
>>>>
>>>> Please base on mm-unstable. mm-stable is usually out of date until very
>>>> close to
>>>> merge window.
>>>
>>> I'm observing issues when testing with kselftest on mm-unstable and mm-new
>>> branches that prevent
>>> proper testing for my patches:
>>>
>>> On mm-unstable (without my patches):
>>>
>>> # # running ./transhuge-stress -d 20
>>> # # --------------------------------
>>> # # TAP version 13
>>> # # 1..1
>>> # # transhuge-stress: allocate 220271 transhuge pages, using 440543 MiB
>>> virtual memory and 1720 MiB of ram
>>>
>>>
>>> [ 367.225667] RIP: 0010:swap_cache_get_folio+0x2d/0xc0
>>> [ 367.230635] Code: 00 00 48 89 f9 49 89 f9 48 89 fe 48 c1 e1 06 49 c1 e9
>>> 3a 48 c1 e9 0f 48 c1 e1 05 4a 8b 04 cd c0 2e 5b 99 48 8b 78 60 48 01 cf
>>> <48> 8b 47 08 48 85 c0 74 20 48 89 f2 81 e2 ff 01 00 00 48 8d 04 d0
>>> [ 367.249378] RSP: 0000:ffffcde32943fba8 EFLAGS: 00010282
>>> [ 367.254605] RAX: ffff8bd1668fdc00 RBX: 00007ffc15df5000 RCX:
>>> 00003fffffffffe0
>>> [ 367.261736] RDX: ffffffff995cb530 RSI: 0003ffffffffffff RDI:
>>> ffffcbd1560dffe0
>>> [ 367.268862] RBP: 0003ffffffffffff R08: ffffcde32943fc47 R09:
>>> 0000000000000000
>>> [ 367.275994] R10: 0000000000000000 R11: 0000000000000000 R12:
>>> 0000000000000000
>>> [ 367.283129] R13: 0000000000000000 R14: ffff8bd1668fdc00 R15:
>>> 0000000000100cca
>>> [ 367.290260] FS: 00007ff600af5b80(0000) GS:ffff8c4e9ec7e000(0000)
>>> knlGS:0000000000000000
>>> [ 367.298344] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [ 367.304083] CR2: ffffcbd1560dffe8 CR3: 00000001280e9001 CR4:
>>> 0000000000770ef0
>>> [ 367.311216] PKRU: 55555554
>>> [ 367.313929] Call Trace:
>>> [ 367.316375] <TASK>
>>> [ 367.318479] __read_swap_cache_async+0x8e/0x1b0
>>> [ 367.323014] swap_vma_readahead+0x23d/0x430
>>> [ 367.327198] swapin_readahead+0xb0/0xc0
>>> [ 367.331039] do_swap_page+0x5bc/0x1260
>>> [ 367.334789] ? rseq_ip_fixup+0x6f/0x190
>>> [ 367.338631] ? __pfx_default_wake_function+0x10/0x10
>>> [ 367.343596] __handle_mm_fault+0x49a/0x760
>>> [ 367.347696] handle_mm_fault+0x188/0x300
>>> [ 367.351620] do_user_addr_fault+0x15b/0x6c0
>>> [ 367.355807] exc_page_fault+0x60/0x100
>>> [ 367.359562] asm_exc_page_fault+0x22/0x30
>>> [ 367.363574] RIP: 0033:0x7ff60091ba99
>>> [ 367.367153] Code: f7 d8 64 89 02 b8 ff ff ff ff eb bd e8 40 c4 01 00 f3
>>> 0f 1e fa 80 3d b5 f5 0e 00 00 74 13 31 c0 0f 05 48 3d 00 f0 ff ff 77 4f
>>> <c3> 66 0f 1f 44 00 00 55 48 89 e5 48 83 ec 20 48 89 55 e8 48 89 75
>>> [ 367.385897] RSP: 002b:00007ffc15df1118 EFLAGS: 00010203
>>> [ 367.391124] RAX: 0000000000000001 RBX: 000055941fb672a0 RCX:
>>> 00007ff60091ba91
>>> [ 367.398256] RDX: 0000000000000001 RSI: 000055941fb813e0 RDI:
>>> 0000000000000000
>>> [ 367.405387] RBP: 00007ffc15df21e0 R08: 0000000000000000 R09:
>>> 0000000000000007
>>> [ 367.412513] R10: 000055941fb97cb0 R11: 0000000000000246 R12:
>>> 000055941fb813e0
>>> [ 367.419646] R13: 0000000000000000 R14: 0000000000000000 R15:
>>> 0000000000000000
>>> [ 367.426781] </TASK>
>>> [ 367.428970] Modules linked in: xfrm_user xfrm_algo xt_addrtype
>>> xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat
>>> nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables
>>> overlay bridge stp llc cfg80211 rfkill binfmt_misc ipmi_ssif amd_atl
>>> intel_rapl_msr intel_rapl_common wmi_bmof amd64_edac edac_mce_amd mgag200
>>> rapl drm_client_lib i2c_algo_bit drm_shmem_helper drm_kms_helper
>>> acpi_cpufreq i2c_piix4 ptdma k10temp i2c_smbus wmi acpi_power_meter ipmi_si
>>> acpi_ipmi ipmi_devintf ipmi_msghandler sg dm_multipath drm fuse dm_mod
>>> nfnetlink ext4 crc16 mbcache jbd2 raid10 raid456 async_raid6_recov
>>> async_memcpy async_pq async_xor xor async_tx raid6_pq raid1 raid0 kvm_amd
>>> sd_mod ahci nvme libahci kvm libata nvme_core tg3 ccp megaraid_sas irqbypass
>>> [ 367.497528] CR2: ffffcbd1560dffe8
>>> [ 367.500846] ---[ end trace 0000000000000000 ]---
>>
>> Yikes, oopsies!
>>
>> I'll try running tests locally on threadripper, but ran tests against yours
>> previously and seemed fine, strange. Maybe fixed since but let me try, maybe
>> because swap is not enabled locally for me?
>>
>> Likely this actually...
>
> I have tried on swap-enabled setup and no issue with mm-unstable.
>
> So this is odd, I know you have limited time (_totally sympathise_) but is it
> at
> all possible if you get a moment to bisect against tip mm-unstable/mm-new?
>
> Obviously we want to make sure buggy swap code doesn't get merged to mainline!

I could not reproduce locally either. Shivank, can you share your config file
and machine config?

>
>>
>>>
>>>
>>>
>>> -----------------
>>> On mm-new (without my patches):
>>>
>>> [ 394.144770] get_swap_device: Bad swap offset entry 3ffffffffffff
>>>
>>> dmesg | grep "get_swap_device: Bad swap offset entry" | wc -l
>>> 359
>>>
>>>
>>> Additionally, kexec triggers an oops and crash during swapoff:
>>>
>>>
>>> Deactivating swap 704.854238] BUG: unable to handle page fault
>>> for address: ffffcbe2de8dffe8
>>> [ 704.861524] #PF: supervisor read access in kernel mode
>>> ;39mswap.img.swa[ 704.866666] #PF: error_code(0x0000) - not-present page
>>> [ 704.873253] PGD 0 P4D 0
>>> p - /swap.im[ 704.875790] Oops: Oops: 0000 [#1] SMP NOPTI
>>> g...
>>> [ 704.881354] CPU: 122 UID: 0 PID: 107680 Comm: swapoff Kdump: loaded Not
>>> tainted 6.18.0-rc5+ #11 NONE
>>> [ 704.891283] Hardware name: Dell Inc. PowerEdge R6525/024PW1, BIOS 2.16.2
>>> 07/09/2024
>>> [ 704.898930] RIP: 0010:swap_cache_get_folio+0x2d/0xc0
>>> [ 704.903907] Code: 00 00 48 89 f9 49 89 f9 48 89 fe 48 c1 e1 06 49 c1 e9
>>> 3a 48 c1 e9 0f 48 c1 e1 05 4a 8b 04 cd c0 2e 7b 95 48 8b 78 60 48 01 cf
>>> <48> 8b 47 08 48 85 c0 74 20 48 89 f2 81 e2 ff 01 00 00 48 8d 04 d0
>>> [ 704.922720] RSP: 0018:ffffcf1227b1fc08 EFLAGS: 00010282
>>> [ 704.928035] RAX: ffff8be2cefb3c00 RBX: 0000555c65a5c000 RCX:
>>> 00003fffffffffe0
>>> [ 704.928036] RDX: 0003ffffffffffff RSI: 0003ffffffffffff RDI:
>>> ffffcbe2de8dffe0
>>> [ 704.928037] RBP: 0000000000000000 R08: ffff8be2de8e0520 R09:
>>> 0000000000000000
>>> Unmount[ 704.928038] R10: 000000000000ffff R11: ffffcf12236f4000
>>> R12: ffff8be2d5b8d968
>>> [ 704.928039] R13: 0003ffffffffffff R14: fffff3eec85eb000 R15:
>>> 0000555c65a51000
>>> [ 704.928039] FS: 00007f41fcab3800(0000) GS:ffff8c602b6fe000(0000)
>>> knlGS:0000000000000000
>>> [ 704.928040] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [ 704.928041] CR2: ffffcbe2de8dffe8 CR3: 00000074981af004 CR4:
>>> 0000000000770ef0
>>> [ 704.928041] PKRU: 55555554
>>> [ 704.928042] Call Trace:
>>> [ 704.928043] <TASK>
>>> [ 704.928044] unuse_pte_range+0x10b/0x290
>>> [ 704.928047] unuse_pud_range.isra.0+0x149/0x190
>>> [ 704.928048] unuse_vma+0x1a6/0x220
>>> [ 704.928050] unuse_mm+0x9b/0x110
>>> [ 704.928052] try_to_unuse+0xc5/0x260
>>> [ 704.928053] __do_sys_swapoff+0x244/0x670
>>> ing boo[ 705.016662] do_syscall_64+0x67/0xc50
>>> [ 705.016667] ? do_user_addr_fault+0x15b/0x6c0
>>> t.mount - /b[ 705.026100] ? exc_page_fault+0x60/0x100
>>> [ 705.031498] ? irqentry_exit_to_user_mode+0x20/0xe0
>>> oot...
>>> [ 705.036377] entry_SYSCALL_64_after_hwframe+0x76/0x7e
>>> [ 705.042200] RIP: 0033:0x7f41fc9271bb
>>> [ 705.045780] Code: 0f 1e fa 48 83 fe 01 48 8b 15 59 bc 0d 00 19 c0 83 e0
>>> f0 83 c0 26 64 89 02 b8 ff ff ff ff c3 f3 0f 1e fa b8 a8 00 00 00 0f 05
>>> <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 2d bc 0d 00 f7 d8 64 89 01 48
>>> [ 705.064807] RSP: 002b:00007ffd14b5b6e8 EFLAGS: 00000202 ORIG_RAX:
>>> 00000000000000a8
>>> [ 705.064809] RAX: ffffffffffffffda RBX: 00007ffd14b5cf30 RCX:
>>> 00007f41fc9271bb
>>> [ 705.064810] RDX: 0000000000000001 RSI: 0000000000000c00 RDI:
>>> 000055d48f533a40
>>> [ 705.064810] RBP: 00007ffd14b5b750 R08: 00007f41fca03b20 R09:
>>> 0000000000000000
>>> [ 705.064811] R10: 0000000000000001 R11: 0000000000000202 R12:
>>> 0000000000000000
>>> [ 705.064811] R13: 0000000000000000 R14: 000055d4584f1479 R15:
>>> 000055d4584f2b20
>>> [ 705.064813] </TASK>
>>> [ 705.064814] Modules linked in: xfrm_user xfrm_algo xt_addrtype
>>> xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat
>>> nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables
>>> overlay bridge stp llc cfg80211 rfkill binfmt_misc ipmi_ssif amd_atl
>>> intel_rapl_msr intel_rapl_common wmi_bmof amd64_edac edac_mce_amd rapl
>>> mgag200 drm_client_lib i2c_algo_bit drm_shmem_helper drm_kms_helper
>>> acpi_cpufreq i2c_piix4 ptdma ipmi_si k10temp i2c_smbus acpi_power_meter wmi
>>> acpi_ipmi ipmi_msghandler sg dm_multipath fuse drm dm_mod nfnetlink ext4
>>> crc16 mbcache jbd2 raid10 raid456 async_raid6_recov async_memcpy async_pq
>>> async_xor xor async_tx raid6_pq raid1 raid0 sd_mod kvm_amd ahci libahci kvm
>>> nvme tg3 libata ccp irqbypass nvme_core megaraid_sas [last unloaded:
>>> ipmi_devintf]
>>> [ 705.180420] CR2: ffffcbe2de8dffe8
>>> [ 705.183852] ---[ end trace 0000000000000000 ]---
>>>
>>>
>>> I haven't had cycles to dig into this yet and been swamped with other
>>> things.
>>
>> Fully understand, I'm _very_ familiar with this situation :)
>>
>> I need more cores... ;)
>
> Oh it's nice to have more :) I am bankrupt now, but it's nice to have more ;)
>
> Cheers, Lorenzo

Best Regards,
Yan, Zi
