** Summary changed:

- 6.8 backport for CVE-2025-21861 causes kernel hangs
+ Incorrect backport for CVE-2025-21861 causes kernel hangs
** Description changed:

- My team recently picked up linux-azure_6.8.0-1033.38 and
- linux-azure-nvidia_6.8.0-1021.2. These kernels have failed our internal
- qualification because our nvidia health-checking scripts are getting
- stuck in a kernel call. After debugging, the backport of the patch for
- "mm/migrate_device: don't add folio to be freed to LRU in
- migrate_device_finalize()" appears to have manual conflict resolution
- that introduced a bug. I've reverted this patch in a test tree for
- linux-azure_6.8.0-1033.38 and validated that the problem resolves. I've
- also come up with an alternate merge strategy that has a much simpler
- conflict resolution and that does not reproduce the problem in testing.
+ BugLink: https://bugs.launchpad.net/bugs/2120330

- First, the problem: our nvbandwidth tasks are getting stuck waiting for
- one of the nvidia drivers to release memory. The nvidia kernel task
- that needs to complete in order for this memory release to proceed is
- blocked waiting in migration_entry_wait_on_locked. One of our partners
- at Nvidia also reproduced this problem and had page debug flags
- enabled. He observed a BUG for the presence of the PG_active and LRU
- bits set in the page flags when the page was freed.
+ [Impact]

- stacks from the two participating processes:
+ The patch for CVE-2025-21861 was incorrectly backported to the noble
+ 6.8 kernel, leading to hangs when freeing device memory.
- ID: 871438 TASK: ffff007d4d668200 CPU: 95 COMMAND: "nvbandwidth"
+ commit 41cddf83d8b00f29fd105e7a0777366edc69a5cf
+ Author: David Hildenbrand <[email protected]>
+ Date: Mon Feb 10 17:13:17 2025 +0100
+ Subject: mm/migrate_device: don't add folio to be freed to LRU in migrate_device_finalize()
+ Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=41cddf83d8b00f29fd105e7a0777366edc69a5cf
+ ubuntu-noble: https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/noble/commit/?id=3858edb1146374f3240d1ec769ba857186531b17
+
+ An incorrect backport was performed, causing the old page to be placed
+ back instead of the new page, e.g.:
+
+ src = page_folio(page);
+ dst = page_folio(newpage);
+
+ +	if (!is_zone_device_page(page))
+ +		putback_lru_page(page);
+
+ when in 41cddf83d8b00f29fd105e7a0777366edc69a5cf we have:
+
+ +	if (!folio_is_zone_device(dst))
+ +		folio_add_lru(dst);
+
+ in which case we should really have had the backport as:
+
+ +	if (!folio_is_zone_device(newpage))
+ +		folio_add_lru(newpage);
+
+ The incorrect backport keeps references alive to the old memory pages,
+ preventing them from being released and freed.
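To make the failure mode concrete, here is a toy model in plain C (an editor's illustration, not kernel code: `toy_page`, the starting refcounts, and the rule that the LRU pins a page with its own reference are all simplifying assumptions). Putting the *old* page back on the LRU pins the very page migration meant to free, so its refcount can never reach zero, while the new page silently misses its LRU insertion:

```c
/* Toy model of the backport bug (NOT kernel code; refcount rules are
 * simplified assumptions). Migration finalization holds one reference
 * on both the source and the destination page; the page that stays in
 * use must be placed on the LRU, which pins it with its own reference. */

struct toy_page {
	int refcount;
	int on_lru;
};

/* Placing a page on the LRU takes a reference that the LRU owns. */
static void toy_putback_lru_page(struct toy_page *p)
{
	p->refcount++;
	p->on_lru = 1;
}

static void toy_put_page(struct toy_page *p)
{
	p->refcount--;
}

/*
 * Finalize: put the surviving page on the LRU, then drop the two
 * references migration was holding. With buggy == 1 this mimics the
 * incorrect backport, which put back the OLD page: the source page then
 * never reaches refcount 0, so its memory is never released, and the
 * new page never makes it onto the LRU.
 */
static void toy_migrate_finalize(struct toy_page *src, struct toy_page *dst,
				 int buggy)
{
	if (buggy)
		toy_putback_lru_page(src);	/* wrong page */
	else
		toy_putback_lru_page(dst);	/* correct page */

	toy_put_page(src);
	toy_put_page(dst);
}
```

Starting from `src.refcount == 1` (only migration's reference, so the page should be freed) and `dst.refcount == 2` (a user mapping plus migration's reference), the buggy path leaves `src` pinned at refcount 1 on the LRU, whereas the correct path drops it to 0.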
+
+ Stack traces of stuck processes:
+
+ ID: 871438 TASK: ffff007d4d668200 CPU: 95 COMMAND: "nvbandwidth"
 #0 [ffff80010e8ef840] __switch_to at ffffc0f22798c550
 #1 [ffff80010e8ef8a0] __schedule at ffffc0f22798c89c
 #2 [ffff80010e8ef900] schedule at ffffc0f22798cd40
 #3 [ffff80010e8ef930] schedule_preempt_disabled at ffffc0f22798d388
 #4 [ffff80010e8ef9c0] rwsem_down_write_slowpath at ffffc0f227990dc8
 #5 [ffff80010e8efa20] down_write at ffffc0f2279912d0
 #6 [ffff80010e8efaa0] uvm_va_space_mm_shutdown at ffffc0f1c2a451ec [nvidia_uvm]
 #7 [ffff80010e8efb00] uvm_va_space_mm_unregister at ffffc0f1c2a457a0 [nvidia_uvm]
 #8 [ffff80010e8efb30] uvm_release at ffffc0f1c2a226d4 [nvidia_uvm]
 #9 [ffff80010e8efc00] uvm_release_entry.part.0 at ffffc0f1c2a227dc [nvidia_uvm]
 #10 [ffff80010e8efc20] uvm_release_entry at ffffc0f1c2a22850 [nvidia_uvm]
 #11 [ffff80010e8efc30] __fput at ffffc0f2269a5760
 #12 [ffff80010e8efc70] ____fput at ffffc0f2269a5a80
 #13 [ffff80010e8efc80] task_work_run at ffffc0f2265ceedc
 #14 [ffff80010e8efcc0] do_exit at ffffc0f2265a0bc8
 #15 [ffff80010e8efcf0] do_group_exit at ffffc0f2265a0fec
 #16 [ffff80010e8efd50] get_signal at ffffc0f2265b8750
 #17 [ffff80010e8efe10] do_signal at ffffc0f22650166c
 #18 [ffff80010e8efe40] do_notify_resume at ffffc0f2265018f0
 #19 [ffff80010e8efe70] el0_interrupt at ffffc0f227985564
 #20 [ffff80010e8efe90] __el0_irq_handler_common at ffffc0f2279855f0
 #21 [ffff80010e8efea0] el0t_64_irq_handler at ffffc0f227986080
 #22 [ffff80010e8effe0] el0t_64_irq at ffffc0f2264f17fc

- PID: 871467 TASK: ffff007f6aa66000 CPU: 66 COMMAND: "UVM GPU4 BH"
+ PID: 871467 TASK: ffff007f6aa66000 CPU: 66 COMMAND: "UVM GPU4 BH"
 #0 [ffff80015ddef580] __switch_to at ffffc0f22798c550
 #1 [ffff80015ddef5e0] __schedule at ffffc0f22798c89c
 #2 [ffff80015ddef640] schedule at ffffc0f22798cd40
 #3 [ffff80015ddef670] io_schedule at ffffc0f22798cec4
 #4 [ffff80015ddef6e0] migration_entry_wait_on_locked at ffffc0f22686e3f0
 #5 [ffff80015ddef740] migration_entry_wait at ffffc0f22695a6d4
 #6 [ffff80015ddef750] do_swap_page at ffffc0f2268d6378
 #7 [ffff80015ddef7d0] handle_pte_fault at ffffc0f2268da688
 #8 [ffff80015ddef870] __handle_mm_fault at ffffc0f2268da7f8
 #9 [ffff80015ddef8b0] handle_mm_fault at ffffc0f2268dab48
 #10 [ffff80015ddef910] handle_fault at ffffc0f1c2aace18 [nvidia_uvm]
 #11 [ffff80015ddef950] uvm_populate_pageable_vma at ffffc0f1c2aacf24 [nvidia_uvm]
 #12 [ffff80015ddef990] migrate_pageable_vma_populate_mask at ffffc0f1c2aad8c0 [nvidia_uvm]
 #13 [ffff80015ddefab0] uvm_migrate_pageable at ffffc0f1c2ab0294 [nvidia_uvm]
 #14 [ffff80015ddefb90] service_ats_requests at ffffc0f1c2abf828 [nvidia_uvm]
 #15 [ffff80015ddefbb0] uvm_ats_service_faults at ffffc0f1c2ac02f0 [nvidia_uvm]
 #16 [ffff80015ddefd40] uvm_parent_gpu_service_non_replayable_fault_buffer at ffffc0f1c2a82e00 [nvidia_uvm]
 #17 [ffff80015ddefda0] non_replayable_faults_isr_bottom_half at ffffc0f1c2a3c3e4 [nvidia_uvm]
 #18 [ffff80015ddefe00] non_replayable_faults_isr_bottom_half_entry at ffffc0f1c2a3c590 [nvidia_uvm]
 #19 [ffff80015ddefe20] _main_loop at ffffc0f1c2a207c8 [nvidia_uvm]
 #20 [ffff80015ddefe70] kthread at ffffc0f2265d40dc

- For this one, I was able to find the wait_page_queue in the stack and
- get the folio from there:
+ There is no workaround.

- struct wait_page_queue {
-   folio = 0xffffffc0205cec80,
-   bit_nr = 0,
-   wait = {
-     flags = 0,
-     private = 0xffff007f6aa66000,
-     func = 0xffffc0f226867a30 <wake_page_function>,
-     entry = {
-       next = 0xffffc0f2297d2ae8 <folio_wait_table+3944>,
-       prev = 0xffffc0f2297d2ae8 <folio_wait_table+3944>
-     }
-   }
- }
+ [Fix]

- The folio has flags 396316561050206252, which means neither `PG_locked`
- nor `PG_waiters` is set.
+ To make things less confusing, revert the incorrect backport, backport
+ "mm: migrate_device: use more folio in migrate_device_finalize()" to
+ use the new upstream notations, and correctly backport
+ "mm/migrate_device: don't add folio to be freed to LRU in
+ migrate_device_finalize()". This approach was suggested and tested by
+ Krister Johansen, and I think it is reasonable.

- Looking at Ubuntu's 6.8 backport of "mm/migrate_device: don't add
- folio to be freed to LRU in migrate_device_finalize()", the
- migrate_device_finalize code does this:
+ commit 58bf8c2bf47550bc94fea9cafd2bc7304d97102c
+ Author: Kefeng Wang <[email protected]>
+ Date: Mon Aug 26 14:58:12 2024 +0800
+ Subject: mm: migrate_device: use more folio in migrate_device_finalize()
+ Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=58bf8c2bf47550bc94fea9cafd2bc7304d97102c

- +	if (!is_zone_device_page(page))
- +		putback_lru_page(page);
+ commit 41cddf83d8b00f29fd105e7a0777366edc69a5cf
+ Author: David Hildenbrand <[email protected]>
+ Date: Mon Feb 10 17:13:17 2025 +0100
+ Subject: mm/migrate_device: don't add folio to be freed to LRU in migrate_device_finalize()
+ Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=41cddf83d8b00f29fd105e7a0777366edc69a5cf

- but upstream it does this instead:
+ The first patch landed in 6.12-rc1 and the second in 6.14-rc4. Both
+ are in plucky.

- +	if (!folio_is_zone_device(dst))
- +		folio_add_lru(dst);
+ [Testcase]

- I think in some cases this is actually putting back the old page when
- it shouldn't. We generally want the new page instead. I get a cleaner
- conflict resolution if I apply the following:
+ There are a few ways to trigger the issue.

- 58bf8c2bf475 mm: migrate_device: use more folio in migrate_device_finalize()
- 41cddf83d8b0 mm/migrate_device: don't add folio to be freed to LRU in migrate_device_finalize()
+ You can run the hmm selftests.
- The second patch doesn't merge without conflicts, but has only a
- trivial conflict against:
+ 1) Check out a kernel git tree
+ 2) cd tools/testing/selftests/mm/
+ 3) make
+ 4) sudo ./hmm_tests

- b1f202060afe mm: remap unused subpages to shared zeropage when
- splitting isolated thp
+ You can also run nvidia tests like nvbandwidth, if your system has an
+ Nvidia GPU: https://github.com/NVIDIA/nvbandwidth

+ $ git clone https://github.com/NVIDIA/nvbandwidth.git
+ $ cd nvbandwidth
+ $ sudo ./debian_install.sh
+ $ sudo ./nvbandwidth

- -	remove_migration_ptes(src, dst, false);
- +	remove_migration_ptes(src, dst, 0);
+ A test package is available in the following ppa:

- Then the only resolution is re-writing the 0 -> false in the second
- patch.
+ https://launchpad.net/~mruffell/+archive/ubuntu/sf416039-test

- I've been running with this modification and have not seen any hangs
- on a test that used to hang basically immediately.
+ If you install it and run the hmm selftests, it should no longer hang.

- Do you think Ubuntu would be able to fix this up before any of the
- kernels with this CVE graduate from proposed?
+ [Where problems can occur]
+
+ This changes some core mm code for device memory from standard pages
+ to folios, and carries some additional risk because of this.
+
+ If a regression were to occur, it would primarily affect users of
+ devices with internal memory, such as graphics cards, and quite
+ possibly high-end network cards.
+
+ The largest userbase affected by this regression is nvidia users, so
+ it really would be a bad idea to release the broken implementation;
+ instead, respin and release the fixed implementation.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2120330

Title:
  Incorrect backport for CVE-2025-21861 causes kernel hangs

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2120330/+subscriptions

-- 
ubuntu-bugs mailing list
[email protected]
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
