On 24/03/26 09:32, Jinjie Ruan wrote:

On 2026/3/24 0:55, Andrew Morton wrote:
On Mon, 23 Mar 2026 15:27:40 +0800 Jinjie Ruan <[email protected]> wrote:

The crash memory allocation and the exclusion of crashk_res, crashk_low_res
and crashk_cma memory are almost identical across architectures.
This patch set handles them in the crash core in a general way, which
eliminates a lot of duplicated code.

It also adds support for crashkernel CMA reservation on arm64 and riscv.
Thanks.  AI review has completed and it asks questions:
        
https://sashiko.dev/#/patchset/[email protected]
I believe it identified 4 valid issues:

- The already-discovered bug in the existing RISC-V code, where
crashk_low_res is not excluded.

- An existing memory leak issue in the existing PowerPC code.

Yes, and the suggested approach to fixing the issue looks good.
It basically replaces the return with goto out.

diff --git a/arch/powerpc/kexec/crash.c b/arch/powerpc/kexec/crash.c
index 898742a5205c..1426d2099bad 100644
--- a/arch/powerpc/kexec/crash.c
+++ b/arch/powerpc/kexec/crash.c
@@ -440,7 +440,7 @@ static void update_crash_elfcorehdr(struct kimage *image, struct memory_notify *
        ret = get_crash_memory_ranges(&cmem);
        if (ret) {
                pr_err("Failed to get crash mem range\n");
-               return;
+               goto out;
        }

        /*

Are you planning to handle this in this patch series? Or do you want me to send a separate fix patch?



- The ordering issue of adding CMA ranges to "linux,usable-memory-range".

- An existing concurrency issue. A concurrent memory hotplug may occur
between reading memblock and attempting to fill cmem during kexec_load()
on almost all existing architectures. I'm not sure whether this is a
practical issue in reality.

  Race Condition Scenario

   Timeline:
   ---------------------------------------------------------------------
   T1: kexec_load() syscall starts
   T2: kexec_trylock() acquires kexec_lock
   T3: crash_prepare_headers() is called
   T4: arch_get_system_nr_ranges() queries memblock → finds 100 memory ranges
   T5: cmem = alloc_cmem(100) allocates buffer for 100 ranges
   T6: [RACE WINDOW] Another process triggers memory hotplug
   T7: add_memory() → lock_device_hotplug() → memblock_add_node()
   T8: New memory region added to memblock
   T9: arch_crash_populate_cmem() iterates: now finds 102 ranges
   T10: cmem->ranges[100] → OUT OF BOUNDS WRITE!
   T11: cmem->ranges[101] → OUT OF BOUNDS WRITE!
   T12: Kernel crash or memory corruption

   Why This Happens

   1. Different locks used:
     - kexec_load() uses kexec_trylock (atomic_t)
     - Memory hotplug uses device_hotplug_lock (mutex)
   2. No synchronization between these two operations
   3. Time-of-check to time-of-use (TOCTOU) issue:
     - Step T4-T5: We query the number of ranges and allocate buffer
     - Step T6-T9: Memory hotplug adds new ranges between query and
       population
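
Independent of any locking, the out-of-bounds write at T10/T11 can be
turned into an error return if the populate step checks capacity before
every store. The following is only a user-space sketch of that idea:
struct range and struct crash_mem are simplified stand-ins for the kernel
types, and alloc_cmem() and populate_cmem() are hypothetical helpers, not
the functions from the patch set.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-ins for the kernel's struct range / struct crash_mem. */
struct range {
	unsigned long long start;
	unsigned long long end;
};

struct crash_mem {
	unsigned int max_nr_ranges;	/* capacity chosen at "count" time */
	unsigned int nr_ranges;		/* entries actually filled */
	struct range ranges[];
};

/* Hypothetical helper: allocate room for max_nr_ranges entries. */
static struct crash_mem *alloc_cmem(unsigned int max_nr_ranges)
{
	struct crash_mem *cmem;

	cmem = calloc(1, sizeof(*cmem) +
			 max_nr_ranges * sizeof(cmem->ranges[0]));
	if (cmem)
		cmem->max_nr_ranges = max_nr_ranges;
	return cmem;
}

/*
 * Hypothetical populate step: copy src[0..nr_src) into cmem, but refuse
 * to write past the capacity that was sized earlier. If the source grew
 * between counting and filling, the caller gets -ENOMEM instead of a
 * corrupted heap.
 */
static int populate_cmem(struct crash_mem *cmem, const struct range *src,
			 unsigned int nr_src)
{
	unsigned int i;

	for (i = 0; i < nr_src; i++) {
		if (cmem->nr_ranges >= cmem->max_nr_ranges)
			return -ENOMEM;
		cmem->ranges[cmem->nr_ranges++] = src[i];
	}

	assert(cmem->nr_ranges <= cmem->max_nr_ranges);
	return 0;
}
```

With a buffer sized for 2 ranges and a source that has grown to 3,
populate_cmem() stops after the second entry and returns -ENOMEM. This
does not close the race window; it only bounds the damage, so the caller
would still have to retry or fail the load.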



Any comments or suggestions on the following approach?


int crash_prepare_headers(...)
   {
       unsigned int max_nr_ranges;
       struct crash_mem *cmem;
       int ret;

       lock_device_hotplug();

       max_nr_ranges = arch_get_system_nr_ranges();
       // ...
       ret = arch_crash_populate_cmem(cmem);
       // ...

       unlock_device_hotplug();
       return ret;
   }
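
One detail the outline above leaves implicit: once the lock is held, every
error path in the elided parts has to reach unlock_device_hotplug(), in the
same way the PowerPC fix needs "goto out" instead of "return". A rough
user-space sketch of that single-exit shape follows. The lock functions
here are counter stubs standing in for the kernel ones, stub_get_nr_ranges()
and stub_populate() are hypothetical placeholders for the arch hooks, and
the buffer is a plain array rather than struct crash_mem.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Counter stubs so the sketch compiles outside the kernel. */
static int hotplug_lock_depth;
static void lock_device_hotplug(void)   { hotplug_lock_depth++; }
static void unlock_device_hotplug(void) { hotplug_lock_depth--; }

/* Hypothetical placeholders for the arch hooks. */
static int fail_populate;	/* set to 1 to force the error path */

static unsigned int stub_get_nr_ranges(void)
{
	return 4;
}

static int stub_populate(unsigned long long *ranges, unsigned int nr)
{
	unsigned int i;

	/* The hook is expected to run with the hotplug lock held. */
	assert(hotplug_lock_depth == 1);
	if (fail_populate)
		return -EINVAL;
	for (i = 0; i < nr; i++)
		ranges[i] = i;
	return 0;
}

static int crash_prepare_headers_sketch(void)
{
	unsigned long long *ranges;
	unsigned int nr;
	int ret;

	lock_device_hotplug();

	/* Count and fill under the same lock, so the count cannot go stale. */
	nr = stub_get_nr_ranges();
	ranges = calloc(nr, sizeof(*ranges));
	if (!ranges) {
		ret = -ENOMEM;
		goto out_unlock;
	}

	ret = stub_populate(ranges, nr);
	if (ret)
		goto out_free;

	/* ... build the ELF headers from the ranges here ... */

out_free:
	free(ranges);
out_unlock:
	unlock_device_hotplug();
	return ret;
}
```

Both the success path and the forced-failure path leave
hotplug_lock_depth at 0. Whether taking device_hotplug_lock in this path
is acceptable for the real code is exactly the open question above; the
sketch only illustrates the cleanup ordering.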



