On 6/26/26 11:38, Wandun wrote:
> On 6/26/26 16:45, Alexander Krabler wrote:
>> However, we were not able to reproduce the actual race
>> (mlockall() process waiting on a migration PTE),
>> not in the past, not now. Might be hard to trigger that race.
>
> Not hard to trigger that case. I added a debug message, such as the one below,
> and lots of messages appear within a few seconds.
>
> diff --cc mm/memory.c
> index ff338c2abe92,ff338c2abe92..6552b3b14f78
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@@ -4768,6 -4768,6 +4768,8 @@@ vm_fault_t do_swap_page(struct vm_faul
>                 if (softleaf_is_migration(entry)) {
>                         migration_entry_wait(vma->vm_mm, vmf->pmd,
>                                              vmf->address);
> +                       if (!strcmp(current->comm, "repro"))
> +                               pr_err("============== hit ================\n");
>                 } else if (softleaf_is_device_exclusive(entry)) {
>                         vmf->page = softleaf_to_page(entry);
>                         ret = remove_device_exclusive_entry(vmf);

I have a kprobe set on migration_entry_wait() that logs into an ftrace buffer
(including the kernel stack trace).
Yes, this function is hit, but only inside the mmap() syscall, which is okay:
memory allocation is not realtime-safe anyway.
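
A probe of this kind can be set up through tracefs roughly as follows. This is
only a sketch: it assumes tracefs is mounted at /sys/kernel/tracing, reuses the
event name frt_migration_entry_wait seen in the trace output, and the helper
name setup_migration_probe is made up for illustration. The exact setup used
for the traces in this mail may differ.

```sh
#!/bin/sh
# Sketch: register a kprobe event on migration_entry_wait() and dump a kernel
# stack trace on every hit.
# Assumption: tracefs mount point; older systems may use
# /sys/kernel/debug/tracing instead.
TRACEFS="${TRACEFS:-/sys/kernel/tracing}"
EVENT="frt_migration_entry_wait"   # event name as it appears in the trace

# Hypothetical helper name, not part of any existing tool.
setup_migration_probe() {
    # Define the kprobe event; it lands in the default "kprobes" group.
    echo "p:${EVENT} migration_entry_wait" >> "${TRACEFS}/kprobe_events" || return 1
    # Attach a stacktrace trigger so each hit records the kernel stack.
    echo stacktrace > "${TRACEFS}/events/kprobes/${EVENT}/trigger" || return 1
    # Enable the event and make sure tracing is switched on.
    echo 1 > "${TRACEFS}/events/kprobes/${EVENT}/enable" || return 1
    echo 1 > "${TRACEFS}/tracing_on"
}
```

Run as root, calling setup_migration_probe and then reading
"$TRACEFS/trace_pipe" should show entries in the format below.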

           repro-2090    [002] d....   811.129549: frt_migration_entry_wait: (migration_entry_wait+0x0/0x100)
           repro-2090    [002] d....   811.129553: <stack trace>
 => migration_entry_wait
 => __handle_mm_fault
 => handle_mm_fault
 => __get_user_pages
 => populate_vma_page_range
 => __mm_populate
 => vm_mmap_pgoff
 => ksys_mmap_pgoff
 => __arm64_sys_mmap
 => el0_svc_common.constprop.0
 => do_el0_svc
 => el0_svc
 => el0t_64_sync_handler
 => el0t_64_sync

The original race was an instruction abort that came out of nowhere, caused by
a migration PTE installed by kcompactd.
I see this kind of race quite often in processes that do not use mlockall(),
but I cannot reproduce it in memory-locked processes.

Example:
          podman-832     [000] d....   812.447820: frt_migration_entry_wait: (migration_entry_wait+0x0/0x100)
          podman-832     [000] d....   812.447823: <stack trace>
 => migration_entry_wait
 => __handle_mm_fault
 => handle_mm_fault
 => do_page_fault
 => do_translation_fault
 => do_mem_abort
 => el0_da
 => el0t_64_sync_handler
 => el0t_64_sync
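
To tell the two cases apart in a longer trace, the stacks can be classified by
entry path. The following is a minimal sketch, assuming the ftrace text format
and the arm64 function names shown in the traces in this mail (__mm_populate
for the mmap() populate path, do_page_fault for a real fault). The function
name classify_hits is made up for illustration.

```sh
#!/bin/sh
# Sketch: count migration_entry_wait() hits per entry path in ftrace text.
# Assumption: stacks are introduced by a "<stack trace>" line and each frame
# is printed as " => function", as in the traces above.
classify_hits() {
    awk '
        # A new stack begins; close any stack still open.
        /<stack trace>/ {
            if (in_stack) count[kind]++
            in_stack = 1; kind = "other"; next
        }
        # Frame line inside a stack: look for the distinguishing function.
        in_stack && $1 == "=>" {
            if ($2 == "__mm_populate") kind = "populate"
            else if ($2 == "do_page_fault") kind = "fault"
            next
        }
        # Any other line ends the current stack.
        in_stack { count[kind]++; in_stack = 0 }
        END {
            if (in_stack) count[kind]++
            printf "populate=%d fault=%d other=%d\n",
                   count["populate"], count["fault"], count["other"]
        }
    ' "$@"
}
```

Usage would be something like classify_hits trace.txt, where trace.txt is a
saved copy of the ftrace buffer. A nonzero "fault" count for a memory-locked
process would be the interesting case.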

Thanks,
Alexander
