If a process receives a signal while it is executing kernel code that
calls mm_take_all_locks, mm_take_all_locks fails with -EINTR. The error
is propagated up the call stack to userspace, and userspace may fail
when it receives it.

This commit changes -EINTR to -ERESTARTSYS, so that if the signal handler
was installed with the SA_RESTART flag, the operation is automatically
restarted.

For example, this problem happens when using OpenCL on AMDGPU. If a
signal races with clGetDeviceIDs, clGetDeviceIDs returns the error
CL_DEVICE_NOT_FOUND (and strace shows that open("/dev/kfd") failed with
EINTR).

This problem can be reproduced with the following program.

To run this program, you need an AMD graphics card and the package
"rocm-opencl" installed. You must not have the package "mesa-opencl-icd"
installed, because it redirects the default OpenCL implementation to
itself.

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <unistd.h>
#include <string.h>
#include <signal.h>
#include <sys/time.h>

#define CL_TARGET_OPENCL_VERSION 300
#include <CL/opencl.h>

static void fn(void)
{
        while (1) {
                int32_t err;
                cl_device_id device;
                err = clGetDeviceIDs(NULL, CL_DEVICE_TYPE_GPU, 1, &device, NULL);
                if (err != CL_SUCCESS) {
                        fprintf(stderr, "clGetDeviceIDs failed: %d\n", err);
                        exit(1);
                }
                write(2, "-", 1);
        }
}

static void alrm(int sig)
{
        write(2, ".", 1);
}

int main(void)
{
        struct itimerval it;
        struct sigaction sa;
        memset(&sa, 0, sizeof sa);
        sa.sa_handler = alrm;
        sa.sa_flags = SA_RESTART;
        sigaction(SIGALRM, &sa, NULL);
        it.it_interval.tv_sec = 0;
        it.it_interval.tv_usec = 50;
        it.it_value.tv_sec = 0;
        it.it_value.tv_usec = 50;
        setitimer(ITIMER_REAL, &it, NULL);
        fn();
        return 1;
}

I'm submitting this patch for the stable kernels, because the AMD ROCm
stack fails if it receives EINTR from open (it seems to restart EINTR
from ioctl correctly). The process may receive signals at unpredictable
times, so the OpenCL implementation may fail at unpredictable times.

Signed-off-by: Mikulas Patocka <[email protected]>
Link: https://lists.freedesktop.org/archives/amd-gfx/2025-November/133141.html
Link: https://yhbt.net/lore/linux-mm/[email protected]/T/#u
Cc: [email protected]
Fixes: 7906d00cd1f6 ("mmu-notifiers: add mm_take_all_locks() operation")
---
 mm/vma.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: mm/mm/vma.c
===================================================================
--- mm.orig/mm/vma.c    2026-01-07 20:11:21.000000000 +0100
+++ mm/mm/vma.c 2026-01-07 20:11:21.000000000 +0100
@@ -2202,7 +2202,7 @@ int mm_take_all_locks(struct mm_struct *
 
 out_unlock:
        mm_drop_all_locks(mm);
-       return -EINTR;
+       return -ERESTARTSYS;
 }
 
 static void vm_unlock_anon_vma(struct anon_vma *anon_vma)
