On Wed, Sep 11, 2019 at 06:45:40PM +0100, Eugene Syromiatnikov wrote:
> Previously, higher 32 bits of exit_signal fields were lost when
> copied to the kernel args structure (that uses int as a type for the
> respective field).  Moreover, as Oleg has noted[1], exit_signal is used
> unchecked, so it has to be checked for sanity before use; for the legacy
> syscalls, applying CSIGNAL mask guarantees that it is at least non-negative;
> however, there's no such thing is done in clone3() code path, and that can
> break at least thread_group_leader.
> 
> Adding checks that user-passed exit_signal fits into int and passes
> valid_signal() check solves both of these problems.
> 
> [1] https://lkml.org/lkml/2019/9/10/467
> 
> * kernel/fork.c (copy_clone_args_from_user): Fail with -EINVAL if
> args.exit_signal is greater than UINT_MAX or is not a valid signal.
> (_do_fork): Note that exit_signal is expected to be checked for the
> sanity by the caller.
> 
> Fixes: 7f192e3cd316 ("fork: add clone3")
> Reported-by: Oleg Nesterov <o...@redhat.com>
> Co-authored-by: Oleg Nesterov <o...@redhat.com>
> Co-authored-by: Dmitry V. Levin <l...@altlinux.org>
> Signed-off-by: Eugene Syromiatnikov <e...@redhat.com>

For the sake of posterity I appended a reproducer to the patch (cf. [1])
that I pushed to mainline which illustrates how without this patch you
can cause a crash. I'll also explain how I think this happens here.

By passing in a negative signal you can cause a segfault in
proc_flush_task(). The reason is that at process creation time

static inline bool thread_group_leader(struct task_struct *p)
{
        return p->exit_signal >= 0;
}

will return false even though it should return true because exit_signal
overflowed in copy_clone_args_from_user. This means, the kernel
thinks this is a thread and doesn't make the process a thread-group
leader. In proc_flush_task() the kernel will then try to retrieve the
thread group leader but there'll be none ultimately causing a segfault.
Specifically:

void proc_flush_task(struct task_struct *task)
{
        int i;
        struct pid *pid, *tgid;
        struct upid *upid;

        pid = task_pid(task);
        tgid = task_tgid(task); #### tgid is NULL ####

        for (i = 0; i <= pid->level; i++) {
                upid = &pid->numbers[i];
                proc_flush_task_mnt(upid->ns->proc_mnt, upid->nr,
                                        tgid->numbers[i].nr); #### NULL pointer 
deref ####
        }
}

[1]:
  #define _GNU_SOURCE
 #include <linux/sched.h>
 #include <linux/types.h>
 #include <sched.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <sys/syscall.h>
 #include <sys/wait.h>
 #include <unistd.h>

 int main(int argc, char *argv[])
 {
        pid_t pid = -1;
        struct clone_args args = {0};
        args.exit_signal = -1;

        pid = syscall(__NR_clone3, &args, sizeof(struct clone_args));
        if (pid < 0)
                exit(EXIT_FAILURE);

        if (pid == 0)
                exit(EXIT_SUCCESS);

        wait(NULL);

        exit(EXIT_SUCCESS);
 }

 Christian

Reply via email to