cgroup_enable_task_cg_lists && PF_EXITING (Was: KCSAN: data-race in exit_signals / prepare_signal)

2019-10-21 Thread Oleg Nesterov
could you explain the usage of siglock/PF_EXITING in
cgroup_enable_task_cg_lists() ?

PF_EXITING is protected by cgroup_threadgroup_rwsem, not by
sighand->siglock.

Oleg.



Re: KCSAN: data-race in exit_signals / prepare_signal

2019-10-21 Thread Oleg Nesterov
On 10/21, Marco Elver wrote:
>
> On Mon, 21 Oct 2019 at 14:00, Oleg Nesterov  wrote:
> >
> > I think this is WONTFIX.
>
> If taking the spinlock is unnecessary (which AFAIK it probably is) and
> there are no other writers to this flag, you will still need a
> WRITE_ONCE(tsk->flags, tsk->flags | PF_EXITING) to avoid the
> data-race.

Or even WRITE_ONCE(tsk->flags, READ_ONCE(tsk->flags) | PF_EXITING) in
theory. But in practice, I do not think compiler can turn

curent->flags |= PF_EXITING;

into something which temporary clears another flag, say, PF_KTHREAD.

> However, if it is possible that there are concurrent writers setting
> other bits in flags,

No, only current taks should change its ->flags.

Oleg.



Re: KCSAN: data-race in exit_signals / prepare_signal

2019-10-21 Thread Christian Brauner
On Mon, Oct 21, 2019 at 02:00:30PM +0200, Oleg Nesterov wrote:
> On 10/21, Christian Brauner wrote:
> >
> > This traces back to Oleg fixing a race between a group stop and a thread
> > exiting before it notices that it has a pending signal or is in the middle 
> > of
> > do_exit() already, causing group stop to get wacky.
> > The original commit to fix this race is
> > commit d12619b5ff56 ("fix group stop with exit race") which took sighand
> > lock before setting PF_EXITING on the thread.
> 
> Not really... sig_task_ignored() didn't check task->flags until the recent
> 33da8e7c81 ("signal: Allow cifs and drbd to receive their terminating 
> signals").
> But I think this doesn't matter, see below.
> 
> > If the race really matters and given how tsk->flags is currently accessed
> > everywhere the simple fix for now might be:
> >
> > diff --git a/kernel/signal.c b/kernel/signal.c
> > index c4da1ef56fdf..cf61e044c4cc 100644
> > --- a/kernel/signal.c
> > +++ b/kernel/signal.c
> > @@ -2819,7 +2819,9 @@ void exit_signals(struct task_struct *tsk)
> > cgroup_threadgroup_change_begin(tsk);
> >
> > if (thread_group_empty(tsk) || signal_group_exit(tsk->signal)) {
> > +   spin_lock_irq(&tsk->sighand->siglock);
> > tsk->flags |= PF_EXITING;
> > +   spin_unlock_irq(&tsk->sighand->siglock);
> 
> Well, exit_signals() tries to avoid ->siglock in this case
> 
> But this doesn't matter. IIUC the problem is not that exit_signals() sets
> PF_EXITING, the problem is that it writes to tsk->flags and kasan detects
> the data race.

Right, that's the reason I said "If the race really matters". I thought
that other writers/readers always take sighand lock. So the easy fix
would have been to take sighand lock too.
The alternative is that we need to fiddle with task_struct itself and
replace flags with an atomic_t or sm which is more invasive and probably
more controversial.

Christian


Re: KCSAN: data-race in exit_signals / prepare_signal

2019-10-21 Thread Marco Elver
On Mon, 21 Oct 2019 at 14:00, Oleg Nesterov  wrote:
>
> On 10/21, Christian Brauner wrote:
> >
> > This traces back to Oleg fixing a race between a group stop and a thread
> > exiting before it notices that it has a pending signal or is in the middle 
> > of
> > do_exit() already, causing group stop to get wacky.
> > The original commit to fix this race is
> > commit d12619b5ff56 ("fix group stop with exit race") which took sighand
> > lock before setting PF_EXITING on the thread.
>
> Not really... sig_task_ignored() didn't check task->flags until the recent
> 33da8e7c81 ("signal: Allow cifs and drbd to receive their terminating 
> signals").
> But I think this doesn't matter, see below.
>
> > If the race really matters and given how tsk->flags is currently accessed
> > everywhere the simple fix for now might be:
> >
> > diff --git a/kernel/signal.c b/kernel/signal.c
> > index c4da1ef56fdf..cf61e044c4cc 100644
> > --- a/kernel/signal.c
> > +++ b/kernel/signal.c
> > @@ -2819,7 +2819,9 @@ void exit_signals(struct task_struct *tsk)
> > cgroup_threadgroup_change_begin(tsk);
> >
> > if (thread_group_empty(tsk) || signal_group_exit(tsk->signal)) {
> > +   spin_lock_irq(&tsk->sighand->siglock);
> > tsk->flags |= PF_EXITING;
> > +   spin_unlock_irq(&tsk->sighand->siglock);
>
> Well, exit_signals() tries to avoid ->siglock in this case
>
> But this doesn't matter. IIUC the problem is not that exit_signals() sets
> PF_EXITING, the problem is that it writes to tsk->flags and kasan detects
> the data race.
>
> For example, freezable_schedule() which sets PF_FREEZER_SKIP can equally
> "race" with sig_task_ignored() or with ANY other code which checks this
> task's flags.
>
> I think this is WONTFIX.

If taking the spinlock is unnecessary (which AFAIK it probably is) and
there are no other writers to this flag, you will still need a
WRITE_ONCE(tsk->flags, tsk->flags | PF_EXITING) to avoid the
data-race.

However, if it is possible that there are concurrent writers setting
other bits in flags, you need to ensure that the bits are set
atomically (unless it's ok to lose some bits here). This can only be
done via 1) taking siglock, or 2) via e.g. atomic_or(...), however,
flags is not atomic_t ...

-- Marco


Re: KCSAN: data-race in exit_signals / prepare_signal

2019-10-21 Thread Oleg Nesterov
On 10/21, Christian Brauner wrote:
>
> This traces back to Oleg fixing a race between a group stop and a thread
> exiting before it notices that it has a pending signal or is in the middle of
> do_exit() already, causing group stop to get wacky.
> The original commit to fix this race is
> commit d12619b5ff56 ("fix group stop with exit race") which took sighand
> lock before setting PF_EXITING on the thread.

Not really... sig_task_ignored() didn't check task->flags until the recent
33da8e7c81 ("signal: Allow cifs and drbd to receive their terminating signals").
But I think this doesn't matter, see below.

> If the race really matters and given how tsk->flags is currently accessed
> everywhere the simple fix for now might be:
>
> diff --git a/kernel/signal.c b/kernel/signal.c
> index c4da1ef56fdf..cf61e044c4cc 100644
> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -2819,7 +2819,9 @@ void exit_signals(struct task_struct *tsk)
> cgroup_threadgroup_change_begin(tsk);
>
> if (thread_group_empty(tsk) || signal_group_exit(tsk->signal)) {
> +   spin_lock_irq(&tsk->sighand->siglock);
> tsk->flags |= PF_EXITING;
> +   spin_unlock_irq(&tsk->sighand->siglock);

Well, exit_signals() tries to avoid ->siglock in this case

But this doesn't matter. IIUC the problem is not that exit_signals() sets
PF_EXITING, the problem is that it writes to tsk->flags and kasan detects
the data race.

For example, freezable_schedule() which sets PF_FREEZER_SKIP can equally
"race" with sig_task_ignored() or with ANY other code which checks this
task's flags.

I think this is WONTFIX.

Oleg.



Re: KCSAN: data-race in exit_signals / prepare_signal

2019-10-21 Thread Christian Brauner
[+Cc Will]

On Mon, Oct 21, 2019 at 03:34:07AM -0700, syzbot wrote:
> Hello,
> 
> syzbot found the following crash on:
> 
> HEAD commit:d724f94f x86, kcsan: Enable KCSAN for x86
> git tree:   https://github.com/google/ktsan.git kcsan
> console output: https://syzkaller.appspot.com/x/log.txt?x=13eab79f60
> kernel config:  https://syzkaller.appspot.com/x/.config?x=c0906aa620713d80
> dashboard link: https://syzkaller.appspot.com/bug?extid=492a4acccd8fc75ddfd0
> compiler:   gcc (GCC) 9.0.0 20181231 (experimental)
> 
> Unfortunately, I don't have any reproducer for this crash yet.
> 
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+492a4acccd8fc75dd...@syzkaller.appspotmail.com
> 
> ==============
> BUG: KCSAN: data-race in exit_signals / prepare_signal

This traces back to Oleg fixing a race between a group stop and a thread
exiting before it notices that it has a pending signal or is in the middle of
do_exit() already, causing group stop to get wacky.
The original commit to fix this race is
commit d12619b5ff56 ("fix group stop with exit race") which took sighand
lock before setting PF_EXITING on the thread.

Later on in
commit 5dee1707dfbf ("move the related code from exit_notify() to 
exit_signals()")
an improvement was made for the single-threaded case and the
case where the group stop is already in progress. This removed the
sighand lock around the PF_EXITING assignment.

If the race really matters and given how tsk->flags is currently accessed
everywhere the simple fix for now might be:

diff --git a/kernel/signal.c b/kernel/signal.c
index c4da1ef56fdf..cf61e044c4cc 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2819,7 +2819,9 @@ void exit_signals(struct task_struct *tsk)
cgroup_threadgroup_change_begin(tsk);

if (thread_group_empty(tsk) || signal_group_exit(tsk->signal)) {
+   spin_lock_irq(&tsk->sighand->siglock);
tsk->flags |= PF_EXITING;
+   spin_unlock_irq(&tsk->sighand->siglock);
cgroup_threadgroup_change_end(tsk);
return;
}

Christian


KCSAN: data-race in exit_signals / prepare_signal

2019-10-21 Thread syzbot

Hello,

syzbot found the following crash on:

HEAD commit:d724f94f x86, kcsan: Enable KCSAN for x86
git tree:   https://github.com/google/ktsan.git kcsan
console output: https://syzkaller.appspot.com/x/log.txt?x=13eab79f60
kernel config:  https://syzkaller.appspot.com/x/.config?x=c0906aa620713d80
dashboard link: https://syzkaller.appspot.com/bug?extid=492a4acccd8fc75ddfd0
compiler:   gcc (GCC) 9.0.0 20181231 (experimental)

Unfortunately, I don't have any reproducer for this crash yet.

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+492a4acccd8fc75dd...@syzkaller.appspotmail.com

==
BUG: KCSAN: data-race in exit_signals / prepare_signal

read to 0x888103566064 of 4 bytes by interrupt on cpu 0:
 sig_task_ignored kernel/signal.c:94 [inline]
 sig_ignored kernel/signal.c:119 [inline]
 prepare_signal+0x1f5/0x790 kernel/signal.c:956
 send_sigqueue+0xc1/0x4b0 kernel/signal.c:1859
 posix_timer_event kernel/time/posix-timers.c:328 [inline]
 posix_timer_fn+0x10d/0x230 kernel/time/posix-timers.c:354
 __run_hrtimer kernel/time/hrtimer.c:1389 [inline]
 __hrtimer_run_queues+0x288/0x600 kernel/time/hrtimer.c:1451
 hrtimer_interrupt+0x22a/0x480 kernel/time/hrtimer.c:1509
 local_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1110 [inline]
 smp_apic_timer_interrupt+0xdc/0x280 arch/x86/kernel/apic/apic.c:1135
 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:830
 arch_local_irq_enable arch/x86/include/asm/paravirt.h:778 [inline]
 __raw_spin_unlock_irq include/linux/spinlock_api_smp.h:168 [inline]
 _raw_spin_unlock_irq+0x4e/0x80 kernel/locking/spinlock.c:199
 spin_unlock_irq include/linux/spinlock.h:388 [inline]
 get_signal+0x1f4/0x1320 kernel/signal.c:2707
 do_signal+0x3b/0xc00 arch/x86/kernel/signal.c:815
 exit_to_usermode_loop+0x250/0x2c0 arch/x86/entry/common.c:159
 prepare_exit_to_usermode arch/x86/entry/common.c:194 [inline]
 syscall_return_slowpath arch/x86/entry/common.c:274 [inline]
 do_syscall_64+0x2d7/0x2f0 arch/x86/entry/common.c:299
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

write to 0x888103566064 of 4 bytes by task 7604 on cpu 1:
 exit_signals+0x13b/0x490 kernel/signal.c:2822
 do_exit+0x1af/0x18e0 kernel/exit.c:825
 do_group_exit+0xb4/0x1c0 kernel/exit.c:983
 __do_sys_exit_group kernel/exit.c:994 [inline]
 __se_sys_exit_group kernel/exit.c:992 [inline]
 __x64_sys_exit_group+0x2e/0x30 kernel/exit.c:992
 do_syscall_64+0xcf/0x2f0 arch/x86/entry/common.c:296
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 7604 Comm: syz-executor.4 Not tainted 5.3.0+ #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011

==


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkal...@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.