Re: [PATCH] fix send_sigqueue() vs thread exit race

2005-08-23 Thread Thomas Gleixner
On Tue, 2005-08-23 at 20:17 +0400, Oleg Nesterov wrote: > Thomas Gleixner wrote: > > I still think the last patch I sent is still necessary. > > Thomas, you know that I like this change in __exit_{signal,sighand}, > but i think this change is dangerous, should go in a separate patch, > and needs

Re: [PATCH] fix send_sigqueue() vs thread exit race

2005-08-23 Thread Oleg Nesterov
Thomas Gleixner wrote: > > On Mon, 2005-08-22 at 20:45 +0400, Oleg Nesterov wrote: > > > > kernel/posix-timers.c:common_timer_del() calls del_timer_sync(), after > > that nobody can access this timer, so we don't need to lock timer->it_lock > > at all in this case. No lock - no deadlock. > > It

Re: [PATCH] fix send_sigqueue() vs thread exit race

2005-08-23 Thread Thomas Gleixner
On Mon, 2005-08-22 at 20:45 +0400, Oleg Nesterov wrote: > But I know nothing about kernel/posix-cpu-timers.c, I doubt it will work > for posix_cpu_timer_del(). I don't have time to study posix-cpu-timers now. > However, I see that __exit_signal() calls posix_cpu_timers_exit_xxx(), so > may be it

Re: [PATCH] fix send_sigqueue() vs thread exit race

2005-08-23 Thread Thomas Gleixner
On Mon, 2005-08-22 at 20:45 +0400, Oleg Nesterov wrote: > Thomas Gleixner wrote: > Ok, exit_itimers()->itimer_delete() called when the last thread exits > or does exec. > > kernel/posix-timers.c:common_timer_del() calls del_timer_sync(), after > that nobody can access this timer, so we don't need

Re: [PATCH] fix send_sigqueue() vs thread exit race

2005-08-23 Thread Thomas Gleixner
On Mon, 2005-08-22 at 20:45 +0400, Oleg Nesterov wrote: Thomas Gleixner wrote: Ok, exit_itimers()-itimer_delete() called when the last thread exits or does exec. kernel/posix-timers.c:common_timer_del() calls del_timer_sync(), after that nobody can access this timer, so we don't need to

Re: [PATCH] fix send_sigqueue() vs thread exit race

2005-08-23 Thread Thomas Gleixner
On Mon, 2005-08-22 at 20:45 +0400, Oleg Nesterov wrote: But I know nothing about kernel/posix-cpu-timers.c, I doubt it will work for posix_cpu_timer_del(). I don't have time to study posix-cpu-timers now. However, I see that __exit_signal() calls posix_cpu_timers_exit_xxx(), so may be it can

Re: [PATCH] fix send_sigqueue() vs thread exit race

2005-08-23 Thread Oleg Nesterov
Thomas Gleixner wrote: On Mon, 2005-08-22 at 20:45 +0400, Oleg Nesterov wrote: kernel/posix-timers.c:common_timer_del() calls del_timer_sync(), after that nobody can access this timer, so we don't need to lock timer-it_lock at all in this case. No lock - no deadlock. It still

Re: [PATCH] fix send_sigqueue() vs thread exit race

2005-08-23 Thread Thomas Gleixner
On Tue, 2005-08-23 at 20:17 +0400, Oleg Nesterov wrote: Thomas Gleixner wrote: I still think the last patch I sent is still necessary. Thomas, you know that I like this change in __exit_{signal,sighand}, but i think this change is dangerous, should go in a separate patch, and needs a lot

Re: [PATCH] fix send_sigqueue() vs thread exit race

2005-08-22 Thread Oleg Nesterov
Thomas Gleixner wrote: > > But we can not check for p->sighand == NULL, as sighand is released > after exit_itimers() so we are still deadlock prone. So I think > __exit_sighand() should be called before exit_itimers(). Then we can do > > retry: > if (unlikely(!p->sighand)) >

Re: [PATCH] fix send_sigqueue() vs thread exit race

2005-08-22 Thread Oleg Nesterov
Thomas Gleixner wrote: > > @@ -1427,7 +1434,18 @@ send_group_sigqueue(int sig, struct sigq > int ret = 0; > > BUG_ON(!(q->flags & SIGQUEUE_PREALLOC)); > - read_lock(_lock); > +retry: > + if (unlikely(p->flags & PF_EXITING)) > + return -1; > + I don't

Re: [PATCH] fix send_sigqueue() vs thread exit race

2005-08-22 Thread Thomas Gleixner
On Mon, 2005-08-22 at 10:39 +0400, Oleg Nesterov wrote: > Thomas Gleixner wrote: > > > > @@ -1427,7 +1434,18 @@ send_group_sigqueue(int sig, struct sigq > > int ret = 0; > > > > BUG_ON(!(q->flags & SIGQUEUE_PREALLOC)); > > - read_lock(_lock); > > +retry: > > + if

Re: [PATCH] fix send_sigqueue() vs thread exit race

2005-08-22 Thread Thomas Gleixner
Oleg, On Mon, 2005-08-22 at 12:52 +0400, Oleg Nesterov wrote: > Thomas Gleixner wrote: > I think yes, and this is closely related to your and Paul > "Use RCU to protect tasklist for unicast signals" patches. > > We don't need RCU here, but we do need this hypothetical > lock_task_sighand() (and

Re: [PATCH] fix send_sigqueue() vs thread exit race

2005-08-22 Thread Oleg Nesterov
Thomas Gleixner wrote: > > It exists. It triggers on preempt-RT and I can trigger it on vanilla SMP > by waiting for the timer expiry in release_task() before the > __exit_signal() call. That's reasonable, as it can happen that way by > chance too. It requires that the timer expires on a different

Re: [PATCH] fix send_sigqueue() vs thread exit race

2005-08-22 Thread Oleg Nesterov
Thomas Gleixner wrote: It exists. It triggers on preempt-RT and I can trigger it on vanilla SMP by waiting for the timer expiry in release_task() before the __exit_signal() call. That's reasonable, as it can happen that way by chance too. It requires that the timer expires on a different CPU,

Re: [PATCH] fix send_sigqueue() vs thread exit race

2005-08-22 Thread Thomas Gleixner
Oleg, On Mon, 2005-08-22 at 12:52 +0400, Oleg Nesterov wrote: Thomas Gleixner wrote: I think yes, and this is closely related to your and Paul Use RCU to protect tasklist for unicast signals patches. We don't need RCU here, but we do need this hypothetical lock_task_sighand() (and

Re: [PATCH] fix send_sigqueue() vs thread exit race

2005-08-22 Thread Thomas Gleixner
On Mon, 2005-08-22 at 10:39 +0400, Oleg Nesterov wrote: Thomas Gleixner wrote: @@ -1427,7 +1434,18 @@ send_group_sigqueue(int sig, struct sigq int ret = 0; BUG_ON(!(q-flags SIGQUEUE_PREALLOC)); - read_lock(tasklist_lock); +retry: + if

Re: [PATCH] fix send_sigqueue() vs thread exit race

2005-08-22 Thread Oleg Nesterov
Thomas Gleixner wrote: @@ -1427,7 +1434,18 @@ send_group_sigqueue(int sig, struct sigq int ret = 0; BUG_ON(!(q-flags SIGQUEUE_PREALLOC)); - read_lock(tasklist_lock); +retry: + if (unlikely(p-flags PF_EXITING)) + return -1; + I don't think

Re: [PATCH] fix send_sigqueue() vs thread exit race

2005-08-22 Thread Oleg Nesterov
Thomas Gleixner wrote: But we can not check for p-sighand == NULL, as sighand is released after exit_itimers() so we are still deadlock prone. So I think __exit_sighand() should be called before exit_itimers(). Then we can do retry: if (unlikely(!p-sighand))

Re: [PATCH] fix send_sigqueue() vs thread exit race

2005-08-21 Thread Thomas Gleixner
On Sun, 2005-08-21 at 23:24 +0200, Thomas Gleixner wrote: > Oleg, Sorry for making noise. I introduced the race again. tglx --- linux-2.6.13-rc6.signal/kernel/signal.oleg.c2005-08-21 23:07:03.0 +0200 +++ linux-2.6.13-rc6.signal/kernel/signal.c 2005-08-21 23:48:52.0

Re: [PATCH] fix send_sigqueue() vs thread exit race

2005-08-21 Thread Thomas Gleixner
Oleg, On Sun, 2005-08-21 at 14:59 +0400, Oleg Nesterov wrote: > CPU_0 CPU_1 > > release_task: posix_timer_fn: > write_lock(tasklist); lock(timer->it_lock); > > exit_timers:

Re: [PATCH] fix send_sigqueue() vs thread exit race

2005-08-21 Thread Thomas Gleixner
On Sun, 2005-08-21 at 14:41 +0400, Oleg Nesterov wrote: > If not: > > > timer event > > timr->it_process references a freed structure > > > > No, create_timer() does get_task_struct(timr->it_process), this > task may be EXIT_DEAD now, but the task_struct itself is valid, > and it's

Re: [PATCH] fix send_sigqueue() vs thread exit race

2005-08-21 Thread Oleg Nesterov
Thomas Gleixner wrote: > > On Sat, 2005-08-20 at 20:58 +0400, Oleg Nesterov wrote: > > [PATCH] fix send_sigqueue() vs thread exit race > > > > . > > - read_lock(_lock); > > + read_lock(_lock); > > + > > + if (unlikely(p-&g

Re: [PATCH] fix send_sigqueue() vs thread exit race

2005-08-21 Thread Thomas Gleixner
On Sat, 2005-08-20 at 20:58 +0400, Oleg Nesterov wrote: > [PATCH] fix send_sigqueue() vs thread exit race > > posix_timer_event() first checks that the thread (SIGEV_THREAD_ID > case) does not have PF_EXITING flag, then it calls send_sigqueue() > which locks task list. But if t

Re: [PATCH] fix send_sigqueue() vs thread exit race

2005-08-21 Thread Thomas Gleixner
On Sat, 2005-08-20 at 20:58 +0400, Oleg Nesterov wrote: [PATCH] fix send_sigqueue() vs thread exit race posix_timer_event() first checks that the thread (SIGEV_THREAD_ID case) does not have PF_EXITING flag, then it calls send_sigqueue() which locks task list. But if the thread exits

Re: [PATCH] fix send_sigqueue() vs thread exit race

2005-08-21 Thread Oleg Nesterov
Thomas Gleixner wrote: On Sat, 2005-08-20 at 20:58 +0400, Oleg Nesterov wrote: [PATCH] fix send_sigqueue() vs thread exit race . - read_lock(tasklist_lock); + read_lock(tasklist_lock); + + if (unlikely(p-flags PF_EXITING)) { + ret = -1

Re: [PATCH] fix send_sigqueue() vs thread exit race

2005-08-21 Thread Thomas Gleixner
Oleg, On Sun, 2005-08-21 at 14:59 +0400, Oleg Nesterov wrote: CPU_0 CPU_1 release_task: posix_timer_fn: write_lock(tasklist); lock(timer-it_lock); exit_timers:send_sigxxx();

Re: [PATCH] fix send_sigqueue() vs thread exit race

2005-08-21 Thread Thomas Gleixner
On Sun, 2005-08-21 at 14:41 +0400, Oleg Nesterov wrote: If not: timer event timr-it_process references a freed structure No, create_timer() does get_task_struct(timr-it_process), this task may be EXIT_DEAD now, but the task_struct itself is valid, and it's -flags has

Re: [PATCH] fix send_sigqueue() vs thread exit race

2005-08-21 Thread Thomas Gleixner
On Sun, 2005-08-21 at 23:24 +0200, Thomas Gleixner wrote: Oleg, Sorry for making noise. I introduced the race again. tglx --- linux-2.6.13-rc6.signal/kernel/signal.oleg.c2005-08-21 23:07:03.0 +0200 +++ linux-2.6.13-rc6.signal/kernel/signal.c 2005-08-21 23:48:52.0

[PATCH] fix send_sigqueue() vs thread exit race

2005-08-20 Thread Oleg Nesterov
[PATCH] fix send_sigqueue() vs thread exit race posix_timer_event() first checks that the thread (SIGEV_THREAD_ID case) does not have PF_EXITING flag, then it calls send_sigqueue() which locks task list. But if the thread exits in between the kernel will oops (->sighand == NULL af

[PATCH] fix send_sigqueue() vs thread exit race

2005-08-20 Thread Oleg Nesterov
[PATCH] fix send_sigqueue() vs thread exit race posix_timer_event() first checks that the thread (SIGEV_THREAD_ID case) does not have PF_EXITING flag, then it calls send_sigqueue() which locks task list. But if the thread exits in between the kernel will oops (-sighand == NULL after