Re: [PATCH 5/6] finish_utrace_stop: check -stopped lockless
On 08/03, Roland McGrath wrote: But I think we do not care. What we do care is: -stopped must be F after finish_utrace_stop(), whoever checks it under utrace-lock. We do care about a false positive if it makes finish_utrace_stop() hit its WARN_ON and return true when there was no SIGKILL. Yes sure. Now I see I misunderstood your previous email. I think that this type of false positive is not possible. Please see below. when utrace_do_stop() sets -stopped, we hold -siglock, this means we can't race with SIGCONT/SIGKILL. If the signal somes after that, we can rely on rq-lock, both try_to_wake_up() and schedule() take this lock, it should act as a barrier. I think. Ok. If we are in fact relying on any implicit barrier semantics like that, we should have a clear comment about it. Ah. I guess I just added the unnecessary confusion. Indeed, we should not rely on any implicit barrier semantics with rq-lock. What I meant, we can rely on the fact that any wakeup (try_to_wakeup() actually) must be a barrier wrt schedule(). IOW, if we do VAR = 1; try_to_wake_up(TASK); Then TASK can safely do set_current_state(); schedule(); BUG_ON(VAR != 1); this is true even if the task does not sleep, in case it is woken before it enters schedule(). So. If utrace_control() path changes -stopped and wakes up the tracee, it must see the correct value of -stopped. This is simple. But what if the tracee is woken by CONT/KILL? Should the tracee see the last change in -stopped? The signal can race with utrace_wakeup() or utrace_do_stop(), there are a lot of cases when the woken tracee can look at -stopped while it is changed by utrace. But I think all we need is to be sure that: - if -stopped == T, we have a pending SIGKILL If we were woken by SIGCONT which comes right after utrace_wakeup() does s/TRACED/STOPPED/, I think we must see the result of -stopped = 0 in utrace_wakeup(). Both utrace_wakeup() and signal_wake_up() take -siglock, the memory change in -stopped must be visible to utrace_wakeup() and the wakee. - If we were killed, we must see -stopped == T _unless_ utrace_wakeup() is in progress or we are already woken from utrace pov. Now we should worry about SIGKILL racing with utrace_do_stop(). Again, both take -siglock, etc. But in short, I can't understand how utrace-lock can really help, given that a signal changes task's state without this lock. It can only help to avoid the races with utrace playing with -stopped in parallel, but I see nothing dangerous here. BTW, I think that finish_utrace_stop() doesn't need -siglock to check the pending SIGKILL. Shouldn't one always hold siglock to use task-pending at all? Why? It only checks the bit. Note that fatal_signal_pending() is used lockless. But this discussion makes me wonder, what report-killed and utrace_stop's returned value buy us? Can't we kill report-killed and make utrace_stop() return void? With or without this patch. Suppose that (say) utrace_report_syscall_entry() does utrace_stop() and sleeps in TASK_TRACED. Another thread starts the group stop and sets SIGNAL_STOP_STOPPED. utrace_wakeup() clears -stopped, notices SIGNAL_STOP_STOPPED and doesn't wake up. Now, we can sleep a long time. Then SIGKILL kills us, but finish_utrace_stop() returns false. Afaics, only utrace_get_signal() uses report.kill, and only utrace_report_syscall_entry() uses the result of utrace_stop(). Both can just check fatal_signal_pending(). - Currently, if we see -stopped == T under utrace-lock we know that the tracee can do nothing interesting from utrace pov. It can't be waked up by utrace_wakeup(). If the tracee is killed it must take -lock to clear -stopped before it can do anything else. Right. My original thinking was that it must always get into utrace_get_signal() right then, and that would be responsible for any synchronization that mattered. If we remove -stopped, we can't rely on TASK_TRACED in the same manner. For example, a killed tracee can can call utrace_report_jctl()-REPORT() while utrace thinks it is stopped. You mean the tracehook_notify_jctl(why, CLD_CONTINUED) case, right? Not sure. utrace_get_signal() can call -report_signal() before it takes utrace-lock. I wonder if it might not make sense anyway to suppress this for the SIGKILL case. Not just the tracing, but the entire CLD_CONTINUED notification. A parent doesn't really need a SIGCHLD,CLD_CONTINUED immediately followed by a SIGCHLD,CLD_KILLED. Heh. Actually, I was wrong. utrace_report_jctl() is not possible if the tracee is killed. complete_signal(SIGKILL) sets -flags = SIGNAL_GROUP_EXIT. - The exiting task with _UTRACE_DEATH_EVENTS can be considered ^dead/dying ^out as quiescent. But, without
Re: [PATCH 5/6] finish_utrace_stop: check -stopped lockless
What I meant, we can rely on the fact that any wakeup (try_to_wakeup() actually) must be a barrier wrt schedule(). Ok. But what if the tracee is woken by CONT/KILL? Should the tracee see the last change in -stopped? The signal can race with utrace_wakeup() or utrace_do_stop(), [...] It can't actually race with utrace_do_stop, because that holds siglock while touching -stopped. Any signal case should already exclude utrace_do_stop that way. So it's only utrace_wakeup, which clears -stopped before taking siglock--and that's the harmless direction to race. But I think all we need is to be sure that: - if -stopped == T, we have a pending SIGKILL If we were woken by SIGCONT which comes right after utrace_wakeup() does s/TRACED/STOPPED/, I think we must see the result of -stopped = 0 in utrace_wakeup(). Both utrace_wakeup() and signal_wake_up() take -siglock, the memory change in -stopped must be visible to utrace_wakeup() and the wakee. Ok. - If we were killed, we must see -stopped == T _unless_ utrace_wakeup() is in progress or we are already woken from utrace pov. Right. In those cases it doesn't matter whether we think we were killed or not. It can be before the wakeup or after the wakeup. Now we should worry about SIGKILL racing with utrace_do_stop(). Again, both take -siglock, etc. But in short, I can't understand how utrace-lock can really help, given that a signal changes task's state without this lock. It can only help to avoid the races with utrace playing with -stopped in parallel, but I see nothing dangerous here. Right, I just meant about keeping -stopped correct in the signal case. i.e., in case of signal wakeup, it could never see an old -stopped value. Shouldn't one always hold siglock to use task-pending at all? Why? It only checks the bit. Note that fatal_signal_pending() is used lockless. Ok. But this discussion makes me wonder, what report-killed and utrace_stop's returned value buy us? Can't we kill report-killed and make utrace_stop() return void? I think at some point earlier I was not confident that a local pending SIGKILL would always be there. This was before we had fatal_signal_pending et al. It might not have been a valid worry even then. With or without this patch. Suppose that (say) utrace_report_syscall_entry() does utrace_stop() and sleeps in TASK_TRACED. Another thread starts the group stop and sets SIGNAL_STOP_STOPPED. utrace_wakeup() clears -stopped, notices SIGNAL_STOP_STOPPED and doesn't wake up. Now, we can sleep a long time. Then SIGKILL kills us, but finish_utrace_stop() returns false. Yes, that's a good point. Afaics, only utrace_get_signal() uses report.kill, and only utrace_report_syscall_entry() uses the result of utrace_stop(). Both can just check fatal_signal_pending(). Ok. You mean the tracehook_notify_jctl(why, CLD_CONTINUED) case, right? Not sure. utrace_get_signal() can call -report_signal() before it takes utrace-lock. Right. It couldn't before af3eb44. Heh. Actually, I was wrong. utrace_report_jctl() is not possible if the tracee is killed. complete_signal(SIGKILL) sets -flags = SIGNAL_GROUP_EXIT. Ah, good. OTOH we can just keep utrace_finish_jctl() and have it and finish_utrace_stop() just always take the utrace lock purely to synchronize (and then drop it without changing anything). or we can just do spin_unlock_wait(). I guess that has the same effect as lock+unlock, sure. Anyway, I believe that this change, even if good, is low priority. I mean, it doesn't immediately makes possible other improvements, afaics. But this all is up to you. No, but it is up to you! ;-) The point of the exercise is to get utrace merged upstream. We will of course keep refining it in the future. My attention to the idea of dropping -stopped now was purely on the thought that it might make the code easier to understand/review/believe correct so it could help with upstream reviewers liking it for merge. So do it if it helps or do the opposite if that helps, or play it by ear as feels right. Thanks, Roland
Re: [PATCH 5/6] finish_utrace_stop: check -stopped lockless
On 08/01, Roland McGrath wrote: finish_utrace_stop() can check -stopped lockless. It was set by us, we can't miss it. We enter utrace_stop() for some stop. Either then or later, a group jctl stop finishes and sets SIGNAL_STOP_STOPPED. Later, utrace_wakeup() sees that and we switch to TASK_STOPPED after clearing -stopped. We stay in jctl stop for a few days. Some new debugger comes along with utrace_control(,,UTRACE_STOP) and utrace_do_stop() switches us to TASK_TRACED after setting -stopped. Meanwhile, SIGCONT is coming along and clearing SIGNAL_STOP_STOPPED (sibling threads run again, etc.). Now the debugger calls utrace_control(,,UTRACE_RESUME), so utrace_wakeup clears -stopped and wakes us up. Now -stopped was just an instant ago set and then cleared on the other CPU and we are running. Are we really sure that we see it as clear rather than set? Of course, we can have the false positive. Even simpler: by the time finish_utrace_stop() takes utrace-lock the tracer can clear -stopped even if it was really set after schedule(). But I think we do not care. What we do care is: -stopped must be F after finish_utrace_stop(), whoever checks it under utrace-lock. And I think this must be true. If -stopped was ever cleared before finish_utrace_stop(), nobody can set it again in such a way that we can miss it. when utrace_do_stop() sets -stopped, we hold -siglock, this means we can't race with SIGCONT/SIGKILL. If the signal somes after that, we can rely on rq-lock, both try_to_wake_up() and schedule() take this lock, it should act as a barrier. I think. Perhaps it makes sense to move utrace_stop()-recalc_sigpending() code up, before finish_utrace_stop(). This way it would be a bit clearer why we can't race with signals. BTW, I think that finish_utrace_stop() doesn't need -siglock to check the pending SIGKILL. Anyway, this change is very minor. The only reason I sent this patch is that I spent some time trying to understand which races this unconditional spin_lock(utrace-lock) tries to close. But I think that I can even worry about this is a good indicator that we'd probably be happier if we can indeed get rid of -stopped altogether. I think you are right, and -stopped _can_ go away. But, as a devil's advocate, I'd like to give a couple of weak arguments against this change. - Currently, if we see -stopped == T under utrace-lock we know that the tracee can do nothing interesting from utrace pov. It can't be waked up by utrace_wakeup(). If the tracee is killed it must take -lock to clear -stopped before it can do anything else. If we remove -stopped, we can't rely on TASK_TRACED in the same manner. For example, a killed tracee can can call utrace_report_jctl()-REPORT() while utrace thinks it is stopped. - The exiting task with _UTRACE_DEATH_EVENTS can be considered as quiescent. But, without -stopped, looking at this task we can't know if some engine wants this task to be stopped. IOW, if we see such a task we can't figure out was utrace_do_stop() called or not. - I _think_ that -stopped makes the code a bit more readable and understandable. More explicit. But this is really subjective. On the other hand I agree with your arguments. And, as you pointed out, we can kill utrace_finish_jctl(). In short - I do not know. Oleg.
Re: [PATCH 5/6] finish_utrace_stop: check -stopped lockless
Of course, we can have the false positive. Even simpler: by the time finish_utrace_stop() takes utrace-lock the tracer can clear -stopped even if it was really set after schedule(). Right. It didn't even occur to me that the only scenario I thought of was a false positive. I was just thinking through the statement you made about why lockless was OK (we will have set it ourselves). But I think we do not care. What we do care is: -stopped must be F after finish_utrace_stop(), whoever checks it under utrace-lock. We do care about a false positive if it makes finish_utrace_stop() hit its WARN_ON and return true when there was no SIGKILL. when utrace_do_stop() sets -stopped, we hold -siglock, this means we can't race with SIGCONT/SIGKILL. If the signal somes after that, we can rely on rq-lock, both try_to_wake_up() and schedule() take this lock, it should act as a barrier. I think. Ok. If we are in fact relying on any implicit barrier semantics like that, we should have a clear comment about it. Perhaps it makes sense to move utrace_stop()-recalc_sigpending() code up, before finish_utrace_stop(). This way it would be a bit clearer why we can't race with signals. Ok. I'm not quite seeing just now why this is clearer. But if you have a patch with comments in it, I will probably understand why. BTW, I think that finish_utrace_stop() doesn't need -siglock to check the pending SIGKILL. Shouldn't one always hold siglock to use task-pending at all? Anyway, this change is very minor. The only reason I sent this patch is that I spent some time trying to understand which races this unconditional spin_lock(utrace-lock) tries to close. Good! Indeed I don't want to waste too much time on minor things, and we can always clean up more later. But it is crucial that you are thinking thoroughly about all the locking and race questions in the code, so that is always time well spent. I think you are right, and -stopped _can_ go away. But, as a devil's advocate, I'd like to give a couple of weak arguments against this change. Sure! We want to go whichever way makes the code overall easiest to follow, review, and be convinced it's right. - Currently, if we see -stopped == T under utrace-lock we know that the tracee can do nothing interesting from utrace pov. It can't be waked up by utrace_wakeup(). If the tracee is killed it must take -lock to clear -stopped before it can do anything else. Right. My original thinking was that it must always get into utrace_get_signal() right then, and that would be responsible for any synchronization that mattered. If we remove -stopped, we can't rely on TASK_TRACED in the same manner. For example, a killed tracee can can call utrace_report_jctl()-REPORT() while utrace thinks it is stopped. You mean the tracehook_notify_jctl(why, CLD_CONTINUED) case, right? That is literally the only thing that happens after do_signal_stop() wakes up and before utrace_get_signal() is called. I wonder if it might not make sense anyway to suppress this for the SIGKILL case. Not just the tracing, but the entire CLD_CONTINUED notification. A parent doesn't really need a SIGCHLD,CLD_CONTINUED immediately followed by a SIGCHLD,CLD_KILLED. OTOH, there is still the general point. utrace_get_signal() doesn't take the utrace lock unconditionally, and I don't think we want it to. Even if the only thing that happens is dequeuing SIGKILL, then that could still get to utrace_report_exit(). - The exiting task with _UTRACE_DEATH_EVENTS can be considered ^dead/dying ^out as quiescent. But, without -stopped, looking at this task we can't know if some engine wants this task to be stopped. IOW, if we see such a task we can't figure out was utrace_do_stop() called or not. (I try to reserve exiting for PF_EXITING and dying for running with -exit_state set, because the difference is important.) Who cares? There is no meaning to UTRACE_STOP for a dying or dead task. It counts as quiescent for purposes of utrace_control knowing what it can do, that's all. If it's not quiescent yet, it won't ever stop, it will just finish dying (i.e. report_death callbacks will finish). - I _think_ that -stopped makes the code a bit more readable and understandable. More explicit. But this is really subjective. Ok. I'll leave it to you. My first reaction was just that fewer state bits means less bookkeeping to convince oneself is correct, so easier to read in that way. But it's not a strong opinion. On the other hand I agree with your arguments. And, as you pointed out, we can kill utrace_finish_jctl(). In short - I do not know. I don't either. Your first point above is fairly compelling. OTOH we can just keep utrace_finish_jctl() and have it and finish_utrace_stop() just always take the utrace lock purely to synchronize (and then
Re: [PATCH 5/6] finish_utrace_stop: check -stopped lockless
finish_utrace_stop() can check -stopped lockless. It was set by us, we can't miss it. We enter utrace_stop() for some stop. Either then or later, a group jctl stop finishes and sets SIGNAL_STOP_STOPPED. Later, utrace_wakeup() sees that and we switch to TASK_STOPPED after clearing -stopped. We stay in jctl stop for a few days. Some new debugger comes along with utrace_control(,,UTRACE_STOP) and utrace_do_stop() switches us to TASK_TRACED after setting -stopped. Meanwhile, SIGCONT is coming along and clearing SIGNAL_STOP_STOPPED (sibling threads run again, etc.). Now the debugger calls utrace_control(,,UTRACE_RESUME), so utrace_wakeup clears -stopped and wakes us up. Now -stopped was just an instant ago set and then cleared on the other CPU and we are running. Are we really sure that we see it as clear rather than set? (I think there is also a racier version without jctl stops at all, just where utrace_wakeup+utrace_do_stop+utrace_wakeup all came and went before we actually got scheduled.) I might be overlooking some interlock that covers this. I put the patch in, anyway. But I think that I can even worry about this is a good indicator that we'd probably be happier if we can indeed get rid of -stopped altogether. Thanks, Roland