Re: [PATCH] RFC: have tcp_recvmsg() check kthread_should_stop() and treat it as if it were signalled
Hi Oleg, On 6/28/07, Oleg Nesterov <[EMAIL PROTECTED]> wrote: On 06/28, Satyam Sharma wrote: > > Second, we *must* break that tcp_recvmsg() inside the kthread's > main loop, of course! We want it stopped, after all, and if we don't > make it "break" out of that function, the kthread _will_never_exit_. In that case this kthread is buggy. We have sock->sk_rcvtimeo. > Please note that this > whole thing is about functions that will _simply_*never*_exit_ever_ > _unless_ given a signal. ditto. kthread should not do this. Well, I definitely wouldn't call it "buggy" ... skb_recv_datagram() (if with sock->sk_rcvtimeo != MAX_SCHEDULE_TIMEOUT) would then needlessly have to be put into it's own little while(1) (or put a "continue;" after it back to main kthread loop). A question arises, what timeout value to use? (too little => needless wastage of CPU; too high => see below) More importantly, the other thread that does a kthread_stop() on our kthread (probably a umount(2) or rmmod) would then unfortunately hang (on wait_for_completion i.e. TASK_UNINTERRUPTIBLE) for the duration of the time it takes for our kthread to finish it's timeout, which plays havoc with userspace scripts. OK, I suggest to stop this thread. I don't claim you are wrong, just we think differently ;) That's fine, we can still "agree to disagree" here :-) Cheers, Satyam - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] RFC: have tcp_recvmsg() check kthread_should_stop() and treat it as if it were signalled
On Thu, 28 Jun 2007 21:08:25 +0400 Oleg Nesterov <[EMAIL PROTECTED]> wrote: > On 06/28, Satyam Sharma wrote: > > > > Second, we *must* break that tcp_recvmsg() inside the kthread's > > main loop, of course! We want it stopped, after all, and if we don't > > make it "break" out of that function, the kthread _will_never_exit_. > > In that case this kthread is buggy. We have sock->sk_rcvtimeo. > > > Please note that this > > whole thing is about functions that will _simply_*never*_exit_ever_ > > _unless_ given a signal. > > ditto. kthread should not do this. > > OK, I suggest to stop this thread. I don't claim you are wrong, just > we think differently ;) > > > >This is what I can't understand completely. Why should we check SIGKILL > > >or signal_pending() in addition to kthread_stop_info.k, what is the point? > > > > ... so kthread_stop_info will go away too. > > it should go away regardless, we have patches. Still I see no point > to check signal_pending() in kthread_stop(). > > Oleg. > Again, this isn't an area where I have great expertise... Adding signalling into kthread_stop would seem to be making some assumptions about how kthreads are used and how they should respond to signals. That sounds unsafe and might potentially limit flexibility. agree with Oleg here -- I don't see the benefit to doing this for all kthreads. If you're aware that your kthread might be blocked on a call, then it doesn't seem burdensome to add in an extra send/force_sig call. -- Jeff Layton <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] RFC: have tcp_recvmsg() check kthread_should_stop() and treat it as if it were signalled
On 06/28, Satyam Sharma wrote: > > Second, we *must* break that tcp_recvmsg() inside the kthread's > main loop, of course! We want it stopped, after all, and if we don't > make it "break" out of that function, the kthread _will_never_exit_. In that case this kthread is buggy. We have sock->sk_rcvtimeo. > Please note that this > whole thing is about functions that will _simply_*never*_exit_ever_ > _unless_ given a signal. ditto. kthread should not do this. OK, I suggest to stop this thread. I don't claim you are wrong, just we think differently ;) > >This is what I can't understand completely. Why should we check SIGKILL > >or signal_pending() in addition to kthread_stop_info.k, what is the point? > > ... so kthread_stop_info will go away too. it should go away regardless, we have patches. Still I see no point to check signal_pending() in kthread_stop(). Oleg. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] RFC: have tcp_recvmsg() check kthread_should_stop() and treat it as if it were signalled
On 6/28/07, Satyam Sharma <[EMAIL PROTECTED]> wrote: [...] [ BTW even there we're safe as long as we check kthread_stop() _before_ flushing or ^^ Whoops, that should have been "kthread_should_stop()". dequeueing the signals, but then that'll be an ugly rule to try and enforce, of course. ] Satyam - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] RFC: have tcp_recvmsg() check kthread_should_stop() and treat it as if it were signalled
On 6/28/07, Oleg Nesterov <[EMAIL PROTECTED]> wrote: > [...] > Hmm... actually, such a change breaks the > > while (signal_pending(current)) > dequeue_signal_and_so_something(); > > loop, see jffs2_garbage_collect_thread() for example. BTW jffs2_garbage_collect_thread() is a horrible abomination :-) Its use of SIGSTOP and SIGHUP is *totally* gratuitous & unwarranted. It does use SIGKILL, but simply as a stop-notification from umount of the corresponding jffs2 partition. I think all the signal handling there can be removed; then it needs to undergo conversion to kthread (it uses horrible locks and completions to handle its exit) -- I'll put it in my endless kernel-cleanups-todo-list ... - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] RFC: have tcp_recvmsg() check kthread_should_stop() and treat it as if it were signalled
On 6/28/07, Oleg Nesterov <[EMAIL PROTECTED]> wrote: (trimmed the cc: list) On 06/28, Satyam Sharma wrote: > On 6/27/07, Oleg Nesterov <[EMAIL PROTECTED]> wrote: > > > >Contrary, I believe we should avoid signals when it > >comes to kernel threads. > > And I agree, but there's quite a subtle difference between signals being > used like they normally are, and this case here. Fact is, there is simply > no other way to break out of certain functions (if there was, I would've > preferred that myself). > > In fact, what I'm proposing is that kthreads should *not* be tinkering > with (flushing, handling, dequeueing, whatever) signals at all, because > like you mentioned, if they do that, it's possible that the TIF_SIGPENDING > could get lost. But we do have kthreads which call dequeue_signal(). And perhaps some kthread treats SIGKILL as "urgent exit with a lot of printks" while kthread_should_stop() means "exit gracefully". [..] I believe kthread_stop() should not send the signal. Just because it could be actually dequeued by kthread, and it may have some special meaning for this kthread. [...] Hmm... actually, such a change breaks the while (signal_pending(current)) dequeue_signal_and_so_something(); loop, see jffs2_garbage_collect_thread() for example. In short, I think that kthread_stop() + TIF_SIGPENDING should be a special case, the signal should be sent explicitly before kthread_stop(), like cifs does. The existence of such kthreads (that dabble in signals, although I guess everybody here agrees kernel threads have no business dabbling in them) is the *only* reason I didn't submit my proposal as an actual patch :-) Perhaps "one day" we'll clean up all such cases, and fix up kthread semantics to simply outlaw signal-handling in kernel threads (I just can't convince myself there could be a good justifiable reason to do that). When that's done, and seeing that kthreadd_setup() does ignore_signals(), kthread_stop() could become simply force_sig() ... [...] This is what I can't understand completely. Why should we check SIGKILL or signal_pending() in addition to kthread_stop_info.k, what is the point? ... so kthread_stop_info will go away too. After all, the caller of kthread_stop(k) should know what and why k does. One, the kthread code could change over time. Two, the solution to the problem is un-intuitive for the user to do. The API should know such cases, and handle them well too. The caller of kthread_stop() would expect it to do the expected, after all (i.e. stop the kthread :-) In any case, that kind of the changes can break things, just because this means API change. Well, kthreads are a kernel-internal API anyway, I guess as long as we fix all users, we're fine. > >I am talking about the case > >when wait_event_interruptible() should not fail (unless something bad > >happens) inside the "while (!kthread_should_stop())" loop. > >Note also that kthread could use TASK_INTERRUPTIBLE sleep > >[...] and because it knows that all signals are ignored. > > Ok, I think you're saying that if a kthread used wait_event_interruptible > (and was not expecting signals, because it ignores them), then bad > things (say exit in inconsistent state, etc) will happen if we break that > wait_event_interruptible unexpectedly. No. Of course, kthread should check the error and doesn't exit in inconsistent state. The question is: why should we break (say) tcp_recvmsg() inside "while (!kthread_should_stop())" loop if it is supposed to succeed unless something bad happens? (I mean, we may have a kthread which doesn't expect the failure unless something bad happens). First, I don't quite understand this "not expecting failure" / "something bad happens" bit at all. Second, we *must* break that tcp_recvmsg() inside the kthread's main loop, of course! We want it stopped, after all, and if we don't make it "break" out of that function, the kthread _will_never_exit_. [ What's worse, several kthreads are part of modules, and are created during the module_init() and so the module_exit() function needs to call kthread_stop() to clean it up. But the wait_for_completion() in kthread_stop() will never unblock, and so this would also mean the rmmod will _hang_ and module will never get unloaded too. ] OK, let me give a silly example. The correctly written kthread should check the result of kmalloc(), yes? kthread(k) means that k should exit anyway, yes? So let's change kthread_stop(k) to also set TIF_MEMDIE, this means that __alloc_pages() will fail quickly when get_page_from_freelist() doesn't succeed, but won't start pageout() which may take a while. Please don't explain me why this suggestion is bad :), I am just trying to illustrate my point. Your example / analogy is indeed *very* bad :-) Please note that this whole thing is about functions that will _simply_*never*_exit_ever_ _unless_ given a signal. [ I think you've been misunderstanding my proposal that I want to send a SIGKILL
Re: [PATCH] RFC: have tcp_recvmsg() check kthread_should_stop() and treat it as if it were signalled
(trimmed the cc: list) On 06/28, Satyam Sharma wrote: > > On 6/27/07, Oleg Nesterov <[EMAIL PROTECTED]> wrote: > > > >Contrary, I believe we should avoid signals when it > >comes to kernel threads. > > And I agree, but there's quite a subtle difference between signals being > used like they normally are, and this case here. Fact is, there is simply > no other way to break out of certain functions (if there was, I would've > preferred that myself). > > In fact, what I'm proposing is that kthreads should *not* be tinkering > with (flushing, handling, dequeueing, whatever) signals at all, because > like you mentioned, if they do that, it's possible that the TIF_SIGPENDING > could get lost. But we do have kthreads which call dequeue_signal(). And perhaps some kthread treats SIGKILL as "urgent exit with a lot of printks" while kthread_should_stop() means "exit gracefully". > >I am talking about the case > >when wait_event_interruptible() should not fail (unless something bad > >happens) inside the "while (!kthread_should_stop())" loop. > >Note also that kthread could use TASK_INTERRUPTIBLE sleep > >[...] and because it knows that all signals are ignored. > > Ok, I think you're saying that if a kthread used wait_event_interruptible > (and was not expecting signals, because it ignores them), then bad > things (say exit in inconsistent state, etc) will happen if we break that > wait_event_interruptible unexpectedly. No. Of course, kthread should check the error and doesn't exit in inconsistent state. The question is: why should we break (say) tcp_recvmsg() inside "while (!kthread_should_stop())" loop if it is supposed to succeed unless something bad happens? (I mean, we may have a kthread which doesn't expect the failure unless something bad happens). OK, let me give a silly example. The correctly written kthread should check the result of kmalloc(), yes? kthread(k) means that k should exit anyway, yes? So let's change kthread_stop(k) to also set TIF_MEMDIE, this means that __alloc_pages() will fail quickly when get_page_from_freelist() doesn't succeed, but won't start pageout() which may take a while. Please don't explain me why this suggestion is bad :), I am just trying to illustrate my point. I believe kthread_stop() should not send the signal. Just because it could be actually dequeued by kthread, and it may have some special meaning for this kthread. Perhaps it makes sense to do signal_wake_up() (sets TIF_SIGPENDING and wakes up), recalc_sigpending() could be changed to take kthread_should_stop() into account (so TIF_SIGPENDING can't be cleared). Once again, I _personally_ do not like this too much, but yes, I see your point, and I can't say such a change is "wrong". Hmm... actually, such a change breaks the while (signal_pending(current)) dequeue_signal_and_so_something(); loop, see jffs2_garbage_collect_thread() for example. In short, I think that kthread_stop() + TIF_SIGPENDING should be a special case, the signal should be sent explicitly before kthread_stop(), like cifs does. After all, the caller of kthread_stop(k) should know what and why k does. In any case, that kind of the changes can break things, just because this means API change. > And thirdly, what I'm proposing is putting the check for checking the > SIGKILL in kthread_should_stop itself, in /addition/ to the > kthread_stop_info.k == current check. This is what I can't understand completely. Why should we check SIGKILL or signal_pending() in addition to kthread_stop_info.k, what is the point? Oleg. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] RFC: have tcp_recvmsg() check kthread_should_stop() and treat it as if it were signalled
(trimmed the cc: list) On 06/28, Satyam Sharma wrote: On 6/27/07, Oleg Nesterov [EMAIL PROTECTED] wrote: Contrary, I believe we should avoid signals when it comes to kernel threads. And I agree, but there's quite a subtle difference between signals being used like they normally are, and this case here. Fact is, there is simply no other way to break out of certain functions (if there was, I would've preferred that myself). In fact, what I'm proposing is that kthreads should *not* be tinkering with (flushing, handling, dequeueing, whatever) signals at all, because like you mentioned, if they do that, it's possible that the TIF_SIGPENDING could get lost. But we do have kthreads which call dequeue_signal(). And perhaps some kthread treats SIGKILL as urgent exit with a lot of printks while kthread_should_stop() means exit gracefully. I am talking about the case when wait_event_interruptible() should not fail (unless something bad happens) inside the while (!kthread_should_stop()) loop. Note also that kthread could use TASK_INTERRUPTIBLE sleep [...] and because it knows that all signals are ignored. Ok, I think you're saying that if a kthread used wait_event_interruptible (and was not expecting signals, because it ignores them), then bad things (say exit in inconsistent state, etc) will happen if we break that wait_event_interruptible unexpectedly. No. Of course, kthread should check the error and doesn't exit in inconsistent state. The question is: why should we break (say) tcp_recvmsg() inside while (!kthread_should_stop()) loop if it is supposed to succeed unless something bad happens? (I mean, we may have a kthread which doesn't expect the failure unless something bad happens). OK, let me give a silly example. The correctly written kthread should check the result of kmalloc(), yes? kthread(k) means that k should exit anyway, yes? So let's change kthread_stop(k) to also set TIF_MEMDIE, this means that __alloc_pages() will fail quickly when get_page_from_freelist() doesn't succeed, but won't start pageout() which may take a while. Please don't explain me why this suggestion is bad :), I am just trying to illustrate my point. I believe kthread_stop() should not send the signal. Just because it could be actually dequeued by kthread, and it may have some special meaning for this kthread. Perhaps it makes sense to do signal_wake_up() (sets TIF_SIGPENDING and wakes up), recalc_sigpending() could be changed to take kthread_should_stop() into account (so TIF_SIGPENDING can't be cleared). Once again, I _personally_ do not like this too much, but yes, I see your point, and I can't say such a change is wrong. Hmm... actually, such a change breaks the while (signal_pending(current)) dequeue_signal_and_so_something(); loop, see jffs2_garbage_collect_thread() for example. In short, I think that kthread_stop() + TIF_SIGPENDING should be a special case, the signal should be sent explicitly before kthread_stop(), like cifs does. After all, the caller of kthread_stop(k) should know what and why k does. In any case, that kind of the changes can break things, just because this means API change. And thirdly, what I'm proposing is putting the check for checking the SIGKILL in kthread_should_stop itself, in /addition/ to the kthread_stop_info.k == current check. This is what I can't understand completely. Why should we check SIGKILL or signal_pending() in addition to kthread_stop_info.k, what is the point? Oleg. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] RFC: have tcp_recvmsg() check kthread_should_stop() and treat it as if it were signalled
On 6/28/07, Oleg Nesterov [EMAIL PROTECTED] wrote: (trimmed the cc: list) On 06/28, Satyam Sharma wrote: On 6/27/07, Oleg Nesterov [EMAIL PROTECTED] wrote: Contrary, I believe we should avoid signals when it comes to kernel threads. And I agree, but there's quite a subtle difference between signals being used like they normally are, and this case here. Fact is, there is simply no other way to break out of certain functions (if there was, I would've preferred that myself). In fact, what I'm proposing is that kthreads should *not* be tinkering with (flushing, handling, dequeueing, whatever) signals at all, because like you mentioned, if they do that, it's possible that the TIF_SIGPENDING could get lost. But we do have kthreads which call dequeue_signal(). And perhaps some kthread treats SIGKILL as urgent exit with a lot of printks while kthread_should_stop() means exit gracefully. [..] I believe kthread_stop() should not send the signal. Just because it could be actually dequeued by kthread, and it may have some special meaning for this kthread. [...] Hmm... actually, such a change breaks the while (signal_pending(current)) dequeue_signal_and_so_something(); loop, see jffs2_garbage_collect_thread() for example. In short, I think that kthread_stop() + TIF_SIGPENDING should be a special case, the signal should be sent explicitly before kthread_stop(), like cifs does. The existence of such kthreads (that dabble in signals, although I guess everybody here agrees kernel threads have no business dabbling in them) is the *only* reason I didn't submit my proposal as an actual patch :-) Perhaps one day we'll clean up all such cases, and fix up kthread semantics to simply outlaw signal-handling in kernel threads (I just can't convince myself there could be a good justifiable reason to do that). When that's done, and seeing that kthreadd_setup() does ignore_signals(), kthread_stop() could become simply force_sig() ... [...] This is what I can't understand completely. Why should we check SIGKILL or signal_pending() in addition to kthread_stop_info.k, what is the point? ... so kthread_stop_info will go away too. After all, the caller of kthread_stop(k) should know what and why k does. One, the kthread code could change over time. Two, the solution to the problem is un-intuitive for the user to do. The API should know such cases, and handle them well too. The caller of kthread_stop() would expect it to do the expected, after all (i.e. stop the kthread :-) In any case, that kind of the changes can break things, just because this means API change. Well, kthreads are a kernel-internal API anyway, I guess as long as we fix all users, we're fine. I am talking about the case when wait_event_interruptible() should not fail (unless something bad happens) inside the while (!kthread_should_stop()) loop. Note also that kthread could use TASK_INTERRUPTIBLE sleep [...] and because it knows that all signals are ignored. Ok, I think you're saying that if a kthread used wait_event_interruptible (and was not expecting signals, because it ignores them), then bad things (say exit in inconsistent state, etc) will happen if we break that wait_event_interruptible unexpectedly. No. Of course, kthread should check the error and doesn't exit in inconsistent state. The question is: why should we break (say) tcp_recvmsg() inside while (!kthread_should_stop()) loop if it is supposed to succeed unless something bad happens? (I mean, we may have a kthread which doesn't expect the failure unless something bad happens). First, I don't quite understand this not expecting failure / something bad happens bit at all. Second, we *must* break that tcp_recvmsg() inside the kthread's main loop, of course! We want it stopped, after all, and if we don't make it break out of that function, the kthread _will_never_exit_. [ What's worse, several kthreads are part of modules, and are created during the module_init() and so the module_exit() function needs to call kthread_stop() to clean it up. But the wait_for_completion() in kthread_stop() will never unblock, and so this would also mean the rmmod will _hang_ and module will never get unloaded too. ] OK, let me give a silly example. The correctly written kthread should check the result of kmalloc(), yes? kthread(k) means that k should exit anyway, yes? So let's change kthread_stop(k) to also set TIF_MEMDIE, this means that __alloc_pages() will fail quickly when get_page_from_freelist() doesn't succeed, but won't start pageout() which may take a while. Please don't explain me why this suggestion is bad :), I am just trying to illustrate my point. Your example / analogy is indeed *very* bad :-) Please note that this whole thing is about functions that will _simply_*never*_exit_ever_ _unless_ given a signal. [ I think you've been misunderstanding my proposal that I want to send a SIGKILL / signal to the kthread just to ensure it gets
Re: [PATCH] RFC: have tcp_recvmsg() check kthread_should_stop() and treat it as if it were signalled
On 6/28/07, Oleg Nesterov [EMAIL PROTECTED] wrote: [...] Hmm... actually, such a change breaks the while (signal_pending(current)) dequeue_signal_and_so_something(); loop, see jffs2_garbage_collect_thread() for example. BTW jffs2_garbage_collect_thread() is a horrible abomination :-) Its use of SIGSTOP and SIGHUP is *totally* gratuitous unwarranted. It does use SIGKILL, but simply as a stop-notification from umount of the corresponding jffs2 partition. I think all the signal handling there can be removed; then it needs to undergo conversion to kthread (it uses horrible locks and completions to handle its exit) -- I'll put it in my endless kernel-cleanups-todo-list ... - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] RFC: have tcp_recvmsg() check kthread_should_stop() and treat it as if it were signalled
On 6/28/07, Satyam Sharma [EMAIL PROTECTED] wrote: [...] [ BTW even there we're safe as long as we check kthread_stop() _before_ flushing or ^^ Whoops, that should have been kthread_should_stop(). dequeueing the signals, but then that'll be an ugly rule to try and enforce, of course. ] Satyam - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] RFC: have tcp_recvmsg() check kthread_should_stop() and treat it as if it were signalled
On 06/28, Satyam Sharma wrote: Second, we *must* break that tcp_recvmsg() inside the kthread's main loop, of course! We want it stopped, after all, and if we don't make it break out of that function, the kthread _will_never_exit_. In that case this kthread is buggy. We have sock-sk_rcvtimeo. Please note that this whole thing is about functions that will _simply_*never*_exit_ever_ _unless_ given a signal. ditto. kthread should not do this. OK, I suggest to stop this thread. I don't claim you are wrong, just we think differently ;) This is what I can't understand completely. Why should we check SIGKILL or signal_pending() in addition to kthread_stop_info.k, what is the point? ... so kthread_stop_info will go away too. it should go away regardless, we have patches. Still I see no point to check signal_pending() in kthread_stop(). Oleg. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] RFC: have tcp_recvmsg() check kthread_should_stop() and treat it as if it were signalled
On Thu, 28 Jun 2007 21:08:25 +0400 Oleg Nesterov [EMAIL PROTECTED] wrote: On 06/28, Satyam Sharma wrote: Second, we *must* break that tcp_recvmsg() inside the kthread's main loop, of course! We want it stopped, after all, and if we don't make it break out of that function, the kthread _will_never_exit_. In that case this kthread is buggy. We have sock-sk_rcvtimeo. Please note that this whole thing is about functions that will _simply_*never*_exit_ever_ _unless_ given a signal. ditto. kthread should not do this. OK, I suggest to stop this thread. I don't claim you are wrong, just we think differently ;) This is what I can't understand completely. Why should we check SIGKILL or signal_pending() in addition to kthread_stop_info.k, what is the point? ... so kthread_stop_info will go away too. it should go away regardless, we have patches. Still I see no point to check signal_pending() in kthread_stop(). Oleg. Again, this isn't an area where I have great expertise... Adding signalling into kthread_stop would seem to be making some assumptions about how kthreads are used and how they should respond to signals. That sounds unsafe and might potentially limit flexibility. agree with Oleg here -- I don't see the benefit to doing this for all kthreads. If you're aware that your kthread might be blocked on a call, then it doesn't seem burdensome to add in an extra send/force_sig call. -- Jeff Layton [EMAIL PROTECTED] - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] RFC: have tcp_recvmsg() check kthread_should_stop() and treat it as if it were signalled
Hi Oleg, On 6/28/07, Oleg Nesterov [EMAIL PROTECTED] wrote: On 06/28, Satyam Sharma wrote: Second, we *must* break that tcp_recvmsg() inside the kthread's main loop, of course! We want it stopped, after all, and if we don't make it break out of that function, the kthread _will_never_exit_. In that case this kthread is buggy. We have sock-sk_rcvtimeo. Please note that this whole thing is about functions that will _simply_*never*_exit_ever_ _unless_ given a signal. ditto. kthread should not do this. Well, I definitely wouldn't call it buggy ... skb_recv_datagram() (if with sock-sk_rcvtimeo != MAX_SCHEDULE_TIMEOUT) would then needlessly have to be put into it's own little while(1) (or put a continue; after it back to main kthread loop). A question arises, what timeout value to use? (too little = needless wastage of CPU; too high = see below) More importantly, the other thread that does a kthread_stop() on our kthread (probably a umount(2) or rmmod) would then unfortunately hang (on wait_for_completion i.e. TASK_UNINTERRUPTIBLE) for the duration of the time it takes for our kthread to finish it's timeout, which plays havoc with userspace scripts. OK, I suggest to stop this thread. I don't claim you are wrong, just we think differently ;) That's fine, we can still agree to disagree here :-) Cheers, Satyam - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] RFC: have tcp_recvmsg() check kthread_should_stop() and treat it as if it were signalled
Hi Oleg, On 6/27/07, Oleg Nesterov <[EMAIL PROTECTED]> wrote: On 06/27, Satyam Sharma wrote: > > Thanks for your comments, I'm still not convinced, however. An perhaps you are right. I don't have a very strong opinion on that. Still I can't understand why it is better if kthread_stop() sends a signal as well. [...] One can always use force_sig() or allow_signal() + send_sig() when it is really needed, like cifs does. The way I look at it, this is about an API being the one with "least surprise". So when one writes a kthread using its API, one would expect kthread_stop() to dtrt and just work, which it will not, if the kthread has one of such functions (which are just like any other, after all). Also, imagine such a function being added to a kthread that didn't have it previously ... the solution to which (sending a signal before kthread_stop) is un-intuitive. Hence, why not make it part of the API itself ... It's not like the signal would do any harm to other kthreads (that don't use such function) either. Contrary, I believe we should avoid signals when it comes to kernel threads. And I agree, but there's quite a subtle difference between signals being used like they normally are, and this case here. Fact is, there is simply no other way to break out of certain functions (if there was, I would've preferred that myself). In fact, what I'm proposing is that kthreads should *not* be tinkering with (flushing, handling, dequeueing, whatever) signals at all, because like you mentioned, if they do that, it's possible that the TIF_SIGPENDING could get lost. > Anyway, I think _all_ usages of kthread_stop() in the kernel *do* want > the thread to stop *right then*. After all, kthread_stop() doesn't even > return (gets blocked on wait_for_completion()) till it knows the target > kthread *has* exited completely. Yes, kthread_stop(k) means that k should exit eventually, but I don't think that kthread_stop() should try to force the exit. Well, it's just an additional notice (apart from setting kthread_stop_info) sent to the target kthread that its time has come ... I'm not sure how a kthread would have exit "forced" upon it just by sending it a signal. Also, note that _not_ using a signal would in fact mean that the kthread _never_ exits at all (forget asap). I am talking about the case when wait_event_interruptible() should not fail (unless something bad happens) inside the "while (!kthread_should_stop())" loop. Note also that kthread could use TASK_INTERRUPTIBLE sleep [...] and because it knows that all signals are ignored. Ok, I think you're saying that if a kthread used wait_event_interruptible (and was not expecting signals, because it ignores them), then bad things (say exit in inconsistent state, etc) will happen if we break that wait_event_interruptible unexpectedly. First, such an error ought to be handled gracefully by kthread itself anyway -- anybody who doesn't check the return of _interruptible() functions is just asking for trouble. Second, the kthread must expect that the stop notification _could_ have come during that sleep (in fact all good kthreads I've seen do always put a kthread_should_stop() after all such blocking functions, and not only in the while loop's condition) -- but even if it doesn't check explicitly, what do we lose? If the kthread code _after_ the wait_event_interruptible is written such that it assumes that the wait condition has become true (as I mentioned above), then that code is inherently buggy. And thirdly, what I'm proposing is putting the check for checking the SIGKILL in kthread_should_stop itself, in /addition/ to the kthread_stop_info.k == current check. So the kthread will check should_stop(), and if true, then exiting cleanly -- this is something that all existing kthreads would do already (if some kthread out there exits _uncleanly_ even after seeing seeing kthread_should_stop == true, then it needs fixing anyway). Satyam - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] RFC: have tcp_recvmsg() check kthread_should_stop() and treat it as if it were signalled
On 06/27, Satyam Sharma wrote: > > Thanks for your comments, I'm still not convinced, however. An perhaps you are right. I don't have a very strong opinion on that. Still I can't understand why it is better if kthread_stop() sends a signal as well. Contrary, I believe we should avoid signals when it comes to kernel threads. One can always use force_sig() or allow_signal() + send_sig() when it is really needed, like cifs does. > On 6/26/07, Oleg Nesterov <[EMAIL PROTECTED]> wrote: > > > >Personally, I don't think we should do this. > > > >kthread_stop() doesn't always mean "kill this thread asap". Suppose that > >CPU_DOWN does kthread_stop(workqueue->thread) but doesn't flush the queue > >before that (we did so before 2.6.22 and perhaps we will do again). Now > >work_struct->func() doing tcp_recvmsg() or wait_event_interruptible() > >fails, > >but this is probably not that we want. > > Anyway, I think _all_ usages of kthread_stop() in the kernel *do* want > the thread to stop *right then*. After all, kthread_stop() doesn't even > return (gets blocked on wait_for_completion()) till it knows the target > kthread *has* exited completely. Yes, kthread_stop(k) means that k should exit eventually, but I don't think that kthread_stop() should try to force the exit. > And if a workqueue is blocked on tcp_recvmsg() or skb_recv_datagram() > or some such, I don't see how that flush_workqueue (if that is what you > meant) would succeed anyway (unless you do send the signal too), timeout, but this was just a silly example. I am talking about the case when wait_event_interruptible() should not fail (unless something bad happens) inside the "while (!kthread_should_stop())" loop. Note also that kthread could use TASK_INTERRUPTIBLE sleep because it doesn't want to contribute to loadavg, and because it knows that all signals are ignored. > Note that the exact scenario you're talking about wouldn't mean the > kthread getting killed before it's supposed to be stopped anyway. Yes sure, we can't kill the kernel thread via signal. I meant we can have some unexpected failure. > >(offtopic) > > > >cifs_mount: > > > >send_sig(SIGKILL,srvTcp->tsk,1); > >tsk = srvTcp->tsk; > >if(tsk) > >kthread_stop(tsk); > > > >This "if(tsk)" looks wrong to me. > > I think it's bogus myself. [ Added [EMAIL PROTECTED] to Cc: > ] > > >Can srvTcp->tsk be NULL? If yes, send_sig() > >is not safe. Can srvTcp->tsk become NULL after send_sig() ? If yes, this > >check is racy, and kthread_stop() is not safe. > > That's again something the atomicity I proposed above could avoid? I think this "if(tsk)" is just bogus, and should be killed. Oleg. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] RFC: have tcp_recvmsg() check kthread_should_stop() and treat it as if it were signalled
On 06/27, Satyam Sharma wrote: Thanks for your comments, I'm still not convinced, however. An perhaps you are right. I don't have a very strong opinion on that. Still I can't understand why it is better if kthread_stop() sends a signal as well. Contrary, I believe we should avoid signals when it comes to kernel threads. One can always use force_sig() or allow_signal() + send_sig() when it is really needed, like cifs does. On 6/26/07, Oleg Nesterov [EMAIL PROTECTED] wrote: Personally, I don't think we should do this. kthread_stop() doesn't always mean kill this thread asap. Suppose that CPU_DOWN does kthread_stop(workqueue-thread) but doesn't flush the queue before that (we did so before 2.6.22 and perhaps we will do again). Now work_struct-func() doing tcp_recvmsg() or wait_event_interruptible() fails, but this is probably not that we want. Anyway, I think _all_ usages of kthread_stop() in the kernel *do* want the thread to stop *right then*. After all, kthread_stop() doesn't even return (gets blocked on wait_for_completion()) till it knows the target kthread *has* exited completely. Yes, kthread_stop(k) means that k should exit eventually, but I don't think that kthread_stop() should try to force the exit. And if a workqueue is blocked on tcp_recvmsg() or skb_recv_datagram() or some such, I don't see how that flush_workqueue (if that is what you meant) would succeed anyway (unless you do send the signal too), timeout, but this was just a silly example. I am talking about the case when wait_event_interruptible() should not fail (unless something bad happens) inside the while (!kthread_should_stop()) loop. Note also that kthread could use TASK_INTERRUPTIBLE sleep because it doesn't want to contribute to loadavg, and because it knows that all signals are ignored. Note that the exact scenario you're talking about wouldn't mean the kthread getting killed before it's supposed to be stopped anyway. Yes sure, we can't kill the kernel thread via signal. I meant we can have some unexpected failure. (offtopic) cifs_mount: send_sig(SIGKILL,srvTcp-tsk,1); tsk = srvTcp-tsk; if(tsk) kthread_stop(tsk); This if(tsk) looks wrong to me. I think it's bogus myself. [ Added [EMAIL PROTECTED] to Cc: ] Can srvTcp-tsk be NULL? If yes, send_sig() is not safe. Can srvTcp-tsk become NULL after send_sig() ? If yes, this check is racy, and kthread_stop() is not safe. That's again something the atomicity I proposed above could avoid? I think this if(tsk) is just bogus, and should be killed. Oleg. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] RFC: have tcp_recvmsg() check kthread_should_stop() and treat it as if it were signalled
Hi Oleg, On 6/27/07, Oleg Nesterov [EMAIL PROTECTED] wrote: On 06/27, Satyam Sharma wrote: Thanks for your comments, I'm still not convinced, however. An perhaps you are right. I don't have a very strong opinion on that. Still I can't understand why it is better if kthread_stop() sends a signal as well. [...] One can always use force_sig() or allow_signal() + send_sig() when it is really needed, like cifs does. The way I look at it, this is about an API being the one with least surprise. So when one writes a kthread using its API, one would expect kthread_stop() to dtrt and just work, which it will not, if the kthread has one of such functions (which are just like any other, after all). Also, imagine such a function being added to a kthread that didn't have it previously ... the solution to which (sending a signal before kthread_stop) is un-intuitive. Hence, why not make it part of the API itself ... It's not like the signal would do any harm to other kthreads (that don't use such function) either. Contrary, I believe we should avoid signals when it comes to kernel threads. And I agree, but there's quite a subtle difference between signals being used like they normally are, and this case here. Fact is, there is simply no other way to break out of certain functions (if there was, I would've preferred that myself). In fact, what I'm proposing is that kthreads should *not* be tinkering with (flushing, handling, dequeueing, whatever) signals at all, because like you mentioned, if they do that, it's possible that the TIF_SIGPENDING could get lost. Anyway, I think _all_ usages of kthread_stop() in the kernel *do* want the thread to stop *right then*. After all, kthread_stop() doesn't even return (gets blocked on wait_for_completion()) till it knows the target kthread *has* exited completely. Yes, kthread_stop(k) means that k should exit eventually, but I don't think that kthread_stop() should try to force the exit. Well, it's just an additional notice (apart from setting kthread_stop_info) sent to the target kthread that its time has come ... I'm not sure how a kthread would have exit forced upon it just by sending it a signal. Also, note that _not_ using a signal would in fact mean that the kthread _never_ exits at all (forget asap). I am talking about the case when wait_event_interruptible() should not fail (unless something bad happens) inside the while (!kthread_should_stop()) loop. Note also that kthread could use TASK_INTERRUPTIBLE sleep [...] and because it knows that all signals are ignored. Ok, I think you're saying that if a kthread used wait_event_interruptible (and was not expecting signals, because it ignores them), then bad things (say exit in inconsistent state, etc) will happen if we break that wait_event_interruptible unexpectedly. First, such an error ought to be handled gracefully by kthread itself anyway -- anybody who doesn't check the return of _interruptible() functions is just asking for trouble. Second, the kthread must expect that the stop notification _could_ have come during that sleep (in fact all good kthreads I've seen do always put a kthread_should_stop() after all such blocking functions, and not only in the while loop's condition) -- but even if it doesn't check explicitly, what do we lose? If the kthread code _after_ the wait_event_interruptible is written such that it assumes that the wait condition has become true (as I mentioned above), then that code is inherently buggy. And thirdly, what I'm proposing is putting the check for checking the SIGKILL in kthread_should_stop itself, in /addition/ to the kthread_stop_info.k == current check. So the kthread will check should_stop(), and if true, then exiting cleanly -- this is something that all existing kthreads would do already (if some kthread out there exits _uncleanly_ even after seeing seeing kthread_should_stop == true, then it needs fixing anyway). Satyam - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] RFC: have tcp_recvmsg() check kthread_should_stop() and treat it as if it were signalled
On 6/27/07, Satyam Sharma <[EMAIL PROTECTED]> wrote: [...] On 6/26/07, Oleg Nesterov <[EMAIL PROTECTED]> wrote: > On 06/26, Satyam Sharma wrote: [...] > > So could we have signals in _addition_ to kthread_stop_info and change > > kthread_should_stop() to check for both: > > > > kthread_stop_info.k == current && signal_pending(current) > > No, this can't work in general. Some kthreads do flush_signals/dequeue_signal, > so TIF_SIGPENDING can be lost anyway. Yup, I had thought of precisely this issue yesterday as well. The mental note I made to myself was that the force_sig(SIGKILL) and wake_up_process() in kthread_stop() must be atomic so that the following race is not possible: Hmm, the issue seems to have more to do with the ordering of flush_signals() w.r.t. checking kthread_should_stop() in the kthread's code. I thought about how to tackle this, but there's no easy way to make the stuff atomic like I thought earlier. The problem, like you mentioned, is if the target kthread proactively flushes its signals by hand *before* checking kthread_should_stop(). The only way out seems to be to simply outlaw flush_signals() in kthreads (or anything to do with signals), but that would be impossible to enforce ... Satyam - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] RFC: have tcp_recvmsg() check kthread_should_stop() and treat it as if it were signalled
Hi Oleg, Thanks for your comments, I'm still not convinced, however. On 6/26/07, Oleg Nesterov <[EMAIL PROTECTED]> wrote: On 06/26, Satyam Sharma wrote: > > Yes, why not embed a send_sig(SIGKILL) just before the wake_up_process() > in kthread_stop() itself? Personally, I don't think we should do this. kthread_stop() doesn't always mean "kill this thread asap". Suppose that CPU_DOWN does kthread_stop(workqueue->thread) but doesn't flush the queue before that (we did so before 2.6.22 and perhaps we will do again). Now work_struct->func() doing tcp_recvmsg() or wait_event_interruptible() fails, but this is probably not that we want. [ Well, first of all, anybody who sends a possibly-blocking-forever function like tcp_recvmsg() to a *workqueue* needs to get his head checked. ] Anyway, I think _all_ usages of kthread_stop() in the kernel *do* want the thread to stop *right then*. After all, kthread_stop() doesn't even return (gets blocked on wait_for_completion()) till it knows the target kthread *has* exited completely. And if a workqueue is blocked on tcp_recvmsg() or skb_recv_datagram() or some such, I don't see how that flush_workqueue (if that is what you meant) would succeed anyway (unless you do send the signal too), and we'll actually end up having a nice little situation on our hands if we make the mistake of calling flush_workqueue on such a wq. Note that the exact scenario you're talking about wouldn't mean the kthread getting killed before it's supposed to be stopped anyway. force_sig is not a synchronous wakeup, and also note that tcp_recvmsg() or skb_recv_datagram() etc will exit (and are supposed to exit) cleanly on seeing a signal. > So could we have signals in _addition_ to kthread_stop_info and change > kthread_should_stop() to check for both: > > kthread_stop_info.k == current && signal_pending(current) No, this can't work in general. Some kthreads do flush_signals/dequeue_signal, so TIF_SIGPENDING can be lost anyway. Yup, I had thought of precisely this issue yesterday as well. The mental note I made to myself was that the force_sig(SIGKILL) and wake_up_process() in kthread_stop() must be atomic so that the following race is not possible: Say: #1 -> thread that invokes kthread_stop() #2 -> kthread to be stopped, (may be) currently in wait_event_interruptible(), such that there is a bigger loop over the wait_event_interruptible() itself, which puts task back to sleep if this was a spurious wake up (if _not_ due to a signal). Thread #1 Thread #2 = = skb_recv_datagram() -> wait_for_packet() ... force_sig(SIGKILL) ... skb_recv_datagram() -> wait_for_packet() kthread_stop() -> wake_up_process() i.e. thread #2 still does not exit cleanly. The root of the problem is that functions such as skb_recv_datagram() -> wait_for_packet() handle spurious wakeups *internally* by themselves, so our kthread does not get a chance to check for kthread_should_stop(). Of course, above race is true only for kthreads that do flush signals on seeing spurious ones periodically. If it did not, then skb_recv_datagram() called second time above would again have broken out because of signal_pending() and we wouldn't have gone back to sleep. But we have to be on safer side and avoid races *irrespective* of what the kthread might or might not do, so let's _not_ depend on _assumed kthread behaviour_. I suspect the above race be avoided by making force_sig() and wake_up_process() atomic in kthread_stop() itself, please correct me if I'm horribly wrong. I personally think Jeff's idea to use force_sig() is right. kthread_create() doesn't use CLONE_SIGHAND, so it is safe to change ->sighand->actionp[]. (offtopic) cifs_mount: send_sig(SIGKILL,srvTcp->tsk,1); tsk = srvTcp->tsk; if(tsk) kthread_stop(tsk); This "if(tsk)" looks wrong to me. I think it's bogus myself. [ Added [EMAIL PROTECTED] to Cc: ] Can srvTcp->tsk be NULL? If yes, send_sig() is not safe. Can srvTcp->tsk become NULL after send_sig() ? If yes, this check is racy, and kthread_stop() is not safe. That's again something the atomicity I proposed above could avoid? Please comment. Satyam - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] RFC: have tcp_recvmsg() check kthread_should_stop() and treat it as if it were signalled
On 06/26, Satyam Sharma wrote: > > Yes, why not embed a send_sig(SIGKILL) just before the wake_up_process() > in kthread_stop() itself? Personally, I don't think we should do this. kthread_stop() doesn't always mean "kill this thread asap". Suppose that CPU_DOWN does kthread_stop(workqueue->thread) but doesn't flush the queue before that (we did so before 2.6.22 and perhaps we will do again). Now work_struct->func() doing tcp_recvmsg() or wait_event_interruptible() fails, but this is probably not that we want. > So could we have signals in _addition_ to kthread_stop_info and change > kthread_should_stop() to check for both: > > kthread_stop_info.k == current && signal_pending(current) No, this can't work in general. Some kthreads do flush_signals/dequeue_signal, so TIF_SIGPENDING can be lost anyway. I personally think Jeff's idea to use force_sig() is right. kthread_create() doesn't use CLONE_SIGHAND, so it is safe to change ->sighand->actionp[]. (offtopic) cifs_mount: send_sig(SIGKILL,srvTcp->tsk,1); tsk = srvTcp->tsk; if(tsk) kthread_stop(tsk); This "if(tsk)" looks wrong to me. Can srvTcp->tsk be NULL? If yes, send_sig() is not safe. Can srvTcp->tsk become NULL after send_sig() ? If yes, this check is racy, and kthread_stop() is not safe. Oleg. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] RFC: have tcp_recvmsg() check kthread_should_stop() and treat it as if it were signalled
On 06/26, Satyam Sharma wrote: Yes, why not embed a send_sig(SIGKILL) just before the wake_up_process() in kthread_stop() itself? Personally, I don't think we should do this. kthread_stop() doesn't always mean kill this thread asap. Suppose that CPU_DOWN does kthread_stop(workqueue-thread) but doesn't flush the queue before that (we did so before 2.6.22 and perhaps we will do again). Now work_struct-func() doing tcp_recvmsg() or wait_event_interruptible() fails, but this is probably not that we want. So could we have signals in _addition_ to kthread_stop_info and change kthread_should_stop() to check for both: kthread_stop_info.k == current signal_pending(current) No, this can't work in general. Some kthreads do flush_signals/dequeue_signal, so TIF_SIGPENDING can be lost anyway. I personally think Jeff's idea to use force_sig() is right. kthread_create() doesn't use CLONE_SIGHAND, so it is safe to change -sighand-actionp[]. (offtopic) cifs_mount: send_sig(SIGKILL,srvTcp-tsk,1); tsk = srvTcp-tsk; if(tsk) kthread_stop(tsk); This if(tsk) looks wrong to me. Can srvTcp-tsk be NULL? If yes, send_sig() is not safe. Can srvTcp-tsk become NULL after send_sig() ? If yes, this check is racy, and kthread_stop() is not safe. Oleg. - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] RFC: have tcp_recvmsg() check kthread_should_stop() and treat it as if it were signalled
Hi Oleg, Thanks for your comments, I'm still not convinced, however. On 6/26/07, Oleg Nesterov [EMAIL PROTECTED] wrote: On 06/26, Satyam Sharma wrote: Yes, why not embed a send_sig(SIGKILL) just before the wake_up_process() in kthread_stop() itself? Personally, I don't think we should do this. kthread_stop() doesn't always mean kill this thread asap. Suppose that CPU_DOWN does kthread_stop(workqueue-thread) but doesn't flush the queue before that (we did so before 2.6.22 and perhaps we will do again). Now work_struct-func() doing tcp_recvmsg() or wait_event_interruptible() fails, but this is probably not that we want. [ Well, first of all, anybody who sends a possibly-blocking-forever function like tcp_recvmsg() to a *workqueue* needs to get his head checked. ] Anyway, I think _all_ usages of kthread_stop() in the kernel *do* want the thread to stop *right then*. After all, kthread_stop() doesn't even return (gets blocked on wait_for_completion()) till it knows the target kthread *has* exited completely. And if a workqueue is blocked on tcp_recvmsg() or skb_recv_datagram() or some such, I don't see how that flush_workqueue (if that is what you meant) would succeed anyway (unless you do send the signal too), and we'll actually end up having a nice little situation on our hands if we make the mistake of calling flush_workqueue on such a wq. Note that the exact scenario you're talking about wouldn't mean the kthread getting killed before it's supposed to be stopped anyway. force_sig is not a synchronous wakeup, and also note that tcp_recvmsg() or skb_recv_datagram() etc will exit (and are supposed to exit) cleanly on seeing a signal. So could we have signals in _addition_ to kthread_stop_info and change kthread_should_stop() to check for both: kthread_stop_info.k == current signal_pending(current) No, this can't work in general. Some kthreads do flush_signals/dequeue_signal, so TIF_SIGPENDING can be lost anyway. Yup, I had thought of precisely this issue yesterday as well. The mental note I made to myself was that the force_sig(SIGKILL) and wake_up_process() in kthread_stop() must be atomic so that the following race is not possible: Say: #1 - thread that invokes kthread_stop() #2 - kthread to be stopped, (may be) currently in wait_event_interruptible(), such that there is a bigger loop over the wait_event_interruptible() itself, which puts task back to sleep if this was a spurious wake up (if _not_ due to a signal). Thread #1 Thread #2 = = skb_recv_datagram() - wait_for_packet() sleeping ... force_sig(SIGKILL) scheduled out wakes up, sees the pending signal, breaks out of wait_for_packet() and skb_recv_datagram() back out to our kthread code itself, but there we see that kthread_should_stop() is NOT yet true, we also see this spurious signal, flush it, and call skb_recv_datagram() all over again ... skb_recv_datagram() - wait_for_packet() sleeping scheduled in kthread_stop() - wake_up_process() this time we don't even break out of the skb_recv_datagram() either, as no signals are pending any more i.e. thread #2 still does not exit cleanly. The root of the problem is that functions such as skb_recv_datagram() - wait_for_packet() handle spurious wakeups *internally* by themselves, so our kthread does not get a chance to check for kthread_should_stop(). Of course, above race is true only for kthreads that do flush signals on seeing spurious ones periodically. If it did not, then skb_recv_datagram() called second time above would again have broken out because of signal_pending() and we wouldn't have gone back to sleep. But we have to be on safer side and avoid races *irrespective* of what the kthread might or might not do, so let's _not_ depend on _assumed kthread behaviour_. I suspect the above race be avoided by making force_sig() and wake_up_process() atomic in kthread_stop() itself, please correct me if I'm horribly wrong. I personally think Jeff's idea to use force_sig() is right. kthread_create() doesn't use CLONE_SIGHAND, so it is safe to change -sighand-actionp[]. (offtopic) cifs_mount: send_sig(SIGKILL,srvTcp-tsk,1); tsk =
Re: [PATCH] RFC: have tcp_recvmsg() check kthread_should_stop() and treat it as if it were signalled
On 6/27/07, Satyam Sharma [EMAIL PROTECTED] wrote: [...] On 6/26/07, Oleg Nesterov [EMAIL PROTECTED] wrote: On 06/26, Satyam Sharma wrote: [...] So could we have signals in _addition_ to kthread_stop_info and change kthread_should_stop() to check for both: kthread_stop_info.k == current signal_pending(current) No, this can't work in general. Some kthreads do flush_signals/dequeue_signal, so TIF_SIGPENDING can be lost anyway. Yup, I had thought of precisely this issue yesterday as well. The mental note I made to myself was that the force_sig(SIGKILL) and wake_up_process() in kthread_stop() must be atomic so that the following race is not possible: Hmm, the issue seems to have more to do with the ordering of flush_signals() w.r.t. checking kthread_should_stop() in the kthread's code. I thought about how to tackle this, but there's no easy way to make the stuff atomic like I thought earlier. The problem, like you mentioned, is if the target kthread proactively flushes its signals by hand *before* checking kthread_should_stop(). The only way out seems to be to simply outlaw flush_signals() in kthreads (or anything to do with signals), but that would be impossible to enforce ... Satyam - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] RFC: have tcp_recvmsg() check kthread_should_stop() and treat it as if it were signalled
On Tue, 26 Jun 2007 03:39:23 +0530 "Satyam Sharma" <[EMAIL PROTECTED]> wrote: > Hi Jeff, > > [ Trimmed netdev from Cc: list, added Christoph. ] > > On 6/26/07, Jeff Layton <[EMAIL PROTECTED]> wrote: > > On Tue, 26 Jun 2007 01:11:20 +0530 > > "Satyam Sharma" <[EMAIL PROTECTED]> wrote: > > > [...] > > > Yes, why not embed a send_sig(SIGKILL) just before the wake_up_process() > > > in kthread_stop() itself? > > > > > > Looking at some happily out-of-date comments in the kthread code, I can > > > guess that at some point of time (perhaps very early drafts) Rusty > > > actually > > > *did* implement the whole kthread_stop() functionality using signals, but > > > I suspect it might've been discarded and the kthread_stop_info approach > > > used instead to protect from spurious signals from userspace. (?) > > > > > > So could we have signals in _addition_ to kthread_stop_info and change > > > kthread_should_stop() to check for both: > > > > > > kthread_stop_info.k == current && signal_pending(current) > > > > > > If !kthread_should_stop() && signal_pending(current) => spurious signal, > > > so just flush and discard (in the kthread). > > > [...] > > > Why is it wrong for kthreads to let signals through? We can ignore out > > > all signals we're not interested in, and flush the spurious ones ... > > > otherwise there really isn't much those kthreads can do that get blocked > > > in such functions, is there? > > > > Yes, after I wrote that I began to question that assumption too. I was > > pretty much going on a statement by Christoph Hellwig on an earlier > > patch that I did: > > Ok, I found both the threads / patches you referred to ... > > > -[snip]-- > > The right way to fix this is to stop sending signals at all and have > > a kernel-internal way to get out of kernel_recvmsg. Uses of signals by > > kernel thread generally are bugs. > > -[snip]-- > > > > Though this makes no sense to me. I don't see any reason why kthreads > > can't use signals, and hacking support for breaking out of sleeping > > functions seems redundant. > > Right, signals _are_ the "signalling" mechanism all through kernel code > already, anything else would clearly be redundant. > > But I've listened / participated in other discussions about kthreads and > signals and the general feeling is that (somebody correct me if I'm wrong) > kernel threads are a kernel _implementation detail_ after all, and good > design must ensure that userspace be unaware of even their existence. > And I agree with that, but the real ugly uses of signals by kernel threads > are those cases where we want to export a full signals-based interface to > our kthread to userspace (such cases do exist in mainline, I think). > But that's clearly not the case with the usage here. > > > My latest patch for cifsd has it block all signals from userspace > > and uses force_sig() instead of send_sig() when trying to stop the > > thread. This seems to work pretty well and still insulates the thread > > from userspace signals. > > Thanks, I find this solution much cleaner too, so now we avoid any > sort of spuriousness getting in from userspace (and pretty much takes > care of all the checking-if-spurious-and-flushing business I referred to > earlier). > > But how about making this part of kthreads proper? Functions such as > skb_recv_datagram etc are pretty standard (and others such would also > exist or get added in due time) and it is not exactly intuitive for a > developer > to add a force_sig(SIGKILL) before the kthread_stop() to ensure that the > kthread using such functions does exit properly. [ I can foresee cases in > the future when such functions are added to kthreads that did not have > them previously, and suddenly someone reports a regression that the > kthread stops exiting cleanly. ] > > Satyam I have no issue with it. I just didn't feel confident enough to know whether that might have harmful side effects. Right offhand I don't see why adding a force_sig() to kthread_stop() would be an issue. It seems like if you're wanting to stop the thread anyway then a signal probably wouldn't hurt anything. -- Jeff Layton <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] RFC: have tcp_recvmsg() check kthread_should_stop() and treat it as if it were signalled
Hi Jeff, [ Trimmed netdev from Cc: list, added Christoph. ] On 6/26/07, Jeff Layton <[EMAIL PROTECTED]> wrote: On Tue, 26 Jun 2007 01:11:20 +0530 "Satyam Sharma" <[EMAIL PROTECTED]> wrote: > [...] > Yes, why not embed a send_sig(SIGKILL) just before the wake_up_process() > in kthread_stop() itself? > > Looking at some happily out-of-date comments in the kthread code, I can > guess that at some point of time (perhaps very early drafts) Rusty actually > *did* implement the whole kthread_stop() functionality using signals, but > I suspect it might've been discarded and the kthread_stop_info approach > used instead to protect from spurious signals from userspace. (?) > > So could we have signals in _addition_ to kthread_stop_info and change > kthread_should_stop() to check for both: > > kthread_stop_info.k == current && signal_pending(current) > > If !kthread_should_stop() && signal_pending(current) => spurious signal, > so just flush and discard (in the kthread). > [...] > Why is it wrong for kthreads to let signals through? We can ignore out > all signals we're not interested in, and flush the spurious ones ... > otherwise there really isn't much those kthreads can do that get blocked > in such functions, is there? Yes, after I wrote that I began to question that assumption too. I was pretty much going on a statement by Christoph Hellwig on an earlier patch that I did: Ok, I found both the threads / patches you referred to ... -[snip]-- The right way to fix this is to stop sending signals at all and have a kernel-internal way to get out of kernel_recvmsg. Uses of signals by kernel thread generally are bugs. -[snip]-- Though this makes no sense to me. I don't see any reason why kthreads can't use signals, and hacking support for breaking out of sleeping functions seems redundant. Right, signals _are_ the "signalling" mechanism all through kernel code already, anything else would clearly be redundant. But I've listened / participated in other discussions about kthreads and signals and the general feeling is that (somebody correct me if I'm wrong) kernel threads are a kernel _implementation detail_ after all, and good design must ensure that userspace be unaware of even their existence. And I agree with that, but the real ugly uses of signals by kernel threads are those cases where we want to export a full signals-based interface to our kthread to userspace (such cases do exist in mainline, I think). But that's clearly not the case with the usage here. My latest patch for cifsd has it block all signals from userspace and uses force_sig() instead of send_sig() when trying to stop the thread. This seems to work pretty well and still insulates the thread from userspace signals. Thanks, I find this solution much cleaner too, so now we avoid any sort of spuriousness getting in from userspace (and pretty much takes care of all the checking-if-spurious-and-flushing business I referred to earlier). But how about making this part of kthreads proper? Functions such as skb_recv_datagram etc are pretty standard (and others such would also exist or get added in due time) and it is not exactly intuitive for a developer to add a force_sig(SIGKILL) before the kthread_stop() to ensure that the kthread using such functions does exit properly. [ I can foresee cases in the future when such functions are added to kthreads that did not have them previously, and suddenly someone reports a regression that the kthread stops exiting cleanly. ] Satyam - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] RFC: have tcp_recvmsg() check kthread_should_stop() and treat it as if it were signalled
On Tue, 26 Jun 2007 01:11:20 +0530 "Satyam Sharma" <[EMAIL PROTECTED]> wrote: > Hi, > > On 6/9/07, Jeff Layton <[EMAIL PROTECTED]> wrote: > > On Sat, 09 Jun 2007 11:30:04 +1000 > > Herbert Xu <[EMAIL PROTECTED]> wrote: > > > > > Please cc networking patches to [EMAIL PROTECTED] > > > > > > Jeff Layton <[EMAIL PROTECTED]> wrote: > > > > > > > > The following patch is a first stab at removing this need. It makes it > > > > so that in tcp_recvmsg() we also check kthread_should_stop() at any > > > > point where we currently check to see if the task was signalled. If > > > > that returns true, then it acts as if it were signalled and returns to > > > > the calling function. > > I got bit by the same thing when I was implementing a kthread that sleeps > on skb_recv_datagram() (=> wait_for_packet) with noblock = 0 over a netlink > socket. I need to use the same SIGKILL hack before kthread_stop() to ensure > the kthread does wake up *and* unblock from skb_recv_datagram() when the > module is being unloaded. Searched hard, but just couldn't find a prettier > solution (if someone knows, please let me know). > > > > This just doesn't seem to fit. Why should networking care about kthreads? > > I agree. > > > > Perhaps you can get kthread_stop to send a signal instead? > > Yes, why not embed a send_sig(SIGKILL) just before the wake_up_process() > in kthread_stop() itself? > > Looking at some happily out-of-date comments in the kthread code, I can > guess that at some point of time (perhaps very early drafts) Rusty actually > *did* implement the whole kthread_stop() functionality using signals, but > I suspect it might've been discarded and the kthread_stop_info approach > used instead to protect from spurious signals from userspace. (?) > > So could we have signals in _addition_ to kthread_stop_info and change > kthread_should_stop() to check for both: > > kthread_stop_info.k == current && signal_pending(current) > > If !kthread_should_stop() && signal_pending(current) => spurious signal, > so just flush and discard (in the kthread). > > > The problem there is that we still have to make the kthread let signals > > through. The nice thing about this approach is that we can make the > > kthread ignore signals, but still allow it to break out of kernel_recvmsg > > when a kthread_stop is done. > > Why is it wrong for kthreads to let signals through? We can ignore out > all signals we're not interested in, and flush the spurious ones ... > otherwise there really isn't much those kthreads can do that get blocked > in such functions, is there? > > Satyam Yes, after I wrote that I began to question that assumption too. I was pretty much going on a statement by Christoph Hellwig on an earlier patch that I did: -[snip]-- The right way to fix this is to stop sending signals at all and have a kernel-internal way to get out of kernel_recvmsg. Uses of signals by kernel thread generally are bugs. -[snip]-- Though this makes no sense to me. I don't see any reason why kthreads can't use signals, and hacking support for breaking out of sleeping functions seems redundant. My latest patch for cifsd has it block all signals from userspace and uses force_sig() instead of send_sig() when trying to stop the thread. This seems to work pretty well and still insulates the thread from userspace signals. -- Jeff Layton <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] RFC: have tcp_recvmsg() check kthread_should_stop() and treat it as if it were signalled
Hi, On 6/9/07, Jeff Layton <[EMAIL PROTECTED]> wrote: On Sat, 09 Jun 2007 11:30:04 +1000 Herbert Xu <[EMAIL PROTECTED]> wrote: > Please cc networking patches to [EMAIL PROTECTED] > > Jeff Layton <[EMAIL PROTECTED]> wrote: > > > > The following patch is a first stab at removing this need. It makes it > > so that in tcp_recvmsg() we also check kthread_should_stop() at any > > point where we currently check to see if the task was signalled. If > > that returns true, then it acts as if it were signalled and returns to > > the calling function. I got bit by the same thing when I was implementing a kthread that sleeps on skb_recv_datagram() (=> wait_for_packet) with noblock = 0 over a netlink socket. I need to use the same SIGKILL hack before kthread_stop() to ensure the kthread does wake up *and* unblock from skb_recv_datagram() when the module is being unloaded. Searched hard, but just couldn't find a prettier solution (if someone knows, please let me know). > This just doesn't seem to fit. Why should networking care about kthreads? I agree. > Perhaps you can get kthread_stop to send a signal instead? Yes, why not embed a send_sig(SIGKILL) just before the wake_up_process() in kthread_stop() itself? Looking at some happily out-of-date comments in the kthread code, I can guess that at some point of time (perhaps very early drafts) Rusty actually *did* implement the whole kthread_stop() functionality using signals, but I suspect it might've been discarded and the kthread_stop_info approach used instead to protect from spurious signals from userspace. (?) So could we have signals in _addition_ to kthread_stop_info and change kthread_should_stop() to check for both: kthread_stop_info.k == current && signal_pending(current) If !kthread_should_stop() && signal_pending(current) => spurious signal, so just flush and discard (in the kthread). The problem there is that we still have to make the kthread let signals through. The nice thing about this approach is that we can make the kthread ignore signals, but still allow it to break out of kernel_recvmsg when a kthread_stop is done. Why is it wrong for kthreads to let signals through? We can ignore out all signals we're not interested in, and flush the spurious ones ... otherwise there really isn't much those kthreads can do that get blocked in such functions, is there? Satyam - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] RFC: have tcp_recvmsg() check kthread_should_stop() and treat it as if it were signalled
Hi, On 6/9/07, Jeff Layton [EMAIL PROTECTED] wrote: On Sat, 09 Jun 2007 11:30:04 +1000 Herbert Xu [EMAIL PROTECTED] wrote: Please cc networking patches to [EMAIL PROTECTED] Jeff Layton [EMAIL PROTECTED] wrote: The following patch is a first stab at removing this need. It makes it so that in tcp_recvmsg() we also check kthread_should_stop() at any point where we currently check to see if the task was signalled. If that returns true, then it acts as if it were signalled and returns to the calling function. I got bit by the same thing when I was implementing a kthread that sleeps on skb_recv_datagram() (= wait_for_packet) with noblock = 0 over a netlink socket. I need to use the same SIGKILL hack before kthread_stop() to ensure the kthread does wake up *and* unblock from skb_recv_datagram() when the module is being unloaded. Searched hard, but just couldn't find a prettier solution (if someone knows, please let me know). This just doesn't seem to fit. Why should networking care about kthreads? I agree. Perhaps you can get kthread_stop to send a signal instead? Yes, why not embed a send_sig(SIGKILL) just before the wake_up_process() in kthread_stop() itself? Looking at some happily out-of-date comments in the kthread code, I can guess that at some point of time (perhaps very early drafts) Rusty actually *did* implement the whole kthread_stop() functionality using signals, but I suspect it might've been discarded and the kthread_stop_info approach used instead to protect from spurious signals from userspace. (?) So could we have signals in _addition_ to kthread_stop_info and change kthread_should_stop() to check for both: kthread_stop_info.k == current signal_pending(current) If !kthread_should_stop() signal_pending(current) = spurious signal, so just flush and discard (in the kthread). The problem there is that we still have to make the kthread let signals through. The nice thing about this approach is that we can make the kthread ignore signals, but still allow it to break out of kernel_recvmsg when a kthread_stop is done. Why is it wrong for kthreads to let signals through? We can ignore out all signals we're not interested in, and flush the spurious ones ... otherwise there really isn't much those kthreads can do that get blocked in such functions, is there? Satyam - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] RFC: have tcp_recvmsg() check kthread_should_stop() and treat it as if it were signalled
On Tue, 26 Jun 2007 01:11:20 +0530 Satyam Sharma [EMAIL PROTECTED] wrote: Hi, On 6/9/07, Jeff Layton [EMAIL PROTECTED] wrote: On Sat, 09 Jun 2007 11:30:04 +1000 Herbert Xu [EMAIL PROTECTED] wrote: Please cc networking patches to [EMAIL PROTECTED] Jeff Layton [EMAIL PROTECTED] wrote: The following patch is a first stab at removing this need. It makes it so that in tcp_recvmsg() we also check kthread_should_stop() at any point where we currently check to see if the task was signalled. If that returns true, then it acts as if it were signalled and returns to the calling function. I got bit by the same thing when I was implementing a kthread that sleeps on skb_recv_datagram() (= wait_for_packet) with noblock = 0 over a netlink socket. I need to use the same SIGKILL hack before kthread_stop() to ensure the kthread does wake up *and* unblock from skb_recv_datagram() when the module is being unloaded. Searched hard, but just couldn't find a prettier solution (if someone knows, please let me know). This just doesn't seem to fit. Why should networking care about kthreads? I agree. Perhaps you can get kthread_stop to send a signal instead? Yes, why not embed a send_sig(SIGKILL) just before the wake_up_process() in kthread_stop() itself? Looking at some happily out-of-date comments in the kthread code, I can guess that at some point of time (perhaps very early drafts) Rusty actually *did* implement the whole kthread_stop() functionality using signals, but I suspect it might've been discarded and the kthread_stop_info approach used instead to protect from spurious signals from userspace. (?) So could we have signals in _addition_ to kthread_stop_info and change kthread_should_stop() to check for both: kthread_stop_info.k == current signal_pending(current) If !kthread_should_stop() signal_pending(current) = spurious signal, so just flush and discard (in the kthread). The problem there is that we still have to make the kthread let signals through. The nice thing about this approach is that we can make the kthread ignore signals, but still allow it to break out of kernel_recvmsg when a kthread_stop is done. Why is it wrong for kthreads to let signals through? We can ignore out all signals we're not interested in, and flush the spurious ones ... otherwise there really isn't much those kthreads can do that get blocked in such functions, is there? Satyam Yes, after I wrote that I began to question that assumption too. I was pretty much going on a statement by Christoph Hellwig on an earlier patch that I did: -[snip]-- The right way to fix this is to stop sending signals at all and have a kernel-internal way to get out of kernel_recvmsg. Uses of signals by kernel thread generally are bugs. -[snip]-- Though this makes no sense to me. I don't see any reason why kthreads can't use signals, and hacking support for breaking out of sleeping functions seems redundant. My latest patch for cifsd has it block all signals from userspace and uses force_sig() instead of send_sig() when trying to stop the thread. This seems to work pretty well and still insulates the thread from userspace signals. -- Jeff Layton [EMAIL PROTECTED] - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] RFC: have tcp_recvmsg() check kthread_should_stop() and treat it as if it were signalled
Hi Jeff, [ Trimmed netdev from Cc: list, added Christoph. ] On 6/26/07, Jeff Layton [EMAIL PROTECTED] wrote: On Tue, 26 Jun 2007 01:11:20 +0530 Satyam Sharma [EMAIL PROTECTED] wrote: [...] Yes, why not embed a send_sig(SIGKILL) just before the wake_up_process() in kthread_stop() itself? Looking at some happily out-of-date comments in the kthread code, I can guess that at some point of time (perhaps very early drafts) Rusty actually *did* implement the whole kthread_stop() functionality using signals, but I suspect it might've been discarded and the kthread_stop_info approach used instead to protect from spurious signals from userspace. (?) So could we have signals in _addition_ to kthread_stop_info and change kthread_should_stop() to check for both: kthread_stop_info.k == current signal_pending(current) If !kthread_should_stop() signal_pending(current) = spurious signal, so just flush and discard (in the kthread). [...] Why is it wrong for kthreads to let signals through? We can ignore out all signals we're not interested in, and flush the spurious ones ... otherwise there really isn't much those kthreads can do that get blocked in such functions, is there? Yes, after I wrote that I began to question that assumption too. I was pretty much going on a statement by Christoph Hellwig on an earlier patch that I did: Ok, I found both the threads / patches you referred to ... -[snip]-- The right way to fix this is to stop sending signals at all and have a kernel-internal way to get out of kernel_recvmsg. Uses of signals by kernel thread generally are bugs. -[snip]-- Though this makes no sense to me. I don't see any reason why kthreads can't use signals, and hacking support for breaking out of sleeping functions seems redundant. Right, signals _are_ the signalling mechanism all through kernel code already, anything else would clearly be redundant. But I've listened / participated in other discussions about kthreads and signals and the general feeling is that (somebody correct me if I'm wrong) kernel threads are a kernel _implementation detail_ after all, and good design must ensure that userspace be unaware of even their existence. And I agree with that, but the real ugly uses of signals by kernel threads are those cases where we want to export a full signals-based interface to our kthread to userspace (such cases do exist in mainline, I think). But that's clearly not the case with the usage here. My latest patch for cifsd has it block all signals from userspace and uses force_sig() instead of send_sig() when trying to stop the thread. This seems to work pretty well and still insulates the thread from userspace signals. Thanks, I find this solution much cleaner too, so now we avoid any sort of spuriousness getting in from userspace (and pretty much takes care of all the checking-if-spurious-and-flushing business I referred to earlier). But how about making this part of kthreads proper? Functions such as skb_recv_datagram etc are pretty standard (and others such would also exist or get added in due time) and it is not exactly intuitive for a developer to add a force_sig(SIGKILL) before the kthread_stop() to ensure that the kthread using such functions does exit properly. [ I can foresee cases in the future when such functions are added to kthreads that did not have them previously, and suddenly someone reports a regression that the kthread stops exiting cleanly. ] Satyam - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] RFC: have tcp_recvmsg() check kthread_should_stop() and treat it as if it were signalled
On Tue, 26 Jun 2007 03:39:23 +0530 Satyam Sharma [EMAIL PROTECTED] wrote: Hi Jeff, [ Trimmed netdev from Cc: list, added Christoph. ] On 6/26/07, Jeff Layton [EMAIL PROTECTED] wrote: On Tue, 26 Jun 2007 01:11:20 +0530 Satyam Sharma [EMAIL PROTECTED] wrote: [...] Yes, why not embed a send_sig(SIGKILL) just before the wake_up_process() in kthread_stop() itself? Looking at some happily out-of-date comments in the kthread code, I can guess that at some point of time (perhaps very early drafts) Rusty actually *did* implement the whole kthread_stop() functionality using signals, but I suspect it might've been discarded and the kthread_stop_info approach used instead to protect from spurious signals from userspace. (?) So could we have signals in _addition_ to kthread_stop_info and change kthread_should_stop() to check for both: kthread_stop_info.k == current signal_pending(current) If !kthread_should_stop() signal_pending(current) = spurious signal, so just flush and discard (in the kthread). [...] Why is it wrong for kthreads to let signals through? We can ignore out all signals we're not interested in, and flush the spurious ones ... otherwise there really isn't much those kthreads can do that get blocked in such functions, is there? Yes, after I wrote that I began to question that assumption too. I was pretty much going on a statement by Christoph Hellwig on an earlier patch that I did: Ok, I found both the threads / patches you referred to ... -[snip]-- The right way to fix this is to stop sending signals at all and have a kernel-internal way to get out of kernel_recvmsg. Uses of signals by kernel thread generally are bugs. -[snip]-- Though this makes no sense to me. I don't see any reason why kthreads can't use signals, and hacking support for breaking out of sleeping functions seems redundant. Right, signals _are_ the signalling mechanism all through kernel code already, anything else would clearly be redundant. But I've listened / participated in other discussions about kthreads and signals and the general feeling is that (somebody correct me if I'm wrong) kernel threads are a kernel _implementation detail_ after all, and good design must ensure that userspace be unaware of even their existence. And I agree with that, but the real ugly uses of signals by kernel threads are those cases where we want to export a full signals-based interface to our kthread to userspace (such cases do exist in mainline, I think). But that's clearly not the case with the usage here. My latest patch for cifsd has it block all signals from userspace and uses force_sig() instead of send_sig() when trying to stop the thread. This seems to work pretty well and still insulates the thread from userspace signals. Thanks, I find this solution much cleaner too, so now we avoid any sort of spuriousness getting in from userspace (and pretty much takes care of all the checking-if-spurious-and-flushing business I referred to earlier). But how about making this part of kthreads proper? Functions such as skb_recv_datagram etc are pretty standard (and others such would also exist or get added in due time) and it is not exactly intuitive for a developer to add a force_sig(SIGKILL) before the kthread_stop() to ensure that the kthread using such functions does exit properly. [ I can foresee cases in the future when such functions are added to kthreads that did not have them previously, and suddenly someone reports a regression that the kthread stops exiting cleanly. ] Satyam I have no issue with it. I just didn't feel confident enough to know whether that might have harmful side effects. Right offhand I don't see why adding a force_sig() to kthread_stop() would be an issue. It seems like if you're wanting to stop the thread anyway then a signal probably wouldn't hurt anything. -- Jeff Layton [EMAIL PROTECTED] - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] RFC: have tcp_recvmsg() check kthread_should_stop() and treat it as if it were signalled
On Sat, 09 Jun 2007 11:30:04 +1000 Herbert Xu <[EMAIL PROTECTED]> wrote: > Please cc networking patches to [EMAIL PROTECTED] > > Jeff Layton <[EMAIL PROTECTED]> wrote: > > > > The following patch is a first stab at removing this need. It makes it > > so that in tcp_recvmsg() we also check kthread_should_stop() at any > > point where we currently check to see if the task was signalled. If > > that returns true, then it acts as if it were signalled and returns to > > the calling function. > > This just doesn't seem to fit. Why should networking care about kthreads? > > Perhaps you can get kthread_stop to send a signal instead? > The problem there is that we still have to make the kthread let signals through. The nice thing about this approach is that we can make the kthread ignore signals, but still allow it to break out of kernel_recvmsg when a kthread_stop is done. Though I will confess that you have a point about this feeling like a layering violation... -- Jeff Layton <[EMAIL PROTECTED]> - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] RFC: have tcp_recvmsg() check kthread_should_stop() and treat it as if it were signalled
On Sat, 09 Jun 2007 11:30:04 +1000 Herbert Xu [EMAIL PROTECTED] wrote: Please cc networking patches to [EMAIL PROTECTED] Jeff Layton [EMAIL PROTECTED] wrote: The following patch is a first stab at removing this need. It makes it so that in tcp_recvmsg() we also check kthread_should_stop() at any point where we currently check to see if the task was signalled. If that returns true, then it acts as if it were signalled and returns to the calling function. This just doesn't seem to fit. Why should networking care about kthreads? Perhaps you can get kthread_stop to send a signal instead? The problem there is that we still have to make the kthread let signals through. The nice thing about this approach is that we can make the kthread ignore signals, but still allow it to break out of kernel_recvmsg when a kthread_stop is done. Though I will confess that you have a point about this feeling like a layering violation... -- Jeff Layton [EMAIL PROTECTED] - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] RFC: have tcp_recvmsg() check kthread_should_stop() and treat it as if it were signalled
Please cc networking patches to [EMAIL PROTECTED] Jeff Layton <[EMAIL PROTECTED]> wrote: > > The following patch is a first stab at removing this need. It makes it > so that in tcp_recvmsg() we also check kthread_should_stop() at any > point where we currently check to see if the task was signalled. If > that returns true, then it acts as if it were signalled and returns to > the calling function. This just doesn't seem to fit. Why should networking care about kthreads? Perhaps you can get kthread_stop to send a signal instead? Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <[EMAIL PROTECTED]> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] RFC: have tcp_recvmsg() check kthread_should_stop() and treat it as if it were signalled
Please cc networking patches to [EMAIL PROTECTED] Jeff Layton [EMAIL PROTECTED] wrote: The following patch is a first stab at removing this need. It makes it so that in tcp_recvmsg() we also check kthread_should_stop() at any point where we currently check to see if the task was signalled. If that returns true, then it acts as if it were signalled and returns to the calling function. This just doesn't seem to fit. Why should networking care about kthreads? Perhaps you can get kthread_stop to send a signal instead? Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED] Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt - To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/