Re: [PATCH 3/6] workqueue: make cancel_rearming_delayed_workqueue() work on idle dwork
On 02/08, Andrew Morton wrote:
>
> On Thu, 8 Feb 2007 11:35:39 +0300 Oleg Nesterov <[EMAIL PROTECTED]> wrote:
>
> > Andrew, do you think it is worth to tweak delayed works so it would be
> > possible to use flush_work(dwork->work) ?
>
> I've completely lost track of what you've been doing in there (this is a
> problem) but sure, if the patch isn't too horrid it's always better to be
> robust in the core than to have to work around inadequacies in the callers.

It is not so obvious to me what should be done.

Note that this problem is not connected to the recent changes; these were
(I hope) completely transparent for the delayed works.

The comment for cancel_delayed_work() says

	Note that the work callback function may still be running on return
	from cancel_delayed_work(). Run flush_scheduled_work() or flush_work()
	to wait on it.

The same is true for cancel_rearming_delayed_work(), but not documented.
Note also that the comment above is wrong: we can't use
flush_work(dwork->work). It was never supposed to work, because
queue_delayed_work() uses work->data "wrongly".

Now,

	- We can change cancel_rearming_delayed_work() so it does a final
	  flush_workqueue(). But this means that the 2 flavors of cancel
	  delayed work will have a subtle difference.

OR

	- Document the fact that cancel_rearming_delayed_work() doesn't
	  guarantee that ->func() is not running upon return, and fix the
	  affected callers.

Finally, we can also tweak delayed_works so it will actually be possible to
use flush_work(dwork->work) after cancel_{,rearming_}delayed_work(). Seems to
make sense, but needs (hopefully not too horrid) changes.

And other problems. Currently cancel_rearming_delayed_work(dwork) will hang if
dwork was never scheduled, or if cancel_rearming_delayed_work() was already
called before. The first problem is solved by this patch, the second is still
here. The fix is simple _unless_ we are going to implement "flush_work() works
on dwork->work" above.

Oh, I can't make a decision, please tell me...

Oleg.
- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
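To make the difference between the two options in the message above concrete, here is a small userspace model in C. It is only a sketch and not kernel code: struct model_state and the model_* functions are made-up stand-ins for the delayed work, cancel_delayed_work(), flush_workqueue() and cancel_rearming_delayed_work(), and the flush is assumed to return only after the handler has finished.

```c
/* Userspace model only; none of these names are kernel APIs. */
#include <stdbool.h>

struct model_state {
	bool timer_pending;	/* the delayed work's timer is armed */
	bool handler_running;	/* ->func() is executing on the worker thread */
};

/* Models cancel_delayed_work(): true only if it removed a pending timer. */
static bool model_cancel(struct model_state *s)
{
	bool was_pending = s->timer_pending;

	s->timer_pending = false;
	return was_pending;
}

/* Models flush_workqueue(): assumed to return once the handler is done. */
static void model_flush(struct model_state *s)
{
	s->handler_running = false;
}

/*
 * Models cancel_rearming_delayed_work(). Precondition: the timer is
 * pending; otherwise this spins, like the hang described in the thread.
 * final_flush == true is the "final flush_workqueue()" option,
 * final_flush == false is the behaviour that would only be documented.
 */
static void model_cancel_rearming(struct model_state *s, bool final_flush)
{
	while (!model_cancel(s))
		model_flush(s);
	if (final_flush)
		model_flush(s);
}
```

If the handler has just re-armed the timer and is still running, the first cancel attempt succeeds and the loop body never executes. Without the final flush the function then returns while handler_running is still true, which is the module-unload race described for ipvs; with the final flush it does not.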
Re: [PATCH 3/6] workqueue: make cancel_rearming_delayed_workqueue() work on idle dwork
On Thu, 8 Feb 2007 11:35:39 +0300 Oleg Nesterov <[EMAIL PROTECTED]> wrote:

> On 02/08, Horms wrote:
> >
> > On Wed, Feb 07, 2007 at 08:43:55PM +0300, Oleg Nesterov wrote:
> > >
> > > I think we have another problem with delayed_works.
> > >
> > > cancel_rearming_delayed_workqueue() doesn't guarantee that the ->func()
> > > is not running upon return. I don't know if it is a bug or not, the
> > > comment says nothing about that.
> > >
> > > However, we have the callers which seem to assume the opposite, example
> > >
> > > 	net/ipv4/ipvs/ip_vs_core.c
> > >
> > > 	module_exit
> > > 		ip_vs_cleanup
> > > 			ip_vs_control_cleanup
> > > 				cancel_rearming_delayed_work
> > > 	// done
> > >
> > > This is unsafe. The module may be unloaded and the memory may be freed
> > > while defense_work_handler() is still running/preempted.
> > >
> > > Unless I missed something, which side should be fixed?
> >
> > Assuming the decision is to fix the ipvs side, is the fix
> > just to remove the call to cancel_rearming_delayed_work() in
> > ip_vs_control_cleanup() ?
>
> I think ip_vs_control_cleanup() should also do flush_workqueue() after
> cancel_rearming_delayed_work().
>
> This is ugly, because we have flush_work() but can't use it on delayed
> works. This is possible to change, but not so trivial.
>
> Andrew, do you think it is worth to tweak delayed works so it would be
> possible to use flush_work(dwork->work) ?
>

I've completely lost track of what you've been doing in there (this is a
problem) but sure, if the patch isn't too horrid it's always better to be
robust in the core than to have to work around inadequacies in the callers.
Re: [PATCH 3/6] workqueue: make cancel_rearming_delayed_workqueue() work on idle dwork
On 02/08, Horms wrote:
>
> On Wed, Feb 07, 2007 at 08:43:55PM +0300, Oleg Nesterov wrote:
> >
> > I think we have another problem with delayed_works.
> >
> > cancel_rearming_delayed_workqueue() doesn't guarantee that the ->func()
> > is not running upon return. I don't know if it is a bug or not, the
> > comment says nothing about that.
> >
> > However, we have the callers which seem to assume the opposite, example
> >
> > 	net/ipv4/ipvs/ip_vs_core.c
> >
> > 	module_exit
> > 		ip_vs_cleanup
> > 			ip_vs_control_cleanup
> > 				cancel_rearming_delayed_work
> > 	// done
> >
> > This is unsafe. The module may be unloaded and the memory may be freed
> > while defense_work_handler() is still running/preempted.
> >
> > Unless I missed something, which side should be fixed?
>
> Assuming the decision is to fix the ipvs side, is the fix
> just to remove the call to cancel_rearming_delayed_work() in
> ip_vs_control_cleanup() ?

I think ip_vs_control_cleanup() should also do flush_workqueue() after
cancel_rearming_delayed_work().

This is ugly, because we have flush_work() but can't use it on delayed
works. This is possible to change, but not so trivial.

Andrew, do you think it is worth to tweak delayed works so it would be
possible to use flush_work(dwork->work) ?

Oleg.
Re: [PATCH 3/6] workqueue: make cancel_rearming_delayed_workqueue() work on idle dwork
On Wed, Feb 07, 2007 at 08:43:55PM +0300, Oleg Nesterov wrote:
> On 02/07, Oleg Nesterov wrote:
> >
> > The following code
> >
> > 	schedule_delayed_work(dw);
> > 	cancel_rearming_delayed_workqueue(dw);	// OK
> > 	cancel_rearming_delayed_workqueue(dw);	// HANGS!
> >
> > still doesn't work.
>
> I think we have another problem with delayed_works.
>
> cancel_rearming_delayed_workqueue() doesn't guarantee that the ->func() is
> not running upon return. I don't know if it is a bug or not, the comment
> says nothing about that.
>
> However, we have the callers which seem to assume the opposite, example
>
> 	net/ipv4/ipvs/ip_vs_core.c
>
> 	module_exit
> 		ip_vs_cleanup
> 			ip_vs_control_cleanup
> 				cancel_rearming_delayed_work
> 	// done
>
> This is unsafe. The module may be unloaded and the memory may be freed
> while defense_work_handler() is still running/preempted.
>
> Unless I missed something, which side should be fixed?

Assuming the decision is to fix the ipvs side, is the fix
just to remove the call to cancel_rearming_delayed_work() in
ip_vs_control_cleanup() ?

-- 
Horms
  H: http://www.vergenet.net/~horms/
  W: http://www.valinux.co.jp/en/
Re: [PATCH 3/6] workqueue: make cancel_rearming_delayed_workqueue() work on idle dwork
On 02/07, Oleg Nesterov wrote:
>
> The following code
>
> 	schedule_delayed_work(dw);
> 	cancel_rearming_delayed_workqueue(dw);	// OK
> 	cancel_rearming_delayed_workqueue(dw);	// HANGS!
>
> still doesn't work.

I think we have another problem with delayed_works.

cancel_rearming_delayed_workqueue() doesn't guarantee that the ->func() is not
running upon return. I don't know if it is a bug or not, the comment says
nothing about that.

However, we have the callers which seem to assume the opposite, example

	net/ipv4/ipvs/ip_vs_core.c

	module_exit
		ip_vs_cleanup
			ip_vs_control_cleanup
				cancel_rearming_delayed_work
	// done

This is unsafe. The module may be unloaded and the memory may be freed
while defense_work_handler() is still running/preempted.

Unless I missed something, which side should be fixed?

Oleg.
Re: [PATCH 3/6] workqueue: make cancel_rearming_delayed_workqueue() work on idle dwork
On 02/07, Daniel Drake wrote:
>
> Oleg Nesterov wrote:
> >cancel_rearming_delayed_workqueue(dwork) will hang forever if dwork was not
> >scheduled, because in that case cancel_delayed_work()->del_timer_sync()
> >never returns true.
>
> Thanks! We hit this problem before with the zd1211rw driver and avoided
> using cancel_rearming_delayed_workqueue() for this reason.

Great. But I am afraid my changelog was incomplete. This patch only fixes the
cancel_rearming_delayed_workqueue(freshly_initialized_dwork) lockup.

The following code

	schedule_delayed_work(dw);
	cancel_rearming_delayed_workqueue(dw);	// OK
	cancel_rearming_delayed_workqueue(dw);	// HANGS!

still doesn't work. Is it worth fixing? The fix is very simple, and probably
makes sense by itself:

	cancel_delayed_work:

	-	work_release(&work->work);
	+	work->work.data = NULL;

Oleg.
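The double-cancel case above can be walked through with a small userspace model in C. This is a sketch under stated assumptions, not kernel code: struct model_dwork and the model_* functions are hypothetical, work_release() is assumed to clear only the pending flag and leave the workqueue pointer in work->data, and the max_flushes cap exists only so the model can report a hang instead of actually spinning.

```c
/* Userspace model only; every name here is a hypothetical stand-in. */
#include <stdbool.h>
#include <stddef.h>

struct model_dwork {
	bool timer_pending;	/* models the armed timer */
	bool work_pending;	/* models the pending flag of the work */
	void *wq;		/* models the workqueue pointer kept in work->data */
};

/* Models schedule_delayed_work(): records the queue and arms the timer. */
static void model_schedule(struct model_dwork *dw, void *wq)
{
	dw->wq = wq;
	dw->work_pending = true;
	dw->timer_pending = true;
}

/*
 * Models cancel_delayed_work(). With clear_data == false it behaves like
 * the work_release() variant (assumed to clear only the pending flag);
 * with clear_data == true it behaves like "work->work.data = NULL".
 */
static bool model_cancel(struct model_dwork *dw, bool clear_data)
{
	bool was_pending = dw->timer_pending;

	dw->timer_pending = false;
	if (was_pending) {
		dw->work_pending = false;
		if (clear_data)
			dw->wq = NULL;
	}
	return was_pending;
}

/*
 * Models the patched cancel_rearming_delayed_workqueue(). Returns the
 * number of flushes performed, or -1 once max_flushes is reached, which
 * stands in for the real loop never terminating.
 */
static int model_cancel_rearming(struct model_dwork *dw, bool clear_data,
				 int max_flushes)
{
	int flushes = 0;

	if (dw->wq == NULL)	/* "Was it ever queued ?" */
		return 0;
	while (!model_cancel(dw, clear_data)) {
		if (flushes == max_flushes)
			return -1;
		flushes++;	/* models flush_workqueue(wq) */
	}
	return flushes;
}
```

In this model the second cancel reports a hang when the first cancel left the workqueue pointer in place, and returns immediately when the first cancel cleared it, because the "was it ever queued" check then sees a NULL pointer.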
Re: [PATCH 3/6] workqueue: make cancel_rearming_delayed_workqueue() work on idle dwork
Oleg Nesterov wrote:
> cancel_rearming_delayed_workqueue(dwork) will hang forever if dwork was not
> scheduled, because in that case cancel_delayed_work()->del_timer_sync() never
> returns true.

Thanks! We hit this problem before with the zd1211rw driver and avoided
using cancel_rearming_delayed_workqueue() for this reason. I never did get
around to looking into whether the function itself could be fixed, although I
see not much effort would have been needed :)

Daniel
[PATCH 3/6] workqueue: make cancel_rearming_delayed_workqueue() work on idle dwork
cancel_rearming_delayed_workqueue(dwork) will hang forever if dwork was not
scheduled, because in that case cancel_delayed_work()->del_timer_sync() never
returns true.

I don't know if there are any callers which may have problems, but this is not
so convenient, and the fix is very simple.

Q: looks like we don't need "struct workqueue_struct *wq" parameter. If the
timer was aborted successfully, get_wq_data() == wq. Is it worth to add the
new function?

Signed-off-by: Oleg Nesterov <[EMAIL PROTECTED]>

--- 6.20-rc6-mm3/kernel/workqueue.c~3_cdw	2007-02-06 23:09:34.0 +0300
+++ 6.20-rc6-mm3/kernel/workqueue.c	2007-02-06 23:42:43.0 +0300
@@ -565,6 +565,10 @@ EXPORT_SYMBOL(flush_work_keventd);
 void cancel_rearming_delayed_workqueue(struct workqueue_struct *wq,
 				       struct delayed_work *dwork)
 {
+	/* Was it ever queued ? */
+	if (!get_wq_data(&dwork->work))
+		return;
+
 	while (!cancel_delayed_work(dwork))
 		flush_workqueue(wq);
 }
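Why the added check matters for a never-queued dwork can be seen in a small userspace model in C. It is a sketch only: struct model_dwork, model_cancel() and model_cancel_rearming() are hypothetical names, the wq field stands in for what get_wq_data() would return, and max_flushes is an artificial cap so the model can report a hang rather than loop forever.

```c
/* Userspace model only; these names are not kernel APIs. */
#include <stdbool.h>
#include <stddef.h>

struct model_dwork {
	bool timer_pending;	/* models the armed timer */
	void *wq;		/* NULL until the work is queued for the first time */
};

/* Models cancel_delayed_work(): true only if it removed a pending timer. */
static bool model_cancel(struct model_dwork *dw)
{
	bool was_pending = dw->timer_pending;

	dw->timer_pending = false;
	return was_pending;
}

/*
 * Models cancel_rearming_delayed_workqueue() with or without the new
 * check. Returns the number of flushes, or -1 once max_flushes is
 * reached, which stands in for the loop never terminating.
 */
static int model_cancel_rearming(struct model_dwork *dw, bool with_guard,
				 int max_flushes)
{
	int flushes = 0;

	if (with_guard && dw->wq == NULL)	/* "Was it ever queued ?" */
		return 0;
	while (!model_cancel(dw)) {
		if (flushes == max_flushes)
			return -1;
		flushes++;	/* models flush_workqueue(wq) */
	}
	return flushes;
}
```

For a freshly initialized dwork no timer is ever pending, so the cancel step can never succeed and only the early return ends the call.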