On Thu, May 03, 2012 at 02:14:20PM +0100, Attilio Rao wrote:
> 2012/5/3, Konstantin Belousov <kostik...@gmail.com>:
> > On Thu, May 03, 2012 at 12:02:08PM +0100, Attilio Rao wrote:
> >> 2012/5/3, Konstantin Belousov <k...@freebsd.org>:
> >> > Author: kib
> >> > Date: Thu May 3 10:38:02 2012
> >> > New Revision: 234952
> >> > URL: http://svn.freebsd.org/changeset/base/234952
> >> >
> >> > Log:
> >> > When callout_reset_on() cannot immediately migrate a callout since it
> >> > is running on other cpu, the CALLOUT_PENDING flag is temporarily
> >> > cleared. Then, callout_stop() on this, in fact active, callout fails
> >> > because CALLOUT_PENDING is not set, and callout_stop() returns 0.
> >> >
> >> > Now, in sleepq_check_timeout(), the failed callout_stop() causes the
> >> > sleepq code to execute mi_switch() without even setting the wmesg,
> >> > since the switch-out is supposed to be transient. In fact, the thread
> >> > is put off the CPU for full timeout interval, instead of being put on
> >> > runq immediately. Until timeout fires, the process is unkillable for
> >> > obvious reasons.
> >> >
> >> > Fix this by marking the migrating callouts with CALLOUT_DFRMIGRATION
> >> > flag. The flag is cleared by callout_stop_safe() when the function
> >> > detects a migration, besides returning the success. The softclock()
> >> > rechecks the flag for migrating callout and cancels its execution if
> >> > the flag was cleared meantime.
> >>
> >> Can you please clarify why you cannot simply drop the deferred
> >> migration in the case !CALLOUT_PENDING in callout_stop_safe()?
> >
> > I probably can, I think I went with the route of committed patch
> > because it is slightly less work. Also, the comment in the while()
> > loop suggested me to rely on softclock.
>
> I don't think this is more work at all, the attached patch
> (pre-r234952, untested) should address it properly in few than 10
> lines:
> http://www.freebsd.org/~attilio/callout_cancel_mig_stop.patch
>
> without the need to add further flags and re-using existing mechanisms.

(cc->cc_curr != c) is not the case that caused the issue. It might need
to be treated this way as well, but the reported case is the opposite.
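
To make the reported case concrete, here is a rough user-space model of the
flag handling described in the r234952 log. It is only a sketch and not the
kernel code: the toy_* names, the flag values and the `running` field
(standing in loosely for the callout currently executing, i.e. the case
opposite to cc->cc_curr != c) are assumptions made for illustration.

```c
#include <stdbool.h>

/* Toy flag values; illustrative only, not copied from sys/callout.h. */
#define TOY_PENDING      0x1   /* stands in for CALLOUT_PENDING */
#define TOY_DFRMIGRATION 0x2   /* stands in for CALLOUT_DFRMIGRATION */

/* Hypothetical, heavily simplified callout state. */
struct toy_callout {
    int  flags;
    bool running;   /* handler is executing on another CPU right now */
};

/*
 * Models a reset that cannot migrate the callout immediately because
 * it is running elsewhere: PENDING is temporarily cleared. With
 * mark_migration set, the deferred migration is also recorded in a
 * flag, as the log describes for the fix.
 */
static void
toy_reset_deferred(struct toy_callout *c, bool mark_migration)
{
    c->flags &= ~TOY_PENDING;
    if (mark_migration)
        c->flags |= TOY_DFRMIGRATION;
}

/*
 * Models the stop path. Returns 1 if something was cancelled and 0
 * otherwise. A deferred migration counts as cancellable; clearing the
 * flag is what later tells the softclock side to give up.
 */
static int
toy_stop(struct toy_callout *c)
{
    if (c->flags & TOY_PENDING) {
        c->flags &= ~TOY_PENDING;
        return (1);
    }
    if (c->flags & TOY_DFRMIGRATION) {
        c->flags &= ~TOY_DFRMIGRATION;
        return (1);
    }
    return (0);
}

/*
 * Models the recheck done once the handler has finished: the callout
 * is rescheduled only if the migration flag survived. Returns true if
 * the callout was made pending again.
 */
static bool
toy_softclock_finish_migration(struct toy_callout *c)
{
    c->running = false;
    if ((c->flags & TOY_DFRMIGRATION) == 0)
        return (false);
    c->flags &= ~TOY_DFRMIGRATION;
    c->flags |= TOY_PENDING;
    return (true);
}
```

In this model, toy_reset_deferred(&c, false) followed by toy_stop(&c)
returns 0, which is the failure that left the sleeping thread off the CPU
for the whole timeout. With mark_migration set, toy_stop() returns 1 and
the later recheck does not reschedule the callout.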