Thiru,

During IPMP stress tests, we've repeatedly hit deadlocks related to the
current (broken) method of restarting the IGMP/MLD timers in ipsq_exit().
Since we really need a fix in order to do proper stress tests, I've come
up with a fairly simple change that should fix the problem.  It's along
the lines of what you were thinking (but not quite as far down that path),
in that a dedicated worker thread has been created that is responsible for
restarting the timers.  Whenever a timer needs to be restarted (e.g., in
igmp_joingroup()), the caller signals this thread to do the work.  Please see:

   http://zhadum.east/export/ws/clearview/clearview-ipmpdev/webrev/

The relevant files are igmp.c, ip.c and ip_if.c (there are some other
changes in the workspace that should be clearly unrelated).  Please
let me know what you think.
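
For anyone who wants the shape of the idea without opening the webrev,
here is a minimal userland sketch using POSIX threads.  It is not the
code in the workspace: the timer_restart_* names and fields are all
hypothetical, and the kernel version would use kernel mutexes and
condition variables instead.  The point is only that callers set a flag
and signal, and the worker does the actual restart outside their locks.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-stack timer-restart state. */
typedef struct timer_restart {
	pthread_mutex_t	tr_lock;
	pthread_cond_t	tr_cv;
	bool		tr_needed;	/* a restart has been requested */
	bool		tr_quit;	/* worker should exit once idle */
	unsigned	tr_restarts;	/* restarts performed so far */
	pthread_t	tr_thread;
} timer_restart_t;

static void *
timer_restart_worker(void *arg)
{
	timer_restart_t *tr = arg;

	pthread_mutex_lock(&tr->tr_lock);
	for (;;) {
		while (!tr->tr_needed && !tr->tr_quit)
			pthread_cond_wait(&tr->tr_cv, &tr->tr_lock);
		if (!tr->tr_needed)
			break;		/* quit requested, nothing pending */
		tr->tr_needed = false;
		/* Drop the lock; the real timer restart would go here. */
		pthread_mutex_unlock(&tr->tr_lock);
		pthread_mutex_lock(&tr->tr_lock);
		tr->tr_restarts++;
	}
	pthread_mutex_unlock(&tr->tr_lock);
	return (NULL);
}

/* Called from paths such as a group join; never restarts the timer itself. */
static void
timer_restart_request(timer_restart_t *tr)
{
	pthread_mutex_lock(&tr->tr_lock);
	tr->tr_needed = true;
	pthread_cond_signal(&tr->tr_cv);
	pthread_mutex_unlock(&tr->tr_lock);
}

/* Returns 0 on success, or the pthread_create() error number. */
static int
timer_restart_init(timer_restart_t *tr)
{
	pthread_mutex_init(&tr->tr_lock, NULL);
	pthread_cond_init(&tr->tr_cv, NULL);
	tr->tr_needed = false;
	tr->tr_quit = false;
	tr->tr_restarts = 0;
	return (pthread_create(&tr->tr_thread, NULL, timer_restart_worker,
	    tr));
}

/* Ask the worker to exit and wait for it before tearing down the state. */
static void
timer_restart_fini(timer_restart_t *tr)
{
	pthread_mutex_lock(&tr->tr_lock);
	tr->tr_quit = true;
	pthread_cond_signal(&tr->tr_cv);
	pthread_mutex_unlock(&tr->tr_lock);
	(void) pthread_join(tr->tr_thread, NULL);
	pthread_cond_destroy(&tr->tr_cv);
	pthread_mutex_destroy(&tr->tr_lock);
}
```

Note that requests arriving while one is already pending are coalesced
into a single restart in this sketch, and a pending request is still
serviced before the worker exits.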

BTW, along the way, I noticed what appears to be a race in the new
capability thread logic: it seems like the ip_stack_t could be freed
before ill_taskq_dispatch() is done referencing it.  This seems fairly
simple to fix with some more communication between ill_taskq_dispatch()
and ip_stack_fini(), as I've done with the timer thread above.
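
To illustrate the kind of handshake I mean, here is a small userland
sketch with POSIX threads.  It is an assumption-laden model, not the
actual capability thread code: demo_stack_t and the demo_* functions are
hypothetical.  The teardown path asks the worker to stop and then waits
for the worker to announce that it is done with the structure before
freeing it.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-stack structure used by a worker. */
typedef struct demo_stack {
	pthread_mutex_t	ds_lock;
	pthread_cond_t	ds_cv;
	bool		ds_quit;	/* teardown asks the worker to stop */
	bool		ds_exited;	/* worker is done with the stack */
} demo_stack_t;

static void *
demo_worker(void *arg)
{
	demo_stack_t *ds = arg;

	pthread_mutex_lock(&ds->ds_lock);
	while (!ds->ds_quit)
		pthread_cond_wait(&ds->ds_cv, &ds->ds_lock);
	/* Announce the exit; the unlock is the worker's last reference. */
	ds->ds_exited = true;
	pthread_cond_broadcast(&ds->ds_cv);
	pthread_mutex_unlock(&ds->ds_lock);
	return (NULL);
}

/* Allocates the stack and starts a detached worker; NULL on failure. */
static demo_stack_t *
demo_stack_init(void)
{
	demo_stack_t *ds = calloc(1, sizeof (*ds));
	pthread_t tid;

	if (ds == NULL)
		return (NULL);
	pthread_mutex_init(&ds->ds_lock, NULL);
	pthread_cond_init(&ds->ds_cv, NULL);
	if (pthread_create(&tid, NULL, demo_worker, ds) != 0) {
		pthread_cond_destroy(&ds->ds_cv);
		pthread_mutex_destroy(&ds->ds_lock);
		free(ds);
		return (NULL);
	}
	(void) pthread_detach(tid);
	return (ds);
}

/*
 * Frees the stack only after the worker has said it is finished.
 * Returns the value of ds_exited observed just before the free.
 */
static bool
demo_stack_fini(demo_stack_t *ds)
{
	bool exited;

	pthread_mutex_lock(&ds->ds_lock);
	ds->ds_quit = true;
	pthread_cond_broadcast(&ds->ds_cv);
	while (!ds->ds_exited)
		pthread_cond_wait(&ds->ds_cv, &ds->ds_lock);
	exited = ds->ds_exited;
	pthread_mutex_unlock(&ds->ds_lock);

	pthread_cond_destroy(&ds->ds_cv);
	pthread_mutex_destroy(&ds->ds_lock);
	free(ds);
	return (exited);
}
```

Without the wait on ds_exited, the free could run while the worker is
still inside its loop, which is the shape of the race described above.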

-- 
meem
