Hello, Lai. On Mon, Sep 17, 2012 at 10:40:05AM +0800, Lai Jiangshan wrote: > try_to_grab_pending() leave LINKED tagalong in delayed queue when > it deletes a work. This behavior will cause future > cwq_activate_first_delayed() increase the ->nr_active wrongly, > and may cause the whole cwq frozen. > > example: > > state: cwq->max_active = 1, cwq->nr_active = 1 > one work in cwq->pool, many in cwq->delayed_works. > > step1: try_to_grab_pending() remove a work from delayed_works but > leave tagalong. > step2: when the work in cwq->pool is finished, > cwq_activate_first_delayed() > move the tagalong to cwq->pool and increase the ->nr_active. > > current state: cwq->nr_active = 1, but works of the cwq in cwq->pool are all > NO_COLOR, > so even when these works are finished, cwq->nr_active will not be > decreased, > and no work will be moved from cwq->delayed_works. the cwq is frozen. > > Fix it by moving the tagalong to cwq->pool in try_to_grab_pending(). > > Signed-off-by: Lai Jiangshan <la...@cn.fujitsu.com>
Nice spotting. > @@ -1101,11 +1101,26 @@ static int try_to_grab_pending(struct work_struct > *work, bool is_dwork, > */ > smp_rmb(); > if (gcwq == get_work_gcwq(work)) { > + unsigned long delayed; > + > + /* > + * move LINKED tagalong(if exist) out from > + * delayed queue. Otherwise some future > + * cwq_activate_first_delayed() may move > + * works(which are all NO_COLOR) to cwq->pool > + * and increase the ->nr_active. It may > + * cause the whole cwq frozen. > + */ > + delayed = *work_data_bits(work) & WORK_STRUCT_DELAYED; > + if (delayed) > + move_linked_works(work, > + &get_work_cwq(work)->pool->worklist, > + NULL); > + > debug_work_deactivate(work); > list_del_init(&work->entry); > cwq_dec_nr_in_flight(get_work_cwq(work), > - get_work_color(work), > - *work_data_bits(work) & WORK_STRUCT_DELAYED); > + get_work_color(work), delayed); I'm a bit uneasy about how this introduces something which is on active list but not quite active. Can we first activate it and then grab its pending? IOW, change cwq_activate_first_delayed() to cwq_activate_delayed() which takes @work instead and then call it with the work item being grabbed in try_to_grab_pending() before dec'ing nr_in_flight? Thanks. -- tejun -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/