Ping

On 03/28/2014 08:07 PM, Lai Jiangshan wrote:
>>From 11af0cd0306309f0deaf3326cc26d3e7e517e3d1 Mon Sep 17 00:00:00 2001
> From: Lai Jiangshan <la...@cn.fujitsu.com>
> Date: Fri, 28 Mar 2014 00:20:12 +0800
> Subject: [PATCH] workqueue: fix possible race condition when rescuer VS
>  pwq-release
>
> There is a race condition between rescuer_thread() and
> pwq_unbound_release_workfn().
>
> The works of the @pwq may be processed by some other workers,
> and @pwq may be scheduled for release (because its wq's attrs changed)
> before the rescuer starts to process it. In this case
> pwq_unbound_release_workfn() will corrupt the wq->maydays list,
> and rescuer_thread() will access corrupted data.
>
> Taking a reference with get_unbound_pwq() in send_mayday() extends
> @pwq's lifetime and avoids the race condition.
>
> Changes from V1:
>   1) Introduce get_unbound_pwq() for better readability. Since
>      get_pwq() is considered a no-op for percpu workqueues,
>      the patch is functionally the same as V1.
>   2) More precise comments.
>
> Signed-off-by: Lai Jiangshan <la...@cn.fujitsu.com>
> ---
>  kernel/workqueue.c | 30 ++++++++++++++++++++++++++++++
>  1 files changed, 30 insertions(+), 0 deletions(-)
>
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index 0c74979..d845bdd 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -1050,6 +1050,12 @@ static void get_pwq(struct pool_workqueue *pwq)
>  	pwq->refcnt++;
>  }
>
> +static inline void get_unbound_pwq(struct pool_workqueue *pwq)
> +{
> +	if (pwq->wq->flags & WQ_UNBOUND)
> +		get_pwq(pwq);
> +}
> +
>  /**
>   * put_pwq - put a pool_workqueue reference
>   * @pwq: pool_workqueue to put
> @@ -1075,6 +1081,12 @@ static void put_pwq(struct pool_workqueue *pwq)
>  	schedule_work(&pwq->unbound_release_work);
>  }
>
> +static inline void put_unbound_pwq(struct pool_workqueue *pwq)
> +{
> +	if (pwq->wq->flags & WQ_UNBOUND)
> +		put_pwq(pwq);
> +}
> +
>  /**
>   * put_pwq_unlocked - put_pwq() with surrounding pool lock/unlock
>   * @pwq: pool_workqueue to put (can be %NULL)
> @@ -1908,6 +1920,19 @@ static void send_mayday(struct work_struct *work)
>
>  	/* mayday mayday mayday */
>  	if (list_empty(&pwq->mayday_node)) {
> +		/*
> +		 * A pwq of an unbound wq may be released before wq's
> +		 * destruction when the wq's attr is changed. In this case,
> +		 * pwq_unbound_release_workfn() may execute earlier before
> +		 * rescuer_thread() and corrupt wq->maydays list.
> +		 *
> +		 * get_unbound_pwq() keeps the unbound pwq until the rescuer
> +		 * processes it and protects the pwq from being scheduled to
> +		 * release when someone else processes all the works before
> +		 * the rescuer starts to process.
> +		 */
> +		get_unbound_pwq(pwq);
> +
>  		list_add_tail(&pwq->mayday_node, &wq->maydays);
>  		wake_up_process(wq->rescuer->task);
>  	}
> @@ -2424,6 +2449,7 @@ repeat:
>  		/* migrate to the target cpu if possible */
>  		worker_maybe_bind_and_lock(pool);
>  		rescuer->pool = pool;
> +		put_unbound_pwq(pwq);
>
>  		/*
>  		 * Slurp in all works issued via this workqueue and
> @@ -4318,6 +4344,10 @@ void destroy_workqueue(struct workqueue_struct *wq)
>  		/*
>  		 * The base ref is never dropped on per-cpu pwqs. Directly
>  		 * free the pwqs and wq.
> +		 *
> +		 * The wq->maydays list maybe still have some pwqs linked,
> +		 * but it is safe to free them all together since the rescuer
> +		 * is stopped.
>  		 */
>  		free_percpu(wq->cpu_pwqs);
>  		kfree(wq);
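
For anyone skimming the thread, the lifetime rule the patch relies on can be sketched in user space roughly as follows. This is only a model under stated assumptions, not kernel code: struct model_pwq and every model_* name are hypothetical, locking is omitted, and "release" is reduced to setting a flag where put_pwq() would schedule the release work.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for struct pool_workqueue (not kernel code). */
struct model_pwq {
	int refcnt;             /* plays the role of pwq->refcnt */
	bool unbound;           /* plays the role of wq->flags & WQ_UNBOUND */
	bool on_maydays;        /* true while linked on the modelled wq->maydays */
	bool release_scheduled; /* set where put_pwq() would schedule the release work */
};

/* Models get_unbound_pwq(): only unbound pwqs are reference counted here. */
static void model_get_unbound(struct model_pwq *pwq)
{
	if (pwq->unbound)
		pwq->refcnt++;
}

/* Models put_unbound_pwq(): dropping the last reference schedules the release. */
static void model_put_unbound(struct model_pwq *pwq)
{
	if (pwq->unbound && --pwq->refcnt == 0)
		pwq->release_scheduled = true;
}

/* Models send_mayday(); take_ref selects behaviour with or without the patch. */
static void model_send_mayday(struct model_pwq *pwq, bool take_ref)
{
	if (pwq->on_maydays)
		return;
	if (take_ref)
		model_get_unbound(pwq);
	pwq->on_maydays = true;
}

/* Models the rescuer unlinking the pwq and dropping the mayday reference. */
static void model_rescuer_pick(struct model_pwq *pwq, bool took_ref)
{
	pwq->on_maydays = false;
	if (took_ref)
		model_put_unbound(pwq);
}

/* Returns true if the release is scheduled while the pwq is still queued. */
static bool released_while_queued(bool take_ref)
{
	struct model_pwq pwq = { .refcnt = 1, .unbound = true };

	model_send_mayday(&pwq, take_ref);
	model_put_unbound(&pwq); /* attrs changed: the base reference is dropped */
	return pwq.release_scheduled && pwq.on_maydays;
}

/* Returns true if the release happens only after the rescuer has unlinked it. */
static bool released_after_rescuer(void)
{
	struct model_pwq pwq = { .refcnt = 1, .unbound = true };

	model_send_mayday(&pwq, true);
	model_put_unbound(&pwq);
	assert(!pwq.release_scheduled); /* still pinned by the mayday reference */
	model_rescuer_pick(&pwq, true);
	return pwq.release_scheduled && !pwq.on_maydays;
}
```

In this model, released_while_queued(false) reproduces the problematic ordering (release scheduled while the pwq is still on the maydays list), while released_while_queued(true) and released_after_rescuer() show the release deferred until the rescuer has taken the pwq off the list.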