On Thu, Oct 25, 2018 at 08:05:40AM -0700, Bart Van Assche wrote:
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index fc9129d5909e..0ef275fe526c 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -1383,6 +1383,10 @@ static void __queue_work(int cpu, struct workqueue_struct *wq,
>  	if (unlikely(wq->flags & __WQ_DRAINING) &&
>  	    WARN_ON_ONCE(!is_chained_work(wq)))
>  		return;
> +
> +	if (!(wq->flags & __WQ_HAS_BEEN_USED))
> +		wq->flags |= __WQ_HAS_BEEN_USED;
> +
>  retry:
>  	if (req_cpu == WORK_CPU_UNBOUND)
>  		cpu = wq_select_unbound_cpu(raw_smp_processor_id());
I was looking to fix this problem as well, and I considered doing exactly this, but I thought wq->mutex had to be taken in order to modify wq->flags --- and taking that mutex on every queue_work() call would destroy the scalability of the work-queueing fast path. We could switch all of wq->flags over to the {test,set,clear}_bit() family of atomic operations, but that seemed like a very large change.

Tejun seemed to be OK with creating a destroy_workqueue_no_drain() function and using it instead of destroy_workqueue() for the losers of the cmpxchg in the lockless initialization code in sb_init_dio_done_wq() in fs/direct-io.c.

						- Ted
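To illustrate the atomic-bitops alternative: if wq->flags were an atomic type, the hot path could mark the queue used without wq->mutex. This is a userspace sketch, not kernel code --- C11 atomic_fetch_or stands in for the kernel's set_bit()/test_and_set_bit(), and WQ_HAS_BEEN_USED and mark_wq_used() are hypothetical names mirroring __WQ_HAS_BEEN_USED from the patch above:

```c
#include <stdatomic.h>

/* Hypothetical flag bit, mirroring __WQ_HAS_BEEN_USED in the patch. */
#define WQ_HAS_BEEN_USED (1u << 0)

/* Stand-in for wq->flags, were it converted to an atomic type. */
static _Atomic unsigned int wq_flags;

/*
 * Set the "has been used" bit locklessly; atomic_fetch_or is the C11
 * analogue of the kernel's test_and_set_bit().  Returns nonzero only
 * for the caller that actually transitioned the bit from 0 to 1.
 */
static int mark_wq_used(void)
{
	return !(atomic_fetch_or(&wq_flags, WQ_HAS_BEEN_USED) &
		 WQ_HAS_BEEN_USED);
}
```

The cost is that every other reader and writer of wq->flags would also have to go through atomic bitops, which is the "very large change" referred to above.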
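The cmpxchg lockless-init pattern in sb_init_dio_done_wq() can be sketched as follows. This is a simplified userspace model under stated assumptions: struct wq, create_wq(), destroy_wq_no_drain(), and s_dio_done_wq are hypothetical stand-ins for the workqueue struct, alloc_workqueue(), the proposed destroy_workqueue_no_drain(), and sb->s_dio_done_wq, with C11 atomic_compare_exchange_strong playing the role of the kernel's cmpxchg():

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct workqueue_struct. */
struct wq { int dummy; };

static struct wq *create_wq(void) { return calloc(1, sizeof(struct wq)); }

/* The loser's queue never had work queued, so no drain is needed. */
static void destroy_wq_no_drain(struct wq *w) { free(w); }

/* Shared slot, analogous to sb->s_dio_done_wq. */
static _Atomic(struct wq *) s_dio_done_wq;

/*
 * Lockless one-time init: every racing caller allocates a queue, but
 * only the cmpxchg winner installs its allocation; each loser frees
 * its own queue without draining it.
 */
static struct wq *init_dio_done_wq(void)
{
	struct wq *new = create_wq();
	struct wq *expected = NULL;

	if (!new)
		return NULL;
	if (!atomic_compare_exchange_strong(&s_dio_done_wq, &expected, new))
		destroy_wq_no_drain(new);	/* lost the race */
	return atomic_load(&s_dio_done_wq);
}
```

All callers end up with the same installed queue, which is why a drain-free destroy is safe for the losers: work can only ever be queued on the winner.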