Oleg Nesterov wrote: > On 05/11, Tejun Heo wrote: >> Oleg Nesterov wrote: >>> However, I agree, this smp_wmb() in insert_work() should die. We can >>> introduce "smp_mb__before_spinlock()" (no-op on x86 at least) to kill it. >> Yeah, right, we allow cwq pointer to change without holding the lock. >> Although I still think that is where we should fix the problem. Taking >> down CPU is a cold cold path. We can afford a lot of overhead there. >> IMHO, if we can do that, it would be far better than memory barrier >> dance which tends to be difficult to understand and thus prove/maintain >> correctness. I'll think about it more. > > Yes I hate this barrier too. That is why changelog explicitly mentions it. > > With some trivial code modifications we can move set_wq_data() from > insert_work() > to __queue_work(), then > > void set_wq_data(work, cwq) > { > struct cpu_workqueue_struct *old = get_wq_data(work); > > if (likely(cwq == old)) > return; > > if (old) > spin_lock(old->lock); > > atomic_long_set(&work->data, ...); > > if (old) > spin_lock(old->lock); > } > > I can't say I like this very much, though. I'd prefer use > smp_mb__before_spinlock(). > Probably we can do something else.
Eeek... I don't like the above either. I've been thinking about the barriers a bit more and what makes it different from simple locked enter/leave model. As our pointer can change without locking, work->entry, which is always manipulated locked, is used as a mean to validate the pointer and we need barrier there because the update to work->entry and work->wq_data aren't atomic - new validity test result can be read together with old pointer. Clever && cryptic, I have to say. :-) Fortunately, we have one bit left in the flags and can use it to mark pointer validity instead of list_empty() test. flags and wq_data live in the same atomic_long and thus can be updated together atomically. * insert_work() sets VALID bit and the cwq pointer using one atomic_long_set(). * queue_delayed_work_on() sets the cwq pointer but not the VALID bit. * run_workqueue() clears the cwq pointer and VALID bit while holding lock before executing the work. * try_to_grab_pending() checks VALID && pointers equal after grabbing cwq->lock. What do you think? Is there any hole? Thanks. -- tejun - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/