On 11/16, Roland McGrath wrote:
>
> > But read_barrier_depends() does nothing except on alpha, why should we
> > care?
>
> I thought this was the barrier you were talking about. Anyway, we should
> be pedantically correct about these. (People do even still build
> Linux/Alpha.)
Yes, sure. I meant, we shouldn't worry that this barrier adds too much
overhead to task_utrace_struct().

> > This reminds me, utrace_interrupt_pending() and similar code needs rmb()
> > or a task->utrace != NULL check.
>
> I take it you mean that:
>
> 	if (task->utrace_flags)
> 		... use task->utrace ...
>
> paths need a barrier to ensure a nonzero read of task->utrace_flags implies
> task->utrace will be read non-null when racing with the first attach.
> Right?

Yes,

> This only needs smp_rmb(), right?

I think yes.

> Is there an existing explicit barrier
> this pairs with? The corresponding smp_wmb() needs to be somewhere between
> utrace_task_alloc()'s "task->utrace = utrace;" and utrace_add_engine():
>
> 	if (!target->utrace_flags) {
> 		target->utrace_flags = UTRACE_EVENT(REAP);

We don't have the explicit barrier, but I think it is not needed. In this
case utrace_attach_task() does unlock+lock at least once before changing
->utrace_flags, and this implies mb().

> static inline struct utrace *task_utrace_struct(struct task_struct *task)
> {
> -	struct utrace *utrace = task->utrace;
> +	struct utrace *utrace;
> +
> +	/*
> +	 * This barrier ensures that any prior load of task->utrace_flags
> +	 * is ordered before this load of task->utrace. We use those
> +	 * utrace_flags checks in the hot path to decide to call into
> +	 * the utrace code. The first attach installs task->utrace before
> +	 * setting task->utrace_flags nonzero, with a barrier between.
> +	 */
> +	smp_rmb();
> +	utrace = task->utrace;
> +
> 	read_barrier_depends(); /* See utrace_task_alloc(). */
> 	return utrace;
> }

Yes, I think this patch should close this (purely theoretical) problem.

But this smp_rmb() in task_utrace_struct() is only needed when the caller
does something like

	if (task_utrace_flags(tsk))
		do_something(task_utrace_struct(tsk));

so it would be more logical to move this rmb() into task_utrace_flags().
However, task_utrace_flags() must be fast.
Perhaps, to avoid this smp_rmb(), we can change

	bool utrace_interrupt_pending(void)
	{
		struct utrace *utrace = task_utrace_struct(current);

		return utrace && utrace->resume == UTRACE_INTERRUPT;
	}

I dunno.

Oleg.