Re: [PATCH] utrace: utrace_attach_task() forgets to return when -utrace == NULL
On 11/16, Roland McGrath wrote: But this smp_rmb() in task_utrace_struct() is only needed when the caller does something like if (task_utrace_flags(tsk)) do_something(task_utrace_struct()); If you look at where task_utrace_struct() is used, it's basically always like this. All the tracehook.h callers into utrace.c do that. ... But that's not the only place, is it? Every utrace_report_* and most every other utrace.c entry point is called from tracehook.h code using this pattern. Those can have the same issue. So you'd have to make all of those do an: if (unlikely(!utrace)) return; sort of check. Is that what you mean? Yes, agreed, this doesn't look good. So, I think it is better to add the barriers into task_utrace_struct() like you already did. (a very minor nit: s/read_barrier_depends/smp_read_barrier_depends/) Just to complete the discussion, we could do static inline unsigned long task_utrace_flags(struct task_struct *task) { if (!task-utrace) return 0; return task-utrace_flags; } to void rmb() without complicationg the code, but this still adds some overhead to the hot path. And a bit offtopic question. Apart from task_utrace_flags() should be as fast as possible, any other reason we can't move -utrace_flags into struct utrace ? In the likely case we should worry about (the task was never traced), -utrace == NULL and task_utrace_flags(struct task_struct *task) { if (likley(!task-utrace)) return 0; return task-utrace-flags; } shouldn't be slower. If the task was ever traced and then utraced this adds the extra dereference, yes. Oleg.
Re: [PATCH] utrace: utrace_attach_task() forgets to return when -utrace == NULL
Now, if we race with another task doing utrace_task_alloc() and see -utrace != NULL, why should we see the correctly initialized *utrace? utrace_task_alloc() needs wmb(), and the code like above read_barrier_depends(). Ok. Please review what I put in (esp. commit c575558) and send patches relative to the latest tree if that's not right. Perhaps it is overkill to do read_barrier_depends() in task_utrace_struct(). But AFAICT if we don't actually need it all those places, that is only because of some less-obvious barrier effect with checks on -utrace_flags or something. Do you think it's not really needed in all those places we extract the pointer before spin_lock et al? Thanks, Roland
Re: [PATCH] utrace: utrace_attach_task() forgets to return when -utrace == NULL
On 11/16, Roland McGrath wrote: Now, if we race with another task doing utrace_task_alloc() and see -utrace != NULL, why should we see the correctly initialized *utrace? utrace_task_alloc() needs wmb(), and the code like above read_barrier_depends(). Ok. Please review what I put in (esp. commit c575558) Yes, I think this is correct. Just in case fyi, I cooked the almost identical patch yesterday, but it didn't help: make xcheck crashes. Not that I really expected it can help on x86. Perhaps it is overkill to do read_barrier_depends() in task_utrace_struct(). Yes, perhaps. But this is safer. Do you think it's not really needed in all those places we extract the pointer before spin_lock et al? Not sure I understand exactly... Yes, sometimes we know we don't need read_barrier_depends(). For example, @@ -302,7 +309,7 @@ struct utrace_engine *utrace_attach_task( if (!utrace) { if (unlikely(!utrace_task_alloc(target))) return ERR_PTR(-ENOMEM); - utrace = target-utrace; + utrace = task_utrace_struct(target); } I think this change is not necessary (but imho good anyway), this path must take task_lock() and thus it must always see the correct *utrace. But read_barrier_depends() does nothig except on alpha, why should we care? This reminds me, utrace_interrupt_pending() and similar code needs rmb() or task-utrace != NULL check. Oleg.
Re: [PATCH] utrace: utrace_attach_task() forgets to return when -utrace == NULL
Just in case fyi, I cooked the almost identical patch yesterday, but it didn't help: make xcheck crashes. Not that I really expected it can help on x86. Right. Not sure I understand exactly... Yes, sometimes we know we don't need read_barrier_depends(). For example, @@ -302,7 +309,7 @@ struct utrace_engine *utrace_attach_task( if (!utrace) { if (unlikely(!utrace_task_alloc(target))) return ERR_PTR(-ENOMEM); - utrace = target-utrace; + utrace = task_utrace_struct(target); } I think this change is not necessary (but imho good anyway), this path must take task_lock() and thus it must always see the correct *utrace. Ok. If we were to change it we should add a comment about that. But read_barrier_depends() does nothig except on alpha, why should we care? I thought this was the barrier you were talking about. Anyway, we should be pedantically correct about these. (People do even still build Linux/Alpha.) This reminds me, utrace_interrupt_pending() and similar code needs rmb() or task-utrace != NULL check. I take it you mean that: if (task-utrace_flags) ... use task-utrace ... paths need a barrier to ensure task-utrace_flags read nonzero implies task-utrace will be read non-null when racing with the first attach. Right? This only needs smp_rmb(), right? Is there an existing explicit barrier this pairs with? The corresponding smp_wmb() needs to be somewhere between utrace_task_alloc()'s task-utrace = utrace; and utrace_add_engine(): if (!target-utrace_flags) { target-utrace_flags = UTRACE_EVENT(REAP); Right? A lot goes on in between there that probably has some implied barriers (task_unlock, kmem_cache_alloc, spin_lock). But do we really have the exact barrier we need? Or maybe we need: /* * This barrier makes sure the initialization of the struct * precedes the installation of the pointer. This pairs * with read_barrier_depends() in task_utrace_struct(). */ wmb(); task-utrace = utrace; /* * This barrier pairs with task_utrace_struct()'s smp_rmb(). * It ensures this task-utrace store is always seen before * task-utrace_flags is first set in utrace_add_engine(). */ smp_wmb(); This is basically all the tracehook.h paths calling into utrace. I think the easiest thing to do is to assume that use pattern and also roll this barrier into task_utrace_struct(). diff --git a/include/linux/utrace.h b/include/linux/utrace.h index 8924783..000 100644 --- a/include/linux/utrace.h +++ b/include/linux/utrace.h @@ -157,7 +157,18 @@ static inline unsigned long task_utrace_ static inline struct utrace *task_utrace_struct(struct task_struct *task) { - struct utrace *utrace = task-utrace; + struct utrace *utrace; + + /* +* This barrier ensures that any prior load of task-utrace_flags +* is ordered before this load of task-utrace. We use those +* utrace_flags checks in the hot path to decide to call into +* the utrace code. The first attach installs task-utrace before +* setting task-utrace_flags nonzero, with a barrier between. +*/ + smp_rmb(); + utrace = task-utrace; + read_barrier_depends(); /* See utrace_task_alloc(). */ return utrace; } Is that what you have in mind? Thanks, Roland
Re: [PATCH] utrace: utrace_attach_task() forgets to return when -utrace == NULL
On 11/16, Roland McGrath wrote: But read_barrier_depends() does nothig except on alpha, why should we care? I thought this was the barrier you were talking about. Anyway, we should be pedantically correct about these. (People do even still build Linux/Alpha.) Yes, sure. I meant, we shouldn't worry that this barrier adds too much overhead to task_utrace_struct(). This reminds me, utrace_interrupt_pending() and similar code needs rmb() or task-utrace != NULL check. I take it you mean that: if (task-utrace_flags) ... use task-utrace ... paths need a barrier to ensure task-utrace_flags read nonzero implies task-utrace will be read non-null when racing with the first attach. Right? Yes, This only needs smp_rmb(), right? I think yes. Is there an existing explicit barrier this pairs with? The corresponding smp_wmb() needs to be somewhere between utrace_task_alloc()'s task-utrace = utrace; and utrace_add_engine(): if (!target-utrace_flags) { target-utrace_flags = UTRACE_EVENT(REAP); We don't have the explicit barrier, but I think it is not needed. In this case utrace_attach_task() does unlock+lock at least once before changing -utrace_flags, this implies mb(). static inline struct utrace *task_utrace_struct(struct task_struct *task) { - struct utrace *utrace = task-utrace; + struct utrace *utrace; + + /* + * This barrier ensures that any prior load of task-utrace_flags + * is ordered before this load of task-utrace. We use those + * utrace_flags checks in the hot path to decide to call into + * the utrace code. The first attach installs task-utrace before + * setting task-utrace_flags nonzero, with a barrier between. + */ + smp_rmb(); + utrace = task-utrace; + read_barrier_depends(); /* See utrace_task_alloc(). */ return utrace; } Yes, I think this patch should close this (pure theoretical) problem. But this smp_rmb() in task_utrace_struct() is only needed when the caller does something like if (task_utrace_flags(tsk)) do_something(task_utrace_struct()); it would be more logical to move this rmb() info task_utrace_flags(). However task_utrace_flags() must be fast. Perhaps, to avoid this smp_rmb(), we can change bool utrace_interrupt_pending(void) { struct utrace *utrace = task_utrace_struct(current); return utrace utrace-resume == UTRACE_INTERRUPT; } I dunno. Oleg.
Re: [PATCH] utrace: utrace_attach_task() forgets to return when -utrace == NULL
Yes, sure. I meant, we shouldn't worry that this barrier adds too much overhead to task_utrace_struct(). Ah! Sorry, I misread you. Yes, good point. We don't have the explicit barrier, but I think it is not needed. In this case utrace_attach_task() does unlock+lock at least once before changing -utrace_flags, this implies mb(). Ok, this is exactly the sort of thing that merits explicit comments. Yes, I think this patch should close this (pure theoretical) problem. But this smp_rmb() in task_utrace_struct() is only needed when the caller does something like if (task_utrace_flags(tsk)) do_something(task_utrace_struct()); If you look at where task_utrace_struct() is used, it's basically always like this. All the tracehook.h callers into utrace.c do that. it would be more logical to move this rmb() info task_utrace_flags(). However task_utrace_flags() must be fast. Right. The hot path is that you test task-utrace_flags and it's zero. It seems better not to add anything else to that path, not even lfence. Perhaps, to avoid this smp_rmb(), we can change bool utrace_interrupt_pending(void) { struct utrace *utrace = task_utrace_struct(current); return utrace utrace-resume == UTRACE_INTERRUPT; } I dunno. But that's not the only place, is it? Every utrace_report_* and most every other utrace.c entry point is called from tracehook.h code using this pattern. Those can have the same issue. So you'd have to make all of those do an: if (unlikely(!utrace)) return; sort of check. Is that what you mean? I guess we'd need some chip-knowledgeable microoptimizers to tell us whether: smp_rmb(); /* guarantees we can't be here seeing utrace==NULL */ utrace = task-utrace; is better or worse than: utrace = task-utrace; if (unlikely(!utrace)) return; on today's chips. The latter is more hairy for the utrace.c code to always be sure to check. But it's not prohibitively so. Thanks, Roland
Re: [PATCH] utrace: utrace_attach_task() forgets to return when -utrace == NULL
On 11/16, Oleg Nesterov wrote: UPD: tested the kernel with this patch, now late-ptrace-may-attach-check crashes the kernel silently (no output under kvm). Repeated this test. Got several oopses, bad spinlock magic in utrace-lock. The stack trace varies, often from utrace_get_signal(). Oh, will continue tomorrow. Oleg.