On Tue, Apr 29, 2014 at 02:21:56PM -0400, Vince Weaver wrote:
> On Tue, 29 Apr 2014, Peter Zijlstra wrote:
>
> > > Event #16 is a SW event created and running in the parent on CPU0.
> >
> > A regular software one, right? Not a timer one.
>
> Maybe. From traces I have it looks like it's a regular one (i.e. calls
> perf_swevent_add()) but who knows at this point.
>
> When I actually got a trace with perf_event_open() instrumented to print
> some attr values it looked like things were being caused by
> PERF_COUNT_SW_TASK_CLOCK which makes no sense.
>
> > > CPU6 (child) shutting down.
> > > last user of event #16
> > > perf_release() called on event
> > > which eventually calls event_sched_out()
> > > which calls pmu->del which removes event from swevent_htable
> > > *but only on CPU6*
> >
> > So on fork() we'll also clone the counter; after which there's two. One
> > will run on each task.
>
> even if inherit isn't set?
Fair point, nope not in that case. If you can trigger this without ever
using .inherit=1 this would exclude a lot of funny code.

> > Because of a context switch optimization they can actually flip around
> > (the below patch disables that).
>
> ENOPATCH?

urgh.. fail.

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 5129b1201050..0d6a58950a3b 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -2293,6 +2291,7 @@ static void perf_event_context_sched_out(struct task_struct *task, int ctxn,
 	if (!cpuctx->task_ctx)
 		return;
 
+#if 0
 	rcu_read_lock();
 	next_ctx = next->perf_event_ctxp[ctxn];
 	if (!next_ctx)
@@ -2335,6 +2334,7 @@ static void perf_event_context_sched_out(struct task_struct *task, int ctxn,
 	}
 unlock:
 	rcu_read_unlock();
+#endif
 
 	if (do_switch) {
 		raw_spin_lock(&ctx->lock);

> > quite the puzzle this one
>
> yes.
>
> I'm tediously working on trying to get a good trace of this happening.
>
> I have a random seed that will trigger the bug in the fuzzer around 1 time
> in 10.
>
> Unfortunately many of the times it crashes so hard/quickly there's no
> chance of getting the trace data (dump trace on oops never holds enough
> state, and often the fuzzing triggers its own random trace events that
> clutter those logs).
>
> Also trace-cmd is a pain to use. Any suggested events I should trace
> beyond the obvious?

I've never used trace-cmd :/

What I do in the crashing-hard case is try and make dump_ftrace_on_oops
work, although capturing a full trace buffer over serial is exceedingly
painful -- maxcpus= might work if you have too many CPUs, I forgot.

Anyway, I can make the fuzzer do weird shit, but it doesn't look like the
thing you're seeing, but who knows.