OK, so it turns out that the oops with memory corruption I saw wasn't the 
bug I was tracking.  It is something that sometimes comes up when running 
ftrace at the same time as fuzzing, so we'll leave that for another day.

The 3.17+ lockup I am tracking still reproduces as of git from yesterday 
(even after the 3.18-rc perf_event merges).

I can use sysrq to get the stack trace; one CPU is stuck in a call
to find_get_context().

An example backtrace:

[88200.300003]  <EOI>
[88200.300003]  [<ffffffff81114869>] ? ____cache_alloc+0x130/0x25b
[88200.300003]  [<ffffffff8107fb05>] ? __call_rcu.constprop.63+0x1bf/0x1cb
[88200.300003]  [<ffffffff8107fb2b>] kfree_call_rcu+0x1a/0x1c
[88200.300003]  [<ffffffff810cf84f>] put_ctx+0x51/0x55
[88200.300003]  [<ffffffff810d1840>] find_get_context+0x166/0x195
[88200.300003]  [<ffffffff810d5856>] SYSC_perf_event_open+0x47b/0x7f5
[88200.300003]  [<ffffffff810d5f55>] SyS_perf_event_open+0xe/0x10
[88200.300003]  [<ffffffff815362d6>] system_call_fastpath+0x16/0x1b

It looks like the
                        else if (task->perf_event_ctxp[ctxn])
                                err = -EAGAIN;

case is triggering non-stop in the retry path of 
find_get_context() and so the kernel gets stuck forever retrying.
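
To make the suspected loop concrete, here is a minimal user-space sketch 
of the retry logic.  It is a model, not the kernel code: the struct and 
function names (model_ctx, model_task, model_lookup, 
model_find_get_context) are made up for illustration, and the 
"lookup_fails" flag is purely an assumption standing in for "the lookup 
of the existing context keeps coming back empty even though the 
perf_event_ctxp[] slot is populated".  The retry cap is also only there 
so the model terminates.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the kernel's real structures. */
struct model_ctx {
	int refcount;
};

struct model_task {
	struct model_ctx *ctxp;	/* models task->perf_event_ctxp[ctxn] */
	bool lookup_fails;	/* assumption: lookup never yields the ctx */
};

/* Models the "find an existing context" step at the top of the loop. */
static struct model_ctx *model_lookup(struct model_task *task)
{
	return task->lookup_fails ? NULL : task->ctxp;
}

/*
 * Returns 0 on success, or -EAGAIN once max_retries is reached.
 * The cap exists only so the model terminates.  *retries counts how
 * many times the -EAGAIN branch was taken.
 */
int model_find_get_context(struct model_task *task, struct model_ctx *fresh,
			   long max_retries, long *retries)
{
	struct model_ctx *ctx;
	int err;

	assert(task && fresh && retries);
	*retries = 0;

retry:
	ctx = model_lookup(task);
	if (ctx) {
		ctx->refcount++;	/* found an existing context */
		return 0;
	}

	/* Nothing found: try to install the freshly allocated context. */
	err = 0;
	fresh->refcount = 1;
	if (task->ctxp)
		err = -EAGAIN;	/* slot already populated: the case above */
	else
		task->ctxp = fresh;

	if (err == -EAGAIN) {
		fresh->refcount--;	/* models put_ctx() on the unused ctx */
		if (++*retries >= max_retries)
			return -EAGAIN;
		goto retry;
	}
	return 0;
}
```

With a populated slot and lookup_fails set, every pass takes the -EAGAIN 
branch and the only thing that stops the model is the cap.  With an empty 
slot, or with a lookup that succeeds, it returns on the first pass.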

I can drop some printks in if it will help debug.  I've tried running 
ftrace, but for whatever reason if I enable ftrace the bug won't trigger.

Vince

