Re: perf: perf_fuzzer triggers instant reboot

2014-10-02 Thread Vince Weaver
On Thu, 2 Oct 2014, Vince Weaver wrote: > It looks like this is easily reproducible (just wedged the machine again) > so let me check back after testing the patch. no, can still wedge the machine even with this patch applied. Will try messing with ftrace to see if I can figure out what's going o

Re: perf: perf_fuzzer triggers instant reboot

2014-10-02 Thread Vince Weaver
On Wed, 1 Oct 2014, Sasha Levin wrote: > On 09/30/2014 01:23 PM, Peter Zijlstra wrote: > > How about this then? > > > > --- > > Subject: perf: Fix unclone_ctx() vs locking > > > > The idiot who did 4a1c0f262f88 forgot to pay attention and fix all > > similar cases. Do so now. > > > > In particu

Re: perf: perf_fuzzer triggers instant reboot

2014-10-01 Thread Sasha Levin
On 09/30/2014 01:23 PM, Peter Zijlstra wrote: > How about this then? > > --- > Subject: perf: Fix unclone_ctx() vs locking > > The idiot who did 4a1c0f262f88 forgot to pay attention and fix all > similar cases. Do so now. > > In particular, unclone_ctx() must be called while holding ctx->lock, >

Re: perf: perf_fuzzer triggers instant reboot

2014-09-30 Thread Cong Wang
On Sun, Sep 28, 2014 at 10:21 PM, Vince Weaver wrote: > On Thu, 25 Sep 2014, Cong Wang wrote: > >> On Wed, Sep 24, 2014 at 9:59 PM, Vince Weaver >> wrote: > >> > Now that just might mean the patch pushed the code around enough so my >> > test doesn't trigger, but there is hope that maybe this fi

Re: perf: perf_fuzzer triggers instant reboot

2014-09-30 Thread Peter Zijlstra
How about this then? --- Subject: perf: Fix unclone_ctx() vs locking The idiot who did 4a1c0f262f88 forgot to pay attention and fix all similar cases. Do so now. In particular, unclone_ctx() must be called while holding ctx->lock, therefore all such sites are broken for the same reason. Pull th

Re: perf: perf_fuzzer triggers instant reboot

2014-09-30 Thread Peter Zijlstra
On Mon, Sep 29, 2014 at 01:01:33PM -0400, Sasha Levin wrote: > On 09/29/2014 07:11 AM, Peter Zijlstra wrote: > > On Sun, Sep 28, 2014 at 12:09:09AM -0400, Sasha Levin wrote: > > > >> > [ 690.801720] 2 locks held by trinity-c95/17888: > >> > [ 690.801738] #0: (cpu_hotplug.lock){++}, at: get_o

Re: perf: perf_fuzzer triggers instant reboot

2014-09-29 Thread Sasha Levin
On 09/29/2014 07:11 AM, Peter Zijlstra wrote: > On Sun, Sep 28, 2014 at 12:09:09AM -0400, Sasha Levin wrote: > >> > [ 690.801720] 2 locks held by trinity-c95/17888: >> > [ 690.801738] #0: (cpu_hotplug.lock){++}, at: get_online_cpus >> > (kernel/cpu.c:92) >> > [ 690.801754] #1: (&ctx->lock)

Re: perf: perf_fuzzer triggers instant reboot

2014-09-29 Thread Peter Zijlstra
On Sun, Sep 28, 2014 at 12:09:09AM -0400, Sasha Levin wrote: > [ 690.801720] 2 locks held by trinity-c95/17888: > [ 690.801738] #0: (cpu_hotplug.lock){++}, at: get_online_cpus > (kernel/cpu.c:92) > [ 690.801754] #1: (&ctx->lock){-.-...}, at: perf_lock_task_context > (kernel/events/core.c:

Re: perf: perf_fuzzer triggers instant reboot

2014-09-28 Thread Vince Weaver
On Thu, 25 Sep 2014, Cong Wang wrote: > On Wed, Sep 24, 2014 at 9:59 PM, Vince Weaver > wrote: > > Now that just might mean the patch pushed the code around enough so my > > test doesn't trigger, but there is hope that maybe this fixes things. > > I read this as it fixes your crash as well? I

Re: perf: perf_fuzzer triggers instant reboot

2014-09-27 Thread Sasha Levin
On 09/25/2014 12:38 PM, Cong Wang wrote: > On Wed, Sep 24, 2014 at 9:59 PM, Vince Weaver > wrote: >> > >> > So I noticed Cong Wang's patch (3577af70a2ce4853d58e57d832e687d739281479) >> > perf: Fix a race condition in perf_remove_from_context() >> > >> > and that sounds a lot like the weir

Re: perf: perf_fuzzer triggers instant reboot

2014-09-25 Thread Cong Wang
On Wed, Sep 24, 2014 at 9:59 PM, Vince Weaver wrote: > > So I noticed Cong Wang's patch (3577af70a2ce4853d58e57d832e687d739281479) > perf: Fix a race condition in perf_remove_from_context() > > and that sounds a lot like the weird fork()/memory-corruption bug that the > fuzzer has been tri

Re: perf: perf_fuzzer triggers instant reboot

2014-09-24 Thread Vince Weaver
So I noticed Cong Wang's patch (3577af70a2ce4853d58e57d832e687d739281479) perf: Fix a race condition in perf_remove_from_context() and that sounds a lot like the weird fork()/memory-corruption bug that the fuzzer has been triggering. So I applied that patch alone on top of the 3.17-rc4

Re: perf: perf_fuzzer triggers instant reboot

2014-09-11 Thread Vince Weaver
On Wed, 10 Sep 2014, Peter Zijlstra wrote: > > I've been trying for months now to make progress on these but this type of > > bug is really hard to debug. > > Did we actually fix some at least? I had the idea we did get a few > sorted. But yes, this is tedious and hard going :/ we did fix some

Re: perf: perf_fuzzer triggers instant reboot

2014-09-10 Thread Peter Zijlstra
On Wed, Sep 10, 2014 at 10:30:31AM -0400, Vince Weaver wrote: > On Wed, 10 Sep 2014, Sasha Levin wrote: > > > On 09/10/2014 09:18 AM, Vince Weaver wrote: > > > that's what got me looking at things again, the trinity reports. Though > > > I > > > think those involve CPU hotplugging which my fuzz

Re: perf: perf_fuzzer triggers instant reboot

2014-09-10 Thread Vince Weaver
On Wed, 10 Sep 2014, Sasha Levin wrote: > On 09/10/2014 09:18 AM, Vince Weaver wrote: > > that's what got me looking at things again, the trinity reports. Though I > > think those involve CPU hotplugging which my fuzzer shouldn't trigger. > > > > I do think this is the same memory corruption/re

Re: perf: perf_fuzzer triggers instant reboot

2014-09-10 Thread Sasha Levin
On 09/10/2014 09:18 AM, Vince Weaver wrote: > that's what got me looking at things again, the trinity reports. Though I > think those involve CPU hotplugging which my fuzzer shouldn't trigger. > > I do think this is the same memory corruption/reboot bug that I reported > back in February (the t

Re: perf: perf_fuzzer triggers instant reboot

2014-09-10 Thread Peter Zijlstra
On Wed, Sep 10, 2014 at 09:18:35AM -0400, Vince Weaver wrote: > Somehow something is stomping over memory with a forking workload (likely > an improper free with RCU like we've seen before) but the fact that it > causes a reboot immediately makes it *really* hard to debug this. Yes, the insta r

Re: perf: perf_fuzzer triggers instant reboot

2014-09-10 Thread Vince Weaver
On Wed, 10 Sep 2014, Peter Zijlstra wrote: > > Sasha reported something from his KVM based fuzzing, maybe that's the > same. But that x86_exceptions thing is interesting, lemme go look at > that first. that's what got me looking at things again, the trinity reports. Though I think those involve

Re: perf: perf_fuzzer triggers instant reboot

2014-09-10 Thread Peter Zijlstra
On Tue, Sep 09, 2014 at 01:53:50PM -0400, Vince Weaver wrote: > > OK so trying to use ftrace to track this issue, and this happens (on > core2, 3.17-rc4) > > [ 295.992012] PANIC: double fault, error_code: 0x0 > [ 295.992012] CPU: 1 PID: 2916 Comm: trace-cmd Not tainted 3.17.0-rc4+ #82 > [ 295

Re: perf: perf_fuzzer triggers instant reboot

2014-09-09 Thread Vince Weaver
OK so trying to use ftrace to track this issue, and this happens (on core2, 3.17-rc4) [ 295.992012] PANIC: double fault, error_code: 0x0 [ 295.992012] CPU: 1 PID: 2916 Comm: trace-cmd Not tainted 3.17.0-rc4+ #82 [ 295.992012] Hardware name: AOpen DE7000/nMCP7ALPx-DE R1.06 Oct.19.2012, BIOS

Re: perf: perf_fuzzer triggers instant reboot

2014-09-09 Thread Vince Weaver
On Tue, 9 Sep 2014, Vince Weaver wrote: > > [ 751.656861] NOHZ: local_softirq_pending 100 > [ 756.009271] traps: perf_fuzzer[4236] trap invalid opcode ip:4044c0 > sp:7fffb0ed6f90 error:0 > [ 756.017590] BUG: unable to handle kernel paging request so it turns out that while it seems reproducib

Re: perf: perf_fuzzer triggers instant reboot

2014-09-09 Thread Vince Weaver
On Mon, 8 Sep 2014, Peter Zijlstra wrote: > On Mon, Sep 08, 2014 at 01:47:11PM -0400, Vince Weaver wrote: > > Hello > > > > so I finally had time to run my perf_fuzzer again and it has rapidly > > turned up an alarming crash that instant-reboots my core2 test machine. > > Urgh, of course :/ So

Re: perf: perf_fuzzer triggers instant reboot

2014-09-08 Thread Vince Weaver
On Mon, 8 Sep 2014, Peter Zijlstra wrote: > On Mon, Sep 08, 2014 at 01:47:11PM -0400, Vince Weaver wrote: > > Hello > > > > so I finally had time to run my perf_fuzzer again and it has rapidly > > turned up an alarming crash that instant-reboots my core2 test machine. > > Urgh, of course :/ I

Re: perf: perf_fuzzer triggers instant reboot

2014-09-08 Thread Peter Zijlstra
On Mon, Sep 08, 2014 at 01:47:11PM -0400, Vince Weaver wrote: > Hello > > so I finally had time to run my perf_fuzzer again and it has rapidly > turned up an alarming crash that instant-reboots my core2 test machine. Urgh, of course :/

perf: perf_fuzzer triggers instant reboot

2014-09-08 Thread Vince Weaver
Hello so I finally had time to run my perf_fuzzer again and it has rapidly turned up an alarming crash that instant-reboots my core2 test machine. This is on 3.17-rc4. It is reproducible. The first time all that appeared on the serial console was: [ 2616.535995] Kernel panic - not syncing: Lo