Re: perf: fuzzer crashes immediately on AMD system

2016-08-24 Thread Vince Weaver
On Wed, 24 Aug 2016, Ingo Molnar wrote:
> If there's no progress finding the root cause I'd be happy to exchange a 
> crash for a leak ...

It's actually a crash of the program doing the perf_event_open() call, not 
a crash of the system (at least in my experience).

However, with bad luck, if the kfree'd space is reused with just the right 
combination of values, you could end up crashing the system.

Vince


Re: perf: fuzzer crashes immediately on AMD system

2016-08-24 Thread Ingo Molnar

* Vince Weaver wrote:

> On Tue, 23 Aug 2016, Peter Zijlstra wrote:
> 
> > On Mon, Aug 22, 2016 at 10:54:32PM -0400, Vince Weaver wrote:
> > > > > > > 
> > > > > > >   perf stat -a -e amd_nb/config=0x37,config1=0x20/ /bin/ls
> > > > amd_uncore_find_online_sibling()
> > > > function is broken. 
> > > 
> > > and that's the problem.  uncore_find_online_sibling() does all kinds of 
> > > wrong things including sticking active uncore structures in 
> > > uncore->free_when_cpu_online
> > > 
> > > Then uncore_online() comes along and frees those structures.
> > > 
> > > Then some other part of the kernel comes and re-uses the free'd data.
> > > 
> > > Then when we try to start an event, all of the fields are invalid because 
> > > the uncore pointer is pointing to re-used data.
> > > 
> > > I don't have a patch because I am not 100% clear on what 
> > > uncore_find_online_sibling() is doing in the first place.
> > 
> > Thanks for doing all that, I'll see if I can make sense of it.
> 
> I should have provided more detail, was just tired after chasing the bug 
> for so long.  I mostly found things by sprinkling printks everywhere.
> Commenting out the call to kfree() in uncore_online() makes the code stop 
> crashing (but perhaps causes a memory leak?)

If there's no progress finding the root cause I'd be happy to exchange a crash 
for a leak ...

> In any case it's odd the problem didn't show up earlier, but maybe the 
> recent changes to CPU hotplugging in that file exposed the issue.

Yeah, we had lots of changes to CPU hotplugging recently.

Thanks,

Ingo


Re: perf: fuzzer crashes immediately on AMD system

2016-08-23 Thread Vince Weaver
On Tue, 23 Aug 2016, Peter Zijlstra wrote:

> On Mon, Aug 22, 2016 at 10:54:32PM -0400, Vince Weaver wrote:
> > > > > > 
> > > > > > perf stat -a -e amd_nb/config=0x37,config1=0x20/ /bin/ls
> > >   amd_uncore_find_online_sibling()
> > > function is broken. 
> > 
> > and that's the problem.  uncore_find_online_sibling() does all kinds of 
> > wrong things including sticking active uncore structures in 
> > uncore->free_when_cpu_online
> > 
> > Then uncore_online() comes along and frees those structures.
> > 
> > Then some other part of the kernel comes and re-uses the free'd data.
> > 
> > Then when we try to start an event, all of the fields are invalid because 
> > the uncore pointer is pointing to re-used data.
> > 
> > I don't have a patch because I am not 100% clear on what 
> > uncore_find_online_sibling() is doing in the first place.
> 
> Thanks for doing all that, I'll see if I can make sense of it.

I should have provided more detail, was just tired after chasing the bug 
for so long.  I mostly found things by sprinkling printks everywhere.
Commenting out the call to kfree() in uncore_online() makes the code stop 
crashing (but perhaps causes a memory leak?)

In any case it's odd the problem didn't show up earlier, but maybe the 
recent changes to CPU hotplugging in that file exposed the issue.

Vince


Re: perf: fuzzer crashes immediately on AMD system

2016-08-23 Thread Peter Zijlstra
On Mon, Aug 22, 2016 at 10:54:32PM -0400, Vince Weaver wrote:
> > > > > 
> > > > >   perf stat -a -e amd_nb/config=0x37,config1=0x20/ /bin/ls
> > amd_uncore_find_online_sibling()
> > function is broken. 
> 
> and that's the problem.  uncore_find_online_sibling() does all kinds of 
> wrong things including sticking active uncore structures in 
> uncore->free_when_cpu_online
> 
> Then uncore_online() comes along and frees those structures.
> 
> Then some other part of the kernel comes and re-uses the free'd data.
> 
> Then when we try to start an event, all of the fields are invalid because 
> the uncore pointer is pointing to re-used data.
> 
> I don't have a patch because I am not 100% clear on what 
> uncore_find_online_sibling() is doing in the first place.

Thanks for doing all that, I'll see if I can make sense of it.


Re: perf: fuzzer crashes immediately on AMD system

2016-08-22 Thread Vince Weaver
> > > > 
> > > > perf stat -a -e amd_nb/config=0x37,config1=0x20/ /bin/ls
>   amd_uncore_find_online_sibling()
> function is broken. 

and that's the problem.  uncore_find_online_sibling() does all kinds of 
wrong things including sticking active uncore structures in 
uncore->free_when_cpu_online

Then uncore_online() comes along and frees those structures.

Then some other part of the kernel comes and re-uses the free'd data.

Then when we try to start an event, all of the fields are invalid because 
the uncore pointer is pointing to re-used data.

I don't have a patch because I am not 100% clear on what 
uncore_find_online_sibling() is doing in the first place.
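
As a rough illustration of the sequence described above, here is a userspace
sketch that replays it without touching real freed memory. Everything in it
(struct toy_uncore, toy_alloc(), toy_free(), the one-slot "allocator") is
hypothetical and only models the steps: a live structure is queued for
freeing, freed, handed to another user, and then read through the old pointer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the per-CPU uncore bookkeeping; not kernel code. */
struct toy_uncore {
	int cpu;     /* CPU that owns the counters */
	int refcnt;  /* number of CPUs sharing this structure */
};

/* One-slot "allocator": freeing only marks the slot as available again. */
static struct toy_uncore slot;
static int slot_in_use;

static struct toy_uncore *toy_alloc(void)
{
	if (slot_in_use)
		return NULL;
	slot_in_use = 1;
	memset(&slot, 0, sizeof(slot));
	return &slot;
}

static void toy_free(struct toy_uncore *p)
{
	(void)p;
	slot_in_use = 0;
}

/*
 * Replays the sequence: an active structure is queued for freeing, freed,
 * and its memory handed to an unrelated user that fills it with other data.
 * Returns the cpu value seen through the stale pointer afterwards.
 */
static int stale_cpu_after_reuse(void)
{
	struct toy_uncore *active = toy_alloc();  /* still referenced by CPU 0 */
	struct toy_uncore *free_when_online;
	struct toy_uncore *other_user;
	int seen;

	active->cpu = 0;
	active->refcnt = 1;

	free_when_online = active;    /* bug: the live structure is queued */
	toy_free(free_when_online);   /* the "online" step frees it */

	other_user = toy_alloc();     /* same memory, new owner */
	memset(other_user, 0x5a, sizeof(*other_user));

	seen = active->cpu;           /* stale read: no longer 0 */
	toy_free(other_user);
	return seen;
}
```

In the sketch the stale pointer still "works", it just returns whatever the
new owner wrote, which is the same effect as the invalid fields seen when
starting an event.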

Vince


Re: perf: fuzzer crashes immediately on AMD system

2016-08-22 Thread Vince Weaver
On Mon, 22 Aug 2016, Huang Rui wrote:

> Hi Peter, Vince
> 
> On Fri, Aug 19, 2016 at 12:01:30PM +0200, Peter Zijlstra wrote:
> > On Thu, Aug 18, 2016 at 10:46:31AM -0400, Vince Weaver wrote:
> > > On Thu, 18 Aug 2016, Vince Weaver wrote:
> > > 
> > > > Tried the perf_fuzzer on my A10 fam15h/model13h system with 4.8-rc2 and it
> > > > falls over more or less immediately.
> > > > 
> > > > This maps to variable_test_bit()
> > > > called by ctx = find_get_context(pmu, task, event);
> > > > in kernel/events/core.c:9467
> > > > 
> > > > It happens quickly enough I can probably track down the exact event that 
> > > > causes this, if needed.
> > > 
> > > I have a one line reproducer:
> > > 
> > >   perf stat -a -e amd_nb/config=0x37,config1=0x20/ /bin/ls
> > 
> > OK, cannot reproduce on my fam15h/model1h. I'll go dig through the
> > various manuals to see if I can spot the fail.
> > 
> > Huang could you either prod someone at AMD or do yourself, audit the AMD
> > perf code for all the various new models?
> 
> Actually, there might be some NBPMC event changes between model 0h-fh and
> model 10h-1fh. Below are the documents of these two processors:
> 
> http://support.amd.com/TechDocs/42301_15h_Mod_00h-0Fh_BKDG.pdf
> http://support.amd.com/TechDocs/42300_15h_Mod_10h-1Fh_BKDG.pdf
> 
> In section 3.16, it describes usage of NB Performance Counter Events.

I don't think it's the hardware that's causing the problem.

I've wasted a lot more time on it, and finally figured out how the "bt" 
instruction works, so the assembly more or less makes sense.

The problem is the per-cpu amd_uncore struct is being over-written with 
kernel memory addresses.

This makes uncore[0]->cpu a large number (it's often, but not always, the 
per-cpu address of uncore[1]->cpu) which leads to the GPF.

I can't figure out what piece of code is overwriting things though.

And to make things complicated, I think the amd_uncore_find_online_sibling()
function is broken.  The code could really use more commenting, but I 
think it is designed so all siblings share one single amd_uncore 
structure, but in practice it looks like this doesn't work due to the way 
the list iterator works.
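
For readers unfamiliar with the pattern, the sharing scheme guessed at above
might look roughly like the following userspace sketch. It is an assumption
about the intended design, not the kernel's implementation: struct toy_uncore,
TOY_NR_CPUS and toy_find_online_sibling() are hypothetical names, and a plain
array stands in for the per-cpu pointers.

```c
#include <stddef.h>

#define TOY_NR_CPUS 4

/* Hypothetical, simplified stand-in for struct amd_uncore. */
struct toy_uncore {
	int id;      /* identifies the shared hardware unit */
	int refcnt;  /* how many online CPUs point at this structure */
};

/*
 * Called when a CPU comes online with its own freshly set-up structure.
 * If an already-online CPU has a structure with the same id, adopt that one
 * and report the newcomer's private copy through *unused so the caller can
 * release it. Only the private copy is ever reported, never a shared one.
 */
static struct toy_uncore *toy_find_online_sibling(struct toy_uncore *this,
						  struct toy_uncore *online[TOY_NR_CPUS],
						  struct toy_uncore **unused)
{
	int cpu;

	*unused = NULL;
	for (cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
		struct toy_uncore *that = online[cpu];

		if (!that || that == this)
			continue;
		if (that->id == this->id) {
			*unused = this;  /* nobody else references it yet */
			this = that;
			break;
		}
	}

	this->refcnt++;
	return this;
}
```

The point of the sketch is the invariant: the structure handed back for
freeing must be one that no online CPU still points at.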

Vince



Re: perf: fuzzer crashes immediately on AMD system

2016-08-22 Thread Huang Rui
Hi Peter, Vince

On Fri, Aug 19, 2016 at 12:01:30PM +0200, Peter Zijlstra wrote:
> On Thu, Aug 18, 2016 at 10:46:31AM -0400, Vince Weaver wrote:
> > On Thu, 18 Aug 2016, Vince Weaver wrote:
> > 
> > > Tried the perf_fuzzer on my A10 fam15h/model13h system with 4.8-rc2 and it
> > > falls over more or less immediately.
> > > 
> > > This maps to variable_test_bit()
> > >   called by ctx = find_get_context(pmu, task, event);
> > >   in kernel/events/core.c:9467
> > > 
> > > It happens quickly enough I can probably track down the exact event that 
> > > causes this, if needed.
> > 
> > I have a one line reproducer:
> > 
> > perf stat -a -e amd_nb/config=0x37,config1=0x20/ /bin/ls
> 
> OK, cannot reproduce on my fam15h/model1h. I'll go dig through the
> various manuals to see if I can spot the fail.
> 
> Huang could you either prod someone at AMD or do yourself, audit the AMD
> perf code for all the various new models?

Actually, there might be some NBPMC event changes between model 0h-fh and
model 10h-1fh. Below are the documents of these two processors:

http://support.amd.com/TechDocs/42301_15h_Mod_00h-0Fh_BKDG.pdf
http://support.amd.com/TechDocs/42300_15h_Mod_10h-1Fh_BKDG.pdf

Section 3.16 describes the usage of NB Performance Counter Events.

Hope it helps. :-)

Thanks,
Rui


Re: perf: fuzzer crashes immediately on AMD system

2016-08-19 Thread Vince Weaver
On Fri, 19 Aug 2016, Peter Zijlstra wrote:

> On Thu, Aug 18, 2016 at 10:46:31AM -0400, Vince Weaver wrote:
> > On Thu, 18 Aug 2016, Vince Weaver wrote:
> > 
> > > Tried the perf_fuzzer on my A10 fam15h/model13h system with 4.8-rc2 and it
> > > falls over more or less immediately.
> > > 
> > > This maps to variable_test_bit()
> > >   called by ctx = find_get_context(pmu, task, event);
> > >   in kernel/events/core.c:9467
> > > 
> > > It happens quickly enough I can probably track down the exact event that 
> > > causes this, if needed.
> > 
> > I have a one line reproducer:
> > 
> > perf stat -a -e amd_nb/config=0x37,config1=0x20/ /bin/ls
> 
> OK, cannot reproduce on my fam15h/model1h. I'll go dig through the
> various manuals to see if I can spot the fail.
> 
> Huang could you either prod someone at AMD or do yourself, audit the AMD
> perf code for all the various new models?

This is bizarre, I can't make any sense of the crash.

To recap, the crash looks like this:
BUG: unable to handle kernel paging request at 85e67600
IP: [] find_get_context.isra.75+0x28/0x20f

The code in question is this code:

if (!cpu_online(cpu))

which maps to 
test_bit(cpumask_check(cpu), cpumask_bits((cpumask)));

which assembles to

    810e4ca9:   41 89 cc                mov    %ecx,%r12d
    810e4cac:   7f 1e                   jg     810e4ccc
    810e4cae:   44 89 e0                mov    %r12d,%eax
*   810e4cb1:   48 0f a3 05 87 0f 7f    bt     %rax,0x7f0f87(%rip)   # 818d5c40 <__cpu_online_mask>
    810e4cb8:   00
    810e4cb9:   0f 92 c0                setb   %al
    810e4cbc:   84 c0                   test   %al,%al

There is no way that 0x7f0f87(%rip) should ever possibly be the 
85e67600 value that causes the fault.

Though oddly rax when the call happens (according to the oops message)
is RAX: 22c8ce30 which seems nonsensical for a CPU number, but
shouldn't cause an invalid memory address.  Also oddly RDI matches
RAX but RCX doesn't which I think should be true with that assembly.
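
One thing worth checking here: with a register bit index and a memory
operand, "bt" does not truncate the index to 0..63; it addresses the word
that contains the requested bit, however far past the operand that is. The
sketch below (bt_word_address() is a hypothetical helper, and it assumes a
non-negative index and a 64-bit operand) computes that address. Using the
values as they appear in this message, a base of 0x818d5c40 and an index of
0x22c8ce30 give 0x85e67600, which is the faulting address in the oops.

```c
#include <stdint.h>

/*
 * Address of the 64-bit word that "bt %reg,mem" tests for a non-negative
 * bit index: the index selects a word (index / 64) quadwords past the
 * memory operand rather than wrapping within the first word.
 */
static uint64_t bt_word_address(uint64_t bitmap_base, uint64_t bit_index)
{
	return bitmap_base + (bit_index / 64) * 8;
}
```

So a garbage CPU number in RAX is enough to explain the bad address without
the instruction stream or the RIP-relative displacement being wrong.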

So very weird.  I even wrote a kernel module and dumped the raw kernel
memory to make sure the instruction stream didn't get overwritten somehow,
but as far as I can tell the code in memory matches the disassembly.

Anyway, I am out of time to look at this for now. 

Vince


Re: perf: fuzzer crashes immediately on AMD system

2016-08-19 Thread Vince Weaver
On Fri, 19 Aug 2016, Vince Weaver wrote:

> OK, this is weird.  I rebooted (didn't patch the kernel, just rebooted) 
> and I can't reproduce the original problem at all.

I rebooted three more times (after perf_fuzzer turned up a more boring 
probably known dump, shown at end) and now I am hitting the original bug 
again.   Weird.  Let me see if I can figure out what is going on.



And for the record, the bug the fuzzer kicks out when it doesn't hit the 
weird one:

Note this is sprinkled among thousands of
[ 3782.364287] BAD LUCK: lost 7650 message(s) from NMI context!


[ 3780.821837] NMI watchdog: BUG: soft lockup - CPU#2 stuck for 23s! [perf_fuzzer:12074]
[ 3781.493831] CPU: 2 PID: 12074 Comm: perf_fuzzer Tainted: G L  4.8.0-rc2+ #27
[ 3781.508478] Hardware name: Hewlett-Packard HP Compaq Pro 6305 SFF/1850, BIOS K06 v02.57 08/16/2013
[ 3781.524054] task: 8802232cf280 task.stack: 8802252c
[ 3781.542904] RIP: 0010:[]  [] smp_call_function_single+0xbb/0xca
[ 3781.558618] RSP: 0018:8802252c3d78  EFLAGS: 0202
[ 3781.570752] RAX:  RBX: 0001 RCX: 
[ 3781.584757] RDX: 0001 RSI: 08fb RDI: 0300
[ 3781.598819] RBP: 0001 R08: 0003 R09: 7f0c0ea07700
[ 3781.612930] R10: 7f0c0ea079d0 R11: 0206 R12: 810e226b
[ 3781.627107] R13: 8802252c3dc8 R14: 8802252c3d78 R15: 
[ 3781.641335] FS:  7f0c0ea07700() GS:88022ed0() knlGS:
[ 3781.656573] CS:  0010 DS:  ES:  CR0: 80050033
[ 3781.669534] CR2: 7f0c0e7d72c8 CR3: 0002251d1000 CR4: 000407e0
[ 3781.683929] DR0:  DR1:  DR2: 
[ 3781.698410] DR3:  DR6: 0ff0 DR7: 00010602
[ 3781.712845] Stack:
[ 3781.747577]   810e226b 8802252c3dc8 0003
[ 3781.787434]  e8c87190 880223fb7800 810e5676 
[ 3781.827415]  810e18df 810e16cd  810e13d2
[ 3781.841792] Call Trace:
[ 3781.851292]  [] ? perf_cgroup_attach+0x34/0x34
[ 3781.864355]  [] ? group_sched_out+0x70/0x70
[ 3781.877219]  [] ? event_function_call+0xa8/0xa8
[ 3781.890345]  [] ? cpu_function_call+0x32/0x3b
[ 3781.903284]  [] ? perf_ctx_lock+0x1e/0x1e
[ 3781.915864]  [] ? event_function_call+0x49/0xa8
[ 3781.928952]  [] ? group_sched_out+0x70/0x70
[ 3781.941675]  [] ? event_function_call+0xa8/0xa8
[ 3781.954734]  [] ? perf_event_for_each_child+0x53/0x8a
[ 3781.968295]  [] ? perf_ioctl+0x41d/0x495
[ 3781.980725]  [] ? vfs_ioctl+0x16/0x23
[ 3781.992893]  [] ? do_vfs_ioctl+0x46e/0x519
[ 3782.005532]  [] ? do_sigaltstack+0xe1/0x1b0
[ 3782.018184]  [] ? SyS_ioctl+0x4e/0x71
[ 3782.030319]  [] ? entry_SYSCALL_64_fastpath+0x17/0x93
[ 3782.433996] Code: e2 01 74 04 f3 90 eb f4 83 48 18 01 4c 89 e9 4c 89 e2 4c 89 f6 89 ef e8 94 fe ff ff 85 db 74 0d 41 8b 56 18 80 e2 01 74 04 f3 90  f3 48 83 c4 20 5b 5d 41 5c 41 5d 41 5e c3 41 56 41 55 41 89 



Re: perf: fuzzer crashes immediately on AMD system

2016-08-19 Thread Vince Weaver
On Fri, 19 Aug 2016, Peter Zijlstra wrote:

> On Thu, Aug 18, 2016 at 10:46:31AM -0400, Vince Weaver wrote:
> > On Thu, 18 Aug 2016, Vince Weaver wrote:
> > 
> > > Tried the perf_fuzzer on my A10 fam15h/model13h system with 4.8-rc2 and it
> > > falls over more or less immediately.
> > > 
> > > This maps to variable_test_bit()
> > >   called by ctx = find_get_context(pmu, task, event);
> > >   in kernel/events/core.c:9467
> > > 
> > > It happens quickly enough I can probably track down the exact event that 
> > > causes this, if needed.
> > 
> > I have a one line reproducer:
> > 
> > perf stat -a -e amd_nb/config=0x37,config1=0x20/ /bin/ls
> 
> OK, cannot reproduce on my fam15h/model1h. I'll go dig through the
> various manuals to see if I can spot the fail.
> 
> Huang could you either prod someone at AMD or do yourself, audit the AMD
> perf code for all the various new models?


OK, this is weird.  I rebooted (didn't patch the kernel, just rebooted) 
and I can't reproduce the original problem at all.

It was perfectly repeatable before I rebooted, dumped an OOPS message 
every time.

Sadly I don't have the fuzzer logs that originally triggered the bug (need 
more serial/USB cables.  Actually no, I need more null-modem adapters).

Let me look into this a bit more.

Vince


Re: perf: fuzzer crashes immediately on AMD system

2016-08-19 Thread Peter Zijlstra
On Fri, Aug 19, 2016 at 12:01:30PM +0200, Peter Zijlstra wrote:
> On Thu, Aug 18, 2016 at 10:46:31AM -0400, Vince Weaver wrote:
> > On Thu, 18 Aug 2016, Vince Weaver wrote:
> > 
> > > Tried the perf_fuzzer on my A10 fam15h/model13h system with 4.8-rc2 and it
> > > falls over more or less immediately.
> > > 
> > > This maps to variable_test_bit()
> > >   called by ctx = find_get_context(pmu, task, event);
> > >   in kernel/events/core.c:9467
> > > 
> > > It happens quickly enough I can probably track down the exact event that 
> > > causes this, if needed.
> > 
> > I have a one line reproducer:
> > 
> > perf stat -a -e amd_nb/config=0x37,config1=0x20/ /bin/ls
> 
> OK, cannot reproduce on my fam15h/model1h. I'll go dig through the
> various manuals to see if I can spot the fail.
> 
> Huang could you either prod someone at AMD or do yourself, audit the AMD
> perf code for all the various new models?

So this should obviously help a little in that it will limit the events
you can program into the hardware.

Not at all sure that is what you're hitting though, because I cannot for
the life of me figure how that would end up exploding in generic code.
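
To make the effect of the event-code checks in the patch below concrete,
here is a userspace sketch of the same arithmetic. The function names are
hypothetical and the helpers take a raw config value instead of a
struct hw_perf_event, but the expressions are the ones from the patch. For
the reproducer's config=0x37 the extracted event code is 0x37, and
0x37 & 0x0E0 is 0x020, so the NB check would reject it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Event code: bits 35:32 of config moved to bits 11:8, plus bits 7:0. */
static unsigned int event_code_from_config(uint64_t config)
{
	return (unsigned int)(((config >> 24) & 0x0f00) | (config & 0x00ff));
}

/* NB check from the patch: all of bits 0x0E0 must be set. */
static bool nb_event_code_allowed(uint64_t config)
{
	return (event_code_from_config(config) & 0x0E0) == 0x0E0;
}

/* L2 check from the patch: event code must lie in 0x060 - 0x07F. */
static bool l2_event_code_allowed(uint64_t config)
{
	unsigned int code = event_code_from_config(config);

	return code >= 0x060 && code <= 0x07F;
}
```

Note that config1 (0x20 in the reproducer) plays no part in either check.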

---
 arch/x86/events/amd/uncore.c | 47 +---
 1 file changed, 44 insertions(+), 3 deletions(-)

diff --git a/arch/x86/events/amd/uncore.c b/arch/x86/events/amd/uncore.c
index e6131d4..8c314d7 100644
--- a/arch/x86/events/amd/uncore.c
+++ b/arch/x86/events/amd/uncore.c
@@ -174,8 +174,8 @@ static void amd_uncore_del(struct perf_event *event, int flags)
 
 static int amd_uncore_event_init(struct perf_event *event)
 {
-	struct amd_uncore *uncore;
 	struct hw_perf_event *hwc = &event->hw;
+	struct amd_uncore *uncore;
 
 	if (event->attr.type != event->pmu->type)
 		return -ENOENT;
@@ -215,6 +215,47 @@ static int amd_uncore_event_init(struct perf_event *event)
 	return 0;
 }
 
+static inline unsigned int amd_get_event_code(struct hw_perf_event *hwc)
+{
+	return ((hwc->config >> 24) & 0x0f00) | (hwc->config & 0x00ff);
+}
+
+static int amd_uncore_l2_event_init(struct perf_event *event)
+{
+	int ret = amd_uncore_event_init(event);
+	unsigned int event_code;
+
+	if (ret)
+		return ret;
+
+	/*
+	 * Fam16h L2I performance counter events are in the range: 0x060 - 0x07F
+	 */
+	event_code = amd_get_event_code(&event->hw);
+	if (event_code < 0x060 || event_code > 0x07F)
+		return -EINVAL;
+
+	return 0;
+}
+
+static int amd_uncore_nb_event_init(struct perf_event *event)
+{
+	int ret = amd_uncore_event_init(event);
+	unsigned int event_code;
+
+	if (ret)
+		return ret;
+
+	/*
+	 * AMD NB events will have bits 0x0E0 set.
+	 */
+	event_code = amd_get_event_code(&event->hw);
+	if ((event_code & 0x0E0) != 0x0E0)
+		return -EINVAL;
+
+	return 0;
+}
+
 static ssize_t amd_uncore_attr_show_cpumask(struct device *dev,
 					    struct device_attribute *attr,
 					    char *buf)
@@ -266,7 +307,7 @@ static struct pmu amd_nb_pmu = {
 	.task_ctx_nr	= perf_invalid_context,
 	.attr_groups	= amd_uncore_attr_groups,
 	.name		= "amd_nb",
-	.event_init	= amd_uncore_event_init,
+	.event_init	= amd_uncore_nb_event_init,
 	.add		= amd_uncore_add,
 	.del		= amd_uncore_del,
 	.start		= amd_uncore_start,
@@ -278,7 +319,7 @@ static struct pmu amd_l2_pmu = {
 	.task_ctx_nr	= perf_invalid_context,
 	.attr_groups	= amd_uncore_attr_groups,
 	.name		= "amd_l2",
-	.event_init	= amd_uncore_event_init,
+	.event_init	= amd_uncore_l2_event_init,
 	.add		= amd_uncore_add,
 	.del		= amd_uncore_del,
 	.start		= amd_uncore_start,
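
To see what the two new checks in the patch above would do with a given config value, the filter logic can be tried out in user space. This is only a sketch: the helper names nb_event_code_check() and l2_event_code_check() are made up for illustration, and the functions take the raw config value rather than a struct hw_perf_event. The bit manipulation and the ranges are copied from the patch.

```c
#include <errno.h>
#include <stdint.h>

/* Same extraction as amd_get_event_code() in the patch:
 * config bits [7:0] plus config bits [35:32] moved to bits [11:8]. */
static unsigned int amd_get_event_code(uint64_t config)
{
    return ((config >> 24) & 0x0f00) | (config & 0x00ff);
}

/* Hypothetical stand-in for the check in amd_uncore_nb_event_init():
 * every bit of 0x0E0 must be set in the event code. */
static int nb_event_code_check(uint64_t config)
{
    unsigned int event_code = amd_get_event_code(config);

    if ((event_code & 0x0E0) != 0x0E0)
        return -EINVAL;
    return 0;
}

/* Hypothetical stand-in for the check in amd_uncore_l2_event_init():
 * the event code must lie in 0x060 - 0x07F. */
static int l2_event_code_check(uint64_t config)
{
    unsigned int event_code = amd_get_event_code(config);

    if (event_code < 0x060 || event_code > 0x07F)
        return -EINVAL;
    return 0;
}
```

With the reproducer's config=0x37, the event code is 0x37 and 0x37 & 0x0E0 is 0x020, so the amd_nb check returns -EINVAL. If the patch behaves like this sketch, the reproducer event would be refused at event_init time.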



Re: perf: fuzzer crashes immediately on AMD system

2016-08-19 Thread Peter Zijlstra
On Thu, Aug 18, 2016 at 10:46:31AM -0400, Vince Weaver wrote:
> On Thu, 18 Aug 2016, Vince Weaver wrote:
> 
> > Tried the perf_fuzzer on my A10 fam15h/model13h system with 4.8-rc2 and it
> > falls over more or less immediately.
> > 
> > This maps to variable_test_bit()
> >   called by ctx = find_get_context(pmu, task, event);
> >   in kernel/events/core.c:9467
> > 
> > It happens quickly enough I can probably track down the exact event that 
> > causes this, if needed.
> 
> I have a one line reproducer:
> 
>   perf stat -a -e amd_nb/config=0x37,config1=0x20/ /bin/ls

OK, cannot reproduce on my fam15h/model1h. I'll go dig through the
various manuals to see if I can spot the fail.

Huang, could you either prod someone at AMD or audit the AMD perf code
yourself for all the various new models?


Re: perf: fuzzer crashes immediately on AMD system

2016-08-18 Thread Vince Weaver
On Thu, 18 Aug 2016, Vince Weaver wrote:

> Tried the perf_fuzzer on my A10 fam15h/model13h system with 4.8-rc2 and it
> falls over more or less immediately.
> 
> This maps to variable_test_bit()
>   called by ctx = find_get_context(pmu, task, event);
>   in kernel/events/core.c:9467
> 
> It happens quickly enough I can probably track down the exact event that 
> causes this, if needed.

I have a one line reproducer:

perf stat -a -e amd_nb/config=0x37,config1=0x20/ /bin/ls
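
For anyone who wants to hit the same path without the perf tool, the one-liner boils down to roughly the following perf_event_open() call per CPU. This is a sketch under assumptions, not what perf stat does exactly: perf sets further attr fields, and the helper names are made up here. The PMU type is dynamic and would have to be read from /sys/bus/event_source/devices/amd_nb/type. On an affected kernel, calling open_amd_nb_on_cpu() may trigger the oops.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Fill the attr roughly the way "amd_nb/config=0x37,config1=0x20/" would.
 * pmu_type is an assumption: the caller reads it from sysfs. */
static void fill_amd_nb_attr(struct perf_event_attr *attr, unsigned int pmu_type)
{
    memset(attr, 0, sizeof(*attr));
    attr->size = sizeof(*attr);
    attr->type = pmu_type;
    attr->config = 0x37;
    attr->config1 = 0x20;
}

/* System-wide event ("-a"): pid = -1 and an explicit CPU number.
 * Returns the event fd, or -1 with errno set. */
static int open_amd_nb_on_cpu(unsigned int pmu_type, int cpu)
{
    struct perf_event_attr attr;

    assert(cpu >= 0); /* system-wide events need a concrete CPU */
    fill_amd_nb_attr(&attr, pmu_type);
    return (int)syscall(SYS_perf_event_open, &attr, (pid_t)-1, cpu, -1, 0UL);
}
```

Opening the event needs sufficient privileges (root or a permissive perf_event_paranoid setting), the same as perf stat -a.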




perf: fuzzer crashes immediately on AMD system

2016-08-18 Thread Vince Weaver

Tried the perf_fuzzer on my A10 fam15h/model13h system with 4.8-rc2 and it
falls over more or less immediately.

This maps to variable_test_bit()
  called by ctx = find_get_context(pmu, task, event);
  in kernel/events/core.c:9467

It happens quickly enough I can probably track down the exact event that 
causes this, if needed.

[  101.970659] BUG: unable to handle kernel paging request at 8653d8a0
[  101.977676] IP: [] find_get_context.isra.75+0x28/0x20f
[  101.984405] PGD 2807067 PUD 2808063 PMD 0 
[  101.988563] Oops:  [#1] SMP
[  102.069521] CPU: 0 PID: 2205 Comm: perf_fuzzer Not tainted 4.8.0-rc2+ #27
[  102.076313] Hardware name: Hewlett-Packard HP Compaq Pro 6305 SFF/1850, BIOS K06 v02.57 08/16/2013
[  102.085268] task: 880223ae5000 task.stack: 880224ea8000
[  102.091188] RIP: 0010:[]  [] find_get_context.isra.75+0x28/0x20f
[  102.100339] RSP: 0018:880224eabe20  EFLAGS: 00010246
[  102.105657] RAX: 2633e300 RBX:  RCX: 2633e300
[  102.112795] RDX:  RSI:  RDI: 8180ea00
[  102.119929] RBP: 8180ea00 R08: 0004 R09: 
[  102.127063] R10: 0003 R11: 0246 R12: 2633e300
[  102.134196] R13:  R14:  R15: 8180ea00
[  102.141327] FS:  7f743b391700() GS:88022ec0() knlGS:
[  102.149416] CS:  0010 DS:  ES:  CR0: 80050033
[  102.155167] CR2: 8653d8a0 CR3: 0002255b9000 CR4: 000407f0
[  102.162309] Stack:
[  102.164323]    880223b9d800 880224fdd000
[  102.171804]  880223b9d800   
[  102.179284]  8180ea00 810e72be 0002 88022e0006c0
[  102.186765] Call Trace:
[  102.189216]  [] ? SYSC_perf_event_open+0x525/0xa34
[  102.195579]  [] ? entry_SYSCALL_64_fastpath+0x17/0x93
[  102.202203] Code: 41 5c c3 41 57 41 56 41 55 41 54 55 53 48 89 fd 48 89 f3 48 83 ec 18 48 85 f6 75 6c 83 3d 2f 2a 7f 00 00 41 89 cc 7f 1e 44 89 e0 <48> 0f a3 05 87 0f 7f 00 0f 92 c0 84 c0 75 26 48 c7 c0 ed ff ff 
[  102.56] RIP  [] find_get_context.isra.75+0x28/0x20f
[  102.229065]  RSP 
[  102.232556] CR2: 8653d8a0
[  102.235879] ---[ end trace fa649074c022bab1 ]---
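
A rough way to see why a bit test can fault like this: the faulting instruction in the Code: line (48 0f a3 05 ...) is a bt against a RIP-relative bitmap, and the word it touches depends entirely on the bit index. The sketch below assumes that the RAX/R12 value in the dump (2633e300, with leading digits possibly lost in the archive) is that bit index, and it uses a hypothetical 8192-bit mask size. Neither is confirmed by the report.

```c
#include <stdint.h>

#define BITS_PER_WORD 64u

/* Byte offset, from the start of a bitmap, of the 64-bit word
 * that holds bit number nr. */
static uint64_t bit_word_byte_offset(uint64_t nr)
{
    return (nr / BITS_PER_WORD) * (BITS_PER_WORD / 8);
}

/* Returns 1 if bit nr lies inside a bitmap of nbits bits, else 0.
 * A bare bit test performs no such check. */
static int bit_in_range(uint64_t nr, uint64_t nbits)
{
    return nr < nbits;
}
```

For nr = 0x2633e300 the offset comes out as 0x4C67C60 bytes, about 76 MiB past the start of the bitmap, so the access lands far outside any CPU mask. Subtracting that offset from the CR2 value as printed (8653d8a0) gives 818d5c40, which at least looks like the low half of a kernel data address, so the numbers are consistent with a garbage CPU number being handed to the bit test.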

