On Mon, Mar 13, 2017 at 2:05 PM, Andy Lutomirski <l...@kernel.org> wrote:
> On Mon, Mar 13, 2017 at 9:55 AM, Peter Zijlstra <pet...@infradead.org> wrote:
>> On Mon, Mar 13, 2017 at 09:44:02AM -0700, Andy Lutomirski wrote:
>>> static void x86_pmu_event_mapped(struct perf_event *event)
>>> {
>>>     if (!(event->hw.flags & PERF_X86_EVENT_RDPMC_ALLOWED))
>>>         return;
>>>
>>>     if (atomic_inc_return(&current->mm->context.perf_rdpmc_allowed) == 1)
>>>
>>> <-- thread 1 stalls here
>>>
>>>         on_each_cpu_mask(mm_cpumask(current->mm), refresh_pce, NULL, 1);
>>> }
>>>
>>> Suppose you start with perf_rdpmc_allowed == 0.  Thread 1 runs
>>> x86_pmu_event_mapped and gets preempted (or just runs slowly) where I
>>> marked.  Then thread 2 runs the whole function, does *not* update CR4,
>>> returns to userspace, and GPFs.
>>>
>>> The big hammer solution is to stick a per-mm mutex around it.  Let me
>>> ponder whether a smaller hammer is available.
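
(Concretely, the interleaving described above is:

    thread 1                             thread 2
    --------                             --------
    atomic_inc_return() -> 1
    <stalls before on_each_cpu_mask()>
                                         atomic_inc_return() -> 2
                                         skips on_each_cpu_mask()
                                         returns to userspace
                                         rdpmc => #GP (CR4.PCE still
                                         clear on that CPU)
    on_each_cpu_mask() finally runs

so thread 2 can be back in userspace before any CPU has had CR4.PCE
set.)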
>>
>> Reminds me a bit of what we ended up with in 
>> kernel/jump_label.c:static_key_slow_inc().
>>
>>
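(The idiom there, roughly -- paraphrasing kernel/jump_label.c of that
era, details may differ: the fast path only increments when the key is
already live, and the 0->1 transition does the expensive update under
jump_label_lock(), so concurrent incrementers wait for it to finish:

    for (v = atomic_read(&key->enabled); v > 0; v = v1) {
        v1 = atomic_cmpxchg(&key->enabled, v, v + 1);
        if (likely(v1 == v))
            return;                     /* fast path: already enabled */
    }

    jump_label_lock();
    if (atomic_read(&key->enabled) == 0) {
        atomic_set(&key->enabled, -1);  /* mark update in progress */
        jump_label_update(key);
        atomic_set(&key->enabled, 1);
    } else {
        atomic_inc(&key->enabled);      /* raced; someone else enabled it */
    }
    jump_label_unlock();

which is the same shape of problem as above.)
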
>
> One thing I don't get: isn't mmap_sem held for write the whole time?

mmap_sem is indeed held for write, so two threads can't actually race
there and my theory is wrong.  I can reproduce the failure, but I
don't see the bug yet...
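
(For completeness, the per-mm-mutex big hammer from above would have
looked something like this -- untested sketch, where perf_rdpmc_mutex
is a hypothetical new field in mm_context_t:

    static void x86_pmu_event_mapped(struct perf_event *event)
    {
        struct mm_struct *mm = current->mm;

        if (!(event->hw.flags & PERF_X86_EVENT_RDPMC_ALLOWED))
            return;

        /* Serialize the 0->1 transition so nobody can return to
         * userspace before the CR4.PCE IPIs have completed. */
        mutex_lock(&mm->context.perf_rdpmc_mutex);
        if (atomic_inc_return(&mm->context.perf_rdpmc_allowed) == 1)
            on_each_cpu_mask(mm_cpumask(mm), refresh_pce, NULL, 1);
        mutex_unlock(&mm->context.perf_rdpmc_mutex);
    }

with the unmap path taking the same mutex around the matching
decrement.)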

--Andy

-- 
Andy Lutomirski
AMA Capital Management, LLC
