Re: Possible race between CPU hotplug and perf_pmu_migrate_context

2014-09-08 Thread Peter Zijlstra
On Fri, Sep 05, 2014 at 10:31:49AM -0700, Linus Torvalds wrote:
> So quite frankly, the whole perf_pmu_migrate_context() thing looks
> completely and fundamentally broken.
>
> Your patch doesn't really solve anything in general, and would need to
> be extended to do that get_online_cpus() around e

Re: Possible race between CPU hotplug and perf_pmu_migrate_context

2014-09-05 Thread Peter Zijlstra
On Fri, Sep 05, 2014 at 10:31:49AM -0700, Linus Torvalds wrote:
> So quite frankly, the whole perf_pmu_migrate_context() thing looks
> completely and fundamentally broken.
Yes, agreed. We can go play very nasty games, but fundamentally agreed.
> Or even just say: "if somebody takes down a CPU wit

Re: Possible race between CPU hotplug and perf_pmu_migrate_context

2014-09-05 Thread Linus Torvalds
On Fri, Sep 5, 2014 at 9:59 AM, Mark Rutland wrote:
>
> As you point out below, the race on event->ctx is the fundamental issue. That
> is what results in decrementing the refcount twice (once on a stale event->ctx
> pointer).
So quite frankly, the whole perf_pmu_migrate_context() thing looks com

Re: Possible race between CPU hotplug and perf_pmu_migrate_context

2014-09-05 Thread Mark Rutland
On Fri, Sep 05, 2014 at 04:41:43PM +0100, Linus Torvalds wrote:
> On Fri, Sep 5, 2014 at 8:16 AM, Peter Zijlstra wrote:
> >
> > How horrible is the below patch (performance wise). It does pretty much
> > the same thing except that percpu_rw_semaphore is a lot saner, its
> > read side performance s

Re: Possible race between CPU hotplug and perf_pmu_migrate_context

2014-09-05 Thread Vince Weaver
On Fri, 5 Sep 2014, Linus Torvalds wrote:
> However, the more fundamental question is "what protects accesses to
> 'events->ctx'". Why is "put_event()" so special that *it* gets locking
> for the reading of "event->ctx", but none of the other cases of
> reading the ctx pointer gets it or needs it

Re: Possible race between CPU hotplug and perf_pmu_migrate_context

2014-09-05 Thread Linus Torvalds
On Fri, Sep 5, 2014 at 8:16 AM, Peter Zijlstra wrote:
>
> How horrible is the below patch (performance wise). It does pretty much
> the same thing except that percpu_rw_semaphore is a lot saner, its
> read side performance should be minimal in the absence of writes.
Ugh. Why do any locking at all

Re: Possible race between CPU hotplug and perf_pmu_migrate_context

2014-09-05 Thread Peter Zijlstra
On Thu, Sep 04, 2014 at 12:07:40PM +0100, Mark Rutland wrote:
> Thanks for taking a look. If you have any ideas I'm happy to try another
> approach.
How horrible is the below patch (performance wise). It does pretty much the same thing except that percpu_rw_semaphore is a lot saner, its read side

Re: Possible race between CPU hotplug and perf_pmu_migrate_context

2014-09-04 Thread Mark Rutland
On Thu, Sep 04, 2014 at 11:44:02AM +0100, Peter Zijlstra wrote:
> On Wed, Sep 03, 2014 at 12:50:14PM +0100, Mark Rutland wrote:
> > From 6465beace3ad9b12039127468f4596b8e87a53e8 Mon Sep 17 00:00:00 2001
> > From: Mark Rutland
> > Date: Wed, 3 Sep 2014 11:06:22 +0100
> > Subject: [PATCH] perf: prev

Re: Possible race between CPU hotplug and perf_pmu_migrate_context

2014-09-04 Thread Peter Zijlstra
On Wed, Sep 03, 2014 at 12:50:14PM +0100, Mark Rutland wrote:
> From 6465beace3ad9b12039127468f4596b8e87a53e8 Mon Sep 17 00:00:00 2001
> From: Mark Rutland
> Date: Wed, 3 Sep 2014 11:06:22 +0100
> Subject: [PATCH] perf: prevent hotplug race on event->ctx
>
> The perf_pmu_migrate_context code intr

Re: Possible race between CPU hotplug and perf_pmu_migrate_context

2014-09-03 Thread Mark Rutland
Hi all, Further to my earlier reply I've come up with a potential fix below, which has survived my stress test for both my WIP driver and the intel uncore imc driver. As it's impossible to synchronize with the event->ctx I'd hoped it would be possible to synchronize with a field on the event itse

Re: Possible race between CPU hotplug and perf_pmu_migrate_context

2014-09-02 Thread Mark Rutland
On Mon, Sep 01, 2014 at 08:05:34PM +0100, Peter Zijlstra wrote:
> On Mon, Sep 01, 2014 at 07:18:08PM +0100, Mark Rutland wrote:
> > Hi all,
>
> > [ 66.780759] [] rcu_process_callbacks+0x1e3/0x540
>
> > Has anything seen anything like this before? Is this a known issue?
>
> I've not seen it re

Re: Possible race between CPU hotplug and perf_pmu_migrate_context

2014-09-01 Thread Peter Zijlstra
On Mon, Sep 01, 2014 at 07:18:08PM +0100, Mark Rutland wrote:
> Hi all,
> [ 66.780759] [] rcu_process_callbacks+0x1e3/0x540
> Has anything seen anything like this before? Is this a known issue?
I've not seen it reported.. sounds like 'fun' though.

Possible race between CPU hotplug and perf_pmu_migrate_context

2014-09-01 Thread Mark Rutland
Hi all, While trying some rework of the ARM CCI PMU driver on v3.17-rc2, I encountered what seems to be a race between CPU hotplug and perf event context migration, which results in a BUG in mm/slub.c. It looks like this is a generic issue as I'm able to cause the same splat with the uncore_imc d