On Thu, Aug 31, 2017 at 09:08:05AM +0200, Thomas Gleixner wrote:
> On Wed, 30 Aug 2017, Peter Zijlstra wrote:
> > On offline it basically does perf_event_disable() for all CPU context
> > events, and then adds HOTPLUG_OFFSET (-32) to arrive at: OFF +
> > HOTPLUG_OFFSET = -33.
> >
> > That's smaller than ERROR and thus perf_event_enable() no-ops on events
> > for offline CPUs (maybe we should try and plumb an error return for
> > IOC_ENABLE).
> >
> > On online we subtract the HOTPLUG_OFFSET again and the event becomes a
> > regular OFF, after which perf_event_enable() should work again.
>
> I haven't come around to test that as I was busy cleaning up the unholy
> mess in the watchdog code.
>
> One other thing I stumbled over is:
>
> perf_event_create()
>   ....
>   x86_hw_reserve(event)
>
>   if (__x86_pmu_event_init(event) < 0)
>     event->destroy(event);
>       x86_hw_release()
>         ....
>         cpus_read_lock();
>
> If that happens from a hotplug function, we are doomed.
>
> I mean, that particular watchdog event won't fail if the watchdog code
> would verify that already at init time (which it does soon), but in general
> event creation during hotplug is dangerous.

Arghh!!! And allowing us to create events for offline CPUs (possible I
think, but maybe slightly tricky) won't solve that, because we're
already holding the hotplug_lock during PREPARE.

I'll try and think...
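
For reference, the state arithmetic quoted above can be modelled in a few lines of userspace C. This is only a sketch under assumptions, not the actual patch: the `sketch_*` names and the struct are made up for illustration, the numeric values for OFF (-1) and ERROR (-2) are assumed to follow the kernel's perf event state ordering, and the enable check is a simplified stand-in for what perf_event_enable() does.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values, following the ordering ERROR < OFF < INACTIVE. */
enum { STATE_ERROR = -2, STATE_OFF = -1, STATE_INACTIVE = 0 };

/* Offset value as quoted in the thread. */
#define HOTPLUG_OFFSET (-32)

/* Hypothetical stand-in for a CPU context event. */
struct sketch_event {
	int state;
};

/* Offline: disable the event, then push its state below ERROR. */
static void sketch_cpu_offline(struct sketch_event *e)
{
	e->state = STATE_OFF;
	e->state += HOTPLUG_OFFSET;	/* OFF + HOTPLUG_OFFSET = -33 */
}

/* Online: undo the offset so the event is a regular OFF again. */
static void sketch_cpu_online(struct sketch_event *e)
{
	e->state -= HOTPLUG_OFFSET;
}

/*
 * Simplified enable: a no-op (returns false) if the event is already
 * enabled or if its state is below ERROR, which covers every state
 * that carries the hotplug offset.
 */
static bool sketch_event_enable(struct sketch_event *e)
{
	if (e->state >= STATE_INACTIVE || e->state < STATE_ERROR)
		return false;
	e->state = STATE_INACTIVE;
	return true;
}

/* Walks one offline/online cycle and checks each step. */
static void sketch_selfcheck(void)
{
	struct sketch_event e = { .state = STATE_OFF };

	sketch_cpu_offline(&e);
	assert(e.state == -33);
	assert(!sketch_event_enable(&e));	/* refused while offline */

	sketch_cpu_online(&e);
	assert(e.state == STATE_OFF);
	assert(sketch_event_enable(&e));	/* works again once online */
}
```

Calling sketch_selfcheck() from a main() exercises the cycle. Note that this only models the enable path; it says nothing about the cpus_read_lock() recursion on the event creation failure path, which is the separate problem raised above.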

