On Sat, 7 Sep 2019, Chris Wilson wrote:
> Quoting Thomas Gleixner (2019-09-07 15:29:19)
> > On Sat, 7 Sep 2019, Chris Wilson wrote:
> > > Quoting Linus Torvalds (2019-09-02 18:28:26)
> > > > Bandan Das:
> > > >       x86/apic: Include the LDR when clearing out APIC registers
> > >
> > > Apologies if this is known already, I'm way behind on email.
> > >
> > > I've bisected
> > >
> > > [ 18.693846] smpboot: CPU 0 is now offline
> > > [ 19.707737] smpboot: Booting Node 0 Processor 0 APIC 0x0
> > > [ 29.707602] smpboot: do_boot_cpu failed(-1) to wakeup CPU#0
> > >
> > > https://intel-gfx-ci.01.org/tree/drm-tip/igt@perf_...@cpu-hotplug.html
> > >
> > > to 558682b52919. (Reverts cleanly and fixes the problem.)
> > >
> > > I'm guessing that this is also behind the suspend failures, missing
> > > /dev/cpu/0/msr, and random perf_event_open() failures we have observed
> > > in our CI since -rc7 across all generations of Intel cpus.
> >
> > So is this on bare metal or in a VM?
>
> Our single virtualised piece of kit doesn't support cpu hotplug, so this
> test is not being run. We have failures on
> icl (2019), glk (2017), kbl (2017), bxt (2016), skl (2015),
> bsw (2016), hsw (2013), byt (2013), snb (2011), elk (2008),
> bwr (2006), blb (2007)

Ok, let me find a testbox to figure out what's wrong there. Does this only
happen with that CPU0 hotplug stuff enabled, or on CPUs other than CPU0 as
well?

That hotplug CPU0 stuff is a bandaid, so I wouldn't be surprised if we
broke that somehow.

Thanks,

	tglx