Linux-Regression-ID: lr#15a115

On Thu, 2018-02-22 at 16:54 +0200, Artem Bityutskiy wrote:
> Hi Christoph,
>
> one of our test box Skylake servers does not boot with v4.16-rcX.
> Bisection led us to this commit:
>
> 84676c1f21e8 genirq/affinity: assign vectors to all possible CPUs
>
> Reverting this single commit fixes the problem.
>
> The server is a Dell R640 machine with the latest Dell BIOS. It has a
> single SATA SSD and we do not use raid, even though the system does
> have a megaraid controller.

Correction: we have Raid0 with this single disk.

> Are you aware of this issue? Below is the failure message and the
> full dmesg with some debugging boot parameters is here:
>
> https://pastebin.com/raw/tTYrTAEQ

FYI, the regression still exists and reverting this single patch fixes
it.

I did not have time to really debug this, but I think people who are
working with this should quickly see what is going on.

I think the platform reports a possible CPU count that is way too
large. Indeed, in dmesg I see this:

[    0.000000] smpboot: Allowing 328 CPUs, 224 hotplug CPUs

224 is way too large for this system. It only has 2 sockets, but the
number looks as if the system had 4 sockets.

The commit changes the IRQ affinity logic from being per-present CPU to
being per-possible CPU:

	- for_each_present_cpu(cpu)
	+ for_each_possible_cpu(cpu)

And it looks like this has an unexpected side-effect on this Dell
platform.

Artem.
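To make the present-versus-possible arithmetic concrete, here is a rough
sketch of how such a side-effect could arise. It is only a toy model and
not the kernel's actual spreading code: the vector count (16), the
assumption that present CPUs are numbered before the hotplug ones, and
the helper names spread() and dead_vectors() are all hypothetical. Only
the 328 and 224 figures come from the dmesg line above.

```python
# Toy model of spreading IRQ vectors over CPUs. This is NOT the kernel's
# algorithm; it only illustrates the present-vs-possible difference.

POSSIBLE_CPUS = 328                          # dmesg: "Allowing 328 CPUs"
HOTPLUG_CPUS = 224                           # dmesg: "224 hotplug CPUs"
PRESENT_CPUS = POSSIBLE_CPUS - HOTPLUG_CPUS  # CPUs that really exist
NUM_VECTORS = 16                             # hypothetical vector count


def spread(num_vectors, cpus):
    """Split cpus into num_vectors contiguous groups of near-equal size."""
    cpus = list(cpus)
    base, extra = divmod(len(cpus), num_vectors)
    groups, start = [], 0
    for v in range(num_vectors):
        size = base + (1 if v < extra else 0)
        groups.append(cpus[start:start + size])
        start += size
    return groups


def dead_vectors(groups, present):
    """Return indices of vectors whose group contains no present CPU."""
    return [v for v, group in enumerate(groups)
            if not any(cpu in present for cpu in group)]


# Assumption: present CPUs get the lowest CPU numbers.
present = set(range(PRESENT_CPUS))

per_present = spread(NUM_VECTORS, range(PRESENT_CPUS))    # old behaviour
per_possible = spread(NUM_VECTORS, range(POSSIBLE_CPUS))  # new behaviour

print("present CPUs:", PRESENT_CPUS)
print("dead vectors, per-present: ", dead_vectors(per_present, present))
print("dead vectors, per-possible:", dead_vectors(per_possible, present))
```

Under these assumptions, the per-present spread leaves no vector
without a real CPU, while the per-possible spread assigns a good part of
the vectors only to CPUs that do not exist on this box. Whether that is
what actually breaks the boot here is not something I have verified.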