On Fri, Sep 06, 2019 at 11:16:00PM +0200, Thomas Gleixner wrote:
>
> So if we want to do late microcode loading in a sane way then there are
> only a few options and none of them exist today:
>
>  1) Micro-code contains a description of CPUID bits which are going to be
>     exposed after the load. Then the kernel can sanity check whether this
>     changes anything relevant or not. If there is a relevant change it can
>     reject the load and tell the admin that a reboot is required.

This is pretty much what we had in mind when we suggested this to the uCode
teams: just a process of providing a metadata file to accompany every uCode
release. IMO new CPUID bits are probably less harmful than old ones
disappearing.

>
>  2) Rework CPUID feature handling so that it can reevaluate and reconfigure
>     the running system safely. There are a lot of things you need for that:
>
>     A) Introduce a safe state for CPUs to reach which guarantees that none
>        of the CPUs will return from that state via a code path which
>        depends on previous state and might now go the other route with data
>        on the stack which only fits the previous configuration.
>
>     B) Make all the cpufeature thingies run time switchable. That means
>        that you need to keep quite some code around which is currently init
>        only. That also means that you have to provide backout code for
>        things which set up data corresponding to cpu feature bits and so
>        forth.
>
> So #2 might be finished in about 20 years from now with the result that
> some of the code pathes might simply still have a

Maybe we can catch the kernel side in 20 years, but user space would still
be busted, or would have to have a fault way to control new cpuid, much like
how we do for VMs.

>
>    if (cpufeature_changed())
>        panic();
>
> because there are things which you cannot back out. So the only sane
> solution is to panic. Which is not a solution as it would be much more sane
> to prevent late loading upfront and force people to reboot proper.
>
> Now #1 is actually a sensible and feasible solution which can be pulled off
> in a reasonably short time frame, avoids all the bound to be ugly and
> failure laden attempts of fixing late loading completely and provides a
> usable and safe solution for joe user, jack admin and the super experts at
> big-cloud corporate.
>
> That is not requiring any new format of microcode payload, as this can be
> nicely done as a metadata package which comes with the microcode
> payload.
> So you get the following backwards compatible states:
>
>   Kernel    metadata      result
>   old       don't care    refuse late load
>   new       No            refuse late load
>   new       Yes           decide based on metadata
>
> Thoughts?

This is 100% in line with what we proposed...

Cheers,
Ashok
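
For illustration only, the three-row table above could be sketched in C roughly as follows. This is a minimal sketch, not existing kernel code: `struct ucode_metadata`, `late_load_decide()` and the flat 64-bit masks are hypothetical names and simplifications made up for this example, and "relevant" is assumed to mean "any CPUID bit the running kernel has already acted upon".

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical metadata shipped next to a microcode release: the CPUID
 * feature bits the update would add or remove, flattened into one mask
 * each (an assumption made for this sketch).
 */
struct ucode_metadata {
	uint64_t bits_added;
	uint64_t bits_removed;
};

enum late_load_verdict {
	LATE_LOAD_REFUSE,	/* tell the admin a reboot is required */
	LATE_LOAD_ALLOW,
};

/*
 * kernel_understands_metadata == false models the "old" kernel row.
 * md == NULL models the "metadata: No" row.
 * relevant_mask holds the bits the running kernel depends on (assumption).
 */
static enum late_load_verdict
late_load_decide(bool kernel_understands_metadata,
		 const struct ucode_metadata *md, uint64_t relevant_mask)
{
	if (!kernel_understands_metadata)
		return LATE_LOAD_REFUSE;	/* old kernel: always refuse */

	if (md == NULL)
		return LATE_LOAD_REFUSE;	/* new kernel, no metadata */

	/* New kernel with metadata: refuse only if a relevant bit changes. */
	if ((md->bits_added | md->bits_removed) & relevant_mask)
		return LATE_LOAD_REFUSE;

	return LATE_LOAD_ALLOW;
}
```

The sketch treats added and removed bits the same way; a real policy could be more lenient about newly appearing bits than about disappearing ones.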