Hi Thomas,

On Fri, Sep 06, 2019 at 02:51:17PM +0200, Thomas Gleixner wrote:
> Raj,
>
> On Thu, 5 Sep 2019, Raj, Ashok wrote:
> > On Thu, Sep 05, 2019 at 11:22:31PM +0200, Thomas Gleixner wrote:
> > > That's all nice, but what it the general use case for this outside of
> > > Intel's microcode development and testing?
> > >
> > > We all know that late microcode loading has severe limitations and we
> > > really don't want to proliferate that further if not absolutely required
> >
> > Several customers have asked this to check the safety of late loads. They
> > want to validate in production setup prior to rolling late-load to all
> > production systems.
>
> Groan. Late loading _IS_ broken by definition and it was so forever.

Let's tighten the seat belts :-)

I'm with you that late-loading has shown weakness more recently than
earlier. There are several obvious reasons that you are well aware of.
But there is a lot that *must* be done to make sure the guard rails are
tight enough for deploying late-load. I 100% agree that the interface
and mechanism need to be improved for robustness, but late loading is
not a candidate for removal. Certainly this is an argument that would
help me drive towards that objective internally.

>
> What your customers are asking for is a receipe for disaster. They can
> check the safety of late loading forever, it will not magically become safe
> because they do so.
>
> If you want late loading, then the whole approach needs to be reworked from
> ground up. You need to make sure that all CPUs are in a safe state,
> i.e. where switching of CPU feature bits of all sorts can be done with the
> guarantee that no CPU will return to the wrong code path after coming out
> of safe state and that any kernel internal state which depends on the
> previous set of CPU feature bits has been mopped up and switched over
> before CPUs are released.
>
> That does not exist and unless it does, late loading is just going to cause
> trouble nothing else.
>
> So, no. We are not merging something which is known to be broken and then
> we have to deal with the subtle fallout and the bug reports forever. Not to

When we did the late-load changes last year, we added a warning if any
of the cpuid bits either disappear or new ones appear. Maybe we should
have tainted the kernel to track that, so it's not that subtle anymore.

> talk about having to fend of half baken duct tape patches which try to glue
> things together.
>
> The only sensible patch for that is to remove any trace of late loading
> crappola once and forever.
>
> Sorry, -ENOPONIES :-)

Cheers,
Ashok