On 14/05/2024 12:09 pm, Andrew Cooper wrote:
> On 13/05/2024 9:59 am, Roger Pau Monne wrote:
>> There's no point in forcing a system wide update of the MTRRs on all processors
>> when there are no changes to be propagated.  On AP startup it's only the AP
>> that needs to write the system wide MTRR values in order to match the rest of
>> the already online CPUs.
>>
>> We have occasionally seen the watchdog trigger during `xen-hptool cpu-online`
>> in one Intel Cascade Lake box with 448 CPUs due to the re-setting of the MTRRs
>> on all the CPUs in the system.
>>
>> While there adjust the comment to clarify why the system-wide resetting of the
>> MTRR registers is not needed for the purposes of mtrr_ap_init().
>>
>> Signed-off-by: Roger Pau Monné <roger....@citrix.com>
>> ---
>> For consideration for 4.19: it's a bugfix of a rare instance of the watchdog
>> triggering, but it's also a good performance improvement when performing
>> cpu-online.
>>
>> Hopefully runtime changes to MTRR will affect a single MSR at a time, lowering
>> the chance of the watchdog triggering due to the system-wide resetting of the
>> range.
> "Runtime" changes will only be during dom0 boot, if at all, but yes - it
> is restricted to a single MTRR at a time.
>
> It's XENPF_{add,del,read}_memtype, but it's only used by Classic Linux. 
> PVOps only issues read_memtype.
>
> Acked-by: Andrew Cooper <andrew.coop...@citrix.com>

Having stared at the manuals, I expect the reason this is intermittent
even with dedicated testing is that on SMM entry, CR0.CD/NW are
specifically left unmodified.  Therefore, an SMI hitting the critical
region will proceed at a glacial pace.

But it does occur to me that the rendezvous is a plain rendezvous, which
means it will also be taking NMIs because of the watchdog at 2Hz, and
those will be glacial too.

A further optimisation would be to not disable caches if there are no
updates to make.  This will be the overwhelmingly common case in
general, and the 100% case on CPU hot{un,}plug, but as it is, getting
rid of the unnecessary rendezvous is still a massive improvement.

~Andrew
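
For illustration only, the "don't disable caches if there are no updates
to make" idea could be sketched roughly as below.  This is not Xen code:
NUM_VAR_MTRRS, struct mtrr_state, mtrr_needs_update() and mtrr_sync() are
hypothetical names, and the counter merely stands in for the expensive
cache-disabled window (CR0.CD set, cache flush, MSR writes) so that the
fast path can be observed in a plain userspace build.

```c
/*
 * Illustrative sketch, NOT Xen code.  Every name below is hypothetical.
 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_VAR_MTRRS 8 /* assumed; real hardware reports the count in MTRRcap */

struct mtrr_var {
    uint64_t base;
    uint64_t mask;
};

struct mtrr_state {
    struct mtrr_var var[NUM_VAR_MTRRS];
    uint64_t def_type;
};

/* Counts how often the (simulated) cache-disabled slow path was entered. */
static unsigned int cache_disable_count;

/* Return true if any register in 'want' differs from 'cur'. */
static bool mtrr_needs_update(const struct mtrr_state *cur,
                              const struct mtrr_state *want)
{
    if ( cur->def_type != want->def_type )
        return true;

    for ( size_t i = 0; i < NUM_VAR_MTRRS; i++ )
        if ( cur->var[i].base != want->var[i].base ||
             cur->var[i].mask != want->var[i].mask )
            return true;

    return false;
}

/*
 * Bring 'cur' in line with 'want'.  Returns true if the slow path was
 * taken, false if nothing needed changing and caches were left alone.
 */
static bool mtrr_sync(struct mtrr_state *cur, const struct mtrr_state *want)
{
    assert(cur && want);

    if ( !mtrr_needs_update(cur, want) )
        return false;          /* Fast path: no cache disable at all. */

    cache_disable_count++;     /* Stands in for disabling and flushing caches. */
    *cur = *want;              /* Stands in for the MTRR MSR writes. */
    /* Caches would be re-enabled here. */

    return true;
}
```

With something of this shape, a CPU whose registers already match the
target state would never enter the slow window, so an SMI or watchdog
NMI arriving at that point would run with caches enabled as normal.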