On Mon, Jun 02, 2025 at 14:30:43 +0200, Hector Cao wrote:
> Hello Jiri,
>
> Thanks for the feedback,
>
> On Mon, Jun 2, 2025 at 9:30 AM Jiri Denemark <jdene...@redhat.com> wrote:
>
> > On Mon, Jun 02, 2025 at 01:19:29 +0200, Hector Cao wrote:
> > > Several Intel CPU models with TSX technology (HLE & RTM features) are
> > > affected by the vulnerability TAA[1]. One of the mitigation methods
> > > for TAA is to disable TSX support on the host system. For that purpose,
> > > in 2021, Intel published a microcode update to disable TSX. Linux kernel
> > > also disables TSX globally by default. Even though TSX can be activated via
> > > the kernel command line (tsx=on), many Linux distributions stick with
> > > this default behavior and have TSX disabled. This makes existing CPU
> > > models that have HLE and RTM enabled not correctly detected by
> > > libvirt.
> >
> > Can you describe the issue in more details? Especially where libvirt
> > incorrectly detects CPU models because of this?
> >
>
> On my platform (Granite Rapids CPU) with TSX disabled by default in the kernel
> The TSX features rtm and hle are missing, per consequence, `virsh
> capabilities` detects the CPU as Icelake-Server-noTSX model.

I see, I was thinking this was the case. The CPU definition provided in
host capabilities is limited and cannot cover CPUs that lack some
features compared to the corresponding CPU model, so a simpler CPU model
has to be shown instead. Thus this information is mostly useless (except
for checking what exact features a host CPU supports) and it's not used
for anything by libvirt itself. And since we have a much better way of
describing the host CPU, or rather a CPU that can be provided to a guest
on the host
(virsh domcapabilities --xpath "//cpu/mode[@name='host-model']"),
there's no reason other applications or users should look at the CPU in
virsh capabilities either. It's similar to how the cpu/topology element
in virsh capabilities is useless and should not be used.

So except for not having the right CPU model in the capabilities XML
(which is not a bug, but rather a known limitation), is there any other
issue? I believe the host CPU would be correctly reported as
SapphireRapids/GraniteRapids with both hle and rtm disabled in domain
capabilities XML.

> > > This commit adds 2 remaining -noTSX models:
> > > - SapphireRapids-noTSX
> > > - GraniteRapids-noTSX
> >
> > QEMU switched away from adding suffixes to CPU models and just adds a
> > new version for a CPU model in case it needs to be updated. There's no
> > point adding these models to libvirt. Any CPU model that would only
> > exist in libvirt would not be directly usable anyway and would have to
> > be translated to another CPU model.
> >
>
> I would be grateful if you can provide me some background on what is the
> criteria to add a new version to an existing model. For the case of
> Intel, how do we know that we need to add a new version to the CPU
> model ?

I don't know, you'd need to ask QEMU developers.

> Beyond the naming issue (version vs suffix), I understand that we stopped
> doing what we did for older CPU models
> like this commit for Icelake, do I understand it correctly ?
>
> i386: Add -noTSX aliases for hle=off, rtm=off CPU models
> https://github.com/qemu/qemu/commit/02fa60d10137ed2ef17534718d7467e0d2170142

This was the original approach for creating modified CPU models that can
be used as-is without having to manually specify a bunch of features.
But when more cases appeared, they realized this approach didn't scale
and switched to versioned CPU models with -v* suffixes instead.

> Do you think that adding a new version for Sapphire and Granite Rapids
> CPU models both in QEMU and libvirt would be something that makes
> sense to tackle this issue ?

Well, you can try asking whether adding such a CPU model in QEMU would
make sense. From libvirt's POV this is just a cosmetic issue, so not
worth the effort IMHO.

Jirka
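
For reference, the domain capabilities check suggested above can also be
scripted. The following is only a minimal sketch: the helper name
host_model_cpu and the SAMPLE_DOMCAPS document are made up for
illustration (the XML is hand-written and abbreviated, not captured from
a real Granite Rapids host), and it assumes the usual
<cpu>/<mode name='host-model'> layout of `virsh domcapabilities` output.

```python
import xml.etree.ElementTree as ET

# Hand-written, abbreviated sample for illustration only (an assumption,
# not real output). On a real host, the XML would come from running
# `virsh domcapabilities`.
SAMPLE_DOMCAPS = """
<domainCapabilities>
  <cpu>
    <mode name='host-passthrough' supported='yes'/>
    <mode name='host-model' supported='yes'>
      <model fallback='forbid'>GraniteRapids</model>
      <vendor>Intel</vendor>
      <feature policy='require' name='vmx'/>
      <feature policy='disable' name='hle'/>
      <feature policy='disable' name='rtm'/>
    </mode>
  </cpu>
</domainCapabilities>
"""


def host_model_cpu(domcaps_xml):
    """Return (model, disabled_features) for the host-model CPU mode.

    Returns None if the host-model mode is absent or not supported.
    """
    root = ET.fromstring(domcaps_xml.strip())
    # Same element the --xpath expression in the mail selects.
    mode = root.find("./cpu/mode[@name='host-model']")
    if mode is None or mode.get("supported") != "yes":
        return None
    model = mode.findtext("model")
    # Features the host lacks relative to the named model are listed
    # with policy='disable'.
    disabled = [
        feature.get("name")
        for feature in mode.findall("feature")
        if feature.get("policy") == "disable"
    ]
    return model, disabled


if __name__ == "__main__":
    print(host_model_cpu(SAMPLE_DOMCAPS))
```

With XML shaped like the sample, the full model name is kept and hle and
rtm simply show up as disabled features, which is why no separate -noTSX
model is needed there.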