On Tue, 13 Aug 2024 at 17:52, Jakob Bohm <jb-gnumli...@wisemo.com> wrote:
>
> On 2024-08-13 18:43, Peter Maydell wrote:
> > On Tue, 13 Aug 2024 at 16:54, Jakob Bohm via <qemu-discuss@nongnu.org>
> > wrote:
> >> Another approach that might be worth examining would be to build the TCG
> >> snippet collection for different host CPU generations (such as x86-64
> >> with only SSE2, x86-64 with SSE3 and x86-64 with AVX512), thereby
> >> providing fast emulation of the latest on the new on the new and medium
> >> CPU generations. This would imply some table selection logic in the
> >> TCG initialization based on actual host and selected guest CPUs. This
> >> idea is mostly for Qemu binaries that will run on different hosts.
> >
> > What "TCG snippet collection" ? QEMU generates code from IR
> > instruction-at-a-time, it doesn't do assembly of prebuilt
> > larger bits of code.
>
> I was referring to the part of the old Qemu build process that compiles
> a bunch of TCG equivalents, at least when the host CPU has no hardcoded
> translations for the guest CPU.

This sounds like you're thinking about the original QEMU "dyngen"
code generator. That was removed back in 2008 or thereabouts,
more than 15 years ago. Modern QEMU TCG is a fairly standard
"read guest instruction in the guest architecture specific
frontend; generate intermediate-representation operations;
optimize IR and register-allocate in architecture-agnostic
middleend; emit host instructions in host architecture specific
backend" pipeline. If QEMU TCG has no support for the host
CPU, then you can't run on it (well, in theory you can fall
back to the TCI interpreter, but I don't recommend it, because
it's terribly slow and not very well tested).

> > We do have some support for runtime detection of host
> > CPU features, so we can generate better code if some
> > things are available; notably we will check for and
> > use AVX1, AVX2, AVX512 if available.
> >
>
> This seems to leave out the cases of SSE3 orig on x86-64
> and SSE on 80386SX

Yes, we don't distinguish "x86-64 with SSE but not AVX1" from
"no vector support at all" -- there's not much benefit from
trying to optimize for 15 year old host CPUs.

thanks
-- PMM
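
Illustration (not QEMU code): the frontend / IR / optimizer / backend
pipeline described above can be sketched as a toy model. This is a minimal
sketch under loud assumptions: the guest instruction tuples, the IR op
names, the pseudo host mnemonics and every function name below are
hypothetical and chosen only for this example. The one idea it tries to
show is that the backend picks a vector instruction when a host feature
was detected at run time, and otherwise uses a single generic fallback.

```python
# Toy model of a translate pipeline: frontend -> IR -> optimizer -> backend.
# Every name here is hypothetical; none of this is QEMU's actual API.

def frontend(guest_insns):
    """Guest-specific step: turn toy guest instructions into toy IR ops."""
    ir = []
    for op, dst, a, b in guest_insns:
        if op == "li":                      # load immediate: dst = a
            ir.append(("movi", dst, a))
        elif op == "add":                   # scalar add: dst = a + b
            ir.append(("add", dst, a, b))
        elif op == "vadd":                  # vector add: dst = a + b
            ir.append(("add_vec", dst, a, b))
        else:
            raise ValueError(f"unsupported guest instruction: {op}")
    return ir


def optimize(ir):
    """Architecture-agnostic step: fold adds whose inputs are known constants."""
    consts = {}
    out = []
    for op in ir:
        if op[0] == "movi":
            consts[op[1]] = op[2]
            out.append(op)
        elif op[0] == "add" and op[2] in consts and op[3] in consts:
            value = consts[op[2]] + consts[op[3]]
            consts[op[1]] = value
            out.append(("movi", op[1], value))
        else:
            consts.pop(op[1], None)         # destination is no longer a known constant
            out.append(op)
    return out


def backend(ir, host_features):
    """Host-specific step: emit pseudo host instructions as strings."""
    host = []
    for op in ir:
        if op[0] == "movi":
            host.append(f"mov {op[1]}, {op[2]}")
        elif op[0] == "add":
            host.append(f"add {op[1]}, {op[2]}, {op[3]}")
        elif op[0] == "add_vec":
            if "avx2" in host_features:     # feature detected at run time
                host.append(f"vpaddd {op[1]}, {op[2]}, {op[3]}")
            else:                           # one generic path for everything else
                host.append(f"scalar_fallback_add {op[1]}, {op[2]}, {op[3]}")
    return host


def translate(guest_insns, host_features):
    return backend(optimize(frontend(guest_insns)), host_features)


block = [
    ("li", "r1", 2, None),
    ("li", "r2", 3, None),
    ("add", "r3", "r1", "r2"),
    ("vadd", "v0", "v1", "v2"),
]
with_avx2 = translate(block, {"avx2"})
without_avx = translate(block, set())
```

In this toy, a host that reports only "sse2" gets exactly the same output
as a host with no features at all, which mirrors the point made above about
not distinguishing "SSE but not AVX1" from "no vector support".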