On Mon, 12 Aug 2024 at 18:47, Chris Parker <ch...@parkerfamily.name> wrote:
> FYI - I have been interested for some time in making an accelerator that 
> would run in a sort-of hybrid KVM mode, where host-supported instructions are 
> executed by the processor, but unsupported functions are provided by 
> something like the TCG. How feasible that is depends on how new and different 
> the missing instructions are.  For example, this first came up when I wanted 
> to run an x86 32-bit program with SSE2 instructions on a Pentium III which 
> only had SSE instructions; that would be doable in theory since SSE2 only 
> adds new processor instructions, but not any new registers (if it did, that 
> would be a lot more complicated or even impossible to support in this way).  
> My original idea was to handle the ILLEGAL INSTRUCTION processor exception 
> and examine the offending instruction to see if it was covered in the TCG 
> implementation, and if so execute the TCG and then resume the program, which 
> in theory is totally possible. Now I am facing a similar issue running x64 
> 64-bit SSE4.1 instructions on a 64-bit processor that only supports up to 
> SSE2, so it's like deja vu! I don't know if/when I'll ever get around to 
> coding something like this, but as you can see, the problem isn't going away, 
> and demand for this sort of thing is likely to grow in the coming years as 
> applications and operating systems continue to disregard old hardware and 
> begin to use newer processor features.


Yes, this kind of idea is one that's come up before. It's
certainly possible in theory, but QEMU as it stands isn't really
designed to be able to do this -- the choice of accelerator
is an all-or-nothing one, with no thought of "use TCG only
for this next instruction or two and then do something else".
If you do it by catching the illegal-instruction exception,
this also limits you to new CPU features which only add
instructions that previously would cause an exception. It can't
handle features that change the behaviour of existing
instructions, because those instructions never trap.
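
As a rough illustration of the trap-and-emulate control flow
discussed above, here is a toy model in Python. It is only a
sketch: the opcode names, the IllegalInstruction exception and
the host_execute/emulate/run helpers are all hypothetical and do
not correspond to anything in QEMU, KVM or TCG.

```python
# Toy model of "run natively, emulate only what traps".
# Every name here is hypothetical; this is not QEMU/KVM/TCG code.


class IllegalInstruction(Exception):
    """Stands in for the CPU's illegal-instruction exception."""


def host_execute(regs, insn):
    """Pretend host CPU: knows only 'mov' and 'add', traps on the rest."""
    op, dst, src = insn
    if op == "mov":
        regs[dst] = regs[src]
    elif op == "add":
        regs[dst] = regs[dst] + regs[src]
    else:
        raise IllegalInstruction(op)


def emulate(regs, insn):
    """Software fallback for a 'newer' instruction.

    'max' uses only registers the host already has, which is the
    easy case described in the thread. Returns False if the
    instruction is unknown to the fallback as well.
    """
    op, dst, src = insn
    if op == "max":
        regs[dst] = max(regs[dst], regs[src])
        return True
    return False


def run(program, regs):
    """Execute the program and return how many instructions were emulated."""
    emulated = 0
    pc = 0
    while pc < len(program):
        insn = program[pc]
        try:
            host_execute(regs, insn)
        except IllegalInstruction:
            # Only instructions that trap ever reach this point. An
            # instruction the host accepts but whose behaviour differs
            # on newer CPUs would run natively and never be seen here.
            if not emulate(regs, insn):
                raise
            emulated += 1
        pc += 1  # resume after the handled instruction
    return emulated


if __name__ == "__main__":
    regs = {"r0": 3, "r1": 7, "r2": 0}
    program = [("mov", "r2", "r0"), ("max", "r2", "r1"), ("add", "r2", "r0")]
    count = run(program, regs)
    print(regs["r2"], count)
```

In this model only the "max" instruction goes through the
fallback; everything else runs on the pretend host unchanged.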

If you want an example of using this approach with something
other than QEMU:
https://community.arm.com/arm-community-blogs/b/high-performance-computing-blog/posts/emulating-sve-on-armv8-using-dynamorio-and-armie
discusses the "ArmIE" tool which is based on DynamoRIO.
(DynamoRIO is an open source framework: https://dynamorio.org/
and ArmIE is a closed-source tool built using it.)
I think because DynamoRIO is an instrumentation framework
it's much more suited to this kind of "leave most instructions
the way they are but when you see XYZ do something else"
design than QEMU is (QEMU's emulation always converts to
an intermediate representation, does register allocation
and optimisation and then emits code for that, whether the
host architecture is the same as the guest or different).

thanks
-- PMM
