While looking again at Paul's patches for AVX, I came to the conclusion that the x86 decoder is unsalvageable. The encoding of x86 is simply too messy for it to be decoded in code; huge tables, derived as much as possible from the architecture reference, are the real way to go.
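To make "table-driven" a bit more concrete, here is a minimal sketch in C. It is not code from this series: the type names, the table name and the lookup helper are all made up for illustration, and only the operand notation (Eb, Gv, Ib, Iz, ...) and the listed opcode assignments come from the one-byte opcode map in the Intel manual.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical operand kinds, named after the Intel opcode-map notation. */
typedef enum {
    OPK_NONE, OPK_Eb, OPK_Ev, OPK_Gb, OPK_Gv,
    OPK_AL, OPK_eAX, OPK_Ib, OPK_Iz
} OperandKind;

/* Hypothetical table entry: one row per first opcode byte. */
typedef struct {
    const char *mnemonic;   /* NULL means "no entry in this table" */
    OperandKind dst;
    OperandKind src;
} OpcodeEntry;

/*
 * Only a handful of rows are filled in; every other slot is
 * zero-initialized, so its mnemonic is NULL.
 */
static const OpcodeEntry one_byte_table[256] = {
    [0x00] = { "add", OPK_Eb,  OPK_Gb },
    [0x01] = { "add", OPK_Ev,  OPK_Gv },
    [0x02] = { "add", OPK_Gb,  OPK_Eb },
    [0x03] = { "add", OPK_Gv,  OPK_Ev },
    [0x04] = { "add", OPK_AL,  OPK_Ib },
    [0x05] = { "add", OPK_eAX, OPK_Iz },
    [0x38] = { "cmp", OPK_Eb,  OPK_Gb },
    [0x39] = { "cmp", OPK_Ev,  OPK_Gv },
    [0x3A] = { "cmp", OPK_Gb,  OPK_Eb },
    [0x3B] = { "cmp", OPK_Gv,  OPK_Ev },
    [0x3C] = { "cmp", OPK_AL,  OPK_Ib },
    [0x3D] = { "cmp", OPK_eAX, OPK_Iz },
};

/*
 * Return the entry for an opcode byte, or NULL if the table has no row
 * for it (a caller could then fall back to another decoder).
 */
static const OpcodeEntry *lookup_one_byte(uint8_t opcode)
{
    const OpcodeEntry *e = &one_byte_table[opcode];
    return e->mnemonic ? e : NULL;
}
```

The point of the shape is that adding an instruction becomes adding a row rather than adding a branch: the decode path is a single indexed load, and the operand descriptions can be checked line by line against the manual.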
So here is a new, albeit partial, decoder that is based on three principles:

- use mostly table-driven decoding, using tables derived as much as
  possible from the Intel manual, keeping the code as "non-branchy" as
  possible

- keep address generation and (for ALU operands) memory loads and
  writeback as much in common code as possible, to avoid code
  duplication (this is less relevant to non-ALU instructions because
  read-modify-write operations are rare)

- make minimal changes to the old decoder while allowing incremental
  replacement of the old decoder with the new one

So this series introduces the main decoder flow, integrates it with the old decoder (which takes care of parsing prefixes and then optionally drops to the new one based on the first byte of the opcode), and implements three quarters of the one-byte opcodes.

It is only lightly tested, but it can boot to iPXE and run some 64-bit coreutils just fine; Linux seems to trigger a bug in outsw/l/q emulation that I haven't checked yet, but still it's enough to show the result of a couple of days of hacking.

The generated code is mostly the same, though marginally worse in some cases because I favored code simplicity. For example, MOVSXD is not able to use MO_SL and falls back to MO_UL + sign extension. One notable difference is that the new decoder always sign-extends 8-bit immediates, so for example a "cmpb $0xe9, %dl" instruction will subtract $0xfff...fffe9 from the temporary value. This is the way Intel intended "Ib" immediates to work, and there's no difference in the result between the two.

Anyway, porting these opcodes is really more of a validation for the whole concept and a test for the common decoder code; it's probably more efficient to focus on the SSE and VEX 2-byte and 3-byte opcodes as a path towards enabling AVX in QEMU, and keep the existing decoder for non-VEX, non-SSE opcodes. Getting the conditions right for VEX.L, VEX.W etc. is going to be, well, vexing because of the way Intel has decided to format the exception tables in the manual, but it should be feasible to use a more table-based decoding process for those operations as well.

The series is available at https://gitlab.com/bonzini/qemu.git, branch i386.

Paolo

Paolo Bonzini (17):
  target/i386: extract old decoder to a separate file
  target/i386: introduce insn_get_addr
  target/i386: add core of new i386 decoder
  target/i386: add ALU load/writeback core
  target/i386: add 00-07, 10-17 opcodes
  target/i386: add 08-0F, 18-1F opcodes
  target/i386: add 20-27, 30-37 opcodes
  target/i386: add 28-2f, 38-3f opcodes
  target/i386: add 40-47, 50-57 opcodes
  target/i386: add 48-4f, 58-5f opcodes
  target/i386: add 60-67, 70-77 opcodes
  target/i386: add 68-6f, 78-7f opcodes
  target/i386: add 80-87, 90-97 opcodes
  target/i386: add a0-a7, b0-b7 opcodes
  target/i386: do not clobber A0 in POP translation
  target/i386: add 88-8f, 98-9f opcodes
  target/i386: add a8-af, b8-bf opcodes

 target/i386/tcg/decode-new.c.inc | 1254 +++++++
 target/i386/tcg/decode-old.c.inc | 5707 +++++++++++++++++++++++++++++
 target/i386/tcg/emit.c.inc       |  684 ++++
 target/i386/tcg/translate.c      | 5822 +-----------------------------
 4 files changed, 7740 insertions(+), 5727 deletions(-)
 create mode 100644 target/i386/tcg/decode-new.c.inc
 create mode 100644 target/i386/tcg/decode-old.c.inc
 create mode 100644 target/i386/tcg/emit.c.inc

-- 
2.37.1