Re: [PULL 00/24] Build system and target/i386/translate.c cleanups for 2024-05-25

2024-05-25 Thread Richard Henderson

On 5/25/24 04:33, Paolo Bonzini wrote:

The following changes since commit 70581940cabcc51b329652becddfbc6a261b1b83:

   Merge tag 'pull-tcg-20240523' of https://gitlab.com/rth7680/qemu into
staging (2024-05-23 09:47:40 -0700)

are available in the Git repository at:

   https://gitlab.com/bonzini/qemu.git  tags/for-upstream

for you to fetch changes up to 70eb5fde05bdd051c087669ffcf2aee39e0c8170:

   migration: remove unnecessary zlib dependency (2024-05-25 13:28:02 +0200)


Build system and target/i386/translate.c cleanups


Applied, thanks.  Please update https://wiki.qemu.org/ChangeLog/9.1 as 
appropriate.


r~




Re: [PATCH 0/6] target/riscv: Support Zabha extension

2024-05-25 Thread LIU Zhiwei



On 2024/5/24 19:44, Daniel Henrique Barboza wrote:

Hi Zhiwei!



On 5/23/24 09:40, LIU Zhiwei wrote:
Zabha adds support for AMO operations on bytes and halfwords. If zacas
has been implemented, zabha also adds support for amocas.b and amocas.h.

More details are in the specification here:
https://github.com/riscv/riscv-zabha

The implementation of zabha follows the approach used for AMOs and zacas.

This patch set is based on these two patch sets:
1. https://mail.gnu.org/archive/html/qemu-riscv/2024-05/msg00207.html
2. https://mail.gnu.org/archive/html/qemu-riscv/2024-05/msg00212.html


These 2 series don't seem to apply on top of each other, no matter which
order I try. Applying zimop/zcmop first, then zama16b:

$ git am \[PATCH\ 1_1\]\ target_riscv\:\ Support\ Zama16b\ extension\ 
-\ LIU\ Zhiwei\ \\ -\ 2024-05-22\ 0613.eml

Applying: target/riscv: Support Zama16b extension
error: patch failed: target/riscv/cpu.c:1464
error: target/riscv/cpu.c: patch does not apply
Patch failed at 0001 target/riscv: Support Zama16b extension
hint: Use 'git am --show-current-patch=diff' to see the failed patch


Applying zama16b first, then zimop/zcmop:

$ git am \[PATCH\ 1_1\]\ target_riscv\:\ Support\ Zama16b\ extension\ 
-\ LIU\ Zhiwei\ \\ -\ 2024-05-22\ 0613.eml

Applying: target/riscv: Support Zama16b extension
$
$ git am \[PATCH\ 1_4\]\ target_riscv\:\ Add\ zimop\ extension\ -\ 
LIU\ Zhiwei\ \\ -\ 2024-05-22\ 0329.eml 
\[PATCH\ 2_4\]\ disas_riscv\:\ Support\ zimop\ disassemble\ -\ LIU\ 
Zhiwei\ \\ -\ 2024-05-22\ 0329.eml

Applying: target/riscv: Add zimop extension
error: patch failed: target/riscv/cpu.c:1463
error: target/riscv/cpu.c: patch does not apply
Patch failed at 0001 target/riscv: Add zimop extension


If the series are dependent on each other, perhaps it's easier to send
everything in a single 11-patch series.


They don't depend on each other. But if both are rebased onto the
master branch, they can't be merged at the same time, as they both
modify cpu.h and cpu.c in the same place.

I will send them as a single patch set (the RVA23 patch set) after I fix
the other issues with implementing the RVA23 profile.


Thanks,

Zhiwei




Thanks,

Daniel




LIU Zhiwei (6):
   target/riscv: Move gen_amo before implement Zabha
   target/riscv: Add AMO instructions for Zabha
   target/riscv: Move gen_cmpxchg before adding amocas.[b|h]
   target/riscv: Add amocas.[b|h] for Zabha
   target/riscv: Enable zabha for max cpu
   disas/riscv: Support zabha disassemble

  disas/riscv.c   |  60 
  target/riscv/cpu.c  |   2 +
  target/riscv/cpu_cfg.h  |   1 +
  target/riscv/insn32.decode  |  22 +++
  target/riscv/insn_trans/trans_rva.c.inc |  21 ---
  target/riscv/insn_trans/trans_rvzabha.c.inc | 145 
  target/riscv/insn_trans/trans_rvzacas.c.inc |  13 --
  target/riscv/translate.c    |  36 +
  8 files changed, 266 insertions(+), 34 deletions(-)
  create mode 100644 target/riscv/insn_trans/trans_rvzabha.c.inc





Re: [PATCH 1/4] target/riscv: Add zimop extension

2024-05-25 Thread LIU Zhiwei

Hi Daniel,

On 2024/5/24 17:46, Daniel Henrique Barboza wrote:



On 5/22/24 03:29, LIU Zhiwei wrote:

The Zimop extension defines an encoding space for 40 MOPs. The Zimop
extension defines 32 MOP instructions named MOP.R.n, where n is
an integer between 0 and 31, inclusive. The Zimop extension
additionally defines 8 MOP instructions named MOP.RR.n, where n
is an integer between 0 and 7.

These 40 MOPs initially are defined to simply write zero to x[rd],
but are designed to be redefined by later extensions to perform some
other action.

Signed-off-by: LIU Zhiwei 
---
  target/riscv/cpu.c  |  2 ++
  target/riscv/cpu_cfg.h  |  1 +
  target/riscv/insn32.decode  | 11 ++
  target/riscv/insn_trans/trans_rvzimop.c.inc | 37 +
  target/riscv/translate.c    |  1 +
  5 files changed, 52 insertions(+)
  create mode 100644 target/riscv/insn_trans/trans_rvzimop.c.inc
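
Note for readers following the encoding: a minimal sketch of how the
%imm_mop5 and %imm_mop3 fields in the patch below assemble the MOP index n,
assuming decodetree concatenates the listed bit ranges with the first one
most significant. All helper names here are hypothetical and are not QEMU
functions; the register file is modeled as a plain array.

```c
#include <assert.h>
#include <stdint.h>

/* Extract 'len' bits of 'insn' starting at bit 'pos' (hypothetical helper). */
static uint32_t bits(uint32_t insn, unsigned pos, unsigned len)
{
    assert(len > 0 && len < 32 && pos + len <= 32);
    return (insn >> pos) & ((1u << len) - 1u);
}

/* %imm_mop5 30:1 26:2 20:2  ->  n[4], n[3:2], n[1:0]; result is 0..31 */
static uint32_t mop_r_index(uint32_t insn)
{
    return (bits(insn, 30, 1) << 4) |
           (bits(insn, 26, 2) << 2) |
            bits(insn, 20, 2);
}

/* %imm_mop3 30:1 26:2  ->  n[2], n[1:0]; result is 0..7 */
static uint32_t mop_rr_index(uint32_t insn)
{
    return (bits(insn, 30, 1) << 2) | bits(insn, 26, 2);
}

/* Until a later extension redefines a MOP, it only writes zero to x[rd]. */
static void mop_default(uint64_t xreg[32], unsigned rd)
{
    assert(rd < 32);
    if (rd != 0) {              /* x0 is hard-wired to zero */
        xreg[rd] = 0;
    }
}
```

The 32 MOP.R.n and 8 MOP.RR.n indices together cover the 40 MOPs named in
the commit message; the default action is the same for all of them.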

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index eb1a2e7d6d..c1ac521142 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -175,6 +175,7 @@ const RISCVIsaExtData isa_edata_arr[] = {
  ISA_EXT_DATA_ENTRY(zvkt, PRIV_VERSION_1_12_0, ext_zvkt),
  ISA_EXT_DATA_ENTRY(zhinx, PRIV_VERSION_1_12_0, ext_zhinx),
  ISA_EXT_DATA_ENTRY(zhinxmin, PRIV_VERSION_1_12_0, ext_zhinxmin),
+    ISA_EXT_DATA_ENTRY(zimop, PRIV_VERSION_1_12_0, ext_zimop),


Shouldn't this be placed right after zihpm?


Yes. Thanks.

I didn't notice the strict ordering of extensions. I will fix this
and the similar comments on the other patches.


Zhiwei



    ISA_EXT_DATA_ENTRY(zihintpause, PRIV_VERSION_1_10_0, 
ext_zihintpause),

    ISA_EXT_DATA_ENTRY(zihpm, PRIV_VERSION_1_12_0, ext_zihpm),

+    ISA_EXT_DATA_ENTRY(zimop, PRIV_VERSION_1_12_0, ext_zimop),

    ISA_EXT_DATA_ENTRY(zmmul, PRIV_VERSION_1_12_0, ext_zmmul),


Thanks,

Daniel



  ISA_EXT_DATA_ENTRY(smaia, PRIV_VERSION_1_12_0, ext_smaia),
  ISA_EXT_DATA_ENTRY(smepmp, PRIV_VERSION_1_12_0, ext_smepmp),
  ISA_EXT_DATA_ENTRY(smstateen, PRIV_VERSION_1_12_0, ext_smstateen),
@@ -1463,6 +1464,7 @@ const RISCVCPUMultiExtConfig 
riscv_cpu_extensions[] = {

  MULTI_EXT_CFG_BOOL("zicsr", ext_zicsr, true),
  MULTI_EXT_CFG_BOOL("zihintntl", ext_zihintntl, true),
  MULTI_EXT_CFG_BOOL("zihintpause", ext_zihintpause, true),
+    MULTI_EXT_CFG_BOOL("zimop", ext_zimop, false),
  MULTI_EXT_CFG_BOOL("zacas", ext_zacas, false),
  MULTI_EXT_CFG_BOOL("zaamo", ext_zaamo, false),
  MULTI_EXT_CFG_BOOL("zalrsc", ext_zalrsc, false),
diff --git a/target/riscv/cpu_cfg.h b/target/riscv/cpu_cfg.h
index cb750154bd..b547fbba9d 100644
--- a/target/riscv/cpu_cfg.h
+++ b/target/riscv/cpu_cfg.h
@@ -71,6 +71,7 @@ struct RISCVCPUConfig {
  bool ext_zihintntl;
  bool ext_zihintpause;
  bool ext_zihpm;
+    bool ext_zimop;
  bool ext_ztso;
  bool ext_smstateen;
  bool ext_sstc;
diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode
index f22df04cfd..972a1e8fd1 100644
--- a/target/riscv/insn32.decode
+++ b/target/riscv/insn32.decode
@@ -38,6 +38,8 @@
  %imm_bs   30:2   !function=ex_shift_3
  %imm_rnum 20:4
  %imm_z6   26:1 15:5
+%imm_mop5 30:1 26:2 20:2
+%imm_mop3 30:1 26:2
    # Argument sets:
  &empty
@@ -56,6 +58,8 @@
  &r2nfvm    vm rd rs1 nf
  &rnfvm vm rd rs1 rs2 nf
  &k_aes shamt rs2 rs1 rd
+&mop5 imm rd rs1
+&mop3 imm rd rs1 rs2
    # Formats 32:
  @r   ...   . . ... . ... &r    
%rs2 %rs1 %rd

@@ -98,6 +102,9 @@
  @k_aes   .. . . .  ... . ... &k_aes 
shamt=%imm_bs   %rs2 %rs1 %rd
  @i_aes   .. . . .  ... . ... &i 
imm=%imm_rnum    %rs1 %rd
  +@mop5 . . .. ..  .. . ... . ... &mop5 
imm=%imm_mop5 %rd %rs1
+@mop3 . . .. .. . . . ... . ... &mop3 imm=%imm_mop3 
%rd %rs1 %rs2

+
  # Formats 64:
  @sh5 ...  . .  ... . ... &shift 
shamt=%sh5  %rs1 %rd
  @@ -1010,3 +1017,7 @@ amocas_w    00101 . . . . 010 . 
010 @atom_st

  amocas_d    00101 . . . . 011 . 010 @atom_st
  # *** RV64 Zacas Standard Extension ***
  amocas_q    00101 . . . . 100 . 010 @atom_st
+
+# *** Zimop may-be-operation extension ***
+mop_r_n 1 . 00 .. 0111 .. . 100 . 0111011 @mop5
+mop_rr_n    1 . 00 .. 1 . . 100 . 0111011 @mop3
diff --git a/target/riscv/insn_trans/trans_rvzimop.c.inc 
b/target/riscv/insn_trans/trans_rvzimop.c.inc

new file mode 100644
index 00..165aacd2b6
--- /dev/null
+++ b/target/riscv/insn_trans/trans_rvzimop.c.inc
@@ -0,0 +1,37 @@
+/*
+ * RISC-V translation routines for May-Be-Operation(zimop).
+ *
+ * Copyright (c) 2024 Alibaba Group.
+ *
+ * This program is free software; you can redistribute it and/or 
modify it

+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation

Re: [PULL 17/20] target/arm: Do memory type alignment check when translation disabled

2024-05-25 Thread Bernhard Beschow



On 25 May 2024 13:41:54 UTC, Bernhard Beschow wrote:
>
>
>On 5 March 2024 13:52:34 UTC, Peter Maydell wrote:
>>From: Richard Henderson 
>>
>>If translation is disabled, the default memory type is Device, which
>>requires alignment checking.  This is more optimally done early via
>>the MemOp given to the TCG memory operation.
>>
>>Reviewed-by: Philippe Mathieu-Daudé 
>>Reported-by: Idan Horowitz 
>>Signed-off-by: Richard Henderson 
>>Message-id: 20240301204110.656742-6-richard.hender...@linaro.org
>>Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1204
>>Signed-off-by: Richard Henderson 
>>Signed-off-by: Peter Maydell 
>
>Hi,
>
>This change causes an old 4.14.40 Linux kernel to panic on boot using the 
>sabrelite machine:
>
>[snip]
>Alignment trap: init (1) PC=0x76f1e3d4 Instr=0x14913004 Address=0x76f34f3e FSR 
>0x001
>Alignment trap: init (1) PC=0x76f1e3d8 Instr=0x148c3004 Address=0x7e8492bd FSR 
>0x801
>Alignment trap: init (1) PC=0x76f0dab0 Instr=0x6823 Address=0x7e849fbb FSR 
>0x001
>Alignment trap: init (1) PC=0x76f0dab2 Instr=0x6864 Address=0x7e849fbf FSR 
>0x001
>scsi 0:0:0:0: Direct-Access QEMU QEMU HARDDISK2.5+ PQ: 0 ANSI: 5
>fsl-asoc-card sound: ASoC: CODEC DAI sgtl5000 not registered
>imx-sgtl5000 sound: ASoC: CODEC DAI sgtl5000 not registered
>imx-sgtl5000 sound: snd_soc_register_card failed (-517)
>Alignment trap: init (1) PC=0x76eac95a Instr=0xf8dd5015 Address=0x7e849b05 FSR 
>0x001
>Alignment trap: not handling instruction f8dd5015 at [<76eac95a>]
>Unhandled fault: alignment exception (0x001) at 0x7e849b05
>pgd = 9c59c000
>[7e849b05] *pgd=2c552831, *pte=109eb34f, *ppte=109eb83f
>Kernel panic - not syncing: Attempted to kill init! exitcode=0x0007
>
>---[ end Kernel panic - not syncing: Attempted to kill init! 
>exitcode=0x0007
>
>As you can see, some alignment exceptions are handled by the kernel, the last 
>one isn't. I added some additional printk()'s and traced it down to this 
>location in the kernel: 
> 
>which claims that ARMv6++ CPUs can handle up to word-sized unaligned accesses, 
>thus no fixup is needed.
>
>I hope that this will be sufficient for a fix. Let me know if you need any 
>additional information.

I'm performing a direct kernel boot. On real hardware, a bootloader is involved 
which probably enables unaligned access. This may explain why it works there 
but not in QEMU any longer.

To fix direct kernel boot, it seems as if the "built-in bootloader" would need 
to be adapted/extended [1]. Any ideas?

Best regards,
Bernhard

 [1] 
https://stackoverflow.com/questions/68949890/how-does-qemu-emulate-a-kernel-without-a-bootloader

>
>Best regards,
>Bernhard
>
>>---
>> target/arm/tcg/hflags.c | 34 --
>> 1 file changed, 32 insertions(+), 2 deletions(-)
>>
>>diff --git a/target/arm/tcg/hflags.c b/target/arm/tcg/hflags.c
>>index 8e5d35d9227..5da1b0fc1d4 100644
>>--- a/target/arm/tcg/hflags.c
>>+++ b/target/arm/tcg/hflags.c
>>@@ -26,6 +26,35 @@ static inline bool fgt_svc(CPUARMState *env, int el)
>> FIELD_EX64(env->cp15.fgt_exec[FGTREG_HFGITR], HFGITR_EL2, SVC_EL1);
>> }
>> 
>>+/* Return true if memory alignment should be enforced. */
>>+static bool aprofile_require_alignment(CPUARMState *env, int el, uint64_t 
>>sctlr)
>>+{
>>+#ifdef CONFIG_USER_ONLY
>>+return false;
>>+#else
>>+/* Check the alignment enable bit. */
>>+if (sctlr & SCTLR_A) {
>>+return true;
>>+}
>>+
>>+/*
>>+ * If translation is disabled, then the default memory type is
>>+ * Device(-nGnRnE) instead of Normal, which requires that alignment
>>+ * be enforced.  Since this affects all ram, it is most efficient
>>+ * to handle this during translation.
>>+ */
>>+if (sctlr & SCTLR_M) {
>>+/* Translation enabled: memory type in PTE via MAIR_ELx. */
>>+return false;
>>+}
>>+if (el < 2 && (arm_hcr_el2_eff(env) & (HCR_DC | HCR_VM))) {
>>+/* Stage 2 translation enabled: memory type in PTE. */
>>+return false;
>>+}
>>+return true;
>>+#endif
>>+}
>>+
>> static CPUARMTBFlags rebuild_hflags_common(CPUARMState *env, int fp_el,
>>ARMMMUIdx mmu_idx,
>>CPUARMTBFlags flags)
>>@@ -121,8 +150,9 @@ static CPUARMTBFlags rebuild_hflags_a32(CPUARMState *env, 
>>int fp_el,
>> {
>> CPUARMTBFlags flags = {};
>> int el = arm_current_el(env);
>>+uint64_t sctlr = arm_sctlr(env, el);
>> 
>>-if (arm_sctlr(env, el) & SCTLR_A) {
>>+if (aprofile_require_alignment(env, el, sctlr)) {
>> DP_TBFLAG_ANY(flags, ALIGN_MEM, 1);
>> }
>> 
>>@@ -223,7 +253,7 @@ static CPUARMTBFlags rebuild_hflags_a64(CPUARMState *env, 
>>int el, int fp_el,
>> 
>> sctlr = regime_sctlr(env, stage1);
>> 
>>-if (sctlr & SCTLR_A) {
>>+if (aprofile_require_alignment(env, el, sctlr)) {
>> DP_TBFLAG_ANY(fl

Re: [PULL 17/20] target/arm: Do memory type alignment check when translation disabled

2024-05-25 Thread Bernhard Beschow



On 5 March 2024 13:52:34 UTC, Peter Maydell wrote:
>From: Richard Henderson 
>
>If translation is disabled, the default memory type is Device, which
>requires alignment checking.  This is more optimally done early via
>the MemOp given to the TCG memory operation.
>
>Reviewed-by: Philippe Mathieu-Daudé 
>Reported-by: Idan Horowitz 
>Signed-off-by: Richard Henderson 
>Message-id: 20240301204110.656742-6-richard.hender...@linaro.org
>Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1204
>Signed-off-by: Richard Henderson 
>Signed-off-by: Peter Maydell 

Hi,

This change causes an old 4.14.40 Linux kernel to panic on boot using the 
sabrelite machine:

[snip]
Alignment trap: init (1) PC=0x76f1e3d4 Instr=0x14913004 Address=0x76f34f3e FSR 
0x001
Alignment trap: init (1) PC=0x76f1e3d8 Instr=0x148c3004 Address=0x7e8492bd FSR 
0x801
Alignment trap: init (1) PC=0x76f0dab0 Instr=0x6823 Address=0x7e849fbb FSR 0x001
Alignment trap: init (1) PC=0x76f0dab2 Instr=0x6864 Address=0x7e849fbf FSR 0x001
scsi 0:0:0:0: Direct-Access QEMU QEMU HARDDISK2.5+ PQ: 0 ANSI: 5
fsl-asoc-card sound: ASoC: CODEC DAI sgtl5000 not registered
imx-sgtl5000 sound: ASoC: CODEC DAI sgtl5000 not registered
imx-sgtl5000 sound: snd_soc_register_card failed (-517)
Alignment trap: init (1) PC=0x76eac95a Instr=0xf8dd5015 Address=0x7e849b05 FSR 
0x001
Alignment trap: not handling instruction f8dd5015 at [<76eac95a>]
Unhandled fault: alignment exception (0x001) at 0x7e849b05
pgd = 9c59c000
[7e849b05] *pgd=2c552831, *pte=109eb34f, *ppte=109eb83f
Kernel panic - not syncing: Attempted to kill init! exitcode=0x0007

---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0007

As you can see, some alignment exceptions are handled by the kernel, the last 
one isn't. I added some additional printk()'s and traced it down to this 
location in the kernel: 
 
which claims that ARMv6++ CPUs can handle up to word-sized unaligned accesses, 
thus no fixup is needed.

I hope that this will be sufficient for a fix. Let me know if you need any 
additional information.

Best regards,
Bernhard
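
For anyone who wants to check which register states enforce alignment, here
is a standalone sketch of the predicate added by the quoted patch. It is a
model under stated assumptions, not QEMU code: the MODEL_* names are
hypothetical, and the HCR_EL2 test (HCR_DC | HCR_VM) is reduced to a single
stage2_enabled flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_SCTLR_M (1u << 0)  /* stage 1 translation (MMU) enable */
#define MODEL_SCTLR_A (1u << 1)  /* alignment check enable */

/*
 * Return true if alignment should be enforced, following the order of
 * checks in the quoted patch.  'stage2_enabled' stands in for the
 * (HCR_DC | HCR_VM) test and is an assumption of this model.
 */
static bool model_require_alignment(uint64_t sctlr, int el, bool stage2_enabled)
{
    assert(el >= 0 && el <= 3);

    if (sctlr & MODEL_SCTLR_A) {
        return true;            /* explicit alignment checking */
    }
    if (sctlr & MODEL_SCTLR_M) {
        return false;           /* memory type comes from the page tables */
    }
    if (el < 2 && stage2_enabled) {
        return false;           /* memory type comes from stage 2 */
    }
    return true;                /* translation off: Device memory */
}
```

With both bits clear at EL1 and no stage 2, the model returns true, which is
the translation-disabled case described in the commit message.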

>---
> target/arm/tcg/hflags.c | 34 --
> 1 file changed, 32 insertions(+), 2 deletions(-)
>
>diff --git a/target/arm/tcg/hflags.c b/target/arm/tcg/hflags.c
>index 8e5d35d9227..5da1b0fc1d4 100644
>--- a/target/arm/tcg/hflags.c
>+++ b/target/arm/tcg/hflags.c
>@@ -26,6 +26,35 @@ static inline bool fgt_svc(CPUARMState *env, int el)
> FIELD_EX64(env->cp15.fgt_exec[FGTREG_HFGITR], HFGITR_EL2, SVC_EL1);
> }
> 
>+/* Return true if memory alignment should be enforced. */
>+static bool aprofile_require_alignment(CPUARMState *env, int el, uint64_t 
>sctlr)
>+{
>+#ifdef CONFIG_USER_ONLY
>+return false;
>+#else
>+/* Check the alignment enable bit. */
>+if (sctlr & SCTLR_A) {
>+return true;
>+}
>+
>+/*
>+ * If translation is disabled, then the default memory type is
>+ * Device(-nGnRnE) instead of Normal, which requires that alignment
>+ * be enforced.  Since this affects all ram, it is most efficient
>+ * to handle this during translation.
>+ */
>+if (sctlr & SCTLR_M) {
>+/* Translation enabled: memory type in PTE via MAIR_ELx. */
>+return false;
>+}
>+if (el < 2 && (arm_hcr_el2_eff(env) & (HCR_DC | HCR_VM))) {
>+/* Stage 2 translation enabled: memory type in PTE. */
>+return false;
>+}
>+return true;
>+#endif
>+}
>+
> static CPUARMTBFlags rebuild_hflags_common(CPUARMState *env, int fp_el,
>ARMMMUIdx mmu_idx,
>CPUARMTBFlags flags)
>@@ -121,8 +150,9 @@ static CPUARMTBFlags rebuild_hflags_a32(CPUARMState *env, 
>int fp_el,
> {
> CPUARMTBFlags flags = {};
> int el = arm_current_el(env);
>+uint64_t sctlr = arm_sctlr(env, el);
> 
>-if (arm_sctlr(env, el) & SCTLR_A) {
>+if (aprofile_require_alignment(env, el, sctlr)) {
> DP_TBFLAG_ANY(flags, ALIGN_MEM, 1);
> }
> 
>@@ -223,7 +253,7 @@ static CPUARMTBFlags rebuild_hflags_a64(CPUARMState *env, 
>int el, int fp_el,
> 
> sctlr = regime_sctlr(env, stage1);
> 
>-if (sctlr & SCTLR_A) {
>+if (aprofile_require_alignment(env, el, sctlr)) {
> DP_TBFLAG_ANY(flags, ALIGN_MEM, 1);
> }
> 



[RFC PATCH 1/3] hw/intc/s390_flic: Migrate pending state

2024-05-25 Thread Nicholas Piggin
The flic pending state is not migrated, so if the machine is migrated
while an interrupt is pending, the interrupt can be lost. This shows up
in the qtest migration test: an extint is pending (due to console
writes?) and the CPU waits via s390_cpu_set_psw and expects the interrupt
to wake it. However, when the flic pending state is lost,
s390_cpu_has_int returns false, so s390_cpu_exec_interrupt falls through
to halting again.

Fix this by migrating pending. This prevents the qtest from hanging.
Does service_param need to be migrated? Or the IO lists?

Signed-off-by: Nicholas Piggin 
---
 hw/intc/s390_flic.c | 1 +
 1 file changed, 1 insertion(+)
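
A toy model of the failure may help reviewers who don't know the flic. It
assumes a device state reduced to three fields and a hand-written save/load
pair; none of this is QEMU's VMState API, and every name below is
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint8_t simm;
    uint8_t nimm;
    uint32_t pending;   /* bitmask of pending interrupt classes */
} ToyFlic;

/* Serialize the state; 'with_pending' selects the extended field list. */
static size_t toy_save(const ToyFlic *s, uint8_t *buf, bool with_pending)
{
    size_t n = 0;

    assert(s && buf);
    buf[n++] = s->simm;
    buf[n++] = s->nimm;
    if (with_pending) {
        memcpy(buf + n, &s->pending, sizeof(s->pending));
        n += sizeof(s->pending);
    }
    return n;
}

/* Restore into a destination that starts from reset state. */
static void toy_load(ToyFlic *s, const uint8_t *buf, bool with_pending)
{
    assert(s && buf);
    memset(s, 0, sizeof(*s));
    s->simm = buf[0];
    s->nimm = buf[1];
    if (with_pending) {
        memcpy(&s->pending, buf + 2, sizeof(s->pending));
    }
}

/* A halted CPU is only woken if something is still pending. */
static bool toy_has_int(const ToyFlic *s)
{
    return s->pending != 0;
}
```

Saving without the pending field and loading on the destination leaves
toy_has_int() false even though the source had an interrupt queued, which
is the shape of the hang described above.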

diff --git a/hw/intc/s390_flic.c b/hw/intc/s390_flic.c
index 6771645699..b70cf2295a 100644
--- a/hw/intc/s390_flic.c
+++ b/hw/intc/s390_flic.c
@@ -369,6 +369,7 @@ static const VMStateDescription qemu_s390_flic_vmstate = {
 .fields = (const VMStateField[]) {
 VMSTATE_UINT8(simm, QEMUS390FLICState),
 VMSTATE_UINT8(nimm, QEMUS390FLICState),
+VMSTATE_UINT32(pending, QEMUS390FLICState),
 VMSTATE_END_OF_LIST()
 }
 };
-- 
2.43.0




[RFC PATCH 3/3] tests/qtest/migration-test: Enable test_ignore_shared

2024-05-25 Thread Nicholas Piggin
This was said to be broken on aarch64, but if it works on others,
let's try enabling it. It's already starting to bitrot...

Cc: Yury Kotov 
Cc: Dr. David Alan Gilbert 
Signed-off-by: Nicholas Piggin 
---
 tests/qtest/migration-test.c | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 7987faaded..2bcdc33b7c 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -1862,14 +1862,15 @@ static void 
test_precopy_unix_tls_x509_override_host(void)
 #endif /* CONFIG_TASN1 */
 #endif /* CONFIG_GNUTLS */
 
-#if 0
-/* Currently upset on aarch64 TCG */
 static void test_ignore_shared(void)
 {
 g_autofree char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs);
 QTestState *from, *to;
+MigrateStart args = {
+.use_shmem = true,
+};
 
-if (test_migrate_start(&from, &to, uri, false, true, NULL, NULL)) {
+if (test_migrate_start(&from, &to, uri, &args)) {
 return;
 }
 
@@ -1898,7 +1899,6 @@ static void test_ignore_shared(void)
 
 test_migrate_end(from, to, true);
 }
-#endif
 
 static void *
 test_migrate_xbzrle_start(QTestState *from,
@@ -3537,7 +3537,10 @@ int main(int argc, char **argv)
 #endif /* CONFIG_TASN1 */
 #endif /* CONFIG_GNUTLS */
 
-/* migration_test_add("/migration/ignore_shared", test_ignore_shared); */
+if (strcmp(arch, "aarch64") != 0) { /* Currently upset on aarch64 TCG */
+migration_test_add("/migration/ignore_shared", test_ignore_shared);
+}
+
 #ifndef _WIN32
 migration_test_add("/migration/precopy/fd/tcp",
test_migrate_precopy_fd_socket);
-- 
2.43.0




[RFC PATCH 0/3] Fix s390x flic migration and add some more qtests

2024-05-25 Thread Nicholas Piggin
I don't know s390x enough to know if this is the right fix, but I
could debug the migration hangs this far at least (and the patch
fixes the condition that would previously result in a hang on the
qtest).

Also we could enable the test_ignore_shared test that seems to work
on s390x and ppc64 at least.

Thanks,
Nick

Nicholas Piggin (3):
  hw/intc/s390_flic: Migrate pending state
  tests/qtest/migration-test: enable on s390x
  tests/qtest/migration-test: Enable test_ignore_shared

 hw/intc/s390_flic.c  |  1 +
 tests/qtest/migration-test.c | 25 -
 2 files changed, 9 insertions(+), 17 deletions(-)

-- 
2.43.0




[RFC PATCH 2/3] tests/qtest/migration-test: enable on s390x

2024-05-25 Thread Nicholas Piggin
s390x is more stable now. Enable it.

Signed-off-by: Nicholas Piggin 
---
 tests/qtest/migration-test.c | 12 
 1 file changed, 12 deletions(-)

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 94d5057857..7987faaded 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -3428,16 +3428,6 @@ int main(int argc, char **argv)
 migration_test_add("/migration/analyze-script", test_analyze_script);
 #endif
 
-/*
- * On s390x, the test seems to be touchy with TCG, perhaps due to race
- * conditions on dirty bits, so disable it there until the problems are
- * resolved.
- */
-if (g_str_equal(arch, "s390x") && !has_kvm) {
-g_test_message("Skipping tests: s390x host with KVM is required");
-goto test_add_done;
-}
-
 if (is_x86) {
 migration_test_add("/migration/precopy/unix/suspend/live",
test_precopy_unix_suspend_live);
@@ -3619,8 +3609,6 @@ int main(int argc, char **argv)
test_vcpu_dirty_limit);
 }
 
-test_add_done:
-
 ret = g_test_run();
 
 g_assert_cmpint(ret, ==, 0);
-- 
2.43.0




[PULL 24/24] migration: remove unnecessary zlib dependency

2024-05-25 Thread Paolo Bonzini
zlib code is only used by the emulators, not by the tests.

Signed-off-by: Paolo Bonzini 
---
 meson.build   | 2 +-
 migration/dirtyrate.c | 1 -
 migration/qemu-file.c | 1 -
 migration/meson.build | 2 +-
 4 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/meson.build b/meson.build
index 7fd82b5f48c..63866071445 100644
--- a/meson.build
+++ b/meson.build
@@ -3696,7 +3696,7 @@ libmigration = static_library('migration', sources: 
migration_files + genh,
   name_suffix: 'fa',
   build_by_default: false)
 migration = declare_dependency(link_with: libmigration,
-   dependencies: [zlib, qom, io])
+   dependencies: [qom, io])
 system_ss.add(migration)
 
 block_ss = block_ss.apply({})
diff --git a/migration/dirtyrate.c b/migration/dirtyrate.c
index d02d70b7b4b..1d9db812990 100644
--- a/migration/dirtyrate.c
+++ b/migration/dirtyrate.c
@@ -12,7 +12,6 @@
 
 #include "qemu/osdep.h"
 #include "qemu/error-report.h"
-#include <zlib.h>
 #include "hw/core/cpu.h"
 #include "qapi/error.h"
 #include "exec/ramblock.h"
diff --git a/migration/qemu-file.c b/migration/qemu-file.c
index 9ccbbb00991..b6d2f588bd7 100644
--- a/migration/qemu-file.c
+++ b/migration/qemu-file.c
@@ -22,7 +22,6 @@
  * THE SOFTWARE.
  */
 #include "qemu/osdep.h"
-#include <zlib.h>
 #include "qemu/madvise.h"
 #include "qemu/error-report.h"
 #include "qemu/iov.h"
diff --git a/migration/meson.build b/migration/meson.build
index 8815f808374..bdc3244bce0 100644
--- a/migration/meson.build
+++ b/migration/meson.build
@@ -29,7 +29,7 @@ system_ss.add(files(
   'socket.c',
   'tls.c',
   'threadinfo.c',
-), gnutls)
+), gnutls, zlib)
 
 if get_option('replication').allowed()
   system_ss.add(files('colo-failover.c', 'colo.c'))
-- 
2.45.1




[PULL 13/24] target/i386: reg in gen_ldst_modrm is always OR_TMP0

2024-05-25 Thread Paolo Bonzini
Values other than OR_TMP0 were only ever used by MOV and MOVNTI
opcodes.  Now that these have been converted to the new decoder,
remove the argument.

Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 33 -
 1 file changed, 12 insertions(+), 21 deletions(-)
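
For context, a small sketch of how the ModRM byte splits into the mod, reg
and rm fields that gen_ldst_modrm and its callers use. The struct and
function names are hypothetical, and the REX contributions are assumed to
be passed in as 0 or 8 so they can simply be OR-ed in.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    unsigned mod;   /* 3 selects a register operand, anything else memory */
    unsigned reg;   /* register field, extended by REX.R */
    unsigned rm;    /* register/memory field, extended by REX.B */
} ModRM;

/* rex_r and rex_b are assumed to be 0 or 8 in this model. */
static ModRM decode_modrm(uint8_t modrm, unsigned rex_r, unsigned rex_b)
{
    ModRM m;

    assert((rex_r & ~8u) == 0 && (rex_b & ~8u) == 0);
    m.mod = (modrm >> 6) & 3;
    m.reg = ((modrm >> 3) & 7) | rex_r;
    m.rm = (modrm & 7) | rex_b;
    return m;
}
```

After this patch the helper only needs mod and rm: the value always travels
through T0, so the reg field matters only to the callers that still
compute it.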

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 46c452032ba..4bb932af16b 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -1828,10 +1828,9 @@ static void gen_add_A0_ds_seg(DisasContext *s)
 gen_lea_v_seg(s, s->aflag, s->A0, R_DS, s->override);
 }
 
-/* generate modrm memory load or store of 'reg'. TMP0 is used if reg ==
-   OR_TMP0 */
+/* generate modrm memory load or store of 'reg'. */
 static void gen_ldst_modrm(CPUX86State *env, DisasContext *s, int modrm,
-   MemOp ot, int reg, int is_store)
+   MemOp ot, int is_store)
 {
 int mod, rm;
 
@@ -1839,24 +1838,16 @@ static void gen_ldst_modrm(CPUX86State *env, 
DisasContext *s, int modrm,
 rm = (modrm & 7) | REX_B(s);
 if (mod == 3) {
 if (is_store) {
-if (reg != OR_TMP0)
-gen_op_mov_v_reg(s, ot, s->T0, reg);
 gen_op_mov_reg_v(s, ot, rm, s->T0);
 } else {
 gen_op_mov_v_reg(s, ot, s->T0, rm);
-if (reg != OR_TMP0)
-gen_op_mov_reg_v(s, ot, reg, s->T0);
 }
 } else {
 gen_lea_modrm(env, s, modrm);
 if (is_store) {
-if (reg != OR_TMP0)
-gen_op_mov_v_reg(s, ot, s->T0, reg);
 gen_op_st_v(s, ot, s->T0, s->A0);
 } else {
 gen_op_ld_v(s, ot, s->T0, s->A0);
-if (reg != OR_TMP0)
-gen_op_mov_reg_v(s, ot, reg, s->T0);
 }
 }
 }
@@ -3447,7 +3438,7 @@ static void disas_insn_old(DisasContext *s, CPUState 
*cpu, int b)
 ot = dflag;
 modrm = x86_ldub_code(env, s);
 reg = ((modrm >> 3) & 7) | REX_R(s);
-gen_ldst_modrm(env, s, modrm, ot, OR_TMP0, 0);
+gen_ldst_modrm(env, s, modrm, ot, 0);
 gen_extu(ot, s->T0);
 
 /* Note that lzcnt and tzcnt are in different extensions.  */
@@ -3598,14 +3589,14 @@ static void disas_insn_old(DisasContext *s, CPUState 
*cpu, int b)
 tcg_gen_ld32u_tl(s->T0, tcg_env,
  offsetof(CPUX86State, ldt.selector));
 ot = mod == 3 ? dflag : MO_16;
-gen_ldst_modrm(env, s, modrm, ot, OR_TMP0, 1);
+gen_ldst_modrm(env, s, modrm, ot, 1);
 break;
 case 2: /* lldt */
 if (!PE(s) || VM86(s))
 goto illegal_op;
 if (check_cpl0(s)) {
 gen_svm_check_intercept(s, SVM_EXIT_LDTR_WRITE);
-gen_ldst_modrm(env, s, modrm, MO_16, OR_TMP0, 0);
+gen_ldst_modrm(env, s, modrm, MO_16, 0);
 tcg_gen_trunc_tl_i32(s->tmp2_i32, s->T0);
 gen_helper_lldt(tcg_env, s->tmp2_i32);
 }
@@ -3620,14 +3611,14 @@ static void disas_insn_old(DisasContext *s, CPUState 
*cpu, int b)
 tcg_gen_ld32u_tl(s->T0, tcg_env,
  offsetof(CPUX86State, tr.selector));
 ot = mod == 3 ? dflag : MO_16;
-gen_ldst_modrm(env, s, modrm, ot, OR_TMP0, 1);
+gen_ldst_modrm(env, s, modrm, ot, 1);
 break;
 case 3: /* ltr */
 if (!PE(s) || VM86(s))
 goto illegal_op;
 if (check_cpl0(s)) {
 gen_svm_check_intercept(s, SVM_EXIT_TR_WRITE);
-gen_ldst_modrm(env, s, modrm, MO_16, OR_TMP0, 0);
+gen_ldst_modrm(env, s, modrm, MO_16, 0);
 tcg_gen_trunc_tl_i32(s->tmp2_i32, s->T0);
 gen_helper_ltr(tcg_env, s->tmp2_i32);
 }
@@ -3636,7 +3627,7 @@ static void disas_insn_old(DisasContext *s, CPUState 
*cpu, int b)
 case 5: /* verw */
 if (!PE(s) || VM86(s))
 goto illegal_op;
-gen_ldst_modrm(env, s, modrm, MO_16, OR_TMP0, 0);
+gen_ldst_modrm(env, s, modrm, MO_16, 0);
 gen_update_cc_op(s);
 if (op == 4) {
 gen_helper_verr(tcg_env, s->T0);
@@ -3900,7 +3891,7 @@ static void disas_insn_old(DisasContext *s, CPUState 
*cpu, int b)
  */
 mod = (modrm >> 6) & 3;
 ot = (mod != 3 ? MO_16 : s->dflag);
-gen_ldst_modrm(env, s, modrm, ot, OR_TMP0, 1);
+gen_ldst_modrm(env, s, modrm, ot, 1);
 break;
 case 0xee: /* rdpkru */
 if (s->prefix & (PREFIX_LOCK | PREFIX_DATA
@@ -3927,7 +3918,7 @@ static void disas_insn_old(DisasContext *s, CPUState 
*cpu, int b)
 break;
 }
 gen_svm_check_intercept(s, SVM_EXIT_WRITE_CR0);
-gen_ldst_modrm(env, s, modrm, MO_16, OR_TMP

[PULL 06/24] target/i386: cpu_load_eflags already sets cc_op

2024-05-25 Thread Paolo Bonzini
There is no need to set it again at the end of the translation block;
cc_op_dirty can be set to false.

Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 37 -
 target/i386/tcg/emit.c.inc  |  2 +-
 2 files changed, 25 insertions(+), 14 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 920d854c2b5..25c973e20c6 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -309,7 +309,7 @@ static const uint8_t cc_op_live[CC_OP_NB] = {
 [CC_OP_POPCNT] = USES_CC_SRC,
 };
 
-static void set_cc_op(DisasContext *s, CCOp op)
+static void set_cc_op_1(DisasContext *s, CCOp op, bool dirty)
 {
 int dead;
 
@@ -332,20 +332,27 @@ static void set_cc_op(DisasContext *s, CCOp op)
 tcg_gen_discard_tl(s->cc_srcT);
 }
 
-if (op == CC_OP_DYNAMIC) {
-/* The DYNAMIC setting is translator only, and should never be
-   stored.  Thus we always consider it clean.  */
-s->cc_op_dirty = false;
-} else {
-/* Discard any computed CC_OP value (see shifts).  */
-if (s->cc_op == CC_OP_DYNAMIC) {
-tcg_gen_discard_i32(cpu_cc_op);
-}
-s->cc_op_dirty = true;
+if (dirty && s->cc_op == CC_OP_DYNAMIC) {
+tcg_gen_discard_i32(cpu_cc_op);
 }
+s->cc_op_dirty = dirty;
 s->cc_op = op;
 }
 
+static void set_cc_op(DisasContext *s, CCOp op)
+{
+/*
+ * The DYNAMIC setting is translator only, everything else
+ * will be spilled later.
+ */
+set_cc_op_1(s, op, op != CC_OP_DYNAMIC);
+}
+
+static void assume_cc_op(DisasContext *s, CCOp op)
+{
+set_cc_op_1(s, op, false);
+}
+
 static void gen_update_cc_op(DisasContext *s)
 {
 if (s->cc_op_dirty) {
@@ -3554,6 +3561,10 @@ static void disas_insn_old(DisasContext *s, CPUState 
*cpu, int b)
 gen_update_cc_op(s);
 gen_update_eip_cur(s);
 gen_helper_syscall(tcg_env, cur_insn_len_i32(s));
+/* condition codes are modified only in long mode */
+if (LMA(s)) {
+assume_cc_op(s, CC_OP_EFLAGS);
+}
 /* TF handling for the syscall insn is different. The TF bit is  
checked
after the syscall insn completes. This allows #DB to not be
generated after one has entered CPL0 if TF is set in FMASK.  */
@@ -3570,7 +3581,7 @@ static void disas_insn_old(DisasContext *s, CPUState 
*cpu, int b)
 gen_helper_sysret(tcg_env, tcg_constant_i32(dflag - 1));
 /* condition codes are modified only in long mode */
 if (LMA(s)) {
-set_cc_op(s, CC_OP_EFLAGS);
+assume_cc_op(s, CC_OP_EFLAGS);
 }
 /* TF handling for the sysret insn is different. The TF bit is
checked after the sysret insn completes. This allows #DB to be
@@ -4489,7 +4500,7 @@ static void disas_insn_old(DisasContext *s, CPUState 
*cpu, int b)
 g_assert_not_reached();
 #else
 gen_helper_rsm(tcg_env);
-set_cc_op(s, CC_OP_EFLAGS);
+assume_cc_op(s, CC_OP_EFLAGS);
 #endif /* CONFIG_USER_ONLY */
 s->base.is_jmp = DISAS_EOB_ONLY;
 break;
diff --git a/target/i386/tcg/emit.c.inc b/target/i386/tcg/emit.c.inc
index c78e35b1e28..3f2ae0aa7e7 100644
--- a/target/i386/tcg/emit.c.inc
+++ b/target/i386/tcg/emit.c.inc
@@ -1883,7 +1883,7 @@ static void gen_IRET(DisasContext *s, CPUX86State *env, 
X86DecodedInsn *decode)
 gen_helper_iret_protected(tcg_env, tcg_constant_i32(s->dflag - 1),
   eip_next_i32(s));
 }
-set_cc_op(s, CC_OP_EFLAGS);
+assume_cc_op(s, CC_OP_EFLAGS);
 s->base.is_jmp = DISAS_EOB_ONLY;
 }
 
-- 
2.45.1




[PULL 21/24] meson: remove unnecessary dependency

2024-05-25 Thread Paolo Bonzini
The dbus_display1_dep is not really used since all occurrences also
request gio independently.  Just list the generated sources and drop
dbus_display1_dep.

Signed-off-by: Paolo Bonzini 
---
 audio/meson.build   | 4 ++--
 tests/qtest/meson.build | 2 +-
 ui/meson.build  | 5 ++---
 3 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/audio/meson.build b/audio/meson.build
index 608f35e6af7..59f0a431d51 100644
--- a/audio/meson.build
+++ b/audio/meson.build
@@ -30,8 +30,8 @@ endforeach
 
 if dbus_display
 module_ss = ss.source_set()
-module_ss.add(when: [gio, dbus_display1_dep, pixman],
-  if_true: files('dbusaudio.c'))
+module_ss.add(when: [gio, pixman],
+  if_true: [dbus_display1, files('dbusaudio.c')])
 audio_modules += {'dbus': module_ss}
 endif
 
diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build
index 86293051dce..b98fae6a6dd 100644
--- a/tests/qtest/meson.build
+++ b/tests/qtest/meson.build
@@ -354,7 +354,7 @@ if vnc.found()
 endif
 
 if dbus_display
-  qtests += {'dbus-display-test': [dbus_display1_dep, gio]}
+  qtests += {'dbus-display-test': [dbus_display1, gio]}
 endif
 
 qtest_executables = {}
diff --git a/ui/meson.build b/ui/meson.build
index 5d89986b0ee..cfbf29428df 100644
--- a/ui/meson.build
+++ b/ui/meson.build
@@ -91,8 +91,7 @@ if dbus_display
   '--interface-prefix', 'org.qemu.',
   '--c-namespace', 'QemuDBus',
   '--generate-c-code', '@BASENAME@'])
-  dbus_display1_dep = declare_dependency(sources: dbus_display1, dependencies: gio)
-  dbus_ss.add(when: [gio, dbus_display1_dep],
+  dbus_ss.add(when: gio,
   if_true: [files(
 'dbus-chardev.c',
 'dbus-clipboard.c',
@@ -100,7 +99,7 @@ if dbus_display
 'dbus-error.c',
 'dbus-listener.c',
 'dbus.c',
-  ), opengl, gbm, pixman])
+  ), opengl, gbm, pixman, dbus_display1])
   ui_modules += {'dbus' : dbus_ss}
 endif
 
-- 
2.45.1




[PULL 04/24] target/i386: cleanup eob handling of RSM

2024-05-25 Thread Paolo Bonzini
gen_helper_rsm cannot generate an exception, and reloads the flags.
So there's no need to spill cc_op and update cpu_eip, but on the
other hand cc_op must be reset to CC_OP_EFLAGS before returning.

It all works by chance, because by spilling cc_op before the call
to the helper, it becomes non-dirty and gen_eob will not overwrite
the CC_OP_EFLAGS value that is placed there by the helper.  But
let's clean it up.
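
As an illustration only (not part of the patch), the "works by chance"
argument above can be replayed on a toy model of lazy cc_op tracking.
Every name below (toy_ctx, toy_set_cc_op, ...) is hypothetical and the
model is far simpler than the real translator; it only assumes that a
dirty cc_op is spilled to the CPU state at end of block and that the
helper writes CC_OP_EFLAGS there itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not QEMU code: all identifiers here are illustrative. */
enum { TOY_CC_OP_DYNAMIC, TOY_CC_OP_EFLAGS, TOY_CC_OP_ADD };

struct toy_ctx {
    int cc_op;         /* translator's compile-time idea of cc_op */
    bool cc_op_dirty;  /* true if env_cc_op does not reflect cc_op yet */
    int env_cc_op;     /* run-time copy kept in the CPU state */
};

/* Record a new cc_op; it reaches the CPU state only when spilled. */
static void toy_set_cc_op(struct toy_ctx *s, int op)
{
    assert(op != TOY_CC_OP_DYNAMIC);  /* dynamic case not modelled */
    if (s->cc_op != op) {
        s->cc_op = op;
        s->cc_op_dirty = true;
    }
}

/* Spill cc_op to the CPU state if it is dirty. */
static void toy_update_cc_op(struct toy_ctx *s)
{
    if (s->cc_op_dirty) {
        s->env_cc_op = s->cc_op;
        s->cc_op_dirty = false;
    }
}

/* Stand-in for a helper that reloads the flags at run time. */
static void toy_helper_rsm(struct toy_ctx *s)
{
    s->env_cc_op = TOY_CC_OP_EFLAGS;
}

/* End of block: spills whatever is still dirty. */
static void toy_eob(struct toy_ctx *s)
{
    toy_update_cc_op(s);
}

/* Old sequence: spill first, so end of block has nothing to overwrite. */
static int toy_rsm_old(struct toy_ctx *s)
{
    toy_update_cc_op(s);
    toy_helper_rsm(s);
    toy_eob(s);
    return s->env_cc_op;
}

/* New sequence: no spill, but tell the translator what the helper did. */
static int toy_rsm_new(struct toy_ctx *s)
{
    toy_helper_rsm(s);
    toy_set_cc_op(s, TOY_CC_OP_EFLAGS);
    toy_eob(s);
    return s->env_cc_op;
}

/* Neither spill nor reset: the stale dirty value clobbers the helper's. */
static int toy_rsm_broken(struct toy_ctx *s)
{
    toy_helper_rsm(s);
    toy_eob(s);
    return s->env_cc_op;
}
```

Starting from a dirty TOY_CC_OP_ADD, the old and new sequences both end
with TOY_CC_OP_EFLAGS in the CPU state, while the third one ends with
the stale TOY_CC_OP_ADD.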

Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 9782250b20b..849864d1aa2 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -4488,9 +4488,8 @@ static void disas_insn_old(DisasContext *s, CPUState *cpu, int b)
 /* we should not be in SMM mode */
 g_assert_not_reached();
 #else
-gen_update_cc_op(s);
-gen_update_eip_next(s);
 gen_helper_rsm(tcg_env);
+set_cc_op(s, CC_OP_EFLAGS);
 #endif /* CONFIG_USER_ONLY */
 s->base.is_jmp = DISAS_EOB_ONLY;
 break;
-- 
2.45.1




[PULL 10/24] target/i386: avoid calling gen_eob_inhibit_irq before tb_stop

2024-05-25 Thread Paolo Bonzini
sti only has one exit, so it does not need to generate the
end-of-translation code inline.  It can be deferred to tb_stop.

Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 13 -
 target/i386/tcg/emit.c.inc  |  4 +---
 2 files changed, 1 insertion(+), 16 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 0600b43..a7493b5ccfd 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -564,19 +564,6 @@ static void gen_update_eip_cur(DisasContext *s)
 s->pc_save = s->base.pc_next;
 }
 
-static void gen_update_eip_next(DisasContext *s)
-{
-assert(s->pc_save != -1);
-if (tb_cflags(s->base.tb) & CF_PCREL) {
-tcg_gen_addi_tl(cpu_eip, cpu_eip, s->pc - s->pc_save);
-} else if (CODE64(s)) {
-tcg_gen_movi_tl(cpu_eip, s->pc);
-} else {
-tcg_gen_movi_tl(cpu_eip, (uint32_t)(s->pc - s->cs_base));
-}
-s->pc_save = s->pc;
-}
-
 static int cur_insn_len(DisasContext *s)
 {
 return s->pc - s->base.pc_next;
diff --git a/target/i386/tcg/emit.c.inc b/target/i386/tcg/emit.c.inc
index e0ac21abe28..88bcb9699c3 100644
--- a/target/i386/tcg/emit.c.inc
+++ b/target/i386/tcg/emit.c.inc
@@ -3475,9 +3475,7 @@ static void gen_STD(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode)
 static void gen_STI(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode)
 {
 gen_set_eflags(s, IF_MASK);
-/* interruptions are enabled only the first insn after sti */
-gen_update_eip_next(s);
-gen_eob_inhibit_irq(s);
+s->base.is_jmp = DISAS_EOB_INHIBIT_IRQ;
 }
 
 static void gen_VAESKEYGEN(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode)
-- 
2.45.1




[PULL 17/24] target/i386: introduce gen_lea_ss_ofs

2024-05-25 Thread Paolo Bonzini
Generalize gen_stack_A0() to include an initial add and to use an arbitrary
destination.  This is a common pattern and it is not a huge burden to
add the extra arguments to the only caller of gen_stack_A0().
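
For readers unfamiliar with the pattern, here is a rough sketch (not
part of the patch) of the arithmetic that "initial add, then reduce to
the stack size and apply the SS base" amounts to at run time.  The
function name, the enum and the treatment of the 64-bit case are
assumptions made for illustration; the real code emits TCG ops and has
more cases.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the stack address size. */
enum toy_size { TOY_16 = 16, TOY_32 = 32, TOY_64 = 64 };

/*
 * Toy model of "src + offset, truncated to the stack size, plus SS base".
 * Assumption: in the 64-bit case this toy applies no segment base.
 */
static uint64_t toy_lea_ss_ofs(uint64_t src, int64_t offset,
                               enum toy_size stack_size, uint64_t ss_base)
{
    uint64_t ea = src + (uint64_t)offset;   /* the initial add */

    switch (stack_size) {
    case TOY_16:
        ea &= 0xffffu;                      /* wrap within a 64 KiB stack */
        break;
    case TOY_32:
        ea &= 0xffffffffu;
        break;
    case TOY_64:
        return ea;                          /* flat addressing in this toy */
    default:
        assert(!"unexpected stack size");
    }
    return ea + ss_base;
}
```

With a 16-bit stack, toy_lea_ss_ofs(0x0002, -4, TOY_16, 0x10000) wraps
the offset part to 0xfffe before the base is added, giving 0x1fffe.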

Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 51 +++--
 target/i386/tcg/emit.c.inc  |  2 +-
 2 files changed, 22 insertions(+), 31 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 2a20f9bafbb..15993f83024 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -2035,24 +2035,27 @@ static inline void gen_stack_update(DisasContext *s, int addend)
 gen_op_add_reg_im(s, mo_stacksize(s), R_ESP, addend);
 }
 
+static void gen_lea_ss_ofs(DisasContext *s, TCGv dest, TCGv src, target_ulong offset)
+{
+if (offset) {
+tcg_gen_addi_tl(dest, src, offset);
+src = dest;
+}
+gen_lea_v_seg_dest(s, mo_stacksize(s), dest, src, R_SS, -1);
+}
+
 /* Generate a push. It depends on ss32, addseg and dflag.  */
 static void gen_push_v(DisasContext *s, TCGv val)
 {
 MemOp d_ot = mo_pushpop(s, s->dflag);
 MemOp a_ot = mo_stacksize(s);
 int size = 1 << d_ot;
-TCGv new_esp = s->A0;
+TCGv new_esp = tcg_temp_new();
 
-tcg_gen_subi_tl(s->A0, cpu_regs[R_ESP], size);
-
-if (!CODE64(s)) {
-if (ADDSEG(s)) {
-new_esp = tcg_temp_new();
-tcg_gen_mov_tl(new_esp, s->A0);
-}
-gen_lea_v_seg(s, a_ot, s->A0, R_SS, -1);
-}
+tcg_gen_subi_tl(new_esp, cpu_regs[R_ESP], size);
 
+/* Now reduce the value to the address size and apply SS base.  */
+gen_lea_ss_ofs(s, s->A0, new_esp, 0);
 gen_op_st_v(s, d_ot, val, s->A0);
 gen_op_mov_reg_v(s, a_ot, R_ESP, new_esp);
 }
@@ -2062,7 +2065,7 @@ static MemOp gen_pop_T0(DisasContext *s)
 {
 MemOp d_ot = mo_pushpop(s, s->dflag);
 
-gen_lea_v_seg_dest(s, mo_stacksize(s), s->T0, cpu_regs[R_ESP], R_SS, -1);
+gen_lea_ss_ofs(s, s->T0, cpu_regs[R_ESP], 0);
 gen_op_ld_v(s, d_ot, s->T0, s->T0);
 
 return d_ot;
@@ -2073,21 +2076,14 @@ static inline void gen_pop_update(DisasContext *s, MemOp ot)
 gen_stack_update(s, 1 << ot);
 }
 
-static inline void gen_stack_A0(DisasContext *s)
-{
-gen_lea_v_seg(s, mo_stacksize(s), cpu_regs[R_ESP], R_SS, -1);
-}
-
 static void gen_pusha(DisasContext *s)
 {
-MemOp s_ot = mo_stacksize(s);
 MemOp d_ot = s->dflag;
 int size = 1 << d_ot;
 int i;
 
 for (i = 0; i < 8; i++) {
-tcg_gen_addi_tl(s->A0, cpu_regs[R_ESP], (i - 8) * size);
-gen_lea_v_seg(s, s_ot, s->A0, R_SS, -1);
+gen_lea_ss_ofs(s, s->A0, cpu_regs[R_ESP], (i - 8) * size);
 gen_op_st_v(s, d_ot, cpu_regs[7 - i], s->A0);
 }
 
@@ -2096,7 +2092,6 @@ static void gen_pusha(DisasContext *s)
 
 static void gen_popa(DisasContext *s)
 {
-MemOp s_ot = mo_stacksize(s);
 MemOp d_ot = s->dflag;
 int size = 1 << d_ot;
 int i;
@@ -2106,8 +2101,7 @@ static void gen_popa(DisasContext *s)
 if (7 - i == R_ESP) {
 continue;
 }
-tcg_gen_addi_tl(s->A0, cpu_regs[R_ESP], i * size);
-gen_lea_v_seg(s, s_ot, s->A0, R_SS, -1);
+gen_lea_ss_ofs(s, s->A0, cpu_regs[R_ESP], i * size);
 gen_op_ld_v(s, d_ot, s->T0, s->A0);
 gen_op_mov_reg_v(s, d_ot, 7 - i, s->T0);
 }
@@ -2123,7 +2117,7 @@ static void gen_enter(DisasContext *s, int esp_addend, int level)
 
 /* Push BP; compute FrameTemp into T1.  */
 tcg_gen_subi_tl(s->T1, cpu_regs[R_ESP], size);
-gen_lea_v_seg(s, a_ot, s->T1, R_SS, -1);
+gen_lea_ss_ofs(s, s->A0, s->T1, 0);
 gen_op_st_v(s, d_ot, cpu_regs[R_EBP], s->A0);
 
 level &= 31;
@@ -2132,18 +2126,15 @@ static void gen_enter(DisasContext *s, int esp_addend, int level)
 
 /* Copy level-1 pointers from the previous frame.  */
 for (i = 1; i < level; ++i) {
-tcg_gen_subi_tl(s->A0, cpu_regs[R_EBP], size * i);
-gen_lea_v_seg(s, a_ot, s->A0, R_SS, -1);
+gen_lea_ss_ofs(s, s->A0, cpu_regs[R_EBP], -size * i);
 gen_op_ld_v(s, d_ot, s->tmp0, s->A0);
 
-tcg_gen_subi_tl(s->A0, s->T1, size * i);
-gen_lea_v_seg(s, a_ot, s->A0, R_SS, -1);
+gen_lea_ss_ofs(s, s->A0, s->T1, -size * i);
 gen_op_st_v(s, d_ot, s->tmp0, s->A0);
 }
 
 /* Push the current FrameTemp as the last level.  */
-tcg_gen_subi_tl(s->A0, s->T1, size * level);
-gen_lea_v_seg(s, a_ot, s->A0, R_SS, -1);
+gen_lea_ss_ofs(s, s->A0, s->T1, -size * level);
 gen_op_st_v(s, d_ot, s->T1, s->A0);
 }
 
@@ -2160,7 +2151,7 @@ static void gen_leave(DisasContext *s)
 MemOp d_ot = mo_pushpop(s, s->dflag);
 MemOp a_ot = mo_stacksize(s);
 
-gen_lea_v_seg(s, a_ot, cpu_regs[R_EBP], R_SS, -1);
+gen_lea_ss_ofs(s, s->A0, cpu_regs[R_EBP], 0);
 gen_op_ld_v(s, d_ot, s->T0, s->A0);
 
 tcg_g

[PULL 18/24] target/i386: clean up repeated string operations

2024-05-25 Thread Paolo Bonzini
Do not bother generating inline wrappers for gen_repz and gen_repz2;
use s->prefix to separate REPZ from REPNZ in the case of SCAS and
CMPS.
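
A small sketch (not part of the patch) of the run-time behaviour the
two generators describe: a count-only repeat, and a repeat that also
tests ZF with the polarity taken from the prefix.  The struct, the
TOY_PREFIX_* values and the function names are hypothetical; only the
"nz = (prefix & REPNZ) ? 1 : 0" selection mirrors the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical prefix bits, for illustration only. */
#define TOY_PREFIX_REPZ  0x01u
#define TOY_PREFIX_REPNZ 0x02u

struct toy_cpu {
    uint32_t prefix;
    uint64_t ecx;
    const uint8_t *esi, *edi;
    int zf;
};

typedef void (*toy_step_fn)(struct toy_cpu *);

/* One CMPS-like step: compare a byte from each string and set ZF. */
static void toy_cmps(struct toy_cpu *c)
{
    c->zf = (*c->esi++ == *c->edi++);
}

/* Count-only repeat, as for MOVS/STOS/LODS/INS/OUTS. */
static void toy_repz(struct toy_cpu *c, toy_step_fn fn)
{
    while (c->ecx != 0) {
        fn(c);
        c->ecx--;
    }
}

/* Repeat that also checks ZF, as for SCAS/CMPS. */
static void toy_repz_nz(struct toy_cpu *c, toy_step_fn fn)
{
    int nz = (c->prefix & TOY_PREFIX_REPNZ) ? 1 : 0;

    assert(c->prefix & (TOY_PREFIX_REPZ | TOY_PREFIX_REPNZ));
    while (c->ecx != 0) {
        fn(c);
        c->ecx--;
        /* REPZ (nz == 0) stops once ZF is clear; REPNZ once ZF is set. */
        if (c->zf == nz) {
            break;
        }
    }
}
```

The point of the design is that the caller no longer passes nz: the
same toy_repz_nz(c, toy_cmps) call serves both prefixes.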

Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 22 --
 target/i386/tcg/emit.c.inc  | 22 +-
 2 files changed, 13 insertions(+), 31 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 15993f83024..7dd7ebf60d4 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -1327,14 +1327,12 @@ static void gen_repz(DisasContext *s, MemOp ot,
 gen_jmp_rel_csize(s, -cur_insn_len(s), 0);
 }
 
-#define GEN_REPZ(op) \
-static inline void gen_repz_ ## op(DisasContext *s, MemOp ot) \
-{ gen_repz(s, ot, gen_##op); }
-
-static void gen_repz2(DisasContext *s, MemOp ot, int nz,
-  void (*fn)(DisasContext *s, MemOp ot))
+static void gen_repz_nz(DisasContext *s, MemOp ot,
+void (*fn)(DisasContext *s, MemOp ot))
 {
 TCGLabel *l2;
+int nz = (s->prefix & PREFIX_REPNZ) ? 1 : 0;
+
 l2 = gen_jz_ecx_string(s);
 fn(s, ot);
 gen_op_add_reg_im(s, s->aflag, R_ECX, -1);
@@ -1350,18 +1348,6 @@ static void gen_repz2(DisasContext *s, MemOp ot, int nz,
 gen_jmp_rel_csize(s, -cur_insn_len(s), 0);
 }
 
-#define GEN_REPZ2(op) \
-static inline void gen_repz_ ## op(DisasContext *s, MemOp ot, int nz) \
-{ gen_repz2(s, ot, nz, gen_##op); }
-
-GEN_REPZ(movs)
-GEN_REPZ(stos)
-GEN_REPZ(lods)
-GEN_REPZ(ins)
-GEN_REPZ(outs)
-GEN_REPZ2(scas)
-GEN_REPZ2(cmps)
-
 static void gen_helper_fp_arith_ST0_FT0(int op)
 {
 switch (op) {
diff --git a/target/i386/tcg/emit.c.inc b/target/i386/tcg/emit.c.inc
index 0a13be4989a..377d2201c91 100644
--- a/target/i386/tcg/emit.c.inc
+++ b/target/i386/tcg/emit.c.inc
@@ -1508,10 +1508,8 @@ static void gen_CMPccXADD(DisasContext *s, CPUX86State *env, X86DecodedInsn *dec
 static void gen_CMPS(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode)
 {
 MemOp ot = decode->op[2].ot;
-if (s->prefix & PREFIX_REPNZ) {
-gen_repz_cmps(s, ot, 1);
-} else if (s->prefix & PREFIX_REPZ) {
-gen_repz_cmps(s, ot, 0);
+if (s->prefix & (PREFIX_REPZ | PREFIX_REPNZ)) {
+gen_repz_nz(s, ot, gen_cmps);
 } else {
 gen_cmps(s, ot);
 }
@@ -1834,7 +1832,7 @@ static void gen_INS(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode)
 
 translator_io_start(&s->base);
 if (s->prefix & (PREFIX_REPZ | PREFIX_REPNZ)) {
-gen_repz_ins(s, ot);
+gen_repz(s, ot, gen_ins);
 } else {
 gen_ins(s, ot);
 }
@@ -1993,7 +1991,7 @@ static void gen_LODS(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode)
 {
 MemOp ot = decode->op[2].ot;
 if (s->prefix & (PREFIX_REPZ | PREFIX_REPNZ)) {
-gen_repz_lods(s, ot);
+gen_repz(s, ot, gen_lods);
 } else {
 gen_lods(s, ot);
 }
@@ -2155,7 +2153,7 @@ static void gen_MOVS(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode)
 {
 MemOp ot = decode->op[2].ot;
 if (s->prefix & (PREFIX_REPZ | PREFIX_REPNZ)) {
-gen_repz_movs(s, ot);
+gen_repz(s, ot, gen_movs);
 } else {
 gen_movs(s, ot);
 }
@@ -2321,7 +2319,7 @@ static void gen_OUTS(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode)
 
 translator_io_start(&s->base);
 if (s->prefix & (PREFIX_REPZ | PREFIX_REPNZ)) {
-gen_repz_outs(s, ot);
+gen_repz(s, ot, gen_outs);
 } else {
 gen_outs(s, ot);
 }
@@ -3329,10 +3327,8 @@ static void gen_SBB(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode)
 static void gen_SCAS(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode)
 {
 MemOp ot = decode->op[2].ot;
-if (s->prefix & PREFIX_REPNZ) {
-gen_repz_scas(s, ot, 1);
-} else if (s->prefix & PREFIX_REPZ) {
-gen_repz_scas(s, ot, 0);
+if (s->prefix & (PREFIX_REPZ | PREFIX_REPNZ)) {
+gen_repz_nz(s, ot, gen_scas);
 } else {
 gen_scas(s, ot);
 }
@@ -3495,7 +3491,7 @@ static void gen_STOS(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode)
 {
 MemOp ot = decode->op[1].ot;
 if (s->prefix & (PREFIX_REPZ | PREFIX_REPNZ)) {
-gen_repz_stos(s, ot);
+gen_repz(s, ot, gen_stos);
 } else {
 gen_stos(s, ot);
 }
-- 
2.45.1




[PULL 16/24] target/i386: use mo_stacksize more

2024-05-25 Thread Paolo Bonzini
Use mo_stacksize for all stack accesses, including when
a 64-bit code segment is impossible and the code is
therefore checking only for SS32(s).
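
To spell out why the substitution is safe, here is a toy sketch (not
part of the patch).  It assumes, as the gen_enter hunk below suggests,
that mo_stacksize() is equivalent to
"CODE64(s) ? MO_64 : SS32(s) ? MO_32 : MO_16"; the toy_* names and the
boolean parameters are hypothetical stand-ins for the DisasContext
checks.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy MemOp sizes: log2 of the access size in bytes. */
enum toy_memop { TOY_MO_16 = 1, TOY_MO_32 = 2, TOY_MO_64 = 3 };

/* Assumed shape of the helper: 64-bit code first, then SS32. */
static enum toy_memop toy_mo_stacksize(bool code64, bool ss32)
{
    return code64 ? TOY_MO_64 : ss32 ? TOY_MO_32 : TOY_MO_16;
}

/* The open-coded form used where a 64-bit code segment is impossible. */
static enum toy_memop toy_legacy_stacksize(bool ss32)
{
    return ss32 ? TOY_MO_32 : TOY_MO_16;
}

/* Size in bytes of an access of the given toy MemOp. */
static int toy_stack_bytes(enum toy_memop mo)
{
    assert(mo >= TOY_MO_16 && mo <= TOY_MO_64);
    return 1 << mo;
}
```

Whenever code64 is false the two functions agree, so call sites that
can never run in 64-bit mode (PUSHA/POPA, for example) see no change.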

Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 2039ccf283a..2a20f9bafbb 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -2075,12 +2075,12 @@ static inline void gen_pop_update(DisasContext *s, MemOp ot)
 
 static inline void gen_stack_A0(DisasContext *s)
 {
-gen_lea_v_seg(s, SS32(s) ? MO_32 : MO_16, cpu_regs[R_ESP], R_SS, -1);
+gen_lea_v_seg(s, mo_stacksize(s), cpu_regs[R_ESP], R_SS, -1);
 }
 
 static void gen_pusha(DisasContext *s)
 {
-MemOp s_ot = SS32(s) ? MO_32 : MO_16;
+MemOp s_ot = mo_stacksize(s);
 MemOp d_ot = s->dflag;
 int size = 1 << d_ot;
 int i;
@@ -2096,7 +2096,7 @@ static void gen_pusha(DisasContext *s)
 
 static void gen_popa(DisasContext *s)
 {
-MemOp s_ot = SS32(s) ? MO_32 : MO_16;
+MemOp s_ot = mo_stacksize(s);
 MemOp d_ot = s->dflag;
 int size = 1 << d_ot;
 int i;
@@ -2118,7 +2118,7 @@ static void gen_popa(DisasContext *s)
 static void gen_enter(DisasContext *s, int esp_addend, int level)
 {
 MemOp d_ot = mo_pushpop(s, s->dflag);
-MemOp a_ot = CODE64(s) ? MO_64 : SS32(s) ? MO_32 : MO_16;
+MemOp a_ot = mo_stacksize(s);
 int size = 1 << d_ot;
 
 /* Push BP; compute FrameTemp into T1.  */
-- 
2.45.1




[PULL 01/24] configure: move -mcx16 flag out of CPU_CFLAGS

2024-05-25 Thread Paolo Bonzini
From: Artyom Kunakovsky 

The point of CPU_CFLAGS is really just to select the appropriate multilib,
for example for library linking tests, and -mcx16 is not needed for
that purpose.

Furthermore, if -mcx16 is part of QEMU's choice of a basic x86_64
instruction set, it should be applied to cross-compiled x86_64 code too;
it is plausible that tests/tcg would want to cover cmpxchg16b as well,
for example.  In the end this makes just as much sense as a per sub-build
tweak, so move the flag to meson.build and cross_cc_cflags_x86_64.

This leaves out contrib/plugins, which would fail when attempting to use
__sync_val_compare_and_swap_16 (note that it does not do so yet); while minor,
this *is* a disadvantage of this change.  But building contrib/plugins
with a Makefile instead of meson.build is something self-inflicted just
for the sake of showing that it can be done, and if this kind of papercut
started becoming a problem we could make the directory part of the meson
build.  Until then, we can live with the limitation.

Signed-off-by: Artyom Kunakovsky 
Message-ID: <20240523051118.29367-1-artyomkunakov...@gmail.com>
[rewrite commit message, remove from configure. - Paolo]
Signed-off-by: Paolo Bonzini 
---
 configure   | 7 ++-
 meson.build | 7 +++
 2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/configure b/configure
index 38ee2577013..4d01a42ba65 100755
--- a/configure
+++ b/configure
@@ -512,10 +512,7 @@ case "$cpu" in
 cpu="x86_64"
 host_arch=x86_64
 linux_arch=x86
-# ??? Only extremely old AMD cpus do not have cmpxchg16b.
-# If we truly care, we should simply detect this case at
-# runtime and generate the fallback to serial emulation.
-CPU_CFLAGS="-m64 -mcx16"
+CPU_CFLAGS="-m64"
 ;;
 esac
 
@@ -1203,7 +1200,7 @@ fi
 : ${cross_cc_cflags_sparc64="-m64 -mcpu=ultrasparc"}
 : ${cross_cc_sparc="$cross_cc_sparc64"}
 : ${cross_cc_cflags_sparc="-m32 -mcpu=supersparc"}
-: ${cross_cc_cflags_x86_64="-m64"}
+: ${cross_cc_cflags_x86_64="-m64 -mcx16"}
 
 compute_target_variable() {
   eval "$2="
diff --git a/meson.build b/meson.build
index a9de71d4506..7fd82b5f48c 100644
--- a/meson.build
+++ b/meson.build
@@ -336,6 +336,13 @@ if host_arch == 'i386' and not cc.links('''
   qemu_common_flags = ['-march=i486'] + qemu_common_flags
 endif
 
+# ??? Only extremely old AMD cpus do not have cmpxchg16b.
+# If we truly care, we should simply detect this case at
+# runtime and generate the fallback to serial emulation.
+if host_arch == 'x86_64'
+  qemu_common_flags = ['-mcx16'] + qemu_common_flags
+endif
+
 if get_option('prefer_static')
   qemu_ldflags += get_option('b_pie') ? '-static-pie' : '-static'
 endif
-- 
2.45.1




[PULL 14/24] target/i386: split gen_ldst_modrm for load and store

2024-05-25 Thread Paolo Bonzini
The is_store argument of gen_ldst_modrm has only ever been passed
a constant.  Just split the function in two.
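
A toy sketch (not part of the patch) of the shape of the split: both
halves decode ModRM the same way and differ only in the direction of
the move.  The toy_state layout, the single mem_cell standing in for
the addressed memory operand, and the function names are assumptions;
address computation is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct toy_modrm {
    int mod, reg, rm;
    bool is_register;   /* mod == 3 selects a register operand */
};

/* rex_r and rex_b are 0 or 8, to be OR-ed into the 3-bit fields. */
static struct toy_modrm toy_decode_modrm(uint8_t modrm, int rex_r, int rex_b)
{
    struct toy_modrm d;

    assert((rex_r == 0 || rex_r == 8) && (rex_b == 0 || rex_b == 8));
    d.mod = (modrm >> 6) & 3;
    d.reg = ((modrm >> 3) & 7) | rex_r;
    d.rm = (modrm & 7) | rex_b;
    d.is_register = (d.mod == 3);
    return d;
}

struct toy_state {
    uint64_t regs[16];
    uint64_t mem_cell;  /* stands in for the memory operand */
    uint64_t t0;        /* stands in for the T0 temporary */
};

/* Load half: register or memory into t0. */
static void toy_ld_modrm(struct toy_state *st, uint8_t modrm, int rex_b)
{
    struct toy_modrm d = toy_decode_modrm(modrm, 0, rex_b);

    st->t0 = d.is_register ? st->regs[d.rm] : st->mem_cell;
}

/* Store half: t0 into register or memory. */
static void toy_st_modrm(struct toy_state *st, uint8_t modrm, int rex_b)
{
    struct toy_modrm d = toy_decode_modrm(modrm, 0, rex_b);

    if (d.is_register) {
        st->regs[d.rm] = st->t0;
    } else {
        st->mem_cell = st->t0;
    }
}
```

Since every caller already knows which direction it wants, two
functions cost nothing at the call sites and remove a constant
argument.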

Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 52 +
 1 file changed, 29 insertions(+), 23 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 4bb932af16b..afbed87056a 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -1828,27 +1828,33 @@ static void gen_add_A0_ds_seg(DisasContext *s)
 gen_lea_v_seg(s, s->aflag, s->A0, R_DS, s->override);
 }
 
-/* generate modrm memory load or store of 'reg'. */
-static void gen_ldst_modrm(CPUX86State *env, DisasContext *s, int modrm,
-   MemOp ot, int is_store)
+/* generate modrm load of memory or register. */
+static void gen_ld_modrm(CPUX86State *env, DisasContext *s, int modrm, MemOp ot)
 {
 int mod, rm;
 
 mod = (modrm >> 6) & 3;
 rm = (modrm & 7) | REX_B(s);
 if (mod == 3) {
-if (is_store) {
-gen_op_mov_reg_v(s, ot, rm, s->T0);
-} else {
-gen_op_mov_v_reg(s, ot, s->T0, rm);
-}
+gen_op_mov_v_reg(s, ot, s->T0, rm);
 } else {
 gen_lea_modrm(env, s, modrm);
-if (is_store) {
-gen_op_st_v(s, ot, s->T0, s->A0);
-} else {
-gen_op_ld_v(s, ot, s->T0, s->A0);
-}
+gen_op_ld_v(s, ot, s->T0, s->A0);
+}
+}
+
+/* generate modrm store of memory or register. */
+static void gen_st_modrm(CPUX86State *env, DisasContext *s, int modrm, MemOp ot)
+{
+int mod, rm;
+
+mod = (modrm >> 6) & 3;
+rm = (modrm & 7) | REX_B(s);
+if (mod == 3) {
+gen_op_mov_reg_v(s, ot, rm, s->T0);
+} else {
+gen_lea_modrm(env, s, modrm);
+gen_op_st_v(s, ot, s->T0, s->A0);
 }
 }
 
@@ -3438,7 +3444,7 @@ static void disas_insn_old(DisasContext *s, CPUState *cpu, int b)
 ot = dflag;
 modrm = x86_ldub_code(env, s);
 reg = ((modrm >> 3) & 7) | REX_R(s);
-gen_ldst_modrm(env, s, modrm, ot, 0);
+gen_ld_modrm(env, s, modrm, ot);
 gen_extu(ot, s->T0);
 
 /* Note that lzcnt and tzcnt are in different extensions.  */
@@ -3589,14 +3595,14 @@ static void disas_insn_old(DisasContext *s, CPUState *cpu, int b)
 tcg_gen_ld32u_tl(s->T0, tcg_env,
  offsetof(CPUX86State, ldt.selector));
 ot = mod == 3 ? dflag : MO_16;
-gen_ldst_modrm(env, s, modrm, ot, 1);
+gen_st_modrm(env, s, modrm, ot);
 break;
 case 2: /* lldt */
 if (!PE(s) || VM86(s))
 goto illegal_op;
 if (check_cpl0(s)) {
 gen_svm_check_intercept(s, SVM_EXIT_LDTR_WRITE);
-gen_ldst_modrm(env, s, modrm, MO_16, 0);
+gen_ld_modrm(env, s, modrm, MO_16);
 tcg_gen_trunc_tl_i32(s->tmp2_i32, s->T0);
 gen_helper_lldt(tcg_env, s->tmp2_i32);
 }
@@ -3611,14 +3617,14 @@ static void disas_insn_old(DisasContext *s, CPUState *cpu, int b)
 tcg_gen_ld32u_tl(s->T0, tcg_env,
  offsetof(CPUX86State, tr.selector));
 ot = mod == 3 ? dflag : MO_16;
-gen_ldst_modrm(env, s, modrm, ot, 1);
+gen_st_modrm(env, s, modrm, ot);
 break;
 case 3: /* ltr */
 if (!PE(s) || VM86(s))
 goto illegal_op;
 if (check_cpl0(s)) {
 gen_svm_check_intercept(s, SVM_EXIT_TR_WRITE);
-gen_ldst_modrm(env, s, modrm, MO_16, 0);
+gen_ld_modrm(env, s, modrm, MO_16);
 tcg_gen_trunc_tl_i32(s->tmp2_i32, s->T0);
 gen_helper_ltr(tcg_env, s->tmp2_i32);
 }
@@ -3627,7 +3633,7 @@ static void disas_insn_old(DisasContext *s, CPUState *cpu, int b)
 case 5: /* verw */
 if (!PE(s) || VM86(s))
 goto illegal_op;
-gen_ldst_modrm(env, s, modrm, MO_16, 0);
+gen_ld_modrm(env, s, modrm, MO_16);
 gen_update_cc_op(s);
 if (op == 4) {
 gen_helper_verr(tcg_env, s->T0);
@@ -3891,7 +3897,7 @@ static void disas_insn_old(DisasContext *s, CPUState *cpu, int b)
  */
 mod = (modrm >> 6) & 3;
 ot = (mod != 3 ? MO_16 : s->dflag);
-gen_ldst_modrm(env, s, modrm, ot, 1);
+gen_st_modrm(env, s, modrm, ot);
 break;
 case 0xee: /* rdpkru */
 if (s->prefix & (PREFIX_LOCK | PREFIX_DATA
@@ -3918,7 +3924,7 @@ static void disas_insn_old(DisasContext *s, CPUState *cpu, int b)
 break;
 }
 gen_svm_check_intercept(s, SVM_EXIT_WRITE_CR0);
-gen_ldst_modrm(env, s, modrm, MO_16, 0);
+gen_ld_modrm(env, s, modrm, MO_16);
 /*
  * On

[PULL 20/24] meson: remove unnecessary reference to libm

2024-05-25 Thread Paolo Bonzini
libm is linked into all targets via libqemuutil, no need to specify it
explicitly.

Signed-off-by: Paolo Bonzini 
---
 block/meson.build | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/meson.build b/block/meson.build
index e1f03fd773e..8993055c75e 100644
--- a/block/meson.build
+++ b/block/meson.build
@@ -110,7 +110,7 @@ foreach m : [
   [blkio, 'blkio', files('blkio.c')],
   [curl, 'curl', files('curl.c')],
   [glusterfs, 'gluster', files('gluster.c')],
-  [libiscsi, 'iscsi', [files('iscsi.c'), libm]],
+  [libiscsi, 'iscsi', files('iscsi.c')],
   [libnfs, 'nfs', files('nfs.c')],
   [libssh, 'ssh', files('ssh.c')],
   [rbd, 'rbd', files('rbd.c')],
-- 
2.45.1




[PULL 05/24] target/i386: remove unnecessary gen_update_cc_op before gen_eob*

2024-05-25 Thread Paolo Bonzini
This is already handled in gen_eob().  Before adding another DISAS_*
case, remove the double calls.

Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 849864d1aa2..920d854c2b5 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -4775,14 +4775,12 @@ static void i386_tr_tb_stop(DisasContextBase *dcbase, CPUState *cpu)
 gen_jmp_rel_csize(dc, 0, 0);
 break;
 case DISAS_EOB_NEXT:
-gen_update_cc_op(dc);
 gen_update_eip_cur(dc);
 /* fall through */
 case DISAS_EOB_ONLY:
 gen_eob(dc);
 break;
 case DISAS_EOB_INHIBIT_IRQ:
-gen_update_cc_op(dc);
 gen_update_eip_cur(dc);
 gen_eob_inhibit_irq(dc);
 break;
-- 
2.45.1




[PULL 23/24] meson: do not query modules before they are processed

2024-05-25 Thread Paolo Bonzini
Signed-off-by: Paolo Bonzini 
---
 block/meson.build | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/meson.build b/block/meson.build
index 8993055c75e..158dc3b89db 100644
--- a/block/meson.build
+++ b/block/meson.build
@@ -119,7 +119,7 @@ foreach m : [
 module_ss = ss.source_set()
 module_ss.add(when: m[0], if_true: m[2])
 if enable_modules
-  modsrc += module_ss.all_sources()
+  modsrc += m[2]
 endif
 block_modules += {m[1] : module_ss}
   endif
-- 
2.45.1




[PULL 03/24] target/i386: no single-step exception after MOV or POP SS

2024-05-25 Thread Paolo Bonzini
Intel SDM 18.3.1.4: "If an occurrence of the MOV or POP instruction
that loads the SS register executes with EFLAGS.TF = 1, no single-step debug
exception occurs following the MOV or POP instruction."

Cc: qemu-sta...@nongnu.org
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index ebcff8766cf..9782250b20b 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -2273,7 +2273,7 @@ gen_eob_worker(DisasContext *s, bool inhibit, bool recheck_tf, bool jr)
 if (recheck_tf) {
 gen_helper_rechecking_single_step(tcg_env);
 tcg_gen_exit_tb(NULL, 0);
-} else if (s->flags & HF_TF_MASK) {
+} else if ((s->flags & HF_TF_MASK) && !inhibit) {
 gen_helper_single_step(tcg_env);
 } else if (jr &&
/* give irqs a chance to happen */
-- 
2.45.1




[PULL 08/24] target/i386: document and group DISAS_* constants

2024-05-25 Thread Paolo Bonzini
Place DISAS_* constants that update cpu_eip first, and
the "jump" ones last.  Add comments explaining the differences
and usage.

Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 25 ++---
 1 file changed, 22 insertions(+), 3 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 1b0485e01b3..1246118e42b 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -144,9 +144,28 @@ typedef struct DisasContext {
 TCGOp *prev_insn_end;
 } DisasContext;
 
-#define DISAS_EOB_ONLY DISAS_TARGET_0
-#define DISAS_EOB_NEXT DISAS_TARGET_1
-#define DISAS_EOB_INHIBIT_IRQ  DISAS_TARGET_2
+/*
+ * Point EIP to next instruction before ending translation.
+ * For instructions that can change hflags.
+ */
+#define DISAS_EOB_NEXT DISAS_TARGET_0
+
+/*
+ * Point EIP to next instruction and set HF_INHIBIT_IRQ if not
+ * already set.  For instructions that activate interrupt shadow.
+ */
+#define DISAS_EOB_INHIBIT_IRQ  DISAS_TARGET_1
+
+/*
+ * Return to the main loop; EIP might have already been updated
+ * but even in that case do not use lookup_and_goto_ptr().
+ */
+#define DISAS_EOB_ONLY DISAS_TARGET_2
+
+/*
+ * EIP has already been updated.  For jumps that wish to use
+ * lookup_and_goto_ptr()
+ */
 #define DISAS_JUMP DISAS_TARGET_3
 
 /* The environment in which user-only runs is constrained. */
-- 
2.45.1




[PULL 22/24] tcg: include dependencies in static_library()

2024-05-25 Thread Paolo Bonzini
This ensures that for example libffi can be reached even if it is not
in /usr/include.

Signed-off-by: Paolo Bonzini 
---
 tcg/meson.build | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/tcg/meson.build b/tcg/meson.build
index 8251589fd4e..ffbe754d8b3 100644
--- a/tcg/meson.build
+++ b/tcg/meson.build
@@ -32,19 +32,19 @@ tcg_ss = tcg_ss.apply({})
 libtcg_user = static_library('tcg_user',
  tcg_ss.sources() + genh,
  name_suffix: 'fa',
+ dependencies: tcg_ss.dependencies(),
  c_args: '-DCONFIG_USER_ONLY',
  build_by_default: false)
 
-tcg_user = declare_dependency(link_with: libtcg_user,
-  dependencies: tcg_ss.dependencies())
+tcg_user = declare_dependency(link_with: libtcg_user)
 user_ss.add(tcg_user)
 
 libtcg_system = static_library('tcg_system',
 tcg_ss.sources() + genh,
 name_suffix: 'fa',
+dependencies: tcg_ss.dependencies(),
 c_args: '-DCONFIG_SOFTMMU',
 build_by_default: false)
 
-tcg_system = declare_dependency(link_with: libtcg_system,
- dependencies: tcg_ss.dependencies())
+tcg_system = declare_dependency(link_with: libtcg_system)
 system_ss.add(tcg_system)
-- 
2.45.1




[PULL 11/24] target/i386: assert that gen_update_eip_cur and gen_update_eip_next are the same in tb_stop

2024-05-25 Thread Paolo Bonzini
This is an invariant now that there are no calls to gen_eob_inhibit_irq()
outside tb_stop.

Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index a7493b5ccfd..fcb7934efa7 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -4798,6 +4798,7 @@ static void i386_tr_tb_stop(DisasContextBase *dcbase, CPUState *cpu)
 gen_jmp_rel_csize(dc, 0, 0);
 break;
 case DISAS_EOB_NEXT:
+assert(dc->base.pc_next == dc->pc);
 gen_update_eip_cur(dc);
 /* fall through */
 case DISAS_EOB_ONLY:
@@ -4807,6 +4808,7 @@ static void i386_tr_tb_stop(DisasContextBase *dcbase, CPUState *cpu)
 gen_eob_syscall(dc);
 break;
 case DISAS_EOB_INHIBIT_IRQ:
+assert(dc->base.pc_next == dc->pc);
 gen_update_eip_cur(dc);
 gen_eob_inhibit_irq(dc);
 break;
-- 
2.45.1




[PULL 12/24] target/i386: raze the gen_eob* jungle

2024-05-25 Thread Paolo Bonzini
Make gen_eob take the DISAS_* constant as an argument, so that
it is not necessary to have wrappers around it.
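
The resulting decision tree can be sketched as a pure function (an
illustration only, not part of the patch).  The toy_* enums and the
result struct are hypothetical, and the RF reset is left out; the
order of the tests follows the gen_eob() body in the diff below.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy counterparts of the DISAS_* modes passed to gen_eob(). */
enum toy_mode {
    TOY_EOB_NEXT, TOY_EOB_INHIBIT_IRQ, TOY_EOB_ONLY,
    TOY_EOB_RECHECK_TF, TOY_JUMP
};

/* How the block ends. */
enum toy_exit {
    TOY_EXIT_RECHECK_STEP,  /* rechecking single-step helper, then exit */
    TOY_EXIT_SINGLE_STEP,   /* single-step helper */
    TOY_EXIT_LOOKUP_GOTO,   /* chain to the next block by lookup */
    TOY_EXIT_MAIN_LOOP      /* plain return to the main loop */
};

struct toy_eob_result {
    enum toy_exit exit;
    bool inhibit_irq_after; /* interrupt shadow active after the block */
};

static struct toy_eob_result toy_gen_eob(enum toy_mode mode, bool tf,
                                         bool inhibit_irq_set)
{
    struct toy_eob_result r;
    bool inhibit_reset = false;

    assert(mode >= TOY_EOB_NEXT && mode <= TOY_JUMP);
    r.inhibit_irq_after = false;
    if (inhibit_irq_set) {
        inhibit_reset = true;               /* an existing shadow ends here */
    } else if (mode == TOY_EOB_INHIBIT_IRQ) {
        r.inhibit_irq_after = true;         /* a new shadow starts */
    }

    if (mode == TOY_EOB_RECHECK_TF) {
        r.exit = TOY_EXIT_RECHECK_STEP;
    } else if (tf && mode != TOY_EOB_INHIBIT_IRQ) {
        r.exit = TOY_EXIT_SINGLE_STEP;
    } else if (mode == TOY_JUMP && !inhibit_reset) {
        r.exit = TOY_EXIT_LOOKUP_GOTO;
    } else {
        r.exit = TOY_EXIT_MAIN_LOOP;        /* give irqs a chance to happen */
    }
    return r;
}
```

One mode argument replaces the three booleans of gen_eob_worker(), and
combinations the wrappers never produced (inhibit together with jr,
say) can no longer be expressed.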

Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 62 +
 1 file changed, 15 insertions(+), 47 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index fcb7934efa7..46c452032ba 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -260,8 +260,6 @@ STUB_HELPER(write_crN, TCGv_env env, TCGv_i32 reg, TCGv val)
 STUB_HELPER(wrmsr, TCGv_env env)
 #endif
 
-static void gen_eob(DisasContext *s);
-static void gen_jr(DisasContext *s);
 static void gen_jmp_rel(DisasContext *s, MemOp ot, int diff, int tb_num);
 static void gen_jmp_rel_csize(DisasContext *s, int diff, int tb_num);
 static void gen_exception_gpf(DisasContext *s);
@@ -2266,12 +2264,13 @@ static void gen_bnd_jmp(DisasContext *s)
 }
 }
 
-/* Generate an end of block. Trace exception is also generated if needed.
-   If INHIBIT, set HF_INHIBIT_IRQ_MASK if it isn't already set.
-   If RECHECK_TF, emit a rechecking helper for #DB, ignoring the state of
-   S->TF.  This is used by the syscall/sysret insns.  */
+/*
+ * Generate an end of block, including common tasks such as generating
+ * single step traps, resetting the RF flag, and handling the interrupt
+ * shadow.
+ */
 static void
-gen_eob_worker(DisasContext *s, bool inhibit, bool recheck_tf, bool jr)
+gen_eob(DisasContext *s, int mode)
 {
 bool inhibit_reset;
 
@@ -2282,52 +2281,29 @@ gen_eob_worker(DisasContext *s, bool inhibit, bool recheck_tf, bool jr)
 if (s->flags & HF_INHIBIT_IRQ_MASK) {
 gen_reset_hflag(s, HF_INHIBIT_IRQ_MASK);
 inhibit_reset = true;
-} else if (inhibit) {
+} else if (mode == DISAS_EOB_INHIBIT_IRQ) {
 gen_set_hflag(s, HF_INHIBIT_IRQ_MASK);
 }
 
 if (s->base.tb->flags & HF_RF_MASK) {
 gen_reset_eflags(s, RF_MASK);
 }
-if (recheck_tf) {
+if (mode == DISAS_EOB_RECHECK_TF) {
 gen_helper_rechecking_single_step(tcg_env);
 tcg_gen_exit_tb(NULL, 0);
-} else if ((s->flags & HF_TF_MASK) && !inhibit) {
+} else if ((s->flags & HF_TF_MASK) && mode != DISAS_EOB_INHIBIT_IRQ) {
 gen_helper_single_step(tcg_env);
-} else if (jr &&
+} else if (mode == DISAS_JUMP &&
/* give irqs a chance to happen */
!inhibit_reset) {
 tcg_gen_lookup_and_goto_ptr();
 } else {
 tcg_gen_exit_tb(NULL, 0);
 }
+
 s->base.is_jmp = DISAS_NORETURN;
 }
 
-static inline void
-gen_eob_syscall(DisasContext *s)
-{
-gen_eob_worker(s, false, true, false);
-}
-
-/* End of block.  Set HF_INHIBIT_IRQ_MASK if it isn't already set.  */
-static void gen_eob_inhibit_irq(DisasContext *s)
-{
-gen_eob_worker(s, true, false, false);
-}
-
-/* End of block, resetting the inhibit irq flag.  */
-static void gen_eob(DisasContext *s)
-{
-gen_eob_worker(s, false, false, false);
-}
-
-/* Jump to register */
-static void gen_jr(DisasContext *s)
-{
-gen_eob_worker(s, false, false, true);
-}
-
 /* Jump to eip+diff, truncating the result to OT. */
 static void gen_jmp_rel(DisasContext *s, MemOp ot, int diff, int tb_num)
 {
@@ -2379,9 +2355,9 @@ static void gen_jmp_rel(DisasContext *s, MemOp ot, int diff, int tb_num)
 tcg_gen_movi_tl(cpu_eip, new_eip);
 }
 if (s->jmp_opt) {
-gen_jr(s);   /* jump to another page */
+gen_eob(s, DISAS_JUMP);   /* jump to another page */
 } else {
-gen_eob(s);  /* exit to main loop */
+gen_eob(s, DISAS_EOB_ONLY);  /* exit to main loop */
 }
 }
 }
@@ -4798,22 +4774,14 @@ static void i386_tr_tb_stop(DisasContextBase *dcbase, CPUState *cpu)
 gen_jmp_rel_csize(dc, 0, 0);
 break;
 case DISAS_EOB_NEXT:
+case DISAS_EOB_INHIBIT_IRQ:
 assert(dc->base.pc_next == dc->pc);
 gen_update_eip_cur(dc);
 /* fall through */
 case DISAS_EOB_ONLY:
-gen_eob(dc);
-break;
 case DISAS_EOB_RECHECK_TF:
-gen_eob_syscall(dc);
-break;
-case DISAS_EOB_INHIBIT_IRQ:
-assert(dc->base.pc_next == dc->pc);
-gen_update_eip_cur(dc);
-gen_eob_inhibit_irq(dc);
-break;
 case DISAS_JUMP:
-gen_jr(dc);
+gen_eob(dc, dc->base.is_jmp);
 break;
 default:
 g_assert_not_reached();
-- 
2.45.1
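
As a reading aid for the patch above, the exit selection in the unified
gen_eob() can be modelled outside QEMU. This is only a sketch: the enum
names, pick_exit() and its boolean parameters are hypothetical stand-ins
for the DISAS_* modes, the (s->flags & HF_TF_MASK) test and the
inhibit_reset local, and nothing here calls into TCG.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the DISAS_* modes; the values are arbitrary. */
enum eob_mode {
    EOB_ONLY,
    EOB_NEXT,
    EOB_INHIBIT_IRQ,
    EOB_RECHECK_TF,
    EOB_JUMP,
};

/* Which of the four code paths the end-of-block sequence would emit. */
enum exit_kind {
    EXIT_RECHECK_STEP,  /* rechecking_single_step helper, then exit_tb */
    EXIT_SINGLE_STEP,   /* single_step helper */
    EXIT_LOOKUP_PTR,    /* lookup_and_goto_ptr */
    EXIT_MAIN_LOOP,     /* plain exit_tb */
};

/*
 * Mirrors the if/else chain of the patch.  tf_set models HF_TF_MASK being
 * set in the translation flags; inhibit_reset models "the interrupt
 * shadow was just cleared, give IRQs a chance to happen".
 */
static enum exit_kind pick_exit(enum eob_mode mode, bool tf_set,
                                bool inhibit_reset)
{
    if (mode == EOB_RECHECK_TF) {
        return EXIT_RECHECK_STEP;
    } else if (tf_set && mode != EOB_INHIBIT_IRQ) {
        return EXIT_SINGLE_STEP;
    } else if (mode == EOB_JUMP && !inhibit_reset) {
        return EXIT_LOOKUP_PTR;
    } else {
        return EXIT_MAIN_LOOP;
    }
}
```

In this model an indirect jump only takes the lookup path when neither a
single-step trap nor an interrupt-shadow reset is pending; otherwise it
goes through the single-step helper or returns to the main loop.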




[PULL 15/24] target/i386: inline gen_add_A0_ds_seg

2024-05-25 Thread Paolo Bonzini
It is only used in MONITOR, where a direct call of gen_lea_v_seg
is simpler, and in XLAT.  Inline it in the latter.

Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 9 +
 target/i386/tcg/emit.c.inc  | 2 +-
 2 files changed, 2 insertions(+), 9 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index afbed87056a..2039ccf283a 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -1822,12 +1822,6 @@ static void gen_bndck(CPUX86State *env, DisasContext *s, 
int modrm,
 gen_helper_bndck(tcg_env, s->tmp2_i32);
 }
 
-/* used for LEA and MOV AX, mem */
-static void gen_add_A0_ds_seg(DisasContext *s)
-{
-gen_lea_v_seg(s, s->aflag, s->A0, R_DS, s->override);
-}
-
 /* generate modrm load of memory or register. */
 static void gen_ld_modrm(CPUX86State *env, DisasContext *s, int modrm, MemOp 
ot)
 {
@@ -3674,8 +3668,7 @@ static void disas_insn_old(DisasContext *s, CPUState 
*cpu, int b)
 }
 gen_update_cc_op(s);
 gen_update_eip_cur(s);
-tcg_gen_mov_tl(s->A0, cpu_regs[R_EAX]);
-gen_add_A0_ds_seg(s);
+gen_lea_v_seg(s, s->aflag, cpu_regs[R_EAX], R_DS, s->override);
 gen_helper_monitor(tcg_env, s->A0);
 break;
 
diff --git a/target/i386/tcg/emit.c.inc b/target/i386/tcg/emit.c.inc
index 88bcb9699c3..01ad57629e4 100644
--- a/target/i386/tcg/emit.c.inc
+++ b/target/i386/tcg/emit.c.inc
@@ -4043,7 +4043,7 @@ static void gen_XLAT(DisasContext *s, CPUX86State *env, 
X86DecodedInsn *decode)
 {
 /* AL is already zero-extended into s->T0.  */
 tcg_gen_add_tl(s->A0, cpu_regs[R_EBX], s->T0);
-gen_add_A0_ds_seg(s);
+gen_lea_v_seg(s, s->aflag, s->A0, R_DS, s->override);
 gen_op_ld_v(s, MO_8, s->T0, s->A0);
 }
 
-- 
2.45.1




[PULL 19/24] target/i386: remove aflag argument of gen_lea_v_seg

2024-05-25 Thread Paolo Bonzini
It is always s->aflag.

Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 20 ++--
 target/i386/tcg/emit.c.inc  |  6 +++---
 2 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 7dd7ebf60d4..6dedfe94c04 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -680,20 +680,20 @@ static void gen_lea_v_seg_dest(DisasContext *s, MemOp 
aflag, TCGv dest, TCGv a0,
 }
 }
 
-static void gen_lea_v_seg(DisasContext *s, MemOp aflag, TCGv a0,
+static void gen_lea_v_seg(DisasContext *s, TCGv a0,
   int def_seg, int ovr_seg)
 {
-gen_lea_v_seg_dest(s, aflag, s->A0, a0, def_seg, ovr_seg);
+gen_lea_v_seg_dest(s, s->aflag, s->A0, a0, def_seg, ovr_seg);
 }
 
 static inline void gen_string_movl_A0_ESI(DisasContext *s)
 {
-gen_lea_v_seg(s, s->aflag, cpu_regs[R_ESI], R_DS, s->override);
+gen_lea_v_seg(s, cpu_regs[R_ESI], R_DS, s->override);
 }
 
 static inline void gen_string_movl_A0_EDI(DisasContext *s)
 {
-gen_lea_v_seg(s, s->aflag, cpu_regs[R_EDI], R_ES, -1);
+gen_lea_v_seg(s, cpu_regs[R_EDI], R_ES, -1);
 }
 
 static inline TCGv gen_compute_Dshift(DisasContext *s, MemOp ot)
@@ -1784,7 +1784,7 @@ static void gen_lea_modrm(CPUX86State *env, DisasContext 
*s, int modrm)
 {
 AddressParts a = gen_lea_modrm_0(env, s, modrm);
 TCGv ea = gen_lea_modrm_1(s, a, false);
-gen_lea_v_seg(s, s->aflag, ea, a.def_seg, s->override);
+gen_lea_v_seg(s, ea, a.def_seg, s->override);
 }
 
 static void gen_nop_modrm(CPUX86State *env, DisasContext *s, int modrm)
@@ -2523,7 +2523,7 @@ static bool disas_insn_x87(DisasContext *s, CPUState 
*cpu, int b)
 bool update_fdp = true;
 
 tcg_gen_mov_tl(last_addr, ea);
-gen_lea_v_seg(s, s->aflag, ea, a.def_seg, s->override);
+gen_lea_v_seg(s, ea, a.def_seg, s->override);
 
 switch (op) {
 case 0x00 ... 0x07: /* fxxxs */
@@ -3320,7 +3320,7 @@ static void disas_insn_old(DisasContext *s, CPUState 
*cpu, int b)
 tcg_gen_sari_tl(s->tmp0, s->T1, 3 + ot);
 tcg_gen_shli_tl(s->tmp0, s->tmp0, ot);
 tcg_gen_add_tl(s->A0, gen_lea_modrm_1(s, a, false), s->tmp0);
-gen_lea_v_seg(s, s->aflag, s->A0, a.def_seg, s->override);
+gen_lea_v_seg(s, s->A0, a.def_seg, s->override);
 if (!(s->prefix & PREFIX_LOCK)) {
 gen_op_ld_v(s, ot, s->T0, s->A0);
 }
@@ -3645,7 +3645,7 @@ static void disas_insn_old(DisasContext *s, CPUState 
*cpu, int b)
 }
 gen_update_cc_op(s);
 gen_update_eip_cur(s);
-gen_lea_v_seg(s, s->aflag, cpu_regs[R_EAX], R_DS, s->override);
+gen_lea_v_seg(s, cpu_regs[R_EAX], R_DS, s->override);
 gen_helper_monitor(tcg_env, s->A0);
 break;
 
@@ -4051,7 +4051,7 @@ static void disas_insn_old(DisasContext *s, CPUState 
*cpu, int b)
 } else {
 tcg_gen_movi_tl(s->A0, 0);
 }
-gen_lea_v_seg(s, s->aflag, s->A0, a.def_seg, s->override);
+gen_lea_v_seg(s, s->A0, a.def_seg, s->override);
 if (a.index >= 0) {
 tcg_gen_mov_tl(s->T0, cpu_regs[a.index]);
 } else {
@@ -4156,7 +4156,7 @@ static void disas_insn_old(DisasContext *s, CPUState 
*cpu, int b)
 } else {
 tcg_gen_movi_tl(s->A0, 0);
 }
-gen_lea_v_seg(s, s->aflag, s->A0, a.def_seg, s->override);
+gen_lea_v_seg(s, s->A0, a.def_seg, s->override);
 if (a.index >= 0) {
 tcg_gen_mov_tl(s->T0, cpu_regs[a.index]);
 } else {
diff --git a/target/i386/tcg/emit.c.inc b/target/i386/tcg/emit.c.inc
index 377d2201c91..e990141454b 100644
--- a/target/i386/tcg/emit.c.inc
+++ b/target/i386/tcg/emit.c.inc
@@ -76,7 +76,7 @@ static void gen_NM_exception(DisasContext *s)
 static void gen_load_ea(DisasContext *s, AddressParts *mem, bool is_vsib)
 {
 TCGv ea = gen_lea_modrm_1(s, *mem, is_vsib);
-gen_lea_v_seg(s, s->aflag, ea, mem->def_seg, s->override);
+gen_lea_v_seg(s, ea, mem->def_seg, s->override);
 }
 
 static inline int mmx_offset(MemOp ot)
@@ -2044,7 +2044,7 @@ static void gen_MOV(DisasContext *s, CPUX86State *env, 
X86DecodedInsn *decode)
 
 static void gen_MASKMOV(DisasContext *s, CPUX86State *env, X86DecodedInsn 
*decode)
 {
-gen_lea_v_seg(s, s->aflag, cpu_regs[R_EDI], R_DS, s->override);
+gen_lea_v_seg(s, cpu_regs[R_EDI], R_DS, s->override);
 
 if (s->prefix & PREFIX_DATA) {
 gen_helper_maskmov_xmm(tcg_env, OP_PTR1, OP_PTR2, s->A0);
@@ -4039,7 +4039,7 @@ static void gen_XLAT(DisasContext *s, CPUX86State *env, 
X86DecodedInsn *decode)
 {
 /* AL is already zero-extended into s->T0.  */
 tcg_gen_add_tl(s->A0, cpu_regs[R_EBX], s->T0);
-gen_l

[PULL 07/24] target/i386: set CC_OP in helpers if they want CC_OP_EFLAGS

2024-05-25 Thread Paolo Bonzini
Mark cc_op as clean and do not spill it at the end of the translation block.
Technically this is a tiny bit less efficient, but:

* it results in translations that are a tiny bit smaller

* for most of these instructions, it is not unlikely that they are close to
the end of the basic block, in which case cc_op would not be overwritten

* anyway the cost is probably dwarfed by that of computing flags.
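
A toy model of the idea, assuming a much simplified lazy-flags state:
struct toy_env, enum toy_cc_op and toy_helper_compare() are hypothetical
names, and only the CC_C/CC_Z bit values follow the architectural EFLAGS
layout. The point is just that the helper stores both the flags and the
"flags are materialized" marker, so generated code need not spill cc_op
separately.

```c
#include <stdint.h>

/* Hypothetical subset of the lazy condition-code states. */
enum toy_cc_op { TOY_CC_OP_DYNAMIC, TOY_CC_OP_EFLAGS, TOY_CC_OP_ADDL };

#define TOY_CC_C 0x0001u   /* carry flag, EFLAGS bit 0 */
#define TOY_CC_Z 0x0040u   /* zero flag, EFLAGS bit 6 */

struct toy_env {
    uint32_t cc_src;        /* flag bits, valid when cc_op is EFLAGS */
    enum toy_cc_op cc_op;   /* how to interpret cc_src */
};

/*
 * Integer stand-in for the comis-style helpers: compute the flags and, in
 * the same place, record that cc_src now holds a complete EFLAGS image.
 */
static void toy_helper_compare(struct toy_env *env, int a, int b)
{
    uint32_t eflags = 0;

    if (a == b) {
        eflags |= TOY_CC_Z;
    }
    if (a < b) {
        eflags |= TOY_CC_C;
    }
    env->cc_src = eflags;
    env->cc_op = TOY_CC_OP_EFLAGS;
}
```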

Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 target/i386/ops_sse.h|  8 
 target/i386/tcg/fpu_helper.c |  2 ++
 target/i386/tcg/int_helper.c | 13 +
 target/i386/tcg/seg_helper.c | 16 
 target/i386/tcg/translate.c  | 12 ++--
 target/i386/tcg/emit.c.inc   | 22 +++---
 6 files changed, 44 insertions(+), 29 deletions(-)

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index 6a465a35fdb..f0aa1894aa2 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -,6 +,7 @@ void helper_ucomiss(CPUX86State *env, Reg *d, Reg *s)
 s1 = s->ZMM_S(0);
 ret = float32_compare_quiet(s0, s1, &env->sse_status);
 CC_SRC = comis_eflags[ret + 1];
+CC_OP = CC_OP_EFLAGS;
 }
 
 void helper_comiss(CPUX86State *env, Reg *d, Reg *s)
@@ -1122,6 +1123,7 @@ void helper_comiss(CPUX86State *env, Reg *d, Reg *s)
 s1 = s->ZMM_S(0);
 ret = float32_compare(s0, s1, &env->sse_status);
 CC_SRC = comis_eflags[ret + 1];
+CC_OP = CC_OP_EFLAGS;
 }
 
 void helper_ucomisd(CPUX86State *env, Reg *d, Reg *s)
@@ -1133,6 +1135,7 @@ void helper_ucomisd(CPUX86State *env, Reg *d, Reg *s)
 d1 = s->ZMM_D(0);
 ret = float64_compare_quiet(d0, d1, &env->sse_status);
 CC_SRC = comis_eflags[ret + 1];
+CC_OP = CC_OP_EFLAGS;
 }
 
 void helper_comisd(CPUX86State *env, Reg *d, Reg *s)
@@ -1144,6 +1147,7 @@ void helper_comisd(CPUX86State *env, Reg *d, Reg *s)
 d1 = s->ZMM_D(0);
 ret = float64_compare(d0, d1, &env->sse_status);
 CC_SRC = comis_eflags[ret + 1];
+CC_OP = CC_OP_EFLAGS;
 }
 #endif
 
@@ -1610,6 +1614,7 @@ void glue(helper_ptest, SUFFIX)(CPUX86State *env, Reg *d, 
Reg *s)
 cf |= (s->Q(i) & ~d->Q(i));
 }
 CC_SRC = (zf ? 0 : CC_Z) | (cf ? 0 : CC_C);
+CC_OP = CC_OP_EFLAGS;
 }
 
 #define FMOVSLDUP(i) s->L((i) & ~1)
@@ -1966,6 +1971,7 @@ static inline unsigned pcmpxstrx(CPUX86State *env, Reg 
*d, Reg *s,
 validd--;
 
 CC_SRC = (valids < upper ? CC_Z : 0) | (validd < upper ? CC_S : 0);
+CC_OP = CC_OP_EFLAGS;
 
 switch ((ctrl >> 2) & 3) {
 case 0:
@@ -2297,6 +2303,7 @@ void glue(helper_vtestps, SUFFIX)(CPUX86State *env, Reg 
*d, Reg *s)
 cf |= (s->L(i) & ~d->L(i));
 }
 CC_SRC = ((zf >> 31) ? 0 : CC_Z) | ((cf >> 31) ? 0 : CC_C);
+CC_OP = CC_OP_EFLAGS;
 }
 
 void glue(helper_vtestpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
@@ -2309,6 +2316,7 @@ void glue(helper_vtestpd, SUFFIX)(CPUX86State *env, Reg 
*d, Reg *s)
 cf |= (s->Q(i) & ~d->Q(i));
 }
 CC_SRC = ((zf >> 63) ? 0 : CC_Z) | ((cf >> 63) ? 0 : CC_C);
+CC_OP = CC_OP_EFLAGS;
 }
 
 void glue(helper_vpmaskmovd_st, SUFFIX)(CPUX86State *env,
diff --git a/target/i386/tcg/fpu_helper.c b/target/i386/tcg/fpu_helper.c
index ece22a3553f..8df8cae6310 100644
--- a/target/i386/tcg/fpu_helper.c
+++ b/target/i386/tcg/fpu_helper.c
@@ -487,6 +487,7 @@ void helper_fcomi_ST0_FT0(CPUX86State *env)
 ret = floatx80_compare(ST0, FT0, &env->fp_status);
 eflags = cpu_cc_compute_all(env) & ~(CC_Z | CC_P | CC_C);
 CC_SRC = eflags | fcomi_ccval[ret + 1];
+CC_OP = CC_OP_EFLAGS;
 merge_exception_flags(env, old_flags);
 }
 
@@ -499,6 +500,7 @@ void helper_fucomi_ST0_FT0(CPUX86State *env)
 ret = floatx80_compare_quiet(ST0, FT0, &env->fp_status);
 eflags = cpu_cc_compute_all(env) & ~(CC_Z | CC_P | CC_C);
 CC_SRC = eflags | fcomi_ccval[ret + 1];
+CC_OP = CC_OP_EFLAGS;
 merge_exception_flags(env, old_flags);
 }
 
diff --git a/target/i386/tcg/int_helper.c b/target/i386/tcg/int_helper.c
index 4cc59f15203..e1f92405282 100644
--- a/target/i386/tcg/int_helper.c
+++ b/target/i386/tcg/int_helper.c
@@ -187,6 +187,7 @@ void helper_aaa(CPUX86State *env)
 }
 env->regs[R_EAX] = (env->regs[R_EAX] & ~0xffff) | al | (ah << 8);
 CC_SRC = eflags;
+CC_OP = CC_OP_EFLAGS;
 }
 
 void helper_aas(CPUX86State *env)
@@ -211,6 +212,7 @@ void helper_aas(CPUX86State *env)
 }
 env->regs[R_EAX] = (env->regs[R_EAX] & ~0xffff) | al | (ah << 8);
 CC_SRC = eflags;
+CC_OP = CC_OP_EFLAGS;
 }
 
 void helper_daa(CPUX86State *env)
@@ -238,6 +240,7 @@ void helper_daa(CPUX86State *env)
 eflags |= parity_table[al]; /* pf */
 eflags |= (al & 0x80); /* sf */
 CC_SRC = eflags;
+CC_OP = CC_OP_EFLAGS;
 }
 
 void helper_das(CPUX86State *env)
@@ -269,6 +272,7 @@ void helper_das(CPUX86State *env)
 eflags |= parity_table[al]; /* pf */
 eflags |= (al & 0x80); /* sf */
 CC_SRC = eflags;
+CC_OP = CC_OP_EFLAGS;
 }
 
 #ifdef TARGET_X86_64
@@ -449,10 +453,1

[PULL 09/24] target/i386: avoid calling gen_eob_syscall before tb_stop

2024-05-25 Thread Paolo Bonzini
syscall and sysret only have one exit, so they do not need to
generate the end-of-translation code inline.  It can be
deferred to tb_stop.

Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 1246118e42b..0600b43 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -168,6 +168,12 @@ typedef struct DisasContext {
  */
 #define DISAS_JUMP DISAS_TARGET_3
 
+/*
+ * EIP has already been updated.  Use updated value of
+ * EFLAGS.TF to determine singlestep trap (SYSCALL/SYSRET).
+ */
+#define DISAS_EOB_RECHECK_TF   DISAS_TARGET_4
+
 /* The environment in which user-only runs is constrained. */
 #ifdef CONFIG_USER_ONLY
 #define PE(S) true
@@ -3587,7 +3593,7 @@ static void disas_insn_old(DisasContext *s, CPUState 
*cpu, int b)
 /* TF handling for the syscall insn is different. The TF bit is  
checked
after the syscall insn completes. This allows #DB to not be
generated after one has entered CPL0 if TF is set in FMASK.  */
-gen_eob_syscall(s);
+s->base.is_jmp = DISAS_EOB_RECHECK_TF;
 break;
 case 0x107: /* sysret */
 /* For Intel SYSRET is only valid in long mode */
@@ -3606,7 +3612,7 @@ static void disas_insn_old(DisasContext *s, CPUState 
*cpu, int b)
checked after the sysret insn completes. This allows #DB to be
generated "as if" the syscall insn in userspace has just
completed.  */
-gen_eob_syscall(s);
+s->base.is_jmp = DISAS_EOB_RECHECK_TF;
 }
 break;
 case 0x1a2: /* cpuid */
@@ -4810,6 +4816,9 @@ static void i386_tr_tb_stop(DisasContextBase *dcbase, 
CPUState *cpu)
 case DISAS_EOB_ONLY:
 gen_eob(dc);
 break;
+case DISAS_EOB_RECHECK_TF:
+gen_eob_syscall(dc);
+break;
 case DISAS_EOB_INHIBIT_IRQ:
 gen_update_eip_cur(dc);
 gen_eob_inhibit_irq(dc);
-- 
2.45.1
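
The "record now, emit at tb_stop" pattern used by this patch can be
sketched in isolation. Everything below is a hypothetical model (struct
toy_ctx, the TOY_* states and the counters are not QEMU names); it only
shows that the per-instruction code stores a state and the single switch
at the end of the block decides which epilogue to generate.

```c
/* Hypothetical subset of the is_jmp states. */
enum toy_is_jmp { TOY_NEXT, TOY_EOB_ONLY, TOY_EOB_RECHECK_TF };

struct toy_ctx {
    enum toy_is_jmp is_jmp;
    int plain_eobs;     /* how many plain end-of-block sequences were emitted */
    int recheck_eobs;   /* how many TF-rechecking sequences were emitted */
};

/* Translating SYSCALL/SYSRET: no inline epilogue, just remember the mode. */
static void toy_translate_syscall(struct toy_ctx *dc)
{
    dc->is_jmp = TOY_EOB_RECHECK_TF;
}

/* Runs once per translation block, after the last instruction. */
static void toy_tb_stop(struct toy_ctx *dc)
{
    switch (dc->is_jmp) {
    case TOY_EOB_ONLY:
        dc->plain_eobs++;
        break;
    case TOY_EOB_RECHECK_TF:
        dc->recheck_eobs++;
        break;
    case TOY_NEXT:
        break;
    }
}
```

This works only because the instruction has a single exit; an instruction
with several exits would still have to emit each epilogue where the paths
diverge.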




[PULL 02/24] target/i386: disable jmp_opt if EFLAGS.RF is 1

2024-05-25 Thread Paolo Bonzini
If EFLAGS.RF is 1, special processing in gen_eob_worker() is needed and
therefore goto_tb cannot be used.

Suggested-by: Richard Henderson 
Reviewed-by: Richard Henderson 
Cc: qemu-sta...@nongnu.org
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 76be7425800..ebcff8766cf 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -4660,7 +4660,7 @@ static void i386_tr_init_disas_context(DisasContextBase 
*dcbase, CPUState *cpu)
 dc->cpuid_7_1_eax_features = env->features[FEAT_7_1_EAX];
 dc->cpuid_xsave_features = env->features[FEAT_XSAVE];
 dc->jmp_opt = !((cflags & CF_NO_GOTO_TB) ||
-(flags & (HF_TF_MASK | HF_INHIBIT_IRQ_MASK)));
+(flags & (HF_RF_MASK | HF_TF_MASK | HF_INHIBIT_IRQ_MASK)));
 /*
  * If jmp_opt, we want to handle each string instruction individually.
  * For icount also disable repz optimization so that each iteration
-- 
2.45.1
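
For illustration, the predicate changed by this patch can be written as
a standalone function. The TOY_* bit positions are placeholders, not the
real CF_* or HF_* values, and toy_jmp_opt() is a hypothetical name; only
the shape of the expression is taken from the diff.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit assignments, chosen only for this sketch. */
#define TOY_CF_NO_GOTO_TB    (1u << 0)
#define TOY_HF_INHIBIT_IRQ   (1u << 0)
#define TOY_HF_TF            (1u << 1)
#define TOY_HF_RF            (1u << 2)

/*
 * Direct block chaining is allowed only if the caller did not forbid it
 * and no flag requires special end-of-block processing.  RF is the bit
 * added by the patch.
 */
static bool toy_jmp_opt(uint32_t cflags, uint32_t flags)
{
    return !((cflags & TOY_CF_NO_GOTO_TB) ||
             (flags & (TOY_HF_RF | TOY_HF_TF | TOY_HF_INHIBIT_IRQ)));
}
```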




[PULL 00/24] Build system and target/i386/translate.c cleanups for 2025-05-25

2024-05-25 Thread Paolo Bonzini
The following changes since commit 70581940cabcc51b329652becddfbc6a261b1b83:

  Merge tag 'pull-tcg-20240523' of https://gitlab.com/rth7680/qemu into staging 
(2024-05-23 09:47:40 -0700)

are available in the Git repository at:

  https://gitlab.com/bonzini/qemu.git tags/for-upstream

for you to fetch changes up to 70eb5fde05bdd051c087669ffcf2aee39e0c8170:

  migration: remove unnecessary zlib dependency (2024-05-25 13:28:02 +0200)


Build system and target/i386/translate.c cleanups


Artyom Kunakovsky (1):
  configure: move -mcx16 flag out of CPU_CFLAGS

Paolo Bonzini (23):
  target/i386: disable jmp_opt if EFLAGS.RF is 1
  target/i386: no single-step exception after MOV or POP SS
  target/i386: cleanup eob handling of RSM
  target/i386: remove unnecessary gen_update_cc_op before gen_eob*
  target/i386: cpu_load_eflags already sets cc_op
  target/i386: set CC_OP in helpers if they want CC_OP_EFLAGS
  target/i386: document and group DISAS_* constants
  target/i386: avoid calling gen_eob_syscall before tb_stop
  target/i386: avoid calling gen_eob_inhibit_irq before tb_stop
  target/i386: assert that gen_update_eip_cur and gen_update_eip_next are 
the same in tb_stop
  target/i386: raze the gen_eob* jungle
  target/i386: reg in gen_ldst_modrm is always OR_TMP0
  target/i386: split gen_ldst_modrm for load and store
  target/i386: inline gen_add_A0_ds_seg
  target/i386: use mo_stacksize more
  target/i386: introduce gen_lea_ss_ofs
  target/i386: clean up repeated string operations
  target/i386: remove aflag argument of gen_lea_v_seg
  meson: remove unnecessary reference to libm
  meson: remove unnecessary dependency
  tcg: include dependencies in static_library()
  meson: do not query modules before they are processed
  migration: remove unnecessary zlib dependency

 configure|   7 +-
 meson.build  |   9 +-
 target/i386/ops_sse.h|   8 ++
 migration/dirtyrate.c|   1 -
 migration/qemu-file.c|   1 -
 target/i386/tcg/fpu_helper.c |   2 +
 target/i386/tcg/int_helper.c |  13 +-
 target/i386/tcg/seg_helper.c |  16 +--
 target/i386/tcg/translate.c  | 326 +++
 target/i386/tcg/emit.c.inc   |  58 
 audio/meson.build|   4 +-
 block/meson.build|   4 +-
 migration/meson.build|   2 +-
 tcg/meson.build  |   8 +-
 tests/qtest/meson.build  |   2 +-
 ui/meson.build   |   5 +-
 16 files changed, 218 insertions(+), 248 deletions(-)
-- 
2.45.1




Re: [PATCH] target/i386: always go through gen_eob*()

2024-05-25 Thread Paolo Bonzini
On Fri, May 24, 2024 at 6:51 PM Richard Henderson
 wrote:
> >   static void gen_set_hflag(DisasContext *s, uint32_t mask)
> > @@ -2354,7 +2354,7 @@ static void gen_jmp_rel(DisasContext *s, MemOp ot, 
> > int diff, int tb_num)
> >   tcg_gen_movi_tl(cpu_eip, new_eip);
> >   }
> >   tcg_gen_exit_tb(s->base.tb, tb_num);
> > -s->base.is_jmp = DISAS_NORETURN;
> > +s->base.is_jmp = DISAS_EOB_ONLY;
>
> This is wrong because exit_tb exits, and anything you add after is 
> unreachable.
> I think you simply want to remove the exit_tb call as well, but there may be 
> more cleanup
> possible in the wider context; I haven't checked.

Ok, I'll check this one more closely.

> > @@ -3769,7 +3769,7 @@ static void disas_insn_old(DisasContext *s, CPUState 
> > *cpu, int b)
> >   gen_helper_vmrun(tcg_env, tcg_constant_i32(s->aflag - 1),
> >cur_insn_len_i32(s));
> >   tcg_gen_exit_tb(NULL, 0);
> > -s->base.is_jmp = DISAS_NORETURN;
> > +s->base.is_jmp = DISAS_EOB_ONLY;
>
> Calls exit_tb, which is probably bogus here and EOB_ONLY is correct.
> But I'd need to look deeper into what vmrun does.

This is correct, but do_vmexit() needs to clear RF and handle
singlestep, and the helper needs to clear HF_INHIBIT_IRQ_MASK. In this
respect VMRUN/vmexit are not unlike SYSRET/SYSCALL respectively,
except that EFLAGS.TF is never set right after VMRUN. That is, the
guest EFLAGS value has its effect only after the first instruction in
the guest, while the SYSCALL EFLAGS value interrupts before the first
instruction in CPL0.

> > @@ -1642,7 +1642,7 @@ static void gen_HLT(DisasContext *s, CPUX86State 
> > *env, X86DecodedInsn *decode)
> >   gen_update_cc_op(s);
> >   gen_update_eip_cur(s);
> >   gen_helper_hlt(tcg_env, cur_insn_len_i32(s));
> > -s->base.is_jmp = DISAS_NORETURN;
> > +s->base.is_jmp = DISAS_EOB_ONLY;
>
> noreturn.
>
> > @@ -4022,7 +4022,7 @@ static void gen_XCHG(DisasContext *s, CPUX86State 
> > *env, X86DecodedInsn *decode)
> >   gen_update_cc_op(s);
> >   gen_update_eip_cur(s);
> >   gen_helper_pause(tcg_env, cur_insn_len_i32(s));
> > -s->base.is_jmp = DISAS_NORETURN;
> > +s->base.is_jmp = DISAS_EOB_ONLY;
>
> noreturn.

But these should handle HF_INHIBIT_IRQ_MASK/RF/TF and they don't
(except for HLT clearing HF_INHIBIT_IRQ_MASK). So there is a bug but
it's in the helpers.

Paolo