Re: [PATCH v2 17/23] target/i386: Create gen_jmp_rel

2022-09-30 Thread Paolo Bonzini
On Sat, Oct 1, 2022 at 2:53 AM Richard Henderson
 wrote:
> I believe it really should be s->dflag, which makes all users of the function 
> pass dflag
> (the manual consistently talks about "operand size").  At which point this 
> parameter goes
> away and gen_jmp_rel grabs the operand size from DisasContext.
>
> Also, pre-existing bug vs CODE64 here -- operand size is always 64-bits for 
> near jumps.

Yes, sounds good.

Paolo




Re: [PATCH v4 27/54] hw/usb: dev-mtp: Use g_mkdir()

2022-09-30 Thread Bin Meng
Hi Gerd,

On Tue, Sep 27, 2022 at 7:07 PM Bin Meng  wrote:
>
> From: Bin Meng 
>
> Use g_mkdir() to create a directory on all platforms.
>
> Signed-off-by: Bin Meng 
> Acked-by: Gerd Hoffmann 
> ---
>
> (no changes since v2)
>
> Changes in v2:
> - Change to use g_mkdir()
>
>  hw/usb/dev-mtp.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>

Would you pick up this patch in your queue?

Regards,
Bin



Re: [PATCH v4 26/54] fsdev/virtfs-proxy-helper: Use g_mkdir()

2022-09-30 Thread Bin Meng
Hi Christian,

On Tue, Sep 27, 2022 at 7:07 PM Bin Meng  wrote:
>
> From: Bin Meng 
>
> Use g_mkdir() to create a directory on all platforms.
>
> Signed-off-by: Bin Meng 
> Reviewed-by: Christian Schoenebeck 
> ---
>
> (no changes since v2)
>
> Changes in v2:
> - Change to use g_mkdir()
>
>  fsdev/virtfs-proxy-helper.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>

Would you pick up this patch in your queue?

Regards,
Bin



Re: [PATCH v4 25/54] block/vvfat: Unify the mkdir() call

2022-09-30 Thread Bin Meng
Hi Kevin,

On Tue, Sep 27, 2022 at 7:07 PM Bin Meng  wrote:
>
> From: Bin Meng 
>
> There is a difference in the mkdir() call for win32 and non-win32
> platforms, and it is currently handled in the code with #ifdefs.
>
> glib provides a portable g_mkdir() API and we can use it to unify
> the code without #ifdefs.
>
> Signed-off-by: Bin Meng 
> Reviewed-by: Marc-André Lureau 
> ---
>
> (no changes since v2)
>
> Changes in v2:
> - Change to use g_mkdir()
>
>  block/vvfat.c | 9 +++--
>  1 file changed, 3 insertions(+), 6 deletions(-)
>

Would you pick up this patch in your queue?

Regards,
Bin



Re: [PATCH v4 04/54] util/qemu-sockets: Use g_get_tmp_dir() to get the directory for temporary files

2022-09-30 Thread Bin Meng
Hi Daniel,

On Tue, Sep 27, 2022 at 7:06 PM Bin Meng  wrote:
>
> From: Bin Meng 
>
> Replace the existing logic to get the directory for temporary files
> with g_get_tmp_dir(), which works for win32 too.
>
> Signed-off-by: Bin Meng 
> Reviewed-by: Marc-André Lureau 
> ---
>
> (no changes since v1)
>
>  util/qemu-sockets.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>

Would you pick up this patch in your queue?

Regards,
Bin



Re: [PATCH v4 03/54] tcg: Avoid using hardcoded /tmp

2022-09-30 Thread Bin Meng
Hi Alex, Richard,

On Tue, Sep 27, 2022 at 7:06 PM Bin Meng  wrote:
>
> From: Bin Meng 
>
> Use g_get_tmp_dir() to get the directory to use for temporary files.
>
> Signed-off-by: Bin Meng 
> Reviewed-by: Marc-André Lureau 
> Reviewed-by: Alex Bennée 
> ---
>
> (no changes since v2)
>
> Changes in v2:
> - Use g_autofree to declare the variable
>
>  tcg/tcg.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>

Would you pick up this patch in your queue?

Regards,
Bin



Re: [PATCH v4 02/54] semihosting/arm-compat-semi: Avoid using hardcoded /tmp

2022-09-30 Thread Bin Meng
Hi Alex,

On Tue, Sep 27, 2022 at 7:06 PM Bin Meng  wrote:
>
> From: Bin Meng 
>
> Use g_get_tmp_dir() to get the directory to use for temporary files.
>
> Signed-off-by: Bin Meng 
> Reviewed-by: Alex Bennée 
> ---
>
> (no changes since v1)
>

Would you pick up this patch in your queue?

Regards,
Bin



Re: [PATCH v2 23/23] target/i386: Enable TARGET_TB_PCREL

2022-09-30 Thread Richard Henderson

On 9/21/22 06:31, Paolo Bonzini wrote:

On Tue, Sep 6, 2022 at 12:10 PM Richard Henderson
 wrote:

  static void gen_update_eip_cur(DisasContext *s)
  {
  gen_jmp_im(s, s->base.pc_next - s->cs_base);
+s->pc_save = s->base.pc_next;


s->pc_save is not valid after all gen_jmp_im() calls. Is it worth
noting after each call to gen_jmp_im() why this is not a problem?


  }

  static void gen_update_eip_next(DisasContext *s)
  {
  gen_jmp_im(s, s->pc - s->cs_base);
+s->pc_save = s->pc;
+}
+
+static TCGv gen_eip_cur(DisasContext *s)
+{
+if (TARGET_TB_PCREL) {
+gen_update_eip_cur(s);
+return cpu_eip;
+} else {
+return tcg_constant_tl(s->base.pc_next - s->cs_base);
+}


Ok, now I see why you called it gen_eip_cur(), but it's still a bit
disconcerting to see the difference in behavior between the
TARGET_TB_PCREL and !TARGET_TB_PCREL cases, one of them updating
cpu_eip and the other not.

Perhaps gen_jmp_im() and gen_update_eip_cur() could be rewritten to
return the destination instead:

static TCGv gen_jmp_im(DisasContext *s, target_ulong eip)
{
 if (TARGET_TB_PCREL) {
 target_ulong eip_save = s->pc_save - s->cs_base;
 tcg_gen_addi_tl(cpu_eip, cpu_eip, eip - eip_save);
 return cpu_eip;
 } else {
 TCGv dest = tcg_constant_tl(eip);
 tcg_gen_mov_tl(cpu_eip, dest);
 return dest;
 }
}

static TCGv gen_update_eip_cur(DisasContext *s)
{
 TCGv dest = gen_jmp_im(s, s->base.pc_next - s->cs_base);
 s->pc_save = s->base.pc_next;
 return dest;
}


I don't see what I'd do with the return values.  But I see your point about gen_eip_cur 
only updating eip sometimes.  I have changed the name to eip_cur_tl, as suggested, and it 
writes to a temporary, like eip_next_tl.



r~



Question about RISC-V brom register a1 set value

2022-09-30 Thread Eric Chan
Hi, qemu

As far as I know, the boot ROM passes three parameters to the next-stage
bootloader, e.g. OpenSBI: a0 carries the hartid and a2 carries the start
address of fw_dynamic_info. Although a1 is not used directly in OpenSBI,
the value it compares against is determined at compile time rather than
read from the original a1 passed in from the boot ROM.
In qemu/hw/riscv/boot.c, both 32-bit and 64-bit machines load a1 from an
offset of 32 bytes past the boot ROM start address.

For a 64-bit machine, a1 reads the low 32 bits of the magic data member of
fw_dynamic_info, and the value is the same as FW_DYNAMIC_INFO_MAGIC_VALUE
because RISC-V is little-endian.

For a 32-bit machine, each data member of fw_dynamic_info is 4 bytes, so a1
reads the version rather than the magic.

Is it expected that the 32-bit and 64-bit machines pass different parameters?
If it is not expected, my guess is that the original version targeted 64-bit
machines and 32-bit support was added later, missing this detail; I would be
glad to have the opportunity to fix this problem.
If it is expected, why must it be done this way?

Thanks,
Eric Chan

qemu/include/hw/riscv/boot_opensbi.h
#define FW_DYNAMIC_INFO_MAGIC_VALUE 0x4942534f
qemu/hw/riscv/boot.c
void riscv_setup_rom_reset_vec(MachineState *machine, RISCVHartArrayState *
harts,
   hwaddr start_addr,
   hwaddr rom_base, hwaddr rom_size,
   uint64_t kernel_entry,
   uint64_t fdt_load_addr)
{
int i;
uint32_t start_addr_hi32 = 0x00000000;
uint32_t fdt_load_addr_hi32 = 0x00000000;

if (!riscv_is_32bit(harts)) {
start_addr_hi32 = start_addr >> 32;
fdt_load_addr_hi32 = fdt_load_addr >> 32;
}
/* reset vector */
uint32_t reset_vec[10] = {
0x00000297,  /* 1:  auipc  t0, %pcrel_hi(fw_dyn) */
0x02828613,  /* addi   a2, t0, %pcrel_lo(1b) */
0xf1402573,  /* csrr   a0, mhartid  */
0,
0,
0x00028067,  /* jr t0 */
start_addr,  /* start: .dword */
start_addr_hi32,
fdt_load_addr,   /* fdt_laddr: .dword */
fdt_load_addr_hi32,
 /* fw_dyn: */
};
if (riscv_is_32bit(harts)) {
reset_vec[3] = 0x0202a583;   /* lw a1, 32(t0) */
reset_vec[4] = 0x0182a283;   /* lw t0, 24(t0) */
} else {
reset_vec[3] = 0x0202b583;   /* ld a1, 32(t0) */
reset_vec[4] = 0x0182b283;   /* ld t0, 24(t0) */
}

/* copy in the reset vector in little_endian byte order */
for (i = 0; i < ARRAY_SIZE(reset_vec); i++) {
reset_vec[i] = cpu_to_le32(reset_vec[i]);
}
rom_add_blob_fixed_as("mrom.reset", reset_vec, sizeof(reset_vec),
  rom_base, &address_space_memory);
riscv_rom_copy_firmware_info(machine, rom_base, rom_size, sizeof
(reset_vec),
 kernel_entry);
}

opensbi/firmware/fw_dynamic.S
fw_boot_hart:
/* Sanity checks */
li  a1, FW_DYNAMIC_INFO_MAGIC_VALUE
REG_L   a0, FW_DYNAMIC_INFO_MAGIC_OFFSET(a2)
bne a0, a1, _bad_dynamic_info
li  a1, FW_DYNAMIC_INFO_VERSION_MAX
REG_L   a0, FW_DYNAMIC_INFO_VERSION_OFFSET(a2)
bgt a0, a1, _bad_dynamic_info

/* Read boot HART id */
li  a1, FW_DYNAMIC_INFO_VERSION_2
blt a0, a1, 2f
REG_L   a0, FW_DYNAMIC_INFO_BOOT_HART_OFFSET(a2)
ret
2:  li  a0, -1
ret


Re: [PATCH v2 19/23] target/i386: Use gen_jmp_rel for gen_jcc

2022-09-30 Thread Richard Henderson

On 9/21/22 06:09, Paolo Bonzini wrote:

On Tue, Sep 6, 2022 at 12:09 PM Richard Henderson
 wrote:

-static inline void gen_jcc(DisasContext *s, int b,
-   target_ulong val, target_ulong next_eip)
+static void gen_jcc(DisasContext *s, MemOp ot, int b, int diff)
  {
-TCGLabel *l1, *l2;
+TCGLabel *l1 = gen_new_label();

-if (s->jmp_opt) {
-l1 = gen_new_label();
-gen_jcc1(s, b, l1);
-
-gen_goto_tb(s, 0, next_eip);
-
-gen_set_label(l1);
-gen_goto_tb(s, 1, val);
-} else {
-l1 = gen_new_label();
-l2 = gen_new_label();
-gen_jcc1(s, b, l1);
-
-gen_jmp_im(s, next_eip);
-tcg_gen_br(l2);
-
-gen_set_label(l1);
-gen_jmp_im(s, val);
-gen_set_label(l2);
-gen_eob(s);
-}
+gen_jcc1(s, b, l1);
+gen_jmp_rel(s, ot, 0, 1);
+gen_set_label(l1);
+gen_jmp_rel(s, ot, diff, 0);


Might be worth a comment that jumps with 16-bit operand size truncate
EIP even if the jump is not taken.


Hmm. But is that correct? That's not reflected by the pseudocode for Jcc.


r~



Re: [PATCH v2 17/23] target/i386: Create gen_jmp_rel

2022-09-30 Thread Richard Henderson

On 9/21/22 06:06, Paolo Bonzini wrote:

On Tue, Sep 6, 2022 at 12:09 PM Richard Henderson
 wrote:


Create a common helper for pc-relative branches.
The jmp jb insn was missing a mask for CODE32.

Signed-off-by: Richard Henderson 


(Oops, my remark on the previous patch should still have pointed to gen_jmp_tb.)

In gen_jz_ecx_string, in the translation for LOOPNZ/LOOPZ/LOOP/JECXZ
and in i386_tr_tb_stop there is:


-gen_jmp_tb(s, s->pc - s->cs_base, 1);
+gen_jmp_rel(s, MO_32, 0, 1);


What happens if the instruction's last byte is at 0xffff? Wraparound
in the middle of an instruction is generally undefined, but I think it
should work if the instruction does not cross the 64K/4G limit (and on
real hardware, which obeys segment limits unlike TCG, said limit must
be 64K/4G of course).

In other words, why MO_32 and not "CODE32(s) ? MO_32 : MO_16"?


I believe it really should be s->dflag, which makes all users of the function pass dflag 
(the manual consistently talks about "operand size").  At which point this parameter goes 
away and gen_jmp_rel grabs the operand size from DisasContext.


Also, pre-existing bug vs CODE64 here -- operand size is always 64-bits for 
near jumps.


r~



Re: [PULL v2 00/15] x86 + misc changes for 2022-09-29

2022-09-30 Thread Stefan Hajnoczi
This pull request doesn't build:

../meson.build:545:95: ERROR: Expecting endif got rparen.
gdbus_codegen_error = '@0@ uses gdbus-codegen, which does not support control flow integrity')

https://gitlab.com/qemu-project/qemu/-/jobs/3112498668



[PATCH v5 7/9] target/arm: Introduce gen_pc_plus_diff for aarch64

2022-09-30 Thread Richard Henderson
In preparation for TARGET_TB_PCREL, reduce reliance on absolute values.

Signed-off-by: Richard Henderson 
---
 target/arm/translate-a64.c | 41 +++---
 1 file changed, 29 insertions(+), 12 deletions(-)

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 005fd767fb..28a417fb2b 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -148,9 +148,14 @@ static void reset_btype(DisasContext *s)
 }
 }
 
+static void gen_pc_plus_diff(DisasContext *s, TCGv_i64 dest, target_long diff)
+{
+tcg_gen_movi_i64(dest, s->pc_curr + diff);
+}
+
 void gen_a64_update_pc(DisasContext *s, target_long diff)
 {
-tcg_gen_movi_i64(cpu_pc, s->pc_curr + diff);
+gen_pc_plus_diff(s, cpu_pc, diff);
 }
 
 /*
@@ -1368,7 +1373,7 @@ static void disas_uncond_b_imm(DisasContext *s, uint32_t 
insn)
 
 if (insn & (1U << 31)) {
 /* BL Branch with link */
-tcg_gen_movi_i64(cpu_reg(s, 30), s->base.pc_next);
+gen_pc_plus_diff(s, cpu_reg(s, 30), curr_insn_len(s));
 }
 
 /* B Branch / BL Branch with link */
@@ -2309,11 +2314,17 @@ static void disas_uncond_b_reg(DisasContext *s, 
uint32_t insn)
 default:
 goto do_unallocated;
 }
-gen_a64_set_pc(s, dst);
 /* BLR also needs to load return address */
 if (opc == 1) {
-tcg_gen_movi_i64(cpu_reg(s, 30), s->base.pc_next);
+TCGv_i64 lr = cpu_reg(s, 30);
+if (dst == lr) {
+TCGv_i64 tmp = new_tmp_a64(s);
+tcg_gen_mov_i64(tmp, dst);
+dst = tmp;
+}
+gen_pc_plus_diff(s, lr, curr_insn_len(s));
 }
+gen_a64_set_pc(s, dst);
 break;
 
 case 8: /* BRAA */
@@ -2336,11 +2347,17 @@ static void disas_uncond_b_reg(DisasContext *s, 
uint32_t insn)
 } else {
 dst = cpu_reg(s, rn);
 }
-gen_a64_set_pc(s, dst);
 /* BLRAA also needs to load return address */
 if (opc == 9) {
-tcg_gen_movi_i64(cpu_reg(s, 30), s->base.pc_next);
+TCGv_i64 lr = cpu_reg(s, 30);
+if (dst == lr) {
+TCGv_i64 tmp = new_tmp_a64(s);
+tcg_gen_mov_i64(tmp, dst);
+dst = tmp;
+}
+gen_pc_plus_diff(s, lr, curr_insn_len(s));
 }
+gen_a64_set_pc(s, dst);
 break;
 
 case 4: /* ERET */
@@ -2908,7 +2925,8 @@ static void disas_ld_lit(DisasContext *s, uint32_t insn)
 
 tcg_rt = cpu_reg(s, rt);
 
-clean_addr = tcg_constant_i64(s->pc_curr + imm);
+clean_addr = new_tmp_a64(s);
+gen_pc_plus_diff(s, clean_addr, imm);
 if (is_vector) {
 do_fp_ld(s, rt, clean_addr, size);
 } else {
@@ -4252,23 +4270,22 @@ static void disas_ldst(DisasContext *s, uint32_t insn)
 static void disas_pc_rel_adr(DisasContext *s, uint32_t insn)
 {
 unsigned int page, rd;
-uint64_t base;
-uint64_t offset;
+int64_t offset;
 
 page = extract32(insn, 31, 1);
 /* SignExtend(immhi:immlo) -> offset */
 offset = sextract64(insn, 5, 19);
 offset = offset << 2 | extract32(insn, 29, 2);
 rd = extract32(insn, 0, 5);
-base = s->pc_curr;
 
 if (page) {
 /* ADRP (page based) */
-base &= ~0xfff;
 offset <<= 12;
+/* The page offset is ok for TARGET_TB_PCREL. */
+offset -= s->pc_curr & 0xfff;
 }
 
-tcg_gen_movi_i64(cpu_reg(s, rd), base + offset);
+gen_pc_plus_diff(s, cpu_reg(s, rd), offset);
 }
 
 /*
-- 
2.34.1




[PATCH v5 9/9] target/arm: Enable TARGET_TB_PCREL

2022-09-30 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/cpu-param.h |  1 +
 target/arm/translate.h | 19 
 target/arm/cpu.c   | 23 +++---
 target/arm/translate-a64.c | 37 ++-
 target/arm/translate.c | 62 ++
 5 files changed, 112 insertions(+), 30 deletions(-)

diff --git a/target/arm/cpu-param.h b/target/arm/cpu-param.h
index 68ffb12427..29c5fc4241 100644
--- a/target/arm/cpu-param.h
+++ b/target/arm/cpu-param.h
@@ -30,6 +30,7 @@
  */
 # define TARGET_PAGE_BITS_VARY
 # define TARGET_PAGE_BITS_MIN  10
+# define TARGET_TB_PCREL 1
 #endif
 
 #define NB_MMU_MODES 15
diff --git a/target/arm/translate.h b/target/arm/translate.h
index 4aa239e23c..41d14cc067 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -12,6 +12,25 @@ typedef struct DisasContext {
 
 /* The address of the current instruction being translated. */
 target_ulong pc_curr;
+/*
+ * For TARGET_TB_PCREL, the full value of cpu_pc is not known
+ * (although the page offset is known).  For convenience, the
+ * translation loop uses the full virtual address that triggered
+ * the translation, from base.pc_start through pc_curr.
+ * For efficiency, we do not update cpu_pc for every instruction.
+ * Instead, pc_save has the value of pc_curr at the time of the
+ * last update to cpu_pc, which allows us to compute the addend
+ * needed to bring cpu_pc current: pc_curr - pc_save.
+ * If cpu_pc now contains the destination of an indirect branch,
+ * pc_save contains -1 to indicate that relative updates are no
+ * longer possible.
+ */
+target_ulong pc_save;
+/*
+ * Similarly, pc_cond_save contains the value of pc_save at the
+ * beginning of an AArch32 conditional instruction.
+ */
+target_ulong pc_cond_save;
 target_ulong page_start;
 uint32_t insn;
 /* Nonzero if this instruction has been conditionally skipped.  */
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 94ca6f163f..0bc5e9b125 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -76,17 +76,18 @@ static vaddr arm_cpu_get_pc(CPUState *cs)
 void arm_cpu_synchronize_from_tb(CPUState *cs,
  const TranslationBlock *tb)
 {
-ARMCPU *cpu = ARM_CPU(cs);
-CPUARMState *env = &cpu->env;
-
-/*
- * It's OK to look at env for the current mode here, because it's
- * never possible for an AArch64 TB to chain to an AArch32 TB.
- */
-if (is_a64(env)) {
-env->pc = tb_pc(tb);
-} else {
-env->regs[15] = tb_pc(tb);
+/* The program counter is always up to date with TARGET_TB_PCREL. */
+if (!TARGET_TB_PCREL) {
+CPUARMState *env = cs->env_ptr;
+/*
+ * It's OK to look at env for the current mode here, because it's
+ * never possible for an AArch64 TB to chain to an AArch32 TB.
+ */
+if (is_a64(env)) {
+env->pc = tb_pc(tb);
+} else {
+env->regs[15] = tb_pc(tb);
+}
 }
 }
 #endif /* CONFIG_TCG */
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 28a417fb2b..57cfc9f1a9 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -150,12 +150,18 @@ static void reset_btype(DisasContext *s)
 
 static void gen_pc_plus_diff(DisasContext *s, TCGv_i64 dest, target_long diff)
 {
-tcg_gen_movi_i64(dest, s->pc_curr + diff);
+assert(s->pc_save != -1);
+if (TARGET_TB_PCREL) {
+tcg_gen_addi_i64(dest, cpu_pc, (s->pc_curr - s->pc_save) + diff);
+} else {
+tcg_gen_movi_i64(dest, s->pc_curr + diff);
+}
 }
 
 void gen_a64_update_pc(DisasContext *s, target_long diff)
 {
 gen_pc_plus_diff(s, cpu_pc, diff);
+s->pc_save = s->pc_curr + diff;
 }
 
 /*
@@ -209,6 +215,7 @@ static void gen_a64_set_pc(DisasContext *s, TCGv_i64 src)
  * then loading an address into the PC will clear out any tag.
  */
 gen_top_byte_ignore(s, cpu_pc, src, s->tbii);
+s->pc_save = -1;
 }
 
 /*
@@ -347,16 +354,22 @@ static void gen_exception_internal(int excp)
 
 static void gen_exception_internal_insn(DisasContext *s, int excp)
 {
+target_ulong pc_save = s->pc_save;
+
 gen_a64_update_pc(s, 0);
 gen_exception_internal(excp);
 s->base.is_jmp = DISAS_NORETURN;
+s->pc_save = pc_save;
 }
 
 static void gen_exception_bkpt_insn(DisasContext *s, uint32_t syndrome)
 {
+target_ulong pc_save = s->pc_save;
+
 gen_a64_update_pc(s, 0);
 gen_helper_exception_bkpt_insn(cpu_env, tcg_constant_i32(syndrome));
 s->base.is_jmp = DISAS_NORETURN;
+s->pc_save = pc_save;
 }
 
 static void gen_step_complete_exception(DisasContext *s)
@@ -385,11 +398,16 @@ static inline bool use_goto_tb(DisasContext *s, uint64_t 
dest)
 
 static void gen_goto_tb(DisasContext *s, int n, int64_t diff)
 {
-uint64_t dest = s->pc_curr + diff;
+target_ulong pc_save = s->pc_save;
 
-if (us

[PATCH v5 6/9] target/arm: Change gen_jmp* to work on displacements

2022-09-30 Thread Richard Henderson
In preparation for TARGET_TB_PCREL, reduce reliance on absolute values.

Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 37 +
 1 file changed, 21 insertions(+), 16 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index e0b1d415a2..fd35db8c8c 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -270,6 +270,12 @@ static uint32_t read_pc(DisasContext *s)
 return s->pc_curr + (s->thumb ? 4 : 8);
 }
 
+/* The pc_curr difference for an architectural jump. */
+static target_long jmp_diff(DisasContext *s, target_long diff)
+{
+return diff + (s->thumb ? 4 : 8);
+}
+
 /* Set a variable to the value of a CPU register.  */
 void load_reg_var(DisasContext *s, TCGv_i32 var, int reg)
 {
@@ -2596,7 +2602,7 @@ static void gen_goto_ptr(void)
  * cpu_loop_exec. Any live exit_requests will be processed as we
  * enter the next TB.
  */
-static void gen_goto_tb(DisasContext *s, int n, int diff)
+static void gen_goto_tb(DisasContext *s, int n, target_long diff)
 {
 target_ulong dest = s->pc_curr + diff;
 
@@ -2612,10 +2618,8 @@ static void gen_goto_tb(DisasContext *s, int n, int diff)
 }
 
 /* Jump, specifying which TB number to use if we gen_goto_tb() */
-static inline void gen_jmp_tb(DisasContext *s, uint32_t dest, int tbno)
+static void gen_jmp_tb(DisasContext *s, target_long diff, int tbno)
 {
-int diff = dest - s->pc_curr;
-
 if (unlikely(s->ss_active)) {
 /* An indirect jump so that we still trigger the debug exception.  */
 gen_update_pc(s, diff);
@@ -2657,9 +2661,9 @@ static inline void gen_jmp_tb(DisasContext *s, uint32_t 
dest, int tbno)
 }
 }
 
-static inline void gen_jmp(DisasContext *s, uint32_t dest)
+static inline void gen_jmp(DisasContext *s, target_long diff)
 {
-gen_jmp_tb(s, dest, 0);
+gen_jmp_tb(s, diff, 0);
 }
 
 static inline void gen_mulxy(TCGv_i32 t0, TCGv_i32 t1, int x, int y)
@@ -8326,7 +8330,7 @@ static bool trans_CLRM(DisasContext *s, arg_CLRM *a)
 
 static bool trans_B(DisasContext *s, arg_i *a)
 {
-gen_jmp(s, read_pc(s) + a->imm);
+gen_jmp(s, jmp_diff(s, a->imm));
 return true;
 }
 
@@ -8341,14 +8345,14 @@ static bool trans_B_cond_thumb(DisasContext *s, arg_ci 
*a)
 return true;
 }
 arm_skip_unless(s, a->cond);
-gen_jmp(s, read_pc(s) + a->imm);
+gen_jmp(s, jmp_diff(s, a->imm));
 return true;
 }
 
 static bool trans_BL(DisasContext *s, arg_i *a)
 {
 tcg_gen_movi_i32(cpu_R[14], s->base.pc_next | s->thumb);
-gen_jmp(s, read_pc(s) + a->imm);
+gen_jmp(s, jmp_diff(s, a->imm));
 return true;
 }
 
@@ -8368,7 +8372,8 @@ static bool trans_BLX_i(DisasContext *s, arg_BLX_i *a)
 }
 tcg_gen_movi_i32(cpu_R[14], s->base.pc_next | s->thumb);
 store_cpu_field_constant(!s->thumb, thumb);
-gen_jmp(s, (read_pc(s) & ~3) + a->imm);
+/* This difference computes a page offset so ok for TARGET_TB_PCREL. */
+gen_jmp(s, (read_pc(s) & ~3) - s->pc_curr + a->imm);
 return true;
 }
 
@@ -8529,10 +8534,10 @@ static bool trans_WLS(DisasContext *s, arg_WLS *a)
  * when we take this upcoming exit from this TB, so gen_jmp_tb() is OK.
  */
 }
-gen_jmp_tb(s, s->base.pc_next, 1);
+gen_jmp_tb(s, curr_insn_len(s), 1);
 
 gen_set_label(nextlabel);
-gen_jmp(s, read_pc(s) + a->imm);
+gen_jmp(s, jmp_diff(s, a->imm));
 return true;
 }
 
@@ -8612,7 +8617,7 @@ static bool trans_LE(DisasContext *s, arg_LE *a)
 
 if (a->f) {
 /* Loop-forever: just jump back to the loop start */
-gen_jmp(s, read_pc(s) - a->imm);
+gen_jmp(s, jmp_diff(s, -a->imm));
 return true;
 }
 
@@ -8643,7 +8648,7 @@ static bool trans_LE(DisasContext *s, arg_LE *a)
 tcg_temp_free_i32(decr);
 }
 /* Jump back to the loop start */
-gen_jmp(s, read_pc(s) - a->imm);
+gen_jmp(s, jmp_diff(s, -a->imm));
 
 gen_set_label(loopend);
 if (a->tp) {
@@ -8651,7 +8656,7 @@ static bool trans_LE(DisasContext *s, arg_LE *a)
 store_cpu_field(tcg_constant_i32(4), v7m.ltpsize);
 }
 /* End TB, continuing to following insn */
-gen_jmp_tb(s, s->base.pc_next, 1);
+gen_jmp_tb(s, curr_insn_len(s), 1);
 return true;
 }
 
@@ -8750,7 +8755,7 @@ static bool trans_CBZ(DisasContext *s, arg_CBZ *a)
 tcg_gen_brcondi_i32(a->nz ? TCG_COND_EQ : TCG_COND_NE,
 tmp, 0, s->condlabel);
 tcg_temp_free_i32(tmp);
-gen_jmp(s, read_pc(s) + a->imm);
+gen_jmp(s, jmp_diff(s, a->imm));
 return true;
 }
 
-- 
2.34.1




[PATCH v5 4/9] target/arm: Change gen_exception_insn* to work on displacements

2022-09-30 Thread Richard Henderson
In preparation for TARGET_TB_PCREL, reduce reliance on absolute values.

Signed-off-by: Richard Henderson 
---
 target/arm/translate.h|  5 +++--
 target/arm/translate-a64.c| 28 ++-
 target/arm/translate-m-nocp.c |  6 ++---
 target/arm/translate-mve.c|  2 +-
 target/arm/translate-vfp.c|  6 ++---
 target/arm/translate.c| 42 +--
 6 files changed, 43 insertions(+), 46 deletions(-)

diff --git a/target/arm/translate.h b/target/arm/translate.h
index d651044855..4aa239e23c 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -281,9 +281,10 @@ void arm_jump_cc(DisasCompare *cmp, TCGLabel *label);
 void arm_gen_test_cc(int cc, TCGLabel *label);
 MemOp pow2_align(unsigned i);
 void unallocated_encoding(DisasContext *s);
-void gen_exception_insn_el(DisasContext *s, uint64_t pc, int excp,
+void gen_exception_insn_el(DisasContext *s, target_long pc_diff, int excp,
uint32_t syn, uint32_t target_el);
-void gen_exception_insn(DisasContext *s, uint64_t pc, int excp, uint32_t syn);
+void gen_exception_insn(DisasContext *s, target_long pc_diff,
+int excp, uint32_t syn);
 
 /* Return state of Alternate Half-precision flag, caller frees result */
 static inline TCGv_i32 get_ahp_flag(void)
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 914c789187..2621b3b36a 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -1163,7 +1163,7 @@ static bool fp_access_check_only(DisasContext *s)
 assert(!s->fp_access_checked);
 s->fp_access_checked = true;
 
-gen_exception_insn_el(s, s->pc_curr, EXCP_UDEF,
+gen_exception_insn_el(s, 0, EXCP_UDEF,
   syn_fp_access_trap(1, 0xe, false, 0),
   s->fp_excp_el);
 return false;
@@ -1178,7 +1178,7 @@ static bool fp_access_check(DisasContext *s)
 return false;
 }
 if (s->sme_trap_nonstreaming && s->is_nonstreaming) {
-gen_exception_insn(s, s->pc_curr, EXCP_UDEF,
+gen_exception_insn(s, 0, EXCP_UDEF,
syn_smetrap(SME_ET_Streaming, false));
 return false;
 }
@@ -1198,7 +1198,7 @@ bool sve_access_check(DisasContext *s)
 goto fail_exit;
 }
 } else if (s->sve_excp_el) {
-gen_exception_insn_el(s, s->pc_curr, EXCP_UDEF,
+gen_exception_insn_el(s, 0, EXCP_UDEF,
   syn_sve_access_trap(), s->sve_excp_el);
 goto fail_exit;
 }
@@ -1220,7 +1220,7 @@ bool sve_access_check(DisasContext *s)
 static bool sme_access_check(DisasContext *s)
 {
 if (s->sme_excp_el) {
-gen_exception_insn_el(s, s->pc_curr, EXCP_UDEF,
+gen_exception_insn_el(s, 0, EXCP_UDEF,
   syn_smetrap(SME_ET_AccessTrap, false),
   s->sme_excp_el);
 return false;
@@ -1250,12 +1250,12 @@ bool sme_enabled_check_with_svcr(DisasContext *s, 
unsigned req)
 return false;
 }
 if (FIELD_EX64(req, SVCR, SM) && !s->pstate_sm) {
-gen_exception_insn(s, s->pc_curr, EXCP_UDEF,
+gen_exception_insn(s, 0, EXCP_UDEF,
syn_smetrap(SME_ET_NotStreaming, false));
 return false;
 }
 if (FIELD_EX64(req, SVCR, ZA) && !s->pstate_za) {
-gen_exception_insn(s, s->pc_curr, EXCP_UDEF,
+gen_exception_insn(s, 0, EXCP_UDEF,
syn_smetrap(SME_ET_InactiveZA, false));
 return false;
 }
@@ -1915,7 +1915,7 @@ static void gen_sysreg_undef(DisasContext *s, bool isread,
 } else {
 syndrome = syn_uncategorized();
 }
-gen_exception_insn(s, s->pc_curr, EXCP_UDEF, syndrome);
+gen_exception_insn(s, 0, EXCP_UDEF, syndrome);
 }
 
 /* MRS - move from system register
@@ -2169,8 +2169,7 @@ static void disas_exc(DisasContext *s, uint32_t insn)
 switch (op2_ll) {
 case 1: /* SVC */
 gen_ss_advance(s);
-gen_exception_insn(s, s->base.pc_next, EXCP_SWI,
-   syn_aa64_svc(imm16));
+gen_exception_insn(s, 4, EXCP_SWI, syn_aa64_svc(imm16));
 break;
 case 2: /* HVC */
 if (s->current_el == 0) {
@@ -2183,8 +2182,7 @@ static void disas_exc(DisasContext *s, uint32_t insn)
 gen_a64_update_pc(s, 0);
 gen_helper_pre_hvc(cpu_env);
 gen_ss_advance(s);
-gen_exception_insn_el(s, s->base.pc_next, EXCP_HVC,
-  syn_aa64_hvc(imm16), 2);
+gen_exception_insn_el(s, 4, EXCP_HVC, syn_aa64_hvc(imm16), 2);
 break;
 case 3: /* SMC */
 if (s->current_el == 0) {
@@ -2194,8 +2192,7 @@ static voi

[PATCH v5 8/9] target/arm: Introduce gen_pc_plus_diff for aarch32

2022-09-30 Thread Richard Henderson
In preparation for TARGET_TB_PCREL, reduce reliance on absolute values.

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 target/arm/translate.c | 29 ++---
 1 file changed, 18 insertions(+), 11 deletions(-)

diff --git a/target/arm/translate.c b/target/arm/translate.c
index fd35db8c8c..050da9e740 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -276,11 +276,16 @@ static target_long jmp_diff(DisasContext *s, target_long 
diff)
 return diff + (s->thumb ? 4 : 8);
 }
 
+static void gen_pc_plus_diff(DisasContext *s, TCGv_i32 var, target_long diff)
+{
+tcg_gen_movi_i32(var, s->pc_curr + diff);
+}
+
 /* Set a variable to the value of a CPU register.  */
 void load_reg_var(DisasContext *s, TCGv_i32 var, int reg)
 {
 if (reg == 15) {
-tcg_gen_movi_i32(var, read_pc(s));
+gen_pc_plus_diff(s, var, jmp_diff(s, 0));
 } else {
 tcg_gen_mov_i32(var, cpu_R[reg]);
 }
@@ -296,7 +301,8 @@ TCGv_i32 add_reg_for_lit(DisasContext *s, int reg, int ofs)
 TCGv_i32 tmp = tcg_temp_new_i32();
 
 if (reg == 15) {
-tcg_gen_movi_i32(tmp, (read_pc(s) & ~3) + ofs);
+/* This difference computes a page offset so ok for TARGET_TB_PCREL. */
+gen_pc_plus_diff(s, tmp, (read_pc(s) & ~3) - s->pc_curr + ofs);
 } else {
 tcg_gen_addi_i32(tmp, cpu_R[reg], ofs);
 }
@@ -1159,7 +1165,7 @@ void unallocated_encoding(DisasContext *s)
 /* Force a TB lookup after an instruction that changes the CPU state.  */
 void gen_lookup_tb(DisasContext *s)
 {
-tcg_gen_movi_i32(cpu_R[15], s->base.pc_next);
+gen_pc_plus_diff(s, cpu_R[15], curr_insn_len(s));
 s->base.is_jmp = DISAS_EXIT;
 }
 
@@ -6483,7 +6489,7 @@ static bool trans_BLX_r(DisasContext *s, arg_BLX_r *a)
 return false;
 }
 tmp = load_reg(s, a->rm);
-tcg_gen_movi_i32(cpu_R[14], s->base.pc_next | s->thumb);
+gen_pc_plus_diff(s, cpu_R[14], curr_insn_len(s) | s->thumb);
 gen_bx(s, tmp);
 return true;
 }
@@ -8351,7 +8357,7 @@ static bool trans_B_cond_thumb(DisasContext *s, arg_ci *a)
 
 static bool trans_BL(DisasContext *s, arg_i *a)
 {
-tcg_gen_movi_i32(cpu_R[14], s->base.pc_next | s->thumb);
+gen_pc_plus_diff(s, cpu_R[14], curr_insn_len(s) | s->thumb);
 gen_jmp(s, jmp_diff(s, a->imm));
 return true;
 }
@@ -8370,7 +8376,7 @@ static bool trans_BLX_i(DisasContext *s, arg_BLX_i *a)
 if (s->thumb && (a->imm & 2)) {
 return false;
 }
-tcg_gen_movi_i32(cpu_R[14], s->base.pc_next | s->thumb);
+gen_pc_plus_diff(s, cpu_R[14], curr_insn_len(s) | s->thumb);
 store_cpu_field_constant(!s->thumb, thumb);
 /* This difference computes a page offset so ok for TARGET_TB_PCREL. */
 gen_jmp(s, (read_pc(s) & ~3) - s->pc_curr + a->imm);
@@ -8380,7 +8386,7 @@ static bool trans_BLX_i(DisasContext *s, arg_BLX_i *a)
 static bool trans_BL_BLX_prefix(DisasContext *s, arg_BL_BLX_prefix *a)
 {
 assert(!arm_dc_feature(s, ARM_FEATURE_THUMB2));
-tcg_gen_movi_i32(cpu_R[14], read_pc(s) + (a->imm << 12));
+gen_pc_plus_diff(s, cpu_R[14], jmp_diff(s, a->imm << 12));
 return true;
 }
 
@@ -8390,7 +8396,7 @@ static bool trans_BL_suffix(DisasContext *s, 
arg_BL_suffix *a)
 
 assert(!arm_dc_feature(s, ARM_FEATURE_THUMB2));
 tcg_gen_addi_i32(tmp, cpu_R[14], (a->imm << 1) | 1);
-tcg_gen_movi_i32(cpu_R[14], s->base.pc_next | 1);
+gen_pc_plus_diff(s, cpu_R[14], curr_insn_len(s) | 1);
 gen_bx(s, tmp);
 return true;
 }
@@ -8406,7 +8412,7 @@ static bool trans_BLX_suffix(DisasContext *s, 
arg_BLX_suffix *a)
 tmp = tcg_temp_new_i32();
 tcg_gen_addi_i32(tmp, cpu_R[14], a->imm << 1);
 tcg_gen_andi_i32(tmp, tmp, 0xfffffffc);
-tcg_gen_movi_i32(cpu_R[14], s->base.pc_next | 1);
+gen_pc_plus_diff(s, cpu_R[14], curr_insn_len(s) | 1);
 gen_bx(s, tmp);
 return true;
 }
@@ -8729,10 +8735,11 @@ static bool op_tbranch(DisasContext *s, arg_tbranch *a, 
bool half)
 tcg_gen_add_i32(addr, addr, tmp);
 
 gen_aa32_ld_i32(s, tmp, addr, get_mem_index(s), half ? MO_UW : MO_UB);
-tcg_temp_free_i32(addr);
 
 tcg_gen_add_i32(tmp, tmp, tmp);
-tcg_gen_addi_i32(tmp, tmp, read_pc(s));
+gen_pc_plus_diff(s, addr, jmp_diff(s, 0));
+tcg_gen_add_i32(tmp, tmp, addr);
+tcg_temp_free_i32(addr);
 store_reg(s, 15, tmp);
 return true;
 }
-- 
2.34.1




[PATCH v5 2/9] target/arm: Change gen_goto_tb to work on displacements

2022-09-30 Thread Richard Henderson
In preparation for TARGET_TB_PCREL, reduce reliance on absolute values.

Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 target/arm/translate-a64.c | 40 --
 target/arm/translate.c | 10 ++
 2 files changed, 27 insertions(+), 23 deletions(-)

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 78b2d91ed4..8f5c2675f7 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -378,8 +378,10 @@ static inline bool use_goto_tb(DisasContext *s, uint64_t dest)
 return translator_use_goto_tb(&s->base, dest);
 }
 
-static inline void gen_goto_tb(DisasContext *s, int n, uint64_t dest)
+static void gen_goto_tb(DisasContext *s, int n, int64_t diff)
 {
+uint64_t dest = s->pc_curr + diff;
+
 if (use_goto_tb(s, dest)) {
 tcg_gen_goto_tb(n);
 gen_a64_set_pc_im(dest);
@@ -1362,7 +1364,7 @@ static inline AArch64DecodeFn *lookup_disas_fn(const AArch64DecodeTable *table,
  */
 static void disas_uncond_b_imm(DisasContext *s, uint32_t insn)
 {
-uint64_t addr = s->pc_curr + sextract32(insn, 0, 26) * 4;
+int64_t diff = sextract32(insn, 0, 26) * 4;
 
 if (insn & (1U << 31)) {
 /* BL Branch with link */
@@ -1371,7 +1373,7 @@ static void disas_uncond_b_imm(DisasContext *s, uint32_t insn)
 
 /* B Branch / BL Branch with link */
 reset_btype(s);
-gen_goto_tb(s, 0, addr);
+gen_goto_tb(s, 0, diff);
 }
 
 /* Compare and branch (immediate)
@@ -1383,14 +1385,14 @@ static void disas_uncond_b_imm(DisasContext *s, uint32_t insn)
 static void disas_comp_b_imm(DisasContext *s, uint32_t insn)
 {
 unsigned int sf, op, rt;
-uint64_t addr;
+int64_t diff;
 TCGLabel *label_match;
 TCGv_i64 tcg_cmp;
 
 sf = extract32(insn, 31, 1);
 op = extract32(insn, 24, 1); /* 0: CBZ; 1: CBNZ */
 rt = extract32(insn, 0, 5);
-addr = s->pc_curr + sextract32(insn, 5, 19) * 4;
+diff = sextract32(insn, 5, 19) * 4;
 
 tcg_cmp = read_cpu_reg(s, rt, sf);
 label_match = gen_new_label();
@@ -1399,9 +1401,9 @@ static void disas_comp_b_imm(DisasContext *s, uint32_t insn)
 tcg_gen_brcondi_i64(op ? TCG_COND_NE : TCG_COND_EQ,
 tcg_cmp, 0, label_match);
 
-gen_goto_tb(s, 0, s->base.pc_next);
+gen_goto_tb(s, 0, 4);
 gen_set_label(label_match);
-gen_goto_tb(s, 1, addr);
+gen_goto_tb(s, 1, diff);
 }
 
 /* Test and branch (immediate)
@@ -1413,13 +1415,13 @@ static void disas_comp_b_imm(DisasContext *s, uint32_t insn)
 static void disas_test_b_imm(DisasContext *s, uint32_t insn)
 {
 unsigned int bit_pos, op, rt;
-uint64_t addr;
+int64_t diff;
 TCGLabel *label_match;
 TCGv_i64 tcg_cmp;
 
 bit_pos = (extract32(insn, 31, 1) << 5) | extract32(insn, 19, 5);
 op = extract32(insn, 24, 1); /* 0: TBZ; 1: TBNZ */
-addr = s->pc_curr + sextract32(insn, 5, 14) * 4;
+diff = sextract32(insn, 5, 14) * 4;
 rt = extract32(insn, 0, 5);
 
 tcg_cmp = tcg_temp_new_i64();
@@ -1430,9 +1432,9 @@ static void disas_test_b_imm(DisasContext *s, uint32_t insn)
 tcg_gen_brcondi_i64(op ? TCG_COND_NE : TCG_COND_EQ,
 tcg_cmp, 0, label_match);
 tcg_temp_free_i64(tcg_cmp);
-gen_goto_tb(s, 0, s->base.pc_next);
+gen_goto_tb(s, 0, 4);
 gen_set_label(label_match);
-gen_goto_tb(s, 1, addr);
+gen_goto_tb(s, 1, diff);
 }
 
 /* Conditional branch (immediate)
@@ -1444,13 +1446,13 @@ static void disas_test_b_imm(DisasContext *s, uint32_t insn)
 static void disas_cond_b_imm(DisasContext *s, uint32_t insn)
 {
 unsigned int cond;
-uint64_t addr;
+int64_t diff;
 
 if ((insn & (1 << 4)) || (insn & (1 << 24))) {
 unallocated_encoding(s);
 return;
 }
-addr = s->pc_curr + sextract32(insn, 5, 19) * 4;
+diff = sextract32(insn, 5, 19) * 4;
 cond = extract32(insn, 0, 4);
 
 reset_btype(s);
@@ -1458,12 +1460,12 @@ static void disas_cond_b_imm(DisasContext *s, uint32_t insn)
 /* genuinely conditional branches */
 TCGLabel *label_match = gen_new_label();
 arm_gen_test_cc(cond, label_match);
-gen_goto_tb(s, 0, s->base.pc_next);
+gen_goto_tb(s, 0, 4);
 gen_set_label(label_match);
-gen_goto_tb(s, 1, addr);
+gen_goto_tb(s, 1, diff);
 } else {
 /* 0xe and 0xf are both "always" conditions */
-gen_goto_tb(s, 0, addr);
+gen_goto_tb(s, 0, diff);
 }
 }
 
@@ -1637,7 +1639,7 @@ static void handle_sync(DisasContext *s, uint32_t insn,
  * any pending interrupts immediately.
  */
 reset_btype(s);
-gen_goto_tb(s, 0, s->base.pc_next);
+gen_goto_tb(s, 0, 4);
 return;
 
 case 7: /* SB */
@@ -1649,7 +1651,7 @@ static void handle_sync(DisasContext *s, uint32_t insn,
  * MB and end the TB instead.
  */
 tcg_gen_mb(TCG_MO_ALL | TCG_BAR_SC);
-gen_goto_tb(s, 0, s->base.pc_next);

[PATCH v5 3/9] target/arm: Change gen_*set_pc_im to gen_*update_pc

2022-09-30 Thread Richard Henderson
In preparation for TARGET_TB_PCREL, reduce reliance on
absolute values by passing in pc difference.

Signed-off-by: Richard Henderson 
---
 target/arm/translate-a32.h |  2 +-
 target/arm/translate.h |  6 ++--
 target/arm/translate-a64.c | 32 +-
 target/arm/translate-vfp.c |  2 +-
 target/arm/translate.c | 68 --
 5 files changed, 56 insertions(+), 54 deletions(-)

diff --git a/target/arm/translate-a32.h b/target/arm/translate-a32.h
index 78a84c1414..5339c22f1e 100644
--- a/target/arm/translate-a32.h
+++ b/target/arm/translate-a32.h
@@ -40,7 +40,7 @@ void write_neon_element64(TCGv_i64 src, int reg, int ele, MemOp memop);
 TCGv_i32 add_reg_for_lit(DisasContext *s, int reg, int ofs);
 void gen_set_cpsr(TCGv_i32 var, uint32_t mask);
 void gen_set_condexec(DisasContext *s);
-void gen_set_pc_im(DisasContext *s, target_ulong val);
+void gen_update_pc(DisasContext *s, target_long diff);
 void gen_lookup_tb(DisasContext *s);
 long vfp_reg_offset(bool dp, unsigned reg);
 long neon_full_reg_offset(unsigned reg);
diff --git a/target/arm/translate.h b/target/arm/translate.h
index 90bf7c57fc..d651044855 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -254,7 +254,7 @@ static inline int curr_insn_len(DisasContext *s)
  * For instructions which want an immediate exit to the main loop, as opposed
  * to attempting to use lookup_and_goto_ptr.  Unlike DISAS_UPDATE_EXIT, this
  * doesn't write the PC on exiting the translation loop so you need to ensure
- * something (gen_a64_set_pc_im or runtime helper) has done so before we reach
+ * something (gen_a64_update_pc or runtime helper) has done so before we reach
  * return from cpu_tb_exec.
  */
 #define DISAS_EXIT  DISAS_TARGET_9
@@ -263,14 +263,14 @@ static inline int curr_insn_len(DisasContext *s)
 
 #ifdef TARGET_AARCH64
 void a64_translate_init(void);
-void gen_a64_set_pc_im(uint64_t val);
+void gen_a64_update_pc(DisasContext *s, target_long diff);
 extern const TranslatorOps aarch64_translator_ops;
 #else
 static inline void a64_translate_init(void)
 {
 }
 
-static inline void gen_a64_set_pc_im(uint64_t val)
+static inline void gen_a64_update_pc(DisasContext *s, target_long diff)
 {
 }
 #endif
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 8f5c2675f7..914c789187 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -148,9 +148,9 @@ static void reset_btype(DisasContext *s)
 }
 }
 
-void gen_a64_set_pc_im(uint64_t val)
+void gen_a64_update_pc(DisasContext *s, target_long diff)
 {
-tcg_gen_movi_i64(cpu_pc, val);
+tcg_gen_movi_i64(cpu_pc, s->pc_curr + diff);
 }
 
 /*
@@ -342,14 +342,14 @@ static void gen_exception_internal(int excp)
 
 static void gen_exception_internal_insn(DisasContext *s, uint64_t pc, int excp)
 {
-gen_a64_set_pc_im(pc);
+gen_a64_update_pc(s, pc - s->pc_curr);
 gen_exception_internal(excp);
 s->base.is_jmp = DISAS_NORETURN;
 }
 
 static void gen_exception_bkpt_insn(DisasContext *s, uint32_t syndrome)
 {
-gen_a64_set_pc_im(s->pc_curr);
+gen_a64_update_pc(s, 0);
 gen_helper_exception_bkpt_insn(cpu_env, tcg_constant_i32(syndrome));
 s->base.is_jmp = DISAS_NORETURN;
 }
@@ -384,11 +384,11 @@ static void gen_goto_tb(DisasContext *s, int n, int64_t diff)
 
 if (use_goto_tb(s, dest)) {
 tcg_gen_goto_tb(n);
-gen_a64_set_pc_im(dest);
+gen_a64_update_pc(s, diff);
 tcg_gen_exit_tb(s->base.tb, n);
 s->base.is_jmp = DISAS_NORETURN;
 } else {
-gen_a64_set_pc_im(dest);
+gen_a64_update_pc(s, diff);
 if (s->ss_active) {
 gen_step_complete_exception(s);
 } else {
@@ -1960,7 +1960,7 @@ static void handle_sys(DisasContext *s, uint32_t insn, bool isread,
 uint32_t syndrome;
 
 syndrome = syn_aa64_sysregtrap(op0, op1, op2, crn, crm, rt, isread);
-gen_a64_set_pc_im(s->pc_curr);
+gen_a64_update_pc(s, 0);
 gen_helper_access_check_cp_reg(cpu_env,
tcg_constant_ptr(ri),
tcg_constant_i32(syndrome),
@@ -1970,7 +1970,7 @@ static void handle_sys(DisasContext *s, uint32_t insn, bool isread,
  * The readfn or writefn might raise an exception;
  * synchronize the CPU state in case it does.
  */
-gen_a64_set_pc_im(s->pc_curr);
+gen_a64_update_pc(s, 0);
 }
 
 /* Handle special cases first */
@@ -2180,7 +2180,7 @@ static void disas_exc(DisasContext *s, uint32_t insn)
 /* The pre HVC helper handles cases when HVC gets trapped
  * as an undefined insn by runtime configuration.
  */
-gen_a64_set_pc_im(s->pc_curr);
+gen_a64_update_pc(s, 0);
 gen_helper_pre_hvc(cpu_env);
 gen_ss_advance(s);
 gen_exception_insn_el(s, s->base.pc_next, EXCP_HVC,
@@ -2191,7 +2191,7 @@ static vo

[PATCH v5 1/9] target/arm: Introduce curr_insn_len

2022-09-30 Thread Richard Henderson
A simple helper to retrieve the length of the current insn.

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 target/arm/translate.h | 5 +
 target/arm/translate-vfp.c | 2 +-
 target/arm/translate.c | 5 ++---
 3 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/target/arm/translate.h b/target/arm/translate.h
index af5d4a7086..90bf7c57fc 100644
--- a/target/arm/translate.h
+++ b/target/arm/translate.h
@@ -226,6 +226,11 @@ static inline void disas_set_insn_syndrome(DisasContext *s, uint32_t syn)
 s->insn_start = NULL;
 }
 
+static inline int curr_insn_len(DisasContext *s)
+{
+return s->base.pc_next - s->pc_curr;
+}
+
 /* is_jmp field values */
 #define DISAS_JUMP  DISAS_TARGET_0 /* only pc was modified dynamically */
 /* CPU state was modified dynamically; exit to main loop for interrupts. */
diff --git a/target/arm/translate-vfp.c b/target/arm/translate-vfp.c
index bd5ae27d09..94cc1e4b77 100644
--- a/target/arm/translate-vfp.c
+++ b/target/arm/translate-vfp.c
@@ -242,7 +242,7 @@ static bool vfp_access_check_a(DisasContext *s, bool ignore_vfp_enabled)
 if (s->sme_trap_nonstreaming) {
 gen_exception_insn(s, s->pc_curr, EXCP_UDEF,
syn_smetrap(SME_ET_Streaming,
-   s->base.pc_next - s->pc_curr == 2));
+   curr_insn_len(s) == 2));
 return false;
 }
 
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 5aaccbbf71..42e11102f7 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -6654,7 +6654,7 @@ static ISSInfo make_issinfo(DisasContext *s, int rd, bool p, bool w)
 /* ISS not valid if writeback */
 if (p && !w) {
 ret = rd;
-if (s->base.pc_next - s->pc_curr == 2) {
+if (curr_insn_len(s) == 2) {
 ret |= ISSIs16Bit;
 }
 } else {
@@ -9817,8 +9817,7 @@ static void arm_tr_tb_stop(DisasContextBase *dcbase, CPUState *cpu)
 /* nothing more to generate */
 break;
 case DISAS_WFI:
-gen_helper_wfi(cpu_env,
-   tcg_constant_i32(dc->base.pc_next - dc->pc_curr));
+gen_helper_wfi(cpu_env, tcg_constant_i32(curr_insn_len(dc)));
 /*
  * The helper doesn't necessarily throw an exception, but we
  * must go back to the main loop to check for interrupts anyway.
-- 
2.34.1




[PATCH v5 0/9] target/arm: pc-relative translation blocks

2022-09-30 Thread Richard Henderson
This is the Arm specific changes required to reduce the
amount of translation for address space randomization.

Changes for v5:
  * Minor updates for patch review, mostly using target_long
for pc displacements.


r~

Based-on: 20220930212622.108363-1-richard.hender...@linaro.org
("[PATCH v6 00/18] tcg: CPUTLBEntryFull and TARGET_TB_PCREL")

Richard Henderson (9):
  target/arm: Introduce curr_insn_len
  target/arm: Change gen_goto_tb to work on displacements
  target/arm: Change gen_*set_pc_im to gen_*update_pc
  target/arm: Change gen_exception_insn* to work on displacements
  target/arm: Remove gen_exception_internal_insn pc argument
  target/arm: Change gen_jmp* to work on displacements
  target/arm: Introduce gen_pc_plus_diff for aarch64
  target/arm: Introduce gen_pc_plus_diff for aarch32
  target/arm: Enable TARGET_TB_PCREL

 target/arm/cpu-param.h|   1 +
 target/arm/translate-a32.h|   2 +-
 target/arm/translate.h|  35 -
 target/arm/cpu.c  |  23 ++--
 target/arm/translate-a64.c| 174 +++--
 target/arm/translate-m-nocp.c |   6 +-
 target/arm/translate-mve.c|   2 +-
 target/arm/translate-vfp.c|  10 +-
 target/arm/translate.c| 235 +-
 9 files changed, 303 insertions(+), 185 deletions(-)

-- 
2.34.1




[PATCH v5 5/9] target/arm: Remove gen_exception_internal_insn pc argument

2022-09-30 Thread Richard Henderson
In preparation for TARGET_TB_PCREL, reduce reliance on absolute values.
Since we always pass dc->pc_curr, fold the arithmetic to zero displacement.

Signed-off-by: Richard Henderson 
---
 target/arm/translate-a64.c |  6 +++---
 target/arm/translate.c | 10 +-
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index 2621b3b36a..005fd767fb 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -340,9 +340,9 @@ static void gen_exception_internal(int excp)
 gen_helper_exception_internal(cpu_env, tcg_constant_i32(excp));
 }
 
-static void gen_exception_internal_insn(DisasContext *s, uint64_t pc, int excp)
+static void gen_exception_internal_insn(DisasContext *s, int excp)
 {
-gen_a64_update_pc(s, pc - s->pc_curr);
+gen_a64_update_pc(s, 0);
 gen_exception_internal(excp);
 s->base.is_jmp = DISAS_NORETURN;
 }
@@ -2219,7 +2219,7 @@ static void disas_exc(DisasContext *s, uint32_t insn)
  * Secondly, "HLT 0xf000" is the A64 semihosting syscall instruction.
  */
 if (semihosting_enabled(s->current_el == 0) && imm16 == 0xf000) {
-gen_exception_internal_insn(s, s->pc_curr, EXCP_SEMIHOST);
+gen_exception_internal_insn(s, EXCP_SEMIHOST);
 } else {
 unallocated_encoding(s);
 }
diff --git a/target/arm/translate.c b/target/arm/translate.c
index f9d3128656..e0b1d415a2 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -1078,10 +1078,10 @@ static inline void gen_smc(DisasContext *s)
 s->base.is_jmp = DISAS_SMC;
 }
 
-static void gen_exception_internal_insn(DisasContext *s, uint32_t pc, int excp)
+static void gen_exception_internal_insn(DisasContext *s, int excp)
 {
 gen_set_condexec(s);
-gen_update_pc(s, pc - s->pc_curr);
+gen_update_pc(s, 0);
 gen_exception_internal(excp);
 s->base.is_jmp = DISAS_NORETURN;
 }
@@ -1173,7 +1173,7 @@ static inline void gen_hlt(DisasContext *s, int imm)
  */
 if (semihosting_enabled(s->current_el != 0) &&
 (imm == (s->thumb ? 0x3c : 0xf000))) {
-gen_exception_internal_insn(s, s->pc_curr, EXCP_SEMIHOST);
+gen_exception_internal_insn(s, EXCP_SEMIHOST);
 return;
 }
 
@@ -6560,7 +6560,7 @@ static bool trans_BKPT(DisasContext *s, arg_BKPT *a)
 if (arm_dc_feature(s, ARM_FEATURE_M) &&
 semihosting_enabled(s->current_el == 0) &&
 (a->imm == 0xab)) {
-gen_exception_internal_insn(s, s->pc_curr, EXCP_SEMIHOST);
+gen_exception_internal_insn(s, EXCP_SEMIHOST);
 } else {
 gen_exception_bkpt_insn(s, syn_aa32_bkpt(a->imm, false));
 }
@@ -8766,7 +8766,7 @@ static bool trans_SVC(DisasContext *s, arg_SVC *a)
 if (!arm_dc_feature(s, ARM_FEATURE_M) &&
 semihosting_enabled(s->current_el == 0) &&
 (a->imm == semihost_imm)) {
-gen_exception_internal_insn(s, s->pc_curr, EXCP_SEMIHOST);
+gen_exception_internal_insn(s, EXCP_SEMIHOST);
 } else {
 gen_update_pc(s, curr_insn_len(s));
 s->svc_imm = a->imm;
-- 
2.34.1




RE: [PATCH v6 16/18] hw/core: Add CPUClass.get_pc

2022-09-30 Thread Taylor Simpson


> -Original Message-
> From: Richard Henderson 
> Sent: Friday, September 30, 2022 4:26 PM
> To: qemu-devel@nongnu.org
> Cc: peter.mayd...@linux.org; alex.ben...@linux.org; Eduardo Habkost
> ; Marcel Apfelbaum
> ; Philippe Mathieu-Daudé
> ; Yanan Wang ; Michael
> Rolnik ; Edgar E. Iglesias ;
> Taylor Simpson ; Song Gao
> ; Xiaojuan Yang ;
> Laurent Vivier ; Jiaxun Yang ;
> Aleksandar Rikalo ; Chris Wulff
> ; Marek Vasut ; Stafford Horne
> ; Yoshinori Sato ; Mark
> Cave-Ayland ; Bastian Koppelmann
> ; Max Filippov ;
> qemu-...@nongnu.org; qemu-...@nongnu.org; qemu-ri...@nongnu.org;
> qemu-s3...@nongnu.org
> Subject: [PATCH v6 16/18] hw/core: Add CPUClass.get_pc
> 
> diff --git a/target/hexagon/cpu.c b/target/hexagon/cpu.c
> index fa9bd702d6..04a497db5e 100644
> --- a/target/hexagon/cpu.c
> +++ b/target/hexagon/cpu.c
> @@ -251,6 +251,13 @@ static void hexagon_cpu_set_pc(CPUState *cs, vaddr value)
>  env->gpr[HEX_REG_PC] = value;
>  }
> 
> +static vaddr hexagon_cpu_get_pc(CPUState *cs)
> +{
> +HexagonCPU *cpu = HEXAGON_CPU(cs);
> +CPUHexagonState *env = &cpu->env;
> +return env->gpr[HEX_REG_PC];
> +}
> +
>  static void hexagon_cpu_synchronize_from_tb(CPUState *cs,
>  const TranslationBlock *tb)
>  {
> @@ -337,6 +344,7 @@ static void hexagon_cpu_class_init(ObjectClass *c, void *data)
>  cc->has_work = hexagon_cpu_has_work;
>  cc->dump_state = hexagon_dump_state;
>  cc->set_pc = hexagon_cpu_set_pc;
> +cc->get_pc = hexagon_cpu_get_pc;
>  cc->gdb_read_register = hexagon_gdb_read_register;
>  cc->gdb_write_register = hexagon_gdb_write_register;
>  cc->gdb_num_core_regs = TOTAL_PER_THREAD_REGS + NUM_VREGS +

Reviewed-by: Taylor Simpson 


RE: [PATCH] Hexagon (gen_tcg_funcs.py): avoid duplicated tcg code on A_CVI_NEW

2022-09-30 Thread Taylor Simpson



> -Original Message-
> From: Matheus Tavares Bernardino 
> Sent: Friday, September 30, 2022 3:08 PM
> To: qemu-devel@nongnu.org
> Cc: Taylor Simpson 
> Subject: [PATCH] Hexagon (gen_tcg_funcs.py): avoid duplicated tcg code on
> A_CVI_NEW
> 
> Hexagon instructions with the A_CVI_NEW attribute produce a vector value
> that can be used in the same packet. The python function responsible for
> generating code for such instructions has a typo ("if" instead of "elif"),
> which causes genptr_dst_write_ext() to be executed twice, thus also
> generating the same tcg code twice. Fortunately, this doesn't cause any
> problems for correctness, but it is less efficient than it could be. Fix
> it by using an "elif" and avoiding the unnecessary extra code gen.
> 
> Signed-off-by: Matheus Tavares Bernardino 
> ---
>  target/hexagon/gen_tcg_funcs.py | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/target/hexagon/gen_tcg_funcs.py b/target/hexagon/gen_tcg_funcs.py
> index d72c689ad7..6dea02b0b9 100755
> --- a/target/hexagon/gen_tcg_funcs.py
> +++ b/target/hexagon/gen_tcg_funcs.py
> @@ -548,7 +548,7 @@ def genptr_dst_write_opn(f,regtype, regid, tag):
>  if (hex_common.is_hvx_reg(regtype)):
>  if (hex_common.is_new_result(tag)):
>  genptr_dst_write_ext(f, tag, regtype, regid, "EXT_NEW")
> -if (hex_common.is_tmp_result(tag)):
> +elif (hex_common.is_tmp_result(tag)):
>  genptr_dst_write_ext(f, tag, regtype, regid, "EXT_TMP")
>  else:
>  genptr_dst_write_ext(f, tag, regtype, regid, "EXT_DFL")

Reviewed-by: Taylor Simpson 



Re: [PATCH v6 00/18] tcg: CPUTLBEntryFull and TARGET_TB_PCREL

2022-09-30 Thread Richard Henderson

Good grief: typo in the cc list, twice.
You'd think I'd know where I work by now...


r~

On 9/30/22 14:26, Richard Henderson wrote:

Changes for v6:
   * CPUTLBEntryFull is now completely reviewed.

   * Incorporated the CPUClass caching patches,
 as I will add a new use of the cached value.

   * Move CPUJumpCache out of include/hw/core.h.  While looking at
 Alex's review of the patch, I realized that adding the virtual
 pc value unconditionally would consume 64kB per cpu on targets
 that do not require it.  Further, making it dynamically allocated
 (a consequence of core.h not having the structure definition to
 add to CPUState), means that we save 64kB per cpu when running
 with hardware virtualization (kvm, xen, etc).

   * Add CPUClass.get_pc, so that we can always use or filter on the
 virtual address when logging.

Patches needing review:

   13-accel-tcg-Do-not-align-tb-page_addr-0.patch
   14-accel-tcg-Inline-tb_flush_jmp_cache.patch (new)
   16-hw-core-Add-CPUClass.get_pc.patch (new)
   17-accel-tcg-Introduce-tb_pc-and-log_pc.patch (mostly new)
   18-accel-tcg-Introduce-TARGET_TB_PCREL.patch


r~


Alex Bennée (3):
   cpu: cache CPUClass in CPUState for hot code paths
   hw/core/cpu-sysemu: used cached class in cpu_asidx_from_attrs
   cputlb: used cached CPUClass in our hot-paths

Richard Henderson (15):
   accel/tcg: Rename CPUIOTLBEntry to CPUTLBEntryFull
   accel/tcg: Drop addr member from SavedIOTLB
   accel/tcg: Suppress auto-invalidate in probe_access_internal
   accel/tcg: Introduce probe_access_full
   accel/tcg: Introduce tlb_set_page_full
   include/exec: Introduce TARGET_PAGE_ENTRY_EXTRA
   accel/tcg: Remove PageDesc code_bitmap
   accel/tcg: Use bool for page_find_alloc
   accel/tcg: Use DisasContextBase in plugin_gen_tb_start
   accel/tcg: Do not align tb->page_addr[0]
   accel/tcg: Inline tb_flush_jmp_cache
   include/hw/core: Create struct CPUJumpCache
   hw/core: Add CPUClass.get_pc
   accel/tcg: Introduce tb_pc and log_pc
   accel/tcg: Introduce TARGET_TB_PCREL

  accel/tcg/internal.h|  10 +
  accel/tcg/tb-hash.h |   1 +
  accel/tcg/tb-jmp-cache.h|  29 +++
  include/exec/cpu-common.h   |   1 +
  include/exec/cpu-defs.h |  48 -
  include/exec/exec-all.h |  75 ++-
  include/exec/plugin-gen.h   |   7 +-
  include/hw/core/cpu.h   |  28 ++-
  include/qemu/typedefs.h |   1 +
  include/tcg/tcg.h   |   2 +-
  accel/tcg/cpu-exec.c| 122 +++
  accel/tcg/cputlb.c  | 259 ++--
  accel/tcg/plugin-gen.c  |  22 +-
  accel/tcg/translate-all.c   | 200 --
  accel/tcg/translator.c  |   2 +-
  cpu.c   |   9 +-
  hw/core/cpu-common.c|   3 +-
  hw/core/cpu-sysemu.c|   5 +-
  plugins/core.c  |   2 +-
  target/alpha/cpu.c  |   9 +
  target/arm/cpu.c|  17 +-
  target/arm/mte_helper.c |  14 +-
  target/arm/sve_helper.c |   4 +-
  target/arm/translate-a64.c  |   2 +-
  target/avr/cpu.c|  10 +-
  target/cris/cpu.c   |   8 +
  target/hexagon/cpu.c|  10 +-
  target/hppa/cpu.c   |  12 +-
  target/i386/cpu.c   |   9 +
  target/i386/tcg/tcg-cpu.c   |   2 +-
  target/loongarch/cpu.c  |  11 +-
  target/m68k/cpu.c   |   8 +
  target/microblaze/cpu.c |  10 +-
  target/mips/cpu.c   |   8 +
  target/mips/tcg/exception.c |   2 +-
  target/mips/tcg/sysemu/special_helper.c |   2 +-
  target/nios2/cpu.c  |   9 +
  target/openrisc/cpu.c   |  10 +-
  target/ppc/cpu_init.c   |   8 +
  target/riscv/cpu.c  |  17 +-
  target/rx/cpu.c |  10 +-
  target/s390x/cpu.c  |   8 +
  target/s390x/tcg/mem_helper.c   |   4 -
  target/sh4/cpu.c|  12 +-
  target/sparc/cpu.c  |  10 +-
  target/tricore/cpu.c|  11 +-
  target/xtensa/cpu.c |   8 +
  tcg/tcg.c   |   8 +-
  trace/control-target.c  |   2 +-
  49 files changed, 723 insertions(+), 358 deletions(-)
  create mode 100644 accel/tcg/tb-jmp-cache.h






[PATCH v6 14/18] accel/tcg: Inline tb_flush_jmp_cache

2022-09-30 Thread Richard Henderson
This function has two users, who use it incompatibly.
In tlb_flush_page_by_mmuidx_async_0, when flushing a
single page, we need to flush exactly two pages.
In tlb_flush_range_by_mmuidx_async_0, when flushing a
range of pages, we need to flush N+1 pages.

This avoids double-flushing of jmp cache pages in a range.

Signed-off-by: Richard Henderson 
---
 accel/tcg/cputlb.c | 25 ++---
 1 file changed, 14 insertions(+), 11 deletions(-)

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index a0db2d32a8..c7909fb619 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -107,14 +107,6 @@ static void tb_jmp_cache_clear_page(CPUState *cpu, target_ulong page_addr)
 }
 }
 
-static void tb_flush_jmp_cache(CPUState *cpu, target_ulong addr)
-{
-/* Discard jump cache entries for any tb which might potentially
-   overlap the flushed page.  */
-tb_jmp_cache_clear_page(cpu, addr - TARGET_PAGE_SIZE);
-tb_jmp_cache_clear_page(cpu, addr);
-}
-
 /**
 * tlb_mmu_resize_locked() - perform TLB resize bookkeeping; resize if necessary
  * @desc: The CPUTLBDesc portion of the TLB
@@ -541,7 +533,12 @@ static void tlb_flush_page_by_mmuidx_async_0(CPUState *cpu,
 }
 qemu_spin_unlock(&env_tlb(env)->c.lock);
 
-tb_flush_jmp_cache(cpu, addr);
+/*
+ * Discard jump cache entries for any tb which might potentially
+ * overlap the flushed page, which includes the previous.
+ */
+tb_jmp_cache_clear_page(cpu, addr - TARGET_PAGE_SIZE);
+tb_jmp_cache_clear_page(cpu, addr);
 }
 
 /**
@@ -792,8 +789,14 @@ static void tlb_flush_range_by_mmuidx_async_0(CPUState *cpu,
 return;
 }
 
-for (target_ulong i = 0; i < d.len; i += TARGET_PAGE_SIZE) {
-tb_flush_jmp_cache(cpu, d.addr + i);
+/*
+ * Discard jump cache entries for any tb which might potentially
+ * overlap the flushed pages, which includes the previous.
+ */
+d.addr -= TARGET_PAGE_SIZE;
+for (target_ulong i = 0, n = d.len / TARGET_PAGE_SIZE + 1; i < n; i++) {
+tb_jmp_cache_clear_page(cpu, d.addr);
+d.addr += TARGET_PAGE_SIZE;
 }
 }
 
-- 
2.34.1




[PATCH v6 13/18] accel/tcg: Do not align tb->page_addr[0]

2022-09-30 Thread Richard Henderson
Let tb->page_addr[0] contain the offset within the page of the
start of the translation block.  We need to recover this value
anyway at various points, and it is easier to discard the page
offset when it's not needed, which happens naturally via the
existing find_page shift.

Signed-off-by: Richard Henderson 
---
 accel/tcg/cpu-exec.c  | 16 
 accel/tcg/cputlb.c|  3 ++-
 accel/tcg/translate-all.c |  9 +
 3 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index 5f43b9769a..dd58a144a8 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -174,7 +174,7 @@ struct tb_desc {
 target_ulong pc;
 target_ulong cs_base;
 CPUArchState *env;
-tb_page_addr_t phys_page1;
+tb_page_addr_t page_addr0;
 uint32_t flags;
 uint32_t cflags;
 uint32_t trace_vcpu_dstate;
@@ -186,7 +186,7 @@ static bool tb_lookup_cmp(const void *p, const void *d)
 const struct tb_desc *desc = d;
 
 if (tb->pc == desc->pc &&
-tb->page_addr[0] == desc->phys_page1 &&
+tb->page_addr[0] == desc->page_addr0 &&
 tb->cs_base == desc->cs_base &&
 tb->flags == desc->flags &&
 tb->trace_vcpu_dstate == desc->trace_vcpu_dstate &&
@@ -195,8 +195,8 @@ static bool tb_lookup_cmp(const void *p, const void *d)
 if (tb->page_addr[1] == -1) {
 return true;
 } else {
-tb_page_addr_t phys_page2;
-target_ulong virt_page2;
+tb_page_addr_t phys_page1;
+target_ulong virt_page1;
 
 /*
  * We know that the first page matched, and an otherwise valid TB
@@ -207,9 +207,9 @@ static bool tb_lookup_cmp(const void *p, const void *d)
  * is different for the new TB.  Therefore any exception raised
  * here by the faulting lookup is not premature.
  */
-virt_page2 = TARGET_PAGE_ALIGN(desc->pc);
-phys_page2 = get_page_addr_code(desc->env, virt_page2);
-if (tb->page_addr[1] == phys_page2) {
+virt_page1 = TARGET_PAGE_ALIGN(desc->pc);
+phys_page1 = get_page_addr_code(desc->env, virt_page1);
+if (tb->page_addr[1] == phys_page1) {
 return true;
 }
 }
@@ -235,7 +235,7 @@ static TranslationBlock *tb_htable_lookup(CPUState *cpu, target_ulong pc,
 if (phys_pc == -1) {
 return NULL;
 }
-desc.phys_page1 = phys_pc & TARGET_PAGE_MASK;
+desc.page_addr0 = phys_pc;
 h = tb_hash_func(phys_pc, pc, flags, cflags, *cpu->trace_dstate);
 return qht_lookup_custom(&tb_ctx.htable, &desc, h, tb_lookup_cmp);
 }
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 361078471b..a0db2d32a8 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -951,7 +951,8 @@ void tlb_flush_page_bits_by_mmuidx_all_cpus_synced(CPUState *src_cpu,
can be detected */
 void tlb_protect_code(ram_addr_t ram_addr)
 {
-cpu_physical_memory_test_and_clear_dirty(ram_addr, TARGET_PAGE_SIZE,
+cpu_physical_memory_test_and_clear_dirty(ram_addr & TARGET_PAGE_MASK,
+ TARGET_PAGE_SIZE,
  DIRTY_MEMORY_CODE);
 }
 
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index ca685f6ede..3a63113c41 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -1167,7 +1167,7 @@ static void do_tb_phys_invalidate(TranslationBlock *tb, bool rm_from_page_list)
 qemu_spin_unlock(&tb->jmp_lock);
 
 /* remove the TB from the hash list */
-phys_pc = tb->page_addr[0] + (tb->pc & ~TARGET_PAGE_MASK);
+phys_pc = tb->page_addr[0];
 h = tb_hash_func(phys_pc, tb->pc, tb->flags, orig_cflags,
  tb->trace_vcpu_dstate);
 if (!qht_remove(&tb_ctx.htable, tb, h)) {
@@ -1291,7 +1291,7 @@ tb_link_page(TranslationBlock *tb, tb_page_addr_t phys_pc,
  * we can only insert TBs that are fully initialized.
  */
 page_lock_pair(&p, phys_pc, &p2, phys_page2, true);
-tb_page_add(p, tb, 0, phys_pc & TARGET_PAGE_MASK);
+tb_page_add(p, tb, 0, phys_pc);
 if (p2) {
 tb_page_add(p2, tb, 1, phys_page2);
 } else {
@@ -1644,11 +1644,12 @@ tb_invalidate_phys_page_range__locked(struct page_collection *pages,
 if (n == 0) {
 /* NOTE: tb_end may be after the end of the page, but
it is not a problem */
-tb_start = tb->page_addr[0] + (tb->pc & ~TARGET_PAGE_MASK);
+tb_start = tb->page_addr[0];
 tb_end = tb_start + tb->size;
 } else {
 tb_start = tb->page_addr[1];
-tb_end = tb_start + ((tb->pc + tb->size) & ~TARGET_PAGE_MASK);
+tb_end = tb_start + ((tb->page_addr[0] + tb->size)
+ & ~TARGET_PAGE_MASK);
 }
 if (!(tb_end <= start || tb_start >= end)) {
 #ifdef TARGET_HAS_PRECISE_SM

[PATCH v6 12/18] accel/tcg: Use DisasContextBase in plugin_gen_tb_start

2022-09-30 Thread Richard Henderson
Use the pc coming from db->pc_first rather than the TB.

Use the cached host_addr rather than re-computing for the
first page.  We still need a separate lookup for the second
page because it won't be computed for DisasContextBase until
the translator actually performs a read from the page.

Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 include/exec/plugin-gen.h |  7 ---
 accel/tcg/plugin-gen.c| 22 +++---
 accel/tcg/translator.c|  2 +-
 3 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/include/exec/plugin-gen.h b/include/exec/plugin-gen.h
index f92f169739..5004728c61 100644
--- a/include/exec/plugin-gen.h
+++ b/include/exec/plugin-gen.h
@@ -19,7 +19,8 @@ struct DisasContextBase;
 
 #ifdef CONFIG_PLUGIN
 
-bool plugin_gen_tb_start(CPUState *cpu, const TranslationBlock *tb, bool supress);
+bool plugin_gen_tb_start(CPUState *cpu, const struct DisasContextBase *db,
+ bool supress);
 void plugin_gen_tb_end(CPUState *cpu);
 void plugin_gen_insn_start(CPUState *cpu, const struct DisasContextBase *db);
 void plugin_gen_insn_end(void);
@@ -48,8 +49,8 @@ static inline void plugin_insn_append(abi_ptr pc, const void *from, size_t size)
 
 #else /* !CONFIG_PLUGIN */
 
-static inline
-bool plugin_gen_tb_start(CPUState *cpu, const TranslationBlock *tb, bool supress)
+static inline bool
+plugin_gen_tb_start(CPUState *cpu, const struct DisasContextBase *db, bool sup)
 {
 return false;
 }
diff --git a/accel/tcg/plugin-gen.c b/accel/tcg/plugin-gen.c
index 3d0b101e34..80dff68934 100644
--- a/accel/tcg/plugin-gen.c
+++ b/accel/tcg/plugin-gen.c
@@ -852,7 +852,8 @@ static void plugin_gen_inject(const struct qemu_plugin_tb *plugin_tb)
 pr_ops();
 }
 
-bool plugin_gen_tb_start(CPUState *cpu, const TranslationBlock *tb, bool mem_only)
+bool plugin_gen_tb_start(CPUState *cpu, const DisasContextBase *db,
+ bool mem_only)
 {
 bool ret = false;
 
@@ -870,9 +871,9 @@ bool plugin_gen_tb_start(CPUState *cpu, const TranslationBlock *tb, bool mem_onl
 
 ret = true;
 
-ptb->vaddr = tb->pc;
+ptb->vaddr = db->pc_first;
 ptb->vaddr2 = -1;
-get_page_addr_code_hostp(cpu->env_ptr, tb->pc, &ptb->haddr1);
+ptb->haddr1 = db->host_addr[0];
 ptb->haddr2 = NULL;
 ptb->mem_only = mem_only;
 
@@ -898,16 +899,15 @@ void plugin_gen_insn_start(CPUState *cpu, const DisasContextBase *db)
  * Note that we skip this when haddr1 == NULL, e.g. when we're
  * fetching instructions from a region not backed by RAM.
  */
-if (likely(ptb->haddr1 != NULL && ptb->vaddr2 == -1) &&
-unlikely((db->pc_next & TARGET_PAGE_MASK) !=
- (db->pc_first & TARGET_PAGE_MASK))) {
-get_page_addr_code_hostp(cpu->env_ptr, db->pc_next,
- &ptb->haddr2);
-ptb->vaddr2 = db->pc_next;
-}
-if (likely(ptb->vaddr2 == -1)) {
+if (ptb->haddr1 == NULL) {
+pinsn->haddr = NULL;
+} else if (is_same_page(db, db->pc_next)) {
 pinsn->haddr = ptb->haddr1 + pinsn->vaddr - ptb->vaddr;
 } else {
+if (ptb->vaddr2 == -1) {
+ptb->vaddr2 = TARGET_PAGE_ALIGN(db->pc_first);
+get_page_addr_code_hostp(cpu->env_ptr, ptb->vaddr2, &ptb->haddr2);
+}
 pinsn->haddr = ptb->haddr2 + pinsn->vaddr - ptb->vaddr2;
 }
 }
diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index ca8a5f2d83..8e78fd7a9c 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -75,7 +75,7 @@ void translator_loop(CPUState *cpu, TranslationBlock *tb, int 
max_insns,
 ops->tb_start(db, cpu);
 tcg_debug_assert(db->is_jmp == DISAS_NEXT);  /* no early exit */
 
-plugin_enabled = plugin_gen_tb_start(cpu, tb, cflags & CF_MEMI_ONLY);
+plugin_enabled = plugin_gen_tb_start(cpu, db, cflags & CF_MEMI_ONLY);
 
 while (true) {
 db->num_insns++;
-- 
2.34.1




[PATCH v6 17/18] accel/tcg: Introduce tb_pc and log_pc

2022-09-30 Thread Richard Henderson
The availability of tb->pc will shortly be conditional.
Introduce accessor functions to minimize ifdefs.

Pass around a known pc to places like tcg_gen_code,
where the caller must already have the value.

Signed-off-by: Richard Henderson 
---
 accel/tcg/internal.h|  6 
 include/exec/exec-all.h |  6 
 include/tcg/tcg.h   |  2 +-
 accel/tcg/cpu-exec.c| 46 ++---
 accel/tcg/translate-all.c   | 37 +++-
 target/arm/cpu.c|  4 +--
 target/avr/cpu.c|  2 +-
 target/hexagon/cpu.c|  2 +-
 target/hppa/cpu.c   |  4 +--
 target/i386/tcg/tcg-cpu.c   |  2 +-
 target/loongarch/cpu.c  |  2 +-
 target/microblaze/cpu.c |  2 +-
 target/mips/tcg/exception.c |  2 +-
 target/mips/tcg/sysemu/special_helper.c |  2 +-
 target/openrisc/cpu.c   |  2 +-
 target/riscv/cpu.c  |  4 +--
 target/rx/cpu.c |  2 +-
 target/sh4/cpu.c|  4 +--
 target/sparc/cpu.c  |  2 +-
 target/tricore/cpu.c|  2 +-
 tcg/tcg.c   |  8 ++---
 21 files changed, 82 insertions(+), 61 deletions(-)

diff --git a/accel/tcg/internal.h b/accel/tcg/internal.h
index 3092bfa964..a3875a3b5a 100644
--- a/accel/tcg/internal.h
+++ b/accel/tcg/internal.h
@@ -18,4 +18,10 @@ G_NORETURN void cpu_io_recompile(CPUState *cpu, uintptr_t 
retaddr);
 void page_init(void);
 void tb_htable_init(void);
 
+/* Return the current PC from CPU, which may be cached in TB. */
+static inline target_ulong log_pc(CPUState *cpu, const TranslationBlock *tb)
+{
+return tb_pc(tb);
+}
+
 #endif /* ACCEL_TCG_INTERNAL_H */
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index b1b920a713..7ea6026ba9 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -570,6 +570,12 @@ struct TranslationBlock {
 uintptr_t jmp_dest[2];
 };
 
+/* Hide the read to avoid ifdefs for TARGET_TB_PCREL. */
+static inline target_ulong tb_pc(const TranslationBlock *tb)
+{
+return tb->pc;
+}
+
 /* Hide the qatomic_read to make code a little easier on the eyes */
 static inline uint32_t tb_cflags(const TranslationBlock *tb)
 {
diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index 26a70526f1..d84bae6e3f 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -840,7 +840,7 @@ void tcg_register_thread(void);
 void tcg_prologue_init(TCGContext *s);
 void tcg_func_start(TCGContext *s);
 
-int tcg_gen_code(TCGContext *s, TranslationBlock *tb);
+int tcg_gen_code(TCGContext *s, TranslationBlock *tb, target_ulong pc_start);
 
 void tcg_set_frame(TCGContext *s, TCGReg reg, intptr_t start, intptr_t size);
 
diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index 2d7e610ee2..8b3f8435fb 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -186,7 +186,7 @@ static bool tb_lookup_cmp(const void *p, const void *d)
 const TranslationBlock *tb = p;
 const struct tb_desc *desc = d;
 
-if (tb->pc == desc->pc &&
+if (tb_pc(tb) == desc->pc &&
 tb->page_addr[0] == desc->page_addr0 &&
 tb->cs_base == desc->cs_base &&
 tb->flags == desc->flags &&
@@ -271,12 +271,10 @@ static inline TranslationBlock *tb_lookup(CPUState *cpu, 
target_ulong pc,
 return tb;
 }
 
-static inline void log_cpu_exec(target_ulong pc, CPUState *cpu,
-const TranslationBlock *tb)
+static void log_cpu_exec(target_ulong pc, CPUState *cpu,
+ const TranslationBlock *tb)
 {
-if (unlikely(qemu_loglevel_mask(CPU_LOG_TB_CPU | CPU_LOG_EXEC))
-&& qemu_log_in_addr_range(pc)) {
-
+if (qemu_log_in_addr_range(pc)) {
 qemu_log_mask(CPU_LOG_EXEC,
   "Trace %d: %p [" TARGET_FMT_lx
   "/" TARGET_FMT_lx "/%08x/%08x] %s\n",
@@ -400,7 +398,9 @@ const void *HELPER(lookup_tb_ptr)(CPUArchState *env)
 return tcg_code_gen_epilogue;
 }
 
-log_cpu_exec(pc, cpu, tb);
+if (qemu_loglevel_mask(CPU_LOG_TB_CPU | CPU_LOG_EXEC)) {
+log_cpu_exec(pc, cpu, tb);
+}
 
 return tb->tc.ptr;
 }
@@ -423,7 +423,9 @@ cpu_tb_exec(CPUState *cpu, TranslationBlock *itb, int 
*tb_exit)
 TranslationBlock *last_tb;
 const void *tb_ptr = itb->tc.ptr;
 
-log_cpu_exec(itb->pc, cpu, itb);
+if (qemu_loglevel_mask(CPU_LOG_TB_CPU | CPU_LOG_EXEC)) {
+log_cpu_exec(log_pc(cpu, itb), cpu, itb);
+}
 
 qemu_thread_jit_execute();
 ret = tcg_qemu_tb_exec(env, tb_ptr);
@@ -447,16 +449,20 @@ cpu_tb_exec(CPUState *cpu, TranslationBlock *itb, int 
*tb_exit)
  * of the start of the TB.
  */
 CPUClass *cc = CPU_GET_CLASS(cpu);
-qemu_log_mask_and_addr(CPU_LOG_EXEC, last_tb->pc,
-   "Stopped exe

[PATCH v6 18/18] accel/tcg: Introduce TARGET_TB_PCREL

2022-09-30 Thread Richard Henderson
Prepare for targets to be able to produce TBs that can
run in more than one virtual context.

Signed-off-by: Richard Henderson 
---
 accel/tcg/internal.h  |  4 +++
 accel/tcg/tb-jmp-cache.h  |  5 
 include/exec/cpu-defs.h   |  3 +++
 include/exec/exec-all.h   | 32 --
 accel/tcg/cpu-exec.c  | 56 +++
 accel/tcg/translate-all.c | 50 +-
 6 files changed, 119 insertions(+), 31 deletions(-)

diff --git a/accel/tcg/internal.h b/accel/tcg/internal.h
index a3875a3b5a..dc800fd485 100644
--- a/accel/tcg/internal.h
+++ b/accel/tcg/internal.h
@@ -21,7 +21,11 @@ void tb_htable_init(void);
 /* Return the current PC from CPU, which may be cached in TB. */
 static inline target_ulong log_pc(CPUState *cpu, const TranslationBlock *tb)
 {
+#if TARGET_TB_PCREL
+return cpu->cc->get_pc(cpu);
+#else
 return tb_pc(tb);
+#endif
 }
 
 #endif /* ACCEL_TCG_INTERNAL_H */
diff --git a/accel/tcg/tb-jmp-cache.h b/accel/tcg/tb-jmp-cache.h
index 2d8fbb1bfe..a7288150bc 100644
--- a/accel/tcg/tb-jmp-cache.h
+++ b/accel/tcg/tb-jmp-cache.h
@@ -14,10 +14,15 @@
 
 /*
  * Accessed in parallel; all accesses to 'tb' must be atomic.
+ * For TARGET_TB_PCREL, accesses to 'pc' must be protected by
+ * a load_acquire/store_release to 'tb'.
  */
 struct CPUJumpCache {
 struct {
 TranslationBlock *tb;
+#if TARGET_TB_PCREL
+target_ulong pc;
+#endif
 } array[TB_JMP_CACHE_SIZE];
 };
 
diff --git a/include/exec/cpu-defs.h b/include/exec/cpu-defs.h
index 67239b4e5e..21309cf567 100644
--- a/include/exec/cpu-defs.h
+++ b/include/exec/cpu-defs.h
@@ -54,6 +54,9 @@
 #  error TARGET_PAGE_BITS must be defined in cpu-param.h
 # endif
 #endif
+#ifndef TARGET_TB_PCREL
+# define TARGET_TB_PCREL 0
+#endif
 
 #define TARGET_LONG_SIZE (TARGET_LONG_BITS / 8)
 
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 7ea6026ba9..e5f8b224a5 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -496,8 +496,32 @@ struct tb_tc {
 };
 
 struct TranslationBlock {
-target_ulong pc;   /* simulated PC corresponding to this block (EIP + CS 
base) */
-target_ulong cs_base; /* CS base for this block */
+#if !TARGET_TB_PCREL
+/*
+ * Guest PC corresponding to this block.  This must be the true
+ * virtual address.  Therefore e.g. x86 stores EIP + CS_BASE, and
+ * targets like Arm, MIPS, HP-PA, which reuse low bits for ISA or
+ * privilege, must store those bits elsewhere.
+ *
+ * If TARGET_TB_PCREL, the opcodes for the TranslationBlock are
+ * written such that the TB is associated only with the physical
+ * page and may be run in any virtual address context.  In this case,
+ * PC must always be taken from ENV in a target-specific manner.
+ * Unwind information is taken as offsets from the page, to be
+ * deposited into the "current" PC.
+ */
+target_ulong pc;
+#endif
+
+/*
+ * Target-specific data associated with the TranslationBlock, e.g.:
+ * x86: the original user, the Code Segment virtual base,
+ * arm: an extension of tb->flags,
+ * s390x: instruction data for EXECUTE,
+ * sparc: the next pc of the instruction queue (for delay slots).
+ */
+target_ulong cs_base;
+
 uint32_t flags; /* flags defining in which context the code was generated 
*/
 uint32_t cflags;/* compile flags */
 
@@ -573,7 +597,11 @@ struct TranslationBlock {
 /* Hide the read to avoid ifdefs for TARGET_TB_PCREL. */
 static inline target_ulong tb_pc(const TranslationBlock *tb)
 {
+#if TARGET_TB_PCREL
+qemu_build_not_reached();
+#else
 return tb->pc;
+#endif
 }
 
 /* Hide the qatomic_read to make code a little easier on the eyes */
diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index 8b3f8435fb..acb5646b03 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -186,7 +186,7 @@ static bool tb_lookup_cmp(const void *p, const void *d)
 const TranslationBlock *tb = p;
 const struct tb_desc *desc = d;
 
-if (tb_pc(tb) == desc->pc &&
+if ((TARGET_TB_PCREL || tb_pc(tb) == desc->pc) &&
 tb->page_addr[0] == desc->page_addr0 &&
 tb->cs_base == desc->cs_base &&
 tb->flags == desc->flags &&
@@ -237,7 +237,8 @@ static TranslationBlock *tb_htable_lookup(CPUState *cpu, 
target_ulong pc,
 return NULL;
 }
 desc.page_addr0 = phys_pc;
-h = tb_hash_func(phys_pc, pc, flags, cflags, *cpu->trace_dstate);
+h = tb_hash_func(phys_pc, (TARGET_TB_PCREL ? 0 : pc),
+ flags, cflags, *cpu->trace_dstate);
 return qht_lookup_custom(&tb_ctx.htable, &desc, h, tb_lookup_cmp);
 }
 
@@ -247,27 +248,52 @@ static inline TranslationBlock *tb_lookup(CPUState *cpu, 
target_ulong pc,
   uint32_t flags, uint32_t cflags)
 {
 TranslationBlock *tb;
+CPUJumpCache *jc;
 uint32_t hash;
 
 /* we should never be trying to look up an INVALID 

[PATCH v6 10/18] accel/tcg: Remove PageDesc code_bitmap

2022-09-30 Thread Richard Henderson
This bitmap is created and discarded immediately.
We gain nothing by its existence.

Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
Message-Id: <20220822232338.1727934-2-richard.hender...@linaro.org>
---
 accel/tcg/translate-all.c | 78 ++-
 1 file changed, 4 insertions(+), 74 deletions(-)

diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index d71d04d338..59432dc558 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -102,21 +102,14 @@
 #define assert_memory_lock() tcg_debug_assert(have_mmap_lock())
 #endif
 
-#define SMC_BITMAP_USE_THRESHOLD 10
-
 typedef struct PageDesc {
 /* list of TBs intersecting this ram page */
 uintptr_t first_tb;
-#ifdef CONFIG_SOFTMMU
-/* in order to optimize self modifying code, we count the number
-   of lookups we do to a given page to use a bitmap */
-unsigned long *code_bitmap;
-unsigned int code_write_count;
-#else
+#ifdef CONFIG_USER_ONLY
 unsigned long flags;
 void *target_data;
 #endif
-#ifndef CONFIG_USER_ONLY
+#ifdef CONFIG_SOFTMMU
 QemuSpin lock;
 #endif
 } PageDesc;
@@ -907,17 +900,6 @@ void tb_htable_init(void)
 qht_init(&tb_ctx.htable, tb_cmp, CODE_GEN_HTABLE_SIZE, mode);
 }
 
-/* call with @p->lock held */
-static inline void invalidate_page_bitmap(PageDesc *p)
-{
-assert_page_locked(p);
-#ifdef CONFIG_SOFTMMU
-g_free(p->code_bitmap);
-p->code_bitmap = NULL;
-p->code_write_count = 0;
-#endif
-}
-
 /* Set to NULL all the 'first_tb' fields in all PageDescs. */
 static void page_flush_tb_1(int level, void **lp)
 {
@@ -932,7 +914,6 @@ static void page_flush_tb_1(int level, void **lp)
 for (i = 0; i < V_L2_SIZE; ++i) {
 page_lock(&pd[i]);
 pd[i].first_tb = (uintptr_t)NULL;
-invalidate_page_bitmap(pd + i);
 page_unlock(&pd[i]);
 }
 } else {
@@ -1197,11 +1178,9 @@ static void do_tb_phys_invalidate(TranslationBlock *tb, 
bool rm_from_page_list)
 if (rm_from_page_list) {
 p = page_find(tb->page_addr[0] >> TARGET_PAGE_BITS);
 tb_page_remove(p, tb);
-invalidate_page_bitmap(p);
 if (tb->page_addr[1] != -1) {
 p = page_find(tb->page_addr[1] >> TARGET_PAGE_BITS);
 tb_page_remove(p, tb);
-invalidate_page_bitmap(p);
 }
 }
 
@@ -1246,35 +1225,6 @@ void tb_phys_invalidate(TranslationBlock *tb, 
tb_page_addr_t page_addr)
 }
 }
 
-#ifdef CONFIG_SOFTMMU
-/* call with @p->lock held */
-static void build_page_bitmap(PageDesc *p)
-{
-int n, tb_start, tb_end;
-TranslationBlock *tb;
-
-assert_page_locked(p);
-p->code_bitmap = bitmap_new(TARGET_PAGE_SIZE);
-
-PAGE_FOR_EACH_TB(p, tb, n) {
-/* NOTE: this is subtle as a TB may span two physical pages */
-if (n == 0) {
-/* NOTE: tb_end may be after the end of the page, but
-   it is not a problem */
-tb_start = tb->pc & ~TARGET_PAGE_MASK;
-tb_end = tb_start + tb->size;
-if (tb_end > TARGET_PAGE_SIZE) {
-tb_end = TARGET_PAGE_SIZE;
- }
-} else {
-tb_start = 0;
-tb_end = ((tb->pc + tb->size) & ~TARGET_PAGE_MASK);
-}
-bitmap_set(p->code_bitmap, tb_start, tb_end - tb_start);
-}
-}
-#endif
-
 /* add the tb in the target page and protect it if necessary
  *
  * Called with mmap_lock held for user-mode emulation.
@@ -1295,7 +1245,6 @@ static inline void tb_page_add(PageDesc *p, 
TranslationBlock *tb,
 page_already_protected = p->first_tb != (uintptr_t)NULL;
 #endif
 p->first_tb = (uintptr_t)tb | n;
-invalidate_page_bitmap(p);
 
 #if defined(CONFIG_USER_ONLY)
 /* translator_loop() must have made all TB pages non-writable */
@@ -1357,10 +1306,8 @@ tb_link_page(TranslationBlock *tb, tb_page_addr_t 
phys_pc,
 /* remove TB from the page(s) if we couldn't insert it */
 if (unlikely(existing_tb)) {
 tb_page_remove(p, tb);
-invalidate_page_bitmap(p);
 if (p2) {
 tb_page_remove(p2, tb);
-invalidate_page_bitmap(p2);
 }
 tb = existing_tb;
 }
@@ -1731,7 +1678,6 @@ tb_invalidate_phys_page_range__locked(struct 
page_collection *pages,
 #if !defined(CONFIG_USER_ONLY)
 /* if no code remaining, no need to continue to use slow writes */
 if (!p->first_tb) {
-invalidate_page_bitmap(p);
 tlb_unprotect_code(start);
 }
 #endif
@@ -1827,24 +1773,8 @@ void tb_invalidate_phys_page_fast(struct page_collection 
*pages,
 }
 
 assert_page_locked(p);
-if (!p->code_bitmap &&
-++p->code_write_count >= SMC_BITMAP_USE_THRESHOLD) {
-build_page_bitmap(p);
-}
-if (p->code_bitmap) {
-unsigned int nr;
-unsigned long b;
-
-nr = start & ~TARGET_PAGE_MASK;
-b = p->code_bitmap[BIT_WORD(nr)] >> (nr & (BITS_PER_LONG - 1));
-if (b & ((1

[PATCH v6 16/18] hw/core: Add CPUClass.get_pc

2022-09-30 Thread Richard Henderson
Populate this new method for all targets.  Always match
the result that would be given by cpu_get_tb_cpu_state,
as we will want these values to correspond in the logs.

Signed-off-by: Richard Henderson 
---
Cc: Eduardo Habkost  (supporter:Machine core)
Cc: Marcel Apfelbaum  (supporter:Machine core)
Cc: "Philippe Mathieu-Daudé"  (reviewer:Machine core)
Cc: Yanan Wang  (reviewer:Machine core)
Cc: Michael Rolnik  (maintainer:AVR TCG CPUs)
Cc: "Edgar E. Iglesias"  (maintainer:CRIS TCG CPUs)
Cc: Taylor Simpson  (supporter:Hexagon TCG CPUs)
Cc: Song Gao  (maintainer:LoongArch TCG CPUs)
Cc: Xiaojuan Yang  (maintainer:LoongArch TCG CPUs)
Cc: Laurent Vivier  (maintainer:M68K TCG CPUs)
Cc: Jiaxun Yang  (reviewer:MIPS TCG CPUs)
Cc: Aleksandar Rikalo  (reviewer:MIPS TCG CPUs)
Cc: Chris Wulff  (maintainer:NiosII TCG CPUs)
Cc: Marek Vasut  (maintainer:NiosII TCG CPUs)
Cc: Stafford Horne  (odd fixer:OpenRISC TCG CPUs)
Cc: Yoshinori Sato  (reviewer:RENESAS RX CPUs)
Cc: Mark Cave-Ayland  (maintainer:SPARC TCG CPUs)
Cc: Bastian Koppelmann  (maintainer:TriCore TCG 
CPUs)
Cc: Max Filippov  (maintainer:Xtensa TCG CPUs)
Cc: qemu-...@nongnu.org (open list:ARM TCG CPUs)
Cc: qemu-...@nongnu.org (open list:PowerPC TCG CPUs)
Cc: qemu-ri...@nongnu.org (open list:RISC-V TCG CPUs)
Cc: qemu-s3...@nongnu.org (open list:S390 TCG CPUs)
---
 include/hw/core/cpu.h   |  3 +++
 target/alpha/cpu.c  |  9 +
 target/arm/cpu.c| 13 +
 target/avr/cpu.c|  8 
 target/cris/cpu.c   |  8 
 target/hexagon/cpu.c|  8 
 target/hppa/cpu.c   |  8 
 target/i386/cpu.c   |  9 +
 target/loongarch/cpu.c  |  9 +
 target/m68k/cpu.c   |  8 
 target/microblaze/cpu.c |  8 
 target/mips/cpu.c   |  8 
 target/nios2/cpu.c  |  9 +
 target/openrisc/cpu.c   |  8 
 target/ppc/cpu_init.c   |  8 
 target/riscv/cpu.c  | 13 +
 target/rx/cpu.c |  8 
 target/s390x/cpu.c  |  8 
 target/sh4/cpu.c|  8 
 target/sparc/cpu.c  |  8 
 target/tricore/cpu.c|  9 +
 target/xtensa/cpu.c |  8 
 22 files changed, 186 insertions(+)

diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index 18ca701b44..f9b58773f7 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -115,6 +115,8 @@ struct SysemuCPUOps;
  *   If the target behaviour here is anything other than "set
  *   the PC register to the value passed in" then the target must
  *   also implement the synchronize_from_tb hook.
+ * @get_pc: Callback for getting the Program Counter register.
+ *   As above, with the semantics of the target architecture.
  * @gdb_read_register: Callback for letting GDB read a register.
  * @gdb_write_register: Callback for letting GDB write a register.
  * @gdb_adjust_breakpoint: Callback for adjusting the address of a
@@ -151,6 +153,7 @@ struct CPUClass {
 void (*dump_state)(CPUState *cpu, FILE *, int flags);
 int64_t (*get_arch_id)(CPUState *cpu);
 void (*set_pc)(CPUState *cpu, vaddr value);
+vaddr (*get_pc)(CPUState *cpu);
 int (*gdb_read_register)(CPUState *cpu, GByteArray *buf, int reg);
 int (*gdb_write_register)(CPUState *cpu, uint8_t *buf, int reg);
 vaddr (*gdb_adjust_breakpoint)(CPUState *cpu, vaddr addr);
diff --git a/target/alpha/cpu.c b/target/alpha/cpu.c
index a8990d401b..979a629d59 100644
--- a/target/alpha/cpu.c
+++ b/target/alpha/cpu.c
@@ -33,6 +33,14 @@ static void alpha_cpu_set_pc(CPUState *cs, vaddr value)
 cpu->env.pc = value;
 }
 
+static vaddr alpha_cpu_get_pc(CPUState *cs)
+{
+AlphaCPU *cpu = ALPHA_CPU(cs);
+
+return cpu->env.pc;
+}
+
+
 static bool alpha_cpu_has_work(CPUState *cs)
 {
 /* Here we are checking to see if the CPU should wake up from HALT.
@@ -244,6 +252,7 @@ static void alpha_cpu_class_init(ObjectClass *oc, void 
*data)
 cc->has_work = alpha_cpu_has_work;
 cc->dump_state = alpha_cpu_dump_state;
 cc->set_pc = alpha_cpu_set_pc;
+cc->get_pc = alpha_cpu_get_pc;
 cc->gdb_read_register = alpha_cpu_gdb_read_register;
 cc->gdb_write_register = alpha_cpu_gdb_write_register;
 #ifndef CONFIG_USER_ONLY
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 7ec3281da9..fa67ba6647 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -60,6 +60,18 @@ static void arm_cpu_set_pc(CPUState *cs, vaddr value)
 }
 }
 
+static vaddr arm_cpu_get_pc(CPUState *cs)
+{
+ARMCPU *cpu = ARM_CPU(cs);
+CPUARMState *env = &cpu->env;
+
+if (is_a64(env)) {
+return env->pc;
+} else {
+return env->regs[15];
+}
+}
+
 #ifdef CONFIG_TCG
 void arm_cpu_synchronize_from_tb(CPUState *cs,
  const TranslationBlock *tb)
@@ -2172,6 +2184,7 @@ static void arm_cpu_class_init(ObjectClass *oc, void 
*data)
 cc->has_work = arm_cpu_has_work;
 cc->dump_state = arm_cpu_dump_state;
 cc->set_pc =

[PATCH v6 15/18] include/hw/core: Create struct CPUJumpCache

2022-09-30 Thread Richard Henderson
Wrap the bare TranslationBlock pointer into a structure.

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 accel/tcg/tb-hash.h   |  1 +
 accel/tcg/tb-jmp-cache.h  | 24 
 include/exec/cpu-common.h |  1 +
 include/hw/core/cpu.h | 15 +--
 include/qemu/typedefs.h   |  1 +
 accel/tcg/cpu-exec.c  | 10 +++---
 accel/tcg/cputlb.c|  9 +
 accel/tcg/translate-all.c | 28 +---
 hw/core/cpu-common.c  |  3 +--
 plugins/core.c|  2 +-
 trace/control-target.c|  2 +-
 11 files changed, 68 insertions(+), 28 deletions(-)
 create mode 100644 accel/tcg/tb-jmp-cache.h

diff --git a/accel/tcg/tb-hash.h b/accel/tcg/tb-hash.h
index 0a273d9605..83dc610e4c 100644
--- a/accel/tcg/tb-hash.h
+++ b/accel/tcg/tb-hash.h
@@ -23,6 +23,7 @@
 #include "exec/cpu-defs.h"
 #include "exec/exec-all.h"
 #include "qemu/xxhash.h"
+#include "tb-jmp-cache.h"
 
 #ifdef CONFIG_SOFTMMU
 
diff --git a/accel/tcg/tb-jmp-cache.h b/accel/tcg/tb-jmp-cache.h
new file mode 100644
index 00..2d8fbb1bfe
--- /dev/null
+++ b/accel/tcg/tb-jmp-cache.h
@@ -0,0 +1,24 @@
+/*
+ * The per-CPU TranslationBlock jump cache.
+ *
+ *  Copyright (c) 2003 Fabrice Bellard
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#ifndef ACCEL_TCG_TB_JMP_CACHE_H
+#define ACCEL_TCG_TB_JMP_CACHE_H
+
+#define TB_JMP_CACHE_BITS 12
+#define TB_JMP_CACHE_SIZE (1 << TB_JMP_CACHE_BITS)
+
+/*
+ * Accessed in parallel; all accesses to 'tb' must be atomic.
+ */
+struct CPUJumpCache {
+struct {
+TranslationBlock *tb;
+} array[TB_JMP_CACHE_SIZE];
+};
+
+#endif /* ACCEL_TCG_TB_JMP_CACHE_H */
diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
index d909429427..c493510ee9 100644
--- a/include/exec/cpu-common.h
+++ b/include/exec/cpu-common.h
@@ -38,6 +38,7 @@ void cpu_list_unlock(void);
 unsigned int cpu_list_generation_id_get(void);
 
 void tcg_flush_softmmu_tlb(CPUState *cs);
+void tcg_flush_jmp_cache(CPUState *cs);
 
 void tcg_iommu_init_notifier_list(CPUState *cpu);
 void tcg_iommu_free_notifier_list(CPUState *cpu);
diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index 009dc0d336..18ca701b44 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -236,9 +236,6 @@ struct kvm_run;
 struct hax_vcpu_state;
 struct hvf_vcpu_state;
 
-#define TB_JMP_CACHE_BITS 12
-#define TB_JMP_CACHE_SIZE (1 << TB_JMP_CACHE_BITS)
-
 /* work queue */
 
 /* The union type allows passing of 64 bit target pointers on 32 bit
@@ -369,8 +366,7 @@ struct CPUState {
 CPUArchState *env_ptr;
 IcountDecr *icount_decr_ptr;
 
-/* Accessed in parallel; all accesses must be atomic */
-TranslationBlock *tb_jmp_cache[TB_JMP_CACHE_SIZE];
+CPUJumpCache *tb_jmp_cache;
 
 struct GDBRegisterState *gdb_regs;
 int gdb_num_regs;
@@ -456,15 +452,6 @@ extern CPUTailQ cpus;
 
 extern __thread CPUState *current_cpu;
 
-static inline void cpu_tb_jmp_cache_clear(CPUState *cpu)
-{
-unsigned int i;
-
-for (i = 0; i < TB_JMP_CACHE_SIZE; i++) {
-qatomic_set(&cpu->tb_jmp_cache[i], NULL);
-}
-}
-
 /**
  * qemu_tcg_mttcg_enabled:
  * Check whether we are running MultiThread TCG or not.
diff --git a/include/qemu/typedefs.h b/include/qemu/typedefs.h
index 42f4ceb701..5449aaf483 100644
--- a/include/qemu/typedefs.h
+++ b/include/qemu/typedefs.h
@@ -41,6 +41,7 @@ typedef struct CoMutex CoMutex;
 typedef struct ConfidentialGuestSupport ConfidentialGuestSupport;
 typedef struct CPUAddressSpace CPUAddressSpace;
 typedef struct CPUArchState CPUArchState;
+typedef struct CPUJumpCache CPUJumpCache;
 typedef struct CPUState CPUState;
 typedef struct DeviceListener DeviceListener;
 typedef struct DeviceState DeviceState;
diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index dd58a144a8..2d7e610ee2 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -42,6 +42,7 @@
 #include "sysemu/replay.h"
 #include "sysemu/tcg.h"
 #include "exec/helper-proto.h"
+#include "tb-jmp-cache.h"
 #include "tb-hash.h"
 #include "tb-context.h"
 #include "internal.h"
@@ -252,7 +253,7 @@ static inline TranslationBlock *tb_lookup(CPUState *cpu, 
target_ulong pc,
 tcg_debug_assert(!(cflags & CF_INVALID));
 
 hash = tb_jmp_cache_hash_func(pc);
-tb = qatomic_rcu_read(&cpu->tb_jmp_cache[hash]);
+tb = qatomic_rcu_read(&cpu->tb_jmp_cache->array[hash].tb);
 
 if (likely(tb &&
tb->pc == pc &&
@@ -266,7 +267,7 @@ static inline TranslationBlock *tb_lookup(CPUState *cpu, 
target_ulong pc,
 if (tb == NULL) {
 return NULL;
 }
-qatomic_set(&cpu->tb_jmp_cache[hash], tb);
+qatomic_set(&cpu->tb_jmp_cache->array[hash].tb, tb);
 return tb;
 }
 
@@ -987,6 +988,8 @@ int cpu_exec(CPUState *cpu)
 
 tb = tb_lookup(cpu, pc, cs_base, flags, cflags);
 if (tb == NULL) {
+uint32_t h;
+
 mmap_lock();
 tb = tb_gen_code(cpu, pc, cs

[PATCH v6 09/18] include/exec: Introduce TARGET_PAGE_ENTRY_EXTRA

2022-09-30 Thread Richard Henderson
Allow the target to cache items from the guest page tables.

Reviewed-by: Alex Bennée 
Reviewed-by: Peter Maydell 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 include/exec/cpu-defs.h | 9 +
 1 file changed, 9 insertions(+)

diff --git a/include/exec/cpu-defs.h b/include/exec/cpu-defs.h
index 5e12cc1854..67239b4e5e 100644
--- a/include/exec/cpu-defs.h
+++ b/include/exec/cpu-defs.h
@@ -163,6 +163,15 @@ typedef struct CPUTLBEntryFull {
 
 /* @lg_page_size contains the log2 of the page size. */
 uint8_t lg_page_size;
+
+/*
+ * Allow target-specific additions to this structure.
+ * This may be used to cache items from the guest cpu
+ * page tables for later use by the implementation.
+ */
+#ifdef TARGET_PAGE_ENTRY_EXTRA
+TARGET_PAGE_ENTRY_EXTRA
+#endif
 } CPUTLBEntryFull;
 
 /*
-- 
2.34.1




[PATCH v6 08/18] accel/tcg: Introduce tlb_set_page_full

2022-09-30 Thread Richard Henderson
Now that we have collected all of the page data into
CPUTLBEntryFull, provide an interface to record it
all in one go, instead of using 4 arguments.  This interface
allows CPUTLBEntryFull to be extended without having to
change the number of arguments.

Reviewed-by: Alex Bennée 
Reviewed-by: Peter Maydell 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 include/exec/cpu-defs.h | 14 +++
 include/exec/exec-all.h | 22 ++
 accel/tcg/cputlb.c  | 51 ++---
 3 files changed, 69 insertions(+), 18 deletions(-)

diff --git a/include/exec/cpu-defs.h b/include/exec/cpu-defs.h
index f70f54d850..5e12cc1854 100644
--- a/include/exec/cpu-defs.h
+++ b/include/exec/cpu-defs.h
@@ -148,7 +148,21 @@ typedef struct CPUTLBEntryFull {
  * + the offset within the target MemoryRegion (otherwise)
  */
 hwaddr xlat_section;
+
+/*
+ * @phys_addr contains the physical address in the address space
+ * given by cpu_asidx_from_attrs(cpu, @attrs).
+ */
+hwaddr phys_addr;
+
+/* @attrs contains the memory transaction attributes for the page. */
 MemTxAttrs attrs;
+
+/* @prot contains the complete protections for the page. */
+uint8_t prot;
+
+/* @lg_page_size contains the log2 of the page size. */
+uint8_t lg_page_size;
 } CPUTLBEntryFull;
 
 /*
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index d255d69bc1..b1b920a713 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -257,6 +257,28 @@ void tlb_flush_range_by_mmuidx_all_cpus_synced(CPUState 
*cpu,
uint16_t idxmap,
unsigned bits);
 
+/**
+ * tlb_set_page_full:
+ * @cpu: CPU context
+ * @mmu_idx: mmu index of the tlb to modify
+ * @vaddr: virtual address of the entry to add
+ * @full: the details of the tlb entry
+ *
+ * Add an entry to @cpu tlb index @mmu_idx.  All of the fields of
+ * @full must be filled, except for xlat_section, and constitute
+ * the complete description of the translated page.
+ *
+ * This is generally called by the target tlb_fill function after
+ * having performed a successful page table walk to find the physical
+ * address and attributes for the translation.
+ *
+ * At most one entry for a given virtual address is permitted. Only a
+ * single TARGET_PAGE_SIZE region is mapped; @full->lg_page_size is only
+ * used by tlb_flush_page.
+ */
+void tlb_set_page_full(CPUState *cpu, int mmu_idx, target_ulong vaddr,
+   CPUTLBEntryFull *full);
+
 /**
  * tlb_set_page_with_attrs:
  * @cpu: CPU to add this TLB entry for
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index e3ee4260bd..361078471b 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -1095,16 +1095,16 @@ static void tlb_add_large_page(CPUArchState *env, int 
mmu_idx,
 env_tlb(env)->d[mmu_idx].large_page_mask = lp_mask;
 }
 
-/* Add a new TLB entry. At most one entry for a given virtual address
+/*
+ * Add a new TLB entry. At most one entry for a given virtual address
  * is permitted. Only a single TARGET_PAGE_SIZE region is mapped, the
  * supplied size is only used by tlb_flush_page.
  *
  * Called from TCG-generated code, which is under an RCU read-side
  * critical section.
  */
-void tlb_set_page_with_attrs(CPUState *cpu, target_ulong vaddr,
- hwaddr paddr, MemTxAttrs attrs, int prot,
- int mmu_idx, target_ulong size)
+void tlb_set_page_full(CPUState *cpu, int mmu_idx,
+   target_ulong vaddr, CPUTLBEntryFull *full)
 {
 CPUArchState *env = cpu->env_ptr;
 CPUTLB *tlb = env_tlb(env);
@@ -1117,35 +1117,36 @@ void tlb_set_page_with_attrs(CPUState *cpu, 
target_ulong vaddr,
 CPUTLBEntry *te, tn;
 hwaddr iotlb, xlat, sz, paddr_page;
 target_ulong vaddr_page;
-int asidx = cpu_asidx_from_attrs(cpu, attrs);
-int wp_flags;
+int asidx, wp_flags, prot;
 bool is_ram, is_romd;
 
 assert_cpu_is_self(cpu);
 
-if (size <= TARGET_PAGE_SIZE) {
+if (full->lg_page_size <= TARGET_PAGE_BITS) {
 sz = TARGET_PAGE_SIZE;
 } else {
-tlb_add_large_page(env, mmu_idx, vaddr, size);
-sz = size;
+sz = (hwaddr)1 << full->lg_page_size;
+tlb_add_large_page(env, mmu_idx, vaddr, sz);
 }
 vaddr_page = vaddr & TARGET_PAGE_MASK;
-paddr_page = paddr & TARGET_PAGE_MASK;
+paddr_page = full->phys_addr & TARGET_PAGE_MASK;
 
+prot = full->prot;
+asidx = cpu_asidx_from_attrs(cpu, full->attrs);
 section = address_space_translate_for_iotlb(cpu, asidx, paddr_page,
-&xlat, &sz, attrs, &prot);
+&xlat, &sz, full->attrs, 
&prot);
 assert(sz >= TARGET_PAGE_SIZE);
 
 tlb_debug("vaddr=" TARGET_FMT_lx " paddr=0x" TARGET_FMT_plx
   " prot=%x idx=

[PATCH v6 11/18] accel/tcg: Use bool for page_find_alloc

2022-09-30 Thread Richard Henderson
Bool is a more appropriate type for the alloc parameter.

Reviewed-by: Alex Bennée 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 accel/tcg/translate-all.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index 59432dc558..ca685f6ede 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -465,7 +465,7 @@ void page_init(void)
 #endif
 }
 
-static PageDesc *page_find_alloc(tb_page_addr_t index, int alloc)
+static PageDesc *page_find_alloc(tb_page_addr_t index, bool alloc)
 {
 PageDesc *pd;
 void **lp;
@@ -533,11 +533,11 @@ static PageDesc *page_find_alloc(tb_page_addr_t index, 
int alloc)
 
 static inline PageDesc *page_find(tb_page_addr_t index)
 {
-return page_find_alloc(index, 0);
+return page_find_alloc(index, false);
 }
 
 static void page_lock_pair(PageDesc **ret_p1, tb_page_addr_t phys1,
-   PageDesc **ret_p2, tb_page_addr_t phys2, int alloc);
+   PageDesc **ret_p2, tb_page_addr_t phys2, bool 
alloc);
 
 /* In user-mode page locks aren't used; mmap_lock is enough */
 #ifdef CONFIG_USER_ONLY
@@ -651,7 +651,7 @@ static inline void page_unlock(PageDesc *pd)
 /* lock the page(s) of a TB in the correct acquisition order */
 static inline void page_lock_tb(const TranslationBlock *tb)
 {
-page_lock_pair(NULL, tb->page_addr[0], NULL, tb->page_addr[1], 0);
+page_lock_pair(NULL, tb->page_addr[0], NULL, tb->page_addr[1], false);
 }
 
 static inline void page_unlock_tb(const TranslationBlock *tb)
@@ -840,7 +840,7 @@ void page_collection_unlock(struct page_collection *set)
 #endif /* !CONFIG_USER_ONLY */
 
 static void page_lock_pair(PageDesc **ret_p1, tb_page_addr_t phys1,
-   PageDesc **ret_p2, tb_page_addr_t phys2, int alloc)
+   PageDesc **ret_p2, tb_page_addr_t phys2, bool alloc)
 {
 PageDesc *p1, *p2;
 tb_page_addr_t page1;
@@ -1290,7 +1290,7 @@ tb_link_page(TranslationBlock *tb, tb_page_addr_t phys_pc,
  * Note that inserting into the hash table first isn't an option, since
  * we can only insert TBs that are fully initialized.
  */
-page_lock_pair(&p, phys_pc, &p2, phys_page2, 1);
+page_lock_pair(&p, phys_pc, &p2, phys_page2, true);
 tb_page_add(p, tb, 0, phys_pc & TARGET_PAGE_MASK);
 if (p2) {
 tb_page_add(p2, tb, 1, phys_page2);
@@ -2219,7 +2219,7 @@ void page_set_flags(target_ulong start, target_ulong end, 
int flags)
 for (addr = start, len = end - start;
  len != 0;
  len -= TARGET_PAGE_SIZE, addr += TARGET_PAGE_SIZE) {
-PageDesc *p = page_find_alloc(addr >> TARGET_PAGE_BITS, 1);
+PageDesc *p = page_find_alloc(addr >> TARGET_PAGE_BITS, true);
 
 /* If the write protection bit is set, then we invalidate
the code inside.  */
-- 
2.34.1




[PATCH v6 02/18] hw/core/cpu-sysemu: used cached class in cpu_asidx_from_attrs

2022-09-30 Thread Richard Henderson
From: Alex Bennée 

This is a heavily used function, so let's avoid the cost of
CPU_GET_CLASS. On the romulus-bmc run it has a modest effect:

  Before: 36.812 s ±  0.506 s
  After:  35.912 s ±  0.168 s
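
[Editorial sketch] The underlying idea — resolve the expensive class lookup once at realize time and read a cached value on hot paths — can be illustrated abstractly. This is Python, purely illustrative; all names below are invented for the demo and are not QEMU APIs:

```python
lookups = {"count": 0}

def expensive_class_lookup(obj):
    # Stands in for CPU_GET_CLASS()/object_class_dynamic_cast_assert(),
    # which is costly and returns the same class for the object's lifetime.
    lookups["count"] += 1
    return type(obj)

class CpuStateDemo:
    def __init__(self):
        self.cc = None          # cached class, like CPUState::cc

def realize(cpu):
    # Analogous to cpu_exec_realizefn(): resolve once, up front.
    cpu.cc = expensive_class_lookup(cpu)

def hot_path_op(cpu):
    # The hot path reads the cached value instead of re-resolving it.
    return cpu.cc

cpu = CpuStateDemo()
realize(cpu)
results = [hot_path_op(cpu) for _ in range(1000)]
```

After a thousand hot-path calls the expensive lookup has still run exactly once, which is the entire effect these patches aim for.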

Signed-off-by: Alex Bennée 
Reviewed-by: Richard Henderson 
Message-Id: <20220811151413.3350684-4-alex.ben...@linaro.org>
Signed-off-by: Cédric Le Goater 
Message-Id: <20220923084803.498337-4-...@kaod.org>
Signed-off-by: Richard Henderson 
---
 hw/core/cpu-sysemu.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/hw/core/cpu-sysemu.c b/hw/core/cpu-sysemu.c
index 00253f8929..5eaf2e79e6 100644
--- a/hw/core/cpu-sysemu.c
+++ b/hw/core/cpu-sysemu.c
@@ -69,11 +69,10 @@ hwaddr cpu_get_phys_page_debug(CPUState *cpu, vaddr addr)
 
 int cpu_asidx_from_attrs(CPUState *cpu, MemTxAttrs attrs)
 {
-CPUClass *cc = CPU_GET_CLASS(cpu);
 int ret = 0;
 
-if (cc->sysemu_ops->asidx_from_attrs) {
-ret = cc->sysemu_ops->asidx_from_attrs(cpu, attrs);
+if (cpu->cc->sysemu_ops->asidx_from_attrs) {
+ret = cpu->cc->sysemu_ops->asidx_from_attrs(cpu, attrs);
 assert(ret < cpu->num_ases && ret >= 0);
 }
 return ret;
-- 
2.34.1




[PATCH v6 04/18] accel/tcg: Rename CPUIOTLBEntry to CPUTLBEntryFull

2022-09-30 Thread Richard Henderson
This structure will shortly contain more than just
data for accessing MMIO.  Rename the 'addr' member
to 'xlat_section' to more clearly indicate its purpose.

Reviewed-by: Alex Bennée 
Reviewed-by: Peter Maydell 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 include/exec/cpu-defs.h|  22 
 accel/tcg/cputlb.c | 102 +++--
 target/arm/mte_helper.c|  14 ++---
 target/arm/sve_helper.c|   4 +-
 target/arm/translate-a64.c |   2 +-
 5 files changed, 73 insertions(+), 71 deletions(-)

diff --git a/include/exec/cpu-defs.h b/include/exec/cpu-defs.h
index ba3cd32a1e..f70f54d850 100644
--- a/include/exec/cpu-defs.h
+++ b/include/exec/cpu-defs.h
@@ -108,6 +108,7 @@ typedef uint64_t target_ulong;
 #  endif
 # endif
 
+/* Minimalized TLB entry for use by TCG fast path. */
 typedef struct CPUTLBEntry {
 /* bit TARGET_LONG_BITS to TARGET_PAGE_BITS : virtual address
bit TARGET_PAGE_BITS-1..4  : Nonzero for accesses that should not
@@ -131,14 +132,14 @@ typedef struct CPUTLBEntry {
 
 QEMU_BUILD_BUG_ON(sizeof(CPUTLBEntry) != (1 << CPU_TLB_ENTRY_BITS));
 
-/* The IOTLB is not accessed directly inline by generated TCG code,
- * so the CPUIOTLBEntry layout is not as critical as that of the
- * CPUTLBEntry. (This is also why we don't want to combine the two
- * structs into one.)
+/*
+ * The full TLB entry, which is not accessed by generated TCG code,
+ * so the layout is not as critical as that of CPUTLBEntry. This is
+ * also why we don't want to combine the two structs.
  */
-typedef struct CPUIOTLBEntry {
+typedef struct CPUTLBEntryFull {
 /*
- * @addr contains:
+ * @xlat_section contains:
  *  - in the lower TARGET_PAGE_BITS, a physical section number
  *  - with the lower TARGET_PAGE_BITS masked off, an offset which
  *must be added to the virtual address to obtain:
@@ -146,9 +147,9 @@ typedef struct CPUIOTLBEntry {
  *   number is PHYS_SECTION_NOTDIRTY or PHYS_SECTION_ROM)
  * + the offset within the target MemoryRegion (otherwise)
  */
-hwaddr addr;
+hwaddr xlat_section;
 MemTxAttrs attrs;
-} CPUIOTLBEntry;
+} CPUTLBEntryFull;
 
 /*
  * Data elements that are per MMU mode, minus the bits accessed by
@@ -172,9 +173,8 @@ typedef struct CPUTLBDesc {
 size_t vindex;
 /* The tlb victim table, in two parts.  */
 CPUTLBEntry vtable[CPU_VTLB_SIZE];
-CPUIOTLBEntry viotlb[CPU_VTLB_SIZE];
-/* The iotlb.  */
-CPUIOTLBEntry *iotlb;
+CPUTLBEntryFull vfulltlb[CPU_VTLB_SIZE];
+CPUTLBEntryFull *fulltlb;
 } CPUTLBDesc;
 
 /*
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 193bfc1cfc..aa22f578cb 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -200,13 +200,13 @@ static void tlb_mmu_resize_locked(CPUTLBDesc *desc, CPUTLBDescFast *fast,
 }
 
 g_free(fast->table);
-g_free(desc->iotlb);
+g_free(desc->fulltlb);
 
 tlb_window_reset(desc, now, 0);
 /* desc->n_used_entries is cleared by the caller */
 fast->mask = (new_size - 1) << CPU_TLB_ENTRY_BITS;
 fast->table = g_try_new(CPUTLBEntry, new_size);
-desc->iotlb = g_try_new(CPUIOTLBEntry, new_size);
+desc->fulltlb = g_try_new(CPUTLBEntryFull, new_size);
 
 /*
  * If the allocations fail, try smaller sizes. We just freed some
@@ -215,7 +215,7 @@ static void tlb_mmu_resize_locked(CPUTLBDesc *desc, CPUTLBDescFast *fast,
  * allocations to fail though, so we progressively reduce the allocation
  * size, aborting if we cannot even allocate the smallest TLB we support.
  */
-while (fast->table == NULL || desc->iotlb == NULL) {
+while (fast->table == NULL || desc->fulltlb == NULL) {
 if (new_size == (1 << CPU_TLB_DYN_MIN_BITS)) {
 error_report("%s: %s", __func__, strerror(errno));
 abort();
@@ -224,9 +224,9 @@ static void tlb_mmu_resize_locked(CPUTLBDesc *desc, CPUTLBDescFast *fast,
 fast->mask = (new_size - 1) << CPU_TLB_ENTRY_BITS;
 
 g_free(fast->table);
-g_free(desc->iotlb);
+g_free(desc->fulltlb);
 fast->table = g_try_new(CPUTLBEntry, new_size);
-desc->iotlb = g_try_new(CPUIOTLBEntry, new_size);
+desc->fulltlb = g_try_new(CPUTLBEntryFull, new_size);
 }
 }
 
@@ -258,7 +258,7 @@ static void tlb_mmu_init(CPUTLBDesc *desc, CPUTLBDescFast *fast, int64_t now)
 desc->n_used_entries = 0;
 fast->mask = (n_entries - 1) << CPU_TLB_ENTRY_BITS;
 fast->table = g_new(CPUTLBEntry, n_entries);
-desc->iotlb = g_new(CPUIOTLBEntry, n_entries);
+desc->fulltlb = g_new(CPUTLBEntryFull, n_entries);
 tlb_mmu_flush_locked(desc, fast);
 }
 
@@ -299,7 +299,7 @@ void tlb_destroy(CPUState *cpu)
 CPUTLBDescFast *fast = &env_tlb(env)->f[i];
 
 g_free(fast->table);
-g_free(desc->iotlb);
+g_free(desc->fulltlb);
 }
 }
 
@@ -1219,7 +1219,7 @@ void tlb_set_page_with_attrs(CPUState *cpu, target_ulong vaddr

[PATCH v6 07/18] accel/tcg: Introduce probe_access_full

2022-09-30 Thread Richard Henderson
Add an interface to return the CPUTLBEntryFull struct
that goes with the lookup.  The result is not intended
to be valid across multiple lookups, so the user must
use the results immediately.

Reviewed-by: Alex Bennée 
Reviewed-by: Peter Maydell 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 include/exec/exec-all.h | 15 +
 accel/tcg/cputlb.c  | 47 +
 2 files changed, 44 insertions(+), 18 deletions(-)

diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index bcad607c4e..d255d69bc1 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -434,6 +434,21 @@ int probe_access_flags(CPUArchState *env, target_ulong addr,
MMUAccessType access_type, int mmu_idx,
bool nonfault, void **phost, uintptr_t retaddr);
 
+#ifndef CONFIG_USER_ONLY
+/**
+ * probe_access_full:
+ * Like probe_access_flags, except also return into @pfull.
+ *
+ * The CPUTLBEntryFull structure returned via @pfull is transient
+ * and must be consumed or copied immediately, before any further
+ * access or changes to TLB @mmu_idx.
+ */
+int probe_access_full(CPUArchState *env, target_ulong addr,
+  MMUAccessType access_type, int mmu_idx,
+  bool nonfault, void **phost,
+  CPUTLBEntryFull **pfull, uintptr_t retaddr);
+#endif
+
#define CODE_GEN_ALIGN   16 /* must be >= of the size of a icache line */
 
 /* Estimated block size for TB allocation.  */
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 264f84a248..e3ee4260bd 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -1510,7 +1510,8 @@ static void notdirty_write(CPUState *cpu, vaddr mem_vaddr, unsigned size,
 static int probe_access_internal(CPUArchState *env, target_ulong addr,
  int fault_size, MMUAccessType access_type,
  int mmu_idx, bool nonfault,
- void **phost, uintptr_t retaddr)
+ void **phost, CPUTLBEntryFull **pfull,
+ uintptr_t retaddr)
 {
 uintptr_t index = tlb_index(env, mmu_idx, addr);
 CPUTLBEntry *entry = tlb_entry(env, mmu_idx, addr);
@@ -1543,10 +1544,12 @@ static int probe_access_internal(CPUArchState *env, target_ulong addr,
mmu_idx, nonfault, retaddr)) {
 /* Non-faulting page table read failed.  */
 *phost = NULL;
+*pfull = NULL;
 return TLB_INVALID_MASK;
 }
 
 /* TLB resize via tlb_fill may have moved the entry.  */
+index = tlb_index(env, mmu_idx, addr);
 entry = tlb_entry(env, mmu_idx, addr);
 
 /*
@@ -1560,6 +1563,8 @@ static int probe_access_internal(CPUArchState *env, target_ulong addr,
 }
 flags &= tlb_addr;
 
+*pfull = &env_tlb(env)->d[mmu_idx].fulltlb[index];
+
 /* Fold all "mmio-like" bits into TLB_MMIO.  This is not RAM.  */
 if (unlikely(flags & ~(TLB_WATCHPOINT | TLB_NOTDIRTY))) {
 *phost = NULL;
@@ -1571,37 +1576,44 @@ static int probe_access_internal(CPUArchState *env, target_ulong addr,
 return flags;
 }
 
-int probe_access_flags(CPUArchState *env, target_ulong addr,
-   MMUAccessType access_type, int mmu_idx,
-   bool nonfault, void **phost, uintptr_t retaddr)
+int probe_access_full(CPUArchState *env, target_ulong addr,
+  MMUAccessType access_type, int mmu_idx,
+  bool nonfault, void **phost, CPUTLBEntryFull **pfull,
+  uintptr_t retaddr)
 {
-int flags;
-
-flags = probe_access_internal(env, addr, 0, access_type, mmu_idx,
-  nonfault, phost, retaddr);
+int flags = probe_access_internal(env, addr, 0, access_type, mmu_idx,
+  nonfault, phost, pfull, retaddr);
 
 /* Handle clean RAM pages.  */
 if (unlikely(flags & TLB_NOTDIRTY)) {
-uintptr_t index = tlb_index(env, mmu_idx, addr);
-CPUTLBEntryFull *full = &env_tlb(env)->d[mmu_idx].fulltlb[index];
-
-notdirty_write(env_cpu(env), addr, 1, full, retaddr);
+notdirty_write(env_cpu(env), addr, 1, *pfull, retaddr);
 flags &= ~TLB_NOTDIRTY;
 }
 
 return flags;
 }
 
+int probe_access_flags(CPUArchState *env, target_ulong addr,
+   MMUAccessType access_type, int mmu_idx,
+   bool nonfault, void **phost, uintptr_t retaddr)
+{
+CPUTLBEntryFull *full;
+
+return probe_access_full(env, addr, access_type, mmu_idx,
+ nonfault, phost, &full, retaddr);
+}
+
 void *probe_access(CPUArchState *env, target_ulong addr, int size,
MMUAccessType access_type, int mmu_idx, uintptr_t retaddr)
 {
+CPUTLBE

[PATCH v6 01/18] cpu: cache CPUClass in CPUState for hot code paths

2022-09-30 Thread Richard Henderson
From: Alex Bennée 

The class cast checkers are quite expensive and always on (unlike the
dynamic case whose checks are gated by CONFIG_QOM_CAST_DEBUG). To
avoid the overhead of repeatedly checking something which should never
change we cache the CPUClass reference for use in the hot code paths.

Signed-off-by: Alex Bennée 
Reviewed-by: Richard Henderson 
Message-Id: <20220811151413.3350684-3-alex.ben...@linaro.org>
Signed-off-by: Cédric Le Goater 
Message-Id: <20220923084803.498337-3-...@kaod.org>
Signed-off-by: Richard Henderson 
---
 include/hw/core/cpu.h | 9 +
 cpu.c | 9 -
 2 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index 500503da13..1a7e1a9380 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -51,6 +51,13 @@ typedef int (*WriteCoreDumpFunction)(const void *buf, size_t size,
  */
 #define CPU(obj) ((CPUState *)(obj))
 
+/*
+ * The class checkers bring in CPU_GET_CLASS() which is potentially
+ * expensive given the eventual call to
+ * object_class_dynamic_cast_assert(). Because of this the CPUState
+ * has a cached value for the class in cs->cc which is set up in
+ * cpu_exec_realizefn() for use in hot code paths.
+ */
 typedef struct CPUClass CPUClass;
 DECLARE_CLASS_CHECKERS(CPUClass, CPU,
TYPE_CPU)
@@ -317,6 +324,8 @@ struct qemu_work_item;
 struct CPUState {
 /*< private >*/
 DeviceState parent_obj;
+/* cache to avoid expensive CPU_GET_CLASS */
+CPUClass *cc;
 /*< public >*/
 
 int nr_cores;
diff --git a/cpu.c b/cpu.c
index 584ac78baf..14365e36f3 100644
--- a/cpu.c
+++ b/cpu.c
@@ -131,9 +131,8 @@ const VMStateDescription vmstate_cpu_common = {
 
 void cpu_exec_realizefn(CPUState *cpu, Error **errp)
 {
-#ifndef CONFIG_USER_ONLY
-CPUClass *cc = CPU_GET_CLASS(cpu);
-#endif
+/* cache the cpu class for the hotpath */
+cpu->cc = CPU_GET_CLASS(cpu);
 
 cpu_list_add(cpu);
 if (!accel_cpu_realizefn(cpu, errp)) {
@@ -151,8 +150,8 @@ void cpu_exec_realizefn(CPUState *cpu, Error **errp)
 if (qdev_get_vmsd(DEVICE(cpu)) == NULL) {
 vmstate_register(NULL, cpu->cpu_index, &vmstate_cpu_common, cpu);
 }
-if (cc->sysemu_ops->legacy_vmsd != NULL) {
-vmstate_register(NULL, cpu->cpu_index, cc->sysemu_ops->legacy_vmsd, cpu);
+if (cpu->cc->sysemu_ops->legacy_vmsd != NULL) {
+vmstate_register(NULL, cpu->cpu_index, cpu->cc->sysemu_ops->legacy_vmsd, cpu);
 }
 #endif /* CONFIG_USER_ONLY */
 }
-- 
2.34.1




[PATCH v6 03/18] cputlb: used cached CPUClass in our hot-paths

2022-09-30 Thread Richard Henderson
From: Alex Bennée 

  Before: 35.912 s ±  0.168 s
  After:  35.565 s ±  0.087 s

Signed-off-by: Alex Bennée 
Reviewed-by: Richard Henderson 
Message-Id: <20220811151413.3350684-5-alex.ben...@linaro.org>
Signed-off-by: Cédric Le Goater 
Message-Id: <20220923084803.498337-5-...@kaod.org>
Signed-off-by: Richard Henderson 
---
 accel/tcg/cputlb.c | 15 ++-
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 8fad2d9b83..193bfc1cfc 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -1291,15 +1291,14 @@ void tlb_set_page(CPUState *cpu, target_ulong vaddr,
 static void tlb_fill(CPUState *cpu, target_ulong addr, int size,
  MMUAccessType access_type, int mmu_idx, uintptr_t retaddr)
 {
-CPUClass *cc = CPU_GET_CLASS(cpu);
 bool ok;
 
 /*
  * This is not a probe, so only valid return is success; failure
  * should result in exception + longjmp to the cpu loop.
  */
-ok = cc->tcg_ops->tlb_fill(cpu, addr, size,
-   access_type, mmu_idx, false, retaddr);
+ok = cpu->cc->tcg_ops->tlb_fill(cpu, addr, size,
+access_type, mmu_idx, false, retaddr);
 assert(ok);
 }
 
@@ -1307,9 +1306,8 @@ static inline void cpu_unaligned_access(CPUState *cpu, vaddr addr,
 MMUAccessType access_type,
 int mmu_idx, uintptr_t retaddr)
 {
-CPUClass *cc = CPU_GET_CLASS(cpu);
-
-cc->tcg_ops->do_unaligned_access(cpu, addr, access_type, mmu_idx, retaddr);
+cpu->cc->tcg_ops->do_unaligned_access(cpu, addr, access_type,
+  mmu_idx, retaddr);
 }
 
 static inline void cpu_transaction_failed(CPUState *cpu, hwaddr physaddr,
@@ -1539,10 +1537,9 @@ static int probe_access_internal(CPUArchState *env, target_ulong addr,
 if (!tlb_hit_page(tlb_addr, page_addr)) {
 if (!victim_tlb_hit(env, mmu_idx, index, elt_ofs, page_addr)) {
 CPUState *cs = env_cpu(env);
-CPUClass *cc = CPU_GET_CLASS(cs);
 
-if (!cc->tcg_ops->tlb_fill(cs, addr, fault_size, access_type,
-   mmu_idx, nonfault, retaddr)) {
+if (!cs->cc->tcg_ops->tlb_fill(cs, addr, fault_size, access_type,
+   mmu_idx, nonfault, retaddr)) {
 /* Non-faulting page table read failed.  */
 *phost = NULL;
 return TLB_INVALID_MASK;
-- 
2.34.1




[PATCH v6 06/18] accel/tcg: Suppress auto-invalidate in probe_access_internal

2022-09-30 Thread Richard Henderson
When PAGE_WRITE_INV is set when calling tlb_set_page,
we immediately set TLB_INVALID_MASK in order to force
tlb_fill to be called on the next lookup.  Here in
probe_access_internal, we have just called tlb_fill
and eliminated true misses, thus the lookup must be valid.

This allows us to remove a warning comment from s390x.
There doesn't seem to be a reason to change the code though.
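
[Editorial sketch] Abstractly, the change can be modelled like this (Python, illustrative only — a toy with one flag bit, not QEMU's real TLB structures): a fill that installs a PAGE_WRITE_INV-style entry marks it invalid so future lookups refault, but a probe that has just performed the fill knows the entry is valid and clears the bit in its local flags copy.

```python
TLB_INVALID_MASK = 1 << 0

class MiniTLB:
    """Toy model: a single flag bit, entries keyed by page address."""

    def __init__(self):
        self.entries = {}

    def fill(self, addr, write_inv):
        # A PAGE_WRITE_INV-style mapping is installed already marked
        # invalid, so the next ordinary lookup is forced back to fill.
        self.entries[addr] = TLB_INVALID_MASK if write_inv else 0

    def probe(self, addr, write_inv):
        flags = TLB_INVALID_MASK            # start from all candidate bits
        if addr not in self.entries:        # miss: go through fill
            self.fill(addr, write_inv)
            # Suppress the auto-invalidate: we have just called fill,
            # so *this* lookup is known to be valid.
            flags &= ~TLB_INVALID_MASK
        return flags & self.entries[addr]

tlb = MiniTLB()
first = tlb.probe(0x1000, write_inv=True)
```

The probe that triggered the fill reports a valid entry, while the stored entry keeps its invalid bit so later lookups still refault — the same invariant the patch relies on.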

Reviewed-by: Alex Bennée 
Reviewed-by: David Hildenbrand 
Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 accel/tcg/cputlb.c| 10 +-
 target/s390x/tcg/mem_helper.c |  4 
 2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index d06ff44ce9..264f84a248 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -1533,6 +1533,7 @@ static int probe_access_internal(CPUArchState *env, target_ulong addr,
 }
 tlb_addr = tlb_read_ofs(entry, elt_ofs);
 
+flags = TLB_FLAGS_MASK;
 page_addr = addr & TARGET_PAGE_MASK;
 if (!tlb_hit_page(tlb_addr, page_addr)) {
 if (!victim_tlb_hit(env, mmu_idx, index, elt_ofs, page_addr)) {
@@ -1547,10 +1548,17 @@ static int probe_access_internal(CPUArchState *env, target_ulong addr,
 
 /* TLB resize via tlb_fill may have moved the entry.  */
 entry = tlb_entry(env, mmu_idx, addr);
+
+/*
+ * With PAGE_WRITE_INV, we set TLB_INVALID_MASK immediately,
+ * to force the next access through tlb_fill.  We've just
+ * called tlb_fill, so we know that this entry *is* valid.
+ */
+flags &= ~TLB_INVALID_MASK;
 }
 tlb_addr = tlb_read_ofs(entry, elt_ofs);
 }
-flags = tlb_addr & TLB_FLAGS_MASK;
+flags &= tlb_addr;
 
 /* Fold all "mmio-like" bits into TLB_MMIO.  This is not RAM.  */
 if (unlikely(flags & ~(TLB_WATCHPOINT | TLB_NOTDIRTY))) {
diff --git a/target/s390x/tcg/mem_helper.c b/target/s390x/tcg/mem_helper.c
index fc52aa128b..3758b9e688 100644
--- a/target/s390x/tcg/mem_helper.c
+++ b/target/s390x/tcg/mem_helper.c
@@ -148,10 +148,6 @@ static int s390_probe_access(CPUArchState *env, target_ulong addr, int size,
 #else
 int flags;
 
-/*
- * For !CONFIG_USER_ONLY, we cannot rely on TLB_INVALID_MASK or haddr==NULL
- * to detect if there was an exception during tlb_fill().
- */
 env->tlb_fill_exc = 0;
flags = probe_access_flags(env, addr, access_type, mmu_idx, nonfault, phost,
ra);
-- 
2.34.1




[PATCH v6 05/18] accel/tcg: Drop addr member from SavedIOTLB

2022-09-30 Thread Richard Henderson
This field is only written, not read; remove it.

Reviewed-by: Alex Bennée 
Reviewed-by: Peter Maydell 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 include/hw/core/cpu.h | 1 -
 accel/tcg/cputlb.c| 7 +++
 2 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index 1a7e1a9380..009dc0d336 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -225,7 +225,6 @@ struct CPUWatchpoint {
  * the memory regions get moved around  by io_writex.
  */
 typedef struct SavedIOTLB {
-hwaddr addr;
 MemoryRegionSection *section;
 hwaddr mr_offset;
 } SavedIOTLB;
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index aa22f578cb..d06ff44ce9 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -1372,12 +1372,11 @@ static uint64_t io_readx(CPUArchState *env, CPUTLBEntryFull *full,
  * This is read by tlb_plugin_lookup if the fulltlb entry doesn't match
  * because of the side effect of io_writex changing memory layout.
  */
-static void save_iotlb_data(CPUState *cs, hwaddr addr,
-MemoryRegionSection *section, hwaddr mr_offset)
+static void save_iotlb_data(CPUState *cs, MemoryRegionSection *section,
+hwaddr mr_offset)
 {
 #ifdef CONFIG_PLUGIN
 SavedIOTLB *saved = &cs->saved_iotlb;
-saved->addr = addr;
 saved->section = section;
 saved->mr_offset = mr_offset;
 #endif
@@ -1406,7 +1405,7 @@ static void io_writex(CPUArchState *env, CPUTLBEntryFull *full,
  * The memory_region_dispatch may trigger a flush/resize
  * so for plugins we save the iotlb_data just in case.
  */
-save_iotlb_data(cpu, full->xlat_section, section, mr_offset);
+save_iotlb_data(cpu, section, mr_offset);
 
 if (!qemu_mutex_iothread_locked()) {
 qemu_mutex_lock_iothread();
-- 
2.34.1




[PATCH v6 00/18] tcg: CPUTLBEntryFull and TARGET_TB_PCREL

2022-09-30 Thread Richard Henderson
Changes for v6:
  * CPUTLBEntryFull is now completely reviewed.

  * Incorporated the CPUClass caching patches,
as I will add a new use of the cached value.

  * Move CPUJumpCache out of include/hw/core.h.  While looking at
Alex's review of the patch, I realized that adding the virtual
pc value unconditionally would consume 64kB per cpu on targets
that do not require it.  Further, making it dynamically allocated
(a consequence of core.h not having the structure definition to
add to CPUState), means that we save 64kB per cpu when running
with hardware virtualization (kvm, xen, etc).

  * Add CPUClass.get_pc, so that we can always use or filter on the
virtual address when logging.

Patches needing review:

  13-accel-tcg-Do-not-align-tb-page_addr-0.patch
  14-accel-tcg-Inline-tb_flush_jmp_cache.patch (new)
  16-hw-core-Add-CPUClass.get_pc.patch (new)
  17-accel-tcg-Introduce-tb_pc-and-log_pc.patch (mostly new)
  18-accel-tcg-Introduce-TARGET_TB_PCREL.patch


r~


Alex Bennée (3):
  cpu: cache CPUClass in CPUState for hot code paths
  hw/core/cpu-sysemu: used cached class in cpu_asidx_from_attrs
  cputlb: used cached CPUClass in our hot-paths

Richard Henderson (15):
  accel/tcg: Rename CPUIOTLBEntry to CPUTLBEntryFull
  accel/tcg: Drop addr member from SavedIOTLB
  accel/tcg: Suppress auto-invalidate in probe_access_internal
  accel/tcg: Introduce probe_access_full
  accel/tcg: Introduce tlb_set_page_full
  include/exec: Introduce TARGET_PAGE_ENTRY_EXTRA
  accel/tcg: Remove PageDesc code_bitmap
  accel/tcg: Use bool for page_find_alloc
  accel/tcg: Use DisasContextBase in plugin_gen_tb_start
  accel/tcg: Do not align tb->page_addr[0]
  accel/tcg: Inline tb_flush_jmp_cache
  include/hw/core: Create struct CPUJumpCache
  hw/core: Add CPUClass.get_pc
  accel/tcg: Introduce tb_pc and log_pc
  accel/tcg: Introduce TARGET_TB_PCREL

 accel/tcg/internal.h|  10 +
 accel/tcg/tb-hash.h |   1 +
 accel/tcg/tb-jmp-cache.h|  29 +++
 include/exec/cpu-common.h   |   1 +
 include/exec/cpu-defs.h |  48 -
 include/exec/exec-all.h |  75 ++-
 include/exec/plugin-gen.h   |   7 +-
 include/hw/core/cpu.h   |  28 ++-
 include/qemu/typedefs.h |   1 +
 include/tcg/tcg.h   |   2 +-
 accel/tcg/cpu-exec.c| 122 +++
 accel/tcg/cputlb.c  | 259 ++--
 accel/tcg/plugin-gen.c  |  22 +-
 accel/tcg/translate-all.c   | 200 --
 accel/tcg/translator.c  |   2 +-
 cpu.c   |   9 +-
 hw/core/cpu-common.c|   3 +-
 hw/core/cpu-sysemu.c|   5 +-
 plugins/core.c  |   2 +-
 target/alpha/cpu.c  |   9 +
 target/arm/cpu.c|  17 +-
 target/arm/mte_helper.c |  14 +-
 target/arm/sve_helper.c |   4 +-
 target/arm/translate-a64.c  |   2 +-
 target/avr/cpu.c|  10 +-
 target/cris/cpu.c   |   8 +
 target/hexagon/cpu.c|  10 +-
 target/hppa/cpu.c   |  12 +-
 target/i386/cpu.c   |   9 +
 target/i386/tcg/tcg-cpu.c   |   2 +-
 target/loongarch/cpu.c  |  11 +-
 target/m68k/cpu.c   |   8 +
 target/microblaze/cpu.c |  10 +-
 target/mips/cpu.c   |   8 +
 target/mips/tcg/exception.c |   2 +-
 target/mips/tcg/sysemu/special_helper.c |   2 +-
 target/nios2/cpu.c  |   9 +
 target/openrisc/cpu.c   |  10 +-
 target/ppc/cpu_init.c   |   8 +
 target/riscv/cpu.c  |  17 +-
 target/rx/cpu.c |  10 +-
 target/s390x/cpu.c  |   8 +
 target/s390x/tcg/mem_helper.c   |   4 -
 target/sh4/cpu.c|  12 +-
 target/sparc/cpu.c  |  10 +-
 target/tricore/cpu.c|  11 +-
 target/xtensa/cpu.c |   8 +
 tcg/tcg.c   |   8 +-
 trace/control-target.c  |   2 +-
 49 files changed, 723 insertions(+), 358 deletions(-)
 create mode 100644 accel/tcg/tb-jmp-cache.h

-- 
2.34.1




Re: [PATCH] Hexagon (gen_tcg_funcs.py): avoid duplicated tcg code on A_CVI_NEW

2022-09-30 Thread Richard Henderson

On 9/30/22 13:08, Matheus Tavares Bernardino wrote:

Hexagon instructions with the A_CVI_NEW attribute produce a vector value
that can be used in the same packet. The python function responsible for
generating code for such instructions has a typo ("if" instead of
"elif"), which makes genptr_dst_write_ext() be executed twice, thus also
generating the same tcg code twice. Fortunately, this doesn't cause any
problems for correctness, but it is less efficient than it could be. Fix
it by using an "elif" and avoiding the unnecessary extra code gen.

Signed-off-by: Matheus Tavares Bernardino 
---
  target/hexagon/gen_tcg_funcs.py | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/hexagon/gen_tcg_funcs.py b/target/hexagon/gen_tcg_funcs.py
index d72c689ad7..6dea02b0b9 100755
--- a/target/hexagon/gen_tcg_funcs.py
+++ b/target/hexagon/gen_tcg_funcs.py
@@ -548,7 +548,7 @@ def genptr_dst_write_opn(f,regtype, regid, tag):
  if (hex_common.is_hvx_reg(regtype)):
  if (hex_common.is_new_result(tag)):
  genptr_dst_write_ext(f, tag, regtype, regid, "EXT_NEW")
-if (hex_common.is_tmp_result(tag)):
+elif (hex_common.is_tmp_result(tag)):
  genptr_dst_write_ext(f, tag, regtype, regid, "EXT_TMP")
  else:
  genptr_dst_write_ext(f, tag, regtype, regid, "EXT_DFL")


Reviewed-by: Richard Henderson 


r~



Re: [PATCH 1/2] target/arm: Don't allow guest to use unimplemented granule sizes

2022-09-30 Thread Richard Henderson

On 9/30/22 10:48, Peter Maydell wrote:

@@ -10289,20 +10289,113 @@ static int aa64_va_parameter_tcma(uint64_t tcr, ARMMMUIdx mmu_idx)
  }
  }
  
+typedef enum GranuleSize {

+/* Same order as TG0 encoding */
+Gran4K,
+Gran64K,
+Gran16K,
+GranInvalid,
+} GranuleSize;


It might be worth using this in ARMVAParameters. Even if you don't do that now, it would 
be worth putting this typedef in internals.h.


Otherwise,
Reviewed-by: Richard Henderson 


r~



[PULL 1/8] hw/virtio/vhost-shadow-virtqueue: Silence GCC error "maybe-uninitialized"

2022-09-30 Thread Laurent Vivier
From: Bernhard Beschow 

GCC issues a false positive warning, resulting in build failure with -Werror:

  In file included from /usr/include/glib-2.0/glib.h:114,
   from src/include/glib-compat.h:32,
   from src/include/qemu/osdep.h:144,
   from ../src/hw/virtio/vhost-shadow-virtqueue.c:10:
  In function ‘g_autoptr_cleanup_generic_gfree’,
   inlined from ‘vhost_handle_guest_kick’ at ../src/hw/virtio/vhost-shadow-virtqueue.c:292:42:
  /usr/include/glib-2.0/glib/glib-autocleanups.h:28:3: error: ‘elem’ may be used uninitialized [-Werror=maybe-uninitialized]
 28 |   g_free (*pp);
|   ^~~~
  ../src/hw/virtio/vhost-shadow-virtqueue.c: In function ‘vhost_handle_guest_kick’:
  ../src/hw/virtio/vhost-shadow-virtqueue.c:292:42: note: ‘elem’ was declared here
292 | g_autofree VirtQueueElement *elem;
|  ^~~~
  cc1: all warnings being treated as errors

There is actually no problem since "elem" is initialized in both branches.
Silence the warning by initializing it with "NULL".

$ gcc --version
gcc (GCC) 12.2.0

Fixes: 9c2ab2f1ec333be8614cc12272d4b91960704dbe ("vhost: stop transfer elem ownership in vhost_handle_guest_kick")
Signed-off-by: Bernhard Beschow 
Reviewed-by: Richard Henderson 
Reviewed-by: Philippe Mathieu-Daudé 
Message-Id: <20220910151117.6665-1-shen...@gmail.com>
Signed-off-by: Laurent Vivier 
---
 hw/virtio/vhost-shadow-virtqueue.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
index e8e5bbc368dd..596d4434d289 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -289,7 +289,7 @@ static void vhost_handle_guest_kick(VhostShadowVirtqueue *svq)
 virtio_queue_set_notification(svq->vq, false);
 
 while (true) {
-g_autofree VirtQueueElement *elem;
+g_autofree VirtQueueElement *elem = NULL;
 int r;
 
 if (svq->next_guest_avail_elem) {
-- 
2.37.3




[PULL 0/8] Trivial branch for 7.2 patches

2022-09-30 Thread Laurent Vivier
The following changes since commit c8de6ec63d766ca1998c5af468483ce912fdc0c2:

  Merge tag 'pull-request-2022-09-28' of https://gitlab.com/thuth/qemu into 
staging (2022-09-28 17:04:11 -0400)

are available in the Git repository at:

  https://gitlab.com/laurent_vivier/qemu.git 
tags/trivial-branch-for-7.2-pull-request

for you to fetch changes up to a40ee29bbf3c169597d85f0871d189398b667d9f:

  docs: Update TPM documentation for usage of a TPM 2 (2022-09-29 21:31:56 
+0200)


Pull request trivial patches branch 20220930



Bernhard Beschow (1):
  hw/virtio/vhost-shadow-virtqueue: Silence GCC error
"maybe-uninitialized"

Markus Armbruster (2):
  Drop superfluous conditionals around g_free()
  Use g_new() & friends where that makes obvious sense

Matheus Tavares Bernardino (1):
  checkpatch: ignore target/hexagon/imported/* files

Philippe Mathieu-Daudé via (1):
  block/qcow2-bitmap: Add missing cast to silent GCC error

Stefan Berger (1):
  docs: Update TPM documentation for usage of a TPM 2

Tong Zhang (1):
  mem/cxl_type3: fix GPF DVSEC

Wang, Lei (1):
  .gitignore: add .cache/ to .gitignore

 .gitignore |  1 +
 block/qcow2-bitmap.c   |  2 +-
 docs/specs/tpm.rst | 44 --
 hw/mem/cxl_type3.c |  2 +-
 hw/remote/iommu.c  |  2 +-
 hw/virtio/vhost-shadow-virtqueue.c |  2 +-
 hw/virtio/virtio-crypto.c  |  2 +-
 migration/dirtyrate.c  |  4 +--
 replay/replay.c|  6 ++--
 scripts/checkpatch.pl  |  1 +
 softmmu/dirtylimit.c   |  4 +--
 target/i386/kvm/kvm.c  | 12 +++-
 target/i386/whpx/whpx-all.c| 14 --
 13 files changed, 47 insertions(+), 49 deletions(-)

-- 
2.37.3




Re: [PATCH 2/2] docs/system/arm/emulation.rst: Report FEAT_GTG support

2022-09-30 Thread Richard Henderson

On 9/30/22 10:48, Peter Maydell wrote:

FEAT_GTG is a change to the ID register ID_AA64MMFR0_EL1 so that it
can report a different set of supported granule (page) sizes for
stage 1 and stage 2 translation tables.  As of commit c20281b2a5048
we already report the granule sizes that way for '-cpu max', and now
we also correctly make attempts to use unimplemented granule sizes
fail, so we can report the support of the feature in the
documentation.

Signed-off-by: Peter Maydell
---
  docs/system/arm/emulation.rst | 1 +
  1 file changed, 1 insertion(+)


Reviewed-by: Richard Henderson 

r~



[PATCH] Hexagon (gen_tcg_funcs.py): avoid duplicated tcg code on A_CVI_NEW

2022-09-30 Thread Matheus Tavares Bernardino
Hexagon instructions with the A_CVI_NEW attribute produce a vector value
that can be used in the same packet. The python function responsible for
generating code for such instructions has a typo ("if" instead of
"elif"), which makes genptr_dst_write_ext() be executed twice, thus also
generating the same tcg code twice. Fortunately, this doesn't cause any
problems for correctness, but it is less efficient than it could be. Fix
it by using an "elif" and avoiding the unnecessary extra code gen.
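
[Editorial sketch] The difference can be shown in isolation (illustrative Python with invented names — not the generator's actual code): with the stray "if", the write helper runs twice for a new-result register, once for EXT_NEW and once more via the second test's else branch; with "elif", exactly one branch runs.

```python
emitted = []

def gen_ext(kind):
    # Stands in for genptr_dst_write_ext(); records what would be emitted.
    emitted.append(kind)

def write_opn_buggy(is_new, is_tmp):
    # Original control flow: "if" followed by "if", so a new-result
    # register also falls through into the second test's else branch.
    if is_new:
        gen_ext("EXT_NEW")
    if is_tmp:
        gen_ext("EXT_TMP")
    else:
        gen_ext("EXT_DFL")

def write_opn_fixed(is_new, is_tmp):
    # Fixed control flow: exactly one branch runs.
    if is_new:
        gen_ext("EXT_NEW")
    elif is_tmp:
        gen_ext("EXT_TMP")
    else:
        gen_ext("EXT_DFL")

emitted.clear()
write_opn_buggy(True, False)
buggy_result = list(emitted)

emitted.clear()
write_opn_fixed(True, False)
fixed_result = list(emitted)
```

The buggy variant emits two writes for a new-result register; the fixed variant emits one, matching the commit's description.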

Signed-off-by: Matheus Tavares Bernardino 
---
 target/hexagon/gen_tcg_funcs.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/hexagon/gen_tcg_funcs.py b/target/hexagon/gen_tcg_funcs.py
index d72c689ad7..6dea02b0b9 100755
--- a/target/hexagon/gen_tcg_funcs.py
+++ b/target/hexagon/gen_tcg_funcs.py
@@ -548,7 +548,7 @@ def genptr_dst_write_opn(f,regtype, regid, tag):
 if (hex_common.is_hvx_reg(regtype)):
 if (hex_common.is_new_result(tag)):
 genptr_dst_write_ext(f, tag, regtype, regid, "EXT_NEW")
-if (hex_common.is_tmp_result(tag)):
+elif (hex_common.is_tmp_result(tag)):
 genptr_dst_write_ext(f, tag, regtype, regid, "EXT_TMP")
 else:
 genptr_dst_write_ext(f, tag, regtype, regid, "EXT_DFL")
-- 
2.37.2




Re: [PATCH] net: print a more actionable error when slirp is not found

2022-09-30 Thread Christian Schoenebeck
On Donnerstag, 29. September 2022 18:32:37 CEST Marc-André Lureau wrote:
> From: Marc-André Lureau 
> 
> If slirp is not found during compile-time, and not manually disabled,
> print a friendly error message, as suggested in the "If your networking
> is failing after updating to the latest git version of QEMU..." thread
> by various people.
> 
> Signed-off-by: Marc-André Lureau 
> ---
>  meson.build |  4 
>  net/net.c   | 19 +--
>  2 files changed, 21 insertions(+), 2 deletions(-)
> 
> diff --git a/meson.build b/meson.build
> index 8dc661363f..4f69d7d0b4 100644
> --- a/meson.build
> +++ b/meson.build
> @@ -657,6 +657,10 @@ if not get_option('slirp').auto() or have_system
>endif
>  endif
> 
> +if get_option('slirp').disabled()
> +  config_host_data.set('CONFIG_SLIRP_DISABLED', true)
> +endif
> +
>  vde = not_found
>  if not get_option('vde').auto() or have_system or have_tools
>vde = cc.find_library('vdeplug', has_headers: ['libvdeplug.h'],
> diff --git a/net/net.c b/net/net.c
> index 2db160e063..e6072a5ddd 100644
> --- a/net/net.c
> +++ b/net/net.c
> @@ -990,14 +990,29 @@ static int net_init_nic(const Netdev *netdev, const
> char *name, return idx;
>  }
> 
> +#if (defined(CONFIG_SLIRP) || !defined(CONFIG_SLIRP_DISABLED))
> +static int net_init_user(const Netdev *netdev, const char *name,
> + NetClientState *peer, Error **errp)
> +{
> +#ifdef CONFIG_SLIRP
> +return net_init_slirp(netdev, name, peer, errp);
> +#else
> +error_setg(errp,
> +   "Type 'user' is not a supported netdev backend by this QEMU build "
> +   "because the libslirp development files were not found during build "
> +   "of QEMU.");
> +#endif
> +return -1;
> +}
> +#endif

I just tried this, but somehow it is not working for me. net_init_user() is 
never called and therefore I don't get the error message. That should be 
working if the user launched QEMU without any networking arg, right?

And still, I would find it better if there were also a clear build-time error 
when there is no libslirp and the slirp feature was not explicitly disabled.

> 
>  static int (* const net_client_init_fun[NET_CLIENT_DRIVER__MAX])(
>  const Netdev *netdev,
>  const char *name,
>  NetClientState *peer, Error **errp) = {
>  [NET_CLIENT_DRIVER_NIC]   = net_init_nic,
> -#ifdef CONFIG_SLIRP
> -[NET_CLIENT_DRIVER_USER]  = net_init_slirp,
> +#if (defined(CONFIG_SLIRP) || !defined(CONFIG_SLIRP_DISABLED))
> +[NET_CLIENT_DRIVER_USER]  = net_init_user,
>  #endif
>  [NET_CLIENT_DRIVER_TAP]   = net_init_tap,
>  [NET_CLIENT_DRIVER_SOCKET]= net_init_socket,







[PULL v2 1/3] Hexagon (target/hexagon) add instruction attributes from archlib

2022-09-30 Thread Taylor Simpson
The imported files from the architecture library have added some
instruction attributes.  Some of these will be used in a subsequent
patch for determining the size of a store.

Signed-off-by: Taylor Simpson 
Acked-by: Richard Henderson 
Message-Id: <20220920080746.26791-2-tsimp...@quicinc.com>
---
 target/hexagon/attribs_def.h.inc  |  37 +++-
 target/hexagon/imported/ldst.idef | 122 +-
 target/hexagon/imported/subinsns.idef |  72 +++
 3 files changed, 133 insertions(+), 98 deletions(-)

diff --git a/target/hexagon/attribs_def.h.inc b/target/hexagon/attribs_def.h.inc
index dc890a557f..222ad95fb0 100644
--- a/target/hexagon/attribs_def.h.inc
+++ b/target/hexagon/attribs_def.h.inc
@@ -1,5 +1,5 @@
 /*
- *  Copyright(c) 2019-2021 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *  Copyright(c) 2019-2022 Qualcomm Innovation Center, Inc. All Rights Reserved.
  *
  *  This program is free software; you can redistribute it and/or modify
  *  it under the terms of the GNU General Public License as published by
@@ -38,6 +38,16 @@ DEF_ATTRIB(SUBINSN, "sub-instruction", "", "")
 /* Load and Store attributes */
 DEF_ATTRIB(LOAD, "Loads from memory", "", "")
 DEF_ATTRIB(STORE, "Stores to memory", "", "")
+DEF_ATTRIB(STOREIMMED, "Stores immed to memory", "", "")
+DEF_ATTRIB(MEMSIZE_0B, "Memory width is 0 byte", "", "")
+DEF_ATTRIB(MEMSIZE_1B, "Memory width is 1 byte", "", "")
+DEF_ATTRIB(MEMSIZE_2B, "Memory width is 2 bytes", "", "")
+DEF_ATTRIB(MEMSIZE_4B, "Memory width is 4 bytes", "", "")
+DEF_ATTRIB(MEMSIZE_8B, "Memory width is 8 bytes", "", "")
+DEF_ATTRIB(REGWRSIZE_1B, "Memory width is 1 byte", "", "")
+DEF_ATTRIB(REGWRSIZE_2B, "Memory width is 2 bytes", "", "")
+DEF_ATTRIB(REGWRSIZE_4B, "Memory width is 4 bytes", "", "")
+DEF_ATTRIB(REGWRSIZE_8B, "Memory width is 8 bytes", "", "")
 DEF_ATTRIB(MEMLIKE, "Memory-like instruction", "", "")
 DEF_ATTRIB(MEMLIKE_PACKET_RULES, "follows Memory-like packet rules", "", "")
 
@@ -71,6 +81,11 @@ DEF_ATTRIB(COF, "Change-of-flow instruction", "", "")
 DEF_ATTRIB(CONDEXEC, "May be cancelled by a predicate", "", "")
 DEF_ATTRIB(DOTNEWVALUE, "Uses a register value generated in this pkt", "", "")
 DEF_ATTRIB(NEWCMPJUMP, "Compound compare and jump", "", "")
+DEF_ATTRIB(NVSTORE, "New-value store", "", "")
+DEF_ATTRIB(MEMOP, "memop", "", "")
+
+DEF_ATTRIB(ROPS_2, "Compound instruction worth 2 RISC-ops", "", "")
+DEF_ATTRIB(ROPS_3, "Compound instruction worth 3 RISC-ops", "", "")
 
 /* access to implicit registers */
 DEF_ATTRIB(IMPLICIT_WRITES_LR, "Writes the link register", "", "UREG.LR")
@@ -87,6 +102,9 @@ DEF_ATTRIB(IMPLICIT_WRITES_P3, "May write Predicate 3", "", "UREG.P3")
 DEF_ATTRIB(IMPLICIT_READS_PC, "Reads the PC register", "", "")
 DEF_ATTRIB(IMPLICIT_WRITES_USR, "May write USR", "", "")
 DEF_ATTRIB(WRITES_PRED_REG, "Writes a predicate register", "", "")
+DEF_ATTRIB(COMMUTES, "The operation is commutative", "", "")
+DEF_ATTRIB(DEALLOCRET, "dealloc_return", "", "")
+DEF_ATTRIB(DEALLOCFRAME, "deallocframe", "", "")
 
 DEF_ATTRIB(CRSLOT23, "Can execute in slot 2 or slot 3 (CR)", "", "")
 DEF_ATTRIB(IT_NOP, "nop instruction", "", "")
@@ -94,17 +112,21 @@ DEF_ATTRIB(IT_EXTENDER, "constant extender instruction", "", "")
 
 
 /* Restrictions to make note of */
+DEF_ATTRIB(RESTRICT_COF_MAX1, "One change-of-flow per packet", "", "")
+DEF_ATTRIB(RESTRICT_NOPACKET, "Not allowed in a packet", "", "")
 DEF_ATTRIB(RESTRICT_SLOT0ONLY, "Must execute on slot0", "", "")
 DEF_ATTRIB(RESTRICT_SLOT1ONLY, "Must execute on slot1", "", "")
 DEF_ATTRIB(RESTRICT_SLOT2ONLY, "Must execute on slot2", "", "")
 DEF_ATTRIB(RESTRICT_SLOT3ONLY, "Must execute on slot3", "", "")
 DEF_ATTRIB(RESTRICT_NOSLOT1, "No slot 1 instruction in parallel", "", "")
 DEF_ATTRIB(RESTRICT_PREFERSLOT0, "Try to encode into slot 0", "", "")
+DEF_ATTRIB(RESTRICT_PACKET_AXOK, "May exist with A-type or X-type", "", "")
 
 DEF_ATTRIB(ICOP, "Instruction cache op", "", "")
 
 DEF_ATTRIB(HWLOOP0_END, "Ends HW loop0", "", "")
 DEF_ATTRIB(HWLOOP1_END, "Ends HW loop1", "", "")
+DEF_ATTRIB(RET_TYPE, "return type", "", "")
 DEF_ATTRIB(DCZEROA, "dczeroa type", "", "")
 DEF_ATTRIB(ICFLUSHOP, "icflush op type", "", "")
 DEF_ATTRIB(DCFLUSHOP, "dcflush op type", "", "")
@@ -116,5 +138,18 @@ DEF_ATTRIB(L2FETCH, "Instruction is l2fetch type", "", "")
 DEF_ATTRIB(ICINVA, "icinva", "", "")
 DEF_ATTRIB(DCCLEANINVA, "dccleaninva", "", "")
 
+/* Documentation Notes */
+DEF_ATTRIB(NOTE_CONDITIONAL, "can be conditionally executed", "", "")
+DEF_ATTRIB(NOTE_NEWVAL_SLOT0, "New-value oprnd must execute on slot 0", "", "")
+DEF_ATTRIB(NOTE_PRIV, "Monitor-level feature", "", "")
+DEF_ATTRIB(NOTE_NOPACKET, "solo instruction", "", "")
+DEF_ATTRIB(NOTE_AXOK, "May only be grouped with ALU32 or non-FP XTYPE.", "", "")
+DEF_ATTRIB(NOTE_LATEPRED, "The predicate can not be used as a .new", "", "")
+DEF_ATTRIB(NOTE_NVSLOT0, "Can execute only in slot 0 (ST)", "", "")
+
+/* Restrictions to make note of */
+DEF_ATTR

[PULL v2 2/3] Hexagon (target/hexagon) Change decision to set pkt_has_store_s[01]

2022-09-30 Thread Taylor Simpson
We have found cases where pkt_has_store_s[01] is set incorrectly.
This leads to generating an unnecessary store that is left over
from a previous packet.

Add an attribute to determine if an instruction is a scalar store.
The attribute is attached to the fSTORE macro (hex_common.py).
Update the logic in decode.c that sets pkt_has_store_s[01].

Signed-off-by: Taylor Simpson 
Reviewed-by: Richard Henderson 
Message-Id: <20220920080746.26791-4-tsimp...@quicinc.com>
---
 target/hexagon/attribs_def.h.inc |  1 +
 target/hexagon/decode.c  | 13 -
 target/hexagon/translate.c   | 10 ++
 target/hexagon/hex_common.py |  3 ++-
 4 files changed, 17 insertions(+), 10 deletions(-)

diff --git a/target/hexagon/attribs_def.h.inc b/target/hexagon/attribs_def.h.inc
index 222ad95fb0..5d2a102c18 100644
--- a/target/hexagon/attribs_def.h.inc
+++ b/target/hexagon/attribs_def.h.inc
@@ -44,6 +44,7 @@ DEF_ATTRIB(MEMSIZE_1B, "Memory width is 1 byte", "", "")
 DEF_ATTRIB(MEMSIZE_2B, "Memory width is 2 bytes", "", "")
 DEF_ATTRIB(MEMSIZE_4B, "Memory width is 4 bytes", "", "")
 DEF_ATTRIB(MEMSIZE_8B, "Memory width is 8 bytes", "", "")
+DEF_ATTRIB(SCALAR_STORE, "Store is scalar", "", "")
 DEF_ATTRIB(REGWRSIZE_1B, "Memory width is 1 byte", "", "")
 DEF_ATTRIB(REGWRSIZE_2B, "Memory width is 2 bytes", "", "")
 DEF_ATTRIB(REGWRSIZE_4B, "Memory width is 4 bytes", "", "")
diff --git a/target/hexagon/decode.c b/target/hexagon/decode.c
index 6f0f27b4ba..6b73b5c60c 100644
--- a/target/hexagon/decode.c
+++ b/target/hexagon/decode.c
@@ -1,5 +1,5 @@
 /*
- *  Copyright(c) 2019-2021 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *  Copyright(c) 2019-2022 Qualcomm Innovation Center, Inc. All Rights Reserved.
  *
  *  This program is free software; you can redistribute it and/or modify
  *  it under the terms of the GNU General Public License as published by
@@ -402,10 +402,13 @@ static void decode_set_insn_attr_fields(Packet *pkt)
 }
 
 if (GET_ATTRIB(opcode, A_STORE)) {
-if (pkt->insn[i].slot == 0) {
-pkt->pkt_has_store_s0 = true;
-} else {
-pkt->pkt_has_store_s1 = true;
+if (GET_ATTRIB(opcode, A_SCALAR_STORE) &&
+!GET_ATTRIB(opcode, A_MEMSIZE_0B)) {
+if (pkt->insn[i].slot == 0) {
+pkt->pkt_has_store_s0 = true;
+} else {
+pkt->pkt_has_store_s1 = true;
+}
 }
 }
 
diff --git a/target/hexagon/translate.c b/target/hexagon/translate.c
index 0e8a0772f7..b6b834b4ee 100644
--- a/target/hexagon/translate.c
+++ b/target/hexagon/translate.c
@@ -1,5 +1,5 @@
 /*
- *  Copyright(c) 2019-2021 Qualcomm Innovation Center, Inc. All Rights Reserved.
+ *  Copyright(c) 2019-2022 Qualcomm Innovation Center, Inc. All Rights Reserved.
  *
  *  This program is free software; you can redistribute it and/or modify
  *  it under the terms of the GNU General Public License as published by
@@ -499,10 +499,12 @@ static void process_store_log(DisasContext *ctx, Packet *pkt)
  *  slot 1 and then slot 0.  This will be important when
  *  the memory accesses overlap.
  */
-if (pkt->pkt_has_store_s1 && !pkt->pkt_has_dczeroa) {
+if (pkt->pkt_has_store_s1) {
+g_assert(!pkt->pkt_has_dczeroa);
 process_store(ctx, pkt, 1);
 }
-if (pkt->pkt_has_store_s0 && !pkt->pkt_has_dczeroa) {
+if (pkt->pkt_has_store_s0) {
+g_assert(!pkt->pkt_has_dczeroa);
 process_store(ctx, pkt, 0);
 }
 }
@@ -665,7 +667,7 @@ static void gen_commit_packet(CPUHexagonState *env, DisasContext *ctx,
  * The dczeroa will be the store in slot 0, check that we don't have
  * a store in slot 1 or an HVX store.
  */
-g_assert(has_store_s0 && !has_store_s1 && !has_hvx_store);
+g_assert(!has_store_s1 && !has_hvx_store);
 process_dczeroa(ctx, pkt);
 } else if (has_hvx_store) {
 TCGv mem_idx = tcg_constant_tl(ctx->mem_idx);
diff --git a/target/hexagon/hex_common.py b/target/hexagon/hex_common.py
index c81aca8d2a..d9ba7df786 100755
--- a/target/hexagon/hex_common.py
+++ b/target/hexagon/hex_common.py
@@ -1,7 +1,7 @@
 #!/usr/bin/env python3
 
 ##
-##  Copyright(c) 2019-2021 Qualcomm Innovation Center, Inc. All Rights Reserved.
+##  Copyright(c) 2019-2022 Qualcomm Innovation Center, Inc. All Rights Reserved.
 ##
 ##  This program is free software; you can redistribute it and/or modify
 ##  it under the terms of the GNU General Public License as published by
@@ -75,6 +75,7 @@ def calculate_attribs():
 add_qemu_macro_attrib('fWRITE_P3', 'A_WRITES_PRED_REG')
 add_qemu_macro_attrib('fSET_OVERFLOW', 'A_IMPLICIT_WRITES_USR')
 add_qemu_macro_attrib('fSET_LPCFG', 'A_IMPLICIT_WRITES_USR')
+add_qemu_macro_attrib('fSTORE', 'A_SCALAR_STORE')
 
 # Recurse down macros, find attributes from sub-macros
 macroValues = list(macros.values())
-- 
2.17.1


[PULL v2 3/3] Hexagon (target/hexagon) move store size tracking to translation

2022-09-30 Thread Taylor Simpson
The store width is needed for packet commit, so it is stored in
ctx->store_width.  Currently, it is set when a store has a TCG
override instead of a QEMU helper.  In the QEMU helper case,
ctx->store_width is not set, so we invoke a helper during packet commit
that uses the runtime store width.

This patch ensures ctx->store_width is set for all store instructions,
so performance is improved because packet commit can generate the proper
TCG store rather than the generic helper.

We do this by
- Use the attributes from the instructions during translation to
  set ctx->store_width
- Remove setting of ctx->store_width from genptr.c

Signed-off-by: Taylor Simpson 
Reviewed-by: Richard Henderson 
Message-Id: <20220920080746.26791-3-tsimp...@quicinc.com>
---
 target/hexagon/macros.h|  8 
 target/hexagon/genptr.c| 36 
 target/hexagon/translate.c | 25 +
 3 files changed, 41 insertions(+), 28 deletions(-)

diff --git a/target/hexagon/macros.h b/target/hexagon/macros.h
index 92eb8bbf05..c8805bdaeb 100644
--- a/target/hexagon/macros.h
+++ b/target/hexagon/macros.h
@@ -156,7 +156,7 @@
 __builtin_choose_expr(TYPE_TCGV(X), \
 gen_store1, (void)0))
 #define MEM_STORE1(VA, DATA, SLOT) \
-MEM_STORE1_FUNC(DATA)(cpu_env, VA, DATA, ctx, SLOT)
+MEM_STORE1_FUNC(DATA)(cpu_env, VA, DATA, SLOT)
 
 #define MEM_STORE2_FUNC(X) \
 __builtin_choose_expr(TYPE_INT(X), \
@@ -164,7 +164,7 @@
 __builtin_choose_expr(TYPE_TCGV(X), \
 gen_store2, (void)0))
 #define MEM_STORE2(VA, DATA, SLOT) \
-MEM_STORE2_FUNC(DATA)(cpu_env, VA, DATA, ctx, SLOT)
+MEM_STORE2_FUNC(DATA)(cpu_env, VA, DATA, SLOT)
 
 #define MEM_STORE4_FUNC(X) \
 __builtin_choose_expr(TYPE_INT(X), \
@@ -172,7 +172,7 @@
 __builtin_choose_expr(TYPE_TCGV(X), \
 gen_store4, (void)0))
 #define MEM_STORE4(VA, DATA, SLOT) \
-MEM_STORE4_FUNC(DATA)(cpu_env, VA, DATA, ctx, SLOT)
+MEM_STORE4_FUNC(DATA)(cpu_env, VA, DATA, SLOT)
 
 #define MEM_STORE8_FUNC(X) \
 __builtin_choose_expr(TYPE_INT(X), \
@@ -180,7 +180,7 @@
 __builtin_choose_expr(TYPE_TCGV_I64(X), \
 gen_store8, (void)0))
 #define MEM_STORE8(VA, DATA, SLOT) \
-MEM_STORE8_FUNC(DATA)(cpu_env, VA, DATA, ctx, SLOT)
+MEM_STORE8_FUNC(DATA)(cpu_env, VA, DATA, SLOT)
 #else
 #define MEM_LOAD1s(VA) ((int8_t)mem_load1(env, slot, VA))
 #define MEM_LOAD1u(VA) ((uint8_t)mem_load1(env, slot, VA))
diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c
index 8a334ba07b..806d0974ff 100644
--- a/target/hexagon/genptr.c
+++ b/target/hexagon/genptr.c
@@ -401,62 +401,50 @@ static inline void gen_store32(TCGv vaddr, TCGv src, int width, int slot)
 tcg_gen_mov_tl(hex_store_val32[slot], src);
 }
 
-static inline void gen_store1(TCGv_env cpu_env, TCGv vaddr, TCGv src,
-  DisasContext *ctx, int slot)
+static inline void gen_store1(TCGv_env cpu_env, TCGv vaddr, TCGv src, int slot)
 {
 gen_store32(vaddr, src, 1, slot);
-ctx->store_width[slot] = 1;
 }
 
-static inline void gen_store1i(TCGv_env cpu_env, TCGv vaddr, int32_t src,
-   DisasContext *ctx, int slot)
+static inline void gen_store1i(TCGv_env cpu_env, TCGv vaddr, int32_t src, int slot)
 {
 TCGv tmp = tcg_constant_tl(src);
-gen_store1(cpu_env, vaddr, tmp, ctx, slot);
+gen_store1(cpu_env, vaddr, tmp, slot);
 }
 
-static inline void gen_store2(TCGv_env cpu_env, TCGv vaddr, TCGv src,
-  DisasContext *ctx, int slot)
+static inline void gen_store2(TCGv_env cpu_env, TCGv vaddr, TCGv src, int slot)
 {
 gen_store32(vaddr, src, 2, slot);
-ctx->store_width[slot] = 2;
 }
 
-static inline void gen_store2i(TCGv_env cpu_env, TCGv vaddr, int32_t src,
-   DisasContext *ctx, int slot)
+static inline void gen_store2i(TCGv_env cpu_env, TCGv vaddr, int32_t src, int slot)
 {
 TCGv tmp = tcg_constant_tl(src);
-gen_store2(cpu_env, vaddr, tmp, ctx, slot);
+gen_store2(cpu_env, vaddr, tmp, slot);
 }
 
-static inline void gen_store4(TCGv_env cpu_env, TCGv vaddr, TCGv src,
-  DisasContext *ctx, int slot)
+static inline void gen_store4(TCGv_env cpu_env, TCGv vaddr, TCGv src, int slot)
 {
 gen_store32(vaddr, src, 4, slot);
-ctx->store_width[slot] = 4;
 }
 
-static inline void gen_store4i(TCGv_env cpu_env, TCGv vaddr, int32_t src,
-   DisasContext *ctx, int slot)
+static inline void gen_store4i(TCGv_env cpu_env, TCGv vaddr, int32_t src, int slot)
 {
 TCGv tmp = tcg_constant_tl(src);
-gen_store4(cpu_env, vaddr, tmp, ctx, slot);
+gen_store4(cpu_env, vaddr, tmp, slot);
 }
 
-static inline void gen_store8(TCGv_env cpu_env, TCGv vaddr, TCGv_i64 src,
-  DisasContext *ctx, int slot)
+static inline void gen_store8(TCGv_env cpu_env, TCGv vaddr, TCGv_i64 src, int slot)
 {
 tcg_gen

[PULL v2 0/3] Hexagon (target/hexagon) improve store handling

2022-09-30 Thread Taylor Simpson
The following changes since commit c8de6ec63d766ca1998c5af468483ce912fdc0c2:

  Merge tag 'pull-request-2022-09-28' of https://gitlab.com/thuth/qemu into staging (2022-09-28 17:04:11 -0400)

are available in the Git repository at:

  https://github.com/quic/qemu tags/pull-hex-20220930

for you to fetch changes up to 661ad999c554d1cc99ff96b3baf3ff4acbe2ecee:

  Hexagon (target/hexagon) move store size tracking to translation (2022-09-30 11:25:37 -0700)


Make store handling faster and more robust


Taylor Simpson (3):
  Hexagon (target/hexagon) add instruction attributes from archlib
  Hexagon (target/hexagon) Change decision to set pkt_has_store_s[01]
  Hexagon (target/hexagon) move store size tracking to translation

 target/hexagon/macros.h   |   8 +--
 target/hexagon/attribs_def.h.inc  |  38 ++-
 target/hexagon/decode.c   |  13 ++--
 target/hexagon/genptr.c   |  36 --
 target/hexagon/translate.c|  35 --
 target/hexagon/hex_common.py  |   3 +-
 target/hexagon/imported/ldst.idef | 122 +-
 target/hexagon/imported/subinsns.idef |  72 ++--
 8 files changed, 191 insertions(+), 136 deletions(-)


Re: [RFC PATCH v2 11/29] target/ppc: add power-saving interrupt masking logic to p9_next_unmasked_interrupt

2022-09-30 Thread Fabiano Rosas
Matheus Ferst  writes:

> Export p9_interrupt_powersave and use it in p9_next_unmasked_interrupt.
>
> Signed-off-by: Matheus Ferst 
> ---
> Temporarily putting the prototype in internal.h for lack of a better place,
> we will un-export p9_interrupt_powersave in future patches.
> ---
>  target/ppc/cpu_init.c|  2 +-
>  target/ppc/excp_helper.c | 46 
>  target/ppc/internal.h|  4 
>  3 files changed, 38 insertions(+), 14 deletions(-)
>
> diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
> index 1f8f6c6ef2..7889158c52 100644
> --- a/target/ppc/cpu_init.c
> +++ b/target/ppc/cpu_init.c
> @@ -6351,7 +6351,7 @@ static bool ppc_pvr_match_power9(PowerPCCPUClass *pcc, uint32_t pvr, bool best)
>  return false;
>  }
>  
> -static int p9_interrupt_powersave(CPUPPCState *env)
> +int p9_interrupt_powersave(CPUPPCState *env)
>  {
>  /* External Exception */
>  if ((env->pending_interrupts & PPC_INTERRUPT_EXT) &&
> diff --git a/target/ppc/excp_helper.c b/target/ppc/excp_helper.c
> index 67e73f30ab..5a0d2c11a2 100644
> --- a/target/ppc/excp_helper.c
> +++ b/target/ppc/excp_helper.c
> @@ -1686,28 +1686,39 @@ void ppc_cpu_do_interrupt(CPUState *cs)
>  
>  static int p9_next_unmasked_interrupt(CPUPPCState *env)
>  {
> -bool async_deliver;
> +PowerPCCPU *cpu = env_archcpu(env);
> +CPUState *cs = CPU(cpu);
> +/* Ignore MSR[EE] when coming out of some power management states */
> +bool msr_ee = FIELD_EX64(env->msr, MSR, EE) || env->resume_as_sreset;
>  
>  assert((env->pending_interrupts & P9_UNUSED_INTERRUPTS) == 0);
>  
> +if (cs->halted) {
> +if (env->spr[SPR_PSSCR] & PSSCR_EC) {
> +/*
> + * When PSSCR[EC] is set, LPCR[PECE] controls which interrupts 
> can
> + * wakeup the processor
> + */
> +return p9_interrupt_powersave(env);
> +} else {
> +/*
> + * When it's clear, any system-caused exception exits power-saving
> + * mode, even the ones that gate on MSR[EE].
> + */
> +msr_ee = true;
> +}
> +}
> +
>  /* Machine check exception */
>  if (env->pending_interrupts & PPC_INTERRUPT_MCK) {
>  return PPC_INTERRUPT_MCK;
>  }
>  
> -/*
> - * For interrupts that gate on MSR:EE, we need to do something a
> - * bit more subtle, as we need to let them through even when EE is
> - * clear when coming out of some power management states (in order
> - * for them to become a 0x100).
> - */
> -async_deliver = FIELD_EX64(env->msr, MSR, EE) || env->resume_as_sreset;
> -

You could simplify the code below if you bail early here when !msr_ee.
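
A minimal sketch of that bail-early shape, with QEMU's types stubbed out and
the HV-mode special cases (where HDECR/HVIRT/EXT can still be delivered with
EE clear) deliberately left out; all names below are invented for the example,
not taken from the patch:

```c
/* Illustrative only: "return early when !msr_ee" control flow.
 * Machine check is checked first because it is never gated on MSR[EE];
 * every interrupt after the early return is EE-gated in this sketch. */
#include <assert.h>
#include <stdbool.h>

#define PPC_INTERRUPT_MCK  (1u << 0)   /* machine check, not gated on EE */
#define PPC_INTERRUPT_DECR (1u << 1)   /* decrementer, gated on MSR[EE]  */

static int next_unmasked_interrupt(unsigned pending, bool msr_ee)
{
    /* Machine check is delivered regardless of MSR[EE]. */
    if (pending & PPC_INTERRUPT_MCK) {
        return PPC_INTERRUPT_MCK;
    }
    /* Everything that remains gates on MSR[EE], so bail out early. */
    if (!msr_ee) {
        return 0;
    }
    if (pending & PPC_INTERRUPT_DECR) {
        return PPC_INTERRUPT_DECR;
    }
    return 0;
}
```

The real function would still need the HV-gated checks before the early
return, so the simplification only applies to the EE-gated tail.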

>  /* Hypervisor decrementer exception */
>  if (env->pending_interrupts & PPC_INTERRUPT_HDECR) {
>  /* LPCR will be clear when not supported so this will work */
>  bool hdice = !!(env->spr[SPR_LPCR] & LPCR_HDICE);
> -if ((async_deliver || !FIELD_EX64_HV(env->msr)) && hdice) {
> +if ((msr_ee || !FIELD_EX64_HV(env->msr)) && hdice) {
>  /* HDEC clears on delivery */
>  return PPC_INTERRUPT_HDECR;
>  }
> @@ -1717,7 +1728,7 @@ static int p9_next_unmasked_interrupt(CPUPPCState *env)
>  if (env->pending_interrupts & PPC_INTERRUPT_HVIRT) {
>  /* LPCR will be clear when not supported so this will work */
>  bool hvice = !!(env->spr[SPR_LPCR] & LPCR_HVICE);
> -if ((async_deliver || !FIELD_EX64_HV(env->msr)) && hvice) {
> +if ((msr_ee || !FIELD_EX64_HV(env->msr)) && hvice) {
>  return PPC_INTERRUPT_HVIRT;
>  }
>  }
> @@ -1727,13 +1738,13 @@ static int p9_next_unmasked_interrupt(CPUPPCState *env)
>  bool lpes0 = !!(env->spr[SPR_LPCR] & LPCR_LPES0);
>  bool heic = !!(env->spr[SPR_LPCR] & LPCR_HEIC);
>  /* HEIC blocks delivery to the hypervisor */
> -if ((async_deliver && !(heic && FIELD_EX64_HV(env->msr) &&
> +if ((msr_ee && !(heic && FIELD_EX64_HV(env->msr) &&
>  !FIELD_EX64(env->msr, MSR, PR))) ||
>  (env->has_hv_mode && !FIELD_EX64_HV(env->msr) && !lpes0)) {
>  return PPC_INTERRUPT_EXT;
>  }
>  }
> -if (async_deliver != 0) {
> +if (msr_ee != 0) {
>  /* Decrementer exception */
>  if (env->pending_interrupts & PPC_INTERRUPT_DECR) {
>  return PPC_INTERRUPT_DECR;
> @@ -1895,6 +1906,15 @@ static void p9_deliver_interrupt(CPUPPCState *env, int interrupt)
>  PowerPCCPU *cpu = env_archcpu(env);
>  CPUState *cs = env_cpu(env);
>  
> +if (cs->halted && !(env->spr[SPR_PSSCR] & PSSCR_EC) &&
> +!FIELD_EX64(env->msr, MSR, EE)) {
> +/*
> + * A pending interrupt took us out of power-saving, but MSR[EE] says
> + * that we should return to NIP+4 instead of delivering it.
> + */
> +return;

How will the NIP be advanced in 

Re: [RFC PATCH v2 09/29] target/ppc: remove generic architecture checks from p9_deliver_interrupt

2022-09-30 Thread Fabiano Rosas
Matheus Ferst  writes:

> No functional change intended.
>
> Signed-off-by: Matheus Ferst 
> ---
>  target/ppc/excp_helper.c | 9 +
>  1 file changed, 1 insertion(+), 8 deletions(-)
>
> diff --git a/target/ppc/excp_helper.c b/target/ppc/excp_helper.c
> index 603c956588..67e73f30ab 100644
> --- a/target/ppc/excp_helper.c
> +++ b/target/ppc/excp_helper.c
> @@ -1919,18 +1919,11 @@ static void p9_deliver_interrupt(CPUPPCState *env, int interrupt)
>  break;
>  
>  case PPC_INTERRUPT_DECR: /* Decrementer exception */
> -if (ppc_decr_clear_on_delivery(env)) {
> -env->pending_interrupts &= ~PPC_INTERRUPT_DECR;
> -}

Maybe I'm missing something, but this should continue to clear the bit,
no? Same comment for P8.

>  powerpc_excp(cpu, POWERPC_EXCP_DECR);
>  break;
>  case PPC_INTERRUPT_DOORBELL:
>  env->pending_interrupts &= ~PPC_INTERRUPT_DOORBELL;
> -if (is_book3s_arch2x(env)) {
> -powerpc_excp(cpu, POWERPC_EXCP_SDOOR);
> -} else {
> -powerpc_excp(cpu, POWERPC_EXCP_DOORI);
> -}
> +powerpc_excp(cpu, POWERPC_EXCP_SDOOR);
>  break;
>  case PPC_INTERRUPT_HDOORBELL:
>  env->pending_interrupts &= ~PPC_INTERRUPT_HDOORBELL;



[PATCH 2/2] docs/system/arm/emulation.rst: Report FEAT_GTG support

2022-09-30 Thread Peter Maydell
FEAT_GTG is a change to the ID register ID_AA64MMFR0_EL1 so that it
can report a different set of supported granule (page) sizes for
stage 1 and stage 2 translation tables.  As of commit c20281b2a5048
we already report the granule sizes that way for '-cpu max', and now
we also correctly make attempts to use unimplemented granule sizes
fail, so we can report the support of the feature in the
documentation.

Signed-off-by: Peter Maydell 
---
 docs/system/arm/emulation.rst | 1 +
 1 file changed, 1 insertion(+)

diff --git a/docs/system/arm/emulation.rst b/docs/system/arm/emulation.rst
index be7bbffe595..cfb4b0768b0 100644
--- a/docs/system/arm/emulation.rst
+++ b/docs/system/arm/emulation.rst
@@ -31,6 +31,7 @@ the following architecture extensions:
 - FEAT_FRINTTS (Floating-point to integer instructions)
 - FEAT_FlagM (Flag manipulation instructions v2)
 - FEAT_FlagM2 (Enhancements to flag manipulation instructions)
+- FEAT_GTG (Guest translation granule size)
 - FEAT_HCX (Support for the HCRX_EL2 register)
 - FEAT_HPDS (Hierarchical permission disables)
 - FEAT_I8MM (AArch64 Int8 matrix multiplication instructions)
-- 
2.25.1




[PATCH 0/2] target/arm: Enforce implemented granule size limits

2022-09-30 Thread Peter Maydell
Arm CPUs support some subset of the granule (page) sizes 4K, 16K and
64K.  The guest selects the one it wants using bits in the TCR_ELx
registers.  If it tries to program these registers with a value that
is either reserved or which requests a size that the CPU does not
implement, the architecture requires that the CPU behaves as if the
field was programmed to some size that has been implemented.
Currently we don't implement this, and instead let the guest use any
granule size, even if the CPU ID register fields say it isn't
present.
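
The required fallback behaviour can be sketched in isolation like this (QEMU's
types stubbed out; the helper names are invented, but the TG0 decode order and
the smallest-implemented-granule fallback mirror what patch 1 does):

```c
/* Illustrative only: decode the TCR_ELx TG0 field and, when it selects a
 * reserved or unimplemented granule, behave as if an implemented one had
 * been programmed (IMPDEF choice: the smallest implemented granule). */
#include <assert.h>
#include <stdbool.h>

enum granule { GRAN_4K, GRAN_64K, GRAN_16K, GRAN_INVALID };

/* 2-bit TG0 field, same encoding order as the architecture. */
static enum granule decode_tg0(int tg0)
{
    switch (tg0) {
    case 0: return GRAN_4K;
    case 1: return GRAN_64K;
    case 2: return GRAN_16K;
    default: return GRAN_INVALID;
    }
}

static enum granule sanitize(enum granule g, bool have4k, bool have16k,
                             bool have64k)
{
    if ((g == GRAN_4K && have4k) || (g == GRAN_16K && have16k) ||
        (g == GRAN_64K && have64k)) {
        return g;   /* requested granule is implemented, keep it */
    }
    /* Fall back to the smallest implemented granule. */
    return have4k ? GRAN_4K : (have16k ? GRAN_16K : GRAN_64K);
}
```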

Patch 1 in this series makes us enforce this architectural
requirement (the main effect will be that we stop incorrectly
implementing 16K granules on most of the non-cpu-max CPUs).

Patch 2 adds FEAT_GTG to the list of supported features, because
all this feature really is is the definition of the separate
fields for stage1 and stage2 granule support in ID_AA64MMFR0_EL1,
and we already updated -cpu max to report its granule support
that way when we were adding the LPA2 support.

thanks
-- PMM

Peter Maydell (2):
  target/arm: Don't allow guest to use unimplemented granule sizes
  docs/system/arm/emulation.rst: Report FEAT_GTG support

 docs/system/arm/emulation.rst |   1 +
 target/arm/cpu.h  |  33 ++
 target/arm/helper.c   | 110 +++---
 3 files changed, 136 insertions(+), 8 deletions(-)

-- 
2.25.1




[PATCH 1/2] target/arm: Don't allow guest to use unimplemented granule sizes

2022-09-30 Thread Peter Maydell
Arm CPUs support some subset of the granule (page) sizes 4K, 16K and
64K.  The guest selects the one it wants using bits in the TCR_ELx
registers.  If it tries to program these registers with a value that
is either reserved or which requests a size that the CPU does not
implement, the architecture requires that the CPU behaves as if the
field was programmed to some size that has been implemented.
Currently we don't implement this, and instead let the guest use any
granule size, even if the CPU ID register fields say it isn't
present.

Make aa64_va_parameters() check against the supported granule size
and force use of a different one if it is not implemented.

Signed-off-by: Peter Maydell 
---
Unusually, the architecture would allow us to do this sanitizing
when the TCR_ELx register is written, because it permits that the
value of the register read back can be one corresponding to the
IMPDEF chosen size rather than having to be the value written.
But I opted to do the handling in aa64_va_parameters() anyway,
on the assumption that this isn't critically in the fast path.
---
 target/arm/cpu.h|  33 +
 target/arm/helper.c | 110 
 2 files changed, 135 insertions(+), 8 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 33cdbc0143e..6d39d27378d 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -4103,6 +4103,39 @@ static inline bool isar_feature_aa64_tgran16_2_lpa2(const ARMISARegisters *id)
 return t >= 3 || (t == 0 && isar_feature_aa64_tgran16_lpa2(id));
 }
 
+static inline bool isar_feature_aa64_tgran4(const ARMISARegisters *id)
+{
+return FIELD_SEX64(id->id_aa64mmfr0, ID_AA64MMFR0, TGRAN4) >= 0;
+}
+
+static inline bool isar_feature_aa64_tgran16(const ARMISARegisters *id)
+{
+return FIELD_EX64(id->id_aa64mmfr0, ID_AA64MMFR0, TGRAN16) >= 1;
+}
+
+static inline bool isar_feature_aa64_tgran64(const ARMISARegisters *id)
+{
+return FIELD_SEX64(id->id_aa64mmfr0, ID_AA64MMFR0, TGRAN64) >= 0;
+}
+
+static inline bool isar_feature_aa64_tgran4_2(const ARMISARegisters *id)
+{
+unsigned t = FIELD_EX64(id->id_aa64mmfr0, ID_AA64MMFR0, TGRAN4_2);
+return t >= 2 || (t == 0 && isar_feature_aa64_tgran4(id));
+}
+
+static inline bool isar_feature_aa64_tgran16_2(const ARMISARegisters *id)
+{
+unsigned t = FIELD_EX64(id->id_aa64mmfr0, ID_AA64MMFR0, TGRAN16_2);
+return t >= 2 || (t == 0 && isar_feature_aa64_tgran16(id));
+}
+
+static inline bool isar_feature_aa64_tgran64_2(const ARMISARegisters *id)
+{
+unsigned t = FIELD_EX64(id->id_aa64mmfr0, ID_AA64MMFR0, TGRAN64_2);
+return t >= 2 || (t == 0 && isar_feature_aa64_tgran64(id));
+}
+
 static inline bool isar_feature_aa64_ccidx(const ARMISARegisters *id)
 {
 return FIELD_EX64(id->id_aa64mmfr2, ID_AA64MMFR2, CCIDX) != 0;
diff --git a/target/arm/helper.c b/target/arm/helper.c
index b5dac651e75..7c4eea58739 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -10289,20 +10289,113 @@ static int aa64_va_parameter_tcma(uint64_t tcr, ARMMMUIdx mmu_idx)
 }
 }
 
+typedef enum GranuleSize {
+/* Same order as TG0 encoding */
+Gran4K,
+Gran64K,
+Gran16K,
+GranInvalid,
+} GranuleSize;
+
+static GranuleSize tg0_to_gran_size(int tg)
+{
+switch (tg) {
+case 0:
+return Gran4K;
+case 1:
+return Gran64K;
+case 2:
+return Gran16K;
+default:
+return GranInvalid;
+}
+}
+
+static GranuleSize tg1_to_gran_size(int tg)
+{
+switch (tg) {
+case 1:
+return Gran16K;
+case 2:
+return Gran4K;
+case 3:
+return Gran64K;
+default:
+return GranInvalid;
+}
+}
+
+static inline bool have4k(ARMCPU *cpu, bool stage2)
+{
+return stage2 ? cpu_isar_feature(aa64_tgran4_2, cpu)
+: cpu_isar_feature(aa64_tgran4, cpu);
+}
+
+static inline bool have16k(ARMCPU *cpu, bool stage2)
+{
+return stage2 ? cpu_isar_feature(aa64_tgran16_2, cpu)
+: cpu_isar_feature(aa64_tgran16, cpu);
+}
+
+static inline bool have64k(ARMCPU *cpu, bool stage2)
+{
+return stage2 ? cpu_isar_feature(aa64_tgran64_2, cpu)
+: cpu_isar_feature(aa64_tgran64, cpu);
+}
+
+static GranuleSize sanitize_gran_size(ARMCPU *cpu, GranuleSize gran,
+  bool stage2)
+{
+switch (gran) {
+case Gran4K:
+if (have4k(cpu, stage2)) {
+return gran;
+}
+break;
+case Gran16K:
+if (have16k(cpu, stage2)) {
+return gran;
+}
+break;
+case Gran64K:
+if (have64k(cpu, stage2)) {
+return gran;
+}
+break;
+case GranInvalid:
+break;
+}
+/*
+ * If the guest selects a granule size that isn't implemented,
+ * the architecture requires that we behave as if it selected one
+ * that is (with an IMPDEF choice of which one to pick). We choose
+ * to implement the smallest supported granule size.
+ */
+

Re: [PATCH v5 16/17] accel/tcg: Introduce TARGET_TB_PCREL

2022-09-30 Thread Richard Henderson

On 9/30/22 06:25, Peter Maydell wrote:

On Fri, 30 Sept 2022 at 14:23, Alex Bennée  wrote:



Peter Maydell  writes:

This is going to break previously working setups involving
the "filter logging to a particular address range" and also
anybody post-processing logfiles and expecting to see
the virtual address in -d exec logging, I think.


To be honest I've never found -exec logging that useful for system
emulation (beyond check-tcg tests) because it just generates so much
data.


It can be very useful for "give me a list of all the
PC values where we executed an instruction", for shorter
test cases. You can then (given several of these) look at
where two runs diverge, and similar things. I use it,
so please don't break it :-)


Ok, I'm reworking the patchset to always have the proper virtual pc for logging.


r~



Re: [PATCH v2 for-7.2 0/6] Drop libslirp submodule

2022-09-30 Thread Thomas Huth

On 30/09/2022 18.50, Christian Schoenebeck wrote:

On Wednesday, 24 August 2022 17:11:16 CEST Thomas Huth wrote:

At the point in time we're going to release QEMU 7.2, all supported
host OS distributions will have a libslirp package available, so
there is no need anymore for us to ship the slirp submodule. Thus
let's clean up the related tests and finally remove the submodule now.

v2:
- Added patches to clean up and adapt the tests
- Rebased the removal patch to the latest version of the master branch

Thomas Huth (6):
   tests/docker: Update the debian-all-test-cross container to Debian 11
   tests/vm: Add libslirp to the VM tests
   tests/lcitool/libvirt-ci: Update the lcitool module to the latest
 version
   tests: Refresh dockerfiles and FreeBSD vars with lcitool
   tests/avocado: Do not run tests that require libslirp if it is not
 available
   Remove the slirp submodule (i.e. compile only with an external
 libslirp)


And I was wondering (bisecting) why network silently stopped working here.

While I understand the motivation for this change, it's probably not a user
friendly situation to just silently drop functionality. As slirp was the
default networking backend (i.e. not just some exotic QEMU feature), wouldn't
it make sense then to make missing libslirp a build-time error by default?


See discussion here:


https://lore.kernel.org/qemu-devel/a25c238b-dabd-bf20-9aee-7cda4e422...@redhat.com/

and patch here:


https://lore.kernel.org/qemu-devel/20220929163237.1417215-1-marcandre.lur...@redhat.com/

 HTH,
  Thomas




[PATCH] Revert "qapi: fix examples of blockdev-add with qcow2"

2022-09-30 Thread Markus Armbruster
This reverts commit b6522938327141235b97ab38e40c6c4512587373.

Kevin Wolf NAKed this patch, because:

'file' is a required member (defined in BlockdevOptionsGenericFormat),
removing it makes the example invalid. 'data-file' is only an additional
optional member to be used for external data files (i.e. when the guest
data is kept separate from the metadata in the .qcow2 file).

However, it had already been merged then.  Revert.

Signed-off-by: Markus Armbruster 
---
 qapi/block-core.json | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index f21fa235f2..882b266532 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1541,8 +1541,8 @@
 # -> { "execute": "blockdev-add",
 #  "arguments": { "driver": "qcow2",
 # "node-name": "node1534",
-# "data-file": { "driver": "file",
-#"filename": "hd1.qcow2" },
+# "file": { "driver": "file",
+#   "filename": "hd1.qcow2" },
 # "backing": null } }
 #
 # <- { "return": {} }
@@ -4378,7 +4378,7 @@
 #  "arguments": {
 #   "driver": "qcow2",
 #   "node-name": "test1",
-#   "data-file": {
+#   "file": {
 #   "driver": "file",
 #   "filename": "test.qcow2"
 #}
@@ -4395,7 +4395,7 @@
 #   "cache": {
 #  "direct": true
 #},
-#   "data-file": {
+#"file": {
 #  "driver": "file",
 #  "filename": "/tmp/test.qcow2"
 #},
@@ -4477,7 +4477,7 @@
 #  "arguments": {
 #   "driver": "qcow2",
 #   "node-name": "node0",
-#   "data-file": {
+#   "file": {
 #   "driver": "file",
 #   "filename": "test.qcow2"
 #   }
-- 
2.37.2




Re: [PULL 00/10] QAPI patches patches for 2022-09-07

2022-09-30 Thread Markus Armbruster
Markus Armbruster  writes:

> Markus Armbruster  writes:
>
>> Gentle reminder, Victor :)
>>
>> Markus Armbruster  writes:
>>
>>> Markus Armbruster  writes:
>>>
 Kevin Wolf  writes:

> Am 07.09.2022 um 17:03 hat Markus Armbruster geschrieben:
>> The following changes since commit 
>> 946e9bccf12f2bcc3ca471b820738fb22d14fc80:
>
> [...]
>
>>   qapi: fix examples of blockdev-add with qcow2
>
> NACK, this patch is wrong.
>
> 'file' is a required member (defined in BlockdevOptionsGenericFormat),
> removing it makes the example invalid. 'data-file' is only an additional
> optional member to be used for external data files (i.e. when the guest
> data is kept separate from the metadata in the .qcow2 file).

 I'll respin with #8 dropped.  Thank you!
>>>
>>> Too late, it's already merged.
>>>
>>> Victor, could you fix on top?  Or would you like me to revert the patch?
>
> Revert posted: 
>
> Subject: [PATCH] Revert "qapi: fix examples of blockdev-add with qcow2"
> Date: Fri, 30 Sep 2022 17:26:34 +0200
> Message-Id: <20220930152634.774907-1-arm...@redhat.com>

I messed up the send.  Correction is
Message-Id: <20220930171908.846769-1-arm...@redhat.com>




[PULL 16/18] hw/ide/core: Clear LBA and drive bits for EXECUTE DEVICE DIAGNOSTIC

2022-09-30 Thread Kevin Wolf
From: Lev Kujawski 

Prior to this patch, cmd_exec_dev_diagnostic relied upon
ide_set_signature to clear the device register.  While the
preservation of the drive bit by ide_set_signature is necessary for
the DEVICE RESET, IDENTIFY DEVICE, and READ SECTOR commands,
ATA/ATAPI-6 specifies that "DEV shall be cleared to zero" for EXECUTE
DEVICE DIAGNOSTIC.

This deviation was uncovered by the ATACT Device Testing Program
written by Hale Landis.

Signed-off-by: Lev Kujawski 
Message-Id: <20220707031140.158958-3-lku...@member.fsf.org>
Signed-off-by: Kevin Wolf 
---
 hw/ide/core.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/hw/ide/core.c b/hw/ide/core.c
index 7cbc0a54a7..b747191ebf 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -1704,8 +1704,14 @@ static bool cmd_identify_packet(IDEState *s, uint8_t cmd)
 return false;
 }
 
+/* EXECUTE DEVICE DIAGNOSTIC */
 static bool cmd_exec_dev_diagnostic(IDEState *s, uint8_t cmd)
 {
+/*
+ * Clear the device register per the ATA (v6) specification,
+ * because ide_set_signature does not clear LBA or drive bits.
+ */
+s->select = (ATA_DEV_ALWAYS_ON);
 ide_set_signature(s);
 
 if (s->drive_kind == IDE_CD) {
-- 
2.37.3




[PULL 18/18] hw/ide/core.c: Implement ATA INITIALIZE_DEVICE_PARAMETERS command

2022-09-30 Thread Kevin Wolf
From: Lev Kujawski 

CHS-based disk utilities and operating systems may adjust the logical
geometry of a hard drive to cope with the expectations or limitations
of software using the ATA INITIALIZE_DEVICE_PARAMETERS command.

Prior to this patch, INITIALIZE_DEVICE_PARAMETERS was a nop that
always returned success, raising the possibility of data loss or
corruption if the CHS<->LBA translation redirected a write to the
wrong sector.
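To see why a nop here risks corruption, consider the classic CHS-to-LBA translation (a minimal sketch, not QEMU code): the same CHS tuple resolves to a different LBA once the logical geometry changes, so if the guest adjusts the geometry but the device ignores the command, guest and device translate addresses with different parameters.

```python
def chs_to_lba(c, h, s, heads, sectors_per_track):
    """Classic ATA CHS -> LBA translation (sector numbering starts at 1)."""
    return (c * heads + h) * sectors_per_track + (s - 1)

# The same CHS tuple under two different logical geometries:
lba_a = chs_to_lba(1, 0, 1, heads=16, sectors_per_track=63)  # 1008
lba_b = chs_to_lba(1, 0, 1, heads=15, sectors_per_track=63)  # 945
assert lba_a != lba_b
```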

* hw/ide/core.c
ide_reset():
  Reset the logical CHS geometry of the hard disk when the power-on
  defaults feature is enabled.
cmd_specify():
  a) New function implementing INITIALIZE_DEVICE_PARAMETERS.
  b) Ignore calls for empty or ATAPI devices.
cmd_set_features():
  Implement the power-on defaults enable and disable features.
struct ide_cmd_table:
  Switch WIN_SPECIFY from cmd_nop() to cmd_specify().
ide_init_drive():
  Set new fields 'drive_heads' and 'drive_sectors' based upon the
  actual disk geometry.

* include/hw/ide/internal.h
struct IDEState:
a) Store the actual drive CHS values within the new fields
   'drive_heads' and 'drive_sectors.'
b) Track whether a soft IDE reset should also reset the logical CHS
   geometry of the hard disk within the new field 'reset_reverts'.

Signed-off-by: Lev Kujawski 
Message-Id: <20220707031140.158958-7-lku...@member.fsf.org>
Signed-off-by: Kevin Wolf 
---
 include/hw/ide/internal.h |  3 +++
 hw/ide/core.c | 29 ++---
 2 files changed, 29 insertions(+), 3 deletions(-)

diff --git a/include/hw/ide/internal.h b/include/hw/ide/internal.h
index 97e7e59dc5..b17f36df95 100644
--- a/include/hw/ide/internal.h
+++ b/include/hw/ide/internal.h
@@ -375,6 +375,7 @@ struct IDEState {
 uint8_t unit;
 /* ide config */
 IDEDriveKind drive_kind;
+int drive_heads, drive_sectors;
 int cylinders, heads, sectors, chs_trans;
 int64_t nb_sectors;
 int mult_sectors;
@@ -401,6 +402,8 @@ struct IDEState {
 uint8_t select;
 uint8_t status;
 
+bool reset_reverts;
+
 /* set for lba48 access */
 uint8_t lba48;
 BlockBackend *blk;
diff --git a/hw/ide/core.c b/hw/ide/core.c
index b747191ebf..39afdc0006 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -1340,6 +1340,11 @@ static void ide_reset(IDEState *s)
 s->pio_aiocb = NULL;
 }
 
+if (s->reset_reverts) {
+s->reset_reverts = false;
+s->heads = s->drive_heads;
+s->sectors   = s->drive_sectors;
+}
 if (s->drive_kind == IDE_CFATA)
 s->mult_sectors = 0;
 else
@@ -1618,6 +1623,20 @@ static bool cmd_check_power_mode(IDEState *s, uint8_t cmd)
 return true;
 }
 
+/* INITIALIZE DEVICE PARAMETERS */
+static bool cmd_specify(IDEState *s, uint8_t cmd)
+{
+if (s->blk && s->drive_kind != IDE_CD) {
+s->heads = (s->select & (ATA_DEV_HS)) + 1;
+s->sectors = s->nsector;
+ide_set_irq(s->bus);
+} else {
+ide_abort_command(s);
+}
+
+return true;
+}
+
 static bool cmd_set_features(IDEState *s, uint8_t cmd)
 {
 uint16_t *identify_data;
@@ -1641,7 +1660,11 @@ static bool cmd_set_features(IDEState *s, uint8_t cmd)
 ide_flush_cache(s);
 return false;
 case 0xcc: /* reverting to power-on defaults enable */
+s->reset_reverts = true;
+return true;
 case 0x66: /* reverting to power-on defaults disable */
+s->reset_reverts = false;
+return true;
 case 0xaa: /* read look-ahead enable */
 case 0x55: /* read look-ahead disable */
 case 0x05: /* set advanced power management mode */
@@ -2051,7 +2074,7 @@ static const struct {
 [WIN_SEEK]= { cmd_seek, HD_CFA_OK | SET_DSC },
 [CFA_TRANSLATE_SECTOR]= { cmd_cfa_translate_sector, CFA_OK },
 [WIN_DIAGNOSE]= { cmd_exec_dev_diagnostic, ALL_OK },
-[WIN_SPECIFY] = { cmd_nop, HD_CFA_OK | SET_DSC },
+[WIN_SPECIFY] = { cmd_specify, HD_CFA_OK | SET_DSC },
 [WIN_STANDBYNOW2] = { cmd_nop, HD_CFA_OK },
 [WIN_IDLEIMMEDIATE2]  = { cmd_nop, HD_CFA_OK },
 [WIN_STANDBY2]= { cmd_nop, HD_CFA_OK },
@@ -2541,8 +2564,8 @@ int ide_init_drive(IDEState *s, BlockBackend *blk, IDEDriveKind kind,
 
 blk_get_geometry(blk, &nb_sectors);
 s->cylinders = cylinders;
-s->heads = heads;
-s->sectors = secs;
+s->heads = s->drive_heads = heads;
+s->sectors = s->drive_sectors = secs;
 s->chs_trans = chs_trans;
 s->nb_sectors = nb_sectors;
 s->wwn = wwn;
-- 
2.37.3




[PULL 09/18] block/qcow2: Keep auto_backing_file if possible

2022-09-30 Thread Kevin Wolf
From: Hanna Reitz 

qcow2_do_open() is used by qcow2_co_invalidate_cache(), i.e. may be run
on an image that has been opened before.  When reading the backing file
string from the image header, compare it against the existing
bs->backing_file, and update bs->auto_backing_file only if they differ.

auto_backing_file should ideally contain the filename the backing BDS
will actually have after opening, i.e. a post-bdrv_refresh_filename()
version of what is in the image header.  So for example, if the image
header reports the following backing file string:

json:{"driver": "qcow2", "file": {
"driver": "file", "filename": "/tmp/backing.qcow2"
}}

Then auto_backing_file should contain simply "/tmp/backing.qcow2".

Because bdrv_refresh_filename() only works on existing BDSs, though, the
way how we get this auto_backing_file value is to have the format driver
set it to whatever is in the image header, and when the backing BDS is
opened based on that, we update it with the filename the backing BDS
actually got.

However, qcow2's qcow2_co_invalidate_cache() implementation breaks this
because it just resets auto_backing_file to whatever is in the image
file without opening a BDS based on it, so we never get
auto_backing_file back to the "refreshed" version, and in the example
above, it would stay "json:{...}".

Then, bs->backing->bs->filename will differ from bs->auto_backing_file,
making bdrv_backing_overridden(bs) return true, which will lead
bdrv_refresh_filename(bs) to generate a json:{} filename for bs, even
though that may not have been necessary.  This is reported in the issue
linked below.

Therefore, skip updating auto_backing_file if nothing has changed in the
image header since we last read it.
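The effect can be sketched in a few lines (hypothetical values; a plain string stands in for what bdrv_refresh_filename() would produce):

```python
# Hypothetical backing-file string as stored in the qcow2 header:
header = 'json:{"driver": "file", "filename": "/tmp/b.qcow2"}'
refreshed = "/tmp/b.qcow2"   # what bdrv_refresh_filename() reduces it to

# State after the backing BDS was first opened:
backing_file = header        # bs->backing_file, as read from the header
auto_backing_file = refreshed

def backing_overridden(auto, backing_bs_filename):
    # Simplified stand-in for bdrv_backing_overridden()
    return auto != backing_bs_filename

# Old invalidate_cache behaviour: blindly re-read from the header.
assert backing_overridden(header, refreshed)     # forces a json:{} filename

# Fixed behaviour: skip the update when the header string is unchanged.
auto = auto_backing_file if header == backing_file else header
assert not backing_overridden(auto, refreshed)   # plain filename survives
```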

Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1117
Signed-off-by: Hanna Reitz 
Message-Id: <2022080316.20723-2-hre...@redhat.com>
Reviewed-by: Kevin Wolf 
Signed-off-by: Kevin Wolf 
---
 block/qcow2.c | 21 -
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index c8fc3a6160..6c8c8b2b5a 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1697,16 +1697,27 @@ static int coroutine_fn qcow2_do_open(BlockDriverState *bs, QDict *options,
 ret = -EINVAL;
 goto fail;
 }
+
+s->image_backing_file = g_malloc(len + 1);
 ret = bdrv_pread(bs->file, header.backing_file_offset, len,
- bs->auto_backing_file, 0);
+ s->image_backing_file, 0);
 if (ret < 0) {
 error_setg_errno(errp, -ret, "Could not read backing file name");
 goto fail;
 }
-bs->auto_backing_file[len] = '\0';
-pstrcpy(bs->backing_file, sizeof(bs->backing_file),
-bs->auto_backing_file);
-s->image_backing_file = g_strdup(bs->auto_backing_file);
+s->image_backing_file[len] = '\0';
+
+/*
+ * Update only when something has changed.  This function is called by
+ * qcow2_co_invalidate_cache(), and we do not want to reset
+ * auto_backing_file unless necessary.
+ */
+if (!g_str_equal(s->image_backing_file, bs->backing_file)) {
+pstrcpy(bs->backing_file, sizeof(bs->backing_file),
+s->image_backing_file);
+pstrcpy(bs->auto_backing_file, sizeof(bs->auto_backing_file),
+s->image_backing_file);
+}
 }
 
 /*
-- 
2.37.3




[PULL 14/18] piix_ide_reset: Use pci_set_* functions instead of direct access

2022-09-30 Thread Kevin Wolf
From: Lev Kujawski 

Eliminate the remaining TODOs in hw/ide/piix.c by:
* Using pci_set_{size} functions to write the PIIX PCI configuration
  space instead of manipulating it directly as an array; and
* Documenting the default register values by reference to the
  controlling specification.

Signed-off-by: Lev Kujawski 
Message-Id: <20220707031140.158958-1-lku...@member.fsf.org>
Signed-off-by: Kevin Wolf 
---
 hw/ide/piix.c | 17 +
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/hw/ide/piix.c b/hw/ide/piix.c
index 9a9b28078e..de1f4f0efb 100644
--- a/hw/ide/piix.c
+++ b/hw/ide/piix.c
@@ -21,6 +21,10 @@
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
  * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
  * THE SOFTWARE.
+ *
+ * References:
+ *  [1] 82371FB (PIIX) AND 82371SB (PIIX3) PCI ISA IDE XCELERATOR,
+ *  290550-002, Intel Corporation, April 1997.
  */
 
 #include "qemu/osdep.h"
@@ -114,14 +118,11 @@ static void piix_ide_reset(DeviceState *dev)
 ide_bus_reset(&d->bus[i]);
 }
 
-/* TODO: this is the default. do not override. */
-pci_conf[PCI_COMMAND] = 0x00;
-/* TODO: this is the default. do not override. */
-pci_conf[PCI_COMMAND + 1] = 0x00;
-/* TODO: use pci_set_word */
-pci_conf[PCI_STATUS] = PCI_STATUS_FAST_BACK;
-pci_conf[PCI_STATUS + 1] = PCI_STATUS_DEVSEL_MEDIUM >> 8;
-pci_conf[0x20] = 0x01; /* BMIBA: 20-23h */
+/* PCI command register default value (0000h) per [1, p.48].  */
+pci_set_word(pci_conf + PCI_COMMAND, 0x0000);
+pci_set_word(pci_conf + PCI_STATUS,
+ PCI_STATUS_DEVSEL_MEDIUM | PCI_STATUS_FAST_BACK);
+pci_set_byte(pci_conf + 0x20, 0x01);  /* BMIBA: 20-23h */
 }
 
 static int pci_piix_init_ports(PCIIDEState *d)
-- 
2.37.3




[PULL 15/18] tests/qtest/ide-test.c: Create disk image for use as a secondary

2022-09-30 Thread Kevin Wolf
From: Lev Kujawski 

Change 'tmp_path' into an array of two members to accommodate another
disk image of size TEST_IMAGE_SIZE.  This facilitates testing ATA
protocol aspects peculiar to secondary devices on the same controller.

Signed-off-by: Lev Kujawski 
Message-Id: <20220707031140.158958-2-lku...@member.fsf.org>
Signed-off-by: Kevin Wolf 
---
 tests/qtest/ide-test.c | 39 ++-
 1 file changed, 22 insertions(+), 17 deletions(-)

diff --git a/tests/qtest/ide-test.c b/tests/qtest/ide-test.c
index 4ea89c26c9..93b4416023 100644
--- a/tests/qtest/ide-test.c
+++ b/tests/qtest/ide-test.c
@@ -121,7 +121,7 @@ enum {
 static QPCIBus *pcibus = NULL;
 static QGuestAllocator guest_malloc;
 
-static char *tmp_path;
+static char *tmp_path[2];
 static char *debug_path;
 
 static QTestState *ide_test_start(const char *cmdline_fmt, ...)
@@ -310,7 +310,7 @@ static QTestState *test_bmdma_setup(void)
 qts = ide_test_start(
 "-drive file=%s,if=ide,cache=writeback,format=raw "
 "-global ide-hd.serial=%s -global ide-hd.ver=%s",
-tmp_path, "testdisk", "version");
+tmp_path[0], "testdisk", "version");
 qtest_irq_intercept_in(qts, "ioapic");
 
 return qts;
@@ -574,7 +574,7 @@ static void test_identify(void)
 qts = ide_test_start(
 "-drive file=%s,if=ide,cache=writeback,format=raw "
 "-global ide-hd.serial=%s -global ide-hd.ver=%s",
-tmp_path, "testdisk", "version");
+tmp_path[0], "testdisk", "version");
 
 dev = get_pci_device(qts, &bmdma_bar, &ide_bar);
 
@@ -662,7 +662,7 @@ static void test_flush(void)
 
 qts = ide_test_start(
 "-drive file=blkdebug::%s,if=ide,cache=writeback,format=raw",
-tmp_path);
+tmp_path[0]);
 
 dev = get_pci_device(qts, &bmdma_bar, &ide_bar);
 
@@ -713,7 +713,7 @@ static void test_pci_retry_flush(void)
 qts = ide_test_start(
 "-drive file=blkdebug:%s:%s,if=ide,cache=writeback,format=raw,"
 "rerror=stop,werror=stop",
-debug_path, tmp_path);
+debug_path, tmp_path[0]);
 
 dev = get_pci_device(qts, &bmdma_bar, &ide_bar);
 
@@ -892,14 +892,14 @@ static void cdrom_pio_impl(int nblocks)
 
 /* Prepopulate the CDROM with an interesting pattern */
 generate_pattern(pattern, patt_len, ATAPI_BLOCK_SIZE);
-fh = fopen(tmp_path, "wb+");
+fh = fopen(tmp_path[0], "wb+");
 ret = fwrite(pattern, ATAPI_BLOCK_SIZE, patt_blocks, fh);
 g_assert_cmpint(ret, ==, patt_blocks);
 fclose(fh);
 
 qts = ide_test_start(
 "-drive if=none,file=%s,media=cdrom,format=raw,id=sr0,index=0 "
-"-device ide-cd,drive=sr0,bus=ide.0", tmp_path);
+"-device ide-cd,drive=sr0,bus=ide.0", tmp_path[0]);
 dev = get_pci_device(qts, &bmdma_bar, &ide_bar);
 qtest_irq_intercept_in(qts, "ioapic");
 
@@ -985,7 +985,7 @@ static void test_cdrom_dma(void)
 
 qts = ide_test_start(
 "-drive if=none,file=%s,media=cdrom,format=raw,id=sr0,index=0 "
-"-device ide-cd,drive=sr0,bus=ide.0", tmp_path);
+"-device ide-cd,drive=sr0,bus=ide.0", tmp_path[0]);
 qtest_irq_intercept_in(qts, "ioapic");
 
 guest_buf = guest_alloc(&guest_malloc, len);
@@ -993,7 +993,7 @@ static void test_cdrom_dma(void)
 prdt[0].size = cpu_to_le32(len | PRDT_EOT);
 
 generate_pattern(pattern, ATAPI_BLOCK_SIZE * 16, ATAPI_BLOCK_SIZE);
-fh = fopen(tmp_path, "wb+");
+fh = fopen(tmp_path[0], "wb+");
 ret = fwrite(pattern, ATAPI_BLOCK_SIZE, 16, fh);
 g_assert_cmpint(ret, ==, 16);
 fclose(fh);
@@ -1012,6 +1012,7 @@ static void test_cdrom_dma(void)
 int main(int argc, char **argv)
 {
 const char *base;
+int i;
 int fd;
 int ret;
 
@@ -1035,12 +1036,14 @@ int main(int argc, char **argv)
 close(fd);
 
 /* Create a temporary raw image */
-tmp_path = g_strdup_printf("%s/qtest.XXXXXX", base);
-fd = g_mkstemp(tmp_path);
-g_assert(fd >= 0);
-ret = ftruncate(fd, TEST_IMAGE_SIZE);
-g_assert(ret == 0);
-close(fd);
+for (i = 0; i < 2; ++i) {
+tmp_path[i] = g_strdup_printf("%s/qtest.XXXXXX", base);
+fd = g_mkstemp(tmp_path[i]);
+g_assert(fd >= 0);
+ret = ftruncate(fd, TEST_IMAGE_SIZE);
+g_assert(ret == 0);
+close(fd);
+}
 
 /* Run the tests */
 g_test_init(&argc, &argv, NULL);
@@ -1064,8 +1067,10 @@ int main(int argc, char **argv)
 ret = g_test_run();
 
 /* Cleanup */
-unlink(tmp_path);
-g_free(tmp_path);
+for (i = 0; i < 2; ++i) {
+unlink(tmp_path[i]);
+g_free(tmp_path[i]);
+}
 unlink(debug_path);
 g_free(debug_path);
 
-- 
2.37.3




[PULL 13/18] block: use the request length for iov alignment

2022-09-30 Thread Kevin Wolf
From: Keith Busch 

An iov length needs to be aligned to the logical block size, which may
be larger than the memory alignment.
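A sketch of the corrected check (not the actual QEMU function): buffer addresses are validated against the memory alignment, while lengths are validated against the possibly larger request (logical block) alignment.

```python
def qiov_is_aligned(iovs, mem_align, request_align):
    """Return True if every (base, length) element honours both
    the memory alignment (addresses) and the request alignment
    (lengths), mirroring the fixed bdrv_qiov_is_aligned() logic."""
    for base, length in iovs:
        if base % mem_align:
            return False
        if length % request_align:
            return False
    return True

# 512-byte memory alignment, 4096-byte logical blocks:
assert qiov_is_aligned([(0x1000, 4096)], 512, 4096)
# A 512-byte element is memory-aligned but too short for 4k blocks:
assert not qiov_is_aligned([(0x1000, 512)], 512, 4096)
```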

Tested-by: Jens Axboe 
Signed-off-by: Keith Busch 
Message-Id: <20220929200523.3218710-3-kbu...@meta.com>
Reviewed-by: Kevin Wolf 
Signed-off-by: Kevin Wolf 
---
 block/file-posix.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index 989dfc4586..66fdb07820 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -2068,13 +2068,14 @@ static bool bdrv_qiov_is_aligned(BlockDriverState *bs, QEMUIOVector *qiov)
 {
 int i;
 size_t alignment = bdrv_min_mem_align(bs);
+size_t len = bs->bl.request_alignment;
 IO_CODE();
 
 for (i = 0; i < qiov->niov; i++) {
 if ((uintptr_t) qiov->iov[i].iov_base % alignment) {
 return false;
 }
-if (qiov->iov[i].iov_len % alignment) {
+if (qiov->iov[i].iov_len % len) {
 return false;
 }
 }
-- 
2.37.3




[PULL 00/18] Block layer patches

2022-09-30 Thread Kevin Wolf
The following changes since commit c8de6ec63d766ca1998c5af468483ce912fdc0c2:

  Merge tag 'pull-request-2022-09-28' of https://gitlab.com/thuth/qemu into staging (2022-09-28 17:04:11 -0400)

are available in the Git repository at:

  git://repo.or.cz/qemu/kevin.git tags/for-upstream

for you to fetch changes up to 176e4961bb33d559da1af441fb0ee2e0cb8245ae:

  hw/ide/core.c: Implement ATA INITIALIZE_DEVICE_PARAMETERS command (2022-09-30 18:43:44 +0200)


Block layer patches

- Fix missing block_acct_setup() with -blockdev
- Keep auto_backing_file post-migration
- file-posix: Fixed O_DIRECT memory alignment
- ide: Fix state after EXECUTE DEVICE DIAGNOSTIC and implement
  INITIALIZE DEVICE PARAMETERS
- qemu-img: Wean documentation and help output off '?' for help
- qcow2: fix memory leak and compiler warning
- Code cleanups


Denis V. Lunev (4):
  block: pass OnOffAuto instead of bool to block_acct_setup()
  block: add missed block_acct_setup with new block device init procedure
  block: use bdrv_is_sg() helper instead of raw bs->sg reading
  block: make serializing requests functions 'void'

Hanna Reitz (3):
  block/qcow2: Keep auto_backing_file if possible
  block/qed: Keep auto_backing_file if possible
  iotests/backing-file-invalidation: Add new test

Keith Busch (2):
  block: move bdrv_qiov_is_aligned to file-posix
  block: use the request length for iov alignment

Lev Kujawski (5):
  piix_ide_reset: Use pci_set_* functions instead of direct access
  tests/qtest/ide-test.c: Create disk image for use as a secondary
  hw/ide/core: Clear LBA and drive bits for EXECUTE DEVICE DIAGNOSTIC
  tests/qtest/ide-test: Verify that DIAGNOSTIC clears DEV to zero
  hw/ide/core.c: Implement ATA INITIALIZE_DEVICE_PARAMETERS command

Markus Armbruster (1):
  qemu-img: Wean documentation and help output off '?' for help

Philippe Mathieu-Daudé (1):
  block/qcow2-bitmap: Add missing cast to silent GCC error

Stefan Hajnoczi (1):
  gluster: stop using .bdrv_needs_filename

lu zhipeng (1):
  qcow2: fix memory leak in qcow2_read_extensions

 docs/tools/qemu-img.rst|   2 +-
 include/block/accounting.h |   6 +-
 include/block/block-io.h   |   1 -
 include/block/block_int-io.h   |   2 +-
 include/hw/block/block.h   |   7 +-
 include/hw/ide/internal.h  |   3 +
 block/accounting.c |  26 +++-
 block/file-posix.c |  24 +++-
 block/gluster.c|   4 -
 block/io.c |  44 +-
 block/iscsi.c  |   2 +-
 block/qcow2-bitmap.c   |   2 +-
 block/qcow2.c  |  22 ++-
 block/qed.c|  15 +-
 block/raw-format.c |   4 +-
 blockdev.c |  17 ++-
 hw/block/block.c   |   2 +
 hw/ide/core.c  |  35 -
 hw/ide/piix.c  |  17 +--
 qemu-img.c |   4 +-
 tests/qtest/ide-test.c |  72 +++---
 tests/qemu-iotests/172.out |  76 +++
 tests/qemu-iotests/227.out |   4 +-
 tests/qemu-iotests/tests/backing-file-invalidation | 152 +
 .../tests/backing-file-invalidation.out|   5 +
 25 files changed, 447 insertions(+), 101 deletions(-)
 create mode 100755 tests/qemu-iotests/tests/backing-file-invalidation
 create mode 100644 tests/qemu-iotests/tests/backing-file-invalidation.out




[PULL 11/18] iotests/backing-file-invalidation: Add new test

2022-09-30 Thread Kevin Wolf
From: Hanna Reitz 

Add a new test to see what happens when you migrate a VM with a backing
chain that has json:{} backing file strings, which, when opened, will be
resolved to plain filenames.

Signed-off-by: Hanna Reitz 
Message-Id: <2022080316.20723-4-hre...@redhat.com>
Reviewed-by: Kevin Wolf 
Signed-off-by: Kevin Wolf 
---
 .../tests/backing-file-invalidation   | 152 ++
 .../tests/backing-file-invalidation.out   |   5 +
 2 files changed, 157 insertions(+)
 create mode 100755 tests/qemu-iotests/tests/backing-file-invalidation
 create mode 100644 tests/qemu-iotests/tests/backing-file-invalidation.out

diff --git a/tests/qemu-iotests/tests/backing-file-invalidation b/tests/qemu-iotests/tests/backing-file-invalidation
new file mode 100755
index 0000000000..4eccc80153
--- /dev/null
+++ b/tests/qemu-iotests/tests/backing-file-invalidation
@@ -0,0 +1,152 @@
+#!/usr/bin/env python3
+# group: rw migration
+#
+# Migrate a VM with a BDS with backing nodes, which runs
+# bdrv_invalidate_cache(), which for qcow2 and qed triggers reading the
+# backing file string from the image header.  Check whether this
+# interferes with bdrv_backing_overridden().
+#
+# Copyright (C) 2022 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+import json
+import os
+from typing import Optional
+
+import iotests
+from iotests import qemu_img_create, qemu_img_info
+
+
+image_size = 1 * 1024 * 1024
+imgs = [os.path.join(iotests.test_dir, f'{i}.img') for i in range(0, 4)]
+
+mig_sock = os.path.join(iotests.sock_dir, 'mig.sock')
+
+
+class TestPostMigrateFilename(iotests.QMPTestCase):
+vm_s: Optional[iotests.VM] = None
+vm_d: Optional[iotests.VM] = None
+
+def setUp(self) -> None:
+# Create backing chain of three images, where the backing file strings
+# are json:{} filenames
+qemu_img_create('-f', iotests.imgfmt, imgs[0], str(image_size))
+for i in range(1, 3):
+backing = {
+'driver': iotests.imgfmt,
+'file': {
+'driver': 'file',
+'filename': imgs[i - 1]
+}
+}
+qemu_img_create('-f', iotests.imgfmt, '-F', iotests.imgfmt,
+'-b', 'json:' + json.dumps(backing),
+imgs[i], str(image_size))
+
+def tearDown(self) -> None:
+if self.vm_s is not None:
+self.vm_s.shutdown()
+if self.vm_d is not None:
+self.vm_d.shutdown()
+
+for img in imgs:
+try:
+os.remove(img)
+except OSError:
+pass
+try:
+os.remove(mig_sock)
+except OSError:
+pass
+
+def test_migration(self) -> None:
+"""
+Migrate a VM with the backing chain created in setUp() attached.  At
+the end of the migration process, the destination will run
+bdrv_invalidate_cache(), which for some image formats (qcow2 and qed)
+means the backing file string is re-read from the image header.  If
+this overwrites bs->auto_backing_file, doing so may cause
+bdrv_backing_overridden() to become true: The image header reports a
+json:{} filename, but when opening it, bdrv_refresh_filename() will
+simplify it to a plain simple filename; and when bs->auto_backing_file
+and bs->backing->bs->filename differ, bdrv_backing_overridden() becomes
+true.
+If bdrv_backing_overridden() is true, the BDS will be forced to get a
+json:{} filename, which in general is not the end of the world, but not
+great.  Check whether that happens, i.e. whether migration changes the
+node's filename.
+"""
+
+blockdev = {
+'node-name': 'node0',
+'driver': iotests.imgfmt,
+'file': {
+'driver': 'file',
+'filename': imgs[2]
+}
+}
+
+self.vm_s = iotests.VM(path_suffix='a') \
+   .add_blockdev(json.dumps(blockdev))
+self.vm_d = iotests.VM(path_suffix='b') \
+   .add_blockdev(json.dumps(blockdev)) \
+   .add_incoming(f'unix:{mig_sock}')
+
+assert self.vm_s is not None
+assert self.vm_d is not None
+
+self

[PULL 12/18] block: move bdrv_qiov_is_aligned to file-posix

2022-09-30 Thread Kevin Wolf
From: Keith Busch 

There is only one user of bdrv_qiov_is_aligned(), so move the alignment
function there and make it static.

Signed-off-by: Keith Busch 
Message-Id: <20220929200523.3218710-2-kbu...@meta.com>
Reviewed-by: Kevin Wolf 
Signed-off-by: Kevin Wolf 
---
 include/block/block-io.h |  1 -
 block/file-posix.c   | 21 +
 block/io.c   | 21 -
 3 files changed, 21 insertions(+), 22 deletions(-)

diff --git a/include/block/block-io.h b/include/block/block-io.h
index fd25ffa9be..492f95fc05 100644
--- a/include/block/block-io.h
+++ b/include/block/block-io.h
@@ -150,7 +150,6 @@ void *qemu_blockalign(BlockDriverState *bs, size_t size);
 void *qemu_blockalign0(BlockDriverState *bs, size_t size);
 void *qemu_try_blockalign(BlockDriverState *bs, size_t size);
 void *qemu_try_blockalign0(BlockDriverState *bs, size_t size);
-bool bdrv_qiov_is_aligned(BlockDriverState *bs, QEMUIOVector *qiov);
 
 void bdrv_enable_copy_on_read(BlockDriverState *bs);
 void bdrv_disable_copy_on_read(BlockDriverState *bs);
diff --git a/block/file-posix.c b/block/file-posix.c
index 256de1f456..989dfc4586 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -2061,6 +2061,27 @@ static int coroutine_fn raw_thread_pool_submit(BlockDriverState *bs,
 return thread_pool_submit_co(pool, func, arg);
 }
 
+/*
+ * Check if all memory in this vector is sector aligned.
+ */
+static bool bdrv_qiov_is_aligned(BlockDriverState *bs, QEMUIOVector *qiov)
+{
+int i;
+size_t alignment = bdrv_min_mem_align(bs);
+IO_CODE();
+
+for (i = 0; i < qiov->niov; i++) {
+if ((uintptr_t) qiov->iov[i].iov_base % alignment) {
+return false;
+}
+if (qiov->iov[i].iov_len % alignment) {
+return false;
+}
+}
+
+return true;
+}
+
 static int coroutine_fn raw_co_prw(BlockDriverState *bs, uint64_t offset,
                                   uint64_t bytes, QEMUIOVector *qiov, int type)
 {
diff --git a/block/io.c b/block/io.c
index 51d8f943a4..c3200bcdff 100644
--- a/block/io.c
+++ b/block/io.c
@@ -3227,27 +3227,6 @@ void *qemu_try_blockalign0(BlockDriverState *bs, size_t size)
 return mem;
 }
 
-/*
- * Check if all memory in this vector is sector aligned.
- */
-bool bdrv_qiov_is_aligned(BlockDriverState *bs, QEMUIOVector *qiov)
-{
-int i;
-size_t alignment = bdrv_min_mem_align(bs);
-IO_CODE();
-
-for (i = 0; i < qiov->niov; i++) {
-if ((uintptr_t) qiov->iov[i].iov_base % alignment) {
-return false;
-}
-if (qiov->iov[i].iov_len % alignment) {
-return false;
-}
-}
-
-return true;
-}
-
 void bdrv_io_plug(BlockDriverState *bs)
 {
 BdrvChild *child;
-- 
2.37.3




[PULL 17/18] tests/qtest/ide-test: Verify that DIAGNOSTIC clears DEV to zero

2022-09-30 Thread Kevin Wolf
From: Lev Kujawski 

Verify correction of EXECUTE DEVICE DIAGNOSTIC introduced in commit
72423831c3 (hw/ide/core: Clear LBA and drive bits for EXECUTE DEVICE
DIAGNOSTIC, 2022-05-28).

Signed-off-by: Lev Kujawski 
Message-Id: <20220707031140.158958-4-lku...@member.fsf.org>
Signed-off-by: Kevin Wolf 
---
 tests/qtest/ide-test.c | 33 +
 1 file changed, 33 insertions(+)

diff --git a/tests/qtest/ide-test.c b/tests/qtest/ide-test.c
index 93b4416023..dbe1563b23 100644
--- a/tests/qtest/ide-test.c
+++ b/tests/qtest/ide-test.c
@@ -90,6 +90,7 @@ enum {
 
 enum {
 CMD_DSM = 0x06,
+CMD_DIAGNOSE= 0x90,
 CMD_READ_DMA= 0xc8,
 CMD_WRITE_DMA   = 0xca,
 CMD_FLUSH_CACHE = 0xe7,
@@ -614,6 +615,36 @@ static void test_identify(void)
 free_pci_device(dev);
 }
 
+static void test_diagnostic(void)
+{
+QTestState *qts;
+QPCIDevice *dev;
+QPCIBar bmdma_bar, ide_bar;
+uint8_t data;
+
+qts = ide_test_start(
+"-blockdev driver=file,node-name=hda,filename=%s "
+"-blockdev driver=file,node-name=hdb,filename=%s "
+"-device ide-hd,drive=hda,bus=ide.0,unit=0 "
+"-device ide-hd,drive=hdb,bus=ide.0,unit=1 ",
+tmp_path[0], tmp_path[1]);
+
+dev = get_pci_device(qts, &bmdma_bar, &ide_bar);
+
+/* DIAGNOSE command on device 1 */
+qpci_io_writeb(dev, ide_bar, reg_device, DEV);
+data = qpci_io_readb(dev, ide_bar, reg_device);
+g_assert_cmphex(data & DEV, ==, DEV);
+qpci_io_writeb(dev, ide_bar, reg_command, CMD_DIAGNOSE);
+
+/* Verify that DEVICE is now 0 */
+data = qpci_io_readb(dev, ide_bar, reg_device);
+g_assert_cmphex(data & DEV, ==, 0);
+
+ide_test_quit(qts);
+free_pci_device(dev);
+}
+
 /*
  * Write sector 1 with random data to make IDE storage dirty
  * Needed for flush tests so that flushes actually go though the block layer
@@ -1050,6 +1081,8 @@ int main(int argc, char **argv)
 
 qtest_add_func("/ide/identify", test_identify);
 
+qtest_add_func("/ide/diagnostic", test_diagnostic);
+
 qtest_add_func("/ide/bmdma/simple_rw", test_bmdma_simple_rw);
 qtest_add_func("/ide/bmdma/trim", test_bmdma_trim);
 qtest_add_func("/ide/bmdma/various_prdts", test_bmdma_various_prdts);
-- 
2.37.3




[PULL 01/18] qcow2: fix memory leak in qcow2_read_extensions

2022-09-30 Thread Kevin Wolf
From: lu zhipeng 

Free feature_table if it is failed in bdrv_pread.

Signed-off-by: lu zhipeng 
Message-Id: <20220921144515.1166-1-luzhip...@cestc.cn>
Reviewed-by: Kevin Wolf 
Signed-off-by: Kevin Wolf 
---
 block/qcow2.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/block/qcow2.c b/block/qcow2.c
index c6c6692fb7..c8fc3a6160 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -275,6 +275,7 @@ static int qcow2_read_extensions(BlockDriverState *bs, uint64_t start_offset,
 if (ret < 0) {
 error_setg_errno(errp, -ret, "ERROR: ext_feature_table: "
  "Could not read table");
+g_free(feature_table);
 return ret;
 }
 
-- 
2.37.3




[PULL 07/18] block: make serializing requests functions 'void'

2022-09-30 Thread Kevin Wolf
From: "Denis V. Lunev" 

Return codes of the following functions are never used in the code:
* bdrv_wait_serialising_requests_locked
* bdrv_wait_serialising_requests
* bdrv_make_request_serialising

Signed-off-by: Denis V. Lunev 
CC: Kevin Wolf 
CC: Hanna Reitz 
CC: Stefan Hajnoczi 
CC: Fam Zheng 
CC: Ronnie Sahlberg 
CC: Paolo Bonzini 
CC: Peter Lieven 
CC: Vladimir Sementsov-Ogievskiy 
Message-Id: <20220817083736.40981-3-...@openvz.org>
Reviewed-by: Kevin Wolf 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
Signed-off-by: Kevin Wolf 
---
 include/block/block_int-io.h |  2 +-
 block/io.c   | 23 +++
 2 files changed, 8 insertions(+), 17 deletions(-)

diff --git a/include/block/block_int-io.h b/include/block/block_int-io.h
index 91cdd61692..4b0b3e17ef 100644
--- a/include/block/block_int-io.h
+++ b/include/block/block_int-io.h
@@ -73,7 +73,7 @@ static inline int coroutine_fn bdrv_co_pwrite(BdrvChild *child,
 return bdrv_co_pwritev(child, offset, bytes, &qiov, flags);
 }
 
-bool coroutine_fn bdrv_make_request_serialising(BdrvTrackedRequest *req,
+void coroutine_fn bdrv_make_request_serialising(BdrvTrackedRequest *req,
 uint64_t align);
 BdrvTrackedRequest *coroutine_fn bdrv_co_get_self_request(BlockDriverState *bs);
 
diff --git a/block/io.c b/block/io.c
index 0a8cbefe86..51d8f943a4 100644
--- a/block/io.c
+++ b/block/io.c
@@ -828,20 +828,16 @@ bdrv_find_conflicting_request(BdrvTrackedRequest *self)
 }
 
 /* Called with self->bs->reqs_lock held */
-static bool coroutine_fn
+static void coroutine_fn
 bdrv_wait_serialising_requests_locked(BdrvTrackedRequest *self)
 {
 BdrvTrackedRequest *req;
-bool waited = false;
 
 while ((req = bdrv_find_conflicting_request(self))) {
 self->waiting_for = req;
 qemu_co_queue_wait(&req->wait_queue, &self->bs->reqs_lock);
 self->waiting_for = NULL;
-waited = true;
 }
-
-return waited;
 }
 
 /* Called with req->bs->reqs_lock held */
@@ -934,36 +930,31 @@ void bdrv_dec_in_flight(BlockDriverState *bs)
 bdrv_wakeup(bs);
 }
 
-static bool coroutine_fn bdrv_wait_serialising_requests(BdrvTrackedRequest *self)
+static void coroutine_fn
+bdrv_wait_serialising_requests(BdrvTrackedRequest *self)
 {
 BlockDriverState *bs = self->bs;
-bool waited = false;
 
 if (!qatomic_read(&bs->serialising_in_flight)) {
-return false;
+return;
 }
 
 qemu_co_mutex_lock(&bs->reqs_lock);
-waited = bdrv_wait_serialising_requests_locked(self);
+bdrv_wait_serialising_requests_locked(self);
 qemu_co_mutex_unlock(&bs->reqs_lock);
-
-return waited;
 }
 
-bool coroutine_fn bdrv_make_request_serialising(BdrvTrackedRequest *req,
+void coroutine_fn bdrv_make_request_serialising(BdrvTrackedRequest *req,
 uint64_t align)
 {
-bool waited;
 IO_CODE();
 
 qemu_co_mutex_lock(&req->bs->reqs_lock);
 
 tracked_request_set_serialising(req, align);
-waited = bdrv_wait_serialising_requests_locked(req);
+bdrv_wait_serialising_requests_locked(req);
 
 qemu_co_mutex_unlock(&req->bs->reqs_lock);
-
-return waited;
 }
 
 int bdrv_check_qiov_request(int64_t offset, int64_t bytes,
-- 
2.37.3




[PULL 08/18] gluster: stop using .bdrv_needs_filename

2022-09-30 Thread Kevin Wolf
From: Stefan Hajnoczi 

The gluster protocol driver used to parse URIs (filenames) but was
extended with a richer JSON syntax in commit 6c7189bb29de
("block/gluster: add support for multiple gluster servers"). The gluster
drivers that have JSON parsing set .bdrv_needs_filename to false.

The gluster+unix and gluster+rdma drivers still require a filename even
though the JSON parser is equipped to parse the same volume/path/sockaddr
details as the URI parser. Let's allow JSON parsing for these drivers too.

Note that the gluster+rdma driver actually uses TCP because RDMA support
is not available, so the JSON server.type field must be "inet".

Drop .bdrv_needs_filename since both the filename and the JSON parsers
can handle gluster+unix and gluster+rdma. This change is in preparation
for eventually removing .bdrv_needs_filename across the entire codebase.
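
For illustration, with this change a gluster+unix image can be opened via
the JSON syntax as well as via a filename. A sketch of such a blockdev
description follows (the volume name, image path and socket path are made
up for the example, and exact field names may vary between QEMU versions):

```json
{
  "driver": "gluster",
  "volume": "testvol",
  "path": "/images/a.img",
  "server": [
    { "type": "unix", "path": "/var/run/glusterd.socket" }
  ]
}
```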

Cc: Prasanna Kumar Kalever 
Signed-off-by: Stefan Hajnoczi 
Message-Id: <20220811164905.430834-1-stefa...@redhat.com>
Reviewed-by: Kevin Wolf 
Signed-off-by: Kevin Wolf 
---
 block/gluster.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/block/gluster.c b/block/gluster.c
index b60213ab80..bb1144cf6a 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -1555,7 +1555,6 @@ static BlockDriver bdrv_gluster = {
 .format_name  = "gluster",
 .protocol_name= "gluster",
 .instance_size= sizeof(BDRVGlusterState),
-.bdrv_needs_filename  = false,
 .bdrv_file_open   = qemu_gluster_open,
 .bdrv_reopen_prepare  = qemu_gluster_reopen_prepare,
 .bdrv_reopen_commit   = qemu_gluster_reopen_commit,
@@ -1585,7 +1584,6 @@ static BlockDriver bdrv_gluster_tcp = {
 .format_name  = "gluster",
 .protocol_name= "gluster+tcp",
 .instance_size= sizeof(BDRVGlusterState),
-.bdrv_needs_filename  = false,
 .bdrv_file_open   = qemu_gluster_open,
 .bdrv_reopen_prepare  = qemu_gluster_reopen_prepare,
 .bdrv_reopen_commit   = qemu_gluster_reopen_commit,
@@ -1615,7 +1613,6 @@ static BlockDriver bdrv_gluster_unix = {
 .format_name  = "gluster",
 .protocol_name= "gluster+unix",
 .instance_size= sizeof(BDRVGlusterState),
-.bdrv_needs_filename  = true,
 .bdrv_file_open   = qemu_gluster_open,
 .bdrv_reopen_prepare  = qemu_gluster_reopen_prepare,
 .bdrv_reopen_commit   = qemu_gluster_reopen_commit,
@@ -1651,7 +1648,6 @@ static BlockDriver bdrv_gluster_rdma = {
 .format_name  = "gluster",
 .protocol_name= "gluster+rdma",
 .instance_size= sizeof(BDRVGlusterState),
-.bdrv_needs_filename  = true,
 .bdrv_file_open   = qemu_gluster_open,
 .bdrv_reopen_prepare  = qemu_gluster_reopen_prepare,
 .bdrv_reopen_commit   = qemu_gluster_reopen_commit,
-- 
2.37.3




[PULL 10/18] block/qed: Keep auto_backing_file if possible

2022-09-30 Thread Kevin Wolf
From: Hanna Reitz 

Just like qcow2, qed invokes its open function in its
.bdrv_co_invalidate_cache() implementation.  Therefore, just like done
for qcow2 in HEAD^, update auto_backing_file only if the backing file
string in the image header differs from the one we have read before.

Signed-off-by: Hanna Reitz 
Message-Id: <2022080316.20723-3-hre...@redhat.com>
Reviewed-by: Kevin Wolf 
Signed-off-by: Kevin Wolf 
---
 block/qed.c | 15 +++
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/block/qed.c b/block/qed.c
index 40943e679b..324ca0e95a 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -445,6 +445,8 @@ static int coroutine_fn bdrv_qed_do_open(BlockDriverState *bs, QDict *options,
 }
 
 if ((s->header.features & QED_F_BACKING_FILE)) {
+g_autofree char *backing_file_str = NULL;
+
 if ((uint64_t)s->header.backing_filename_offset +
 s->header.backing_filename_size >
 s->header.cluster_size * s->header.header_size) {
@@ -452,16 +454,21 @@ static int coroutine_fn bdrv_qed_do_open(BlockDriverState *bs, QDict *options,
 return -EINVAL;
 }
 
+backing_file_str = g_malloc(sizeof(bs->backing_file));
 ret = qed_read_string(bs->file, s->header.backing_filename_offset,
   s->header.backing_filename_size,
-  bs->auto_backing_file,
-  sizeof(bs->auto_backing_file));
+  backing_file_str, sizeof(bs->backing_file));
 if (ret < 0) {
 error_setg(errp, "Failed to read backing filename");
 return ret;
 }
-pstrcpy(bs->backing_file, sizeof(bs->backing_file),
-bs->auto_backing_file);
+
+if (!g_str_equal(backing_file_str, bs->backing_file)) {
+pstrcpy(bs->backing_file, sizeof(bs->backing_file),
+backing_file_str);
+pstrcpy(bs->auto_backing_file, sizeof(bs->auto_backing_file),
+backing_file_str);
+}
 
 if (s->header.features & QED_F_BACKING_FORMAT_NO_PROBE) {
 pstrcpy(bs->backing_format, sizeof(bs->backing_format), "raw");
-- 
2.37.3




[PULL 03/18] qemu-img: Wean documentation and help output off '?' for help

2022-09-30 Thread Kevin Wolf
From: Markus Armbruster 

'?' for help is deprecated since commit c8057f951d "Support 'help' as
a synonym for '?' in command line options", v1.2.0.  We neglected to
update output of qemu-img --help and the manual.  Do that now.

Signed-off-by: Markus Armbruster 
Message-Id: <20220908130842.641410-1-arm...@redhat.com>
Reviewed-by: Kevin Wolf 
Signed-off-by: Kevin Wolf 
---
 docs/tools/qemu-img.rst | 2 +-
 qemu-img.c  | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/docs/tools/qemu-img.rst b/docs/tools/qemu-img.rst
index 85a6e05b35..15aeddc6d8 100644
--- a/docs/tools/qemu-img.rst
+++ b/docs/tools/qemu-img.rst
@@ -57,7 +57,7 @@ cases. See below for a description of the supported disk formats.
 *OUTPUT_FMT* is the destination format.
 
 *OPTIONS* is a comma separated list of format specific options in a
-name=value format. Use ``-o ?`` for an overview of the options supported
+name=value format. Use ``-o help`` for an overview of the options supported
 by the used format or see the format descriptions below for details.
 
 *SNAPSHOT_PARAM* is param used for internal snapshot, format is
diff --git a/qemu-img.c b/qemu-img.c
index 7d4b33b3da..cab9776f42 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -164,8 +164,8 @@ void help(void)
"  'output_filename' is the destination disk image filename\n"
"  'output_fmt' is the destination format\n"
"  'options' is a comma separated list of format specific options in a\n"
-   "name=value format. Use -o ? for an overview of the options supported by the\n"
-   "used format\n"
+   "name=value format. Use -o help for an overview of the options supported by\n"
+   "the used format\n"
"  'snapshot_param' is param used for internal snapshot, format\n"
"is 'snapshot.id=[ID],snapshot.name=[NAME]', or\n"
"'[ID_OR_NAME]'\n"
-- 
2.37.3




[PULL 05/18] block: add missed block_acct_setup with new block device init procedure

2022-09-30 Thread Kevin Wolf
From: "Denis V. Lunev" 

Commit 5f76a7aac156ca75680dad5df4a385fd0b58f6b1 looks harmless at first
glance, but it has changed things a lot. 'libvirt' uses it to detect that
it should follow the new initialization path, and this changes things
considerably. With this procedure followed, blockdev_init() is not called
anymore and thus the block_acct_setup() helper is not called either.

This means in particular that the defaults for block accounting statistics
have changed: account_invalid/account_failed are now initialized as false
instead of the original true.

This commit changes things to match the original behavior. There are the
following constraints:
* the new default value in block_acct_init() is set to true
* block_acct_setup() inside blockdev_init() is called before
  blkconf_apply_backend_options()
* thus the newly created option in the block device properties takes
  precedence if specified
Signed-off-by: Denis V. Lunev 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
CC: Peter Krempa 
CC: Markus Armbruster 
CC: John Snow 
CC: Kevin Wolf 
CC: Hanna Reitz 
Message-Id: <20220824095044.166009-3-...@openvz.org>
Reviewed-by: Kevin Wolf 
Signed-off-by: Kevin Wolf 
---
 include/hw/block/block.h   |  7 +++-
 block/accounting.c |  8 +++-
 hw/block/block.c   |  2 +
 tests/qemu-iotests/172.out | 76 ++
 tests/qemu-iotests/227.out |  4 +-
 5 files changed, 92 insertions(+), 5 deletions(-)

diff --git a/include/hw/block/block.h b/include/hw/block/block.h
index 5902c0440a..15fff66435 100644
--- a/include/hw/block/block.h
+++ b/include/hw/block/block.h
@@ -31,6 +31,7 @@ typedef struct BlockConf {
 uint32_t lcyls, lheads, lsecs;
 OnOffAuto wce;
 bool share_rw;
+OnOffAuto account_invalid, account_failed;
 BlockdevOnError rerror;
 BlockdevOnError werror;
 } BlockConf;
@@ -61,7 +62,11 @@ static inline unsigned int get_physical_block_exp(BlockConf *conf)
_conf.discard_granularity, -1),  \
 DEFINE_PROP_ON_OFF_AUTO("write-cache", _state, _conf.wce,   \
 ON_OFF_AUTO_AUTO),  \
-DEFINE_PROP_BOOL("share-rw", _state, _conf.share_rw, false)
+DEFINE_PROP_BOOL("share-rw", _state, _conf.share_rw, false),\
+DEFINE_PROP_ON_OFF_AUTO("account-invalid", _state,  \
+_conf.account_invalid, ON_OFF_AUTO_AUTO),   \
+DEFINE_PROP_ON_OFF_AUTO("account-failed", _state,   \
+_conf.account_failed, ON_OFF_AUTO_AUTO)
 
 #define DEFINE_BLOCK_PROPERTIES(_state, _conf)  \
 DEFINE_PROP_DRIVE("drive", _state, _conf.blk),  \
diff --git a/block/accounting.c b/block/accounting.c
index 6b300c5129..2829745377 100644
--- a/block/accounting.c
+++ b/block/accounting.c
@@ -38,6 +38,8 @@ void block_acct_init(BlockAcctStats *stats)
 if (qtest_enabled()) {
 clock_type = QEMU_CLOCK_VIRTUAL;
 }
+stats->account_invalid = true;
+stats->account_failed = true;
 }
 
 static bool bool_from_onoffauto(OnOffAuto val, bool def)
@@ -57,8 +59,10 @@ static bool bool_from_onoffauto(OnOffAuto val, bool def)
 void block_acct_setup(BlockAcctStats *stats, enum OnOffAuto account_invalid,
   enum OnOffAuto account_failed)
 {
-stats->account_invalid = bool_from_onoffauto(account_invalid, true);
-stats->account_failed = bool_from_onoffauto(account_failed, true);
+stats->account_invalid = bool_from_onoffauto(account_invalid,
+ stats->account_invalid);
+stats->account_failed = bool_from_onoffauto(account_failed,
+stats->account_failed);
 }
 
 void block_acct_cleanup(BlockAcctStats *stats)
diff --git a/hw/block/block.c b/hw/block/block.c
index 04279166ee..f9c4fe6767 100644
--- a/hw/block/block.c
+++ b/hw/block/block.c
@@ -205,6 +205,8 @@ bool blkconf_apply_backend_options(BlockConf *conf, bool readonly,
 blk_set_enable_write_cache(blk, wce);
 blk_set_on_error(blk, rerror, werror);
 
+block_acct_setup(blk_get_stats(blk), conf->account_invalid,
+ conf->account_failed);
 return true;
 }
 
diff --git a/tests/qemu-iotests/172.out b/tests/qemu-iotests/172.out
index 9479b92185..07eebf3583 100644
--- a/tests/qemu-iotests/172.out
+++ b/tests/qemu-iotests/172.out
@@ -28,6 +28,8 @@ Testing:
 discard_granularity = 4294967295 (4 GiB)
 write-cache = "auto"
 share-rw = false
+account-invalid = "auto"
+account-failed = "auto"
 drive-type = "288"
 
 
@@ -55,6 +57,8 @@ Testing: -fda TEST_DIR/t.qcow2
 discard_granularity = 4294967295 (4 GiB)
 write-cache = "auto"
 share-rw = false
+account-invalid = "auto"
+account-failed = "auto"

[PULL 06/18] block: use bdrv_is_sg() helper instead of raw bs->sg reading

2022-09-30 Thread Kevin Wolf
From: "Denis V. Lunev" 

I believe that if a helper exists, it must always be used for reading
the value. Doing otherwise breaks expectations.

Signed-off-by: Denis V. Lunev 
CC: Kevin Wolf 
CC: Hanna Reitz 
CC: Stefan Hajnoczi 
CC: Fam Zheng 
CC: Ronnie Sahlberg 
CC: Paolo Bonzini 
CC: Peter Lieven 
CC: Vladimir Sementsov-Ogievskiy 
Message-Id: <20220817083736.40981-2-...@openvz.org>
Reviewed-by: Kevin Wolf 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
Signed-off-by: Kevin Wolf 
---
 block/file-posix.c | 2 +-
 block/iscsi.c  | 2 +-
 block/raw-format.c | 4 ++--
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index 48cd096624..256de1f456 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -1295,7 +1295,7 @@ static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
 }
 #endif
 
-if (bs->sg || S_ISBLK(st.st_mode)) {
+if (bdrv_is_sg(bs) || S_ISBLK(st.st_mode)) {
 int ret = hdev_get_max_hw_transfer(s->fd, &st);
 
 if (ret > 0 && ret <= BDRV_REQUEST_MAX_BYTES) {
diff --git a/block/iscsi.c b/block/iscsi.c
index d707d0b354..612de127e5 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -2065,7 +2065,7 @@ static void iscsi_refresh_limits(BlockDriverState *bs, Error **errp)
 uint64_t max_xfer_len = iscsilun->use_16_for_rw ? 0x : 0x;
 unsigned int block_size = MAX(BDRV_SECTOR_SIZE, iscsilun->block_size);
 
-assert(iscsilun->block_size >= BDRV_SECTOR_SIZE || bs->sg);
+assert(iscsilun->block_size >= BDRV_SECTOR_SIZE || bdrv_is_sg(bs));
 
 bs->bl.request_alignment = block_size;
 
diff --git a/block/raw-format.c b/block/raw-format.c
index 69fd650eaf..c7278e348e 100644
--- a/block/raw-format.c
+++ b/block/raw-format.c
@@ -463,7 +463,7 @@ static int raw_open(BlockDriverState *bs, QDict *options, int flags,
 return -EINVAL;
 }
 
-bs->sg = bs->file->bs->sg;
+bs->sg = bdrv_is_sg(bs->file->bs);
 bs->supported_write_flags = BDRV_REQ_WRITE_UNCHANGED |
 (BDRV_REQ_FUA & bs->file->bs->supported_write_flags);
 bs->supported_zero_flags = BDRV_REQ_WRITE_UNCHANGED |
@@ -489,7 +489,7 @@ static int raw_open(BlockDriverState *bs, QDict *options, int flags,
 return ret;
 }
 
-if (bs->sg && (s->offset || s->has_size)) {
+if (bdrv_is_sg(bs) && (s->offset || s->has_size)) {
 error_setg(errp, "Cannot use offset/size with SCSI generic devices");
 return -EINVAL;
 }
-- 
2.37.3




[PULL 02/18] block/qcow2-bitmap: Add missing cast to silent GCC error

2022-09-30 Thread Kevin Wolf
From: Philippe Mathieu-Daudé 

Commit d1258dd0c8 ("qcow2: autoloading dirty bitmaps") added the
set_readonly_helper() GFunc handler, correctly casting the gpointer
user_data in both the g_slist_foreach() caller and the handler.
A few commits later (commit 1b6b0562db), the handler was reused in
qcow2_reopen_bitmaps_rw() but without the gpointer cast, resulting
in the following error when using Homebrew GCC 12.2.0:

  [2/658] Compiling C object libblock.fa.p/block_qcow2-bitmap.c.o
  ../../block/qcow2-bitmap.c: In function 'qcow2_reopen_bitmaps_rw':
  ../../block/qcow2-bitmap.c:1211:60: error: incompatible type for argument 3 of 'g_slist_foreach'
   1211 | g_slist_foreach(ro_dirty_bitmaps, set_readonly_helper, false);
        |                                                        ^
        |                                                        |
        |                                                        _Bool
  In file included from /opt/homebrew/Cellar/glib/2.72.3_1/include/glib-2.0/glib/gmain.h:26,
                   from /opt/homebrew/Cellar/glib/2.72.3_1/include/glib-2.0/glib/giochannel.h:33,
                   from /opt/homebrew/Cellar/glib/2.72.3_1/include/glib-2.0/glib.h:54,
                   from /Users/philmd/source/qemu/include/glib-compat.h:32,
                   from /Users/philmd/source/qemu/include/qemu/osdep.h:144,
                   from ../../block/qcow2-bitmap.c:28:
  /opt/homebrew/Cellar/glib/2.72.3_1/include/glib-2.0/glib/gslist.h:127:61: note: expected 'gpointer' {aka 'void *'} but argument is of type '_Bool'
    127 |                             gpointer          user_data);
        |                             ~~~~~~~~~~~~~~~~~~^
  At top level:
  FAILED: libblock.fa.p/block_qcow2-bitmap.c.o

Fix by adding the missing gpointer cast.

Fixes: 1b6b0562db ("qcow2: support .bdrv_reopen_bitmaps_rw")
Signed-off-by: Philippe Mathieu-Daudé 
Message-Id: <20220919182755.51967-1-f4...@amsat.org>
Reviewed-by: Kevin Wolf 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
Signed-off-by: Kevin Wolf 
---
 block/qcow2-bitmap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/qcow2-bitmap.c b/block/qcow2-bitmap.c
index ff3309846c..7197754843 100644
--- a/block/qcow2-bitmap.c
+++ b/block/qcow2-bitmap.c
@@ -1208,7 +1208,7 @@ int qcow2_reopen_bitmaps_rw(BlockDriverState *bs, Error **errp)
 }
 }
 
-g_slist_foreach(ro_dirty_bitmaps, set_readonly_helper, false);
+g_slist_foreach(ro_dirty_bitmaps, set_readonly_helper, (gpointer)false);
 ret = 0;
 
 out:
-- 
2.37.3




[PULL 04/18] block: pass OnOffAuto instead of bool to block_acct_setup()

2022-09-30 Thread Kevin Wolf
From: "Denis V. Lunev" 

We are going to have one more place where block_acct_setup() is called,
and it should not corrupt the original value.

Signed-off-by: Denis V. Lunev 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
CC: Peter Krempa 
CC: Markus Armbruster 
CC: John Snow 
CC: Kevin Wolf 
CC: Hanna Reitz 
Message-Id: <20220824095044.166009-2-...@openvz.org>
Reviewed-by: Kevin Wolf 
Signed-off-by: Kevin Wolf 
---
 include/block/accounting.h |  6 +++---
 block/accounting.c | 22 ++
 blockdev.c | 17 ++---
 3 files changed, 35 insertions(+), 10 deletions(-)

diff --git a/include/block/accounting.h b/include/block/accounting.h
index 878b4c3581..b9caad60d5 100644
--- a/include/block/accounting.h
+++ b/include/block/accounting.h
@@ -27,7 +27,7 @@
 
 #include "qemu/timed-average.h"
 #include "qemu/thread.h"
-#include "qapi/qapi-builtin-types.h"
+#include "qapi/qapi-types-common.h"
 
 typedef struct BlockAcctTimedStats BlockAcctTimedStats;
 typedef struct BlockAcctStats BlockAcctStats;
@@ -100,8 +100,8 @@ typedef struct BlockAcctCookie {
 } BlockAcctCookie;
 
 void block_acct_init(BlockAcctStats *stats);
-void block_acct_setup(BlockAcctStats *stats, bool account_invalid,
- bool account_failed);
+void block_acct_setup(BlockAcctStats *stats, enum OnOffAuto account_invalid,
+  enum OnOffAuto account_failed);
 void block_acct_cleanup(BlockAcctStats *stats);
 void block_acct_add_interval(BlockAcctStats *stats, unsigned interval_length);
 BlockAcctTimedStats *block_acct_interval_next(BlockAcctStats *stats,
diff --git a/block/accounting.c b/block/accounting.c
index 2030851d79..6b300c5129 100644
--- a/block/accounting.c
+++ b/block/accounting.c
@@ -40,11 +40,25 @@ void block_acct_init(BlockAcctStats *stats)
 }
 }
 
-void block_acct_setup(BlockAcctStats *stats, bool account_invalid,
-  bool account_failed)
+static bool bool_from_onoffauto(OnOffAuto val, bool def)
 {
-stats->account_invalid = account_invalid;
-stats->account_failed = account_failed;
+switch (val) {
+case ON_OFF_AUTO_AUTO:
+return def;
+case ON_OFF_AUTO_ON:
+return true;
+case ON_OFF_AUTO_OFF:
+return false;
+default:
+abort();
+}
+}
+
+void block_acct_setup(BlockAcctStats *stats, enum OnOffAuto account_invalid,
+  enum OnOffAuto account_failed)
+{
+stats->account_invalid = bool_from_onoffauto(account_invalid, true);
+stats->account_failed = bool_from_onoffauto(account_failed, true);
 }
 
 void block_acct_cleanup(BlockAcctStats *stats)
diff --git a/blockdev.c b/blockdev.c
index 9230888e34..392d9476e6 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -455,6 +455,17 @@ static void extract_common_blockdev_options(QemuOpts *opts, int *bdrv_flags,
 }
 }
 
+static OnOffAuto account_get_opt(QemuOpts *opts, const char *name)
+{
+if (!qemu_opt_find(opts, name)) {
+return ON_OFF_AUTO_AUTO;
+}
+if (qemu_opt_get_bool(opts, name, true)) {
+return ON_OFF_AUTO_ON;
+}
+return ON_OFF_AUTO_OFF;
+}
+
 /* Takes the ownership of bs_opts */
 static BlockBackend *blockdev_init(const char *file, QDict *bs_opts,
Error **errp)
@@ -462,7 +473,7 @@ static BlockBackend *blockdev_init(const char *file, QDict *bs_opts,
 const char *buf;
 int bdrv_flags = 0;
 int on_read_error, on_write_error;
-bool account_invalid, account_failed;
+OnOffAuto account_invalid, account_failed;
 bool writethrough, read_only;
 BlockBackend *blk;
 BlockDriverState *bs;
@@ -496,8 +507,8 @@ static BlockBackend *blockdev_init(const char *file, QDict *bs_opts,
 /* extract parameters */
 snapshot = qemu_opt_get_bool(opts, "snapshot", 0);
 
-account_invalid = qemu_opt_get_bool(opts, "stats-account-invalid", true);
-account_failed = qemu_opt_get_bool(opts, "stats-account-failed", true);
+account_invalid = account_get_opt(opts, "stats-account-invalid");
+account_failed = account_get_opt(opts, "stats-account-failed");
 
 writethrough = !qemu_opt_get_bool(opts, BDRV_OPT_CACHE_WB, true);
 
-- 
2.37.3




Re: [PATCH v2 for-7.2 0/6] Drop libslirp submodule

2022-09-30 Thread Christian Schoenebeck
On Mittwoch, 24. August 2022 17:11:16 CEST Thomas Huth wrote:
> At the point in time we're going to release QEMU 7.2, all supported
> host OS distributions will have a libslirp package available, so
> there is no need anymore for us to ship the slirp submodule. Thus
> let's clean up the related tests and finally remove the submodule now.
> 
> v2:
> - Added patches to clean up and adapt the tests
> - Rebased the removal patch to the latest version of the master branch
> 
> Thomas Huth (6):
>   tests/docker: Update the debian-all-test-cross container to Debian 11
>   tests/vm: Add libslirp to the VM tests
>   tests/lcitool/libvirt-ci: Update the lcitool module to the latest
> version
>   tests: Refresh dockerfiles and FreeBSD vars with lcitool
>   tests/avocado: Do not run tests that require libslirp if it is not
> available
>   Remove the slirp submodule (i.e. compile only with an external
> libslirp)

And I was wondering (bisecting) why networking silently stopped working here.

While I understand the motivation for this change, it's probably not a user
friendly situation to just silently drop functionality. As slirp was the
default networking backend (i.e. not just some exotic QEMU feature), wouldn't
it make sense then to make a missing libslirp a build-time error by default?

Best regards,
Christian Schoenebeck





Re: [PATCH v3] virtio-scsi: Send "REPORTED LUNS CHANGED" sense data upon disk hotplug events.

2022-09-30 Thread Paolo Bonzini
On Fri, Sep 30, 2022 at 4:42 PM Venu Busireddy
 wrote:
> > > Immediately after a hotunplug event, qemu (without any action from
> > > the guest) processes a REPORT_LUNS command on the lun 0 of the device
> > > (haven't figured out what causes this).
> >
> > There is only one call to virtio_scsi_handle_cmd_req_prepare and it
> > takes the command from the guest, are you sure it is without any
> > action from the guest?
>
> I am sure, based on what I am observing. I am running the scsitrace
> (scsitrace -n vtioscsi -v) command on the Solaris guest, and I see no
> output there.

Do you have the sources to the driver and/or to the scsitrace dtrace
script? Something must be putting the SCSI command in the queue.
Perhaps the driver is doing so when it sees an event? And if it is
bypassing the normal submission mechanism, the REPORT LUNS commands is
hidden in scsitrac; that in turn retruns a unit attention and steals
it from the other commands such as TEST UNIT READY, but that's a guest
driver bug.

But QEMU cannot just return the unit attention twice. I would start
with the patch to use the bus unit attention mechanism. It would be
even better to have two unit tests that check the behavior prescribed
by the standard: 1) UNIT ATTENTION from TEST UNIT READY immediately
after a hotunplug notification; 2) no UNIT ATTENTION from REPORT LUNS
and also no UNIT ATTENTION from a subsequent TEST UNIT READY command.
Debugging the guest is a separate step.
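
The behavior prescribed by the standard, as described above, can be modeled
with a tiny state machine. This is purely an illustrative sketch (not QEMU
or qtest code): a hotunplug latches one REPORTED LUNS CHANGED unit
attention, it is reported at most once, and REPORT LUNS clears it without
reporting it.

```python
GOOD = "GOOD"
CHECK_CONDITION = "CHECK CONDITION (REPORTED LUNS CHANGED)"

class Lun:
    """Minimal model of a LUN's unit-attention state for this one case."""

    def __init__(self):
        self.unit_attention = False

    def hotunplug_notification(self):
        # A hotunplug on the bus latches a unit attention condition.
        self.unit_attention = True

    def submit(self, cmd):
        if self.unit_attention:
            self.unit_attention = False  # reported (or cleared) exactly once
            if cmd != "REPORT LUNS":
                return CHECK_CONDITION
            # REPORT LUNS clears a REPORTED LUNS CHANGED unit attention
            # without reporting it, so it completes normally.
        return GOOD

lun = Lun()
lun.hotunplug_notification()
assert lun.submit("TEST UNIT READY") == CHECK_CONDITION  # test 1
assert lun.submit("TEST UNIT READY") == GOOD             # not reported twice

lun.hotunplug_notification()
assert lun.submit("REPORT LUNS") == GOOD                 # test 2
assert lun.submit("TEST UNIT READY") == GOOD             # no leftover UA
```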

Paolo

> However, for whatever it's worth, if I have two or more luns
> on a virtio-scsi adapter, the spurious REPORT_LUNS processing
> (virtio_scsi_handle_cmd_req_prepare() call) occurs only when
> I hotunplug a lun while the other luns are still plugged in,
> until the last lun is unplugged. I do not see the spurious call to
> virtio_scsi_handle_cmd_req_prepare() when the last lun is unplugged,
> whether that was the only lun present, or if it was the last of many.
>
> Venu
>




[PATCH v1 1/1] hw/block/m25p80: Micron Xccela mt35xu01g flash Octal command support

2022-09-30 Thread Francisco Iglesias
Provide the Micron Xccela flash mt35xu01g with Octal command support.

Signed-off-by: Francisco Iglesias 
---
 hw/block/m25p80.c | 57 +++
 1 file changed, 57 insertions(+)

diff --git a/hw/block/m25p80.c b/hw/block/m25p80.c
index a8d2519141..79e26424ec 100644
--- a/hw/block/m25p80.c
+++ b/hw/block/m25p80.c
@@ -360,6 +360,8 @@ typedef enum {
 READ4 = 0x13,
 FAST_READ = 0x0b,
 FAST_READ4 = 0x0c,
+O_FAST_READ = 0x9d,
+O_FAST_READ4 = 0xfd,
 DOR = 0x3b,
 DOR4 = 0x3c,
 QOR = 0x6b,
@@ -368,6 +370,10 @@ typedef enum {
 DIOR4 = 0xbc,
 QIOR = 0xeb,
 QIOR4 = 0xec,
+OOR = 0x8b,
+OOR4 = 0x7c,
+OIOR = 0xcb,
+OIOR4 = 0xcc,
 
 PP = 0x02,
 PP4 = 0x12,
@@ -375,6 +381,10 @@ typedef enum {
 DPP = 0xa2,
 QPP = 0x32,
 QPP_4 = 0x34,
+OPP = 0x82,
+OPP4 = 0x84,
+EOPP = 0xc2,
+EOPP4 = 0x8e,
 RDID_90 = 0x90,
 RDID_AB = 0xab,
 AAI_WP = 0xad,
@@ -430,6 +440,7 @@ typedef enum {
 MAN_WINBOND,
 MAN_SST,
 MAN_ISSI,
+MAN_MICRON_OCTAL,
 MAN_GENERIC,
 } Manufacturer;
 
@@ -514,6 +525,8 @@ static inline Manufacturer get_man(Flash *s)
 return MAN_SST;
 case 0x9D:
 return MAN_ISSI;
+case 0x2C:
+return MAN_MICRON_OCTAL;
 default:
 return MAN_GENERIC;
 }
@@ -682,15 +695,20 @@ static inline int get_addr_length(Flash *s)
case PP4:
case PP4_4:
case QPP_4:
+   case OPP4:
+   case EOPP4:
case READ4:
case QIOR4:
case ERASE4_4K:
case ERASE4_32K:
case ERASE4_SECTOR:
case FAST_READ4:
+   case O_FAST_READ4:
case DOR4:
case QOR4:
case DIOR4:
+   case OOR4:
+   case OIOR4:
return 4;
default:
return s->four_bytes_address_mode ? 4 : 3;
@@ -722,6 +740,10 @@ static void complete_collecting_data(Flash *s)
 case PP:
 case PP4:
 case PP4_4:
+case OPP:
+case OPP4:
+case EOPP:
+case EOPP4:
 s->state = STATE_PAGE_PROGRAM;
 break;
 case AAI_WP:
@@ -741,6 +763,12 @@ static void complete_collecting_data(Flash *s)
 case DIOR4:
 case QIOR:
 case QIOR4:
+case OOR:
+case OOR4:
+case OIOR:
+case OIOR4:
+case O_FAST_READ:
+case O_FAST_READ4:
 s->state = STATE_READ;
 break;
 case ERASE_4K:
@@ -963,6 +991,9 @@ static void decode_fast_read_cmd(Flash *s)
 SPANSION_DUMMY_CLK_LEN
 );
 break;
+case MAN_MICRON_OCTAL:
+s->needed_bytes += 8;
+break;
 case MAN_ISSI:
 /*
  * The Fast Read instruction code is followed by address bytes and
@@ -1117,6 +1148,10 @@ static void decode_new_cmd(Flash *s, uint32_t value)
 case ERASE4_SECTOR:
 case PP:
 case PP4:
+case OPP:
+case OPP4:
+case EOPP:
+case EOPP4:
 case DIE_ERASE:
 case RDID_90:
 case RDID_AB:
@@ -1184,6 +1219,15 @@ static void decode_new_cmd(Flash *s, uint32_t value)
   "DIO mode\n", s->cmd_in_progress);
 }
 break;
+case OOR:
+case OOR4:
+case O_FAST_READ:
+if (get_man(s) == MAN_MICRON_OCTAL) {
+decode_fast_read_cmd(s);
+} else {
+qemu_log_mask(LOG_GUEST_ERROR, "M25P80: Unknown cmd %x\n", value);
+}
+break;
 
 case DIOR:
 case DIOR4:
@@ -1204,6 +1248,19 @@ static void decode_new_cmd(Flash *s, uint32_t value)
   "DIO mode\n", s->cmd_in_progress);
 }
 break;
+case OIOR:
+case OIOR4:
+case O_FAST_READ4:
+if (get_man(s) == MAN_MICRON_OCTAL) {
+s->needed_bytes = get_addr_length(s);
+s->needed_bytes += 16;
+s->pos = 0;
+s->len = 0;
+s->state = STATE_COLLECTING_DATA;
+} else {
+qemu_log_mask(LOG_GUEST_ERROR, "M25P80: Unknown cmd %x\n", value);
+}
+break;
 
 case WRSR:
 /*
-- 
2.20.1




Re: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd

2022-09-30 Thread Fuad Tabba
Hi,

On Tue, Sep 27, 2022 at 11:47 PM Sean Christopherson  wrote:
>
> On Mon, Sep 26, 2022, Fuad Tabba wrote:
> > Hi,
> >
> > On Mon, Sep 26, 2022 at 3:28 PM Chao Peng  
> > wrote:
> > >
> > > On Fri, Sep 23, 2022 at 04:19:46PM +0100, Fuad Tabba wrote:
> > > > > Then on the KVM side, its mmap_start() + mmap_end() sequence would:
> > > > >
> > > > >   1. Not be supported for TDX or SEV-SNP because they don't allow 
> > > > > adding non-zero
> > > > >  memory into the guest (after pre-boot phase).
> > > > >
> > > > >   2. Be mutually exclusive with shared<=>private conversions, and is 
> > > > > allowed if
> > > > >  and only if the entire gfn range of the associated memslot is 
> > > > > shared.
> > > >
> > > > In general I think that this would work with pKVM. However, limiting
> > > > private<->shared conversions to the granularity of a whole memslot
> > > > might be difficult to handle in pKVM, since the guest doesn't have the
> > > > concept of memslots. For example, in pKVM right now, when a guest
> > > > shares back its restricted DMA pool with the host it does so at the
> > > > page-level.
>
> Y'all are killing me :-)

 :D

> Isn't the guest enlightened?  E.g. can't you tell the guest "thou shalt share 
> at
> granularity X"?  With KVM's newfangled scalable memslots and per-vCPU MRU 
> slot,
> X doesn't even have to be that high to get reasonable performance, e.g. 
> assuming
> the DMA pool is at most 2GiB, that's "only" 1024 memslots, which is supposed 
> to
> work just fine in KVM.

The guest is potentially enlightened, but the host doesn't necessarily
know which memslot the guest might want to share back, since it
doesn't know where the guest might want to place the DMA pool. If I
understand this correctly, for this to work, all memslots would need
to be the same size and sharing would always need to happen at that
granularity.
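
As a quick back-of-the-envelope check of the "1024 memslots" figure quoted
above (a sketch; the 2 MiB slot size is my assumption of the implied
uniform granularity X):

```python
GIB = 1024 ** 3
MIB = 1024 ** 2

pool_size = 2 * GIB   # "the DMA pool is at most 2GiB"
slot_size = 2 * MIB   # assumed per-memslot sharing granularity X

num_slots = pool_size // slot_size
print(num_slots)  # -> 1024
```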

Moreover, for something like a small DMA pool this might scale, but
I'm not sure about potential future workloads (e.g., multimedia
in-place sharing).

>
> > > > pKVM would also need a way to make an fd accessible again
> > > > when shared back, which I think isn't possible with this patch.
> > >
> > > But does pKVM really want to mmap/munmap a new region at the page-level,
> > > that can cause VMA fragmentation if the conversion is frequent as I see.
> > > Even with a KVM ioctl for mapping as mentioned below, I think there will
> > > be the same issue.
> >
> > pKVM doesn't really need to unmap the memory. What is really important
> > is that the memory is not GUP'able.
>
> Well, not entirely unguppable, just unguppable without a magic FOLL_* flag,
> otherwise KVM wouldn't be able to get the PFN to map into guest memory.
>
> The problem is that gup() and "mapped" are tied together.  So yes, pKVM 
> doesn't
> strictly need to unmap memory _in the untrusted host_, but since 
> mapped==guppable,
> the end result is the same.
>
> Emphasis above because pKVM still needs to unmap the memory _somewhere_.  IIUC, 
> the
> current approach is to do that only in the stage-2 page tables, i.e. only in 
> the
> context of the hypervisor.  Which is also the source of the gup() problems; 
> the
> untrusted kernel is blissfully unaware that the memory is inaccessible.
>
> Any approach that moves some of that information into the untrusted kernel so 
> that
> the kernel can protect itself will incur fragmentation in the VMAs.  Well, 
> unless
> all of guest memory becomes unguppable, but that's likely not a viable option.

Actually, for pKVM, there is no need for the guest memory to be
GUP'able at all if we use the new inaccessible_get_pfn(). This of
course goes back to what I'd mentioned before in v7; it seems that
representing the memslot memory as a file descriptor should be
orthogonal to whether the memory is shared or private, rather than a
private_fd for private memory and the userspace_addr for shared
memory. The host can then map or unmap the shared/private memory using
the fd, which allows it more freedom in even choosing to unmap shared
memory when not needed, for example.

Cheers,
/fuad



Re: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd

2022-09-30 Thread Kirill A. Shutemov
On Fri, Sep 30, 2022 at 05:14:00PM +0100, Fuad Tabba wrote:
> Hi,
> 
> <...>
> 
> > diff --git a/mm/memfd_inaccessible.c b/mm/memfd_inaccessible.c
> > new file mode 100644
> > index ..2d33cbdd9282
> > --- /dev/null
> > +++ b/mm/memfd_inaccessible.c
> > @@ -0,0 +1,219 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +#include "linux/sbitmap.h"
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +
> > +struct inaccessible_data {
> > +   struct mutex lock;
> > +   struct file *memfd;
> > +   struct list_head notifiers;
> > +};
> > +
> > +static void inaccessible_notifier_invalidate(struct inaccessible_data 
> > *data,
> > +pgoff_t start, pgoff_t end)
> > +{
> > +   struct inaccessible_notifier *notifier;
> > +
> > +   mutex_lock(&data->lock);
> > +   list_for_each_entry(notifier, &data->notifiers, list) {
> > +   notifier->ops->invalidate(notifier, start, end);
> > +   }
> > +   mutex_unlock(&data->lock);
> > +}
> > +
> > +static int inaccessible_release(struct inode *inode, struct file *file)
> > +{
> > +   struct inaccessible_data *data = inode->i_mapping->private_data;
> > +
> > +   fput(data->memfd);
> > +   kfree(data);
> > +   return 0;
> > +}
> > +
> > +static long inaccessible_fallocate(struct file *file, int mode,
> > +  loff_t offset, loff_t len)
> > +{
> > +   struct inaccessible_data *data = file->f_mapping->private_data;
> > +   struct file *memfd = data->memfd;
> > +   int ret;
> > +
> > +   if (mode & FALLOC_FL_PUNCH_HOLE) {
> > +   if (!PAGE_ALIGNED(offset) || !PAGE_ALIGNED(len))
> > +   return -EINVAL;
> > +   }
> > +
> > +   ret = memfd->f_op->fallocate(memfd, mode, offset, len);
> 
> I think that shmem_file_operations.fallocate is only set if
> CONFIG_TMPFS is enabled (shmem.c). Should there be a check at
> initialization that fallocate is set, or maybe a config dependency, or
> can we count on it always being enabled?

It is already there:

config MEMFD_CREATE
def_bool TMPFS || HUGETLBFS

And we reject inaccessible memfd_create() for HUGETLBFS.

But if we go with a separate syscall, yes, we need the dependency.

> > +   inaccessible_notifier_invalidate(data, offset, offset + len);
> > +   return ret;
> > +}
> > +
> 
> <...>
> 
> > +void inaccessible_register_notifier(struct file *file,
> > +   struct inaccessible_notifier *notifier)
> > +{
> > +   struct inaccessible_data *data = file->f_mapping->private_data;
> > +
> > +   mutex_lock(&data->lock);
> > +   list_add(&notifier->list, &data->notifiers);
> > +   mutex_unlock(&data->lock);
> > +}
> > +EXPORT_SYMBOL_GPL(inaccessible_register_notifier);
> 
> If the memfd wasn't marked as inaccessible, or more generally
> speaking, if the file isn't a memfd_inaccessible file, this ends up
> accessing an uninitialized pointer for the notifier list. Should there
> be a check for that here, and have this function return an error if
> that's not the case?

I think this is in the "don't do that" category. The inaccessible_register_notifier()
caller has to know what file it operates on, no?

-- 
  Kiryl Shutsemau / Kirill A. Shutemov



Re: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd

2022-09-30 Thread Fuad Tabba
Hi,

<...>

> diff --git a/mm/memfd_inaccessible.c b/mm/memfd_inaccessible.c
> new file mode 100644
> index ..2d33cbdd9282
> --- /dev/null
> +++ b/mm/memfd_inaccessible.c
> @@ -0,0 +1,219 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#include "linux/sbitmap.h"
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +struct inaccessible_data {
> +   struct mutex lock;
> +   struct file *memfd;
> +   struct list_head notifiers;
> +};
> +
> +static void inaccessible_notifier_invalidate(struct inaccessible_data *data,
> +pgoff_t start, pgoff_t end)
> +{
> +   struct inaccessible_notifier *notifier;
> +
> +   mutex_lock(&data->lock);
> +   list_for_each_entry(notifier, &data->notifiers, list) {
> +   notifier->ops->invalidate(notifier, start, end);
> +   }
> +   mutex_unlock(&data->lock);
> +}
> +
> +static int inaccessible_release(struct inode *inode, struct file *file)
> +{
> +   struct inaccessible_data *data = inode->i_mapping->private_data;
> +
> +   fput(data->memfd);
> +   kfree(data);
> +   return 0;
> +}
> +
> +static long inaccessible_fallocate(struct file *file, int mode,
> +  loff_t offset, loff_t len)
> +{
> +   struct inaccessible_data *data = file->f_mapping->private_data;
> +   struct file *memfd = data->memfd;
> +   int ret;
> +
> +   if (mode & FALLOC_FL_PUNCH_HOLE) {
> +   if (!PAGE_ALIGNED(offset) || !PAGE_ALIGNED(len))
> +   return -EINVAL;
> +   }
> +
> +   ret = memfd->f_op->fallocate(memfd, mode, offset, len);

I think that shmem_file_operations.fallocate is only set if
CONFIG_TMPFS is enabled (shmem.c). Should there be a check at
initialization that fallocate is set, or maybe a config dependency, or
can we count on it always being enabled?

> +   inaccessible_notifier_invalidate(data, offset, offset + len);
> +   return ret;
> +}
> +

<...>

> +void inaccessible_register_notifier(struct file *file,
> +   struct inaccessible_notifier *notifier)
> +{
> +   struct inaccessible_data *data = file->f_mapping->private_data;
> +
> +   mutex_lock(&data->lock);
> +   list_add(&notifier->list, &data->notifiers);
> +   mutex_unlock(&data->lock);
> +}
> +EXPORT_SYMBOL_GPL(inaccessible_register_notifier);

If the memfd wasn't marked as inaccessible, or more generally
speaking, if the file isn't a memfd_inaccessible file, this ends up
accessing an uninitialized pointer for the notifier list. Should there
be a check for that here, and have this function return an error if
that's not the case?

Thanks,
/fuad



> +
> +void inaccessible_unregister_notifier(struct file *file,
> + struct inaccessible_notifier *notifier)
> +{
> +   struct inaccessible_data *data = file->f_mapping->private_data;
> +
> +   mutex_lock(&data->lock);
> +   list_del(&notifier->list);
> +   mutex_unlock(&data->lock);
> +}
> +EXPORT_SYMBOL_GPL(inaccessible_unregister_notifier);
> +
> +int inaccessible_get_pfn(struct file *file, pgoff_t offset, pfn_t *pfn,
> +int *order)
> +{
> +   struct inaccessible_data *data = file->f_mapping->private_data;
> +   struct file *memfd = data->memfd;
> +   struct page *page;
> +   int ret;
> +
> +   ret = shmem_getpage(file_inode(memfd), offset, &page, SGP_WRITE);
> +   if (ret)
> +   return ret;
> +
> +   *pfn = page_to_pfn_t(page);
> +   *order = thp_order(compound_head(page));
> +   SetPageUptodate(page);
> +   unlock_page(page);
> +
> +   return 0;
> +}
> +EXPORT_SYMBOL_GPL(inaccessible_get_pfn);
> +
> +void inaccessible_put_pfn(struct file *file, pfn_t pfn)
> +{
> +   struct page *page = pfn_t_to_page(pfn);
> +
> +   if (WARN_ON_ONCE(!page))
> +   return;
> +
> +   put_page(page);
> +}
> +EXPORT_SYMBOL_GPL(inaccessible_put_pfn);
> --
> 2.25.1
>



Re: [RFC PATCH v2 03/29] target/ppc: split interrupt masking and delivery from ppc_hw_interrupt

2022-09-30 Thread Fabiano Rosas
Matheus Ferst  writes:

> Split ppc_hw_interrupt into an interrupt masking method,
> ppc_next_unmasked_interrupt, and an interrupt processing method,
> ppc_deliver_interrupt.
>



> @@ -1822,20 +1782,106 @@ static void ppc_hw_interrupt(CPUPPCState *env)
>   */
>  if (FIELD_EX64(env->msr, MSR, PR) &&
>  (env->spr[SPR_BESCR] & BESCR_GE)) {
> -env->pending_interrupts &= ~PPC_INTERRUPT_EBB;
> -
> -if (env->spr[SPR_BESCR] & BESCR_PMEO) {
> -powerpc_excp(cpu, POWERPC_EXCP_PERFM_EBB);
> -} else if (env->spr[SPR_BESCR] & BESCR_EEO) {
> -powerpc_excp(cpu, POWERPC_EXCP_EXTERNAL_EBB);
> -}
> -
> -return;
> +return PPC_INTERRUPT_EBB;
>  }
>  }
>  }
>  
> -if (env->resume_as_sreset) {
> +return 0;
> +}
> +
> +static void ppc_deliver_interrupt(CPUPPCState *env, int interrupt)
> +{
> +PowerPCCPU *cpu = env_archcpu(env);
> +CPUState *cs = env_cpu(env);
> +
> +switch (interrupt) {
> +case PPC_INTERRUPT_RESET: /* External reset */
> +env->pending_interrupts &= ~PPC_INTERRUPT_RESET;
> +powerpc_excp(cpu, POWERPC_EXCP_RESET);
> +break;
> +case PPC_INTERRUPT_MCK: /* Machine check exception */
> +env->pending_interrupts &= ~PPC_INTERRUPT_MCK;
> +powerpc_excp(cpu, POWERPC_EXCP_MCHECK);
> +break;
> +#if 0 /* TODO */
> +case PPC_INTERRUPT_DEBUG: /* External debug exception */
> +env->pending_interrupts &= ~PPC_INTERRUPT_DEBUG;
> +powerpc_excp(cpu, POWERPC_EXCP_DEBUG);
> +break;
> +#endif
> +
> +case PPC_INTERRUPT_HDECR: /* Hypervisor decrementer exception */
> +/* HDEC clears on delivery */
> +env->pending_interrupts &= ~PPC_INTERRUPT_HDECR;
> +powerpc_excp(cpu, POWERPC_EXCP_HDECR);
> +break;
> +case PPC_INTERRUPT_HVIRT: /* Hypervisor virtualization interrupt */
> +powerpc_excp(cpu, POWERPC_EXCP_HVIRT);
> +break;
> +
> +case PPC_INTERRUPT_EXT:
> +if (books_vhyp_promotes_external_to_hvirt(cpu)) {
> +powerpc_excp(cpu, POWERPC_EXCP_HVIRT);
> +} else {
> +powerpc_excp(cpu, POWERPC_EXCP_EXTERNAL);
> +}
> +break;
> +case PPC_INTERRUPT_CEXT: /* External critical interrupt */
> +powerpc_excp(cpu, POWERPC_EXCP_CRITICAL);
> +break;
> +
> +case PPC_INTERRUPT_WDT: /* Watchdog timer on embedded PowerPC */
> +env->pending_interrupts &= ~PPC_INTERRUPT_WDT;
> +powerpc_excp(cpu, POWERPC_EXCP_WDT);
> +break;
> +case PPC_INTERRUPT_CDOORBELL:
> +env->pending_interrupts &= ~PPC_INTERRUPT_CDOORBELL;
> +powerpc_excp(cpu, POWERPC_EXCP_DOORCI);
> +break;
> +case PPC_INTERRUPT_FIT: /* Fixed interval timer on embedded PowerPC */
> +env->pending_interrupts &= ~PPC_INTERRUPT_FIT;
> +powerpc_excp(cpu, POWERPC_EXCP_FIT);
> +break;
> +case PPC_INTERRUPT_PIT: /* Programmable interval timer on embedded 
> PowerPC */
> +env->pending_interrupts &= ~PPC_INTERRUPT_PIT;
> +powerpc_excp(cpu, POWERPC_EXCP_PIT);
> +break;
> +case PPC_INTERRUPT_DECR: /* Decrementer exception */
> +if (ppc_decr_clear_on_delivery(env)) {
> +env->pending_interrupts &= ~PPC_INTERRUPT_DECR;
> +}
> +powerpc_excp(cpu, POWERPC_EXCP_DECR);
> +break;
> +case PPC_INTERRUPT_DOORBELL:
> +env->pending_interrupts &= ~PPC_INTERRUPT_DOORBELL;
> +if (is_book3s_arch2x(env)) {
> +powerpc_excp(cpu, POWERPC_EXCP_SDOOR);
> +} else {
> +powerpc_excp(cpu, POWERPC_EXCP_DOORI);
> +}
> +break;
> +case PPC_INTERRUPT_HDOORBELL:
> +env->pending_interrupts &= ~PPC_INTERRUPT_HDOORBELL;
> +powerpc_excp(cpu, POWERPC_EXCP_SDOOR_HV);
> +break;
> +case PPC_INTERRUPT_PERFM:
> +env->pending_interrupts &= ~PPC_INTERRUPT_PERFM;
> +powerpc_excp(cpu, POWERPC_EXCP_PERFM);
> +break;
> +case PPC_INTERRUPT_THERM:  /* Thermal interrupt */
> +env->pending_interrupts &= ~PPC_INTERRUPT_THERM;
> +powerpc_excp(cpu, POWERPC_EXCP_THERM);
> +break;
> +case PPC_INTERRUPT_EBB: /* EBB exception */
> +env->pending_interrupts &= ~PPC_INTERRUPT_EBB;
> +if (env->spr[SPR_BESCR] & BESCR_PMEO) {
> +powerpc_excp(cpu, POWERPC_EXCP_PERFM_EBB);
> +} else if (env->spr[SPR_BESCR] & BESCR_EEO) {
> +powerpc_excp(cpu, POWERPC_EXCP_EXTERNAL_EBB);
> +}
> +break;
> +case 0:
>  /*
>   * This is a bug ! It means that has_work took us out of halt without
>   * anything to deliver while in a PM state that requires getting
> @@ -1847,8 +1893,10 @@ static void ppc_hw_interrupt(CPUPPCState *env)
>   * It generally means a discrepancy b

[PATCH v3] hyperv: fix SynIC SINT assertion failure on guest reset

2022-09-30 Thread Maciej S. Szmigiero
From: "Maciej S. Szmigiero" 

Resetting a guest that has Hyper-V VMBus support enabled triggers a QEMU
assertion failure:
hw/hyperv/hyperv.c:131: synic_reset: Assertion `QLIST_EMPTY(&synic->sint_routes)' failed.

This happens both on normal guest reboot or when using "system_reset" HMP
command.

The failing assertion was introduced by commit 64ddecc88bcf ("hyperv:
SControl is optional to enable SynIc") to catch dangling SINT routes on
SynIC reset.

The root cause of this problem is that the SynIC itself is reset before
devices using SINT routes have chance to clean up these routes.

Since there seems to be no existing mechanism to force reset callbacks (or
methods) to be executed in specific order let's use a similar method that
is already used to reset another interrupt controller (APIC) after devices
have been reset - by invoking the SynIC reset from the machine reset
handler via a new x86_cpu_after_reset() function co-located with
the existing x86_cpu_reset() in target/i386/cpu.c.
Opportunistically move the APIC reset handler there, too.

Fixes: 64ddecc88bcf ("hyperv: SControl is optional to enable SynIc") # exposed the bug
Signed-off-by: Maciej S. Szmigiero 
---

Changes from v2:
Make sure that the microvm machine reset handler also calls
x86_cpu_after_reset().
Opportunistically move the APIC reset handler to x86_cpu_after_reset().

 hw/i386/microvm.c  |  4 +---
 hw/i386/pc.c   |  5 ++---
 target/i386/cpu.c  | 13 +
 target/i386/cpu.h  |  2 ++
 target/i386/kvm/hyperv.c   |  4 
 target/i386/kvm/kvm.c  | 24 +---
 target/i386/kvm/kvm_i386.h |  1 +
 7 files changed, 40 insertions(+), 13 deletions(-)

diff --git a/hw/i386/microvm.c b/hw/i386/microvm.c
index 52cafa003d..a3ff915b71 100644
--- a/hw/i386/microvm.c
+++ b/hw/i386/microvm.c
@@ -485,9 +485,7 @@ static void microvm_machine_reset(MachineState *machine)
 CPU_FOREACH(cs) {
 cpu = X86_CPU(cs);
 
-if (cpu->apic_state) {
-device_legacy_reset(cpu->apic_state);
-}
+x86_cpu_after_reset(cpu);
 }
 }
 
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 566accf7e6..768982ae9a 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -92,6 +92,7 @@
 #include "hw/virtio/virtio-mem-pci.h"
 #include "hw/mem/memory-device.h"
 #include "sysemu/replay.h"
+#include "target/i386/cpu.h"
 #include "qapi/qmp/qerror.h"
 #include "e820_memory_layout.h"
 #include "fw_cfg.h"
@@ -1859,9 +1860,7 @@ static void pc_machine_reset(MachineState *machine)
 CPU_FOREACH(cs) {
 cpu = X86_CPU(cs);
 
-if (cpu->apic_state) {
-device_legacy_reset(cpu->apic_state);
-}
+x86_cpu_after_reset(cpu);
 }
 }
 
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 1db1278a59..ddb4fce2e0 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6034,6 +6034,19 @@ static void x86_cpu_reset(DeviceState *dev)
 #endif
 }
 
+void x86_cpu_after_reset(X86CPU *cpu)
+{
+#ifndef CONFIG_USER_ONLY
+if (kvm_enabled()) {
+kvm_arch_after_reset_vcpu(cpu);
+}
+
+if (cpu->apic_state) {
+device_legacy_reset(cpu->apic_state);
+}
+#endif
+}
+
 static void mce_init(X86CPU *cpu)
 {
 CPUX86State *cenv = &cpu->env;
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 82004b65b9..c67d98e1a9 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -2079,6 +2079,8 @@ typedef struct PropValue {
 } PropValue;
 void x86_cpu_apply_props(X86CPU *cpu, PropValue *props);
 
+void x86_cpu_after_reset(X86CPU *cpu);
+
 uint32_t cpu_x86_virtual_addr_width(CPUX86State *env);
 
 /* cpu.c other functions (cpuid) */
diff --git a/target/i386/kvm/hyperv.c b/target/i386/kvm/hyperv.c
index 9026ef3a81..e3ac978648 100644
--- a/target/i386/kvm/hyperv.c
+++ b/target/i386/kvm/hyperv.c
@@ -23,6 +23,10 @@ int hyperv_x86_synic_add(X86CPU *cpu)
 return 0;
 }
 
+/*
+ * All devices possibly using SynIC have to be reset before calling this to let
+ * them remove their SINT routes first.
+ */
 void hyperv_x86_synic_reset(X86CPU *cpu)
 {
 hyperv_synic_reset(CPU(cpu));
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index a1fd1f5379..774484c588 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -2203,20 +2203,30 @@ void kvm_arch_reset_vcpu(X86CPU *cpu)
 env->mp_state = KVM_MP_STATE_RUNNABLE;
 }
 
+/* enabled by default */
+env->poll_control_msr = 1;
+
+kvm_init_nested_state(env);
+
+sev_es_set_reset_vector(CPU(cpu));
+}
+
+void kvm_arch_after_reset_vcpu(X86CPU *cpu)
+{
+CPUX86State *env = &cpu->env;
+int i;
+
+/*
+ * Reset SynIC after all other devices have been reset to let them remove
+ * their SINT routes first.
+ */
 if (hyperv_feat_enabled(cpu, HYPERV_FEAT_SYNIC)) {
-int i;
 for (i = 0; i < ARRAY_SIZE(env->msr_hv_synic_sint); i++) {
 env->msr_hv_synic_sint[i] = HV_SINT_MASKED;
 }
 
 hyperv_x86_synic_reset(cpu);
 }
-

Re: [PATCH v2 for-7.2 0/2] pci *_by_mask() coverity fix

2022-09-30 Thread Michael S. Tsirkin
Will merge early next week.

On Thu, Sep 29, 2022 at 05:29:58PM +0100, Peter Maydell wrote:
> Ping! This series has been reviewed.
> 
> I can take it via target-arm.next if you'd prefer.
> 
> thanks
> -- PMM
> 
> On Thu, 18 Aug 2022 at 14:54, Peter Maydell  wrote:
> >
> > This patchset fixes a Coverity nit relating to the
> > pci_set_*_by_mask() helper functions, where we might shift off the
> > end of a variable if the caller passes in a bogus mask argument.
> > Patch 2 is the coverity fix (adding an assert() to help Coverity
> > out a bit and to catch potential future actual bugs). Patch 1
> > removes the _get_ versions of the functions, because they've been
> > in the tree for a decade and never had any callers at any point
> > in those 10 years :-)
> >
> > This is only de-confusing Coverity, so this is definitely
> > 7.2 material at this point.
> >
> > All patches already have a reviewed-by tag; only change
> > v1->v2 is removing a couple of unnecessary mask operations
> > in patch 2.
> >
> > thanks
> > -- PMM
> >
> > Peter Maydell (2):
> >   pci: Remove unused pci_get_*_by_mask() functions
> >   pci: Sanity check mask argument to pci_set_*_by_mask()
> >
> >  include/hw/pci/pci.h | 48 +++-
> >  1 file changed, 16 insertions(+), 32 deletions(-)




Re: [PATCH 2/2] thread-pool: use ThreadPool from the running thread

2022-09-30 Thread Kevin Wolf
On 30.09.2022 at 14:17, Emanuele Giuseppe Esposito wrote:
> On 29/09/2022 at 17:30, Kevin Wolf wrote:
> > On 09.06.2022 at 15:44, Emanuele Giuseppe Esposito wrote:
> >> Remove usage of aio_context_acquire by always submitting work items
> >> to the current thread's ThreadPool.
> >>
> >> Signed-off-by: Paolo Bonzini 
> >> Signed-off-by: Emanuele Giuseppe Esposito 
> > 
> > The thread pool is used by things outside of the file-* block drivers,
> > too. Even outside the block layer. Not all of these seem to submit work
> > in the same thread.
> > 
> > 
> > For example:
> > 
> > postcopy_ram_listen_thread() -> qemu_loadvm_state_main() ->
> > qemu_loadvm_section_start_full() -> vmstate_load() ->
> > vmstate_load_state() -> spapr_nvdimm_flush_post_load(), which has:
> > 
> > ThreadPool *pool = aio_get_thread_pool(qemu_get_aio_context());
> > ...
> > thread_pool_submit_aio(pool, flush_worker_cb, state,
> >spapr_nvdimm_flush_completion_cb, state);
> > 
> > So it seems to me that we may be submitting work for the main thread
> > from a postcopy migration thread.
> > 
> > I believe the other direct callers of thread_pool_submit_aio() all
> > submit work for the main thread and also run in the main thread.
> > 
> > 
> > For thread_pool_submit_co(), pr_manager_execute() calls it with the pool
> > it gets passed as a parameter. This is still bdrv_get_aio_context(bs) in
> > hdev_co_ioctl() and should probably be changed the same way as for the
> > AIO call in file-posix, i.e. use qemu_get_current_aio_context().
> > 
> > 
> > We could consider either asserting in thread_pool_submit_aio() that we
> > are really in the expected thread, or like I suggested for LinuxAio drop
> > the pool parameter and always get it from the current thread (obviously
> > this is only possible if migration could in fact schedule the work on
> > its current thread - if it schedules it on the main thread and then
> > exits the migration thread (which destroys the thread pool), that
> > wouldn't be good).
> 
> Dumb question: why not extend the already-existing pool->lock to cover
> also the necessary fields like pool->head that are accessed by other
> threads (only case I could find with thread_pool_submit_aio is the one
> you pointed above)?

Other people are more familiar with this code, but I believe this could
have performance implications. I seem to remember that this code is
careful to avoid locking to synchronise between worker threads and the
main thread.

But looking at the patch again, I have actually a dumb question, too:
The locking you're removing is in thread_pool_completion_bh(). As this
is a BH, it's running in the ThreadPool's context either way, no matter
which thread called thread_pool_submit_aio().

I'm not sure what this aio_context_acquire/release pair is actually
supposed to protect. Paolo's commit 1919631e6b5 introduced it. Was it
just more careful than it needs to be?

Kevin




Re: [PATCH 1/2] linux-aio: use LinuxAioState from the running thread

2022-09-30 Thread Kevin Wolf
On 30.09.2022 at 12:00, Emanuele Giuseppe Esposito wrote:
> 
> 
> On 29/09/2022 at 16:52, Kevin Wolf wrote:
> > On 09.06.2022 at 15:44, Emanuele Giuseppe Esposito wrote:
> >> From: Paolo Bonzini 
> >>
> >> Remove usage of aio_context_acquire by always submitting asynchronous
> >> AIO to the current thread's LinuxAioState.
> >>
> >> Signed-off-by: Paolo Bonzini 
> >> Signed-off-by: Emanuele Giuseppe Esposito 
> >> ---
> >>  block/file-posix.c  |  3 ++-
> >>  block/linux-aio.c   | 13 ++---
> >>  include/block/aio.h |  4 
> >>  3 files changed, 8 insertions(+), 12 deletions(-)
> >>
> >> diff --git a/block/file-posix.c b/block/file-posix.c
> >> index 48cd096624..33f92f004a 100644
> >> --- a/block/file-posix.c
> >> +++ b/block/file-posix.c
> >> @@ -2086,7 +2086,8 @@ static int coroutine_fn raw_co_prw(BlockDriverState 
> >> *bs, uint64_t offset,
> >>  #endif
> >>  #ifdef CONFIG_LINUX_AIO
> >>  } else if (s->use_linux_aio) {
> >> -LinuxAioState *aio = aio_get_linux_aio(bdrv_get_aio_context(bs));
> >> +AioContext *ctx = qemu_get_current_aio_context();
> >> +LinuxAioState *aio = aio_get_linux_aio(ctx);
> >>  assert(qiov->size == bytes);
> >>  return laio_co_submit(bs, aio, s->fd, offset, qiov, type,
> >>s->aio_max_batch);
> > 
> > raw_aio_plug() and raw_aio_unplug() need the same change.
> > 
> > I wonder if we should actually better remove the 'aio' parameter from
> > the functions that linux-aio.c offers to avoid suggesting that any
> > LinuxAioState works for any thread. Getting it from the current
> > AioContext is something it can do by itself. But this would be code
> > cleanup for a separate patch.
> 
> I do not think that this would work. At least not for all functions of
> the API. I tried removing the ctx parameter from aio_setup_linux_aio and
> it's already problematic, as it is used by raw_aio_attach_aio_context()
> which is a .bdrv_attach_aio_context() callback, which should be called
> by the main thread. So that function needs the aiocontext parameter.
> 
> So maybe for now just simplify aio_get_linux_aio()? In a separate patch.

Oh, I don't mind the ctx parameter in these functions at all.

I was talking about the functions in linux-aio.c, specifically
laio_co_submit(), laio_io_plug() and laio_io_unplug(). They could call
aio_get_linux_aio() internally for the current thread instead of letting
the caller do that and giving the false impression that there is more
than one correct value for their LinuxAioState parameter.

But anyway, as I said, this would be a separate cleanup patch. For this
one, it's just important that at least file-posix.c does the right thing
for plug/unplug, too.

> >> diff --git a/block/linux-aio.c b/block/linux-aio.c
> >> index 4c423fcccf..1d3cc767d1 100644
> >> --- a/block/linux-aio.c
> >> +++ b/block/linux-aio.c
> >> @@ -16,6 +16,9 @@
> >>  #include "qemu/coroutine.h"
> >>  #include "qapi/error.h"
> >>  
> >> +/* Only used for assertions.  */
> >> +#include "qemu/coroutine_int.h"
> >> +
> >>  #include 
> >>  
> >>  /*
> >> @@ -56,10 +59,8 @@ struct LinuxAioState {
> >>  io_context_t ctx;
> >>  EventNotifier e;
> >>  
> >> -/* io queue for submit at batch.  Protected by AioContext lock. */
> >> +/* All data is only used in one I/O thread.  */
> >>  LaioQueue io_q;
> >> -
> >> -/* I/O completion processing.  Only runs in I/O thread.  */
> >>  QEMUBH *completion_bh;
> >>  int event_idx;
> >>  int event_max;
> >> @@ -102,9 +103,8 @@ static void qemu_laio_process_completion(struct 
> >> qemu_laiocb *laiocb)
> >>   * later.  Coroutines cannot be entered recursively so avoid doing
> >>   * that!
> >>   */
> >> -if (!qemu_coroutine_entered(laiocb->co)) {
> >> -aio_co_wake(laiocb->co);
> >> -}
> >> +assert(laiocb->co->ctx == laiocb->ctx->aio_context);
> >> +qemu_coroutine_enter_if_inactive(laiocb->co);
> >>  }
> >>  
> >>  /**
> >> @@ -238,7 +238,6 @@ static void 
> >> qemu_laio_process_completions_and_submit(LinuxAioState *s)
> >>  if (!s->io_q.plugged && !QSIMPLEQ_EMPTY(&s->io_q.pending)) {
> >>  ioq_submit(s);
> >>  }
> >> -aio_context_release(s->aio_context);
> >>  }
> > 
> > I certainly expected the aio_context_acquire() in the same function to
> > go away, too! Am I missing something?
> 
> ops

:-)

If it's unintentional, I'm actually surprised that locking without
unlocking later didn't cause problems immediately.

Kevin




Re: [PULL 00/10] QAPI patches patches for 2022-09-07

2022-09-30 Thread Markus Armbruster
Markus Armbruster  writes:

> Gentle reminder, Victor :)
>
> Markus Armbruster  writes:
>
>> Markus Armbruster  writes:
>>
>>> Kevin Wolf  writes:
>>>
 On 07.09.2022 at 17:03, Markus Armbruster wrote:
> The following changes since commit 
> 946e9bccf12f2bcc3ca471b820738fb22d14fc80:

[...]

>   qapi: fix examples of blockdev-add with qcow2

 NACK, this patch is wrong.

 'file' is a required member (defined in BlockdevOptionsGenericFormat),
 removing it makes the example invalid. 'data-file' is only an additional
 optional member to be used for external data files (i.e. when the guest
 data is kept separate from the metadata in the .qcow2 file).
>>>
>>> I'll respin with #8 dropped.  Thank you!
>>
>> Too late, it's already merged.
>>
>> Victor, could you fix on top?  Or would you like me to revert the patch?

Revert posted: 

Subject: [PATCH] Revert "qapi: fix examples of blockdev-add with qcow2"
Date: Fri, 30 Sep 2022 17:26:34 +0200
Message-Id: <20220930152634.774907-1-arm...@redhat.com>




[PATCH 0/4] Qemu SEV reduced-phys-bits fixes

2022-09-30 Thread Tom Lendacky
This patch series fixes up and tries to remove some confusion around the
SEV reduced-phys-bits parameter.

Based on the "AMD64 Architecture Programmer's Manual Volume 2: System
Programming", section "15.34.6 Page Table Support" [1], a guest should
only ever see a maximum of 1 bit of physical address space reduction.

- Update the documentation, to change the default value from 5 to 1.
- Update the validation of the parameter to ensure the parameter value
  is within the range of the CPUID field that it is reported in. To allow
  for backwards compatibility, especially to support the previously
  documented value of 5, allow the full range of values from 1 to 63
  (0 was never allowed).
- Update the setting of CPUID 0x8000001F_EBX to limit the values to the
  field width that they are setting as an additional safeguard.

[1] https://www.amd.com/system/files/TechDocs/24593.pdf

Tom Lendacky (4):
  qapi, i386/sev: Change the reduced-phys-bits value from 5 to 1
  qemu-options.hx: Update the reduced-phys-bits documentation
  i386/sev: Update checks and information related to reduced-phys-bits
  i386/cpu: Update how the EBX register of CPUID 0x801F is set

 qapi/misc-target.json |  2 +-
 qemu-options.hx   |  4 ++--
 target/i386/cpu.c |  4 ++--
 target/i386/sev.c | 17 ++---
 4 files changed, 19 insertions(+), 8 deletions(-)

-- 
2.37.3




[PATCH 3/4] i386/sev: Update checks and information related to reduced-phys-bits

2022-09-30 Thread Tom Lendacky
The value of the reduced-phys-bits parameter is propagated to the CPUID
information exposed to the guest. Update the current validation check to
account for the size of the CPUID field (6-bits), ensuring the value is
in the range of 1 to 63.

Maintain backward compatibility, to an extent, by allowing a value greater
than 1 (so that the previously documented value of 5 still works), but not
allowing anything over 63.

Fixes: d8575c6c02 ("sev/i386: add command to initialize the memory encryption context")
Signed-off-by: Tom Lendacky 
---
 target/i386/sev.c | 17 ++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/target/i386/sev.c b/target/i386/sev.c
index 32f7dbac4e..78c2d37eba 100644
--- a/target/i386/sev.c
+++ b/target/i386/sev.c
@@ -932,15 +932,26 @@ int sev_kvm_init(ConfidentialGuestSupport *cgs, Error **errp)
 host_cpuid(0x8000001F, 0, NULL, &ebx, NULL, NULL);
 host_cbitpos = ebx & 0x3f;
 
+/*
+ * The cbitpos value will be placed in bit positions 5:0 of the EBX
+ * register of CPUID 0x8000001F. No need to verify the range as the
+ * comparison against the host value accomplishes that.
+ */
 if (host_cbitpos != sev->cbitpos) {
 error_setg(errp, "%s: cbitpos check failed, host '%d' requested '%d'",
__func__, host_cbitpos, sev->cbitpos);
 goto err;
 }
 
-if (sev->reduced_phys_bits < 1) {
-error_setg(errp, "%s: reduced_phys_bits check failed, it should be >=1,"
-   " requested '%d'", __func__, sev->reduced_phys_bits);
+/*
+ * The reduced-phys-bits value will be placed in bit positions 11:6 of
+ * the EBX register of CPUID 0x8000001F, so verify the supplied value
+ * is in the range of 1 to 63.
+ */
+if (sev->reduced_phys_bits < 1 || sev->reduced_phys_bits > 63) {
+error_setg(errp, "%s: reduced_phys_bits check failed,"
+   " it should be in the range of 1 to 63, requested '%d'",
+   __func__, sev->reduced_phys_bits);
 goto err;
 }
 
-- 
2.37.3




Re: [PATCH v2 1/7] piix_ide_reset: Use pci_set_* functions instead of direct access

2022-09-30 Thread Kevin Wolf
On 07.07.2022 at 05:11, Lev Kujawski wrote:
> Eliminate the remaining TODOs in hw/ide/piix.c by:
> * Using pci_set_{size} functions to write the PIIX PCI configuration
>   space instead of manipulating it directly as an array; and
> * Documenting the default register values by reference to the
>   controlling specification.
> 
> Signed-off-by: Lev Kujawski 

Thanks, dropped patches 5 and 6 because I see that you have posted a
newer version of them, and applied the rest to the block branch.

Kevin




[PATCH 4/4] i386/cpu: Update how the EBX register of CPUID 0x8000001F is set

2022-09-30 Thread Tom Lendacky
Update the setting of CPUID 0x8000001F EBX to clearly document the ranges
associated with the fields being set.

Fixes: 6cb8f2a663 ("cpu/i386: populate CPUID 0x8000_001F when SEV is active")
Signed-off-by: Tom Lendacky 
---
 target/i386/cpu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 1db1278a59..d4b806cfec 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -5853,8 +5853,8 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
         if (sev_enabled()) {
             *eax = 0x2;
             *eax |= sev_es_enabled() ? 0x8 : 0;
-            *ebx = sev_get_cbit_position();
-            *ebx |= sev_get_reduced_phys_bits() << 6;
+            *ebx = sev_get_cbit_position() & 0x3f; /* EBX[5:0] */
+            *ebx |= (sev_get_reduced_phys_bits() & 0x3f) << 6; /* EBX[11:6] */
         }
         break;
     default:
-- 
2.37.3




[PATCH 2/4] qemu-options.hx: Update the reduced-phys-bits documentation

2022-09-30 Thread Tom Lendacky
A guest only ever experiences, at most, 1 bit of reduced physical
addressing. Update the documentation to reflect this as well as change
the example value on the reduced-phys-bits option.

Fixes: a9b4942f48 ("target/i386: add Secure Encrypted Virtualization (SEV) object")
Signed-off-by: Tom Lendacky 
---
 qemu-options.hx | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/qemu-options.hx b/qemu-options.hx
index 913c71e38f..3396085cf0 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -5391,7 +5391,7 @@ SRST
 physical address space. The ``reduced-phys-bits`` is used to
 provide the number of bits we loose in physical address space.
 Similar to C-bit, the value is Host family dependent. On EPYC,
-the value should be 5.
+a guest will lose a maximum of 1 bit, so the value should be 1.
 
 The ``sev-device`` provides the device file to use for
 communicating with the SEV firmware running inside AMD Secure
@@ -5426,7 +5426,7 @@ SRST
 
  # |qemu_system_x86| \\
  .. \\
- -object sev-guest,id=sev0,cbitpos=47,reduced-phys-bits=5 \\
+ -object sev-guest,id=sev0,cbitpos=47,reduced-phys-bits=1 \\
  -machine ...,memory-encryption=sev0 \\
  .
 
-- 
2.37.3



