Re: [PATCH 04/20] target/arm: Convert CFINV, XAFLAG and AXFLAG to decodetree

2023-06-02 Thread Richard Henderson

On 6/2/23 08:52, Peter Maydell wrote:

Convert the CFINV, XAFLAG and AXFLAG insns to decodetree.
The old decoder handles these in handle_msr_i(), but
the architecture defines them as separate instructions
from MSR (immediate).

Signed-off-by: Peter Maydell 
---
  target/arm/tcg/a64.decode  |  6 
  target/arm/tcg/translate-a64.c | 56 ++
  2 files changed, 35 insertions(+), 27 deletions(-)

diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index 553f6904d9c..26a0b44cea9 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -188,3 +188,9 @@ CLREX   1101 0101 0000 0011 0011 imm:4 010 11111
  DSB_DMB 1101 0101 0000 0011 0011 domain:2 types:2 10- 11111
  ISB 1101 0101 0000 0011 0011 imm:4 110 11111
  SB  1101 0101 0000 0011 0011 0000 111 11111
+
+# PSTATE
+
+CFINV   1101 0101 0000 0 000 0100 0000 000 11111
+XAFLAG  1101 0101 0000 0 000 0100 0000 001 11111
+AXFLAG  1101 0101 0000 0 000 0100 0000 010 11111
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index 09258a9854f..33bebe594d1 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -1809,9 +1809,25 @@ static bool trans_SB(DisasContext *s, arg_SB *a)
  return true;
  }
  
-static void gen_xaflag(void)

+static bool trans_CFINV(DisasContext *s, arg_CFINV *a)
  {
-TCGv_i32 z = tcg_temp_new_i32();
+if (!dc_isar_feature(aa64_condm_4, s)) {
+return false;
+}
+tcg_gen_xori_i32(cpu_CF, cpu_CF, 1);
+s->base.is_jmp = DISAS_NEXT;
+return true;
+}


The settings of is_jmp do not need to be copied across.
That's another benefit of extracting from MSR_i.
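For illustration, a minimal sketch of the suggested shape (not the final
patch): dropping the assignment simply leaves is_jmp at its default, which
is already DISAS_NEXT.

static bool trans_CFINV(DisasContext *s, arg_CFINV *a)
{
    if (!dc_isar_feature(aa64_condm_4, s)) {
        return false;
    }
    tcg_gen_xori_i32(cpu_CF, cpu_CF, 1);
    return true;
}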

Otherwise,
Reviewed-by: Richard Henderson 


r~



Re: [PATCH 03/20] target/arm: Convert barrier insns to decodetree

2023-06-02 Thread Richard Henderson

On 6/2/23 08:52, Peter Maydell wrote:

+# Barriers
+
+CLREX   1101 0101 0000 0011 0011 imm:4 010 11111

...

+ISB 1101 0101 0000 0011 0011 imm:4 110 11111


The two imm:4 fields are ignored; use ---- instead?

Otherwise,
Reviewed-by: Richard Henderson 


r~



Re: [PATCH 02/20] target/arm: Convert hint instruction space to decodetree

2023-06-02 Thread Richard Henderson

On 6/2/23 08:52, Peter Maydell wrote:

Convert the various instructions in the hint instruction space
to decodetree.

Signed-off-by: Peter Maydell
---
  target/arm/tcg/a64.decode  |  31 
  target/arm/tcg/translate-a64.c | 277 ++---
  2 files changed, 185 insertions(+), 123 deletions(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH 01/20] target/arm: Fix return value from LDSMIN/LDSMAX 8/16 bit atomics

2023-06-02 Thread Richard Henderson

On 6/2/23 08:52, Peter Maydell wrote:

The atomic memory operations are supposed to return the old memory
data value in the destination register.  This value is not
sign-extended, even if the operation is the signed minimum or
maximum.  (In the pseudocode for the instructions the returned data
value is passed to ZeroExtend() to create the value in the register.)

We got this wrong because we were doing a 32-to-64 zero extend on the
result for 8 and 16 bit data values, rather than the correct amount
of zero extension.

Fix the bug by using ext8u and ext16u for the MO_8 and MO_16 data
sizes rather than ext32u.

Cc:qemu-sta...@nongnu.org
Signed-off-by: Peter Maydell
---
  target/arm/tcg/translate-a64.c | 18 --
  1 file changed, 16 insertions(+), 2 deletions(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH] target/arm: trap DCC access in user mode emulation

2023-06-02 Thread Richard Henderson

On 6/2/23 14:43, Zhuojia Shen wrote:

Accessing EL0-accessible Debug Communication Channel (DCC) registers in
user mode emulation is currently enabled.  However, it does not match
Linux behavior as Linux sets MDSCR_EL1.TDCC on startup to disable EL0
access to DCC (see __cpu_setup() in arch/arm64/mm/proc.S).

This patch fixes access_tdcc() to check MDSCR_EL1.TDCC for EL0 and sets
MDSCR_EL1.TDCC for user mode emulation to match Linux.

Signed-off-by: Zhuojia Shen 


It would be nice to define the fields of MDSCR properly but either way,
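A minimal sketch of what defining the fields could look like, using the
FIELD()/FIELD_EX64() macros from hw/registerfields.h; the bit position is
per the Arm ARM, and the surrounding check is only an illustration of how
access_tdcc() could then read, not the actual patch:

/* Sketch only: describe MDSCR_EL1 bits instead of open-coding masks. */
FIELD(MDSCR_EL1, TDCC, 12, 1)

    int el = arm_current_el(env);
    if (el == 0 && FIELD_EX64(env->cp15.mdscr_el1, MDSCR_EL1, TDCC)) {
        return CP_ACCESS_TRAP;
    }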

Reviewed-by: Richard Henderson 

r~



Re: [PATCH v3 00/48] tcg: Build once for system, once for user

2023-06-02 Thread Richard Henderson

On 6/2/23 14:25, Philippe Mathieu-Daudé wrote:

On 31/5/23 06:02, Richard Henderson wrote:


  133 files changed, 3022 insertions(+), 2728 deletions(-)



  create mode 100644 include/exec/helper-gen-common.h
  create mode 100644 include/exec/helper-proto-common.h



  create mode 100644 include/exec/helper-gen.h.inc
  create mode 100644 include/exec/helper-proto.h.inc
  create mode 100644 include/exec/helper-info.c.inc


These new files miss a license.


The old file from which they were split didn't have one either.
But I think there's no reason not to add

   SPDX-License-Identifier: GPL-2.0-or-later

to the top of each.


r~



Re: [PATCH 35.5] target/ppc: Inline gen_icount_io_start()

2023-06-02 Thread Richard Henderson

On 6/2/23 02:54, Philippe Mathieu-Daudé wrote:

Now that gen_icount_io_start() is a simple wrapper to
translator_io_start(), inline it.

Signed-off-by: Philippe Mathieu-Daudé 
---
  target/ppc/translate.c | 63 --
  target/ppc/power8-pmu-regs.c.inc   | 10 ++--
  target/ppc/translate/branch-impl.c.inc |  2 +-
  3 files changed, 35 insertions(+), 40 deletions(-)


Reviewed-by: Richard Henderson 

Added to my patch set.


r~



Re: [PATCH v3 30/48] exec-all: Widen tb_page_addr_t for user-only

2023-06-02 Thread Richard Henderson

On 6/2/23 03:02, Philippe Mathieu-Daudé wrote:

On 31/5/23 06:03, Richard Henderson wrote:

This is a step toward making TranslationBlock agnostic
to the address size of the guest.


My understanding is tb_page_addr_t is QEMU internal, not exposed
to the guest, thus abi_ulong isn't required. It was a tiny memory
optimization we could do when abi_ulong is 32-bit. Therefore we
can widen the type, unifying/simplifying TB management on the host.
Is that correct?


Yes, exactly.
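For readers following along, a rough sketch of that reading (the "after"
type name is illustrative; see the actual patch for the exact spelling):

/* Before: guest-sized only as a memory optimization for user-only. */
#ifdef CONFIG_USER_ONLY
typedef abi_ulong tb_page_addr_t;
#else
typedef ram_addr_t tb_page_addr_t;
#endif

/* After (sketch): widen the user-only case so TB management no longer
 * depends on the guest's pointer width, e.g. */
#ifdef CONFIG_USER_ONLY
typedef vaddr tb_page_addr_t;
#endif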


r~



Re: [PATCH v3 24/48] tcg: Split helper-proto.h

2023-06-02 Thread Richard Henderson

On 6/2/23 14:14, Philippe Mathieu-Daudé wrote:

On 31/5/23 06:03, Richard Henderson wrote:

Create helper-proto-common.h without the target specific portion.
Use that in tcg-op-common.h.  Include helper-proto.h in target/arm
and target/hexagon before helper-info.c.inc; all other targets are
already correct in this regard.

Signed-off-by: Richard Henderson 
---
  include/exec/helper-proto-common.h | 17 +++
  include/exec/helper-proto.h    | 72 --
  include/tcg/tcg-op-common.h    |  2 +-
  include/exec/helper-proto.h.inc    | 67 +++
  accel/tcg/cputlb.c |  3 +-
  accel/tcg/plugin-gen.c |  2 +-
  accel/tcg/tcg-runtime-gvec.c   |  2 +-
  accel/tcg/tcg-runtime.c    |  2 +-
  target/arm/tcg/translate.c |  1 +
  target/hexagon/translate.c |  1 +
  10 files changed, 99 insertions(+), 70 deletions(-)
  create mode 100644 include/exec/helper-proto-common.h
  create mode 100644 include/exec/helper-proto.h.inc




diff --git a/include/exec/helper-proto.h.inc b/include/exec/helper-proto.h.inc
new file mode 100644
index 00..f6f0cfcacd
--- /dev/null
+++ b/include/exec/helper-proto.h.inc

...


Should we guard this header for multiple inclusions?


No, *.h.inc again.


r~



Re: [PATCH v3 23/48] tcg: Split helper-gen.h

2023-06-02 Thread Richard Henderson

On 6/2/23 14:17, Philippe Mathieu-Daudé wrote:

On 31/5/23 06:03, Richard Henderson wrote:

Create helper-gen-common.h without the target specific portion.
Use that in tcg-op-common.h.  Reorg headers in target/arm to
ensure that helper-gen.h is included before helper-info.c.inc.
All other targets are already correct in this regard.

Signed-off-by: Richard Henderson 
---
  include/exec/helper-gen-common.h |  17 ++
  include/exec/helper-gen.h    | 101 ++-
  include/tcg/tcg-op-common.h  |   2 +-
  include/exec/helper-gen.h.inc    | 101 +++
  target/arm/tcg/translate.c   |   8 +--
  5 files changed, 126 insertions(+), 103 deletions(-)
  create mode 100644 include/exec/helper-gen-common.h
  create mode 100644 include/exec/helper-gen.h.inc




diff --git a/include/exec/helper-gen.h.inc b/include/exec/helper-gen.h.inc
new file mode 100644
index 00..83bfa5b23f
--- /dev/null
+++ b/include/exec/helper-gen.h.inc
@@ -0,0 +1,101 @@
+/*
+ * Helper file for declaring TCG helper functions.
+ * This one expands generation functions for tcg opcodes.
+ * Define HELPER_H for the header file to be expanded,
+ * and static inline to change from global file scope.
+ */
+
+#include "tcg/tcg.h"
+#include "tcg/helper-info.h"
+#include "exec/helper-head.h"
+
+#define DEF_HELPER_FLAGS_0(name, flags, ret)    \
+extern TCGHelperInfo glue(helper_info_, name);  \
+static inline void glue(gen_helper_, name)(dh_retvar_decl0(ret))    \
+{   \
+    tcg_gen_call0((helper_info_, name), dh_retvar(ret));   \
+}

[...]

File not guarded for multiple inclusions, otherwise:
Reviewed-by: Philippe Mathieu-Daudé 


That is why it is named ".h.inc", because it *is* included multiple times.
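For readers unfamiliar with the convention, the consumer looks roughly like
this (sketch; in this series the wrapper is include/exec/helper-gen.h):

/* The .h.inc body is expanded once per HELPER_H definition. */
#define HELPER_H "helper.h"
#include "exec/helper-gen.h.inc"
#undef  HELPER_H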


r~



Re: [PATCH v3 15/48] tcg: Split tcg/tcg-op-common.h from tcg/tcg-op.h

2023-06-02 Thread Richard Henderson

On 6/2/23 14:29, Philippe Mathieu-Daudé wrote:

On 31/5/23 06:02, Richard Henderson wrote:

Create tcg/tcg-op-common.h, moving everything that does not concern
TARGET_LONG_BITS or TCGv.  Adjust tcg/*.c to use the new header
instead of tcg-op.h, in preparation for compiling tcg/ only once.

Signed-off-by: Richard Henderson 
---
  include/tcg/tcg-op-common.h |  996 ++
  include/tcg/tcg-op.h    | 1004 +--
  tcg/optimize.c  |    2 +-
  tcg/tcg-op-gvec.c   |    2 +-
  tcg/tcg-op-ldst.c   |    2 +-
  tcg/tcg-op-vec.c    |    2 +-
  tcg/tcg-op.c    |    2 +-
  tcg/tcg.c   |    2 +-
  tcg/tci.c   |    3 +-
  9 files changed, 1007 insertions(+), 1008 deletions(-)


Trivial review using 'git-diff --color-moved=dimmed-zebra'.


r-b?

r~



Re: [RFC PATCH 1/2] bulk: Replace !CONFIG_SOFTMMU -> CONFIG_USER_ONLY

2023-06-02 Thread Richard Henderson

On 6/2/23 15:58, Philippe Mathieu-Daudé wrote:

CONFIG_USER_ONLY is the opposite of CONFIG_SOFTMMU.
Replace the !CONFIG_SOFTMMU negation with the positive form,
which is clearer when reviewing code.
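Concretely, the replacement looks like this (illustrative fragment):

-#ifndef CONFIG_SOFTMMU
+#ifdef CONFIG_USER_ONLY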


CONFIG_SOFTMMU should be reserved for the actual softmmu tlb, which we *should* be able to 
enable for user-only.  It is the only way to handle some of our host/guest page size 
problems.  Further, CONFIG_SOFTMMU should go away as a #define and become a runtime test 
(forced to true for system mode).  Pie in the sky stuff.


It is quite likely that all uses of CONFIG_SOFTMMU outside of tcg/, accel/tcg/, and random 
bits of include/ should only be using CONFIG_USER_ONLY.



r~



Re: [PATCH 0/2] target/i386/helper: Minor #ifdef'ry simplifications

2023-06-02 Thread Richard Henderson

On 6/2/23 15:46, Philippe Mathieu-Daudé wrote:

Not a very interesting code shuffle, but this was in
the way of another big cleanup, so I'm sending it separately.

BTW this file isn't covered in MAINTAINERS:

   $ ./scripts/get_maintainer.pl -f target/i386/helper.c
   get_maintainer.pl: No maintainers found

Philippe Mathieu-Daudé (2):
   target/i386/helper: Remove do_cpu_sipi() stub for user-mode emulation
   target/i386/helper: Shuffle do_cpu_init()

  target/i386/cpu.h|  3 ++-
  target/i386/helper.c | 15 ---
  2 files changed, 6 insertions(+), 12 deletions(-)



Reviewed-by: Richard Henderson 

r~



Re: [PATCH] target/hppa/meson: Only build int_helper.o with system emulation

2023-06-02 Thread Richard Henderson

On 6/2/23 15:30, Philippe Mathieu-Daudé wrote:

int_helper.c only contains system emulation code:
remove the #ifdef'ry and move the file to the meson
softmmu source set.

Signed-off-by: Philippe Mathieu-Daudé
---
  target/hppa/int_helper.c | 3 ---
  target/hppa/meson.build  | 2 +-
  2 files changed, 1 insertion(+), 4 deletions(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH 0/3] meson.build: Group some entries in separate summary sections

2023-06-02 Thread Richard Henderson

On 6/2/23 10:18, Thomas Huth wrote:

For the average user, it is likely quite difficult to tell which library is
responsible for the different features that QEMU supports. Let's make
it a little bit easier for them and put some libraries into separate
groups in the summary output of meson.

Thomas Huth (3):
   meson.build: Group the UI entries in a separate summary section
   meson.build: Group the network backend entries in a separate summary
 section
   meson.build: Group the audio backend entries in a separate summary
 section

  meson.build | 48 +++-
  1 file changed, 31 insertions(+), 17 deletions(-)



Reviewed-by: Richard Henderson 

r~



Re: [PATCH] target/arm: Fix return value from LDSMIN/LDSMAX 8/16 bit atomics

2023-06-02 Thread Richard Henderson

On 6/2/23 07:22, Peter Maydell wrote:

The atomic memory operations are supposed to return the old memory
data value in the destination register.  This value is not
sign-extended, even if the operation is the signed minimum or
maximum.  (In the pseudocode for the instructions the returned data
value is passed to ZeroExtend() to create the value in the register.)

We got this wrong because we were doing a 32-to-64 zero extend on the
result for 8 and 16 bit data values, rather than the correct amount
of zero extension.

Fix the bug by using ext8u and ext16u for the MO_8 and MO_16 data
sizes rather than ext32u.

Cc: qemu-sta...@nongnu.org
Signed-off-by: Peter Maydell 
---
  target/arm/tcg/translate-a64.c | 18 --
  1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index 741a6087399..075553e15f5 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -3401,8 +3401,22 @@ static void disas_ldst_atomic(DisasContext *s, uint32_t 
insn,
   */
  fn(tcg_rt, clean_addr, tcg_rs, get_mem_index(s), mop);
  
-if ((mop & MO_SIGN) && size != MO_64) {

-tcg_gen_ext32u_i64(tcg_rt, tcg_rt);
+if (mop & MO_SIGN) {
+switch (size) {
+case MO_8:
+tcg_gen_ext8u_i64(tcg_rt, tcg_rt);
+break;
+case MO_16:
+tcg_gen_ext16u_i64(tcg_rt, tcg_rt);
+break;
+case MO_32:
+tcg_gen_ext32u_i64(tcg_rt, tcg_rt);
+break;
+case MO_64:
+break;
+default:
+g_assert_not_reached();
+}


This reminds me that we have a function in tcg to handle this switch, but it isn't 
exposed.  I keep meaning to do that...
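For context, a sketch of what that could eventually look like, assuming a
helper with the shape tcg_gen_ext_i64(ret, val, memop) gets exported (an
assumption at this point, not something in the tree yet):

    /* A MemOp without MO_SIGN requests a zero-extension, so the whole
     * switch would collapse to a single call. */
    if (mop & MO_SIGN) {
        tcg_gen_ext_i64(tcg_rt, tcg_rt, mop & MO_SIZE);
    }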


Reviewed-by: Richard Henderson 


r~



Re: [PATCH v4 0/2] target/arm: allow DC CVA[D]P in user mode emulation

2023-06-02 Thread Richard Henderson

On 6/1/23 14:53, Zhuojia Shen wrote:

Zhuojia Shen (2):
   target/arm: allow DC CVA[D]P in user mode emulation
   tests/tcg/aarch64: add DC CVA[D]P tests

  target/arm/helper.c   |  6 +--
  tests/tcg/aarch64/Makefile.target | 11 ++
  tests/tcg/aarch64/dcpodp.c| 63 +++
  tests/tcg/aarch64/dcpop.c | 63 +++
  4 files changed, 139 insertions(+), 4 deletions(-)
  create mode 100644 tests/tcg/aarch64/dcpodp.c
  create mode 100644 tests/tcg/aarch64/dcpop.c


Reviewed-by: Richard Henderson 

r~



Re: [PATCH v2] meson.build: Use -Wno-undef only for SDL2 versions that need it

2023-06-02 Thread Richard Henderson

On 6/2/23 09:34, Thomas Huth wrote:

There is no need to disable this useful compiler warning for
all versions of the SDL. Unfortunately, various versions are
buggy (besides SDL 2.0.8, versions 2.26.0 and 2.26.1 are
broken, too, see https://github.com/libsdl-org/SDL/issues/6619),
but we can use a simple compiler check to see whether we need
the -Wno-undef or not.

This also enables the printing of the version number with
good versions of the SDL in the summary of the meson output
again.

Signed-off-by: Thomas Huth
---
  v2: Compile test code instead of hard-coding the version number


Reviewed-by: Richard Henderson 

r~



[PATCH v3 0/2] hw/i386/pc: Update max_cpus and default to SMBIOS

2023-06-02 Thread Suravee Suthikulpanit
In order to support a large number of vcpus, a newer 64-bit SMBIOS
entry point type is needed. Therefore, upgrade the default SMBIOS version
for PC machines to SMBIOS 3.0 for newer machine models. Then increase the
maximum number of vCPUs for the Q35 models to 1024, which is the limit for KVM.

Changes from V2:
(https://lore.kernel.org/qemu-devel/20230531225127.331998-1-suravee.suthikulpa...@amd.com/)
* Add patch 1.

Changes from V1:
(https://lore.kernel.org/all/ynkdgsii1vfvx...@redhat.com/T/)
 * Bump from 512 to KVM_MAX_VCPUS (per Igor's suggestion)

Thank you,
Suravee

Suravee Suthikulpanit (2):
  hw/i386/pc: Default to use SMBIOS 3.0 for newer machine models
  pc: q35: Bump max_cpus to 1024

 hw/i386/pc.c |  5 -
 hw/i386/pc_piix.c| 14 ++
 hw/i386/pc_q35.c | 16 +++-
 include/hw/i386/pc.h |  2 ++
 4 files changed, 35 insertions(+), 2 deletions(-)

-- 
2.34.1




Re: [PATCH 1/2] target/riscv: Add Zacas ISA extension support

2023-06-02 Thread Richard Henderson

On 6/2/23 05:16, Rob Bradford wrote:

+#if TARGET_LONG_BITS == 32
+static bool trans_amocas_w(DisasContext *ctx, arg_amocas_w *a)


You need to eliminate all of the ifdefs, because we can switch a 64-bit cpu into 32-bit 
mode -- get_xl(ctx) shows which mode we are in.
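A sketch of the suggested pattern (illustrative only, not the actual patch):

/* Decide at translation time from the runtime XLEN instead of
 * compiling for a fixed TARGET_LONG_BITS. */
static bool trans_amocas_w(DisasContext *ctx, arg_amocas_w *a)
{
    if (get_xl(ctx) == MXL_RV32) {
        /* ... generate the RV32 form ... */
    } else {
        /* ... generate the RV64 form ... */
    }
    return true;
}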



r~



[PATCH v3 2/2] pc: q35: Bump max_cpus to 1024

2023-06-02 Thread Suravee Suthikulpanit
Since KVM_MAX_VCPUS is currently defined to 1024 for x86 as shown in
arch/x86/include/asm/kvm_host.h, update QEMU limits to the same number.

In case KVM could not support the specified number of vcpus, QEMU would
return the following error message:

  qemu-system-x86_64: kvm_init_vcpu: kvm_get_vcpu failed (xxx): Invalid argument

Cc: Igor Mammedov 
Cc: Daniel P. Berrangé 
Cc: Michael S. Tsirkin 
Cc: Julia Suvorova 
Signed-off-by: Suravee Suthikulpanit 
---
 hw/i386/pc_q35.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index 2d1bb5fde5..2124ada58d 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -382,7 +382,7 @@ static void pc_q35_machine_options(MachineClass *m)
 machine_class_allow_dynamic_sysbus_dev(m, TYPE_INTEL_IOMMU_DEVICE);
 machine_class_allow_dynamic_sysbus_dev(m, TYPE_RAMFB_DEVICE);
 machine_class_allow_dynamic_sysbus_dev(m, TYPE_VMBUS_BRIDGE);
-m->max_cpus = 288;
+m->max_cpus = 1024;
 }
 
 static void pc_q35_8_1_machine_options(MachineClass *m)
-- 
2.34.1




[PATCH v3 1/2] hw/i386/pc: Default to use SMBIOS 3.0 for newer machine models

2023-06-02 Thread Suravee Suthikulpanit
Currently, the pc-q35 and pc-i440fx machine models default to using SMBIOS 2.8
(32-bit entry point). Since SMBIOS 3.0 (64-bit entry point) has been fully
supported since QEMU 7.0, default to SMBIOS 3.0 for newer machine
models. This is necessary to avoid the following message when launching
a VM with a large number of vcpus.

   "SMBIOS 2.1 table length 66822 exceeds 65535"

Note that the user can still override the entry point type with the QEMU option
"-M ..., smbios-entry-point-type=[32|64]".

Signed-off-by: Suravee Suthikulpanit 
---
 hw/i386/pc.c |  5 -
 hw/i386/pc_piix.c| 14 ++
 hw/i386/pc_q35.c | 14 ++
 include/hw/i386/pc.h |  2 ++
 4 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index bb62c994fa..fced0ab0eb 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1770,7 +1770,10 @@ static void pc_machine_set_smbios_ep(Object *obj, 
Visitor *v, const char *name,
 {
 PCMachineState *pcms = PC_MACHINE(obj);
 
-visit_type_SmbiosEntryPointType(v, name, &pcms->smbios_entry_point_type,
-errp);
+pcms->smbios_use_cmdline_ep_type =
+visit_type_SmbiosEntryPointType(v, name,
+&pcms->smbios_entry_point_type,
+errp);
 }
 
 static void pc_machine_get_max_ram_below_4g(Object *obj, Visitor *v,
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index d5b0dcd1fe..2905b2 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -199,6 +199,14 @@ static void pc_init1(MachineState *machine,
 pc_guest_info_init(pcms);
 
 if (pcmc->smbios_defaults) {
+/*
+ * Check if the user has specified a command line option to override
+ * the default SMBIOS entry point type.
+ */
+if (!pcms->smbios_use_cmdline_ep_type) {
+pcms->smbios_entry_point_type = pcmc->default_smbios_ep_type;
+}
+
 MachineClass *mc = MACHINE_GET_CLASS(machine);
 /* These values are guest ABI, do not change */
 smbios_set_defaults("QEMU", mc->desc,
@@ -453,6 +461,7 @@ static void pc_i440fx_machine_options(MachineClass *m)
 PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
 pcmc->pci_root_uid = 0;
 pcmc->default_cpu_version = 1;
+pcmc->default_smbios_ep_type = SMBIOS_ENTRY_POINT_TYPE_64;
 
 m->family = "pc_piix";
 m->desc = "Standard PC (i440FX + PIIX, 1996)";
@@ -476,11 +485,16 @@ DEFINE_I440FX_MACHINE(v8_1, "pc-i440fx-8.1", NULL,
 
 static void pc_i440fx_8_0_machine_options(MachineClass *m)
 {
+PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
+
 pc_i440fx_8_1_machine_options(m);
 m->alias = NULL;
 m->is_default = false;
 compat_props_add(m->compat_props, hw_compat_8_0, hw_compat_8_0_len);
 compat_props_add(m->compat_props, pc_compat_8_0, pc_compat_8_0_len);
+
+/* For pc-i440fx-8.0 and older, use SMBIOS 2.8 by default */
+pcmc->default_smbios_ep_type = SMBIOS_ENTRY_POINT_TYPE_32;
 }
 
 DEFINE_I440FX_MACHINE(v8_0, "pc-i440fx-8.0", NULL,
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index 6155427e48..2d1bb5fde5 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -199,6 +199,14 @@ static void pc_q35_init(MachineState *machine)
 pc_guest_info_init(pcms);
 
 if (pcmc->smbios_defaults) {
+/*
+ * Check if the user has specified a command line option to override
+ * the default SMBIOS entry point type.
+ */
+if (!pcms->smbios_use_cmdline_ep_type) {
+pcms->smbios_entry_point_type = pcmc->default_smbios_ep_type;
+}
+
 /* These values are guest ABI, do not change */
 smbios_set_defaults("QEMU", mc->desc,
 mc->name, pcmc->smbios_legacy_mode,
@@ -359,6 +367,7 @@ static void pc_q35_machine_options(MachineClass *m)
 PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
 pcmc->pci_root_uid = 0;
 pcmc->default_cpu_version = 1;
+pcmc->default_smbios_ep_type = SMBIOS_ENTRY_POINT_TYPE_64;
 
 m->family = "pc_q35";
 m->desc = "Standard PC (Q35 + ICH9, 2009)";
@@ -387,10 +396,15 @@ DEFINE_Q35_MACHINE(v8_1, "pc-q35-8.1", NULL,
 
 static void pc_q35_8_0_machine_options(MachineClass *m)
 {
+PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
+
 pc_q35_8_1_machine_options(m);
 m->alias = NULL;
 compat_props_add(m->compat_props, hw_compat_8_0, hw_compat_8_0_len);
 compat_props_add(m->compat_props, pc_compat_8_0, pc_compat_8_0_len);
+
+/* For pc-q35-8.0 and older, use SMBIOS 2.8 by default */
+pcmc->default_smbios_ep_type = SMBIOS_ENTRY_POINT_TYPE_32;
 }
 
 DEFINE_Q35_MACHINE(v8_0, "pc-q35-8.0", NULL,
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index c661e9cc80..f754da5a38 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -50,6 +50,7 @@ typedef struct PCMachineState {
 bool i8042_enabled;
 bool default_bus_bypass_iommu;
 uint64_t max_fw_size;
+bool smbios_use_cmdline_ep_type;
 
 /* ACPI Memory hotplug 

Re: [RFC v2] linux-user/riscv: Add syscall riscv_hwprobe

2023-06-02 Thread Richard Henderson

On 6/2/23 08:07, Andrew Jones wrote:

On Fri, Jun 02, 2023 at 04:39:20PM +0200, Robbin Ehn wrote:

On Fri, 2023-06-02 at 16:02 +0200, Andrew Jones wrote:

On Fri, Jun 02, 2023 at 11:41:11AM +0200, Robbin Ehn wrote:

...

+#if defined(TARGET_RISCV)
+case TARGET_NR_riscv_hwprobe:
+{


The { goes under the c of case, which will shift everything below four spaces
to the left as well.


This was an attempt to blend in, i.e. same style as the preceding case.
I'll change, thanks.


Hmm, I see. This function does have many cases with the indented format,
but not all of them, and the rest of the code base doesn't indent. I won't
insist on changing this, as long as checkpatch isn't complaining.


Splitting the entire thing out to a helper function is even cleaner.
We have lots of those, but certainly not universal.

I have, from time to time, tried to clean all of this up, but no one wanted to look at a 
100+ RFC patch set which only scratched the surface.
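A sketch of the split-out-helper shape being suggested (the helper name and
its parameter names are hypothetical):

#if defined(TARGET_RISCV)
static abi_long do_riscv_hwprobe(CPUArchState *cpu_env, abi_long pairs,
                                 abi_long pair_count, abi_long cpu_setsize,
                                 abi_long cpus, abi_long flags)
{
    /* ... validate flags and the cpu set, then fill each key/value pair ... */
    return 0;
}
#endif

...with the case in do_syscall1() reduced to a single call:

#if defined(TARGET_RISCV)
    case TARGET_NR_riscv_hwprobe:
        return do_riscv_hwprobe(cpu_env, arg1, arg2, arg3, arg4, arg5);
#endif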



r~




Re: [RFC v2] linux-user/riscv: Add syscall riscv_hwprobe

2023-06-02 Thread Richard Henderson

On 6/2/23 07:02, Andrew Jones wrote:

+struct riscv_hwprobe {
+int64_t  key;
+uint64_t value;
+};


The above is all uapi so Linux's arch/riscv/include/uapi/asm/hwprobe.h
should be picked up on Linux header update. You'll need to modify the
script, scripts/update-linux-headers.sh, to do that by adding a new
riscv-specific block. Hacking this by importing the header file manually
is fine for an RFC, but that should be a separate patch or part of the
syscall define hack patch. And hack patches should be clearly tagged as
"NOT FOR MERGE".



Not true.  linux-user/ never looks at linux-headers/.


r~



Re: [RFC v2] linux-user/riscv: Add syscall riscv_hwprobe

2023-06-02 Thread Richard Henderson

On 6/2/23 02:41, Robbin Ehn wrote:

+struct riscv_hwprobe {
+int64_t  key;
+uint64_t value;
+};


This needs to use abi_llong and abi_ullong, as the guest may not have the same alignment 
requirements as the host.
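i.e. something along these lines (sketch, struct name hypothetical):

struct target_riscv_hwprobe {
    abi_llong  key;
    abi_ullong value;
};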




+case RISCV_HWPROBE_KEY_MVENDORID:
+pair->value = cfg->mvendorid;
+break;


You must use __get_user and __put_user to handle host vs guest endianness.  All 
over.
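For example (sketch), every load and store through the guest pointer would
go through the accessors, which handle the byte swap:

    __get_user(key, &pair->key);
    ...
    __put_user(cfg->mvendorid, &pair->value);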



+case RISCV_HWPROBE_KEY_CPUPERF_0:
+pair->value = RISCV_HWPROBE_MISALIGNED_UNKNOWN;


Is that really what you want to expose here?  FAST is always going to be true, in that 
handling the unaligned access in the host is going to be faster than in the emulated guest.
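i.e., if that reasoning holds, the value reported would be (sketch):

    pair->value = RISCV_HWPROBE_MISALIGNED_FAST;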



+default:
+pair->key = -1;
+break;


Misalignment.


+#if defined(TARGET_RISCV)
+case TARGET_NR_riscv_hwprobe:
+{
+struct riscv_hwprobe *host_pairs;
+
+/* flags must be 0 */
+if (arg5 != 0) {
+return -TARGET_EINVAL;
+}
+
+/* check cpu_set */
+if (arg3 != 0) {
+int ccpu;
+size_t cpu_setsize = CPU_ALLOC_SIZE(arg3);
+cpu_set_t *host_cpus = lock_user(VERIFY_READ, arg4,
+ cpu_setsize, 0);
+if (!host_cpus) {
+return -TARGET_EFAULT;
+}
+ccpu = CPU_COUNT_S(cpu_setsize, host_cpus);


Where does CPU_ALLOC_SIZE and CPU_COUNT_S come from?


+unlock_user(host_cpus, arg4, cpu_setsize);
+/* no selected cpu */
+if (ccpu == 0) {
+return -TARGET_EINVAL;
+}


I suppose you're just looking to see that the set is not empty?


r~



[PATCH 27/35] target/riscv: Use aesdec_ISB_ISR_IMC_AK

2023-06-02 Thread Richard Henderson
This implements the AES64DSM instruction.  This was the last use
of aes64_operation and its support macros, so remove them all.

Signed-off-by: Richard Henderson 
---
 target/riscv/crypto_helper.c | 101 ---
 1 file changed, 10 insertions(+), 91 deletions(-)

diff --git a/target/riscv/crypto_helper.c b/target/riscv/crypto_helper.c
index 71694b787c..affa8292d1 100644
--- a/target/riscv/crypto_helper.c
+++ b/target/riscv/crypto_helper.c
@@ -104,96 +104,6 @@ target_ulong HELPER(aes32dsi)(target_ulong rs1, 
target_ulong rs2,
 return aes32_operation(shamt, rs1, rs2, false, false);
 }
 
-#define BY(X, I) ((X >> (8 * I)) & 0xFF)
-
-#define AES_SHIFROWS_LO(RS1, RS2) ( \
-(((RS1 >> 24) & 0xFF) << 56) | (((RS2 >> 48) & 0xFF) << 48) | \
-(((RS2 >> 8) & 0xFF) << 40) | (((RS1 >> 32) & 0xFF) << 32) | \
-(((RS2 >> 56) & 0xFF) << 24) | (((RS2 >> 16) & 0xFF) << 16) | \
-(((RS1 >> 40) & 0xFF) << 8) | (((RS1 >> 0) & 0xFF) << 0))
-
-#define AES_INVSHIFROWS_LO(RS1, RS2) ( \
-(((RS2 >> 24) & 0xFF) << 56) | (((RS2 >> 48) & 0xFF) << 48) | \
-(((RS1 >> 8) & 0xFF) << 40) | (((RS1 >> 32) & 0xFF) << 32) | \
-(((RS1 >> 56) & 0xFF) << 24) | (((RS2 >> 16) & 0xFF) << 16) | \
-(((RS2 >> 40) & 0xFF) << 8) | (((RS1 >> 0) & 0xFF) << 0))
-
-#define AES_MIXBYTE(COL, B0, B1, B2, B3) ( \
-BY(COL, B3) ^ BY(COL, B2) ^ AES_GFMUL(BY(COL, B1), 3) ^ \
-AES_GFMUL(BY(COL, B0), 2))
-
-#define AES_MIXCOLUMN(COL) ( \
-AES_MIXBYTE(COL, 3, 0, 1, 2) << 24 | \
-AES_MIXBYTE(COL, 2, 3, 0, 1) << 16 | \
-AES_MIXBYTE(COL, 1, 2, 3, 0) << 8 | AES_MIXBYTE(COL, 0, 1, 2, 3) << 0)
-
-#define AES_INVMIXBYTE(COL, B0, B1, B2, B3) ( \
-AES_GFMUL(BY(COL, B3), 0x9) ^ AES_GFMUL(BY(COL, B2), 0xd) ^ \
-AES_GFMUL(BY(COL, B1), 0xb) ^ AES_GFMUL(BY(COL, B0), 0xe))
-
-#define AES_INVMIXCOLUMN(COL) ( \
-AES_INVMIXBYTE(COL, 3, 0, 1, 2) << 24 | \
-AES_INVMIXBYTE(COL, 2, 3, 0, 1) << 16 | \
-AES_INVMIXBYTE(COL, 1, 2, 3, 0) << 8 | \
-AES_INVMIXBYTE(COL, 0, 1, 2, 3) << 0)
-
-static inline target_ulong aes64_operation(target_ulong rs1, target_ulong rs2,
-   bool enc, bool mix)
-{
-uint64_t RS1 = rs1;
-uint64_t RS2 = rs2;
-uint64_t result;
-uint64_t temp;
-uint32_t col_0;
-uint32_t col_1;
-
-if (enc) {
-temp = AES_SHIFROWS_LO(RS1, RS2);
-temp = (((uint64_t)AES_sbox[(temp >> 0) & 0xFF] << 0) |
-((uint64_t)AES_sbox[(temp >> 8) & 0xFF] << 8) |
-((uint64_t)AES_sbox[(temp >> 16) & 0xFF] << 16) |
-((uint64_t)AES_sbox[(temp >> 24) & 0xFF] << 24) |
-((uint64_t)AES_sbox[(temp >> 32) & 0xFF] << 32) |
-((uint64_t)AES_sbox[(temp >> 40) & 0xFF] << 40) |
-((uint64_t)AES_sbox[(temp >> 48) & 0xFF] << 48) |
-((uint64_t)AES_sbox[(temp >> 56) & 0xFF] << 56));
-if (mix) {
-col_0 = temp & 0xFFFFFFFF;
-col_1 = temp >> 32;
-
-col_0 = AES_MIXCOLUMN(col_0);
-col_1 = AES_MIXCOLUMN(col_1);
-
-result = ((uint64_t)col_1 << 32) | col_0;
-} else {
-result = temp;
-}
-} else {
-temp = AES_INVSHIFROWS_LO(RS1, RS2);
-temp = (((uint64_t)AES_isbox[(temp >> 0) & 0xFF] << 0) |
-((uint64_t)AES_isbox[(temp >> 8) & 0xFF] << 8) |
-((uint64_t)AES_isbox[(temp >> 16) & 0xFF] << 16) |
-((uint64_t)AES_isbox[(temp >> 24) & 0xFF] << 24) |
-((uint64_t)AES_isbox[(temp >> 32) & 0xFF] << 32) |
-((uint64_t)AES_isbox[(temp >> 40) & 0xFF] << 40) |
-((uint64_t)AES_isbox[(temp >> 48) & 0xFF] << 48) |
-((uint64_t)AES_isbox[(temp >> 56) & 0xFF] << 56));
-if (mix) {
-col_0 = temp & 0xFFFFFFFF;
-col_1 = temp >> 32;
-
-col_0 = AES_INVMIXCOLUMN(col_0);
-col_1 = AES_INVMIXCOLUMN(col_1);
-
-result = ((uint64_t)col_1 << 32) | col_0;
-} else {
-result = temp;
-}
-}
-
-return result;
-}
-
 target_ulong HELPER(aes64esm)(target_ulong rs1, target_ulong rs2)
 {
 AESState t, z = { };
@@ -230,7 +140,16 @@ target_ulong HELPER(aes64ds)(target_ulong rs1, 
target_ulong rs2)
 
 target_ulong HELPER(aes64dsm)(target_ulong rs1, target_ulong rs2)
 {
-return aes64_operation(rs1, rs2, false, true);
+AESState t, z = { };
+
+/*
+ * This instruction does not include a round key,
+ * so supply a zero to our primitive.
+ */
+t.d[HOST_BIG_ENDIAN] = rs1;
+t.d[!HOST_BIG_ENDIAN] = rs2;
+aesdec_ISB_ISR_IMC_AK(&t, &t, &z, false);
+return t.d[HOST_BIG_ENDIAN];
 }
 
 target_ulong HELPER(aes64ks2)(target_ulong rs1, target_ulong rs2)
-- 
2.34.1




[PATCH 34/35] crypto: Remove AES_imc

2023-06-02 Thread Richard Henderson
This array is no longer used.

Signed-off-by: Richard Henderson 
---
 include/crypto/aes.h |   7 --
 crypto/aes.c | 264 ---
 2 files changed, 271 deletions(-)

diff --git a/include/crypto/aes.h b/include/crypto/aes.h
index aa8b54065d..99209f51b9 100644
--- a/include/crypto/aes.h
+++ b/include/crypto/aes.h
@@ -36,13 +36,6 @@ extern const uint32_t AES_mc_rot[256];
 /* AES InvMixColumns, for use with rot32. */
 extern const uint32_t AES_imc_rot[256];
 
-/* AES InvMixColumns */
-/* AES_imc[x][0] = [x].[0e, 09, 0d, 0b]; */
-/* AES_imc[x][1] = [x].[0b, 0e, 09, 0d]; */
-/* AES_imc[x][2] = [x].[0d, 0b, 0e, 09]; */
-/* AES_imc[x][3] = [x].[09, 0d, 0b, 0e]; */
-extern const uint32_t AES_imc[256][4];
-
 /*
 AES_Te0[x] = S [x].[02, 01, 01, 03];
 AES_Te1[x] = S [x].[03, 02, 01, 01];
diff --git a/crypto/aes.c b/crypto/aes.c
index 914ccf38ef..4d84bef520 100644
--- a/crypto/aes.c
+++ b/crypto/aes.c
@@ -293,270 +293,6 @@ const uint32_t AES_imc_rot[256] = {
 0xbe805d9f, 0xb58d5491, 0xa89a4f83, 0xa397468d,
 };
 
-/* AES_imc[x][0] = [x].[0e, 09, 0d, 0b]; */
-/* AES_imc[x][1] = [x].[0b, 0e, 09, 0d]; */
-/* AES_imc[x][2] = [x].[0d, 0b, 0e, 09]; */
-/* AES_imc[x][3] = [x].[09, 0d, 0b, 0e]; */
-const uint32_t AES_imc[256][4] = {
-{ 0x, 0x, 0x, 0x, }, /* x=00 */
-{ 0x0E090D0B, 0x0B0E090D, 0x0D0B0E09, 0x090D0B0E, }, /* x=01 */
-{ 0x1C121A16, 0x161C121A, 0x1A161C12, 0x121A161C, }, /* x=02 */
-{ 0x121B171D, 0x1D121B17, 0x171D121B, 0x1B171D12, }, /* x=03 */
-{ 0x3824342C, 0x2C382434, 0x342C3824, 0x24342C38, }, /* x=04 */
-{ 0x362D3927, 0x27362D39, 0x3927362D, 0x2D392736, }, /* x=05 */
-{ 0x24362E3A, 0x3A24362E, 0x2E3A2436, 0x362E3A24, }, /* x=06 */
-{ 0x2A3F2331, 0x312A3F23, 0x23312A3F, 0x3F23312A, }, /* x=07 */
-{ 0x70486858, 0x58704868, 0x68587048, 0x48685870, }, /* x=08 */
-{ 0x7E416553, 0x537E4165, 0x65537E41, 0x4165537E, }, /* x=09 */
-{ 0x6C5A724E, 0x4E6C5A72, 0x724E6C5A, 0x5A724E6C, }, /* x=0A */
-{ 0x62537F45, 0x4562537F, 0x7F456253, 0x537F4562, }, /* x=0B */
-{ 0x486C5C74, 0x74486C5C, 0x5C74486C, 0x6C5C7448, }, /* x=0C */
-{ 0x4665517F, 0x7F466551, 0x517F4665, 0x65517F46, }, /* x=0D */
-{ 0x547E4662, 0x62547E46, 0x4662547E, 0x7E466254, }, /* x=0E */
-{ 0x5A774B69, 0x695A774B, 0x4B695A77, 0x774B695A, }, /* x=0F */
-{ 0xE090D0B0, 0xB0E090D0, 0xD0B0E090, 0x90D0B0E0, }, /* x=10 */
-{ 0xEE99DDBB, 0xBBEE99DD, 0xDDBBEE99, 0x99DDBBEE, }, /* x=11 */
-{ 0xFC82CAA6, 0xA6FC82CA, 0xCAA6FC82, 0x82CAA6FC, }, /* x=12 */
-{ 0xF28BC7AD, 0xADF28BC7, 0xC7ADF28B, 0x8BC7ADF2, }, /* x=13 */
-{ 0xD8B4E49C, 0x9CD8B4E4, 0xE49CD8B4, 0xB4E49CD8, }, /* x=14 */
-{ 0xD6BDE997, 0x97D6BDE9, 0xE997D6BD, 0xBDE997D6, }, /* x=15 */
-{ 0xC4A6FE8A, 0x8AC4A6FE, 0xFE8AC4A6, 0xA6FE8AC4, }, /* x=16 */
-{ 0xCAAFF381, 0x81CAAFF3, 0xF381CAAF, 0xAFF381CA, }, /* x=17 */
-{ 0x90D8B8E8, 0xE890D8B8, 0xB8E890D8, 0xD8B8E890, }, /* x=18 */
-{ 0x9ED1B5E3, 0xE39ED1B5, 0xB5E39ED1, 0xD1B5E39E, }, /* x=19 */
-{ 0x8CCAA2FE, 0xFE8CCAA2, 0xA2FE8CCA, 0xCAA2FE8C, }, /* x=1A */
-{ 0x82C3AFF5, 0xF582C3AF, 0xAFF582C3, 0xC3AFF582, }, /* x=1B */
-{ 0xA8FC8CC4, 0xC4A8FC8C, 0x8CC4A8FC, 0xFC8CC4A8, }, /* x=1C */
-{ 0xA6F581CF, 0xCFA6F581, 0x81CFA6F5, 0xF581CFA6, }, /* x=1D */
-{ 0xB4EE96D2, 0xD2B4EE96, 0x96D2B4EE, 0xEE96D2B4, }, /* x=1E */
-{ 0xBAE79BD9, 0xD9BAE79B, 0x9BD9BAE7, 0xE79BD9BA, }, /* x=1F */
-{ 0xDB3BBB7B, 0x7BDB3BBB, 0xBB7BDB3B, 0x3BBB7BDB, }, /* x=20 */
-{ 0xD532B670, 0x70D532B6, 0xB670D532, 0x32B670D5, }, /* x=21 */
-{ 0xC729A16D, 0x6DC729A1, 0xA16DC729, 0x29A16DC7, }, /* x=22 */
-{ 0xC920AC66, 0x66C920AC, 0xAC66C920, 0x20AC66C9, }, /* x=23 */
-{ 0xE31F8F57, 0x57E31F8F, 0x8F57E31F, 0x1F8F57E3, }, /* x=24 */
-{ 0xED16825C, 0x5CED1682, 0x825CED16, 0x16825CED, }, /* x=25 */
-{ 0xFF0D9541, 0x41FF0D95, 0x9541FF0D, 0x0D9541FF, }, /* x=26 */
-{ 0xF104984A, 0x4AF10498, 0x984AF104, 0x04984AF1, }, /* x=27 */
-{ 0xAB73D323, 0x23AB73D3, 0xD323AB73, 0x73D323AB, }, /* x=28 */
-{ 0xA57ADE28, 0x28A57ADE, 0xDE28A57A, 0x7ADE28A5, }, /* x=29 */
-{ 0xB761C935, 0x35B761C9, 0xC935B761, 0x61C935B7, }, /* x=2A */
-{ 0xB968C43E, 0x3EB968C4, 0xC43EB968, 0x68C43EB9, }, /* x=2B */
-{ 0x9357E70F, 0x0F9357E7, 0xE70F9357, 0x57E70F93, }, /* x=2C */
-{ 0x9D5EEA04, 0x049D5EEA, 0xEA049D5E, 0x5EEA049D, }, /* x=2D */
-{ 0x8F45FD19, 0x198F45FD, 0xFD198F45, 0x45FD198F, }, /* x=2E */
-{ 0x814CF012, 0x12814CF0, 0xF012814C, 0x4CF01281, }, /* x=2F */
-{ 0x3BAB6BCB, 0xCB3BAB6B, 0x6BCB3BAB, 0xAB6BCB3B, }, /* x=30 */
-{ 0x35A266C0, 0xC035A266, 0x66C035A2, 0xA266C035, }, /* x=31 */
-{ 0x27B971DD, 0xDD27B971, 0x71DD27B9, 0xB971DD27, }, /* x=32 */
-{ 0x29B07CD6, 0xD629B07C, 0x7CD629B0, 0xB07CD629, }, /* x=33 */
-{ 0x038F5FE7, 0xE7038F5F, 0x5FE7038F, 0x8F5FE703, }, /* x=34 */
-{ 0x0D8652EC, 0xEC0D8652, 0x52EC0D86, 0x8652EC0D, }, /* 

[PATCH 17/35] crypto: Add aesdec_IMC

2023-06-02 Thread Richard Henderson
Add a primitive for InvMixColumns.

Signed-off-by: Richard Henderson 
---
 host/include/generic/host/aes-round.h |  3 ++
 include/crypto/aes-round.h| 18 +
 crypto/aes.c  | 57 +++
 3 files changed, 78 insertions(+)

diff --git a/host/include/generic/host/aes-round.h 
b/host/include/generic/host/aes-round.h
index 7c48db24b6..1e9b97d274 100644
--- a/host/include/generic/host/aes-round.h
+++ b/host/include/generic/host/aes-round.h
@@ -15,6 +15,9 @@ void aesenc_MC_accel(AESState *, const AESState *, bool)
 void aesenc_SB_SR_accel(AESState *, const AESState *, bool)
 QEMU_ERROR("unsupported accel");
 
+void aesdec_IMC_accel(AESState *, const AESState *, bool)
+QEMU_ERROR("unsupported accel");
+
 void aesdec_ISB_ISR_accel(AESState *, const AESState *, bool)
 QEMU_ERROR("unsupported accel");
 
diff --git a/include/crypto/aes-round.h b/include/crypto/aes-round.h
index f25e9572a3..2d962ede0b 100644
--- a/include/crypto/aes-round.h
+++ b/include/crypto/aes-round.h
@@ -74,4 +74,22 @@ static inline void aesdec_ISB_ISR(AESState *r, const 
AESState *st, bool be)
 }
 }
 
+/*
+ * Perform InvMixColumns.
+ */
+
+void aesdec_IMC_gen(AESState *ret, const AESState *st);
+void aesdec_IMC_genrev(AESState *ret, const AESState *st);
+
+static inline void aesdec_IMC(AESState *r, const AESState *st, bool be)
+{
+if (HAVE_AES_ACCEL) {
+aesdec_IMC_accel(r, st, be);
+} else if (HOST_BIG_ENDIAN == be) {
+aesdec_IMC_gen(r, st);
+} else {
+aesdec_IMC_genrev(r, st);
+}
+}
+
 #endif /* CRYPTO_AES_ROUND_H */
diff --git a/crypto/aes.c b/crypto/aes.c
index c7123eddd5..4e654e5404 100644
--- a/crypto/aes.c
+++ b/crypto/aes.c
@@ -1402,6 +1402,63 @@ void aesdec_ISB_ISR_genrev(AESState *r, const AESState 
*st)
 aesdec_ISB_ISR_swap(r, st, true);
 }
 
+/* Perform InvMixColumns. */
+static inline void
+aesdec_IMC_swap(AESState *r, const AESState *st, bool swap)
+{
+int swap_b = swap * 0xf;
+int swap_w = swap * 0x3;
+bool be = HOST_BIG_ENDIAN ^ swap;
+uint32_t t;
+
+/* Note that AES_imc is encoded for big-endian. */
+t = (AES_imc[st->b[swap_b ^ 0x0]][0] ^
+ AES_imc[st->b[swap_b ^ 0x1]][1] ^
+ AES_imc[st->b[swap_b ^ 0x2]][2] ^
+ AES_imc[st->b[swap_b ^ 0x3]][3]);
+if (!be) {
+t = bswap32(t);
+}
+r->w[swap_w ^ 0] = t;
+
+t = (AES_imc[st->b[swap_b ^ 0x4]][0] ^
+ AES_imc[st->b[swap_b ^ 0x5]][1] ^
+ AES_imc[st->b[swap_b ^ 0x6]][2] ^
+ AES_imc[st->b[swap_b ^ 0x7]][3]);
+if (!be) {
+t = bswap32(t);
+}
+r->w[swap_w ^ 1] = t;
+
+t = (AES_imc[st->b[swap_b ^ 0x8]][0] ^
+ AES_imc[st->b[swap_b ^ 0x9]][1] ^
+ AES_imc[st->b[swap_b ^ 0xA]][2] ^
+ AES_imc[st->b[swap_b ^ 0xB]][3]);
+if (!be) {
+t = bswap32(t);
+}
+r->w[swap_w ^ 2] = t;
+
+t = (AES_imc[st->b[swap_b ^ 0xC]][0] ^
+ AES_imc[st->b[swap_b ^ 0xD]][1] ^
+ AES_imc[st->b[swap_b ^ 0xE]][2] ^
+ AES_imc[st->b[swap_b ^ 0xF]][3]);
+if (!be) {
+t = bswap32(t);
+}
+r->w[swap_w ^ 3] = t;
+}
+
+void aesdec_IMC_gen(AESState *r, const AESState *st)
+{
+aesdec_IMC_swap(r, st, false);
+}
+
+void aesdec_IMC_genrev(AESState *r, const AESState *st)
+{
+aesdec_IMC_swap(r, st, true);
+}
+
 /**
  * Expand the cipher key into the encryption key schedule.
  */
-- 
2.34.1




[PATCH 09/35] target/riscv: Use aesenc_SB_SR

2023-06-02 Thread Richard Henderson
This implements the AES64ES instruction.

Signed-off-by: Richard Henderson 
---
 target/riscv/crypto_helper.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/target/riscv/crypto_helper.c b/target/riscv/crypto_helper.c
index 2ef30281b1..82d7f3a060 100644
--- a/target/riscv/crypto_helper.c
+++ b/target/riscv/crypto_helper.c
@@ -22,6 +22,7 @@
 #include "exec/exec-all.h"
 #include "exec/helper-proto.h"
 #include "crypto/aes.h"
+#include "crypto/aes-round.h"
 #include "crypto/sm4.h"
 
 #define AES_XTIME(a) \
@@ -200,7 +201,12 @@ target_ulong HELPER(aes64esm)(target_ulong rs1, 
target_ulong rs2)
 
 target_ulong HELPER(aes64es)(target_ulong rs1, target_ulong rs2)
 {
-return aes64_operation(rs1, rs2, true, false);
+AESState t;
+
+t.d[HOST_BIG_ENDIAN] = rs1;
+t.d[!HOST_BIG_ENDIAN] = rs2;
+aesenc_SB_SR(&t, &t, false);
+return t.d[HOST_BIG_ENDIAN];
 }
 
 target_ulong HELPER(aes64ds)(target_ulong rs1, target_ulong rs2)
-- 
2.34.1




[PATCH 10/35] crypto: Add aesdec_ISB_ISR

2023-06-02 Thread Richard Henderson
Add a primitive for InvSubBytes + InvShiftRows.

Signed-off-by: Richard Henderson 
---
 host/include/generic/host/aes-round.h |  3 ++
 include/crypto/aes-round.h| 18 +++
 crypto/aes.c  | 46 +++
 3 files changed, 67 insertions(+)

diff --git a/host/include/generic/host/aes-round.h 
b/host/include/generic/host/aes-round.h
index 598242c603..cb4fed61fe 100644
--- a/host/include/generic/host/aes-round.h
+++ b/host/include/generic/host/aes-round.h
@@ -12,4 +12,7 @@
 void aesenc_SB_SR_accel(AESState *, const AESState *, bool)
 QEMU_ERROR("unsupported accel");
 
+void aesdec_ISB_ISR_accel(AESState *, const AESState *, bool)
+QEMU_ERROR("unsupported accel");
+
 #endif
diff --git a/include/crypto/aes-round.h b/include/crypto/aes-round.h
index 784e1daee6..ff1914bd63 100644
--- a/include/crypto/aes-round.h
+++ b/include/crypto/aes-round.h
@@ -38,4 +38,22 @@ static inline void aesenc_SB_SR(AESState *r, const AESState 
*st, bool be)
 }
 }
 
+/*
+ * Perform InvSubBytes + InvShiftRows.
+ */
+
+void aesdec_ISB_ISR_gen(AESState *ret, const AESState *st);
+void aesdec_ISB_ISR_genrev(AESState *ret, const AESState *st);
+
+static inline void aesdec_ISB_ISR(AESState *r, const AESState *st, bool be)
+{
+if (HAVE_AES_ACCEL) {
+aesdec_ISB_ISR_accel(r, st, be);
+} else if (HOST_BIG_ENDIAN == be) {
+aesdec_ISB_ISR_gen(r, st);
+} else {
+aesdec_ISB_ISR_genrev(r, st);
+}
+}
+
 #endif /* CRYPTO_AES_ROUND_H */
diff --git a/crypto/aes.c b/crypto/aes.c
index 708838315a..937377647f 100644
--- a/crypto/aes.c
+++ b/crypto/aes.c
@@ -1298,6 +1298,52 @@ void aesenc_SB_SR_genrev(AESState *r, const AESState *st)
 aesenc_SB_SR_swap(r, st, true);
 }
 
+/* Perform InvSubBytes + InvShiftRows. */
+static inline void
+aesdec_ISB_ISR_swap(AESState *r, const AESState *st, bool swap)
+{
+const int swap_b = swap ? 15 : 0;
+uint8_t t;
+
+/* These four indexes are not swizzled. */
+r->b[swap_b ^ 0x0] = AES_isbox[st->b[swap_b ^ AES_ISH_0]];
+r->b[swap_b ^ 0x4] = AES_isbox[st->b[swap_b ^ AES_ISH_4]];
+r->b[swap_b ^ 0x8] = AES_isbox[st->b[swap_b ^ AES_ISH_8]];
+r->b[swap_b ^ 0xc] = AES_isbox[st->b[swap_b ^ AES_ISH_C]];
+
+/* Otherwise, break cycles. */
+
+t = AES_isbox[st->b[swap_b ^ AES_ISH_5]];
+r->b[swap_b ^ 0x1] = AES_isbox[st->b[swap_b ^ AES_ISH_1]];
+r->b[swap_b ^ 0xd] = AES_isbox[st->b[swap_b ^ AES_ISH_D]];
+r->b[swap_b ^ 0x9] = AES_isbox[st->b[swap_b ^ AES_ISH_9]];
+r->b[swap_b ^ 0x5] = t;
+
+t = AES_isbox[st->b[swap_b ^ AES_ISH_A]];
+r->b[swap_b ^ 0x2] = AES_isbox[st->b[swap_b ^ AES_ISH_2]];
+r->b[swap_b ^ 0xa] = t;
+
+t = AES_isbox[st->b[swap_b ^ AES_ISH_E]];
+r->b[swap_b ^ 0x6] = AES_isbox[st->b[swap_b ^ AES_ISH_6]];
+r->b[swap_b ^ 0xe] = t;
+
+t = AES_isbox[st->b[swap_b ^ AES_ISH_F]];
+r->b[swap_b ^ 0x3] = AES_isbox[st->b[swap_b ^ AES_ISH_3]];
+r->b[swap_b ^ 0x7] = AES_isbox[st->b[swap_b ^ AES_ISH_7]];
+r->b[swap_b ^ 0xb] = AES_isbox[st->b[swap_b ^ AES_ISH_B]];
+r->b[swap_b ^ 0xf] = t;
+}
+
+void aesdec_ISB_ISR_gen(AESState *r, const AESState *st)
+{
+aesdec_ISB_ISR_swap(r, st, false);
+}
+
+void aesdec_ISB_ISR_genrev(AESState *r, const AESState *st)
+{
+aesdec_ISB_ISR_swap(r, st, true);
+}
+
 /**
  * Expand the cipher key into the encryption key schedule.
  */
-- 
2.34.1




[PATCH 23/35] target/ppc: Use aesenc_SB_SR_MC_AK

2023-06-02 Thread Richard Henderson
This implements the VCIPHER instruction.

Signed-off-by: Richard Henderson 
---
 target/ppc/int_helper.c | 14 --
 1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index 444beb1779..c7f8b39e9a 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -2933,17 +2933,11 @@ void helper_vsbox(ppc_avr_t *r, ppc_avr_t *a)
 
 void helper_vcipher(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
 {
-ppc_avr_t result;
-int i;
+AESState *ad = (AESState *)r;
+AESState *st = (AESState *)a;
+AESState *rk = (AESState *)b;
 
-VECTOR_FOR_INORDER_I(i, u32) {
-result.VsrW(i) = b->VsrW(i) ^
-(AES_Te0[a->VsrB(AES_shifts[4 * i + 0])] ^
- AES_Te1[a->VsrB(AES_shifts[4 * i + 1])] ^
- AES_Te2[a->VsrB(AES_shifts[4 * i + 2])] ^
- AES_Te3[a->VsrB(AES_shifts[4 * i + 3])]);
-}
-*r = result;
+aesenc_SB_SR_MC_AK(ad, st, rk, true);
 }
 
 void helper_vcipherlast(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
-- 
2.34.1




[PATCH 31/35] host/include/aarch64: Implement aes-round.h

2023-06-02 Thread Richard Henderson
Detect AES in cpuinfo; implement the accel hooks.

Signed-off-by: Richard Henderson 
---
 host/include/aarch64/host/aes-round.h | 204 ++
 host/include/aarch64/host/cpuinfo.h   |   1 +
 util/cpuinfo-aarch64.c|   2 +
 3 files changed, 207 insertions(+)
 create mode 100644 host/include/aarch64/host/aes-round.h

diff --git a/host/include/aarch64/host/aes-round.h 
b/host/include/aarch64/host/aes-round.h
new file mode 100644
index 00..27ca823db6
--- /dev/null
+++ b/host/include/aarch64/host/aes-round.h
@@ -0,0 +1,204 @@
+/*
+ * AArch64 specific aes acceleration.
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#ifndef HOST_AES_ROUND_H
+#define HOST_AES_ROUND_H
+
+#include "host/cpuinfo.h"
+#include 
+
+#ifdef __ARM_FEATURE_AES
+# define HAVE_AES_ACCEL  true
+# define ATTR_AES_ACCEL
+#else
+# define HAVE_AES_ACCEL  likely(cpuinfo & CPUINFO_AES)
+# define ATTR_AES_ACCEL  __attribute__((target("+crypto")))
+#endif
+
+static inline uint8x16_t aes_accel_bswap(uint8x16_t x)
+{
+/* No arm_neon.h primitive, and the compilers don't share builtins. */
+#ifdef __clang__
+return __builtin_shufflevector(x, x, 15, 14, 13, 12, 11, 10, 9, 8,
+   7, 6, 5, 4, 3, 2, 1, 0);
+#else
+return __builtin_shuffle(x, (uint8x16_t)
+ { 15, 14, 13, 12, 11, 10, 9, 8,
+   7,  6,  5,  4,  3,   2, 1, 0, });
+#endif
+}
+
+/*
+ * Through clang 15, the aes inlines are only defined if __ARM_FEATURE_AES;
+ * one cannot use __attribute__((target)) to make them appear after the fact.
+ * Therefore we must fallback to inline asm.
+ */
+#ifdef __ARM_FEATURE_AES
+# define aes_accel_aesd   vaesdq_u8
+# define aes_accel_aese   vaeseq_u8
+# define aes_accel_aesmc  vaesmcq_u8
+# define aes_accel_aesimc vaesimcq_u8
+#else
+static inline uint8x16_t aes_accel_aesd(uint8x16_t d, uint8x16_t k)
+{
+asm(".arch_extension aes\n\t"
+"aesd %0.16b, %1.16b" : "+w"(d) : "w"(k));
+return d;
+}
+
+static inline uint8x16_t aes_accel_aese(uint8x16_t d, uint8x16_t k)
+{
+asm(".arch_extension aes\n\t"
+"aese %0.16b, %1.16b" : "+w"(d) : "w"(k));
+return d;
+}
+
+static inline uint8x16_t aes_accel_aesmc(uint8x16_t d)
+{
+asm(".arch_extension aes\n\t"
+"aesmc %0.16b, %1.16b" : "=w"(d) : "w"(d));
+return d;
+}
+
+static inline uint8x16_t aes_accel_aesimc(uint8x16_t d)
+{
+asm(".arch_extension aes\n\t"
+"aesimc %0.16b, %1.16b" : "=w"(d) : "w"(d));
+return d;
+}
+#endif /* __ARM_FEATURE_AES */
+
+static inline void ATTR_AES_ACCEL
+aesenc_MC_accel(AESState *ret, const AESState *st, bool be)
+{
+uint8x16_t t = (uint8x16_t)st->v;
+
+if (be) {
+t = aes_accel_bswap(t);
+t = aes_accel_aesmc(t);
+t = aes_accel_bswap(t);
+} else {
+t = aes_accel_aesmc(t);
+}
+ret->v = (AESStateVec)t;
+}
+
+static inline void ATTR_AES_ACCEL
+aesenc_SB_SR_accel(AESState *ret, const AESState *st, bool be)
+{
+uint8x16_t t = (uint8x16_t)st->v;
+uint8x16_t z = { };
+
+if (be) {
+t = aes_accel_bswap(t);
+t = aes_accel_aese(t, z);
+t = aes_accel_bswap(t);
+} else {
+t = aes_accel_aese(t, z);
+}
+ret->v = (AESStateVec)t;
+}
+
+static inline void ATTR_AES_ACCEL
+aesenc_SB_SR_MC_AK_accel(AESState *ret, const AESState *st,
+ const AESState *rk, bool be)
+{
+uint8x16_t t = (uint8x16_t)st->v;
+uint8x16_t k = (uint8x16_t)rk->v;
+uint8x16_t z = { };
+
+if (be) {
+t = aes_accel_bswap(t);
+k = aes_accel_bswap(k);
+t = aes_accel_aese(t, z);
+t = aes_accel_aesmc(t);
+t = veorq_u8(t, k);
+t = aes_accel_bswap(t);
+} else {
+t = aes_accel_aese(t, z);
+t = aes_accel_aesmc(t);
+t = veorq_u8(t, k);
+}
+ret->v = (AESStateVec)t;
+}
+
+static inline void ATTR_AES_ACCEL
+aesdec_IMC_accel(AESState *ret, const AESState *st, bool be)
+{
+uint8x16_t t = (uint8x16_t)st->v;
+
+if (be) {
+t = aes_accel_bswap(t);
+t = aes_accel_aesimc(t);
+t = aes_accel_bswap(t);
+} else {
+t = aes_accel_aesimc(t);
+}
+ret->v = (AESStateVec)t;
+}
+
+static inline void ATTR_AES_ACCEL
+aesdec_ISB_ISR_accel(AESState *ret, const AESState *st, bool be)
+{
+uint8x16_t t = (uint8x16_t)st->v;
+uint8x16_t z = { };
+
+if (be) {
+t = aes_accel_bswap(t);
+t = aes_accel_aesd(t, z);
+t = aes_accel_bswap(t);
+} else {
+t = aes_accel_aesd(t, z);
+}
+ret->v = (AESStateVec)t;
+}
+
+static inline void ATTR_AES_ACCEL
+aesdec_ISB_ISR_AK_IMC_accel(AESState *ret, const AESState *st,
+const AESState *rk, bool be)
+{
+uint8x16_t t = (uint8x16_t)st->v;
+uint8x16_t k = (uint8x16_t)rk->v;
+uint8x16_t z = { };
+
+if (be) {
+t = aes_accel_bswap(t);
+k = aes_accel_bswap(k);
+t = 

[PATCH 35/35] crypto: Unexport AES_*_rot, AES_TeN, AES_TdN

2023-06-02 Thread Richard Henderson
These arrays are no longer used outside of aes.c.

Signed-off-by: Richard Henderson 
---
 include/crypto/aes.h | 25 -
 crypto/aes.c | 33 +
 2 files changed, 21 insertions(+), 37 deletions(-)

diff --git a/include/crypto/aes.h b/include/crypto/aes.h
index 99209f51b9..709d4d226b 100644
--- a/include/crypto/aes.h
+++ b/include/crypto/aes.h
@@ -30,29 +30,4 @@ void AES_decrypt(const unsigned char *in, unsigned char *out,
 extern const uint8_t AES_sbox[256];
 extern const uint8_t AES_isbox[256];
 
-/* AES MixColumns, for use with rot32. */
-extern const uint32_t AES_mc_rot[256];
-
-/* AES InvMixColumns, for use with rot32. */
-extern const uint32_t AES_imc_rot[256];
-
-/*
-AES_Te0[x] = S [x].[02, 01, 01, 03];
-AES_Te1[x] = S [x].[03, 02, 01, 01];
-AES_Te2[x] = S [x].[01, 03, 02, 01];
-AES_Te3[x] = S [x].[01, 01, 03, 02];
-AES_Te4[x] = S [x].[01, 01, 01, 01];
-
-AES_Td0[x] = Si[x].[0e, 09, 0d, 0b];
-AES_Td1[x] = Si[x].[0b, 0e, 09, 0d];
-AES_Td2[x] = Si[x].[0d, 0b, 0e, 09];
-AES_Td3[x] = Si[x].[09, 0d, 0b, 0e];
-AES_Td4[x] = Si[x].[01, 01, 01, 01];
-*/
-
-extern const uint32_t AES_Te0[256], AES_Te1[256], AES_Te2[256],
-  AES_Te3[256], AES_Te4[256];
-extern const uint32_t AES_Td0[256], AES_Td1[256], AES_Td2[256],
-  AES_Td3[256], AES_Td4[256];
-
 #endif
diff --git a/crypto/aes.c b/crypto/aes.c
index 4d84bef520..c51b1c1d5e 100644
--- a/crypto/aes.c
+++ b/crypto/aes.c
@@ -155,7 +155,7 @@ enum {
  * MixColumns lookup table, for use with rot32.
  * From Arm ARM pseudocode.
  */
-const uint32_t AES_mc_rot[256] = {
+static const uint32_t AES_mc_rot[256] = {
 0x, 0x03010102, 0x06020204, 0x05030306,
 0x0c040408, 0x0f05050a, 0x0a06060c, 0x0907070e,
 0x18080810, 0x1b090912, 0x1e0a0a14, 0x1d0b0b16,
@@ -226,7 +226,7 @@ const uint32_t AES_mc_rot[256] = {
  * Inverse MixColumns lookup table, for use with rot32.
  * From Arm ARM pseudocode.
  */
-const uint32_t AES_imc_rot[256] = {
+static const uint32_t AES_imc_rot[256] = {
 0x, 0x0b0d090e, 0x161a121c, 0x1d171b12,
 0x2c342438, 0x27392d36, 0x3a2e3624, 0x31233f2a,
 0x58684870, 0x5365417e, 0x4e725a6c, 0x457f5362,
@@ -308,7 +308,7 @@ AES_Td3[x] = Si[x].[09, 0d, 0b, 0e];
 AES_Td4[x] = Si[x].[01, 01, 01, 01];
 */
 
-const uint32_t AES_Te0[256] = {
+static const uint32_t AES_Te0[256] = {
 0xc66363a5U, 0xf87c7c84U, 0xee99U, 0xf67b7b8dU,
 0xfff2f20dU, 0xd66b6bbdU, 0xde6f6fb1U, 0x91c5c554U,
 0x60303050U, 0x02010103U, 0xce6767a9U, 0x562b2b7dU,
@@ -374,7 +374,8 @@ const uint32_t AES_Te0[256] = {
 0x824141c3U, 0x29b0U, 0x5a2d2d77U, 0x1e0f0f11U,
 0x7bb0b0cbU, 0xa85454fcU, 0x6dd6U, 0x2c16163aU,
 };
-const uint32_t AES_Te1[256] = {
+
+static const uint32_t AES_Te1[256] = {
 0xa5c66363U, 0x84f87c7cU, 0x99eeU, 0x8df67b7bU,
 0x0dfff2f2U, 0xbdd66b6bU, 0xb1de6f6fU, 0x5491c5c5U,
 0x50603030U, 0x03020101U, 0xa9ce6767U, 0x7d562b2bU,
@@ -440,7 +441,8 @@ const uint32_t AES_Te1[256] = {
 0xc3824141U, 0xb029U, 0x775a2d2dU, 0x111e0f0fU,
 0xcb7bb0b0U, 0xfca85454U, 0xd66dU, 0x3a2c1616U,
 };
-const uint32_t AES_Te2[256] = {
+
+static const uint32_t AES_Te2[256] = {
 0x63a5c663U, 0x7c84f87cU, 0x7799ee77U, 0x7b8df67bU,
 0xf20dfff2U, 0x6bbdd66bU, 0x6fb1de6fU, 0xc55491c5U,
 0x30506030U, 0x01030201U, 0x67a9ce67U, 0x2b7d562bU,
@@ -506,8 +508,8 @@ const uint32_t AES_Te2[256] = {
 0x41c38241U, 0x99b02999U, 0x2d775a2dU, 0x0f111e0fU,
 0xb0cb7bb0U, 0x54fca854U, 0xbbd66dbbU, 0x163a2c16U,
 };
-const uint32_t AES_Te3[256] = {
 
+static const uint32_t AES_Te3[256] = {
 0x6363a5c6U, 0x7c7c84f8U, 0x99eeU, 0x7b7b8df6U,
 0xf2f20dffU, 0x6b6bbdd6U, 0x6f6fb1deU, 0xc5c55491U,
 0x30305060U, 0x01010302U, 0x6767a9ceU, 0x2b2b7d56U,
@@ -573,7 +575,8 @@ const uint32_t AES_Te3[256] = {
 0x4141c382U, 0xb029U, 0x2d2d775aU, 0x0f0f111eU,
 0xb0b0cb7bU, 0x5454fca8U, 0xd66dU, 0x16163a2cU,
 };
-const uint32_t AES_Te4[256] = {
+
+static const uint32_t AES_Te4[256] = {
 0x63636363U, 0x7c7c7c7cU, 0xU, 0x7b7b7b7bU,
 0xf2f2f2f2U, 0x6b6b6b6bU, 0x6f6f6f6fU, 0xc5c5c5c5U,
 0x30303030U, 0x01010101U, 0x67676767U, 0x2b2b2b2bU,
@@ -639,7 +642,8 @@ const uint32_t AES_Te4[256] = {
 0x41414141U, 0xU, 0x2d2d2d2dU, 0x0f0f0f0fU,
 0xb0b0b0b0U, 0x54545454U, 0xU, 0x16161616U,
 };
-const uint32_t AES_Td0[256] = {
+
+static const uint32_t AES_Td0[256] = {
 0x51f4a750U, 0x7e416553U, 0x1a17a4c3U, 0x3a275e96U,
 0x3bab6bcbU, 0x1f9d45f1U, 0xacfa58abU, 0x4be30393U,
 0x2030fa55U, 0xad766df6U, 0x88cc7691U, 0xf5024c25U,
@@ -705,7 +709,8 @@ const uint32_t AES_Td0[256] = {
 0x39a80171U, 0x080cb3deU, 0xd8b4e49cU, 0x6456c190U,
 0x7bcb8461U, 0xd532b670U, 0x486c5c74U, 0xd0b85742U,
 };
-const uint32_t AES_Td1[256] = {
+
+static const uint32_t AES_Td1[256] = {
 0x5051f4a7U, 0x537e4165U, 0xc31a17a4U, 0x963a275eU,
 0xcb3bab6bU, 0xf11f9d45U, 0xabacfa58U, 

[PATCH 13/35] target/ppc: Use aesdec_ISB_ISR

2023-06-02 Thread Richard Henderson
This implements the VNCIPHERLAST instruction.

Signed-off-by: Richard Henderson 
---
 target/ppc/int_helper.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index b49e17685b..444beb1779 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -2979,13 +2979,13 @@ void helper_vncipher(ppc_avr_t *r, ppc_avr_t *a, 
ppc_avr_t *b)
 
 void helper_vncipherlast(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
 {
-ppc_avr_t result;
-int i;
+AESState *ad = (AESState *)r;
+AESState *st = (AESState *)a;
+AESState *rk = (AESState *)b;
+AESState t;
 
-VECTOR_FOR_INORDER_I(i, u8) {
-result.VsrB(i) = b->VsrB(i) ^ (AES_isbox[a->VsrB(AES_ishifts[i])]);
-}
-*r = result;
+aesdec_ISB_ISR(&t, st, true);
+ad->v = t.v ^ rk->v;
 }
 
 void helper_vshasigmaw(ppc_avr_t *r,  ppc_avr_t *a, uint32_t st_six)
-- 
2.34.1




[PATCH 32/35] crypto: Remove AES_shifts, AES_ishifts

2023-06-02 Thread Richard Henderson
These arrays are no longer used, replaced by AES_SH_*, AES_ISH_*.

Signed-off-by: Richard Henderson 
---
 include/crypto/aes.h |  4 
 crypto/aes.c | 14 --
 2 files changed, 18 deletions(-)

diff --git a/include/crypto/aes.h b/include/crypto/aes.h
index 24b073d569..aa8b54065d 100644
--- a/include/crypto/aes.h
+++ b/include/crypto/aes.h
@@ -30,10 +30,6 @@ void AES_decrypt(const unsigned char *in, unsigned char *out,
 extern const uint8_t AES_sbox[256];
 extern const uint8_t AES_isbox[256];
 
-/* AES ShiftRows and InvShiftRows */
-extern const uint8_t AES_shifts[16];
-extern const uint8_t AES_ishifts[16];
-
 /* AES MixColumns, for use with rot32. */
 extern const uint32_t AES_mc_rot[256];
 
diff --git a/crypto/aes.c b/crypto/aes.c
index c0e4bc5580..4438d4dcdc 100644
--- a/crypto/aes.c
+++ b/crypto/aes.c
@@ -131,13 +131,6 @@ enum {
 AES_SH_F = 0xb,
 };
 
-const uint8_t AES_shifts[16] = {
-AES_SH_0, AES_SH_1, AES_SH_2, AES_SH_3,
-AES_SH_4, AES_SH_5, AES_SH_6, AES_SH_7,
-AES_SH_8, AES_SH_9, AES_SH_A, AES_SH_B,
-AES_SH_C, AES_SH_D, AES_SH_E, AES_SH_F,
-};
-
 /* AES InvShiftRows, for complete unrolling. */
 enum {
 AES_ISH_0 = 0x0,
@@ -158,13 +151,6 @@ enum {
 AES_ISH_F = 0x3,
 };
 
-const uint8_t AES_ishifts[16] = {
-AES_ISH_0, AES_ISH_1, AES_ISH_2, AES_ISH_3,
-AES_ISH_4, AES_ISH_5, AES_ISH_6, AES_ISH_7,
-AES_ISH_8, AES_ISH_9, AES_ISH_A, AES_ISH_B,
-AES_ISH_C, AES_ISH_D, AES_ISH_E, AES_ISH_F,
-};
-
 /*
  * MixColumns lookup table, for use with rot32.
  * From Arm ARM pseudocode.
-- 
2.34.1




[PATCH 24/35] target/riscv: Use aesenc_SB_SR_MC_AK

2023-06-02 Thread Richard Henderson
This implements the AES64ESM instruction.

Signed-off-by: Richard Henderson 
---
 target/riscv/crypto_helper.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/target/riscv/crypto_helper.c b/target/riscv/crypto_helper.c
index 64004b2329..71694b787c 100644
--- a/target/riscv/crypto_helper.c
+++ b/target/riscv/crypto_helper.c
@@ -196,7 +196,16 @@ static inline target_ulong aes64_operation(target_ulong 
rs1, target_ulong rs2,
 
 target_ulong HELPER(aes64esm)(target_ulong rs1, target_ulong rs2)
 {
-return aes64_operation(rs1, rs2, true, true);
+AESState t, z = { };
+
+/*
+ * This instruction does not include a round key,
+ * so supply a zero to our primitive.
+ */
+t.d[HOST_BIG_ENDIAN] = rs1;
+t.d[!HOST_BIG_ENDIAN] = rs2;
+aesenc_SB_SR_MC_AK(&t, &t, &z, false);
+return t.d[HOST_BIG_ENDIAN];
 }
 
 target_ulong HELPER(aes64es)(target_ulong rs1, target_ulong rs2)
-- 
2.34.1




[PATCH 26/35] target/i386: Use aesdec_ISB_ISR_IMC_AK

2023-06-02 Thread Richard Henderson
This implements the AESDEC instruction.

Signed-off-by: Richard Henderson 
---
 target/i386/ops_sse.h | 14 +-
 1 file changed, 5 insertions(+), 9 deletions(-)

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index c7a2c586f4..e666bd5068 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -2162,16 +2162,12 @@ void glue(helper_pclmulqdq, SUFFIX)(CPUX86State *env, 
Reg *d, Reg *v, Reg *s,
 
 void glue(helper_aesdec, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
 {
-int i;
-Reg st = *v;
-Reg rk = *s;
+for (int i = 0; i < SHIFT; i++) {
+AESState *ad = (AESState *)&d->ZMM_X(i);
+AESState *st = (AESState *)&v->ZMM_X(i);
+AESState *rk = (AESState *)&s->ZMM_X(i);
 
-for (i = 0 ; i < 2 << SHIFT ; i++) {
-int j = i & 3;
-d->L(i) = rk.L(i) ^ bswap32(AES_Td0[st.B(AES_ishifts[4 * j + 0])] ^
-AES_Td1[st.B(AES_ishifts[4 * j + 1])] ^
-AES_Td2[st.B(AES_ishifts[4 * j + 2])] ^
-AES_Td3[st.B(AES_ishifts[4 * j + 3])]);
+aesdec_ISB_ISR_IMC_AK(ad, st, rk, false);
 }
 }
 
-- 
2.34.1




[PATCH 20/35] target/riscv: Use aesdec_IMC

2023-06-02 Thread Richard Henderson
This implements the AES64IM instruction.

Signed-off-by: Richard Henderson 
---
 target/riscv/crypto_helper.c | 15 +--
 1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/target/riscv/crypto_helper.c b/target/riscv/crypto_helper.c
index 08191b4b2a..64004b2329 100644
--- a/target/riscv/crypto_helper.c
+++ b/target/riscv/crypto_helper.c
@@ -270,17 +270,12 @@ target_ulong HELPER(aes64ks1i)(target_ulong rs1, 
target_ulong rnum)
 
 target_ulong HELPER(aes64im)(target_ulong rs1)
 {
-uint64_t RS1 = rs1;
-uint32_t col_0 = RS1 & 0xFFFFFFFF;
-uint32_t col_1 = RS1 >> 32;
-target_ulong result;
+AESState t;
 
-col_0 = AES_INVMIXCOLUMN(col_0);
-col_1 = AES_INVMIXCOLUMN(col_1);
-
-result = ((uint64_t)col_1 << 32) | col_0;
-
-return result;
+t.d[HOST_BIG_ENDIAN] = rs1;
+t.d[!HOST_BIG_ENDIAN] = 0;
+aesdec_IMC(&t, &t, false);
+return t.d[HOST_BIG_ENDIAN];
 }
 
 target_ulong HELPER(sm4ed)(target_ulong rs1, target_ulong rs2,
-- 
2.34.1




[PATCH 28/35] crypto: Add aesdec_ISB_ISR_AK_IMC

2023-06-02 Thread Richard Henderson
Add a primitive for InvSubBytes + InvShiftRows +
AddRoundKey + InvMixColumns.
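
For orientation: this differs from the aesdec_ISB_ISR_IMC_AK primitive of
patch 25 only in where AddRoundKey sits relative to InvMixColumns.  With
state s and round key k, the two compute roughly

    aesdec_ISB_ISR_AK_IMC(&r, &s, &k, be):  r = InvMixColumns(InvSubBytes(InvShiftRows(s)) ^ k)
    aesdec_ISB_ISR_IMC_AK(&r, &s, &k, be):  r = InvMixColumns(InvSubBytes(InvShiftRows(s))) ^ k

The former is the ordering PowerPC's VNCIPHER wants (patch 29); the latter
matches x86 AESDEC (patch 26), whose key schedule is pre-transformed with
AESIMC.  That difference is also why the x86 accel hook for this primitive
first runs the round key through _mm_aesimc_si128() (patch 30).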

Signed-off-by: Richard Henderson 
---
 host/include/generic/host/aes-round.h |  4 
 include/crypto/aes-round.h| 21 +
 crypto/aes.c  | 20 
 3 files changed, 45 insertions(+)

diff --git a/host/include/generic/host/aes-round.h 
b/host/include/generic/host/aes-round.h
index 848436379d..84f82e53d8 100644
--- a/host/include/generic/host/aes-round.h
+++ b/host/include/generic/host/aes-round.h
@@ -25,6 +25,10 @@ void aesdec_IMC_accel(AESState *, const AESState *, bool)
 void aesdec_ISB_ISR_accel(AESState *, const AESState *, bool)
 QEMU_ERROR("unsupported accel");
 
+void aesdec_ISB_ISR_AK_IMC_accel(AESState *, const AESState *,
+ const AESState *, bool)
+QEMU_ERROR("unsupported accel");
+
 void aesdec_ISB_ISR_IMC_AK_accel(AESState *, const AESState *,
  const AESState *, bool)
 QEMU_ERROR("unsupported accel");
diff --git a/include/crypto/aes-round.h b/include/crypto/aes-round.h
index 352687ce11..b48b87671c 100644
--- a/include/crypto/aes-round.h
+++ b/include/crypto/aes-round.h
@@ -113,6 +113,27 @@ static inline void aesdec_IMC(AESState *r, const AESState 
*st, bool be)
 }
 }
 
+/*
+ * Perform InvSubBytes + InvShiftRows + AddRoundKey + InvMixColumns.
+ */
+
+void aesdec_ISB_ISR_AK_IMC_gen(AESState *ret, const AESState *st,
+   const AESState *rk);
+void aesdec_ISB_ISR_AK_IMC_genrev(AESState *ret, const AESState *st,
+  const AESState *rk);
+
+static inline void aesdec_ISB_ISR_AK_IMC(AESState *r, const AESState *st,
+ const AESState *rk, bool be)
+{
+if (HAVE_AES_ACCEL) {
+aesdec_ISB_ISR_AK_IMC_accel(r, st, rk, be);
+} else if (HOST_BIG_ENDIAN == be) {
+aesdec_ISB_ISR_AK_IMC_gen(r, st, rk);
+} else {
+aesdec_ISB_ISR_AK_IMC_genrev(r, st, rk);
+}
+}
+
 /*
  * Perform InvSubBytes + InvShiftRows + InvMixColumns + AddRoundKey.
  */
diff --git a/crypto/aes.c b/crypto/aes.c
index 1696086868..c0e4bc5580 100644
--- a/crypto/aes.c
+++ b/crypto/aes.c
@@ -1571,6 +1571,26 @@ void aesdec_ISB_ISR_IMC_AK_genrev(AESState *r, const 
AESState *st,
 aesdec_ISB_ISR_IMC_AK_swap(r, st, rk, true);
 }
 
+void aesdec_ISB_ISR_AK_IMC_gen(AESState *r, const AESState *st,
+   const AESState *rk)
+{
+AESState t;
+
+aesdec_ISB_ISR_gen(&t, st);
+t.v ^= rk->v;
+aesdec_IMC_gen(r, &t);
+}
+
+void aesdec_ISB_ISR_AK_IMC_genrev(AESState *r, const AESState *st,
+  const AESState *rk)
+{
+AESState t;
+
+aesdec_ISB_ISR_genrev(&t, st);
+t.v ^= rk->v;
+aesdec_IMC_genrev(r, &t);
+}
+
 /**
  * Expand the cipher key into the encryption key schedule.
  */
-- 
2.34.1




[PATCH 14/35] target/riscv: Use aesdec_ISB_ISR

2023-06-02 Thread Richard Henderson
This implements the AES64DS instruction.

Signed-off-by: Richard Henderson 
---
 target/riscv/crypto_helper.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/target/riscv/crypto_helper.c b/target/riscv/crypto_helper.c
index 82d7f3a060..08191b4b2a 100644
--- a/target/riscv/crypto_helper.c
+++ b/target/riscv/crypto_helper.c
@@ -211,7 +211,12 @@ target_ulong HELPER(aes64es)(target_ulong rs1, 
target_ulong rs2)
 
 target_ulong HELPER(aes64ds)(target_ulong rs1, target_ulong rs2)
 {
-return aes64_operation(rs1, rs2, false, false);
+AESState t;
+
+t.d[HOST_BIG_ENDIAN] = rs1;
+t.d[!HOST_BIG_ENDIAN] = rs2;
+aesdec_ISB_ISR(&t, &t, false);
+return t.d[HOST_BIG_ENDIAN];
 }
 
 target_ulong HELPER(aes64dsm)(target_ulong rs1, target_ulong rs2)
-- 
2.34.1




[PATCH 25/35] crypto: Add aesdec_ISB_ISR_IMC_AK

2023-06-02 Thread Richard Henderson
Add a primitive for InvSubBytes + InvShiftRows +
InvMixColumns + AddRoundKey.

Signed-off-by: Richard Henderson 
---
 host/include/generic/host/aes-round.h |  4 ++
 include/crypto/aes-round.h| 21 ++
 crypto/aes.c  | 56 +++
 3 files changed, 81 insertions(+)

diff --git a/host/include/generic/host/aes-round.h 
b/host/include/generic/host/aes-round.h
index dc2c751ac3..848436379d 100644
--- a/host/include/generic/host/aes-round.h
+++ b/host/include/generic/host/aes-round.h
@@ -25,4 +25,8 @@ void aesdec_IMC_accel(AESState *, const AESState *, bool)
 void aesdec_ISB_ISR_accel(AESState *, const AESState *, bool)
 QEMU_ERROR("unsupported accel");
 
+void aesdec_ISB_ISR_IMC_AK_accel(AESState *, const AESState *,
+ const AESState *, bool)
+QEMU_ERROR("unsupported accel");
+
 #endif
diff --git a/include/crypto/aes-round.h b/include/crypto/aes-round.h
index aefa17fcc3..352687ce11 100644
--- a/include/crypto/aes-round.h
+++ b/include/crypto/aes-round.h
@@ -113,4 +113,25 @@ static inline void aesdec_IMC(AESState *r, const AESState 
*st, bool be)
 }
 }
 
+/*
+ * Perform InvSubBytes + InvShiftRows + InvMixColumns + AddRoundKey.
+ */
+
+void aesdec_ISB_ISR_IMC_AK_gen(AESState *ret, const AESState *st,
+   const AESState *rk);
+void aesdec_ISB_ISR_IMC_AK_genrev(AESState *ret, const AESState *st,
+  const AESState *rk);
+
+static inline void aesdec_ISB_ISR_IMC_AK(AESState *r, const AESState *st,
+ const AESState *rk, bool be)
+{
+if (HAVE_AES_ACCEL) {
+aesdec_ISB_ISR_IMC_AK_accel(r, st, rk, be);
+} else if (HOST_BIG_ENDIAN == be) {
+aesdec_ISB_ISR_IMC_AK_gen(r, st, rk);
+} else {
+aesdec_ISB_ISR_IMC_AK_genrev(r, st, rk);
+}
+}
+
 #endif /* CRYPTO_AES_ROUND_H */
diff --git a/crypto/aes.c b/crypto/aes.c
index 6172495b46..1696086868 100644
--- a/crypto/aes.c
+++ b/crypto/aes.c
@@ -1515,6 +1515,62 @@ void aesdec_IMC_genrev(AESState *r, const AESState *st)
 aesdec_IMC_swap(r, st, true);
 }
 
+/* Perform InvSubBytes + InvShiftRows + InvMixColumns + AddRoundKey. */
+static inline void
+aesdec_ISB_ISR_IMC_AK_swap(AESState *r, const AESState *st,
+   const AESState *rk, bool swap)
+{
+int swap_b = swap * 0xf;
+int swap_w = swap * 0x3;
+bool be = HOST_BIG_ENDIAN ^ swap;
+uint32_t w0, w1, w2, w3;
+
+w0 = (AES_Td0[st->b[swap_b ^ AES_ISH_0]] ^
+  AES_Td1[st->b[swap_b ^ AES_ISH_1]] ^
+  AES_Td2[st->b[swap_b ^ AES_ISH_2]] ^
+  AES_Td3[st->b[swap_b ^ AES_ISH_3]]);
+
+w1 = (AES_Td0[st->b[swap_b ^ AES_ISH_4]] ^
+  AES_Td1[st->b[swap_b ^ AES_ISH_5]] ^
+  AES_Td2[st->b[swap_b ^ AES_ISH_6]] ^
+  AES_Td3[st->b[swap_b ^ AES_ISH_7]]);
+
+w2 = (AES_Td0[st->b[swap_b ^ AES_ISH_8]] ^
+  AES_Td1[st->b[swap_b ^ AES_ISH_9]] ^
+  AES_Td2[st->b[swap_b ^ AES_ISH_A]] ^
+  AES_Td3[st->b[swap_b ^ AES_ISH_B]]);
+
+w3 = (AES_Td0[st->b[swap_b ^ AES_ISH_C]] ^
+  AES_Td1[st->b[swap_b ^ AES_ISH_D]] ^
+  AES_Td2[st->b[swap_b ^ AES_ISH_E]] ^
+  AES_Td3[st->b[swap_b ^ AES_ISH_F]]);
+
+/* Note that AES_TdX is encoded for big-endian. */
+if (!be) {
+w0 = bswap32(w0);
+w1 = bswap32(w1);
+w2 = bswap32(w2);
+w3 = bswap32(w3);
+}
+
+r->w[swap_w ^ 0] = rk->w[swap_w ^ 0] ^ w0;
+r->w[swap_w ^ 1] = rk->w[swap_w ^ 1] ^ w1;
+r->w[swap_w ^ 2] = rk->w[swap_w ^ 2] ^ w2;
+r->w[swap_w ^ 3] = rk->w[swap_w ^ 3] ^ w3;
+}
+
+void aesdec_ISB_ISR_IMC_AK_gen(AESState *r, const AESState *st,
+   const AESState *rk)
+{
+aesdec_ISB_ISR_IMC_AK_swap(r, st, rk, false);
+}
+
+void aesdec_ISB_ISR_IMC_AK_genrev(AESState *r, const AESState *st,
+  const AESState *rk)
+{
+aesdec_ISB_ISR_IMC_AK_swap(r, st, rk, true);
+}
+
 /**
  * Expand the cipher key into the encryption key schedule.
  */
-- 
2.34.1




[PATCH 21/35] crypto: Add aesenc_SB_SR_MC_AK

2023-06-02 Thread Richard Henderson
Add a primitive for SubBytes + ShiftRows + MixColumns + AddRoundKey.
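
Taken together this is one full AES encryption round; for state s and round
key k the primitive computes, schematically,

    r = MixColumns(ShiftRows(SubBytes(s))) ^ k

which is exactly the operation x86 AESENC performs (patch 22), and the one
RISC-V AES64ESM performs once a zero round key is supplied (patch 24).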

Signed-off-by: Richard Henderson 
---
 host/include/generic/host/aes-round.h |  4 ++
 include/crypto/aes-round.h| 21 ++
 crypto/aes.c  | 56 +++
 3 files changed, 81 insertions(+)

diff --git a/host/include/generic/host/aes-round.h 
b/host/include/generic/host/aes-round.h
index 1e9b97d274..dc2c751ac3 100644
--- a/host/include/generic/host/aes-round.h
+++ b/host/include/generic/host/aes-round.h
@@ -15,6 +15,10 @@ void aesenc_MC_accel(AESState *, const AESState *, bool)
 void aesenc_SB_SR_accel(AESState *, const AESState *, bool)
 QEMU_ERROR("unsupported accel");
 
+void aesenc_SB_SR_MC_AK_accel(AESState *, const AESState *,
+  const AESState *, bool)
+QEMU_ERROR("unsupported accel");
+
 void aesdec_IMC_accel(AESState *, const AESState *, bool)
 QEMU_ERROR("unsupported accel");
 
diff --git a/include/crypto/aes-round.h b/include/crypto/aes-round.h
index 2d962ede0b..aefa17fcc3 100644
--- a/include/crypto/aes-round.h
+++ b/include/crypto/aes-round.h
@@ -56,6 +56,27 @@ static inline void aesenc_MC(AESState *r, const AESState 
*st, bool be)
 }
 }
 
+/*
+ * Perform SubBytes + ShiftRows + MixColumns + AddRoundKey.
+ */
+
+void aesenc_SB_SR_MC_AK_gen(AESState *ret, const AESState *st,
+const AESState *rk);
+void aesenc_SB_SR_MC_AK_genrev(AESState *ret, const AESState *st,
+   const AESState *rk);
+
+static inline void aesenc_SB_SR_MC_AK(AESState *r, const AESState *st,
+  const AESState *rk, bool be)
+{
+if (HAVE_AES_ACCEL) {
+aesenc_SB_SR_MC_AK_accel(r, st, rk, be);
+} else if (HOST_BIG_ENDIAN == be) {
+aesenc_SB_SR_MC_AK_gen(r, st, rk);
+} else {
+aesenc_SB_SR_MC_AK_genrev(r, st, rk);
+}
+}
+
 /*
  * Perform InvSubBytes + InvShiftRows.
  */
diff --git a/crypto/aes.c b/crypto/aes.c
index 4e654e5404..6172495b46 100644
--- a/crypto/aes.c
+++ b/crypto/aes.c
@@ -1356,6 +1356,62 @@ void aesenc_MC_genrev(AESState *r, const AESState *st)
 aesenc_MC_swap(r, st, true);
 }
 
+/* Perform SubBytes + ShiftRows + MixColumns + AddRoundKey. */
+static inline void
+aesenc_SB_SR_MC_AK_swap(AESState *r, const AESState *st,
+const AESState *rk, bool swap)
+{
+int swap_b = swap * 0xf;
+int swap_w = swap * 0x3;
+bool be = HOST_BIG_ENDIAN ^ swap;
+uint32_t w0, w1, w2, w3;
+
+w0 = (AES_Te0[st->b[swap_b ^ AES_SH_0]] ^
+  AES_Te1[st->b[swap_b ^ AES_SH_1]] ^
+  AES_Te2[st->b[swap_b ^ AES_SH_2]] ^
+  AES_Te3[st->b[swap_b ^ AES_SH_3]]);
+
+w1 = (AES_Te0[st->b[swap_b ^ AES_SH_4]] ^
+  AES_Te1[st->b[swap_b ^ AES_SH_5]] ^
+  AES_Te2[st->b[swap_b ^ AES_SH_6]] ^
+  AES_Te3[st->b[swap_b ^ AES_SH_7]]);
+
+w2 = (AES_Te0[st->b[swap_b ^ AES_SH_8]] ^
+  AES_Te1[st->b[swap_b ^ AES_SH_9]] ^
+  AES_Te2[st->b[swap_b ^ AES_SH_A]] ^
+  AES_Te3[st->b[swap_b ^ AES_SH_B]]);
+
+w3 = (AES_Te0[st->b[swap_b ^ AES_SH_C]] ^
+  AES_Te1[st->b[swap_b ^ AES_SH_D]] ^
+  AES_Te2[st->b[swap_b ^ AES_SH_E]] ^
+  AES_Te3[st->b[swap_b ^ AES_SH_F]]);
+
+/* Note that AES_TeX is encoded for big-endian. */
+if (!be) {
+w0 = bswap32(w0);
+w1 = bswap32(w1);
+w2 = bswap32(w2);
+w3 = bswap32(w3);
+}
+
+r->w[swap_w ^ 0] = rk->w[swap_w ^ 0] ^ w0;
+r->w[swap_w ^ 1] = rk->w[swap_w ^ 1] ^ w1;
+r->w[swap_w ^ 2] = rk->w[swap_w ^ 2] ^ w2;
+r->w[swap_w ^ 3] = rk->w[swap_w ^ 3] ^ w3;
+}
+
+void aesenc_SB_SR_MC_AK_gen(AESState *r, const AESState *st,
+const AESState *rk)
+{
+aesenc_SB_SR_MC_AK_swap(r, st, rk, false);
+}
+
+void aesenc_SB_SR_MC_AK_genrev(AESState *r, const AESState *st,
+   const AESState *rk)
+{
+aesenc_SB_SR_MC_AK_swap(r, st, rk, true);
+}
+
 /* Perform InvSubBytes + InvShiftRows. */
 static inline void
 aesdec_ISB_ISR_swap(AESState *r, const AESState *st, bool swap)
-- 
2.34.1




[PATCH 12/35] target/arm: Use aesdec_ISB_ISR

2023-06-02 Thread Richard Henderson
This implements the AESD instruction.

Signed-off-by: Richard Henderson 
---
 target/arm/tcg/crypto_helper.c | 37 +++---
 1 file changed, 16 insertions(+), 21 deletions(-)

diff --git a/target/arm/tcg/crypto_helper.c b/target/arm/tcg/crypto_helper.c
index 5cebc88f5f..d7b644851f 100644
--- a/target/arm/tcg/crypto_helper.c
+++ b/target/arm/tcg/crypto_helper.c
@@ -46,26 +46,6 @@ static void clear_tail_16(void *vd, uint32_t desc)
 clear_tail(vd, opr_sz, max_sz);
 }
 
-static void do_crypto_aese(uint64_t *rd, uint64_t *rn, uint64_t *rm,
-   const uint8_t *sbox, const uint8_t *shift)
-{
-union CRYPTO_STATE rk = { .l = { rm[0], rm[1] } };
-union CRYPTO_STATE st = { .l = { rn[0], rn[1] } };
-int i;
-
-/* xor state vector with round key */
-rk.l[0] ^= st.l[0];
-rk.l[1] ^= st.l[1];
-
-/* combine ShiftRows operation and sbox substitution */
-for (i = 0; i < 16; i++) {
-CR_ST_BYTE(st, i) = sbox[CR_ST_BYTE(rk, shift[i])];
-}
-
-rd[0] = st.l[0];
-rd[1] = st.l[1];
-}
-
 void HELPER(crypto_aese)(void *vd, void *vn, void *vm, uint32_t desc)
 {
 intptr_t i, opr_sz = simd_oprsz(desc);
@@ -96,7 +76,22 @@ void HELPER(crypto_aesd)(void *vd, void *vn, void *vm, 
uint32_t desc)
 intptr_t i, opr_sz = simd_oprsz(desc);
 
 for (i = 0; i < opr_sz; i += 16) {
-do_crypto_aese(vd + i, vn + i, vm + i, AES_isbox, AES_ishifts);
+AESState *ad = (AESState *)(vd + i);
+AESState *st = (AESState *)(vn + i);
+AESState *rk = (AESState *)(vm + i);
+AESState t;
+
+/* Our uint64_t are in the wrong order for big-endian. */
+if (HOST_BIG_ENDIAN) {
+t.d[0] = st->d[1] ^ rk->d[1];
+t.d[1] = st->d[0] ^ rk->d[0];
+aesdec_ISB_ISR(&t, &t, false);
+ad->d[0] = t.d[1];
+ad->d[1] = t.d[0];
+} else {
+t.v = st->v ^ rk->v;
+aesdec_ISB_ISR(ad, &t, false);
+}
 }
 clear_tail(vd, opr_sz, simd_maxsz(desc));
 }
-- 
2.34.1




[PATCH 29/35] target/ppc: Use aesdec_ISB_ISR_AK_IMC

2023-06-02 Thread Richard Henderson
This implements the VNCIPHER instruction.

Signed-off-by: Richard Henderson 
---
 target/ppc/int_helper.c | 19 ---
 1 file changed, 4 insertions(+), 15 deletions(-)

diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index c7f8b39e9a..8ae10ad748 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -2953,22 +2953,11 @@ void helper_vcipherlast(ppc_avr_t *r, ppc_avr_t *a, 
ppc_avr_t *b)
 
 void helper_vncipher(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
 {
-/* This differs from what is written in ISA V2.07.  The RTL is */
-/* incorrect and will be fixed in V2.07B.  */
-int i;
-ppc_avr_t tmp;
+AESState *ad = (AESState *)r;
+AESState *st = (AESState *)a;
+AESState *rk = (AESState *)b;
 
-VECTOR_FOR_INORDER_I(i, u8) {
-tmp.VsrB(i) = b->VsrB(i) ^ AES_isbox[a->VsrB(AES_ishifts[i])];
-}
-
-VECTOR_FOR_INORDER_I(i, u32) {
-r->VsrW(i) =
-AES_imc[tmp.VsrB(4 * i + 0)][0] ^
-AES_imc[tmp.VsrB(4 * i + 1)][1] ^
-AES_imc[tmp.VsrB(4 * i + 2)][2] ^
-AES_imc[tmp.VsrB(4 * i + 3)][3];
-}
+aesdec_ISB_ISR_AK_IMC(ad, st, rk, true);
 }
 
 void helper_vncipherlast(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
-- 
2.34.1




[PATCH 22/35] target/i386: Use aesenc_SB_SR_MC_AK

2023-06-02 Thread Richard Henderson
This implements the AESENC instruction.

Signed-off-by: Richard Henderson 
---
 target/i386/ops_sse.h | 14 +-
 1 file changed, 5 insertions(+), 9 deletions(-)

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index 0187651140..c7a2c586f4 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -2190,16 +2190,12 @@ void glue(helper_aesdeclast, SUFFIX)(CPUX86State *env, 
Reg *d, Reg *v, Reg *s)
 
 void glue(helper_aesenc, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
 {
-int i;
-Reg st = *v;
-Reg rk = *s;
+for (int i = 0; i < SHIFT; i++) {
+AESState *ad = (AESState *)&d->ZMM_X(i);
+AESState *st = (AESState *)&v->ZMM_X(i);
+AESState *rk = (AESState *)&s->ZMM_X(i);
 
-for (i = 0 ; i < 2 << SHIFT ; i++) {
-int j = i & 3;
-d->L(i) = rk.L(i) ^ bswap32(AES_Te0[st.B(AES_shifts[4 * j + 0])] ^
-AES_Te1[st.B(AES_shifts[4 * j + 1])] ^
-AES_Te2[st.B(AES_shifts[4 * j + 2])] ^
-AES_Te3[st.B(AES_shifts[4 * j + 3])]);
+aesenc_SB_SR_MC_AK(ad, st, rk, false);
 }
 }
 
-- 
2.34.1




[PATCH 33/35] crypto: Implement aesdec_IMC with AES_imc_rot

2023-06-02 Thread Richard Henderson
This method uses one uint32_t * 256 table instead of 4,
which means its data cache overhead is less.
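
Put in numbers: the four-table form indexes 4 KiB of lookup data (four
uint32_t[256] tables), while the rotated form touches a single 1 KiB table
and rebuilds the other byte positions with rotations, e.g. for one output
column (b0..b3 are illustrative byte names):

    t =       AES_imc_rot[b0]      ^
        rol32(AES_imc_rot[b1],  8) ^
        rol32(AES_imc_rot[b2], 16) ^
        rol32(AES_imc_rot[b3], 24);

i.e. three extra rol32 per column in exchange for a quarter of the table
footprint.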

Signed-off-by: Richard Henderson 
---
 crypto/aes.c | 41 -
 1 file changed, 20 insertions(+), 21 deletions(-)

diff --git a/crypto/aes.c b/crypto/aes.c
index 4438d4dcdc..914ccf38ef 100644
--- a/crypto/aes.c
+++ b/crypto/aes.c
@@ -1453,39 +1453,38 @@ aesdec_IMC_swap(AESState *r, const AESState *st, bool 
swap)
 bool be = HOST_BIG_ENDIAN ^ swap;
 uint32_t t;
 
-/* Note that AES_imc is encoded for big-endian. */
-t = (AES_imc[st->b[swap_b ^ 0x0]][0] ^
- AES_imc[st->b[swap_b ^ 0x1]][1] ^
- AES_imc[st->b[swap_b ^ 0x2]][2] ^
- AES_imc[st->b[swap_b ^ 0x3]][3]);
-if (!be) {
+t = (  AES_imc_rot[st->b[swap_b ^ 0x0]] ^
+ rol32(AES_imc_rot[st->b[swap_b ^ 0x1]], 8) ^
+ rol32(AES_imc_rot[st->b[swap_b ^ 0x2]], 16) ^
+ rol32(AES_imc_rot[st->b[swap_b ^ 0x3]], 24));
+if (be) {
 t = bswap32(t);
 }
 r->w[swap_w ^ 0] = t;
 
-t = (AES_imc[st->b[swap_b ^ 0x4]][0] ^
- AES_imc[st->b[swap_b ^ 0x5]][1] ^
- AES_imc[st->b[swap_b ^ 0x6]][2] ^
- AES_imc[st->b[swap_b ^ 0x7]][3]);
-if (!be) {
+t = (  AES_imc_rot[st->b[swap_b ^ 0x4]] ^
+ rol32(AES_imc_rot[st->b[swap_b ^ 0x5]], 8) ^
+ rol32(AES_imc_rot[st->b[swap_b ^ 0x6]], 16) ^
+ rol32(AES_imc_rot[st->b[swap_b ^ 0x7]], 24));
+if (be) {
 t = bswap32(t);
 }
 r->w[swap_w ^ 1] = t;
 
-t = (AES_imc[st->b[swap_b ^ 0x8]][0] ^
- AES_imc[st->b[swap_b ^ 0x9]][1] ^
- AES_imc[st->b[swap_b ^ 0xA]][2] ^
- AES_imc[st->b[swap_b ^ 0xB]][3]);
-if (!be) {
+t = (  AES_imc_rot[st->b[swap_b ^ 0x8]] ^
+ rol32(AES_imc_rot[st->b[swap_b ^ 0x9]], 8) ^
+ rol32(AES_imc_rot[st->b[swap_b ^ 0xA]], 16) ^
+ rol32(AES_imc_rot[st->b[swap_b ^ 0xB]], 24));
+if (be) {
 t = bswap32(t);
 }
 r->w[swap_w ^ 2] = t;
 
-t = (AES_imc[st->b[swap_b ^ 0xC]][0] ^
- AES_imc[st->b[swap_b ^ 0xD]][1] ^
- AES_imc[st->b[swap_b ^ 0xE]][2] ^
- AES_imc[st->b[swap_b ^ 0xF]][3]);
-if (!be) {
+t = (  AES_imc_rot[st->b[swap_b ^ 0xC]] ^
+ rol32(AES_imc_rot[st->b[swap_b ^ 0xD]], 8) ^
+ rol32(AES_imc_rot[st->b[swap_b ^ 0xE]], 16) ^
+ rol32(AES_imc_rot[st->b[swap_b ^ 0xF]], 24));
+if (be) {
 t = bswap32(t);
 }
 r->w[swap_w ^ 3] = t;
-- 
2.34.1




[PATCH 15/35] crypto: Add aesenc_MC

2023-06-02 Thread Richard Henderson
Add a primitive for MixColumns.

Signed-off-by: Richard Henderson 
---
 host/include/generic/host/aes-round.h |  3 ++
 include/crypto/aes-round.h| 18 +
 crypto/aes.c  | 58 +++
 3 files changed, 79 insertions(+)

diff --git a/host/include/generic/host/aes-round.h 
b/host/include/generic/host/aes-round.h
index cb4fed61fe..7c48db24b6 100644
--- a/host/include/generic/host/aes-round.h
+++ b/host/include/generic/host/aes-round.h
@@ -9,6 +9,9 @@
 #define HAVE_AES_ACCEL  false
 #define ATTR_AES_ACCEL
 
+void aesenc_MC_accel(AESState *, const AESState *, bool)
+QEMU_ERROR("unsupported accel");
+
 void aesenc_SB_SR_accel(AESState *, const AESState *, bool)
 QEMU_ERROR("unsupported accel");
 
diff --git a/include/crypto/aes-round.h b/include/crypto/aes-round.h
index ff1914bd63..f25e9572a3 100644
--- a/include/crypto/aes-round.h
+++ b/include/crypto/aes-round.h
@@ -38,6 +38,24 @@ static inline void aesenc_SB_SR(AESState *r, const AESState 
*st, bool be)
 }
 }
 
+/*
+ * Perform MixColumns.
+ */
+
+void aesenc_MC_gen(AESState *ret, const AESState *st);
+void aesenc_MC_genrev(AESState *ret, const AESState *st);
+
+static inline void aesenc_MC(AESState *r, const AESState *st, bool be)
+{
+if (HAVE_AES_ACCEL) {
+aesenc_MC_accel(r, st, be);
+} else if (HOST_BIG_ENDIAN == be) {
+aesenc_MC_gen(r, st);
+} else {
+aesenc_MC_genrev(r, st);
+}
+}
+
 /*
  * Perform InvSubBytes + InvShiftRows.
  */
diff --git a/crypto/aes.c b/crypto/aes.c
index 937377647f..c7123eddd5 100644
--- a/crypto/aes.c
+++ b/crypto/aes.c
@@ -28,6 +28,8 @@
  * EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  */
 #include "qemu/osdep.h"
+#include "qemu/bswap.h"
+#include "qemu/bitops.h"
 #include "crypto/aes.h"
 #include "crypto/aes-round.h"
 
@@ -1298,6 +1300,62 @@ void aesenc_SB_SR_genrev(AESState *r, const AESState *st)
 aesenc_SB_SR_swap(r, st, true);
 }
 
+/* Perform MixColumns. */
+static inline void
+aesenc_MC_swap(AESState *r, const AESState *st, bool swap)
+{
+int swap_b = swap * 0xf;
+int swap_w = swap * 0x3;
+bool be = HOST_BIG_ENDIAN ^ swap;
+uint32_t t;
+
+t = (  AES_mc_rot[st->b[swap_b ^ 0x0]] ^
+ rol32(AES_mc_rot[st->b[swap_b ^ 0x1]], 8) ^
+ rol32(AES_mc_rot[st->b[swap_b ^ 0x2]], 16) ^
+ rol32(AES_mc_rot[st->b[swap_b ^ 0x3]], 24));
+if (be) {
+t = bswap32(t);
+}
+r->w[swap_w ^ 0] = t;
+
+t = (  AES_mc_rot[st->b[swap_b ^ 0x4]] ^
+ rol32(AES_mc_rot[st->b[swap_b ^ 0x5]], 8) ^
+ rol32(AES_mc_rot[st->b[swap_b ^ 0x6]], 16) ^
+ rol32(AES_mc_rot[st->b[swap_b ^ 0x7]], 24));
+if (be) {
+t = bswap32(t);
+}
+r->w[swap_w ^ 1] = t;
+
+t = (  AES_mc_rot[st->b[swap_b ^ 0x8]] ^
+ rol32(AES_mc_rot[st->b[swap_b ^ 0x9]], 8) ^
+ rol32(AES_mc_rot[st->b[swap_b ^ 0xA]], 16) ^
+ rol32(AES_mc_rot[st->b[swap_b ^ 0xB]], 24));
+if (be) {
+t = bswap32(t);
+}
+r->w[swap_w ^ 2] = t;
+
+t = (  AES_mc_rot[st->b[swap_b ^ 0xC]] ^
+ rol32(AES_mc_rot[st->b[swap_b ^ 0xD]], 8) ^
+ rol32(AES_mc_rot[st->b[swap_b ^ 0xE]], 16) ^
+ rol32(AES_mc_rot[st->b[swap_b ^ 0xF]], 24));
+if (be) {
+t = bswap32(t);
+}
+r->w[swap_w ^ 3] = t;
+}
+
+void aesenc_MC_gen(AESState *r, const AESState *st)
+{
+aesenc_MC_swap(r, st, false);
+}
+
+void aesenc_MC_genrev(AESState *r, const AESState *st)
+{
+aesenc_MC_swap(r, st, true);
+}
+
 /* Perform InvSubBytes + InvShiftRows. */
 static inline void
 aesdec_ISB_ISR_swap(AESState *r, const AESState *st, bool swap)
-- 
2.34.1




[PATCH 05/35] target/i386: Use aesenc_SB_SR

2023-06-02 Thread Richard Henderson
This implements the AESENCLAST instruction.

Signed-off-by: Richard Henderson 
---
 target/i386/ops_sse.h | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index fb63af7afa..31e1f6edc7 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -19,6 +19,7 @@
  */
 
 #include "crypto/aes.h"
+#include "crypto/aes-round.h"
 
 #if SHIFT == 0
 #define Reg MMXReg
@@ -2202,12 +2203,14 @@ void glue(helper_aesenc, SUFFIX)(CPUX86State *env, Reg 
*d, Reg *v, Reg *s)
 
 void glue(helper_aesenclast, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
 {
-int i;
-Reg st = *v;
-Reg rk = *s;
+for (int i = 0; i < SHIFT; i++) {
+AESState *ad = (AESState *)&d->ZMM_X(i);
+AESState *st = (AESState *)&v->ZMM_X(i);
+AESState *rk = (AESState *)&s->ZMM_X(i);
+AESState t;
 
-for (i = 0; i < 8 << SHIFT; i++) {
-d->B(i) = rk.B(i) ^ (AES_sbox[st.B(AES_shifts[i & 15] + (i & ~15))]);
+aesenc_SB_SR(&t, st, false);
+ad->v = t.v ^ rk->v;
 }
 }
 
-- 
2.34.1




[PATCH 01/35] tests/multiarch: Add test-aes

2023-06-02 Thread Richard Henderson
Use a shared driver and backends for i386, aarch64, ppc64, riscv64.
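
The split is that the shared test-aes-main.c.inc presumably carries the
reference data and driver loop, while each architecture supplies small
test_* hooks built on its native AES instructions; a hook returns false
when the operation cannot be expressed on that architecture, so the driver
can skip that sub-test.  A rough sketch of the driver side (verify() is a
made-up name for whatever comparison the driver really performs):

    uint8_t out[16];

    if (test_SB_SR(out, state_in)) {
        verify(out, expected_SB_SR);
    }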

Signed-off-by: Richard Henderson 
---
 tests/tcg/aarch64/test-aes.c|  58 
 tests/tcg/i386/test-aes.c   |  68 +
 tests/tcg/ppc64/test-aes.c  | 116 +++
 tests/tcg/riscv64/test-aes.c|  76 ++
 tests/tcg/multiarch/test-aes-main.c.inc | 183 
 tests/tcg/aarch64/Makefile.target   |   4 +
 tests/tcg/i386/Makefile.target  |   4 +
 tests/tcg/ppc64/Makefile.target |   1 +
 tests/tcg/riscv64/Makefile.target   |   4 +
 9 files changed, 514 insertions(+)
 create mode 100644 tests/tcg/aarch64/test-aes.c
 create mode 100644 tests/tcg/i386/test-aes.c
 create mode 100644 tests/tcg/ppc64/test-aes.c
 create mode 100644 tests/tcg/riscv64/test-aes.c
 create mode 100644 tests/tcg/multiarch/test-aes-main.c.inc

diff --git a/tests/tcg/aarch64/test-aes.c b/tests/tcg/aarch64/test-aes.c
new file mode 100644
index 00..2cd324f09b
--- /dev/null
+++ b/tests/tcg/aarch64/test-aes.c
@@ -0,0 +1,58 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+
+#include "../multiarch/test-aes-main.c.inc"
+
+bool test_SB_SR(uint8_t *o, const uint8_t *i)
+{
+/* aese also adds round key, so supply zero. */
+asm("ld1 { v0.16b }, [%1]\n\t"
+"movi v1.16b, #0\n\t"
+"aese v0.16b, v1.16b\n\t"
+"st1 { v0.16b }, [%0]"
+: : "r"(o), "r"(i) : "v0", "v1", "memory");
+return true;
+}
+
+bool test_MC(uint8_t *o, const uint8_t *i)
+{
+asm("ld1 { v0.16b }, [%1]\n\t"
+"aesmc v0.16b, v0.16b\n\t"
+"st1 { v0.16b }, [%0]"
+: : "r"(o), "r"(i) : "v0", "memory");
+return true;
+}
+
+bool test_SB_SR_MC_AK(uint8_t *o, const uint8_t *i, const uint8_t *k)
+{
+return false;
+}
+
+bool test_ISB_ISR(uint8_t *o, const uint8_t *i)
+{
+/* aesd also adds round key, so supply zero. */
+asm("ld1 { v0.16b }, [%1]\n\t"
+"movi v1.16b, #0\n\t"
+"aesd v0.16b, v1.16b\n\t"
+"st1 { v0.16b }, [%0]"
+: : "r"(o), "r"(i) : "v0", "v1", "memory");
+return true;
+}
+
+bool test_IMC(uint8_t *o, const uint8_t *i)
+{
+asm("ld1 { v0.16b }, [%1]\n\t"
+"aesimc v0.16b, v0.16b\n\t"
+"st1 { v0.16b }, [%0]"
+: : "r"(o), "r"(i) : "v0", "memory");
+return true;
+}
+
+bool test_ISB_ISR_AK_IMC(uint8_t *o, const uint8_t *i, const uint8_t *k)
+{
+return false;
+}
+
+bool test_ISB_ISR_IMC_AK(uint8_t *o, const uint8_t *i, const uint8_t *k)
+{
+return false;
+}
diff --git a/tests/tcg/i386/test-aes.c b/tests/tcg/i386/test-aes.c
new file mode 100644
index 00..199395e6cc
--- /dev/null
+++ b/tests/tcg/i386/test-aes.c
@@ -0,0 +1,68 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+
+#include "../multiarch/test-aes-main.c.inc"
+#include 
+
+static bool test_SB_SR(uint8_t *o, const uint8_t *i)
+{
+__m128i vi = _mm_loadu_si128((const __m128i_u *)i);
+
+/* aesenclast also adds round key, so supply zero. */
+vi = _mm_aesenclast_si128(vi, _mm_setzero_si128());
+
+_mm_storeu_si128((__m128i_u *)o, vi);
+return true;
+}
+
+static bool test_MC(uint8_t *o, const uint8_t *i)
+{
+return false;
+}
+
+static bool test_SB_SR_MC_AK(uint8_t *o, const uint8_t *i, const uint8_t *k)
+{
+__m128i vi = _mm_loadu_si128((const __m128i_u *)i);
+__m128i vk = _mm_loadu_si128((const __m128i_u *)k);
+
+vi = _mm_aesenc_si128(vi, vk);
+
+_mm_storeu_si128((__m128i_u *)o, vi);
+return true;
+}
+
+static bool test_ISB_ISR(uint8_t *o, const uint8_t *i)
+{
+__m128i vi = _mm_loadu_si128((const __m128i_u *)i);
+
+/* aesdeclast also adds round key, so supply zero. */
+vi = _mm_aesdeclast_si128(vi, _mm_setzero_si128());
+
+_mm_storeu_si128((__m128i_u *)o, vi);
+return true;
+}
+
+static bool test_IMC(uint8_t *o, const uint8_t *i)
+{
+__m128i vi = _mm_loadu_si128((const __m128i_u *)i);
+
+vi = _mm_aesimc_si128(vi);
+
+_mm_storeu_si128((__m128i_u *)o, vi);
+return true;
+}
+
+static bool test_ISB_ISR_AK_IMC(uint8_t *o, const uint8_t *i, const uint8_t *k)
+{
+return false;
+}
+
+static bool test_ISB_ISR_IMC_AK(uint8_t *o, const uint8_t *i, const uint8_t *k)
+{
+__m128i vi = _mm_loadu_si128((const __m128i_u *)i);
+__m128i vk = _mm_loadu_si128((const __m128i_u *)k);
+
+vi = _mm_aesdec_si128(vi, vk);
+
+_mm_storeu_si128((__m128i_u *)o, vi);
+return true;
+}
diff --git a/tests/tcg/ppc64/test-aes.c b/tests/tcg/ppc64/test-aes.c
new file mode 100644
index 00..1d2be488e9
--- /dev/null
+++ b/tests/tcg/ppc64/test-aes.c
@@ -0,0 +1,116 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+
+#include "../multiarch/test-aes-main.c.inc"
+
+#undef BIG_ENDIAN
+#define BIG_ENDIAN  (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
+
+static unsigned char bswap_le[16] __attribute__((aligned(16))) = {
+8,9,10,11,12,13,14,15,
+0,1,2,3,4,5,6,7
+};
+
+bool test_SB_SR(uint8_t *o, const uint8_t *i)
+{
+/* 

Re: [PULL 00/10] Migration 20230602 patches

2023-06-02 Thread Richard Henderson

On 6/2/23 03:49, Juan Quintela wrote:

The following changes since commit a86d7b9ec0adb2f1efce8ab30d9ed2b72db0236e:

   Merge tag 'migration-20230601-pull-request' of 
https://gitlab.com/juan.quintela/qemu into staging (2023-06-01 20:59:28 -0700)

are available in the Git repository at:

   https://gitlab.com/juan.quintela/qemu.git 
tags/migration-20230602-pull-request

for you to fetch changes up to b861383c2690501ff2687f9ef9268b128b0fb3b3:

   qtest/migration: Document live=true cases (2023-06-02 11:46:20 +0200)


Migration Pull request (20230602 vintage)

This PULL request includes:
- All migration-test patches except the last one (daniel)
- Documentation about live test cases (peter)

Please apply.


Applied, thanks.  Please update https://wiki.qemu.org/ChangeLog/8.1 as 
appropriate.


r~





[PATCH 07/35] target/arm: Use aesenc_SB_SR

2023-06-02 Thread Richard Henderson
This implements the AESE instruction.

Signed-off-by: Richard Henderson 
---
 target/arm/tcg/crypto_helper.c | 18 +-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/target/arm/tcg/crypto_helper.c b/target/arm/tcg/crypto_helper.c
index 75882d9ea3..5cebc88f5f 100644
--- a/target/arm/tcg/crypto_helper.c
+++ b/target/arm/tcg/crypto_helper.c
@@ -15,6 +15,7 @@
 #include "exec/helper-proto.h"
 #include "tcg/tcg-gvec-desc.h"
 #include "crypto/aes.h"
+#include "crypto/aes-round.h"
 #include "crypto/sm4.h"
 #include "vec_internal.h"
 
@@ -70,7 +71,22 @@ void HELPER(crypto_aese)(void *vd, void *vn, void *vm, 
uint32_t desc)
 intptr_t i, opr_sz = simd_oprsz(desc);
 
 for (i = 0; i < opr_sz; i += 16) {
-do_crypto_aese(vd + i, vn + i, vm + i, AES_sbox, AES_shifts);
+AESState *ad = (AESState *)(vd + i);
+AESState *st = (AESState *)(vn + i);
+AESState *rk = (AESState *)(vm + i);
+AESState t;
+
+/* Our uint64_t are in the wrong order for big-endian. */
+if (HOST_BIG_ENDIAN) {
+t.d[0] = st->d[1] ^ rk->d[1];
+t.d[1] = st->d[0] ^ rk->d[0];
+aesenc_SB_SR(&t, &t, false);
+ad->d[0] = t.d[1];
+ad->d[1] = t.d[0];
+} else {
+t.v = st->v ^ rk->v;
+aesenc_SB_SR(ad, &t, false);
+}
 }
 clear_tail(vd, opr_sz, simd_maxsz(desc));
 }
-- 
2.34.1




[PATCH 04/35] crypto: Add aesenc_SB_SR

2023-06-02 Thread Richard Henderson
Start adding infrastructure for accelerating guest AES.
Begin with a SubBytes + ShiftRows primitive.
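
Note the trailing bool describes the byte order of the guest data, not of
the host; HOST_BIG_ENDIAN is folded in by the inline dispatcher, so callers
only state how their vector registers are laid out.  The target patches
later in the series pass it accordingly, e.g. (illustrative calls only):

    aesenc_SB_SR(&t, st, false);   /* x86 XMM data, little-endian layout */
    aesenc_SB_SR(&t, st, true);    /* ppc AVR data, big-endian layout */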

Signed-off-by: Richard Henderson 
---
 host/include/generic/host/aes-round.h | 15 +
 include/crypto/aes-round.h| 41 +++
 crypto/aes.c  | 47 +++
 3 files changed, 103 insertions(+)
 create mode 100644 host/include/generic/host/aes-round.h
 create mode 100644 include/crypto/aes-round.h

diff --git a/host/include/generic/host/aes-round.h 
b/host/include/generic/host/aes-round.h
new file mode 100644
index 00..598242c603
--- /dev/null
+++ b/host/include/generic/host/aes-round.h
@@ -0,0 +1,15 @@
+/*
+ * No host specific aes acceleration.
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#ifndef HOST_AES_ROUND_H
+#define HOST_AES_ROUND_H
+
+#define HAVE_AES_ACCEL  false
+#define ATTR_AES_ACCEL
+
+void aesenc_SB_SR_accel(AESState *, const AESState *, bool)
+QEMU_ERROR("unsupported accel");
+
+#endif
diff --git a/include/crypto/aes-round.h b/include/crypto/aes-round.h
new file mode 100644
index 00..784e1daee6
--- /dev/null
+++ b/include/crypto/aes-round.h
@@ -0,0 +1,41 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ * AES round fragments, generic version
+ *
+ * Copyright (C) 2023 Linaro, Ltd.
+ */
+
+#ifndef CRYPTO_AES_ROUND_H
+#define CRYPTO_AES_ROUND_H
+
+/* Hosts with acceleration will usually need a 16-byte vector type. */
+typedef uint8_t AESStateVec __attribute__((vector_size(16)));
+
+typedef union {
+uint8_t b[16];
+uint32_t w[4];
+uint64_t d[2];
+AESStateVec v;
+} AESState;
+
+#include "host/aes-round.h"
+
+/*
+ * Perform SubBytes + ShiftRows.
+ */
+
+void aesenc_SB_SR_gen(AESState *ret, const AESState *st);
+void aesenc_SB_SR_genrev(AESState *ret, const AESState *st);
+
+static inline void aesenc_SB_SR(AESState *r, const AESState *st, bool be)
+{
+if (HAVE_AES_ACCEL) {
+aesenc_SB_SR_accel(r, st, be);
+} else if (HOST_BIG_ENDIAN == be) {
+aesenc_SB_SR_gen(r, st);
+} else {
+aesenc_SB_SR_genrev(r, st);
+}
+}
+
+#endif /* CRYPTO_AES_ROUND_H */
diff --git a/crypto/aes.c b/crypto/aes.c
index 1309a13e91..708838315a 100644
--- a/crypto/aes.c
+++ b/crypto/aes.c
@@ -29,6 +29,7 @@
  */
 #include "qemu/osdep.h"
 #include "crypto/aes.h"
+#include "crypto/aes-round.h"
 
 typedef uint32_t u32;
 typedef uint8_t u8;
@@ -1251,6 +1252,52 @@ static const u32 rcon[] = {
 0x1B000000, 0x36000000, /* for 128-bit blocks, Rijndael never uses 
more than 10 rcon values */
 };
 
+/* Perform SubBytes + ShiftRows. */
+static inline void
+aesenc_SB_SR_swap(AESState *r, const AESState *st, bool swap)
+{
+const int swap_b = swap ? 15 : 0;
+uint8_t t;
+
+/* These four indexes are not swizzled. */
+r->b[swap_b ^ 0x0] = AES_sbox[st->b[swap_b ^ AES_SH_0]];
+r->b[swap_b ^ 0x4] = AES_sbox[st->b[swap_b ^ AES_SH_4]];
+r->b[swap_b ^ 0x8] = AES_sbox[st->b[swap_b ^ AES_SH_8]];
+r->b[swap_b ^ 0xc] = AES_sbox[st->b[swap_b ^ AES_SH_C]];
+
+/* Otherwise, break cycles. */
+
+t = AES_sbox[st->b[swap_b ^ AES_SH_D]];
+r->b[swap_b ^ 0x1] = AES_sbox[st->b[swap_b ^ AES_SH_1]];
+r->b[swap_b ^ 0x5] = AES_sbox[st->b[swap_b ^ AES_SH_5]];
+r->b[swap_b ^ 0x9] = AES_sbox[st->b[swap_b ^ AES_SH_9]];
+r->b[swap_b ^ 0xd] = t;
+
+t = AES_sbox[st->b[swap_b ^ AES_SH_A]];
+r->b[swap_b ^ 0x2] = AES_sbox[st->b[swap_b ^ AES_SH_2]];
+r->b[swap_b ^ 0xa] = t;
+
+t = AES_sbox[st->b[swap_b ^ AES_SH_E]];
+r->b[swap_b ^ 0x6] = AES_sbox[st->b[swap_b ^ AES_SH_6]];
+r->b[swap_b ^ 0xe] = t;
+
+t = AES_sbox[st->b[swap_b ^ AES_SH_7]];
+r->b[swap_b ^ 0x3] = AES_sbox[st->b[swap_b ^ AES_SH_3]];
+r->b[swap_b ^ 0xf] = AES_sbox[st->b[swap_b ^ AES_SH_F]];
+r->b[swap_b ^ 0xb] = AES_sbox[st->b[swap_b ^ AES_SH_B]];
+r->b[swap_b ^ 0x7] = t;
+}
+
+void aesenc_SB_SR_gen(AESState *r, const AESState *st)
+{
+aesenc_SB_SR_swap(r, st, false);
+}
+
+void aesenc_SB_SR_genrev(AESState *r, const AESState *st)
+{
+aesenc_SB_SR_swap(r, st, true);
+}
+
 /**
  * Expand the cipher key into the encryption key schedule.
  */
-- 
2.34.1




[PATCH 19/35] target/arm: Use aesdec_IMC

2023-06-02 Thread Richard Henderson
This implements the AESIMC instruction.  We have converted everything
to crypto/aes-round.h; crypto/aes.h is no longer needed.

Signed-off-by: Richard Henderson 
---
 target/arm/tcg/crypto_helper.c | 33 ++---
 1 file changed, 14 insertions(+), 19 deletions(-)

diff --git a/target/arm/tcg/crypto_helper.c b/target/arm/tcg/crypto_helper.c
index a0fec08771..d2da80f2ba 100644
--- a/target/arm/tcg/crypto_helper.c
+++ b/target/arm/tcg/crypto_helper.c
@@ -14,7 +14,6 @@
 #include "cpu.h"
 #include "exec/helper-proto.h"
 #include "tcg/tcg-gvec-desc.h"
-#include "crypto/aes.h"
 #include "crypto/aes-round.h"
 #include "crypto/sm4.h"
 #include "vec_internal.h"
@@ -96,23 +95,6 @@ void HELPER(crypto_aesd)(void *vd, void *vn, void *vm, 
uint32_t desc)
 clear_tail(vd, opr_sz, simd_maxsz(desc));
 }
 
-static void do_crypto_aesmc(uint64_t *rd, uint64_t *rm, const uint32_t *mc)
-{
-union CRYPTO_STATE st = { .l = { rm[0], rm[1] } };
-int i;
-
-for (i = 0; i < 16; i += 4) {
-CR_ST_WORD(st, i >> 2) =
-mc[CR_ST_BYTE(st, i)] ^
-rol32(mc[CR_ST_BYTE(st, i + 1)], 8) ^
-rol32(mc[CR_ST_BYTE(st, i + 2)], 16) ^
-rol32(mc[CR_ST_BYTE(st, i + 3)], 24);
-}
-
-rd[0] = st.l[0];
-rd[1] = st.l[1];
-}
-
 void HELPER(crypto_aesmc)(void *vd, void *vm, uint32_t desc)
 {
 intptr_t i, opr_sz = simd_oprsz(desc);
@@ -141,7 +123,20 @@ void HELPER(crypto_aesimc)(void *vd, void *vm, uint32_t 
desc)
 intptr_t i, opr_sz = simd_oprsz(desc);
 
 for (i = 0; i < opr_sz; i += 16) {
-do_crypto_aesmc(vd + i, vm + i, AES_imc_rot);
+AESState *ad = (AESState *)(vd + i);
+AESState *st = (AESState *)(vm + i);
+AESState t;
+
+/* Our uint64_t are in the wrong order for big-endian. */
+if (HOST_BIG_ENDIAN) {
+t.d[0] = st->d[1];
+t.d[1] = st->d[0];
+aesdec_IMC(&t, &t, false);
+ad->d[0] = t.d[1];
+ad->d[1] = t.d[0];
+} else {
+aesdec_IMC(ad, st, false);
+}
 }
 clear_tail(vd, opr_sz, simd_maxsz(desc));
 }
-- 
2.34.1




[PATCH 30/35] host/include/i386: Implement aes-round.h

2023-06-02 Thread Richard Henderson
Detect AES in cpuinfo; implement the accel hooks.

Signed-off-by: Richard Henderson 
---
 host/include/i386/host/aes-round.h   | 148 +++
 host/include/i386/host/cpuinfo.h |   1 +
 host/include/x86_64/host/aes-round.h |   1 +
 util/cpuinfo-i386.c  |   3 +
 4 files changed, 153 insertions(+)
 create mode 100644 host/include/i386/host/aes-round.h
 create mode 100644 host/include/x86_64/host/aes-round.h

diff --git a/host/include/i386/host/aes-round.h 
b/host/include/i386/host/aes-round.h
new file mode 100644
index 00..b67e20578d
--- /dev/null
+++ b/host/include/i386/host/aes-round.h
@@ -0,0 +1,148 @@
+/*
+ * x86 specific aes acceleration.
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#ifndef HOST_AES_ROUND_H
+#define HOST_AES_ROUND_H
+
+#include "host/cpuinfo.h"
+#include 
+
+#if defined(__AES__) && defined(__SSSE3__)
+# define HAVE_AES_ACCEL  true
+# define ATTR_AES_ACCEL
+#else
+# define HAVE_AES_ACCEL  likely(cpuinfo & CPUINFO_AES)
+# define ATTR_AES_ACCEL  __attribute__((target("aes,ssse3")))
+#endif
+
+static inline __m128i ATTR_AES_ACCEL
+aes_accel_bswap(__m128i x)
+{
+return _mm_shuffle_epi8(x, _mm_set_epi8(0, 1, 2, 3, 4, 5, 6, 7, 8,
+9, 10, 11, 12, 13, 14, 15));
+}
+
+static inline void ATTR_AES_ACCEL
+aesenc_MC_accel(AESState *ret, const AESState *st, bool be)
+{
+__m128i t = (__m128i)st->v;
+__m128i z = _mm_setzero_si128();
+
+if (be) {
+t = aes_accel_bswap(t);
+t = _mm_aesdeclast_si128(t, z);
+t = _mm_aesenc_si128(t, z);
+t = aes_accel_bswap(t);
+} else {
+t = _mm_aesdeclast_si128(t, z);
+t = _mm_aesenc_si128(t, z);
+}
+ret->v = (AESStateVec)t;
+}
+
+static inline void ATTR_AES_ACCEL
+aesenc_SB_SR_accel(AESState *ret, const AESState *st, bool be)
+{
+__m128i t = (__m128i)st->v;
+__m128i z = _mm_setzero_si128();
+
+if (be) {
+t = aes_accel_bswap(t);
+t = _mm_aesenclast_si128(t, z);
+t = aes_accel_bswap(t);
+} else {
+t = _mm_aesenclast_si128(t, z);
+}
+ret->v = (AESStateVec)t;
+}
+
+static inline void ATTR_AES_ACCEL
+aesenc_SB_SR_MC_AK_accel(AESState *ret, const AESState *st,
+ const AESState *rk, bool be)
+{
+__m128i t = (__m128i)st->v;
+__m128i k = (__m128i)rk->v;
+
+if (be) {
+t = aes_accel_bswap(t);
+k = aes_accel_bswap(k);
+t = _mm_aesenc_si128(t, k);
+t = aes_accel_bswap(t);
+} else {
+t = _mm_aesenc_si128(t, k);
+}
+ret->v = (AESStateVec)t;
+}
+
+static inline void ATTR_AES_ACCEL
+aesdec_IMC_accel(AESState *ret, const AESState *st, bool be)
+{
+__m128i t = (__m128i)st->v;
+
+if (be) {
+t = aes_accel_bswap(t);
+t = _mm_aesimc_si128(t);
+t = aes_accel_bswap(t);
+} else {
+t = _mm_aesimc_si128(t);
+}
+ret->v = (AESStateVec)t;
+}
+
+static inline void ATTR_AES_ACCEL
+aesdec_ISB_ISR_accel(AESState *ret, const AESState *st, bool be)
+{
+__m128i t = (__m128i)st->v;
+__m128i z = _mm_setzero_si128();
+
+if (be) {
+t = aes_accel_bswap(t);
+t = _mm_aesdeclast_si128(t, z);
+t = aes_accel_bswap(t);
+} else {
+t = _mm_aesdeclast_si128(t, z);
+}
+ret->v = (AESStateVec)t;
+}
+
+static inline void ATTR_AES_ACCEL
+aesdec_ISB_ISR_AK_IMC_accel(AESState *ret, const AESState *st,
+const AESState *rk, bool be)
+{
+__m128i t = (__m128i)st->v;
+__m128i k = (__m128i)rk->v;
+
+if (be) {
+t = aes_accel_bswap(t);
+k = aes_accel_bswap(k);
+k = _mm_aesimc_si128(k);
+t = _mm_aesdec_si128(t, k);
+t = aes_accel_bswap(t);
+} else {
+k = _mm_aesimc_si128(k);
+t = _mm_aesdec_si128(t, k);
+}
+ret->v = (AESStateVec)t;
+}
+
+static inline void ATTR_AES_ACCEL
+aesdec_ISB_ISR_IMC_AK_accel(AESState *ret, const AESState *st,
+const AESState *rk, bool be)
+{
+__m128i t = (__m128i)st->v;
+__m128i k = (__m128i)rk->v;
+
+if (be) {
+t = aes_accel_bswap(t);
+k = aes_accel_bswap(k);
+t = _mm_aesdec_si128(t, k);
+t = aes_accel_bswap(t);
+} else {
+t = _mm_aesdec_si128(t, k);
+}
+ret->v = (AESStateVec)t;
+}
+
+#endif
diff --git a/host/include/i386/host/cpuinfo.h b/host/include/i386/host/cpuinfo.h
index a6537123cf..073d0a426f 100644
--- a/host/include/i386/host/cpuinfo.h
+++ b/host/include/i386/host/cpuinfo.h
@@ -26,6 +26,7 @@
 #define CPUINFO_AVX512VBMI2 (1u << 15)
 #define CPUINFO_ATOMIC_VMOVDQA  (1u << 16)
 #define CPUINFO_ATOMIC_VMOVDQU  (1u << 17)
+#define CPUINFO_AES (1u << 18)
 
 /* Initialized with a constructor. */
 extern unsigned cpuinfo;
diff --git a/host/include/x86_64/host/aes-round.h 
b/host/include/x86_64/host/aes-round.h
new file mode 100644
index 00..7da13f5424
--- /dev/null
+++ 

[PATCH 16/35] target/arm: Use aesenc_MC

2023-06-02 Thread Richard Henderson
This implements the AESMC instruction.

Signed-off-by: Richard Henderson 
---
 target/arm/tcg/crypto_helper.c | 15 ++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/target/arm/tcg/crypto_helper.c b/target/arm/tcg/crypto_helper.c
index d7b644851f..a0fec08771 100644
--- a/target/arm/tcg/crypto_helper.c
+++ b/target/arm/tcg/crypto_helper.c
@@ -118,7 +118,20 @@ void HELPER(crypto_aesmc)(void *vd, void *vm, uint32_t 
desc)
 intptr_t i, opr_sz = simd_oprsz(desc);
 
 for (i = 0; i < opr_sz; i += 16) {
-do_crypto_aesmc(vd + i, vm + i, AES_mc_rot);
+AESState *ad = (AESState *)(vd + i);
+AESState *st = (AESState *)(vm + i);
+AESState t;
+
+/* Our uint64_t are in the wrong order for big-endian. */
+if (HOST_BIG_ENDIAN) {
+t.d[0] = st->d[1];
+t.d[1] = st->d[0];
+aesenc_MC(&t, &t, false);
+ad->d[0] = t.d[1];
+ad->d[1] = t.d[0];
+} else {
+aesenc_MC(ad, st, false);
+}
 }
 clear_tail(vd, opr_sz, simd_maxsz(desc));
 }
-- 
2.34.1




[PATCH 11/35] target/i386: Use aesdec_ISB_ISR

2023-06-02 Thread Richard Henderson
This implements the AESDECLAST instruction.

Signed-off-by: Richard Henderson 
---
 target/i386/ops_sse.h | 12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index 31e1f6edc7..036eabdf95 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -2177,12 +2177,14 @@ void glue(helper_aesdec, SUFFIX)(CPUX86State *env, Reg 
*d, Reg *v, Reg *s)
 
 void glue(helper_aesdeclast, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
 {
-int i;
-Reg st = *v;
-Reg rk = *s;
+for (int i = 0; i < SHIFT; i++) {
+AESState *ad = (AESState *)&d->ZMM_X(i);
+AESState *st = (AESState *)&v->ZMM_X(i);
+AESState *rk = (AESState *)&s->ZMM_X(i);
+AESState t;
 
-for (i = 0; i < 8 << SHIFT; i++) {
-d->B(i) = rk.B(i) ^ (AES_isbox[st.B(AES_ishifts[i & 15] + (i & ~15))]);
+aesdec_ISB_ISR(&t, st, false);
+ad->v = t.v ^ rk->v;
 }
 }
 
-- 
2.34.1




[PATCH 02/35] target/arm: Move aesmc and aesimc tables to crypto/aes.c

2023-06-02 Thread Richard Henderson
We do not currently have a table in crypto/ for
just MixColumns.  Move both tables for consistency.

Signed-off-by: Richard Henderson 
---
 include/crypto/aes.h   |   6 ++
 crypto/aes.c   | 142 
 target/arm/tcg/crypto_helper.c | 143 ++---
 3 files changed, 153 insertions(+), 138 deletions(-)

diff --git a/include/crypto/aes.h b/include/crypto/aes.h
index 822d64588c..24b073d569 100644
--- a/include/crypto/aes.h
+++ b/include/crypto/aes.h
@@ -34,6 +34,12 @@ extern const uint8_t AES_isbox[256];
 extern const uint8_t AES_shifts[16];
 extern const uint8_t AES_ishifts[16];
 
+/* AES MixColumns, for use with rot32. */
+extern const uint32_t AES_mc_rot[256];
+
+/* AES InvMixColumns, for use with rot32. */
+extern const uint32_t AES_imc_rot[256];
+
 /* AES InvMixColumns */
 /* AES_imc[x][0] = [x].[0e, 09, 0d, 0b]; */
 /* AES_imc[x][1] = [x].[0b, 0e, 09, 0d]; */
diff --git a/crypto/aes.c b/crypto/aes.c
index af72ff7779..72c95c38fb 100644
--- a/crypto/aes.c
+++ b/crypto/aes.c
@@ -116,6 +116,148 @@ const uint8_t AES_ishifts[16] = {
 0, 13, 10, 7, 4, 1, 14, 11, 8, 5, 2, 15, 12, 9, 6, 3
 };
 
+/*
+ * MixColumns lookup table, for use with rot32.
+ * From Arm ARM pseudocode.
+ */
+const uint32_t AES_mc_rot[256] = {
+0x00000000, 0x03010102, 0x06020204, 0x05030306,
+0x0c040408, 0x0f05050a, 0x0a06060c, 0x0907070e,
+0x18080810, 0x1b090912, 0x1e0a0a14, 0x1d0b0b16,
+0x140c0c18, 0x170d0d1a, 0x120e0e1c, 0x110f0f1e,
+0x30101020, 0x33111122, 0x36121224, 0x35131326,
+0x3c141428, 0x3f15152a, 0x3a16162c, 0x3917172e,
+0x28181830, 0x2b191932, 0x2e1a1a34, 0x2d1b1b36,
+0x241c1c38, 0x271d1d3a, 0x221e1e3c, 0x211f1f3e,
+0x60202040, 0x63212142, 0x66222244, 0x65232346,
+0x6c242448, 0x6f25254a, 0x6a26264c, 0x6927274e,
+0x78282850, 0x7b292952, 0x7e2a2a54, 0x7d2b2b56,
+0x742c2c58, 0x772d2d5a, 0x722e2e5c, 0x712f2f5e,
+0x50303060, 0x53313162, 0x56323264, 0x55333366,
+0x5c343468, 0x5f35356a, 0x5a36366c, 0x5937376e,
+0x48383870, 0x4b393972, 0x4e3a3a74, 0x4d3b3b76,
+0x443c3c78, 0x473d3d7a, 0x423e3e7c, 0x413f3f7e,
+0xc0404080, 0xc3414182, 0xc6424284, 0xc5434386,
+0xcc444488, 0xcf45458a, 0xca46468c, 0xc947478e,
+0xd8484890, 0xdb494992, 0xde4a4a94, 0xdd4b4b96,
+0xd44c4c98, 0xd74d4d9a, 0xd24e4e9c, 0xd14f4f9e,
+0xf05050a0, 0xf35151a2, 0xf65252a4, 0xf55353a6,
+0xfc5454a8, 0xff5555aa, 0xfa5656ac, 0xf95757ae,
+0xe85858b0, 0xeb5959b2, 0xee5a5ab4, 0xed5b5bb6,
+0xe45c5cb8, 0xe75d5dba, 0xe25e5ebc, 0xe15f5fbe,
+0xa06060c0, 0xa36161c2, 0xa66262c4, 0xa56363c6,
+0xac6464c8, 0xaf6565ca, 0xaa6666cc, 0xa96767ce,
+0xb86868d0, 0xbb6969d2, 0xbe6a6ad4, 0xbd6b6bd6,
+0xb46c6cd8, 0xb76d6dda, 0xb26e6edc, 0xb16f6fde,
+0x907070e0, 0x937171e2, 0x967272e4, 0x957373e6,
+0x9c7474e8, 0x9f7575ea, 0x9a7676ec, 0x997777ee,
+0x887878f0, 0x8b7979f2, 0x8e7a7af4, 0x8d7b7bf6,
+0x847c7cf8, 0x877d7dfa, 0x827e7efc, 0x817f7ffe,
+0x9b80801b, 0x98818119, 0x9d82821f, 0x9e83831d,
+0x97848413, 0x94858511, 0x91868617, 0x92878715,
+0x8388880b, 0x80898909, 0x858a8a0f, 0x868b8b0d,
+0x8f8c8c03, 0x8c8d8d01, 0x898e8e07, 0x8a8f8f05,
+0xab90903b, 0xa8919139, 0xad92923f, 0xae93933d,
+0xa7949433, 0xa4959531, 0xa1969637, 0xa2979735,
+0xb398982b, 0xb0999929, 0xb59a9a2f, 0xb69b9b2d,
+0xbf9c9c23, 0xbc9d9d21, 0xb99e9e27, 0xba9f9f25,
+0xfba0a05b, 0xf8a1a159, 0xfda2a25f, 0xfea3a35d,
+0xf7a4a453, 0xf4a5a551, 0xf1a6a657, 0xf2a7a755,
+0xe3a8a84b, 0xe0a9a949, 0xe5aaaa4f, 0xe6abab4d,
+0xefacac43, 0xecadad41, 0xe9aeae47, 0xeaafaf45,
+0xcbb0b07b, 0xc8b1b179, 0xcdb2b27f, 0xceb3b37d,
+0xc7b4b473, 0xc4b5b571, 0xc1b6b677, 0xc2b7b775,
+0xd3b8b86b, 0xd0b9b969, 0xd5baba6f, 0xd6bbbb6d,
+0xdfbcbc63, 0xdcbdbd61, 0xd9bebe67, 0xdabfbf65,
+0x5bc0c09b, 0x58c1c199, 0x5dc2c29f, 0x5ec3c39d,
+0x57c4c493, 0x54c5c591, 0x51c6c697, 0x52c7c795,
+0x43c8c88b, 0x40c9c989, 0x45caca8f, 0x46cbcb8d,
+0x4fcccc83, 0x4ccdcd81, 0x49cece87, 0x4acfcf85,
+0x6bd0d0bb, 0x68d1d1b9, 0x6dd2d2bf, 0x6ed3d3bd,
+0x67d4d4b3, 0x64d5d5b1, 0x61d6d6b7, 0x62d7d7b5,
+0x73d8d8ab, 0x70d9d9a9, 0x75dadaaf, 0x76dbdbad,
+0x7fdcdca3, 0x7cdddda1, 0x79dedea7, 0x7adfdfa5,
+0x3be0e0db, 0x38e1e1d9, 0x3de2e2df, 0x3ee3e3dd,
+0x37e4e4d3, 0x34e5e5d1, 0x31e6e6d7, 0x32e7e7d5,
+0x23e8e8cb, 0x20e9e9c9, 0x25eaeacf, 0x26ebebcd,
+0x2fececc3, 0x2cededc1, 0x29eeeec7, 0x2aefefc5,
+0x0bf0f0fb, 0x08f1f1f9, 0x0df2f2ff, 0x0ef3f3fd,
+0x07f4f4f3, 0x04f5f5f1, 0x01f6f6f7, 0x02f7f7f5,
+0x13f8f8eb, 0x10f9f9e9, 0x15fafaef, 0x16fbfbed,
+0x1ffcfce3, 0x1cfdfde1, 0x19fefee7, 0x1affffe5,
+};
+
+/*
+ * Inverse MixColumns lookup table, for use with rot32.
+ * From Arm ARM pseudocode.
+ */
+const uint32_t AES_imc_rot[256] = {
+0x00000000, 0x0b0d090e, 0x161a121c, 0x1d171b12,
+0x2c342438, 0x27392d36, 0x3a2e3624, 0x31233f2a,
+0x58684870, 0x5365417e, 0x4e725a6c, 0x457f5362,
+  

[PATCH 08/35] target/ppc: Use aesenc_SB_SR

2023-06-02 Thread Richard Henderson
This implements the VCIPHERLAST instruction.

Signed-off-by: Richard Henderson 
---
 target/ppc/int_helper.c | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index d97a7f1f28..b49e17685b 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -25,6 +25,7 @@
 #include "qemu/log.h"
 #include "exec/helper-proto.h"
 #include "crypto/aes.h"
+#include "crypto/aes-round.h"
 #include "fpu/softfloat.h"
 #include "qapi/error.h"
 #include "qemu/guest-random.h"
@@ -2947,13 +2948,13 @@ void helper_vcipher(ppc_avr_t *r, ppc_avr_t *a, 
ppc_avr_t *b)
 
 void helper_vcipherlast(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
 {
-ppc_avr_t result;
-int i;
+AESState *ad = (AESState *)r;
+AESState *st = (AESState *)a;
+AESState *rk = (AESState *)b;
+AESState t;
 
-VECTOR_FOR_INORDER_I(i, u8) {
-result.VsrB(i) = b->VsrB(i) ^ (AES_sbox[a->VsrB(AES_shifts[i])]);
-}
-*r = result;
+aesenc_SB_SR(&t, st, true);
+ad->v = t.v ^ rk->v;
 }
 
 void helper_vncipher(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
-- 
2.34.1




[PATCH 18/35] target/i386: Use aesdec_IMC

2023-06-02 Thread Richard Henderson
This implements the AESIMC instruction.

Signed-off-by: Richard Henderson 
---
 target/i386/ops_sse.h | 11 +++
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index 036eabdf95..0187651140 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -2219,15 +2219,10 @@ void glue(helper_aesenclast, SUFFIX)(CPUX86State *env, 
Reg *d, Reg *v, Reg *s)
 #if SHIFT == 1
 void glue(helper_aesimc, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
 {
-int i;
-Reg tmp = *s;
+AESState *ad = (AESState *)&d->ZMM_X(0);
+AESState *st = (AESState *)&s->ZMM_X(0);
 
-for (i = 0 ; i < 4 ; i++) {
-d->L(i) = bswap32(AES_imc[tmp.B(4 * i + 0)][0] ^
-  AES_imc[tmp.B(4 * i + 1)][1] ^
-  AES_imc[tmp.B(4 * i + 2)][2] ^
-  AES_imc[tmp.B(4 * i + 3)][3]);
-}
+aesdec_IMC(ad, st, false);
 }
 
 void glue(helper_aeskeygenassist, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
-- 
2.34.1




[PATCH 06/35] target/arm: Demultiplex AESE and AESMC

2023-06-02 Thread Richard Henderson
Split these helpers so that we are not passing 'decrypt'
within the simd descriptor.
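
The knock-on effect in the A64 translator (its hunk is truncated below) is
that the decrypt bit now picks the helper instead of riding along in the
descriptor's data argument; a sketch only of the resulting call, assuming
the existing gen_gvec_op3_ool() wrapper and the AESE/AESD register usage:

    gen_helper_gvec_3 *fn = decrypt ? gen_helper_crypto_aesd
                                    : gen_helper_crypto_aese;
    gen_gvec_op3_ool(s, true, rd, rd, rn, 0, fn);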

Signed-off-by: Richard Henderson 
---
 target/arm/helper.h |  2 ++
 target/arm/tcg/sve.decode   |  4 ++--
 target/arm/tcg/crypto_helper.c  | 37 +++--
 target/arm/tcg/translate-a64.c  | 13 
 target/arm/tcg/translate-neon.c |  4 ++--
 target/arm/tcg/translate-sve.c  |  8 ---
 6 files changed, 41 insertions(+), 27 deletions(-)

diff --git a/target/arm/helper.h b/target/arm/helper.h
index 3335c2b10b..95e32a697a 100644
--- a/target/arm/helper.h
+++ b/target/arm/helper.h
@@ -552,7 +552,9 @@ DEF_HELPER_FLAGS_2(neon_qzip16, TCG_CALL_NO_RWG, void, ptr, 
ptr)
 DEF_HELPER_FLAGS_2(neon_qzip32, TCG_CALL_NO_RWG, void, ptr, ptr)
 
 DEF_HELPER_FLAGS_4(crypto_aese, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_4(crypto_aesd, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_3(crypto_aesmc, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
+DEF_HELPER_FLAGS_3(crypto_aesimc, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
 
 DEF_HELPER_FLAGS_4(crypto_sha1su0, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 DEF_HELPER_FLAGS_4(crypto_sha1c, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
diff --git a/target/arm/tcg/sve.decode b/target/arm/tcg/sve.decode
index 14b3a69c36..04b6fcc0cf 100644
--- a/target/arm/tcg/sve.decode
+++ b/target/arm/tcg/sve.decode
@@ -1629,8 +1629,8 @@ STNT1_zprz  1110010 .. 10 . 001 ... . . \
 ### SVE2 Crypto Extensions
 
 # SVE2 crypto unary operations
-# AESMC and AESIMC
-AESMC   01000101 00 1011100 decrypt:1 0 rd:5
+AESMC   01000101 00 1011100 0 0 rd:5
+AESIMC  01000101 00 1011100 1 0 rd:5
 
 # SVE2 crypto destructive binary operations
 AESE01000101 00 10001 0 11100 0 . .  @rdn_rm_e0
diff --git a/target/arm/tcg/crypto_helper.c b/target/arm/tcg/crypto_helper.c
index 06254939d2..75882d9ea3 100644
--- a/target/arm/tcg/crypto_helper.c
+++ b/target/arm/tcg/crypto_helper.c
@@ -45,11 +45,9 @@ static void clear_tail_16(void *vd, uint32_t desc)
 clear_tail(vd, opr_sz, max_sz);
 }
 
-static void do_crypto_aese(uint64_t *rd, uint64_t *rn,
-   uint64_t *rm, bool decrypt)
+static void do_crypto_aese(uint64_t *rd, uint64_t *rn, uint64_t *rm,
+   const uint8_t *sbox, const uint8_t *shift)
 {
-static uint8_t const * const sbox[2] = { AES_sbox, AES_isbox };
-static uint8_t const * const shift[2] = { AES_shifts, AES_ishifts };
 union CRYPTO_STATE rk = { .l = { rm[0], rm[1] } };
 union CRYPTO_STATE st = { .l = { rn[0], rn[1] } };
 int i;
@@ -60,7 +58,7 @@ static void do_crypto_aese(uint64_t *rd, uint64_t *rn,
 
 /* combine ShiftRows operation and sbox substitution */
 for (i = 0; i < 16; i++) {
-CR_ST_BYTE(st, i) = sbox[decrypt][CR_ST_BYTE(rk, shift[decrypt][i])];
+CR_ST_BYTE(st, i) = sbox[CR_ST_BYTE(rk, shift[i])];
 }
 
 rd[0] = st.l[0];
@@ -70,18 +68,26 @@ static void do_crypto_aese(uint64_t *rd, uint64_t *rn,
 void HELPER(crypto_aese)(void *vd, void *vn, void *vm, uint32_t desc)
 {
 intptr_t i, opr_sz = simd_oprsz(desc);
-bool decrypt = simd_data(desc);
 
 for (i = 0; i < opr_sz; i += 16) {
-do_crypto_aese(vd + i, vn + i, vm + i, decrypt);
+do_crypto_aese(vd + i, vn + i, vm + i, AES_sbox, AES_shifts);
 }
 clear_tail(vd, opr_sz, simd_maxsz(desc));
 }
 
-static void do_crypto_aesmc(uint64_t *rd, uint64_t *rm, bool decrypt)
+void HELPER(crypto_aesd)(void *vd, void *vn, void *vm, uint32_t desc)
+{
+intptr_t i, opr_sz = simd_oprsz(desc);
+
+for (i = 0; i < opr_sz; i += 16) {
+do_crypto_aese(vd + i, vn + i, vm + i, AES_isbox, AES_ishifts);
+}
+clear_tail(vd, opr_sz, simd_maxsz(desc));
+}
+
+static void do_crypto_aesmc(uint64_t *rd, uint64_t *rm, const uint32_t *mc)
 {
 union CRYPTO_STATE st = { .l = { rm[0], rm[1] } };
-const uint32_t *mc = decrypt ? AES_imc_rot : AES_mc_rot;
 int i;
 
 for (i = 0; i < 16; i += 4) {
@@ -99,10 +105,19 @@ static void do_crypto_aesmc(uint64_t *rd, uint64_t *rm, 
bool decrypt)
 void HELPER(crypto_aesmc)(void *vd, void *vm, uint32_t desc)
 {
 intptr_t i, opr_sz = simd_oprsz(desc);
-bool decrypt = simd_data(desc);
 
 for (i = 0; i < opr_sz; i += 16) {
-do_crypto_aesmc(vd + i, vm + i, decrypt);
+do_crypto_aesmc(vd + i, vm + i, AES_mc_rot);
+}
+clear_tail(vd, opr_sz, simd_maxsz(desc));
+}
+
+void HELPER(crypto_aesimc)(void *vd, void *vm, uint32_t desc)
+{
+intptr_t i, opr_sz = simd_oprsz(desc);
+
+for (i = 0; i < opr_sz; i += 16) {
+do_crypto_aesmc(vd + i, vm + i, AES_imc_rot);
 }
 clear_tail(vd, opr_sz, simd_maxsz(desc));
 }
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index 741a608739..3a97216d9b 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -13416,7 +13416,6 @@ static 

[PATCH 03/35] crypto/aes: Add constants for ShiftRows, InvShiftRows

2023-06-02 Thread Richard Henderson
These symbols will avoid the indirection through memory
when fully unrolling some new primitives.
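
"Indirection through memory" here is the per-byte load of the shift offset
itself: with the enum, the row offset becomes an immediate folded into the
addressing.  Compare (index 1 chosen only for illustration):

    r->b[1] = AES_sbox[st->b[AES_shifts[1]]];   /* loads the offset from the table */
    r->b[1] = AES_sbox[st->b[AES_SH_1]];        /* AES_SH_1 is the constant 0x5 */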

Signed-off-by: Richard Henderson 
---
 crypto/aes.c | 50 --
 1 file changed, 48 insertions(+), 2 deletions(-)

diff --git a/crypto/aes.c b/crypto/aes.c
index 72c95c38fb..1309a13e91 100644
--- a/crypto/aes.c
+++ b/crypto/aes.c
@@ -108,12 +108,58 @@ const uint8_t AES_isbox[256] = {
 0xE1, 0x69, 0x14, 0x63, 0x55, 0x21, 0x0C, 0x7D,
 };
 
+/* AES ShiftRows, for complete unrolling. */
+enum {
+AES_SH_0 = 0x0,
+AES_SH_1 = 0x5,
+AES_SH_2 = 0xa,
+AES_SH_3 = 0xf,
+AES_SH_4 = 0x4,
+AES_SH_5 = 0x9,
+AES_SH_6 = 0xe,
+AES_SH_7 = 0x3,
+AES_SH_8 = 0x8,
+AES_SH_9 = 0xd,
+AES_SH_A = 0x2,
+AES_SH_B = 0x7,
+AES_SH_C = 0xc,
+AES_SH_D = 0x1,
+AES_SH_E = 0x6,
+AES_SH_F = 0xb,
+};
+
 const uint8_t AES_shifts[16] = {
-0, 5, 10, 15, 4, 9, 14, 3, 8, 13, 2, 7, 12, 1, 6, 11
+AES_SH_0, AES_SH_1, AES_SH_2, AES_SH_3,
+AES_SH_4, AES_SH_5, AES_SH_6, AES_SH_7,
+AES_SH_8, AES_SH_9, AES_SH_A, AES_SH_B,
+AES_SH_C, AES_SH_D, AES_SH_E, AES_SH_F,
+};
+
+/* AES InvShiftRows, for complete unrolling. */
+enum {
+AES_ISH_0 = 0x0,
+AES_ISH_1 = 0xd,
+AES_ISH_2 = 0xa,
+AES_ISH_3 = 0x7,
+AES_ISH_4 = 0x4,
+AES_ISH_5 = 0x1,
+AES_ISH_6 = 0xe,
+AES_ISH_7 = 0xb,
+AES_ISH_8 = 0x8,
+AES_ISH_9 = 0x5,
+AES_ISH_A = 0x2,
+AES_ISH_B = 0xf,
+AES_ISH_C = 0xc,
+AES_ISH_D = 0x9,
+AES_ISH_E = 0x6,
+AES_ISH_F = 0x3,
 };
 
 const uint8_t AES_ishifts[16] = {
-0, 13, 10, 7, 4, 1, 14, 11, 8, 5, 2, 15, 12, 9, 6, 3
+AES_ISH_0, AES_ISH_1, AES_ISH_2, AES_ISH_3,
+AES_ISH_4, AES_ISH_5, AES_ISH_6, AES_ISH_7,
+AES_ISH_8, AES_ISH_9, AES_ISH_A, AES_ISH_B,
+AES_ISH_C, AES_ISH_D, AES_ISH_E, AES_ISH_F,
 };
 
 /*
-- 
2.34.1




[PATCH 00/35] crypto: Provide aes-round.h and host accel

2023-06-02 Thread Richard Henderson
Inspired by Ard Biesheuvel's RFC patches for accelerating AES
under emulation, provide a set of primitives that maps between
the guest and host fragments.

There is a small guest correctness test case.

I think the end result is quite a bit cleaner, since the logic
is now centralized, rather than spread across 4 different guests.
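
To make the shape concrete, a converted guest helper ends up as little more
than a cast to AESState plus one call into the new header.  A minimal
sketch, using only names introduced by this series (the helper name itself
is made up):

    #include "qemu/osdep.h"
    #include "crypto/aes-round.h"

    /* A "last round" encrypt: SubBytes + ShiftRows via the shared
     * primitive, then AddRoundKey as a plain vector xor. */
    void guest_aes_enclast(AESState *ret, const AESState *st,
                           const AESState *rk)
    {
        AESState t;

        aesenc_SB_SR(&t, st, false);    /* false: guest data little-endian */
        ret->v = t.v ^ rk->v;
    }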

Further work could clean up crypto/aes.c itself to use these
instead of the tables directly.  I'm sure that's just an ultimate
fallback when an appropriate system library is not available, and
so not terribly important, but it could still significantly reduce
the amount of code we carry.

I would imagine structuring a polynomial multiplication header
in a similar way.  There are 4 or 5 versions of those spread across
the different guests.

Anyway, please review.


r~


Richard Henderson (35):
  tests/multiarch: Add test-aes
  target/arm: Move aesmc and aesimc tables to crypto/aes.c
  crypto/aes: Add constants for ShiftRows, InvShiftRows
  crypto: Add aesenc_SB_SR
  target/i386: Use aesenc_SB_SR
  target/arm: Demultiplex AESE and AESMC
  target/arm: Use aesenc_SB_SR
  target/ppc: Use aesenc_SB_SR
  target/riscv: Use aesenc_SB_SR
  crypto: Add aesdec_ISB_ISR
  target/i386: Use aesdec_ISB_ISR
  target/arm: Use aesdec_ISB_ISR
  target/ppc: Use aesdec_ISB_ISR
  target/riscv: Use aesdec_ISB_ISR
  crypto: Add aesenc_MC
  target/arm: Use aesenc_MC
  crypto: Add aesdec_IMC
  target/i386: Use aesdec_IMC
  target/arm: Use aesdec_IMC
  target/riscv: Use aesdec_IMC
  crypto: Add aesenc_SB_SR_MC_AK
  target/i386: Use aesenc_SB_SR_MC_AK
  target/ppc: Use aesenc_SB_SR_MC_AK
  target/riscv: Use aesenc_SB_SR_MC_AK
  crypto: Add aesdec_ISB_ISR_IMC_AK
  target/i386: Use aesdec_ISB_ISR_IMC_AK
  target/riscv: Use aesdec_ISB_ISR_IMC_AK
  crypto: Add aesdec_ISB_ISR_AK_IMC
  target/ppc: Use aesdec_ISB_ISR_AK_IMC
  host/include/i386: Implement aes-round.h
  host/include/aarch64: Implement aes-round.h
  crypto: Remove AES_shifts, AES_ishifts
  crypto: Implement aesdec_IMC with AES_imc_rot
  crypto: Remove AES_imc
  crypto: Unexport AES_*_rot, AES_TeN, AES_TdN

 host/include/aarch64/host/aes-round.h   | 204 ++
 host/include/aarch64/host/cpuinfo.h |   1 +
 host/include/generic/host/aes-round.h   |  36 ++
 host/include/i386/host/aes-round.h  | 148 +
 host/include/i386/host/cpuinfo.h|   1 +
 host/include/x86_64/host/aes-round.h|   1 +
 include/crypto/aes-round.h  | 158 +
 include/crypto/aes.h|  30 -
 target/arm/helper.h |   2 +
 target/i386/ops_sse.h   |  64 +-
 target/arm/tcg/sve.decode   |   4 +-
 crypto/aes.c| 808 
 target/arm/tcg/crypto_helper.c  | 245 +++
 target/arm/tcg/translate-a64.c  |  13 +-
 target/arm/tcg/translate-neon.c |   4 +-
 target/arm/tcg/translate-sve.c  |   8 +-
 target/ppc/int_helper.c |  58 +-
 target/riscv/crypto_helper.c| 142 ++---
 tests/tcg/aarch64/test-aes.c|  58 ++
 tests/tcg/i386/test-aes.c   |  68 ++
 tests/tcg/ppc64/test-aes.c  | 116 
 tests/tcg/riscv64/test-aes.c|  76 +++
 util/cpuinfo-aarch64.c  |   2 +
 util/cpuinfo-i386.c |   3 +
 tests/tcg/multiarch/test-aes-main.c.inc | 183 ++
 tests/tcg/aarch64/Makefile.target   |   4 +
 tests/tcg/i386/Makefile.target  |   4 +
 tests/tcg/ppc64/Makefile.target |   1 +
 tests/tcg/riscv64/Makefile.target   |   4 +
 29 files changed, 1776 insertions(+), 670 deletions(-)
 create mode 100644 host/include/aarch64/host/aes-round.h
 create mode 100644 host/include/generic/host/aes-round.h
 create mode 100644 host/include/i386/host/aes-round.h
 create mode 100644 host/include/x86_64/host/aes-round.h
 create mode 100644 include/crypto/aes-round.h
 create mode 100644 tests/tcg/aarch64/test-aes.c
 create mode 100644 tests/tcg/i386/test-aes.c
 create mode 100644 tests/tcg/ppc64/test-aes.c
 create mode 100644 tests/tcg/riscv64/test-aes.c
 create mode 100644 tests/tcg/multiarch/test-aes-main.c.inc

-- 
2.34.1




Re: [PULL v2 00/21] NBD and miscellaneous patches for 2023-06-01

2023-06-02 Thread Richard Henderson

On 6/2/23 10:33, Eric Blake wrote:

The following changes since commit a86d7b9ec0adb2f1efce8ab30d9ed2b72db0236e:

   Merge tag 'migration-20230601-pull-request' 
of https://gitlab.com/juan.quintela/qemu into staging (2023-06-01 20:59:28 
-0700)

are available in the Git repository at:

   https://repo.or.cz/qemu/ericb.git  tags/pull-nbd-2023-06-01-v2

for you to fetch changes up to 42cc08d13ab8e68f76882b216da0b28d06f29e11:

   cutils: Improve qemu_strtosz handling of fractions (2023-06-02 12:29:27 
-0500)

In v2:
- fix build failure on mingw [CI, via Richard]
- drop dead comparisons to UINT64_MAX [Markus]

only the changed patches are re-posted


nbd and misc patches for 2023-06-01

- Eric Blake: Fix iotest 104 for NBD
- Eric Blake: Improve qcow2 spec on padding bytes
- Eric Blake: Fix read-beyond-bounds bug in qemu_strtosz


Applied, thanks.  Please update https://wiki.qemu.org/ChangeLog/8.1 as 
appropriate.


r~




Re: [PATCH v3 47/48] exec/poison: Do not poison CONFIG_SOFTMMU

2023-06-02 Thread Philippe Mathieu-Daudé

On 31/5/23 06:03, Richard Henderson wrote:

If CONFIG_USER_ONLY is ok generically, so is CONFIG_SOFTMMU,
because they are exactly opposite.

Signed-off-by: Richard Henderson 
---
  include/exec/poison.h | 1 -
  scripts/make-config-poison.sh | 5 +++--
  2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/exec/poison.h b/include/exec/poison.h
index 256736e11a..e94ee8dfef 100644
--- a/include/exec/poison.h
+++ b/include/exec/poison.h
@@ -85,7 +85,6 @@
  #pragma GCC poison CONFIG_HVF
  #pragma GCC poison CONFIG_LINUX_USER
  #pragma GCC poison CONFIG_KVM
-#pragma GCC poison CONFIG_SOFTMMU
  #pragma GCC poison CONFIG_WHPX
  #pragma GCC poison CONFIG_XEN
  
diff --git a/scripts/make-config-poison.sh b/scripts/make-config-poison.sh

index 1892854261..2b36907e23 100755
--- a/scripts/make-config-poison.sh
+++ b/scripts/make-config-poison.sh
@@ -4,11 +4,12 @@ if test $# = 0; then
exit 0
  fi
  
-# Create list of config switches that should be poisoned in common code...

-# but filter out CONFIG_TCG and CONFIG_USER_ONLY which are special.
+# Create list of config switches that should be poisoned in common code,
+# but filter out several which are handled manually.
  exec sed -n \
-e' /CONFIG_TCG/d' \
-e '/CONFIG_USER_ONLY/d' \
+  -e '/CONFIG_SOFTMMU/d' \
-e '/^#define / {' \
-e's///' \
-e's/ .*//' \


Reviewed-by: Philippe Mathieu-Daudé 




Re: [RFC PATCH 2/2] bulk: Replace !CONFIG_USER_ONLY -> CONFIG_SOFTMMU

2023-06-02 Thread Philippe Mathieu-Daudé

On 3/6/23 00:58, Philippe Mathieu-Daudé wrote:

CONFIG_SOFTMMU is the opposite of CONFIG_USER_ONLY.
Now that CONFIG_SOFTMMU isn't poisoned anymore,
replace !CONFIG_USER_ONLY negation by the positive
form which is clearer when reviewing code.

Change mostly done mechanically using:

   $ sed -i -e 's/ifndef CONFIG_USER_ONLY/ifdef CONFIG_SOFTMMU/' \
-e 's/!defined(CONFIG_USER_ONLY)/defined(CONFIG_SOFTMMU)/' \
$(git grep -l CONFIG_USER_ONLY)

and adapting comments manually.

Signed-off-by: Philippe Mathieu-Daudé 


*Sigh* I was not building in the correct build directory,
now I realize this patch is crap because the CONFIG_SOFTMMU
definition is not propagated to all objects.

Please disregard...



Re: [RFC PATCH 2/2] bulk: Replace !CONFIG_USER_ONLY -> CONFIG_SOFTMMU

2023-06-02 Thread Philippe Mathieu-Daudé

On 3/6/23 00:58, Philippe Mathieu-Daudé wrote:

CONFIG_SOFTMMU is the opposite of CONFIG_USER_ONLY.
Now that CONFIG_SOFTMMU isn't poisoned anymore,
replace !CONFIG_USER_ONLY negation by the positive
form which is clearer when reviewing code.

Change mostly done mechanically using:

   $ sed -i -e 's/ifndef CONFIG_USER_ONLY/ifdef CONFIG_SOFTMMU/' \
-e 's/!defined(CONFIG_USER_ONLY)/defined(CONFIG_SOFTMMU)/' \
$(git grep -l CONFIG_USER_ONLY)

and adapting comments manually.

Signed-off-by: Philippe Mathieu-Daudé 
---
  scripts/coccinelle/round.cocci|   6 +




diff --git a/scripts/coccinelle/round.cocci b/scripts/coccinelle/round.cocci
index ed06773289..0a27b6da4d 100644
--- a/scripts/coccinelle/round.cocci
+++ b/scripts/coccinelle/round.cocci
@@ -17,3 +17,9 @@ expression e2;
  @@
  -(DIV_ROUND_UP(e1,e2))
  +DIV_ROUND_UP(e1,e2)
+
+@@
+expression n, d;
+@@
+-   n & ~(d - 1)
++   ROUND_DOWN(n, d)


Oops, unrelated =)



[RFC PATCH 0/2] bulk: Replace !CONFIG_SOFTMMU and !CONFIG_USER_ONLY

2023-06-02 Thread Philippe Mathieu-Daudé
Since CONFIG_SOFTMMU is poisoned, we are using its opposite
form via "!CONFIG_USER_ONLY" (because CONFIG_USER_ONLY is
not poisoned).
Since patch [2] unpoisons CONFIG_SOFTMMU, we can remove the
kludge, resulting in more logical code to review.

Personally I like the resulting code, but I can understand
others may simply see code churn here, so I'm posting these as
bulk patches. I have no problem splitting them if nobody objects
to this change.

Based-on: 20230531040330.8950-1-richard.hender...@linaro.org
  "tcg: Build once for system, once for user" (v3)
[1] 
https://lore.kernel.org/qemu-devel/20230531040330.8950-1-richard.hender...@linaro.org/
[2] 
https://lore.kernel.org/qemu-devel/20230531040330.8950-48-richard.hender...@linaro.org/

*** BLURB HERE ***

Philippe Mathieu-Daudé (2):
  bulk: Replace !CONFIG_SOFTMMU -> CONFIG_USER_ONLY
  bulk: Replace !CONFIG_USER_ONLY -> CONFIG_SOFTMMU

 scripts/coccinelle/round.cocci|   6 +
 include/exec/address-spaces.h |   2 +-
 include/exec/confidential-guest-support.h |   4 +-
 include/exec/cpu-all.h|   4 +-
 include/exec/cpu-common.h |   4 +-
 include/exec/cpu-defs.h   |  14 +-
 include/exec/cputlb.h |   2 +-
 include/exec/exec-all.h   |   6 +-
 include/exec/ioport.h |   2 +-
 include/exec/memory-internal.h|   2 +-
 include/exec/memory.h |   2 +-
 include/exec/ram_addr.h   |   2 +-
 include/exec/ramblock.h   |   2 +-
 include/hw/core/cpu.h |  12 +-
 include/hw/core/tcg-cpu-ops.h |   8 +-
 include/hw/intc/armv7m_nvic.h |   4 +-
 include/hw/s390x/css.h|   2 +-
 include/qemu/accel.h  |   6 +-
 include/semihosting/semihost.h|   4 +-
 include/sysemu/cpus.h |   2 +-
 include/sysemu/xen.h  |   4 +-
 target/alpha/cpu.h|   6 +-
 target/arm/common-semi-target.h   |   2 +-
 target/arm/cpu.h  |  14 +-
 target/arm/internals.h|   6 +-
 target/arm/tcg/arm_ldst.h |   2 +-
 target/arm/tcg/translate.h|   2 +-
 target/cris/cpu.h |   2 +-
 target/hppa/cpu.h |   2 +-
 target/hppa/helper.h  |   2 +-
 target/i386/cpu-internal.h|   4 +-
 target/i386/cpu.h |  12 +-
 target/i386/helper.h  |  12 +-
 target/i386/sev.h |   2 +-
 target/i386/tcg/helper-tcg.h  |   4 +-
 target/loongarch/cpu.h|   4 +-
 target/loongarch/helper.h |   2 +-
 target/loongarch/internals.h  |   4 +-
 target/m68k/cpu.h |   6 +-
 target/microblaze/cpu.h   |  12 +-
 target/microblaze/helper.h|   2 +-
 target/mips/cpu.h |   6 +-
 target/mips/helper.h  |   8 +-
 target/mips/internal.h|   4 +-
 target/mips/tcg/tcg-internal.h|   4 +-
 target/nios2/cpu.h|  10 +-
 target/nios2/helper.h |   2 +-
 target/openrisc/cpu.h |   8 +-
 target/ppc/cpu-qom.h  |   4 +-
 target/ppc/cpu.h  |  32 ++---
 target/ppc/helper.h   |   8 +-
 target/ppc/internal.h |   6 +-
 target/ppc/kvm_ppc.h  |   8 +-
 target/ppc/mmu-book3s-v3.h|   2 +-
 target/ppc/mmu-hash32.h   |   2 +-
 target/ppc/mmu-hash64.h   |   2 +-
 target/ppc/mmu-radix64.h  |   2 +-
 target/ppc/power8-pmu.h   |   2 +-
 target/ppc/spr_common.h   |   2 +-
 target/riscv/cpu.h|  10 +-
 target/riscv/helper.h |   4 +-
 target/riscv/internals.h  |   2 +-
 target/rx/cpu.h   |   4 +-
 target/s390x/cpu.h|   8 +-
 target/s390x/helper.h |   2 +-
 target/s390x/s390x-internal.h |   8 +-
 target/sh4/cpu.h  |   2 +-
 target/sparc/cpu.h|  10 +-
 target/sparc/helper.h |   2 +-
 target/xtensa/cpu.h   |   6 +-
 target/xtensa/helper.h|   4 +-
 target/s390x/tcg/insn-data.h.inc  |   2 +-
 accel/accel-common.c  |   8 +-
 

[RFC PATCH 1/2] bulk: Replace !CONFIG_SOFTMMU -> CONFIG_USER_ONLY

2023-06-02 Thread Philippe Mathieu-Daudé
CONFIG_USER_ONLY is the opposite of CONFIG_SOFTMMU.
Replace !CONFIG_SOFTMMU negation by the positive form
which is clearer when reviewing code.

Change mostly done mechanically using:

  $ sed -i -e 's/!defined(CONFIG_SOFTMMU)/defined(CONFIG_USER_ONLY)/' \
   -e 's/ifndef CONFIG_SOFTMMU/ifdef CONFIG_USER_ONLY/' \
   $(git grep -l CONFIG_SOFTMMU)

and adapting comments manually.

Signed-off-by: Philippe Mathieu-Daudé 
---
 accel/tcg/cpu-exec.c | 4 ++--
 target/i386/helper.c | 6 +++---
 tcg/aarch64/tcg-target.c.inc | 4 ++--
 tcg/arm/tcg-target.c.inc | 4 ++--
 tcg/i386/tcg-target.c.inc| 8 
 tcg/loongarch64/tcg-target.c.inc | 4 ++--
 tcg/mips/tcg-target.c.inc| 6 +++---
 tcg/ppc/tcg-target.c.inc | 4 ++--
 tcg/riscv/tcg-target.c.inc   | 2 +-
 tcg/s390x/tcg-target.c.inc   | 4 ++--
 tcg/sparc64/tcg-target.c.inc | 4 ++--
 11 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c
index f1eae7b8e5..d5695a7083 100644
--- a/accel/tcg/cpu-exec.c
+++ b/accel/tcg/cpu-exec.c
@@ -568,7 +568,7 @@ void cpu_exec_step_atomic(CPUState *cpu)
 cpu_tb_exec(cpu, tb, &tb_exit);
 cpu_exec_exit(cpu);
 } else {
-#ifndef CONFIG_SOFTMMU
+#ifdef CONFIG_USER_ONLY
 clear_helper_retaddr();
 if (have_mmap_lock()) {
 mmap_unlock();
@@ -1025,7 +1025,7 @@ static int cpu_exec_setjmp(CPUState *cpu, SyncClocks *sc)
 /* Non-buggy compilers preserve this; assert the correct value. */
 g_assert(cpu == current_cpu);
 
-#ifndef CONFIG_SOFTMMU
+#ifdef CONFIG_USER_ONLY
 clear_helper_retaddr();
 if (have_mmap_lock()) {
 mmap_unlock();
diff --git a/target/i386/helper.c b/target/i386/helper.c
index 89aa696c6d..c9755b3aba 100644
--- a/target/i386/helper.c
+++ b/target/i386/helper.c
@@ -582,7 +582,7 @@ int cpu_x86_get_descr_debug(CPUX86State *env, unsigned int 
selector,
 
 void do_cpu_init(X86CPU *cpu)
 {
-#if !defined(CONFIG_USER_ONLY)
+#if defined(CONFIG_SOFTMMU)
 CPUState *cs = CPU(cpu);
 CPUX86State *env = &cpu->env;
 CPUX86State *save = g_new(CPUX86State, 1);
@@ -601,10 +601,10 @@ void do_cpu_init(X86CPU *cpu)
 kvm_arch_do_init_vcpu(cpu);
 }
 apic_init_reset(cpu->apic_state);
-#endif /* CONFIG_USER_ONLY */
+#endif /* CONFIG_SOFTMMU */
 }
 
-#ifndef CONFIG_USER_ONLY
+#ifdef CONFIG_SOFTMMU
 
 void do_cpu_sipi(X86CPU *cpu)
 {
diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc
index 35ca80cd56..d82654ac64 100644
--- a/tcg/aarch64/tcg-target.c.inc
+++ b/tcg/aarch64/tcg-target.c.inc
@@ -77,7 +77,7 @@ static TCGReg tcg_target_call_oarg_reg(TCGCallReturnKind 
kind, int slot)
 #define TCG_REG_TMP2 TCG_REG_X30
 #define TCG_VEC_TMP0 TCG_REG_V31
 
-#ifndef CONFIG_SOFTMMU
+#ifdef CONFIG_USER_ONLY
 #define TCG_REG_GUEST_BASE TCG_REG_X28
 #endif
 
@@ -3083,7 +3083,7 @@ static void tcg_target_qemu_prologue(TCGContext *s)
 tcg_set_frame(s, TCG_REG_SP, TCG_STATIC_CALL_ARGS_SIZE,
   CPU_TEMP_BUF_NLONGS * sizeof(long));
 
-#if !defined(CONFIG_SOFTMMU)
+#if defined(CONFIG_USER_ONLY)
 /*
  * Note that XZR cannot be encoded in the address base register slot,
  * as that actaully encodes SP.  Depending on the guest, we may need
diff --git a/tcg/arm/tcg-target.c.inc b/tcg/arm/tcg-target.c.inc
index 83e286088f..9248c1eb2a 100644
--- a/tcg/arm/tcg-target.c.inc
+++ b/tcg/arm/tcg-target.c.inc
@@ -89,7 +89,7 @@ static TCGReg tcg_target_call_oarg_reg(TCGCallReturnKind 
kind, int slot)
 
 #define TCG_REG_TMP  TCG_REG_R12
 #define TCG_VEC_TMP  TCG_REG_Q15
-#ifndef CONFIG_SOFTMMU
+#ifdef CONFIG_USER_ONLY
 #define TCG_REG_GUEST_BASE  TCG_REG_R11
 #endif
 
@@ -2920,7 +2920,7 @@ static void tcg_target_qemu_prologue(TCGContext *s)
 
 tcg_out_mov(s, TCG_TYPE_PTR, TCG_AREG0, tcg_target_call_iarg_regs[0]);
 
-#ifndef CONFIG_SOFTMMU
+#ifdef CONFIG_USER_ONLY
 if (guest_base) {
 tcg_out_movi(s, TCG_TYPE_PTR, TCG_REG_GUEST_BASE, guest_base);
 tcg_regset_set_reg(s->reserved_regs, TCG_REG_GUEST_BASE);
diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index ab997b5fb3..fa3ef70417 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -1866,7 +1866,7 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, 
TCGLabelQemuLdst *l)
 return true;
 }
 
-#ifndef CONFIG_SOFTMMU
+#ifdef CONFIG_USER_ONLY
 static HostAddress x86_guest_base = {
 .index = -1
 };
@@ -1898,7 +1898,7 @@ static inline int setup_guest_base_seg(void)
 return 0;
 }
 #endif /* setup_guest_base_seg */
-#endif /* !SOFTMMU */
+#endif /* CONFIG_USER_ONLY */
 
 #define MIN_TLB_MASK_TABLE_OFS  INT_MIN
 
@@ -4069,7 +4069,7 @@ static void tcg_target_qemu_prologue(TCGContext *s)
  (ARRAY_SIZE(tcg_target_callee_save_regs) + 2) * 4
  + stack_addend);
 #else
-# if !defined(CONFIG_SOFTMMU)
+# if defined(CONFIG_USER_ONLY)
   

[PATCH 1/2] target/i386/helper: Remove do_cpu_sipi() stub for user-mode emulation

2023-06-02 Thread Philippe Mathieu-Daudé
Since commit  604664726f ("target/i386: Restrict cpu_exec_interrupt()
handler to sysemu"), do_cpu_sipi() isn't called anymore on user
emulation. Remove the now pointless stub.

Signed-off-by: Philippe Mathieu-Daudé 
---
 target/i386/cpu.h| 3 ++-
 target/i386/helper.c | 3 ---
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 7201a71de8..cd047e0410 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -2285,7 +2285,6 @@ static inline void cpu_get_tb_cpu_state(CPUX86State *env, 
target_ulong *pc,
 }
 
 void do_cpu_init(X86CPU *cpu);
-void do_cpu_sipi(X86CPU *cpu);
 
 #define MCE_INJECT_BROADCAST1
 #define MCE_INJECT_UNCOND_AO2
@@ -2419,6 +2418,8 @@ void x86_cpu_set_default_version(X86CPUVersion version);
 
 #ifndef CONFIG_USER_ONLY
 
+void do_cpu_sipi(X86CPU *cpu);
+
 #define APIC_DEFAULT_ADDRESS 0xfee00000
 #define APIC_SPACE_SIZE  0x100000
 
diff --git a/target/i386/helper.c b/target/i386/helper.c
index 36bf2107e7..792c8eb45e 100644
--- a/target/i386/helper.c
+++ b/target/i386/helper.c
@@ -611,9 +611,6 @@ void do_cpu_sipi(X86CPU *cpu)
 void do_cpu_init(X86CPU *cpu)
 {
 }
-void do_cpu_sipi(X86CPU *cpu)
-{
-}
 #endif
 
 #ifndef CONFIG_USER_ONLY
-- 
2.38.1




[PATCH 2/2] target/i386/helper: Shuffle do_cpu_init()

2023-06-02 Thread Philippe Mathieu-Daudé
Move the #ifdef'ry inside do_cpu_init() instead of
declaring an empty stub for user emulation.

Signed-off-by: Philippe Mathieu-Daudé 
---
 target/i386/helper.c | 12 
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/target/i386/helper.c b/target/i386/helper.c
index 792c8eb45e..89aa696c6d 100644
--- a/target/i386/helper.c
+++ b/target/i386/helper.c
@@ -580,9 +580,9 @@ int cpu_x86_get_descr_debug(CPUX86State *env, unsigned int 
selector,
 return 1;
 }
 
-#if !defined(CONFIG_USER_ONLY)
 void do_cpu_init(X86CPU *cpu)
 {
+#if !defined(CONFIG_USER_ONLY)
 CPUState *cs = CPU(cpu);
 CPUX86State *env = &cpu->env;
 CPUX86State *save = g_new(CPUX86State, 1);
@@ -601,19 +601,15 @@ void do_cpu_init(X86CPU *cpu)
 kvm_arch_do_init_vcpu(cpu);
 }
 apic_init_reset(cpu->apic_state);
+#endif /* CONFIG_USER_ONLY */
 }
 
+#ifndef CONFIG_USER_ONLY
+
 void do_cpu_sipi(X86CPU *cpu)
 {
 apic_sipi(cpu->apic_state);
 }
-#else
-void do_cpu_init(X86CPU *cpu)
-{
-}
-#endif
-
-#ifndef CONFIG_USER_ONLY
 
 void cpu_load_efer(CPUX86State *env, uint64_t val)
 {
-- 
2.38.1




[PATCH 0/2] target/i386/helper: Minor #ifdef'ry simplifications

2023-06-02 Thread Philippe Mathieu-Daudé
Not a very interesting code shuffle, but this was in
the way of another big cleanup, so I'm sending it separately.

BTW this file isn't covered in MAINTAINERS:

  $ ./scripts/get_maintainer.pl -f target/i386/helper.c
  get_maintainer.pl: No maintainers found

Philippe Mathieu-Daudé (2):
  target/i386/helper: Remove do_cpu_sipi() stub for user-mode emulation
  target/i386/helper: Shuffle do_cpu_init()

 target/i386/cpu.h|  3 ++-
 target/i386/helper.c | 15 ---
 2 files changed, 6 insertions(+), 12 deletions(-)

-- 
2.38.1




[PATCH] target/hppa/meson: Only build int_helper.o with system emulation

2023-06-02 Thread Philippe Mathieu-Daudé
int_helper.c only contains system emulation code:
remove the #ifdef'ry and move the file to the meson
softmmu source set.

Signed-off-by: Philippe Mathieu-Daudé 
---
 target/hppa/int_helper.c | 3 ---
 target/hppa/meson.build  | 2 +-
 2 files changed, 1 insertion(+), 4 deletions(-)

diff --git a/target/hppa/int_helper.c b/target/hppa/int_helper.c
index f599dccfff..d2480b163b 100644
--- a/target/hppa/int_helper.c
+++ b/target/hppa/int_helper.c
@@ -25,7 +25,6 @@
 #include "hw/core/cpu.h"
 #include "hw/hppa/hppa_hardware.h"
 
-#ifndef CONFIG_USER_ONLY
 static void eval_interrupt(HPPACPU *cpu)
 {
 CPUState *cs = CPU(cpu);
@@ -273,5 +272,3 @@ bool hppa_cpu_exec_interrupt(CPUState *cs, int 
interrupt_request)
 }
 return false;
 }
-
-#endif /* !CONFIG_USER_ONLY */
diff --git a/target/hppa/meson.build b/target/hppa/meson.build
index 81b4b4e617..83b1e0ee7d 100644
--- a/target/hppa/meson.build
+++ b/target/hppa/meson.build
@@ -7,13 +7,13 @@ hppa_ss.add(files(
   'fpu_helper.c',
   'gdbstub.c',
   'helper.c',
-  'int_helper.c',
   'op_helper.c',
   'translate.c',
 ))
 
 hppa_softmmu_ss = ss.source_set()
 hppa_softmmu_ss.add(files(
+  'int_helper.c',
   'machine.c',
   'mem_helper.c',
   'sys_helper.c',
-- 
2.38.1




[PATCH] target/arm: trap DCC access in user mode emulation

2023-06-02 Thread Zhuojia Shen
Accessing EL0-accessible Debug Communication Channel (DCC) registers in
user mode emulation is currently enabled.  However, it does not match
Linux behavior as Linux sets MDSCR_EL1.TDCC on startup to disable EL0
access to DCC (see __cpu_setup() in arch/arm64/mm/proc.S).

This patch fixes access_tdcc() to check MDSCR_EL1.TDCC for EL0 and sets
MDSCR_EL1.TDCC for user mode emulation to match Linux.

Signed-off-by: Zhuojia Shen 
---
 target/arm/cpu.c  | 2 ++
 target/arm/debug_helper.c | 5 +
 2 files changed, 7 insertions(+)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 5182ed0c91..4d5bb57f07 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -289,6 +289,8 @@ static void arm_cpu_reset_hold(Object *obj)
  * This is not yet exposed from the Linux kernel in any way.
  */
 env->cp15.sctlr_el[1] |= SCTLR_TSCXT;
+/* Disable access to Debug Communication Channel (DCC). */
+env->cp15.mdscr_el1 |= 1 << 12;
 #else
 /* Reset into the highest available EL */
 if (arm_feature(env, ARM_FEATURE_EL3)) {
diff --git a/target/arm/debug_helper.c b/target/arm/debug_helper.c
index d41cc643b1..8362462a07 100644
--- a/target/arm/debug_helper.c
+++ b/target/arm/debug_helper.c
@@ -842,12 +842,14 @@ static CPAccessResult access_tda(CPUARMState *env, const 
ARMCPRegInfo *ri,
  * is implemented then these are controlled by MDCR_EL2.TDCC for
  * EL2 and MDCR_EL3.TDCC for EL3. They are also controlled by
  * the general debug access trap bits MDCR_EL2.TDA and MDCR_EL3.TDA.
+ * For EL0, they are also controlled by MDSCR_EL1.TDCC.
  */
 static CPAccessResult access_tdcc(CPUARMState *env, const ARMCPRegInfo *ri,
   bool isread)
 {
 int el = arm_current_el(env);
 uint64_t mdcr_el2 = arm_mdcr_el2_eff(env);
+bool mdscr_el1_tdcc = extract32(env->cp15.mdscr_el1, 12, 1);
 bool mdcr_el2_tda = (mdcr_el2 & MDCR_TDA) || (mdcr_el2 & MDCR_TDE) ||
 (arm_hcr_el2_eff(env) & HCR_TGE);
 bool mdcr_el2_tdcc = cpu_isar_feature(aa64_fgt, env_archcpu(env)) &&
@@ -855,6 +857,9 @@ static CPAccessResult access_tdcc(CPUARMState *env, const 
ARMCPRegInfo *ri,
 bool mdcr_el3_tdcc = cpu_isar_feature(aa64_fgt, env_archcpu(env)) &&
   (env->cp15.mdcr_el3 & MDCR_TDCC);
 
+if (el < 1 && mdscr_el1_tdcc) {
+return CP_ACCESS_TRAP;
+}
 if (el < 2 && (mdcr_el2_tda || mdcr_el2_tdcc)) {
 return CP_ACCESS_TRAP_EL2;
 }
-- 
2.40.1




Re: [PATCH v3 15/48] tcg: Split tcg/tcg-op-common.h from tcg/tcg-op.h

2023-06-02 Thread Philippe Mathieu-Daudé

On 31/5/23 06:02, Richard Henderson wrote:

Create tcg/tcg-op-common.h, moving everything that does not concern
TARGET_LONG_BITS or TCGv.  Adjust tcg/*.c to use the new header
instead of tcg-op.h, in preparation for compiling tcg/ only once.

Signed-off-by: Richard Henderson 
---
  include/tcg/tcg-op-common.h |  996 ++
  include/tcg/tcg-op.h| 1004 +--
  tcg/optimize.c  |2 +-
  tcg/tcg-op-gvec.c   |2 +-
  tcg/tcg-op-ldst.c   |2 +-
  tcg/tcg-op-vec.c|2 +-
  tcg/tcg-op.c|2 +-
  tcg/tcg.c   |2 +-
  tcg/tci.c   |3 +-
  9 files changed, 1007 insertions(+), 1008 deletions(-)


Trivial review using 'git-diff --color-moved=dimmed-zebra'.



Re: [PATCH v3 00/48] tcg: Build once for system, once for user

2023-06-02 Thread Philippe Mathieu-Daudé

On 31/5/23 06:02, Richard Henderson wrote:


  133 files changed, 3022 insertions(+), 2728 deletions(-)



  create mode 100644 include/exec/helper-gen-common.h
  create mode 100644 include/exec/helper-proto-common.h



  create mode 100644 include/exec/helper-gen.h.inc
  create mode 100644 include/exec/helper-proto.h.inc
  create mode 100644 include/exec/helper-info.c.inc


These new files are missing a license.




Re: [PATCH v3 22/48] tcg: Split tcg_gen_callN

2023-06-02 Thread Philippe Mathieu-Daudé

On 31/5/23 06:03, Richard Henderson wrote:

Make tcg_gen_callN a static function.  Create tcg_gen_call[0-7]
functions for use by helper-gen.h.inc.

Removes a multiplicity of calls to __stack_chk_fail, saving up
to 143kiB of .text space as measured on an x86_64 host.

             Old      New     Less   %Change
         8888680  8741816   146864    1.65%   qemu-system-aarch64
         5911832  5856152    55680    0.94%   qemu-system-riscv64
         5816728  5767512    49216    0.85%   qemu-system-mips64
         6707832  6659144    48688    0.73%   qemu-system-ppc64

Signed-off-by: Richard Henderson 
---
  include/exec/helper-gen.h | 40 ++---
  include/tcg/tcg.h | 14 +-
  tcg/tcg.c | 54 ++-
  3 files changed, 86 insertions(+), 22 deletions(-)
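For a feel of what the new entry points look like (a sketch assuming the TCGHelperInfo and
TCGTemp types from tcg.h, not necessarily the literal patch text):

    /* Sketch: fixed-arity wrappers forward to the now-static tcg_gen_callN,
     * so the argument-array handling (and its stack protector) lives in one
     * place instead of at every generated call site. */
    void tcg_gen_call0(TCGHelperInfo *info, TCGTemp *ret)
    {
        tcg_gen_callN(info, ret, NULL);
    }

    void tcg_gen_call1(TCGHelperInfo *info, TCGTemp *ret, TCGTemp *t1)
    {
        TCGTemp *args[1] = { t1 };

        tcg_gen_callN(info, ret, args);
    }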


Reviewed-by: Philippe Mathieu-Daudé 




Re: [PATCH v3 23/48] tcg: Split helper-gen.h

2023-06-02 Thread Philippe Mathieu-Daudé

On 31/5/23 06:03, Richard Henderson wrote:

Create helper-gen-common.h without the target specific portion.
Use that in tcg-op-common.h.  Reorg headers in target/arm to
ensure that helper-gen.h is included before helper-info.c.inc.
All other targets are already correct in this regard.

Signed-off-by: Richard Henderson 
---
  include/exec/helper-gen-common.h |  17 ++
  include/exec/helper-gen.h| 101 ++-
  include/tcg/tcg-op-common.h  |   2 +-
  include/exec/helper-gen.h.inc| 101 +++
  target/arm/tcg/translate.c   |   8 +--
  5 files changed, 126 insertions(+), 103 deletions(-)
  create mode 100644 include/exec/helper-gen-common.h
  create mode 100644 include/exec/helper-gen.h.inc




diff --git a/include/exec/helper-gen.h.inc b/include/exec/helper-gen.h.inc
new file mode 100644
index 0000000000..83bfa5b23f
--- /dev/null
+++ b/include/exec/helper-gen.h.inc
@@ -0,0 +1,101 @@
+/*
+ * Helper file for declaring TCG helper functions.
+ * This one expands generation functions for tcg opcodes.
+ * Define HELPER_H for the header file to be expanded,
+ * and static inline to change from global file scope.
+ */
+
+#include "tcg/tcg.h"
+#include "tcg/helper-info.h"
+#include "exec/helper-head.h"
+
+#define DEF_HELPER_FLAGS_0(name, flags, ret)\
+extern TCGHelperInfo glue(helper_info_, name);  \
+static inline void glue(gen_helper_, name)(dh_retvar_decl0(ret))\
+{   \
+tcg_gen_call0(&glue(helper_info_, name), dh_retvar(ret));   \
+}

[...]

File not guarded for multiple inclusions, otherwise:
Reviewed-by: Philippe Mathieu-Daudé 





Re: [PATCH v3 24/48] tcg: Split helper-proto.h

2023-06-02 Thread Philippe Mathieu-Daudé

On 31/5/23 06:03, Richard Henderson wrote:

Create helper-proto-common.h without the target specific portion.
Use that in tcg-op-common.h.  Include helper-proto.h in target/arm
and target/hexagon before helper-info.c.inc; all other targets are
already correct in this regard.

Signed-off-by: Richard Henderson 
---
  include/exec/helper-proto-common.h | 17 +++
  include/exec/helper-proto.h| 72 --
  include/tcg/tcg-op-common.h|  2 +-
  include/exec/helper-proto.h.inc| 67 +++
  accel/tcg/cputlb.c |  3 +-
  accel/tcg/plugin-gen.c |  2 +-
  accel/tcg/tcg-runtime-gvec.c   |  2 +-
  accel/tcg/tcg-runtime.c|  2 +-
  target/arm/tcg/translate.c |  1 +
  target/hexagon/translate.c |  1 +
  10 files changed, 99 insertions(+), 70 deletions(-)
  create mode 100644 include/exec/helper-proto-common.h
  create mode 100644 include/exec/helper-proto.h.inc




diff --git a/include/exec/helper-proto.h.inc b/include/exec/helper-proto.h.inc
new file mode 100644
index 0000000000..f6f0cfcacd
--- /dev/null
+++ b/include/exec/helper-proto.h.inc
@@ -0,0 +1,67 @@
+/*
+ * Helper file for declaring TCG helper functions.
+ * This one expands prototypes for the helper functions.
+ * Define HELPER_H for the header file to be expanded.
+ */
+
+#include "exec/helper-head.h"
+
+/*
+ * Work around an issue with --enable-lto, in which GCC's ipa-split pass
+ * decides to split out the noreturn code paths that raise an exception,
+ * taking the __builtin_return_address() along into the new function,
+ * where it no longer computes a value that returns to TCG generated code.
+ * Despite the name, the noinline attribute affects splitter, so this
+ * prevents the optimization in question.  Given that helpers should not
+ * otherwise be called directly, this should not have any other visible effect.
+ *
+ * See https://gitlab.com/qemu-project/qemu/-/issues/1454
+ */
+#define DEF_HELPER_ATTR  __attribute__((noinline))
+
+#define DEF_HELPER_FLAGS_0(name, flags, ret) \
+dh_ctype(ret) HELPER(name) (void) DEF_HELPER_ATTR;
+
+#define DEF_HELPER_FLAGS_1(name, flags, ret, t1) \
+dh_ctype(ret) HELPER(name) (dh_ctype(t1)) DEF_HELPER_ATTR;
+
+#define DEF_HELPER_FLAGS_2(name, flags, ret, t1, t2) \
+dh_ctype(ret) HELPER(name) (dh_ctype(t1), dh_ctype(t2)) DEF_HELPER_ATTR;
+
+#define DEF_HELPER_FLAGS_3(name, flags, ret, t1, t2, t3) \
+dh_ctype(ret) HELPER(name) (dh_ctype(t1), dh_ctype(t2), \
+dh_ctype(t3)) DEF_HELPER_ATTR;
+
+#define DEF_HELPER_FLAGS_4(name, flags, ret, t1, t2, t3, t4) \
+dh_ctype(ret) HELPER(name) (dh_ctype(t1), dh_ctype(t2), dh_ctype(t3), \
+dh_ctype(t4)) DEF_HELPER_ATTR;
+
+#define DEF_HELPER_FLAGS_5(name, flags, ret, t1, t2, t3, t4, t5) \
+dh_ctype(ret) HELPER(name) (dh_ctype(t1), dh_ctype(t2), dh_ctype(t3), \
+dh_ctype(t4), dh_ctype(t5)) DEF_HELPER_ATTR;
+
+#define DEF_HELPER_FLAGS_6(name, flags, ret, t1, t2, t3, t4, t5, t6) \
+dh_ctype(ret) HELPER(name) (dh_ctype(t1), dh_ctype(t2), dh_ctype(t3), \
+dh_ctype(t4), dh_ctype(t5), \
+dh_ctype(t6)) DEF_HELPER_ATTR;
+
+#define DEF_HELPER_FLAGS_7(name, flags, ret, t1, t2, t3, t4, t5, t6, t7) \
+dh_ctype(ret) HELPER(name) (dh_ctype(t1), dh_ctype(t2), dh_ctype(t3), \
+dh_ctype(t4), dh_ctype(t5), dh_ctype(t6), \
+dh_ctype(t7)) DEF_HELPER_ATTR;
+
+#define IN_HELPER_PROTO
+
+#include HELPER_H
+
+#undef IN_HELPER_PROTO
+
+#undef DEF_HELPER_FLAGS_0
+#undef DEF_HELPER_FLAGS_1
+#undef DEF_HELPER_FLAGS_2
+#undef DEF_HELPER_FLAGS_3
+#undef DEF_HELPER_FLAGS_4
+#undef DEF_HELPER_FLAGS_5
+#undef DEF_HELPER_FLAGS_6
+#undef DEF_HELPER_FLAGS_7
+#undef DEF_HELPER_ATTR


Should we guard this header for multiple inclusions?

Otherwise:
Reviewed-by: Philippe Mathieu-Daudé 




Re: [PATCH v3 39/48] *: Add missing includes of exec/translation-block.h

2023-06-02 Thread Philippe Mathieu-Daudé

On 31/5/23 06:03, Richard Henderson wrote:

This had been pulled in via exec/exec-all.h, via exec/translator.h,
but the include of exec-all.h will be removed.

Signed-off-by: Richard Henderson 
---
  target/hexagon/translate.c   | 1 +
  target/loongarch/translate.c | 3 +--
  target/mips/tcg/translate.c  | 1 +
  3 files changed, 3 insertions(+), 2 deletions(-)


Reviewed-by: Philippe Mathieu-Daudé 




Re: [PATCH v3 40/48] *: Add missing includes of exec/exec-all.h

2023-06-02 Thread Philippe Mathieu-Daudé

On 31/5/23 06:03, Richard Henderson wrote:

This had been pulled in via exec/translator.h,
but the include of exec-all.h will be removed.

Signed-off-by: Richard Henderson 
---
  target/arm/tcg/translate.h | 1 +
  1 file changed, 1 insertion(+)


Reviewed-by: Philippe Mathieu-Daudé 




Re: [PATCH 2/4] target/riscv: Remove check on mode for MPRV

2023-06-02 Thread Richard Henderson

On 6/1/23 18:31, Weiwei Li wrote:
Even though MPRV normally can be set to 1 in M mode, it seems possible to set it to 1 in 
other modes by gdbstub.


That would seem to be a gdbstub bug, since it is cleared on exit from M-mode, and cannot 
be set again until we re-enter M-mode.



r~



Re: [PATCH v3 46/48] plugins: Drop unused headers from exec/plugin-gen.h

2023-06-02 Thread Philippe Mathieu-Daudé

On 31/5/23 06:03, Richard Henderson wrote:

Two headers are not required for the rest of the
contents of plugin-gen.h.

Signed-off-by: Richard Henderson 
---
  include/exec/plugin-gen.h | 2 --
  1 file changed, 2 deletions(-)


Reviewed-by: Philippe Mathieu-Daudé 




Re: [PATCH v3 45/48] plugins: Move plugin_insn_append to translator.c

2023-06-02 Thread Philippe Mathieu-Daudé

On 31/5/23 06:03, Richard Henderson wrote:

This function is only used in translator.c, and uses a
target-specific typedef, abi_ptr.

Signed-off-by: Richard Henderson 
---
  include/exec/plugin-gen.h | 22 --
  accel/tcg/translator.c| 21 +
  2 files changed, 21 insertions(+), 22 deletions(-)


Reviewed-by: Philippe Mathieu-Daudé 




Re: [PATCH v3 42/48] tcg: Fix PAGE/PROT confusion

2023-06-02 Thread Philippe Mathieu-Daudé

On 31/5/23 06:03, Richard Henderson wrote:

The bug was hidden because they happen to have the same values.

Signed-off-by: Richard Henderson 
---
  tcg/region.c | 18 +-
  1 file changed, 13 insertions(+), 5 deletions(-)


Reviewed-by: Philippe Mathieu-Daudé 




Re: [PATCH v3 41/48] accel/tcg: Tidy includes for translator.[ch]

2023-06-02 Thread Philippe Mathieu-Daudé

On 31/5/23 06:03, Richard Henderson wrote:

Reduce the header to only bswap.h and cpu_ldst.h.
Move exec/translate-all.h to translator.c.
Reduce tcg.h and tcg-op.h to tcg-op-common.h.
Remove otherwise unused headers.

Signed-off-by: Richard Henderson 
---
  include/exec/translator.h | 6 +-
  accel/tcg/translator.c| 8 +++-
  2 files changed, 4 insertions(+), 10 deletions(-)


Reviewed-by: Philippe Mathieu-Daudé 




Re: [PATCH 16/20] target/arm: Convert load (pointer auth) insns to decodetree

2023-06-02 Thread Philippe Mathieu-Daudé

On 2/6/23 17:52, Peter Maydell wrote:

Convert the instructions in the load/store register (pointer
authentication) group to decodetree: LDRAA, LDRAB.

Signed-off-by: Peter Maydell 
---
  target/arm/tcg/a64.decode  |  7 +++
  target/arm/tcg/translate-a64.c | 83 +++---
  2 files changed, 23 insertions(+), 67 deletions(-)

diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index 69635586718..2ea85312bba 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -457,3 +457,10 @@ LDUMIN  .. 111 0 00 . . 1 . 0111 00 . 
. @atomic
  SWP .. 111 0 00 . . 1 . 1000 00 . . @atomic
  
  LDAPR   sz:2 111 0 00 1 0 1 1 1100 00 rn:5 rt:5

+
+# Load/store register (pointer authentication)
+
+# LDRA immediate is 10 bits signed and scaled, but the bits aren't all 
contiguous
+%ldra_imm   22:s1 12:9 !function=times_2
+
+LDRA11 111 0 00 m:1 . 1 . w:1 1 rn:5 rt:5 imm=%ldra_imm


Only sz=3 && v=0, OK (previous code was calling unallocated_encoding).

Reviewed-by: Philippe Mathieu-Daudé 




Re: [PATCH 13/20] target/arm: Convert LDR/STR with 12-bit immediate to decodetree

2023-06-02 Thread Philippe Mathieu-Daudé

Hi Peter,

On 2/6/23 17:52, Peter Maydell wrote:

Convert the LDR and STR instructions which use a 12-bit immediate
offset to decodetree. We can reuse the existing LDR and STR
trans functions for these.

Signed-off-by: Peter Maydell 
---
  target/arm/tcg/a64.decode  |  25 
  target/arm/tcg/translate-a64.c | 103 +
  2 files changed, 41 insertions(+), 87 deletions(-)

diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index 4dfb7bbdc2e..c3a6d0b740a 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode




+# Load/store with an unsigned 12 bit immediate, which is scaled by the
+# element size. The function gets the sz:imm and returns the scaled immediate.
+%uimm_scaled   10:12 sz:3 !function=uimm_scaled
+
@ldst_uimm  .. ... . .. ..  rn:5 rt:5 &ldst_imm unpriv=0 p=0 
w=0 imm=%uimm_scaled
+
+STR_i   sz:2 111 0 01 00  . . @ldst_uimm sign=0 
ext=0
+LDR_i   00 111 0 01 01  . . @ldst_uimm sign=0 
ext=1 sz=0
+LDR_i   01 111 0 01 01  . . @ldst_uimm sign=0 
ext=1 sz=1
+LDR_i   10 111 0 01 01  . . @ldst_uimm sign=0 
ext=1 sz=2
+LDR_i   11 111 0 01 01  . . @ldst_uimm sign=0 
ext=0 sz=3
+LDR_i   00 111 0 01 10  . . @ldst_uimm sign=1 
ext=0 sz=0
+LDR_i   01 111 0 01 10  . . @ldst_uimm sign=1 
ext=0 sz=1
+LDR_i   10 111 0 01 10  . . @ldst_uimm sign=1 
ext=0 sz=2
+LDR_i   00 111 0 01 11  . . @ldst_uimm sign=1 
ext=1 sz=0
+LDR_i   01 111 0 01 11  . . @ldst_uimm sign=1 
ext=1 sz=1


Why not use "sz:2 111 0 01 sign:1 ext:1", returning false for the
cases not covered?



Re: [PATCH] linux-user: Return EINVAL for getgroups() with negative gidsetsize

2023-06-02 Thread Philippe Mathieu-Daudé

On 2/6/23 19:48, Peter Maydell wrote:

Coverity doesn't like the way we might end up calling getgroups()
with a NULL grouplist pointer. This is fine for the special case
of gidsetsize == 0, but we will also do it if the guest passes
us a negative gidsetsize. (CID 1512465)

Explicitly fail the negative gidsetsize with EINVAL, as the kernel
does. This means we definitely only call the libc getgroups()
with valid parameters.

Possibly Coverity may still complain about getgroups(0, NULL), but
that would be a false positive.

Signed-off-by: Peter Maydell 
---
  linux-user/syscall.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)


Reviewed-by: Philippe Mathieu-Daudé 




Re: [PATCH 3/3] meson.build: Group the audio backend entries in a separate summary section

2023-06-02 Thread Philippe Mathieu-Daudé

On 2/6/23 19:18, Thomas Huth wrote:

Let's make it easier for the users to spot audio-related entries
in the summary of the meson output.

Signed-off-by: Thomas Huth 
---
  meson.build | 32 ++--
  1 file changed, 18 insertions(+), 14 deletions(-)


Reviewed-by: Philippe Mathieu-Daudé 




Re: [PATCH 2/3] meson.build: Group the network backend entries in a separate summary section

2023-06-02 Thread Philippe Mathieu-Daudé

On 2/6/23 19:18, Thomas Huth wrote:

Let's make it easier for the users to spot network-related entries
in the summary of the meson output.

Signed-off-by: Thomas Huth 
---
  meson.build | 13 -
  1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/meson.build b/meson.build
index 4a20a2e712..c64ad3c365 100644
--- a/meson.build
+++ b/meson.build
@@ -4267,13 +4267,19 @@ summary_info += {'curses support':curses}
  summary_info += {'brlapi support':brlapi}
  summary(summary_info, bool_yn: true, section: 'User interface')
  
-# Libraries

+# Network backends
  summary_info = {}
  if targetos == 'darwin'
summary_info += {'vmnet.framework support': vmnet}
  endif
-summary_info = {}


Ah, this should be squashed in the previous patch.

Reviewed-by: Philippe Mathieu-Daudé 


  summary_info += {'slirp support': slirp}
+summary_info += {'vde support':   vde}
+summary_info += {'netmap support':have_netmap}
+summary_info += {'l2tpv3 support':have_l2tpv3}
+summary(summary_info, bool_yn: true, section: 'Network backends')
+
+# Libraries
+summary_info = {}
  summary_info += {'libtasn1':  tasn1}
  summary_info += {'PAM':   pam}
  summary_info += {'iconv support': iconv}
@@ -4295,9 +4301,6 @@ if targetos == 'linux'
  endif
  summary_info += {'Pipewire support':   pipewire}
  summary_info += {'JACK support':  jack}
-summary_info += {'vde support':   vde}
-summary_info += {'netmap support':have_netmap}
-summary_info += {'l2tpv3 support':have_l2tpv3}
  summary_info += {'Linux AIO support': libaio}
  summary_info += {'Linux io_uring support': linux_io_uring}
  summary_info += {'ATTR/XATTR support': libattr}





Re: [PATCH 1/3] meson.build: Group the UI entries in a separate summary section

2023-06-02 Thread Philippe Mathieu-Daudé

On 2/6/23 19:18, Thomas Huth wrote:

Let's make it easier for the users to spot UI-related entries in
the summary of the meson output.

Signed-off-by: Thomas Huth 
---
  meson.build | 35 +--
  1 file changed, 21 insertions(+), 14 deletions(-)

diff --git a/meson.build b/meson.build
index a61d3e9b06..4a20a2e712 100644
--- a/meson.build
+++ b/meson.build
@@ -4243,32 +4243,44 @@ summary_info += {'rng-none':  
get_option('rng_none')}
  summary_info += {'Linux keyring': have_keyring}
  summary(summary_info, bool_yn: true, section: 'Crypto')
  
-# Libraries

+# UI
  summary_info = {}
  if targetos == 'darwin'
summary_info += {'Cocoa support':   cocoa}
-  summary_info += {'vmnet.framework support': vmnet}
  endif
  summary_info += {'SDL support':   sdl}
  summary_info += {'SDL image support': sdl_image}
  summary_info += {'GTK support':   gtk}
  summary_info += {'pixman':pixman}
  summary_info += {'VTE support':   vte}
+summary_info += {'PNG support':   png}
+summary_info += {'VNC support':   vnc}
+if vnc.found()
+  summary_info += {'VNC SASL support':  sasl}
+  summary_info += {'VNC JPEG support':  jpeg}
+endif
+summary_info += {'spice protocol support': spice_protocol}
+if spice_protocol.found()
+  summary_info += {'  spice server support': spice}
+endif
+summary_info += {'curses support':curses}
+summary_info += {'brlapi support':brlapi}
+summary(summary_info, bool_yn: true, section: 'User interface')
+
+# Libraries
+summary_info = {}
+if targetos == 'darwin'
+  summary_info += {'vmnet.framework support': vmnet}
+endif
+summary_info = {}


Conditional on dropping the previous line:
Reviewed-by: Philippe Mathieu-Daudé 


  summary_info += {'slirp support': slirp}
  summary_info += {'libtasn1':  tasn1}
  summary_info += {'PAM':   pam}
  summary_info += {'iconv support': iconv}
-summary_info += {'curses support':curses}
  summary_info += {'virgl support': virgl}
  summary_info += {'blkio support': blkio}
  summary_info += {'curl support':  curl}
  summary_info += {'Multipath support': mpathpersist}
-summary_info += {'PNG support':   png}
-summary_info += {'VNC support':   vnc}
-if vnc.found()
-  summary_info += {'VNC SASL support':  sasl}
-  summary_info += {'VNC JPEG support':  jpeg}
-endif
  if targetos not in ['darwin', 'haiku', 'windows']
summary_info += {'OSS support': oss}
summary_info += {'sndio support':   sndio}
@@ -4283,7 +4295,6 @@ if targetos == 'linux'
  endif
  summary_info += {'Pipewire support':   pipewire}
  summary_info += {'JACK support':  jack}
-summary_info += {'brlapi support':brlapi}
  summary_info += {'vde support':   vde}
  summary_info += {'netmap support':have_netmap}
  summary_info += {'l2tpv3 support':have_l2tpv3}
@@ -4295,10 +4306,6 @@ summary_info += {'PVRDMA support':have_pvrdma}
  summary_info += {'fdt support':   fdt_opt == 'disabled' ? false : fdt_opt}
  summary_info += {'libcap-ng support': libcap_ng}
  summary_info += {'bpf support':   libbpf}
-summary_info += {'spice protocol support': spice_protocol}
-if spice_protocol.found()
-  summary_info += {'  spice server support': spice}
-endif
  summary_info += {'rbd support':   rbd}
  summary_info += {'smartcard support': cacard}
  summary_info += {'U2F support':   u2f}





Re: [PATCH] vhost: fix vhost_dev_enable_notifiers() error case

2023-06-02 Thread Philippe Mathieu-Daudé

On 2/6/23 18:27, Laurent Vivier wrote:

in vhost_dev_enable_notifiers(), if virtio_bus_set_host_notifier(true)
fails, we call vhost_dev_disable_notifiers() that executes
virtio_bus_set_host_notifier(false) on all queues, even on queues that
have failed to be initialized.

This triggers a core dump in memory_region_del_eventfd():

  virtio_bus_set_host_notifier: unable to init event notifier: Too many open 
files (-24)
  vhost VQ 1 notifier binding failed: 24
  .../softmmu/memory.c:2611: memory_region_del_eventfd: Assertion `i != 
mr->ioeventfd_nb' failed.

Fix the problem by providing to vhost_dev_disable_notifiers() the
number of queues to disable.

Fixes: 8771589b6f81 ("vhost: simplify vhost_dev_enable_notifiers")
Cc: longpe...@huawei.com
Signed-off-by: Laurent Vivier 
---
  hw/virtio/vhost.c | 65 ++-
  1 file changed, 36 insertions(+), 29 deletions(-)
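The idea of the fix, as a rough sketch (hypothetical helper name and body; the actual patch
is what the diffstat above summarizes): tear down only the notifiers that were actually set up.

    /* Sketch only: disable the first nvqs notifiers rather than hdev->nvqs. */
    static void vhost_dev_disable_notifiers_nvqs(struct vhost_dev *hdev,
                                                 VirtIODevice *vdev, int nvqs)
    {
        BusState *qbus = BUS(qdev_get_parent_bus(DEVICE(vdev)));
        int i, r;

        for (i = 0; i < nvqs; ++i) {
            r = virtio_bus_set_host_notifier(VIRTIO_BUS(qbus),
                                             hdev->vq_index + i, false);
            if (r < 0) {
                error_report("vhost VQ %d notifier cleanup failed: %d", i, -r);
            }
            virtio_bus_cleanup_host_notifier(VIRTIO_BUS(qbus),
                                             hdev->vq_index + i);
        }
        virtio_device_release_ioeventfd(vdev);
    }

    /* vhost_dev_enable_notifiers() would call this with the count of queues it
     * managed to wire up before failing; vhost_dev_disable_notifiers() keeps
     * passing hdev->nvqs. */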


I'd rather have 2 patches, one factoring the new helper out
and the 2nd fixing the bug. If you ever need to respin...
Anyhow,

Reviewed-by: Philippe Mathieu-Daudé 




Re: [PATCH] target/arm: Fix return value from LDSMIN/LDSMAX 8/16 bit atomics

2023-06-02 Thread Philippe Mathieu-Daudé

On 2/6/23 16:22, Peter Maydell wrote:

The atomic memory operations are supposed to return the old memory
data value in the destination register.  This value is not
sign-extended, even if the operation is the signed minimum or
maximum.  (In the pseudocode for the instructions the returned data
value is passed to ZeroExtend() to create the value in the register.)

We got this wrong because we were doing a 32-to-64 zero extend on the
result for 8 and 16 bit data values, rather than the correct amount
of zero extension.

Fix the bug by using ext8u and ext16u for the MO_8 and MO_16 data
sizes rather than ext32u.

Cc: qemu-sta...@nongnu.org
Signed-off-by: Peter Maydell 
---
  target/arm/tcg/translate-a64.c | 18 --
  1 file changed, 16 insertions(+), 2 deletions(-)
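(For readers without the diff handy, the shape of the fix is roughly the sketch below, not
the literal patch text.)

    /* Sketch: zero-extend the returned data by the access size instead of
     * always from 32 bits. */
    static void zext_result_sketch(TCGv_i64 dst, MemOp size)
    {
        switch (size & MO_SIZE) {
        case MO_8:
            tcg_gen_ext8u_i64(dst, dst);
            break;
        case MO_16:
            tcg_gen_ext16u_i64(dst, dst);
            break;
        case MO_32:
            tcg_gen_ext32u_i64(dst, dst);
            break;
        default:
            break;   /* MO_64: already full width */
        }
    }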


Reviewed-by: Philippe Mathieu-Daudé 




[PATCH] target/hexagon: Emit comments to silence coverity

2023-06-02 Thread Anton Johansson via
idef-parser emits safety checks around shifts and extensions to deal
with shift amounts larger than the TCGv size and extensions of 0-bit
regions.  These safety checks sometimes result in dead branches, which
coverity detects and warns about.

This commit silences these dead-code warnings in emitted code by using
markup comments.

Signed-off-by: Anton Johansson 
---
 target/hexagon/idef-parser/parser-helpers.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/target/hexagon/idef-parser/parser-helpers.c 
b/target/hexagon/idef-parser/parser-helpers.c
index 7b5ebafec2..59ef018d44 100644
--- a/target/hexagon/idef-parser/parser-helpers.c
+++ b/target/hexagon/idef-parser/parser-helpers.c
@@ -636,6 +636,7 @@ static void gen_asl_op(Context *c, YYLTYPE *locp, unsigned 
bit_width,
 } break;
 case REG_IMM: {
 OUT(c, locp, "if (", op2, " >= ", &bit_width, ") {\n");
+OUT(c, locp, "/* coverity[dead_error_condition] */\n");
 OUT(c, locp, "tcg_gen_movi_", bit_suffix, "(", res, ", 0);\n");
 OUT(c, locp, "} else {\n");
 OUT(c, locp, "tcg_gen_shli_", bit_suffix,
@@ -691,7 +692,8 @@ static void gen_asr_op(Context *c, YYLTYPE *locp, unsigned 
bit_width,
 gen_c_int_type(c, locp, bit_width, signedness);
 OUT(c, locp, " shift = ", op2, ";\n");
 OUT(c, locp, "if (", op2, " >= ", &bit_width, ") {\n");
-OUT(c, locp, "shift = ", &bit_width, " - 1;\n");
+OUT(c, locp, "/* coverity[dead_error_condition] */\n");
+OUT(c, locp, "shift = ", &bit_width, " - 1;\n");
 OUT(c, locp, "}\n");
 OUT(c, locp, "tcg_gen_sari_", bit_suffix,
 "(", res, ", ", op1, ", shift);\n}\n");
@@ -1060,6 +1062,7 @@ static HexValue gen_extend_imm_width_op(Context *c,
 ");\n");
 if (need_guarding) {
 OUT(c, locp, "} else {\n");
+OUT(c, locp, "/* coverity[dead_error_condition] */\n");
 OUT(c, locp, "tcg_gen_movi_i", _width, "(", ,
 ", 0);\n");
 OUT(c, locp, "}\n");
-- 
2.39.1




Re: [PULL v2 02/16] block/file-posix: introduce helper functions for sysfs attributes

2023-06-02 Thread Sam Li
Matthew Rosato wrote on Sat, Jun 3, 2023 at 02:41:
>
> On 6/2/23 2:18 PM, Sam Li wrote:
> > Matthew Rosato wrote on Thu, Jun 1, 2023 at 02:21:
> >>
> >> On 5/15/23 12:04 PM, Stefan Hajnoczi wrote:
> >>> From: Sam Li 
> >>>
> >>> Use get_sysfs_str_val() to get the string value of device
> >>> zoned model. Then get_sysfs_zoned_model() can convert it to
> >>> BlockZoneModel type of QEMU.
> >>>
> >>> Use get_sysfs_long_val() to get the long value of zoned device
> >>> information.
> >>
> >> Hi Stefan, Sam,
> >>
> >> I am having an issue on s390x using virtio-blk-{pci,ccw} backed by an NVMe 
> >> partition, and I've bisected the root cause to this commit.
> >>
> >> I noticed that tests which use the partition e.g. /dev/nvme0n1p1 as a 
> >> backing device would fail, but those that use the namespace e.g. 
> >> /dev/nvme0n1 would still succeed.  The root issue appears to be that the 
> >> block device associated with the partition does not have a "max_segments" 
> >> attribute, and prior to this patch hdev_get_max_segment() would return 
> >> -ENOENT in this case.  After this patch, however, QEMU is instead 
> >> crashing.  It looks like g_file_get_contents is returning 0 with a len == 
> >> 0 if the specified sysfs path does not exist.  The following diff on top 
> >> seems to resolve the issue for me:
> >>
> >>
> >> diff --git a/block/file-posix.c b/block/file-posix.c
> >> index 0ab158efba2..eeb0247c74e 100644
> >> --- a/block/file-posix.c
> >> +++ b/block/file-posix.c
> >> @@ -1243,7 +1243,7 @@ static int get_sysfs_str_val(struct stat *st, const 
> >> char *attribute,
> >>  major(st->st_rdev), minor(st->st_rdev),
> >>  attribute);
> >>  ret = g_file_get_contents(sysfspath, val, &len, NULL);
> >> -if (ret == -1) {
> >> +if (ret == -1 || len == 0) {
> >>  return -ENOENT;
> >>  }
> >>
> >
> > Hi Matthew,
> >
> > Thanks for the information. After some checking, I think the bug here
> > is that g_file_get_contens returns g_boolean value and the error case
> > will return 0 instead of -1 in my previous code. Can the following
> > line fix your issue on the s390x device?
> >
> > + if (ret == FALSE) {
> >
> > https://docs.gtk.org/glib/func.file_get_contents.html
>
> Hi Sam,
>
> Ah, good point, I didn't notice file_get_contents was meant to be a bool 
> return and wondered why I was getting a return of 0 in the failing case, 
> hence the check for len == 0.
>
> Anyway, yes, I verified that checking for ret == FALSE fixes the issue.  
> FWIW, along the same line I also checked that this works:
>
> if (!g_file_get_contents(sysfspath, val, &len, NULL)) {
> return -ENOENT;
> }
>
> which I personally think looks cleaner and matches the other uses of 
> g_file_get_contents in QEMU.  Could also get rid of ret and just return 0 at 
> the bottom of the function.

Indeed. I will fix this. Thanks!

Sam



Re: [PULL v2 02/16] block/file-posix: introduce helper functions for sysfs attributes

2023-06-02 Thread Matthew Rosato
On 6/2/23 2:18 PM, Sam Li wrote:
> Matthew Rosato wrote on Thu, Jun 1, 2023 at 02:21:
>>
>> On 5/15/23 12:04 PM, Stefan Hajnoczi wrote:
>>> From: Sam Li 
>>>
>>> Use get_sysfs_str_val() to get the string value of device
>>> zoned model. Then get_sysfs_zoned_model() can convert it to
>>> BlockZoneModel type of QEMU.
>>>
>>> Use get_sysfs_long_val() to get the long value of zoned device
>>> information.
>>
>> Hi Stefan, Sam,
>>
>> I am having an issue on s390x using virtio-blk-{pci,ccw} backed by an NVMe 
>> partition, and I've bisected the root cause to this commit.
>>
>> I noticed that tests which use the partition e.g. /dev/nvme0n1p1 as a 
>> backing device would fail, but those that use the namespace e.g. 
>> /dev/nvme0n1 would still succeed.  The root issue appears to be that the 
>> block device associated with the partition does not have a "max_segments" 
>> attribute, and prior to this patch hdev_get_max_segment() would return 
>> -ENOENT in this case.  After this patch, however, QEMU is instead crashing.  
>> It looks like g_file_get_contents is returning 0 with a len == 0 if the 
>> specified sysfs path does not exist.  The following diff on top seems to 
>> resolve the issue for me:
>>
>>
>> diff --git a/block/file-posix.c b/block/file-posix.c
>> index 0ab158efba2..eeb0247c74e 100644
>> --- a/block/file-posix.c
>> +++ b/block/file-posix.c
>> @@ -1243,7 +1243,7 @@ static int get_sysfs_str_val(struct stat *st, const 
>> char *attribute,
>>  major(st->st_rdev), minor(st->st_rdev),
>>  attribute);
> >>  ret = g_file_get_contents(sysfspath, val, &len, NULL);
>> -if (ret == -1) {
>> +if (ret == -1 || len == 0) {
>>  return -ENOENT;
>>  }
>>
> 
> Hi Matthew,
> 
> Thanks for the information. After some checking, I think the bug here
> is that g_file_get_contens returns g_boolean value and the error case
> will return 0 instead of -1 in my previous code. Can the following
> line fix your issue on the s390x device?
> 
> + if (ret == FALSE) {
> 
> https://docs.gtk.org/glib/func.file_get_contents.html

Hi Sam,

Ah, good point, I didn't notice file_get_contents was meant to be a bool return 
and wondered why I was getting a return of 0 in the failing case, hence the 
check for len == 0.

Anyway, yes, I verified that checking for ret == FALSE fixes the issue.  FWIW, 
along the same line I also checked that this works:

if (!g_file_get_contents(sysfspath, val, &len, NULL)) {
return -ENOENT;
}

which I personally think looks cleaner and matches the other uses of 
g_file_get_contents in QEMU.  Could also get rid of ret and just return 0 at 
the bottom of the function.
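(For concreteness, a sketch of that cleaned-up tail, reusing the locals the function already
has -- illustrative only:)

        if (!g_file_get_contents(sysfspath, val, &len, NULL)) {
            return -ENOENT;
        }

        /* The file is ended with '\n' */
        p = *val;
        if (*(p + len - 1) == '\n') {
            *(p + len - 1) = '\0';
        }
        return 0;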

Thanks,
Matt





Re: [PULL v2 02/16] block/file-posix: introduce helper functions for sysfs attributes

2023-06-02 Thread Sam Li
Matthew Rosato wrote on Thu, Jun 1, 2023 at 02:21:
>
> On 5/15/23 12:04 PM, Stefan Hajnoczi wrote:
> > From: Sam Li 
> >
> > Use get_sysfs_str_val() to get the string value of device
> > zoned model. Then get_sysfs_zoned_model() can convert it to
> > BlockZoneModel type of QEMU.
> >
> > Use get_sysfs_long_val() to get the long value of zoned device
> > information.
>
> Hi Stefan, Sam,
>
> I am having an issue on s390x using virtio-blk-{pci,ccw} backed by an NVMe 
> partition, and I've bisected the root cause to this commit.
>
> I noticed that tests which use the partition e.g. /dev/nvme0n1p1 as a backing 
> device would fail, but those that use the namespace e.g. /dev/nvme0n1 would 
> still succeed.  The root issue appears to be that the block device associated 
> with the partition does not have a "max_segments" attribute, and prior to 
> this patch hdev_get_max_segment() would return -ENOENT in this case.  After 
> this patch, however, QEMU is instead crashing.  It looks like 
> g_file_get_contents is returning 0 with a len == 0 if the specified sysfs 
> path does not exist.  The following diff on top seems to resolve the issue 
> for me:
>
>
> diff --git a/block/file-posix.c b/block/file-posix.c
> index 0ab158efba2..eeb0247c74e 100644
> --- a/block/file-posix.c
> +++ b/block/file-posix.c
> @@ -1243,7 +1243,7 @@ static int get_sysfs_str_val(struct stat *st, const 
> char *attribute,
>  major(st->st_rdev), minor(st->st_rdev),
>  attribute);
>  ret = g_file_get_contents(sysfspath, val, &len, NULL);
> -if (ret == -1) {
> +if (ret == -1 || len == 0) {
>  return -ENOENT;
>  }
>

Hi Matthew,

Thanks for the information. After some checking, I think the bug here
is that g_file_get_contents returns a gboolean value and the error case
will return 0 instead of -1 in my previous code. Can the following
line fix your issue on the s390x device?

+ if (ret == FALSE) {

https://docs.gtk.org/glib/func.file_get_contents.html

Thanks,
Sam




>
>
>
> >
> > Signed-off-by: Sam Li 
> > Reviewed-by: Hannes Reinecke 
> > Reviewed-by: Stefan Hajnoczi 
> > Reviewed-by: Damien Le Moal 
> > Reviewed-by: Dmitry Fomichev 
> > Acked-by: Kevin Wolf 
> > Signed-off-by: Stefan Hajnoczi 
> > Message-id: 20230508045533.175575-3-faithilike...@gmail.com
> > Message-id: 20230324090605.28361-3-faithilike...@gmail.com
> > [Adjust commit message prefix as suggested by Philippe Mathieu-Daudé
> > .
> > --Stefan]
> > Signed-off-by: Stefan Hajnoczi 
> > ---
> >  include/block/block_int-common.h |   3 +
> >  block/file-posix.c   | 135 ++-
> >  2 files changed, 100 insertions(+), 38 deletions(-)
> >
> > diff --git a/include/block/block_int-common.h 
> > b/include/block/block_int-common.h
> > index 4909876756..c7ca5a83e9 100644
> > --- a/include/block/block_int-common.h
> > +++ b/include/block/block_int-common.h
> > @@ -862,6 +862,9 @@ typedef struct BlockLimits {
> >   * an explicit monitor command to load the disk inside the guest).
> >   */
> >  bool has_variable_length;
> > +
> > +/* device zone model */
> > +BlockZoneModel zoned;
> >  } BlockLimits;
> >
> >  typedef struct BdrvOpBlocker BdrvOpBlocker;
> > diff --git a/block/file-posix.c b/block/file-posix.c
> > index c7b723368e..97c597a2a0 100644
> > --- a/block/file-posix.c
> > +++ b/block/file-posix.c
> > @@ -1202,15 +1202,89 @@ static int hdev_get_max_hw_transfer(int fd, struct 
> > stat *st)
> >  #endif
> >  }
> >
> > -static int hdev_get_max_segments(int fd, struct stat *st)
> > +/*
> > + * Get a sysfs attribute value as character string.
> > + */
> > +#ifdef CONFIG_LINUX
> > +static int get_sysfs_str_val(struct stat *st, const char *attribute,
> > + char **val) {
> > +g_autofree char *sysfspath = NULL;
> > +int ret;
> > +size_t len;
> > +
> > +if (!S_ISBLK(st->st_mode)) {
> > +return -ENOTSUP;
> > +}
> > +
> > +sysfspath = g_strdup_printf("/sys/dev/block/%u:%u/queue/%s",
> > +major(st->st_rdev), minor(st->st_rdev),
> > +attribute);
> > +ret = g_file_get_contents(sysfspath, val, &len, NULL);
> > +if (ret == -1) {
> > +return -ENOENT;
> > +}
> > +
> > +/* The file is ended with '\n' */
> > +char *p;
> > +p = *val;
> > +if (*(p + len - 1) == '\n') {
> > +*(p + len - 1) = '\0';
> > +}
> > +return ret;
> > +}
> > +#endif
> > +
> > +static int get_sysfs_zoned_model(struct stat *st, BlockZoneModel *zoned)
> >  {
> > +g_autofree char *val = NULL;
> > +int ret;
> > +
> > +ret = get_sysfs_str_val(st, "zoned", &val);
> > +if (ret < 0) {
> > +return ret;
> > +}
> > +
> > +if (strcmp(val, "host-managed") == 0) {
> > +*zoned = BLK_Z_HM;
> > +} else if (strcmp(val, "host-aware") == 0) {
> > +*zoned = BLK_Z_HA;
> > +} else if (strcmp(val, "none") == 0) {
> > 

Re: [PATCH] linux-user: Return EINVAL for getgroups() with negative gidsetsize

2023-06-02 Thread Laurent Vivier

Le 02/06/2023 à 19:48, Peter Maydell a écrit :

Coverity doesn't like the way we might end up calling getgroups()
with a NULL grouplist pointer. This is fine for the special case
of gidsetsize == 0, but we will also do it if the guest passes
us a negative gidsetsize. (CID 1512465)

Explicitly fail the negative gidsetsize with EINVAL, as the kernel
does. This means we definitely only call the libc getgroups()
with valid parameters.

Possibly Coverity may still complain about getgroups(0, NULL), but
that would be a false positive.

Signed-off-by: Peter Maydell 
---
  linux-user/syscall.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 89b58b386b1..29fdfdf18e4 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -11574,7 +11574,7 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int num, abi_long arg1,
  g_autofree gid_t *grouplist = NULL;
  int i;
  
-if (gidsetsize > NGROUPS_MAX) {
+if (gidsetsize > NGROUPS_MAX || gidsetsize < 0) {
  return -TARGET_EINVAL;
  }
  if (gidsetsize > 0) {


Reviewed-by: Laurent Vivier 



Re: [PULL v2 10/16] block: introduce zone append write for zoned devices

2023-06-02 Thread Sam Li
On Sat, 3 Jun 2023 at 01:52, Peter Maydell wrote:
>
> On Fri, 2 Jun 2023 at 18:35, Sam Li  wrote:
> >
> > On Sat, 3 Jun 2023 at 01:30, Peter Maydell wrote:
> > >
> > > On Fri, 2 Jun 2023 at 18:23, Sam Li  wrote:
> > > > Thanks for spotting this. You are right that bs->wps is not checked in
> > > > this code path. I think the get_zones_wp() should handle a NULL
> > > > bs->wps which is the function calling wps directly.
> > > >
> > > > Would you like to submit a patch for this? Or I can do it if you are
> > > > not available.
> > >
> > > I don't know anything about this code, so I'm not really in
> > > a position to write a patch. I'm just passing on the information
> > > from the Coverity scanner -- it scales a lot better that way
> > > than trying to write fixes for everything myself :-)
> >
> > I see. I'll fix it. Wish I had known more about this tool when I was
> > testing this code.
>
> Coverity is a bit awkward because the free online scanner only
> runs on code that's already been committed to QEMU, so it doesn't
> tell us about issues until we've already gone through the
> whole code-review-test cycle. Plus it often complains about
> things that aren't bugs, so you have to be a bit cautious
> about interpreting its reports. But it's still a nice tool
> to have.
>
> The online UI is at https://scan.coverity.com/projects/qemu
> and you can create an account and apply for permission to look
> at the recorded defects if you like.

Good to know. Thanks!

Sam
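
(For reference, the kind of guard discussed in the quoted exchange above might look roughly like the following. This is purely an illustrative sketch with stand-in types: the real get_zones_wp() in block/file-posix.c takes different parameters and refreshes the write pointers from a zone report, all of which is omitted here.)

/* Stand-in types for illustration only -- not QEMU's real definitions. */
#include <stdint.h>
#include <stdio.h>

typedef struct BlockZoneWps {
    uint64_t wp[1];
} BlockZoneWps;

typedef struct BlockDriverState {
    BlockZoneWps *wps;
} BlockDriverState;

static int get_zones_wp(BlockDriverState *bs)
{
    /* The point under discussion: bail out when no write-pointer
     * tracking was set up for this device instead of dereferencing
     * a NULL bs->wps. */
    if (!bs->wps) {
        return 0;
    }
    /* ... refresh bs->wps->wp[] from a zone report here ... */
    return 0;
}

int main(void)
{
    BlockDriverState bs = { .wps = NULL };
    printf("get_zones_wp() with NULL wps returns %d\n", get_zones_wp(&bs));
    return 0;
}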



Re: [PULL v2 10/16] block: introduce zone append write for zoned devices

2023-06-02 Thread Peter Maydell
On Fri, 2 Jun 2023 at 18:35, Sam Li  wrote:
>
> On Sat, 3 Jun 2023 at 01:30, Peter Maydell wrote:
> >
> > On Fri, 2 Jun 2023 at 18:23, Sam Li  wrote:
> > > Thanks for spotting this. You are right that bs->wps is not checked in
> > > this code path. I think the get_zones_wp() should handle a NULL
> > > bs->wps which is the function calling wps directly.
> > >
> > > Would you like to submit a patch for this? Or I can do it if you are
> > > not available.
> >
> > I don't know anything about this code, so I'm not really in
> > a position to write a patch. I'm just passing on the information
> > from the Coverity scanner -- it scales a lot better that way
> > than trying to write fixes for everything myself :-)
>
> I see. I'll fix it. Wish I had known more about this tool when I was
> testing this code.

Coverity is a bit awkward because the free online scanner only
runs on code that's already been committed to QEMU, so it doesn't
tell us about issues until we've already gone through the
whole code-review-test cycle. Plus it often complains about
things that aren't bugs, so you have to be a bit cautious
about interpreting its reports. But it's still a nice tool
to have.

The online UI is at https://scan.coverity.com/projects/qemu
and you can create an account and apply for permission to look
at the recorded defects if you like.

thanks
-- PMM


