Re: [PATCH v3] target/riscv: Add isa extension strings to the device tree

2022-02-25 Thread Frank Chang
Atish Patra wrote on Wed, Feb 23, 2022 at 6:39 AM:

> The Linux kernel parses the ISA extensions from "riscv,isa" DT
> property. It used to parse only the single letter base extensions
> until now. A generic ISA extension parsing framework was proposed[1]
> recently that can parse multi-letter ISA extensions as well.
>
> Generate the extended ISA string by appending the available ISA extensions
> to the "riscv,isa" string if they are enabled, so that the kernel can process it.
>
> [1] https://lkml.org/lkml/2022/2/15/263
>
> Suggested-by: Heiko Stubner 
> Signed-off-by: Atish Patra 
> ---
> Changes from v2->v3:
> 1. Used g_strconcat to replace snprintf & a max isa string length as
> suggested by Anup.
> 2. I have not included the Tested-by Tag from Heiko because the
> implementation changed from v2 to v3.
>
> Changes from v1->v2:
> 1. Improved the code readability by using arrays instead of individual checks
> ---
>  target/riscv/cpu.c | 29 +
>  1 file changed, 29 insertions(+)
>
> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> index b0a40b83e7a8..2c7ff6ef555a 100644
> --- a/target/riscv/cpu.c
> +++ b/target/riscv/cpu.c
> @@ -34,6 +34,12 @@
>
>  /* RISC-V CPU definitions */
>
> +/* This includes the null terminated character '\0' */
> +struct isa_ext_data {
> +const char *name;
> +bool enabled;
> +};
> +
>  static const char riscv_exts[26] = "IEMAFDQCLBJTPVNSUHKORWXYZG";
>
>  const char * const riscv_int_regnames[] = {
> @@ -881,6 +887,28 @@ static void riscv_cpu_class_init(ObjectClass *c, void
> *data)
>  device_class_set_props(dc, riscv_cpu_properties);
>  }
>
> +static void riscv_isa_string_ext(RISCVCPU *cpu, char **isa_str, int
> max_str_len)
> +{
> +char *old = *isa_str;
> +char *new = *isa_str;
> +int i;
> +struct isa_ext_data isa_edata_arr[] = {
> +{ "svpbmt", cpu->cfg.ext_svpbmt   },
> +{ "svinval", cpu->cfg.ext_svinval },
> +{ "svnapot", cpu->cfg.ext_svnapot },
>

We still have other sub-extensions, e.g. Zfh, Zba, Zbb, Zbc, Zbs... etc.
Do you mind adding them as well?

Also, I think the order of ISA strings should be alphabetical as described:
https://github.com/riscv/riscv-isa-manual/blob/master/src/naming.tex#L96

Regards,
Frank Chang


> +};
> +
> +for (i = 0; i < ARRAY_SIZE(isa_edata_arr); i++) {
> +if (isa_edata_arr[i].enabled) {
> +new = g_strconcat(old, "_", isa_edata_arr[i].name, NULL);
> +g_free(old);
> +old = new;
> +}
> +}
> +
> +*isa_str = new;
> +}
> +
>  char *riscv_isa_string(RISCVCPU *cpu)
>  {
>  int i;
> @@ -893,6 +921,7 @@ char *riscv_isa_string(RISCVCPU *cpu)
>  }
>  }
>  *p = '\0';
> +riscv_isa_string_ext(cpu, &isa_str, maxlen);
>  return isa_str;
>  }
>
> --
> 2.30.2
>
>
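
For illustration only, the appending scheme from the patch combined with the
ordering requested in the review could look like the stand-alone sketch below.
It uses plain libc instead of GLib, the names isa_ext and
isa_string_append_exts are hypothetical, and plain strcmp() ordering is a
simplification of the naming rules in the ISA manual, not a full
implementation of them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the patch's struct isa_ext_data. */
struct isa_ext {
    const char *name;
    bool enabled;
};

/* qsort() comparator: order entries by extension name. */
static int isa_ext_cmp(const void *a, const void *b)
{
    const struct isa_ext *x = a;
    const struct isa_ext *y = b;

    return strcmp(x->name, y->name);
}

/*
 * Return a newly allocated string holding base followed by "_<name>" for
 * every enabled extension, in strcmp() order. The exts array is sorted in
 * place. The caller frees the result; NULL is returned on allocation failure.
 */
static char *isa_string_append_exts(const char *base, struct isa_ext *exts,
                                    size_t n)
{
    size_t len;
    size_t i;
    char *out;

    assert(base != NULL);
    len = strlen(base) + 1; /* base plus the terminating '\0' */

    qsort(exts, n, sizeof(exts[0]), isa_ext_cmp);

    /* First pass: compute the final length so one allocation is enough. */
    for (i = 0; i < n; i++) {
        if (exts[i].enabled) {
            len += 1 + strlen(exts[i].name); /* '_' plus the name */
        }
    }

    out = malloc(len);
    if (out == NULL) {
        return NULL;
    }

    /* Second pass: build the string. */
    strcpy(out, base);
    for (i = 0; i < n; i++) {
        if (exts[i].enabled) {
            strcat(out, "_");
            strcat(out, exts[i].name);
        }
    }
    return out;
}
```

Computing the length up front avoids the repeated allocate-and-free cycle of
the g_strconcat() loop, at the cost of walking the array twice.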


Re: [PATCH 6/8] char: move qemu_openpty_raw from util/ to char/

2022-02-25 Thread Paolo Bonzini

On 2/24/22 18:04, Marc-André Lureau wrote:

Paolo,

This patch is ok, but in some (new?) circumstances it fails with freebsd 
and reveals that -lutil was missing for kinfo_getproc() in 
util/oslib-posix.c. Please add:


-util_ss.add(when: 'CONFIG_POSIX', if_true: files('oslib-posix.c'))
+util_ss.add(when: 'CONFIG_POSIX', if_true: [files('oslib-posix.c'), util])

(even better if we made this specific to freebsd I guess, but not 
strictly necessary)


Looking again at the patch (because indeed it broke CI :)), I'm not sure 
it's a good idea.  The code seems to be partly taken from other projects 
and doesn't follow the QEMU coding standards.


Paolo



Re: [PATCH 1/3] util & iothread: Introduce event-loop abstract class

2022-02-25 Thread Paolo Bonzini

On 2/24/22 10:48, Stefan Hajnoczi wrote:

On Mon, Feb 21, 2022 at 06:08:43PM +0100, Nicolas Saenz Julienne wrote:

diff --git a/qom/meson.build b/qom/meson.build
index 062a3789d8..c20e5dd1cb 100644
--- a/qom/meson.build
+++ b/qom/meson.build
@@ -4,6 +4,7 @@ qom_ss.add(files(
'object.c',
'object_interfaces.c',
'qom-qobject.c',
+  '../util/event-loop.c',


This looks strange. I expected util/event-loop.c to be in
util/meson.build and added to the util_ss SourceSet instead of qom_ss.


Or alternatively, to be in the root just like iothread.c.

Paolo


What is the reason for this?


  ))
  
  qmp_ss.add(files('qom-qmp-cmds.c'))

diff --git a/util/event-loop.c b/util/event-loop.c
new file mode 100644
index 00..f3e50909a0
--- /dev/null
+++ b/util/event-loop.c


The naming is a little inconsistent. The filename "event-loop.c" does
not match the QOM type or typedef name event-loop-backend/EventLoopBackend.

I suggest calling the source file event-loop-base.c and the QOM type
"event-loop-base".


@@ -0,0 +1,142 @@
+/*
+ * QEMU event-loop backend
+ *
+ * Copyright (C) 2022 Red Hat Inc
+ *
+ * Authors:
+ *  Nicolas Saenz Julienne 


Most of the code is cut and pasted. It would be nice to carry over the
authorship information too.


+struct EventLoopBackend {
+Object parent;
+
+/* AioContext poll parameters */
+int64_t poll_max_ns;
+int64_t poll_grow;
+int64_t poll_shrink;


These parameters do not affect the main loop because it cannot poll. If
you decide to keep them in the base class, please document that they
have no effect on the main loop so users aren't confused. I would keep
them unique to IOThread for now.





[PATCH v3 4/4] tests/acpi: i386: update FACP table differences

2022-02-25 Thread Liav Albani
After changing the IAPC boot flags register to indicate support of i8042
in the machine chipset to help the guest OS to determine its existence
"faster", we need to have the updated FACP ACPI binary images in tree.

@@ -1,32 +1,32 @@
 /*
  * Intel ACPI Component Architecture
  * AML/ASL+ Disassembler version 20211217 (64-bit version)
  * Copyright (c) 2000 - 2021 Intel Corporation
  *
- * Disassembly of tests/data/acpi/q35/FACP, Wed Feb 23 22:37:39 2022
+ * Disassembly of /tmp/aml-BBFBI1, Wed Feb 23 22:37:39 2022
  *
  * ACPI Data Table [FACP]
  *
  * Format: [HexOffset DecimalOffset ByteLength]  FieldName : FieldValue (in 
hex)
  */

 [000h    4]Signature : "FACP"[Fixed ACPI 
Description Table (FADT)]
 [004h 0004   4] Table Length : 00F4
 [008h 0008   1] Revision : 03
-[009h 0009   1] Checksum : B9
+[009h 0009   1] Checksum : B7
 [00Ah 0010   6]   Oem ID : "BOCHS "
 [010h 0016   8] Oem Table ID : "BXPC"
 [018h 0024   4] Oem Revision : 0001
 [01Ch 0028   4]  Asl Compiler ID : "BXPC"
 [020h 0032   4]Asl Compiler Revision : 0001

 [024h 0036   4] FACS Address : 
 [028h 0040   4] DSDT Address : 
 [02Ch 0044   1]Model : 01
 [02Dh 0045   1]   PM Profile : 00 [Unspecified]
 [02Eh 0046   2]SCI Interrupt : 0009
 [030h 0048   4] SMI Command Port : 00B2
 [034h 0052   1]ACPI Enable Value : 02
 [035h 0053   1]   ACPI Disable Value : 03
 [036h 0054   1]   S4BIOS Command : 00
 [037h 0055   1]  P-State Control : 00
@@ -42,35 +42,35 @@
 [059h 0089   1] PM1 Control Block Length : 02
 [05Ah 0090   1] PM2 Control Block Length : 00
 [05Bh 0091   1]PM Timer Block Length : 04
 [05Ch 0092   1]GPE0 Block Length : 10
 [05Dh 0093   1]GPE1 Block Length : 00
 [05Eh 0094   1] GPE1 Base Offset : 00
 [05Fh 0095   1] _CST Support : 00
 [060h 0096   2]   C2 Latency : 0FFF
 [062h 0098   2]   C3 Latency : 0FFF
 [064h 0100   2]   CPU Cache Size : 
 [066h 0102   2]   Cache Flush Stride : 
 [068h 0104   1]Duty Cycle Offset : 00
 [069h 0105   1] Duty Cycle Width : 00
 [06Ah 0106   1]  RTC Day Alarm Index : 00
 [06Bh 0107   1]RTC Month Alarm Index : 00
 [06Ch 0108   1]RTC Century Index : 32
-[06Dh 0109   2]   Boot Flags (decoded below) : 
+[06Dh 0109   2]   Boot Flags (decoded below) : 0002
Legacy Devices Supported (V2) : 0
-8042 Present on ports 60/64 (V2) : 0
+8042 Present on ports 60/64 (V2) : 1
 VGA Not Present (V4) : 0
   MSI Not Supported (V4) : 0
 PCIe ASPM Not Supported (V4) : 0
CMOS RTC Not Present (V5) : 0
 [06Fh 0111   1] Reserved : 00
 [070h 0112   4]Flags (decoded below) : 84A5
   WBINVD instruction is operational (V1) : 1
   WBINVD flushes all caches (V1) : 0
 All CPUs support C1 (V1) : 1
   C2 works on MP system (V1) : 0
 Control Method Power Button (V1) : 0
 Control Method Sleep Button (V1) : 1
 RTC wake not in fixed reg space (V1) : 0
 RTC can wake system from S4 (V1) : 1
 32-bit PM Timer (V1) : 0
   Docking Supported (V1) : 0
@@ -148,32 +148,32 @@
 [0DCh 0220   1] Space ID : 01 [SystemIO]
 [0DDh 0221   1]Bit Width : 80
 [0DEh 0222   1]   Bit Offset : 00
 [0DFh 0223   1] Encoded Access Width : 00 [Undefined/Legacy]
 [0E0h 0224   8]  Address : 0620

 [0E8h 0232  12]   GPE1 Block : [Generic Address Structure]
 [0E8h 0232   1] Space ID : 00 [SystemMemory]
 [0E9h 0233   1]Bit Width : 00
 [0EAh 0234   1]   Bit Offset : 00
 [0EBh 0235   1] Encoded Access Width : 00 [Undefined/Legacy]
 [0ECh 0236   8]  Address : 

 Raw Table Data: Length 244 (0xF4)

-: 46 41 43 50 F4 00 00 00 03 B9 42 4F 43 48 53 20  // FACP..BOCHS
+: 46 41 43 50 F4 00 00 00 03 B7 42 4F 43 48 53 20  // FACP..BOCHS
 0010: 42 58 50 43 20 20 20 20 01 00 00 00 42 58 50 43  // BXPCBXPC
 0020: 01 00 00 00 00 00 00 00 00 00 00 00 01 00 09 00  // 
 0030: B2 00 00 00 02 03 00 00 00 06 00 00 00 00 00 00  // 
 0040: 04 06 00 00 00 00 00 00 00 00 00 00 08 06 00 00  // 
 0050: 20 06 00 00 00 00 00 00 04 02 00 04 10 00 00 00  //  ...
-0060: FF 0

[PATCH v3 3/4] hw/acpi: add indication for i8042 in IA-PC boot flags of the FADT table

2022-02-25 Thread Liav Albani
This can allow the guest OS to determine more easily if i8042 controller
is present in the system or not, so it doesn't need to do probing of the
controller, but just initialize it immediately, before enumerating the
ACPI AML namespace.

Signed-off-by: Liav Albani 
---
 hw/acpi/aml-build.c | 7 ++-
 hw/i386/acpi-build.c| 8 
 hw/i386/acpi-microvm.c  | 9 +
 include/hw/acpi/acpi-defs.h | 1 +
 4 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index 8966e16320..ef5f4cad87 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -2152,7 +2152,12 @@ void build_fadt(GArray *tbl, BIOSLinker *linker, const 
AcpiFadtData *f,
 build_append_int_noprefix(tbl, 0, 1); /* DAY_ALRM */
 build_append_int_noprefix(tbl, 0, 1); /* MON_ALRM */
 build_append_int_noprefix(tbl, f->rtc_century, 1); /* CENTURY */
-build_append_int_noprefix(tbl, 0, 2); /* IAPC_BOOT_ARCH */
+/* IAPC_BOOT_ARCH */
+if (f->rev == 1) {
+build_append_int_noprefix(tbl, 0, 2);
+} else {
+build_append_int_noprefix(tbl, f->iapc_boot_arch, 2);
+}
 build_append_int_noprefix(tbl, 0, 1); /* Reserved */
 build_append_int_noprefix(tbl, f->flags, 4); /* Flags */
 
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index ebd47aa26f..65dbc1ec36 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -192,6 +192,14 @@ static void init_common_fadt_data(MachineState *ms, Object 
*o,
 .address = object_property_get_uint(o, ACPI_PM_PROP_GPE0_BLK, NULL)
 },
 };
+/*
+ * second bit of 16-bit IAPC_BOOT_ARCH indicates presence of 8042 or
+ * equivalent micro controller. See table 5-10 of ACPI spec version 2.0
+ * (the earliest acpi revision that supports this).
+ */
+
+fadt.iapc_boot_arch = isa_check_device_existence("i8042") ? 0x0002 : 
0x;
+
 *data = fadt;
 }
 
diff --git a/hw/i386/acpi-microvm.c b/hw/i386/acpi-microvm.c
index 68ca7e7fc2..e5f89164be 100644
--- a/hw/i386/acpi-microvm.c
+++ b/hw/i386/acpi-microvm.c
@@ -189,6 +189,15 @@ static void acpi_build_microvm(AcpiBuildTables *tables,
 .reset_val = ACPI_GED_RESET_VALUE,
 };
 
+/*
+ * second bit of 16-bit IAPC_BOOT_ARCH indicates presence of 8042 or
+ * equivalent micro controller. See table 5-10 of ACPI spec version 2.0
+ * (the earliest acpi revision that supports this).
+ */
+
+pmfadt.iapc_boot_arch = isa_check_device_existence("i8042") ? 0x0002
+: 0x;
+
 table_offsets = g_array_new(false, true /* clear */,
 sizeof(uint32_t));
 bios_linker_loader_alloc(tables->linker,
diff --git a/include/hw/acpi/acpi-defs.h b/include/hw/acpi/acpi-defs.h
index c97e8633ad..2b42e4192b 100644
--- a/include/hw/acpi/acpi-defs.h
+++ b/include/hw/acpi/acpi-defs.h
@@ -77,6 +77,7 @@ typedef struct AcpiFadtData {
 uint16_t plvl2_lat;/* P_LVL2_LAT */
 uint16_t plvl3_lat;/* P_LVL3_LAT */
 uint16_t arm_boot_arch;/* ARM_BOOT_ARCH */
+uint16_t iapc_boot_arch;   /* IAPC_BOOT_ARCH */
 uint8_t minor_ver; /* FADT Minor Version */
 
 /*
-- 
2.35.1
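
The value selection done across build_fadt() and the two callers can be
condensed into one small function. This is only a sketch under stated
assumptions: the macro IAPC_BOOT_ARCH_8042 and the function
fadt_iapc_boot_arch are hypothetical names, not QEMU identifiers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 1 of IAPC_BOOT_ARCH: 8042 or equivalent controller present. */
#define IAPC_BOOT_ARCH_8042 (1u << 1) /* hypothetical macro name */

/*
 * Value to store in the 2-byte IAPC_BOOT_ARCH field.
 * Revision 1 tables always get zero, as in the patch; later revisions
 * carry the 8042 bit when the controller exists.
 */
static uint16_t fadt_iapc_boot_arch(uint8_t fadt_rev, bool has_i8042)
{
    assert(fadt_rev >= 1);

    if (fadt_rev == 1) {
        return 0;
    }
    return has_i8042 ? IAPC_BOOT_ARCH_8042 : 0;
}
```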




[PATCH v3 2/4] tests/acpi: i386: allow FACP acpi table changes

2022-02-25 Thread Liav Albani
The FACP table is going to be changed for x86/q35 machines. To be sure
the following changes are not breaking any QEMU test this change follows
step 2 from the bios-tables-test.c guide on changes that affect ACPI
tables.

Signed-off-by: Liav Albani 
---
 tests/qtest/bios-tables-test-allowed-diff.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/tests/qtest/bios-tables-test-allowed-diff.h 
b/tests/qtest/bios-tables-test-allowed-diff.h
index dfb8523c8b..7570e39369 100644
--- a/tests/qtest/bios-tables-test-allowed-diff.h
+++ b/tests/qtest/bios-tables-test-allowed-diff.h
@@ -1 +1,5 @@
 /* List of comma-separated changed AML files to ignore */
+"tests/data/acpi/q35/FACP",
+"tests/data/acpi/q35/FACP.nosmm",
+"tests/data/acpi/q35/FACP.slic",
+"tests/data/acpi/q35/FACP.xapic",
-- 
2.35.1




[PATCH v3 1/4] hw/isa: add function to check for existence of device by its type

2022-02-25 Thread Liav Albani
This function enumerates all attached ISA devices in the machine, and
tries to compare a given device type name to the enumerated devices.
For example, this can help other code to determine if an i8042 controller
exists in the machine.

Signed-off-by: Liav Albani 
---
 hw/isa/isa-bus.c | 23 +++
 include/hw/isa/isa.h |  1 +
 2 files changed, 24 insertions(+)

diff --git a/hw/isa/isa-bus.c b/hw/isa/isa-bus.c
index 6c31398dda..663aa36d29 100644
--- a/hw/isa/isa-bus.c
+++ b/hw/isa/isa-bus.c
@@ -222,6 +222,29 @@ void isa_build_aml(ISABus *bus, Aml *scope)
 }
 }
 
+bool isa_check_device_existence(const char *typename)
+{
+/*
+ * If there's no ISA bus, we know for sure that the checked ISA device type
+ * doesn't exist in the machine.
+ */
+if (isabus == NULL) {
+return false;
+}
+
+BusChild *kid;
+ISADevice *dev;
+
+QTAILQ_FOREACH(kid, &isabus->parent_obj.children, sibling) {
+dev = ISA_DEVICE(kid->child);
+const char *object_type = object_get_typename(OBJECT(dev));
+if (object_type && strcmp(object_type, typename) == 0) {
+return true;
+}
+}
+return false;
+}
+
 static void isabus_dev_print(Monitor *mon, DeviceState *dev, int indent)
 {
 ISADevice *d = ISA_DEVICE(dev);
diff --git a/include/hw/isa/isa.h b/include/hw/isa/isa.h
index d4417b34b6..65f0c7e28c 100644
--- a/include/hw/isa/isa.h
+++ b/include/hw/isa/isa.h
@@ -99,6 +99,7 @@ IsaDma *isa_get_dma(ISABus *bus, int nchan);
 MemoryRegion *isa_address_space(ISADevice *dev);
 MemoryRegion *isa_address_space_io(ISADevice *dev);
 ISADevice *isa_new(const char *name);
+bool isa_check_device_existence(const char *typename);
 ISADevice *isa_try_new(const char *name);
 bool isa_realize_and_unref(ISADevice *dev, ISABus *bus, Error **errp);
 ISADevice *isa_create_simple(ISABus *bus, const char *name);
-- 
2.35.1
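
Stripped of the QOM machinery, the lookup is a linear scan over the bus
children with an early exit when no bus exists. The following stand-alone
model shows that shape; the toy_child and toy_bus types and the function name
are hypothetical and only mimic the structure of the real code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for one child on the bus. */
struct toy_child {
    const char *type_name;
    struct toy_child *next;
};

/* Hypothetical stand-in for the bus; a NULL bus models "no ISA bus". */
struct toy_bus {
    struct toy_child *children;
};

/* Return true if any child of bus has exactly the given type name. */
static bool toy_bus_has_device(const struct toy_bus *bus,
                               const char *type_name)
{
    const struct toy_child *kid;

    assert(type_name != NULL);

    /* Without a bus, no device of this kind can exist. */
    if (bus == NULL) {
        return false;
    }

    for (kid = bus->children; kid != NULL; kid = kid->next) {
        if (kid->type_name && strcmp(kid->type_name, type_name) == 0) {
            return true;
        }
    }
    return false;
}
```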




[PATCH v3 0/4] hw/acpi: add indication for i8042 in IA-PC boot flags of the FADT table

2022-02-25 Thread Liav Albani
This can allow the guest OS to determine more easily if i8042 controller
is present in the system or not, so it doesn't need to do probing of the
controller, but just initialize it immediately, before enumerating the
ACPI AML namespace.

To allow "flexible" indication, I don't hardcode the bit at location 1
as on in the IA-PC boot flags, but try to search for i8042 on the ISA
bus to verify it exists in the system.

Why this is useful you might ask - this patch allows the guest OS to
probe and use the i8042 controller without decoding the ACPI AML blob
at all. For example, as a developer of the SerenityOS kernel, I might
want to allow people to not try to decode the ACPI AML namespace (for
now, we still don't support ACPI AML as it's a work in progress), but
still to not probe for the i8042 but just use it after looking in the
IA-PC boot flags in the ACPI FADT table.
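
A guest-side check along these lines might look like the sketch below. The
offsets (revision at byte 8, boot flags at byte 109, two bytes wide) are the
ones shown in the FACP disassembly in patch 4/4; the function and macro names
are hypothetical, and treating revision 1 tables as "no information" is an
assumption that mirrors the host side writing zero for them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FADT_REVISION_OFFSET   8    /* Revision byte in the table header */
#define FADT_BOOT_FLAGS_OFFSET 109  /* 0x6D, 2 bytes, little-endian */
#define FADT_BOOT_FLAG_8042    (1u << 1)

/*
 * Return true only if the FADT explicitly reports an 8042 controller.
 * A false result means "not reported"; the guest may still fall back
 * to probing the controller.
 */
static bool fadt_reports_i8042(const uint8_t *fadt, size_t len)
{
    uint16_t flags;

    assert(fadt != NULL);

    if (len < FADT_BOOT_FLAGS_OFFSET + 2) {
        return false; /* table too short to contain the field */
    }
    if (fadt[FADT_REVISION_OFFSET] < 2) {
        return false; /* revision 1: field carries no information */
    }

    flags = (uint16_t)(fadt[FADT_BOOT_FLAGS_OFFSET] |
                       (fadt[FADT_BOOT_FLAGS_OFFSET + 1] << 8));
    return (flags & FADT_BOOT_FLAG_8042) != 0;
}
```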

A note about this version of the patch series: I changed the assertion
checking if the ISA bus exists to an if statement, because I can see how
in the future someone might want to run an x86 machine without an ISA bus,
so we should not assert if someone calls the ISA check device existence
function but return FALSE gracefully.
If someone thinks this is wrong, I'm more than happy to discuss and fix
the code :)

Liav Albani (4):
  hw/isa: add function to check for existence of device by its type
  tests/acpi: i386: allow FACP acpi table changes
  hw/acpi: add indication for i8042 in IA-PC boot flags of the FADT
table
  tests/acpi: i386: update FACP table differences

 hw/acpi/aml-build.c|   7 ++-
 hw/i386/acpi-build.c   |   8 
 hw/i386/acpi-microvm.c |   9 +
 hw/isa/isa-bus.c   |  23 +++
 include/hw/acpi/acpi-defs.h|   1 +
 include/hw/isa/isa.h   |   1 +
 tests/data/acpi/q35/FACP   | Bin 244 -> 244 bytes
 tests/data/acpi/q35/FACP.nosmm | Bin 244 -> 244 bytes
 tests/data/acpi/q35/FACP.slic  | Bin 244 -> 244 bytes
 tests/data/acpi/q35/FACP.xapic | Bin 244 -> 244 bytes
 10 files changed, 48 insertions(+), 1 deletion(-)

-- 
2.35.1
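
The FACP files keep their 244-byte size, and the checksum change from B9 to B7
shown in patch 4/4 follows from the ACPI rule that all bytes of a table sum to
zero modulo 256: raising the boot flags byte from 00 to 02 lowers the checksum
byte by 02. A minimal sketch of that arithmetic, with hypothetical helper
names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum of all table bytes modulo 256; a valid ACPI table sums to 0. */
static uint8_t acpi_byte_sum(const uint8_t *table, size_t len)
{
    uint8_t sum = 0;
    size_t i;

    assert(table != NULL);
    for (i = 0; i < len; i++) {
        sum = (uint8_t)(sum + table[i]);
    }
    return sum;
}

/*
 * New checksum byte after a single table byte changes from old_val to
 * new_val, keeping the overall byte sum at zero.
 */
static uint8_t acpi_checksum_after_change(uint8_t checksum, uint8_t old_val,
                                          uint8_t new_val)
{
    return (uint8_t)(checksum - (uint8_t)(new_val - old_val));
}
```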




[PULL v2 6/6] hw/openrisc/openrisc_sim: Add support for initrd loading

2022-02-25 Thread Stafford Horne
The initrd passed via the command line is loaded into memory.  Its
location and size are then added to the device tree so the kernel knows
where to find it.

Signed-off-by: Stafford Horne 
Reviewed-by: Peter Maydell 
---
 hw/openrisc/openrisc_sim.c | 31 +++
 1 file changed, 31 insertions(+)

diff --git a/hw/openrisc/openrisc_sim.c b/hw/openrisc/openrisc_sim.c
index e0e71c0faa..8184caa60b 100644
--- a/hw/openrisc/openrisc_sim.c
+++ b/hw/openrisc/openrisc_sim.c
@@ -315,6 +315,33 @@ static hwaddr openrisc_load_kernel(ram_addr_t ram_size,
 return 0;
 }
 
+static hwaddr openrisc_load_initrd(Or1ksimState *state, const char *filename,
+   hwaddr load_start, uint64_t mem_size)
+{
+void *fdt = state->fdt;
+int size;
+hwaddr start;
+
+/* We put the initrd right after the kernel; page aligned. */
+start = TARGET_PAGE_ALIGN(load_start);
+
+size = load_ramdisk(filename, start, mem_size - start);
+if (size < 0) {
+size = load_image_targphys(filename, start, mem_size - start);
+if (size < 0) {
+error_report("could not load ramdisk '%s'", filename);
+exit(1);
+}
+}
+
+qemu_fdt_setprop_cell(fdt, "/chosen",
+  "linux,initrd-start", start);
+qemu_fdt_setprop_cell(fdt, "/chosen",
+  "linux,initrd-end", start + size);
+
+return start + size;
+}
+
 static uint32_t openrisc_load_fdt(Or1ksimState *state, hwaddr load_start,
   uint64_t mem_size)
 {
@@ -393,6 +420,10 @@ static void openrisc_sim_init(MachineState *machine)
 
 load_addr = openrisc_load_kernel(ram_size, kernel_filename);
 if (load_addr > 0) {
+if (machine->initrd_filename) {
+load_addr = openrisc_load_initrd(state, machine->initrd_filename,
+ load_addr, machine->ram_size);
+}
 boot_info.fdt_addr = openrisc_load_fdt(state, load_addr,
machine->ram_size);
 }
-- 
2.31.1
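
The placement arithmetic in openrisc_load_initrd() can be tried out in
isolation. The sketch below is not the QEMU code: page_align_up, place_initrd
and struct initrd_range are hypothetical names, and the page size is passed in
as a parameter instead of coming from TARGET_PAGE_ALIGN.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to the next multiple of page_size (a power of two). */
static uint64_t page_align_up(uint64_t addr, uint64_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return (addr + page_size - 1) & ~(page_size - 1);
}

/* Values that would go into linux,initrd-start and linux,initrd-end. */
struct initrd_range {
    uint64_t start;
    uint64_t end;
};

/* Place the initrd at the first page boundary at or after kernel_end. */
static struct initrd_range place_initrd(uint64_t kernel_end,
                                        uint64_t initrd_size,
                                        uint64_t page_size)
{
    struct initrd_range r;

    r.start = page_align_up(kernel_end, page_size);
    r.end = r.start + initrd_size;
    return r;
}
```

The end address is also what the patch returns as the next load address, so
the device tree blob is placed after the initrd.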




[PULL v2 5/6] hw/openrisc/openrisc_sim: Add automatic device tree generation

2022-02-25 Thread Stafford Horne
Using the device tree means that qemu can now directly tell
the kernel what hardware is configured rather than us having
to maintain and update a separate device tree file.

This patch adds automatic device tree generation support for the
OpenRISC simulator.  A device tree is built up based on the state of the
configured OpenRISC simulator.

This is then dumped to memory and the load address is passed to the
kernel in register r3.

Signed-off-by: Stafford Horne 
Reviewed-by: Peter Maydell 
---
Since v1:
 - Added fdt to CONFIG_OR1K_SIM source set

 configs/targets/or1k-softmmu.mak |   1 +
 hw/openrisc/meson.build  |   2 +-
 hw/openrisc/openrisc_sim.c   | 189 ---
 3 files changed, 176 insertions(+), 16 deletions(-)

diff --git a/configs/targets/or1k-softmmu.mak b/configs/targets/or1k-softmmu.mak
index 1dfb93e46d..9e1d4a1fb1 100644
--- a/configs/targets/or1k-softmmu.mak
+++ b/configs/targets/or1k-softmmu.mak
@@ -1,2 +1,3 @@
 TARGET_ARCH=openrisc
 TARGET_WORDS_BIGENDIAN=y
+TARGET_NEED_FDT=y
diff --git a/hw/openrisc/meson.build b/hw/openrisc/meson.build
index 947f63ee08..ec48172c9d 100644
--- a/hw/openrisc/meson.build
+++ b/hw/openrisc/meson.build
@@ -1,5 +1,5 @@
 openrisc_ss = ss.source_set()
 openrisc_ss.add(files('cputimer.c'))
-openrisc_ss.add(when: 'CONFIG_OR1K_SIM', if_true: files('openrisc_sim.c'))
+openrisc_ss.add(when: 'CONFIG_OR1K_SIM', if_true: [files('openrisc_sim.c'), 
fdt])
 
 hw_arch += {'openrisc': openrisc_ss}
diff --git a/hw/openrisc/openrisc_sim.c b/hw/openrisc/openrisc_sim.c
index 8cfb92bec6..e0e71c0faa 100644
--- a/hw/openrisc/openrisc_sim.c
+++ b/hw/openrisc/openrisc_sim.c
@@ -29,15 +29,20 @@
 #include "net/net.h"
 #include "hw/loader.h"
 #include "hw/qdev-properties.h"
+#include "exec/address-spaces.h"
+#include "sysemu/device_tree.h"
 #include "sysemu/sysemu.h"
 #include "hw/sysbus.h"
 #include "sysemu/qtest.h"
 #include "sysemu/reset.h"
 #include "hw/core/split-irq.h"
 
+#include 
+
 #define KERNEL_LOAD_ADDR 0x100
 
 #define OR1KSIM_CPUS_MAX 4
+#define OR1KSIM_CLK_MHZ 2000
 
 #define TYPE_OR1KSIM_MACHINE MACHINE_TYPE_NAME("or1k-sim")
 #define OR1KSIM_MACHINE(obj) \
@@ -48,6 +53,8 @@ typedef struct Or1ksimState {
 MachineState parent_obj;
 
 /*< public >*/
+void *fdt;
+int fdt_size;
 
 } Or1ksimState;
 
@@ -76,6 +83,7 @@ static const struct MemmapEntry {
 
 static struct openrisc_boot_info {
 uint32_t bootstrap_pc;
+uint32_t fdt_addr;
 } boot_info;
 
 static void main_cpu_reset(void *opaque)
@@ -86,6 +94,7 @@ static void main_cpu_reset(void *opaque)
 cpu_reset(CPU(cpu));
 
 cpu_set_pc(cs, boot_info.bootstrap_pc);
+cpu_set_gpr(&cpu->env, 3, boot_info.fdt_addr);
 }
 
 static qemu_irq get_cpu_irq(OpenRISCCPU *cpus[], int cpunum, int irq_pin)
@@ -93,12 +102,77 @@ static qemu_irq get_cpu_irq(OpenRISCCPU *cpus[], int 
cpunum, int irq_pin)
 return qdev_get_gpio_in_named(DEVICE(cpus[cpunum]), "IRQ", irq_pin);
 }
 
-static void openrisc_sim_net_init(hwaddr base, hwaddr descriptors,
+static void openrisc_create_fdt(Or1ksimState *state,
+const struct MemmapEntry *memmap,
+int num_cpus, uint64_t mem_size,
+const char *cmdline)
+{
+void *fdt;
+int cpu;
+char *nodename;
+int pic_ph;
+
+fdt = state->fdt = create_device_tree(&state->fdt_size);
+if (!fdt) {
+error_report("create_device_tree() failed");
+exit(1);
+}
+
+qemu_fdt_setprop_string(fdt, "/", "compatible", "opencores,or1ksim");
+qemu_fdt_setprop_cell(fdt, "/", "#address-cells", 0x1);
+qemu_fdt_setprop_cell(fdt, "/", "#size-cells", 0x1);
+
+nodename = g_strdup_printf("/memory@%" HWADDR_PRIx,
+   memmap[OR1KSIM_DRAM].base);
+qemu_fdt_add_subnode(fdt, nodename);
+qemu_fdt_setprop_cells(fdt, nodename, "reg",
+   memmap[OR1KSIM_DRAM].base, mem_size);
+qemu_fdt_setprop_string(fdt, nodename, "device_type", "memory");
+g_free(nodename);
+
+qemu_fdt_add_subnode(fdt, "/cpus");
+qemu_fdt_setprop_cell(fdt, "/cpus", "#size-cells", 0x0);
+qemu_fdt_setprop_cell(fdt, "/cpus", "#address-cells", 0x1);
+
+for (cpu = 0; cpu < num_cpus; cpu++) {
+nodename = g_strdup_printf("/cpus/cpu@%d", cpu);
+qemu_fdt_add_subnode(fdt, nodename);
+qemu_fdt_setprop_string(fdt, nodename, "compatible",
+"opencores,or1200-rtlsvn481");
+qemu_fdt_setprop_cell(fdt, nodename, "reg", cpu);
+qemu_fdt_setprop_cell(fdt, nodename, "clock-frequency",
+  OR1KSIM_CLK_MHZ);
+g_free(nodename);
+}
+
+nodename = (char *)"/pic";
+qemu_fdt_add_subnode(fdt, nodename);
+pic_ph = qemu_fdt_alloc_phandle(fdt);
+qemu_fdt_setprop_string(fdt, nodename, "compatible",
+"opencores,or1k-pic-level");
+qemu_fdt_setprop_cell(fdt, nodename, "#

[PULL v2 3/6] hw/openrisc/openrisc_sim: Use IRQ splitter when connecting UART

2022-02-25 Thread Stafford Horne
Currently the OpenRISC SMP configuration only supports 2 cores due to
the UART IRQ routing being limited to 2 cores.  As was done in commit
1eeffbeb11 ("hw/openrisc/openrisc_sim: Use IRQ splitter when connecting
IRQ to multiple CPUs") we can use a splitter to wire more than 2 CPUs.

This patch moves serial initialization out to its own function and
uses a splitter to connect multiple CPU irq lines to the UART.

Signed-off-by: Stafford Horne 
Reviewed-by: Peter Maydell 
Reviewed-by: Philippe Mathieu-Daudé 
---
 hw/openrisc/openrisc_sim.c | 32 
 1 file changed, 24 insertions(+), 8 deletions(-)

diff --git a/hw/openrisc/openrisc_sim.c b/hw/openrisc/openrisc_sim.c
index d12b3e0c5e..5bfbac00f8 100644
--- a/hw/openrisc/openrisc_sim.c
+++ b/hw/openrisc/openrisc_sim.c
@@ -137,6 +137,28 @@ static void openrisc_sim_ompic_init(hwaddr base, int 
num_cpus,
 sysbus_mmio_map(s, 0, base);
 }
 
+static void openrisc_sim_serial_init(hwaddr base, int num_cpus,
+ OpenRISCCPU *cpus[], int irq_pin)
+{
+qemu_irq serial_irq;
+int i;
+
+if (num_cpus > 1) {
+DeviceState *splitter = qdev_new(TYPE_SPLIT_IRQ);
+qdev_prop_set_uint32(splitter, "num-lines", num_cpus);
+qdev_realize_and_unref(splitter, NULL, &error_fatal);
+for (i = 0; i < num_cpus; i++) {
+qdev_connect_gpio_out(splitter, i, get_cpu_irq(cpus, i, irq_pin));
+}
+serial_irq = qdev_get_gpio_in(splitter, 0);
+} else {
+serial_irq = get_cpu_irq(cpus, 0, irq_pin);
+}
+serial_mm_init(get_system_memory(), base, 0, serial_irq, 115200,
+   serial_hd(0), DEVICE_NATIVE_ENDIAN);
+}
+
+
 static void openrisc_load_kernel(ram_addr_t ram_size,
  const char *kernel_filename)
 {
@@ -177,7 +199,6 @@ static void openrisc_sim_init(MachineState *machine)
 const char *kernel_filename = machine->kernel_filename;
 OpenRISCCPU *cpus[2] = {};
 MemoryRegion *ram;
-qemu_irq serial_irq;
 int n;
 unsigned int smp_cpus = machine->smp.cpus;
 
@@ -208,15 +229,10 @@ static void openrisc_sim_init(MachineState *machine)
 if (smp_cpus > 1) {
 openrisc_sim_ompic_init(or1ksim_memmap[OR1KSIM_OMPIC].base, smp_cpus,
 cpus, OR1KSIM_OMPIC_IRQ);
-
-serial_irq = qemu_irq_split(get_cpu_irq(cpus, 0, OR1KSIM_UART_IRQ),
-get_cpu_irq(cpus, 1, OR1KSIM_UART_IRQ));
-} else {
-serial_irq = get_cpu_irq(cpus, 0, OR1KSIM_UART_IRQ);
 }
 
-serial_mm_init(get_system_memory(), or1ksim_memmap[OR1KSIM_UART].base, 0,
-   serial_irq, 115200, serial_hd(0), DEVICE_NATIVE_ENDIAN);
+openrisc_sim_serial_init(or1ksim_memmap[OR1KSIM_UART].base, smp_cpus, cpus,
+ OR1KSIM_UART_IRQ);
 
 openrisc_load_kernel(ram_size, kernel_filename);
 }
-- 
2.31.1
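
Conceptually, the splitter takes one input line and drives the same level on
every connected output, which is why it scales past the two outputs that
qemu_irq_split() could handle. The toy model below illustrates only that
fan-out idea; toy_splitter and toy_splitter_set are hypothetical and do not
use the QEMU qdev API.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_LINES 4

/* Hypothetical model: each output points at one CPU's IRQ pin level. */
struct toy_splitter {
    int *outputs[TOY_MAX_LINES];
    size_t num_lines;
};

/* Drive the given level on every connected output line. */
static void toy_splitter_set(struct toy_splitter *s, int level)
{
    size_t i;

    assert(s != NULL && s->num_lines <= TOY_MAX_LINES);
    for (i = 0; i < s->num_lines; i++) {
        *s->outputs[i] = level;
    }
}
```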




[PULL v2 2/6] hw/openrisc/openrisc_sim: Parameterize initialization

2022-02-25 Thread Stafford Horne
Move magic numbers to variables and enums. These will be reused for
upcoming fdt initialization.

Signed-off-by: Stafford Horne 
Reviewed-by: Philippe Mathieu-Daudé 
---
 hw/openrisc/openrisc_sim.c | 42 ++
 1 file changed, 34 insertions(+), 8 deletions(-)

diff --git a/hw/openrisc/openrisc_sim.c b/hw/openrisc/openrisc_sim.c
index 26d2370e60..d12b3e0c5e 100644
--- a/hw/openrisc/openrisc_sim.c
+++ b/hw/openrisc/openrisc_sim.c
@@ -49,6 +49,29 @@ typedef struct Or1ksimState {
 
 } Or1ksimState;
 
+enum {
+OR1KSIM_DRAM,
+OR1KSIM_UART,
+OR1KSIM_ETHOC,
+OR1KSIM_OMPIC,
+};
+
+enum {
+OR1KSIM_OMPIC_IRQ = 1,
+OR1KSIM_UART_IRQ = 2,
+OR1KSIM_ETHOC_IRQ = 4,
+};
+
+static const struct MemmapEntry {
+hwaddr base;
+hwaddr size;
+} or1ksim_memmap[] = {
+[OR1KSIM_DRAM] =  { 0x,  0 },
+[OR1KSIM_UART] =  { 0x9000,  0x100 },
+[OR1KSIM_ETHOC] = { 0x9200,  0x800 },
+[OR1KSIM_OMPIC] = { 0x9800, 16 },
+};
+
 static struct openrisc_boot_info {
 uint32_t bootstrap_pc;
 } boot_info;
@@ -176,21 +199,24 @@ static void openrisc_sim_init(MachineState *machine)
 memory_region_add_subregion(get_system_memory(), 0, ram);
 
 if (nd_table[0].used) {
-openrisc_sim_net_init(0x9200, 0x92000400, smp_cpus,
-  cpus, 4, nd_table);
+openrisc_sim_net_init(or1ksim_memmap[OR1KSIM_ETHOC].base,
+  or1ksim_memmap[OR1KSIM_ETHOC].base + 0x400,
+  smp_cpus, cpus,
+  OR1KSIM_ETHOC_IRQ, nd_table);
 }
 
 if (smp_cpus > 1) {
-openrisc_sim_ompic_init(0x9800, smp_cpus, cpus, 1);
+openrisc_sim_ompic_init(or1ksim_memmap[OR1KSIM_OMPIC].base, smp_cpus,
+cpus, OR1KSIM_OMPIC_IRQ);
 
-serial_irq = qemu_irq_split(get_cpu_irq(cpus, 0, 2),
-get_cpu_irq(cpus, 1, 2));
+serial_irq = qemu_irq_split(get_cpu_irq(cpus, 0, OR1KSIM_UART_IRQ),
+get_cpu_irq(cpus, 1, OR1KSIM_UART_IRQ));
 } else {
-serial_irq = get_cpu_irq(cpus, 0, 2);
+serial_irq = get_cpu_irq(cpus, 0, OR1KSIM_UART_IRQ);
 }
 
-serial_mm_init(get_system_memory(), 0x9000, 0, serial_irq,
-   115200, serial_hd(0), DEVICE_NATIVE_ENDIAN);
+serial_mm_init(get_system_memory(), or1ksim_memmap[OR1KSIM_UART].base, 0,
+   serial_irq, 115200, serial_hd(0), DEVICE_NATIVE_ENDIAN);
 
 openrisc_load_kernel(ram_size, kernel_filename);
 }
-- 
2.31.1




[PULL v2 4/6] hw/openrisc/openrisc_sim: Increase max_cpus to 4

2022-02-25 Thread Stafford Horne
Now that we no longer have a limit of 2 CPUs due to fixing the
IRQ routing issues we can increase the max.  Here we increase
the limit to 4.  We could go higher, but currently OMPIC has a
limit of 4, so we align with that.

Signed-off-by: Stafford Horne 
Reviewed-by: Peter Maydell 
Reviewed-by: Philippe Mathieu-Daudé 
---
 hw/openrisc/openrisc_sim.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/hw/openrisc/openrisc_sim.c b/hw/openrisc/openrisc_sim.c
index 5bfbac00f8..8cfb92bec6 100644
--- a/hw/openrisc/openrisc_sim.c
+++ b/hw/openrisc/openrisc_sim.c
@@ -37,6 +37,8 @@
 
 #define KERNEL_LOAD_ADDR 0x100
 
+#define OR1KSIM_CPUS_MAX 4
+
 #define TYPE_OR1KSIM_MACHINE MACHINE_TYPE_NAME("or1k-sim")
 #define OR1KSIM_MACHINE(obj) \
 OBJECT_CHECK(Or1ksimState, (obj), TYPE_OR1KSIM_MACHINE)
@@ -197,12 +199,12 @@ static void openrisc_sim_init(MachineState *machine)
 {
 ram_addr_t ram_size = machine->ram_size;
 const char *kernel_filename = machine->kernel_filename;
-OpenRISCCPU *cpus[2] = {};
+OpenRISCCPU *cpus[OR1KSIM_CPUS_MAX] = {};
 MemoryRegion *ram;
 int n;
 unsigned int smp_cpus = machine->smp.cpus;
 
-assert(smp_cpus >= 1 && smp_cpus <= 2);
+assert(smp_cpus >= 1 && smp_cpus <= OR1KSIM_CPUS_MAX);
 for (n = 0; n < smp_cpus; n++) {
 cpus[n] = OPENRISC_CPU(cpu_create(machine->cpu_type));
 if (cpus[n] == NULL) {
@@ -243,7 +245,7 @@ static void openrisc_sim_machine_init(ObjectClass *oc, void 
*data)
 
 mc->desc = "or1k simulation";
 mc->init = openrisc_sim_init;
-mc->max_cpus = 2;
+mc->max_cpus = OR1KSIM_CPUS_MAX;
 mc->is_default = true;
 mc->default_cpu_type = OPENRISC_CPU_TYPE_NAME("or1200");
 }
-- 
2.31.1




[PULL v2 1/6] hw/openrisc/openrisc_sim: Create machine state for or1ksim

2022-02-25 Thread Stafford Horne
This will allow us to attach machine state attributes like
the device tree fdt.

Signed-off-by: Stafford Horne 
Reviewed-by: Philippe Mathieu-Daudé 
---
 hw/openrisc/openrisc_sim.c | 30 --
 1 file changed, 28 insertions(+), 2 deletions(-)

diff --git a/hw/openrisc/openrisc_sim.c b/hw/openrisc/openrisc_sim.c
index 73fe383c2d..26d2370e60 100644
--- a/hw/openrisc/openrisc_sim.c
+++ b/hw/openrisc/openrisc_sim.c
@@ -37,6 +37,18 @@
 
 #define KERNEL_LOAD_ADDR 0x100
 
+#define TYPE_OR1KSIM_MACHINE MACHINE_TYPE_NAME("or1k-sim")
+#define OR1KSIM_MACHINE(obj) \
+OBJECT_CHECK(Or1ksimState, (obj), TYPE_OR1KSIM_MACHINE)
+
+typedef struct Or1ksimState {
+/*< private >*/
+MachineState parent_obj;
+
+/*< public >*/
+
+} Or1ksimState;
+
 static struct openrisc_boot_info {
 uint32_t bootstrap_pc;
 } boot_info;
@@ -183,8 +195,10 @@ static void openrisc_sim_init(MachineState *machine)
 openrisc_load_kernel(ram_size, kernel_filename);
 }
 
-static void openrisc_sim_machine_init(MachineClass *mc)
+static void openrisc_sim_machine_init(ObjectClass *oc, void *data)
 {
+MachineClass *mc = MACHINE_CLASS(oc);
+
 mc->desc = "or1k simulation";
 mc->init = openrisc_sim_init;
 mc->max_cpus = 2;
@@ -192,4 +206,16 @@ static void openrisc_sim_machine_init(MachineClass *mc)
 mc->default_cpu_type = OPENRISC_CPU_TYPE_NAME("or1200");
 }
 
-DEFINE_MACHINE("or1k-sim", openrisc_sim_machine_init)
+static const TypeInfo or1ksim_machine_typeinfo = {
+.name   = TYPE_OR1KSIM_MACHINE,
+.parent = TYPE_MACHINE,
+.class_init = openrisc_sim_machine_init,
+.instance_size = sizeof(Or1ksimState),
+};
+
+static void or1ksim_machine_init_register_types(void)
+{
+type_register_static(&or1ksim_machine_typeinfo);
+}
+
+type_init(or1ksim_machine_init_register_types)
-- 
2.31.1




[PULL v2 0/6] OpenRISC DTS Generation patches for 7.0

2022-02-25 Thread Stafford Horne
The following changes since commit 4aa2e497a98bafe962e72997f67a369e4b52d9c1:

  Merge remote-tracking branch 
'remotes/berrange-gitlab/tags/misc-next-pull-request' into staging (2022-02-23 
09:25:05 +)

are available in the Git repository at:

  git://github.com/stffrdhrn/qemu.git tags/or1k-pull-request

for you to fetch changes up to 9576abf28280499a4497f39c2fae55bf97285e94:

  hw/openrisc/openrisc_sim: Add support for initrd loading (2022-02-26 10:39:36 
+0900)


OpenRISC patches

 - Add automatic DTS generation to openrisc_sim



Since v1:
 - Added fdt file include into meson.build
 - I couldn't figure out how to run CI easily, but I think this is the right
   fix.

Stafford Horne (6):
  hw/openrisc/openrisc_sim: Create machine state for or1ksim
  hw/openrisc/openrisc_sim: Parameterize initialization
  hw/openrisc/openrisc_sim: Use IRQ splitter when connecting UART
  hw/openrisc/openrisc_sim: Increase max_cpus to 4
  hw/openrisc/openrisc_sim: Add automatic device tree generation
  hw/openrisc/openrisc_sim: Add support for initrd loading

 configs/targets/or1k-softmmu.mak |   1 +
 hw/openrisc/meson.build  |   2 +-
 hw/openrisc/openrisc_sim.c   | 308 ---
 3 files changed, 286 insertions(+), 25 deletions(-)



[PATCH] tcg/tci: Use tcg_out_ldst in tcg_out_st

2022-02-25 Thread Richard Henderson
The tcg_out_ldst helper will handle out-of-range offsets.
We haven't actually encountered any, since we haven't run
across the assert within tcg_out_op_rrs, but an out-of-range
offset would not be impossible in future.

Fixes: 65089889183 ("tcg/tci: Change encoding to uint32_t units")
Signed-off-by: Richard Henderson 
---
 tcg/tci/tcg-target.c.inc | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/tcg/tci/tcg-target.c.inc b/tcg/tci/tcg-target.c.inc
index 0cb16aaa81..9ff1fa0832 100644
--- a/tcg/tci/tcg-target.c.inc
+++ b/tcg/tci/tcg-target.c.inc
@@ -790,14 +790,13 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
 static void tcg_out_st(TCGContext *s, TCGType type, TCGReg val, TCGReg base,
intptr_t offset)
 {
-stack_bounds_check(base, offset);
 switch (type) {
 case TCG_TYPE_I32:
-tcg_out_op_rrs(s, INDEX_op_st_i32, val, base, offset);
+tcg_out_ldst(s, INDEX_op_st_i32, val, base, offset);
 break;
 #if TCG_TARGET_REG_BITS == 64
 case TCG_TYPE_I64:
-tcg_out_op_rrs(s, INDEX_op_st_i64, val, base, offset);
+tcg_out_ldst(s, INDEX_op_st_i64, val, base, offset);
 break;
 #endif
 default:
-- 
2.25.1
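
A minimal sketch of the fallback described in the commit message above, written in Python rather than the C of the patch. It models a store helper that builds the address in a scratch register when the offset does not fit the instruction's immediate field. The signed 16-bit field width, the `TMP` register name and the tuple-based instruction encoding are assumptions made for illustration; they are not taken from tcg/tci.

```python
# Hypothetical model: offsets must fit a signed 16-bit immediate (assumption).
OFFSET_BITS = 16
TMP = "tmp"  # hypothetical scratch register name


def fits_signed(value, bits=OFFSET_BITS):
    """Return True if value is representable as a signed `bits`-bit integer."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))


def emit_store(val, base, offset):
    """Return the modelled instruction list for storing val at base+offset."""
    ops = []
    if not fits_signed(offset):
        # Out of range: compute the full address in the scratch register
        # and store with a zero offset instead.
        ops.append(("movi", TMP, offset))
        ops.append(("add", TMP, TMP, base))
        base, offset = TMP, 0
    ops.append(("st", val, base, offset))
    return ops
```

In this model an in-range offset yields a single store, while an out-of-range offset costs two extra instructions but never trips an encoding assert.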




Re: [PULL 0/6] OpenRISC DTS Generation patches for 7.0

2022-02-25 Thread Stafford Horne
On Fri, Feb 25, 2022 at 01:52:52PM +, Peter Maydell wrote:
> On Fri, 25 Feb 2022 at 09:19, Stafford Horne  wrote:
> >
> > The following changes since commit 4aa2e497a98bafe962e72997f67a369e4b52d9c1:
> >
> >   Merge remote-tracking branch 
> > 'remotes/berrange-gitlab/tags/misc-next-pull-request' into staging 
> > (2022-02-23 09:25:05 +)
> >
> > are available in the Git repository at:
> >
> >   git://github.com/stffrdhrn/qemu.git tags/or1k-pull-request
> >
> > for you to fetch changes up to 94c71f14e9ca15ede4172e0826d690b15069a7f8:
> >
> >   hw/openrisc/openrisc_sim: Add support for initrd loading (2022-02-25 
> > 15:42:23 +0900)
> >
> > 
> > OpenRISC patches
> >
> >  - Add automatic DTS generation to openrisc_sim
> >
> > 
> > Stafford Horne (6):
> >   hw/openrisc/openrisc_sim: Create machine state for or1ksim
> >   hw/openrisc/openrisc_sim: Parameterize initialization
> >   hw/openrisc/openrisc_sim: Use IRQ splitter when connecting UART
> >   hw/openrisc/openrisc_sim: Increase max_cpus to 4
> >   hw/openrisc/openrisc_sim: Add automatic device tree generation
> >   hw/openrisc/openrisc_sim: Add support for initrd loading
> 
> Hi; this fails to build on various CI configs, eg:
> https://gitlab.com/qemu-project/qemu/-/jobs/2137393314
> https://gitlab.com/qemu-project/qemu/-/jobs/2137393335
> 
> ../hw/openrisc/openrisc_sim.c:40:10: fatal error: libfdt.h: No such
> file or directory
> 40 | #include 
> | ^~
> 
> 
> This happens because meson doesn't put the include path for libfdt
> on the include path for every .c file -- you have to do something
> special in the meson.build file for the files that include it.
> Paolo can tell you what that is, I expect.

OK, I missed the CI results as it was all working for me.  I will fix this and
test with the same configs as CI.

-Stafford

> Paolo: are we going to be able to stop doing this at some point
> and get meson to just DTRT and put includes on the path for
> every C file ?
> 
> thanks
> -- PMM



Re: [PATCH v3 00/11] blockdev-replace

2022-02-25 Thread Vladimir Sementsov-Ogievskiy

26.02.2022 02:42, Vladimir Sementsov-Ogievskiy wrote:

Hi all!

Finally, here is a proposal for a new interface for filter insertion, which
provides a generic way of inserting filters between different block graph
nodes, such as BDS nodes, block exports and block devices.

v3: - add transaction support
 - add a test that shows transactional filter insertion in different
   cases
 - drop the RFC mark. I think it is now close to a good solution, and
   there were no comments on the "RFC v2" version anyway :)  Still, I
   want to keep the x- prefix for now, because there were too many
   different ideas on this topic.


Oh, I forgot to mention that this is based on the recent "[PATCH 0/2]
blockdev-add transaction" series:
Based-on: <20220224171328.1628047-1-vsement...@virtuozzo.com>

--
Best regards,
Vladimir



[PATCH v3 10/11] iotests.py: add VM.qmp_check() helper

2022-02-25 Thread Vladimir Sementsov-Ogievskiy
I'm tired of seeing the "call qmp() and assert that the result is
{'return': {}}" pattern everywhere. Let's add a helper.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 tests/qemu-iotests/iotests.py | 4 
 1 file changed, 4 insertions(+)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 1b48c5b9c9..dd33970454 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -977,6 +977,10 @@ def get_block_graph(self):
 def assert_edges_list(self, edges):
 assert sorted(edges) == sorted(self.get_block_graph())
 
+def qmp_check(self, *args, **kwargs):
+result = self.qmp(*args, **kwargs)
+assert result == {'return': {}}
+
 def assert_block_path(self, root, path, expected_node, graph=None):
 """
 Check whether the node under the given path in the block graph
-- 
2.31.1
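
A small self-contained sketch of how the helper above behaves. `FakeVM` is a hypothetical stand-in for `iotests.VM` with canned replies, so the example runs without QEMU; only the body of `qmp_check()` is taken from the patch.

```python
class FakeVM:
    """Hypothetical stand-in for iotests.VM; replies are canned."""

    def __init__(self, replies):
        self.replies = dict(replies)

    def qmp(self, cmd, *args, **kwargs):
        # Unknown commands get an error object instead of a 'return' key.
        return self.replies.get(cmd, {'error': {'class': 'CommandNotFound'}})

    def qmp_check(self, *args, **kwargs):
        # Same body as the helper added by the patch.
        result = self.qmp(*args, **kwargs)
        assert result == {'return': {}}


vm = FakeVM({'blockdev-add': {'return': {}}})
vm.qmp_check('blockdev-add', {'node-name': 'disk0'})  # passes silently
```

Any reply other than an empty success object raises `AssertionError`, which is what lets the tests drop the explicit result checks.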




[PATCH v3 08/11] iotests.py: qemu_img_create: use imgfmt by default

2022-02-25 Thread Vladimir Sementsov-Ogievskiy
Less typing: let's use imgfmt by default if the user specifies neither
-f nor --image-opts.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 tests/qemu-iotests/iotests.py | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 6ba65eb1ff..ca17a5c64c 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -233,6 +233,8 @@ def ordered_qmp(qmsg, conv_keys=True):
 return qmsg
 
 def qemu_img_create(*args):
+if '-f' not in args and '--image-opts' not in args:
+args = ['-f', imgfmt] + list(args)
 return qemu_img('create', *args)
 
 def qemu_img_measure(*args):
-- 
2.31.1
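
The defaulting rule above can be sketched in isolation as follows. `with_default_format` is a hypothetical name used only for this example; in the patch the same check lives inside `qemu_img_create()` and uses the module-level `imgfmt`.

```python
def with_default_format(args, imgfmt):
    """Prepend '-f <imgfmt>' unless the caller already chose a format."""
    if '-f' not in args and '--image-opts' not in args:
        return ['-f', imgfmt] + list(args)
    return list(args)


# Caller gave no format, so the default is prepended.
default_case = with_default_format(('disk.img', '1M'), 'qcow2')
# Caller chose a format explicitly, so the arguments are left alone.
explicit_case = with_default_format(('-f', 'raw', 'disk.img', '1M'), 'qcow2')
```

Note that the check only looks for the literal strings `-f` and `--image-opts` among the arguments.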




[PATCH v3 11/11] iotests: add filter-insertion

2022-02-25 Thread Vladimir Sementsov-Ogievskiy
Demonstrate the new API for filter insertion and removal.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 tests/qemu-iotests/tests/filter-insertion | 253 ++
 tests/qemu-iotests/tests/filter-insertion.out |   5 +
 2 files changed, 258 insertions(+)
 create mode 100755 tests/qemu-iotests/tests/filter-insertion
 create mode 100644 tests/qemu-iotests/tests/filter-insertion.out

diff --git a/tests/qemu-iotests/tests/filter-insertion 
b/tests/qemu-iotests/tests/filter-insertion
new file mode 100755
index 00..4898d6e043
--- /dev/null
+++ b/tests/qemu-iotests/tests/filter-insertion
@@ -0,0 +1,253 @@
+#!/usr/bin/env python3
+#
+# Tests for inserting and removing filters in a block graph.
+#
+# Copyright (c) 2022 Virtuozzo International GmbH.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see .
+#
+
+import os
+
+import iotests
+from iotests import qemu_img_create, try_remove
+
+
+disk = os.path.join(iotests.test_dir, 'disk')
+sock = os.path.join(iotests.sock_dir, 'sock')
+size = 1024 * 1024
+
+
+class TestFilterInsertion(iotests.QMPTestCase):
+def setUp(self):
+qemu_img_create(disk, str(size))
+
+self.vm = iotests.VM()
+self.vm.launch()
+
+self.vm.qmp_check('blockdev-add', {
+'node-name': 'disk0',
+'driver': 'qcow2',
+'file': {
+'node-name': 'file0',
+'driver': 'file',
+'filename': disk
+}
+})
+
+def tearDown(self):
+self.vm.shutdown()
+os.remove(disk)
+try_remove(sock)
+
+def test_simple_insertion(self):
+vm = self.vm
+
+vm.qmp_check('blockdev-add', {
+'node-name': 'filter',
+'driver': 'preallocate',
+'file': 'file0'
+})
+
+vm.qmp_check('x-blockdev-replace', {
+'parent-type': 'driver',
+'node-name': 'disk0',
+'child': 'file',
+'new-child': 'filter'
+})
+
+# Filter inserted:
+# disk0 -file-> filter -file-> file0
+vm.assert_edges_list([
+('disk0', 'file', 'filter'),
+('filter', 'file', 'file0')
+])
+
+vm.qmp_check('x-blockdev-replace', {
+'parent-type': 'driver',
+'node-name': 'disk0',
+'child': 'file',
+'new-child': 'file0'
+})
+
+# Filter replaced, but still exists:
+# disk0 -file-> file0 <-file- filter
+vm.assert_edges_list([
+('disk0', 'file', 'file0'),
+('filter', 'file', 'file0')
+])
+
+vm.qmp_check('blockdev-del', node_name='filter')
+
+# Filter removed
+# disk0 -file-> file0
+vm.assert_edges_list([
+('disk0', 'file', 'file0')
+])
+
+def test_tran_insert_under_qdev(self):
+vm = self.vm
+
+vm.qmp_check('device_add', driver='virtio-scsi')
+vm.qmp_check('device_add', id='sda', driver='scsi-hd', drive='disk0')
+
+vm.qmp_check('transaction', actions=[
+{
+'type': 'blockdev-add',
+'data': {
+'node-name': 'filter',
+'driver': 'compress',
+'file': 'disk0'
+}
+}, {
+'type': 'x-blockdev-replace',
+'data': {
+'parent-type': 'qdev',
+'qdev-id': 'sda',
+'new-child': 'filter'
+}
+}
+])
+
+# Filter inserted:
+# sda -root-> filter -file-> disk0 -file-> file0
+vm.assert_edges_list([
+# parent_node_name, child_name, child_node_name
+('sda', 'root', 'filter'),
+('filter', 'file', 'disk0'),
+('disk0', 'file', 'file0'),
+])
+
+vm.qmp_check('x-blockdev-replace', {
+'parent-type': 'qdev',
+'qdev-id': 'sda',
+'new-child': 'disk0'
+})
+vm.qmp_check('blockdev-del', node_name='filter')
+
+# Filter removed:
+# sda -root-> disk0 -file-> file0
+vm.assert_edges_list([
+# parent_node_name, child_name, child_node_name
+('sda', 'root', 'disk0'),
+('disk0', 'file', 'file0'),
+])
+
+def test_tran_insert_under_nbd_export(sel

[PATCH v3 07/11] block: bdrv_get_xdbg_block_graph(): report export ids

2022-02-25 Thread Vladimir Sementsov-Ogievskiy
Currently we report empty blk names for block exports. That's not good.
Let's find the corresponding block export and report its id instead.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 include/block/export.h  |  1 +
 block.c |  4 
 block/export/export.c   | 13 +
 stubs/blk-exp-find-by-blk.c |  9 +
 stubs/meson.build   |  1 +
 5 files changed, 28 insertions(+)
 create mode 100644 stubs/blk-exp-find-by-blk.c

diff --git a/include/block/export.h b/include/block/export.h
index 7feb02e10d..172c180819 100644
--- a/include/block/export.h
+++ b/include/block/export.h
@@ -80,6 +80,7 @@ struct BlockExport {
 
 BlockExport *blk_exp_add(BlockExportOptions *export, Error **errp);
 BlockExport *blk_exp_find(const char *id);
+BlockExport *blk_exp_find_by_blk(BlockBackend *blk);
 void blk_exp_ref(BlockExport *exp);
 void blk_exp_unref(BlockExport *exp);
 void blk_exp_request_shutdown(BlockExport *exp);
diff --git a/block.c b/block.c
index b2f55ff872..24baf58e80 100644
--- a/block.c
+++ b/block.c
@@ -5979,7 +5979,11 @@ XDbgBlockGraph *bdrv_get_xdbg_block_graph(Error **errp)
 for (blk = blk_all_next(NULL); blk; blk = blk_all_next(blk)) {
 char *allocated_name = NULL;
 const char *name = blk_name(blk);
+BlockExport *exp = blk_exp_find_by_blk(blk);
 
+if (!*name && exp) {
+name = exp->id;
+}
 if (!*name) {
 name = allocated_name = blk_get_attached_dev_id(blk);
 }
diff --git a/block/export/export.c b/block/export/export.c
index 613b5bc1d5..ca6c8969ca 100644
--- a/block/export/export.c
+++ b/block/export/export.c
@@ -54,6 +54,19 @@ BlockExport *blk_exp_find(const char *id)
 return NULL;
 }
 
+BlockExport *blk_exp_find_by_blk(BlockBackend *blk)
+{
+BlockExport *exp;
+
+QLIST_FOREACH(exp, &block_exports, next) {
+if (exp->blk == blk) {
+return exp;
+}
+}
+
+return NULL;
+}
+
 static const BlockExportDriver *blk_exp_find_driver(BlockExportType type)
 {
 int i;
diff --git a/stubs/blk-exp-find-by-blk.c b/stubs/blk-exp-find-by-blk.c
new file mode 100644
index 00..2fc1da953b
--- /dev/null
+++ b/stubs/blk-exp-find-by-blk.c
@@ -0,0 +1,9 @@
+#include "qemu/osdep.h"
+#include "sysemu/block-backend.h"
+#include "block/export.h"
+
+BlockExport *blk_exp_find_by_blk(BlockBackend *blk)
+{
+return NULL;
+}
+
diff --git a/stubs/meson.build b/stubs/meson.build
index 90358823fc..92e362a45e 100644
--- a/stubs/meson.build
+++ b/stubs/meson.build
@@ -2,6 +2,7 @@ stub_ss.add(files('bdrv-next-monitor-owned.c'))
 stub_ss.add(files('blk-commit-all.c'))
 stub_ss.add(files('blk-exp-close-all.c'))
 stub_ss.add(files('blk-by-qdev-id.c'))
+stub_ss.add(files('blk-exp-find-by-blk.c'))
 stub_ss.add(files('blockdev-close-all-bdrv-states.c'))
 stub_ss.add(files('change-state-handler.c'))
 stub_ss.add(files('cmos.c'))
-- 
2.31.1




[PATCH v3 09/11] iotests.py: introduce VM.assert_edges_list() method

2022-02-25 Thread Vladimir Sementsov-Ogievskiy
Add an alternative method for checking the block graph, to be used in a
later commit.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 tests/qemu-iotests/iotests.py | 17 +
 1 file changed, 17 insertions(+)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index ca17a5c64c..1b48c5b9c9 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -960,6 +960,23 @@ def check_bitmap_status(self, node_name, bitmap_name, 
fields):
 
 return fields.items() <= ret.items()
 
+def get_block_graph(self):
+"""
+Returns block graph in form of edges list, where each edge is a tuple:
+  (parent_node_name, child_name, child_node_name)
+"""
+graph = self.qmp('x-debug-query-block-graph')['return']
+
+nodes = {n['id']: n['name'] for n in graph['nodes']}
+# Check that all names are unique:
+assert len(set(nodes.values())) == len(nodes)
+
+return [(nodes[e['parent']], e['name'], nodes[e['child']])
+for e in graph['edges']]
+
+def assert_edges_list(self, edges):
+assert sorted(edges) == sorted(self.get_block_graph())
+
 def assert_block_path(self, root, path, expected_node, graph=None):
 """
 Check whether the node under the given path in the block graph
-- 
2.31.1




[PATCH v3 05/11] qapi: add x-blockdev-replace command

2022-02-25 Thread Vladimir Sementsov-Ogievskiy
Add a command that can replace the bs in the following BdrvChild structures:

 - qdev blk root child
 - block-export blk root child
 - any child BlockDriverState selected by child-name
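
As an illustration of the three cases, the sketch below builds the argument dictionary for each parent type. The field names (`parent-type`, `new-child`, `qdev-id`, `export-id`, `node-name`, `child`) follow the QAPI schema in this patch; the `replace_args` helper itself is hypothetical and exists only for this example.

```python
# Selector fields required for each parent type, per the schema in the patch.
REQUIRED = {
    'qdev': {'qdev-id'},
    'export': {'export-id'},
    'driver': {'node-name', 'child'},
}


def replace_args(parent_type, new_child, **selector):
    """Build x-blockdev-replace arguments, checking the selector fields."""
    if parent_type not in REQUIRED:
        raise ValueError(f"unknown parent type: {parent_type}")
    # Python keyword arguments use '_' where the QMP fields use '-'.
    fields = {k.replace('_', '-'): v for k, v in selector.items()}
    if set(fields) != REQUIRED[parent_type]:
        raise ValueError(f"{parent_type} needs {sorted(REQUIRED[parent_type])}")
    args = {'parent-type': parent_type, 'new-child': new_child}
    args.update(fields)
    return args


under_device = replace_args('qdev', 'filter', qdev_id='sda')
under_export = replace_args('export', 'filter', export_id='exp0')
under_node = replace_args('driver', 'filter', node_name='disk0', child='file')
```

Each resulting dictionary could be passed as the arguments of the command; the discriminator `parent-type` decides which selector fields are valid.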

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 qapi/block-core.json   | 62 
 blockdev.c | 65 ++
 stubs/blk-by-qdev-id.c |  9 ++
 stubs/meson.build  |  1 +
 4 files changed, 137 insertions(+)
 create mode 100644 stubs/blk-by-qdev-id.c

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 9a5a3641d0..f760dc21f5 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -5574,3 +5574,65 @@
 { 'command': 'blockdev-snapshot-delete-internal-sync',
   'data': { 'device': 'str', '*id': 'str', '*name': 'str'},
   'returns': 'SnapshotInfo' }
+
+##
+# @BlockParentType:
+#
+# Since 7.0
+##
+{ 'enum': 'BlockParentType',
+  'data': ['qdev', 'driver', 'export'] }
+
+##
+# @BdrvChildRefQdev:
+#
+# Since 7.0
+##
+{ 'struct': 'BdrvChildRefQdev',
+  'data': { 'qdev-id': 'str' } }
+
+##
+# @BdrvChildRefExport:
+#
+# Since 7.0
+##
+{ 'struct': 'BdrvChildRefExport',
+  'data': { 'export-id': 'str' } }
+
+##
+# @BdrvChildRefDriver:
+#
+# Since 7.0
+##
+{ 'struct': 'BdrvChildRefDriver',
+  'data': { 'node-name': 'str', 'child': 'str' } }
+
+##
+# @BlockdevReplace:
+#
+# Since 7.0
+##
+{ 'union': 'BlockdevReplace',
+  'base': {
+  'parent-type': 'BlockParentType',
+  'new-child': 'str'
+  },
+  'discriminator': 'parent-type',
+  'data': {
+  'qdev': 'BdrvChildRefQdev',
+  'export': 'BdrvChildRefExport',
+  'driver': 'BdrvChildRefDriver'
+  } }
+
+##
+# @x-blockdev-replace:
+#
+# Replace a block-node associated with device (selected by
+# @qdev-id) or with block-export (selected by @export-id) or
+# any child of block-node (selected by @node-name and @child)
+# with @new-child block-node.
+#
+# Since 7.0
+##
+{ 'command': 'x-blockdev-replace', 'boxed': true,
+  'data': 'BlockdevReplace' }
diff --git a/blockdev.c b/blockdev.c
index d20963be2a..9fd1783be2 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2229,6 +2229,71 @@ static void blockdev_add_abort(BlkActionState *common)
 bdrv_unref(s->bs);
 }
 
+static int blockdev_replace(BlockdevReplace *repl, Transaction *tran,
+Error **errp)
+{
+BdrvChild *child = NULL;
+BlockDriverState *new_child_bs;
+
+if (repl->parent_type == BLOCK_PARENT_TYPE_DRIVER) {
+BlockDriverState *parent_bs;
+
+parent_bs = bdrv_find_node(repl->u.driver.node_name);
+if (!parent_bs) {
+error_setg(errp, "Block driver node with node-name '%s' not "
+   "found", repl->u.driver.node_name);
+return -EINVAL;
+}
+
+child = bdrv_find_child(parent_bs, repl->u.driver.child);
+if (!child) {
+error_setg(errp, "Block driver node '%s' doesn't have child "
+   "named '%s'", repl->u.driver.node_name,
+   repl->u.driver.child);
+return -EINVAL;
+}
+} else {
+/* Other types are similar, they work through blk */
+BlockBackend *blk;
+bool is_qdev = repl->parent_type == BLOCK_PARENT_TYPE_QDEV;
+const char *id =
+is_qdev ? repl->u.qdev.qdev_id : repl->u.export.export_id;
+
+assert(is_qdev || repl->parent_type == BLOCK_PARENT_TYPE_EXPORT);
+
+blk = is_qdev ? blk_by_qdev_id(id, errp) : blk_by_export_id(id, errp);
+if (!blk) {
+return -EINVAL;
+}
+
+child = blk_root(blk);
+if (!child) {
+error_setg(errp, "%s '%s' is empty, nothing to replace",
+   is_qdev ? "Device" : "Export", id);
+return -EINVAL;
+}
+}
+
+assert(child);
+assert(child->bs);
+
+new_child_bs = bdrv_find_node(repl->new_child);
+if (!new_child_bs) {
+error_setg(errp, "Node '%s' not found", repl->new_child);
+return -EINVAL;
+}
+
+return bdrv_replace_child_bs(child, new_child_bs, tran, errp);
+}
+
+void qmp_x_blockdev_replace(BlockdevReplace *repl, Error **errp)
+{
+Transaction *tran = tran_new();
+int ret = blockdev_replace(repl, tran, errp);
+
+tran_finalize(tran, ret);
+}
+
 static const BlkActionOps actions[] = {
 [TRANSACTION_ACTION_KIND_BLOCKDEV_SNAPSHOT] = {
 .instance_size = sizeof(ExternalSnapshotState),
diff --git a/stubs/blk-by-qdev-id.c b/stubs/blk-by-qdev-id.c
new file mode 100644
index 00..0e751ce4f7
--- /dev/null
+++ b/stubs/blk-by-qdev-id.c
@@ -0,0 +1,9 @@
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "sysemu/block-backend.h"
+
+BlockBackend *blk_by_qdev_id(const char *id, Error **errp)
+{
+error_setg(errp, "blk '%s' not found", id);
+return NULL;
+}
diff --git a/stubs/meson.build b/stubs/meson.build
index d359cbe1ad..90358823fc 100644
--- a/stubs/meson.build
+++ b/stubs/meson.build
@@ -1,6 +1,7 

[PATCH v3 02/11] block/export: add blk_by_export_id()

2022-02-25 Thread Vladimir Sementsov-Ogievskiy
Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 include/sysemu/block-backend.h |  1 +
 block/export/export.c  | 18 ++
 2 files changed, 19 insertions(+)

diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h
index 904d70f49c..250c7465a5 100644
--- a/include/sysemu/block-backend.h
+++ b/include/sysemu/block-backend.h
@@ -124,6 +124,7 @@ DeviceState *blk_get_attached_dev(BlockBackend *blk);
 char *blk_get_attached_dev_id(BlockBackend *blk);
 BlockBackend *blk_by_dev(void *dev);
 BlockBackend *blk_by_qdev_id(const char *id, Error **errp);
+BlockBackend *blk_by_export_id(const char *id, Error **errp);
 void blk_set_dev_ops(BlockBackend *blk, const BlockDevOps *ops, void *opaque);
 int coroutine_fn blk_co_preadv(BlockBackend *blk, int64_t offset,
int64_t bytes, QEMUIOVector *qiov,
diff --git a/block/export/export.c b/block/export/export.c
index 6d3b9964c8..613b5bc1d5 100644
--- a/block/export/export.c
+++ b/block/export/export.c
@@ -362,3 +362,21 @@ BlockExportInfoList *qmp_query_block_exports(Error **errp)
 
 return head;
 }
+
+BlockBackend *blk_by_export_id(const char *id, Error **errp)
+{
+BlockExport *exp;
+
+exp = blk_exp_find(id);
+if (exp == NULL) {
+error_setg(errp, "Export '%s' not found", id);
+return NULL;
+}
+
+if (!exp->blk) {
+error_setg(errp, "Export '%s' is empty", id);
+return NULL;
+}
+
+return exp->blk;
+}
-- 
2.31.1




[PATCH v3 04/11] block: bdrv_replace_child_bs(): move to external transaction

2022-02-25 Thread Vladimir Sementsov-Ogievskiy
We'll need this functionality as part of an external transaction, so make
the whole function a transaction action. For this we need to introduce a
transaction action helper, bdrv_drained(), which calls bdrv_drained_begin()
and postpones bdrv_drained_end() to the .clean() phase.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 include/block/block.h |  2 +-
 block.c   | 42 +++---
 block/block-backend.c |  8 +++-
 3 files changed, 35 insertions(+), 17 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index e1713ee306..1cc1f736c7 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -362,7 +362,7 @@ int bdrv_append(BlockDriverState *bs_new, BlockDriverState 
*bs_top,
 int bdrv_replace_node(BlockDriverState *from, BlockDriverState *to,
   Error **errp);
 int bdrv_replace_child_bs(BdrvChild *child, BlockDriverState *new_bs,
-  Error **errp);
+  Transaction *tran, Error **errp);
 BlockDriverState *bdrv_insert_node(BlockDriverState *bs, QDict *node_options,
int flags, Error **errp);
 int bdrv_drop_filter(BlockDriverState *bs, Error **errp);
diff --git a/block.c b/block.c
index 601fee163b..b2f55ff872 100644
--- a/block.c
+++ b/block.c
@@ -5204,19 +5204,39 @@ out:
 return ret;
 }
 
+static void bdrv_drained_clean(void *opaque)
+{
+BlockDriverState *bs = opaque;
+
+bdrv_drained_end(bs);
+bdrv_unref(bs);
+}
+
+TransactionActionDrv bdrv_drained_drv = {
+.clean = bdrv_drained_clean,
+};
+
+/*
+ * Start drained section on @bs, and finish it in .clean action.
+ * Reference to @bs is kept, so @bs can't be removed during transaction.
+ */
+static void bdrv_drained(BlockDriverState *bs, Transaction *tran)
+{
+bdrv_ref(bs);
+bdrv_drained_begin(bs);
+tran_add(tran, &bdrv_drained_drv, bs);
+}
+
 /* Not for empty child */
 int bdrv_replace_child_bs(BdrvChild *child, BlockDriverState *new_bs,
-  Error **errp)
+  Transaction *tran, Error **errp)
 {
-int ret;
-Transaction *tran = tran_new();
 g_autoptr(GHashTable) found = NULL;
 g_autoptr(GSList) refresh_list = NULL;
 BlockDriverState *old_bs = child->bs;
 
-bdrv_ref(old_bs);
-bdrv_drained_begin(old_bs);
-bdrv_drained_begin(new_bs);
+bdrv_drained(old_bs, tran);
+bdrv_drained(new_bs, tran);
 
 bdrv_replace_child_tran(&child, new_bs, tran, true);
 /* @new_bs must have been non-NULL, so @child must not have been freed */
@@ -5226,15 +5246,7 @@ int bdrv_replace_child_bs(BdrvChild *child, 
BlockDriverState *new_bs,
 refresh_list = bdrv_topological_dfs(refresh_list, found, old_bs);
 refresh_list = bdrv_topological_dfs(refresh_list, found, new_bs);
 
-ret = bdrv_list_refresh_perms(refresh_list, NULL, tran, errp);
-
-tran_finalize(tran, ret);
-
-bdrv_drained_end(old_bs);
-bdrv_drained_end(new_bs);
-bdrv_unref(old_bs);
-
-return ret;
+return bdrv_list_refresh_perms(refresh_list, NULL, tran, errp);
 }
 
 static void bdrv_delete(BlockDriverState *bs)
diff --git a/block/block-backend.c b/block/block-backend.c
index 97913acfcd..dbbbc56b2c 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -892,7 +892,13 @@ int blk_insert_bs(BlockBackend *blk, BlockDriverState *bs, 
Error **errp)
  */
 int blk_replace_bs(BlockBackend *blk, BlockDriverState *new_bs, Error **errp)
 {
-return bdrv_replace_child_bs(blk->root, new_bs, errp);
+int ret;
+Transaction *tran = tran_new();
+
+ret = bdrv_replace_child_bs(blk->root, new_bs, tran, errp);
+tran_finalize(tran, ret);
+
+return ret;
 }
 
 /*
-- 
2.31.1




[PATCH v3 06/11] qapi: add x-blockdev-replace transaction action

2022-02-25 Thread Vladimir Sementsov-Ogievskiy
Support blockdev-replace in a transaction.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 qapi/transaction.json | 14 +-
 blockdev.c| 34 ++
 2 files changed, 47 insertions(+), 1 deletion(-)

diff --git a/qapi/transaction.json b/qapi/transaction.json
index a938dc7d10..48dd2db1ed 100644
--- a/qapi/transaction.json
+++ b/qapi/transaction.json
@@ -54,10 +54,12 @@
 # @blockdev-snapshot-sync: since 1.1
 # @drive-backup: Since 1.6
 # @blockdev-add: since 7.0
+# @x-blockdev-replace: since 7.0
 #
 # Features:
 # @deprecated: Member @drive-backup is deprecated.  Use member
 #  @blockdev-backup instead.
+# @unstable: Member @x-blockdev-replace is experimental
 #
 # Since: 1.1
 ##
@@ -68,6 +70,7 @@
 'blockdev-backup', 'blockdev-snapshot',
 'blockdev-snapshot-internal-sync', 'blockdev-snapshot-sync',
 'blockdev-add',
+{ 'name': 'x-blockdev-replace', 'features': [ 'unstable' ] },
 { 'name': 'drive-backup', 'features': [ 'deprecated' ] } ] }
 
 ##
@@ -150,6 +153,14 @@
 { 'struct': 'BlockdevAddWrapper',
   'data': { 'data': 'BlockdevOptions' } }
 
+##
+# @BlockdevReplaceWrapper:
+#
+# Since: 7.0
+##
+{ 'struct': 'BlockdevReplaceWrapper',
+  'data': { 'data': 'BlockdevReplace' } }
+
 ##
 # @TransactionAction:
 #
@@ -174,7 +185,8 @@
'blockdev-snapshot-internal-sync': 'BlockdevSnapshotInternalWrapper',
'blockdev-snapshot-sync': 'BlockdevSnapshotSyncWrapper',
'blockdev-add': 'BlockdevAddWrapper',
-   'drive-backup': 'DriveBackupWrapper'
+   'drive-backup': 'DriveBackupWrapper',
+   'x-blockdev-replace': 'BlockdevReplaceWrapper'
} }
 
 ##
diff --git a/blockdev.c b/blockdev.c
index 9fd1783be2..8ff0e2afe8 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2294,6 +2294,34 @@ void qmp_x_blockdev_replace(BlockdevReplace *repl, Error 
**errp)
 tran_finalize(tran, ret);
 }
 
+typedef struct TranObjState {
+BlkActionState common;
+Transaction *tran;
+} TranObjState;
+
+static void tran_obj_commit(BlkActionState *common)
+{
+TranObjState *s = DO_UPCAST(TranObjState, common, common);
+
+tran_commit(s->tran);
+}
+
+static void tran_obj_abort(BlkActionState *common)
+{
+TranObjState *s = DO_UPCAST(TranObjState, common, common);
+
+tran_abort(s->tran);
+}
+
+static void blockdev_replace_prepare(BlkActionState *common, Error **errp)
+{
+TranObjState *s = DO_UPCAST(TranObjState, common, common);
+
+s->tran = tran_new();
+
+blockdev_replace(common->action->u.x_blockdev_replace.data, s->tran, errp);
+}
+
 static const BlkActionOps actions[] = {
 [TRANSACTION_ACTION_KIND_BLOCKDEV_SNAPSHOT] = {
 .instance_size = sizeof(ExternalSnapshotState),
@@ -2372,6 +2400,12 @@ static const BlkActionOps actions[] = {
 .prepare = blockdev_add_prepare,
 .abort = blockdev_add_abort,
 },
+[TRANSACTION_ACTION_KIND_X_BLOCKDEV_REPLACE] = {
+.instance_size = sizeof(TranObjState),
+.prepare = blockdev_replace_prepare,
+.commit = tran_obj_commit,
+.abort = tran_obj_abort,
+},
 /* Where are transactions for MIRROR, COMMIT and STREAM?
  * Although these blockjobs use transaction callbacks like the backup job,
  * these jobs do not necessarily adhere to transaction semantics.
-- 
2.31.1




[PATCH v3 01/11] block-backend: blk_root(): drop const specifier on return type

2022-02-25 Thread Vladimir Sementsov-Ogievskiy
We'll need to get a non-const child pointer for graph modifications in
later commits.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 include/sysemu/block-backend.h | 2 +-
 block/block-backend.c  | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h
index e5e1524f06..904d70f49c 100644
--- a/include/sysemu/block-backend.h
+++ b/include/sysemu/block-backend.h
@@ -277,7 +277,7 @@ int coroutine_fn blk_co_copy_range(BlockBackend *blk_in, 
int64_t off_in,
int64_t bytes, BdrvRequestFlags read_flags,
BdrvRequestFlags write_flags);
 
-const BdrvChild *blk_root(BlockBackend *blk);
+BdrvChild *blk_root(BlockBackend *blk);
 
 int blk_make_empty(BlockBackend *blk, Error **errp);
 
diff --git a/block/block-backend.c b/block/block-backend.c
index 4ff6b4d785..97913acfcd 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -2464,7 +2464,7 @@ int coroutine_fn blk_co_copy_range(BlockBackend *blk_in, 
int64_t off_in,
   bytes, read_flags, write_flags);
 }
 
-const BdrvChild *blk_root(BlockBackend *blk)
+BdrvChild *blk_root(BlockBackend *blk)
 {
 return blk->root;
 }
-- 
2.31.1




[PATCH v3 00/11] blockdev-replace

2022-02-25 Thread Vladimir Sementsov-Ogievskiy
Hi all!

Finally, here is a proposal for a new interface for filter insertion, which
provides a generic way of inserting filters between different block graph
nodes, such as BDS nodes, block exports and block devices.

v3: - add transaction support
- add a test that shows transactional filter insertion in different
  cases
- drop the RFC mark. I think it is now close to a good solution, and
  there were no comments on the "RFC v2" version anyway :) Still, I
  want to keep the x- prefix for now, because there were too many
  different ideas on this topic.

Vladimir Sementsov-Ogievskiy (11):
  block-backend: blk_root(): drop const specifier on return type
  block/export: add blk_by_export_id()
  block: make bdrv_find_child() function public
  block: bdrv_replace_child_bs(): move to external transaction
  qapi: add x-blockdev-replace command
  qapi: add x-blockdev-replace transaction action
  block: bdrv_get_xdbg_block_graph(): report export ids
  iotests.py: qemu_img_create: use imgfmt by default
  iotests.py: introduce VM.assert_edges_list() method
  iotests.py: add VM.qmp_check() helper
  iotests: add filter-insertion

 qapi/block-core.json  |  62 +
 qapi/transaction.json |  14 +-
 include/block/block.h |   2 +-
 include/block/block_int.h |   1 +
 include/block/export.h|   1 +
 include/sysemu/block-backend.h|   3 +-
 block.c   |  59 ++--
 block/block-backend.c |  10 +-
 block/export/export.c |  31 +++
 blockdev.c| 113 +++-
 stubs/blk-by-qdev-id.c|   9 +
 stubs/blk-exp-find-by-blk.c   |   9 +
 stubs/meson.build |   2 +
 tests/qemu-iotests/iotests.py |  23 ++
 tests/qemu-iotests/tests/filter-insertion | 253 ++
 tests/qemu-iotests/tests/filter-insertion.out |   5 +
 16 files changed, 563 insertions(+), 34 deletions(-)
 create mode 100644 stubs/blk-by-qdev-id.c
 create mode 100644 stubs/blk-exp-find-by-blk.c
 create mode 100755 tests/qemu-iotests/tests/filter-insertion
 create mode 100644 tests/qemu-iotests/tests/filter-insertion.out

-- 
2.31.1




[PATCH v3 03/11] block: make bdrv_find_child() function public

2022-02-25 Thread Vladimir Sementsov-Ogievskiy
To be reused soon.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 include/block/block_int.h |  1 +
 block.c   | 13 +
 blockdev.c| 14 --
 3 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index 27008cfb22..e44348e851 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -1430,6 +1430,7 @@ BdrvDirtyBitmap *block_dirty_bitmap_remove(const char *node, const char *name,
BlockDriverState **bitmap_bs,
Error **errp);
 
+BdrvChild *bdrv_find_child(BlockDriverState *parent_bs, const char *child_name);
 BdrvChild *bdrv_cow_child(BlockDriverState *bs);
 BdrvChild *bdrv_filter_child(BlockDriverState *bs);
 BdrvChild *bdrv_filter_or_cow_child(BlockDriverState *bs);
diff --git a/block.c b/block.c
index b54d59d1fa..601fee163b 100644
--- a/block.c
+++ b/block.c
@@ -7728,6 +7728,19 @@ int bdrv_make_empty(BdrvChild *c, Error **errp)
 return 0;
 }
 
+BdrvChild *bdrv_find_child(BlockDriverState *parent_bs, const char *child_name)
+{
+BdrvChild *child;
+
+QLIST_FOREACH(child, &parent_bs->children, next) {
+if (strcmp(child->name, child_name) == 0) {
+return child;
+}
+}
+
+return NULL;
+}
+
 /*
  * Return the child that @bs acts as an overlay for, and from which data may be
  * copied in COW or COR operations.  Usually this is the backing file.
diff --git a/blockdev.c b/blockdev.c
index eb9ad9cb89..d20963be2a 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -3661,20 +3661,6 @@ out:
 aio_context_release(aio_context);
 }
 
-static BdrvChild *bdrv_find_child(BlockDriverState *parent_bs,
-  const char *child_name)
-{
-BdrvChild *child;
-
-QLIST_FOREACH(child, &parent_bs->children, next) {
-if (strcmp(child->name, child_name) == 0) {
-return child;
-}
-}
-
-return NULL;
-}
-
 void qmp_x_blockdev_change(const char *parent, bool has_child,
const char *child, bool has_node,
const char *node, Error **errp)
-- 
2.31.1




Re: [PATCH v5 44/49] target/ppc: Refactor VSX_MAX_MINC helper

2022-02-25 Thread Richard Henderson

On 2/25/22 11:09, matheus.fe...@eldorado.org.br wrote:

From: Víctor Colombo 

Refactor xs{max,min}cdp VSX_MAX_MINC helper to prepare for
xs{max,min}cqp implementation.

Signed-off-by: Víctor Colombo 
Signed-off-by: Matheus Ferst 
---
changes for v5:
- use float_flag_invalid_snan as suggested by Richard Henderson
---
  target/ppc/fpu_helper.c | 41 +
  1 file changed, 17 insertions(+), 24 deletions(-)

diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index 4bfa1c4283..0aaf529ac8 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -2533,40 +2533,33 @@ VSX_MAX_MIN(xsmindp, minnum, 1, float64, VsrD(0))
  VSX_MAX_MIN(xvmindp, minnum, 2, float64, VsrD(i))
  VSX_MAX_MIN(xvminsp, minnum, 4, float32, VsrW(i))
  
-#define VSX_MAX_MINC(name, max)   \
+#define VSX_MAX_MINC(name, max, tp, fld)  \
  void helper_##name(CPUPPCState *env,  \
 ppc_vsr_t *xt, ppc_vsr_t *xa, ppc_vsr_t *xb)   \
  { \
  ppc_vsr_t t = { };\
-bool vxsnan_flag = false, vex_flag = false;   \
+bool first;   \
  \
-if (unlikely(float64_is_any_nan(xa->VsrD(0)) ||   \
- float64_is_any_nan(xb->VsrD(0)))) {  \
-if (float64_is_signaling_nan(xa->VsrD(0), &env->fp_status) || \
-float64_is_signaling_nan(xb->VsrD(0), &env->fp_status)) { \
-vxsnan_flag = true;   \
-} \
-t.VsrD(0) = xb->VsrD(0);  \
-} else if ((max &&\
-   !float64_lt(xa->VsrD(0), xb->VsrD(0), &env->fp_status)) || \
-   (!max &&   \
-   float64_lt(xa->VsrD(0), xb->VsrD(0), &env->fp_status))) {  \
-t.VsrD(0) = xa->VsrD(0);  \
+if (max) {\
+first = tp##_le_quiet(xb->fld, xa->fld, &env->fp_status); \
  } else {  \
-t.VsrD(0) = xb->VsrD(0);  \
+first = tp##_lt_quiet(xa->fld, xb->fld, &env->fp_status); \
  } \
  \
-vex_flag = fpscr_ve & vxsnan_flag;\
-if (vxsnan_flag) {\
-float_invalid_op_vxsnan(env, GETPC());\
+if (first) {  \
+t.fld = xa->fld;  \
+} else {  \
+t.fld = xb->fld;  \
+if (env->fp_status.float_exception_flags & float_flag_invalid_snan) { \
+float_invalid_op_vxsnan(env, GETPC());\
+} \
  } \
-if (!vex_flag) {  \
-*xt = t;  \
-} \
-} \
+  \
+*xt = t;  \
+}


I just noticed that we're missing reset_fpstatus at the beginning here.
Since invalid via snan is the only possible exception for min/max, we do not need 
do_float_check_status at the end.
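To illustrate the stale-flag concern, here is a minimal self-contained sketch. It is a model only: fp_status_model, FLAG_INVALID_SNAN_MODEL, lt_quiet_model and min_reports_vxsnan are hypothetical stand-ins, not QEMU's softfloat API. It assumes the exception flags are sticky and that the helper takes the "else" branch, as the new macro does when the comparison is false.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a sticky softfloat exception flag. */
enum { FLAG_INVALID_SNAN_MODEL = 1 };

typedef struct {
    unsigned exception_flags;   /* sticky: operations only OR bits in */
} fp_status_model;

/* Model of a quiet compare on ordinary (non-NaN) operands: raises nothing. */
static bool lt_quiet_model(double a, double b, fp_status_model *st)
{
    (void)st;
    return a < b;
}

/*
 * Model of the min helper's flag test.  Returns true if the helper would
 * report VXSNAN for the non-NaN operands 2.0 and 1.0.
 */
static bool min_reports_vxsnan(unsigned stale_flags, bool reset_first)
{
    fp_status_model st = { .exception_flags = stale_flags };

    if (reset_first) {
        st.exception_flags = 0;     /* the role reset_fpstatus would play */
    }

    /* 2.0 < 1.0 is false, so the "else" branch inspects the flags. */
    if (lt_quiet_model(2.0, 1.0, &st)) {
        return false;
    }
    return (st.exception_flags & FLAG_INVALID_SNAN_MODEL) != 0;
}
```

In this model, min_reports_vxsnan(FLAG_INVALID_SNAN_MODEL, false) is true even though neither operand is a NaN, while the same call with reset_first set is false.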


Otherwise,
Reviewed-by: Richard Henderson 


r~



Re: [PATCH v5 34/49] target/ppc: Implement xxeval

2022-02-25 Thread Richard Henderson

On 2/25/22 11:09, matheus.fe...@eldorado.org.br wrote:

From: Matheus Ferst

Signed-off-by: Matheus Ferst
---
v5:
  - Some equivalent functions implemented with tcg_gen_gvec_*
---
  target/ppc/helper.h |   1 +
  target/ppc/insn64.decode|   8 +
  target/ppc/int_helper.c |  42 ++
  target/ppc/translate/vsx-impl.c.inc | 220 
  4 files changed, 271 insertions(+)


Reviewed-by: Richard Henderson 

r~



[PATCH v5 48/49] target/ppc: implement plxssp/pstxssp

2022-02-25 Thread matheus . ferst
From: Leandro Lupori 

Implement instructions plxssp/pstxssp and port lxssp/stxssp to
decode tree.

Reviewed-by: Richard Henderson 
Signed-off-by: Leandro Lupori 
Signed-off-by: Matheus Ferst 
---
 target/ppc/insn32.decode|  2 +
 target/ppc/insn64.decode|  6 ++
 target/ppc/translate.c  | 29 +++--
 target/ppc/translate/vsx-impl.c.inc | 93 +++--
 4 files changed, 62 insertions(+), 68 deletions(-)

diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 37b6470503..1641a31894 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -604,6 +604,8 @@ VCLRRB  000100 . . . 00111001101@VX
 
 LXSD111001 . . .. 10@DS
 STXSD   01 . . .. 10@DS
+LXSSP   111001 . . .. 11@DS
+STXSSP  01 . . .. 11@DS
 LXV 01 . .  . 001   @DQ_TSX
 STXV01 . .  . 101   @DQ_TSX
 LXVP000110 . .  @DQ_TSXP
diff --git a/target/ppc/insn64.decode b/target/ppc/insn64.decode
index b7426f5b24..691e8fe6c0 100644
--- a/target/ppc/insn64.decode
+++ b/target/ppc/insn64.decode
@@ -190,6 +190,12 @@ PLXSD   01 00 0--.-- .. \
 PSTXSD  01 00 0--.-- .. \
 101110 . .  @8LS_D
 
+PLXSSP  01 00 0--.-- .. \
+101011 . .  @8LS_D
+
+PSTXSSP 01 00 0--.-- .. \
+10 . .  @8LS_D
+
 PLXV01 00 0--.-- .. \
 11001 .. .  @8LS_D_TSX
 PSTXV   01 00 0--.-- .. \
diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index 1ef2ad..408ae26173 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -6668,39 +6668,24 @@ static bool resolve_PLS_D(DisasContext *ctx, arg_D *d, arg_PLS_D *a)
 
 #include "translate/branch-impl.c.inc"
 
-/* Handles lfdp, lxssp */
+/* Handles lfdp */
 static void gen_dform39(DisasContext *ctx)
 {
-switch (ctx->opcode & 0x3) {
-case 0: /* lfdp */
+if ((ctx->opcode & 0x3) == 0) {
 if (ctx->insns_flags2 & PPC2_ISA205) {
 return gen_lfdp(ctx);
 }
-break;
-case 3: /* lxssp */
-if (ctx->insns_flags2 & PPC2_ISA300) {
-return gen_lxssp(ctx);
-}
-break;
 }
 return gen_invalid(ctx);
 }
 
-/* handles stfdp, lxv, stxssp lxvx */
+/* Handles stfdp */
 static void gen_dform3D(DisasContext *ctx)
 {
-if ((ctx->opcode & 3) != 1) { /* DS-FORM */
-switch (ctx->opcode & 0x3) {
-case 0: /* stfdp */
-if (ctx->insns_flags2 & PPC2_ISA205) {
-return gen_stfdp(ctx);
-}
-break;
-case 3: /* stxssp */
-if (ctx->insns_flags2 & PPC2_ISA300) {
-return gen_stxssp(ctx);
-}
-break;
+if ((ctx->opcode & 3) == 0) { /* DS-FORM */
+/* stfdp */
+if (ctx->insns_flags2 & PPC2_ISA205) {
+return gen_stfdp(ctx);
 }
 }
 return gen_invalid(ctx);
diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
index a6e9417f2d..a980a79b78 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -288,29 +288,6 @@ VSX_VECTOR_LOAD_STORE_LENGTH(stxvl)
 VSX_VECTOR_LOAD_STORE_LENGTH(stxvll)
 #endif
 
-#define VSX_LOAD_SCALAR_DS(name, operation)   \
-static void gen_##name(DisasContext *ctx) \
-{ \
-TCGv EA;  \
-TCGv_i64 xth; \
-  \
-if (unlikely(!ctx->altivec_enabled)) {\
-gen_exception(ctx, POWERPC_EXCP_VPU); \
-return;   \
-} \
-xth = tcg_temp_new_i64(); \
-gen_set_access_type(ctx, ACCESS_INT); \
-EA = tcg_temp_new();  \
-gen_addr_imm_index(ctx, EA, 0x03);\
-gen_qemu_##operation(ctx, xth, EA);   \
-set_cpu_vsr(rD(ctx->opcode) + 32, xth, true); \
-/* NOTE: cpu_vsrl is undefined */ \
-tcg_temp_free(EA);\
-tcg_temp_free_i64(xth); 

[PATCH v5 44/49] target/ppc: Refactor VSX_MAX_MINC helper

2022-02-25 Thread matheus . ferst
From: Víctor Colombo 

Refactor xs{max,min}cdp VSX_MAX_MINC helper to prepare for
xs{max,min}cqp implementation.

Signed-off-by: Víctor Colombo 
Signed-off-by: Matheus Ferst 
---
changes for v5:
- use float_flag_invalid_snan as suggested by Richard Henderson
---
 target/ppc/fpu_helper.c | 41 +
 1 file changed, 17 insertions(+), 24 deletions(-)

diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index 4bfa1c4283..0aaf529ac8 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -2533,40 +2533,33 @@ VSX_MAX_MIN(xsmindp, minnum, 1, float64, VsrD(0))
 VSX_MAX_MIN(xvmindp, minnum, 2, float64, VsrD(i))
 VSX_MAX_MIN(xvminsp, minnum, 4, float32, VsrW(i))
 
-#define VSX_MAX_MINC(name, max)   \
+#define VSX_MAX_MINC(name, max, tp, fld)  \
 void helper_##name(CPUPPCState *env,  \
ppc_vsr_t *xt, ppc_vsr_t *xa, ppc_vsr_t *xb)   \
 { \
 ppc_vsr_t t = { };\
-bool vxsnan_flag = false, vex_flag = false;   \
+bool first;   \
   \
-if (unlikely(float64_is_any_nan(xa->VsrD(0)) ||   \
- float64_is_any_nan(xb->VsrD(0)))) {  \
-if (float64_is_signaling_nan(xa->VsrD(0), &env->fp_status) || \
-float64_is_signaling_nan(xb->VsrD(0), &env->fp_status)) { \
-vxsnan_flag = true;   \
-} \
-t.VsrD(0) = xb->VsrD(0);  \
-} else if ((max &&\
-   !float64_lt(xa->VsrD(0), xb->VsrD(0), &env->fp_status)) || \
-   (!max &&   \
-   float64_lt(xa->VsrD(0), xb->VsrD(0), &env->fp_status))) {  \
-t.VsrD(0) = xa->VsrD(0);  \
+if (max) {\
+first = tp##_le_quiet(xb->fld, xa->fld, &env->fp_status); \
 } else {  \
-t.VsrD(0) = xb->VsrD(0);  \
+first = tp##_lt_quiet(xa->fld, xb->fld, &env->fp_status); \
 } \
   \
-vex_flag = fpscr_ve & vxsnan_flag;\
-if (vxsnan_flag) {\
-float_invalid_op_vxsnan(env, GETPC());\
+if (first) {  \
+t.fld = xa->fld;  \
+} else {  \
+t.fld = xb->fld;  \
+if (env->fp_status.float_exception_flags & float_flag_invalid_snan) { \
+float_invalid_op_vxsnan(env, GETPC());\
+} \
 } \
-if (!vex_flag) {  \
-*xt = t;  \
-} \
-} \
+  \
+*xt = t;  \
+}
 
-VSX_MAX_MINC(XSMAXCDP, 1);
-VSX_MAX_MINC(XSMINCDP, 0);
+VSX_MAX_MINC(XSMAXCDP, true, float64, VsrD(0));
+VSX_MAX_MINC(XSMINCDP, false, float64, VsrD(0));
 
 #define VSX_MAX_MINJ(name, max)   \
 void helper_##name(CPUPPCState *env,  \
-- 
2.25.1




[PATCH v5 42/49] target/ppc: Move xscmp{eq,ge,gt}dp to decodetree

2022-02-25 Thread matheus . ferst
From: Víctor Colombo 

Reviewed-by: Richard Henderson 
Signed-off-by: Víctor Colombo 
Signed-off-by: Matheus Ferst 
---
 target/ppc/fpu_helper.c |  6 +++---
 target/ppc/helper.h |  6 +++---
 target/ppc/insn32.decode|  3 +++
 target/ppc/translate/vsx-impl.c.inc | 28 +---
 target/ppc/translate/vsx-ops.c.inc  |  3 ---
 5 files changed, 34 insertions(+), 12 deletions(-)

diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index a589e6b7a5..4d67180bec 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -2304,9 +2304,9 @@ VSX_MADDQ(XSNMSUBQPO, NMSUB_FLGS, 0)
 do_float_check_status(env, GETPC());  \
 }
 
-VSX_SCALAR_CMP(xscmpeqdp, float64, eq, VsrD(0), 0)
-VSX_SCALAR_CMP(xscmpgedp, float64, le, VsrD(0), 1)
-VSX_SCALAR_CMP(xscmpgtdp, float64, lt, VsrD(0), 1)
+VSX_SCALAR_CMP(XSCMPEQDP, float64, eq, VsrD(0), 0)
+VSX_SCALAR_CMP(XSCMPGEDP, float64, le, VsrD(0), 1)
+VSX_SCALAR_CMP(XSCMPGTDP, float64, lt, VsrD(0), 1)
 VSX_SCALAR_CMP(XSCMPEQQP, float128, eq, f128, 0)
 VSX_SCALAR_CMP(XSCMPGEQP, float128, le, f128, 1)
 VSX_SCALAR_CMP(XSCMPGTQP, float128, lt, f128, 1)
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index b9c5c0dd48..8a3db7c13f 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -357,9 +357,9 @@ DEF_HELPER_5(XSMADDDP, void, env, vsr, vsr, vsr, vsr)
 DEF_HELPER_5(XSMSUBDP, void, env, vsr, vsr, vsr, vsr)
 DEF_HELPER_5(XSNMADDDP, void, env, vsr, vsr, vsr, vsr)
 DEF_HELPER_5(XSNMSUBDP, void, env, vsr, vsr, vsr, vsr)
-DEF_HELPER_4(xscmpeqdp, void, env, vsr, vsr, vsr)
-DEF_HELPER_4(xscmpgtdp, void, env, vsr, vsr, vsr)
-DEF_HELPER_4(xscmpgedp, void, env, vsr, vsr, vsr)
+DEF_HELPER_4(XSCMPEQDP, void, env, vsr, vsr, vsr)
+DEF_HELPER_4(XSCMPGTDP, void, env, vsr, vsr, vsr)
+DEF_HELPER_4(XSCMPGEDP, void, env, vsr, vsr, vsr)
 DEF_HELPER_4(XSCMPEQQP, void, env, vsr, vsr, vsr)
 DEF_HELPER_4(XSCMPGTQP, void, env, vsr, vsr, vsr)
 DEF_HELPER_4(XSCMPGEQP, void, env, vsr, vsr, vsr)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 92327a0a71..6fbb2d188f 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -664,6 +664,9 @@ XSMAXCDP00 . . . 1000 ...   @XX3
 XSMINCDP00 . . . 10001000 ...   @XX3
 XSMAXJDP00 . . . 1001 ...   @XX3
 XSMINJDP00 . . . 10011000 ...   @XX3
+XSCMPEQDP   00 . . . 0011 ...   @XX3
+XSCMPGEDP   00 . . . 00010011 ...   @XX3
+XSCMPGTDP   00 . . . 1011 ...   @XX3
 XSCMPEQQP   11 . . . 0001000100 -   @X
 XSCMPGEQP   11 . . . 0011000100 -   @X
 XSCMPGTQP   11 . . . 0011100100 -   @X
diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
index 6dd20b0309..fca25c267b 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -1052,9 +1052,6 @@ GEN_VSX_HELPER_X2(xssqrtdp, 0x16, 0x04, 0, PPC2_VSX)
 GEN_VSX_HELPER_X2(xsrsqrtedp, 0x14, 0x04, 0, PPC2_VSX)
 GEN_VSX_HELPER_X2_AB(xstdivdp, 0x14, 0x07, 0, PPC2_VSX)
 GEN_VSX_HELPER_X1(xstsqrtdp, 0x14, 0x06, 0, PPC2_VSX)
-GEN_VSX_HELPER_X3(xscmpeqdp, 0x0C, 0x00, 0, PPC2_ISA300)
-GEN_VSX_HELPER_X3(xscmpgtdp, 0x0C, 0x01, 0, PPC2_ISA300)
-GEN_VSX_HELPER_X3(xscmpgedp, 0x0C, 0x02, 0, PPC2_ISA300)
 GEN_VSX_HELPER_X2_AB(xscmpexpdp, 0x0C, 0x07, 0, PPC2_ISA300)
 GEN_VSX_HELPER_R2_AB(xscmpexpqp, 0x04, 0x05, 0, PPC2_ISA300)
 GEN_VSX_HELPER_X2_AB(xscmpodp, 0x0C, 0x05, 0, PPC2_VSX)
@@ -2589,6 +2586,31 @@ TRANS(XXBLENDVH, do_xxblendv, MO_16)
 TRANS(XXBLENDVW, do_xxblendv, MO_32)
 TRANS(XXBLENDVD, do_xxblendv, MO_64)
 
+static bool do_helper_XX3(DisasContext *ctx, arg_XX3 *a,
+void (*helper)(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr))
+{
+TCGv_ptr xt, xa, xb;
+
+REQUIRE_INSNS_FLAGS2(ctx, ISA300);
+REQUIRE_VSX(ctx);
+
+xt = gen_vsr_ptr(a->xt);
+xa = gen_vsr_ptr(a->xa);
+xb = gen_vsr_ptr(a->xb);
+
+helper(cpu_env, xt, xa, xb);
+
+tcg_temp_free_ptr(xt);
+tcg_temp_free_ptr(xa);
+tcg_temp_free_ptr(xb);
+
+return true;
+}
+
+TRANS(XSCMPEQDP, do_helper_XX3, gen_helper_XSCMPEQDP)
+TRANS(XSCMPGEDP, do_helper_XX3, gen_helper_XSCMPGEDP)
+TRANS(XSCMPGTDP, do_helper_XX3, gen_helper_XSCMPGTDP)
+
 static bool do_xsmaxmincjdp(DisasContext *ctx, arg_XX3 *a,
 void (*helper)(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr))
 {
diff --git a/target/ppc/translate/vsx-ops.c.inc b/target/ppc/translate/vsx-ops.c.inc
index 34310c1fb5..b8fd116728 100644
--- a/target/ppc/translate/vsx-ops.c.inc
+++ b/target/ppc/translate/vsx-ops.c.inc
@@ -186,9 +186,6 @@ GEN_XX2FORM(xssqrtdp,  0x16, 0x04, PPC2_VSX),
 GEN_XX2FORM(xsrsqrtedp,  0x14, 0x04, PPC2_VSX),
 GEN_XX3FORM(xstdivdp,  0x14, 0x07, PPC2_VSX),
 GEN_XX2FORM(xstsqrtdp,  0x14, 0x06, PPC2_VSX),
-GEN_XX3FORM(xscmpeqdp, 0x0C, 0x00

[PATCH v5 40/49] target/ppc: Refactor VSX_SCALAR_CMP_DP

2022-02-25 Thread matheus . ferst
From: Víctor Colombo 

Refactor VSX_SCALAR_CMP_DP, changing its name to VSX_SCALAR_CMP and
prepare the helper to be used for quadword comparisons.

Suggested-by: Richard Henderson 
Signed-off-by: Víctor Colombo 
Signed-off-by: Matheus Ferst 
---
changes for v5:
- Improve refactor as suggested by Richard Henderson
---
 target/ppc/fpu_helper.c | 66 +++--
 1 file changed, 30 insertions(+), 36 deletions(-)

diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index 9b034d1fe4..bbd54b2d9c 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -2265,54 +2265,48 @@ VSX_MADDQ(XSNMSUBQP, NMSUB_FLGS, 0)
 VSX_MADDQ(XSNMSUBQPO, NMSUB_FLGS, 0)
 
 /*
- * VSX_SCALAR_CMP_DP - VSX scalar floating point compare double precision
+ * VSX_SCALAR_CMP - VSX scalar floating point compare
  *   op- instruction mnemonic
+ *   tp- type
  *   cmp   - comparison operation
- *   exp   - expected result of comparison
+ *   fld   - vsr_t field
  *   svxvc - set VXVC bit
  */
-#define VSX_SCALAR_CMP_DP(op, cmp, exp, svxvc)\
-void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, \
- ppc_vsr_t *xa, ppc_vsr_t *xb)\
+#define VSX_SCALAR_CMP(op, tp, cmp, fld, svxvc)   \
+void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, \
+ppc_vsr_t *xa, ppc_vsr_t *xb) \
 { \
-ppc_vsr_t t = *xt;\
-bool vxsnan_flag = false, vxvc_flag = false, vex_flag = false;\
+int flags;\
+bool r, vxvc; \
   \
-if (float64_is_signaling_nan(xa->VsrD(0), &env->fp_status) || \
-float64_is_signaling_nan(xb->VsrD(0), &env->fp_status)) { \
-vxsnan_flag = true;   \
-if (fpscr_ve == 0 && svxvc) { \
-vxvc_flag = true; \
+helper_reset_fpstatus(env);   \
+  \
+if (svxvc) {  \
+r = tp##_##cmp(xb->fld, xa->fld, &env->fp_status);\
+} else {  \
+r = tp##_##cmp##_quiet(xb->fld, xa->fld, &env->fp_status);\
+} \
+  \
+flags = get_float_exception_flags(&env->fp_status);   \
+if (unlikely(flags & float_flag_invalid)) {   \
+vxvc = svxvc; \
+if (flags & float_flag_invalid_snan) {\
+float_invalid_op_vxsnan(env, GETPC());\
+vxvc &= fpscr_ve == 0;\
 } \
-} else if (svxvc) {   \
-vxvc_flag = float64_is_quiet_nan(xa->VsrD(0), &env->fp_status) || \
-float64_is_quiet_nan(xb->VsrD(0), &env->fp_status);   \
-} \
-if (vxsnan_flag) {\
-float_invalid_op_vxsnan(env, GETPC());\
-} \
-if (vxvc_flag) {  \
-float_invalid_op_vxvc(env, 0, GETPC());   \
-} \
-vex_flag = fpscr_ve && (vxvc_flag || vxsnan_flag);\
-  \
-if (!vex_flag) {  \
-if (float64_##cmp(xb->VsrD(0), xa->VsrD(0),   \
-  &env->fp_status) == exp) {  \
-t.VsrD(0) = -1;   \
-t.VsrD(1) = 0;\
-  

Re: [PATCH v5 40/49] target/ppc: Refactor VSX_SCALAR_CMP_DP

2022-02-25 Thread Richard Henderson

On 2/25/22 11:09, matheus.fe...@eldorado.org.br wrote:

From: Víctor Colombo

Refactor VSX_SCALAR_CMP_DP, changing its name to VSX_SCALAR_CMP and
prepare the helper to be used for quadword comparisons.

Suggested-by: Richard Henderson
Signed-off-by: Víctor Colombo
Signed-off-by: Matheus Ferst
---
changes for v5:
- Improve refactor as suggested by Richard Henderson
---
  target/ppc/fpu_helper.c | 66 +++--
  1 file changed, 30 insertions(+), 36 deletions(-)


Reviewed-by: Richard Henderson 

r~



[PATCH v5 39/49] target/ppc: Remove xscmpnedp instruction

2022-02-25 Thread matheus . ferst
From: Víctor Colombo 

xscmpnedp was added in ISA v3.0 but removed in v3.0B. This patch
removes this instruction as it was not in the final version of v3.0.

Signed-off-by: Víctor Colombo 
Acked-by: Greg Kurz 
Reviewed-by: Cédric Le Goater 
Reviewed-by: Richard Henderson 
Signed-off-by: Matheus Ferst 
---
 target/ppc/fpu_helper.c | 1 -
 target/ppc/helper.h | 1 -
 target/ppc/translate/vsx-impl.c.inc | 1 -
 target/ppc/translate/vsx-ops.c.inc  | 1 -
 4 files changed, 4 deletions(-)

diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index 98e9576608..9b034d1fe4 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -2313,7 +2313,6 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, \
 VSX_SCALAR_CMP_DP(xscmpeqdp, eq, 1, 0)
 VSX_SCALAR_CMP_DP(xscmpgedp, le, 1, 1)
 VSX_SCALAR_CMP_DP(xscmpgtdp, lt, 1, 1)
-VSX_SCALAR_CMP_DP(xscmpnedp, eq, 0, 0)
 
 void helper_xscmpexpdp(CPUPPCState *env, uint32_t opcode,
ppc_vsr_t *xa, ppc_vsr_t *xb)
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 412b034496..00e2e6f7b7 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -360,7 +360,6 @@ DEF_HELPER_5(XSNMSUBDP, void, env, vsr, vsr, vsr, vsr)
 DEF_HELPER_4(xscmpeqdp, void, env, vsr, vsr, vsr)
 DEF_HELPER_4(xscmpgtdp, void, env, vsr, vsr, vsr)
 DEF_HELPER_4(xscmpgedp, void, env, vsr, vsr, vsr)
-DEF_HELPER_4(xscmpnedp, void, env, vsr, vsr, vsr)
 DEF_HELPER_4(xscmpexpdp, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xscmpexpqp, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xscmpodp, void, env, i32, vsr, vsr)
diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
index 4da889531b..3baaac1abd 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -1055,7 +1055,6 @@ GEN_VSX_HELPER_X1(xstsqrtdp, 0x14, 0x06, 0, PPC2_VSX)
 GEN_VSX_HELPER_X3(xscmpeqdp, 0x0C, 0x00, 0, PPC2_ISA300)
 GEN_VSX_HELPER_X3(xscmpgtdp, 0x0C, 0x01, 0, PPC2_ISA300)
 GEN_VSX_HELPER_X3(xscmpgedp, 0x0C, 0x02, 0, PPC2_ISA300)
-GEN_VSX_HELPER_X3(xscmpnedp, 0x0C, 0x03, 0, PPC2_ISA300)
 GEN_VSX_HELPER_X2_AB(xscmpexpdp, 0x0C, 0x07, 0, PPC2_ISA300)
 GEN_VSX_HELPER_R2_AB(xscmpexpqp, 0x04, 0x05, 0, PPC2_ISA300)
 GEN_VSX_HELPER_X2_AB(xscmpodp, 0x0C, 0x05, 0, PPC2_VSX)
diff --git a/target/ppc/translate/vsx-ops.c.inc b/target/ppc/translate/vsx-ops.c.inc
index 9cfec53df0..34310c1fb5 100644
--- a/target/ppc/translate/vsx-ops.c.inc
+++ b/target/ppc/translate/vsx-ops.c.inc
@@ -189,7 +189,6 @@ GEN_XX2FORM(xstsqrtdp,  0x14, 0x06, PPC2_VSX),
 GEN_XX3FORM(xscmpeqdp, 0x0C, 0x00, PPC2_ISA300),
 GEN_XX3FORM(xscmpgtdp, 0x0C, 0x01, PPC2_ISA300),
 GEN_XX3FORM(xscmpgedp, 0x0C, 0x02, PPC2_ISA300),
-GEN_XX3FORM(xscmpnedp, 0x0C, 0x03, PPC2_ISA300),
 GEN_XX3FORM(xscmpexpdp, 0x0C, 0x07, PPC2_ISA300),
 GEN_VSX_XFORM_300(xscmpexpqp, 0x04, 0x05, 0x0061),
 GEN_XX2IFORM(xscmpodp,  0x0C, 0x05, PPC2_VSX),
-- 
2.25.1




Re: [PATCH v5 38/49] target/ppc: Implement xvtlsbb instruction

2022-02-25 Thread Richard Henderson

On 2/25/22 11:09, matheus.fe...@eldorado.org.br wrote:

From: Víctor Colombo

Signed-off-by: Víctor Colombo
Signed-off-by: Matheus Ferst
---
changes for v5:
- unroll for-loop as suggested by Richard Henderson
---
  target/ppc/insn32.decode|  7 +
  target/ppc/translate/vsx-impl.c.inc | 40 +
  2 files changed, 47 insertions(+)


Reviewed-by: Richard Henderson 

r~



[PATCH v5 37/49] target/ppc: implement xs[n]maddqp[o]/xs[n]msubqp[o]

2022-02-25 Thread matheus . ferst
From: Matheus Ferst 

Implement the following PowerISA v3.0 instructions:
xsmaddqp[o]: VSX Scalar Multiply-Add Quad-Precision [using round to Odd]
xsmsubqp[o]: VSX Scalar Multiply-Subtract Quad-Precision [using round
 to Odd]
xsnmaddqp[o]: VSX Scalar Negative Multiply-Add Quad-Precision [using
  round to Odd]
xsnmsubqp[o]: VSX Scalar Negative Multiply-Subtract Quad-Precision
  [using round to Odd]

Reviewed-by: Richard Henderson 
Signed-off-by: Matheus Ferst 
---
 target/ppc/fpu_helper.c | 42 +
 target/ppc/helper.h |  9 +++
 target/ppc/insn32.decode|  4 +++
 target/ppc/translate/vsx-impl.c.inc | 25 +
 4 files changed, 80 insertions(+)

diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index c8797d8053..98e9576608 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -,6 +,48 @@ VSX_MADD(xvmsubsp, 4, float32, VsrW(i), MSUB_FLGS, 0, 0)
 VSX_MADD(xvnmaddsp, 4, float32, VsrW(i), NMADD_FLGS, 0, 0)
 VSX_MADD(xvnmsubsp, 4, float32, VsrW(i), NMSUB_FLGS, 0, 0)
 
+/*
+ * VSX_MADDQ - VSX floating point quad-precision multiply/add
+ *   op- instruction mnemonic
+ *   maddflgs - flags for the float*muladd routine that control the
+ *   various forms (madd, msub, nmadd, nmsub)
+ *   ro- round to odd
+ */
+#define VSX_MADDQ(op, maddflgs, ro) \
+void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *s1, ppc_vsr_t *s2,\
+ ppc_vsr_t *s3) \
+{ \
+ppc_vsr_t t = *xt; \
+ \
+helper_reset_fpstatus(env); \
+ \
+float_status tstat = env->fp_status; \
+set_float_exception_flags(0, &tstat); \
+if (ro) { \
+tstat.float_rounding_mode = float_round_to_odd; \
+} \
+t.f128 = float128_muladd(s1->f128, s3->f128, s2->f128, maddflgs, &tstat); \
+env->fp_status.float_exception_flags |= tstat.float_exception_flags; \
+ \
+if (unlikely(tstat.float_exception_flags & float_flag_invalid)) { \
+float_invalid_op_madd(env, tstat.float_exception_flags, \
+  false, GETPC()); \
+} \
+ \
+helper_compute_fprf_float128(env, t.f128); \
+*xt = t; \
+do_float_check_status(env, GETPC()); \
+}
+
+VSX_MADDQ(XSMADDQP, MADD_FLGS, 0)
+VSX_MADDQ(XSMADDQPO, MADD_FLGS, 1)
+VSX_MADDQ(XSMSUBQP, MSUB_FLGS, 0)
+VSX_MADDQ(XSMSUBQPO, MSUB_FLGS, 1)
+VSX_MADDQ(XSNMADDQP, NMADD_FLGS, 0)
+VSX_MADDQ(XSNMADDQPO, NMADD_FLGS, 1)
+VSX_MADDQ(XSNMSUBQP, NMSUB_FLGS, 0)
+VSX_MADDQ(XSNMSUBQPO, NMSUB_FLGS, 0)
+
 /*
  * VSX_SCALAR_CMP_DP - VSX scalar floating point compare double precision
  *   op- instruction mnemonic
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index fff6041a7b..412b034496 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -421,6 +421,15 @@ DEF_HELPER_5(XSMSUBSP, void, env, vsr, vsr, vsr, vsr)
 DEF_HELPER_5(XSNMADDSP, void, env, vsr, vsr, vsr, vsr)
 DEF_HELPER_5(XSNMSUBSP, void, env, vsr, vsr, vsr, vsr)
 
+DEF_HELPER_5(XSMADDQP, void, env, vsr, vsr, vsr, vsr)
+DEF_HELPER_5(XSMADDQPO, void, env, vsr, vsr, vsr, vsr)
+DEF_HELPER_5(XSMSUBQP, void, env, vsr, vsr, vsr, vsr)
+DEF_HELPER_5(XSMSUBQPO, void, env, vsr, vsr, vsr, vsr)
+DEF_HELPER_5(XSNMADDQP, void, env, vsr, vsr, vsr, vsr)
+DEF_HELPER_5(XSNMADDQPO, void, env, vsr, vsr, vsr, vsr)
+DEF_HELPER_5(XSNMSUBQP, void, env, vsr, vsr, vsr, vsr)
+DEF_HELPER_5(XSNMSUBQPO, void, env, vsr, vsr, vsr, vsr)
+
 DEF_HELPER_4(xvadddp, void, env, vsr, vsr, vsr)
 DEF_HELPER_4(xvsubdp, void, env, vsr, vsr, vsr)
 DEF_HELPER_4(xvmuldp, void, env, vsr, vsr, vsr)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index ed24b39e5a..c28cc13325 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -611,21 +611,25 @@ XSMADDADP   00 . . . 0011 . . . @XX3
 XSMADDMDP   00 . ...

Re: [PATCH v5 35/49] target/ppc: Implement xxgenpcv[bhwd]m instruction

2022-02-25 Thread Richard Henderson

On 2/25/22 11:09, matheus.fe...@eldorado.org.br wrote:
  
+#define XXGENPCV(NAME) \

+static bool trans_##NAME(DisasContext *ctx, arg_X_imm5 *a)  \
+{   \
+TCGv_ptr xt, vrb;   \
+\
+REQUIRE_INSNS_FLAGS2(ctx, ISA310);  \
+REQUIRE_VSX(ctx);   \
+\
+if (a->imm & ~0x3) {\
+gen_invalid(ctx);   \
+return true;\
+}   \
+\
+xt = gen_vsr_ptr(a->xt);\
+vrb = gen_avr_ptr(a->vrb);  \
+\
+switch (a->imm) {   \
+case 0b00000: /* Big-Endian expansion */\
+glue(gen_helper_, glue(NAME, _be_exp))(xt, vrb);\
+break;  \
+case 0b00001: /* Big-Endian compression */  \
+glue(gen_helper_, glue(NAME, _be_comp))(xt, vrb);   \
+break;  \
+case 0b00010: /* Little-Endian expansion */ \
+glue(gen_helper_, glue(NAME, _le_exp))(xt, vrb);\
+break;  \
+case 0b00011: /* Little-Endian compression */   \
+glue(gen_helper_, glue(NAME, _le_comp))(xt, vrb);   \
+break;  \
+}   \
+\
+tcg_temp_free_ptr(xt);  \
+tcg_temp_free_ptr(vrb); \
+\
+return true;\
+}
+
+XXGENPCV(XXGENPCVBM)
+XXGENPCV(XXGENPCVHM)
+XXGENPCV(XXGENPCVWM)
+XXGENPCV(XXGENPCVDM)
+#undef XXGENPCV


Suggestion:

typedef void (*xxgenpcv_genfn)(TCGv_ptr, TCGv_ptr);

static bool do_xxgenpcv(DisasContext *ctx, arg_X_imm5 *a,
xxgenpcv_genfn fn[4])
{
   ...
   fn[a->imm](xt, vrb);
   ...
}

#define XXGENPCV(NAME) \
static bool trans_##NAME(...)
{
static const xxgenpcv_genfn fn[4] = {
gen_helper_##NAME##_be_exp,
gen_helper_##NAME##_be_comp,
gen_helper_##NAME##_le_exp,
gen_helper_##NAME##_le_comp,
};
return do_xxgenpcv(ctx, a, fn);
}

For debugging purposes, prefer to put as little within giant macro expansion as 
possible.
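A self-contained sketch of the same pattern outside QEMU, for illustration only. Every name here (genfn_model, do_genpcv_model, DEFINE_TRANS_MODEL, the demo_* functions) is hypothetical, and the int return values merely tag which table entry ran; the real code would emit TCG ops instead. The point is that the macro shrinks to a four-entry table while the logic lives in an ordinary function that a debugger can step through.

```c
#include <assert.h>

/* Hypothetical generator signature; the real one takes TCGv_ptr operands. */
typedef int (*genfn_model)(int xt, int vrb);

/* All of the logic lives here, outside any macro expansion. */
static int do_genpcv_model(unsigned imm, const genfn_model fn[4],
                           int xt, int vrb)
{
    if (imm & ~0x3u) {
        return -1;              /* models the invalid-instruction path */
    }
    return fn[imm](xt, vrb);    /* imm selects the table entry directly */
}

/* Stand-in generators; each returns a tag identifying itself. */
static int demo_be_exp(int xt, int vrb)  { (void)xt; (void)vrb; return 0; }
static int demo_be_comp(int xt, int vrb) { (void)xt; (void)vrb; return 1; }
static int demo_le_exp(int xt, int vrb)  { (void)xt; (void)vrb; return 2; }
static int demo_le_comp(int xt, int vrb) { (void)xt; (void)vrb; return 3; }

/* The macro only builds the table and forwards to the shared function. */
#define DEFINE_TRANS_MODEL(NAME)                                \
static int trans_##NAME(unsigned imm, int xt, int vrb)          \
{                                                               \
    static const genfn_model fn[4] = {                          \
        NAME##_be_exp, NAME##_be_comp,                          \
        NAME##_le_exp, NAME##_le_comp,                          \
    };                                                          \
    return do_genpcv_model(imm, fn, xt, vrb);                   \
}

DEFINE_TRANS_MODEL(demo)
```

With this layout, trans_demo(2, ...) reaches demo_le_exp through the table, and any immediate above 3 takes the invalid path.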


r~



Re: [PATCH v2 13/18] tests/tcg: add vectorised sha512 versions

2022-02-25 Thread Richard Henderson

On 2/25/22 07:20, Alex Bennée wrote:

+++ b/tests/tcg/i386/Makefile.target
@@ -71,3 +71,9 @@ TESTS=$(MULTIARCH_TESTS) $(I386_TESTS)
  
  # On i386 and x86_64 Linux only supports 4k pages (large pages are a different hack)

  EXTRA_RUNS+=run-test-mmap-4096
+
+sha512-sse: CFLAGS=-msse4.1 -O3
+sha512-sse: sha512.c
+   $(CC) $(CFLAGS) $(EXTRA_CFLAGS) $< -o $@ $(LDFLAGS)
+
+TESTS+=sha512-sse


The default cpu, qemu32, only implements sse3, not sse4.1, so we get a guest SIGILL.  We 
can execute this with -cpu max, or we could limit the vectorization.



r~



Re: [PATCH v5 26/49] target/ppc: implement vrlqnm

2022-02-25 Thread Richard Henderson

On 2/25/22 11:09, matheus.fe...@eldorado.org.br wrote:

+/* t = t >> 1 */
+tcg_gen_shli_i64(t0, th, 63);
+tcg_gen_shri_i64(tl, tl, 1);
+tcg_gen_shri_i64(th, th, 1);
+tcg_gen_or_i64(tl, t0, tl);


tcg_gen_extract2_i64(tl, tl, th, 1);
tcg_gen_shri_i64(th, th, 1);
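
A plain-C model of why the two forms agree, as a sketch rather than QEMU code. extract2_model is a hypothetical stand-in that assumes extract2 returns the 64 bits starting at bit ofs of the 128-bit value hi:lo; the other names are likewise made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Model: 64 bits starting at bit 'ofs' of the 128-bit value hi:lo. */
static uint64_t extract2_model(uint64_t lo, uint64_t hi, unsigned ofs)
{
    assert(ofs <= 64);
    if (ofs == 0) {
        return lo;
    }
    if (ofs == 64) {
        return hi;
    }
    return (lo >> ofs) | (hi << (64 - ofs));
}

/* The four-op sequence from the patch: shl, shr, shr, or. */
static void shr1_four_ops(uint64_t *lo, uint64_t *hi)
{
    uint64_t t0 = *hi << 63;
    *lo >>= 1;
    *hi >>= 1;
    *lo |= t0;
}

/* The suggested two-op sequence: extract2, shr. */
static void shr1_two_ops(uint64_t *lo, uint64_t *hi)
{
    *lo = extract2_model(*lo, *hi, 1);
    *hi >>= 1;
}

/* Returns 1 if both sequences give the same 128-bit result. */
static int shr1_forms_agree(uint64_t lo, uint64_t hi)
{
    uint64_t lo_a = lo, hi_a = hi, lo_b = lo, hi_b = hi;

    shr1_four_ops(&lo_a, &hi_a);
    shr1_two_ops(&lo_b, &hi_b);
    return lo_a == lo_b && hi_a == hi_b;
}
```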


+if (mask) {
+tcg_gen_shri_i64(n, vrb, 8);
+tcg_gen_shri_i64(vrb, vrb, 16);
+tcg_gen_andi_i64(n, n, 0x7f);
+tcg_gen_andi_i64(vrb, vrb, 0x7f);


Two tcg_gen_extract_i64.
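
As a sketch of the equivalence, in plain C rather than TCG: extract_model is a hypothetical stand-in that assumes extract returns len bits of the value starting at bit ofs, which is what each shift-and-mask pair in the patch computes.

```c
#include <assert.h>
#include <stdint.h>

/* Model: the 'len'-bit field of 'v' starting at bit 'ofs', zero-extended. */
static uint64_t extract_model(uint64_t v, unsigned ofs, unsigned len)
{
    assert(len > 0 && ofs + len <= 64);
    return (v >> ofs) & (~0ULL >> (64 - len));
}

/* Returns 1 if both shift-and-mask pairs match a single extract each. */
static int fields_agree(uint64_t vrb)
{
    uint64_t n_shift = (vrb >> 8) & 0x7f;      /* shri 8, andi 0x7f */
    uint64_t b_shift = (vrb >> 16) & 0x7f;     /* shri 16, andi 0x7f */

    return n_shift == extract_model(vrb, 8, 7) &&
           b_shift == extract_model(vrb, 16, 7);
}
```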

Otherwise,
Reviewed-by: Richard Henderson 


r~



Re: [PATCH v5 45/49] target/ppc: Implement xs{max,min}cqp

2022-02-25 Thread Richard Henderson

On 2/25/22 11:09, matheus.fe...@eldorado.org.br wrote:

From: Víctor Colombo

Signed-off-by: Víctor Colombo
Signed-off-by: Matheus Ferst
---
changes for v5:
- Update the helper macro call with the new parameters added to
   VSX_MAX_MINC
---
  target/ppc/fpu_helper.c | 2 ++
  target/ppc/helper.h | 2 ++
  target/ppc/insn32.decode| 3 +++
  target/ppc/translate/vsx-impl.c.inc | 2 ++
  4 files changed, 9 insertions(+)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH v5 46/49] target/ppc: Implement xvcvbf16spn and xvcvspbf16 instructions

2022-02-25 Thread Richard Henderson

On 2/25/22 11:09, matheus.fe...@eldorado.org.br wrote:

+void helper_XVCVSPBF16(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb)
+{
+ppc_vsr_t t = { };
+int i, status;
+
+for (i = 0; i < 4; i++) {
+t.VsrH(2 * i + 1) = float32_to_bfloat16(xb->VsrW(i), &env->fp_status);
+}
+
+status = get_float_exception_flags(&env->fp_status);
+if (unlikely(status & float_flag_invalid_snan)) {
+float_invalid_op_vxsnan(env, GETPC());
+}
+
+*xt = t;
+do_float_check_status(env, GETPC());
+}


Missing reset_fpstatus.  Otherwise,

Reviewed-by: Richard Henderson 


r~


PS: This reminds me that cleaning this up has been on the to-do list for a long 
time. We should be able to rely on env->fp_status.float_exception_flags being 0 between 
fp operations, and therefore at the start of each one. In do_float_check_status, we would 
reset float_exception_flags in the (expected to be unlikely) case that it is non-zero.
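
A rough model of that invariant, with all names (env_model, fp_op_model, check_status_model, run_two_ops_model) hypothetical and no relation to the actual QEMU structures: the check step folds any raised flags into a separate sticky record and clears them, so every operation can assume it starts from zero.

```c
#include <assert.h>

typedef struct {
    unsigned exception_flags;   /* per-operation flags, 0 between operations */
    unsigned sticky_record;     /* models the architectural sticky bits */
} env_model;

/* An fp operation: relies on the flags being clear on entry. */
static void fp_op_model(env_model *env, unsigned raised)
{
    assert(env->exception_flags == 0);
    env->exception_flags |= raised;
}

/* The check step: consume the flags and restore the invariant. */
static void check_status_model(env_model *env)
{
    if (env->exception_flags != 0) {        /* expected to be the rare case */
        env->sticky_record |= env->exception_flags;
        env->exception_flags = 0;
    }
}

/* Returns the flags the second operation observes after it runs. */
static unsigned run_two_ops_model(unsigned first_raised, unsigned second_raised)
{
    env_model env = { 0, 0 };
    unsigned seen;

    fp_op_model(&env, first_raised);
    check_status_model(&env);
    fp_op_model(&env, second_raised);
    seen = env.exception_flags;
    check_status_model(&env);
    return seen;
}
```

In this model the second operation never sees flags raised by the first, so no explicit reset is needed at the start of each helper.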




[PATCH v5 33/49] tcg/tcg-op-gvec.c: Introduce tcg_gen_gvec_4i

2022-02-25 Thread matheus . ferst
From: Matheus Ferst 

Following the implementation of tcg_gen_gvec_3i, add a four-vector and
immediate operand expansion method.

Reviewed-by: Richard Henderson 
Signed-off-by: Matheus Ferst 
---
 include/tcg/tcg-op-gvec.h |  22 ++
 tcg/tcg-op-gvec.c | 146 ++
 2 files changed, 168 insertions(+)

diff --git a/include/tcg/tcg-op-gvec.h b/include/tcg/tcg-op-gvec.h
index da55fed870..28cafbcc5c 100644
--- a/include/tcg/tcg-op-gvec.h
+++ b/include/tcg/tcg-op-gvec.h
@@ -218,6 +218,25 @@ typedef struct {
 bool write_aofs;
 } GVecGen4;
 
+typedef struct {
+/*
+ * Expand inline as a 64-bit or 32-bit integer. Only one of these will be
+ * non-NULL.
+ */
+void (*fni8)(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_i64, int64_t);
+void (*fni4)(TCGv_i32, TCGv_i32, TCGv_i32, TCGv_i32, int32_t);
+/* Expand inline with a host vector type.  */
+void (*fniv)(unsigned, TCGv_vec, TCGv_vec, TCGv_vec, TCGv_vec, int64_t);
+/* Expand out-of-line helper w/descriptor, data in descriptor.  */
+gen_helper_gvec_4 *fno;
+/* The optional opcodes, if any, utilized by .fniv.  */
+const TCGOpcode *opt_opc;
+/* The vector element size, if applicable.  */
+uint8_t vece;
+/* Prefer i64 to v64.  */
+bool prefer_i64;
+} GVecGen4i;
+
 void tcg_gen_gvec_2(uint32_t dofs, uint32_t aofs,
 uint32_t oprsz, uint32_t maxsz, const GVecGen2 *);
 void tcg_gen_gvec_2i(uint32_t dofs, uint32_t aofs, uint32_t oprsz,
@@ -231,6 +250,9 @@ void tcg_gen_gvec_3i(uint32_t dofs, uint32_t aofs, uint32_t bofs,
  const GVecGen3i *);
 void tcg_gen_gvec_4(uint32_t dofs, uint32_t aofs, uint32_t bofs, uint32_t cofs,
 uint32_t oprsz, uint32_t maxsz, const GVecGen4 *);
+void tcg_gen_gvec_4i(uint32_t dofs, uint32_t aofs, uint32_t bofs, uint32_t cofs,
+ uint32_t oprsz, uint32_t maxsz, int64_t c,
+ const GVecGen4i *);
 
 /* Expand a specific vector operation.  */
 
diff --git a/tcg/tcg-op-gvec.c b/tcg/tcg-op-gvec.c
index ffe55e908f..079a761b04 100644
--- a/tcg/tcg-op-gvec.c
+++ b/tcg/tcg-op-gvec.c
@@ -836,6 +836,30 @@ static void expand_4_i32(uint32_t dofs, uint32_t aofs, uint32_t bofs,
 tcg_temp_free_i32(t0);
 }
 
+static void expand_4i_i32(uint32_t dofs, uint32_t aofs, uint32_t bofs,
+  uint32_t cofs, uint32_t oprsz, int32_t c,
+  void (*fni)(TCGv_i32, TCGv_i32, TCGv_i32, TCGv_i32,
+  int32_t))
+{
+TCGv_i32 t0 = tcg_temp_new_i32();
+TCGv_i32 t1 = tcg_temp_new_i32();
+TCGv_i32 t2 = tcg_temp_new_i32();
+TCGv_i32 t3 = tcg_temp_new_i32();
+uint32_t i;
+
+for (i = 0; i < oprsz; i += 4) {
+tcg_gen_ld_i32(t1, cpu_env, aofs + i);
+tcg_gen_ld_i32(t2, cpu_env, bofs + i);
+tcg_gen_ld_i32(t3, cpu_env, cofs + i);
+fni(t0, t1, t2, t3, c);
+tcg_gen_st_i32(t0, cpu_env, dofs + i);
+}
+tcg_temp_free_i32(t3);
+tcg_temp_free_i32(t2);
+tcg_temp_free_i32(t1);
+tcg_temp_free_i32(t0);
+}
+
 /* Expand OPSZ bytes worth of two-operand operations using i64 elements.  */
 static void expand_2_i64(uint32_t dofs, uint32_t aofs, uint32_t oprsz,
  bool load_dest, void (*fni)(TCGv_i64, TCGv_i64))
@@ -971,6 +995,30 @@ static void expand_4_i64(uint32_t dofs, uint32_t aofs, uint32_t bofs,
 tcg_temp_free_i64(t0);
 }
 
+static void expand_4i_i64(uint32_t dofs, uint32_t aofs, uint32_t bofs,
+  uint32_t cofs, uint32_t oprsz, int64_t c,
+  void (*fni)(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_i64,
+  int64_t))
+{
+TCGv_i64 t0 = tcg_temp_new_i64();
+TCGv_i64 t1 = tcg_temp_new_i64();
+TCGv_i64 t2 = tcg_temp_new_i64();
+TCGv_i64 t3 = tcg_temp_new_i64();
+uint32_t i;
+
+for (i = 0; i < oprsz; i += 8) {
+tcg_gen_ld_i64(t1, cpu_env, aofs + i);
+tcg_gen_ld_i64(t2, cpu_env, bofs + i);
+tcg_gen_ld_i64(t3, cpu_env, cofs + i);
+fni(t0, t1, t2, t3, c);
+tcg_gen_st_i64(t0, cpu_env, dofs + i);
+}
+tcg_temp_free_i64(t3);
+tcg_temp_free_i64(t2);
+tcg_temp_free_i64(t1);
+tcg_temp_free_i64(t0);
+}
+
 /* Expand OPSZ bytes worth of two-operand operations using host vectors.  */
 static void expand_2_vec(unsigned vece, uint32_t dofs, uint32_t aofs,
  uint32_t oprsz, uint32_t tysz, TCGType type,
@@ -1121,6 +1169,35 @@ static void expand_4_vec(unsigned vece, uint32_t dofs, uint32_t aofs,
 tcg_temp_free_vec(t0);
 }
 
+/*
+ * Expand OPSZ bytes worth of four-vector operands and an immediate operand
+ * using host vectors.
+ */
+static void expand_4i_vec(unsigned vece, uint32_t dofs, uint32_t aofs,
+  uint32_t bofs, uint32_t cofs, uint32_t oprsz,
+  uint32_t tysz, TCGType type, int64_t c,
+   

Re: [PATCH v5 35/49] target/ppc: Implement xxgenpcv[bhwd]m instruction

2022-02-25 Thread Richard Henderson

On 2/25/22 11:09, matheus.fe...@eldorado.org.br wrote:

From: Matheus Ferst

Signed-off-by: Matheus Ferst
---
v5:
  - One helper for each IMM value.
---
  target/ppc/helper.h | 16 +
  target/ppc/insn32.decode| 10 
  target/ppc/int_helper.c | 91 +
  target/ppc/translate/vsx-impl.c.inc | 43 ++
  4 files changed, 160 insertions(+)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH v5 27/49] target/ppc: implement vrlqmi

2022-02-25 Thread Richard Henderson

On 2/25/22 11:09, matheus.fe...@eldorado.org.br wrote:

+if (insert) {
+get_avr64(n, a->vrt, true);
+get_avr64(vrb, a->vrt, false);
+tcg_gen_not_i64(ah, ah);
+tcg_gen_not_i64(al, al);
+tcg_gen_and_i64(n, n, ah);
+tcg_gen_and_i64(vrb, vrb, al);


Two tcg_gen_andc_i64.
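
For readers following along, a minimal plain-C sketch of why the suggestion works, assuming only the documented semantics of tcg_gen_andc_i64 (ret = arg1 & ~arg2). The function names model_andc64 and model_not_then_and are hypothetical and exist only for this illustration; they are not QEMU code.

```c
#include <assert.h>
#include <stdint.h>

/* Plain-C model of tcg_gen_andc_i64(ret, a, b): ret = a & ~b. */
static uint64_t model_andc64(uint64_t a, uint64_t b)
{
    return a & ~b;
}

/*
 * Plain-C model of the quoted sequence: invert the mask first
 * (tcg_gen_not_i64), then AND it with the value (tcg_gen_and_i64).
 */
static uint64_t model_not_then_and(uint64_t value, uint64_t mask)
{
    mask = ~mask;
    return value & mask;
}
```

Under that reading, each not/and pair in the quoted hunk collapses into one andc, e.g. tcg_gen_andc_i64(n, n, ah). One difference to keep in mind: the not/and form leaves ah and al inverted, while andc leaves them untouched, so this is only a drop-in replacement if nothing later depends on the inverted values (the rest of the function is not quoted here).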

Otherwise,
Reviewed-by: Richard Henderson 


r~



[PATCH v5 29/49] target/ppc: Move xxsel to decodetree

2022-02-25 Thread matheus . ferst
From: Matheus Ferst 

Reviewed-by: Richard Henderson 
Signed-off-by: Matheus Ferst 
---
 target/ppc/insn32.decode|  6 
 target/ppc/insn64.decode| 24 
 target/ppc/translate/vsx-impl.c.inc | 20 ++
 target/ppc/translate/vsx-ops.c.inc  | 43 -
 4 files changed, 26 insertions(+), 67 deletions(-)

diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 0a3ada2b66..66cb9184cd 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -148,12 +148,16 @@
 %xx_xt  0:1 21:5
 %xx_xb  1:1 11:5
 %xx_xa  2:1 16:5
+%xx_xc  3:1 6:5
 &XX2xt xb uim:uint8_t
 @XX2.. . ... uim:2 . . ..   &XX2 xt=%xx_xt 
xb=%xx_xb
 
 &XX3xt xa xb
 @XX3.. . . .  ...   &XX3 xt=%xx_xt 
xa=%xx_xa xb=%xx_xb
 
+&XX4xt xa xb xc
+@XX4.. . . . . ..   &XX4 xt=%xx_xt 
xa=%xx_xa xb=%xx_xb xc=%xx_xc
+
 &Z22_bf_fra bf fra dm
 @Z22_bf_fra .. bf:3 .. fra:5 dm:6 . .   &Z22_bf_fra
 
@@ -600,6 +604,8 @@ STXVPX  01 . . . 0111001101 -   
@X_TSXP
 XXSPLTIB00 . 00  0101101000 .   @X_imm8
 XXSPLTW 00 . ---.. . 010100100 . .  @XX2
 
+XXSEL   00 . . . . 11   @XX4
+
 ## VSX Vector Load Special Value Instruction
 
 LXVKQ   00 . 1 . 0101101000 .   @X_uim5
diff --git a/target/ppc/insn64.decode b/target/ppc/insn64.decode
index 39e610913d..9e4f531fb9 100644
--- a/target/ppc/insn64.decode
+++ b/target/ppc/insn64.decode
@@ -44,15 +44,15 @@
 .. .   .  \
 &8RR_D si=%8rr_si xt=%8rr_xt
 
-# Format XX4
-&XX4xt xa xb xc
-%xx4_xt 0:1 21:5
-%xx4_xa 2:1 16:5
-%xx4_xb 1:1 11:5
-%xx4_xc 3:1  6:5
-@XX4    \
+# Format 8RR:XX4
+%8rr_xx_xt  0:1 21:5
+%8rr_xx_xa  2:1 16:5
+%8rr_xx_xb  1:1 11:5
+%8rr_xx_xc  3:1  6:5
+&8RR_XX4xt xa xb xc
+@8RR_XX4    \
 .. . . . . ..  \
-&XX4 xt=%xx4_xt xa=%xx4_xa xb=%xx4_xb xc=%xx4_xc
+&8RR_XX4 xt=%8rr_xx_xt xa=%8rr_xx_xa xb=%8rr_xx_xb 
xc=%8rr_xx_xc
 
 ### Fixed-Point Load Instructions
 
@@ -187,10 +187,10 @@ XXSPLTI32DX 01 01  -- --  \
 10 . 000 .. @8RR_D_IX
 
 XXBLENDVD   01 01  -- -- \
-11 . . . . 11   @XX4
+11 . . . . 11   @8RR_XX4
 XXBLENDVW   01 01  -- -- \
-11 . . . . 10   @XX4
+11 . . . . 10   @8RR_XX4
 XXBLENDVH   01 01  -- -- \
-11 . . . . 01   @XX4
+11 . . . . 01   @8RR_XX4
 XXBLENDVB   01 01  -- -- \
-11 . . . . 00   @XX4
+11 . . . . 00   @8RR_XX4
diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
index e8a4ba0cfa..48e4a2e266 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -1422,19 +1422,15 @@ static void glue(gen_, name)(DisasContext *ctx) 
\
 VSX_XXMRG(xxmrghw, 1)
 VSX_XXMRG(xxmrglw, 0)
 
-static void gen_xxsel(DisasContext *ctx)
+static bool trans_XXSEL(DisasContext *ctx, arg_XX4 *a)
 {
-int rt = xT(ctx->opcode);
-int ra = xA(ctx->opcode);
-int rb = xB(ctx->opcode);
-int rc = xC(ctx->opcode);
+REQUIRE_INSNS_FLAGS2(ctx, VSX);
+REQUIRE_VSX(ctx);
 
-if (unlikely(!ctx->vsx_enabled)) {
-gen_exception(ctx, POWERPC_EXCP_VSXU);
-return;
-}
-tcg_gen_gvec_bitsel(MO_64, vsr_full_offset(rt), vsr_full_offset(rc),
-vsr_full_offset(rb), vsr_full_offset(ra), 16, 16);
+tcg_gen_gvec_bitsel(MO_64, vsr_full_offset(a->xt), vsr_full_offset(a->xc),
+vsr_full_offset(a->xb), vsr_full_offset(a->xa), 16, 16);
+
+return true;
 }
 
 static bool trans_XXSPLTW(DisasContext *ctx, arg_XX2 *a)
@@ -2127,7 +2123,7 @@ static void gen_xxblendv_vec(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b,
 tcg_temp_free_vec(tmp);
 }
 
-static bool do_xxblendv(DisasContext *ctx, arg_XX4 *a, unsigned vece)
+static bool do_xxblendv(DisasContext *ctx, arg_8RR_XX4 *a, unsigned vece)
 {
 static const TCGOpcode vecop_list[] = {
 INDEX_op_sari_vec, 0
diff --git a/target/ppc/translate/vsx-ops.c.inc b/target/ppc/translate/vsx-ops.c.inc
index c974324c4c..b0dbb38c80 

[PATCH v5 46/49] target/ppc: Implement xvcvbf16spn and xvcvspbf16 instructions

2022-02-25 Thread matheus . ferst
From: Víctor Colombo 

Signed-off-by: Víctor Colombo 
Signed-off-by: Matheus Ferst 
---
 target/ppc/fpu_helper.c | 18 +
 target/ppc/helper.h |  1 +
 target/ppc/insn32.decode| 11 +++---
 target/ppc/translate/vsx-impl.c.inc | 31 -
 4 files changed, 57 insertions(+), 4 deletions(-)

diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index 7144007316..8f970288f5 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -2785,6 +2785,24 @@ VSX_CVT_FP_TO_FP_HP(xscvhpdp, 1, float16, float64, VsrH(3), VsrD(0), 1)
 VSX_CVT_FP_TO_FP_HP(xvcvsphp, 4, float32, float16, VsrW(i), VsrH(2 * i  + 1), 
0)
 VSX_CVT_FP_TO_FP_HP(xvcvhpsp, 4, float16, float32, VsrH(2 * i + 1), VsrW(i), 0)
 
+void helper_XVCVSPBF16(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb)
+{
+ppc_vsr_t t = { };
+int i, status;
+
+for (i = 0; i < 4; i++) {
+t.VsrH(2 * i + 1) = float32_to_bfloat16(xb->VsrW(i), &env->fp_status);
+}
+
+status = get_float_exception_flags(&env->fp_status);
+if (unlikely(status & float_flag_invalid_snan)) {
+float_invalid_op_vxsnan(env, GETPC());
+}
+
+*xt = t;
+do_float_check_status(env, GETPC());
+}
+
 void helper_XSCVQPDP(CPUPPCState *env, uint32_t ro, ppc_vsr_t *xt,
  ppc_vsr_t *xb)
 {
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 025da6271a..d06e573a9b 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -490,6 +490,7 @@ DEF_HELPER_FLAGS_4(xvcmpnesp, TCG_CALL_NO_RWG, i32, env, vsr, vsr, vsr)
 DEF_HELPER_3(xvcvspdp, void, env, vsr, vsr)
 DEF_HELPER_3(xvcvsphp, void, env, vsr, vsr)
 DEF_HELPER_3(xvcvhpsp, void, env, vsr, vsr)
+DEF_HELPER_3(XVCVSPBF16, void, env, vsr, vsr)
 DEF_HELPER_3(xvcvspsxds, void, env, vsr, vsr)
 DEF_HELPER_3(xvcvspsxws, void, env, vsr, vsr)
 DEF_HELPER_3(xvcvspuxds, void, env, vsr, vsr)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 126cadadb8..fede42f5ce 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -152,8 +152,11 @@
 %xx_xb  1:1 11:5
 %xx_xa  2:1 16:5
 %xx_xc  3:1 6:5
-&XX2xt xb uim:uint8_t
-@XX2.. . ... uim:2 . . ..   &XX2 xt=%xx_xt 
xb=%xx_xb
+&XX2xt xb
+@XX2.. . . . . ..   &XX2 xt=%xx_xt 
xb=%xx_xb
+
+&XX2_uim2   xt xb uim:uint8_t
+@XX2_uim2   .. . ... uim:2 . . ..   &XX2_uim2 
xt=%xx_xt xb=%xx_xb
 
 &XX2_bf_xb  bf xb
 @XX2_bf_xb  .. bf:3 .. . . . . .&XX2_bf_xb 
xb=%xx_xb
@@ -637,7 +640,7 @@ XSNMSUBQP   11 . . . 000100 .   
@X_rc
 ## VSX splat instruction
 
 XXSPLTIB00 . 00  0101101000 .   @X_imm8
-XXSPLTW 00 . ---.. . 010100100 . .  @XX2
+XXSPLTW 00 . ---.. . 010100100 . .  @XX2_uim2
 
 ## VSX Permute Instructions
 
@@ -677,6 +680,8 @@ XSCMPGTQP   11 . . . 0011100100 -   @X
 ## VSX Binary Floating-Point Convert Instructions
 
 XSCVQPDP11 . 10100 . 1101000100 .   @X_tb_rc
+XVCVBF16SPN 00 . 1 . 111011011 ..   @XX2
+XVCVSPBF16  00 . 10001 . 111011011 ..   @XX2
 
 ## VSX Vector Test Least-Significant Bit by Byte Instruction
 
diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
index 3b0c8bf3ca..0344c47eed 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -1590,7 +1590,7 @@ static bool trans_XXSEL(DisasContext *ctx, arg_XX4 *a)
 return true;
 }
 
-static bool trans_XXSPLTW(DisasContext *ctx, arg_XX2 *a)
+static bool trans_XXSPLTW(DisasContext *ctx, arg_XX2_uim2 *a)
 {
 int tofs, bofs;
 
@@ -2648,6 +2648,35 @@ TRANS(XSCMPGTQP, do_xscmpqp, gen_helper_XSCMPGTQP)
 TRANS(XSMAXCQP, do_xscmpqp, gen_helper_XSMAXCQP)
 TRANS(XSMINCQP, do_xscmpqp, gen_helper_XSMINCQP)
 
+static bool trans_XVCVSPBF16(DisasContext *ctx, arg_XX2 *a)
+{
+TCGv_ptr xt, xb;
+
+REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+REQUIRE_VSX(ctx);
+
+xt = gen_vsr_ptr(a->xt);
+xb = gen_vsr_ptr(a->xb);
+
+gen_helper_XVCVSPBF16(cpu_env, xt, xb);
+
+tcg_temp_free_ptr(xt);
+tcg_temp_free_ptr(xb);
+
+return true;
+}
+
+static bool trans_XVCVBF16SPN(DisasContext *ctx, arg_XX2 *a)
+{
+REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+REQUIRE_VSX(ctx);
+
+tcg_gen_gvec_shli(MO_32, vsr_full_offset(a->xt), vsr_full_offset(a->xb),
+  16, 16, 16);
+
+return true;
+}
+
 #undef GEN_XX2FORM
 #undef GEN_XX3FORM
 #undef GEN_XX2IFORM
-- 
2.25.1




[PATCH v5 28/49] target/ppc: Move vsel and vperm/vpermr to decodetree

2022-02-25 Thread matheus . ferst
From: Matheus Ferst 

Reviewed-by: Richard Henderson 
Signed-off-by: Matheus Ferst 
---
 target/ppc/helper.h |  5 +--
 target/ppc/insn32.decode|  5 +++
 target/ppc/int_helper.c | 13 +-
 target/ppc/translate/vmx-impl.c.inc | 69 ++---
 target/ppc/translate/vmx-ops.c.inc  |  2 -
 5 files changed, 62 insertions(+), 32 deletions(-)

diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 6bd7fad70c..fd559d72d3 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -223,9 +223,8 @@ DEF_HELPER_2(vupklsh, void, avr, avr)
 DEF_HELPER_2(vupklsw, void, avr, avr)
 DEF_HELPER_5(vmsumubm, void, env, avr, avr, avr, avr)
 DEF_HELPER_5(vmsummbm, void, env, avr, avr, avr, avr)
-DEF_HELPER_5(vsel, void, env, avr, avr, avr, avr)
-DEF_HELPER_5(vperm, void, env, avr, avr, avr, avr)
-DEF_HELPER_5(vpermr, void, env, avr, avr, avr, avr)
+DEF_HELPER_FLAGS_4(VPERM, TCG_CALL_NO_RWG, void, avr, avr, avr, avr)
+DEF_HELPER_FLAGS_4(VPERMR, TCG_CALL_NO_RWG, void, avr, avr, avr, avr)
 DEF_HELPER_4(vpkshss, void, env, avr, avr, avr)
 DEF_HELPER_4(vpkshus, void, env, avr, avr, avr)
 DEF_HELPER_4(vpkswss, void, env, avr, avr, avr)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index abc2007129..0a3ada2b66 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -467,6 +467,11 @@ VINSWVRX000100 . . . 0011000@VX
 VSLDBI  000100 . . . 00 ... 010110  @VN
 VSRDBI  000100 . . . 01 ... 010110  @VN
 
+VPERM   000100 . . . . 101011   @VA
+VPERMR  000100 . . . . 111011   @VA
+
+VSEL000100 . . . . 101010   @VA
+
 ## Vector Integer Shift Instruction
 
 VSLB000100 . . . 0010100@VX
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index f52242ca81..6c63c7b227 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -1015,8 +1015,7 @@ VMUL(UW, u32, VsrW, VsrD, uint64_t)
 #undef VMUL_DO_ODD
 #undef VMUL
 
-void helper_vperm(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b,
-  ppc_avr_t *c)
+void helper_VPERM(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, ppc_avr_t *c)
 {
 ppc_avr_t result;
 int i;
@@ -1034,8 +1033,7 @@ void helper_vperm(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b,
 *r = result;
 }
 
-void helper_vpermr(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b,
-  ppc_avr_t *c)
+void helper_VPERMR(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, ppc_avr_t *c)
 {
 ppc_avr_t result;
 int i;
@@ -1303,13 +1301,6 @@ VRLMI(VRLWMI, 32, u32, 1);
 VRLMI(VRLDNM, 64, u64, 0);
 VRLMI(VRLWNM, 32, u32, 0);
 
-void helper_vsel(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b,
- ppc_avr_t *c)
-{
-r->u64[0] = (a->u64[0] & ~c->u64[0]) | (b->u64[0] & c->u64[0]);
-r->u64[1] = (a->u64[1] & ~c->u64[1]) | (b->u64[1] & c->u64[1]);
-}
-
 void helper_vexptefp(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *b)
 {
 int i;
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index 352250fad0..f91bee839d 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -2559,28 +2559,65 @@ static void gen_vmladduhm(DisasContext *ctx)
 tcg_temp_free_ptr(rd);
 }
 
-static void gen_vpermr(DisasContext *ctx)
+static bool trans_VPERM(DisasContext *ctx, arg_VA *a)
 {
-TCGv_ptr ra, rb, rc, rd;
-if (unlikely(!ctx->altivec_enabled)) {
-gen_exception(ctx, POWERPC_EXCP_VPU);
-return;
-}
-ra = gen_avr_ptr(rA(ctx->opcode));
-rb = gen_avr_ptr(rB(ctx->opcode));
-rc = gen_avr_ptr(rC(ctx->opcode));
-rd = gen_avr_ptr(rD(ctx->opcode));
-gen_helper_vpermr(cpu_env, rd, ra, rb, rc);
-tcg_temp_free_ptr(ra);
-tcg_temp_free_ptr(rb);
-tcg_temp_free_ptr(rc);
-tcg_temp_free_ptr(rd);
+TCGv_ptr vrt, vra, vrb, vrc;
+
+REQUIRE_INSNS_FLAGS(ctx, ALTIVEC);
+REQUIRE_VECTOR(ctx);
+
+vrt = gen_avr_ptr(a->vrt);
+vra = gen_avr_ptr(a->vra);
+vrb = gen_avr_ptr(a->vrb);
+vrc = gen_avr_ptr(a->rc);
+
+gen_helper_VPERM(vrt, vra, vrb, vrc);
+
+tcg_temp_free_ptr(vrt);
+tcg_temp_free_ptr(vra);
+tcg_temp_free_ptr(vrb);
+tcg_temp_free_ptr(vrc);
+
+return true;
+}
+
+static bool trans_VPERMR(DisasContext *ctx, arg_VA *a)
+{
+TCGv_ptr vrt, vra, vrb, vrc;
+
+REQUIRE_INSNS_FLAGS2(ctx, ISA300);
+REQUIRE_VECTOR(ctx);
+
+vrt = gen_avr_ptr(a->vrt);
+vra = gen_avr_ptr(a->vra);
+vrb = gen_avr_ptr(a->vrb);
+vrc = gen_avr_ptr(a->rc);
+
+gen_helper_VPERMR(vrt, vra, vrb, vrc);
+
+tcg_temp_free_ptr(vrt);
+tcg_temp_free_ptr(vra);
+tcg_temp_free_ptr(vrb);
+tcg_temp_free_ptr(vrc);
+
+return true;
+}
+
+static bool trans_VSEL(DisasContext *ctx, arg_VA *a)
+{
+REQUIRE_INSNS_FLAGS(ctx, ALTIVEC);
+

Re: [PATCH v5 24/49] target/ppc: move vrl[bhwd]nm/vrl[bhwd]mi to decodetree

2022-02-25 Thread Richard Henderson

On 2/25/22 11:09, matheus.fe...@eldorado.org.br wrote:

From: Matheus Ferst

Signed-off-by: Matheus Ferst
---
I couldn't figure out how to use tcg_gen_gvec_rotlv here. Since the code
is in the fniv implementation, we have TCGv_vec instead of offsets. I'm
keeping the masking for now, so the generated code has the desired
effect.


Fair.

Reviewed-by: Richard Henderson 

r~



[PATCH v5 49/49] target/ppc: implement lxvr[bhwd]/stxvr[bhwd]x

2022-02-25 Thread matheus . ferst
From: Lucas Coutinho 

Implement the following PowerISA v3.1 instructions:
lxvrbx: Load VSX Vector Rightmost Byte Indexed X-form
lxvrhx: Load VSX Vector Rightmost Halfword Indexed X-form
lxvrwx: Load VSX Vector Rightmost Word Indexed X-form
lxvrdx: Load VSX Vector Rightmost Doubleword Indexed X-form

stxvrbx: Store VSX Vector Rightmost Byte Indexed X-form
stxvrhx: Store VSX Vector Rightmost Halfword Indexed X-form
stxvrwx: Store VSX Vector Rightmost Word Indexed X-form
stxvrdx: Store VSX Vector Rightmost Doubleword Indexed X-form
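
As a rough illustration of the behaviour listed above, here is a plain-C sketch modelled on what do_lstrm in this patch does: the load places the zero-extended element in the rightmost doubleword and clears the other one, and the store writes the rightmost element back. The Vsr128 type and the load_be, model_lxvrx and model_stxvrx helpers are hypothetical names for this example only, and big-endian byte order is assumed for simplicity (the real code takes the byte order from DEF_MEMOP).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 128-bit register model: hi is doubleword 0, lo is doubleword 1. */
typedef struct {
    uint64_t hi;
    uint64_t lo;
} Vsr128;

/* Read SIZE bytes (1, 2, 4 or 8) from MEM as a big-endian unsigned integer. */
static uint64_t load_be(const uint8_t *mem, size_t size)
{
    uint64_t v = 0;
    for (size_t i = 0; i < size; i++) {
        v = (v << 8) | mem[i];
    }
    return v;
}

/* lxvr[bhwd]x model: the element lands in the rightmost position, rest is zero. */
static Vsr128 model_lxvrx(const uint8_t *mem, size_t size)
{
    Vsr128 r = { .hi = 0, .lo = load_be(mem, size) };
    return r;
}

/* stxvr[bhwd]x model: write the rightmost SIZE bytes of the register to MEM. */
static void model_stxvrx(uint8_t *mem, size_t size, Vsr128 xs)
{
    for (size_t i = 0; i < size; i++) {
        mem[i] = (uint8_t)(xs.lo >> (8 * (size - 1 - i)));
    }
}
```

With size set to 1, 2, 4 or 8 this corresponds to the byte, halfword, word and doubleword forms respectively.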

Reviewed-by: Richard Henderson 
Signed-off-by: Lucas Coutinho 
Signed-off-by: Matheus Ferst 
---
 target/ppc/insn32.decode|  8 +++
 target/ppc/translate/vsx-impl.c.inc | 35 +
 2 files changed, 43 insertions(+)

diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 1641a31894..ac2d3da9a7 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -614,6 +614,14 @@ LXVX01 . . . 0100 - 01100 . 
@X_TSX
 STXVX   01 . . . 0110001100 .   @X_TSX
 LXVPX   01 . . . 0101001101 -   @X_TSXP
 STXVPX  01 . . . 0111001101 -   @X_TSXP
+LXVRBX  01 . . . 001101 .   @X_TSX
+LXVRHX  01 . . . 101101 .   @X_TSX
+LXVRWX  01 . . . 0001001101 .   @X_TSX
+LXVRDX  01 . . . 0001101101 .   @X_TSX
+STXVRBX 01 . . . 0010001101 .   @X_TSX
+STXVRHX 01 . . . 0010101101 .   @X_TSX
+STXVRWX 01 . . . 0011001101 .   @X_TSX
+STXVRDX 01 . . . 0011101101 .   @X_TSX
 
 ## VSX Scalar Multiply-Add Instructions
 
diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
index a980a79b78..2ffeab5287 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -2363,6 +2363,41 @@ TRANS64_FLAGS2(ISA310, PLXV, do_lstxv_PLS_D, false, false)
 TRANS64_FLAGS2(ISA310, PSTXVP, do_lstxv_PLS_D, true, true)
 TRANS64_FLAGS2(ISA310, PLXVP, do_lstxv_PLS_D, false, true)
 
+static bool do_lstrm(DisasContext *ctx, arg_X *a, MemOp mop, bool store)
+{
+TCGv ea;
+TCGv_i64 xt;
+
+REQUIRE_VSX(ctx);
+
+xt = tcg_temp_new_i64();
+
+gen_set_access_type(ctx, ACCESS_INT);
+ea = do_ea_calc(ctx, a->ra , cpu_gpr[a->rb]);
+
+if (store) {
+get_cpu_vsr(xt, a->rt, false);
+tcg_gen_qemu_st_i64(xt, ea, ctx->mem_idx, mop);
+} else {
+tcg_gen_qemu_ld_i64(xt, ea, ctx->mem_idx, mop);
+set_cpu_vsr(a->rt, xt, false);
+set_cpu_vsr(a->rt, tcg_constant_i64(0), true);
+}
+
+tcg_temp_free(ea);
+tcg_temp_free_i64(xt);
+return true;
+}
+
+TRANS_FLAGS2(ISA310, LXVRBX, do_lstrm, DEF_MEMOP(MO_UB), false)
+TRANS_FLAGS2(ISA310, LXVRHX, do_lstrm, DEF_MEMOP(MO_UW), false)
+TRANS_FLAGS2(ISA310, LXVRWX, do_lstrm, DEF_MEMOP(MO_UL), false)
+TRANS_FLAGS2(ISA310, LXVRDX, do_lstrm, DEF_MEMOP(MO_UQ), false)
+TRANS_FLAGS2(ISA310, STXVRBX, do_lstrm, DEF_MEMOP(MO_UB), true)
+TRANS_FLAGS2(ISA310, STXVRHX, do_lstrm, DEF_MEMOP(MO_UW), true)
+TRANS_FLAGS2(ISA310, STXVRWX, do_lstrm, DEF_MEMOP(MO_UL), true)
+TRANS_FLAGS2(ISA310, STXVRDX, do_lstrm, DEF_MEMOP(MO_UQ), true)
+
 static void gen_xxeval_i64(TCGv_i64 t, TCGv_i64 a, TCGv_i64 b, TCGv_i64 c,
int64_t imm)
 {
-- 
2.25.1




Re: [PATCH v5 04/49] target/ppc: vmulh* instructions without helpers

2022-02-25 Thread Richard Henderson

On 2/25/22 11:08, matheus.fe...@eldorado.org.br wrote:

+static void do_vx_vmulhw_i64(TCGv_i64 t, TCGv_i64 a, TCGv_i64 b, bool sign)
+{
+TCGv_i64 hh, lh, temp;
+
+uint64_t c;
+hh = tcg_temp_new_i64();
+lh = tcg_temp_new_i64();
+temp = tcg_temp_new_i64();
+
+c = 0x;
+
+if (sign) {
+tcg_gen_ext32s_i64(lh, a);
+tcg_gen_ext32s_i64(temp, b);
+} else {
+tcg_gen_andi_i64(lh, a, c);
+tcg_gen_andi_i64(temp, b, c);


Nit: tcg_gen_ext32u_i64.


+tcg_gen_andi_i64(hh, hh, c << 32);
+tcg_gen_or_i64(t, hh, lh);


Nit: tcg_gen_deposit_i64(t, hh, lh, 0, 32);
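
For context, a small plain-C sketch of the two operations named in the nits, assuming the documented TCG semantics: ext32u keeps the low 32 bits and zeroes the rest, and deposit copies the low len bits of the second source into the first source at bit offset ofs. The names model_ext32u and model_deposit64 are hypothetical and only serve this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Model of tcg_gen_ext32u_i64(ret, a): keep the low 32 bits, zero the rest. */
static uint64_t model_ext32u(uint64_t a)
{
    return a & UINT64_C(0xffffffff);
}

/*
 * Model of tcg_gen_deposit_i64(ret, a, b, ofs, len): copy the low LEN bits
 * of B into A at bit offset OFS and leave the other bits of A unchanged.
 * This sketch is only valid for 0 < len < 64, to avoid shifting by 64.
 */
static uint64_t model_deposit64(uint64_t a, uint64_t b,
                                unsigned ofs, unsigned len)
{
    uint64_t mask = ((UINT64_C(1) << len) - 1) << ofs;

    return (a & ~mask) | ((b << ofs) & mask);
}
```

Read this way, deposit(hh, lh, 0, 32) keeps the upper half of hh and takes the lower half from lh in a single op. Unlike the andi/or pair, it also ignores whatever sits in the upper 32 bits of lh.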

Reviewed-by: Richard Henderson 


r~



[PATCH v5 45/49] target/ppc: Implement xs{max,min}cqp

2022-02-25 Thread matheus . ferst
From: Víctor Colombo 

Signed-off-by: Víctor Colombo 
Signed-off-by: Matheus Ferst 
---
changes for v5:
- Update the helper macro call with the new parameters added to
  VSX_MAX_MINC
---
 target/ppc/fpu_helper.c | 2 ++
 target/ppc/helper.h | 2 ++
 target/ppc/insn32.decode| 3 +++
 target/ppc/translate/vsx-impl.c.inc | 2 ++
 4 files changed, 9 insertions(+)

diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index 0aaf529ac8..7144007316 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -2560,6 +2560,8 @@ void helper_##name(CPUPPCState *env,  
\
 
 VSX_MAX_MINC(XSMAXCDP, true, float64, VsrD(0));
 VSX_MAX_MINC(XSMINCDP, false, float64, VsrD(0));
+VSX_MAX_MINC(XSMAXCQP, true, float128, f128);
+VSX_MAX_MINC(XSMINCQP, false, float128, f128);
 
 #define VSX_MAX_MINJ(name, max)   \
 void helper_##name(CPUPPCState *env,  \
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index b8033e2f58..025da6271a 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -375,6 +375,8 @@ DEF_HELPER_4(XSMAXCDP, void, env, vsr, vsr, vsr)
 DEF_HELPER_4(XSMINCDP, void, env, vsr, vsr, vsr)
 DEF_HELPER_4(XSMAXJDP, void, env, vsr, vsr, vsr)
 DEF_HELPER_4(XSMINJDP, void, env, vsr, vsr, vsr)
+DEF_HELPER_4(XSMAXCQP, void, env, vsr, vsr, vsr)
+DEF_HELPER_4(XSMINCQP, void, env, vsr, vsr, vsr)
 DEF_HELPER_3(xscvdphp, void, env, vsr, vsr)
 DEF_HELPER_4(xscvdpqp, void, env, i32, vsr, vsr)
 DEF_HELPER_3(xscvdpsp, void, env, vsr, vsr)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 6fbb2d188f..126cadadb8 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -664,6 +664,9 @@ XSMAXCDP00 . . . 1000 ...   @XX3
 XSMINCDP00 . . . 10001000 ...   @XX3
 XSMAXJDP00 . . . 1001 ...   @XX3
 XSMINJDP00 . . . 10011000 ...   @XX3
+XSMAXCQP11 . . . 1010100100 -   @X
+XSMINCQP11 . . . 1011100100 -   @X
+
 XSCMPEQDP   00 . . . 0011 ...   @XX3
 XSCMPGEDP   00 . . . 00010011 ...   @XX3
 XSCMPGTDP   00 . . . 1011 ...   @XX3
diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
index 5273452d13..3b0c8bf3ca 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -2645,6 +2645,8 @@ static bool do_xscmpqp(DisasContext *ctx, arg_X *a,
 TRANS(XSCMPEQQP, do_xscmpqp, gen_helper_XSCMPEQQP)
 TRANS(XSCMPGEQP, do_xscmpqp, gen_helper_XSCMPGEQP)
 TRANS(XSCMPGTQP, do_xscmpqp, gen_helper_XSCMPGTQP)
+TRANS(XSMAXCQP, do_xscmpqp, gen_helper_XSMAXCQP)
+TRANS(XSMINCQP, do_xscmpqp, gen_helper_XSMINCQP)
 
 #undef GEN_XX2FORM
 #undef GEN_XX3FORM
-- 
2.25.1




[PATCH v5 23/49] target/ppc: move vrl[bhwd] to decodetree

2022-02-25 Thread matheus . ferst
From: Matheus Ferst 

Reviewed-by: Richard Henderson 
Signed-off-by: Matheus Ferst 
---
 target/ppc/insn32.decode|  5 +
 target/ppc/translate/vmx-impl.c.inc | 13 +
 target/ppc/translate/vmx-ops.c.inc  |  6 ++
 3 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 7a9fc1dffa..d918e2d0f2 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -487,6 +487,11 @@ VSRAW   000100 . . . 0111100@VX
 VSRAD   000100 . . . 0000100@VX
 VSRAQ   000100 . . . 0110101@VX
 
+VRLB000100 . . . 100@VX
+VRLH000100 . . . 1000100@VX
+VRLW000100 . . . 0001100@VX
+VRLD000100 . . . 00011000100@VX
+
 ## Vector Integer Arithmetic Instructions
 
 VEXTSB2W000100 . 1 . 1100010@VX_tb
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index 27ed87fcd6..f24b78d42e 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -834,6 +834,11 @@ TRANS_FLAGS(ALTIVEC, VSRAH, do_vector_gvec3_VX, MO_16, tcg_gen_gvec_sarv);
 TRANS_FLAGS(ALTIVEC, VSRAW, do_vector_gvec3_VX, MO_32, tcg_gen_gvec_sarv);
 TRANS_FLAGS2(ALTIVEC_207, VSRAD, do_vector_gvec3_VX, MO_64, tcg_gen_gvec_sarv);
 
+TRANS_FLAGS(ALTIVEC, VRLB, do_vector_gvec3_VX, MO_8, tcg_gen_gvec_rotlv)
+TRANS_FLAGS(ALTIVEC, VRLH, do_vector_gvec3_VX, MO_16, tcg_gen_gvec_rotlv)
+TRANS_FLAGS(ALTIVEC, VRLW, do_vector_gvec3_VX, MO_32, tcg_gen_gvec_rotlv)
+TRANS_FLAGS2(ALTIVEC_207, VRLD, do_vector_gvec3_VX, MO_64, tcg_gen_gvec_rotlv)
+
 static bool do_vector_shift_quad(DisasContext *ctx, arg_VX *a, bool right,
  bool alg)
 {
@@ -970,16 +975,8 @@ GEN_VXFORM3(vsubeuqm, 31, 0);
 GEN_VXFORM3(vsubecuq, 31, 0);
 GEN_VXFORM_DUAL(vsubeuqm, PPC_NONE, PPC2_ALTIVEC_207, \
 vsubecuq, PPC_NONE, PPC2_ALTIVEC_207)
-GEN_VXFORM_V(vrlb, MO_8, tcg_gen_gvec_rotlv, 2, 0);
-GEN_VXFORM_V(vrlh, MO_16, tcg_gen_gvec_rotlv, 2, 1);
-GEN_VXFORM_V(vrlw, MO_32, tcg_gen_gvec_rotlv, 2, 2);
 GEN_VXFORM(vrlwmi, 2, 2);
-GEN_VXFORM_DUAL(vrlw, PPC_ALTIVEC, PPC_NONE, \
-vrlwmi, PPC_NONE, PPC2_ISA300)
-GEN_VXFORM_V(vrld, MO_64, tcg_gen_gvec_rotlv, 2, 3);
 GEN_VXFORM(vrldmi, 2, 3);
-GEN_VXFORM_DUAL(vrld, PPC_NONE, PPC2_ALTIVEC_207, \
-vrldmi, PPC_NONE, PPC2_ISA300)
 GEN_VXFORM_TRANS(vsl, 2, 7);
 GEN_VXFORM(vrldnm, 2, 7);
 GEN_VXFORM_DUAL(vsl, PPC_ALTIVEC, PPC_NONE, \
diff --git a/target/ppc/translate/vmx-ops.c.inc b/target/ppc/translate/vmx-ops.c.inc
index 878bce92c6..a7acea3ca7 100644
--- a/target/ppc/translate/vmx-ops.c.inc
+++ b/target/ppc/translate/vmx-ops.c.inc
@@ -133,10 +133,8 @@ GEN_VXFORM_DUAL(vaddeuqm, vaddecuq, 30, 0xFF, PPC_NONE, PPC2_ALTIVEC_207),
 GEN_VXFORM_DUAL(vsubuqm, bcdtrunc, 0, 20, PPC2_ALTIVEC_207, PPC2_ISA300),
 GEN_VXFORM_DUAL(vsubcuq, bcdutrunc, 0, 21, PPC2_ALTIVEC_207, PPC2_ISA300),
 GEN_VXFORM_DUAL(vsubeuqm, vsubecuq, 31, 0xFF, PPC_NONE, PPC2_ALTIVEC_207),
-GEN_VXFORM(vrlb, 2, 0),
-GEN_VXFORM(vrlh, 2, 1),
-GEN_VXFORM_DUAL(vrlw, vrlwmi, 2, 2, PPC_ALTIVEC, PPC_NONE),
-GEN_VXFORM_DUAL(vrld, vrldmi, 2, 3, PPC_NONE, PPC2_ALTIVEC_207),
+GEN_VXFORM_300(vrlwmi, 2, 2),
+GEN_VXFORM_300(vrldmi, 2, 3),
 GEN_VXFORM_DUAL(vsl, vrldnm, 2, 7, PPC_ALTIVEC, PPC_NONE),
 GEN_VXFORM(vsr, 2, 11),
 GEN_VXFORM(vpkuhum, 7, 0),
-- 
2.25.1




[PATCH v5 36/49] target/ppc: move xs[n]madd[am][ds]p/xs[n]msub[am][ds]p to decodetree

2022-02-25 Thread matheus . ferst
From: Matheus Ferst 

Reviewed-by: Richard Henderson 
Signed-off-by: Matheus Ferst 
---
 target/ppc/fpu_helper.c | 23 ++--
 target/ppc/helper.h | 16 -
 target/ppc/insn32.decode| 22 
 target/ppc/translate/vsx-impl.c.inc | 56 -
 target/ppc/translate/vsx-ops.c.inc  | 16 -
 5 files changed, 90 insertions(+), 43 deletions(-)

diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index 0fd285defc..c8797d8053 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -2156,10 +2156,11 @@ VSX_TSQRT(xvtsqrtsp, 4, float32, VsrW(i), -126, 23)
  *   maddflgs - flags for the float*muladd routine that control the
  *   various forms (madd, msub, nmadd, nmsub)
  *   sfprf - set FPRF
+ *   r2sp  - round intermediate double precision result to single precision
  */
 #define VSX_MADD(op, nels, tp, fld, maddflgs, sfprf, r2sp)\
 void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, \
- ppc_vsr_t *xa, ppc_vsr_t *b, ppc_vsr_t *c)   \
+ ppc_vsr_t *s1, ppc_vsr_t *s2, ppc_vsr_t *s3) \
 { \
 ppc_vsr_t t = *xt;\
 int i;\
@@ -2175,12 +2176,12 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt,   
  \
  * result to odd. \
  */   \
 set_float_rounding_mode(float_round_to_zero, &tstat); \
-t.fld = tp##_muladd(xa->fld, b->fld, c->fld,  \
+t.fld = tp##_muladd(s1->fld, s3->fld, s2->fld,\
 maddflgs, &tstat);\
 t.fld |= (get_float_exception_flags(&tstat) & \
   float_flag_inexact) != 0;   \
 } else {  \
-t.fld = tp##_muladd(xa->fld, b->fld, c->fld,  \
+t.fld = tp##_muladd(s1->fld, s3->fld, s2->fld,\
 maddflgs, &tstat);\
 } \
 env->fp_status.float_exception_flags |= tstat.float_exception_flags;  \
@@ -2202,14 +2203,14 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt,   
  \
 do_float_check_status(env, GETPC());  \
 }
 
-VSX_MADD(xsmadddp, 1, float64, VsrD(0), MADD_FLGS, 1, 0)
-VSX_MADD(xsmsubdp, 1, float64, VsrD(0), MSUB_FLGS, 1, 0)
-VSX_MADD(xsnmadddp, 1, float64, VsrD(0), NMADD_FLGS, 1, 0)
-VSX_MADD(xsnmsubdp, 1, float64, VsrD(0), NMSUB_FLGS, 1, 0)
-VSX_MADD(xsmaddsp, 1, float64, VsrD(0), MADD_FLGS, 1, 1)
-VSX_MADD(xsmsubsp, 1, float64, VsrD(0), MSUB_FLGS, 1, 1)
-VSX_MADD(xsnmaddsp, 1, float64, VsrD(0), NMADD_FLGS, 1, 1)
-VSX_MADD(xsnmsubsp, 1, float64, VsrD(0), NMSUB_FLGS, 1, 1)
+VSX_MADD(XSMADDDP, 1, float64, VsrD(0), MADD_FLGS, 1, 0)
+VSX_MADD(XSMSUBDP, 1, float64, VsrD(0), MSUB_FLGS, 1, 0)
+VSX_MADD(XSNMADDDP, 1, float64, VsrD(0), NMADD_FLGS, 1, 0)
+VSX_MADD(XSNMSUBDP, 1, float64, VsrD(0), NMSUB_FLGS, 1, 0)
+VSX_MADD(XSMADDSP, 1, float64, VsrD(0), MADD_FLGS, 1, 1)
+VSX_MADD(XSMSUBSP, 1, float64, VsrD(0), MSUB_FLGS, 1, 1)
+VSX_MADD(XSNMADDSP, 1, float64, VsrD(0), NMADD_FLGS, 1, 1)
+VSX_MADD(XSNMSUBSP, 1, float64, VsrD(0), NMSUB_FLGS, 1, 1)
 
 VSX_MADD(xvmadddp, 2, float64, VsrD(i), MADD_FLGS, 0, 0)
 VSX_MADD(xvmsubdp, 2, float64, VsrD(i), MSUB_FLGS, 0, 0)
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index d1ed043b41..fff6041a7b 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -353,10 +353,10 @@ DEF_HELPER_3(xssqrtdp, void, env, vsr, vsr)
 DEF_HELPER_3(xsrsqrtedp, void, env, vsr, vsr)
 DEF_HELPER_4(xstdivdp, void, env, i32, vsr, vsr)
 DEF_HELPER_3(xstsqrtdp, void, env, i32, vsr)
-DEF_HELPER_5(xsmadddp, void, env, vsr, vsr, vsr, vsr)
-DEF_HELPER_5(xsmsubdp, void, env, vsr, vsr, vsr, vsr)
-DEF_HELPER_5(xsnmadddp, void, env, vsr, vsr, vsr, vsr)
-DEF_HELPER_5(xsnmsubdp, void, env, vsr, vsr, vsr, vsr)
+DEF_HELPER_5(XSMADDDP, void, env, vsr, vsr, vsr, vsr)
+DEF_HELPER_5(XSMSUBDP, void, env, vsr, vsr, vsr, vsr)
+DEF_HELPER_5(XSNMADDDP, void, env, vsr, vsr, vsr, vsr)
+DEF_HELPER_5(XSNMSUBDP, void, env, vsr, vsr, vsr, vsr)
 DEF_HELPER_4(xscmpeqdp, void, env, vsr, vsr, vsr)
 DEF_HELPER_4(xscmpgtdp, void, env, vsr, vsr, vsr)
 DEF_HELPER_4(xscmpgedp, void, env, vsr, vsr, vsr)
@@ -416,10 +416,10 @@ DEF_HELPER_3(xsresp, void, env, vsr, vsr)
 DEF_HELPER_2(xsrsp, i64, env, i64)

[PATCH v5 47/49] target/ppc: implement plxsd/pstxsd

2022-02-25 Thread matheus . ferst
From: Leandro Lupori 

Implement instructions plxsd/pstxsd and port lxsd/stxsd to decode
tree.

Reviewed-by: Richard Henderson 
Signed-off-by: Leandro Lupori 
Signed-off-by: Matheus Ferst 
---
 target/ppc/insn32.decode|  2 ++
 target/ppc/insn64.decode| 10 ++
 target/ppc/translate.c  | 14 ++--
 target/ppc/translate/vsx-impl.c.inc | 55 +++--
 4 files changed, 67 insertions(+), 14 deletions(-)

diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index fede42f5ce..37b6470503 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -602,6 +602,8 @@ VCLRRB  000100 . . . 00111001101@VX
 
 # VSX Load/Store Instructions
 
+LXSD111001 . . .. 10@DS
+STXSD   01 . . .. 10@DS
 LXV 01 . .  . 001   @DQ_TSX
 STXV01 . .  . 101   @DQ_TSX
 LXVP000110 . .  @DQ_TSXP
diff --git a/target/ppc/insn64.decode b/target/ppc/insn64.decode
index fdb859f62d..b7426f5b24 100644
--- a/target/ppc/insn64.decode
+++ b/target/ppc/insn64.decode
@@ -32,6 +32,10 @@
 .. . ra:5    \
 &PLS_D si=%pls_si rt=%rt_tsxp
 
+@8LS_D  .. .. . .. r:1 .. .. \
+.. rt:5 ra:5 \
+&PLS_D si=%pls_si
+
 # Format 8RR:D
 %8rr_si 32:s16 0:16
 %8rr_xt 16:1 21:5
@@ -180,6 +184,12 @@ PSTFD   01 10 0--.-- .. \
 
 ### VSX instructions
 
+PLXSD   01 00 0--.-- .. \
+101010 . .  @8LS_D
+
+PSTXSD  01 00 0--.-- .. \
+101110 . .  @8LS_D
+
 PLXV01 00 0--.-- .. \
 11001 .. .  @8LS_D_TSX
 PSTXV   01 00 0--.-- .. \
diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index b46a11386e..1ef2ad 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -6668,7 +6668,7 @@ static bool resolve_PLS_D(DisasContext *ctx, arg_D *d, arg_PLS_D *a)
 
 #include "translate/branch-impl.c.inc"
 
-/* Handles lfdp, lxsd, lxssp */
+/* Handles lfdp, lxssp */
 static void gen_dform39(DisasContext *ctx)
 {
 switch (ctx->opcode & 0x3) {
@@ -6677,11 +6677,6 @@ static void gen_dform39(DisasContext *ctx)
 return gen_lfdp(ctx);
 }
 break;
-case 2: /* lxsd */
-if (ctx->insns_flags2 & PPC2_ISA300) {
-return gen_lxsd(ctx);
-}
-break;
 case 3: /* lxssp */
 if (ctx->insns_flags2 & PPC2_ISA300) {
 return gen_lxssp(ctx);
@@ -6691,7 +6686,7 @@ static void gen_dform39(DisasContext *ctx)
 return gen_invalid(ctx);
 }
 
-/* handles stfdp, lxv, stxsd, stxssp lxvx */
+/* handles stfdp, lxv, stxssp lxvx */
 static void gen_dform3D(DisasContext *ctx)
 {
 if ((ctx->opcode & 3) != 1) { /* DS-FORM */
@@ -6701,11 +6696,6 @@ static void gen_dform3D(DisasContext *ctx)
 return gen_stfdp(ctx);
 }
 break;
-case 2: /* stxsd */
-if (ctx->insns_flags2 & PPC2_ISA300) {
-return gen_stxsd(ctx);
-}
-break;
 case 3: /* stxssp */
 if (ctx->insns_flags2 & PPC2_ISA300) {
 return gen_stxssp(ctx);
diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
index 0344c47eed..a6e9417f2d 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -309,7 +309,6 @@ static void gen_##name(DisasContext *ctx)   
  \
 tcg_temp_free_i64(xth);   \
 }
 
-VSX_LOAD_SCALAR_DS(lxsd, ld64_i64)
 VSX_LOAD_SCALAR_DS(lxssp, ld32fs)
 
 #define VSX_STORE_SCALAR(name, operation) \
@@ -482,7 +481,6 @@ static void gen_##name(DisasContext *ctx)   
  \
 tcg_temp_free_i64(xth);   \
 }
 
-VSX_STORE_SCALAR_DS(stxsd, st64_i64)
 VSX_STORE_SCALAR_DS(stxssp, st32fs)
 
 static void gen_mfvsrwz(DisasContext *ctx)
@@ -2298,6 +2296,57 @@ static bool do_lstxv_X(DisasContext *ctx, arg_X *a, bool store, bool paired)
 return do_lstxv(ctx, a->ra, cpu_gpr[a->rb], a->rt, store, paired);
 }
 
+static bool do_lstxsd(DisasContext *ctx, int rt, int ra, TCGv displ, bool store)
+{
+TCGv ea;
+TCGv_i64 xt;
+MemOp mop;
+
+if (store) {
+REQUIRE_VECTOR(ctx);
+} else {
+REQUIRE_VSX(ctx);
+}
+
+xt = tcg_temp_new_i64();
+mop = DEF_MEMOP(MO_UQ);
+
+gen_set_access_type(ctx, ACCESS_INT);
+ea = do_ea_calc(ctx, ra, displ);
+
+if (store) {
+get_cpu_vsr(xt, rt 

[PATCH v5 20/49] target/ppc: implement vslq

2022-02-25 Thread matheus . ferst
From: Matheus Ferst 

Reviewed-by: Richard Henderson 
Signed-off-by: Matheus Ferst 
---
 target/ppc/insn32.decode|  1 +
 target/ppc/translate/vmx-impl.c.inc | 40 +
 2 files changed, 41 insertions(+)

diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 88baebe35e..3799065508 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -473,6 +473,7 @@ VSLB000100 . . . 0010100@VX
 VSLH000100 . . . 00101000100@VX
 VSLW000100 . . . 0011100@VX
 VSLD000100 . . . 10111000100@VX
+VSLQ000100 . . . 0010101@VX
 
 VSRB000100 . . . 0100100@VX
 VSRH000100 . . . 01001000100@VX
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index 1b05b0b3a3..49c722e862 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -834,6 +834,46 @@ TRANS_FLAGS(ALTIVEC, VSRAH, do_vector_gvec3_VX, MO_16, tcg_gen_gvec_sarv);
 TRANS_FLAGS(ALTIVEC, VSRAW, do_vector_gvec3_VX, MO_32, tcg_gen_gvec_sarv);
 TRANS_FLAGS2(ALTIVEC_207, VSRAD, do_vector_gvec3_VX, MO_64, tcg_gen_gvec_sarv);
 
+static bool trans_VSLQ(DisasContext *ctx, arg_VX *a)
+{
+    TCGv_i64 hi, lo, t0, n, zero = tcg_constant_i64(0);
+
+    REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+    REQUIRE_VECTOR(ctx);
+
+    n = tcg_temp_new_i64();
+    hi = tcg_temp_new_i64();
+    lo = tcg_temp_new_i64();
+    t0 = tcg_temp_new_i64();
+
+    get_avr64(lo, a->vra, false);
+    get_avr64(hi, a->vra, true);
+
+    get_avr64(n, a->vrb, true);
+
+    tcg_gen_andi_i64(t0, n, 64);
+    tcg_gen_movcond_i64(TCG_COND_NE, hi, t0, zero, lo, hi);
+    tcg_gen_movcond_i64(TCG_COND_NE, lo, t0, zero, zero, lo);
+    tcg_gen_andi_i64(n, n, 0x3F);
+
+    tcg_gen_shl_i64(t0, lo, n);
+    set_avr64(a->vrt, t0, false);
+
+    tcg_gen_shl_i64(hi, hi, n);
+    tcg_gen_xori_i64(n, n, 63);
+    tcg_gen_shr_i64(lo, lo, n);
+    tcg_gen_shri_i64(lo, lo, 1);
+    tcg_gen_or_i64(hi, hi, lo);
+    set_avr64(a->vrt, hi, true);
+
+    tcg_temp_free_i64(hi);
+    tcg_temp_free_i64(lo);
+    tcg_temp_free_i64(t0);
+    tcg_temp_free_i64(n);
+
+    return true;
+}
+
 #define GEN_VXFORM_SAT(NAME, VECE, NORM, SAT, OPC2, OPC3)   \
 static void glue(glue(gen_, NAME), _vec)(unsigned vece, TCGv_vec t, \
  TCGv_vec sat, TCGv_vec a,  \
-- 
2.25.1
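
For readers following the shift sequence in trans_VSLQ, here is a plain C
sketch of the same idea: a 128-bit left shift built from two 64-bit halves.
The u128 type and the shl128 name are made up for this illustration and are
not part of the patch. It only models the data flow, not the TCG code.

```c
#include <stdint.h>

/* Hypothetical 128-bit value split into two doublewords. */
typedef struct {
    uint64_t hi;
    uint64_t lo;
} u128;

/* Shift a 128-bit value left by sh; only the low 7 bits of sh are used. */
static u128 shl128(u128 v, unsigned sh)
{
    uint64_t hi = v.hi, lo = v.lo;
    unsigned n = sh & 0x3F;
    u128 r;

    /* Shifts of 64 or more first move the low doubleword into the high one. */
    if (sh & 64) {
        hi = lo;
        lo = 0;
    }

    r.lo = lo << n;
    /*
     * (lo >> (n ^ 63)) >> 1 equals lo >> (64 - n) for n in 1..63 and is 0
     * for n == 0, so no shift by the full register width is ever needed.
     */
    r.hi = (hi << n) | ((lo >> (n ^ 63)) >> 1);
    return r;
}
```

Splitting the right shift into "by 63 - n" and "by 1" is what the xori/shr/shri
triple in the patch does; it avoids an out-of-range shift when n is zero.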




[PATCH v5 43/49] target/ppc: Move xs{max, min}[cj]dp to use do_helper_XX3

2022-02-25 Thread matheus . ferst
From: Víctor Colombo 

Also capitalize the helper names of these instructions.

Reviewed-by: Richard Henderson 
Signed-off-by: Víctor Colombo 
Signed-off-by: Matheus Ferst 
---
 target/ppc/fpu_helper.c |  8 
 target/ppc/helper.h |  8 
 target/ppc/translate/vsx-impl.c.inc | 30 -
 3 files changed, 12 insertions(+), 34 deletions(-)

diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index 4d67180bec..4bfa1c4283 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -2565,8 +2565,8 @@ void helper_##name(CPUPPCState *env,  
\
 } \
 } \
 
-VSX_MAX_MINC(xsmaxcdp, 1);
-VSX_MAX_MINC(xsmincdp, 0);
+VSX_MAX_MINC(XSMAXCDP, 1);
+VSX_MAX_MINC(XSMINCDP, 0);
 
 #define VSX_MAX_MINJ(name, max)   \
 void helper_##name(CPUPPCState *env,  \
@@ -2620,8 +2620,8 @@ void helper_##name(CPUPPCState *env,  
\
 } \
 } \
 
-VSX_MAX_MINJ(xsmaxjdp, 1);
-VSX_MAX_MINJ(xsminjdp, 0);
+VSX_MAX_MINJ(XSMAXJDP, 1);
+VSX_MAX_MINJ(XSMINJDP, 0);
 
 /*
  * VSX_CMP - VSX floating point compare
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 8a3db7c13f..b8033e2f58 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -371,10 +371,10 @@ DEF_HELPER_4(xscmpoqp, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xscmpuqp, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xsmaxdp, void, env, vsr, vsr, vsr)
 DEF_HELPER_4(xsmindp, void, env, vsr, vsr, vsr)
-DEF_HELPER_4(xsmaxcdp, void, env, vsr, vsr, vsr)
-DEF_HELPER_4(xsmincdp, void, env, vsr, vsr, vsr)
-DEF_HELPER_4(xsmaxjdp, void, env, vsr, vsr, vsr)
-DEF_HELPER_4(xsminjdp, void, env, vsr, vsr, vsr)
+DEF_HELPER_4(XSMAXCDP, void, env, vsr, vsr, vsr)
+DEF_HELPER_4(XSMINCDP, void, env, vsr, vsr, vsr)
+DEF_HELPER_4(XSMAXJDP, void, env, vsr, vsr, vsr)
+DEF_HELPER_4(XSMINJDP, void, env, vsr, vsr, vsr)
 DEF_HELPER_3(xscvdphp, void, env, vsr, vsr)
 DEF_HELPER_4(xscvdpqp, void, env, i32, vsr, vsr)
 DEF_HELPER_3(xscvdpsp, void, env, vsr, vsr)
diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
index fca25c267b..5273452d13 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -2610,32 +2610,10 @@ static bool do_helper_XX3(DisasContext *ctx, arg_XX3 *a,
 TRANS(XSCMPEQDP, do_helper_XX3, gen_helper_XSCMPEQDP)
 TRANS(XSCMPGEDP, do_helper_XX3, gen_helper_XSCMPGEDP)
 TRANS(XSCMPGTDP, do_helper_XX3, gen_helper_XSCMPGTDP)
-
-static bool do_xsmaxmincjdp(DisasContext *ctx, arg_XX3 *a,
-void (*helper)(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr))
-{
-TCGv_ptr xt, xa, xb;
-
-REQUIRE_INSNS_FLAGS2(ctx, ISA300);
-REQUIRE_VSX(ctx);
-
-xt = gen_vsr_ptr(a->xt);
-xa = gen_vsr_ptr(a->xa);
-xb = gen_vsr_ptr(a->xb);
-
-helper(cpu_env, xt, xa, xb);
-
-tcg_temp_free_ptr(xt);
-tcg_temp_free_ptr(xa);
-tcg_temp_free_ptr(xb);
-
-return true;
-}
-
-TRANS(XSMAXCDP, do_xsmaxmincjdp, gen_helper_xsmaxcdp)
-TRANS(XSMINCDP, do_xsmaxmincjdp, gen_helper_xsmincdp)
-TRANS(XSMAXJDP, do_xsmaxmincjdp, gen_helper_xsmaxjdp)
-TRANS(XSMINJDP, do_xsmaxmincjdp, gen_helper_xsminjdp)
+TRANS(XSMAXCDP, do_helper_XX3, gen_helper_XSMAXCDP)
+TRANS(XSMINCDP, do_helper_XX3, gen_helper_XSMINCDP)
+TRANS(XSMAXJDP, do_helper_XX3, gen_helper_XSMAXJDP)
+TRANS(XSMINJDP, do_helper_XX3, gen_helper_XSMINJDP)
 
 static bool do_helper_X(arg_X *a,
 void (*helper)(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr))
-- 
2.25.1




[PATCH v5 41/49] target/ppc: Implement xscmp{eq,ge,gt}qp

2022-02-25 Thread matheus . ferst
From: Víctor Colombo 

Reviewed-by: Richard Henderson 
Signed-off-by: Víctor Colombo 
Signed-off-by: Matheus Ferst 
---
 target/ppc/fpu_helper.c |  3 +++
 target/ppc/helper.h |  3 +++
 target/ppc/insn32.decode|  3 +++
 target/ppc/translate/vsx-impl.c.inc | 31 +
 4 files changed, 40 insertions(+)

diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index bbd54b2d9c..a589e6b7a5 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -2307,6 +2307,9 @@ VSX_MADDQ(XSNMSUBQPO, NMSUB_FLGS, 0)
 VSX_SCALAR_CMP(xscmpeqdp, float64, eq, VsrD(0), 0)
 VSX_SCALAR_CMP(xscmpgedp, float64, le, VsrD(0), 1)
 VSX_SCALAR_CMP(xscmpgtdp, float64, lt, VsrD(0), 1)
+VSX_SCALAR_CMP(XSCMPEQQP, float128, eq, f128, 0)
+VSX_SCALAR_CMP(XSCMPGEQP, float128, le, f128, 1)
+VSX_SCALAR_CMP(XSCMPGTQP, float128, lt, f128, 1)
 
 void helper_xscmpexpdp(CPUPPCState *env, uint32_t opcode,
ppc_vsr_t *xa, ppc_vsr_t *xb)
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 00e2e6f7b7..b9c5c0dd48 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -360,6 +360,9 @@ DEF_HELPER_5(XSNMSUBDP, void, env, vsr, vsr, vsr, vsr)
 DEF_HELPER_4(xscmpeqdp, void, env, vsr, vsr, vsr)
 DEF_HELPER_4(xscmpgtdp, void, env, vsr, vsr, vsr)
 DEF_HELPER_4(xscmpgedp, void, env, vsr, vsr, vsr)
+DEF_HELPER_4(XSCMPEQQP, void, env, vsr, vsr, vsr)
+DEF_HELPER_4(XSCMPGTQP, void, env, vsr, vsr, vsr)
+DEF_HELPER_4(XSCMPGEQP, void, env, vsr, vsr, vsr)
 DEF_HELPER_4(xscmpexpdp, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xscmpexpqp, void, env, i32, vsr, vsr)
 DEF_HELPER_4(xscmpodp, void, env, i32, vsr, vsr)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 973cda1131..92327a0a71 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -664,6 +664,9 @@ XSMAXCDP00 . . . 1000 ...   @XX3
 XSMINCDP00 . . . 10001000 ...   @XX3
 XSMAXJDP00 . . . 1001 ...   @XX3
 XSMINJDP00 . . . 10011000 ...   @XX3
+XSCMPEQQP   11 . . . 0001000100 -   @X
+XSCMPGEQP   11 . . . 0011000100 -   @X
+XSCMPGTQP   11 . . . 0011100100 -   @X
 
 ## VSX Binary Floating-Point Convert Instructions
 
diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
index 3baaac1abd..6dd20b0309 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -2615,6 +2615,37 @@ TRANS(XSMINCDP, do_xsmaxmincjdp, gen_helper_xsmincdp)
 TRANS(XSMAXJDP, do_xsmaxmincjdp, gen_helper_xsmaxjdp)
 TRANS(XSMINJDP, do_xsmaxmincjdp, gen_helper_xsminjdp)
 
+static bool do_helper_X(arg_X *a,
+void (*helper)(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr))
+{
+TCGv_ptr rt, ra, rb;
+
+rt = gen_avr_ptr(a->rt);
+ra = gen_avr_ptr(a->ra);
+rb = gen_avr_ptr(a->rb);
+
+helper(cpu_env, rt, ra, rb);
+
+tcg_temp_free_ptr(rt);
+tcg_temp_free_ptr(ra);
+tcg_temp_free_ptr(rb);
+
+return true;
+}
+
+static bool do_xscmpqp(DisasContext *ctx, arg_X *a,
+void (*helper)(TCGv_ptr, TCGv_ptr, TCGv_ptr, TCGv_ptr))
+{
+REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+REQUIRE_VSX(ctx);
+
+return do_helper_X(a, helper);
+}
+
+TRANS(XSCMPEQQP, do_xscmpqp, gen_helper_XSCMPEQQP)
+TRANS(XSCMPGEQP, do_xscmpqp, gen_helper_XSCMPGEQP)
+TRANS(XSCMPGTQP, do_xscmpqp, gen_helper_XSCMPGTQP)
+
 #undef GEN_XX2FORM
 #undef GEN_XX3FORM
 #undef GEN_XX2IFORM
-- 
2.25.1




[PATCH v5 31/49] target/ppc: Move xxpermdi to decodetree

2022-02-25 Thread matheus . ferst
From: Matheus Ferst 

Reviewed-by: Richard Henderson 
Signed-off-by: Matheus Ferst 
---
 target/ppc/insn32.decode|  4 ++
 target/ppc/translate/vsx-impl.c.inc | 71 +
 target/ppc/translate/vsx-ops.c.inc  |  2 -
 3 files changed, 36 insertions(+), 41 deletions(-)

diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 3de4a32e38..b8dbac553e 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -155,6 +155,9 @@
 &XX3xt xa xb
 @XX3.. . . .  ...   &XX3 xt=%xx_xt 
xa=%xx_xa xb=%xx_xb
 
+&XX3_dm xt xa xb dm
+@XX3_dm .. . . . . dm:2 . ...   &XX3_dm 
xt=%xx_xt xa=%xx_xa xb=%xx_xb
+
 &XX4xt xa xb xc
 @XX4.. . . . . ..   &XX4 xt=%xx_xt 
xa=%xx_xa xb=%xx_xb xc=%xx_xc
 
@@ -608,6 +611,7 @@ XXSPLTW 00 . ---.. . 010100100 . .  @XX2
 
 XXPERM  00 . . . 00011010 ...   @XX3
 XXPERMR 00 . . . 00111010 ...   @XX3
+XXPERMDI00 . . . 0 .. 01010 ... @XX3_dm
 
 XXSEL   00 . . . . 11   @XX4
 
diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
index 7ce90f18a5..cdefa13590 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -665,45 +665,6 @@ static void gen_mtvsrws(DisasContext *ctx)
 
 #endif
 
-static void gen_xxpermdi(DisasContext *ctx)
-{
-TCGv_i64 xh, xl;
-
-if (unlikely(!ctx->vsx_enabled)) {
-gen_exception(ctx, POWERPC_EXCP_VSXU);
-return;
-}
-
-xh = tcg_temp_new_i64();
-xl = tcg_temp_new_i64();
-
-if (unlikely((xT(ctx->opcode) == xA(ctx->opcode)) ||
- (xT(ctx->opcode) == xB(ctx->opcode {
-get_cpu_vsr(xh, xA(ctx->opcode), (DM(ctx->opcode) & 2) == 0);
-get_cpu_vsr(xl, xB(ctx->opcode), (DM(ctx->opcode) & 1) == 0);
-
-set_cpu_vsr(xT(ctx->opcode), xh, true);
-set_cpu_vsr(xT(ctx->opcode), xl, false);
-} else {
-if ((DM(ctx->opcode) & 2) == 0) {
-get_cpu_vsr(xh, xA(ctx->opcode), true);
-set_cpu_vsr(xT(ctx->opcode), xh, true);
-} else {
-get_cpu_vsr(xh, xA(ctx->opcode), false);
-set_cpu_vsr(xT(ctx->opcode), xh, true);
-}
-if ((DM(ctx->opcode) & 1) == 0) {
-get_cpu_vsr(xl, xB(ctx->opcode), true);
-set_cpu_vsr(xT(ctx->opcode), xl, false);
-} else {
-get_cpu_vsr(xl, xB(ctx->opcode), false);
-set_cpu_vsr(xT(ctx->opcode), xl, false);
-}
-}
-tcg_temp_free_i64(xh);
-tcg_temp_free_i64(xl);
-}
-
 #define OP_ABS 1
 #define OP_NABS 2
 #define OP_NEG 3
@@ -1241,6 +1202,38 @@ static bool trans_XXPERMR(DisasContext *ctx, arg_XX3 *a)
 return true;
 }
 
+static bool trans_XXPERMDI(DisasContext *ctx, arg_XX3_dm *a)
+{
+    TCGv_i64 t0, t1;
+
+    REQUIRE_INSNS_FLAGS2(ctx, VSX);
+    REQUIRE_VSX(ctx);
+
+    t0 = tcg_temp_new_i64();
+
+    if (unlikely(a->xt == a->xa || a->xt == a->xb)) {
+        t1 = tcg_temp_new_i64();
+
+        get_cpu_vsr(t0, a->xa, (a->dm & 2) == 0);
+        get_cpu_vsr(t1, a->xb, (a->dm & 1) == 0);
+
+        set_cpu_vsr(a->xt, t0, true);
+        set_cpu_vsr(a->xt, t1, false);
+
+        tcg_temp_free_i64(t1);
+    } else {
+        get_cpu_vsr(t0, a->xa, (a->dm & 2) == 0);
+        set_cpu_vsr(a->xt, t0, true);
+
+        get_cpu_vsr(t0, a->xb, (a->dm & 1) == 0);
+        set_cpu_vsr(a->xt, t0, false);
+    }
+
+    tcg_temp_free_i64(t0);
+
+    return true;
+}
+
 #define GEN_VSX_HELPER_VSX_MADD(name, op1, aop, mop, inval, type) \
 static void gen_##name(DisasContext *ctx) \
 { \
diff --git a/target/ppc/translate/vsx-ops.c.inc b/target/ppc/translate/vsx-ops.c.inc
index 86ed1a996a..0a6b2b31ac 100644
--- a/target/ppc/translate/vsx-ops.c.inc
+++ b/target/ppc/translate/vsx-ops.c.inc
@@ -344,5 +344,3 @@ GEN_XX3FORM(xxmrglw, 0x08, 0x06, PPC2_VSX),
 GEN_XX3FORM_DM(xxsldwi, 0x08, 0x00),
 GEN_XX2FORM_EXT(xxextractuw, 0x0A, 0x0A, PPC2_ISA300),
 GEN_XX2FORM_EXT(xxinsertw, 0x0A, 0x0B, PPC2_ISA300),
-
-GEN_XX3FORM_DM(xxpermdi, 0x08, 0x01),
-- 
2.25.1
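
As a reading aid for trans_XXPERMDI, the doubleword selection can be modelled
in a few lines of C. The vsr128 type and the xxpermdi_model name are invented
for this sketch; the selection rule is taken from the (dm & 2) and (dm & 1)
tests in the patch.

```c
#include <stdint.h>

/* Hypothetical view of a VSR as a high and a low doubleword. */
typedef struct {
    uint64_t hi;
    uint64_t lo;
} vsr128;

/*
 * The high doubleword of the result comes from xa and the low one from xb.
 * A clear DM bit selects the source's high doubleword, a set bit the low one.
 */
static vsr128 xxpermdi_model(vsr128 xa, vsr128 xb, unsigned dm)
{
    vsr128 xt;

    xt.hi = (dm & 2) ? xa.lo : xa.hi;
    xt.lo = (dm & 1) ? xb.lo : xb.hi;
    return xt;
}
```

Because the model takes its operands by value, it cannot clobber a source
before reading it. The translator writes XT in place, which is presumably why
the patch reads both sources into temporaries first when XT aliases XA or XB.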




[PATCH v5 38/49] target/ppc: Implement xvtlsbb instruction

2022-02-25 Thread matheus . ferst
From: Víctor Colombo 

Signed-off-by: Víctor Colombo 
Signed-off-by: Matheus Ferst 
---
changes for v5:
- unroll for-loop as suggested by Richard Henderson
---
 target/ppc/insn32.decode|  7 +
 target/ppc/translate/vsx-impl.c.inc | 40 +
 2 files changed, 47 insertions(+)

diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index c28cc13325..973cda1131 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -155,6 +155,9 @@
 &XX2xt xb uim:uint8_t
 @XX2.. . ... uim:2 . . ..   &XX2 xt=%xx_xt 
xb=%xx_xb
 
+&XX2_bf_xb  bf xb
+@XX2_bf_xb  .. bf:3 .. . . . . .&XX2_bf_xb 
xb=%xx_xb
+
 &XX3xt xa xb
 @XX3.. . . .  ...   &XX3 xt=%xx_xt 
xa=%xx_xa xb=%xx_xb
 
@@ -666,6 +669,10 @@ XSMINJDP00 . . . 10011000 ...   
@XX3
 
 XSCVQPDP11 . 10100 . 1101000100 .   @X_tb_rc
 
+## VSX Vector Test Least-Significant Bit by Byte Instruction
+
+XVTLSBB 00 ... -- 00010 . 111011011 . - @XX2_bf_xb
+
 ### rfebb
 &XL_s   s:uint8_t
 @XL_s   ..-- s:1 .. -   &XL_s
diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
index 292a14f5aa..4da889531b 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -1704,6 +1704,46 @@ static bool trans_LXVKQ(DisasContext *ctx, arg_X_uim5 *a)
 return true;
 }
 
+static bool trans_XVTLSBB(DisasContext *ctx, arg_XX2_bf_xb *a)
+{
+    TCGv_i64 xb, t0, t1, all_true, all_false, mask, zero;
+
+    REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+    REQUIRE_VSX(ctx);
+
+    xb = tcg_temp_new_i64();
+    t0 = tcg_temp_new_i64();
+    t1 = tcg_temp_new_i64();
+    all_true = tcg_temp_new_i64();
+    all_false = tcg_temp_new_i64();
+    mask = tcg_constant_i64(dup_const(MO_8, 1));
+    zero = tcg_constant_i64(0);
+
+    get_cpu_vsr(xb, a->xb, true);
+    tcg_gen_and_i64(t0, mask, xb);
+    get_cpu_vsr(xb, a->xb, false);
+    tcg_gen_and_i64(t1, mask, xb);
+
+    tcg_gen_or_i64(all_false, t0, t1);
+    tcg_gen_and_i64(all_true, t0, t1);
+
+    tcg_gen_setcond_i64(TCG_COND_EQ, all_false, all_false, zero);
+    tcg_gen_shli_i64(all_false, all_false, 1);
+    tcg_gen_setcond_i64(TCG_COND_EQ, all_true, all_true, mask);
+    tcg_gen_shli_i64(all_true, all_true, 3);
+
+    tcg_gen_or_i64(t0, all_false, all_true);
+    tcg_gen_extrl_i64_i32(cpu_crf[a->bf], t0);
+
+    tcg_temp_free_i64(xb);
+    tcg_temp_free_i64(t0);
+    tcg_temp_free_i64(t1);
+    tcg_temp_free_i64(all_true);
+    tcg_temp_free_i64(all_false);
+
+    return true;
+}
+
 static void gen_xxsldwi(DisasContext *ctx)
 {
 TCGv_i64 xth, xtl;
-- 
2.25.1
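
The unrolled sequence in trans_XVTLSBB can be checked against a small C model.
This is only a sketch: the function name xvtlsbb_model is hypothetical, and it
returns the 4-bit CR field value that the patch writes to cpu_crf[bf].

```c
#include <stdint.h>

/*
 * Test the least-significant bit of each of the 16 bytes in a 128-bit value
 * given as two doublewords. Bit 3 of the result is set when every LSB is 1,
 * bit 1 is set when every LSB is 0.
 */
static unsigned xvtlsbb_model(uint64_t hi, uint64_t lo)
{
    const uint64_t mask = 0x0101010101010101ULL; /* LSB of every byte */
    uint64_t t0 = hi & mask;
    uint64_t t1 = lo & mask;
    unsigned all_true = (t0 & t1) == mask;
    unsigned all_false = (t0 | t1) == 0;

    return (all_true << 3) | (all_false << 1);
}
```

ANDing the two masked halves lets a single comparison against the mask cover
all 16 bytes, and ORing them does the same for the all-zero case.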




[PATCH v5 18/49] target/ppc: implement vgnb

2022-02-25 Thread matheus . ferst
From: Matheus Ferst 

Suggested-by: Richard Henderson 
Reviewed-by: Richard Henderson 
Signed-off-by: Matheus Ferst 
---
 target/ppc/insn32.decode|   5 ++
 target/ppc/translate/vmx-impl.c.inc | 135 
 2 files changed, 140 insertions(+)

diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 31a3c3b508..02df4a98e6 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -66,6 +66,9 @@
 &VX_mp  rt mp:bool vrb
 @VX_mp  .. rt:5  mp:1 vrb:5 ... &VX_mp
 
+&VX_n   rt vrb n
+@VX_n   .. rt:5 .. n:3 vrb:5 ...&VX_n
+
 &VX_tb_rc   vrt vrb rc:bool
 @VX_tb_rc   .. vrt:5 . vrb:5 rc:1 ..&VX_tb_rc
 
@@ -418,6 +421,8 @@ VCMPUQ  000100 ... -- . . 0010001   
@VX_bf
 
 ## Vector Bit Manipulation Instruction
 
+VGNB000100 . -- ... . 10011001100   @VX_n
+
 VCFUGED 000100 . . . 10101001101@VX
 VCLZDM  000100 . . . 100@VX
 VCTZDM  000100 . . . 1000100@VX
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index e45bd194f4..52774cdd4d 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -1416,6 +1416,141 @@ GEN_VXFORM_DUAL(vsplth, PPC_ALTIVEC, PPC_NONE,
 GEN_VXFORM_DUAL(vspltw, PPC_ALTIVEC, PPC_NONE,
 vextractuw, PPC_NONE, PPC2_ISA300);
 
+static bool trans_VGNB(DisasContext *ctx, arg_VX_n *a)
+{
+/*
+ * Similar to do_vextractm, we'll use a sequence of mask-shift-or operations
+ * to gather the bits. The masks can be created with
+ *
+ * uint64_t mask(uint64_t n, uint64_t step)
+ * {
+ * uint64_t p = ((1UL << (1UL << step)) - 1UL) << ((n - 1UL) << step),
+ *  plen = n << step, m = 0;
+ * for(int i = 0; i < 64/plen; i++) {
+ * m |= p;
+ * m = ror64(m, plen);
+ * }
+ * p >>= plen * DIV_ROUND_UP(64, plen) - 64;
+ * return m | p;
+ * }
+ *
+ * But since there are few values of N, we'll use a lookup table to avoid
+ * these calculations at runtime.
+ */
+static const uint64_t mask[6][5] = {
+{
+0xULL, 0xULL, 
0xf0f0f0f0f0f0f0f0ULL,
+0xff00ff00ff00ff00ULL, 0xULL
+},
+{
+0x9249249249249249ULL, 0xC30C30C30C30C30CULL, 
0xF00F00F00F00F00FULL,
+0xFFFFFF00ULL, 0xULL
+},
+{
+/* For N >= 4, some mask operations can be elided */
+0xULL, 0, 0xf000f000f000f000ULL, 0,
+0xULL
+},
+{
+0x8421084210842108ULL, 0, 0xFFFFULL, 0, 0
+},
+{
+0x8208208208208208ULL, 0, 0xF0F0F000ULL, 0, 0
+},
+{
+0x8102040810204081ULL, 0, 0xF00F00F0ULL, 0, 0
+}
+};
+uint64_t m;
+int i, sh, nbits = DIV_ROUND_UP(64, a->n);
+TCGv_i64 hi, lo, t0, t1;
+
+REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+REQUIRE_VECTOR(ctx);
+
+if (a->n < 2) {
+/*
+ * "N can be any value between 2 and 7, inclusive." Otherwise, the
+ * result is undefined, so we don't need to change RT. Also, N > 7 is
+ * impossible since the immediate field is 3 bits only.
+ */
+return true;
+}
+
+hi = tcg_temp_new_i64();
+lo = tcg_temp_new_i64();
+t0 = tcg_temp_new_i64();
+t1 = tcg_temp_new_i64();
+
+get_avr64(hi, a->vrb, true);
+get_avr64(lo, a->vrb, false);
+
+/* Align the lower doubleword so we can use the same mask */
+tcg_gen_shli_i64(lo, lo, a->n * nbits - 64);
+
+/*
+ * Starting from the most significant bit, gather every Nth bit with a
+ * sequence of mask-shift-or operations. E.g.: for N=3
+ * AxxBxxCxxDxxExxFxxGxxHxxIxxJxxKxxLxxMxxNxxOxxPxxQxxRxxSxxTxxUxxV
+ * & rep(0b100)
+ * A..B..C..D..E..F..G..H..I..J..K..L..M..N..O..P..Q..R..S..T..U..V
+ * << 2
+ * .B..C..D..E..F..G..H..I..J..K..L..M..N..O..P..Q..R..S..T..U..V..
+ * |
+ * AB.BC.CD.DE.EF.FG.GH.HI.IJ.JK.KL.LM.MN.NO.OP.PQ.QR.RS.ST.TU.UV.V
+ *  & rep(0b11)
+ * ABCDEFGHIJKLMNOPQRSTUV..
+ * << 4
+ * ..CDEFGHIJKLMNOPQRSTUV..
+ * |
+ * ABCD..CDEF..EFGH..GHIJ..IJKL..KLMN..MNOP..OPQR..QRST..STUV..UV..
+ * & rep(0b)
+ * ABCDEFGHIJKLMNOPQRSTUV..
+ * << 8
+ * EFGHIJKLMNOPQRSTUV..
+ * |
+ * ABCDEFGHEFGHIJKLIJKLMNOPMNOPQRSTQRSTUV..UV..
+ *  & rep(0b)
+ * ABCDEFGH
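
The mask() pseudo-code quoted in the comment above can be turned into a
standalone C function, sketched below. The names gather_mask and ror64_model
are chosen for this example, and the asserts encode an assumption of this
sketch (n << step must not exceed 64) rather than anything stated in the
patch.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate right; a rotate count of 0 or 64 leaves the value unchanged. */
static uint64_t ror64_model(uint64_t v, unsigned s)
{
    s &= 63;
    return s ? (v >> s) | (v << (64 - s)) : v;
}

/*
 * Build the mask used at a given step of the mask-shift-or sequence for
 * gathering every n-th bit, following the pseudo-code in the comment.
 */
static uint64_t gather_mask(uint64_t n, uint64_t step)
{
    uint64_t plen = n << step;
    uint64_t p, m = 0;

    assert(n >= 1 && step <= 5 && plen <= 64);

    /* A group of 2^step set bits at the top of each plen-bit period. */
    p = ((UINT64_C(1) << (UINT64_C(1) << step)) - 1) << ((n - 1) << step);
    for (uint64_t i = 0; i < 64 / plen; i++) {
        m |= p;
        m = ror64_model(m, (unsigned)plen);
    }
    /* Handle the partial period left over when plen does not divide 64. */
    p >>= plen * ((64 + plen - 1) / plen) - 64;
    return m | p;
}
```

For N=3 and N=5 at step 0 this reproduces the 0x9249249249249249 and
0x8421084210842108 entries that survive in the lookup table above.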

[PATCH v5 30/49] target/ppc: move xxperm/xxpermr to decodetree

2022-02-25 Thread matheus . ferst
From: Matheus Ferst 

Reviewed-by: Richard Henderson 
Signed-off-by: Matheus Ferst 
---
 target/ppc/fpu_helper.c | 21 ---
 target/ppc/helper.h |  2 --
 target/ppc/insn32.decode|  5 
 target/ppc/translate/vsx-impl.c.inc | 42 +++--
 target/ppc/translate/vsx-ops.c.inc  |  2 --
 5 files changed, 45 insertions(+), 27 deletions(-)

diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index bd76bee7f1..0fd285defc 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -3055,27 +3055,6 @@ uint64_t helper_xsrsp(CPUPPCState *env, uint64_t xb)
 return xt;
 }
 
-#define VSX_XXPERM(op, indexed)   \
-void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, \
- ppc_vsr_t *xa, ppc_vsr_t *pcv)   \
-{ \
-ppc_vsr_t t = *xt;\
-int i, idx;   \
-  \
-for (i = 0; i < 16; i++) {\
-idx = pcv->VsrB(i) & 0x1F;\
-if (indexed) {\
-idx = 31 - idx;   \
-} \
-t.VsrB(i) = (idx <= 15) ? xa->VsrB(idx)   \
-: xt->VsrB(idx - 16); \
-} \
-*xt = t;  \
-}
-
-VSX_XXPERM(xxperm, 0)
-VSX_XXPERM(xxpermr, 1)
-
 void helper_xvxsigsp(CPUPPCState *env, ppc_vsr_t *xt, ppc_vsr_t *xb)
 {
 ppc_vsr_t t = { };
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index fd559d72d3..f75a26b4af 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -492,8 +492,6 @@ DEF_HELPER_3(xvrspic, void, env, vsr, vsr)
 DEF_HELPER_3(xvrspim, void, env, vsr, vsr)
 DEF_HELPER_3(xvrspip, void, env, vsr, vsr)
 DEF_HELPER_3(xvrspiz, void, env, vsr, vsr)
-DEF_HELPER_4(xxperm, void, env, vsr, vsr, vsr)
-DEF_HELPER_4(xxpermr, void, env, vsr, vsr, vsr)
 DEF_HELPER_4(xxextractuw, void, env, vsr, vsr, i32)
 DEF_HELPER_4(xxinsertw, void, env, vsr, vsr, i32)
 DEF_HELPER_3(xvxsigsp, void, env, vsr, vsr)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 66cb9184cd..3de4a32e38 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -604,6 +604,11 @@ STXVPX  01 . . . 0111001101 -   
@X_TSXP
 XXSPLTIB00 . 00  0101101000 .   @X_imm8
 XXSPLTW 00 . ---.. . 010100100 . .  @XX2
 
+## VSX Permute Instructions
+
+XXPERM  00 . . . 00011010 ...   @XX3
+XXPERMR 00 . . . 00111010 ...   @XX3
+
 XXSEL   00 . . . . 11   @XX4
 
 ## VSX Vector Load Special Value Instruction
diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
index 48e4a2e266..7ce90f18a5 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -1200,8 +1200,46 @@ GEN_VSX_HELPER_X2(xvrspip, 0x12, 0x0A, 0, PPC2_VSX)
 GEN_VSX_HELPER_X2(xvrspiz, 0x12, 0x09, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvtstdcsp, 0x14, 0x1A, 0, PPC2_VSX)
 GEN_VSX_HELPER_2(xvtstdcdp, 0x14, 0x1E, 0, PPC2_VSX)
-GEN_VSX_HELPER_X3(xxperm, 0x08, 0x03, 0, PPC2_ISA300)
-GEN_VSX_HELPER_X3(xxpermr, 0x08, 0x07, 0, PPC2_ISA300)
+
+static bool trans_XXPERM(DisasContext *ctx, arg_XX3 *a)
+{
+TCGv_ptr xt, xa, xb;
+
+REQUIRE_INSNS_FLAGS2(ctx, ISA300);
+REQUIRE_VSX(ctx);
+
+xt = gen_vsr_ptr(a->xt);
+xa = gen_vsr_ptr(a->xa);
+xb = gen_vsr_ptr(a->xb);
+
+gen_helper_VPERM(xt, xa, xt, xb);
+
+tcg_temp_free_ptr(xt);
+tcg_temp_free_ptr(xa);
+tcg_temp_free_ptr(xb);
+
+return true;
+}
+
+static bool trans_XXPERMR(DisasContext *ctx, arg_XX3 *a)
+{
+TCGv_ptr xt, xa, xb;
+
+REQUIRE_INSNS_FLAGS2(ctx, ISA300);
+REQUIRE_VSX(ctx);
+
+xt = gen_vsr_ptr(a->xt);
+xa = gen_vsr_ptr(a->xa);
+xb = gen_vsr_ptr(a->xb);
+
+gen_helper_VPERMR(xt, xa, xt, xb);
+
+tcg_temp_free_ptr(xt);
+tcg_temp_free_ptr(xa);
+tcg_temp_free_ptr(xb);
+
+return true;
+}
 
 #define GEN_VSX_HELPER_VSX_MADD(name, op1, aop, mop, inval, type) \
 static void gen_##name(DisasContext *ctx) \
diff --git a/target/ppc/translate/vsx-ops.c.inc b/target/ppc/translate/vsx-ops.c.inc
index b0dbb38c80..86ed1a996a 100644
--- a/target/ppc/translate/vsx-ops.c.inc
+++ b/target/ppc/translate/vsx-ops.c.inc
@@ -341,8 +341,6 @@ VSX_LOGICAL(xxlnand, 0x8, 0x16, PPC2_VSX207),
 VSX_L

[PATCH v5 17/49] target/ppc: implement vcntmb[bhwd]

2022-02-25 Thread matheus . ferst
From: Matheus Ferst 

Reviewed-by: Richard Henderson 
Signed-off-by: Matheus Ferst 
---
 target/ppc/insn32.decode|  8 
 target/ppc/translate/vmx-impl.c.inc | 32 +
 2 files changed, 40 insertions(+)

diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index b20f1eaa8e..31a3c3b508 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -63,6 +63,9 @@
 &VX_bf  bf vra vrb
 @VX_bf  .. bf:3 .. vra:5 vrb:5 ...  &VX_bf
 
+&VX_mp  rt mp:bool vrb
+@VX_mp  .. rt:5  mp:1 vrb:5 ... &VX_mp
+
 &VX_tb_rc   vrt vrb rc:bool
 @VX_tb_rc   .. vrt:5 . vrb:5 rc:1 ..&VX_tb_rc
 
@@ -489,6 +492,11 @@ VEXTRACTWM  000100 . 01010 . 1100110
@VX_tb
 VEXTRACTDM  000100 . 01011 . 1100110@VX_tb
 VEXTRACTQM  000100 . 01100 . 1100110@VX_tb
 
+VCNTMBB 000100 . 1100 . . 1100110   @VX_mp
+VCNTMBH 000100 . 1101 . . 1100110   @VX_mp
+VCNTMBW 000100 . 1110 . . 1100110   @VX_mp
+VCNTMBD 000100 .  . . 1100110   @VX_mp
+
 ## Vector Multiply Instruction
 
 VMULESB 000100 . . . 0111000@VX
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index 4db5656669..e45bd194f4 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -1910,6 +1910,38 @@ static bool trans_MTVSRBMI(DisasContext *ctx, arg_DX_b *a)
 return true;
 }
 
+static bool do_vcntmb(DisasContext *ctx, arg_VX_mp *a, int vece)
+{
+    TCGv_i64 rt, vrb, mask;
+    rt = tcg_const_i64(0);
+    vrb = tcg_temp_new_i64();
+    mask = tcg_constant_i64(dup_const(vece, 1ULL << ((8 << vece) - 1)));
+
+    for (int i = 0; i < 2; i++) {
+        get_avr64(vrb, a->vrb, i);
+        if (a->mp) {
+            tcg_gen_and_i64(vrb, mask, vrb);
+        } else {
+            tcg_gen_andc_i64(vrb, mask, vrb);
+        }
+        tcg_gen_ctpop_i64(vrb, vrb);
+        tcg_gen_add_i64(rt, rt, vrb);
+    }
+
+    tcg_gen_shli_i64(rt, rt, TARGET_LONG_BITS - 8 + vece);
+    tcg_gen_trunc_i64_tl(cpu_gpr[a->rt], rt);
+
+    tcg_temp_free_i64(vrb);
+    tcg_temp_free_i64(rt);
+
+    return true;
+}
+
+TRANS(VCNTMBB, do_vcntmb, MO_8)
+TRANS(VCNTMBH, do_vcntmb, MO_16)
+TRANS(VCNTMBW, do_vcntmb, MO_32)
+TRANS(VCNTMBD, do_vcntmb, MO_64)
+
 static bool do_vstri(DisasContext *ctx, arg_VX_tb_rc *a,
  void (*gen_helper)(TCGv_i32, TCGv_ptr, TCGv_ptr))
 {
-- 
2.25.1
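
To make the counting in do_vcntmb easier to follow, here is a C sketch of the
same computation. It assumes a 64-bit target (TARGET_LONG_BITS == 64), and the
names vcntmb_model and popcount64_model are made up for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Portable population count, standing in for tcg_gen_ctpop_i64. */
static unsigned popcount64_model(uint64_t v)
{
    unsigned count = 0;

    while (v) {
        v &= v - 1; /* clear the lowest set bit */
        count++;
    }
    return count;
}

/*
 * Count the elements of a 128-bit vector whose most significant bit equals
 * mp. vece is log2 of the element size in bytes (0 = byte ... 3 = dword).
 * The count is returned left-justified, as in the patch: shifted left by
 * 64 - 8 + vece (64-bit target assumed).
 */
static uint64_t vcntmb_model(uint64_t hi, uint64_t lo, unsigned vece, bool mp)
{
    unsigned esz = 8u << vece; /* element size in bits */
    uint64_t mask = 0;
    uint64_t count;

    /* Same value as dup_const(vece, 1ULL << (esz - 1)): MSB of each element */
    for (unsigned i = 0; i < 64; i += esz) {
        mask |= UINT64_C(1) << (i + esz - 1);
    }

    if (mp) {
        count = popcount64_model(hi & mask) + popcount64_model(lo & mask);
    } else {
        count = popcount64_model(~hi & mask) + popcount64_model(~lo & mask);
    }
    return count << (56 + vece);
}
```

For mp == 0 the source is inverted before masking, which is the effect of the
tcg_gen_andc_i64(vrb, mask, vrb) call in the patch.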




[PATCH v5 34/49] target/ppc: Implement xxeval

2022-02-25 Thread matheus . ferst
From: Matheus Ferst 

Signed-off-by: Matheus Ferst 
---
v5:
 - Some equivalent functions implemented with tcg_gen_gvec_*
---
 target/ppc/helper.h |   1 +
 target/ppc/insn64.decode|   8 +
 target/ppc/int_helper.c |  42 ++
 target/ppc/translate/vsx-impl.c.inc | 220 
 4 files changed, 271 insertions(+)

diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 06fac7e082..ef9655c7cd 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -496,6 +496,7 @@ DEF_HELPER_4(xxextractuw, void, env, vsr, vsr, i32)
 DEF_HELPER_FLAGS_5(XXPERMX, TCG_CALL_NO_RWG, void, vsr, vsr, vsr, vsr, tl)
 DEF_HELPER_4(xxinsertw, void, env, vsr, vsr, i32)
 DEF_HELPER_3(xvxsigsp, void, env, vsr, vsr)
+DEF_HELPER_FLAGS_5(XXEVAL, TCG_CALL_NO_RWG, void, vsr, vsr, vsr, vsr, i32)
 DEF_HELPER_5(XXBLENDVB, void, vsr, vsr, vsr, vsr, i32)
 DEF_HELPER_5(XXBLENDVH, void, vsr, vsr, vsr, vsr, i32)
 DEF_HELPER_5(XXBLENDVW, void, vsr, vsr, vsr, vsr, i32)
diff --git a/target/ppc/insn64.decode b/target/ppc/insn64.decode
index 0963e064b1..fdb859f62d 100644
--- a/target/ppc/insn64.decode
+++ b/target/ppc/insn64.decode
@@ -54,6 +54,11 @@
 .. . . . . ..  \
 &8RR_XX4 xt=%8rr_xx_xt xa=%8rr_xx_xa xb=%8rr_xx_xb 
xc=%8rr_xx_xc
 
+&8RR_XX4_immxt xa xb xc imm
+@8RR_XX4_imm   imm:8 \
+.. . . . . ..  \
+&8RR_XX4_imm xt=%8rr_xx_xt xa=%8rr_xx_xa xb=%8rr_xx_xb 
xc=%8rr_xx_xc
+
 &8RR_XX4_uim3   xt xa xb xc uim3
 @8RR_XX4_uim3   .. ..  .. ... uim3:3 \
 .. . . . . ..    \
@@ -184,6 +189,9 @@ PLXVP   01 00 0--.-- .. \
 PSTXVP  01 00 0--.-- .. \
 10 . .  @8LS_D_TSXP
 
+XXEVAL  01 01  -- --  \
+100010 . . . . 01   @8RR_XX4_imm
+
 XXSPLTIDP   01 01  -- --  \
 10 . 0010 . @8RR_D
 XXSPLTIW01 01  -- --  \
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index d9bfdc290f..ca59cd3d79 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -28,6 +28,7 @@
 #include "fpu/softfloat.h"
 #include "qapi/error.h"
 #include "qemu/guest-random.h"
+#include "tcg/tcg-gvec-desc.h"
 
 #include "helper_regs.h"
 /*/
@@ -1572,6 +1573,47 @@ void helper_xxinsertw(CPUPPCState *env, ppc_vsr_t *xt,
 *xt = t;
 }
 
+void helper_XXEVAL(ppc_avr_t *t, ppc_avr_t *a, ppc_avr_t *b, ppc_avr_t *c,
+                   uint32_t desc)
+{
+    /*
+     * Instead of processing imm bit-by-bit, we'll skip the computation of
+     * conjunctions whose corresponding bit is unset.
+     */
+    int bit, imm = simd_data(desc);
+    Int128 conj, disj = int128_zero();
+
+    /* Iterate over set bits from the least to the most significant bit */
+    while (imm) {
+        /*
+         * Get the next bit to be processed with ctz64. Invert the result of
+         * ctz64 to match the indexing used by PowerISA.
+         */
+        bit = 7 - ctzl(imm);
+        if (bit & 0x4) {
+            conj = a->s128;
+        } else {
+            conj = int128_not(a->s128);
+        }
+        if (bit & 0x2) {
+            conj = int128_and(conj, b->s128);
+        } else {
+            conj = int128_and(conj, int128_not(b->s128));
+        }
+        if (bit & 0x1) {
+            conj = int128_and(conj, c->s128);
+        } else {
+            conj = int128_and(conj, int128_not(c->s128));
+        }
+        disj = int128_or(disj, conj);
+
+        /* Unset the least significant bit that is set */
+        imm &= imm - 1;
+    }
+
+    t->s128 = disj;
+}
+
 #define XXBLEND(name, sz) \
 void glue(helper_XXBLENDV, name)(ppc_avr_t *t, ppc_avr_t *a, ppc_avr_t *b,  \
  ppc_avr_t *c, uint32_t desc)   \
diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
index 92851b8926..b5e07cf3df 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -2167,6 +2167,226 @@ TRANS64_FLAGS2(ISA310, PLXV, do_lstxv_PLS_D, false, false)
 TRANS64_FLAGS2(ISA310, PSTXVP, do_lstxv_PLS_D, true, true)
 TRANS64_FLAGS2(ISA310, PLXVP, do_lstxv_PLS_D, false, true)
 
+static void gen_xxeval_i64(TCGv_i64 t, TCGv_i64 a, TCGv_i64 b, TCGv_i64 c,
+   int64_t imm)
+{
+/*
+ * Instead of processing imm bit-by-bit, we'll skip the computation of
+ * conjunctions whose corresponding bit is unset.
+ */
+int bit;
+TCGv_i64 conj, disj;
+
+conj = tcg_temp_new_i64();
+disj = tcg_const_i64(0);
+
+/* Iterate over set bits from the least to the most significan
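
The minterm loop in helper_XXEVAL can be tried outside QEMU with a reduced C
model. This sketch works on a single 64-bit lane instead of Int128 and uses a
hand-written count-trailing-zeros loop; the name xxeval64_model is
hypothetical.

```c
#include <stdint.h>

/*
 * Evaluate the three-input boolean function selected by the 8-bit imm.
 * Each set bit of imm enables one minterm; bit numbering follows the
 * big-endian convention of the patch (bit = 7 - trailing zero count).
 */
static uint64_t xxeval64_model(uint64_t a, uint64_t b, uint64_t c,
                               unsigned imm)
{
    uint64_t disj = 0;

    imm &= 0xff;
    while (imm) {
        unsigned pos = 0;
        unsigned bit;
        uint64_t conj;

        /* Position of the lowest set bit (imm is non-zero here). */
        while (!((imm >> pos) & 1)) {
            pos++;
        }
        bit = 7 - pos;

        conj = (bit & 0x4) ? a : ~a;
        conj &= (bit & 0x2) ? b : ~b;
        conj &= (bit & 0x1) ? c : ~c;
        disj |= conj;

        /* Unset the least significant bit that is set. */
        imm &= imm - 1;
    }
    return disj;
}
```

With this numbering, imm = 0x01 selects A & B & C, and imm = 0x69 enables the
four odd-parity minterms, giving A ^ B ^ C.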

[PATCH v5 35/49] target/ppc: Implement xxgenpcv[bhwd]m instruction

2022-02-25 Thread matheus . ferst
From: Matheus Ferst 

Signed-off-by: Matheus Ferst 
---
v5:
 - One helper for each IMM value.
---
 target/ppc/helper.h | 16 +
 target/ppc/insn32.decode| 10 
 target/ppc/int_helper.c | 91 +
 target/ppc/translate/vsx-impl.c.inc | 43 ++
 4 files changed, 160 insertions(+)

diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index ef9655c7cd..d1ed043b41 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -492,6 +492,22 @@ DEF_HELPER_3(xvrspic, void, env, vsr, vsr)
 DEF_HELPER_3(xvrspim, void, env, vsr, vsr)
 DEF_HELPER_3(xvrspip, void, env, vsr, vsr)
 DEF_HELPER_3(xvrspiz, void, env, vsr, vsr)
+DEF_HELPER_FLAGS_2(XXGENPCVBM_be_exp, TCG_CALL_NO_RWG, void, vsr, avr)
+DEF_HELPER_FLAGS_2(XXGENPCVBM_be_comp, TCG_CALL_NO_RWG, void, vsr, avr)
+DEF_HELPER_FLAGS_2(XXGENPCVBM_le_exp, TCG_CALL_NO_RWG, void, vsr, avr)
+DEF_HELPER_FLAGS_2(XXGENPCVBM_le_comp, TCG_CALL_NO_RWG, void, vsr, avr)
+DEF_HELPER_FLAGS_2(XXGENPCVHM_be_exp, TCG_CALL_NO_RWG, void, vsr, avr)
+DEF_HELPER_FLAGS_2(XXGENPCVHM_be_comp, TCG_CALL_NO_RWG, void, vsr, avr)
+DEF_HELPER_FLAGS_2(XXGENPCVHM_le_exp, TCG_CALL_NO_RWG, void, vsr, avr)
+DEF_HELPER_FLAGS_2(XXGENPCVHM_le_comp, TCG_CALL_NO_RWG, void, vsr, avr)
+DEF_HELPER_FLAGS_2(XXGENPCVWM_be_exp, TCG_CALL_NO_RWG, void, vsr, avr)
+DEF_HELPER_FLAGS_2(XXGENPCVWM_be_comp, TCG_CALL_NO_RWG, void, vsr, avr)
+DEF_HELPER_FLAGS_2(XXGENPCVWM_le_exp, TCG_CALL_NO_RWG, void, vsr, avr)
+DEF_HELPER_FLAGS_2(XXGENPCVWM_le_comp, TCG_CALL_NO_RWG, void, vsr, avr)
+DEF_HELPER_FLAGS_2(XXGENPCVDM_be_exp, TCG_CALL_NO_RWG, void, vsr, avr)
+DEF_HELPER_FLAGS_2(XXGENPCVDM_be_comp, TCG_CALL_NO_RWG, void, vsr, avr)
+DEF_HELPER_FLAGS_2(XXGENPCVDM_le_exp, TCG_CALL_NO_RWG, void, vsr, avr)
+DEF_HELPER_FLAGS_2(XXGENPCVDM_le_comp, TCG_CALL_NO_RWG, void, vsr, avr)
 DEF_HELPER_4(xxextractuw, void, env, vsr, vsr, i32)
 DEF_HELPER_FLAGS_5(XXPERMX, TCG_CALL_NO_RWG, void, vsr, vsr, vsr, vsr, tl)
 DEF_HELPER_4(xxinsertw, void, env, vsr, vsr, i32)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index b8dbac553e..22b245607b 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -119,6 +119,9 @@
 @X_bfl  .. bf:3 - l:1 ra:5 rb:5 ..- &X_bfl
 
 %x_xt   0:1 21:5
+&X_imm5 xt imm:uint8_t vrb
+@X_imm5 .. . imm:5 vrb:5 .. .   &X_imm5 xt=%x_xt
+
 &X_imm8 xt imm:uint8_t
 @X_imm8 .. . .. imm:8 .. .  &X_imm8 xt=%x_xt
 
@@ -615,6 +618,13 @@ XXPERMDI00 . . . 0 .. 01010 ... @XX3_dm
 
 XXSEL   00 . . . . 11   @XX4
 
+## VSX Vector Generate PCV
+
+XXGENPCVBM  00 . . . 1110010100 .   @X_imm5
+XXGENPCVHM  00 . . . 1110010101 .   @X_imm5
+XXGENPCVWM  00 . . . 1110110100 .   @X_imm5
+XXGENPCVDM  00 . . . 1110110101 .   @X_imm5
+
 ## VSX Vector Load Special Value Instruction
 
 LXVKQ   00 . 1 . 0101101000 .   @X_uim5
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index ca59cd3d79..b2b17bb1ca 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -1072,6 +1072,97 @@ void helper_VPERMR(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, ppc_avr_t *c)
 *r = result;
 }
 
+#define XXGENPCV(NAME, SZ) \
+void glue(helper_, glue(NAME, _be_exp))(ppc_vsr_t *t, ppc_vsr_t *b) \
+{   \
+ppc_vsr_t tmp;  \
+\
+/* Initialize tmp with the result of an all-zeros mask */   \
+tmp.VsrD(0) = 0x1011121314151617;   \
+tmp.VsrD(1) = 0x18191A1B1C1D1E1F;   \
+\
+/* Iterate over the most significant byte of each element */\
+for (int i = 0, j = 0; i < ARRAY_SIZE(b->u8); i += SZ) {\
+if (b->VsrB(i) & 0x80) {\
+/* Update each byte of the element */   \
+for (int k = 0; k < SZ; k++) {  \
+tmp.VsrB(i + k) = j + k;\
+}   \
+j += SZ;\
+}   \
+}   \
+\
+*t = tmp;   \
+}   \
+\
+void glue(helper_

[PATCH v5 27/49] target/ppc: implement vrlqmi

2022-02-25 Thread matheus . ferst
From: Matheus Ferst 

Signed-off-by: Matheus Ferst 
---
 target/ppc/insn32.decode|  1 +
 target/ppc/translate/vmx-impl.c.inc | 21 +
 2 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 87d482c5d9..abc2007129 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -495,6 +495,7 @@ VRLQ000100 . . . 101@VX
 
 VRLWMI  000100 . . . 0001101@VX
 VRLDMI  000100 . . . 00011000101@VX
+VRLQMI  000100 . . . 1000101@VX
 
 VRLWNM  000100 . . . 0011101@VX
 VRLDNM  000100 . . . 00111000101@VX
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index eb305e84da..352250fad0 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -1109,7 +1109,8 @@ static void do_vrlq_mask(TCGv_i64 mh, TCGv_i64 ml, TCGv_i64 b, TCGv_i64 e)
 tcg_temp_free_i64(t1);
 }
 
-static bool do_vector_rotl_quad(DisasContext *ctx, arg_VX *a, bool mask)
+static bool do_vector_rotl_quad(DisasContext *ctx, arg_VX *a, bool mask,
+bool insert)
 {
 TCGv_i64 ah, al, vrb, n, t0, t1, zero = tcg_constant_i64(0);
 
@@ -1146,7 +1147,7 @@ static bool do_vector_rotl_quad(DisasContext *ctx, arg_VX *a, bool mask)
 tcg_gen_shri_i64(ah, ah, 1);
 tcg_gen_or_i64(t1, ah, t1);
 
-if (mask) {
+if (mask || insert) {
 tcg_gen_shri_i64(n, vrb, 8);
 tcg_gen_shri_i64(vrb, vrb, 16);
 tcg_gen_andi_i64(n, n, 0x7f);
@@ -1156,6 +1157,17 @@ static bool do_vector_rotl_quad(DisasContext *ctx, arg_VX *a, bool mask)
 
 tcg_gen_and_i64(t0, t0, ah);
 tcg_gen_and_i64(t1, t1, al);
+
+if (insert) {
+get_avr64(n, a->vrt, true);
+get_avr64(vrb, a->vrt, false);
+tcg_gen_not_i64(ah, ah);
+tcg_gen_not_i64(al, al);
+tcg_gen_and_i64(n, n, ah);
+tcg_gen_and_i64(vrb, vrb, al);
+tcg_gen_or_i64(t0, t0, n);
+tcg_gen_or_i64(t1, t1, vrb);
+}
 }
 
 set_avr64(a->vrt, t0, true);
@@ -1171,8 +1183,9 @@ static bool do_vector_rotl_quad(DisasContext *ctx, arg_VX *a, bool mask)
 return true;
 }
 
-TRANS(VRLQ, do_vector_rotl_quad, false)
-TRANS(VRLQNM, do_vector_rotl_quad, true)
+TRANS(VRLQ, do_vector_rotl_quad, false, false)
+TRANS(VRLQNM, do_vector_rotl_quad, true, false)
+TRANS(VRLQMI, do_vector_rotl_quad, false, true)
 
 #define GEN_VXFORM_SAT(NAME, VECE, NORM, SAT, OPC2, OPC3)   \
 static void glue(glue(gen_, NAME), _vec)(unsigned vece, TCGv_vec t, \
-- 
2.25.1
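
Note on the insert path: once the rotated value and the mask are available, vrlqmi keeps the rotated bits where the mask is set and the previous VRT contents everywhere else. A minimal standalone sketch of that last step follows. It assumes a hypothetical u128_model struct (two 64-bit halves) in place of the TCG temporaries; neither the type nor the function exists in QEMU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 128-bit value split into two halves; not a QEMU type. */
typedef struct {
    uint64_t hi;
    uint64_t lo;
} u128_model;

/*
 * Mask-insert step: bits selected by 'mask' come from 'rotated',
 * all other bits keep the old contents of *vrt.
 */
static void vrlqmi_insert_model(u128_model *vrt, u128_model rotated,
                                u128_model mask)
{
    assert(vrt != NULL);
    vrt->hi = (rotated.hi & mask.hi) | (vrt->hi & ~mask.hi);
    vrt->lo = (rotated.lo & mask.lo) | (vrt->lo & ~mask.lo);
}
```

In the patch the same effect is obtained by inverting ah/al in place and reusing n and vrb as scratch registers for the old VRT halves.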




[PATCH v5 32/49] target/ppc: Implement xxpermx instruction

2022-02-25 Thread matheus . ferst
From: Matheus Ferst 

Reviewed-by: Richard Henderson 
Signed-off-by: Matheus Ferst 
---
 target/ppc/helper.h |  1 +
 target/ppc/insn64.decode|  8 
 target/ppc/int_helper.c | 20 
 target/ppc/translate/vsx-impl.c.inc | 22 ++
 4 files changed, 51 insertions(+)

diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index f75a26b4af..06fac7e082 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -493,6 +493,7 @@ DEF_HELPER_3(xvrspim, void, env, vsr, vsr)
 DEF_HELPER_3(xvrspip, void, env, vsr, vsr)
 DEF_HELPER_3(xvrspiz, void, env, vsr, vsr)
 DEF_HELPER_4(xxextractuw, void, env, vsr, vsr, i32)
+DEF_HELPER_FLAGS_5(XXPERMX, TCG_CALL_NO_RWG, void, vsr, vsr, vsr, vsr, tl)
 DEF_HELPER_4(xxinsertw, void, env, vsr, vsr, i32)
 DEF_HELPER_3(xvxsigsp, void, env, vsr, vsr)
 DEF_HELPER_5(XXBLENDVB, void, vsr, vsr, vsr, vsr, i32)
diff --git a/target/ppc/insn64.decode b/target/ppc/insn64.decode
index 9e4f531fb9..0963e064b1 100644
--- a/target/ppc/insn64.decode
+++ b/target/ppc/insn64.decode
@@ -54,6 +54,11 @@
 .. . . . . ..  \
 &8RR_XX4 xt=%8rr_xx_xt xa=%8rr_xx_xa xb=%8rr_xx_xb xc=%8rr_xx_xc
 
+&8RR_XX4_uim3   xt xa xb xc uim3
+@8RR_XX4_uim3   .. ..  .. ... uim3:3 \
+.. . . . . ..    \
+&8RR_XX4_uim3 xt=%8rr_xx_xt xa=%8rr_xx_xa xb=%8rr_xx_xb xc=%8rr_xx_xc
+
 ### Fixed-Point Load Instructions
 
 PLBZ01 10 0--.-- .. \
@@ -194,3 +199,6 @@ XXBLENDVH   01 01  -- -- \
 11 . . . . 01   @8RR_XX4
 XXBLENDVB   01 01  -- -- \
 11 . . . . 00   @8RR_XX4
+
+XXPERMX 01 01  -- --- ... \
+100010 . . . . 00   @8RR_XX4_uim3
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index 6c63c7b227..d9bfdc290f 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -1015,6 +1015,26 @@ VMUL(UW, u32, VsrW, VsrD, uint64_t)
 #undef VMUL_DO_ODD
 #undef VMUL
 
+void helper_XXPERMX(ppc_vsr_t *t, ppc_vsr_t *s0, ppc_vsr_t *s1, ppc_vsr_t *pcv,
+target_ulong uim)
+{
+int i, idx;
+ppc_vsr_t tmp = { .u64 = {0, 0} };
+
+for (i = 0; i < ARRAY_SIZE(t->u8); i++) {
+if ((pcv->VsrB(i) >> 5) == uim) {
+idx = pcv->VsrB(i) & 0x1f;
+if (idx < ARRAY_SIZE(t->u8)) {
+tmp.VsrB(i) = s0->VsrB(idx);
+} else {
+tmp.VsrB(i) = s1->VsrB(idx - ARRAY_SIZE(t->u8));
+}
+}
+}
+
+*t = tmp;
+}
+
 void helper_VPERM(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, ppc_avr_t *c)
 {
 ppc_avr_t result;
diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
index cdefa13590..92851b8926 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -1234,6 +1234,28 @@ static bool trans_XXPERMDI(DisasContext *ctx, arg_XX3_dm *a)
 return true;
 }
 
+static bool trans_XXPERMX(DisasContext *ctx, arg_8RR_XX4_uim3 *a)
+{
+TCGv_ptr xt, xa, xb, xc;
+
+REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+REQUIRE_VSX(ctx);
+
+xt = gen_vsr_ptr(a->xt);
+xa = gen_vsr_ptr(a->xa);
+xb = gen_vsr_ptr(a->xb);
+xc = gen_vsr_ptr(a->xc);
+
+gen_helper_XXPERMX(xt, xa, xb, xc, tcg_constant_tl(a->uim3));
+
+tcg_temp_free_ptr(xt);
+tcg_temp_free_ptr(xa);
+tcg_temp_free_ptr(xb);
+tcg_temp_free_ptr(xc);
+
+return true;
+}
+
 #define GEN_VSX_HELPER_VSX_MADD(name, op1, aop, mop, inval, type) \
 static void gen_##name(DisasContext *ctx) \
 { \
-- 
2.25.1
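
For readers who want to check the byte selection outside of QEMU, the loop in helper_XXPERMX can be modelled roughly as below. This is only a sketch under stated assumptions: xxpermx_model is a hypothetical name, and plain uint8_t[16] arrays with index 0 as the most significant byte stand in for ppc_vsr_t and the VsrB() accessor.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical standalone model of the XXPERMX selection loop.
 * Index 0 is the most significant byte, mirroring VsrB(0).
 */
static void xxpermx_model(uint8_t t[16], const uint8_t s0[16],
                          const uint8_t s1[16], const uint8_t pcv[16],
                          unsigned uim)
{
    uint8_t tmp[16] = { 0 };

    assert(uim < 8); /* UIM is a 3-bit field */

    for (int i = 0; i < 16; i++) {
        /* The top 3 bits of each control byte must match UIM. */
        if ((unsigned)(pcv[i] >> 5) == uim) {
            unsigned idx = pcv[i] & 0x1f;
            /* Indices 0..15 select from s0, 16..31 from s1. */
            tmp[i] = idx < 16 ? s0[idx] : s1[idx - 16];
        }
    }

    /* Write through a temporary so that t may alias an input. */
    memcpy(t, tmp, sizeof(tmp));
}
```

Bytes whose control field does not match UIM stay zero, which is why the helper starts from a zeroed tmp.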




[PATCH v5 14/49] target/ppc: implement vstri[bh][lr]

2022-02-25 Thread matheus . ferst
From: Matheus Ferst 

Reviewed-by: Richard Henderson 
Signed-off-by: Matheus Ferst 
---
 target/ppc/helper.h |  4 
 target/ppc/insn32.decode| 10 ++
 target/ppc/int_helper.c | 28 +++
 target/ppc/translate/vmx-impl.c.inc | 30 +
 4 files changed, 72 insertions(+)

diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 3257203791..e763093dbb 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -207,6 +207,10 @@ DEF_HELPER_4(VINSBLX, void, env, avr, i64, tl)
 DEF_HELPER_4(VINSHLX, void, env, avr, i64, tl)
 DEF_HELPER_4(VINSWLX, void, env, avr, i64, tl)
 DEF_HELPER_4(VINSDLX, void, env, avr, i64, tl)
+DEF_HELPER_FLAGS_2(VSTRIBL, TCG_CALL_NO_RWG, i32, avr, avr)
+DEF_HELPER_FLAGS_2(VSTRIBR, TCG_CALL_NO_RWG, i32, avr, avr)
+DEF_HELPER_FLAGS_2(VSTRIHL, TCG_CALL_NO_RWG, i32, avr, avr)
+DEF_HELPER_FLAGS_2(VSTRIHR, TCG_CALL_NO_RWG, i32, avr, avr)
 DEF_HELPER_2(vnegw, void, avr, avr)
 DEF_HELPER_2(vnegd, void, avr, avr)
 DEF_HELPER_2(vupkhpx, void, avr, avr)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index f0cb6602e2..d844d86829 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -63,6 +63,9 @@
 &VX_bf  bf vra vrb
 @VX_bf  .. bf:3 .. vra:5 vrb:5 ...  &VX_bf
 
+&VX_tb_rc   vrt vrb rc:bool
+@VX_tb_rc   .. vrt:5 . vrb:5 rc:1 ..&VX_tb_rc
+
 &VX_uim4vrt uim vrb
 @VX_uim4.. vrt:5 . uim:4 vrb:5 ...  &VX_uim4
 
@@ -519,6 +522,13 @@ VMULLD  000100 . . . 00111001001@VX
 VMSUMCUD000100 . . . . 010111   @VA
 VMSUMUDM000100 . . . . 100011   @VA
 
+## Vector String Instructions
+
+VSTRIBL 000100 . 0 . . 001101   @VX_tb_rc
+VSTRIBR 000100 . 1 . . 001101   @VX_tb_rc
+VSTRIHL 000100 . 00010 . . 001101   @VX_tb_rc
+VSTRIHR 000100 . 00011 . . 001101   @VX_tb_rc
+
 # VSX Load/Store Instructions
 
 LXV 01 . .  . 001   @DQ_TSX
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index f31dba9469..71b31fbd89 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -1502,6 +1502,34 @@ VEXTRACT(uw, u32)
 VEXTRACT(d, u64)
 #undef VEXTRACT
 
+#define VSTRI(NAME, ELEM, NUM_ELEMS, LEFT) \
+uint32_t helper_##NAME(ppc_avr_t *t, ppc_avr_t *b) \
+{   \
+int i, idx, crf = 0;\
+\
+for (i = 0; i < NUM_ELEMS; i++) {   \
+idx = LEFT ? i : NUM_ELEMS - i - 1; \
+if (b->Vsr##ELEM(idx)) {\
+t->Vsr##ELEM(idx) = b->Vsr##ELEM(idx);  \
+} else {\
+crf = 0b0010;   \
+break;  \
+}   \
+}   \
+\
+for (; i < NUM_ELEMS; i++) {\
+idx = LEFT ? i : NUM_ELEMS - i - 1; \
+t->Vsr##ELEM(idx) = 0;  \
+}   \
+\
+return crf; \
+}
+VSTRI(VSTRIBL, B, 16, true)
+VSTRI(VSTRIBR, B, 16, false)
+VSTRI(VSTRIHL, H, 8, true)
+VSTRI(VSTRIHR, H, 8, false)
+#undef VSTRI
+
 void helper_xxextractuw(CPUPPCState *env, ppc_vsr_t *xt,
 ppc_vsr_t *xb, uint32_t index)
 {
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index ba2106dc7c..6962929826 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -1910,6 +1910,36 @@ static bool trans_MTVSRBMI(DisasContext *ctx, arg_DX_b *a)
 return true;
 }
 
+static bool do_vstri(DisasContext *ctx, arg_VX_tb_rc *a,
+ void (*gen_helper)(TCGv_i32, TCGv_ptr, TCGv_ptr))
+{
+TCGv_ptr vrt, vrb;
+
+REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+REQUIRE_VECTOR(ctx);
+
+vrt = gen_avr_ptr(a->vrt);
+vrb = gen_avr_ptr(a->vrb);
+
+if (a->rc) {
+gen_helper(cpu_crf[6], vrt, vrb);
+} else {
+TCGv_i32 discard = tcg_temp_new_i32();
+gen_helper(discard, vrt, vrb);
+tcg_temp_free_i32(discard);
+}
+
+tcg_temp_free_ptr(vrt);
+tcg_temp_free_ptr(vrb);
+
+return true;
+}
+
+TRANS(VSTRIBL, do_vstri, gen_helper_VSTRIBL)
+TRANS(VSTRIBR, do_vstri, gen_helper_VSTRIBR)
+TRANS(VSTRIHL, do_vstri, gen_helper_VSTRIHL)
+TRANS(VSTRIHR, do_vstri, gen_helper_VSTRIHR)
+
 #define GEN_VAFORM_PAIRED(name0, name1, opc2)   \
 static void glue(gen_, name0##

[PATCH v5 26/49] target/ppc: implement vrlqnm

2022-02-25 Thread matheus . ferst
From: Matheus Ferst 

Signed-off-by: Matheus Ferst 
---
 target/ppc/insn32.decode|  1 +
 target/ppc/translate/vmx-impl.c.inc | 81 +++--
 2 files changed, 77 insertions(+), 5 deletions(-)

diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index c3d47a8815..87d482c5d9 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -498,6 +498,7 @@ VRLDMI  000100 . . . 00011000101@VX
 
 VRLWNM  000100 . . . 0011101@VX
 VRLDNM  000100 . . . 00111000101@VX
+VRLQNM  000100 . . . 00101000101@VX
 
 ## Vector Integer Arithmetic Instructions
 
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index 478a62440d..eb305e84da 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -1055,28 +1055,83 @@ TRANS_FLAGS2(ISA310, VSLQ, do_vector_shift_quad, false, false);
 TRANS_FLAGS2(ISA310, VSRQ, do_vector_shift_quad, true, false);
 TRANS_FLAGS2(ISA310, VSRAQ, do_vector_shift_quad, true, true);
 
-static bool trans_VRLQ(DisasContext *ctx, arg_VX *a)
+static void do_vrlq_mask(TCGv_i64 mh, TCGv_i64 ml, TCGv_i64 b, TCGv_i64 e)
 {
-TCGv_i64 ah, al, n, t0, t1, zero = tcg_constant_i64(0);
+TCGv_i64 th, tl, t0, t1, zero = tcg_constant_i64(0),
+ ones = tcg_constant_i64(-1);
+
+th = tcg_temp_new_i64();
+tl = tcg_temp_new_i64();
+t0 = tcg_temp_new_i64();
+t1 = tcg_temp_new_i64();
+
+/* m = ~0 >> b */
+tcg_gen_andi_i64(t0, b, 64);
+tcg_gen_movcond_i64(TCG_COND_NE, t1, t0, zero, zero, ones);
+tcg_gen_andi_i64(t0, b, 0x3F);
+tcg_gen_shr_i64(mh, t1, t0);
+tcg_gen_shr_i64(ml, ones, t0);
+tcg_gen_xori_i64(t0, t0, 63);
+tcg_gen_shl_i64(t1, t1, t0);
+tcg_gen_shli_i64(t1, t1, 1);
+tcg_gen_or_i64(ml, t1, ml);
+
+/* t = ~0 >> e */
+tcg_gen_andi_i64(t0, e, 64);
+tcg_gen_movcond_i64(TCG_COND_NE, t1, t0, zero, zero, ones);
+tcg_gen_andi_i64(t0, e, 0x3F);
+tcg_gen_shr_i64(th, t1, t0);
+tcg_gen_shr_i64(tl, ones, t0);
+tcg_gen_xori_i64(t0, t0, 63);
+tcg_gen_shl_i64(t1, t1, t0);
+tcg_gen_shli_i64(t1, t1, 1);
+tcg_gen_or_i64(tl, t1, tl);
+
+/* t = t >> 1 */
+tcg_gen_shli_i64(t0, th, 63);
+tcg_gen_shri_i64(tl, tl, 1);
+tcg_gen_shri_i64(th, th, 1);
+tcg_gen_or_i64(tl, t0, tl);
+
+/* m = m ^ t */
+tcg_gen_xor_i64(mh, mh, th);
+tcg_gen_xor_i64(ml, ml, tl);
+
+/* Negate the mask if begin > end */
+tcg_gen_movcond_i64(TCG_COND_GT, t0, b, e, ones, zero);
+
+tcg_gen_xor_i64(mh, mh, t0);
+tcg_gen_xor_i64(ml, ml, t0);
+
+tcg_temp_free_i64(th);
+tcg_temp_free_i64(tl);
+tcg_temp_free_i64(t0);
+tcg_temp_free_i64(t1);
+}
+
+static bool do_vector_rotl_quad(DisasContext *ctx, arg_VX *a, bool mask)
+{
+TCGv_i64 ah, al, vrb, n, t0, t1, zero = tcg_constant_i64(0);
 
 REQUIRE_VECTOR(ctx);
 REQUIRE_INSNS_FLAGS2(ctx, ISA310);
 
 ah = tcg_temp_new_i64();
 al = tcg_temp_new_i64();
+vrb = tcg_temp_new_i64();
 n = tcg_temp_new_i64();
 t0 = tcg_temp_new_i64();
 t1 = tcg_temp_new_i64();
 
 get_avr64(ah, a->vra, true);
 get_avr64(al, a->vra, false);
-get_avr64(n, a->vrb, true);
+get_avr64(vrb, a->vrb, true);
 
 tcg_gen_mov_i64(t0, ah);
-tcg_gen_andi_i64(t1, n, 64);
+tcg_gen_andi_i64(t1, vrb, 64);
 tcg_gen_movcond_i64(TCG_COND_NE, ah, t1, zero, al, ah);
 tcg_gen_movcond_i64(TCG_COND_NE, al, t1, zero, t0, al);
-tcg_gen_andi_i64(n, n, 0x3F);
+tcg_gen_andi_i64(n, vrb, 0x3F);
 
 tcg_gen_shl_i64(t0, ah, n);
 tcg_gen_shl_i64(t1, al, n);
@@ -1091,11 +1146,24 @@ static bool trans_VRLQ(DisasContext *ctx, arg_VX *a)
 tcg_gen_shri_i64(ah, ah, 1);
 tcg_gen_or_i64(t1, ah, t1);
 
+if (mask) {
+tcg_gen_shri_i64(n, vrb, 8);
+tcg_gen_shri_i64(vrb, vrb, 16);
+tcg_gen_andi_i64(n, n, 0x7f);
+tcg_gen_andi_i64(vrb, vrb, 0x7f);
+
+do_vrlq_mask(ah, al, vrb, n);
+
+tcg_gen_and_i64(t0, t0, ah);
+tcg_gen_and_i64(t1, t1, al);
+}
+
 set_avr64(a->vrt, t0, true);
 set_avr64(a->vrt, t1, false);
 
 tcg_temp_free_i64(ah);
 tcg_temp_free_i64(al);
+tcg_temp_free_i64(vrb);
 tcg_temp_free_i64(n);
 tcg_temp_free_i64(t0);
 tcg_temp_free_i64(t1);
@@ -1103,6 +1171,9 @@ static bool trans_VRLQ(DisasContext *ctx, arg_VX *a)
 return true;
 }
 
+TRANS(VRLQ, do_vector_rotl_quad, false)
+TRANS(VRLQNM, do_vector_rotl_quad, true)
+
 #define GEN_VXFORM_SAT(NAME, VECE, NORM, SAT, OPC2, OPC3)   \
 static void glue(glue(gen_, NAME), _vec)(unsigned vece, TCGv_vec t, \
  TCGv_vec sat, TCGv_vec a,  \
-- 
2.25.1
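
The mask construction in do_vrlq_mask follows the steps named in its comments (m = ~0 >> b, t = ~0 >> e, t >>= 1, m ^= t, invert if b > e). A plain C sketch of the same arithmetic is given below for reference. The helper names and the hi/lo pointer interface are assumptions made for this illustration and are not part of the patch; bit 0 is taken to be the most significant bit, as in the ISA.

```c
#include <assert.h>
#include <stdint.h>

/* Logical right shift of the 128-bit value hi:lo by n (0..127). */
static void shr128_model(uint64_t *hi, uint64_t *lo, unsigned n)
{
    assert(n < 128);
    if (n >= 64) {
        *lo = *hi >> (n - 64);
        *hi = 0;
    } else if (n > 0) {
        *lo = (*lo >> n) | (*hi << (64 - n));
        *hi >>= n;
    }
}

/* Mask with bits b..e set, bit 0 being the most significant bit. */
static void vrlq_mask_model(unsigned b, unsigned e,
                            uint64_t *mh, uint64_t *ml)
{
    uint64_t th = ~UINT64_C(0), tl = ~UINT64_C(0);

    *mh = ~UINT64_C(0);
    *ml = ~UINT64_C(0);
    shr128_model(mh, ml, b);      /* m = ~0 >> b */
    shr128_model(&th, &tl, e);    /* t = ~0 >> e */
    shr128_model(&th, &tl, 1);    /* t = t >> 1 */
    *mh ^= th;                    /* m = m ^ t */
    *ml ^= tl;
    if (b > e) {                  /* wrap-around case: invert the mask */
        *mh = ~*mh;
        *ml = ~*ml;
    }
}
```

The TCG version cannot branch on the shift amount, so it splits each shift into a movcond on bit 6 plus a shift by the low six bits, but the resulting mask should be the same.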




[PATCH v5 24/49] target/ppc: move vrl[bhwd]nm/vrl[bhwd]mi to decodetree

2022-02-25 Thread matheus . ferst
From: Matheus Ferst 

Signed-off-by: Matheus Ferst 
---
I couldn't figure out how to use tcg_gen_gvec_rotlv here. Since the code
is in the fniv implementation, we have TCGv_vec instead of offsets. I'm
keeping the masking for now, so the generated code has the desired
effect.
---
 target/ppc/helper.h |   8 +-
 target/ppc/insn32.decode|   6 ++
 target/ppc/int_helper.c |  50 -
 target/ppc/translate/vmx-impl.c.inc | 152 ++--
 target/ppc/translate/vmx-ops.c.inc  |   5 +-
 5 files changed, 182 insertions(+), 39 deletions(-)

diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index e763093dbb..6bd7fad70c 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -271,10 +271,10 @@ DEF_HELPER_4(vmaxfp, void, env, avr, avr, avr)
 DEF_HELPER_4(vminfp, void, env, avr, avr, avr)
 DEF_HELPER_3(vrefp, void, env, avr, avr)
 DEF_HELPER_3(vrsqrtefp, void, env, avr, avr)
-DEF_HELPER_3(vrlwmi, void, avr, avr, avr)
-DEF_HELPER_3(vrldmi, void, avr, avr, avr)
-DEF_HELPER_3(vrldnm, void, avr, avr, avr)
-DEF_HELPER_3(vrlwnm, void, avr, avr, avr)
+DEF_HELPER_FLAGS_4(VRLWMI, TCG_CALL_NO_RWG, void, avr, avr, avr, i32)
+DEF_HELPER_FLAGS_4(VRLDMI, TCG_CALL_NO_RWG, void, avr, avr, avr, i32)
+DEF_HELPER_FLAGS_4(VRLDNM, TCG_CALL_NO_RWG, void, avr, avr, avr, i32)
+DEF_HELPER_FLAGS_4(VRLWNM, TCG_CALL_NO_RWG, void, avr, avr, avr, i32)
 DEF_HELPER_5(vmaddfp, void, env, avr, avr, avr, avr)
 DEF_HELPER_5(vnmsubfp, void, env, avr, avr, avr, avr)
 DEF_HELPER_3(vexptefp, void, env, avr, avr)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index d918e2d0f2..e788dc5152 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -492,6 +492,12 @@ VRLH000100 . . . 1000100@VX
 VRLW000100 . . . 0001100@VX
 VRLD000100 . . . 00011000100@VX
 
+VRLWMI  000100 . . . 0001101@VX
+VRLDMI  000100 . . . 00011000101@VX
+
+VRLWNM  000100 . . . 0011101@VX
+VRLDNM  000100 . . . 00111000101@VX
+
 ## Vector Integer Arithmetic Instructions
 
 VEXTSB2W000100 . 1 . 1100010@VX_tb
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index 71b31fbd89..f52242ca81 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -1275,33 +1275,33 @@ void helper_vrsqrtefp(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *b)
 }
 }
 
-#define VRLMI(name, size, element, insert)\
-void helper_##name(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)  \
-{ \
-int i;\
-for (i = 0; i < ARRAY_SIZE(r->element); i++) {\
-uint##size##_t src1 = a->element[i];  \
-uint##size##_t src2 = b->element[i];  \
-uint##size##_t src3 = r->element[i];  \
-uint##size##_t begin, end, shift, mask, rot_val;  \
-  \
-shift = extract##size(src2, 0, 6);\
-end   = extract##size(src2, 8, 6);\
-begin = extract##size(src2, 16, 6);   \
-rot_val = rol##size(src1, shift); \
-mask = mask_u##size(begin, end);  \
-if (insert) { \
-r->element[i] = (rot_val & mask) | (src3 & ~mask);\
-} else {  \
-r->element[i] = (rot_val & mask); \
-} \
-} \
+#define VRLMI(name, size, element, insert)  \
+void helper_##name(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b, uint32_t desc) \
+{   \
+int i;  \
+for (i = 0; i < ARRAY_SIZE(r->element); i++) {  \
+uint##size##_t src1 = a->element[i];\
+uint##size##_t src2 = b->element[i];\
+uint##size##_t src3 = r->element[i];\
+uint##size##_t begin, end, shift, mask, rot_val;\
+\
+shift = extract##size(src2, 0, 6);  \
+end   = extract##size(src2, 8, 6);  

[PATCH v5 11/49] target/ppc: Implement Vector Compare Equal Quadword

2022-02-25 Thread matheus . ferst
From: Matheus Ferst 

Implement the following PowerISA v3.1 instructions:
vcmpequq: Vector Compare Equal Quadword

Suggested-by: Richard Henderson 
Reviewed-by: Richard Henderson 
Signed-off-by: Matheus Ferst 
---
 target/ppc/insn32.decode|  1 +
 target/ppc/translate/vmx-impl.c.inc | 36 +
 2 files changed, 37 insertions(+)

diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index be9e05cc73..437a3e29e0 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -382,6 +382,7 @@ VCMPEQUB000100 . . . . 000110   @VC
 VCMPEQUH000100 . . . . 0001000110   @VC
 VCMPEQUW000100 . . . . 001110   @VC
 VCMPEQUD000100 . . . . 0011000111   @VC
+VCMPEQUQ000100 . . . . 0111000111   @VC
 
 VCMPGTSB000100 . . . . 110110   @VC
 VCMPGTSH000100 . . . . 1101000110   @VC
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index 0574bb8bab..b7e9afb978 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -1107,6 +1107,42 @@ TRANS(VCMPNEZB, do_vcmpnez, MO_8)
 TRANS(VCMPNEZH, do_vcmpnez, MO_16)
 TRANS(VCMPNEZW, do_vcmpnez, MO_32)
 
+static bool trans_VCMPEQUQ(DisasContext *ctx, arg_VC *a)
+{
+TCGv_i64 t0, t1, t2;
+
+t0 = tcg_temp_new_i64();
+t1 = tcg_temp_new_i64();
+t2 = tcg_temp_new_i64();
+
+get_avr64(t0, a->vra, true);
+get_avr64(t1, a->vrb, true);
+tcg_gen_xor_i64(t2, t0, t1);
+
+get_avr64(t0, a->vra, false);
+get_avr64(t1, a->vrb, false);
+tcg_gen_xor_i64(t1, t0, t1);
+
+tcg_gen_or_i64(t1, t1, t2);
+tcg_gen_setcondi_i64(TCG_COND_EQ, t1, t1, 0);
+tcg_gen_neg_i64(t1, t1);
+
+set_avr64(a->vrt, t1, true);
+set_avr64(a->vrt, t1, false);
+
+if (a->rc) {
+tcg_gen_extrl_i64_i32(cpu_crf[6], t1);
+tcg_gen_andi_i32(cpu_crf[6], cpu_crf[6], 0xa);
+tcg_gen_xori_i32(cpu_crf[6], cpu_crf[6], 0x2);
+}
+
+tcg_temp_free_i64(t0);
+tcg_temp_free_i64(t1);
+tcg_temp_free_i64(t2);
+
+return true;
+}
+
 GEN_VXRFORM(vcmpeqfp, 3, 3)
 GEN_VXRFORM(vcmpgefp, 3, 7)
 GEN_VXRFORM(vcmpgtfp, 3, 11)
-- 
2.25.1
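
As a cross-check of the CR6 arithmetic above: the result doubleword is all ones or all zeros, so masking its low bits with 0xa and flipping bit 1 yields 0b1000 when the quadwords are equal and 0b0010 when they differ. A small standalone sketch, assuming a hypothetical vcmpequq_model function operating on explicit 64-bit halves (not QEMU code):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Returns the value written to both doublewords of VRT and stores
 * the CR6 field that the Rc=1 form would produce in *cr6.
 */
static uint64_t vcmpequq_model(uint64_t ah, uint64_t al,
                               uint64_t bh, uint64_t bl, uint32_t *cr6)
{
    uint64_t diff, res;

    assert(cr6 != NULL);

    /* Any differing bit in either half makes diff non-zero. */
    diff = (ah ^ bh) | (al ^ bl);
    /* 0 -> all ones, non-zero -> all zeros (setcond followed by neg). */
    res = -(uint64_t)(diff == 0);
    /* Same bit trick as the andi/xori pair in the patch. */
    *cr6 = ((uint32_t)res & 0xa) ^ 0x2;

    return res;
}
```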




[PATCH v5 25/49] target/ppc: implement vrlq

2022-02-25 Thread matheus . ferst
From: Matheus Ferst 

Reviewed-by: Richard Henderson 
Signed-off-by: Matheus Ferst 
---
 target/ppc/insn32.decode|  1 +
 target/ppc/translate/vmx-impl.c.inc | 48 +
 2 files changed, 49 insertions(+)

diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index e788dc5152..c3d47a8815 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -491,6 +491,7 @@ VRLB000100 . . . 100@VX
 VRLH000100 . . . 1000100@VX
 VRLW000100 . . . 0001100@VX
 VRLD000100 . . . 00011000100@VX
+VRLQ000100 . . . 101@VX
 
 VRLWMI  000100 . . . 0001101@VX
 VRLDMI  000100 . . . 00011000101@VX
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index 09d6c88e62..478a62440d 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -1055,6 +1055,54 @@ TRANS_FLAGS2(ISA310, VSLQ, do_vector_shift_quad, false, false);
 TRANS_FLAGS2(ISA310, VSRQ, do_vector_shift_quad, true, false);
 TRANS_FLAGS2(ISA310, VSRAQ, do_vector_shift_quad, true, true);
 
+static bool trans_VRLQ(DisasContext *ctx, arg_VX *a)
+{
+TCGv_i64 ah, al, n, t0, t1, zero = tcg_constant_i64(0);
+
+REQUIRE_VECTOR(ctx);
+REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+
+ah = tcg_temp_new_i64();
+al = tcg_temp_new_i64();
+n = tcg_temp_new_i64();
+t0 = tcg_temp_new_i64();
+t1 = tcg_temp_new_i64();
+
+get_avr64(ah, a->vra, true);
+get_avr64(al, a->vra, false);
+get_avr64(n, a->vrb, true);
+
+tcg_gen_mov_i64(t0, ah);
+tcg_gen_andi_i64(t1, n, 64);
+tcg_gen_movcond_i64(TCG_COND_NE, ah, t1, zero, al, ah);
+tcg_gen_movcond_i64(TCG_COND_NE, al, t1, zero, t0, al);
+tcg_gen_andi_i64(n, n, 0x3F);
+
+tcg_gen_shl_i64(t0, ah, n);
+tcg_gen_shl_i64(t1, al, n);
+
+tcg_gen_xori_i64(n, n, 63);
+
+tcg_gen_shr_i64(al, al, n);
+tcg_gen_shri_i64(al, al, 1);
+tcg_gen_or_i64(t0, al, t0);
+
+tcg_gen_shr_i64(ah, ah, n);
+tcg_gen_shri_i64(ah, ah, 1);
+tcg_gen_or_i64(t1, ah, t1);
+
+set_avr64(a->vrt, t0, true);
+set_avr64(a->vrt, t1, false);
+
+tcg_temp_free_i64(ah);
+tcg_temp_free_i64(al);
+tcg_temp_free_i64(n);
+tcg_temp_free_i64(t0);
+tcg_temp_free_i64(t1);
+
+return true;
+}
+
 #define GEN_VXFORM_SAT(NAME, VECE, NORM, SAT, OPC2, OPC3)   \
 static void glue(glue(gen_, NAME), _vec)(unsigned vece, TCGv_vec t, \
  TCGv_vec sat, TCGv_vec a,  \
-- 
2.25.1
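
The rotate above avoids a shift by 64 (which is out of range for a 64-bit shift) by shifting right by (n ^ 63), i.e. 63 - n, and then by one more bit. The same idea in plain C is sketched below. rotl128_model is a hypothetical helper written for this note, not something in the tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rotate the 128-bit value (*hi):(*lo) left by n bits, n taken mod 128. */
static void rotl128_model(uint64_t *hi, uint64_t *lo, unsigned n)
{
    uint64_t ah, al, t0, t1;

    assert(hi != NULL && lo != NULL);

    ah = *hi;
    al = *lo;

    /* Bit 6 of the count swaps the two halves (rotate by 64). */
    if (n & 64) {
        uint64_t tmp = ah;
        ah = al;
        al = tmp;
    }
    n &= 0x3f;

    t0 = ah << n;
    t1 = al << n;

    /*
     * (x >> (63 - n)) >> 1 equals x >> (64 - n) for n in 1..63 and
     * gives 0 for n == 0, so no shift count ever reaches 64.
     */
    t0 |= (al >> (n ^ 63)) >> 1;
    t1 |= (ah >> (n ^ 63)) >> 1;

    *hi = t0;
    *lo = t1;
}
```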




[PATCH v5 16/49] target/ppc: implement vclrrb

2022-02-25 Thread matheus . ferst
From: Matheus Ferst 

Reviewed-by: Richard Henderson 
Signed-off-by: Matheus Ferst 
---
 target/ppc/insn32.decode|  1 +
 target/ppc/translate/vmx-impl.c.inc | 32 +
 2 files changed, 25 insertions(+), 8 deletions(-)

diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 31cdbba86b..b20f1eaa8e 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -530,6 +530,7 @@ VSTRIHL 000100 . 00010 . . 001101   @VX_tb_rc
 VSTRIHR 000100 . 00011 . . 001101   @VX_tb_rc
 
 VCLRLB  000100 . . . 00110001101@VX
+VCLRRB  000100 . . . 00111001101@VX
 
 # VSX Load/Store Instructions
 
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index d43fba00ed..4db5656669 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -1940,7 +1940,7 @@ TRANS(VSTRIBR, do_vstri, gen_helper_VSTRIBR)
 TRANS(VSTRIHL, do_vstri, gen_helper_VSTRIHL)
 TRANS(VSTRIHR, do_vstri, gen_helper_VSTRIHR)
 
-static bool trans_VCLRLB(DisasContext *ctx, arg_VX *a)
+static bool do_vclrb(DisasContext *ctx, arg_VX *a, bool right)
 {
 TCGv_i64 rb, mh, ml, tmp,
  ones = tcg_constant_i64(-1),
@@ -1954,15 +1954,28 @@ static bool trans_VCLRLB(DisasContext *ctx, arg_VX *a)
 tcg_gen_extu_tl_i64(rb, cpu_gpr[a->vrb]);
 tcg_gen_andi_i64(tmp, rb, 7);
 tcg_gen_shli_i64(tmp, tmp, 3);
-tcg_gen_shl_i64(tmp, ones, tmp);
+if (right) {
+tcg_gen_shr_i64(tmp, ones, tmp);
+} else {
+tcg_gen_shl_i64(tmp, ones, tmp);
+}
 tcg_gen_not_i64(tmp, tmp);
 
-tcg_gen_movcond_i64(TCG_COND_LTU, ml, rb, tcg_constant_i64(8),
-tmp, ones);
-tcg_gen_movcond_i64(TCG_COND_LTU, mh, rb, tcg_constant_i64(8),
-zero, tmp);
-tcg_gen_movcond_i64(TCG_COND_LTU, mh, rb, tcg_constant_i64(16),
-mh, ones);
+if (right) {
+tcg_gen_movcond_i64(TCG_COND_LTU, mh, rb, tcg_constant_i64(8),
+tmp, ones);
+tcg_gen_movcond_i64(TCG_COND_LTU, ml, rb, tcg_constant_i64(8),
+zero, tmp);
+tcg_gen_movcond_i64(TCG_COND_LTU, ml, rb, tcg_constant_i64(16),
+ml, ones);
+} else {
+tcg_gen_movcond_i64(TCG_COND_LTU, ml, rb, tcg_constant_i64(8),
+tmp, ones);
+tcg_gen_movcond_i64(TCG_COND_LTU, mh, rb, tcg_constant_i64(8),
+zero, tmp);
+tcg_gen_movcond_i64(TCG_COND_LTU, mh, rb, tcg_constant_i64(16),
+mh, ones);
+}
 
 get_avr64(tmp, a->vra, true);
 tcg_gen_and_i64(tmp, tmp, mh);
@@ -1980,6 +1993,9 @@ static bool trans_VCLRLB(DisasContext *ctx, arg_VX *a)
 return true;
 }
 
+TRANS(VCLRLB, do_vclrb, false)
+TRANS(VCLRRB, do_vclrb, true)
+
 #define GEN_VAFORM_PAIRED(name0, name1, opc2)   \
 static void glue(gen_, name0##_##name1)(DisasContext *ctx)  \
 {   \
-- 
2.25.1
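
To make the three movcond operations easier to follow, here is a plain C sketch of the masks they appear to build: vclrlb keeps the rightmost N bytes of VRA and vclrrb keeps the leftmost N bytes, with N taken from the GPR and everything kept once N >= 16. The function name and the pointer interface are assumptions for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Compute the AND masks for the high (*mh) and low (*ml) doublewords.
 * right == false models vclrlb, right == true models vclrrb.
 */
static void vclrb_mask_model(uint64_t n, bool right,
                             uint64_t *mh, uint64_t *ml)
{
    const uint64_t ones = ~UINT64_C(0);
    unsigned sh;
    uint64_t part, first, second;

    assert(mh != NULL && ml != NULL);

    if (n >= 16) {
        /* Nothing is cleared. */
        *mh = ones;
        *ml = ones;
        return;
    }

    /* Partial mask covering (n mod 8) bytes of one doubleword. */
    sh = (unsigned)(n & 7) * 8;
    part = right ? ~(ones >> sh) : ~(ones << sh);

    /* 'first' is the doubleword that fills up before the other one. */
    first = n < 8 ? part : ones;
    second = n < 8 ? 0 : part;

    if (right) {
        *mh = first;
        *ml = second;
    } else {
        *ml = first;
        *mh = second;
    }
}
```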




[PATCH v5 22/49] target/ppc: implement vsraq

2022-02-25 Thread matheus . ferst
From: Matheus Ferst 

Reviewed-by: Richard Henderson 
Signed-off-by: Matheus Ferst 
---
 target/ppc/insn32.decode|  1 +
 target/ppc/translate/vmx-impl.c.inc | 23 +--
 2 files changed, 18 insertions(+), 6 deletions(-)

diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 96ee730242..7a9fc1dffa 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -485,6 +485,7 @@ VSRAB   000100 . . . 0110100@VX
 VSRAH   000100 . . . 01101000100@VX
 VSRAW   000100 . . . 0111100@VX
 VSRAD   000100 . . . 0000100@VX
+VSRAQ   000100 . . . 0110101@VX
 
 ## Vector Integer Arithmetic Instructions
 
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index 8a1e64d7f2..27ed87fcd6 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -834,9 +834,10 @@ TRANS_FLAGS(ALTIVEC, VSRAH, do_vector_gvec3_VX, MO_16, tcg_gen_gvec_sarv);
 TRANS_FLAGS(ALTIVEC, VSRAW, do_vector_gvec3_VX, MO_32, tcg_gen_gvec_sarv);
 TRANS_FLAGS2(ALTIVEC_207, VSRAD, do_vector_gvec3_VX, MO_64, tcg_gen_gvec_sarv);
 
-static bool do_vector_shift_quad(DisasContext *ctx, arg_VX *a, bool right)
+static bool do_vector_shift_quad(DisasContext *ctx, arg_VX *a, bool right,
+ bool alg)
 {
-TCGv_i64 hi, lo, t0, n, zero = tcg_constant_i64(0);
+TCGv_i64 hi, lo, t0, t1, n, zero = tcg_constant_i64(0);
 
 REQUIRE_VECTOR(ctx);
 
@@ -844,6 +845,7 @@ static bool do_vector_shift_quad(DisasContext *ctx, arg_VX *a, bool right)
 hi = tcg_temp_new_i64();
 lo = tcg_temp_new_i64();
 t0 = tcg_temp_new_i64();
+t1 = tcg_const_i64(0);
 
 get_avr64(lo, a->vra, false);
 get_avr64(hi, a->vra, true);
@@ -853,7 +855,10 @@ static bool do_vector_shift_quad(DisasContext *ctx, arg_VX *a, bool right)
 tcg_gen_andi_i64(t0, n, 64);
 if (right) {
 tcg_gen_movcond_i64(TCG_COND_NE, lo, t0, zero, hi, lo);
-tcg_gen_movcond_i64(TCG_COND_NE, hi, t0, zero, zero, hi);
+if (alg) {
+tcg_gen_sari_i64(t1, lo, 63);
+}
+tcg_gen_movcond_i64(TCG_COND_NE, hi, t0, zero, t1, hi);
 } else {
 tcg_gen_movcond_i64(TCG_COND_NE, hi, t0, zero, lo, hi);
 tcg_gen_movcond_i64(TCG_COND_NE, lo, t0, zero, zero, lo);
@@ -861,7 +866,11 @@ static bool do_vector_shift_quad(DisasContext *ctx, arg_VX *a, bool right)
 tcg_gen_andi_i64(n, n, 0x3F);
 
 if (right) {
-tcg_gen_shr_i64(t0, hi, n);
+if (alg) {
+tcg_gen_sar_i64(t0, hi, n);
+} else {
+tcg_gen_shr_i64(t0, hi, n);
+}
 } else {
 tcg_gen_shl_i64(t0, lo, n);
 }
@@ -886,13 +895,15 @@ static bool do_vector_shift_quad(DisasContext *ctx, arg_VX *a, bool right)
 tcg_temp_free_i64(hi);
 tcg_temp_free_i64(lo);
 tcg_temp_free_i64(t0);
+tcg_temp_free_i64(t1);
 tcg_temp_free_i64(n);
 
 return true;
 }
 
-TRANS_FLAGS2(ISA310, VSLQ, do_vector_shift_quad, false);
-TRANS_FLAGS2(ISA310, VSRQ, do_vector_shift_quad, true);
+TRANS_FLAGS2(ISA310, VSLQ, do_vector_shift_quad, false, false);
+TRANS_FLAGS2(ISA310, VSRQ, do_vector_shift_quad, true, false);
+TRANS_FLAGS2(ISA310, VSRAQ, do_vector_shift_quad, true, true);
 
 #define GEN_VXFORM_SAT(NAME, VECE, NORM, SAT, OPC2, OPC3)   \
 static void glue(glue(gen_, NAME), _vec)(unsigned vece, TCGv_vec t, \
-- 
2.25.1
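
The arithmetic variant differs from vsrq only in what is shifted in from the left: copies of the sign bit instead of zeros. A reference sketch in portable C is shown below. It uses unsigned arithmetic only, since right-shifting a negative signed value is implementation-defined in C; vsraq_model is a hypothetical name and the count is assumed to be the 7-bit shift amount.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Arithmetic right shift of the 128-bit value (*hi):(*lo) by n mod 128. */
static void vsraq_model(uint64_t *hi, uint64_t *lo, unsigned n)
{
    uint64_t h, l, sign;

    assert(hi != NULL && lo != NULL);

    h = *hi;
    l = *lo;
    /* All ones if the quadword is negative, zero otherwise. */
    sign = (h >> 63) ? ~UINT64_C(0) : 0;

    n &= 0x7f;
    if (n & 64) {
        /* Shift by a whole doubleword: hi moves down, sign fills hi. */
        l = h;
        h = sign;
    }
    n &= 0x3f;
    if (n != 0) {
        l = (l >> n) | (h << (64 - n));
        h = (h >> n) | (sign << (64 - n));
    }

    *hi = h;
    *lo = l;
}
```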




[PATCH v5 12/49] target/ppc: Implement Vector Compare Greater Than Quadword

2022-02-25 Thread matheus . ferst
From: Matheus Ferst 

Implement the following PowerISA v3.1 instructions:
vcmpgtsq: Vector Compare Greater Than Signed Quadword
vcmpgtuq: Vector Compare Greater Than Unsigned Quadword

Reviewed-by: Richard Henderson 
Signed-off-by: Matheus Ferst 
---
 target/ppc/insn32.decode|  2 ++
 target/ppc/translate/vmx-impl.c.inc | 39 +
 2 files changed, 41 insertions(+)

diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 437a3e29e0..07a4ef9103 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -388,11 +388,13 @@ VCMPGTSB000100 . . . . 110110   @VC
 VCMPGTSH000100 . . . . 1101000110   @VC
 VCMPGTSW000100 . . . . 111110   @VC
 VCMPGTSD000100 . . . . 000111   @VC
+VCMPGTSQ000100 . . . . 111111   @VC
 
 VCMPGTUB000100 . . . . 100110   @VC
 VCMPGTUH000100 . . . . 1001000110   @VC
 VCMPGTUW000100 . . . . 101110   @VC
 VCMPGTUD000100 . . . . 1011000111   @VC
+VCMPGTUQ000100 . . . . 101111   @VC
 
 VCMPNEB 000100 . . . . 000111   @VC
 VCMPNEH 000100 . . . . 0001000111   @VC
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index b7e9afb978..7f9913235e 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -1143,6 +1143,45 @@ static bool trans_VCMPEQUQ(DisasContext *ctx, arg_VC *a)
 return true;
 }
 
+static bool do_vcmpgtq(DisasContext *ctx, arg_VC *a, bool sign)
+{
+TCGv_i64 t0, t1, t2;
+
+t0 = tcg_temp_new_i64();
+t1 = tcg_temp_new_i64();
+t2 = tcg_temp_new_i64();
+
+get_avr64(t0, a->vra, false);
+get_avr64(t1, a->vrb, false);
+tcg_gen_setcond_i64(TCG_COND_GTU, t2, t0, t1);
+
+get_avr64(t0, a->vra, true);
+get_avr64(t1, a->vrb, true);
+tcg_gen_movcond_i64(TCG_COND_EQ, t2, t0, t1, t2, tcg_constant_i64(0));
+tcg_gen_setcond_i64(sign ? TCG_COND_GT : TCG_COND_GTU, t1, t0, t1);
+
+tcg_gen_or_i64(t1, t1, t2);
+tcg_gen_neg_i64(t1, t1);
+
+set_avr64(a->vrt, t1, true);
+set_avr64(a->vrt, t1, false);
+
+if (a->rc) {
+tcg_gen_extrl_i64_i32(cpu_crf[6], t1);
+tcg_gen_andi_i32(cpu_crf[6], cpu_crf[6], 0xa);
+tcg_gen_xori_i32(cpu_crf[6], cpu_crf[6], 0x2);
+}
+
+tcg_temp_free_i64(t0);
+tcg_temp_free_i64(t1);
+tcg_temp_free_i64(t2);
+
+return true;
+}
+
+TRANS(VCMPGTSQ, do_vcmpgtq, true)
+TRANS(VCMPGTUQ, do_vcmpgtq, false)
+
 GEN_VXRFORM(vcmpeqfp, 3, 3)
 GEN_VXRFORM(vcmpgefp, 3, 7)
 GEN_VXRFORM(vcmpgtfp, 3, 11)
-- 
2.25.1




[PATCH v5 15/49] target/ppc: implement vclrlb

2022-02-25 Thread matheus . ferst
From: Matheus Ferst 

Reviewed-by: Richard Henderson 
Signed-off-by: Matheus Ferst 
---
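The mask computation in trans_VCLRLB can be read as the following plain C
sketch. The names clr_masks and vclrlb_masks are hypothetical and only
mirror the three movcond operations in the patch: the rightmost n bytes
of the 128-bit register are kept and everything to their left is cleared.

```c
#include <stdint.h>

/* AND masks for the high (mh) and low (ml) doublewords. */
typedef struct {
    uint64_t mh;
    uint64_t ml;
} clr_masks;

static clr_masks vclrlb_masks(uint64_t n)
{
    /* Mask keeping the (n mod 8) rightmost bytes of one doubleword. */
    uint64_t partial = ~(UINT64_MAX << ((n & 7) * 8));
    clr_masks m;

    /* n < 8: only part of the low doubleword survives. */
    m.ml = (n < 8) ? partial : UINT64_MAX;
    m.mh = (n < 8) ? 0 : partial;
    /* n >= 16: nothing is cleared. */
    if (n >= 16) {
        m.mh = UINT64_MAX;
    }
    return m;
}
```

For example, n = 9 keeps the whole low doubleword and one byte of the
high doubleword.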
 target/ppc/insn32.decode|  2 ++
 target/ppc/translate/vmx-impl.c.inc | 40 +
 2 files changed, 42 insertions(+)

diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index d844d86829..31cdbba86b 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -529,6 +529,8 @@ VSTRIBR 000100 . 1 . . 001101   @VX_tb_rc
 VSTRIHL 000100 . 00010 . . 001101   @VX_tb_rc
 VSTRIHR 000100 . 00011 . . 001101   @VX_tb_rc
 
+VCLRLB  000100 . . . 00110001101@VX
+
 # VSX Load/Store Instructions
 
 LXV 01 . .  . 001   @DQ_TSX
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index 6962929826..d43fba00ed 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -1940,6 +1940,46 @@ TRANS(VSTRIBR, do_vstri, gen_helper_VSTRIBR)
 TRANS(VSTRIHL, do_vstri, gen_helper_VSTRIHL)
 TRANS(VSTRIHR, do_vstri, gen_helper_VSTRIHR)
 
+static bool trans_VCLRLB(DisasContext *ctx, arg_VX *a)
+{
+TCGv_i64 rb, mh, ml, tmp,
+ ones = tcg_constant_i64(-1),
+ zero = tcg_constant_i64(0);
+
+rb = tcg_temp_new_i64();
+mh = tcg_temp_new_i64();
+ml = tcg_temp_new_i64();
+tmp = tcg_temp_new_i64();
+
+tcg_gen_extu_tl_i64(rb, cpu_gpr[a->vrb]);
+tcg_gen_andi_i64(tmp, rb, 7);
+tcg_gen_shli_i64(tmp, tmp, 3);
+tcg_gen_shl_i64(tmp, ones, tmp);
+tcg_gen_not_i64(tmp, tmp);
+
+tcg_gen_movcond_i64(TCG_COND_LTU, ml, rb, tcg_constant_i64(8),
+tmp, ones);
+tcg_gen_movcond_i64(TCG_COND_LTU, mh, rb, tcg_constant_i64(8),
+zero, tmp);
+tcg_gen_movcond_i64(TCG_COND_LTU, mh, rb, tcg_constant_i64(16),
+mh, ones);
+
+get_avr64(tmp, a->vra, true);
+tcg_gen_and_i64(tmp, tmp, mh);
+set_avr64(a->vrt, tmp, true);
+
+get_avr64(tmp, a->vra, false);
+tcg_gen_and_i64(tmp, tmp, ml);
+set_avr64(a->vrt, tmp, false);
+
+tcg_temp_free_i64(rb);
+tcg_temp_free_i64(mh);
+tcg_temp_free_i64(ml);
+tcg_temp_free_i64(tmp);
+
+return true;
+}
+
 #define GEN_VAFORM_PAIRED(name0, name1, opc2)   \
 static void glue(gen_, name0##_##name1)(DisasContext *ctx)  \
 {   \
-- 
2.25.1




[PATCH v5 21/49] target/ppc: implement vsrq

2022-02-25 Thread matheus . ferst
From: Matheus Ferst 

Reviewed-by: Richard Henderson 
Signed-off-by: Matheus Ferst 
---
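A plain C model of the shared shift routine may help when reviewing the
hunk below. It is a sketch with hypothetical names (quad, shift_quad),
not QEMU code. It follows the same steps as do_vector_shift_quad: bit 6
of the count swaps the doublewords, the low six bits do the real shift,
and the cross-doubleword bits are produced with a shift by (n ^ 63)
followed by a shift by 1, so the code never shifts a 64-bit value by 64.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint64_t hi;
    uint64_t lo;
} quad;

static quad shift_quad(uint64_t hi, uint64_t lo, unsigned n, bool right)
{
    quad r;

    /* Counts of 64..127: move one doubleword into the other first. */
    if (n & 64) {
        if (right) {
            lo = hi;
            hi = 0;
        } else {
            hi = lo;
            lo = 0;
        }
    }
    n &= 0x3F;

    if (right) {
        r.hi = hi >> n;
        /* (hi << (63 - n)) << 1 is hi << (64 - n), but defined for n == 0. */
        r.lo = (lo >> n) | ((hi << (n ^ 63)) << 1);
    } else {
        r.lo = lo << n;
        r.hi = (hi << n) | ((lo >> (n ^ 63)) >> 1);
    }
    return r;
}
```

For n in 0..63, n ^ 63 equals 63 - n, which is why the XOR in the patch
can stand in for a subtraction.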
 target/ppc/insn32.decode|  1 +
 target/ppc/translate/vmx-impl.c.inc | 40 +
 2 files changed, 31 insertions(+), 10 deletions(-)

diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 3799065508..96ee730242 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -479,6 +479,7 @@ VSRB000100 . . . 0100100@VX
 VSRH000100 . . . 01001000100@VX
 VSRW000100 . . . 0101100@VX
 VSRD000100 . . . 11011000100@VX
+VSRQ000100 . . . 0100101@VX
 
 VSRAB   000100 . . . 0110100@VX
 VSRAH   000100 . . . 01101000100@VX
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index 49c722e862..8a1e64d7f2 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -834,11 +834,10 @@ TRANS_FLAGS(ALTIVEC, VSRAH, do_vector_gvec3_VX, MO_16, tcg_gen_gvec_sarv);
 TRANS_FLAGS(ALTIVEC, VSRAW, do_vector_gvec3_VX, MO_32, tcg_gen_gvec_sarv);
 TRANS_FLAGS2(ALTIVEC_207, VSRAD, do_vector_gvec3_VX, MO_64, tcg_gen_gvec_sarv);
 
-static bool trans_VSLQ(DisasContext *ctx, arg_VX *a)
+static bool do_vector_shift_quad(DisasContext *ctx, arg_VX *a, bool right)
 {
 TCGv_i64 hi, lo, t0, n, zero = tcg_constant_i64(0);
 
-REQUIRE_INSNS_FLAGS2(ctx, ISA310);
 REQUIRE_VECTOR(ctx);
 
 n = tcg_temp_new_i64();
@@ -852,19 +851,37 @@ static bool trans_VSLQ(DisasContext *ctx, arg_VX *a)
 get_avr64(n, a->vrb, true);
 
 tcg_gen_andi_i64(t0, n, 64);
-tcg_gen_movcond_i64(TCG_COND_NE, hi, t0, zero, lo, hi);
-tcg_gen_movcond_i64(TCG_COND_NE, lo, t0, zero, zero, lo);
+if (right) {
+tcg_gen_movcond_i64(TCG_COND_NE, lo, t0, zero, hi, lo);
+tcg_gen_movcond_i64(TCG_COND_NE, hi, t0, zero, zero, hi);
+} else {
+tcg_gen_movcond_i64(TCG_COND_NE, hi, t0, zero, lo, hi);
+tcg_gen_movcond_i64(TCG_COND_NE, lo, t0, zero, zero, lo);
+}
 tcg_gen_andi_i64(n, n, 0x3F);
 
-tcg_gen_shl_i64(t0, lo, n);
-set_avr64(a->vrt, t0, false);
+if (right) {
+tcg_gen_shr_i64(t0, hi, n);
+} else {
+tcg_gen_shl_i64(t0, lo, n);
+}
+set_avr64(a->vrt, t0, right);
 
-tcg_gen_shl_i64(hi, hi, n);
+if (right) {
+tcg_gen_shr_i64(lo, lo, n);
+} else {
+tcg_gen_shl_i64(hi, hi, n);
+}
 tcg_gen_xori_i64(n, n, 63);
-tcg_gen_shr_i64(lo, lo, n);
-tcg_gen_shri_i64(lo, lo, 1);
+if (right) {
+tcg_gen_shl_i64(hi, hi, n);
+tcg_gen_shli_i64(hi, hi, 1);
+} else {
+tcg_gen_shr_i64(lo, lo, n);
+tcg_gen_shri_i64(lo, lo, 1);
+}
 tcg_gen_or_i64(hi, hi, lo);
-set_avr64(a->vrt, hi, true);
+set_avr64(a->vrt, hi, !right);
 
 tcg_temp_free_i64(hi);
 tcg_temp_free_i64(lo);
@@ -874,6 +891,9 @@ static bool trans_VSLQ(DisasContext *ctx, arg_VX *a)
 return true;
 }
 
+TRANS_FLAGS2(ISA310, VSLQ, do_vector_shift_quad, false);
+TRANS_FLAGS2(ISA310, VSRQ, do_vector_shift_quad, true);
+
 #define GEN_VXFORM_SAT(NAME, VECE, NORM, SAT, OPC2, OPC3)   \
 static void glue(glue(gen_, NAME), _vec)(unsigned vece, TCGv_vec t, \
  TCGv_vec sat, TCGv_vec a,  \
-- 
2.25.1




[PATCH v5 07/49] target/ppc: Move vexts[bhw]2[wd] to decodetree

2022-02-25 Thread matheus . ferst
From: Lucas Coutinho 

Move the following instructions to decodetree:
vextsb2w: Vector Extend Sign Byte To Word
vextsh2w: Vector Extend Sign Halfword To Word
vextsb2d: Vector Extend Sign Byte To Doubleword
vextsh2d: Vector Extend Sign Halfword To Doubleword
vextsw2d: Vector Extend Sign Word To Doubleword

Reviewed-by: Richard Henderson 
Signed-off-by: Lucas Coutinho 
Signed-off-by: Matheus Ferst 
---
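The vector expansion in gen_vexts_vec (shift left by s, then arithmetic
shift right by s) is the usual sign-extension idiom. A scalar C sketch
follows; vexts32 and vexts64 are hypothetical names, and the code assumes
the compiler implements right shifts of negative values arithmetically
and truncating unsigned-to-signed conversion, as GCC and Clang do.

```c
#include <stdint.h>

/* Sign-extend the low (32 - s) bits of a 32-bit element, 1 <= s <= 31. */
static int32_t vexts32(uint32_t elem, unsigned s)
{
    /* Move the source sign bit to bit 31, then shift it back down. */
    return (int32_t)(elem << s) >> s;
}

/* Sign-extend the low (64 - s) bits of a 64-bit element, 1 <= s <= 63. */
static int64_t vexts64(uint64_t elem, unsigned s)
{
    return (int64_t)(elem << s) >> s;
}
```

With s = 24 the 32-bit variant extends a byte into a word, which matches
the 32 - s bit field extracted by gen_vexts_i32.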
 target/ppc/helper.h |  5 ---
 target/ppc/insn32.decode|  8 
 target/ppc/int_helper.c | 15 
 target/ppc/translate/vmx-impl.c.inc | 58 ++---
 target/ppc/translate/vmx-ops.c.inc  |  5 ---
 5 files changed, 61 insertions(+), 30 deletions(-)

diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index a223d6ce92..79e1a10a1c 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -240,11 +240,6 @@ DEF_HELPER_4(VINSBLX, void, env, avr, i64, tl)
 DEF_HELPER_4(VINSHLX, void, env, avr, i64, tl)
 DEF_HELPER_4(VINSWLX, void, env, avr, i64, tl)
 DEF_HELPER_4(VINSDLX, void, env, avr, i64, tl)
-DEF_HELPER_2(vextsb2w, void, avr, avr)
-DEF_HELPER_2(vextsh2w, void, avr, avr)
-DEF_HELPER_2(vextsb2d, void, avr, avr)
-DEF_HELPER_2(vextsh2d, void, avr, avr)
-DEF_HELPER_2(vextsw2d, void, avr, avr)
 DEF_HELPER_2(vnegw, void, avr, avr)
 DEF_HELPER_2(vnegd, void, avr, avr)
 DEF_HELPER_2(vupkhpx, void, avr, avr)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 732a2bb79e..1dcf9c61e9 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -419,6 +419,14 @@ VINSWVRX000100 . . . 0011000@VX
 VSLDBI  000100 . . . 00 ... 010110  @VN
 VSRDBI  000100 . . . 01 ... 010110  @VN
 
+## Vector Integer Arithmetic Instructions
+
+VEXTSB2W000100 . 1 . 1100010@VX_tb
+VEXTSH2W000100 . 10001 . 1100010@VX_tb
+VEXTSB2D000100 . 11000 . 1100010@VX_tb
+VEXTSH2D000100 . 11001 . 1100010@VX_tb
+VEXTSW2D000100 . 11010 . 1100010@VX_tb
+
 ## Vector Mask Manipulation Instructions
 
 MTVSRBM 000100 . 1 . 1100110@VX_tb
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index 46ef3ffb3f..a75a5482fc 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -1630,21 +1630,6 @@ XXBLEND(W, 32)
 XXBLEND(D, 64)
 #undef XXBLEND
 
-#define VEXT_SIGNED(name, element, cast)\
-void helper_##name(ppc_avr_t *r, ppc_avr_t *b)  \
-{   \
-int i;  \
-for (i = 0; i < ARRAY_SIZE(r->element); i++) {  \
-r->element[i] = (cast)b->element[i];\
-}   \
-}
-VEXT_SIGNED(vextsb2w, s32, int8_t)
-VEXT_SIGNED(vextsb2d, s64, int8_t)
-VEXT_SIGNED(vextsh2w, s32, int16_t)
-VEXT_SIGNED(vextsh2d, s64, int16_t)
-VEXT_SIGNED(vextsw2d, s64, int32_t)
-#undef VEXT_SIGNED
-
 #define VNEG(name, element) \
 void helper_##name(ppc_avr_t *r, ppc_avr_t *b)  \
 {   \
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index fcff3418c5..aa021bdf54 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -1772,11 +1772,59 @@ GEN_VXFORM_TRANS(vclzw, 1, 30)
 GEN_VXFORM_TRANS(vclzd, 1, 31)
 GEN_VXFORM_NOA_2(vnegw, 1, 24, 6)
 GEN_VXFORM_NOA_2(vnegd, 1, 24, 7)
-GEN_VXFORM_NOA_2(vextsb2w, 1, 24, 16)
-GEN_VXFORM_NOA_2(vextsh2w, 1, 24, 17)
-GEN_VXFORM_NOA_2(vextsb2d, 1, 24, 24)
-GEN_VXFORM_NOA_2(vextsh2d, 1, 24, 25)
-GEN_VXFORM_NOA_2(vextsw2d, 1, 24, 26)
+
+static void gen_vexts_i64(TCGv_i64 t, TCGv_i64 b, int64_t s)
+{
+tcg_gen_sextract_i64(t, b, 0, 64 - s);
+}
+
+static void gen_vexts_i32(TCGv_i32 t, TCGv_i32 b, int32_t s)
+{
+tcg_gen_sextract_i32(t, b, 0, 32 - s);
+}
+
+static void gen_vexts_vec(unsigned vece, TCGv_vec t, TCGv_vec b, int64_t s)
+{
+tcg_gen_shli_vec(vece, t, b, s);
+tcg_gen_sari_vec(vece, t, t, s);
+}
+
+static bool do_vexts(DisasContext *ctx, arg_VX_tb *a, unsigned vece, int64_t s)
+{
+static const TCGOpcode vecop_list[] = {
+INDEX_op_shli_vec, INDEX_op_sari_vec, 0
+};
+
+static const GVecGen2i op[2] = {
+{
+.fni4 = gen_vexts_i32,
+.fniv = gen_vexts_vec,
+.opt_opc = vecop_list,
+.vece = MO_32
+},
+{
+.fni8 = gen_vexts_i64,
+.fniv = gen_vexts_vec,
+.opt_opc = vecop_list,
+.vece = MO_64
+},
+};
+
+REQUIRE_INSNS_FLAGS2(ctx, ISA300);
+REQUIRE_VECTOR(ctx);
+
+tcg_gen_gvec_2i(avr_full_offset(a->vrt), avr

[PATCH v5 05/49] target/ppc: Implement vmsumcud instruction

2022-02-25 Thread matheus . ferst
From: Víctor Colombo 

Based on [1] by Lijun Pan, which was never merged
into master.

[1]: https://lists.gnu.org/archive/html/qemu-ppc/2020-07/msg00419.html

Reviewed-by: Richard Henderson 
Signed-off-by: Víctor Colombo 
Signed-off-by: Matheus Ferst 
---
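As a cross-check for the add2 chain in trans_VMSUMCUD, the result can be
modelled with a 128-bit integer type. This sketch assumes the
unsigned __int128 extension of GCC and Clang; vmsumcud_carry is a
hypothetical name. It computes the carry out of bit 127 of
prod0 + prod1 + c, which is the value the patch stores in the low
doubleword of the target (the high doubleword is zero).

```c
#include <stdint.h>

typedef unsigned __int128 u128;

static uint64_t vmsumcud_carry(uint64_t a_hi, uint64_t a_lo,
                               uint64_t b_hi, uint64_t b_lo,
                               uint64_t c_hi, uint64_t c_lo)
{
    u128 prod0 = (u128)a_hi * b_hi;   /* doubleword 0 product */
    u128 prod1 = (u128)a_lo * b_lo;   /* doubleword 1 product */
    u128 c = ((u128)c_hi << 64) | c_lo;
    uint64_t carry = 0;
    u128 sum;

    /* Each wrapping 128-bit addition contributes at most one carry. */
    sum = prod0 + prod1;
    carry += sum < prod0;
    sum += c;
    carry += sum < c;

    return carry;   /* 0, 1 or 2 */
}
```

The largest possible result is 2, reached when every input doubleword is
all-ones.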
 target/ppc/insn32.decode|  4 +++
 target/ppc/translate/vmx-impl.c.inc | 53 +
 2 files changed, 57 insertions(+)

diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index d817e44c71..e85a75db2f 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -468,6 +468,10 @@ VMULHSD 000100 . . . 0001001@VX
 VMULHUD 000100 . . . 01011001001@VX
 VMULLD  000100 . . . 00111001001@VX
 
+## Vector Multiply-Sum Instructions
+
+VMSUMCUD000100 . . . . 010111   @VA
+
 # VSX Load/Store Instructions
 
 LXV 01 . .  . 001   @DQ_TSX
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index 97a075efd1..4f528dc820 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -2081,6 +2081,59 @@ static bool trans_VPEXTD(DisasContext *ctx, arg_VX *a)
 return true;
 }
 
+static bool trans_VMSUMCUD(DisasContext *ctx, arg_VA *a)
+{
+TCGv_i64 tmp0, tmp1, prod1h, prod1l, prod0h, prod0l, zero;
+
+REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+REQUIRE_VECTOR(ctx);
+
+tmp0 = tcg_temp_new_i64();
+tmp1 = tcg_temp_new_i64();
+prod1h = tcg_temp_new_i64();
+prod1l = tcg_temp_new_i64();
+prod0h = tcg_temp_new_i64();
+prod0l = tcg_temp_new_i64();
+zero = tcg_constant_i64(0);
+
+/* prod1 = vsr[vra+32].dw[1] * vsr[vrb+32].dw[1] */
+get_avr64(tmp0, a->vra, false);
+get_avr64(tmp1, a->vrb, false);
+tcg_gen_mulu2_i64(prod1l, prod1h, tmp0, tmp1);
+
+/* prod0 = vsr[vra+32].dw[0] * vsr[vrb+32].dw[0] */
+get_avr64(tmp0, a->vra, true);
+get_avr64(tmp1, a->vrb, true);
+tcg_gen_mulu2_i64(prod0l, prod0h, tmp0, tmp1);
+
+/* Sum lower 64-bits elements */
+get_avr64(tmp1, a->rc, false);
+tcg_gen_add2_i64(tmp1, tmp0, tmp1, zero, prod1l, zero);
+tcg_gen_add2_i64(tmp1, tmp0, tmp1, tmp0, prod0l, zero);
+
+/*
+ * Discard lower 64-bits, leaving the carry into bit 64.
+ * Then sum the higher 64-bit elements.
+ */
+get_avr64(tmp1, a->rc, true);
+tcg_gen_add2_i64(tmp1, tmp0, tmp0, zero, tmp1, zero);
+tcg_gen_add2_i64(tmp1, tmp0, tmp1, tmp0, prod1h, zero);
+tcg_gen_add2_i64(tmp1, tmp0, tmp1, tmp0, prod0h, zero);
+
+/* Discard 64 more bits to complete the CHOP128(temp >> 128) */
+set_avr64(a->vrt, tmp0, false);
+set_avr64(a->vrt, zero, true);
+
+tcg_temp_free_i64(tmp0);
+tcg_temp_free_i64(tmp1);
+tcg_temp_free_i64(prod1h);
+tcg_temp_free_i64(prod1l);
+tcg_temp_free_i64(prod0h);
+tcg_temp_free_i64(prod0l);
+
+return true;
+}
+
 static bool do_vx_helper(DisasContext *ctx, arg_VX *a,
  void (*gen_helper)(TCGv_ptr, TCGv_ptr, TCGv_ptr))
 {
-- 
2.25.1




[PATCH v5 13/49] target/ppc: Implement Vector Compare Quadword

2022-02-25 Thread matheus . ferst
From: Matheus Ferst 

Implement the following PowerISA v3.1 instructions:
vcmpsq: Vector Compare Signed Quadword
vcmpuq: Vector Compare Unsigned Quadword

Reviewed-by: Richard Henderson 
Signed-off-by: Matheus Ferst 
---
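The branch sequence in do_vcmpq amounts to a three-way comparison. A
plain C sketch is shown here for reference; cmp_quad and the MODEL_*
constants are hypothetical, and the constant values are assumed to match
QEMU's CRF_LT, CRF_GT and CRF_EQ bit masks.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed condition-register field encodings (LT = 8, GT = 4, EQ = 2). */
enum { MODEL_LT = 8, MODEL_GT = 4, MODEL_EQ = 2 };

static unsigned cmp_quad(uint64_t a_hi, uint64_t a_lo,
                         uint64_t b_hi, uint64_t b_lo, bool sign)
{
    /* High doublewords decide first, signed or unsigned as requested. */
    if (sign ? (int64_t)a_hi > (int64_t)b_hi : a_hi > b_hi) {
        return MODEL_GT;
    }
    if (sign ? (int64_t)a_hi < (int64_t)b_hi : a_hi < b_hi) {
        return MODEL_LT;
    }
    /* Equal high halves: the low doublewords compare unsigned. */
    if (a_lo > b_lo) {
        return MODEL_GT;
    }
    if (a_lo < b_lo) {
        return MODEL_LT;
    }
    return MODEL_EQ;
}
```

The returned value corresponds to what the patch moves into
cpu_crf[a->bf].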
 target/ppc/insn32.decode|  6 
 target/ppc/translate/vmx-impl.c.inc | 45 +
 2 files changed, 51 insertions(+)

diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 07a4ef9103..f0cb6602e2 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -60,6 +60,9 @@
 &VX vrt vra vrb
 @VX .. vrt:5 vra:5 vrb:5 .. .   &VX
 
+&VX_bf  bf vra vrb
+@VX_bf  .. bf:3 .. vra:5 vrb:5 ...  &VX_bf
+
 &VX_uim4vrt uim vrb
 @VX_uim4.. vrt:5 . uim:4 vrb:5 ...  &VX_uim4
 
@@ -404,6 +407,9 @@ VCMPNEZB000100 . . . . 010111   @VC
 VCMPNEZH000100 . . . . 0101000111   @VC
 VCMPNEZW000100 . . . . 011111   @VC
 
+VCMPSQ  000100 ... -- . . 0010101   @VX_bf
+VCMPUQ  000100 ... -- . . 0010001   @VX_bf
+
 ## Vector Bit Manipulation Instruction
 
 VCFUGED 000100 . . . 10101001101@VX
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index 7f9913235e..ba2106dc7c 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -1182,6 +1182,51 @@ static bool do_vcmpgtq(DisasContext *ctx, arg_VC *a, bool sign)
 TRANS(VCMPGTSQ, do_vcmpgtq, true)
 TRANS(VCMPGTUQ, do_vcmpgtq, false)
 
+static bool do_vcmpq(DisasContext *ctx, arg_VX_bf *a, bool sign)
+{
+TCGv_i64 vra, vrb;
+TCGLabel *gt, *lt, *done;
+
+REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+REQUIRE_VECTOR(ctx);
+
+vra = tcg_temp_local_new_i64();
+vrb = tcg_temp_local_new_i64();
+gt = gen_new_label();
+lt = gen_new_label();
+done = gen_new_label();
+
+get_avr64(vra, a->vra, true);
+get_avr64(vrb, a->vrb, true);
+tcg_gen_brcond_i64((sign ? TCG_COND_GT : TCG_COND_GTU), vra, vrb, gt);
+tcg_gen_brcond_i64((sign ? TCG_COND_LT : TCG_COND_LTU), vra, vrb, lt);
+
+get_avr64(vra, a->vra, false);
+get_avr64(vrb, a->vrb, false);
+tcg_gen_brcond_i64(TCG_COND_GTU, vra, vrb, gt);
+tcg_gen_brcond_i64(TCG_COND_LTU, vra, vrb, lt);
+
+tcg_gen_movi_i32(cpu_crf[a->bf], CRF_EQ);
+tcg_gen_br(done);
+
+gen_set_label(gt);
+tcg_gen_movi_i32(cpu_crf[a->bf], CRF_GT);
+tcg_gen_br(done);
+
+gen_set_label(lt);
+tcg_gen_movi_i32(cpu_crf[a->bf], CRF_LT);
+tcg_gen_br(done);
+
+gen_set_label(done);
+tcg_temp_free_i64(vra);
+tcg_temp_free_i64(vrb);
+
+return true;
+}
+
+TRANS(VCMPSQ, do_vcmpq, true)
+TRANS(VCMPUQ, do_vcmpq, false)
+
 GEN_VXRFORM(vcmpeqfp, 3, 3)
 GEN_VXRFORM(vcmpgefp, 3, 7)
 GEN_VXRFORM(vcmpgtfp, 3, 11)
-- 
2.25.1




[PATCH v5 19/49] target/ppc: move vs[lr][a][bhwd] to decodetree

2022-02-25 Thread matheus . ferst
From: Matheus Ferst 

Reviewed-by: Richard Henderson 
Signed-off-by: Matheus Ferst 
---
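The per-element behaviour expected from the gvec expanders used below can
be sketched in scalar C. The function names are hypothetical, and the
sketch assumes that tcg_gen_gvec_shlv, tcg_gen_gvec_shrv and
tcg_gen_gvec_sarv take the shift count modulo the element width, which is
also how the word shifts use only the low five bits of each count
element. The arithmetic variant further assumes the compiler shifts
negative values arithmetically.

```c
#include <stdint.h>

/* Shift left word: count taken modulo 32. */
static uint32_t shl_elem32(uint32_t a, uint32_t count)
{
    return a << (count % 32);
}

/* Shift right (logical) word. */
static uint32_t shr_elem32(uint32_t a, uint32_t count)
{
    return a >> (count % 32);
}

/* Shift right algebraic word: the sign bit is replicated. */
static int32_t sar_elem32(int32_t a, uint32_t count)
{
    return a >> (count % 32);
}
```

A count of 33 therefore behaves like a count of 1 rather than clearing
the element.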
 target/ppc/insn32.decode| 17 
 target/ppc/translate/vmx-impl.c.inc | 41 +++--
 target/ppc/translate/vmx-ops.c.inc  | 13 +
 3 files changed, 45 insertions(+), 26 deletions(-)

diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 02df4a98e6..88baebe35e 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -467,6 +467,23 @@ VINSWVRX000100 . . . 0011000@VX
 VSLDBI  000100 . . . 00 ... 010110  @VN
 VSRDBI  000100 . . . 01 ... 010110  @VN
 
+## Vector Integer Shift Instruction
+
+VSLB000100 . . . 0010100@VX
+VSLH000100 . . . 00101000100@VX
+VSLW000100 . . . 0011100@VX
+VSLD000100 . . . 10111000100@VX
+
+VSRB000100 . . . 0100100@VX
+VSRH000100 . . . 01001000100@VX
+VSRW000100 . . . 0101100@VX
+VSRD000100 . . . 11011000100@VX
+
+VSRAB   000100 . . . 0110100@VX
+VSRAH   000100 . . . 01101000100@VX
+VSRAW   000100 . . . 0111100@VX
+VSRAD   000100 . . . 0000100@VX
+
 ## Vector Integer Arithmetic Instructions
 
 VEXTSB2W000100 . 1 . 1100010@VX_tb
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index 52774cdd4d..1b05b0b3a3 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -799,21 +799,7 @@ static void trans_vclzd(DisasContext *ctx)
 }
 
 GEN_VXFORM_V(vmuluwm, MO_32, tcg_gen_gvec_mul, 4, 2);
-GEN_VXFORM_V(vslb, MO_8, tcg_gen_gvec_shlv, 2, 4);
-GEN_VXFORM_V(vslh, MO_16, tcg_gen_gvec_shlv, 2, 5);
-GEN_VXFORM_V(vslw, MO_32, tcg_gen_gvec_shlv, 2, 6);
 GEN_VXFORM(vrlwnm, 2, 6);
-GEN_VXFORM_DUAL(vslw, PPC_ALTIVEC, PPC_NONE, \
-vrlwnm, PPC_NONE, PPC2_ISA300)
-GEN_VXFORM_V(vsld, MO_64, tcg_gen_gvec_shlv, 2, 23);
-GEN_VXFORM_V(vsrb, MO_8, tcg_gen_gvec_shrv, 2, 8);
-GEN_VXFORM_V(vsrh, MO_16, tcg_gen_gvec_shrv, 2, 9);
-GEN_VXFORM_V(vsrw, MO_32, tcg_gen_gvec_shrv, 2, 10);
-GEN_VXFORM_V(vsrd, MO_64, tcg_gen_gvec_shrv, 2, 27);
-GEN_VXFORM_V(vsrab, MO_8, tcg_gen_gvec_sarv, 2, 12);
-GEN_VXFORM_V(vsrah, MO_16, tcg_gen_gvec_sarv, 2, 13);
-GEN_VXFORM_V(vsraw, MO_32, tcg_gen_gvec_sarv, 2, 14);
-GEN_VXFORM_V(vsrad, MO_64, tcg_gen_gvec_sarv, 2, 15);
 GEN_VXFORM(vsrv, 2, 28);
 GEN_VXFORM(vslv, 2, 29);
 GEN_VXFORM(vslo, 6, 16);
@@ -821,6 +807,33 @@ GEN_VXFORM(vsro, 6, 17);
 GEN_VXFORM(vaddcuw, 0, 6);
 GEN_VXFORM(vsubcuw, 0, 22);
 
+static bool do_vector_gvec3_VX(DisasContext *ctx, arg_VX *a, int vece,
+   void (*gen_gvec)(unsigned, uint32_t, uint32_t,
+uint32_t, uint32_t, uint32_t))
+{
+REQUIRE_VECTOR(ctx);
+
+gen_gvec(vece, avr_full_offset(a->vrt), avr_full_offset(a->vra),
+ avr_full_offset(a->vrb), 16, 16);
+
+return true;
+}
+
+TRANS_FLAGS(ALTIVEC, VSLB, do_vector_gvec3_VX, MO_8, tcg_gen_gvec_shlv);
+TRANS_FLAGS(ALTIVEC, VSLH, do_vector_gvec3_VX, MO_16, tcg_gen_gvec_shlv);
+TRANS_FLAGS(ALTIVEC, VSLW, do_vector_gvec3_VX, MO_32, tcg_gen_gvec_shlv);
+TRANS_FLAGS2(ALTIVEC_207, VSLD, do_vector_gvec3_VX, MO_64, tcg_gen_gvec_shlv);
+
+TRANS_FLAGS(ALTIVEC, VSRB, do_vector_gvec3_VX, MO_8, tcg_gen_gvec_shrv);
+TRANS_FLAGS(ALTIVEC, VSRH, do_vector_gvec3_VX, MO_16, tcg_gen_gvec_shrv);
+TRANS_FLAGS(ALTIVEC, VSRW, do_vector_gvec3_VX, MO_32, tcg_gen_gvec_shrv);
+TRANS_FLAGS2(ALTIVEC_207, VSRD, do_vector_gvec3_VX, MO_64, tcg_gen_gvec_shrv);
+
+TRANS_FLAGS(ALTIVEC, VSRAB, do_vector_gvec3_VX, MO_8, tcg_gen_gvec_sarv);
+TRANS_FLAGS(ALTIVEC, VSRAH, do_vector_gvec3_VX, MO_16, tcg_gen_gvec_sarv);
+TRANS_FLAGS(ALTIVEC, VSRAW, do_vector_gvec3_VX, MO_32, tcg_gen_gvec_sarv);
+TRANS_FLAGS2(ALTIVEC_207, VSRAD, do_vector_gvec3_VX, MO_64, tcg_gen_gvec_sarv);
+
 #define GEN_VXFORM_SAT(NAME, VECE, NORM, SAT, OPC2, OPC3)   \
 static void glue(glue(gen_, NAME), _vec)(unsigned vece, TCGv_vec t, \
  TCGv_vec sat, TCGv_vec a,  \
diff --git a/target/ppc/translate/vmx-ops.c.inc b/target/ppc/translate/vmx-ops.c.inc
index cb4c5bb953..878bce92c6 100644
--- a/target/ppc/translate/vmx-ops.c.inc
+++ b/target/ppc/translate/vmx-ops.c.inc
@@ -102,18 +102,7 @@ GEN_VXFORM_300(vextubrx, 6, 28),
 GEN_VXFORM_300(vextuhrx, 6, 29),
 GEN_VXFORM_DUAL(vmrgew, vextuwrx, 6, 30, PPC_NONE, PPC2_ALTIVEC_207),
 GEN_VXFORM_207(vmuluwm, 4, 2),
-GEN_VXFORM(vslb, 2, 4),
-GEN_VXFORM(vslh, 2, 5),
-GEN_VXFORM_DUAL(vslw, vrlwnm, 2, 6, PPC_ALTIVEC, PPC_NONE),
-GEN_VXFORM_207(vsld, 2, 23),
-GEN_VXFORM(vsrb, 2, 8),
-GEN_VXFORM(vsrh, 2, 9),
-GEN_

[PATCH v5 04/49] target/ppc: vmulh* instructions without helpers

2022-02-25 Thread matheus . ferst
From: "Lucas Mateus Castro (alqotel)" 

Implement vmulhuw, vmulhud, vmulhsw and vmulhsd with inline TCG ops
instead of helpers.

Signed-off-by: Lucas Mateus Castro (alqotel) 
Signed-off-by: Matheus Ferst 
---
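For reviewers, the word variant can be modelled in plain C as below. This
is a sketch with a hypothetical name (mulhw_pair), following the structure
of do_vx_vmulhw_i64: each 64-bit doubleword holds two 32-bit lanes, each
lane is multiplied as a full 64-bit product, and the high 32 bits of both
products are packed back into one doubleword. The signed path assumes
truncating integer conversions and arithmetic right shifts, as provided
by GCC and Clang.

```c
#include <stdbool.h>
#include <stdint.h>

static uint64_t mulhw_pair(uint64_t a, uint64_t b, bool sign)
{
    const uint64_t mask = 0xFFFFFFFF;   /* low 32-bit lane */
    uint64_t lh, hh;

    if (sign) {
        /* Sign-extend each lane before the 64-bit multiply. */
        lh = (uint64_t)((int64_t)(int32_t)a * (int64_t)(int32_t)b);
        hh = (uint64_t)(((int64_t)a >> 32) * ((int64_t)b >> 32));
    } else {
        lh = (a & mask) * (b & mask);
        hh = (a >> 32) * (b >> 32);
    }

    /* High word of each product: hh stays in place, lh moves down. */
    return (hh & (mask << 32)) | (lh >> 32);
}
```

The doubleword variant needs no lane splitting, since a widening
64x64 multiply already yields the high half directly.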
 target/ppc/helper.h |  4 --
 target/ppc/int_helper.c | 35 ---
 target/ppc/translate/vmx-impl.c.inc | 91 +++--
 3 files changed, 87 insertions(+), 43 deletions(-)

diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 4ff71b2fa3..a223d6ce92 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -202,10 +202,6 @@ DEF_HELPER_FLAGS_3(VMULOSW, TCG_CALL_NO_RWG, void, avr, avr, avr)
 DEF_HELPER_FLAGS_3(VMULOUB, TCG_CALL_NO_RWG, void, avr, avr, avr)
 DEF_HELPER_FLAGS_3(VMULOUH, TCG_CALL_NO_RWG, void, avr, avr, avr)
 DEF_HELPER_FLAGS_3(VMULOUW, TCG_CALL_NO_RWG, void, avr, avr, avr)
-DEF_HELPER_FLAGS_3(VMULHSW, TCG_CALL_NO_RWG, void, avr, avr, avr)
-DEF_HELPER_FLAGS_3(VMULHUW, TCG_CALL_NO_RWG, void, avr, avr, avr)
-DEF_HELPER_FLAGS_3(VMULHSD, TCG_CALL_NO_RWG, void, avr, avr, avr)
-DEF_HELPER_FLAGS_3(VMULHUD, TCG_CALL_NO_RWG, void, avr, avr, avr)
 DEF_HELPER_3(vslo, void, avr, avr, avr)
 DEF_HELPER_3(vsro, void, avr, avr, avr)
 DEF_HELPER_3(vsrv, void, avr, avr, avr)
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index 873f957bf4..46ef3ffb3f 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -1097,41 +1097,6 @@ VMUL(UW, u32, VsrW, VsrD, uint64_t)
 #undef VMUL_DO_ODD
 #undef VMUL
 
-void helper_VMULHSW(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
-{
-int i;
-
-for (i = 0; i < 4; i++) {
-r->s32[i] = (int32_t)(((int64_t)a->s32[i] * (int64_t)b->s32[i]) >> 32);
-}
-}
-
-void helper_VMULHUW(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
-{
-int i;
-
-for (i = 0; i < 4; i++) {
-r->u32[i] = (uint32_t)(((uint64_t)a->u32[i] *
-   (uint64_t)b->u32[i]) >> 32);
-}
-}
-
-void helper_VMULHSD(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
-{
-uint64_t discard;
-
-muls64(&discard, &r->u64[0], a->s64[0], b->s64[0]);
-muls64(&discard, &r->u64[1], a->s64[1], b->s64[1]);
-}
-
-void helper_VMULHUD(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
-{
-uint64_t discard;
-
-mulu64(&discard, &r->u64[0], a->u64[0], b->u64[0]);
-mulu64(&discard, &r->u64[1], a->u64[1], b->u64[1]);
-}
-
 void helper_vperm(CPUPPCState *env, ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b,
   ppc_avr_t *c)
 {
diff --git a/target/ppc/translate/vmx-impl.c.inc b/target/ppc/translate/vmx-impl.c.inc
index d493de3629..97a075efd1 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -2151,10 +2151,93 @@ TRANS_FLAGS2(ISA310, VMULOSD, do_vx_vmuleo, false, tcg_gen_muls2_i64)
 TRANS_FLAGS2(ISA310, VMULEUD, do_vx_vmuleo, true , tcg_gen_mulu2_i64)
 TRANS_FLAGS2(ISA310, VMULOUD, do_vx_vmuleo, false, tcg_gen_mulu2_i64)
 
-TRANS_FLAGS2(ISA310, VMULHSW, do_vx_helper, gen_helper_VMULHSW)
-TRANS_FLAGS2(ISA310, VMULHSD, do_vx_helper, gen_helper_VMULHSD)
-TRANS_FLAGS2(ISA310, VMULHUW, do_vx_helper, gen_helper_VMULHUW)
-TRANS_FLAGS2(ISA310, VMULHUD, do_vx_helper, gen_helper_VMULHUD)
+static void do_vx_vmulhw_i64(TCGv_i64 t, TCGv_i64 a, TCGv_i64 b, bool sign)
+{
+TCGv_i64 hh, lh, temp;
+
+uint64_t c;
+hh = tcg_temp_new_i64();
+lh = tcg_temp_new_i64();
+temp = tcg_temp_new_i64();
+
+c = 0xFFFFFFFF;
+
+if (sign) {
+tcg_gen_ext32s_i64(lh, a);
+tcg_gen_ext32s_i64(temp, b);
+} else {
+tcg_gen_andi_i64(lh, a, c);
+tcg_gen_andi_i64(temp, b, c);
+}
+tcg_gen_mul_i64(lh, lh, temp);
+
+if (sign) {
+tcg_gen_sari_i64(hh, a, 32);
+tcg_gen_sari_i64(temp, b, 32);
+} else {
+tcg_gen_shri_i64(hh, a, 32);
+tcg_gen_shri_i64(temp, b, 32);
+}
+tcg_gen_mul_i64(hh, hh, temp);
+
+tcg_gen_shri_i64(lh, lh, 32);
+tcg_gen_andi_i64(hh, hh, c << 32);
+tcg_gen_or_i64(t, hh, lh);
+
+tcg_temp_free_i64(hh);
+tcg_temp_free_i64(lh);
+tcg_temp_free_i64(temp);
+}
+
+static void do_vx_vmulhd_i64(TCGv_i64 t, TCGv_i64 a, TCGv_i64 b, bool sign)
+{
+TCGv_i64 tlow;
+
+tlow  = tcg_temp_new_i64();
+if (sign) {
+tcg_gen_muls2_i64(tlow, t, a, b);
+} else {
+tcg_gen_mulu2_i64(tlow, t, a, b);
+}
+
+tcg_temp_free_i64(tlow);
+}
+
+static bool do_vx_mulh(DisasContext *ctx, arg_VX *a, bool sign,
+   void (*func)(TCGv_i64, TCGv_i64, TCGv_i64, bool))
+{
+REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+REQUIRE_VECTOR(ctx);
+
+TCGv_i64 vra, vrb, vrt;
+int i;
+
+vra = tcg_temp_new_i64();
+vrb = tcg_temp_new_i64();
+vrt = tcg_temp_new_i64();
+
+for (i = 0; i < 2; i++) {
+get_avr64(vra, a->vra, i);
+get_avr64(vrb, a->vrb, i);
+get_avr64(vrt, a->vrt, i);
+
+func(vrt, vra, vrb, sign);
+
+set_avr64(a->vrt, vrt, i);
+}
+
+tcg_temp_free_i64(vra);
+tcg_temp_free_i64(vrb);
+ 

[PATCH v5 10/49] target/ppc: Move Vector Compare Not Equal or Zero to decodetree

2022-02-25 Thread matheus . ferst
From: Matheus Ferst 

Reviewed-by: Richard Henderson 
Signed-off-by: Matheus Ferst 
---
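The element rule implemented by the new VCMPNEZ helper macro can be
stated as a short C sketch. The names cmpnez_elem8 and cmpnez_bytes are
hypothetical; the byte variant is shown, and the halfword and word
variants differ only in element type.

```c
#include <stddef.h>
#include <stdint.h>

/* All-ones when either input is zero or the inputs differ, else zero. */
static uint8_t cmpnez_elem8(uint8_t a, uint8_t b)
{
    return (a == 0 || b == 0 || a != b) ? 0xFF : 0;
}

/* Apply the rule to every byte of a 16-byte vector. */
static void cmpnez_bytes(uint8_t t[16], const uint8_t a[16],
                         const uint8_t b[16])
{
    for (size_t i = 0; i < 16; i++) {
        t[i] = cmpnez_elem8(a[i], b[i]);
    }
}
```

Note that two equal elements still compare as "true" when both are zero.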
 target/ppc/helper.h |  9 ++--
 target/ppc/insn32.decode|  4 ++
 target/ppc/int_helper.c | 50 +-
 target/ppc/translate/vmx-impl.c.inc | 66 +++--
 target/ppc/translate/vmx-ops.c.inc  |  3 --
 5 files changed, 80 insertions(+), 52 deletions(-)

diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 67f78b801b..3257203791 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -140,16 +140,13 @@ DEF_HELPER_3(vabsduw, void, avr, avr, avr)
 DEF_HELPER_3(vavgsb, void, avr, avr, avr)
 DEF_HELPER_3(vavgsh, void, avr, avr, avr)
 DEF_HELPER_3(vavgsw, void, avr, avr, avr)
-DEF_HELPER_4(vcmpnezb, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpnezh, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpnezw, void, env, avr, avr, avr)
 DEF_HELPER_4(vcmpeqfp, void, env, avr, avr, avr)
 DEF_HELPER_4(vcmpgefp, void, env, avr, avr, avr)
 DEF_HELPER_4(vcmpgtfp, void, env, avr, avr, avr)
 DEF_HELPER_4(vcmpbfp, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpnezb_dot, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpnezh_dot, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpnezw_dot, void, env, avr, avr, avr)
+DEF_HELPER_FLAGS_4(VCMPNEZB, TCG_CALL_NO_RWG, void, avr, avr, avr, i32)
+DEF_HELPER_FLAGS_4(VCMPNEZH, TCG_CALL_NO_RWG, void, avr, avr, avr, i32)
+DEF_HELPER_FLAGS_4(VCMPNEZW, TCG_CALL_NO_RWG, void, avr, avr, avr, i32)
 DEF_HELPER_4(vcmpeqfp_dot, void, env, avr, avr, avr)
 DEF_HELPER_4(vcmpgefp_dot, void, env, avr, avr, avr)
 DEF_HELPER_4(vcmpgtfp_dot, void, env, avr, avr, avr)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 5443ee0394..be9e05cc73 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -397,6 +397,10 @@ VCMPNEB 000100 . . . . 000111   @VC
 VCMPNEH 000100 . . . . 0001000111   @VC
 VCMPNEW 000100 . . . . 001111   @VC
 
+VCMPNEZB000100 . . . . 010111   @VC
+VCMPNEZH000100 . . . . 0101000111   @VC
+VCMPNEZW000100 . . . . 011111   @VC
+
 ## Vector Bit Manipulation Instruction
 
 VCFUGED 000100 . . . 10101001101@VX
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index 734b817b68..f31dba9469 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -662,46 +662,18 @@ VCF(ux, uint32_to_float32, u32)
 VCF(sx, int32_to_float32, s32)
 #undef VCF
 
-#define VCMPNE_DO(suffix, element, etype, cmpzero, record)  \
-void helper_vcmpne##suffix(CPUPPCState *env, ppc_avr_t *r,  \
-ppc_avr_t *a, ppc_avr_t *b) \
-{   \
-etype ones = (etype)-1; \
-etype all = ones;   \
-etype result, none = 0; \
-int i;  \
-\
-for (i = 0; i < ARRAY_SIZE(r->element); i++) {  \
-if (cmpzero) {  \
-result = ((a->element[i] == 0)  \
-   || (b->element[i] == 0)  \
-   || (a->element[i] != b->element[i]) ?\
-   ones : 0x0); \
-} else {\
-result = (a->element[i] != b->element[i]) ? ones : 0x0; \
-}   \
-r->element[i] = result; \
-all &= result;  \
-none |= result; \
-}   \
-if (record) {   \
-env->crf[6] = ((all != 0) << 3) | ((none == 0) << 1);   \
-}   \
+#define VCMPNEZ(NAME, ELEM) \
+void helper_##NAME(ppc_vsr_t *t, ppc_vsr_t *a, ppc_vsr_t *b, uint32_t desc) \
+{   \
+for (int i = 0; i < ARRAY_SIZE(t->ELEM); i++) { \
+t->ELEM[i] = ((a->ELEM[i] == 0) || (b->ELEM[i] == 0) || \
+  (a->ELEM[i] != b->ELEM[i])) ? -1 : 0; \
+}   \
 }
-
-/*
- * VCMPNEZ - Vector compare not equal to zero
- * 

[PATCH v5 01/49] target/ppc: Introduce TRANS*FLAGS macros

2022-02-25 Thread matheus . ferst
From: Luis Pires 

Add TRANS_FLAGS, TRANS_FLAGS2 and TRANS64_FLAGS2, variants of TRANS and
TRANS64 that check the instruction flags before calling the translation
function.

Reviewed-by: Richard Henderson 
Signed-off-by: Luis Pires 
[ferst: - TRANS_FLAGS2 instead of TRANS_FLAGS_E
- Use the new macros in load/store vector insns ]
Signed-off-by: Matheus Ferst 
---
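The pattern behind the new macros can be tried outside QEMU with mock
types. Everything in this sketch is a stand-in: Ctx, arg_DEMO,
FLAG2_ISA310, REQUIRE_FLAGS2, do_demo and demo_allowed are hypothetical
and only imitate the shape of DisasContext, the decodetree argument
structs and REQUIRE_INSNS_FLAGS2. The point is that the flag check moves
out of the shared implementation and into the generated trans_* wrapper.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mock translation context and decoded-argument type. */
typedef struct { uint64_t insns_flags2; } Ctx;
typedef struct { int rt; } arg_DEMO;

#define FLAG2_ISA310 (1u << 0)   /* hypothetical flag bit */

/* Mock check: leave the translator when the flag is not set. */
#define REQUIRE_FLAGS2(ctx, FLAG)                           \
    do {                                                    \
        if (((ctx)->insns_flags2 & FLAG2_##FLAG) == 0) {    \
            return false;                                   \
        }                                                   \
    } while (0)

/* Same shape as the macro added by the patch. */
#define TRANS_FLAGS2(FLAGS2, NAME, FUNC, ...)               \
static bool trans_##NAME(Ctx *ctx, arg_##NAME *a)           \
{                                                           \
    REQUIRE_FLAGS2(ctx, FLAGS2);                            \
    return FUNC(ctx, a, __VA_ARGS__);                       \
}

/* Shared implementation: no flag checks needed here any more. */
static bool do_demo(Ctx *ctx, arg_DEMO *a, bool store)
{
    (void)ctx;
    (void)a;
    (void)store;
    return true;
}

TRANS_FLAGS2(ISA310, DEMO, do_demo, true)

/* Small driver: does a CPU with these flags accept the instruction? */
static bool demo_allowed(uint64_t flags2)
{
    Ctx ctx = { flags2 };
    arg_DEMO a = { 0 };

    return trans_DEMO(&ctx, &a);
}
```

Because each wrapper names its own flag, one implementation such as
do_lstxv_D can serve instructions introduced in different ISA versions.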
 target/ppc/translate.c  | 19 +++
 target/ppc/translate/vsx-impl.c.inc | 37 ++---
 2 files changed, 31 insertions(+), 25 deletions(-)

diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index ecc5a104e0..b46a11386e 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -6604,10 +6604,29 @@ static int times_16(DisasContext *ctx, int x)
 #define TRANS(NAME, FUNC, ...) \
 static bool trans_##NAME(DisasContext *ctx, arg_##NAME *a) \
 { return FUNC(ctx, a, __VA_ARGS__); }
+#define TRANS_FLAGS(FLAGS, NAME, FUNC, ...) \
+static bool trans_##NAME(DisasContext *ctx, arg_##NAME *a) \
+{  \
+REQUIRE_INSNS_FLAGS(ctx, FLAGS);   \
+return FUNC(ctx, a, __VA_ARGS__);  \
+}
+#define TRANS_FLAGS2(FLAGS2, NAME, FUNC, ...) \
+static bool trans_##NAME(DisasContext *ctx, arg_##NAME *a) \
+{  \
+REQUIRE_INSNS_FLAGS2(ctx, FLAGS2); \
+return FUNC(ctx, a, __VA_ARGS__);  \
+}
 
 #define TRANS64(NAME, FUNC, ...) \
 static bool trans_##NAME(DisasContext *ctx, arg_##NAME *a) \
 { REQUIRE_64BIT(ctx); return FUNC(ctx, a, __VA_ARGS__); }
+#define TRANS64_FLAGS2(FLAGS2, NAME, FUNC, ...) \
+static bool trans_##NAME(DisasContext *ctx, arg_##NAME *a) \
+{  \
+REQUIRE_64BIT(ctx);\
+REQUIRE_INSNS_FLAGS2(ctx, FLAGS2); \
+return FUNC(ctx, a, __VA_ARGS__);  \
+}
 
 /* TODO: More TRANS* helpers for extra insn_flags checks. */
 
diff --git a/target/ppc/translate/vsx-impl.c.inc b/target/ppc/translate/vsx-impl.c.inc
index 128968b5e7..e8a4ba0cfa 100644
--- a/target/ppc/translate/vsx-impl.c.inc
+++ b/target/ppc/translate/vsx-impl.c.inc
@@ -2072,12 +2072,6 @@ static bool do_lstxv(DisasContext *ctx, int ra, TCGv displ,
 
 static bool do_lstxv_D(DisasContext *ctx, arg_D *a, bool store, bool paired)
 {
-if (paired) {
-REQUIRE_INSNS_FLAGS2(ctx, ISA310);
-} else {
-REQUIRE_INSNS_FLAGS2(ctx, ISA300);
-}
-
 if (paired || a->rt >= 32) {
 REQUIRE_VSX(ctx);
 } else {
@@ -2091,7 +2085,6 @@ static bool do_lstxv_PLS_D(DisasContext *ctx, arg_PLS_D *a,
bool store, bool paired)
 {
 arg_D d;
-REQUIRE_INSNS_FLAGS2(ctx, ISA310);
 REQUIRE_VSX(ctx);
 
 if (!resolve_PLS_D(ctx, &d, a)) {
@@ -2103,12 +2096,6 @@ static bool do_lstxv_PLS_D(DisasContext *ctx, arg_PLS_D *a,
 
 static bool do_lstxv_X(DisasContext *ctx, arg_X *a, bool store, bool paired)
 {
-if (paired) {
-REQUIRE_INSNS_FLAGS2(ctx, ISA310);
-} else {
-REQUIRE_INSNS_FLAGS2(ctx, ISA300);
-}
-
 if (paired || a->rt >= 32) {
 REQUIRE_VSX(ctx);
 } else {
@@ -2118,18 +2105,18 @@ static bool do_lstxv_X(DisasContext *ctx, arg_X *a, bool store, bool paired)
 return do_lstxv(ctx, a->ra, cpu_gpr[a->rb], a->rt, store, paired);
 }
 
-TRANS(STXV, do_lstxv_D, true, false)
-TRANS(LXV, do_lstxv_D, false, false)
-TRANS(STXVP, do_lstxv_D, true, true)
-TRANS(LXVP, do_lstxv_D, false, true)
-TRANS(STXVX, do_lstxv_X, true, false)
-TRANS(LXVX, do_lstxv_X, false, false)
-TRANS(STXVPX, do_lstxv_X, true, true)
-TRANS(LXVPX, do_lstxv_X, false, true)
-TRANS64(PSTXV, do_lstxv_PLS_D, true, false)
-TRANS64(PLXV, do_lstxv_PLS_D, false, false)
-TRANS64(PSTXVP, do_lstxv_PLS_D, true, true)
-TRANS64(PLXVP, do_lstxv_PLS_D, false, true)
+TRANS_FLAGS2(ISA300, STXV, do_lstxv_D, true, false)
+TRANS_FLAGS2(ISA300, LXV, do_lstxv_D, false, false)
+TRANS_FLAGS2(ISA310, STXVP, do_lstxv_D, true, true)
+TRANS_FLAGS2(ISA310, LXVP, do_lstxv_D, false, true)
+TRANS_FLAGS2(ISA300, STXVX, do_lstxv_X, true, false)
+TRANS_FLAGS2(ISA300, LXVX, do_lstxv_X, false, false)
+TRANS_FLAGS2(ISA310, STXVPX, do_lstxv_X, true, true)
+TRANS_FLAGS2(ISA310, LXVPX, do_lstxv_X, false, true)
+TRANS64_FLAGS2(ISA310, PSTXV, do_lstxv_PLS_D, true, false)
+TRANS64_FLAGS2(ISA310, PLXV, do_lstxv_PLS_D, false, false)
+TRANS64_FLAGS2(ISA310, PSTXVP, do_lstxv_PLS_D, true, true)
+TRANS64_FLAGS2(ISA310, PLXVP, do_lstxv_PLS_D, false, true)
 
 static void gen_xxblendv_vec(unsigned vece, TCGv_vec t, TCGv_vec a, TCGv_vec b,
  TCGv_vec c)
-- 
2.25.1




[PATCH v5 03/49] target/ppc: Moved vector multiply high and low to decodetree

2022-02-25 Thread matheus . ferst
From: "Lucas Mateus Castro (alqotel)" 

Moved instructions vmulld, vmulhuw, vmulhsw, vmulhud and vmulhsd to
decodetree

Reviewed-by: Richard Henderson 
Signed-off-by: Lucas Mateus Castro (alqotel) 
Signed-off-by: Matheus Ferst 
---
 target/ppc/helper.h |  8 
 target/ppc/insn32.decode|  6 ++
 target/ppc/int_helper.c |  8 
 target/ppc/translate/vmx-impl.c.inc | 21 -
 target/ppc/translate/vmx-ops.c.inc  |  5 -
 5 files changed, 30 insertions(+), 18 deletions(-)

diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 07433b6f79..4ff71b2fa3 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -202,10 +202,10 @@ DEF_HELPER_FLAGS_3(VMULOSW, TCG_CALL_NO_RWG, void, avr, 
avr, avr)
 DEF_HELPER_FLAGS_3(VMULOUB, TCG_CALL_NO_RWG, void, avr, avr, avr)
 DEF_HELPER_FLAGS_3(VMULOUH, TCG_CALL_NO_RWG, void, avr, avr, avr)
 DEF_HELPER_FLAGS_3(VMULOUW, TCG_CALL_NO_RWG, void, avr, avr, avr)
-DEF_HELPER_3(vmulhsw, void, avr, avr, avr)
-DEF_HELPER_3(vmulhuw, void, avr, avr, avr)
-DEF_HELPER_3(vmulhsd, void, avr, avr, avr)
-DEF_HELPER_3(vmulhud, void, avr, avr, avr)
+DEF_HELPER_FLAGS_3(VMULHSW, TCG_CALL_NO_RWG, void, avr, avr, avr)
+DEF_HELPER_FLAGS_3(VMULHUW, TCG_CALL_NO_RWG, void, avr, avr, avr)
+DEF_HELPER_FLAGS_3(VMULHSD, TCG_CALL_NO_RWG, void, avr, avr, avr)
+DEF_HELPER_FLAGS_3(VMULHUD, TCG_CALL_NO_RWG, void, avr, avr, avr)
 DEF_HELPER_3(vslo, void, avr, avr, avr)
 DEF_HELPER_3(vsro, void, avr, avr, avr)
 DEF_HELPER_3(vsrv, void, avr, avr, avr)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 092ea79618..d817e44c71 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -462,6 +462,12 @@ VMULOSD 000100 . . . 00111001000@VX
 VMULEUD 000100 . . . 01011001000@VX
 VMULOUD 000100 . . . 00011001000@VX
 
+VMULHSW 000100 . . . 01110001001@VX
+VMULHUW 000100 . . . 01010001001@VX
+VMULHSD 000100 . . . 0001001@VX
+VMULHUD 000100 . . . 01011001001@VX
+VMULLD  000100 . . . 00111001001@VX
+
 # VSX Load/Store Instructions
 
 LXV 01 . .  . 001   @DQ_TSX
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index c9f34ce3ca..873f957bf4 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -1097,7 +1097,7 @@ VMUL(UW, u32, VsrW, VsrD, uint64_t)
 #undef VMUL_DO_ODD
 #undef VMUL
 
-void helper_vmulhsw(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
+void helper_VMULHSW(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
 {
 int i;
 
@@ -1106,7 +1106,7 @@ void helper_vmulhsw(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t 
*b)
 }
 }
 
-void helper_vmulhuw(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
+void helper_VMULHUW(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
 {
 int i;
 
@@ -1116,7 +1116,7 @@ void helper_vmulhuw(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t 
*b)
 }
 }
 
-void helper_vmulhsd(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
+void helper_VMULHSD(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
 {
 uint64_t discard;
 
@@ -1124,7 +1124,7 @@ void helper_vmulhsd(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t 
*b)
 muls64(&discard, &r->u64[1], a->s64[1], b->s64[1]);
 }
 
-void helper_vmulhud(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
+void helper_VMULHUD(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)
 {
 uint64_t discard;
 
diff --git a/target/ppc/translate/vmx-impl.c.inc 
b/target/ppc/translate/vmx-impl.c.inc
index a34a080e83..d493de3629 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -799,11 +799,6 @@ static void trans_vclzd(DisasContext *ctx)
 }
 
 GEN_VXFORM_V(vmuluwm, MO_32, tcg_gen_gvec_mul, 4, 2);
-GEN_VXFORM_V(vmulld, MO_64, tcg_gen_gvec_mul, 4, 7);
-GEN_VXFORM(vmulhuw, 4, 10);
-GEN_VXFORM(vmulhud, 4, 11);
-GEN_VXFORM(vmulhsw, 4, 14);
-GEN_VXFORM(vmulhsd, 4, 15);
 GEN_VXFORM_V(vslb, MO_8, tcg_gen_gvec_shlv, 2, 4);
 GEN_VXFORM_V(vslh, MO_16, tcg_gen_gvec_shlv, 2, 5);
 GEN_VXFORM_V(vslw, MO_32, tcg_gen_gvec_shlv, 2, 6);
@@ -2128,6 +2123,17 @@ static bool do_vx_vmuleo(DisasContext *ctx, arg_VX *a, 
bool even,
 return true;
 }
 
+static bool trans_VMULLD(DisasContext *ctx, arg_VX *a)
+{
+REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+REQUIRE_VECTOR(ctx);
+
+tcg_gen_gvec_mul(MO_64, avr_full_offset(a->vrt), avr_full_offset(a->vra),
+ avr_full_offset(a->vrb), 16, 16);
+
+return true;
+}
+
 TRANS_FLAGS2(ALTIVEC_207, VMULESB, do_vx_helper, gen_helper_VMULESB)
 TRANS_FLAGS2(ALTIVEC_207, VMULOSB, do_vx_helper, gen_helper_VMULOSB)
 TRANS_FLAGS2(ALTIVEC_207, VMULEUB, do_vx_helper, gen_helper_VMULEUB)
@@ -2145,6 +2151,11 @@ TRANS_FLAGS2(ISA310, VMULOSD, do_vx_vmuleo, false, 
tcg_gen_muls2_i64)
 TRANS_FLAGS2(ISA310, VMULEUD, do_vx_vmuleo, true , tcg_gen_mulu2_i64)
 TRANS_FLAGS2(ISA310, VMULOUD, do_vx_vmuleo, false, tcg_

[PATCH v5 09/49] target/ppc: Move Vector Compare Equal/Not Equal/Greater Than to decodetree

2022-02-25 Thread matheus . ferst
From: Matheus Ferst 

Reviewed-by: Richard Henderson 
Signed-off-by: Matheus Ferst 
---
 target/ppc/helper.h | 30 --
 target/ppc/insn32.decode| 24 
 target/ppc/int_helper.c | 54 -
 target/ppc/translate/vmx-impl.c.inc | 89 -
 target/ppc/translate/vmx-ops.c.inc  | 15 +
 5 files changed, 88 insertions(+), 124 deletions(-)

diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 79e1a10a1c..67f78b801b 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -140,46 +140,16 @@ DEF_HELPER_3(vabsduw, void, avr, avr, avr)
 DEF_HELPER_3(vavgsb, void, avr, avr, avr)
 DEF_HELPER_3(vavgsh, void, avr, avr, avr)
 DEF_HELPER_3(vavgsw, void, avr, avr, avr)
-DEF_HELPER_4(vcmpequb, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpequh, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpequw, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpequd, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpneb, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpneh, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpnew, void, env, avr, avr, avr)
 DEF_HELPER_4(vcmpnezb, void, env, avr, avr, avr)
 DEF_HELPER_4(vcmpnezh, void, env, avr, avr, avr)
 DEF_HELPER_4(vcmpnezw, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpgtub, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpgtuh, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpgtuw, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpgtud, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpgtsb, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpgtsh, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpgtsw, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpgtsd, void, env, avr, avr, avr)
 DEF_HELPER_4(vcmpeqfp, void, env, avr, avr, avr)
 DEF_HELPER_4(vcmpgefp, void, env, avr, avr, avr)
 DEF_HELPER_4(vcmpgtfp, void, env, avr, avr, avr)
 DEF_HELPER_4(vcmpbfp, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpequb_dot, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpequh_dot, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpequw_dot, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpequd_dot, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpneb_dot, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpneh_dot, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpnew_dot, void, env, avr, avr, avr)
 DEF_HELPER_4(vcmpnezb_dot, void, env, avr, avr, avr)
 DEF_HELPER_4(vcmpnezh_dot, void, env, avr, avr, avr)
 DEF_HELPER_4(vcmpnezw_dot, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpgtub_dot, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpgtuh_dot, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpgtuw_dot, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpgtud_dot, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpgtsb_dot, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpgtsh_dot, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpgtsw_dot, void, env, avr, avr, avr)
-DEF_HELPER_4(vcmpgtsd_dot, void, env, avr, avr, avr)
 DEF_HELPER_4(vcmpeqfp_dot, void, env, avr, avr, avr)
 DEF_HELPER_4(vcmpgefp_dot, void, env, avr, avr, avr)
 DEF_HELPER_4(vcmpgtfp_dot, void, env, avr, avr, avr)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index cba680075b..5443ee0394 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -51,6 +51,9 @@
 &VA vrt vra vrb rc
 @VA .. vrt:5 vra:5 vrb:5 rc:5 ..&VA
 
+&VC vrt vra vrb rc:bool
+@VC .. vrt:5 vra:5 vrb:5 rc:1 ..&VC
+
 &VN vrt vra vrb sh
 @VN .. vrt:5 vra:5 vrb:5 .. sh:3 .. &VN
 
@@ -373,6 +376,27 @@ DSCLIQ  11 . . .. 00110 .   
@Z22_tap_sh_rc
 DSCRI   111011 . . .. 001100010 .   @Z22_ta_sh_rc
 DSCRIQ  11 . . .. 001100010 .   @Z22_tap_sh_rc
 
+## Vector Integer Instructions
+
+VCMPEQUB000100 . . . . 000110   @VC
+VCMPEQUH000100 . . . . 0001000110   @VC
+VCMPEQUW000100 . . . . 001110   @VC
+VCMPEQUD000100 . . . . 0011000111   @VC
+
+VCMPGTSB000100 . . . . 110110   @VC
+VCMPGTSH000100 . . . . 1101000110   @VC
+VCMPGTSW000100 . . . . 111110   @VC
+VCMPGTSD000100 . . . . 000111   @VC
+
+VCMPGTUB000100 . . . . 100110   @VC
+VCMPGTUH000100 . . . . 1001000110   @VC
+VCMPGTUW000100 . . . . 101110   @VC
+VCMPGTUD000100 . . . . 1011000111   @VC
+
+VCMPNEB 000100 . . . . 000111   @VC
+VCMPNEH 000100 . . . . 0001000111   @VC
+VCMPNEW 000100 . . . . 001111   @VC
+
 ## Vector Bit Manipulation Instruction
 
 VCFUGED 000100 . . . 10101001101@VX
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index a75a5482fc..734b817b68 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -662,57 +662,6 @@ VCF(ux, uint32_to_float32, u32)
 VCF(sx, int

[PATCH 8/9] Avocado tests: classify tests based on what it's booted

2022-02-25 Thread Cleber Rosa
This adds some classification to the existing tests, based on the
boot mechanism and, much more loosely, on the content of the binary blob.

The proposal is to use the "boots" tag, and so far the following
values have been defined with the following meaning:

 - bios:   the "-bios" option is used to select the BIOS file to be
   loaded.  Because default bios are used in many QEMU runs,
   only tests that change the default are tagged.
 - kernel: means that the direct kernel boot mechanism (-kernel) is
   used.  Most of the time it means that a Linux kernel is
   booted, although there are occurrences of uboot usage.
 - initrd: means that an initial ram disk (-initrd) is used in
   addition to the kernel boot.
 - rootfs: means that a root filesystem is booted, in addition to a
   kernel and optionally an initrd.  This is usually done with
   a "-drive" command line option.
 - distro: means that a full blown distro image is booted, which may
   or may not include a kernel and initrd.  This is also
   usually done with a "-drive" command line option.

As with any other Avocado tags, it's possible to use them to select a
subset of tests.  For instance, if one wants to run tests that boot a
BIOS:

  $ avocado run -t boots:bios tests/avocado/

If one wants to run tests that boot a kernel and an initrd:

  $ avocado run -t boots:kernel,boots:initrd tests/avocado/
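To illustrate how the docstring tags relate to a single "-t" argument, here
is a minimal Python sketch.  It is a simplified model, not Avocado's actual
implementation: the helpers docstring_tags() and matches() are hypothetical
names, and the model only covers the case where every comma-separated tag in
one "-t" argument must be present on the test.

```python
import re


def docstring_tags(docstring):
    """Collect the tags declared on ':avocado: tags=' docstring lines."""
    tags = set()
    for match in re.finditer(r":avocado:\s+tags=(\S+)", docstring or ""):
        tags.update(match.group(1).split(","))
    return tags


def matches(tags, tag_filter):
    """True when every comma-separated tag of one -t argument is present."""
    return all(wanted in tags for wanted in tag_filter.split(","))


# Docstring modeled on test_x86_64_pc after this patch is applied.
EXAMPLE_DOC = """
:avocado: tags=arch:x86_64
:avocado: tags=machine:pc
:avocado: tags=boots:kernel
"""

tags = docstring_tags(EXAMPLE_DOC)
print(matches(tags, "boots:kernel"))               # kernel-only filter
print(matches(tags, "boots:kernel,boots:initrd"))  # needs both tags
```

In this model a test tagged only with boots:kernel is selected by the first
filter but not by the second, since it carries no boots:initrd tag.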

It's possible, if deemed valuable, to further evolve this
classification into one with a clear separation between mechanism and
content.

Signed-off-by: Cleber Rosa 
---
 tests/avocado/boot_linux.py   |  4 ++
 tests/avocado/boot_linux_console.py   | 54 +++
 tests/avocado/boot_xen.py |  3 ++
 tests/avocado/hotplug_cpu.py  |  1 +
 tests/avocado/intel_iommu.py  |  1 +
 tests/avocado/linux_initrd.py |  2 +
 tests/avocado/linux_ssh_mips_malta.py |  2 +
 tests/avocado/machine_arm_canona1100.py   |  1 +
 tests/avocado/machine_arm_integratorcp.py |  4 ++
 tests/avocado/machine_arm_n8x0.py |  2 +
 tests/avocado/machine_avr6.py |  1 +
 tests/avocado/machine_m68k_nextcube.py|  1 +
 tests/avocado/machine_microblaze.py   |  1 +
 tests/avocado/machine_mips_fuloong2e.py   |  1 +
 tests/avocado/machine_mips_loongson3v.py  |  1 +
 tests/avocado/machine_mips_malta.py   |  3 ++
 tests/avocado/machine_rx_gdbsim.py|  2 +
 tests/avocado/machine_s390_ccw_virtio.py  |  4 ++
 tests/avocado/machine_sparc64_sun4u.py|  1 +
 tests/avocado/machine_sparc_leon3.py  |  1 +
 tests/avocado/multiprocess.py |  4 ++
 tests/avocado/ppc_405.py  |  2 +
 tests/avocado/ppc_bamboo.py   |  2 +
 tests/avocado/ppc_mpc8544ds.py|  1 +
 tests/avocado/ppc_prep_40p.py |  1 +
 tests/avocado/ppc_pseries.py  |  1 +
 tests/avocado/ppc_virtex_ml507.py |  1 +
 tests/avocado/replay_kernel.py| 28 
 tests/avocado/replay_linux.py |  1 +
 tests/avocado/reverse_debugging.py|  1 +
 tests/avocado/smmu.py |  1 +
 tests/avocado/tcg_plugins.py  |  3 ++
 tests/avocado/virtio-gpu.py   |  2 +
 tests/avocado/virtiofs_submounts.py   |  1 +
 34 files changed, 139 insertions(+)

diff --git a/tests/avocado/boot_linux.py b/tests/avocado/boot_linux.py
index ab19146d1e..c4172f11e3 100644
--- a/tests/avocado/boot_linux.py
+++ b/tests/avocado/boot_linux.py
@@ -18,6 +18,7 @@
 class BootLinuxX8664(LinuxTest):
 """
 :avocado: tags=arch:x86_64
+:avocado: tags=boots:distro
 """
 
 def test_pc_i440fx_tcg(self):
@@ -62,6 +63,7 @@ class BootLinuxAarch64(LinuxTest):
 :avocado: tags=arch:aarch64
 :avocado: tags=machine:virt
 :avocado: tags=machine:gic-version=2
+:avocado: tags=boots:distro
 """
 
 def add_common_args(self):
@@ -110,6 +112,7 @@ def test_virt_kvm(self):
 class BootLinuxPPC64(LinuxTest):
 """
 :avocado: tags=arch:ppc64
+:avocado: tags=boots:distro
 """
 
 def test_pseries_tcg(self):
@@ -125,6 +128,7 @@ def test_pseries_tcg(self):
 class BootLinuxS390X(LinuxTest):
 """
 :avocado: tags=arch:s390x
+:avocado: tags=boots:distro
 """
 
 @skipIf(os.getenv('GITLAB_CI'), 'Running on GitLab')
diff --git a/tests/avocado/boot_linux_console.py 
b/tests/avocado/boot_linux_console.py
index 9c618d4809..0a8980953f 100644
--- a/tests/avocado/boot_linux_console.py
+++ b/tests/avocado/boot_linux_console.py
@@ -95,6 +95,7 @@ def test_x86_64_pc(self):
 """
 :avocado: tags=arch:x86_64
 :avocado: tags=machine:pc
+:avocado: tags=boots:kernel
 """
 kernel_url = ('https://archives.fedoraproject.org/pub/archive/fedora'
   '/linux/releases/29/Everything/x86_64/os/images/pxeboot'
@@ -115,6 +116,7 @@ def test_mips_malta(self):
  

[PATCH v5 00/49] target/ppc: PowerISA Vector/VSX instruction batch

2022-02-25 Thread matheus . ferst
From: Matheus Ferst 

This patch series implements 5 missing instructions from PowerISA v3.0
and 58 new instructions from PowerISA v3.1, moving 87 other instructions
to decodetree along the way.

Patches without review: 4, 24, 26, 27, 34, 35, 38, 40, 44-46

This series can also be found at:
https://github.com/PPC64/qemu/tree/ppc-isa31-2112-v5

v5:
 - 2 new instructions: vrlqnm/vrlqmi;
 - DEF_HELPER_FLAGS_N with TCG_CALL_NO_RWG where possible (rth);
 - Other fixes/optimizations (rth).

v4:
 - Rebase on master;
 - 16 new instructions: vs[lr]q, vrlq, vextsd2q, lxvr[bhwd]x/stxvr[bhwd]x,
   plxssp/pstxssp and plxsd/pstxsd;
 - Multiple fixes/optimizations (rth)

v3:
 - Dropped patch 33, which caused a regression in xxperm[r]

v2:
 - New patch (30) to remove xscmpnedp

Leandro Lupori (2):
  target/ppc: implement plxsd/pstxsd
  target/ppc: implement plxssp/pstxssp

Lucas Coutinho (3):
  target/ppc: Move vexts[bhw]2[wd] to decodetree
  target/ppc: Implement vextsd2q
  target/ppc: implement lxvr[bhwd]/stxvr[bhwd]x

Lucas Mateus Castro (alqotel) (3):
  target/ppc: moved vector even and odd multiplication to decodetree
  target/ppc: Moved vector multiply high and low to decodetree
  target/ppc: vmulh* instructions without helpers

Luis Pires (1):
  target/ppc: Introduce TRANS*FLAGS macros

Matheus Ferst (29):
  target/ppc: Move Vector Compare Equal/Not Equal/Greater Than to
decodetree
  target/ppc: Move Vector Compare Not Equal or Zero to decodetree
  target/ppc: Implement Vector Compare Equal Quadword
  target/ppc: Implement Vector Compare Greater Than Quadword
  target/ppc: Implement Vector Compare Quadword
  target/ppc: implement vstri[bh][lr]
  target/ppc: implement vclrlb
  target/ppc: implement vclrrb
  target/ppc: implement vcntmb[bhwd]
  target/ppc: implement vgnb
  target/ppc: move vs[lr][a][bhwd] to decodetree
  target/ppc: implement vslq
  target/ppc: implement vsrq
  target/ppc: implement vsraq
  target/ppc: move vrl[bhwd] to decodetree
  target/ppc: move vrl[bhwd]nm/vrl[bhwd]mi to decodetree
  target/ppc: implement vrlq
  target/ppc: implement vrlqnm
  target/ppc: implement vrlqmi
  target/ppc: Move vsel and vperm/vpermr to decodetree
  target/ppc: Move xxsel to decodetree
  target/ppc: move xxperm/xxpermr to decodetree
  target/ppc: Move xxpermdi to decodetree
  target/ppc: Implement xxpermx instruction
  tcg/tcg-op-gvec.c: Introduce tcg_gen_gvec_4i
  target/ppc: Implement xxeval
  target/ppc: Implement xxgenpcv[bhwd]m instruction
  target/ppc: move xs[n]madd[am][ds]p/xs[n]msub[am][ds]p to decodetree
  target/ppc: implement xs[n]maddqp[o]/xs[n]msubqp[o]

Víctor Colombo (11):
  target/ppc: Implement vmsumcud instruction
  target/ppc: Implement vmsumudm instruction
  target/ppc: Implement xvtlsbb instruction
  target/ppc: Remove xscmpnedp instruction
  target/ppc: Refactor VSX_SCALAR_CMP_DP
  target/ppc: Implement xscmp{eq,ge,gt}qp
  target/ppc: Move xscmp{eq,ge,gt}dp to decodetree
  target/ppc: Move xs{max, min}[cj]dp to use do_helper_XX3
  target/ppc: Refactor VSX_MAX_MINC helper
  target/ppc: Implement xs{max,min}cqp
  target/ppc: Implement xvcvbf16spn and xvcvspbf16 instructions

 include/tcg/tcg-op-gvec.h   |   22 +
 target/ppc/fpu_helper.c |  221 +++--
 target/ppc/helper.h |  155 ++-
 target/ppc/insn32.decode|  234 -
 target/ppc/insn64.decode|   56 +-
 target/ppc/int_helper.c |  408 
 target/ppc/translate.c  |   58 +-
 target/ppc/translate/vmx-impl.c.inc | 1348 +--
 target/ppc/translate/vmx-ops.c.inc  |   59 +-
 target/ppc/translate/vsx-impl.c.inc |  842 ++---
 target/ppc/translate/vsx-ops.c.inc  |   67 --
 tcg/ppc/tcg-target.c.inc|6 +
 tcg/tcg-op-gvec.c   |  146 +++
 13 files changed, 2845 insertions(+), 777 deletions(-)

-- 
2.25.1




[PATCH v5 06/49] target/ppc: Implement vmsumudm instruction

2022-02-25 Thread matheus . ferst
From: Víctor Colombo 

Based on [1] by Lijun Pan , which was never merged
into master.

[1]: https://lists.gnu.org/archive/html/qemu-ppc/2020-07/msg00419.html

Reviewed-by: Richard Henderson 
Signed-off-by: Víctor Colombo 
Signed-off-by: Matheus Ferst 
---
 target/ppc/insn32.decode|  1 +
 target/ppc/translate/vmx-impl.c.inc | 34 +
 2 files changed, 35 insertions(+)

diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index e85a75db2f..732a2bb79e 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -471,6 +471,7 @@ VMULLD  000100 . . . 00111001001@VX
 ## Vector Multiply-Sum Instructions
 
 VMSUMCUD000100 . . . . 010111   @VA
+VMSUMUDM000100 . . . . 100011   @VA
 
 # VSX Load/Store Instructions
 
diff --git a/target/ppc/translate/vmx-impl.c.inc 
b/target/ppc/translate/vmx-impl.c.inc
index 4f528dc820..fcff3418c5 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -2081,6 +2081,40 @@ static bool trans_VPEXTD(DisasContext *ctx, arg_VX *a)
 return true;
 }
 
+static bool trans_VMSUMUDM(DisasContext *ctx, arg_VA *a)
+{
+TCGv_i64 rl, rh, src1, src2;
+int dw;
+
+REQUIRE_INSNS_FLAGS2(ctx, ISA300);
+REQUIRE_VECTOR(ctx);
+
+rh = tcg_temp_new_i64();
+rl = tcg_temp_new_i64();
+src1 = tcg_temp_new_i64();
+src2 = tcg_temp_new_i64();
+
+get_avr64(rl, a->rc, false);
+get_avr64(rh, a->rc, true);
+
+for (dw = 0; dw < 2; dw++) {
+get_avr64(src1, a->vra, dw);
+get_avr64(src2, a->vrb, dw);
+tcg_gen_mulu2_i64(src1, src2, src1, src2);
+tcg_gen_add2_i64(rl, rh, rl, rh, src1, src2);
+}
+
+set_avr64(a->vrt, rl, false);
+set_avr64(a->vrt, rh, true);
+
+tcg_temp_free_i64(rl);
+tcg_temp_free_i64(rh);
+tcg_temp_free_i64(src1);
+tcg_temp_free_i64(src2);
+
+return true;
+}
+
 static bool trans_VMSUMCUD(DisasContext *ctx, arg_VA *a)
 {
 TCGv_i64 tmp0, tmp1, prod1h, prod1l, prod0h, prod0l, zero;
-- 
2.25.1
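
For reference, the arithmetic generated by trans_VMSUMUDM in the patch above
can be modeled with Python integers.  This is only a sketch under the
assumption that each vector register is represented as one unsigned 128-bit
integer; the function name vmsumudm() is chosen for illustration and is not
part of QEMU.

```python
MASK64 = (1 << 64) - 1
MASK128 = (1 << 128) - 1


def vmsumudm(vra, vrb, vrc):
    """Sum of the two doubleword products plus vrc, modulo 2**128."""
    acc = vrc & MASK128
    for shift in (0, 64):            # low doubleword, then high doubleword
        a = (vra >> shift) & MASK64
        b = (vrb >> shift) & MASK64
        # Mirrors tcg_gen_mulu2_i64 followed by the 128-bit tcg_gen_add2_i64.
        acc = (acc + a * b) & MASK128
    return acc


# High doublewords 2 and 5, low doublewords 3 and 7, accumulator 1.
result = vmsumudm((2 << 64) | 3, (5 << 64) | 7, 1)
print(result)
```

The masking after each addition corresponds to the carry out of the high
doubleword being discarded, which is what makes the operation "modulo".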




[PATCH v5 08/49] target/ppc: Implement vextsd2q

2022-02-25 Thread matheus . ferst
From: Lucas Coutinho 

Reviewed-by: Richard Henderson 
Signed-off-by: Lucas Coutinho 
Signed-off-by: Matheus Ferst 
---
 target/ppc/insn32.decode|  1 +
 target/ppc/translate/vmx-impl.c.inc | 18 ++
 2 files changed, 19 insertions(+)

diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 1dcf9c61e9..cba680075b 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -426,6 +426,7 @@ VEXTSH2W000100 . 10001 . 1100010
@VX_tb
 VEXTSB2D000100 . 11000 . 1100010@VX_tb
 VEXTSH2D000100 . 11001 . 1100010@VX_tb
 VEXTSW2D000100 . 11010 . 1100010@VX_tb
+VEXTSD2Q000100 . 11011 . 1100010@VX_tb
 
 ## Vector Mask Manipulation Instructions
 
diff --git a/target/ppc/translate/vmx-impl.c.inc 
b/target/ppc/translate/vmx-impl.c.inc
index aa021bdf54..afe56a19d5 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -1825,6 +1825,24 @@ TRANS(VEXTSB2D, do_vexts, MO_64, 56);
 TRANS(VEXTSH2D, do_vexts, MO_64, 48);
 TRANS(VEXTSW2D, do_vexts, MO_64, 32);
 
+static bool trans_VEXTSD2Q(DisasContext *ctx, arg_VX_tb *a)
+{
+TCGv_i64 tmp;
+
+REQUIRE_INSNS_FLAGS2(ctx, ISA310);
+REQUIRE_VECTOR(ctx);
+
+tmp = tcg_temp_new_i64();
+
+get_avr64(tmp, a->vrb, false);
+set_avr64(a->vrt, tmp, false);
+tcg_gen_sari_i64(tmp, tmp, 63);
+set_avr64(a->vrt, tmp, true);
+
+tcg_temp_free_i64(tmp);
+return true;
+}
+
 GEN_VXFORM_NOA_2(vctzb, 1, 24, 28)
 GEN_VXFORM_NOA_2(vctzh, 1, 24, 29)
 GEN_VXFORM_NOA_2(vctzw, 1, 24, 30)
-- 
2.25.1
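
The sign extension performed by trans_VEXTSD2Q above can be checked against a
small Python model.  This is a sketch that assumes the vector register is held
as one unsigned 128-bit integer; vextsd2q() is an illustrative name only.

```python
MASK64 = (1 << 64) - 1


def vextsd2q(vrb):
    """Sign-extend the low doubleword of vrb to a 128-bit value."""
    low = vrb & MASK64                  # get_avr64(tmp, vrb, false)
    # An arithmetic shift right by 63 gives all ones or all zeros.
    high = MASK64 if (low >> 63) else 0
    return (high << 64) | low


print(hex(vextsd2q(0x7FFFFFFFFFFFFFFF)))
print(hex(vextsd2q(0x8000000000000000)))
```

Note that the high doubleword of the source register does not influence the
result in this model, matching the translation code, which only reads the low
doubleword.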




[PATCH 9/9] Avocado tests: don't run tests with TCG that boot full blown distros

2022-02-25 Thread Cleber Rosa
Tests that use TCG and boot full blown distros, such as Fedora, take a
long time to run.  This excludes those combinations by default on
invocations of "make check-avocado".

Tests that rely on KVM instead will continue to run.

As a reminder, one can always supply a list of tests or tags to be
used on a "make check-avocado" by setting AVOCADO_TESTS or
AVOCADO_TAGS.
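
As an illustration of what the Makefile change below expands to, here is a
Python sketch of the default tag arguments.  The function default_tag_args()
is a hypothetical stand-in for the two $(patsubst ...) expressions and is not
part of the build system; the target names are examples.

```python
def default_tag_args(targets):
    """Build the default '-t' arguments for the *-softmmu targets."""
    suffix = "-softmmu"
    arches = [t[:-len(suffix)] for t in targets if t.endswith(suffix)]
    args = []
    # First assignment: TCG tests, excluding those tagged boots:distro.
    for arch in arches:
        args.append(f"-t arch:{arch},accel:tcg,boots:-distro")
    # Appended with '+=': KVM tests, with no restriction on what is booted.
    for arch in arches:
        args.append(f"-t arch:{arch},accel:kvm")
    return args


for arg in default_tag_args(["x86_64-softmmu", "aarch64-linux-user"]):
    print(arg)
```

Each target therefore contributes two separate "-t" arguments, one for TCG
without distro boots and one for KVM, while non-softmmu targets contribute
none.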

Signed-off-by: Cleber Rosa 
---
 tests/Makefile.include | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/tests/Makefile.include b/tests/Makefile.include
index 676aa0d944..6d9cf7cbc9 100644
--- a/tests/Makefile.include
+++ b/tests/Makefile.include
@@ -91,8 +91,11 @@ endif
 # Any number of command separated loggers are accepted.  For more
 # information please refer to "avocado --help".
 AVOCADO_SHOW=app
+comma:=,
 ifndef AVOCADO_TAGS
-   AVOCADO_CMDLINE_TAGS=$(patsubst %-softmmu,-t arch:%, \
+   AVOCADO_CMDLINE_TAGS=$(patsubst %-softmmu,-t 
arch:%$(comma)accel:tcg$(comma)boots:-distro, \
+$(filter %-softmmu,$(TARGETS)))
+   AVOCADO_CMDLINE_TAGS+=$(patsubst %-softmmu,-t arch:%$(comma)accel:kvm, \
 $(filter %-softmmu,$(TARGETS)))
 else
AVOCADO_CMDLINE_TAGS=$(addprefix -t , $(AVOCADO_TAGS))
-- 
2.35.1




[PATCH v5 02/49] target/ppc: moved vector even and odd multiplication to decodetree

2022-02-25 Thread matheus . ferst
From: "Lucas Mateus Castro (alqotel)" 

Moved the instructions vmulesb, vmulosb, vmuleub, vmuloub,
vmulesh, vmulosh, vmuleuh, vmulouh, vmulesw, vmulosw,
vmuleuw and vmulouw from legacy to decodetree. Implemented
the instructions vmulesd, vmulosd, vmuleud, vmuloud.

Reviewed-by: Richard Henderson 
Signed-off-by: Lucas Mateus Castro (alqotel) 
Signed-off-by: Matheus Ferst 
---
 target/ppc/helper.h | 24 -
 target/ppc/insn32.decode| 22 +
 target/ppc/int_helper.c | 20 
 target/ppc/translate/vmx-impl.c.inc | 77 ++---
 target/ppc/translate/vmx-ops.c.inc  | 15 ++
 tcg/ppc/tcg-target.c.inc|  6 +++
 6 files changed, 112 insertions(+), 52 deletions(-)

diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index ab008c9d4e..07433b6f79 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -190,18 +190,18 @@ DEF_HELPER_3(vmrglw, void, avr, avr, avr)
 DEF_HELPER_3(vmrghb, void, avr, avr, avr)
 DEF_HELPER_3(vmrghh, void, avr, avr, avr)
 DEF_HELPER_3(vmrghw, void, avr, avr, avr)
-DEF_HELPER_3(vmulesb, void, avr, avr, avr)
-DEF_HELPER_3(vmulesh, void, avr, avr, avr)
-DEF_HELPER_3(vmulesw, void, avr, avr, avr)
-DEF_HELPER_3(vmuleub, void, avr, avr, avr)
-DEF_HELPER_3(vmuleuh, void, avr, avr, avr)
-DEF_HELPER_3(vmuleuw, void, avr, avr, avr)
-DEF_HELPER_3(vmulosb, void, avr, avr, avr)
-DEF_HELPER_3(vmulosh, void, avr, avr, avr)
-DEF_HELPER_3(vmulosw, void, avr, avr, avr)
-DEF_HELPER_3(vmuloub, void, avr, avr, avr)
-DEF_HELPER_3(vmulouh, void, avr, avr, avr)
-DEF_HELPER_3(vmulouw, void, avr, avr, avr)
+DEF_HELPER_FLAGS_3(VMULESB, TCG_CALL_NO_RWG, void, avr, avr, avr)
+DEF_HELPER_FLAGS_3(VMULESH, TCG_CALL_NO_RWG, void, avr, avr, avr)
+DEF_HELPER_FLAGS_3(VMULESW, TCG_CALL_NO_RWG, void, avr, avr, avr)
+DEF_HELPER_FLAGS_3(VMULEUB, TCG_CALL_NO_RWG, void, avr, avr, avr)
+DEF_HELPER_FLAGS_3(VMULEUH, TCG_CALL_NO_RWG, void, avr, avr, avr)
+DEF_HELPER_FLAGS_3(VMULEUW, TCG_CALL_NO_RWG, void, avr, avr, avr)
+DEF_HELPER_FLAGS_3(VMULOSB, TCG_CALL_NO_RWG, void, avr, avr, avr)
+DEF_HELPER_FLAGS_3(VMULOSH, TCG_CALL_NO_RWG, void, avr, avr, avr)
+DEF_HELPER_FLAGS_3(VMULOSW, TCG_CALL_NO_RWG, void, avr, avr, avr)
+DEF_HELPER_FLAGS_3(VMULOUB, TCG_CALL_NO_RWG, void, avr, avr, avr)
+DEF_HELPER_FLAGS_3(VMULOUH, TCG_CALL_NO_RWG, void, avr, avr, avr)
+DEF_HELPER_FLAGS_3(VMULOUW, TCG_CALL_NO_RWG, void, avr, avr, avr)
 DEF_HELPER_3(vmulhsw, void, avr, avr, avr)
 DEF_HELPER_3(vmulhuw, void, avr, avr, avr)
 DEF_HELPER_3(vmulhsd, void, avr, avr, avr)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 2a9c91a423..092ea79618 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -440,6 +440,28 @@ VEXTRACTWM  000100 . 01010 . 1100110
@VX_tb
 VEXTRACTDM  000100 . 01011 . 1100110@VX_tb
 VEXTRACTQM  000100 . 01100 . 1100110@VX_tb
 
+## Vector Multiply Instruction
+
+VMULESB 000100 . . . 0111000@VX
+VMULOSB 000100 . . . 0011000@VX
+VMULEUB 000100 . . . 0101000@VX
+VMULOUB 000100 . . . 0001000@VX
+
+VMULESH 000100 . . . 01101001000@VX
+VMULOSH 000100 . . . 00101001000@VX
+VMULEUH 000100 . . . 01001001000@VX
+VMULOUH 000100 . . . 1001000@VX
+
+VMULESW 000100 . . . 01110001000@VX
+VMULOSW 000100 . . . 00110001000@VX
+VMULEUW 000100 . . . 01010001000@VX
+VMULOUW 000100 . . . 00010001000@VX
+
+VMULESD 000100 . . . 0001000@VX
+VMULOSD 000100 . . . 00111001000@VX
+VMULEUD 000100 . . . 01011001000@VX
+VMULOUD 000100 . . . 00011001000@VX
+
 # VSX Load/Store Instructions
 
 LXV 01 . .  . 001   @DQ_TSX
diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index d1b12788b2..c9f34ce3ca 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -1063,7 +1063,7 @@ void helper_vmsumuhs(CPUPPCState *env, ppc_avr_t *r, 
ppc_avr_t *a,
 }
 
 #define VMUL_DO_EVN(name, mul_element, mul_access, prod_access, cast)   \
-void helper_v##name(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)   \
+void helper_V##name(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)   \
 {   \
 int i;  \
 \
@@ -1074,7 +1074,7 @@ void helper_vmsumuhs(CPUPPCState *env, ppc_avr_t *r, 
ppc_avr_t *a,
 }
 
 #define VMUL_DO_ODD(name, mul_element, mul_access, prod_access, cast)   \
-void helper_v##name(ppc_avr_t *r, ppc_avr_t *a, ppc_avr_t *b)   \

[PATCH 4/9] Avocado: bump to version 95.0

2022-02-25 Thread Cleber Rosa
Even though there have been a number of improvements (and some pretty
deep internal changes) since Avocado 88.1, only one change should
affect "make check-avocado".

With the nrunner architecture, test execution happens in parallel by
default.  But tests may fail due to insufficient timeouts or similar
reasons when run on systems with limited or shared resources.  To
avoid breakages, especially on CI, let's keep the serial execution
until it is proven that parallel execution won't impact the CI jobs.

Signed-off-by: Cleber Rosa 
---
 tests/Makefile.include | 1 +
 tests/requirements.txt | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/tests/Makefile.include b/tests/Makefile.include
index e7153c8e91..676aa0d944 100644
--- a/tests/Makefile.include
+++ b/tests/Makefile.include
@@ -135,6 +135,7 @@ check-avocado: check-venv $(TESTS_RESULTS_DIR) get-vm-images
 $(if $(AVOCADO_TAGS),, --filter-by-tags-include-empty \
--filter-by-tags-include-empty-key) \
 $(AVOCADO_CMDLINE_TAGS) \
+--nrunner-max-parallel-tasks=1 \
 $(if $(GITLAB_CI),,--failfast) $(AVOCADO_TESTS), \
 "AVOCADO", "tests/avocado")
 
diff --git a/tests/requirements.txt b/tests/requirements.txt
index a21b59b443..49aa0fd6f6 100644
--- a/tests/requirements.txt
+++ b/tests/requirements.txt
@@ -1,5 +1,5 @@
 # Add Python module requirements, one per line, to be installed
 # in the tests/venv Python virtual environment. For more info,
 # refer to: https://pip.pypa.io/en/stable/user_guide/#id1
-avocado-framework==88.1
+avocado-framework==95.0
 pycdlib==1.11.0
-- 
2.35.1




[PATCH 0/9] Avocado tests: filter out tests using TCG booting full blown distros

2022-02-25 Thread Cleber Rosa
It was previously reported[1] and discussed that tests booting full
blown distros and relying on TCG would take too much time to run,
especially in the environments given by GitLab CI's shared runners.

This is an implementation of a proposal to exclude those tests from
being run by default on `make check-avocado` invocations.  To make it
extra clear, all tests are still available, but those that are tagged
with "accel:tcg" and "boots:distro", are filtered out by default on
`make check-avocado`.

This is the situation of the Avocado GitLab CI jobs with and without
the changes in this patch series:

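+-------------------------+------------------+------------------+
|                         |      Now[2]      |    Before[3]     |
+-------------------------+---------+--------+---------+--------+
| Job                     | Length  | Tests  | Length  | Tests  |
| Name                    | (mm:ss) |  Run   | (mm:ss) |  Run   |
+-------------------------+---------+--------+---------+--------+
| avocado-system-alpine   |  06:33  |   16   |  20:30  |   18   |
| avocado-system-debian   |  12:06  |   24   |  13:05  |   24   |
| avocado-system-centos   |  09:58  |   41   |  24:15  |   44   |
| avocado-system-fedora   |  08:50  |   35   |  08:59  |   35   |
| avocado-system-opensuse |  08:09  |   38   |  27:21  |   42   |
| avocado-system-ubuntu   |  06:52  |   16   |  18:52  |   18   |
| avocado-cfi-x86_64      |  05:43  |   27   |  15:07  |   29   |
+-------------------------+---------+--------+---------+--------+
| TOTALS                  |  58:11  |  197   | 2:08:09 |  210   |
+-------------------------+---------+--------+---------+--------+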

Assuming the jobs run in parallel, the overall wait time for all the
Avocado jobs to complete is now ~12 minutes.
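
The totals and the ~12 minute figure follow directly from the "Now" column.
A short Python check, using only the durations listed in the table above
(the helper name seconds() is illustrative):

```python
# "Now" durations from the table, as mm:ss strings.
now = {
    "avocado-system-alpine": "06:33",
    "avocado-system-debian": "12:06",
    "avocado-system-centos": "09:58",
    "avocado-system-fedora": "08:50",
    "avocado-system-opensuse": "08:09",
    "avocado-system-ubuntu": "06:52",
    "avocado-cfi-x86_64": "05:43",
}


def seconds(mmss):
    """Convert an 'mm:ss' string to a number of seconds."""
    minutes, secs = mmss.split(":")
    return int(minutes) * 60 + int(secs)


total = sum(seconds(v) for v in now.values())    # serial cost of all jobs
longest = max(seconds(v) for v in now.values())  # wall time if run in parallel

print(f"total   {total // 60:02d}:{total % 60:02d}")
print(f"longest {longest // 60:02d}:{longest % 60:02d}")
```

When the jobs run in parallel the wait is bounded by the longest job
(avocado-system-debian), not by the sum of all jobs.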

[1] https://lists.gnu.org/archive/html/qemu-devel/2021-07/msg07271.html
[2] https://gitlab.com/cleber.gnu/qemu/-/pipelines/479720240
[3] https://gitlab.com/qemu-project/qemu/-/pipelines/478580581

Cleber Rosa (9):
  Avocado GitLab CI jobs: don't reset TARGETS and simplify commands
  Avocado tests: use logging namespace that is preserved in test logs
  Avocado migration test: adapt to "utils.network" API namespace change
  Avocado: bump to version 95.0
  tests/avocado/linux_ssh_mips_malta.py: add missing accel (tcg) tag
  tests/avocado/virtiofs_submounts.py: shared_dir may not exist
  Avocado tests: improve documentation on tag filtering
  Avocado tests: classify tests based on what it's booted
  Avocado tests: don't run tests with TCG that boot full blown distros

 .gitlab-ci.d/buildtest-template.yml   |  3 ++
 .gitlab-ci.d/buildtest.yml|  9 
 docs/devel/testing.rst| 22 +
 tests/Makefile.include|  6 ++-
 tests/avocado/avocado_qemu/__init__.py| 10 ++---
 tests/avocado/boot_linux.py   |  4 ++
 tests/avocado/boot_linux_console.py   | 54 +++
 tests/avocado/boot_xen.py |  3 ++
 tests/avocado/hotplug_cpu.py  |  1 +
 tests/avocado/intel_iommu.py  |  1 +
 tests/avocado/linux_initrd.py |  5 ++-
 tests/avocado/linux_ssh_mips_malta.py |  5 +++
 tests/avocado/machine_arm_canona1100.py   |  1 +
 tests/avocado/machine_arm_integratorcp.py |  7 ++-
 tests/avocado/machine_arm_n8x0.py |  2 +
 tests/avocado/machine_avr6.py |  1 +
 tests/avocado/machine_m68k_nextcube.py|  1 +
 tests/avocado/machine_microblaze.py   |  1 +
 tests/avocado/machine_mips_fuloong2e.py   |  1 +
 tests/avocado/machine_mips_loongson3v.py  |  1 +
 tests/avocado/machine_mips_malta.py   |  6 ++-
 tests/avocado/machine_rx_gdbsim.py|  2 +
 tests/avocado/machine_s390_ccw_virtio.py  |  4 ++
 tests/avocado/machine_sparc64_sun4u.py|  1 +
 tests/avocado/machine_sparc_leon3.py  |  1 +
 tests/avocado/migration.py|  4 +-
 tests/avocado/multiprocess.py |  4 ++
 tests/avocado/ppc_405.py  |  2 +
 tests/avocado/ppc_bamboo.py   |  2 +
 tests/avocado/ppc_mpc8544ds.py|  1 +
 tests/avocado/ppc_prep_40p.py |  1 +
 tests/avocado/ppc_pseries.py  |  1 +
 tests/avocado/ppc_virtex_ml507.py |  1 +
 tests/avocado/replay_kernel.py| 33 --
 tests/avocado/replay_linux.py |  6 +--
 tests/avocado/reverse_debugging.py|  6 +--
 tests/avocado/smmu.py |  1 +
 tests/avocado/tcg_plugins.py  |  3 ++
 tests/avocado/tesseract_utils.py  |  6 +--
 tests/avocado/virtio-gpu.py   |  2 +
 tests/avocado/virtio_check_params.py  |  3 +-
 tests/avocado/virtiofs_submounts.py   |  8 ++--
 tests/requirements.txt|  2 +-
 43 files changed, 197 insertions(+), 41 deletions(-)

-- 
2.35.1





[PATCH 6/9] tests/avocado/virtiofs_submounts.py: shared_dir may not exist

2022-02-25 Thread Cleber Rosa
If the test is skipped because of its conditionals, the shared_dir
attribute may not exist.

Check for its existence in the tearDown() method to avoid an
AttributeError.

Signed-off-by: Cleber Rosa 
---
 tests/avocado/virtiofs_submounts.py | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/tests/avocado/virtiofs_submounts.py b/tests/avocado/virtiofs_submounts.py
index e6dc32ffd4..d9c2c9d9ef 100644
--- a/tests/avocado/virtiofs_submounts.py
+++ b/tests/avocado/virtiofs_submounts.py
@@ -157,9 +157,10 @@ def tearDown(self):
         except:
             pass
 
-        scratch_dir = os.path.join(self.shared_dir, 'scratch')
-        self.run(('bash', self.get_data('cleanup.sh'), scratch_dir),
-                 ignore_error=True)
+        if hasattr(self, 'shared_dir'):
+            scratch_dir = os.path.join(self.shared_dir, 'scratch')
+            self.run(('bash', self.get_data('cleanup.sh'), scratch_dir),
+                     ignore_error=True)
 
     def test_pre_virtiofsd_set_up(self):
         self.set_up_shared_dir()
-- 
2.35.1
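
The guard described in the commit message above can be sketched outside of
Avocado as follows. This is a minimal illustration, not the test's actual
code: the class name GuardedCleanupExample, the skip flag and the cleaned
list are hypothetical stand-ins, and the real test runs a cleanup script
rather than recording a path.

```python
import os


class GuardedCleanupExample:
    """Hypothetical stand-in for a test class (not the Avocado API)."""

    def __init__(self, skip):
        self.skip = skip
        self.cleaned = []  # records the paths that tearDown() would clean

    def setUp(self):
        if self.skip:
            # Mimics a skipped test: shared_dir is never assigned.
            return
        self.shared_dir = os.path.join('/tmp', 'shared')

    def tearDown(self):
        # Only touch shared_dir if setUp() got far enough to create it;
        # without this check a skipped test raises AttributeError here.
        if hasattr(self, 'shared_dir'):
            scratch_dir = os.path.join(self.shared_dir, 'scratch')
            self.cleaned.append(scratch_dir)
```

With skip=True, tearDown() returns without touching the missing attribute;
with skip=False, the scratch directory path is recorded for cleanup.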




[PATCH 3/9] Avocado migration test: adapt to "utils.network" API namespace change

2022-02-25 Thread Cleber Rosa
Since Avocado 94.0[1], the "avocado.utils.network" module has dropped
many previously deprecated API names and moved the new names into a
finer-grained structure.

This simply uses the new API names for the network port utility
module.

[1] - https://avocado-framework.readthedocs.io/en/latest/releases/94_0.html#utility-apis

Signed-off-by: Cleber Rosa 
---
 tests/avocado/avocado_qemu/__init__.py | 5 +++--
 tests/avocado/migration.py | 4 ++--
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/tests/avocado/avocado_qemu/__init__.py b/tests/avocado/avocado_qemu/__init__.py
index 88cec83e5c..3f15c8a222 100644
--- a/tests/avocado/avocado_qemu/__init__.py
+++ b/tests/avocado/avocado_qemu/__init__.py
@@ -17,7 +17,8 @@
 import uuid
 
 import avocado
-from avocado.utils import cloudinit, datadrainer, network, process, ssh, vmimage
+from avocado.utils import cloudinit, datadrainer, process, ssh, vmimage
+from avocado.utils.network import ports
 from avocado.utils.path import find_command
 
 #: The QEMU build root directory.  It may also be the source directory
@@ -601,7 +602,7 @@ def prepare_cloudinit(self, ssh_pubkey=None):
         self.log.info('Preparing cloudinit image')
         try:
             cloudinit_iso = os.path.join(self.workdir, 'cloudinit.iso')
-            self.phone_home_port = network.find_free_port()
+            self.phone_home_port = ports.find_free_port()
             pubkey_content = None
             if ssh_pubkey:
                 with open(ssh_pubkey) as pubkey:
diff --git a/tests/avocado/migration.py b/tests/avocado/migration.py
index 584d6ef53f..4b25680c50 100644
--- a/tests/avocado/migration.py
+++ b/tests/avocado/migration.py
@@ -14,7 +14,7 @@
 from avocado_qemu import QemuSystemTest
 from avocado import skipUnless
 
-from avocado.utils import network
+from avocado.utils.network import ports
 from avocado.utils import wait
 from avocado.utils.path import find_command
 
@@ -57,7 +57,7 @@ def do_migrate(self, dest_uri, src_uri=None):
         self.assert_migration(source_vm, dest_vm)
 
     def _get_free_port(self):
-        port = network.find_free_port()
+        port = ports.find_free_port()
         if port is None:
             self.cancel('Failed to find a free port')
         return port
-- 
2.35.1
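
The pattern used by _get_free_port() in the diff above (ask a finder for a
port, cancel the test if it returns None) can be sketched without Avocado
installed. This is only an illustration under stated assumptions: the names
find_free_port_stdlib and get_free_port_or_cancel are hypothetical, and the
socket-based finder is a stand-in for, not a copy of, the Avocado utility.

```python
import socket


def find_free_port_stdlib():
    """Hypothetical stand-in finder: let the OS pick an unused TCP port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(('127.0.0.1', 0))  # port 0 asks the OS for any free port
        return sock.getsockname()[1]


def get_free_port_or_cancel(finder, cancel):
    """Return a port from finder(), calling cancel() if none was found."""
    port = finder()
    if port is None:
        # In the patch this is self.cancel(...), which aborts the test.
        cancel('Failed to find a free port')
    return port
```

Passing the finder and the cancel callback as arguments keeps the sketch
independent of where the port-finding function lives, which is the only
thing the patch itself changes.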




[PATCH 7/9] Avocado tests: improve documentation on tag filtering

2022-02-25 Thread Cleber Rosa
It's possible to filter based on a combination of criteria.  This adds
examples to the documentation.

Signed-off-by: Cleber Rosa 
---
 docs/devel/testing.rst | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/docs/devel/testing.rst b/docs/devel/testing.rst
index 92d40cdd19..f5b6e07b5c 100644
--- a/docs/devel/testing.rst
+++ b/docs/devel/testing.rst
@@ -936,6 +936,28 @@ in the current directory, tagged as "quick", run:
 
   avocado run -t quick .
 
+To run tests with a given value for a given tag, such as having the
+``accel`` tag set to ``kvm``, run:
+
+.. code::
+
+  avocado run -t accel:kvm .
+
+Multiple mandatory conditions can also be given.  To run only tests
+with ``arch`` set to ``x86_64`` and ``accel`` set to ``kvm``, run:
+
+.. code::
+
+  avocado run -t arch:x86_64,accel:kvm .
+
+It's also possible to exclude tests that contain a given value for a
+tag.  To run all tests that do *not* have ``arch`` set to ``x86_64``,
+run:
+
+.. code::
+
+  avocado run -t arch:-x86_64 .
+
 The ``avocado_qemu.Test`` base test class
 ^
 
-- 
2.35.1



