Re: [PATCH] accel/tcg: Fix typo causing tb->page_addr[1] to not be recorded

2024-06-11 Thread Manos Pitsidianakis

On Wed, 12 Jun 2024 00:58, Anton Johansson via  wrote:

For TBs crossing page boundaries, the 2nd page will never be
recorded/removed, as the index of the 2nd page is computed from the
address of the 1st page. This is due to a typo, fix it.

Signed-off-by: Anton Johansson 
---


Reviewed-by: Manos Pitsidianakis 
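
To illustrate the class of bug described in the commit message, here is a minimal sketch, not the actual QEMU code. The type DemoTB, the helper names, and the 4 KiB page size are assumptions made for the example; only the idea (the second page index being derived from the first page's address) comes from the patch description.

```c
#include <stdint.h>

#define DEMO_PAGE_BITS 12 /* assumed 4 KiB pages */

/* Hypothetical stand-in for a translation block spanning up to two pages. */
typedef struct {
    uint64_t page_addr[2];
} DemoTB;

/* Buggy variant: both indices are computed from the first page's address. */
static void demo_page_indices_buggy(const DemoTB *tb,
                                    uint64_t *idx0, uint64_t *idx1)
{
    *idx0 = tb->page_addr[0] >> DEMO_PAGE_BITS;
    *idx1 = tb->page_addr[0] >> DEMO_PAGE_BITS; /* typo: wrong array slot */
}

/* Fixed variant: the second index uses the second page's address. */
static void demo_page_indices_fixed(const DemoTB *tb,
                                    uint64_t *idx0, uint64_t *idx1)
{
    *idx0 = tb->page_addr[0] >> DEMO_PAGE_BITS;
    *idx1 = tb->page_addr[1] >> DEMO_PAGE_BITS;
}
```

With the buggy variant, a block that starts near the end of one page and ends in the next reports the same index twice, so the second page is never tracked.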



Re: [PATCH 00/26] hw/ppc: Prefer HumanReadableText over Monitor

2024-06-11 Thread Manos Pitsidianakis

On Mon, 10 Jun 2024 09:20, Philippe Mathieu-Daudé  wrote:

Hi,

This series removes uses of Monitor in hw/ppc/,
replacing them with the more generic HumanReadableText.
Care is taken to keep the commits bisectable by
updating functions one by one, which also eases review.

For rationale see previous series from Daniel:
https://lore.kernel.org/qemu-devel/20211028155457.967291-1-berra...@redhat.com/

Regards,

Phil.

Philippe Mathieu-Daudé (26):
 hw/ppc: Avoid using Monitor in pnv_phb3_msi_pic_print_info()
 hw/ppc: Avoid using Monitor in icp_pic_print_info()
 hw/ppc: Avoid using Monitor in xive_tctx_pic_print_info()
 hw/ppc: Avoid using Monitor in ics_pic_print_info()
 hw/ppc: Avoid using Monitor in PnvChipClass::intc_print_info()
 hw/ppc: Avoid using Monitor in xive_end_queue_pic_print_info()
 hw/ppc: Avoid using Monitor in spapr_xive_end_pic_print_info()
 hw/ppc: Avoid using Monitor in spapr_xive_pic_print_info()
 hw/ppc: Avoid using Monitor in xive_source_pic_print_info()
 hw/ppc: Avoid using Monitor in pnv_phb4_pic_print_info()
 hw/ppc: Avoid using Monitor in xive_eas_pic_print_info()
 hw/ppc: Avoid using Monitor in xive_end_pic_print_info()
 hw/ppc: Avoid using Monitor in xive_end_eas_pic_print_info()
 hw/ppc: Avoid using Monitor in xive_nvt_pic_print_info()
 hw/ppc: Avoid using Monitor in pnv_xive_pic_print_info()
 hw/ppc: Avoid using Monitor in pnv_psi_pic_print_info()
 hw/ppc: Avoid using Monitor in xive2_eas_pic_print_info()
 hw/ppc: Avoid using Monitor in xive2_end_eas_pic_print_info()
 hw/ppc: Avoid using Monitor in xive2_end_queue_pic_print_info()
 hw/ppc: Avoid using Monitor in xive2_end_pic_print_info()
 hw/ppc: Avoid using Monitor in xive2_nvp_pic_print_info()
 hw/ppc: Avoid using Monitor in pnv_xive2_pic_print_info()
 hw/ppc: Avoid using Monitor in
   SpaprInterruptControllerClass::print_info()
 hw/ppc: Avoid using Monitor in spapr_irq_print_info()
 hw/ppc: Avoid using Monitor in pnv_chip_power9_pic_print_info_child()
 hw/ppc: Avoid using Monitor in pic_print_info()

include/hw/pci-host/pnv_phb3.h |   2 +-
include/hw/pci-host/pnv_phb4.h |   2 +-
include/hw/ppc/pnv_chip.h  |   4 +-
include/hw/ppc/pnv_psi.h   |   2 +-
include/hw/ppc/pnv_xive.h  |   4 +-
include/hw/ppc/spapr_irq.h |   4 +-
include/hw/ppc/xics.h  |   4 +-
include/hw/ppc/xive.h  |   4 +-
include/hw/ppc/xive2_regs.h|   8 +--
include/hw/ppc/xive_regs.h |   8 +--
hw/intc/pnv_xive.c |  38 ++--
hw/intc/pnv_xive2.c|  48 +++
hw/intc/spapr_xive.c   |  41 ++---
hw/intc/xics.c |  25 
hw/intc/xics_spapr.c   |   7 +--
hw/intc/xive.c | 108 -
hw/intc/xive2.c|  87 +-
hw/pci-host/pnv_phb3_msi.c |  21 +++
hw/pci-host/pnv_phb4.c |  17 +++---
hw/ppc/pnv.c   |  52 
hw/ppc/pnv_psi.c   |   9 ++-
hw/ppc/spapr.c |  11 +++-
hw/ppc/spapr_irq.c |   4 +-
23 files changed, 256 insertions(+), 254 deletions(-)

--
2.41.0


For the series:

Reviewed-by: Manos Pitsidianakis 
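
As a rough illustration of the pattern this series moves towards, the sketch below has a print_info-style helper append to a caller-owned text buffer instead of printing to a monitor directly. DemoText, demo_text_appendf() and demo_pic_print_info() are hypothetical names for this example only; they are not QEMU APIs and the output format is made up.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical growable text buffer used in place of a monitor handle. */
typedef struct {
    char *str;
    size_t len;
} DemoText;

/* Append printf-style output to the buffer; returns 0 on success. */
static int demo_text_appendf(DemoText *t, const char *fmt, ...)
{
    va_list ap, ap2;
    char *p;
    int n;

    va_start(ap, fmt);
    va_copy(ap2, ap);
    n = vsnprintf(NULL, 0, fmt, ap); /* measure the formatted length */
    va_end(ap);
    if (n < 0) {
        va_end(ap2);
        return -1;
    }
    p = realloc(t->str, t->len + (size_t)n + 1);
    if (!p) {
        va_end(ap2);
        return -1;
    }
    t->str = p;
    vsnprintf(t->str + t->len, (size_t)n + 1, fmt, ap2);
    va_end(ap2);
    t->len += (size_t)n;
    return 0;
}

/* The helper no longer knows who consumes the text, it only fills a buffer. */
static void demo_pic_print_info(int server, unsigned pending, DemoText *buf)
{
    demo_text_appendf(buf, "CPU %d pending=%02x\n", server, pending);
}
```

The caller decides what to do with the accumulated text (print it, return it from a command, compare it in a test), which is the decoupling the cover letter refers to.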



Re: [PATCH v4 0/4] hw/nvme: FDP and SR-IOV enhancements

2024-06-11 Thread Klaus Jensen
On May 29 21:42, Minwoo Im wrote:
> Hello,
> 
> This is the v4 patchset to increase the number of virtual functions for NVMe SR-IOV.
> Please consider the following change notes per version.
> 
> This patchset has been tested with more than 127 VFs using the following
> simple script.
> 
>   -device nvme-subsys,id=subsys0 \
>   -device ioh3420,id=rp2,multifunction=on,chassis=12 \
>   -device nvme,serial=foo,id=nvme0,bus=rp2,subsys=subsys0,mdts=9,msix_qsize=130,max_ioqpairs=260,sriov_max_vfs=129,sriov_vq_flexible=258,sriov_vi_flexible=129 \
> 
>   $ cat nvme-enable-vfs.sh
>   #!/bin/bash
> 
>   nr_vfs=129
> 
>   for (( i=1; i<=$nr_vfs; i++ ))
>   do
>   nvme virt-mgmt /dev/nvme0 -c $i -r 0 -a 8 -n 2
>   nvme virt-mgmt /dev/nvme0 -c $i -r 1 -a 8 -n 1
>   done
> 
>   bdf=":01:00.0"
>   sysfs="/sys/bus/pci/devices/$bdf"
>   nvme="/sys/bus/pci/drivers/nvme"
> 
>   echo 0 > $sysfs/sriov_drivers_autoprobe
>   echo $nr_vfs > $sysfs/sriov_numvfs
> 
>   for (( i=1; i<=$nr_vfs; i++ ))
>   do
>   nvme virt-mgmt /dev/nvme0 -c $i -a 9
> 
>   echo "nvme" > $sysfs/virtfn$(($i-1))/driver_override
>   bdf="$(basename $(readlink $sysfs/virtfn$(($i-1))))"
>   echo $bdf > $nvme/bind
>   done
> 
> Thanks,
> 
> v4:
>  - Rebased on the latest master.
>  - Update n->params.sriov_max_vfs to uint16_t as per spec.
> 
> v3:
>  - Replace [3/4] patch with one allocating a dynamic array of secondary
>controller list rather than a static array with a fixed size of
>maximum number of VF to support (Suggested by Klaus).
> v2: 
>  - Added [2/4] commit to fix crash due to entry overflow
> 
> Minwoo Im (4):
>   hw/nvme: add Identify Endurance Group List
>   hw/nvme: separate identify data for sec. ctrl list
>   hw/nvme: Allocate sec-ctrl-list as a dynamic array
>   hw/nvme: Expand VI/VQ resource to uint32
> 
>  hw/nvme/ctrl.c   | 59 +++-
>  hw/nvme/nvme.h   | 19 +++---
>  hw/nvme/subsys.c | 10 +---
>  include/block/nvme.h |  1 +
>  4 files changed, 54 insertions(+), 35 deletions(-)
> 
> -- 
> 2.34.1
> 

Looks good Minwoo!

Grabbing for nvme-next.

Reviewed-by: Klaus Jensen 




Re: [PATCH] ui/gtk: Wait until the current guest frame is rendered before switching to RUN_STATE_SAVE_VM

2024-06-11 Thread Marc-André Lureau
Hi

On Wed, Jun 12, 2024 at 5:29 AM Kim, Dongwon  wrote:

> Hi,
>
> From: Marc-André Lureau 
> Sent: Wednesday, June 5, 2024 12:56 AM
> To: Kim, Dongwon 
> Cc: qemu-devel@nongnu.org; Peter Xu 
> Subject: Re: [PATCH] ui/gtk: Wait until the current guest frame is
> rendered before switching to RUN_STATE_SAVE_VM
>
> Hi
>
> On Tue, Jun 4, 2024 at 9:49 PM Kim, Dongwon 
> wrote:
> On 6/4/2024 4:12 AM, Marc-André Lureau wrote:
> > Hi
> >
> > On Thu, May 30, 2024 at 2:44 AM  > > wrote:
> >
> > From: Dongwon  dongwon@intel.com>>
> >
> > Make sure rendering of the current frame is finished before switching
> > the run state to RUN_STATE_SAVE_VM by waiting for egl-sync object to
> be
> > signaled.
> >
> >
> > Can you expand on what this solves?
>
> In the current scheme, the guest waits for the fence to be signaled for each
> frame it submits before moving on to the next frame. If the guest's state
> is saved while it is still waiting for the fence, then after a restore the
> guest will keep waiting for a fence that was signaled long ago. One way to
> prevent this is to let it finish the current frame before changing the
> state.
>
> After the UI sets a fence, hw_ops->gl_block(true) gets called, which will
> block virtio-gpu/virgl from processing commands (until the fence is
> signaled and gl_block/false called again).
>
> But this "blocking" state is not saved. So how does this affect
> save/restore? Please give more details, thanks
>
> Yeah sure. The "blocking" state is not saved, but the guest's state is saved
> while it is still waiting for the response to its last resource-flush virtio
> message. This virtio response, by the way, is set to be sent to the guest when
> the pipeline is unblocked (that is, when the fence is signaled). Once the
> guest's state is saved, the current instance of the guest continues and
> receives the response as usual. The problem happens when we restore the
> saved guest state, because the restored guest will be waiting for a
> response that was sent a while ago to the original instance.
>

Where is the pending response saved? Can you detail how you test this?

thanks


-- 
Marc-André Lureau


[PATCH v4 1/1] qga/linux: Add new api 'guest-network-get-route'

2024-06-11 Thread Dehan Meng
Administrators and users need the route information of a Linux VM
when debugging and troubleshooting network problems.

Signed-off-by: Dehan Meng 
---
 qga/commands-posix.c | 73 
 qga/commands-win32.c |  6 
 qga/qapi-schema.json | 68 +
 3 files changed, 147 insertions(+)

diff --git a/qga/commands-posix.c b/qga/commands-posix.c
index 6169bbf7a0..ffae88ca69 100644
--- a/qga/commands-posix.c
+++ b/qga/commands-posix.c
@@ -2747,6 +2747,73 @@ GuestCpuStatsList *qmp_guest_get_cpustats(Error **errp)
 return head;
 }
 
+char *hexToIPAddress(unsigned int hexValue, char ipAddress[16]);
+char *hexToIPAddress(unsigned int hexValue, char ipAddress[16])
+{
+unsigned int byte1 = (hexValue >> 24) & 0xFF;
+unsigned int byte2 = (hexValue >> 16) & 0xFF;
+unsigned int byte3 = (hexValue >> 8) & 0xFF;
+unsigned int byte4 = hexValue & 0xFF;
+
+snprintf(ipAddress, 16, "%u.%u.%u.%u", byte4, byte3, byte2, byte1);
+
+return ipAddress;
+}
+
+GuestNetworkRouteStatList *qmp_guest_network_get_route(Error **errp)
+{
+GuestNetworkRouteStatList *head = NULL, **tail = &head;
+const char *routeFile = "/proc/net/route";
+FILE *fp;
+size_t n;
+char *line = NULL;
+
+fp = fopen(routeFile, "r");
+if (fp == NULL) {
+error_setg_errno(errp, errno, "open(\"%s\")", routeFile);
+return NULL;
+}
+
+while (getline(&line, &n, fp) != -1) {
+GuestNetworkRouteStat *networkroute;
+int i;
+char Iface[16];
+unsigned int Destination, Gateway, Mask, Flags;
+int RefCnt, Use, Metric, MTU, Window, IRTT;
+
+i = (sscanf(line, "%s %X %X %x %d %d %d %X %d %d %d",
+Iface, &Destination, &Gateway, &Flags, &RefCnt,
+&Use, &Metric, &Mask, &MTU, &Window, &IRTT) == 11);
+if (i == EOF) {
+continue;
+}
+
+networkroute = g_new0(GuestNetworkRouteStat, 1);
+
+char DestAddress[16];
+char GateAddress[16];
+char MaskAddress[16];
+
+networkroute->iface = g_strdup(Iface);
+networkroute->destination = g_strdup(hexToIPAddress(Destination, DestAddress));
+networkroute->gateway = g_strdup(hexToIPAddress(Gateway, GateAddress));
+networkroute->mask = g_strdup(hexToIPAddress(Mask, MaskAddress));
+networkroute->metric = Metric;
+networkroute->flags = Flags;
+networkroute->refcnt = RefCnt;
+networkroute->use = Use;
+networkroute->mtu = MTU;
+networkroute->window = Window;
+networkroute->irtt = IRTT;
+
+QAPI_LIST_APPEND(tail, networkroute);
+}
+
+free(line);
+fclose(fp);
+return head;
+}
+
 #else /* defined(__linux__) */
 
 void qmp_guest_suspend_disk(Error **errp)
@@ -3118,6 +3185,12 @@ GuestCpuStatsList *qmp_guest_get_cpustats(Error **errp)
 return NULL;
 }
 
+GuestNetworkRouteList *qmp_guest_network_get_route(Error **errp)
+{
+error_setg(errp, QERR_UNSUPPORTED);
+return NULL;
+}
+
 #endif /* CONFIG_FSFREEZE */
 
 #if !defined(CONFIG_FSTRIM)
diff --git a/qga/commands-win32.c b/qga/commands-win32.c
index 697c65507c..e62c04800a 100644
--- a/qga/commands-win32.c
+++ b/qga/commands-win32.c
@@ -2522,3 +2522,9 @@ GuestCpuStatsList *qmp_guest_get_cpustats(Error **errp)
 error_setg(errp, QERR_UNSUPPORTED);
 return NULL;
 }
+
+GuestNetworkRouteList *qmp_guest_network_get_route(Error **errp)
+{
+error_setg(errp, QERR_UNSUPPORTED);
+return NULL;
+}
diff --git a/qga/qapi-schema.json b/qga/qapi-schema.json
index 876e2a8ea8..195f6cd4e7 100644
--- a/qga/qapi-schema.json
+++ b/qga/qapi-schema.json
@@ -1789,3 +1789,71 @@
 { 'command': 'guest-get-cpustats',
   'returns': ['GuestCpuStats']
 }
+
+##
+# @GuestNetworkRouteStat:
+#
+# Route information, currently, only linux supported.
+#
+# @iface: The destination network or host's egress network interface in the routing table
+#
+# @destination: The IP address of the target network or host, The final destination of the packet
+#
+# @gateway: The IP address of the next hop router
+#
+# @mask: Subnet Mask
+#
+# @metric: Route metric
+#
+# @flags: Route flags (not for windows)
+#
+# @irtt: Initial round-trip delay (not for windows)
+#
+# @refcnt: The route's reference count (not for windows)
+#
+# @use: Route usage count (not for windows)
+#
+# @window: TCP window size, used for flow control (not for windows)
+#
+# @mtu: Data link layer maximum packet size (not for windows)
+#
+# Since: 9.1
+
+##
+{ 'struct': 'GuestNetworkRouteStat',
+  'data': {'iface': 'str',
+   'destination': 'str',
+   'gateway': 'str',
+   'metric': 'int',
+   'mask': 'str',
+   '*irtt': 'int',
+   '*flags': 'uint64',
+   '*refcnt': 'int',
+   '*use': 'int',
+   '*window': 'int',
+   '*mtu': 'int'
+   }}
+
+##
+# @GuestNetworkRoute:
+#
+# Get route information of system.
+#
+# @routes: A list of network route 

[PATCH v4 0/1] qga/linux: Add new api 'guest-network-get-route'

2024-06-11 Thread Dehan Meng
v3 -> v4
- Fix some indentation issues
- Update 'Since 8.2' to 'Since 9.1'
- Remove useless enum and adjust this change.

v2 -> v3
- Remove this declaration and make the function 'hexToIPAddress' as static.
- Define 'IFNAMSIZ' from kernel instead of a hardcode
- Remove 'GUEST_NETWORK_ROUTE_TYPE_LINUX'
- Set flags 'has_xxx' for checking if a field exists or has a value set

v1 -> v2
- Replace snprintf() with g_strdup_printf() to avoid memory problems.
- Remove the parameter 'char ipAddress[16]' in function 'char *hexToIPAddress()'.
- Add a piece of logic to skip traversing the first line of the file

Dehan Meng (1):
  qga/linux: Add new api 'guest-network-get-route'

 qga/commands-posix.c | 73 
 qga/commands-win32.c |  6 
 qga/qapi-schema.json | 68 +
 3 files changed, 147 insertions(+)

-- 
2.40.1
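
The parsing steps discussed in this series (skip the header line, read the eleven columns of /proc/net/route, convert hex fields to dotted-quad text) can be sketched as follows. This is an illustration under assumptions, not the patch itself: the demo_* names are hypothetical, and the byte order assumes a little-endian guest, as the byte order used in the posted hexToIPAddress() implies.

```c
#include <stdio.h>

/* Format a /proc/net/route hex value as dotted quad (lowest byte first). */
static void demo_hex_to_ipv4(unsigned int v, char out[16])
{
    snprintf(out, 16, "%u.%u.%u.%u",
             v & 0xFF, (v >> 8) & 0xFF, (v >> 16) & 0xFF, (v >> 24) & 0xFF);
}

/*
 * Parse one line of the route table. Returns 1 when all 11 fields were
 * converted and 0 otherwise; the header line fails the conversion and is
 * therefore skipped without special handling.
 */
static int demo_parse_route(const char *line, char iface[16],
                            char dest[16], char gw[16], char mask[16],
                            int *metric)
{
    unsigned int d, g, m, flags;
    int refcnt, use, mtu, window, irtt;

    if (sscanf(line, "%15s %X %X %x %d %d %d %X %d %d %d",
               iface, &d, &g, &flags, &refcnt, &use, metric,
               &m, &mtu, &window, &irtt) != 11) {
        return 0;
    }

    demo_hex_to_ipv4(d, dest);
    demo_hex_to_ipv4(g, gw);
    demo_hex_to_ipv4(m, mask);
    return 1;
}
```

Checking the sscanf() return value against the expected field count, rather than against EOF, is a design choice of this sketch: it rejects partially parsed lines as well as empty ones.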




[PATCH] hw/loongarch/virt: Remove unused assignment

2024-06-11 Thread Bibo Mao
The local variable gap is misused. Remove the duplicated
assignment to resolve the error reported by Coverity.

Resolves: Coverity CID 1546441
Fixes: 3cc451cbce ("hw/loongarch: Refine fwcfg memory map")
Signed-off-by: Bibo Mao 
---
 hw/loongarch/virt.c | 15 +++
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/hw/loongarch/virt.c b/hw/loongarch/virt.c
index 66cef201ab..2fe08583b8 100644
--- a/hw/loongarch/virt.c
+++ b/hw/loongarch/virt.c
@@ -1054,7 +1054,6 @@ static void fw_cfg_add_memory(MachineState *ms)
 memmap_add_entry(base, gap, 1);
 size -= gap;
 base = VIRT_HIGHMEM_BASE;
-gap = ram_size - VIRT_LOWMEM_SIZE;
 }
 
 if (size) {
@@ -1067,17 +1066,17 @@ static void fw_cfg_add_memory(MachineState *ms)
 }
 
 /* add fw_cfg memory map of other nodes */
-size = ram_size - numa_info[0].node_mem;
-gap  = VIRT_LOWMEM_BASE + VIRT_LOWMEM_SIZE;
-if (base < gap && (base + size) > gap) {
+if (numa_info[0].node_mem < gap && ram_size > gap) {
 /*
  * memory map for the maining nodes splited into two part
- *   lowram:  [base, +(gap - base))
- *   highram: [VIRT_HIGHMEM_BASE, +(size - (gap - base)))
+ * lowram:  [base, +(gap - numa_info[0].node_mem))
+ * highram: [VIRT_HIGHMEM_BASE, +(ram_size - gap))
  */
-memmap_add_entry(base, gap - base, 1);
-size -= gap - base;
+memmap_add_entry(base, gap - numa_info[0].node_mem, 1);
+size = ram_size - gap;
 base = VIRT_HIGHMEM_BASE;
+} else {
+size = ram_size - numa_info[0].node_mem;
 }
 
if (size)

base-commit: 80e8f0602168f451a93e71cbb1d59e93d745e62e
-- 
2.39.3
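
The low/high split described in the code comment of this patch can be sketched in isolation. The constants and the DemoRange type below are hypothetical values chosen for the example, not the LoongArch virt machine's real memory map, and the sketch assumes the range starts in low memory.

```c
#include <stdint.h>

#define DEMO_LOWMEM_END   0x10000000ULL /* hypothetical end of low RAM */
#define DEMO_HIGHMEM_BASE 0x80000000ULL /* hypothetical start of high RAM */

typedef struct {
    uint64_t base;
    uint64_t size;
} DemoRange;

/*
 * Split [base, base + size) at DEMO_LOWMEM_END.
 *   lowram:  [base, +(DEMO_LOWMEM_END - base))
 *   highram: [DEMO_HIGHMEM_BASE, +(size - (DEMO_LOWMEM_END - base)))
 * Returns the number of ranges written to out (1 or 2).
 */
static int demo_split_range(uint64_t base, uint64_t size, DemoRange out[2])
{
    if (base < DEMO_LOWMEM_END && base + size > DEMO_LOWMEM_END) {
        uint64_t low = DEMO_LOWMEM_END - base;

        out[0] = (DemoRange){ base, low };
        out[1] = (DemoRange){ DEMO_HIGHMEM_BASE, size - low };
        return 2;
    }

    out[0] = (DemoRange){ base, size };
    return 1;
}
```

Computing the low part once and deriving the high part from it avoids reassigning a shared variable between the two cases, which is the kind of reuse the Coverity report points at.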




[PATCH v7 1/2] hw/misc/riscv_iopmp: Add RISC-V IOPMP device

2024-06-11 Thread Ethan Chen via
Support basic functions of IOPMP specification v0.9.1 rapid-k model.
The specification url:
https://github.com/riscv-non-isa/iopmp-spec/releases/tag/v0.9.1

IOPMP checks whether a memory access from a device is valid. This implementation
uses an IOMMU to change the address space that the device accesses. There are
three possible results of an access: valid, blocked, and stalled (stall is not
supported in this patch).

If an access is valid, the target address space is downstream_as.
If an access is blocked, it goes to blocked_io_as. blocked_io_as either
responds with a bus error or reports success with fabricated data, depending
on the value of the IOPMP ERR_CFG register.
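
The routing decision described above can be sketched as a small state mapping. All names below are hypothetical; in particular the single suppress_bus_error flag is a simplification standing in for whatever ERR_CFG bits the device actually consults.

```c
#include <stdbool.h>

typedef enum { DEMO_ACCESS_VALID, DEMO_ACCESS_BLOCKED } DemoCheck;
typedef enum { DEMO_AS_DOWNSTREAM, DEMO_AS_BLOCKED_IO } DemoTarget;
typedef enum { DEMO_RESP_BUS_ERROR, DEMO_RESP_FAKE_SUCCESS } DemoResponse;

/* Pick the address space an access is redirected to. */
static DemoTarget demo_select_target(DemoCheck result)
{
    return result == DEMO_ACCESS_VALID ? DEMO_AS_DOWNSTREAM
                                       : DEMO_AS_BLOCKED_IO;
}

/*
 * Decide how the blocked address space answers. The flag is an assumed
 * abstraction of the ERR_CFG register setting mentioned in the text.
 */
static DemoResponse demo_blocked_response(bool suppress_bus_error)
{
    return suppress_bus_error ? DEMO_RESP_FAKE_SUCCESS
                              : DEMO_RESP_BUS_ERROR;
}
```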

Signed-off-by: Ethan Chen 
---
 hw/misc/Kconfig   |3 +
 hw/misc/meson.build   |1 +
 hw/misc/riscv_iopmp.c | 1002 +
 hw/misc/trace-events  |4 +
 include/hw/misc/riscv_iopmp.h |  152 +
 5 files changed, 1162 insertions(+)
 create mode 100644 hw/misc/riscv_iopmp.c
 create mode 100644 include/hw/misc/riscv_iopmp.h

diff --git a/hw/misc/Kconfig b/hw/misc/Kconfig
index 1e08785b83..427f0c702e 100644
--- a/hw/misc/Kconfig
+++ b/hw/misc/Kconfig
@@ -213,4 +213,7 @@ config IOSB
 config XLNX_VERSAL_TRNG
 bool
 
+config RISCV_IOPMP
+bool
+
 source macio/Kconfig
diff --git a/hw/misc/meson.build b/hw/misc/meson.build
index 86596a3888..f83cd108f8 100644
--- a/hw/misc/meson.build
+++ b/hw/misc/meson.build
@@ -34,6 +34,7 @@ system_ss.add(when: 'CONFIG_SIFIVE_E_PRCI', if_true: files('sifive_e_prci.c'))
 system_ss.add(when: 'CONFIG_SIFIVE_E_AON', if_true: files('sifive_e_aon.c'))
 system_ss.add(when: 'CONFIG_SIFIVE_U_OTP', if_true: files('sifive_u_otp.c'))
 system_ss.add(when: 'CONFIG_SIFIVE_U_PRCI', if_true: files('sifive_u_prci.c'))
+specific_ss.add(when: 'CONFIG_RISCV_IOPMP', if_true: files('riscv_iopmp.c'))
 
 subdir('macio')
 
diff --git a/hw/misc/riscv_iopmp.c b/hw/misc/riscv_iopmp.c
new file mode 100644
index 00..75b28dc559
--- /dev/null
+++ b/hw/misc/riscv_iopmp.c
@@ -0,0 +1,1002 @@
+/*
+ * QEMU RISC-V IOPMP (Input Output Physical Memory Protection)
+ *
+ * Copyright (c) 2023 Andes Tech. Corp.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see .
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/log.h"
+#include "qapi/error.h"
+#include "trace.h"
+#include "exec/exec-all.h"
+#include "exec/address-spaces.h"
+#include "hw/qdev-properties.h"
+#include "hw/sysbus.h"
+#include "hw/misc/riscv_iopmp.h"
+#include "memory.h"
+#include "hw/irq.h"
+#include "hw/registerfields.h"
+#include "trace.h"
+
+#define TYPE_IOPMP_IOMMU_MEMORY_REGION "iopmp-iommu-memory-region"
+
+REG32(VERSION, 0x00)
+FIELD(VERSION, VENDOR, 0, 24)
+FIELD(VERSION, SPECVER , 24, 8)
+REG32(IMP, 0x04)
+FIELD(IMP, IMPID, 0, 32)
+REG32(HWCFG0, 0x08)
+FIELD(HWCFG0, MODEL, 0, 4)
+FIELD(HWCFG0, TOR_EN, 4, 1)
+FIELD(HWCFG0, SPS_EN, 5, 1)
+FIELD(HWCFG0, USER_CFG_EN, 6, 1)
+FIELD(HWCFG0, PRIENT_PROG, 7, 1)
+FIELD(HWCFG0, RRID_TRANSL_EN, 8, 1)
+FIELD(HWCFG0, RRID_TRANSL_PROG, 9, 1)
+FIELD(HWCFG0, CHK_X, 10, 1)
+FIELD(HWCFG0, NO_X, 11, 1)
+FIELD(HWCFG0, NO_W, 12, 1)
+FIELD(HWCFG0, STALL_EN, 13, 1)
+FIELD(HWCFG0, PEIS, 14, 1)
+FIELD(HWCFG0, PEES, 15, 1)
+FIELD(HWCFG0, MFR_EN, 16, 1)
+FIELD(HWCFG0, MD_NUM, 24, 7)
+FIELD(HWCFG0, ENABLE, 31, 1)
+REG32(HWCFG1, 0x0C)
+FIELD(HWCFG1, RRID_NUM, 0, 16)
+FIELD(HWCFG1, ENTRY_NUM, 16, 16)
+REG32(HWCFG2, 0x10)
+FIELD(HWCFG2, PRIO_ENTRY, 0, 16)
+FIELD(HWCFG2, RRID_TRANSL, 16, 16)
+REG32(ENTRYOFFSET, 0x14)
+FIELD(ENTRYOFFSET, OFFSET, 0, 32)
+REG32(MDSTALL, 0x30)
+FIELD(MDSTALL, EXEMPT, 0, 1)
+FIELD(MDSTALL, MD, 1, 31)
+REG32(MDSTALLH, 0x34)
+FIELD(MDSTALLH, MD, 0, 32)
+REG32(RRIDSCP, 0x38)
+FIELD(RRIDSCP, RRID, 0, 16)
+FIELD(RRIDSCP, OP, 30, 2)
+REG32(MDLCK, 0x40)
+FIELD(MDLCK, L, 0, 1)
+FIELD(MDLCK, MD, 1, 31)
+REG32(MDLCKH, 0x44)
+FIELD(MDLCKH, MDH, 0, 32)
+REG32(MDCFGLCK, 0x48)
+FIELD(MDCFGLCK, L, 0, 1)
+FIELD(MDCFGLCK, F, 1, 7)
+REG32(ENTRYLCK, 0x4C)
+FIELD(ENTRYLCK, L, 0, 1)
+FIELD(ENTRYLCK, F, 1, 16)
+REG32(ERR_CFG, 0x60)
+FIELD(ERR_CFG, L, 0, 1)
+FIELD(ERR_CFG, IE, 1, 1)
+FIELD(ERR_CFG, IRE, 2, 1)
+FIELD(ERR_CFG, IWE, 3, 1)
+FIELD(ERR_CFG, IXE, 4, 1)
+FIELD(ERR_CFG, RRE, 5, 1)
+FIELD(ERR_CFG, RWE, 6, 1)
+FIELD(ERR_CFG, RXE, 7, 1)
+REG32(ERR_REQINFO, 

[PATCH v7 2/2] hw/riscv/virt: Add IOPMP support

2024-06-11 Thread Ethan Chen via
If a requestor device is connected to the IOPMP device, its memory access will
be checked by the IOPMP rule.

- Add 'iopmp=on' option to add an iopmp device and make the Generic PCI Express
  Bridge connect to IOPMP.

Signed-off-by: Ethan Chen 
---
 docs/system/riscv/virt.rst |  6 
 hw/riscv/Kconfig   |  1 +
 hw/riscv/virt.c| 57 --
 include/hw/riscv/virt.h|  5 +++-
 4 files changed, 66 insertions(+), 3 deletions(-)

diff --git a/docs/system/riscv/virt.rst b/docs/system/riscv/virt.rst
index 9a06f95a34..3b2576f905 100644
--- a/docs/system/riscv/virt.rst
+++ b/docs/system/riscv/virt.rst
@@ -116,6 +116,12 @@ The following machine-specific options are supported:
   having AIA IMSIC (i.e. "aia=aplic-imsic" selected). When not specified,
   the default number of per-HART VS-level AIA IMSIC pages is 0.
 
+- iopmp=[on|off]
+
+  When this option is "on", an IOPMP device is added to machine. It checks dma
+  operations from the generic PCIe host bridge. This option is assumed to be
+  "off".
+
 Running Linux kernel
 
 
diff --git a/hw/riscv/Kconfig b/hw/riscv/Kconfig
index a2030e3a6f..0b45a5ade2 100644
--- a/hw/riscv/Kconfig
+++ b/hw/riscv/Kconfig
@@ -56,6 +56,7 @@ config RISCV_VIRT
 select PLATFORM_BUS
 select ACPI
 select ACPI_PCI
+select RISCV_IOPMP
 
 config SHAKTI_C
 bool
diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
index 4fdb660525..53a1b71c71 100644
--- a/hw/riscv/virt.c
+++ b/hw/riscv/virt.c
@@ -55,6 +55,7 @@
 #include "hw/acpi/aml-build.h"
 #include "qapi/qapi-visit-common.h"
 #include "hw/virtio/virtio-iommu.h"
+#include "hw/misc/riscv_iopmp.h"
 
 /* KVM AIA only supports APLIC MSI. APLIC Wired is always emulated by QEMU. */
 static bool virt_use_kvm_aia(RISCVVirtState *s)
@@ -82,6 +83,7 @@ static const MemMapEntry virt_memmap[] = {
 [VIRT_UART0] ={ 0x1000, 0x100 },
 [VIRT_VIRTIO] =   { 0x10001000,0x1000 },
 [VIRT_FW_CFG] =   { 0x1010,  0x18 },
+[VIRT_IOPMP] ={ 0x1020,  0x10 },
 [VIRT_FLASH] ={ 0x2000, 0x400 },
 [VIRT_IMSIC_M] =  { 0x2400, VIRT_IMSIC_MAX_SIZE },
 [VIRT_IMSIC_S] =  { 0x2800, VIRT_IMSIC_MAX_SIZE },
@@ -1006,6 +1008,24 @@ static void create_fdt_virtio_iommu(RISCVVirtState *s, uint16_t bdf)
bdf + 1, iommu_phandle, bdf + 1, 0x - bdf);
 }
 
+static void create_fdt_iopmp(RISCVVirtState *s, const MemMapEntry *memmap,
+ uint32_t irq_mmio_phandle) {
+g_autofree char *name = NULL;
+MachineState *ms = MACHINE(s);
+
+name = g_strdup_printf("/soc/iopmp@%lx", (long)memmap[VIRT_IOPMP].base);
+qemu_fdt_add_subnode(ms->fdt, name);
+qemu_fdt_setprop_string(ms->fdt, name, "compatible", "riscv_iopmp");
+qemu_fdt_setprop_cells(ms->fdt, name, "reg", 0x0, memmap[VIRT_IOPMP].base,
+0x0, memmap[VIRT_IOPMP].size);
+qemu_fdt_setprop_cell(ms->fdt, name, "interrupt-parent", irq_mmio_phandle);
+if (s->aia_type == VIRT_AIA_TYPE_NONE) {
+qemu_fdt_setprop_cell(ms->fdt, name, "interrupts", IOPMP_IRQ);
+} else {
+qemu_fdt_setprop_cells(ms->fdt, name, "interrupts", IOPMP_IRQ, 0x4);
+}
+}
+
 static void finalize_fdt(RISCVVirtState *s)
 {
 uint32_t phandle = 1, irq_mmio_phandle = 1, msi_pcie_phandle = 1;
@@ -1024,6 +1044,10 @@ static void finalize_fdt(RISCVVirtState *s)
 create_fdt_uart(s, virt_memmap, irq_mmio_phandle);
 
 create_fdt_rtc(s, virt_memmap, irq_mmio_phandle);
+
+if (s->have_iopmp) {
+create_fdt_iopmp(s, virt_memmap, irq_mmio_phandle);
+}
 }
 
 static void create_fdt(RISCVVirtState *s, const MemMapEntry *memmap)
@@ -1404,7 +1428,7 @@ static void virt_machine_init(MachineState *machine)
 RISCVVirtState *s = RISCV_VIRT_MACHINE(machine);
 MemoryRegion *system_memory = get_system_memory();
 MemoryRegion *mask_rom = g_new(MemoryRegion, 1);
-DeviceState *mmio_irqchip, *virtio_irqchip, *pcie_irqchip;
+DeviceState *mmio_irqchip, *virtio_irqchip, *pcie_irqchip, *gpex_dev;
 int i, base_hartid, hart_count;
 int socket_count = riscv_socket_count(machine);
 
@@ -1570,7 +1594,7 @@ static void virt_machine_init(MachineState *machine)
 qdev_get_gpio_in(virtio_irqchip, VIRTIO_IRQ + i));
 }
 
-gpex_pcie_init(system_memory, pcie_irqchip, s);
+gpex_dev = gpex_pcie_init(system_memory, pcie_irqchip, s);
 
 create_platform_bus(s, mmio_irqchip);
 
@@ -1581,6 +1605,14 @@ static void virt_machine_init(MachineState *machine)
 sysbus_create_simple("goldfish_rtc", memmap[VIRT_RTC].base,
 qdev_get_gpio_in(mmio_irqchip, RTC_IRQ));
 
+if (s->have_iopmp) {
+DeviceState *iopmp_dev = sysbus_create_simple(TYPE_IOPMP,
+memmap[VIRT_IOPMP].base,
+qdev_get_gpio_in(DEVICE(mmio_irqchip), IOPMP_IRQ));
+
+iopmp_setup_pci(iopmp_dev, PCI_HOST_BRIDGE(gpex_dev)->bus);

[PATCH v7 0/2] Support RISC-V IOPMP

2024-06-11 Thread Ethan Chen via
Due to changing the referenced specification version, this patch has changed
a lot in this version.

This series implements basic functions of IOPMP specification v0.9.1 rapid-k
model.
The specification url:
https://github.com/riscv-non-isa/iopmp-spec/releases/tag/v0.9.1

When IOPMP is enabled, memory accesses from devices will be checked by IOPMP.

CPU as an IOPMP requestor has not been implemented because the IOTLB does not
support recording sections outside the current CPU address space.

Changes for v7:

  - Change the specification version to v0.9.1
  - Remove the sps extension
  - Remove stall support and transaction information, which need requestor
    device support.
  - Remove iopmp_cascade option for virt machine
  - Refine 'addr' range checks switch case (Daniel)


Ethan Chen (2):
  hw/misc/riscv_iopmp: Add RISC-V IOPMP device
  hw/riscv/virt: Add IOPMP support

 docs/system/riscv/virt.rst|6 +
 hw/misc/Kconfig   |3 +
 hw/misc/meson.build   |1 +
 hw/misc/riscv_iopmp.c | 1002 +
 hw/misc/trace-events  |4 +
 hw/riscv/Kconfig  |1 +
 hw/riscv/virt.c   |   57 +-
 include/hw/misc/riscv_iopmp.h |  152 +
 include/hw/riscv/virt.h   |5 +-
 9 files changed, 1228 insertions(+), 3 deletions(-)
 create mode 100644 hw/misc/riscv_iopmp.c
 create mode 100644 include/hw/misc/riscv_iopmp.h

-- 
2.34.1




Re: [PATCH 3/6] target/riscv: Add support for Control Transfer Records extension CSRs.

2024-06-11 Thread Jason Chien
It makes sense. Thank you for the explanation.

On Mon, Jun 10, 2024 at 10:12 PM, Rajnesh Kanwal wrote:

>
> Thanks Jason for your review.
>
> On Tue, Jun 4, 2024 at 11:14 AM Jason Chien 
> wrote:
> >
> >
> > On 2024/5/30 at 12:09 AM, Rajnesh Kanwal wrote:
> >
> > This commit adds support for [m|s|vs]ctrcontrol, sctrstatus and
> > sctrdepth CSRs handling.
> >
> > Signed-off-by: Rajnesh Kanwal 
> > ---
> >  target/riscv/cpu.h |   5 ++
> >  target/riscv/cpu_cfg.h |   2 +
> >  target/riscv/csr.c | 159 +
> >  3 files changed, 166 insertions(+)
> >
> > diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
> > index a185e2d494..3d4d5172b8 100644
> > --- a/target/riscv/cpu.h
> > +++ b/target/riscv/cpu.h
> > @@ -263,6 +263,11 @@ struct CPUArchState {
> >  target_ulong mcause;
> >  target_ulong mtval;  /* since: priv-1.10.0 */
> >
> > +uint64_t mctrctl;
> > +uint32_t sctrdepth;
> > +uint32_t sctrstatus;
> > +uint64_t vsctrctl;
> > +
> >  /* Machine and Supervisor interrupt priorities */
> >  uint8_t miprio[64];
> >  uint8_t siprio[64];
> > diff --git a/target/riscv/cpu_cfg.h b/target/riscv/cpu_cfg.h
> > index d9354dc80a..d329a65811 100644
> > --- a/target/riscv/cpu_cfg.h
> > +++ b/target/riscv/cpu_cfg.h
> > @@ -123,6 +123,8 @@ struct RISCVCPUConfig {
> >  bool ext_zvfhmin;
> >  bool ext_smaia;
> >  bool ext_ssaia;
> > +bool ext_smctr;
> > +bool ext_ssctr;
> >  bool ext_sscofpmf;
> >  bool ext_smepmp;
> >  bool rvv_ta_all_1s;
> > diff --git a/target/riscv/csr.c b/target/riscv/csr.c
> > index 2f92e4b717..888084d8e5 100644
> > --- a/target/riscv/csr.c
> > +++ b/target/riscv/csr.c
> > @@ -621,6 +621,61 @@ static RISCVException pointer_masking(CPURISCVState *env, int csrno)
> >  return RISCV_EXCP_ILLEGAL_INST;
> >  }
> >
> > +/*
> > + * M-mode:
> > + * Without ext_smctr raise illegal inst excep.
> > + * Otherwise everything is accessible to m-mode.
> > + *
> > + * S-mode:
> > + * Without ext_ssctr or mstateen.ctr raise illegal inst excep.
> > + * Otherwise everything other than mctrctl is accessible.
> > + *
> > + * VS-mode:
> > + * Without ext_ssctr or mstateen.ctr raise illegal inst excep.
> > + * Without hstateen.ctr raise virtual illegal inst excep.
> > + * Otherwise allow vsctrctl, sctrstatus, 0x200-0x2ff entry range.
> > + * Always raise illegal instruction exception for sctrdepth.
> > + */
> > +static RISCVException ctr_mmode(CPURISCVState *env, int csrno)
> > +{
> > +/* Check if smctr-ext is present */
> > +if (riscv_cpu_cfg(env)->ext_smctr) {
> > +return RISCV_EXCP_NONE;
> > +}
> > +
> > +return RISCV_EXCP_ILLEGAL_INST;
> > +}
> > +
> > +static RISCVException ctr_smode(CPURISCVState *env, int csrno)
> > +{
> > +if ((env->priv == PRV_M && riscv_cpu_cfg(env)->ext_smctr) ||
> > +(env->priv == PRV_S && !env->virt_enabled &&
> > + riscv_cpu_cfg(env)->ext_ssctr)) {
> > +return smstateen_acc_ok(env, 0, SMSTATEEN0_CTR);
> > +}
> > +
> > +if (env->priv == PRV_S && env->virt_enabled &&
> > +riscv_cpu_cfg(env)->ext_ssctr) {
> > +if (csrno == CSR_SCTRSTATUS) {
> >
> > missing sctrctl?
> >
> > +return smstateen_acc_ok(env, 0, SMSTATEEN0_CTR);
> > +}
> > +
> > +return RISCV_EXCP_VIRT_INSTRUCTION_FAULT;
> > +}
> > +
> > +return RISCV_EXCP_ILLEGAL_INST;
> > +}
> >
> > I think there is no need to bind M-mode with ext_smctr, S-mode with
> > ext_ssctr and VS-mode with ext_ssctr, since this predicate function is for
> > S-mode CSRs, which are defined in both smctr and ssctr, we just need to
> > check at least one of ext_ssctr or ext_smctr is true.
> >
> > The spec states that:
> > Attempts to access sctrdepth from VS-mode or VU-mode raise a
> > virtual-instruction exception, unless CTR state enable access restrictions
> > apply.
> >
> > In my understanding, we should check the presence of smstateen extension
> > first, and
> >
> > if smstateen is implemented:
> >
> > for sctrctl and sctrstatus, call smstateen_acc_ok()
> > for sctrdepth, call smstateen_acc_ok(), and if there is any exception
> > returned, always report virtual-instruction exception.
>
> For sctrdepth, we are supposed to always return a virt-inst exception in
> case of VS-VU mode unless CTR state enable access restrictions apply.
>
> So for sctrdepth, call smstateen_acc_ok(), and if there is no exception
> returned (mstateen.CTR=1 and hstateen.CTR=1 for virt mode), check if we are
> in virtual mode and return virtual-instruction exception, otherwise return
> RISCV_EXCP_NONE. Note that if hstateen.CTR=0, smstateen_acc_ok() will return
> virtual-instruction exception, which means regardless of the hstateen.CTR
> state, we will always return virtual-instruction exception for VS/VU mode
> access to sctrdepth.
>
> Basically this covers following rules for sctrdepth:
>
> if mstateen.ctr == 0
> return RISCV_EXCP_ILLEGAL_INST; // For all modes lower than M-mode.
> else 
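
The sctrdepth rule described in the reply above can be sketched as a standalone predicate. This is a simplified model under assumptions: demo_stateen_check() is a hypothetical stand-in, not QEMU's smstateen_acc_ok(), and it only models the mstateen.CTR and hstateen.CTR bits mentioned in the discussion.

```c
#include <stdbool.h>

typedef enum {
    DEMO_EXCP_NONE,
    DEMO_EXCP_ILLEGAL_INST,
    DEMO_EXCP_VIRT_INST
} DemoExcp;

typedef enum { DEMO_PRV_U, DEMO_PRV_S, DEMO_PRV_M } DemoPriv;

/* Simplified state-enable check (assumed behaviour, for illustration). */
static DemoExcp demo_stateen_check(DemoPriv priv, bool virt,
                                   bool mstateen_ctr, bool hstateen_ctr)
{
    if (priv == DEMO_PRV_M) {
        return DEMO_EXCP_NONE;
    }
    if (!mstateen_ctr) {
        return DEMO_EXCP_ILLEGAL_INST; /* all modes below M-mode */
    }
    if (virt && !hstateen_ctr) {
        return DEMO_EXCP_VIRT_INST;
    }
    return DEMO_EXCP_NONE;
}

/* sctrdepth: state-enable faults win, otherwise VS/VU always faults. */
static DemoExcp demo_sctrdepth_access(DemoPriv priv, bool virt,
                                      bool mstateen_ctr, bool hstateen_ctr)
{
    DemoExcp e = demo_stateen_check(priv, virt, mstateen_ctr, hstateen_ctr);

    if (e != DEMO_EXCP_NONE) {
        return e;
    }
    return virt ? DEMO_EXCP_VIRT_INST : DEMO_EXCP_NONE;
}
```

In this model a virtualized access yields a virtual-instruction exception whether hstateen.CTR is set or clear, and an illegal-instruction exception only when mstateen.CTR is clear, which matches the reasoning in the quoted reply.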

Re: [PATCH v3] hw/arm/virt: Avoid unexpected warning from Linux guest on host with Fujitsu CPUs

2024-06-11 Thread Donald Dutile




On 6/11/24 10:05 PM, Zhenyu Zhang wrote:

Multiple warning messages and corresponding backtraces are observed when a Linux
guest is booted on a host with Fujitsu CPUs. One of them is shown below.

[0.032443] [ cut here ]
[0.032446] uart-pl011 900.pl011: ARCH_DMA_MINALIGN smaller than
CTR_EL0.CWG (128 < 256)
[0.032454] WARNING: CPU: 0 PID: 1 at arch/arm64/mm/dma-mapping.c:54
arch_setup_dma_ops+0xbc/0xcc
[0.032470] Modules linked in:
[0.032475] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.14.0-452.el9.aarch64
[0.032481] Hardware name: linux,dummy-virt (DT)
[0.032484] pstate: 6045 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[0.032490] pc : arch_setup_dma_ops+0xbc/0xcc
[0.032496] lr : arch_setup_dma_ops+0xbc/0xcc
[0.032501] sp : 80008003b860
[0.032503] x29: 80008003b860 x28:  x27: aae4b949049c
[0.032510] x26:  x25:  x24: 
[0.032517] x23: 0100 x22:  x21: 
[0.032523] x20: 0001 x19: 2f06c02ea400 x18: 
[0.032529] x17: 208a5f76 x16: 6589dbcb x15: aae4ba071c89
[0.032535] x14:  x13: aae4ba071c84 x12: 455f525443206e61
[0.032541] x11: 68742072656c6c61 x10: 0029 x9 : aae4b7d21da4
[0.032547] x8 : 0029 x7 : 4c414e494d5f414d x6 : 0029
[0.032553] x5 : 000f x4 : aae4b9617a00 x3 : 0001
[0.032558] x2 :  x1 :  x0 : 2f06c029be40
[0.032564] Call trace:
[0.032566]  arch_setup_dma_ops+0xbc/0xcc
[0.032572]  of_dma_configure_id+0x138/0x300
[0.032591]  amba_dma_configure+0x34/0xc0
[0.032600]  really_probe+0x78/0x3dc
[0.032614]  __driver_probe_device+0x108/0x160
[0.032619]  driver_probe_device+0x44/0x114
[0.032624]  __device_attach_driver+0xb8/0x14c
[0.032629]  bus_for_each_drv+0x88/0xe4
[0.032634]  __device_attach+0xb0/0x1e0
[0.032638]  device_initial_probe+0x18/0x20
[0.032643]  bus_probe_device+0xa8/0xb0
[0.032648]  device_add+0x4b4/0x6c0
[0.032652]  amba_device_try_add.part.0+0x48/0x360
[0.032657]  amba_device_add+0x104/0x144
[0.032662]  of_amba_device_create.isra.0+0x100/0x1c4
[0.032666]  of_platform_bus_create+0x294/0x35c
[0.032669]  of_platform_populate+0x5c/0x150
[0.032672]  of_platform_default_populate_init+0xd0/0xec
[0.032697]  do_one_initcall+0x4c/0x2e0
[0.032701]  do_initcalls+0x100/0x13c
[0.032707]  kernel_init_freeable+0x1c8/0x21c
[0.032712]  kernel_init+0x28/0x140
[0.032731]  ret_from_fork+0x10/0x20
[0.032735] ---[ end trace  ]---

In Linux, a check is applied to every device which is exposed through
device-tree node. The warning message is raised when the device isn't
DMA coherent and the cache line size is larger than ARCH_DMA_MINALIGN
(128 bytes). The cache line size is read from CTR_EL0[CWG], which corresponds
to 256 bytes on the guest CPUs. The DMA coherent capability is claimed
through 'dma-coherent' in their device-tree nodes or parent nodes.

Fix the issue by adding 'dma-coherent' property to the device-tree root
node, meaning all devices are capable of DMA coherent by default.

Signed-off-by: Zhenyu Zhang 
---
v3: Add comments explaining why we add 'dma-coherent' property (Peter)
---
  hw/arm/virt.c | 11 +++
  1 file changed, 11 insertions(+)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 3c93c0c0a6..3cefac6d43 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -271,6 +271,17 @@ static void create_fdt(VirtMachineState *vms)
  qemu_fdt_setprop_cell(fdt, "/", "#size-cells", 0x2);
  qemu_fdt_setprop_string(fdt, "/", "model", "linux,dummy-virt");
  
+/*
+ * For QEMU, all DMA is coherent. Advertising this in the root node
+ * has two benefits:
+ *
+ * - It avoids potential bugs where we forget to mark a DMA
+ *   capable device as being dma-coherent
+ * - It avoids spurious warnings from the Linux kernel about
+ *   devices which can't do DMA at all
+ */
+qemu_fdt_setprop(fdt, "/", "dma-coherent", NULL, 0);
+
  /* /chosen must exist for load_dtb to fill in necessary properties later */
  qemu_fdt_add_subnode(fdt, "/chosen");
  if (vms->dtb_randomness) {


+1 to Peter's suggested comment, otherwise, unless privy to this thread,
one would wonder how/why.

Reviewed-by: Donald Dutile 

Re: [PATCH v3] hw/arm/virt: Avoid unexpected warning from Linux guest on host with Fujitsu CPUs

2024-06-11 Thread Gavin Shan

On 6/12/24 12:05, Zhenyu Zhang wrote:

Multiple warning messages and corresponding backtraces are observed when Linux
guest is booted on the host with Fujitsu CPUs. One of them is shown as below.

[0.032443] [ cut here ]
[0.032446] uart-pl011 900.pl011: ARCH_DMA_MINALIGN smaller than
CTR_EL0.CWG (128 < 256)
[0.032454] WARNING: CPU: 0 PID: 1 at arch/arm64/mm/dma-mapping.c:54
arch_setup_dma_ops+0xbc/0xcc
[0.032470] Modules linked in:
[0.032475] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.14.0-452.el9.aarch64
[0.032481] Hardware name: linux,dummy-virt (DT)
[0.032484] pstate: 6045 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[0.032490] pc : arch_setup_dma_ops+0xbc/0xcc
[0.032496] lr : arch_setup_dma_ops+0xbc/0xcc
[0.032501] sp : 80008003b860
[0.032503] x29: 80008003b860 x28:  x27: aae4b949049c
[0.032510] x26:  x25:  x24: 
[0.032517] x23: 0100 x22:  x21: 
[0.032523] x20: 0001 x19: 2f06c02ea400 x18: 
[0.032529] x17: 208a5f76 x16: 6589dbcb x15: aae4ba071c89
[0.032535] x14:  x13: aae4ba071c84 x12: 455f525443206e61
[0.032541] x11: 68742072656c6c61 x10: 0029 x9 : aae4b7d21da4
[0.032547] x8 : 0029 x7 : 4c414e494d5f414d x6 : 0029
[0.032553] x5 : 000f x4 : aae4b9617a00 x3 : 0001
[0.032558] x2 :  x1 :  x0 : 2f06c029be40
[0.032564] Call trace:
[0.032566]  arch_setup_dma_ops+0xbc/0xcc
[0.032572]  of_dma_configure_id+0x138/0x300
[0.032591]  amba_dma_configure+0x34/0xc0
[0.032600]  really_probe+0x78/0x3dc
[0.032614]  __driver_probe_device+0x108/0x160
[0.032619]  driver_probe_device+0x44/0x114
[0.032624]  __device_attach_driver+0xb8/0x14c
[0.032629]  bus_for_each_drv+0x88/0xe4
[0.032634]  __device_attach+0xb0/0x1e0
[0.032638]  device_initial_probe+0x18/0x20
[0.032643]  bus_probe_device+0xa8/0xb0
[0.032648]  device_add+0x4b4/0x6c0
[0.032652]  amba_device_try_add.part.0+0x48/0x360
[0.032657]  amba_device_add+0x104/0x144
[0.032662]  of_amba_device_create.isra.0+0x100/0x1c4
[0.032666]  of_platform_bus_create+0x294/0x35c
[0.032669]  of_platform_populate+0x5c/0x150
[0.032672]  of_platform_default_populate_init+0xd0/0xec
[0.032697]  do_one_initcall+0x4c/0x2e0
[0.032701]  do_initcalls+0x100/0x13c
[0.032707]  kernel_init_freeable+0x1c8/0x21c
[0.032712]  kernel_init+0x28/0x140
[0.032731]  ret_from_fork+0x10/0x20
[0.032735] ---[ end trace  ]---

In Linux, a check is applied to every device which is exposed through
device-tree node. The warning message is raised when the device isn't
DMA coherent and the cache line size is larger than ARCH_DMA_MINALIGN
(128 bytes). The cache line size is read from CTR_EL0[CWG], which corresponds
to 256 bytes on the guest CPUs. The DMA coherent capability is claimed
through 'dma-coherent' in their device-tree nodes or parent nodes.

Fix the issue by adding 'dma-coherent' property to the device-tree root
node, meaning all devices are capable of DMA coherent by default.

Signed-off-by: Zhenyu Zhang 
---
v3: Add comments explaining why we add 'dma-coherent' property (Peter)
---
  hw/arm/virt.c | 11 +++
  1 file changed, 11 insertions(+)



Reviewed-by: Gavin Shan 




[PATCH v3] hw/arm/virt: Avoid unexpected warning from Linux guest on host with Fujitsu CPUs

2024-06-11 Thread Zhenyu Zhang
Multiple warning messages and corresponding backtraces are observed when Linux
guest is booted on the host with Fujitsu CPUs. One of them is shown as below.

[0.032443] [ cut here ]
[0.032446] uart-pl011 900.pl011: ARCH_DMA_MINALIGN smaller than
CTR_EL0.CWG (128 < 256)
[0.032454] WARNING: CPU: 0 PID: 1 at arch/arm64/mm/dma-mapping.c:54
arch_setup_dma_ops+0xbc/0xcc
[0.032470] Modules linked in:
[0.032475] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.14.0-452.el9.aarch64
[0.032481] Hardware name: linux,dummy-virt (DT)
[0.032484] pstate: 6045 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[0.032490] pc : arch_setup_dma_ops+0xbc/0xcc
[0.032496] lr : arch_setup_dma_ops+0xbc/0xcc
[0.032501] sp : 80008003b860
[0.032503] x29: 80008003b860 x28:  x27: aae4b949049c
[0.032510] x26:  x25:  x24: 
[0.032517] x23: 0100 x22:  x21: 
[0.032523] x20: 0001 x19: 2f06c02ea400 x18: 
[0.032529] x17: 208a5f76 x16: 6589dbcb x15: aae4ba071c89
[0.032535] x14:  x13: aae4ba071c84 x12: 455f525443206e61
[0.032541] x11: 68742072656c6c61 x10: 0029 x9 : aae4b7d21da4
[0.032547] x8 : 0029 x7 : 4c414e494d5f414d x6 : 0029
[0.032553] x5 : 000f x4 : aae4b9617a00 x3 : 0001
[0.032558] x2 :  x1 :  x0 : 2f06c029be40
[0.032564] Call trace:
[0.032566]  arch_setup_dma_ops+0xbc/0xcc
[0.032572]  of_dma_configure_id+0x138/0x300
[0.032591]  amba_dma_configure+0x34/0xc0
[0.032600]  really_probe+0x78/0x3dc
[0.032614]  __driver_probe_device+0x108/0x160
[0.032619]  driver_probe_device+0x44/0x114
[0.032624]  __device_attach_driver+0xb8/0x14c
[0.032629]  bus_for_each_drv+0x88/0xe4
[0.032634]  __device_attach+0xb0/0x1e0
[0.032638]  device_initial_probe+0x18/0x20
[0.032643]  bus_probe_device+0xa8/0xb0
[0.032648]  device_add+0x4b4/0x6c0
[0.032652]  amba_device_try_add.part.0+0x48/0x360
[0.032657]  amba_device_add+0x104/0x144
[0.032662]  of_amba_device_create.isra.0+0x100/0x1c4
[0.032666]  of_platform_bus_create+0x294/0x35c
[0.032669]  of_platform_populate+0x5c/0x150
[0.032672]  of_platform_default_populate_init+0xd0/0xec
[0.032697]  do_one_initcall+0x4c/0x2e0
[0.032701]  do_initcalls+0x100/0x13c
[0.032707]  kernel_init_freeable+0x1c8/0x21c
[0.032712]  kernel_init+0x28/0x140
[0.032731]  ret_from_fork+0x10/0x20
[0.032735] ---[ end trace  ]---

In Linux, a check is applied to every device which is exposed through
device-tree node. The warning message is raised when the device isn't
DMA coherent and the cache line size is larger than ARCH_DMA_MINALIGN
(128 bytes). The cache line size is read from CTR_EL0[CWG], which corresponds
to 256 bytes on the guest CPUs. The DMA coherent capability is claimed
through 'dma-coherent' in their device-tree nodes or parent nodes.

Fix the issue by adding 'dma-coherent' property to the device-tree root
node, meaning all devices are capable of DMA coherent by default.

Signed-off-by: Zhenyu Zhang 
---
v3: Add comments explaining why we add 'dma-coherent' property (Peter)
---
 hw/arm/virt.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 3c93c0c0a6..3cefac6d43 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -271,6 +271,17 @@ static void create_fdt(VirtMachineState *vms)
 qemu_fdt_setprop_cell(fdt, "/", "#size-cells", 0x2);
 qemu_fdt_setprop_string(fdt, "/", "model", "linux,dummy-virt");
 
+/*
+ * For QEMU, all DMA is coherent. Advertising this in the root node
+ * has two benefits:
+ *
+ * - It avoids potential bugs where we forget to mark a DMA
+ *   capable device as being dma-coherent
+ * - It avoids spurious warnings from the Linux kernel about
+ *   devices which can't do DMA at all
+ */
+qemu_fdt_setprop(fdt, "/", "dma-coherent", NULL, 0);
+
 /* /chosen must exist for load_dtb to fill in necessary properties later */
 qemu_fdt_add_subnode(fdt, "/chosen");
 if (vms->dtb_randomness) {
-- 
2.43.0
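
For readers following the thread, the condition described in the commit
message can be modelled roughly as below. This is only a sketch under
stated assumptions, not the kernel's code: the helper names and the
ARCH_DMA_MINALIGN_SKETCH constant are made up for illustration, and the
value 128 is taken from the log above. It relies on CTR_EL0.CWG sitting in
bits [27:24] and encoding log2 of the granule in 4-byte words, so CWG=6
gives 256 bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-in; the log above reports 128 for this guest kernel. */
#define ARCH_DMA_MINALIGN_SKETCH 128u

/*
 * Decode CTR_EL0.CWG (bits [27:24]) into a size in bytes.
 * CWG is log2 of the granule measured in 4-byte words.
 * A CWG of 0 means "not provided"; this sketch then falls back to the
 * minimum alignment so that no warning condition is reported.
 */
static unsigned cwg_to_bytes(uint64_t ctr_el0)
{
    unsigned cwg = (unsigned)((ctr_el0 >> 24) & 0xf);

    return cwg ? (4u << cwg) : ARCH_DMA_MINALIGN_SKETCH;
}

/* Hypothetical helper: true when the guest would print the warning. */
static bool dma_minalign_warning(uint64_t ctr_el0, bool dma_coherent)
{
    unsigned line = cwg_to_bytes(ctr_el0);

    assert(line >= 4);
    return !dma_coherent && line > ARCH_DMA_MINALIGN_SKETCH;
}
```

With CWG=6 and a non-coherent device the helper reports the warning; once
the device (or a parent node such as the root) is dma-coherent, it does
not, which is the effect the patch is after.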




RE: [PATCH] ui/gtk: Wait until the current guest frame is rendered before switching to RUN_STATE_SAVE_VM

2024-06-11 Thread Kim, Dongwon
Hi, 

From: Marc-André Lureau  
Sent: Wednesday, June 5, 2024 12:56 AM
To: Kim, Dongwon 
Cc: qemu-devel@nongnu.org; Peter Xu 
Subject: Re: [PATCH] ui/gtk: Wait until the current guest frame is rendered 
before switching to RUN_STATE_SAVE_VM

Hi

On Tue, Jun 4, 2024 at 9:49 PM Kim, Dongwon  
wrote:
On 6/4/2024 4:12 AM, Marc-André Lureau wrote:
> Hi
> 
> On Thu, May 30, 2024 at 2:44 AM wrote:
> 
>     From: Dongwon
> 
>     Make sure rendering of the current frame is finished before switching
>     the run state to RUN_STATE_SAVE_VM by waiting for egl-sync object to be
>     signaled.
> 
> 
> Can you expand on what this solves?

In the current scheme, the guest waits for the fence to be signaled for each
frame it submits before moving to the next frame. If the guest's state
is saved while it is still waiting for the fence, the guest will, once it
is restored to that point, continue to wait for a fence that was signaled
a while ago. One way to prevent this is to let it finish the
current frame before changing the state.

After the UI sets a fence, hw_ops->gl_block(true) gets called, which will block 
virtio-gpu/virgl from processing commands (until the fence is signaled and 
gl_block/false called again).

But this "blocking" state is not saved. So how does this affect save/restore? 
Please give more details, thanks

Yeah, sure. The "blocking" state is not saved, but the guest's state is saved
while it is still waiting for the response to its last resource-flush virtio
message. This virtio response, by the way, is set to be sent to the guest when
the pipeline is unblocked (that is, when the fence is signaled). Once the
guest's state is saved, the current instance of the guest continues and
receives the response as usual. The problem happens when we restore the saved
guest state, because the restored guest will be waiting for a response that
was sent a while ago to the original instance.

> 
> 
>     Cc: Marc-André Lureau
>     Cc: Vivek Kasireddy
>     Signed-off-by: Dongwon Kim
>     ---
>       ui/egl-helpers.c |  2 --
>       ui/gtk.c         | 19 +++
>       2 files changed, 19 insertions(+), 2 deletions(-)
> 
>     diff --git a/ui/egl-helpers.c b/ui/egl-helpers.c
>     index 99b2ebbe23..dafeb36074 100644
>     --- a/ui/egl-helpers.c
>     +++ b/ui/egl-helpers.c
>     @@ -396,8 +396,6 @@ void egl_dmabuf_create_fence(QemuDmaBuf *dmabuf)
>               fence_fd = eglDupNativeFenceFDANDROID(qemu_egl_display,
>                                                     sync);
>               qemu_dmabuf_set_fence_fd(dmabuf, fence_fd);
>     -        eglDestroySyncKHR(qemu_egl_display, sync);
>     -        qemu_dmabuf_set_sync(dmabuf, NULL);
> 
> 
> If this function is called multiple times, it will now set a new 
> fence_fd each time, and potentially leak older fd. Maybe it could first 
> check if a fence_fd exists instead.

We can make that change.
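
A rough sketch of the "check first" idea suggested above, with stand-in
types only. DmaBufSketch and dmabuf_set_fence_fd_once() are hypothetical
names for illustration and are not part of QEMU's QemuDmaBuf API; the
convention that -1 means "no fence pending" is taken from the
qemu_dmabuf_set_fence_fd(dmabuf, -1) call in the quoted diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the dmabuf state; fence_fd == -1 means no fence pending. */
typedef struct {
    int fence_fd;
} DmaBufSketch;

/*
 * Install new_fd only if no fence is pending.
 * Returns false when a fence already exists, so the caller knows it
 * still owns new_fd and the older descriptor is not overwritten (leaked).
 */
static bool dmabuf_set_fence_fd_once(DmaBufSketch *buf, int new_fd)
{
    assert(buf != NULL);

    if (buf->fence_fd >= 0) {
        return false;
    }
    buf->fence_fd = new_fd;
    return true;
}
```

In the real function the check would more likely come before the fence is
created at all, so that no new descriptor needs to be closed.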

> 
>           }
>       }
> 
>     diff --git a/ui/gtk.c b/ui/gtk.c
>     index 93b13b7a30..cf0dd6abed 100644
>     --- a/ui/gtk.c
>     +++ b/ui/gtk.c
>     @@ -600,9 +600,12 @@ void gd_hw_gl_flushed(void *vcon)
> 
>           fence_fd = qemu_dmabuf_get_fence_fd(dmabuf);
>           if (fence_fd >= 0) {
>     +        void *sync = qemu_dmabuf_get_sync(dmabuf);
>               qemu_set_fd_handler(fence_fd, NULL, NULL, NULL);
>               close(fence_fd);
>               qemu_dmabuf_set_fence_fd(dmabuf, -1);
>     +        eglDestroySyncKHR(qemu_egl_display, sync);
>     +        qemu_dmabuf_set_sync(dmabuf, NULL);
>               graphic_hw_gl_block(vc->gfx.dcl.con, false);
>           }
>       }
>     @@ -682,6 +685,22 @@ static const DisplayGLCtxOps egl_ctx_ops = {
>       static void gd_change_runstate(void *opaque, bool running,
>     RunState state)
>       {
>           GtkDisplayState *s = opaque;
>     +    QemuDmaBuf *dmabuf;
>     +    int i;
>     +
>     +    if (state == RUN_STATE_SAVE_VM) {
>     +        for (i = 0; i < s->nb_vcs; i++) {
>     +            VirtualConsole *vc = &s->vc[i];
>     +            dmabuf = vc->gfx.guest_fb.dmabuf;
>     +            if (dmabuf && qemu_dmabuf_get_fence_fd(dmabuf) >= 0) {
>     +                /* wait for the rendering to be completed */
>     +                eglClientWaitSync(qemu_egl_display,
>     +                                  qemu_dmabuf_get_sync(dmabuf),
>     +                                  EGL_SYNC_FLUSH_COMMANDS_BIT_KHR,
>     +                                  10);
> 
> 
>   I don't think adding waiting points in the migration path is 
> appropriate. Perhaps once 

Re: qemu-riscv32 usermode still broken?

2024-06-11 Thread Alistair Francis
On Tue, Jun 11, 2024 at 6:57 PM Andreas K. Huettel  wrote:
>
> Hi Alistair,
>
> >
> > Ok!
> >
> > So on my x86 machine I see this
> >
> > --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=285545,
> > si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
> > wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}],
> > WNOHANG|WSTOPPED|WCONTINUED, NULL) = 285545
> > wait4(-1, 0x7ffe3eeb8210, WNOHANG|WSTOPPED|WCONTINUED, NULL) = 0
> > rt_sigreturn({mask=[INT]})  = 0
> > close(3)= 0
> >
> > It all looks ok.
>
> This was fixed in the meantime (hooray!), sorry I didn't think anyone
> would still look at the old thread. The commit is given below.
>
> Since then we've been able to build riscv32 stages for Gentoo just fine
> using qemu-user, see
> https://www.gentoo.org/downloads/#riscv

Great!

Alistair

>
> Cheers,
> Andreas
>
> commit f0907ff4cae743f1a4ef3d0a55a047029eed06ff
> Author: Richard Henderson 
> AuthorDate: Fri Apr 5 11:58:14 2024 -1000
> Commit: Richard Henderson 
> CommitDate: Tue Apr 9 07:43:11 2024 -1000
>
> linux-user: Fix waitid return of siginfo_t and rusage
>
> The copy back to siginfo_t should be conditional only on arg3,
> not the specific values that might have been written.
> The copy back to rusage was missing entirely.
>
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2262
> Signed-off-by: Richard Henderson 
> Tested-by: Alex Fan 
> Reviewed-by: Philippe Mathieu-Daudé 
>
>
>
> >
> > Maybe the host_to_target_siginfo() function in QEMU is the issue?
> > Something in here?
> > https://github.com/qemu/qemu/blob/master/linux-user/signal.c#L335
> >
> > Nothing jumps out with a quick look though
> >
> > Alistair
> >
> > >
> > >
> > >
> > > --
> > > Andreas K. Hüttel
> > > dilfri...@gentoo.org
> > > Gentoo Linux developer
> > > (council, toolchain, base-system, perl, libreoffice)
> >
>
>
> --
> Andreas K. Hüttel
> dilfri...@gentoo.org
> Gentoo Linux developer
> (council, toolchain, base-system, perl, libreoffice)



Re: [PATCH RESEND 2/6] target/riscv: Introduce extension implied rule helpers

2024-06-11 Thread Frank Chang
On Wed, Jun 5, 2024 at 2:32 PM  wrote:

> From: Frank Chang 
>
> Introduce helpers to enable the extensions based on the implied rules.
> The implied extensions are enabled recursively, so we don't have to
> expand all of them manually. This also eliminates the old-fashioned
> ordering requirement. For example, Zvksg implies Zvks, Zvks implies
> Zvksed, etc., removing the need to check the implied rules of Zvksg
> before Zvks.
>
> Signed-off-by: Frank Chang 
> ---
>  target/riscv/tcg/tcg-cpu.c | 89 ++
>  1 file changed, 89 insertions(+)
>
> diff --git a/target/riscv/tcg/tcg-cpu.c b/target/riscv/tcg/tcg-cpu.c
> index 683f604d9f..899d605d36 100644
> --- a/target/riscv/tcg/tcg-cpu.c
> +++ b/target/riscv/tcg/tcg-cpu.c
> @@ -36,6 +36,9 @@
>  static GHashTable *multi_ext_user_opts;
>  static GHashTable *misa_ext_user_opts;
>
> +static GHashTable *misa_implied_rules;
> +static GHashTable *ext_implied_rules;
> +
>  static bool cpu_cfg_ext_is_user_set(uint32_t ext_offset)
>  {
>  return g_hash_table_contains(multi_ext_user_opts,
> @@ -833,11 +836,95 @@ static void riscv_cpu_validate_profiles(RISCVCPU
> *cpu)
>  }
>  }
>
> +static void riscv_cpu_init_implied_exts_rules(void)
> +{
> +RISCVCPUImpliedExtsRule *rule;
> +int i;
> +
> +for (i = 0; (rule = riscv_misa_implied_rules[i]); i++) {
> +g_hash_table_insert(misa_implied_rules,
> GUINT_TO_POINTER(rule->ext),
> +(gpointer)rule);
> +}
> +
> +for (i = 0; (rule = riscv_ext_implied_rules[i]); i++) {
> +g_hash_table_insert(ext_implied_rules,
> GUINT_TO_POINTER(rule->ext),
> +(gpointer)rule);
> +}
> +}
> +
> +static void cpu_enable_implied_rule(RISCVCPU *cpu,
> +RISCVCPUImpliedExtsRule *rule)
> +{
> +CPURISCVState *env = &cpu->env;
> +RISCVCPUImpliedExtsRule *ir;
> +target_ulong hartid = 0;
> +int i;
> +
> +#if !defined(CONFIG_USER_ONLY)
> +hartid = env->mhartid;
> +#endif
> +
> +if (!(rule->enabled & BIT_ULL(hartid))) {
> +/* Enable the implied MISAs. */
> +if (rule->implied_misas) {
> +riscv_cpu_set_misa_ext(env, env->misa_ext |
> rule->implied_misas);
> +
> +for (i = 0; misa_bits[i] != 0; i++) {
> +if (rule->implied_misas & misa_bits[i]) {
> +ir = g_hash_table_lookup(misa_implied_rules,
> +
>  GUINT_TO_POINTER(misa_bits[i]));
> +
> +if (ir) {
> +cpu_enable_implied_rule(cpu, ir);
> +}
> +}
> +}
> +}
> +
> +/* Enable the implied extensions. */
> +for (i = 0; rule->implied_exts[i] != RISCV_IMPLIED_EXTS_RULE_END;
> i++) {
> +cpu_cfg_ext_auto_update(cpu, rule->implied_exts[i], true);
> +
> +ir = g_hash_table_lookup(ext_implied_rules,
> +
>  GUINT_TO_POINTER(rule->implied_exts[i]));
> +
> +if (ir) {
> +cpu_enable_implied_rule(cpu, ir);
> +}
> +}
> +
> +rule->enabled |= BIT_ULL(hartid);
>

Should I use the qatomic API here to set the enabled bitmask?

This wouldn't impact the results, but without it the implied rules may
be traversed and re-enabled (which does no harm) if the enabled bit
of a hart is accidentally cleared by another hart.


> +}
> +}
> +
> +static void riscv_cpu_enable_implied_rules(RISCVCPU *cpu)
> +{
> +RISCVCPUImpliedExtsRule *rule;
> +int i;
> +
> +/* Enable the implied MISAs. */
> +for (i = 0; (rule = riscv_misa_implied_rules[i]); i++) {
> +if (riscv_has_ext(&cpu->env, rule->ext)) {
> +cpu_enable_implied_rule(cpu, rule);
> +}
> +}
> +
> +/* Enable the implied extensions. */
> +for (i = 0; (rule = riscv_ext_implied_rules[i]); i++) {
> +if (isa_ext_is_enabled(cpu, rule->ext)) {
> +cpu_enable_implied_rule(cpu, rule);
> +}
> +}
> +}
> +
>  void riscv_tcg_cpu_finalize_features(RISCVCPU *cpu, Error **errp)
>  {
>  CPURISCVState *env = &cpu->env;
>  Error *local_err = NULL;
>
> +riscv_cpu_init_implied_exts_rules();
> +riscv_cpu_enable_implied_rules(cpu);
> +
>  riscv_cpu_validate_misa_priv(env, &local_err);
>  if (local_err != NULL) {
>  error_propagate(errp, local_err);
> @@ -1343,6 +1430,8 @@ static void riscv_tcg_cpu_instance_init(CPUState *cs)
>
>  misa_ext_user_opts = g_hash_table_new(NULL, g_direct_equal);
>  multi_ext_user_opts = g_hash_table_new(NULL, g_direct_equal);
> +misa_implied_rules = g_hash_table_new(NULL, g_direct_equal);
> +ext_implied_rules = g_hash_table_new(NULL, g_direct_equal);
>  riscv_cpu_add_user_properties(obj);
>
>  if (riscv_cpu_has_max_extensions(obj)) {
> --
> 2.43.2
>
>
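
On the atomics question raised inline above, the lost-update concern can
be illustrated with an atomic OR on the per-rule bitmask. The sketch below
uses C11 <stdatomic.h> rather than QEMU's qatomic helpers, and
ImpliedRuleSketch, rule_mark_enabled() and rule_is_enabled() are
hypothetical names that only mirror the 'enabled' field in the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the 'enabled' bitmask of an implied-extension rule. */
typedef struct {
    _Atomic uint64_t enabled;   /* one bit per hart */
} ImpliedRuleSketch;

/*
 * Atomically set the bit for hartid.
 * Returns true if this call was the one that set it, false if the bit
 * was already set. Because the OR is atomic, concurrent callers for
 * other harts cannot clear or lose this bit.
 */
static bool rule_mark_enabled(ImpliedRuleSketch *rule, unsigned hartid)
{
    uint64_t bit;
    uint64_t old;

    assert(hartid < 64);
    bit = UINT64_C(1) << hartid;
    old = atomic_fetch_or(&rule->enabled, bit);
    return !(old & bit);
}

/* Read back whether the rule has been applied for hartid. */
static bool rule_is_enabled(ImpliedRuleSketch *rule, unsigned hartid)
{
    assert(hartid < 64);
    return (atomic_load(&rule->enabled) >> hartid) & 1;
}
```

A plain "rule->enabled |= bit" is a read-modify-write, so two harts
updating the same rule at once can overwrite each other's bit; the atomic
OR avoids that without needing a lock.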


[PATCH] accel/tcg: Fix typo causing tb->page_addr[1] to not be recorded

2024-06-11 Thread Anton Johansson via
For TBs crossing page boundaries, the 2nd page will never be
recorded/removed, as the index of the 2nd page is computed from the
address of the 1st page. This is due to a typo, fix it.

Signed-off-by: Anton Johansson 
---
 accel/tcg/tb-maint.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/accel/tcg/tb-maint.c b/accel/tcg/tb-maint.c
index 19ae6793f3..cc0f5afd47 100644
--- a/accel/tcg/tb-maint.c
+++ b/accel/tcg/tb-maint.c
@@ -713,7 +713,7 @@ static void tb_record(TranslationBlock *tb)
 tb_page_addr_t paddr0 = tb_page_addr0(tb);
 tb_page_addr_t paddr1 = tb_page_addr1(tb);
 tb_page_addr_t pindex0 = paddr0 >> TARGET_PAGE_BITS;
-tb_page_addr_t pindex1 = paddr0 >> TARGET_PAGE_BITS;
+tb_page_addr_t pindex1 = paddr1 >> TARGET_PAGE_BITS;
 
 assert(paddr0 != -1);
 if (unlikely(paddr1 != -1) && pindex0 != pindex1) {
@@ -745,7 +745,7 @@ static void tb_remove(TranslationBlock *tb)
 tb_page_addr_t paddr0 = tb_page_addr0(tb);
 tb_page_addr_t paddr1 = tb_page_addr1(tb);
 tb_page_addr_t pindex0 = paddr0 >> TARGET_PAGE_BITS;
-tb_page_addr_t pindex1 = paddr0 >> TARGET_PAGE_BITS;
+tb_page_addr_t pindex1 = paddr1 >> TARGET_PAGE_BITS;
 
 assert(paddr0 != -1);
 if (unlikely(paddr1 != -1) && pindex0 != pindex1) {
-- 
2.45.0
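
To make the effect of the typo concrete, here is a small stand-alone model
of the condition used in tb_record()/tb_remove(). It is a sketch only:
PAGE_BITS_SKETCH (4 KiB pages), page_addr_t and second_page_handled() are
illustrative names, not QEMU definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_BITS_SKETCH 12          /* assume 4 KiB target pages */

typedef uint64_t page_addr_t;        /* stand-in for tb_page_addr_t */

/*
 * Mirrors the condition "paddr1 != -1 && pindex0 != pindex1".
 * With with_typo set, pindex1 is derived from paddr0 as in the old code,
 * so it always equals pindex0 and the second page is never handled.
 */
static bool second_page_handled(page_addr_t paddr0, page_addr_t paddr1,
                                bool with_typo)
{
    page_addr_t pindex0 = paddr0 >> PAGE_BITS_SKETCH;
    page_addr_t pindex1 = (with_typo ? paddr0 : paddr1) >> PAGE_BITS_SKETCH;

    assert(paddr0 != (page_addr_t)-1);
    return paddr1 != (page_addr_t)-1 && pindex0 != pindex1;
}
```

For a TB starting at 0x1ff0 whose second page begins at 0x2000, the old
computation yields pindex0 == pindex1 == 1 and skips the second page,
while the fixed one yields 1 and 2.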




Re: [PATCH v3 1/6] Add an "info pg" command that prints the current page tables

2024-06-11 Thread Don Porter

On 6/7/24 3:16 AM, Daniel P. Berrangé wrote:

On Thu, Jun 06, 2024 at 10:02:48AM -0400, Don Porter wrote:
Please don't add new HMP commands that don't have a QMP
equivalent.

This should be adding an 'x-query-pg' QMP command, which
returns HumanReadableText, and then call that from the HMP

There is guidance on this here:

   
https://www.qemu.org/docs/master/devel/writing-monitor-commands.html#writing-a-debugging-aid-returning-unstructured-text

If you need more real examples, look at the various
'x-query-' commands in qapi/machine.json  and
their impl.


Thank you both for the pointers.  This makes sense to me;
outputting a string is much cleaner.  Will implement in v4...

-dp




Re: [RFC PATCH v2 0/2] ui/gtk: Introduce new param - Connectors

2024-06-11 Thread Kim, Dongwon

Hi Marc-André,

On 6/5/2024 12:26 AM, Marc-André Lureau wrote:

Hi

On Tue, Jun 4, 2024 at 9:59 PM Kim, Dongwon wrote:


Hi Marc-André,

On 6/4/2024 3:37 AM, Marc-André Lureau wrote:
 > Hi
 >
 > On Fri, May 31, 2024 at 11:00 PM dongwon@intel.com wrote:
 >
 >     From: Dongwon Kim
 >
 >     This patch series is a replacement of
 >     https://mail.gnu.org/archive/html/qemu-devel/2023-06/msg03989.html
 >
 >     There is a need, expressed by several users, to assign ownership of one
 >     or more physical monitors/connectors to individual guests. This creates
 >     a clear notion of which guest's contents are being displayed on any
 >     given monitor. Given that there is always a display server/compositor
 >     running on the host, monitor ownership can never truly be transferred
 >     to guests. However, the closest approximation is to request the host
 >     compositor to fullscreen the guest's windows on individual monitors.
 >     This allows for various configurations, such as displaying four
 >     different guests' windows on four different monitors, a single guest's
 >     windows (or virtual consoles) on four monitors, or any similar
 >     combination.
 >
 >     This patch series attempts to accomplish this by introducing a new
 >     parameter named "connector" to assign monitors to the GFX VCs associated
 >     with a guest. If the assigned monitor is not connected, the guest's
 >     window will not be displayed, similar to how a host compositor behaves
 >     when connectors are not connected. Once the monitor is hot-plugged, the
 >     guest's window(s) will be positioned on the assigned monitor.
 >
 >     Usage example:
 >
 >     -display gtk,gl=on,connectors=DP-1:eDP-1:HDMI-2...
 >
 >     In this example, the first graphics virtual console will be placed
 >     on the DP-1 display, the second on eDP-1, and the third on HDMI-2.
 >
 >
 > Unfortunately, this approach with GTK is doomed. gtk4 dropped the
 > gtk_window_set_position() altogether.

Do you mean we have a plan to lift GTK version in QEMU? Are we going to
lose all GTK3 specific features?


No concrete plan, no. But eventually GTK3 will go away some day.


There are users who still rely on features provided by GTK3, and we also
have customers moving from VMware and VirtualBox who have
requested this feature. Their use cases are current and active. If
window repositioning is no longer supported someday, we would need to
make this feature obsolete, but many users/customers would benefit from
it until then.




fwiw, I wish QEMU wouldn't have N built-in UIs/Spice/VNC, but different
projects elsewhere using -display dbus. There is
https://gitlab.gnome.org/GNOME/libmks or
https://gitlab.com/marcandre.lureau/qemu-display gtk4 efforts.


As you know, there cannot be a one-size-fits-all solution that would
work for all the users, which is probably why there are many QEMU UIs.





 >
 > It's not even clear how the different monitors/outputs/connectors are
 > actually named, whether they are stable etc (not mentioning the
 > portability).
 >
 > Window placement & geometry is a job for the compositor. Can you discuss
 > this issue with GTK devs & the compositor you are targeting?

I guess you are talking about Wayland compositors. We are mainly using
Xorg on the host and this feature works pretty well on it. I am


Xorg may not be going away soon, but it's used less and less. As one of
the developers, I have not been running/testing it for a long time. I
wish we would just drop its support tbh.


There are features offered by Xorg that are not offered by Wayland
compositors and again, we have customers that rely on these features.
One of them is the ability to position the window via
gtk_window_set_position(). There are strong arguments
made on either side when it comes to window positioning:
https://gitlab.freedesktop.org/wayland/wayland-protocols/-/merge_requests/247

Until there is a way to do this with Wayland compositors, we have to
unfortunately rely on Gnome + Xorg.




wondering if we limit the feature to Xorg case or adding some warning
messages with error return 

Re: [RFC PATCH v1 1/6] build-sys: Add rust feature option

2024-06-11 Thread Stefan Hajnoczi
On Tue, 11 Jun 2024 at 13:54, Manos Pitsidianakis
 wrote:
>
> On Tue, 11 Jun 2024 at 17:05, Stefan Hajnoczi  wrote:
> >
> > On Mon, Jun 10, 2024 at 09:22:36PM +0300, Manos Pitsidianakis wrote:
> > > Add options for Rust in meson_options.txt, meson.build, configure to
> > > prepare for adding Rust code in the followup commits.
> > >
> > > `rust` is a reserved meson name, so we have to use an alternative.
> > > `with_rust` was chosen.
> > >
> > > Signed-off-by: Manos Pitsidianakis 
> > > ---
> > > The cargo wrapper script hardcodes some rust target triples. This is
> > > just temporary.
> > > ---
> > >  .gitignore   |   2 +
> > >  configure|  12 +++
> > >  meson.build  |  11 ++
> > >  meson_options.txt|   4 +
> > >  scripts/cargo_wrapper.py | 211 +++
> > >  5 files changed, 240 insertions(+)
> > >  create mode 100644 scripts/cargo_wrapper.py
> > >
> > > diff --git a/.gitignore b/.gitignore
> > > index 61fa39967b..f42b0d937e 100644
> > > --- a/.gitignore
> > > +++ b/.gitignore
> > > @@ -2,6 +2,8 @@
> > >  /build/
> > >  /.cache/
> > >  /.vscode/
> > > +/target/
> > > +rust/**/target
> >
> > Are these necessary since the cargo build command-line below uses
> > --target-dir ?
> >
> > Adding new build output directories outside build/ makes it harder to
> > clean up the source tree and ensure no state from previous builds
> > remains.
>
> Agreed! These build directories would show up when using cargo
> directly instead of through the cargo_wrapper.py script, i.e. during
> development. I'd consider it an edge case, it won't happen much and if
> it does it's better to gitignore them than accidentally checking them
> in. Also, whatever artifacts are in a `target` directory won't be used
> for compilation with qemu inside a build directory.

Why would someone bypass the build system? I don't think we should
encourage developers to do this.

>
>
> > >  *.pyc
> > >  .sdk
> > >  .stgit-*
> > > diff --git a/configure b/configure
> > > index 38ee257701..c195630771 100755
> > > --- a/configure
> > > +++ b/configure
> > > @@ -302,6 +302,9 @@ else
> > >objcc="${objcc-${cross_prefix}clang}"
> > >  fi
> > >
> > > +with_rust="auto"
> > > +with_rust_target_triple=""
> > > +
> > >  ar="${AR-${cross_prefix}ar}"
> > >  as="${AS-${cross_prefix}as}"
> > >  ccas="${CCAS-$cc}"
> > > @@ -760,6 +763,12 @@ for opt do
> > >;;
> > >--gdb=*) gdb_bin="$optarg"
> > >;;
> > > +  --enable-rust) with_rust=enabled
> > > +  ;;
> > > +  --disable-rust) with_rust=disabled
> > > +  ;;
> > > +  --rust-target-triple=*) with_rust_target_triple="$optarg"
> > > +  ;;
> > ># everything else has the same name in configure and meson
> > >--*) meson_option_parse "$opt" "$optarg"
> > >;;
> > > @@ -1796,6 +1805,9 @@ if test "$skip_meson" = no; then
> > >test -n "${LIB_FUZZING_ENGINE+xxx}" && meson_option_add 
> > > "-Dfuzzing_engine=$LIB_FUZZING_ENGINE"
> > >test "$plugins" = yes && meson_option_add "-Dplugins=true"
> > >test "$tcg" != enabled && meson_option_add "-Dtcg=$tcg"
> > > +  test "$with_rust" != enabled && meson_option_add 
> > > "-Dwith_rust=$with_rust"
> > > +  test "$with_rust" != enabled && meson_option_add 
> > > "-Dwith_rust=$with_rust"
> >
> > Duplicate line.
>
> Thanks!
>
> >
> > > +  test "$with_rust_target_triple" != "" && meson_option_add 
> > > "-Dwith_rust_target_triple=$with_rust_target_triple"
> > >run_meson() {
> > >  NINJA=$ninja $meson setup "$@" "$PWD" "$source_path"
> > >}
> > > diff --git a/meson.build b/meson.build
> > > index a9de71d450..3533889852 100644
> > > --- a/meson.build
> > > +++ b/meson.build
> > > @@ -290,6 +290,12 @@ foreach lang : all_languages
> > >endif
> > >  endforeach
> > >
> > > +cargo = not_found
> > > +if get_option('with_rust').allowed()
> > > +  cargo = find_program('cargo', required: get_option('with_rust'))
> > > +endif
> > > +with_rust = cargo.found()
> > > +
> > >  # default flags for all hosts
> > >  # We use -fwrapv to tell the compiler that we require a C dialect where
> > >  # left shift of signed integers is well defined and has the expected
> > > @@ -2066,6 +2072,7 @@ endif
> > >
> > >  config_host_data = configuration_data()
> > >
> > > +config_host_data.set('CONFIG_WITH_RUST', with_rust)
> > >  audio_drivers_selected = []
> > >  if have_system
> > >audio_drivers_available = {
> > > @@ -4190,6 +4197,10 @@ if 'objc' in all_languages
> > >  else
> > >summary_info += {'Objective-C compiler': false}
> > >  endif
> > > +summary_info += {'Rust support':  with_rust}
> > > +if with_rust and get_option('with_rust_target_triple') != ''
> > > +  summary_info += {'Rust target': 
> > > get_option('with_rust_target_triple')}
> > > +endif
> > >  option_cflags = (get_option('debug') ? ['-g'] : [])
> > >  if get_option('optimization') != 'plain'
> > >option_cflags += ['-O' + get_option('optimization')]
> > > diff --git a/meson_options.txt 

Re: [PATCH v4 2/4] vvfat: Fix usage of `info.file.offset`

2024-06-11 Thread Kevin Wolf
Am 11.06.2024 um 18:22 hat Amjad Alsharafi geschrieben:
> On Tue, Jun 11, 2024 at 04:30:53PM +0200, Kevin Wolf wrote:
> > Am 11.06.2024 um 14:31 hat Amjad Alsharafi geschrieben:
> > > On Mon, Jun 10, 2024 at 06:49:43PM +0200, Kevin Wolf wrote:
> > > > Am 05.06.2024 um 02:58 hat Amjad Alsharafi geschrieben:
> > > > > The field is marked as "the offset in the file (in clusters)", but it
> > > > > was being used like this
> > > > > `cluster_size*(nums)+mapping->info.file.offset`, which is incorrect.
> > > > > 
> > > > > Additionally, removed the `abort` when `first_mapping_index` does not
> > > > > match, as this matches the case when adding new clusters for files, and
> > > > > it's inevitable that we reach this condition when doing that if the
> > > > > clusters are not after one another, so there is no reason to `abort`
> > > > > here, execution continues and the new clusters are written to disk
> > > > > correctly.
> > > > > 
> > > > > Signed-off-by: Amjad Alsharafi 
> > > > 
> > > > Can you help me understand how first_mapping_index really works?
> > > > 
> > > > It seems to me that you get a chain of mappings for each file on the FAT
> > > > filesystem, which are just the contiguous areas in it, and
> > > > first_mapping_index refers to the mapping at the start of the file. But
> > > > for much of the time, it actually doesn't seem to be set at all, so you
> > > > have mapping->first_mapping_index == -1. Do you understand the rules
> > > > around when it's set and when it isn't?
> > > 
> > > Yeah. So `first_mapping_index` is the index of the first mapping, each
> > > mapping is a group of clusters that are contiguous in the file.
> > > It's mostly `-1` because the first mapping will have the value set as
> > > `-1` and not its own index; this value will only be set when the file
> > > contains more than one mapping, and this will only happen when you add
> > > clusters to a file that are not contiguous with the existing clusters.
> > 
> > Ah, that makes some sense. Not sure if it's optimal, but it's a rule I
> > can work with. So just to confirm, this is the invariant that we think
> > should always hold true, right?
> > 
> > assert((mapping->mode & MODE_DIRECTORY) ||
> >!mapping->info.file.offset ||
> >mapping->first_mapping_index > 0);
> > 
> 
> Yes.
> 
> We can add this into `get_cluster_count_for_direntry` loop.

Maybe even find_mapping_for_cluster() because we think it should apply
always? It's called by get_cluster_count_for_direntry(), but also by
other functions.

Either way, I think this should be a separate patch.
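
To make the proposed invariant concrete, here is a minimal standalone
sketch. The `toy_mapping` struct, the `TOY_MODE_DIRECTORY` flag and the
`toy_mapping_is_consistent()` helper are hypothetical stand-ins for
illustration, not the actual vvfat definitions; only the three-way
condition is taken from the assert quoted above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag value; the real MODE_DIRECTORY lives in vvfat. */
#define TOY_MODE_DIRECTORY 0x4

/* Hypothetical, simplified stand-in for a vvfat mapping. */
typedef struct toy_mapping {
    int mode;                 /* bitmask; TOY_MODE_DIRECTORY marks a directory */
    uint32_t file_offset;     /* offset in the file, in clusters */
    int first_mapping_index;  /* -1 on the first mapping of a file */
} toy_mapping;

/*
 * Mirrors the condition from the assert: a mapping is consistent if it
 * is a directory, or starts at file offset 0, or refers back to a first
 * mapping through a positive index.
 */
static bool toy_mapping_is_consistent(const toy_mapping *m)
{
    return (m->mode & TOY_MODE_DIRECTORY) ||
           !m->file_offset ||
           m->first_mapping_index > 0;
}
```

In this model the only rejected case is a file mapping that starts at a
non-zero offset but still carries `first_mapping_index == -1`.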

> I'm thinking of also converting those `abort` into `assert`, since
> the line `copy_it = 1;` was confusing me, since it was after the `abort`.

I agree for the abort() that you removed, but I'm not sure about the
other one. I have a feeling the copy_it = 1 might actually be correct
there (if the copying logic is implemented correctly; I didn't check
that).

> > > And actually, thanks to that I noticed another bug not fixed in PATCH 3, 
> > > We are doing this check 
> > > `s->current_mapping->first_mapping_index != mapping->first_mapping_index`
> > > to know if we should switch to the new mapping or not. 
> > > If we were reading from the first mapping (`first_mapping_index == -1`)
> > > and we jumped to the second mapping (`first_mapping_index == n`), we
> > > will catch this condition and switch to the new mapping.
> > > 
> > > But if the file has more than 2 mappings, and we jumped to the 3rd
> > > mapping, we will not catch this since (`first_mapping_index == n`) for
> > > both of them haha. I think a better check is to check the `mapping`
> > > pointer directly. (I'll add it also in the next series together with a
> > > test for it.)
> > 
> > This comparison is exactly what confused me. I didn't realise that the
> > first mapping in the chain has a different value here, so I thought this
> > must mean that we're looking at a different file now - but of course I
> > couldn't see a reason for that because we're iterating through a single
> > file in this function.
> > 
> > But even now that I know that the condition triggers when switching from
> > the first to the second mapping, it doesn't make sense to me. We don't
> > have to copy things around just because a file is non-contiguous.
> > 
> > What we want to catch is if the order of mappings has changed compared
> > to the old state. Do we need a linked list, maybe a prev_mapping_index,
> > instead of first_mapping_index so that we can compare if it is still the
> > same as before?
> 
> I think this would be the better design (tbh, that's what I thought 
> `first_mapping_index` would do), though not sure if other components
> depend so much on the current design that it would be hard to change.
> 
> I'll try to implement this `prev_mapping_index` and see how it goes.

Let's try not to do too much at once. We know that vvfat is a mess,
nobody fully understands it, and the write support 

Re: [RFC PATCH v1 1/6] build-sys: Add rust feature option

2024-06-11 Thread Manos Pitsidianakis
On Tue, 11 Jun 2024 at 17:05, Stefan Hajnoczi  wrote:
>
> On Mon, Jun 10, 2024 at 09:22:36PM +0300, Manos Pitsidianakis wrote:
> > Add options for Rust in meson_options.txt, meson.build, configure to
> > prepare for adding Rust code in the followup commits.
> >
> > `rust` is a reserved meson name, so we have to use an alternative.
> > `with_rust` was chosen.
> >
> > Signed-off-by: Manos Pitsidianakis 
> > ---
> > The cargo wrapper script hardcodes some rust target triples. This is
> > just temporary.
> > ---
> >  .gitignore   |   2 +
> >  configure|  12 +++
> >  meson.build  |  11 ++
> >  meson_options.txt|   4 +
> >  scripts/cargo_wrapper.py | 211 +++
> >  5 files changed, 240 insertions(+)
> >  create mode 100644 scripts/cargo_wrapper.py
> >
> > diff --git a/.gitignore b/.gitignore
> > index 61fa39967b..f42b0d937e 100644
> > --- a/.gitignore
> > +++ b/.gitignore
> > @@ -2,6 +2,8 @@
> >  /build/
> >  /.cache/
> >  /.vscode/
> > +/target/
> > +rust/**/target
>
> Are these necessary since the cargo build command-line below uses
> --target-dir ?
>
> Adding new build output directories outside build/ makes it harder to
> clean up the source tree and ensure no state from previous builds
> remains.

Agreed! These build directories would show up when using cargo
directly instead of through the cargo_wrapper.py script, i.e. during
development. I'd consider it an edge case: it won't happen much, and if
it does, it's better to gitignore them than to accidentally check them
in. Also, whatever artifacts are in a `target` directory won't be used
for compilation with qemu inside a build directory.


> >  *.pyc
> >  .sdk
> >  .stgit-*
> > diff --git a/configure b/configure
> > index 38ee257701..c195630771 100755
> > --- a/configure
> > +++ b/configure
> > @@ -302,6 +302,9 @@ else
> >objcc="${objcc-${cross_prefix}clang}"
> >  fi
> >
> > +with_rust="auto"
> > +with_rust_target_triple=""
> > +
> >  ar="${AR-${cross_prefix}ar}"
> >  as="${AS-${cross_prefix}as}"
> >  ccas="${CCAS-$cc}"
> > @@ -760,6 +763,12 @@ for opt do
> >;;
> >--gdb=*) gdb_bin="$optarg"
> >;;
> > +  --enable-rust) with_rust=enabled
> > +  ;;
> > +  --disable-rust) with_rust=disabled
> > +  ;;
> > +  --rust-target-triple=*) with_rust_target_triple="$optarg"
> > +  ;;
> ># everything else has the same name in configure and meson
> >--*) meson_option_parse "$opt" "$optarg"
> >;;
> > @@ -1796,6 +1805,9 @@ if test "$skip_meson" = no; then
> >test -n "${LIB_FUZZING_ENGINE+xxx}" && meson_option_add 
> > "-Dfuzzing_engine=$LIB_FUZZING_ENGINE"
> >test "$plugins" = yes && meson_option_add "-Dplugins=true"
> >test "$tcg" != enabled && meson_option_add "-Dtcg=$tcg"
> > +  test "$with_rust" != enabled && meson_option_add "-Dwith_rust=$with_rust"
> > +  test "$with_rust" != enabled && meson_option_add "-Dwith_rust=$with_rust"
>
> Duplicate line.

Thanks!

>
> > +  test "$with_rust_target_triple" != "" && meson_option_add "-Dwith_rust_target_triple=$with_rust_target_triple"
> >run_meson() {
> >  NINJA=$ninja $meson setup "$@" "$PWD" "$source_path"
> >}
> > diff --git a/meson.build b/meson.build
> > index a9de71d450..3533889852 100644
> > --- a/meson.build
> > +++ b/meson.build
> > @@ -290,6 +290,12 @@ foreach lang : all_languages
> >endif
> >  endforeach
> >
> > +cargo = not_found
> > +if get_option('with_rust').allowed()
> > +  cargo = find_program('cargo', required: get_option('with_rust'))
> > +endif
> > +with_rust = cargo.found()
> > +
> >  # default flags for all hosts
> >  # We use -fwrapv to tell the compiler that we require a C dialect where
> >  # left shift of signed integers is well defined and has the expected
> > @@ -2066,6 +2072,7 @@ endif
> >
> >  config_host_data = configuration_data()
> >
> > +config_host_data.set('CONFIG_WITH_RUST', with_rust)
> >  audio_drivers_selected = []
> >  if have_system
> >audio_drivers_available = {
> > @@ -4190,6 +4197,10 @@ if 'objc' in all_languages
> >  else
> >summary_info += {'Objective-C compiler': false}
> >  endif
> > +summary_info += {'Rust support':  with_rust}
> > +if with_rust and get_option('with_rust_target_triple') != ''
> > +  summary_info += {'Rust target': get_option('with_rust_target_triple')}
> > +endif
> >  option_cflags = (get_option('debug') ? ['-g'] : [])
> >  if get_option('optimization') != 'plain'
> >option_cflags += ['-O' + get_option('optimization')]
> > diff --git a/meson_options.txt b/meson_options.txt
> > index 4c1583eb40..223491b731 100644
> > --- a/meson_options.txt
> > +++ b/meson_options.txt
> > @@ -366,3 +366,7 @@ option('qemu_ga_version', type: 'string', value: '',
> >
> >  option('hexagon_idef_parser', type : 'boolean', value : true,
> > description: 'use idef-parser to automatically generate TCG code 
> > for the Hexagon frontend')
> > +option('with_rust', type: 'feature', value: 'auto',
> > +   description: 

Re: [PATCH v4 5/5] iotests: add backup-discard-source

2024-06-11 Thread Kevin Wolf
Am 13.03.2024 um 16:28 hat Vladimir Sementsov-Ogievskiy geschrieben:
> Add test for a new backup option: discard-source.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> Reviewed-by: Fiona Ebner 
> Tested-by: Fiona Ebner 

This test fails for me, and it already does so after this commit that
introduced it. I haven't checked what get_actual_size() does, but I'm running
on XFS, so its preallocation could be causing this. We generally avoid
checking the number of allocated blocks in image files for this reason.

Kevin


backup-discard-source   fail   [19:45:49] [19:45:50]   0.8s 
failed, exit status 1
--- /home/kwolf/source/qemu/tests/qemu-iotests/tests/backup-discard-source.out
+++ /home/kwolf/source/qemu/build-clang/scratch/qcow2-file-backup-discard-source/backup-discard-source.out.bad
@@ -1,5 +1,14 @@
-..
+F.
+==
+FAIL: test_discard_cbw (__main__.TestBackup.test_discard_cbw)
+1. do backup(discard_source=True), which should inform
+--
+Traceback (most recent call last):
+  File "/home/kwolf/source/qemu/tests/qemu-iotests/tests/backup-discard-source", line 147, in test_discard_cbw
+self.assertLess(get_actual_size(self.vm, 'temp'), 512 * 1024)
+AssertionError: 1249280 not less than 524288
+
 --
 Ran 2 tests

-OK
+FAILED (failures=1)
Failures: backup-discard-source
Failed 1 of 1 iotests




Re: [PATCH v1] virtio-iommu: add error check before assert

2024-06-11 Thread Manos Pitsidianakis
On Tue, 11 Jun 2024 at 18:01, Philippe Mathieu-Daudé  wrote:
>
> On 11/6/24 14:23, Manos Pitsidianakis wrote:
> > A fuzzer case discovered by Zheyu Ma causes an assert failure.
> >
> > Add a check before the assert, and respond with an error before moving
> > on to the next queue element.
> >
> > To reproduce the failure:
> >
> > cat << EOF | \
> > qemu-system-x86_64 \
> > -display none -machine accel=qtest -m 512M -machine q35 -nodefaults \
> > -device virtio-iommu -qtest stdio
> > outl 0xcf8 0x8804
> > outw 0xcfc 0x06
> > outl 0xcf8 0x8820
> > outl 0xcfc 0xe0004000
> > write 0x1e 0x1 0x01
> > write 0xe0004020 0x4 0x1000
> > write 0xe0004028 0x4 0x00101000
> > write 0xe000401c 0x1 0x01
> > write 0x106000 0x1 0x05
> > write 0x11 0x1 0x60
> > write 0x12 0x1 0x10
> > write 0x19 0x1 0x04
> > write 0x1c 0x1 0x01
> > write 0x100018 0x1 0x04
> > write 0x10001c 0x1 0x02
> > write 0x101003 0x1 0x01
> > write 0xe0007001 0x1 0x00
> > EOF
> >
> > Reported-by: Zheyu Ma 
> > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2359
> > Signed-off-by: Manos Pitsidianakis 
> > ---
> >   hw/virtio/virtio-iommu.c | 12 
> >   1 file changed, 12 insertions(+)
> >
> > diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
> > index 1326c6ec41..9b99def39f 100644
> > --- a/hw/virtio/virtio-iommu.c
> > +++ b/hw/virtio/virtio-iommu.c
> > @@ -818,6 +818,18 @@ static void virtio_iommu_handle_command(VirtIODevice 
> > *vdev, VirtQueue *vq)
> >   out:
> >   sz = iov_from_buf(elem->in_sg, elem->in_num, 0,
> > buf ? buf : &tail, output_size);
> > +if (unlikely(sz != output_size)) {
>
> Is this a normal guest behavior? Should we log it as GUEST_ERROR?

It's not; it'd be a misuse of the virtio spec (implementation) by the guest.
The internal device error (VIRTIO_IOMMU_S_DEVERR) would be logged by
the kernel; should we log it as well?



[PULL 3/8] aio: warn about iohandler_ctx special casing

2024-06-11 Thread Kevin Wolf
From: Stefan Hajnoczi 

The main loop has two AioContexts: qemu_aio_context and iohandler_ctx.
The main loop runs them both, but nested aio_poll() calls on
qemu_aio_context exclude iohandler_ctx.

Which one should qemu_get_current_aio_context() return when called from
the main loop? Document that it's always qemu_aio_context.

This has subtle effects on functions that use
qemu_get_current_aio_context(). For example, aio_co_reschedule_self()
does not work when moving from iohandler_ctx to qemu_aio_context because
qemu_get_current_aio_context() does not differentiate these two
AioContexts.

Document this in order to reduce the chance of future bugs.

Signed-off-by: Stefan Hajnoczi 
Message-ID: <20240506190622.56095-3-stefa...@redhat.com>
Reviewed-by: Kevin Wolf 
Signed-off-by: Kevin Wolf 
---
 include/block/aio.h | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/include/block/aio.h b/include/block/aio.h
index 8378553eb9..4ee81936ed 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -629,6 +629,9 @@ void aio_co_schedule(AioContext *ctx, Coroutine *co);
  *
  * Move the currently running coroutine to new_ctx. If the coroutine is already
  * running in new_ctx, do nothing.
+ *
+ * Note that this function cannot reschedule from iohandler_ctx to
+ * qemu_aio_context.
  */
 void coroutine_fn aio_co_reschedule_self(AioContext *new_ctx);
 
@@ -661,6 +664,9 @@ void aio_co_enter(AioContext *ctx, Coroutine *co);
  * If called from an IOThread this will be the IOThread's AioContext.  If
  * called from the main thread or with the "big QEMU lock" taken it
  * will be the main loop AioContext.
+ *
+ * Note that the return value is never the main loop's iohandler_ctx and the
+ * return value is the main loop AioContext instead.
  */
 AioContext *qemu_get_current_aio_context(void);
 
-- 
2.45.2
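
The effect described in the commit message can be illustrated with a toy
model. This is not QEMU code: `ToyCtx`, `toy_get_current()` and
`toy_reschedule_self()` are hypothetical names, and the model only
assumes what the message states, namely that the current-context lookup
never reports iohandler_ctx and that rescheduling does nothing when the
coroutine is believed to be running in the target context already.

```c
#include <stdbool.h>

/* Toy stand-ins for the two main-loop AioContexts (hypothetical). */
typedef struct ToyCtx {
    const char *name;
} ToyCtx;

static ToyCtx toy_qemu_aio_context = { "qemu_aio_context" };
static ToyCtx toy_iohandler_ctx = { "iohandler_ctx" };

/* Where the toy coroutine really runs; it starts in iohandler_ctx. */
static ToyCtx *toy_running_in = &toy_iohandler_ctx;

/*
 * Models the documented behaviour: from the main loop the answer is
 * always qemu_aio_context, never iohandler_ctx.
 */
static ToyCtx *toy_get_current(void)
{
    return &toy_qemu_aio_context;
}

/* Returns true if the coroutine was actually moved to new_ctx. */
static bool toy_reschedule_self(ToyCtx *new_ctx)
{
    if (toy_get_current() == new_ctx) {
        return false;   /* treated as "already there": nothing happens */
    }
    toy_running_in = new_ctx;
    return true;
}
```

In this model a request to move from iohandler_ctx to qemu_aio_context
is silently skipped, which is the case the new comment warns about.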




[PULL 6/8] linux-aio: add IO_CMD_FDSYNC command support

2024-06-11 Thread Kevin Wolf
From: Prasad Pandit 

Libaio defines IO_CMD_FDSYNC command to sync all outstanding
asynchronous I/O operations, by flushing out file data to the
disk storage. Enable linux-aio to submit such aio request.

When using aio=native without fdsync() support, QEMU creates
pthreads, and destroying these pthreads results in TLB flushes.
In a real-time guest environment, TLB flushes cause a latency
spike. This patch helps to avoid such spikes.

Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Prasad Pandit 
Message-ID: <20240425070412.37248-1-ppan...@redhat.com>
Reviewed-by: Kevin Wolf 
Signed-off-by: Kevin Wolf 
---
 include/block/raw-aio.h |  1 +
 block/file-posix.c  |  9 +
 block/linux-aio.c   | 21 -
 3 files changed, 30 insertions(+), 1 deletion(-)

diff --git a/include/block/raw-aio.h b/include/block/raw-aio.h
index 20e000b8ef..626706827f 100644
--- a/include/block/raw-aio.h
+++ b/include/block/raw-aio.h
@@ -60,6 +60,7 @@ void laio_cleanup(LinuxAioState *s);
 int coroutine_fn laio_co_submit(int fd, uint64_t offset, QEMUIOVector *qiov,
 int type, uint64_t dev_max_batch);
 
+bool laio_has_fdsync(int);
 void laio_detach_aio_context(LinuxAioState *s, AioContext *old_context);
 void laio_attach_aio_context(LinuxAioState *s, AioContext *new_context);
 #endif
diff --git a/block/file-posix.c b/block/file-posix.c
index 5c46938936..be25e35ff6 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -159,6 +159,7 @@ typedef struct BDRVRawState {
 bool has_discard:1;
 bool has_write_zeroes:1;
 bool use_linux_aio:1;
+bool has_laio_fdsync:1;
 bool use_linux_io_uring:1;
 int page_cache_inconsistent; /* errno from fdatasync failure */
 bool has_fallocate;
@@ -718,6 +719,9 @@ static int raw_open_common(BlockDriverState *bs, QDict 
*options,
 ret = -EINVAL;
 goto fail;
 }
+if (s->use_linux_aio) {
+s->has_laio_fdsync = laio_has_fdsync(s->fd);
+}
 #else
 if (s->use_linux_aio) {
 error_setg(errp, "aio=native was specified, but is not supported "
@@ -2598,6 +2602,11 @@ static int coroutine_fn 
raw_co_flush_to_disk(BlockDriverState *bs)
 if (raw_check_linux_io_uring(s)) {
 return luring_co_submit(bs, s->fd, 0, NULL, QEMU_AIO_FLUSH);
 }
+#endif
+#ifdef CONFIG_LINUX_AIO
+if (s->has_laio_fdsync && raw_check_linux_aio(s)) {
+return laio_co_submit(s->fd, 0, NULL, QEMU_AIO_FLUSH, 0);
+}
 #endif
 return raw_thread_pool_submit(handle_aiocb_flush, &acb);
 }
diff --git a/block/linux-aio.c b/block/linux-aio.c
index ec05d946f3..e3b5ec9aba 100644
--- a/block/linux-aio.c
+++ b/block/linux-aio.c
@@ -384,6 +384,9 @@ static int laio_do_submit(int fd, struct qemu_laiocb 
*laiocb, off_t offset,
 case QEMU_AIO_READ:
 io_prep_preadv(iocbs, fd, qiov->iov, qiov->niov, offset);
 break;
+case QEMU_AIO_FLUSH:
+io_prep_fdsync(iocbs, fd);
+break;
 /* Currently Linux kernel does not support other operations */
 default:
 fprintf(stderr, "%s: invalid AIO request type 0x%x.\n",
@@ -412,7 +415,7 @@ int coroutine_fn laio_co_submit(int fd, uint64_t offset, 
QEMUIOVector *qiov,
 AioContext *ctx = qemu_get_current_aio_context();
 struct qemu_laiocb laiocb = {
 .co = qemu_coroutine_self(),
-.nbytes = qiov->size,
+.nbytes = qiov ? qiov->size : 0,
 .ctx= aio_get_linux_aio(ctx),
 .ret= -EINPROGRESS,
 .is_read= (type == QEMU_AIO_READ),
@@ -486,3 +489,19 @@ void laio_cleanup(LinuxAioState *s)
 }
 g_free(s);
 }
+
+bool laio_has_fdsync(int fd)
+{
+struct iocb cb;
+struct iocb *cbs[] = {&cb, NULL};
+
+io_context_t ctx = 0;
+io_setup(1, &ctx);
+
+/* check if host kernel supports IO_CMD_FDSYNC */
+io_prep_fdsync(&cb, fd);
+int ret = io_submit(ctx, 1, cbs);
+
+io_destroy(ctx);
+return (ret == -EINVAL) ? false : true;
+}
-- 
2.45.2




[PULL 0/8] Block layer patches

2024-06-11 Thread Kevin Wolf
The following changes since commit 80e8f0602168f451a93e71cbb1d59e93d745e62e:

  Merge tag 'bsd-user-misc-2024q2-pull-request' of gitlab.com:bsdimp/qemu into 
staging (2024-06-09 11:21:55 -0700)

are available in the Git repository at:

  https://repo.or.cz/qemu/kevin.git tags/for-upstream

for you to fetch changes up to 3ab0f063e58ed9224237d69c4211ca83335164c4:

  crypto/block: drop qcrypto_block_open() n_threads argument (2024-06-10 
11:05:43 +0200)


Block layer patches

- crypto: Fix crash when used with multiqueue devices
- linux-aio: add IO_CMD_FDSYNC command support
- copy-before-write: Avoid integer overflows for timeout > 4s
- Fix crash with QMP block_resize and iothreads
- qemu-io: add cvtnum() error handling for zone commands
- Code cleanup


Denis V. Lunev via (1):
  block: drop force_dup parameter of raw_reconfigure_getfd()

Fiona Ebner (1):
  block/copy-before-write: use uint64_t for timeout in nanoseconds

Prasad J Pandit (1):
  linux-aio: add IO_CMD_FDSYNC command support

Stefan Hajnoczi (5):
  Revert "monitor: use aio_co_reschedule_self()"
  aio: warn about iohandler_ctx special casing
  qemu-io: add cvtnum() error handling for zone commands
  block/crypto: create ciphers on demand
  crypto/block: drop qcrypto_block_open() n_threads argument

 crypto/blockpriv.h |  13 +++--
 include/block/aio.h|   6 +++
 include/block/raw-aio.h|   1 +
 include/crypto/block.h |   2 -
 block/copy-before-write.c  |   2 +-
 block/crypto.c |   1 -
 block/file-posix.c |  17 --
 block/linux-aio.c  |  21 +++-
 block/qcow.c   |   2 +-
 block/qcow2.c  |   5 +-
 crypto/block-luks.c|   4 +-
 crypto/block-qcow.c|   8 ++-
 crypto/block.c | 114 -
 qapi/qmp-dispatch.c|   7 ++-
 qemu-io-cmds.c |  48 -
 tests/unit/test-crypto-block.c |   4 --
 16 files changed, 176 insertions(+), 79 deletions(-)




[PULL 4/8] qemu-io: add cvtnum() error handling for zone commands

2024-06-11 Thread Kevin Wolf
From: Stefan Hajnoczi 

cvtnum() parses positive int64_t values and returns a negative errno on
failure. Print errors and return early when cvtnum() fails.

While we're at it, also reject nr_zones values greater than or equal to 2^32
since they cannot be represented.

Reported-by: Peter Maydell 
Cc: Sam Li 
Signed-off-by: Stefan Hajnoczi 
Message-ID: <20240507180558.377233-1-stefa...@redhat.com>
Reviewed-by: Sam Li 
Reviewed-by: Kevin Wolf 
Signed-off-by: Kevin Wolf 
---
 qemu-io-cmds.c | 48 +++-
 1 file changed, 47 insertions(+), 1 deletion(-)

diff --git a/qemu-io-cmds.c b/qemu-io-cmds.c
index f5d7202a13..e2fab57183 100644
--- a/qemu-io-cmds.c
+++ b/qemu-io-cmds.c
@@ -1739,12 +1739,26 @@ static int zone_report_f(BlockBackend *blk, int argc, 
char **argv)
 {
 int ret;
 int64_t offset;
+int64_t val;
 unsigned int nr_zones;
 
 ++optind;
 offset = cvtnum(argv[optind]);
+if (offset < 0) {
+print_cvtnum_err(offset, argv[optind]);
+return offset;
+}
 ++optind;
-nr_zones = cvtnum(argv[optind]);
+val = cvtnum(argv[optind]);
+if (val < 0) {
+print_cvtnum_err(val, argv[optind]);
+return val;
+}
+if (val > UINT_MAX) {
+printf("Number of zones must be less than 2^32\n");
+return -ERANGE;
+}
+nr_zones = val;
 
 g_autofree BlockZoneDescriptor *zones = NULL;
 zones = g_new(BlockZoneDescriptor, nr_zones);
@@ -1780,8 +1794,16 @@ static int zone_open_f(BlockBackend *blk, int argc, char 
**argv)
 int64_t offset, len;
 ++optind;
 offset = cvtnum(argv[optind]);
+if (offset < 0) {
+print_cvtnum_err(offset, argv[optind]);
+return offset;
+}
 ++optind;
 len = cvtnum(argv[optind]);
+if (len < 0) {
+print_cvtnum_err(len, argv[optind]);
+return len;
+}
 ret = blk_zone_mgmt(blk, BLK_ZO_OPEN, offset, len);
 if (ret < 0) {
 printf("zone open failed: %s\n", strerror(-ret));
@@ -1805,8 +1827,16 @@ static int zone_close_f(BlockBackend *blk, int argc, 
char **argv)
 int64_t offset, len;
 ++optind;
 offset = cvtnum(argv[optind]);
+if (offset < 0) {
+print_cvtnum_err(offset, argv[optind]);
+return offset;
+}
 ++optind;
 len = cvtnum(argv[optind]);
+if (len < 0) {
+print_cvtnum_err(len, argv[optind]);
+return len;
+}
 ret = blk_zone_mgmt(blk, BLK_ZO_CLOSE, offset, len);
 if (ret < 0) {
 printf("zone close failed: %s\n", strerror(-ret));
@@ -1830,8 +1860,16 @@ static int zone_finish_f(BlockBackend *blk, int argc, 
char **argv)
 int64_t offset, len;
 ++optind;
 offset = cvtnum(argv[optind]);
+if (offset < 0) {
+print_cvtnum_err(offset, argv[optind]);
+return offset;
+}
 ++optind;
 len = cvtnum(argv[optind]);
+if (len < 0) {
+print_cvtnum_err(len, argv[optind]);
+return len;
+}
 ret = blk_zone_mgmt(blk, BLK_ZO_FINISH, offset, len);
 if (ret < 0) {
 printf("zone finish failed: %s\n", strerror(-ret));
@@ -1855,8 +1893,16 @@ static int zone_reset_f(BlockBackend *blk, int argc, 
char **argv)
 int64_t offset, len;
 ++optind;
 offset = cvtnum(argv[optind]);
+if (offset < 0) {
+print_cvtnum_err(offset, argv[optind]);
+return offset;
+}
 ++optind;
 len = cvtnum(argv[optind]);
+if (len < 0) {
+print_cvtnum_err(len, argv[optind]);
+return len;
+}
 ret = blk_zone_mgmt(blk, BLK_ZO_RESET, offset, len);
 if (ret < 0) {
 printf("zone reset failed: %s\n", strerror(-ret));
-- 
2.45.2
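
The range check added to zone_report_f() can be looked at in isolation.
The following is a minimal sketch, not part of the patch:
`toy_check_nr_zones()` is a hypothetical helper, and it assumes only
what the commit message says about cvtnum(), i.e. that the parsed value
is non-negative on success and a negative errno on failure.

```c
#include <errno.h>
#include <limits.h>
#include <stdint.h>

/*
 * Hypothetical helper: `val` is the result of a cvtnum()-style parser.
 * On success stores the zone count in *nr_zones and returns 0;
 * otherwise returns a negative errno and leaves *nr_zones untouched.
 */
static int toy_check_nr_zones(int64_t val, unsigned int *nr_zones)
{
    if (val < 0) {
        return (int)val;            /* propagate the parser's error */
    }
    if (val > (int64_t)UINT_MAX) {
        return -ERANGE;             /* does not fit in unsigned int */
    }
    *nr_zones = (unsigned int)val;
    return 0;
}
```

Parsing into an int64_t first and narrowing only after both checks is
what prevents a negative errno or an oversized count from being
silently converted to an unsigned value.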




[PULL 5/8] block/copy-before-write: use uint64_t for timeout in nanoseconds

2024-06-11 Thread Kevin Wolf
From: Fiona Ebner 

rather than the uint32_t for which the maximum is slightly more than 4
seconds and larger values would overflow. The QAPI interface allows
specifying the number of seconds, so only values 0 to 4 are safe right
now, other values lead to a much lower timeout than a user expects.

The block_copy() call where this is used already takes a uint64_t for
the timeout, so no change required there.

Fixes: 6db7fd1ca9 ("block/copy-before-write: implement cbw-timeout option")
Reported-by: Friedrich Weber 
Signed-off-by: Fiona Ebner 
Message-ID: <20240429141934.442154-1-f.eb...@proxmox.com>
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
Signed-off-by: Kevin Wolf 
---
 block/copy-before-write.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index cd65524e26..853e01a1eb 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c
@@ -43,7 +43,7 @@ typedef struct BDRVCopyBeforeWriteState {
 BlockCopyState *bcs;
 BdrvChild *target;
 OnCbwError on_cbw_error;
-uint32_t cbw_timeout_ns;
+uint64_t cbw_timeout_ns;
 bool discard_source;
 
 /*
-- 
2.45.2
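
The arithmetic behind the "0 to 4 seconds" limit can be checked with a
small sketch. It is not taken from the patch: the helper names are
hypothetical, and it assumes the timeout in seconds is converted by
multiplying with 10^9 before being stored in the field.

```c
#include <stdint.h>

#define TOY_NS_PER_SEC 1000000000ULL

/* Old field width: the nanosecond value is truncated to 32 bits. */
static uint32_t toy_timeout_ns_u32(uint32_t seconds)
{
    return (uint32_t)(seconds * TOY_NS_PER_SEC);
}

/* New field width: 64 bits hold the full nanosecond value. */
static uint64_t toy_timeout_ns_u64(uint32_t seconds)
{
    return seconds * TOY_NS_PER_SEC;
}
```

Under these assumptions a 4 second timeout still fits (4000000000 is
below 2^32 = 4294967296), while 5 seconds wraps to
5000000000 - 4294967296 = 705032704 ns, roughly 0.7 s.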




[PULL 8/8] crypto/block: drop qcrypto_block_open() n_threads argument

2024-06-11 Thread Kevin Wolf
From: Stefan Hajnoczi 

The n_threads argument is no longer used since the previous commit.
Remove it.

Signed-off-by: Stefan Hajnoczi 
Message-ID: <20240527155851.892885-3-stefa...@redhat.com>
Reviewed-by: Kevin Wolf 
Acked-by: Daniel P. Berrangé 
Signed-off-by: Kevin Wolf 
---
 crypto/blockpriv.h | 1 -
 include/crypto/block.h | 2 --
 block/crypto.c | 1 -
 block/qcow.c   | 2 +-
 block/qcow2.c  | 5 ++---
 crypto/block-luks.c| 1 -
 crypto/block-qcow.c| 6 ++
 crypto/block.c | 3 +--
 tests/unit/test-crypto-block.c | 4 
 9 files changed, 6 insertions(+), 19 deletions(-)

diff --git a/crypto/blockpriv.h b/crypto/blockpriv.h
index 4bf6043d5d..b8f77cb5eb 100644
--- a/crypto/blockpriv.h
+++ b/crypto/blockpriv.h
@@ -59,7 +59,6 @@ struct QCryptoBlockDriver {
 QCryptoBlockReadFunc readfunc,
 void *opaque,
 unsigned int flags,
-size_t n_threads,
 Error **errp);
 
 int (*create)(QCryptoBlock *block,
diff --git a/include/crypto/block.h b/include/crypto/block.h
index 92e823c9f2..5b5d039800 100644
--- a/include/crypto/block.h
+++ b/include/crypto/block.h
@@ -76,7 +76,6 @@ typedef enum {
  * @readfunc: callback for reading data from the volume
  * @opaque: data to pass to @readfunc
  * @flags: bitmask of QCryptoBlockOpenFlags values
- * @n_threads: allow concurrent I/O from up to @n_threads threads
  * @errp: pointer to a NULL-initialized error object
  *
  * Create a new block encryption object for an existing
@@ -113,7 +112,6 @@ QCryptoBlock *qcrypto_block_open(QCryptoBlockOpenOptions 
*options,
  QCryptoBlockReadFunc readfunc,
  void *opaque,
  unsigned int flags,
- size_t n_threads,
  Error **errp);
 
 typedef enum {
diff --git a/block/crypto.c b/block/crypto.c
index 21eed909c1..4eed3ffa6a 100644
--- a/block/crypto.c
+++ b/block/crypto.c
@@ -363,7 +363,6 @@ static int block_crypto_open_generic(QCryptoBlockFormat 
format,
block_crypto_read_func,
bs,
cflags,
-   1,
errp);
 
 if (!crypto->block) {
diff --git a/block/qcow.c b/block/qcow.c
index ca8e1d5ec8..c2f89db055 100644
--- a/block/qcow.c
+++ b/block/qcow.c
@@ -211,7 +211,7 @@ static int qcow_open(BlockDriverState *bs, QDict *options, 
int flags,
 cflags |= QCRYPTO_BLOCK_OPEN_NO_IO;
 }
 s->crypto = qcrypto_block_open(crypto_opts, "encrypt.",
-   NULL, NULL, cflags, 1, errp);
+   NULL, NULL, cflags, errp);
 if (!s->crypto) {
 ret = -EINVAL;
 goto fail;
diff --git a/block/qcow2.c b/block/qcow2.c
index 956128b409..10883a2494 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -321,7 +321,7 @@ qcow2_read_extensions(BlockDriverState *bs, uint64_t 
start_offset,
 }
 s->crypto = qcrypto_block_open(s->crypto_opts, "encrypt.",
qcow2_crypto_hdr_read_func,
-   bs, cflags, QCOW2_MAX_THREADS, errp);
+   bs, cflags, errp);
 if (!s->crypto) {
 return -EINVAL;
 }
@@ -1701,8 +1701,7 @@ qcow2_do_open(BlockDriverState *bs, QDict *options, int 
flags,
 cflags |= QCRYPTO_BLOCK_OPEN_NO_IO;
 }
 s->crypto = qcrypto_block_open(s->crypto_opts, "encrypt.",
-   NULL, NULL, cflags,
-   QCOW2_MAX_THREADS, errp);
+   NULL, NULL, cflags, errp);
 if (!s->crypto) {
 ret = -EINVAL;
 goto fail;
diff --git a/crypto/block-luks.c b/crypto/block-luks.c
index 3357852c0a..5b777c15d3 100644
--- a/crypto/block-luks.c
+++ b/crypto/block-luks.c
@@ -1189,7 +1189,6 @@ qcrypto_block_luks_open(QCryptoBlock *block,
 QCryptoBlockReadFunc readfunc,
 void *opaque,
 unsigned int flags,
-size_t n_threads,
 Error **errp)
 {
 QCryptoBlockLUKS *luks = NULL;
diff --git a/crypto/block-qcow.c b/crypto/block-qcow.c
index 02305058e3..42e9556e42 100644
--- a/crypto/block-qcow.c
+++ b/crypto/block-qcow.c
@@ -44,7 +44,6 @@ qcrypto_block_qcow_has_format(const uint8_t *buf 
G_GNUC_UNUSED,
 static int
 qcrypto_block_qcow_init(QCryptoBlock *block,
 const char *keysecret,
-

[PULL 2/8] Revert "monitor: use aio_co_reschedule_self()"

2024-06-11 Thread Kevin Wolf
From: Stefan Hajnoczi 

Commit 1f25c172f837 ("monitor: use aio_co_reschedule_self()") was a code
cleanup that uses aio_co_reschedule_self() instead of open coding
coroutine rescheduling.

Bug RHEL-34618 was reported and Kevin Wolf identified
the root cause. I missed that aio_co_reschedule_self() ->
qemu_get_current_aio_context() only knows about
qemu_aio_context/IOThread AioContexts and not about iohandler_ctx. It
does not function correctly when going back from the iohandler_ctx to
qemu_aio_context.

Go back to open coding the AioContext transitions to avoid this bug.

This reverts commit 1f25c172f83704e350c0829438d832384084a74d.

Cc: qemu-sta...@nongnu.org
Buglink: https://issues.redhat.com/browse/RHEL-34618
Signed-off-by: Stefan Hajnoczi 
Message-ID: <20240506190622.56095-2-stefa...@redhat.com>
Reviewed-by: Kevin Wolf 
Signed-off-by: Kevin Wolf 
---
 qapi/qmp-dispatch.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/qapi/qmp-dispatch.c b/qapi/qmp-dispatch.c
index f3488afeef..176b549473 100644
--- a/qapi/qmp-dispatch.c
+++ b/qapi/qmp-dispatch.c
@@ -212,7 +212,8 @@ QDict *coroutine_mixed_fn qmp_dispatch(const QmpCommandList 
*cmds, QObject *requ
  * executing the command handler so that it can make progress if it
  * involves an AIO_WAIT_WHILE().
  */
-aio_co_reschedule_self(qemu_get_aio_context());
+aio_co_schedule(qemu_get_aio_context(), qemu_coroutine_self());
+qemu_coroutine_yield();
 }
 
 monitor_set_cur(qemu_coroutine_self(), cur_mon);
@@ -226,7 +227,9 @@ QDict *coroutine_mixed_fn qmp_dispatch(const QmpCommandList 
*cmds, QObject *requ
  * Move back to iohandler_ctx so that nested event loops for
  * qemu_aio_context don't start new monitor commands.
  */
-aio_co_reschedule_self(iohandler_get_aio_context());
+aio_co_schedule(iohandler_get_aio_context(),
+qemu_coroutine_self());
+qemu_coroutine_yield();
 }
 } else {
/*
-- 
2.45.2




[PULL 1/8] block: drop force_dup parameter of raw_reconfigure_getfd()

2024-06-11 Thread Kevin Wolf
From: "Denis V. Lunev via" 

Since commit 72373e40fbc, this parameter is always passed as 'false'
from the caller.

Signed-off-by: Denis V. Lunev 
CC: Andrey Zhadchenko 
CC: Kevin Wolf 
CC: Hanna Reitz 
Message-ID: <20240430170213.148558-1-...@openvz.org>
Reviewed-by: Kevin Wolf 
Signed-off-by: Kevin Wolf 
---
 block/file-posix.c | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index 35684f7e21..5c46938936 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -1039,8 +1039,7 @@ static int fcntl_setfl(int fd, int flag)
 }
 
 static int raw_reconfigure_getfd(BlockDriverState *bs, int flags,
- int *open_flags, uint64_t perm, bool force_dup,
- Error **errp)
+ int *open_flags, uint64_t perm, Error **errp)
 {
 BDRVRawState *s = bs->opaque;
 int fd = -1;
@@ -1068,7 +1067,7 @@ static int raw_reconfigure_getfd(BlockDriverState *bs, int flags,
 assert((s->open_flags & O_ASYNC) == 0);
 #endif
 
-if (!force_dup && *open_flags == s->open_flags) {
+if (*open_flags == s->open_flags) {
 /* We're lucky, the existing fd is fine */
 return s->fd;
 }
@@ -3748,8 +3747,7 @@ static int raw_check_perm(BlockDriverState *bs, uint64_t perm, uint64_t shared,
 int ret;
 
 /* We may need a new fd if auto-read-only switches the mode */
-ret = raw_reconfigure_getfd(bs, input_flags, &open_flags, perm,
-false, errp);
+ret = raw_reconfigure_getfd(bs, input_flags, &open_flags, perm, errp);
 if (ret < 0) {
 return ret;
 } else if (ret != s->fd) {
-- 
2.45.2




[PULL 7/8] block/crypto: create ciphers on demand

2024-06-11 Thread Kevin Wolf
From: Stefan Hajnoczi 

Ciphers are pre-allocated by qcrypto_block_init_cipher() depending on
the given number of threads. The -device
virtio-blk-pci,iothread-vq-mapping= feature allows users to assign
multiple IOThreads to a virtio-blk device, but the association between
the virtio-blk device and the block driver happens after the block
driver is already open.

When the number of threads given to qcrypto_block_init_cipher() is
smaller than the actual number of threads at runtime, the
block->n_free_ciphers > 0 assertion in qcrypto_block_pop_cipher() can
fail.

Get rid of qcrypto_block_init_cipher() n_thread's argument and allocate
ciphers on demand.
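
For illustration only, here is a minimal sketch of the on-demand idea described above. It is not the code from this patch. The names toy_cipher, toy_pool, toy_pool_pop and toy_pool_push are hypothetical, the field names only loosely follow the ones visible in the blockpriv.h hunk, and locking is left out (the real code protects the pool with a mutex). Whether the actual patch grows its array by doubling is an assumption, since the crypto/block.c hunk is truncated in this archive.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for QCryptoCipher; the id only makes reuse visible. */
typedef struct {
    int id;
} toy_cipher;

/*
 * Toy pool of idle ciphers. Field names loosely follow the patch
 * (free_ciphers, max_free_ciphers, n_free_ciphers). No locking here.
 */
typedef struct {
    toy_cipher **free_ciphers;
    size_t max_free_ciphers;   /* capacity of the free_ciphers array */
    size_t n_free_ciphers;     /* ciphers currently parked in the array */
    int n_created;             /* total ciphers ever allocated */
} toy_pool;

/* Reuse an idle cipher if there is one, otherwise create one on demand. */
static toy_cipher *toy_pool_pop(toy_pool *pool)
{
    toy_cipher *c;

    if (pool->n_free_ciphers > 0) {
        return pool->free_ciphers[--pool->n_free_ciphers];
    }
    c = malloc(sizeof(*c));
    assert(c != NULL);
    c->id = pool->n_created++;
    return c;
}

/* Park a cipher for later reuse, growing the array when it is full. */
static void toy_pool_push(toy_pool *pool, toy_cipher *c)
{
    if (pool->n_free_ciphers == pool->max_free_ciphers) {
        size_t new_max = pool->max_free_ciphers ? pool->max_free_ciphers * 2 : 1;
        toy_cipher **grown = realloc(pool->free_ciphers,
                                     new_max * sizeof(*grown));
        assert(grown != NULL);
        pool->free_ciphers = grown;
        pool->max_free_ciphers = new_max;
    }
    pool->free_ciphers[pool->n_free_ciphers++] = c;
}
```

With this shape, a pop can never find the pool empty and fail: the number of live ciphers simply follows the number of concurrent users, whatever thread count was known when the block driver was opened.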

Reported-by: Qing Wang 
Buglink: https://issues.redhat.com/browse/RHEL-36159
Signed-off-by: Stefan Hajnoczi 
Message-ID: <20240527155851.892885-2-stefa...@redhat.com>
Reviewed-by: Kevin Wolf 
Acked-by: Daniel P. Berrangé 
Signed-off-by: Kevin Wolf 
---
 crypto/blockpriv.h  |  12 +++--
 crypto/block-luks.c |   3 +-
 crypto/block-qcow.c |   2 +-
 crypto/block.c  | 111 ++--
 4 files changed, 78 insertions(+), 50 deletions(-)

diff --git a/crypto/blockpriv.h b/crypto/blockpriv.h
index 836f3b4726..4bf6043d5d 100644
--- a/crypto/blockpriv.h
+++ b/crypto/blockpriv.h
@@ -32,8 +32,14 @@ struct QCryptoBlock {
 const QCryptoBlockDriver *driver;
 void *opaque;
 
-QCryptoCipher **ciphers;
-size_t n_ciphers;
+/* Cipher parameters */
+QCryptoCipherAlgorithm alg;
+QCryptoCipherMode mode;
+uint8_t *key;
+size_t nkey;
+
+QCryptoCipher **free_ciphers;
+size_t max_free_ciphers;
 size_t n_free_ciphers;
 QCryptoIVGen *ivgen;
 QemuMutex mutex;
@@ -130,7 +136,7 @@ int qcrypto_block_init_cipher(QCryptoBlock *block,
   QCryptoCipherAlgorithm alg,
   QCryptoCipherMode mode,
   const uint8_t *key, size_t nkey,
-  size_t n_threads, Error **errp);
+  Error **errp);
 
 void qcrypto_block_free_cipher(QCryptoBlock *block);
 
diff --git a/crypto/block-luks.c b/crypto/block-luks.c
index 3ee928fb5a..3357852c0a 100644
--- a/crypto/block-luks.c
+++ b/crypto/block-luks.c
@@ -1262,7 +1262,6 @@ qcrypto_block_luks_open(QCryptoBlock *block,
   luks->cipher_mode,
   masterkey,
   luks->header.master_key_len,
-  n_threads,
   errp) < 0) {
 goto fail;
 }
@@ -1456,7 +1455,7 @@ qcrypto_block_luks_create(QCryptoBlock *block,
 /* Setup the block device payload encryption objects */
 if (qcrypto_block_init_cipher(block, luks_opts.cipher_alg,
   luks_opts.cipher_mode, masterkey,
-  luks->header.master_key_len, 1, errp) < 0) {
+  luks->header.master_key_len, errp) < 0) {
 goto error;
 }
 
diff --git a/crypto/block-qcow.c b/crypto/block-qcow.c
index 4d7cf36a8f..02305058e3 100644
--- a/crypto/block-qcow.c
+++ b/crypto/block-qcow.c
@@ -75,7 +75,7 @@ qcrypto_block_qcow_init(QCryptoBlock *block,
 ret = qcrypto_block_init_cipher(block, QCRYPTO_CIPHER_ALG_AES_128,
 QCRYPTO_CIPHER_MODE_CBC,
 keybuf, G_N_ELEMENTS(keybuf),
-n_threads, errp);
+errp);
 if (ret < 0) {
 ret = -ENOTSUP;
 goto fail;
diff --git a/crypto/block.c b/crypto/block.c
index 506ea1d1a3..ba6d1cebc7 100644
--- a/crypto/block.c
+++ b/crypto/block.c
@@ -20,6 +20,7 @@
 
 #include "qemu/osdep.h"
 #include "qapi/error.h"
+#include "qemu/lockable.h"
 #include "blockpriv.h"
 #include "block-qcow.h"
 #include "block-luks.h"
@@ -57,6 +58,8 @@ QCryptoBlock *qcrypto_block_open(QCryptoBlockOpenOptions *options,
 {
 QCryptoBlock *block = g_new0(QCryptoBlock, 1);
 
+qemu_mutex_init(&block->mutex);
+
 block->format = options->format;
 
 if (options->format >= G_N_ELEMENTS(qcrypto_block_drivers) ||
@@ -76,8 +79,6 @@ QCryptoBlock *qcrypto_block_open(QCryptoBlockOpenOptions *options,
 return NULL;
 }
 
-qemu_mutex_init(&block->mutex);
-
 return block;
 }
 
@@ -92,6 +93,8 @@ QCryptoBlock *qcrypto_block_create(QCryptoBlockCreateOptions *options,
 {
 QCryptoBlock *block = g_new0(QCryptoBlock, 1);
 
+qemu_mutex_init(&block->mutex);
+
 block->format = options->format;
 
 if (options->format >= G_N_ELEMENTS(qcrypto_block_drivers) ||
@@ -111,8 +114,6 @@ QCryptoBlock *qcrypto_block_create(QCryptoBlockCreateOptions *options,
 return NULL;
 }
 
-qemu_mutex_init(&block->mutex);
-
 return block;
 }
 
@@ -227,37 +228,42 @@ QCryptoCipher *qcrypto_block_get_cipher(QCryptoBlock *block)
  * This function is used only in test 

[PATCH 0/1] i386/tcg fix for IRET as used in dotnet runtime

2024-06-11 Thread Robert R. Henry
This patch fixes the i386/tcg implementation of the IRET instruction
so that IRET can return from user space to user space, as used by the
dotnet runtime to switch threads.

This fixes https://gitlab.com/qemu-project/qemu/-/issues/249

I debugged this issue 4+ years ago, and wrote this patch then.

At the time, I did not fully understand the nuances of the priority
levels in the TCG emulation of the x86, nor of the x86 itself.
I understand less now!

I do not recall exactly how I was led to the conclusion that an
unhandled page fault in kernel space was due to a bug in the code
executed in the tcg emulator for IRET. Eventually, my approach to
debugging was to modify the source for the dotnet runtime so that
immediately prior to the IRET I executed an x87 fpatan2 instruction,
knowing that no modern program used that instruction, and that there
was a single point in QEMU source code that emulated that, making it a
convenient place to put gdb breakpoints to enable further breakpoints in
the IRET emulation code.

With this change the page faults go away, and the dotnet program
completes as expected. For the curious,
https://github.com/dotnet/runtime/blob/main/src/coreclr/pal/src/arch/amd64/context2.S#L241
shows how the dotnet runtime uses iret.

I have booted BSD, Solaris and Mac OS X with this change, and await
results for booting Windows from the Windows kernel team.

I have not tested this with other modern JITers, such as Java,
v8, or HHVM.

Robert R. Henry (1):
  i386/tcg: Allow IRET from user mode to user mode for dotnet runtime

 target/i386/tcg/seg_helper.c | 78 ++--
 1 file changed, 47 insertions(+), 31 deletions(-)

-- 
2.34.1




[PATCH 1/1] i386/tcg: Allow IRET from user mode to user mode for dotnet runtime

2024-06-11 Thread Robert R. Henry
This fixes a bug wherein i386/tcg assumed an interrupt return using
the IRET instruction was always returning from kernel mode to either
kernel mode or user mode. This assumption is violated when IRET is used
as a clever way to restore thread state, as for example in the dotnet
runtime. There, IRET returns from user mode to user mode.

This bug manifested itself as a page fault in the guest Linux kernel.

This bug appears to have been in QEMU since the beginning.
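
As a rough illustration of the selection logic this patch introduces (a sketch, not the patch itself): the stack accessors are chosen by privilege level instead of always using the kernel variants. The enum and the two toy_* functions below are hypothetical names made up for this example. In the patch the choice is between the cpu_*_kernel_ra and cpu_*_data_ra accessors, keyed on dpl for pushes and on cpl for pops.

```c
#include <assert.h>

/* Hypothetical labels for the two accessor families the patch picks between. */
typedef enum {
    TOY_ACCESS_KERNEL,  /* stands for cpu_stq_kernel_ra / cpu_ldq_kernel_ra */
    TOY_ACCESS_DATA     /* stands for cpu_stq_data_ra / cpu_ldq_data_ra */
} toy_access;

/* Push side: the patch keys the choice on the destination privilege level. */
static toy_access toy_push_access(int dpl)
{
    assert(dpl >= 0 && dpl <= 3);
    return dpl == 0 ? TOY_ACCESS_KERNEL : TOY_ACCESS_DATA;
}

/* Pop side: the patch keys the choice on the current privilege level. */
static toy_access toy_pop_access(int cpl)
{
    assert(cpl >= 0 && cpl <= 3);
    return cpl == 0 ? TOY_ACCESS_KERNEL : TOY_ACCESS_DATA;
}
```

For an IRET executed at CPL 3, as in the dotnet case, the pop side therefore goes through the ordinary data accessor rather than the kernel one.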

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/249
Signed-off-by: Robert R. Henry 
---
 target/i386/tcg/seg_helper.c | 78 ++--
 1 file changed, 47 insertions(+), 31 deletions(-)

diff --git a/target/i386/tcg/seg_helper.c b/target/i386/tcg/seg_helper.c
index 715db1f232..815d26e61d 100644
--- a/target/i386/tcg/seg_helper.c
+++ b/target/i386/tcg/seg_helper.c
@@ -843,20 +843,35 @@ static void do_interrupt_protected(CPUX86State *env, int intno, int is_int,
 
 #ifdef TARGET_X86_64
 
-#define PUSHQ_RA(sp, val, ra)   \
-{   \
-sp -= 8;\
-cpu_stq_kernel_ra(env, sp, (val), ra);  \
-}
-
-#define POPQ_RA(sp, val, ra)\
-{   \
-val = cpu_ldq_kernel_ra(env, sp, ra);   \
-sp += 8;\
-}
+#define PUSHQ_RA(sp, val, ra, cpl, dpl) \
+  FUNC_PUSHQ_RA(env, &sp, val, ra, cpl, dpl)
+
+static inline void FUNC_PUSHQ_RA(
+CPUX86State *env, target_ulong *sp,
+target_ulong val, target_ulong ra, int cpl, int dpl) {
+  *sp -= 8;
+  if (dpl == 0) {
+cpu_stq_kernel_ra(env, *sp, val, ra);
+  } else {
+cpu_stq_data_ra(env, *sp, val, ra);
+  } 
+}
 
-#define PUSHQ(sp, val) PUSHQ_RA(sp, val, 0)
-#define POPQ(sp, val) POPQ_RA(sp, val, 0)
+#define POPQ_RA(sp, val, ra, cpl, dpl) \
+  val = FUNC_POPQ_RA(env, &sp, ra, cpl, dpl)
+
+static inline target_ulong FUNC_POPQ_RA(
+CPUX86State *env, target_ulong *sp,
+target_ulong ra, int cpl, int dpl) {
+  target_ulong val;
+  if (cpl == 0) {  /* TODO perhaps both arms reduce to cpu_ldq_data_ra? */
+val = cpu_ldq_kernel_ra(env, *sp, ra);
+  } else {
+val = cpu_ldq_data_ra(env, *sp, ra);
+  }
+  *sp += 8;
+  return val;
+}
 
 static inline target_ulong get_rsp_from_tss(CPUX86State *env, int level)
 {
@@ -901,6 +916,7 @@ static void do_interrupt64(CPUX86State *env, int intno, int is_int,
 uint32_t e1, e2, e3, ss, eflags;
 target_ulong old_eip, esp, offset;
 bool set_rf;
+const target_ulong retaddr = 0;
 
 has_error_code = 0;
 if (!is_int && !is_hw) {
@@ -989,13 +1005,13 @@ static void do_interrupt64(CPUX86State *env, int intno, int is_int,
 eflags |= RF_MASK;
 }
 
-PUSHQ(esp, env->segs[R_SS].selector);
-PUSHQ(esp, env->regs[R_ESP]);
-PUSHQ(esp, eflags);
-PUSHQ(esp, env->segs[R_CS].selector);
-PUSHQ(esp, old_eip);
+PUSHQ_RA(esp, env->segs[R_SS].selector, retaddr, cpl, dpl);
+PUSHQ_RA(esp, env->regs[R_ESP], retaddr, cpl, dpl);
+PUSHQ_RA(esp, eflags,   retaddr, cpl, dpl);
+PUSHQ_RA(esp, env->segs[R_CS].selector, retaddr, cpl, dpl);
+PUSHQ_RA(esp, old_eip,  retaddr, cpl, dpl);
 if (has_error_code) {
-PUSHQ(esp, error_code);
+PUSHQ_RA(esp, error_code, retaddr, cpl, dpl);
 }
 
 /* interrupt gate clear IF mask */
@@ -1621,8 +1637,8 @@ void helper_lcall_protected(CPUX86State *env, int new_cs, target_ulong new_eip,
 
 /* 64 bit case */
 rsp = env->regs[R_ESP];
-PUSHQ_RA(rsp, env->segs[R_CS].selector, GETPC());
-PUSHQ_RA(rsp, next_eip, GETPC());
+PUSHQ_RA(rsp, env->segs[R_CS].selector, GETPC(), cpl, dpl);
+PUSHQ_RA(rsp, next_eip, GETPC(), cpl, dpl);
 /* from this point, not restartable */
 env->regs[R_ESP] = rsp;
 cpu_x86_load_seg_cache(env, R_CS, (new_cs & 0xfffc) | cpl,
@@ -1792,8 +1808,8 @@ void helper_lcall_protected(CPUX86State *env, int new_cs, target_ulong new_eip,
 #ifdef TARGET_X86_64
 if (shift == 2) {
 /* XXX: verify if new stack address is canonical */
-PUSHQ_RA(sp, env->segs[R_SS].selector, GETPC());
-PUSHQ_RA(sp, env->regs[R_ESP], GETPC());
+PUSHQ_RA(sp, env->segs[R_SS].selector, GETPC(), cpl, dpl);
+PUSHQ_RA(sp, env->regs[R_ESP], GETPC(), cpl, dpl);
 /* parameters aren't supported for 64-bit call gates */
 } else
 #endif
@@ -1828,8 +1844,8 @@ void helper_lcall_protected(CPUX86State *env, int new_cs, target_ulong new_eip,
 
 #ifdef TARGET_X86_64
 if (shift == 2) {
-PUSHQ_RA(sp, env->segs[R_CS].selector, GETPC());
-PUSHQ_RA(sp, next_eip, GETPC());
+PUSHQ_RA(sp, env->segs[R_CS].selector, GETPC(), cpl, dpl);
+PUSHQ_RA(sp, next_eip, 

Re: [PATCH v4 2/4] vvfat: Fix usage of `info.file.offset`

2024-06-11 Thread Amjad Alsharafi
On Tue, Jun 11, 2024 at 04:30:53PM +0200, Kevin Wolf wrote:
> Am 11.06.2024 um 14:31 hat Amjad Alsharafi geschrieben:
> > On Mon, Jun 10, 2024 at 06:49:43PM +0200, Kevin Wolf wrote:
> > > Am 05.06.2024 um 02:58 hat Amjad Alsharafi geschrieben:
> > > > The field is marked as "the offset in the file (in clusters)", but it
> > > > was being used like this
> > > > `cluster_size*(nums)+mapping->info.file.offset`, which is incorrect.
> > > > 
> > > > Additionally, removed the `abort` when `first_mapping_index` does not
> > > > match, as this matches the case when adding new clusters for files, and
> > > > it's inevitable that we reach this condition when doing that if the
> > > > clusters are not after one another, so there is no reason to `abort`
> > > > here, execution continues and the new clusters are written to disk
> > > > correctly.
> > > > 
> > > > Signed-off-by: Amjad Alsharafi 
> > > 
> > > Can you help me understand how first_mapping_index really works?
> > > 
> > > It seems to me that you get a chain of mappings for each file on the FAT
> > > filesystem, which are just the contiguous areas in it, and
> > > first_mapping_index refers to the mapping at the start of the file. But
> > > for much of the time, it actually doesn't seem to be set at all, so you
> > > have mapping->first_mapping_index == -1. Do you understand the rules
> > > around when it's set and when it isn't?
> > 
> > Yeah. So `first_mapping_index` is the index of the first mapping, each
> > mapping is a group of clusters that are contiguous in the file.
> > It's mostly `-1` because the first mapping will have the value set as
> > `-1` and not its own index, this value will only be set when the file
> > contains more than one mapping, and this will only happen when you add
> > clusters to a file that are not contiguous with the existing clusters.
> 
> Ah, that makes some sense. Not sure if it's optimal, but it's a rule I
> can work with. So just to confirm, this is the invariant that we think
> should always hold true, right?
> 
> assert((mapping->mode & MODE_DIRECTORY) ||
>!mapping->info.file.offset ||
>mapping->first_mapping_index > 0);
> 

Yes.

We can add this to the loop in `get_cluster_count_for_direntry`.
I'm also thinking of converting those `abort` calls into `assert`, because
the line `copy_it = 1;` confused me: it comes after the `abort`.

> > And actually, thanks to that I noticed another bug not fixed in PATCH 3, 
> > We are doing this check 
> > `s->current_mapping->first_mapping_index != mapping->first_mapping_index`
> > to know if we should switch to the new mapping or not. 
> > If we were reading from the first mapping (`first_mapping_index == -1`)
> > and we jumped to the second mapping (`first_mapping_index == n`), we
> > will catch this condition and switch to the new mapping.
> > 
> > But if the file has more than 2 mappings, and we jumped to the 3rd
> > mapping, we will not catch this since (`first_mapping_index == n`) for
> > both of them haha. I think a better check is to check the `mapping`
> > pointer directly. (I'll add it also in the next series together with a
> > test for it.)
> 
> This comparison is exactly what confused me. I didn't realise that the
> first mapping in the chain has a different value here, so I thought this
> must mean that we're looking at a different file now - but of course I
> couldn't see a reason for that because we're iterating through a single
> file in this function.
> 
> But even now that I know that the condition triggers when switching from
> the first to the second mapping, it doesn't make sense to me. We don't
> have to copy things around just because a file is non-contiguous.
> 
> What we want to catch is if the order of mappings has changed compared
> to the old state. Do we need a linked list, maybe a prev_mapping_index,
> instead of first_mapping_index so that we can compare if it is still the
> same as before?

I think this would be the better design (tbh, that's what I thought
`first_mapping_index` would do), though I'm not sure if other components
depend so much on the current design that it would be hard to change.

I'll try to implement this `prev_mapping_index` and see how it goes.
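
To illustrate the missed switch between the second and third mapping mentioned earlier in this mail, here is a small self-contained sketch. It is not vvfat code: toy_mapping is a hypothetical, pared-down stand-in for the real mapping structure, and the index value 7 used in the usage note is arbitrary.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, pared-down stand-in for a vvfat mapping. */
typedef struct {
    /* -1 for the first mapping of a file, else index of that first mapping */
    int first_mapping_index;
} toy_mapping;

/* Current check: only notices a change in first_mapping_index. */
static bool switched_by_index(const toy_mapping *cur, const toy_mapping *next)
{
    return cur->first_mapping_index != next->first_mapping_index;
}

/* Proposed check: compare the mapping pointers themselves. */
static bool switched_by_pointer(const toy_mapping *cur, const toy_mapping *next)
{
    return cur != next;
}
```

With three mappings whose first_mapping_index values are -1, 7 and 7, the index comparison reports the step from the first to the second mapping but not the step from the second to the third, while the pointer comparison reports both.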

> 
> Or actually, I suppose that's the first block with an abort() in the
> code, just that it doesn't compare mappings, but their offsets.

I think, I'm still confused about the whole logic there, the function
`get_cluster_count_for_direntry` is a mess, and it doesn't just
*get* the cluster count, it also schedules writeouts and may
copy clusters around.

> 
> > > 
> > > >  block/vvfat.c | 12 +++-
> > > >  1 file changed, 7 insertions(+), 5 deletions(-)
> > > > 
> > > > diff --git a/block/vvfat.c b/block/vvfat.c
> > > > index 19da009a5b..f0642ac3e4 100644
> > > > --- a/block/vvfat.c
> > > > +++ b/block/vvfat.c
> > > > @@ -1408,7 +1408,9 @@ read_cluster_directory:
> > > >  
> > > >  assert(s->current_fd);
> > > >  
> > > > -
> > > > 

Re: [PATCH v3 03/13] hw/riscv: add RISC-V IOMMU base emulation

2024-06-11 Thread Jason Chien

Hi Daniel,

On 2024/5/24 上午 01:39, Daniel Henrique Barboza wrote:

From: Tomasz Jeznach 

The RISC-V IOMMU specification is now ratified as per the RISC-V
international process. The latest frozen specification can be found
at:

https://github.com/riscv-non-isa/riscv-iommu/releases/download/v1.0/riscv-iommu.pdf

Add the foundation of the device emulation for RISC-V IOMMU, which
includes an IOMMU that has no capabilities but MSI interrupt support and
fault queue interfaces. We'll add more features incrementally in the
next patches.

Co-developed-by: Sebastien Boeuf 
Signed-off-by: Sebastien Boeuf 
Signed-off-by: Tomasz Jeznach 
Signed-off-by: Daniel Henrique Barboza 
---
  hw/riscv/Kconfig |4 +
  hw/riscv/meson.build |1 +
  hw/riscv/riscv-iommu.c   | 1602 ++
  hw/riscv/riscv-iommu.h   |  141 
  hw/riscv/trace-events|   11 +
  hw/riscv/trace.h |1 +
  include/hw/riscv/iommu.h |   36 +
  meson.build  |1 +
  8 files changed, 1797 insertions(+)
  create mode 100644 hw/riscv/riscv-iommu.c
  create mode 100644 hw/riscv/riscv-iommu.h
  create mode 100644 hw/riscv/trace-events
  create mode 100644 hw/riscv/trace.h
  create mode 100644 include/hw/riscv/iommu.h

diff --git a/hw/riscv/Kconfig b/hw/riscv/Kconfig
index a2030e3a6f..f69d6e3c8e 100644
--- a/hw/riscv/Kconfig
+++ b/hw/riscv/Kconfig
@@ -1,3 +1,6 @@
+config RISCV_IOMMU
+bool
+
  config RISCV_NUMA
  bool
  
@@ -47,6 +50,7 @@ config RISCV_VIRT

  select SERIAL
  select RISCV_ACLINT
  select RISCV_APLIC
+select RISCV_IOMMU
  select RISCV_IMSIC
  select SIFIVE_PLIC
  select SIFIVE_TEST
diff --git a/hw/riscv/meson.build b/hw/riscv/meson.build
index f872674093..cbc99c6e8e 100644
--- a/hw/riscv/meson.build
+++ b/hw/riscv/meson.build
@@ -10,5 +10,6 @@ riscv_ss.add(when: 'CONFIG_SIFIVE_U', if_true: files('sifive_u.c'))
  riscv_ss.add(when: 'CONFIG_SPIKE', if_true: files('spike.c'))
  riscv_ss.add(when: 'CONFIG_MICROCHIP_PFSOC', if_true: files('microchip_pfsoc.c'))
  riscv_ss.add(when: 'CONFIG_ACPI', if_true: files('virt-acpi-build.c'))
+riscv_ss.add(when: 'CONFIG_RISCV_IOMMU', if_true: files('riscv-iommu.c'))
  
  hw_arch += {'riscv': riscv_ss}

diff --git a/hw/riscv/riscv-iommu.c b/hw/riscv/riscv-iommu.c
new file mode 100644
index 00..39b4ff1405
--- /dev/null
+++ b/hw/riscv/riscv-iommu.c
@@ -0,0 +1,1602 @@
+/*
+ * QEMU emulation of an RISC-V IOMMU
+ *
+ * Copyright (C) 2021-2023, Rivos Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see .
+ */
+
+#include "qemu/osdep.h"
+#include "qom/object.h"
+#include "hw/pci/pci_bus.h"
+#include "hw/pci/pci_device.h"
+#include "hw/qdev-properties.h"
+#include "hw/riscv/riscv_hart.h"
+#include "migration/vmstate.h"
+#include "qapi/error.h"
+#include "qemu/timer.h"
+
+#include "cpu_bits.h"
+#include "riscv-iommu.h"
+#include "riscv-iommu-bits.h"
+#include "trace.h"
+
+#define LIMIT_CACHE_CTX   (1U << 7)
+#define LIMIT_CACHE_IOT   (1U << 20)
+
+/* Physical page number coversions */
+#define PPN_PHYS(ppn) ((ppn) << TARGET_PAGE_BITS)
+#define PPN_DOWN(phy) ((phy) >> TARGET_PAGE_BITS)
+
+typedef struct RISCVIOMMUContext RISCVIOMMUContext;
+typedef struct RISCVIOMMUEntry RISCVIOMMUEntry;
+
+/* Device assigned I/O address space */
+struct RISCVIOMMUSpace {
+IOMMUMemoryRegion iova_mr;  /* IOVA memory region for attached device */
+AddressSpace iova_as;   /* IOVA address space for attached device */
+RISCVIOMMUState *iommu; /* Managing IOMMU device state */
+uint32_t devid; /* Requester identifier, AKA device_id */
+bool notifier;  /* IOMMU unmap notifier enabled */
+QLIST_ENTRY(RISCVIOMMUSpace) list;
+};
+
+/* Device translation context state. */
+struct RISCVIOMMUContext {
+uint64_t devid:24;  /* Requester Id, AKA device_id */
+uint64_t pasid:20;  /* Process Address Space ID */
+uint64_t __rfu:20;  /* reserved */
+uint64_t tc;/* Translation Control */
+uint64_t ta;/* Translation Attributes */
+uint64_t msi_addr_mask; /* MSI filtering - address mask */
+uint64_t msi_addr_pattern;  /* MSI filtering - address pattern */
+uint64_t msiptp;/* MSI redirection page table pointer */
+};
+
+/* IOMMU index for transactions without PASID 

Re: about QEMU TLS

2024-06-11 Thread Yu Zhang
Hello Daniel and all,

When I was using TLS encryption for VM live-migration, I noticed one
thing: the migration works regardless of the "endpoint" setting (that
is: either "endpoint=server", or "endpoint=client") on the target
server.
The line I added is:
"-object tls-creds-x509,id=tls0,dir=/path/to/qemutls,endpoint=client
(or server),verify-peer=on".

It seems that currently the setting of "endpoint" is not strictly
enforced for VM migration. I'd like to know, if it's intentionally
done to allow a certain flexibility, or should be fixed from the
security perspective. Thank you very much!

Best regards,
Yu Zhang @ IONOS cloud

On Mon, Aug 21, 2023 at 4:29 PM Yu Zhang  wrote:
>
> Hello Daniel,
>
> sorry for my slow reply! I tested the approach you suggested in the
> following way:
>
> On the target server, start a VM in -incoming mode:
>
> qemu-7.1 \
> -uuid ${VM_UUID} \
>  ...
> -object tls-creds-x509,id=tls0,dir=${HOME}/qemutls,endpoint=server \
>  ...
> -incoming defer \
> -qmp unix:${SOCK},server,nowait \
> -qmp unix:${SOCK},server,nowait &
>
> Set the migrate parameter and wait for the incoming VM from source:
>
> echo '{"execute":"qmp_capabilities"}{ "execute":
> "migrate-set-parameters", "arguments": { "tls-creds": "tls0" }}' |
> sudo nc -U -w 1 ${SOCK}
> echo '{"execute":"qmp_capabilities"}{ "execute": "migrate-incoming",
> "arguments": { "uri": "tcp:[::]:8089" }}' | sudo nc -U -w 1 ${SOCK}
>
> in HMP:
> (qemu) migrate_set_parameter tls-creds tls0
> (qemu) migrate_incoming tcp:[::]:8089
>
> On the source server, start a VM:
>
> qemu-7.1 \
> -uuid ${VM_UUID} \
>  ...
> -object tls-creds-x509,id=tls0,dir=${HOME}/qemutls,endpoint=client \
>  ...
> -qmp unix:${SOCK},server,nowait \
> -qmp unix:${SOCK},server,nowait &
>
> Set the migrate parameter and migrate the VM from source to target:
>
> echo '{"execute":"qmp_capabilities"}{ "execute":
> "migrate-set-parameters", "arguments": { "tls-creds": "tls0" }}' |
> sudo nc -U -w 1 ${SOCK}
> echo '{"execute":"qmp_capabilities"}{ "execute": "migrate",
> "arguments": { "uri": "tcp:10.41.19.32:8089" }}' | sudo nc -U -w 1 ${SOCK}
>
> and query the migration after a few seconds:
>
> echo '{"execute":"qmp_capabilities"}{ "execute": "query-migrate" }' |
> sudo nc -U -w 1 ${SOCK}
>
> the migration completes successfully.
>
> To further migrate the VM from source (the target for the previous
> migration), the endpoint must be changed from "server" to "client" by
> QMP commands:
>
> echo '{"execute":"qmp_capabilities"}{ "execute": "object-del",
> "arguments": { "id": "tls0" }}' | sudo nc -U -w 1 ${SOCK}
> echo '{"execute":"qmp_capabilities"}{ "execute": "object-add",
> "arguments": { "id": "tls0", "qom-type": "tls-creds-x509", "endpoint":
> "client", "dir": "${HOME}/qemutls", "verify-peer": false }}' | sudo nc
> -U -w 1 ${SOCK}
>
> which in HMP commands are:
>
> (qemu) object_del tls0
> (qemu) object_add tls-creds-x509,id=tls0,dir=${HOME}/qemutls,endpoint=client
> (qemu) migrate_set_parameter tls-creds tls0
> (qemu) migrate tcp:10.41.16.10:8089
>
> So far as I tested, the TLS certificate must be valid for at least one
> day. Therefore, the VM migration with an expired TLS certificate can
> only be done in one day.
>
> Thank you so much for your kind reply!
> Best regards
>
> Yu Zhang @ IONOS Compute Platform
>
> On Thu, Aug 17, 2023 at 12:49 PM Daniel P. Berrangé  
> wrote:
> >
> > On Mon, Aug 07, 2023 at 12:07:31AM +0200, Yu Zhang wrote:
> > > Hi all,
> > >
> > > According to qemu docs [1], TLS parameters are specified as an object in
> > > the QEMU command line:
> > >
> > >-object tls-creds-x509,id=id,endpoint=endpoint,dir=/path/to/cred/dir 
> > > ...
> > >
> > > of which "endpoint" is a type of "QCryptoTLSCredsEndpoint" and can be
> > > either a "server" or a "client".
> > >
> > > I'd like to know:
> > >
> > > - When a VM is started with this config, is there a way (e.g. QMP) to
> > > change the value of "endpoint"?
> > >   If possible, how to do this? or else after the first migration of a VM,
> > > the VM has "endpoint=server",
> > >   which can't be migrated without stop / start.
> >
> > Use object_del + object_add to delete the old credentials and
> > create new ones.
> >
> > > - In which case does the QEMU reload its TLS certificate, e.g. when a QEMU
> > > VM has been run longer
> > >   than the valid period of its TLS certificate?
> >
> > The certs are loaded at the time the incoming/outgoing migration
> > operation is initiated, so they are always fresh.
> >
> > > - The migration is done by using HMP monitor on both source and target
> > > side. Is it possible to do it
> > >   by using QMP commands?
> >
> > Almost everything in HMP has an equivalent QMP command.
> >
> >
> > With regards,
> > Daniel
> > --
> > |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
> > |: https://libvirt.org -o- https://fstop138.berrange.com :|
> > |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
> >



Re: [PATCH v2 4/7] migration/multifd: Add UADK initialization

2024-06-11 Thread Zhangfei Gao
On Tue, 11 Jun 2024 at 02:35, Fabiano Rosas  wrote:
>
> Shameer Kolothum via  writes:
>
> > Initialize UADK session and allocate buffers required. The actual
> > compression/decompression will only be done in a subsequent patch.
> >
> > Signed-off-by: Shameer Kolothum 
>
> Reviewed-by: Fabiano Rosas 

Reviewed-by: Zhangfei Gao 



Re: [PATCH v2 6/7] migration/multifd: Switch to no compression when no hardware support

2024-06-11 Thread Zhangfei Gao
On Fri, 7 Jun 2024 at 21:54, Shameer Kolothum
 wrote:
>
> Send raw packets over if UADK hardware support is not available. This is to
> satisfy QEMU qtest CI which may run on platforms that don't have UADK
> hardware support. Subsequent patch will add support for uadk migration
> qtest.
>
> Reviewed-by: Fabiano Rosas 
> Signed-off-by: Shameer Kolothum 

Reviewed-by: Zhangfei Gao 



Re: [PATCH v2 5/7] migration/multifd: Add UADK based compression and decompression

2024-06-11 Thread Zhangfei Gao
On Fri, 7 Jun 2024 at 21:54, Shameer Kolothum
 wrote:
>
> Uses UADK wd_do_comp_sync() API to (de)compress a normal page using
> hardware accelerator.
>
> Reviewed-by: Fabiano Rosas 
> Signed-off-by: Shameer Kolothum 

Reviewed-by: Zhangfei Gao 



Re: [PATCH v2 3/7] migration/multifd: add uadk compression framework

2024-06-11 Thread Zhangfei Gao
On Fri, 7 Jun 2024 at 21:54, Shameer Kolothum
 wrote:
>
> Adds the skeleton to support uadk compression method.
> Complete functionality will be added in subsequent patches.
>
> Acked-by: Markus Armbruster 
> Reviewed-by: Fabiano Rosas 
> Signed-off-by: Shameer Kolothum 
Reviewed-by: Zhangfei Gao 



Re: [PATCH v2 2/7] configure: Add uadk option

2024-06-11 Thread Zhangfei Gao
On Fri, 7 Jun 2024 at 21:54, Shameer Kolothum
 wrote:
>
> Add --enable-uadk and --disable-uadk options to enable and disable
> UADK compression accelerator. This is for using UADK based hardware
> accelerators for live migration.
>
> Reviewed-by: Fabiano Rosas 
> Signed-off-by: Shameer Kolothum 

Reviewed-by: Zhangfei Gao 



Re: [PATCH v2 1/7] docs/migration: add uadk compression feature

2024-06-11 Thread Zhangfei Gao
On Fri, 7 Jun 2024 at 21:54, Shameer Kolothum
 wrote:
>
> Document UADK (User Space Accelerator Development Kit) library details
> and how to use that for migration.
>
> Signed-off-by: Shameer Kolothum 

Good job, thanks Shameer

Reviewed-by: Zhangfei Gao 



Re: [RFC PATCH v1 0/6] Implement ARM PL011 in Rust

2024-06-11 Thread Pierrick Bouvier

On 6/11/24 02:21, Alex Bennée wrote:

Pierrick Bouvier  writes:


On 6/10/24 13:29, Manos Pitsidianakis wrote:

On Mon, 10 Jun 2024 22:37, Pierrick Bouvier  wrote:

Hello Manos,




Excellent work, and thanks for posting this RFC!

IMHO, having patches 2 and 5 split is a bit confusing, and exposing
(temporarily) the generated.rs file in patches is not a good move.
Any reason you kept it this way?

That was my first approach, I will rework it on the second version. The
generated code should not exist in committed code at all.
It was initially tricky setting up the dependency orders correctly, so I
first committed it and then made it a dependency.



Maybe it could be better if build.rs file was *not* needed for new
devices/folders, and could be abstracted as a detail of the python
wrapper script instead of something that should be committed.

That'd mean you cannot work on the rust files with a LanguageServer, you
cannot run cargo build or cargo check or cargo clippy, etc. That's why I
left the alternative choice of including a manually generated bindings
file (generated.rs.inc)



Maybe I missed something, but it seems like it just checks/copies the
generated.rs file where it's expected. Definitely something that could
be done as part of the rust build.

Having to run the build before getting completion does not seem to be
a huge compromise.




As long as the Language Server can kick in after a first build. Rust
definitely leans in to the concept of the tooling helping you out while
coding.

I think for the C LSPs compile_commands.json is generated during the
configure step but I could be wrong.



Yes, meson generates it.
I agree having support for completion tooling is important nowadays, 
whether in C or in Rust.


Re: Re: [PATCH v5 00/10] Support persistent reservation operations

2024-06-11 Thread Stefan Hajnoczi
On Mon, Jun 10, 2024 at 07:55:20PM -0700, 卢长奇 wrote:
> Hi,
> 
> Sorry, I explained it in patch2 and forgot to reply to your email.
> 
> The existing PRManager only works with local scsi devices. This series
> will completely decouple devices and drivers. The device can not only be
> scsi, but also other devices such as nvme. The same is true for the
> driver, which is completely unrestricted.
> 
> And block/file-posix.c can implement the new block driver, and
> pr_manager can be executed after splicing ioctl commands in these
> drivers. This will be implemented in subsequent patches.

Thanks for explaining!

Stefan

> 
> On 2024/6/11 01:18, Stefan Hajnoczi wrote:
> > On Thu, Jun 06, 2024 at 08:24:34PM +0800, Changqi Lu wrote:
> >> Hi,
> >>
> >> patchv5 has been modified.
> >>
> >> Sincerely hope that everyone can help review the
> >> code and provide some suggestions.
> >>
> >> v4->v5:
> >> - Fixed a memory leak bug at hw/nvme/ctrl.c.
> >>
> >> v3->v4:
> >> - At the nvme layer, the two patches of enabling the ONCS
> >> function and enabling rescap are combined into one.
> >> - At the nvme layer, add helper functions for pr capacity
> >> conversion between the block layer and the nvme layer.
> >>
> >> v2->v3:
> >> In v2 Persist Through Power Loss (PTPL) is enabled by default.
> >> In v3 PTPL is supported, which is passed as a parameter.
> >>
> >> v1->v2:
> >> - Add sg_persist --report-capabilities for SCSI protocol and enable
> >> oncs and rescap for NVMe protocol.
> >> - Add persistent reservation capabilities constants and helper functions for
> >> SCSI and NVMe protocol.
> >> - Add comments for necessary APIs.
> >>
> >> v1:
> >> - Add seven APIs about persistent reservation command for block layer.
> >> These APIs including reading keys, reading reservations, registering,
> >> reserving, releasing, clearing and preempting.
> >> - Add the necessary pr-related operation APIs for both the
> >> SCSI protocol and NVMe protocol at the device layer.
> >> - Add scsi driver at the driver layer to verify the functions
> >
> > My question from v1 is unanswered:
> >
> > What is the relationship to the existing PRManager functionality
> > (docs/interop/pr-helper.rst) where block/file-posix.c interprets SCSI
> > ioctls and sends persistent reservation requests to an external helper
> > process?
> >
> > I wonder if block/file-posix.c can implement the new block driver
> > callbacks using pr_mgr (while keeping the existing scsi-generic
> > support).
> >
> > Thanks,
> > Stefan
> >
> >>
> >>
> >> Changqi Lu (10):
> >> block: add persistent reservation in/out api
> >> block/raw: add persistent reservation in/out driver
> >> scsi/constant: add persistent reservation in/out protocol constants
> >> scsi/util: add helper functions for persistent reservation types
> >> conversion
> >> hw/scsi: add persistent reservation in/out api for scsi device
> >> block/nvme: add reservation command protocol constants
> >> hw/nvme: add helper functions for converting reservation types
> >> hw/nvme: enable ONCS and rescap function
> >> hw/nvme: add reservation protocal command
> >> block/iscsi: add persistent reservation in/out driver
> >>
> >> block/block-backend.c | 397 ++
> >> block/io.c | 163 +++
> >> block/iscsi.c | 443 ++
> >> block/raw-format.c | 56 
> >> hw/nvme/ctrl.c | 326 +-
> >> hw/nvme/ns.c | 5 +
> >> hw/nvme/nvme.h | 84 ++
> >> hw/scsi/scsi-disk.c | 352 
> >> include/block/block-common.h | 40 +++
> >> include/block/block-io.h | 20 ++
> >> include/block/block_int-common.h | 84 ++
> >> include/block/nvme.h | 98 +++
> >> include/scsi/constants.h | 52 
> >> include/scsi/utils.h | 8 +
> >> include/sysemu/block-backend-io.h | 24 ++
> >> scsi/utils.c | 81 ++
> >> 16 files changed, 2231 insertions(+), 2 deletions(-)
> >>
> >> --
> >> 2.20.1
> >>




Re: [PATCH v1] virtio-iommu: add error check before assert

2024-06-11 Thread Philippe Mathieu-Daudé

On 11/6/24 14:23, Manos Pitsidianakis wrote:

A fuzzer case discovered by Zheyu Ma causes an assert failure.

Add a check before the assert, and respond with an error before moving
on to the next queue element.

To reproduce the failure:

cat << EOF | \
qemu-system-x86_64 \
-display none -machine accel=qtest -m 512M -machine q35 -nodefaults \
-device virtio-iommu -qtest stdio
outl 0xcf8 0x8804
outw 0xcfc 0x06
outl 0xcf8 0x8820
outl 0xcfc 0xe0004000
write 0x1e 0x1 0x01
write 0xe0004020 0x4 0x1000
write 0xe0004028 0x4 0x00101000
write 0xe000401c 0x1 0x01
write 0x106000 0x1 0x05
write 0x11 0x1 0x60
write 0x12 0x1 0x10
write 0x19 0x1 0x04
write 0x1c 0x1 0x01
write 0x100018 0x1 0x04
write 0x10001c 0x1 0x02
write 0x101003 0x1 0x01
write 0xe0007001 0x1 0x00
EOF

Reported-by: Zheyu Ma 
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2359
Signed-off-by: Manos Pitsidianakis 
---
  hw/virtio/virtio-iommu.c | 12 
  1 file changed, 12 insertions(+)

diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 1326c6ec41..9b99def39f 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -818,6 +818,18 @@ static void virtio_iommu_handle_command(VirtIODevice 
*vdev, VirtQueue *vq)
  out:
  sz = iov_from_buf(elem->in_sg, elem->in_num, 0,
buf ? buf : &tail, output_size);
+if (unlikely(sz != output_size)) {


Is this a normal guest behavior? Should we log it as GUEST_ERROR?


+tail.status = VIRTIO_IOMMU_S_DEVERR;
+/* We checked that tail can fit earlier */
+output_size = sizeof(tail);
+g_free(buf);
+buf = NULL;
+sz = iov_from_buf(elem->in_sg,
+  elem->in_num,
+  0,
+  &tail,
+  output_size);
+}
  assert(sz == output_size);
  
  virtqueue_push(vq, elem, sz);


base-commit: 80e8f0602168f451a93e71cbb1d59e93d745e62e
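
The fallback in the hunk above can be sketched outside QEMU. This is only an
illustration of the pattern (try the full reply, and if the guest buffer is
too small, send just an error status); copy_out(), send_reply(), struct tail
and the STATUS_* values are hypothetical stand-ins, not the virtio-iommu
definitions or the iov_from_buf() API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical status codes; not the real VIRTIO_IOMMU_S_* values. */
enum { STATUS_OK = 0, STATUS_DEVERR = 2 };

/* Hypothetical reply trailer carrying only a status byte plus padding. */
struct tail {
    uint8_t status;
    uint8_t reserved[3];
};

/* Stand-in for a scatter-gather copy: copy at most cap bytes, return count. */
static size_t copy_out(void *dst, size_t cap, const void *src, size_t len)
{
    size_t n = len < cap ? len : cap;
    memcpy(dst, src, n);
    return n;
}

/*
 * Write the full reply if it fits. Otherwise overwrite the start of the
 * guest buffer with an error tail. The caller is assumed to have checked
 * earlier that cap >= sizeof(struct tail), as the patch comment states.
 */
static size_t send_reply(void *guest, size_t cap,
                         const void *reply, size_t reply_len)
{
    size_t sz = copy_out(guest, cap, reply, reply_len);

    if (sz != reply_len) {
        struct tail t = { .status = STATUS_DEVERR };
        sz = copy_out(guest, cap, &t, sizeof(t));
        assert(sz == sizeof(t));
    }
    return sz;
}
```

With a 4-byte guest buffer and an 8-byte reply, send_reply() returns 4 and the
buffer starts with STATUS_DEVERR instead of tripping an assertion.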





Re: [PATCH v2 3/3] hw/arm/virt: allow creation of a second NonSecure UART

2024-06-11 Thread Philippe Mathieu-Daudé

On 10/6/24 18:23, Peter Maydell wrote:

For some use-cases, it is helpful to have more than one UART
available to the guest.  If the second UART slot is not already used
for a TrustZone Secure-World-only UART, create it as a NonSecure UART
only when the user provides a serial backend (e.g.  via a second
-serial command line option).

This avoids problems where existing guest software only expects a
single UART, and gets confused by the second UART in the DTB.  The
major example of this is older EDK2 firmware, which will send the
GRUB bootloader output to UART1 and the guest serial output to UART0.
Users who want to use both UARTs with a guest setup including EDK2
are advised to update to EDK2 release edk2-stable202311 or newer.
(The prebuilt EDK2 blobs QEMU upstream provides are new enough.)
The relevant EDK2 changes are the ones described here:
https://bugzilla.tianocore.org/show_bug.cgi?id=4577

Inspired-by: Axel Heider 
Signed-off-by: Peter Maydell 
Tested-by: Laszlo Ersek 
---
  docs/system/arm/virt.rst |  6 +-
  include/hw/arm/virt.h|  1 +
  hw/arm/virt-acpi-build.c | 12 
  hw/arm/virt.c| 38 +++---
  4 files changed, 49 insertions(+), 8 deletions(-)


Reviewed-by: Philippe Mathieu-Daudé 



Re: [PATCH v4 2/4] vvfat: Fix usage of `info.file.offset`

2024-06-11 Thread Kevin Wolf
Am 11.06.2024 um 14:31 hat Amjad Alsharafi geschrieben:
> On Mon, Jun 10, 2024 at 06:49:43PM +0200, Kevin Wolf wrote:
> > Am 05.06.2024 um 02:58 hat Amjad Alsharafi geschrieben:
> > > The field is marked as "the offset in the file (in clusters)", but it
> > > was being used like this
> > > `cluster_size*(nums)+mapping->info.file.offset`, which is incorrect.
> > > 
> > > Additionally, removed the `abort` when `first_mapping_index` does not
> > > match, as this matches the case when adding new clusters for files, and
> > > it's inevitable that we reach this condition when doing that if the
> > > clusters are not after one another, so there is no reason to `abort`
> > > here, execution continues and the new clusters are written to disk
> > > correctly.
> > > 
> > > Signed-off-by: Amjad Alsharafi 
> > 
> > Can you help me understand how first_mapping_index really works?
> > 
> > It seems to me that you get a chain of mappings for each file on the FAT
> > filesystem, which are just the contiguous areas in it, and
> > first_mapping_index refers to the mapping at the start of the file. But
> > for much of the time, it actually doesn't seem to be set at all, so you
> > have mapping->first_mapping_index == -1. Do you understand the rules
> > around when it's set and when it isn't?
> 
> Yeah. So `first_mapping_index` is the index of the first mapping; each
> mapping is a group of clusters that are contiguous in the file.
> It's mostly `-1` because the first mapping will have the value set as
> `-1` and not its own index. This value will only be set when the file
> contains more than one mapping, and this will only happen when you add
> clusters to a file that are not contiguous with the existing clusters.

Ah, that makes some sense. Not sure if it's optimal, but it's a rule I
can work with. So just to confirm, this is the invariant that we think
should always hold true, right?

assert((mapping->mode & MODE_DIRECTORY) ||
   !mapping->info.file.offset ||
   mapping->first_mapping_index > 0);

> And actually, thanks to that I noticed another bug not fixed in PATCH 3, 
> We are doing this check 
> `s->current_mapping->first_mapping_index != mapping->first_mapping_index`
> to know if we should switch to the new mapping or not. 
> If we were reading from the first mapping (`first_mapping_index == -1`)
> and we jumped to the second mapping (`first_mapping_index == n`), we
> will catch this condition and switch to the new mapping.
> 
> But if the file has more than 2 mappings, and we jumped to the 3rd
> mapping, we will not catch this since (`first_mapping_index == n`) for
> both of them haha. I think a better check is to check the `mapping`
> pointer directly. (I'll add it also in the next series together with a
> test for it.)

This comparison is exactly what confused me. I didn't realise that the
first mapping in the chain has a different value here, so I thought this
must mean that we're looking at a different file now - but of course I
couldn't see a reason for that because we're iterating through a single
file in this function.
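
To make the quoted observation concrete, here is a minimal sketch, assuming
the rule described above (the first mapping of a file carries -1, every later
mapping carries the index of the first one). struct mapping and both helper
functions are hypothetical simplifications, not the vvfat code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, stripped-down mapping: only the field under discussion. */
struct mapping {
    int first_mapping_index; /* -1 for the first mapping of a file */
};

/* The comparison discussed above: looks only at first_mapping_index. */
static bool switched_by_index(const struct mapping *cur,
                              const struct mapping *next)
{
    return cur->first_mapping_index != next->first_mapping_index;
}

/* The proposed alternative: compare the mapping pointers themselves. */
static bool switched_by_pointer(const struct mapping *cur,
                                const struct mapping *next)
{
    return cur != next;
}
```

For a file with three mappings holding -1, n and n, the index comparison
detects the step from the first to the second mapping but not the step from
the second to the third, while the pointer comparison detects both.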

But even now that I know that the condition triggers when switching from
the first to the second mapping, it doesn't make sense to me. We don't
have to copy things around just because a file is non-contiguous.

What we want to catch is if the order of mappings has changed compared
to the old state. Do we need a linked list, maybe a prev_mapping_index,
instead of first_mapping_index so that we can compare if it is still the
same as before?

Or actually, I suppose that's the first block with an abort() in the
code, just that it doesn't compare mappings, but their offsets.
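
For reference, the offset arithmetic that the patch description talks about
can be sketched as follows, assuming info.file.offset really is counted in
clusters as its comment says. struct mapping here is a hypothetical
two-field simplification and cluster_to_file_offset() is an illustrative
name, not a vvfat function.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified mapping: one contiguous run of a file. */
struct mapping {
    uint32_t begin;       /* first cluster number covered by this mapping */
    uint32_t file_offset; /* start of the run in the host file, in clusters */
};

/*
 * Byte offset in the host file for cluster_num. Both terms inside the
 * parentheses are in clusters, so the sum is scaled by cluster_size once.
 */
static int64_t cluster_to_file_offset(const struct mapping *m,
                                      uint32_t cluster_size,
                                      uint32_t cluster_num)
{
    return (int64_t)cluster_size *
           ((cluster_num - m->begin) + m->file_offset);
}
```

With 4096-byte clusters, a mapping that begins at cluster 10 and sits 3
clusters into the file places cluster 12 at byte 20480. Adding the unscaled
offset instead, as the old expression did, would give 8195.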

> > 
> > >  block/vvfat.c | 12 +++-
> > >  1 file changed, 7 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/block/vvfat.c b/block/vvfat.c
> > > index 19da009a5b..f0642ac3e4 100644
> > > --- a/block/vvfat.c
> > > +++ b/block/vvfat.c
> > > @@ -1408,7 +1408,9 @@ read_cluster_directory:
> > >  
> > >  assert(s->current_fd);
> > >  
> > > -
> > > offset=s->cluster_size*(cluster_num-s->current_mapping->begin)+s->current_mapping->info.file.offset;
> > > +offset = s->cluster_size *
> > > +((cluster_num - s->current_mapping->begin)
> > > ++ s->current_mapping->info.file.offset);
> > >  if(lseek(s->current_fd, offset, SEEK_SET)!=offset)
> > >  return -3;
> > >  s->cluster=s->cluster_buffer;
> > > @@ -1929,8 +1931,9 @@ get_cluster_count_for_direntry(BDRVVVFATState* s, 
> > > direntry_t* direntry, const ch
> > >  (mapping->mode & MODE_DIRECTORY) == 0) {
> > >  
> > >  /* was modified in qcow */
> > > -if (offset != mapping->info.file.offset + 
> > > s->cluster_size
> > > -* (cluster_num - mapping->begin)) {
> > > +if (offset != s->cluster_size
> > > +* ((cluster_num - mapping->begin)
> > > +  

[PULL 15/25] target/i386: finish converting 0F AE to the new decoder

2024-06-11 Thread Paolo Bonzini
This is already partly implemented due to VLDMXCSR and VSTMXCSR; finish
the job.

Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/decode-new.h |   7 ++
 target/i386/tcg/translate.c  | 188 ---
 target/i386/tcg/decode-new.c.inc |  48 +++-
 target/i386/tcg/emit.c.inc   |  80 +
 4 files changed, 129 insertions(+), 194 deletions(-)

diff --git a/target/i386/tcg/decode-new.h b/target/i386/tcg/decode-new.h
index 8465717ea21..5577f7509aa 100644
--- a/target/i386/tcg/decode-new.h
+++ b/target/i386/tcg/decode-new.h
@@ -108,10 +108,15 @@ typedef enum X86CPUIDFeature {
 X86_FEAT_AVX2,
 X86_FEAT_BMI1,
 X86_FEAT_BMI2,
+X86_FEAT_CLFLUSH,
+X86_FEAT_CLFLUSHOPT,
+X86_FEAT_CLWB,
 X86_FEAT_CMOV,
 X86_FEAT_CMPCCXADD,
 X86_FEAT_F16C,
 X86_FEAT_FMA,
+X86_FEAT_FSGSBASE,
+X86_FEAT_FXSR,
 X86_FEAT_MOVBE,
 X86_FEAT_PCLMULQDQ,
 X86_FEAT_SHA_NI,
@@ -122,6 +127,8 @@ typedef enum X86CPUIDFeature {
 X86_FEAT_SSE41,
 X86_FEAT_SSE42,
 X86_FEAT_SSE4A,
+X86_FEAT_XSAVE,
+X86_FEAT_XSAVEOPT,
 } X86CPUIDFeature;
 
 /* Execution flags */
diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 4958f4c45d5..ebae745ecba 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -4197,194 +4197,6 @@ static void disas_insn_old(DisasContext *s, CPUState 
*cpu, int b)
 s->base.is_jmp = DISAS_EOB_NEXT;
 }
 break;
-/* MMX/3DNow!/SSE/SSE2/SSE3/SSSE3/SSE4 support */
-case 0x1ae:
-modrm = x86_ldub_code(env, s);
-switch (modrm) {
-CASE_MODRM_MEM_OP(0): /* fxsave */
-if (!(s->cpuid_features & CPUID_FXSR)
-|| (prefixes & PREFIX_LOCK)) {
-goto illegal_op;
-}
-if ((s->flags & HF_EM_MASK) || (s->flags & HF_TS_MASK)) {
-gen_exception(s, EXCP07_PREX);
-break;
-}
-gen_lea_modrm(env, s, modrm);
-gen_helper_fxsave(tcg_env, s->A0);
-break;
-
-CASE_MODRM_MEM_OP(1): /* fxrstor */
-if (!(s->cpuid_features & CPUID_FXSR)
-|| (prefixes & PREFIX_LOCK)) {
-goto illegal_op;
-}
-if ((s->flags & HF_EM_MASK) || (s->flags & HF_TS_MASK)) {
-gen_exception(s, EXCP07_PREX);
-break;
-}
-gen_lea_modrm(env, s, modrm);
-gen_helper_fxrstor(tcg_env, s->A0);
-break;
-
-CASE_MODRM_MEM_OP(2): /* ldmxcsr */
-if ((s->flags & HF_EM_MASK) || !(s->flags & HF_OSFXSR_MASK)) {
-goto illegal_op;
-}
-if (s->flags & HF_TS_MASK) {
-gen_exception(s, EXCP07_PREX);
-break;
-}
-gen_lea_modrm(env, s, modrm);
-tcg_gen_qemu_ld_i32(s->tmp2_i32, s->A0, s->mem_index, MO_LEUL);
-gen_helper_ldmxcsr(tcg_env, s->tmp2_i32);
-break;
-
-CASE_MODRM_MEM_OP(3): /* stmxcsr */
-if ((s->flags & HF_EM_MASK) || !(s->flags & HF_OSFXSR_MASK)) {
-goto illegal_op;
-}
-if (s->flags & HF_TS_MASK) {
-gen_exception(s, EXCP07_PREX);
-break;
-}
-gen_helper_update_mxcsr(tcg_env);
-gen_lea_modrm(env, s, modrm);
-tcg_gen_ld32u_tl(s->T0, tcg_env, offsetof(CPUX86State, mxcsr));
-gen_op_st_v(s, MO_32, s->T0, s->A0);
-break;
-
-CASE_MODRM_MEM_OP(4): /* xsave */
-if ((s->cpuid_ext_features & CPUID_EXT_XSAVE) == 0
-|| (prefixes & (PREFIX_LOCK | PREFIX_DATA
-| PREFIX_REPZ | PREFIX_REPNZ))) {
-goto illegal_op;
-}
-gen_lea_modrm(env, s, modrm);
-tcg_gen_concat_tl_i64(s->tmp1_i64, cpu_regs[R_EAX],
-  cpu_regs[R_EDX]);
-gen_helper_xsave(tcg_env, s->A0, s->tmp1_i64);
-break;
-
-CASE_MODRM_MEM_OP(5): /* xrstor */
-if ((s->cpuid_ext_features & CPUID_EXT_XSAVE) == 0
-|| (prefixes & (PREFIX_LOCK | PREFIX_DATA
-| PREFIX_REPZ | PREFIX_REPNZ))) {
-goto illegal_op;
-}
-gen_lea_modrm(env, s, modrm);
-tcg_gen_concat_tl_i64(s->tmp1_i64, cpu_regs[R_EAX],
-  cpu_regs[R_EDX]);
-gen_helper_xrstor(tcg_env, s->A0, s->tmp1_i64);
-/* XRSTOR is how MPX is enabled, which changes how
-   we translate.  Thus we need to end the TB.  */
-s->base.is_jmp = DISAS_EOB_NEXT;
-break;
-
-CASE_MODRM_MEM_OP(6): /* xsaveopt / clwb */
-if (prefixes & PREFIX_LOCK) {
-goto illegal_op;
-}
-   

[PULL 18/25] target/i386: convert non-grouped, helper-based 2-byte opcodes

2024-06-11 Thread Paolo Bonzini
These have very simple generators and no need for complex group
decoding.  Apart from LAR/LSL which are simplified to use
gen_op_deposit_reg_v and movcond, the code is generally lifted
from translate.c into the generators.
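
As a rough illustration of the seg_helper.c part of this change, the sketch
below updates only the zero flag inside an already materialized flags word,
which is what "CC_SRC |= CC_Z" and "CC_SRC &= ~CC_Z" do once the flags are
known to be in CC_OP_EFLAGS form. FLAG_Z, update_zf() and selector_is_null()
are hypothetical names; only the bit position of ZF (bit 6 of EFLAGS) and the
0xfffc mask are taken from the architecture and the quoted code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* ZF is bit 6 of EFLAGS; the macro name is made up for this sketch. */
#define FLAG_Z 0x40u

/* Set or clear ZF and leave every other flag bit untouched. */
static uint32_t update_zf(uint32_t flags, bool success)
{
    return success ? (flags | FLAG_Z) : (flags & ~FLAG_Z);
}

/* A selector whose index and table-indicator bits are all zero is null. */
static bool selector_is_null(uint32_t selector)
{
    return (selector & 0xfffc) == 0;
}
```

A failed lookup therefore clears ZF and keeps, for example, CF as it was,
without recomputing the whole flags word.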

Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/decode-new.h |   7 ++
 target/i386/tcg/seg_helper.c |  16 ++--
 target/i386/tcg/translate.c  | 148 -
 target/i386/tcg/decode-new.c.inc |  48 +++---
 target/i386/tcg/emit.c.inc   | 157 ++-
 5 files changed, 206 insertions(+), 170 deletions(-)

diff --git a/target/i386/tcg/decode-new.h b/target/i386/tcg/decode-new.h
index b46a9a0ccb3..c9f958bb0e5 100644
--- a/target/i386/tcg/decode-new.h
+++ b/target/i386/tcg/decode-new.h
@@ -170,6 +170,13 @@ typedef enum X86InsnCheck {
 /* Fault outside protected mode, possibly including vm86 mode */
 X86_CHECK_prot_or_vm86 = 512,
 X86_CHECK_prot = X86_CHECK_prot_or_vm86 | X86_CHECK_no_vm86,
+
+/* Fault outside SMM */
+X86_CHECK_smm = 1024,
+
+/* Vendor-specific checks for Intel/AMD differences */
+X86_CHECK_i64_amd = 2048,
+X86_CHECK_o64_intel = 4096,
 } X86InsnCheck;
 
 typedef enum X86InsnSpecial {
diff --git a/target/i386/tcg/seg_helper.c b/target/i386/tcg/seg_helper.c
index 715db1f2326..aee3d19f29b 100644
--- a/target/i386/tcg/seg_helper.c
+++ b/target/i386/tcg/seg_helper.c
@@ -2265,11 +2265,11 @@ void helper_sysexit(CPUX86State *env, int dflag)
 target_ulong helper_lsl(CPUX86State *env, target_ulong selector1)
 {
 unsigned int limit;
-uint32_t e1, e2, eflags, selector;
+uint32_t e1, e2, selector;
 int rpl, dpl, cpl, type;
 
 selector = selector1 & 0xffff;
-eflags = cpu_cc_compute_all(env);
+assert(CC_OP == CC_OP_EFLAGS);
 if ((selector & 0xfffc) == 0) {
 goto fail;
 }
@@ -2301,22 +2301,22 @@ target_ulong helper_lsl(CPUX86State *env, target_ulong 
selector1)
 }
 if (dpl < cpl || dpl < rpl) {
 fail:
-CC_SRC = eflags & ~CC_Z;
+CC_SRC &= ~CC_Z;
 return 0;
 }
 }
 limit = get_seg_limit(e1, e2);
-CC_SRC = eflags | CC_Z;
+CC_SRC |= CC_Z;
 return limit;
 }
 
 target_ulong helper_lar(CPUX86State *env, target_ulong selector1)
 {
-uint32_t e1, e2, eflags, selector;
+uint32_t e1, e2, selector;
 int rpl, dpl, cpl, type;
 
 selector = selector1 & 0xffff;
-eflags = cpu_cc_compute_all(env);
+assert(CC_OP == CC_OP_EFLAGS);
 if ((selector & 0xfffc) == 0) {
 goto fail;
 }
@@ -2351,11 +2351,11 @@ target_ulong helper_lar(CPUX86State *env, target_ulong 
selector1)
 }
 if (dpl < cpl || dpl < rpl) {
 fail:
-CC_SRC = eflags & ~CC_Z;
+CC_SRC &= ~CC_Z;
 return 0;
 }
 }
-CC_SRC = eflags | CC_Z;
+CC_SRC |= CC_Z;
 return e2 & 0x00f0ff00;
 }
 
diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index ebae745ecba..4b2f7488022 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -246,7 +246,6 @@ STUB_HELPER(mwait, TCGv_env env, TCGv_i32 pc_ofs)
 STUB_HELPER(outb, TCGv_env env, TCGv_i32 port, TCGv_i32 val)
 STUB_HELPER(outw, TCGv_env env, TCGv_i32 port, TCGv_i32 val)
 STUB_HELPER(outl, TCGv_env env, TCGv_i32 port, TCGv_i32 val)
-STUB_HELPER(rdmsr, TCGv_env env)
 STUB_HELPER(stgi, TCGv_env env)
 STUB_HELPER(svm_check_intercept, TCGv_env env, TCGv_i32 type)
 STUB_HELPER(vmload, TCGv_env env, TCGv_i32 aflag)
@@ -254,7 +253,6 @@ STUB_HELPER(vmmcall, TCGv_env env)
 STUB_HELPER(vmrun, TCGv_env env, TCGv_i32 aflag, TCGv_i32 pc_ofs)
 STUB_HELPER(vmsave, TCGv_env env, TCGv_i32 aflag)
 STUB_HELPER(write_crN, TCGv_env env, TCGv_i32 reg, TCGv val)
-STUB_HELPER(wrmsr, TCGv_env env)
 #endif
 
 static void gen_jmp_rel(DisasContext *s, MemOp ot, int diff, int tb_num);
@@ -3470,97 +3468,6 @@ static void disas_insn_old(DisasContext *s, CPUState 
*cpu, int b)
 }
 gen_op_mov_reg_v(s, ot, reg, s->T0);
 break;
-case 0x130: /* wrmsr */
-case 0x132: /* rdmsr */
-if (check_cpl0(s)) {
-gen_update_cc_op(s);
-gen_update_eip_cur(s);
-if (b & 2) {
-gen_helper_rdmsr(tcg_env);
-} else {
-gen_helper_wrmsr(tcg_env);
-s->base.is_jmp = DISAS_EOB_NEXT;
-}
-}
-break;
-case 0x131: /* rdtsc */
-gen_update_cc_op(s);
-gen_update_eip_cur(s);
-translator_io_start(&s->base);
-gen_helper_rdtsc(tcg_env);
-break;
-case 0x133: /* rdpmc */
-gen_update_cc_op(s);
-gen_update_eip_cur(s);
-gen_helper_rdpmc(tcg_env);
-s->base.is_jmp = DISAS_NORETURN;
-break;
-case 0x134: /* sysenter */
-/* For AMD SYSENTER is not valid in long mode */
-if (LMA(s) && env->cpuid_vendor1 != 

[PULL 20/25] target/i386: adapt gen_shift_count for SHLD/SHRD

2024-06-11 Thread Paolo Bonzini
SHLD/SHRD can have 3 register operands - s->T0, s->T1 and either
1 or CL - and therefore decode->op[2] is taken by the low part
of the register being shifted.  Pass X86_OP_* to gen_shift_count
from its current callers and hardcode cpu_regs[R_ECX] as the
shift count.
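
The masking that gen_shift_count applies to a count taken from CL can be
sketched in plain C. This is a simplified model, not the TCG code: enum
operand_size and masked_shift_count() are hypothetical names, and only the
0x1f / 0x3f masks are taken from the function shown in the diff.

```c
#include <assert.h>

/* Hypothetical operand sizes, ordered like MO_8..MO_64. */
enum operand_size { SZ_8, SZ_16, SZ_32, SZ_64 };

/*
 * Mask a shift count the way x86 does: 5 bits for operands up to 32 bits,
 * 6 bits for 64-bit operands. The result can be zero, in which case the
 * instruction must leave the flags unchanged.
 */
static unsigned masked_shift_count(enum operand_size sz, unsigned cl)
{
    unsigned mask = (sz <= SZ_32) ? 0x1f : 0x3f;
    return cl & mask;
}
```

With a 32-bit operand a CL value of 33 is reduced to 1, while with a 64-bit
operand it stays 33.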

Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/emit.c.inc | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/target/i386/tcg/emit.c.inc b/target/i386/tcg/emit.c.inc
index 92635f53cf4..156ea282af4 100644
--- a/target/i386/tcg/emit.c.inc
+++ b/target/i386/tcg/emit.c.inc
@@ -2874,16 +2874,16 @@ static void gen_PUSHF(DisasContext *s, X86DecodedInsn 
*decode)
 }
 
 static MemOp gen_shift_count(DisasContext *s, X86DecodedInsn *decode,
- bool *can_be_zero, TCGv *count)
+ bool *can_be_zero, TCGv *count, int unit)
 {
 MemOp ot = decode->op[0].ot;
 int mask = (ot <= MO_32 ? 0x1f : 0x3f);
 
 *can_be_zero = false;
-switch (decode->op[2].unit) {
+switch (unit) {
 case X86_OP_INT:
 *count = tcg_temp_new();
-tcg_gen_andi_tl(*count, s->T1, mask);
+tcg_gen_andi_tl(*count, cpu_regs[R_ECX], mask);
 *can_be_zero = true;
 break;
 
@@ -3068,7 +3068,7 @@ static void gen_RCL(DisasContext *s, X86DecodedInsn 
*decode)
 bool have_1bit_cin, can_be_zero;
 TCGv count;
 TCGLabel *zero_label = NULL;
-MemOp ot = gen_shift_count(s, decode, &can_be_zero, &count);
+MemOp ot = gen_shift_count(s, decode, &can_be_zero, &count,
decode->op[2].unit);
 TCGv low, high, low_count;
 
 if (!count) {
@@ -3120,7 +3120,7 @@ static void gen_RCR(DisasContext *s, X86DecodedInsn 
*decode)
 bool have_1bit_cin, can_be_zero;
 TCGv count;
 TCGLabel *zero_label = NULL;
-MemOp ot = gen_shift_count(s, decode, &can_be_zero, &count);
+MemOp ot = gen_shift_count(s, decode, &can_be_zero, &count,
decode->op[2].unit);
 TCGv low, high, high_count;
 
 if (!count) {
@@ -3298,7 +3298,7 @@ static void gen_ROL(DisasContext *s, X86DecodedInsn 
*decode)
 {
 bool can_be_zero;
 TCGv count;
-MemOp ot = gen_shift_count(s, decode, &can_be_zero, &count);
+MemOp ot = gen_shift_count(s, decode, &can_be_zero, &count,
decode->op[2].unit);
 TCGv_i32 temp32, count32;
 TCGv old = tcg_temp_new();
 
@@ -3326,7 +3326,7 @@ static void gen_ROR(DisasContext *s, X86DecodedInsn 
*decode)
 {
 bool can_be_zero;
 TCGv count;
-MemOp ot = gen_shift_count(s, decode, &can_be_zero, &count);
+MemOp ot = gen_shift_count(s, decode, &can_be_zero, &count,
decode->op[2].unit);
 TCGv_i32 temp32, count32;
 TCGv old = tcg_temp_new();
 
@@ -3438,7 +3438,7 @@ static void gen_SAR(DisasContext *s, X86DecodedInsn 
*decode)
 {
 bool can_be_zero;
 TCGv count;
-MemOp ot = gen_shift_count(s, decode, &can_be_zero, &count);
+MemOp ot = gen_shift_count(s, decode, &can_be_zero, &count,
decode->op[2].unit);
 
 if (!count) {
 return;
@@ -3566,7 +3566,7 @@ static void gen_SHL(DisasContext *s, X86DecodedInsn 
*decode)
 {
 bool can_be_zero;
 TCGv count;
-MemOp ot = gen_shift_count(s, decode, &can_be_zero, &count);
+MemOp ot = gen_shift_count(s, decode, &can_be_zero, &count,
decode->op[2].unit);
 
 if (!count) {
 return;
@@ -3598,7 +3598,7 @@ static void gen_SHR(DisasContext *s, X86DecodedInsn 
*decode)
 {
 bool can_be_zero;
 TCGv count;
-MemOp ot = gen_shift_count(s, decode, &can_be_zero, &count);
+MemOp ot = gen_shift_count(s, decode, &can_be_zero, &count,
decode->op[2].unit);
 
 if (!count) {
 return;
-- 
2.45.1




[PULL 13/25] target/i386: convert MOV from/to CR and DR to new decoder

2024-06-11 Thread Paolo Bonzini
Complete implementation of C and D operand types, then the operations
are just MOVs.
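
The register numbering that the removed translate.c code computes by hand can
be sketched as below. It is a minimal model under stated assumptions:
decode_modrm() and struct modrm_fields are hypothetical, and the REX.R and
REX.B bits are passed as booleans that contribute 8 to the register number.
As the removed comment notes, MOV to and from CRn/DRn ignores the mod field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the three ModRM fields. */
struct modrm_fields {
    unsigned mod; /* bits 7:6, ignored by MOV CRn/DRn */
    unsigned reg; /* bits 5:3, extended by REX.R: the CRn/DRn number */
    unsigned rm;  /* bits 2:0, extended by REX.B: the general register */
};

/* Split a ModRM byte and apply the REX extension bits. */
static struct modrm_fields decode_modrm(uint8_t modrm, bool rex_r, bool rex_b)
{
    struct modrm_fields f;

    f.mod = modrm >> 6;
    f.reg = ((modrm >> 3) & 7) | (rex_r ? 8 : 0);
    f.rm = (modrm & 7) | (rex_b ? 8 : 0);
    return f;
}
```

For the ModRM byte 0xd8 this yields reg 3 and rm 0, and with REX.R set the
reg field becomes 11.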

Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c  | 79 
 target/i386/tcg/decode-new.c.inc | 53 +++--
 target/i386/tcg/emit.c.inc   | 20 +++-
 3 files changed, 68 insertions(+), 84 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index fcba9c155f9..4958f4c45d5 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -247,9 +247,6 @@ STUB_HELPER(outb, TCGv_env env, TCGv_i32 port, TCGv_i32 val)
 STUB_HELPER(outw, TCGv_env env, TCGv_i32 port, TCGv_i32 val)
 STUB_HELPER(outl, TCGv_env env, TCGv_i32 port, TCGv_i32 val)
 STUB_HELPER(rdmsr, TCGv_env env)
-STUB_HELPER(read_crN, TCGv ret, TCGv_env env, TCGv_i32 reg)
-STUB_HELPER(get_dr, TCGv ret, TCGv_env env, TCGv_i32 reg)
-STUB_HELPER(set_dr, TCGv_env env, TCGv_i32 reg, TCGv val)
 STUB_HELPER(stgi, TCGv_env env)
 STUB_HELPER(svm_check_intercept, TCGv_env env, TCGv_i32 type)
 STUB_HELPER(vmload, TCGv_env env, TCGv_i32 aflag)
@@ -4192,82 +4189,6 @@ static void disas_insn_old(DisasContext *s, CPUState 
*cpu, int b)
 gen_nop_modrm(env, s, modrm);
 break;
 
-case 0x120: /* mov reg, crN */
-case 0x122: /* mov crN, reg */
-if (!check_cpl0(s)) {
-break;
-}
-modrm = x86_ldub_code(env, s);
-/*
- * Ignore the mod bits (assume (modrm&0xc0)==0xc0).
- * AMD documentation (24594.pdf) and testing of Intel 386 and 486
- * processors all show that the mod bits are assumed to be 1's,
- * regardless of actual values.
- */
-rm = (modrm & 7) | REX_B(s);
-reg = ((modrm >> 3) & 7) | REX_R(s);
-switch (reg) {
-case 0:
-if ((prefixes & PREFIX_LOCK) &&
-(s->cpuid_ext3_features & CPUID_EXT3_CR8LEG)) {
-reg = 8;
-}
-break;
-case 2:
-case 3:
-case 4:
-case 8:
-break;
-default:
-goto unknown_op;
-}
-ot  = (CODE64(s) ? MO_64 : MO_32);
-
-translator_io_start(&s->base);
-if (b & 2) {
-gen_svm_check_intercept(s, SVM_EXIT_WRITE_CR0 + reg);
-gen_op_mov_v_reg(s, ot, s->T0, rm);
-gen_helper_write_crN(tcg_env, tcg_constant_i32(reg), s->T0);
-s->base.is_jmp = DISAS_EOB_NEXT;
-} else {
-gen_svm_check_intercept(s, SVM_EXIT_READ_CR0 + reg);
-gen_helper_read_crN(s->T0, tcg_env, tcg_constant_i32(reg));
-gen_op_mov_reg_v(s, ot, rm, s->T0);
-}
-break;
-
-case 0x121: /* mov reg, drN */
-case 0x123: /* mov drN, reg */
-if (check_cpl0(s)) {
-modrm = x86_ldub_code(env, s);
-/* Ignore the mod bits (assume (modrm&0xc0)==0xc0).
- * AMD documentation (24594.pdf) and testing of
- * intel 386 and 486 processors all show that the mod bits
- * are assumed to be 1's, regardless of actual values.
- */
-rm = (modrm & 7) | REX_B(s);
-reg = ((modrm >> 3) & 7) | REX_R(s);
-if (CODE64(s))
-ot = MO_64;
-else
-ot = MO_32;
-if (reg >= 8) {
-goto illegal_op;
-}
-if (b & 2) {
-gen_svm_check_intercept(s, SVM_EXIT_WRITE_DR0 + reg);
-gen_op_mov_v_reg(s, ot, s->T0, rm);
-tcg_gen_movi_i32(s->tmp2_i32, reg);
-gen_helper_set_dr(tcg_env, s->tmp2_i32, s->T0);
-s->base.is_jmp = DISAS_EOB_NEXT;
-} else {
-gen_svm_check_intercept(s, SVM_EXIT_READ_DR0 + reg);
-tcg_gen_movi_i32(s->tmp2_i32, reg);
-gen_helper_get_dr(s->T0, tcg_env, s->tmp2_i32);
-gen_op_mov_reg_v(s, ot, rm, s->T0);
-}
-}
-break;
 case 0x106: /* clts */
 if (check_cpl0(s)) {
 gen_svm_check_intercept(s, SVM_EXIT_WRITE_CR0);
diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc
index cd925fe3589..4c567911f41 100644
--- a/target/i386/tcg/decode-new.c.inc
+++ b/target/i386/tcg/decode-new.c.inc
@@ -151,6 +151,8 @@
 X86_OP_GROUP3(op, op0, s0, 2op, s0, op1, s1, ## __VA_ARGS__)
 #define X86_OP_GROUPw(op, op0, s0, ...)   \
 X86_OP_GROUP3(op, op0, s0, None, None, None, None, ## __VA_ARGS__)
+#define X86_OP_GROUPwr(op, op0, s0, op1, s1, ...) \
+X86_OP_GROUP3(op, op0, s0, op1, s1, None, None, ## __VA_ARGS__)
 #define X86_OP_GROUP0(op, ...)\
 X86_OP_GROUP3(op, None, None, None, None, None, None, ## __VA_ARGS__)
 
@@ -985,6 +987,24 @@ static void decode_0FE6(DisasContext *s, CPUX86State *env, 
X86OpEntry *entry, ui

[PULL 08/25] target/i386: put BLS* input in T1, use generic flag writeback

2024-06-11 Thread Paolo Bonzini
This makes for easier cpu_cc_* setup, and not using set_cc_op()
should come in handy if QEMU ever implements APX.
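
For readers who do not have the BMI1 definitions in mind, the three
operations touched by this patch reduce to the following bit tricks. This is
a plain C sketch of the arithmetic only; the function names are chosen here
for illustration, and the flag computation that the patch is actually about
is not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* BLSI: isolate the lowest set bit, x & -x. */
static uint64_t blsi(uint64_t x)
{
    return x & (0 - x);
}

/* BLSMSK: mask up to and including the lowest set bit, x ^ (x - 1). */
static uint64_t blsmsk(uint64_t x)
{
    return x ^ (x - 1);
}

/* BLSR: clear the lowest set bit, x & (x - 1). */
static uint64_t blsr(uint64_t x)
{
    return x & (x - 1);
}
```

For x = 0x28 the results are 0x08, 0x0f and 0x20. Each function combines the
input with a value derived from that same input, which matches the diff
keeping the source in T1 and writing the derived value to T0.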

Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/decode-new.c.inc |  4 ++--
 target/i386/tcg/emit.c.inc   | 24 +---
 2 files changed, 11 insertions(+), 17 deletions(-)

diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc
index e7d88020481..380fb793531 100644
--- a/target/i386/tcg/decode-new.c.inc
+++ b/target/i386/tcg/decode-new.c.inc
@@ -633,7 +633,7 @@ static const X86OpEntry opcodes_0F38_F0toFF[16][5] = {
 {},
 },
 [3] = {
-X86_OP_GROUP3(group17, B,y, E,y, None,None, vex13 cpuid(BMI1)),
+X86_OP_GROUP3(group17, B,y, None,None, E,y, vex13 cpuid(BMI1)),
 {},
 {},
 {},
@@ -2604,7 +2604,7 @@ static void disas_insn(DisasContext *s, CPUState *cpu)
 }
 
 /*
- * Write back flags after last memory access.  Some newer ALU 
instructions, as
+ * Write back flags after last memory access.  Some older ALU 
instructions, as
  * well as SSE instructions, write flags in the gen_* function, but that 
can
  * cause incorrect tracking of CC_OP for instructions that write to both 
memory
  * and flags.
diff --git a/target/i386/tcg/emit.c.inc b/target/i386/tcg/emit.c.inc
index 2041ea9d04a..a25b3dfc6b5 100644
--- a/target/i386/tcg/emit.c.inc
+++ b/target/i386/tcg/emit.c.inc
@@ -1272,40 +1272,34 @@ static void gen_BEXTR(DisasContext *s, X86DecodedInsn 
*decode)
 prepare_update1_cc(decode, s, CC_OP_LOGICB + ot);
 }
 
-/* BLSI do not have memory operands and can use set_cc_op.  */
 static void gen_BLSI(DisasContext *s, X86DecodedInsn *decode)
 {
 MemOp ot = decode->op[0].ot;
 
-tcg_gen_mov_tl(cpu_cc_src, s->T0);
-tcg_gen_neg_tl(s->T1, s->T0);
+/* input in T1, which is ready for prepare_update2_cc  */
+tcg_gen_neg_tl(s->T0, s->T1);
 tcg_gen_and_tl(s->T0, s->T0, s->T1);
-tcg_gen_mov_tl(cpu_cc_dst, s->T0);
-set_cc_op(s, CC_OP_BMILGB + ot);
+prepare_update2_cc(decode, s, CC_OP_BMILGB + ot);
 }
 
-/* BLSMSK do not have memory operands and can use set_cc_op.  */
 static void gen_BLSMSK(DisasContext *s, X86DecodedInsn *decode)
 {
 MemOp ot = decode->op[0].ot;
 
-tcg_gen_mov_tl(cpu_cc_src, s->T0);
-tcg_gen_subi_tl(s->T1, s->T0, 1);
+/* input in T1, which is ready for prepare_update2_cc  */
+tcg_gen_subi_tl(s->T0, s->T1, 1);
 tcg_gen_xor_tl(s->T0, s->T0, s->T1);
-tcg_gen_mov_tl(cpu_cc_dst, s->T0);
-set_cc_op(s, CC_OP_BMILGB + ot);
+prepare_update2_cc(decode, s, CC_OP_BMILGB + ot);
 }
 
-/* BLSR do not have memory operands and can use set_cc_op.  */
 static void gen_BLSR(DisasContext *s, X86DecodedInsn *decode)
 {
 MemOp ot = decode->op[0].ot;
 
-tcg_gen_mov_tl(cpu_cc_src, s->T0);
-tcg_gen_subi_tl(s->T1, s->T0, 1);
+/* input in T1, which is ready for prepare_update2_cc  */
+tcg_gen_subi_tl(s->T0, s->T1, 1);
 tcg_gen_and_tl(s->T0, s->T0, s->T1);
-tcg_gen_mov_tl(cpu_cc_dst, s->T0);
-set_cc_op(s, CC_OP_BMILGB + ot);
+prepare_update2_cc(decode, s, CC_OP_BMILGB + ot);
 }
 
 static void gen_BOUND(DisasContext *s, X86DecodedInsn *decode)
-- 
2.45.1




[PULL 07/25] target/i386: rewrite flags writeback for ADCX/ADOX

2024-06-11 Thread Paolo Bonzini
Avoid using set_cc_op() in preparation for implementing APX; treat
CC_OP_EFLAGS similar to the case where we have the "opposite" cc_op
(CC_OP_ADOX for ADCX and CC_OP_ADCX for ADOX), except the resulting
cc_op is not CC_OP_ADCOX. This is written easily as two "if"s, whose
conditions are both false for CC_OP_EFLAGS, both true for CC_OP_ADCOX,
and one each true for CC_OP_ADCX/ADOX.

The new logic also makes it easy to drop usage of tmp0.
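
The two conditions can be checked as a small truth table. The sketch below
models only the case where the previous state already keeps the flags in
CC_SRC (the CC_OP_HAS_EFLAGS range); enum cc_state, struct plan and
plan_adcox() are hypothetical names used for illustration, not QEMU code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of the four cc_op values that keep flags in CC_SRC. */
enum cc_state { ST_EFLAGS, ST_ADCX, ST_ADOX, ST_ADCOX };

struct plan {
    bool reuse_carry;   /* take carry-in from the previous round */
    bool keep_opposite; /* preserve the other instruction's carry */
    enum cc_state next; /* resulting cc_op */
};

/* insn must be ST_ADCX or ST_ADOX: the instruction being translated. */
static struct plan plan_adcox(enum cc_state prev, enum cc_state insn)
{
    struct plan p;

    p.reuse_carry = (prev == insn || prev == ST_ADCOX);
    p.keep_opposite = (prev != insn && prev != ST_EFLAGS);
    p.next = p.keep_opposite ? ST_ADCOX : insn;
    return p;
}
```

Both conditions are false for ST_EFLAGS, both are true for ST_ADCOX, only the
first holds when the previous op matches the instruction, and only the second
holds when it is the opposite one.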

Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 target/i386/cpu.h  |  9 +++---
 target/i386/tcg/emit.c.inc | 61 ++
 2 files changed, 40 insertions(+), 30 deletions(-)

diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 8fe28b67e0f..7e2a9b56aea 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1260,6 +1260,8 @@ uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
 /* Use a clearer name for this.  */
 #define CPU_INTERRUPT_INIT  CPU_INTERRUPT_RESET
 
+#define CC_OP_HAS_EFLAGS(op) ((op) >= CC_OP_EFLAGS && (op) <= CC_OP_ADCOX)
+
 /* Instead of computing the condition codes after each x86 instruction,
  * QEMU just stores one operand (called CC_SRC), the result
  * (called CC_DST) and the type of operation (called CC_OP). When the
@@ -1270,6 +1272,9 @@ uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
 typedef enum {
 CC_OP_DYNAMIC, /* must use dynamic code to get cc_op */
 CC_OP_EFLAGS,  /* all cc are explicitly computed, CC_SRC = flags */
+CC_OP_ADCX, /* CC_DST = C, CC_SRC = rest.  */
+CC_OP_ADOX, /* CC_SRC2 = O, CC_SRC = rest.  */
+CC_OP_ADCOX, /* CC_DST = C, CC_SRC2 = O, CC_SRC = rest.  */
 
 CC_OP_MULB, /* modify all flags, C, O = (CC_SRC != 0) */
 CC_OP_MULW,
@@ -1326,10 +1331,6 @@ typedef enum {
 CC_OP_BMILGL,
 CC_OP_BMILGQ,
 
-CC_OP_ADCX, /* CC_DST = C, CC_SRC = rest.  */
-CC_OP_ADOX, /* CC_DST = O, CC_SRC = rest.  */
-CC_OP_ADCOX, /* CC_DST = C, CC_SRC2 = O, CC_SRC = rest.  */
-
 CC_OP_CLR, /* Z set, all other flags clear.  */
 CC_OP_POPCNT, /* Z via CC_SRC, all other flags clear.  */
 
diff --git a/target/i386/tcg/emit.c.inc b/target/i386/tcg/emit.c.inc
index df7597c7e2f..2041ea9d04a 100644
--- a/target/i386/tcg/emit.c.inc
+++ b/target/i386/tcg/emit.c.inc
@@ -1122,24 +1122,41 @@ static void gen_ADC(DisasContext *s, X86DecodedInsn 
*decode)
 prepare_update3_cc(decode, s, CC_OP_ADCB + ot, c_in);
 }
 
-/* ADCX/ADOX do not have memory operands and can use set_cc_op.  */
-static void gen_ADCOX(DisasContext *s, MemOp ot, int cc_op)
+static void gen_ADCOX(DisasContext *s, X86DecodedInsn *decode, int cc_op)
 {
-int opposite_cc_op;
+MemOp ot = decode->op[0].ot;
 TCGv carry_in = NULL;
-TCGv carry_out = (cc_op == CC_OP_ADCX ? cpu_cc_dst : cpu_cc_src2);
+TCGv *carry_out = (cc_op == CC_OP_ADCX ? &decode->cc_dst : 
&decode->cc_src2);
 TCGv zero;
 
-if (cc_op == s->cc_op || s->cc_op == CC_OP_ADCOX) {
-/* Re-use the carry-out from a previous round.  */
-carry_in = carry_out;
-} else {
-/* We don't have a carry-in, get it out of EFLAGS.  */
-if (s->cc_op != CC_OP_ADCX && s->cc_op != CC_OP_ADOX) {
-gen_compute_eflags(s);
+decode->cc_op = cc_op;
+*carry_out = tcg_temp_new();
+if (CC_OP_HAS_EFLAGS(s->cc_op)) {
+decode->cc_src = cpu_cc_src;
+
+/* Re-use the carry-out from a previous round?  */
+if (s->cc_op == cc_op || s->cc_op == CC_OP_ADCOX) {
+carry_in = (cc_op == CC_OP_ADCX ? cpu_cc_dst : cpu_cc_src2);
 }
-carry_in = s->tmp0;
-tcg_gen_extract_tl(carry_in, cpu_cc_src,
+
+/* Preserve the opposite carry from previous rounds?  */
+if (s->cc_op != cc_op && s->cc_op != CC_OP_EFLAGS) {
+decode->cc_op = CC_OP_ADCOX;
+if (carry_out == &decode->cc_dst) {
+decode->cc_src2 = cpu_cc_src2;
+} else {
+decode->cc_dst = cpu_cc_dst;
+}
+}
+} else {
+decode->cc_src = tcg_temp_new();
+gen_mov_eflags(s, decode->cc_src);
+}
+
+if (!carry_in) {
+/* Get carry_in out of EFLAGS.  */
+carry_in = tcg_temp_new();
+tcg_gen_extract_tl(carry_in, decode->cc_src,
 ctz32(cc_op == CC_OP_ADCX ? CC_C : CC_O), 1);
 }
 
@@ -1151,28 +1168,20 @@ static void gen_ADCOX(DisasContext *s, MemOp ot, int cc_op)
 tcg_gen_ext32u_tl(s->T1, s->T1);
 tcg_gen_add_i64(s->T0, s->T0, s->T1);
 tcg_gen_add_i64(s->T0, s->T0, carry_in);
-tcg_gen_shri_i64(carry_out, s->T0, 32);
+tcg_gen_shri_i64(*carry_out, s->T0, 32);
 break;
 #endif
 default:
 zero = tcg_constant_tl(0);
-tcg_gen_add2_tl(s->T0, carry_out, s->T0, zero, carry_in, zero);
-tcg_gen_add2_tl(s->T0, carry_out, s->T0, carry_out, s->T1, zero);
+tcg_gen_add2_tl(s->T0, *carry_out, s->T0, zero, carry_in, zero);
+tcg_gen_add2_tl(s->T0, *carry_out, s->T0, 
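
Editorial sketch (not part of the patch): the range test in CC_OP_HAS_EFLAGS only works because the patch moves the three ADCX/ADOX values directly after CC_OP_EFLAGS. The trimmed-down enum below is hypothetical (the name CCOpSketch and the has_eflags() wrapper are assumptions for illustration) and keeps only the neighbouring values shown in the cpu.h hunk.

```c
#include <stdbool.h>

/* Hypothetical, trimmed-down copy of the CCOp ordering after the patch. */
typedef enum {
    CC_OP_DYNAMIC, /* must use dynamic code to get cc_op */
    CC_OP_EFLAGS,  /* all cc are explicitly computed, CC_SRC = flags */
    CC_OP_ADCX,    /* CC_DST = C, CC_SRC = rest */
    CC_OP_ADOX,    /* CC_SRC2 = O, CC_SRC = rest */
    CC_OP_ADCOX,   /* CC_DST = C, CC_SRC2 = O, CC_SRC = rest */
    CC_OP_MULB,    /* first value after the contiguous range */
} CCOpSketch;

/* Same macro as in the hunk above: a plain range check. */
#define CC_OP_HAS_EFLAGS(op) ((op) >= CC_OP_EFLAGS && (op) <= CC_OP_ADCOX)

/* Function wrapper so the macro can be called like a predicate. */
static bool has_eflags(CCOpSketch op)
{
    return CC_OP_HAS_EFLAGS(op);
}
```

If a later change inserted an unrelated value between CC_OP_EFLAGS and CC_OP_ADCOX, the macro would silently start matching it, which is presumably why the enum entries were moved rather than the macro listing each value.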

[PULL 01/25] scsi-disk: Fix crash for VM configured with USB CDROM after live migration

2024-06-11 Thread Paolo Bonzini
From: Hyman Huang 

For VMs configured with the USB CDROM device:

-drive file=/path/to/local/file,id=drive-usb-disk0,media=cdrom,readonly=on...
-device usb-storage,drive=drive-usb-disk0,id=usb-disk0...

The QEMU process may crash after live migration. To reproduce the issue,
configure a VM (guest OS Ubuntu 20.04 or 21.10) with the following XML:


  
  
  
  
  



Perform live migration repeatedly; a crash may occur after one of the
migrations. The trace log at the source before live migration is as follows:

324808@1711972823.521945:usb_uhci_frame_start nr 319
324808@1711972823.521978:usb_uhci_qh_load qh 0x35cb5400
324808@1711972823.521989:usb_uhci_qh_load qh 0x35cb5480
324808@1711972823.521997:usb_uhci_td_load qh 0x35cb5480, td 0x35cbe000, ctrl 0x0, token 0xffe07f69
324808@1711972823.522010:usb_uhci_td_nextqh qh 0x35cb5480, td 0x35cbe000
324808@1711972823.522022:usb_uhci_qh_load qh 0x35cb5680
324808@1711972823.522030:usb_uhci_td_load qh 0x35cb5680, td 0x75ac5180, ctrl 0x1980, token 0x3c903e1
324808@1711972823.522045:usb_uhci_packet_add token 0x103e1, td 0x75ac5180
324808@1711972823.522056:usb_packet_state_change bus 0, port 2, ep 2, packet 0x559f9ba14b00, state undef -> setup
324808@1711972823.522079:usb_msd_cmd_submit lun 0, tag 0x472, flags 0x0080, len 10, data-len 8
324808@1711972823.522107:scsi_req_parsed target 0 lun 0 tag 1138 command 74 dir 1 length 8
324808@1711972823.522124:scsi_req_parsed_lba target 0 lun 0 tag 1138 command 74 lba 4096
324808@1711972823.522139:scsi_req_alloc target 0 lun 0 tag 1138
324808@1711972823.522169:scsi_req_continue target 0 lun 0 tag 1138
324808@1711972823.522181:scsi_req_data target 0 lun 0 tag 1138 len 8
324808@1711972823.522194:usb_packet_state_change bus 0, port 2, ep 2, packet 0x559f9ba14b00, state setup -> complete
324808@1711972823.522209:usb_uhci_packet_complete_success token 0x103e1, td 0x75ac5180
324808@1711972823.522219:usb_uhci_packet_del token 0x103e1, td 0x75ac5180
324808@1711972823.522232:usb_uhci_td_complete qh 0x35cb5680, td 0x75ac5180

The trace log at the destination after live migration is as follows:

3286206@1711972823.951646:usb_uhci_frame_start nr 320
3286206@1711972823.951663:usb_uhci_qh_load qh 0x35cb5100
3286206@1711972823.951671:usb_uhci_qh_load qh 0x35cb5480
3286206@1711972823.951680:usb_uhci_td_load qh 0x35cb5480, td 0x35cbe000, ctrl 0x100, token 0xffe07f69
3286206@1711972823.951693:usb_uhci_td_nextqh qh 0x35cb5480, td 0x35cbe000
3286206@1711972823.951702:usb_uhci_qh_load qh 0x35cb5700
3286206@1711972823.951709:usb_uhci_td_load qh 0x35cb5700, td 0x75ac5240, ctrl 0x3980, token 0xe08369
3286206@1711972823.951727:usb_uhci_queue_add token 0x8369
3286206@1711972823.951735:usb_uhci_packet_add token 0x8369, td 0x75ac5240
3286206@1711972823.951746:usb_packet_state_change bus 0, port 2, ep 1, packet 0x56066b2fb5a0, state undef -> setup
3286206@1711972823.951766:usb_msd_data_in 8/8 (scsi 8)
2024-04-01 12:00:24.665+: shutting down, reason=crashed

The backtrace reveals the following:

Program terminated with signal SIGSEGV, Segmentation fault.
0  __memmove_sse2_unaligned_erms () at 
../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:312
312movq-8(%rsi,%rdx), %rcx
[Current thread is 1 (Thread 0x7f0a9025fc00 (LWP 3286206))]
(gdb) bt
0  __memmove_sse2_unaligned_erms () at 
../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:312
1  memcpy (__len=8, __src=, __dest=) at 
/usr/include/bits/string_fortified.h:34
2  iov_from_buf_full (iov=, iov_cnt=, 
offset=, buf=0x0, bytes=bytes@entry=8) at ../util/iov.c:33
3  iov_from_buf (bytes=8, buf=, offset=, 
iov_cnt=, iov=)
   at 
/usr/src/debug/qemu-6-6.2.0-75.7.oe1.smartx.git.40.x86_64/include/qemu/iov.h:49
4  usb_packet_copy (p=p@entry=0x56066b2fb5a0, ptr=, 
bytes=bytes@entry=8) at ../hw/usb/core.c:636
5  usb_msd_copy_data (s=s@entry=0x56066c62c770, p=p@entry=0x56066b2fb5a0) at 
../hw/usb/dev-storage.c:186
6  usb_msd_handle_data (dev=0x56066c62c770, p=0x56066b2fb5a0) at 
../hw/usb/dev-storage.c:496
7  usb_handle_packet (dev=0x56066c62c770, p=p@entry=0x56066b2fb5a0) at 
../hw/usb/core.c:455
8  uhci_handle_td (s=s@entry=0x56066bd5f210, q=0x56066bb7fbd0, q@entry=0x0, 
qh_addr=qh_addr@entry=902518530, td=td@entry=0x7fffe6e788f0, td_addr=,
   int_mask=int_mask@entry=0x7fffe6e788e4) at ../hw/usb/hcd-uhci.c:885
9  uhci_process_frame (s=s@entry=0x56066bd5f210) at ../hw/usb/hcd-uhci.c:1061
10 uhci_frame_timer (opaque=opaque@entry=0x56066bd5f210) at 
../hw/usb/hcd-uhci.c:1159
11 timerlist_run_timers (timer_list=0x56066af26bd0) at ../util/qemu-timer.c:642
12 qemu_clock_run_timers (type=QEMU_CLOCK_VIRTUAL) at ../util/qemu-timer.c:656
13 qemu_clock_run_all_timers () at ../util/qemu-timer.c:738
14 main_loop_wait (nonblocking=nonblocking@entry=0) at ../util/main-loop.c:542
15 qemu_main_loop () at ../softmmu/runstate.c:739
16 main (argc=, argv=, envp=) at 
../softmmu/main.c:52
(gdb) frame 5
(gdb) p ((SCSIDiskReq *)s->req)->iov
$1 = {iov_base = 0x0, iov_len = 0}
(gdb) p/x s->req->tag
$2 = 0x472


[PULL 10/25] target/i386: change X86_ENTRYwr to use T0, use it for moves

2024-06-11 Thread Paolo Bonzini
Just like X86_ENTRYr, X86_ENTRYwr is easily changed to use only T0.
In this case, the motivation is to use it for the MOV instruction
family.  The case when you need to preserve the input value is the
odd one, as it is used basically only for BLS* instructions.

Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/decode-new.c.inc | 48 
 target/i386/tcg/emit.c.inc   |  2 +-
 2 files changed, 25 insertions(+), 25 deletions(-)
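
Editorial sketch (not part of the patch): the commit message above says X86_ENTRYwr now uses only T0. The model below is a simplified assumption, not the real X86OpEntry: each entry has three operand slots, op[0] is written back, op[1] is loaded into T0 and op[2] into T1, as the gen_load calls elsewhere in this series suggest. All type and function names are hypothetical.

```c
/* Hypothetical operand kinds, standing in for the E/G/None codes. */
typedef enum { OP_NONE, OP_E, OP_G } OperandSketch;

/* Hypothetical three-slot entry: op[0] written back, op[1] -> T0, op[2] -> T1. */
typedef struct {
    OperandSketch op[3];
} EntrySketch;

static EntrySketch entry3(OperandSketch o0, OperandSketch o1, OperandSketch o2)
{
    EntrySketch e = { { o0, o1, o2 } };
    return e;
}

/* Old X86_OP_ENTRYwr layout: the read operand sits in slot 2 (T1). */
static EntrySketch entrywr_old(OperandSketch w, OperandSketch r)
{
    return entry3(w, OP_NONE, r);
}

/* New X86_OP_ENTRYwr layout: the read operand sits in slot 1 (T0). */
static EntrySketch entrywr_new(OperandSketch w, OperandSketch r)
{
    return entry3(w, r, OP_NONE);
}

static OperandSketch loaded_into_T0(EntrySketch e) { return e.op[1]; }
static OperandSketch loaded_into_T1(EntrySketch e) { return e.op[2]; }
```

With the new layout a plain move such as MOV E,G reads G into T0, and the generic writeback of T0 to operand 0 completes the instruction without touching T1.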

diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc
index f9d3e2577b2..d41002e2f5c 100644
--- a/target/i386/tcg/decode-new.c.inc
+++ b/target/i386/tcg/decode-new.c.inc
@@ -180,7 +180,7 @@
 #define X86_OP_ENTRYrr(op, op0, s0, op1, s1, ...) \
 X86_OP_ENTRY3(op, None, None, op0, s0, op1, s1, ## __VA_ARGS__)
 #define X86_OP_ENTRYwr(op, op0, s0, op1, s1, ...) \
-X86_OP_ENTRY3(op, op0, s0, None, None, op1, s1, ## __VA_ARGS__)
+X86_OP_ENTRY3(op, op0, s0, op1, s1, None, None, ## __VA_ARGS__)
 #define X86_OP_ENTRY2(op, op0, s0, op1, s1, ...)  \
 X86_OP_ENTRY3(op, op0, s0, 2op, s0, op1, s1, ## __VA_ARGS__)
 #define X86_OP_ENTRYw(op, op0, s0, ...)   \
@@ -612,15 +612,15 @@ static const X86OpEntry opcodes_0F38_00toEF[240] = {
 /* five rows for no prefix, 66, F3, F2, 66+F2  */
 static const X86OpEntry opcodes_0F38_F0toFF[16][5] = {
 [0] = {
-X86_OP_ENTRY3(MOVBE, G,y, M,y, None,None, cpuid(MOVBE)),
-X86_OP_ENTRY3(MOVBE, G,w, M,w, None,None, cpuid(MOVBE)),
+X86_OP_ENTRYwr(MOVBE, G,y, M,y, cpuid(MOVBE)),
+X86_OP_ENTRYwr(MOVBE, G,w, M,w, cpuid(MOVBE)),
 {},
 X86_OP_ENTRY2(CRC32, G,d, E,b, cpuid(SSE42)),
 X86_OP_ENTRY2(CRC32, G,d, E,b, cpuid(SSE42)),
 },
 [1] = {
-X86_OP_ENTRY3(MOVBE, M,y, G,y, None,None, cpuid(MOVBE)),
-X86_OP_ENTRY3(MOVBE, M,w, G,w, None,None, cpuid(MOVBE)),
+X86_OP_ENTRYwr(MOVBE, M,y, G,y, cpuid(MOVBE)),
+X86_OP_ENTRYwr(MOVBE, M,w, G,w, cpuid(MOVBE)),
 {},
 X86_OP_ENTRY2(CRC32, G,d, E,y, cpuid(SSE42)),
 X86_OP_ENTRY2(CRC32, G,d, E,w, cpuid(SSE42)),
@@ -1586,18 +1586,18 @@ static const X86OpEntry opcodes_root[256] = {
 [0x7E] = X86_OP_ENTRYr(Jcc, J,b),
 [0x7F] = X86_OP_ENTRYr(Jcc, J,b),
 
-[0x88] = X86_OP_ENTRY3(MOV, E,b, G,b, None, None),
-[0x89] = X86_OP_ENTRY3(MOV, E,v, G,v, None, None),
-[0x8A] = X86_OP_ENTRY3(MOV, G,b, E,b, None, None),
-[0x8B] = X86_OP_ENTRY3(MOV, G,v, E,v, None, None),
-/* Missing in Table A-2: memory destination is always 16-bit.  */
-[0x8C] = X86_OP_ENTRY3(MOV, E,v, S,w, None, None, op0_Mw),
-[0x8D] = X86_OP_ENTRY3(LEA, G,v, M,v, None, None, noseg),
-[0x8E] = X86_OP_ENTRY3(MOV, S,w, E,w, None, None),
+[0x88] = X86_OP_ENTRYwr(MOV, E,b, G,b),
+[0x89] = X86_OP_ENTRYwr(MOV, E,v, G,v),
+[0x8A] = X86_OP_ENTRYwr(MOV, G,b, E,b),
+[0x8B] = X86_OP_ENTRYwr(MOV, G,v, E,v),
+ /* Missing in Table A-2: memory destination is always 16-bit.  */
+[0x8C] = X86_OP_ENTRYwr(MOV, E,v, S,w, op0_Mw),
+[0x8D] = X86_OP_ENTRYwr(LEA, G,v, M,v, noseg),
+[0x8E] = X86_OP_ENTRYwr(MOV, S,w, E,w),
 [0x8F] = X86_OP_GROUPw(group1A, E,v),
 
 [0x98] = X86_OP_ENTRY1(CBW,0,v), /* rAX */
-[0x99] = X86_OP_ENTRY3(CWD,2,v, 0,v, None, None), /* rDX, rAX */
+[0x99] = X86_OP_ENTRYwr(CWD,   2,v, 0,v), /* rDX, rAX */
 [0x9A] = X86_OP_ENTRYrr(CALLF, I_unsigned,p, I_unsigned,w, chk(i64)),
 [0x9B] = X86_OP_ENTRY0(WAIT),
 [0x9C] = X86_OP_ENTRY0(PUSHF,  chk(vm86_iopl) svm(PUSHF)),
@@ -1607,22 +1607,22 @@ static const X86OpEntry opcodes_root[256] = {
 
 [0xA8] = X86_OP_ENTRYrr(AND, 0,b, I,b),   /* AL, Ib */
 [0xA9] = X86_OP_ENTRYrr(AND, 0,v, I,z),   /* rAX, Iz */
-[0xAA] = X86_OP_ENTRY3(STOS, Y,b, 0,b, None, None),
-[0xAB] = X86_OP_ENTRY3(STOS, Y,v, 0,v, None, None),
+[0xAA] = X86_OP_ENTRYwr(STOS, Y,b, 0,b),
+[0xAB] = X86_OP_ENTRYwr(STOS, Y,v, 0,v),
 /* Manual writeback because REP LODS (!) has to write EAX/RAX after every LODS.  */
 [0xAC] = X86_OP_ENTRYr(LODS, X,b),
 [0xAD] = X86_OP_ENTRYr(LODS, X,v),
 [0xAE] = X86_OP_ENTRYrr(SCAS, 0,b, Y,b),
 [0xAF] = X86_OP_ENTRYrr(SCAS, 0,v, Y,v),
 
-[0xB8] = X86_OP_ENTRY3(MOV, LoBits,v, I,v, None, None),
-[0xB9] = X86_OP_ENTRY3(MOV, LoBits,v, I,v, None, None),
-[0xBA] = X86_OP_ENTRY3(MOV, LoBits,v, I,v, None, None),
-[0xBB] = X86_OP_ENTRY3(MOV, LoBits,v, I,v, None, None),
-[0xBC] = X86_OP_ENTRY3(MOV, LoBits,v, I,v, None, None),
-[0xBD] = X86_OP_ENTRY3(MOV, LoBits,v, I,v, None, None),
-[0xBE] = X86_OP_ENTRY3(MOV, LoBits,v, I,v, None, None),
-[0xBF] = X86_OP_ENTRY3(MOV, LoBits,v, I,v, None, None),
+[0xB8] = X86_OP_ENTRYwr(MOV, LoBits,v, I,v),
+[0xB9] = X86_OP_ENTRYwr(MOV, LoBits,v, I,v),
+[0xBA] = X86_OP_ENTRYwr(MOV, LoBits,v, I,v),
+[0xBB] = X86_OP_ENTRYwr(MOV, LoBits,v, 

[PULL 06/25] target/i386: remove CPUX86State argument from generator functions

2024-06-11 Thread Paolo Bonzini
CPUX86State argument would only be used to fetch bytes, but that has to be
done before the generator function is called.  So remove it, and all
temptation together with it.

Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/decode-new.h |   2 +-
 target/i386/tcg/decode-new.c.inc |   4 +-
 target/i386/tcg/emit.c.inc   | 572 +++
 3 files changed, 289 insertions(+), 289 deletions(-)
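
Editorial sketch (not part of the patch): the point of the signature change is that a generator only sees the translation context and the already-decoded instruction. The stand-in types below (DisasContextSketch, X86DecodedInsnSketch, GenFuncSketch) and the helper run_generator() are hypothetical; only the two-argument shape of the callback is taken from the patch.

```c
/* Hypothetical stand-ins for DisasContext and X86DecodedInsn. */
typedef struct {
    int emitted;             /* records what the generator "emitted" */
} DisasContextSketch;

typedef struct {
    unsigned char immediate; /* fetched by the decoder beforehand */
} X86DecodedInsnSketch;

/* Two-argument generator type, mirroring the new X86GenFunc shape. */
typedef void (*GenFuncSketch)(DisasContextSketch *s,
                              X86DecodedInsnSketch *decode);

static void gen_example(DisasContextSketch *s, X86DecodedInsnSketch *decode)
{
    /* No CPU state pointer here: every byte was fetched during decode. */
    s->emitted = decode->immediate;
}

/* Decode first, then dispatch through the function pointer. */
static int run_generator(GenFuncSketch gen, unsigned char imm)
{
    DisasContextSketch s = { 0 };
    X86DecodedInsnSketch decode = { imm };

    gen(&s, &decode);
    return s.emitted;
}
```

Because the callback type has no CPU state parameter, a generator that tried to fetch more instruction bytes would simply fail to compile, which matches the "remove all temptation" wording of the commit message.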

diff --git a/target/i386/tcg/decode-new.h b/target/i386/tcg/decode-new.h
index 1f90cf96407..f704698575f 100644
--- a/target/i386/tcg/decode-new.h
+++ b/target/i386/tcg/decode-new.h
@@ -245,7 +245,7 @@ typedef struct X86DecodedInsn X86DecodedInsn;
 typedef void (*X86DecodeFunc)(DisasContext *s, CPUX86State *env, X86OpEntry *entry, uint8_t *b);
 
 /* Code generation function.  */
-typedef void (*X86GenFunc)(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode);
+typedef void (*X86GenFunc)(DisasContext *s, X86DecodedInsn *decode);
 
 struct X86OpEntry {
 /* Based on the is_decode flags.  */
diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc
index c2d8da8d14e..e7d88020481 100644
--- a/target/i386/tcg/decode-new.c.inc
+++ b/target/i386/tcg/decode-new.c.inc
@@ -2590,7 +2590,7 @@ static void disas_insn(DisasContext *s, CPUState *cpu)
 }
 if (s->prefix & PREFIX_LOCK) {
 gen_load(s, &decode, 2, s->T1);
-decode.e.gen(s, env, &decode);
+decode.e.gen(s, &decode);
 } else {
 if (decode.op[0].unit == X86_OP_MMX) {
 compute_mmx_offset(&decode.op[0]);
@@ -2599,7 +2599,7 @@ static void disas_insn(DisasContext *s, CPUState *cpu)
 }
 gen_load(s, &decode, 1, s->T0);
 gen_load(s, &decode, 2, s->T1);
-decode.e.gen(s, env, &decode);
+decode.e.gen(s, &decode);
 gen_writeback(s, &decode, 0, s->T0);
 }
 
diff --git a/target/i386/tcg/emit.c.inc b/target/i386/tcg/emit.c.inc
index 4be3d9a6fba..df7597c7e2f 100644
--- a/target/i386/tcg/emit.c.inc
+++ b/target/i386/tcg/emit.c.inc
@@ -60,8 +60,8 @@ typedef void (*SSEFunc_0_eii)(TCGv_ptr env, TCGv_ptr reg_a, TCGv_ptr reg_b,
   TCGv_ptr reg_c, TCGv_ptr reg_d, TCGv_i32 even,
   TCGv_i32 odd);
 
-static void gen_JMP_m(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode);
-static void gen_JMP(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode);
+static void gen_JMP_m(DisasContext *s, X86DecodedInsn *decode);
+static void gen_JMP(DisasContext *s, X86DecodedInsn *decode);
 
 static inline TCGv_i32 tcg_constant8u_i32(uint8_t val)
 {
@@ -446,7 +446,7 @@ static const SSEFunc_0_epp fns_3dnow[] = {
 [0xbf] = gen_helper_pavgusb,
 };
 
-static void gen_3dnow(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode)
+static void gen_3dnow(DisasContext *s, X86DecodedInsn *decode)
 {
 uint8_t b = decode->immediate;
 SSEFunc_0_epp fn = b < ARRAY_SIZE(fns_3dnow) ? fns_3dnow[b] : NULL;
@@ -479,7 +479,7 @@ static void gen_3dnow(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode)
  * f3 = v*ss Vss, Hss, Wps
  * f2 = v*sd Vsd, Hsd, Wps
  */
-static inline void gen_unary_fp_sse(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode,
+static inline void gen_unary_fp_sse(DisasContext *s, X86DecodedInsn *decode,
   SSEFunc_0_epp pd_xmm, SSEFunc_0_epp ps_xmm,
   SSEFunc_0_epp pd_ymm, SSEFunc_0_epp ps_ymm,
   SSEFunc_0_eppp sd, SSEFunc_0_eppp ss)
@@ -504,9 +504,9 @@ static inline void gen_unary_fp_sse(DisasContext *s, CPUX86State *env, X86Decode
 }
 }
 #define UNARY_FP_SSE(uname, lname) \
-static void gen_##uname(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode) \
+static void gen_##uname(DisasContext *s, X86DecodedInsn *decode) \
 { \
-gen_unary_fp_sse(s, env, decode, \
+gen_unary_fp_sse(s, decode, \
  gen_helper_##lname##pd_xmm, \
  gen_helper_##lname##ps_xmm, \
  gen_helper_##lname##pd_ymm, \
@@ -522,7 +522,7 @@ UNARY_FP_SSE(VSQRT, sqrt)
  * f3 = v*ss Vss, Hss, Wps
  * f2 = v*sd Vsd, Hsd, Wps
  */
-static inline void gen_fp_sse(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode,
+static inline void gen_fp_sse(DisasContext *s, X86DecodedInsn *decode,
   SSEFunc_0_eppp pd_xmm, SSEFunc_0_eppp ps_xmm,
   SSEFunc_0_eppp pd_ymm, SSEFunc_0_eppp ps_ymm,
   SSEFunc_0_eppp sd, SSEFunc_0_eppp ss)
@@ -543,9 +543,9 @@ static inline void gen_fp_sse(DisasContext *s, CPUX86State 
*env, 

[PULL 09/25] target/i386: change X86_ENTRYr to use T0

2024-06-11 Thread Paolo Bonzini
I am not sure why I made it use T1.  It is a bit more symmetric with
respect to X86_ENTRYwr (which uses T0 for the "w"ritten operand
and T1 for the "r"ead operand), but it is also less flexible because it
does not let you apply zextT0/sextT0.

Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/decode-new.c.inc |  6 +++---
 target/i386/tcg/emit.c.inc   | 34 
 2 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc
index 380fb793531..f9d3e2577b2 100644
--- a/target/i386/tcg/decode-new.c.inc
+++ b/target/i386/tcg/decode-new.c.inc
@@ -186,7 +186,7 @@
 #define X86_OP_ENTRYw(op, op0, s0, ...)   \
 X86_OP_ENTRY3(op, op0, s0, None, None, None, None, ## __VA_ARGS__)
 #define X86_OP_ENTRYr(op, op0, s0, ...)   \
-X86_OP_ENTRY3(op, None, None, None, None, op0, s0, ## __VA_ARGS__)
+X86_OP_ENTRY3(op, None, None, op0, s0, None, None, ## __VA_ARGS__)
 #define X86_OP_ENTRY1(op, op0, s0, ...)   \
 X86_OP_ENTRY3(op, op0, s0, 2op, s0, None, None, ## __VA_ARGS__)
 #define X86_OP_ENTRY0(op, ...)\
@@ -1335,9 +1335,9 @@ static void decode_group4_5(DisasContext *s, CPUX86State *env, X86OpEntry *entry
 /* 0xff */
 [0x08] = X86_OP_ENTRY1(INC, E,v,   lock),
 [0x09] = X86_OP_ENTRY1(DEC, E,v,   lock),
-[0x0a] = X86_OP_ENTRY3(CALL_m,  None, None, E,f64, None, None, zextT0),
+[0x0a] = X86_OP_ENTRYr(CALL_m,  E,f64, zextT0),
 [0x0b] = X86_OP_ENTRYr(CALLF_m, M,p),
-[0x0c] = X86_OP_ENTRY3(JMP_m,   None, None, E,f64, None, None, zextT0),
+[0x0c] = X86_OP_ENTRYr(JMP_m,   E,f64, zextT0),
 [0x0d] = X86_OP_ENTRYr(JMPF_m,  M,p),
 [0x0e] = X86_OP_ENTRYr(PUSH,E,f64),
 };
diff --git a/target/i386/tcg/emit.c.inc b/target/i386/tcg/emit.c.inc
index a25b3dfc6b5..797e6e81406 100644
--- a/target/i386/tcg/emit.c.inc
+++ b/target/i386/tcg/emit.c.inc
@@ -1363,7 +1363,7 @@ static void gen_CALLF(DisasContext *s, X86DecodedInsn *decode)
 
 static void gen_CALLF_m(DisasContext *s, X86DecodedInsn *decode)
 {
-MemOp ot = decode->op[2].ot;
+MemOp ot = decode->op[1].ot;
 
 gen_op_ld_v(s, ot, s->T0, s->A0);
 gen_add_A0_im(s, 1 << ot);
@@ -1593,22 +1593,22 @@ static void gen_DEC(DisasContext *s, X86DecodedInsn *decode)
 
 static void gen_DIV(DisasContext *s, X86DecodedInsn *decode)
 {
-MemOp ot = decode->op[2].ot;
+MemOp ot = decode->op[1].ot;
 
 switch(ot) {
 case MO_8:
-gen_helper_divb_AL(tcg_env, s->T1);
+gen_helper_divb_AL(tcg_env, s->T0);
 break;
 case MO_16:
-gen_helper_divw_AX(tcg_env, s->T1);
+gen_helper_divw_AX(tcg_env, s->T0);
 break;
 default:
 case MO_32:
-gen_helper_divl_EAX(tcg_env, s->T1);
+gen_helper_divl_EAX(tcg_env, s->T0);
 break;
 #ifdef TARGET_X86_64
 case MO_64:
-gen_helper_divq_EAX(tcg_env, s->T1);
+gen_helper_divq_EAX(tcg_env, s->T0);
 break;
 #endif
 }
@@ -1649,22 +1649,22 @@ static void gen_HLT(DisasContext *s, X86DecodedInsn *decode)
 
 static void gen_IDIV(DisasContext *s, X86DecodedInsn *decode)
 {
-MemOp ot = decode->op[2].ot;
+MemOp ot = decode->op[1].ot;
 
 switch(ot) {
 case MO_8:
-gen_helper_idivb_AL(tcg_env, s->T1);
+gen_helper_idivb_AL(tcg_env, s->T0);
 break;
 case MO_16:
-gen_helper_idivw_AX(tcg_env, s->T1);
+gen_helper_idivw_AX(tcg_env, s->T0);
 break;
 default:
 case MO_32:
-gen_helper_idivl_EAX(tcg_env, s->T1);
+gen_helper_idivl_EAX(tcg_env, s->T0);
 break;
 #ifdef TARGET_X86_64
 case MO_64:
-gen_helper_idivq_EAX(tcg_env, s->T1);
+gen_helper_idivq_EAX(tcg_env, s->T0);
 break;
 #endif
 }
@@ -1926,7 +1926,7 @@ static void gen_JMPF(DisasContext *s, X86DecodedInsn *decode)
 
 static void gen_JMPF_m(DisasContext *s, X86DecodedInsn *decode)
 {
-MemOp ot = decode->op[2].ot;
+MemOp ot = decode->op[1].ot;
 
 gen_op_ld_v(s, ot, s->T0, s->A0);
 gen_add_A0_im(s, 1 << ot);
@@ -1947,7 +1947,7 @@ static void gen_LAHF(DisasContext *s, X86DecodedInsn *decode)
 
 static void gen_LDMXCSR(DisasContext *s, X86DecodedInsn *decode)
 {
-tcg_gen_trunc_tl_i32(s->tmp2_i32, s->T1);
+tcg_gen_trunc_tl_i32(s->tmp2_i32, s->T0);
 gen_helper_ldmxcsr(tcg_env, s->tmp2_i32);
 }
 
@@ -1995,7 +1995,7 @@ static void gen_LGS(DisasContext *s, X86DecodedInsn *decode)
 
 static void gen_LODS(DisasContext *s, X86DecodedInsn *decode)
 {
-MemOp ot = decode->op[2].ot;
+MemOp ot = decode->op[1].ot;
 if (s->prefix & (PREFIX_REPZ | PREFIX_REPNZ)) {
 gen_repz(s, ot, gen_lods);
 } else {
@@ 

[PULL 00/25] target/i386, SCSI changes for 2024-06-11

2024-06-11 Thread Paolo Bonzini
The following changes since commit 80e8f0602168f451a93e71cbb1d59e93d745e62e:

  Merge tag 'bsd-user-misc-2024q2-pull-request' of gitlab.com:bsdimp/qemu into staging (2024-06-09 11:21:55 -0700)

are available in the Git repository at:

  https://gitlab.com/bonzini/qemu.git tags/for-upstream

for you to fetch changes up to 58ab5e809ad66a02b6fa273ba11ed35b8b2fea60:

  target/i386: SEV: do not assume machine->cgs is SEV (2024-06-11 14:29:23 +0200)


* i386: fix issue with cache topology passthrough
* scsi-disk: migrate emulated requests
* i386/sev: fix Coverity issues
* i386/tcg: more conversions to new decoder


Chuang Xu (1):
  i386/cpu: fixup number of addressable IDs for processor cores in the physical package

Hyman Huang (1):
  scsi-disk: Fix crash for VM configured with USB CDROM after live migration

Pankaj Gupta (3):
  i386/sev: fix unreachable code coverity issue
  i386/sev: Move SEV_COMMON null check before dereferencing
  i386/sev: Return when sev_common is null

Paolo Bonzini (20):
  target/i386: remove CPUX86State argument from generator functions
  target/i386: rewrite flags writeback for ADCX/ADOX
  target/i386: put BLS* input in T1, use generic flag writeback
  target/i386: change X86_ENTRYr to use T0
  target/i386: change X86_ENTRYwr to use T0, use it for moves
  target/i386: replace NoSeg special with NoLoadEA
  target/i386: fix processing of intercept 0 (read CR0)
  target/i386: convert MOV from/to CR and DR to new decoder
  target/i386: fix bad sorting of entries in the 0F table
  target/i386: finish converting 0F AE to the new decoder
  target/i386: replace read_crN helper with read_cr8
  target/i386: split X86_CHECK_prot into PE and VM86 checks
  target/i386: convert non-grouped, helper-based 2-byte opcodes
  target/i386: pull load/writeback out of gen_shiftd_rm_T1
  target/i386: adapt gen_shift_count for SHLD/SHRD
  target/i386: convert SHLD/SHRD to new decoder
  target/i386: convert LZCNT/TZCNT/BSF/BSR/POPCNT to new decoder
  target/i386: convert XADD to new decoder
  target/i386: convert CMPXCHG to new decoder
  target/i386: SEV: do not assume machine->cgs is SEV

 target/i386/cpu.h|9 +-
 target/i386/helper.h |2 +-
 target/i386/tcg/decode-new.h |   31 +-
 hw/core/machine.c|1 +
 hw/scsi/scsi-disk.c  |   24 +-
 target/i386/cpu.c|6 +-
 target/i386/sev.c|   11 +-
 target/i386/tcg/seg_helper.c |   16 +-
 target/i386/tcg/sysemu/misc_helper.c |   20 +-
 target/i386/tcg/translate.c  |  716 +
 target/i386/tcg/decode-new.c.inc |  381 +++
 target/i386/tcg/emit.c.inc   | 1162 +++---
 12 files changed, 1160 insertions(+), 1219 deletions(-)
-- 
2.45.1




[PULL 25/25] target/i386: SEV: do not assume machine->cgs is SEV

2024-06-11 Thread Paolo Bonzini
There can be other confidential computing classes that are not derived
from sev-common.  Avoid aborting when encountering them.

Signed-off-by: Paolo Bonzini 
---
 target/i386/sev.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
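
Editorial sketch (not part of the patch): the fix relies on the difference between a checked cast, which aborts on a type mismatch, and a dynamic cast, which returns NULL. The toy type system below is an assumption made for illustration only; it is not QOM, and the type names, the "other-cgs" class and dynamic_cast_sketch() are all hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical single-inheritance type descriptor. */
typedef struct TypeSketch {
    const char *name;
    const struct TypeSketch *parent;
} TypeSketch;

typedef struct {
    const TypeSketch *type;
} ObjectSketch;

static const TypeSketch type_cgs = { "confidential-guest-support", NULL };
static const TypeSketch type_sev_common = { "sev-common", &type_cgs };
static const TypeSketch type_other = { "other-cgs", &type_cgs };

static ObjectSketch sev_guest = { &type_sev_common };
static ObjectSketch other_guest = { &type_other };

/* Walk the parent chain; return obj on a match, NULL otherwise. */
static ObjectSketch *dynamic_cast_sketch(ObjectSketch *obj, const char *name)
{
    const TypeSketch *t;

    if (!obj) {
        return NULL;
    }
    for (t = obj->type; t; t = t->parent) {
        if (strcmp(t->name, name) == 0) {
            return obj;
        }
    }
    return NULL;
}
```

In this model a guest object of an unrelated confidential-computing class yields NULL, so a caller written like sev_es_set_reset_vector() falls into its existing "!sev_common" early return instead of aborting.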

diff --git a/target/i386/sev.c b/target/i386/sev.c
index c40562dce31..30b83f1d77d 100644
--- a/target/i386/sev.c
+++ b/target/i386/sev.c
@@ -1712,7 +1712,9 @@ void sev_es_set_reset_vector(CPUState *cpu)
 {
 X86CPU *x86;
 CPUX86State *env;
-SevCommonState *sev_common = SEV_COMMON(MACHINE(qdev_get_machine())->cgs);
+ConfidentialGuestSupport *cgs = MACHINE(qdev_get_machine())->cgs;
+SevCommonState *sev_common = SEV_COMMON(
+object_dynamic_cast(OBJECT(cgs), TYPE_SEV_COMMON));
 
 /* Only update if we have valid reset information */
 if (!sev_common || !sev_common->reset_data_valid) {
-- 
2.45.1




[PULL 11/25] target/i386: replace NoSeg special with NoLoadEA

2024-06-11 Thread Paolo Bonzini
This is a bit more generic, as it can be applied to MPX as well.

Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/decode-new.h |  5 +++--
 target/i386/tcg/decode-new.c.inc | 12 
 target/i386/tcg/emit.c.inc   |  3 ++-
 3 files changed, 9 insertions(+), 11 deletions(-)
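
Editorial sketch (not part of the patch): with NoLoadEA the decision is no longer "drop the segment base" but "do not compute the effective address into A0 at all", leaving that to the generator. The predicate below restates the new condition in disas_insn with hypothetical names (SpecialSketch, should_load_ea and the two sample arrays are assumptions).

```c
#include <stdbool.h>

/* Hypothetical subset of the X86InsnSpecial values. */
typedef enum {
    SPECIAL_NONE,
    SPECIAL_NO_LOAD_EA,
} SpecialSketch;

/* One has_ea flag per operand slot; LEA has its memory operand in slot 1. */
static const bool lea_operands[3] = { false, true, false };
static const bool no_memory_operands[3] = { false, false, false };

/* Load the effective address only if an operand needs it and the
 * entry did not opt out with the NoLoadEA special. */
static bool should_load_ea(SpecialSketch special, const bool has_ea[3])
{
    return special != SPECIAL_NO_LOAD_EA &&
           (has_ea[0] || has_ea[1] || has_ea[2]);
}
```

For an instruction flagged this way the generator computes the address itself, as gen_LEA does in the emit.c.inc hunk of this patch.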

diff --git a/target/i386/tcg/decode-new.h b/target/i386/tcg/decode-new.h
index f704698575f..46a96b220d0 100644
--- a/target/i386/tcg/decode-new.h
+++ b/target/i386/tcg/decode-new.h
@@ -170,8 +170,9 @@ typedef enum X86InsnSpecial {
 /* Always locked if it has a memory operand (XCHG) */
 X86_SPECIAL_Locked,
 
-/* Do not apply segment base to effective address */
-X86_SPECIAL_NoSeg,
+/* Do not load effective address in s->A0 */
+X86_SPECIAL_NoLoadEA,
+
 /*
  * Rd/Mb or Rd/Mw in the manual: register operand 0 is treated as 32 bits
  * (and writeback zero-extends it to 64 bits if applicable).  PREFIX_DATA
diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc
index d41002e2f5c..4f5fcdb88dd 100644
--- a/target/i386/tcg/decode-new.c.inc
+++ b/target/i386/tcg/decode-new.c.inc
@@ -193,7 +193,7 @@
 X86_OP_ENTRY3(op, None, None, None, None, None, None, ## __VA_ARGS__)
 
 #define cpuid(feat) .cpuid = X86_FEAT_##feat,
-#define noseg .special = X86_SPECIAL_NoSeg,
+#define nolea .special = X86_SPECIAL_NoLoadEA,
 #define xchg .special = X86_SPECIAL_Locked,
 #define lock .special = X86_SPECIAL_HasLock,
 #define mmx .special = X86_SPECIAL_MMX,
@@ -1592,7 +1592,7 @@ static const X86OpEntry opcodes_root[256] = {
 [0x8B] = X86_OP_ENTRYwr(MOV, G,v, E,v),
  /* Missing in Table A-2: memory destination is always 16-bit.  */
 [0x8C] = X86_OP_ENTRYwr(MOV, E,v, S,w, op0_Mw),
-[0x8D] = X86_OP_ENTRYwr(LEA, G,v, M,v, noseg),
+[0x8D] = X86_OP_ENTRYwr(LEA, G,v, M,v, nolea),
 [0x8E] = X86_OP_ENTRYwr(MOV, S,w, E,w),
 [0x8F] = X86_OP_GROUPw(group1A, E,v),
 
@@ -2524,11 +2524,6 @@ static void disas_insn(DisasContext *s, CPUState *cpu)
 assert(decode.op[1].unit == X86_OP_INT);
 break;
 
-case X86_SPECIAL_NoSeg:
-decode.mem.def_seg = -1;
-s->override = -1;
-break;
-
 case X86_SPECIAL_Op0_Mw:
 assert(decode.op[0].unit == X86_OP_INT);
 if (decode.op[0].has_ea) {
@@ -2585,7 +2580,8 @@ static void disas_insn(DisasContext *s, CPUState *cpu)
 gen_helper_enter_mmx(tcg_env);
 }
 
-if (decode.op[0].has_ea || decode.op[1].has_ea || decode.op[2].has_ea) {
+if (decode.e.special != X86_SPECIAL_NoLoadEA &&
+(decode.op[0].has_ea || decode.op[1].has_ea || decode.op[2].has_ea)) {
 gen_load_ea(s, &decode.mem, decode.e.vex_class == 12);
 }
 if (s->prefix & PREFIX_LOCK) {
diff --git a/target/i386/tcg/emit.c.inc b/target/i386/tcg/emit.c.inc
index 78d89db57cd..e6521632edd 100644
--- a/target/i386/tcg/emit.c.inc
+++ b/target/i386/tcg/emit.c.inc
@@ -1970,7 +1970,8 @@ static void gen_LDS(DisasContext *s, X86DecodedInsn *decode)
 
 static void gen_LEA(DisasContext *s, X86DecodedInsn *decode)
 {
-tcg_gen_mov_tl(s->T0, s->A0);
+TCGv ea = gen_lea_modrm_1(s, decode->mem, false);
+gen_lea_v_seg_dest(s, s->aflag, s->T0, ea, -1, -1);
 }
 
 static void gen_LEAVE(DisasContext *s, X86DecodedInsn *decode)
-- 
2.45.1




[PULL 24/25] target/i386: convert CMPXCHG to new decoder

2024-06-11 Thread Paolo Bonzini
Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c  | 79 
 target/i386/tcg/decode-new.c.inc |  3 +-
 target/i386/tcg/emit.c.inc   | 51 +
 3 files changed, 53 insertions(+), 80 deletions(-)
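
Editorial sketch (not part of the patch): the comments in the code removed from translate.c describe the register-destination rules for CMPXCHG. The model below covers only a 32-bit operand held in 64-bit registers and is an assumption-laden illustration, not the emitted TCG code; cmpxchg32_reg() and the two *_after() helpers are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Register form, 32-bit operand size, 64-bit registers.
 * Returns the resulting ZF.
 */
static bool cmpxchg32_reg(uint64_t *rax, uint64_t *dest, uint64_t src)
{
    uint32_t cmpv = (uint32_t)*rax;
    uint32_t oldv = (uint32_t)*dest;

    if (oldv == cmpv) {
        /* Success: the 32-bit write zero-extends the destination. */
        *dest = (uint32_t)src;
        return true;
    }
    /* Failure: only the accumulator is written (zero-extended);
     * the destination keeps all 64 bits untouched. */
    *rax = oldv;
    return false;
}

/* Convenience wrappers returning one register after the operation. */
static uint64_t dest_after(uint64_t rax, uint64_t dest, uint64_t src)
{
    cmpxchg32_reg(&rax, &dest, src);
    return dest;
}

static uint64_t rax_after(uint64_t rax, uint64_t dest, uint64_t src)
{
    cmpxchg32_reg(&rax, &dest, src);
    return rax;
}
```

The memory form differs: as the removed comment notes, it performs a store cycle regardless of the comparison result, which this register-only sketch does not model.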

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 5d9312bb48c..ad1819815ab 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -434,13 +434,6 @@ static inline MemOp mo_stacksize(DisasContext *s)
 return CODE64(s) ? MO_64 : SS32(s) ? MO_32 : MO_16;
 }
 
-/* Select size 8 if lsb of B is clear, else OT.  Used for decoding
-   byte vs word opcodes.  */
-static inline MemOp mo_b_d(int b, MemOp ot)
-{
-return b & 1 ? ot : MO_8;
-}
-
 /* Compute the result of writing t0 to the OT-sized register REG.
  *
  * If DEST is NULL, store the result into the register and return the
@@ -715,11 +708,6 @@ static TCGv gen_ext_tl(TCGv dst, TCGv src, MemOp size, bool sign)
 return dst;
 }
 
-static void gen_extu(MemOp ot, TCGv reg)
-{
-gen_ext_tl(reg, reg, ot, false);
-}
-
 static void gen_exts(MemOp ot, TCGv reg)
 {
 gen_ext_tl(reg, reg, ot, true);
@@ -3003,73 +2991,6 @@ static void disas_insn_old(DisasContext *s, CPUState *cpu, int b)
 
 /* now check op code */
 switch (b) {
-/**/
-/* arith & logic */
-case 0x1b0:
-case 0x1b1: /* cmpxchg Ev, Gv */
-{
-TCGv oldv, newv, cmpv, dest;
-
-ot = mo_b_d(b, dflag);
-modrm = x86_ldub_code(env, s);
-reg = ((modrm >> 3) & 7) | REX_R(s);
-mod = (modrm >> 6) & 3;
-oldv = tcg_temp_new();
-newv = tcg_temp_new();
-cmpv = tcg_temp_new();
-gen_op_mov_v_reg(s, ot, newv, reg);
-tcg_gen_mov_tl(cmpv, cpu_regs[R_EAX]);
-gen_extu(ot, cmpv);
-if (s->prefix & PREFIX_LOCK) {
-if (mod == 3) {
-goto illegal_op;
-}
-gen_lea_modrm(env, s, modrm);
-tcg_gen_atomic_cmpxchg_tl(oldv, s->A0, cmpv, newv,
-  s->mem_index, ot | MO_LE);
-} else {
-if (mod == 3) {
-rm = (modrm & 7) | REX_B(s);
-gen_op_mov_v_reg(s, ot, oldv, rm);
-gen_extu(ot, oldv);
-
-/*
- * Unlike the memory case, where "the destination operand receives
- * a write cycle without regard to the result of the comparison",
- * rm must not be touched altogether if the write fails, including
- * not zero-extending it on 64-bit processors.  So, precompute
- * the result of a successful writeback and perform the movcond
- * directly on cpu_regs.  Also need to write accumulator first, in
- * case rm is part of RAX too.
- */
-dest = gen_op_deposit_reg_v(s, ot, rm, newv, newv);
-tcg_gen_movcond_tl(TCG_COND_EQ, dest, oldv, cmpv, newv, dest);
-} else {
-gen_lea_modrm(env, s, modrm);
-gen_op_ld_v(s, ot, oldv, s->A0);
-
-/*
- * Perform an unconditional store cycle like physical cpu;
- * must be before changing accumulator to ensure
- * idempotency if the store faults and the instruction
- * is restarted
- */
-tcg_gen_movcond_tl(TCG_COND_EQ, newv, oldv, cmpv, newv, oldv);
-gen_op_st_v(s, ot, newv, s->A0);
-}
-}
-   /*
-* Write EAX only if the cmpxchg fails; reuse newv as the destination,
-* since it's dead here.
-*/
-dest = gen_op_deposit_reg_v(s, ot, R_EAX, newv, oldv);
-tcg_gen_movcond_tl(TCG_COND_EQ, dest, oldv, cmpv, dest, newv);
-tcg_gen_mov_tl(cpu_cc_src, oldv);
-tcg_gen_mov_tl(s->cc_srcT, cmpv);
-tcg_gen_sub_tl(cpu_cc_dst, cmpv, oldv);
-set_cc_op(s, CC_OP_SUBB + ot);
-}
-break;
 case 0x1c7: /* cmpxchg8b */
 modrm = x86_ldub_code(env, s);
 mod = (modrm >> 6) & 3;
diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc
index 008a8387bda..d199f2d4b6f 100644
--- a/target/i386/tcg/decode-new.c.inc
+++ b/target/i386/tcg/decode-new.c.inc
@@ -1161,6 +1161,8 @@ static const X86OpEntry opcodes_0F[256] = {
 [0xa4] = X86_OP_ENTRY4(SHLD,  E,v, 2op,v, G,v),
 [0xa5] = X86_OP_ENTRY3(SHLD,  E,v, 2op,v, G,v),
 
+[0xb0] = X86_OP_ENTRY2(CMPXCHG,E,b, G,b, lock),
+[0xb1] = X86_OP_ENTRY2(CMPXCHG,E,v, G,v, lock),
 [0xb2] = X86_OP_ENTRY3(LSS,G,v, 

[PULL 12/25] target/i386: fix processing of intercept 0 (read CR0)

2024-06-11 Thread Paolo Bonzini
Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/decode-new.h | 1 +
 target/i386/tcg/decode-new.c.inc | 4 ++--
 2 files changed, 3 insertions(+), 2 deletions(-)
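
Editorial sketch (not part of the patch): the subject line says intercept 0 is the "read CR0" intercept, so testing the intercept field for non-zero cannot distinguish "no intercept" from "intercept code 0". The struct below is a hypothetical cut-down of X86OpEntry, and SKETCH_EXIT_OTHER is an arbitrary non-zero code chosen for illustration.

```c
#include <stdbool.h>

#define SKETCH_EXIT_READ_CR0 0x00 /* intercept code 0, per the subject line */
#define SKETCH_EXIT_OTHER    0x70 /* arbitrary non-zero code (assumption) */

/* Hypothetical cut-down of the entry fields touched by the patch. */
typedef struct {
    unsigned intercept:8;
    bool has_intercept:1;
} EntrySketch;

static EntrySketch with_intercept(unsigned code)
{
    EntrySketch e = { code, true };
    return e;
}

static EntrySketch without_intercept(void)
{
    EntrySketch e = { 0, false };
    return e;
}

/* Old test: a valid intercept code of 0 is indistinguishable from "none". */
static bool old_check(EntrySketch e, bool guest)
{
    return e.intercept && guest;
}

/* New test: an explicit flag says whether an intercept was declared. */
static bool new_check(EntrySketch e, bool guest)
{
    return e.has_intercept && guest;
}
```

Setting the flag inside the svm() macro, as the patch does, keeps the opcode tables unchanged while making every declared intercept visible to the check.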

diff --git a/target/i386/tcg/decode-new.h b/target/i386/tcg/decode-new.h
index 46a96b220d0..8465717ea21 100644
--- a/target/i386/tcg/decode-new.h
+++ b/target/i386/tcg/decode-new.h
@@ -272,6 +272,7 @@ struct X86OpEntry {
 unsigned valid_prefix:16;
 unsigned check:16;
 unsigned intercept:8;
+bool has_intercept:1;
 bool is_decode:1;
 };
 
diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc
index 4f5fcdb88dd..cd925fe3589 100644
--- a/target/i386/tcg/decode-new.c.inc
+++ b/target/i386/tcg/decode-new.c.inc
@@ -221,7 +221,7 @@
 #define vex13 .vex_class = 13,
 
 #define chk(a) .check = X86_CHECK_##a,
-#define svm(a) .intercept = SVM_EXIT_##a,
+#define svm(a) .intercept = SVM_EXIT_##a, .has_intercept = true,
 
 #define avx2_256 .vex_special = X86_VEX_AVX2_256,
 
@@ -2559,7 +2559,7 @@ static void disas_insn(DisasContext *s, CPUState *cpu)
 goto gp_fault;
 }
 }
-if (decode.e.intercept && unlikely(GUEST(s))) {
+if (decode.e.has_intercept && unlikely(GUEST(s))) {
 gen_helper_svm_check_intercept(tcg_env,
tcg_constant_i32(decode.e.intercept));
 }
-- 
2.45.1
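
The problem addressed by this patch can be shown in isolation: the SVM exit code for a CR0 read is 0, so testing `intercept` for non-zero cannot tell "intercept code 0" apart from "no intercept declared". The following is a minimal standalone sketch, not QEMU code; the `OpEntry` struct and both helper functions are hypothetical stand-ins for the fields and condition touched by the diff.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the relevant X86OpEntry bitfields. */
typedef struct OpEntry {
    unsigned intercept : 8;   /* SVM exit code; 0 is a valid code (read CR0) */
    bool has_intercept : 1;   /* set whenever an intercept was declared */
} OpEntry;

/* Old condition: an exit code of 0 is mistaken for "nothing to check". */
static bool needs_intercept_check_old(const OpEntry *e, bool guest)
{
    return e->intercept && guest;
}

/* New condition: a separate flag records that an intercept exists. */
static bool needs_intercept_check(const OpEntry *e, bool guest)
{
    return e->has_intercept && guest;
}
```

With an entry whose `intercept` is 0 and `has_intercept` is true, only the second helper reports that a check is needed while running as a guest.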




[PULL 21/25] target/i386: convert SHLD/SHRD to new decoder

2024-06-11 Thread Paolo Bonzini
Use the same flag generation code as SHL and SHR, but use
the existing gen_shiftd_rm_T1 function to compute the result
as well as CC_SRC.

Decoding-wise, SHLD/SHRD by immediate count as a 4 operand
instruction because s->T0 and s->T1 actually occupy three op
slots.  The infrastructure used by opcodes in the 0F 3A table
works fine.
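
For readers unfamiliar with the double-shift instructions, here is a rough sketch of the value computed for 32-bit operands. It is an illustration only and is not taken from QEMU: `shld32` and `shrd32` are hypothetical names, flags are not modelled, and the 5-bit count mask is assumed from the `mask` variable (31 for non-64-bit operands) visible in the diff.

```c
#include <stdint.h>

/* 32-bit SHLD: shift dst left, filling vacated low bits from the top of src. */
static uint32_t shld32(uint32_t dst, uint32_t src, unsigned count)
{
    count &= 31;              /* assumed 5-bit count mask for 32-bit operands */
    if (count == 0) {
        return dst;           /* also avoids an undefined shift by 32 in C */
    }
    return (dst << count) | (src >> (32 - count));
}

/* 32-bit SHRD: shift dst right, filling vacated high bits from the bottom of src. */
static uint32_t shrd32(uint32_t dst, uint32_t src, unsigned count)
{
    count &= 31;
    if (count == 0) {
        return dst;
    }
    return (dst >> count) | (src << (32 - count));
}
```

The three inputs (destination, source register, count) are what make the immediate form need a fourth operand slot in the decoder tables.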

Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c  | 84 +---
 target/i386/tcg/decode-new.c.inc |  8 ++-
 target/i386/tcg/emit.c.inc   | 42 
 3 files changed, 50 insertions(+), 84 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 5200b578a0e..33058db4e30 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -1434,57 +1434,11 @@ static bool check_cpl0(DisasContext *s)
 return false;
 }
 
-static void gen_shift_flags(DisasContext *s, MemOp ot, TCGv result,
-TCGv shm1, TCGv count, bool is_right)
-{
-TCGv_i32 z32, s32, oldop;
-TCGv z_tl;
-
-/* Store the results into the CC variables.  If we know that the
-   variable must be dead, store unconditionally.  Otherwise we'll
-   need to not disrupt the current contents.  */
-z_tl = tcg_constant_tl(0);
-if (cc_op_live[s->cc_op] & USES_CC_DST) {
-tcg_gen_movcond_tl(TCG_COND_NE, cpu_cc_dst, count, z_tl,
-   result, cpu_cc_dst);
-} else {
-tcg_gen_mov_tl(cpu_cc_dst, result);
-}
-if (cc_op_live[s->cc_op] & USES_CC_SRC) {
-tcg_gen_movcond_tl(TCG_COND_NE, cpu_cc_src, count, z_tl,
-   shm1, cpu_cc_src);
-} else {
-tcg_gen_mov_tl(cpu_cc_src, shm1);
-}
-
-/* Get the two potential CC_OP values into temporaries.  */
-tcg_gen_movi_i32(s->tmp2_i32, (is_right ? CC_OP_SARB : CC_OP_SHLB) + ot);
-if (s->cc_op == CC_OP_DYNAMIC) {
-oldop = cpu_cc_op;
-} else {
-tcg_gen_movi_i32(s->tmp3_i32, s->cc_op);
-oldop = s->tmp3_i32;
-}
-
-/* Conditionally store the CC_OP value.  */
-z32 = tcg_constant_i32(0);
-s32 = tcg_temp_new_i32();
-tcg_gen_trunc_tl_i32(s32, count);
-tcg_gen_movcond_i32(TCG_COND_NE, cpu_cc_op, s32, z32, s->tmp2_i32, oldop);
-
-/* The CC_OP value is no longer predictable.  */
-set_cc_op(s, CC_OP_DYNAMIC);
-}
-
 /* XXX: add faster immediate case */
-static TCGv gen_shiftd_rm_T1(DisasContext *s, MemOp ot,
- bool is_right, TCGv count_in)
+static void gen_shiftd_rm_T1(DisasContext *s, MemOp ot,
+ bool is_right, TCGv count)
 {
 target_ulong mask = (ot == MO_64 ? 63 : 31);
-TCGv count;
-
-count = tcg_temp_new();
-tcg_gen_andi_tl(count, count_in, mask);
 
 switch (ot) {
 case MO_16:
@@ -1546,8 +1500,6 @@ static TCGv gen_shiftd_rm_T1(DisasContext *s, MemOp ot,
 tcg_gen_or_tl(s->T0, s->T0, s->T1);
 break;
 }
-
-return count;
 }
 
 #define X86_MAX_INSN_LENGTH 15
@@ -3057,7 +3009,6 @@ static void disas_insn_old(DisasContext *s, CPUState *cpu, int b)
 CPUX86State *env = cpu_env(cpu);
 int prefixes = s->prefix;
 MemOp dflag = s->dflag;
-TCGv shift;
 MemOp ot;
 int modrm, reg, rm, mod, op, val;
 
@@ -3221,37 +3172,6 @@ static void disas_insn_old(DisasContext *s, CPUState *cpu, int b)
 }
 break;
 
-/**/
-/* shifts */
-case 0x1a4: /* shld imm */
-op = 0;
-shift = NULL;
-goto do_shiftd;
-case 0x1a5: /* shld cl */
-op = 0;
-shift = cpu_regs[R_ECX];
-goto do_shiftd;
-case 0x1ac: /* shrd imm */
-op = 1;
-shift = NULL;
-goto do_shiftd;
-case 0x1ad: /* shrd cl */
-op = 1;
-shift = cpu_regs[R_ECX];
-do_shiftd:
-ot = dflag;
-modrm = x86_ldub_code(env, s);
-reg = ((modrm >> 3) & 7) | REX_R(s);
-gen_ld_modrm(env, s, modrm, ot);
-if (!shift) {
-shift = tcg_constant_tl(x86_ldub_code(env, s));
-}
-gen_op_mov_v_reg(s, ot, s->T1, reg);
-shift = gen_shiftd_rm_T1(s, ot, op, shift);
-gen_st_modrm(env, s, modrm, ot);
-gen_shift_flags(s, ot, s->T0, s->tmp0, shift, op);
-break;
-
 //
 /* bit operations */
 case 0x1ba: /* bt/bts/btr/btc Gv, im */
diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc
index 1db9d1e2bc3..2d27b07f296 100644
--- a/target/i386/tcg/decode-new.c.inc
+++ b/target/i386/tcg/decode-new.c.inc
@@ -1114,6 +1114,8 @@ static const X86OpEntry opcodes_0F[256] = {
 [0xa0] = X86_OP_ENTRYr(PUSH, FS, w),
 [0xa1] = X86_OP_ENTRYw(POP, FS, w),
 [0xa2] = X86_OP_ENTRY0(CPUID),
+[0xa4] = X86_OP_ENTRY4(SHLD,  E,v, 2op,v, G,v),
+[0xa5] = X86_OP_ENTRY3(SHLD,  E,v, 2op,v, G,v),
 
 [0xb2] = X86_OP_ENTRY3(LSS,G,v, 

[PULL 19/25] target/i386: pull load/writeback out of gen_shiftd_rm_T1

2024-06-11 Thread Paolo Bonzini
Use gen_ld_modrm/gen_st_modrm, moving them and gen_shift_flags to the
caller.  This way, gen_shiftd_rm_T1 becomes something that the new
decoder can call.

Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c | 55 ++---
 1 file changed, 14 insertions(+), 41 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 4b2f7488022..5200b578a0e 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -535,15 +535,6 @@ static inline void gen_op_st_v(DisasContext *s, int idx, TCGv t0, TCGv a0)
 tcg_gen_qemu_st_tl(t0, a0, s->mem_index, idx | MO_LE);
 }
 
-static inline void gen_op_st_rm_T0_A0(DisasContext *s, int idx, int d)
-{
-if (d == OR_TMP0) {
-gen_op_st_v(s, idx, s->T0, s->A0);
-} else {
-gen_op_mov_reg_v(s, idx, d, s->T0);
-}
-}
-
 static void gen_update_eip_next(DisasContext *s)
 {
 assert(s->pc_save != -1);
@@ -1486,19 +1477,12 @@ static void gen_shift_flags(DisasContext *s, MemOp ot, TCGv result,
 }
 
 /* XXX: add faster immediate case */
-static void gen_shiftd_rm_T1(DisasContext *s, MemOp ot, int op1,
+static TCGv gen_shiftd_rm_T1(DisasContext *s, MemOp ot,
  bool is_right, TCGv count_in)
 {
 target_ulong mask = (ot == MO_64 ? 63 : 31);
 TCGv count;
 
-/* load */
-if (op1 == OR_TMP0) {
-gen_op_ld_v(s, ot, s->T0, s->A0);
-} else {
-gen_op_mov_v_reg(s, ot, s->T0, op1);
-}
-
 count = tcg_temp_new();
 tcg_gen_andi_tl(count, count_in, mask);
 
@@ -1563,10 +1547,7 @@ static void gen_shiftd_rm_T1(DisasContext *s, MemOp ot, int op1,
 break;
 }
 
-/* store */
-gen_op_st_rm_T0_A0(s, ot, op1);
-
-gen_shift_flags(s, ot, s->T0, s->tmp0, count, is_right);
+return count;
 }
 
 #define X86_MAX_INSN_LENGTH 15
@@ -3076,9 +3057,9 @@ static void disas_insn_old(DisasContext *s, CPUState *cpu, int b)
 CPUX86State *env = cpu_env(cpu);
 int prefixes = s->prefix;
 MemOp dflag = s->dflag;
-int shift;
+TCGv shift;
 MemOp ot;
-int modrm, reg, rm, mod, op, opreg, val;
+int modrm, reg, rm, mod, op, val;
 
 /* now check op code */
 switch (b) {
@@ -3244,39 +3225,31 @@ static void disas_insn_old(DisasContext *s, CPUState *cpu, int b)
 /* shifts */
 case 0x1a4: /* shld imm */
 op = 0;
-shift = 1;
+shift = NULL;
 goto do_shiftd;
 case 0x1a5: /* shld cl */
 op = 0;
-shift = 0;
+shift = cpu_regs[R_ECX];
 goto do_shiftd;
 case 0x1ac: /* shrd imm */
 op = 1;
-shift = 1;
+shift = NULL;
 goto do_shiftd;
 case 0x1ad: /* shrd cl */
 op = 1;
-shift = 0;
+shift = cpu_regs[R_ECX];
 do_shiftd:
 ot = dflag;
 modrm = x86_ldub_code(env, s);
-mod = (modrm >> 6) & 3;
-rm = (modrm & 7) | REX_B(s);
 reg = ((modrm >> 3) & 7) | REX_R(s);
-if (mod != 3) {
-gen_lea_modrm(env, s, modrm);
-opreg = OR_TMP0;
-} else {
-opreg = rm;
+gen_ld_modrm(env, s, modrm, ot);
+if (!shift) {
+shift = tcg_constant_tl(x86_ldub_code(env, s));
 }
 gen_op_mov_v_reg(s, ot, s->T1, reg);
-
-if (shift) {
-TCGv imm = tcg_constant_tl(x86_ldub_code(env, s));
-gen_shiftd_rm_T1(s, ot, opreg, op, imm);
-} else {
-gen_shiftd_rm_T1(s, ot, opreg, op, cpu_regs[R_ECX]);
-}
+shift = gen_shiftd_rm_T1(s, ot, op, shift);
+gen_st_modrm(env, s, modrm, ot);
+gen_shift_flags(s, ot, s->T0, s->tmp0, shift, op);
 break;
 
 //
-- 
2.45.1




[PULL 14/25] target/i386: fix bad sorting of entries in the 0F table

2024-06-11 Thread Paolo Bonzini
Aesthetic change only.

Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/decode-new.c.inc | 93 
 1 file changed, 46 insertions(+), 47 deletions(-)

diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc
index 4c567911f41..4e745f10dd8 100644
--- a/target/i386/tcg/decode-new.c.inc
+++ b/target/i386/tcg/decode-new.c.inc
@@ -1006,14 +1006,6 @@ static void decode_MOV_CR_DR(DisasContext *s, CPUX86State *env, X86OpEntry *entr
 }
 
 static const X86OpEntry opcodes_0F[256] = {
-[0x0E] = X86_OP_ENTRY0(EMMS,  cpuid(3DNOW)), /* femms */
-/*
- * 3DNow!'s opcode byte comes *after* modrm and displacements, making it
- * more like an Ib operand.  Dispatch to the right helper in a single gen_*
- * function.
- */
-[0x0F] = X86_OP_ENTRY3(3dnow,   P,q, Q,q, I,b,cpuid(3DNOW)),
-
 [0x10] = X86_OP_GROUP0(0F10),
 [0x11] = X86_OP_GROUP0(0F11),
 [0x12] = X86_OP_GROUP0(0F12),
@@ -1086,8 +1078,54 @@ static const X86OpEntry opcodes_0F[256] = {
 [0xa0] = X86_OP_ENTRYr(PUSH, FS, w),
 [0xa1] = X86_OP_ENTRYw(POP, FS, w),
 
+[0xb2] = X86_OP_ENTRY3(LSS,G,v, EM,p, None, None),
+[0xb4] = X86_OP_ENTRY3(LFS,G,v, EM,p, None, None),
+[0xb5] = X86_OP_ENTRY3(LGS,G,v, EM,p, None, None),
+[0xb6] = X86_OP_ENTRY3(MOV,G,v, E,b, None, None, zextT0), /* MOVZX */
+[0xb7] = X86_OP_ENTRY3(MOV,G,v, E,w, None, None, zextT0), /* MOVZX */
+
+[0xc2] = X86_OP_ENTRY4(VCMP,   V,x, H,x, W,x,   vex2_rep3 p_00_66_f3_f2),
+[0xc3] = X86_OP_ENTRY3(MOV,EM,y,G,y, None,None, cpuid(SSE2)), /* MOVNTI */
+[0xc4] = X86_OP_ENTRY4(PINSRW, V,dq,H,dq,E,w,   vex5 mmx p_00_66),
+[0xc5] = X86_OP_ENTRY3(PEXTRW, G,d, U,dq,I,b,   vex5 mmx p_00_66),
+[0xc6] = X86_OP_ENTRY4(VSHUF,  V,x, H,x, W,x,   vex4 p_00_66),
+
+[0xd0] = X86_OP_ENTRY3(VADDSUB,   V,x, H,x, W,x,vex2 cpuid(SSE3) p_66_f2),
+[0xd1] = X86_OP_ENTRY3(PSRLW_r,   V,x, H,x, W,x,vex4 mmx avx2_256 p_00_66),
+[0xd2] = X86_OP_ENTRY3(PSRLD_r,   V,x, H,x, W,x,vex4 mmx avx2_256 p_00_66),
+[0xd3] = X86_OP_ENTRY3(PSRLQ_r,   V,x, H,x, W,x,vex4 mmx avx2_256 p_00_66),
+[0xd4] = X86_OP_ENTRY3(PADDQ, V,x, H,x, W,x,vex4 mmx avx2_256 p_00_66),
+[0xd5] = X86_OP_ENTRY3(PMULLW,V,x, H,x, W,x,vex4 mmx avx2_256 p_00_66),
+[0xd6] = X86_OP_GROUP0(0FD6),
+[0xd7] = X86_OP_ENTRY3(PMOVMSKB,  G,d, None,None, U,x,  vex7 mmx avx2_256 p_00_66),
+
+[0xe0] = X86_OP_ENTRY3(PAVGB, V,x, H,x, W,x,vex4 mmx avx2_256 p_00_66),
+[0xe1] = X86_OP_ENTRY3(PSRAW_r,   V,x, H,x, W,x,vex7 mmx avx2_256 p_00_66),
+[0xe2] = X86_OP_ENTRY3(PSRAD_r,   V,x, H,x, W,x,vex7 mmx avx2_256 p_00_66),
+[0xe3] = X86_OP_ENTRY3(PAVGW, V,x, H,x, W,x,vex4 mmx avx2_256 p_00_66),
+[0xe4] = X86_OP_ENTRY3(PMULHUW,   V,x, H,x, W,x,vex4 mmx avx2_256 p_00_66),
+[0xe5] = X86_OP_ENTRY3(PMULHW,V,x, H,x, W,x,vex4 mmx avx2_256 p_00_66),
+[0xe6] = X86_OP_GROUP0(0FE6),
+[0xe7] = X86_OP_ENTRY3(MOVDQ, W,x, None,None, V,x,  vex1 mmx p_00_66), /* MOVNTQ/MOVNTDQ */
+
+[0xf0] = X86_OP_ENTRY3(MOVDQ,V,x, None,None, WM,x,  vex4_unal cpuid(SSE3) p_f2), /* LDDQU */
+[0xf1] = X86_OP_ENTRY3(PSLLW_r,  V,x, H,x, W,x, vex7 mmx avx2_256 p_00_66),
+[0xf2] = X86_OP_ENTRY3(PSLLD_r,  V,x, H,x, W,x, vex7 mmx avx2_256 p_00_66),
+[0xf3] = X86_OP_ENTRY3(PSLLQ_r,  V,x, H,x, W,x, vex7 mmx avx2_256 p_00_66),
+[0xf4] = X86_OP_ENTRY3(PMULUDQ,  V,x, H,x, W,x, vex4 mmx avx2_256 p_00_66),
+[0xf5] = X86_OP_ENTRY3(PMADDWD,  V,x, H,x, W,x, vex4 mmx avx2_256 p_00_66),
+[0xf6] = X86_OP_ENTRY3(PSADBW,   V,x, H,x, W,x, vex4 mmx avx2_256 p_00_66),
+[0xf7] = X86_OP_ENTRY3(MASKMOV,  None,None, V,dq, U,dq, vex4_unal avx2_256 mmx p_00_66),
+
 [0x0b] = X86_OP_ENTRY0(UD),   /* UD2 */
 [0x0d] = X86_OP_ENTRY1(NOP,  M,v),/* 3DNow! prefetch */
+[0x0e] = X86_OP_ENTRY0(EMMS,  cpuid(3DNOW)), /* femms */
+/*
+ * 3DNow!'s opcode byte comes *after* modrm and displacements, making it
+ * more like an Ib operand.  Dispatch to the right helper in a single gen_*
+ * function.
+ */
+[0x0f] = X86_OP_ENTRY3(3dnow,   P,q, Q,q, I,b,cpuid(3DNOW)),
 
 [0x18] = X86_OP_ENTRY1(NOP,  nop,v),  /* prefetch/reserved NOP */
 [0x19] = X86_OP_ENTRY1(NOP,  nop,v),  /* reserved NOP */
@@ -1169,23 +1207,11 @@ static const X86OpEntry opcodes_0F[256] = {
  */
 [0xaf] = X86_OP_ENTRY3(IMUL3,  G,v, E,v, 2op,v, sextT0),
 
-[0xb2] = X86_OP_ENTRY3(LSS,G,v, EM,p, None, None),
-[0xb4] = X86_OP_ENTRY3(LFS,G,v, EM,p, None, None),
-[0xb5] = X86_OP_ENTRY3(LGS,G,v, EM,p, None, None),
-[0xb6] = X86_OP_ENTRY3(MOV,  

[PULL 22/25] target/i386: convert LZCNT/TZCNT/BSF/BSR/POPCNT to new decoder

2024-06-11 Thread Paolo Bonzini
Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/decode-new.h |  1 +
 target/i386/tcg/translate.c  | 74 
 target/i386/tcg/decode-new.c.inc | 52 +++-
 target/i386/tcg/emit.c.inc   | 82 
 4 files changed, 133 insertions(+), 76 deletions(-)

diff --git a/target/i386/tcg/decode-new.h b/target/i386/tcg/decode-new.h
index c9f958bb0e5..9b684af7cd0 100644
--- a/target/i386/tcg/decode-new.h
+++ b/target/i386/tcg/decode-new.h
@@ -119,6 +119,7 @@ typedef enum X86CPUIDFeature {
 X86_FEAT_FXSR,
 X86_FEAT_MOVBE,
 X86_FEAT_PCLMULQDQ,
+X86_FEAT_POPCNT,
 X86_FEAT_SHA_NI,
 X86_FEAT_SSE,
 X86_FEAT_SSE2,
diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 33058db4e30..68a11f81786 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -823,11 +823,6 @@ static void gen_movs(DisasContext *s, MemOp ot)
 gen_op_add_reg(s, s->aflag, R_EDI, dshift);
 }
 
-static void gen_op_update1_cc(DisasContext *s)
-{
-tcg_gen_mov_tl(cpu_cc_dst, s->T0);
-}
-
 static void gen_op_update2_cc(DisasContext *s)
 {
 tcg_gen_mov_tl(cpu_cc_src, s->T1);
@@ -3311,56 +3306,6 @@ static void disas_insn_old(DisasContext *s, CPUState *cpu, int b)
 break;
 }
 break;
-case 0x1bc: /* bsf / tzcnt */
-case 0x1bd: /* bsr / lzcnt */
-ot = dflag;
-modrm = x86_ldub_code(env, s);
-reg = ((modrm >> 3) & 7) | REX_R(s);
-gen_ld_modrm(env, s, modrm, ot);
-gen_extu(ot, s->T0);
-
-/* Note that lzcnt and tzcnt are in different extensions.  */
-if ((prefixes & PREFIX_REPZ)
-&& (b & 1
-? s->cpuid_ext3_features & CPUID_EXT3_ABM
-: s->cpuid_7_0_ebx_features & CPUID_7_0_EBX_BMI1)) {
-int size = 8 << ot;
-/* For lzcnt/tzcnt, C bit is defined related to the input. */
-tcg_gen_mov_tl(cpu_cc_src, s->T0);
-if (b & 1) {
-/* For lzcnt, reduce the target_ulong result by the
-   number of zeros that we expect to find at the top.  */
-tcg_gen_clzi_tl(s->T0, s->T0, TARGET_LONG_BITS);
-tcg_gen_subi_tl(s->T0, s->T0, TARGET_LONG_BITS - size);
-} else {
-/* For tzcnt, a zero input must return the operand size.  */
-tcg_gen_ctzi_tl(s->T0, s->T0, size);
-}
-/* For lzcnt/tzcnt, Z bit is defined related to the result.  */
-gen_op_update1_cc(s);
-set_cc_op(s, CC_OP_BMILGB + ot);
-} else {
-/* For bsr/bsf, only the Z bit is defined and it is related
-   to the input and not the result.  */
-tcg_gen_mov_tl(cpu_cc_dst, s->T0);
-set_cc_op(s, CC_OP_LOGICB + ot);
-
-/* ??? The manual says that the output is undefined when the
-   input is zero, but real hardware leaves it unchanged, and
-   real programs appear to depend on that.  Accomplish this
-   by passing the output as the value to return upon zero.  */
-if (b & 1) {
-/* For bsr, return the bit index of the first 1 bit,
-   not the count of leading zeros.  */
-tcg_gen_xori_tl(s->T1, cpu_regs[reg], TARGET_LONG_BITS - 1);
-tcg_gen_clz_tl(s->T0, s->T0, s->T1);
-tcg_gen_xori_tl(s->T0, s->T0, TARGET_LONG_BITS - 1);
-} else {
-tcg_gen_ctz_tl(s->T0, s->T0, cpu_regs[reg]);
-}
-}
-gen_op_mov_reg_v(s, ot, reg, s->T0);
-break;
 case 0x100:
 modrm = x86_ldub_code(env, s);
 mod = (modrm >> 6) & 3;
@@ -3955,25 +3900,6 @@ static void disas_insn_old(DisasContext *s, CPUState *cpu, int b)
 }
 gen_nop_modrm(env, s, modrm);
 break;
-case 0x1b8: /* SSE4.2 popcnt */
-if ((prefixes & (PREFIX_REPZ | PREFIX_LOCK | PREFIX_REPNZ)) !=
- PREFIX_REPZ)
-goto illegal_op;
-if (!(s->cpuid_ext_features & CPUID_EXT_POPCNT))
-goto illegal_op;
-
-modrm = x86_ldub_code(env, s);
-reg = ((modrm >> 3) & 7) | REX_R(s);
-
-ot = dflag;
-gen_ld_modrm(env, s, modrm, ot);
-gen_extu(ot, s->T0);
-tcg_gen_mov_tl(cpu_cc_src, s->T0);
-tcg_gen_ctpop_tl(s->T0, s->T0);
-gen_op_mov_reg_v(s, ot, reg, s->T0);
-
-set_cc_op(s, CC_OP_POPCNT);
-break;
 default:
 g_assert_not_reached();
 }
diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc
index 2d27b07f296..15ebc1233ea 100644
--- a/target/i386/tcg/decode-new.c.inc
+++ b/target/i386/tcg/decode-new.c.inc
@@ -450,6 +450,50 @@ static void decode_0F7F(DisasContext *s, CPUX86State *env, X86OpEntry *entry, ui
 *entry = *decode_by_prefix(s, 

[PULL 17/25] target/i386: split X86_CHECK_prot into PE and VM86 checks

2024-06-11 Thread Paolo Bonzini
SYSENTER is allowed in VM86 mode, but not in real mode.  Split the check
so that PE and !VM86 are covered by separate bits.
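
The split can be pictured as two independent bits plus a combined alias. The sketch below mirrors the enum values and the two conditions from the diff, but it is a standalone illustration: the `CHECK_*` names and `mode_check_faults()` are hypothetical, and the CPU mode is passed in as two plain booleans rather than read from a `DisasContext`.

```c
#include <stdbool.h>

enum {
    CHECK_no_vm86      = 4,     /* fault in vm86 mode */
    CHECK_prot_or_vm86 = 512,   /* fault outside protected mode */
    /* Old meaning of "prot": protected mode and not vm86. */
    CHECK_prot         = CHECK_prot_or_vm86 | CHECK_no_vm86,
};

/* Returns true if an instruction with these check bits faults in this mode. */
static bool mode_check_faults(unsigned check, bool pe, bool vm86)
{
    if ((check & CHECK_prot_or_vm86) && !pe) {
        return true;
    }
    if ((check & CHECK_no_vm86) && vm86) {
        return true;
    }
    return false;
}
```

An instruction such as SYSENTER would carry only the `prot_or_vm86` bit, so it faults in real mode but is accepted in vm86 mode, while instructions using the combined `prot` value keep their previous behaviour.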

Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/decode-new.h | 8 ++--
 target/i386/tcg/decode-new.c.inc | 9 +++--
 2 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/target/i386/tcg/decode-new.h b/target/i386/tcg/decode-new.h
index 5577f7509aa..b46a9a0ccb3 100644
--- a/target/i386/tcg/decode-new.h
+++ b/target/i386/tcg/decode-new.h
@@ -149,8 +149,8 @@ typedef enum X86InsnCheck {
 X86_CHECK_i64 = 1,
 X86_CHECK_o64 = 2,
 
-/* Fault outside protected mode */
-X86_CHECK_prot = 4,
+/* Fault in vm86 mode */
+X86_CHECK_no_vm86 = 4,
 
 /* Privileged instruction checks */
 X86_CHECK_cpl0 = 8,
@@ -166,6 +166,10 @@ typedef enum X86InsnCheck {
 
 /* Fault if VEX.W=0 */
 X86_CHECK_W1 = 256,
+
+/* Fault outside protected mode, possibly including vm86 mode */
+X86_CHECK_prot_or_vm86 = 512,
+X86_CHECK_prot = X86_CHECK_prot_or_vm86 | X86_CHECK_no_vm86,
 } X86InsnCheck;
 
 typedef enum X86InsnSpecial {
diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc
index 1c6fa39c3eb..f02f7c62647 100644
--- a/target/i386/tcg/decode-new.c.inc
+++ b/target/i386/tcg/decode-new.c.inc
@@ -2558,8 +2558,13 @@ static void disas_insn(DisasContext *s, CPUState *cpu)
 goto illegal_op;
 }
 }
-if (decode.e.check & X86_CHECK_prot) {
-if (!PE(s) || VM86(s)) {
+if (decode.e.check & X86_CHECK_prot_or_vm86) {
+if (!PE(s)) {
+goto illegal_op;
+}
+}
+if (decode.e.check & X86_CHECK_no_vm86) {
+if (VM86(s)) {
 goto illegal_op;
 }
 }
-- 
2.45.1




[PULL 23/25] target/i386: convert XADD to new decoder

2024-06-11 Thread Paolo Bonzini
Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/translate.c  | 35 
 target/i386/tcg/decode-new.c.inc |  3 ++-
 target/i386/tcg/emit.c.inc   | 24 ++
 3 files changed, 26 insertions(+), 36 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 68a11f81786..5d9312bb48c 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -823,12 +823,6 @@ static void gen_movs(DisasContext *s, MemOp ot)
 gen_op_add_reg(s, s->aflag, R_EDI, dshift);
 }
 
-static void gen_op_update2_cc(DisasContext *s)
-{
-tcg_gen_mov_tl(cpu_cc_src, s->T1);
-tcg_gen_mov_tl(cpu_cc_dst, s->T0);
-}
-
 /* compute all eflags to reg */
 static void gen_mov_eflags(DisasContext *s, TCGv reg)
 {
@@ -3011,35 +3005,6 @@ static void disas_insn_old(DisasContext *s, CPUState *cpu, int b)
 switch (b) {
 /**/
 /* arith & logic */
-case 0x1c0:
-case 0x1c1: /* xadd Ev, Gv */
-ot = mo_b_d(b, dflag);
-modrm = x86_ldub_code(env, s);
-reg = ((modrm >> 3) & 7) | REX_R(s);
-mod = (modrm >> 6) & 3;
-gen_op_mov_v_reg(s, ot, s->T0, reg);
-if (mod == 3) {
-rm = (modrm & 7) | REX_B(s);
-gen_op_mov_v_reg(s, ot, s->T1, rm);
-tcg_gen_add_tl(s->T0, s->T0, s->T1);
-gen_op_mov_reg_v(s, ot, reg, s->T1);
-gen_op_mov_reg_v(s, ot, rm, s->T0);
-} else {
-gen_lea_modrm(env, s, modrm);
-if (s->prefix & PREFIX_LOCK) {
-tcg_gen_atomic_fetch_add_tl(s->T1, s->A0, s->T0,
-s->mem_index, ot | MO_LE);
-tcg_gen_add_tl(s->T0, s->T0, s->T1);
-} else {
-gen_op_ld_v(s, ot, s->T1, s->A0);
-tcg_gen_add_tl(s->T0, s->T0, s->T1);
-gen_op_st_v(s, ot, s->T0, s->A0);
-}
-gen_op_mov_reg_v(s, ot, reg, s->T1);
-}
-gen_op_update2_cc(s);
-set_cc_op(s, CC_OP_ADDB + ot);
-break;
 case 0x1b0:
 case 0x1b1: /* cmpxchg Ev, Gv */
 {
diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc
index 15ebc1233ea..008a8387bda 100644
--- a/target/i386/tcg/decode-new.c.inc
+++ b/target/i386/tcg/decode-new.c.inc
@@ -1167,6 +1167,8 @@ static const X86OpEntry opcodes_0F[256] = {
 [0xb6] = X86_OP_ENTRY3(MOV,G,v, E,b, None, None, zextT0), /* MOVZX */
 [0xb7] = X86_OP_ENTRY3(MOV,G,v, E,w, None, None, zextT0), /* MOVZX */
 
+[0xc0] = X86_OP_ENTRY2(XADD,   E,b, G,b,lock),
+[0xc1] = X86_OP_ENTRY2(XADD,   E,v, G,v,lock),
 [0xc2] = X86_OP_ENTRY4(VCMP,   V,x, H,x, W,x,   vex2_rep3 p_00_66_f3_f2),
 [0xc3] = X86_OP_ENTRY3(MOV,EM,y,G,y, None,None, cpuid(SSE2)), /* MOVNTI */
 [0xc4] = X86_OP_ENTRY4(PINSRW, V,dq,H,dq,E,w,   vex5 mmx p_00_66),
@@ -2590,7 +2592,6 @@ static void disas_insn(DisasContext *s, CPUState *cpu)
 case 0xb0 ... 0xb1: /* cmpxchg */
 case 0xb3:  /* btr */
 case 0xba ... 0xbb: /* grp8, btc */
-case 0xc0 ... 0xc1: /* xadd */
 case 0xc7:  /* grp9 */
 disas_insn_old(s, cpu, b + 0x100);
 return;
diff --git a/target/i386/tcg/emit.c.inc b/target/i386/tcg/emit.c.inc
index 7f554ba1173..9c8fe14e286 100644
--- a/target/i386/tcg/emit.c.inc
+++ b/target/i386/tcg/emit.c.inc
@@ -4368,6 +4368,30 @@ static void gen_WRxxBASE(DisasContext *s, X86DecodedInsn *decode)
 tcg_gen_mov_tl(base, s->T0);
 }
 
+static void gen_XADD(DisasContext *s, X86DecodedInsn *decode)
+{
+MemOp ot = decode->op[1].ot;
+
+decode->cc_dst = tcg_temp_new();
+decode->cc_src = s->T1;
+decode->cc_op = CC_OP_ADDB + ot;
+
+if (s->prefix & PREFIX_LOCK) {
+tcg_gen_atomic_fetch_add_tl(s->T0, s->A0, s->T1, s->mem_index, ot | MO_LE);
+tcg_gen_add_tl(decode->cc_dst, s->T0, s->T1);
+} else {
+tcg_gen_add_tl(decode->cc_dst, s->T0, s->T1);
+/*
+ * NOTE: writing memory first is important for MMU exceptions,
+ * but "new result" wins for XADD AX, AX.
+ */
+gen_writeback(s, decode, 0, decode->cc_dst);
+}
+if (decode->op[0].has_ea || decode->op[2].n != decode->op[0].n) {
+gen_writeback(s, decode, 2, s->T0);
+}
+}
+
 static void gen_XCHG(DisasContext *s, X86DecodedInsn *decode)
 {
 if (s->prefix & PREFIX_LOCK) {
-- 
2.45.1
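
The ordering note in `gen_XADD` ("new result" wins for XADD AX, AX) can be illustrated for the register-to-register case with a small standalone sketch. This is not QEMU code: `xadd_reg` and the plain `regs` array are hypothetical, and flags and memory operands are left out.

```c
#include <stdint.h>

/* Register-to-register XADD on a hypothetical register file. */
static void xadd_reg(uint32_t *regs, int dst, int src)
{
    uint32_t old_dst = regs[dst];
    uint32_t sum = old_dst + regs[src];

    regs[dst] = sum;             /* destination receives the sum */
    if (src != dst) {
        regs[src] = old_dst;     /* source receives the old destination */
    }
    /* When src == dst the second write is skipped, so the sum is kept. */
}
```

This matches the condition in the patch that skips the second writeback when both operands name the same register.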




[PULL 05/25] i386/sev: Return when sev_common is null

2024-06-11 Thread Paolo Bonzini
From: Pankaj Gupta 

Fixes Coverity CID 1546885.

Fixes: 16dcf200dc ("i386/sev: Introduce "sev-common" type to encapsulate common SEV state")
Signed-off-by: Pankaj Gupta 
Message-ID: <20240607183611.100-4-pankaj.gu...@amd.com>
Signed-off-by: Paolo Bonzini 
---
 target/i386/sev.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/target/i386/sev.c b/target/i386/sev.c
index f18432f58e2..c40562dce31 100644
--- a/target/i386/sev.c
+++ b/target/i386/sev.c
@@ -587,6 +587,7 @@ static SevCapability *sev_get_capabilities(Error **errp)
 sev_common = SEV_COMMON(MACHINE(qdev_get_machine())->cgs);
 if (!sev_common) {
 error_setg(errp, "SEV is not configured");
+return NULL;
 }
 
 sev_device = object_property_get_str(OBJECT(sev_common), "sev-device",
-- 
2.45.1




[PULL 16/25] target/i386: replace read_crN helper with read_cr8

2024-06-11 Thread Paolo Bonzini
All other control registers are stored plainly in CPUX86State.

Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 target/i386/helper.h |  2 +-
 target/i386/tcg/sysemu/misc_helper.c | 20 +---
 target/i386/tcg/emit.c.inc   |  2 +-
 3 files changed, 7 insertions(+), 17 deletions(-)

diff --git a/target/i386/helper.h b/target/i386/helper.h
index 2f46cffabd8..eeb8df56eaa 100644
--- a/target/i386/helper.h
+++ b/target/i386/helper.h
@@ -95,7 +95,7 @@ DEF_HELPER_FLAGS_2(monitor, TCG_CALL_NO_WG, void, env, tl)
 DEF_HELPER_FLAGS_2(mwait, TCG_CALL_NO_WG, noreturn, env, int)
 DEF_HELPER_1(rdmsr, void, env)
 DEF_HELPER_1(wrmsr, void, env)
-DEF_HELPER_FLAGS_2(read_crN, TCG_CALL_NO_RWG, tl, env, int)
+DEF_HELPER_FLAGS_1(read_cr8, TCG_CALL_NO_RWG, tl, env)
 DEF_HELPER_FLAGS_3(write_crN, TCG_CALL_NO_RWG, void, env, int, tl)
 #endif /* !CONFIG_USER_ONLY */
 
diff --git a/target/i386/tcg/sysemu/misc_helper.c b/target/i386/tcg/sysemu/misc_helper.c
index 7fa0c5a06de..094aa56a20d 100644
--- a/target/i386/tcg/sysemu/misc_helper.c
+++ b/target/i386/tcg/sysemu/misc_helper.c
@@ -63,23 +63,13 @@ target_ulong helper_inl(CPUX86State *env, uint32_t port)
  cpu_get_mem_attrs(env), NULL);
 }
 
-target_ulong helper_read_crN(CPUX86State *env, int reg)
+target_ulong helper_read_cr8(CPUX86State *env)
 {
-target_ulong val;
-
-switch (reg) {
-default:
-val = env->cr[reg];
-break;
-case 8:
-if (!(env->hflags2 & HF2_VINTR_MASK)) {
-val = cpu_get_apic_tpr(env_archcpu(env)->apic_state);
-} else {
-val = env->int_ctl & V_TPR_MASK;
-}
-break;
+if (!(env->hflags2 & HF2_VINTR_MASK)) {
+return cpu_get_apic_tpr(env_archcpu(env)->apic_state);
+} else {
+return env->int_ctl & V_TPR_MASK;
 }
-return val;
 }
 
 void helper_write_crN(CPUX86State *env, int reg, target_ulong t0)
diff --git a/target/i386/tcg/emit.c.inc b/target/i386/tcg/emit.c.inc
index 5ca3764e006..709ef7b0cb2 100644
--- a/target/i386/tcg/emit.c.inc
+++ b/target/i386/tcg/emit.c.inc
@@ -245,7 +245,7 @@ static void gen_load(DisasContext *s, X86DecodedInsn *decode, int opn, TCGv v)
 #ifndef CONFIG_USER_ONLY
 case X86_OP_CR:
 if (op->n == 8) {
-gen_helper_read_crN(v, tcg_env, tcg_constant_i32(op->n));
+gen_helper_read_cr8(v, tcg_env);
 } else {
 tcg_gen_ld_tl(v, tcg_env, offsetof(CPUX86State, cr[op->n]));
 }
-- 
2.45.1




[PULL 04/25] i386/sev: Move SEV_COMMON null check before dereferencing

2024-06-11 Thread Paolo Bonzini
From: Pankaj Gupta 

Fixes Coverity CID 1546886.

Fixes: 9861405a8f ("i386/sev: Invoke launch_updata_data() for SEV class")
Signed-off-by: Pankaj Gupta 
Message-ID: <20240607183611.100-3-pankaj.gu...@amd.com>
Signed-off-by: Paolo Bonzini 
---
 target/i386/sev.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/target/i386/sev.c b/target/i386/sev.c
index 7c9df621de1..f18432f58e2 100644
--- a/target/i386/sev.c
+++ b/target/i386/sev.c
@@ -1529,11 +1529,12 @@ int
 sev_encrypt_flash(hwaddr gpa, uint8_t *ptr, uint64_t len, Error **errp)
 {
 SevCommonState *sev_common = SEV_COMMON(MACHINE(qdev_get_machine())->cgs);
-SevCommonStateClass *klass = SEV_COMMON_GET_CLASS(sev_common);
+SevCommonStateClass *klass;
 
 if (!sev_common) {
 return 0;
 }
+klass = SEV_COMMON_GET_CLASS(sev_common);
 
 /* if SEV is in update state then encrypt the data else do nothing */
 if (sev_check_state(sev_common, SEV_STATE_LAUNCH_UPDATE)) {
-- 
2.45.1




[PULL 02/25] i386/cpu: fixup number of addressable IDs for processor cores in the physical package

2024-06-11 Thread Paolo Bonzini
From: Chuang Xu 

When QEMU is started with:
-cpu host,host-cache-info=on,l3-cache=off \
-smp 2,sockets=1,dies=1,cores=1,threads=2
Guest can't acquire maximum number of addressable IDs for processor cores in
the physical package from CPUID[04H].

When creating a CPU topology of 1 core per package, host-cache-info only
uses the Host's addressable core IDs field (CPUID.04H.EAX[bits 31-26]),
resulting in a conflict (on a multicore Host) between the Guest core
topology information in this field and the Guest's actual number of cores.

Fix it by removing the unnecessary condition to cover 1 core per package
case. This is safe because cores_per_pkg will not be 0 and will be at
least 1.
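
The field update described above amounts to clearing bits 31-26 of EAX and writing a guest-derived value there unconditionally. A minimal sketch follows; `set_addressable_core_ids()` is a hypothetical helper, and its `max_core_ids` parameter stands in for whatever `max_core_ids_in_package()` returns in the patch.

```c
#include <stdint.h>

/* Replace CPUID.04H:EAX[31:26] with a guest-derived value. */
static uint32_t set_addressable_core_ids(uint32_t eax, uint32_t max_core_ids)
{
    eax &= ~UINT32_C(0xFC000000);          /* clear bits 31..26 */
    eax |= (max_core_ids & 0x3F) << 26;    /* store the 6-bit field */
    return eax;
}
```

Because the clear now happens even for a single-core guest, whatever the host reported in that field no longer leaks through.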

Fixes: d7caf13b5fcf ("x86: cpu: fixup number of addressable IDs for logical processors sharing cache")
Signed-off-by: Guixiong Wei 
Signed-off-by: Yipeng Yin 
Signed-off-by: Chuang Xu 
Reviewed-by: Zhao Liu 
Message-ID: <20240611032314.64076-1-xuchuangxc...@bytedance.com>
Signed-off-by: Paolo Bonzini 
---
 target/i386/cpu.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 7466217d5ea..365852cb99e 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6455,10 +6455,8 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
 if (*eax & 31) {
 int host_vcpus_per_cache = 1 + ((*eax & 0x3FFC000) >> 14);
 
-if (cores_per_pkg > 1) {
-*eax &= ~0xFC000000;
-*eax |= max_core_ids_in_package(&topo_info) << 26;
-}
+*eax &= ~0xFC000000;
+*eax |= max_core_ids_in_package(&topo_info) << 26;
 if (host_vcpus_per_cache > threads_per_pkg) {
 *eax &= ~0x3FFC000;
 
-- 
2.45.1




[PULL 03/25] i386/sev: fix unreachable code coverity issue

2024-06-11 Thread Paolo Bonzini
From: Pankaj Gupta 

Set 'finish->id_block_en' early, so that it is properly reset.
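
The pattern behind this fix is "reset the flag first, set it only on success", so that every early return leaves it cleared. The sketch below is a hypothetical illustration only: the `Finish` struct, `set_id_block()` and the length check (which stands in for the real decoding and validation) are not taken from QEMU.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the launch-finish configuration. */
typedef struct Finish {
    int id_block_en;
    size_t id_block_len;
} Finish;

static void set_id_block(Finish *f, const char *value, size_t expected_len)
{
    size_t len;

    /* Reset up front so that any early return leaves the block disabled. */
    f->id_block_en = 0;
    f->id_block_len = 0;

    if (value == NULL) {
        return;
    }
    len = strlen(value);
    if (len != expected_len) {
        return;                 /* invalid input: flag stays cleared */
    }

    f->id_block_len = len;
    f->id_block_en = 1;         /* reached only on success */
}
```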

Fixes coverity CID 1546887.

Fixes: 7b34df4426 ("i386/sev: Introduce 'sev-snp-guest' object")
Signed-off-by: Pankaj Gupta 
Message-ID: <20240607183611.100-2-pankaj.gu...@amd.com>
Signed-off-by: Paolo Bonzini 
---
 target/i386/sev.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/target/i386/sev.c b/target/i386/sev.c
index 004c667ac14..7c9df621de1 100644
--- a/target/i386/sev.c
+++ b/target/i386/sev.c
@@ -2165,6 +2165,7 @@ sev_snp_guest_set_id_block(Object *obj, const char *value, Error **errp)
 struct kvm_sev_snp_launch_finish *finish = &sev_snp_guest->kvm_finish_conf;
 gsize len;
 
+finish->id_block_en = 0;
 g_free(sev_snp_guest->id_block);
 g_free((guchar *)finish->id_block_uaddr);
 
@@ -2184,7 +2185,7 @@ sev_snp_guest_set_id_block(Object *obj, const char *value, Error **errp)
 return;
 }
 
-finish->id_block_en = (len) ? 1 : 0;
+finish->id_block_en = 1;
 }
 
 static char *
-- 
2.45.1




Re: [RFC PATCH v1 1/6] build-sys: Add rust feature option

2024-06-11 Thread Alex Bennée
Stefan Hajnoczi  writes:

> On Mon, Jun 10, 2024 at 09:22:36PM +0300, Manos Pitsidianakis wrote:
>> Add options for Rust in meson_options.txt, meson.build, configure to
>> prepare for adding Rust code in the followup commits.
>> 
>> `rust` is a reserved meson name, so we have to use an alternative.
>> `with_rust` was chosen.
>> 
>> Signed-off-by: Manos Pitsidianakis 
>> ---
>> The cargo wrapper script hardcodes some rust target triples. This is 
>> just temporary.
>> ---
>>  .gitignore   |   2 +
>>  configure|  12 +++
>>  meson.build  |  11 ++
>>  meson_options.txt|   4 +
>>  scripts/cargo_wrapper.py | 211 +++
>>  5 files changed, 240 insertions(+)
>>  create mode 100644 scripts/cargo_wrapper.py
>> 
>> diff --git a/.gitignore b/.gitignore
>> index 61fa39967b..f42b0d937e 100644
>> --- a/.gitignore
>> +++ b/.gitignore
>> @@ -2,6 +2,8 @@
>>  /build/
>>  /.cache/
>>  /.vscode/
>> +/target/
>> +rust/**/target
>
> Are these necessary since the cargo build command-line below uses
> --target-dir ?
>
> Adding new build output directories outside build/ makes it harder to
> clean up the source tree and ensure no state from previous builds
> remains.

Indeed my tree looks like:

 $SRC
   /builds
 /buildA
 /buildB

etc. So I would expect the rust build stuff to be in the builddir
because I have multiple configurations.

>
>>  *.pyc
>>  .sdk
>>  .stgit-*
>> diff --git a/configure b/configure
>> index 38ee257701..c195630771 100755
>> --- a/configure
>> +++ b/configure
>> @@ -302,6 +302,9 @@ else
>>objcc="${objcc-${cross_prefix}clang}"
>>  fi
>>  
>> +with_rust="auto"
>> +with_rust_target_triple=""
>> +
>>  ar="${AR-${cross_prefix}ar}"
>>  as="${AS-${cross_prefix}as}"
>>  ccas="${CCAS-$cc}"
>> @@ -760,6 +763,12 @@ for opt do
>>;;
>>--gdb=*) gdb_bin="$optarg"
>>;;
>> +  --enable-rust) with_rust=enabled
>> +  ;;
>> +  --disable-rust) with_rust=disabled
>> +  ;;
>> +  --rust-target-triple=*) with_rust_target_triple="$optarg"
>> +  ;;
>># everything else has the same name in configure and meson
>>--*) meson_option_parse "$opt" "$optarg"
>>;;
>> @@ -1796,6 +1805,9 @@ if test "$skip_meson" = no; then
>>test -n "${LIB_FUZZING_ENGINE+xxx}" && meson_option_add "-Dfuzzing_engine=$LIB_FUZZING_ENGINE"
>>test "$plugins" = yes && meson_option_add "-Dplugins=true"
>>test "$tcg" != enabled && meson_option_add "-Dtcg=$tcg"
>> +  test "$with_rust" != enabled && meson_option_add "-Dwith_rust=$with_rust"
>> +  test "$with_rust" != enabled && meson_option_add "-Dwith_rust=$with_rust"
>
> Duplicate line.
>
>> +  test "$with_rust_target_triple" != "" && meson_option_add 
>> "-Dwith_rust_target_triple=$with_rust_target_triple"
>>run_meson() {
>>  NINJA=$ninja $meson setup "$@" "$PWD" "$source_path"
>>}
>> diff --git a/meson.build b/meson.build
>> index a9de71d450..3533889852 100644
>> --- a/meson.build
>> +++ b/meson.build
>> @@ -290,6 +290,12 @@ foreach lang : all_languages
>>endif
>>  endforeach
>>  
>> +cargo = not_found
>> +if get_option('with_rust').allowed()
>> +  cargo = find_program('cargo', required: get_option('with_rust'))
>> +endif
>> +with_rust = cargo.found()
>> +
>>  # default flags for all hosts
>>  # We use -fwrapv to tell the compiler that we require a C dialect where
>>  # left shift of signed integers is well defined and has the expected
>> @@ -2066,6 +2072,7 @@ endif
>>  
>>  config_host_data = configuration_data()
>>  
>> +config_host_data.set('CONFIG_WITH_RUST', with_rust)
>>  audio_drivers_selected = []
>>  if have_system
>>audio_drivers_available = {
>> @@ -4190,6 +4197,10 @@ if 'objc' in all_languages
>>  else
>>summary_info += {'Objective-C compiler': false}
>>  endif
>> +summary_info += {'Rust support':  with_rust}
>> +if with_rust and get_option('with_rust_target_triple') != ''
>> +  summary_info += {'Rust target': get_option('with_rust_target_triple')}
>> +endif
>>  option_cflags = (get_option('debug') ? ['-g'] : [])
>>  if get_option('optimization') != 'plain'
>>option_cflags += ['-O' + get_option('optimization')]
>> diff --git a/meson_options.txt b/meson_options.txt
>> index 4c1583eb40..223491b731 100644
>> --- a/meson_options.txt
>> +++ b/meson_options.txt
>> @@ -366,3 +366,7 @@ option('qemu_ga_version', type: 'string', value: '',
>>  
>>  option('hexagon_idef_parser', type : 'boolean', value : true,
>> description: 'use idef-parser to automatically generate TCG code for 
>> the Hexagon frontend')
>> +option('with_rust', type: 'feature', value: 'auto',
>> +   description: 'Enable Rust support')
>> +option('with_rust_target_triple', type : 'string', value: '',
>> +   description: 'Rust target triple')
>> diff --git a/scripts/cargo_wrapper.py b/scripts/cargo_wrapper.py
>> new file mode 100644
>> index 00..d338effdaa
>> --- /dev/null
>> +++ b/scripts/cargo_wrapper.py
>> @@ -0,0 +1,211 @@
>> +#!/usr/bin/env python3
>> +# 

Re: [RFC PATCH v1 0/6] Implement ARM PL011 in Rust

2024-06-11 Thread Zhao Liu
On Tue, Jun 11, 2024 at 01:41:57PM +0300, Manos Pitsidianakis wrote:
> Date: Tue, 11 Jun 2024 13:41:57 +0300
> From: Manos Pitsidianakis 
> Subject: Re: [RFC PATCH v1 0/6] Implement ARM PL011 in Rust
> 
> > Currently, pl011 exclusively occupies a cargo as a package. In the
> > future, will other Rust implementations utilize the workspace mechanism
> > to act as a second package in the same cargo? Or will new cargo be created
> > again?
> 
> What do you mean by "new cargo"? I didn't catch that :(
> 
> A workspace would make sense if we have "general" crate libraries that
> hardware crates depend on.

Thanks Manos!

I mean if we spread the Rust devices across the QEMU submodules, wouldn't
we have to create a separate cargo directory (i.e. a single-package cargo
project) for each Rust device?

However, if the Rust code is all centralized under the rust/ directory,
then it can be managed as multiple packages in a cargo workspace.

About the "general" crate, I'm not sure whether a base lib to manage
external crates is a good idea, like I replied in [1].

[1]: 
https://lore.kernel.org/qemu-devel/CAJSP0QWLe6yPDE3rPztx=os0g+vkt9w3gykrnu0eqzcaw06...@mail.gmail.com/T/#mfaf9abf06ed82dd7f8ce5e7520bbb4447083b550

> > 
> > Under a unified Rust directory, using a workspace to manage multiple
> > packages looks as if it would be easier to maintain. Decentralized to an
> > existing directory, they're all separate cargos, and external dependencies
> > tend to become fragmented?
> 
> Hmm potentially yes, but that's a "what if" scenario. Let's worry about that
> bridge when we cross it!
>

Yes!





Re: [RFC PATCH v1 0/6] Implement ARM PL011 in Rust

2024-06-11 Thread Daniel P . Berrangé
On Tue, Jun 11, 2024 at 03:16:19PM +0200, Paolo Bonzini wrote:
> On Tue, Jun 11, 2024 at 10:22 AM Daniel P. Berrangé wrote:
> >
> > On Mon, Jun 10, 2024 at 09:22:35PM +0300, Manos Pitsidianakis wrote:
> > > Hello everyone,
> > >
> > > This is an early draft of my work on implementing a very simple device,
> > > in this case the ARM PL011 (which in C code resides in hw/char/pl011.c
> > > and is used in hw/arm/virt.c).
> >
> > looking at the diffstat:
> >
> > >  .gitignore |   2 +
> > >  .gitlab-ci.d/buildtest.yml |  64 ++--
> > >  configure  |  12 +
> > >  hw/arm/virt.c  |   2 +-
> > >  meson.build|  99 ++
> > >  meson_options.txt  |   4 +
> > >  rust/meson.build   |  93 ++
> > >  rust/pl011/.cargo/config.toml  |   2 +
> > >  rust/pl011/.gitignore  |   2 +
> > >  rust/pl011/Cargo.lock  | 120 +++
> > >  rust/pl011/Cargo.toml  |  26 ++
> > >  rust/pl011/README.md   |  42 +++
> > >  rust/pl011/build.rs|  44 +++
> > >  rust/pl011/meson.build |   7 +
> > >  rust/pl011/rustfmt.toml|  10 +
> > >  rust/pl011/src/definitions.rs  |  95 ++
> > >  rust/pl011/src/device.rs   | 531 ++
> > >  rust/pl011/src/device_class.rs |  95 ++
> > >  rust/pl011/src/generated.rs|   5 +
> > >  rust/pl011/src/lib.rs  | 575 +
> > >  rust/pl011/src/memory_ops.rs   |  38 +++
> >
> > My thought is that if we're going to start implementing devices
> > or other parts of QEMU, in Rust, then I do not want to see it
> > placed in a completely separate directory sub-tree.
> >
> > In this example, I would expect to have hw/arm/pl011.rs, or 
> > hw/arm/pl011/*.rs
> > so that the device is part of the normal Arm hardware directory structure
> > and maintainer assignments.
> 
> I think that's incompatible with the layout that Cargo expects.
> rust/hw/arm/pl011/ could be another possibility.

It doesn't look like it's a problem in this patch series. It is just
introducing a "rust/pl011/Cargo.toml", and I don't see anything
that has a fundamental assumption that it is below a 'rust/' top
level dir, as opposed to under our existing dir structure.

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v2] qapi: clarify that the default is backend dependent

2024-06-11 Thread Markus Armbruster
Stefano Garzarella  writes:

> The default value of the @share option of the @MemoryBackendProperties
> really depends on the backend type, so let's document the default
> values in the same place where we define the option to avoid
> dispersing the information.
>
> Cc: David Hildenbrand 
> Suggested-by: Markus Armbruster 
> Signed-off-by: Stefano Garzarella 

Reviewed-by: Markus Armbruster 

and queued.  Thanks!




Re: [PATCH] hw/net/virtio-net.c: fix crash in iov_copy()

2024-06-11 Thread Alex Bennée
Дмитрий Фролов  writes:

> ping
>
> https://patchew.org/QEMU/20240527133140.218300-2-fro...@swemel.ru/
>
> On 27.05.2024 16:31, Dmitry Frolov wrote:
>> A crash found while fuzzing device virtio-net-socket-check-used.
>> Assertion "offset == 0" in iov_copy() fails if less than guest_hdr_len bytes
>> were transmitted.
>>
>> Signed-off-by: Dmitry Frolov 
>> ---
>>   hw/net/virtio-net.c | 6 ++
>>   1 file changed, 6 insertions(+)
>>
>> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
>> index 24e5e7d347..603b80a50a 100644
>> --- a/hw/net/virtio-net.c
>> +++ b/hw/net/virtio-net.c
>> @@ -2783,6 +2783,12 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
>>*/
>>   assert(n->host_hdr_len <= n->guest_hdr_len);
>>   if (n->host_hdr_len != n->guest_hdr_len) {
>> +if (iov_size(out_sg, out_num) < n->guest_hdr_len) {
>> +virtio_error(vdev, "virtio-net header is invalid");
>> +virtqueue_detach_element(q->tx_vq, elem, 0);
>> +g_free(elem);
>> +return -EINVAL;
>> +}

Isn't this basically another case for goto detach?

Although the use of gotos here is a bit of a code smell. I wonder if
there is any way to better structure this function and take care of the
auto-freeing of elements?

>>   unsigned sg_num = iov_copy(sg, ARRAY_SIZE(sg),
>>  out_sg, out_num,
>>  0, n->host_hdr_len);

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro
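
The length check and the single cleanup path discussed above can be modelled
outside of QEMU. The following is only a sketch in Python, not the C fix
itself: iov_size() mirrors the QEMU helper of the same name, while
flush_one_element() and the detach callback are hypothetical names made up
for illustration.

```python
EINVAL = 22


def iov_size(iov):
    """Total number of bytes across all segments, like QEMU's iov_size()."""
    return sum(len(segment) for segment in iov)


def flush_one_element(out_sg, guest_hdr_len, detach):
    """Model one TX element (hypothetical helper, not QEMU code).

    Returns the payload length, or -EINVAL if the scatter-gather list is
    too short to contain the guest header. Every error goes through the
    same except block, so the element is detached in exactly one place.
    """
    try:
        if iov_size(out_sg) < guest_hdr_len:
            raise ValueError("virtio-net header is invalid")
        return iov_size(out_sg) - guest_hdr_len
    except ValueError:
        detach()
        return -EINVAL
```

In C the same shape would be the existing detach label or an auto-cleanup
attribute on the element; the point of the model is only that the short-header
case needs no cleanup code of its own.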



Re: [RFC PATCH v1 1/6] build-sys: Add rust feature option

2024-06-11 Thread Stefan Hajnoczi
On Mon, Jun 10, 2024 at 09:22:36PM +0300, Manos Pitsidianakis wrote:
> Add options for Rust in meson_options.txt, meson.build, configure to
> prepare for adding Rust code in the followup commits.
> 
> `rust` is a reserved meson name, so we have to use an alternative.
> `with_rust` was chosen.
> 
> Signed-off-by: Manos Pitsidianakis 
> ---
> The cargo wrapper script hardcodes some rust target triples. This is 
> just temporary.
> ---
>  .gitignore   |   2 +
>  configure|  12 +++
>  meson.build  |  11 ++
>  meson_options.txt|   4 +
>  scripts/cargo_wrapper.py | 211 +++
>  5 files changed, 240 insertions(+)
>  create mode 100644 scripts/cargo_wrapper.py
> 
> diff --git a/.gitignore b/.gitignore
> index 61fa39967b..f42b0d937e 100644
> --- a/.gitignore
> +++ b/.gitignore
> @@ -2,6 +2,8 @@
>  /build/
>  /.cache/
>  /.vscode/
> +/target/
> +rust/**/target

Are these necessary since the cargo build command-line below uses
--target-dir ?

Adding new build output directories outside build/ makes it harder to
clean up the source tree and ensure no state from previous builds
remains.
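
To illustrate the point about keeping Rust artifacts inside the build
directory, here is a small Python sketch of how a wrapper could assemble the
cargo command line. It is not the code from scripts/cargo_wrapper.py:
cargo_build_argv() and the "rust-target" subdirectory name are assumptions,
only the cargo flags themselves (--manifest-path, --target-dir, --target,
--release) are real.

```python
from pathlib import Path


def cargo_build_argv(build_dir, manifest_path, target_triple="", release=False):
    """Build a cargo command line whose artifacts land under build_dir."""
    # "rust-target" is a hypothetical subdirectory name, not taken from the patch.
    target_dir = Path(build_dir) / "rust-target"
    argv = [
        "cargo", "build",
        "--manifest-path", str(manifest_path),
        "--target-dir", str(target_dir),
    ]
    if target_triple:
        # Only pass --target when a triple was explicitly configured.
        argv += ["--target", target_triple]
    if release:
        argv.append("--release")
    return argv
```

With something like this, two build directories get two independent target
directories and nothing is written to the source tree, so the extra
.gitignore entries would not be needed.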

>  *.pyc
>  .sdk
>  .stgit-*
> diff --git a/configure b/configure
> index 38ee257701..c195630771 100755
> --- a/configure
> +++ b/configure
> @@ -302,6 +302,9 @@ else
>objcc="${objcc-${cross_prefix}clang}"
>  fi
>  
> +with_rust="auto"
> +with_rust_target_triple=""
> +
>  ar="${AR-${cross_prefix}ar}"
>  as="${AS-${cross_prefix}as}"
>  ccas="${CCAS-$cc}"
> @@ -760,6 +763,12 @@ for opt do
>;;
>--gdb=*) gdb_bin="$optarg"
>;;
> +  --enable-rust) with_rust=enabled
> +  ;;
> +  --disable-rust) with_rust=disabled
> +  ;;
> +  --rust-target-triple=*) with_rust_target_triple="$optarg"
> +  ;;
># everything else has the same name in configure and meson
>--*) meson_option_parse "$opt" "$optarg"
>;;
> @@ -1796,6 +1805,9 @@ if test "$skip_meson" = no; then
>test -n "${LIB_FUZZING_ENGINE+xxx}" && meson_option_add 
> "-Dfuzzing_engine=$LIB_FUZZING_ENGINE"
>test "$plugins" = yes && meson_option_add "-Dplugins=true"
>test "$tcg" != enabled && meson_option_add "-Dtcg=$tcg"
> +  test "$with_rust" != enabled && meson_option_add "-Dwith_rust=$with_rust"
> +  test "$with_rust" != enabled && meson_option_add "-Dwith_rust=$with_rust"

Duplicate line.

> +  test "$with_rust_target_triple" != "" && meson_option_add 
> "-Dwith_rust_target_triple=$with_rust_target_triple"
>run_meson() {
>  NINJA=$ninja $meson setup "$@" "$PWD" "$source_path"
>}
> diff --git a/meson.build b/meson.build
> index a9de71d450..3533889852 100644
> --- a/meson.build
> +++ b/meson.build
> @@ -290,6 +290,12 @@ foreach lang : all_languages
>endif
>  endforeach
>  
> +cargo = not_found
> +if get_option('with_rust').allowed()
> +  cargo = find_program('cargo', required: get_option('with_rust'))
> +endif
> +with_rust = cargo.found()
> +
>  # default flags for all hosts
>  # We use -fwrapv to tell the compiler that we require a C dialect where
>  # left shift of signed integers is well defined and has the expected
> @@ -2066,6 +2072,7 @@ endif
>  
>  config_host_data = configuration_data()
>  
> +config_host_data.set('CONFIG_WITH_RUST', with_rust)
>  audio_drivers_selected = []
>  if have_system
>audio_drivers_available = {
> @@ -4190,6 +4197,10 @@ if 'objc' in all_languages
>  else
>summary_info += {'Objective-C compiler': false}
>  endif
> +summary_info += {'Rust support':  with_rust}
> +if with_rust and get_option('with_rust_target_triple') != ''
> +  summary_info += {'Rust target': get_option('with_rust_target_triple')}
> +endif
>  option_cflags = (get_option('debug') ? ['-g'] : [])
>  if get_option('optimization') != 'plain'
>option_cflags += ['-O' + get_option('optimization')]
> diff --git a/meson_options.txt b/meson_options.txt
> index 4c1583eb40..223491b731 100644
> --- a/meson_options.txt
> +++ b/meson_options.txt
> @@ -366,3 +366,7 @@ option('qemu_ga_version', type: 'string', value: '',
>  
>  option('hexagon_idef_parser', type : 'boolean', value : true,
> description: 'use idef-parser to automatically generate TCG code for 
> the Hexagon frontend')
> +option('with_rust', type: 'feature', value: 'auto',
> +   description: 'Enable Rust support')
> +option('with_rust_target_triple', type : 'string', value: '',
> +   description: 'Rust target triple')
> diff --git a/scripts/cargo_wrapper.py b/scripts/cargo_wrapper.py
> new file mode 100644
> index 00..d338effdaa
> --- /dev/null
> +++ b/scripts/cargo_wrapper.py
> @@ -0,0 +1,211 @@
> +#!/usr/bin/env python3
> +# Copyright (c) 2020 Red Hat, Inc.
> +# Copyright (c) 2023 Linaro Ltd.
> +#
> +# Authors:
> +#  Manos Pitsidianakis 
> +#  Marc-André Lureau 
> +#
> +# This work is licensed under the terms of the GNU GPL, version 2 or
> +# later.  See the COPYING file in the top-level directory.
> +
> +import argparse
> +import configparser
> +import distutils.file_util

Re: [RFC PATCH] migration/savevm: do not schedule snapshot_save_job_bh in qemu_aio_context

2024-06-11 Thread Stefan Hajnoczi
On Tue, Jun 11, 2024 at 02:08:49PM +0200, Fiona Ebner wrote:
> Am 06.06.24 um 20:36 schrieb Stefan Hajnoczi:
> > On Wed, Jun 05, 2024 at 02:08:48PM +0200, Fiona Ebner wrote:
> >> The fact that the snapshot_save_job_bh() is scheduled in the main
> >> loop's qemu_aio_context AioContext means that it might get executed
> >> during a vCPU thread's aio_poll(). But saving of the VM state cannot
> >> happen while the guest or devices are active and can lead to assertion
> >> failures. See issue #2111 for two examples. Avoid the problem by
> >> scheduling the snapshot_save_job_bh() in the iohandler AioContext,
> >> which is not polled by vCPU threads.
> >>
> >> Solves Issue #2111.
> >>
> >> This change also solves the following issue:
> >>
> >> Since commit effd60c878 ("monitor: only run coroutine commands in
> >> qemu_aio_context"), the 'snapshot-save' QMP call would not respond
> >> right after starting the job anymore, but only after the job finished,
> >> which can take a long time. The reason is, because after commit
> >> effd60c878, do_qmp_dispatch_bh() runs in the iohandler AioContext.
> >> When do_qmp_dispatch_bh() wakes the qmp_dispatch() coroutine, the
> >> coroutine cannot be entered immediately anymore, but needs to be
> >> scheduled to the main loop's qemu_aio_context AioContext. But
> >> snapshot_save_job_bh() was scheduled first to the same AioContext and
> >> thus gets executed first.
> >>
> >> Buglink: https://gitlab.com/qemu-project/qemu/-/issues/2111
> >> Signed-off-by: Fiona Ebner 
> >> ---
> >>
> >> While initial smoke testing seems fine, I'm not familiar enough with
> >> this to rule out any pitfalls with the approach. Any reason why
> >> scheduling to the iohandler AioContext could be wrong here?
> > 
> > If something waits for a BlockJob to finish using aio_poll() from
> > qemu_aio_context then a deadlock is possible since the iohandler_ctx
> > won't get a chance to execute. The only suspicious code path I found was
> > job_completed_txn_abort_locked() -> job_finish_sync_locked() but I'm not
> > sure whether it triggers this scenario. Please check that code path.
> > 
> 
> Sorry, I don't understand. Isn't executing the scheduled BH the only
> additional progress that the iohandler_ctx needs to make compared to
> before the patch? How exactly would that cause issues when waiting for a
> BlockJob?
> 
> Or do you mean something waiting for the SnapshotJob from
> qemu_aio_context before snapshot_save_job_bh had the chance to run?

Yes, exactly. job_finish_sync_locked() will hang since iohandler_ctx has
no chance to execute. But I haven't audited the code to understand
whether this can happen.

Stefan
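
The hang described above can be shown with a toy model. This Python sketch
is an assumption-laden simplification, not QEMU code: Ctx stands in for an
AioContext with a queue of bottom halves, and wait_for() stands in for a
caller that polls exactly one context while waiting for a job to finish.

```python
from collections import deque


class Ctx:
    """Toy AioContext: a named queue of bottom halves (BHs)."""

    def __init__(self, name):
        self.name = name
        self.bhs = deque()

    def schedule_bh(self, fn):
        self.bhs.append(fn)

    def poll(self):
        """Run all pending BHs of this context only."""
        while self.bhs:
            self.bhs.popleft()()


def wait_for(done, ctx, max_polls=3):
    """Poll a single context until done() is true or we give up.

    The real code would loop forever; max_polls only keeps the model finite.
    """
    for _ in range(max_polls):
        if done():
            return True
        ctx.poll()
    return done()
```

If the BH that completes the job sits in the iohandler context and the waiter
polls only the main context, wait_for() never sees progress. Whether any real
code path waits this way is exactly what the audit would have to establish.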




Re: [PATCH 08/20] qga: conditionalize schema for commands unsupported on Windows

2024-06-11 Thread Daniel P . Berrangé
On Tue, Jun 11, 2024 at 03:55:37PM +0200, Markus Armbruster wrote:
> Daniel P. Berrangé  writes:
> 
> > Rather than creating stubs for every command that just return
> > QERR_UNSUPPORTED, use 'if' conditions in the QAPI schema to
> > fully exclude generation of the commands on Windows.
> >
> > The command will be rejected at QMP dispatch time instead,
> > avoiding reimplementing rejection by blocking the stub commands.
> >
> > This fixes inconsistency where some commands are implemented
> > as stubs, yet not added to the blockedrpc list.
> >
> > This has the additional benefit that the QGA protocol reference
> > now documents what conditions enable use of the command.
> >
> > Signed-off-by: Daniel P. Berrangé 
> > ---
> >  qga/commands-win32.c | 56 +---
> >  qga/qapi-schema.json | 45 +++
> >  2 files changed, 31 insertions(+), 70 deletions(-)
> >
> > diff --git a/qga/commands-win32.c b/qga/commands-win32.c
> > index 9fe670d5b4..2533e4c748 100644
> > --- a/qga/commands-win32.c
> > +++ b/qga/commands-win32.c
> 
> [...]
> 
> >  /* add unsupported commands to the list of blocked RPCs */
> >  GList *ga_command_init_blockedrpcs(GList *blockedrpcs)
> >  {
> > -const char *list_unsupported[] = {
> > -"guest-suspend-hybrid",
> > -"guest-set-vcpus",
> > -"guest-get-memory-blocks", "guest-set-memory-blocks",
> > -"guest-get-memory-block-info",
> > -NULL};
> > -char **p = (char **)list_unsupported;
> > -
> > -while (*p) {
> > -blockedrpcs = g_list_append(blockedrpcs, g_strdup(*p++));
> > -}
> > -
> >  if (!vss_init(true)) {
> >  g_debug("vss_init failed, vss commands are going to be disabled");
> >  const char *list[] = {
> >  "guest-get-fsinfo", "guest-fsfreeze-status",
> >  "guest-fsfreeze-freeze", "guest-fsfreeze-thaw", NULL};
> > -p = (char **)list;
> > +char **p = (char **)list;
> >  
> >  while (*p) {
> >  blockedrpcs = g_list_append(blockedrpcs, g_strdup(*p++));
>}
>}
> 
>return blockedrpcs;
>}
> 
> Four commands get disabled when vss_init() fails, i.e. when qga-vss.dll
> can't be loaded and initialized.
> 
> Three of the four commands do this first:
> 
> if (!vss_initialized()) {
> error_setg(errp, QERR_UNSUPPORTED);
> return 0;
> }
> 
> The exception is qmp_guest_get_fsinfo().
> 
> vss_initialized() returns true between successful vss_init() and
> vss_deinit().
> 
> Aside: we call vss_init() in three places.  Two of them init, call
> something, then deinit.  Weird.  Moving on.
> 
> If these commands are meant to be only available when the DLL is, then
> having them check vss_initialized() is useless.
> 
> Conversely, if the check isn't useless, then the "make it available
> only" business is.
> 
> Opportunity for further cleanup?

If we eliminate the "make it available" check in ga_command_init_blockedrpcs,
that would be a nice cleanup IMHO, as these few commands are the only
special case where that's needed now.

With regards,
Daniel




Re: [PATCH 08/20] qga: conditionalize schema for commands unsupported on Windows

2024-06-11 Thread Markus Armbruster
Daniel P. Berrangé  writes:

> Rather than creating stubs for every command that just return
> QERR_UNSUPPORTED, use 'if' conditions in the QAPI schema to
> fully exclude generation of the commands on Windows.
>
> The command will be rejected at QMP dispatch time instead,
> avoiding reimplementing rejection by blocking the stub commands.
>
> This fixes inconsistency where some commands are implemented
> as stubs, yet not added to the blockedrpc list.
>
> This has the additional benefit that the QGA protocol reference
> now documents what conditions enable use of the command.
>
> Signed-off-by: Daniel P. Berrangé 
> ---
>  qga/commands-win32.c | 56 +---
>  qga/qapi-schema.json | 45 +++
>  2 files changed, 31 insertions(+), 70 deletions(-)
>
> diff --git a/qga/commands-win32.c b/qga/commands-win32.c
> index 9fe670d5b4..2533e4c748 100644
> --- a/qga/commands-win32.c
> +++ b/qga/commands-win32.c

[...]

>  /* add unsupported commands to the list of blocked RPCs */
>  GList *ga_command_init_blockedrpcs(GList *blockedrpcs)
>  {
> -const char *list_unsupported[] = {
> -"guest-suspend-hybrid",
> -"guest-set-vcpus",
> -"guest-get-memory-blocks", "guest-set-memory-blocks",
> -"guest-get-memory-block-info",
> -NULL};
> -char **p = (char **)list_unsupported;
> -
> -while (*p) {
> -blockedrpcs = g_list_append(blockedrpcs, g_strdup(*p++));
> -}
> -
>  if (!vss_init(true)) {
>  g_debug("vss_init failed, vss commands are going to be disabled");
>  const char *list[] = {
>  "guest-get-fsinfo", "guest-fsfreeze-status",
>  "guest-fsfreeze-freeze", "guest-fsfreeze-thaw", NULL};
> -p = (char **)list;
> +char **p = (char **)list;
>  
>  while (*p) {
>  blockedrpcs = g_list_append(blockedrpcs, g_strdup(*p++));
   }
   }

   return blockedrpcs;
   }

Four commands get disabled when vss_init() fails, i.e. when qga-vss.dll
can't be loaded and initialized.

Three of the four commands do this first:

if (!vss_initialized()) {
error_setg(errp, QERR_UNSUPPORTED);
return 0;
}

The exception is qmp_guest_get_fsinfo().

vss_initialized() returns true between successful vss_init() and
vss_deinit().

Aside: we call vss_init() in three places.  Two of them init, call
something, then deinit.  Weird.  Moving on.

If these commands are meant to be only available when the DLL is, then
having them check vss_initialized() is useless.

Conversely, if the check isn't useless, then the "make it available
only" business is.

Opportunity for further cleanup?

[...]
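
The redundancy pointed out above can be made concrete with a small model.
This is a Python sketch under simplifying assumptions, not the qga code: it
treats "vss_init() succeeded" and "vss_initialized() is true" as the same
condition, which ignores the init/deinit call sites mentioned in the aside.
The function names init_blockedrpcs() and dispatch() are hypothetical.

```python
VSS_COMMANDS = [
    "guest-get-fsinfo",
    "guest-fsfreeze-status",
    "guest-fsfreeze-freeze",
    "guest-fsfreeze-thaw",
]


def init_blockedrpcs(blocked, vss_ok):
    """Model of ga_command_init_blockedrpcs(): block VSS commands if init failed."""
    if not vss_ok:
        blocked = blocked + VSS_COMMANDS
    return blocked


def dispatch(cmd, blocked, vss_initialized):
    """Model of command dispatch followed by the per-command run-time check."""
    if cmd in blocked:
        return "blocked"
    # Three of the four commands check vss_initialized() themselves;
    # guest-get-fsinfo is the exception.
    if cmd in VSS_COMMANDS and cmd != "guest-get-fsinfo" and not vss_initialized:
        return "unsupported"
    return "ok"
```

In this model, whenever the commands are blocked the per-command check is
never reached, and whenever they are not blocked the check passes, so one of
the two mechanisms is redundant.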




Re: [RFC PATCH v1 0/6] Implement ARM PL011 in Rust

2024-06-11 Thread Paolo Bonzini
On Tue, Jun 11, 2024 at 10:22 AM Daniel P. Berrangé  wrote:
>
> On Mon, Jun 10, 2024 at 09:22:35PM +0300, Manos Pitsidianakis wrote:
> > Hello everyone,
> >
> > This is an early draft of my work on implementing a very simple device,
> > in this case the ARM PL011 (which in C code resides in hw/char/pl011.c
> > and is used in hw/arm/virt.c).
>
> looking at the diffstat:
>
> >  .gitignore |   2 +
> >  .gitlab-ci.d/buildtest.yml |  64 ++--
> >  configure  |  12 +
> >  hw/arm/virt.c  |   2 +-
> >  meson.build|  99 ++
> >  meson_options.txt  |   4 +
> >  rust/meson.build   |  93 ++
> >  rust/pl011/.cargo/config.toml  |   2 +
> >  rust/pl011/.gitignore  |   2 +
> >  rust/pl011/Cargo.lock  | 120 +++
> >  rust/pl011/Cargo.toml  |  26 ++
> >  rust/pl011/README.md   |  42 +++
> >  rust/pl011/build.rs|  44 +++
> >  rust/pl011/meson.build |   7 +
> >  rust/pl011/rustfmt.toml|  10 +
> >  rust/pl011/src/definitions.rs  |  95 ++
> >  rust/pl011/src/device.rs   | 531 ++
> >  rust/pl011/src/device_class.rs |  95 ++
> >  rust/pl011/src/generated.rs|   5 +
> >  rust/pl011/src/lib.rs  | 575 +
> >  rust/pl011/src/memory_ops.rs   |  38 +++
>
> My thought is that if we're going to start implementing devices
> or other parts of QEMU, in Rust, then I do not want to see it
> placed in a completely separate directory sub-tree.
>
> In this example, I would expect to have hw/arm/pl011.rs, or hw/arm/pl011/*.rs
> so that the device is part of the normal Arm hardware directory structure
> and maintainer assignments.

I think that's incompatible with the layout that Cargo expects.
rust/hw/arm/pl011/ could be another possibility.

Paolo




Re: [RFC PATCH v1 0/6] Implement ARM PL011 in Rust

2024-06-11 Thread Paolo Bonzini
On Mon, Jun 10, 2024 at 10:47 PM Stefan Hajnoczi  wrote:
> On Mon, 10 Jun 2024 at 16:27, Manos Pitsidianakis
>  wrote:
> >
> > On Mon, 10 Jun 2024 22:59, Stefan Hajnoczi  wrote:
> > >> What are the issues with not using the compiler, rustc, directly?
> > >> -
> > >> [whataretheissueswith] Back to [TOC]
> > >>
> > >> 1. Tooling
> > >>Mostly writing up the build-sys tooling to do so. Ideally we'd
> > >>compile everything without cargo but rustc directly.
> > >
> > >Why would that be ideal?
> >
> > It removes the indirection level of meson<->cargo<->rustc. I don't have a
> > concrete idea on how to tackle this, but if cargo ends up not strictly
> > necessary, I don't see why we cannot use one build system.
>
> The convenience of being able to use cargo dependencies without
> special QEMU meson build system effort seems worth the overhead of
> meson<->cargo<->rustc to me. There is a blog post that explores using
> cargo crates using meson's wrap dependencies here, and it seems like
> extra work:
> https://coaxion.net/blog/2023/04/building-a-gstreamer-plugin-in-rust-with-meson-instead-of-cargo/

The worst part of using cargo from meson (like in libblkio) is the
lack of integration with Rust tests, but otherwise it's a much better
experience. IIUC Meson's cargo subprojects do not support build.rs,
which is a problem if one of your dependencies (for example libc)
needs it.

https://mesonbuild.com/Wrap-dependency-system-manual.html#cargo-wraps

On the other hand, I think it's possible, possibly even clearer, to
invoke bindgen from meson. I would prefer to have many small .rs files
produced by bindgen, to be imported via "use", and I would prefer an
allowlist approach that excludes symbols from system headers.

> > >I guess there will be interest in using rust-vmm crates in some way.

Yes, especially the ByteValued, VolatileMemory and Bytes traits.

One complication in that respect is that anything that does DMA
depends on either RCU or the BQL. We could, at least at the beginning,
introduce a dummy guard that simply enforces at run-time that the BQL
is taken.

Paolo
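
As a rough illustration of the "dummy guard" idea, here is a Python sketch.
It is not a proposal for the actual Rust API: BqlModel, BqlGuard and
dma_read() are hypothetical names, and the real BQL is of course not a
Python lock. The guard takes no lock itself; it only checks at run time
that the caller already holds it.

```python
import threading


class BqlModel:
    """Toy stand-in for the Big QEMU Lock that remembers its owner thread."""

    def __init__(self):
        self._lock = threading.Lock()
        self._owner = None

    def __enter__(self):
        self._lock.acquire()
        self._owner = threading.get_ident()
        return self

    def __exit__(self, *exc):
        self._owner = None
        self._lock.release()
        return False

    def held_by_current_thread(self):
        return self._owner == threading.get_ident()


class BqlGuard:
    """Dummy guard: enforces at run time that the BQL is taken."""

    def __init__(self, bql):
        if not bql.held_by_current_thread():
            raise RuntimeError("BQL must be held for DMA access")
        self.bql = bql


def dma_read(bql, memory, addr, length):
    """Hypothetical DMA helper that refuses to run without the lock."""
    BqlGuard(bql)
    return bytes(memory[addr:addr + length])
```

A later iteration could replace the run-time check with a type-level proof
that the lock is held, without changing the callers.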




[PATCH v2] qapi: clarify that the default is backend dependent

2024-06-11 Thread Stefano Garzarella
The default value of the @share option of the @MemoryBackendProperties
really depends on the backend type, so let's document the default
values in the same place where we define the option to avoid
dispersing the information.

Cc: David Hildenbrand 
Suggested-by: Markus Armbruster 
Signed-off-by: Stefano Garzarella 
---
v2:
- documented @share's default right where it's defined [Markus]

v1: https://patchew.org/QEMU/20240523133302.103858-1-sgarz...@redhat.com/
---
 qapi/qom.json | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/qapi/qom.json b/qapi/qom.json
index 8bd299265e..9b8f6a7ab5 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -600,7 +600,9 @@
 # preallocation threads (default: none) (since 7.2)
 #
 # @share: if false, the memory is private to QEMU; if true, it is
-# shared (default: false)
+# shared (default false for backends memory-backend-file and
+# memory-backend-ram, true for backends memory-backend-epc and
+# memory-backend-memfd)
 #
 # @reserve: if true, reserve swap space (or huge pages) if applicable
 # (default: true) (since 6.1)
@@ -700,8 +702,6 @@
 #
 # Properties for memory-backend-memfd objects.
 #
-# The @share boolean option is true by default with memfd.
-#
 # @hugetlb: if true, the file to be created resides in the hugetlbfs
 # filesystem (default: false)
 #
@@ -726,8 +726,6 @@
 #
 # Properties for memory-backend-epc objects.
 #
-# The @share boolean option is true by default with epc
-#
 # The @merge boolean option is false by default with epc
 #
 # The @dump boolean option is false by default with epc
-- 
2.45.2
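
For readers who want the documented defaults in one place, they can be
written down as a lookup table. This Python sketch only restates the doc
comment added by the patch; effective_share() is a hypothetical helper and
not part of QEMU.

```python
# Defaults of @share per backend type, as documented in qapi/qom.json
# by this patch.
SHARE_DEFAULTS = {
    "memory-backend-file": False,
    "memory-backend-ram": False,
    "memory-backend-epc": True,
    "memory-backend-memfd": True,
}


def effective_share(backend, share=None):
    """Return the value of @share, falling back to the backend's default."""
    if share is not None:
        return share
    return SHARE_DEFAULTS[backend]
```

An explicit share=false on memory-backend-memfd still wins over the default;
only an omitted option falls back to the table.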




Re: [RFC PATCH v1 0/6] Implement ARM PL011 in Rust

2024-06-11 Thread Daniel P . Berrangé
On Tue, Jun 11, 2024 at 01:51:13PM +0100, Alex Bennée wrote:
> Daniel P. Berrangé  writes:
> 
> > On Tue, Jun 11, 2024 at 01:58:10PM +0300, Manos Pitsidianakis wrote:
> >> On Tue, 11 Jun 2024 13:57, "Daniel P. Berrangé" wrote:
> >> > On Mon, Jun 10, 2024 at 11:29:36PM +0300, Manos Pitsidianakis wrote:
> >> > > On Mon, 10 Jun 2024 22:37, Pierrick Bouvier wrote:
> >> > > > Hello Manos,
> >> > > >
> >> > > > On 6/10/24 11:22, Manos Pitsidianakis wrote:
> >> > > > > Hello everyone,
> >> > > > >
> >> > > > > This is an early draft of my work on implementing a very simple device,
> >> > > > > in this case the ARM PL011 (which in C code resides in hw/char/pl011.c
> >> > > > > and is used in hw/arm/virt.c).
> >> > > > >
> >> > > > > The device is functional, with copied logic from the C code but with
> >> > > > > effort not to make a direct C to Rust translation. In other words, do
> >> > > > > not write Rust as a C developer would.
> >> > > > >
> >> > > > > That goal is not complete but a best-effort case. To give a specific
> >> > > > > example, register values are typed but interrupt bit flags are not (but
> >> > > > > could be). I will leave such minutiae for later iterations.
> >> > 
> >> > snip
> >> > 
> >> > > > Maybe it could be better if build.rs file was *not* needed for new
> >> > > > devices/folders, and could be abstracted as a detail of the python
> >> > > > wrapper script instead of something that should be committed.
> >> > > 
> >> > > 
> >> > > That'd mean you cannot work on the rust files with a LanguageServer, you
> >> > > cannot run cargo build or cargo check or cargo clippy, etc. That's why I
> >> > > left the alternative choice of including a manually generated bindings file
> >> > > (generated.rs.inc)
> >> > 
> >> > I would not expect QEMU developers to be running 'cargo '
> >> > directly at all.
> >> > 
> >> > QEMU's build system is 'meson' + 'ninja' with a 'configure' + 'make'
> >> > convenience facade.
> >> > 
> >> > Any use of 'cargo' would be an internal impl detail of meson rules
> >> > for building rust code, and developers should still exclusively work
> >> > with 'make' or 'ninja' to run builds & tests.
> >> 
> >> No, that's not true. If I wrote the pl011 device with this workflow I'd just
> >> waste time using meson. Part of the development is making sure the library
> >> type checks, compiles, using cargo to run style formatting, to check for
> >> lints, perhaps run tests. Doing this only through meson is an unnecessary
> >> complication.
> >
> > I don't see why it should waste time, when we ultimately end up calling
> > the same underlying tools. We need to have a consistent experience for
> > developers working on QEMU, not have to use different tools for different
> > parts of QEMU depending on whether a piece of code happens to be rust
> > or C.
> 
> For example if I wanted to run rust-based unit tests (which I think
> potentially offer an easier solution than qtest) I would expect that to
> be done from the normal make/ninja targets.

Meson provides a nice "suite" concept to facilitate selection of a
subset of tests.

eg, to limit to running just 'rust' unit tests, I might expect we
should have

  meson test --suite rustunit

and have this invoked by 'make check-rustunit' 

Similar can be done for clippy, or other types of rust tests

With regards,
Daniel




Re: [RFC PATCH v1 0/6] Implement ARM PL011 in Rust

2024-06-11 Thread Manos Pitsidianakis

Hello Antonio!

On Tue, 11 Jun 2024 15:45, Antonio Caggiano  wrote:

Hi there :)

On 11/06/2024 12:58, Manos Pitsidianakis wrote:
On Tue, 11 Jun 2024 13:57, "Daniel P. Berrangé" wrote:

On Mon, Jun 10, 2024 at 11:29:36PM +0300, Manos Pitsidianakis wrote:
On Mon, 10 Jun 2024 22:37, Pierrick Bouvier wrote:

> Hello Manos,
>
> On 6/10/24 11:22, Manos Pitsidianakis wrote:
> > Hello everyone,
> >
> > This is an early draft of my work on implementing a very simple device,
> > in this case the ARM PL011 (which in C code resides in hw/char/pl011.c
> > and is used in hw/arm/virt.c).
> >
> > The device is functional, with copied logic from the C code but with
> > effort not to make a direct C to Rust translation. In other words, do
> > not write Rust as a C developer would.
> >
> > That goal is not complete but a best-effort case. To give a specific
> > example, register values are typed but interrupt bit flags are not (but
> > could be). I will leave such minutiae for later iterations.


snip


> Maybe it could be better if build.rs file was *not* needed for new
> devices/folders, and could be abstracted as a detail of the python
> wrapper script instead of something that should be committed.


That'd mean you cannot work on the rust files with a LanguageServer, you
cannot run cargo build or cargo check or cargo clippy, etc. That's why I
left the alternative choice of including a manually generated bindings
file (generated.rs.inc)


I would not expect QEMU developers to be running 'cargo '
directly at all.

QEMU's build system is 'meson' + 'ninja' with a 'configure' + 'make'
convenience facade.

Any use of 'cargo' would be an internal impl detail of meson rules
for building rust code, and developers should still exclusively work
with 'make' or 'ninja' to run builds & tests.


No, that's not true. If I wrote the pl011 device with this workflow I'd 
just waste time using meson. Part of the development is making sure the 
library type checks, compiles, using cargo to run style formatting, to 
check for lints, perhaps run tests. Doing this only through meson is an 
unnecessary complication.




My favorite tool for Rust development is rust-analyzer, which works very 
well with cargo-based projects. Making it work with meson is just a 
matter of pointing rust-analyzer to the rust-project.json file generated 
by meson at configuration time (just like compile_commands.json).


That's only generated for meson rust targets, whereas we are currently 
compiling with a cargo wrapper script.




Unfortunately, rust-analyzer also relies on cargo for doing its check. I 
was able to override that with ninja, but it requires `meson setup` with 
`RUSTFLAGS="--emit=metadata --error-format=json"`. That makes 
rust-analyzer happy, but compilation output is not readable anymore 
being json-like.


I ended up working with 2 build folders, one for me, one for 
rust-analyzer. So, yeah, it complicates things a bit.



To compile and run QEMU with a rust component, sure, you'd use meson.



Cheers,
Antonio




Re: [RFC PATCH v1 0/6] Implement ARM PL011 in Rust

2024-06-11 Thread Alex Bennée
Daniel P. Berrangé  writes:

> On Tue, Jun 11, 2024 at 01:58:10PM +0300, Manos Pitsidianakis wrote:
>> On Tue, 11 Jun 2024 13:57, "Daniel P. Berrangé"  wrote:
>> > On Mon, Jun 10, 2024 at 11:29:36PM +0300, Manos Pitsidianakis wrote:
>> > > On Mon, 10 Jun 2024 22:37, Pierrick Bouvier 
>> > >  wrote:
>> > > > Hello Manos,
>> > > >
>> > > > On 6/10/24 11:22, Manos Pitsidianakis wrote:
>> > > > > Hello everyone,
>> > > > >
>> > > > > This is an early draft of my work on implementing a very simple device,
>> > > > > in this case the ARM PL011 (which in C code resides in hw/char/pl011.c
>> > > > > and is used in hw/arm/virt.c).
>> > > > >
>> > > > > The device is functional, with copied logic from the C code but with
>> > > > > effort not to make a direct C to Rust translation. In other words, do
>> > > > > not write Rust as a C developer would.
>> > > > >
>> > > > > That goal is not complete but a best-effort case. To give a specific
>> > > > > example, register values are typed but interrupt bit flags are not (but
>> > > > > could be). I will leave such minutiae for later iterations.
>> > 
>> > snip
>> > 
>> > > > Maybe it could be better if build.rs file was *not* needed for new
>> > > > devices/folders, and could be abstracted as a detail of the python
>> > > > wrapper script instead of something that should be committed.
>> > > 
>> > > 
>> > > That'd mean you cannot work on the rust files with a LanguageServer, you
>> > > cannot run cargo build or cargo check or cargo clippy, etc. That's why I
>> > > left the alternative choice of including a manually generated bindings
>> > > file (generated.rs.inc)
>> > 
>> > I would not expect QEMU developers to be running 'cargo '
>> > directly at all.
>> > 
>> > QEMU's build system is 'meson' + 'ninja' with a 'configure' + 'make'
>> > convenience facade.
>> > 
>> > Any use of 'cargo' would be an internal impl detail of meson rules
>> > for building rust code, and developers should still exclusively work
>> > with 'make' or 'ninja' to run builds & tests.
>> 
>> No, that's not true. If I wrote the pl011 device with this workflow I'd just
>> waste time using meson. Part of the development is making sure the library
>> type checks, compiles, using cargo to run style formatting, to check for
>> lints, perhaps run tests. Doing this only through meson is an unnecessary
>> complication.
>
> I don't see why it should waste time, when we ultimately end up calling
> > the same underlying tools. We need to have a consistent experience for
> developers working on QEMU, not have to use different tools for different
> parts of QEMU depending on whether a piece of code happens to be rust
> or C.

For example if I wanted to run rust-based unit tests (which I think
potentially offer an easier solution than qtest) I would expect that to
be done from the normal make/ninja targets.

>
>> To compile and run QEMU with a rust component, sure, you'd use meson.
>
> With regards,
> Daniel

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro



Re: [RFC PATCH v1 0/6] Implement ARM PL011 in Rust

2024-06-11 Thread Antonio Caggiano

Hi there :)

On 11/06/2024 12:58, Manos Pitsidianakis wrote:
On Tue, 11 Jun 2024 13:57, "Daniel P. Berrangé"  
wrote:

On Mon, Jun 10, 2024 at 11:29:36PM +0300, Manos Pitsidianakis wrote:
On Mon, 10 Jun 2024 22:37, Pierrick Bouvier 
 wrote:

> Hello Manos,
>
> On 6/10/24 11:22, Manos Pitsidianakis wrote:
> > Hello everyone,
> >
> > This is an early draft of my work on implementing a very simple device,
> > in this case the ARM PL011 (which in C code resides in hw/char/pl011.c
> > and is used in hw/arm/virt.c).
> >
> > The device is functional, with copied logic from the C code but with
> > effort not to make a direct C to Rust translation. In other words, do
> > not write Rust as a C developer would.
> >
> > That goal is not complete but a best-effort case. To give a specific
> > example, register values are typed but interrupt bit flags are not (but
> > could be). I will leave such minutiae for later iterations.


snip


> Maybe it could be better if build.rs file was *not* needed for new
> devices/folders, and could be abstracted as a detail of the python
> wrapper script instead of something that should be committed.


That'd mean you cannot work on the rust files with a LanguageServer, you
cannot run cargo build or cargo check or cargo clippy, etc. That's why I
left the alternative choice of including a manually generated bindings
file (generated.rs.inc)


I would not expect QEMU developers to be running 'cargo '
directly at all.

QEMU's build system is 'meson' + 'ninja' with a 'configure' + 'make'
convenience facade.

Any use of 'cargo' would be an internal impl detail of meson rules
for building rust code, and developers should still exclusively work
with 'make' or 'ninja' to run builds & tests.


No, that's not true. If I wrote the pl011 device with this workflow I'd 
just waste time using meson. Part of the development is making sure the 
library type checks, compiles, using cargo to run style formatting, to 
check for lints, perhaps run tests. Doing this only through meson is an 
unnecessary complication.




My favorite tool for Rust development is rust-analyzer, which works very 
well with cargo-based projects. Making it work with meson is just a 
matter of pointing rust-analyzer to the rust-project.json file generated 
by meson at configuration time (just like compile_commands.json).


Unfortunately, rust-analyzer also relies on cargo for doing its check. I 
was able to override that with ninja, but it requires `meson setup` with 
`RUSTFLAGS="--emit=metadata --error-format=json"`. That makes 
rust-analyzer happy, but compilation output is not readable anymore 
being json-like.


I ended up working with 2 build folders, one for me, one for 
rust-analyzer. So, yeah, it complicates things a bit.



To compile and run QEMU with a rust component, sure, you'd use meson.



Cheers,
Antonio



Re: [PATCH] hw/net/virtio-net.c: fix crash in iov_copy()

2024-06-11 Thread Дмитрий Фролов

ping

https://patchew.org/QEMU/20240527133140.218300-2-fro...@swemel.ru/

On 27.05.2024 16:31, Dmitry Frolov wrote:

A crash found while fuzzing device virtio-net-socket-check-used.
Assertion "offset == 0" in iov_copy() fails if less than guest_hdr_len bytes
were transmitted.

Signed-off-by: Dmitry Frolov 
---
  hw/net/virtio-net.c | 6 ++
  1 file changed, 6 insertions(+)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 24e5e7d347..603b80a50a 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -2783,6 +2783,12 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
   */
  assert(n->host_hdr_len <= n->guest_hdr_len);
  if (n->host_hdr_len != n->guest_hdr_len) {
+if (iov_size(out_sg, out_num) < n->guest_hdr_len) {
+virtio_error(vdev, "virtio-net header is invalid");
+virtqueue_detach_element(q->tx_vq, elem, 0);
+g_free(elem);
+return -EINVAL;
+}
  unsigned sg_num = iov_copy(sg, ARRAY_SIZE(sg),
 out_sg, out_num,
 0, n->host_hdr_len);
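
To make the guard in the hunk above easier to follow, here is a minimal,
standalone sketch of the same idea: add up the lengths of the scatter/gather
entries and refuse to continue when the total is shorter than the guest
header. The `sketch_*` names are made up for illustration and are not QEMU
functions; the patch itself uses QEMU's `iov_size()`, and this sketch only
assumes that helper returns the total byte count of the vector.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>   /* struct iovec */

/*
 * Hypothetical stand-in for iov_size(): total number of bytes described
 * by the first iov_cnt entries of the vector.
 */
static size_t sketch_iov_size(const struct iovec *iov, unsigned int iov_cnt)
{
    size_t len = 0;

    assert(iov != NULL || iov_cnt == 0);
    for (unsigned int i = 0; i < iov_cnt; i++) {
        len += iov[i].iov_len;
    }
    return len;
}

/*
 * Returns 1 when the guest supplied at least guest_hdr_len bytes,
 * 0 when the header is truncated and the element should be rejected.
 */
static int sketch_header_is_complete(const struct iovec *out_sg,
                                     unsigned int out_num,
                                     size_t guest_hdr_len)
{
    return sketch_iov_size(out_sg, out_num) >= guest_hdr_len;
}
```

With a check of this shape in front of the copy, a truncated header is
reported as an error instead of reaching the assertion.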





Re: [PATCH] tests/qtest/fuzz/virtio_net_fuzz.c: fix virtio_net_fuzz_multi

2024-06-11 Thread Дмитрий Фролов

ping

https://patchew.org/QEMU/20240523102813.396750-2-fro...@swemel.ru/

On 23.05.2024 13:28, Dmitry Frolov wrote:

If QTestState was already CLOSED due to error, calling qtest_clock_step()
afterwards makes no sense and only raises false-crash with message:
"assertion timer != NULL failed".

Signed-off-by: Dmitry Frolov 
---
  tests/qtest/fuzz/virtio_net_fuzz.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/tests/qtest/fuzz/virtio_net_fuzz.c b/tests/qtest/fuzz/virtio_net_fuzz.c
index e239875e3b..2f57a8ddd8 100644
--- a/tests/qtest/fuzz/virtio_net_fuzz.c
+++ b/tests/qtest/fuzz/virtio_net_fuzz.c
@@ -81,6 +81,9 @@ static void virtio_net_fuzz_multi(QTestState *s,
  /* Run the main loop */
  qtest_clock_step(s, 100);
  flush_events(s);
+if (!qtest_probe_child(s)) {
+return;
+}
  
  /* Wait on used descriptors */

  if (check_used && !vqa.rx) {





Re: [PATCH v4 2/4] vvfat: Fix usage of `info.file.offset`

2024-06-11 Thread Amjad Alsharafi
On Mon, Jun 10, 2024 at 06:49:43PM +0200, Kevin Wolf wrote:
> Am 05.06.2024 um 02:58 hat Amjad Alsharafi geschrieben:
> > The field is marked as "the offset in the file (in clusters)", but it
> > was being used like this
> > `cluster_size*(nums)+mapping->info.file.offset`, which is incorrect.
> > 
> > Additionally, removed the `abort` when `first_mapping_index` does not
> > match, as this matches the case when adding new clusters for files, and
> > its inevitable that we reach this condition when doing that if the
> > clusters are not after one another, so there is no reason to `abort`
> > here, execution continues and the new clusters are written to disk
> > correctly.
> > 
> > Signed-off-by: Amjad Alsharafi 
> 
> Can you help me understand how first_mapping_index really works?
> 
> It seems to me that you get a chain of mappings for each file on the FAT
> filesystem, which are just the contiguous areas in it, and
> first_mapping_index refers to the mapping at the start of the file. But
> for much of the time, it actually doesn't seem to be set at all, so you
> have mapping->first_mapping_index == -1. Do you understand the rules
> around when it's set and when it isn't?

Yeah. So `first_mapping_index` is the index of the first mapping; each
mapping is a group of clusters that are contiguous in the file.
It's mostly `-1` because the first mapping will have the value set as
`-1` and not its own index. This value will only be set when the file
contains more than one mapping, and this will only happen when you add
clusters to a file that are not contiguous with the existing clusters.

And actually, thanks to that I noticed another bug not fixed in PATCH 3.
We are doing this check
`s->current_mapping->first_mapping_index != mapping->first_mapping_index`
to know if we should switch to the new mapping or not.
If we were reading from the first mapping (`first_mapping_index == -1`)
and we jumped to the second mapping (`first_mapping_index == n`), we
will catch this condition and switch to the new mapping.

But if the file has more than 2 mappings, and we jumped to the 3rd
mapping, we will not catch this since (`first_mapping_index == n`) for
both of them haha. I think a better check is to check the `mapping`
pointer directly. (I'll add it also in the next series together with a
test for it.)
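
A small sketch of the two checks, to show the case that slips through. The
struct below is a simplified, hypothetical stand-in (the real mapping struct
in block/vvfat.c has more fields), and the function names are made up for
illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a vvfat mapping; only the field discussed here. */
typedef struct sketch_mapping {
    /* -1 for the first mapping of a file, otherwise index of that first one */
    int first_mapping_index;
} sketch_mapping;

/* Check as described above: compares first_mapping_index only. */
static int sketch_switch_by_index(const sketch_mapping *current,
                                  const sketch_mapping *next)
{
    return current->first_mapping_index != next->first_mapping_index;
}

/* Alternative mentioned above: compare the mapping pointers directly. */
static int sketch_switch_by_pointer(const sketch_mapping *current,
                                    const sketch_mapping *next)
{
    return current != next;
}
```

For a file with three mappings whose `first_mapping_index` values are
`-1`, `n`, `n`, the index-based check sees the jump from the first mapping
to the second, but not the jump from the second to the third; the pointer
comparison sees both.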

> 
> >  block/vvfat.c | 12 +++-
> >  1 file changed, 7 insertions(+), 5 deletions(-)
> > 
> > diff --git a/block/vvfat.c b/block/vvfat.c
> > index 19da009a5b..f0642ac3e4 100644
> > --- a/block/vvfat.c
> > +++ b/block/vvfat.c
> > @@ -1408,7 +1408,9 @@ read_cluster_directory:
> >  
> >  assert(s->current_fd);
> >  
> > -offset=s->cluster_size*(cluster_num-s->current_mapping->begin)+s->current_mapping->info.file.offset;
> > +offset = s->cluster_size *
> > +((cluster_num - s->current_mapping->begin)
> > ++ s->current_mapping->info.file.offset);
> >  if(lseek(s->current_fd, offset, SEEK_SET)!=offset)
> >  return -3;
> >  s->cluster=s->cluster_buffer;
> > @@ -1929,8 +1931,9 @@ get_cluster_count_for_direntry(BDRVVVFATState* s, direntry_t* direntry, const ch
> >  (mapping->mode & MODE_DIRECTORY) == 0) {
> >  
> >  /* was modified in qcow */
> > -if (offset != mapping->info.file.offset + s->cluster_size
> > -* (cluster_num - mapping->begin)) {
> > +if (offset != s->cluster_size
> > +* ((cluster_num - mapping->begin)
> > ++ mapping->info.file.offset)) {
> >  /* offset of this cluster in file chain has changed */
> >  abort();
> >  copy_it = 1;
> > @@ -1944,7 +1947,6 @@ get_cluster_count_for_direntry(BDRVVVFATState* s, direntry_t* direntry, const ch
> >  
> >  if (mapping->first_mapping_index != first_mapping_index
> >  && mapping->info.file.offset > 0) {
> > -abort();
> >  copy_it = 1;
> >  }
> 
> I'm unsure which case this represents. If first_mapping_index refers to
> the mapping of the first cluster in the file, does this mean we got a
> mapping for a different file here? Or is the comparison between -1 and a
> real value?

Now that I think more about it, I think this `abort` is actually
correct; the issue, though, is that the handling around this code is not.

What this `abort` actually does is check:
- if the `mapping->first_mapping_index` is not the same as
  `first_mapping_index`, which **should** happen only in one case, when
  we are handling the first mapping, in that case
  `mapping->first_mapping_index == -1`, in all other cases, the other
  mappings after the first should have the condition true.
- From above, we know that this is the first mapping, so if 

Re: [PATCH] tests/qtest/fuzz: fix memleak in qos_fuzz.c

2024-06-11 Thread Дмитрий Фролов

ping

https://patchew.org/QEMU/20240521103106.119021-3-fro...@swemel.ru/

On 21.05.2024 13:31, Dmitry Frolov wrote:

Found with fuzzing for qemu-8.2, but also relevant for master

Signed-off-by: Dmitry Frolov 
---
  tests/qtest/fuzz/qos_fuzz.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/tests/qtest/fuzz/qos_fuzz.c b/tests/qtest/fuzz/qos_fuzz.c
index b71e945c5f..d3839bf999 100644
--- a/tests/qtest/fuzz/qos_fuzz.c
+++ b/tests/qtest/fuzz/qos_fuzz.c
@@ -180,6 +180,7 @@ static void walk_path(QOSGraphNode *orig_path, int len)
  
  fuzz_path_vec = path_vec;

  } else {
+g_string_free(cmd_line, true);
  g_free(path_vec);
  }
  





Re: [PATCH v2 3/3] monitor: Remove monitor_register_hmp()

2024-06-11 Thread Daniel P . Berrangé
On Tue, Jun 11, 2024 at 12:23:05PM +0200, Philippe Mathieu-Daudé wrote:
> Previous commit removed the last use of monitor_register_hmp(),
> remove it so new commands are implemented using
> monitor_register_hmp_info_hrt().
> 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  include/monitor/monitor.h |  2 --
>  monitor/hmp-target.c  | 16 
>  2 files changed, 18 deletions(-)

Reviewed-by: Daniel P. Berrangé 


With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH v2 2/3] hw/usb: Introduce x-query-usbhost QMP command

2024-06-11 Thread Daniel P . Berrangé
On Tue, Jun 11, 2024 at 12:23:04PM +0200, Philippe Mathieu-Daudé wrote:
> This is a counterpart to the HMP "info usbhost" command. It is being
> added with an "x-" prefix because this QMP command is intended as an
> adhoc debugging tool and will thus not be modelled in QAPI as fully
> structured data, nor will it have long term guaranteed stability.
> The existing HMP command is rewritten to call the QMP command.
> 
> Since host-libusb.c can be built as part of the 'hw-usb' module,
> we introduce the libusb_register_hmp_info_hrt() helper to allow late
> registration when the module is loaded.
> 
> Suggested-by: Daniel P. Berrangé 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  qapi/machine.json   | 18 
>  hw/usb/host-libusb.h| 16 ++
>  include/hw/usb.h|  3 ---
>  hw/usb/bus-stub.c   |  7 +-
>  hw/usb/host-libusb-common.c | 31 ++
>  hw/usb/host-libusb.c| 43 +
>  tests/qtest/qmp-cmd-test.c  |  3 +++
>  hmp-commands-info.hx|  2 ++
>  hw/usb/meson.build  |  1 +
>  9 files changed, 106 insertions(+), 18 deletions(-)
>  create mode 100644 hw/usb/host-libusb.h
>  create mode 100644 hw/usb/host-libusb-common.c

Reviewed-by: Daniel P. Berrangé 

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: [PATCH 0/3] snp: fix coverity reported issues

2024-06-11 Thread Paolo Bonzini
Queued, thanks.

Paolo




Re: [PATCH v2 1/3] hw/usb: Remove unused 'host.h' header

2024-06-11 Thread Daniel P . Berrangé
On Tue, Jun 11, 2024 at 12:23:03PM +0200, Philippe Mathieu-Daudé wrote:
> Since commit 99761176ee ("usb: Remove legacy -usbdevice options
> (host, serial, disk and net)") hw/usb/host.h is not used, remove
> it.
> 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
> Cc: Thomas Huth 
> ---
>  hw/usb/host.h | 44 
>  1 file changed, 44 deletions(-)
>  delete mode 100644 hw/usb/host.h

Reviewed-by: Daniel P. Berrangé 


With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|



