[PATCH] hw/block/nvme: add smart_critical_warning property
Real NVMe hardware very rarely reports a SMART critical warning, which makes it hard to write and test a monitoring agent service. For debugging purposes, add a new 'smart_critical_warning' property to emulate this situation.

Test with this patch:
1. Append 'smart_critical_warning=16' to the nvme parameters.
2. Run smartctl in the guest:
 #smartctl -H -l error /dev/nvme0n1

 === START OF SMART DATA SECTION ===
 SMART overall-health self-assessment test result: FAILED!
 - volatile memory backup device has failed

Signed-off-by: zhenwei pi
---
 hw/block/nvme.c | 4 ++++
 hw/block/nvme.h | 1 +
 2 files changed, 5 insertions(+)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 27d2c72716..2f0bcac91c 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -1215,6 +1215,8 @@ static uint16_t nvme_smart_info(NvmeCtrl *n, uint8_t rae, uint32_t buf_len,
     trans_len = MIN(sizeof(smart) - off, buf_len);

+    smart.critical_warning = n->params.smart_critical_warning;
+
     smart.data_units_read[0] = cpu_to_le64(DIV_ROUND_UP(stats.units_read,
                                                         1000));
     smart.data_units_written[0] = cpu_to_le64(DIV_ROUND_UP(stats.units_written,
@@ -2824,6 +2826,8 @@ static Property nvme_props[] = {
     DEFINE_PROP_UINT32("aer_max_queued", NvmeCtrl, params.aer_max_queued, 64),
     DEFINE_PROP_UINT8("mdts", NvmeCtrl, params.mdts, 7),
     DEFINE_PROP_BOOL("use-intel-id", NvmeCtrl, params.use_intel_id, false),
+    DEFINE_PROP_UINT8("smart_critical_warning", NvmeCtrl,
+                      params.smart_critical_warning, 0),
     DEFINE_PROP_END_OF_LIST(),
 };

diff --git a/hw/block/nvme.h b/hw/block/nvme.h
index e080a2318a..76684f5ac0 100644
--- a/hw/block/nvme.h
+++ b/hw/block/nvme.h
@@ -16,6 +16,7 @@ typedef struct NvmeParams {
     uint32_t aer_max_queued;
     uint8_t  mdts;
     bool     use_intel_id;
+    uint8_t  smart_critical_warning;
 } NvmeParams;

 typedef struct NvmeAsyncEvent {
--
2.25.1
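As a reading aid for the test value above: the property is a one-byte bitmask, and 16 is bit 4, which matches the smartctl message about the volatile memory backup device. The sketch below decodes such a value. It is not part of the patch; the function name and table are hypothetical, and the bit descriptions are paraphrased from the Critical Warning field of the SMART / Health Information log page in the NVMe specification.

```python
# Hypothetical helper, not part of the patch. Bit descriptions are
# paraphrased from the NVMe SMART / Health Information log page.
CRITICAL_WARNING_BITS = {
    0: "available spare capacity has fallen below the threshold",
    1: "temperature is outside the configured thresholds",
    2: "NVM subsystem reliability has been degraded",
    3: "media has been placed in read only mode",
    4: "volatile memory backup device has failed",
    5: "persistent memory region has become read-only or unreliable",
}


def decode_critical_warning(value):
    """Return the descriptions of every known bit set in the warning byte."""
    if not 0 <= value <= 0xFF:
        raise ValueError("critical warning is a single byte")
    return [text for bit, text in CRITICAL_WARNING_BITS.items()
            if value & (1 << bit)]


if __name__ == "__main__":
    # smart_critical_warning=16 sets only bit 4.
    for reason in decode_critical_warning(16):
        print("-", reason)
```

Combining bits (for example 17 for bits 0 and 4) should let a monitoring agent be exercised against several warnings at once, assuming the guest tool reports each bit separately.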
Re: [PATCH] configure: Add flags for MinGW32 standalone build
On 11/01/21 08:29, Stefan Weil wrote:
> On 11.01.21 at 08:04, Thomas Huth wrote:
>> On 08/01/2021 19.30, Joshua Watt wrote:
>>> On 1/8/21 1:25 AM, Thomas Huth wrote:
>>>> On 07/01/2021 22.38, Joshua Watt wrote:
>>>>> There are two cases that need to be accounted for when compiling QEMU
>>>>> for MinGW32:
>>>>> 1) A standalone distribution, where QEMU is self contained and
>>>>> extracted by the user, such as a user would download from the QEMU
>>>>> website. In this case, all of the QEMU files should be rooted in
>>>>> $prefix to ensure they can be easily packaged together for
>>>>> distribution
>>>>> 2) QEMU integrated into a distribution image/sysroot/SDK and
>>>>> distributed with other programs. In this case, the provided
>>>>> arguments for bindir/datadir/etc. should be respected as they are for
>>>>> a Linux build.
>>>>>
>>>>> Add configure-time flags --enable-standalone-mingw and
>>>>> --disable-standalone-mingw that allow the user to control this
>>>>> behavior. The flag defaults to "enabled" if unspecified to retain the
>>>>> existing build behavior
>>>>>
>>>>> Signed-off-by: Joshua Watt
>>>>> ---
>>>>> configure | 8 +++-
>>>>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/configure b/configure
>>>>> index 5860bdb77b..5c83edb502 100755
>>>>> --- a/configure
>>>>> +++ b/configure
>>>>> @@ -358,6 +358,7 @@ strip_opt="yes"
>>>>>  tcg_interpreter="no"
>>>>>  bigendian="no"
>>>>>  mingw32="no"
>>>>> +mingw32_standalone="yes"
>>>>>  gcov="no"
>>>>>  EXESUF="$default_feature"
>>>>>  HOST_DSOSUF=".so"
>>>>> @@ -1558,6 +1559,10 @@ for opt do
>>>>>    ;;
>>>>>    --disable-fuse-lseek) fuse_lseek="disabled"
>>>>>    ;;
>>>>> +  --enable-standalone-mingw) mingw32_standalone="yes"
>>>>> +  ;;
>>>>> +  --disable-standalone-mingw) mingw32_standalone="no"
>>>>> +  ;;
>>>>>    *)
>>>>>        echo "ERROR: unknown option $opt"
>>>>>        echo "Try '$0 --help' for more information"
>>>>> @@ -1570,7 +1575,7 @@ libdir="${libdir:-$prefix/lib}"
>>>>>  libexecdir="${libexecdir:-$prefix/libexec}"
>>>>>  includedir="${includedir:-$prefix/include}"
>>>>>
>>>>> -if test "$mingw32" = "yes" ; then
>>>>> +if test "$mingw32" = "yes" && test "$mingw32_standalone" = "yes"; then
>>>>>      mandir="$prefix"
>>>>>      datadir="$prefix"
>>>>>      docdir="$prefix"
>>>>> @@ -1897,6 +1902,7 @@ disabled with --disable-FEATURE, default is enabled if available
>>>>>    libdaxctl       libdaxctl support
>>>>>    fuse            FUSE block device export
>>>>>    fuse-lseek      SEEK_HOLE/SEEK_DATA support for FUSE exports
>>>>> +  standalone-mingw Build for standalone distribution on MinGW
>>>>>
>>>>>  NOTE: The object files are built at the place where configure is launched
>>>>>  EOF
>>>>
>>>> I think this should maybe be done independently from MinGW, so that it
>>>> could be used on other systems, too. Thus maybe rather name the switch
>>>> "--enable-standalone-distribution" or
>>>> "--enable-standalone-installation" or something like this? On MinGW,
>>>> the value of the switch could then default to "yes" while on other
>>>> systems it would be "no" by default.
>>>
>>> We could, but I'm curious how useful that is? Does that make the option
>>> just a shorthand for "--mandir=$prefix --bindir=$prefix
>>> --datadir=$prefix etc..." for all builds?
>>
>> Yes, that would basically be a shorthand for that. Could be useful for
>> people who want to create standalone binaries on Linux etc., too.
>>
>> Thomas
>
> Aren't nearly all files already rooted in $prefix? The only exception I
> know is /etc/qemu.
>
> Rooting in $prefix still allows hierarchical subdirectories. I'd prefer
> them for MinGW, too.

I agree, it was an issue before 5.2 but now we have relocatable installations. So it would be better to remove all the special casing of mingw, except that (for backwards compatibility) on mingw bindir defaults to $prefix instead of $prefix/bin.

Then Joshua's use case is covered simply by --bindir=/mingw/bin.

Paolo
Re: [PATCH v3] drivers/virt: vmgenid: add vm generation id driver
+ Eric W. Biederman

Eric's email was filtered by my server for some reason, so I can't reply to it directly; this is the closest message in the thread that I could answer on.

On 01/12/2020 12:00, Eric W. Biederman wrote:
> On 27.11.20 19:26, Catangiu, Adrian Costin wrote:
>> - Background
>>
>> The VM Generation ID is a feature defined by Microsoft (paper:
>> http://go.microsoft.com/fwlink/?LinkId=260709) and supported by
>> multiple hypervisor vendors.
>>
>> The feature is required in virtualized environments by apps that work
>> with local copies/caches of world-unique data such as random values,
>> uuids, monotonically increasing counters, etc.
>> Such apps can be negatively affected by VM snapshotting when the VM
>> is either cloned or returned to an earlier point in time.
>>
> How does this differ from /proc/sys/kernel/random/boot_id?

The boot_id only changes at OS boot, whereas we need the generation id to change _while_ the system/guest OS is running. The generation changes because the underlying VM or container goes through a snapshot restore event which is otherwise transparent to the guest system.

>>
>> The VM Generation ID is a simple concept meant to alleviate the issue
>> by providing a unique ID that changes each time the VM is restored
>> from a snapshot. The hw provided UUID value can be used to
>> differentiate between VMs or different generations of the same VM.
>>
> Does the VM generation ID change in a running that effectively things it
> is running?

Yes, the generation id changes while the guest OS is running. The generation change itself is what lets the guest OS and guest userspace know there has been a VM or container snapshot restore event.

>>
>> - Problem
>>
>> The VM Generation ID is exposed through an ACPI device by multiple
>> hypervisor vendors but neither the vendors or upstream Linux have no
>> default driver for it leaving users to fend for themselves.
>>
>> Furthermore, simply finding out about a VM generation change is only
>> the starting point of a process to renew internal states of possibly
>> multiple applications across the system. This process could benefit
>> from a driver that provides an interface through which orchestration
>> can be easily done.
>>
>> - Solution
>>
>> This patch is a driver that exposes a monotonic incremental Virtual
>> Machine Generation u32 counter via a char-dev FS interface.
> Earlier it was a UUID now it is 32bit number?

The generation id exposed to userspace is a 32-bit monotonic incremental counter. This counter is internally driven by the acpi vmgenid device. The 128-bit UUID provided by the vmgenid device is only used internally by the driver. I will make all of this clearer in the next patch version.

>> The FS
>> interface provides sync and async VmGen counter updates notifications.
>> It also provides VmGen counter retrieval and confirmation mechanisms.
>>
>> The generation counter and the interface through which it is exposed
>> are available even when there is no acpi device present.
>>
>> When the device is present, the hw provided UUID is not exposed to
>> userspace, it is internally used by the driver to keep accounting for
>> the exposed VmGen counter. The counter starts from zero when the
>> driver is initialized and monotonically increments every time the hw
>> UUID changes (the VM generation changes).
>> On each hw UUID change, the new hypervisor-provided UUID is also fed
>> to the kernel RNG.
> Should this be a hotplug event rather than a new character device?
>
> Without plugging into udev and the rest of the hotplug infrastructure
> I suspect things will be missed.

That's a good idea, I will look into it.

>>
>> If there is no acpi vmgenid device present, the generation changes are
>> not driven by hw vmgenid events but can be driven by software through
>> a dedicated driver ioctl.
>>
>> This patch builds on top of Or Idgar's proposal
>> https://lkml.org/lkml/2018/3/1/498
> Eric
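The relationship between the internal UUID and the exposed counter, as described in the replies above, can be modelled in a few lines. This is only a userspace illustration of the stated semantics, not the driver: the class and method names are hypothetical, and wrapping at 32 bits is an assumption based on the counter being described as a u32.

```python
import uuid


class GenerationCounter:
    """Hypothetical model of the exposed counter; not the kernel driver."""

    def __init__(self, hw_uuid=None):
        self.counter = 0         # starts from zero when the driver initializes
        self._hw_uuid = hw_uuid  # used internally only, never exposed

    def on_hw_notify(self, new_uuid):
        """Model an ACPI notification: bump only if the UUID changed."""
        if new_uuid != self._hw_uuid:
            self._hw_uuid = new_uuid
            # Assumption: a u32 counter wraps at 2**32.
            self.counter = (self.counter + 1) & 0xFFFFFFFF
        return self.counter

    def bump_by_software(self):
        """Model the ioctl-driven path used when no ACPI device is present."""
        self.counter = (self.counter + 1) & 0xFFFFFFFF
        return self.counter


if __name__ == "__main__":
    gen = GenerationCounter(uuid.UUID(int=1))
    print(gen.on_hw_notify(uuid.UUID(int=1)))  # same UUID, counter unchanged
    print(gen.on_hw_notify(uuid.UUID(int=2)))  # restore event, counter bumps
```

Userspace only ever sees `counter`; comparing a cached value against the current one is enough to detect that a snapshot restore happened in between.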
[PATCH] vfio/migrate: Move switch of dirty tracking into vfio_memory_listener
The switch for vfio dirty page tracking is currently integrated into the vfio_save_handler, which causes some problems [1].

The object of dirty tracking is guest memory, but the object of the vfio_save_handler is device state. This mixed logic produces unnecessary coupling and conflicts:

1. Coupling: Their saving granules differ (per-VM vs. per-device). vfio enables dirty_page_tracking for each device, although once is enough.
2. Conflicts: ram_save_setup() traverses all memory_listeners to execute their log_start() and log_sync() hooks to get the first-round dirty bitmap, which is used by the bulk stage of ram saving. However, it can't get the dirty bitmap from vfio, because @savevm_ram_handlers is registered before @vfio_save_handler.

Moving the switch of vfio dirty_page_tracking into vfio_memory_listener solves the above problems. In addition, do not require devices to be in the SAVING state for vfio_sync_dirty_bitmap().

[1] https://www.spinics.net/lists/kvm/msg229967.html

Reported-by: Zenghui Yu
Signed-off-by: Keqian Zhu
---
 hw/vfio/common.c    | 53 +
 hw/vfio/migration.c | 35 --
 2 files changed, 44 insertions(+), 44 deletions(-)

diff --git a/hw/vfio/common.c b/hw/vfio/common.c
index 6ff1daa763..9128cd7ee1 100644
--- a/hw/vfio/common.c
+++ b/hw/vfio/common.c
@@ -311,7 +311,7 @@ bool vfio_mig_active(void)
     return true;
 }

-static bool vfio_devices_all_saving(VFIOContainer *container)
+static bool vfio_devices_all_dirty_tracking(VFIOContainer *container)
 {
     VFIOGroup *group;
     VFIODevice *vbasedev;
@@ -329,13 +329,8 @@ static bool vfio_devices_all_saving(VFIOContainer *container)
                 return false;
             }

-            if (migration->device_state & VFIO_DEVICE_STATE_SAVING) {
-                if ((vbasedev->pre_copy_dirty_page_tracking == ON_OFF_AUTO_OFF)
-                    && (migration->device_state & VFIO_DEVICE_STATE_RUNNING)) {
-                        return false;
-                }
-                continue;
-            } else {
+            if ((vbasedev->pre_copy_dirty_page_tracking == ON_OFF_AUTO_OFF)
+                && (migration->device_state & VFIO_DEVICE_STATE_RUNNING)) {
                 return false;
             }
         }
@@ -987,6 +982,44 @@ static void
vfio_listener_region_del(MemoryListener *listener, } } +static void vfio_set_dirty_page_tracking(VFIOContainer *container, bool start) +{ +int ret; +struct vfio_iommu_type1_dirty_bitmap dirty = { +.argsz = sizeof(dirty), +}; + +if (start) { +dirty.flags = VFIO_IOMMU_DIRTY_PAGES_FLAG_START; +} else { +dirty.flags = VFIO_IOMMU_DIRTY_PAGES_FLAG_STOP; +} + +ret = ioctl(container->fd, VFIO_IOMMU_DIRTY_PAGES, &dirty); +if (ret) { +error_report("Failed to set dirty tracking flag 0x%x errno: %d", + dirty.flags, errno); +} +} + +static void vfio_listener_log_start(MemoryListener *listener, +MemoryRegionSection *section, +int old, int new) +{ +VFIOContainer *container = container_of(listener, VFIOContainer, listener); + +vfio_set_dirty_page_tracking(container, true); +} + +static void vfio_listener_log_stop(MemoryListener *listener, + MemoryRegionSection *section, + int old, int new) +{ +VFIOContainer *container = container_of(listener, VFIOContainer, listener); + +vfio_set_dirty_page_tracking(container, false); +} + static int vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova, uint64_t size, ram_addr_t ram_addr) { @@ -1128,7 +1161,7 @@ static void vfio_listerner_log_sync(MemoryListener *listener, return; } -if (vfio_devices_all_saving(container)) { +if (vfio_devices_all_dirty_tracking(container)) { vfio_sync_dirty_bitmap(container, section); } } @@ -1136,6 +1169,8 @@ static void vfio_listerner_log_sync(MemoryListener *listener, static const MemoryListener vfio_memory_listener = { .region_add = vfio_listener_region_add, .region_del = vfio_listener_region_del, +.log_start = vfio_listener_log_start, +.log_stop = vfio_listener_log_stop, .log_sync = vfio_listerner_log_sync, }; diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c index 00daa50ed8..c0f646823a 100644 --- a/hw/vfio/migration.c +++ b/hw/vfio/migration.c @@ -395,40 +395,10 @@ static int vfio_load_device_config_state(QEMUFile *f, void *opaque) return qemu_file_get_error(f); } -static int 
vfio_set_dirty_page_tracking(VFIODevice *vbasedev, bool start) -{ -int ret; -VFIOMigration *migration = vbasedev->migration; -VFIOContainer *container = vbasedev->group->container; -struct vfio_iommu_type1_dirty_bitmap dirty = { -.argsz = sizeof(dirty), -}; - -if (start) { -i
Fwd: VirtioSound device emulation implementation
-- Forwarded message - From: Shreyansh Chouhan Date: Mon, 11 Jan 2021 at 11:59 Subject: Re: VirtioSound device emulation implementation To: Gerd Hoffmann On Sun, 10 Jan 2021 at 13:55, Shreyansh Chouhan < chouhan.shreyansh2...@gmail.com> wrote: > Hi, > > I have been reading about the virtio and vhost specifications, however I > have a few doubts. I tried looking for them but I still > do not understand them clearly enough. From what I understand, there are > two protocols: > > The virtio protocol: The one that specifies how we can have common > emulation for virtual devices. The front end drivers > interact with these devices, and these devices could then process the > information that they have received either in QEMU, > or somewhere else. From what I understand the front driver uses mmaps to > communicate with the virtio device. > > The vhost protocol: The one that specifies how we can _offload_ the > processing from QEMU to a separate process. We > want to offload so that we do not have to stop the guest when we are > processing information passed to a virtio device. This > service could either be implemented in the host kernel or the host > userspace. Now when we offload the processing, we map the > memory of the device to this vhost service, so that this service has all > the information that it should process. > Also, this process can generate the vCPU interrupts, and this process > responds to the ioeventfd notifications. > > What I do not understand is, once we have this vhost service, either in > userspace or in kernel space, which does the information processing, > why do we need a virtio device still emulated in QEMU? Is it only to pass > on the configurations between the driver and the > vhost service? I know that the vhost service doesn't emulate anything, but > then what is the difference between "processing" the > information and "emulating" a device? > > Also, from article[3], moving the vhost-net service to userspace was > faster somehow. 
I am assuming this was only the case for > networking devices, and would not be true in general. Since there would be > more context switches between user and kernel space? > (KVM receives the irq/ioevent notification and then transfers control back > to user space, as opposed to when vhost was in kernel > space.) > > For context, I've been reading the following: > [1] > https://www.redhat.com/en/blog/introduction-virtio-networking-and-vhost-net > [2] > https://www.redhat.com/en/blog/deep-dive-virtio-networking-and-vhost-net > [3] https://www.redhat.com/en/blog/journey-vhost-users-realm > > Found the answers in this blog: http://blog.vmsplice.net/2011/09/qemu-internals-vhost-architecture.html In short, yes, the configuration plane still remains with QEMU. The frontend driver interacts with the PCI adapter emulated in QEMU, for configurations and memory map setup. Only the data plane is forwarded to the vhost service. This makes sense since we would only want to configure the device once, and hence having that emulated in QEMU is not a performance issue, as much as having the data plane was. There is still a little confusion in my mind regarding a few things, but I think looking at the source code of the already implemented drivers will clear that up for me. So that is what I will be doing next. I will start looking at the source code for in-QEMU and vhost implementations of other virtio drivers, and then decide which one I'd like to go with. I will probably follow that decision with an implementation plan/timeline so that everyone can follow the progress on the development of this project.
[PATCH v8 7/7] fuzz: heuristic split write based on past IOs
If previous write commands write the same length of data with the same step, we view it as a hint. Signed-off-by: Qiuhao Li Reviewed-by: Alexander Bulekov Tested-by: Alexander Bulekov --- scripts/oss-fuzz/minimize_qtest_trace.py | 56 1 file changed, 56 insertions(+) diff --git a/scripts/oss-fuzz/minimize_qtest_trace.py b/scripts/oss-fuzz/minimize_qtest_trace.py index 0e59bdbb01..4cba96dee2 100755 --- a/scripts/oss-fuzz/minimize_qtest_trace.py +++ b/scripts/oss-fuzz/minimize_qtest_trace.py @@ -88,6 +88,43 @@ def check_if_trace_crashes(trace, path): return False +# If previous write commands write the same length of data at the same +# interval, we view it as a hint. +def split_write_hint(newtrace, i): +HINT_LEN = 3 # > 2 +if i <=(HINT_LEN-1): +return None + +#find previous continuous write traces +k = 0 +l = i-1 +writes = [] +while (k != HINT_LEN and l >= 0): +if newtrace[l].startswith("write "): +writes.append(newtrace[l]) +k += 1 +l -= 1 +elif newtrace[l] == "": +l -= 1 +else: +return None +if k != HINT_LEN: +return None + +length = int(writes[0].split()[2], 16) +for j in range(1, HINT_LEN): +if length != int(writes[j].split()[2], 16): +return None + +step = int(writes[0].split()[1], 16) - int(writes[1].split()[1], 16) +for j in range(1, HINT_LEN-1): +if step != int(writes[j].split()[1], 16) - \ +int(writes[j+1].split()[1], 16): +return None + +return (int(writes[0].split()[1], 16)+step, length) + + def remove_lines(newtrace, outpath): remove_step = 1 i = 0 @@ -151,6 +188,25 @@ def remove_lines(newtrace, outpath): length = int(newtrace[i].split()[2], 16) data = newtrace[i].split()[3][2:] if length > 1: + +# Can we get a hint from previous writes? 
+hint = split_write_hint(newtrace, i) +if hint is not None: +hint_addr = hint[0] +hint_len = hint[1] +if hint_addr >= addr and hint_addr+hint_len <= addr+length: +newtrace[i] = "write {addr} {size} 0x{data}\n".format( +addr=hex(hint_addr), +size=hex(hint_len), +data=data[(hint_addr-addr)*2:\ +(hint_addr-addr)*2+hint_len*2]) +if check_if_trace_crashes(newtrace, outpath): +# next round +i += 1 +continue +newtrace[i] = prior[0] + +# Try splitting it using a binary approach leftlength = int(length/2) rightlength = length - leftlength newtrace.insert(i+1, "") -- 2.25.1
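To make the heuristic in this patch easier to follow, here is a standalone sketch of the same idea: if the writes immediately before line i all have the same length and are spaced by the same address step, predict where the "next" write of that pattern would land. It is a simplified re-statement, not the patch's code: the `hint_len` parameter is an addition for illustration, and the trace lines in the example are made up.

```python
def split_write_hint(trace, i, hint_len=3):
    """Guess (address, length) for splitting trace[i] from preceding writes.

    Returns None unless the hint_len commands before index i are all
    "write ADDR SIZE DATA" lines with equal SIZE and a constant ADDR step.
    Empty lines (already-removed commands) are skipped.
    """
    writes = []  # (addr, length), most recent first
    k = i - 1
    while len(writes) < hint_len and k >= 0:
        line = trace[k].strip()
        if line.startswith("write "):
            _, addr, length = line.split()[:3]
            writes.append((int(addr, 16), int(length, 16)))
        elif line != "":
            return None  # a non-write command interrupts the run
        k -= 1
    if len(writes) < hint_len:
        return None

    lengths = {length for _, length in writes}
    steps = {writes[j][0] - writes[j + 1][0] for j in range(hint_len - 1)}
    if len(lengths) != 1 or len(steps) != 1:
        return None
    return (writes[0][0] + steps.pop(), lengths.pop())


if __name__ == "__main__":
    # Made-up trace: three 4-byte writes 4 bytes apart, then one long write.
    trace = [
        "write 0x1000 0x4 0x11111111\n",
        "write 0x1004 0x4 0x22222222\n",
        "write 0x1008 0x4 0x33333333\n",
        "write 0x100c 0x10 0x00112233445566778899aabbccddeeff\n",
    ]
    print(split_write_hint(trace, 3))
```

In the patch, the returned pair is only used when it falls inside the address range of the write being minimized; otherwise the minimizer falls back to the binary split.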
[PATCH v8 6/7] fuzz: add minimization options
-M1: remove IO commands iteratively -M2: try setting bits in operand of write/out to zero Signed-off-by: Qiuhao Li Reviewed-by: Alexander Bulekov Tested-by: Alexander Bulekov --- scripts/oss-fuzz/minimize_qtest_trace.py | 30 1 file changed, 26 insertions(+), 4 deletions(-) diff --git a/scripts/oss-fuzz/minimize_qtest_trace.py b/scripts/oss-fuzz/minimize_qtest_trace.py index 219858a9e3..0e59bdbb01 100755 --- a/scripts/oss-fuzz/minimize_qtest_trace.py +++ b/scripts/oss-fuzz/minimize_qtest_trace.py @@ -16,6 +16,10 @@ QEMU_PATH = None TIMEOUT = 5 CRASH_TOKEN = None +# Minimization levels +M1 = False # try removing IO commands iteratively +M2 = False # try setting bits in operand of write/out to zero + write_suffix_lookup = {"b": (1, "B"), "w": (2, "H"), "l": (4, "L"), @@ -23,10 +27,20 @@ write_suffix_lookup = {"b": (1, "B"), def usage(): sys.exit("""\ -Usage: QEMU_PATH="/path/to/qemu" QEMU_ARGS="args" {} input_trace output_trace +Usage: + +QEMU_PATH="/path/to/qemu" QEMU_ARGS="args" {} [Options] input_trace output_trace + By default, will try to use the second-to-last line in the output to identify whether the crash occred. Optionally, manually set a string that idenitifes the crash by setting CRASH_TOKEN= + +Options: + +-M1: enable a loop around the remove minimizer, which may help decrease some + timing dependant instructions. Off by default. +-M2: try setting bits in operand of write/out to zero. Off by default. 
+ """.format((sys.argv[0]))) deduplication_note = """\n\ @@ -216,24 +230,32 @@ def minimize_trace(inpath, outpath): print("Setting the timeout for {} seconds".format(TIMEOUT)) newtrace = trace[:] +global M1, M2 # remove lines old_len = len(newtrace) + 1 while(old_len > len(newtrace)): old_len = len(newtrace) +print("trace lenth = ", old_len) remove_lines(newtrace, outpath) +if not M1 and not M2: +break newtrace = list(filter(lambda s: s != "", newtrace)) assert(check_if_trace_crashes(newtrace, outpath)) # set bits to zero -clear_bits(newtrace, outpath) +if M2: +clear_bits(newtrace, outpath) assert(check_if_trace_crashes(newtrace, outpath)) if __name__ == '__main__': if len(sys.argv) < 3: usage() - +if "-M1" in sys.argv: +M1 = True +if "-M2" in sys.argv: +M2 = True QEMU_PATH = os.getenv("QEMU_PATH") QEMU_ARGS = os.getenv("QEMU_ARGS") if QEMU_PATH is None or QEMU_ARGS is None: @@ -242,4 +264,4 @@ if __name__ == '__main__': # QEMU_ARGS += " -accel qtest" CRASH_TOKEN = os.getenv("CRASH_TOKEN") QEMU_ARGS += " -qtest stdio -monitor none -serial none " -minimize_trace(sys.argv[1], sys.argv[2]) +minimize_trace(sys.argv[-2], sys.argv[-1]) -- 2.25.1
[PATCH v8 5/7] fuzz: set bits in operand of write/out to zero
Simplifying the crash cases by opportunistically setting bits in operands of out/write to zero may help to debug, since usually bit one means turn on or trigger a function while zero is the default turn-off setting. Tested Bug 1908062. Signed-off-by: Qiuhao Li Reviewed-by: Alexander Bulekov Tested-by: Alexander Bulekov --- scripts/oss-fuzz/minimize_qtest_trace.py | 39 1 file changed, 39 insertions(+) diff --git a/scripts/oss-fuzz/minimize_qtest_trace.py b/scripts/oss-fuzz/minimize_qtest_trace.py index 59e91de7e2..219858a9e3 100755 --- a/scripts/oss-fuzz/minimize_qtest_trace.py +++ b/scripts/oss-fuzz/minimize_qtest_trace.py @@ -167,6 +167,42 @@ def remove_lines(newtrace, outpath): i += 1 +def clear_bits(newtrace, outpath): +# try setting bits in operands of out/write to zero +i = 0 +while i < len(newtrace): +if (not newtrace[i].startswith("write ") and not + newtrace[i].startswith("out")): + i += 1 + continue +# write ADDR SIZE DATA +# outx ADDR VALUE +print("\nzero setting bits: {}".format(newtrace[i])) + +prefix = " ".join(newtrace[i].split()[:-1]) +data = newtrace[i].split()[-1] +data_bin = bin(int(data, 16)) +data_bin_list = list(data_bin) + +for j in range(2, len(data_bin_list)): +prior = newtrace[i] +if (data_bin_list[j] == '1'): +data_bin_list[j] = '0' +data_try = hex(int("".join(data_bin_list), 2)) +# It seems qtest only accepts padded hex-values. 
+if len(data_try) % 2 == 1: +data_try = data_try[:2] + "0" + data_try[2:-1] + +newtrace[i] = "{prefix} {data_try}\n".format( +prefix=prefix, +data_try=data_try) + +if not check_if_trace_crashes(newtrace, outpath): +data_bin_list[j] = '1' +newtrace[i] = prior +i += 1 + + def minimize_trace(inpath, outpath): global TIMEOUT with open(inpath) as f: @@ -187,7 +223,10 @@ def minimize_trace(inpath, outpath): old_len = len(newtrace) remove_lines(newtrace, outpath) newtrace = list(filter(lambda s: s != "", newtrace)) +assert(check_if_trace_crashes(newtrace, outpath)) +# set bits to zero +clear_bits(newtrace, outpath) assert(check_if_trace_crashes(newtrace, outpath)) -- 2.25.1
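The bit-clearing pass can be illustrated without QEMU by replacing the crash check with a predicate. The sketch below is an assumption-laden simplification: it works on an integer rather than on a trace line, and `still_crashes` is a hypothetical stand-in for check_if_trace_crashes().

```python
def clear_bits(value, still_crashes):
    """Greedily clear set bits, most significant first.

    A bit stays cleared only if still_crashes(candidate) remains true,
    mirroring how the patch restores a bit when the crash disappears.
    """
    for bit in reversed(range(value.bit_length())):
        mask = 1 << bit
        if value & mask:
            candidate = value & ~mask
            if still_crashes(candidate):
                value = candidate
    return value


if __name__ == "__main__":
    # Hypothetical device: the crash only needs bit 4 of the operand set.
    def needs_bit4(v):
        return bool(v & 0x10)

    print(hex(clear_bits(0xFF, needs_bit4)))
```

Each bit costs one reproduction attempt, so the pass is linear in the operand width, which is presumably why the later patch in the series makes it opt-in.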
[PATCH v8 4/7] fuzz: remove IO commands iteratively
Now we use a one-time scan and remove strategy in the minimizer, which is not suitable for timing dependent instructions. For example, instruction A will indicate an address where the config chunk locates, and instruction B will make the configuration active. If we have the following instruction sequence: ... A1 B1 A2 B2 ... A2 and B2 are the actual instructions that trigger the bug. If we scan from top to bottom, after we remove A1, the behavior of B1 might be unknowable, including not to crash the program. But we will successfully remove B1 later cause A2 and B2 will crash the process anyway: ... A1 A2 B2 ... Now one more trimming will remove A1. In the perfect case, we would need to be able to remove A and B (or C!) at the same time. But for now, let's just add a loop around the minimizer. Since we only remove instructions, this iterative algorithm is converging. Tested with Bug 1908062. Signed-off-by: Qiuhao Li Reviewed-by: Alexander Bulekov Tested-by: Alexander Bulekov --- scripts/oss-fuzz/minimize_qtest_trace.py | 41 +++- 1 file changed, 26 insertions(+), 15 deletions(-) diff --git a/scripts/oss-fuzz/minimize_qtest_trace.py b/scripts/oss-fuzz/minimize_qtest_trace.py index af9767f7e4..59e91de7e2 100755 --- a/scripts/oss-fuzz/minimize_qtest_trace.py +++ b/scripts/oss-fuzz/minimize_qtest_trace.py @@ -74,21 +74,9 @@ def check_if_trace_crashes(trace, path): return False -def minimize_trace(inpath, outpath): -global TIMEOUT -with open(inpath) as f: -trace = f.readlines() -start = time.time() -if not check_if_trace_crashes(trace, outpath): -sys.exit("The input qtest trace didn't cause a crash...") -end = time.time() -print("Crashed in {} seconds".format(end-start)) -TIMEOUT = (end-start)*5 -print("Setting the timeout for {} seconds".format(TIMEOUT)) - -i = 0 -newtrace = trace[:] +def remove_lines(newtrace, outpath): remove_step = 1 +i = 0 while i < len(newtrace): # 1.) Try to remove lines completely and reproduce the crash. # If it works, we're done. 
@@ -177,7 +165,30 @@ def minimize_trace(inpath, outpath): newtrace[i] = prior[0] del newtrace[i+1] i += 1 -check_if_trace_crashes(newtrace, outpath) + + +def minimize_trace(inpath, outpath): +global TIMEOUT +with open(inpath) as f: +trace = f.readlines() +start = time.time() +if not check_if_trace_crashes(trace, outpath): +sys.exit("The input qtest trace didn't cause a crash...") +end = time.time() +print("Crashed in {} seconds".format(end-start)) +TIMEOUT = (end-start)*5 +print("Setting the timeout for {} seconds".format(TIMEOUT)) + +newtrace = trace[:] + +# remove lines +old_len = len(newtrace) + 1 +while(old_len > len(newtrace)): +old_len = len(newtrace) +remove_lines(newtrace, outpath) +newtrace = list(filter(lambda s: s != "", newtrace)) + +assert(check_if_trace_crashes(newtrace, outpath)) if __name__ == '__main__': -- 2.25.1
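The iterative removal described in this patch can be sketched in isolation. The following is only a minimal illustration, not the script's code: `remove_pass`, `minimize` and `toy_crashes` are hypothetical names, and `toy_crashes` is an invented stand-in for actually running QEMU on a trace. It models the A/B dependency from the commit message, assuming a B command only behaves well when it directly follows an A command.

```python
def remove_pass(trace, crashes):
    """One top-to-bottom pass: drop each line whose removal keeps the crash."""
    i = 0
    while i < len(trace):
        candidate = trace[:i] + trace[i + 1:]
        if crashes(candidate):
            trace = candidate      # line i was unnecessary; stay at index i
        else:
            i += 1                 # line i is needed (for now); move on
    return trace


def minimize(trace, crashes):
    """Repeat passes until a pass removes nothing, i.e. a fixed point."""
    assert crashes(trace), "input must reproduce the crash"
    old_len = len(trace) + 1
    while old_len > len(trace):
        old_len = len(trace)
        trace = remove_pass(trace, crashes)
    return trace


def toy_crashes(trace):
    """Hypothetical oracle (assumption, not QEMU behavior): every B must
    directly follow an A, and the crash itself needs A2 before B2."""
    for idx, cmd in enumerate(trace):
        if cmd.startswith("B") and (idx == 0 or not trace[idx - 1].startswith("A")):
            return False
    return ("A2" in trace and "B2" in trace
            and trace.index("A2") < trace.index("B2"))


if __name__ == "__main__":
    trace = ["A1", "B1", "A2", "B2"]
    print(remove_pass(trace, toy_crashes))  # a single pass leaves A1 behind
    print(minimize(trace, toy_crashes))     # the loop removes it as well
```

Because each pass can only shorten the trace, the outer loop runs at most `len(trace)` times before the length stops changing.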
[PATCH v8 3/7] fuzz: split write operand using binary approach
Currently, we split the data of a write command down the middle. If that does not work, we move the pivot left by one byte and retry until there is no space left. This method has two flaws:

1. It may fail to trim all unnecessary bytes on the right side.

For example, take this IO write command:

  write addr uuxxxxuu

where u is a byte that is unnecessary for the crash. Unlike RAM write commands, in most cases a split IO write won't trigger the same crash. So if we split from the middle, we get:

  write addr uu (will be removed in next round)
  write addr xxxxuu

For xxxxuu, splitting it in the middle and retrying down to the leftmost byte won't reproduce the crash, so we are prevented from removing the last two bytes.

2. The algorithm complexity is O(n), since we move the pivot byte by byte.

To solve the first issue, we can try the symmetrical position on the right if we fail on the left. As for the second issue, instead of moving by one byte, we can approach the boundary exponentially, achieving O(log(n)).

For example:

                   xxxxuu len=6
                        +
                        |
                        +
                 xxx,xuu 6/2=3 fail
                        +
         +--------------+-------------+
         |                            |
         +                            +
  xx,xxuu 6/2^2=1 fail         xxxxu,u 6-1=5 success
                                 +   +
         +------------------+----+   |
         |                  |        +-------------+ u removed
         +                  +
   xx,xxu 5/2=2 fail  xxxx,u 6-2=4 success
                           +
                           |
                           +-----------+ u removed

In some rare cases, this algorithm will fail to trim all unnecessary bytes:

  xxxxxxxxxuxxxxxx
  xxxxxxxx-xuxxxxxx Fail
  xxxx-xxxxxuxxxxxx Fail
  xxxxxxxxxuxx-xxxx Fail
  ...

I think the trade-off is worth it.
Signed-off-by: Qiuhao Li Reviewed-by: Alexander Bulekov Tested-by: Alexander Bulekov --- scripts/oss-fuzz/minimize_qtest_trace.py | 29 1 file changed, 20 insertions(+), 9 deletions(-) diff --git a/scripts/oss-fuzz/minimize_qtest_trace.py b/scripts/oss-fuzz/minimize_qtest_trace.py index cacabf2638..af9767f7e4 100755 --- a/scripts/oss-fuzz/minimize_qtest_trace.py +++ b/scripts/oss-fuzz/minimize_qtest_trace.py @@ -97,7 +97,7 @@ def minimize_trace(inpath, outpath): prior = newtrace[i:i+remove_step] for j in range(i, i+remove_step): newtrace[j] = "" -print("Removing {lines} ...".format(lines=prior)) +print("Removing {lines} ...\n".format(lines=prior)) if check_if_trace_crashes(newtrace, outpath): i += remove_step # Double the number of lines to remove for next round @@ -110,9 +110,11 @@ def minimize_trace(inpath, outpath): remove_step = 1 continue newtrace[i] = prior[0] # remove_step = 1 + # 2.) Try to replace write{bwlq} commands with a write addr, len # command. Since this can require swapping endianness, try both LE and # BE options. We do this, so we can "trim" the writes in (3) + if (newtrace[i].startswith("write") and not newtrace[i].startswith("write ")): suffix = newtrace[i].split()[0][-1] @@ -133,11 +135,15 @@ def minimize_trace(inpath, outpath): newtrace[i] = prior[0] # 3.) If it is a qtest write command: write addr len data, try to split -# it into two separate write commands. If splitting the write down the -# middle does not work, try to move the pivot "left" and retry, until -# there is no space left. The idea is to prune unneccessary bytes from -# long writes, while accommodating arbitrary MemoryRegion access sizes -# and alignments. +# it into two separate write commands. If splitting the data operand +# from length/2^n bytes to the left does not work, try to move the pivot +# to the right side, then add one to n, until length/2^n == 0. 
The idea +# is to prune unneccessary bytes from long writes, while accommodating +# arbitrary MemoryRegion access sizes and alignments. + +# This algorithm will fail under some rare situations. +# e.g., xuxx (u is the unnecessary byte) + if newtrace[i].startswith("write "): addr = int(newtrace[i].split()[1], 16) length = int(newtrace[i].split()[2], 16) @@ -146,6 +152,7 @@ def minimize_trace(inpath, outpath): leftlength = int(length/2) rightlength = length - leftlength newtrace.insert(i+1, "") +power = 1 while leftlength > 0: newtrace[i] = "write {addr} {size} 0x{data}\n".format( addr=hex(addr), @@ -157,9 +164,13 @@ def minimize_trace(inpath, outpath):
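The order in which the binary approach probes pivot positions can be sketched as follows. This is a simplified illustration under stated assumptions, not the script's code: `pivot_candidates` and `split_write` are hypothetical helpers, and the oracle passed to `split_write` stands in for re-running the trace. The toy oracle assumes the crash only reproduces when the four necessary `x` bytes stay together in one write.

```python
def pivot_candidates(length):
    """Yield left-part lengths in the order they are tried:
    length/2^n on the left, then the mirrored position on the right,
    with n growing until length/2^n reaches zero."""
    power = 1
    left = length // 2
    while left > 0:
        yield left
        mirrored = length - left
        if mirrored != left:
            yield mirrored
        power += 1
        left = length // (2 ** power)


def split_write(data, crashes_when_split):
    """Return the first (left, right) split that still crashes, or None."""
    for left in pivot_candidates(len(data)):
        if crashes_when_split(data[:left], data[left:]):
            return data[:left], data[left:]
    return None


def toy_oracle(left, right):
    # Assumption for illustration: "xxxx" must arrive in a single write.
    return "xxxx" in left or "xxxx" in right


if __name__ == "__main__":
    print(list(pivot_candidates(6)))          # pivots tried for a 6-byte operand
    print(split_write("xxxxuu", toy_oracle))  # first split that still crashes
    print(split_write("xxxxu", toy_oracle))   # next round on the left part
```

The number of candidates grows with log2(length) rather than with length, which is where the O(log(n)) bound in the commit message comes from.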
[PATCH v8 2/7] fuzz: double the IOs to remove for every loop
Instead of removing IO instructions one by one, we can try deleting multiple instructions at once. Following the principle of locality of reference, we double the number of instructions to remove after each successful round and reset it to one as soon as a removal fails. The speedup is usually significant for large inputs.

Tested with the quadrupled trace input at:
  https://bugs.launchpad.net/qemu/+bug/1890333/comments/1

Patched 1/6 version:
  real 0m45.904s
  user 0m16.874s
  sys 0m10.042s

Refined version:
  real 0m11.412s
  user 0m6.888s
  sys 0m3.325s

Signed-off-by: Qiuhao Li
Reviewed-by: Alexander Bulekov
Tested-by: Alexander Bulekov
---
 scripts/oss-fuzz/minimize_qtest_trace.py | 33 +++-
 1 file changed, 21 insertions(+), 12 deletions(-)

diff --git a/scripts/oss-fuzz/minimize_qtest_trace.py b/scripts/oss-fuzz/minimize_qtest_trace.py
index a28913a2a7..cacabf2638 100755
--- a/scripts/oss-fuzz/minimize_qtest_trace.py
+++ b/scripts/oss-fuzz/minimize_qtest_trace.py
@@ -88,19 +88,28 @@ def minimize_trace(inpath, outpath): i = 0 newtrace = trace[:] -# For each line +remove_step = 1 while i < len(newtrace): -# 1.) Try to remove it completely and reproduce the crash. If it works, -# we're done. -prior = newtrace[i] -print("Trying to remove {}".format(newtrace[i])) -# Try to remove the line completely -newtrace[i] = "" +# 1.) Try to remove lines completely and reproduce the crash. +# If it works, we're done. +if (i+remove_step) >= len(newtrace): +remove_step = 1 +prior = newtrace[i:i+remove_step] +for j in range(i, i+remove_step): +newtrace[j] = "" +print("Removing {lines} ...".format(lines=prior)) if check_if_trace_crashes(newtrace, outpath): -i += 1 +i += remove_step +# Double the number of lines to remove for next round +remove_step *= 2 continue -newtrace[i] = prior - +# Failed to remove multiple IOs, fast recovery +if remove_step > 1: +for j in range(i, i+remove_step): +newtrace[j] = prior[j-i] +remove_step = 1 +continue +newtrace[i] = prior[0] # remove_step = 1 # 2.) 
Try to replace write{bwlq} commands with a write addr, len # command. Since this can require swapping endianness, try both LE and # BE options. We do this, so we can "trim" the writes in (3) @@ -121,7 +130,7 @@ def minimize_trace(inpath, outpath): if(check_if_trace_crashes(newtrace, outpath)): break else: -newtrace[i] = prior +newtrace[i] = prior[0] # 3.) If it is a qtest write command: write addr len data, try to split # it into two separate write commands. If splitting the write down the @@ -154,7 +163,7 @@ def minimize_trace(inpath, outpath): if check_if_trace_crashes(newtrace, outpath): i -= 1 else: -newtrace[i] = prior +newtrace[i] = prior[0] del newtrace[i+1] i += 1 check_if_trace_crashes(newtrace, outpath) -- 2.25.1
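The doubling strategy from this patch can be sketched on its own. This is a simplified illustration, assuming a caller-supplied `crashes` oracle in place of `check_if_trace_crashes`; the function name mirrors the later `remove_lines` helper, but the body is a sketch that leaves out the write-rewriting steps (2) and (3) of the real script.

```python
def remove_lines(trace, crashes):
    """Blank out runs of lines, doubling the run length after each success
    and falling back to a single line after a failure."""
    trace = list(trace)
    i = 0
    step = 1
    while i < len(trace):
        if i + step >= len(trace):
            step = 1                       # do not run past the end
        prior = trace[i:i + step]
        trace[i:i + step] = [""] * len(prior)
        if crashes(trace):
            i += step
            step *= 2                      # success: be greedier next round
            continue
        trace[i:i + len(prior)] = prior    # failure: restore the lines
        if step > 1:
            step = 1                       # fast recovery, retry one line
            continue
        i += 1                             # a single needed line, keep it
    return [line for line in trace if line != ""]


if __name__ == "__main__":
    calls = []

    def oracle(t):
        # Hypothetical oracle: the crash only needs the "CRASH" line.
        calls.append(1)
        return "CRASH\n" in t

    trace = ["noise {}\n".format(k) for k in range(8)] + ["CRASH\n"]
    print(remove_lines(trace, oracle), len(calls), "oracle calls")
```

On inputs where long runs of lines are irrelevant, the oracle is invoked far fewer times than once per line, which matches the timing improvement reported above.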
[PATCH v8 1/7] fuzz: accelerate non-crash detection
During minimization we spend a lot of time waiting for the timeout program to reach its time limit. This patch instead watches QTest's output for the CLOSED notification (which indicates that the redirected input file was closed), so a trace that does not crash is detected without waiting for the timeout.

Tested with the quadrupled trace input at:
  https://bugs.launchpad.net/qemu/+bug/1890333/comments/1

Original version:
  real 1m37.246s
  user 0m13.069s
  sys 0m8.399s

Refined version:
  real 0m45.904s
  user 0m16.874s
  sys 0m10.042s

Note:

Sometimes the mutated trace, or even the same trace, may produce a different crash summary (second-to-last line) that still indicates the same bug. For example, Bug 1910826 [1], which triggers a stack overflow, may output summaries like:

SUMMARY: AddressSanitizer: stack-overflow /home/qiuhao/hack/qemu/build/../softmmu/physmem.c:488 in flatview_do_translate

or

SUMMARY: AddressSanitizer: stack-overflow (/home/qiuhao/hack/qemu/build/qemu-system-i386+0x27ca049) in __asan_memcpy

Etc.

If we use the whole summary line as the token, we may be prevented from further minimization. So in this patch, we only use the first three words, which indicate the type of crash:

SUMMARY: AddressSanitizer: stack-overflow

[1] https://bugs.launchpad.net/qemu/+bug/1910826

Signed-off-by: Qiuhao Li
Reviewed-by: Alexander Bulekov
Tested-by: Alexander Bulekov
---
 scripts/oss-fuzz/minimize_qtest_trace.py | 42 +---
 1 file changed, 30 insertions(+), 12 deletions(-)

diff --git a/scripts/oss-fuzz/minimize_qtest_trace.py b/scripts/oss-fuzz/minimize_qtest_trace.py
index 5e405a0d5f..a28913a2a7 100755
--- a/scripts/oss-fuzz/minimize_qtest_trace.py
+++ b/scripts/oss-fuzz/minimize_qtest_trace.py
@@ -29,8 +29,14 @@ whether the crash occred. Optionally, manually set a string that idenitifes the crash by setting CRASH_TOKEN= """.format((sys.argv[0]))) +deduplication_note = """\n\ +Note: While trimming the input, sometimes the mutated trace triggers a different +type crash but indicates the same bug. 
Under this situation, our minimizer is +incapable of recognizing and stopped from removing it. In the future, we may +use a more sophisticated crash case deduplication method. +\n""" + def check_if_trace_crashes(trace, path): -global CRASH_TOKEN with open(path, "w") as tracefile: tracefile.write("".join(trace)) @@ -41,18 +47,31 @@ def check_if_trace_crashes(trace, path): trace_path=path), shell=True, stdin=subprocess.PIPE, - stdout=subprocess.PIPE) -stdo = rc.communicate()[0] -output = stdo.decode('unicode_escape') -if rc.returncode == 137:# Timed Out -return False -if len(output.splitlines()) < 2: -return False - + stdout=subprocess.PIPE, + encoding="utf-8") +global CRASH_TOKEN if CRASH_TOKEN is None: -CRASH_TOKEN = output.splitlines()[-2] +try: +outs, _ = rc.communicate(timeout=5) +CRASH_TOKEN = " ".join(outs.splitlines()[-2].split()[0:3]) +except subprocess.TimeoutExpired: +print("subprocess.TimeoutExpired") +return False +print("Identifying Crashes by this string: {}".format(CRASH_TOKEN)) +global deduplication_note +print(deduplication_note) +return True -return CRASH_TOKEN in output +for line in iter(rc.stdout.readline, ""): +if "CLOSED" in line: +return False +if CRASH_TOKEN in line: +return True + +print("\nWarning:") +print(" There is no 'CLOSED'or CRASH_TOKEN in the stdout of subprocess.") +print(" Usually this indicates a different type of crash.\n") +return False def minimize_trace(inpath, outpath): @@ -66,7 +85,6 @@ def minimize_trace(inpath, outpath): print("Crashed in {} seconds".format(end-start)) TIMEOUT = (end-start)*5 print("Setting the timeout for {} seconds".format(TIMEOUT)) -print("Identifying Crashes by this string: {}".format(CRASH_TOKEN)) i = 0 newtrace = trace[:] -- 2.25.1
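The token derivation and the early-exit scan from this patch can be sketched without spawning QEMU. This is a hedged illustration: `crash_token` and `trace_crashed` are hypothetical helper names, the two summary lines are taken from the commit message, and the trailing `==1==ABORTING` line and the `OK` output line are assumed placeholders for whatever follows or precedes the summary in real output.

```python
def crash_token(output):
    """Use the first three words of the second-to-last line as the token."""
    lines = output.splitlines()
    if len(lines) < 2:
        return None
    return " ".join(lines[-2].split()[0:3])


def trace_crashed(output_lines, token):
    """Scan output line by line: CLOSED means the input was consumed
    without a crash; the token means the crash was reproduced."""
    for line in output_lines:
        if "CLOSED" in line:
            return False
        if token in line:
            return True
    return False  # neither marker seen, possibly a different kind of crash


if __name__ == "__main__":
    out1 = ("SUMMARY: AddressSanitizer: stack-overflow "
            "/home/qiuhao/hack/qemu/build/../softmmu/physmem.c:488 "
            "in flatview_do_translate\n==1==ABORTING\n")
    out2 = ("SUMMARY: AddressSanitizer: stack-overflow "
            "(/home/qiuhao/hack/qemu/build/qemu-system-i386+0x27ca049) "
            "in __asan_memcpy\n==1==ABORTING\n")
    token = crash_token(out1)
    print(token)
    print(trace_crashed(out2.splitlines(), token))
    print(trace_crashed(["OK", "CLOSED"], token))
```

Matching on the shortened token lets both summaries count as the same crash, at the cost of also accepting unrelated bugs that happen to share the same crash type.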
[PATCH v8 0/7] fuzz: improve crash case minimization
Extend and refine the crash case minimization process.

Test input:
  Bug 1909261 full_reproducer
  6500 QTest instructions (write mostly)

Refined (-M1 minimization level) vs. Original version:
  real 38m31.942s <-- real 532m57.192s
  user 28m18.188s <-- user 89m0.536s
  sys 12m42.239s <-- sys 50m33.074s
  2558 instructions <-- 2846 instructions

Test Environment:
  i7-8550U, 16GB LPDDR3, SSD
  Ubuntu 20.04.1 5.4.0-58-generic x86_64
  Python 3.8.5

v8:
  Fix: [PATCH v7 1/7] misused the bytes type
  Add: [PATCH v7 1/7] warn when the CRASH_TOKEN cannot be found

v7:
  Fix: [PATCH v6 1/7] get stuck in crash detection

v6:
  Fix: add Reviewed-by and Tested-by tags

v5:
  Fix: send SIGKILL on timeout
  Fix: rename minimization functions

v4:
  Fix: messy diff in [PATCH v3 4/7]

v3:
  Fix: checkpatch.pl errors

v2:
  New: [PATCH v2 1/7]
  New: [PATCH v2 2/7]
  New: [PATCH v2 4/7]
  New: [PATCH v2 6/7]
  New: [PATCH v2 7/7]
  Fix: [PATCH 2/4] split using binary approach
  Fix: [PATCH 3/4] typo in comments
  Discard: [PATCH 1/4] the hardcoded regex match for crash detection
  Discard: [PATCH 4/4] the delaying minimizer

Thanks for the suggestions from:
  Alexander Bulekov

Qiuhao Li (7):
  fuzz: accelerate non-crash detection
  fuzz: double the IOs to remove for every loop
  fuzz: split write operand using binary approach
  fuzz: remove IO commands iteratively
  fuzz: set bits in operand of write/out to zero
  fuzz: add minimization options
  fuzz: heuristic split write based on past IOs

 scripts/oss-fuzz/minimize_qtest_trace.py | 260 +++
 1 file changed, 213 insertions(+), 47 deletions(-)

-- 
2.25.1
Re: [PATCH v16 00/20] Initial support for multi-process Qemu
I have a question, does this support/test on Windows? On Mon, Jan 11, 2021 at 1:08 PM Jagannathan Raman wrote: > > Hi > > This is the v16 of the patchset. Thank you for your time reviewing v15. > > This version has the following changes: > > [PATCH v16 04/20] multi-process: Add config option for multi-process QEMU > - Using “default_feature” value to enable/disable multiprocess > > [PATCH v16 07/20] io: add qio_channel_writev_full_all helper > - Removed local variable in qio_channel_writev_full_all(), setting arguments > directly > - Fixed indentation issues > - Updated commit message > > [PATCH v16 08/20] io: add qio_channel_readv_full_all_eof & > qio_channel_readv_full_all helpers > - Added two variants of readv - _full_all_eof & _full_all based on feedback > - Dropped errno return value > - Updated commit message > - Unable to remove local variables and set arguments directly as the > arguments are later needed for cleanup (g_free/close) during failure > > Switched to using OBJECT_DECLARE_{SIMPLE_TYPE, TYPE} macros in the > following patches: > - [PATCH v16 05/20] multi-process: setup PCI host bridge for remote device > - [PATCH v16 06/20] multi-process: setup a machine object for remote device > process > - [PATCH v16 11/20] multi-process: Associate fd of a PCIDevice with its object > - [PATCH v16 13/20] multi-process: introduce proxy object > > Updated copyright text to use the year 2021 in the files that show them. > > To touch upon the history of this project, we posted the Proof Of Concept > patches before the BoF session in 2018. Subsequently, we have posted 15 > versions on the qemu-devel mailing list. You can find them by following > the links below ([1] - [15]). 
Following people contributed to the design and > implementation of this project: > Jagannathan Raman > Elena Ufimtseva > John G Johnson > Stefan Hajnoczi > Konrad Wilk > Kanth Ghatraju > > We would like to thank the QEMU community for your feedback in the > design and implementation of this project. Qemu wiki page: > https://wiki.qemu.org/Features/MultiProcessQEMU > > For the full concept writeup about QEMU multi-process, please > refer to docs/devel/qemu-multiprocess.rst. Also, see > docs/qemu-multiprocess.txt for usage information. > > Thank you for reviewing this series! > > [POC]: https://www.mail-archive.com/qemu-devel@nongnu.org/msg566538.html > [1]: https://www.mail-archive.com/qemu-devel@nongnu.org/msg602285.html > [2]: https://www.mail-archive.com/qemu-devel@nongnu.org/msg624877.html > [3]: https://www.mail-archive.com/qemu-devel@nongnu.org/msg642000.html > [4]: https://www.mail-archive.com/qemu-devel@nongnu.org/msg655118.html > [5]: https://www.mail-archive.com/qemu-devel@nongnu.org/msg682429.html > [6]: https://www.mail-archive.com/qemu-devel@nongnu.org/msg697484.html > [7]: https://patchew.org/QEMU/cover.1593273671.git.elena.ufimts...@oracle.com/ > [8]: https://www.mail-archive.com/qemu-devel@nongnu.org/msg727007.html > [9]: https://www.mail-archive.com/qemu-devel@nongnu.org/msg734275.html > [10]: https://www.mail-archive.com/qemu-devel@nongnu.org/msg747638.html > [11]: https://www.mail-archive.com/qemu-devel@nongnu.org/msg750972.html > [12]: https://patchew.org/QEMU/cover.1606853298.git.jag.ra...@oracle.com/ > [13]: https://www.mail-archive.com/qemu-devel@nongnu.org/msg766825.html > [14]: https://www.mail-archive.com/qemu-devel@nongnu.org/msg768376.html > [15]: https://www.mail-archive.com/qemu-devel@nongnu.org/msg769178.html > > Elena Ufimtseva (8): > multi-process: add configure and usage information > io: add qio_channel_writev_full_all helper > io: add qio_channel_readv_full_all_eof & qio_channel_readv_full_all > helpers > multi-process: define 
MPQemuMsg format and transmission functions > multi-process: introduce proxy object > multi-process: add proxy communication functions > multi-process: Forward PCI config space acceses to the remote process > multi-process: perform device reset in the remote process > > Jagannathan Raman (11): > memory: alloc RAM from file at offset > multi-process: Add config option for multi-process QEMU > multi-process: setup PCI host bridge for remote device > multi-process: setup a machine object for remote device process > multi-process: Initialize message handler in remote device > multi-process: Associate fd of a PCIDevice with its object > multi-process: setup memory manager for remote device > multi-process: PCI BAR read/write handling for proxy & remote > endpoints > multi-process: Synchronize remote memory > multi-process: create IOHUB object to handle irq > multi-process: Retrieve PCI info from remote process > > John G Johnson (1): > multi-process: add the concept description to > docs/devel/qemu-multiprocess > > docs/devel/index.rst | 1 + > docs/devel/multi-process.rst | 966 ++ > docs/m
[PATCH v16 10/20] multi-process: Initialize message handler in remote device
Initializes the message handler function in the remote process. It is called whenever there's an event pending on QIOChannel that registers this function. Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman Reviewed-by: Stefan Hajnoczi --- include/hw/remote/machine.h | 9 +++ hw/remote/message.c | 57 + MAINTAINERS | 1 + hw/remote/meson.build | 1 + 4 files changed, 68 insertions(+) create mode 100644 hw/remote/message.c diff --git a/include/hw/remote/machine.h b/include/hw/remote/machine.h index bdfbca4..b92b2ce 100644 --- a/include/hw/remote/machine.h +++ b/include/hw/remote/machine.h @@ -14,6 +14,7 @@ #include "qom/object.h" #include "hw/boards.h" #include "hw/pci-host/remote.h" +#include "io/channel.h" struct RemoteMachineState { MachineState parent_obj; @@ -21,7 +22,15 @@ struct RemoteMachineState { RemotePCIHost *host; }; +/* Used to pass to co-routine device and ioc. */ +typedef struct RemoteCommDev { +PCIDevice *dev; +QIOChannel *ioc; +} RemoteCommDev; + #define TYPE_REMOTE_MACHINE "x-remote-machine" OBJECT_DECLARE_SIMPLE_TYPE(RemoteMachineState, REMOTE_MACHINE) +void coroutine_fn mpqemu_remote_msg_loop_co(void *data); + #endif diff --git a/hw/remote/message.c b/hw/remote/message.c new file mode 100644 index 000..36e2d4f --- /dev/null +++ b/hw/remote/message.c @@ -0,0 +1,57 @@ +/* + * Copyright © 2020, 2021 Oracle and/or its affiliates. + * + * This work is licensed under the terms of the GNU GPL-v2, version 2 or later. + * + * See the COPYING file in the top-level directory. 
+ * + */ + +#include "qemu/osdep.h" +#include "qemu-common.h" + +#include "hw/remote/machine.h" +#include "io/channel.h" +#include "hw/remote/mpqemu-link.h" +#include "qapi/error.h" +#include "sysemu/runstate.h" + +void coroutine_fn mpqemu_remote_msg_loop_co(void *data) +{ +g_autofree RemoteCommDev *com = (RemoteCommDev *)data; +PCIDevice *pci_dev = NULL; +Error *local_err = NULL; + +assert(com->ioc); + +pci_dev = com->dev; +for (; !local_err;) { +MPQemuMsg msg = {0}; + +if (!mpqemu_msg_recv(&msg, com->ioc, &local_err)) { +break; +} + +if (!mpqemu_msg_valid(&msg)) { +error_setg(&local_err, "Received invalid message from proxy" + "in remote process pid="FMT_pid"", + getpid()); +break; +} + +switch (msg.cmd) { +default: +error_setg(&local_err, + "Unknown command (%d) received for device %s" + " (pid="FMT_pid")", + msg.cmd, DEVICE(pci_dev)->id, getpid()); +} +} + +if (local_err) { +error_report_err(local_err); +qemu_system_shutdown_request(SHUTDOWN_CAUSE_HOST_ERROR); +} else { +qemu_system_shutdown_request(SHUTDOWN_CAUSE_GUEST_SHUTDOWN); +} +} diff --git a/MAINTAINERS b/MAINTAINERS index 3a297b6..14786c6 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -3186,6 +3186,7 @@ F: hw/remote/machine.c F: include/hw/remote/machine.h F: hw/remote/mpqemu-link.c F: include/hw/remote/mpqemu-link.h +F: hw/remote/message.c Build and test automation - diff --git a/hw/remote/meson.build b/hw/remote/meson.build index a2b2fc0..9f5c57f 100644 --- a/hw/remote/meson.build +++ b/hw/remote/meson.build @@ -2,5 +2,6 @@ remote_ss = ss.source_set() remote_ss.add(when: 'CONFIG_MULTIPROCESS', if_true: files('machine.c')) remote_ss.add(when: 'CONFIG_MULTIPROCESS', if_true: files('mpqemu-link.c')) +remote_ss.add(when: 'CONFIG_MULTIPROCESS', if_true: files('message.c')) softmmu_ss.add_all(when: 'CONFIG_MULTIPROCESS', if_true: remote_ss) -- 1.8.3.1
[PATCH v16 06/20] multi-process: setup a machine object for remote device process
x-remote-machine object sets up various subsystems of the remote device process. Instantiate PCI host bridge object and initialize RAM, IO & PCI memory regions. Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman Signed-off-by: Elena Ufimtseva Reviewed-by: Stefan Hajnoczi --- include/hw/pci-host/remote.h | 1 + include/hw/remote/machine.h | 27 + hw/remote/machine.c | 70 MAINTAINERS | 2 ++ hw/meson.build | 1 + hw/remote/meson.build| 5 6 files changed, 106 insertions(+) create mode 100644 include/hw/remote/machine.h create mode 100644 hw/remote/machine.c create mode 100644 hw/remote/meson.build diff --git a/include/hw/pci-host/remote.h b/include/hw/pci-host/remote.h index 06b8a83..3dcf6aa 100644 --- a/include/hw/pci-host/remote.h +++ b/include/hw/pci-host/remote.h @@ -24,6 +24,7 @@ struct RemotePCIHost { MemoryRegion *mr_pci_mem; MemoryRegion *mr_sys_io; +MemoryRegion *mr_sys_mem; }; #endif diff --git a/include/hw/remote/machine.h b/include/hw/remote/machine.h new file mode 100644 index 000..bdfbca4 --- /dev/null +++ b/include/hw/remote/machine.h @@ -0,0 +1,27 @@ +/* + * Remote machine configuration + * + * Copyright © 2018, 2021 Oracle and/or its affiliates. + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + * + */ + +#ifndef REMOTE_MACHINE_H +#define REMOTE_MACHINE_H + +#include "qom/object.h" +#include "hw/boards.h" +#include "hw/pci-host/remote.h" + +struct RemoteMachineState { +MachineState parent_obj; + +RemotePCIHost *host; +}; + +#define TYPE_REMOTE_MACHINE "x-remote-machine" +OBJECT_DECLARE_SIMPLE_TYPE(RemoteMachineState, REMOTE_MACHINE) + +#endif diff --git a/hw/remote/machine.c b/hw/remote/machine.c new file mode 100644 index 000..9519a6c --- /dev/null +++ b/hw/remote/machine.c @@ -0,0 +1,70 @@ +/* + * Machine for remote device + * + * This machine type is used by the remote device process in multi-process + * QEMU. 
QEMU device models depend on parent busses, interrupt controllers, + * memory regions, etc. The remote machine type offers this environment so + * that QEMU device models can be used as remote devices. + * + * Copyright © 2018, 2021 Oracle and/or its affiliates. + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + * + */ + +#include "qemu/osdep.h" +#include "qemu-common.h" + +#include "hw/remote/machine.h" +#include "exec/address-spaces.h" +#include "exec/memory.h" +#include "qapi/error.h" + +static void remote_machine_init(MachineState *machine) +{ +MemoryRegion *system_memory, *system_io, *pci_memory; +RemoteMachineState *s = REMOTE_MACHINE(machine); +RemotePCIHost *rem_host; + +system_memory = get_system_memory(); +system_io = get_system_io(); + +pci_memory = g_new(MemoryRegion, 1); +memory_region_init(pci_memory, NULL, "pci", UINT64_MAX); + +rem_host = REMOTE_PCIHOST(qdev_new(TYPE_REMOTE_PCIHOST)); + +rem_host->mr_pci_mem = pci_memory; +rem_host->mr_sys_mem = system_memory; +rem_host->mr_sys_io = system_io; + +s->host = rem_host; + +object_property_add_child(OBJECT(s), "remote-pcihost", OBJECT(rem_host)); +memory_region_add_subregion_overlap(system_memory, 0x0, pci_memory, -1); + +qdev_realize(DEVICE(rem_host), sysbus_get_default(), &error_fatal); +} + +static void remote_machine_class_init(ObjectClass *oc, void *data) +{ +MachineClass *mc = MACHINE_CLASS(oc); + +mc->init = remote_machine_init; +mc->desc = "Experimental remote machine"; +} + +static const TypeInfo remote_machine = { +.name = TYPE_REMOTE_MACHINE, +.parent = TYPE_MACHINE, +.instance_size = sizeof(RemoteMachineState), +.class_init = remote_machine_class_init, +}; + +static void remote_machine_register_types(void) +{ +type_register_static(&remote_machine); +} + +type_init(remote_machine_register_types); diff --git a/MAINTAINERS b/MAINTAINERS index c5dc042..e3c7d9f 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ 
-3182,6 +3182,8 @@ F: docs/devel/multi-process.rst F: docs/multi-process.rst F: hw/pci-host/remote.c F: include/hw/pci-host/remote.h +F: hw/remote/machine.c +F: include/hw/remote/machine.h Build and test automation - diff --git a/hw/meson.build b/hw/meson.build index 010de72..e615d72 100644 --- a/hw/meson.build +++ b/hw/meson.build @@ -56,6 +56,7 @@ subdir('moxie') subdir('nios2') subdir('openrisc') subdir('ppc') +subdir('remote') subdir('riscv') subdir('rx') subdir('s390x') diff --git a/hw/remote/meson.build b/hw/remote/meson.build new file mode 100644 index 000..197b038 --- /dev/null +++ b/hw/remote/meson.build @@ -0,0 +1,5 @@ +remote_ss = ss.source_set() + +remote_ss.add(when: 'CONFIG_MULTIPROCESS', if_true: files('machine.c')) + +s
[PATCH v16 07/20] io: add qio_channel_writev_full_all helper
From: Elena Ufimtseva Adds qio_channel_writev_full_all() to transmit both data and FDs. Refactors existing code to use this helper. Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman Reviewed-by: Stefan Hajnoczi Acked-by: Daniel P. Berrangé --- include/io/channel.h | 25 + io/channel.c | 15 ++- 2 files changed, 39 insertions(+), 1 deletion(-) diff --git a/include/io/channel.h b/include/io/channel.h index 4d6fe45..2a45fb5 100644 --- a/include/io/channel.h +++ b/include/io/channel.h @@ -774,4 +774,29 @@ void qio_channel_set_aio_fd_handler(QIOChannel *ioc, IOHandler *io_write, void *opaque); +/** + * qio_channel_writev_full_all: + * @ioc: the channel object + * @iov: the array of memory regions to write data from + * @niov: the length of the @iov array + * @fds: an array of file handles to send + * @nfds: number of file handles in @fds + * @errp: pointer to a NULL-initialized error object + * + * + * Behaves like qio_channel_writev_full but will attempt + * to send all data passed (file handles and memory regions). + * The function will wait for all requested data + * to be written, yielding from the current coroutine + * if required. 
+ * + * Returns: 0 if all bytes were written, or -1 on error + */ + +int qio_channel_writev_full_all(QIOChannel *ioc, +const struct iovec *iov, +size_t niov, +int *fds, size_t nfds, +Error **errp); + #endif /* QIO_CHANNEL_H */ diff --git a/io/channel.c b/io/channel.c index 93d449d..0d4b8b5 100644 --- a/io/channel.c +++ b/io/channel.c @@ -157,6 +157,15 @@ int qio_channel_writev_all(QIOChannel *ioc, size_t niov, Error **errp) { +return qio_channel_writev_full_all(ioc, iov, niov, NULL, 0, errp); +} + +int qio_channel_writev_full_all(QIOChannel *ioc, +const struct iovec *iov, +size_t niov, +int *fds, size_t nfds, +Error **errp) +{ int ret = -1; struct iovec *local_iov = g_new(struct iovec, niov); struct iovec *local_iov_head = local_iov; @@ -168,7 +177,8 @@ int qio_channel_writev_all(QIOChannel *ioc, while (nlocal_iov > 0) { ssize_t len; -len = qio_channel_writev(ioc, local_iov, nlocal_iov, errp); +len = qio_channel_writev_full(ioc, local_iov, nlocal_iov, fds, nfds, + errp); if (len == QIO_CHANNEL_ERR_BLOCK) { if (qemu_in_coroutine()) { qio_channel_yield(ioc, G_IO_OUT); @@ -182,6 +192,9 @@ int qio_channel_writev_all(QIOChannel *ioc, } iov_discard_front(&local_iov, &nlocal_iov, len); + +fds = NULL; +nfds = 0; } ret = 0; -- 1.8.3.1
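The loop in this helper passes the file descriptors only with the first write and clears them afterwards, so a partial write never sends them twice. The Python simulation below sketches that control flow only; it is not QEMU code. `FakeChannel` is a hypothetical stand-in for a QIOChannel that accepts at most `chunk` bytes per call, and the sketch assumes there is at least one byte of data to send.

```python
class FakeChannel:
    """Hypothetical channel that performs short writes (illustration only)."""

    def __init__(self, chunk):
        self.chunk = chunk
        self.data = bytearray()
        self.fds_seen = []               # fds passed on each writev_full call

    def writev_full(self, buffers, fds):
        self.fds_seen.append(list(fds))
        budget = self.chunk
        written = 0
        for buf in buffers:
            take = buf[:budget]
            self.data += take
            written += len(take)
            budget -= len(take)
            if budget == 0:
                break
        return written


def writev_full_all(channel, buffers, fds=()):
    """Keep writing until every buffer is sent; send fds only once."""
    pending = [bytes(b) for b in buffers if len(b) > 0]
    fds = list(fds)
    while pending:
        n = channel.writev_full(pending, fds)
        # Drop the n bytes that were written from the front of the list.
        while n > 0:
            if n >= len(pending[0]):
                n -= len(pending.pop(0))
            else:
                pending[0] = pending[0][n:]
                n = 0
        fds = []                         # never resend fds after the first call
    return 0


if __name__ == "__main__":
    ch = FakeChannel(chunk=3)
    writev_full_all(ch, [b"hello", b"world"], fds=[7, 8])
    print(bytes(ch.data), ch.fds_seen)
```

The front-trimming loop plays the role that `iov_discard_front()` has in the C code; error handling and coroutine yielding on `QIO_CHANNEL_ERR_BLOCK` are left out of this sketch.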
[PATCH v16 18/20] multi-process: create IOHUB object to handle irq
IOHUB object is added to manage PCI IRQs. It uses KVM_IRQFD ioctl to create irqfd to injecting PCI interrupts to the guest. IOHUB object forwards the irqfd to the remote process. Remote process uses this fd to directly send interrupts to the guest, bypassing QEMU. Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman Signed-off-by: Elena Ufimtseva Reviewed-by: Stefan Hajnoczi --- include/hw/pci/pci_ids.h| 3 + include/hw/remote/iohub.h | 42 ++ include/hw/remote/machine.h | 2 + include/hw/remote/mpqemu-link.h | 1 + include/hw/remote/proxy.h | 4 ++ hw/remote/iohub.c | 119 hw/remote/machine.c | 10 hw/remote/message.c | 4 ++ hw/remote/mpqemu-link.c | 5 ++ hw/remote/proxy.c | 56 +++ MAINTAINERS | 2 + hw/remote/meson.build | 1 + 12 files changed, 249 insertions(+) create mode 100644 include/hw/remote/iohub.h create mode 100644 hw/remote/iohub.c diff --git a/include/hw/pci/pci_ids.h b/include/hw/pci/pci_ids.h index 11f8ab7..bd0c17d 100644 --- a/include/hw/pci/pci_ids.h +++ b/include/hw/pci/pci_ids.h @@ -192,6 +192,9 @@ #define PCI_DEVICE_ID_SUN_SIMBA 0x5000 #define PCI_DEVICE_ID_SUN_SABRE 0xa000 +#define PCI_VENDOR_ID_ORACLE 0x108e +#define PCI_DEVICE_ID_REMOTE_IOHUB 0xb000 + #define PCI_VENDOR_ID_CMD0x1095 #define PCI_DEVICE_ID_CMD_6460x0646 diff --git a/include/hw/remote/iohub.h b/include/hw/remote/iohub.h new file mode 100644 index 000..0bf98e0 --- /dev/null +++ b/include/hw/remote/iohub.h @@ -0,0 +1,42 @@ +/* + * IO Hub for remote device + * + * Copyright © 2018, 2021 Oracle and/or its affiliates. + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. 
+ * + */ + +#ifndef REMOTE_IOHUB_H +#define REMOTE_IOHUB_H + +#include "hw/pci/pci.h" +#include "qemu/event_notifier.h" +#include "qemu/thread-posix.h" +#include "hw/remote/mpqemu-link.h" + +#define REMOTE_IOHUB_NB_PIRQSPCI_DEVFN_MAX + +typedef struct ResampleToken { +void *iohub; +int pirq; +} ResampleToken; + +typedef struct RemoteIOHubState { +PCIDevice d; +EventNotifier irqfds[REMOTE_IOHUB_NB_PIRQS]; +EventNotifier resamplefds[REMOTE_IOHUB_NB_PIRQS]; +unsigned int irq_level[REMOTE_IOHUB_NB_PIRQS]; +ResampleToken token[REMOTE_IOHUB_NB_PIRQS]; +QemuMutex irq_level_lock[REMOTE_IOHUB_NB_PIRQS]; +} RemoteIOHubState; + +int remote_iohub_map_irq(PCIDevice *pci_dev, int intx); +void remote_iohub_set_irq(void *opaque, int pirq, int level); +void process_set_irqfd_msg(PCIDevice *pci_dev, MPQemuMsg *msg); + +void remote_iohub_init(RemoteIOHubState *iohub); +void remote_iohub_finalize(RemoteIOHubState *iohub); + +#endif diff --git a/include/hw/remote/machine.h b/include/hw/remote/machine.h index b92b2ce..2a2a33c 100644 --- a/include/hw/remote/machine.h +++ b/include/hw/remote/machine.h @@ -15,11 +15,13 @@ #include "hw/boards.h" #include "hw/pci-host/remote.h" #include "io/channel.h" +#include "hw/remote/iohub.h" struct RemoteMachineState { MachineState parent_obj; RemotePCIHost *host; +RemoteIOHubState iohub; }; /* Used to pass to co-routine device and ioc. 
*/ diff --git a/include/hw/remote/mpqemu-link.h b/include/hw/remote/mpqemu-link.h index 6303e62..71d206f 100644 --- a/include/hw/remote/mpqemu-link.h +++ b/include/hw/remote/mpqemu-link.h @@ -39,6 +39,7 @@ typedef enum { MPQEMU_CMD_PCI_CFGREAD, MPQEMU_CMD_BAR_WRITE, MPQEMU_CMD_BAR_READ, +MPQEMU_CMD_SET_IRQFD, MPQEMU_CMD_MAX, } MPQemuCmd; diff --git a/include/hw/remote/proxy.h b/include/hw/remote/proxy.h index 12888b4..741def7 100644 --- a/include/hw/remote/proxy.h +++ b/include/hw/remote/proxy.h @@ -12,6 +12,7 @@ #include "hw/pci/pci.h" #include "io/channel.h" #include "hw/remote/proxy-memory-listener.h" +#include "qemu/event_notifier.h" #define TYPE_PCI_PROXY_DEV "x-pci-proxy-dev" OBJECT_DECLARE_SIMPLE_TYPE(PCIProxyDev, PCI_PROXY_DEV) @@ -38,6 +39,9 @@ struct PCIProxyDev { QIOChannel *ioc; Error *migration_blocker; ProxyMemoryListener proxy_listener; +int virq; +EventNotifier intr; +EventNotifier resample; ProxyMemoryRegion region[PCI_NUM_REGIONS]; }; diff --git a/hw/remote/iohub.c b/hw/remote/iohub.c new file mode 100644 index 000..e4ff131 --- /dev/null +++ b/hw/remote/iohub.c @@ -0,0 +1,119 @@ +/* + * Remote IO Hub + * + * Copyright © 2018, 2021 Oracle and/or its affiliates. + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + * + */ + +#include "qemu/osdep.h" +#include "qemu-common.h" + +#include "hw/pci/pci.h" +#include "hw/pci/pci_ids.h" +#include "hw/pci/pci_bus.h" +#include "qemu/thread.h" +#include "hw/boards
[PATCH v16 17/20] multi-process: Synchronize remote memory
Add ProxyMemoryListener object which is used to keep the view of the RAM in sync between QEMU and remote process. A MemoryListener is registered for system-memory AddressSpace. The listener sends SYNC_SYSMEM message to the remote process when memory listener commits the changes to memory, the remote process receives the message and processes it in the handler for SYNC_SYSMEM message. Signed-off-by: Jagannathan Raman Signed-off-by: John G Johnson Signed-off-by: Elena Ufimtseva Reviewed-by: Stefan Hajnoczi --- include/hw/remote/proxy-memory-listener.h | 28 include/hw/remote/proxy.h | 2 + hw/remote/message.c | 4 + hw/remote/proxy-memory-listener.c | 227 ++ hw/remote/proxy.c | 6 + MAINTAINERS | 2 + hw/remote/meson.build | 1 + 7 files changed, 270 insertions(+) create mode 100644 include/hw/remote/proxy-memory-listener.h create mode 100644 hw/remote/proxy-memory-listener.c diff --git a/include/hw/remote/proxy-memory-listener.h b/include/hw/remote/proxy-memory-listener.h new file mode 100644 index 000..c4f3efb --- /dev/null +++ b/include/hw/remote/proxy-memory-listener.h @@ -0,0 +1,28 @@ +/* + * Copyright © 2018, 2021 Oracle and/or its affiliates. + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. 
+ * + */ + +#ifndef PROXY_MEMORY_LISTENER_H +#define PROXY_MEMORY_LISTENER_H + +#include "exec/memory.h" +#include "io/channel.h" + +typedef struct ProxyMemoryListener { +MemoryListener listener; + +int n_mr_sections; +MemoryRegionSection *mr_sections; + +QIOChannel *ioc; +} ProxyMemoryListener; + +void proxy_memory_listener_configure(ProxyMemoryListener *proxy_listener, + QIOChannel *ioc); +void proxy_memory_listener_deconfigure(ProxyMemoryListener *proxy_listener); + +#endif diff --git a/include/hw/remote/proxy.h b/include/hw/remote/proxy.h index ea7fa4f..12888b4 100644 --- a/include/hw/remote/proxy.h +++ b/include/hw/remote/proxy.h @@ -11,6 +11,7 @@ #include "hw/pci/pci.h" #include "io/channel.h" +#include "hw/remote/proxy-memory-listener.h" #define TYPE_PCI_PROXY_DEV "x-pci-proxy-dev" OBJECT_DECLARE_SIMPLE_TYPE(PCIProxyDev, PCI_PROXY_DEV) @@ -36,6 +37,7 @@ struct PCIProxyDev { QemuMutex io_mutex; QIOChannel *ioc; Error *migration_blocker; +ProxyMemoryListener proxy_listener; ProxyMemoryRegion region[PCI_NUM_REGIONS]; }; diff --git a/hw/remote/message.c b/hw/remote/message.c index f2e8445..25341d8 100644 --- a/hw/remote/message.c +++ b/hw/remote/message.c @@ -17,6 +17,7 @@ #include "sysemu/runstate.h" #include "hw/pci/pci.h" #include "exec/memattrs.h" +#include "hw/remote/memory.h" static void process_config_write(QIOChannel *ioc, PCIDevice *dev, MPQemuMsg *msg, Error **errp); @@ -61,6 +62,9 @@ void coroutine_fn mpqemu_remote_msg_loop_co(void *data) case MPQEMU_CMD_BAR_READ: process_bar_read(com->ioc, &msg, &local_err); break; +case MPQEMU_CMD_SYNC_SYSMEM: +remote_sysmem_reconfig(&msg, &local_err); +break; default: error_setg(&local_err, "Unknown command (%d) received for device %s" diff --git a/hw/remote/proxy-memory-listener.c b/hw/remote/proxy-memory-listener.c new file mode 100644 index 000..af1fa6f --- /dev/null +++ b/hw/remote/proxy-memory-listener.c @@ -0,0 +1,227 @@ +/* + * Copyright © 2018, 2021 Oracle and/or its affiliates. 
+ * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + * + */ + +#include "qemu/osdep.h" +#include "qemu-common.h" + +#include "qemu/compiler.h" +#include "qemu/int128.h" +#include "qemu/range.h" +#include "exec/memory.h" +#include "exec/cpu-common.h" +#include "cpu.h" +#include "exec/ram_addr.h" +#include "exec/address-spaces.h" +#include "qapi/error.h" +#include "hw/remote/mpqemu-link.h" +#include "hw/remote/proxy-memory-listener.h" + +/* + * TODO: get_fd_from_hostaddr(), proxy_mrs_can_merge() and + * proxy_memory_listener_commit() defined below perform tasks similar to the + * functions defined in vhost-user.c. These functions are good candidates + * for refactoring. + * + */ + +static void proxy_memory_listener_reset(MemoryListener *listener) +{ +ProxyMemoryListener *proxy_listener = container_of(listener, + ProxyMemoryListener, + listener); +int mrs; + +for (mrs = 0; mrs < proxy_listener->n_mr_sections; mrs++) { +memory_region_unref(proxy_listener->mr_sections[mrs].mr); +} + +g_free(proxy_listener->mr_sections); +proxy_listener->mr_sections = NULL; +proxy_
[PATCH v16 19/20] multi-process: Retrieve PCI info from remote process
Retrieve PCI configuration info about the remote device and
configure the Proxy PCI object based on the returned information

Signed-off-by: Elena Ufimtseva
Signed-off-by: John G Johnson
Signed-off-by: Jagannathan Raman
Reviewed-by: Stefan Hajnoczi
---
 hw/remote/proxy.c | 84 +++
 1 file changed, 84 insertions(+)

diff --git a/hw/remote/proxy.c b/hw/remote/proxy.c
index 555b310..a082709 100644
--- a/hw/remote/proxy.c
+++ b/hw/remote/proxy.c
@@ -25,6 +25,8 @@
 #include "sysemu/kvm.h"
 #include "util/event_notifier-posix.c"
 
+static void probe_pci_info(PCIDevice *dev, Error **errp);
+
 static void proxy_intx_update(PCIDevice *pci_dev)
 {
     PCIProxyDev *dev = PCI_PROXY_DEV(pci_dev);
@@ -77,6 +79,7 @@ static void pci_proxy_dev_realize(PCIDevice *device, Error **errp)
 {
     ERRP_GUARD();
     PCIProxyDev *dev = PCI_PROXY_DEV(device);
+    uint8_t *pci_conf = device->config;
     int fd;
 
     if (!dev->fd) {
@@ -106,9 +109,14 @@ static void pci_proxy_dev_realize(PCIDevice *device, Error **errp)
     qemu_mutex_init(&dev->io_mutex);
     qio_channel_set_blocking(dev->ioc, true, NULL);
 
+    pci_conf[PCI_LATENCY_TIMER] = 0xff;
+    pci_conf[PCI_INTERRUPT_PIN] = 0x01;
+
     proxy_memory_listener_configure(&dev->proxy_listener, dev->ioc);
 
     setup_irqfd(dev);
+
+    probe_pci_info(PCI_DEVICE(dev), errp);
 }
 
 static void pci_proxy_dev_exit(PCIDevice *pdev)
@@ -274,3 +282,79 @@ const MemoryRegionOps proxy_mr_ops = {
         .max_access_size = 8,
     },
 };
+
+static void probe_pci_info(PCIDevice *dev, Error **errp)
+{
+    PCIDeviceClass *pc = PCI_DEVICE_GET_CLASS(dev);
+    uint32_t orig_val, new_val, base_class, val;
+    PCIProxyDev *pdev = PCI_PROXY_DEV(dev);
+    DeviceClass *dc = DEVICE_CLASS(pc);
+    uint8_t type;
+    int i, size;
+
+    config_op_send(pdev, PCI_VENDOR_ID, &val, 2, MPQEMU_CMD_PCI_CFGREAD);
+    pc->vendor_id = (uint16_t)val;
+
+    config_op_send(pdev, PCI_DEVICE_ID, &val, 2, MPQEMU_CMD_PCI_CFGREAD);
+    pc->device_id = (uint16_t)val;
+
+    config_op_send(pdev, PCI_CLASS_DEVICE, &val, 2, MPQEMU_CMD_PCI_CFGREAD);
+    pc->class_id = (uint16_t)val;
+
+    config_op_send(pdev, PCI_SUBSYSTEM_ID, &val, 2, MPQEMU_CMD_PCI_CFGREAD);
+    pc->subsystem_id = (uint16_t)val;
+
+    base_class = pc->class_id >> 4;
+    switch (base_class) {
+    case PCI_BASE_CLASS_BRIDGE:
+        set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
+        break;
+    case PCI_BASE_CLASS_STORAGE:
+        set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
+        break;
+    case PCI_BASE_CLASS_NETWORK:
+        set_bit(DEVICE_CATEGORY_NETWORK, dc->categories);
+        break;
+    case PCI_BASE_CLASS_INPUT:
+        set_bit(DEVICE_CATEGORY_INPUT, dc->categories);
+        break;
+    case PCI_BASE_CLASS_DISPLAY:
+        set_bit(DEVICE_CATEGORY_DISPLAY, dc->categories);
+        break;
+    case PCI_BASE_CLASS_PROCESSOR:
+        set_bit(DEVICE_CATEGORY_CPU, dc->categories);
+        break;
+    default:
+        set_bit(DEVICE_CATEGORY_MISC, dc->categories);
+        break;
+    }
+
+    for (i = 0; i < PCI_NUM_REGIONS; i++) {
+        config_op_send(pdev, PCI_BASE_ADDRESS_0 + (4 * i), &orig_val, 4,
+                       MPQEMU_CMD_PCI_CFGREAD);
+        new_val = 0xffffffff;
+        config_op_send(pdev, PCI_BASE_ADDRESS_0 + (4 * i), &new_val, 4,
+                       MPQEMU_CMD_PCI_CFGWRITE);
+        config_op_send(pdev, PCI_BASE_ADDRESS_0 + (4 * i), &new_val, 4,
+                       MPQEMU_CMD_PCI_CFGREAD);
+        size = (~(new_val & 0xFFFFFFF0)) + 1;
+        config_op_send(pdev, PCI_BASE_ADDRESS_0 + (4 * i), &orig_val, 4,
+                       MPQEMU_CMD_PCI_CFGWRITE);
+        type = (new_val & 0x1) ?
+                   PCI_BASE_ADDRESS_SPACE_IO : PCI_BASE_ADDRESS_SPACE_MEMORY;
+
+        if (size) {
+            g_autofree char *name;
+            pdev->region[i].dev = pdev;
+            pdev->region[i].present = true;
+            if (type == PCI_BASE_ADDRESS_SPACE_MEMORY) {
+                pdev->region[i].memory = true;
+            }
+            name = g_strdup_printf("bar-region-%d", i);
+            memory_region_init_io(&pdev->region[i].mr, OBJECT(pdev),
+                                  &proxy_mr_ops, &pdev->region[i],
+                                  name, size);
+            pci_register_bar(dev, i, type, &pdev->region[i].mr);
+        }
+    }
+}
-- 
1.8.3.1
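A note on the BAR loop in probe_pci_info(): it saves each BAR, writes all
ones, reads the value back, masks off the flag bits, and takes the two's
complement to obtain the region size. Below is a minimal standalone sketch of
that decode step, not the code from this patch. The helper name
bar_size_from_readback is hypothetical, and unlike the loop above, which
applies a single memory-style mask to every BAR, the sketch assumes the
two-bit flag mask for I/O BARs that the PCI specification describes.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch): decode the size of a 32-bit
 * BAR from the value read back after all ones were written to it.
 */
uint32_t bar_size_from_readback(uint32_t readback)
{
    uint32_t mask;

    if (readback == 0) {
        return 0;               /* BAR not implemented */
    }

    /* Bit 0 set means I/O space (2 flag bits); clear means memory (4). */
    mask = (readback & 0x1) ? 0xFFFFFFFCu : 0xFFFFFFF0u;

    /* Writable address bits read back as 1; invert and add 1 for the size. */
    return ~(readback & mask) + 1;
}
```

For example, a readback of 0xFFFFF000 decodes to a 4 KiB memory BAR, and a
readback of 0 is treated as "no region", which matches the if (size) guard in
the patch.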
[PATCH v16 14/20] multi-process: add proxy communication functions
From: Elena Ufimtseva

Signed-off-by: Elena Ufimtseva
Signed-off-by: Jagannathan Raman
Signed-off-by: John G Johnson
Reviewed-by: Stefan Hajnoczi
---
 include/hw/remote/mpqemu-link.h |  4
 hw/remote/mpqemu-link.c         | 34 ++
 2 files changed, 38 insertions(+)

diff --git a/include/hw/remote/mpqemu-link.h b/include/hw/remote/mpqemu-link.h
index 6ee5bc5..1b35d40 100644
--- a/include/hw/remote/mpqemu-link.h
+++ b/include/hw/remote/mpqemu-link.h
@@ -15,6 +15,8 @@
 #include "qemu/thread.h"
 #include "io/channel.h"
 #include "exec/hwaddr.h"
+#include "io/channel-socket.h"
+#include "hw/remote/proxy.h"
 
 #define REMOTE_MAX_FDS 8
 
@@ -68,6 +70,8 @@ typedef struct {
 bool mpqemu_msg_send(MPQemuMsg *msg, QIOChannel *ioc, Error **errp);
 bool mpqemu_msg_recv(MPQemuMsg *msg, QIOChannel *ioc, Error **errp);
 
+uint64_t mpqemu_msg_send_and_await_reply(MPQemuMsg *msg, PCIProxyDev *pdev,
+                                         Error **errp);
 bool mpqemu_msg_valid(MPQemuMsg *msg);
 
 #endif
diff --git a/hw/remote/mpqemu-link.c b/hw/remote/mpqemu-link.c
index 4b25649..88d1f9b 100644
--- a/hw/remote/mpqemu-link.c
+++ b/hw/remote/mpqemu-link.c
@@ -182,6 +182,40 @@ fail:
     return ret;
 }
 
+/*
+ * Send msg and wait for a reply with command code RET_MSG.
+ * Returns the message received of size u64 or UINT64_MAX
+ * on error.
+ * Called from VCPU thread in non-coroutine context.
+ * Used by the Proxy object to communicate to remote processes.
+ */
+uint64_t mpqemu_msg_send_and_await_reply(MPQemuMsg *msg, PCIProxyDev *pdev,
+                                         Error **errp)
+{
+    ERRP_GUARD();
+    MPQemuMsg msg_reply = {0};
+    uint64_t ret = UINT64_MAX;
+
+    assert(!qemu_in_coroutine());
+
+    QEMU_LOCK_GUARD(&pdev->io_mutex);
+    if (!mpqemu_msg_send(msg, pdev->ioc, errp)) {
+        return ret;
+    }
+
+    if (!mpqemu_msg_recv(&msg_reply, pdev->ioc, errp)) {
+        return ret;
+    }
+
+    if (!mpqemu_msg_valid(&msg_reply)) {
+        error_setg(errp, "ERROR: Invalid reply received for command %d",
+                   msg->cmd);
+        return ret;
+    }
+
+    return msg_reply.data.u64;
+}
+
 bool mpqemu_msg_valid(MPQemuMsg *msg)
 {
     if (msg->cmd >= MPQEMU_CMD_MAX && msg->cmd < 0) {
-- 
1.8.3.1
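One detail in the context lines of the last hunk: mpqemu_msg_valid() tests
msg->cmd >= MPQEMU_CMD_MAX && msg->cmd < 0, and no value can satisfy both
bounds at once, so that branch never rejects anything. Below is a minimal
sketch of a range check that does reject out-of-range commands, assuming that
is the intent. DemoCmd, DemoMsg and demo_cmd_in_range are hypothetical
stand-ins for illustration, not QEMU names.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the command enum; DEMO_CMD_MAX is the sentinel. */
typedef enum {
    DEMO_CMD_SYNC_SYSMEM,
    DEMO_CMD_RET,
    DEMO_CMD_MAX,
} DemoCmd;

/* Hypothetical stand-in for the message; only the command field matters. */
typedef struct {
    int cmd;
} DemoMsg;

/* A command is acceptable only if it lies in [0, DEMO_CMD_MAX). */
bool demo_cmd_in_range(const DemoMsg *msg)
{
    return msg->cmd >= 0 && msg->cmd < DEMO_CMD_MAX;
}
```

Written as a rejection test, the same condition is cmd < 0 || cmd >= MAX,
that is, the two bounds joined with "or" rather than "and".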
[PATCH v16 01/20] multi-process: add the concept description to docs/devel/qemu-multiprocess
From: John G Johnson

Signed-off-by: John G Johnson
Signed-off-by: Elena Ufimtseva
Signed-off-by: Jagannathan Raman
Reviewed-by: Stefan Hajnoczi
---
 docs/devel/index.rst         |   1 +
 docs/devel/multi-process.rst | 966 +++
 MAINTAINERS                  |   7 +
 3 files changed, 974 insertions(+)
 create mode 100644 docs/devel/multi-process.rst

diff --git a/docs/devel/index.rst b/docs/devel/index.rst
index ea0e1e1..5ccaf8b 100644
--- a/docs/devel/index.rst
+++ b/docs/devel/index.rst
@@ -36,3 +36,4 @@ Contents:
    clocks
    qom
    block-coroutine-wrapper
+   multi-process
diff --git a/docs/devel/multi-process.rst b/docs/devel/multi-process.rst
new file mode 100644
index 0000000..6969932
--- /dev/null
+++ b/docs/devel/multi-process.rst
@@ -0,0 +1,966 @@
+This is the design document for multi-process QEMU. It does not
+necessarily reflect the status of the current implementation, which
+may lack features or be considerably different from what is described
+in this document. This document is still useful as a description of
+the goals and general direction of this feature.
+
+Please refer to the following wiki for latest details:
+https://wiki.qemu.org/Features/MultiProcessQEMU
+
+Multi-process QEMU
+==================
+
+QEMU is often used as the hypervisor for virtual machines running in the
+Oracle cloud. Since one of the advantages of cloud computing is the
+ability to run many VMs from different tenants in the same cloud
+infrastructure, a guest that compromised its hypervisor could
+potentially use the hypervisor's access privileges to access data it is
+not authorized for.
+
+QEMU can be susceptible to security attacks because it is a large,
+monolithic program that provides many features to the VMs it services.
+Many of these features can be configured out of QEMU, but even a reduced
+configuration QEMU has a large amount of code a guest can potentially
+attack. Separating QEMU reduces the attack surface by aiding to
+limit each component in the system to only access the resources that
+it needs to perform its job.
+
+QEMU services
+-------------
+
+QEMU can be broadly described as providing three main services. One is a
+VM control point, where VMs can be created, migrated, re-configured, and
+destroyed. A second is to emulate the CPU instructions within the VM,
+often accelerated by HW virtualization features such as Intel's VT
+extensions. Finally, it provides IO services to the VM by emulating HW
+IO devices, such as disk and network devices.
+
+A multi-process QEMU
+~~~~~~~~~~~~~~~~~~~~
+
+A multi-process QEMU involves separating QEMU services into separate
+host processes. Each of these processes can be given only the privileges
+it needs to provide its service, e.g., a disk service could be given
+access only to the disk images it provides, and not be allowed to
+access other files, or any network devices. An attacker who compromised
+this service would not be able to use this exploit to access files or
+devices beyond what the disk service was given access to.
+
+A QEMU control process would remain, but in multi-process mode, will
+have no direct interfaces to the VM. During VM execution, it would still
+provide the user interface to hot-plug devices or live migrate the VM.
+
+A first step in creating a multi-process QEMU is to separate IO services
+from the main QEMU program, which would continue to provide CPU
+emulation. i.e., the control process would also be the CPU emulation
+process. In a later phase, CPU emulation could be separated from the
+control process.
+
+Separating IO services
+----------------------
+
+Separating IO services into individual host processes is a good place to
+begin for a couple of reasons. One is the sheer number of IO devices QEMU
+can emulate provides a large surface of interfaces which could potentially
+be exploited, and, indeed, have been a source of exploits in the past.
+Another is the modular nature of QEMU device emulation code provides
+interface points where the QEMU functions that perform device emulation
+can be separated from the QEMU functions that manage the emulation of
+guest CPU instructions. The devices emulated in the separate process are
+referred to as remote devices.
+
+QEMU device emulation
+~~~~~~~~~~~~~~~~~~~~~
+
+QEMU uses an object oriented SW architecture for device emulation code.
+Configured objects are all compiled into the QEMU binary, then objects
+are instantiated by name when used by the guest VM. For example, the
+code to emulate a device named "foo" is always present in QEMU, but its
+instantiation code is only run when the device is included in the target
+VM. (e.g., via the QEMU command line as *-device foo*)
+
+The object model is hierarchical, so device emulation code names its
+parent object (such as "pci-device" for a PCI device) and QEMU will
+instantiate a parent object before calling the device's instantiation
+code.
+
+Current separation models
+
[PATCH v16 20/20] multi-process: perform device reset in the remote process
From: Elena Ufimtseva

Perform device reset in the remote process when QEMU performs
device reset. This is required to reset the internal state
(like registers, etc...) of emulated devices

Signed-off-by: Elena Ufimtseva
Signed-off-by: John G Johnson
Signed-off-by: Jagannathan Raman
Reviewed-by: Stefan Hajnoczi
---
 include/hw/remote/mpqemu-link.h |  1 +
 hw/remote/message.c             | 22 ++
 hw/remote/proxy.c               | 19 +++
 3 files changed, 42 insertions(+)

diff --git a/include/hw/remote/mpqemu-link.h b/include/hw/remote/mpqemu-link.h
index 71d206f..4ec0915 100644
--- a/include/hw/remote/mpqemu-link.h
+++ b/include/hw/remote/mpqemu-link.h
@@ -40,6 +40,7 @@ typedef enum {
     MPQEMU_CMD_BAR_WRITE,
     MPQEMU_CMD_BAR_READ,
     MPQEMU_CMD_SET_IRQFD,
+    MPQEMU_CMD_DEVICE_RESET,
     MPQEMU_CMD_MAX,
 } MPQemuCmd;
 
diff --git a/hw/remote/message.c b/hw/remote/message.c
index adab040..11d7298 100644
--- a/hw/remote/message.c
+++ b/hw/remote/message.c
@@ -19,6 +19,7 @@
 #include "exec/memattrs.h"
 #include "hw/remote/memory.h"
 #include "hw/remote/iohub.h"
+#include "sysemu/reset.h"
 
 static void process_config_write(QIOChannel *ioc, PCIDevice *dev,
                                  MPQemuMsg *msg, Error **errp);
@@ -26,6 +27,8 @@ static void process_config_read(QIOChannel *ioc, PCIDevice *dev,
                                 MPQemuMsg *msg, Error **errp);
 static void process_bar_write(QIOChannel *ioc, MPQemuMsg *msg, Error **errp);
 static void process_bar_read(QIOChannel *ioc, MPQemuMsg *msg, Error **errp);
+static void process_device_reset_msg(QIOChannel *ioc, PCIDevice *dev,
+                                     Error **errp);
 
 void coroutine_fn mpqemu_remote_msg_loop_co(void *data)
 {
@@ -69,6 +72,9 @@ void coroutine_fn mpqemu_remote_msg_loop_co(void *data)
         case MPQEMU_CMD_SET_IRQFD:
             process_set_irqfd_msg(pci_dev, &msg);
             break;
+        case MPQEMU_CMD_DEVICE_RESET:
+            process_device_reset_msg(com->ioc, pci_dev, &local_err);
+            break;
         default:
             error_setg(&local_err,
                        "Unknown command (%d) received for device %s"
@@ -206,3 +212,19 @@ fail:
                       getpid());
     }
 }
+
+static void process_device_reset_msg(QIOChannel *ioc, PCIDevice *dev,
+                                     Error **errp)
+{
+    DeviceClass *dc = DEVICE_GET_CLASS(dev);
+    DeviceState *s = DEVICE(dev);
+    MPQemuMsg ret = { 0 };
+
+    if (dc->reset) {
+        dc->reset(s);
+    }
+
+    ret.cmd = MPQEMU_CMD_RET;
+
+    mpqemu_msg_send(&ret, ioc, errp);
+}
diff --git a/hw/remote/proxy.c b/hw/remote/proxy.c
index a082709..4fa4be0 100644
--- a/hw/remote/proxy.c
+++ b/hw/remote/proxy.c
@@ -26,6 +26,7 @@
 #include "util/event_notifier-posix.c"
 
 static void probe_pci_info(PCIDevice *dev, Error **errp);
+static void proxy_device_reset(DeviceState *dev);
 
 static void proxy_intx_update(PCIDevice *pci_dev)
 {
@@ -202,6 +203,8 @@ static void pci_proxy_dev_class_init(ObjectClass *klass, void *data)
     k->config_read = pci_proxy_read_config;
     k->config_write = pci_proxy_write_config;
 
+    dc->reset = proxy_device_reset;
+
     device_class_set_props(dc, proxy_properties);
 }
 
@@ -358,3 +361,19 @@ static void probe_pci_info(PCIDevice *dev, Error **errp)
         }
     }
 }
+
+static void proxy_device_reset(DeviceState *dev)
+{
+    PCIProxyDev *pdev = PCI_PROXY_DEV(dev);
+    MPQemuMsg msg = { 0 };
+    Error *local_err = NULL;
+
+    msg.cmd = MPQEMU_CMD_DEVICE_RESET;
+    msg.size = 0;
+
+    mpqemu_msg_send_and_await_reply(&msg, pdev, &local_err);
+    if (local_err) {
+        error_report_err(local_err);
+    }
+
+}
-- 
1.8.3.1
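The reset path added by this patch is a plain request/reply exchange: the
proxy sends DEVICE_RESET with an empty payload and waits for the remote to
answer with RET, and the remote runs the device's reset hook only if one is
set. Below is a self-contained sketch of that handshake with the channel
replaced by a direct function call. All Demo* names are hypothetical and none
of this is QEMU API; it only illustrates the control flow.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical command codes standing in for the MPQEMU_CMD_* values. */
typedef enum {
    DEMO_CMD_DEVICE_RESET,
    DEMO_CMD_RET,
} DemoCmd;

typedef struct {
    DemoCmd cmd;
    size_t size;    /* payload size; a reset request carries none */
} DemoMsg;

typedef struct DemoDevice {
    int reg;                                /* stand-in for device state */
    void (*reset)(struct DemoDevice *dev);  /* optional reset hook */
} DemoDevice;

/* Example reset hook: return the register to its power-on value. */
void demo_reset(DemoDevice *dev)
{
    dev->reg = 0;
}

/* Remote side: run the reset hook if one is set, then acknowledge. */
DemoMsg demo_remote_dispatch(DemoDevice *dev, const DemoMsg *req)
{
    DemoMsg reply = { .cmd = DEMO_CMD_RET, .size = 0 };

    if (req->cmd == DEMO_CMD_DEVICE_RESET && dev->reset) {
        dev->reset(dev);
    }
    return reply;
}

/* Proxy side: send the request and check for the acknowledgement. */
bool demo_proxy_reset(DemoDevice *remote_dev)
{
    DemoMsg req = { .cmd = DEMO_CMD_DEVICE_RESET, .size = 0 };
    DemoMsg reply = demo_remote_dispatch(remote_dev, &req);

    return reply.cmd == DEMO_CMD_RET;
}
```

A device without a reset hook is still acknowledged, mirroring the
if (dc->reset) guard in process_device_reset_msg().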
[PATCH v16 16/20] multi-process: PCI BAR read/write handling for proxy & remote endpoints
Proxy device object implements handler for PCI BAR writes and reads. The handler uses BAR_WRITE/BAR_READ message to communicate to the remote process with the BAR address and value to be written/read. The remote process implements handler for BAR_WRITE/BAR_READ message. Signed-off-by: Jagannathan Raman Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Reviewed-by: Stefan Hajnoczi --- include/hw/remote/mpqemu-link.h | 10 + include/hw/remote/proxy.h | 9 + hw/remote/message.c | 83 + hw/remote/mpqemu-link.c | 6 +++ hw/remote/proxy.c | 60 + 5 files changed, 168 insertions(+) diff --git a/include/hw/remote/mpqemu-link.h b/include/hw/remote/mpqemu-link.h index 7bc0bdd..6303e62 100644 --- a/include/hw/remote/mpqemu-link.h +++ b/include/hw/remote/mpqemu-link.h @@ -37,6 +37,8 @@ typedef enum { MPQEMU_CMD_RET, MPQEMU_CMD_PCI_CFGWRITE, MPQEMU_CMD_PCI_CFGREAD, +MPQEMU_CMD_BAR_WRITE, +MPQEMU_CMD_BAR_READ, MPQEMU_CMD_MAX, } MPQemuCmd; @@ -52,6 +54,13 @@ typedef struct { int len; } PciConfDataMsg; +typedef struct { +hwaddr addr; +uint64_t val; +unsigned size; +bool memory; +} BarAccessMsg; + /** * MPQemuMsg: * @cmd: The remote command @@ -71,6 +80,7 @@ typedef struct { uint64_t u64; PciConfDataMsg pci_conf_data; SyncSysmemMsg sync_sysmem; +BarAccessMsg bar_access; } data; int fds[REMOTE_MAX_FDS]; diff --git a/include/hw/remote/proxy.h b/include/hw/remote/proxy.h index faa9c4d..ea7fa4f 100644 --- a/include/hw/remote/proxy.h +++ b/include/hw/remote/proxy.h @@ -15,6 +15,14 @@ #define TYPE_PCI_PROXY_DEV "x-pci-proxy-dev" OBJECT_DECLARE_SIMPLE_TYPE(PCIProxyDev, PCI_PROXY_DEV) +typedef struct ProxyMemoryRegion { +PCIProxyDev *dev; +MemoryRegion mr; +bool memory; +bool present; +uint8_t type; +} ProxyMemoryRegion; + struct PCIProxyDev { PCIDevice parent_dev; char *fd; @@ -28,6 +36,7 @@ struct PCIProxyDev { QemuMutex io_mutex; QIOChannel *ioc; Error *migration_blocker; +ProxyMemoryRegion region[PCI_NUM_REGIONS]; }; #endif /* PROXY_H */ diff --git a/hw/remote/message.c 
b/hw/remote/message.c index 636bd16..f2e8445 100644 --- a/hw/remote/message.c +++ b/hw/remote/message.c @@ -16,11 +16,14 @@ #include "qapi/error.h" #include "sysemu/runstate.h" #include "hw/pci/pci.h" +#include "exec/memattrs.h" static void process_config_write(QIOChannel *ioc, PCIDevice *dev, MPQemuMsg *msg, Error **errp); static void process_config_read(QIOChannel *ioc, PCIDevice *dev, MPQemuMsg *msg, Error **errp); +static void process_bar_write(QIOChannel *ioc, MPQemuMsg *msg, Error **errp); +static void process_bar_read(QIOChannel *ioc, MPQemuMsg *msg, Error **errp); void coroutine_fn mpqemu_remote_msg_loop_co(void *data) { @@ -52,6 +55,12 @@ void coroutine_fn mpqemu_remote_msg_loop_co(void *data) case MPQEMU_CMD_PCI_CFGREAD: process_config_read(com->ioc, pci_dev, &msg, &local_err); break; +case MPQEMU_CMD_BAR_WRITE: +process_bar_write(com->ioc, &msg, &local_err); +break; +case MPQEMU_CMD_BAR_READ: +process_bar_read(com->ioc, &msg, &local_err); +break; default: error_setg(&local_err, "Unknown command (%d) received for device %s" @@ -115,3 +124,77 @@ static void process_config_read(QIOChannel *ioc, PCIDevice *dev, getpid()); } } + +static void process_bar_write(QIOChannel *ioc, MPQemuMsg *msg, Error **errp) +{ +ERRP_GUARD(); +BarAccessMsg *bar_access = &msg->data.bar_access; +AddressSpace *as = +bar_access->memory ? 
&address_space_memory : &address_space_io; +MPQemuMsg ret = { 0 }; +MemTxResult res; +uint64_t val; + +if (!is_power_of_2(bar_access->size) || + (bar_access->size > sizeof(uint64_t))) { +ret.data.u64 = UINT64_MAX; +goto fail; +} + +val = cpu_to_le64(bar_access->val); + +res = address_space_rw(as, bar_access->addr, MEMTXATTRS_UNSPECIFIED, + (void *)&val, bar_access->size, true); + +if (res != MEMTX_OK) { +error_setg(errp, "Bad address %"PRIx64" for mem write, pid "FMT_pid".", + bar_access->addr, getpid()); +ret.data.u64 = -1; +} + +fail: +ret.cmd = MPQEMU_CMD_RET; +ret.size = sizeof(ret.data.u64); + +if (!mpqemu_msg_send(&ret, ioc, NULL)) { +error_prepend(errp, "Error returning code to proxy, pid "FMT_pid": ", + getpid()); +} +} + +static void process_bar_read(QIOChannel *ioc, MPQemuMsg *msg, Error **errp) +{ +ERRP_GUARD(); +BarAccessMsg *bar_access = &msg->data.bar_access; +MPQemuMsg ret = { 0 }; +AddressSp
[PATCH v16 11/20] multi-process: Associate fd of a PCIDevice with its object
Associate the file descriptor for a PCIDevice in remote process with DeviceState object. Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman Reviewed-by: Stefan Hajnoczi --- hw/remote/remote-obj.c | 203 + MAINTAINERS| 1 + hw/remote/meson.build | 1 + 3 files changed, 205 insertions(+) create mode 100644 hw/remote/remote-obj.c diff --git a/hw/remote/remote-obj.c b/hw/remote/remote-obj.c new file mode 100644 index 000..4f21254 --- /dev/null +++ b/hw/remote/remote-obj.c @@ -0,0 +1,203 @@ +/* + * Copyright © 2020, 2021 Oracle and/or its affiliates. + * + * This work is licensed under the terms of the GNU GPL-v2, version 2 or later. + * + * See the COPYING file in the top-level directory. + * + */ + +#include "qemu/osdep.h" +#include "qemu-common.h" + +#include "qemu/error-report.h" +#include "qemu/notify.h" +#include "qom/object_interfaces.h" +#include "hw/qdev-core.h" +#include "io/channel.h" +#include "hw/qdev-core.h" +#include "hw/remote/machine.h" +#include "io/channel-util.h" +#include "qapi/error.h" +#include "sysemu/sysemu.h" +#include "hw/pci/pci.h" +#include "qemu/sockets.h" +#include "monitor/monitor.h" + +#define TYPE_REMOTE_OBJECT "x-remote-object" +OBJECT_DECLARE_TYPE(RemoteObject, RemoteObjectClass, REMOTE_OBJECT) + +struct RemoteObjectClass { +ObjectClass parent_class; + +unsigned int nr_devs; +unsigned int max_devs; +}; + +struct RemoteObject { +/* private */ +Object parent; + +Notifier machine_done; + +int32_t fd; +char *devid; + +QIOChannel *ioc; + +DeviceState *dev; +DeviceListener listener; +}; + +static void remote_object_set_fd(Object *obj, const char *str, Error **errp) +{ +RemoteObject *o = REMOTE_OBJECT(obj); +int fd = -1; + +fd = monitor_fd_param(monitor_cur(), str, errp); +if (fd == -1) { +error_prepend(errp, "Could not parse remote object fd %s:", str); +return; +} + +if (!fd_is_socket(fd)) { +error_setg(errp, "File descriptor '%s' is not a socket", str); +close(fd); +return; +} + +o->fd = fd; +} + 
+static void remote_object_set_devid(Object *obj, const char *str, Error **errp) +{ +RemoteObject *o = REMOTE_OBJECT(obj); + +g_free(o->devid); + +o->devid = g_strdup(str); +} + +static void remote_object_unrealize_listener(DeviceListener *listener, + DeviceState *dev) +{ +RemoteObject *o = container_of(listener, RemoteObject, listener); + +if (o->dev == dev) { +object_unref(OBJECT(o)); +} +} + +static void remote_object_machine_done(Notifier *notifier, void *data) +{ +RemoteObject *o = container_of(notifier, RemoteObject, machine_done); +DeviceState *dev = NULL; +QIOChannel *ioc = NULL; +Coroutine *co = NULL; +RemoteCommDev *comdev = NULL; +Error *err = NULL; + +dev = qdev_find_recursive(sysbus_get_default(), o->devid); +if (!dev || !object_dynamic_cast(OBJECT(dev), TYPE_PCI_DEVICE)) { +error_report("%s is not a PCI device", o->devid); +return; +} + +ioc = qio_channel_new_fd(o->fd, &err); +if (!ioc) { +error_report_err(err); +return; +} +qio_channel_set_blocking(ioc, false, NULL); + +o->dev = dev; + +o->listener.unrealize = remote_object_unrealize_listener; +device_listener_register(&o->listener); + +/* co-routine should free this. 
*/ +comdev = g_new0(RemoteCommDev, 1); +*comdev = (RemoteCommDev) { +.ioc = ioc, +.dev = PCI_DEVICE(dev), +}; + +co = qemu_coroutine_create(mpqemu_remote_msg_loop_co, comdev); +qemu_coroutine_enter(co); +} + +static void remote_object_init(Object *obj) +{ +RemoteObjectClass *k = REMOTE_OBJECT_GET_CLASS(obj); +RemoteObject *o = REMOTE_OBJECT(obj); + +if (k->nr_devs >= k->max_devs) { +error_report("Reached maximum number of devices: %u", k->max_devs); +return; +} + +o->ioc = NULL; +o->fd = -1; +o->devid = NULL; + +k->nr_devs++; + +o->machine_done.notify = remote_object_machine_done; +qemu_add_machine_init_done_notifier(&o->machine_done); +} + +static void remote_object_finalize(Object *obj) +{ +RemoteObjectClass *k = REMOTE_OBJECT_GET_CLASS(obj); +RemoteObject *o = REMOTE_OBJECT(obj); + +device_listener_unregister(&o->listener); + +if (o->ioc) { +qio_channel_shutdown(o->ioc, QIO_CHANNEL_SHUTDOWN_BOTH, NULL); +qio_channel_close(o->ioc, NULL); +} + +object_unref(OBJECT(o->ioc)); + +k->nr_devs--; +g_free(o->devid); +} + +static void remote_object_class_init(ObjectClass *klass, void *data) +{ +RemoteObjectClass *k = REMOTE_OBJECT_CLASS(klass); + +/* + * Limit number of supported devices to 1. This is done to avoid devices + * from one VM accessing the RAM of another VM. This is done until we + * start us
[PATCH v16 09/20] multi-process: define MPQemuMsg format and transmission functions
From: Elena Ufimtseva Defines MPQemuMsg, which is the message that is sent to the remote process. This message is sent over QIOChannel and is used to command the remote process to perform various tasks. Define transmission functions used by proxy and by remote. Signed-off-by: Jagannathan Raman Signed-off-by: John G Johnson Signed-off-by: Elena Ufimtseva Reviewed-by: Stefan Hajnoczi --- meson.build | 1 + hw/remote/trace.h | 1 + include/hw/remote/mpqemu-link.h | 63 include/sysemu/iothread.h | 6 ++ hw/remote/mpqemu-link.c | 205 iothread.c | 6 ++ MAINTAINERS | 2 + hw/remote/meson.build | 1 + hw/remote/trace-events | 4 + 9 files changed, 289 insertions(+) create mode 100644 hw/remote/trace.h create mode 100644 include/hw/remote/mpqemu-link.h create mode 100644 hw/remote/mpqemu-link.c create mode 100644 hw/remote/trace-events diff --git a/meson.build b/meson.build index 5e27bb5..dd387b5 100644 --- a/meson.build +++ b/meson.build @@ -1736,6 +1736,7 @@ if have_system 'net', 'softmmu', 'ui', +'hw/remote', ] endif trace_events_subdirs += [ diff --git a/hw/remote/trace.h b/hw/remote/trace.h new file mode 100644 index 000..5d5e3ac --- /dev/null +++ b/hw/remote/trace.h @@ -0,0 +1 @@ +#include "trace/trace-hw_remote.h" diff --git a/include/hw/remote/mpqemu-link.h b/include/hw/remote/mpqemu-link.h new file mode 100644 index 000..cac699c --- /dev/null +++ b/include/hw/remote/mpqemu-link.h @@ -0,0 +1,63 @@ +/* + * Communication channel between QEMU and remote device process + * + * Copyright © 2018, 2021 Oracle and/or its affiliates. + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. 
+ * + */ + +#ifndef MPQEMU_LINK_H +#define MPQEMU_LINK_H + +#include "qom/object.h" +#include "qemu/thread.h" +#include "io/channel.h" + +#define REMOTE_MAX_FDS 8 + +#define MPQEMU_MSG_HDR_SIZE offsetof(MPQemuMsg, data.u64) + +/** + * MPQemuCmd: + * + * MPQemuCmd enum type to specify the command to be executed on the remote + * device. + * + * This uses a private protocol between QEMU and the remote process. vfio-user + * protocol would supersede this in the future. + * + */ +typedef enum { +MPQEMU_CMD_MAX, +} MPQemuCmd; + +/** + * MPQemuMsg: + * @cmd: The remote command + * @size: Size of the data to be shared + * @data: Structured data + * @fds: File descriptors to be shared with remote device + * + * MPQemuMsg Format of the message sent to the remote device from QEMU. + * + */ +typedef struct { +int cmd; +size_t size; + +union { +uint64_t u64; +} data; + +int fds[REMOTE_MAX_FDS]; +int num_fds; +} MPQemuMsg; + +bool mpqemu_msg_send(MPQemuMsg *msg, QIOChannel *ioc, Error **errp); +bool mpqemu_msg_recv(MPQemuMsg *msg, QIOChannel *ioc, Error **errp); + +bool mpqemu_msg_valid(MPQemuMsg *msg); + +#endif diff --git a/include/sysemu/iothread.h b/include/sysemu/iothread.h index 0c5284d..f177142 100644 --- a/include/sysemu/iothread.h +++ b/include/sysemu/iothread.h @@ -57,4 +57,10 @@ IOThread *iothread_create(const char *id, Error **errp); void iothread_stop(IOThread *iothread); void iothread_destroy(IOThread *iothread); +/* + * Returns true if executing withing IOThread context, + * false otherwise. + */ +bool qemu_in_iothread(void); + #endif /* IOTHREAD_H */ diff --git a/hw/remote/mpqemu-link.c b/hw/remote/mpqemu-link.c new file mode 100644 index 000..b3d380e --- /dev/null +++ b/hw/remote/mpqemu-link.c @@ -0,0 +1,205 @@ +/* + * Communication channel between QEMU and remote device process + * + * Copyright © 2018, 2021 Oracle and/or its affiliates. + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. 
+ * See the COPYING file in the top-level directory. + * + */ + +#include "qemu/osdep.h" +#include "qemu-common.h" + +#include "qemu/module.h" +#include "hw/remote/mpqemu-link.h" +#include "qapi/error.h" +#include "qemu/iov.h" +#include "qemu/error-report.h" +#include "qemu/main-loop.h" +#include "io/channel.h" +#include "sysemu/iothread.h" +#include "trace.h" + +/* + * Send message over the ioc QIOChannel. + * This function is safe to call from: + * - main loop in co-routine context. Will block the main loop if not in + * co-routine context; + * - vCPU thread with no co-routine context and if the channel is not part + * of the main loop handling; + * - IOThread within co-routine context, outside of co-routine context + * will block IOThread; + * Returns true if no errors were encountered, false otherwise. + */ +bool mpqemu_msg_send(MPQemuMsg *msg, QIOChannel *ioc, Error **errp) +{ +ERRP_GUARD(); +bool iolock = qemu_mutex_iothread_locked(); +bool iothread = qemu_in_iothread(); +struct iovec send[2] = {0}; +int *fds = NULL; +size_t nfds = 0; +
[PATCH v16 15/20] multi-process: Forward PCI config space acceses to the remote process
From: Elena Ufimtseva The Proxy Object sends the PCI config space accesses as messages to the remote process over the communication channel Signed-off-by: Elena Ufimtseva Signed-off-by: Jagannathan Raman Signed-off-by: John G Johnson Reviewed-by: Stefan Hajnoczi --- include/hw/remote/mpqemu-link.h | 10 +++ hw/remote/message.c | 60 + hw/remote/mpqemu-link.c | 8 +- hw/remote/proxy.c | 55 + 4 files changed, 132 insertions(+), 1 deletion(-) diff --git a/include/hw/remote/mpqemu-link.h b/include/hw/remote/mpqemu-link.h index 1b35d40..7bc0bdd 100644 --- a/include/hw/remote/mpqemu-link.h +++ b/include/hw/remote/mpqemu-link.h @@ -34,6 +34,9 @@ */ typedef enum { MPQEMU_CMD_SYNC_SYSMEM, +MPQEMU_CMD_RET, +MPQEMU_CMD_PCI_CFGWRITE, +MPQEMU_CMD_PCI_CFGREAD, MPQEMU_CMD_MAX, } MPQemuCmd; @@ -43,6 +46,12 @@ typedef struct { off_t offsets[REMOTE_MAX_FDS]; } SyncSysmemMsg; +typedef struct { +uint32_t addr; +uint32_t val; +int len; +} PciConfDataMsg; + /** * MPQemuMsg: * @cmd: The remote command @@ -60,6 +69,7 @@ typedef struct { union { uint64_t u64; +PciConfDataMsg pci_conf_data; SyncSysmemMsg sync_sysmem; } data; diff --git a/hw/remote/message.c b/hw/remote/message.c index 36e2d4f..636bd16 100644 --- a/hw/remote/message.c +++ b/hw/remote/message.c @@ -15,6 +15,12 @@ #include "hw/remote/mpqemu-link.h" #include "qapi/error.h" #include "sysemu/runstate.h" +#include "hw/pci/pci.h" + +static void process_config_write(QIOChannel *ioc, PCIDevice *dev, + MPQemuMsg *msg, Error **errp); +static void process_config_read(QIOChannel *ioc, PCIDevice *dev, +MPQemuMsg *msg, Error **errp); void coroutine_fn mpqemu_remote_msg_loop_co(void *data) { @@ -40,6 +46,12 @@ void coroutine_fn mpqemu_remote_msg_loop_co(void *data) } switch (msg.cmd) { +case MPQEMU_CMD_PCI_CFGWRITE: +process_config_write(com->ioc, pci_dev, &msg, &local_err); +break; +case MPQEMU_CMD_PCI_CFGREAD: +process_config_read(com->ioc, pci_dev, &msg, &local_err); +break; default: error_setg(&local_err, "Unknown command (%d) received for 
device %s" @@ -55,3 +67,51 @@ void coroutine_fn mpqemu_remote_msg_loop_co(void *data) qemu_system_shutdown_request(SHUTDOWN_CAUSE_GUEST_SHUTDOWN); } } + +static void process_config_write(QIOChannel *ioc, PCIDevice *dev, + MPQemuMsg *msg, Error **errp) +{ +ERRP_GUARD(); +PciConfDataMsg *conf = (PciConfDataMsg *)&msg->data.pci_conf_data; +MPQemuMsg ret = { 0 }; + +if ((conf->addr + sizeof(conf->val)) > pci_config_size(dev)) { +error_setg(errp, "Bad address for PCI config write, pid "FMT_pid".", + getpid()); +ret.data.u64 = UINT64_MAX; +} else { +pci_default_write_config(dev, conf->addr, conf->val, conf->len); +} + +ret.cmd = MPQEMU_CMD_RET; +ret.size = sizeof(ret.data.u64); + +if (!mpqemu_msg_send(&ret, ioc, NULL)) { +error_prepend(errp, "Error returning code to proxy, pid "FMT_pid": ", + getpid()); +} +} + +static void process_config_read(QIOChannel *ioc, PCIDevice *dev, +MPQemuMsg *msg, Error **errp) +{ +ERRP_GUARD(); +PciConfDataMsg *conf = (PciConfDataMsg *)&msg->data.pci_conf_data; +MPQemuMsg ret = { 0 }; + +if ((conf->addr + sizeof(conf->val)) > pci_config_size(dev)) { +error_setg(errp, "Bad address for PCI config read, pid "FMT_pid".", + getpid()); +ret.data.u64 = UINT64_MAX; +} else { +ret.data.u64 = pci_default_read_config(dev, conf->addr, conf->len); +} + +ret.cmd = MPQEMU_CMD_RET; +ret.size = sizeof(ret.data.u64); + +if (!mpqemu_msg_send(&ret, ioc, NULL)) { +error_prepend(errp, "Error returning code to proxy, pid "FMT_pid": ", + getpid()); +} +} diff --git a/hw/remote/mpqemu-link.c b/hw/remote/mpqemu-link.c index 88d1f9b..5bd6a9d 100644 --- a/hw/remote/mpqemu-link.c +++ b/hw/remote/mpqemu-link.c @@ -207,7 +207,7 @@ uint64_t mpqemu_msg_send_and_await_reply(MPQemuMsg *msg, PCIProxyDev *pdev, return ret; } -if (!mpqemu_msg_valid(&msg_reply)) { +if (!mpqemu_msg_valid(&msg_reply) || msg_reply.cmd != MPQEMU_CMD_RET) { error_setg(errp, "ERROR: Invalid reply received for command %d", msg->cmd); return ret; @@ -242,6 +242,12 @@ bool mpqemu_msg_valid(MPQemuMsg *msg) 
return false; } break; +case MPQEMU_CMD_PCI_CFGWRITE: +case MPQEMU_CMD_PCI_CFGREAD: +if (msg->size != sizeof(PciConfDataMsg)) { +return false; +
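The bounds check in process_config_write() and process_config_read() above rejects an access whose end would fall outside the device's config space. A minimal standalone sketch of that check follows. DemoPciConfMsg and demo_conf_access_ok() are hypothetical names; like the patch, the sketch uses sizeof(val) rather than len for the access width, and widening the sum to 64 bits is an extra precaution added here, not something taken from the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of PciConfDataMsg from the patch. */
typedef struct {
    uint32_t addr;
    uint32_t val;
    int len;
} DemoPciConfMsg;

/*
 * Return true when a sizeof(val)-wide access starting at addr stays inside
 * a config space of config_size bytes. The sum is computed in 64 bits so
 * that a very large addr cannot wrap around.
 */
static bool demo_conf_access_ok(const DemoPciConfMsg *conf, size_t config_size)
{
    uint64_t end = (uint64_t)conf->addr + sizeof(conf->val);

    return end <= config_size;
}
```

For a conventional 256-byte config space, addr 252 is the last offset this check accepts and addr 253 is rejected; a PCI Express device with a 4096-byte space accepts up to addr 4092.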
[PATCH v16 05/20] multi-process: setup PCI host bridge for remote device
PCI host bridge is setup for the remote device process. It is implemented using remote-pcihost object. It is an extension of the PCI host bridge setup by QEMU. Remote-pcihost configures a PCI bus which could be used by the remote PCI device to latch on to. Signed-off-by: Jagannathan Raman Signed-off-by: John G Johnson Signed-off-by: Elena Ufimtseva Reviewed-by: Stefan Hajnoczi --- include/hw/pci-host/remote.h | 29 + hw/pci-host/remote.c | 75 MAINTAINERS | 2 ++ hw/pci-host/Kconfig | 3 ++ hw/pci-host/meson.build | 1 + hw/remote/Kconfig| 1 + 6 files changed, 111 insertions(+) create mode 100644 include/hw/pci-host/remote.h create mode 100644 hw/pci-host/remote.c diff --git a/include/hw/pci-host/remote.h b/include/hw/pci-host/remote.h new file mode 100644 index 000..06b8a83 --- /dev/null +++ b/include/hw/pci-host/remote.h @@ -0,0 +1,29 @@ +/* + * PCI Host for remote device + * + * Copyright © 2018, 2021 Oracle and/or its affiliates. + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + * + */ + +#ifndef REMOTE_PCIHOST_H +#define REMOTE_PCIHOST_H + +#include "exec/memory.h" +#include "hw/pci/pcie_host.h" + +#define TYPE_REMOTE_PCIHOST "remote-pcihost" +OBJECT_DECLARE_SIMPLE_TYPE(RemotePCIHost, REMOTE_PCIHOST) + +struct RemotePCIHost { +/*< private >*/ +PCIExpressHost parent_obj; +/*< public >*/ + +MemoryRegion *mr_pci_mem; +MemoryRegion *mr_sys_io; +}; + +#endif diff --git a/hw/pci-host/remote.c b/hw/pci-host/remote.c new file mode 100644 index 000..eee4544 --- /dev/null +++ b/hw/pci-host/remote.c @@ -0,0 +1,75 @@ +/* + * Remote PCI host device + * + * Unlike PCI host devices that model physical hardware, the purpose + * of this PCI host is to host multi-process QEMU devices. + * + * Multi-process QEMU extends the PCI host of a QEMU machine into a + * remote process. Any PCI device attached to the remote process is + * visible in the QEMU guest. 
This allows existing QEMU device models + * to be reused in the remote process. + * + * This PCI host is purely a container for PCI devices. It's fake in the + * sense that the guest never sees this PCI host and has no way of + * accessing it. Its job is just to provide the environment that QEMU + * PCI device models need when running in a remote process. + * + * Copyright © 2018, 2021 Oracle and/or its affiliates. + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + * + */ + +#include "qemu/osdep.h" +#include "qemu-common.h" + +#include "hw/pci/pci.h" +#include "hw/pci/pci_host.h" +#include "hw/pci/pcie_host.h" +#include "hw/qdev-properties.h" +#include "hw/pci-host/remote.h" +#include "exec/memory.h" + +static const char *remote_pcihost_root_bus_path(PCIHostState *host_bridge, +PCIBus *rootbus) +{ +return ":00"; +} + +static void remote_pcihost_realize(DeviceState *dev, Error **errp) +{ +PCIHostState *pci = PCI_HOST_BRIDGE(dev); +RemotePCIHost *s = REMOTE_PCIHOST(dev); + +pci->bus = pci_root_bus_new(DEVICE(s), "remote-pci", +s->mr_pci_mem, s->mr_sys_io, +0, TYPE_PCIE_BUS); +} + +static void remote_pcihost_class_init(ObjectClass *klass, void *data) +{ +DeviceClass *dc = DEVICE_CLASS(klass); +PCIHostBridgeClass *hc = PCI_HOST_BRIDGE_CLASS(klass); + +hc->root_bus_path = remote_pcihost_root_bus_path; +dc->realize = remote_pcihost_realize; + +dc->user_creatable = false; +set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories); +dc->fw_name = "pci"; +} + +static const TypeInfo remote_pcihost_info = { +.name = TYPE_REMOTE_PCIHOST, +.parent = TYPE_PCIE_HOST_BRIDGE, +.instance_size = sizeof(RemotePCIHost), +.class_init = remote_pcihost_class_init, +}; + +static void remote_pcihost_register(void) +{ +type_register_static(&remote_pcihost_info); +} + +type_init(remote_pcihost_register) diff --git a/MAINTAINERS b/MAINTAINERS index da7f735..c5dc042 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -3180,6 
+3180,8 @@ M: John G Johnson S: Maintained F: docs/devel/multi-process.rst F: docs/multi-process.rst +F: hw/pci-host/remote.c +F: include/hw/pci-host/remote.h Build and test automation - diff --git a/hw/pci-host/Kconfig b/hw/pci-host/Kconfig index eb03f04..8b8c763 100644 --- a/hw/pci-host/Kconfig +++ b/hw/pci-host/Kconfig @@ -65,3 +65,6 @@ config PCI_POWERNV select PCI_EXPRESS select MSI_NONBROKEN select PCIE_PORT + +config REMOTE_PCIHOST +bool diff --git a/hw/pci-host/meson.build b/hw/pci-host/meson.build index da9d1a9..1847c69 100644 --- a/hw/pci-host/meson.build +++ b/hw/pci-host/meson.build @@ -9,6 +9,7 @@ pci_ss.add(when: 'CONFIG_PCI_EXPRESS
[PATCH v16 13/20] multi-process: introduce proxy object
From: Elena Ufimtseva Defines a PCI Device proxy object as a child of TYPE_PCI_DEVICE. Signed-off-by: Elena Ufimtseva Signed-off-by: Jagannathan Raman Signed-off-by: John G Johnson Reviewed-by: Stefan Hajnoczi --- include/hw/remote/proxy.h | 33 hw/remote/proxy.c | 99 +++ MAINTAINERS | 2 + hw/remote/meson.build | 1 + 4 files changed, 135 insertions(+) create mode 100644 include/hw/remote/proxy.h create mode 100644 hw/remote/proxy.c diff --git a/include/hw/remote/proxy.h b/include/hw/remote/proxy.h new file mode 100644 index 000..faa9c4d --- /dev/null +++ b/include/hw/remote/proxy.h @@ -0,0 +1,33 @@ +/* + * Copyright © 2018, 2021 Oracle and/or its affiliates. + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + * + */ + +#ifndef PROXY_H +#define PROXY_H + +#include "hw/pci/pci.h" +#include "io/channel.h" + +#define TYPE_PCI_PROXY_DEV "x-pci-proxy-dev" +OBJECT_DECLARE_SIMPLE_TYPE(PCIProxyDev, PCI_PROXY_DEV) + +struct PCIProxyDev { +PCIDevice parent_dev; +char *fd; + +/* + * Mutex used to protect the QIOChannel fd from + * the concurrent access by the VCPUs since proxy + * blocks while awaiting for the replies from the + * process remote. + */ +QemuMutex io_mutex; +QIOChannel *ioc; +Error *migration_blocker; +}; + +#endif /* PROXY_H */ diff --git a/hw/remote/proxy.c b/hw/remote/proxy.c new file mode 100644 index 000..cd5b071 --- /dev/null +++ b/hw/remote/proxy.c @@ -0,0 +1,99 @@ +/* + * Copyright © 2018, 2021 Oracle and/or its affiliates. + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. 
+ * + */ + +#include "qemu/osdep.h" +#include "qemu-common.h" + +#include "hw/remote/proxy.h" +#include "hw/pci/pci.h" +#include "qapi/error.h" +#include "io/channel-util.h" +#include "hw/qdev-properties.h" +#include "monitor/monitor.h" +#include "migration/blocker.h" +#include "qemu/sockets.h" + +static void pci_proxy_dev_realize(PCIDevice *device, Error **errp) +{ +ERRP_GUARD(); +PCIProxyDev *dev = PCI_PROXY_DEV(device); +int fd; + +if (!dev->fd) { +error_setg(errp, "fd parameter not specified for %s", + DEVICE(device)->id); +return; +} + +fd = monitor_fd_param(monitor_cur(), dev->fd, errp); +if (fd == -1) { +error_prepend(errp, "proxy: unable to parse fd %s: ", dev->fd); +return; +} + +if (!fd_is_socket(fd)) { +error_setg(errp, "proxy: fd %d is not a socket", fd); +close(fd); +return; +} + +dev->ioc = qio_channel_new_fd(fd, errp); + +error_setg(&dev->migration_blocker, "%s does not support migration", + TYPE_PCI_PROXY_DEV); +migrate_add_blocker(dev->migration_blocker, errp); + +qemu_mutex_init(&dev->io_mutex); +qio_channel_set_blocking(dev->ioc, true, NULL); +} + +static void pci_proxy_dev_exit(PCIDevice *pdev) +{ +PCIProxyDev *dev = PCI_PROXY_DEV(pdev); + +if (dev->ioc) { +qio_channel_close(dev->ioc, NULL); +} + +migrate_del_blocker(dev->migration_blocker); + +error_free(dev->migration_blocker); +} + +static Property proxy_properties[] = { +DEFINE_PROP_STRING("fd", PCIProxyDev, fd), +DEFINE_PROP_END_OF_LIST(), +}; + +static void pci_proxy_dev_class_init(ObjectClass *klass, void *data) +{ +DeviceClass *dc = DEVICE_CLASS(klass); +PCIDeviceClass *k = PCI_DEVICE_CLASS(klass); + +k->realize = pci_proxy_dev_realize; +k->exit = pci_proxy_dev_exit; +device_class_set_props(dc, proxy_properties); +} + +static const TypeInfo pci_proxy_dev_type_info = { +.name = TYPE_PCI_PROXY_DEV, +.parent= TYPE_PCI_DEVICE, +.instance_size = sizeof(PCIProxyDev), +.class_init= pci_proxy_dev_class_init, +.interfaces = (InterfaceInfo[]) { +{ INTERFACE_CONVENTIONAL_PCI_DEVICE }, +{ }, +}, +}; 
+ +static void pci_proxy_dev_register_types(void) +{ +type_register_static(&pci_proxy_dev_type_info); +} + +type_init(pci_proxy_dev_register_types) diff --git a/MAINTAINERS b/MAINTAINERS index 539a8be..389287c 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -3190,6 +3190,8 @@ F: hw/remote/message.c F: hw/remote/remote-obj.c F: include/hw/remote/memory.h F: hw/remote/memory.c +F: hw/remote/proxy.c +F: include/hw/remote/proxy.h Build and test automation - diff --git a/hw/remote/meson.build b/hw/remote/meson.build index 64da16c..569cd20 100644 --- a/hw/remote/meson.build +++ b/hw/remote/meson.build @@ -4,6 +4,7 @@ remote_ss.add(when: 'CONFIG_MULTIPROCESS', if_true: files('machine.c')) remote_ss.add(when: 'CONFIG_MULTIPROCESS', if_true: files('mpqemu-link.c')) remote_ss.add(when: 'CONFIG_MULTIPROCESS', if_true: files('message.c')) remote_ss.add(when: 'CON
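pci_proxy_dev_realize() above refuses an fd that is not a socket and closes it before returning. The sketch below shows one common way to make that decision in plain POSIX C by asking for the socket type. It is an illustration only and not necessarily how QEMU's fd_is_socket() is implemented; demo_fd_is_socket() and demo_accept_proxy_fd() are hypothetical names.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Return 1 if fd refers to a socket, 0 otherwise. getsockopt() fails with
 * ENOTSOCK for non-socket descriptors and with EBADF for invalid ones.
 */
static int demo_fd_is_socket(int fd)
{
    int type;
    socklen_t len = sizeof(type);

    return getsockopt(fd, SOL_SOCKET, SO_TYPE, &type, &len) == 0;
}

/*
 * Mirror of the realize-time policy: keep the fd if it is a socket,
 * otherwise close it and report failure with -1.
 */
static int demo_accept_proxy_fd(int fd)
{
    if (!demo_fd_is_socket(fd)) {
        close(fd);
        return -1;
    }
    return fd;
}
```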
[PATCH v16 12/20] multi-process: setup memory manager for remote device
SyncSysMemMsg message format is defined. It is used to send file descriptors of the RAM regions to remote device. RAM on the remote device is configured with a set of file descriptors. Old RAM regions are deleted and new regions, each with an fd, is added to the RAM. Signed-off-by: Jagannathan Raman Signed-off-by: John G Johnson Signed-off-by: Elena Ufimtseva Reviewed-by: Stefan Hajnoczi --- include/hw/remote/memory.h | 19 include/hw/remote/mpqemu-link.h | 10 +++ hw/remote/memory.c | 65 + hw/remote/mpqemu-link.c | 11 +++ MAINTAINERS | 2 ++ hw/remote/meson.build | 2 ++ 6 files changed, 109 insertions(+) create mode 100644 include/hw/remote/memory.h create mode 100644 hw/remote/memory.c diff --git a/include/hw/remote/memory.h b/include/hw/remote/memory.h new file mode 100644 index 000..bc2e309 --- /dev/null +++ b/include/hw/remote/memory.h @@ -0,0 +1,19 @@ +/* + * Memory manager for remote device + * + * Copyright © 2018, 2021 Oracle and/or its affiliates. + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. 
+ * + */ + +#ifndef REMOTE_MEMORY_H +#define REMOTE_MEMORY_H + +#include "exec/hwaddr.h" +#include "hw/remote/mpqemu-link.h" + +void remote_sysmem_reconfig(MPQemuMsg *msg, Error **errp); + +#endif diff --git a/include/hw/remote/mpqemu-link.h b/include/hw/remote/mpqemu-link.h index cac699c..6ee5bc5 100644 --- a/include/hw/remote/mpqemu-link.h +++ b/include/hw/remote/mpqemu-link.h @@ -14,6 +14,7 @@ #include "qom/object.h" #include "qemu/thread.h" #include "io/channel.h" +#include "exec/hwaddr.h" #define REMOTE_MAX_FDS 8 @@ -30,9 +31,16 @@ * */ typedef enum { +MPQEMU_CMD_SYNC_SYSMEM, MPQEMU_CMD_MAX, } MPQemuCmd; +typedef struct { +hwaddr gpas[REMOTE_MAX_FDS]; +uint64_t sizes[REMOTE_MAX_FDS]; +off_t offsets[REMOTE_MAX_FDS]; +} SyncSysmemMsg; + /** * MPQemuMsg: * @cmd: The remote command @@ -43,12 +51,14 @@ typedef enum { * MPQemuMsg Format of the message sent to the remote device from QEMU. * */ + typedef struct { int cmd; size_t size; union { uint64_t u64; +SyncSysmemMsg sync_sysmem; } data; int fds[REMOTE_MAX_FDS]; diff --git a/hw/remote/memory.c b/hw/remote/memory.c new file mode 100644 index 000..32085b1 --- /dev/null +++ b/hw/remote/memory.c @@ -0,0 +1,65 @@ +/* + * Memory manager for remote device + * + * Copyright © 2018, 2021 Oracle and/or its affiliates. + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. 
+ * + */ + +#include "qemu/osdep.h" +#include "qemu-common.h" + +#include "hw/remote/memory.h" +#include "exec/address-spaces.h" +#include "exec/ram_addr.h" +#include "qapi/error.h" + +static void remote_sysmem_reset(void) +{ +MemoryRegion *sysmem, *subregion, *next; + +sysmem = get_system_memory(); + +QTAILQ_FOREACH_SAFE(subregion, &sysmem->subregions, subregions_link, next) { +if (subregion->ram) { +memory_region_del_subregion(sysmem, subregion); +object_unparent(OBJECT(subregion)); +} +} +} + +void remote_sysmem_reconfig(MPQemuMsg *msg, Error **errp) +{ +ERRP_GUARD(); +SyncSysmemMsg *sysmem_info = &msg->data.sync_sysmem; +MemoryRegion *sysmem, *subregion; +static unsigned int suffix; +int region; + +sysmem = get_system_memory(); + +remote_sysmem_reset(); + +for (region = 0; region < msg->num_fds; region++) { +g_autofree char *name; +subregion = g_new(MemoryRegion, 1); +name = g_strdup_printf("remote-mem-%u", suffix++); +memory_region_init_ram_from_fd(subregion, NULL, + name, sysmem_info->sizes[region], + true, msg->fds[region], + sysmem_info->offsets[region], + errp); + +if (*errp) { +g_free(subregion); +remote_sysmem_reset(); +return; +} + +memory_region_add_subregion(sysmem, sysmem_info->gpas[region], +subregion); + +} +} diff --git a/hw/remote/mpqemu-link.c b/hw/remote/mpqemu-link.c index b3d380e..4b25649 100644 --- a/hw/remote/mpqemu-link.c +++ b/hw/remote/mpqemu-link.c @@ -201,5 +201,16 @@ bool mpqemu_msg_valid(MPQemuMsg *msg) } } + /* Verify message specific fields. */ +switch (msg->cmd) { +case MPQEMU_CMD_SYNC_SYSMEM: +if (msg->num_fds == 0 || msg->size != sizeof(SyncSysmemMsg)) { +return false; +} +break; +default: +break; +} + return true; } diff --git a/MAINTAINERS b/MAINTAINERS index 6ee7a8f..539a8be 100644 --- a/MAINTAINERS +++ b/MAINTAINE
[PATCH v16 03/20] memory: alloc RAM from file at offset
Allow RAM MemoryRegion to be created from an offset in a file, instead of allocating at offset of 0 by default. This is needed to synchronize RAM between QEMU & remote process. Signed-off-by: Jagannathan Raman Signed-off-by: John G Johnson Signed-off-by: Elena Ufimtseva Reviewed-by: Stefan Hajnoczi --- include/exec/memory.h | 2 ++ include/exec/ram_addr.h | 2 +- include/qemu/mmap-alloc.h | 4 +++- backends/hostmem-memfd.c | 2 +- hw/misc/ivshmem.c | 3 ++- softmmu/memory.c | 3 ++- softmmu/physmem.c | 11 +++ util/mmap-alloc.c | 7 --- util/oslib-posix.c| 2 +- 9 files changed, 23 insertions(+), 13 deletions(-) diff --git a/include/exec/memory.h b/include/exec/memory.h index 521d990..a9d2b66 100644 --- a/include/exec/memory.h +++ b/include/exec/memory.h @@ -990,6 +990,7 @@ void memory_region_init_ram_from_file(MemoryRegion *mr, * @size: size of the region. * @share: %true if memory must be mmaped with the MAP_SHARED flag * @fd: the fd to mmap. + * @offset: offset within the file referenced by fd * @errp: pointer to Error*, to store an error if it happens. 
* * Note that this function does not do anything to cause the data in the @@ -1001,6 +1002,7 @@ void memory_region_init_ram_from_fd(MemoryRegion *mr, uint64_t size, bool share, int fd, +ram_addr_t offset, Error **errp); #endif diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h index c6d2ef1..d465a48 100644 --- a/include/exec/ram_addr.h +++ b/include/exec/ram_addr.h @@ -121,7 +121,7 @@ RAMBlock *qemu_ram_alloc_from_file(ram_addr_t size, MemoryRegion *mr, Error **errp); RAMBlock *qemu_ram_alloc_from_fd(ram_addr_t size, MemoryRegion *mr, uint32_t ram_flags, int fd, - Error **errp); + off_t offset, Error **errp); RAMBlock *qemu_ram_alloc_from_ptr(ram_addr_t size, void *host, MemoryRegion *mr, Error **errp); diff --git a/include/qemu/mmap-alloc.h b/include/qemu/mmap-alloc.h index e786266..b096ffb 100644 --- a/include/qemu/mmap-alloc.h +++ b/include/qemu/mmap-alloc.h @@ -16,6 +16,7 @@ size_t qemu_mempath_getpagesize(const char *mem_path); * otherwise, the alignment in use will be determined by QEMU. * @shared: map has RAM_SHARED flag. * @is_pmem: map has RAM_PMEM flag. + * @map_offset: map starts at offset of map_offset from the start of fd * * Return: * On success, return a pointer to the mapped area. 
@@ -25,7 +26,8 @@ void *qemu_ram_mmap(int fd, size_t size, size_t align, bool shared, -bool is_pmem); +bool is_pmem, +off_t map_offset); void qemu_ram_munmap(int fd, void *ptr, size_t size); diff --git a/backends/hostmem-memfd.c b/backends/hostmem-memfd.c index e5626d4..69b0ae3 100644 --- a/backends/hostmem-memfd.c +++ b/backends/hostmem-memfd.c @@ -55,7 +55,7 @@ memfd_backend_memory_alloc(HostMemoryBackend *backend, Error **errp) name = host_memory_backend_get_name(backend); memory_region_init_ram_from_fd(&backend->mr, OBJECT(backend), name, backend->size, - backend->share, fd, errp); + backend->share, fd, 0, errp); g_free(name); } diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c index 0505b52..603e992 100644 --- a/hw/misc/ivshmem.c +++ b/hw/misc/ivshmem.c @@ -495,7 +495,8 @@ static void process_msg_shmem(IVShmemState *s, int fd, Error **errp) /* mmap the region and map into the BAR2 */ memory_region_init_ram_from_fd(&s->server_bar2, OBJECT(s), - "ivshmem.bar2", size, true, fd, &local_err); + "ivshmem.bar2", size, true, fd, 0, + &local_err); if (local_err) { error_propagate(errp, local_err); return; diff --git a/softmmu/memory.c b/softmmu/memory.c index 333e1ed..fa65f45 100644 --- a/softmmu/memory.c +++ b/softmmu/memory.c @@ -1609,6 +1609,7 @@ void memory_region_init_ram_from_fd(MemoryRegion *mr, uint64_t size, bool share, int fd, +ram_addr_t offset, Error **errp) { Error *err = NULL; @@ -1618,7 +1619,7 @@ void memory_region_init_ram_from_fd(MemoryRegion *mr, mr->destructor = memory_region_destructor_ram; mr->ram_block = qemu_ram_alloc_from_fd(size, mr, share ? R
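The new offset argument threaded through the functions above ends up as the file offset of an mmap() call. As a minimal standalone sketch of what mapping a file at an offset involves, assuming plain POSIX and none of QEMU's alignment or guard-page handling: demo_map_at_offset() is a hypothetical helper, and rejecting unaligned offsets up front is a choice made for this sketch.

```c
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Map size bytes of fd starting at offset. mmap() requires the file offset
 * to be a multiple of the page size, so unaligned offsets are rejected here.
 * Returns MAP_FAILED on error.
 */
static void *demo_map_at_offset(int fd, size_t size, off_t offset, int shared)
{
    long page = sysconf(_SC_PAGESIZE);

    if (page <= 0 || offset < 0 || (offset % page) != 0) {
        return MAP_FAILED;
    }
    return mmap(NULL, size, PROT_READ | PROT_WRITE,
                shared ? MAP_SHARED : MAP_PRIVATE, fd, offset);
}
```

Two processes that map the same fd with MAP_SHARED at the same offset see the same pages, which is the property the commit message relies on for synchronizing RAM between QEMU and the remote process.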
[PATCH v16 08/20] io: add qio_channel_readv_full_all_eof & qio_channel_readv_full_all helpers
From: Elena Ufimtseva Adds qio_channel_readv_full_all_eof() and qio_channel_readv_full_all() to read both data and FDs. Refactors existing code to use these helpers. Signed-off-by: Elena Ufimtseva Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman --- include/io/channel.h | 51 io/channel.c | 73 2 files changed, 107 insertions(+), 17 deletions(-) diff --git a/include/io/channel.h b/include/io/channel.h index 2a45fb5..31e4164 100644 --- a/include/io/channel.h +++ b/include/io/channel.h @@ -775,6 +775,57 @@ void qio_channel_set_aio_fd_handler(QIOChannel *ioc, void *opaque); /** + * qio_channel_readv_full_all_eof: + * @ioc: the channel object + * @iov: the array of memory regions to read data to + * @niov: the length of the @iov array + * @fds: an array of file handles to read + * @nfds: number of file handles in @fds + * @errp: pointer to a NULL-initialized error object + * + * + * Performs same function as qio_channel_readv_all_eof. + * Additionally, attempts to read file descriptors shared + * over the channel. The function will wait for all + * requested data to be read, yielding from the current + * coroutine if required. + * + * Returns: 1 if all bytes were read, 0 if end-of-file + * occurs without data, or -1 on error + */ + +int qio_channel_readv_full_all_eof(QIOChannel *ioc, + const struct iovec *iov, + size_t niov, + int **fds, size_t *nfds, + Error **errp); + +/** + * qio_channel_readv_full_all: + * @ioc: the channel object + * @iov: the array of memory regions to read data to + * @niov: the length of the @iov array + * @fds: an array of file handles to read + * @nfds: number of file handles in @fds + * @errp: pointer to a NULL-initialized error object + * + * + * Performs same function as qio_channel_readv_all_eof. + * Additionally, attempts to read file descriptors shared + * over the channel. The function will wait for all + * requested data to be read, yielding from the current + * coroutine if required. 
+ * + * Returns: 0 if all bytes were read, or -1 on error + */ + +int qio_channel_readv_full_all(QIOChannel *ioc, + const struct iovec *iov, + size_t niov, + int **fds, size_t *nfds, + Error **errp); + +/** * qio_channel_writev_full_all: * @ioc: the channel object * @iov: the array of memory regions to write data from diff --git a/io/channel.c b/io/channel.c index 0d4b8b5..09ec31e 100644 --- a/io/channel.c +++ b/io/channel.c @@ -92,10 +92,29 @@ int qio_channel_readv_all_eof(QIOChannel *ioc, size_t niov, Error **errp) { +return qio_channel_readv_full_all_eof(ioc, iov, niov, NULL, NULL, errp); +} + +int qio_channel_readv_all(QIOChannel *ioc, + const struct iovec *iov, + size_t niov, + Error **errp) +{ +return qio_channel_readv_full_all(ioc, iov, niov, NULL, NULL, errp); +} + +int qio_channel_readv_full_all_eof(QIOChannel *ioc, + const struct iovec *iov, + size_t niov, + int **fds, size_t *nfds, + Error **errp) +{ int ret = -1; struct iovec *local_iov = g_new(struct iovec, niov); struct iovec *local_iov_head = local_iov; unsigned int nlocal_iov = niov; +int **local_fds = fds; +size_t *local_nfds = nfds; bool partial = false; nlocal_iov = iov_copy(local_iov, nlocal_iov, @@ -104,7 +123,8 @@ int qio_channel_readv_all_eof(QIOChannel *ioc, while (nlocal_iov > 0) { ssize_t len; -len = qio_channel_readv(ioc, local_iov, nlocal_iov, errp); +len = qio_channel_readv_full(ioc, local_iov, nlocal_iov, local_fds, + local_nfds, errp); if (len == QIO_CHANNEL_ERR_BLOCK) { if (qemu_in_coroutine()) { qio_channel_yield(ioc, G_IO_IN); @@ -112,20 +132,36 @@ int qio_channel_readv_all_eof(QIOChannel *ioc, qio_channel_wait(ioc, G_IO_IN); } continue; -} else if (len < 0) { -goto cleanup; -} else if (len == 0) { -if (partial) { -error_setg(errp, - "Unexpected end-of-file before all bytes were read"); -} else { -ret = 0; +} + +if (len <= 0) { +size_t fd_idx = nfds ? *nfds : 0; +if (len == 0) { +if (partial) { +error_setg(errp, +
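The return convention documented for qio_channel_readv_full_all_eof() above (1 when all bytes were read, 0 on end-of-file without data, -1 on error) can be illustrated with a plain read() loop. This is a minimal sketch on a raw file descriptor, assuming blocking I/O; it does not handle iovecs, file descriptor passing, or coroutines, and demo_read_all_eof() is a hypothetical name.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read exactly len bytes from fd.
 * Returns 1 if all bytes were read, 0 if end-of-file occurs before any
 * byte arrives, and -1 on error or on end-of-file after a partial read.
 */
static int demo_read_all_eof(int fd, void *buf, size_t len)
{
    char *p = buf;
    size_t done = 0;

    while (done < len) {
        ssize_t n = read(fd, p + done, len - done);

        if (n < 0) {
            if (errno == EINTR) {
                continue;               /* interrupted, retry */
            }
            return -1;
        }
        if (n == 0) {
            /* Clean EOF only if nothing was read so far. */
            return done == 0 ? 0 : -1;
        }
        done += (size_t)n;
    }
    return 1;
}
```

Distinguishing a clean end-of-file from a truncated message lets a caller treat an orderly peer shutdown differently from a protocol error.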
[PATCH v16 04/20] multi-process: Add config option for multi-process QEMU
Add configuration options to enable or disable multiprocess QEMU code Signed-off-by: John G Johnson Signed-off-by: Jagannathan Raman Signed-off-by: Elena Ufimtseva Reviewed-by: Stefan Hajnoczi --- configure | 10 ++ meson.build | 4 +++- Kconfig.host | 4 hw/Kconfig| 1 + hw/remote/Kconfig | 3 +++ 5 files changed, 21 insertions(+), 1 deletion(-) create mode 100644 hw/remote/Kconfig diff --git a/configure b/configure index 5860bdb..f82dc2c 100755 --- a/configure +++ b/configure @@ -458,6 +458,7 @@ skip_meson=no gettext="auto" fuse="auto" fuse_lseek="auto" +multiprocess="no" malloc_trim="auto" @@ -804,6 +805,7 @@ Linux) linux="yes" linux_user="yes" vhost_user=${default_feature:-yes} + multiprocess=${default_feature:-yes} ;; esac @@ -1558,6 +1560,10 @@ for opt do ;; --disable-fuse-lseek) fuse_lseek="disabled" ;; + --enable-multiprocess) multiprocess="yes" + ;; + --disable-multiprocess) multiprocess="no" + ;; *) echo "ERROR: unknown option $opt" echo "Try '$0 --help' for more information" @@ -1897,6 +1903,7 @@ disabled with --disable-FEATURE, default is enabled if available libdaxctl libdaxctl support fuseFUSE block device export fuse-lseek SEEK_HOLE/SEEK_DATA support for FUSE exports + multiprocessMultiprocess QEMU support NOTE: The object files are built at the place where configure is launched EOF @@ -6208,6 +6215,9 @@ fi if test "$have_mlockall" = "yes" ; then echo "HAVE_MLOCKALL=y" >> $config_host_mak fi +if test "$multiprocess" = "yes" ; then + echo "CONFIG_MULTIPROCESS_ALLOWED=y" >> $config_host_mak +fi if test "$fuzzing" = "yes" ; then # If LIB_FUZZING_ENGINE is set, assume we are running on OSS-Fuzz, and the # needed CFLAGS have already been provided diff --git a/meson.build b/meson.build index 563688d..5e27bb5 100644 --- a/meson.build +++ b/meson.build @@ -1177,7 +1177,8 @@ host_kconfig = \ ('CONFIG_VHOST_KERNEL' in config_host ? ['CONFIG_VHOST_KERNEL=y'] : []) + \ (have_virtfs ? ['CONFIG_VIRTFS=y'] : []) + \ ('CONFIG_LINUX' in config_host ? 
['CONFIG_LINUX=y'] : []) + \ - ('CONFIG_PVRDMA' in config_host ? ['CONFIG_PVRDMA=y'] : []) + ('CONFIG_PVRDMA' in config_host ? ['CONFIG_PVRDMA=y'] : []) + \ + ('CONFIG_MULTIPROCESS_ALLOWED' in config_host ? ['CONFIG_MULTIPROCESS_ALLOWED=y'] : []) ignored = [ 'TARGET_XML_FILES', 'TARGET_ABI_DIR', 'TARGET_ARCH' ] @@ -2489,6 +2490,7 @@ summary_info += {'rng-none': config_host.has_key('CONFIG_RNG_NONE')} summary_info += {'Linux keyring': config_host.has_key('CONFIG_SECRET_KEYRING')} summary_info += {'FUSE exports': fuse.found()} summary_info += {'FUSE lseek':fuse_lseek.found()} +summary_info += {'Multiprocess QEMU': config_host.has_key('CONFIG_MULTIPROCESS_ALLOWED')} summary(summary_info, bool_yn: true) if not supported_cpus.contains(cpu) diff --git a/Kconfig.host b/Kconfig.host index a9a55a9..24255ef 100644 --- a/Kconfig.host +++ b/Kconfig.host @@ -37,3 +37,7 @@ config VIRTFS config PVRDMA bool + +config MULTIPROCESS_ALLOWED +bool +imply MULTIPROCESS diff --git a/hw/Kconfig b/hw/Kconfig index 5ad3c6b..525fb52 100644 --- a/hw/Kconfig +++ b/hw/Kconfig @@ -27,6 +27,7 @@ source pci-host/Kconfig source pcmcia/Kconfig source pci/Kconfig source rdma/Kconfig +source remote/Kconfig source rtc/Kconfig source scsi/Kconfig source sd/Kconfig diff --git a/hw/remote/Kconfig b/hw/remote/Kconfig new file mode 100644 index 000..5484446 --- /dev/null +++ b/hw/remote/Kconfig @@ -0,0 +1,3 @@ +config MULTIPROCESS +bool +depends on PCI && KVM -- 1.8.3.1
[PATCH v16 00/20] Initial support for multi-process Qemu
Hi

This is the v16 of the patchset. Thank you for your time reviewing v15.

This version has the following changes:

[PATCH v16 04/20] multi-process: Add config option for multi-process QEMU
  - Using “default_feature” value to enable/disable multiprocess

[PATCH v16 07/20] io: add qio_channel_writev_full_all helper
  - Removed local variable in qio_channel_writev_full_all(), setting
    arguments directly
  - Fixed indentation issues
  - Updated commit message

[PATCH v16 08/20] io: add qio_channel_readv_full_all_eof &
                  qio_channel_readv_full_all helpers
  - Added two variants of readv - _full_all_eof & _full_all based on feedback
  - Dropped errno return value
  - Updated commit message
  - Unable to remove local variables and set arguments directly as the
    arguments are later needed for cleanup (g_free/close) during failure

Switched to using OBJECT_DECLARE_{SIMPLE_TYPE, TYPE} macros in the
following patches:
  - [PATCH v16 05/20] multi-process: setup PCI host bridge for remote device
  - [PATCH v16 06/20] multi-process: setup a machine object for remote device
    process
  - [PATCH v16 11/20] multi-process: Associate fd of a PCIDevice with its
    object
  - [PATCH v16 13/20] multi-process: introduce proxy object

Updated copyright text to use the year 2021 in the files that show them.

To touch upon the history of this project, we posted the Proof Of Concept
patches before the BoF session in 2018. Subsequently, we have posted 15
versions on the qemu-devel mailing list. You can find them by following the
links below ([1] - [15]).

Following people contributed to the design and implementation of this
project:
Jagannathan Raman
Elena Ufimtseva
John G Johnson
Stefan Hajnoczi
Konrad Wilk
Kanth Ghatraju

We would like to thank the QEMU community for your feedback in the
design and implementation of this project.

Qemu wiki page: https://wiki.qemu.org/Features/MultiProcessQEMU

For the full concept writeup about QEMU multi-process, please
refer to docs/devel/qemu-multiprocess.rst.
Also, see docs/qemu-multiprocess.txt for usage information.

Thank you for reviewing this series!

[POC]: https://www.mail-archive.com/qemu-devel@nongnu.org/msg566538.html
[1]: https://www.mail-archive.com/qemu-devel@nongnu.org/msg602285.html
[2]: https://www.mail-archive.com/qemu-devel@nongnu.org/msg624877.html
[3]: https://www.mail-archive.com/qemu-devel@nongnu.org/msg642000.html
[4]: https://www.mail-archive.com/qemu-devel@nongnu.org/msg655118.html
[5]: https://www.mail-archive.com/qemu-devel@nongnu.org/msg682429.html
[6]: https://www.mail-archive.com/qemu-devel@nongnu.org/msg697484.html
[7]: https://patchew.org/QEMU/cover.1593273671.git.elena.ufimts...@oracle.com/
[8]: https://www.mail-archive.com/qemu-devel@nongnu.org/msg727007.html
[9]: https://www.mail-archive.com/qemu-devel@nongnu.org/msg734275.html
[10]: https://www.mail-archive.com/qemu-devel@nongnu.org/msg747638.html
[11]: https://www.mail-archive.com/qemu-devel@nongnu.org/msg750972.html
[12]: https://patchew.org/QEMU/cover.1606853298.git.jag.ra...@oracle.com/
[13]: https://www.mail-archive.com/qemu-devel@nongnu.org/msg766825.html
[14]: https://www.mail-archive.com/qemu-devel@nongnu.org/msg768376.html
[15]: https://www.mail-archive.com/qemu-devel@nongnu.org/msg769178.html

Elena Ufimtseva (8):
  multi-process: add configure and usage information
  io: add qio_channel_writev_full_all helper
  io: add qio_channel_readv_full_all_eof & qio_channel_readv_full_all
    helpers
  multi-process: define MPQemuMsg format and transmission functions
  multi-process: introduce proxy object
  multi-process: add proxy communication functions
  multi-process: Forward PCI config space acceses to the remote process
  multi-process: perform device reset in the remote process

Jagannathan Raman (11):
  memory: alloc RAM from file at offset
  multi-process: Add config option for multi-process QEMU
  multi-process: setup PCI host bridge for remote device
  multi-process: setup a machine object for remote device process
  multi-process: Initialize message handler in remote device
  multi-process: Associate fd of a PCIDevice with its object
  multi-process: setup memory manager for remote device
  multi-process: PCI BAR read/write handling for proxy & remote
    endpoints
  multi-process: Synchronize remote memory
  multi-process: create IOHUB object to handle irq
  multi-process: Retrieve PCI info from remote process

John G Johnson (1):
  multi-process: add the concept description to
    docs/devel/qemu-multiprocess

 docs/devel/index.rst         |   1 +
 docs/devel/multi-process.rst | 966 ++
 docs/multi-process.rst       |  64 ++
 configure                    |  10 +
 meson.build                  |   5 +-
 hw/remote/trace.h            |   1 +
 include/exec/memory.h        |   2 +
 include/exec/ram_addr.h      |   2 +-
 incl
[PATCH v16 02/20] multi-process: add configure and usage information
From: Elena Ufimtseva Adds documentation explaining the command-line arguments needed to use multi-process. Signed-off-by: Elena Ufimtseva Signed-off-by: Jagannathan Raman Signed-off-by: John G Johnson Reviewed-by: Stefan Hajnoczi --- docs/multi-process.rst | 64 ++ MAINTAINERS| 1 + 2 files changed, 65 insertions(+) create mode 100644 docs/multi-process.rst diff --git a/docs/multi-process.rst b/docs/multi-process.rst new file mode 100644 index 000..46bb0ca --- /dev/null +++ b/docs/multi-process.rst @@ -0,0 +1,64 @@ +Multi-process QEMU +== + +This document describes how to configure and use multi-process qemu. +For the design document refer to docs/devel/qemu-multiprocess. + +1) Configuration + + +multi-process is enabled by default for targets that enable KVM + + +2) Usage + + +Multi-process QEMU requires an orchestrator to launch. + +Following is a description of command-line used to launch mpqemu. + +* Orchestrator: + + - The Orchestrator creates a unix socketpair + + - It launches the remote process and passes one of the +sockets to it via command-line. + + - It then launches QEMU and specifies the other socket as an option +to the Proxy device object + +* Remote Process: + + - QEMU can enter remote process mode by using the "remote" machine +option. + + - The orchestrator creates a "remote-object" with details about +the device and the file descriptor for the device + + - The remaining options are no different from how one launches QEMU with +devices. 
+ + - Example command-line for the remote process is as follows: + + /usr/bin/qemu-system-x86_64\ + -machine x-remote \ + -device lsi53c895a,id=lsi0 \ + -drive id=drive_image2,file=/build/ol7-nvme-test-1.qcow2 \ + -device scsi-hd,id=drive2,drive=drive_image2,bus=lsi0.0,scsi-id=0 \ + -object x-remote-object,id=robj1,devid=lsi1,fd=4, + +* QEMU: + + - Since parts of the RAM are shared between QEMU & remote process, a +memory-backend-memfd is required to facilitate this, as follows: + +-object memory-backend-memfd,id=mem,size=2G + + - A "x-pci-proxy-dev" device is created for each of the PCI devices emulated +in the remote process. A "socket" sub-option specifies the other end of +unix channel created by orchestrator. The "id" sub-option must be specified +and should be the same as the "id" specified for the remote PCI device + + - Example commandline for QEMU is as follows: + + -device x-pci-proxy-dev,id=lsi0,socket=3 diff --git a/MAINTAINERS b/MAINTAINERS index d50b75c..da7f735 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -3179,6 +3179,7 @@ M: Jagannathan Raman M: John G Johnson S: Maintained F: docs/devel/multi-process.rst +F: docs/multi-process.rst Build and test automation - -- 1.8.3.1
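The orchestrator steps described in the usage document above could be sketched roughly as follows. This is only an illustration under assumptions: the helper names (demo_make_channel, demo_format_remote_object, demo_format_proxy_device) are hypothetical and not part of the series, and only the option strings are taken from the document. The sketch stops short of fork()/exec(), which a real orchestrator would perform for the remote process and for QEMU.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Create the unix socketpair shared by QEMU and the remote process. */
static int demo_make_channel(int fds[2])
{
    return socketpair(AF_UNIX, SOCK_STREAM, 0, fds);
}

/* Build the "-object" argument for the remote process (hypothetical helper). */
static int demo_format_remote_object(char *buf, size_t len,
                                     const char *devid, int fd)
{
    assert(buf != NULL && devid != NULL);
    return snprintf(buf, len, "x-remote-object,id=robj1,devid=%s,fd=%d",
                    devid, fd);
}

/* Build the "-device" argument for the QEMU side (hypothetical helper). */
static int demo_format_proxy_device(char *buf, size_t len,
                                    const char *id, int fd)
{
    assert(buf != NULL && id != NULL);
    return snprintf(buf, len, "x-pci-proxy-dev,id=%s,socket=%d", id, fd);
}
```

An orchestrator built on these helpers would pass one end of the pair to the remote process and the other to QEMU, and would have to make sure each descriptor is inherited across exec() under the number that appears in the corresponding option string.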
[Bug 1910696] Re: Qemu fails to start with error " There is no option group 'spice'"
Additional information: This error occurs only if spice is compiled as a module (`--enable-modules`) and spice parameters are supplied from a file with `-readconfig /path/to/file`. If spice parameters are supplied on the command line (`-spice param1=a,param2=b`), the error does not occur. Possible workaround: Build most modules statically (https://salsa.debian.org/qemu-team/qemu/-/blob/master/debian/patches/build-most-modules-statically-hack.diff) or disable modules entirely (`--disable-modules`) -- You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1910696 Title: Qemu fails to start with error " There is no option group 'spice'" Status in QEMU: New Bug description: After upgrading from 5.1.0 to 5.2.0, qemu fails on start with error: ` /usr/bin/qemu-system-x86_64 -S -name trinti -uuid f8ad2ff6-8808-4f42-8f0b-9e23acd20f84 -daemonize -cpu host -nographic -serial chardev:console -nodefaults -no-reboot -no-user-config -sandbox on,obsolete=deny,elevateprivileges=allow,spawn=deny,resourcecontrol=deny -readconfig /var/log/lxd/trinti/qemu.conf -pidfile /var/log/lxd/trinti/qemu.pid -D /var/log/lxd/trinti/qemu.log -chroot /var/lib/lxd/virtual-machines/trinti -smbios type=2,manufacturer=Canonical Ltd.,product=LXD -runas nobody: qemu-system-x86_64:/var/log/lxd/trinti/qemu.conf:27: There is no option group 'spice' qemu-system-x86_64: -readconfig /var/log/lxd/trinti/qemu.conf: read config /var/log/lxd/trinti/qemu.conf: Invalid argument ` Bisected to first bad commit: https://github.com/qemu/qemu/commit/cbe5fa11789035c43fd2108ac6f45848954954b5 To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1910696/+subscriptions
Re: [PATCH v2] hvf: guard xgetbv call.
On Sun, Jan 10, 2021 at 01:08:54PM -0800, Hill Ma wrote: > This prevents illegal instruction on cpus do not support xgetbv. > > Buglink: https://bugs.launchpad.net/qemu/+bug/1758819 > Signed-off-by: Hill Ma > --- > v2: xgetbv() modified based on feedback. > > target/i386/hvf/x86_cpuid.c | 28 +++- > 1 file changed, 19 insertions(+), 9 deletions(-) > > diff --git a/target/i386/hvf/x86_cpuid.c b/target/i386/hvf/x86_cpuid.c > index a6842912f5..edaa1b7da2 100644 > --- a/target/i386/hvf/x86_cpuid.c > +++ b/target/i386/hvf/x86_cpuid.c > @@ -27,15 +27,22 @@ > #include "vmx.h" > #include "sysemu/hvf.h" > > -static uint64_t xgetbv(uint32_t xcr) > +static bool xgetbv(uint32_t cpuid_ecx, uint32_t idx, uint64_t *xcr) > { > -uint32_t eax, edx; > +uint32_t xcrl, xcrh; > > -__asm__ volatile ("xgetbv" > - : "=a" (eax), "=d" (edx) > - : "c" (xcr)); > +if (cpuid_ecx & CPUID_EXT_OSXSAVE) { > +/* > + * The xgetbv instruction is not available to older versions of > + * the assembler, so we encode the instruction manually. > + */ > +asm(".byte 0x0f, 0x01, 0xd0" : "=a" (xcrl), "=d" (xcrh) : "c" (idx)); > > -return (((uint64_t)edx) << 32) | eax; > +*xcr = (((uint64_t)xcrh) << 32) | xcrl; > +return true; > +} > + > +return false; > } > > uint32_t hvf_get_supported_cpuid(uint32_t func, uint32_t idx, > @@ -100,11 +107,14 @@ uint32_t hvf_get_supported_cpuid(uint32_t func, > uint32_t idx, > break; > case 0xD: > if (idx == 0) { > -uint64_t host_xcr0 = xgetbv(0); > -uint64_t supp_xcr0 = host_xcr0 & (XSTATE_FP_MASK | > XSTATE_SSE_MASK | > +uint64_t supp_xcr0 = XSTATE_FP_MASK | XSTATE_SSE_MASK | >XSTATE_YMM_MASK | XSTATE_BNDREGS_MASK | >XSTATE_BNDCSR_MASK | XSTATE_OPMASK_MASK | > - XSTATE_ZMM_Hi256_MASK | > XSTATE_Hi16_ZMM_MASK); > + XSTATE_ZMM_Hi256_MASK | > XSTATE_Hi16_ZMM_MASK; > +uint64_t host_xcr0; > +if (xgetbv(ecx, 0, &host_xcr0)) { > +supp_xcr0 &= host_xcr0; Hi Hill, I'm not sure if eax should be modified with mask because the mask has no value per se. I.e. 
eax &= supp_xcr0 from below should be placed inside the if. It'd express clearly that eax is not modified unless xgetbv is supported. Thanks, Roman > +} > eax &= supp_xcr0; > } else if (idx == 1) { > hv_vmx_read_capability(HV_VMX_CAP_PROCBASED2, &cap); > -- > 2.20.1 (Apple Git-117) >
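To make the review comment above concrete, here is a small stand-alone sketch of the suggested control flow, where eax is only masked when the guarded read succeeds. It is not the hvf code: demo_xgetbv() takes a simulated XCR0 value instead of executing the instruction, DEMO_SUPPORTED_MASK is an arbitrary illustrative mask, and both names are hypothetical. The only value taken as fact is that OSXSAVE is bit 27 of CPUID.1:ECX.

```c
#include <stdbool.h>
#include <stdint.h>

#define CPUID_EXT_OSXSAVE   (1u << 27)  /* CPUID.1:ECX bit 27 */
#define DEMO_SUPPORTED_MASK 0x7ull      /* illustrative mask, an assumption */

/*
 * Stand-in for the guarded xgetbv(): returns true and stores the
 * (simulated) XCR0 value only when OSXSAVE is advertised.
 */
static bool demo_xgetbv(uint32_t cpuid_ecx, uint64_t simulated_xcr0,
                        uint64_t *xcr)
{
    if (cpuid_ecx & CPUID_EXT_OSXSAVE) {
        *xcr = simulated_xcr0;
        return true;
    }
    return false;
}

/* eax is left untouched unless the XCR0 read is possible. */
static uint32_t demo_filter_eax(uint32_t eax, uint32_t cpuid_ecx,
                                uint64_t simulated_xcr0)
{
    uint64_t host_xcr0;

    if (demo_xgetbv(cpuid_ecx, simulated_xcr0, &host_xcr0)) {
        eax &= (uint32_t)(DEMO_SUPPORTED_MASK & host_xcr0);
    }
    return eax;
}
```

With this shape the reader can see at a glance that the mask is applied only on hosts where xgetbv is usable, which is the point raised in the review.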
Re: [PATCH] hvf: guard xgetbv call.
On Sun, Jan 10, 2021 at 08:38:36AM -1000, Richard Henderson wrote: > On 1/10/21 8:34 AM, Richard Henderson wrote: > > On 1/9/21 3:46 PM, Roman Bolshakov wrote: > >> +static int xgetbv(uint32_t cpuid_ecx, uint32_t idx, uint64_t *xcr) > >> { > >> -uint32_t eax, edx; > >> +uint32_t xcrl, xcrh; > >> > >> -__asm__ volatile ("xgetbv" > >> - : "=a" (eax), "=d" (edx) > >> - : "c" (xcr)); > >> +if (cpuid_ecx && CPUID_EXT_OSXSAVE) { > >> +/* The xgetbv instruction is not available to older versions of > >> + * the assembler, so we encode the instruction manually. > >> + */ > >> +asm(".byte 0x0f, 0x01, 0xd0" : "=a" (xcrl), "=d" (xcrh) : "c" > >> (idx)); > >> > >> -return (((uint64_t)edx) << 32) | eax; > >> +*xcr = (((uint64_t)xcrh) << 32) | xcrl; > >> +return 0; > >> +} > >> + > >> +return 1; > >> } > > > > Not to bikeshed too much, but this looks like it should return bool, and > > true > > on success, not the other way around. > I agree, that would be easier to comprehend (and Hill has already sent v2 with this). > Also, if we're going to put this some place common, forcing the caller to do > the cpuid that feeds this, then we should probably make all of the startup > cpuid stuff common as well. > I proposed this version because all callers of the xgetbv instruction already call cpuid before invoking the inline xgetbv. > Note that we'd probably have to use constructor priorities to get that right > for util/bufferiszero.c. > Please correct me if I read this wrong. What you're saying is that we should initialize cpuid in constructors and then use the cached cpuid ecx in xgetbv() (and drop one argument accordingly)? Thanks, Roman
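For readers following the constructor idea discussed above, one possible reading is sketched below. This is an assumption about what such code might look like, not anything proposed in the thread: the names demo_cache_cpuid, demo_cpuid1_ecx and demo_xgetbv_cached are hypothetical, the priority value 101 is simply the lowest one GCC and clang leave to user code, and non-x86 hosts fall back to "not supported".

```c
#include <stdbool.h>
#include <stdint.h>

#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>
#endif

#define CPUID_EXT_OSXSAVE (1u << 27)  /* CPUID.1:ECX bit 27 */

static uint32_t demo_cpuid1_ecx;      /* cached CPUID.1:ECX, 0 if unknown */
static bool demo_cpuid_cache_ready;

/* Runs before main(); a low priority number means an early constructor. */
__attribute__((constructor(101)))
static void demo_cache_cpuid(void)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned int eax, ebx, ecx, edx;

    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
        demo_cpuid1_ecx = ecx;
    }
#endif
    demo_cpuid_cache_ready = true;
}

/* The cpuid argument is dropped; the cached value guards the instruction. */
static bool demo_xgetbv_cached(uint32_t idx, uint64_t *xcr)
{
    if (!(demo_cpuid1_ecx & CPUID_EXT_OSXSAVE)) {
        return false;
    }
#if defined(__i386__) || defined(__x86_64__)
    uint32_t lo, hi;

    /* xgetbv, encoded manually as in the patch under review */
    __asm__ volatile (".byte 0x0f, 0x01, 0xd0"
                      : "=a" (lo), "=d" (hi) : "c" (idx));
    *xcr = ((uint64_t)hi << 32) | lo;
    return true;
#else
    (void)idx;
    (void)xcr;
    return false;
#endif
}
```

Whether a shared cache like this is worth it depends on how many call sites exist; the ordering concern raised for util/bufferiszero.c is exactly why a priority would be needed.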
[Bug 1776096] Re: qemu 2.12.0 qemu-system-ppc illegal instruction on ppc64le, crashes emulator
[Expired for QEMU because there has been no activity for 60 days.] ** Changed in: qemu Status: Incomplete => Expired -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1776096 Title: qemu 2.12.0 qemu-system-ppc illegal instruction on ppc64le, crashes emulator Status in QEMU: Expired Bug description: % uname -a Linux tim.floodgap.com 4.16.14-300.fc28.ppc64le #1 SMP Tue Jun 5 15:59:48 UTC 2018 ppc64le ppc64le ppc64le GNU/Linux STR: Start QEMU and boot Mac OS X 10.4.11. Download the current version of TenFourFox (I used G3 so that AltiVec was not a confounder). Try to start TenFourFox in safe mode (hold down Option as you double-click while the icon bounces in the Dock). Expected: TenFourFox starts. Actual: The entire emulator exits with an illegal instruction error. Trace of session (including some disassembly so you can see where TCG went wrong): tim:/home/spectre/src/qemu-2.12.0/ppc-softmmu/% gdb --args ./qemu- system-ppc -M mac99,accel=tcg -m 2048 -prom-env boot-args=-v -boot c -drive file=tigerhd.img,format=raw,cache=none -netdev user,id=mynet0 -device usb-net,netdev=mynet0 -usb -device usb-tablet GNU gdb (GDB) Fedora 8.1-15.fc28 [...] Reading symbols from ./qemu-system-ppc...done. (gdb) run [...] Thread 6 "qemu-system-ppc" received signal SIGSEGV, Segmentation fault. [Switching to Thread 0x7242ea30 (LWP 7017)] 0xfffc in ?? 
() #0 0xfffc in () #1 0x7fffd4edec00 in code_gen_buffer () #2 0x100c9e20 in cpu_tb_exec (itb=, cpu=) at /home/spectre/src/qemu-2.12.0/accel/tcg/cpu-exec.c:169 #3 0x100c9e20 in cpu_loop_exec_tb (tb_exit=, last_tb=, tb=, cpu=) at /home/spectre/src/qemu-2.12.0/accel/tcg/cpu-exec.c:626 #4 0x100c9e20 in cpu_exec (cpu=) at /home/spectre/src/qemu-2.12.0/accel/tcg/cpu-exec.c:734 #5 0x1007decc in tcg_cpu_exec (cpu=0x11774e10) at /home/spectre/src/qemu-2.12.0/cpus.c:1362 (gdb) disas 0x7fffd4edebf0, 0x7fffd4edec10 Dump of assembler code from 0x7fffd4edebf0 to 0x7fffd4edec10: 0x7fffd4edebf0 :addir0,r4,3 0x7fffd4edebf4 :rlwinm r0,r0,0,0,19 0x7fffd4edebf8 :cmplw cr7,r0,r12 0x7fffd4edebfc :bnel cr7,0x7fffd4ed8b64 0x7fffd4edec00 :lwbrx r14,r3,r4 0x7fffd4edec04 :stw r14,40(r27) 0x7fffd4edec08 :clrldi r4,r14,32 0x7fffd4edec0c :rlwinm r3,r4,25,19,26 End of assembler dump. (gdb) disas 0x7fffd4ed8b60, 0x7fffd4ed8b70 Dump of assembler code from 0x7fffd4ed8b60 to 0x7fffd4ed8b70: 0x7fffd4ed8b60 :bctrl 0x7fffd4ed8b64 :mtctr r3 0x7fffd4ed8b68 :mr r31,r3 0x7fffd4ed8b6c :li r3,0 End of assembler dump. (gdb) i reg ctr ctr0x 18446744073709551615 It appears that the branch at 0x7fffd4edebfc caused a jump back (a return?) through CTR, but CTR has -1 in it, hence setting PC to 0xfffc. I am not sure how to debug this further. To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1776096/+subscriptions
[Bug 1777301] Re: Boot failed after installing Checkpoint Pointsec FDE
[Expired for QEMU because there has been no activity for 60 days.] ** Changed in: qemu Status: Incomplete => Expired -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1777301 Title: Boot failed after installing Checkpoint Pointsec FDE Status in QEMU: Expired Bug description: Boot failed after installing Checkpoint Pointsec FDE Hi, I installed Windows 10 64-bit guest on CentOS 7. Everything works great as expected. However after installing CheckPoint AlertSec full disk encryption, the guest failed to boot. The following error is displayed in qemu log file. KVM internal error. Suberror: 1 emulation failure Installed Software [root@sesamvmh01 qemu]# yum list installed | grep qemu ipxe-roms-qemu.noarch 20170123-1.git4e85b27.el7_4.1 @base libvirt-daemon-driver-qemu.x86_64 3.9.0-14.el7_5.5 @updates qemu-guest-agent.x86_64 10:2.8.0-2.el7 @base qemu-img-ev.x86_64 10:2.3.0-29.1.el7 @qemu-kvm-rhev qemu-kvm-common-ev.x86_64 10:2.3.0-29.1.el7 @qemu-kvm-rhev qemu-kvm-ev.x86_64 10:2.3.0-29.1.el7 @qemu-kvm-rhev # uname -r 3.10.0-862.3.2.el7.x86_64 CPU info: processor : 0..3 vendor_id : GenuineIntel cpu family: 6 model : 30 model name: Intel(R) Xeon(R) CPU X3430 @ 2.40GHz stepping : 5 microcode : 0x7 cpu MHz : 1200.000 cache size: 8192 KB physical id : 0 siblings : 4 core id : 0 cpu cores : 4 apicid: 0 initial apicid: 0 fpu : yes fpu_exception : yes cpuid level : 11 wp: yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 sse4_2 popcnt lahf_lm tpr_shadow vnmi flexpriority ept vpid dtherm ida bogomips : 4799.98 clflush size : 64 cache_alignment : 64 address sizes : 36 bits physical, 48 bits virtual power management: Please also check attached 
logs. I am new to qemu-kvm so please don't hesitate to ask missing info. To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1777301/+subscriptions
[Bug 1777232] Re: NVME fails on big writes
[Expired for QEMU because there has been no activity for 60 days.] ** Changed in: qemu Status: Incomplete => Expired -- You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1777232 Title: NVME fails on big writes Status in QEMU: Expired Bug description: NVME Compliance test 8:3.3.0 tries to write and read back big chunks of pages. Currently, on the latest QEMU, an operation of 1024 blocks fails when the device is backed by a file. The NVMe specification defines several types of data transfer from guests; one of them is the PRP list (Physical Region Page List). A PRP list is a list of entries pointing to the pages to be written. The list itself resides in one or more pages. The NVMe device maps the PRP list into a QEMUSGList, which is then mapped into Linux I/O vectors. Finally, when the file driver writes the changes, it uses pwritev, which fails if the number of vectors exceeds the maximum (IOV_MAX). NVME Compliance - https://github.com/nvmecompliance/tnvme/wiki To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1777232/+subscriptions
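The failure mode described in this report can be avoided, in principle, by never handing pwritev() more vectors than the limit in a single call. The sketch below shows that idea only; it is not how QEMU handles it, the function name demo_pwritev_chunked is hypothetical, the fallback value for IOV_MAX is an assumption for systems whose headers do not expose it, and short writes are simply treated as errors to keep the example small.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE
#endif
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#ifndef IOV_MAX
#define IOV_MAX 1024  /* assumed fallback; sysconf(_SC_IOV_MAX) is the runtime query */
#endif

/*
 * Write all vectors starting at 'offset', passing at most 'max_vecs'
 * vectors to each pwritev() call. Returns bytes written or -1.
 */
static ssize_t demo_pwritev_chunked(int fd, const struct iovec *iov,
                                    int iovcnt, off_t offset, int max_vecs)
{
    ssize_t total = 0;

    if (max_vecs <= 0 || max_vecs > IOV_MAX) {
        errno = EINVAL;
        return -1;
    }
    while (iovcnt > 0) {
        int n = iovcnt < max_vecs ? iovcnt : max_vecs;
        size_t want = 0;
        ssize_t ret;

        for (int i = 0; i < n; i++) {
            want += iov[i].iov_len;
        }
        ret = pwritev(fd, iov, n, offset);
        if (ret < 0) {
            if (errno == EINTR) {
                continue;       /* retry the same chunk */
            }
            return -1;
        }
        if ((size_t)ret != want) {
            errno = EIO;        /* sketch: no partial-write recovery */
            return -1;
        }
        total += ret;
        offset += ret;
        iov += n;
        iovcnt -= n;
    }
    return total;
}
```

A caller would normally pass IOV_MAX as max_vecs; a smaller value is only useful for exercising the chunking path.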
[Bug 1778473] Re: [Crash] qemu-system-x86_64: mov_ss_trap_64 PANIC: double fault, error_code: 0x0
[Expired for QEMU because there has been no activity for 60 days.] ** Changed in: qemu Status: Incomplete => Expired -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1778473 Title: [Crash] qemu-system-x86_64: mov_ss_trap_64 PANIC: double fault, error_code: 0x0 Status in QEMU: Expired Bug description: Kselftest test case mov_ss_trap_64 is causing kernel panic on qemu-system-x86_64 and PASS on real x86_64 hardware. qemu-system-x86_64 version is 2.12.0 host architecture: amd64 Test failed on recent stable rc kernel, 4.17.3-rc1, 4.16.18-rc1 and 4.14.52-rc1. Test code snippet, main() { <> printf("[RUN]\tMOV SS; CS CS INT3\n"); asm volatile ("mov %[ss], %%ss; .byte 0x2e, 0x2e; int3" :: [ss] "m" (ss)); <> } Kerel crash log, # cd /opt/kselftests/mainline/x86 # ./mov_ss_trap_64 SS = 0x2b, &SS = 0x0x604188 Set up a watchpoint DR0 = 604188, DR1 = 400a19, DR7 = 7000a [RUN] Read from watched memory (should get SIGTRAP) Got SIGTRAP with RIP=4008ea, EFLAGS.RF=0 [RUN] MOV SS; INT3 Got SIGTRAP with RIP=4008fb, EFLAGS.RF=0 [RUN] MOV SS; INT 3 Got SIGTRAP with RIP=40090d, EFLAGS.RF=0 [RUN] M[ 20.305426] PANIC: double fault, error_code: 0x0 OV SS; CS CS INT3 Got SIGTRAP with RIP=400920,[ 20.308317] CPU: 3 PID: 2471 Comm: mov_ss_trap_64 Not tainted 4.17.3-rc1 #1 EFLAGS.RF=0 [RUN] MOV SS; CSx14 INT3 [ 20.311664] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014 [ 20.314738] RIP: 0010:error_entry+0x32/0x100 [ 20.316198] RSP: :fe086000 EFLAGS: 00010046 [ 20.317911] RAX: 92400a87 RBX: RCX: [ 20.320168] RDX: RSI: 92400f18 RDI: 92401146 [ 20.322405] RBP: R08: R09: [ 20.324320] R10: R11: R12: [ 20.326073] R13: R14: R15: [ 20.327869] FS: 7f3174aefe80() GS:9f447fd8() knlGS: [ 20.329850] CS: 0010 DS: ES: CR0: 80050033 [ 20.331343] CR2: fe085ff8 CR3: 000136d2e000 CR4: 06e0 [ 20.333150] DR0: 00604188 DR1: 00400a19 DR2: [ 20.334893] DR3: DR6: 0ff0 DR7: 0007060a [ 20.336649] 
Call Trace: [ 20.337523] [ 20.338507] ? native_iret+0x7/0x7 [ 20.339611] ? page_fault+0x8/0x30 [ 20.340693] ? error_entry+0x86/0x100 [ 20.341871] ? trace_hardirqs_off_caller+0x7/0xa0 [ 20.343212] ? trace_hardirqs_off_thunk+0x1a/0x1c [ 20.344554] ? native_iret+0x7/0x7 [ 20.345647] ? page_fault+0x8/0x30 [ 20.346716] ? error_entry+0x86/0x100 [ 20.347853] ? page_fault+0x8/0x30 [ 20.348920] ? ist_enter+0x6/0xa0 [ 20.349961] ? do_int3+0x34/0x120 [ 20.351095] ? int3+0x14/0x20 [ 20.352047] [ 20.353060] Code: 48 89 7c 24 08 52 31 d2 51 31 c9 50 41 50 45 31 c0 41 51 45 31 c9 41 52 45 31 d2 41 53 45 31 db 53 31 db 55 31 ed 41 54 45 31 e4 <41> 55 45 31 ed 41 56 45 31 f6 41 57 45 31 ff 56 48 8d 6c 24 09 [ 20.357895] Kernel panic - not syncing: Machine halted. [ 20.359385] CPU: 3 PID: 2471 Comm: mov_ss_trap_64 Not tainted 4.17.3-rc1 #1 [ 20.361271] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014 [ 20.363513] Call Trace: [ 20.364367] <#DF> [ 20.365109] dump_stack+0x68/0x95 [ 20.366131] panic+0xe3/0x22a [ 20.367207] df_debug+0x2d/0x30 [ 20.368254] do_double_fault+0x9f/0x120 [ 20.369387] double_fault+0x23/0x30 [ 20.370444] RIP: 0010:error_entry+0x32/0x100 [ 20.371791] RSP: :fe086000 EFLAGS: 00010046 [ 20.373246] RAX: 92400a87 RBX: RCX: [ 20.375250] RDX: RSI: 92400f18 RDI: 92401146 [ 20.377103] RBP: R08: R09: [ 20.378958] R10: R11: R12: [ 20.380808] R13: R14: R15: [ 20.382744] ? page_fault+0x8/0x30 [ 20.383925] ? error_entry+0x86/0x100 [ 20.385037] [ 20.385793] [ 20.386774] ? native_iret+0x7/0x7 [ 20.387839] ? page_fault+0x8/0x30 [ 20.388901] ? error_entry+0x86/0x100 [ 20.389997] ? trace_hardirqs_off_caller+0x7/0xa0 [ 20.391464] ? trace_hardirqs_off_thunk+0x1a/0x1c [ 20.392850] ? native_iret+0x7/0x7 [ 20.393886] ? page_fault+0x8/0x30 [ 20.394984] ? error_entry+0x86/0x100 [ 20.396092] ? page_fault+0x8/0x30 [ 20.397145] ? ist_enter+0x6/0xa0 [ 20.398167] ? do_int3+0x34/0x120 [
[Bug 1777236] Re: NVME is missing support for mandatory features through "Get/Set Feature" command
[Expired for QEMU because there has been no activity for 60 days.] ** Changed in: qemu Status: Incomplete => Expired -- You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1777236 Title: NVME is missing support for mandatory features through "Get/Set Feature" command Status in QEMU: Expired Bug description: The following features are marked as mandatory by the 1.2 specification (NVMe 1.2, Section 5.14.1, Figure 108) but are currently not implemented: - 0x1 Arbitration - 0x2 Power Management - 0x4 Temperature Threshold - 0x5 Error Recovery - 0x8 Interrupt Coalescing - 0x9 Interrupt Vector Configuration - 0xA Write Atomicity Normal - 0xB Asynchronous Event Configuration To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1777236/+subscriptions
[Bug 1903712] Re: when ../configure, cannot find Ninjia
[Expired for QEMU because there has been no activity for 60 days.] ** Changed in: qemu Status: Incomplete => Expired -- You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1903712 Title: when ../configure, cannot find Ninjia Status in QEMU: Expired Bug description: On Ubuntu 18.04, after running wget https://download.qemu.org/qemu-5.2.0-rc0.tar.xz tar xvJf qemu-5.2.0-rc0.tar.xz cd qemu-5.2.0-rc0 I entered mkdir build cd build ../configure which returned an error: cannot find Ninja To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1903712/+subscriptions
[Bug 1910941] Re: Assertion `addr < cache->len && 2 <= cache->len - addr' in virtio-blk
This is OSS-Fuzz Issue 26797 === Reproducer === cat << EOF | ./qemu-system-i386 -machine q35 \ -device virtio-blk,drive=disk0 \ -drive file=null-co://,id=disk0,if=none,format=raw \ -serial none -monitor none -qtest stdio -nographic outl 0xcf8 0x80001890 outl 0xcfc 0x4 outl 0xcf8 0x8000188a outl 0xcfc 0xd4624 outl 0xcf8 0x80001894 outl 0xcfc 0x2002 outl 0xcf8 0x80001889 outl 0xcfc 0x1800 outl 0xcf8 0x80001896 outl 0xcfc 0x0 outl 0xcf8 0x8000188c outw 0xcfc 0x20 outl 0xcf8 0x80001894 outl 0xcfc 0x1 outl 0xcf8 0x8000188c outw 0xcfc 0x1c outl 0xcf8 0x80001895 outl 0xcfc 0x0 outl 0xcf8 0x80001889 outl 0xcfc 0x1800 outl 0xcf8 0x80001894 outl 0xcfc 0x40 outl 0xcf8 0x8000188c outw 0xcfc 0x14 outl 0xcf8 0x80001894 outl 0xcfc 0x1004 EOF === Stack Trace === qemu-fuzz-i386-target-generic-fuzz-virtio-blk: /src/qemu/include/exec/memory_ldst_cached.h.inc:88: void address_space_stw_le_cached(MemoryRegionCache *, hwaddr, uint32_t, MemTxAttrs, MemTxResult *): Assertion `addr < cache->len && 2 <= cache->len - addr' failed. 
==2382430== ERROR: libFuzzer: deadly signal #8 address_space_stw_le_cached /src/qemu/include/exec/memory_ldst_cached.h.inc:88:5 #9 stw_le_phys_cached /src/qemu/include/exec/memory_ldst_phys.h.inc:121:5 #10 virtio_stw_phys_cached /src/qemu/include/hw/virtio/virtio-access.h:196:9 #11 vring_set_avail_event /src/qemu/hw/virtio/virtio.c:429:5 #12 virtio_queue_split_set_notification /src/qemu/hw/virtio/virtio.c:438:9 #13 virtio_queue_set_notification /src/qemu/hw/virtio/virtio.c:499:9 #14 virtio_blk_handle_vq /src/qemu/hw/block/virtio-blk.c:795:13 #15 virtio_blk_data_plane_handle_output /src/qemu/hw/block/dataplane/virtio-blk.c:165:12 #16 virtio_queue_notify_aio_vq /src/qemu/hw/virtio/virtio.c:2326:15 #17 virtio_queue_host_notifier_aio_read /src/qemu/hw/virtio/virtio.c:3533:9 #18 aio_dispatch_handler /src/qemu/util/aio-posix.c:329:9 #19 aio_dispatch_handlers /src/qemu/util/aio-posix.c:372:20 #20 aio_dispatch /src/qemu/util/aio-posix.c:382:5 #21 aio_ctx_dispatch /src/qemu/util/async.c:306:5 #22 g_main_context_dispatch #23 glib_pollfds_poll /src/qemu/util/main-loop.c:232:9 #24 os_host_main_loop_wait /src/qemu/util/main-loop.c:255:5 #25 main_loop_wait /src/qemu/util/main-loop.c:531:11 #26 flush_events /src/qemu/tests/qtest/fuzz/fuzz.c:49:9 #27 generic_fuzz /src/qemu/tests/qtest/fuzz/generic_fuzz.c:683:17 https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=2qemu-fuzz-i386 -target-generic-fuzz-virtio-blk: /src/qemu/include/exec/memory_ldst_cached.h.inc:88: void address_space_stw_le_cached(MemoryRegionCache *, hwaddr, uint32_t, MemTxAttrs, MemTxResult *): Assertion `addr < cache->len && 2 <= cache->len - addr' failed.6797 -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. 
https://bugs.launchpad.net/bugs/1910941 Title: Assertion `addr < cache->len && 2 <= cache->len - addr' in virtio-blk Status in QEMU: New Bug description: Hello, Using hypervisor fuzzer, hyfuzz, I found an assertion failure through virtio-blk emulator. A malicious guest user/process could use this flaw to abort the QEMU process on the host, resulting in a denial of service. This was found in version 5.2.0 (master) ``` qemu-system-i386: /home/cwmyung/prj/hyfuzz/src/qemu-master/include/exec/memory_ldst_cached.h.inc:88: void address_space_stw_le_cached(MemoryRegionCache *, hwaddr, uint32_t, MemTxAttrs, MemTxResult *): Assertion `addr < cache->len && 2 <= cache->len - addr' failed. [1]1877 abort (core dumped) /home/cwmyung/prj/hyfuzz/src/qemu-master/build/i386-softmmu/qemu-system-i386 Program terminated with signal SIGABRT, Aborted. #0 0x7f71cc171f47 in __GI_raise (sig=sig@entry=0x6) at ../sysdeps/unix/sysv/linux/raise.c:51 #1 0x7f71cc1738b1 in __GI_abort () at abort.c:79 #2 0x7f71cc16342a in __assert_fail_base (fmt=0x7f71cc2eaa38 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x56537b324230 "addr < cache->len && 2 <= cache->len - addr", file=file@entry=0x56537b32425c "/home/cwmyung/prj/hyfuzz/src/qemu-master/include/exec/memory_ldst_cached.h.inc", line=line@entry=0x58, function=function@entry=0x56537b3242ab "void address_space_stw_le_cached(MemoryRegionCache *, hwaddr, uint32_t, MemTxAttrs, MemTxResult *)") at assert.c:92 #3 0x7f71cc1634a2 in __GI___assert_fail (assertion=0x56537b324230 "addr < cache->len && 2 <= cache->len - addr", file=0x56537b32425c "/home/cwmyung/prj/hyfuzz/src/qemu-master/include/exec/memory_ldst_cached.h.inc", line=0x58, function=0x56537b3242ab "void address_space_stw_le_cached(MemoryRegionCache *, hwaddr, uint32_t, MemTxAttrs, MemTxResult *)") at assert.c:101 #4 0x56537af3c917 in address_space_stw_le_cached (attrs=..., result=, cache=, addr=, val=) at 
/home/cwmyung/prj/hyfuzz/src/qemu-master/include/exec/memory_ldst_cached.h.inc:88 #5 0x56537af3c917 in stw_le_phys_cached (cache=, addr=, val=) at /home/cwmyung/prj/hyfuzz/sr
Re: [PATCH] util/oslib-win32: Fix _aligned_malloc() arguments order
On Sun, Jan 10, 2021 at 4:16 PM Philippe Mathieu-Daudé wrote: > > Commit dfbd0b873a8 inadvertently swapped the arguments > of _aligned_malloc(), correct it to fix [*]: > > G_TEST_SRCDIR=C:/Users/ContainerAdministrator/AppData/Local/Temp/cirrus-ci-build/tests > G_TEST_BUILDDIR=C:/Users/ContainerAdministrator/AppData/Local/Temp/cirrus-ci-build/build/tests > tests/test-qht.exe --tap -k > ERROR test-qht - too few tests run (expected 2, got 0) > make: *** [Makefile.mtest:256: run-test-30] Error 1 > > [*] https://cirrus-ci.com/task/6055645751279616?command=test#L593 > > Fixes: dfbd0b873a8 ("util/oslib-win32: Use _aligned_malloc for qemu_try_memalign") > Reported-by: Yonggang Luo > Reported-by: Volker Rümelin > Suggested-by: Volker Rümelin > Signed-off-by: Philippe Mathieu-Daudé > --- > util/oslib-win32.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/util/oslib-win32.c b/util/oslib-win32.c > index e6f83e10edb..f68b8012bb8 100644 > --- a/util/oslib-win32.c > +++ b/util/oslib-win32.c > @@ -59,7 +59,7 @@ void *qemu_try_memalign(size_t alignment, size_t size) > > g_assert(size != 0); > g_assert(is_power_of_2(alignment)); > -ptr = _aligned_malloc(alignment, size); > +ptr = _aligned_malloc(size, alignment); > trace_qemu_memalign(alignment, size, ptr); > return ptr; > } > -- > 2.26.2 > Oh, sorry, you 've fixed this. ignore my patch Reviewed-by: Yonggang Luo -- 此致 礼 罗勇刚 Yours sincerely, Yonggang Luo
[PATCH] util/oslib-win32: Fixes Use _aligned_malloc for qemu_try_memalign
Commit dfbd0b873a85021c083d9b4b84630c3732645963 calls _aligned_malloc() with its arguments in the wrong order; fix it. Signed-off-by: Yonggang Luo --- util/oslib-win32.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/util/oslib-win32.c b/util/oslib-win32.c index 83b8e89330..33af8e2506 100644 --- a/util/oslib-win32.c +++ b/util/oslib-win32.c @@ -59,7 +59,7 @@ void *qemu_try_memalign(size_t alignment, size_t size) g_assert(size != 0); g_assert(is_power_of_2(alignment)); -ptr = _aligned_malloc(alignment, size); +ptr = _aligned_malloc(size, alignment); trace_qemu_memalign(alignment, size, ptr); return ptr; } -- 2.29.2.windows.3
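The mix-up fixed here is easy to make because the Windows and POSIX allocators take their arguments in opposite orders: _aligned_malloc(size, alignment) versus posix_memalign(&ptr, alignment, size). A minimal portable wrapper, written only as an illustration (demo_try_memalign and demo_memalign_free are hypothetical names, not QEMU functions), might look like this:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#ifdef _WIN32
#include <malloc.h>
#endif

/* Allocate 'size' bytes aligned to 'alignment' (a power of two), or NULL. */
static void *demo_try_memalign(size_t alignment, size_t size)
{
    void *ptr = NULL;

    assert(size != 0);
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
#ifdef _WIN32
    /* Windows CRT: size comes first, alignment second. */
    ptr = _aligned_malloc(size, alignment);
#else
    /* POSIX: alignment first, and it must be a multiple of sizeof(void *). */
    if (alignment < sizeof(void *)) {
        alignment = sizeof(void *);
    }
    if (posix_memalign(&ptr, alignment, size) != 0) {
        ptr = NULL;
    }
#endif
    return ptr;
}

/* Memory from _aligned_malloc() must be released with _aligned_free(). */
static void demo_memalign_free(void *ptr)
{
#ifdef _WIN32
    _aligned_free(ptr);
#else
    free(ptr);
#endif
}
```

Keeping the wrapper's own parameter order fixed (alignment, size) and doing the swap in exactly one place is what makes a slip like the one in dfbd0b873a8 easy to spot in review.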
Re: [PATCH] tcg: Remove unused tcg_out_dupi_vec() stub
On 2021/01/11 6:32, Philippe Mathieu-Daudé wrote: > On 1/10/21 7:23 PM, Richard Henderson wrote: >> On 1/9/21 6:10 PM, Wataru Ashihara wrote: >>> This fixes the build with --enable-tcg-interpreter: >>> >>> clang -Ilibqemu-arm-softmmu.fa.p -I. -I.. -Itarget/arm -I../target/arm >>> -I../dtc/libfdt -I../capstone/include/capstone -Iqapi -Itrace -Iui >>> -Iui/shader -I/usr/include/pixman-1 -I/usr/include/glib-2.0 >>> -I/usr/lib/x86_64-linux-gnu/glib-2.0/include -Xclang -fcolor-diagnostics >>> -pipe -Wall -Winvalid-pch -Werror -std=gnu99 -g -m64 -mcx16 -D_GNU_SOURCE >>> -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes >>> -Wredundant-decls -Wundef -Wwrite-strings -Wmissing-prototypes >>> -fno-strict-aliasing -fno-common -fwrapv -Wold-style-definition >>> -Wtype-limits -Wformat-security -Wformat-y2k -Winit-self >>> -Wignored-qualifiers -Wempty-body -Wnested-externs -Wendif-labels >>> -Wexpansion-to-defined -Wno-initializer-overrides -Wno-missing-include-dirs >>> -Wno-shift-negative-value -Wno-string-plus-int -Wno-typedef-redefinition >>> -Wno-tautological-type-limit-compare -fstack-protector-strong -isystem >>> /home/wsh/qc/qemu/linux-headers -isystem linux-headers -iquote >>> /home/wsh/qc/qemu/tcg/tci -iquote . -iquote /home/wsh/qc/qemu -iquote >>> /home/wsh/qc/qemu/accel/tcg -iquote /home/wsh/qc/qemu/include -iquote >>> /home/wsh/qc/qemu/disas/libvixl -pthread -fPIC -isystem../linux-headers >>> -isystemlinux-headers -DNEED_CPU_H >>> '-DCONFIG_TARGET="arm-softmmu-config-target.h"' >>> '-DCONFIG_DEVICES="arm-softmmu-config-devices.h"' -MD -MQ >>> libqemu-arm-softmmu.fa.p/tcg_tcg.c.o -MF >>> libqemu-arm-softmmu.fa.p/tcg_tcg.c.o.d -o >>> libqemu-arm-softmmu.fa.p/tcg_tcg.c.o -c ../tcg/tcg.c >>> ../tcg/tcg.c:136:20: error: unused function 'tcg_out_dupi_vec' >>> [-Werror,-Wunused-function] >> >> >> What version of clang? >> With clang 10, I can't even run configure without --disable-werror. 
> > clang version 10.0.1 (Fedora 10.0.1-3.fc32) > > I tested using: > > ../configure '--cc=clang' '--cxx=clang++' \ > '--extra-cflags=-Wunused-function' '--enable-tcg-interpreter' \ > '--disable-tools' '--target-list=arm-softmmu' > $ clang --version clang version 10.0.0-4ubuntu1 Target: x86_64-pc-linux-gnu Thread model: posix InstalledDir: /usr/bin And I configured with: ../configure --prefix=$HOME/opt/qemu-tci --cc=clang --host-cc=clang --cxx=clang++ --enable-debug --enable-tcg-interpreter --enable-debug-tcg --enable-debug-info --enable-debug-mutex
Re: [PATCH] tcg: Remove unused tcg_out_dupi_vec() stub
Philippe, Richard, thank you for reviewing. On 2021/01/11 1:17, Philippe Mathieu-Daudé wrote: > Cc'ing Stefan. > > On 1/10/21 5:10 AM, Wataru Ashihara wrote: >> This fixes the build with --enable-tcg-interpreter: >> >> clang -Ilibqemu-arm-softmmu.fa.p -I. -I.. -Itarget/arm -I../target/arm >> -I../dtc/libfdt -I../capstone/include/capstone -Iqapi -Itrace -Iui >> -Iui/shader -I/usr/include/pixman-1 -I/usr/include/glib-2.0 >> -I/usr/lib/x86_64-linux-gnu/glib-2.0/include -Xclang -fcolor-diagnostics >> -pipe -Wall -Winvalid-pch -Werror -std=gnu99 -g -m64 -mcx16 -D_GNU_SOURCE >> -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes >> -Wredundant-decls -Wundef -Wwrite-strings -Wmissing-prototypes >> -fno-strict-aliasing -fno-common -fwrapv -Wold-style-definition >> -Wtype-limits -Wformat-security -Wformat-y2k -Winit-self >> -Wignored-qualifiers -Wempty-body -Wnested-externs -Wendif-labels >> -Wexpansion-to-defined -Wno-initializer-overrides -Wno-missing-include-dirs >> -Wno-shift-negative-value -Wno-string-plus-int -Wno-typedef-redefinition >> -Wno-tautological-type-limit-compare -fstack-protector-strong -isystem >> /home/wsh/qc/qemu/linux-headers -isystem linux-headers -iquote >> /home/wsh/qc/qemu/tcg/tci -iquote . 
-iquote /home/wsh/qc/qemu -iquote >> /home/wsh/qc/qemu/accel/tcg -iquote /home/wsh/qc/qemu/include -iquote >> /home/wsh/qc/qemu/disas/libvixl -pthread -fPIC -isystem../linux-headers >> -isystemlinux-headers -DNEED_CPU_H >> '-DCONFIG_TARGET="arm-softmmu-config-target.h"' >> '-DCONFIG_DEVICES="arm-softmmu-config-devices.h"' -MD -MQ >> libqemu-arm-softmmu.fa.p/tcg_tcg.c.o -MF >> libqemu-arm-softmmu.fa.p/tcg_tcg.c.o.d -o >> libqemu-arm-softmmu.fa.p/tcg_tcg.c.o -c ../tcg/tcg.c >> ../tcg/tcg.c:136:20: error: unused function 'tcg_out_dupi_vec' >> [-Werror,-Wunused-function] >> >> Signed-off-by: Wataru Ashihara >> --- >> tcg/tcg.c | 7 --- >> 1 file changed, 7 deletions(-) >> >> diff --git a/tcg/tcg.c b/tcg/tcg.c >> index 472bf1755b..32df149b12 100644 >> --- a/tcg/tcg.c >> +++ b/tcg/tcg.c >> @@ -117,8 +117,6 @@ static bool tcg_out_dup_vec(TCGContext *s, TCGType type, >> unsigned vece, >> TCGReg dst, TCGReg src); >> static bool tcg_out_dupm_vec(TCGContext *s, TCGType type, unsigned vece, >> TCGReg dst, TCGReg base, intptr_t offset); >> -static void tcg_out_dupi_vec(TCGContext *s, TCGType type, >> - TCGReg dst, tcg_target_long arg); >> static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, unsigned vecl, >> unsigned vece, const TCGArg *args, >> const int *const_args); >> @@ -133,11 +131,6 @@ static inline bool tcg_out_dupm_vec(TCGContext *s, >> TCGType type, unsigned vece, >> { >> g_assert_not_reached(); >> } >> -static inline void tcg_out_dupi_vec(TCGContext *s, TCGType type, >> -TCGReg dst, tcg_target_long arg) >> -{ >> -g_assert_not_reached(); >> -} >> static inline void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, unsigned >> vecl, >>unsigned vece, const TCGArg *args, >>const int *const_args) > > AFAIK TCI does not support vectors, using them would trigger > tcg_debug_assert(type == TCG_TYPE_I64) in tcg_out_movi(). > > As your approach might break other backends, I'm going to > send an alternate patch using __attribute__((unused)). Currently it doesn't. 
Unlike all the other tcg_out_*() functions, tcg_out_dupi_vec() is not used in tcg.c, as discussed in [1].

> Thanks for reporting this,
>
> Phil.

I am withdrawing this patch in favor of the approach mentioned in [1], which uses the function unconditionally. Thanks.

[1]: https://lists.nongnu.org/archive/html/qemu-devel/2021-01/msg01647.html
Re: [PULL 04/47] util/oslib-win32: Use _aligned_malloc for qemu_try_memalign
On Sun, Jan 10, 2021 at 3:19 PM Volker Rümelin wrote:
>
> > We do not need or want to be allocating page sized quanta.
> >
> > Reviewed-by: Philippe Mathieu-Daudé
> > Reviewed-by: Stefan Weil
> > Message-Id: <20201018164836.1149452-1-richard.hender...@linaro.org>
> > Signed-off-by: Philippe Mathieu-Daudé
> > Signed-off-by: Richard Henderson
> > ---
> >  util/oslib-win32.c | 11 ---
> >  1 file changed, 4 insertions(+), 7 deletions(-)
> >
> > diff --git a/util/oslib-win32.c b/util/oslib-win32.c
> > index 01787df74c..8adc651259 100644
> > --- a/util/oslib-win32.c
> > +++ b/util/oslib-win32.c
> > @@ -39,6 +39,7 @@
> >  #include "trace.h"
> >  #include "qemu/sockets.h"
> >  #include "qemu/cutils.h"
> > +#include
> >
> >  /* this must come after including "trace.h" */
> >  #include
> > @@ -56,10 +57,8 @@ void *qemu_try_memalign(size_t alignment, size_t size)
> >  {
> >      void *ptr;
> >
> > -    if (!size) {
> > -        abort();
> > -    }
> > -    ptr = VirtualAlloc(NULL, size, MEM_COMMIT, PAGE_READWRITE);
> > +    g_assert(size != 0);
> > +    ptr = _aligned_malloc(alignment, size);
>
> Hi Richard,
>
> this doesn't work really well. The _aligned_malloc parameters are swapped.
> ptr = _aligned_malloc(size, alignment) is correct.
>
> With best regards,
> Volker
>
> >      trace_qemu_memalign(alignment, size, ptr);
> >      return ptr;
> >  }
> > @@ -93,9 +92,7 @@ void *qemu_anon_ram_alloc(size_t size, uint64_t *align, bool shared)
> >  void qemu_vfree(void *ptr)
> >  {
> >      trace_qemu_vfree(ptr);
> > -    if (ptr) {
> > -        VirtualFree(ptr, 0, MEM_RELEASE);
> > -    }
> > +    _aligned_free(ptr);
> >  }
> >
> >  void qemu_anon_ram_free(void *ptr, size_t size)
> >

Is this the cause of this failure?
https://cirrus-ci.com/task/6055645751279616?command=test#L593

MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} G_TEST_SRCDIR=C:/Users/ContainerAdministrator/AppData/Local/Temp/cirrus-ci-build/tests G_TEST_BUILDDIR=C:/Users/ContainerAdministrator/AppData/Local/Temp/cirrus-ci-build/build/tests tests/test-qht.exe --tap -k
ERROR test-qht - too few tests run (expected 2, got 0)
make: *** [Makefile.mtest:256: run-test-30] Error 1

--
Yours sincerely,
Yonggang Luo (罗勇刚)
Re: [PATCH v7 1/7] fuzz: accelerate non-crash detection
On 210110 2119, Qiuhao Li wrote:
> We spend much time waiting for the timeout program during the minimization
> process until it passes a time limit. This patch hacks the CLOSED (indicates
> the redirection file closed) notification in QTest's output if it doesn't
> crash.
>
> Test with quadrupled trace input at:
> https://bugs.launchpad.net/qemu/+bug/1890333/comments/1
>
> Original version:
> real 1m37.246s
> user 0m13.069s
> sys 0m8.399s
>
> Refined version:
> real 0m45.904s
> user 0m16.874s
> sys 0m10.042s
>
> Note:
>
> Sometimes the mutated or the same trace may trigger a different crash
> summary (second-to-last line) but indicates the same bug. For example, Bug
> 1910826 [1], which will trigger a stack overflow, may output summaries
> like:
>
> SUMMARY: AddressSanitizer: stack-overflow /home/qiuhao/hack/qemu/build/../softmmu/physmem.c:488 in flatview_do_translate
>
> or
>
> SUMMARY: AddressSanitizer: stack-overflow (/home/qiuhao/hack/qemu/build/qemu-system-i386+0x27ca049) in __asan_memcpy
>
> Etc.
>
> If we use the whole summary line as the token, we may be prevented from
> further minimization. So in this patch, we only use the first three words
> which indicate the type of crash:
>
> SUMMARY: AddressSanitizer: stack-overflow
>
> [1] https://bugs.launchpad.net/qemu/+bug/1910826
>
> Signed-off-by: Qiuhao Li

Reviewed-by: Alexander Bulekov

Thanks

> ---
>  scripts/oss-fuzz/minimize_qtest_trace.py | 43 +---
>  1 file changed, 31 insertions(+), 12 deletions(-)
>
> diff --git a/scripts/oss-fuzz/minimize_qtest_trace.py b/scripts/oss-fuzz/minimize_qtest_trace.py
> index 5e405a0d5f..97f1201747 100755
> --- a/scripts/oss-fuzz/minimize_qtest_trace.py
> +++ b/scripts/oss-fuzz/minimize_qtest_trace.py
> @@ -29,8 +29,14 @@ whether the crash occred.
Optionally, manually set a > string that idenitifes the > crash by setting CRASH_TOKEN= > """.format((sys.argv[0]))) > > +deduplication_note = """\n\ > +Note: While trimming the input, sometimes the mutated trace triggers a > different > +type crash but indicates the same bug. Under this situation, our minimizer is > +incapable of recognizing and stopped from removing it. In the future, we may > +use a more sophisticated crash case deduplication method. > +\n""" > + > def check_if_trace_crashes(trace, path): > -global CRASH_TOKEN > with open(path, "w") as tracefile: > tracefile.write("".join(trace)) > > @@ -41,18 +47,32 @@ def check_if_trace_crashes(trace, path): > trace_path=path), >shell=True, >stdin=subprocess.PIPE, > - stdout=subprocess.PIPE) > -stdo = rc.communicate()[0] > -output = stdo.decode('unicode_escape') > -if rc.returncode == 137:# Timed Out > -return False > -if len(output.splitlines()) < 2: > -return False > - > + stdout=subprocess.PIPE, > + encoding="utf-8") > +global CRASH_TOKEN > if CRASH_TOKEN is None: > -CRASH_TOKEN = output.splitlines()[-2] > +try: > +outs, _ = rc.communicate(timeout=5) > +CRASH_TOKEN = " ".join(outs.splitlines()[-2].split()[0:3]) > +except subprocess.TimeoutExpired: > +print("subprocess.TimeoutExpired") > +return False > +print("Identifying Crashes by this string: {}".format(CRASH_TOKEN)) > +global deduplication_note > +print(deduplication_note) > +return True > > -return CRASH_TOKEN in output > +for line in iter(rc.stdout.readline, b''): > +if "CLOSED" in line: > +return False > +if CRASH_TOKEN in line: > +return True > +# We reach the end of stdout and there is no "CLOSED" or CRASH_TOKEN > +# Usually this is caused by a different type of crash > +if line == "": > +return False > + > +return False > > > def minimize_trace(inpath, outpath): > @@ -66,7 +86,6 @@ def minimize_trace(inpath, outpath): > print("Crashed in {} seconds".format(end-start)) > TIMEOUT = (end-start)*5 > print("Setting the timeout for {} 
seconds".format(TIMEOUT)) > -print("Identifying Crashes by this string: {}".format(CRASH_TOKEN)) > > i = 0 > newtrace = trace[:] > -- > 2.25.1 >
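
For readers following the patch above, here is a minimal sketch of the token derivation it describes, assuming the crash summary is the second-to-last line of the output. The two SUMMARY lines are the ones quoted in the commit message; the lines around them and the helper name crash_token() are hypothetical, added only for illustration, and are not part of minimize_qtest_trace.py.

```python
def crash_token(output):
    """Return the first three words of the second-to-last output line."""
    lines = output.splitlines()
    return " ".join(lines[-2].split()[0:3])


# Illustrative outputs: only the SUMMARY lines come from the commit
# message; the first and last lines are made-up placeholders.
OUT_A = (
    "==1==ERROR: AddressSanitizer: stack-overflow\n"
    "SUMMARY: AddressSanitizer: stack-overflow "
    "/home/qiuhao/hack/qemu/build/../softmmu/physmem.c:488 in "
    "flatview_do_translate\n"
    "==1==ABORTING\n"
)
OUT_B = (
    "==1==ERROR: AddressSanitizer: stack-overflow\n"
    "SUMMARY: AddressSanitizer: stack-overflow "
    "(/home/qiuhao/hack/qemu/build/qemu-system-i386+0x27ca049) in "
    "__asan_memcpy\n"
    "==1==ABORTING\n"
)

# The full summary lines differ, but the three-word tokens match.
print(crash_token(OUT_A))
print(crash_token(OUT_A) == crash_token(OUT_B))
```

Matching on the shorter token trades precision for robustness: two different bugs of the same crash type would also share a token, which is the limitation the deduplication note in the patch warns about.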
[Bug 1910941] [NEW] Assertion `addr < cache->len && 2 <= cache->len - addr' in virtio-blk
Public bug reported: Hello, Using hypervisor fuzzer, hyfuzz, I found an assertion failure through virtio-blk emulator. A malicious guest user/process could use this flaw to abort the QEMU process on the host, resulting in a denial of service. This was found in version 5.2.0 (master) ``` qemu-system-i386: /home/cwmyung/prj/hyfuzz/src/qemu-master/include/exec/memory_ldst_cached.h.inc:88: void address_space_stw_le_cached(MemoryRegionCache *, hwaddr, uint32_t, MemTxAttrs, MemTxResult *): Assertion `addr < cache->len && 2 <= cache->len - addr' failed. [1]1877 abort (core dumped) /home/cwmyung/prj/hyfuzz/src/qemu-master/build/i386-softmmu/qemu-system-i386 Program terminated with signal SIGABRT, Aborted. #0 0x7f71cc171f47 in __GI_raise (sig=sig@entry=0x6) at ../sysdeps/unix/sysv/linux/raise.c:51 #1 0x7f71cc1738b1 in __GI_abort () at abort.c:79 #2 0x7f71cc16342a in __assert_fail_base (fmt=0x7f71cc2eaa38 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x56537b324230 "addr < cache->len && 2 <= cache->len - addr", file=file@entry=0x56537b32425c "/home/cwmyung/prj/hyfuzz/src/qemu-master/include/exec/memory_ldst_cached.h.inc", line=line@entry=0x58, function=function@entry=0x56537b3242ab "void address_space_stw_le_cached(MemoryRegionCache *, hwaddr, uint32_t, MemTxAttrs, MemTxResult *)") at assert.c:92 #3 0x7f71cc1634a2 in __GI___assert_fail (assertion=0x56537b324230 "addr < cache->len && 2 <= cache->len - addr", file=0x56537b32425c "/home/cwmyung/prj/hyfuzz/src/qemu-master/include/exec/memory_ldst_cached.h.inc", line=0x58, function=0x56537b3242ab "void address_space_stw_le_cached(MemoryRegionCache *, hwaddr, uint32_t, MemTxAttrs, MemTxResult *)") at assert.c:101 #4 0x56537af3c917 in address_space_stw_le_cached (attrs=..., result=, cache=, addr=, val=) at /home/cwmyung/prj/hyfuzz/src/qemu-master/include/exec/memory_ldst_cached.h.inc:88 #5 0x56537af3c917 in stw_le_phys_cached (cache=, addr=, val=) at 
/home/cwmyung/prj/hyfuzz/src/qemu-master/include/exec/memory_ldst_phys.h.inc:121 #6 0x56537af3c917 in virtio_stw_phys_cached (vdev=, cache=, pa=, value=) at /home/cwmyung/prj/hyfuzz/src/qemu-master/include/hw/virtio/virtio-access.h:196 #7 0x56537af2b809 in vring_set_avail_event (vq=, val=0x0) at ../hw/virtio/virtio.c:429 #8 0x56537af2b809 in virtio_queue_split_set_notification (vq=, enable=) at ../hw/virtio/virtio.c:438 #9 0x56537af2b809 in virtio_queue_set_notification (vq=, enable=0x1) at ../hw/virtio/virtio.c:499 #10 0x56537b07ce1c in virtio_blk_handle_vq (s=0x56537d6bb3a0, vq=0x56537d6c0680) at ../hw/block/virtio-blk.c:795 #11 0x56537af3eb4d in virtio_queue_notify_aio_vq (vq=0x56537d6c0680) at ../hw/virtio/virtio.c:2326 #12 0x56537af3ba04 in virtio_queue_host_notifier_aio_read (n=) at ../hw/virtio/virtio.c:3533 #13 0x56537b20901c in aio_dispatch_handler (ctx=0x56537c4179f0, node=0x7f71a810b370) at ../util/aio-posix.c:329 #14 0x56537b20838c in aio_dispatch_handlers (ctx=) at ../util/aio-posix.c:372 #15 0x56537b20838c in aio_dispatch (ctx=0x56537c4179f0) at ../util/aio-posix.c:382 #16 0x56537b1f99cb in aio_ctx_dispatch (source=0x2, callback=0x7ffc8add9f90, user_data=0x0) at ../util/async.c:306 #17 0x7f71d1c10417 in g_main_context_dispatch () at /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0 #18 0x56537b1f1bab in glib_pollfds_poll () at ../util/main-loop.c:232 #19 0x56537b1f1bab in os_host_main_loop_wait (timeout=) at ../util/main-loop.c:255 #20 0x56537b1f1bab in main_loop_wait (nonblocking=) at ../util/main-loop.c:531 #21 0x56537af879d7 in qemu_main_loop () at ../softmmu/runstate.c:720 #22 0x56537a928a3b in main (argc=, argc@entry=0x15, argv=, argv@entry=0x7ffc8adda718, envp=) at ../softmmu/main.c:50 #23 0x7f71cc154b97 in __libc_start_main (main=0x56537a928a30 , argc=0x15, argv=0x7ffc8adda718, init=, fini=, rtld_fini=, stack_end=0x7ffc8adda708) at ../csu/libc-start.c:310 #24 0x56537a92894a in _start () ``` To reproduce this issue, please run the QEMU with the 
following command line.

```
# To reproduce this issue, please run the QEMU process with the following command line.
$ qemu-system-i386 -m 512 -drive file=hyfuzz.img,index=0,media=disk,format=raw -device virtio-blk-pci,drive=drive0,id=virtblk0,num-queues=4 -drive file=disk.img,if=none,id=drive0
```

Please let me know if I can provide any further info.

Thank you.

** Affects: qemu
   Importance: Undecided
   Status: New

** Attachment added: "attachment.tar.gz"
   https://bugs.launchpad.net/bugs/1910941/+attachment/5451586/+files/attachment.tar.gz

--
You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1910941

Title:
  Assertion `addr < cache->len && 2 <= cache->len - addr' in virtio-blk

Status in QEMU:
  New

Bug description:
  Hello, Using hypervisor fuzzer, hyfuzz, I found an ass
RE: [PATCH v2 0/7] Fix some memleaks caused by ptimer_init
> -----Original Message-----
> From: Peter Maydell [mailto:peter.mayd...@linaro.org]
> Sent: Friday, January 8, 2021 7:43 PM
> To: ganqixin
> Cc: QEMU Developers; QEMU Trivial; Beniamino Galvani; Antony Pavlov;
> Igor Mitsyanko; sundeep subbaraya; Jan Kiszka; Chenqun (kuhn);
> Zhanghailiang
> Subject: Re: [PATCH v2 0/7] Fix some memleaks caused by ptimer_init
>
> On Thu, 17 Dec 2020 at 11:32, Gan Qixin wrote:
> >
> > v1->v2:
> > Changes suggested by Peter Maydell:
> > Delete the modification of unrelated whitespace.
> >
> > Gan Qixin (7):
> >   allwinner-a10-pit: Use ptimer_free() in the finalize function to avoid memleaks
> >   digic-timer: Use ptimer_free() in the finalize function to avoid memleaks
> >   exynos4210_mct: Use ptimer_free() in the finalize function to avoid memleaks
> >   exynos4210_pwm: Use ptimer_free() in the finalize function to avoid memleaks
> >   exynos4210_rtc: Use ptimer_free() in the finalize function to avoid memleaks
> >   mss-timer: Use ptimer_free() in the finalize function to avoid memleaks
> >   musicpal: Use ptimer_free() in the finalize function to avoid memleaks
>
> Applied to target-arm.next, thanks.
>
> PS: something odd happened with the threading of this series -- the patch
> emails weren't follow-ups to the cover letter -- so the automated tools like
> patchew got confused and thought the series was empty:
> https://patchew.org/QEMU/20201217113137.121607-1-ganqi...@huawei.com/
>
> You might want to look into fixing that for next time you send a patchset.

Thanks for letting me know about the threading problem with these patch emails. I will fix it.
Re: [PATCH v4 1/7] fuzz: accelerate non-crash detection
On Sun, 2021-01-10 at 11:00 -0500, Alexander Bulekov wrote: > On 210110 2110, Qiuhao Li wrote: > > On Wed, 2021-01-06 at 23:18 -0500, Alexander Bulekov wrote: > > > On 201229 1240, Qiuhao Li wrote: > > > > We spend much time waiting for the timeout program during the > > > > minimization > > > > process until it passes a time limit. This patch hacks the > > > > CLOSED > > > > (indicates > > > > the redirection file closed) notification in QTest's output if > > > > it > > > > doesn't > > > > crash. > > > > > > > > Test with quadrupled trace input at: > > > > https://bugs.launchpad.net/qemu/+bug/1890333/comments/1 > > > > > > > > Original version: > > > > real 1m37.246s > > > > user 0m13.069s > > > > sys 0m8.399s > > > > > > > > Refined version: > > > > real 0m45.904s > > > > user 0m16.874s > > > > sys 0m10.042s > > > > > > > > Signed-off-by: Qiuhao Li > > > > --- > > > > scripts/oss-fuzz/minimize_qtest_trace.py | 41 > > > > -- > > > > -- > > > > 1 file changed, 28 insertions(+), 13 deletions(-) > > > > > > > > diff --git a/scripts/oss-fuzz/minimize_qtest_trace.py > > > > b/scripts/oss-fuzz/minimize_qtest_trace.py > > > > index 5e405a0d5f..aa69c7963e 100755 > > > > --- a/scripts/oss-fuzz/minimize_qtest_trace.py > > > > +++ b/scripts/oss-fuzz/minimize_qtest_trace.py > > > > @@ -29,30 +29,46 @@ whether the crash occred. Optionally, > > > > manually > > > > set a string that idenitifes the > > > > crash by setting CRASH_TOKEN= > > > > """.format((sys.argv[0]))) > > > > > > > > +deduplication_note = """\n\ > > > > +Note: While trimming the input, sometimes the mutated trace > > > > triggers a different > > > > +crash output but indicates the same bug. Under this situation, > > > > our > > > > minimizer is > > > > +incapable of recognizing and stopped from removing it. In the > > > > future, we may > > > > +use a more sophisticated crash case deduplication method. 
> > > > +\n"""
> > > > +
> > > >  def check_if_trace_crashes(trace, path):
> > > > -    global CRASH_TOKEN
> > > >      with open(path, "w") as tracefile:
> > > >          tracefile.write("".join(trace))
> > > >
> > > > -    rc = subprocess.Popen("timeout -s 9 {timeout}s {qemu_path} {qemu_args} 2>&1\
> > > > +    proc = subprocess.Popen("timeout {timeout}s {qemu_path} {qemu_args} 2>&1\
> > >
> > > Why remove the -s 9 here? I ran into a case where the minimizer got
> > > stuck on one iteration. Adding back "sigkill" to the timeout can be a
> > > safety net to catch those bad cases.
> > > -Alex
> >
> > Hi Alex,
> >
> > After reviewing this patch again, I think this get-stuck bug may be
> > caused by code:
> >
> > -return CRASH_TOKEN in output
>
> Hi,
> Thanks for fixing this. Strangely, I was able to fix it by swapping
> the b'' for a ' ' when I was stuck on a testcase a few days ago.
> vvv
> > +for line in iter(rc.stdout.readline, b''):
> > +    if "CLOSED" in line:
> > +        return False
> > +    if CRASH_TOKEN in line:
> > +        return True
> >
> > I think your proposed change essentially does the same?
> -Alex

Hi Alex,

It looks like I misused the bytes type. Instead of b'', the str sentinel '' should be used here:

-for line in iter(rc.stdout.readline, b''):
+for line in iter(rc.stdout.readline, ''):

And you are right: if we use iter() with the sentinel '', it does the same as:

+if line == "":
+    return False

But if we only fix the get-stuck bug here, we may fail the assert(check_if_trace_crashes(newtrace, outpath)) check after remove_lines() or clear_bits(), since the same trace input may trigger different output between runs.

My solution: instead of using the whole second-to-last line as the token, we use only its first three words, which indicate the type of crash:

-CRASH_TOKEN = output.splitlines()[-2]
+CRASH_TOKEN = " ".join(outs.splitlines()[-2].split()[0:3])

Example: "SUMMARY: AddressSanitizer: stack-overflow"

This way, we may also get a more trimmed input trace.

>
> > I assumed there are only two end cases in lines of stdout, but while we
> > are trimming the trace input, the crash output (second-to-last line)
> > may change, in which case we will go through the output and fail to
> > find "CLOSED" and CRASH_TOKEN, thus get stuck in the loop above.
> >
> > To fix this bug and get a more trimmed input trace, we can:
> >
> > Use the first three words of the second-to-last line instead of the
> > whole string, which indicate the type of crash as the token.
> >
> > -CRASH_TOKEN = output.splitlines()[-2]
> > +CRASH_TOKEN = " ".join(outs.splitlines()[-2].split()[0:3])
> >
> > If we reach the end of a subprocess' output, return False.
> >
> > +if line == "":
> > +    return False
> >
> > I fix it in [PATCH v7 1/7] and give an example. Could you review again?
> > Thanks :-)
> >
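
The sentinel issue discussed in this thread can be reproduced without QEMU. The sketch below is illustrative only: it uses io.StringIO as a stand-in for a text-mode stdout pipe, the sample trace line is made up, and trace_crashes() and first_items() are hypothetical helpers rather than code from the minimizer. It relies on two facts about Python 3: readline() on a text stream returns the str '' at end of file, and '' never compares equal to b''.

```python
import io
import itertools


def trace_crashes(stdout, crash_token):
    """Scan text-mode output line by line until EOF."""
    # iter(callable, sentinel) stops when the callable returns a value
    # equal to the sentinel; for a text stream that value is ''.
    for line in iter(stdout.readline, ''):
        if "CLOSED" in line:
            return False
        if crash_token in line:
            return True
    # EOF reached without "CLOSED" or the token.
    return False


def first_items(text, sentinel, n=4):
    """Return at most n items that iter() yields for this sentinel."""
    stream = io.StringIO(text)
    return list(itertools.islice(iter(stream.readline, sentinel), n))


TOKEN = "SUMMARY: AddressSanitizer: stack-overflow"

print(trace_crashes(io.StringIO("some qtest line\nCLOSED\n"), TOKEN))
print(trace_crashes(io.StringIO(TOKEN + " in foo\n"), TOKEN))
print(trace_crashes(io.StringIO("a different crash\n"), TOKEN))

# With the str sentinel the iteration ends at EOF; with the bytes
# sentinel it keeps yielding '' (islice bounds the demo).
print(first_items("a\n", ''))
print(first_items("a\n", b''))
```

With the bytes sentinel the loop only ends if one of the in-loop checks returns, which matches the get-stuck behavior described above when neither "CLOSED" nor the token appears.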
[Bug 1658141] Re: QEMU's default msrs handling causes Windows 10 64 bit to crash
The bug is still present so changing the status back to New.

** Changed in: qemu
       Status: Expired => New

--
You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1658141

Title:
  QEMU's default msrs handling causes Windows 10 64 bit to crash

Status in QEMU:
  New

Bug description:
  Wine uses QEMU to run its conformance test suite on Windows virtual machines. Wine's conformance tests check the behavior of various Windows APIs and verify that they behave as expected.

  One such test checks handling of exceptions down. When run on Windows 10 64 bit in QEMU it triggers a "KMOD_EXCEPTION_NOT_HANDLED" BSOD in the VM. See:
  https://bugs.winehq.org/show_bug.cgi?id=40240

  To reproduce this bug:
  * Pick a Windows 10 64 bit VM on an Intel host.
  * Start the VM. I'm pretty sure any qemu command will do but here's what I used:
    qemu-system-x86_64 -machine pc-i440fx-2.1,accel=kvm -cpu core2duo,+nx -m 2048 -hda /var/lib/libvirt/images/wtbw1064.qcow2
  * Grab the attached source code. The tar file is a bit big at 85KB because I had to include some Wine headers. However the source file proper, exception.c, is only 85 lines, including the LGPL header.
  * Compile the source code with MinGW by typing 'make'. This produces a 32 bit exception.exe executable. I'll attach it for good measure.
  * Put exception.exe on the VM and run it.

  After investigation it turns out this happens:
  * Only for Windows 10 64 bit guests. Windows 10 32 bit and older Windows versions are unaffected.
  * Only on Intel hosts. At least both my Xeon E3-1226 v3 and i7-4790K hosts are impacted but not my Opteron 6128 one.
  * It does not seem to depend on the emulated CPU type: on the Intel hosts this happened with both core2duo,nx and 'copy the host configuration' and did not depend on the number of emulated cpus/cores.
  * This happened with both QEMU 2.1 and 2.7, and both the 3.16.0 and 4.8.11 Linux kernels, both on Debian 8.6 and Debian Testing.

  After searching for quite some time I discovered that the kvm kernel module was sneaking the following messages into /var/log/syslog precisely when the BSOD happens:

  Dec 16 13:43:48 vm3 kernel: [ 191.624802] kvm [2064]: vcpu0, guest rIP: 0xf803cb3c0bf3 kvm_set_msr_common: MSR_IA32_DEBUGCTLMSR 0x1, nop
  Dec 16 13:43:48 vm3 kernel: [ 191.624835] kvm [2064]: vcpu0, guest rIP: 0xf803cb3c0c5c unhandled rdmsr: 0x1c9

  A search on the Internet turned up a post suggesting to change kvm's ignore_msrs setting:

  echo 1 >/sys/module/kvm/parameters/ignore_msrs

  https://www.reddit.com/r/VFIO/comments/42dj7n/some_games_crash_to_biosboot_on_launch/

  This does actually work and provides a workaround at least.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1658141/+subscriptions
[Bug 1658141] Re: QEMU's default msrs handling causes Windows 10 64 bit to crash
This bug is still present. However the "ignore_msrs=1" workaround does not work with QEmu 3.1 anymore. To prevent Windows 10 from crashing one must upgrade QEmu to 5.0.14.

--
You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1658141

Title:
  QEMU's default msrs handling causes Windows 10 64 bit to crash

Status in QEMU:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1658141/+subscriptions
Re: [PATCH v2 08/13] vt82c686: Move creation of ISA devices to the ISA bridge
On Mon, Jan 11, 2021, at 3:25 AM, BALATON Zoltan wrote:
> On Sun, 10 Jan 2021, Philippe Mathieu-Daudé wrote:
> > +PCI experts
> >
> > On 1/10/21 1:43 AM, BALATON Zoltan wrote:
> >> On Sun, 10 Jan 2021, Philippe Mathieu-Daudé wrote:

[...]

> > I'm not a PCI expert but my understanding is PCI device functions are
> > restricted to the PCI bus address space. The host bridge may map this
> > space within the host.
> >
> > QEMU might be using get_system_memory() because for some host bridge
> > the mapping is not implemented so it was easier this way?
>
> Maybe, also one less indirection which if not really needed is a good
> thing for performance so unless it's found to be needed to use another
> address space here I'm happy with this as it matches what other similar
> devices do and it seems to work. Maybe a separate address space is only
> really needed if we have an iommu?

Hi Zoltan,

It is possible for bonito to remap the PCI address space, so a separate address space may be essential for bonito.

I appreciate your work. I'm going to help with reviewing as well.

>
> Regards,
> BALATON Zoltan

--
- Jiaxun
Re: [PULL 23/35] hw/intc: Rework Loongson LIOINTC
On Mon, Jan 11, 2021, at 8:36 AM, Huacai Chen wrote:
> I think R_END should be 0x60, Jiaxun, what do you think?

You are right. The manual is misleading.

Thanks.

- Jiaxun

>
> Huacai
>
> On Mon, Jan 11, 2021 at 5:51 AM BALATON Zoltan wrote:
> >
> > On Sun, 10 Jan 2021, Philippe Mathieu-Daudé wrote:
> > > Hi Peter, Huacai,
> > >
> > > On 1/10/21 8:49 PM, Peter Maydell wrote:
> > >> On Sun, 3 Jan 2021 at 21:11, Philippe Mathieu-Daudé wrote:
> > >>>
> > >>> From: Huacai Chen
> > >>>
> > >>> As suggested by Philippe Mathieu-Daudé, rework Loongson's liointc:
> > >>> 1, Move macro definitions to loongson_liointc.h;
> > >>> 2, Remove magic values and use macros instead;
> > >>> 3, Replace dead D() code by trace events.
> > >>>
> > >>> Suggested-by: Philippe Mathieu-Daudé
> > >>> Signed-off-by: Huacai Chen
> > >>> Tested-by: Philippe Mathieu-Daudé
> > >>> Reviewed-by: Philippe Mathieu-Daudé
> > >>> Message-Id: <20201221110538.3186646-2-chenhua...@kernel.org>
> > >>> Signed-off-by: Philippe Mathieu-Daudé
> > >>> ---
> > >>>  include/hw/intc/loongson_liointc.h | 22 ++
> > >>>  hw/intc/loongson_liointc.c | 36 +-
> > >>>  2 files changed, 38 insertions(+), 20 deletions(-)
> > >>>  create mode 100644 include/hw/intc/loongson_liointc.h
> > >>
> > >> Hi; Coverity complains about a possible array overrun
> > >> in this commit:
> > >>
> > >>
> > >>> @@ -40,13 +39,10 @@
> > >>>  #define R_IEN 0x24
> > >>>  #define R_IEN_SET 0x28
> > >>>  #define R_IEN_CLR 0x2c
> > >>> -#define R_PERCORE_ISR(x)(0x40 + 0x8 * x)
> > >>> +#define R_ISR_SIZE 0x8
> > >>> +#define R_START 0x40
> > >>>  #define R_END 0x64
> > >>>
> > >>> -#define TYPE_LOONGSON_LIOINTC "loongson.liointc"
> > >>> -DECLARE_INSTANCE_CHECKER(struct loongson_liointc, LOONGSON_LIOINTC,
> > >>> - TYPE_LOONGSON_LIOINTC)
> > >>> -
> > >>>  struct loongson_liointc {
> > >>>      SysBusDevice parent_obj;
> > >>>
> > >>> @@ -123,14 +119,13 @@ liointc_read(void *opaque, hwaddr addr, unsigned int size)
> > >>>      goto out;
> > >>>  }
> > >>>
> > >>> -/* Rest is 4 byte */
> > >>> +/* Rest are 4 bytes */
> > >>>  if (size != 4 || (addr % 4)) {
> > >>>      goto out;
> > >>>  }
> > >>>
> >
> > Expanding macros in the following:
> >
> > >>> -if (addr >= R_PERCORE_ISR(0) &&
> > >>> -    addr < R_PERCORE_ISR(NUM_CORES)) {
> > >>> -    int core = (addr - R_PERCORE_ISR(0)) / 8;
> >
> > if (addr >= (0x40 + 0x8 * 0) && addr < (0x40 + 0x8 * 4))
> > ->
> > if (addr >= 0x40 && addr < 0x60)
> > int core = (addr - 0x40) / 8;
> >
> >
> > >>> +if (addr >= R_START && addr < R_END) {
> > >>> +    int core = (addr - R_START) / R_ISR_SIZE;
> >
> > if (addr >= 0x40 && addr < 0x64)
> > int core = (addr - 0x40) / 0x8;
> >
> > R_END seems to be off by 4 in the above. Should it be 0x60?
> >
> > Regards,
> > BALATON Zoltan
> >
> > >> R_END is 0x64 and R_START is 0x40, so if addr is 0x60
> > >> then addr - R_START is 0x20 and so core here is 4.
> > >> However p->per_core_isr[] only has 4 entries, so this will
> > >> be off the end of the array.
> > >>
> > >> This is CID 1438965.
> > >>
> > >>>      r = p->per_core_isr[core];
> > >>>      goto out;
> > >>>  }
> > >>
> > >>> -if (addr >= R_PERCORE_ISR(0) &&
> > >>> -    addr < R_PERCORE_ISR(NUM_CORES)) {
> > >>> -    int core = (addr - R_PERCORE_ISR(0)) / 8;
> > >>> +if (addr >= R_START && addr < R_END) {
> > >>> +    int core = (addr - R_START) / R_ISR_SIZE;
> > >>>      p->per_core_isr[core] = value;
> > >>>      goto out;
> > >>>  }
> > >>
> > >> Same thing here, CID 1438967.
> > >
> > > Thanks Peter.
> > >
> > > Huacai, can you have a look please?
> > >
> > > Thanks,
> > >
> > > Phil.
> > >
> >
>

--
- Jiaxun
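
The off-by-one discussed in this thread can be checked with a few lines of arithmetic. This is an illustrative Python sketch, not QEMU code: the constants are taken from the quoted diff, NUM_CORES = 4 follows from the four-entry per_core_isr[] array mentioned above, and core_indices() is a hypothetical helper. Only 4-byte-aligned addresses are considered, since the quoted code rejects the others. Note that 0x60 - 0x40 is 0x20 (32 decimal), and 0x20 / 0x8 is 4.

```python
R_START = 0x40
R_ISR_SIZE = 0x8
NUM_CORES = 4  # per_core_isr[] has 4 entries, valid indices 0..3


def core_indices(r_end):
    """Core indices computed for every aligned addr in [R_START, r_end)."""
    return sorted({(addr - R_START) // R_ISR_SIZE
                   for addr in range(R_START, r_end, 4)})


for r_end in (0x64, 0x60):
    cores = core_indices(r_end)
    overrun = [c for c in cores if c >= NUM_CORES]
    print(hex(r_end), cores, "overrun:", overrun)
```

With R_END = 0x64 the address 0x60 passes the range check and yields index 4, one past the end of the array; with R_END = 0x60 the computed indices stay within 0..3, which matches the bound the old R_PERCORE_ISR(NUM_CORES) expression produced.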
Re: [PATCH 4/5] hw/ppc/ppc4xx_pci: Replace pointless warning by assert()
On Tue, Sep 01, 2020 at 12:40:42PM +0200, Philippe Mathieu-Daudé wrote: > We call pci_register_root_bus() to register 4 IRQs with the > ppc4xx_pci_set_irq() handler. As it can only be called with > values in the [0-4[ range, replace the pointless warning by > an assert(). > > Signed-off-by: Philippe Mathieu-Daudé > --- > hw/ppc/ppc4xx_pci.c | 5 + > 1 file changed, 1 insertion(+), 4 deletions(-) > > diff --git a/hw/ppc/ppc4xx_pci.c b/hw/ppc/ppc4xx_pci.c > index cd3f192a138..503ef46b39a 100644 > --- a/hw/ppc/ppc4xx_pci.c > +++ b/hw/ppc/ppc4xx_pci.c > @@ -256,10 +256,7 @@ static void ppc4xx_pci_set_irq(void *opaque, int > irq_num, int level) > qemu_irq *pci_irqs = opaque; > > trace_ppc4xx_pci_set_irq(irq_num); > -if (irq_num < 0) { > -fprintf(stderr, "%s: PCI irq %d\n", __func__, irq_num); > -return; > -} > +assert(irq_num >= 0); > qemu_set_irq(pci_irqs[irq_num], level); > } > > -- > 2.26.2 > > Hopefully reporting this here is okay, I find Launchpad hard to use but I can file it there if need be. The assertion added by this patch triggers while trying to boot a ppc44x_defconfig Linux kernel: $ qemu-system-ppc \ -machine bamboo \ -no-reboot \ -append console=ttyS0 \ -display none \ -kernel uImage \ -m 128m \ -nodefaults \ -serial mon:stdio Linux version 5.11.0-rc3 (nathan@ubuntu-m3-large-x86) (powerpc-linux-gcc (GCC) 10.2.0, GNU ld (GNU Binutils) 2.35) #1 Sun Jan 10 15:52:24 MST 2021 Using PowerPC 44x Platform machine description ioremap() called early from find_legacy_serial_ports+0x64c/0x794. 
Use early_ioremap() instead printk: bootconsole [udbg0] enabled - phys_mem_size = 0x800 dcache_bsize = 0x20 icache_bsize = 0x20 cpu_features = 0x0100 possible= 0x4100 always = 0x0100 cpu_user_features = 0x8c008000 0x mmu_features = 0x0008 - Zone ranges: Normal [mem 0x-0x07ff] Movable zone start for each node Early memory node ranges node 0: [mem 0x-0x07ff] Initmem setup node 0 [mem 0x-0x07ff] MMU: Allocated 1088 bytes of context maps for 255 contexts Built 1 zonelists, mobility grouping on. Total pages: 32448 Kernel command line: console=ttyS0 Dentry cache hash table entries: 16384 (order: 4, 65536 bytes, linear) Inode-cache hash table entries: 8192 (order: 3, 32768 bytes, linear) mem auto-init: stack:off, heap alloc:off, heap free:off Memory: 122712K/131072K available (5040K kernel code, 236K rwdata, 1260K rodata, 200K init, 134K bss, 8360K reserved, 0K cma-reserved) Kernel virtual memory layout: * 0xffbdf000..0xf000 : fixmap * 0xffbdd000..0xffbdf000 : early ioremap * 0xd100..0xffbdd000 : vmalloc & ioremap SLUB: HWalign=32, Order=0-3, MinObjects=0, CPUs=1, Nodes=1 NR_IRQS: 512, nr_irqs: 512, preallocated irqs: 16 UIC0 (32 IRQ sources) at DCR 0xc0 random: get_random_u32 called from start_kernel+0x370/0x508 with crng_init=0 clocksource: timebase: mask: 0x max_cycles: 0x5c4093a7d1, max_idle_ns: 440795210635 ns clocksource: timebase mult[280] shift[24] registered pid_max: default: 32768 minimum: 301 Mount-cache hash table entries: 1024 (order: 0, 4096 bytes, linear) Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes, linear) clocksource: jiffies: mask: 0x max_cycles: 0x, max_idle_ns: 764504178510 ns futex hash table entries: 256 (order: -1, 3072 bytes, linear) NET: Registered protocol family 16 DMA: preallocated 128 KiB GFP_KERNEL pool for atomic allocations PCI host bridge /plb/pci@ec00 (primary) ranges: MEM 0xa000..0xbfff -> 0xa000 IO 0xe800..0xe800 -> 0x 4xx PCI DMA offset set to 0x 4xx PCI DMA window base to 0x DMA window size 0x8000 PCI: Probing PCI 
hardware PCI host bridge to bus :00 pci_bus :00: root bus resource [io 0x-0x] pci_bus :00: root bus resource [mem 0xa000-0xbfff] pci_bus :00: root bus resource [bus 00-ff] pci_bus :00: busn_res: [bus 00-ff] end is updated to ff pci :00:00.0: [1014:027f] type 00 class 0x068000 qemu-system-ppc: ../hw/ppc/ppc4xx_pci.c:259: ppc4xx_pci_set_irq: Assertion `irq_num >= 0' failed. On v5.2.0, it looks like a higher assertion triggers, added by commit 459ca8bfa4 ("pci: Assert irqnum is between 0 and bus->nirqs in pci_bus_change_irq_level"). qemu-system-ppc: ../hw/pci/pci.c:253: pci_bus_change_irq_level: Assertion `irq_num >= 0' failed. I have uploaded the kernel image here: https://github.com/nathanchance/bug-files/blob/8edf230441bd8eda067973fdf0eb063c94f04379/qemu-0270d74ef886235051c13c39b0de88500c628a02/uImage Cheers, Nathan
Re: [PATCH 4/8] hw/ppc/ppc440_bamboo: Drop use of ppcuic_init()
On Sat, Dec 12, 2020 at 12:15:33AM +, Peter Maydell wrote: > Switch the bamboo board to directly creating and configuring the UIC, > rather than doing it via the old ppcuic_init() helper function. > > Signed-off-by: Peter Maydell > --- > hw/ppc/ppc440_bamboo.c | 38 +++--- > 1 file changed, 27 insertions(+), 11 deletions(-) > > diff --git a/hw/ppc/ppc440_bamboo.c b/hw/ppc/ppc440_bamboo.c > index 665bc1784e1..b156bcb9990 100644 > --- a/hw/ppc/ppc440_bamboo.c > +++ b/hw/ppc/ppc440_bamboo.c > @@ -33,6 +33,9 @@ > #include "sysemu/qtest.h" > #include "sysemu/reset.h" > #include "hw/sysbus.h" > +#include "hw/intc/ppc-uic.h" > +#include "hw/qdev-properties.h" > +#include "qapi/error.h" > > #define BINARY_DEVICE_TREE_FILE "bamboo.dtb" > > @@ -168,13 +171,13 @@ static void bamboo_init(MachineState *machine) > MemoryRegion *ram_memories = g_new(MemoryRegion, > PPC440EP_SDRAM_NR_BANKS); > hwaddr ram_bases[PPC440EP_SDRAM_NR_BANKS]; > hwaddr ram_sizes[PPC440EP_SDRAM_NR_BANKS]; > -qemu_irq *pic; > -qemu_irq *irqs; > PCIBus *pcibus; > PowerPCCPU *cpu; > CPUPPCState *env; > target_long initrd_size = 0; > DeviceState *dev; > +DeviceState *uicdev; > +SysBusDevice *uicsbd; > int success; > int i; > > @@ -192,10 +195,17 @@ static void bamboo_init(MachineState *machine) > ppc_dcr_init(env, NULL, NULL); > > /* interrupt controller */ > -irqs = g_new0(qemu_irq, PPCUIC_OUTPUT_NB); > -irqs[PPCUIC_OUTPUT_INT] = ((qemu_irq > *)env->irq_inputs)[PPC40x_INPUT_INT]; > -irqs[PPCUIC_OUTPUT_CINT] = ((qemu_irq > *)env->irq_inputs)[PPC40x_INPUT_CINT]; > -pic = ppcuic_init(env, irqs, 0x0C0, 0, 1); > +uicdev = qdev_new(TYPE_PPC_UIC); > +uicsbd = SYS_BUS_DEVICE(uicdev); > + > +object_property_set_link(OBJECT(uicdev), "cpu", OBJECT(cpu), > + &error_fatal); > +sysbus_realize_and_unref(uicsbd, &error_fatal); > + > +sysbus_connect_irq(uicsbd, PPCUIC_OUTPUT_INT, > + ((qemu_irq *)env->irq_inputs)[PPC40x_INPUT_INT]); > +sysbus_connect_irq(uicsbd, PPCUIC_OUTPUT_CINT, > + ((qemu_irq 
*)env->irq_inputs)[PPC40x_INPUT_CINT]); > > /* SDRAM controller */ > memset(ram_bases, 0, sizeof(ram_bases)); > @@ -203,14 +213,18 @@ static void bamboo_init(MachineState *machine) > ppc4xx_sdram_banks(machine->ram, PPC440EP_SDRAM_NR_BANKS, ram_memories, > ram_bases, ram_sizes, ppc440ep_sdram_bank_sizes); > /* XXX 440EP's ECC interrupts are on UIC1, but we've only created UIC0. > */ > -ppc4xx_sdram_init(env, pic[14], PPC440EP_SDRAM_NR_BANKS, ram_memories, > +ppc4xx_sdram_init(env, > + qdev_get_gpio_in(uicdev, 14), > + PPC440EP_SDRAM_NR_BANKS, ram_memories, >ram_bases, ram_sizes, 1); > > /* PCI */ > dev = sysbus_create_varargs(TYPE_PPC4xx_PCI_HOST_BRIDGE, > PPC440EP_PCI_CONFIG, > -pic[pci_irq_nrs[0]], pic[pci_irq_nrs[1]], > -pic[pci_irq_nrs[2]], pic[pci_irq_nrs[3]], > +qdev_get_gpio_in(uicdev, pci_irq_nrs[0]), > +qdev_get_gpio_in(uicdev, pci_irq_nrs[1]), > +qdev_get_gpio_in(uicdev, pci_irq_nrs[2]), > +qdev_get_gpio_in(uicdev, pci_irq_nrs[3]), > NULL); > pcibus = (PCIBus *)qdev_get_child_bus(dev, "pci.0"); > if (!pcibus) { > @@ -223,12 +237,14 @@ static void bamboo_init(MachineState *machine) > memory_region_add_subregion(get_system_memory(), PPC440EP_PCI_IO, isa); > > if (serial_hd(0) != NULL) { > -serial_mm_init(address_space_mem, 0xef600300, 0, pic[0], > +serial_mm_init(address_space_mem, 0xef600300, 0, > + qdev_get_gpio_in(uicdev, 0), > PPC_SERIAL_MM_BAUDBASE, serial_hd(0), > DEVICE_BIG_ENDIAN); > } > if (serial_hd(1) != NULL) { > -serial_mm_init(address_space_mem, 0xef600400, 0, pic[1], > +serial_mm_init(address_space_mem, 0xef600400, 0, > + qdev_get_gpio_in(uicdev, 1), > PPC_SERIAL_MM_BAUDBASE, serial_hd(1), > DEVICE_BIG_ENDIAN); > } > -- > 2.20.1 > > Hopefully reporting this here is okay, I find Launchpad hard to use but I can file it there if need be. 
This patch causes a panic while trying to boot a ppc44x_defconfig Linux kernel: $ qemu-system-ppc \ -machine bamboo \ -no-reboot \ -append console=ttyS0 \ -display none \ -kernel uImage \ -m 128m \ -nodefaults \ -serial mon:stdio Linux version 5.11.0-rc3 (nathan@ubuntu-m3-large-x86) (powerpc-linux-gcc (GCC) 10.2.0, GNU ld (GNU Binutils) 2.35) #1 Sun Jan 10
Re: [PULL 23/35] hw/intc: Rework Loongson LIOINTC
I think R_END should be 0x60, Jiaxun, what do you think? Huacai On Mon, Jan 11, 2021 at 5:51 AM BALATON Zoltan wrote: > > On Sun, 10 Jan 2021, Philippe Mathieu-Daudé wrote: > > Hi Peter, Huacai, > > > > On 1/10/21 8:49 PM, Peter Maydell wrote: > >> On Sun, 3 Jan 2021 at 21:11, Philippe Mathieu-Daudé > >> wrote: > >>> > >>> From: Huacai Chen > >>> > >>> As suggested by Philippe Mathieu-Daudé, rework Loongson's liointc: > >>> 1, Move macro definitions to loongson_liointc.h; > >>> 2, Remove magic values and use macros instead; > >>> 3, Replace dead D() code by trace events. > >>> > >>> Suggested-by: Philippe Mathieu-Daudé > >>> Signed-off-by: Huacai Chen > >>> Tested-by: Philippe Mathieu-Daudé > >>> Reviewed-by: Philippe Mathieu-Daudé > >>> Message-Id: <20201221110538.3186646-2-chenhua...@kernel.org> > >>> Signed-off-by: Philippe Mathieu-Daudé > >>> --- > >>> include/hw/intc/loongson_liointc.h | 22 ++ > >>> hw/intc/loongson_liointc.c | 36 +- > >>> 2 files changed, 38 insertions(+), 20 deletions(-) > >>> create mode 100644 include/hw/intc/loongson_liointc.h > >> > >> Hi; Coverity complains about a possible array overrun > >> in this commit: > >> > >> > >>> @@ -40,13 +39,10 @@ > >>> #define R_IEN 0x24 > >>> #define R_IEN_SET 0x28 > >>> #define R_IEN_CLR 0x2c > >>> -#define R_PERCORE_ISR(x)(0x40 + 0x8 * x) > >>> +#define R_ISR_SIZE 0x8 > >>> +#define R_START 0x40 > >>> #define R_END 0x64 > >>> > >>> -#define TYPE_LOONGSON_LIOINTC "loongson.liointc" > >>> -DECLARE_INSTANCE_CHECKER(struct loongson_liointc, LOONGSON_LIOINTC, > >>> - TYPE_LOONGSON_LIOINTC) > >>> - > >>> struct loongson_liointc { > >>> SysBusDevice parent_obj; > >>> > >>> @@ -123,14 +119,13 @@ liointc_read(void *opaque, hwaddr addr, unsigned > >>> int size) > >>> goto out; > >>> } > >>> > >>> -/* Rest is 4 byte */ > >>> +/* Rest are 4 bytes */ > >>> if (size != 4 || (addr % 4)) { > >>> goto out; > >>> } > >>> > > Expanding macros in the following: > > >>> -if (addr >= R_PERCORE_ISR(0) && > >>> -addr < 
R_PERCORE_ISR(NUM_CORES)) { > >>> -int core = (addr - R_PERCORE_ISR(0)) / 8; > > if (addr >= (0x40 + 0x8 * 0) && addr < (0x40 + 0x8 * 4)) > -> > if (addr >= 0x40 && addr < 0x60) > int core = (addr - 0x40) / 8; > > > >>> +if (addr >= R_START && addr < R_END) { > >>> +int core = (addr - R_START) / R_ISR_SIZE; > > if (addr >= 0x40 && addr < 0x64) > int core = (addr - 0x40) / 0x8; > > R_END seems to be off by 4 in the above. Should it be 0x60? > > Regards, > BALATON Zoltan > > >> R_END is 0x64 and R_START is 0x40, so if addr is 0x60 > >> then addr - R_START is 0x32 and so core here is 4. > >> However p->per_core_isr[] only has 4 entries, so this will > >> be off the end of the array. > >> > >> This is CID 1438965. > >> > >>> r = p->per_core_isr[core]; > >>> goto out; > >>> } > >> > >>> -if (addr >= R_PERCORE_ISR(0) && > >>> -addr < R_PERCORE_ISR(NUM_CORES)) { > >>> -int core = (addr - R_PERCORE_ISR(0)) / 8; > >>> +if (addr >= R_START && addr < R_END) { > >>> +int core = (addr - R_START) / R_ISR_SIZE; > >>> p->per_core_isr[core] = value; > >>> goto out; > >>> } > >> > >> Same thing here, CID 1438967. > > > > Thanks Peter. > > > > Huacai, can you have a look please? > > > > Thanks, > > > > Phil. > > > >
[PATCH] util/oslib-win32: Fix _aligned_malloc() arguments order
Commit dfbd0b873a8 inadvertently swapped the arguments of
_aligned_malloc(). Correct the order to fix [*]:

  G_TEST_SRCDIR=C:/Users/ContainerAdministrator/AppData/Local/Temp/cirrus-ci-build/tests G_TEST_BUILDDIR=C:/Users/ContainerAdministrator/AppData/Local/Temp/cirrus-ci-build/build/tests tests/test-qht.exe --tap -k
  ERROR test-qht - too few tests run (expected 2, got 0)
  make: *** [Makefile.mtest:256: run-test-30] Error 1

[*] https://cirrus-ci.com/task/6055645751279616?command=test#L593

Fixes: dfbd0b873a8 ("util/oslib-win32: Use _aligned_malloc for qemu_try_memalign")
Reported-by: Yonggang Luo
Reported-by: Volker Rümelin
Suggested-by: Volker Rümelin
Signed-off-by: Philippe Mathieu-Daudé
---
 util/oslib-win32.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/util/oslib-win32.c b/util/oslib-win32.c
index e6f83e10edb..f68b8012bb8 100644
--- a/util/oslib-win32.c
+++ b/util/oslib-win32.c
@@ -59,7 +59,7 @@ void *qemu_try_memalign(size_t alignment, size_t size)
 
     g_assert(size != 0);
     g_assert(is_power_of_2(alignment));
-    ptr = _aligned_malloc(alignment, size);
+    ptr = _aligned_malloc(size, alignment);
     trace_qemu_memalign(alignment, size, ptr);
     return ptr;
 }
-- 
2.26.2
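Note: the argument-order difference behind this fix can be sketched roughly as follows. This is only an illustration, not QEMU's code: demo_memalign(), demo_free() and demo_is_aligned() are hypothetical names, and the non-Windows branch assumes a POSIX host that provides posix_memalign(). The Windows CRT call takes (size, alignment), while posix_memalign() takes the alignment before the size, which is probably why the two are easy to mix up.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#ifdef _WIN32
#include <malloc.h>
#endif

/* Hypothetical wrapper for illustration; not qemu_try_memalign() itself. */
static void *demo_memalign(size_t alignment, size_t size)
{
#ifdef _WIN32
    /* Windows CRT: size comes first, alignment second. */
    return _aligned_malloc(size, alignment);
#else
    void *ptr = NULL;

    /* POSIX: alignment comes first, size second; 0 means success. */
    if (posix_memalign(&ptr, alignment, size) != 0) {
        return NULL;
    }
    return ptr;
#endif
}

/* Memory from _aligned_malloc() must be released with _aligned_free(). */
static void demo_free(void *ptr)
{
#ifdef _WIN32
    _aligned_free(ptr);
#else
    free(ptr);
#endif
}

/* Returns non-zero when ptr is a multiple of alignment. */
static int demo_is_aligned(const void *ptr, size_t alignment)
{
    return ((uintptr_t)ptr % alignment) == 0;
}
```

With swapped arguments, a request such as demo_memalign(64, 100) would ask the CRT for 64 bytes aligned to 100, which is not a power of two, so the allocation would be expected to fail rather than return usable memory.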
Re: [PULL 04/47] util/oslib-win32: Use _aligned_malloc for qemu_try_memalign
> We do not need or want to be allocating page sized quanta. > > Reviewed-by: Philippe Mathieu-Daudé > Reviewed-by: Stefan Weil > Message-Id: <20201018164836.1149452-1-richard.hender...@linaro.org> > Signed-off-by: Philippe Mathieu-Daudé > Signed-off-by: Richard Henderson > --- > util/oslib-win32.c | 11 --- > 1 file changed, 4 insertions(+), 7 deletions(-) > > diff --git a/util/oslib-win32.c b/util/oslib-win32.c > index 01787df74c..8adc651259 100644 > --- a/util/oslib-win32.c > +++ b/util/oslib-win32.c > @@ -39,6 +39,7 @@ > #include "trace.h" > #include "qemu/sockets.h" > #include "qemu/cutils.h" > +#include > > /* this must come after including "trace.h" */ > #include > @@ -56,10 +57,8 @@ void *qemu_try_memalign(size_t alignment, size_t size) > { > void *ptr; > > -if (!size) { > -abort(); > -} > -ptr = VirtualAlloc(NULL, size, MEM_COMMIT, PAGE_READWRITE); > +g_assert(size != 0); > +ptr = _aligned_malloc(alignment, size); Hi Richard, this doesn't work really well. The _aligned_malloc parameters are swapped. ptr = _aligned_malloc(size, alignment) is correct. With best regards, Volker > trace_qemu_memalign(alignment, size, ptr); > return ptr; > } > @@ -93,9 +92,7 @@ void *qemu_anon_ram_alloc(size_t size, uint64_t *align, > bool shared) > void qemu_vfree(void *ptr) > { > trace_qemu_vfree(ptr); > -if (ptr) { > -VirtualFree(ptr, 0, MEM_RELEASE); > -} > +_aligned_free(ptr); > } > > void qemu_anon_ram_free(void *ptr, size_t size)
Re: [PATCH 00/23] next round of audio patches
> Patchew URL: > https://patchew.org/QEMU/9315afe5-5958-c0b4-ea1e-14769511a...@t-online.de/ > > > > Hi, > > This series seems to have some coding style problems. See output below for > more information: > > Type: series > Message-id: 9315afe5-5958-c0b4-ea1e-14769511a...@t-online.de > Subject: [PATCH 00/23] next round of audio patches > > === TEST SCRIPT BEGIN === > #!/bin/bash > git rev-parse base > /dev/null || exit 0 > git config --local diff.renamelimit 0 > git config --local diff.renames True > git config --local diff.algorithm histogram > ./scripts/checkpatch.pl --mailback base.. > === TEST SCRIPT END === > > Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384 > From https://github.com/patchew-project/qemu > * [new tag] patchew/9315afe5-5958-c0b4-ea1e-14769511a...@t-online.de > -> patchew/9315afe5-5958-c0b4-ea1e-14769511a...@t-online.de > Switched to a new branch 'test' > f5676b9 dsoundaudio: fix log message > 825e3ad dsoundaudio: enable f32 audio sample format > ead5e23 dsoundaudio: rename dsound_open() > 3bf4a8e dsoundaudio: replace GetForegroundWindow() > 98454ba paaudio: send recorded data in smaller chunks > 2f94e48 paaudio: limit minreq to 75% of audio timer_rate > 4fd4d92 paaudio: comment bugs in functions qpa_init_* > 1045ac7 paaudio: remove unneeded code > d5c8eb1 paaudio: wait until the playback stream is ready > 11d1092 paaudio: wait for PA_STREAM_READY in qpa_write() > 6cc0dee paaudio: avoid to clip samples multiple times > d670342 audio: remove remaining unused plive code > 0cc736d sdlaudio: enable (in|out).mixing-engine=off > 61543ec audio: break generic buffer dependency on mixing-engine > 330dfbe sdlaudio: add recording functions > 998b92f audio: split pcm_ops function get_buffer_in > bc84416 sdlaudio: replace legacy functions with modern ones > 8f33798 sdlaudio: fill remaining sample buffer with silence > 78c2474 sdlaudio: always clear the sample buffer > f5ea854 sdlaudio: don't start playback in init routine > 7a1a2df sdlaudio: add -audiodev 
sdl,out.buffer-count option > 6aa7760 audio: fix bit-rotted code > 98cf04b sdlaudio: remove leftover SDL1.2 code > > === OUTPUT BEGIN === > 1/23 Checking commit 98cf04b7c44a (sdlaudio: remove leftover SDL1.2 code) > 2/23 Checking commit 6aa776039e75 (audio: fix bit-rotted code) > 3/23 Checking commit 7a1a2df1c97f (sdlaudio: add -audiodev > sdl,out.buffer-count option) > 4/23 Checking commit f5ea85493feb (sdlaudio: don't start playback in init > routine) > 5/23 Checking commit 78c2474549af (sdlaudio: always clear the sample buffer) > 6/23 Checking commit 8f33798c4ff8 (sdlaudio: fill remaining sample buffer > with silence) > 7/23 Checking commit bc844166b227 (sdlaudio: replace legacy functions with > modern ones) > ERROR: spaces required around that '*' (ctx:WxV) > #133: FILE: audio/sdlaudio.c:247: > +glue(SDLVoice, dir) *sdl = (glue(SDLVoice, dir) *)hw; \ > ^ > > ERROR: spaces required around that '*' (ctx:WxB) > #133: FILE: audio/sdlaudio.c:247: > +glue(SDLVoice, dir) *sdl = (glue(SDLVoice, dir) *)hw; \ > ^ > > total: 2 errors, 0 warnings, 222 lines checked > > Patch 7/23 has style problems, please review. If any of these errors > are false positives report them to the maintainer, see > CHECKPATCH in MAINTAINERS. > > 8/23 Checking commit 998b92fd32ff (audio: split pcm_ops function > get_buffer_in) > 9/23 Checking commit 330dfbed90b4 (sdlaudio: add recording functions) > ERROR: spaces required around that '*' (ctx:WxV) > #86: FILE: audio/sdlaudio.c:306: > +glue(SDLVoice, dir) *sdl = (glue(SDLVoice, dir) *)hw; \ > ^ > > ERROR: spaces required around that '*' (ctx:WxB) > #86: FILE: audio/sdlaudio.c:306: > +glue(SDLVoice, dir) *sdl = (glue(SDLVoice, dir) *)hw; \ > ^ > > total: 2 errors, 0 warnings, 185 lines checked > > Patch 9/23 has style problems, please review. If any of these errors > are false positives report them to the maintainer, see > CHECKPATCH in MAINTAINERS. All errors are false positives. The * isn't a multiplication. 
> > 10/23 Checking commit 61543ec830f5 (audio: break generic buffer dependency on > mixing-engine) > 11/23 Checking commit 0cc736d525a7 (sdlaudio: enable > (in|out).mixing-engine=off) > 12/23 Checking commit d6703422e706 (audio: remove remaining unused plive code) > 13/23 Checking commit 6cc0dee46213 (paaudio: avoid to clip samples multiple > times) > 14/23 Checking commit 11d109269391 (paaudio: wait for PA_STREAM_READY in > qpa_write()) > 15/23 Checking commit d5c8eb16d112 (paaudio: wait until the playback stream > is ready) > 16/23 Checking commit 1045ac7440af (paaudio: remove unneeded code) > 17/23 Checking commit 4fd4d92e5788 (paaudio: comment bugs in functions > qpa_init_*) > 18/23 Checking commit 2f94e489468a (paaudio: limit minreq to 75% of audio > timer_rate) > 19/23 Checking commit 98
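Note: the reason checkpatch misreads the flagged line can be sketched with a small stand-alone example. It assumes glue() is the usual two-level token-pasting macro; the struct layout, the "dir" definition and demo_samples() are made up for illustration. After preprocessing, glue(SDLVoice, dir) is a type name, so the '*' that follows it declares a pointer and is not a multiplication.

```c
#include <assert.h>
#include <stddef.h>

/* Two-level paste so that arguments are macro-expanded before pasting. */
#define xglue(x, y) x ## y
#define glue(x, y) xglue(x, y)

/* Hypothetical stand-in for the real voice structure. */
typedef struct SDLVoiceOut {
    int samples;
} SDLVoiceOut;

#define dir Out

/* glue(SDLVoice, dir) expands to SDLVoiceOut, so this declares a pointer. */
static int demo_samples(void *hw)
{
    glue(SDLVoice, dir) *sdl = (glue(SDLVoice, dir) *)hw;

    return sdl->samples;
}

#undef dir
```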
Re: [PULL 23/35] hw/intc: Rework Loongson LIOINTC
On Sun, 10 Jan 2021, Philippe Mathieu-Daudé wrote: Hi Peter, Huacai, On 1/10/21 8:49 PM, Peter Maydell wrote: On Sun, 3 Jan 2021 at 21:11, Philippe Mathieu-Daudé wrote: From: Huacai Chen As suggested by Philippe Mathieu-Daudé, rework Loongson's liointc: 1, Move macro definitions to loongson_liointc.h; 2, Remove magic values and use macros instead; 3, Replace dead D() code by trace events. Suggested-by: Philippe Mathieu-Daudé Signed-off-by: Huacai Chen Tested-by: Philippe Mathieu-Daudé Reviewed-by: Philippe Mathieu-Daudé Message-Id: <20201221110538.3186646-2-chenhua...@kernel.org> Signed-off-by: Philippe Mathieu-Daudé --- include/hw/intc/loongson_liointc.h | 22 ++ hw/intc/loongson_liointc.c | 36 +- 2 files changed, 38 insertions(+), 20 deletions(-) create mode 100644 include/hw/intc/loongson_liointc.h Hi; Coverity complains about a possible array overrun in this commit: @@ -40,13 +39,10 @@ #define R_IEN 0x24 #define R_IEN_SET 0x28 #define R_IEN_CLR 0x2c -#define R_PERCORE_ISR(x)(0x40 + 0x8 * x) +#define R_ISR_SIZE 0x8 +#define R_START 0x40 #define R_END 0x64 -#define TYPE_LOONGSON_LIOINTC "loongson.liointc" -DECLARE_INSTANCE_CHECKER(struct loongson_liointc, LOONGSON_LIOINTC, - TYPE_LOONGSON_LIOINTC) - struct loongson_liointc { SysBusDevice parent_obj; @@ -123,14 +119,13 @@ liointc_read(void *opaque, hwaddr addr, unsigned int size) goto out; } -/* Rest is 4 byte */ +/* Rest are 4 bytes */ if (size != 4 || (addr % 4)) { goto out; } Expanding macros in the following: -if (addr >= R_PERCORE_ISR(0) && -addr < R_PERCORE_ISR(NUM_CORES)) { -int core = (addr - R_PERCORE_ISR(0)) / 8; if (addr >= (0x40 + 0x8 * 0) && addr < (0x40 + 0x8 * 4)) -> if (addr >= 0x40 && addr < 0x60) int core = (addr - 0x40) / 8; +if (addr >= R_START && addr < R_END) { +int core = (addr - R_START) / R_ISR_SIZE; if (addr >= 0x40 && addr < 0x64) int core = (addr - 0x40) / 0x8; R_END seems to be off by 4 in the above. Should it be 0x60? 
Regards, BALATON Zoltan R_END is 0x64 and R_START is 0x40, so if addr is 0x60 then addr - R_START is 0x32 and so core here is 4. However p->per_core_isr[] only has 4 entries, so this will be off the end of the array. This is CID 1438965. r = p->per_core_isr[core]; goto out; } -if (addr >= R_PERCORE_ISR(0) && -addr < R_PERCORE_ISR(NUM_CORES)) { -int core = (addr - R_PERCORE_ISR(0)) / 8; +if (addr >= R_START && addr < R_END) { +int core = (addr - R_START) / R_ISR_SIZE; p->per_core_isr[core] = value; goto out; } Same thing here, CID 1438967. Thanks Peter. Huacai, can you have a look please? Thanks, Phil.
Re: [PULL 23/35] hw/intc: Rework Loongson LIOINTC
Hi Peter, Huacai, On 1/10/21 8:49 PM, Peter Maydell wrote: > On Sun, 3 Jan 2021 at 21:11, Philippe Mathieu-Daudé wrote: >> >> From: Huacai Chen >> >> As suggested by Philippe Mathieu-Daudé, rework Loongson's liointc: >> 1, Move macro definitions to loongson_liointc.h; >> 2, Remove magic values and use macros instead; >> 3, Replace dead D() code by trace events. >> >> Suggested-by: Philippe Mathieu-Daudé >> Signed-off-by: Huacai Chen >> Tested-by: Philippe Mathieu-Daudé >> Reviewed-by: Philippe Mathieu-Daudé >> Message-Id: <20201221110538.3186646-2-chenhua...@kernel.org> >> Signed-off-by: Philippe Mathieu-Daudé >> --- >> include/hw/intc/loongson_liointc.h | 22 ++ >> hw/intc/loongson_liointc.c | 36 +- >> 2 files changed, 38 insertions(+), 20 deletions(-) >> create mode 100644 include/hw/intc/loongson_liointc.h > > Hi; Coverity complains about a possible array overrun > in this commit: > > >> @@ -40,13 +39,10 @@ >> #define R_IEN 0x24 >> #define R_IEN_SET 0x28 >> #define R_IEN_CLR 0x2c >> -#define R_PERCORE_ISR(x)(0x40 + 0x8 * x) >> +#define R_ISR_SIZE 0x8 >> +#define R_START 0x40 >> #define R_END 0x64 >> >> -#define TYPE_LOONGSON_LIOINTC "loongson.liointc" >> -DECLARE_INSTANCE_CHECKER(struct loongson_liointc, LOONGSON_LIOINTC, >> - TYPE_LOONGSON_LIOINTC) >> - >> struct loongson_liointc { >> SysBusDevice parent_obj; >> >> @@ -123,14 +119,13 @@ liointc_read(void *opaque, hwaddr addr, unsigned int >> size) >> goto out; >> } >> >> -/* Rest is 4 byte */ >> +/* Rest are 4 bytes */ >> if (size != 4 || (addr % 4)) { >> goto out; >> } >> >> -if (addr >= R_PERCORE_ISR(0) && >> -addr < R_PERCORE_ISR(NUM_CORES)) { >> -int core = (addr - R_PERCORE_ISR(0)) / 8; >> +if (addr >= R_START && addr < R_END) { >> +int core = (addr - R_START) / R_ISR_SIZE; > > R_END is 0x64 and R_START is 0x40, so if addr is 0x60 > then addr - R_START is 0x32 and so core here is 4. > However p->per_core_isr[] only has 4 entries, so this will > be off the end of the array. > > This is CID 1438965. 
> >> r = p->per_core_isr[core]; >> goto out; >> } > >> -if (addr >= R_PERCORE_ISR(0) && >> -addr < R_PERCORE_ISR(NUM_CORES)) { >> -int core = (addr - R_PERCORE_ISR(0)) / 8; >> +if (addr >= R_START && addr < R_END) { >> +int core = (addr - R_START) / R_ISR_SIZE; >> p->per_core_isr[core] = value; >> goto out; >> } > > Same thing here, CID 1438967. Thanks Peter. Huacai, can you have a look please? Thanks, Phil.
Re: [PATCH] tcg: Remove unused tcg_out_dupi_vec() stub
On 1/10/21 7:23 PM, Richard Henderson wrote: > On 1/9/21 6:10 PM, Wataru Ashihara wrote: >> This fixes the build with --enable-tcg-interpreter: >> >> clang -Ilibqemu-arm-softmmu.fa.p -I. -I.. -Itarget/arm -I../target/arm >> -I../dtc/libfdt -I../capstone/include/capstone -Iqapi -Itrace -Iui >> -Iui/shader -I/usr/include/pixman-1 -I/usr/include/glib-2.0 >> -I/usr/lib/x86_64-linux-gnu/glib-2.0/include -Xclang -fcolor-diagnostics >> -pipe -Wall -Winvalid-pch -Werror -std=gnu99 -g -m64 -mcx16 -D_GNU_SOURCE >> -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes >> -Wredundant-decls -Wundef -Wwrite-strings -Wmissing-prototypes >> -fno-strict-aliasing -fno-common -fwrapv -Wold-style-definition >> -Wtype-limits -Wformat-security -Wformat-y2k -Winit-self >> -Wignored-qualifiers -Wempty-body -Wnested-externs -Wendif-labels >> -Wexpansion-to-defined -Wno-initializer-overrides -Wno-missing-include-dirs >> -Wno-shift-negative-value -Wno-string-plus-int -Wno-typedef-redefinition >> -Wno-tautological-type-limit-compare -fstack-protector-strong -isystem >> /home/wsh/qc/qemu/linux-headers -isystem linux-headers -iquote >> /home/wsh/qc/qemu/tcg/tci -iquote . -iquote /home/wsh/qc/qemu -iquote >> /home/wsh/qc/qemu/accel/tcg -iquote /home/wsh/qc/qemu/include -iquote >> /home/wsh/qc/qemu/disas/libvixl -pthread -fPIC -isystem../linux-headers >> -isystemlinux-headers -DNEED_CPU_H >> '-DCONFIG_TARGET="arm-softmmu-config-target.h"' >> '-DCONFIG_DEVICES="arm-softmmu-config-devices.h"' -MD -MQ >> libqemu-arm-softmmu.fa.p/tcg_tcg.c.o -MF >> libqemu-arm-softmmu.fa.p/tcg_tcg.c.o.d -o >> libqemu-arm-softmmu.fa.p/tcg_tcg.c.o -c ../tcg/tcg.c >> ../tcg/tcg.c:136:20: error: unused function 'tcg_out_dupi_vec' >> [-Werror,-Wunused-function] > > > What version of clang? > With clang 10, I can't even run configure without --disable-werror. 
clang version 10.0.1 (Fedora 10.0.1-3.fc32)

I tested using:

  ../configure '--cc=clang' '--cxx=clang++' \
    '--extra-cflags=-Wunused-function' '--enable-tcg-interpreter' \
    '--disable-tools' '--target-list=arm-softmmu'
check-tcg HOWTO?
Hi Alex, happy new year,

I am trying to get check-tcg to run reliably, as I am doing some
substantial refactoring of tcg cpu operations, so I need to verify
that TCG is fine.

This is an overall getting-started question: is there a how-to on how
to use check-tcg and how to fix things when things don't go smoothly?

I get different results on different machines for check-tcg, although
the runs are containerized. On one machine the tests for aarch64 tcg
are SKIPPED completely (so no errors); on the other machine I get:

qemu-system-aarch64: terminating on signal 15 from pid 18583 (timeout)
qemu-system-aarch64: terminating on signal 15 from pid 18584 (timeout)
qemu-system-aarch64: terminating on signal 15 from pid 18585 (timeout)
make[2]: *** [../Makefile.target:162: run-hello] Error 124
make[2]: *** Waiting for unfinished jobs
make[2]: *** [../Makefile.target:162: run-pauth-3] Error 124
make[2]: *** [../Makefile.target:162: run-memory] Error 124

Both are configured with

configure --enable-tcg

Is there anything more than V=1 to get more output?
How do I debug and get logs and cores out of containers?

In tests/tcg/ there is:

a README (with no hint, unfortunately),
Makefile.qemu
Makefile.prereqs
Makefile.target

There are a bunch of variables in these files which seem to be
configurable; am I expected to set some of those?

I think that it would be beneficial to have either more documentation
or more immediately actionable information out of make check failures.

Any help you could give me to make some progress?

Thanks,

Claudio
[PATCH v2] hvf: guard xgetbv call.
This prevents an illegal instruction error on CPUs that do not support
xgetbv.

Buglink: https://bugs.launchpad.net/qemu/+bug/1758819
Signed-off-by: Hill Ma
---
 v2: xgetbv() modified based on feedback.

 target/i386/hvf/x86_cpuid.c | 28 +++-
 1 file changed, 19 insertions(+), 9 deletions(-)

diff --git a/target/i386/hvf/x86_cpuid.c b/target/i386/hvf/x86_cpuid.c
index a6842912f5..edaa1b7da2 100644
--- a/target/i386/hvf/x86_cpuid.c
+++ b/target/i386/hvf/x86_cpuid.c
@@ -27,15 +27,22 @@
 #include "vmx.h"
 #include "sysemu/hvf.h"
 
-static uint64_t xgetbv(uint32_t xcr)
+static bool xgetbv(uint32_t cpuid_ecx, uint32_t idx, uint64_t *xcr)
 {
-    uint32_t eax, edx;
+    uint32_t xcrl, xcrh;
 
-    __asm__ volatile ("xgetbv"
-                      : "=a" (eax), "=d" (edx)
-                      : "c" (xcr));
+    if (cpuid_ecx & CPUID_EXT_OSXSAVE) {
+        /*
+         * The xgetbv instruction is not available to older versions of
+         * the assembler, so we encode the instruction manually.
+         */
+        asm(".byte 0x0f, 0x01, 0xd0" : "=a" (xcrl), "=d" (xcrh) : "c" (idx));
 
-    return (((uint64_t)edx) << 32) | eax;
+        *xcr = (((uint64_t)xcrh) << 32) | xcrl;
+        return true;
+    }
+
+    return false;
 }
 
 uint32_t hvf_get_supported_cpuid(uint32_t func, uint32_t idx,
@@ -100,11 +107,14 @@ uint32_t hvf_get_supported_cpuid(uint32_t func, uint32_t idx,
         break;
     case 0xD:
         if (idx == 0) {
-            uint64_t host_xcr0 = xgetbv(0);
-            uint64_t supp_xcr0 = host_xcr0 & (XSTATE_FP_MASK | XSTATE_SSE_MASK |
+            uint64_t supp_xcr0 = XSTATE_FP_MASK | XSTATE_SSE_MASK |
                                   XSTATE_YMM_MASK | XSTATE_BNDREGS_MASK |
                                   XSTATE_BNDCSR_MASK | XSTATE_OPMASK_MASK |
-                                  XSTATE_ZMM_Hi256_MASK | XSTATE_Hi16_ZMM_MASK);
+                                  XSTATE_ZMM_Hi256_MASK | XSTATE_Hi16_ZMM_MASK;
+            uint64_t host_xcr0;
+            if (xgetbv(ecx, 0, &host_xcr0)) {
+                supp_xcr0 &= host_xcr0;
+            }
             eax &= supp_xcr0;
         } else if (idx == 1) {
             hv_vmx_read_capability(HV_VMX_CAP_PROCBASED2, &cap);
-- 
2.20.1 (Apple Git-117)
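Note: the guard described in this patch can be tried outside QEMU with a stand-alone sketch along these lines. It is an illustration under assumptions, not the hvf code: demo_xgetbv() and DEMO_CPUID_OSXSAVE are hypothetical names, the compiler is assumed to be GCC or clang (for <cpuid.h> and inline asm), and the OSXSAVE flag is read from CPUID leaf 1 ECX bit 27 directly instead of being passed in by the caller. On non-x86 hosts the function simply reports that the register is unavailable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* CPUID leaf 1, ECX bit 27: the OS has enabled XSAVE, so xgetbv is usable. */
#define DEMO_CPUID_OSXSAVE (1u << 27)

/* Reads XCR[idx] into *xcr; returns false when xgetbv must not be executed. */
static bool demo_xgetbv(uint32_t idx, uint64_t *xcr)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;
    uint32_t lo, hi;

    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx) ||
        !(ecx & DEMO_CPUID_OSXSAVE)) {
        return false;
    }

    /* Same manual encoding of xgetbv as in the patch above. */
    __asm__ volatile(".byte 0x0f, 0x01, 0xd0"
                     : "=a" (lo), "=d" (hi)
                     : "c" (idx));
    *xcr = ((uint64_t)hi << 32) | lo;
    return true;
#else
    (void)idx;
    (void)xcr;
    return false;
#endif
}
```

A caller would start from the full mask of states it is willing to expose and only narrow it with the host value when demo_xgetbv() returns true, mirroring the supp_xcr0 handling in the second hunk.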
Re: [PATCH 1/2] tcg: Mark more tcg_out*() functions with attribute 'unused'
On 1/10/21 6:51 PM, Richard Henderson wrote: > On 1/10/21 6:27 AM, Philippe Mathieu-Daudé wrote: >> The tcg_out* functions are utility routines that may or >> may not be used by a particular backend. Similarly to commit >> 4196dca63b8, mark them with the 'unused' attribute to suppress >> spurious warnings if they aren't used. >> >> This fixes the build with --enable-tcg-interpreter: >> >> [98/151] Compiling C object libqemu-arm-softmmu.fa.p/tcg_tcg.c.o >> FAILED: libqemu-arm-softmmu.fa.p/tcg_tcg.c.o >> clang [...] -o libqemu-arm-softmmu.fa.p/tcg_tcg.c.o -c ../tcg/tcg.c >> ../tcg/tcg.c:136:20: error: unused function 'tcg_out_dupi_vec' >> [-Werror,-Wunused-function] >> >> Reported-by: Wataru Ashihara >> Signed-off-by: Philippe Mathieu-Daudé >> --- >> tcg/tcg.c | 30 +- >> 1 file changed, 21 insertions(+), 9 deletions(-) > > > This does too much to fix that Werror, as all of the other functions are > unconditionally used. > > Alternately, I'll re-test and merge my tcg constant branch, which will make > tcg_out_dupi_vec also unconditionally used. Then we don't need > __attribute__((unused)) at all. OK, better then. Regards, Phil.
coverity warning about possible missing error check in v9fs_request()
Hi; Coverity has just come up with a new warning (CID 1438968)
about an unchecked error return value in the 9pfs code.
(I'm not sure why now -- the code in question is unchanged
since 2011; probably some other callsites changed enough to
trigger the "other callsites check return value" heuristic.)

Anyway, in the middle of v9fs_request() is this code:

    /* marshal the header details */
    proxy_marshal(iovec, 0, "dd", header.type, header.size);
    header.size += PROXY_HDR_SZ;
    retval = qemu_write_full(proxy->sockfd, iovec->iov_base, header.size);
    if (retval != header.size) {
        goto close_error;
    }

This is apparently the only call to proxy_marshal() that
does not check its return value -- is it missing a check?

thanks
-- PMM
Re: [PULL 22/23] hw/riscv: Use the CPU to determine if 32-bit
On Fri, 18 Dec 2020 at 06:01, Alistair Francis wrote:
>
> Instead of using string compares to determine if a RISC-V machine is
> using 32-bit or 64-bit CPUs we can use the initialised CPUs. This avoids
> us having to maintain a list of CPU names to compare against.
>
> This commit also fixes the name of the function to match the
> riscv_cpu_is_32bit() function.
>
> Signed-off-by: Alistair Francis
> Reviewed-by: Richard Henderson
> Message-id: 8ab7614e5df93ab5267788b73dcd75f9f5615e82.1608142916.git.alistair.fran...@wdc.com

Hi; Coverity points out a probably-unintentional inefficiency
here (CID 1438099, CID 1438100, CID 1438101):

> --- a/hw/riscv/boot.c
> +++ b/hw/riscv/boot.c
> @@ -33,28 +33,16 @@
>
> #include
>
> -bool riscv_is_32_bit(MachineState *machine)
> +bool riscv_is_32bit(RISCVHartArrayState harts)

The RISCVHartArrayState type is 824 bytes long. That's a very
big type to be passing by value. You probably wanted to pass
a pointer to it instead.

Similarly for the arguments to riscv_calc_kernel_start_addr()
and riscv_setup_rom_reset_vec().

thanks
-- PMM
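Note: the by-value versus by-pointer difference can be sketched with a stand-in type. DemoHartArray and both functions are hypothetical; the struct is only padded so that its size equals the 824 bytes quoted above (assuming a 4-byte int) and says nothing about the real RISCVHartArrayState layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in, padded to 824 bytes assuming a 4-byte int. */
typedef struct DemoHartArray {
    unsigned char padding[820];
    int is_32bit;
} DemoHartArray;

/* By value: the whole structure is copied for every call. */
static bool demo_is_32bit_by_value(DemoHartArray harts)
{
    return harts.is_32bit != 0;
}

/* By pointer: only an address is passed, and const documents read-only use. */
static bool demo_is_32bit_by_ptr(const DemoHartArray *harts)
{
    return harts->is_32bit != 0;
}
```

Both functions return the same answer; the pointer form just avoids copying the structure at each call site.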
Re: [PULL 23/35] hw/intc: Rework Loongson LIOINTC
On Sun, 3 Jan 2021 at 21:11, Philippe Mathieu-Daudé wrote:
>
> From: Huacai Chen
>
> As suggested by Philippe Mathieu-Daudé, rework Loongson's liointc:
> 1, Move macro definitions to loongson_liointc.h;
> 2, Remove magic values and use macros instead;
> 3, Replace dead D() code by trace events.
>
> Suggested-by: Philippe Mathieu-Daudé
> Signed-off-by: Huacai Chen
> Tested-by: Philippe Mathieu-Daudé
> Reviewed-by: Philippe Mathieu-Daudé
> Message-Id: <20201221110538.3186646-2-chenhua...@kernel.org>
> Signed-off-by: Philippe Mathieu-Daudé
> ---
> include/hw/intc/loongson_liointc.h | 22 ++
> hw/intc/loongson_liointc.c | 36 +-
> 2 files changed, 38 insertions(+), 20 deletions(-)
> create mode 100644 include/hw/intc/loongson_liointc.h

Hi; Coverity complains about a possible array overrun
in this commit:

> @@ -40,13 +39,10 @@
> #define R_IEN 0x24
> #define R_IEN_SET 0x28
> #define R_IEN_CLR 0x2c
> -#define R_PERCORE_ISR(x)(0x40 + 0x8 * x)
> +#define R_ISR_SIZE 0x8
> +#define R_START 0x40
> #define R_END 0x64
>
> -#define TYPE_LOONGSON_LIOINTC "loongson.liointc"
> -DECLARE_INSTANCE_CHECKER(struct loongson_liointc, LOONGSON_LIOINTC,
> - TYPE_LOONGSON_LIOINTC)
> -
> struct loongson_liointc {
>     SysBusDevice parent_obj;
>
> @@ -123,14 +119,13 @@ liointc_read(void *opaque, hwaddr addr, unsigned int size)
>         goto out;
>     }
>
> -    /* Rest is 4 byte */
> +    /* Rest are 4 bytes */
>     if (size != 4 || (addr % 4)) {
>         goto out;
>     }
>
> -    if (addr >= R_PERCORE_ISR(0) &&
> -        addr < R_PERCORE_ISR(NUM_CORES)) {
> -        int core = (addr - R_PERCORE_ISR(0)) / 8;
> +    if (addr >= R_START && addr < R_END) {
> +        int core = (addr - R_START) / R_ISR_SIZE;

R_END is 0x64 and R_START is 0x40, so if addr is 0x60
then addr - R_START is 0x20 and so core here is 4.
However p->per_core_isr[] only has 4 entries, so this will
be off the end of the array.

This is CID 1438965.

>         r = p->per_core_isr[core];
>         goto out;
>     }

> -    if (addr >= R_PERCORE_ISR(0) &&
> -        addr < R_PERCORE_ISR(NUM_CORES)) {
> -        int core = (addr - R_PERCORE_ISR(0)) / 8;
> +    if (addr >= R_START && addr < R_END) {
> +        int core = (addr - R_START) / R_ISR_SIZE;
>         p->per_core_isr[core] = value;
>         goto out;
>     }

Same thing here, CID 1438967.

thanks
-- PMM
Re: [PATCH v2 08/13] vt82c686: Move creation of ISA devices to the ISA bridge
On Sun, 10 Jan 2021, Philippe Mathieu-Daudé wrote:
+PCI experts

On 1/10/21 1:43 AM, BALATON Zoltan wrote:
On Sun, 10 Jan 2021, Philippe Mathieu-Daudé wrote:
Hi Zoltan,

On 1/9/21 9:16 PM, BALATON Zoltan wrote:
Currently the ISA devices that are part of the VIA south bridge, superio chip are wired up by board code. Move creation of these ISA devices to the VIA ISA bridge model so that board code does not need to access ISA bus. This also allows vt82c686b-superio to be made internal to vt82c686 which allows implementing its configuration via registers in subsequent commits.

Is this patch dependent on the VT82C686B_PM changes or can it be applied before them?

I don't know but why would that be better? I thought it's clearer to clean up pm related parts first before moving more stuff to this file so that's why this patch comes after (and also because that's the order I did it).

Not any better, but easier for me to get your patches integrated, as I'm reviewing your patches slowly. Finding other reviewers would certainly help.

No problem, I'll wait for your review. Merging parts of the series does not help much because the whole series is needed for vt8231 which is prerequisite for pegasos2 so eventually all of these are needed so it does not matter if this one patch gets in earlier or later. Not sure who could help with review. Maybe Jiaxun or Huacai as this is used by fuloong2e so they might be interested and could have info on this chip. Most of these patches are just cleaning up the vt82c686b and adding some missing features so these can be reused by the vt8231 model in last 3 patches (which is very similar to 686b only some reg addresses and ids seem to be different for what we are concerned).
Signed-off-by: BALATON Zoltan --- hw/isa/vt82c686.c | 20 hw/mips/fuloong2e.c | 29 + 2 files changed, 25 insertions(+), 24 deletions(-) diff --git a/hw/isa/vt82c686.c b/hw/isa/vt82c686.c index 58c0bba1d0..5df9be8ff4 100644 --- a/hw/isa/vt82c686.c +++ b/hw/isa/vt82c686.c @@ -16,6 +16,11 @@ #include "hw/qdev-properties.h" #include "hw/isa/isa.h" #include "hw/isa/superio.h" +#include "hw/intc/i8259.h" +#include "hw/irq.h" +#include "hw/dma/i8257.h" +#include "hw/timer/i8254.h" +#include "hw/rtc/mc146818rtc.h" #include "migration/vmstate.h" #include "hw/isa/apm.h" #include "hw/acpi/acpi.h" @@ -307,9 +312,16 @@ OBJECT_DECLARE_SIMPLE_TYPE(VT82C686BISAState, VT82C686B_ISA) struct VT82C686BISAState { PCIDevice dev; + qemu_irq cpu_intr; SuperIOConfig superio_cfg; }; +static void via_isa_request_i8259_irq(void *opaque, int irq, int level) +{ + VT82C686BISAState *s = opaque; + qemu_set_irq(s->cpu_intr, level); +} + static void vt82c686b_write_config(PCIDevice *d, uint32_t addr, uint32_t val, int len) { @@ -365,10 +377,18 @@ static void vt82c686b_realize(PCIDevice *d, Error **errp) VT82C686BISAState *s = VT82C686B_ISA(d); DeviceState *dev = DEVICE(d); ISABus *isa_bus; + qemu_irq *isa_irq; int i; + qdev_init_gpio_out(dev, &s->cpu_intr, 1); Why not use the SysBus API? How? This is a PCIDevice not a SysBusDevice. Indeed :) + isa_irq = qemu_allocate_irqs(via_isa_request_i8259_irq, s, 1); isa_bus = isa_bus_new(dev, get_system_memory(), pci_address_space_io(d), &error_fatal); Isn't it get_system_memory() -> pci_address_space(d)? I don't really know. Most other places that create an isa bus seem to also use get_system_memory(), only piix4 uses pci_address_space(dev) so I thought if those others are OK this should be too. I'm not a PCI expert but my understanding is PCI device functions are restricted to the PCI bus address space. The host bridge may map this space within the host. 
QEMU might be using get_system_memory() because for some host bridge the mapping is not implemented so it was easier this way?

Maybe, also one less indirection which if not really needed is a good thing for performance so unless it's found to be needed to use another address space here I'm happy with this as it matches what other similar devices do and it seems to work. Maybe a separate address space is only really needed if we have an iommu?

Regards,
BALATON Zoltan
Re: [PULL 00/23] target-arm queue
On Sat, Jan 9, 2021 at 1:51 AM Peter Maydell wrote:
>
> On Fri, 8 Jan 2021 at 15:36, Peter Maydell wrote:
> >
> > Nothing too exciting, but does include the last bits of v8.1M support work.
> >
> > -- PMM
> >
> > The following changes since commit e79de63ab1bd1f6550e7b915e433bec1ad1a870a:
> >
> > Merge remote-tracking branch 'remotes/rth-gitlab/tags/pull-tcg-20210107' into staging (2021-01-07 20:34:05 +)
> >
> > are available in the Git repository at:
> >
> > https://git.linaro.org/people/pmaydell/qemu-arm.git tags/pull-target-arm-20210108
> >
> > for you to fetch changes up to c9f8511ea8d2b80723af0fea1f716d752c1b5208:
> >
> > docs/system: arm: Add sabrelite board description (2021-01-08 15:13:39 +)
> >
> >
> > target-arm queue:
> > * intc/arm_gic: Fix gic_irq_signaling_enabled() for vCPUs
> > * target/arm: Fix MTE0_ACTIVE
> > * target/arm: Implement v8.1M and Cortex-M55 model
> > * hw/arm/highbank: Drop dead KVM support code
> > * util/qemu-timer: Make timer_free() imply timer_del()
> > * various devices: Use ptimer_free() in finalize function
> > * docs/system: arm: Add sabrelite board description
> > * sabrelite: Minor fixes to allow booting U-Boot
>
>
> Applied, thanks.
>
> Please update the changelog at https://wiki.qemu.org/ChangeLog/6.0
> for any user-visible changes.
>
> -- PMM
>

This caused a win32 CI failure:
https://cirrus-ci.com/task/6055645751279616?command=test#L593

MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} G_TEST_SRCDIR=C:/Users/ContainerAdministrator/AppData/Local/Temp/cirrus-ci-build/tests G_TEST_BUILDDIR=C:/Users/ContainerAdministrator/AppData/Local/Temp/cirrus-ci-build/build/tests tests/test-qht.exe --tap -k
ERROR test-qht - too few tests run (expected 2, got 0)
make: *** [Makefile.mtest:256: run-test-30] Error 1

--
Yours sincerely,
Yonggang Luo (罗勇刚)
[PATCH v7 6/6] [RISCV_PM] Allow experimental J-ext to be turned on
Signed-off-by: Alexey Baturo --- target/riscv/cpu.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c index 19398977d3..234401c3c6 100644 --- a/target/riscv/cpu.c +++ b/target/riscv/cpu.c @@ -499,6 +499,7 @@ static void riscv_cpu_realize(DeviceState *dev, Error **errp) } if (cpu->cfg.ext_j) { env->mmte |= PM_EXT_INITIAL; +target_misa |= RVJ; } if (cpu->cfg.ext_v) { target_misa |= RVV; @@ -571,6 +572,7 @@ static Property riscv_cpu_properties[] = { DEFINE_PROP_BOOL("u", RISCVCPU, cfg.ext_u, true), /* This is experimental so mark with 'x-' */ DEFINE_PROP_BOOL("x-h", RISCVCPU, cfg.ext_h, false), +DEFINE_PROP_BOOL("x-j", RISCVCPU, cfg.ext_j, false), DEFINE_PROP_BOOL("x-v", RISCVCPU, cfg.ext_v, false), DEFINE_PROP_BOOL("Counters", RISCVCPU, cfg.ext_counters, true), DEFINE_PROP_BOOL("Zifencei", RISCVCPU, cfg.ext_ifencei, true), -- 2.20.1
[PATCH v7 3/6] [RISCV_PM] Print new PM CSRs in QEMU logs
Signed-off-by: Alexey Baturo Reviewed-by: Richard Henderson --- target/riscv/cpu.c | 25 + 1 file changed, 25 insertions(+) diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c index d50f09b757..19398977d3 100644 --- a/target/riscv/cpu.c +++ b/target/riscv/cpu.c @@ -287,6 +287,31 @@ static void riscv_cpu_dump_state(CPUState *cs, FILE *f, int flags) qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "htval ", env->htval); qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "mtval2 ", env->mtval2); } +if (riscv_has_ext(env, RVJ)) { +qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "mmte", env->mmte); +switch (env->priv) { +case PRV_U: +qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "upmbase ", + env->upmbase); +qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "upmmask ", + env->upmmask); +break; +case PRV_S: +qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "spmbase ", + env->spmbase); +qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "spmmask ", + env->spmmask); +break; +case PRV_M: +qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "mpmbase ", + env->mpmbase); +qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "mpmmask ", + env->mpmmask); +break; +default: +g_assert_not_reached(); +} +} #endif for (i = 0; i < 32; i++) { -- 2.20.1
[PATCH v7 2/6] [RISCV_PM] Support CSRs required for RISC-V PM extension except for the ones required for hypervisor mode
Signed-off-by: Alexey Baturo --- target/riscv/cpu.c | 3 + target/riscv/cpu.h | 12 ++ target/riscv/cpu_bits.h | 66 ++ target/riscv/csr.c | 271 4 files changed, 352 insertions(+) diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c index 8227d7aea9..d50f09b757 100644 --- a/target/riscv/cpu.c +++ b/target/riscv/cpu.c @@ -472,6 +472,9 @@ static void riscv_cpu_realize(DeviceState *dev, Error **errp) if (cpu->cfg.ext_h) { target_misa |= RVH; } +if (cpu->cfg.ext_j) { +env->mmte |= PM_EXT_INITIAL; +} if (cpu->cfg.ext_v) { target_misa |= RVV; if (!is_power_of_2(cpu->cfg.vlen)) { diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h index d152842e37..37ea7f7802 100644 --- a/target/riscv/cpu.h +++ b/target/riscv/cpu.h @@ -234,6 +234,18 @@ struct CPURISCVState { /* True if in debugger mode. */ bool debugger; + +/* + * CSRs for PM + * TODO: move these csr to appropriate groups + */ +target_ulong mmte; +target_ulong mpmmask; +target_ulong mpmbase; +target_ulong spmmask; +target_ulong spmbase; +target_ulong upmmask; +target_ulong upmbase; #endif float_status fp_status; diff --git a/target/riscv/cpu_bits.h b/target/riscv/cpu_bits.h index b41e8836c3..c92d0896aa 100644 --- a/target/riscv/cpu_bits.h +++ b/target/riscv/cpu_bits.h @@ -354,6 +354,21 @@ #define CSR_MHPMCOUNTER30H 0xb9e #define CSR_MHPMCOUNTER31H 0xb9f +/* Custom user register */ +#define CSR_UMTE0x8c0 +#define CSR_UPMMASK 0x8c1 +#define CSR_UPMBASE 0x8c2 + +/* Custom machine register */ +#define CSR_MMTE0x7c0 +#define CSR_MPMMASK 0x7c1 +#define CSR_MPMBASE 0x7c2 + +/* Custom supervisor register */ +#define CSR_SMTE0x9c0 +#define CSR_SPMMASK 0x9c1 +#define CSR_SPMBASE 0x9c2 + /* Legacy Machine Protection and Translation (priv v1.9.1) */ #define CSR_MBASE 0x380 #define CSR_MBOUND 0x381 @@ -590,4 +605,55 @@ #define MIE_UTIE (1 << IRQ_U_TIMER) #define MIE_SSIE (1 << IRQ_S_SOFT) #define MIE_USIE (1 << IRQ_U_SOFT) + +/* general mte CSR bits*/ +#define PM_ENABLE 0x0001ULL +#define PM_CURRENT 0x0002ULL +#define PM_XS_MASK 
0x0003ULL + +/* PM XS bits values */ +#define PM_EXT_DISABLE 0xULL +#define PM_EXT_INITIAL 0x0001ULL +#define PM_EXT_CLEAN0x0002ULL +#define PM_EXT_DIRTY0x0003ULL + +/* offsets for every pair of control bits per each priv level */ +#define XS_OFFSET0ULL +#define U_OFFSET 2ULL +#define S_OFFSET 4ULL +#define M_OFFSET 6ULL + +#define PM_XS_BITS (PM_XS_MASK << XS_OFFSET) +#define U_PM_ENABLE (PM_ENABLE << U_OFFSET) +#define U_PM_CURRENT (PM_CURRENT << U_OFFSET) +#define S_PM_ENABLE (PM_ENABLE << S_OFFSET) +#define S_PM_CURRENT (PM_CURRENT << S_OFFSET) +#define M_PM_ENABLE (PM_ENABLE << M_OFFSET) + +/* mmte CSR bits */ +#define MMTE_PM_XS_BITS PM_XS_BITS +#define MMTE_U_PM_ENABLEU_PM_ENABLE +#define MMTE_U_PM_CURRENT U_PM_CURRENT +#define MMTE_S_PM_ENABLES_PM_ENABLE +#define MMTE_S_PM_CURRENT S_PM_CURRENT +#define MMTE_M_PM_ENABLEM_PM_ENABLE +#define MMTE_MASK (MMTE_U_PM_ENABLE | MMTE_U_PM_CURRENT | \ + MMTE_S_PM_ENABLE | MMTE_S_PM_CURRENT | \ + MMTE_M_PM_ENABLE | MMTE_PM_XS_BITS) + +/* smte CSR bits */ +#define SMTE_PM_XS_BITS PM_XS_BITS +#define SMTE_U_PM_ENABLEU_PM_ENABLE +#define SMTE_U_PM_CURRENT U_PM_CURRENT +#define SMTE_S_PM_ENABLES_PM_ENABLE +#define SMTE_S_PM_CURRENT S_PM_CURRENT +#define SMTE_MASK (SMTE_U_PM_ENABLE | SMTE_U_PM_CURRENT | \ + SMTE_S_PM_ENABLE | SMTE_S_PM_CURRENT | \ + SMTE_PM_XS_BITS) + +/* umte CSR bits */ +#define UMTE_U_PM_ENABLEU_PM_ENABLE +#define UMTE_U_PM_CURRENT U_PM_CURRENT +#define UMTE_MASK (UMTE_U_PM_ENABLE | MMTE_U_PM_CURRENT) + #endif diff --git a/target/riscv/csr.c b/target/riscv/csr.c index 10ab82ed1f..28a3eaf18d 100644 --- a/target/riscv/csr.c +++ b/target/riscv/csr.c @@ -192,6 +192,11 @@ static int hmode32(CPURISCVState *env, int csrno) } +static int umode(CPURISCVState *env, int csrno) +{ +return -!riscv_has_ext(env, RVU); +} + static int pmp(CPURISCVState *env, int csrno) { return -!riscv_feature(env, RISCV_FEATURE_PMP); @@ -1270,6 +1275,257 @@ static int write_pmpaddr(CPURISCVState *env, int csrno, target_ulong val) return 
0; } +/* + * Functions to access Pointer Masking feature registers + * We have to check if current priv lvl could modify + * csr in given mode + */ +static int check_pm_current_disabled(CPURISCVState *env, int csrno) +{ +int csr_priv = get_field(csrno, 0xC00); +/* + * If pri
[PATCH v7 5/6] [RISCV_PM] Implement address masking functions required for RISC-V Pointer Masking extension
From: Anatoly Parshintsev Signed-off-by: Anatoly Parshintsev Reviewed-by: Richard Henderson --- target/riscv/cpu.h | 19 +++ target/riscv/translate.c | 34 -- 2 files changed, 51 insertions(+), 2 deletions(-) diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h index 37ea7f7802..b3c63ca5ff 100644 --- a/target/riscv/cpu.h +++ b/target/riscv/cpu.h @@ -397,6 +397,7 @@ FIELD(TB_FLAGS, SEW, 5, 3) FIELD(TB_FLAGS, VILL, 8, 1) /* Is a Hypervisor instruction load/store allowed? */ FIELD(TB_FLAGS, HLSX, 9, 1) +FIELD(TB_FLAGS, PM_ENABLED, 10, 1) bool riscv_cpu_is_32bit(CPURISCVState *env); @@ -454,6 +455,24 @@ static inline void cpu_get_tb_cpu_state(CPURISCVState *env, target_ulong *pc, flags = FIELD_DP32(flags, TB_FLAGS, HLSX, 1); } } +if (riscv_has_ext(env, RVJ)) { +int priv = cpu_mmu_index(env, false); +bool pm_enabled = false; +switch (priv) { +case PRV_U: +pm_enabled = env->mmte & U_PM_ENABLE; +break; +case PRV_S: +pm_enabled = env->mmte & S_PM_ENABLE; +break; +case PRV_M: +pm_enabled = env->mmte & M_PM_ENABLE; +break; +default: +g_assert_not_reached(); +} +flags = FIELD_DP32(flags, TB_FLAGS, PM_ENABLED, pm_enabled); +} #endif *pflags = flags; diff --git a/target/riscv/translate.c b/target/riscv/translate.c index 5da7330f33..980604935d 100644 --- a/target/riscv/translate.c +++ b/target/riscv/translate.c @@ -36,6 +36,9 @@ static TCGv cpu_gpr[32], cpu_pc, cpu_vl; static TCGv_i64 cpu_fpr[32]; /* assume F and D extensions */ static TCGv load_res; static TCGv load_val; +/* globals for PM CSRs */ +static TCGv pm_mask[4]; +static TCGv pm_base[4]; #include "exec/gen-icount.h" @@ -64,6 +67,10 @@ typedef struct DisasContext { uint16_t vlen; uint16_t mlen; bool vl_eq_vlmax; +/* PointerMasking extension */ +bool pm_enabled; +TCGv pm_mask; +TCGv pm_base; } DisasContext; #ifdef TARGET_RISCV64 @@ -103,13 +110,19 @@ static void gen_nanbox_s(TCGv_i64 out, TCGv_i64 in) } /* - * Temp stub: generates address adjustment for PointerMasking + * Generates address adjustment for PointerMasking */ 
static void gen_pm_adjust_address(DisasContext *s, TCGv_i64 dst, TCGv_i64 src) { -tcg_gen_mov_i64(dst, src); +if (!s->pm_enabled) { +/* Load unmodified address */ +tcg_gen_mov_i64(dst, src); +} else { +tcg_gen_andc_i64(dst, src, s->pm_mask); +tcg_gen_or_i64(dst, dst, s->pm_base); +} } /* @@ -828,6 +841,10 @@ static void riscv_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs) ctx->lmul = FIELD_EX32(tb_flags, TB_FLAGS, LMUL); ctx->mlen = 1 << (ctx->sew + 3 - ctx->lmul); ctx->vl_eq_vlmax = FIELD_EX32(tb_flags, TB_FLAGS, VL_EQ_VLMAX); +ctx->pm_enabled = FIELD_EX32(tb_flags, TB_FLAGS, PM_ENABLED); +int priv = cpu_mmu_index(env, false); +ctx->pm_mask = pm_mask[priv]; +ctx->pm_base = pm_base[priv]; } static void riscv_tr_tb_start(DisasContextBase *db, CPUState *cpu) @@ -947,4 +964,17 @@ void riscv_translate_init(void) "load_res"); load_val = tcg_global_mem_new(cpu_env, offsetof(CPURISCVState, load_val), "load_val"); +/* Assign PM CSRs to tcg globals */ +pm_mask[PRV_U] = + tcg_global_mem_new(cpu_env, offsetof(CPURISCVState, upmmask), "upmmask"); +pm_base[PRV_U] = + tcg_global_mem_new(cpu_env, offsetof(CPURISCVState, upmbase), "upmbase"); +pm_mask[PRV_S] = + tcg_global_mem_new(cpu_env, offsetof(CPURISCVState, spmmask), "spmmask"); +pm_base[PRV_S] = + tcg_global_mem_new(cpu_env, offsetof(CPURISCVState, spmbase), "spmbase"); +pm_mask[PRV_M] = + tcg_global_mem_new(cpu_env, offsetof(CPURISCVState, mpmmask), "mpmmask"); +pm_base[PRV_M] = + tcg_global_mem_new(cpu_env, offsetof(CPURISCVState, mpmbase), "mpmbase"); } -- 2.20.1
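For readers less familiar with TCG ops, the address adjustment emitted above (an and-with-complement followed by an or) corresponds to the following plain C computation. This is a sketch for illustration only: pm_adjust_address() and pm_worked_example() are hypothetical names, and the mask and base values are made up rather than taken from the series.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Plain C equivalent of the generated ops:
 *   andc: dst = src & ~mask
 *   or:   dst = dst | base
 * When pointer masking is disabled the address is passed through unchanged.
 */
static uint64_t pm_adjust_address(int pm_enabled, uint64_t addr,
                                  uint64_t mask, uint64_t base)
{
    if (!pm_enabled) {
        return addr;
    }
    return (addr & ~mask) | base;
}

/* Worked example with a made-up mask covering the top byte of the address. */
static void pm_worked_example(void)
{
    const uint64_t mask = 0xFF00000000000000ULL;
    const uint64_t addr = 0xAB00000000001234ULL;

    /* Tag bits in the top byte are cleared, then the base bits are set. */
    assert(pm_adjust_address(1, addr, mask, 0) == 0x1234ULL);
    assert(pm_adjust_address(1, addr, mask, 0x0100000000000000ULL)
           == 0x0100000000001234ULL);
    /* Disabled: the address is untouched. */
    assert(pm_adjust_address(0, addr, mask, 0) == addr);
}
```

In the patch itself the mask and base come from the per-privilege pm_mask[] and pm_base[] TCG globals selected in riscv_tr_init_disas_context().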
[PATCH v7 0/6] RISC-V Pointer Masking implementation
Hi folks,

Sorry it took me almost 3 months to provide the reply and fixes: it was a really busy EOY.
This series addresses @Alistair's suggestion on enabling J-ext.

As for @Richard's comments:
- Indeed, I missed appending Reviewed-by to the approved commits. I've now restored them, except for the fourth commit. @Richard, could you please tell me whether it's still OK to commit it as is, or should I support masking mem ops for RVV first?
- These patches don't have any support for load/store masking for the RVV and RVH extensions, so in particular there is no support for the special Hypervisor loads/stores.

If this patch series is accepted, I think my next steps would be to:
- Support pm for memory operations for RVV
- Add proper csr and support pm for memory operations for Hypervisor mode
- Support address wrapping on unaligned accesses as @Richard mentioned previously

Thanks!

Alexey Baturo (5):
  [RISCV_PM] Add J-extension into RISC-V
  [RISCV_PM] Support CSRs required for RISC-V PM extension except for the ones required for hypervisor mode
  [RISCV_PM] Print new PM CSRs in QEMU logs
  [RISCV_PM] Support pointer masking for RISC-V for i/c/f/d/a types of instructions
  [RISCV_PM] Allow experimental J-ext to be turned on

Anatoly Parshintsev (1):
  [RISCV_PM] Implement address masking functions required for RISC-V Pointer Masking extension

 target/riscv/cpu.c | 30 +++
 target/riscv/cpu.h | 33 +++
 target/riscv/cpu_bits.h | 66 ++
 target/riscv/csr.c | 271
 target/riscv/insn_trans/trans_rva.c.inc | 3 +
 target/riscv/insn_trans/trans_rvd.c.inc | 2 +
 target/riscv/insn_trans/trans_rvf.c.inc | 2 +
 target/riscv/insn_trans/trans_rvi.c.inc | 2 +
 target/riscv/translate.c | 44
 9 files changed, 453 insertions(+)

--
2.20.1
[PATCH v7 1/6] [RISCV_PM] Add J-extension into RISC-V
Signed-off-by: Alexey Baturo Reviewed-by: Richard Henderson --- target/riscv/cpu.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h index 6339e84819..d152842e37 100644 --- a/target/riscv/cpu.h +++ b/target/riscv/cpu.h @@ -72,6 +72,7 @@ #define RVS RV('S') #define RVU RV('U') #define RVH RV('H') +#define RVJ RV('J') /* S extension denotes that Supervisor mode exists, however it is possible to have a core that support S mode but does not have an MMU and there @@ -285,6 +286,7 @@ struct RISCVCPU { bool ext_s; bool ext_u; bool ext_h; +bool ext_j; bool ext_v; bool ext_counters; bool ext_ifencei; -- 2.20.1
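As a side note on what the new RVJ define evaluates to: RISC-V misa extension bits are indexed by letter, with 'A' at bit 0. The sketch below assumes an RV() helper of that shape (the helper's definition is not part of the quoted hunk, so its exact form here is an assumption), and has_ext() and misa_example() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed shape of the helper: one misa bit per extension letter, 'A' = bit 0. */
#define RV(x) ((uint64_t)1 << ((x) - 'A'))

#define RVH RV('H')
#define RVJ RV('J')

/* Hypothetical helper: test whether an extension bit is set in a misa value. */
static int has_ext(uint64_t misa, uint64_t ext)
{
    return (misa & ext) != 0;
}

static void misa_example(void)
{
    assert(RVH == 0x80);    /* 'H' - 'A' = 7 */
    assert(RVJ == 0x200);   /* 'J' - 'A' = 9 */
    assert(has_ext(RVH | RVJ, RVJ));
    assert(!has_ext(RVH, RVJ));
}
```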
[PATCH v7 4/6] [RISCV_PM] Support pointer masking for RISC-V for i/c/f/d/a types of instructions
Signed-off-by: Alexey Baturo --- target/riscv/insn_trans/trans_rva.c.inc | 3 +++ target/riscv/insn_trans/trans_rvd.c.inc | 2 ++ target/riscv/insn_trans/trans_rvf.c.inc | 2 ++ target/riscv/insn_trans/trans_rvi.c.inc | 2 ++ target/riscv/translate.c| 14 ++ 5 files changed, 23 insertions(+) diff --git a/target/riscv/insn_trans/trans_rva.c.inc b/target/riscv/insn_trans/trans_rva.c.inc index be8a9f06dd..5559e347ba 100644 --- a/target/riscv/insn_trans/trans_rva.c.inc +++ b/target/riscv/insn_trans/trans_rva.c.inc @@ -26,6 +26,7 @@ static inline bool gen_lr(DisasContext *ctx, arg_atomic *a, MemOp mop) if (a->rl) { tcg_gen_mb(TCG_MO_ALL | TCG_BAR_STRL); } +gen_pm_adjust_address(ctx, src1, src1); tcg_gen_qemu_ld_tl(load_val, src1, ctx->mem_idx, mop); if (a->aq) { tcg_gen_mb(TCG_MO_ALL | TCG_BAR_LDAQ); @@ -46,6 +47,7 @@ static inline bool gen_sc(DisasContext *ctx, arg_atomic *a, MemOp mop) TCGLabel *l2 = gen_new_label(); gen_get_gpr(src1, a->rs1); +gen_pm_adjust_address(ctx, src1, src1); tcg_gen_brcond_tl(TCG_COND_NE, load_res, src1, l1); gen_get_gpr(src2, a->rs2); @@ -91,6 +93,7 @@ static bool gen_amo(DisasContext *ctx, arg_atomic *a, gen_get_gpr(src1, a->rs1); gen_get_gpr(src2, a->rs2); +gen_pm_adjust_address(ctx, src1, src1); (*func)(src2, src1, src2, ctx->mem_idx, mop); gen_set_gpr(a->rd, src2); diff --git a/target/riscv/insn_trans/trans_rvd.c.inc b/target/riscv/insn_trans/trans_rvd.c.inc index 4f832637fa..935342f66d 100644 --- a/target/riscv/insn_trans/trans_rvd.c.inc +++ b/target/riscv/insn_trans/trans_rvd.c.inc @@ -25,6 +25,7 @@ static bool trans_fld(DisasContext *ctx, arg_fld *a) TCGv t0 = tcg_temp_new(); gen_get_gpr(t0, a->rs1); tcg_gen_addi_tl(t0, t0, a->imm); +gen_pm_adjust_address(ctx, t0, t0); tcg_gen_qemu_ld_i64(cpu_fpr[a->rd], t0, ctx->mem_idx, MO_TEQ); @@ -40,6 +41,7 @@ static bool trans_fsd(DisasContext *ctx, arg_fsd *a) TCGv t0 = tcg_temp_new(); gen_get_gpr(t0, a->rs1); tcg_gen_addi_tl(t0, t0, a->imm); +gen_pm_adjust_address(ctx, t0, t0); 
tcg_gen_qemu_st_i64(cpu_fpr[a->rs2], t0, ctx->mem_idx, MO_TEQ); diff --git a/target/riscv/insn_trans/trans_rvf.c.inc b/target/riscv/insn_trans/trans_rvf.c.inc index 3dfec8211d..04b3c3eb3d 100644 --- a/target/riscv/insn_trans/trans_rvf.c.inc +++ b/target/riscv/insn_trans/trans_rvf.c.inc @@ -30,6 +30,7 @@ static bool trans_flw(DisasContext *ctx, arg_flw *a) TCGv t0 = tcg_temp_new(); gen_get_gpr(t0, a->rs1); tcg_gen_addi_tl(t0, t0, a->imm); +gen_pm_adjust_address(ctx, t0, t0); tcg_gen_qemu_ld_i64(cpu_fpr[a->rd], t0, ctx->mem_idx, MO_TEUL); gen_nanbox_s(cpu_fpr[a->rd], cpu_fpr[a->rd]); @@ -47,6 +48,7 @@ static bool trans_fsw(DisasContext *ctx, arg_fsw *a) gen_get_gpr(t0, a->rs1); tcg_gen_addi_tl(t0, t0, a->imm); +gen_pm_adjust_address(ctx, t0, t0); tcg_gen_qemu_st_i64(cpu_fpr[a->rs2], t0, ctx->mem_idx, MO_TEUL); diff --git a/target/riscv/insn_trans/trans_rvi.c.inc b/target/riscv/insn_trans/trans_rvi.c.inc index d04ca0394c..bee7f6be46 100644 --- a/target/riscv/insn_trans/trans_rvi.c.inc +++ b/target/riscv/insn_trans/trans_rvi.c.inc @@ -141,6 +141,7 @@ static bool gen_load(DisasContext *ctx, arg_lb *a, MemOp memop) TCGv t1 = tcg_temp_new(); gen_get_gpr(t0, a->rs1); tcg_gen_addi_tl(t0, t0, a->imm); +gen_pm_adjust_address(ctx, t0, t0); tcg_gen_qemu_ld_tl(t1, t0, ctx->mem_idx, memop); gen_set_gpr(a->rd, t1); @@ -180,6 +181,7 @@ static bool gen_store(DisasContext *ctx, arg_sb *a, MemOp memop) TCGv dat = tcg_temp_new(); gen_get_gpr(t0, a->rs1); tcg_gen_addi_tl(t0, t0, a->imm); +gen_pm_adjust_address(ctx, t0, t0); gen_get_gpr(dat, a->rs2); tcg_gen_qemu_st_tl(dat, t0, ctx->mem_idx, memop); diff --git a/target/riscv/translate.c b/target/riscv/translate.c index 554d52a4be..5da7330f33 100644 --- a/target/riscv/translate.c +++ b/target/riscv/translate.c @@ -102,6 +102,16 @@ static void gen_nanbox_s(TCGv_i64 out, TCGv_i64 in) tcg_gen_ori_i64(out, in, MAKE_64BIT_MASK(32, 32)); } +/* + * Temp stub: generates address adjustment for PointerMasking + */ +static void 
gen_pm_adjust_address(DisasContext *s, + TCGv_i64 dst, + TCGv_i64 src) +{ +tcg_gen_mov_i64(dst, src); +} + /* * A narrow n-bit operation, where n < FLEN, checks that input operands * are correctly Nan-boxed, i.e., all upper FLEN - n bits are 1. @@ -381,6 +391,7 @@ static void gen_load_c(DisasContext *ctx, uint32_t opc, int rd, int rs1, TCGv t1 = tcg_temp_new(); gen_get_gpr(t0, rs1); tcg_gen_addi_tl(t0, t0, imm); +gen_pm_adjust_address(ctx, t0, t0); int memop = tcg_memop_lookup[(opc >> 12) & 0x7]; if (memop < 0) { @@ -401,6 +412,7 @@ static void gen_store_c(DisasContext *ctx, uint32_t opc,
Re: [PATCH] hvf: guard xgetbv call.
On 1/10/21 8:34 AM, Richard Henderson wrote:
> On 1/9/21 3:46 PM, Roman Bolshakov wrote:
>> +static int xgetbv(uint32_t cpuid_ecx, uint32_t idx, uint64_t *xcr)
>> {
>> -uint32_t eax, edx;
>> +uint32_t xcrl, xcrh;
>>
>> -__asm__ volatile ("xgetbv"
>> - : "=a" (eax), "=d" (edx)
>> - : "c" (xcr));
>> +if (cpuid_ecx && CPUID_EXT_OSXSAVE) {
>> +/* The xgetbv instruction is not available to older versions of
>> + * the assembler, so we encode the instruction manually.
>> + */
>> +asm(".byte 0x0f, 0x01, 0xd0" : "=a" (xcrl), "=d" (xcrh) : "c" (idx));
>>
>> -return (((uint64_t)edx) << 32) | eax;
>> +*xcr = (((uint64_t)xcrh) << 32) | xcrl;
>> +return 0;
>> +}
>> +
>> +return 1;
>> }
>
> Not to bikeshed too much, but this looks like it should return bool, and true
> on success, not the other way around.

Also, if we're going to put this some place common, forcing the caller to do
the cpuid that feeds this, then we should probably make all of the startup
cpuid stuff common as well.

Note that we'd probably have to use constructor priorities to get that right
for util/bufferiszero.c.

r~
Re: [PATCH] hvf: guard xgetbv call.
On 1/9/21 3:46 PM, Roman Bolshakov wrote:
> +static int xgetbv(uint32_t cpuid_ecx, uint32_t idx, uint64_t *xcr)
> {
> -uint32_t eax, edx;
> +uint32_t xcrl, xcrh;
>
> -__asm__ volatile ("xgetbv"
> - : "=a" (eax), "=d" (edx)
> - : "c" (xcr));
> +if (cpuid_ecx && CPUID_EXT_OSXSAVE) {
> +/* The xgetbv instruction is not available to older versions of
> + * the assembler, so we encode the instruction manually.
> + */
> +asm(".byte 0x0f, 0x01, 0xd0" : "=a" (xcrl), "=d" (xcrh) : "c" (idx));
>
> -return (((uint64_t)edx) << 32) | eax;
> +*xcr = (((uint64_t)xcrh) << 32) | xcrl;
> +return 0;
> +}
> +
> +return 1;
> }

Not to bikeshed too much, but this looks like it should return bool, and true
on success, not the other way around.

r~
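A possible shape for the bool-returning variant suggested here is sketched below. It is only an illustration under stated assumptions, not the version that was eventually merged: CPUID_EXT_OSXSAVE is defined locally as bit 27 of CPUID.01H:ECX, the feature test uses a bitwise & (the quoted hunk uses &&, which is true for any non-zero ECX value), and the non-x86 fallback branch is an addition so the sketch builds on other hosts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* CPUID.01H:ECX bit 27: the OS has enabled XSAVE/XGETBV. */
#define CPUID_EXT_OSXSAVE (1U << 27)

/*
 * Returns true and stores the XCR value on success; returns false and
 * leaves *xcr untouched when OSXSAVE is not advertised in cpuid_ecx.
 */
static bool xgetbv(uint32_t cpuid_ecx, uint32_t idx, uint64_t *xcr)
{
    assert(xcr != NULL);

    /* Bitwise test of the feature bit. */
    if (!(cpuid_ecx & CPUID_EXT_OSXSAVE)) {
        return false;
    }

#if defined(__x86_64__) || defined(__i386__)
    uint32_t xcrl, xcrh;

    /* xgetbv encoded by hand, as in the quoted patch, for old assemblers. */
    __asm__ volatile (".byte 0x0f, 0x01, 0xd0"
                      : "=a" (xcrl), "=d" (xcrh)
                      : "c" (idx));
    *xcr = ((uint64_t)xcrh << 32) | xcrl;
    return true;
#else
    /* Added for the sketch only: no xgetbv on non-x86 hosts. */
    (void)idx;
    return false;
#endif
}
```

A caller would then write `if (xgetbv(ecx, 0, &xcr0)) { ... }`, which reads more naturally than testing for a zero return.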
Re: [PATCH] target/i386: Use X86Seg enum for segment registers
On 1/9/21 1:34 PM, Philippe Mathieu-Daudé wrote: > Use the dedicated X86Seg enum type for segment registers. > > Signed-off-by: Philippe Mathieu-Daudé > --- > target/i386/cpu.h| 4 ++-- > target/i386/gdbstub.c| 2 +- > target/i386/tcg/seg_helper.c | 8 > target/i386/tcg/translate.c | 6 +++--- > 4 files changed, 10 insertions(+), 10 deletions(-) Reviewed-by: Richard Henderson r~
Re: [PATCH] tcg: Remove unused tcg_out_dupi_vec() stub
On 1/9/21 6:10 PM, Wataru Ashihara wrote: > This fixes the build with --enable-tcg-interpreter: > > clang -Ilibqemu-arm-softmmu.fa.p -I. -I.. -Itarget/arm -I../target/arm > -I../dtc/libfdt -I../capstone/include/capstone -Iqapi -Itrace -Iui > -Iui/shader -I/usr/include/pixman-1 -I/usr/include/glib-2.0 > -I/usr/lib/x86_64-linux-gnu/glib-2.0/include -Xclang -fcolor-diagnostics > -pipe -Wall -Winvalid-pch -Werror -std=gnu99 -g -m64 -mcx16 -D_GNU_SOURCE > -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes > -Wredundant-decls -Wundef -Wwrite-strings -Wmissing-prototypes > -fno-strict-aliasing -fno-common -fwrapv -Wold-style-definition -Wtype-limits > -Wformat-security -Wformat-y2k -Winit-self -Wignored-qualifiers -Wempty-body > -Wnested-externs -Wendif-labels -Wexpansion-to-defined > -Wno-initializer-overrides -Wno-missing-include-dirs > -Wno-shift-negative-value -Wno-string-plus-int -Wno-typedef-redefinition > -Wno-tautological-type-limit-compare -fstack-protector-strong -isystem > /home/wsh/qc/qemu/linux-headers -isystem linux-headers -iquote > /home/wsh/qc/qemu/tcg/tci -iquote . -iquote /home/wsh/qc/qemu -iquote > /home/wsh/qc/qemu/accel/tcg -iquote /home/wsh/qc/qemu/include -iquote > /home/wsh/qc/qemu/disas/libvixl -pthread -fPIC -isystem../linux-headers > -isystemlinux-headers -DNEED_CPU_H > '-DCONFIG_TARGET="arm-softmmu-config-target.h"' > '-DCONFIG_DEVICES="arm-softmmu-config-devices.h"' -MD -MQ > libqemu-arm-softmmu.fa.p/tcg_tcg.c.o -MF > libqemu-arm-softmmu.fa.p/tcg_tcg.c.o.d -o > libqemu-arm-softmmu.fa.p/tcg_tcg.c.o -c ../tcg/tcg.c > ../tcg/tcg.c:136:20: error: unused function 'tcg_out_dupi_vec' > [-Werror,-Wunused-function] What version of clang? With clang 10, I can't even run configure without --disable-werror. r~
Re: [PATCH v2] target/i386/sev: add support to query the attestation report
Hello Brijesh, On 05/01/2021 18:39, Brijesh Singh wrote: The SEV FW >= 0.23 added a new command that can be used to query the attestation report containing the SHA-256 digest of the guest memory and VMSA encrypted with the LAUNCH_UPDATE and sign it with the PEK. Note, we already have a command (LAUNCH_MEASURE) that can be used to query the SHA-256 digest of the guest memory encrypted through the LAUNCH_UPDATE. The main difference between previous and this command is that the report is signed with the PEK and unlike the LAUNCH_MEASURE command the ATTESATION_REPORT command can be called while the guest is running. Add a QMP interface "query-sev-attestation-report" that can be used to get the report encoded in base64. Cc: James Bottomley Cc: Tom Lendacky Cc: Eric Blake Cc: Paolo Bonzini Cc: k...@vger.kernel.org Signed-off-by: Brijesh Singh --- v2: * add trace event. * fix the goto to return NULL on failure. * make the mnonce as a base64 encoded string linux-headers/linux/kvm.h | 8 + qapi/misc-target.json | 38 ++ target/i386/monitor.c | 6 target/i386/sev-stub.c| 7 + target/i386/sev.c | 66 +++ target/i386/sev_i386.h| 2 ++ target/i386/trace-events | 1 + 7 files changed, 128 insertions(+) diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h index 56ce14ad20..6d0f8101ba 100644 --- a/linux-headers/linux/kvm.h +++ b/linux-headers/linux/kvm.h @@ -1585,6 +1585,8 @@ enum sev_cmd_id { KVM_SEV_DBG_ENCRYPT, /* Guest certificates commands */ KVM_SEV_CERT_EXPORT, + /* Attestation report */ + KVM_SEV_GET_ATTESTATION_REPORT, KVM_SEV_NR_MAX, }; @@ -1637,6 +1639,12 @@ struct kvm_sev_dbg { __u32 len; }; +struct kvm_sev_attestation_report { + __u8 mnonce[16]; + __u64 uaddr; + __u32 len; +}; + #define KVM_DEV_ASSIGN_ENABLE_IOMMU (1 << 0) #define KVM_DEV_ASSIGN_PCI_2_3(1 << 1) #define KVM_DEV_ASSIGN_MASK_INTX (1 << 2) diff --git a/qapi/misc-target.json b/qapi/misc-target.json index 06ef8757f0..5907a2dfaa 100644 --- a/qapi/misc-target.json +++ b/qapi/misc-target.json @@ -285,3 
+285,41 @@ ## { 'command': 'query-gic-capabilities', 'returns': ['GICCapability'], 'if': 'defined(TARGET_ARM)' } + + +## +# @SevAttestationReport: +# +# The struct describes attestation report for a Secure Encrypted Virtualization +# feature. +# +# @data: guest attestation report (base64 encoded) +# +# +# Since: 5.2 +## +{ 'struct': 'SevAttestationReport', + 'data': { 'data': 'str'}, + 'if': 'defined(TARGET_I386)' } + +## +# @query-sev-attestation-report: +# +# This command is used to get the SEV attestation report, and is supported on AMD +# X86 platforms only. +# +# @mnonce: a random 16 bytes value encoded in base64 (it will be included in report) +# +# Returns: SevAttestationReport objects. +# +# Since: 5.3 +# +# Example: +# +# -> { "execute" : "query-sev-attestation-report", "arguments": { "mnonce": "aaa" } } +# <- { "return" : { "data": "bbbd"} } +# +## +{ 'command': 'query-sev-attestation-report', 'data': { 'mnonce': 'str' }, + 'returns': 'SevAttestationReport', + 'if': 'defined(TARGET_I386)' } diff --git a/target/i386/monitor.c b/target/i386/monitor.c index 1bc91442b1..0c8377f900 100644 --- a/target/i386/monitor.c +++ b/target/i386/monitor.c @@ -736,3 +736,9 @@ void qmp_sev_inject_launch_secret(const char *packet_hdr, { sev_inject_launch_secret(packet_hdr, secret, gpa, errp); } + +SevAttestationReport * +qmp_query_sev_attestation_report(const char *mnonce, Error **errp) +{ +return sev_get_attestation_report(mnonce, errp); +} diff --git a/target/i386/sev-stub.c b/target/i386/sev-stub.c index c1fecc2101..cdc9a014ee 100644 --- a/target/i386/sev-stub.c +++ b/target/i386/sev-stub.c @@ -54,3 +54,10 @@ int sev_inject_launch_secret(const char *hdr, const char *secret, { return 1; } + +SevAttestationReport * +sev_get_attestation_report(const char *mnonce, Error **errp) +{ +error_setg(errp, "SEV is not available in this QEMU"); +return NULL; +} diff --git a/target/i386/sev.c b/target/i386/sev.c index 1546606811..d1f90a1d8a 100644 --- a/target/i386/sev.c +++ 
b/target/i386/sev.c @@ -492,6 +492,72 @@ out: return cap; } +SevAttestationReport * +sev_get_attestation_report(const char *mnonce, Error **errp) +{ +struct kvm_sev_attestation_report input = {}; +SevAttestationReport *report = NULL; +SevGuestState *sev = sev_guest; +guchar *data; +guchar *buf; +gsize len; +int err = 0, ret; + +if (!sev_enabled()) { +error_setg(errp, "SEV is not enabled"); +return NULL; +} + +/* lets decode the mnonce string */ +buf = g_base64_decode(mnonce, &len); +if (!buf) { +error_setg(errp, "SEV: failed to decode mnonce input"); +return NULL; +} + +/* verify the input mnonce le
Re: [PATCH 1/2] tcg: Mark more tcg_out*() functions with attribute 'unused'
On 1/10/21 6:27 AM, Philippe Mathieu-Daudé wrote:
> The tcg_out* functions are utility routines that may or
> may not be used by a particular backend. Similarly to commit
> 4196dca63b8, mark them with the 'unused' attribute to suppress
> spurious warnings if they aren't used.
>
> This fixes the build with --enable-tcg-interpreter:
>
> [98/151] Compiling C object libqemu-arm-softmmu.fa.p/tcg_tcg.c.o
> FAILED: libqemu-arm-softmmu.fa.p/tcg_tcg.c.o
> clang [...] -o libqemu-arm-softmmu.fa.p/tcg_tcg.c.o -c ../tcg/tcg.c
> ../tcg/tcg.c:136:20: error: unused function 'tcg_out_dupi_vec'
> [-Werror,-Wunused-function]
>
> Reported-by: Wataru Ashihara
> Signed-off-by: Philippe Mathieu-Daudé
> ---
> tcg/tcg.c | 30 +++++++++++++++++++++---------
> 1 file changed, 21 insertions(+), 9 deletions(-)

This does too much to fix that Werror, as all of the other functions are
unconditionally used.

Alternately, I'll re-test and merge my tcg constant branch, which will make
tcg_out_dupi_vec also unconditionally used. Then we don't need
__attribute__((unused)) at all.

r~
Re: [PATCH v3 0/3] unbreak non-tcg builds
On 10/13/20 4:55 PM, Philippe Mathieu-Daudé wrote:
> On 10/13/20 4:38 PM, Claudio Fontana wrote:
>> This series now unbreaks current non-tcg builds
>> (!CONFIG_TCG).
>>
>> tests Makefiles need to avoid relying on all non-native
>> archs binaries to be present,
>>
>> bios-tables-test needs to skip tests that are tcg-only,
>>
>> and notably the replay framework needs to consider that
>> it might not be functional (or its code present at all)
>> without TCG.
>>
>> Tested ok target x86_64-softmmu on x86_64 host with:
>>
>> ./configure --enable-tcg --disable-kvm
>> ./configure --enable-kvm --disable-tcg
>> ./configure --enable-tcg --enable-kvm
>
> If you want to avoid these configurations to bitrot,
> consider covering them by adding Gitlab jobs :)))
>

Hello Philippe, happy new year,

I am going back to look at the current code in master, slowly trying to get
back a hold on things, and I remember some time ago you suggested to keep
testing "tools" builds, with

./configure --disable-tcg --disable-kvm --enable-tools

but the drawback of the "tools-only" build is that currently one cannot run
make check on it. It fails with ERRORS in bios-tables-test and others.

Is it supposed to actually work? Is there any make check work that can be
done without any accelerator? I would assume so...

Thanks,

Claudio
[RFC PATCH 2/2] gitlab-ci: Add a job building TCI with Clang
Split the current GCC build-tci job in 2, and use Clang
compiler in the new job.

Signed-off-by: Philippe Mathieu-Daudé
---
RFC in case someone have better idea to optimize can respin this patch.

 .gitlab-ci.yml | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index 01c9e46410d..9053161793f 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -397,12 +397,12 @@ build-oss-fuzz:
     # Unrelated to fuzzer: run some tests with -fsanitize=address
     - cd build-oss-fuzz && make check-qtest-i386 check-unit
 
-build-tci:
+build-tci-gcc:
   <<: *native_build_job_definition
   variables:
     IMAGE: fedora
   script:
-    - TARGETS="aarch64 alpha arm hppa m68k microblaze moxie ppc64 s390x x86_64"
+    - TARGETS="aarch64 alpha arm hppa x86_64"
     - mkdir build
     - cd build
     - ../configure --enable-tcg-interpreter
@@ -416,6 +416,24 @@ build-tci:
         ./tests/qtest/cdrom-test || exit 1 ;
       done
     - QTEST_QEMU_BINARY="./qemu-system-x86_64" ./tests/qtest/pxe-test
+
+build-tci-clang:
+  <<: *native_build_job_definition
+  variables:
+    IMAGE: fedora
+  script:
+    - TARGETS="m68k microblaze moxie ppc64 s390x"
+    - mkdir build
+    - cd build
+    - ../configure --enable-tcg-interpreter --cc=clang --cxx=clang++
+        --target-list="$(for tg in $TARGETS; do echo -n ${tg}'-softmmu '; done)" || { cat config.log meson-logs/meson-log.txt && exit 1; }
+    - make -j"$JOBS"
+    - make tests/qtest/boot-serial-test tests/qtest/cdrom-test tests/qtest/pxe-test
+    - for tg in $TARGETS ; do
+        export QTEST_QEMU_BINARY="./qemu-system-${tg}" ;
+        ./tests/qtest/boot-serial-test || exit 1 ;
+        ./tests/qtest/cdrom-test || exit 1 ;
+      done
     - QTEST_QEMU_BINARY="./qemu-system-s390x" ./tests/qtest/pxe-test -m slow
 
 # Alternate coroutines implementations are only really of interest to KVM users
-- 
2.26.2
[PATCH 1/2] tcg: Mark more tcg_out*() functions with attribute 'unused'
The tcg_out* functions are utility routines that may or may not be used by a particular backend. Similarly to commit 4196dca63b8, mark them with the 'unused' attribute to suppress spurious warnings if they aren't used. This fixes the build with --enable-tcg-interpreter: [98/151] Compiling C object libqemu-arm-softmmu.fa.p/tcg_tcg.c.o FAILED: libqemu-arm-softmmu.fa.p/tcg_tcg.c.o clang [...] -o libqemu-arm-softmmu.fa.p/tcg_tcg.c.o -c ../tcg/tcg.c ../tcg/tcg.c:136:20: error: unused function 'tcg_out_dupi_vec' [-Werror,-Wunused-function] Reported-by: Wataru Ashihara Signed-off-by: Philippe Mathieu-Daudé --- tcg/tcg.c | 30 +- 1 file changed, 21 insertions(+), 9 deletions(-) diff --git a/tcg/tcg.c b/tcg/tcg.c index 472bf1755bf..a7fc2043cbf 100644 --- a/tcg/tcg.c +++ b/tcg/tcg.c @@ -123,24 +123,36 @@ static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, unsigned vecl, unsigned vece, const TCGArg *args, const int *const_args); #else -static inline bool tcg_out_dup_vec(TCGContext *s, TCGType type, unsigned vece, - TCGReg dst, TCGReg src) +static __attribute__((unused)) inline bool tcg_out_dup_vec(TCGContext *s, + TCGType type, + unsigned vece, + TCGReg dst, + TCGReg src) { g_assert_not_reached(); } -static inline bool tcg_out_dupm_vec(TCGContext *s, TCGType type, unsigned vece, -TCGReg dst, TCGReg base, intptr_t offset) +static __attribute__((unused)) inline bool tcg_out_dupm_vec(TCGContext *s, +TCGType type, +unsigned vece, +TCGReg dst, +TCGReg base, +intptr_t offset) { g_assert_not_reached(); } -static inline void tcg_out_dupi_vec(TCGContext *s, TCGType type, -TCGReg dst, tcg_target_long arg) +static __attribute__((unused)) inline void tcg_out_dupi_vec(TCGContext *s, +TCGType type, +TCGReg dst, +tcg_target_long arg) { g_assert_not_reached(); } -static inline void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, unsigned vecl, - unsigned vece, const TCGArg *args, - const int *const_args) +static __attribute__((unused)) inline void tcg_out_vec_op(TCGContext *s, + TCGOpcode 
opc, + unsigned vecl, + unsigned vece, + const TCGArg *args, + const int *const_args) { g_assert_not_reached(); } -- 2.26.2
[PATCH 0/2] tcg/tci: Fix Clang build
Fix the build failure reported by Wataru Ashihara on [*]
and add a CI test to catch future problems.

[*] https://www.mail-archive.com/qemu-devel@nongnu.org/msg771326.html

Philippe Mathieu-Daudé (2):
  tcg: Mark more tcg_out*() functions with attribute 'unused'
  gitlab-ci: Add a job building TCI with Clang

 tcg/tcg.c      | 30 +++++++++++++++++++++---------
 .gitlab-ci.yml | 22 ++++++++++++++++++++--
 2 files changed, 41 insertions(+), 11 deletions(-)

-- 
2.26.2
Re: [PATCH] tcg: Remove unused tcg_out_dupi_vec() stub
Cc'ing Stefan. On 1/10/21 5:10 AM, Wataru Ashihara wrote: > This fixes the build with --enable-tcg-interpreter: > > clang -Ilibqemu-arm-softmmu.fa.p -I. -I.. -Itarget/arm -I../target/arm > -I../dtc/libfdt -I../capstone/include/capstone -Iqapi -Itrace -Iui > -Iui/shader -I/usr/include/pixman-1 -I/usr/include/glib-2.0 > -I/usr/lib/x86_64-linux-gnu/glib-2.0/include -Xclang -fcolor-diagnostics > -pipe -Wall -Winvalid-pch -Werror -std=gnu99 -g -m64 -mcx16 -D_GNU_SOURCE > -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -Wstrict-prototypes > -Wredundant-decls -Wundef -Wwrite-strings -Wmissing-prototypes > -fno-strict-aliasing -fno-common -fwrapv -Wold-style-definition -Wtype-limits > -Wformat-security -Wformat-y2k -Winit-self -Wignored-qualifiers -Wempty-body > -Wnested-externs -Wendif-labels -Wexpansion-to-defined > -Wno-initializer-overrides -Wno-missing-include-dirs > -Wno-shift-negative-value -Wno-string-plus-int -Wno-typedef-redefinition > -Wno-tautological-type-limit-compare -fstack-protector-strong -isystem > /home/wsh/qc/qemu/linux-headers -isystem linux-headers -iquote > /home/wsh/qc/qemu/tcg/tci -iquote . 
-iquote /home/wsh/qc/qemu -iquote > /home/wsh/qc/qemu/accel/tcg -iquote /home/wsh/qc/qemu/include -iquote > /home/wsh/qc/qemu/disas/libvixl -pthread -fPIC -isystem../linux-headers > -isystemlinux-headers -DNEED_CPU_H > '-DCONFIG_TARGET="arm-softmmu-config-target.h"' > '-DCONFIG_DEVICES="arm-softmmu-config-devices.h"' -MD -MQ > libqemu-arm-softmmu.fa.p/tcg_tcg.c.o -MF > libqemu-arm-softmmu.fa.p/tcg_tcg.c.o.d -o > libqemu-arm-softmmu.fa.p/tcg_tcg.c.o -c ../tcg/tcg.c > ../tcg/tcg.c:136:20: error: unused function 'tcg_out_dupi_vec' > [-Werror,-Wunused-function] > > Signed-off-by: Wataru Ashihara > --- > tcg/tcg.c | 7 --- > 1 file changed, 7 deletions(-) > > diff --git a/tcg/tcg.c b/tcg/tcg.c > index 472bf1755b..32df149b12 100644 > --- a/tcg/tcg.c > +++ b/tcg/tcg.c > @@ -117,8 +117,6 @@ static bool tcg_out_dup_vec(TCGContext *s, TCGType type, > unsigned vece, > TCGReg dst, TCGReg src); > static bool tcg_out_dupm_vec(TCGContext *s, TCGType type, unsigned vece, > TCGReg dst, TCGReg base, intptr_t offset); > -static void tcg_out_dupi_vec(TCGContext *s, TCGType type, > - TCGReg dst, tcg_target_long arg); > static void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, unsigned vecl, > unsigned vece, const TCGArg *args, > const int *const_args); > @@ -133,11 +131,6 @@ static inline bool tcg_out_dupm_vec(TCGContext *s, > TCGType type, unsigned vece, > { > g_assert_not_reached(); > } > -static inline void tcg_out_dupi_vec(TCGContext *s, TCGType type, > -TCGReg dst, tcg_target_long arg) > -{ > -g_assert_not_reached(); > -} > static inline void tcg_out_vec_op(TCGContext *s, TCGOpcode opc, unsigned > vecl, >unsigned vece, const TCGArg *args, >const int *const_args) AFAIK TCI does not support vectors, using them would trigger tcg_debug_assert(type == TCG_TYPE_I64) in tcg_out_movi(). As your approach might break other backends, I'm going to send an alternate patch using __attribute__((unused)). Thanks for reporting this, Phil.
Re: [PATCH v4 1/7] fuzz: accelerate non-crash detection
On 210110 2110, Qiuhao Li wrote: > On Wed, 2021-01-06 at 23:18 -0500, Alexander Bulekov wrote: > > On 201229 1240, Qiuhao Li wrote: > > > We spend much time waiting for the timeout program during the > > > minimization > > > process until it passes a time limit. This patch hacks the CLOSED > > > (indicates > > > the redirection file closed) notification in QTest's output if it > > > doesn't > > > crash. > > > > > > Test with quadrupled trace input at: > > > https://bugs.launchpad.net/qemu/+bug/1890333/comments/1 > > > > > > Original version: > > > real1m37.246s > > > user0m13.069s > > > sys 0m8.399s > > > > > > Refined version: > > > real0m45.904s > > > user0m16.874s > > > sys 0m10.042s > > > > > > Signed-off-by: Qiuhao Li > > > --- > > > scripts/oss-fuzz/minimize_qtest_trace.py | 41 -- > > > -- > > > 1 file changed, 28 insertions(+), 13 deletions(-) > > > > > > diff --git a/scripts/oss-fuzz/minimize_qtest_trace.py > > > b/scripts/oss-fuzz/minimize_qtest_trace.py > > > index 5e405a0d5f..aa69c7963e 100755 > > > --- a/scripts/oss-fuzz/minimize_qtest_trace.py > > > +++ b/scripts/oss-fuzz/minimize_qtest_trace.py > > > @@ -29,30 +29,46 @@ whether the crash occred. Optionally, manually > > > set a string that idenitifes the > > > crash by setting CRASH_TOKEN= > > > """.format((sys.argv[0]))) > > > > > > +deduplication_note = """\n\ > > > +Note: While trimming the input, sometimes the mutated trace > > > triggers a different > > > +crash output but indicates the same bug. Under this situation, our > > > minimizer is > > > +incapable of recognizing and stopped from removing it. In the > > > future, we may > > > +use a more sophisticated crash case deduplication method. 
> > > +\n""" > > > + > > > def check_if_trace_crashes(trace, path): > > > -global CRASH_TOKEN > > > with open(path, "w") as tracefile: > > > tracefile.write("".join(trace)) > > > > > > -rc = subprocess.Popen("timeout -s 9 {timeout}s {qemu_path} > > > {qemu_args} 2>&1\ > > > +proc = subprocess.Popen("timeout {timeout}s {qemu_path} > > > {qemu_args} 2>&1\ > > > > Why remove the -s 9 here? I ran into a case where the minimizer got > > stuck on one iteration. Adding back "sigkill" to the timeout can be a > > safety net to catch those bad cases. > > -Alex > > Hi Alex, > > After reviewed this patch again, I think this get-stuck bug may be > caused by code: > > -return CRASH_TOKEN in output Hi, Thanks for fixing this. Strangely, I was able to fix it by swapping the b'' for a ' ' when I was stuck on a testcase a few days ago. vvv > +for line in iter(rc.stdout.readline, b''): > +if "CLOSED" in line: > +return False > +if CRASH_TOKEN in line: > +return True > I think your proposed change essentially does the same? -Alex > I assumed there are only two end cases in lines of stdout, but while we > are trimming the trace input, the crash output (second-to-last line) > may changes, in which case we will go through the output and fail to > find "CLOSED" and CRASH_TOKEN, thus get stuck in the loop above. > > To fix this bug and get a more trimmed input trace, we can: > > Use the first three words of the second-to-last line instead of the > whole string, which indicate the type of crash as the token. > > -CRASH_TOKEN = output.splitlines()[-2] > +CRASH_TOKEN = " ".join(outs.splitlines()[-2].split()[0:3]) > > If we reach the end of a subprocess' output, return False. > > +if line == "": > +return False > > I fix it in [PATCH v7 1/7] and give an example. Could you review again? 
> Thanks :-) > > FYI, I mentioned this situation firstly in [PATCH 1/4], where I gave a > more detailed example: > > https://lists.gnu.org/archive/html/qemu-devel/2020-12/msg05888.html > > > > > > < {trace_path}".format(timeout=TIMEOUT, > > > qemu_path=QEMU_PATH, > > > qemu_args=QEMU_ARGS, > > > trace_path=path), > > >shell=True, > > >stdin=subprocess.PIPE, > > > - stdout=subprocess.PIPE) > > > -stdo = rc.communicate()[0] > > > -output = stdo.decode('unicode_escape') > > > -if rc.returncode == 137:# Timed Out > > > -return False > > > -if len(output.splitlines()) < 2: > > > -return False > > > - > > > + stdout=subprocess.PIPE, > > > + encoding="utf-8") > > > +global CRASH_TOKEN > > > if CRASH_TOKEN is None: > > > -CRASH_TOKEN = output.splitlines()[-2] > > > +try: > > > +outs, _ = proc.communicate(timeout=5) > > > +CRASH_TOKEN = outs.splitlines()[-2] > > > +except subprocess.TimeoutExpired: > > > +print("subprocess.TimeoutExpired") > > >
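
The crash-detection loop discussed in this thread can be sketched in isolation. This is a minimal illustration, not the code from the patch: `trace_crashes`, `crash_token_from`, `lines_of` and the sample output strings are hypothetical names and data, and the QEMU subprocess is replaced by an in-memory stream. It shows the two ideas from the thread: take the first three words of the second-to-last output line as the crash token, and return False when the output ends without either marker. It also uses `""` rather than `b''` as the `iter()` sentinel, since a pipe opened in text mode yields `str`, and `"" == b''` is never true in Python 3, which may be one reason the original loop could spin at end of output.

```python
import io


def lines_of(text):
    """Mimic iter(proc.stdout.readline, "") on a text-mode pipe."""
    return iter(io.StringIO(text).readline, "")


def crash_token_from(output):
    """First three words of the second-to-last line, as proposed in the thread."""
    return " ".join(output.splitlines()[-2].split()[0:3])


def trace_crashes(lines, crash_token):
    """Return True on the crash token, False on CLOSED or at end of output."""
    for line in lines:
        if "CLOSED" in line:
            return False
        if crash_token in line:
            return True
    # End of output reached: neither marker appeared, so do not loop forever.
    return False


# Hypothetical QTest output, made up for this example.
CRASH_OUTPUT = "OPENED\nhypothetical_dev: assertion failed: len > 0\nAborted\n"
CLEAN_OUTPUT = "OPENED\nCLOSED\n"
OTHER_OUTPUT = "OPENED\nsome unrelated message\n"

token = crash_token_from(CRASH_OUTPUT)
print(token)
print(trace_crashes(lines_of(CRASH_OUTPUT), token))
print(trace_crashes(lines_of(CLEAN_OUTPUT), token))
print(trace_crashes(lines_of(OTHER_OUTPUT), token))
```

Shortening the token to three words trades precision for robustness: a mutated trace that changes the tail of the crash message still counts as the same crash.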
Re: What's the correct way to implement rfi and related instruction.
On Fri, Jan 8, 2021 at 2:02 AM Cédric Le Goater wrote: > > On 1/8/21 5:21 AM, 罗勇刚(Yonggang Luo) wrote: > > > > > > On Fri, Jan 8, 2021 at 5:54 AM Cédric Le Goater > wrote: > >> > >> On 1/7/21 8:14 PM, 罗勇刚(Yonggang Luo) wrote: > >> > This is the first patch,: > >> > It's store MSR bits differntly for different rfi instructions: > >> > [Qemu-devel] [PATCH] target-ppc: fix RFI by clearing some bits of MSR > >> > https://lists.gnu.org/archive/html/qemu-devel/2010-05/msg02999.html < https://lists.gnu.org/archive/html/qemu-devel/2010-05/msg02999.html> < https://lists.gnu.org/archive/html/qemu-devel/2010-05/msg02999.html < https://lists.gnu.org/archive/html/qemu-devel/2010-05/msg02999.html>> > >> > Comes from target-ppc: fix RFI by clearing some bits of MSR > >> > SHA-1: c3d420ead1aee9fcfd12be11cbdf6b1620134773 > >> > target-ppc/op_helper.c | 6 +++--- > >> > 1 file changed, 3 insertions(+), 3 deletions(-) > >> > ``` > >> > diff --git a/target-ppc/op_helper.c b/target-ppc/op_helper.c > >> > index 8f2ee986bb..3c3aa60bc3 100644 > >> > --- a/target-ppc/op_helper.c > >> > +++ b/target-ppc/op_helper.c > >> > @@ -1646,20 +1646,20 @@ static inline void do_rfi(target_ulong nip, target_ulong msr, > >> > void helper_rfi (void) > >> > { > >> > do_rfi(env->spr[SPR_SRR0], env->spr[SPR_SRR1], > >> > - ~((target_ulong)0x0), 1); > >> > + ~((target_ulong)0x783F), 1); > >> > } > >> > > >> > #if defined(TARGET_PPC64) > >> > void helper_rfid (void) > >> > { > >> > do_rfi(env->spr[SPR_SRR0], env->spr[SPR_SRR1], > >> > - ~((target_ulong)0x0), 0); > >> > + ~((target_ulong)0x783F), 0); > >> > } > >> > > >> > void helper_hrfid (void) > >> > { > >> > do_rfi(env->spr[SPR_HSRR0], env->spr[SPR_HSRR1], > >> > - ~((target_ulong)0x0), 0); > >> > + ~((target_ulong)0x783F), 0); > >> > } > >> > #endif > >> > #endif > >> > ``` > >> > > >> > This is the second patch,: > >> > it's remove the parameter `target_ulong msrm, int keep_msrh` > >> > Comes from ppc: Fix rfi/rfid/hrfi/... 
emulation > >> > SHA-1: a2e71b28e832346409efc795ecd1f0a2bcb705a3 > >> > ``` > >> > target-ppc/excp_helper.c | 51 +++- > >> > 1 file changed, 20 insertions(+), 31 deletions(-) > >> > > >> > diff --git a/target-ppc/excp_helper.c b/target-ppc/excp_helper.c > >> > index 30e960e30b..aa0b63f4b0 100644 > >> > --- a/target-ppc/excp_helper.c > >> > +++ b/target-ppc/excp_helper.c > >> > @@ -922,25 +922,20 @@ void helper_store_msr(CPUPPCState *env, target_ulong val) > >> > } > >> > } > >> > > >> > -static inline void do_rfi(CPUPPCState *env, target_ulong nip, target_ulong msr, > >> > - target_ulong msrm, int keep_msrh) > >> > +static inline void do_rfi(CPUPPCState *env, target_ulong nip, target_ulong msr) > >> > { > >> > CPUState *cs = CPU(ppc_env_get_cpu(env)); > >> > > >> > +/* MSR:POW cannot be set by any form of rfi */ > >> > +msr &= ~(1ULL << MSR_POW); > >> > + > >> > #if defined(TARGET_PPC64) > >> > -if (msr_is_64bit(env, msr)) { > >> > -nip = (uint64_t)nip; > >> > -msr &= (uint64_t)msrm; > >> > -} else { > >> > +/* Switching to 32-bit ? 
Crop the nip */ > >> > +if (!msr_is_64bit(env, msr)) { > >> > nip = (uint32_t)nip; > >> > -msr = (uint32_t)(msr & msrm); > >> > -if (keep_msrh) { > >> > -msr |= env->msr & ~((uint64_t)0x); > >> > -} > >> > } > >> > #else > >> > nip = (uint32_t)nip; > >> > -msr &= (uint32_t)msrm; > >> > #endif > >> > /* XXX: beware: this is false if VLE is supported */ > >> > env->nip = nip & ~((target_ulong)0x0003); > >> > @@ -959,26 +954,24 @@ static inline void do_rfi(CPUPPCState *env, target_ulong nip, target_ulong msr, > >> > > >> > void helper_rfi(CPUPPCState *env) > >> > { > >> > -if (env->excp_model == POWERPC_EXCP_BOOKE) { > >> > -do_rfi(env, env->spr[SPR_SRR0], env->spr[SPR_SRR1], > >> > - ~((target_ulong)0), 0); > >> > -} else { > >> > -do_rfi(env, env->spr[SPR_SRR0], env->spr[SPR_SRR1], > >> > - ~((target_ulong)0x783F), 1); > >> > -} > >> > +do_rfi(env, env->spr[SPR_SRR0], env->spr[SPR_SRR1] & 0xul); > >> > } > >> > > >> > +#define MSR_BOOK3S_MASK > >> > #if defined(TARGET_PPC64) > >> > void helper_rfid(CPUPPCState *env) > >> > { > >> > -do_rfi(env, env->spr[SPR_SRR0], env->spr[SPR_SRR1], > >> > - ~((target_ulong)0x783F), 0); > >> > +/* The architeture defines a number of rules for which bits > >> > + * can change but in practice, we handle this in hreg_store_msr() > >> > + * which will be called by do_rfi(), so there is no need to filter > >> > + * here > >> > + */ > >> > +do_rfi(env, env->spr[SPR_SRR0], env->spr[SPR_SRR1]); > >> > } > >> > > >> > void helper_hrfid(CPUPPCState *env) > >> > { > >> > -do_rfi(env, env->spr[SP
[PATCH v7 0/7] fuzz: improve crash case minimization
Extend and refine the crash case minimization process.

Test input:
  Bug 1909261 full_reproducer
  6500 QTest instructions (write mostly)

Refined (-M1 minimization level) vs. Original version:
  real 38m31.942s   <-- real 532m57.192s
  user 28m18.188s   <-- user 89m0.536s
  sys  12m42.239s   <-- sys  50m33.074s
  2558 instructions <-- 2846 instructions

Test Environment:
  i7-8550U, 16GB LPDDR3, SSD
  Ubuntu 20.04.1 5.4.0-58-generic x86_64
  Python 3.8.5

v7:
  Fix: [PATCH v6 1/7] get stuck in crash detection

v6:
  Fix: add Reviewed-by and Tested-by tags

v5:
  Fix: send SIGKILL on timeout
  Fix: rename minimization functions

v4:
  Fix: messy diff in [PATCH v3 4/7]

v3:
  Fix: checkpatch.pl errors

v2:
  New: [PATCH v2 1/7]
  New: [PATCH v2 2/7]
  New: [PATCH v2 4/7]
  New: [PATCH v2 6/7]
  New: [PATCH v2 7/7]
  Fix: [PATCH 2/4] split using binary approach
  Fix: [PATCH 3/4] typo in comments
  Discard: [PATCH 1/4] the hardcoded regex match for crash detection
  Discard: [PATCH 4/4] the delaying minimizer

Thanks for the suggestions from:
  Alexander Bulekov

Qiuhao Li (7):
  fuzz: accelerate non-crash detection
  fuzz: double the IOs to remove for every loop
  fuzz: split write operand using binary approach
  fuzz: remove IO commands iteratively
  fuzz: set bits in operand of write/out to zero
  fuzz: add minimization options
  fuzz: heuristic split write based on past IOs

 scripts/oss-fuzz/minimize_qtest_trace.py | 261 +++
 1 file changed, 214 insertions(+), 47 deletions(-)

-- 
2.25.1
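
Patch 3/7 of this series ("split write operand using binary approach") describes a pivot schedule: try a split at length/2^n bytes from the left, then the mirrored position from the right, and increase n until length/2^n reaches zero. The sketch below illustrates only that schedule, under stated assumptions: `pivot_schedule`, `split_once` and the `needs_all_x` oracle are hypothetical names, each character of the string stands for one byte of the write operand, and the oracle stands in for re-running QEMU on the mutated trace. It is not the code from the patch.

```python
def pivot_schedule(length):
    """Yield (left, right) sizes: length // 2**n from the left, then the
    mirrored split from the right, for n = 1, 2, ... until the left size is 0."""
    power = 1
    while True:
        left = length // 2 ** power
        if left == 0:
            return
        yield left, length - left
        if left != length - left:
            # The mirrored pivot is a different split, so try it as well.
            yield length - left, left
        power += 1


def split_once(data, still_crashes):
    """Return the first (head, tail) split the oracle accepts, or None."""
    for left, _ in pivot_schedule(len(data)):
        head, tail = data[:left], data[left:]
        if still_crashes([head, tail]):
            return head, tail
    return None


def needs_all_x(parts):
    """Toy oracle: the crash reproduces only if one write carries all four x bytes."""
    return any("xxxx" in part for part in parts)


print(list(pivot_schedule(6)))
print(split_once("xxxxuu", needs_all_x))
print(split_once("xxxxu", needs_all_x))
```

The number of candidate splits grows with log2(length) rather than with length, which is the O(log(n)) behaviour the commit message refers to. Applying `split_once` again to the surviving head peels off one more unnecessary byte per round.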
[PATCH v7 7/7] fuzz: heuristic split write based on past IOs
If previous write commands write the same length of data with the same step, we view it as a hint. Signed-off-by: Qiuhao Li Reviewed-by: Alexander Bulekov Tested-by: Alexander Bulekov --- scripts/oss-fuzz/minimize_qtest_trace.py | 56 1 file changed, 56 insertions(+) diff --git a/scripts/oss-fuzz/minimize_qtest_trace.py b/scripts/oss-fuzz/minimize_qtest_trace.py index d2e3f67b66..831e1f107f 100755 --- a/scripts/oss-fuzz/minimize_qtest_trace.py +++ b/scripts/oss-fuzz/minimize_qtest_trace.py @@ -89,6 +89,43 @@ def check_if_trace_crashes(trace, path): return False +# If previous write commands write the same length of data at the same +# interval, we view it as a hint. +def split_write_hint(newtrace, i): +HINT_LEN = 3 # > 2 +if i <=(HINT_LEN-1): +return None + +#find previous continuous write traces +k = 0 +l = i-1 +writes = [] +while (k != HINT_LEN and l >= 0): +if newtrace[l].startswith("write "): +writes.append(newtrace[l]) +k += 1 +l -= 1 +elif newtrace[l] == "": +l -= 1 +else: +return None +if k != HINT_LEN: +return None + +length = int(writes[0].split()[2], 16) +for j in range(1, HINT_LEN): +if length != int(writes[j].split()[2], 16): +return None + +step = int(writes[0].split()[1], 16) - int(writes[1].split()[1], 16) +for j in range(1, HINT_LEN-1): +if step != int(writes[j].split()[1], 16) - \ +int(writes[j+1].split()[1], 16): +return None + +return (int(writes[0].split()[1], 16)+step, length) + + def remove_lines(newtrace, outpath): remove_step = 1 i = 0 @@ -152,6 +189,25 @@ def remove_lines(newtrace, outpath): length = int(newtrace[i].split()[2], 16) data = newtrace[i].split()[3][2:] if length > 1: + +# Can we get a hint from previous writes? 
+hint = split_write_hint(newtrace, i) +if hint is not None: +hint_addr = hint[0] +hint_len = hint[1] +if hint_addr >= addr and hint_addr+hint_len <= addr+length: +newtrace[i] = "write {addr} {size} 0x{data}\n".format( +addr=hex(hint_addr), +size=hex(hint_len), +data=data[(hint_addr-addr)*2:\ +(hint_addr-addr)*2+hint_len*2]) +if check_if_trace_crashes(newtrace, outpath): +# next round +i += 1 +continue +newtrace[i] = prior[0] + +# Try splitting it using a binary approach leftlength = int(length/2) rightlength = length - leftlength newtrace.insert(i+1, "") -- 2.25.1
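
The hint heuristic in this patch can be tried on a small trace without running QEMU. The block below is a condensed restatement of the logic shown in the diff, written as a self-contained sketch: `apply_hint` and the sample trace are hypothetical additions for illustration, and the crash check that the real script performs after substituting the candidate line is left out.

```python
HINT_LEN = 3  # number of preceding writes that must agree (the patch notes "> 2")


def split_write_hint(trace, i):
    """Return (address, length) predicted for trace[i], or None if no hint."""
    if i <= HINT_LEN - 1:
        return None
    writes = []
    j = i - 1
    while len(writes) < HINT_LEN and j >= 0:
        line = trace[j]
        if line.startswith("write "):
            writes.append(line)
        elif line != "":
            return None  # a non-write command interrupts the run of writes
        j -= 1
    if len(writes) < HINT_LEN:
        return None
    addrs = [int(w.split()[1], 16) for w in writes]
    lengths = [int(w.split()[2], 16) for w in writes]
    if len(set(lengths)) != 1:
        return None
    steps = {addrs[k] - addrs[k + 1] for k in range(HINT_LEN - 1)}
    if len(steps) != 1:
        return None
    return addrs[0] + steps.pop(), lengths[0]


def apply_hint(trace, i):
    """Build the candidate replacement for trace[i], or None (hypothetical helper)."""
    hint = split_write_hint(trace, i)
    if hint is None:
        return None
    _, addr_s, len_s, data_s = trace[i].split()
    addr, length, data = int(addr_s, 16), int(len_s, 16), data_s[2:]
    hint_addr, hint_len = hint
    if hint_addr >= addr and hint_addr + hint_len <= addr + length:
        off = (hint_addr - addr) * 2  # two hex digits per byte
        return "write {} {} 0x{}\n".format(
            hex(hint_addr), hex(hint_len), data[off:off + hint_len * 2])
    return None


# Made-up trace: three 4-byte writes at a stride of 4, then one 8-byte write.
trace = [
    "write 0x1000 0x4 0x11111111\n",
    "write 0x1004 0x4 0x22222222\n",
    "write 0x1008 0x4 0x33333333\n",
    "write 0x100c 0x8 0x4444444455555555\n",
]
print(split_write_hint(trace, 3))
print(apply_hint(trace, 3))
```

In the patch itself the candidate line is kept only if `check_if_trace_crashes` still reports the crash; otherwise the original line is restored and the binary split is tried.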
[PATCH v7 3/7] fuzz: split write operand using binary approach
Currently, we split the write commands' data from the middle. If it does not
work, try to move the pivot left by one byte and retry until there is no
space.

But, this method has two flaws:

1. It may fail to trim all unnecessary bytes on the right side.

For example, there is an IO write command:

  write addr uuxxxxuu

u is the unnecessary byte for the crash. Unlike ram write commands, in most
cases, a split IO write won't trigger the same crash, so if we split from the
middle, we will get:

  write addr uu (will be removed in next round)
  write addr xxxxuu

For xxxxuu, since splitting it from the middle and retrying to the leftmost
byte won't get the same crash, we will be stopped from removing the last two
bytes.

2. The algorithm complexity is O(n) since we move the pivot byte by byte.

To solve the first issue, we can try a symmetrical position on the right if
we fail on the left. As for the second issue, instead of moving by one byte,
we can approach the boundary exponentially, achieving O(log(n)).

Give an example:

                   xxxxuu len=6
                        +
                        |
                        +
                 xxx,xuu 6/2=3 fail
                        +
         +--------------+-------------+
         |                            |
         +                            +
  xx,xxuu 6/2^2=1 fail         xxxxu,u 6-1=5 success
                                 +   +
         +------------------+----+   |
         |                  |        +-------------+ u removed
         +                  +
   xx,xxu 5/2=2 fail  xxxx,u 6-2=4 success
                           +
                           |
                           +-----------+ u removed

In some rare cases, this algorithm will fail to trim all unnecessary bytes:

  xxxxxxxxxuxxxxxx
  xxxxxxxx-xuxxxxxx Fail
  xxxx-xxxxxuxxxxxx Fail
  xxxxxxxxxuxx-xxxx Fail
  ...

I think the trade-off is worth it.
Signed-off-by: Qiuhao Li Reviewed-by: Alexander Bulekov Tested-by: Alexander Bulekov --- scripts/oss-fuzz/minimize_qtest_trace.py | 29 1 file changed, 20 insertions(+), 9 deletions(-) diff --git a/scripts/oss-fuzz/minimize_qtest_trace.py b/scripts/oss-fuzz/minimize_qtest_trace.py index 3c11db4b8a..319f4c02d0 100755 --- a/scripts/oss-fuzz/minimize_qtest_trace.py +++ b/scripts/oss-fuzz/minimize_qtest_trace.py @@ -98,7 +98,7 @@ def minimize_trace(inpath, outpath): prior = newtrace[i:i+remove_step] for j in range(i, i+remove_step): newtrace[j] = "" -print("Removing {lines} ...".format(lines=prior)) +print("Removing {lines} ...\n".format(lines=prior)) if check_if_trace_crashes(newtrace, outpath): i += remove_step # Double the number of lines to remove for next round @@ -111,9 +111,11 @@ def minimize_trace(inpath, outpath): remove_step = 1 continue newtrace[i] = prior[0] # remove_step = 1 + # 2.) Try to replace write{bwlq} commands with a write addr, len # command. Since this can require swapping endianness, try both LE and # BE options. We do this, so we can "trim" the writes in (3) + if (newtrace[i].startswith("write") and not newtrace[i].startswith("write ")): suffix = newtrace[i].split()[0][-1] @@ -134,11 +136,15 @@ def minimize_trace(inpath, outpath): newtrace[i] = prior[0] # 3.) If it is a qtest write command: write addr len data, try to split -# it into two separate write commands. If splitting the write down the -# middle does not work, try to move the pivot "left" and retry, until -# there is no space left. The idea is to prune unneccessary bytes from -# long writes, while accommodating arbitrary MemoryRegion access sizes -# and alignments. +# it into two separate write commands. If splitting the data operand +# from length/2^n bytes to the left does not work, try to move the pivot +# to the right side, then add one to n, until length/2^n == 0. 
The idea +# is to prune unneccessary bytes from long writes, while accommodating +# arbitrary MemoryRegion access sizes and alignments. + +# This algorithm will fail under some rare situations. +# e.g., xuxx (u is the unnecessary byte) + if newtrace[i].startswith("write "): addr = int(newtrace[i].split()[1], 16) length = int(newtrace[i].split()[2], 16) @@ -147,6 +153,7 @@ def minimize_trace(inpath, outpath): leftlength = int(length/2) rightlength = length - leftlength newtrace.insert(i+1, "") +power = 1 while leftlength > 0: newtrace[i] = "write {addr} {size} 0x{data}\n".format( addr=hex(addr), @@ -158,9 +165,13 @@ def minimize_trace(inpath, outpath):