Re: [PATCH] riscv: sifive_u: Add a "serial" property for board serial number
Hi Palmer, On Fri, Nov 22, 2019 at 10:38 AM Palmer Dabbelt wrote: > > On Thu, 21 Nov 2019 17:10:18 PST (-0800), bmeng...@gmail.com wrote: > > On Sat, Nov 16, 2019 at 11:08 PM Bin Meng wrote: > >> > >> At present the board serial number is hard-coded to 1, and passed > >> to OTP model during initialization. Firmware (FSBL, U-Boot) uses > >> the serial number to generate a unique MAC address for the on-chip > >> ethernet controller. When multiple QEMU 'sifive_u' instances are > >> created and connected to the same subnet, they all have the same > >> MAC address hence it creates a unusable network. > >> > >> A new "serial" property is introduced to specify the board serial > >> number. When not given, the default serial number 1 is used. > >> > >> Signed-off-by: Bin Meng > >> --- > >> > >> hw/riscv/sifive_u.c | 21 - > >> include/hw/riscv/sifive_u.h | 1 + > >> 2 files changed, 21 insertions(+), 1 deletion(-) > >> > > > > ping? > > Sorry, it looks like I dropped this one. I've put it in the queue for 5.0, > with a > > Reviewed-by: Palmer Dabbelt Has this been applied somewhere? Regards, Bin
Re: [PATCH] vl: Don't mismatch g_strsplit()/g_free()
Hi On Fri, Jan 10, 2020 at 7:17 AM wrote: > > From: Pan Nengyuan > > It's a mismatch between g_strsplit and g_free, it will cause a memory leak as > follow: > > [root@localhost]# ./aarch64-softmmu/qemu-system-aarch64 -accel help > Accelerators supported in QEMU binary: > tcg > kvm > = > ==1207900==ERROR: LeakSanitizer: detected memory leaks > > Direct leak of 8 byte(s) in 2 object(s) allocated from: > #0 0xfffd700231cb in __interceptor_malloc (/lib64/libasan.so.4+0xd31cb) > #1 0xfffd6ec57163 in g_malloc (/lib64/libglib-2.0.so.0+0x57163) > #2 0xfffd6ec724d7 in g_strndup (/lib64/libglib-2.0.so.0+0x724d7) > #3 0xfffd6ec73d3f in g_strsplit (/lib64/libglib-2.0.so.0+0x73d3f) > #4 0xaaab66be5077 in main /mnt/sdc/qemu-master/qemu-4.2.0-rc0/vl.c:3517 > #5 0xfffd6e140b9f in __libc_start_main (/lib64/libc.so.6+0x20b9f) > #6 0xaaab66bf0f53 (./build/aarch64-softmmu/qemu-system-aarch64+0x8a0f53) > > Direct leak of 2 byte(s) in 2 object(s) allocated from: > #0 0xfffd700231cb in __interceptor_malloc (/lib64/libasan.so.4+0xd31cb) > #1 0xfffd6ec57163 in g_malloc (/lib64/libglib-2.0.so.0+0x57163) > #2 0xfffd6ec7243b in g_strdup (/lib64/libglib-2.0.so.0+0x7243b) > #3 0xfffd6ec73e6f in g_strsplit (/lib64/libglib-2.0.so.0+0x73e6f) > #4 0xaaab66be5077 in main /mnt/sdc/qemu-master/qemu-4.2.0-rc0/vl.c:3517 > #5 0xfffd6e140b9f in __libc_start_main (/lib64/libc.so.6+0x20b9f) > #6 0xaaab66bf0f53 (./build/aarch64-softmmu/qemu-system-aarch64+0x8a0f53) > > Reported-by: Euler Robot > Signed-off-by: Pan Nengyuan Fixes: cbe6d6365a48bce4526c664170cda6fe738484f8 Reviewed-by: Marc-André Lureau > --- > vl.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/vl.c b/vl.c > index 86474a55c9..2fa5cb3b9a 100644 > --- a/vl.c > +++ b/vl.c > @@ -3476,7 +3476,7 @@ int main(int argc, char **argv, char **envp) > gchar **optname = g_strsplit(typename, > ACCEL_CLASS_SUFFIX, > 0); > printf("%s\n", optname[0]); > -g_free(optname); > +g_strfreev(optname); > } > g_free(typename); > } > -- > 2.21.0.windows.1 > > > -- Marc-André Lureau
Re: Audio pace issue with qemu4.2
On Fri, Jan 10, 2020 at 11:30:24AM +0530, padmashree mandri wrote: > Hi all, > > I am trying to run qemu4.2 with ALSA. Playback pace is fast. Where can > i set sampling rate and all parameters related to audio in qemu? https://www.kraxel.org/blog/2020/01/qemu-sound-audiodev/ HTH, Gerd
Re: [PATCH] virtio: add the queue number check
On Fri, Jan 03, 2020 at 10:18:58PM +0100, Paolo Bonzini wrote: > Il ven 3 gen 2020, 16:08 Yang Zhong ha scritto: > > > I also tried virtio-blk device like below: > > https://patchwork.kernel.org/cover/10873193/ > > > > The virtio-blk can work with this changes, but vhost-user-blk device > > failed with this kernel patch. > > > > in vhost_virtqueue_start() function, below operation to check if the > > desc addr set by guest kernel. This will ignore the extra vqs. > > a = virtio_queue_get_desc_addr(vdev, idx); > > if (a == 0) { > > /* Queue might not be ready for start */ > > return 0; > > } > > > > If guest kernel add min(cpu,num_vqs), do we need add same check in > > realize function of vhost-user-blk device? > > > > No. If virtio-blk works, the bug is in vhost-user-blk; if virtio-blk needs > no check in cpu count, vhost-user-blk also doesn't. > > You need to check first if the bug is in QEMU or the vhost-user-blk server. > Paolo and all, It seems i had root cause this issue, let me list what's i found. Any issue please correct me, thanks! I found there are 2 issues in DPDK and seabios by debugging this issue. (1). Seabios issue In init_virtio_blk() function, which set VIRTIO_CONFIG_S_DRIVER_OK status to qemu vhost-user-blk device. // the related code .. status |= VIRTIO_CONFIG_S_DRIVER_OK; vp_set_status(&vdrive->vp, status); .. I think there is no need for seabios to set VIRTIO_CONFIG_S_DRIVER_OK status to qemu vhost-user-blk device. In fact, this time, the guest virtio-blk module is not started. Once seabios set this DRIVER_OK status to qemu vhost user device, the vhost-user-blk will call vhost_user_blk_start(). static void vhost_user_blk_set_status(VirtIODevice *vdev, uint8_t status) { VHostUserBlk *s = VHOST_USER_BLK(vdev); bool should_start = virtio_device_started(vdev, status); int ret; if (!vdev->vm_running) { should_start = false; } if (!s->connected) { return; } if (s->dev.started == should_start) { return; } if (should_start) { ret = vhost_user_blk_start(vdev); if (ret < 0) { error_report("vhost-user-blk: vhost start failed: %s", strerror(-ret)); qemu_chr_fe_disconnect(&s->chardev); } } else { vhost_user_blk_stop(vdev); } } In fact, this time vhost_user_blk_start almost do nothing because the real guest virtio-blk driver still not started yet. This time, there is only one vq can be used(this vq should be inited in seabios). When the guest virtio-blk driver really start and complet the probe(), the guest virtio-blk driver will set VIRTIO_CONFIG_S_DRIVER_OK to vhost-user-blk device again. This time, this driver will allocate RIGHT queue num according to MIN(vcpu, num_vqs). So, i think set VIRTIO_CONFIG_S_DRIVER_OK status should be removed from seabios, and guest driver should do this. I removed this status seting in the seabios, and test verified this. guest virtio-blk driver can be normally initialized by virtio-blk or vhost-user-blk device from qemu. (2). DPDK issue DPDK does not know the real queue number used by guest virtio-blk driver and it only know the queue number from vhost-user-blk commond line. Once the guest virtio-blk driver change the queue number according to MIN(vcpu, num_vqs), DPDK still use previous queue number and it think virtio is never ready by virtio_is_ready() function. There are two methods to fix this issue, one is add VHOST_USER_SET_QUEUE_NUM to vhost user protocol. vhost-user-blk device can call this to inform SPDK the real queue number. vhost-user-device can use below method to get the real queue number(to check if vring.desc is NON-NULL) for (i = 0; i < s->dev.nvqs; i++) { VirtQueue *kick_vq = virtio_get_queue(vdev, i); if (!virtio_queue_get_desc_addr(vdev, i)) { continue; } } Current vhost-user protocol only support GET_QUEUE_NUM(get max queue num), we can add SET_QUEUE_NUM. or DPDK can get the real queue number by checking if the vring.desc is NON-NULL. By the way, vhost SCSI device has the same issue with vhost-user-blk device. Yang > Paolo
RE: [PATCH v1 15/59] block/blkreplay.c: remove unneeded 'fail' label in blkreplay_open()
> From: Daniel Henrique Barboza [mailto:danielhb...@gmail.com] > Sent: Monday, January 06, 2020 9:24 PM > To: qemu-devel@nongnu.org > Cc: qemu-triv...@nongnu.org; Daniel Henrique Barboza; Pavel Dovgalyuk; Paolo > Bonzini > Subject: [PATCH v1 15/59] block/blkreplay.c: remove unneeded 'fail' label in > blkreplay_open() > > Both the 'fail' label and the 'ret' variable can be removed. > Use 'return -EINVAL' in the error condition and 'return 0' in > the end of the function. > > CC: Pavel Dovgalyuk > CC: Paolo Bonzini > Signed-off-by: Daniel Henrique Barboza Reviewed-by: Pavel Dovgalyuk > --- > block/blkreplay.c | 8 ++-- > 1 file changed, 2 insertions(+), 6 deletions(-) > > diff --git a/block/blkreplay.c b/block/blkreplay.c > index c96ac8f4bc..d8c4c311f3 100644 > --- a/block/blkreplay.c > +++ b/block/blkreplay.c > @@ -24,23 +24,19 @@ static int blkreplay_open(BlockDriverState *bs, QDict > *options, int flags, >Error **errp) > { > Error *local_err = NULL; > -int ret; > > /* Open the image file */ > bs->file = bdrv_open_child(NULL, options, "image", > bs, &child_file, false, &local_err); > if (local_err) { > -ret = -EINVAL; > error_propagate(errp, local_err); > -goto fail; > +return -EINVAL; > } > > bs->supported_write_flags = BDRV_REQ_WRITE_UNCHANGED; > bs->supported_zero_flags = BDRV_REQ_WRITE_UNCHANGED; > > -ret = 0; > -fail: > -return ret; > +return 0; > } > > static int64_t blkreplay_getlength(BlockDriverState *bs) > -- > 2.24.1 Pavel Dovgalyuk
Audio pace issue with qemu4.2
Hi all, I am trying to run qemu4.2 with ALSA. Playback pace is fast. Where can i set sampling rate and all parameters related to audio in qemu? Thanks padma
[PATCH] vl: Don't mismatch g_strsplit()/g_free()
From: Pan Nengyuan It's a mismatch between g_strsplit and g_free, it will cause a memory leak as follow: [root@localhost]# ./aarch64-softmmu/qemu-system-aarch64 -accel help Accelerators supported in QEMU binary: tcg kvm = ==1207900==ERROR: LeakSanitizer: detected memory leaks Direct leak of 8 byte(s) in 2 object(s) allocated from: #0 0xfffd700231cb in __interceptor_malloc (/lib64/libasan.so.4+0xd31cb) #1 0xfffd6ec57163 in g_malloc (/lib64/libglib-2.0.so.0+0x57163) #2 0xfffd6ec724d7 in g_strndup (/lib64/libglib-2.0.so.0+0x724d7) #3 0xfffd6ec73d3f in g_strsplit (/lib64/libglib-2.0.so.0+0x73d3f) #4 0xaaab66be5077 in main /mnt/sdc/qemu-master/qemu-4.2.0-rc0/vl.c:3517 #5 0xfffd6e140b9f in __libc_start_main (/lib64/libc.so.6+0x20b9f) #6 0xaaab66bf0f53 (./build/aarch64-softmmu/qemu-system-aarch64+0x8a0f53) Direct leak of 2 byte(s) in 2 object(s) allocated from: #0 0xfffd700231cb in __interceptor_malloc (/lib64/libasan.so.4+0xd31cb) #1 0xfffd6ec57163 in g_malloc (/lib64/libglib-2.0.so.0+0x57163) #2 0xfffd6ec7243b in g_strdup (/lib64/libglib-2.0.so.0+0x7243b) #3 0xfffd6ec73e6f in g_strsplit (/lib64/libglib-2.0.so.0+0x73e6f) #4 0xaaab66be5077 in main /mnt/sdc/qemu-master/qemu-4.2.0-rc0/vl.c:3517 #5 0xfffd6e140b9f in __libc_start_main (/lib64/libc.so.6+0x20b9f) #6 0xaaab66bf0f53 (./build/aarch64-softmmu/qemu-system-aarch64+0x8a0f53) Reported-by: Euler Robot Signed-off-by: Pan Nengyuan --- vl.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/vl.c b/vl.c index 86474a55c9..2fa5cb3b9a 100644 --- a/vl.c +++ b/vl.c @@ -3476,7 +3476,7 @@ int main(int argc, char **argv, char **envp) gchar **optname = g_strsplit(typename, ACCEL_CLASS_SUFFIX, 0); printf("%s\n", optname[0]); -g_free(optname); +g_strfreev(optname); } g_free(typename); } -- 2.21.0.windows.1
RE: [PATCH] hw/arm/acpi: Pack the SRAT processors structure by node_id ascending order
> -Original Message- > From: Igor Mammedov [mailto:imamm...@redhat.com] > Sent: Thursday, January 09, 2020 5:54 PM > To: Zengtao (B) > Cc: Michael S. Tsirkin; qemu-devel@nongnu.org; qemu-triv...@nongnu.org; > Shannon Zhao; Peter Maydell; qemu-...@nongnu.org > Subject: Re: [PATCH] hw/arm/acpi: Pack the SRAT processors structure by > node_id ascending order > > On Thu, 9 Jan 2020 02:45:58 + > "Zengtao (B)" wrote: > > > > -Original Message- > > > From: Igor Mammedov [mailto:imamm...@redhat.com] > > > Sent: Thursday, January 09, 2020 12:39 AM > > > To: Zengtao (B) > > > Cc: Michael S. Tsirkin; qemu-devel@nongnu.org; > qemu-triv...@nongnu.org; > > > Shannon Zhao; Peter Maydell; qemu-...@nongnu.org > > > Subject: Re: [PATCH] hw/arm/acpi: Pack the SRAT processors structure > by > > > node_id ascending order > > > > > > On Wed, 8 Jan 2020 04:02:10 + > > > "Zengtao (B)" wrote: > > > > > > > > -Original Message- > > > > > From: Igor Mammedov [mailto:imamm...@redhat.com] > > > > > Sent: Tuesday, January 07, 2020 11:50 PM > > > > > To: Zengtao (B) > > > > > Cc: Michael S. Tsirkin; qemu-devel@nongnu.org; > > > qemu-triv...@nongnu.org; > > > > > Shannon Zhao; Peter Maydell; qemu-...@nongnu.org > > > > > Subject: Re: [PATCH] hw/arm/acpi: Pack the SRAT processors > structure > > > by > > > > > node_id ascending order > > > > > > > > > > On Tue, 7 Jan 2020 10:29:22 + > > > > > "Zengtao (B)" wrote: > > > > > > > > > > > > -Original Message- > > > > > > > From: Michael S. Tsirkin [mailto:m...@redhat.com] > > > > > > > Sent: Tuesday, January 07, 2020 5:33 PM > > > > > > > To: Zengtao (B) > > > > > > > Cc: qemu-devel@nongnu.org; qemu-triv...@nongnu.org; > Shannon > > > > > Zhao; > > > > > > > Peter Maydell; Igor Mammedov; qemu-...@nongnu.org > > > > > > > Subject: Re: [PATCH] hw/arm/acpi: Pack the SRAT processors > > > structure > > > > > by > > > > > > > node_id ascending order > > > > > > > > > > > > > > On Tue, Jan 07, 2020 at 05:18:49PM +0800, Zeng Tao wrote: > > > > > > > > When booting the guest linux with the following numa > > > configuration: > > > > > > > > -numa node,node_id=1,cpus=0-3 > > > > > > > > -numa node,node_id=0,cpus=4-7 > > > > > > > > We can get the following numa topology in the guest system: > > > > > > > > Architecture: aarch64 > > > > > > > > Byte Order:Little Endian > > > > > > > > CPU(s):8 > > > > > > > > On-line CPU(s) list: 0-7 > > > > > > > > Thread(s) per core:1 > > > > > > > > Core(s) per socket:8 > > > > > > > > Socket(s): 1 > > > > > > > > NUMA node(s): 2 > > > > > > > > L1d cache: unknown size > > > > > > > > L1i cache: unknown size > > > > > > > > L2 cache: unknown size > > > > > > > > NUMA node0 CPU(s): 0-3 > > > > > > > > NUMA node1 CPU(s): 4-7 > > > > > > > > The Cpus 0-3 is assigned with NUMA node 1 in QEMU while it > get > > > > > NUMA > > > > > > > node > > > > > > > > 0 in the guest. > > > > > > > > > > > > > > > > In fact, In the linux kernel, numa_node_id is allocated per the > > > ACPI > > > > > > > > SRAT processors structure order,so the cpu 0 will be the first > one > > > to > > > > > > > > allocate its NUMA node id, so it gets the NUMA node 0. > > > > > > > > > > > > > > > > To fix this issue, we pack the SRAT processors structure in > numa > > > node > > > > > id > > > > > > > > order but not the default cpu number order. > > > > > > > > > > > > > > > > Signed-off-by: Zeng Tao > > > > > > > > > > > > > > > > > > > > > Does this matter? If yes fixing linux to take node id from > proximity > > > > > > > field in ACPI seems cleaner ... > > > > > > > > > > > > > > > > > > > In fact, I just want to align the node_id concept in QEMU and > Linux. > > > > > > If we fix the kernel side, we need to align with all platforms. > > > > > > i think maybe not a good idea. > > > > > if linux makes up node ID's on it's own, it would be hard for it to > > > > > map SRAT entries to other tables that use proximity id as well. > > > > > > > > > > So it would need to maintain map of [proximity id] -> [node id] (and > > > reverse) > > > > > somewhere to resolve mappings on other tables. > > > > > If it doesn't do this then it's broken and works just by accident, > > > > > if it does the fix probably should be in that code and not in QEMU. > > > > > > > > > > > > > Hmm, the problem is how to understand the concept of node id. > > > > 1. In dts, there is node id. Both the QEMU and Linux can use it > > > > directly, so no conflict. > > > > 2. In ACPI, there is only proximity domain, but no node id, there > > > > should be a mapping between them, while kernel and QEMU > maintain > > > > their own rule, and unfortunately they conflict with each other > > > > sometimes. > > > > > > > > There is no specification to indicate what we should do to maintain > the > > > > mapping, so it's hard to align the QEMU and Linux. > > > > > > > > Any sugge
Re: [PATCH 0/2] not use multifd during postcopy
On Thu, Jan 09, 2020 at 10:50:25AM +0100, Juan Quintela wrote: >Wei Yang wrote: >> On Mon, Dec 16, 2019 at 10:35:39AM +0800, Wei Yang wrote: >>>Would this one be picked up this time? >> >> Happy new year to all. >> >> Can I ask the plan for this patch set? > >queued > Thanks :-) -- Wei Yang Help you, Help me
[PATCH qemu v5] spapr: Kill SLOF
The Petitboot bootloader is way more advanced than SLOF is ever going to be as Petitboot comes with the full-featured Linux kernel with all the drivers, and initramdisk with quite user friendly interface. The problem with ditching SLOF is that an unmodified pseries kernel can either start via: 1. kexec, this requires presence of RTAS and skips ibm,client-architecture-support entirely; 2. normal boot, this heavily relies on the OF1275 client interface to fetch the device tree and do early setup (claim memory). This adds a new bios-less mode to the pseries machine: "bios=on|off". When enabled, QEMU does not load SLOF and jumps to the kernel from "-kernel". The client interface is implemented exactly as RTAS - a 20 bytes blob, right next after the RTAS blob. The entry point is passed to the kernel via GPR5. This implements a handful of client interface methods just to get going. In particular, this implements the device tree fetching, ibm,client-architecture-support and instantiate-rtas. This implements changing FDT properties for RTAS (for vmlinux and zImage) and initramdisk location (for zImage). To make this work, this skips fdt_pack() when bios=off as not packing the blob leaves some room for appending. This assigns "phandles" to device tree nodes as there is no more SLOF and OF nodes addresses of which served as phandle values. This keeps predefined nodes (such as XICS/NVLINK/...) unchanged. phandles are regenerated at every FDT rebuild. When bios=off, this adds "/chosen" every time QEMU builds a tree. This implements "claim" which the client (Linux) uses for memory allocation; this is also used by QEMU for claiming kernel/initrd images, client interface entry point, RTAS and the initial stack. While at this, add a "kernel-addr" machine parameter to allow moving the kernel in memory. This is useful for debugging if the kernel is loaded at @0, although not necessary. This adds basic instances support which are managed by a hashmap ihandle->[phandle, DeviceState, Chardev]. Note that a 64bit PCI fix is required for Linux: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=df5be5be8735e The test command line: qemu-system-ppc64 \ -nodefaults \ -chardev stdio,id=STDIO0,signal=off,mux=on \ -device spapr-vty,id=svty0,reg=0x71000110,chardev=STDIO0 \ -mon id=MON0,chardev=STDIO0,mode=readline \ -nographic \ -vga none \ -kernel pbuild/kernel-le-guest/arch/powerpc/boot/zImage.pseries \ -machine pseries,bios=off,cap-cfpc=broken,cap-sbbc=broken,cap-ibs=broken \ -m 4G \ -enable-kvm \ -initrd pb/rootfs.cpio.xz \ -device nec-usb-xhci,id=nec-usb-xhci0 \ -netdev tap,id=TAP0,helper=/home/aik/qemu-bridge-helper --br=br0 \ -device virtio-net-pci,id=vnet0,netdev=TAP0 img/f30le.qcow2 \ -snapshot \ -smp 8,threads=8 \ -trace events=qemu_trace_events \ -d guest_errors \ -chardev socket,id=SOCKET0,server,nowait,path=qemu.mon.ssh54088 \ -mon chardev=SOCKET0,mode=control Signed-off-by: Alexey Kardashevskiy --- Changes: v5: * made instances keep device and chardev pointers * removed VIO dependencies * print error if RTAS memory is not claimed as it should have been * pack FDT as "quiesce" v4: * fixed open * validate ihandles in "call-method" v3: * fixed phandles allocation * s/__be32/uint32_t/ as we do not normally have __be32 type in qemu * fixed size of /chosen/stdout * bunch of renames * do not create rtas properties at all, let the client deal with it; instead setprop allows changing these in the FDT * no more packing FDT when bios=off - nobody needs it and getprop does not work otherwise * allow updating initramdisk device tree properties (for zImage) * added instances * fixed stdout on OF's "write" * removed special handling for stdout in OF client, spapr-vty handles it instead v2: * fixed claim() * added "setprop" * cleaner client interface and RTAS blobs management * boots to petitboot and further to the target system * more trace points --- hw/ppc/Makefile.objs | 1 + include/hw/ppc/spapr.h | 28 +- hw/ppc/spapr.c | 266 ++-- hw/ppc/spapr_hcall.c | 74 +++-- hw/ppc/spapr_of_client.c | 633 +++ hw/ppc/trace-events | 12 + 6 files changed, 959 insertions(+), 55 deletions(-) create mode 100644 hw/ppc/spapr_of_client.c diff --git a/hw/ppc/Makefile.objs b/hw/ppc/Makefile.objs index 101e9fc59185..20efc0aa6f9b 100644 --- a/hw/ppc/Makefile.objs +++ b/hw/ppc/Makefile.objs @@ -6,6 +6,7 @@ obj-$(CONFIG_PSERIES) += spapr_hcall.o spapr_iommu.o spapr_rtas.o obj-$(CONFIG_PSERIES) += spapr_pci.o spapr_rtc.o spapr_drc.o obj-$(CONFIG_PSERIES) += spapr_cpu_core.o spapr_ovec.o spapr_irq.o obj-$(CONFIG_PSERIES) += spapr_tpm_proxy.o +obj-$(CONFIG_PSERIES) += spapr_of_client.o obj-$(CONFIG_SPAPR_RNG) += spapr_rng.o # IBM PowerNV obj-$(CONFIG_POWERNV) += pnv.o pnv_xscom.o pnv_core.o pnv_lpc.o pnv_psi.o pnv_occ.o pnv_bmc.o diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h index 61f005c6f686..efc2c70abf99 100
[Bug 1859081] Re: Mouse way too fast when Qemu is on a Windows VM with a OS 9 Guest
What is the QEMU command-line you use? Does this problem exist with the usb mouse (-device usb-mouse)? Could you try upgrading to the latest version of QEMU and see if the issue is resolved please? -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1859081 Title: Mouse way too fast when Qemu is on a Windows VM with a OS 9 Guest Status in QEMU: New Bug description: On a server, I have a Windows 10 VM with Qemu 4.1.0 (latest) from https://qemu.weilnetz.de/w64/ installed. There I have a Mac OS 9.2.2 machine. Now if I connect to the Windows VM with VNC or RDP or even VMWare console, the Mouse in the Mac OS Guest inside Qemu is wy to fast. Even when lowering the mouse speed in the Mac OS mouse setting, one pixel in the Host (Windows 10 VM) still moves the mouse by 10 pixels inside the Qemu machine. I tried different resolutions but that does not help. Is there any way to fix this or any way how I can provide more information? Thanks To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1859081/+subscriptions
[Bug 1859106] [NEW] 4.2 regression: ReactOS crashes on boot
Public bug reported: In QEMU 4.1.x and earlier, ReactOS can successfully boot, but starting with 4.2, it fails, instead coming up with an error "The video driver failed to initialize." This happens regardless of VM configuration (even -M pc-i440fx-4.1) and it works well with older versions of QEMU. bisecting QEMU to find the first bad commit reveals 0221d73ce6a8e075adaa0a35a6ef853d2652b855 as the culprit, which is updating the seabios version; perhaps this bug belongs there, but I don't know the appropriate avenue (it seems seabios is a subproject of QEMU anyway?). ** Affects: qemu Importance: Undecided Status: New -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1859106 Title: 4.2 regression: ReactOS crashes on boot Status in QEMU: New Bug description: In QEMU 4.1.x and earlier, ReactOS can successfully boot, but starting with 4.2, it fails, instead coming up with an error "The video driver failed to initialize." This happens regardless of VM configuration (even -M pc-i440fx-4.1) and it works well with older versions of QEMU. bisecting QEMU to find the first bad commit reveals 0221d73ce6a8e075adaa0a35a6ef853d2652b855 as the culprit, which is updating the seabios version; perhaps this bug belongs there, but I don't know the appropriate avenue (it seems seabios is a subproject of QEMU anyway?). To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1859106/+subscriptions
[Bug 1859106] Re: 4.2 regression: ReactOS crashes on boot
I should add, ReactOS can be downloaded from here: https://reactos.org/download The LiveCD is sufficient to see the problem. -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1859106 Title: 4.2 regression: ReactOS crashes on boot Status in QEMU: New Bug description: In QEMU 4.1.x and earlier, ReactOS can successfully boot, but starting with 4.2, it fails, instead coming up with an error "The video driver failed to initialize." This happens regardless of VM configuration (even -M pc-i440fx-4.1) and it works well with older versions of QEMU. bisecting QEMU to find the first bad commit reveals 0221d73ce6a8e075adaa0a35a6ef853d2652b855 as the culprit, which is updating the seabios version; perhaps this bug belongs there, but I don't know the appropriate avenue (it seems seabios is a subproject of QEMU anyway?). To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1859106/+subscriptions
Re: [virtio-dev][RFC PATCH v1 2/2] virtio-gpu: add the ability to export resources
I like the idea of having one central place in the kernel where virtio-devices get their uuid from -- i.e, no separate VM specific, device specific implementations calling into uuid_gen(). On Wed, Jan 8, 2020 at 3:20 AM David Stevens wrote: > > > Is there a specific reason why you want the host pick the uuid? I would > > let the guest define the uuid, i.e. move the uuid fields to > > virtio_gpu_export_resource and scratch virtio_gpu_resp_export_resource. > > Sending the uuid in the original request doesn't really buy us > anything, at least in terms of asynchronicity. The guest would still > need to wait for the response to arrive before it could safely pass > the uuid to any other virtio devices, to prevent a race where the > import fails because it is processed before virtio-gpu processes the > export. Perhaps this wouldn't be the case if we supported sharing > fences between virtio devices, but even then, fences are more of a > thing for the operation of a pipeline, not for the setup of a > pipeline. > > At that point, I think it's just a matter of aesthetics. I lean > slightly towards returning the uuid from the host, since that rules > out any implementation with the aforementioned race. That being said, > if there are any specific reasons or preferences to assigning the uuid > from the guest, I can switch to that direction. > > -David
Re: [PATCH qemu v4] spapr: Kill SLOF
On 10/01/2020 10:32, Alexey Kardashevskiy wrote: > > > On 10/01/2020 10:05, Alexey Kardashevskiy wrote: >> >> >> On 08/01/2020 17:18, Alexey Kardashevskiy wrote: >>> The Petitboot bootloader is way more advanced than SLOF is ever going to >>> be as Petitboot comes with the full-featured Linux kernel with all >>> the drivers, and initramdisk with quite user friendly interface. >>> The problem with ditching SLOF is that an unmodified pseries kernel can >>> either start via: >>> 1. kexec, this requires presence of RTAS and skips >>> ibm,client-architecture-support entirely; >>> 2. normal boot, this heavily relies on the OF1275 client interface to >>> fetch the device tree and do early setup (claim memory). >>> >>> This adds a new bios-less mode to the pseries machine: "bios=on|off". >>> When enabled, QEMU does not load SLOF and jumps to the kernel from >>> "-kernel". >>> >>> The client interface is implemented exactly as RTAS - a 20 bytes blob, >>> right next after the RTAS blob. The entry point is passed to the kernel >>> via GPR5. >>> >>> This implements a handful of client interface methods just to get going. >>> In particular, this implements the device tree fetching, >>> ibm,client-architecture-support and instantiate-rtas. >>> >>> This implements changing FDT properties for RTAS (for vmlinux and zImage) >>> and initramdisk location (for zImage). To make this work, this skips >>> fdt_pack() when bios=off as not packing the blob leaves some room for >>> appending. >>> >>> This assigns "phandles" to device tree nodes as there is no more SLOF >>> and OF nodes addresses of which served as phandle values. >>> This keeps predefined nodes (such as XICS/NVLINK/...) unchanged. >>> phandles are regenerated at every FDT rebuild. >>> >>> This defines phandles for VIO devices to have phandle assigned to >>> the default stdout device at the point when we write "/chosen/stdout" >>> which an ihandle which the OS uses to write to the console. >> >> >> And I do not really need to preallocate phandles for stdout as it is a >> leftover from when I populated /chosen/stdout before populating VIO >> nodes, now /chosen/stdout is added at the very end. Thanks, > > > Ah noo, I do, to implement "write" to the selected stdout as I need to > trace ihandle back to Object* and object_resolve_path() does not know > about FDT path, it is /machine/peripheral/svty0 in QOM. The commit log > needs an update, or this needs a fix but I cannot think of a nicer one. > Thanks, I just might extend instances to do real instances, i.e. associate ihandle with phandle _and_ Object*, I just need a helper to find Object which matches what qdev_get_fw_dev_path() returns. Fun :) > > >> >> >> >>> >>> When bios=off, this adds "/chosen" every time QEMU builds a tree. >>> >>> This implements "claim" which the client (Linux) uses for memory >>> allocation; this is also used by QEMU for claiming kernel/initrd images, >>> client interface entry point, RTAS and the initial stack. >>> >>> While at this, add a "kernel-addr" machine parameter to allow moving >>> the kernel in memory. This is useful for debugging if the kernel is >>> loaded at @0, although not necessary. >>> >>> This adds very basic instances support which are managed by a hashmap >>> ihandle->phandle. >>> >>> Note that a 64bit PCI fix is required for Linux: >>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=df5be5be8735e >> >> >> > -- Alexey
Does ppc target support for MTTCG are stable?
I enabled it and working, but I don't know if there any other issue are still not resolved.. -- 此致 礼 罗勇刚 Yours sincerely, Yonggang Luo
Re: [PATCH] riscv/sifive_u: fix a memory leak in soc_realize()
On Tue, 10 Dec 2019 10:38:29 PST (-0800), alistai...@gmail.com wrote: On Mon, Dec 9, 2019 at 11:15 PM wrote: From: Pan Nengyuan Fix a minor memory leak in riscv_sifive_u_soc_realize() Reported-by: Euler Robot Signed-off-by: Pan Nengyuan Reviewed-by: Alistair Francis Thanks. This is in the queue, I'm hoping to submit a PR after the H patch set is ready to go. Alistair --- hw/riscv/sifive_u.c | 1 + 1 file changed, 1 insertion(+) diff --git a/hw/riscv/sifive_u.c b/hw/riscv/sifive_u.c index 0140e95..0e12b3c 100644 --- a/hw/riscv/sifive_u.c +++ b/hw/riscv/sifive_u.c @@ -542,6 +542,7 @@ static void riscv_sifive_u_soc_realize(DeviceState *dev, Error **errp) SIFIVE_U_PLIC_CONTEXT_BASE, SIFIVE_U_PLIC_CONTEXT_STRIDE, memmap[SIFIVE_U_PLIC].size); +g_free(plic_hart_config); sifive_uart_create(system_memory, memmap[SIFIVE_U_UART0].base, serial_hd(0), qdev_get_gpio_in(DEVICE(s->plic), SIFIVE_U_UART0_IRQ)); sifive_uart_create(system_memory, memmap[SIFIVE_U_UART1].base, -- 2.7.2.windows.1
Re: [PATCH qemu v4] spapr: Kill SLOF
On 10/01/2020 10:05, Alexey Kardashevskiy wrote: > > > On 08/01/2020 17:18, Alexey Kardashevskiy wrote: >> The Petitboot bootloader is way more advanced than SLOF is ever going to >> be as Petitboot comes with the full-featured Linux kernel with all >> the drivers, and initramdisk with quite user friendly interface. >> The problem with ditching SLOF is that an unmodified pseries kernel can >> either start via: >> 1. kexec, this requires presence of RTAS and skips >> ibm,client-architecture-support entirely; >> 2. normal boot, this heavily relies on the OF1275 client interface to >> fetch the device tree and do early setup (claim memory). >> >> This adds a new bios-less mode to the pseries machine: "bios=on|off". >> When enabled, QEMU does not load SLOF and jumps to the kernel from >> "-kernel". >> >> The client interface is implemented exactly as RTAS - a 20 bytes blob, >> right next after the RTAS blob. The entry point is passed to the kernel >> via GPR5. >> >> This implements a handful of client interface methods just to get going. >> In particular, this implements the device tree fetching, >> ibm,client-architecture-support and instantiate-rtas. >> >> This implements changing FDT properties for RTAS (for vmlinux and zImage) >> and initramdisk location (for zImage). To make this work, this skips >> fdt_pack() when bios=off as not packing the blob leaves some room for >> appending. >> >> This assigns "phandles" to device tree nodes as there is no more SLOF >> and OF nodes addresses of which served as phandle values. >> This keeps predefined nodes (such as XICS/NVLINK/...) unchanged. >> phandles are regenerated at every FDT rebuild. >> >> This defines phandles for VIO devices to have phandle assigned to >> the default stdout device at the point when we write "/chosen/stdout" >> which an ihandle which the OS uses to write to the console. > > > And I do not really need to preallocate phandles for stdout as it is a > leftover from when I populated /chosen/stdout before populating VIO > nodes, now /chosen/stdout is added at the very end. Thanks, Ah noo, I do, to implement "write" to the selected stdout as I need to trace ihandle back to Object* and object_resolve_path() does not know about FDT path, it is /machine/peripheral/svty0 in QOM. The commit log needs an update, or this needs a fix but I cannot think of a nicer one. Thanks, > > > >> >> When bios=off, this adds "/chosen" every time QEMU builds a tree. >> >> This implements "claim" which the client (Linux) uses for memory >> allocation; this is also used by QEMU for claiming kernel/initrd images, >> client interface entry point, RTAS and the initial stack. >> >> While at this, add a "kernel-addr" machine parameter to allow moving >> the kernel in memory. This is useful for debugging if the kernel is >> loaded at @0, although not necessary. >> >> This adds very basic instances support which are managed by a hashmap >> ihandle->phandle. >> >> Note that a 64bit PCI fix is required for Linux: >> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=df5be5be8735e > > > -- Alexey
Re: [PATCH qemu v4] spapr: Kill SLOF
On 08/01/2020 17:18, Alexey Kardashevskiy wrote: > The Petitboot bootloader is way more advanced than SLOF is ever going to > be as Petitboot comes with the full-featured Linux kernel with all > the drivers, and initramdisk with quite user friendly interface. > The problem with ditching SLOF is that an unmodified pseries kernel can > either start via: > 1. kexec, this requires presence of RTAS and skips > ibm,client-architecture-support entirely; > 2. normal boot, this heavily relies on the OF1275 client interface to > fetch the device tree and do early setup (claim memory). > > This adds a new bios-less mode to the pseries machine: "bios=on|off". > When enabled, QEMU does not load SLOF and jumps to the kernel from > "-kernel". > > The client interface is implemented exactly as RTAS - a 20 bytes blob, > right next after the RTAS blob. The entry point is passed to the kernel > via GPR5. > > This implements a handful of client interface methods just to get going. > In particular, this implements the device tree fetching, > ibm,client-architecture-support and instantiate-rtas. > > This implements changing FDT properties for RTAS (for vmlinux and zImage) > and initramdisk location (for zImage). To make this work, this skips > fdt_pack() when bios=off as not packing the blob leaves some room for > appending. > > This assigns "phandles" to device tree nodes as there is no more SLOF > and OF nodes addresses of which served as phandle values. > This keeps predefined nodes (such as XICS/NVLINK/...) unchanged. > phandles are regenerated at every FDT rebuild. > > This defines phandles for VIO devices to have phandle assigned to > the default stdout device at the point when we write "/chosen/stdout" > which an ihandle which the OS uses to write to the console. And I do not really need to preallocate phandles for stdout as it is a leftover from when I populated /chosen/stdout before populating VIO nodes, now /chosen/stdout is added at the very end. Thanks, > > When bios=off, this adds "/chosen" every time QEMU builds a tree. > > This implements "claim" which the client (Linux) uses for memory > allocation; this is also used by QEMU for claiming kernel/initrd images, > client interface entry point, RTAS and the initial stack. > > While at this, add a "kernel-addr" machine parameter to allow moving > the kernel in memory. This is useful for debugging if the kernel is > loaded at @0, although not necessary. > > This adds very basic instances support which are managed by a hashmap > ihandle->phandle. > > Note that a 64bit PCI fix is required for Linux: > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=df5be5be8735e -- Alexey
Re: version 3.2 is missing from QEMU download page
On Thursday, January 9, 2020, Aleksandar Markovic < aleksandar.m.m...@gmail.com> wrote: > Hi all, > > For some reason, QEMU download page > > https://download.qemu.org > > seems to contain tarballs of all QEMU versions/releases, except 3.2 and > its follow-ups. This broke up some scripts that rely on presence of all > versions on that page. > > Hm, sorry, I think I for some reason thought that 3.2 must have existed, but it looks we had only 3.0 and 3.1 in 2018. Sorry for the noise, Alwksandar > I don't know who is in charge of QEMU download page, but I would like to > ask anyone with this power to fix the absence of 3.2, if possible. > > Thanks in advance, > Aleksandar >
version 3.2 is missing from QEMU download page
Hi all, For some reason, QEMU download page https://download.qemu.org seems to contain tarballs of all QEMU versions/releases, except 3.2 and its follow-ups. This broke up some scripts that rely on presence of all versions on that page. I don't know who is in charge of QEMU download page, but I would like to ask anyone with this power to fix the absence of 3.2, if possible. Thanks in advance, Aleksandar
Re: [kvm-unit-tests PATCH 05/10] arm: pmu: Basic event counter Tests
Hi Andre, On 1/7/20 1:19 PM, Andre Przywara wrote: > On Mon, 16 Dec 2019 21:47:52 +0100 > Eric Auger wrote: > > Hi Eric, > > thanks a lot for your work on these elaborate tests! I have some PMU test > extensions as well, but they are nowhere as sophisticated as yours! > > Just ran this on my ThunderX2 desktop (4.15.0-65-generic #74-Ubuntu kernel, > QEMU emulator version 2.11.1(Debian 1:2.11+dfsg-1ubuntu7.21)), and it > reported the following fails: > INFO: pmu: basic-event-count: counter #0 is 0x207e (CPU_CYCLES) > INFO: pmu: basic-event-count: counter #1 is 0xc89 (INST_RETIRED) > INFO: pmu: basic-event-count: overflow reg = 0x0 > FAIL: pmu: basic-event-count: check overflow happened on #0 only > > INFO: PMU version: 4 > INFO: Implements 6 event counters > INFO: pmu: mem-access: counter #0 is 1297 (MEM_ACCESS) > INFO: pmu: mem-access: counter #1 is 1287 (MEM_ACCESS) > FAIL: pmu: mem-access: Ran 20 mem accesses > FAIL: pmu: mem-access: Ran 20 mem accesses with expected overflows on both > counters > INFO: pmu: mem-access: cnt#0 = 1353 cnt#1=1259 overflow=0x0 > > Do you know about this? Is this due to kernel bugs? Because Ubuntu cleverly > chose an EOL kernel for their stable distro :-P > Will try to have a look and repeat on a Juno. I think this kernel version misses: 30d97754b2d1 KVM: arm/arm64: Re-create event when setting counter value In the tests I am setting the TYPER before presetting the counter #0 value so this may be related. You may try to set the counter #0 pmevtyper again just after the pmevcntr #0 preset to validate the above. write_regn(pmevcntr, 0, 0xFFF0); write_regn(pmevtyper, 0, events[0] | PMEVTYPER_EXCLUDE_EL0); Besides I am only testing against the master branch. > > Comments inline > >> Adds the following tests: >> - event-counter-config: test event counter configuration >> - basic-event-count: >> - programs counters #0 and #1 to count 2 required events >> (resp. CPU_CYCLES and INST_RETIRED). Counter #0 is preset >> to a value close enough to the 32b >> overflow limit so that we check the overflow bit is set >> after the execution of the asm loop. >> - mem-access: counts MEM_ACCESS event on counters #0 and #1 >> with and without 32-bit overflow. >> >> Signed-off-by: Eric Auger >> --- >> arm/pmu.c | 261 ++ >> arm/unittests.cfg | 18 >> 2 files changed, 279 insertions(+) >> >> diff --git a/arm/pmu.c b/arm/pmu.c >> index d88ef22..139dae3 100644 >> --- a/arm/pmu.c >> +++ b/arm/pmu.c >> @@ -18,9 +18,15 @@ >> #include "asm/barrier.h" >> #include "asm/sysreg.h" >> #include "asm/processor.h" >> +#include >> +#include >> >> #define PMU_PMCR_E (1 << 0) >> +#define PMU_PMCR_P (1 << 1) >> #define PMU_PMCR_C (1 << 2) >> +#define PMU_PMCR_D (1 << 3) >> +#define PMU_PMCR_X (1 << 4) >> +#define PMU_PMCR_DP(1 << 5) >> #define PMU_PMCR_LC(1 << 6) >> #define PMU_PMCR_N_SHIFT 11 >> #define PMU_PMCR_N_MASK0x1f >> @@ -104,6 +110,9 @@ static inline void precise_instrs_loop(int loop, >> uint32_t pmcr) >> >> /* event counter tests only implemented for aarch64 */ >> static void test_event_introspection(void) {} >> +static void test_event_counter_config(void) {} >> +static void test_basic_event_count(void) {} >> +static void test_mem_access(void) {} >> >> #elif defined(__aarch64__) >> #define ID_AA64DFR0_PERFMON_SHIFT 8 >> @@ -145,6 +154,32 @@ static inline void precise_instrs_loop(int loop, >> uint32_t pmcr) >> } >> >> #define PMCEID1_EL0 sys_reg(11, 3, 9, 12, 7)>> +#define PMCNTENSET_EL0 >> sys_reg(11, 3, 9, 12, 1) >> +#define PMCNTENCLR_EL0 sys_reg(11, 3, 9, 12, 2) > > op0 (the first argument) is only two bits, so it should read "3" instead of > "11" here. That's already a bug in the existing PMCEID1_EL0 definition. We > get away with it because the macro masks with 3, but it should be still > written correctly here. Not sure where the "11" actually comes from. op0 op1 CRn CRm op2 PMCEID1_EL0 11 011 1001 1100 111 binary read as dec :-( > >> + >> +#define PMEVTYPER_EXCLUDE_EL1 (1 << 31) >> +#define PMEVTYPER_EXCLUDE_EL0 (1 << 30) >> + >> +#define regn_el0(__reg, __n) __reg ## __n ## _el0 >> +#define write_regn(__reg, __n, __val) \ >> +write_sysreg((__val), __reg ## __n ## _el0) >> + >> +#define read_regn(__reg, __n) \ >> +read_sysreg(__reg ## __n ## _el0) >> + >> +#define print_pmevtyper(__s, __n) do { \ >> +uint32_t val; \ >> +val = read_regn(pmevtyper, __n);\ >> +report_info("%s pmevtyper%d=0x%x, eventcount=0x%x (p=%ld, u=%ld >> nsk=%ld, nsu=%ld, nsh=%ld m=%ld, mt=%ld)", \ >> +(__s), (__n), val, val & 0x, \ >> +(BIT_MASK(31) & val) >> 31, \ >> +(BIT_MASK(30) & val) >> 30, \ >> +(BIT_MASK(29) & val) >> 29, \ >> +(BIT_MASK(28) & val) >> 28, \ >> +(BIT_MASK(27) & va
[PATCH] crypto: fix getter of a QCryptoSecret's property
This fixes the condition-check done by the "loaded" property getter, such that the property returns true even when the secret is loaded by the 'file' option. Signed-off-by: Tong Ho --- Pre-existing getter returns true only when the secret is loaded by the 'data' option. crypto/secret.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/crypto/secret.c b/crypto/secret.c index 1cf0ad0..5fb6bbe 100644 --- a/crypto/secret.c +++ b/crypto/secret.c @@ -221,6 +221,7 @@ qcrypto_secret_prop_set_loaded(Object *obj, secret->rawlen = inputlen; } else { g_free(secret->rawdata); +secret->rawdata = NULL; secret->rawlen = 0; } } @@ -231,7 +232,7 @@ qcrypto_secret_prop_get_loaded(Object *obj, Error **errp G_GNUC_UNUSED) { QCryptoSecret *secret = QCRYPTO_SECRET(obj); -return secret->data != NULL; +return secret->rawdata != NULL; } -- 2.7.4
[Bug 1859081] [NEW] Mouse way too fast when Qemu is on a Windows VM with a OS 9 Guest
Public bug reported: On a server, I have a Windows 10 VM with Qemu 4.1.0 (latest) from https://qemu.weilnetz.de/w64/ installed. There I have a Mac OS 9.2.2 machine. Now if I connect to the Windows VM with VNC or RDP or even VMWare console, the Mouse in the Mac OS Guest inside Qemu is wy to fast. Even when lowering the mouse speed in the Mac OS mouse setting, one pixel in the Host (Windows 10 VM) still moves the mouse by 10 pixels inside the Qemu machine. I tried different resolutions but that does not help. Is there any way to fix this or any way how I can provide more information? Thanks ** Affects: qemu Importance: Undecided Status: New -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1859081 Title: Mouse way too fast when Qemu is on a Windows VM with a OS 9 Guest Status in QEMU: New Bug description: On a server, I have a Windows 10 VM with Qemu 4.1.0 (latest) from https://qemu.weilnetz.de/w64/ installed. There I have a Mac OS 9.2.2 machine. Now if I connect to the Windows VM with VNC or RDP or even VMWare console, the Mouse in the Mac OS Guest inside Qemu is wy to fast. Even when lowering the mouse speed in the Mac OS mouse setting, one pixel in the Host (Windows 10 VM) still moves the mouse by 10 pixels inside the Qemu machine. I tried different resolutions but that does not help. Is there any way to fix this or any way how I can provide more information? Thanks To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1859081/+subscriptions
Re: [PATCH] virtio: Prevent double swap due to target pre 1.0 VirtIO
Hi Greg, Thanks for the commit info. But I'm testing in this scenario, that is, a ppc64le host with a ppc64 BE guest, and without my patch I can't get virtio to work. The patch makes virtio 0.95 (legacy) net, scsi, blk work. I don't get the firmware error. I also tested with a ppc64le guest and had no problems either. Maybe we have different firmware versions? My firmware output: SLOF ** QEMU Starting Build Date = Jul 3 2019 12:26:14 FW Version = git-ba1ab360eebe6338 Press "s" to enter Open Firmware. Populating /vdevice methods Populating /vdevice/vty@7100 Populating /vdevice/nvram@7101 Populating /vdevice/v-scsi@7102 SCSI: Looking for devices 8200 CD-ROM : "QEMU QEMU CD-ROM 2.5+" Populating /pci@8002000 00 (D) : 1af4 1000virtio [ net ] 00 0800 (D) : 1af4 1001virtio [ block ] No NVRAM common partition, re-initializing... Scanning USB Using default console: /vdevice/vty@7100 Welcome to Open Firmware Copyright (c) 2004, 2017 IBM Corporation All rights reserved. This program and the accompanying materials are made available under the terms of the BSD License available at http://www.opensource.org/licenses/bsd-license.php Trying to load: from: /pci@8002000/scsi@1 ... Successfully loaded >> FreeBSD/powerpc Open Firmware boot block Boot path: /pci@8002000/scsi@1 Boot loader: /boot/loader Boot volume: /pci@8002000/scsi@1:2 Consoles: Open Firmware console FreeBSD/powerpc64 Open Firmware loader, Revision 0.1 (Mon Nov 11 22:33:43 -02 2019 jenkins@FreeBSD_x86) Memory: 4194304KB Booted from: /pci@8002000/scsi@1 Loading /boot/defaults/loader.conf /boot/kernel/kernel data=0x129f658+0x4aaa88 syms=[0x8+0x105120+0x8+0x125429] ... Until now, I was able to test the patch and see virtio working on the following systems: Qemu Host Guest Guest VirtIO --- -- master Ubuntu ppc64le FreeBSD 13.0-current ppc64 BE legacy master Ubuntu ppc64le debian 4.19.0-6-powerpc64le modern master Ubuntu ppc64le debian 4.19.0-6-powerpc64le legacy master arch x86_64 FreeBSD 13.0-current ppc64 BE legacy Thanks, andré On Thu, Jan 9, 2020 at 1:06 PM Greg Kurz wrote: > > On Thu, 9 Jan 2020 07:39:17 -0500 > "Michael S. Tsirkin" wrote: > > > On Thu, Jan 09, 2020 at 09:25:42AM -0300, André Silva wrote: > > > Hi Michael! > > > Thanks for reviewing the patch! > > > > > > > we always get LE values from memory subsystem, > > > > not target endian values: > > > > > > I see. So do you think the patch is correct in eliminating the extra > > > swap (as virtio_config_readw for example already makes a swap)? > > > > > > Thanks, > > > andré > > > > I don't think it is, I think we do need an extra swap > > in some cases. It's possible that some cross-endian > > setups are broken now, if so pls include testing > > result not just theoretical analysis. > > > > I confirm that we must keep the extra swap otherwise > read/write in cross-endian setups will have wrong > endian. Please read this commit for a more detailed > explanation: > > commit 82afa58641b0e67abbaf4da6c325ebd7c2513262 > Author: Benjamin Herrenschmidt > Date: Tue Jan 10 01:35:11 2012 + > > virtio-pci: Fix endianness of virtio config > > https://git.qemu.org/?p=qemu.git;a=commitdiff;h=82afa58641b0e67abbaf4da6c325ebd7c2513262 > > This is especially critical on ppc64 since _all_ hosts are now LE > but the first piece of code in the guest that is likely to drive > the device is the SLOF firmware which is BE. > > This is what we get with this patch when trying to run a pseries guest on a > ppc64le host: > > Trying to load: from: /pci@8002000/scsi@0 ... virtioblk_transfer: > Access beyond end of device! > > Cheers, > > -- > Greg > > > > On Thu, Jan 9, 2020 at 7:50 AM Michael S. Tsirkin wrote: > > > > > > > > On Wed, Jan 08, 2020 at 01:16:18PM -0300, Andre Silva wrote: > > > > > Remove the bswap function calls after reading and before writing > > > > > memory bytes in virtio_pci_config_read and virtio_pci_config_write > > > > > because they are reverting back an already swapped bytes. > > > > > > > > > > Consider the table below in the context of virtio_pci_config_read > > > > > function. > > > > > > > > > > Host Target virtio-config-read[wl] > > > > >swap? virtio-is-big-endian? extra > > > > > bswap? Should be Final result Final result ok? > > > > > - --- --- > > > > > -- --- -- -- > > > > > LE BE s(x)true > > > > > s(s(x))s(x)x No > > > > > LE LE x
Re: [PATCH] target/arm/arm-semi: fix SYS_OPEN to return nonzero filehandle
On 1/9/20 3:12 PM, Masahiro Yamada wrote: > According to the specification "Semihosting for AArch32 and Aarch64", > the SYS_OPEN operation should return: > > - A nonzero handle if the call is successful > - -1 if the call is not successful > > So, it should never return 0. > > Prior to commit 35e9a0a8ce4b ("target/arm/arm-semi: Make semihosting > code hand out its own file descriptors"), the guest fd matched to the > host fd. It returned a nonzero handle on success since the fd 0 is > already used for stdin. > > Now that the guest fd is the index of guestfd_array, it starts from 0. > > I noticed this issue particularly because Trusted Firmware-A built with > PLAT=qemu is no longer working. Its io_semihosting driver only handles > a positive return value as a valid filehandle. > > Basically, there are two ways to fix this: > > - Use (guestfd - 1) as the index of guestfs_arrary. We need to insert > increment/decrement to convert the guestfd and the array index back > and forth. > > - Keep using guestfd as the index of guestfs_array. The first entry > of guestfs_array is left unused. > > I thought the latter is simpler. We end up with wasting a small piece > of memory for the unused first entry of guestfd_array, but this is > probably not a big deal. > > Fixes: 35e9a0a8ce4b ("target/arm/arm-semi: Make semihosting code hand out its > own file descriptors") > Signed-off-by: Masahiro Yamada > --- Reviewed-by: Richard Henderson r~
Re: [PATCH v1 06/59] mips-semi.c: remove 'uhi_done' label in helper_do_semihosting()
On Monday, January 6, 2020, Daniel Henrique Barboza wrote: > The label 'uhi_done' is a simple 'return' call and can > be removed for a bit more clarity in the code. > > CC: Aurelien Jarno > CC: Aleksandar Markovic > CC: Aleksandar Rikalo > Signed-off-by: Daniel Henrique Barboza > --- > target/mips/mips-semi.c | 15 +++ > 1 file changed, 7 insertions(+), 8 deletions(-) > > Reviewed-by: Aleksandar Markovic > diff --git a/target/mips/mips-semi.c b/target/mips/mips-semi.c > index 35bdfd7c77..10a710c1e8 100644 > --- a/target/mips/mips-semi.c > +++ b/target/mips/mips-semi.c > @@ -218,7 +218,7 @@ static int copy_argn_to_target(CPUMIPSState *env, int > arg_num, > if (!p) { \ > gpr[2] = -1;\ > gpr[3] = EFAULT;\ > -goto uhi_done; \ > +return; \ > } \ > } while (0) > > @@ -228,14 +228,14 @@ static int copy_argn_to_target(CPUMIPSState *env, > int arg_num, > if (!p) { \ > gpr[2] = -1;\ > gpr[3] = EFAULT;\ > -goto uhi_done; \ > +return; \ > } \ > p2 = lock_user_string(addr2); \ > if (!p2) { \ > unlock_user(p, addr, 0);\ > gpr[2] = -1;\ > gpr[3] = EFAULT;\ > -goto uhi_done; \ > +return; \ > } \ > } while (0) > > @@ -272,7 +272,7 @@ void helper_do_semihosting(CPUMIPSState *env) > if (gpr[4] < 3) { > /* ignore closing stdin/stdout/stderr */ > gpr[2] = 0; > -goto uhi_done; > +return; > } > gpr[2] = close(gpr[4]); > gpr[3] = errno_mips(errno); > @@ -302,7 +302,7 @@ void helper_do_semihosting(CPUMIPSState *env) > gpr[2] = fstat(gpr[4], &sbuf); > gpr[3] = errno_mips(errno); > if (gpr[2]) { > -goto uhi_done; > +return; > } > gpr[2] = copy_stat_to_target(env, &sbuf, gpr[5]); > gpr[3] = errno_mips(errno); > @@ -314,14 +314,14 @@ void helper_do_semihosting(CPUMIPSState *env) > case UHI_argnlen: > if (gpr[4] >= semihosting_get_argc()) { > gpr[2] = -1; > -goto uhi_done; > +return; > } > gpr[2] = strlen(semihosting_get_arg(gpr[4])); > break; > case UHI_argn: > if (gpr[4] >= semihosting_get_argc()) { > gpr[2] = -1; > -goto uhi_done; > +return; > } > gpr[2] = copy_argn_to_target(env, gpr[4], gpr[5]); > break; > @@ -369,6 +369,5 @@ void helper_do_semihosting(CPUMIPSState *env) > fprintf(stderr, "Unknown UHI operation %d\n", op); > abort(); > } > -uhi_done: > return; > } > -- > 2.24.1 > > >
Re: [PATCH v1 00/59] trivial unneeded labels cleanup
On 1/7/20 6:43 AM, Max Reitz wrote: On 06.01.20 19:23, Daniel Henrique Barboza wrote: [...] For me, it doesn’t require any brain cycles, because I generally just assume the cleanup label will do the right thing. OTOH, a return statement may make me invest some some brain cycles, because maybe there is something to be cleaned up and it isn’t done here. Another common case uses a variable to set a return value, generally an error, then return: if () { ret = -ENOENT; goto out; } [..] out: return ret; Likewise, it is clearer to just 'return -ENOENT' instead of jumping to a label. There are other cases being handled in these patches, but these are the most common. I find it clearer from the perspective of “less LoC”, but I find it less clear from the perspective of “Is this the right way to clean up?”. Even on patch 15 (which you say isn’t too much of a debate), I don’t find the change to make things any clearer. Just less verbose. Fair enough. As I said in the cover, all this patches makes no functional changes, just a clean up aiming for less LoC (and more clarity, at least in my opinion). I am aware all the good sides of keeping the code as is, such as being easier to debug (although I would argue that an explicit trace_event call is better than keeping verbose code 'just in case'), or goto usage to keep just one return statement per function. I am also aware that the existing QEMU code base has a mesh of styles. What I'm proposing here isn't a 'my way is better' case by any means, but it's not something unprecedented in the existing code base either. Since there's no QEMU code guideline imposing that a function should have only one 'return' statement regardless of how many 'goto' calls are needed, or a guideline that discourages 'goto' calls regardless of how many 'return' calls are needed, in the end it's a matter of seeing what fits the function/code better. In the maintainers opinion, of course. Thanks, DHB I suppose none of this would matter if we used __attribute__((cleanup)) everywhere and simply never had to clean up anything manually. But as long as we don’t and require cleanup paths in many places, I disagree that they require more brain cycles than plain return statements. Max
Re: [PATCH 092/104] virtiofsd: add man page
* Daniel P. Berrangé (berra...@redhat.com) wrote: > On Thu, Dec 12, 2019 at 04:38:52PM +, Dr. David Alan Gilbert (git) wrote: > > From: Stefan Hajnoczi > > > > Signed-off-by: Stefan Hajnoczi > > --- > > Makefile | 7 +++ > > tools/virtiofsd/virtiofsd.texi | 85 ++ > > 2 files changed, 92 insertions(+) > > create mode 100644 tools/virtiofsd/virtiofsd.texi > > Reviewed-by: Daniel P. Berrangé Thanks. > with some notes at the very end > > +@c man begin DESCRIPTION > > + > > +Share a host directory tree with a guest through a virtio-fs device. This > > +program is a vhost-user backend that implements the virtio-fs device. Each > > +virtio-fs device instance requires its own virtiofsd process. > > + > > +This program is designed to work with QEMU's @code{--device > > vhost-user-fs-pci} > > +but should work with any virtual machine monitor (VMM) that supports > > +vhost-user. See the EXAMPLES section below. > > + > > +This program must be run as the root user. > > So there's no way for an unprivileged user to do file sharing like they > can with 9p right now ? Correct. (Which also makes it a pain for using in a make check) > > Upon startup the program will > > +switch into a new file system namespace with the shared directory tree as > > its > > +root. This prevents "file system escapes" due to symlinks and other file > > +system objects that might lead to files outside the shared directory. The > > +program also sandboxes itself using seccomp(2) to prevent ptrace(2) and > > other > > +vectors that could allow an attacker to compromise the system after gaining > > +control of the virtiofsd process. > > + > > +@c man end > > + > > +@c man begin OPTIONS > > +@table @option > > +@item -h, --help > > +Print help. > > +@item -V, --version > > +Print version. > > +@item -d, -o debug > > +Enable debug output. > > +@item --syslog > > +Print log messages to syslog instead of stderr. > > +@item -o log_level=@var{level} > > +Print only log messages matching @var{level} or more severe. @var{level} > > is > > +one of @code{err}, @code{warn}, @code{info}, or @code{debug}. The default > > is > > +@var{info}. > > +@item -o source=@var{path} > > +Share host directory tree located at @var{path}. This option is required. > > +@item --socket-path=@var{path}, -o vhost_user_socket=@var{path} > > +Listen on vhost-user UNIX domain socket at @var{path}. > > +@item --fd=@var{fdnum} > > +Accept connections from vhost-user UNIX domain socket file descriptor > > @var{fdnum}. The file descriptor must already be listening for connections. > > +@item --thread-pool-size=@var{num} > > +Restrict the number of worker threads per request queue to @var{num}. The > > default is 64. > > +@item --cache=@code{none}|@code{auto}|@code{always} > > +Select the desired trade-off between coherency and performance. > > @code{none} > > +forbids the FUSE client from caching to achieve best coherency at the cost > > of > > +performance. @code{auto} acts similar to NFS with a 1 second metadata > > cache > > +timeout. @code{always} sets a long cache lifetime at the expense of > > coherency. > > +@item --writeback > > +Enable writeback cache, allowing the FUSE client to buffer and merge write > > requests. > > +@end table > > +@c man end > > + > > +@c man begin EXAMPLES > > +Export @code{/var/lib/fs/vm001/} on vhost-user UNIX domain socket > > @code{/var/run/vm001-vhost-fs.sock}: > > + > > +@example > > +host# virtiofsd --socket-path=/var/run/vm001-vhost-fs.sock -o > > source=/var/lib/fs/vm001 > > +host# qemu-system-x86_64 \ > > +-chardev socket,id=char0,path=/var/run/vm001-vhost-fs.sock \ > > +-device vhost-user-fs-pci,chardev=char0,tag=myfs \ > > +-object memory-backend-file,id=mem,size=4G,mem-path=/dev/shm,share=on \ > > +-numa node,memdev=mem \ > > +... > > +guest# mount -t virtio_fs \ > > +-o > > default_permissions,allow_other,user_id=0,group_id=0,rootmode=04,dax \ > > +myfs /mnt > > +@end example > > +@c man end > > + > > +@ignore > > +@setfilename virtiofsd > > +@settitle QEMU virtio-fs shared file system daemon > > + > > +@c man begin AUTHOR > > s/AUTHOR/COPYRIGHT/ OK > since this isn't providing any author information. > > > +Copyright (C) 2019 Red Hat, Inc. > > 2019-2020 ! Time flies... > And now insert > > @c man end > @c man begin LICENSE > > > +This is free software; see the source for copying conditions. There is NO > > +warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. > > +@c man end > > +@end ignore Hmm, so it ends up like: @c man end @ignore @setfilename virtiofsd @settitle QEMU virtio-fs shared file system daemon @c man begin COPYRIGHT Copyright (C) 2019-2020 Red Hat, Inc. @c man end @c man begin LICENSE This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS F
Re: [PATCH v1 00/59] trivial unneeded labels cleanup
On 1/7/20 4:06 AM, Philippe Mathieu-Daudé wrote: On 1/7/20 7:16 AM, Kevin Wolf wrote: Am 06.01.2020 um 21:35 hat Daniel Henrique Barboza geschrieben: On 1/6/20 4:54 PM, Corey Minyard wrote: On Mon, Jan 06, 2020 at 03:23:26PM -0300, Daniel Henrique Barboza wrote: Hello, [...] Which is cleaner and requires less brain cycles to wonder whether the 'cleanup' label does anything special, such as a heap memory cleanup. I would disagree with this analysis. To me, I often wonder when I have to add cleanup code to a routine whether there is some hidden return in the middle of the function. That's a lot harder to spot than just looking for the cleanup label at the end of the function to see what it does. For non-trivial functions I prefer to have one point of return at the end (and maybe some minor checks with returns right at the beginning). I'm not adamant about this, just my opinion. It depends on the case, but yes, I had similar thoughts, at least when we're talking about non-trivial parts of a function. (Very short functions of just some initial checks returning directly are usually fine.) From a debugging point of view, and when adding trace-events, it is easier to have a single function exit path. In various functions modified by your patches, we can split big functions in smaller ones and avoid the goto label. Splitting and refactoring big funcitons wasn't the initial idea of this work, but I see no problems in doing it. Care to point out where did you see such cases? Thanks, DHB
[PATCH v2 4/4] qcow2: Use BDRV_SECTOR_SIZE instead of the hardcoded value
This replaces all remaining instances in the qcow2 code. Signed-off-by: Alberto Garcia --- block/qcow2-cluster.c | 2 +- block/qcow2.c | 10 ++ 2 files changed, 7 insertions(+), 5 deletions(-) diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c index 932fc48919..777ca2d409 100644 --- a/block/qcow2-cluster.c +++ b/block/qcow2-cluster.c @@ -219,7 +219,7 @@ static int l2_load(BlockDriverState *bs, uint64_t offset, * Writes one sector of the L1 table to the disk (can't update single entries * and we really don't want bdrv_pread to perform a read-modify-write) */ -#define L1_ENTRIES_PER_SECTOR (512 / 8) +#define L1_ENTRIES_PER_SECTOR (BDRV_SECTOR_SIZE / 8) int qcow2_write_l1_entry(BlockDriverState *bs, int l1_index) { BDRVQcow2State *s = bs->opaque; diff --git a/block/qcow2.c b/block/qcow2.c index 783d2b9578..c0f3e715ef 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -3280,7 +3280,8 @@ qcow2_co_create(BlockdevCreateOptions *create_options, Error **errp) /* Validate options and set default values */ if (!QEMU_IS_ALIGNED(qcow2_opts->size, BDRV_SECTOR_SIZE)) { -error_setg(errp, "Image size must be a multiple of 512 bytes"); +error_setg(errp, "Image size must be a multiple of %u bytes", + (unsigned) BDRV_SECTOR_SIZE); ret = -EINVAL; goto out; } @@ -3836,7 +3837,7 @@ qcow2_co_copy_range_from(BlockDriverState *bs, case QCOW2_CLUSTER_NORMAL: child = s->data_file; copy_offset += offset_into_cluster(s, src_offset); -if ((copy_offset & 511) != 0) { +if (!QEMU_IS_ALIGNED(copy_offset, BDRV_SECTOR_SIZE)) { ret = -EIO; goto out; } @@ -3958,8 +3959,9 @@ static int coroutine_fn qcow2_co_truncate(BlockDriverState *bs, int64_t offset, return -ENOTSUP; } -if (offset & 511) { -error_setg(errp, "The new size must be a multiple of 512"); +if (!QEMU_IS_ALIGNED(offset, BDRV_SECTOR_SIZE)) { +error_setg(errp, "The new size must be a multiple of %u", + (unsigned) BDRV_SECTOR_SIZE); return -EINVAL; } -- 2.20.1
[PATCH v2 2/4] qcow2: Don't round the L1 table allocation up to the sector size
The L1 table is read from disk using the byte-based bdrv_pread() and is never accessed beyond its last element, so there's no need to allocate more memory than that. Signed-off-by: Alberto Garcia --- block/qcow2-cluster.c | 5 ++--- block/qcow2-refcount.c | 2 +- block/qcow2-snapshot.c | 3 +-- block/qcow2.c | 2 +- 4 files changed, 5 insertions(+), 7 deletions(-) diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c index 8982b7b762..932fc48919 100644 --- a/block/qcow2-cluster.c +++ b/block/qcow2-cluster.c @@ -124,12 +124,11 @@ int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size, #endif new_l1_size2 = sizeof(uint64_t) * new_l1_size; -new_l1_table = qemu_try_blockalign(bs->file->bs, - ROUND_UP(new_l1_size2, 512)); +new_l1_table = qemu_try_blockalign(bs->file->bs, new_l1_size2); if (new_l1_table == NULL) { return -ENOMEM; } -memset(new_l1_table, 0, ROUND_UP(new_l1_size2, 512)); +memset(new_l1_table, 0, new_l1_size2); if (s->l1_size) { memcpy(new_l1_table, s->l1_table, s->l1_size * sizeof(uint64_t)); diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c index f67ac6b2d8..c963bc8de1 100644 --- a/block/qcow2-refcount.c +++ b/block/qcow2-refcount.c @@ -1262,7 +1262,7 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs, * l1_table_offset when it is the current s->l1_table_offset! Be careful * when changing this! */ if (l1_table_offset != s->l1_table_offset) { -l1_table = g_try_malloc0(ROUND_UP(l1_size2, 512)); +l1_table = g_try_malloc0(l1_size2); if (l1_size2 && l1_table == NULL) { ret = -ENOMEM; goto fail; diff --git a/block/qcow2-snapshot.c b/block/qcow2-snapshot.c index 5ab64da1ec..82c32d4c9b 100644 --- a/block/qcow2-snapshot.c +++ b/block/qcow2-snapshot.c @@ -1024,8 +1024,7 @@ int qcow2_snapshot_load_tmp(BlockDriverState *bs, return ret; } new_l1_bytes = sn->l1_size * sizeof(uint64_t); -new_l1_table = qemu_try_blockalign(bs->file->bs, - ROUND_UP(new_l1_bytes, 512)); +new_l1_table = qemu_try_blockalign(bs->file->bs, new_l1_bytes); if (new_l1_table == NULL) { return -ENOMEM; } diff --git a/block/qcow2.c b/block/qcow2.c index 87ca2832f0..848a6c5182 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -1499,7 +1499,7 @@ static int coroutine_fn qcow2_do_open(BlockDriverState *bs, QDict *options, if (s->l1_size > 0) { s->l1_table = qemu_try_blockalign(bs->file->bs, -ROUND_UP(s->l1_size * sizeof(uint64_t), 512)); + s->l1_size * sizeof(uint64_t)); if (s->l1_table == NULL) { error_setg(errp, "Could not allocate L1 table"); ret = -ENOMEM; -- 2.20.1
[PATCH v2 0/4] qcow2: Misc BDRV_SECTOR_SIZE updates
This small series gets rid of all the remaining instances of hardcoded sector sizes in the qcow2 code and adds a check for images whose virtual size is not a multiple of the sector size. See the individual patches for details. Berto v2: - Modify output of iotest 080 to make it easier to understand [Nir] - Use the QEMU_IS_ALIGNED() macro instead of the modulus operator [Nir] - Tighten some assertions [Kevin] v1: https://lists.gnu.org/archive/html/qemu-block/2020-01/msg00139.html Alberto Garcia (4): qcow2: Require that the virtual size is a multiple of the sector size qcow2: Don't round the L1 table allocation up to the sector size qcow2: Tighten cluster_offset alignment assertions qcow2: Use BDRV_SECTOR_SIZE instead of the hardcoded value block/qcow2-cluster.c | 7 +++ block/qcow2-refcount.c | 2 +- block/qcow2-snapshot.c | 3 +-- block/qcow2.c | 28 +--- docs/interop/qcow2.txt | 3 ++- tests/qemu-iotests/080 | 8 tests/qemu-iotests/080.out | 5 + 7 files changed, 37 insertions(+), 19 deletions(-) -- 2.20.1
[PATCH v2 3/4] qcow2: Tighten cluster_offset alignment assertions
qcow2_alloc_cluster_offset() and qcow2_get_cluster_offset() always return offsets that are cluster-aligned so don't just check that they are sector-aligned. The check in qcow2_co_preadv_task() is also replaced by an assertion for the same reason. Signed-off-by: Alberto Garcia --- block/qcow2.c | 9 +++-- 1 file changed, 3 insertions(+), 6 deletions(-) diff --git a/block/qcow2.c b/block/qcow2.c index 848a6c5182..783d2b9578 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -2175,10 +2175,7 @@ static coroutine_fn int qcow2_co_preadv_task(BlockDriverState *bs, offset, bytes, qiov, qiov_offset); case QCOW2_CLUSTER_NORMAL: -if ((file_cluster_offset & 511) != 0) { -return -EIO; -} - +assert(QEMU_IS_ALIGNED(file_cluster_offset, s->cluster_size)); if (bs->encrypted) { return qcow2_co_preadv_encrypted(bs, file_cluster_offset, offset, bytes, qiov, qiov_offset); @@ -2514,7 +2511,7 @@ static coroutine_fn int qcow2_co_pwritev_part( goto out_locked; } -assert((cluster_offset & 511) == 0); +assert(QEMU_IS_ALIGNED(cluster_offset, s->cluster_size)); ret = qcow2_pre_write_overlap_check(bs, 0, cluster_offset + offset_in_cluster, @@ -3904,7 +3901,7 @@ qcow2_co_copy_range_to(BlockDriverState *bs, goto fail; } -assert((cluster_offset & 511) == 0); +assert(QEMU_IS_ALIGNED(cluster_offset, s->cluster_size)); ret = qcow2_pre_write_overlap_check(bs, 0, cluster_offset + offset_in_cluster, cur_bytes, true); -- 2.20.1
[PATCH v2 1/4] qcow2: Require that the virtual size is a multiple of the sector size
The qcow2 header specifies the virtual size of the image in bytes, but BlockDriverState stores it as a number of 512-byte sectors. If the user tries to create an image with a size that is not a multiple of the sector size then this is fixed on creation by silently rounding the image size up (see commit c2eb918e32). qcow2_co_truncate() is more strict and returns an error instead. However when an image is opened the virtual size is rounded down, which means that trying to access the last few advertised bytes will result in an error. As seen above QEMU cannot create such images and there's no good use case that would require us to try to handle them so let's just treat them as unsupported. Signed-off-by: Alberto Garcia --- block/qcow2.c | 7 +++ docs/interop/qcow2.txt | 3 ++- tests/qemu-iotests/080 | 8 tests/qemu-iotests/080.out | 5 + 4 files changed, 22 insertions(+), 1 deletion(-) diff --git a/block/qcow2.c b/block/qcow2.c index 7fbaac8457..87ca2832f0 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -1326,6 +1326,13 @@ static int coroutine_fn qcow2_do_open(BlockDriverState *bs, QDict *options, goto fail; } +if (!QEMU_IS_ALIGNED(header.size, BDRV_SECTOR_SIZE)) { +error_setg(errp, "Virtual size is not a multiple of %u", + (unsigned) BDRV_SECTOR_SIZE); +ret = -EINVAL; +goto fail; +} + if (header.header_length > sizeof(header)) { s->unknown_header_fields_size = header.header_length - sizeof(header); s->unknown_header_fields = g_malloc(s->unknown_header_fields_size); diff --git a/docs/interop/qcow2.txt b/docs/interop/qcow2.txt index af5711e533..891f5662d8 100644 --- a/docs/interop/qcow2.txt +++ b/docs/interop/qcow2.txt @@ -40,7 +40,8 @@ The first cluster of a qcow2 image contains the file header: with larger cluster sizes. 24 - 31: size -Virtual disk size in bytes. +Virtual disk size in bytes. qemu can only handle +sizes that are a multiple of 512 bytes. Note: qemu has an implementation limit of 32 MB as the maximum L1 table size. With a 2 MB cluster diff --git a/tests/qemu-iotests/080 b/tests/qemu-iotests/080 index 4bcb5021e8..6f136d616f 100755 --- a/tests/qemu-iotests/080 +++ b/tests/qemu-iotests/080 @@ -48,6 +48,7 @@ header_size=104 offset_backing_file_offset=8 offset_backing_file_size=16 +offset_virtual_size=24 offset_l1_size=36 offset_l1_table_offset=40 offset_refcount_table_offset=48 @@ -197,6 +198,13 @@ poke_file "$TEST_IMG" "$offset_snap1_l1_size" "\x10\x00\x00\x00" { $QEMU_IMG snapshot -d test $TEST_IMG; } 2>&1 | _filter_testdir _check_test_img +echo +echo "== Image size not a multiple of the sector size ==" +_make_test_img 64k +echo "Modifying virtual size to 65535 bytes" +poke_file "$TEST_IMG" "$offset_virtual_size" "\x00\x00\x00\x00\x00\x00\xff\xff" +{ $QEMU_IMG info $TEST_IMG; } 2>&1 | _filter_testdir | _filter_imgfmt + # success, all done echo "*** done" rm -f $seq.full diff --git a/tests/qemu-iotests/080.out b/tests/qemu-iotests/080.out index 45ab01db8e..aadc817339 100644 --- a/tests/qemu-iotests/080.out +++ b/tests/qemu-iotests/080.out @@ -104,4 +104,9 @@ Data may be corrupted, or further writes to the image may corrupt it. 3 leaked clusters were found on the image. This means waste of disk space, but no harm to data. + +== Image size not a multiple of the sector size == +Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=65536 +Modifying virtual size to 65535 bytes +qemu-img: Could not open 'TEST_DIR/t.IMGFMT': Virtual size is not a multiple of 512 *** done -- 2.20.1
Re: [PATCH v2 3/9] hw/9pfs/9p-synth: added directory for readdir test
On Wed, 18 Dec 2019 14:23:48 +0100 Christian Schoenebeck wrote: > This will provide the following virtual files by the 9pfs > synth driver: > > - /ReadDirDir/ReadDirFile99 > - /ReadDirDir/ReadDirFile98 > ... > - /ReadDirDir/ReadDirFile1 > - /ReadDirDir/ReadDirFile0 > > This virtual directory and its virtual 100 files will be > used by the upcoming 9pfs readdir tests. > > Signed-off-by: Christian Schoenebeck > --- LGTM Reviewed-by: Greg Kurz > hw/9pfs/9p-synth.c | 19 +++ > hw/9pfs/9p-synth.h | 5 + > 2 files changed, 24 insertions(+) > > diff --git a/hw/9pfs/9p-synth.c b/hw/9pfs/9p-synth.c > index 54239c9bbf..7eb210ffa8 100644 > --- a/hw/9pfs/9p-synth.c > +++ b/hw/9pfs/9p-synth.c > @@ -578,6 +578,25 @@ static int synth_init(FsContext *ctx, Error **errp) > NULL, v9fs_synth_qtest_flush_write, > ctx); > assert(!ret); > + > +/* Directory for READDIR test */ > +{ > +V9fsSynthNode *dir = NULL; > +ret = qemu_v9fs_synth_mkdir( > +NULL, 0700, QTEST_V9FS_SYNTH_READDIR_DIR, &dir > +); > +assert(!ret); > +for (i = 0; i < QTEST_V9FS_SYNTH_READDIR_NFILES; ++i) { > +char *name = g_strdup_printf( > +QTEST_V9FS_SYNTH_READDIR_FILE, i > +); > +ret = qemu_v9fs_synth_add_file( > +dir, 0, name, NULL, NULL, ctx > +); > +assert(!ret); > +g_free(name); > +} > +} > } > > return 0; > diff --git a/hw/9pfs/9p-synth.h b/hw/9pfs/9p-synth.h > index af7a993a1e..036d7e4a5b 100644 > --- a/hw/9pfs/9p-synth.h > +++ b/hw/9pfs/9p-synth.h > @@ -55,6 +55,11 @@ int qemu_v9fs_synth_add_file(V9fsSynthNode *parent, int > mode, > #define QTEST_V9FS_SYNTH_LOPEN_FILE "LOPEN" > #define QTEST_V9FS_SYNTH_WRITE_FILE "WRITE" > > +/* for READDIR test */ > +#define QTEST_V9FS_SYNTH_READDIR_DIR "ReadDirDir" > +#define QTEST_V9FS_SYNTH_READDIR_FILE "ReadDirFile%d" > +#define QTEST_V9FS_SYNTH_READDIR_NFILES 100 > + > /* Any write to the "FLUSH" file is handled one byte at a time by the > * backend. If the byte is zero, the backend returns success (ie, 1), > * otherwise it forces the server to try again forever. Thus allowing
Re: qemu-4.0.1: vhost_region_add_section:Section rounded to 0 prior to previous a0000
* Peter Lieven (p...@kamp.de) wrote: > Am 08.01.20 um 16:04 schrieb Dr. David Alan Gilbert: > > * Peter Lieven (p...@kamp.de) wrote: > > > Hi, > > > > > > > > > I have a Qemu 4.0.1 machine with vhost-net network adapter, thats > > > polluting the log with the above message. > > > > > > Is this something known? Googling revealed the following patch in Nemu > > > (with seems to be a Qemu fork from Intel): > > > > > > https://github.com/intel/nemu/commit/03940ded7f5370ce7492c619dccced114ef7f56e > > > > > > > > > The network stopped functioning. After a live-migration the vServer is > > > reachable again. > > > > > > > > > Any ideas? > > What guest are you running and what does your qemu commandline look > > like? > > > Its running debian9. We have hundreds of other VMs with identical setup. Do > not know why this one makes trouble. Could you extract an 'info mtree' from it - particularly the 'address-space: memory' near the top. > Here is the cmdline: > > > /usr/bin/qemu-4.0.0 -enable-kvm -M pc-i440fx-2.9 -nodefaults -netdev > type=tap,id=guest0,script=no,downscript=no,ifname=tap0,vhost=on,vnet_hdr=on > -device virtio-net-pci,netdev=guest0,mac=52:54:00:80:07:bc -iscsi > initiator-name=iqn.2005-03.org.virtual-core:0025b51f006c,timeout=30 -object > rng-random,filename=/dev/urandom,id=rng0 -device > virtio-rng-pci,rng=rng0,max-bytes=65536,period=1000 -drive > format=raw,discard=on,detect-zeroes=off,file=XXX,if=none,cache=writeback,aio=native,id=disk0 > -device virtio-scsi-pci -device scsi-hd,drive=disk0 -serial null > -parallel null -m 1024,slots=16,maxmem=393216M -smp > 1,sockets=64,cores=1,threads=1,maxcpus=64 -monitor > tcp:0:4001,server,nowait,nodelay -vnc :1 -qmp > tcp:0:3001,server,nowait,nodelay -name 'debian9' -drive > index=2,media=cdrom,if=ide,aio=native,readonly=on -boot order=c,menu=off > -k de -pidfile /var/run/qemu/vm-5182.pid -mem-path /hugepages > -mem-prealloc -cpu Westmere,+pcid,enforce -rtc base=utc -usb -device > usb-tablet -no-hpet -vga vmware -chardev > socket,host=127.0.0.1,port=2001,id=qga0,server,nowait,nodelay -device > virtio-serial -device > virtserialport,chardev=qga0,name=org.qemu.guest_agent.0 All pretty boring stuff. hmm. Dave > > Thanks, > > Peter > > -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
[PATCH 1/4] qapi: Add a 'coroutine' flag for commands
This patch adds a new 'coroutine' flag to QMP command definitions that tells the QMP dispatcher than the command handler is safe to be run in a coroutine. Signed-off-by: Kevin Wolf --- tests/qapi-schema/qapi-schema-test.json | 1 + docs/devel/qapi-code-gen.txt| 4 include/qapi/qmp/dispatch.h | 1 + tests/test-qmp-cmds.c | 4 scripts/qapi/commands.py| 17 +++-- scripts/qapi/doc.py | 2 +- scripts/qapi/expr.py| 4 ++-- scripts/qapi/introspect.py | 2 +- scripts/qapi/schema.py | 9 ++--- tests/qapi-schema/qapi-schema-test.out | 2 ++ tests/qapi-schema/test-qapi.py | 7 --- 11 files changed, 37 insertions(+), 16 deletions(-) diff --git a/tests/qapi-schema/qapi-schema-test.json b/tests/qapi-schema/qapi-schema-test.json index 9abf175fe0..55f596dbaa 100644 --- a/tests/qapi-schema/qapi-schema-test.json +++ b/tests/qapi-schema/qapi-schema-test.json @@ -147,6 +147,7 @@ 'returns': 'UserDefTwo' } { 'command': 'cmd-success-response', 'data': {}, 'success-response': false } +{ 'command': 'coroutine_cmd', 'data': {}, 'coroutine': true } # Returning a non-dictionary requires a name from the whitelist { 'command': 'guest-get-time', 'data': {'a': 'int', '*b': 'int' }, diff --git a/docs/devel/qapi-code-gen.txt b/docs/devel/qapi-code-gen.txt index 45c93a43cc..753f6711d3 100644 --- a/docs/devel/qapi-code-gen.txt +++ b/docs/devel/qapi-code-gen.txt @@ -457,6 +457,7 @@ Syntax: '*gen': false, '*allow-oob': true, '*allow-preconfig': true, +'*coroutine': true, '*if': COND, '*features': FEATURES } @@ -581,6 +582,9 @@ before the machine is built. It defaults to false. For example: QMP is available before the machine is built only when QEMU was started with --preconfig. +Member 'coroutine' tells the QMP dispatcher whether the command handler +is safe to be run in a coroutine. It defaults to false. + The optional 'if' member specifies a conditional. See "Configuring the schema" below for more on this. diff --git a/include/qapi/qmp/dispatch.h b/include/qapi/qmp/dispatch.h index 9aa426a398..d6ce9efc8e 100644 --- a/include/qapi/qmp/dispatch.h +++ b/include/qapi/qmp/dispatch.h @@ -24,6 +24,7 @@ typedef enum QmpCommandOptions QCO_NO_SUCCESS_RESP = (1U << 0), QCO_ALLOW_OOB = (1U << 1), QCO_ALLOW_PRECONFIG = (1U << 2), +QCO_COROUTINE = (1U << 3), } QmpCommandOptions; typedef struct QmpCommand diff --git a/tests/test-qmp-cmds.c b/tests/test-qmp-cmds.c index 27b0afe55a..e2f71e42a1 100644 --- a/tests/test-qmp-cmds.c +++ b/tests/test-qmp-cmds.c @@ -34,6 +34,10 @@ void qmp_cmd_success_response(Error **errp) { } +void qmp_coroutine_cmd(Error **errp) +{ +} + Empty2 *qmp_user_def_cmd0(Error **errp) { return g_new0(Empty2, 1); diff --git a/scripts/qapi/commands.py b/scripts/qapi/commands.py index ab98e504f3..97e51401f1 100644 --- a/scripts/qapi/commands.py +++ b/scripts/qapi/commands.py @@ -15,6 +15,7 @@ See the COPYING file in the top-level directory. from qapi.common import * from qapi.gen import QAPIGenCCode, QAPISchemaModularCVisitor, ifcontext +from typing import List def gen_command_decl(name, arg_type, boxed, ret_type): @@ -194,8 +195,9 @@ out: return ret -def gen_register_command(name, success_response, allow_oob, allow_preconfig): -options = [] +def gen_register_command(name: str, success_response: bool, allow_oob: bool, + allow_preconfig: bool, coroutine: bool) -> str: +options = [] # type: List[str] if not success_response: options += ['QCO_NO_SUCCESS_RESP'] @@ -203,18 +205,20 @@ def gen_register_command(name, success_response, allow_oob, allow_preconfig): options += ['QCO_ALLOW_OOB'] if allow_preconfig: options += ['QCO_ALLOW_PRECONFIG'] +if coroutine: +options += ['QCO_COROUTINE'] if not options: options = ['QCO_NO_OPTIONS'] -options = " | ".join(options) +options_str = " | ".join(options) ret = mcgen(''' qmp_register_command(cmds, "%(name)s", qmp_marshal_%(c_name)s, %(opts)s); ''', name=name, c_name=c_name(name), -opts=options) +opts=options_str) return ret @@ -278,7 +282,7 @@ void %(c_prefix)sqmp_init_marshal(QmpCommandList *cmds); def visit_command(self, name, info, ifcond, arg_type, ret_type, gen, success_response, boxed, allow_oob, allow_preconfig, - features): + coroutine, features): if not gen: return # FIXME: If T is a user-defined type, the user is responsible @@ -296,7 +300,8 @@ void %(c_prefix)sqmp_init_marshal(QmpCommandList *cmds)
[PATCH 0/4] qmp: Optionally run handlers in coroutines
Some QMP command handlers can block the main loop for a relatively long time, for example because they perform some I/O. This is quite nasty. Allowing such handlers to run in a coroutine where they can yield (and therefore release the BQL) while waiting for an event such as I/O completion solves the problem. This series adds the infrastructure to allow this and switches block_resize to run in a coroutine as a first example. This is an alternative solution to Marc-André's "monitor: add asynchronous command type" series. Kevin Wolf (4): qapi: Add a 'coroutine' flag for commands block: Mark 'block_resize' as coroutine vl: Initialise main loop earlier qmp: Move dispatcher to a coroutine qapi/block-core.json| 3 +- tests/qapi-schema/qapi-schema-test.json | 1 + docs/devel/qapi-code-gen.txt| 4 ++ include/qapi/qmp/dispatch.h | 3 + monitor/monitor-internal.h | 5 +- monitor/monitor.c | 24 --- monitor/qmp.c | 83 - qapi/qmp-dispatch.c | 38 ++- tests/test-qmp-cmds.c | 4 ++ vl.c| 10 +-- scripts/qapi/commands.py| 17 +++-- scripts/qapi/doc.py | 2 +- scripts/qapi/expr.py| 4 +- scripts/qapi/introspect.py | 2 +- scripts/qapi/schema.py | 9 ++- tests/qapi-schema/qapi-schema-test.out | 2 + tests/qapi-schema/test-qapi.py | 7 ++- 17 files changed, 155 insertions(+), 63 deletions(-) -- 2.20.1
[PATCH 4/4] qmp: Move dispatcher to a coroutine
This moves the QMP dispatcher to a coroutine and runs all QMP command handlers that declare 'coroutine': true in coroutine context so they can avoid blocking the main loop while doing I/O or waiting for other events. For commands that are not declared safe to run in a coroutine, the dispatcher drops out of coroutine context by calling the QMP command handler from a bottom half. Signed-off-by: Kevin Wolf --- include/qapi/qmp/dispatch.h | 2 + monitor/monitor-internal.h | 5 ++- monitor/monitor.c | 24 +++ monitor/qmp.c | 83 +++-- qapi/qmp-dispatch.c | 38 - 5 files changed, 111 insertions(+), 41 deletions(-) diff --git a/include/qapi/qmp/dispatch.h b/include/qapi/qmp/dispatch.h index d6ce9efc8e..d6d5443391 100644 --- a/include/qapi/qmp/dispatch.h +++ b/include/qapi/qmp/dispatch.h @@ -30,6 +30,8 @@ typedef enum QmpCommandOptions typedef struct QmpCommand { const char *name; +/* Runs in coroutine context if QCO_COROUTINE is set, except for OOB + * commands */ QmpCommandFunc *fn; QmpCommandOptions options; QTAILQ_ENTRY(QmpCommand) node; diff --git a/monitor/monitor-internal.h b/monitor/monitor-internal.h index d78f5ca190..7389b6a56c 100644 --- a/monitor/monitor-internal.h +++ b/monitor/monitor-internal.h @@ -154,7 +154,8 @@ static inline bool monitor_is_qmp(const Monitor *mon) typedef QTAILQ_HEAD(MonitorList, Monitor) MonitorList; extern IOThread *mon_iothread; -extern QEMUBH *qmp_dispatcher_bh; +extern Coroutine *qmp_dispatcher_co; +extern bool qmp_dispatcher_co_busy; extern QmpCommandList qmp_commands, qmp_cap_negotiation_commands; extern QemuMutex monitor_lock; extern MonitorList mon_list; @@ -172,7 +173,7 @@ void monitor_fdsets_cleanup(void); void qmp_send_response(MonitorQMP *mon, const QDict *rsp); void monitor_data_destroy_qmp(MonitorQMP *mon); -void monitor_qmp_bh_dispatcher(void *data); +void coroutine_fn monitor_qmp_dispatcher_co(void *data); int get_monitor_def(int64_t *pval, const char *name); void help_cmd(Monitor *mon, const char *name); diff --git a/monitor/monitor.c b/monitor/monitor.c index 12898b6448..c72763fa4e 100644 --- a/monitor/monitor.c +++ b/monitor/monitor.c @@ -53,8 +53,9 @@ typedef struct { /* Shared monitor I/O thread */ IOThread *mon_iothread; -/* Bottom half to dispatch the requests received from I/O thread */ -QEMUBH *qmp_dispatcher_bh; +/* Coroutine to dispatch the requests received from I/O thread */ +Coroutine *qmp_dispatcher_co; +bool qmp_dispatcher_co_busy; /* Protects mon_list, monitor_qapi_event_state, monitor_destroyed. */ QemuMutex monitor_lock; @@ -579,9 +580,16 @@ void monitor_cleanup(void) } qemu_mutex_unlock(&monitor_lock); -/* QEMUBHs needs to be deleted before destroying the I/O thread */ -qemu_bh_delete(qmp_dispatcher_bh); -qmp_dispatcher_bh = NULL; +/* The dispatcher needs to stop before destroying the I/O thread */ +if (!atomic_mb_read(&qmp_dispatcher_co_busy)) { +aio_co_schedule(iohandler_get_aio_context(), qmp_dispatcher_co); +qmp_dispatcher_co = NULL; +} + +AIO_WAIT_WHILE(qemu_get_aio_context(), + (aio_bh_poll(iohandler_get_aio_context()), +atomic_mb_read(&qmp_dispatcher_co_busy))); + if (mon_iothread) { iothread_destroy(mon_iothread); mon_iothread = NULL; @@ -604,9 +612,9 @@ void monitor_init_globals_core(void) * have commands assuming that context. It would be nice to get * rid of those assumptions. */ -qmp_dispatcher_bh = aio_bh_new(iohandler_get_aio_context(), - monitor_qmp_bh_dispatcher, - NULL); +qmp_dispatcher_co = qemu_coroutine_create(monitor_qmp_dispatcher_co, NULL); +atomic_mb_set(&qmp_dispatcher_co_busy, true); +aio_co_schedule(iohandler_get_aio_context(), qmp_dispatcher_co); } QemuOptsList qemu_mon_opts = { diff --git a/monitor/qmp.c b/monitor/qmp.c index b67a8e7d1f..9fd66c7b97 100644 --- a/monitor/qmp.c +++ b/monitor/qmp.c @@ -133,6 +133,8 @@ static void monitor_qmp_respond(MonitorQMP *mon, QDict *rsp) } } +/* Runs outside of coroutine context for OOB commands, but in coroutine context + * for everything else. */ static void monitor_qmp_dispatch(MonitorQMP *mon, QObject *req) { Monitor *old_mon; @@ -211,43 +213,62 @@ static QMPRequest *monitor_qmp_requests_pop_any_with_lock(void) return req_obj; } -void monitor_qmp_bh_dispatcher(void *data) +void coroutine_fn monitor_qmp_dispatcher_co(void *data) { -QMPRequest *req_obj = monitor_qmp_requests_pop_any_with_lock(); +QMPRequest *req_obj = NULL; QDict *rsp; bool need_resume; MonitorQMP *mon; -if (!req_obj) { -return; -} +while (true) { +assert(atomic_mb_read(&qmp_dispatcher_co_busy) == true); + +while (!(req_obj = monitor_qmp_requests_pop_any_with
[PATCH 3/4] vl: Initialise main loop earlier
We want to be able to use qemu_aio_context in the monitor initialisation. Signed-off-by: Kevin Wolf --- vl.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/vl.c b/vl.c index 86474a55c9..4c79a00857 100644 --- a/vl.c +++ b/vl.c @@ -2903,6 +2903,11 @@ int main(int argc, char **argv, char **envp) runstate_init(); precopy_infrastructure_init(); postcopy_infrastructure_init(); + +if (qemu_init_main_loop(&main_loop_err)) { +error_report_err(main_loop_err); +exit(1); +} monitor_init_globals(); if (qcrypto_init(&err) < 0) { @@ -3817,11 +3822,6 @@ int main(int argc, char **argv, char **envp) qemu_unlink_pidfile_notifier.notify = qemu_unlink_pidfile; qemu_add_exit_notifier(&qemu_unlink_pidfile_notifier); -if (qemu_init_main_loop(&main_loop_err)) { -error_report_err(main_loop_err); -exit(1); -} - #ifdef CONFIG_SECCOMP olist = qemu_find_opts_err("sandbox", NULL); if (olist) { -- 2.20.1
[PATCH 2/4] block: Mark 'block_resize' as coroutine
block_resize is safe to run in a coroutine, so use it as an example for the new 'coroutine': true annotation in the QAPI schema. Signed-off-by: Kevin Wolf --- qapi/block-core.json | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/qapi/block-core.json b/qapi/block-core.json index 7ff5e5edaf..1dbb2a9901 100644 --- a/qapi/block-core.json +++ b/qapi/block-core.json @@ -1341,7 +1341,8 @@ { 'command': 'block_resize', 'data': { '*device': 'str', '*node-name': 'str', -'size': 'int' } } +'size': 'int' }, + 'coroutine': true } ## # @NewImageMode: -- 2.20.1
Re: qemu-4.0.1: vhost_region_add_section:Section rounded to 0 prior to previous a0000
Am 08.01.20 um 16:04 schrieb Dr. David Alan Gilbert: * Peter Lieven (p...@kamp.de) wrote: Hi, I have a Qemu 4.0.1 machine with vhost-net network adapter, thats polluting the log with the above message. Is this something known? Googling revealed the following patch in Nemu (with seems to be a Qemu fork from Intel): https://github.com/intel/nemu/commit/03940ded7f5370ce7492c619dccced114ef7f56e The network stopped functioning. After a live-migration the vServer is reachable again. Any ideas? What guest are you running and what does your qemu commandline look like? Its running debian9. We have hundreds of other VMs with identical setup. Do not know why this one makes trouble. Here is the cmdline: /usr/bin/qemu-4.0.0 -enable-kvm -M pc-i440fx-2.9 -nodefaults -netdev type=tap,id=guest0,script=no,downscript=no,ifname=tap0,vhost=on,vnet_hdr=on -device virtio-net-pci,netdev=guest0,mac=52:54:00:80:07:bc -iscsi initiator-name=iqn.2005-03.org.virtual-core:0025b51f006c,timeout=30 -object rng-random,filename=/dev/urandom,id=rng0 -device virtio-rng-pci,rng=rng0,max-bytes=65536,period=1000 -drive format=raw,discard=on,detect-zeroes=off,file=XXX,if=none,cache=writeback,aio=native,id=disk0 -device virtio-scsi-pci -device scsi-hd,drive=disk0 -serial null -parallel null -m 1024,slots=16,maxmem=393216M -smp 1,sockets=64,cores=1,threads=1,maxcpus=64 -monitor tcp:0:4001,server,nowait,nodelay -vnc :1 -qmp tcp:0:3001,server,nowait,nodelay -name 'debian9' -drive index=2,media=cdrom,if=ide,aio=native,readonly=on -boot order=c,menu=off -k de -pidfile /var/run/qemu/vm-5182.pid -mem-path /hugepages -mem-prealloc -cpu Westmere,+pcid,enforce -rtc base=utc -usb -device usb-tablet -no-hpet -vga vmware -chardev socket,host=127.0.0.1,port=2001,id=qga0,server,nowait,nodelay -device virtio-serial -device virtserialport,chardev=qga0,name=org.qemu.guest_agent.0 Thanks, Peter
Re: [PATCH 04/15] hw/ppc/spapr_rtas: Restrict variables scope to single switch case
On Thu, 9 Jan 2020 16:21:22 +0100 Philippe Mathieu-Daudé wrote: > We only access these variables in RTAS_SYSPARM_SPLPAR_CHARACTERISTICS > case, restrict their scope to avoid unnecessary initialization. > I guess a decent compiler can be smart enough detect that the initialization isn't needed outside of the RTAS_SYSPARM_SPLPAR_CHARACTERISTICS branch... Anyway, reducing scope isn't bad. The only hitch I could see is that some people do prefer to have all variables declared upfront, but there's a nested param_val variable already so I guess it's okay. > Signed-off-by: Philippe Mathieu-Daudé > --- > hw/ppc/spapr_rtas.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c > index 6f06e9d7fe..7237e5ebf2 100644 > --- a/hw/ppc/spapr_rtas.c > +++ b/hw/ppc/spapr_rtas.c > @@ -267,8 +267,6 @@ static void rtas_ibm_get_system_parameter(PowerPCCPU *cpu, >uint32_t nret, target_ulong rets) > { > PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu); > -MachineState *ms = MACHINE(spapr); > -unsigned int max_cpus = ms->smp.max_cpus; > target_ulong parameter = rtas_ld(args, 0); > target_ulong buffer = rtas_ld(args, 1); > target_ulong length = rtas_ld(args, 2); > @@ -276,6 +274,8 @@ static void rtas_ibm_get_system_parameter(PowerPCCPU *cpu, > > switch (parameter) { > case RTAS_SYSPARM_SPLPAR_CHARACTERISTICS: { > +MachineState *ms = MACHINE(spapr); > +unsigned int max_cpus = ms->smp.max_cpus; The max_cpus variable used to be a global. Now that it got moved below ms->smp, I'm not sure it's worth keeping it IMHO. What about dropping it completely and do: char *param_val = g_strdup_printf("MaxEntCap=%d," "DesMem=%" PRIu64 "," "DesProcs=%d," "MaxPlatProcs=%d", ms->smp.max_cpus, current_machine->ram_size / MiB, ms->smp.cpus, ms->smp.max_cpus); And maybe insert an empty line between the declaration of param_val and the code for a better readability ? > char *param_val = g_strdup_printf("MaxEntCap=%d," >"DesMem=%" PRIu64 "," >"DesProcs=%d,"
Re: [PATCH 080/104] virtiofsd: add helper for lo_data cleanup
* Daniel P. Berrangé (berra...@redhat.com) wrote: > On Thu, Dec 12, 2019 at 04:38:40PM +, Dr. David Alan Gilbert (git) wrote: > > From: Liu Bo > > > > This offers an helper function for lo_data's cleanup. > > > > Signed-off-by: Liu Bo > > --- > > tools/virtiofsd/passthrough_ll.c | 37 ++-- > > 1 file changed, 21 insertions(+), 16 deletions(-) > > Reviewed-by: Daniel P. Berrangé Thanks. > > > > > diff --git a/tools/virtiofsd/passthrough_ll.c > > b/tools/virtiofsd/passthrough_ll.c > > index 45cf466178..097033aa00 100644 > > --- a/tools/virtiofsd/passthrough_ll.c > > +++ b/tools/virtiofsd/passthrough_ll.c > > @@ -2439,6 +2439,26 @@ static gboolean lo_key_equal(gconstpointer a, > > gconstpointer b) > > return la->ino == lb->ino && la->dev == lb->dev; > > } > > > > +static void fuse_lo_data_cleanup(struct lo_data *lo) > > +{ > > +if (lo->inodes) { > > +g_hash_table_destroy(lo->inodes); > > +} > > +lo_map_destroy(&lo->fd_map); > > +lo_map_destroy(&lo->dirp_map); > > +lo_map_destroy(&lo->ino_map); > > + > > +if (lo->proc_self_fd >= 0) { > > +close(lo->proc_self_fd); > > +} > > + > > +if (lo->root.fd >= 0) { > > +close(lo->root.fd); > > +} > > + > > +free((char *)lo->source); > > This will need changing if you follow my comment on prev patch about > removing the const & cast Done. > > > +} > > + > > int main(int argc, char *argv[]) > > { > > struct fuse_args args = FUSE_ARGS_INIT(argc, argv); > > @@ -2594,22 +2614,7 @@ err_out2: > > err_out1: > > fuse_opt_free_args(&args); > > > > -if (lo.inodes) { > > -g_hash_table_destroy(lo.inodes); > > -} > > -lo_map_destroy(&lo.fd_map); > > -lo_map_destroy(&lo.dirp_map); > > -lo_map_destroy(&lo.ino_map); > > - > > -if (lo.proc_self_fd >= 0) { > > -close(lo.proc_self_fd); > > -} > > - > > -if (lo.root.fd >= 0) { > > -close(lo.root.fd); > > -} > > - > > -free((char *)lo.source); > > +fuse_lo_data_cleanup(&lo); > > > > return ret ? 1 : 0; > > } > > -- > > 2.23.0 > > > > > > Regards, > Daniel > -- > |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| > |: https://libvirt.org -o-https://fstop138.berrange.com :| > |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :| -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: [PATCH 079/104] virtiofsd: fix memory leak on lo.source
* Daniel P. Berrangé (berra...@redhat.com) wrote: > On Thu, Dec 12, 2019 at 04:38:39PM +, Dr. David Alan Gilbert (git) wrote: > > From: Liu Bo > > > > valgrind reported that lo.source is leaked on quiting, but it was defined > > as (const char*) as it may point to a const string "/". > > > > Signed-off-by: Liu Bo > > --- > > tools/virtiofsd/passthrough_ll.c | 5 +++-- > > 1 file changed, 3 insertions(+), 2 deletions(-) > > > > diff --git a/tools/virtiofsd/passthrough_ll.c > > b/tools/virtiofsd/passthrough_ll.c > > index 33092de65a..45cf466178 100644 > > --- a/tools/virtiofsd/passthrough_ll.c > > +++ b/tools/virtiofsd/passthrough_ll.c > > @@ -2529,9 +2529,8 @@ int main(int argc, char *argv[]) > > fuse_log(FUSE_LOG_ERR, "source is not a directory\n"); > > exit(1); > > } > > - > > } else { > > -lo.source = "/"; > > +lo.source = strdup("/"); > > } > > lo.root.is_symlink = false; > > if (!lo.timeout_set) { > > @@ -2610,5 +2609,7 @@ err_out1: > > close(lo.root.fd); > > } > > > > +free((char *)lo.source); > > Can we not change the 'lo_data' struct so that source is not const > and thus avoid free'ing a const field ? Done. Made that free(lo.source) and dropped the const. Dave > > Regards, > Daniel > -- > |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| > |: https://libvirt.org -o-https://fstop138.berrange.com :| > |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :| -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
[Bug 1850000] Re: 4.1.0 bogus QCOW2 corruption reported after compress
Fix has been included in QEMU v4.2: https://git.qemu.org/?p=qemu.git;a=commitdiff;h=24552feb6ae2f615b76c2 ** Changed in: qemu Status: New => Fix Released -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/185 Title: 4.1.0 bogus QCOW2 corruption reported after compress Status in QEMU: Fix Released Bug description: Creating a compressed image then running `qemu-img check <..>.qcow2' on said image seems to report bogus corruption in some (but not all) cases: Step 1. # qemu-img info win7-base.qcow2 image: win7-base.qcow2 file format: qcow2 virtual size: 20 GiB (21474836480 bytes) disk size: 12.2 GiB cluster_size: 65536 Format specific information: compat: 1.1 lazy refcounts: true refcount bits: 16 corrupt: false # qemu-img check win7-base.qcow2 No errors were found on the image. 327680/327680 = 100.00% allocated, 0.00% fragmented, 0.00% compressed clusters Image end offset: 21478375424 Step 2. # qemu-img convert -f qcow2 -O qcow2 -c win7-base.qcow2 test1-z.qcow2 Step 3. # qemu-img info test1-z.qcow2 image: test1-z.qcow2 file format: qcow2 virtual size: 20 GiB (21474836480 bytes) disk size: 5.78 GiB cluster_size: 65536 Format specific information: compat: 1.1 lazy refcounts: false refcount bits: 16 corrupt: false # qemu-img check test1-z.qcow2 ERROR cluster 1191 refcount=1 reference=2 ERROR cluster 1194 refcount=1 reference=4 ERROR cluster 1195 refcount=1 reference=7 ERROR cluster 1196 refcount=1 reference=7 ERROR cluster 1197 refcount=1 reference=6 ERROR cluster 1198 refcount=1 reference=4 ERROR cluster 1199 refcount=1 reference=4 ERROR cluster 1200 refcount=1 reference=5 ERROR cluster 1201 refcount=1 reference=3 <...> snip many errors Leaked cluster 94847 refcount=3 reference=0 Leaked cluster 94848 refcount=3 reference=0 Leaked cluster 94849 refcount=11 reference=0 Leaked cluster 94850 refcount=14 reference=0 20503 errors were found on the image. Data may be corrupted, or further writes to the image may corrupt it. 20503 leaked clusters were found on the image. This means waste of disk space, but no harm to data. 197000/327680 = 60.12% allocated, 89.32% fragmented, 88.50% compressed clusters Image end offset: 6216220672 The resultant image seems to work fine in a VM when used as a backing file. Interestingly, if I substitute a qemu-img binary from qemu-4.0 then no errors are reported. # /tmp/qemu-img check test1-z.qcow2 No errors were found on the image. 197000/327680 = 60.12% allocated, 89.32% fragmented, 88.50% compressed clusters Image end offset: 6216220672 Is the image corrupted or not? I'm guessing not. Just in case it matters, this is ext4 fs on rotational disk. Latest Arch Linux but self compiled 4.1.0 with recent QCOW2 corruption fixes added. I haven't tried latest trunk but might do so if time permits. To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/185/+subscriptions
Re: [kvm-unit-tests PATCH 04/10] arm: pmu: Check Required Event Support
Hi Andre, On 1/9/20 6:30 PM, André Przywara wrote: > On 09/01/2020 16:54, Auger Eric wrote: > > Hi Eric, > >> On 1/3/20 7:12 PM, Andre Przywara wrote: >>> On Mon, 16 Dec 2019 21:47:51 +0100 >>> Eric Auger wrote: >>> >>> Hi Eric, >>> If event counters are implemented check the common events required by the PMUv3 are implemented. Some are unconditionally required (SW_INCR, CPU_CYCLES, either INST_RETIRED or INST_SPEC). Some others only are required if the implementation implements some other features. Check those wich are unconditionally required. This test currently fails on TCG as neither INST_RETIRED or INST_SPEC are supported. Signed-off-by: Eric Auger --- v1 ->v2: - add a comment to explain the PMCEID0/1 splits --- arm/pmu.c | 71 +++ arm/unittests.cfg | 6 2 files changed, 77 insertions(+) diff --git a/arm/pmu.c b/arm/pmu.c index d24857e..d88ef22 100644 --- a/arm/pmu.c +++ b/arm/pmu.c @@ -101,6 +101,10 @@ static inline void precise_instrs_loop(int loop, uint32_t pmcr) : [pmcr] "r" (pmcr), [z] "r" (0) : "cc"); } + +/* event counter tests only implemented for aarch64 */ +static void test_event_introspection(void) {} + #elif defined(__aarch64__) #define ID_AA64DFR0_PERFMON_SHIFT 8 #define ID_AA64DFR0_PERFMON_MASK 0xf @@ -139,6 +143,70 @@ static inline void precise_instrs_loop(int loop, uint32_t pmcr) : [pmcr] "r" (pmcr) : "cc"); } + +#define PMCEID1_EL0 sys_reg(11, 3, 9, 12, 7) + +static bool is_event_supported(uint32_t n, bool warn) +{ + uint64_t pmceid0 = read_sysreg(pmceid0_el0); + uint64_t pmceid1 = read_sysreg_s(PMCEID1_EL0); + bool supported; + uint32_t reg; + + /* + * The low 32-bits of PMCEID0/1 respectly describe + * event support for events 0-31/32-63. Their High + * 32-bits describe support for extended events + * starting at 0x4000, using the same split. + */ + if (n >= 0x0 && n <= 0x1F) + reg = pmceid0 & 0x; + else if (n >= 0x4000 && n <= 0x401F) + reg = pmceid0 >> 32; + else if (n >= 0x20 && n <= 0x3F) + reg = pmceid1 & 0x; + else if (n >= 0x4020 && n <= 0x403F) + reg = pmceid1 >> 32; + else + abort(); + + supported = reg & (1 << n); >>> >>> Don't we need to mask off everything but the lowest 5 bits of "n"? Probably >>> also using "1U" is better. >> I added an assert to check n is less or equal than 0x3F > > But "n" will definitely be bigger than that in case of an extended > event, won't it? So you adjust "reg" accordingly, but miss to do > something similar to "n"? ouch yes. Sorry. I Will do what you suggest. Nethertheless this would be test code error. > >>> + if (!supported && warn) + report_info("event %d is not supported", n); + return supported; +} + +static void test_event_introspection(void) >>> >>> "introspection" sounds quite sophisticated. Are you planning to extend >>> this? If not, could we maybe rename it to "test_available_events"? >> Yes this test is a placeholder for looking at the PMU characteristics >> and we may add some other queries there. >>> +{ + bool required_events; + + if (!pmu.nb_implemented_counters) { + report_skip("No event counter, skip ..."); + return; + } + + /* PMUv3 requires an implementation includes some common events */ + required_events = is_event_supported(0x0, true) /* SW_INCR */ && +is_event_supported(0x11, true) /* CPU_CYCLES */ && +(is_event_supported(0x8, true) /* INST_RETIRED */ || + is_event_supported(0x1B, true) /* INST_PREC */); + + if (pmu.version == 0x4) { + /* ARMv8.1 PMU: STALL_FRONTEND and STALL_BACKEND are required */ + required_events = required_events || +is_event_supported(0x23, true) || >>> >>> Shouldn't those two operators be '&&' instead? >> yes definitively >>> +is_event_supported(0x24, true); + } + + /* + * L1D_CACHE_REFILL(0x3) and L1D_CACHE(0x4) are only required if + * L1 data / unified cache. BR_MIS_PRED(0x10), BR_PRED(0x12) are only + * required if program-flow prediction is implemented. + */ >>> >>> Is this a TODO? >> yes. Added TODO. I do not know how to check whether the conditions are >> satisfied? Do you have any idea? > > Well, AFAICS KVM doesn't filter PMCEIDn, right? So some basic checks are > surely fine, but I wouldn't go crazy about checking every possibl
Re: [kvm-unit-tests PATCH 04/10] arm: pmu: Check Required Event Support
On 09/01/2020 16:54, Auger Eric wrote: Hi Eric, > On 1/3/20 7:12 PM, Andre Przywara wrote: >> On Mon, 16 Dec 2019 21:47:51 +0100 >> Eric Auger wrote: >> >> Hi Eric, >> >>> If event counters are implemented check the common events >>> required by the PMUv3 are implemented. >>> >>> Some are unconditionally required (SW_INCR, CPU_CYCLES, >>> either INST_RETIRED or INST_SPEC). Some others only are >>> required if the implementation implements some other features. >>> >>> Check those wich are unconditionally required. >>> >>> This test currently fails on TCG as neither INST_RETIRED >>> or INST_SPEC are supported. >>> >>> Signed-off-by: Eric Auger >>> >>> --- >>> >>> v1 ->v2: >>> - add a comment to explain the PMCEID0/1 splits >>> --- >>> arm/pmu.c | 71 +++ >>> arm/unittests.cfg | 6 >>> 2 files changed, 77 insertions(+) >>> >>> diff --git a/arm/pmu.c b/arm/pmu.c >>> index d24857e..d88ef22 100644 >>> --- a/arm/pmu.c >>> +++ b/arm/pmu.c >>> @@ -101,6 +101,10 @@ static inline void precise_instrs_loop(int loop, >>> uint32_t pmcr) >>> : [pmcr] "r" (pmcr), [z] "r" (0) >>> : "cc"); >>> } >>> + >>> +/* event counter tests only implemented for aarch64 */ >>> +static void test_event_introspection(void) {} >>> + >>> #elif defined(__aarch64__) >>> #define ID_AA64DFR0_PERFMON_SHIFT 8 >>> #define ID_AA64DFR0_PERFMON_MASK 0xf >>> @@ -139,6 +143,70 @@ static inline void precise_instrs_loop(int loop, >>> uint32_t pmcr) >>> : [pmcr] "r" (pmcr) >>> : "cc"); >>> } >>> + >>> +#define PMCEID1_EL0 sys_reg(11, 3, 9, 12, 7) >>> + >>> +static bool is_event_supported(uint32_t n, bool warn) >>> +{ >>> + uint64_t pmceid0 = read_sysreg(pmceid0_el0); >>> + uint64_t pmceid1 = read_sysreg_s(PMCEID1_EL0); >>> + bool supported; >>> + uint32_t reg; >>> + >>> + /* >>> +* The low 32-bits of PMCEID0/1 respectly describe >>> +* event support for events 0-31/32-63. Their High >>> +* 32-bits describe support for extended events >>> +* starting at 0x4000, using the same split. >>> +*/ >>> + if (n >= 0x0 && n <= 0x1F) >>> + reg = pmceid0 & 0x; >>> + else if (n >= 0x4000 && n <= 0x401F) >>> + reg = pmceid0 >> 32; >>> + else if (n >= 0x20 && n <= 0x3F) >>> + reg = pmceid1 & 0x; >>> + else if (n >= 0x4020 && n <= 0x403F) >>> + reg = pmceid1 >> 32; >>> + else >>> + abort(); >>> + >>> + supported = reg & (1 << n); >> >> Don't we need to mask off everything but the lowest 5 bits of "n"? Probably >> also using "1U" is better. > I added an assert to check n is less or equal than 0x3F But "n" will definitely be bigger than that in case of an extended event, won't it? So you adjust "reg" accordingly, but miss to do something similar to "n"? >> >>> + if (!supported && warn) >>> + report_info("event %d is not supported", n); >>> + return supported; >>> +} >>> + >>> +static void test_event_introspection(void) >> >> "introspection" sounds quite sophisticated. Are you planning to extend this? >> If not, could we maybe rename it to "test_available_events"? > Yes this test is a placeholder for looking at the PMU characteristics > and we may add some other queries there. >> >>> +{ >>> + bool required_events; >>> + >>> + if (!pmu.nb_implemented_counters) { >>> + report_skip("No event counter, skip ..."); >>> + return; >>> + } >>> + >>> + /* PMUv3 requires an implementation includes some common events */ >>> + required_events = is_event_supported(0x0, true) /* SW_INCR */ && >>> + is_event_supported(0x11, true) /* CPU_CYCLES */ && >>> + (is_event_supported(0x8, true) /* INST_RETIRED */ || >>> + is_event_supported(0x1B, true) /* INST_PREC */); >>> + >>> + if (pmu.version == 0x4) { >>> + /* ARMv8.1 PMU: STALL_FRONTEND and STALL_BACKEND are required */ >>> + required_events = required_events || >>> + is_event_supported(0x23, true) || >> >> Shouldn't those two operators be '&&' instead? > yes definitively >> >>> + is_event_supported(0x24, true); >>> + } >>> + >>> + /* >>> +* L1D_CACHE_REFILL(0x3) and L1D_CACHE(0x4) are only required if >>> +* L1 data / unified cache. BR_MIS_PRED(0x10), BR_PRED(0x12) are only >>> +* required if program-flow prediction is implemented. >>> +*/ >> >> Is this a TODO? > yes. Added TODO. I do not know how to check whether the conditions are > satisfied? Do you have any idea? Well, AFAICS KVM doesn't filter PMCEIDn, right? So some basic checks are surely fine, but I wouldn't go crazy about checking every possible aspect of it. After all you would just check the hardware, as we pass this register on. Cheers, Andre. > Thank you for the review! > > Eric >> >> Cheers, >> Andre >> >> >>> + >>> + report(required_events, "Check required events are impleme
[Bug 1835827] Re: HTIF symbols no longer recognized by RISC-V spike board
Patch has been included in QEMU v4.2: https://git.qemu.org/?p=qemu.git;a=commitdiff;h=6478dd745dca49d632 ** Changed in: qemu Status: New => Fix Released -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1835827 Title: HTIF symbols no longer recognized by RISC-V spike board Status in QEMU: Fix Released Bug description: Tested commit: f34edbc760b0f689deddd175fc08732ecb46665f I belive this was introduced in 0ac24d56c5e7d32423ea78ac58a06b444d1df04d when the spike's load_kernel() was moved to riscv_load_kernel() which no longer included htif_symbol_callback(). To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1835827/+subscriptions
[Bug 1777777] Re: arm9 clock pending (SP804)
Fix has been included in QEMU v4.2: https://git.qemu.org/?p=qemu.git;a=commitdiff;h=5a65f7b5f4907ca70cb6 ** Changed in: qemu Status: Confirmed => Fix Released -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/177 Title: arm9 clock pending (SP804) Status in QEMU: Fix Released Bug description: Hello all, I'm using the versatilepb board and the timer Interrupt Mask Status register (offset 0x14 of the SP804) does not seem to be working properly on the latest qemu-2.12. I tried on the 2.5 (i believe this is the mainstream version that comes with Linux) and it works perfectly. What happens is that the pending bit does not seem to be set in some scenarios. In my case, I see the timer value decreasing to 0 and then being reset to the reload value and neither does the interrupt is triggered nor the pending bit is set. I believe this is a matter of timing since in the "long" run the system eventually catches up (after a few microseconds). Thank you To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/177/+subscriptions
Re: [PATCH] linux-aio: increasing MAX_EVENTS to a larger hardcoded value
On Tue, Jan 07, 2020 at 06:01:01AM +, Wangyong wrote: > Since commit 6040aedddb5f474a9c2304b6a432a652d82b3d3c "virtio-blk: > make queue size configurable",if the user set the queue size to > more than 128 ,it will not take effect. That's because linux aio's > maximum outstanding requests at a time is always less than or equal > to 128. > > This patch simply increase MAX_EVENTS to a larger hardcoded value of > 1024 as a shortterm fix. > > Signed-off-by: wangyong > --- > block/linux-aio.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) We discussed that long-terms solutions would take into account the queue size of the emulated storage controllers associated with this AioContext, but this is a first step that will improve performance. Thanks, applied to my block tree: https://github.com/stefanha/qemu/commits/block Stefan signature.asc Description: PGP signature
Re: [PATCH 2/2] hyperv/synic: Allocate as ram_device
On Thu, Jan 09, 2020 at 04:27:45PM +, Dr. David Alan Gilbert wrote: > * Roman Kagan (rka...@virtuozzo.com) wrote: > > On Thu, Jan 09, 2020 at 01:28:21PM +, Dr. David Alan Gilbert wrote: > > > * Roman Kagan (rka...@virtuozzo.com) wrote: > > > > On Thu, Jan 09, 2020 at 02:00:00PM +0100, Vitaly Kuznetsov wrote: > > > > > "Dr. David Alan Gilbert" writes: > > > > > > > > > > > And I think vhost-user will fail if you have too many sections - and > > > > > > the 16 sections from synic I think will blow the slots available. > > > > > > > > > > > > > > > > SynIC is percpu, it will allocate two 4k pages for every vCPU the > > > > > guest > > > > > has so we're potentially looking at hundreds of such regions. > > > > > > > > Indeed. > > > > > > > > I think my original idea to implement overlay pages word-for-word to the > > > > HyperV spec was a mistake, as it lead to fragmentation and memslot > > > > waste. > > > > > > > > I'll look into reworking it without actually mapping extra pages over > > > > the existing RAM, but achieving overlay semantics by just shoving the > > > > *content* of the "overlaid" memory somewhere. > > > > > > > > That said, I haven't yet fully understood how the reported issue came > > > > about, and thus whether the proposed approach would resolve it too. > > > > > > The problem happens when we end up with: > > > > > > a) 0-512k RAM > > > b) 512k + synic > > > c) 570kish-640k RAM > > > > > > the page alignment code rounds > > > (a) to 0-2MB - aligning to the hugepage it's in > > > (b) leaves as is > > > (c) aligns to 0-2MB > > > > > > it then tries to coalesce (c) and (a) and notices (b) got in the way > > > and fails it. > > > > I see, thanks. The only bit I still haven't quite followed is how this > > failure results in a quiet vhost malfunction rather than a refusal to > > start vhost. > > Because there's no way to fail in vhost_region_add_section other than to > abort; > > if (mrs_gpa < prev_gpa_start) { > error_report("%s:Section rounded to %"PRIx64 > " prior to previous %"PRIx64, > __func__, mrs_gpa, prev_gpa_start); > /* A way to cleanly fail here would be better */ > return; > } > > > > Given the guest can put Synic anywhere I'm not sure that changing it's > > > implementatino would help here. > > > > There would be no (b) nor (separate) (c): synic would just refer to some > > memory straight from (a), regardless of its paging granularity. > > Oh, if it's actually memory from main RAM, then sure, but I guess you'd > have to reserve that somehow to stop the OS using it. It's up to the OS to use it. > > > (And changing it's implementation would probably break migration > > > compatibility). > > > > I'm afraid I see no better option. > > Migration compatibility! Hmm, good point! I think I should be able to achieve that, by keeping the current scheme of allocating a one-page RAM memory region for every synic page, but, instead of mapping it on top of the general RAM, swap the content with the page being "overlaid". Let me see if it works out (and hope the QEMU gang won't shoot me for such an (ab)use of memory regions). Thanks, Roman.
Re: [PATCH 03/15] hw/ppc/spapr_rtas: Access MachineState via SpaprMachineState argument
On Thu, 9 Jan 2020 16:21:21 +0100 Philippe Mathieu-Daudé wrote: > We received a SpaprMachineState argument. Since SpaprMachineState > inherits of MachineState, use it instead of calling qdev_get_machine. > > Signed-off-by: Philippe Mathieu-Daudé > --- Reviewed-by: Greg Kurz > hw/ppc/spapr_rtas.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c > index e88bb1930e..6f06e9d7fe 100644 > --- a/hw/ppc/spapr_rtas.c > +++ b/hw/ppc/spapr_rtas.c > @@ -267,7 +267,7 @@ static void rtas_ibm_get_system_parameter(PowerPCCPU *cpu, >uint32_t nret, target_ulong rets) > { > PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu); > -MachineState *ms = MACHINE(qdev_get_machine()); > +MachineState *ms = MACHINE(spapr); > unsigned int max_cpus = ms->smp.max_cpus; > target_ulong parameter = rtas_ld(args, 0); > target_ulong buffer = rtas_ld(args, 1);
Re: [PATCH 02/15] hw/ppc/spapr_rtas: Use local MachineState variable
On Thu, 9 Jan 2020 16:21:20 +0100 Philippe Mathieu-Daudé wrote: > Since we have the MachineState already available locally, > ues it instead of the global current_machine. > > Signed-off-by: Philippe Mathieu-Daudé > --- Reviewed-by: Greg Kurz > hw/ppc/spapr_rtas.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c > index 8d8d8cdfcb..e88bb1930e 100644 > --- a/hw/ppc/spapr_rtas.c > +++ b/hw/ppc/spapr_rtas.c > @@ -281,7 +281,7 @@ static void rtas_ibm_get_system_parameter(PowerPCCPU *cpu, >"DesProcs=%d," >"MaxPlatProcs=%d", >max_cpus, > - current_machine->ram_size / MiB, > + ms->ram_size / MiB, >ms->smp.cpus, >max_cpus); > if (pcc->n_host_threads > 0) {
Re: [PATCH 0/5] ARM virt: Add NVDIMM support
On Mon, 6 Jan 2020 17:06:32 + Shameerali Kolothum Thodi wrote: > Hi Igor, > > > -Original Message- > > From: Shameerali Kolothum Thodi > > Sent: 13 December 2019 12:52 > > To: 'Igor Mammedov' > > Cc: xiaoguangrong.e...@gmail.com; peter.mayd...@linaro.org; > > drjo...@redhat.com; shannon.zha...@gmail.com; qemu-devel@nongnu.org; > > Linuxarm ; Auger Eric ; > > qemu-...@nongnu.org; xuwei (O) ; > > ler...@redhat.com > > Subject: RE: [PATCH 0/5] ARM virt: Add NVDIMM support > > > > [...] > > > > > Thanks for your help. I did spend some more time debugging this further. > > I tried to introduce a totally new Buffer field object with different > > sizes and printing the size after creation. > > > > --- SSDT.dsl2019-12-12 15:28:21.976986949 + > > +++ SSDT-arm64-dbg.dsl 2019-12-13 12:17:11.026806186 + > > @@ -18,7 +18,7 @@ > > * Compiler ID "BXPC" > > * Compiler Version 0x0001 (1) > > */ > > -DefinitionBlock ("", "SSDT", 1, "BOCHS ", "NVDIMM", 0x0001) > > +DefinitionBlock ("", "SSDT", 1, "BOCHS ", "NVDIMM", 0x0002) > > { > > Scope (\_SB) > > { > > @@ -48,6 +48,11 @@ > > RLEN, 32, > > ODAT, 32736 > > } > > + > > +Field (NRAM, DWordAcc, NoLock, Preserve) > > +{ > > +NBUF, 32768 > > +} > > > > If ((Arg4 == Zero)) > > { > > @@ -87,6 +92,12 @@ > > Local3 = DerefOf (Local2) > > FARG = Local3 > > } > > + > > +Local2 = 0x2 > > +printf("AML:NVDIMM Creating TBUF with bytes %o", > > Local2) > > +CreateField (NBUF, Zero, (Local2 << 3), TBUF) > > +Concatenate (Buffer (Zero){}, TBUF, Local3) > > +printf("AML:NVDIMM Size of TBUF(Local3) %o", > > SizeOf(Local3)) > > > > NTFI = Local6 > > Local1 = (RLEN - 0x04) > > > > And run it by changing Local2 with different values, It looks on ARM64, > > > > For cases where, Local2 <8, the created buffer size is always 8 bytes > > > > "AML:NVDIMM Creating TBUF with bytes 0002" > > "AML:NVDIMM Size of TBUF(Local3) 0008" > > > > ... > > "AML:NVDIMM Creating TBUF with bytes 0005" > > "AML:NVDIMM Size of TBUF(Local3) 0008" > > > > And once Local2 >=8, it gets the correct size, > > > > "AML:NVDIMM Creating TBUF with bytes 0009" > > "AML:NVDIMM Size of TBUF(Local3) 0009" > > > > > > But on x86, the behavior is like, > > > > For cases where, Local2 <4, the created buffer size is always 4 bytes > > > > "AML:NVDIMM Creating TBUF with bytes 0002" > > "AML:NVDIMM Size of TBUF(Local3) 0004" > > > > "AML:NVDIMM Creating TBUF with bytes 0003" > > "AML:NVDIMM Size of TBUF(Local3) 0004" > > > > And once Local2 >= 4, it is ok > > > > "AML:NVDIMM Creating TBUF with bytes 0005" > > "AML:NVDIMM Size of TBUF(Local3) 0005" > > ... > > "AML:NVDIMM Creating TBUF with bytes 0009" > > "AML:NVDIMM Size of TBUF(Local3) 0009" > > > > This is the reason why it works on x86 and not on ARM64. Because, if you > > remember on second iteration of the FIT buffer, the requested buffer size > > is 4 . > > > > I tried changing the AccessType of the below NBUF field from DWordAcc to > > ByteAcc/BufferAcc, but no luck. > > > > +Field (NRAM, DWordAcc, NoLock, Preserve) > > +{ > > +NBUF, 32768 > > +} > > > > Not sure what we need to change for ARM64 to create buffer object of size 4 > > here. Please let me know if you have any pointers to debug this further. > > > > (I am attaching both x86 and ARM64 SSDT dsl used for reference) > > (+Jonathan) > > Thanks to Jonathan for taking a fresh look at this issue and spotting this, > https://elixir.bootlin.com/linux/v5.5-rc5/source/drivers/acpi/acpica/utmisc.c#L109 > > And, from ACPI 6.3, table 19-419 > > "If the Buffer Field is smaller than or equal to the size of an Integer (in > bits), it > will be treated as an Integer. Otherwise, it will be treated as a Buffer. The > size > of an Integer is indicated by the Definition Block table header's Revision > field. > A Revision field value less than 2 indicates that the size of an Integer is > 32 bits. > A value greater than or equal to 2 signifies that the size of an Integer is > 64 bits." > > It looks like the main reason for the difference in behavior of the buffer > object > size between x86 and ARM/virt, is because of the Revision number used in the > DSDT table. On x86 it is 1 and ARM/virt it is 2. > > So most likely, > > > CreateField (ODAT, Zero, Local1, OBUF) You are right, that's where it goes wrong, since OBUF is implicitly converted to integer if size is less than 64bits. > > Concatenate (Buffer (Z
Re: [PATCH 056/104] virtiofsd: add security guide document
* Daniel P. Berrangé (berra...@redhat.com) wrote: > On Mon, Jan 06, 2020 at 05:53:55PM +, Dr. David Alan Gilbert wrote: > > * Daniel P. Berrangé (berra...@redhat.com) wrote: > > > On Thu, Dec 12, 2019 at 04:38:16PM +, Dr. David Alan Gilbert (git) > > > wrote: > > > > From: Stefan Hajnoczi > > > > > > > > Many people want to know: what's up with virtiofsd and security? This > > > > document provides the answers! > > > > > > > > Signed-off-by: Stefan Hajnoczi > > > > --- > > > > tools/virtiofsd/security.rst | 118 +++ > > > > > > Do we need to link this into the rest of QEMU's docs in some > > > index page ? > > > > I wonder how; there's a autogenerated thing in > > docs/index.rst > > > > that includes some of the docs directories subdirectories/index > > Does that mean we should have this in docs/tools/virtiofsd/security.rst > > and a docs/tools/index and a docs/tools/virtiofsd/index ? > > I was wondering if this fits into any of the current three sections > "devel" or "interop" or "specs", but it doesn't feel quite right in > any of them to me. So having a new docs/tools subtree looks like an > ok idea in absence of better suggestions. OK, so what I've done is I've added a preceding patch that creates: docs/tools /conf.py /index.rst and adds it to the Makefile and docs/index.rst and then this patch now adds itself as docs/tools/virtiofsd-security.rst and just adds the entry in docs/tools/index.rst Dave > Regards, > Daniel > -- > |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| > |: https://libvirt.org -o-https://fstop138.berrange.com :| > |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :| -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: [kvm-unit-tests PATCH 04/10] arm: pmu: Check Required Event Support
Hi Andre, On 1/3/20 7:12 PM, Andre Przywara wrote: > On Mon, 16 Dec 2019 21:47:51 +0100 > Eric Auger wrote: > > Hi Eric, > >> If event counters are implemented check the common events >> required by the PMUv3 are implemented. >> >> Some are unconditionally required (SW_INCR, CPU_CYCLES, >> either INST_RETIRED or INST_SPEC). Some others only are >> required if the implementation implements some other features. >> >> Check those wich are unconditionally required. >> >> This test currently fails on TCG as neither INST_RETIRED >> or INST_SPEC are supported. >> >> Signed-off-by: Eric Auger >> >> --- >> >> v1 ->v2: >> - add a comment to explain the PMCEID0/1 splits >> --- >> arm/pmu.c | 71 +++ >> arm/unittests.cfg | 6 >> 2 files changed, 77 insertions(+) >> >> diff --git a/arm/pmu.c b/arm/pmu.c >> index d24857e..d88ef22 100644 >> --- a/arm/pmu.c >> +++ b/arm/pmu.c >> @@ -101,6 +101,10 @@ static inline void precise_instrs_loop(int loop, >> uint32_t pmcr) >> : [pmcr] "r" (pmcr), [z] "r" (0) >> : "cc"); >> } >> + >> +/* event counter tests only implemented for aarch64 */ >> +static void test_event_introspection(void) {} >> + >> #elif defined(__aarch64__) >> #define ID_AA64DFR0_PERFMON_SHIFT 8 >> #define ID_AA64DFR0_PERFMON_MASK 0xf >> @@ -139,6 +143,70 @@ static inline void precise_instrs_loop(int loop, >> uint32_t pmcr) >> : [pmcr] "r" (pmcr) >> : "cc"); >> } >> + >> +#define PMCEID1_EL0 sys_reg(11, 3, 9, 12, 7) >> + >> +static bool is_event_supported(uint32_t n, bool warn) >> +{ >> +uint64_t pmceid0 = read_sysreg(pmceid0_el0); >> +uint64_t pmceid1 = read_sysreg_s(PMCEID1_EL0); >> +bool supported; >> +uint32_t reg; >> + >> +/* >> + * The low 32-bits of PMCEID0/1 respectly describe >> + * event support for events 0-31/32-63. Their High >> + * 32-bits describe support for extended events >> + * starting at 0x4000, using the same split. >> + */ >> +if (n >= 0x0 && n <= 0x1F) >> +reg = pmceid0 & 0x; >> +else if (n >= 0x4000 && n <= 0x401F) >> +reg = pmceid0 >> 32; >> +else if (n >= 0x20 && n <= 0x3F) >> +reg = pmceid1 & 0x; >> +else if (n >= 0x4020 && n <= 0x403F) >> +reg = pmceid1 >> 32; >> +else >> +abort(); >> + >> +supported = reg & (1 << n); > > Don't we need to mask off everything but the lowest 5 bits of "n"? Probably > also using "1U" is better. I added an assert to check n is less or equal than 0x3F > >> +if (!supported && warn) >> +report_info("event %d is not supported", n); >> +return supported; >> +} >> + >> +static void test_event_introspection(void) > > "introspection" sounds quite sophisticated. Are you planning to extend this? > If not, could we maybe rename it to "test_available_events"? Yes this test is a placeholder for looking at the PMU characteristics and we may add some other queries there. > >> +{ >> +bool required_events; >> + >> +if (!pmu.nb_implemented_counters) { >> +report_skip("No event counter, skip ..."); >> +return; >> +} >> + >> +/* PMUv3 requires an implementation includes some common events */ >> +required_events = is_event_supported(0x0, true) /* SW_INCR */ && >> + is_event_supported(0x11, true) /* CPU_CYCLES */ && >> + (is_event_supported(0x8, true) /* INST_RETIRED */ || >> + is_event_supported(0x1B, true) /* INST_PREC */); >> + >> +if (pmu.version == 0x4) { >> +/* ARMv8.1 PMU: STALL_FRONTEND and STALL_BACKEND are required */ >> +required_events = required_events || >> + is_event_supported(0x23, true) || > > Shouldn't those two operators be '&&' instead? yes definitively > >> + is_event_supported(0x24, true); >> +} >> + >> +/* >> + * L1D_CACHE_REFILL(0x3) and L1D_CACHE(0x4) are only required if >> + * L1 data / unified cache. BR_MIS_PRED(0x10), BR_PRED(0x12) are only >> + * required if program-flow prediction is implemented. >> + */ > > Is this a TODO? yes. Added TODO. I do not know how to check whether the conditions are satisfied? Do you have any idea? Thank you for the review! Eric > > Cheers, > Andre > > >> + >> +report(required_events, "Check required events are implemented"); >> +} >> + >> #endif >> >> /* >> @@ -326,6 +394,9 @@ int main(int argc, char *argv[]) >> "Monotonically increasing cycle count"); >> report(check_cpi(cpi), "Cycle/instruction ratio"); >> pmccntr64_test(); >> +} else if (strcmp(argv[1], "event-introspection") == 0) { >> +report_prefix_push(argv[1]); >> +test_event_introspection(); >> } else { >> report_abort("Unknown sub-test '%s'", argv[1]); >> } >> diff --git a/arm/un
[Bug 1528239] Re: Unable to debug PIE binaries with QEMU gdb stub.
Patch has been included in QEMU v4.2: https://git.qemu.org/?p=qemu.git;a=commitdiff;h=dc12567a53c88d7a91b9 ** Changed in: qemu Status: New => Fix Released -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1528239 Title: Unable to debug PIE binaries with QEMU gdb stub. Status in QEMU: Fix Released Bug description: The issue occurs on current trunk: max@max:~/build/qemu$ cat test.c #include int main() { printf("Hello, world!\n"); return 0; } max@max:~/build/qemu$ gcc test.c -fPIC -pie -o bad.x max@max:~/build/qemu$ ./x86_64-linux-user/qemu-x86_64 -g 1234 bad.x . max@max:~/build/qemu$ gdb GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1 (gdb) file bad.x Reading symbols from bad.x...(no debugging symbols found)...done. (gdb) b main Breakpoint 1 at 0x779 (gdb) target remote localhost:1234 Remote debugging using localhost:1234 Reading symbols from /lib64/ld-linux-x86-64.so.2...warning: the debug information found in "/lib64/ld-2.19.so" does not match "/lib64/ld-linux-x86-64.so.2" (CRC mismatch). Reading symbols from /usr/lib/debug//lib/x86_64-linux-gnu/ld-2.19.so...done. done. Loaded symbols for /lib64/ld-linux-x86-64.so.2 Error in re-setting breakpoint 1: Cannot access memory at address 0x775 Error in re-setting breakpoint 1: Cannot access memory at address 0x775 0x004000a042d0 in _start () from /lib64/ld-linux-x86-64.so.2 (gdb) c Continuing. [Inferior 1 (Remote target) exited normally] (gdb) max@max:~/build/qemu$ cat config.log # Configured with: '/home/max/src/qemu/configure' '--prefix=/home/max/install/qemu' '--target-list=arm-linux-user,aarch64-linux-user,x86_64-linux-user' '--static' W/O QEMU or -pie flag breakpoint on main works fine. To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1528239/+subscriptions
[PATCH 2/2] target/ppc: add support for Hypervisor Facility Unavailable Exception
The privileged message send and clear instructions (msgsndp & msgclrp) are privileged, but will generate a hypervisor facility unavailable exception if not enabled in the HFSCR and executed in privileged non-hypervisor state. Add checks when accessing the DPDES register and when using the msgsndp and msgclrp isntructions. Based on previous work from Suraj Jitindar Singh. Cc: Suraj Jitindar Singh Signed-off-by: Cédric Le Goater --- target/ppc/cpu.h | 6 ++ target/ppc/excp_helper.c | 13 + target/ppc/misc_helper.c | 27 +++ 3 files changed, 46 insertions(+) diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h index d175ec9a641d..1ff6afbccdb2 100644 --- a/target/ppc/cpu.h +++ b/target/ppc/cpu.h @@ -397,6 +397,10 @@ typedef struct ppc_v3_pate_t { #define PSSCR_ESL PPC_BIT(42) /* Enable State Loss */ #define PSSCR_EC PPC_BIT(43) /* Exit Criterion */ +/* HFSCR bits */ +#define HFSCR_MSGP PPC_BIT(53) /* Privileged Message Send Facilities */ +#define HFSCR_IC_MSGP 0xA + #define msr_sf ((env->msr >> MSR_SF) & 1) #define msr_isf ((env->msr >> MSR_ISF) & 1) #define msr_shv ((env->msr >> MSR_SHV) & 1) @@ -1332,6 +1336,8 @@ void cpu_ppc_set_vhyp(PowerPCCPU *cpu, PPCVirtualHypervisor *vhyp); #endif void store_fpscr(CPUPPCState *env, uint64_t arg, uint32_t mask); +void helper_hfscr_facility_check(CPUPPCState *env, uint32_t bit, + const char *caller, uint32_t cause); static inline uint64_t ppc_dump_gpr(CPUPPCState *env, int gprn) { diff --git a/target/ppc/excp_helper.c b/target/ppc/excp_helper.c index 343f3a6b30c4..3887fc6c 100644 --- a/target/ppc/excp_helper.c +++ b/target/ppc/excp_helper.c @@ -471,6 +471,15 @@ static inline void powerpc_excp(PowerPCCPU *cpu, int excp_model, int excp) case POWERPC_EXCP_FU: /* Facility unavailable exception */ #ifdef TARGET_PPC64 env->spr[SPR_FSCR] |= ((target_ulong)env->error_code << 56); +#endif +break; +case POWERPC_EXCP_HV_FU: /* Hypervisor Facility Unavailable Exception */ +#ifdef TARGET_PPC64 +env->spr[SPR_HFSCR] |= ((target_ulong)env->error_code << FSCR_IC_POS); +srr0 = SPR_HSRR0; +srr1 = SPR_HSRR1; +new_msr |= (target_ulong)MSR_HVB; +new_msr |= env->msr & ((target_ulong)1 << MSR_RI); #endif break; case POWERPC_EXCP_PIT: /* Programmable interval timer interrupt*/ @@ -1287,6 +1296,8 @@ void helper_book3s_msgclrp(CPUPPCState *env, target_ulong rb) { int irq = book3s_dbell2irq(rb, false); +helper_hfscr_facility_check(env, HFSCR_MSGP, "msgclrp", HFSCR_IC_MSGP); + if (irq < 0) { return; } @@ -1303,6 +1314,8 @@ void helper_book3s_msgsndp(CPUPPCState *env, target_ulong rb) int irq = book3s_dbell2irq(rb, false); int pir = env->spr_cb[SPR_PIR].default_value; +helper_hfscr_facility_check(env, HFSCR_MSGP, "msgsndp", HFSCR_IC_MSGP); + if (irq < 0) { return; } diff --git a/target/ppc/misc_helper.c b/target/ppc/misc_helper.c index 66b5b0824208..2ff6bed10228 100644 --- a/target/ppc/misc_helper.c +++ b/target/ppc/misc_helper.c @@ -41,6 +41,18 @@ void helper_store_dump_spr(CPUPPCState *env, uint32_t sprn) } #ifdef TARGET_PPC64 +static void raise_hv_fu_exception(CPUPPCState *env, uint32_t bit, + const char *caller, uint32_t cause, + uintptr_t raddr) +{ +qemu_log_mask(LOG_GUEST_ERROR, "HV Facility %d is unavailable (%s)\n", + bit, caller); + +env->spr[SPR_HFSCR] &= ~((target_ulong)FSCR_IC_MASK << FSCR_IC_POS); + +raise_exception_err_ra(env, POWERPC_EXCP_HV_FU, cause, raddr); +} + static void raise_fu_exception(CPUPPCState *env, uint32_t bit, uint32_t sprn, uint32_t cause, uintptr_t raddr) @@ -55,6 +67,17 @@ static void raise_fu_exception(CPUPPCState *env, uint32_t bit, } #endif +void helper_hfscr_facility_check(CPUPPCState *env, uint32_t bit, + const char *caller, uint32_t cause) +{ +#ifdef TARGET_PPC64 +if ((env->msr_mask & MSR_HVB) && !msr_hv && + !(env->spr[SPR_HFSCR] & (1UL << bit))) { +raise_hv_fu_exception(env, bit, caller, cause, GETPC()); +} +#endif +} + void helper_fscr_facility_check(CPUPPCState *env, uint32_t bit, uint32_t sprn, uint32_t cause) { @@ -114,6 +137,8 @@ target_ulong helper_load_dpdes(CPUPPCState *env) { target_ulong dpdes = 0; +helper_hfscr_facility_check(env, HFSCR_MSGP, "load DPDES", HFSCR_IC_MSGP); + /* TODO: TCG supports only one thread */ if (env->pending_interrupts & (1 << PPC_INTERRUPT_DOORBELL)) { dpdes |= (env->spr_cb[SPR_PIR].default_value & DBELL_TIRTAG_MASK); @@ -127,6 +152,8 @@ void helper_store_dpdes(CPUPPCState *env, target_ulong val)
[PATCH 1/2] target/ppc: Add privileged message send facilities
The Processor Control facility POWER8 processors and later provides a mechanism for the hypervisor to send messages to other threads in the system (msgsnd instruction) and cause hypervisor-level exceptions. Privileged non-hypervisor programs are also able to send messages (msgsndp instruction) but are restricted to the threads of the same core and cause privileged-level exceptions. The Directed Privileged Doorbell Exception State (DPDES) register reflects the state of pending privileged doorbell exceptions and can also be used to modify that state. The register can be used to read and modify the state of privileged doorbell exceptions for all threads of a subprocessor and thus is a shared facility for that subprocessor. The register can be read/written by the hypervisor and read by the supervisor if enabled in the HFSCR, otherwise a hypervisor facility unavailable exception is generated. The privileged message send and clear instructions (msgsndp & msgclrp) are used to generate and clear the presence of a directed privileged doorbell exception, respectively. The msgsndp instruction can be used to target any thread of the current subprocessor, msgclrp acts on the thread issuing the instruction. These instructions are privileged, but will generate a hypervisor facility unavailable exception if not enabled in the HFSCR and executed in privileged non-hypervisor state. The HV facility unavailable exception will be addressed in other patch. Add and implement this register and instructions by reading or modifying the pending interrupt state of the cpu. Note that TCG only supports one thread per core and so we only need to worry about the cpu making the access. Based on previous work from Suraj Jitindar Singh. Cc: Suraj Jitindar Singh [clg: took ownership due to the amount of changes ] Signed-off-by: Cédric Le Goater --- target/ppc/cpu.h| 1 + target/ppc/helper.h | 4 ++ target/ppc/excp_helper.c| 68 +++-- target/ppc/misc_helper.c| 36 + target/ppc/translate.c | 26 + target/ppc/translate_init.inc.c | 20 -- 6 files changed, 140 insertions(+), 15 deletions(-) diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h index 103bfe9dc274..d175ec9a641d 100644 --- a/target/ppc/cpu.h +++ b/target/ppc/cpu.h @@ -935,6 +935,7 @@ enum { #define DBELL_PIRTAG_MASK 0x3fff #define DBELL_PROCIDTAG_MASK PPC_BITMASK(44, 63) +#define DBELL_TIRTAG_MASK PPC_BITMASK(57, 63) #define PPC_PAGE_SIZES_MAX_SZ 8 diff --git a/target/ppc/helper.h b/target/ppc/helper.h index cd0dfe383a2a..cfb4c07085ca 100644 --- a/target/ppc/helper.h +++ b/target/ppc/helper.h @@ -657,6 +657,10 @@ DEF_HELPER_FLAGS_1(load_601_rtcu, TCG_CALL_NO_RWG, tl, env) DEF_HELPER_FLAGS_1(load_purr, TCG_CALL_NO_RWG, tl, env) DEF_HELPER_FLAGS_2(store_purr, TCG_CALL_NO_RWG, void, env, tl) DEF_HELPER_2(store_ptcr, void, env, tl) +DEF_HELPER_FLAGS_1(load_dpdes, TCG_CALL_NO_RWG, tl, env) +DEF_HELPER_FLAGS_2(store_dpdes, TCG_CALL_NO_RWG, void, env, tl) +DEF_HELPER_2(book3s_msgsndp, void, env, tl) +DEF_HELPER_2(book3s_msgclrp, void, env, tl) #endif DEF_HELPER_2(store_sdr1, void, env, tl) DEF_HELPER_2(store_pidr, void, env, tl) diff --git a/target/ppc/excp_helper.c b/target/ppc/excp_helper.c index 5752ed4a4d83..343f3a6b30c4 100644 --- a/target/ppc/excp_helper.c +++ b/target/ppc/excp_helper.c @@ -900,7 +900,11 @@ static void ppc_hw_interrupt(CPUPPCState *env) } if (env->pending_interrupts & (1 << PPC_INTERRUPT_DOORBELL)) { env->pending_interrupts &= ~(1 << PPC_INTERRUPT_DOORBELL); -powerpc_excp(cpu, env->excp_model, POWERPC_EXCP_DOORI); +if (env->insns_flags & PPC_SEGMENT_64B) { +powerpc_excp(cpu, env->excp_model, POWERPC_EXCP_SDOOR); +} else { +powerpc_excp(cpu, env->excp_model, POWERPC_EXCP_DOORI); +} return; } if (env->pending_interrupts & (1 << PPC_INTERRUPT_HDOORBELL)) { @@ -1221,7 +1225,7 @@ void helper_msgsnd(target_ulong rb) } /* Server Processor Control */ -static int book3s_dbell2irq(target_ulong rb) +static int book3s_dbell2irq(target_ulong rb, bool hv_dbell) { int msg = rb & DBELL_TYPE_MASK; @@ -1230,12 +1234,16 @@ static int book3s_dbell2irq(target_ulong rb) * message type is 5. All other types are reserved and the * instruction is a no-op */ -return msg == DBELL_TYPE_DBELL_SERVER ? PPC_INTERRUPT_HDOORBELL : -1; +if (msg == DBELL_TYPE_DBELL_SERVER) { +return hv_dbell ? PPC_INTERRUPT_HDOORBELL : PPC_INTERRUPT_DOORBELL; +} + +return -1; } void helper_book3s_msgclr(CPUPPCState *env, target_ulong rb) { -int irq = book3s_dbell2irq(rb); +int irq = book3s_dbell2irq(rb, true); if (irq < 0) { return; @@ -1244,16 +1252,10 @@ void helper_book3s_msgclr(CPUPPCState *env, target_ulong rb)
[PATCH 0/2] ppc: add support for Directed Privileged Doorbell (non-hypervisor)
Hello, The Processor Control facility POWER8 processors and later provides a mechanism for the hypervisor to send messages to other threads in the system (msgsnd instruction) and cause hypervisor-level exceptions. Privileged non-hypervisor programs can also send messages (msgsndp instruction) but are restricted to the threads of the same subprocessor and cause privileged-level exceptions. The Directed Privileged Doorbell Exception State (DPDES) register reflects the state of pending privileged-level doorbell exceptions for all threads and can be used to modify that state. If the MSGP facility is not in the HFSCR, a hypervisor facility unavailable exception is generated when these instructions are used or when the DPDES register is accessed by the supervisor. Based on previous work from Suraj Jitindar Singh. I took ownership due to the amount of changes. Thanks, C. Cédric Le Goater (2): target/ppc: Add privileged message send facilities target/ppc: add support for Hypervisor Facility Unavailable Exception target/ppc/cpu.h| 7 +++ target/ppc/helper.h | 4 ++ target/ppc/excp_helper.c| 81 - target/ppc/misc_helper.c| 63 + target/ppc/translate.c | 26 +++ target/ppc/translate_init.inc.c | 20 ++-- 6 files changed, 186 insertions(+), 15 deletions(-) -- 2.21.1
Re: [PATCH 1/4] configure: Do not build libfdt is not required
On 09/01/2020 16.39, Philippe Mathieu-Daudé wrote: > We only require libfdt for system emulation, in a small set > of architecture: > > 4077 # fdt support is mandatory for at least some target architectures, > 4078 # so insist on it if we're building those system emulators. > 4079 fdt_required=no > 4080 for target in $target_list; do > 4081case $target in > 4082 > aarch64*-softmmu|arm*-softmmu|ppc*-softmmu|microblaze*-softmmu|mips64el-softmmu|riscv*-softmmu) > 4083fdt_required=yes > > Do not build libfdt if we did not manually specified --enable-fdt. I suggest to add: "... or have one of the platforms that require it in our target list." > Signed-off-by: Philippe Mathieu-Daudé > --- > configure | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/configure b/configure > index 0ce2c0354a..266a8386d1 100755 > --- a/configure > +++ b/configure > @@ -4092,6 +4092,8 @@ if test "$fdt_required" = "yes"; then >"targets which need it (by specifying a cut down --target-list)." >fi >fdt=yes > +elif test "$fdt" != "yes" ; then > + fdt=no > fi > > if test "$fdt" != "no" ; then > Reviewed-by: Thomas Huth
Re: Pre-Christmas meeting notes
On 09/01/2020 16.03, Vladimir Sementsov-Ogievskiy wrote: > 06.01.2020 20:15, Max Reitz wrote: >> Misc >> >> >> The Wiki’s TODO list is horribly outdated. What should we do about >> it? Maybe archive it and start a new one? (Most of the things on the >> current list are either done or we don’t want to do anymore.) > > > May be, create block/TODO.txt instead? It would be simpler to manage file > in git than wiki page. Is it? We removed a bunch of stale TODO files in the past already: https://git.qemu.org/?p=qemu.git;a=commitdiff;h=d494d79eabfdac0 https://git.qemu.org/?p=qemu.git;a=commitdiff;h=9e564a1dde5abc7 https://git.qemu.org/?p=qemu.git;a=commitdiff;h=ba43da36983a0bf And there are some more TODO files in the target/ subfolders which look also quite stale... so I somewhat doubt that a TODO list in git is much better than a TODO list in the wiki. Maybe we just need a proper bug/feature tracker instead (since Launchpad is IMHO quite a bad choice for bug tracking, too)... Thomas
Re: [PATCH 2/2] hyperv/synic: Allocate as ram_device
* Roman Kagan (rka...@virtuozzo.com) wrote: > On Thu, Jan 09, 2020 at 01:28:21PM +, Dr. David Alan Gilbert wrote: > > * Roman Kagan (rka...@virtuozzo.com) wrote: > > > On Thu, Jan 09, 2020 at 02:00:00PM +0100, Vitaly Kuznetsov wrote: > > > > "Dr. David Alan Gilbert" writes: > > > > > > > > > And I think vhost-user will fail if you have too many sections - and > > > > > the 16 sections from synic I think will blow the slots available. > > > > > > > > > > > > > SynIC is percpu, it will allocate two 4k pages for every vCPU the guest > > > > has so we're potentially looking at hundreds of such regions. > > > > > > Indeed. > > > > > > I think my original idea to implement overlay pages word-for-word to the > > > HyperV spec was a mistake, as it lead to fragmentation and memslot > > > waste. > > > > > > I'll look into reworking it without actually mapping extra pages over > > > the existing RAM, but achieving overlay semantics by just shoving the > > > *content* of the "overlaid" memory somewhere. > > > > > > That said, I haven't yet fully understood how the reported issue came > > > about, and thus whether the proposed approach would resolve it too. > > > > The problem happens when we end up with: > > > > a) 0-512k RAM > > b) 512k + synic > > c) 570kish-640k RAM > > > > the page alignment code rounds > > (a) to 0-2MB - aligning to the hugepage it's in > > (b) leaves as is > > (c) aligns to 0-2MB > > > > it then tries to coalesce (c) and (a) and notices (b) got in the way > > and fails it. > > I see, thanks. The only bit I still haven't quite followed is how this > failure results in a quiet vhost malfunction rather than a refusal to > start vhost. Because there's no way to fail in vhost_region_add_section other than to abort; if (mrs_gpa < prev_gpa_start) { error_report("%s:Section rounded to %"PRIx64 " prior to previous %"PRIx64, __func__, mrs_gpa, prev_gpa_start); /* A way to cleanly fail here would be better */ return; } > > Given the guest can put Synic anywhere I'm not sure that changing it's > > implementatino would help here. > > There would be no (b) nor (separate) (c): synic would just refer to some > memory straight from (a), regardless of its paging granularity. Oh, if it's actually memory from main RAM, then sure, but I guess you'd have to reserve that somehow to stop the OS using it. > > (And changing it's implementation would probably break migration > > compatibility). > > I'm afraid I see no better option. Migration compatibility! Dave > Thanks, > Roman. > -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
[RFC PATCH] tests/tcg: add a vtimer test for aarch64
Bug: https://bugs.launchpad.net/bugs/1859021 Signed-off-by: Alex Bennée --- tests/tcg/aarch64/system/vtimer.c | 48 +++ tests/tcg/aarch64/Makefile.softmmu-target | 4 ++ 2 files changed, 52 insertions(+) create mode 100644 tests/tcg/aarch64/system/vtimer.c diff --git a/tests/tcg/aarch64/system/vtimer.c b/tests/tcg/aarch64/system/vtimer.c new file mode 100644 index 000..42f2f7796c7 --- /dev/null +++ b/tests/tcg/aarch64/system/vtimer.c @@ -0,0 +1,48 @@ +/* + * Simple Virtual Timer Test + * + * Copyright (c) 2020 Linaro Ltd + * + * SPDX-License-Identifier: GPL-2.0-or-later + */ + +#include +#include + +/* grabbed from Linux */ +#define __stringify_1(x...) #x +#define __stringify(x...) __stringify_1(x) + +#define read_sysreg(r) ({ \ +uint64_t __val; \ +asm volatile("mrs %0, " __stringify(r) : "=r" (__val)); \ +__val; \ +}) + +#define write_sysreg(r, v) do { \ +uint64_t __val = (uint64_t)(v); \ +asm volatile("msr " __stringify(r) ", %x0" \ + : : "rZ" (__val)); \ +} while (0) + +int main(void) +{ +int i; + +ml_printf("VTimer Test\n"); + +write_sysreg(cntvoff_el2, 1); +write_sysreg(cntv_cval_el0, -1); +write_sysreg(cntv_ctl_el0, 1); + +ml_printf("cntvoff_el2=%lx\n", read_sysreg(cntvoff_el2)); +ml_printf("cntv_cval_el0=%lx\n", read_sysreg(cntv_cval_el0)); +ml_printf("cntv_ctl_el0=%lx\n", read_sysreg(cntv_ctl_el0)); + +/* Now read cval a few times */ +for (i = 0; i < 10; i++) { +ml_printf("%d: cntv_cval_el0=%lx\n", i, read_sysreg(cntv_cval_el0)); +} + +return 0; +} diff --git a/tests/tcg/aarch64/Makefile.softmmu-target b/tests/tcg/aarch64/Makefile.softmmu-target index 7b4eede3f07..62cdddbb215 100644 --- a/tests/tcg/aarch64/Makefile.softmmu-target +++ b/tests/tcg/aarch64/Makefile.softmmu-target @@ -62,3 +62,7 @@ run-memory-replay: memory-replay run-memory-record "$< on $(TARGET_NAME)") EXTRA_TESTS+=memory-record memory-replay + +# vtimer test +QEMU_EL2_MACHINE=-machine virt,virtualization=on,gic-version=2 -cpu cortex-a57 -smp 4 +run-vtimer: QEMU_OPTS=$(QEMU_EL2_MACHINE) $(QEMU_SEMIHOST) -kernel -- 2.20.1
Re: [PATCH 2/2] hyperv/synic: Allocate as ram_device
On Thu, Jan 09, 2020 at 01:28:21PM +, Dr. David Alan Gilbert wrote: > * Roman Kagan (rka...@virtuozzo.com) wrote: > > On Thu, Jan 09, 2020 at 02:00:00PM +0100, Vitaly Kuznetsov wrote: > > > "Dr. David Alan Gilbert" writes: > > > > > > > And I think vhost-user will fail if you have too many sections - and > > > > the 16 sections from synic I think will blow the slots available. > > > > > > > > > > SynIC is percpu, it will allocate two 4k pages for every vCPU the guest > > > has so we're potentially looking at hundreds of such regions. > > > > Indeed. > > > > I think my original idea to implement overlay pages word-for-word to the > > HyperV spec was a mistake, as it lead to fragmentation and memslot > > waste. > > > > I'll look into reworking it without actually mapping extra pages over > > the existing RAM, but achieving overlay semantics by just shoving the > > *content* of the "overlaid" memory somewhere. > > > > That said, I haven't yet fully understood how the reported issue came > > about, and thus whether the proposed approach would resolve it too. > > The problem happens when we end up with: > > a) 0-512k RAM > b) 512k + synic > c) 570kish-640k RAM > > the page alignment code rounds > (a) to 0-2MB - aligning to the hugepage it's in > (b) leaves as is > (c) aligns to 0-2MB > > it then tries to coalesce (c) and (a) and notices (b) got in the way > and fails it. I see, thanks. The only bit I still haven't quite followed is how this failure results in a quiet vhost malfunction rather than a refusal to start vhost. > Given the guest can put Synic anywhere I'm not sure that changing it's > implementatino would help here. There would be no (b) nor (separate) (c): synic would just refer to some memory straight from (a), regardless of its paging granularity. > (And changing it's implementation would probably break migration > compatibility). I'm afraid I see no better option. Thanks, Roman.
Re: [PATCH v12 00/13] VIRTIO-IOMMU device
Hi, On 1/9/20 4:12 PM, no-re...@patchew.org wrote: > Patchew URL: > https://patchew.org/QEMU/20200109144319.15912-1-eric.au...@redhat.com/ > > > > Hi, > > This series failed the docker-quick@centos7 build test. Please find the > testing commands and > their output below. If you have Docker installed, you can probably reproduce > it > locally. > > === TEST SCRIPT BEGIN === > #!/bin/bash > make docker-image-centos7 V=1 NETWORK=1 > time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1 > === TEST SCRIPT END === > > TESTcheck-qtest-x86_64: tests/test-hmp > TESTcheck-qtest-x86_64: tests/qos-test > ** > ERROR:/tmp/qemu-test/src/tests/virtio-iommu-test.c:46:pci_config: assertion > failed (probe_size == 0x200): (0x == 0x0200) > ERROR - Bail out! > ERROR:/tmp/qemu-test/src/tests/virtio-iommu-test.c:46:pci_config: assertion > failed (probe_size == 0x200): (0x == 0x0200) OK sorry that's because I eventually removed "virtio-iommu: Implement probe request" patch from the sent series. I will remove that probe_size field test on next round. Thanks Eric > make: *** [check-qtest-x86_64] Error 1 > make: *** Waiting for unfinished jobs > TESTcheck-qtest-aarch64: tests/test-hmp > TESTiotest-qcow2: 220 > --- > raise CalledProcessError(retcode, cmd) > subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', > '--label', 'com.qemu.instance.uuid=de2154161ccf4468af5ebd35cbcbdc03', '-u', > '1003', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', > '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', > '-e', 'SHOW_ENV=1', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', > '/home/patchew2/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', > '/var/tmp/patchew-tester-tmp-_ap6ulch/src/docker-src.2020-01-09-10.01.05.32472:/var/tmp/qemu:z,ro', > 'qemu:centos7', '/var/tmp/qemu/run', 'test-quick']' returned non-zero exit > status 2. > filter=--filter=label=com.qemu.instance.uuid=de2154161ccf4468af5ebd35cbcbdc03 > make[1]: *** [docker-run] Error 1 > make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-_ap6ulch/src' > make: *** [docker-run-test-quick@centos7] Error 2 > > real11m50.898s > user0m8.023s > > > The full log is available at > http://patchew.org/logs/20200109144319.15912-1-eric.au...@redhat.com/testing.docker-quick@centos7/?type=message. > --- > Email generated automatically by Patchew [https://patchew.org/]. > Please send your feedback to patchew-de...@redhat.com >
Re: [PATCH] virtio: Prevent double swap due to target pre 1.0 VirtIO
On Thu, 9 Jan 2020 07:39:17 -0500 "Michael S. Tsirkin" wrote: > On Thu, Jan 09, 2020 at 09:25:42AM -0300, André Silva wrote: > > Hi Michael! > > Thanks for reviewing the patch! > > > > > we always get LE values from memory subsystem, > > > not target endian values: > > > > I see. So do you think the patch is correct in eliminating the extra > > swap (as virtio_config_readw for example already makes a swap)? > > > > Thanks, > > andré > > I don't think it is, I think we do need an extra swap > in some cases. It's possible that some cross-endian > setups are broken now, if so pls include testing > result not just theoretical analysis. > I confirm that we must keep the extra swap otherwise read/write in cross-endian setups will have wrong endian. Please read this commit for a more detailed explanation: commit 82afa58641b0e67abbaf4da6c325ebd7c2513262 Author: Benjamin Herrenschmidt Date: Tue Jan 10 01:35:11 2012 + virtio-pci: Fix endianness of virtio config https://git.qemu.org/?p=qemu.git;a=commitdiff;h=82afa58641b0e67abbaf4da6c325ebd7c2513262 This is especially critical on ppc64 since _all_ hosts are now LE but the first piece of code in the guest that is likely to drive the device is the SLOF firmware which is BE. This is what we get with this patch when trying to run a pseries guest on a ppc64le host: Trying to load: from: /pci@8002000/scsi@0 ... virtioblk_transfer: Access beyond end of device! Cheers, -- Greg > > On Thu, Jan 9, 2020 at 7:50 AM Michael S. Tsirkin wrote: > > > > > > On Wed, Jan 08, 2020 at 01:16:18PM -0300, Andre Silva wrote: > > > > Remove the bswap function calls after reading and before writing > > > > memory bytes in virtio_pci_config_read and virtio_pci_config_write > > > > because they are reverting back an already swapped bytes. > > > > > > > > Consider the table below in the context of virtio_pci_config_read > > > > function. > > > > > > > > Host Target virtio-config-read[wl] > > > >swap? virtio-is-big-endian? extra > > > > bswap? Should be Final result Final result ok? > > > > - --- --- > > > > -- --- -- -- > > > > LE BE s(x)trues(s(x)) > > > > s(x)x No > > > > LE LE x false - > > > > x x Yes > > > > BE LE s(x)false - > > > > s(x)s(x) Yes > > > > BE BE x trues(x) > > > > x s(x) No > > > > > > we always get LE values from memory subsystem, > > > not target endian values: > > > > > > static const MemoryRegionOps virtio_pci_config_ops = { > > > .read = virtio_pci_config_read, > > > .write = virtio_pci_config_write, > > > .impl = { > > > .min_access_size = 1, > > > .max_access_size = 4, > > > }, > > > .endianness = DEVICE_LITTLE_ENDIAN, > > > }; > > > > > > > > > This triggers another swap in address_space_ldl_internal > > > (memory_ldst.inc.c). > > > > > > > > > > In table above, when target is big endian and VirtIO is pre 1.0, > > > > function virtio_is_big_endian would return true and the extra > > > > swap would be executed, reverting the previous swap made by > > > > virtio_config_read[wl]. > > > > > > > > The 's(x)' means that a swap function was applied at > > > > address x. 'LE' is little endian and 'BE' is big endian. The > > > > 'Final result' column is the returned value from > > > > virtio_pci_config_read, considering a target Virtio pre 1.0. > > > > 'x' means that target's value was not swapped in Qemu, 's(x)' means > > > > that Qemu will use a swapped value. > > > > > > > > If we remove the extra swap made in virtio_pci_config_read we will > > > > have the correct result in any host/target combination, both for > > > > VirtIO pre 1.0 or later versions. > > > > > > > > The same reasoning applies to virtio_pci_config_write. > > > > > > > > Signed-off-by: Andre Silva > > > > --- > > > > hw/virtio/virtio-pci.c | 12 > > > > 1 file changed, 12 deletions(-) > > > > > > > > diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c > > > > index c6b47a9c73..4ba9e847f3 100644 > > > > --- a/hw/virtio/virtio-pci.c > > > > +++ b/hw/virtio/virtio-pci.c > > > > @@ -431,15 +431,9 @@ static uint64_t virtio_pci_config_read(void > > > > *opaque, hwaddr addr, > > > > break; > > > > case 2: > > > > val = virtio_config_readw(vdev, addr); > > > > -if (virtio_is_big_endian(vdev)) { > > > > -val = bswap16(val); > > > > -} > > > > break; > > > > case 4: > > > > val = virtio_config_readl(vdev, addr); > > > > -if (virtio_is_big_endian(vdev)) { > > > > -
Re: [PATCH 06/15] migration/savevm: Replace current_machine by qdev_get_machine()
* Philippe Mathieu-Daudé (phi...@redhat.com) wrote: > As we want to remove the global current_machine, > replace MACHINE_GET_CLASS(current_machine) by > MACHINE_GET_CLASS(qdev_get_machine()). > > Signed-off-by: Philippe Mathieu-Daudé > --- > migration/savevm.c | 10 +- > 1 file changed, 5 insertions(+), 5 deletions(-) > > diff --git a/migration/savevm.c b/migration/savevm.c > index 59efc1981d..0e8b6a4715 100644 > --- a/migration/savevm.c > +++ b/migration/savevm.c > @@ -292,7 +292,8 @@ static uint32_t get_validatable_capabilities_count(void) > static int configuration_pre_save(void *opaque) > { > SaveState *state = opaque; > -const char *current_name = MACHINE_GET_CLASS(current_machine)->name; > +MachineClass *mc = MACHINE_GET_CLASS(qdev_get_machine()); > +const char *current_name = mc->name; For migration: Acked-by: Dr. David Alan Gilbert (Personally I'd keep it on one line, but that's just taste) > MigrationState *s = migrate_get_current(); > int i, j; > > @@ -362,7 +363,8 @@ static bool configuration_validate_capabilities(SaveState > *state) > static int configuration_post_load(void *opaque, int version_id) > { > SaveState *state = opaque; > -const char *current_name = MACHINE_GET_CLASS(current_machine)->name; > +MachineClass *mc = MACHINE_GET_CLASS(qdev_get_machine()); > +const char *current_name = mc->name; > > if (strncmp(state->name, current_name, state->len) != 0) { > error_report("Machine type received is '%.*s' and local is '%s'", > @@ -615,9 +617,7 @@ static void dump_vmstate_vmsd(FILE *out_file, > > static void dump_machine_type(FILE *out_file) > { > -MachineClass *mc; > - > -mc = MACHINE_GET_CLASS(current_machine); > +MachineClass *mc = MACHINE_GET_CLASS(qdev_get_machine()); > > fprintf(out_file, " \"vmschkmachine\": {\n"); > fprintf(out_file, "\"Name\": \"%s\"\n", mc->name); > -- > 2.21.1 > -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: [PATCH 2/2] hyperv/synic: Allocate as ram_device
Paolo Bonzini writes: > On 09/01/20 14:22, Dr. David Alan Gilbert wrote: >> * Michael S. Tsirkin (m...@redhat.com) wrote: >>> On Thu, Jan 09, 2020 at 12:22:37PM +, Dr. David Alan Gilbert wrote: Do we want a new memory_region_init for that or just to be able to add a flag? >>> I think a flag API is preferable since this can apply to any kind of >>> region. But can go either way, Paolo's the maintainer there. >> >> (Copying Paolo in) >> So what exactly does this flag mean; to me it's 'no vhost' - but is it >> actually more general? > > It has two more effects in addition to no vhost: > > 1) it is skipped when dumping the guest (is this a good or bad idea for > SynIC?) Imagine we have an not yet consumed message sitting in message page, or a signalled event, do I understand correctly that these are going to get lost upon migration? This may not work then -- unless we transfer in-QEMU synic state somehow separately. -- Vitaly
[PATCH 2/4] Makefile: Clarify all the codebase requires qom/ objects
QEMU user-mode also requires the qom/ objects, it is not only used by "system emulation and qemu-img". As we will use a big if() block, move it upper in the "Common libraries for tools and emulators" section. Signed-off-by: Philippe Mathieu-Daudé --- Makefile.objs | 6 +- 1 file changed, 1 insertion(+), 5 deletions(-) diff --git a/Makefile.objs b/Makefile.objs index 7c1e50f9d6..5aae561984 100644 --- a/Makefile.objs +++ b/Makefile.objs @@ -2,6 +2,7 @@ # Common libraries for tools and emulators stub-obj-y = stubs/ util-obj-y = crypto/ util/ qobject/ qapi/ +qom-obj-y = qom/ chardev-obj-y = chardev/ @@ -26,11 +27,6 @@ block-obj-m = block/ crypto-obj-y = crypto/ -### -# qom-obj-y is code used by both qemu system emulation and qemu-img - -qom-obj-y = qom/ - ### # io-obj-y is code used by both qemu system emulation and qemu-img -- 2.21.1
Re: [PATCH 2/2] hyperv/synic: Allocate as ram_device
* Paolo Bonzini (pbonz...@redhat.com) wrote: > On 09/01/20 14:22, Dr. David Alan Gilbert wrote: > > * Michael S. Tsirkin (m...@redhat.com) wrote: > >> On Thu, Jan 09, 2020 at 12:22:37PM +, Dr. David Alan Gilbert wrote: > >>> Do we want a new memory_region_init for that or just to be able to add > >>> a flag? > >>> > >> I think a flag API is preferable since this can apply to any kind of > >> region. But can go either way, Paolo's the maintainer there. > > > > (Copying Paolo in) > > So what exactly does this flag mean; to me it's 'no vhost' - but is it > > actually more general? > > It has two more effects in addition to no vhost: > > 1) it is skipped when dumping the guest (is this a good or bad idea for > SynIC?) > > 2) accesses to the region will use the specified size (e.g. 4-bytes for > address_space_stl, 1-byte for address_space_stb) instead of a memcpy. > Doesn't really matter for SynIC regions. > > If (1) is a good idea, then it's 2 out of 3 and I guess the patch is okay. It's probably best to keep them in the dump because they give some info on the current state of the windows guest and interrupts. Also, as Roman points out the ram-device pages aren't migrated, so we need to fix that as well. So, do we add a new flag? If so, is no-vhost what we want? Dave > Paolo > -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
[PATCH 3/4] Makefile: Restrict system emulation and tools objects
Restrict all the system emulation and tools objects with a Makefile IF (CONFIG_SOFTMMU OR CONFIG_TOOLS) check. Using the same description over and over is not very helpful. Use it once, just before the if() block. Signed-off-by: Philippe Mathieu-Daudé --- Makefile.objs | 19 +++ 1 file changed, 7 insertions(+), 12 deletions(-) diff --git a/Makefile.objs b/Makefile.objs index 5aae561984..395dd1e670 100644 --- a/Makefile.objs +++ b/Makefile.objs @@ -4,16 +4,15 @@ stub-obj-y = stubs/ util-obj-y = crypto/ util/ qobject/ qapi/ qom-obj-y = qom/ +### +# code used by both qemu system emulation and qemu-img + +ifeq ($(call lor,$(CONFIG_SOFTMMU),$(CONFIG_TOOLS)),y) + chardev-obj-y = chardev/ -### -# authz-obj-y is code used by both qemu system emulation and qemu-img - authz-obj-y = authz/ -### -# block-obj-y is code used by both qemu system emulation and qemu-img - block-obj-y = nbd/ block-obj-y += block.o blockjob.o job.o block-obj-y += block/ scsi/ @@ -22,16 +21,12 @@ block-obj-$(CONFIG_REPLICATION) += replication.o block-obj-m = block/ -### -# crypto-obj-y is code used by both qemu system emulation and qemu-img - crypto-obj-y = crypto/ -### -# io-obj-y is code used by both qemu system emulation and qemu-img - io-obj-y = io/ +endif # CONFIG_SOFTMMU or CONFIG_TOOLS + ## # Target independent part of system emulation. The long term path is to # suppress *all* target specific code in case of system emulation, i.e. a -- 2.21.1
[PATCH 4/4] Makefile: Remove unhelpful comment
It is pointless to keep qapi/ object separate from the other common-objects. Drop the comment. Signed-off-by: Philippe Mathieu-Daudé --- Makefile.objs | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/Makefile.objs b/Makefile.objs index 395dd1e670..c6321d0465 100644 --- a/Makefile.objs +++ b/Makefile.objs @@ -71,11 +71,9 @@ qemu-seccomp.o-libs := $(SECCOMP_LIBS) common-obj-$(CONFIG_FDT) += device_tree.o -## -# qapi - common-obj-y += qapi/ -endif + +endif # CONFIG_SOFTMMU ### # Target-independent parts used in system and user emulation -- 2.21.1
[PATCH 0/4] buildsys: Build quicker (mostly tools and linux-user)
In some configuration (linux-user, tools) we can ignore building various objects (and the libfdt). Philippe Mathieu-Daudé (4): configure: Do not build libfdt is not required Makefile: Clarify all the codebase requires qom/ objects Makefile: Restrict system emulation and tools objects Makefile: Remove unhelpful comment configure | 2 ++ Makefile.objs | 31 ++- 2 files changed, 12 insertions(+), 21 deletions(-) -- 2.21.1
[PATCH 1/4] configure: Do not build libfdt is not required
We only require libfdt for system emulation, in a small set of architecture: 4077 # fdt support is mandatory for at least some target architectures, 4078 # so insist on it if we're building those system emulators. 4079 fdt_required=no 4080 for target in $target_list; do 4081case $target in 4082 aarch64*-softmmu|arm*-softmmu|ppc*-softmmu|microblaze*-softmmu|mips64el-softmmu|riscv*-softmmu) 4083fdt_required=yes Do not build libfdt if we did not manually specified --enable-fdt. Signed-off-by: Philippe Mathieu-Daudé --- configure | 2 ++ 1 file changed, 2 insertions(+) diff --git a/configure b/configure index 0ce2c0354a..266a8386d1 100755 --- a/configure +++ b/configure @@ -4092,6 +4092,8 @@ if test "$fdt_required" = "yes"; then "targets which need it (by specifying a cut down --target-list)." fi fdt=yes +elif test "$fdt" != "yes" ; then + fdt=no fi if test "$fdt" != "no" ; then -- 2.21.1
[PATCH 14/15] accel/accel: Replace current_machine by qdev_get_machine()
As we want to remove the global current_machine, replace 'current_machine' by MACHINE(qdev_get_machine()). Signed-off-by: Philippe Mathieu-Daudé --- accel/accel.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/accel/accel.c b/accel/accel.c index cb555e3b06..777d6ba119 100644 --- a/accel/accel.c +++ b/accel/accel.c @@ -65,7 +65,9 @@ int accel_init_machine(AccelState *accel, MachineState *ms) AccelState *current_accel(void) { -return current_machine->accelerator; +MachineState *ms = MACHINE(qdev_get_machine()); + +return ms->accelerator; } void accel_setup_post(MachineState *ms) -- 2.21.1
[PATCH 13/15] accel: Replace current_machine->accelerator by current_accel() method
As we want to remove the global current_machine, replace 'current_machine->accelerator' by current_accel(). Signed-off-by: Philippe Mathieu-Daudé --- accel/kvm/kvm-all.c | 4 ++-- accel/tcg/tcg-all.c | 2 +- memory.c| 2 +- target/arm/kvm64.c | 4 ++-- target/i386/kvm.c | 2 +- target/ppc/kvm.c| 2 +- vl.c| 2 +- 7 files changed, 9 insertions(+), 9 deletions(-) diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c index b2f1a5bcb5..be1980f250 100644 --- a/accel/kvm/kvm-all.c +++ b/accel/kvm/kvm-all.c @@ -164,7 +164,7 @@ static NotifierList kvm_irqchip_change_notifiers = int kvm_get_max_memslots(void) { -KVMState *s = KVM_STATE(current_machine->accelerator); +KVMState *s = KVM_STATE(current_accel()); return s->nr_slots; } @@ -1847,7 +1847,7 @@ static int kvm_max_vcpu_id(KVMState *s) bool kvm_vcpu_id_is_valid(int vcpu_id) { -KVMState *s = KVM_STATE(current_machine->accelerator); +KVMState *s = KVM_STATE(current_accel()); return vcpu_id >= 0 && vcpu_id < kvm_max_vcpu_id(s); } diff --git a/accel/tcg/tcg-all.c b/accel/tcg/tcg-all.c index 1dc384c8d2..1802ce02f6 100644 --- a/accel/tcg/tcg-all.c +++ b/accel/tcg/tcg-all.c @@ -124,7 +124,7 @@ static void tcg_accel_instance_init(Object *obj) static int tcg_init(MachineState *ms) { -TCGState *s = TCG_STATE(current_machine->accelerator); +TCGState *s = TCG_STATE(current_accel()); tcg_exec_init(s->tb_size * 1024 * 1024); cpu_interrupt_handler = tcg_handle_interrupt; diff --git a/memory.c b/memory.c index 57e38b1f50..60e8993499 100644 --- a/memory.c +++ b/memory.c @@ -3106,7 +3106,7 @@ void mtree_info(bool flatview, bool dispatch_tree, bool owner) }; GArray *fv_address_spaces; GHashTable *views = g_hash_table_new(g_direct_hash, g_direct_equal); -AccelClass *ac = ACCEL_GET_CLASS(current_machine->accelerator); +AccelClass *ac = ACCEL_GET_CLASS(current_accel()); if (ac->has_memory) { fvi.ac = ac; diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c index 876184b8fe..f677877a1e 100644 --- a/target/arm/kvm64.c +++ b/target/arm/kvm64.c @@ -613,14 +613,14 @@ bool kvm_arm_get_host_cpu_features(ARMHostCPUFeatures *ahcf) bool kvm_arm_aarch32_supported(CPUState *cpu) { -KVMState *s = KVM_STATE(current_machine->accelerator); +KVMState *s = KVM_STATE(current_accel()); return kvm_check_extension(s, KVM_CAP_ARM_EL1_32BIT); } bool kvm_arm_sve_supported(CPUState *cpu) { -KVMState *s = KVM_STATE(current_machine->accelerator); +KVMState *s = KVM_STATE(current_accel()); return kvm_check_extension(s, KVM_CAP_ARM_SVE); } diff --git a/target/i386/kvm.c b/target/i386/kvm.c index 0b511906e3..2ed15814dc 100644 --- a/target/i386/kvm.c +++ b/target/i386/kvm.c @@ -147,7 +147,7 @@ bool kvm_allows_irq0_override(void) static bool kvm_x2apic_api_set_flags(uint64_t flags) { -KVMState *s = KVM_STATE(current_machine->accelerator); +KVMState *s = KVM_STATE(current_accel()); return !kvm_vm_enable_cap(s, KVM_CAP_X2APIC_API, 0, flags); } diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c index d1c334f0e3..2d011308e0 100644 --- a/target/ppc/kvm.c +++ b/target/ppc/kvm.c @@ -258,7 +258,7 @@ static void kvm_get_smmu_info(struct kvm_ppc_smmu_info *info, Error **errp) struct ppc_radix_page_info *kvm_get_radix_page_info(void) { -KVMState *s = KVM_STATE(current_machine->accelerator); +KVMState *s = KVM_STATE(current_accel()); struct ppc_radix_page_info *radix_page_info; struct kvm_ppc_rmmu_info rmmu_info; int i; diff --git a/vl.c b/vl.c index 86474a55c9..3ff3548183 100644 --- a/vl.c +++ b/vl.c @@ -2804,7 +2804,7 @@ static void configure_accelerators(const char *progname) } if (init_failed) { -AccelClass *ac = ACCEL_GET_CLASS(current_machine->accelerator); +AccelClass *ac = ACCEL_GET_CLASS(current_accel()); error_report("falling back to %s", ac->name); } -- 2.21.1
[PATCH 11/15] exec: Replace current_machine by qdev_get_machine()
As we want to remove the global current_machine, replace 'current_machine' by MACHINE(qdev_get_machine()). Signed-off-by: Philippe Mathieu-Daudé --- exec.c | 10 ++ 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/exec.c b/exec.c index d4b769d0d4..98f5b049ca 100644 --- a/exec.c +++ b/exec.c @@ -1984,11 +1984,11 @@ static unsigned long last_ram_page(void) static void qemu_ram_setup_dump(void *addr, ram_addr_t size) { -int ret; +MachineState *ms = MACHINE(qdev_get_machine()); /* Use MADV_DONTDUMP, if user doesn't want the guest memory in the core */ -if (!machine_dump_guest_core(current_machine)) { -ret = qemu_madvise(addr, size, QEMU_MADV_DONTDUMP); +if (!machine_dump_guest_core(ms)) { +int ret = qemu_madvise(addr, size, QEMU_MADV_DONTDUMP); if (ret) { perror("qemu_madvise"); fprintf(stderr, "madvise doesn't support MADV_DONTDUMP, " @@ -2108,7 +2108,9 @@ size_t qemu_ram_pagesize_largest(void) static int memory_try_enable_merging(void *addr, size_t len) { -if (!machine_mem_merge(current_machine)) { +MachineState *ms = MACHINE(qdev_get_machine()); + +if (!machine_mem_merge(ms)) { /* disabled by the user */ return 0; } -- 2.21.1
[Bug 1841491] Re: floating point emulation can fail to set FE_UNDERFLOW
Comment #5 suggested splitting the "float" issue to a separate bug, which was done some time ago (bug #1841592). I think this ticket can be closed. -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1841491 Title: floating point emulation can fail to set FE_UNDERFLOW Status in QEMU: Confirmed Bug description: Floating point emulation can fail to set FE_UNDERFLOW in some circumstances. This shows up often in glibc's "math" tests. A similar test is attached. This is similar to bug #1841442, but not the same problem, and I don't think the fix will be in the same code. On ppc64le native: -- $ gcc -c -O2 fma.c $ gcc -O2 test-fma.c fma.o -lm -o test-fma $ ./test-fma $(./test-fma) fma(0x1.cp-1022, 0x1.1p-1, 0x0.1p-1022) 0x0 0xa00 FE_INEXACT FE_UNDERFLOW 0x1p-1022 -- On qemu-system-ppc64: -- $ ./test-fma $(./test-fma) fma(0x1.cp-1022, 0x1.1p-1, 0x0.1p-1022) 0x0 0x200 FE_INEXACT 0x1p-1022 -- QEMU versions vary, but not too much, and are pretty close to git HEAD: - 586f3dced9 (HEAD -> master, origin/master, origin/HEAD) Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20190822' into staging - 864ab31 Update version for v4.1.0-rc4 release There are worse symptoms on qemu-x86_64, but this is apparently not surprising per https://bugs.launchpad.net/qemu/+bug/1841442/comments/6. To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1841491/+subscriptions
Re: Difference between 'current_machine' vs MACHINE(qdev_get_machine())
On 2020/1/9 20:01, Paolo Bonzini wrote: On 09/01/20 12:23, Philippe Mathieu-Daudé wrote: current_machine = MACHINE(object_new_with_class(OBJECT_CLASS(machine_class))); object_property_add_child(object_get_root(), "machine", OBJECT(current_machine), &error_abort); The bigger user of 'current_machine' is the accel/KVM code. Recently in a0628599f..cc7d44c2e0 "Replace global smp variables with machine smp properties" we started to use MACHINE(qdev_get_machine()). qdev_get_machine() resolves the machine in the QOM composition tree. I am confused by this comment: /* qdev_get_machine() can return something that's not TYPE_MACHINE * if this is one of the user-only emulators; in that case there's * no need to check the ignore_memory_transaction_failures board flag. */ Following a0628599f..cc7d44c2e0, a5e0b33119 use 'current_machine' again. What are the differences between both form, when should we use one or another (or can we use a single one?). Can this break user-only mode? I would always use MACHINE(qdev_get_machine()), espeecially outside vl.c. Ideally, current_machine would be static within vl.c or even unused outside the object_property_add_child() that you quote above. Most of the times, I noticed from a quick grep, we actually want to access the accelerator, not the machine, so we could add a qemu_get_accelerator() wrapper that does MACHINE(qdev_get_machine())->accelerator. Paolo I prefer to use MACHINE(qdev_get_machine()) wherever possible. However, the qdev_get_machine() would return non TYPE_MACHINE object if: - call qdev_get_machine() before we do "object_property_add_child(object_get_root(), "machine", OBJECT(current_machine), &error_abort);" in vl.c; - or in the context with '#ifdef CONFIG_USER_ONLY'; Thanks, Like Xu
[PATCH 09/15] device_tree: Replace current_machine by qdev_get_machine()
As we want to remove the global current_machine, replace 'current_machine' by MACHINE(qdev_get_machine()). Signed-off-by: Philippe Mathieu-Daudé --- device_tree.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/device_tree.c b/device_tree.c index f8b46b3c73..665ea2f586 100644 --- a/device_tree.c +++ b/device_tree.c @@ -466,7 +466,9 @@ uint32_t qemu_fdt_alloc_phandle(void *fdt) * which phandle id to start allocating phandles. */ if (!phandle) { -phandle = machine_phandle_start(current_machine); +MachineState *ms = MACHINE(qdev_get_machine()); + +phandle = machine_phandle_start(ms); } if (!phandle) { -- 2.21.1
[PATCH 15/15] vl: Make current_machine a local variable
Since we now only use current_machine in vl.c, stop exporting it as a global variable in "hw/board.h", and make it static to vl.c. Signed-off-by: Philippe Mathieu-Daudé --- include/hw/boards.h | 2 -- vl.c| 4 ++-- 2 files changed, 2 insertions(+), 4 deletions(-) diff --git a/include/hw/boards.h b/include/hw/boards.h index 61f8bb8e5a..b0c0d4376d 100644 --- a/include/hw/boards.h +++ b/include/hw/boards.h @@ -59,8 +59,6 @@ void memory_region_allocate_system_memory(MemoryRegion *mr, Object *owner, #define MACHINE_CLASS(klass) \ OBJECT_CLASS_CHECK(MachineClass, (klass), TYPE_MACHINE) -extern MachineState *current_machine; - void machine_run_board_init(MachineState *machine); bool machine_usb(MachineState *machine); int machine_phandle_start(MachineState *machine); diff --git a/vl.c b/vl.c index 3ff3548183..7a69af4bef 100644 --- a/vl.c +++ b/vl.c @@ -214,6 +214,8 @@ static int default_sdcard = 1; static int default_vga = 1; static int default_net = 1; +static MachineState *current_machine; + static struct { const char *driver; int *flag; @@ -1164,8 +1166,6 @@ static int usb_parse(const char *cmdline) /***/ /* machine registration */ -MachineState *current_machine; - static MachineClass *find_machine(const char *name, GSList *machines) { GSList *el; -- 2.21.1
Re: [PATCH 2/2] hyperv/synic: Allocate as ram_device
On 09/01/20 14:22, Dr. David Alan Gilbert wrote: > * Michael S. Tsirkin (m...@redhat.com) wrote: >> On Thu, Jan 09, 2020 at 12:22:37PM +, Dr. David Alan Gilbert wrote: >>> Do we want a new memory_region_init for that or just to be able to add >>> a flag? >>> >> I think a flag API is preferable since this can apply to any kind of >> region. But can go either way, Paolo's the maintainer there. > > (Copying Paolo in) > So what exactly does this flag mean; to me it's 'no vhost' - but is it > actually more general? It has two more effects in addition to no vhost: 1) it is skipped when dumping the guest (is this a good or bad idea for SynIC?) 2) accesses to the region will use the specified size (e.g. 4-bytes for address_space_stl, 1-byte for address_space_stb) instead of a memcpy. Doesn't really matter for SynIC regions. If (1) is a good idea, then it's 2 out of 3 and I guess the patch is okay. Paolo
[PATCH 06/15] migration/savevm: Replace current_machine by qdev_get_machine()
As we want to remove the global current_machine, replace MACHINE_GET_CLASS(current_machine) by MACHINE_GET_CLASS(qdev_get_machine()). Signed-off-by: Philippe Mathieu-Daudé --- migration/savevm.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/migration/savevm.c b/migration/savevm.c index 59efc1981d..0e8b6a4715 100644 --- a/migration/savevm.c +++ b/migration/savevm.c @@ -292,7 +292,8 @@ static uint32_t get_validatable_capabilities_count(void) static int configuration_pre_save(void *opaque) { SaveState *state = opaque; -const char *current_name = MACHINE_GET_CLASS(current_machine)->name; +MachineClass *mc = MACHINE_GET_CLASS(qdev_get_machine()); +const char *current_name = mc->name; MigrationState *s = migrate_get_current(); int i, j; @@ -362,7 +363,8 @@ static bool configuration_validate_capabilities(SaveState *state) static int configuration_post_load(void *opaque, int version_id) { SaveState *state = opaque; -const char *current_name = MACHINE_GET_CLASS(current_machine)->name; +MachineClass *mc = MACHINE_GET_CLASS(qdev_get_machine()); +const char *current_name = mc->name; if (strncmp(state->name, current_name, state->len) != 0) { error_report("Machine type received is '%.*s' and local is '%s'", @@ -615,9 +617,7 @@ static void dump_vmstate_vmsd(FILE *out_file, static void dump_machine_type(FILE *out_file) { -MachineClass *mc; - -mc = MACHINE_GET_CLASS(current_machine); +MachineClass *mc = MACHINE_GET_CLASS(qdev_get_machine()); fprintf(out_file, " \"vmschkmachine\": {\n"); fprintf(out_file, "\"Name\": \"%s\"\n", mc->name); -- 2.21.1
[PATCH 07/15] hw/core/machine-qmp-cmds: Replace current_machine by qdev_get_machine()
As we want to remove the global current_machine, replace MACHINE_GET_CLASS(current_machine) by MACHINE_GET_CLASS(qdev_get_machine()). Signed-off-by: Philippe Mathieu-Daudé --- hw/core/machine-qmp-cmds.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/hw/core/machine-qmp-cmds.c b/hw/core/machine-qmp-cmds.c index eed5aeb2f7..5a04d00e4f 100644 --- a/hw/core/machine-qmp-cmds.c +++ b/hw/core/machine-qmp-cmds.c @@ -280,9 +280,9 @@ void qmp_cpu_add(int64_t id, Error **errp) { MachineClass *mc; -mc = MACHINE_GET_CLASS(current_machine); +mc = MACHINE_GET_CLASS(qdev_get_machine()); if (mc->hot_add_cpu) { -mc->hot_add_cpu(current_machine, id, errp); +mc->hot_add_cpu(MACHINE(qdev_get_machine()), id, errp); } else { error_setg(errp, "Not supported"); } -- 2.21.1
[PATCH 10/15] memory: Replace current_machine by qdev_get_machine()
As we want to remove the global current_machine, replace 'current_machine' by MACHINE(qdev_get_machine()). Signed-off-by: Philippe Mathieu-Daudé --- memory.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/memory.c b/memory.c index d7b9bb6951..57e38b1f50 100644 --- a/memory.c +++ b/memory.c @@ -3004,6 +3004,7 @@ static void mtree_print_flatview(gpointer key, gpointer value, int n = view->nr; int i; AddressSpace *as; +MachineState *ms; qemu_printf("FlatView #%d\n", fvi->counter); ++fvi->counter; @@ -3026,6 +3027,7 @@ static void mtree_print_flatview(gpointer key, gpointer value, return; } +ms = MACHINE(qdev_get_machine()); while (n--) { mr = range->mr; if (range->offset_in_region) { @@ -3057,7 +3059,7 @@ static void mtree_print_flatview(gpointer key, gpointer value, if (fvi->ac) { for (i = 0; i < fv_address_spaces->len; ++i) { as = g_array_index(fv_address_spaces, AddressSpace*, i); -if (fvi->ac->has_memory(current_machine, as, +if (fvi->ac->has_memory(ms, as, int128_get64(range->addr.start), MR_SIZE(range->addr.size) + 1)) { qemu_printf(" %s", fvi->ac->name); -- 2.21.1
[PATCH 12/15] accel: Introduce the current_accel() method
We want to remove the global current_machine. The accel/ code access few times current_machine->accelerator. Introduce the current_accel() method first, it will then be easier to replace 'current_machine' by MACHINE(qdev_get_machine()). Signed-off-by: Philippe Mathieu-Daudé --- include/sysemu/accel.h | 2 ++ accel/accel.c | 5 + 2 files changed, 7 insertions(+) diff --git a/include/sysemu/accel.h b/include/sysemu/accel.h index d4c1429711..47e5788530 100644 --- a/include/sysemu/accel.h +++ b/include/sysemu/accel.h @@ -70,4 +70,6 @@ int accel_init_machine(AccelState *accel, MachineState *ms); /* Called just before os_setup_post (ie just before drop OS privs) */ void accel_setup_post(MachineState *ms); +AccelState *current_accel(void); + #endif diff --git a/accel/accel.c b/accel/accel.c index 1c5c3a6abb..cb555e3b06 100644 --- a/accel/accel.c +++ b/accel/accel.c @@ -63,6 +63,11 @@ int accel_init_machine(AccelState *accel, MachineState *ms) return ret; } +AccelState *current_accel(void) +{ +return current_machine->accelerator; +} + void accel_setup_post(MachineState *ms) { AccelState *accel = ms->accelerator; -- 2.21.1
[PATCH 05/15] device-hotplug: Replace current_machine by qdev_get_machine()
As we want to remove the global current_machine, replace MACHINE_GET_CLASS(current_machine) by MACHINE_GET_CLASS(qdev_get_machine()). Signed-off-by: Philippe Mathieu-Daudé --- device-hotplug.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/device-hotplug.c b/device-hotplug.c index f01d53774b..44d687f254 100644 --- a/device-hotplug.c +++ b/device-hotplug.c @@ -45,7 +45,7 @@ static DriveInfo *add_init_drive(const char *optstr) if (!opts) return NULL; -mc = MACHINE_GET_CLASS(current_machine); +mc = MACHINE_GET_CLASS(qdev_get_machine()); dinfo = drive_new(opts, mc->block_default_type, &err); if (err) { error_report_err(err); -- 2.21.1
[PATCH 04/15] hw/ppc/spapr_rtas: Restrict variables scope to single switch case
We only access these variables in RTAS_SYSPARM_SPLPAR_CHARACTERISTICS case, restrict their scope to avoid unnecessary initialization. Signed-off-by: Philippe Mathieu-Daudé --- hw/ppc/spapr_rtas.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c index 6f06e9d7fe..7237e5ebf2 100644 --- a/hw/ppc/spapr_rtas.c +++ b/hw/ppc/spapr_rtas.c @@ -267,8 +267,6 @@ static void rtas_ibm_get_system_parameter(PowerPCCPU *cpu, uint32_t nret, target_ulong rets) { PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu); -MachineState *ms = MACHINE(spapr); -unsigned int max_cpus = ms->smp.max_cpus; target_ulong parameter = rtas_ld(args, 0); target_ulong buffer = rtas_ld(args, 1); target_ulong length = rtas_ld(args, 2); @@ -276,6 +274,8 @@ static void rtas_ibm_get_system_parameter(PowerPCCPU *cpu, switch (parameter) { case RTAS_SYSPARM_SPLPAR_CHARACTERISTICS: { +MachineState *ms = MACHINE(spapr); +unsigned int max_cpus = ms->smp.max_cpus; char *param_val = g_strdup_printf("MaxEntCap=%d," "DesMem=%" PRIu64 "," "DesProcs=%d," -- 2.21.1
[PATCH 02/15] hw/ppc/spapr_rtas: Use local MachineState variable
Since we have the MachineState already available locally, ues it instead of the global current_machine. Signed-off-by: Philippe Mathieu-Daudé --- hw/ppc/spapr_rtas.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c index 8d8d8cdfcb..e88bb1930e 100644 --- a/hw/ppc/spapr_rtas.c +++ b/hw/ppc/spapr_rtas.c @@ -281,7 +281,7 @@ static void rtas_ibm_get_system_parameter(PowerPCCPU *cpu, "DesProcs=%d," "MaxPlatProcs=%d", max_cpus, - current_machine->ram_size / MiB, + ms->ram_size / MiB, ms->smp.cpus, max_cpus); if (pcc->n_host_threads > 0) { -- 2.21.1
[PATCH 08/15] target/arm/monitor: Replace current_machine by qdev_get_machine()
As we want to remove the global current_machine, replace 'current_machine' by MACHINE(qdev_get_machine()). Signed-off-by: Philippe Mathieu-Daudé --- target/arm/monitor.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/target/arm/monitor.c b/target/arm/monitor.c index fa054f8a36..bcbf69802d 100644 --- a/target/arm/monitor.c +++ b/target/arm/monitor.c @@ -136,7 +136,8 @@ CpuModelExpansionInfo *qmp_query_cpu_model_expansion(CpuModelExpansionType type, } if (kvm_enabled()) { -const char *cpu_type = current_machine->cpu_type; +MachineState *ms = MACHINE(qdev_get_machine()); +const char *cpu_type = ms->cpu_type; int len = strlen(cpu_type) - strlen(ARM_CPU_TYPE_SUFFIX); bool supported = false; -- 2.21.1
[PATCH 03/15] hw/ppc/spapr_rtas: Access MachineState via SpaprMachineState argument
We received a SpaprMachineState argument. Since SpaprMachineState inherits of MachineState, use it instead of calling qdev_get_machine. Signed-off-by: Philippe Mathieu-Daudé --- hw/ppc/spapr_rtas.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c index e88bb1930e..6f06e9d7fe 100644 --- a/hw/ppc/spapr_rtas.c +++ b/hw/ppc/spapr_rtas.c @@ -267,7 +267,7 @@ static void rtas_ibm_get_system_parameter(PowerPCCPU *cpu, uint32_t nret, target_ulong rets) { PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu); -MachineState *ms = MACHINE(qdev_get_machine()); +MachineState *ms = MACHINE(spapr); unsigned int max_cpus = ms->smp.max_cpus; target_ulong parameter = rtas_ld(args, 0); target_ulong buffer = rtas_ld(args, 1); -- 2.21.1
[PATCH 01/15] target/arm/kvm: Use CPUState::kvm_state in kvm_arm_pmu_supported()
KVMState is already accessible via CPUState::kvm_state, use it. Signed-off-by: Philippe Mathieu-Daudé --- target/arm/kvm.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/target/arm/kvm.c b/target/arm/kvm.c index b87b59a02a..8d82889150 100644 --- a/target/arm/kvm.c +++ b/target/arm/kvm.c @@ -181,9 +181,7 @@ void kvm_arm_set_cpu_features_from_host(ARMCPU *cpu) bool kvm_arm_pmu_supported(CPUState *cpu) { -KVMState *s = KVM_STATE(current_machine->accelerator); - -return kvm_check_extension(s, KVM_CAP_ARM_PMU_V3); +return kvm_check_extension(cpu->kvm_state, KVM_CAP_ARM_PMU_V3); } int kvm_arm_get_max_vm_ipa_size(MachineState *ms) -- 2.21.1
[PATCH 00/15] Replace current_machine by qdev_get_machine()
Blurb from previous question [1]: "hw/boards.h" declare current_machine, and vl.c defines it: current_machine = MACHINE(object_new_with_class(OBJECT_CLASS(machine_class))); object_property_add_child(object_get_root(), "machine", OBJECT(current_machine), &error_abort); The bigger user of 'current_machine' is the accel/KVM code. Recently in a0628599f..cc7d44c2e0 "Replace global smp variables with machine smp properties" we started to use MACHINE(qdev_get_machine()). Following a0628599f..cc7d44c2e0, a5e0b33119 use 'current_machine' again. qdev_get_machine() resolves the machine in the QOM composition tree. Paolo answered [2]: > I would always use MACHINE(qdev_get_machine()), espeecially outside > vl.c. Ideally, current_machine would be static within vl.c or even > unused outside the object_property_add_child() that you quote above. Let's remove the global current_machine. I am still confused by this comment: /* qdev_get_machine() can return something that's not TYPE_MACHINE * if this is one of the user-only emulators; in that case there's * no need to check the ignore_memory_transaction_failures board flag. */ [1] https://www.mail-archive.com/qemu-devel@nongnu.org/msg669475.html [2] https://www.mail-archive.com/qemu-devel@nongnu.org/msg669493.html Philippe Mathieu-Daudé (15): target/arm/kvm: Use CPUState::kvm_state in kvm_arm_pmu_supported() hw/ppc/spapr_rtas: Use local MachineState variable hw/ppc/spapr_rtas: Access MachineState via SpaprMachineState argument hw/ppc/spapr_rtas: Restrict variables scope to single switch case device-hotplug: Replace current_machine by qdev_get_machine() migration/savevm: Replace current_machine by qdev_get_machine() hw/core/machine-qmp-cmds: Replace current_machine by qdev_get_machine() target/arm/monitor: Replace current_machine by qdev_get_machine() device_tree: Replace current_machine by qdev_get_machine() memory: Replace current_machine by qdev_get_machine() exec: Replace current_machine by qdev_get_machine() accel: Introduce the current_accel() method accel: Replace current_machine->accelerator by current_accel() method accel/accel: Replace current_machine by qdev_get_machine() vl: Make current_machine a local variable include/hw/boards.h| 2 -- include/sysemu/accel.h | 2 ++ accel/accel.c | 7 +++ accel/kvm/kvm-all.c| 4 ++-- accel/tcg/tcg-all.c| 2 +- device-hotplug.c | 2 +- device_tree.c | 4 +++- exec.c | 10 ++ hw/core/machine-qmp-cmds.c | 4 ++-- hw/ppc/spapr_rtas.c| 6 +++--- memory.c | 6 -- migration/savevm.c | 10 +- target/arm/kvm.c | 4 +--- target/arm/kvm64.c | 4 ++-- target/arm/monitor.c | 3 ++- target/i386/kvm.c | 2 +- target/ppc/kvm.c | 2 +- vl.c | 6 +++--- 18 files changed, 46 insertions(+), 34 deletions(-) -- 2.21.1
Re: [PATCH v12 00/13] VIRTIO-IOMMU device
Patchew URL: https://patchew.org/QEMU/20200109144319.15912-1-eric.au...@redhat.com/ Hi, This series failed the docker-quick@centos7 build test. Please find the testing commands and their output below. If you have Docker installed, you can probably reproduce it locally. === TEST SCRIPT BEGIN === #!/bin/bash make docker-image-centos7 V=1 NETWORK=1 time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1 === TEST SCRIPT END === TESTcheck-qtest-x86_64: tests/test-hmp TESTcheck-qtest-x86_64: tests/qos-test ** ERROR:/tmp/qemu-test/src/tests/virtio-iommu-test.c:46:pci_config: assertion failed (probe_size == 0x200): (0x == 0x0200) ERROR - Bail out! ERROR:/tmp/qemu-test/src/tests/virtio-iommu-test.c:46:pci_config: assertion failed (probe_size == 0x200): (0x == 0x0200) make: *** [check-qtest-x86_64] Error 1 make: *** Waiting for unfinished jobs TESTcheck-qtest-aarch64: tests/test-hmp TESTiotest-qcow2: 220 --- raise CalledProcessError(retcode, cmd) subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', '--label', 'com.qemu.instance.uuid=de2154161ccf4468af5ebd35cbcbdc03', '-u', '1003', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=1', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/home/patchew2/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-_ap6ulch/src/docker-src.2020-01-09-10.01.05.32472:/var/tmp/qemu:z,ro', 'qemu:centos7', '/var/tmp/qemu/run', 'test-quick']' returned non-zero exit status 2. filter=--filter=label=com.qemu.instance.uuid=de2154161ccf4468af5ebd35cbcbdc03 make[1]: *** [docker-run] Error 1 make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-_ap6ulch/src' make: *** [docker-run-test-quick@centos7] Error 2 real11m50.898s user0m8.023s The full log is available at http://patchew.org/logs/20200109144319.15912-1-eric.au...@redhat.com/testing.docker-quick@centos7/?type=message. --- Email generated automatically by Patchew [https://patchew.org/]. Please send your feedback to patchew-de...@redhat.com
Re: Pre-Christmas meeting notes
06.01.2020 20:15, Max Reitz wrote: > Misc > > > The Wiki’s TODO list is horribly outdated. What should we do about > it? Maybe archive it and start a new one? (Most of the things on the > current list are either done or we don’t want to do anymore.) May be, create block/TODO.txt instead? It would be simpler to manage file in git than wiki page. -- Best regards, Vladimir
[Bug 1859021] Re: qemu-system-aarch64 (tcg): cval + voff overflow not handled, causes qemu to hang
** Tags added: arm tcg -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1859021 Title: qemu-system-aarch64 (tcg): cval + voff overflow not handled, causes qemu to hang Status in QEMU: New Bug description: The Armv8 architecture reference manual states that for any timer set (e.g. CNTP* and CNTV*), the condition for such timer to generate an interrupt (if enabled & unmasked) is: CVAL <= CNT(P/V)CT Although this is arguably sloppy coding, I have seen code that is therefore assuming it can set CVAL to a very high value (e.g. UINT64_MAX) and leave the interrupt enabled in CTL, and never get the interrupt. On latest master commit as the time of writing, there is an integer overflow in target/arm/helper.c gt_recalc_timer affecting the virtual timer when the interrupt is enabled in CTL: /* Next transition is when we hit cval */ nexttick = gt->cval + offset; When this overflow happens, I notice that qemu is no longer responsive and that I have to SIGKILL the process: - qemu takes nearly all the cpu time of the cores it is running on (e.g. 50% cpu usage if running on half the cores) and is completely unresponsive - no guest interrupt (reported via -d int) is generated Here the minimal code example to reproduce the issue: mov x0, #1 msr cntvoff_el2, x0 mov x0, #-1 msr cntv_cval_el0, x0 mov x0, #1 msr cntv_ctl_el0, x0 // interrupt generation enabled, not masked; qemu will start to hang here Options used: -nographic -machine virt,virtualization=on,gic-version=2,accel=tcg -cpu cortex-a57 -smp 4 -m 1024 -kernel whatever.elf -d unimp,guest_errors,int -semihosting-config enable,target=native -serial mon:stdio Version used: 4.2 To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1859021/+subscriptions
Re: [PATCH v11 Kernel 3/6] vfio iommu: Implementation of ioctl to for dirty pages tracking.
On Thu, 9 Jan 2020 18:59:40 +0530 Kirti Wankhede wrote: > On 1/9/2020 3:59 AM, Alex Williamson wrote: > > On Thu, 9 Jan 2020 01:31:16 +0530 > > Kirti Wankhede wrote: > > > >> On 1/8/2020 3:32 AM, Alex Williamson wrote: > >>> On Wed, 8 Jan 2020 01:37:03 +0530 > >>> Kirti Wankhede wrote: > >>> > >> > >> > >> > >> + > >> + unlocked = vfio_iova_put_vfio_pfn(dma, vpfn, dirty_tracking); > >> > >>if (do_accounting) > >>vfio_lock_acct(dma, -unlocked, true); > >> @@ -571,8 +606,12 @@ static int vfio_iommu_type1_pin_pages(void > >> *iommu_data, > >> > >>vpfn = vfio_iova_get_vfio_pfn(dma, iova); > >>if (vpfn) { > >> - phys_pfn[i] = vpfn->pfn; > >> - continue; > >> + if (vpfn->unpinned) > >> + vfio_remove_from_pfn_list(dma, vpfn); > > > > This seems inefficient, we have an allocated vpfn at the right places > > in the list, wouldn't it be better to repin the page? > > > > vfio_pin_page_external() takes care of pinning and accounting as well. > >>> > >>> Yes, but could we call vfio_pin_page_external() without {unlinking, > >>> freeing} and {re-allocating, linking} on either side of it since it's > >>> already in the list? That's the inefficient part. Maybe at least a > >>> TODO comment? > >>> > >> > >> Changing it as below: > >> > >> vpfn = vfio_iova_get_vfio_pfn(dma, iova); > >> if (vpfn) { > >> - phys_pfn[i] = vpfn->pfn; > >> - continue; > >> + if (vpfn->ref_count > 1) { > >> + phys_pfn[i] = vpfn->pfn; > >> + continue; > >> + } > >> } > >> > >> remote_vaddr = dma->vaddr + iova - dma->iova; > >> ret = vfio_pin_page_external(dma, remote_vaddr, > >> &phys_pfn[i], > >>do_accounting); > >> if (ret) > >> goto pin_unwind; > >> - > >> - ret = vfio_add_to_pfn_list(dma, iova, phys_pfn[i]); > >> - if (ret) { > >> - vfio_unpin_page_external(dma, iova, do_accounting); > >> - goto pin_unwind; > >> - } > >> + if (!vpfn) { > >> + ret = vfio_add_to_pfn_list(dma, iova, phys_pfn[i]); > >> + if (ret) { > >> + vfio_unpin_page_external(dma, iova, > >> +do_accounting, > >> false); > >> + goto pin_unwind; > >> + } > >> + } else > >> + vpfn->pfn = phys_pfn[i]; > >> } > >> > >> > >> > >> > >> + else { > >> + phys_pfn[i] = vpfn->pfn; > >> + continue; > >> + } > >>} > >> > >>remote_vaddr = dma->vaddr + iova - dma->iova; > >> @@ -583,7 +622,8 @@ static int vfio_iommu_type1_pin_pages(void > >> *iommu_data, > >> > >>ret = vfio_add_to_pfn_list(dma, iova, phys_pfn[i]); > >>if (ret) { > >> - vfio_unpin_page_external(dma, iova, > >> do_accounting); > >> + vfio_unpin_page_external(dma, iova, > >> do_accounting, > >> + false); > >>goto pin_unwind; > >>} > >>} > >> > >> > >> > > >> + if (range.flags & VFIO_IOMMU_DIRTY_PAGES_FLAG_START) { > >> + iommu->dirty_page_tracking = true; > >> + return 0; > >> + } else if (range.flags & > >> VFIO_IOMMU_DIRTY_PAGES_FLAG_STOP) { > >> + iommu->dirty_page_tracking = false; > >> + > >> + mutex_lock(&iommu->lock); > >> + vfio_remove_unpinned_from_dma_list(iommu); > >> + mutex_unlock(&iommu->lock); > >> + return 0; > >> + > >> + } else if (range.flags & > >> + > >> VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP) { > >> + uint64_t iommu_pgmask; > >> + unsigned long pgshift = __ffs(range.pgsize); > >> + unsigned long *bitmap; > >> + long bsize; > >> + > >> + iommu_pgmask = > >> +