Re: [PATCH] riscv: sifive_u: Add a "serial" property for board serial number

2020-01-09 Thread Bin Meng
Hi Palmer,

On Fri, Nov 22, 2019 at 10:38 AM Palmer Dabbelt
 wrote:
>
> On Thu, 21 Nov 2019 17:10:18 PST (-0800), bmeng...@gmail.com wrote:
> > On Sat, Nov 16, 2019 at 11:08 PM Bin Meng  wrote:
> >>
> >> At present the board serial number is hard-coded to 1, and passed
> >> to OTP model during initialization. Firmware (FSBL, U-Boot) uses
> >> the serial number to generate a unique MAC address for the on-chip
> >> ethernet controller. When multiple QEMU 'sifive_u' instances are
> >> created and connected to the same subnet, they all have the same
> >> MAC address hence it creates a unusable network.
> >>
> >> A new "serial" property is introduced to specify the board serial
> >> number. When not given, the default serial number 1 is used.
> >>
> >> Signed-off-by: Bin Meng 
> >> ---
> >>
> >>  hw/riscv/sifive_u.c | 21 -
> >>  include/hw/riscv/sifive_u.h |  1 +
> >>  2 files changed, 21 insertions(+), 1 deletion(-)
> >>
> >
> > ping?
>
> Sorry, it looks like I dropped this one.  I've put it in the queue for 5.0,
> with a
>
> Reviewed-by: Palmer Dabbelt 

Has this been applied somewhere?

Regards,
Bin



Re: [PATCH] vl: Don't mismatch g_strsplit()/g_free()

2020-01-09 Thread Marc-André Lureau
Hi

On Fri, Jan 10, 2020 at 7:17 AM  wrote:
>
> From: Pan Nengyuan 
>
> It's a mismatch between g_strsplit and g_free, it will cause a memory leak as 
> follow:
>
> [root@localhost]# ./aarch64-softmmu/qemu-system-aarch64 -accel help
> Accelerators supported in QEMU binary:
> tcg
> kvm
> =
> ==1207900==ERROR: LeakSanitizer: detected memory leaks
>
> Direct leak of 8 byte(s) in 2 object(s) allocated from:
> #0 0xfffd700231cb in __interceptor_malloc (/lib64/libasan.so.4+0xd31cb)
> #1 0xfffd6ec57163 in g_malloc (/lib64/libglib-2.0.so.0+0x57163)
> #2 0xfffd6ec724d7 in g_strndup (/lib64/libglib-2.0.so.0+0x724d7)
> #3 0xfffd6ec73d3f in g_strsplit (/lib64/libglib-2.0.so.0+0x73d3f)
> #4 0xaaab66be5077 in main /mnt/sdc/qemu-master/qemu-4.2.0-rc0/vl.c:3517
> #5 0xfffd6e140b9f in __libc_start_main (/lib64/libc.so.6+0x20b9f)
> #6 0xaaab66bf0f53  (./build/aarch64-softmmu/qemu-system-aarch64+0x8a0f53)
>
> Direct leak of 2 byte(s) in 2 object(s) allocated from:
> #0 0xfffd700231cb in __interceptor_malloc (/lib64/libasan.so.4+0xd31cb)
> #1 0xfffd6ec57163 in g_malloc (/lib64/libglib-2.0.so.0+0x57163)
> #2 0xfffd6ec7243b in g_strdup (/lib64/libglib-2.0.so.0+0x7243b)
> #3 0xfffd6ec73e6f in g_strsplit (/lib64/libglib-2.0.so.0+0x73e6f)
> #4 0xaaab66be5077 in main /mnt/sdc/qemu-master/qemu-4.2.0-rc0/vl.c:3517
> #5 0xfffd6e140b9f in __libc_start_main (/lib64/libc.so.6+0x20b9f)
> #6 0xaaab66bf0f53  (./build/aarch64-softmmu/qemu-system-aarch64+0x8a0f53)
>
> Reported-by: Euler Robot 
> Signed-off-by: Pan Nengyuan 

Fixes: cbe6d6365a48bce4526c664170cda6fe738484f8

Reviewed-by: Marc-André Lureau 

> ---
>  vl.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/vl.c b/vl.c
> index 86474a55c9..2fa5cb3b9a 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -3476,7 +3476,7 @@ int main(int argc, char **argv, char **envp)
>  gchar **optname = g_strsplit(typename,
>   ACCEL_CLASS_SUFFIX, 
> 0);
>  printf("%s\n", optname[0]);
> -g_free(optname);
> +g_strfreev(optname);
>  }
>  g_free(typename);
>  }
> --
> 2.21.0.windows.1
>
>
>


-- 
Marc-André Lureau



Re: Audio pace issue with qemu4.2

2020-01-09 Thread Gerd Hoffmann
On Fri, Jan 10, 2020 at 11:30:24AM +0530, padmashree mandri wrote:
> Hi all,
> 
> I am trying to run qemu4.2 with ALSA. Playback pace is fast. Where can
> i set sampling rate and all parameters related to audio in qemu?

https://www.kraxel.org/blog/2020/01/qemu-sound-audiodev/

HTH,
  Gerd




Re: [PATCH] virtio: add the queue number check

2020-01-09 Thread Yang Zhong


On Fri, Jan 03, 2020 at 10:18:58PM +0100, Paolo Bonzini wrote:
> Il ven 3 gen 2020, 16:08 Yang Zhong  ha scritto:
> 
> >   I also tried virtio-blk device like below:
> >   https://patchwork.kernel.org/cover/10873193/
> >
> >   The virtio-blk can work with this changes, but vhost-user-blk device
> >   failed with this kernel patch.
> >
> >   in vhost_virtqueue_start() function, below operation to check if the
> >   desc addr set by guest kernel. This will ignore the extra vqs.
> > a = virtio_queue_get_desc_addr(vdev, idx);
> > if (a == 0) {
> > /* Queue might not be ready for start */
> > return 0;
> > }
> >
> >   If guest kernel add min(cpu,num_vqs), do we need add same check in
> >   realize function of vhost-user-blk device?
> >
> 
> No. If virtio-blk works, the bug is in vhost-user-blk; if virtio-blk needs
> no check in cpu count, vhost-user-blk also doesn't.
> 
> You need to check first if the bug is in QEMU or the vhost-user-blk server.
> 
  Paolo and all,

  It seems i had root cause this issue, let me list what's i found. Any
  issue please correct me, thanks!

  I found there are 2 issues in DPDK and seabios by debugging this
  issue.

  (1). Seabios issue
  In init_virtio_blk() function, which set VIRTIO_CONFIG_S_DRIVER_OK
  status to qemu vhost-user-blk device.

  // the related code
  ..
  status |= VIRTIO_CONFIG_S_DRIVER_OK;
  vp_set_status(>vp, status);
  ..

  I think there is no need for seabios to set VIRTIO_CONFIG_S_DRIVER_OK
  status to qemu vhost-user-blk device. In fact, this time, the guest
  virtio-blk module is not started. Once seabios set this DRIVER_OK
  status to qemu vhost user device, the vhost-user-blk will call
  vhost_user_blk_start().
  
  static void vhost_user_blk_set_status(VirtIODevice *vdev, uint8_t status)
  {
VHostUserBlk *s = VHOST_USER_BLK(vdev);
bool should_start = virtio_device_started(vdev, status);
int ret;

if (!vdev->vm_running) {
should_start = false;
}

if (!s->connected) {
return;
}

if (s->dev.started == should_start) {
return;
}

if (should_start) {
ret = vhost_user_blk_start(vdev);
if (ret < 0) {
error_report("vhost-user-blk: vhost start failed: %s",
 strerror(-ret));
qemu_chr_fe_disconnect(>chardev);
}
} else {
vhost_user_blk_stop(vdev);
}
}

In fact, this time vhost_user_blk_start almost do nothing because
the real guest virtio-blk driver still not started yet. This time,
there is only one vq can be used(this vq should be inited in seabios).

When the guest virtio-blk driver really start and complet the
probe(), the guest virtio-blk driver will set
VIRTIO_CONFIG_S_DRIVER_OK to vhost-user-blk device again. This
time, this driver will allocate RIGHT queue num according to
MIN(vcpu, num_vqs).

So, i think set VIRTIO_CONFIG_S_DRIVER_OK status should be removed from
seabios, and guest driver should do this.

I removed this status seting in the seabios, and test verified this.
guest virtio-blk driver can be normally initialized by virtio-blk or
vhost-user-blk device from qemu.

(2). DPDK issue
 DPDK does not know the real queue number used by guest virtio-blk
 driver and it only know the queue number from vhost-user-blk
 commond line. Once the guest virtio-blk driver change the queue
 number according to MIN(vcpu, num_vqs), DPDK still use previous
 queue number and it think virtio is never ready by
 virtio_is_ready() function.

 There are two methods to fix this issue,

 one is add VHOST_USER_SET_QUEUE_NUM to vhost user protocol.
 vhost-user-blk device can call this to inform SPDK the real
 queue number. vhost-user-device can use below method to get
 the real queue number(to check if vring.desc is NON-NULL)
 for (i = 0; i < s->dev.nvqs; i++) {
VirtQueue *kick_vq = virtio_get_queue(vdev, i);

if (!virtio_queue_get_desc_addr(vdev, i)) {
continue;
}
 }

 Current vhost-user protocol only support GET_QUEUE_NUM(get max
 queue num), we can add SET_QUEUE_NUM.

 or DPDK can get the real queue number by checking if the vring.desc
 is NON-NULL.

 By the way, vhost SCSI device has the same issue with
 vhost-user-blk device. 

 Yang

> Paolo



RE: [PATCH v1 15/59] block/blkreplay.c: remove unneeded 'fail' label in blkreplay_open()

2020-01-09 Thread Pavel Dovgalyuk
> From: Daniel Henrique Barboza [mailto:danielhb...@gmail.com]
> Sent: Monday, January 06, 2020 9:24 PM
> To: qemu-devel@nongnu.org
> Cc: qemu-triv...@nongnu.org; Daniel Henrique Barboza; Pavel Dovgalyuk; Paolo 
> Bonzini
> Subject: [PATCH v1 15/59] block/blkreplay.c: remove unneeded 'fail' label in 
> blkreplay_open()
> 
> Both the 'fail' label and the 'ret' variable can be removed.
> Use 'return -EINVAL' in the error condition and 'return 0' in
> the end of the function.
> 
> CC: Pavel Dovgalyuk 
> CC: Paolo Bonzini 
> Signed-off-by: Daniel Henrique Barboza 


Reviewed-by: Pavel Dovgalyuk 

> ---
>  block/blkreplay.c | 8 ++--
>  1 file changed, 2 insertions(+), 6 deletions(-)
> 
> diff --git a/block/blkreplay.c b/block/blkreplay.c
> index c96ac8f4bc..d8c4c311f3 100644
> --- a/block/blkreplay.c
> +++ b/block/blkreplay.c
> @@ -24,23 +24,19 @@ static int blkreplay_open(BlockDriverState *bs, QDict 
> *options, int flags,
>Error **errp)
>  {
>  Error *local_err = NULL;
> -int ret;
> 
>  /* Open the image file */
>  bs->file = bdrv_open_child(NULL, options, "image",
> bs, _file, false, _err);
>  if (local_err) {
> -ret = -EINVAL;
>  error_propagate(errp, local_err);
> -goto fail;
> +return -EINVAL;
>  }
> 
>  bs->supported_write_flags = BDRV_REQ_WRITE_UNCHANGED;
>  bs->supported_zero_flags = BDRV_REQ_WRITE_UNCHANGED;
> 
> -ret = 0;
> -fail:
> -return ret;
> +return 0;
>  }
> 
>  static int64_t blkreplay_getlength(BlockDriverState *bs)
> --
> 2.24.1



Pavel Dovgalyuk




Audio pace issue with qemu4.2

2020-01-09 Thread padmashree mandri
Hi all,

I am trying to run qemu4.2 with ALSA. Playback pace is fast. Where can
i set sampling rate and all parameters related to audio in qemu?

Thanks
padma


[PATCH] vl: Don't mismatch g_strsplit()/g_free()

2020-01-09 Thread pannengyuan
From: Pan Nengyuan 

It's a mismatch between g_strsplit and g_free, it will cause a memory leak as 
follow:

[root@localhost]# ./aarch64-softmmu/qemu-system-aarch64 -accel help
Accelerators supported in QEMU binary:
tcg
kvm
=
==1207900==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 8 byte(s) in 2 object(s) allocated from:
#0 0xfffd700231cb in __interceptor_malloc (/lib64/libasan.so.4+0xd31cb)
#1 0xfffd6ec57163 in g_malloc (/lib64/libglib-2.0.so.0+0x57163)
#2 0xfffd6ec724d7 in g_strndup (/lib64/libglib-2.0.so.0+0x724d7)
#3 0xfffd6ec73d3f in g_strsplit (/lib64/libglib-2.0.so.0+0x73d3f)
#4 0xaaab66be5077 in main /mnt/sdc/qemu-master/qemu-4.2.0-rc0/vl.c:3517
#5 0xfffd6e140b9f in __libc_start_main (/lib64/libc.so.6+0x20b9f)
#6 0xaaab66bf0f53  (./build/aarch64-softmmu/qemu-system-aarch64+0x8a0f53)

Direct leak of 2 byte(s) in 2 object(s) allocated from:
#0 0xfffd700231cb in __interceptor_malloc (/lib64/libasan.so.4+0xd31cb)
#1 0xfffd6ec57163 in g_malloc (/lib64/libglib-2.0.so.0+0x57163)
#2 0xfffd6ec7243b in g_strdup (/lib64/libglib-2.0.so.0+0x7243b)
#3 0xfffd6ec73e6f in g_strsplit (/lib64/libglib-2.0.so.0+0x73e6f)
#4 0xaaab66be5077 in main /mnt/sdc/qemu-master/qemu-4.2.0-rc0/vl.c:3517
#5 0xfffd6e140b9f in __libc_start_main (/lib64/libc.so.6+0x20b9f)
#6 0xaaab66bf0f53  (./build/aarch64-softmmu/qemu-system-aarch64+0x8a0f53)

Reported-by: Euler Robot 
Signed-off-by: Pan Nengyuan 
---
 vl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/vl.c b/vl.c
index 86474a55c9..2fa5cb3b9a 100644
--- a/vl.c
+++ b/vl.c
@@ -3476,7 +3476,7 @@ int main(int argc, char **argv, char **envp)
 gchar **optname = g_strsplit(typename,
  ACCEL_CLASS_SUFFIX, 
0);
 printf("%s\n", optname[0]);
-g_free(optname);
+g_strfreev(optname);
 }
 g_free(typename);
 }
-- 
2.21.0.windows.1





RE: [PATCH] hw/arm/acpi: Pack the SRAT processors structure by node_id ascending order

2020-01-09 Thread Zengtao (B)
> -Original Message-
> From: Igor Mammedov [mailto:imamm...@redhat.com]
> Sent: Thursday, January 09, 2020 5:54 PM
> To: Zengtao (B)
> Cc: Michael S. Tsirkin; qemu-devel@nongnu.org; qemu-triv...@nongnu.org;
> Shannon Zhao; Peter Maydell; qemu-...@nongnu.org
> Subject: Re: [PATCH] hw/arm/acpi: Pack the SRAT processors structure by
> node_id ascending order
> 
> On Thu, 9 Jan 2020 02:45:58 +
> "Zengtao (B)"  wrote:
> 
> > > -Original Message-
> > > From: Igor Mammedov [mailto:imamm...@redhat.com]
> > > Sent: Thursday, January 09, 2020 12:39 AM
> > > To: Zengtao (B)
> > > Cc: Michael S. Tsirkin; qemu-devel@nongnu.org;
> qemu-triv...@nongnu.org;
> > > Shannon Zhao; Peter Maydell; qemu-...@nongnu.org
> > > Subject: Re: [PATCH] hw/arm/acpi: Pack the SRAT processors structure
> by
> > > node_id ascending order
> > >
> > > On Wed, 8 Jan 2020 04:02:10 +
> > > "Zengtao (B)"  wrote:
> > >
> > > > > -Original Message-
> > > > > From: Igor Mammedov [mailto:imamm...@redhat.com]
> > > > > Sent: Tuesday, January 07, 2020 11:50 PM
> > > > > To: Zengtao (B)
> > > > > Cc: Michael S. Tsirkin; qemu-devel@nongnu.org;
> > > qemu-triv...@nongnu.org;
> > > > > Shannon Zhao; Peter Maydell; qemu-...@nongnu.org
> > > > > Subject: Re: [PATCH] hw/arm/acpi: Pack the SRAT processors
> structure
> > > by
> > > > > node_id ascending order
> > > > >
> > > > > On Tue, 7 Jan 2020 10:29:22 +
> > > > > "Zengtao (B)"  wrote:
> > > > >
> > > > > > > -Original Message-
> > > > > > > From: Michael S. Tsirkin [mailto:m...@redhat.com]
> > > > > > > Sent: Tuesday, January 07, 2020 5:33 PM
> > > > > > > To: Zengtao (B)
> > > > > > > Cc: qemu-devel@nongnu.org; qemu-triv...@nongnu.org;
> Shannon
> > > > > Zhao;
> > > > > > > Peter Maydell; Igor Mammedov; qemu-...@nongnu.org
> > > > > > > Subject: Re: [PATCH] hw/arm/acpi: Pack the SRAT processors
> > > structure
> > > > > by
> > > > > > > node_id ascending order
> > > > > > >
> > > > > > > On Tue, Jan 07, 2020 at 05:18:49PM +0800, Zeng Tao wrote:
> > > > > > > > When booting the guest linux with the following numa
> > > configuration:
> > > > > > > > -numa node,node_id=1,cpus=0-3
> > > > > > > > -numa node,node_id=0,cpus=4-7
> > > > > > > > We can get the following numa topology in the guest system:
> > > > > > > > Architecture:  aarch64
> > > > > > > > Byte Order:Little Endian
> > > > > > > > CPU(s):8
> > > > > > > > On-line CPU(s) list:   0-7
> > > > > > > > Thread(s) per core:1
> > > > > > > > Core(s) per socket:8
> > > > > > > > Socket(s): 1
> > > > > > > > NUMA node(s):  2
> > > > > > > > L1d cache: unknown size
> > > > > > > > L1i cache: unknown size
> > > > > > > > L2 cache:  unknown size
> > > > > > > > NUMA node0 CPU(s): 0-3
> > > > > > > > NUMA node1 CPU(s): 4-7
> > > > > > > > The Cpus 0-3 is assigned with NUMA node 1 in QEMU while it
> get
> > > > > NUMA
> > > > > > > node
> > > > > > > > 0 in the guest.
> > > > > > > >
> > > > > > > > In fact, In the linux kernel, numa_node_id is allocated per the
> > > ACPI
> > > > > > > > SRAT processors structure order,so the cpu 0 will be the first
> one
> > > to
> > > > > > > > allocate its NUMA node id, so it gets the NUMA node 0.
> > > > > > > >
> > > > > > > > To fix this issue, we pack the SRAT processors structure in
> numa
> > > node
> > > > > id
> > > > > > > > order but not the default cpu number order.
> > > > > > > >
> > > > > > > > Signed-off-by: Zeng Tao 
> > > > > > >
> > > > > > >
> > > > > > > Does this matter? If yes fixing linux to take node id from
> proximity
> > > > > > > field in ACPI seems cleaner ...
> > > > > > >
> > > > > >
> > > > > > In fact, I just want to align the node_id concept in QEMU and
> Linux.
> > > > > > If we fix the kernel side, we need to align with all platforms.
> > > > > > i think maybe not a good idea.
> > > > > if linux makes up node ID's on it's own, it would be hard for it to
> > > > > map SRAT entries to other tables that use proximity id as well.
> > > > >
> > > > > So it would need to maintain map of [proximity id] -> [node id] (and
> > > reverse)
> > > > > somewhere to resolve mappings on other tables.
> > > > > If it doesn't do this then it's broken and works just by accident,
> > > > > if it does the fix probably should be in that code and not in QEMU.
> > > > >
> > > >
> > > > Hmm, the problem is how to understand the concept of node id.
> > > > 1. In dts, there is node id. Both the QEMU and Linux can use it
> > > > directly, so no conflict.
> > > > 2. In ACPI, there is only proximity domain, but no node id, there
> > > >  should be a mapping between them, while kernel and QEMU
> maintain
> > > >  their own rule, and unfortunately they conflict with each other
> > > >  sometimes.
> > > >
> > > > There is no specification to indicate what we should do to maintain
> the
> > > > mapping, so it's hard to align the QEMU and Linux.
> > > >
> > > > Any 

Re: [PATCH 0/2] not use multifd during postcopy

2020-01-09 Thread Wei Yang
On Thu, Jan 09, 2020 at 10:50:25AM +0100, Juan Quintela wrote:
>Wei Yang  wrote:
>> On Mon, Dec 16, 2019 at 10:35:39AM +0800, Wei Yang wrote:
>>>Would this one be picked up this time?
>>
>> Happy new year to all.
>>
>> Can I ask the plan for this patch set?
>
>queued
>

Thanks :-)

-- 
Wei Yang
Help you, Help me



[PATCH qemu v5] spapr: Kill SLOF

2020-01-09 Thread Alexey Kardashevskiy
The Petitboot bootloader is way more advanced than SLOF is ever going to
be as Petitboot comes with the full-featured Linux kernel with all
the drivers, and initramdisk with quite user friendly interface.
The problem with ditching SLOF is that an unmodified pseries kernel can
either start via:
1. kexec, this requires presence of RTAS and skips
ibm,client-architecture-support entirely;
2. normal boot, this heavily relies on the OF1275 client interface to
fetch the device tree and do early setup (claim memory).

This adds a new bios-less mode to the pseries machine: "bios=on|off".
When enabled, QEMU does not load SLOF and jumps to the kernel from
"-kernel".

The client interface is implemented exactly as RTAS - a 20 bytes blob,
right next after the RTAS blob. The entry point is passed to the kernel
via GPR5.

This implements a handful of client interface methods just to get going.
In particular, this implements the device tree fetching,
ibm,client-architecture-support and instantiate-rtas.

This implements changing FDT properties for RTAS (for vmlinux and zImage)
and initramdisk location (for zImage). To make this work, this skips
fdt_pack() when bios=off as not packing the blob leaves some room for
appending.

This assigns "phandles" to device tree nodes as there is no more SLOF
and OF nodes addresses of which served as phandle values.
This keeps predefined nodes (such as XICS/NVLINK/...) unchanged.
phandles are regenerated at every FDT rebuild.

When bios=off, this adds "/chosen" every time QEMU builds a tree.

This implements "claim" which the client (Linux) uses for memory
allocation; this is also  used by QEMU for claiming kernel/initrd images,
client interface entry point, RTAS and the initial stack.

While at this, add a "kernel-addr" machine parameter to allow moving
the kernel in memory. This is useful for debugging if the kernel is
loaded at @0, although not necessary.

This adds basic instances support which are managed by a hashmap
ihandle->[phandle, DeviceState, Chardev].

Note that a 64bit PCI fix is required for Linux:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=df5be5be8735e

The test command line:

qemu-system-ppc64 \
-nodefaults \
-chardev stdio,id=STDIO0,signal=off,mux=on \
-device spapr-vty,id=svty0,reg=0x71000110,chardev=STDIO0 \
-mon id=MON0,chardev=STDIO0,mode=readline \
-nographic \
-vga none \
-kernel pbuild/kernel-le-guest/arch/powerpc/boot/zImage.pseries \
-machine pseries,bios=off,cap-cfpc=broken,cap-sbbc=broken,cap-ibs=broken \
-m 4G \
-enable-kvm \
-initrd pb/rootfs.cpio.xz \
-device nec-usb-xhci,id=nec-usb-xhci0 \
-netdev tap,id=TAP0,helper=/home/aik/qemu-bridge-helper --br=br0 \
-device virtio-net-pci,id=vnet0,netdev=TAP0 img/f30le.qcow2 \
-snapshot \
-smp 8,threads=8 \
-trace events=qemu_trace_events \
-d guest_errors \
-chardev socket,id=SOCKET0,server,nowait,path=qemu.mon.ssh54088 \
-mon chardev=SOCKET0,mode=control

Signed-off-by: Alexey Kardashevskiy 
---
Changes:
v5:
* made instances keep device and chardev pointers
* removed VIO dependencies
* print error if RTAS memory is not claimed as it should have been
* pack FDT as "quiesce"

v4:
* fixed open
* validate ihandles in "call-method"

v3:
* fixed phandles allocation
* s/__be32/uint32_t/ as we do not normally have __be32 type in qemu
* fixed size of /chosen/stdout
* bunch of renames
* do not create rtas properties at all, let the client deal with it;
instead setprop allows changing these in the FDT
* no more packing FDT when bios=off - nobody needs it and getprop does not
work otherwise
* allow updating initramdisk device tree properties (for zImage)
* added instances
* fixed stdout on OF's "write"
* removed special handling for stdout in OF client, spapr-vty handles it
instead

v2:
* fixed claim()
* added "setprop"
* cleaner client interface and RTAS blobs management
* boots to petitboot and further to the target system
* more trace points
---
 hw/ppc/Makefile.objs |   1 +
 include/hw/ppc/spapr.h   |  28 +-
 hw/ppc/spapr.c   | 266 ++--
 hw/ppc/spapr_hcall.c |  74 +++--
 hw/ppc/spapr_of_client.c | 633 +++
 hw/ppc/trace-events  |  12 +
 6 files changed, 959 insertions(+), 55 deletions(-)
 create mode 100644 hw/ppc/spapr_of_client.c

diff --git a/hw/ppc/Makefile.objs b/hw/ppc/Makefile.objs
index 101e9fc59185..20efc0aa6f9b 100644
--- a/hw/ppc/Makefile.objs
+++ b/hw/ppc/Makefile.objs
@@ -6,6 +6,7 @@ obj-$(CONFIG_PSERIES) += spapr_hcall.o spapr_iommu.o 
spapr_rtas.o
 obj-$(CONFIG_PSERIES) += spapr_pci.o spapr_rtc.o spapr_drc.o
 obj-$(CONFIG_PSERIES) += spapr_cpu_core.o spapr_ovec.o spapr_irq.o
 obj-$(CONFIG_PSERIES) += spapr_tpm_proxy.o
+obj-$(CONFIG_PSERIES) += spapr_of_client.o
 obj-$(CONFIG_SPAPR_RNG) +=  spapr_rng.o
 # IBM PowerNV
 obj-$(CONFIG_POWERNV) += pnv.o pnv_xscom.o pnv_core.o pnv_lpc.o pnv_psi.o 
pnv_occ.o pnv_bmc.o
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 61f005c6f686..efc2c70abf99 

[Bug 1859081] Re: Mouse way too fast when Qemu is on a Windows VM with a OS 9 Guest

2020-01-09 Thread John Arbuckle
What is the QEMU command-line you use?
Does this problem exist with the usb mouse (-device usb-mouse)?
Could you try upgrading to the latest version of QEMU and see if the issue is 
resolved please?

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1859081

Title:
  Mouse way too fast when Qemu is on a Windows VM with a OS 9 Guest

Status in QEMU:
  New

Bug description:
  On a server, I have a Windows 10 VM with Qemu 4.1.0 (latest) from 
https://qemu.weilnetz.de/w64/ installed.
  There I have a Mac OS 9.2.2 machine.
  Now if I connect to the Windows VM with VNC or RDP or even VMWare console, 
the Mouse in the Mac OS Guest inside Qemu is wy to fast. Even when lowering 
the mouse speed in the Mac OS mouse setting, one pixel in the Host (Windows 10 
VM) still moves the mouse by 10 pixels inside the Qemu machine.
  I tried different resolutions but that does not help.
  Is there any way to fix this or any way how I can provide more information?
  Thanks

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1859081/+subscriptions



[Bug 1859106] [NEW] 4.2 regression: ReactOS crashes on boot

2020-01-09 Thread Mike Swanson
Public bug reported:

In QEMU 4.1.x and earlier, ReactOS can successfully boot, but starting
with 4.2, it fails, instead coming up with an error "The video driver
failed to initialize."

This happens regardless of VM configuration (even -M pc-i440fx-4.1) and
it works well with older versions of QEMU. bisecting QEMU to find the
first bad commit reveals 0221d73ce6a8e075adaa0a35a6ef853d2652b855 as the
culprit, which is updating the seabios version; perhaps this bug belongs
there, but I don't know the appropriate avenue (it seems seabios is a
subproject of QEMU anyway?).

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1859106

Title:
  4.2 regression: ReactOS crashes on boot

Status in QEMU:
  New

Bug description:
  In QEMU 4.1.x and earlier, ReactOS can successfully boot, but starting
  with 4.2, it fails, instead coming up with an error "The video driver
  failed to initialize."

  This happens regardless of VM configuration (even -M pc-i440fx-4.1)
  and it works well with older versions of QEMU. bisecting QEMU to find
  the first bad commit reveals 0221d73ce6a8e075adaa0a35a6ef853d2652b855
  as the culprit, which is updating the seabios version; perhaps this
  bug belongs there, but I don't know the appropriate avenue (it seems
  seabios is a subproject of QEMU anyway?).

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1859106/+subscriptions



[Bug 1859106] Re: 4.2 regression: ReactOS crashes on boot

2020-01-09 Thread Mike Swanson
I should add, ReactOS can be downloaded from here:
https://reactos.org/download

The LiveCD is sufficient to see the problem.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1859106

Title:
  4.2 regression: ReactOS crashes on boot

Status in QEMU:
  New

Bug description:
  In QEMU 4.1.x and earlier, ReactOS can successfully boot, but starting
  with 4.2, it fails, instead coming up with an error "The video driver
  failed to initialize."

  This happens regardless of VM configuration (even -M pc-i440fx-4.1)
  and it works well with older versions of QEMU. bisecting QEMU to find
  the first bad commit reveals 0221d73ce6a8e075adaa0a35a6ef853d2652b855
  as the culprit, which is updating the seabios version; perhaps this
  bug belongs there, but I don't know the appropriate avenue (it seems
  seabios is a subproject of QEMU anyway?).

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1859106/+subscriptions



Re: [virtio-dev][RFC PATCH v1 2/2] virtio-gpu: add the ability to export resources

2020-01-09 Thread Gurchetan Singh
I like the idea of having one central place in the kernel where
virtio-devices get their uuid from -- i.e, no separate VM specific,
device specific implementations calling into uuid_gen().

On Wed, Jan 8, 2020 at 3:20 AM David Stevens  wrote:
>
> > Is there a specific reason why you want the host pick the uuid?  I would
> > let the guest define the uuid, i.e. move the uuid fields to
> > virtio_gpu_export_resource and scratch virtio_gpu_resp_export_resource.
>
> Sending the uuid in the original request doesn't really buy us
> anything, at least in terms of asynchronicity. The guest would still
> need to wait for the response to arrive before it could safely pass
> the uuid to any other virtio devices, to prevent a race where the
> import fails because it is processed before virtio-gpu processes the
> export. Perhaps this wouldn't be the case if we supported sharing
> fences between virtio devices, but even then, fences are more of a
> thing for the operation of a pipeline, not for the setup of a
> pipeline.
>
> At that point, I think it's just a matter of aesthetics. I lean
> slightly towards returning the uuid from the host, since that rules
> out any implementation with the aforementioned race. That being said,
> if there are any specific reasons or preferences to assigning the uuid
> from the guest, I can switch to that direction.
>
> -David



Re: [PATCH qemu v4] spapr: Kill SLOF

2020-01-09 Thread Alexey Kardashevskiy



On 10/01/2020 10:32, Alexey Kardashevskiy wrote:
> 
> 
> On 10/01/2020 10:05, Alexey Kardashevskiy wrote:
>>
>>
>> On 08/01/2020 17:18, Alexey Kardashevskiy wrote:
>>> The Petitboot bootloader is way more advanced than SLOF is ever going to
>>> be as Petitboot comes with the full-featured Linux kernel with all
>>> the drivers, and initramdisk with quite user friendly interface.
>>> The problem with ditching SLOF is that an unmodified pseries kernel can
>>> either start via:
>>> 1. kexec, this requires presence of RTAS and skips
>>> ibm,client-architecture-support entirely;
>>> 2. normal boot, this heavily relies on the OF1275 client interface to
>>> fetch the device tree and do early setup (claim memory).
>>>
>>> This adds a new bios-less mode to the pseries machine: "bios=on|off".
>>> When enabled, QEMU does not load SLOF and jumps to the kernel from
>>> "-kernel".
>>>
>>> The client interface is implemented exactly as RTAS - a 20 bytes blob,
>>> right next after the RTAS blob. The entry point is passed to the kernel
>>> via GPR5.
>>>
>>> This implements a handful of client interface methods just to get going.
>>> In particular, this implements the device tree fetching,
>>> ibm,client-architecture-support and instantiate-rtas.
>>>
>>> This implements changing FDT properties for RTAS (for vmlinux and zImage)
>>> and initramdisk location (for zImage). To make this work, this skips
>>> fdt_pack() when bios=off as not packing the blob leaves some room for
>>> appending.
>>>
>>> This assigns "phandles" to device tree nodes as there is no more SLOF
>>> and OF nodes addresses of which served as phandle values.
>>> This keeps predefined nodes (such as XICS/NVLINK/...) unchanged.
>>> phandles are regenerated at every FDT rebuild.
>>>
>>> This defines phandles for VIO devices to have phandle assigned to
>>> the default stdout device at the point when we write "/chosen/stdout"
>>> which an ihandle which the OS uses to write to the console.
>>
>>
>> And I do not really need to preallocate phandles for stdout as it is a
>> leftover from when I populated /chosen/stdout before populating VIO
>> nodes, now /chosen/stdout is added at the very end. Thanks,
> 
> 
> Ah noo, I do, to implement "write" to the selected stdout as I need to
> trace ihandle back to Object* and  object_resolve_path() does not know
> about FDT path, it is /machine/peripheral/svty0 in QOM. The commit log
> needs an update, or this needs a fix but I cannot think of a nicer one.
> Thanks,


I just might extend instances to do real instances, i.e. associate
ihandle with phandle _and_ Object*, I just need a helper to find Object
which matches what qdev_get_fw_dev_path() returns. Fun :)


> 
> 
>>
>>
>>
>>>
>>> When bios=off, this adds "/chosen" every time QEMU builds a tree.
>>>
>>> This implements "claim" which the client (Linux) uses for memory
>>> allocation; this is also  used by QEMU for claiming kernel/initrd images,
>>> client interface entry point, RTAS and the initial stack.
>>>
>>> While at this, add a "kernel-addr" machine parameter to allow moving
>>> the kernel in memory. This is useful for debugging if the kernel is
>>> loaded at @0, although not necessary.
>>>
>>> This adds very basic instances support which are managed by a hashmap
>>> ihandle->phandle.
>>>
>>> Note that a 64bit PCI fix is required for Linux:
>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=df5be5be8735e
>>
>>
>>
> 

-- 
Alexey



Does ppc target support for MTTCG are stable?

2020-01-09 Thread Yonggang Luo
I enabled it and working, but I don't know if there any other issue are
still not resolved..


-- 
 此致
礼
罗勇刚
Yours
sincerely,
Yonggang Luo


Re: [PATCH] riscv/sifive_u: fix a memory leak in soc_realize()

2020-01-09 Thread Palmer Dabbelt

On Tue, 10 Dec 2019 10:38:29 PST (-0800), alistai...@gmail.com wrote:

On Mon, Dec 9, 2019 at 11:15 PM  wrote:


From: Pan Nengyuan 

Fix a minor memory leak in riscv_sifive_u_soc_realize()

Reported-by: Euler Robot 
Signed-off-by: Pan Nengyuan 


Reviewed-by: Alistair Francis 


Thanks.  This is in the queue, I'm hoping to submit a PR after the H patch set
is ready to go.



Alistair


---
 hw/riscv/sifive_u.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/riscv/sifive_u.c b/hw/riscv/sifive_u.c
index 0140e95..0e12b3c 100644
--- a/hw/riscv/sifive_u.c
+++ b/hw/riscv/sifive_u.c
@@ -542,6 +542,7 @@ static void riscv_sifive_u_soc_realize(DeviceState *dev, 
Error **errp)
 SIFIVE_U_PLIC_CONTEXT_BASE,
 SIFIVE_U_PLIC_CONTEXT_STRIDE,
 memmap[SIFIVE_U_PLIC].size);
+g_free(plic_hart_config);
 sifive_uart_create(system_memory, memmap[SIFIVE_U_UART0].base,
 serial_hd(0), qdev_get_gpio_in(DEVICE(s->plic), SIFIVE_U_UART0_IRQ));
 sifive_uart_create(system_memory, memmap[SIFIVE_U_UART1].base,
--
2.7.2.windows.1







Re: [PATCH qemu v4] spapr: Kill SLOF

2020-01-09 Thread Alexey Kardashevskiy



On 10/01/2020 10:05, Alexey Kardashevskiy wrote:
> 
> 
> On 08/01/2020 17:18, Alexey Kardashevskiy wrote:
>> The Petitboot bootloader is way more advanced than SLOF is ever going to
>> be as Petitboot comes with the full-featured Linux kernel with all
>> the drivers, and initramdisk with quite user friendly interface.
>> The problem with ditching SLOF is that an unmodified pseries kernel can
>> either start via:
>> 1. kexec, this requires presence of RTAS and skips
>> ibm,client-architecture-support entirely;
>> 2. normal boot, this heavily relies on the OF1275 client interface to
>> fetch the device tree and do early setup (claim memory).
>>
>> This adds a new bios-less mode to the pseries machine: "bios=on|off".
>> When enabled, QEMU does not load SLOF and jumps to the kernel from
>> "-kernel".
>>
>> The client interface is implemented exactly as RTAS - a 20 bytes blob,
>> right next after the RTAS blob. The entry point is passed to the kernel
>> via GPR5.
>>
>> This implements a handful of client interface methods just to get going.
>> In particular, this implements the device tree fetching,
>> ibm,client-architecture-support and instantiate-rtas.
>>
>> This implements changing FDT properties for RTAS (for vmlinux and zImage)
>> and initramdisk location (for zImage). To make this work, this skips
>> fdt_pack() when bios=off as not packing the blob leaves some room for
>> appending.
>>
>> This assigns "phandles" to device tree nodes as there is no more SLOF
>> and OF nodes addresses of which served as phandle values.
>> This keeps predefined nodes (such as XICS/NVLINK/...) unchanged.
>> phandles are regenerated at every FDT rebuild.
>>
>> This defines phandles for VIO devices to have phandle assigned to
>> the default stdout device at the point when we write "/chosen/stdout"
>> which an ihandle which the OS uses to write to the console.
> 
> 
> And I do not really need to preallocate phandles for stdout as it is a
> leftover from when I populated /chosen/stdout before populating VIO
> nodes, now /chosen/stdout is added at the very end. Thanks,


Ah noo, I do, to implement "write" to the selected stdout as I need to
trace ihandle back to Object* and  object_resolve_path() does not know
about FDT path, it is /machine/peripheral/svty0 in QOM. The commit log
needs an update, or this needs a fix but I cannot think of a nicer one.
Thanks,


> 
> 
> 
>>
>> When bios=off, this adds "/chosen" every time QEMU builds a tree.
>>
>> This implements "claim" which the client (Linux) uses for memory
>> allocation; this is also  used by QEMU for claiming kernel/initrd images,
>> client interface entry point, RTAS and the initial stack.
>>
>> While at this, add a "kernel-addr" machine parameter to allow moving
>> the kernel in memory. This is useful for debugging if the kernel is
>> loaded at @0, although not necessary.
>>
>> This adds very basic instances support which are managed by a hashmap
>> ihandle->phandle.
>>
>> Note that a 64bit PCI fix is required for Linux:
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=df5be5be8735e
> 
> 
> 

-- 
Alexey



Re: [PATCH qemu v4] spapr: Kill SLOF

2020-01-09 Thread Alexey Kardashevskiy



On 08/01/2020 17:18, Alexey Kardashevskiy wrote:
> The Petitboot bootloader is way more advanced than SLOF is ever going to
> be as Petitboot comes with the full-featured Linux kernel with all
> the drivers, and initramdisk with quite user friendly interface.
> The problem with ditching SLOF is that an unmodified pseries kernel can
> either start via:
> 1. kexec, this requires presence of RTAS and skips
> ibm,client-architecture-support entirely;
> 2. normal boot, this heavily relies on the OF1275 client interface to
> fetch the device tree and do early setup (claim memory).
> 
> This adds a new bios-less mode to the pseries machine: "bios=on|off".
> When enabled, QEMU does not load SLOF and jumps to the kernel from
> "-kernel".
> 
> The client interface is implemented exactly as RTAS - a 20 bytes blob,
> right next after the RTAS blob. The entry point is passed to the kernel
> via GPR5.
> 
> This implements a handful of client interface methods just to get going.
> In particular, this implements the device tree fetching,
> ibm,client-architecture-support and instantiate-rtas.
> 
> This implements changing FDT properties for RTAS (for vmlinux and zImage)
> and initramdisk location (for zImage). To make this work, this skips
> fdt_pack() when bios=off as not packing the blob leaves some room for
> appending.
> 
> This assigns "phandles" to device tree nodes as there is no more SLOF
> and OF nodes addresses of which served as phandle values.
> This keeps predefined nodes (such as XICS/NVLINK/...) unchanged.
> phandles are regenerated at every FDT rebuild.
> 
> This defines phandles for VIO devices to have phandle assigned to
> the default stdout device at the point when we write "/chosen/stdout"
> which an ihandle which the OS uses to write to the console.


And I do not really need to preallocate phandles for stdout as it is a
leftover from when I populated /chosen/stdout before populating VIO
nodes, now /chosen/stdout is added at the very end. Thanks,



> 
> When bios=off, this adds "/chosen" every time QEMU builds a tree.
> 
> This implements "claim" which the client (Linux) uses for memory
> allocation; this is also  used by QEMU for claiming kernel/initrd images,
> client interface entry point, RTAS and the initial stack.
> 
> While at this, add a "kernel-addr" machine parameter to allow moving
> the kernel in memory. This is useful for debugging if the kernel is
> loaded at @0, although not necessary.
> 
> This adds very basic instances support which are managed by a hashmap
> ihandle->phandle.
> 
> Note that a 64bit PCI fix is required for Linux:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=df5be5be8735e



-- 
Alexey



Re: version 3.2 is missing from QEMU download page

2020-01-09 Thread Aleksandar Markovic
On Thursday, January 9, 2020, Aleksandar Markovic <
aleksandar.m.m...@gmail.com> wrote:

> Hi all,
>
> For some reason, QEMU download page
>
> https://download.qemu.org
>
> seems to contain tarballs of all QEMU versions/releases, except 3.2 and
> its follow-ups. This broke up some scripts that rely on presence of all
> versions on that page.
>
> Hm, sorry, I think I for some reason thought that 3.2 must have existed,
but it looks we had only  3.0 and 3.1 in 2018.

Sorry for the noise,
Alwksandar



> I don't know who is in charge of QEMU download page, but I would like to
> ask anyone with this power to fix the absence of 3.2, if possible.
>
> Thanks in advance,
> Aleksandar
>


version 3.2 is missing from QEMU download page

2020-01-09 Thread Aleksandar Markovic
Hi all,

For some reason, QEMU download page

https://download.qemu.org

seems to contain tarballs of all QEMU versions/releases, except 3.2 and its
follow-ups. This broke up some scripts that rely on presence of all
versions on that page.

I don't know who is in charge of QEMU download page, but I would like to
ask anyone with this power to fix the absence of 3.2, if possible.

Thanks in advance,
Aleksandar


Re: [kvm-unit-tests PATCH 05/10] arm: pmu: Basic event counter Tests

2020-01-09 Thread Auger Eric
Hi Andre,

On 1/7/20 1:19 PM, Andre Przywara wrote:
> On Mon, 16 Dec 2019 21:47:52 +0100
> Eric Auger  wrote:
> 
> Hi Eric,
> 
> thanks a lot for your work on these elaborate tests! I have some PMU test 
> extensions as well, but they are nowhere as sophisticated as yours!
> 
> Just ran this on my ThunderX2 desktop (4.15.0-65-generic #74-Ubuntu kernel, 
> QEMU emulator version 2.11.1(Debian 1:2.11+dfsg-1ubuntu7.21)), and it 
> reported the following fails:
> INFO: pmu: basic-event-count: counter #0 is 0x207e (CPU_CYCLES)
> INFO: pmu: basic-event-count: counter #1 is 0xc89 (INST_RETIRED)
> INFO: pmu: basic-event-count: overflow reg = 0x0
> FAIL: pmu: basic-event-count: check overflow happened on #0 only
> 
> INFO: PMU version: 4
> INFO: Implements 6 event counters
> INFO: pmu: mem-access: counter #0 is 1297 (MEM_ACCESS)
> INFO: pmu: mem-access: counter #1 is 1287 (MEM_ACCESS)
> FAIL: pmu: mem-access: Ran 20 mem accesses
> FAIL: pmu: mem-access: Ran 20 mem accesses with expected overflows on both 
> counters
> INFO: pmu: mem-access: cnt#0 = 1353 cnt#1=1259 overflow=0x0
> 
> Do you know about this? Is this due to kernel bugs? Because Ubuntu cleverly 
> chose an EOL kernel for their stable distro :-P
> Will try to have a look and repeat on a Juno.
I think this kernel version misses:
30d97754b2d1  KVM: arm/arm64: Re-create event when setting counter value

In the tests I am setting the TYPER before presetting the counter #0
value so this may be related.

You may try to set the counter #0 pmevtyper again just after the
pmevcntr #0 preset to validate the above.

write_regn(pmevcntr, 0, 0xFFF0);
write_regn(pmevtyper, 0, events[0] | PMEVTYPER_EXCLUDE_EL0);


Besides I am only testing against the master branch.

> 
> Comments inline 
> 
>> Adds the following tests:
>> - event-counter-config: test event counter configuration
>> - basic-event-count:
>>   - programs counters #0 and #1 to count 2 required events
>>   (resp. CPU_CYCLES and INST_RETIRED). Counter #0 is preset
>>   to a value close enough to the 32b
>>   overflow limit so that we check the overflow bit is set
>>   after the execution of the asm loop.
>> - mem-access: counts MEM_ACCESS event on counters #0 and #1
>>   with and without 32-bit overflow.
>>
>> Signed-off-by: Eric Auger 
>> ---
>>  arm/pmu.c | 261 ++
>>  arm/unittests.cfg |  18 
>>  2 files changed, 279 insertions(+)
>>
>> diff --git a/arm/pmu.c b/arm/pmu.c
>> index d88ef22..139dae3 100644
>> --- a/arm/pmu.c
>> +++ b/arm/pmu.c
>> @@ -18,9 +18,15 @@
>>  #include "asm/barrier.h"
>>  #include "asm/sysreg.h"
>>  #include "asm/processor.h"
>> +#include 
>> +#include 
>>  
>>  #define PMU_PMCR_E (1 << 0)
>> +#define PMU_PMCR_P (1 << 1)
>>  #define PMU_PMCR_C (1 << 2)
>> +#define PMU_PMCR_D (1 << 3)
>> +#define PMU_PMCR_X (1 << 4)
>> +#define PMU_PMCR_DP(1 << 5)
>>  #define PMU_PMCR_LC(1 << 6)
>>  #define PMU_PMCR_N_SHIFT   11
>>  #define PMU_PMCR_N_MASK0x1f
>> @@ -104,6 +110,9 @@ static inline void precise_instrs_loop(int loop, 
>> uint32_t pmcr)
>>  
>>  /* event counter tests only implemented for aarch64 */
>>  static void test_event_introspection(void) {}
>> +static void test_event_counter_config(void) {}
>> +static void test_basic_event_count(void) {}
>> +static void test_mem_access(void) {}
>>  
>>  #elif defined(__aarch64__)
>>  #define ID_AA64DFR0_PERFMON_SHIFT 8
>> @@ -145,6 +154,32 @@ static inline void precise_instrs_loop(int loop, 
>> uint32_t pmcr)
>>  }
>>  
>>  #define PMCEID1_EL0 sys_reg(11, 3, 9, 12, 7)>> +#define PMCNTENSET_EL0 
>> sys_reg(11, 3, 9, 12, 1)
>> +#define PMCNTENCLR_EL0 sys_reg(11, 3, 9, 12, 2)
> 
> op0 (the first argument) is only two bits, so it should read "3" instead of 
> "11" here. That's already a bug in the existing PMCEID1_EL0 definition. We 
> get away with it because the macro masks with 3, but it should be still 
> written correctly here. Not sure where the "11" actually comes from.
 op0 op1 CRn CRm op2
PMCEID1_EL0 11 011 1001 1100 111

binary read as dec :-(

> 
>> +
>> +#define PMEVTYPER_EXCLUDE_EL1 (1 << 31)
>> +#define PMEVTYPER_EXCLUDE_EL0 (1 << 30)
>> +
>> +#define regn_el0(__reg, __n) __reg ## __n  ## _el0
>> +#define write_regn(__reg, __n, __val) \
>> +write_sysreg((__val), __reg ## __n ## _el0)
>> +
>> +#define read_regn(__reg, __n) \
>> +read_sysreg(__reg ## __n ## _el0)
>> +
>> +#define print_pmevtyper(__s, __n) do { \
>> +uint32_t val; \
>> +val = read_regn(pmevtyper, __n);\
>> +report_info("%s pmevtyper%d=0x%x, eventcount=0x%x (p=%ld, u=%ld 
>> nsk=%ld, nsu=%ld, nsh=%ld m=%ld, mt=%ld)", \
>> +(__s), (__n), val, val & 0x,  \
>> +(BIT_MASK(31) & val) >> 31, \
>> +(BIT_MASK(30) & val) >> 30, \
>> +(BIT_MASK(29) & val) >> 29, \
>> +(BIT_MASK(28) & val) >> 28, \
>> +(BIT_MASK(27) & 

[PATCH] crypto: fix getter of a QCryptoSecret's property

2020-01-09 Thread Tong Ho
This fixes the condition-check done by the "loaded" property
getter, such that the property returns true even when the
secret is loaded by the 'file' option.

Signed-off-by: Tong Ho 
---
 Pre-existing getter returns true only when the secret is loaded
 by the 'data' option.

 crypto/secret.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/crypto/secret.c b/crypto/secret.c
index 1cf0ad0..5fb6bbe 100644
--- a/crypto/secret.c
+++ b/crypto/secret.c
@@ -221,6 +221,7 @@ qcrypto_secret_prop_set_loaded(Object *obj,
 secret->rawlen = inputlen;
 } else {
 g_free(secret->rawdata);
+secret->rawdata = NULL;
 secret->rawlen = 0;
 }
 }
@@ -231,7 +232,7 @@ qcrypto_secret_prop_get_loaded(Object *obj,
Error **errp G_GNUC_UNUSED)
 {
 QCryptoSecret *secret = QCRYPTO_SECRET(obj);
-return secret->data != NULL;
+return secret->rawdata != NULL;
 }
 
 
-- 
2.7.4




[Bug 1859081] [NEW] Mouse way too fast when Qemu is on a Windows VM with a OS 9 Guest

2020-01-09 Thread Roman Bäriswyl
Public bug reported:

On a server, I have a Windows 10 VM with Qemu 4.1.0 (latest) from 
https://qemu.weilnetz.de/w64/ installed.
There I have a Mac OS 9.2.2 machine.
Now if I connect to the Windows VM with VNC or RDP or even VMWare console, the 
Mouse in the Mac OS Guest inside Qemu is wy to fast. Even when lowering the 
mouse speed in the Mac OS mouse setting, one pixel in the Host (Windows 10 VM) 
still moves the mouse by 10 pixels inside the Qemu machine.
I tried different resolutions but that does not help.
Is there any way to fix this or any way how I can provide more information?
Thanks

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1859081

Title:
  Mouse way too fast when Qemu is on a Windows VM with a OS 9 Guest

Status in QEMU:
  New

Bug description:
  On a server, I have a Windows 10 VM with Qemu 4.1.0 (latest) from 
https://qemu.weilnetz.de/w64/ installed.
  There I have a Mac OS 9.2.2 machine.
  Now if I connect to the Windows VM with VNC or RDP or even VMWare console, 
the Mouse in the Mac OS Guest inside Qemu is wy to fast. Even when lowering 
the mouse speed in the Mac OS mouse setting, one pixel in the Host (Windows 10 
VM) still moves the mouse by 10 pixels inside the Qemu machine.
  I tried different resolutions but that does not help.
  Is there any way to fix this or any way how I can provide more information?
  Thanks

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1859081/+subscriptions



Re: [PATCH] virtio: Prevent double swap due to target pre 1.0 VirtIO

2020-01-09 Thread André Silva
Hi Greg,

Thanks for the commit info.
But I'm testing in this scenario, that is, a ppc64le host with a ppc64
BE guest, and without my patch I can't get virtio to work. The patch
makes virtio 0.95 (legacy) net, scsi, blk work. I don't get the
firmware error. I also tested with a ppc64le guest and had no problems
either. Maybe we have different firmware versions?

My firmware output:

SLOF **
QEMU Starting
 Build Date = Jul  3 2019 12:26:14
 FW Version = git-ba1ab360eebe6338
 Press "s" to enter Open Firmware.

Populating /vdevice methods
Populating /vdevice/vty@7100
Populating /vdevice/nvram@7101
Populating /vdevice/v-scsi@7102
   SCSI: Looking for devices
  8200 CD-ROM   : "QEMU QEMU CD-ROM  2.5+"
Populating /pci@8002000
 00  (D) : 1af4 1000virtio [ net ]
 00 0800 (D) : 1af4 1001virtio [ block ]
No NVRAM common partition, re-initializing...
Scanning USB
Using default console: /vdevice/vty@7100

  Welcome to Open Firmware

  Copyright (c) 2004, 2017 IBM Corporation All rights reserved.
  This program and the accompanying materials are made available
  under the terms of the BSD License available at
  http://www.opensource.org/licenses/bsd-license.php


Trying to load:  from: /pci@8002000/scsi@1 ...   Successfully loaded

>> FreeBSD/powerpc Open Firmware boot block
   Boot path:   /pci@8002000/scsi@1
   Boot loader: /boot/loader
   Boot volume:   /pci@8002000/scsi@1:2
Consoles: Open Firmware console

FreeBSD/powerpc64 Open Firmware loader, Revision 0.1
(Mon Nov 11 22:33:43 -02 2019 jenkins@FreeBSD_x86)
Memory: 4194304KB
Booted from: /pci@8002000/scsi@1

Loading /boot/defaults/loader.conf
/boot/kernel/kernel data=0x129f658+0x4aaa88 syms=[0x8+0x105120+0x8+0x125429]
...

Until now, I was able to test the patch and see virtio working on the
following systems:

  Qemu Host Guest   Guest VirtIO
    --- --
  master   Ubuntu ppc64le   FreeBSD 13.0-current ppc64 BE   legacy
  master   Ubuntu ppc64le   debian 4.19.0-6-powerpc64le modern
  master   Ubuntu ppc64le   debian 4.19.0-6-powerpc64le legacy
  master   arch x86_64  FreeBSD 13.0-current ppc64 BE   legacy

Thanks,
andré

On Thu, Jan 9, 2020 at 1:06 PM Greg Kurz  wrote:
>
> On Thu, 9 Jan 2020 07:39:17 -0500
> "Michael S. Tsirkin"  wrote:
>
> > On Thu, Jan 09, 2020 at 09:25:42AM -0300, André Silva wrote:
> > > Hi Michael!
> > > Thanks for reviewing the patch!
> > >
> > > > we always get LE values from memory subsystem,
> > > > not target endian values:
> > >
> > > I see. So do you think the patch is correct in eliminating the extra
> > > swap (as virtio_config_readw for example already makes a swap)?
> > >
> > > Thanks,
> > > andré
> >
> > I don't think it is, I think we do need an extra swap
> > in some cases. It's possible that some cross-endian
> > setups are broken now, if so pls include testing
> > result not just theoretical analysis.
> >
>
> I confirm that we must keep the extra swap otherwise
> read/write in cross-endian setups will have wrong
> endian. Please read this commit for a more detailed
> explanation:
>
> commit 82afa58641b0e67abbaf4da6c325ebd7c2513262
> Author: Benjamin Herrenschmidt 
> Date:   Tue Jan 10 01:35:11 2012 +
>
> virtio-pci: Fix endianness of virtio config
>
> https://git.qemu.org/?p=qemu.git;a=commitdiff;h=82afa58641b0e67abbaf4da6c325ebd7c2513262
>
> This is especially critical on ppc64 since _all_ hosts are now LE
> but the first piece of code in the guest that is likely to drive
> the device is the SLOF firmware which is BE.
>
> This is what we get with this patch when trying to run a pseries guest on a
> ppc64le host:
>
> Trying to load:  from: /pci@8002000/scsi@0 ... virtioblk_transfer: 
> Access beyond end of device!
>
> Cheers,
>
> --
> Greg
>
> > > On Thu, Jan 9, 2020 at 7:50 AM Michael S. Tsirkin  wrote:
> > > >
> > > > On Wed, Jan 08, 2020 at 01:16:18PM -0300, Andre Silva wrote:
> > > > > Remove the bswap function calls after reading and before writing
> > > > > memory bytes in virtio_pci_config_read and virtio_pci_config_write
> > > > > because they are reverting back an already swapped bytes.
> > > > >
> > > > > Consider the table below in the context of virtio_pci_config_read
> > > > > function.
> > > > >
> > > > > Host   Target  virtio-config-read[wl]
> > > > >swap?   virtio-is-big-endian?   extra 
> > > > > bswap?   Should be   Final result   Final result ok?
> > > > > - ---  --- 
> > > > > -- --- -- --
> > > > > LE BE  s(x)true
> > > > > s(s(x))s(x)x  No
> > > > > LE LE  x  

Re: [PATCH] target/arm/arm-semi: fix SYS_OPEN to return nonzero filehandle

2020-01-09 Thread Richard Henderson
On 1/9/20 3:12 PM, Masahiro Yamada wrote:
> According to the specification "Semihosting for AArch32 and Aarch64",
> the SYS_OPEN operation should return:
> 
>  - A nonzero handle if the call is successful
>  - -1 if the call is not successful
> 
> So, it should never return 0.
> 
> Prior to commit 35e9a0a8ce4b ("target/arm/arm-semi: Make semihosting
> code hand out its own file descriptors"), the guest fd matched to the
> host fd. It returned a nonzero handle on success since the fd 0 is
> already used for stdin.
> 
> Now that the guest fd is the index of guestfd_array, it starts from 0.
> 
> I noticed this issue particularly because Trusted Firmware-A built with
> PLAT=qemu is no longer working. Its io_semihosting driver only handles
> a positive return value as a valid filehandle.
> 
> Basically, there are two ways to fix this:
> 
>   - Use (guestfd - 1) as the index of guestfs_arrary. We need to insert
> increment/decrement to convert the guestfd and the array index back
> and forth.
> 
>   - Keep using guestfd as the index of guestfs_array. The first entry
> of guestfs_array is left unused.
> 
> I thought the latter is simpler. We end up with wasting a small piece
> of memory for the unused first entry of guestfd_array, but this is
> probably not a big deal.
> 
> Fixes: 35e9a0a8ce4b ("target/arm/arm-semi: Make semihosting code hand out its 
> own file descriptors")
> Signed-off-by: Masahiro Yamada 
> ---

Reviewed-by: Richard Henderson 


r~



Re: [PATCH v1 06/59] mips-semi.c: remove 'uhi_done' label in helper_do_semihosting()

2020-01-09 Thread Aleksandar Markovic
On Monday, January 6, 2020, Daniel Henrique Barboza 
wrote:

> The label 'uhi_done' is a simple 'return' call and can
> be removed for a bit more clarity in the code.
>
> CC: Aurelien Jarno 
> CC: Aleksandar Markovic 
> CC: Aleksandar Rikalo 
> Signed-off-by: Daniel Henrique Barboza 
> ---
>  target/mips/mips-semi.c | 15 +++
>  1 file changed, 7 insertions(+), 8 deletions(-)
>
>
Reviewed-by: Aleksandar Markovic 


> diff --git a/target/mips/mips-semi.c b/target/mips/mips-semi.c
> index 35bdfd7c77..10a710c1e8 100644
> --- a/target/mips/mips-semi.c
> +++ b/target/mips/mips-semi.c
> @@ -218,7 +218,7 @@ static int copy_argn_to_target(CPUMIPSState *env, int
> arg_num,
>  if (!p) {   \
>  gpr[2] = -1;\
>  gpr[3] = EFAULT;\
> -goto uhi_done;  \
> +return; \
>  }   \
>  } while (0)
>
> @@ -228,14 +228,14 @@ static int copy_argn_to_target(CPUMIPSState *env,
> int arg_num,
>  if (!p) {   \
>  gpr[2] = -1;\
>  gpr[3] = EFAULT;\
> -goto uhi_done;  \
> +return; \
>  }   \
>  p2 = lock_user_string(addr2);   \
>  if (!p2) {  \
>  unlock_user(p, addr, 0);\
>  gpr[2] = -1;\
>  gpr[3] = EFAULT;\
> -goto uhi_done;  \
> +return; \
>  }   \
>  } while (0)
>
> @@ -272,7 +272,7 @@ void helper_do_semihosting(CPUMIPSState *env)
>  if (gpr[4] < 3) {
>  /* ignore closing stdin/stdout/stderr */
>  gpr[2] = 0;
> -goto uhi_done;
> +return;
>  }
>  gpr[2] = close(gpr[4]);
>  gpr[3] = errno_mips(errno);
> @@ -302,7 +302,7 @@ void helper_do_semihosting(CPUMIPSState *env)
>  gpr[2] = fstat(gpr[4], );
>  gpr[3] = errno_mips(errno);
>  if (gpr[2]) {
> -goto uhi_done;
> +return;
>  }
>  gpr[2] = copy_stat_to_target(env, , gpr[5]);
>  gpr[3] = errno_mips(errno);
> @@ -314,14 +314,14 @@ void helper_do_semihosting(CPUMIPSState *env)
>  case UHI_argnlen:
>  if (gpr[4] >= semihosting_get_argc()) {
>  gpr[2] = -1;
> -goto uhi_done;
> +return;
>  }
>  gpr[2] = strlen(semihosting_get_arg(gpr[4]));
>  break;
>  case UHI_argn:
>  if (gpr[4] >= semihosting_get_argc()) {
>  gpr[2] = -1;
> -goto uhi_done;
> +return;
>  }
>  gpr[2] = copy_argn_to_target(env, gpr[4], gpr[5]);
>  break;
> @@ -369,6 +369,5 @@ void helper_do_semihosting(CPUMIPSState *env)
>  fprintf(stderr, "Unknown UHI operation %d\n", op);
>  abort();
>  }
> -uhi_done:
>  return;
>  }
> --
> 2.24.1
>
>
>


Re: [PATCH v1 00/59] trivial unneeded labels cleanup

2020-01-09 Thread Daniel Henrique Barboza




On 1/7/20 6:43 AM, Max Reitz wrote:

On 06.01.20 19:23, Daniel Henrique Barboza wrote:


[...]

 

For me, it doesn’t require any brain cycles, because I generally just
assume the cleanup label will do the right thing.  OTOH, a return
statement may make me invest some some brain cycles, because maybe there
is something to be cleaned up and it isn’t done here.


Another common case uses a variable to set a return value,
generally an error, then return:

if () {
 ret = -ENOENT;
 goto out;
}
[..]
  out:
 return ret;

Likewise, it is clearer to just 'return -ENOENT' instead of
jumping to a label. There are other cases being handled in
these patches, but these are the most common.


I find it clearer from the perspective of “less LoC”, but I find it less
clear from the perspective of “Is this the right way to clean up?”.

Even on patch 15 (which you say isn’t too much of a debate), I don’t
find the change to make things any clearer.  Just less verbose.


Fair enough. As I said in the cover, all this patches makes no functional
changes, just a clean up aiming for less LoC (and more clarity, at least
in my opinion).

I am aware all the good sides of keeping the code as is, such as being
easier to debug (although I would argue that an explicit trace_event
call is better than keeping verbose code 'just in case'), or goto usage
to keep just one return statement per function.

I am also aware that the existing QEMU code base has a mesh of styles.
What I'm proposing here isn't a 'my way is better' case by any means,
but it's not something unprecedented in the existing code base either.
Since there's no QEMU code guideline imposing that a function should
have only one 'return' statement regardless of how many 'goto' calls
are needed, or a guideline that discourages 'goto' calls regardless of
how many 'return' calls are needed, in the end it's a matter of seeing
what fits the function/code better. In the maintainers opinion, of
course.


Thanks,


DHB




I suppose none of this would matter if we used __attribute__((cleanup))
everywhere and simply never had to clean up anything manually.  But as
long as we don’t and require cleanup paths in many places, I disagree
that they require more brain cycles than plain return statements.

Max





Re: [PATCH 092/104] virtiofsd: add man page

2020-01-09 Thread Dr. David Alan Gilbert
* Daniel P. Berrangé (berra...@redhat.com) wrote:
> On Thu, Dec 12, 2019 at 04:38:52PM +, Dr. David Alan Gilbert (git) wrote:
> > From: Stefan Hajnoczi 
> > 
> > Signed-off-by: Stefan Hajnoczi 
> > ---
> >  Makefile   |  7 +++
> >  tools/virtiofsd/virtiofsd.texi | 85 ++
> >  2 files changed, 92 insertions(+)
> >  create mode 100644 tools/virtiofsd/virtiofsd.texi
> 
> Reviewed-by: Daniel P. Berrangé 

Thanks.

> with some notes at the very end



> > +@c man begin DESCRIPTION
> > +
> > +Share a host directory tree with a guest through a virtio-fs device.  This
> > +program is a vhost-user backend that implements the virtio-fs device.  Each
> > +virtio-fs device instance requires its own virtiofsd process.
> > +
> > +This program is designed to work with QEMU's @code{--device 
> > vhost-user-fs-pci}
> > +but should work with any virtual machine monitor (VMM) that supports
> > +vhost-user.  See the EXAMPLES section below.
> > +
> > +This program must be run as the root user.
> 
> So there's no way for an unprivileged user to do file sharing like they
> can with 9p right now ?

Correct.

(Which also makes it a pain for using in a make check)

> >  Upon startup the program will
> > +switch into a new file system namespace with the shared directory tree as 
> > its
> > +root.  This prevents "file system escapes" due to symlinks and other file
> > +system objects that might lead to files outside the shared directory.  The
> > +program also sandboxes itself using seccomp(2) to prevent ptrace(2) and 
> > other
> > +vectors that could allow an attacker to compromise the system after gaining
> > +control of the virtiofsd process.
> > +
> > +@c man end
> > +
> > +@c man begin OPTIONS
> > +@table @option
> > +@item -h, --help
> > +Print help.
> > +@item -V, --version
> > +Print version.
> > +@item -d, -o debug
> > +Enable debug output.
> > +@item --syslog
> > +Print log messages to syslog instead of stderr.
> > +@item -o log_level=@var{level}
> > +Print only log messages matching @var{level} or more severe.  @var{level} 
> > is
> > +one of @code{err}, @code{warn}, @code{info}, or @code{debug}.  The default 
> > is
> > +@var{info}.
> > +@item -o source=@var{path}
> > +Share host directory tree located at @var{path}.  This option is required.
> > +@item --socket-path=@var{path}, -o vhost_user_socket=@var{path}
> > +Listen on vhost-user UNIX domain socket at @var{path}.
> > +@item --fd=@var{fdnum}
> > +Accept connections from vhost-user UNIX domain socket file descriptor 
> > @var{fdnum}.  The file descriptor must already be listening for connections.
> > +@item --thread-pool-size=@var{num}
> > +Restrict the number of worker threads per request queue to @var{num}.  The 
> > default is 64.
> > +@item --cache=@code{none}|@code{auto}|@code{always}
> > +Select the desired trade-off between coherency and performance.  
> > @code{none}
> > +forbids the FUSE client from caching to achieve best coherency at the cost 
> > of
> > +performance.  @code{auto} acts similar to NFS with a 1 second metadata 
> > cache
> > +timeout.  @code{always} sets a long cache lifetime at the expense of 
> > coherency.
> > +@item --writeback
> > +Enable writeback cache, allowing the FUSE client to buffer and merge write 
> > requests.
> > +@end table
> > +@c man end
> > +
> > +@c man begin EXAMPLES
> > +Export @code{/var/lib/fs/vm001/} on vhost-user UNIX domain socket 
> > @code{/var/run/vm001-vhost-fs.sock}:
> > +
> > +@example
> > +host# virtiofsd --socket-path=/var/run/vm001-vhost-fs.sock -o 
> > source=/var/lib/fs/vm001
> > +host# qemu-system-x86_64 \
> > +-chardev socket,id=char0,path=/var/run/vm001-vhost-fs.sock \
> > +-device vhost-user-fs-pci,chardev=char0,tag=myfs \
> > +-object memory-backend-file,id=mem,size=4G,mem-path=/dev/shm,share=on \
> > +-numa node,memdev=mem \
> > +...
> > +guest# mount -t virtio_fs \
> > +-o 
> > default_permissions,allow_other,user_id=0,group_id=0,rootmode=04,dax \
> > +myfs /mnt
> > +@end example
> > +@c man end
> > +
> > +@ignore
> > +@setfilename virtiofsd
> > +@settitle QEMU virtio-fs shared file system daemon
> > +
> > +@c man begin AUTHOR
> 
> s/AUTHOR/COPYRIGHT/

OK

> since this isn't providing any author information.
> 
> > +Copyright (C) 2019 Red Hat, Inc.
> 
> 2019-2020 !

Time flies...

> And now insert
> 
>  @c man end
>  @c man begin LICENSE
> 
> > +This is free software; see the source for copying conditions.  There is NO
> > +warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
> > +@c man end
> > +@end ignore

Hmm, so it ends up like:


@c man end

@ignore
@setfilename virtiofsd
@settitle QEMU virtio-fs shared file system daemon

@c man begin COPYRIGHT
Copyright (C) 2019-2020 Red Hat, Inc.
@c man end
@c man begin LICENSE
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS 

Re: [PATCH v1 00/59] trivial unneeded labels cleanup

2020-01-09 Thread Daniel Henrique Barboza




On 1/7/20 4:06 AM, Philippe Mathieu-Daudé wrote:

On 1/7/20 7:16 AM, Kevin Wolf wrote:

Am 06.01.2020 um 21:35 hat Daniel Henrique Barboza geschrieben:

On 1/6/20 4:54 PM, Corey Minyard wrote:

On Mon, Jan 06, 2020 at 03:23:26PM -0300, Daniel Henrique Barboza wrote:

Hello,

[...]


Which is cleaner and requires less brain cycles to wonder
whether the 'cleanup' label does anything special, such
as a heap memory cleanup.


I would disagree with this analysis.  To me, I often wonder
when I have to add cleanup code to a routine whether there is
some hidden return in the middle of the function.  That's a lot
harder to spot than just looking for the cleanup label at the
end of the function to see what it does.  For non-trivial
functions I prefer to have one point of return at the end
(and maybe some minor checks with returns right at the beginning).
I'm not adamant about this, just my opinion.


It depends on the case, but yes, I had similar thoughts, at least when
we're talking about non-trivial parts of a function. (Very short
functions of just some initial checks returning directly are usually
fine.)


 From a debugging point of view, and when adding trace-events, it is easier to 
have a single function exit path.

In various functions modified by your patches, we can split big functions in 
smaller ones and avoid the goto label.



Splitting and refactoring big funcitons wasn't the initial idea of this
work, but I see no problems in doing it. Care to point out where did you
see such cases?


Thanks,


DHB



[PATCH v2 4/4] qcow2: Use BDRV_SECTOR_SIZE instead of the hardcoded value

2020-01-09 Thread Alberto Garcia
This replaces all remaining instances in the qcow2 code.

Signed-off-by: Alberto Garcia 
---
 block/qcow2-cluster.c |  2 +-
 block/qcow2.c | 10 ++
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 932fc48919..777ca2d409 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -219,7 +219,7 @@ static int l2_load(BlockDriverState *bs, uint64_t offset,
  * Writes one sector of the L1 table to the disk (can't update single entries
  * and we really don't want bdrv_pread to perform a read-modify-write)
  */
-#define L1_ENTRIES_PER_SECTOR (512 / 8)
+#define L1_ENTRIES_PER_SECTOR (BDRV_SECTOR_SIZE / 8)
 int qcow2_write_l1_entry(BlockDriverState *bs, int l1_index)
 {
 BDRVQcow2State *s = bs->opaque;
diff --git a/block/qcow2.c b/block/qcow2.c
index 783d2b9578..c0f3e715ef 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -3280,7 +3280,8 @@ qcow2_co_create(BlockdevCreateOptions *create_options, 
Error **errp)
 
 /* Validate options and set default values */
 if (!QEMU_IS_ALIGNED(qcow2_opts->size, BDRV_SECTOR_SIZE)) {
-error_setg(errp, "Image size must be a multiple of 512 bytes");
+error_setg(errp, "Image size must be a multiple of %u bytes",
+   (unsigned) BDRV_SECTOR_SIZE);
 ret = -EINVAL;
 goto out;
 }
@@ -3836,7 +3837,7 @@ qcow2_co_copy_range_from(BlockDriverState *bs,
 case QCOW2_CLUSTER_NORMAL:
 child = s->data_file;
 copy_offset += offset_into_cluster(s, src_offset);
-if ((copy_offset & 511) != 0) {
+if (!QEMU_IS_ALIGNED(copy_offset, BDRV_SECTOR_SIZE)) {
 ret = -EIO;
 goto out;
 }
@@ -3958,8 +3959,9 @@ static int coroutine_fn 
qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
 return -ENOTSUP;
 }
 
-if (offset & 511) {
-error_setg(errp, "The new size must be a multiple of 512");
+if (!QEMU_IS_ALIGNED(offset, BDRV_SECTOR_SIZE)) {
+error_setg(errp, "The new size must be a multiple of %u",
+   (unsigned) BDRV_SECTOR_SIZE);
 return -EINVAL;
 }
 
-- 
2.20.1




[PATCH v2 2/4] qcow2: Don't round the L1 table allocation up to the sector size

2020-01-09 Thread Alberto Garcia
The L1 table is read from disk using the byte-based bdrv_pread() and
is never accessed beyond its last element, so there's no need to
allocate more memory than that.

Signed-off-by: Alberto Garcia 
---
 block/qcow2-cluster.c  | 5 ++---
 block/qcow2-refcount.c | 2 +-
 block/qcow2-snapshot.c | 3 +--
 block/qcow2.c  | 2 +-
 4 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
index 8982b7b762..932fc48919 100644
--- a/block/qcow2-cluster.c
+++ b/block/qcow2-cluster.c
@@ -124,12 +124,11 @@ int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t 
min_size,
 #endif
 
 new_l1_size2 = sizeof(uint64_t) * new_l1_size;
-new_l1_table = qemu_try_blockalign(bs->file->bs,
-   ROUND_UP(new_l1_size2, 512));
+new_l1_table = qemu_try_blockalign(bs->file->bs, new_l1_size2);
 if (new_l1_table == NULL) {
 return -ENOMEM;
 }
-memset(new_l1_table, 0, ROUND_UP(new_l1_size2, 512));
+memset(new_l1_table, 0, new_l1_size2);
 
 if (s->l1_size) {
 memcpy(new_l1_table, s->l1_table, s->l1_size * sizeof(uint64_t));
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index f67ac6b2d8..c963bc8de1 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -1262,7 +1262,7 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
  * l1_table_offset when it is the current s->l1_table_offset! Be careful
  * when changing this! */
 if (l1_table_offset != s->l1_table_offset) {
-l1_table = g_try_malloc0(ROUND_UP(l1_size2, 512));
+l1_table = g_try_malloc0(l1_size2);
 if (l1_size2 && l1_table == NULL) {
 ret = -ENOMEM;
 goto fail;
diff --git a/block/qcow2-snapshot.c b/block/qcow2-snapshot.c
index 5ab64da1ec..82c32d4c9b 100644
--- a/block/qcow2-snapshot.c
+++ b/block/qcow2-snapshot.c
@@ -1024,8 +1024,7 @@ int qcow2_snapshot_load_tmp(BlockDriverState *bs,
 return ret;
 }
 new_l1_bytes = sn->l1_size * sizeof(uint64_t);
-new_l1_table = qemu_try_blockalign(bs->file->bs,
-   ROUND_UP(new_l1_bytes, 512));
+new_l1_table = qemu_try_blockalign(bs->file->bs, new_l1_bytes);
 if (new_l1_table == NULL) {
 return -ENOMEM;
 }
diff --git a/block/qcow2.c b/block/qcow2.c
index 87ca2832f0..848a6c5182 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1499,7 +1499,7 @@ static int coroutine_fn qcow2_do_open(BlockDriverState 
*bs, QDict *options,
 
 if (s->l1_size > 0) {
 s->l1_table = qemu_try_blockalign(bs->file->bs,
-ROUND_UP(s->l1_size * sizeof(uint64_t), 512));
+  s->l1_size * sizeof(uint64_t));
 if (s->l1_table == NULL) {
 error_setg(errp, "Could not allocate L1 table");
 ret = -ENOMEM;
-- 
2.20.1




[PATCH v2 0/4] qcow2: Misc BDRV_SECTOR_SIZE updates

2020-01-09 Thread Alberto Garcia
This small series gets rid of all the remaining instances of hardcoded
sector sizes in the qcow2 code and adds a check for images whose
virtual size is not a multiple of the sector size.

See the individual patches for details.

Berto

v2:
- Modify output of iotest 080 to make it easier to understand [Nir]
- Use the QEMU_IS_ALIGNED() macro instead of the modulus operator [Nir]
- Tighten some assertions [Kevin]

v1: https://lists.gnu.org/archive/html/qemu-block/2020-01/msg00139.html

Alberto Garcia (4):
  qcow2: Require that the virtual size is a multiple of the sector size
  qcow2: Don't round the L1 table allocation up to the sector size
  qcow2: Tighten cluster_offset alignment assertions
  qcow2: Use BDRV_SECTOR_SIZE instead of the hardcoded value

 block/qcow2-cluster.c  |  7 +++
 block/qcow2-refcount.c |  2 +-
 block/qcow2-snapshot.c |  3 +--
 block/qcow2.c  | 28 +---
 docs/interop/qcow2.txt |  3 ++-
 tests/qemu-iotests/080 |  8 
 tests/qemu-iotests/080.out |  5 +
 7 files changed, 37 insertions(+), 19 deletions(-)

-- 
2.20.1




[PATCH v2 3/4] qcow2: Tighten cluster_offset alignment assertions

2020-01-09 Thread Alberto Garcia
qcow2_alloc_cluster_offset() and qcow2_get_cluster_offset() always
return offsets that are cluster-aligned so don't just check that they
are sector-aligned.

The check in qcow2_co_preadv_task() is also replaced by an assertion
for the same reason.

Signed-off-by: Alberto Garcia 
---
 block/qcow2.c | 9 +++--
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 848a6c5182..783d2b9578 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2175,10 +2175,7 @@ static coroutine_fn int 
qcow2_co_preadv_task(BlockDriverState *bs,
   offset, bytes, qiov, qiov_offset);
 
 case QCOW2_CLUSTER_NORMAL:
-if ((file_cluster_offset & 511) != 0) {
-return -EIO;
-}
-
+assert(QEMU_IS_ALIGNED(file_cluster_offset, s->cluster_size));
 if (bs->encrypted) {
 return qcow2_co_preadv_encrypted(bs, file_cluster_offset,
  offset, bytes, qiov, qiov_offset);
@@ -2514,7 +2511,7 @@ static coroutine_fn int qcow2_co_pwritev_part(
 goto out_locked;
 }
 
-assert((cluster_offset & 511) == 0);
+assert(QEMU_IS_ALIGNED(cluster_offset, s->cluster_size));
 
 ret = qcow2_pre_write_overlap_check(bs, 0,
 cluster_offset + offset_in_cluster,
@@ -3904,7 +3901,7 @@ qcow2_co_copy_range_to(BlockDriverState *bs,
 goto fail;
 }
 
-assert((cluster_offset & 511) == 0);
+assert(QEMU_IS_ALIGNED(cluster_offset, s->cluster_size));
 
 ret = qcow2_pre_write_overlap_check(bs, 0,
 cluster_offset + offset_in_cluster, cur_bytes, true);
-- 
2.20.1




[PATCH v2 1/4] qcow2: Require that the virtual size is a multiple of the sector size

2020-01-09 Thread Alberto Garcia
The qcow2 header specifies the virtual size of the image in bytes, but
BlockDriverState stores it as a number of 512-byte sectors.

If the user tries to create an image with a size that is not a
multiple of the sector size then this is fixed on creation by
silently rounding the image size up (see commit c2eb918e32).
qcow2_co_truncate() is more strict and returns an error instead.

However when an image is opened the virtual size is rounded down,
which means that trying to access the last few advertised bytes will
result in an error. As seen above QEMU cannot create such images and
there's no good use case that would require us to try to handle them
so let's just treat them as unsupported.

Signed-off-by: Alberto Garcia 
---
 block/qcow2.c  | 7 +++
 docs/interop/qcow2.txt | 3 ++-
 tests/qemu-iotests/080 | 8 
 tests/qemu-iotests/080.out | 5 +
 4 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 7fbaac8457..87ca2832f0 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1326,6 +1326,13 @@ static int coroutine_fn qcow2_do_open(BlockDriverState 
*bs, QDict *options,
 goto fail;
 }
 
+if (!QEMU_IS_ALIGNED(header.size, BDRV_SECTOR_SIZE)) {
+error_setg(errp, "Virtual size is not a multiple of %u",
+   (unsigned) BDRV_SECTOR_SIZE);
+ret = -EINVAL;
+goto fail;
+}
+
 if (header.header_length > sizeof(header)) {
 s->unknown_header_fields_size = header.header_length - sizeof(header);
 s->unknown_header_fields = g_malloc(s->unknown_header_fields_size);
diff --git a/docs/interop/qcow2.txt b/docs/interop/qcow2.txt
index af5711e533..891f5662d8 100644
--- a/docs/interop/qcow2.txt
+++ b/docs/interop/qcow2.txt
@@ -40,7 +40,8 @@ The first cluster of a qcow2 image contains the file header:
 with larger cluster sizes.
 
  24 - 31:   size
-Virtual disk size in bytes.
+Virtual disk size in bytes. qemu can only handle
+sizes that are a multiple of 512 bytes.
 
 Note: qemu has an implementation limit of 32 MB as
 the maximum L1 table size.  With a 2 MB cluster
diff --git a/tests/qemu-iotests/080 b/tests/qemu-iotests/080
index 4bcb5021e8..6f136d616f 100755
--- a/tests/qemu-iotests/080
+++ b/tests/qemu-iotests/080
@@ -48,6 +48,7 @@ header_size=104
 
 offset_backing_file_offset=8
 offset_backing_file_size=16
+offset_virtual_size=24
 offset_l1_size=36
 offset_l1_table_offset=40
 offset_refcount_table_offset=48
@@ -197,6 +198,13 @@ poke_file "$TEST_IMG" "$offset_snap1_l1_size" 
"\x10\x00\x00\x00"
 { $QEMU_IMG snapshot -d test $TEST_IMG; } 2>&1 | _filter_testdir
 _check_test_img
 
+echo
+echo "== Image size not a multiple of the sector size =="
+_make_test_img 64k
+echo "Modifying virtual size to 65535 bytes"
+poke_file "$TEST_IMG" "$offset_virtual_size" "\x00\x00\x00\x00\x00\x00\xff\xff"
+{ $QEMU_IMG info $TEST_IMG; } 2>&1 | _filter_testdir | _filter_imgfmt
+
 # success, all done
 echo "*** done"
 rm -f $seq.full
diff --git a/tests/qemu-iotests/080.out b/tests/qemu-iotests/080.out
index 45ab01db8e..aadc817339 100644
--- a/tests/qemu-iotests/080.out
+++ b/tests/qemu-iotests/080.out
@@ -104,4 +104,9 @@ Data may be corrupted, or further writes to the image may 
corrupt it.
 
 3 leaked clusters were found on the image.
 This means waste of disk space, but no harm to data.
+
+== Image size not a multiple of the sector size ==
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=65536
+Modifying virtual size to 65535 bytes
+qemu-img: Could not open 'TEST_DIR/t.IMGFMT': Virtual size is not a multiple 
of 512
 *** done
-- 
2.20.1




Re: [PATCH v2 3/9] hw/9pfs/9p-synth: added directory for readdir test

2020-01-09 Thread Greg Kurz
On Wed, 18 Dec 2019 14:23:48 +0100
Christian Schoenebeck  wrote:

> This will provide the following virtual files by the 9pfs
> synth driver:
> 
>   - /ReadDirDir/ReadDirFile99
>   - /ReadDirDir/ReadDirFile98
>   ...
>   - /ReadDirDir/ReadDirFile1
>   - /ReadDirDir/ReadDirFile0
> 
> This virtual directory and its virtual 100 files will be
> used by the upcoming 9pfs readdir tests.
> 
> Signed-off-by: Christian Schoenebeck 
> ---

LGTM

Reviewed-by: Greg Kurz 

>  hw/9pfs/9p-synth.c | 19 +++
>  hw/9pfs/9p-synth.h |  5 +
>  2 files changed, 24 insertions(+)
> 
> diff --git a/hw/9pfs/9p-synth.c b/hw/9pfs/9p-synth.c
> index 54239c9bbf..7eb210ffa8 100644
> --- a/hw/9pfs/9p-synth.c
> +++ b/hw/9pfs/9p-synth.c
> @@ -578,6 +578,25 @@ static int synth_init(FsContext *ctx, Error **errp)
> NULL, v9fs_synth_qtest_flush_write,
> ctx);
>  assert(!ret);
> +
> +/* Directory for READDIR test */
> +{
> +V9fsSynthNode *dir = NULL;
> +ret = qemu_v9fs_synth_mkdir(
> +NULL, 0700, QTEST_V9FS_SYNTH_READDIR_DIR, 
> +);
> +assert(!ret);
> +for (i = 0; i < QTEST_V9FS_SYNTH_READDIR_NFILES; ++i) {
> +char *name = g_strdup_printf(
> +QTEST_V9FS_SYNTH_READDIR_FILE, i
> +);
> +ret = qemu_v9fs_synth_add_file(
> +dir, 0, name, NULL, NULL, ctx
> +);
> +assert(!ret);
> +g_free(name);
> +}
> +}
>  }
>  
>  return 0;
> diff --git a/hw/9pfs/9p-synth.h b/hw/9pfs/9p-synth.h
> index af7a993a1e..036d7e4a5b 100644
> --- a/hw/9pfs/9p-synth.h
> +++ b/hw/9pfs/9p-synth.h
> @@ -55,6 +55,11 @@ int qemu_v9fs_synth_add_file(V9fsSynthNode *parent, int 
> mode,
>  #define QTEST_V9FS_SYNTH_LOPEN_FILE "LOPEN"
>  #define QTEST_V9FS_SYNTH_WRITE_FILE "WRITE"
>  
> +/* for READDIR test */
> +#define QTEST_V9FS_SYNTH_READDIR_DIR "ReadDirDir"
> +#define QTEST_V9FS_SYNTH_READDIR_FILE "ReadDirFile%d"
> +#define QTEST_V9FS_SYNTH_READDIR_NFILES 100
> +
>  /* Any write to the "FLUSH" file is handled one byte at a time by the
>   * backend. If the byte is zero, the backend returns success (ie, 1),
>   * otherwise it forces the server to try again forever. Thus allowing




Re: qemu-4.0.1: vhost_region_add_section:Section rounded to 0 prior to previous a0000

2020-01-09 Thread Dr. David Alan Gilbert
* Peter Lieven (p...@kamp.de) wrote:
> Am 08.01.20 um 16:04 schrieb Dr. David Alan Gilbert:
> > * Peter Lieven (p...@kamp.de) wrote:
> > > Hi,
> > > 
> > > 
> > > I have a Qemu 4.0.1 machine with vhost-net network adapter, thats 
> > > polluting the log with the above message.
> > > 
> > > Is this something known? Googling revealed the following patch in Nemu 
> > > (with seems to be a Qemu fork from Intel):
> > > 
> > > https://github.com/intel/nemu/commit/03940ded7f5370ce7492c619dccced114ef7f56e
> > > 
> > > 
> > > The network stopped functioning. After a live-migration the vServer is 
> > > reachable again.
> > > 
> > > 
> > > Any ideas?
> > What guest are you running and what does your qemu commandline look
> > like?
> 
> 
> Its running debian9. We have hundreds of other VMs with identical setup. Do 
> not know why this one makes trouble.

Could you extract an 'info mtree' from it - particularly the
'address-space: memory' near the top.


> Here is the cmdline:
> 
> 
> /usr/bin/qemu-4.0.0  -enable-kvm  -M pc-i440fx-2.9  -nodefaults -netdev
> type=tap,id=guest0,script=no,downscript=no,ifname=tap0,vhost=on,vnet_hdr=on
> -device virtio-net-pci,netdev=guest0,mac=52:54:00:80:07:bc -iscsi
> initiator-name=iqn.2005-03.org.virtual-core:0025b51f006c,timeout=30 -object
> rng-random,filename=/dev/urandom,id=rng0 -device
> virtio-rng-pci,rng=rng0,max-bytes=65536,period=1000  -drive 
> format=raw,discard=on,detect-zeroes=off,file=XXX,if=none,cache=writeback,aio=native,id=disk0
> -device virtio-scsi-pci  -device scsi-hd,drive=disk0  -serial null 
> -parallel null  -m 1024,slots=16,maxmem=393216M  -smp
> 1,sockets=64,cores=1,threads=1,maxcpus=64  -monitor
> tcp:0:4001,server,nowait,nodelay -vnc :1 -qmp
> tcp:0:3001,server,nowait,nodelay  -name 'debian9'  -drive
> index=2,media=cdrom,if=ide,aio=native,readonly=on  -boot order=c,menu=off 
> -k de  -pidfile /var/run/qemu/vm-5182.pid -mem-path /hugepages 
> -mem-prealloc  -cpu Westmere,+pcid,enforce -rtc base=utc -usb -device
> usb-tablet -no-hpet -vga vmware -chardev
> socket,host=127.0.0.1,port=2001,id=qga0,server,nowait,nodelay -device
> virtio-serial -device
> virtserialport,chardev=qga0,name=org.qemu.guest_agent.0

All pretty boring stuff. hmm.

Dave

> 
> Thanks,
> 
> Peter
> 
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




[PATCH 1/4] qapi: Add a 'coroutine' flag for commands

2020-01-09 Thread Kevin Wolf
This patch adds a new 'coroutine' flag to QMP command definitions that
tells the QMP dispatcher than the command handler is safe to be run in a
coroutine.

Signed-off-by: Kevin Wolf 
---
 tests/qapi-schema/qapi-schema-test.json |  1 +
 docs/devel/qapi-code-gen.txt|  4 
 include/qapi/qmp/dispatch.h |  1 +
 tests/test-qmp-cmds.c   |  4 
 scripts/qapi/commands.py| 17 +++--
 scripts/qapi/doc.py |  2 +-
 scripts/qapi/expr.py|  4 ++--
 scripts/qapi/introspect.py  |  2 +-
 scripts/qapi/schema.py  |  9 ++---
 tests/qapi-schema/qapi-schema-test.out  |  2 ++
 tests/qapi-schema/test-qapi.py  |  7 ---
 11 files changed, 37 insertions(+), 16 deletions(-)

diff --git a/tests/qapi-schema/qapi-schema-test.json 
b/tests/qapi-schema/qapi-schema-test.json
index 9abf175fe0..55f596dbaa 100644
--- a/tests/qapi-schema/qapi-schema-test.json
+++ b/tests/qapi-schema/qapi-schema-test.json
@@ -147,6 +147,7 @@
   'returns': 'UserDefTwo' }
 
 { 'command': 'cmd-success-response', 'data': {}, 'success-response': false }
+{ 'command': 'coroutine_cmd', 'data': {}, 'coroutine': true }
 
 # Returning a non-dictionary requires a name from the whitelist
 { 'command': 'guest-get-time', 'data': {'a': 'int', '*b': 'int' },
diff --git a/docs/devel/qapi-code-gen.txt b/docs/devel/qapi-code-gen.txt
index 45c93a43cc..753f6711d3 100644
--- a/docs/devel/qapi-code-gen.txt
+++ b/docs/devel/qapi-code-gen.txt
@@ -457,6 +457,7 @@ Syntax:
 '*gen': false,
 '*allow-oob': true,
 '*allow-preconfig': true,
+'*coroutine': true,
 '*if': COND,
 '*features': FEATURES }
 
@@ -581,6 +582,9 @@ before the machine is built.  It defaults to false.  For 
example:
 QMP is available before the machine is built only when QEMU was
 started with --preconfig.
 
+Member 'coroutine' tells the QMP dispatcher whether the command handler
+is safe to be run in a coroutine. It defaults to false.
+
 The optional 'if' member specifies a conditional.  See "Configuring
 the schema" below for more on this.
 
diff --git a/include/qapi/qmp/dispatch.h b/include/qapi/qmp/dispatch.h
index 9aa426a398..d6ce9efc8e 100644
--- a/include/qapi/qmp/dispatch.h
+++ b/include/qapi/qmp/dispatch.h
@@ -24,6 +24,7 @@ typedef enum QmpCommandOptions
 QCO_NO_SUCCESS_RESP   =  (1U << 0),
 QCO_ALLOW_OOB =  (1U << 1),
 QCO_ALLOW_PRECONFIG   =  (1U << 2),
+QCO_COROUTINE =  (1U << 3),
 } QmpCommandOptions;
 
 typedef struct QmpCommand
diff --git a/tests/test-qmp-cmds.c b/tests/test-qmp-cmds.c
index 27b0afe55a..e2f71e42a1 100644
--- a/tests/test-qmp-cmds.c
+++ b/tests/test-qmp-cmds.c
@@ -34,6 +34,10 @@ void qmp_cmd_success_response(Error **errp)
 {
 }
 
+void qmp_coroutine_cmd(Error **errp)
+{
+}
+
 Empty2 *qmp_user_def_cmd0(Error **errp)
 {
 return g_new0(Empty2, 1);
diff --git a/scripts/qapi/commands.py b/scripts/qapi/commands.py
index ab98e504f3..97e51401f1 100644
--- a/scripts/qapi/commands.py
+++ b/scripts/qapi/commands.py
@@ -15,6 +15,7 @@ See the COPYING file in the top-level directory.
 
 from qapi.common import *
 from qapi.gen import QAPIGenCCode, QAPISchemaModularCVisitor, ifcontext
+from typing import List
 
 
 def gen_command_decl(name, arg_type, boxed, ret_type):
@@ -194,8 +195,9 @@ out:
 return ret
 
 
-def gen_register_command(name, success_response, allow_oob, allow_preconfig):
-options = []
+def gen_register_command(name: str, success_response: bool, allow_oob: bool,
+ allow_preconfig: bool, coroutine: bool) -> str:
+options = [] # type: List[str]
 
 if not success_response:
 options += ['QCO_NO_SUCCESS_RESP']
@@ -203,18 +205,20 @@ def gen_register_command(name, success_response, 
allow_oob, allow_preconfig):
 options += ['QCO_ALLOW_OOB']
 if allow_preconfig:
 options += ['QCO_ALLOW_PRECONFIG']
+if coroutine:
+options += ['QCO_COROUTINE']
 
 if not options:
 options = ['QCO_NO_OPTIONS']
 
-options = " | ".join(options)
+options_str = " | ".join(options)
 
 ret = mcgen('''
 qmp_register_command(cmds, "%(name)s",
  qmp_marshal_%(c_name)s, %(opts)s);
 ''',
 name=name, c_name=c_name(name),
-opts=options)
+opts=options_str)
 return ret
 
 
@@ -278,7 +282,7 @@ void %(c_prefix)sqmp_init_marshal(QmpCommandList *cmds);
 
 def visit_command(self, name, info, ifcond, arg_type, ret_type, gen,
   success_response, boxed, allow_oob, allow_preconfig,
-  features):
+  coroutine, features):
 if not gen:
 return
 # FIXME: If T is a user-defined type, the user is responsible
@@ -296,7 +300,8 @@ void %(c_prefix)sqmp_init_marshal(QmpCommandList 

[PATCH 0/4] qmp: Optionally run handlers in coroutines

2020-01-09 Thread Kevin Wolf
Some QMP command handlers can block the main loop for a relatively long
time, for example because they perform some I/O. This is quite nasty.
Allowing such handlers to run in a coroutine where they can yield (and
therefore release the BQL) while waiting for an event such as I/O
completion solves the problem.

This series adds the infrastructure to allow this and switches
block_resize to run in a coroutine as a first example.

This is an alternative solution to Marc-André's "monitor: add
asynchronous command type" series.

Kevin Wolf (4):
  qapi: Add a 'coroutine' flag for commands
  block: Mark 'block_resize' as coroutine
  vl: Initialise main loop earlier
  qmp: Move dispatcher to a coroutine

 qapi/block-core.json|  3 +-
 tests/qapi-schema/qapi-schema-test.json |  1 +
 docs/devel/qapi-code-gen.txt|  4 ++
 include/qapi/qmp/dispatch.h |  3 +
 monitor/monitor-internal.h  |  5 +-
 monitor/monitor.c   | 24 ---
 monitor/qmp.c   | 83 -
 qapi/qmp-dispatch.c | 38 ++-
 tests/test-qmp-cmds.c   |  4 ++
 vl.c| 10 +--
 scripts/qapi/commands.py| 17 +++--
 scripts/qapi/doc.py |  2 +-
 scripts/qapi/expr.py|  4 +-
 scripts/qapi/introspect.py  |  2 +-
 scripts/qapi/schema.py  |  9 ++-
 tests/qapi-schema/qapi-schema-test.out  |  2 +
 tests/qapi-schema/test-qapi.py  |  7 ++-
 17 files changed, 155 insertions(+), 63 deletions(-)

-- 
2.20.1




[PATCH 4/4] qmp: Move dispatcher to a coroutine

2020-01-09 Thread Kevin Wolf
This moves the QMP dispatcher to a coroutine and runs all QMP command
handlers that declare 'coroutine': true in coroutine context so they
can avoid blocking the main loop while doing I/O or waiting for other
events.

For commands that are not declared safe to run in a coroutine, the
dispatcher drops out of coroutine context by calling the QMP command
handler from a bottom half.

Signed-off-by: Kevin Wolf 
---
 include/qapi/qmp/dispatch.h |  2 +
 monitor/monitor-internal.h  |  5 ++-
 monitor/monitor.c   | 24 +++
 monitor/qmp.c   | 83 +++--
 qapi/qmp-dispatch.c | 38 -
 5 files changed, 111 insertions(+), 41 deletions(-)

diff --git a/include/qapi/qmp/dispatch.h b/include/qapi/qmp/dispatch.h
index d6ce9efc8e..d6d5443391 100644
--- a/include/qapi/qmp/dispatch.h
+++ b/include/qapi/qmp/dispatch.h
@@ -30,6 +30,8 @@ typedef enum QmpCommandOptions
 typedef struct QmpCommand
 {
 const char *name;
+/* Runs in coroutine context if QCO_COROUTINE is set, except for OOB
+ * commands */
 QmpCommandFunc *fn;
 QmpCommandOptions options;
 QTAILQ_ENTRY(QmpCommand) node;
diff --git a/monitor/monitor-internal.h b/monitor/monitor-internal.h
index d78f5ca190..7389b6a56c 100644
--- a/monitor/monitor-internal.h
+++ b/monitor/monitor-internal.h
@@ -154,7 +154,8 @@ static inline bool monitor_is_qmp(const Monitor *mon)
 
 typedef QTAILQ_HEAD(MonitorList, Monitor) MonitorList;
 extern IOThread *mon_iothread;
-extern QEMUBH *qmp_dispatcher_bh;
+extern Coroutine *qmp_dispatcher_co;
+extern bool qmp_dispatcher_co_busy;
 extern QmpCommandList qmp_commands, qmp_cap_negotiation_commands;
 extern QemuMutex monitor_lock;
 extern MonitorList mon_list;
@@ -172,7 +173,7 @@ void monitor_fdsets_cleanup(void);
 
 void qmp_send_response(MonitorQMP *mon, const QDict *rsp);
 void monitor_data_destroy_qmp(MonitorQMP *mon);
-void monitor_qmp_bh_dispatcher(void *data);
+void coroutine_fn monitor_qmp_dispatcher_co(void *data);
 
 int get_monitor_def(int64_t *pval, const char *name);
 void help_cmd(Monitor *mon, const char *name);
diff --git a/monitor/monitor.c b/monitor/monitor.c
index 12898b6448..c72763fa4e 100644
--- a/monitor/monitor.c
+++ b/monitor/monitor.c
@@ -53,8 +53,9 @@ typedef struct {
 /* Shared monitor I/O thread */
 IOThread *mon_iothread;
 
-/* Bottom half to dispatch the requests received from I/O thread */
-QEMUBH *qmp_dispatcher_bh;
+/* Coroutine to dispatch the requests received from I/O thread */
+Coroutine *qmp_dispatcher_co;
+bool qmp_dispatcher_co_busy;
 
 /* Protects mon_list, monitor_qapi_event_state, monitor_destroyed.  */
 QemuMutex monitor_lock;
@@ -579,9 +580,16 @@ void monitor_cleanup(void)
 }
 qemu_mutex_unlock(_lock);
 
-/* QEMUBHs needs to be deleted before destroying the I/O thread */
-qemu_bh_delete(qmp_dispatcher_bh);
-qmp_dispatcher_bh = NULL;
+/* The dispatcher needs to stop before destroying the I/O thread */
+if (!atomic_mb_read(_dispatcher_co_busy)) {
+aio_co_schedule(iohandler_get_aio_context(), qmp_dispatcher_co);
+qmp_dispatcher_co = NULL;
+}
+
+AIO_WAIT_WHILE(qemu_get_aio_context(),
+   (aio_bh_poll(iohandler_get_aio_context()),
+atomic_mb_read(_dispatcher_co_busy)));
+
 if (mon_iothread) {
 iothread_destroy(mon_iothread);
 mon_iothread = NULL;
@@ -604,9 +612,9 @@ void monitor_init_globals_core(void)
  * have commands assuming that context.  It would be nice to get
  * rid of those assumptions.
  */
-qmp_dispatcher_bh = aio_bh_new(iohandler_get_aio_context(),
-   monitor_qmp_bh_dispatcher,
-   NULL);
+qmp_dispatcher_co = qemu_coroutine_create(monitor_qmp_dispatcher_co, NULL);
+atomic_mb_set(_dispatcher_co_busy, true);
+aio_co_schedule(iohandler_get_aio_context(), qmp_dispatcher_co);
 }
 
 QemuOptsList qemu_mon_opts = {
diff --git a/monitor/qmp.c b/monitor/qmp.c
index b67a8e7d1f..9fd66c7b97 100644
--- a/monitor/qmp.c
+++ b/monitor/qmp.c
@@ -133,6 +133,8 @@ static void monitor_qmp_respond(MonitorQMP *mon, QDict *rsp)
 }
 }
 
+/* Runs outside of coroutine context for OOB commands, but in coroutine context
+ * for everything else. */
 static void monitor_qmp_dispatch(MonitorQMP *mon, QObject *req)
 {
 Monitor *old_mon;
@@ -211,43 +213,62 @@ static QMPRequest 
*monitor_qmp_requests_pop_any_with_lock(void)
 return req_obj;
 }
 
-void monitor_qmp_bh_dispatcher(void *data)
+void coroutine_fn monitor_qmp_dispatcher_co(void *data)
 {
-QMPRequest *req_obj = monitor_qmp_requests_pop_any_with_lock();
+QMPRequest *req_obj = NULL;
 QDict *rsp;
 bool need_resume;
 MonitorQMP *mon;
 
-if (!req_obj) {
-return;
-}
+while (true) {
+assert(atomic_mb_read(_dispatcher_co_busy) == true);
+
+while (!(req_obj = monitor_qmp_requests_pop_any_with_lock())) {
+   

[PATCH 3/4] vl: Initialise main loop earlier

2020-01-09 Thread Kevin Wolf
We want to be able to use qemu_aio_context in the monitor
initialisation.

Signed-off-by: Kevin Wolf 
---
 vl.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/vl.c b/vl.c
index 86474a55c9..4c79a00857 100644
--- a/vl.c
+++ b/vl.c
@@ -2903,6 +2903,11 @@ int main(int argc, char **argv, char **envp)
 runstate_init();
 precopy_infrastructure_init();
 postcopy_infrastructure_init();
+
+if (qemu_init_main_loop(_loop_err)) {
+error_report_err(main_loop_err);
+exit(1);
+}
 monitor_init_globals();
 
 if (qcrypto_init() < 0) {
@@ -3817,11 +3822,6 @@ int main(int argc, char **argv, char **envp)
 qemu_unlink_pidfile_notifier.notify = qemu_unlink_pidfile;
 qemu_add_exit_notifier(_unlink_pidfile_notifier);
 
-if (qemu_init_main_loop(_loop_err)) {
-error_report_err(main_loop_err);
-exit(1);
-}
-
 #ifdef CONFIG_SECCOMP
 olist = qemu_find_opts_err("sandbox", NULL);
 if (olist) {
-- 
2.20.1




[PATCH 2/4] block: Mark 'block_resize' as coroutine

2020-01-09 Thread Kevin Wolf
block_resize is safe to run in a coroutine, so use it as an example for
the new 'coroutine': true annotation in the QAPI schema.

Signed-off-by: Kevin Wolf 
---
 qapi/block-core.json | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 7ff5e5edaf..1dbb2a9901 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1341,7 +1341,8 @@
 { 'command': 'block_resize',
   'data': { '*device': 'str',
 '*node-name': 'str',
-'size': 'int' } }
+'size': 'int' },
+  'coroutine': true }
 
 ##
 # @NewImageMode:
-- 
2.20.1




Re: qemu-4.0.1: vhost_region_add_section:Section rounded to 0 prior to previous a0000

2020-01-09 Thread Peter Lieven

Am 08.01.20 um 16:04 schrieb Dr. David Alan Gilbert:

* Peter Lieven (p...@kamp.de) wrote:

Hi,


I have a Qemu 4.0.1 machine with vhost-net network adapter, thats polluting the 
log with the above message.

Is this something known? Googling revealed the following patch in Nemu (with 
seems to be a Qemu fork from Intel):

https://github.com/intel/nemu/commit/03940ded7f5370ce7492c619dccced114ef7f56e


The network stopped functioning. After a live-migration the vServer is 
reachable again.


Any ideas?

What guest are you running and what does your qemu commandline look
like?



Its running debian9. We have hundreds of other VMs with identical setup. Do not 
know why this one makes trouble.

Here is the cmdline:


/usr/bin/qemu-4.0.0  -enable-kvm  -M pc-i440fx-2.9  -nodefaults -netdev type=tap,id=guest0,script=no,downscript=no,ifname=tap0,vhost=on,vnet_hdr=on -device virtio-net-pci,netdev=guest0,mac=52:54:00:80:07:bc -iscsi 
initiator-name=iqn.2005-03.org.virtual-core:0025b51f006c,timeout=30 -object rng-random,filename=/dev/urandom,id=rng0 -device virtio-rng-pci,rng=rng0,max-bytes=65536,period=1000  -drive 
format=raw,discard=on,detect-zeroes=off,file=XXX,if=none,cache=writeback,aio=native,id=disk0 -device virtio-scsi-pci  -device scsi-hd,drive=disk0  -serial null  -parallel null  -m 1024,slots=16,maxmem=393216M  -smp 
1,sockets=64,cores=1,threads=1,maxcpus=64  -monitor tcp:0:4001,server,nowait,nodelay -vnc :1 -qmp tcp:0:3001,server,nowait,nodelay  -name 'debian9'  -drive index=2,media=cdrom,if=ide,aio=native,readonly=on  -boot order=c,menu=off  -k de  -pidfile 
/var/run/qemu/vm-5182.pid -mem-path /hugepages  -mem-prealloc  -cpu Westmere,+pcid,enforce -rtc base=utc -usb -device usb-tablet -no-hpet -vga vmware -chardev socket,host=127.0.0.1,port=2001,id=qga0,server,nowait,nodelay -device virtio-serial -device 
virtserialport,chardev=qga0,name=org.qemu.guest_agent.0



Thanks,

Peter





Re: [PATCH 04/15] hw/ppc/spapr_rtas: Restrict variables scope to single switch case

2020-01-09 Thread Greg Kurz
On Thu,  9 Jan 2020 16:21:22 +0100
Philippe Mathieu-Daudé  wrote:

> We only access these variables in RTAS_SYSPARM_SPLPAR_CHARACTERISTICS
> case, restrict their scope to avoid unnecessary initialization.
> 

I guess a decent compiler can be smart enough detect that the initialization
isn't needed outside of the RTAS_SYSPARM_SPLPAR_CHARACTERISTICS branch...
Anyway, reducing scope isn't bad. The only hitch I could see is that some
people do prefer to have all variables declared upfront, but there's a nested
param_val variable already so I guess it's okay.

> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  hw/ppc/spapr_rtas.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
> index 6f06e9d7fe..7237e5ebf2 100644
> --- a/hw/ppc/spapr_rtas.c
> +++ b/hw/ppc/spapr_rtas.c
> @@ -267,8 +267,6 @@ static void rtas_ibm_get_system_parameter(PowerPCCPU *cpu,
>uint32_t nret, target_ulong rets)
>  {
>  PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
> -MachineState *ms = MACHINE(spapr);
> -unsigned int max_cpus = ms->smp.max_cpus;
>  target_ulong parameter = rtas_ld(args, 0);
>  target_ulong buffer = rtas_ld(args, 1);
>  target_ulong length = rtas_ld(args, 2);
> @@ -276,6 +274,8 @@ static void rtas_ibm_get_system_parameter(PowerPCCPU *cpu,
>  
>  switch (parameter) {
>  case RTAS_SYSPARM_SPLPAR_CHARACTERISTICS: {
> +MachineState *ms = MACHINE(spapr);
> +unsigned int max_cpus = ms->smp.max_cpus;

The max_cpus variable used to be a global. Now that it got moved
below ms->smp, I'm not sure it's worth keeping it IMHO. What about
dropping it completely and do:

char *param_val = g_strdup_printf("MaxEntCap=%d,"
  "DesMem=%" PRIu64 ","
  "DesProcs=%d,"
  "MaxPlatProcs=%d",
  ms->smp.max_cpus,
  current_machine->ram_size / MiB,
  ms->smp.cpus,
  ms->smp.max_cpus);

And maybe insert an empty line between the declaration of param_val
and the code for a better readability ?

>  char *param_val = g_strdup_printf("MaxEntCap=%d,"
>"DesMem=%" PRIu64 ","
>"DesProcs=%d,"




Re: [PATCH 080/104] virtiofsd: add helper for lo_data cleanup

2020-01-09 Thread Dr. David Alan Gilbert
* Daniel P. Berrangé (berra...@redhat.com) wrote:
> On Thu, Dec 12, 2019 at 04:38:40PM +, Dr. David Alan Gilbert (git) wrote:
> > From: Liu Bo 
> > 
> > This offers an helper function for lo_data's cleanup.
> > 
> > Signed-off-by: Liu Bo 
> > ---
> >  tools/virtiofsd/passthrough_ll.c | 37 ++--
> >  1 file changed, 21 insertions(+), 16 deletions(-)
> 
> Reviewed-by: Daniel P. Berrangé 

Thanks.

> 
> > 
> > diff --git a/tools/virtiofsd/passthrough_ll.c 
> > b/tools/virtiofsd/passthrough_ll.c
> > index 45cf466178..097033aa00 100644
> > --- a/tools/virtiofsd/passthrough_ll.c
> > +++ b/tools/virtiofsd/passthrough_ll.c
> > @@ -2439,6 +2439,26 @@ static gboolean lo_key_equal(gconstpointer a, 
> > gconstpointer b)
> >  return la->ino == lb->ino && la->dev == lb->dev;
> >  }
> >  
> > +static void fuse_lo_data_cleanup(struct lo_data *lo)
> > +{
> > +if (lo->inodes) {
> > +g_hash_table_destroy(lo->inodes);
> > +}
> > +lo_map_destroy(>fd_map);
> > +lo_map_destroy(>dirp_map);
> > +lo_map_destroy(>ino_map);
> > +
> > +if (lo->proc_self_fd >= 0) {
> > +close(lo->proc_self_fd);
> > +}
> > +
> > +if (lo->root.fd >= 0) {
> > +close(lo->root.fd);
> > +}
> > +
> > +free((char *)lo->source);
> 
> This will need changing if you follow my comment on prev patch about
> removing the const & cast

Done.

> 
> > +}
> > +
> >  int main(int argc, char *argv[])
> >  {
> >  struct fuse_args args = FUSE_ARGS_INIT(argc, argv);
> > @@ -2594,22 +2614,7 @@ err_out2:
> >  err_out1:
> >  fuse_opt_free_args();
> >  
> > -if (lo.inodes) {
> > -g_hash_table_destroy(lo.inodes);
> > -}
> > -lo_map_destroy(_map);
> > -lo_map_destroy(_map);
> > -lo_map_destroy(_map);
> > -
> > -if (lo.proc_self_fd >= 0) {
> > -close(lo.proc_self_fd);
> > -}
> > -
> > -if (lo.root.fd >= 0) {
> > -close(lo.root.fd);
> > -}
> > -
> > -free((char *)lo.source);
> > +fuse_lo_data_cleanup();
> >  
> >  return ret ? 1 : 0;
> >  }
> > -- 
> > 2.23.0
> > 
> > 
> 
> Regards,
> Daniel
> -- 
> |: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org -o-https://fstop138.berrange.com :|
> |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




Re: [PATCH 079/104] virtiofsd: fix memory leak on lo.source

2020-01-09 Thread Dr. David Alan Gilbert
* Daniel P. Berrangé (berra...@redhat.com) wrote:
> On Thu, Dec 12, 2019 at 04:38:39PM +, Dr. David Alan Gilbert (git) wrote:
> > From: Liu Bo 
> > 
> > valgrind reported that lo.source is leaked on quiting, but it was defined
> > as (const char*) as it may point to a const string "/".
> > 
> > Signed-off-by: Liu Bo 
> > ---
> >  tools/virtiofsd/passthrough_ll.c | 5 +++--
> >  1 file changed, 3 insertions(+), 2 deletions(-)
> > 
> > diff --git a/tools/virtiofsd/passthrough_ll.c 
> > b/tools/virtiofsd/passthrough_ll.c
> > index 33092de65a..45cf466178 100644
> > --- a/tools/virtiofsd/passthrough_ll.c
> > +++ b/tools/virtiofsd/passthrough_ll.c
> > @@ -2529,9 +2529,8 @@ int main(int argc, char *argv[])
> >  fuse_log(FUSE_LOG_ERR, "source is not a directory\n");
> >  exit(1);
> >  }
> > -
> >  } else {
> > -lo.source = "/";
> > +lo.source = strdup("/");
> >  }
> >  lo.root.is_symlink = false;
> >  if (!lo.timeout_set) {
> > @@ -2610,5 +2609,7 @@ err_out1:
> >  close(lo.root.fd);
> >  }
> >  
> > +free((char *)lo.source);
> 
> Can we not change the 'lo_data' struct so that source is not const
> and thus avoid free'ing a const field ?

Done.  Made that free(lo.source) and dropped the const.

Dave


> 
> Regards,
> Daniel
> -- 
> |: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org -o-https://fstop138.berrange.com :|
> |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




[Bug 1850000] Re: 4.1.0 bogus QCOW2 corruption reported after compress

2020-01-09 Thread Thomas Huth
Fix has been included in QEMU v4.2:
https://git.qemu.org/?p=qemu.git;a=commitdiff;h=24552feb6ae2f615b76c2

** Changed in: qemu
   Status: New => Fix Released

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/185

Title:
  4.1.0 bogus QCOW2 corruption reported after compress

Status in QEMU:
  Fix Released

Bug description:
  Creating a compressed image then running `qemu-img check <..>.qcow2'
  on said image seems to report bogus corruption in some (but not all)
  cases:

  Step 1.

  # qemu-img info win7-base.qcow2
  image: win7-base.qcow2
  file format: qcow2
  virtual size: 20 GiB (21474836480 bytes)
  disk size: 12.2 GiB
  cluster_size: 65536
  Format specific information:
  compat: 1.1
  lazy refcounts: true
  refcount bits: 16
  corrupt: false

  # qemu-img check win7-base.qcow2
  No errors were found on the image.
  327680/327680 = 100.00% allocated, 0.00% fragmented, 0.00% compressed clusters
  Image end offset: 21478375424

  Step 2.

  # qemu-img convert -f qcow2 -O qcow2 -c win7-base.qcow2 test1-z.qcow2

  Step 3.

  # qemu-img info test1-z.qcow2
  image: test1-z.qcow2
  file format: qcow2
  virtual size: 20 GiB (21474836480 bytes)
  disk size: 5.78 GiB
  cluster_size: 65536
  Format specific information:
  compat: 1.1
  lazy refcounts: false
  refcount bits: 16
  corrupt: false

  # qemu-img check test1-z.qcow2
  ERROR cluster 1191 refcount=1 reference=2
  ERROR cluster 1194 refcount=1 reference=4
  ERROR cluster 1195 refcount=1 reference=7
  ERROR cluster 1196 refcount=1 reference=7
  ERROR cluster 1197 refcount=1 reference=6
  ERROR cluster 1198 refcount=1 reference=4
  ERROR cluster 1199 refcount=1 reference=4
  ERROR cluster 1200 refcount=1 reference=5
  ERROR cluster 1201 refcount=1 reference=3
  <...> snip many errors
  Leaked cluster 94847 refcount=3 reference=0
  Leaked cluster 94848 refcount=3 reference=0
  Leaked cluster 94849 refcount=11 reference=0
  Leaked cluster 94850 refcount=14 reference=0

  20503 errors were found on the image.
  Data may be corrupted, or further writes to the image may corrupt it.

  20503 leaked clusters were found on the image.
  This means waste of disk space, but no harm to data.
  197000/327680 = 60.12% allocated, 89.32% fragmented, 88.50% compressed 
clusters
  Image end offset: 6216220672

  
  The resultant image seems to work fine in a VM when used as a backing file.

  Interestingly, if I substitute a qemu-img binary from qemu-4.0 then no
  errors are reported.

  # /tmp/qemu-img check test1-z.qcow2
  No errors were found on the image.
  197000/327680 = 60.12% allocated, 89.32% fragmented, 88.50% compressed 
clusters
  Image end offset: 6216220672

  Is the image corrupted or not? I'm guessing not.

  Just in case it matters, this is ext4 fs on rotational disk. Latest
  Arch Linux but self compiled 4.1.0 with recent QCOW2 corruption fixes
  added.

  I haven't tried latest trunk but might do so if time permits.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/185/+subscriptions



Re: [kvm-unit-tests PATCH 04/10] arm: pmu: Check Required Event Support

2020-01-09 Thread Auger Eric
Hi Andre,

On 1/9/20 6:30 PM, André Przywara wrote:
> On 09/01/2020 16:54, Auger Eric wrote:
> 
> Hi Eric,
> 
>> On 1/3/20 7:12 PM, Andre Przywara wrote:
>>> On Mon, 16 Dec 2019 21:47:51 +0100
>>> Eric Auger  wrote:
>>>
>>> Hi Eric,
>>>
 If event counters are implemented check the common events
 required by the PMUv3 are implemented.

 Some are unconditionally required (SW_INCR, CPU_CYCLES,
 either INST_RETIRED or INST_SPEC). Some others only are
 required if the implementation implements some other features.

 Check those wich are unconditionally required.

 This test currently fails on TCG as neither INST_RETIRED
 or INST_SPEC are supported.

 Signed-off-by: Eric Auger 

 ---

 v1 ->v2:
 - add a comment to explain the PMCEID0/1 splits
 ---
  arm/pmu.c | 71 +++
  arm/unittests.cfg |  6 
  2 files changed, 77 insertions(+)

 diff --git a/arm/pmu.c b/arm/pmu.c
 index d24857e..d88ef22 100644
 --- a/arm/pmu.c
 +++ b/arm/pmu.c
 @@ -101,6 +101,10 @@ static inline void precise_instrs_loop(int loop, 
 uint32_t pmcr)
: [pmcr] "r" (pmcr), [z] "r" (0)
: "cc");
  }
 +
 +/* event counter tests only implemented for aarch64 */
 +static void test_event_introspection(void) {}
 +
  #elif defined(__aarch64__)
  #define ID_AA64DFR0_PERFMON_SHIFT 8
  #define ID_AA64DFR0_PERFMON_MASK  0xf
 @@ -139,6 +143,70 @@ static inline void precise_instrs_loop(int loop, 
 uint32_t pmcr)
: [pmcr] "r" (pmcr)
: "cc");
  }
 +
 +#define PMCEID1_EL0 sys_reg(11, 3, 9, 12, 7)
 +
 +static bool is_event_supported(uint32_t n, bool warn)
 +{
 +  uint64_t pmceid0 = read_sysreg(pmceid0_el0);
 +  uint64_t pmceid1 = read_sysreg_s(PMCEID1_EL0);
 +  bool supported;
 +  uint32_t reg;
 +
 +  /*
 +   * The low 32-bits of PMCEID0/1 respectly describe
 +   * event support for events 0-31/32-63. Their High
 +   * 32-bits describe support for extended events
 +   * starting at 0x4000, using the same split.
 +   */
 +  if (n >= 0x0  && n <= 0x1F)
 +  reg = pmceid0 & 0x;
 +  else if  (n >= 0x4000 && n <= 0x401F)
 +  reg = pmceid0 >> 32;
 +  else if (n >= 0x20  && n <= 0x3F)
 +  reg = pmceid1 & 0x;
 +  else if (n >= 0x4020 && n <= 0x403F)
 +  reg = pmceid1 >> 32;
 +  else
 +  abort();
 +
 +  supported =  reg & (1 << n);
>>>
>>> Don't we need to mask off everything but the lowest 5 bits of "n"? Probably 
>>> also using "1U" is better.
>> I added an assert to check n is less or equal than 0x3F
> 
> But "n" will definitely be bigger than that in case of an extended
> event, won't it? So you adjust "reg" accordingly, but miss to do
> something similar to "n"?
ouch yes. Sorry. I Will do what you suggest. Nethertheless this would be
test code error.
> 
>>>
 +  if (!supported && warn)
 +  report_info("event %d is not supported", n);
 +  return supported;
 +}
 +
 +static void test_event_introspection(void)
>>>
>>> "introspection" sounds quite sophisticated. Are you planning to extend 
>>> this? If not, could we maybe rename it to "test_available_events"?
>> Yes this test is a placeholder for looking at the PMU characteristics
>> and we may add some other queries there.
>>>
 +{
 +  bool required_events;
 +
 +  if (!pmu.nb_implemented_counters) {
 +  report_skip("No event counter, skip ...");
 +  return;
 +  }
 +
 +  /* PMUv3 requires an implementation includes some common events */
 +  required_events = is_event_supported(0x0, true) /* SW_INCR */ &&
 +is_event_supported(0x11, true) /* CPU_CYCLES */ &&
 +(is_event_supported(0x8, true) /* INST_RETIRED */ ||
 + is_event_supported(0x1B, true) /* INST_PREC */);
 +
 +  if (pmu.version == 0x4) {
 +  /* ARMv8.1 PMU: STALL_FRONTEND and STALL_BACKEND are required */
 +  required_events = required_events ||
 +is_event_supported(0x23, true) ||
>>>
>>> Shouldn't those two operators be '&&' instead?
>> yes definitively
>>>
 +is_event_supported(0x24, true);
 +  }
 +
 +  /*
 +   * L1D_CACHE_REFILL(0x3) and L1D_CACHE(0x4) are only required if
 +   * L1 data / unified cache. BR_MIS_PRED(0x10), BR_PRED(0x12) are only
 +   * required if program-flow prediction is implemented.
 +   */
>>>
>>> Is this a TODO?
>> yes. Added TODO. I do not know how to check whether the conditions are
>> satisfied? Do you have any idea?
> 
> Well, AFAICS KVM doesn't filter PMCEIDn, right? So some basic checks are
> surely fine, but I wouldn't go crazy about checking every 

Re: [kvm-unit-tests PATCH 04/10] arm: pmu: Check Required Event Support

2020-01-09 Thread André Przywara
On 09/01/2020 16:54, Auger Eric wrote:

Hi Eric,

> On 1/3/20 7:12 PM, Andre Przywara wrote:
>> On Mon, 16 Dec 2019 21:47:51 +0100
>> Eric Auger  wrote:
>>
>> Hi Eric,
>>
>>> If event counters are implemented check the common events
>>> required by the PMUv3 are implemented.
>>>
>>> Some are unconditionally required (SW_INCR, CPU_CYCLES,
>>> either INST_RETIRED or INST_SPEC). Some others only are
>>> required if the implementation implements some other features.
>>>
>>> Check those wich are unconditionally required.
>>>
>>> This test currently fails on TCG as neither INST_RETIRED
>>> or INST_SPEC are supported.
>>>
>>> Signed-off-by: Eric Auger 
>>>
>>> ---
>>>
>>> v1 ->v2:
>>> - add a comment to explain the PMCEID0/1 splits
>>> ---
>>>  arm/pmu.c | 71 +++
>>>  arm/unittests.cfg |  6 
>>>  2 files changed, 77 insertions(+)
>>>
>>> diff --git a/arm/pmu.c b/arm/pmu.c
>>> index d24857e..d88ef22 100644
>>> --- a/arm/pmu.c
>>> +++ b/arm/pmu.c
>>> @@ -101,6 +101,10 @@ static inline void precise_instrs_loop(int loop, 
>>> uint32_t pmcr)
>>> : [pmcr] "r" (pmcr), [z] "r" (0)
>>> : "cc");
>>>  }
>>> +
>>> +/* event counter tests only implemented for aarch64 */
>>> +static void test_event_introspection(void) {}
>>> +
>>>  #elif defined(__aarch64__)
>>>  #define ID_AA64DFR0_PERFMON_SHIFT 8
>>>  #define ID_AA64DFR0_PERFMON_MASK  0xf
>>> @@ -139,6 +143,70 @@ static inline void precise_instrs_loop(int loop, 
>>> uint32_t pmcr)
>>> : [pmcr] "r" (pmcr)
>>> : "cc");
>>>  }
>>> +
>>> +#define PMCEID1_EL0 sys_reg(11, 3, 9, 12, 7)
>>> +
>>> +static bool is_event_supported(uint32_t n, bool warn)
>>> +{
>>> +   uint64_t pmceid0 = read_sysreg(pmceid0_el0);
>>> +   uint64_t pmceid1 = read_sysreg_s(PMCEID1_EL0);
>>> +   bool supported;
>>> +   uint32_t reg;
>>> +
>>> +   /*
>>> +* The low 32-bits of PMCEID0/1 respectly describe
>>> +* event support for events 0-31/32-63. Their High
>>> +* 32-bits describe support for extended events
>>> +* starting at 0x4000, using the same split.
>>> +*/
>>> +   if (n >= 0x0  && n <= 0x1F)
>>> +   reg = pmceid0 & 0x;
>>> +   else if  (n >= 0x4000 && n <= 0x401F)
>>> +   reg = pmceid0 >> 32;
>>> +   else if (n >= 0x20  && n <= 0x3F)
>>> +   reg = pmceid1 & 0x;
>>> +   else if (n >= 0x4020 && n <= 0x403F)
>>> +   reg = pmceid1 >> 32;
>>> +   else
>>> +   abort();
>>> +
>>> +   supported =  reg & (1 << n);
>>
>> Don't we need to mask off everything but the lowest 5 bits of "n"? Probably 
>> also using "1U" is better.
> I added an assert to check n is less or equal than 0x3F

But "n" will definitely be bigger than that in case of an extended
event, won't it? So you adjust "reg" accordingly, but miss to do
something similar to "n"?

>>
>>> +   if (!supported && warn)
>>> +   report_info("event %d is not supported", n);
>>> +   return supported;
>>> +}
>>> +
>>> +static void test_event_introspection(void)
>>
>> "introspection" sounds quite sophisticated. Are you planning to extend this? 
>> If not, could we maybe rename it to "test_available_events"?
> Yes this test is a placeholder for looking at the PMU characteristics
> and we may add some other queries there.
>>
>>> +{
>>> +   bool required_events;
>>> +
>>> +   if (!pmu.nb_implemented_counters) {
>>> +   report_skip("No event counter, skip ...");
>>> +   return;
>>> +   }
>>> +
>>> +   /* PMUv3 requires an implementation includes some common events */
>>> +   required_events = is_event_supported(0x0, true) /* SW_INCR */ &&
>>> + is_event_supported(0x11, true) /* CPU_CYCLES */ &&
>>> + (is_event_supported(0x8, true) /* INST_RETIRED */ ||
>>> +  is_event_supported(0x1B, true) /* INST_PREC */);
>>> +
>>> +   if (pmu.version == 0x4) {
>>> +   /* ARMv8.1 PMU: STALL_FRONTEND and STALL_BACKEND are required */
>>> +   required_events = required_events ||
>>> + is_event_supported(0x23, true) ||
>>
>> Shouldn't those two operators be '&&' instead?
> yes definitively
>>
>>> + is_event_supported(0x24, true);
>>> +   }
>>> +
>>> +   /*
>>> +* L1D_CACHE_REFILL(0x3) and L1D_CACHE(0x4) are only required if
>>> +* L1 data / unified cache. BR_MIS_PRED(0x10), BR_PRED(0x12) are only
>>> +* required if program-flow prediction is implemented.
>>> +*/
>>
>> Is this a TODO?
> yes. Added TODO. I do not know how to check whether the conditions are
> satisfied? Do you have any idea?

Well, AFAICS KVM doesn't filter PMCEIDn, right? So some basic checks are
surely fine, but I wouldn't go crazy about checking every possible
aspect of it. After all you would just check the hardware, as we pass
this register on.

Cheers,
Andre.

> Thank you for the review!
> 
> Eric
>>
>> Cheers,
>> Andre
>>
>>
>>> +
>>> +   report(required_events, "Check required events are 

[Bug 1835827] Re: HTIF symbols no longer recognized by RISC-V spike board

2020-01-09 Thread Thomas Huth
Patch has been included in QEMU v4.2:
https://git.qemu.org/?p=qemu.git;a=commitdiff;h=6478dd745dca49d632

** Changed in: qemu
   Status: New => Fix Released

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1835827

Title:
  HTIF symbols no longer recognized by RISC-V spike board

Status in QEMU:
  Fix Released

Bug description:
  Tested commit: f34edbc760b0f689deddd175fc08732ecb46665f

  I belive this was introduced in
  0ac24d56c5e7d32423ea78ac58a06b444d1df04d when the spike's
  load_kernel() was moved to riscv_load_kernel() which no longer
  included htif_symbol_callback().

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1835827/+subscriptions



[Bug 1777777] Re: arm9 clock pending (SP804)

2020-01-09 Thread Thomas Huth
Fix has been included in QEMU v4.2:
https://git.qemu.org/?p=qemu.git;a=commitdiff;h=5a65f7b5f4907ca70cb6

** Changed in: qemu
   Status: Confirmed => Fix Released

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/177

Title:
  arm9 clock pending (SP804)

Status in QEMU:
  Fix Released

Bug description:
  Hello all,

  I'm using the versatilepb board and the timer Interrupt Mask Status
  register (offset 0x14 of the SP804) does not seem to be working
  properly on the latest qemu-2.12. I tried on the 2.5 (i believe this
  is the mainstream version that comes with Linux) and it works
  perfectly.

  What happens is that the pending bit does not seem to be set in some
  scenarios. In my case, I see the timer value decreasing to 0 and then
  being reset to the reload value and neither does the interrupt is
  triggered nor the pending bit is set.

  I believe this is a matter of timing since in the "long" run the
  system eventually catches up (after a few microseconds).

  Thank you

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/177/+subscriptions



Re: [PATCH] linux-aio: increasing MAX_EVENTS to a larger hardcoded value

2020-01-09 Thread Stefan Hajnoczi
On Tue, Jan 07, 2020 at 06:01:01AM +, Wangyong wrote:
> Since commit 6040aedddb5f474a9c2304b6a432a652d82b3d3c "virtio-blk:
> make queue size configurable",if the user set the queue size to
> more than 128 ,it will not take effect. That's because linux aio's
> maximum outstanding requests at a time is always less than or equal
> to 128.
> 
> This patch simply increase MAX_EVENTS to a larger hardcoded value of
> 1024 as a shortterm fix.
> 
> Signed-off-by: wangyong 
> ---
>  block/linux-aio.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

We discussed that long-terms solutions would take into account the queue
size of the emulated storage controllers associated with this
AioContext, but this is a first step that will improve performance.

Thanks, applied to my block tree:
https://github.com/stefanha/qemu/commits/block

Stefan


signature.asc
Description: PGP signature


Re: [PATCH 2/2] hyperv/synic: Allocate as ram_device

2020-01-09 Thread Roman Kagan
On Thu, Jan 09, 2020 at 04:27:45PM +, Dr. David Alan Gilbert wrote:
> * Roman Kagan (rka...@virtuozzo.com) wrote:
> > On Thu, Jan 09, 2020 at 01:28:21PM +, Dr. David Alan Gilbert wrote:
> > > * Roman Kagan (rka...@virtuozzo.com) wrote:
> > > > On Thu, Jan 09, 2020 at 02:00:00PM +0100, Vitaly Kuznetsov wrote:
> > > > > "Dr. David Alan Gilbert"  writes:
> > > > > 
> > > > > > And I think vhost-user will fail if you have too many sections - and
> > > > > > the 16 sections from synic I think will blow the slots available.
> > > > > >
> > > > > 
> > > > > SynIC is percpu, it will allocate two 4k pages for every vCPU the 
> > > > > guest
> > > > > has so we're potentially looking at hundreds of such regions.
> > > > 
> > > > Indeed.
> > > > 
> > > > I think my original idea to implement overlay pages word-for-word to the
> > > > HyperV spec was a mistake, as it lead to fragmentation and memslot
> > > > waste.
> > > > 
> > > > I'll look into reworking it without actually mapping extra pages over
> > > > the existing RAM, but achieving overlay semantics by just shoving the
> > > > *content* of the "overlaid" memory somewhere.
> > > > 
> > > > That said, I haven't yet fully understood how the reported issue came
> > > > about, and thus whether the proposed approach would resolve it too.
> > > 
> > > The problem happens when we end up with:
> > > 
> > >  a)  0-512k  RAM
> > >  b)  512k +  synic
> > >  c)  570kish-640k  RAM
> > > 
> > > the page alignment code rounds
> > >   (a) to 0-2MB   - aligning to the hugepage it's in
> > >   (b) leaves as is
> > >   (c) aligns to 0-2MB
> > > 
> > >   it then tries to coalesce (c) and (a) and notices (b) got in the way
> > > and fails it.
> > 
> > I see, thanks.  The only bit I still haven't quite followed is how this
> > failure results in a quiet vhost malfunction rather than a refusal to
> > start vhost.
> 
> Because there's no way to fail in vhost_region_add_section other than to
> abort;
> 
> if (mrs_gpa < prev_gpa_start) {
> error_report("%s:Section rounded to %"PRIx64
>  " prior to previous %"PRIx64,
>  __func__, mrs_gpa, prev_gpa_start);
> /* A way to cleanly fail here would be better */
> return;
> }
> 
> > > Given the guest can put Synic anywhere I'm not sure that changing it's
> > > implementatino would help here.
> > 
> > There would be no (b) nor (separate) (c): synic would just refer to some
> > memory straight from (a), regardless of its paging granularity.
> 
> Oh, if it's actually memory from main RAM, then sure, but I guess you'd
> have to reserve that somehow to stop the OS using it.

It's up to the OS to use it.

> > > (And changing it's implementation would probably break migration
> > > compatibility).
> > 
> > I'm afraid I see no better option.
> 
> Migration compatibility!

Hmm, good point!

I think I should be able to achieve that, by keeping the current scheme
of allocating a one-page RAM memory region for every synic page, but,
instead of mapping it on top of the general RAM, swap the content with
the page being "overlaid".  Let me see if it works out (and hope the
QEMU gang won't shoot me for such an (ab)use of memory regions).

Thanks,
Roman.



Re: [PATCH 03/15] hw/ppc/spapr_rtas: Access MachineState via SpaprMachineState argument

2020-01-09 Thread Greg Kurz
On Thu,  9 Jan 2020 16:21:21 +0100
Philippe Mathieu-Daudé  wrote:

> We received a SpaprMachineState argument. Since SpaprMachineState
> inherits of MachineState, use it instead of calling qdev_get_machine.
> 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---

Reviewed-by: Greg Kurz 

>  hw/ppc/spapr_rtas.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
> index e88bb1930e..6f06e9d7fe 100644
> --- a/hw/ppc/spapr_rtas.c
> +++ b/hw/ppc/spapr_rtas.c
> @@ -267,7 +267,7 @@ static void rtas_ibm_get_system_parameter(PowerPCCPU *cpu,
>uint32_t nret, target_ulong rets)
>  {
>  PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
> -MachineState *ms = MACHINE(qdev_get_machine());
> +MachineState *ms = MACHINE(spapr);
>  unsigned int max_cpus = ms->smp.max_cpus;
>  target_ulong parameter = rtas_ld(args, 0);
>  target_ulong buffer = rtas_ld(args, 1);




Re: [PATCH 02/15] hw/ppc/spapr_rtas: Use local MachineState variable

2020-01-09 Thread Greg Kurz
On Thu,  9 Jan 2020 16:21:20 +0100
Philippe Mathieu-Daudé  wrote:

> Since we have the MachineState already available locally,
> ues it instead of the global current_machine.
> 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---

Reviewed-by: Greg Kurz 

>  hw/ppc/spapr_rtas.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
> index 8d8d8cdfcb..e88bb1930e 100644
> --- a/hw/ppc/spapr_rtas.c
> +++ b/hw/ppc/spapr_rtas.c
> @@ -281,7 +281,7 @@ static void rtas_ibm_get_system_parameter(PowerPCCPU *cpu,
>"DesProcs=%d,"
>"MaxPlatProcs=%d",
>max_cpus,
> -  current_machine->ram_size / MiB,
> +  ms->ram_size / MiB,
>ms->smp.cpus,
>max_cpus);
>  if (pcc->n_host_threads > 0) {




Re: [PATCH 0/5] ARM virt: Add NVDIMM support

2020-01-09 Thread Igor Mammedov
On Mon, 6 Jan 2020 17:06:32 +
Shameerali Kolothum Thodi  wrote:

> Hi Igor,
> 
> > -Original Message-
> > From: Shameerali Kolothum Thodi
> > Sent: 13 December 2019 12:52
> > To: 'Igor Mammedov' 
> > Cc: xiaoguangrong.e...@gmail.com; peter.mayd...@linaro.org;
> > drjo...@redhat.com; shannon.zha...@gmail.com; qemu-devel@nongnu.org;
> > Linuxarm ; Auger Eric ;
> > qemu-...@nongnu.org; xuwei (O) ;
> > ler...@redhat.com
> > Subject: RE: [PATCH 0/5] ARM virt: Add NVDIMM support
> >   
> 
> [...]
> 
> > 
> > Thanks for your help. I did spend some more time debugging this further.
> > I tried to introduce a totally new Buffer field object with different
> > sizes and printing the size after creation.
> > 
> > --- SSDT.dsl2019-12-12 15:28:21.976986949 +
> > +++ SSDT-arm64-dbg.dsl  2019-12-13 12:17:11.026806186 +
> > @@ -18,7 +18,7 @@
> >   * Compiler ID  "BXPC"
> >   * Compiler Version 0x0001 (1)
> >   */
> > -DefinitionBlock ("", "SSDT", 1, "BOCHS ", "NVDIMM", 0x0001)
> > +DefinitionBlock ("", "SSDT", 1, "BOCHS ", "NVDIMM", 0x0002)
> >  {
> >  Scope (\_SB)
> >  {
> > @@ -48,6 +48,11 @@
> >  RLEN,   32,
> >  ODAT,   32736
> >  }
> > +
> > +Field (NRAM, DWordAcc, NoLock, Preserve)
> > +{
> > +NBUF,   32768
> > +}
> > 
> >  If ((Arg4 == Zero))
> >  {
> > @@ -87,6 +92,12 @@
> >  Local3 = DerefOf (Local2)
> >  FARG = Local3
> >  }
> > +
> > +Local2 = 0x2
> > +printf("AML:NVDIMM Creating TBUF with bytes %o",
> > Local2)
> > +CreateField (NBUF, Zero, (Local2 << 3), TBUF)
> > +Concatenate (Buffer (Zero){}, TBUF, Local3)
> > +printf("AML:NVDIMM Size of TBUF(Local3) %o",
> > SizeOf(Local3))
> > 
> >  NTFI = Local6
> >  Local1 = (RLEN - 0x04)
> > 
> > And run it by changing Local2 with different values, It looks on ARM64,
> > 
> > For cases where, Local2 <8, the created buffer size is always 8 bytes
> > 
> > "AML:NVDIMM Creating TBUF with bytes 0002"
> > "AML:NVDIMM Size of TBUF(Local3) 0008"
> > 
> > ...
> > "AML:NVDIMM Creating TBUF with bytes 0005"
> > "AML:NVDIMM Size of TBUF(Local3) 0008"
> > 
> > And once Local2 >=8, it gets the correct size,
> > 
> > "AML:NVDIMM Creating TBUF with bytes 0009"
> > "AML:NVDIMM Size of TBUF(Local3) 0009"
> > 
> > 
> > But on x86, the behavior is like,
> > 
> > For cases where, Local2 <4, the created buffer size is always 4 bytes
> > 
> > "AML:NVDIMM Creating TBUF with bytes 0002"
> > "AML:NVDIMM Size of TBUF(Local3) 0004"
> > 
> > "AML:NVDIMM Creating TBUF with bytes 0003"
> > "AML:NVDIMM Size of TBUF(Local3) 0004"
> > 
> > And once Local2 >= 4, it is ok
> > 
> > "AML:NVDIMM Creating TBUF with bytes 0005"
> > "AML:NVDIMM Size of TBUF(Local3) 0005"
> > ...
> > "AML:NVDIMM Creating TBUF with bytes 0009"
> > "AML:NVDIMM Size of TBUF(Local3) 0009"
> > 
> > This is the reason why it works on x86 and not on ARM64. Because, if you
> > remember on second iteration of the FIT buffer, the requested buffer size 
> > is 4 .
> > 
> > I tried changing the AccessType of the below NBUF field from DWordAcc to
> > ByteAcc/BufferAcc, but no luck.
> > 
> > +Field (NRAM, DWordAcc, NoLock, Preserve)
> > +{
> > +NBUF,   32768
> > +}
> > 
> > Not sure what we need to change for ARM64 to create buffer object of size 4
> > here. Please let me know if you have any pointers to debug this further.
> > 
> > (I am attaching both x86 and ARM64 SSDT dsl used for reference)  
> 
> (+Jonathan)
> 
> Thanks to Jonathan for taking a fresh look at this issue and spotting this,
> https://elixir.bootlin.com/linux/v5.5-rc5/source/drivers/acpi/acpica/utmisc.c#L109
> 
> And, from ACPI 6.3, table 19-419
> 
> "If the Buffer Field is smaller than or equal to the size of an Integer (in 
> bits), it
> will be treated as an Integer. Otherwise, it will be treated as a Buffer. The 
> size
> of an Integer is indicated by the Definition Block table header's Revision 
> field.
> A Revision field value less than 2 indicates that the size of an Integer is 
> 32 bits.
> A value greater than or equal to 2 signifies that the size of an Integer is 
> 64 bits."
> 
> It looks like the main reason for the difference in behavior of the buffer 
> object
> size between x86 and ARM/virt, is because of the Revision number used in the
> DSDT table. On x86 it is 1 and ARM/virt it is 2.
> 
> So most likely,
> 
> > CreateField (ODAT, Zero, Local1, OBUF)

You are right, that's where it goes wrong, since OBUF
is implicitly converted to integer if size is less than 64bits.

> > Concatenate (Buffer 

Re: [PATCH 056/104] virtiofsd: add security guide document

2020-01-09 Thread Dr. David Alan Gilbert
* Daniel P. Berrangé (berra...@redhat.com) wrote:
> On Mon, Jan 06, 2020 at 05:53:55PM +, Dr. David Alan Gilbert wrote:
> > * Daniel P. Berrangé (berra...@redhat.com) wrote:
> > > On Thu, Dec 12, 2019 at 04:38:16PM +, Dr. David Alan Gilbert (git) 
> > > wrote:
> > > > From: Stefan Hajnoczi 
> > > > 
> > > > Many people want to know: what's up with virtiofsd and security?  This
> > > > document provides the answers!
> > > > 
> > > > Signed-off-by: Stefan Hajnoczi 
> > > > ---
> > > >  tools/virtiofsd/security.rst | 118 +++
> > > 
> > > Do we need to link this into the rest of QEMU's docs in some
> > > index page ?
> > 
> > I wonder how;  there's a autogenerated thing in
> >   docs/index.rst
> > 
> > that includes some of the docs directories subdirectories/index
> > Does that mean we should have this in docs/tools/virtiofsd/security.rst
> > and a docs/tools/index  and a docs/tools/virtiofsd/index  ?
> 
> I was wondering if this fits into any of the current three sections
> "devel" or "interop" or "specs", but it doesn't feel quite right in
> any of them to me. So having a new docs/tools subtree looks like an
> ok idea in absence of better suggestions.

OK, so what I've done is I've added a preceding patch that creates:

   docs/tools
 /conf.py
 /index.rst

  and adds it to the Makefile and docs/index.rst 

and then this patch now adds itself as docs/tools/virtiofsd-security.rst
and just adds the entry in docs/tools/index.rst

Dave

> Regards,
> Daniel
> -- 
> |: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org -o-https://fstop138.berrange.com :|
> |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




Re: [kvm-unit-tests PATCH 04/10] arm: pmu: Check Required Event Support

2020-01-09 Thread Auger Eric
Hi Andre,

On 1/3/20 7:12 PM, Andre Przywara wrote:
> On Mon, 16 Dec 2019 21:47:51 +0100
> Eric Auger  wrote:
> 
> Hi Eric,
> 
>> If event counters are implemented check the common events
>> required by the PMUv3 are implemented.
>>
>> Some are unconditionally required (SW_INCR, CPU_CYCLES,
>> either INST_RETIRED or INST_SPEC). Some others only are
>> required if the implementation implements some other features.
>>
>> Check those wich are unconditionally required.
>>
>> This test currently fails on TCG as neither INST_RETIRED
>> or INST_SPEC are supported.
>>
>> Signed-off-by: Eric Auger 
>>
>> ---
>>
>> v1 ->v2:
>> - add a comment to explain the PMCEID0/1 splits
>> ---
>>  arm/pmu.c | 71 +++
>>  arm/unittests.cfg |  6 
>>  2 files changed, 77 insertions(+)
>>
>> diff --git a/arm/pmu.c b/arm/pmu.c
>> index d24857e..d88ef22 100644
>> --- a/arm/pmu.c
>> +++ b/arm/pmu.c
>> @@ -101,6 +101,10 @@ static inline void precise_instrs_loop(int loop, 
>> uint32_t pmcr)
>>  : [pmcr] "r" (pmcr), [z] "r" (0)
>>  : "cc");
>>  }
>> +
>> +/* event counter tests only implemented for aarch64 */
>> +static void test_event_introspection(void) {}
>> +
>>  #elif defined(__aarch64__)
>>  #define ID_AA64DFR0_PERFMON_SHIFT 8
>>  #define ID_AA64DFR0_PERFMON_MASK  0xf
>> @@ -139,6 +143,70 @@ static inline void precise_instrs_loop(int loop, 
>> uint32_t pmcr)
>>  : [pmcr] "r" (pmcr)
>>  : "cc");
>>  }
>> +
>> +#define PMCEID1_EL0 sys_reg(11, 3, 9, 12, 7)
>> +
>> +static bool is_event_supported(uint32_t n, bool warn)
>> +{
>> +uint64_t pmceid0 = read_sysreg(pmceid0_el0);
>> +uint64_t pmceid1 = read_sysreg_s(PMCEID1_EL0);
>> +bool supported;
>> +uint32_t reg;
>> +
>> +/*
>> + * The low 32-bits of PMCEID0/1 respectly describe
>> + * event support for events 0-31/32-63. Their High
>> + * 32-bits describe support for extended events
>> + * starting at 0x4000, using the same split.
>> + */
>> +if (n >= 0x0  && n <= 0x1F)
>> +reg = pmceid0 & 0x;
>> +else if  (n >= 0x4000 && n <= 0x401F)
>> +reg = pmceid0 >> 32;
>> +else if (n >= 0x20  && n <= 0x3F)
>> +reg = pmceid1 & 0x;
>> +else if (n >= 0x4020 && n <= 0x403F)
>> +reg = pmceid1 >> 32;
>> +else
>> +abort();
>> +
>> +supported =  reg & (1 << n);
> 
> Don't we need to mask off everything but the lowest 5 bits of "n"? Probably 
> also using "1U" is better.
I added an assert to check n is less or equal than 0x3F
> 
>> +if (!supported && warn)
>> +report_info("event %d is not supported", n);
>> +return supported;
>> +}
>> +
>> +static void test_event_introspection(void)
> 
> "introspection" sounds quite sophisticated. Are you planning to extend this? 
> If not, could we maybe rename it to "test_available_events"?
Yes this test is a placeholder for looking at the PMU characteristics
and we may add some other queries there.
> 
>> +{
>> +bool required_events;
>> +
>> +if (!pmu.nb_implemented_counters) {
>> +report_skip("No event counter, skip ...");
>> +return;
>> +}
>> +
>> +/* PMUv3 requires an implementation includes some common events */
>> +required_events = is_event_supported(0x0, true) /* SW_INCR */ &&
>> +  is_event_supported(0x11, true) /* CPU_CYCLES */ &&
>> +  (is_event_supported(0x8, true) /* INST_RETIRED */ ||
>> +   is_event_supported(0x1B, true) /* INST_PREC */);
>> +
>> +if (pmu.version == 0x4) {
>> +/* ARMv8.1 PMU: STALL_FRONTEND and STALL_BACKEND are required */
>> +required_events = required_events ||
>> +  is_event_supported(0x23, true) ||
> 
> Shouldn't those two operators be '&&' instead?
yes definitively
> 
>> +  is_event_supported(0x24, true);
>> +}
>> +
>> +/*
>> + * L1D_CACHE_REFILL(0x3) and L1D_CACHE(0x4) are only required if
>> + * L1 data / unified cache. BR_MIS_PRED(0x10), BR_PRED(0x12) are only
>> + * required if program-flow prediction is implemented.
>> + */
> 
> Is this a TODO?
yes. Added TODO. I do not know how to check whether the conditions are
satisfied? Do you have any idea?

Thank you for the review!

Eric
> 
> Cheers,
> Andre
> 
> 
>> +
>> +report(required_events, "Check required events are implemented");
>> +}
>> +
>>  #endif
>>  
>>  /*
>> @@ -326,6 +394,9 @@ int main(int argc, char *argv[])
>> "Monotonically increasing cycle count");
>>  report(check_cpi(cpi), "Cycle/instruction ratio");
>>  pmccntr64_test();
>> +} else if (strcmp(argv[1], "event-introspection") == 0) {
>> +report_prefix_push(argv[1]);
>> +test_event_introspection();
>>  } else {
>>  report_abort("Unknown sub-test '%s'", argv[1]);
>>  }
>> diff --git 

[Bug 1528239] Re: Unable to debug PIE binaries with QEMU gdb stub.

2020-01-09 Thread Thomas Huth
Patch has been included in QEMU v4.2:
https://git.qemu.org/?p=qemu.git;a=commitdiff;h=dc12567a53c88d7a91b9

** Changed in: qemu
   Status: New => Fix Released

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1528239

Title:
  Unable to debug PIE binaries with QEMU gdb stub.

Status in QEMU:
  Fix Released

Bug description:
  The issue occurs on current trunk:

  max@max:~/build/qemu$ cat test.c
  #include 

  int main() {
printf("Hello, world!\n");
return 0;
  }

  max@max:~/build/qemu$ gcc test.c -fPIC -pie -o bad.x
  max@max:~/build/qemu$ ./x86_64-linux-user/qemu-x86_64 -g 1234 bad.x 
  .

  
  max@max:~/build/qemu$ gdb
  GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1
  

  (gdb) file bad.x
  Reading symbols from bad.x...(no debugging symbols found)...done.
  (gdb) b main
  Breakpoint 1 at 0x779
  (gdb) target remote localhost:1234
  Remote debugging using localhost:1234
  Reading symbols from /lib64/ld-linux-x86-64.so.2...warning: the debug 
information found in "/lib64/ld-2.19.so" does not match 
"/lib64/ld-linux-x86-64.so.2" (CRC mismatch).

  Reading symbols from /usr/lib/debug//lib/x86_64-linux-gnu/ld-2.19.so...done.
  done.
  Loaded symbols for /lib64/ld-linux-x86-64.so.2
  Error in re-setting breakpoint 1: Cannot access memory at address 0x775
  Error in re-setting breakpoint 1: Cannot access memory at address 0x775
  0x004000a042d0 in _start () from /lib64/ld-linux-x86-64.so.2
  (gdb) c
  Continuing.
  [Inferior 1 (Remote target) exited normally]
  (gdb) 

  
  max@max:~/build/qemu$ cat config.log
  # Configured with: '/home/max/src/qemu/configure' 
'--prefix=/home/max/install/qemu' 
'--target-list=arm-linux-user,aarch64-linux-user,x86_64-linux-user' '--static'

  
  W/O QEMU or -pie flag breakpoint on main works fine.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1528239/+subscriptions



[PATCH 2/2] target/ppc: add support for Hypervisor Facility Unavailable Exception

2020-01-09 Thread Cédric Le Goater
The privileged message send and clear instructions (msgsndp & msgclrp)
are privileged, but will generate a hypervisor facility unavailable
exception if not enabled in the HFSCR and executed in privileged
non-hypervisor state.

Add checks when accessing the DPDES register and when using the
msgsndp and msgclrp isntructions.

Based on previous work from Suraj Jitindar Singh.

Cc: Suraj Jitindar Singh 
Signed-off-by: Cédric Le Goater 
---
 target/ppc/cpu.h |  6 ++
 target/ppc/excp_helper.c | 13 +
 target/ppc/misc_helper.c | 27 +++
 3 files changed, 46 insertions(+)

diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index d175ec9a641d..1ff6afbccdb2 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -397,6 +397,10 @@ typedef struct ppc_v3_pate_t {
 #define PSSCR_ESL PPC_BIT(42) /* Enable State Loss */
 #define PSSCR_EC  PPC_BIT(43) /* Exit Criterion */
 
+/* HFSCR bits */
+#define HFSCR_MSGP PPC_BIT(53) /* Privileged Message Send Facilities */
+#define HFSCR_IC_MSGP  0xA
+
 #define msr_sf   ((env->msr >> MSR_SF)   & 1)
 #define msr_isf  ((env->msr >> MSR_ISF)  & 1)
 #define msr_shv  ((env->msr >> MSR_SHV)  & 1)
@@ -1332,6 +1336,8 @@ void cpu_ppc_set_vhyp(PowerPCCPU *cpu, 
PPCVirtualHypervisor *vhyp);
 #endif
 
 void store_fpscr(CPUPPCState *env, uint64_t arg, uint32_t mask);
+void helper_hfscr_facility_check(CPUPPCState *env, uint32_t bit,
+ const char *caller, uint32_t cause);
 
 static inline uint64_t ppc_dump_gpr(CPUPPCState *env, int gprn)
 {
diff --git a/target/ppc/excp_helper.c b/target/ppc/excp_helper.c
index 343f3a6b30c4..3887fc6c 100644
--- a/target/ppc/excp_helper.c
+++ b/target/ppc/excp_helper.c
@@ -471,6 +471,15 @@ static inline void powerpc_excp(PowerPCCPU *cpu, int 
excp_model, int excp)
 case POWERPC_EXCP_FU: /* Facility unavailable exception  */
 #ifdef TARGET_PPC64
 env->spr[SPR_FSCR] |= ((target_ulong)env->error_code << 56);
+#endif
+break;
+case POWERPC_EXCP_HV_FU: /* Hypervisor Facility Unavailable Exception 
*/
+#ifdef TARGET_PPC64
+env->spr[SPR_HFSCR] |= ((target_ulong)env->error_code << FSCR_IC_POS);
+srr0 = SPR_HSRR0;
+srr1 = SPR_HSRR1;
+new_msr |= (target_ulong)MSR_HVB;
+new_msr |= env->msr & ((target_ulong)1 << MSR_RI);
 #endif
 break;
 case POWERPC_EXCP_PIT:   /* Programmable interval timer interrupt*/
@@ -1287,6 +1296,8 @@ void helper_book3s_msgclrp(CPUPPCState *env, target_ulong 
rb)
 {
 int irq = book3s_dbell2irq(rb, false);
 
+helper_hfscr_facility_check(env, HFSCR_MSGP, "msgclrp", HFSCR_IC_MSGP);
+
 if (irq < 0) {
 return;
 }
@@ -1303,6 +1314,8 @@ void helper_book3s_msgsndp(CPUPPCState *env, target_ulong 
rb)
 int irq = book3s_dbell2irq(rb, false);
 int pir = env->spr_cb[SPR_PIR].default_value;
 
+helper_hfscr_facility_check(env, HFSCR_MSGP, "msgsndp", HFSCR_IC_MSGP);
+
 if (irq < 0) {
 return;
 }
diff --git a/target/ppc/misc_helper.c b/target/ppc/misc_helper.c
index 66b5b0824208..2ff6bed10228 100644
--- a/target/ppc/misc_helper.c
+++ b/target/ppc/misc_helper.c
@@ -41,6 +41,18 @@ void helper_store_dump_spr(CPUPPCState *env, uint32_t sprn)
 }
 
 #ifdef TARGET_PPC64
+static void raise_hv_fu_exception(CPUPPCState *env, uint32_t bit,
+  const char *caller, uint32_t cause,
+  uintptr_t raddr)
+{
+qemu_log_mask(LOG_GUEST_ERROR, "HV Facility %d is unavailable (%s)\n",
+  bit, caller);
+
+env->spr[SPR_HFSCR] &= ~((target_ulong)FSCR_IC_MASK << FSCR_IC_POS);
+
+raise_exception_err_ra(env, POWERPC_EXCP_HV_FU, cause, raddr);
+}
+
 static void raise_fu_exception(CPUPPCState *env, uint32_t bit,
uint32_t sprn, uint32_t cause,
uintptr_t raddr)
@@ -55,6 +67,17 @@ static void raise_fu_exception(CPUPPCState *env, uint32_t 
bit,
 }
 #endif
 
+void helper_hfscr_facility_check(CPUPPCState *env, uint32_t bit,
+ const char *caller, uint32_t cause)
+{
+#ifdef TARGET_PPC64
+if ((env->msr_mask & MSR_HVB) && !msr_hv &&
+ !(env->spr[SPR_HFSCR] & (1UL << bit))) {
+raise_hv_fu_exception(env, bit, caller, cause, GETPC());
+}
+#endif
+}
+
 void helper_fscr_facility_check(CPUPPCState *env, uint32_t bit,
 uint32_t sprn, uint32_t cause)
 {
@@ -114,6 +137,8 @@ target_ulong helper_load_dpdes(CPUPPCState *env)
 {
 target_ulong dpdes = 0;
 
+helper_hfscr_facility_check(env, HFSCR_MSGP, "load DPDES", HFSCR_IC_MSGP);
+
 /* TODO: TCG supports only one thread */
 if (env->pending_interrupts & (1 << PPC_INTERRUPT_DOORBELL)) {
 dpdes |= (env->spr_cb[SPR_PIR].default_value & DBELL_TIRTAG_MASK);
@@ -127,6 +152,8 @@ void helper_store_dpdes(CPUPPCState *env, target_ulong val)
   

[PATCH 1/2] target/ppc: Add privileged message send facilities

2020-01-09 Thread Cédric Le Goater
The Processor Control facility POWER8 processors and later provides
a mechanism for the hypervisor to send messages to other threads
in the system (msgsnd instruction) and cause hypervisor-level
exceptions. Privileged non-hypervisor programs are also able to
send messages (msgsndp instruction) but are restricted to the
threads of the same core and cause privileged-level exceptions.

The Directed Privileged Doorbell Exception State (DPDES) register
reflects the state of pending privileged doorbell exceptions and can
also be used to modify that state. The register can be used to read
and modify the state of privileged doorbell exceptions for all threads
of a subprocessor and thus is a shared facility for that subprocessor.
The register can be read/written by the hypervisor and read by the
supervisor if enabled in the HFSCR, otherwise a hypervisor facility
unavailable exception is generated.

The privileged message send and clear instructions (msgsndp & msgclrp)
are used to generate and clear the presence of a directed privileged
doorbell exception, respectively. The msgsndp instruction can be used
to target any thread of the current subprocessor, msgclrp acts on the
thread issuing the instruction. These instructions are privileged, but
will generate a hypervisor facility unavailable exception if not
enabled in the HFSCR and executed in privileged non-hypervisor
state. The HV facility unavailable exception will be addressed in
other patch.

Add and implement this register and instructions by reading or
modifying the pending interrupt state of the cpu.

Note that TCG only supports one thread per core and so we only need to
worry about the cpu making the access.

Based on previous work from Suraj Jitindar Singh.

Cc: Suraj Jitindar Singh 
[clg: took ownership due to the amount of changes ]
Signed-off-by: Cédric Le Goater 
---
 target/ppc/cpu.h|  1 +
 target/ppc/helper.h |  4 ++
 target/ppc/excp_helper.c| 68 +++--
 target/ppc/misc_helper.c| 36 +
 target/ppc/translate.c  | 26 +
 target/ppc/translate_init.inc.c | 20 --
 6 files changed, 140 insertions(+), 15 deletions(-)

diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index 103bfe9dc274..d175ec9a641d 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -935,6 +935,7 @@ enum {
 #define DBELL_PIRTAG_MASK  0x3fff
 
 #define DBELL_PROCIDTAG_MASK   PPC_BITMASK(44, 63)
+#define DBELL_TIRTAG_MASK  PPC_BITMASK(57, 63)
 
 #define PPC_PAGE_SIZES_MAX_SZ   8
 
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index cd0dfe383a2a..cfb4c07085ca 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -657,6 +657,10 @@ DEF_HELPER_FLAGS_1(load_601_rtcu, TCG_CALL_NO_RWG, tl, env)
 DEF_HELPER_FLAGS_1(load_purr, TCG_CALL_NO_RWG, tl, env)
 DEF_HELPER_FLAGS_2(store_purr, TCG_CALL_NO_RWG, void, env, tl)
 DEF_HELPER_2(store_ptcr, void, env, tl)
+DEF_HELPER_FLAGS_1(load_dpdes, TCG_CALL_NO_RWG, tl, env)
+DEF_HELPER_FLAGS_2(store_dpdes, TCG_CALL_NO_RWG, void, env, tl)
+DEF_HELPER_2(book3s_msgsndp, void, env, tl)
+DEF_HELPER_2(book3s_msgclrp, void, env, tl)
 #endif
 DEF_HELPER_2(store_sdr1, void, env, tl)
 DEF_HELPER_2(store_pidr, void, env, tl)
diff --git a/target/ppc/excp_helper.c b/target/ppc/excp_helper.c
index 5752ed4a4d83..343f3a6b30c4 100644
--- a/target/ppc/excp_helper.c
+++ b/target/ppc/excp_helper.c
@@ -900,7 +900,11 @@ static void ppc_hw_interrupt(CPUPPCState *env)
 }
 if (env->pending_interrupts & (1 << PPC_INTERRUPT_DOORBELL)) {
 env->pending_interrupts &= ~(1 << PPC_INTERRUPT_DOORBELL);
-powerpc_excp(cpu, env->excp_model, POWERPC_EXCP_DOORI);
+if (env->insns_flags & PPC_SEGMENT_64B) {
+powerpc_excp(cpu, env->excp_model, POWERPC_EXCP_SDOOR);
+} else {
+powerpc_excp(cpu, env->excp_model, POWERPC_EXCP_DOORI);
+}
 return;
 }
 if (env->pending_interrupts & (1 << PPC_INTERRUPT_HDOORBELL)) {
@@ -1221,7 +1225,7 @@ void helper_msgsnd(target_ulong rb)
 }
 
 /* Server Processor Control */
-static int book3s_dbell2irq(target_ulong rb)
+static int book3s_dbell2irq(target_ulong rb, bool hv_dbell)
 {
 int msg = rb & DBELL_TYPE_MASK;
 
@@ -1230,12 +1234,16 @@ static int book3s_dbell2irq(target_ulong rb)
  * message type is 5. All other types are reserved and the
  * instruction is a no-op
  */
-return msg == DBELL_TYPE_DBELL_SERVER ? PPC_INTERRUPT_HDOORBELL : -1;
+if (msg == DBELL_TYPE_DBELL_SERVER) {
+return hv_dbell ? PPC_INTERRUPT_HDOORBELL : PPC_INTERRUPT_DOORBELL;
+}
+
+return -1;
 }
 
 void helper_book3s_msgclr(CPUPPCState *env, target_ulong rb)
 {
-int irq = book3s_dbell2irq(rb);
+int irq = book3s_dbell2irq(rb, true);
 
 if (irq < 0) {
 return;
@@ -1244,16 +1252,10 @@ void helper_book3s_msgclr(CPUPPCState *env, 
target_ulong rb)
 

[PATCH 0/2] ppc: add support for Directed Privileged Doorbell (non-hypervisor)

2020-01-09 Thread Cédric Le Goater
Hello,

The Processor Control facility POWER8 processors and later provides a
mechanism for the hypervisor to send messages to other threads in the
system (msgsnd instruction) and cause hypervisor-level exceptions.

Privileged non-hypervisor programs can also send messages (msgsndp
instruction) but are restricted to the threads of the same
subprocessor and cause privileged-level exceptions. The Directed
Privileged Doorbell Exception State (DPDES) register reflects the
state of pending privileged-level doorbell exceptions for all threads
and can be used to modify that state.

If the MSGP facility is not in the HFSCR, a hypervisor facility
unavailable exception is generated when these instructions are used or
when the DPDES register is accessed by the supervisor.

Based on previous work from Suraj Jitindar Singh. I took ownership due
to the amount of changes.

Thanks,
 
C.

Cédric Le Goater (2):
  target/ppc: Add privileged message send facilities
  target/ppc: add support for Hypervisor Facility Unavailable Exception

 target/ppc/cpu.h|  7 +++
 target/ppc/helper.h |  4 ++
 target/ppc/excp_helper.c| 81 -
 target/ppc/misc_helper.c| 63 +
 target/ppc/translate.c  | 26 +++
 target/ppc/translate_init.inc.c | 20 ++--
 6 files changed, 186 insertions(+), 15 deletions(-)

-- 
2.21.1




Re: [PATCH 1/4] configure: Do not build libfdt is not required

2020-01-09 Thread Thomas Huth
On 09/01/2020 16.39, Philippe Mathieu-Daudé wrote:
> We only require libfdt for system emulation, in a small set
> of architecture:
> 
> 4077  # fdt support is mandatory for at least some target architectures,
> 4078  # so insist on it if we're building those system emulators.
> 4079  fdt_required=no
> 4080  for target in $target_list; do
> 4081case $target in
> 4082  
> aarch64*-softmmu|arm*-softmmu|ppc*-softmmu|microblaze*-softmmu|mips64el-softmmu|riscv*-softmmu)
> 4083fdt_required=yes
> 
> Do not build libfdt if we did not manually specified --enable-fdt.

I suggest to add:

"... or have one of the platforms that require it in our target list."

> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  configure | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/configure b/configure
> index 0ce2c0354a..266a8386d1 100755
> --- a/configure
> +++ b/configure
> @@ -4092,6 +4092,8 @@ if test "$fdt_required" = "yes"; then
>"targets which need it (by specifying a cut down --target-list)."
>fi
>fdt=yes
> +elif test "$fdt" != "yes" ; then
> +  fdt=no
>  fi
>  
>  if test "$fdt" != "no" ; then
> 

Reviewed-by: Thomas Huth 




Re: Pre-Christmas meeting notes

2020-01-09 Thread Thomas Huth
On 09/01/2020 16.03, Vladimir Sementsov-Ogievskiy wrote:
> 06.01.2020 20:15, Max Reitz wrote:
>> Misc
>> 
>>
>> The Wiki’s TODO list is horribly outdated.  What should we do about
>> it?  Maybe archive it and start a new one?  (Most of the things on the
>> current list are either done or we don’t want to do anymore.)
> 
> 
> May be, create block/TODO.txt instead? It would be simpler to manage file
> in git than wiki page.

Is it? We removed a bunch of stale TODO files in the past already:

https://git.qemu.org/?p=qemu.git;a=commitdiff;h=d494d79eabfdac0
https://git.qemu.org/?p=qemu.git;a=commitdiff;h=9e564a1dde5abc7
https://git.qemu.org/?p=qemu.git;a=commitdiff;h=ba43da36983a0bf

And there are some more TODO files in the target/ subfolders which look
also quite stale... so I somewhat doubt that a TODO list in git is much
better than a TODO list in the wiki.

Maybe we just need a proper bug/feature tracker instead (since Launchpad
is IMHO quite a bad choice for bug tracking, too)...

 Thomas




Re: [PATCH 2/2] hyperv/synic: Allocate as ram_device

2020-01-09 Thread Dr. David Alan Gilbert
* Roman Kagan (rka...@virtuozzo.com) wrote:
> On Thu, Jan 09, 2020 at 01:28:21PM +, Dr. David Alan Gilbert wrote:
> > * Roman Kagan (rka...@virtuozzo.com) wrote:
> > > On Thu, Jan 09, 2020 at 02:00:00PM +0100, Vitaly Kuznetsov wrote:
> > > > "Dr. David Alan Gilbert"  writes:
> > > > 
> > > > > And I think vhost-user will fail if you have too many sections - and
> > > > > the 16 sections from synic I think will blow the slots available.
> > > > >
> > > > 
> > > > SynIC is percpu, it will allocate two 4k pages for every vCPU the guest
> > > > has so we're potentially looking at hundreds of such regions.
> > > 
> > > Indeed.
> > > 
> > > I think my original idea to implement overlay pages word-for-word to the
> > > HyperV spec was a mistake, as it lead to fragmentation and memslot
> > > waste.
> > > 
> > > I'll look into reworking it without actually mapping extra pages over
> > > the existing RAM, but achieving overlay semantics by just shoving the
> > > *content* of the "overlaid" memory somewhere.
> > > 
> > > That said, I haven't yet fully understood how the reported issue came
> > > about, and thus whether the proposed approach would resolve it too.
> > 
> > The problem happens when we end up with:
> > 
> >  a)  0-512k  RAM
> >  b)  512k +  synic
> >  c)  570kish-640k  RAM
> > 
> > the page alignment code rounds
> >   (a) to 0-2MB   - aligning to the hugepage it's in
> >   (b) leaves as is
> >   (c) aligns to 0-2MB
> > 
> >   it then tries to coalesce (c) and (a) and notices (b) got in the way
> > and fails it.
> 
> I see, thanks.  The only bit I still haven't quite followed is how this
> failure results in a quiet vhost malfunction rather than a refusal to
> start vhost.

Because there's no way to fail in vhost_region_add_section other than to
abort;

if (mrs_gpa < prev_gpa_start) {
error_report("%s:Section rounded to %"PRIx64
 " prior to previous %"PRIx64,
 __func__, mrs_gpa, prev_gpa_start);
/* A way to cleanly fail here would be better */
return;
}

> > Given the guest can put Synic anywhere I'm not sure that changing it's
> > implementatino would help here.
> 
> There would be no (b) nor (separate) (c): synic would just refer to some
> memory straight from (a), regardless of its paging granularity.

Oh, if it's actually memory from main RAM, then sure, but I guess you'd
have to reserve that somehow to stop the OS using it.

> > (And changing it's implementation would probably break migration
> > compatibility).
> 
> I'm afraid I see no better option.

Migration compatibility!

Dave

> Thanks,
> Roman.
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




[RFC PATCH] tests/tcg: add a vtimer test for aarch64

2020-01-09 Thread Alex Bennée
Bug: https://bugs.launchpad.net/bugs/1859021

Signed-off-by: Alex Bennée 
---
 tests/tcg/aarch64/system/vtimer.c | 48 +++
 tests/tcg/aarch64/Makefile.softmmu-target |  4 ++
 2 files changed, 52 insertions(+)
 create mode 100644 tests/tcg/aarch64/system/vtimer.c

diff --git a/tests/tcg/aarch64/system/vtimer.c 
b/tests/tcg/aarch64/system/vtimer.c
new file mode 100644
index 000..42f2f7796c7
--- /dev/null
+++ b/tests/tcg/aarch64/system/vtimer.c
@@ -0,0 +1,48 @@
+/*
+ * Simple Virtual Timer Test
+ *
+ * Copyright (c) 2020 Linaro Ltd
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#include 
+#include 
+
+/* grabbed from Linux */
+#define __stringify_1(x...) #x
+#define __stringify(x...)   __stringify_1(x)
+
+#define read_sysreg(r) ({   \
+uint64_t __val; \
+asm volatile("mrs %0, " __stringify(r) : "=r" (__val)); \
+__val;  \
+})
+
+#define write_sysreg(r, v) do { \
+uint64_t __val = (uint64_t)(v); \
+asm volatile("msr " __stringify(r) ", %x0"  \
+ : : "rZ" (__val)); \
+} while (0)
+
+int main(void)
+{
+int i;
+
+ml_printf("VTimer Test\n");
+
+write_sysreg(cntvoff_el2, 1);
+write_sysreg(cntv_cval_el0, -1);
+write_sysreg(cntv_ctl_el0, 1);
+
+ml_printf("cntvoff_el2=%lx\n", read_sysreg(cntvoff_el2));
+ml_printf("cntv_cval_el0=%lx\n", read_sysreg(cntv_cval_el0));
+ml_printf("cntv_ctl_el0=%lx\n", read_sysreg(cntv_ctl_el0));
+
+/* Now read cval a few times */
+for (i = 0; i < 10; i++) {
+ml_printf("%d: cntv_cval_el0=%lx\n", i, read_sysreg(cntv_cval_el0));
+}
+
+return 0;
+}
diff --git a/tests/tcg/aarch64/Makefile.softmmu-target 
b/tests/tcg/aarch64/Makefile.softmmu-target
index 7b4eede3f07..62cdddbb215 100644
--- a/tests/tcg/aarch64/Makefile.softmmu-target
+++ b/tests/tcg/aarch64/Makefile.softmmu-target
@@ -62,3 +62,7 @@ run-memory-replay: memory-replay run-memory-record
  "$< on $(TARGET_NAME)")
 
 EXTRA_TESTS+=memory-record memory-replay
+
+# vtimer test
+QEMU_EL2_MACHINE=-machine virt,virtualization=on,gic-version=2 -cpu cortex-a57 
-smp 4
+run-vtimer: QEMU_OPTS=$(QEMU_EL2_MACHINE) $(QEMU_SEMIHOST)  -kernel
-- 
2.20.1




Re: [PATCH 2/2] hyperv/synic: Allocate as ram_device

2020-01-09 Thread Roman Kagan
On Thu, Jan 09, 2020 at 01:28:21PM +, Dr. David Alan Gilbert wrote:
> * Roman Kagan (rka...@virtuozzo.com) wrote:
> > On Thu, Jan 09, 2020 at 02:00:00PM +0100, Vitaly Kuznetsov wrote:
> > > "Dr. David Alan Gilbert"  writes:
> > > 
> > > > And I think vhost-user will fail if you have too many sections - and
> > > > the 16 sections from synic I think will blow the slots available.
> > > >
> > > 
> > > SynIC is percpu, it will allocate two 4k pages for every vCPU the guest
> > > has so we're potentially looking at hundreds of such regions.
> > 
> > Indeed.
> > 
> > I think my original idea to implement overlay pages word-for-word to the
> > HyperV spec was a mistake, as it lead to fragmentation and memslot
> > waste.
> > 
> > I'll look into reworking it without actually mapping extra pages over
> > the existing RAM, but achieving overlay semantics by just shoving the
> > *content* of the "overlaid" memory somewhere.
> > 
> > That said, I haven't yet fully understood how the reported issue came
> > about, and thus whether the proposed approach would resolve it too.
> 
> The problem happens when we end up with:
> 
>  a)  0-512k  RAM
>  b)  512k +  synic
>  c)  570kish-640k  RAM
> 
> the page alignment code rounds
>   (a) to 0-2MB   - aligning to the hugepage it's in
>   (b) leaves as is
>   (c) aligns to 0-2MB
> 
>   it then tries to coalesce (c) and (a) and notices (b) got in the way
> and fails it.

I see, thanks.  The only bit I still haven't quite followed is how this
failure results in a quiet vhost malfunction rather than a refusal to
start vhost.

> Given the guest can put Synic anywhere I'm not sure that changing it's
> implementatino would help here.

There would be no (b) nor (separate) (c): synic would just refer to some
memory straight from (a), regardless of its paging granularity.

> (And changing it's implementation would probably break migration
> compatibility).

I'm afraid I see no better option.

Thanks,
Roman.



Re: [PATCH v12 00/13] VIRTIO-IOMMU device

2020-01-09 Thread Auger Eric
Hi,

On 1/9/20 4:12 PM, no-re...@patchew.org wrote:
> Patchew URL: 
> https://patchew.org/QEMU/20200109144319.15912-1-eric.au...@redhat.com/
> 
> 
> 
> Hi,
> 
> This series failed the docker-quick@centos7 build test. Please find the 
> testing commands and
> their output below. If you have Docker installed, you can probably reproduce 
> it
> locally.
> 
> === TEST SCRIPT BEGIN ===
> #!/bin/bash
> make docker-image-centos7 V=1 NETWORK=1
> time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1
> === TEST SCRIPT END ===
> 
>   TESTcheck-qtest-x86_64: tests/test-hmp
>   TESTcheck-qtest-x86_64: tests/qos-test
> **
> ERROR:/tmp/qemu-test/src/tests/virtio-iommu-test.c:46:pci_config: assertion 
> failed (probe_size == 0x200): (0x == 0x0200)
> ERROR - Bail out! 
> ERROR:/tmp/qemu-test/src/tests/virtio-iommu-test.c:46:pci_config: assertion 
> failed (probe_size == 0x200): (0x == 0x0200)

OK sorry that's because I eventually removed "virtio-iommu: Implement
probe request" patch from the sent series. I will remove that probe_size
field test on next round.

Thanks

Eric
> make: *** [check-qtest-x86_64] Error 1
> make: *** Waiting for unfinished jobs
>   TESTcheck-qtest-aarch64: tests/test-hmp
>   TESTiotest-qcow2: 220
> ---
> raise CalledProcessError(retcode, cmd)
> subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', 
> '--label', 'com.qemu.instance.uuid=de2154161ccf4468af5ebd35cbcbdc03', '-u', 
> '1003', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', 
> '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', 
> '-e', 'SHOW_ENV=1', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', 
> '/home/patchew2/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', 
> '/var/tmp/patchew-tester-tmp-_ap6ulch/src/docker-src.2020-01-09-10.01.05.32472:/var/tmp/qemu:z,ro',
>  'qemu:centos7', '/var/tmp/qemu/run', 'test-quick']' returned non-zero exit 
> status 2.
> filter=--filter=label=com.qemu.instance.uuid=de2154161ccf4468af5ebd35cbcbdc03
> make[1]: *** [docker-run] Error 1
> make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-_ap6ulch/src'
> make: *** [docker-run-test-quick@centos7] Error 2
> 
> real11m50.898s
> user0m8.023s
> 
> 
> The full log is available at
> http://patchew.org/logs/20200109144319.15912-1-eric.au...@redhat.com/testing.docker-quick@centos7/?type=message.
> ---
> Email generated automatically by Patchew [https://patchew.org/].
> Please send your feedback to patchew-de...@redhat.com
> 




Re: [PATCH] virtio: Prevent double swap due to target pre 1.0 VirtIO

2020-01-09 Thread Greg Kurz
On Thu, 9 Jan 2020 07:39:17 -0500
"Michael S. Tsirkin"  wrote:

> On Thu, Jan 09, 2020 at 09:25:42AM -0300, André Silva wrote:
> > Hi Michael!
> > Thanks for reviewing the patch!
> > 
> > > we always get LE values from memory subsystem,
> > > not target endian values:
> > 
> > I see. So do you think the patch is correct in eliminating the extra
> > swap (as virtio_config_readw for example already makes a swap)?
> > 
> > Thanks,
> > andré
> 
> I don't think it is, I think we do need an extra swap
> in some cases. It's possible that some cross-endian
> setups are broken now, if so pls include testing
> result not just theoretical analysis.
> 

I confirm that we must keep the extra swap otherwise
read/write in cross-endian setups will have wrong
endian. Please read this commit for a more detailed
explanation:

commit 82afa58641b0e67abbaf4da6c325ebd7c2513262
Author: Benjamin Herrenschmidt 
Date:   Tue Jan 10 01:35:11 2012 +

virtio-pci: Fix endianness of virtio config

https://git.qemu.org/?p=qemu.git;a=commitdiff;h=82afa58641b0e67abbaf4da6c325ebd7c2513262

This is especially critical on ppc64 since _all_ hosts are now LE
but the first piece of code in the guest that is likely to drive
the device is the SLOF firmware which is BE.

This is what we get with this patch when trying to run a pseries guest on a
ppc64le host:

Trying to load:  from: /pci@8002000/scsi@0 ... virtioblk_transfer: 
Access beyond end of device!

Cheers,

--
Greg

> > On Thu, Jan 9, 2020 at 7:50 AM Michael S. Tsirkin  wrote:
> > >
> > > On Wed, Jan 08, 2020 at 01:16:18PM -0300, Andre Silva wrote:
> > > > Remove the bswap function calls after reading and before writing
> > > > memory bytes in virtio_pci_config_read and virtio_pci_config_write
> > > > because they are reverting back an already swapped bytes.
> > > >
> > > > Consider the table below in the context of virtio_pci_config_read
> > > > function.
> > > >
> > > > Host   Target  virtio-config-read[wl]
> > > >swap?   virtio-is-big-endian?   extra 
> > > > bswap?   Should be   Final result   Final result ok?
> > > > - ---  --- 
> > > > -- --- -- --
> > > > LE BE  s(x)trues(s(x))  
> > > >   s(x)x  No
> > > > LE LE  x   false   -
> > > >   x   x  Yes
> > > > BE LE  s(x)false   -
> > > >   s(x)s(x)   Yes
> > > > BE BE  x   trues(x) 
> > > >   x   s(x)   No
> > >
> > > we always get LE values from memory subsystem,
> > > not target endian values:
> > >
> > > static const MemoryRegionOps virtio_pci_config_ops = {
> > > .read = virtio_pci_config_read,
> > > .write = virtio_pci_config_write,
> > > .impl = {
> > > .min_access_size = 1,
> > > .max_access_size = 4,
> > > },
> > > .endianness = DEVICE_LITTLE_ENDIAN,
> > > };
> > >
> > >
> > > This triggers another swap in address_space_ldl_internal
> > > (memory_ldst.inc.c).
> > >
> > >
> > > > In table above, when target is big endian and VirtIO is pre 1.0,
> > > > function virtio_is_big_endian would return true and the extra
> > > > swap would be executed, reverting the previous swap made by
> > > > virtio_config_read[wl].
> > > >
> > > > The 's(x)' means that a swap function was applied at
> > > > address x. 'LE' is little endian and 'BE' is big endian. The
> > > > 'Final result' column is the returned value from
> > > > virtio_pci_config_read, considering a target Virtio pre 1.0.
> > > > 'x' means that target's value was not swapped in Qemu, 's(x)' means
> > > > that Qemu will use a swapped value.
> > > >
> > > > If we remove the extra swap made in virtio_pci_config_read we will
> > > > have the correct result in any host/target combination, both for
> > > > VirtIO pre 1.0 or later versions.
> > > >
> > > > The same reasoning applies to virtio_pci_config_write.
> > > >
> > > > Signed-off-by: Andre Silva 
> > > > ---
> > > >  hw/virtio/virtio-pci.c | 12 
> > > >  1 file changed, 12 deletions(-)
> > > >
> > > > diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> > > > index c6b47a9c73..4ba9e847f3 100644
> > > > --- a/hw/virtio/virtio-pci.c
> > > > +++ b/hw/virtio/virtio-pci.c
> > > > @@ -431,15 +431,9 @@ static uint64_t virtio_pci_config_read(void 
> > > > *opaque, hwaddr addr,
> > > >  break;
> > > >  case 2:
> > > >  val = virtio_config_readw(vdev, addr);
> > > > -if (virtio_is_big_endian(vdev)) {
> > > > -val = bswap16(val);
> > > > -}
> > > >  break;
> > > >  case 4:
> > > >  val = virtio_config_readl(vdev, addr);
> > > > -if (virtio_is_big_endian(vdev)) {
> > > > -   

Re: [PATCH 06/15] migration/savevm: Replace current_machine by qdev_get_machine()

2020-01-09 Thread Dr. David Alan Gilbert
* Philippe Mathieu-Daudé (phi...@redhat.com) wrote:
> As we want to remove the global current_machine,
> replace MACHINE_GET_CLASS(current_machine) by
> MACHINE_GET_CLASS(qdev_get_machine()).
> 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  migration/savevm.c | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/migration/savevm.c b/migration/savevm.c
> index 59efc1981d..0e8b6a4715 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -292,7 +292,8 @@ static uint32_t get_validatable_capabilities_count(void)
>  static int configuration_pre_save(void *opaque)
>  {
>  SaveState *state = opaque;
> -const char *current_name = MACHINE_GET_CLASS(current_machine)->name;
> +MachineClass *mc = MACHINE_GET_CLASS(qdev_get_machine());
> +const char *current_name = mc->name;

For migration:

Acked-by: Dr. David Alan Gilbert 

(Personally I'd keep it on one line, but that's just taste)


>  MigrationState *s = migrate_get_current();
>  int i, j;
>  
> @@ -362,7 +363,8 @@ static bool configuration_validate_capabilities(SaveState 
> *state)
>  static int configuration_post_load(void *opaque, int version_id)
>  {
>  SaveState *state = opaque;
> -const char *current_name = MACHINE_GET_CLASS(current_machine)->name;
> +MachineClass *mc = MACHINE_GET_CLASS(qdev_get_machine());
> +const char *current_name = mc->name;
>  
>  if (strncmp(state->name, current_name, state->len) != 0) {
>  error_report("Machine type received is '%.*s' and local is '%s'",
> @@ -615,9 +617,7 @@ static void dump_vmstate_vmsd(FILE *out_file,
>  
>  static void dump_machine_type(FILE *out_file)
>  {
> -MachineClass *mc;
> -
> -mc = MACHINE_GET_CLASS(current_machine);
> +MachineClass *mc = MACHINE_GET_CLASS(qdev_get_machine());
>  
>  fprintf(out_file, "  \"vmschkmachine\": {\n");
>  fprintf(out_file, "\"Name\": \"%s\"\n", mc->name);
> -- 
> 2.21.1
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




Re: [PATCH 2/2] hyperv/synic: Allocate as ram_device

2020-01-09 Thread Vitaly Kuznetsov
Paolo Bonzini  writes:

> On 09/01/20 14:22, Dr. David Alan Gilbert wrote:
>> * Michael S. Tsirkin (m...@redhat.com) wrote:
>>> On Thu, Jan 09, 2020 at 12:22:37PM +, Dr. David Alan Gilbert wrote:
 Do we want a new memory_region_init for that or just to be able to add
 a flag?

>>> I think a flag API is preferable since this can apply to any kind of
>>> region. But can go either way, Paolo's the maintainer there.
>> 
>> (Copying Paolo in)
>> So what exactly does this flag mean; to me it's 'no vhost' - but is it
>> actually more general?
>
> It has two more effects in addition to no vhost:
>
> 1) it is skipped when dumping the guest (is this a good or bad idea for
> SynIC?)

Imagine we have an not yet consumed message sitting in message page, or
a signalled event, do I understand correctly that these are going to get
lost upon migration? This may not work then -- unless we transfer
in-QEMU synic state somehow separately.

-- 
Vitaly




[PATCH 2/4] Makefile: Clarify all the codebase requires qom/ objects

2020-01-09 Thread Philippe Mathieu-Daudé
QEMU user-mode also requires the qom/ objects, it is not only
used by "system emulation and qemu-img". As we will use a big
if() block, move it upper in the "Common libraries for tools
and emulators" section.

Signed-off-by: Philippe Mathieu-Daudé 
---
 Makefile.objs | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/Makefile.objs b/Makefile.objs
index 7c1e50f9d6..5aae561984 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -2,6 +2,7 @@
 # Common libraries for tools and emulators
 stub-obj-y = stubs/
 util-obj-y = crypto/ util/ qobject/ qapi/
+qom-obj-y = qom/
 
 chardev-obj-y = chardev/
 
@@ -26,11 +27,6 @@ block-obj-m = block/
 
 crypto-obj-y = crypto/
 
-###
-# qom-obj-y is code used by both qemu system emulation and qemu-img
-
-qom-obj-y = qom/
-
 ###
 # io-obj-y is code used by both qemu system emulation and qemu-img
 
-- 
2.21.1




Re: [PATCH 2/2] hyperv/synic: Allocate as ram_device

2020-01-09 Thread Dr. David Alan Gilbert
* Paolo Bonzini (pbonz...@redhat.com) wrote:
> On 09/01/20 14:22, Dr. David Alan Gilbert wrote:
> > * Michael S. Tsirkin (m...@redhat.com) wrote:
> >> On Thu, Jan 09, 2020 at 12:22:37PM +, Dr. David Alan Gilbert wrote:
> >>> Do we want a new memory_region_init for that or just to be able to add
> >>> a flag?
> >>>
> >> I think a flag API is preferable since this can apply to any kind of
> >> region. But can go either way, Paolo's the maintainer there.
> > 
> > (Copying Paolo in)
> > So what exactly does this flag mean; to me it's 'no vhost' - but is it
> > actually more general?
> 
> It has two more effects in addition to no vhost:
> 
> 1) it is skipped when dumping the guest (is this a good or bad idea for
> SynIC?)
> 
> 2) accesses to the region will use the specified size (e.g. 4-bytes for
> address_space_stl, 1-byte for address_space_stb) instead of a memcpy.
> Doesn't really matter for SynIC regions.
> 
> If (1) is a good idea, then it's 2 out of 3 and I guess the patch is okay.

It's probably best  to keep them in the dump because they give some info
on the current state of the windows guest and interrupts.
Also, as Roman points out the ram-device pages aren't migrated, so we
need to fix that as well.

So, do we add a new flag? If so, is no-vhost what we want?

Dave

> Paolo
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




[PATCH 3/4] Makefile: Restrict system emulation and tools objects

2020-01-09 Thread Philippe Mathieu-Daudé
Restrict all the system emulation and tools objects with a
Makefile IF (CONFIG_SOFTMMU OR CONFIG_TOOLS) check.

Using the same description over and over is not very helpful.
Use it once, just before the if() block.

Signed-off-by: Philippe Mathieu-Daudé 
---
 Makefile.objs | 19 +++
 1 file changed, 7 insertions(+), 12 deletions(-)

diff --git a/Makefile.objs b/Makefile.objs
index 5aae561984..395dd1e670 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -4,16 +4,15 @@ stub-obj-y = stubs/
 util-obj-y = crypto/ util/ qobject/ qapi/
 qom-obj-y = qom/
 
+###
+# code used by both qemu system emulation and qemu-img
+
+ifeq ($(call lor,$(CONFIG_SOFTMMU),$(CONFIG_TOOLS)),y)
+
 chardev-obj-y = chardev/
 
-###
-# authz-obj-y is code used by both qemu system emulation and qemu-img
-
 authz-obj-y = authz/
 
-###
-# block-obj-y is code used by both qemu system emulation and qemu-img
-
 block-obj-y = nbd/
 block-obj-y += block.o blockjob.o job.o
 block-obj-y += block/ scsi/
@@ -22,16 +21,12 @@ block-obj-$(CONFIG_REPLICATION) += replication.o
 
 block-obj-m = block/
 
-###
-# crypto-obj-y is code used by both qemu system emulation and qemu-img
-
 crypto-obj-y = crypto/
 
-###
-# io-obj-y is code used by both qemu system emulation and qemu-img
-
 io-obj-y = io/
 
+endif # CONFIG_SOFTMMU or CONFIG_TOOLS
+
 ##
 # Target independent part of system emulation. The long term path is to
 # suppress *all* target specific code in case of system emulation, i.e. a
-- 
2.21.1




[PATCH 4/4] Makefile: Remove unhelpful comment

2020-01-09 Thread Philippe Mathieu-Daudé
It is pointless to keep qapi/ object separate from the other
common-objects. Drop the comment.

Signed-off-by: Philippe Mathieu-Daudé 
---
 Makefile.objs | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/Makefile.objs b/Makefile.objs
index 395dd1e670..c6321d0465 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -71,11 +71,9 @@ qemu-seccomp.o-libs := $(SECCOMP_LIBS)
 
 common-obj-$(CONFIG_FDT) += device_tree.o
 
-##
-# qapi
-
 common-obj-y += qapi/
-endif
+
+endif # CONFIG_SOFTMMU
 
 ###
 # Target-independent parts used in system and user emulation
-- 
2.21.1




[PATCH 0/4] buildsys: Build quicker (mostly tools and linux-user)

2020-01-09 Thread Philippe Mathieu-Daudé
In some configuration (linux-user, tools) we can ignore building
various objects (and the libfdt).

Philippe Mathieu-Daudé (4):
  configure: Do not build libfdt is not required
  Makefile: Clarify all the codebase requires qom/ objects
  Makefile: Restrict system emulation and tools objects
  Makefile: Remove unhelpful comment

 configure |  2 ++
 Makefile.objs | 31 ++-
 2 files changed, 12 insertions(+), 21 deletions(-)

-- 
2.21.1




[PATCH 1/4] configure: Do not build libfdt is not required

2020-01-09 Thread Philippe Mathieu-Daudé
We only require libfdt for system emulation, in a small set
of architecture:

4077  # fdt support is mandatory for at least some target architectures,
4078  # so insist on it if we're building those system emulators.
4079  fdt_required=no
4080  for target in $target_list; do
4081case $target in
4082  
aarch64*-softmmu|arm*-softmmu|ppc*-softmmu|microblaze*-softmmu|mips64el-softmmu|riscv*-softmmu)
4083fdt_required=yes

Do not build libfdt if we did not manually specified --enable-fdt.

Signed-off-by: Philippe Mathieu-Daudé 
---
 configure | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/configure b/configure
index 0ce2c0354a..266a8386d1 100755
--- a/configure
+++ b/configure
@@ -4092,6 +4092,8 @@ if test "$fdt_required" = "yes"; then
   "targets which need it (by specifying a cut down --target-list)."
   fi
   fdt=yes
+elif test "$fdt" != "yes" ; then
+  fdt=no
 fi
 
 if test "$fdt" != "no" ; then
-- 
2.21.1




[PATCH 14/15] accel/accel: Replace current_machine by qdev_get_machine()

2020-01-09 Thread Philippe Mathieu-Daudé
As we want to remove the global current_machine,
replace 'current_machine' by MACHINE(qdev_get_machine()).

Signed-off-by: Philippe Mathieu-Daudé 
---
 accel/accel.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/accel/accel.c b/accel/accel.c
index cb555e3b06..777d6ba119 100644
--- a/accel/accel.c
+++ b/accel/accel.c
@@ -65,7 +65,9 @@ int accel_init_machine(AccelState *accel, MachineState *ms)
 
 AccelState *current_accel(void)
 {
-return current_machine->accelerator;
+MachineState *ms = MACHINE(qdev_get_machine());
+
+return ms->accelerator;
 }
 
 void accel_setup_post(MachineState *ms)
-- 
2.21.1




[PATCH 13/15] accel: Replace current_machine->accelerator by current_accel() method

2020-01-09 Thread Philippe Mathieu-Daudé
As we want to remove the global current_machine,
replace 'current_machine->accelerator' by current_accel().

Signed-off-by: Philippe Mathieu-Daudé 
---
 accel/kvm/kvm-all.c | 4 ++--
 accel/tcg/tcg-all.c | 2 +-
 memory.c| 2 +-
 target/arm/kvm64.c  | 4 ++--
 target/i386/kvm.c   | 2 +-
 target/ppc/kvm.c| 2 +-
 vl.c| 2 +-
 7 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index b2f1a5bcb5..be1980f250 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -164,7 +164,7 @@ static NotifierList kvm_irqchip_change_notifiers =
 
 int kvm_get_max_memslots(void)
 {
-KVMState *s = KVM_STATE(current_machine->accelerator);
+KVMState *s = KVM_STATE(current_accel());
 
 return s->nr_slots;
 }
@@ -1847,7 +1847,7 @@ static int kvm_max_vcpu_id(KVMState *s)
 
 bool kvm_vcpu_id_is_valid(int vcpu_id)
 {
-KVMState *s = KVM_STATE(current_machine->accelerator);
+KVMState *s = KVM_STATE(current_accel());
 return vcpu_id >= 0 && vcpu_id < kvm_max_vcpu_id(s);
 }
 
diff --git a/accel/tcg/tcg-all.c b/accel/tcg/tcg-all.c
index 1dc384c8d2..1802ce02f6 100644
--- a/accel/tcg/tcg-all.c
+++ b/accel/tcg/tcg-all.c
@@ -124,7 +124,7 @@ static void tcg_accel_instance_init(Object *obj)
 
 static int tcg_init(MachineState *ms)
 {
-TCGState *s = TCG_STATE(current_machine->accelerator);
+TCGState *s = TCG_STATE(current_accel());
 
 tcg_exec_init(s->tb_size * 1024 * 1024);
 cpu_interrupt_handler = tcg_handle_interrupt;
diff --git a/memory.c b/memory.c
index 57e38b1f50..60e8993499 100644
--- a/memory.c
+++ b/memory.c
@@ -3106,7 +3106,7 @@ void mtree_info(bool flatview, bool dispatch_tree, bool 
owner)
 };
 GArray *fv_address_spaces;
 GHashTable *views = g_hash_table_new(g_direct_hash, g_direct_equal);
-AccelClass *ac = ACCEL_GET_CLASS(current_machine->accelerator);
+AccelClass *ac = ACCEL_GET_CLASS(current_accel());
 
 if (ac->has_memory) {
 fvi.ac = ac;
diff --git a/target/arm/kvm64.c b/target/arm/kvm64.c
index 876184b8fe..f677877a1e 100644
--- a/target/arm/kvm64.c
+++ b/target/arm/kvm64.c
@@ -613,14 +613,14 @@ bool kvm_arm_get_host_cpu_features(ARMHostCPUFeatures 
*ahcf)
 
 bool kvm_arm_aarch32_supported(CPUState *cpu)
 {
-KVMState *s = KVM_STATE(current_machine->accelerator);
+KVMState *s = KVM_STATE(current_accel());
 
 return kvm_check_extension(s, KVM_CAP_ARM_EL1_32BIT);
 }
 
 bool kvm_arm_sve_supported(CPUState *cpu)
 {
-KVMState *s = KVM_STATE(current_machine->accelerator);
+KVMState *s = KVM_STATE(current_accel());
 
 return kvm_check_extension(s, KVM_CAP_ARM_SVE);
 }
diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index 0b511906e3..2ed15814dc 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -147,7 +147,7 @@ bool kvm_allows_irq0_override(void)
 
 static bool kvm_x2apic_api_set_flags(uint64_t flags)
 {
-KVMState *s = KVM_STATE(current_machine->accelerator);
+KVMState *s = KVM_STATE(current_accel());
 
 return !kvm_vm_enable_cap(s, KVM_CAP_X2APIC_API, 0, flags);
 }
diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
index d1c334f0e3..2d011308e0 100644
--- a/target/ppc/kvm.c
+++ b/target/ppc/kvm.c
@@ -258,7 +258,7 @@ static void kvm_get_smmu_info(struct kvm_ppc_smmu_info 
*info, Error **errp)
 
 struct ppc_radix_page_info *kvm_get_radix_page_info(void)
 {
-KVMState *s = KVM_STATE(current_machine->accelerator);
+KVMState *s = KVM_STATE(current_accel());
 struct ppc_radix_page_info *radix_page_info;
 struct kvm_ppc_rmmu_info rmmu_info;
 int i;
diff --git a/vl.c b/vl.c
index 86474a55c9..3ff3548183 100644
--- a/vl.c
+++ b/vl.c
@@ -2804,7 +2804,7 @@ static void configure_accelerators(const char *progname)
 }
 
 if (init_failed) {
-AccelClass *ac = ACCEL_GET_CLASS(current_machine->accelerator);
+AccelClass *ac = ACCEL_GET_CLASS(current_accel());
 error_report("falling back to %s", ac->name);
 }
 
-- 
2.21.1




[PATCH 11/15] exec: Replace current_machine by qdev_get_machine()

2020-01-09 Thread Philippe Mathieu-Daudé
As we want to remove the global current_machine,
replace 'current_machine' by MACHINE(qdev_get_machine()).

Signed-off-by: Philippe Mathieu-Daudé 
---
 exec.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/exec.c b/exec.c
index d4b769d0d4..98f5b049ca 100644
--- a/exec.c
+++ b/exec.c
@@ -1984,11 +1984,11 @@ static unsigned long last_ram_page(void)
 
 static void qemu_ram_setup_dump(void *addr, ram_addr_t size)
 {
-int ret;
+MachineState *ms = MACHINE(qdev_get_machine());
 
 /* Use MADV_DONTDUMP, if user doesn't want the guest memory in the core */
-if (!machine_dump_guest_core(current_machine)) {
-ret = qemu_madvise(addr, size, QEMU_MADV_DONTDUMP);
+if (!machine_dump_guest_core(ms)) {
+int ret = qemu_madvise(addr, size, QEMU_MADV_DONTDUMP);
 if (ret) {
 perror("qemu_madvise");
 fprintf(stderr, "madvise doesn't support MADV_DONTDUMP, "
@@ -2108,7 +2108,9 @@ size_t qemu_ram_pagesize_largest(void)
 
 static int memory_try_enable_merging(void *addr, size_t len)
 {
-if (!machine_mem_merge(current_machine)) {
+MachineState *ms = MACHINE(qdev_get_machine());
+
+if (!machine_mem_merge(ms)) {
 /* disabled by the user */
 return 0;
 }
-- 
2.21.1




[Bug 1841491] Re: floating point emulation can fail to set FE_UNDERFLOW

2020-01-09 Thread Paul Clarke
Comment #5 suggested splitting the "float" issue to a separate bug,
which was done some time ago (bug #1841592).

I think this ticket can be closed.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1841491

Title:
  floating point emulation can fail to set FE_UNDERFLOW

Status in QEMU:
  Confirmed

Bug description:
  Floating point emulation can fail to set FE_UNDERFLOW in some
  circumstances. This shows up often in glibc's "math" tests. A similar
  test is attached.

  This is similar to bug #1841442, but not the same problem, and I don't
  think the fix will be in the same code.

  On ppc64le native:
  --
  $ gcc -c -O2 fma.c
  $ gcc -O2 test-fma.c fma.o -lm -o test-fma
  $ ./test-fma $(./test-fma)
  fma(0x1.cp-1022, 0x1.1p-1, 0x0.1p-1022)
  0x0

  0xa00
  FE_INEXACT FE_UNDERFLOW 
  0x1p-1022
  --

  On qemu-system-ppc64:
  --
  $ ./test-fma $(./test-fma)
  fma(0x1.cp-1022, 0x1.1p-1, 0x0.1p-1022)
  0x0

  0x200
  FE_INEXACT 
  0x1p-1022
  --

  QEMU versions vary, but not too much, and are pretty close to git HEAD:
  - 586f3dced9 (HEAD -> master, origin/master, origin/HEAD) Merge 
remote-tracking branch 'remotes/cohuck/tags/s390x-20190822' into staging
  - 864ab31 Update version for v4.1.0-rc4 release

  There are worse symptoms on qemu-x86_64, but this is apparently not
  surprising per
  https://bugs.launchpad.net/qemu/+bug/1841442/comments/6.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1841491/+subscriptions



Re: Difference between 'current_machine' vs MACHINE(qdev_get_machine())

2020-01-09 Thread Like Xu

On 2020/1/9 20:01, Paolo Bonzini wrote:

On 09/01/20 12:23, Philippe Mathieu-Daudé wrote:



     current_machine =
MACHINE(object_new_with_class(OBJECT_CLASS(machine_class)));
     object_property_add_child(object_get_root(), "machine",
   OBJECT(current_machine), _abort);

The bigger user of 'current_machine' is the accel/KVM code.

Recently in a0628599f..cc7d44c2e0 "Replace global smp variables with
machine smp properties" we started to use MACHINE(qdev_get_machine()).

qdev_get_machine() resolves the machine in the QOM composition tree.
I am confused by this comment:

   /* qdev_get_machine() can return something that's not TYPE_MACHINE
    * if this is one of the user-only emulators; in that case there's
    * no need to check the ignore_memory_transaction_failures board flag.
    */

Following a0628599f..cc7d44c2e0, a5e0b33119 use 'current_machine' again.

What are the differences between both form, when should we use one or
another (or can we use a single one?). Can this break user-only mode?


I would always use MACHINE(qdev_get_machine()), espeecially outside
vl.c.  Ideally, current_machine would be static within vl.c or even
unused outside the object_property_add_child() that you quote above.

Most of the times, I noticed from a quick grep, we actually want to
access the accelerator, not the machine, so we could add a
qemu_get_accelerator() wrapper that does
MACHINE(qdev_get_machine())->accelerator.

Paolo



I prefer to use MACHINE(qdev_get_machine()) wherever possible.

However, the qdev_get_machine() would return non TYPE_MACHINE object if:
- call qdev_get_machine() before we do 
"object_property_add_child(object_get_root(), "machine", 
OBJECT(current_machine), _abort);" in vl.c;

- or in the context with '#ifdef CONFIG_USER_ONLY';

Thanks,
Like Xu




[PATCH 09/15] device_tree: Replace current_machine by qdev_get_machine()

2020-01-09 Thread Philippe Mathieu-Daudé
As we want to remove the global current_machine,
replace 'current_machine' by MACHINE(qdev_get_machine()).

Signed-off-by: Philippe Mathieu-Daudé 
---
 device_tree.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/device_tree.c b/device_tree.c
index f8b46b3c73..665ea2f586 100644
--- a/device_tree.c
+++ b/device_tree.c
@@ -466,7 +466,9 @@ uint32_t qemu_fdt_alloc_phandle(void *fdt)
  * which phandle id to start allocating phandles.
  */
 if (!phandle) {
-phandle = machine_phandle_start(current_machine);
+MachineState *ms = MACHINE(qdev_get_machine());
+
+phandle = machine_phandle_start(ms);
 }
 
 if (!phandle) {
-- 
2.21.1




[PATCH 15/15] vl: Make current_machine a local variable

2020-01-09 Thread Philippe Mathieu-Daudé
Since we now only use current_machine in vl.c, stop exporting
it as a global variable in "hw/board.h", and make it static
to vl.c.

Signed-off-by: Philippe Mathieu-Daudé 
---
 include/hw/boards.h | 2 --
 vl.c| 4 ++--
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/include/hw/boards.h b/include/hw/boards.h
index 61f8bb8e5a..b0c0d4376d 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -59,8 +59,6 @@ void memory_region_allocate_system_memory(MemoryRegion *mr, 
Object *owner,
 #define MACHINE_CLASS(klass) \
 OBJECT_CLASS_CHECK(MachineClass, (klass), TYPE_MACHINE)
 
-extern MachineState *current_machine;
-
 void machine_run_board_init(MachineState *machine);
 bool machine_usb(MachineState *machine);
 int machine_phandle_start(MachineState *machine);
diff --git a/vl.c b/vl.c
index 3ff3548183..7a69af4bef 100644
--- a/vl.c
+++ b/vl.c
@@ -214,6 +214,8 @@ static int default_sdcard = 1;
 static int default_vga = 1;
 static int default_net = 1;
 
+static MachineState *current_machine;
+
 static struct {
 const char *driver;
 int *flag;
@@ -1164,8 +1166,6 @@ static int usb_parse(const char *cmdline)
 /***/
 /* machine registration */
 
-MachineState *current_machine;
-
 static MachineClass *find_machine(const char *name, GSList *machines)
 {
 GSList *el;
-- 
2.21.1




Re: [PATCH 2/2] hyperv/synic: Allocate as ram_device

2020-01-09 Thread Paolo Bonzini
On 09/01/20 14:22, Dr. David Alan Gilbert wrote:
> * Michael S. Tsirkin (m...@redhat.com) wrote:
>> On Thu, Jan 09, 2020 at 12:22:37PM +, Dr. David Alan Gilbert wrote:
>>> Do we want a new memory_region_init for that or just to be able to add
>>> a flag?
>>>
>> I think a flag API is preferable since this can apply to any kind of
>> region. But can go either way, Paolo's the maintainer there.
> 
> (Copying Paolo in)
> So what exactly does this flag mean; to me it's 'no vhost' - but is it
> actually more general?

It has two more effects in addition to no vhost:

1) it is skipped when dumping the guest (is this a good or bad idea for
SynIC?)

2) accesses to the region will use the specified size (e.g. 4-bytes for
address_space_stl, 1-byte for address_space_stb) instead of a memcpy.
Doesn't really matter for SynIC regions.

If (1) is a good idea, then it's 2 out of 3 and I guess the patch is okay.

Paolo




[PATCH 06/15] migration/savevm: Replace current_machine by qdev_get_machine()

2020-01-09 Thread Philippe Mathieu-Daudé
As we want to remove the global current_machine,
replace MACHINE_GET_CLASS(current_machine) by
MACHINE_GET_CLASS(qdev_get_machine()).

Signed-off-by: Philippe Mathieu-Daudé 
---
 migration/savevm.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index 59efc1981d..0e8b6a4715 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -292,7 +292,8 @@ static uint32_t get_validatable_capabilities_count(void)
 static int configuration_pre_save(void *opaque)
 {
 SaveState *state = opaque;
-const char *current_name = MACHINE_GET_CLASS(current_machine)->name;
+MachineClass *mc = MACHINE_GET_CLASS(qdev_get_machine());
+const char *current_name = mc->name;
 MigrationState *s = migrate_get_current();
 int i, j;
 
@@ -362,7 +363,8 @@ static bool configuration_validate_capabilities(SaveState 
*state)
 static int configuration_post_load(void *opaque, int version_id)
 {
 SaveState *state = opaque;
-const char *current_name = MACHINE_GET_CLASS(current_machine)->name;
+MachineClass *mc = MACHINE_GET_CLASS(qdev_get_machine());
+const char *current_name = mc->name;
 
 if (strncmp(state->name, current_name, state->len) != 0) {
 error_report("Machine type received is '%.*s' and local is '%s'",
@@ -615,9 +617,7 @@ static void dump_vmstate_vmsd(FILE *out_file,
 
 static void dump_machine_type(FILE *out_file)
 {
-MachineClass *mc;
-
-mc = MACHINE_GET_CLASS(current_machine);
+MachineClass *mc = MACHINE_GET_CLASS(qdev_get_machine());
 
 fprintf(out_file, "  \"vmschkmachine\": {\n");
 fprintf(out_file, "\"Name\": \"%s\"\n", mc->name);
-- 
2.21.1




[PATCH 07/15] hw/core/machine-qmp-cmds: Replace current_machine by qdev_get_machine()

2020-01-09 Thread Philippe Mathieu-Daudé
As we want to remove the global current_machine,
replace MACHINE_GET_CLASS(current_machine) by
MACHINE_GET_CLASS(qdev_get_machine()).

Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/core/machine-qmp-cmds.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/core/machine-qmp-cmds.c b/hw/core/machine-qmp-cmds.c
index eed5aeb2f7..5a04d00e4f 100644
--- a/hw/core/machine-qmp-cmds.c
+++ b/hw/core/machine-qmp-cmds.c
@@ -280,9 +280,9 @@ void qmp_cpu_add(int64_t id, Error **errp)
 {
 MachineClass *mc;
 
-mc = MACHINE_GET_CLASS(current_machine);
+mc = MACHINE_GET_CLASS(qdev_get_machine());
 if (mc->hot_add_cpu) {
-mc->hot_add_cpu(current_machine, id, errp);
+mc->hot_add_cpu(MACHINE(qdev_get_machine()), id, errp);
 } else {
 error_setg(errp, "Not supported");
 }
-- 
2.21.1




[PATCH 10/15] memory: Replace current_machine by qdev_get_machine()

2020-01-09 Thread Philippe Mathieu-Daudé
As we want to remove the global current_machine,
replace 'current_machine' by MACHINE(qdev_get_machine()).

Signed-off-by: Philippe Mathieu-Daudé 
---
 memory.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/memory.c b/memory.c
index d7b9bb6951..57e38b1f50 100644
--- a/memory.c
+++ b/memory.c
@@ -3004,6 +3004,7 @@ static void mtree_print_flatview(gpointer key, gpointer 
value,
 int n = view->nr;
 int i;
 AddressSpace *as;
+MachineState *ms;
 
 qemu_printf("FlatView #%d\n", fvi->counter);
 ++fvi->counter;
@@ -3026,6 +3027,7 @@ static void mtree_print_flatview(gpointer key, gpointer 
value,
 return;
 }
 
+ms = MACHINE(qdev_get_machine());
 while (n--) {
 mr = range->mr;
 if (range->offset_in_region) {
@@ -3057,7 +3059,7 @@ static void mtree_print_flatview(gpointer key, gpointer 
value,
 if (fvi->ac) {
 for (i = 0; i < fv_address_spaces->len; ++i) {
 as = g_array_index(fv_address_spaces, AddressSpace*, i);
-if (fvi->ac->has_memory(current_machine, as,
+if (fvi->ac->has_memory(ms, as,
 int128_get64(range->addr.start),
 MR_SIZE(range->addr.size) + 1)) {
 qemu_printf(" %s", fvi->ac->name);
-- 
2.21.1




[PATCH 12/15] accel: Introduce the current_accel() method

2020-01-09 Thread Philippe Mathieu-Daudé
We want to remove the global current_machine. The accel/
code access few times current_machine->accelerator. Introduce
the current_accel() method first, it will then be easier to
replace 'current_machine' by MACHINE(qdev_get_machine()).

Signed-off-by: Philippe Mathieu-Daudé 
---
 include/sysemu/accel.h | 2 ++
 accel/accel.c  | 5 +
 2 files changed, 7 insertions(+)

diff --git a/include/sysemu/accel.h b/include/sysemu/accel.h
index d4c1429711..47e5788530 100644
--- a/include/sysemu/accel.h
+++ b/include/sysemu/accel.h
@@ -70,4 +70,6 @@ int accel_init_machine(AccelState *accel, MachineState *ms);
 /* Called just before os_setup_post (ie just before drop OS privs) */
 void accel_setup_post(MachineState *ms);
 
+AccelState *current_accel(void);
+
 #endif
diff --git a/accel/accel.c b/accel/accel.c
index 1c5c3a6abb..cb555e3b06 100644
--- a/accel/accel.c
+++ b/accel/accel.c
@@ -63,6 +63,11 @@ int accel_init_machine(AccelState *accel, MachineState *ms)
 return ret;
 }
 
+AccelState *current_accel(void)
+{
+return current_machine->accelerator;
+}
+
 void accel_setup_post(MachineState *ms)
 {
 AccelState *accel = ms->accelerator;
-- 
2.21.1




[PATCH 05/15] device-hotplug: Replace current_machine by qdev_get_machine()

2020-01-09 Thread Philippe Mathieu-Daudé
As we want to remove the global current_machine,
replace MACHINE_GET_CLASS(current_machine) by
MACHINE_GET_CLASS(qdev_get_machine()).

Signed-off-by: Philippe Mathieu-Daudé 
---
 device-hotplug.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/device-hotplug.c b/device-hotplug.c
index f01d53774b..44d687f254 100644
--- a/device-hotplug.c
+++ b/device-hotplug.c
@@ -45,7 +45,7 @@ static DriveInfo *add_init_drive(const char *optstr)
 if (!opts)
 return NULL;
 
-mc = MACHINE_GET_CLASS(current_machine);
+mc = MACHINE_GET_CLASS(qdev_get_machine());
 dinfo = drive_new(opts, mc->block_default_type, );
 if (err) {
 error_report_err(err);
-- 
2.21.1




[PATCH 04/15] hw/ppc/spapr_rtas: Restrict variables scope to single switch case

2020-01-09 Thread Philippe Mathieu-Daudé
We only access these variables in RTAS_SYSPARM_SPLPAR_CHARACTERISTICS
case, restrict their scope to avoid unnecessary initialization.

Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/ppc/spapr_rtas.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
index 6f06e9d7fe..7237e5ebf2 100644
--- a/hw/ppc/spapr_rtas.c
+++ b/hw/ppc/spapr_rtas.c
@@ -267,8 +267,6 @@ static void rtas_ibm_get_system_parameter(PowerPCCPU *cpu,
   uint32_t nret, target_ulong rets)
 {
 PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
-MachineState *ms = MACHINE(spapr);
-unsigned int max_cpus = ms->smp.max_cpus;
 target_ulong parameter = rtas_ld(args, 0);
 target_ulong buffer = rtas_ld(args, 1);
 target_ulong length = rtas_ld(args, 2);
@@ -276,6 +274,8 @@ static void rtas_ibm_get_system_parameter(PowerPCCPU *cpu,
 
 switch (parameter) {
 case RTAS_SYSPARM_SPLPAR_CHARACTERISTICS: {
+MachineState *ms = MACHINE(spapr);
+unsigned int max_cpus = ms->smp.max_cpus;
 char *param_val = g_strdup_printf("MaxEntCap=%d,"
   "DesMem=%" PRIu64 ","
   "DesProcs=%d,"
-- 
2.21.1




[PATCH 02/15] hw/ppc/spapr_rtas: Use local MachineState variable

2020-01-09 Thread Philippe Mathieu-Daudé
Since we have the MachineState already available locally,
ues it instead of the global current_machine.

Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/ppc/spapr_rtas.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
index 8d8d8cdfcb..e88bb1930e 100644
--- a/hw/ppc/spapr_rtas.c
+++ b/hw/ppc/spapr_rtas.c
@@ -281,7 +281,7 @@ static void rtas_ibm_get_system_parameter(PowerPCCPU *cpu,
   "DesProcs=%d,"
   "MaxPlatProcs=%d",
   max_cpus,
-  current_machine->ram_size / MiB,
+  ms->ram_size / MiB,
   ms->smp.cpus,
   max_cpus);
 if (pcc->n_host_threads > 0) {
-- 
2.21.1




[PATCH 08/15] target/arm/monitor: Replace current_machine by qdev_get_machine()

2020-01-09 Thread Philippe Mathieu-Daudé
As we want to remove the global current_machine,
replace 'current_machine' by MACHINE(qdev_get_machine()).

Signed-off-by: Philippe Mathieu-Daudé 
---
 target/arm/monitor.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/target/arm/monitor.c b/target/arm/monitor.c
index fa054f8a36..bcbf69802d 100644
--- a/target/arm/monitor.c
+++ b/target/arm/monitor.c
@@ -136,7 +136,8 @@ CpuModelExpansionInfo 
*qmp_query_cpu_model_expansion(CpuModelExpansionType type,
 }
 
 if (kvm_enabled()) {
-const char *cpu_type = current_machine->cpu_type;
+MachineState *ms = MACHINE(qdev_get_machine());
+const char *cpu_type = ms->cpu_type;
 int len = strlen(cpu_type) - strlen(ARM_CPU_TYPE_SUFFIX);
 bool supported = false;
 
-- 
2.21.1




[PATCH 03/15] hw/ppc/spapr_rtas: Access MachineState via SpaprMachineState argument

2020-01-09 Thread Philippe Mathieu-Daudé
We received a SpaprMachineState argument. Since SpaprMachineState
inherits of MachineState, use it instead of calling qdev_get_machine.

Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/ppc/spapr_rtas.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
index e88bb1930e..6f06e9d7fe 100644
--- a/hw/ppc/spapr_rtas.c
+++ b/hw/ppc/spapr_rtas.c
@@ -267,7 +267,7 @@ static void rtas_ibm_get_system_parameter(PowerPCCPU *cpu,
   uint32_t nret, target_ulong rets)
 {
 PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cpu);
-MachineState *ms = MACHINE(qdev_get_machine());
+MachineState *ms = MACHINE(spapr);
 unsigned int max_cpus = ms->smp.max_cpus;
 target_ulong parameter = rtas_ld(args, 0);
 target_ulong buffer = rtas_ld(args, 1);
-- 
2.21.1




[PATCH 01/15] target/arm/kvm: Use CPUState::kvm_state in kvm_arm_pmu_supported()

2020-01-09 Thread Philippe Mathieu-Daudé
KVMState is already accessible via CPUState::kvm_state, use it.

Signed-off-by: Philippe Mathieu-Daudé 
---
 target/arm/kvm.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index b87b59a02a..8d82889150 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -181,9 +181,7 @@ void kvm_arm_set_cpu_features_from_host(ARMCPU *cpu)
 
 bool kvm_arm_pmu_supported(CPUState *cpu)
 {
-KVMState *s = KVM_STATE(current_machine->accelerator);
-
-return kvm_check_extension(s, KVM_CAP_ARM_PMU_V3);
+return kvm_check_extension(cpu->kvm_state, KVM_CAP_ARM_PMU_V3);
 }
 
 int kvm_arm_get_max_vm_ipa_size(MachineState *ms)
-- 
2.21.1




[PATCH 00/15] Replace current_machine by qdev_get_machine()

2020-01-09 Thread Philippe Mathieu-Daudé
Blurb from previous question [1]:

  "hw/boards.h" declare current_machine, and vl.c defines it:

current_machine = 
MACHINE(object_new_with_class(OBJECT_CLASS(machine_class)));
object_property_add_child(object_get_root(), "machine",
  OBJECT(current_machine), _abort);

  The bigger user of 'current_machine' is the accel/KVM code.

  Recently in a0628599f..cc7d44c2e0 "Replace global smp variables
  with machine smp properties" we started to use MACHINE(qdev_get_machine()).
  Following a0628599f..cc7d44c2e0, a5e0b33119 use 'current_machine' again.

  qdev_get_machine() resolves the machine in the QOM composition tree.

Paolo answered [2]:

> I would always use MACHINE(qdev_get_machine()), espeecially outside
> vl.c.  Ideally, current_machine would be static within vl.c or even
> unused outside the object_property_add_child() that you quote above.

Let's remove the global current_machine.

I am still confused by this comment:

  /* qdev_get_machine() can return something that's not TYPE_MACHINE
   * if this is one of the user-only emulators; in that case there's
   * no need to check the ignore_memory_transaction_failures board flag.
   */

[1] https://www.mail-archive.com/qemu-devel@nongnu.org/msg669475.html
[2] https://www.mail-archive.com/qemu-devel@nongnu.org/msg669493.html

Philippe Mathieu-Daudé (15):
  target/arm/kvm: Use CPUState::kvm_state in kvm_arm_pmu_supported()
  hw/ppc/spapr_rtas: Use local MachineState variable
  hw/ppc/spapr_rtas: Access MachineState via SpaprMachineState argument
  hw/ppc/spapr_rtas: Restrict variables scope to single switch case
  device-hotplug: Replace current_machine by qdev_get_machine()
  migration/savevm: Replace current_machine by qdev_get_machine()
  hw/core/machine-qmp-cmds: Replace current_machine by
qdev_get_machine()
  target/arm/monitor: Replace current_machine by qdev_get_machine()
  device_tree: Replace current_machine by qdev_get_machine()
  memory: Replace current_machine by qdev_get_machine()
  exec: Replace current_machine by qdev_get_machine()
  accel: Introduce the current_accel() method
  accel: Replace current_machine->accelerator by current_accel() method
  accel/accel: Replace current_machine by qdev_get_machine()
  vl: Make current_machine a local variable

 include/hw/boards.h|  2 --
 include/sysemu/accel.h |  2 ++
 accel/accel.c  |  7 +++
 accel/kvm/kvm-all.c|  4 ++--
 accel/tcg/tcg-all.c|  2 +-
 device-hotplug.c   |  2 +-
 device_tree.c  |  4 +++-
 exec.c | 10 ++
 hw/core/machine-qmp-cmds.c |  4 ++--
 hw/ppc/spapr_rtas.c|  6 +++---
 memory.c   |  6 --
 migration/savevm.c | 10 +-
 target/arm/kvm.c   |  4 +---
 target/arm/kvm64.c |  4 ++--
 target/arm/monitor.c   |  3 ++-
 target/i386/kvm.c  |  2 +-
 target/ppc/kvm.c   |  2 +-
 vl.c   |  6 +++---
 18 files changed, 46 insertions(+), 34 deletions(-)

-- 
2.21.1




Re: [PATCH v12 00/13] VIRTIO-IOMMU device

2020-01-09 Thread no-reply
Patchew URL: 
https://patchew.org/QEMU/20200109144319.15912-1-eric.au...@redhat.com/



Hi,

This series failed the docker-quick@centos7 build test. Please find the testing 
commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
make docker-image-centos7 V=1 NETWORK=1
time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1
=== TEST SCRIPT END ===

  TESTcheck-qtest-x86_64: tests/test-hmp
  TESTcheck-qtest-x86_64: tests/qos-test
**
ERROR:/tmp/qemu-test/src/tests/virtio-iommu-test.c:46:pci_config: assertion 
failed (probe_size == 0x200): (0x == 0x0200)
ERROR - Bail out! 
ERROR:/tmp/qemu-test/src/tests/virtio-iommu-test.c:46:pci_config: assertion 
failed (probe_size == 0x200): (0x == 0x0200)
make: *** [check-qtest-x86_64] Error 1
make: *** Waiting for unfinished jobs
  TESTcheck-qtest-aarch64: tests/test-hmp
  TESTiotest-qcow2: 220
---
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', 
'--label', 'com.qemu.instance.uuid=de2154161ccf4468af5ebd35cbcbdc03', '-u', 
'1003', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', 
'-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 
'SHOW_ENV=1', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', 
'/home/patchew2/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', 
'/var/tmp/patchew-tester-tmp-_ap6ulch/src/docker-src.2020-01-09-10.01.05.32472:/var/tmp/qemu:z,ro',
 'qemu:centos7', '/var/tmp/qemu/run', 'test-quick']' returned non-zero exit 
status 2.
filter=--filter=label=com.qemu.instance.uuid=de2154161ccf4468af5ebd35cbcbdc03
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-_ap6ulch/src'
make: *** [docker-run-test-quick@centos7] Error 2

real11m50.898s
user0m8.023s


The full log is available at
http://patchew.org/logs/20200109144319.15912-1-eric.au...@redhat.com/testing.docker-quick@centos7/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

Re: Pre-Christmas meeting notes

2020-01-09 Thread Vladimir Sementsov-Ogievskiy
06.01.2020 20:15, Max Reitz wrote:
> Misc
> 
> 
> The Wiki’s TODO list is horribly outdated.  What should we do about
> it?  Maybe archive it and start a new one?  (Most of the things on the
> current list are either done or we don’t want to do anymore.)


May be, create block/TODO.txt instead? It would be simpler to manage file
in git than wiki page.


-- 
Best regards,
Vladimir


[Bug 1859021] Re: qemu-system-aarch64 (tcg): cval + voff overflow not handled, causes qemu to hang

2020-01-09 Thread Alex Bennée
** Tags added: arm tcg

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1859021

Title:
  qemu-system-aarch64 (tcg):  cval + voff overflow not handled, causes
  qemu to hang

Status in QEMU:
  New

Bug description:
  The Armv8 architecture reference manual states that for any timer set
  (e.g. CNTP* and CNTV*), the condition for such timer to generate an
  interrupt (if enabled & unmasked) is:

  CVAL <= CNT(P/V)CT

  Although this is arguably sloppy coding, I have seen code that is
  therefore assuming it can set CVAL to a very high value (e.g.
  UINT64_MAX) and leave the interrupt enabled in CTL, and never get the
  interrupt.

  On latest master commit as the time of writing, there is an integer
  overflow in target/arm/helper.c gt_recalc_timer affecting the virtual
  timer when the interrupt is enabled in CTL:

  /* Next transition is when we hit cval */
  nexttick = gt->cval + offset;

  When this overflow happens, I notice that qemu is no longer responsive and 
that I have to SIGKILL the process:
  - qemu takes nearly all the cpu time of the cores it is running on (e.g. 
50% cpu usage if running on half the cores) and is completely unresponsive
  - no guest interrupt (reported via -d int) is generated

  Here the minimal code example to reproduce the issue:

  mov x0, #1
  msr cntvoff_el2, x0
  mov x0, #-1
  msr cntv_cval_el0, x0
  mov x0, #1
  msr cntv_ctl_el0, x0 // interrupt generation enabled, not masked; 
qemu will start to hang here

  Options used:
  -nographic -machine virt,virtualization=on,gic-version=2,accel=tcg -cpu 
cortex-a57
  -smp 4 -m 1024 -kernel whatever.elf -d unimp,guest_errors,int 
-semihosting-config enable,target=native
  -serial mon:stdio

  Version used: 4.2

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1859021/+subscriptions



Re: [PATCH v11 Kernel 3/6] vfio iommu: Implementation of ioctl to for dirty pages tracking.

2020-01-09 Thread Alex Williamson
On Thu, 9 Jan 2020 18:59:40 +0530
Kirti Wankhede  wrote:

> On 1/9/2020 3:59 AM, Alex Williamson wrote:
> > On Thu, 9 Jan 2020 01:31:16 +0530
> > Kirti Wankhede  wrote:
> >   
> >> On 1/8/2020 3:32 AM, Alex Williamson wrote:  
> >>> On Wed, 8 Jan 2020 01:37:03 +0530
> >>> Kirti Wankhede  wrote:
> >>>  
> >>
> >> 
> >>  
> >> +
> >> +  unlocked = vfio_iova_put_vfio_pfn(dma, vpfn, dirty_tracking);
> >> 
> >>if (do_accounting)
> >>vfio_lock_acct(dma, -unlocked, true);
> >> @@ -571,8 +606,12 @@ static int vfio_iommu_type1_pin_pages(void 
> >> *iommu_data,
> >> 
> >>vpfn = vfio_iova_get_vfio_pfn(dma, iova);
> >>if (vpfn) {
> >> -  phys_pfn[i] = vpfn->pfn;
> >> -  continue;
> >> +  if (vpfn->unpinned)
> >> +  vfio_remove_from_pfn_list(dma, vpfn);  
> >
> > This seems inefficient, we have an allocated vpfn at the right places
> > in the list, wouldn't it be better to repin the page?
> > 
> 
>  vfio_pin_page_external() takes care of pinning and accounting as well.  
> >>>
> >>> Yes, but could we call vfio_pin_page_external() without {unlinking,
> >>> freeing} and {re-allocating, linking} on either side of it since it's
> >>> already in the list?  That's the inefficient part.  Maybe at least a
> >>> TODO comment?
> >>>  
> >>
> >> Changing it as below:
> >>
> >>   vpfn = vfio_iova_get_vfio_pfn(dma, iova);
> >>   if (vpfn) {
> >> -   phys_pfn[i] = vpfn->pfn;
> >> -   continue;
> >> +   if (vpfn->ref_count > 1) {
> >> +   phys_pfn[i] = vpfn->pfn;
> >> +   continue;
> >> +   }
> >>   }
> >>
> >>   remote_vaddr = dma->vaddr + iova - dma->iova;
> >>   ret = vfio_pin_page_external(dma, remote_vaddr,
> >> _pfn[i],
> >>do_accounting);
> >>   if (ret)
> >>   goto pin_unwind;
> >> -
> >> -   ret = vfio_add_to_pfn_list(dma, iova, phys_pfn[i]);
> >> -   if (ret) {
> >> -   vfio_unpin_page_external(dma, iova, do_accounting);
> >> -   goto pin_unwind;
> >> -   }
> >> +   if (!vpfn) {
> >> +   ret = vfio_add_to_pfn_list(dma, iova, phys_pfn[i]);
> >> +   if (ret) {
> >> +   vfio_unpin_page_external(dma, iova,
> >> +do_accounting,
> >> false);
> >> +   goto pin_unwind;
> >> +   }
> >> +   } else
> >> +   vpfn->pfn = phys_pfn[i];
> >>   }
> >>
> >>
> >>
> >>  
> >> +  else {
> >> +  phys_pfn[i] = vpfn->pfn;
> >> +  continue;
> >> +  }
> >>}
> >> 
> >>remote_vaddr = dma->vaddr + iova - dma->iova;
> >> @@ -583,7 +622,8 @@ static int vfio_iommu_type1_pin_pages(void 
> >> *iommu_data,
> >> 
> >>ret = vfio_add_to_pfn_list(dma, iova, phys_pfn[i]);
> >>if (ret) {
> >> -  vfio_unpin_page_external(dma, iova, 
> >> do_accounting);
> >> +  vfio_unpin_page_external(dma, iova, 
> >> do_accounting,
> >> +   false);
> >>goto pin_unwind;
> >>}
> >>}  
> >>
> >> 
> >>  
>  
> >> +  if (range.flags & VFIO_IOMMU_DIRTY_PAGES_FLAG_START) {
> >> +  iommu->dirty_page_tracking = true;
> >> +  return 0;
> >> +  } else if (range.flags & 
> >> VFIO_IOMMU_DIRTY_PAGES_FLAG_STOP) {
> >> +  iommu->dirty_page_tracking = false;
> >> +
> >> +  mutex_lock(>lock);
> >> +  vfio_remove_unpinned_from_dma_list(iommu);
> >> +  mutex_unlock(>lock);
> >> +  return 0;
> >> +
> >> +  } else if (range.flags &
> >> +   
> >> VFIO_IOMMU_DIRTY_PAGES_FLAG_GET_BITMAP) {
> >> +  uint64_t iommu_pgmask;
> >> +  unsigned long pgshift = __ffs(range.pgsize);
> >> +  unsigned long *bitmap;
> >> +  long bsize;
> >> +
> >> +  iommu_pgmask =
> >> +   ((uint64_t)1 << 

  1   2   3   >