date:20221123

Re: [PATCH for-7.2] vhost: enable vrings in vhost_dev_start() for vhost-user devices

2022-11-23 Thread Stefano Garzarella


On Thu, Nov 24, 2022 at 01:50:19AM -0500, Michael S. Tsirkin wrote:

On Thu, Nov 24, 2022 at 12:19:25AM +, Raphael Norwitz wrote:


> On Nov 23, 2022, at 8:16 AM, Stefano Garzarella  wrote:
>
> Commit 02b61f38d3 ("hw/virtio: incorporate backend features in features")
> properly negotiates VHOST_USER_F_PROTOCOL_FEATURES with the vhost-user
> backend, but we forgot to enable vrings as specified in
> docs/interop/vhost-user.rst:
>
>If ``VHOST_USER_F_PROTOCOL_FEATURES`` has not been negotiated, the
>ring starts directly in the enabled state.
>
>If ``VHOST_USER_F_PROTOCOL_FEATURES`` has been negotiated, the ring is
>initialized in a disabled state and is enabled by
>``VHOST_USER_SET_VRING_ENABLE`` with parameter 1.
>
> Some vhost-user front-ends already did this by calling
> vhost_ops.vhost_set_vring_enable() directly:
> - backends/cryptodev-vhost.c
> - hw/net/virtio-net.c
> - hw/virtio/vhost-user-gpio.c

To simplify why not rather change these devices to use the new 
semantics?


Maybe the only one I wouldn't be scared of is vhost-user-gpio, but for 
example vhost-net would require a lot of changes, maybe better after the 
release.


For example, we could do like vhost-vdpa and call SET_VRING_ENABLE in 
the VhostOps.vhost_dev_start callback of vhost-user.c, but I think it's 
too risky to do that now.




Granted this is already scary enough for this release.


Yeah, I tried to touch as little as possible but I'm scared too, I just 
haven't found a better solution for now :-(





>
> But most didn't do that, so we would leave the vrings disabled and some
> backends would not work. We observed this issue with the rust version of
> virtiofsd [1], which uses the event loop [2] provided by the
> vhost-user-backend crate where requests are not processed if vring is
> not enabled.
>
> Let's fix this issue by enabling the vrings in vhost_dev_start() for
> vhost-user front-ends that don't already do this directly. Same thing
> also in vhost_dev_stop() where we disable vrings.
>
> [1] https://gitlab.com/virtio-fs/virtiofsd
> [2] 
https://github.com/rust-vmm/vhost/blob/240fc2966/crates/vhost-user-backend/src/event_loop.rs#L217
> Fixes: 02b61f38d3 ("hw/virtio: incorporate backend features in features")
> Reported-by: German Maglione 
> Tested-by: German Maglione 
> Signed-off-by: Stefano Garzarella 

Looks good for vhost-user-blk/vhost-user-scsi.

Acked-by: Raphael Norwitz 


Thanks for the review!
Stefano

Re: [PATCH for 7.2?] vhost: fix vq dirt bitmap syncing when vIOMMU is enabled

2022-11-23 Thread Michael S. Tsirkin

On Thu, Nov 24, 2022 at 03:31:59PM +0800, Jason Wang wrote:
> On Thu, Nov 24, 2022 at 3:06 PM Michael S. Tsirkin  wrote:
> >
> > On Thu, Nov 24, 2022 at 12:12:15PM +0800, Jason Wang wrote:
> > > On Wed, Nov 23, 2022 at 3:21 PM Michael S. Tsirkin  
> > > wrote:
> > > >
> > > > On Wed, Nov 23, 2022 at 03:08:25PM +0800, Jason Wang wrote:
> > > > > On Wed, Nov 23, 2022 at 2:15 PM Michael S. Tsirkin  
> > > > > wrote:
> > > > > >
> > > > > > On Wed, Nov 23, 2022 at 01:47:04PM +0800, Jason Wang wrote:
> > > > > > > On Wed, Nov 23, 2022 at 1:26 PM Jason Wang  
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > On Wed, Nov 23, 2022 at 12:28 AM Eric Auger 
> > > > > > > >  wrote:
> > > > > > > > >
> > > > > > > > > Hi,
> > > > > > > > >
> > > > > > > > > On 11/22/22 10:43, Michael S. Tsirkin wrote:
> > > > > > > > > > On Tue, Nov 22, 2022 at 11:01:11AM +0800, Jason Wang wrote:
> > > > > > > > > >> When vIOMMU is enabled, the vq->used_phys is actually the 
> > > > > > > > > >> IOVA not
> > > > > > > > > >> GPA. So we need to translate it to GPA before the syncing 
> > > > > > > > > >> otherwise we
> > > > > > > > > >> may hit the following crash since IOVA could be out of the 
> > > > > > > > > >> scope of
> > > > > > > > > >> the GPA log size. This could be noted when using 
> > > > > > > > > >> virtio-IOMMU with
> > > > > > > > > >> vhost using 1G memory.
> > > > > > > > > > Noted how exactly? What does "using 1G memory" mean?
> > > > > > > > >
> > > > > > > > > We hit the following assertion:
> > > > > > > > >
> > > > > > > > > qemu-system-x86_64: ../hw/virtio/vhost.c:85: 
> > > > > > > > > vhost_dev_sync_region: Assertion `end / VHOST_LOG_CHUNK < 
> > > > > > > > > dev->log_size' failed.
> > > > > > > > >
> > > > > > > > > On the last time vhost_get_log_size() is called it takes into 
> > > > > > > > > account 2 regions when computing the log_size:
> > > > > > > > > qemu-system-x86_64: vhost_get_log_size region 0 last=0x9 
> > > > > > > > > updated log_size=0x3
> > > > > > > > > qemu-system-x86_64: vhost_get_log_size region 1 
> > > > > > > > > last=0x3fff updated log_size=0x1000
> > > > > > > > > so in vhost_migration_log() vhost_get_log_size(dev) returns 
> > > > > > > > > 0x1000
> > > > > > > > >
> > > > > > > > > In the test case, memory_region_sync_dirty_bitmap() gets 
> > > > > > > > > called for mem-machine_mem, vga.vram (several times) and 
> > > > > > > > > eventually on pc.bios. This latter is reponsible for the 
> > > > > > > > > assertion:
> > > > > > > > >
> > > > > > > > > qemu-system-x86_64: vhost_log_sync calls sync_dirty_map on 
> > > > > > > > > pc.bios for the full range
> > > > > > > > > qemu-system-x86_64: vhost_sync_dirty_bitmap calls 
> > > > > > > > > vhost_dev_sync_region on region 0
> > > > > > > > > qemu-system-x86_64: vhost_dev_sync_region end=0x9 < 
> > > > > > > > > start=0xfffc
> > > > > > > > > qemu-system-x86_64: vhost_sync_dirty_bitmap calls 
> > > > > > > > > vhost_dev_sync_region on region 1
> > > > > > > > > qemu-system-x86_64: vhost_dev_sync_region end=0x3fff < 
> > > > > > > > > start=0xfffc
> > > > > > > > > qemu-system-x86_64: vhost_sync_dirty_bitmap calls 
> > > > > > > > > vhost_dev_sync_region on vq 0 <-
> > > > > > > > > qemu-system-x86_64: vhost_dev_sync_region pc.bios 
> > > > > > > > > mfirst=0xfffc mlast=0x rfirst=0xf240 
> > > > > > > > > rlast=0xfa45
> > > > > > > > > qemu-system-x86_64: vhost_dev_sync_region pc.bios 
> > > > > > > > > end=0xfa45 VHOST_LOG_CHUNK=0x4 
> > > > > > > > > end/VHOST_LOG_CHUNK=0x3fff dev->log_size=0x1000
> > > > > > > > > qemu-system-x86_64: ../hw/virtio/vhost.c:85: 
> > > > > > > > > vhost_dev_sync_region: Assertion `end / VHOST_LOG_CHUNK < 
> > > > > > > > > dev->log_size' failed.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > "using 1G memory": We hit the issue with a guest started with 
> > > > > > > > > 1GB initial RAM.
> > > > > > > >
> > > > > > > > Yes, so in the case the guest iova allocator may try to use an 
> > > > > > > > IOVA
> > > > > > > > that is beyond 1G.
> > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >> Fixes: c471ad0e9bd46 ("vhost_net: device IOTLB support")
> > > > > > > > > >> Cc: qemu-sta...@nongnu.org
> > > > > > > > > >> Reported-by: Yalan Zhang 
> > > > > > > > > >> Tested-by: Eric Auger 
> > > > > > > > > >> Tested-by: Lei Yang 
> > > > > > > > > >> Signed-off-by: Jason Wang 
> > > > > > > > > >> ---
> > > > > > > > > >>  hw/virtio/vhost.c | 65 
> > > > > > > > > >> ---
> > > > > > > > > >>  1 file changed, 45 insertions(+), 20 deletions(-)
> > > > > > > > > >>
> > > > > > > > > >> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> > > > > > > > > >> index d1c4c20b8c..26b319f34e 100644
> > > > > > > > > >> --- a/hw/virtio/vhost.c
> > > > > > > > > >> +++ b/hw/virtio/vhost.c
> > > > > > > > > >> @@ -106,11 +106,30 @@ static void 
> > > > >

Re: [PATCH for 7.2?] vhost: fix vq dirt bitmap syncing when vIOMMU is enabled

2022-11-23 Thread Jason Wang

On Thu, Nov 24, 2022 at 3:06 PM Michael S. Tsirkin  wrote:
>
> On Thu, Nov 24, 2022 at 12:12:15PM +0800, Jason Wang wrote:
> > On Wed, Nov 23, 2022 at 3:21 PM Michael S. Tsirkin  wrote:
> > >
> > > On Wed, Nov 23, 2022 at 03:08:25PM +0800, Jason Wang wrote:
> > > > On Wed, Nov 23, 2022 at 2:15 PM Michael S. Tsirkin  
> > > > wrote:
> > > > >
> > > > > On Wed, Nov 23, 2022 at 01:47:04PM +0800, Jason Wang wrote:
> > > > > > On Wed, Nov 23, 2022 at 1:26 PM Jason Wang  
> > > > > > wrote:
> > > > > > >
> > > > > > > On Wed, Nov 23, 2022 at 12:28 AM Eric Auger 
> > > > > > >  wrote:
> > > > > > > >
> > > > > > > > Hi,
> > > > > > > >
> > > > > > > > On 11/22/22 10:43, Michael S. Tsirkin wrote:
> > > > > > > > > On Tue, Nov 22, 2022 at 11:01:11AM +0800, Jason Wang wrote:
> > > > > > > > >> When vIOMMU is enabled, the vq->used_phys is actually the 
> > > > > > > > >> IOVA not
> > > > > > > > >> GPA. So we need to translate it to GPA before the syncing 
> > > > > > > > >> otherwise we
> > > > > > > > >> may hit the following crash since IOVA could be out of the 
> > > > > > > > >> scope of
> > > > > > > > >> the GPA log size. This could be noted when using 
> > > > > > > > >> virtio-IOMMU with
> > > > > > > > >> vhost using 1G memory.
> > > > > > > > > Noted how exactly? What does "using 1G memory" mean?
> > > > > > > >
> > > > > > > > We hit the following assertion:
> > > > > > > >
> > > > > > > > qemu-system-x86_64: ../hw/virtio/vhost.c:85: 
> > > > > > > > vhost_dev_sync_region: Assertion `end / VHOST_LOG_CHUNK < 
> > > > > > > > dev->log_size' failed.
> > > > > > > >
> > > > > > > > On the last time vhost_get_log_size() is called it takes into 
> > > > > > > > account 2 regions when computing the log_size:
> > > > > > > > qemu-system-x86_64: vhost_get_log_size region 0 last=0x9 
> > > > > > > > updated log_size=0x3
> > > > > > > > qemu-system-x86_64: vhost_get_log_size region 1 last=0x3fff 
> > > > > > > > updated log_size=0x1000
> > > > > > > > so in vhost_migration_log() vhost_get_log_size(dev) returns 
> > > > > > > > 0x1000
> > > > > > > >
> > > > > > > > In the test case, memory_region_sync_dirty_bitmap() gets called 
> > > > > > > > for mem-machine_mem, vga.vram (several times) and eventually on 
> > > > > > > > pc.bios. This latter is reponsible for the assertion:
> > > > > > > >
> > > > > > > > qemu-system-x86_64: vhost_log_sync calls sync_dirty_map on 
> > > > > > > > pc.bios for the full range
> > > > > > > > qemu-system-x86_64: vhost_sync_dirty_bitmap calls 
> > > > > > > > vhost_dev_sync_region on region 0
> > > > > > > > qemu-system-x86_64: vhost_dev_sync_region end=0x9 < 
> > > > > > > > start=0xfffc
> > > > > > > > qemu-system-x86_64: vhost_sync_dirty_bitmap calls 
> > > > > > > > vhost_dev_sync_region on region 1
> > > > > > > > qemu-system-x86_64: vhost_dev_sync_region end=0x3fff < 
> > > > > > > > start=0xfffc
> > > > > > > > qemu-system-x86_64: vhost_sync_dirty_bitmap calls 
> > > > > > > > vhost_dev_sync_region on vq 0 <-
> > > > > > > > qemu-system-x86_64: vhost_dev_sync_region pc.bios 
> > > > > > > > mfirst=0xfffc mlast=0x rfirst=0xf240 
> > > > > > > > rlast=0xfa45
> > > > > > > > qemu-system-x86_64: vhost_dev_sync_region pc.bios 
> > > > > > > > end=0xfa45 VHOST_LOG_CHUNK=0x4 
> > > > > > > > end/VHOST_LOG_CHUNK=0x3fff dev->log_size=0x1000
> > > > > > > > qemu-system-x86_64: ../hw/virtio/vhost.c:85: 
> > > > > > > > vhost_dev_sync_region: Assertion `end / VHOST_LOG_CHUNK < 
> > > > > > > > dev->log_size' failed.
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > "using 1G memory": We hit the issue with a guest started with 
> > > > > > > > 1GB initial RAM.
> > > > > > >
> > > > > > > Yes, so in the case the guest iova allocator may try to use an 
> > > > > > > IOVA
> > > > > > > that is beyond 1G.
> > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > >> Fixes: c471ad0e9bd46 ("vhost_net: device IOTLB support")
> > > > > > > > >> Cc: qemu-sta...@nongnu.org
> > > > > > > > >> Reported-by: Yalan Zhang 
> > > > > > > > >> Tested-by: Eric Auger 
> > > > > > > > >> Tested-by: Lei Yang 
> > > > > > > > >> Signed-off-by: Jason Wang 
> > > > > > > > >> ---
> > > > > > > > >>  hw/virtio/vhost.c | 65 
> > > > > > > > >> ---
> > > > > > > > >>  1 file changed, 45 insertions(+), 20 deletions(-)
> > > > > > > > >>
> > > > > > > > >> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> > > > > > > > >> index d1c4c20b8c..26b319f34e 100644
> > > > > > > > >> --- a/hw/virtio/vhost.c
> > > > > > > > >> +++ b/hw/virtio/vhost.c
> > > > > > > > >> @@ -106,11 +106,30 @@ static void 
> > > > > > > > >> vhost_dev_sync_region(struct vhost_dev *dev,
> > > > > > > > >>  }
> > > > > > > > >>  }
> > > > > > > > >>
> > > > > > > > >> +static bool vhost_dev_has_iommu(struct vhost_dev *dev)
> > > > > > > > >> +{
> > > > > > > > >> +VirtIODevice *vdev = dev->vdev;
> > > > > >

[PULL 4/4] ui/gtk: prevent ui lock up when dpy_gl_update called again before current draw event occurs

2022-11-23 Thread Gerd Hoffmann

From: Dongwon Kim 

A warning, "qemu: warning: console: no gl-unblock within" followed by
guest scanout lockup can happen if dpy_gl_update is called in a row
and the second call is made before gd_draw_event scheduled by the first
call is taking place. This is because draw call returns without decrementing
gl_block ref count if the dmabuf was already submitted as shown below.

(gd_gl_area_draw/gd_egl_draw)

if (dmabuf) {
if (!dmabuf->draw_submitted) {
return;
} else {
dmabuf->draw_submitted = false;
}
}

So it should not schedule any redundant draw event in case draw_submitted is
already set in gd_egl_fluch/gd_gl_area_scanout_flush.

Cc: Gerd Hoffmann 
Cc: Vivek Kasireddy 
Signed-off-by: Dongwon Kim 
Reviewed-by: Marc-André Lureau 
Message-Id: <20221021192315.9110-1-dongwon@intel.com>
Signed-off-by: Gerd Hoffmann 
---
 ui/gtk-egl.c | 2 +-
 ui/gtk-gl-area.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/ui/gtk-egl.c b/ui/gtk-egl.c
index 35f917ceb15e..e84431790c9b 100644
--- a/ui/gtk-egl.c
+++ b/ui/gtk-egl.c
@@ -341,7 +341,7 @@ void gd_egl_flush(DisplayChangeListener *dcl,
 VirtualConsole *vc = container_of(dcl, VirtualConsole, gfx.dcl);
 GtkWidget *area = vc->gfx.drawing_area;
 
-if (vc->gfx.guest_fb.dmabuf) {
+if (vc->gfx.guest_fb.dmabuf && !vc->gfx.guest_fb.dmabuf->draw_submitted) {
 graphic_hw_gl_block(vc->gfx.dcl.con, true);
 vc->gfx.guest_fb.dmabuf->draw_submitted = true;
 gtk_widget_queue_draw_area(area, x, y, w, h);
diff --git a/ui/gtk-gl-area.c b/ui/gtk-gl-area.c
index 682638a197d2..7696df1f6bc4 100644
--- a/ui/gtk-gl-area.c
+++ b/ui/gtk-gl-area.c
@@ -278,7 +278,7 @@ void gd_gl_area_scanout_flush(DisplayChangeListener *dcl,
 {
 VirtualConsole *vc = container_of(dcl, VirtualConsole, gfx.dcl);
 
-if (vc->gfx.guest_fb.dmabuf) {
+if (vc->gfx.guest_fb.dmabuf && !vc->gfx.guest_fb.dmabuf->draw_submitted) {
 graphic_hw_gl_block(vc->gfx.dcl.con, true);
 vc->gfx.guest_fb.dmabuf->draw_submitted = true;
 }
-- 
2.38.1

[PULL 2/4] gtk: disable GTK Clipboard with a new meson option

2022-11-23 Thread Gerd Hoffmann

From: Claudio Fontana 

The GTK Clipboard implementation may cause guest hangs.

Therefore implement new configure switch: --enable-gtk-clipboard,

as a meson option disabled by default, which warns in the help
text about the experimental nature of the feature.
Regenerate the meson build options to include it.

The initialization of the clipboard is gtk.c, as well as the
compilation of gtk-clipboard.c are now conditional on this new
option to be set.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1150
Signed-off-by: Claudio Fontana 
Acked-by: Gerd Hoffmann 
Reviewed-by: Jim Fehlig 
Message-Id: <20221121135538.14625-1-cfont...@suse.de>
Signed-off-by: Gerd Hoffmann 
---
 meson_options.txt | 7 +++
 ui/gtk.c  | 2 ++
 meson.build   | 5 +
 scripts/meson-buildoptions.sh | 3 +++
 ui/meson.build| 5 -
 5 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/meson_options.txt b/meson_options.txt
index 66128178bffa..4b749ca54900 100644
--- a/meson_options.txt
+++ b/meson_options.txt
@@ -219,6 +219,13 @@ option('vnc_sasl', type : 'feature', value : 'auto',
description: 'SASL authentication for VNC server')
 option('vte', type : 'feature', value : 'auto',
description: 'vte support for the gtk UI')
+
+# GTK Clipboard implementation is disabled by default, since it may cause hangs
+# of the guest VCPUs. See gitlab issue 1150:
+# https://gitlab.com/qemu-project/qemu/-/issues/1150
+
+option('gtk_clipboard', type: 'feature', value : 'disabled',
+   description: 'clipboard support for the gtk UI (EXPERIMENTAL, MAY 
HANG)')
 option('xkbcommon', type : 'feature', value : 'auto',
description: 'xkbcommon support')
 option('zstd', type : 'feature', value : 'auto',
diff --git a/ui/gtk.c b/ui/gtk.c
index 7ec21f7798ef..4817623c8f3f 100644
--- a/ui/gtk.c
+++ b/ui/gtk.c
@@ -2403,7 +2403,9 @@ static void gtk_display_init(DisplayState *ds, 
DisplayOptions *opts)
 opts->u.gtk.show_tabs) {
 gtk_menu_item_activate(GTK_MENU_ITEM(s->show_tabs_item));
 }
+#ifdef CONFIG_GTK_CLIPBOARD
 gd_clipboard_init(s);
+#endif /* CONFIG_GTK_CLIPBOARD */
 }
 
 static void early_gtk_display_init(DisplayOptions *opts)
diff --git a/meson.build b/meson.build
index cf3e517e56d8..5c6b5a1c757f 100644
--- a/meson.build
+++ b/meson.build
@@ -1246,6 +1246,8 @@ endif
 gtk = not_found
 gtkx11 = not_found
 vte = not_found
+have_gtk_clipboard = get_option('gtk_clipboard').enabled()
+
 if not get_option('gtk').auto() or have_system
   gtk = dependency('gtk+-3.0', version: '>=3.22.0',
method: 'pkg-config',
@@ -1264,6 +1266,8 @@ if not get_option('gtk').auto() or have_system
required: get_option('vte'),
kwargs: static_kwargs)
 endif
+  elif have_gtk_clipboard
+error('GTK clipboard requested, but GTK not found')
   endif
 endif
 
@@ -1842,6 +1846,7 @@ if glusterfs.found()
 endif
 config_host_data.set('CONFIG_GTK', gtk.found())
 config_host_data.set('CONFIG_VTE', vte.found())
+config_host_data.set('CONFIG_GTK_CLIPBOARD', have_gtk_clipboard)
 config_host_data.set('CONFIG_LIBATTR', have_old_libattr)
 config_host_data.set('CONFIG_LIBCAP_NG', libcap_ng.found())
 config_host_data.set('CONFIG_EBPF', libbpf.found())
diff --git a/scripts/meson-buildoptions.sh b/scripts/meson-buildoptions.sh
index 2cb0de5601ef..aa6e30ea911e 100644
--- a/scripts/meson-buildoptions.sh
+++ b/scripts/meson-buildoptions.sh
@@ -93,6 +93,7 @@ meson_options_help() {
   printf "%s\n" '  glusterfs   Glusterfs block device driver'
   printf "%s\n" '  gnutls  GNUTLS cryptography support'
   printf "%s\n" '  gtk GTK+ user interface'
+  printf "%s\n" '  gtk-clipboard   clipboard support for GTK (EXPERIMENTAL, 
MAY HANG)'
   printf "%s\n" '  guest-agent Build QEMU Guest Agent'
   printf "%s\n" '  guest-agent-msi Build MSI package for the QEMU Guest Agent'
   printf "%s\n" '  hax HAX acceleration support'
@@ -274,6 +275,8 @@ _meson_option_parse() {
 --disable-gprof) printf "%s" -Dgprof=false ;;
 --enable-gtk) printf "%s" -Dgtk=enabled ;;
 --disable-gtk) printf "%s" -Dgtk=disabled ;;
+--enable-gtk-clipboard) printf "%s" -Dgtk_clipboard=enabled ;;
+--disable-gtk-clipboard) printf "%s" -Dgtk_clipboard=disabled ;;
 --enable-guest-agent) printf "%s" -Dguest_agent=enabled ;;
 --disable-guest-agent) printf "%s" -Dguest_agent=disabled ;;
 --enable-guest-agent-msi) printf "%s" -Dguest_agent_msi=enabled ;;
diff --git a/ui/meson.build b/ui/meson.build
index ec139497766a..c1b137bf330c 100644
--- a/ui/meson.build
+++ b/ui/meson.build
@@ -97,7 +97,10 @@ if gtk.found()
   softmmu_ss.add(when: 'CONFIG_WIN32', if_true: files('win32-kbd-hook.c'))
 
   gtk_ss = ss.source_set()
-  gtk_ss.add(gtk, vte, pixman, files('gtk.c', 'gtk-clipboard.c'))
+  gtk_ss.add(gtk, vte, pixman, files('gtk.c'))
+  if have_gtk_clipboard
+gtk_ss.add(files('gtk-clipboar

[PULL 3/4] hw/usb/hcd-xhci.c: spelling: tranfer

2022-11-23 Thread Gerd Hoffmann

From: Michael Tokarev 

Fixes: effaf5a240e03020f4ae953e10b764622c3e87cc
Signed-off-by: Michael Tokarev 
Reviewed-by: Thomas Huth 
Reviewed-by: Stefan Weil 
Message-Id: <20221105114851.306206-1-...@msgid.tls.msk.ru>
Signed-off-by: Gerd Hoffmann 
---
 hw/usb/hcd-xhci.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/usb/hcd-xhci.c b/hw/usb/hcd-xhci.c
index 8299f35e6695..b89b618ec210 100644
--- a/hw/usb/hcd-xhci.c
+++ b/hw/usb/hcd-xhci.c
@@ -796,7 +796,7 @@ static int xhci_ring_chain_length(XHCIState *xhci, const 
XHCIRing *ring)
  */
 } while (length < TRB_LINK_LIMIT * 65536 / TRB_SIZE);
 
-qemu_log_mask(LOG_GUEST_ERROR, "%s: exceeded maximum tranfer ring size!\n",
+qemu_log_mask(LOG_GUEST_ERROR, "%s: exceeded maximum transfer ring 
size!\n",
   __func__);
 
 return -1;
-- 
2.38.1

[PULL 0/4] Fixes 20221124 patches

2022-11-23 Thread Gerd Hoffmann

The following changes since commit 7c09a7f6ae1770d15535980d15dffdb23f4d9786:

  Update VERSION for v7.2.0-rc2 (2022-11-22 18:59:56 -0500)

are available in the Git repository at:

  https://gitlab.com/kraxel/qemu.git tags/fixes-20221124-pull-request

for you to fetch changes up to 64f1359bd08060ffe7a5689fdcbaeec6d8a59980:

  ui/gtk: prevent ui lock up when dpy_gl_update called again before current 
draw event occurs (2022-11-23 12:27:55 +0100)


usb+ui: fixes for 7.2



Claudio Fontana (1):
  gtk: disable GTK Clipboard with a new meson option

Dongwon Kim (1):
  ui/gtk: prevent ui lock up when dpy_gl_update called again before
current draw event occurs

Joelle van Dyne (1):
  Revert "usbredir: avoid queuing hello packet on snapshot restore"

Michael Tokarev (1):
  hw/usb/hcd-xhci.c: spelling: tranfer

 meson_options.txt | 7 +++
 hw/usb/hcd-xhci.c | 2 +-
 hw/usb/redirect.c | 3 +--
 ui/gtk-egl.c  | 2 +-
 ui/gtk-gl-area.c  | 2 +-
 ui/gtk.c  | 2 ++
 meson.build   | 5 +
 scripts/meson-buildoptions.sh | 3 +++
 ui/meson.build| 5 -
 9 files changed, 25 insertions(+), 6 deletions(-)

-- 
2.38.1

[PULL 1/4] Revert "usbredir: avoid queuing hello packet on snapshot restore"

2022-11-23 Thread Gerd Hoffmann

From: Joelle van Dyne 

Run state is also in RUN_STATE_PRELAUNCH while "-S" is used.

This reverts commit 0631d4b448454ae8a1ab091c447e3f71ab6e088a

Signed-off-by: Joelle van Dyne 
Reviewed-by: Ján Tomko 

The original commit broke the usage of usbredir with libvirt, which
starts every domain with "-S".

This workaround is no longer needed because the usbredir behavior
has been fixed in the meantime:
https://gitlab.freedesktop.org/spice/usbredir/-/merge_requests/61

Signed-off-by: Ján Tomko 
Message-Id: 
<1689cec3eadcea87255e390cb236033aca72e168.1669193161.git.jto...@redhat.com>
Signed-off-by: Gerd Hoffmann 
---
 hw/usb/redirect.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/hw/usb/redirect.c b/hw/usb/redirect.c
index 1bd30efc3ef0..fd7df599bc0b 100644
--- a/hw/usb/redirect.c
+++ b/hw/usb/redirect.c
@@ -1280,8 +1280,7 @@ static void usbredir_create_parser(USBRedirDevice *dev)
 }
 #endif
 
-if (runstate_check(RUN_STATE_INMIGRATE) ||
-runstate_check(RUN_STATE_PRELAUNCH)) {
+if (runstate_check(RUN_STATE_INMIGRATE)) {
 flags |= usbredirparser_fl_no_hello;
 }
 usbredirparser_init(dev->parser, VERSION, caps, USB_REDIR_CAPS_SIZE,
-- 
2.38.1

Re: [PATCH] hw/usb: add configuration flags for emulated and passthru usb smartcard

2022-11-23 Thread Paolo Bonzini

Il mer 23 nov 2022, 08:59 Marc-André Lureau  ha
scritto:

> config USB_SMARTCARD_PASSTHRU
> bool
> default y
> select USB_SMARTCARD
>
> config USB_SMARTCARD_EMULATED
> bool
> default y
> select USB_SMARTCARD
>

Yes, this is the way. (TM)

Also, you should add a "config LIBCACARD" (resp. "CONFIG_LIBCACARD=y")
symbol to Kconfig.host and the root meson.build, so that you can make these
symbols "depends on CACARD" and remove the "if cacard.found()" below.

Paolo


>
> > +
> > +config USB_SMARTCARD_EMULATED
> > +bool
> > +default y
> > +depends on USB
> >
> >  config USB_STORAGE_MTP
> >  bool
> > diff --git a/hw/usb/meson.build b/hw/usb/meson.build
> > index 793df42e21..353006fb6c 100644
> > --- a/hw/usb/meson.build
> > +++ b/hw/usb/meson.build
> > @@ -51,8 +51,8 @@ softmmu_ss.add(when: 'CONFIG_USB_SMARTCARD', if_true:
> files('dev-smartcard-reade
> >
> >  if cacard.found()
> >usbsmartcard_ss = ss.source_set()
> > -  usbsmartcard_ss.add(when: 'CONFIG_USB_SMARTCARD',
> > -  if_true: [cacard, files('ccid-card-emulated.c',
> 'ccid-card-passthru.c')])
> > +  usbsmartcard_ss.add(when: 'CONFIG_USB_SMARTCARD_EMULATED', if_true:
> [cacard, files('ccid-card-emulated.c')])
> > +  usbsmartcard_ss.add(when: 'CONFIG_USB_SMARTCARD_PASSTHRU', if_true:
> [cacard, files('ccid-card-passthru.c')])
> >hw_usb_modules += {'smartcard': usbsmartcard_ss}
> >  endif
> >
> > --
> > 2.35.3
> >
> >
>
>
> --
> Marc-André Lureau
>
>

Re: [PATCH for 7.2?] vhost: fix vq dirt bitmap syncing when vIOMMU is enabled

2022-11-23 Thread Michael S. Tsirkin

On Thu, Nov 24, 2022 at 12:12:15PM +0800, Jason Wang wrote:
> On Wed, Nov 23, 2022 at 3:21 PM Michael S. Tsirkin  wrote:
> >
> > On Wed, Nov 23, 2022 at 03:08:25PM +0800, Jason Wang wrote:
> > > On Wed, Nov 23, 2022 at 2:15 PM Michael S. Tsirkin  
> > > wrote:
> > > >
> > > > On Wed, Nov 23, 2022 at 01:47:04PM +0800, Jason Wang wrote:
> > > > > On Wed, Nov 23, 2022 at 1:26 PM Jason Wang  
> > > > > wrote:
> > > > > >
> > > > > > On Wed, Nov 23, 2022 at 12:28 AM Eric Auger  
> > > > > > wrote:
> > > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > On 11/22/22 10:43, Michael S. Tsirkin wrote:
> > > > > > > > On Tue, Nov 22, 2022 at 11:01:11AM +0800, Jason Wang wrote:
> > > > > > > >> When vIOMMU is enabled, the vq->used_phys is actually the IOVA 
> > > > > > > >> not
> > > > > > > >> GPA. So we need to translate it to GPA before the syncing 
> > > > > > > >> otherwise we
> > > > > > > >> may hit the following crash since IOVA could be out of the 
> > > > > > > >> scope of
> > > > > > > >> the GPA log size. This could be noted when using virtio-IOMMU 
> > > > > > > >> with
> > > > > > > >> vhost using 1G memory.
> > > > > > > > Noted how exactly? What does "using 1G memory" mean?
> > > > > > >
> > > > > > > We hit the following assertion:
> > > > > > >
> > > > > > > qemu-system-x86_64: ../hw/virtio/vhost.c:85: 
> > > > > > > vhost_dev_sync_region: Assertion `end / VHOST_LOG_CHUNK < 
> > > > > > > dev->log_size' failed.
> > > > > > >
> > > > > > > On the last time vhost_get_log_size() is called it takes into 
> > > > > > > account 2 regions when computing the log_size:
> > > > > > > qemu-system-x86_64: vhost_get_log_size region 0 last=0x9 
> > > > > > > updated log_size=0x3
> > > > > > > qemu-system-x86_64: vhost_get_log_size region 1 last=0x3fff 
> > > > > > > updated log_size=0x1000
> > > > > > > so in vhost_migration_log() vhost_get_log_size(dev) returns 0x1000
> > > > > > >
> > > > > > > In the test case, memory_region_sync_dirty_bitmap() gets called 
> > > > > > > for mem-machine_mem, vga.vram (several times) and eventually on 
> > > > > > > pc.bios. This latter is reponsible for the assertion:
> > > > > > >
> > > > > > > qemu-system-x86_64: vhost_log_sync calls sync_dirty_map on 
> > > > > > > pc.bios for the full range
> > > > > > > qemu-system-x86_64: vhost_sync_dirty_bitmap calls 
> > > > > > > vhost_dev_sync_region on region 0
> > > > > > > qemu-system-x86_64: vhost_dev_sync_region end=0x9 < 
> > > > > > > start=0xfffc
> > > > > > > qemu-system-x86_64: vhost_sync_dirty_bitmap calls 
> > > > > > > vhost_dev_sync_region on region 1
> > > > > > > qemu-system-x86_64: vhost_dev_sync_region end=0x3fff < 
> > > > > > > start=0xfffc
> > > > > > > qemu-system-x86_64: vhost_sync_dirty_bitmap calls 
> > > > > > > vhost_dev_sync_region on vq 0 <-
> > > > > > > qemu-system-x86_64: vhost_dev_sync_region pc.bios 
> > > > > > > mfirst=0xfffc mlast=0x rfirst=0xf240 
> > > > > > > rlast=0xfa45
> > > > > > > qemu-system-x86_64: vhost_dev_sync_region pc.bios end=0xfa45 
> > > > > > > VHOST_LOG_CHUNK=0x4 end/VHOST_LOG_CHUNK=0x3fff 
> > > > > > > dev->log_size=0x1000
> > > > > > > qemu-system-x86_64: ../hw/virtio/vhost.c:85: 
> > > > > > > vhost_dev_sync_region: Assertion `end / VHOST_LOG_CHUNK < 
> > > > > > > dev->log_size' failed.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > "using 1G memory": We hit the issue with a guest started with 1GB 
> > > > > > > initial RAM.
> > > > > >
> > > > > > Yes, so in the case the guest iova allocator may try to use an IOVA
> > > > > > that is beyond 1G.
> > > > > >
> > > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > >> Fixes: c471ad0e9bd46 ("vhost_net: device IOTLB support")
> > > > > > > >> Cc: qemu-sta...@nongnu.org
> > > > > > > >> Reported-by: Yalan Zhang 
> > > > > > > >> Tested-by: Eric Auger 
> > > > > > > >> Tested-by: Lei Yang 
> > > > > > > >> Signed-off-by: Jason Wang 
> > > > > > > >> ---
> > > > > > > >>  hw/virtio/vhost.c | 65 
> > > > > > > >> ---
> > > > > > > >>  1 file changed, 45 insertions(+), 20 deletions(-)
> > > > > > > >>
> > > > > > > >> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> > > > > > > >> index d1c4c20b8c..26b319f34e 100644
> > > > > > > >> --- a/hw/virtio/vhost.c
> > > > > > > >> +++ b/hw/virtio/vhost.c
> > > > > > > >> @@ -106,11 +106,30 @@ static void vhost_dev_sync_region(struct 
> > > > > > > >> vhost_dev *dev,
> > > > > > > >>  }
> > > > > > > >>  }
> > > > > > > >>
> > > > > > > >> +static bool vhost_dev_has_iommu(struct vhost_dev *dev)
> > > > > > > >> +{
> > > > > > > >> +VirtIODevice *vdev = dev->vdev;
> > > > > > > >> +
> > > > > > > >> +/*
> > > > > > > >> + * For vhost, VIRTIO_F_IOMMU_PLATFORM means the backend 
> > > > > > > >> support
> > > > > > > >> + * incremental memory mapping API via IOTLB API. For 
> > > > > > > >> platform that
> > > > > > > >> + * does not have IOMMU, there's n

Re: [PATCH v5 01/15] block-io: introduce coroutine_fn duplicates for bdrv_common_block_status_above callers

2022-11-23 Thread Paolo Bonzini

Il mer 23 nov 2022, 17:29 Kevin Wolf  ha scritto:

> As I said, personally, I don't feel like putting QEMU_IN_COROUTINE()
> assertions into every coroutine_fn is a useful thing to do. Static
> analysis (maybe even with something vrc based in 'make check'? Paolo,
> would this be realistic?)


Yes, using pyclang just to build the call graph should be much much faster
than doing all the static analysis. So it should be feasible to integrate
it with "make check". We can decide whether to bundle vrc, which is just a
dozen files, or install it from pypi (sooner or later configure will have
to be taught about that, too).

Paolo

>

Re: [PATCH v5 07/15] block: introduce QEMU_IN_COROUTINE macro

2022-11-23 Thread Paolo Bonzini

Il mer 23 nov 2022, 17:49 Kevin Wolf  ha scritto:

> I already asked about other opinions on this in patch 1.
>
> These assertions are runtime checks and I don't feel they are the right
> tool to verify coroutine_fn consistency. Asserting in tricky places
> makes sense to me, especially as long as we can't rely on static
> analysis, but adding it everywhere feels over the top to me.
>

I agree that they don't seem necessary, since static analysis is possible
and superior.

Paolo


> Kevin
>
>

Re: [PATCH for-7.2] vhost: enable vrings in vhost_dev_start() for vhost-user devices

2022-11-23 Thread Michael S. Tsirkin

On Thu, Nov 24, 2022 at 12:19:25AM +, Raphael Norwitz wrote:
> 
> > On Nov 23, 2022, at 8:16 AM, Stefano Garzarella  wrote:
> > 
> > Commit 02b61f38d3 ("hw/virtio: incorporate backend features in features")
> > properly negotiates VHOST_USER_F_PROTOCOL_FEATURES with the vhost-user
> > backend, but we forgot to enable vrings as specified in
> > docs/interop/vhost-user.rst:
> > 
> >If ``VHOST_USER_F_PROTOCOL_FEATURES`` has not been negotiated, the
> >ring starts directly in the enabled state.
> > 
> >If ``VHOST_USER_F_PROTOCOL_FEATURES`` has been negotiated, the ring is
> >initialized in a disabled state and is enabled by
> >``VHOST_USER_SET_VRING_ENABLE`` with parameter 1.
> > 
> > Some vhost-user front-ends already did this by calling
> > vhost_ops.vhost_set_vring_enable() directly:
> > - backends/cryptodev-vhost.c
> > - hw/net/virtio-net.c
> > - hw/virtio/vhost-user-gpio.c
> 
> To simplify why not rather change these devices to use the new semantics?

Granted this is already scary enough for this release.

> > 
> > But most didn't do that, so we would leave the vrings disabled and some
> > backends would not work. We observed this issue with the rust version of
> > virtiofsd [1], which uses the event loop [2] provided by the
> > vhost-user-backend crate where requests are not processed if vring is
> > not enabled.
> > 
> > Let's fix this issue by enabling the vrings in vhost_dev_start() for
> > vhost-user front-ends that don't already do this directly. Same thing
> > also in vhost_dev_stop() where we disable vrings.
> > 
> > [1] https://gitlab.com/virtio-fs/virtiofsd
> > [2] 
> > https://github.com/rust-vmm/vhost/blob/240fc2966/crates/vhost-user-backend/src/event_loop.rs#L217
> > Fixes: 02b61f38d3 ("hw/virtio: incorporate backend features in features")
> > Reported-by: German Maglione 
> > Tested-by: German Maglione 
> > Signed-off-by: Stefano Garzarella 
> 
> Looks good for vhost-user-blk/vhost-user-scsi.
> 
> Acked-by: Raphael Norwitz 
> 
> > ---
> > include/hw/virtio/vhost.h  |  6 +++--
> > backends/cryptodev-vhost.c |  4 ++--
> > backends/vhost-user.c  |  4 ++--
> > hw/block/vhost-user-blk.c  |  4 ++--
> > hw/net/vhost_net.c |  8 +++
> > hw/scsi/vhost-scsi-common.c|  4 ++--
> > hw/virtio/vhost-user-fs.c  |  4 ++--
> > hw/virtio/vhost-user-gpio.c|  4 ++--
> > hw/virtio/vhost-user-i2c.c |  4 ++--
> > hw/virtio/vhost-user-rng.c |  4 ++--
> > hw/virtio/vhost-vsock-common.c |  4 ++--
> > hw/virtio/vhost.c  | 44 ++
> > hw/virtio/trace-events |  4 ++--
> > 13 files changed, 67 insertions(+), 31 deletions(-)
> > 
> > diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
> > index 353252ac3e..67a6807fac 100644
> > --- a/include/hw/virtio/vhost.h
> > +++ b/include/hw/virtio/vhost.h
> > @@ -184,24 +184,26 @@ static inline bool vhost_dev_is_started(struct 
> > vhost_dev *hdev)
> >  * vhost_dev_start() - start the vhost device
> >  * @hdev: common vhost_dev structure
> >  * @vdev: the VirtIODevice structure
> > + * @vrings: true to have vrings enabled in this call
> >  *
> >  * Starts the vhost device. From this point VirtIO feature negotiation
> >  * can start and the device can start processing VirtIO transactions.
> >  *
> >  * Return: 0 on success, < 0 on error.
> >  */
> > -int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev);
> > +int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev, bool 
> > vrings);
> > 
> > /**
> >  * vhost_dev_stop() - stop the vhost device
> >  * @hdev: common vhost_dev structure
> >  * @vdev: the VirtIODevice structure
> > + * @vrings: true to have vrings disabled in this call
> >  *
> >  * Stop the vhost device. After the device is stopped the notifiers
> >  * can be disabled (@vhost_dev_disable_notifiers) and the device can
> >  * be torn down (@vhost_dev_cleanup).
> >  */
> > -void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev);
> > +void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool 
> > vrings);
> > 
> > /**
> >  * DOC: vhost device configuration handling
> > diff --git a/backends/cryptodev-vhost.c b/backends/cryptodev-vhost.c
> > index bc13e466b4..572f87b3be 100644
> > --- a/backends/cryptodev-vhost.c
> > +++ b/backends/cryptodev-vhost.c
> > @@ -94,7 +94,7 @@ cryptodev_vhost_start_one(CryptoDevBackendVhost *crypto,
> > goto fail_notifiers;
> > }
> > 
> > -r = vhost_dev_start(&crypto->dev, dev);
> > +r = vhost_dev_start(&crypto->dev, dev, false);
> > if (r < 0) {
> > goto fail_start;
> > }
> > @@ -111,7 +111,7 @@ static void
> > cryptodev_vhost_stop_one(CryptoDevBackendVhost *crypto,
> >  VirtIODevice *dev)
> > {
> > -vhost_dev_stop(&crypto->dev, dev);
> > +vhost_dev_stop(&crypto->dev, dev, false);
> > vhost_dev_disable_notifiers(&crypto->dev, dev);
> > }
> > 
> > diff --git a/backends/vhost-user.c b/backends

[PATCH 2/2] target/ppc: Check DEXCR on hash{st, chk} instructions

2022-11-23 Thread Nicholas Miehlbradt

Adds checks to the hashst and hashchk instructions to only execute if
enabled by the relevant aspect in the DEXCR and HDEXCR.

This behaviour is guarded behind TARGET_PPC64 since Power10 is
currently the only implementation which has the DEXCR.

Signed-off-by: Nicholas Miehlbradt 
---
 target/ppc/excp_helper.c | 58 +---
 1 file changed, 43 insertions(+), 15 deletions(-)

diff --git a/target/ppc/excp_helper.c b/target/ppc/excp_helper.c
index 94adcb766b..add4d54ae7 100644
--- a/target/ppc/excp_helper.c
+++ b/target/ppc/excp_helper.c
@@ -2902,29 +2902,57 @@ static uint64_t hash_digest(uint64_t ra, uint64_t rb, 
uint64_t key)
 return stage1_h ^ stage1_l;
 }
 
+static void do_hash(CPUPPCState *env, target_ulong ea, target_ulong ra,
+target_ulong rb, uint64_t key, bool store)
+{
+uint64_t calculated_hash = hash_digest(ra, rb, key), loaded_hash;
+
+if (store) {
+cpu_stq_data_ra(env, ea, calculated_hash, GETPC());
+} else {
+loaded_hash = cpu_ldq_data_ra(env, ea, GETPC());
+if (loaded_hash != calculated_hash) {
+raise_exception_err_ra(env, POWERPC_EXCP_PROGRAM,
+POWERPC_EXCP_TRAP, GETPC());
+}
+}
+}
+
 #include "qemu/guest-random.h"
 
-#define HELPER_HASH(op, key, store)   \
+#ifdef TARGET_PPC64
+#define HELPER_HASH(op, key, store, dexcr_aspect) \
 void helper_##op(CPUPPCState *env, target_ulong ea, target_ulong ra,  \
  target_ulong rb) \
 { \
-uint64_t calculated_hash = hash_digest(ra, rb, key), loaded_hash; \
-  \
-if (store) {  \
-cpu_stq_data_ra(env, ea, calculated_hash, GETPC());   \
-} else {  \
-loaded_hash = cpu_ldq_data_ra(env, ea, GETPC());  \
-if (loaded_hash != calculated_hash) { \
-raise_exception_err_ra(env, POWERPC_EXCP_PROGRAM, \
-POWERPC_EXCP_TRAP, GETPC());  \
-} \
+if (env->msr & R_MSR_PR_MASK) {   \
+if (!(env->spr[SPR_DEXCR] & R_DEXCR_PRO_##dexcr_aspect##_MASK ||  \
+env->spr[SPR_HDEXCR] & R_HDEXCR_ENF_##dexcr_aspect##_MASK))   \
+return;   \
+} else if (!(env->msr & R_MSR_HV_MASK)) { \
+if (!(env->spr[SPR_DEXCR] & R_DEXCR_PNH_##dexcr_aspect##_MASK ||  \
+env->spr[SPR_HDEXCR] & R_HDEXCR_ENF_##dexcr_aspect##_MASK))   \
+return;   \
+} else if (!(env->msr & R_MSR_S_MASK)) {  \
+if (!(env->spr[SPR_HDEXCR] & R_HDEXCR_HNU_##dexcr_aspect##_MASK)) \
+return;   \
 } \
+  \
+do_hash(env, ea, ra, rb, key, store); \
+}
+#else
+#define HELPER_HASH(op, key, store, dexcr_aspect) \
+void helper_##op(CPUPPCState *env, target_ulong ea, target_ulong ra,  \
+ target_ulong rb) \
+{ \
+do_hash(env, ea, ra, rb, key, store); \
 }
+#endif /* TARGET_PPC64 */
 
-HELPER_HASH(HASHST, env->spr[SPR_HASHKEYR], true)
-HELPER_HASH(HASHCHK, env->spr[SPR_HASHKEYR], false)
-HELPER_HASH(HASHSTP, env->spr[SPR_HASHPKEYR], true)
-HELPER_HASH(HASHCHKP, env->spr[SPR_HASHPKEYR], false)
+HELPER_HASH(HASHST, env->spr[SPR_HASHKEYR], true, NPHIE)
+HELPER_HASH(HASHCHK, env->spr[SPR_HASHKEYR], false, NPHIE)
+HELPER_HASH(HASHSTP, env->spr[SPR_HASHPKEYR], true, PHIE)
+HELPER_HASH(HASHCHKP, env->spr[SPR_HASHPKEYR], false, PHIE)
 #endif /* CONFIG_TCG */
 
 #if !defined(CONFIG_USER_ONLY)
-- 
2.34.1

[PATCH 1/2] target/ppc: Implement the DEXCR and HDEXCR

2022-11-23 Thread Nicholas Miehlbradt

Define the DEXCR and HDEXCR as special purpose registers.

Each register occupies two SPR indicies, one which can be read in an
unprivileged state and one which can be modified in the appropriate
priviliged state, however both indicies refer to the same underlying
value.

Note that the ISA uses the abbreviation UDEXCR in two different
contexts: the userspace DEXCR, the SPR index which can be read from
userspace (implemented in this patch), and the ultravisor DEXCR, the
equivalent register for the ultravisor state (not implemented).

Signed-off-by: Nicholas Miehlbradt 
---
 target/ppc/cpu.h| 19 +++
 target/ppc/cpu_init.c   | 25 +
 target/ppc/spr_common.h |  1 +
 target/ppc/translate.c  |  9 +
 4 files changed, 54 insertions(+)

diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index 81d4263a07..0ed9f2ae35 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -1068,6 +1068,21 @@ struct ppc_radix_page_info {
 uint32_t entries[PPC_PAGE_SIZES_MAX_SZ];
 };
 
+/*/
+/* Dynamic Execution Control Register */
+
+#define DEXCR_ASPECT(name, num)\
+FIELD(DEXCR, PNH_##name, PPC_BIT_NR(num), 1)   \
+FIELD(DEXCR, PRO_##name, PPC_BIT_NR(num + 32), 1)  \
+FIELD(HDEXCR, HNU_##name, PPC_BIT_NR(num), 1)  \
+FIELD(HDEXCR, ENF_##name, PPC_BIT_NR(num + 32), 1) \
+
+DEXCR_ASPECT(SBHE, 0)
+DEXCR_ASPECT(IDRTPB, 1)
+DEXCR_ASPECT(SRAPD, 4)
+DEXCR_ASPECT(NPHIE, 5)
+DEXCR_ASPECT(PHIE, 6)
+
 /*/
 /* The whole PowerPC CPU context */
 
@@ -1674,9 +1689,11 @@ void ppc_compat_add_property(Object *obj, const char 
*name,
 #define SPR_BOOKE_GIVOR13 (0x1BC)
 #define SPR_BOOKE_GIVOR14 (0x1BD)
 #define SPR_TIR   (0x1BE)
+#define SPR_UHDEXCR   (0x1C7)
 #define SPR_PTCR  (0x1D0)
 #define SPR_HASHKEYR  (0x1D4)
 #define SPR_HASHPKEYR (0x1D5)
+#define SPR_HDEXCR(0x1D7)
 #define SPR_BOOKE_SPEFSCR (0x200)
 #define SPR_Exxx_BBEAR(0x201)
 #define SPR_Exxx_BBTAR(0x202)
@@ -1865,8 +1882,10 @@ void ppc_compat_add_property(Object *obj, const char 
*name,
 #define SPR_RCPU_L2U_RA2  (0x32A)
 #define SPR_MPC_MD_DBRAM1 (0x32A)
 #define SPR_RCPU_L2U_RA3  (0x32B)
+#define SPR_UDEXCR(0x32C)
 #define SPR_TAR   (0x32F)
 #define SPR_ASDR  (0x330)
+#define SPR_DEXCR (0x33C)
 #define SPR_IC(0x350)
 #define SPR_VTB   (0x351)
 #define SPR_MMCRC (0x353)
diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
index cbf0081374..d6b950feb6 100644
--- a/target/ppc/cpu_init.c
+++ b/target/ppc/cpu_init.c
@@ -5727,6 +5727,30 @@ static void register_power10_hash_sprs(CPUPPCState *env)
 hashpkeyr_initial_value);
 }
 
+static void register_power10_dexcr_sprs(CPUPPCState *env)
+{
+spr_register(env, SPR_DEXCR, "DEXCR",
+SPR_NOACCESS, SPR_NOACCESS,
+&spr_read_generic, &spr_write_dexcr,
+0);
+
+spr_register(env, SPR_UDEXCR, "DEXCR",
+&spr_read_generic, SPR_NOACCESS,
+&spr_read_generic, SPR_NOACCESS,
+0);
+
+spr_register_hv(env, SPR_HDEXCR, "HDEXCR",
+SPR_NOACCESS, SPR_NOACCESS,
+SPR_NOACCESS, SPR_NOACCESS,
+&spr_read_generic, &spr_write_dexcr,
+0);
+
+spr_register(env, SPR_UHDEXCR, "HDEXCR",
+&spr_read_generic, SPR_NOACCESS,
+&spr_read_generic, SPR_NOACCESS,
+0);
+}
+
 /*
  * Initialize PMU counter overflow timers for Power8 and
  * newer Power chips when using TCG.
@@ -6402,6 +6426,7 @@ static void init_proc_POWER10(CPUPPCState *env)
 register_power8_rpr_sprs(env);
 register_power9_mmu_sprs(env);
 register_power10_hash_sprs(env);
+register_power10_dexcr_sprs(env);
 
 /* FIXME: Filter fields properly based on privilege level */
 spr_register_kvm_hv(env, SPR_PSSCR, "PSSCR", NULL, NULL, NULL, NULL,
diff --git a/target/ppc/spr_common.h b/target/ppc/spr_common.h
index b5a5bc6895..3cfa500250 100644
--- a/target/ppc/spr_common.h
+++ b/target/ppc/spr_common.h
@@ -195,6 +195,7 @@ void spr_read_ebb_upper32(DisasContext *ctx, int gprn, int 
sprn);
 void spr_write_ebb_upper32(DisasContext *ctx, int sprn, int gprn);
 void spr_write_hmer(DisasContext *ctx, int sprn, int gprn);
 void spr_write_lpcr(DisasContext *ctx, int sprn, int gprn);
+void spr_write_dexcr(DisasContext *ctx, int sprn, int gprn);
 #endif
 
 void register_low_BATs(CPUPPCState *env);
diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index 19c1d17cb0..24e9e2fece 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -1249,6 +1249,15 @@ void spr_write_ebb_upper32(DisasContext *ctx, int sprn, 
int gprn)
 gen_fscr_facility_check(ctx, SPR_FSCR, FSCR_EBB, sprn, FSCR_IC_EBB);
 spr_wr

[PATCH 0/2] target/ppc: Implement Dynamic Execution Control Registers

2022-11-23 Thread Nicholas Miehlbradt

Implements the Dynamic Execution Control Register (DEXCR) and the
Hypervisor Dynamic Execution Control Register (HDEXCR) in TCG as 
defined in Power ISA 3.1B. Only aspects 5 (Non-privileged hash instruction 
enable) and 6 (Privileged hash instruction enable) have architectural 
effects. Other aspects can be manipulated but have no effect on execution.

Adds checks to these registers in the hashst and hashchk instructions so
that they are executed as nops when not enabled.

Nicholas Miehlbradt (2):
  target/ppc: Implement the DEXCR and HDEXCR
  target/ppc: Check DEXCR on hash{st, chk} instructions

 target/ppc/cpu.h | 19 +
 target/ppc/cpu_init.c| 25 +
 target/ppc/excp_helper.c | 58 +---
 target/ppc/spr_common.h  |  1 +
 target/ppc/translate.c   |  9 +++
 5 files changed, 97 insertions(+), 15 deletions(-)

-- 
2.34.1

Re: [PATCH for 7.2?] vhost: fix vq dirt bitmap syncing when vIOMMU is enabled

2022-11-23 Thread Jason Wang

On Wed, Nov 23, 2022 at 3:21 PM Michael S. Tsirkin  wrote:
>
> On Wed, Nov 23, 2022 at 03:08:25PM +0800, Jason Wang wrote:
> > On Wed, Nov 23, 2022 at 2:15 PM Michael S. Tsirkin  wrote:
> > >
> > > On Wed, Nov 23, 2022 at 01:47:04PM +0800, Jason Wang wrote:
> > > > On Wed, Nov 23, 2022 at 1:26 PM Jason Wang  wrote:
> > > > >
> > > > > On Wed, Nov 23, 2022 at 12:28 AM Eric Auger  
> > > > > wrote:
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > On 11/22/22 10:43, Michael S. Tsirkin wrote:
> > > > > > > On Tue, Nov 22, 2022 at 11:01:11AM +0800, Jason Wang wrote:
> > > > > > >> When vIOMMU is enabled, the vq->used_phys is actually the IOVA 
> > > > > > >> not
> > > > > > >> GPA. So we need to translate it to GPA before the syncing 
> > > > > > >> otherwise we
> > > > > > >> may hit the following crash since IOVA could be out of the scope 
> > > > > > >> of
> > > > > > >> the GPA log size. This could be noted when using virtio-IOMMU 
> > > > > > >> with
> > > > > > >> vhost using 1G memory.
> > > > > > > Noted how exactly? What does "using 1G memory" mean?
> > > > > >
> > > > > > We hit the following assertion:
> > > > > >
> > > > > > qemu-system-x86_64: ../hw/virtio/vhost.c:85: vhost_dev_sync_region: 
> > > > > > Assertion `end / VHOST_LOG_CHUNK < dev->log_size' failed.
> > > > > >
> > > > > > On the last time vhost_get_log_size() is called it takes into 
> > > > > > account 2 regions when computing the log_size:
> > > > > > qemu-system-x86_64: vhost_get_log_size region 0 last=0x9 
> > > > > > updated log_size=0x3
> > > > > > qemu-system-x86_64: vhost_get_log_size region 1 last=0x3fff 
> > > > > > updated log_size=0x1000
> > > > > > so in vhost_migration_log() vhost_get_log_size(dev) returns 0x1000
> > > > > >
> > > > > > In the test case, memory_region_sync_dirty_bitmap() gets called for 
> > > > > > mem-machine_mem, vga.vram (several times) and eventually on 
> > > > > > pc.bios. This latter is reponsible for the assertion:
> > > > > >
> > > > > > qemu-system-x86_64: vhost_log_sync calls sync_dirty_map on pc.bios 
> > > > > > for the full range
> > > > > > qemu-system-x86_64: vhost_sync_dirty_bitmap calls 
> > > > > > vhost_dev_sync_region on region 0
> > > > > > qemu-system-x86_64: vhost_dev_sync_region end=0x9 < 
> > > > > > start=0xfffc
> > > > > > qemu-system-x86_64: vhost_sync_dirty_bitmap calls 
> > > > > > vhost_dev_sync_region on region 1
> > > > > > qemu-system-x86_64: vhost_dev_sync_region end=0x3fff < 
> > > > > > start=0xfffc
> > > > > > qemu-system-x86_64: vhost_sync_dirty_bitmap calls 
> > > > > > vhost_dev_sync_region on vq 0 <-
> > > > > > qemu-system-x86_64: vhost_dev_sync_region pc.bios mfirst=0xfffc 
> > > > > > mlast=0x rfirst=0xf240 rlast=0xfa45
> > > > > > qemu-system-x86_64: vhost_dev_sync_region pc.bios end=0xfa45 
> > > > > > VHOST_LOG_CHUNK=0x4 end/VHOST_LOG_CHUNK=0x3fff 
> > > > > > dev->log_size=0x1000
> > > > > > qemu-system-x86_64: ../hw/virtio/vhost.c:85: vhost_dev_sync_region: 
> > > > > > Assertion `end / VHOST_LOG_CHUNK < dev->log_size' failed.
> > > > > >
> > > > > >
> > > > > >
> > > > > > "using 1G memory": We hit the issue with a guest started with 1GB 
> > > > > > initial RAM.
> > > > >
> > > > > Yes, so in the case the guest iova allocator may try to use an IOVA
> > > > > that is beyond 1G.
> > > > >
> > > > > >
> > > > > >
> > > > > > >
> > > > > > >> Fixes: c471ad0e9bd46 ("vhost_net: device IOTLB support")
> > > > > > >> Cc: qemu-sta...@nongnu.org
> > > > > > >> Reported-by: Yalan Zhang 
> > > > > > >> Tested-by: Eric Auger 
> > > > > > >> Tested-by: Lei Yang 
> > > > > > >> Signed-off-by: Jason Wang 
> > > > > > >> ---
> > > > > > >>  hw/virtio/vhost.c | 65 
> > > > > > >> ---
> > > > > > >>  1 file changed, 45 insertions(+), 20 deletions(-)
> > > > > > >>
> > > > > > >> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> > > > > > >> index d1c4c20b8c..26b319f34e 100644
> > > > > > >> --- a/hw/virtio/vhost.c
> > > > > > >> +++ b/hw/virtio/vhost.c
> > > > > > >> @@ -106,11 +106,30 @@ static void vhost_dev_sync_region(struct 
> > > > > > >> vhost_dev *dev,
> > > > > > >>  }
> > > > > > >>  }
> > > > > > >>
> > > > > > >> +static bool vhost_dev_has_iommu(struct vhost_dev *dev)
> > > > > > >> +{
> > > > > > >> +VirtIODevice *vdev = dev->vdev;
> > > > > > >> +
> > > > > > >> +/*
> > > > > > >> + * For vhost, VIRTIO_F_IOMMU_PLATFORM means the backend 
> > > > > > >> support
> > > > > > >> + * incremental memory mapping API via IOTLB API. For 
> > > > > > >> platform that
> > > > > > >> + * does not have IOMMU, there's no need to enable this 
> > > > > > >> feature
> > > > > > >> + * which may cause unnecessary IOTLB miss/update 
> > > > > > >> transactions.
> > > > > > >> + */
> > > > > > >> +if (vdev) {
> > > > > > >> +return virtio_bus_device_iommu_enabled(vdev) &&
> > > > > > >> +virtio_host_has_feature(vdev,

Re: [PATCH for-7.2] vhost: enable vrings in vhost_dev_start() for vhost-user devices

2022-11-23 Thread Raphael Norwitz



> On Nov 23, 2022, at 8:16 AM, Stefano Garzarella  wrote:
> 
> Commit 02b61f38d3 ("hw/virtio: incorporate backend features in features")
> properly negotiates VHOST_USER_F_PROTOCOL_FEATURES with the vhost-user
> backend, but we forgot to enable vrings as specified in
> docs/interop/vhost-user.rst:
> 
>If ``VHOST_USER_F_PROTOCOL_FEATURES`` has not been negotiated, the
>ring starts directly in the enabled state.
> 
>If ``VHOST_USER_F_PROTOCOL_FEATURES`` has been negotiated, the ring is
>initialized in a disabled state and is enabled by
>``VHOST_USER_SET_VRING_ENABLE`` with parameter 1.
> 
> Some vhost-user front-ends already did this by calling
> vhost_ops.vhost_set_vring_enable() directly:
> - backends/cryptodev-vhost.c
> - hw/net/virtio-net.c
> - hw/virtio/vhost-user-gpio.c

To simplify why not rather change these devices to use the new semantics?

> 
> But most didn't do that, so we would leave the vrings disabled and some
> backends would not work. We observed this issue with the rust version of
> virtiofsd [1], which uses the event loop [2] provided by the
> vhost-user-backend crate where requests are not processed if vring is
> not enabled.
> 
> Let's fix this issue by enabling the vrings in vhost_dev_start() for
> vhost-user front-ends that don't already do this directly. Same thing
> also in vhost_dev_stop() where we disable vrings.
> 
> [1] https://gitlab.com/virtio-fs/virtiofsd
> [2] 
> https://github.com/rust-vmm/vhost/blob/240fc2966/crates/vhost-user-backend/src/event_loop.rs#L217
> Fixes: 02b61f38d3 ("hw/virtio: incorporate backend features in features")
> Reported-by: German Maglione 
> Tested-by: German Maglione 
> Signed-off-by: Stefano Garzarella 

Looks good for vhost-user-blk/vhost-user-scsi.

Acked-by: Raphael Norwitz 

> ---
> include/hw/virtio/vhost.h  |  6 +++--
> backends/cryptodev-vhost.c |  4 ++--
> backends/vhost-user.c  |  4 ++--
> hw/block/vhost-user-blk.c  |  4 ++--
> hw/net/vhost_net.c |  8 +++
> hw/scsi/vhost-scsi-common.c|  4 ++--
> hw/virtio/vhost-user-fs.c  |  4 ++--
> hw/virtio/vhost-user-gpio.c|  4 ++--
> hw/virtio/vhost-user-i2c.c |  4 ++--
> hw/virtio/vhost-user-rng.c |  4 ++--
> hw/virtio/vhost-vsock-common.c |  4 ++--
> hw/virtio/vhost.c  | 44 ++
> hw/virtio/trace-events |  4 ++--
> 13 files changed, 67 insertions(+), 31 deletions(-)
> 
> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
> index 353252ac3e..67a6807fac 100644
> --- a/include/hw/virtio/vhost.h
> +++ b/include/hw/virtio/vhost.h
> @@ -184,24 +184,26 @@ static inline bool vhost_dev_is_started(struct 
> vhost_dev *hdev)
>  * vhost_dev_start() - start the vhost device
>  * @hdev: common vhost_dev structure
>  * @vdev: the VirtIODevice structure
> + * @vrings: true to have vrings enabled in this call
>  *
>  * Starts the vhost device. From this point VirtIO feature negotiation
>  * can start and the device can start processing VirtIO transactions.
>  *
>  * Return: 0 on success, < 0 on error.
>  */
> -int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev);
> +int vhost_dev_start(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings);
> 
> /**
>  * vhost_dev_stop() - stop the vhost device
>  * @hdev: common vhost_dev structure
>  * @vdev: the VirtIODevice structure
> + * @vrings: true to have vrings disabled in this call
>  *
>  * Stop the vhost device. After the device is stopped the notifiers
>  * can be disabled (@vhost_dev_disable_notifiers) and the device can
>  * be torn down (@vhost_dev_cleanup).
>  */
> -void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev);
> +void vhost_dev_stop(struct vhost_dev *hdev, VirtIODevice *vdev, bool vrings);
> 
> /**
>  * DOC: vhost device configuration handling
> diff --git a/backends/cryptodev-vhost.c b/backends/cryptodev-vhost.c
> index bc13e466b4..572f87b3be 100644
> --- a/backends/cryptodev-vhost.c
> +++ b/backends/cryptodev-vhost.c
> @@ -94,7 +94,7 @@ cryptodev_vhost_start_one(CryptoDevBackendVhost *crypto,
> goto fail_notifiers;
> }
> 
> -r = vhost_dev_start(&crypto->dev, dev);
> +r = vhost_dev_start(&crypto->dev, dev, false);
> if (r < 0) {
> goto fail_start;
> }
> @@ -111,7 +111,7 @@ static void
> cryptodev_vhost_stop_one(CryptoDevBackendVhost *crypto,
>  VirtIODevice *dev)
> {
> -vhost_dev_stop(&crypto->dev, dev);
> +vhost_dev_stop(&crypto->dev, dev, false);
> vhost_dev_disable_notifiers(&crypto->dev, dev);
> }
> 
> diff --git a/backends/vhost-user.c b/backends/vhost-user.c
> index 5dedb2d987..7bfcaef976 100644
> --- a/backends/vhost-user.c
> +++ b/backends/vhost-user.c
> @@ -85,7 +85,7 @@ vhost_user_backend_start(VhostUserBackend *b)
> }
> 
> b->dev.acked_features = b->vdev->guest_features;
> -ret = vhost_dev_start(&b->dev, b->vdev);
> +ret = vhost_dev_start(&b->dev, b->vdev, true);
> if (ret < 0) {

[ANNOUNCE] QEMU 7.2.0-rc2 is now available

2022-11-23 Thread Michael Roth

Hello,

On behalf of the QEMU Team, I'd like to announce the availability of the
third release candidate for the QEMU 7.2 release. This release is meant
for testing purposes and should not be used in a production environment.

  http://download.qemu-project.org/qemu-7.2.0-rc2.tar.xz
  http://download.qemu-project.org/qemu-7.2.0-rc2.tar.xz.sig

You can help improve the quality of the QEMU 7.2 release by testing this
release and reporting bugs using our GitLab issue tracker:

  https://gitlab.com/qemu-project/qemu/-/milestones/7#tab-issues

The release plan, as well a documented known issues for release
candidates, are available at:

  http://wiki.qemu.org/Planning/7.2

Please add entries to the ChangeLog for the 7.2 release below:

  http://wiki.qemu.org/ChangeLog/7.2

Thank you to everyone involved!

Changes since rc1:

7c09a7f6ae: Update VERSION for v7.2.0-rc2 (Stefan Hajnoczi)
15f8f4671a: target/arm: Use signed quantity to represent VMSAv8-64 translation 
level (Ard Biesheuvel)
26ba00cf58: target/arm: Don't do two-stage lookup if stage 2 is disabled (Peter 
Maydell)
4451cc4653: hw/loongarch: Replace the value of uart info with macro (Xiaojuan 
Yang)
e8c8203e55: hw/loongarch: Fix setprop_sized method in fdt rtc node. (Xiaojuan 
Yang)
0208ba74c5: hw/loongarch: Add default stdout uart in fdt (Xiaojuan Yang)
b7c61789e6: virtio: disable error for out of spec queue-enable (Michael S. 
Tsirkin)
04e5bd441a: acpi/tests/avocado/bits: keep the work directory when BITS_DEBUG is 
set in env (Ani Sinha)
c4d4c40c51: tests/avocado: configure acpi-bits to use avocado timeout (John 
Snow)
242a58cabe: MAINTAINERS: add mst to list of biosbits maintainers (Ani Sinha)
83afb1409f: tests: acpi: x86: update expected DSDT after moving PRQx fields in 
_SB scope (Igor Mammedov)
4fd75ce076: acpi: x86: move RPQx field back to _SB scope (Igor Mammedov)
2df30863fa: tests: acpi: whitelist DSDT before moving PRQx to _SB scope (Igor 
Mammedov)
562a7d23bf: vhost: mask VIRTIO_F_RING_RESET for vhost and vhost-user devices 
(Stefano Garzarella)
5544d33d4b: gitlab: integrate coverage report (Alex Bennée)
f22a80727f: tests/avocado: skip aarch64 cloud TCG tests in CI (Alex Bennée)
ba5d1f23f7: tests/avocado: introduce alpine virt test for CI (Alex Bennée)
5d25e1e02c: tests/avocado: Raise timeout for 
boot_linux.py:BootLinuxPPC64.test_pseries_tcg (Peter Maydell)
73ee4c55f7: docs/devel: try and improve the language around patch review (Alex 
Bennée)
ca127fe96d: docs/devel: simplify the minimal checklist (Alex Bennée)
115847f6b0: docs/devel: make language a little less code centric (Alex Bennée)
668725ce6b: docs/devel: add a maintainers section to development process (Alex 
Bennée)
e558220df0: tests/docker: allow user to override check target (Alex Bennée)
a4b14b46d9: tests/avocado/machine_aspeed.py: Reduce noise on the console for 
SDK tests (Cédric Le Goater)
47fdc8fb82: Run docker probe only if docker or podman are available (Stefan 
Weil)
6d71357a3b: rtl8139: honor large send MSS value (Stefan Hajnoczi)
c74831a02c: rtl8139: keep Tx command mode 0 and 1 separate (Stefan Hajnoczi)
bd142b2391: rtl8139: avoid clobbering tx descriptor bits (Stefan Hajnoczi)
312b71abce: target/arm: Limit LPA2 effective output address when TCR.DS == 0 
(Ard Biesheuvel)
c4462523ff: tests/avocado/boot_linux.py: Bump aarch64 virt test timeout to 720s 
(Peter Maydell)
3d5af538a4: hw/intc: add implementation of GICD_IIDR to Arm GIC (Alex Bennée)
69e7e60d01: hw/intc: clean-up access to GIC multi-byte registers (Alex Bennée)
93e2da36ed: hw/sd: Fix sun4i allwinner-sdhost for U-Boot (Strahinja Jankovic)
b5280437a7: migration: Block migration comment or code is wrong (Juan Quintela)
6f39c90b86: migration: Disable multifd explicitly with compression (Peter Xu)
cedb70eafb: migration: Use non-atomic ops for clear log bitmap (Peter Xu)
afed4273b5: migration: Disallow postcopy preempt to be used with compress 
(Peter Xu)
f5816b5c86: migration: Fix race on qemu_file_shutdown() (Peter Xu)
4934a5dd7c: migration: Fix possible infinite loop of ram save process (Peter Xu)
4cc47b4395: migration/multifd/zero-copy: Create helper function for flushing 
(Leonardo Bras)
a216ec85b7: migration/channel-block: fix return value for 
qio_channel_block_{readv,writev} (Fiona Ebner)
06639f8ff5: chardev/char-win-stdio: Pass Ctrl+C to guest with a multiplexed 
monitor (Bin Meng)
049b4ad669: target/ppc: Fix build warnings when building with 'disable-tcg' 
(Vaibhav Jain)
1b7a07c441: acpi/tests/avocado/bits: some misc fixes (Ani Sinha)
c70fe3b148: ci: replace x86_64 macos-11 with aarch64 macos-12 (Daniel P. 
Berrangé)
be5df2edb5: docs/system/s390x: Document the "loadparm" machine property (Thomas 
Huth)
44ee69ea16: s390x: Fix spelling errors (Thomas Huth)

Re: [RFC PATCH] tests/avocado: use new rootfs for orangepi test

2022-11-23 Thread Philippe Mathieu-Daudé

On 23/11/22 19:49, Cédric Le Goater wrote:

On 11/23/22 19:13, Philippe Mathieu-Daudé wrote:

On 23/11/22 15:12, Alex Bennée wrote:

Thomas Huth  writes:

On 23/11/2022 12.15, Philippe Mathieu-Daudé wrote:

On 18/11/22 12:33, Alex Bennée wrote:

The old URL wasn't stable. I suspect the current URL will only be
stable for a few months so maybe we need another strategy for hosting
rootfs snapshots?

Signed-off-by: Alex Bennée 
---
   tests/avocado/boot_linux_console.py | 4 ++--
   1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tests/avocado/boot_linux_console.py
b/tests/avocado/boot_linux_console.py
index 4c9d551f47..5a2923c423 100644
--- a/tests/avocado/boot_linux_console.py
+++ b/tests/avocado/boot_linux_console.py
@@ -793,8 +793,8 @@ def test_arm_orangepi_sd(self):
   dtb_path =
'/usr/lib/linux-image-current-sunxi/sun8i-h3-orangepi-pc.dtb'
   dtb_path = self.extract_from_deb(deb_path, dtb_path)
   rootfs_url =
('http://storage.kernelci.org/images/rootfs/buildroot/'
-  'kci-2019.02/armel/base/rootfs.ext2.xz')
-    rootfs_hash = '692510cb625efda31640d1de0a8d60e26040f061'
+  
'buildroot-baseline/20221116.0/armel/rootfs.ext2.xz')

+    rootfs_hash = 'fae32f337c7b87547b10f42599acf109da8b6d9a'
If Avocado doesn't find an artifact in its local cache, it will 
fetch it

from the URL.
The cache might be populated with artifacts previously downloaded, but
their URL is not valid anymore (my case for many tests).
We can also add artifacts manually, see [1].
I'd rather keep pre-existing tests if possible, to test older
(kernel / user-space) images. We don't need to run all the tests all
the time:
tests can be filtered by tags (see [2]).
My preference here is to refactor this test, adding the
"kci-2019.02"
and "baseline-20221116.0" releases. I can prepare the patch if you /
Thomas don't object.

IMHO we shouldn't keep tests in the upstream git repository where the
binaries are not available in public anymore. They won't get run by
new contributors anymore, and also could vanish from the disks of the
people who previously downloaded it, once they wipe their cache or
upgrade to a new installation, so the test code will sooner or later
be bitrotting. But if you want to keep the tests around on your hard
disk, you could also stick the test in a local branch on your hard
disk instead.

CI/Workstation splits aside I tend to agree with Thomas here that having
tests no one else can run will lead to an accretion of broken tests.

Following this idea, should we remove all boards for which no open
source & GPL software is available? I.e:

40p  IBM RS/6000 7020 (40p)

This machine can run debian :

IMHO having QEMU able to run anything an architecture can run seems way
more interesting/helpful rather than restricting it to just open source
projects.

   qemu-system-ppc -M 40p -cpu 604 -nic user -hda ./prep.qcow2 -cdrom 
./zImage.hdd -serial mon:stdio -nographic

   >> =
   >> OpenBIOS 1.1 [Mar 7 2022 23:07]
   >> Configuration device id QEMU version 1 machine id 0
   >> CPUs: 0
   >> Memory: 128M
   >> UUID: ----
   >> CPU type PowerPC,604
   milliseconds isn't unique.
   Welcome to OpenBIOS v1.1 built on Mar 7 2022 23:07
   Trying hd:,\\:tbxi...
   >> Not a bootable ELF image
   >> switching to new context:
   loaded at: 04000400 04015218
   relocated to:  0080 00814E18
   board data at: 07C9E870 07CA527C
   relocated to:  0080B130 00811B3C
   zimage at: 0400B400 0411DC98
   avail ram: 0040 0080
   Linux/PPC load: console=/dev/ttyS0,9600 console=tty0 
ether=5,0x210,eth0 ether=11,0x300,eth1 ramdisk_size=8192 root=/dev/sda3

   Uncompressing Linuxdone.
   Now booting the kernel
   Debian GNU/Linux 3.0 6015 ttyS0
   6015 login:

Please keep it ! :)

and it also boots AIX 4.4/5.1 (with 2 small patches) but that's clearly
not open source. It is downloadable from the net though, like many macos
PPC images.

That said, we might have been putting too much in avocado and it takes
ages to run (when it does not hit some random python issue).

w.r.t. "too much in avocado", are you referring to GitLab CI?

I see the following 2 use cases with Avocado:
 1/ Run tests locally
 2/ Run tests on CI
The set of tests used in 1/ and 2/ doesn't have to be the same...

1/ is very helpful for maintainers, to run tests specific to their
subsystems. Also useful during refactor when touching other subsystems,
to run their tests before sending a patch set.

2/ is the "gating" testing. With retrospective, it was a mistake to
start running avocado on CI without any filtering on what tests to run.
Instead of trying to explain my view here, I'd like to go back to Daniel
earlier proposal:
https://lore.kernel.org/qemu-devel/20200427152036.gi1244...@redhat.com/

Per this proposal, we should only r

Re: [PULL 0/7] Fixes 20221123 patches

2022-11-23 Thread Stefan Hajnoczi

Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/7.2 for any 
user-visible changes.


signature.asc
Description: PGP signature

Re: [PULL 0/3] Avocado tests and qtests improvements

2022-11-23 Thread Stefan Hajnoczi

Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/7.2 for any 
user-visible changes.


signature.asc
Description: PGP signature

[PATCH] cpu/i386: update xsave components after CPUID filtering

2022-11-23 Thread Huanyu ZHAI

Subject: [PATCH] cpu/i386: update xsave components after CPUID filtering

On i386 platform, CPUID data are setup through three consecutive steps: CPU 
model definition, expansion and filtering.
XSAVE components are enabled during the expansion stage, by checking if they 
are enabled in CPUID. However, it is still
probable that some XSAVE features will be enabled/disabled during the filtering 
stage and the XSAVE components left unchanged.
Inconsistency between XSAVE features and enabled XSAVE components can lead to 
problems on some Linux guests in the absence of
the following patch in the kernel:

https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1452368.html

A simple case to reproduce this problem is to start a SUSE 12 SP3 guest with 
cpu model set to Skylake-Server:
$ qemu-system-x86_64 -cpu Skylake-Server ...

In the SUSE 12 SP3 guest, one can observe that PKRU will be enabled without 
Intel PKU's presence.
That's because on platform with Skylake-Server cpus, Intel PKU is disabled 
during x86_cpu_filter_features(),
but the XSAVE PKRU bit was enabled by x86_cpu_expand_features().

Signed-off-by: Huanyu ZHAI zhaihua...@huawei.com
Signed-off-by: Xin Wang 
wangxinxin.w...@huawei.com
---
target/i386/cpu.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 22b681ca37..2ee574cf05 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6362,6 +6362,9 @@ static void x86_cpu_filter_features(X86CPU *cpu, bool 
verbose)
 mark_unavailable_features(cpu, FEAT_7_0_EBX, 
CPUID_7_0_EBX_INTEL_PT, prefix);
 }
 }
+
+/* Update XSAVE components again based on the filtered CPU feature flags */
+x86_cpu_enable_xsave_components(cpu);
}

static void x86_cpu_hyperv_realize(X86CPU *cpu)
--
2.27.0

Re: [PULL 0/7] Fixes 20221123 patches

2022-11-23 Thread Peter Maydell

On Wed, 23 Nov 2022 at 14:45, Gerd Hoffmann  wrote:
>
> The following changes since commit 7c09a7f6ae1770d15535980d15dffdb23f4d9786:
>
>   Update VERSION for v7.2.0-rc2 (2022-11-22 18:59:56 -0500)
>
> are available in the Git repository at:
>
>   https://gitlab.com/kraxel/qemu.git tags/fixes-20221123-pull-request
>
> for you to fetch changes up to 7d3cf19548b7f9afd9d25c30dd1450aad7d1877d:
>
>   hw/audio/intel-hda: Drop unnecessary prototype (2022-11-23 12:30:45 +0100)
>
> 
> ui+usb+audio: bugfixes for 7.2
>
> 
>
> Claudio Fontana (1):
>   gtk: disable GTK Clipboard with a new meson option
>
> Dongwon Kim (1):
>   ui/gtk: prevent ui lock up when dpy_gl_update called again before
> current draw event occurs
>
> Joelle van Dyne (1):
>   Revert "usbredir: avoid queuing hello packet on snapshot restore"
>
> Michael Tokarev (1):
>   hw/usb/hcd-xhci.c: spelling: tranfer
>
> Peter Maydell (3):
>   hw/usb/hcd-xhci: Reset the XHCIState with device_cold_reset()
>   hw/audio/intel-hda: don't reset codecs twice
>   hw/audio/intel-hda: Drop unnecessary prototype

That reset-related series was for-8.0 material, it shouldn't be in
a pullreq at this point in the release cycle I think.

thanks
-- PMM

Re: [PATCH v2] vfio/pci: Verify each MSI vector to avoid invalid MSI vectors

2022-11-23 Thread Alex Williamson

On Wed, 23 Nov 2022 12:08:05 +
Marc Zyngier  wrote:

> On Wed, 23 Nov 2022 01:42:36 +,
> chenxiang  wrote:
> > 
> > From: Xiang Chen 
> > 
> > Currently the number of MSI vectors comes from register PCI_MSI_FLAGS
> > which should be power-of-2 in qemu, in some scenaries it is not the same as
> > the number that driver requires in guest, for example, a PCI driver wants
> > to allocate 6 MSI vecotrs in guest, but as the limitation, it will allocate
> > 8 MSI vectors. So it requires 8 MSI vectors in qemu while the driver in
> > guest only wants to allocate 6 MSI vectors.
> > 
> > When GICv4.1 is enabled, it iterates over all possible MSIs and enable the
> > forwarding while the guest has only created some of mappings in the virtual
> > ITS, so some calls fail. The exception print is as following:
> > vfio-pci :3a:00.1: irq bypass producer (token 8f08224d) 
> > registration
> > fails:66311
> > 
> > To avoid the issue, verify each MSI vector, skip some operations such as
> > request_irq() and irq_bypass_register_producer() for those invalid MSI 
> > vectors.
> > 
> > Signed-off-by: Xiang Chen 
> > ---
> > I reported the issue at the link:
> > https://lkml.kernel.org/lkml/87cze9lcut.wl-...@kernel.org/T/
> > 
> > Change Log:
> > v1 -> v2:
> > Verify each MSI vector in kernel instead of adding systemcall according to
> > Mar's suggestion
> > ---
> >  arch/arm64/kvm/vgic/vgic-irqfd.c  | 13 +
> >  arch/arm64/kvm/vgic/vgic-its.c| 36 
> >  arch/arm64/kvm/vgic/vgic.h|  1 +
> >  drivers/vfio/pci/vfio_pci_intrs.c | 33 +
> >  include/linux/kvm_host.h  |  2 ++
> >  5 files changed, 85 insertions(+)
> > 
> > diff --git a/arch/arm64/kvm/vgic/vgic-irqfd.c 
> > b/arch/arm64/kvm/vgic/vgic-irqfd.c
> > index 475059b..71f6af57 100644
> > --- a/arch/arm64/kvm/vgic/vgic-irqfd.c
> > +++ b/arch/arm64/kvm/vgic/vgic-irqfd.c
> > @@ -98,6 +98,19 @@ int kvm_set_msi(struct kvm_kernel_irq_routing_entry *e,
> > return vgic_its_inject_msi(kvm, &msi);
> >  }
> >  
> > +int kvm_verify_msi(struct kvm *kvm,
> > +  struct kvm_kernel_irq_routing_entry *irq_entry)
> > +{
> > +   struct kvm_msi msi;
> > +
> > +   if (!vgic_has_its(kvm))
> > +   return -ENODEV;
> > +
> > +   kvm_populate_msi(irq_entry, &msi);
> > +
> > +   return vgic_its_verify_msi(kvm, &msi);
> > +}
> > +
> >  /**
> >   * kvm_arch_set_irq_inatomic: fast-path for irqfd injection
> >   */
> > diff --git a/arch/arm64/kvm/vgic/vgic-its.c b/arch/arm64/kvm/vgic/vgic-its.c
> > index 94a666d..8312a4a 100644
> > --- a/arch/arm64/kvm/vgic/vgic-its.c
> > +++ b/arch/arm64/kvm/vgic/vgic-its.c
> > @@ -767,6 +767,42 @@ int vgic_its_inject_cached_translation(struct kvm 
> > *kvm, struct kvm_msi *msi)
> > return 0;
> >  }
> >  
> > +int vgic_its_verify_msi(struct kvm *kvm, struct kvm_msi *msi)
> > +{
> > +   struct vgic_its *its;
> > +   struct its_ite *ite;
> > +   struct kvm_vcpu *vcpu;
> > +   int ret = 0;
> > +
> > +   if (!irqchip_in_kernel(kvm) || (msi->flags & ~KVM_MSI_VALID_DEVID))
> > +   return -EINVAL;
> > +
> > +   if (!vgic_has_its(kvm))
> > +   return -ENODEV;
> > +
> > +   its = vgic_msi_to_its(kvm, msi);
> > +   if (IS_ERR(its))
> > +   return PTR_ERR(its);
> > +
> > +   mutex_lock(&its->its_lock);
> > +   if (!its->enabled) {
> > +   ret = -EBUSY;
> > +   goto unlock;
> > +   }
> > +   ite = find_ite(its, msi->devid, msi->data);
> > +   if (!ite || !its_is_collection_mapped(ite->collection)) {
> > +   ret = E_ITS_INT_UNMAPPED_INTERRUPT;
> > +   goto unlock;
> > +   }
> > +
> > +   vcpu = kvm_get_vcpu(kvm, ite->collection->target_addr);
> > +   if (!vcpu)
> > +   ret = E_ITS_INT_UNMAPPED_INTERRUPT;  
> 
> I'm sorry, but what does this mean to the caller? This should never
> leak outside of the ITS code.
> 
> > +unlock:
> > +   mutex_unlock(&its->its_lock);
> > +   return ret;
> > +}
> > +
> >  /*
> >   * Queries the KVM IO bus framework to get the ITS pointer from the given
> >   * doorbell address.
> > diff --git a/arch/arm64/kvm/vgic/vgic.h b/arch/arm64/kvm/vgic/vgic.h
> > index 0c8da72..d452150 100644
> > --- a/arch/arm64/kvm/vgic/vgic.h
> > +++ b/arch/arm64/kvm/vgic/vgic.h
> > @@ -240,6 +240,7 @@ int kvm_vgic_register_its_device(void);
> >  void vgic_enable_lpis(struct kvm_vcpu *vcpu);
> >  void vgic_flush_pending_lpis(struct kvm_vcpu *vcpu);
> >  int vgic_its_inject_msi(struct kvm *kvm, struct kvm_msi *msi);
> > +int vgic_its_verify_msi(struct kvm *kvm, struct kvm_msi *msi);
> >  int vgic_v3_has_attr_regs(struct kvm_device *dev, struct kvm_device_attr 
> > *attr);
> >  int vgic_v3_dist_uaccess(struct kvm_vcpu *vcpu, bool is_write,
> >  int offset, u32 *val);
> > diff --git a/drivers/vfio/pci/vfio_pci_intrs.c 
> > b/drivers/vfio/pci/vfio_pci_intrs.c
> > index 40c3d7c..3027805 100644
> > --- a/drivers/vfio/pci/vfio_pci_intrs.c
> > +++ b/drivers/vfio/pci/

Re: [PATCH v3 12/17] vfio/migration: Implement VFIO migration protocol v2

2022-11-23 Thread Dr. David Alan Gilbert

* Avihai Horon (avih...@nvidia.com) wrote:



> +ret = qemu_file_get_to_fd(f, migration->data_fd, data_size);
> +if (!ret) {
> +trace_vfio_load_state_device_data(vbasedev->name, data_size);
> +
> +}

I notice you had a few cases like that; I wouldn't bother making that
conditional - just add 'ret' to the trace parameters; that way if it
fails then you can see that in the trace, and it's simpler anyway.

Dave

> +
> +return ret;
> +}
> +
>  static int vfio_v1_load_buffer(QEMUFile *f, VFIODevice *vbasedev,
> uint64_t data_size)
>  {
> @@ -394,6 +484,14 @@ static int vfio_load_device_config_state(QEMUFile *f, 
> void *opaque)
>  return qemu_file_get_error(f);
>  }
>  
> +static void vfio_migration_cleanup(VFIODevice *vbasedev)
> +{
> +VFIOMigration *migration = vbasedev->migration;
> +
> +close(migration->data_fd);
> +migration->data_fd = -1;
> +}
> +
>  static void vfio_migration_v1_cleanup(VFIODevice *vbasedev)
>  {
>  VFIOMigration *migration = vbasedev->migration;
> @@ -405,6 +503,18 @@ static void vfio_migration_v1_cleanup(VFIODevice 
> *vbasedev)
>  
>  /* -- */
>  
> +static int vfio_save_setup(QEMUFile *f, void *opaque)
> +{
> +VFIODevice *vbasedev = opaque;
> +
> +trace_vfio_save_setup(vbasedev->name);
> +
> +qemu_put_be64(f, VFIO_MIG_FLAG_DEV_SETUP_STATE);
> +qemu_put_be64(f, VFIO_MIG_FLAG_END_OF_STATE);
> +
> +return qemu_file_get_error(f);
> +}
> +
>  static int vfio_v1_save_setup(QEMUFile *f, void *opaque)
>  {
>  VFIODevice *vbasedev = opaque;
> @@ -448,6 +558,14 @@ static int vfio_v1_save_setup(QEMUFile *f, void *opaque)
>  return 0;
>  }
>  
> +static void vfio_save_cleanup(void *opaque)
> +{
> +VFIODevice *vbasedev = opaque;
> +
> +vfio_migration_cleanup(vbasedev);
> +trace_vfio_save_cleanup(vbasedev->name);
> +}
> +
>  static void vfio_v1_save_cleanup(void *opaque)
>  {
>  VFIODevice *vbasedev = opaque;
> @@ -456,6 +574,23 @@ static void vfio_v1_save_cleanup(void *opaque)
>  trace_vfio_save_cleanup(vbasedev->name);
>  }
>  
> +#define VFIO_MIG_PENDING_SIZE (512 * 1024 * 1024)
> +static void vfio_save_pending(void *opaque, uint64_t threshold_size,
> +  uint64_t *res_precopy, uint64_t *res_postcopy)
> +{
> +VFIODevice *vbasedev = opaque;
> +
> +/*
> + * VFIO migration protocol v2 currently doesn't have an API to get 
> pending
> + * device state size. Until such API is introduced, report some big
> + * arbitrary pending size so the device will be taken into account for
> + * downtime limit calculations.
> + */
> +*res_postcopy += VFIO_MIG_PENDING_SIZE;
> +
> +trace_vfio_save_pending(vbasedev->name, *res_precopy, *res_postcopy);
> +}
> +
>  static void vfio_v1_save_pending(void *opaque, uint64_t threshold_size,
>   uint64_t *res_precopy, uint64_t 
> *res_postcopy)
>  {
> @@ -520,6 +655,67 @@ static int vfio_save_iterate(QEMUFile *f, void *opaque)
>  return 0;
>  }
>  
> +/* Returns 1 if end-of-stream is reached, 0 if more data and -1 if error */
> +static int vfio_save_block(QEMUFile *f, VFIOMigration *migration)
> +{
> +ssize_t data_size;
> +
> +data_size = read(migration->data_fd, migration->data_buffer,
> + migration->data_buffer_size);
> +if (data_size < 0) {
> +return -1;
> +}
> +if (data_size == 0) {
> +return 1;
> +}
> +
> +qemu_put_be64(f, VFIO_MIG_FLAG_DEV_DATA_STATE);
> +qemu_put_be64(f, data_size);
> +qemu_put_buffer(f, migration->data_buffer, data_size);
> +bytes_transferred += data_size;
> +
> +trace_vfio_save_block(migration->vbasedev->name, data_size);
> +
> +return qemu_file_get_error(f);
> +}
> +
> +static int vfio_save_complete_precopy(QEMUFile *f, void *opaque)
> +{
> +VFIODevice *vbasedev = opaque;
> +enum vfio_device_mig_state recover_state;
> +int ret;
> +
> +/* We reach here with device state STOP only */
> +recover_state = VFIO_DEVICE_STATE_STOP;
> +ret = vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_STOP_COPY,
> +   recover_state);
> +if (ret) {
> +return ret;
> +}
> +
> +do {
> +ret = vfio_save_block(f, vbasedev->migration);
> +if (ret < 0) {
> +return ret;
> +}
> +} while (!ret);
> +
> +qemu_put_be64(f, VFIO_MIG_FLAG_END_OF_STATE);
> +ret = qemu_file_get_error(f);
> +if (ret) {
> +return ret;
> +}
> +
> +recover_state = VFIO_DEVICE_STATE_ERROR;
> +ret = vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_STOP,
> +   recover_state);
> +if (!ret) {
> +trace_vfio_save_complete_precopy(vbasedev->name);
> +}
> +
> +return ret;
> +}
> +
>  static int vfio_v1_save_complete_precopy(QEMUFile *f, void *op

Re: [RFC PATCH] tests/avocado: use new rootfs for orangepi test

2022-11-23 Thread Cédric Le Goater

On 11/23/22 19:13, Philippe Mathieu-Daudé wrote:

On 23/11/22 15:12, Alex Bennée wrote:

Thomas Huth  writes:

On 23/11/2022 12.15, Philippe Mathieu-Daudé wrote:

On 18/11/22 12:33, Alex Bennée wrote:

The old URL wasn't stable. I suspect the current URL will only be
stable for a few months so maybe we need another strategy for hosting
rootfs snapshots?

Signed-off-by: Alex Bennée 
---
   tests/avocado/boot_linux_console.py | 4 ++--
   1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tests/avocado/boot_linux_console.py
b/tests/avocado/boot_linux_console.py
index 4c9d551f47..5a2923c423 100644
--- a/tests/avocado/boot_linux_console.py
+++ b/tests/avocado/boot_linux_console.py
@@ -793,8 +793,8 @@ def test_arm_orangepi_sd(self):
   dtb_path =
'/usr/lib/linux-image-current-sunxi/sun8i-h3-orangepi-pc.dtb'
   dtb_path = self.extract_from_deb(deb_path, dtb_path)
   rootfs_url =
('http://storage.kernelci.org/images/rootfs/buildroot/'
-  'kci-2019.02/armel/base/rootfs.ext2.xz')
-    rootfs_hash = '692510cb625efda31640d1de0a8d60e26040f061'
+  'buildroot-baseline/20221116.0/armel/rootfs.ext2.xz')
+    rootfs_hash = 'fae32f337c7b87547b10f42599acf109da8b6d9a'

If Avocado doesn't find an artifact in its local cache, it will fetch it
from the URL.
The cache might be populated with artifacts previously downloaded, but
their URL is not valid anymore (my case for many tests).
We can also add artifacts manually, see [1].
I'd rather keep pre-existing tests if possible, to test older
(kernel / user-space) images. We don't need to run all the tests all
the time:
tests can be filtered by tags (see [2]).
My preference here is to refactor this test, adding the
"kci-2019.02"
and "baseline-20221116.0" releases. I can prepare the patch if you /
Thomas don't object.

IMHO we shouldn't keep tests in the upstream git repository where the
binaries are not available in public anymore. They won't get run by
new contributors anymore, and also could vanish from the disks of the
people who previously downloaded it, once they wipe their cache or
upgrade to a new installation, so the test code will sooner or later
be bitrotting. But if you want to keep the tests around on your hard
disk, you could also stick the test in a local branch on your hard
disk instead.

CI/Workstation splits aside I tend to agree with Thomas here that having
tests no one else can run will lead to an accretion of broken tests.

Following this idea, should we remove all boards for which no open
source & GPL software is available? I.e:

40p  IBM RS/6000 7020 (40p)

This machine can run debian :

  qemu-system-ppc -M 40p -cpu 604 -nic user -hda ./prep.qcow2 -cdrom 
./zImage.hdd -serial mon:stdio -nographic

  >> =

  >> OpenBIOS 1.1 [Mar 7 2022 23:07]
  >> Configuration device id QEMU version 1 machine id 0
  >> CPUs: 0
  >> Memory: 128M
  >> UUID: ----
  >> CPU type PowerPC,604
  milliseconds isn't unique.
  Welcome to OpenBIOS v1.1 built on Mar 7 2022 23:07
  Trying hd:,\\:tbxi...
  >> Not a bootable ELF image
  >> switching to new context:
  loaded at: 04000400 04015218
  relocated to:  0080 00814E18
  board data at: 07C9E870 07CA527C
  relocated to:  0080B130 00811B3C
  zimage at: 0400B400 0411DC98
  avail ram: 0040 0080

  Linux/PPC load: console=/dev/ttyS0,9600 console=tty0 ether=5,0x210,eth0 ether=11,0x300,eth1 ramdisk_size=8192 root=/dev/sda3

  Uncompressing Linuxdone.
  Now booting the kernel

  Debian GNU/Linux 3.0 6015 ttyS0

  6015 login:

Please keep it ! :)

and it also boots AIX 4.4/5.1 (with 2 small patches) but that's clearly
not open source. It is downloadable from the net though, like many macos
PPC images.

That said, we might have been putting too much in avocado and it takes
ages to run (when it does not hit some random python issue).

akita    Sharp SL-C1000 (Akita) PDA (PXA270)
midway   Calxeda Midway (ECX-2000)
terrier  Sharp SL-C3200 (Terrier) PDA (PXA270)
tosa Sharp SL-6000 (Tosa) PDA (PXA255)

Given the tests themselves are standalone couldn't the prospective test
hoarder keep their own personal repository to be run with the rest of the
in-tree code, something like:

   cd my/test/zoo/repo
   $(QEMU_BUILD)/tests/venv/bin/avocado run my_test_zoo.py

for convenience we could maybe support an env variable so the existing
test selection tags would work:

   set -x QEMU_AVOCADO_EXTRA_TESTS /my/test/zoo/repo
   ./tests/venv/bin/avocado list
   ...

   ...

?

Yes, this is what we use to test the Fuloong2E:

$ git grep RESCUE_YL_PATH tests/avocado/
tests/avocado/machine_mips_fuloong2e.py:21: 
@skipUnless(os.getenv('RESCUE_YL_PATH'), 'RESCUE_YL_PATH not available')
tests/avocado/machine_mips_fuloong2e.py:34:    kernel

Re: [PATCH v3 01/17] migration: Remove res_compatible parameter

2022-11-23 Thread Dr. David Alan Gilbert

* Avihai Horon (avih...@nvidia.com) wrote:
> 
> On 08/11/2022 19:52, Vladimir Sementsov-Ogievskiy wrote:
> > External email: Use caution opening links or attachments
> > 
> > 
> > On 11/3/22 19:16, Avihai Horon wrote:
> > > From: Juan Quintela 
> > > 
> > > It was only used for RAM, and in that case, it means that this amount
> > > of data was sent for memory.
> > 
> > Not clear for me, what means "this amount of data was sent for
> > memory"... That amount of data was not yet sent, actually.
> > 
> Yes, this should be changed to something like:
> 
> "It was only used for RAM, and in that case, it means that this amount
> of data still needs to be sent for memory, and can be sent in any phase
> of migration. The same functionality can be achieved without res_compatible,
> so just delete the field in all callers and change the definition of
> res_postcopy accordingly.".

Sorry, I recently sent a similar comment in reply to Juan's original
post.
If I understand correctly though, the dirty bitmap code relies on
'postcopy' here to be data only sent during postcopy.

Dave

> > > Just delete the field in all callers.
> > > 
> > > Signed-off-by: Juan Quintela 
> > > ---
> > >   hw/s390x/s390-stattrib.c   |  6 ++
> > >   hw/vfio/migration.c    | 10 --
> > >   hw/vfio/trace-events   |  2 +-
> > >   include/migration/register.h   | 20 ++--
> > >   migration/block-dirty-bitmap.c |  7 +++
> > >   migration/block.c  |  7 +++
> > >   migration/migration.c  |  9 -
> > >   migration/ram.c    |  8 +++-
> > >   migration/savevm.c | 14 +-
> > >   migration/savevm.h |  4 +---
> > >   migration/trace-events |  2 +-
> > >   11 files changed, 37 insertions(+), 52 deletions(-)
> > > 
> > 
> > [..]
> > 
> > > diff --git a/include/migration/register.h b/include/migration/register.h
> > > index c1dcff0f90..1950fee6a8 100644
> > > --- a/include/migration/register.h
> > > +++ b/include/migration/register.h
> > > @@ -48,18 +48,18 @@ typedef struct SaveVMHandlers {
> > >   int (*save_setup)(QEMUFile *f, void *opaque);
> > >   void (*save_live_pending)(QEMUFile *f, void *opaque,
> > >     uint64_t threshold_size,
> > > -  uint64_t *res_precopy_only,
> > > -  uint64_t *res_compatible,
> > > -  uint64_t *res_postcopy_only);
> > > +  uint64_t *rest_precopy,
> > > +  uint64_t *rest_postcopy);
> > >   /* Note for save_live_pending:
> > > - * - res_precopy_only is for data which must be migrated in
> > > precopy phase
> > > - * or in stopped state, in other words - before target vm start
> > > - * - res_compatible is for data which may be migrated in any phase
> > > - * - res_postcopy_only is for data which must be migrated in
> > > postcopy phase
> > > - * or in stopped state, in other words - after source vm stop
> > > + * - res_precopy is for data which must be migrated in precopy
> > > + * phase or in stopped state, in other words - before target
> > > + * vm start
> > > + * - res_postcopy is for data which must be migrated in postcopy
> > > + * phase or in stopped state, in other words - after source vm
> > > + * stop
> > >    *
> > > - * Sum of res_postcopy_only, res_compatible and
> > > res_postcopy_only is the
> > > - * whole amount of pending data.
> > > + * Sum of res_precopy and res_postcopy is the whole amount of
> > > + * pending data.
> > >    */
> > > 
> > > 
> > 
> > [..]
> > 
> > > diff --git a/migration/ram.c b/migration/ram.c
> > > index dc1de9ddbc..20167e1102 100644
> > > --- a/migration/ram.c
> > > +++ b/migration/ram.c
> > > @@ -3435,9 +3435,7 @@ static int ram_save_complete(QEMUFile *f, void
> > > *opaque)
> > >   }
> > > 
> > >   static void ram_save_pending(QEMUFile *f, void *opaque, uint64_t
> > > max_size,
> > > - uint64_t *res_precopy_only,
> > > - uint64_t *res_compatible,
> > > - uint64_t *res_postcopy_only)
> > > + uint64_t *res_precopy, uint64_t
> > > *res_postcopy)
> > >   {
> > >   RAMState **temp = opaque;
> > >   RAMState *rs = *temp;
> > > @@ -3457,9 +3455,9 @@ static void ram_save_pending(QEMUFile *f, void
> > > *opaque, uint64_t max_size,
> > > 
> > >   if (migrate_postcopy_ram()) {
> > >   /* We can do postcopy, and all the data is postcopiable */
> > > -    *res_compatible += remaining_size;
> > > +    *res_postcopy += remaining_size;
> > 
> > That's seems to be not quite correct.
> > 
> > res_postcopy is defined as "data which must be migrated in postcopy",
> > but that's not true here, as RAM can be migrated both in precopy and
> > postcopy.
> > 
> > Still

Re: [PATCH] gdbstub: move update guest debug to accel ops

2022-11-23 Thread Philippe Mathieu-Daudé


Hi,

On 23/11/22 13:17, Mads Ynddal wrote:

From: Mads Ynddal 

Continuing the refactor of a48e7d9e52 (gdbstub: move guest debug support
check to ops) by removing hardcoded kvm_enabled() from generic cpu.c
code, and replace it with a property of AccelOpsClass.

Signed-off-by: Mads Ynddal 
---
  accel/kvm/kvm-accel-ops.c  |  1 +
  cpu.c  | 10 +++---
  include/sysemu/accel-ops.h |  1 +
  3 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/accel/kvm/kvm-accel-ops.c b/accel/kvm/kvm-accel-ops.c
index fbf4fe3497..6ebf9a644f 100644
--- a/accel/kvm/kvm-accel-ops.c
+++ b/accel/kvm/kvm-accel-ops.c
@@ -99,6 +99,7 @@ static void kvm_accel_ops_class_init(ObjectClass *oc, void 
*data)
  ops->synchronize_pre_loadvm = kvm_cpu_synchronize_pre_loadvm;
  
  #ifdef KVM_CAP_SET_GUEST_DEBUG

+ops->update_guest_debug = kvm_update_guest_debug;
  ops->supports_guest_debug = kvm_supports_guest_debug;
  ops->insert_breakpoint = kvm_insert_breakpoint;
  ops->remove_breakpoint = kvm_remove_breakpoint;
diff --git a/cpu.c b/cpu.c
index 2a09b05205..ef433a79e3 100644
--- a/cpu.c
+++ b/cpu.c
@@ -31,8 +31,8 @@
  #include "hw/core/sysemu-cpu-ops.h"
  #include "exec/address-spaces.h"
  #endif
+#include "sysemu/cpus.h"
  #include "sysemu/tcg.h"
-#include "sysemu/kvm.h"
  #include "sysemu/replay.h"
  #include "exec/cpu-common.h"
  #include "exec/exec-all.h"
@@ -378,10 +378,14 @@ void cpu_breakpoint_remove_all(CPUState *cpu, int mask)
  void cpu_single_step(CPUState *cpu, int enabled)
  {
  if (cpu->singlestep_enabled != enabled) {
+const AccelOpsClass *ops = cpus_get_accel();
+
  cpu->singlestep_enabled = enabled;
-if (kvm_enabled()) {
-kvm_update_guest_debug(cpu, 0);
+
+if (ops->update_guest_debug) {
+ops->update_guest_debug(cpu, 0);


Isn't this '0' flag here accelerator-specific? ...


  }
+
  trace_breakpoint_singlestep(cpu->cpu_index, enabled);
  }
  }
diff --git a/include/sysemu/accel-ops.h b/include/sysemu/accel-ops.h
index 8cc7996def..0a47a2f00c 100644
--- a/include/sysemu/accel-ops.h
+++ b/include/sysemu/accel-ops.h
@@ -48,6 +48,7 @@ struct AccelOpsClass {
  
  /* gdbstub hooks */

  bool (*supports_guest_debug)(void);
+int (*update_guest_debug)(CPUState *cpu, unsigned long flags);


... if so the prototype should be:

   int (*update_guest_debug)(CPUState *cpu);

and the '0' value set within kvm-accel-ops.c handler implementation.


  int (*insert_breakpoint)(CPUState *cpu, int type, hwaddr addr, hwaddr 
len);
  int (*remove_breakpoint)(CPUState *cpu, int type, hwaddr addr, hwaddr 
len);
  void (*remove_all_breakpoints)(CPUState *cpu);


Regards,

Phil.

Re: [RFC PATCH] tests/avocado: use new rootfs for orangepi test

2022-11-23 Thread Philippe Mathieu-Daudé


On 23/11/22 15:12, Alex Bennée wrote:

Thomas Huth  writes:

On 23/11/2022 12.15, Philippe Mathieu-Daudé wrote:

On 18/11/22 12:33, Alex Bennée wrote:

The old URL wasn't stable. I suspect the current URL will only be
stable for a few months so maybe we need another strategy for hosting
rootfs snapshots?

Signed-off-by: Alex Bennée 
---
   tests/avocado/boot_linux_console.py | 4 ++--
   1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tests/avocado/boot_linux_console.py
b/tests/avocado/boot_linux_console.py
index 4c9d551f47..5a2923c423 100644
--- a/tests/avocado/boot_linux_console.py
+++ b/tests/avocado/boot_linux_console.py
@@ -793,8 +793,8 @@ def test_arm_orangepi_sd(self):
   dtb_path =
'/usr/lib/linux-image-current-sunxi/sun8i-h3-orangepi-pc.dtb'
   dtb_path = self.extract_from_deb(deb_path, dtb_path)
   rootfs_url =
('http://storage.kernelci.org/images/rootfs/buildroot/'
-  'kci-2019.02/armel/base/rootfs.ext2.xz')
-    rootfs_hash = '692510cb625efda31640d1de0a8d60e26040f061'
+  'buildroot-baseline/20221116.0/armel/rootfs.ext2.xz')
+    rootfs_hash = 'fae32f337c7b87547b10f42599acf109da8b6d9a'

If Avocado doesn't find an artifact in its local cache, it will fetch it
from the URL.
The cache might be populated with artifacts previously downloaded, but
their URL is not valid anymore (my case for many tests).
We can also add artifacts manually, see [1].
I'd rather keep pre-existing tests if possible, to test older
(kernel / user-space) images. We don't need to run all the tests all
the time:
tests can be filtered by tags (see [2]).
My preference here is to refactor this test, adding the
"kci-2019.02"
and "baseline-20221116.0" releases. I can prepare the patch if you /
Thomas don't object.


IMHO we shouldn't keep tests in the upstream git repository where the
binaries are not available in public anymore. They won't get run by
new contributors anymore, and also could vanish from the disks of the
people who previously downloaded it, once they wipe their cache or
upgrade to a new installation, so the test code will sooner or later
be bitrotting. But if you want to keep the tests around on your hard
disk, you could also stick the test in a local branch on your hard
disk instead.


CI/Workstation splits aside I tend to agree with Thomas here that having
tests no one else can run will lead to an accretion of broken tests.


Following this idea, should we remove all boards for which no open
source & GPL software is available? I.e:

40p  IBM RS/6000 7020 (40p)
akitaSharp SL-C1000 (Akita) PDA (PXA270)
midway   Calxeda Midway (ECX-2000)
terrier  Sharp SL-C3200 (Terrier) PDA (PXA270)
tosa Sharp SL-6000 (Tosa) PDA (PXA255)


Given the tests themselves are standalone couldn't the prospective test
hoarder keep their own personal repository to be run with the rest of the
in-tree code, something like:

   cd my/test/zoo/repo
   $(QEMU_BUILD)/tests/venv/bin/avocado run my_test_zoo.py

for convenience we could maybe support an env variable so the existing
test selection tags would work:

   set -x QEMU_AVOCADO_EXTRA_TESTS /my/test/zoo/repo
   ./tests/venv/bin/avocado list
   ...
   
   ...

?


Yes, this is what we use to test the Fuloong2E:

$ git grep RESCUE_YL_PATH tests/avocado/
tests/avocado/machine_mips_fuloong2e.py:21: 
@skipUnless(os.getenv('RESCUE_YL_PATH'), 'RESCUE_YL_PATH not available')
tests/avocado/machine_mips_fuloong2e.py:34:kernel_path = 
self.fetch_asset('file://' + os.getenv('RESCUE_YL_PATH'),


The firmware is not open source / GPL but if you have a Fuloong2E board
you can dump it from the flash, then use it to test QEMU from hard reset
up to userland. Otherwise you are forced to use the -kernel argument.


The other possibility is to upload the binaries to a new public
location in the web ... but for software that contains GPLed software,
you should then also make sure to provide the source code to comply
with the license.


This is the traditional reason we've lent so hard on external hosting
for binaries because the upstream doesn't want the hassle of maintaining
that sort of zoo of binaries. That said we have tests where binaries are
served from fileserver.linaro.org but its then only my problem to deal
with GPL requirements and not the upstream.


Maybe we are discussing 2 different topics. I am in possession of
old Solaris installation CDROMs and could boot some of them with
qemu-system-sparc. I want to automatize my testing, and wrote Avocado
scripts doing that. I suppose other QEMU users have similar CDROMs.
If I contribute my tests, they can run them. Isn't it in the interest
of the community to have such examples and tests available?

Regards,

Phil.

Re: [PATCH 00/15] Protect the block layer with a rwlock: part 3

2022-11-23 Thread Kevin Wolf

Am 23.11.2022 um 18:04 hat Paolo Bonzini geschrieben:
> On 11/23/22 14:45, Kevin Wolf wrote:
> > I think this means that if we clean up everything, in the end we'll have
> > coroutine_wrapper and coroutine_wrapper_bdrv (the fourth version not in
> > the above list, but that Paolo mentioned we may want to have).
> 
> Yes, I agree.
> 
> > The only thing I'm unsure about is whether coroutine_wrapper_bdrv is
> > descriptive enough as a name or whether it should be something more
> > explicit like coroutine_wrapper_bdrv_graph_locked.
> 
> That's already long and becomes longer if you add "mixed", but perhaps
> co_wrapper_{mixed_,}{bdrv_graph_rdlock,} would be okay?
> 
> In other words:
> 
> generated_co_wrapper_simple -> co_wrapper
> generated_co_wrapper-> co_wrapper_mixed
> generated_co_wrapper_bdrv   -> co_wrapper_mixed_bdrv_graph_rdlock
> 
> ?

Works for me. Maybe co_wrapper_mixed_bdrv_rdlock (without the "graph")
would be enough, too, if it is too long.

Kevin

Re: [PATCH v5 1/2] io: Add support for MSG_PEEK for socket channel

2022-11-23 Thread Peter Xu

On Wed, Nov 23, 2022 at 05:27:34PM +, manish.mishra wrote:
> MSG_PEEK reads from the peek of channel, The data is treated as
> unread and the next read shall still return this data. This
> support is currently added only for socket class. Extra parameter
> 'flags' is added to io_readv calls to pass extra read flags like
> MSG_PEEK.
> 
> Reviewed-by: Daniel P. Berrangé  Suggested-by: Daniel P. Berrangé  Signed-off-by: manish.mishra 

Reviewed-by: Peter Xu 

-- 
Peter Xu

Re: [PATCH v2] Drop more useless casts from void * to pointer

2022-11-23 Thread Markus Armbruster

BALATON Zoltan  writes:

> On Wed, 23 Nov 2022, Markus Armbruster wrote:
>> Daniel P. Berrangé  writes:
>>> On Wed, Nov 23, 2022 at 02:51:49PM +0100, BALATON Zoltan wrote:
 On Wed, 23 Nov 2022, Markus Armbruster wrote:
> Signed-off-by: Markus Armbruster 
> Reviewed-by: Laurent Vivier 
> ---
> v2:
> * PATCH 1+2 merged as commit 0a553c12c7 and 3f7febc937
> * PATCH 3 change to util/coroutine-ucontext.c dropped [Laurent]
>
> bsd-user/elfload.c  | 2 +-
> contrib/plugins/cache.c | 8 
> contrib/vhost-user-blk/vhost-user-blk.c | 2 +-
> hw/core/qdev-clock.c| 2 +-
> hw/hyperv/vmbus.c   | 2 +-
> hw/net/cadence_gem.c| 2 +-
> hw/net/virtio-net.c | 2 +-
> hw/nvme/ctrl.c  | 4 ++--
> hw/rdma/vmw/pvrdma_cmd.c| 9 +++--
> hw/rdma/vmw/pvrdma_qp_ops.c | 6 +++---
> hw/virtio/virtio-iommu.c| 3 +--
> linux-user/syscall.c| 2 +-
> target/i386/hax/hax-all.c   | 2 +-
> tests/tcg/aarch64/system/semiheap.c | 4 ++--
> util/vfio-helpers.c | 2 +-
> 15 files changed, 24 insertions(+), 28 deletions(-)
>
> diff --git a/bsd-user/elfload.c b/bsd-user/elfload.c
> index f8edb22f2a..fbcdc94b96 100644
> --- a/bsd-user/elfload.c
> +++ b/bsd-user/elfload.c
> @@ -156,7 +156,7 @@ static abi_ulong copy_elf_strings(int argc, char 
> **argv, void **page,
> --p; --tmp; --len;
> if (--offset < 0) {
> offset = p % TARGET_PAGE_SIZE;
> -pag = (char *)page[p / TARGET_PAGE_SIZE];
> +pag = page[p / TARGET_PAGE_SIZE];

 I think arithmetic on void pointer was undefined at least in the past so
 some compilers may warn for it but not sure if this is still the case for
 the compilers we care about. Apparently not if this now compiles but that
 explains why this cast was not useless.
>>
>> I don't think so :)
>>
>> @pag is char *.
>>
>> @page is void **.
>>
>> page[p / TARGET_PAGE_SIZE] is void *.  No need to cast to char * before
>> assigning to @pag.
>
> You are right. Although I'm not sure what page[] counts as because it adds an 
> offset to the pointer but [] is higher priority than (type) cast so it 
> does not matter and that cast is not needed here then. Maybe I should be more 
> attentive to details but I did not take the time to look at it more 
> carefully. I did not say we should keep the cast anyway (considering only gcc 
> and clang are targeted), I was just trying to understand why it might 
> have been there in the first place.

And that's perfectly okay!

Re: [PATCH v9 3/8] KVM: Add KVM_EXIT_MEMORY_FAULT exit

2022-11-23 Thread Sean Christopherson

On Tue, Nov 22, 2022, Chao Peng wrote:
> On Fri, Nov 18, 2022 at 03:59:12PM +, Sean Christopherson wrote:
> > On Fri, Nov 18, 2022, Alex Benn?e wrote:
> > > > We don't actually need a new bit, the opposite side of private is
> > > > shared, i.e. flags with KVM_MEMORY_EXIT_FLAG_PRIVATE cleared expresses
> > > > 'shared'.
> > > 
> > > If that is always true and we never expect a 3rd type of memory that is
> > > fine. But given we are leaving room for expansion having an explicit bit
> > > allows for that as well as making cases of forgetting to set the flags
> > > more obvious.
> > 
> > Hrm, I'm on the fence.
> > 
> > A dedicated flag isn't strictly needed, e.g. even if we end up with 3+ 
> > types in
> > this category, the baseline could always be "private".
> 
> The baseline for the current code is actually "shared".

Ah, right, the baseline needs to be "shared" so that legacy code doesn't end up
with impossible states.

> > I do like being explicit, and adding a PRIVATE flag costs KVM practically 
> > nothing
> > to implement and maintain, but evetually we'll up with flags that are 
> > paired with
> > an implicit state, e.g. see the many #PF error codes in x86.  In other 
> > words,
> > inevitably KVM will need to define the default/base state of the access, at 
> > which
> > point the base state for SHARED vs. PRIVATE is "undefined".  
> 
> Current memory conversion for confidential usage is bi-directional so we
> already need both private and shared states and if we use one bit for
> both "shared" and "private" then we will have to define the default
> state, e.g, currently the default state is "shared" when we define
> 
>   KVM_MEMORY_EXIT_FLAG_PRIVATE(1 << 0)

...

> > So I would say if we add an explicit READ flag, then we might as well add 
> > an explicit
> > PRIVATE flag too.  But if we omit PRIVATE, then we should omit READ too.
> 
> Since we assume the default state is shared, so we actually only need a
> PRIVATE flag, e.g. there is no SHARED flag and will ignore the RWX for now.

Yeah, I'm leading towards "shared" being the implied default state.  Ditto for
"read" if/when we need to communicate write/execute information  E.g. for VMs
that don't support guest private memory, the "shared" flag is in some ways
nonsensical.  Worst case scenario, e.g. if we end up with variations of 
"shared",
we'll need something like KVM_MEMORY_EXIT_FLAG_SHARED_RESTRICTIVE or whatever,
but the basic "shared" default will still work.

Re: [PATCH 3/4] hw/cxl: Multi-Region CXL Type-3 Devices (Volatile and Persistent)

2022-11-23 Thread Gregory Price

> > -  -object 
> > memory-backend-file,id=cxl-mem1,share=on,mem-path=/tmp/cxltest.raw,size=256M
> >  \
> > -  -object 
> > memory-backend-file,id=cxl-lsa1,share=on,mem-path=/tmp/lsa.raw,size=256M \
> > +  -object 
> > memory-backend-file,pmem=true,id=pmem0,share=on,mem-path=/tmp/cxltest.raw,size=256M
> >  \
> > +  -object 
> > memory-backend-file,pmem=true,id=cxl-lsa0,share=on,mem-path=/tmp/lsa.raw,size=256M
> >  \
> 
> Why make the pmem=true change in here? If we want to do that I think it 
> should be in a
> separate patch with explanation.
> 

this is mostly an observation that memory-backend's have a pmem option.
It was unclear to me that using this backend for a pmem region without
setting pmem=true was "correct", but i also am not sure it has a real
effect.  I'll drop this from the changeset.

> > +error_cleanup:
> > +int i;
> > +for (i = 0; i < cur_ent; i++) {
> > +g_free(*cdat_table[i]);
> 
> Until the steal pointer above, *cdata_table not set to anything.
> Possibly gfree(table[i])?
> 
> 

good catch

> Hmm. I wonder if this is simpler done as below. Both look fine
> to me though so up to you for next version.  Trade off between
> slightly ugly nested logic, and the readability always lost when
> a ternary operator puts in an appearance.
> 
> if (ct3d->hostvmem) {

this seems reasonable, pulled in

> 
> If we have both volatile and persistent and yet still only have our one HDM
> decoder, then I think a write into the persistent range will have the wrong 
> offset.
> 
> dpa_offset == address space offset when we only had one region. Not so much 
> any more.
> For persistent i think we'll need to subtract the size of the volatile region
> (possibly taking care with alignment - I need to check that).
>

I had originally done this, but it wasn't clear to me what was correct
here, I'll make the adjustment

> > diff --git a/tests/qtest/cxl-test.c b/tests/qtest/cxl-test.c
> > index 6baed747fa..a05ddc0c7b 100644
> > --- a/tests/qtest/cxl-test.c
> > +++ b/tests/qtest/cxl-test.c
> > @@ -34,29 +34,46 @@
> > -  "-object 
> > memory-backend-file,id=cxl-mem3,mem-path=%s,size=256M "\
> > -  "-object 
> > memory-backend-file,id=lsa3,mem-path=%s,size=256M "\
> > -  "-device 
> > cxl-type3,bus=rp3,memdev=cxl-mem3,lsa=lsa3,id=cxl-pmem3 "
> 
> If re-indenting makes sense (I'm really convinced it is worth the noise) then 
> do it
> in a precusor no-op patch so we can more easily see what is new here.)
> 

can do

Re: [RFC 6/7] migration: simplify migration_iteration_run()

2022-11-23 Thread Dr. David Alan Gilbert

* Juan Quintela (quint...@redhat.com) wrote:
> Signed-off-by: Juan Quintela 

Reviewed-by: Dr. David Alan Gilbert 

> ---
>  migration/migration.c | 24 
>  1 file changed, 12 insertions(+), 12 deletions(-)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index 97fefd579e..35e512887a 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -3747,23 +3747,23 @@ static MigIterateState 
> migration_iteration_run(MigrationState *s)
>  trace_migrate_pending_exact(pending_size, s->threshold_size, 
> pend_pre, pend_post);
>  }
>  
> -if (pending_size && pending_size >= s->threshold_size) {
> -/* Still a significant amount to transfer */
> -if (!in_postcopy && pend_pre <= s->threshold_size &&
> -qatomic_read(&s->start_postcopy)) {
> -if (postcopy_start(s)) {
> -error_report("%s: postcopy failed to start", __func__);
> -}
> -return MIG_ITERATE_SKIP;
> -}
> -/* Just another iteration step */
> -qemu_savevm_state_iterate(s->to_dst_file, in_postcopy);
> -} else {
> +if (pending_size < s->threshold_size) {
>  trace_migration_thread_low_pending(pending_size);
>  migration_completion(s);
>  return MIG_ITERATE_BREAK;
>  }
>  
> +/* Still a significant amount to transfer */
> +if (!in_postcopy && pend_pre <= s->threshold_size &&
> +qatomic_read(&s->start_postcopy)) {
> +if (postcopy_start(s)) {
> +error_report("%s: postcopy failed to start", __func__);
> +}
> +return MIG_ITERATE_SKIP;
> +}
> +
> +/* Just another iteration step */
> +qemu_savevm_state_iterate(s->to_dst_file, in_postcopy);
>  return MIG_ITERATE_RESUME;
>  }
>  
> -- 
> 2.37.2
> 
> 
-- 
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

[PATCH v5 0/2] check magic value for deciding the mapping of channels

2022-11-23 Thread manish.mishra

Current logic assumes that channel connections on the destination side are
always established in the same order as the source and the first one will
always be the main channel followed by the multifid or post-copy
preemption channel. This may not be always true, as even if a channel has a
connection established on the source side it can be in the pending state on
the destination side and a newer connection can be established first.
Basically causing out of order mapping of channels on the destination side.
Currently, all channels except post-copy preempt send a magic number, this
patch uses that magic number to decide the type of channel. This logic is
applicable only for precopy(multifd) live migration, as mentioned, the
post-copy preempt channel does not send any magic number. Also, tls live
migrations already does tls handshake before creating other channels, so
this issue is not possible with tls, hence this logic is avoided for tls
live migrations. This patch uses MSG_PEEK to check the magic number of
channels so that current data/control stream management remains
un-effected.

v2:
  TLS does not support MSG_PEEK, so V1 was broken for tls live
  migrations. For tls live migration, while initializing main channel
  tls handshake is done before we can create other channels, so this
  issue is not possible for tls live migrations. In V2 added a check
  to avoid checking magic number for tls live migration and fallback
  to older method to decide mapping of channels on destination side.

v3:
  1. Split change in two patches, io patch for read_peek routines,
 migration patch for migration related changes.
  2. Add flags to io_readv calls to get extra read flags like
 MSG_PEEK.
  3. Some other minor fixes.

v4:
  1. Removed common *all_eof routines for read peek and added one
 specific to live migration.
  2. Updated to use qemu_co_sleep_ns instead of qio_channel_yield.
  3. Some other minor fixes.

v5:
  1. Handle busy-wait in migration_channel_read_peek due partial reads.

manish.mishra (2):
  io: Add support for MSG_PEEK for socket channel
  migration: check magic value for deciding the mapping of channels

 chardev/char-socket.c   |  4 +--
 include/io/channel.h|  6 
 io/channel-buffer.c |  1 +
 io/channel-command.c|  1 +
 io/channel-file.c   |  1 +
 io/channel-null.c   |  1 +
 io/channel-socket.c | 17 ++-
 io/channel-tls.c|  1 +
 io/channel-websock.c|  1 +
 io/channel.c| 16 +++---
 migration/channel-block.c   |  1 +
 migration/channel.c | 45 +
 migration/channel.h |  5 
 migration/migration.c   | 45 +
 migration/multifd.c | 12 +++-
 migration/multifd.h |  2 +-
 migration/postcopy-ram.c|  5 +---
 migration/postcopy-ram.h|  2 +-
 scsi/qemu-pr-helper.c   |  2 +-
 tests/qtest/tpm-emu.c   |  2 +-
 tests/unit/test-io-channel-socket.c |  1 +
 util/vhost-user-server.c|  2 +-
 22 files changed, 137 insertions(+), 36 deletions(-)

-- 
2.22.3

[PATCH v5 1/2] io: Add support for MSG_PEEK for socket channel

2022-11-23 Thread manish.mishra

MSG_PEEK reads from the peek of channel, The data is treated as
unread and the next read shall still return this data. This
support is currently added only for socket class. Extra parameter
'flags' is added to io_readv calls to pass extra read flags like
MSG_PEEK.

Reviewed-by: Daniel P. Berrangé 
---
 chardev/char-socket.c   |  4 ++--
 include/io/channel.h|  6 ++
 io/channel-buffer.c |  1 +
 io/channel-command.c|  1 +
 io/channel-file.c   |  1 +
 io/channel-null.c   |  1 +
 io/channel-socket.c | 17 -
 io/channel-tls.c|  1 +
 io/channel-websock.c|  1 +
 io/channel.c| 16 
 migration/channel-block.c   |  1 +
 scsi/qemu-pr-helper.c   |  2 +-
 tests/qtest/tpm-emu.c   |  2 +-
 tests/unit/test-io-channel-socket.c |  1 +
 util/vhost-user-server.c|  2 +-
 15 files changed, 47 insertions(+), 10 deletions(-)

diff --git a/chardev/char-socket.c b/chardev/char-socket.c
index 879564aa8a..5afce9a464 100644
--- a/chardev/char-socket.c
+++ b/chardev/char-socket.c
@@ -283,11 +283,11 @@ static ssize_t tcp_chr_recv(Chardev *chr, char *buf, 
size_t len)
 if (qio_channel_has_feature(s->ioc, QIO_CHANNEL_FEATURE_FD_PASS)) {
 ret = qio_channel_readv_full(s->ioc, &iov, 1,
  &msgfds, &msgfds_num,
- NULL);
+ 0, NULL);
 } else {
 ret = qio_channel_readv_full(s->ioc, &iov, 1,
  NULL, NULL,
- NULL);
+ 0, NULL);
 }
 
 if (msgfds_num) {
diff --git a/include/io/channel.h b/include/io/channel.h
index c680ee7480..716235d496 100644
--- a/include/io/channel.h
+++ b/include/io/channel.h
@@ -34,6 +34,8 @@ OBJECT_DECLARE_TYPE(QIOChannel, QIOChannelClass,
 
 #define QIO_CHANNEL_WRITE_FLAG_ZERO_COPY 0x1
 
+#define QIO_CHANNEL_READ_FLAG_MSG_PEEK 0x1
+
 typedef enum QIOChannelFeature QIOChannelFeature;
 
 enum QIOChannelFeature {
@@ -41,6 +43,7 @@ enum QIOChannelFeature {
 QIO_CHANNEL_FEATURE_SHUTDOWN,
 QIO_CHANNEL_FEATURE_LISTEN,
 QIO_CHANNEL_FEATURE_WRITE_ZERO_COPY,
+QIO_CHANNEL_FEATURE_READ_MSG_PEEK,
 };
 
 
@@ -114,6 +117,7 @@ struct QIOChannelClass {
 size_t niov,
 int **fds,
 size_t *nfds,
+int flags,
 Error **errp);
 int (*io_close)(QIOChannel *ioc,
 Error **errp);
@@ -188,6 +192,7 @@ void qio_channel_set_name(QIOChannel *ioc,
  * @niov: the length of the @iov array
  * @fds: pointer to an array that will received file handles
  * @nfds: pointer filled with number of elements in @fds on return
+ * @flags: read flags (QIO_CHANNEL_READ_FLAG_*)
  * @errp: pointer to a NULL-initialized error object
  *
  * Read data from the IO channel, storing it in the
@@ -224,6 +229,7 @@ ssize_t qio_channel_readv_full(QIOChannel *ioc,
size_t niov,
int **fds,
size_t *nfds,
+   int flags,
Error **errp);
 
 
diff --git a/io/channel-buffer.c b/io/channel-buffer.c
index bf52011be2..8096180f85 100644
--- a/io/channel-buffer.c
+++ b/io/channel-buffer.c
@@ -54,6 +54,7 @@ static ssize_t qio_channel_buffer_readv(QIOChannel *ioc,
 size_t niov,
 int **fds,
 size_t *nfds,
+int flags,
 Error **errp)
 {
 QIOChannelBuffer *bioc = QIO_CHANNEL_BUFFER(ioc);
diff --git a/io/channel-command.c b/io/channel-command.c
index 74516252ba..e7edd091af 100644
--- a/io/channel-command.c
+++ b/io/channel-command.c
@@ -203,6 +203,7 @@ static ssize_t qio_channel_command_readv(QIOChannel *ioc,
  size_t niov,
  int **fds,
  size_t *nfds,
+ int flags,
  Error **errp)
 {
 QIOChannelCommand *cioc = QIO_CHANNEL_COMMAND(ioc);
diff --git a/io/channel-file.c b/io/channel-file.c
index b67687c2aa..d76663e6ae 100644
--- a/io/channel-file.c
+++ b/io/channel-file.c
@@ -86,6 +86,7 @@ static ssize_t qio_channel_file_readv(QIOChannel *ioc,
   size_t niov,
   int **fds,
   size_t *nfds,
+  int flags,
   Error **errp)
 {
 QIOChannelFil

[PATCH v5 2/2] migration: check magic value for deciding the mapping of channels

2022-11-23 Thread manish.mishra

Current logic assumes that channel connections on the destination side are
always established in the same order as the source and the first one will
always be the main channel followed by the multifid or post-copy
preemption channel. This may not be always true, as even if a channel has a
connection established on the source side it can be in the pending state on
the destination side and a newer connection can be established first.
Basically causing out of order mapping of channels on the destination side.
Currently, all channels except post-copy preempt send a magic number, this
patch uses that magic number to decide the type of channel. This logic is
applicable only for precopy(multifd) live migration, as mentioned, the
post-copy preempt channel does not send any magic number. Also, tls live
migrations already does tls handshake before creating other channels, so
this issue is not possible with tls, hence this logic is avoided for tls
live migrations. This patch uses read peek to check the magic number of
channels so that current data/control stream management remains
un-effected.

Reviewed-by: Peter Xu 
Reviewed-by: Daniel P. Berrangé 
---
 migration/channel.c  | 45 
 migration/channel.h  |  5 +
 migration/migration.c| 45 +---
 migration/multifd.c  | 12 ---
 migration/multifd.h  |  2 +-
 migration/postcopy-ram.c |  5 +
 migration/postcopy-ram.h |  2 +-
 7 files changed, 90 insertions(+), 26 deletions(-)

diff --git a/migration/channel.c b/migration/channel.c
index 1b0815039f..270b7acca2 100644
--- a/migration/channel.c
+++ b/migration/channel.c
@@ -92,3 +92,48 @@ void migration_channel_connect(MigrationState *s,
 migrate_fd_connect(s, error);
 error_free(error);
 }
+
+
+/**
+ * @migration_channel_read_peek - Read from the peek of migration channel,
+ *without actually removing it from channel buffer.
+ *
+ * @ioc: the channel object
+ * @buf: the memory region to read data into
+ * @buflen: the number of bytes to read in @buf
+ * @errp: pointer to a NULL-initialized error object
+ *
+ * Returns 0 if successful, returns -1 and sets @errp if fails.
+ */
+int migration_channel_read_peek(QIOChannel *ioc,
+const char *buf,
+const size_t buflen,
+Error **errp)
+{
+ssize_t len = 0;
+struct iovec iov = { .iov_base = (char *)buf, .iov_len = buflen };
+
+while (true) {
+len = qio_channel_readv_full(ioc, &iov, 1, NULL,
+ NULL, QIO_CHANNEL_READ_FLAG_MSG_PEEK, 
errp);
+
+if (len <= 0 && len != QIO_CHANNEL_ERR_BLOCK) {
+error_setg(errp,
+   "Failed to read peek of channel");
+return -1;
+}
+
+if (len == buflen) {
+break;
+}
+
+/* 1ms sleep. */
+if (qemu_in_coroutine()) {
+qemu_co_sleep_ns(QEMU_CLOCK_REALTIME, 100);
+} else {
+g_usleep(1000);
+}
+}
+
+return 0;
+}
diff --git a/migration/channel.h b/migration/channel.h
index 67a461c28a..5bdb8208a7 100644
--- a/migration/channel.h
+++ b/migration/channel.h
@@ -24,4 +24,9 @@ void migration_channel_connect(MigrationState *s,
QIOChannel *ioc,
const char *hostname,
Error *error_in);
+
+int migration_channel_read_peek(QIOChannel *ioc,
+const char *buf,
+const size_t buflen,
+Error **errp);
 #endif
diff --git a/migration/migration.c b/migration/migration.c
index f485eea5fb..8509f20a80 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -31,6 +31,7 @@
 #include "migration.h"
 #include "savevm.h"
 #include "qemu-file.h"
+#include "channel.h"
 #include "migration/vmstate.h"
 #include "block/block.h"
 #include "qapi/error.h"
@@ -733,31 +734,51 @@ void migration_ioc_process_incoming(QIOChannel *ioc, 
Error **errp)
 {
 MigrationIncomingState *mis = migration_incoming_get_current();
 Error *local_err = NULL;
-bool start_migration;
 QEMUFile *f;
+bool default_channel = true;
+uint32_t channel_magic = 0;
+int ret = 0;
 
-if (!mis->from_src_file) {
-/* The first connection (multifd may have multiple) */
+if (migrate_use_multifd() && !migrate_postcopy_ram() &&
+qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_READ_MSG_PEEK)) {
+/*
+ * With multiple channels, it is possible that we receive channels
+ * out of order on destination side, causing incorrect mapping of
+ * source channels on destination side. Check channel MAGIC to
+ * decide type of channel. Please note this is best effort, postcopy
+ * preempt channel does not send any magic number so avoid it for
+

Re: [PATCH v5 13/15] block-coroutine-wrapper.py: support also basic return types

2022-11-23 Thread Kevin Wolf

Am 23.11.2022 um 12:42 hat Emanuele Giuseppe Esposito geschrieben:
> Extend the regex to cover also return type, pointers included.
> This implies that the value returned by the function cannot be
> a simple "int" anymore, but the custom return type.
> Therefore remove poll_state->ret and instead use a per-function
> custom "ret" field.
> 
> Signed-off-by: Emanuele Giuseppe Esposito 

This looks unchanged compared to v4 where I already gave:

Reviewed-by: Kevin Wolf

Re: [PATCH v5 12/15] block-coroutine-wrapper.py: default to main loop aiocontext if function does not have a BlockDriverState parameter

2022-11-23 Thread Kevin Wolf

Am 23.11.2022 um 12:42 hat Emanuele Giuseppe Esposito geschrieben:
> Right now, we take the first parameter of the function to get the
> BlockDriverState to pass to bdrv_poll_co(), that internally calls
> functions that figure in which aiocontext the coroutine should run.
> 
> However, it is useless to pass a bs just to get its own AioContext,
> so instead pass it directly, and default to the main loop if no
> BlockDriverState is passed as parameter.
> 
> Signed-off-by: Emanuele Giuseppe Esposito 

Patch 10 and 11 have the same subject line. Did you intend them to be
squashed together?

> diff --git a/scripts/block-coroutine-wrapper.py 
> b/scripts/block-coroutine-wrapper.py
> index 7e8f2da84b..1d552cb734 100644
> --- a/scripts/block-coroutine-wrapper.py
> +++ b/scripts/block-coroutine-wrapper.py
> @@ -78,14 +78,14 @@ def __init__(self, return_type: str, name: str, args: str,
>  
>  t = self.args[0].type
>  if t == 'BlockDriverState *':
> -bs = 'bs'
> +ctx = 'bdrv_get_aio_context(bs)'
>  elif t == 'BdrvChild *':
> -bs = 'child->bs'
> +ctx = 'bdrv_get_aio_context(child->bs)'
>  elif t == 'BlockBackend *':
> -bs = 'blk_bs(blk)'
> +ctx = 'bdrv_get_aio_context(blk_bs(blk))'

This should use blk_get_aio_context(). If a BlockBackend has no root
attached, i.e. blk_bs(blk) == NULL, it can still be in a non-mainloop
AioContext.

Kevin

Re: [PATCH v5 10/15] block-coroutine-wrapper.py: introduce generated_co_wrapper_simple

2022-11-23 Thread Kevin Wolf

Am 23.11.2022 um 12:42 hat Emanuele Giuseppe Esposito geschrieben:
> This new annotation creates just a function wrapper that creates
> a new coroutine. It assumes the caller is not a coroutine.
> 
> This is much better as g_c_w, because it is clear if the caller
> is a coroutine or not, and provides the advantage of automating
> the code creation. In the future all g_c_w functions will be
> substituted on g_c_w_simple.
> 
> Signed-off-by: Emanuele Giuseppe Esposito 

Reviewed-by: Kevin Wolf

Re: [PATCH 00/15] Protect the block layer with a rwlock: part 3

2022-11-23 Thread Paolo Bonzini


On 11/23/22 14:45, Kevin Wolf wrote:

I think this means that if we clean up everything, in the end we'll have
coroutine_wrapper and coroutine_wrapper_bdrv (the fourth version not in
the above list, but that Paolo mentioned we may want to have).


Yes, I agree.


The only thing I'm unsure about is whether coroutine_wrapper_bdrv is
descriptive enough as a name or whether it should be something more
explicit like coroutine_wrapper_bdrv_graph_locked.


That's already long and becomes longer if you add "mixed", but perhaps 
co_wrapper_{mixed_,}{bdrv_graph_rdlock,} would be okay?


In other words:

generated_co_wrapper_simple -> co_wrapper
generated_co_wrapper-> co_wrapper_mixed
generated_co_wrapper_bdrv   -> co_wrapper_mixed_bdrv_graph_rdlock

?

Paolo

Re: [PATCH v5 09/15] block: bdrv_create_file is a coroutine_fn

2022-11-23 Thread Kevin Wolf

Am 23.11.2022 um 12:42 hat Emanuele Giuseppe Esposito geschrieben:
> It is always called in coroutine_fn callbacks, therefore
> it can directly call bdrv_co_create().
> 
> Rename it to bdrv_co_create_file too.
> 
> Signed-off-by: Emanuele Giuseppe Esposito 

Reviewed-by: Kevin Wolf

Re: [PATCH v5 08/15] block: distinguish between bdrv_create running in coroutine and not

2022-11-23 Thread Kevin Wolf

Am 23.11.2022 um 12:42 hat Emanuele Giuseppe Esposito geschrieben:
> Call two different functions depending on whether bdrv_create
> is in coroutine or not, following the same pattern as
> generated_co_wrapper functions.
> 
> This allows to also call the coroutine function directly,
> without using CreateCo or relying in bdrv_create().
> 
> Signed-off-by: Emanuele Giuseppe Esposito 

Reviewed-by: Kevin Wolf

Re: [PATCH v5 07/15] block: introduce QEMU_IN_COROUTINE macro

2022-11-23 Thread Kevin Wolf

Am 23.11.2022 um 12:42 hat Emanuele Giuseppe Esposito geschrieben:
> This macro will be used to mark all coroutine_fn functions.
> Right now, it will be used for the newly introduced coroutine_fn, since
> we know the callers.
> 
> As a TODO, in the future we might want to add this macro to all
> corotuine_fn functions, to be sure that they are only called in

s/corotuine_fn/coroutine_fn/

> coroutines context.
> 
> Signed-off-by: Emanuele Giuseppe Esposito 

I already asked about other opinions on this in patch 1.

These assertions are runtime checks and I don't feel they are the right
tool to verify coroutine_fn consistency. Asserting in tricky places
makes sense to me, especially as long as we can't rely on static
analysis, but adding it everywhere feels over the top to me.

Kevin

Re: [PATCH v5 06/15] block: avoid duplicating filename string in bdrv_create

2022-11-23 Thread Kevin Wolf

Am 23.11.2022 um 12:42 hat Emanuele Giuseppe Esposito geschrieben:
> We know that the string will stay around until the function
> returns, and the parameter of drv->bdrv_co_create_opts is const char*,
> so it must not be modified either.
> 
> Suggested-by: Kevin Wolf 
> Signed-off-by: Emanuele Giuseppe Esposito 

Reviewed-by: Kevin Wolf

Re: [PATCH v5 04/15] block-backend: replace bdrv__above with blk__above

2022-11-23 Thread Kevin Wolf

Am 23.11.2022 um 12:42 hat Emanuele Giuseppe Esposito geschrieben:
> Avoid mixing bdrv_* functions with blk_*, so create blk_* counterparts
> for bdrv_block_status_above and bdrv_is_allocated_above.
> 
> Note that since bdrv_block_status_above only calls the g_c_w function
> bdrv_common_block_status_above and is marked as coroutine_fn, call
> directly bdrv_co_common_block_status_above() to avoid using a g_c_w.

This sentence is a bit confusing, because it talks about the blk_* side
of things, but doesn't mention it at all.

The same argument is true for is_allocated.

> Signed-off-by: Emanuele Giuseppe Esposito 
> Reviewed-by: Paolo Bonzini 
> ---
>  block/block-backend.c | 21 
>  block/commit.c|  4 ++--
>  include/sysemu/block-backend-io.h |  9 +
>  nbd/server.c  | 32 +++
>  4 files changed, 48 insertions(+), 18 deletions(-)
> 
> diff --git a/block/block-backend.c b/block/block-backend.c
> index 742efa7955..03bed68e4f 100644
> --- a/block/block-backend.c
> +++ b/block/block-backend.c
> @@ -1424,6 +1424,27 @@ int coroutine_fn blk_co_pwritev(BlockBackend *blk, 
> int64_t offset,
>  return blk_co_pwritev_part(blk, offset, bytes, qiov, 0, flags);
>  }
>  
> +int coroutine_fn blk_co_block_status_above(BlockBackend *blk,
> +   BlockDriverState *base,
> +   int64_t offset, int64_t bytes,
> +   int64_t *pnum, int64_t *map,
> +   BlockDriverState **file)
> +{
> +IO_CODE();
> +return bdrv_co_block_status_above(blk_bs(blk), base, offset, bytes, pnum,
> +  map, file);
> +}
> +
> +int coroutine_fn blk_is_allocated_above(BlockBackend *blk,

Any reason why you renamed only blk_co_block_status_above(), but not
this one into blk_co_is_allocated_above()?

> +BlockDriverState *base,
> +bool include_base, int64_t offset,
> +int64_t bytes, int64_t *pnum)
> +{
> +IO_CODE();
> +return bdrv_co_is_allocated_above(blk_bs(blk), base, include_base, 
> offset,
> +  bytes, pnum);
> +}

Kevin

Re: [RFC 5/7] migration: Remove unused threshold_size parameter

2022-11-23 Thread Dr. David Alan Gilbert

* Juan Quintela (quint...@redhat.com) wrote:
> Until previous commit, save_live_pending() was used for ram.  Now with
> the split into state_pending_estimate() and state_pending_exact() it
> is not needed anymore, so remove them.
> 
> Signed-off-by: Juan Quintela 
> ---
>  include/migration/register.h   |  7 +++
>  migration/savevm.h |  6 ++
>  hw/vfio/migration.c|  6 +++---
>  migration/block-dirty-bitmap.c |  3 +--
>  migration/block.c  |  3 +--
>  migration/migration.c  |  4 ++--
>  migration/ram.c|  6 ++
>  migration/savevm.c | 12 
>  8 files changed, 18 insertions(+), 29 deletions(-)
> 
> diff --git a/include/migration/register.h b/include/migration/register.h
> index 313b8e1c3b..5f08204fb2 100644
> --- a/include/migration/register.h
> +++ b/include/migration/register.h
> @@ -58,11 +58,10 @@ typedef struct SaveVMHandlers {
>   * pending data.
>   */
>  /* This calculate the exact remaining data to transfer */
> -void (*state_pending_exact)(void *opaque,  uint64_t threshold_size,
> -uint64_t *rest_precopy, uint64_t 
> *rest_postcopy);
> +void (*state_pending_exact)(void *opaque, uint64_t *rest_precopy,
> +uint64_t *rest_postcopy);
>  /* This estimates the remaining data to transfer */
> -void (*state_pending_estimate)(void *opaque,  uint64_t threshold_size,
> -   uint64_t *rest_precopy,
> +void (*state_pending_estimate)(void *opaque, uint64_t *rest_precopy,
> uint64_t *rest_postcopy);
>  
>  LoadStateHandler *load_state;
> diff --git a/migration/savevm.h b/migration/savevm.h
> index 613f85e717..24f2d2a28b 100644
> --- a/migration/savevm.h
> +++ b/migration/savevm.h
> @@ -40,11 +40,9 @@ void qemu_savevm_state_cleanup(void);
>  void qemu_savevm_state_complete_postcopy(QEMUFile *f);
>  int qemu_savevm_state_complete_precopy(QEMUFile *f, bool iterable_only,
> bool inactivate_disks);
> -void qemu_savevm_state_pending_exact(uint64_t max_size,
> - uint64_t *res_precopy,
> +void qemu_savevm_state_pending_exact(uint64_t *res_precopy,
>   uint64_t *res_postcopy);
> -void qemu_savevm_state_pending_estimate(uint64_t max_size,
> -uint64_t *res_precopy,
> +void qemu_savevm_state_pending_estimate(uint64_t *res_precopy,
>  uint64_t *res_postcopy);
>  void qemu_savevm_send_ping(QEMUFile *f, uint32_t value);
>  void qemu_savevm_send_open_return_path(QEMUFile *f);
> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
> index 680cf4df6e..d9598ce070 100644
> --- a/hw/vfio/migration.c
> +++ b/hw/vfio/migration.c
> @@ -456,8 +456,8 @@ static void vfio_save_cleanup(void *opaque)
>  trace_vfio_save_cleanup(vbasedev->name);
>  }
>  
> -static void vfio_state_pending(void *opaque,  uint64_t threshold_size,
> -   uint64_t *res_precopy, uint64_t *res_postcopy)
> +static void vfio_state_pending(void *opaque, uint64_t *res_precopy,
> +   uint64_t *res_postcopy)
>  {
>  VFIODevice *vbasedev = opaque;
>  VFIOMigration *migration = vbasedev->migration;
> @@ -511,7 +511,7 @@ static int vfio_save_iterate(QEMUFile *f, void *opaque)
>  }
>  
>  /*
> - * Reset pending_bytes as .save_live_pending is not called during savevm 
> or
> + * Reset pending_bytes as .state_pending_* is not called during savevm or

That comment change feels like it lives in a different patch, other than
that:


Reviewed-by: Dr. David Alan Gilbert 

>   * snapshot case, in such case vfio_update_pending() at the start of this
>   * function updates pending_bytes.
>   */
> diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
> index 5b24007650..8a11577252 100644
> --- a/migration/block-dirty-bitmap.c
> +++ b/migration/block-dirty-bitmap.c
> @@ -761,8 +761,7 @@ static int dirty_bitmap_save_complete(QEMUFile *f, void 
> *opaque)
>  return 0;
>  }
>  
> -static void dirty_bitmap_state_pending(void *opaque, uint64_t max_size,
> -   uint64_t *res_precopy,
> +static void dirty_bitmap_state_pending(void *opaque, uint64_t *res_precopy,
> uint64_t *res_postcopy)
>  {
>  DBMSaveState *s = &((DBMState *)opaque)->save;
> diff --git a/migration/block.c b/migration/block.c
> index 8e6ad1c468..4f1f7c0f8d 100644
> --- a/migration/block.c
> +++ b/migration/block.c
> @@ -862,8 +862,7 @@ static int block_save_complete(QEMUFile *f, void *opaque)
>  return 0;
>  }
>  
> -static void block_state_pending(void *opaque, uint64_t max_size,
> -uint64_t *res_precopy,
> +static void block_state_pending(void *opaq

Re: [PATCH v5 02/15] block-copy: add missing coroutine_fn annotations

2022-11-23 Thread Kevin Wolf

Am 23.11.2022 um 12:42 hat Emanuele Giuseppe Esposito geschrieben:
> These functions end up calling bdrv_common_block_status_above(), a
> generated_co_wrapper function.
> In addition, they also happen to be always called in coroutine context,
> meaning all callers are coroutine_fn.
> This means that the g_c_w function will enter the qemu_in_coroutine()
> case and eventually suspend (or in other words call qemu_coroutine_yield()).
> Therefore we need to mark such functions coroutine_fn too.
> 
> Signed-off-by: Emanuele Giuseppe Esposito 
> Reviewed-by: Paolo Bonzini 

Reviewed-by: Kevin Wolf

Re: [PATCH v4 2/2] migration: check magic value for deciding the mapping of channels

2022-11-23 Thread manish.mishra




On 23/11/22 9:57 pm, Peter Xu wrote:

On Wed, Nov 23, 2022 at 09:28:14PM +0530, manish.mishra wrote:

On 23/11/22 9:22 pm, Peter Xu wrote:

On Wed, Nov 23, 2022 at 03:05:27PM +, manish.mishra wrote:

+int migration_channel_read_peek(QIOChannel *ioc,
+const char *buf,
+const size_t buflen,
+Error **errp)
+{
+   ssize_t len = 0;
+   struct iovec iov = { .iov_base = (char *)buf, .iov_len = buflen };
+
+   while (len < buflen) {
+   len = qio_channel_readv_full(ioc, &iov, 1, NULL,
+NULL, QIO_CHANNEL_READ_FLAG_MSG_PEEK, 
errp);
+
+   if (len == QIO_CHANNEL_ERR_BLOCK) {

This needs to take care of partial len too?


sorry Peter, I did not quite understand it. Can you please give some more 
details.

As Daniel pointed out, I think if we peek partially it'll go into a loop.
Since we shouldn't read 0 anyway here, maybe you can directly write it as:

while (true) {
   len = read();
   if (len <= 0 && len != QIO_CHANNEL_ERR_BLOCK) {
   error_setg(...);
   return -1;
   }
   if (len == buflen) {
   break;
   }
   /* For either partial peek or QIO_CHANNEL_ERR_BLOCK, retry with timeout */
   sleep();
}



Yes, got it, sorry, will update it.

Thanks

Manish Mishra


+if (qemu_in_coroutine()) {
+/* 1ms sleep. */
+qemu_co_sleep_ns(QEMU_CLOCK_REALTIME, 100);
+} else {
+qio_channel_wait(ioc, G_IO_IN);
+}
+continue;
+   }
+   if (len == 0) {
+   error_setg(errp,
+  "Unexpected end-of-file on channel");
+   return -1;
+   }
+   if (len < 0) {
+   return -1;
+   }
+   }
+
+   return 0;
+}

Thanks

Manish Mishra

Re: [PATCH v2] Drop more useless casts from void * to pointer

2022-11-23 Thread BALATON Zoltan


On Wed, 23 Nov 2022, Markus Armbruster wrote:

Daniel P. Berrangé  writes:

On Wed, Nov 23, 2022 at 02:51:49PM +0100, BALATON Zoltan wrote:

On Wed, 23 Nov 2022, Markus Armbruster wrote:

Signed-off-by: Markus Armbruster 
Reviewed-by: Laurent Vivier 
---
v2:
* PATCH 1+2 merged as commit 0a553c12c7 and 3f7febc937
* PATCH 3 change to util/coroutine-ucontext.c dropped [Laurent]

bsd-user/elfload.c  | 2 +-
contrib/plugins/cache.c | 8 
contrib/vhost-user-blk/vhost-user-blk.c | 2 +-
hw/core/qdev-clock.c| 2 +-
hw/hyperv/vmbus.c   | 2 +-
hw/net/cadence_gem.c| 2 +-
hw/net/virtio-net.c | 2 +-
hw/nvme/ctrl.c  | 4 ++--
hw/rdma/vmw/pvrdma_cmd.c| 9 +++--
hw/rdma/vmw/pvrdma_qp_ops.c | 6 +++---
hw/virtio/virtio-iommu.c| 3 +--
linux-user/syscall.c| 2 +-
target/i386/hax/hax-all.c   | 2 +-
tests/tcg/aarch64/system/semiheap.c | 4 ++--
util/vfio-helpers.c | 2 +-
15 files changed, 24 insertions(+), 28 deletions(-)

diff --git a/bsd-user/elfload.c b/bsd-user/elfload.c
index f8edb22f2a..fbcdc94b96 100644
--- a/bsd-user/elfload.c
+++ b/bsd-user/elfload.c
@@ -156,7 +156,7 @@ static abi_ulong copy_elf_strings(int argc, char **argv, 
void **page,
--p; --tmp; --len;
if (--offset < 0) {
offset = p % TARGET_PAGE_SIZE;
-pag = (char *)page[p / TARGET_PAGE_SIZE];
+pag = page[p / TARGET_PAGE_SIZE];


I think arithmetic on void pointer was undefined at least in the past so
some compilers may warn for it but not sure if this is still the case for
the compilers we care about. Apparently not if this now compiles but that
explains why this cast was not useless.


I don't think so :)

@pag is char *.

@page is void **.

page[p / TARGET_PAGE_SIZE] is void *.  No need to cast to char * before
assigning to @pag.


You are right. Although I'm not sure what page[] counts as because it adds 
an offset to the pointer but [] is higher priority than (type) cast so it 
does not matter and that cast is not needed here then. Maybe I should be 
more attentive to details but I did not take the time to look at it more 
carefully. I did not say we should keep the cast anyway (considering only 
gcc and clang are targeted), I was just trying to understand why it might 
have been there in the first place.


Regards,
BALATON Zoltan


No pointer arithmetic so far.  There's some further down: pag + offset.
@pag is char * before and after my patch.


Found some more info on this here:

https://stackoverflow.com/questions/3523145/pointer-arithmetic-for-void-pointer-in-c


QEMU explicitly only targets GCC + Clang, so portability to other
compilers is not required.


Correct.  We do arithmentic with void * in many places already.

If we cared for portability to other compilers, we'd enable

'-Wpointer-arith'
Warn about anything that depends on the "size of" a function type
or of 'void'.  GNU C assigns these types a size of 1, for
convenience in calculations with 'void *' pointers and pointers to
functions.  In C++, warn also when an arithmetic operation involves
'NULL'.  This warning is also enabled by '-Wpedantic'.

But we don't.

Re: [PATCH v5 01/15] block-io: introduce coroutine_fn duplicates for bdrv_common_block_status_above callers

2022-11-23 Thread Kevin Wolf

Am 23.11.2022 um 12:42 hat Emanuele Giuseppe Esposito geschrieben:
> bdrv_common_block_status_above() is a g_c_w, and it is being called by
> many "wrapper" functions like bdrv_is_allocated(),
> bdrv_is_allocated_above() and bdrv_block_status_above().
> 
> Because we want to eventually split the coroutine from non-coroutine
> case in g_c_w, create duplicate wrappers that take care of directly
> calling the same coroutine functions called in the g_c_w.
> 
> Signed-off-by: Emanuele Giuseppe Esposito 
> ---
>  block/io.c   | 64 ++--
>  include/block/block-io.h | 15 ++
>  2 files changed, 76 insertions(+), 3 deletions(-)
> 
> diff --git a/block/io.c b/block/io.c
> index 38e57d1f67..1bc05c8282 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -2533,6 +2533,19 @@ bdrv_co_common_block_status_above(BlockDriverState *bs,
>  return ret;
>  }
>  
> +int coroutine_fn bdrv_co_block_status_above(BlockDriverState *bs,
> +BlockDriverState *base,
> +int64_t offset, int64_t bytes,
> +int64_t *pnum, int64_t *map,
> +BlockDriverState **file)
> +{
> +IO_CODE();
> +/* If QEMU_IN_COROUTINE() fails, use bdrv_block_status_above() */
> +QEMU_IN_COROUTINE();

This is an obvious patch order problem: The macro doesn't even exist
yet.

As I said, personally, I don't feel like putting QEMU_IN_COROUTINE()
assertions into every coroutine_fn is a useful thing to do. Static
analysis (maybe even with something vrc based in 'make check'? Paolo,
would this be realistic?) seems much preferable. But I'd like to hear
other opinions on this, too.

I feel the same way about the comment. Yes, of course, if you're not in
a coroutine, don't call the _co variant of a function, but the one
without it. But that goes without saying, doesn't it?

> +return bdrv_co_common_block_status_above(bs, base, false, true, offset,
> + bytes, pnum, map, file, NULL);
> +}

Apart from these considerations the patch looks right.

Reviewed-by: Kevin Wolf

Re: [PATCH v4 2/2] migration: check magic value for deciding the mapping of channels

2022-11-23 Thread Peter Xu

On Wed, Nov 23, 2022 at 09:28:14PM +0530, manish.mishra wrote:
> 
> On 23/11/22 9:22 pm, Peter Xu wrote:
> > On Wed, Nov 23, 2022 at 03:05:27PM +, manish.mishra wrote:
> > > +int migration_channel_read_peek(QIOChannel *ioc,
> > > +const char *buf,
> > > +const size_t buflen,
> > > +Error **errp)
> > > +{
> > > +   ssize_t len = 0;
> > > +   struct iovec iov = { .iov_base = (char *)buf, .iov_len = buflen };
> > > +
> > > +   while (len < buflen) {
> > > +   len = qio_channel_readv_full(ioc, &iov, 1, NULL,
> > > +NULL, 
> > > QIO_CHANNEL_READ_FLAG_MSG_PEEK, errp);
> > > +
> > > +   if (len == QIO_CHANNEL_ERR_BLOCK) {
> > This needs to take care of partial len too?
> 
> 
> sorry Peter, I did not quite understand it. Can you please give some more 
> details.

As Daniel pointed out, I think if we peek partially it'll go into a loop.
Since we shouldn't read 0 anyway here, maybe you can directly write it as:

while (true) {
  len = read();
  if (len <= 0 && len != QIO_CHANNEL_ERR_BLOCK) {
  error_setg(...);
  return -1;
  }
  if (len == buflen) {
  break;
  }
  /* For either partial peek or QIO_CHANNEL_ERR_BLOCK, retry with timeout */
  sleep();
}

> 
> > 
> > > +if (qemu_in_coroutine()) {
> > > +/* 1ms sleep. */
> > > +qemu_co_sleep_ns(QEMU_CLOCK_REALTIME, 100);
> > > +} else {
> > > +qio_channel_wait(ioc, G_IO_IN);
> > > +}
> > > +continue;
> > > +   }
> > > +   if (len == 0) {
> > > +   error_setg(errp,
> > > +  "Unexpected end-of-file on channel");
> > > +   return -1;
> > > +   }
> > > +   if (len < 0) {
> > > +   return -1;
> > > +   }
> > > +   }
> > > +
> > > +   return 0;
> > > +}
> 
> Thanks
> 
> Manish Mishra
> 

-- 
Peter Xu

Re: [qemu-web PATCH] Add a blog post about zoned storage emulation

2022-11-23 Thread Stefan Hajnoczi

Cool, thanks for posting an update.

An short introduction would be nice before diving into the details of ZBDs:
"This summer I worked on adding Zoned Block Device (ZBD) support to
virtio-blk as part of the https://www.outreachy.org/";>Outreachy internship program.
QEMU hasn't directly supported ZBDs before so this article explains
how they work and how QEMU needed to be extended."

Please include a sample command-line so readers have an idea of how to
use ZBDs. Something like:

"Once the QEMU and Linux patches have been merged it will be possible
to expose a virtio-blk ZBD to the guest like this:
--blockdev zoned_host_device,node-name=zbd0,filename=path/to/zbd,cache.direct=on
\
--device virtio-blk-pci,drive=zbd0"

If you want to include URLs to the patch series, lore.kernel.org is
good for linking:
- https://lore.kernel.org/all/20221027154504.20684-1-faithilike...@gmail.com/
- https://lore.kernel.org/all/20221030093242.208839-1-faithilike...@gmail.com/
- https://lore.kernel.org/all/20221110053952.3378990-1-dmitry.fomic...@wdc.com/

Stefan

Re: [PATCH 3/3] tcg: Move ffi_cif pointer into TCGHelperInfo

2022-11-23 Thread Philippe Mathieu-Daudé


On 22/11/22 19:08, Philippe Mathieu-Daudé wrote:

From: Richard Henderson 

Instead of requiring a separate hash table lookup,
put a pointer to the CIF into TCGHelperInfo.

Signed-off-by: Richard Henderson 
Message-Id: <2022074101.2069454-27-richard.hender...@linaro.org>
[PMD: Split from bigger patch]
Signed-off-by: Philippe Mathieu-Daudé 
---
  tcg/tcg-internal.h |  7 +++
  tcg/tcg.c  | 26 ++
  2 files changed, 21 insertions(+), 12 deletions(-)

diff --git a/tcg/tcg-internal.h b/tcg/tcg-internal.h
index c7e87e193d..6e50aeba3a 100644
--- a/tcg/tcg-internal.h
+++ b/tcg/tcg-internal.h
@@ -25,6 +25,10 @@
  #ifndef TCG_INTERNAL_H
  #define TCG_INTERNAL_H
  
+#ifdef CONFIG_TCG_INTERPRETER

+#include 
+#endif
+
  #define TCG_HIGHWATER 1024
  
  /*

@@ -57,6 +61,9 @@ typedef struct TCGCallArgumentLoc {
  typedef struct TCGHelperInfo {
  void *func;
  const char *name;
+#ifdef CONFIG_TCG_INTERPRETER
+ffi_cif *cif;
+#endif
  unsigned typemask   : 32;
  unsigned flags  : 8;
  unsigned nr_in  : 8;
diff --git a/tcg/tcg.c b/tcg/tcg.c
index 9b24b4d863..d6a3036412 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c


I forgot this hunk:

-- >8 --
@@ -62,10 +62,6 @@
 #include "tcg/tcg-ldst.h"
 #include "tcg-internal.h"

-#ifdef CONFIG_TCG_INTERPRETER
-#include 
-#endif
-

---


@@ -552,8 +552,6 @@ static TCGHelperInfo all_helpers[] = {
  static GHashTable *helper_table;
  
  #ifdef CONFIG_TCG_INTERPRETER

-static GHashTable *ffi_table;
-
  static ffi_type *typecode_to_ffi(int argmask)
  {
  switch (argmask) {
@@ -576,9 +574,11 @@ static ffi_type *typecode_to_ffi(int argmask)
  static void init_ffi_layouts(void)
  {
  /* g_direct_hash/equal for direct comparisons on uint32_t.  */
-ffi_table = g_hash_table_new(NULL, NULL);
+GHashTable *ffi_table = g_hash_table_new(NULL, NULL);
+
  for (int i = 0; i < ARRAY_SIZE(all_helpers); ++i) {
-uint32_t typemask = all_helpers[i].typemask;
+TCGHelperInfo *info = &all_helpers[i];
+unsigned typemask = info->typemask;
  gpointer hash = (gpointer)(uintptr_t)typemask;
  struct {
  ffi_cif cif;
@@ -586,8 +586,11 @@ static void init_ffi_layouts(void)
  } *ca;
  ffi_status status;
  int nargs;
+ffi_cif *cif;
  
-if (g_hash_table_lookup(ffi_table, hash)) {

+cif = g_hash_table_lookup(ffi_table, hash);
+if (cif) {
+info->cif = cif;
  continue;
  }
  
@@ -611,8 +614,12 @@ static void init_ffi_layouts(void)

ca->cif.rtype, ca->cif.arg_types);
  assert(status == FFI_OK);
  
-g_hash_table_insert(ffi_table, hash, (gpointer)&ca->cif);

+cif = &ca->cif;
+info->cif = cif;
+g_hash_table_insert(ffi_table, hash, (gpointer)cif);
  }
+
+g_hash_table_destroy(ffi_table);
  }
  #endif /* CONFIG_TCG_INTERPRETER */
  
@@ -4413,12 +4420,7 @@ static void tcg_reg_alloc_call(TCGContext *s, TCGOp *op)

  }
  
  #ifdef CONFIG_TCG_INTERPRETER

-{
-gpointer hash = (gpointer)(uintptr_t)info->typemask;
-ffi_cif *cif = g_hash_table_lookup(ffi_table, hash);
-assert(cif != NULL);
-tcg_out_call(s, tcg_call_func(op), cif);
-}
+tcg_out_call(s, tcg_call_func(op), info->cif);
  #else
  tcg_out_call(s, tcg_call_func(op));
  #endif

Re: [PATCH v4 2/2] migration: check magic value for deciding the mapping of channels

2022-11-23 Thread Daniel P . Berrangé

On Wed, Nov 23, 2022 at 09:34:35PM +0530, manish.mishra wrote:
> 
> On 23/11/22 9:28 pm, Daniel P. Berrangé wrote:
> > On Wed, Nov 23, 2022 at 03:05:27PM +, manish.mishra wrote:
> > > Current logic assumes that channel connections on the destination side are
> > > always established in the same order as the source and the first one will
> > > always be the main channel followed by the multifid or post-copy
> > > preemption channel. This may not be always true, as even if a channel has 
> > > a
> > > connection established on the source side it can be in the pending state 
> > > on
> > > the destination side and a newer connection can be established first.
> > > Basically causing out of order mapping of channels on the destination 
> > > side.
> > > Currently, all channels except post-copy preempt send a magic number, this
> > > patch uses that magic number to decide the type of channel. This logic is
> > > applicable only for precopy(multifd) live migration, as mentioned, the
> > > post-copy preempt channel does not send any magic number. Also, tls live
> > > migrations already does tls handshake before creating other channels, so
> > > this issue is not possible with tls, hence this logic is avoided for tls
> > > live migrations. This patch uses read peek to check the magic number of
> > > channels so that current data/control stream management remains
> > > un-effected.
> > > 
> > > Reviewed-by: Peter Xu 
> > > Reviewed-by: Daniel P. Berrangé  > > Suggested-by: Daniel P. Berrangé  > > Signed-off-by: manish.mishra 
> > > ---
> > >   migration/channel.c  | 46 
> > >   migration/channel.h  |  5 +
> > >   migration/migration.c| 45 ---
> > >   migration/multifd.c  | 12 ---
> > >   migration/multifd.h  |  2 +-
> > >   migration/postcopy-ram.c |  5 +
> > >   migration/postcopy-ram.h |  2 +-
> > >   7 files changed, 91 insertions(+), 26 deletions(-)
> > > 
> > > diff --git a/migration/channel.c b/migration/channel.c
> > > index 1b0815039f..a4600f52c5 100644
> > > --- a/migration/channel.c
> > > +++ b/migration/channel.c
> > > @@ -92,3 +92,49 @@ void migration_channel_connect(MigrationState *s,
> > >   migrate_fd_connect(s, error);
> > >   error_free(error);
> > >   }
> > > +
> > > +
> > > +/**
> > > + * @migration_channel_read_peek - Read from the peek of migration 
> > > channel,
> > > + *without actually removing it from channel buffer.
> > > + *
> > > + * @ioc: the channel object
> > > + * @buf: the memory region to read data into
> > > + * @buflen: the number of bytes to read in @buf
> > > + * @errp: pointer to a NULL-initialized error object
> > > + *
> > > + * Returns 0 if successful, returns -1 and sets @errp if fails.
> > > + */
> > > +int migration_channel_read_peek(QIOChannel *ioc,
> > > +const char *buf,
> > > +const size_t buflen,
> > > +Error **errp)
> > > +{
> > > +   ssize_t len = 0;
> > > +   struct iovec iov = { .iov_base = (char *)buf, .iov_len = buflen };
> > > +
> > > +   while (len < buflen) {
> > > +   len = qio_channel_readv_full(ioc, &iov, 1, NULL,
> > > +NULL, 
> > > QIO_CHANNEL_READ_FLAG_MSG_PEEK, errp);
> > > +
> > > +   if (len == QIO_CHANNEL_ERR_BLOCK) {
> > > +if (qemu_in_coroutine()) {
> > > +/* 1ms sleep. */
> > > +qemu_co_sleep_ns(QEMU_CLOCK_REALTIME, 100);
> > > +} else {
> > > +qio_channel_wait(ioc, G_IO_IN);
> > > +}
> > > +continue;
> > > +   }
> > > +   if (len == 0) {
> > > +   error_setg(errp,
> > > +  "Unexpected end-of-file on channel");
> > > +   return -1;
> > > +   }
> > > +   if (len < 0) {
> > > +   return -1;
> > > +   }
> > > +   }
> > This busy waits when len > 0 and < buflen
> > 
> > 
> > With regards,
> > Daniel
> 
> Sorry,   Daniel, may be i misunderstood something from earlier
> discussions. I thought we discussed we may not prevent it from
> looping multiple times but we can  qemu_co_sleep_ns after every
> retry to deal with busy wait?

You're only calling  qemu_co_sleep_ns when you get ERR_BLOCK
though. That will happen the first time when 0 bytes are
pending. Once 2 bytes arrive and we're waiting for 2 more,
this code will busy wait instead of calling qemu_co_sleep_ns
again, because len==2 in that case.



With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [PATCH for 7.2-rc3 v1 0/2] virtio fixes

2022-11-23 Thread Michael S. Tsirkin

On Wed, Nov 23, 2022 at 04:03:49PM +, Alex Bennée wrote:
> 
> "Michael S. Tsirkin"  writes:
> 
> > On Wed, Nov 23, 2022 at 03:21:32PM +, Alex Bennée wrote:
> >> Hi,
> >> 
> >> This hopefully fixes the problems with VirtIO migration caused by the
> >> previous refactoring of virtio_device_started(). That introduced a
> >> different order of checking which didn't give the VM state primacy but
> >> wasn't noticed as we don't properly exercise VirtIO device migration
> >> and caused issues when dev->started wasn't checked in the core code.
> >> The introduction of virtio_device_should_start() split the overloaded
> >> function up but the broken order still remained. The series finally
> >> fixes that by restoring the original semantics but with the cleaned up
> >> functions.
> >> 
> >> I've added more documentation to the various structures involved as
> >> well as the functions. There is still some inconsistencies in the
> >> VirtIO code between different devices but I think that can be looked
> >> at over the 8.0 cycle.
> >
> >
> > Thanks a lot! Did you try this with gitlab CI? A patch similar to your
> > 2/2 broke it previously ...
> 
> Looking into it now - so far hasn't broken locally but I guess there is
> something different about the CI.


yes - pls push to gitlab, create pipeline e.g. with QEMU_CI set to 2

Or with QEMU_CI set to 1 and then run fedora container and then clang-system 
manually.

> >
> >> Alex Bennée (2):
> >>   include/hw: attempt to document VirtIO feature variables
> >>   include/hw: VM state takes precedence in virtio_device_should_start
> >> 
> >>  include/hw/virtio/vhost.h  | 25 +++---
> >>  include/hw/virtio/virtio.h | 43 --
> >>  2 files changed, 59 insertions(+), 9 deletions(-)
> >> 
> >> -- 
> >> 2.34.1
> 
> 
> -- 
> Alex Bennée

Re: [PATCH v4 2/2] migration: check magic value for deciding the mapping of channels

2022-11-23 Thread manish.mishra




On 23/11/22 9:28 pm, Daniel P. Berrangé wrote:

On Wed, Nov 23, 2022 at 03:05:27PM +, manish.mishra wrote:

Current logic assumes that channel connections on the destination side are
always established in the same order as the source and the first one will
always be the main channel followed by the multifid or post-copy
preemption channel. This may not be always true, as even if a channel has a
connection established on the source side it can be in the pending state on
the destination side and a newer connection can be established first.
Basically causing out of order mapping of channels on the destination side.
Currently, all channels except post-copy preempt send a magic number, this
patch uses that magic number to decide the type of channel. This logic is
applicable only for precopy(multifd) live migration, as mentioned, the
post-copy preempt channel does not send any magic number. Also, tls live
migrations already does tls handshake before creating other channels, so
this issue is not possible with tls, hence this logic is avoided for tls
live migrations. This patch uses read peek to check the magic number of
channels so that current data/control stream management remains
un-effected.

Reviewed-by: Peter Xu 
Reviewed-by: Daniel P. Berrangé 
---
  migration/channel.c  | 46 
  migration/channel.h  |  5 +
  migration/migration.c| 45 ---
  migration/multifd.c  | 12 ---
  migration/multifd.h  |  2 +-
  migration/postcopy-ram.c |  5 +
  migration/postcopy-ram.h |  2 +-
  7 files changed, 91 insertions(+), 26 deletions(-)

diff --git a/migration/channel.c b/migration/channel.c
index 1b0815039f..a4600f52c5 100644
--- a/migration/channel.c
+++ b/migration/channel.c
@@ -92,3 +92,49 @@ void migration_channel_connect(MigrationState *s,
  migrate_fd_connect(s, error);
  error_free(error);
  }
+
+
+/**
+ * @migration_channel_read_peek - Read from the peek of migration channel,
+ *without actually removing it from channel buffer.
+ *
+ * @ioc: the channel object
+ * @buf: the memory region to read data into
+ * @buflen: the number of bytes to read in @buf
+ * @errp: pointer to a NULL-initialized error object
+ *
+ * Returns 0 if successful, returns -1 and sets @errp if fails.
+ */
+int migration_channel_read_peek(QIOChannel *ioc,
+const char *buf,
+const size_t buflen,
+Error **errp)
+{
+   ssize_t len = 0;
+   struct iovec iov = { .iov_base = (char *)buf, .iov_len = buflen };
+
+   while (len < buflen) {
+   len = qio_channel_readv_full(ioc, &iov, 1, NULL,
+NULL, QIO_CHANNEL_READ_FLAG_MSG_PEEK, 
errp);
+
+   if (len == QIO_CHANNEL_ERR_BLOCK) {
+if (qemu_in_coroutine()) {
+/* 1ms sleep. */
+qemu_co_sleep_ns(QEMU_CLOCK_REALTIME, 100);
+} else {
+qio_channel_wait(ioc, G_IO_IN);
+}
+continue;
+   }
+   if (len == 0) {
+   error_setg(errp,
+  "Unexpected end-of-file on channel");
+   return -1;
+   }
+   if (len < 0) {
+   return -1;
+   }
+   }

This busy waits when len > 0 and < buflen


With regards,
Daniel


Sorry,   Daniel, may be i misunderstood something from earlier discussions. I 
thought we discussed we may not prevent it from looping multiple times but we 
can  qemu_co_sleep_ns after every retry to deal with busy wait?

Thanks

Manish Mishra

Re: [PATCH for 7.2-rc3 v1 0/2] virtio fixes

2022-11-23 Thread Alex Bennée



"Michael S. Tsirkin"  writes:

> On Wed, Nov 23, 2022 at 03:21:32PM +, Alex Bennée wrote:
>> Hi,
>> 
>> This hopefully fixes the problems with VirtIO migration caused by the
>> previous refactoring of virtio_device_started(). That introduced a
>> different order of checking which didn't give the VM state primacy but
>> wasn't noticed as we don't properly exercise VirtIO device migration
>> and caused issues when dev->started wasn't checked in the core code.
>> The introduction of virtio_device_should_start() split the overloaded
>> function up but the broken order still remained. The series finally
>> fixes that by restoring the original semantics but with the cleaned up
>> functions.
>> 
>> I've added more documentation to the various structures involved as
>> well as the functions. There is still some inconsistencies in the
>> VirtIO code between different devices but I think that can be looked
>> at over the 8.0 cycle.
>
>
> Thanks a lot! Did you try this with gitlab CI? A patch similar to your
> 2/2 broke it previously ...

Looking into it now - so far hasn't broken locally but I guess there is
something different about the CI.

>
>> Alex Bennée (2):
>>   include/hw: attempt to document VirtIO feature variables
>>   include/hw: VM state takes precedence in virtio_device_should_start
>> 
>>  include/hw/virtio/vhost.h  | 25 +++---
>>  include/hw/virtio/virtio.h | 43 --
>>  2 files changed, 59 insertions(+), 9 deletions(-)
>> 
>> -- 
>> 2.34.1


-- 
Alex Bennée

Re: [PATCH v4 2/2] migration: check magic value for deciding the mapping of channels

2022-11-23 Thread manish.mishra




On 23/11/22 9:22 pm, Peter Xu wrote:

On Wed, Nov 23, 2022 at 03:05:27PM +, manish.mishra wrote:

+int migration_channel_read_peek(QIOChannel *ioc,
+const char *buf,
+const size_t buflen,
+Error **errp)
+{
+   ssize_t len = 0;
+   struct iovec iov = { .iov_base = (char *)buf, .iov_len = buflen };
+
+   while (len < buflen) {
+   len = qio_channel_readv_full(ioc, &iov, 1, NULL,
+NULL, QIO_CHANNEL_READ_FLAG_MSG_PEEK, 
errp);
+
+   if (len == QIO_CHANNEL_ERR_BLOCK) {

This needs to take care of partial len too?



sorry Peter, I did not quite understand it. Can you please give some more 
details.




+if (qemu_in_coroutine()) {
+/* 1ms sleep. */
+qemu_co_sleep_ns(QEMU_CLOCK_REALTIME, 100);
+} else {
+qio_channel_wait(ioc, G_IO_IN);
+}
+continue;
+   }
+   if (len == 0) {
+   error_setg(errp,
+  "Unexpected end-of-file on channel");
+   return -1;
+   }
+   if (len < 0) {
+   return -1;
+   }
+   }
+
+   return 0;
+}


Thanks

Manish Mishra

Re: [PATCH v4 2/2] migration: check magic value for deciding the mapping of channels

2022-11-23 Thread Daniel P . Berrangé

On Wed, Nov 23, 2022 at 03:05:27PM +, manish.mishra wrote:
> Current logic assumes that channel connections on the destination side are
> always established in the same order as the source and the first one will
> always be the main channel followed by the multifid or post-copy
> preemption channel. This may not be always true, as even if a channel has a
> connection established on the source side it can be in the pending state on
> the destination side and a newer connection can be established first.
> Basically causing out of order mapping of channels on the destination side.
> Currently, all channels except post-copy preempt send a magic number, this
> patch uses that magic number to decide the type of channel. This logic is
> applicable only for precopy(multifd) live migration, as mentioned, the
> post-copy preempt channel does not send any magic number. Also, tls live
> migrations already does tls handshake before creating other channels, so
> this issue is not possible with tls, hence this logic is avoided for tls
> live migrations. This patch uses read peek to check the magic number of
> channels so that current data/control stream management remains
> un-effected.
> 
> Reviewed-by: Peter Xu 
> Reviewed-by: Daniel P. Berrangé  Suggested-by: Daniel P. Berrangé  Signed-off-by: manish.mishra 
> ---
>  migration/channel.c  | 46 
>  migration/channel.h  |  5 +
>  migration/migration.c| 45 ---
>  migration/multifd.c  | 12 ---
>  migration/multifd.h  |  2 +-
>  migration/postcopy-ram.c |  5 +
>  migration/postcopy-ram.h |  2 +-
>  7 files changed, 91 insertions(+), 26 deletions(-)
> 
> diff --git a/migration/channel.c b/migration/channel.c
> index 1b0815039f..a4600f52c5 100644
> --- a/migration/channel.c
> +++ b/migration/channel.c
> @@ -92,3 +92,49 @@ void migration_channel_connect(MigrationState *s,
>  migrate_fd_connect(s, error);
>  error_free(error);
>  }
> +
> +
> +/**
> + * @migration_channel_read_peek - Read from the peek of migration channel,
> + *without actually removing it from channel buffer.
> + *
> + * @ioc: the channel object
> + * @buf: the memory region to read data into
> + * @buflen: the number of bytes to read in @buf
> + * @errp: pointer to a NULL-initialized error object
> + *
> + * Returns 0 if successful, returns -1 and sets @errp if fails.
> + */
> +int migration_channel_read_peek(QIOChannel *ioc,
> +const char *buf,
> +const size_t buflen,
> +Error **errp)
> +{
> +   ssize_t len = 0;
> +   struct iovec iov = { .iov_base = (char *)buf, .iov_len = buflen };
> +
> +   while (len < buflen) {
> +   len = qio_channel_readv_full(ioc, &iov, 1, NULL,
> +NULL, QIO_CHANNEL_READ_FLAG_MSG_PEEK, 
> errp);
> +
> +   if (len == QIO_CHANNEL_ERR_BLOCK) {
> +if (qemu_in_coroutine()) {
> +/* 1ms sleep. */
> +qemu_co_sleep_ns(QEMU_CLOCK_REALTIME, 100);
> +} else {
> +qio_channel_wait(ioc, G_IO_IN);
> +}
> +continue;
> +   }
> +   if (len == 0) {
> +   error_setg(errp,
> +  "Unexpected end-of-file on channel");
> +   return -1;
> +   }
> +   if (len < 0) {
> +   return -1;
> +   }
> +   }

This busy waits when len > 0 and < buflen


With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [PATCH v4 2/2] migration: check magic value for deciding the mapping of channels

2022-11-23 Thread Peter Xu

On Wed, Nov 23, 2022 at 03:05:27PM +, manish.mishra wrote:
> +int migration_channel_read_peek(QIOChannel *ioc,
> +const char *buf,
> +const size_t buflen,
> +Error **errp)
> +{
> +   ssize_t len = 0;
> +   struct iovec iov = { .iov_base = (char *)buf, .iov_len = buflen };
> +
> +   while (len < buflen) {
> +   len = qio_channel_readv_full(ioc, &iov, 1, NULL,
> +NULL, QIO_CHANNEL_READ_FLAG_MSG_PEEK, 
> errp);
> +
> +   if (len == QIO_CHANNEL_ERR_BLOCK) {

This needs to take care of partial len too?

> +if (qemu_in_coroutine()) {
> +/* 1ms sleep. */
> +qemu_co_sleep_ns(QEMU_CLOCK_REALTIME, 100);
> +} else {
> +qio_channel_wait(ioc, G_IO_IN);
> +}
> +continue;
> +   }
> +   if (len == 0) {
> +   error_setg(errp,
> +  "Unexpected end-of-file on channel");
> +   return -1;
> +   }
> +   if (len < 0) {
> +   return -1;
> +   }
> +   }
> +
> +   return 0;
> +}

-- 
Peter Xu

Re: [PATCH v4 1/2] io: Add support for MSG_PEEK for socket channel

2022-11-23 Thread Daniel P . Berrangé

On Wed, Nov 23, 2022 at 03:05:26PM +, manish.mishra wrote:
> MSG_PEEK reads from the peek of channel, The data is treated as
> unread and the next read shall still return this data. This
> support is currently added only for socket class. Extra parameter
> 'flags' is added to io_readv calls to pass extra read flags like
> MSG_PEEK.
> 
> Suggested-by: Daniel P. Berrangé  Signed-off-by: manish.mishra 
> ---
>  chardev/char-socket.c   |  4 ++--
>  include/io/channel.h|  6 ++
>  io/channel-buffer.c |  1 +
>  io/channel-command.c|  1 +
>  io/channel-file.c   |  1 +
>  io/channel-null.c   |  1 +
>  io/channel-socket.c | 17 -
>  io/channel-tls.c|  1 +
>  io/channel-websock.c|  1 +
>  io/channel.c| 16 
>  migration/channel-block.c   |  1 +
>  scsi/qemu-pr-helper.c   |  2 +-
>  tests/qtest/tpm-emu.c   |  2 +-
>  tests/unit/test-io-channel-socket.c |  1 +
>  util/vhost-user-server.c|  2 +-
>  15 files changed, 47 insertions(+), 10 deletions(-)

Reviewed-by: Daniel P. Berrangé 


With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [PATCH v2 2/3] hw/mips/malta: Set PIIX4 IRQ routes in embedded bootloader

2022-11-23 Thread Jiaxun Yang




> 2022年11月22日 12:37，BALATON Zoltan  写道：
> 
> Hello,
> 
> On Mon, 21 Nov 2022, Bernhard Beschow wrote:
>> Am 21. November 2022 22:43:50 UTC schrieb "Philippe Mathieu-Daudé" 
>> :
>>> On 21/11/22 16:34, Bernhard Beschow wrote:
 Am 27. Oktober 2022 20:47:19 UTC schrieb "Philippe Mathieu-Daudé" 
 :
> Linux kernel expects the northbridge & southbridge chipsets
> configured by the BIOS firmware. We emulate that by writing
> a tiny bootloader code in write_bootloader().
> 
> Upon introduction in commit 5c2b87e34d ("PIIX4 support"),
> the PIIX4 configuration space included values specific to
> the Malta board.
> 
> Set the Malta-specific IRQ routing values in the embedded
> bootloader, so the next commit can remove the Malta specific
> bits from the PIIX4 PCI-ISA bridge and make it generic
> (matching the real hardware).
> 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
> FIXME: Missing the nanoMIPS counter-part!
 
 Who will be taking care of this? I have absolutely no clue how the 
 write_bootloader functions work, so I don't see how to fix it.
>>> 
>>> Oh actually I wrote that and tested it but context switched and forgot
>>> about it... I'll look back when I get some time, probably around the
>>> release.

I can try to adopt existing boot loader helper functions, just a matter of 
opcodes I think.

> 
> Unrelated to this but found it while looking at malta.c now: another possible 
> clean up is to replace the local generate_eeprom_spd() func with 
> spd_data_generate() from hw/i2c/smbus_eeprom.c that other boards use already 
> but I did not change malta because I could not test it. If you can test malta 
> then it should be an easy change and simplify malta.c a bit.
> 
 Couldn't we just do it like in pegasos2_init() where the registers are 
 initialized by QEMU directly if there is no bootloader binary configured? 
 I could do that.
>>> I rather mimic bootloaders... maybe a matter of taste?
> 
> Is that a bootloader or a replacement firmware? To me bootloader is some OS 
> specific binary that is loaded by firware to boot an OS. But there are OS 
> independent bootloaders like grub so maybe you could emulate something like 
> that, I don't know what malta does.

YAMON is a OS-dependent and HW-dependent firmware like u-boot.

> 
> If there's no firmware binary QEMU should provide something to replace it to 
> give the expected environment for the binary loaded by -kernel. In case of 
> pegasos2 the init method sets up regs to init devices as done by the firmware 
> and the rest is implemented by VOF (loaded from pc-bios) that provices the 
> OpenFirmware client interface. The device setup in init is needed because VOF 
> does not do that.
> 
>> I don't mind either way. I meant that I could help with the second approach 
>> but not with the current one since I have no clue whatsoever how it works. 
>> There are just too many magic constants that don't make any sense to me, and 
>> too many layers of indirection, for example.
> 
> If malta has a replacement firmware for this case maybe it could be stored in 
> a binary in pc-bios and loaded from there instead of writing it in hex to 
> guest memory. That binary could even be assembled from source which should 
> make it simpler to write and change. Or is YAMON open source? According to 
> this page it is: https://www.mips.com/develop/tools/boot-loaders/ so maybe it 
> could be included as a firmware binary instead of being emulated?

Hmm, YAMON was a open source software but I’m unable to find a copy of source 
for Malta board comes with GT chipset that QEMU emulated.
So nowadays we mainly use -kernel feature to do direct kernel boot.

Direct kernel boot is really a brilliant function that I don’t want to lose :-)

Thanks
- Jiaxun


> 
> Regards,
> BALATON Zoltan
> 
>> Anyway, I'm asking for the current state because I'm pretty much ready for 
>> posting a v3 of my PIIX consolidation series which now depends on this 
>> series.
>> 
>> Best regards,
>> Bernhard
>> 
>>> 
>>> Regards,
>>> 
>>> Phil.
>>

Re: [PATCH for 7.2-rc3 v1 0/2] virtio fixes

2022-11-23 Thread Michael S. Tsirkin

On Wed, Nov 23, 2022 at 03:21:32PM +, Alex Bennée wrote:
> Hi,
> 
> This hopefully fixes the problems with VirtIO migration caused by the
> previous refactoring of virtio_device_started(). That introduced a
> different order of checking which didn't give the VM state primacy but
> wasn't noticed as we don't properly exercise VirtIO device migration
> and caused issues when dev->started wasn't checked in the core code.
> The introduction of virtio_device_should_start() split the overloaded
> function up but the broken order still remained. The series finally
> fixes that by restoring the original semantics but with the cleaned up
> functions.
> 
> I've added more documentation to the various structures involved as
> well as the functions. There is still some inconsistencies in the
> VirtIO code between different devices but I think that can be looked
> at over the 8.0 cycle.


Thanks a lot! Did you try this with gitlab CI? A patch similar to your
2/2 broke it previously ...

> Alex Bennée (2):
>   include/hw: attempt to document VirtIO feature variables
>   include/hw: VM state takes precedence in virtio_device_should_start
> 
>  include/hw/virtio/vhost.h  | 25 +++---
>  include/hw/virtio/virtio.h | 43 --
>  2 files changed, 59 insertions(+), 9 deletions(-)
> 
> -- 
> 2.34.1

Re: [qemu-web PATCH] Add a blog post about zoned storage emulation

2022-11-23 Thread Sam Li

Thomas Huth  于2022年11月23日周三 20:48写道：
>
> On 17/11/2022 20.12, Stefan Hajnoczi wrote:
> > Hi Sam,
> > Please send a git repo URL so Thomas can fetch the commit without
> > email/file size limitations.
>
> The size obviously comes from the PNG image ... since this seems to be a
> photo, I think JPG would be a better file type, so please convert it to JPG
> with an appropriate compression level. I assume this will help to shrink it
> to a reasonable size.
>
> >> +
>
> Another question : Where does the picture come from? Does it have a license
> that allows it to be used on websites like the QEMU blog?

It comes from slide P12 in this sharing and it doesn't have such a
license. I'll remove this image instead.
https://kvmforum2022.sched.com/event/15jL3/whats-in-virtio-12-and-what-isnt-there-michael-s-tsirkin-red-hat?

Here is a link to the fixed version:
https://github.com/sgzerolc/qemu-web/tree/zbd


Thanks,
Sam

[PATCH v1 1/2] include/hw: attempt to document VirtIO feature variables

2022-11-23 Thread Alex Bennée

We have a bunch of variables associated with the device and the vhost
backend which are used inconsistently throughout the code base. Lets
start trying to bring some order by agreeing what each variable is
for.

Signed-off-by: Alex Bennée 
Cc: Stefano Garzarella 
Cc: "Michael S. Tsirkin" 
Cc: Stefan Hajnoczi 

---
v2
  - dropped DISCUSS and commentary
  - separated protocol section for clarity
  - updated working on vhost->backend_features
  - made clear guest_features was the written state
---
 include/hw/virtio/vhost.h  | 25 ++---
 include/hw/virtio/virtio.h | 19 ++-
 2 files changed, 40 insertions(+), 4 deletions(-)

diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
index 353252ac3e..eaf628f656 100644
--- a/include/hw/virtio/vhost.h
+++ b/include/hw/virtio/vhost.h
@@ -88,13 +88,32 @@ struct vhost_dev {
 int vq_index_end;
 /* if non-zero, minimum required value for max_queues */
 int num_queues;
+/**
+ * vhost feature handling requires matching the feature set
+ * offered by a backend which may be a subset of the total
+ * features eventually offered to the guest.
+ *
+ * @features: available features provided by the backend
+ * @acked_features: final negotiated features with front-end driver
+ *
+ * @backend_features: this is used in a couple of places to either
+ * store VHOST_USER_F_PROTOCOL_FEATURES to apply to
+ * VHOST_USER_SET_FEATURES or VHOST_NET_F_VIRTIO_NET_HDR. Its
+ * future use should be discouraged and the variable retired as
+ * its easy to confuse with the VirtIO backend_features.
+ */
 uint64_t features;
-/** @acked_features: final set of negotiated features */
 uint64_t acked_features;
-/** @backend_features: backend specific feature bits */
 uint64_t backend_features;
-/** @protocol_features: final negotiated protocol features */
+
+/**
+ * @protocol_features: is the vhost-user only feature set by
+ * VHOST_USER_SET_PROTOCOL_FEATURES. Protocol features are only
+ * negotiated if VHOST_USER_F_PROTOCOL_FEATURES has been offered
+ * by the backend (see @features).
+ */
 uint64_t protocol_features;
+
 uint64_t max_queues;
 uint64_t backend_cap;
 /* @started: is the vhost device started? */
diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index a973811cbf..0f612067f7 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -93,6 +93,12 @@ enum virtio_device_endian {
 VIRTIO_DEVICE_ENDIAN_BIG,
 };
 
+/**
+ * struct VirtIODevice - common VirtIO structure
+ * @name: name of the device
+ * @status: VirtIO Device Status field
+ *
+ */
 struct VirtIODevice
 {
 DeviceState parent_obj;
@@ -100,9 +106,20 @@ struct VirtIODevice
 uint8_t status;
 uint8_t isr;
 uint16_t queue_sel;
-uint64_t guest_features;
+/**
+ * These fields represent a set of VirtIO features at various
+ * levels of the stack. @host_features indicates the complete
+ * feature set the VirtIO device can offer to the driver.
+ * @guest_features indicates which features the VirtIO driver has
+ * selected by writing to the feature register. Finally
+ * @backend_features represents everything supported by the
+ * backend (e.g. vhost) and could potentially be a subset of the
+ * total feature set offered by QEMU.
+ */
 uint64_t host_features;
+uint64_t guest_features;
 uint64_t backend_features;
+
 size_t config_len;
 void *config;
 uint16_t config_vector;
-- 
2.34.1

[PATCH v1 2/2] include/hw: VM state takes precedence in virtio_device_should_start

2022-11-23 Thread Alex Bennée

The VM status should always preempt the device status for these
checks. This ensures the device is in the correct state when we
suspend the VM prior to migrations. This restores the checks to the
order they where in before the refactoring moved things around.

While we are at it lets improve our documentation of the various
fields involved and document the two functions.

Fixes: 9f6bcfd99f (hw/virtio: move vm_running check to virtio_device_started)
Fixes: 259d69c00b (hw/virtio: introduce virtio_device_should_start)
Signed-off-by: Alex Bennée 
Tested-by: Christian Borntraeger 
---
 include/hw/virtio/virtio.h | 24 +++-
 1 file changed, 19 insertions(+), 5 deletions(-)

diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index 0f612067f7..48f539d0fe 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -133,6 +133,13 @@ struct VirtIODevice
 bool broken; /* device in invalid state, needs reset */
 bool use_disabled_flag; /* allow use of 'disable' flag when needed */
 bool disabled; /* device in temporarily disabled state */
+/**
+ * @use_started: true if the @started flag should be used to check the
+ * current state of the VirtIO device. Otherwise status bits
+ * should be checked for a current status of the device.
+ * @use_started is only set via QMP and defaults to true for all
+ * modern machines (since 4.1).
+ */
 bool use_started;
 bool started;
 bool start_on_kick; /* when virtio 1.0 feature has not been negotiated */
@@ -408,6 +415,17 @@ static inline bool virtio_is_big_endian(VirtIODevice *vdev)
 return false;
 }
 
+
+/**
+ * virtio_device_should_start() - check if device started
+ * @vdev - the VirtIO device
+ * @status - the devices status bits
+ *
+ * Check if the device is started. For most modern machines this is
+ * tracked via the @vdev->started field (to support migration),
+ * otherwise we check for the final negotiated status bit that
+ * indicates everything is ready.
+ */
 static inline bool virtio_device_started(VirtIODevice *vdev, uint8_t status)
 {
 if (vdev->use_started) {
@@ -428,15 +446,11 @@ static inline bool virtio_device_started(VirtIODevice 
*vdev, uint8_t status)
  */
 static inline bool virtio_device_should_start(VirtIODevice *vdev, uint8_t 
status)
 {
-if (vdev->use_started) {
-return vdev->started;
-}
-
 if (!vdev->vm_running) {
 return false;
 }
 
-return status & VIRTIO_CONFIG_S_DRIVER_OK;
+return virtio_device_started(vdev, status);
 }
 
 static inline void virtio_set_started(VirtIODevice *vdev, bool started)
-- 
2.34.1

[PATCH for 7.2-rc3 v1 0/2] virtio fixes

2022-11-23 Thread Alex Bennée

Hi,

This hopefully fixes the problems with VirtIO migration caused by the
previous refactoring of virtio_device_started(). That introduced a
different order of checking which didn't give the VM state primacy but
wasn't noticed as we don't properly exercise VirtIO device migration
and caused issues when dev->started wasn't checked in the core code.
The introduction of virtio_device_should_start() split the overloaded
function up but the broken order still remained. The series finally
fixes that by restoring the original semantics but with the cleaned up
functions.

I've added more documentation to the various structures involved as
well as the functions. There is still some inconsistencies in the
VirtIO code between different devices but I think that can be looked
at over the 8.0 cycle.

Alex Bennée (2):
  include/hw: attempt to document VirtIO feature variables
  include/hw: VM state takes precedence in virtio_device_should_start

 include/hw/virtio/vhost.h  | 25 +++---
 include/hw/virtio/virtio.h | 43 --
 2 files changed, 59 insertions(+), 9 deletions(-)

-- 
2.34.1

Re: [PATCH] target/mips: Properly set C0_CMGCRBase after CPU reset

2022-11-23 Thread Jiaxun Yang




> 2022年11月14日 16:25，Jiaxun Yang  写道：
> 
> Value of C0_CMGCRBase will be reseted to default when cpu reset
> happens. In some cases software may move GCR base and then initiate
> a CPU reset, this will leave C0_CMGCRBase of reseted core incorrect.
> 
> Implement a callback in CMGCR device to allow C0_CMGCRBase and other
> global states to be overriden after CPU reset.
> 
> Signed-off-by: Jiaxun Yang 
> ---
> This fixes SMP boot for Boston board.
> I'm not sure if it's the best palce to make such a callback,
> but we can add more global states such as BEV here in future.

Ping :-)

Any comments?

> ---
> hw/mips/cps.c| 3 ++-
> hw/misc/mips_cmgcr.c | 5 +
> target/mips/cpu.c| 4 +++-
> target/mips/cpu.h| 4 
> 4 files changed, 14 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/mips/cps.c b/hw/mips/cps.c
> index 2b436700ce..29b10ff8d0 100644
> --- a/hw/mips/cps.c
> +++ b/hw/mips/cps.c
> @@ -98,6 +98,7 @@ static void mips_cps_realize(DeviceState *dev, Error **errp)
> cpu_mips_clock_init(cpu);
> 
> env = &cpu->env;
> +env->gcr = &s->gcr;
> if (cpu_mips_itu_supported(env)) {
> itu_present = true;
> /* Attach ITC Tag to the VP */
> @@ -158,7 +159,7 @@ static void mips_cps_realize(DeviceState *dev, Error 
> **errp)
> sysbus_mmio_get_region(SYS_BUS_DEVICE(&s->gic), 
> 0));
> 
> /* Global Configuration Registers */
> -gcr_base = env->CP0_CMGCRBase << 4;
> +gcr_base = GCR_BASE_ADDR;
> 
> object_initialize_child(OBJECT(dev), "gcr", &s->gcr, TYPE_MIPS_GCR);
> object_property_set_int(OBJECT(&s->gcr), "num-vp", s->num_vp,
> diff --git a/hw/misc/mips_cmgcr.c b/hw/misc/mips_cmgcr.c
> index 3c8b37f700..f2108b7d32 100644
> --- a/hw/misc/mips_cmgcr.c
> +++ b/hw/misc/mips_cmgcr.c
> @@ -19,6 +19,11 @@
> #include "hw/qdev-properties.h"
> #include "hw/intc/mips_gic.h"
> 
> +void gcr_cpu_reset(struct MIPSGCRState *s, CPUMIPSState *env)
> +{
> +env->CP0_CMGCRBase = s->gcr_base >> 4;
> +}
> +
> static inline bool is_cpc_connected(MIPSGCRState *s)
> {
> return s->cpc_mr != NULL;
> diff --git a/target/mips/cpu.c b/target/mips/cpu.c
> index e997c1b9cb..d0a76b95f7 100644
> --- a/target/mips/cpu.c
> +++ b/target/mips/cpu.c
> @@ -297,7 +297,9 @@ static void mips_cpu_reset(DeviceState *dev)
> env->CP0_EBase |= (int32_t)0x8000;
> }
> if (env->CP0_Config3 & (1 << CP0C3_CMGCR)) {
> -env->CP0_CMGCRBase = 0x1fbf8000 >> 4;
> +if (env->gcr) {
> +gcr_cpu_reset(env->gcr, env);
> +}
> }
> env->CP0_EntryHi_ASID_mask = (env->CP0_Config5 & (1 << CP0C5_MI)) ?
> 0x0 : (env->CP0_Config4 & (1 << CP0C4_AE)) ? 0x3ff : 0xff;
> diff --git a/target/mips/cpu.h b/target/mips/cpu.h
> index 0a085643a3..c345e6b1c7 100644
> --- a/target/mips/cpu.h
> +++ b/target/mips/cpu.h
> @@ -1154,6 +1154,7 @@ typedef struct CPUArchState {
> CPUMIPSTLBContext *tlb;
> void *irq[8];
> struct MIPSITUState *itu;
> +struct MIPSGCRState *gcr;
> MemoryRegion *itc_tag; /* ITC Configuration Tags */
> #endif
> 
> @@ -1310,6 +1311,9 @@ void cpu_mips_soft_irq(CPUMIPSState *env, int irq, int 
> level);
> /* mips_itu.c */
> void itc_reconfigure(struct MIPSITUState *tag);
> 
> +/* mips_cmgcr.c */
> +void gcr_cpu_reset(struct MIPSGCRState *s, CPUMIPSState *env);
> +
> #endif /* !CONFIG_USER_ONLY */
> 
> /* helper.c */
> -- 
> 2.37.4
>

Re: [PATCH v2] Drop more useless casts from void * to pointer

2022-11-23 Thread Markus Armbruster

Daniel P. Berrangé  writes:

> On Wed, Nov 23, 2022 at 02:51:49PM +0100, BALATON Zoltan wrote:
>> On Wed, 23 Nov 2022, Markus Armbruster wrote:
>> > Signed-off-by: Markus Armbruster 
>> > Reviewed-by: Laurent Vivier 
>> > ---
>> > v2:
>> > * PATCH 1+2 merged as commit 0a553c12c7 and 3f7febc937
>> > * PATCH 3 change to util/coroutine-ucontext.c dropped [Laurent]
>> > 
>> > bsd-user/elfload.c  | 2 +-
>> > contrib/plugins/cache.c | 8 
>> > contrib/vhost-user-blk/vhost-user-blk.c | 2 +-
>> > hw/core/qdev-clock.c| 2 +-
>> > hw/hyperv/vmbus.c   | 2 +-
>> > hw/net/cadence_gem.c| 2 +-
>> > hw/net/virtio-net.c | 2 +-
>> > hw/nvme/ctrl.c  | 4 ++--
>> > hw/rdma/vmw/pvrdma_cmd.c| 9 +++--
>> > hw/rdma/vmw/pvrdma_qp_ops.c | 6 +++---
>> > hw/virtio/virtio-iommu.c| 3 +--
>> > linux-user/syscall.c| 2 +-
>> > target/i386/hax/hax-all.c   | 2 +-
>> > tests/tcg/aarch64/system/semiheap.c | 4 ++--
>> > util/vfio-helpers.c | 2 +-
>> > 15 files changed, 24 insertions(+), 28 deletions(-)
>> > 
>> > diff --git a/bsd-user/elfload.c b/bsd-user/elfload.c
>> > index f8edb22f2a..fbcdc94b96 100644
>> > --- a/bsd-user/elfload.c
>> > +++ b/bsd-user/elfload.c
>> > @@ -156,7 +156,7 @@ static abi_ulong copy_elf_strings(int argc, char 
>> > **argv, void **page,
>> > --p; --tmp; --len;
>> > if (--offset < 0) {
>> > offset = p % TARGET_PAGE_SIZE;
>> > -pag = (char *)page[p / TARGET_PAGE_SIZE];
>> > +pag = page[p / TARGET_PAGE_SIZE];
>> 
>> I think arithmetic on void pointer was undefined at least in the past so
>> some compilers may warn for it but not sure if this is still the case for
>> the compilers we care about. Apparently not if this now compiles but that
>> explains why this cast was not useless.

I don't think so :)

@pag is char *.

@page is void **.

page[p / TARGET_PAGE_SIZE] is void *.  No need to cast to char * before
assigning to @pag.

No pointer arithmetic so far.  There's some further down: pag + offset.
@pag is char * before and after my patch.

>> Found some more info on this here:
>> 
>> https://stackoverflow.com/questions/3523145/pointer-arithmetic-for-void-pointer-in-c
>
> QEMU explicitly only targets GCC + Clang, so portability to other
> compilers is not required.

Correct.  We do arithmentic with void * in many places already.

If we cared for portability to other compilers, we'd enable

'-Wpointer-arith'
 Warn about anything that depends on the "size of" a function type
 or of 'void'.  GNU C assigns these types a size of 1, for
 convenience in calculations with 'void *' pointers and pointers to
 functions.  In C++, warn also when an arithmetic operation involves
 'NULL'.  This warning is also enabled by '-Wpedantic'.

But we don't.

[PATCH v4 0/2] migration: check magic value for deciding the mapping of channels

2022-11-23 Thread manish.mishra

Current logic assumes that channel connections on the destination side are
always established in the same order as the source and the first one will
always be the main channel followed by the multifid or post-copy
preemption channel. This may not be always true, as even if a channel has a
connection established on the source side it can be in the pending state on
the destination side and a newer connection can be established first.
Basically causing out of order mapping of channels on the destination side.
Currently, all channels except post-copy preempt send a magic number, this
patch uses that magic number to decide the type of channel. This logic is
applicable only for precopy(multifd) live migration, as mentioned, the
post-copy preempt channel does not send any magic number. Also, tls live
migrations already does tls handshake before creating other channels, so
this issue is not possible with tls, hence this logic is avoided for tls
live migrations. This patch uses MSG_PEEK to check the magic number of
channels so that current data/control stream management remains
un-effected.

v2:
  TLS does not support MSG_PEEK, so V1 was broken for tls live
  migrations. For tls live migration, while initializing main channel
  tls handshake is done before we can create other channels, so this
  issue is not possible for tls live migrations. In V2 added a check
  to avoid checking magic number for tls live migration and fallback
  to older method to decide mapping of channels on destination side.

v3:
  1. Split change in two patches, io patch for read_peek routines,
 migration patch for migration related changes.
  2. Add flags to io_readv calls to get extra read flags like
 MSG_PEEK.
  3. Some other minor fixes.

v4:
  1. Removed common *all_eof routines for read peek and added one
 specific to live migration.
  2. Updated to use qemu_co_sleep_ns instead of qio_channel_yield.
  3. Some other minor fixes.

manish.mishra (2):
  io: Add support for MSG_PEEK for socket channel
  migration: check magic value for deciding the mapping of channels

 chardev/char-socket.c   |  4 +--
 include/io/channel.h|  6 
 io/channel-buffer.c |  1 +
 io/channel-command.c|  1 +
 io/channel-file.c   |  1 +
 io/channel-null.c   |  1 +
 io/channel-socket.c | 17 ++-
 io/channel-tls.c|  1 +
 io/channel-websock.c|  1 +
 io/channel.c| 16 +++---
 migration/channel-block.c   |  1 +
 migration/channel.c | 46 +
 migration/channel.h |  5 
 migration/migration.c   | 45 
 migration/multifd.c | 12 +++-
 migration/multifd.h |  2 +-
 migration/postcopy-ram.c|  5 +---
 migration/postcopy-ram.h|  2 +-
 scsi/qemu-pr-helper.c   |  2 +-
 tests/qtest/tpm-emu.c   |  2 +-
 tests/unit/test-io-channel-socket.c |  1 +
 util/vhost-user-server.c|  2 +-
 22 files changed, 138 insertions(+), 36 deletions(-)

-- 
2.22.3

[PATCH v4 2/2] migration: check magic value for deciding the mapping of channels

2022-11-23 Thread manish.mishra

Current logic assumes that channel connections on the destination side are
always established in the same order as the source and the first one will
always be the main channel followed by the multifid or post-copy
preemption channel. This may not be always true, as even if a channel has a
connection established on the source side it can be in the pending state on
the destination side and a newer connection can be established first.
Basically causing out of order mapping of channels on the destination side.
Currently, all channels except post-copy preempt send a magic number, this
patch uses that magic number to decide the type of channel. This logic is
applicable only for precopy(multifd) live migration, as mentioned, the
post-copy preempt channel does not send any magic number. Also, tls live
migrations already does tls handshake before creating other channels, so
this issue is not possible with tls, hence this logic is avoided for tls
live migrations. This patch uses read peek to check the magic number of
channels so that current data/control stream management remains
un-effected.

Reviewed-by: Peter Xu 
Reviewed-by: Daniel P. Berrangé 
---
 migration/channel.c  | 46 
 migration/channel.h  |  5 +
 migration/migration.c| 45 ---
 migration/multifd.c  | 12 ---
 migration/multifd.h  |  2 +-
 migration/postcopy-ram.c |  5 +
 migration/postcopy-ram.h |  2 +-
 7 files changed, 91 insertions(+), 26 deletions(-)

diff --git a/migration/channel.c b/migration/channel.c
index 1b0815039f..a4600f52c5 100644
--- a/migration/channel.c
+++ b/migration/channel.c
@@ -92,3 +92,49 @@ void migration_channel_connect(MigrationState *s,
 migrate_fd_connect(s, error);
 error_free(error);
 }
+
+
+/**
+ * @migration_channel_read_peek - Read from the peek of migration channel,
+ *without actually removing it from channel buffer.
+ *
+ * @ioc: the channel object
+ * @buf: the memory region to read data into
+ * @buflen: the number of bytes to read in @buf
+ * @errp: pointer to a NULL-initialized error object
+ *
+ * Returns 0 if successful, returns -1 and sets @errp if fails.
+ */
+int migration_channel_read_peek(QIOChannel *ioc,
+const char *buf,
+const size_t buflen,
+Error **errp)
+{
+   ssize_t len = 0;
+   struct iovec iov = { .iov_base = (char *)buf, .iov_len = buflen };
+
+   while (len < buflen) {
+   len = qio_channel_readv_full(ioc, &iov, 1, NULL,
+NULL, QIO_CHANNEL_READ_FLAG_MSG_PEEK, 
errp);
+
+   if (len == QIO_CHANNEL_ERR_BLOCK) {
+if (qemu_in_coroutine()) {
+/* 1ms sleep. */
+qemu_co_sleep_ns(QEMU_CLOCK_REALTIME, 100);
+} else {
+qio_channel_wait(ioc, G_IO_IN);
+}
+continue;
+   }
+   if (len == 0) {
+   error_setg(errp,
+  "Unexpected end-of-file on channel");
+   return -1;
+   }
+   if (len < 0) {
+   return -1;
+   }
+   }
+
+   return 0;
+}
diff --git a/migration/channel.h b/migration/channel.h
index 67a461c28a..5bdb8208a7 100644
--- a/migration/channel.h
+++ b/migration/channel.h
@@ -24,4 +24,9 @@ void migration_channel_connect(MigrationState *s,
QIOChannel *ioc,
const char *hostname,
Error *error_in);
+
+int migration_channel_read_peek(QIOChannel *ioc,
+const char *buf,
+const size_t buflen,
+Error **errp);
 #endif
diff --git a/migration/migration.c b/migration/migration.c
index f485eea5fb..8509f20a80 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -31,6 +31,7 @@
 #include "migration.h"
 #include "savevm.h"
 #include "qemu-file.h"
+#include "channel.h"
 #include "migration/vmstate.h"
 #include "block/block.h"
 #include "qapi/error.h"
@@ -733,31 +734,51 @@ void migration_ioc_process_incoming(QIOChannel *ioc, 
Error **errp)
 {
 MigrationIncomingState *mis = migration_incoming_get_current();
 Error *local_err = NULL;
-bool start_migration;
 QEMUFile *f;
+bool default_channel = true;
+uint32_t channel_magic = 0;
+int ret = 0;
 
-if (!mis->from_src_file) {
-/* The first connection (multifd may have multiple) */
+if (migrate_use_multifd() && !migrate_postcopy_ram() &&
+qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_READ_MSG_PEEK)) {
+/*
+ * With multiple channels, it is possible that we receive channels
+ * out of order on destination side, causing incorrect mapping of
+ * source channels on destination side. Check channel MAGIC to
+ * decide type of channel. Please note this is best effort, postcopy

[PATCH v4 1/2] io: Add support for MSG_PEEK for socket channel

2022-11-23 Thread manish.mishra

MSG_PEEK reads from the peek of channel, The data is treated as
unread and the next read shall still return this data. This
support is currently added only for socket class. Extra parameter
'flags' is added to io_readv calls to pass extra read flags like
MSG_PEEK.

Suggested-by: Daniel P. Berrangé 
---
 chardev/char-socket.c   |  4 ++--
 include/io/channel.h|  6 ++
 io/channel-buffer.c |  1 +
 io/channel-command.c|  1 +
 io/channel-file.c   |  1 +
 io/channel-null.c   |  1 +
 io/channel-socket.c | 17 -
 io/channel-tls.c|  1 +
 io/channel-websock.c|  1 +
 io/channel.c| 16 
 migration/channel-block.c   |  1 +
 scsi/qemu-pr-helper.c   |  2 +-
 tests/qtest/tpm-emu.c   |  2 +-
 tests/unit/test-io-channel-socket.c |  1 +
 util/vhost-user-server.c|  2 +-
 15 files changed, 47 insertions(+), 10 deletions(-)

diff --git a/chardev/char-socket.c b/chardev/char-socket.c
index 879564aa8a..5afce9a464 100644
--- a/chardev/char-socket.c
+++ b/chardev/char-socket.c
@@ -283,11 +283,11 @@ static ssize_t tcp_chr_recv(Chardev *chr, char *buf, 
size_t len)
 if (qio_channel_has_feature(s->ioc, QIO_CHANNEL_FEATURE_FD_PASS)) {
 ret = qio_channel_readv_full(s->ioc, &iov, 1,
  &msgfds, &msgfds_num,
- NULL);
+ 0, NULL);
 } else {
 ret = qio_channel_readv_full(s->ioc, &iov, 1,
  NULL, NULL,
- NULL);
+ 0, NULL);
 }
 
 if (msgfds_num) {
diff --git a/include/io/channel.h b/include/io/channel.h
index c680ee7480..716235d496 100644
--- a/include/io/channel.h
+++ b/include/io/channel.h
@@ -34,6 +34,8 @@ OBJECT_DECLARE_TYPE(QIOChannel, QIOChannelClass,
 
 #define QIO_CHANNEL_WRITE_FLAG_ZERO_COPY 0x1
 
+#define QIO_CHANNEL_READ_FLAG_MSG_PEEK 0x1
+
 typedef enum QIOChannelFeature QIOChannelFeature;
 
 enum QIOChannelFeature {
@@ -41,6 +43,7 @@ enum QIOChannelFeature {
 QIO_CHANNEL_FEATURE_SHUTDOWN,
 QIO_CHANNEL_FEATURE_LISTEN,
 QIO_CHANNEL_FEATURE_WRITE_ZERO_COPY,
+QIO_CHANNEL_FEATURE_READ_MSG_PEEK,
 };
 
 
@@ -114,6 +117,7 @@ struct QIOChannelClass {
 size_t niov,
 int **fds,
 size_t *nfds,
+int flags,
 Error **errp);
 int (*io_close)(QIOChannel *ioc,
 Error **errp);
@@ -188,6 +192,7 @@ void qio_channel_set_name(QIOChannel *ioc,
  * @niov: the length of the @iov array
  * @fds: pointer to an array that will received file handles
  * @nfds: pointer filled with number of elements in @fds on return
+ * @flags: read flags (QIO_CHANNEL_READ_FLAG_*)
  * @errp: pointer to a NULL-initialized error object
  *
  * Read data from the IO channel, storing it in the
@@ -224,6 +229,7 @@ ssize_t qio_channel_readv_full(QIOChannel *ioc,
size_t niov,
int **fds,
size_t *nfds,
+   int flags,
Error **errp);
 
 
diff --git a/io/channel-buffer.c b/io/channel-buffer.c
index bf52011be2..8096180f85 100644
--- a/io/channel-buffer.c
+++ b/io/channel-buffer.c
@@ -54,6 +54,7 @@ static ssize_t qio_channel_buffer_readv(QIOChannel *ioc,
 size_t niov,
 int **fds,
 size_t *nfds,
+int flags,
 Error **errp)
 {
 QIOChannelBuffer *bioc = QIO_CHANNEL_BUFFER(ioc);
diff --git a/io/channel-command.c b/io/channel-command.c
index 74516252ba..e7edd091af 100644
--- a/io/channel-command.c
+++ b/io/channel-command.c
@@ -203,6 +203,7 @@ static ssize_t qio_channel_command_readv(QIOChannel *ioc,
  size_t niov,
  int **fds,
  size_t *nfds,
+ int flags,
  Error **errp)
 {
 QIOChannelCommand *cioc = QIO_CHANNEL_COMMAND(ioc);
diff --git a/io/channel-file.c b/io/channel-file.c
index b67687c2aa..d76663e6ae 100644
--- a/io/channel-file.c
+++ b/io/channel-file.c
@@ -86,6 +86,7 @@ static ssize_t qio_channel_file_readv(QIOChannel *ioc,
   size_t niov,
   int **fds,
   size_t *nfds,
+  int flags,
   Error **errp)
 {
 QIOChannelFi

arm: gdb-stub is broken by FEAT_HAFDBS

2022-11-23 Thread Changbin Du via

Hello, Richard,
We just noticed the gdb-stub is broken and probably caused by commit 4a3585568
("target/arm: Plumb debug into S1Translate").

(gdb) target remote :1234
Remote debugging using :1234
0x0e1716d0 in ?? ()
=> 0x0e1716d0:  Cannot access memory at address 0xe1716d0

This issue can be workaround by below change.

--- a/target/arm/ptw.c
+++ b/target/arm/ptw.c
@@ -2879,7 +2879,7 @@ hwaddr arm_cpu_get_phys_page_attrs_debug(CPUState *cs, 
vaddr addr,
 S1Translate ptw = {
 .in_mmu_idx = arm_mmu_idx(env),
 .in_secure = arm_is_secure(env),
-.in_debug = true,
+.in_debug = false,
 };

Could you take a look at this? Thank you!

-- 
Cheers,
Changbin Du

[QEMU][Question] Is there any way to backup a dirty-bitmap that can be executed at the other server?

2022-11-23 Thread 너구리맨

First of all, Thanks for the hard work.

With QMP, I can backup dirty-bitmap - like 'blockdev-backup' incremental
sync mode.
But this command should be executed at the host where the QEMU process is
live.

I want to execute a backup at the other host so it does not affect running
Guest.
- `blockdev-backup` preempts the host's I/O resource.

Is there any way to backup a dirty-bitmap that can be executed at the other
server?

Thanks.

[PULL 6/7] hw/audio/intel-hda: don't reset codecs twice

2022-11-23 Thread Gerd Hoffmann

From: Peter Maydell 

Currently the intel-hda device has a reset method which manually
resets all the codecs by calling device_legacy_reset() on them.  This
means they get reset twice, once because child devices on a qbus get
reset before the parent device's reset method is called, and then
again because we're manually resetting them.

Drop the manual reset call, and ensure that codecs are still reset
when the guest does a reset via ICH6_GCTL_RESET by using
device_cold_reset() (which resets all the devices on the qbus as well
as the device itself) instead of a direct call to the reset function.

This is a slight ordering change because the (only) codec reset now
happens before the controller registers etc are reset, rather than
once before and then once after, but the codec reset function
hda_audio_reset() doesn't care.

This lets us drop a use of device_legacy_reset(), which is
deprecated.

Signed-off-by: Peter Maydell 
Reviewed-by: Philippe Mathieu-Daudé 
Message-Id: <20221014142632.2092404-2-peter.mayd...@linaro.org>
Signed-off-by: Gerd Hoffmann 
---
 hw/audio/intel-hda.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/hw/audio/intel-hda.c b/hw/audio/intel-hda.c
index f38117057b9b..38cfa20262e2 100644
--- a/hw/audio/intel-hda.c
+++ b/hw/audio/intel-hda.c
@@ -516,7 +516,7 @@ static void intel_hda_notify_codecs(IntelHDAState *d, 
uint32_t stream, bool runn
 static void intel_hda_set_g_ctl(IntelHDAState *d, const IntelHDAReg *reg, 
uint32_t old)
 {
 if ((d->g_ctl & ICH6_GCTL_RESET) == 0) {
-intel_hda_reset(DEVICE(d));
+device_cold_reset(DEVICE(d));
 }
 }
 
@@ -1083,11 +1083,9 @@ static void intel_hda_reset(DeviceState *dev)
 intel_hda_regs_reset(d);
 d->wall_base_ns = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
 
-/* reset codecs */
 QTAILQ_FOREACH(kid, &d->codecs.qbus.children, sibling) {
 DeviceState *qdev = kid->child;
 cdev = HDA_CODEC_DEVICE(qdev);
-device_legacy_reset(DEVICE(cdev));
 d->state_sts |= (1 << cdev->cad);
 }
 intel_hda_update_irq(d);
-- 
2.38.1

[PULL 7/7] hw/audio/intel-hda: Drop unnecessary prototype

2022-11-23 Thread Gerd Hoffmann

From: Peter Maydell 

The only use of intel_hda_reset() is after its definition, so we
don't need to separately declare its prototype at the top of the
file; drop the unnecessary line.

Signed-off-by: Peter Maydell 
Reviewed-by: Philippe Mathieu-Daudé 
Message-Id: <20221014142632.2092404-3-peter.mayd...@linaro.org>
Signed-off-by: Gerd Hoffmann 
---
 hw/audio/intel-hda.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/hw/audio/intel-hda.c b/hw/audio/intel-hda.c
index 38cfa20262e2..b9ed231fe849 100644
--- a/hw/audio/intel-hda.c
+++ b/hw/audio/intel-hda.c
@@ -220,8 +220,6 @@ struct IntelHDAReg {
 void   (*rhandler)(IntelHDAState *d, const IntelHDAReg *reg);
 };
 
-static void intel_hda_reset(DeviceState *dev);
-
 /* - */
 
 static hwaddr intel_hda_addr(uint32_t lbase, uint32_t ubase)
-- 
2.38.1

[PULL 4/7] ui/gtk: prevent ui lock up when dpy_gl_update called again before current draw event occurs

2022-11-23 Thread Gerd Hoffmann

From: Dongwon Kim 

A warning, "qemu: warning: console: no gl-unblock within" followed by
guest scanout lockup can happen if dpy_gl_update is called in a row
and the second call is made before gd_draw_event scheduled by the first
call is taking place. This is because draw call returns without decrementing
gl_block ref count if the dmabuf was already submitted as shown below.

(gd_gl_area_draw/gd_egl_draw)

if (dmabuf) {
if (!dmabuf->draw_submitted) {
return;
} else {
dmabuf->draw_submitted = false;
}
}

So it should not schedule any redundant draw event in case draw_submitted is
already set in gd_egl_fluch/gd_gl_area_scanout_flush.

Cc: Gerd Hoffmann 
Cc: Vivek Kasireddy 
Signed-off-by: Dongwon Kim 
Reviewed-by: Marc-André Lureau 
Message-Id: <20221021192315.9110-1-dongwon@intel.com>
Signed-off-by: Gerd Hoffmann 
---
 ui/gtk-egl.c | 2 +-
 ui/gtk-gl-area.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/ui/gtk-egl.c b/ui/gtk-egl.c
index 35f917ceb15e..e84431790c9b 100644
--- a/ui/gtk-egl.c
+++ b/ui/gtk-egl.c
@@ -341,7 +341,7 @@ void gd_egl_flush(DisplayChangeListener *dcl,
 VirtualConsole *vc = container_of(dcl, VirtualConsole, gfx.dcl);
 GtkWidget *area = vc->gfx.drawing_area;
 
-if (vc->gfx.guest_fb.dmabuf) {
+if (vc->gfx.guest_fb.dmabuf && !vc->gfx.guest_fb.dmabuf->draw_submitted) {
 graphic_hw_gl_block(vc->gfx.dcl.con, true);
 vc->gfx.guest_fb.dmabuf->draw_submitted = true;
 gtk_widget_queue_draw_area(area, x, y, w, h);
diff --git a/ui/gtk-gl-area.c b/ui/gtk-gl-area.c
index 682638a197d2..7696df1f6bc4 100644
--- a/ui/gtk-gl-area.c
+++ b/ui/gtk-gl-area.c
@@ -278,7 +278,7 @@ void gd_gl_area_scanout_flush(DisplayChangeListener *dcl,
 {
 VirtualConsole *vc = container_of(dcl, VirtualConsole, gfx.dcl);
 
-if (vc->gfx.guest_fb.dmabuf) {
+if (vc->gfx.guest_fb.dmabuf && !vc->gfx.guest_fb.dmabuf->draw_submitted) {
 graphic_hw_gl_block(vc->gfx.dcl.con, true);
 vc->gfx.guest_fb.dmabuf->draw_submitted = true;
 }
-- 
2.38.1

[PULL 1/7] Revert "usbredir: avoid queuing hello packet on snapshot restore"

2022-11-23 Thread Gerd Hoffmann

From: Joelle van Dyne 

Run state is also in RUN_STATE_PRELAUNCH while "-S" is used.

This reverts commit 0631d4b448454ae8a1ab091c447e3f71ab6e088a

Signed-off-by: Joelle van Dyne 
Reviewed-by: Ján Tomko 

The original commit broke the usage of usbredir with libvirt, which
starts every domain with "-S".

This workaround is no longer needed because the usbredir behavior
has been fixed in the meantime:
https://gitlab.freedesktop.org/spice/usbredir/-/merge_requests/61

Signed-off-by: Ján Tomko 
Message-Id: 
<1689cec3eadcea87255e390cb236033aca72e168.1669193161.git.jto...@redhat.com>
Signed-off-by: Gerd Hoffmann 
---
 hw/usb/redirect.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/hw/usb/redirect.c b/hw/usb/redirect.c
index 1bd30efc3ef0..fd7df599bc0b 100644
--- a/hw/usb/redirect.c
+++ b/hw/usb/redirect.c
@@ -1280,8 +1280,7 @@ static void usbredir_create_parser(USBRedirDevice *dev)
 }
 #endif
 
-if (runstate_check(RUN_STATE_INMIGRATE) ||
-runstate_check(RUN_STATE_PRELAUNCH)) {
+if (runstate_check(RUN_STATE_INMIGRATE)) {
 flags |= usbredirparser_fl_no_hello;
 }
 usbredirparser_init(dev->parser, VERSION, caps, USB_REDIR_CAPS_SIZE,
-- 
2.38.1

[PULL 2/7] gtk: disable GTK Clipboard with a new meson option

2022-11-23 Thread Gerd Hoffmann

From: Claudio Fontana 

The GTK Clipboard implementation may cause guest hangs.

Therefore implement new configure switch: --enable-gtk-clipboard,

as a meson option disabled by default, which warns in the help
text about the experimental nature of the feature.
Regenerate the meson build options to include it.

The initialization of the clipboard is gtk.c, as well as the
compilation of gtk-clipboard.c are now conditional on this new
option to be set.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1150
Signed-off-by: Claudio Fontana 
Acked-by: Gerd Hoffmann 
Reviewed-by: Jim Fehlig 
Message-Id: <20221121135538.14625-1-cfont...@suse.de>
Signed-off-by: Gerd Hoffmann 
---
 meson_options.txt | 7 +++
 ui/gtk.c  | 2 ++
 meson.build   | 5 +
 scripts/meson-buildoptions.sh | 3 +++
 ui/meson.build| 5 -
 5 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/meson_options.txt b/meson_options.txt
index 66128178bffa..4b749ca54900 100644
--- a/meson_options.txt
+++ b/meson_options.txt
@@ -219,6 +219,13 @@ option('vnc_sasl', type : 'feature', value : 'auto',
description: 'SASL authentication for VNC server')
 option('vte', type : 'feature', value : 'auto',
description: 'vte support for the gtk UI')
+
+# GTK Clipboard implementation is disabled by default, since it may cause hangs
+# of the guest VCPUs. See gitlab issue 1150:
+# https://gitlab.com/qemu-project/qemu/-/issues/1150
+
+option('gtk_clipboard', type: 'feature', value : 'disabled',
+   description: 'clipboard support for the gtk UI (EXPERIMENTAL, MAY 
HANG)')
 option('xkbcommon', type : 'feature', value : 'auto',
description: 'xkbcommon support')
 option('zstd', type : 'feature', value : 'auto',
diff --git a/ui/gtk.c b/ui/gtk.c
index 7ec21f7798ef..4817623c8f3f 100644
--- a/ui/gtk.c
+++ b/ui/gtk.c
@@ -2403,7 +2403,9 @@ static void gtk_display_init(DisplayState *ds, 
DisplayOptions *opts)
 opts->u.gtk.show_tabs) {
 gtk_menu_item_activate(GTK_MENU_ITEM(s->show_tabs_item));
 }
+#ifdef CONFIG_GTK_CLIPBOARD
 gd_clipboard_init(s);
+#endif /* CONFIG_GTK_CLIPBOARD */
 }
 
 static void early_gtk_display_init(DisplayOptions *opts)
diff --git a/meson.build b/meson.build
index cf3e517e56d8..5c6b5a1c757f 100644
--- a/meson.build
+++ b/meson.build
@@ -1246,6 +1246,8 @@ endif
 gtk = not_found
 gtkx11 = not_found
 vte = not_found
+have_gtk_clipboard = get_option('gtk_clipboard').enabled()
+
 if not get_option('gtk').auto() or have_system
   gtk = dependency('gtk+-3.0', version: '>=3.22.0',
method: 'pkg-config',
@@ -1264,6 +1266,8 @@ if not get_option('gtk').auto() or have_system
required: get_option('vte'),
kwargs: static_kwargs)
 endif
+  elif have_gtk_clipboard
+error('GTK clipboard requested, but GTK not found')
   endif
 endif
 
@@ -1842,6 +1846,7 @@ if glusterfs.found()
 endif
 config_host_data.set('CONFIG_GTK', gtk.found())
 config_host_data.set('CONFIG_VTE', vte.found())
+config_host_data.set('CONFIG_GTK_CLIPBOARD', have_gtk_clipboard)
 config_host_data.set('CONFIG_LIBATTR', have_old_libattr)
 config_host_data.set('CONFIG_LIBCAP_NG', libcap_ng.found())
 config_host_data.set('CONFIG_EBPF', libbpf.found())
diff --git a/scripts/meson-buildoptions.sh b/scripts/meson-buildoptions.sh
index 2cb0de5601ef..aa6e30ea911e 100644
--- a/scripts/meson-buildoptions.sh
+++ b/scripts/meson-buildoptions.sh
@@ -93,6 +93,7 @@ meson_options_help() {
   printf "%s\n" '  glusterfs   Glusterfs block device driver'
   printf "%s\n" '  gnutls  GNUTLS cryptography support'
   printf "%s\n" '  gtk GTK+ user interface'
+  printf "%s\n" '  gtk-clipboard   clipboard support for GTK (EXPERIMENTAL, 
MAY HANG)'
   printf "%s\n" '  guest-agent Build QEMU Guest Agent'
   printf "%s\n" '  guest-agent-msi Build MSI package for the QEMU Guest Agent'
   printf "%s\n" '  hax HAX acceleration support'
@@ -274,6 +275,8 @@ _meson_option_parse() {
 --disable-gprof) printf "%s" -Dgprof=false ;;
 --enable-gtk) printf "%s" -Dgtk=enabled ;;
 --disable-gtk) printf "%s" -Dgtk=disabled ;;
+--enable-gtk-clipboard) printf "%s" -Dgtk_clipboard=enabled ;;
+--disable-gtk-clipboard) printf "%s" -Dgtk_clipboard=disabled ;;
 --enable-guest-agent) printf "%s" -Dguest_agent=enabled ;;
 --disable-guest-agent) printf "%s" -Dguest_agent=disabled ;;
 --enable-guest-agent-msi) printf "%s" -Dguest_agent_msi=enabled ;;
diff --git a/ui/meson.build b/ui/meson.build
index ec139497766a..c1b137bf330c 100644
--- a/ui/meson.build
+++ b/ui/meson.build
@@ -97,7 +97,10 @@ if gtk.found()
   softmmu_ss.add(when: 'CONFIG_WIN32', if_true: files('win32-kbd-hook.c'))
 
   gtk_ss = ss.source_set()
-  gtk_ss.add(gtk, vte, pixman, files('gtk.c', 'gtk-clipboard.c'))
+  gtk_ss.add(gtk, vte, pixman, files('gtk.c'))
+  if have_gtk_clipboard
+gtk_ss.add(files('gtk-clipboar

[PULL 0/7] Fixes 20221123 patches

2022-11-23 Thread Gerd Hoffmann

The following changes since commit 7c09a7f6ae1770d15535980d15dffdb23f4d9786:

  Update VERSION for v7.2.0-rc2 (2022-11-22 18:59:56 -0500)

are available in the Git repository at:

  https://gitlab.com/kraxel/qemu.git tags/fixes-20221123-pull-request

for you to fetch changes up to 7d3cf19548b7f9afd9d25c30dd1450aad7d1877d:

  hw/audio/intel-hda: Drop unnecessary prototype (2022-11-23 12:30:45 +0100)


ui+usb+audio: bugfixes for 7.2



Claudio Fontana (1):
  gtk: disable GTK Clipboard with a new meson option

Dongwon Kim (1):
  ui/gtk: prevent ui lock up when dpy_gl_update called again before
current draw event occurs

Joelle van Dyne (1):
  Revert "usbredir: avoid queuing hello packet on snapshot restore"

Michael Tokarev (1):
  hw/usb/hcd-xhci.c: spelling: tranfer

Peter Maydell (3):
  hw/usb/hcd-xhci: Reset the XHCIState with device_cold_reset()
  hw/audio/intel-hda: don't reset codecs twice
  hw/audio/intel-hda: Drop unnecessary prototype

 meson_options.txt | 7 +++
 hw/audio/intel-hda.c  | 6 +-
 hw/usb/hcd-xhci-pci.c | 2 +-
 hw/usb/hcd-xhci-sysbus.c  | 2 +-
 hw/usb/hcd-xhci.c | 2 +-
 hw/usb/redirect.c | 3 +--
 ui/gtk-egl.c  | 2 +-
 ui/gtk-gl-area.c  | 2 +-
 ui/gtk.c  | 2 ++
 meson.build   | 5 +
 scripts/meson-buildoptions.sh | 3 +++
 ui/meson.build| 5 -
 12 files changed, 28 insertions(+), 13 deletions(-)

-- 
2.38.1

[PULL 5/7] hw/usb/hcd-xhci: Reset the XHCIState with device_cold_reset()

2022-11-23 Thread Gerd Hoffmann

From: Peter Maydell 

Currently the hcd-xhci-pci and hcd-xhci-sysbus devices, which are
mostly wrappers around the TYPE_XHCI device, which is a direct
subclass of TYPE_DEVICE.  Since TYPE_DEVICE devices are not on any
qbus and do not get automatically reset, the wrapper devices both
reset the TYPE_XHCI device in their own reset functions.  However,
they do this using device_legacy_reset(), which will reset the device
itself but not any bus it has.

Switch to device_cold_reset(), which avoids using a deprecated
function and also propagates reset along any child buses.

Signed-off-by: Peter Maydell 
Message-Id: <20221014145423.2102706-1-peter.mayd...@linaro.org>
Signed-off-by: Gerd Hoffmann 
---
 hw/usb/hcd-xhci-pci.c| 2 +-
 hw/usb/hcd-xhci-sysbus.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/usb/hcd-xhci-pci.c b/hw/usb/hcd-xhci-pci.c
index e934b1a5b1fb..643d4643e4d6 100644
--- a/hw/usb/hcd-xhci-pci.c
+++ b/hw/usb/hcd-xhci-pci.c
@@ -85,7 +85,7 @@ static void xhci_pci_reset(DeviceState *dev)
 {
 XHCIPciState *s = XHCI_PCI(dev);
 
-device_legacy_reset(DEVICE(&s->xhci));
+device_cold_reset(DEVICE(&s->xhci));
 }
 
 static int xhci_pci_vmstate_post_load(void *opaque, int version_id)
diff --git a/hw/usb/hcd-xhci-sysbus.c b/hw/usb/hcd-xhci-sysbus.c
index a14e4381960e..faf57b47975d 100644
--- a/hw/usb/hcd-xhci-sysbus.c
+++ b/hw/usb/hcd-xhci-sysbus.c
@@ -29,7 +29,7 @@ void xhci_sysbus_reset(DeviceState *dev)
 {
 XHCISysbusState *s = XHCI_SYSBUS(dev);
 
-device_legacy_reset(DEVICE(&s->xhci));
+device_cold_reset(DEVICE(&s->xhci));
 }
 
 static void xhci_sysbus_realize(DeviceState *dev, Error **errp)
-- 
2.38.1

[PULL 3/7] hw/usb/hcd-xhci.c: spelling: tranfer

2022-11-23 Thread Gerd Hoffmann

From: Michael Tokarev 

Fixes: effaf5a240e03020f4ae953e10b764622c3e87cc
Signed-off-by: Michael Tokarev 
Reviewed-by: Thomas Huth 
Reviewed-by: Stefan Weil 
Message-Id: <20221105114851.306206-1-...@msgid.tls.msk.ru>
Signed-off-by: Gerd Hoffmann 
---
 hw/usb/hcd-xhci.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/usb/hcd-xhci.c b/hw/usb/hcd-xhci.c
index 8299f35e6695..b89b618ec210 100644
--- a/hw/usb/hcd-xhci.c
+++ b/hw/usb/hcd-xhci.c
@@ -796,7 +796,7 @@ static int xhci_ring_chain_length(XHCIState *xhci, const 
XHCIRing *ring)
  */
 } while (length < TRB_LINK_LIMIT * 65536 / TRB_SIZE);
 
-qemu_log_mask(LOG_GUEST_ERROR, "%s: exceeded maximum tranfer ring size!\n",
+qemu_log_mask(LOG_GUEST_ERROR, "%s: exceeded maximum transfer ring 
size!\n",
   __func__);
 
 return -1;
-- 
2.38.1

Re: [PATCH] gdbstub: move update guest debug to accel ops

2022-11-23 Thread Mads Ynddal




> On 23 Nov 2022, at 15.05, Alex Bennée  wrote:
> 
> Nice. Looks good to me but I'll have a proper look when I go through my
> gdbstub/next queue. I don't think this is critical for 7.2.
> 

Thanks, and I agree. It can easily wait.

[PULL 0/3] Avocado tests and qtests improvements

2022-11-23 Thread Thomas Huth

 Hi Stefan!

The following changes since commit 7c09a7f6ae1770d15535980d15dffdb23f4d9786:

  Update VERSION for v7.2.0-rc2 (2022-11-22 18:59:56 -0500)

are available in the Git repository at:

  https://gitlab.com/thuth/qemu.git tags/pull-request-2022-11-23

for you to fetch changes up to 4189af72dd6fa74e2253f16c8078be52e55eb80e:

  tests/avocado: use new rootfs for orangepi test (2022-11-23 10:58:48 +0100)


* Shorten the amount of text from the qos-test to avoid hitting
  output size limits in the gitlab CI
* Update URLs of avocado tests


Alex Bennée (1):
  tests/avocado: use new rootfs for orangepi test

Thomas Huth (2):
  tests/avocado: Update the URLs of the advent calendar images
  tests/qtest: Decrease the amount of output from the qom-test

 tests/qtest/qom-test.c  | 22 +++---
 tests/avocado/boot_linux_console.py |  8 +++
 tests/avocado/machine_arm_canona1100.py |  4 ++--
 tests/avocado/machine_microblaze.py |  4 ++--
 tests/avocado/machine_sparc64_sun4u.py  |  4 ++--
 tests/avocado/ppc_mpc8544ds.py  |  6 ++---
 tests/avocado/ppc_virtex_ml507.py   |  6 ++---
 tests/avocado/replay_kernel.py  | 40 -
 8 files changed, 55 insertions(+), 39 deletions(-)

[PULL 1/3] tests/avocado: Update the URLs of the advent calendar images

2022-11-23 Thread Thomas Huth

The qemu-advent-calendar.org server will be decommissioned soon.
I've mirrored the images that we use for the QEMU CI to gitlab,
so update their URLs to point to the new location.

Message-Id: <20221121102436.78635-1-th...@redhat.com>
Tested-by: Philippe Mathieu-Daudé 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Thomas Huth 
---
 tests/avocado/boot_linux_console.py |  4 +--
 tests/avocado/machine_arm_canona1100.py |  4 +--
 tests/avocado/machine_microblaze.py |  4 +--
 tests/avocado/machine_sparc64_sun4u.py  |  4 +--
 tests/avocado/ppc_mpc8544ds.py  |  6 ++--
 tests/avocado/ppc_virtex_ml507.py   |  6 ++--
 tests/avocado/replay_kernel.py  | 40 -
 7 files changed, 34 insertions(+), 34 deletions(-)

diff --git a/tests/avocado/boot_linux_console.py 
b/tests/avocado/boot_linux_console.py
index 4c9d551f47..f3e6f44ae9 100644
--- a/tests/avocado/boot_linux_console.py
+++ b/tests/avocado/boot_linux_console.py
@@ -1029,8 +1029,8 @@ def test_m68k_q800(self):
 self.wait_for_console_pattern(console_pattern)
 
 def do_test_advcal_2018(self, day, tar_hash, kernel_name, console=0):
-tar_url = ('https://www.qemu-advent-calendar.org'
-   '/2018/download/day' + day + '.tar.xz')
+tar_url = ('https://qemu-advcal.gitlab.io'
+   '/qac-best-of-multiarch/download/day' + day + '.tar.xz')
 file_path = self.fetch_asset(tar_url, asset_hash=tar_hash)
 archive.extract(file_path, self.workdir)
 self.vm.set_console(console_index=console)
diff --git a/tests/avocado/machine_arm_canona1100.py 
b/tests/avocado/machine_arm_canona1100.py
index 182a0b0513..a42d8b0f2b 100644
--- a/tests/avocado/machine_arm_canona1100.py
+++ b/tests/avocado/machine_arm_canona1100.py
@@ -23,8 +23,8 @@ def test_arm_canona1100(self):
 :avocado: tags=machine:canon-a1100
 :avocado: tags=device:pflash_cfi02
 """
-tar_url = ('https://www.qemu-advent-calendar.org'
-   '/2018/download/day18.tar.xz')
+tar_url = ('https://qemu-advcal.gitlab.io'
+   '/qac-best-of-multiarch/download/day18.tar.xz')
 tar_hash = '068b5fc4242b29381acee94713509f8a876e9db6'
 file_path = self.fetch_asset(tar_url, asset_hash=tar_hash)
 archive.extract(file_path, self.workdir)
diff --git a/tests/avocado/machine_microblaze.py 
b/tests/avocado/machine_microblaze.py
index 4928920f96..8d0efff30d 100644
--- a/tests/avocado/machine_microblaze.py
+++ b/tests/avocado/machine_microblaze.py
@@ -19,8 +19,8 @@ def test_microblaze_s3adsp1800(self):
 :avocado: tags=machine:petalogix-s3adsp1800
 """
 
-tar_url = ('https://www.qemu-advent-calendar.org'
-   '/2018/download/day17.tar.xz')
+tar_url = ('https://qemu-advcal.gitlab.io'
+   '/qac-best-of-multiarch/download/day17.tar.xz')
 tar_hash = '08bf3e3bfb6b6c7ce1e54ab65d54e189f2caf13f'
 file_path = self.fetch_asset(tar_url, asset_hash=tar_hash)
 archive.extract(file_path, self.workdir)
diff --git a/tests/avocado/machine_sparc64_sun4u.py 
b/tests/avocado/machine_sparc64_sun4u.py
index 458165500e..d333c0ae91 100644
--- a/tests/avocado/machine_sparc64_sun4u.py
+++ b/tests/avocado/machine_sparc64_sun4u.py
@@ -24,8 +24,8 @@ def test_sparc64_sun4u(self):
 :avocado: tags=arch:sparc64
 :avocado: tags=machine:sun4u
 """
-tar_url = ('https://www.qemu-advent-calendar.org'
-   '/2018/download/day23.tar.xz')
+tar_url = ('https://qemu-advcal.gitlab.io'
+   '/qac-best-of-multiarch/download/day23.tar.xz')
 tar_hash = '142db83cd974ffadc4f75c8a5cad5bcc5722c240'
 file_path = self.fetch_asset(tar_url, asset_hash=tar_hash)
 archive.extract(file_path, self.workdir)
diff --git a/tests/avocado/ppc_mpc8544ds.py b/tests/avocado/ppc_mpc8544ds.py
index 8d6a749201..b599fb1cc9 100644
--- a/tests/avocado/ppc_mpc8544ds.py
+++ b/tests/avocado/ppc_mpc8544ds.py
@@ -22,9 +22,9 @@ def test_ppc_mpc8544ds(self):
 :avocado: tags=accel:tcg
 """
 self.require_accelerator("tcg")
-tar_url = ('https://www.qemu-advent-calendar.org'
-   '/2020/download/day17.tar.gz')
-tar_hash = '7a5239542a7c4257aa4d3b7f6ddf08fb6775c494'
+tar_url = ('https://qemu-advcal.gitlab.io'
+   '/qac-best-of-multiarch/download/day04.tar.xz')
+tar_hash = 'f46724d281a9f30fa892d458be7beb7d34dc25f9'
 file_path = self.fetch_asset(tar_url, asset_hash=tar_hash)
 archive.extract(file_path, self.workdir)
 self.vm.set_console()
diff --git a/tests/avocado/ppc_virtex_ml507.py 
b/tests/avocado/ppc_virtex_ml507.py
index 6b07686b56..a73f8ae396 100644
--- a/tests/avocado/ppc_virtex_ml507.py
+++ b/tests/avocado/ppc_virtex_ml507.py
@@ -22,9 +22,9 @@ def test_ppc_virtex_ml507(self):
 :avocado: tags=accel:tcg
 """

[PULL 2/3] tests/qtest: Decrease the amount of output from the qom-test

2022-11-23 Thread Thomas Huth

The logs in the gitlab-CI have a size constraint, and sometimes
we already hit this limit. The biggest part of the log then seems
to be filled by the qom-test, so we should decrease the size of
the output - which can be done easily by not printing the path
for each property, since the path has already been logged at the
beginning of each node that we handle here.

However, if we omit the path, we should make sure to not recurse
into child nodes in between, so that it is clear to which node
each property belongs. Thus store the children and links in a
temporary list and recurse only at the end of each node, when
all properties have already been printed.

Message-Id: <20221121194240.149268-1-th...@redhat.com>
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Thomas Huth 
---
 tests/qtest/qom-test.c | 22 +++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/tests/qtest/qom-test.c b/tests/qtest/qom-test.c
index 7b871b2a31..13510bc349 100644
--- a/tests/qtest/qom-test.c
+++ b/tests/qtest/qom-test.c
@@ -20,6 +20,7 @@ static void test_properties(QTestState *qts, const char 
*path, bool recurse)
 QDict *response, *tuple, *tmp;
 QList *list;
 QListEntry *entry;
+GSList *children = NULL, *links = NULL;
 
 g_test_message("Obtaining properties of %s", path);
 response = qtest_qmp(qts, "{ 'execute': 'qom-list',"
@@ -41,11 +42,14 @@ static void test_properties(QTestState *qts, const char 
*path, bool recurse)
 if (is_child || is_link) {
 child_path = g_strdup_printf("%s/%s",
  path, qdict_get_str(tuple, "name"));
-test_properties(qts, child_path, is_child);
-g_free(child_path);
+if (is_child) {
+children = g_slist_prepend(children, child_path);
+} else {
+links = g_slist_prepend(links, child_path);
+}
 } else {
 const char *prop = qdict_get_str(tuple, "name");
-g_test_message("Testing property %s.%s", path, prop);
+g_test_message("-> %s", prop);
 tmp = qtest_qmp(qts,
 "{ 'execute': 'qom-get',"
 "  'arguments': { 'path': %s, 'property': %s } }",
@@ -55,6 +59,18 @@ static void test_properties(QTestState *qts, const char 
*path, bool recurse)
 qobject_unref(tmp);
 }
 }
+
+while (links) {
+test_properties(qts, links->data, false);
+g_free(links->data);
+links = g_slist_delete_link(links, links);
+}
+while (children) {
+test_properties(qts, children->data, true);
+g_free(children->data);
+children = g_slist_delete_link(children, children);
+}
+
 qobject_unref(response);
 }
 
-- 
2.31.1

[PULL 3/3] tests/avocado: use new rootfs for orangepi test

2022-11-23 Thread Thomas Huth

From: Alex Bennée 

The old URL wasn't stable. I suspect the current URL will only be
stable for a few months so maybe we need another strategy for hosting
rootfs snapshots?

Signed-off-by: Alex Bennée 
Message-Id: <20221118113309.1057790-1-alex.ben...@linaro.org>
Signed-off-by: Thomas Huth 
---
 tests/avocado/boot_linux_console.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tests/avocado/boot_linux_console.py 
b/tests/avocado/boot_linux_console.py
index f3e6f44ae9..ec07c64291 100644
--- a/tests/avocado/boot_linux_console.py
+++ b/tests/avocado/boot_linux_console.py
@@ -793,8 +793,8 @@ def test_arm_orangepi_sd(self):
 dtb_path = 
'/usr/lib/linux-image-current-sunxi/sun8i-h3-orangepi-pc.dtb'
 dtb_path = self.extract_from_deb(deb_path, dtb_path)
 rootfs_url = ('http://storage.kernelci.org/images/rootfs/buildroot/'
-  'kci-2019.02/armel/base/rootfs.ext2.xz')
-rootfs_hash = '692510cb625efda31640d1de0a8d60e26040f061'
+  'buildroot-baseline/20221116.0/armel/rootfs.ext2.xz')
+rootfs_hash = 'fae32f337c7b87547b10f42599acf109da8b6d9a'
 rootfs_path_xz = self.fetch_asset(rootfs_url, asset_hash=rootfs_hash)
 rootfs_path = os.path.join(self.workdir, 'rootfs.cpio')
 archive.lzma_uncompress(rootfs_path_xz, rootfs_path)
-- 
2.31.1

Re: [PATCH] vhost: mask VIRTIO_F_RING_RESET for vhost and vhost-user devices

2022-11-23 Thread Stefano Garzarella


On Tue, Nov 22, 2022 at 07:01:38PM +0100, Eugenio Perez Martin wrote:

On Tue, Nov 22, 2022 at 4:13 AM Jason Wang  wrote:


On Mon, Nov 21, 2022 at 6:11 PM Stefano Garzarella  wrote:
>
> Commit 69e1c14aa2 ("virtio: core: vq reset feature negotation support")
> enabled VIRTIO_F_RING_RESET by default for all virtio devices.
>
> This feature is not currently emulated by QEMU, so for vhost and
> vhost-user devices we need to make sure it is supported by the offloaded
> device emulation (in-kernel or in another process).
> To do this we need to add VIRTIO_F_RING_RESET to the features bitmap
> passed to vhost_get_features(). This way it will be masked if the device
> does not support it.
>
> This issue was initially discovered with vhost-vsock and vhost-user-vsock,
> and then also tested with vhost-user-rng which confirmed the same issue.
> They fail when sending features through VHOST_SET_FEATURES ioctl or
> VHOST_USER_SET_FEATURES message, since VIRTIO_F_RING_RESET is negotiated
> by the guest (Linux >= v6.0), but not supported by the device.
>
> Fixes: 69e1c14aa2 ("virtio: core: vq reset feature negotation support")
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1318
> Signed-off-by: Stefano Garzarella 

Acked-by: Jason Wang 

> ---
>
> To prevent this problem in the future, perhaps we should provide a function
> (e.g. vhost_device_get_features) where we go to mask all non-device-specific
> features (e.g VIRTIO_F_*, VIRTIO_RING_F_*) that are not emulated by QEMU but
> we expect them to be emulated by the vhost or vhost-user devices.

Adding Eugenio, this might not be true if we want to use shadow
virtqueue for emulating some features?



I think we should be able to introduce that in the (hypothetical)
vhost_device_get_features if SVQ starts emulating a ring feature,
isn't it?


Yep, I think so. The idea is to have a single place where to do it.


I think a first version not aware of SVQ is fine anyway, and
to introduce it later should be easy.


Thanks for the feedbacks, I'll keep you CCed.

Thanks,
Stefano

Re: [RFC PATCH] tests/avocado: use new rootfs for orangepi test

2022-11-23 Thread Alex Bennée



Thomas Huth  writes:

> On 23/11/2022 12.15, Philippe Mathieu-Daudé wrote:
>> On 18/11/22 12:33, Alex Bennée wrote:
>>> The old URL wasn't stable. I suspect the current URL will only be
>>> stable for a few months so maybe we need another strategy for hosting
>>> rootfs snapshots?
>>>
>>> Signed-off-by: Alex Bennée 
>>> ---
>>>   tests/avocado/boot_linux_console.py | 4 ++--
>>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/tests/avocado/boot_linux_console.py
>>> b/tests/avocado/boot_linux_console.py
>>> index 4c9d551f47..5a2923c423 100644
>>> --- a/tests/avocado/boot_linux_console.py
>>> +++ b/tests/avocado/boot_linux_console.py
>>> @@ -793,8 +793,8 @@ def test_arm_orangepi_sd(self):
>>>   dtb_path =
>>> '/usr/lib/linux-image-current-sunxi/sun8i-h3-orangepi-pc.dtb'
>>>   dtb_path = self.extract_from_deb(deb_path, dtb_path)
>>>   rootfs_url =
>>> ('http://storage.kernelci.org/images/rootfs/buildroot/'
>>> -  'kci-2019.02/armel/base/rootfs.ext2.xz')
>>> -    rootfs_hash = '692510cb625efda31640d1de0a8d60e26040f061'
>>> +  'buildroot-baseline/20221116.0/armel/rootfs.ext2.xz')
>>> +    rootfs_hash = 'fae32f337c7b87547b10f42599acf109da8b6d9a'
>> If Avocado doesn't find an artifact in its local cache, it will fetch it
>> from the URL.
>> The cache might be populated with artifacts previously downloaded, but
>> their URL is not valid anymore (my case for many tests).
>> We can also add artifacts manually, see [1].
>> I'd rather keep pre-existing tests if possible, to test older
>> (kernel / user-space) images. We don't need to run all the tests all
>> the time:
>> tests can be filtered by tags (see [2]).
>> My preference here is to refactor this test, adding the
>> "kci-2019.02"
>> and "baseline-20221116.0" releases. I can prepare the patch if you /
>> Thomas don't object.
>
> IMHO we shouldn't keep tests in the upstream git repository where the
> binaries are not available in public anymore. They won't get run by
> new contributors anymore, and also could vanish from the disks of the
> people who previously downloaded it, once they wipe their cache or
> upgrade to a new installation, so the test code will sooner or later
> be bitrotting. But if you want to keep the tests around on your hard
> disk, you could also stick the test in a local branch on your hard
> disk instead.

CI/Workstation splits aside I tend to agree with Thomas here that having
tests no one else can run will lead to an accretion of broken tests.
Given the tests themselves are standalone couldn't the prospective test
hoarder keep their own personal repository to be run with the rest of the
in-tree code, something like:

  cd my/test/zoo/repo
  $(QEMU_BUILD)/tests/venv/bin/avocado run my_test_zoo.py

for convenience we could maybe support an env variable so the existing
test selection tags would work:

  set -x QEMU_AVOCADO_EXTRA_TESTS /my/test/zoo/repo
  ./tests/venv/bin/avocado list
  ...
  
  ...

?

> The other possibility is to upload the binaries to a new public
> location in the web ... but for software that contains GPLed software,
> you should then also make sure to provide the source code to comply
> with the license.

This is the traditional reason we've lent so hard on external hosting
for binaries because the upstream doesn't want the hassle of maintaining
that sort of zoo of binaries. That said we have tests where binaries are
served from fileserver.linaro.org but its then only my problem to deal
with GPL requirements and not the upstream.

-- 
Alex Bennée

Re: [PATCH v6 11/11] tests/qtest: Enable qtest build on Windows

2022-11-23 Thread Thomas Huth


On 23/11/2022 15.13, Marc-André Lureau wrote:

Hi Bin

On Fri, Oct 28, 2022 at 9:06 AM Bin Meng  wrote:


Now that we have fixed various test case issues as seen when running
on Windows, let's enable the qtest build on Windows.

Signed-off-by: Bin Meng 
Reviewed-by: Thomas Huth 


We haven't solved the CI timing out or eating all the CPU time, right?

Can we simply exclude it from CI for now, ie add to this patch

diff --git a/.gitlab-ci.d/windows.yml b/.gitlab-ci.d/windows.yml
index 093276ddbc..ba9045ec38 100644
--- a/.gitlab-ci.d/windows.yml
+++ b/.gitlab-ci.d/windows.yml
@@ -62,7 +62,7 @@ msys2-64bit:
- .\msys64\usr\bin\bash -lc './configure --target-list=x86_64-softmmu
--enable-capstone'
- .\msys64\usr\bin\bash -lc 'make'
-  - .\msys64\usr\bin\bash -lc 'make check || { cat
build/meson-logs/testlog.txt; exit 1; } ;'
+  - .\msys64\usr\bin\bash -lc 'make check MTESTARGS="--no-suite
qtest" || { cat build/meson-logs/testlog.txt; exit 1; } ;'

  msys2-32bit:
extends: .shared_msys2_builder
@@ -96,4 +96,4 @@ msys2-32bit:
- cd output
- ..\msys64\usr\bin\bash -lc "../configure --target-list=ppc64-softmmu"
- ..\msys64\usr\bin\bash -lc 'make'
-  - ..\msys64\usr\bin\bash -lc 'make check || { cat
meson-logs/testlog.txt; exit 1; } ;'
+  - ..\msys64\usr\bin\bash -lc 'make check MTESTARGS="--no-suite
qtest" || { cat meson-logs/testlog.txt; exit 1; } ;'


I think it's only the 64-bit job that is really problematic, so we could 
still run the qtests in the 32-bit job?


Alternatively, what about switching the 64-bit to another target that does 
not have so many qtests enabled? Some mips-softmmu or riscv-softmmu maybe? 
... we still check x86_64-softmmu in the .cirrus.yml builds, so this is 
hopefully not such a big loss...


 Thomas

Re: [PATCH v6 11/11] tests/qtest: Enable qtest build on Windows

2022-11-23 Thread Marc-André Lureau

Hi Bin

On Fri, Oct 28, 2022 at 9:06 AM Bin Meng  wrote:
>
> Now that we have fixed various test case issues as seen when running
> on Windows, let's enable the qtest build on Windows.
>
> Signed-off-by: Bin Meng 
> Reviewed-by: Thomas Huth 

We haven't solved the CI timing out or eating all the CPU time, right?

Can we simply exclude it from CI for now, ie add to this patch

diff --git a/.gitlab-ci.d/windows.yml b/.gitlab-ci.d/windows.yml
index 093276ddbc..ba9045ec38 100644
--- a/.gitlab-ci.d/windows.yml
+++ b/.gitlab-ci.d/windows.yml
@@ -62,7 +62,7 @@ msys2-64bit:
   - .\msys64\usr\bin\bash -lc './configure --target-list=x86_64-softmmu
   --enable-capstone'
   - .\msys64\usr\bin\bash -lc 'make'
-  - .\msys64\usr\bin\bash -lc 'make check || { cat
build/meson-logs/testlog.txt; exit 1; } ;'
+  - .\msys64\usr\bin\bash -lc 'make check MTESTARGS="--no-suite
qtest" || { cat build/meson-logs/testlog.txt; exit 1; } ;'

 msys2-32bit:
   extends: .shared_msys2_builder
@@ -96,4 +96,4 @@ msys2-32bit:
   - cd output
   - ..\msys64\usr\bin\bash -lc "../configure --target-list=ppc64-softmmu"
   - ..\msys64\usr\bin\bash -lc 'make'
-  - ..\msys64\usr\bin\bash -lc 'make check || { cat
meson-logs/testlog.txt; exit 1; } ;'
+  - ..\msys64\usr\bin\bash -lc 'make check MTESTARGS="--no-suite
qtest" || { cat meson-logs/testlog.txt; exit 1; } ;'


Could you resubmit your missing win test patches and check if gitlab is happy?

thanks

>
> ---
>
> Changes in v5:
> - Drop patches that are already merged
>
> Changes in v3:
> - Drop the host test
>
> Changes in v2:
> - new patch: "tests/qtest: Enable qtest build on Windows"
>
>  tests/qtest/meson.build | 6 --
>  1 file changed, 6 deletions(-)
>
> diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build
> index c07a5b1a5f..f0ebb5fac6 100644
> --- a/tests/qtest/meson.build
> +++ b/tests/qtest/meson.build
> @@ -1,9 +1,3 @@
> -# All QTests for now are POSIX-only, but the dependencies are
> -# really in libqtest, not in the testcases themselves.
> -if not config_host.has_key('CONFIG_POSIX')
> -  subdir_done()
> -endif
> -
>  slow_qtests = {
>'ahci-test' : 60,
>'bios-tables-test' : 120,
> --
> 2.25.1
>
>


-- 
Marc-André Lureau

Re: [PATCH v2] Drop more useless casts from void * to pointer

2022-11-23 Thread Daniel P . Berrangé

On Wed, Nov 23, 2022 at 02:51:49PM +0100, BALATON Zoltan wrote:
> On Wed, 23 Nov 2022, Markus Armbruster wrote:
> > Signed-off-by: Markus Armbruster 
> > Reviewed-by: Laurent Vivier 
> > ---
> > v2:
> > * PATCH 1+2 merged as commit 0a553c12c7 and 3f7febc937
> > * PATCH 3 change to util/coroutine-ucontext.c dropped [Laurent]
> > 
> > bsd-user/elfload.c  | 2 +-
> > contrib/plugins/cache.c | 8 
> > contrib/vhost-user-blk/vhost-user-blk.c | 2 +-
> > hw/core/qdev-clock.c| 2 +-
> > hw/hyperv/vmbus.c   | 2 +-
> > hw/net/cadence_gem.c| 2 +-
> > hw/net/virtio-net.c | 2 +-
> > hw/nvme/ctrl.c  | 4 ++--
> > hw/rdma/vmw/pvrdma_cmd.c| 9 +++--
> > hw/rdma/vmw/pvrdma_qp_ops.c | 6 +++---
> > hw/virtio/virtio-iommu.c| 3 +--
> > linux-user/syscall.c| 2 +-
> > target/i386/hax/hax-all.c   | 2 +-
> > tests/tcg/aarch64/system/semiheap.c | 4 ++--
> > util/vfio-helpers.c | 2 +-
> > 15 files changed, 24 insertions(+), 28 deletions(-)
> > 
> > diff --git a/bsd-user/elfload.c b/bsd-user/elfload.c
> > index f8edb22f2a..fbcdc94b96 100644
> > --- a/bsd-user/elfload.c
> > +++ b/bsd-user/elfload.c
> > @@ -156,7 +156,7 @@ static abi_ulong copy_elf_strings(int argc, char 
> > **argv, void **page,
> > --p; --tmp; --len;
> > if (--offset < 0) {
> > offset = p % TARGET_PAGE_SIZE;
> > -pag = (char *)page[p / TARGET_PAGE_SIZE];
> > +pag = page[p / TARGET_PAGE_SIZE];
> 
> I think arithmetic on void pointer was undefined at least in the past so
> some compilers may warn for it but not sure if this is still the case for
> the compilers we care about. Apparently not if this now compiles but that
> explains why this cast was not useless. Found some more info on this here:
> 
> https://stackoverflow.com/questions/3523145/pointer-arithmetic-for-void-pointer-in-c

QEMU explicitly only targets GCC + Clang, so portability to other
compilers is not required.

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [PATCH] gdbstub: move update guest debug to accel ops

2022-11-23 Thread Alex Bennée



Mads Ynddal  writes:

> From: Mads Ynddal 
>
> Continuing the refactor of a48e7d9e52 (gdbstub: move guest debug support
> check to ops) by removing hardcoded kvm_enabled() from generic cpu.c
> code, and replace it with a property of AccelOpsClass.
>
> Signed-off-by: Mads Ynddal 

Nice. Looks good to me but I'll have a proper look when I go through my
gdbstub/next queue. I don't think this is critical for 7.2.

> ---
>  accel/kvm/kvm-accel-ops.c  |  1 +
>  cpu.c  | 10 +++---
>  include/sysemu/accel-ops.h |  1 +
>  3 files changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/accel/kvm/kvm-accel-ops.c b/accel/kvm/kvm-accel-ops.c
> index fbf4fe3497..6ebf9a644f 100644
> --- a/accel/kvm/kvm-accel-ops.c
> +++ b/accel/kvm/kvm-accel-ops.c
> @@ -99,6 +99,7 @@ static void kvm_accel_ops_class_init(ObjectClass *oc, void 
> *data)
>  ops->synchronize_pre_loadvm = kvm_cpu_synchronize_pre_loadvm;
>  
>  #ifdef KVM_CAP_SET_GUEST_DEBUG
> +ops->update_guest_debug = kvm_update_guest_debug;
>  ops->supports_guest_debug = kvm_supports_guest_debug;
>  ops->insert_breakpoint = kvm_insert_breakpoint;
>  ops->remove_breakpoint = kvm_remove_breakpoint;
> diff --git a/cpu.c b/cpu.c
> index 2a09b05205..ef433a79e3 100644
> --- a/cpu.c
> +++ b/cpu.c
> @@ -31,8 +31,8 @@
>  #include "hw/core/sysemu-cpu-ops.h"
>  #include "exec/address-spaces.h"
>  #endif
> +#include "sysemu/cpus.h"
>  #include "sysemu/tcg.h"
> -#include "sysemu/kvm.h"
>  #include "sysemu/replay.h"
>  #include "exec/cpu-common.h"
>  #include "exec/exec-all.h"
> @@ -378,10 +378,14 @@ void cpu_breakpoint_remove_all(CPUState *cpu, int mask)
>  void cpu_single_step(CPUState *cpu, int enabled)
>  {
>  if (cpu->singlestep_enabled != enabled) {
> +const AccelOpsClass *ops = cpus_get_accel();
> +
>  cpu->singlestep_enabled = enabled;
> -if (kvm_enabled()) {
> -kvm_update_guest_debug(cpu, 0);
> +
> +if (ops->update_guest_debug) {
> +ops->update_guest_debug(cpu, 0);
>  }
> +
>  trace_breakpoint_singlestep(cpu->cpu_index, enabled);
>  }
>  }
> diff --git a/include/sysemu/accel-ops.h b/include/sysemu/accel-ops.h
> index 8cc7996def..0a47a2f00c 100644
> --- a/include/sysemu/accel-ops.h
> +++ b/include/sysemu/accel-ops.h
> @@ -48,6 +48,7 @@ struct AccelOpsClass {
>  
>  /* gdbstub hooks */
>  bool (*supports_guest_debug)(void);
> +int (*update_guest_debug)(CPUState *cpu, unsigned long flags);
>  int (*insert_breakpoint)(CPUState *cpu, int type, hwaddr addr, hwaddr 
> len);
>  int (*remove_breakpoint)(CPUState *cpu, int type, hwaddr addr, hwaddr 
> len);
>  void (*remove_all_breakpoints)(CPUState *cpu);


-- 
Alex Bennée

Re: [PATCH v2] Drop more useless casts from void * to pointer

2022-11-23 Thread BALATON Zoltan


On Wed, 23 Nov 2022, Markus Armbruster wrote:

Signed-off-by: Markus Armbruster 
Reviewed-by: Laurent Vivier 
---
v2:
* PATCH 1+2 merged as commit 0a553c12c7 and 3f7febc937
* PATCH 3 change to util/coroutine-ucontext.c dropped [Laurent]

bsd-user/elfload.c  | 2 +-
contrib/plugins/cache.c | 8 
contrib/vhost-user-blk/vhost-user-blk.c | 2 +-
hw/core/qdev-clock.c| 2 +-
hw/hyperv/vmbus.c   | 2 +-
hw/net/cadence_gem.c| 2 +-
hw/net/virtio-net.c | 2 +-
hw/nvme/ctrl.c  | 4 ++--
hw/rdma/vmw/pvrdma_cmd.c| 9 +++--
hw/rdma/vmw/pvrdma_qp_ops.c | 6 +++---
hw/virtio/virtio-iommu.c| 3 +--
linux-user/syscall.c| 2 +-
target/i386/hax/hax-all.c   | 2 +-
tests/tcg/aarch64/system/semiheap.c | 4 ++--
util/vfio-helpers.c | 2 +-
15 files changed, 24 insertions(+), 28 deletions(-)

diff --git a/bsd-user/elfload.c b/bsd-user/elfload.c
index f8edb22f2a..fbcdc94b96 100644
--- a/bsd-user/elfload.c
+++ b/bsd-user/elfload.c
@@ -156,7 +156,7 @@ static abi_ulong copy_elf_strings(int argc, char **argv, 
void **page,
--p; --tmp; --len;
if (--offset < 0) {
offset = p % TARGET_PAGE_SIZE;
-pag = (char *)page[p / TARGET_PAGE_SIZE];
+pag = page[p / TARGET_PAGE_SIZE];


I think arithmetic on void pointer was undefined at least in the past so 
some compilers may warn for it but not sure if this is still the case for 
the compilers we care about. Apparently not if this now compiles but that 
explains why this cast was not useless. Found some more info on this here:


https://stackoverflow.com/questions/3523145/pointer-arithmetic-for-void-pointer-in-c

Regards,
BALATON Zoltan

Re: [PATCH v5 1/2] sysemu: tpm: Add a stub function for TPM_IS_CRB

2022-11-23 Thread Michael S. Tsirkin

On Wed, Nov 23, 2022 at 02:01:32PM +0100, Eric Auger wrote:
> 
> 
> On 11/23/22 12:24, Michael S. Tsirkin wrote:
> > On Wed, Nov 23, 2022 at 12:10:09PM +0100, Eric Auger wrote:
> >>
> >> On 11/23/22 10:30, Michael S. Tsirkin wrote:
> >>> On Wed, Nov 23, 2022 at 09:18:39AM +0100, Eric Auger wrote:
>  Hi,
> 
>  On 11/23/22 07:36, Michael S. Tsirkin wrote:
> > On Fri, May 06, 2022 at 09:47:52AM -0400, Stefan Berger wrote:
> >> On 5/6/22 09:25, Eric Auger wrote:
> >>> In a subsequent patch, VFIO will need to recognize if
> >>> a memory region owner is a TPM CRB device. Hence VFIO
> >>> needs to use TPM_IS_CRB() even if CONFIG_TPM is unset. So
> >>> let's add a stub function.
> >>>
> >>> Signed-off-by: Eric Auger 
> >>> Suggested-by: Cornelia Huck 
> >> Reviewed-by: Stefan Berger 
> > ... and now in 7.2 vdpa needs a dependency on tpm too, what a hack :(
> > And what exactly is it about TPM CRB that everyone needs to
> > know about it and skip it? The API does not tell ...
>  An excerpt of one reply I made at that time:
> 
>  The spec (CG PC Client Platform TPM Profile (PTP)
>      Specification Family “2.0” Level 00 Revision 01.03 v22, page 100) 
>  says that the command/response data "may be defined as large as 3968",
>  which is (0x1000 - 0x80), 0x80 being the size of the control struct.
>  so the size of the region logically is less than a 4kB page, hence our
>  trouble.
> 
>  We learnt in the past Windows driver has some stronger expectation wrt
>  memory mapping. I don't know if those latter would work if we were to
>  enlarge the window by some tricks.
> 
>  https://trustedcomputinggroup.org/wp-content/uploads/Mobile-Command-Response-Buffer-Interface-v2-r12-Specification_FINAL2.pdf
>  says
> 
>  "
>  Including the control structure, the three memory areas comprise the
>  entirety of the CRB. There are no constraints on how those three memory
>  areas are provided. They can all be in system RAM, or all be in device
>  memory, or any combination.
> 
>  Thanks
> 
>  Eric
> >>> So we put it in system RAM then? But why isn't DMA there allowed?
> >> I don't think there is any need and since it violates the alignment
> >> check in VFIO we discard the region from DMA mapped ones.
> >>
> >> Thanks
> >>
> >> Eric
> > If that's all then we could just check alignment -
> > why are we bothering with a tpm specific hack?
> I think Alex prefered to avoid silently skipping the DMA mapping of a
> region (a possible scenario may be invalid P2P DMA access?). Except if
> we know this region can be safely ignored, which is the case for the TPM
> CRB, hence this whitelist.
> 
> Eric

As a vdpa maintainer I might know (more like trust) TPM can be safely
ignored right now, but for sure I won't know if that ever changes nor
will I remember why down the road. Nor will TPM maintainers remember to
go poke at vdpa if this changes.


> 
> >
> >
> >>> ---
> >>>   include/sysemu/tpm.h | 6 ++
> >>>   1 file changed, 6 insertions(+)
> >>>
> >>> diff --git a/include/sysemu/tpm.h b/include/sysemu/tpm.h
> >>> index 68b2206463c..fb40e30ff60 100644
> >>> --- a/include/sysemu/tpm.h
> >>> +++ b/include/sysemu/tpm.h
> >>> @@ -80,6 +80,12 @@ static inline TPMVersion tpm_get_version(TPMIf *ti)
> >>>   #define tpm_init()  (0)
> >>>   #define tpm_cleanup()
> >>>
> >>> +/* needed for an alignment check in non-tpm code */
> >>> +static inline Object *TPM_IS_CRB(Object *obj)
> >>> +{
> >>> + return NULL;
> >>> +}
> >>> +
> >>>   #endif /* CONFIG_TPM */
> >>>
> >>>   #endif /* QEMU_TPM_H */

Re: [PATCH 00/15] Protect the block layer with a rwlock: part 3

2022-11-23 Thread Kevin Wolf

Am 23.11.2022 um 12:45 hat Emanuele Giuseppe Esposito geschrieben:
> 
> 
> Am 18/11/2022 um 11:57 schrieb Paolo Bonzini:
> > On 11/16/22 15:07, Emanuele Giuseppe Esposito wrote:
> >> Here we introduce generated_co_wrapper_simple, a simplification of
> >> g_c_w that
> >> only considers the case where the caller is not in a coroutine.
> >> This simplifies and clarifies a lot when the caller is a coroutine or
> >> not, and
> >> in the future will hopefully replace g_c_w.
> > 
> > This is a good idea!
> > 
> > Just one thing, though it belongs more in the two series which
> > introduced generated_co_wrapper_simple and generated_co_wrapper_blk - I
> > would make this the "official" wrapper.  So perhaps:
> > 
> > - generated_co_wrapper_simple -> coroutine_wrapper
> > - generated_co_wrapper_blk -> coroutine_wrapper_mixed
> > - generated_co_wrapper -> coroutine_wrapper_mixed_bdrv
> 
> Ah damn I forgot about this, and of course I just sent v5 for "Still
> more coroutine and various fixes in block layer".
> 
> To me it sounds good, but before I do a massive edit and then someone
> asks to revert it, @Kevin and the others do you agree?

Makes sense to me in general. I think I originally suggested that the
"basic" version should be the one that doesn't take the graph lock, and
the special version with the longer name is the one that takes it.

This proposal contains the same thought, but extends it to make
generated_co_wrapper_simple the basic thing. This is good, because
eventually we want to get rid of the mixed functions, so they are indeed
the special thing.

I think this means that if we clean up everything, in the end we'll have
coroutine_wrapper and coroutine_wrapper_bdrv (the fourth version not in
the above list, but that Paolo mentioned we may want to have).

The only thing I'm unsure about is whether coroutine_wrapper_bdrv is
descriptive enough as a name or whether it should be something more
explicit like coroutine_wrapper_bdrv_graph_locked.

Kevin

[PATCH v2] Drop more useless casts from void * to pointer

2022-11-23 Thread Markus Armbruster

Signed-off-by: Markus Armbruster 
Reviewed-by: Laurent Vivier 
---
v2:
* PATCH 1+2 merged as commit 0a553c12c7 and 3f7febc937
* PATCH 3 change to util/coroutine-ucontext.c dropped [Laurent]

 bsd-user/elfload.c  | 2 +-
 contrib/plugins/cache.c | 8 
 contrib/vhost-user-blk/vhost-user-blk.c | 2 +-
 hw/core/qdev-clock.c| 2 +-
 hw/hyperv/vmbus.c   | 2 +-
 hw/net/cadence_gem.c| 2 +-
 hw/net/virtio-net.c | 2 +-
 hw/nvme/ctrl.c  | 4 ++--
 hw/rdma/vmw/pvrdma_cmd.c| 9 +++--
 hw/rdma/vmw/pvrdma_qp_ops.c | 6 +++---
 hw/virtio/virtio-iommu.c| 3 +--
 linux-user/syscall.c| 2 +-
 target/i386/hax/hax-all.c   | 2 +-
 tests/tcg/aarch64/system/semiheap.c | 4 ++--
 util/vfio-helpers.c | 2 +-
 15 files changed, 24 insertions(+), 28 deletions(-)

diff --git a/bsd-user/elfload.c b/bsd-user/elfload.c
index f8edb22f2a..fbcdc94b96 100644
--- a/bsd-user/elfload.c
+++ b/bsd-user/elfload.c
@@ -156,7 +156,7 @@ static abi_ulong copy_elf_strings(int argc, char **argv, 
void **page,
 --p; --tmp; --len;
 if (--offset < 0) {
 offset = p % TARGET_PAGE_SIZE;
-pag = (char *)page[p / TARGET_PAGE_SIZE];
+pag = page[p / TARGET_PAGE_SIZE];
 if (!pag) {
 pag = g_try_malloc0(TARGET_PAGE_SIZE);
 page[p / TARGET_PAGE_SIZE] = pag;
diff --git a/contrib/plugins/cache.c b/contrib/plugins/cache.c
index ac1510aaa1..2e25184a7f 100644
--- a/contrib/plugins/cache.c
+++ b/contrib/plugins/cache.c
@@ -405,7 +405,7 @@ static void vcpu_mem_access(unsigned int vcpu_index, 
qemu_plugin_meminfo_t info,
 g_mutex_lock(&l1_dcache_locks[cache_idx]);
 hit_in_l1 = access_cache(l1_dcaches[cache_idx], effective_addr);
 if (!hit_in_l1) {
-insn = (InsnData *) userdata;
+insn = userdata;
 __atomic_fetch_add(&insn->l1_dmisses, 1, __ATOMIC_SEQ_CST);
 l1_dcaches[cache_idx]->misses++;
 }
@@ -419,7 +419,7 @@ static void vcpu_mem_access(unsigned int vcpu_index, 
qemu_plugin_meminfo_t info,
 
 g_mutex_lock(&l2_ucache_locks[cache_idx]);
 if (!access_cache(l2_ucaches[cache_idx], effective_addr)) {
-insn = (InsnData *) userdata;
+insn = userdata;
 __atomic_fetch_add(&insn->l2_misses, 1, __ATOMIC_SEQ_CST);
 l2_ucaches[cache_idx]->misses++;
 }
@@ -440,7 +440,7 @@ static void vcpu_insn_exec(unsigned int vcpu_index, void 
*userdata)
 g_mutex_lock(&l1_icache_locks[cache_idx]);
 hit_in_l1 = access_cache(l1_icaches[cache_idx], insn_addr);
 if (!hit_in_l1) {
-insn = (InsnData *) userdata;
+insn = userdata;
 __atomic_fetch_add(&insn->l1_imisses, 1, __ATOMIC_SEQ_CST);
 l1_icaches[cache_idx]->misses++;
 }
@@ -454,7 +454,7 @@ static void vcpu_insn_exec(unsigned int vcpu_index, void 
*userdata)
 
 g_mutex_lock(&l2_ucache_locks[cache_idx]);
 if (!access_cache(l2_ucaches[cache_idx], insn_addr)) {
-insn = (InsnData *) userdata;
+insn = userdata;
 __atomic_fetch_add(&insn->l2_misses, 1, __ATOMIC_SEQ_CST);
 l2_ucaches[cache_idx]->misses++;
 }
diff --git a/contrib/vhost-user-blk/vhost-user-blk.c 
b/contrib/vhost-user-blk/vhost-user-blk.c
index d6932a2645..aa99877fcd 100644
--- a/contrib/vhost-user-blk/vhost-user-blk.c
+++ b/contrib/vhost-user-blk/vhost-user-blk.c
@@ -193,7 +193,7 @@ vub_discard_write_zeroes(VubReq *req, struct iovec *iov, 
uint32_t iovcnt,
 
 #if defined(__linux__) && defined(BLKDISCARD) && defined(BLKZEROOUT)
 VubDev *vdev_blk = req->vdev_blk;
-desc = (struct virtio_blk_discard_write_zeroes *)buf;
+desc = buf;
 uint64_t range[2] = { le64toh(desc->sector) << 9,
   le32toh(desc->num_sectors) << 9 };
 if (type == VIRTIO_BLK_T_DISCARD) {
diff --git a/hw/core/qdev-clock.c b/hw/core/qdev-clock.c
index 117f4c6ea4..82799577f3 100644
--- a/hw/core/qdev-clock.c
+++ b/hw/core/qdev-clock.c
@@ -134,7 +134,7 @@ void qdev_init_clocks(DeviceState *dev, const 
ClockPortInitArray clocks)
 Clock **clkp;
 /* offset cannot be inside the DeviceState part */
 assert(elem->offset > sizeof(DeviceState));
-clkp = (Clock **)(((void *) dev) + elem->offset);
+clkp = ((void *)dev) + elem->offset;
 if (elem->is_output) {
 *clkp = qdev_init_clock_out(dev, elem->name);
 } else {
diff --git a/hw/hyperv/vmbus.c b/hw/hyperv/vmbus.c
index 30bc04e1c4..f956381cc9 100644
--- a/hw/hyperv/vmbus.c
+++ b/hw/hyperv/vmbus.c
@@ -2104,7 +2104,7 @@ static void process_message(VMBus *vmbus)
 goto out;
 }
 msgdata = hv_msg->payload;
-msg = (struct vmbus_message_header *)msgdata;
+msg = msgdata;
 
 trace_vmbus_process_incoming_message(msg->messag

1 2 >

1 - 100 of 152 matches

Mail list logo