[PULL 2/4] virtio-pci: fix virtio_pci_queue_enabled()

2020-07-27 Thread Jason Wang
From: Laurent Vivier 

In legacy mode, virtio_pci_queue_enabled() falls back to
virtio_queue_enabled() to know if the queue is enabled.

But virtio_queue_enabled() in turn calls virtio_pci_queue_enabled()
if k->queue_enabled is set. This ends in a crash after a stack
overflow.

The problem can be reproduced with
"-device virtio-net-pci,disable-legacy=off,disable-modern=true
 -net tap,vhost=on"

And a look at the backtrace is very explicit:

...
#4  0x00010029a438 in virtio_queue_enabled ()
#5  0x000100497a9c in virtio_pci_queue_enabled ()
...
#130902 0x00010029a460 in virtio_queue_enabled ()
#130903 0x000100497a9c in virtio_pci_queue_enabled ()
#130904 0x00010029a460 in virtio_queue_enabled ()
#130905 0x000100454a20 in vhost_net_start ()
...

This patch fixes the problem by introducing a new function
for the legacy case and calling it from virtio_pci_queue_enabled().
It is also called from virtio_queue_enabled() to avoid code duplication.
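
For clarity, a minimal standalone sketch of the call cycle and of how the
new helper breaks it (stub types only, not the real QEMU structures):

    #include <stdbool.h>
    #include <stdio.h>

    /* Stub stand-in for VirtIODevice. */
    typedef struct VirtIODevice { unsigned long desc_addr; } VirtIODevice;

    static bool virtio_pci_queue_enabled(VirtIODevice *vdev, int n);

    /* New helper: the legacy check, with no path back into the bus. */
    static bool virtio_queue_enabled_legacy(VirtIODevice *vdev, int n)
    {
        return vdev->desc_addr != 0;
    }

    static bool virtio_queue_enabled(VirtIODevice *vdev, int n)
    {
        /* Before the fix, the legacy path of virtio_pci_queue_enabled()
         * called back here, which called virtio_pci_queue_enabled() again:
         * unbounded mutual recursion, hence the stack overflow above. */
        return virtio_pci_queue_enabled(vdev, n);
    }

    static bool virtio_pci_queue_enabled(VirtIODevice *vdev, int n)
    {
        /* Legacy mode now goes straight to the helper. */
        return virtio_queue_enabled_legacy(vdev, n);
    }

    int main(void)
    {
        VirtIODevice dev = { .desc_addr = 0x1000 };
        printf("queue enabled: %d\n", virtio_queue_enabled(&dev, 0));
        return 0;
    }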

Fixes: f19bcdfedd53 ("virtio-pci: implement queue_enabled method")
Cc: Jason Wang 
Cc: Cindy Lu 
CC: Michael S. Tsirkin 
Reviewed-by: Richard Henderson 
Signed-off-by: Laurent Vivier 
Signed-off-by: Jason Wang 
---
 hw/virtio/virtio-pci.c     | 2 +-
 hw/virtio/virtio.c         | 7 ++-
 include/hw/virtio/virtio.h | 1 +
 3 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index 2b1f9cc..ccdf54e 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -1116,7 +1116,7 @@ static bool virtio_pci_queue_enabled(DeviceState *d, int n)
         return proxy->vqs[n].enabled;
     }
 
-    return virtio_queue_enabled(vdev, n);
+    return virtio_queue_enabled_legacy(vdev, n);
 }
 
 static int virtio_pci_add_mem_cap(VirtIOPCIProxy *proxy,
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 546a198..e983025 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -3309,6 +3309,11 @@ hwaddr virtio_queue_get_desc_addr(VirtIODevice *vdev, int n)
     return vdev->vq[n].vring.desc;
 }
 
+bool virtio_queue_enabled_legacy(VirtIODevice *vdev, int n)
+{
+    return virtio_queue_get_desc_addr(vdev, n) != 0;
+}
+
 bool virtio_queue_enabled(VirtIODevice *vdev, int n)
 {
     BusState *qbus = qdev_get_parent_bus(DEVICE(vdev));
@@ -3317,7 +3322,7 @@ bool virtio_queue_enabled(VirtIODevice *vdev, int n)
     if (k->queue_enabled) {
         return k->queue_enabled(qbus->parent, n);
     }
-    return virtio_queue_get_desc_addr(vdev, n) != 0;
+    return virtio_queue_enabled_legacy(vdev, n);
 }
 
 hwaddr virtio_queue_get_avail_addr(VirtIODevice *vdev, int n)
diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index 198ffc7..e424df1 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -295,6 +295,7 @@ typedef struct VirtIORNGConf VirtIORNGConf;
                       VIRTIO_F_RING_PACKED, false)
 
 hwaddr virtio_queue_get_desc_addr(VirtIODevice *vdev, int n);
+bool virtio_queue_enabled_legacy(VirtIODevice *vdev, int n);
 bool virtio_queue_enabled(VirtIODevice *vdev, int n);
 hwaddr virtio_queue_get_avail_addr(VirtIODevice *vdev, int n);
 hwaddr virtio_queue_get_used_addr(VirtIODevice *vdev, int n);
-- 
2.7.4




[PULL 4/4] net: forbid the reentrant RX

2020-07-27 Thread Jason Wang
The memory API allows DMA into a NIC's MMIO area. This means the NIC's
RX routine must be reentrant. Instead of auditing all the NICs, we can
simply detect the reentrancy and return early. The queue->delivering
flag is set and cleared by qemu_net_queue_deliver() so that other queue
helpers know whether a delivery is in progress (i.e. the NIC's receive
is being called). We can check it and return early in
qemu_net_queue_flush() to forbid reentrant RX.
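
A minimal standalone sketch of this kind of reentrancy guard (simplified
types, not the actual net/queue.c structures):

    #include <stdbool.h>
    #include <stdio.h>

    typedef struct NetQueue {
        bool delivering;   /* set while a packet is being delivered */
        int  pending;      /* stand-in for the queued packet list */
    } NetQueue;

    static bool net_queue_flush(NetQueue *q);

    static void net_queue_deliver(NetQueue *q)
    {
        q->delivering = true;
        /* Delivery may DMA into the NIC's own MMIO region, which re-enters
         * the flush path; the guard below turns that into a no-op. */
        net_queue_flush(q);
        q->delivering = false;
    }

    static bool net_queue_flush(NetQueue *q)
    {
        if (q->delivering) {
            return false;               /* reentrant call: bail out early */
        }
        while (q->pending > 0) {
            q->pending--;
            net_queue_deliver(q);
        }
        return true;
    }

    int main(void)
    {
        NetQueue q = { .delivering = false, .pending = 3 };
        printf("flushed: %d, left: %d\n", net_queue_flush(&q), q.pending);
        return 0;
    }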

Signed-off-by: Jason Wang 
---
 net/queue.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/queue.c b/net/queue.c
index 0164727..19e32c8 100644
--- a/net/queue.c
+++ b/net/queue.c
@@ -250,6 +250,9 @@ void qemu_net_queue_purge(NetQueue *queue, NetClientState *from)
 
 bool qemu_net_queue_flush(NetQueue *queue)
 {
+    if (queue->delivering)
+        return false;
+
     while (!QTAILQ_EMPTY(&queue->packets)) {
         NetPacket *packet;
         int ret;
-- 
2.7.4




[PULL 0/4] Net patches

2020-07-27 Thread Jason Wang
The following changes since commit 9303ecb658a0194560d1eecde165a1511223c2d8:

  Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20200727' into staging (2020-07-27 17:25:06 +0100)

are available in the git repository at:

  https://github.com/jasowang/qemu.git tags/net-pull-request

for you to fetch changes up to 7142cad78d6bf4a1cbcb09d06b39935a7998c24e:

  net: forbid the reentrant RX (2020-07-28 13:50:41 +0800)


I wanted to send this earlier, but most of the patches only just came in.

- fix vhost-vdpa issues when no peer
- fix virtio-pci queue enabling check
- forbid reentrant RX


Jason Wang (2):
  virtio-net: check the existence of peer before accessing vDPA config
  net: forbid the reentrant RX

Laurent Vivier (1):
  virtio-pci: fix virtio_pci_queue_enabled()

Yuri Benditovich (1):
  virtio-pci: fix wrong index in virtio_pci_queue_enabled

 hw/net/virtio-net.c        | 30 +++---
 hw/virtio/virtio-pci.c     |  4 ++--
 hw/virtio/virtio.c         |  7 ++-
 include/hw/virtio/virtio.h |  1 +
 net/queue.c                |  3 +++
 5 files changed, 31 insertions(+), 14 deletions(-)





Re: [BUG] vhost-vdpa: qemu-system-s390x crashes with second virtio-net-ccw device

2020-07-27 Thread Jason Wang



On 2020/7/27 9:16 PM, Michael S. Tsirkin wrote:

On Mon, Jul 27, 2020 at 08:44:09PM +0800, Jason Wang wrote:

On 2020/7/27 7:43 PM, Michael S. Tsirkin wrote:

On Mon, Jul 27, 2020 at 04:51:23PM +0800, Jason Wang wrote:

On 2020/7/27 4:41 PM, Cornelia Huck wrote:

On Mon, 27 Jul 2020 15:38:12 +0800
Jason Wang  wrote:


On 2020/7/27 2:43 PM, Cornelia Huck wrote:

On Sat, 25 Jul 2020 08:40:07 +0800
Jason Wang  wrote:

On 2020/7/24 11:34 PM, Cornelia Huck wrote:

On Fri, 24 Jul 2020 11:17:57 -0400
"Michael S. Tsirkin"   wrote:

On Fri, Jul 24, 2020 at 04:56:27PM +0200, Cornelia Huck wrote:

On Fri, 24 Jul 2020 09:30:58 -0400
"Michael S. Tsirkin"   wrote:

On Fri, Jul 24, 2020 at 03:27:18PM +0200, Cornelia Huck wrote:

When I start qemu with a second virtio-net-ccw device (i.e. adding
-device virtio-net-ccw in addition to the autogenerated device), I get
a segfault. gdb points to

#0  0x55d6ab52681d in virtio_net_get_config (vdev=,
    config=0x55d6ad9e3f80 "RT") at /home/cohuck/git/qemu/hw/net/virtio-net.c:146
146         if (nc->peer->info->type == NET_CLIENT_DRIVER_VHOST_VDPA) {

(backtrace doesn't go further)

The core was incomplete, but running under gdb directly shows that it
is just a bog-standard config space access (first for that device).

The cause of the crash is that nc->peer is not set... no idea how that
can happen, not that familiar with that part of QEMU. (Should the code
check, or is that really something that should not happen?)

What I don't understand is why it is set correctly for the first,
autogenerated virtio-net-ccw device, but not for the second one, and
why virtio-net-pci doesn't show these problems. The only difference
between -ccw and -pci that comes to my mind here is that config space
accesses for ccw are done via an asynchronous operation, so timing
might be different.

Hopefully Jason has an idea. Could you post a full command line
please? Do you need a working guest to trigger this? Does this trigger
on an x86 host?

Yes, it does trigger with tcg-on-x86 as well. I've been using

s390x-softmmu/qemu-system-s390x -M s390-ccw-virtio,accel=tcg -cpu qemu,zpci=on
-m 1024 -nographic -device virtio-scsi-ccw,id=scsi0,devno=fe.0.0001
-drive file=/path/to/image,format=qcow2,if=none,id=drive-scsi0-0-0-0
-device 
scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=1
-device virtio-net-ccw

It seems it needs the guest actually doing something with the nics; I
cannot reproduce the crash if I use the old advent calendar moon buggy
image and just add a virtio-net-ccw device.

(I don't think it's a problem with my local build, as I see the problem
both on my laptop and on an LPAR.)

It looks to me that we forgot to check the existence of the peer.

Please try the attached patch to see if it works.
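
(The attached patch is not reproduced in this archive. As a rough
standalone sketch of the shape of such a check, with stub types and a
hypothetical helper name, not the actual patch: guard the peer pointer
before dereferencing it.)

    #include <stddef.h>
    #include <stdio.h>

    enum { NET_CLIENT_DRIVER_NONE, NET_CLIENT_DRIVER_VHOST_VDPA };

    typedef struct NetClientInfo  { int type; } NetClientInfo;
    typedef struct NetClientState { NetClientInfo *info; } NetClientState;

    /* Hypothetical sketch of the guarded config access. */
    static void get_config(NetClientState *peer)
    {
        if (peer && peer->info->type == NET_CLIENT_DRIVER_VHOST_VDPA) {
            puts("fetch config from the vhost-vdpa peer");
        } else {
            puts("no vDPA peer: fall back to the locally built config");
        }
    }

    int main(void)
    {
        get_config(NULL);   /* previously: NULL dereference and segfault */
        return 0;
    }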

Thanks, that patch gets my guest up and running again. So, FWIW,

Tested-by: Cornelia Huck

Any idea why this did not hit with virtio-net-pci (or the autogenerated
virtio-net-ccw device)?

It can be hit with virtio-net-pci as well (just start without peer).

Hm, I had not been able to reproduce the crash with a 'naked' -device
virtio-net-pci. But checking seems to be the right idea anyway.

Sorry for being unclear. I meant that for the networking part, you just need
to start without a peer, and you need a real guest (any Linux) that tries to
access the config space of virtio-net.

Thanks

A pxe guest will do it, but that doesn't support ccw, right?


Yes, it depends on the cli actually.



I'm still unclear why this triggers with ccw but not pci -
any idea?


I didn't test PXE, but I can reproduce this with PCI (just start a Linux guest
without a peer).

Thanks


Might be a good addition to a unit test. Not sure what the test would do
exactly: just make sure the guest runs? Looks like a lot of work
for an empty test ... maybe we can poke at the guest config with
qtest commands at least.



That should work, or we can simply extend the existing virtio-net qtest to
do that.


Thanks









Re: [PATCH 1/2] hw/net/net_tx_pkt: add function to check pkt->max_raw_frags

2020-07-27 Thread Jason Wang



On 2020/7/28 1:08 AM, Mauro Matteo Cascella wrote:

This patch introduces a new function in hw/net/net_tx_pkt.{c,h} to check the
current data fragment against the maximum number of data fragments.



I wonder whether it's better to do the check in 
net_tx_pkt_add_raw_fragment() and fail there.


Btw, I find that net_tx_pkt_add_raw_fragment() does not unmap the DMA mapping
when returning true. Is this a bug?
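
To make the earlier suggestion concrete (doing the check inside
net_tx_pkt_add_raw_fragment() and failing there), a minimal standalone
sketch with hypothetical names and simplified types, not the QEMU API:

    #include <stdbool.h>
    #include <stdio.h>

    struct tx_pkt {
        int raw_frags;
        int max_raw_frags;
    };

    /* Sketch: reject the fragment inside the add function itself, instead
     * of exposing a separate "exceeds" predicate to every caller; a real
     * version would also dma_memory_unmap() anything mapped beforehand. */
    static bool pkt_add_raw_fragment(struct tx_pkt *pkt)
    {
        if (pkt->raw_frags >= pkt->max_raw_frags) {
            return false;
        }
        pkt->raw_frags++;
        return true;
    }

    int main(void)
    {
        struct tx_pkt pkt = { .raw_frags = 0, .max_raw_frags = 4 };
        int accepted = 0;

        for (int i = 0; i < 6; i++) {
            accepted += pkt_add_raw_fragment(&pkt);
        }
        printf("accepted %d of 6 fragments\n", accepted);   /* 4 of 6 */
        return 0;
    }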


Thanks




Reported-by: Ziming Zhang 
Signed-off-by: Mauro Matteo Cascella 
---
  hw/net/net_tx_pkt.c | 5 +
  hw/net/net_tx_pkt.h | 8 
  2 files changed, 13 insertions(+)

diff --git a/hw/net/net_tx_pkt.c b/hw/net/net_tx_pkt.c
index 9560e4a49e..d035618f2c 100644
--- a/hw/net/net_tx_pkt.c
+++ b/hw/net/net_tx_pkt.c
@@ -400,6 +400,11 @@ bool net_tx_pkt_add_raw_fragment(struct NetTxPkt *pkt, hwaddr pa,
     }
 }
 
+bool net_tx_pkt_exceed_max_fragments(struct NetTxPkt *pkt)
+{
+    return pkt->raw_frags >= pkt->max_raw_frags;
+}
+
 bool net_tx_pkt_has_fragments(struct NetTxPkt *pkt)
 {
     return pkt->raw_frags > 0;
diff --git a/hw/net/net_tx_pkt.h b/hw/net/net_tx_pkt.h
index 4ec8bbe9bd..e2ee46ae03 100644
--- a/hw/net/net_tx_pkt.h
+++ b/hw/net/net_tx_pkt.h
@@ -179,6 +179,14 @@ bool net_tx_pkt_send_loopback(struct NetTxPkt *pkt, NetClientState *nc);
  */
 bool net_tx_pkt_parse(struct NetTxPkt *pkt);
 
+/**
+ * indicates if the current data fragment exceeds max_raw_frags
+ *
+ * @pkt: packet
+ *
+ */
+bool net_tx_pkt_exceed_max_fragments(struct NetTxPkt *pkt);
+
 /**
  * indicates if there are data fragments held by this packet object.
  *





Re: [PATCH 1/2] net: forbid the reentrant RX

2020-07-27 Thread Jason Wang



On 2020/7/22 4:57 PM, Jason Wang wrote:

The memory API allows DMA into a NIC's MMIO area. This means the NIC's
RX routine must be reentrant. Instead of auditing all the NICs, we can
simply detect the reentrancy and return early. The queue->delivering
flag is set and cleared by qemu_net_queue_deliver() so that other queue
helpers know whether a delivery is in progress (i.e. the NIC's receive
is being called). We can check it and return early in
qemu_net_queue_flush() to forbid reentrant RX.

Signed-off-by: Jason Wang 
---
  net/queue.c | 3 +++
  1 file changed, 3 insertions(+)

diff --git a/net/queue.c b/net/queue.c
index 0164727e39..19e32c80fd 100644
--- a/net/queue.c
+++ b/net/queue.c
@@ -250,6 +250,9 @@ void qemu_net_queue_purge(NetQueue *queue, NetClientState *from)
 
 bool qemu_net_queue_flush(NetQueue *queue)
 {
+    if (queue->delivering)
+        return false;
+
     while (!QTAILQ_EMPTY(&queue->packets)) {
         NetPacket *packet;
         int ret;



Queued for rc2.

Thanks




Re: [PATCH] target/ppc: Fix TCG leak with the evmwsmiaa instruction

2020-07-27 Thread David Gibson
On Mon, Jul 27, 2020 at 10:21:14AM -0700, Matthieu Bucchianeri wrote:
> Fix double-call to tcg_temp_new_i64(), where a temp is allocated both at
> declaration time and further down the implementation of gen_evmwsmiaa().
> 
> Note that gen_evmwsmia() and gen_evmwsmiaa() are still not implemented
> correctly, as they invoke gen_evmwsmi() which may return early, but the
> return is not propagated. This will be fixed in my patch for bug #1888918.
> 
> Signed-off-by: Matthieu Bucchianeri
> 

Applied to ppc-for-5.1.  Note that since this isn't a regression, it's
not entirely clear it's a good candidate for 5.1 this late in the
freeze.  There's a possibility it will get punted to 5.2, therefore,
but for now I'm staging it for 5.1.

> ---
>  target/ppc/translate/spe-impl.inc.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/target/ppc/translate/spe-impl.inc.c b/target/ppc/translate/spe-impl.inc.c
> index 36b4d5654d..42a0d1cffb 100644
> --- a/target/ppc/translate/spe-impl.inc.c
> +++ b/target/ppc/translate/spe-impl.inc.c
> @@ -528,14 +528,14 @@ static inline void gen_evmwsmia(DisasContext *ctx)
> 
>      tcg_temp_free_i64(tmp);
>  }
> 
>  static inline void gen_evmwsmiaa(DisasContext *ctx)
>  {
> -    TCGv_i64 acc = tcg_temp_new_i64();
> -    TCGv_i64 tmp = tcg_temp_new_i64();
> +    TCGv_i64 acc;
> +    TCGv_i64 tmp;
> 
>      gen_evmwsmi(ctx);   /* rD := rA * rB */
> 
>      acc = tcg_temp_new_i64();
>      tmp = tcg_temp_new_i64();
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson




Re: [PATCH] virtio-pci: fix wrong index in virtio_pci_queue_enabled

2020-07-27 Thread Jason Wang



On 2020/7/27 10:38 PM, Yuri Benditovich wrote:

https://bugzilla.redhat.com/show_bug.cgi?id=1702608

Signed-off-by: Yuri Benditovich 



Queued for rc2.

Thanks



---
  hw/virtio/virtio-pci.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index ada1101d07..2b1f9cc67b 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -1113,7 +1113,7 @@ static bool virtio_pci_queue_enabled(DeviceState *d, int n)
     VirtIODevice *vdev = virtio_bus_get_device(&proxy->bus);
 
     if (virtio_vdev_has_feature(vdev, VIRTIO_F_VERSION_1)) {
-        return proxy->vqs[vdev->queue_sel].enabled;
+        return proxy->vqs[n].enabled;
     }
 
     return virtio_queue_enabled(vdev, n);





Re: [PATCH] virtio-pci: fix virtio_pci_queue_enabled()

2020-07-27 Thread Jason Wang



On 2020/7/27 11:33 PM, Laurent Vivier wrote:

In legacy mode, virtio_pci_queue_enabled() falls back to
virtio_queue_enabled() to know if the queue is enabled.

But virtio_queue_enabled() in turn calls virtio_pci_queue_enabled()
if k->queue_enabled is set. This ends in a crash after a stack
overflow.

The problem can be reproduced with
"-device virtio-net-pci,disable-legacy=off,disable-modern=true
  -net tap,vhost=on"

And a look at the backtrace is very explicit:

 ...
 #4  0x00010029a438 in virtio_queue_enabled ()
 #5  0x000100497a9c in virtio_pci_queue_enabled ()
 ...
 #130902 0x00010029a460 in virtio_queue_enabled ()
 #130903 0x000100497a9c in virtio_pci_queue_enabled ()
 #130904 0x00010029a460 in virtio_queue_enabled ()
 #130905 0x000100454a20 in vhost_net_start ()
 ...

This patch fixes the problem by introducing a new function
for the legacy case and calling it from virtio_pci_queue_enabled().
It is also called from virtio_queue_enabled() to avoid code duplication.

Fixes: f19bcdfedd53 ("virtio-pci: implement queue_enabled method")
Cc: Jason Wang 
Cc: Cindy Lu 
CC: Michael S. Tsirkin 
Signed-off-by: Laurent Vivier 



Queued for rc2.

Thanks



---
  hw/virtio/virtio-pci.c | 2 +-
  hw/virtio/virtio.c | 7 ++-
  include/hw/virtio/virtio.h | 1 +
  3 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index ada1101d07bf..4ad3ad81a2cf 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -1116,7 +1116,7 @@ static bool virtio_pci_queue_enabled(DeviceState *d, int n)
         return proxy->vqs[vdev->queue_sel].enabled;
     }
 
-    return virtio_queue_enabled(vdev, n);
+    return virtio_queue_enabled_legacy(vdev, n);
 }
  
  static int virtio_pci_add_mem_cap(VirtIOPCIProxy *proxy,

diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 546a198e79b0..e98302521769 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -3309,6 +3309,11 @@ hwaddr virtio_queue_get_desc_addr(VirtIODevice *vdev, int n)
     return vdev->vq[n].vring.desc;
 }
 
+bool virtio_queue_enabled_legacy(VirtIODevice *vdev, int n)
+{
+    return virtio_queue_get_desc_addr(vdev, n) != 0;
+}
+
 bool virtio_queue_enabled(VirtIODevice *vdev, int n)
 {
     BusState *qbus = qdev_get_parent_bus(DEVICE(vdev));
@@ -3317,7 +3322,7 @@ bool virtio_queue_enabled(VirtIODevice *vdev, int n)
     if (k->queue_enabled) {
         return k->queue_enabled(qbus->parent, n);
     }
-    return virtio_queue_get_desc_addr(vdev, n) != 0;
+    return virtio_queue_enabled_legacy(vdev, n);
 }
  
  hwaddr virtio_queue_get_avail_addr(VirtIODevice *vdev, int n)

diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index 198ffc762678..e424df12cf6d 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -295,6 +295,7 @@ typedef struct VirtIORNGConf VirtIORNGConf;
                       VIRTIO_F_RING_PACKED, false)
 
 hwaddr virtio_queue_get_desc_addr(VirtIODevice *vdev, int n);
+bool virtio_queue_enabled_legacy(VirtIODevice *vdev, int n);
 bool virtio_queue_enabled(VirtIODevice *vdev, int n);
 hwaddr virtio_queue_get_avail_addr(VirtIODevice *vdev, int n);
 hwaddr virtio_queue_get_used_addr(VirtIODevice *vdev, int n);





[PATCH v5] hw/pci-host: save/restore pci host config register for old ones

2020-07-27 Thread Hogan Wang
The i440fx and q35 machines integrate the i440FX or MCH PCI device by default.
Referring to the i440FX and ICH9-LPC specifications, there are some reserved
configuration registers that can be used to save/restore PCIHostState.config_reg.
It's nasty, but friendly to the old machine types.

Reproducer steps:
step 1. Make modifications to seabios and qemu to increase reproduction
efficiency: seabios writes 0xf0 to port 0x402 to notify qemu to stop the
vcpu after the i440 configuration register has been written through port
0x0cf8, and qemu stops the vcpu when it catches 0xf0 being written to
port 0x402.

seabios:/src/hw/pci.c
@@ -52,6 +52,11 @@ void pci_config_writeb(u16 bdf, u32 addr, u8 val)
         writeb(mmconfig_addr(bdf, addr), val);
     } else {
         outl(ioconfig_cmd(bdf, addr), PORT_PCI_CMD);
+        if (bdf == 0 && addr == 0x72 && val == 0xa) {
+            dprintf(1, "stop vcpu\n");
+            outb(0xf0, 0x402); // notify qemu to stop vcpu
+            dprintf(1, "resume vcpu\n");
+        }
         outb(val, PORT_PCI_DATA + (addr & 3));
     }
 }

qemu:hw/char/debugcon.c
@@ -60,6 +61,9 @@ static void debugcon_ioport_write(void *opaque, hwaddr addr, uint64_t val,
     printf(" [debugcon: write addr=0x%04" HWADDR_PRIx " val=0x%02" PRIx64 "]\n", addr, val);
 #endif
 
+    if (ch == 0xf0) {
+        vm_stop(RUN_STATE_PAUSED);
+    }
     /* XXX this blocks entire thread. Rewrite to use
      * qemu_chr_fe_write and background I/O callbacks */
     qemu_chr_fe_write_all(&s->chr, &ch, 1);

step 2. start vm1 by the following command line, and then vm stopped.
$ qemu-system-x86_64 -machine pc-i440fx-5.0,accel=kvm\
 -netdev tap,ifname=tap-test,id=hostnet0,vhost=on,downscript=no,script=no\
 -device virtio-net-pci,netdev=hostnet0,id=net0,bus=pci.0,addr=0x13,bootindex=3\
 -device cirrus-vga,id=video0,vgamem_mb=16,bus=pci.0,addr=0x2\
 -chardev file,id=seabios,path=/var/log/test.seabios,append=on\
 -device isa-debugcon,iobase=0x402,chardev=seabios\
 -monitor stdio

step 3. start vm2 to accept vm1 state.
$ qemu-system-x86_64 -machine pc-i440fx-5.0,accel=kvm\
 -netdev tap,ifname=tap-test1,id=hostnet0,vhost=on,downscript=no,script=no\
 -device virtio-net-pci,netdev=hostnet0,id=net0,bus=pci.0,addr=0x13,bootindex=3\
 -device cirrus-vga,id=video0,vgamem_mb=16,bus=pci.0,addr=0x2\
 -chardev file,id=seabios,path=/var/log/test.seabios,append=on\
 -device isa-debugcon,iobase=0x402,chardev=seabios\
 -monitor stdio \
 -incoming tcp:127.0.0.1:8000

step 4. execute the following monitor command in vm1 to migrate.
(qemu) migrate tcp:127.0.0.1:8000

step 5. execute the following monitor command in vm2 to resume the vcpu.
(qemu) cont

Before this patch, we get KVM "emulation failure" error on vm2.
This patch fixes it.

Signed-off-by: Hogan Wang 
---
 hw/pci-host/i440fx.c | 46 
 hw/pci-host/q35.c| 44 ++
 2 files changed, 90 insertions(+)

diff --git a/hw/pci-host/i440fx.c b/hw/pci-host/i440fx.c
index 8ed2417f0c..419e27c21a 100644
--- a/hw/pci-host/i440fx.c
+++ b/hw/pci-host/i440fx.c
@@ -64,6 +64,14 @@ typedef struct I440FXState {
  */
 #define I440FX_COREBOOT_RAM_SIZE 0x57
 
+/* Older I440FX machines (5.0 and older) do not support i440FX-pcihost state
+ * migration, use some reserved INTEL 82441 configuration registers to
+ * save/restore i440FX-pcihost config register. Refer to [INTEL 440FX PCISET
+ * 82441FX PCI AND MEMORY CONTROLLER (PMC) AND 82442FX DATA BUS ACCELERATOR
+ * (DBX) Table 1. PMC Configuration Space]
+ */
+#define I440FX_PCI_HOST_CONFIG_REG 0x94
+
 static void i440fx_update_memory_mappings(PCII440FXState *d)
 {
 int i;
@@ -98,15 +106,53 @@ static void i440fx_write_config(PCIDevice *dev,
 static int i440fx_post_load(void *opaque, int version_id)
 {
     PCII440FXState *d = opaque;
+    PCIDevice *dev;
+    PCIHostState *s = OBJECT_CHECK(PCIHostState,
+                                   object_resolve_path("/machine/i440fx", NULL),
+                                   TYPE_PCI_HOST_BRIDGE);
 
     i440fx_update_memory_mappings(d);
+
+    if (!s->mig_enabled) {
+        dev = PCI_DEVICE(d);
+        s->config_reg = pci_get_long(&dev->config[I440FX_PCI_HOST_CONFIG_REG]);
+        pci_set_long(&dev->config[I440FX_PCI_HOST_CONFIG_REG], 0);
+    }
+    return 0;
+}
+
+static int i440fx_pre_save(void *opaque)
+{
+    PCIDevice *dev = opaque;
+    PCIHostState *s = OBJECT_CHECK(PCIHostState,
+                                   object_resolve_path("/machine/i440fx", NULL),
+                                   TYPE_PCI_HOST_BRIDGE);
+    if (!s->mig_enabled) {
+        pci_set_long(&dev->config[I440FX_PCI_HOST_CONFIG_REG],
+                     s->config_reg);
+    }
+    return 0;
+}
+
+static int i440fx_post_save(void *opaque)
+{
+    PCIDevice *dev = opaque;
+    PCIHostState *s = OBJECT_CHECK(PCIHostState,
+                                   object_resolve_path("/machine/i440fx", NULL),
+                                   TYPE_PCI_HOST_BRIDGE);
+    if (!s->mig_enabled) {
+        pci_set_long(&dev->config[I440FX_PCI_HOST_CONFIG_REG], 0);
+    }
 

Re: [PATCH v5 3/4] target/riscv: Fix the translation of physical address

2020-07-27 Thread Zong Li
On Tue, Jul 28, 2020 at 6:49 AM Alistair Francis  wrote:
>
> On Sat, Jul 25, 2020 at 8:05 AM Zong Li  wrote:
> >
> > The real physical address should include the 12-bit page offset. The bug
> > also causes wrong PMP checking: the minimum granularity of PMP is 4 bytes,
> > but we always get a physical address that is 4 KB aligned. That means we
> > always use the start address of the page to check PMP for all addresses
> > within the same page.
>
> So riscv_cpu_tlb_fill() will clear these bits when calling
> tlb_set_page(), so this won't have an impact on actual translation
> (although it will change in input address for 2-stage translation, but
> that seems fine).
>
> Your point about PMP seems correct as we allow a smaller then page
> granularity this seems like the right approach.
>
> Can you edit riscv_cpu_get_phys_page_debug() to mask these bits out at
> the end? Otherwise we will break what callers to
> cpu_get_phys_page_attrs_debug() expect.
>

OK, I checked that already: the callers add these bits back, because they
expect to get the address of the page. Thanks for your review; I'll modify
it in the next version.
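
A standalone sketch of the offset and masking arithmetic being discussed
(illustrative constants only, not the QEMU code):

    #include <stdint.h>
    #include <stdio.h>

    #define PGSHIFT          12
    #define TARGET_PAGE_MASK (~((1ULL << PGSHIFT) - 1))

    /* Translation keeps the in-page offset (the fix in this patch)... */
    static uint64_t translate(uint64_t ppn, uint64_t addr)
    {
        return (ppn << PGSHIFT) | (addr & ~TARGET_PAGE_MASK);
    }

    /* ...while the debug path masks it back off, since callers of
     * cpu_get_phys_page_attrs_debug() expect a page-aligned address. */
    static uint64_t get_phys_page_debug(uint64_t ppn, uint64_t addr)
    {
        return translate(ppn, addr) & TARGET_PAGE_MASK;
    }

    int main(void)
    {
        printf("full:  0x%llx\n",
               (unsigned long long)translate(0x80123, 0x1abc));           /* 0x80123abc */
        printf("debug: 0x%llx\n",
               (unsigned long long)get_phys_page_debug(0x80123, 0x1abc)); /* 0x80123000 */
        return 0;
    }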

> Alistair
>
> >
> > Signed-off-by: Zong Li 
> > ---
> >  target/riscv/cpu_helper.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> > index 75d2ae3434..08b069f0c9 100644
> > --- a/target/riscv/cpu_helper.c
> > +++ b/target/riscv/cpu_helper.c
> > @@ -543,7 +543,8 @@ restart:
> >              /* for superpage mappings, make a fake leaf PTE for the TLB's
> >                 benefit. */
> >              target_ulong vpn = addr >> PGSHIFT;
> > -            *physical = (ppn | (vpn & ((1L << ptshift) - 1))) << PGSHIFT;
> > +            *physical = ((ppn | (vpn & ((1L << ptshift) - 1))) << PGSHIFT) |
> > +                        (addr & ~TARGET_PAGE_MASK);
> >
> >              /* set permissions on the TLB entry */
> >              if ((pte & PTE_R) || ((pte & PTE_X) && mxr)) {
> > --
> > 2.27.0
> >
> >



Re: [RFC PATCH 1/2] hw/riscv: sifive_u: Add file-backed OTP. softmmu/vl: add otp-file to boot option

2020-07-27 Thread Green Wan
Hi Bin,

Thanks for the reply.

I think we can add property to sifive_u_otp_properties[] (something like
below) and remove generic code dependency. What do you think of it?

@@ -243,6 +245,7 @@ static const MemoryRegionOps sifive_u_otp_ops = {
 
 static Property sifive_u_otp_properties[] = {
     DEFINE_PROP_UINT32("serial", SiFiveUOTPState, serial, 0),
+    DEFINE_PROP_STRING("otp_file", SiFiveUOTPState, otp_file),
     DEFINE_PROP_END_OF_LIST(),
 };
 
 typedef struct SiFiveUOTPState {
     /*< private >*/
     SysBusDevice parent_obj;
@@ -77,6 +75,7 @@ typedef struct SiFiveUOTPState {
     uint32_t fuse[SIFIVE_U_OTP_NUM_FUSES];
     /* config */
     uint32_t serial;
+    char *otp_file;
     uint32_t fuse_wo[SIFIVE_U_OTP_NUM_FUSES];
 } SiFiveUOTPState;

Regards,
Green


On Fri, Jul 24, 2020 at 10:20 PM Bin Meng  wrote:

> Hi Green,
>
> On Fri, Jul 24, 2020 at 5:51 PM Green Wan  wrote:
> >
> > Add a file-backed implementation for the OTP of the sifive_u machine. Use
> > '-boot otp-file=xxx' to enable it. The file is opened, mmap'ed and closed
> > for every OTP read/write, in order to keep an up-to-date snapshot
> > of the OTP.
> >
> > Signed-off-by: Green Wan 
> > ---
> >  hw/riscv/sifive_u_otp.c | 88 -
> >  include/hw/riscv/sifive_u_otp.h |  2 +
> >  qemu-options.hx |  3 +-
> >  softmmu/vl.c|  6 ++-
> >  4 files changed, 96 insertions(+), 3 deletions(-)
> >
> > diff --git a/hw/riscv/sifive_u_otp.c b/hw/riscv/sifive_u_otp.c
> > index f6ecbaa2ca..26e1965821 100644
> > --- a/hw/riscv/sifive_u_otp.c
> > +++ b/hw/riscv/sifive_u_otp.c
> > @@ -24,6 +24,72 @@
> >  #include "qemu/log.h"
> >  #include "qemu/module.h"
> >  #include "hw/riscv/sifive_u_otp.h"
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +#include 
> > +
> > +#define TRACE_PREFIX"FU540_OTP: "
> > +#define SIFIVE_FU540_OTP_SIZE   (SIFIVE_U_OTP_NUM_FUSES * 4)
> > +
> > +static int otp_backed_fd;
> > +static unsigned int *otp_mmap;
> > +
> > +static void sifive_u_otp_backed_load(const char *filename);
> > +static uint64_t sifive_u_otp_backed_read(uint32_t fuseidx);
> > +static void sifive_u_otp_backed_write(uint32_t fuseidx,
> > +  uint32_t paio,
> > +  uint32_t pdin);
> > +static void sifive_u_otp_backed_unload(void);
> > +
> > +void sifive_u_otp_backed_load(const char *filename)
> > +{
> > +    if (otp_backed_fd < 0) {
> > +
> > +        otp_backed_fd = open(filename, O_RDWR);
> > +
> > +        if (otp_backed_fd < 0)
> > +            qemu_log_mask(LOG_TRACE,
> > +                          TRACE_PREFIX "Warning: can't open otp file\n");
> > +        else {
> > +
> > +            otp_mmap = (unsigned int *)mmap(0,
> > +                                            SIFIVE_FU540_OTP_SIZE,
> > +                                            PROT_READ | PROT_WRITE | PROT_EXEC,
> > +                                            MAP_FILE | MAP_SHARED,
> > +                                            otp_backed_fd,
> > +                                            0);
> > +
> > +            if (otp_mmap == MAP_FAILED)
> > +                qemu_log_mask(LOG_TRACE,
> > +                              TRACE_PREFIX "Warning: can't mmap otp file\n");
> > +        }
> > +    }
> > +
> > +}
> > +
> > +uint64_t sifive_u_otp_backed_read(uint32_t fuseidx)
> > +{
> > +    return (uint64_t)(otp_mmap[fuseidx]);
> > +}
> > +
> > +void sifive_u_otp_backed_write(uint32_t fuseidx, uint32_t paio, uint32_t pdin)
> > +{
> > +    otp_mmap[fuseidx] &= ~(pdin << paio);
> > +    otp_mmap[fuseidx] |= (pdin << paio);
> > +}
> > +
> > +
> > +void sifive_u_otp_backed_unload(void)
> > +{
> > +    munmap(otp_mmap, SIFIVE_FU540_OTP_SIZE);
> > +    close(otp_backed_fd);
> > +    otp_backed_fd = -1;
> > +}
> >
> >  static uint64_t sifive_u_otp_read(void *opaque, hwaddr addr, unsigned int size)
> >  {
> > @@ -46,7 +112,17 @@ static uint64_t sifive_u_otp_read(void *opaque, hwaddr addr, unsigned int size)
> >      if ((s->pce & SIFIVE_U_OTP_PCE_EN) &&
> >          (s->pdstb & SIFIVE_U_OTP_PDSTB_EN) &&
> >          (s->ptrim & SIFIVE_U_OTP_PTRIM_EN)) {
> > -        return s->fuse[s->pa & SIFIVE_U_OTP_PA_MASK];
> > +
> > +        if (otp_file) {
> > +            uint64_t val;
> > +
> > +            sifive_u_otp_backed_load(otp_file);
> > +            val = sifive_u_otp_backed_read(s->pa);
> > +            sifive_u_otp_backed_unload();
> > +
> > +            return val;
> > +        } else
> > +            return s->fuse[s->pa & SIFIVE_U_OTP_PA_MASK];
> >      } else {
> >          return 0xff;
> >      }
> > @@ -123,6 +199,12 @@ static void sifive_u_otp_write(void *opaque, hwaddr addr,
> >          s->ptrim = val32;
> >          break;
> >      case SIFIVE_U_OTP_PWE:
> > +        if (otp_file) {
> > +

[Bug 1390520] Re: virtual machine fails to start with connected audio cd

2020-07-27 Thread John Snow
Dropping from my queue due to capacity.

** Changed in: qemu
 Assignee: John Snow (jnsnow) => (unassigned)

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1390520

Title:
  virtual machine fails to start with connected audio cd

Status in QEMU:
  New
Status in libvirt package in Ubuntu:
  Confirmed
Status in libvirt source package in Trusty:
  Won't Fix

Bug description:
  when connecting a data cd with a virtual machine (IDE CDROM 1), the virtual
machine starts up and the data cd is accessible (for example to install
software packages or drivers),
  but connecting an audio cd the following error appears:

  
---
  cannot read header '/dev/sr0': Input/output error

  Traceback (most recent call last):
File "/usr/share/virt-manager/virtManager/details.py", line 2530, in 
_change_config_helper
  func(*args)
File "/usr/share/virt-manager/virtManager/domain.py", line 850, in 
hotplug_storage_media
  self.attach_device(devobj)
File "/usr/share/virt-manager/virtManager/domain.py", line 798, in 
attach_device
  self._backend.attachDevice(devxml)
File "/usr/lib/python2.7/dist-packages/libvirt.py", line 493, in 
attachDevice
  if ret == -1: raise libvirtError ('virDomainAttachDevice() failed', 
dom=self)
  libvirtError: cannot read header '/dev/sr0': Input/output error
  


  Description:Ubuntu 14.04.1 LTS
  Release:14.04

  qemu:
Installiert:   2.0.0+dfsg-2ubuntu1.6
Installationskandidat: 2.0.0+dfsg-2ubuntu1.6

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1390520/+subscriptions



[Bug 1070762] Re: savevm fails with inserted CD, "Device '%s' is writable but does not support snapshots."

2020-07-27 Thread John Snow
Very old bug. If anyone sees this behavior, please re-file against a
supported release (5.0 at time of writing, soon to be 5.1) and please
paste a full command-line and steps to reproduce.

(To my knowledge, this bug is not present in modern QEMU builds, but do
not know when it would have changed.)

--js

** Changed in: qemu
   Status: New => Incomplete

** Changed in: qemu
 Assignee: John Snow (jnsnow) => (unassigned)

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1070762

Title:
  savevm fails with inserted CD, "Device '%s' is writable but does not
  support  snapshots."

Status in QEMU:
  Incomplete

Bug description:
  Hi,

  yesterday unfortunately a customer reported a failed snapshot of his
  VM. Going through the logfile I discovered:

  "Device 'ide1-cd0' is writable but does not support snapshots"

  this is with qemu-1.2.0 and 1.0.1 at least...

  Why writeable?
  Even if I specify "-drive ...,readonly=on,snapshot=off" to qemu the 
monitor-command sees the CD-ROM-device as being writeable?!

  Somewhere I saw a "hint" for blockdev.c:
  === snip ===

  --- /tmp/blockdev.c   2012-10-24 11:37:10.0 +0200
  +++ blockdev.c        2012-10-24 11:37:17.0 +0200
  @@ -551,6 +551,7 @@
       case IF_XEN:
       case IF_NONE:
           dinfo->media_cd = media == MEDIA_CDROM;
  +        dinfo->bdrv->read_only = 1;
           break;
       case IF_SD:
       case IF_FLOPPY:

  === snap ===

  after installing with this small patch applied it works, so insert CD, savevm succeeds.
  This should be fixed at all correct places, and the tags 
"readonly=on,snapshot=off" should do it, too. Or even just work after 
specifying a drive being a CD-rom should do the trick ;-)

  Another "bad habit" is, that the ISO/DVD-file has to be writeable to
  be changed?

  Thnx for attention and regards,

  Oliver.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1070762/+subscriptions



[Bug 1777315] Re: IDE short PRDT abort

2020-07-27 Thread John Snow
** Summary changed:

- Denial of service
+ IDE short PRDT abort

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1777315

Title:
  IDE short PRDT abort

Status in QEMU:
  In Progress

Bug description:
  Hi,
  QEMU 'hw/ide/core.c:871' Denial of Service Vulnerability in version 
qemu-2.12.0

  run the program in qemu-2.12.0:
  #define _GNU_SOURCE 
  #include 
  #include 
  #include 
  #include 
  #include 
  #include 
  #include 
  #include 
  #include 

  static uintptr_t syz_open_dev(uintptr_t a0, uintptr_t a1, uintptr_t a2)
  {
  if (a0 == 0xc || a0 == 0xb) {
  char buf[128];
  sprintf(buf, "/dev/%s/%d:%d", a0 == 0xc ? "char" : "block", 
(uint8_t)a1, (uint8_t)a2);
  return open(buf, O_RDWR, 0);
  } else {
  char buf[1024];
  char* hash;
  strncpy(buf, (char*)a0, sizeof(buf) - 1);
  buf[sizeof(buf) - 1] = 0;
  while ((hash = strchr(buf, '#'))) {
  *hash = '0' + (char)(a1 % 10);
  a1 /= 10;
  }
  return open(buf, a2, 0);
  }
  }

  uint64_t r[2] = {0x, 0x};
  void loop()
  {
  long res = 0;
  memcpy((void*)0x2000, "/dev/sg#", 9);
  res = syz_open_dev(0x2000, 0, 2);
  if (res != -1)
  r[0] = res;
  res = syscall(__NR_dup2, r[0], r[0]);
  if (res != -1)
  r[1] = res;
  *(uint8_t*)0x2ec0 = 0;
  *(uint8_t*)0x2ec1 = 0;
  *(uint8_t*)0x2ec2 = 0;
  *(uint8_t*)0x2ec3 = 0;
  *(uint32_t*)0x2ec8 = 0;
  *(uint8_t*)0x2ed8 = 0;
  *(uint8_t*)0x2ed9 = 0;
  *(uint8_t*)0x2eda = 0;
  *(uint8_t*)0x2edb = 0;
  memcpy((void*)0x2ee0, "\x9c\x4d\xe7\xd5\x0a\x62\x43\xa7\x77\x53\x67\xb3", 12);
  syscall(__NR_write, r[1], 0x2ec0, 0x323);
  }

  int main()
  {
  syscall(__NR_mmap, 0x2000, 0x100, 3, 0x32, -1, 0);
  loop();
  return 0;
  }
  this will crash qemu, output information:
   qemu-system-x86_64: hw/ide/core.c:843: ide_dma_cb: Assertion `n * 512 == s->sg.size' failed.

  
  Thanks 
  owl337

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1777315/+subscriptions



Re: [PATCH] bugfix: irq: Avoid covering object refcount of qemu_irq

2020-07-27 Thread zhukeqian
Hi Peter,

On 2020/7/27 22:41, Peter Maydell wrote:
> On Mon, 27 Jul 2020 at 14:03, Keqian Zhu  wrote:
>>
>> Avoid overwriting the object refcount of a qemu_irq, otherwise it may cause
>> a memory leak.
>>
>> Signed-off-by: Keqian Zhu 
>> ---
>>  hw/core/irq.c | 4 +++-
>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/hw/core/irq.c b/hw/core/irq.c
>> index fb3045b912..59af4dfc74 100644
>> --- a/hw/core/irq.c
>> +++ b/hw/core/irq.c
>> @@ -125,7 +125,9 @@ void qemu_irq_intercept_in(qemu_irq *gpio_in, qemu_irq_handler handler, int n)
>>      int i;
>>      qemu_irq *old_irqs = qemu_allocate_irqs(NULL, NULL, n);
>>      for (i = 0; i < n; i++) {
>> -        *old_irqs[i] = *gpio_in[i];
>> +        old_irqs[i]->handler = gpio_in[i]->handler;
>> +        old_irqs[i]->opaque = gpio_in[i]->opaque;
>> +
>>          gpio_in[i]->handler = handler;
>>          gpio_in[i]->opaque = &old_irqs[i];
>>      }
> 
> This function is leaky by design, because it doesn't do anything
> with the old_irqs array and there's no function for un-intercepting
> the IRQs (which would need to free that memory). This is not ideal
> but OK because it's only used in the test suite.
One of our internal self-developed modules also uses this function, and we
implemented a function to remove the interception, so there is no memory leak
after this bugfix.

I suggest merging this bugfix to prepare for future code that may invoke
this function.

> 
> Is there a specific bug you're trying to fix here?
The memory leak was reported by ASAN.
> 

Thanks,
Keqian
> thanks
> -- PMM
> .
> 



[Bug 1883739] Re: ide_dma_cb: Assertion `prep_size >= 0 && prep_size <= n * 512' failed.

2020-07-27 Thread John Snow
*** This bug is a duplicate of bug 1777315 ***
https://bugs.launchpad.net/bugs/1777315

** This bug has been marked a duplicate of bug 1777315
   Denial of service

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1883739

Title:
  ide_dma_cb: Assertion `prep_size >= 0 && prep_size <= n * 512' failed.

Status in QEMU:
  Confirmed

Bug description:
  To reproduce run the QEMU with the following command line:
  ```
  qemu-system-x86_64 -cdrom hypertrash.iso -nographic -m 100 -enable-kvm -net 
none -drive id=disk,file=hda.img,if=none -device ahci,id=ahci -device 
ide-hd,drive=disk,bus=ahci.0
  ```

  QEMU Version:
  ```
  # qemu-5.0.0
  $ ./configure --target-list=x86_64-softmmu --enable-sanitizers; make
  $ x86_64-softmmu/qemu-system-x86_64 --version
  QEMU emulator version 5.0.0
  Copyright (c) 2003-2020 Fabrice Bellard and the QEMU Project developers
  ```

  To create disk image run:
  ```
  dd if=/dev/zero of=hda.img bs=1024 count=1024
  ```

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1883739/+subscriptions



[Bug 1681439] Re: qemu-system-x86_64: hw/ide/core.c:685: ide_cancel_dma_sync: Assertion `s->bus->dma->aiocb == NULL' failed.

2020-07-27 Thread John Snow
** Changed in: qemu
   Status: Confirmed => In Progress

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1681439

Title:
  qemu-system-x86_64: hw/ide/core.c:685: ide_cancel_dma_sync: Assertion
  `s->bus->dma->aiocb == NULL' failed.

Status in QEMU:
  In Progress

Bug description:
  Since upgrading to QEMU 2.8.0, my Windows 7 64-bit virtual machines
  started crashing due to the assertion quoted in the summary failing.
  The assertion in question was added by commit 9972354856 ("block: add
  BDS field to count in-flight requests").  My tests show that setting
  discard=unmap is needed to reproduce the issue.  Speaking of
  reproduction, it is a bit flaky, because I have been unable to come up
  with specific instructions that would allow the issue to be triggered
  outside of my environment, but I do have a semi-sane way of testing that
  appears to depend on a specific initial state of data on the underlying
  storage volume, actions taken within the VM and waiting for about 20
  minutes.

  Here is the shortest QEMU command line that I managed to reproduce the
  bug with:

  qemu-system-x86_64 \
  -machine pc-i440fx-2.7,accel=kvm \
  -m 3072 \
  -drive file=/dev/lvm/qemu,format=raw,if=ide,discard=unmap \
-netdev tap,id=hostnet0,ifname=tap0,script=no,downscript=no,vhost=on \
  -device virtio-net-pci,netdev=hostnet0 \
-vnc :0

  The underlying storage (/dev/lvm/qemu) is a thin LVM snapshot.

  QEMU was compiled using:

  ./configure --python=/usr/bin/python2.7 --target-list=x86_64-softmmu
  make -j3

  My virtualization environment is not really a critical one and
  reproduction is not that much of a hassle, so if you need me to gather
  further diagnostic information or test patches, I will be happy to help.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1681439/+subscriptions



Re: [PATCH] bugfix: irq: Avoid covering object refcount of qemu_irq

2020-07-27 Thread zhukeqian
Hi Qiang,

On 2020/7/27 22:37, Li Qiang wrote:
> Keqian Zhu wrote on Monday, July 27, 2020 at 9:03 PM:
>>
>> Avoid overwriting the object refcount of a qemu_irq, otherwise it may cause
>> a memory leak.
> 
> Any reproducer?
> 
In mainline QEMU, this function is only used in qtest. One of our internal
self-developed modules also uses this function. The memory leak was reported
by ASAN.

Thanks,
Keqian

> Thanks,
> Li Qiang
> 
>>
>> Signed-off-by: Keqian Zhu 
>> ---
>>  hw/core/irq.c | 4 +++-
>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/hw/core/irq.c b/hw/core/irq.c
>> index fb3045b912..59af4dfc74 100644
>> --- a/hw/core/irq.c
>> +++ b/hw/core/irq.c
>> @@ -125,7 +125,9 @@ void qemu_irq_intercept_in(qemu_irq *gpio_in, qemu_irq_handler handler, int n)
>>      int i;
>>      qemu_irq *old_irqs = qemu_allocate_irqs(NULL, NULL, n);
>>      for (i = 0; i < n; i++) {
>> -        *old_irqs[i] = *gpio_in[i];
>> +        old_irqs[i]->handler = gpio_in[i]->handler;
>> +        old_irqs[i]->opaque = gpio_in[i]->opaque;
>> +
>>          gpio_in[i]->handler = handler;
>>          gpio_in[i]->opaque = &old_irqs[i];
>>      }
>> --
>> 2.19.1
>>
> .
> 



[Bug 1681439] Re: qemu-system-x86_64: hw/ide/core.c:685: ide_cancel_dma_sync: Assertion `s->bus->dma->aiocb == NULL' failed.

2020-07-27 Thread John Snow
The qtest reproducers are so nice.

writel 0x0 0x

outw 0x171 0x32a
  features := 0x2ab8cb
  count := 0x03;  b8cb
outw 0x176 0x3570
  device := 0x70 (select device1)   b8cb
  command := 0x35(DMA WRITE EXT)8f98

outl 0xcf8 0x8903
outl 0xcfc 0x4e002700
outl 0xcf8 0x8920
outb 0xcfc 0x5e

outb 0x58 0xe1
  bmdma_cmd_writeb val = 0xe1 [1110 0001]
   DMA READ ^  ^ DMA Start
outw 0x57 0x0
  bmdma_cmd_writeb val = 0x00 [ ]
   ^ DMA Cancel
EOF


This should be a straightforward DMA cancel. I added some more traces;

# After the 0x35 command write:
ide_exec_cmd IDE exec cmd: bus 0x561808b0ecc0; state 0x561808b0f118; cmd 0x35
ide_sector_start_dma IDEState 0x561808b0f118;
ide_start_dma IDEState 0x561808b0f118;

# After the 0xe1 bmdma kick:
ide_dma_cb_entry IDEState 0x561808b0f118; ret 0;
ide_dma_cb IDEState 0x561808b0f118; sector_num=1 n=259 cmd=DMA WRITE
ide_dma_cb_next IDEState 0x561808b0f118;

So far, pretty normal. IDE calls the HBA's DMA start, but the HBA
doesn't have DMA enabled, so it stalls. Later, when we turn on DMA, the
HBA engages the DMA callback and sets up the first transfer. This sets
s->bus->dma->aiocb.

Then, we try to cancel DMA:

ide_cancel_dma_sync IDEState 0x561808b0f118;
ide_cancel_dma_sync_remaining draining all remaining requests
1343877@1595891049.469050:dma_blk_cb dbs=0x55baededdc50 ret=0
1343877@1595891049.469054:dma_map_wait dbs=0x55baededdc50
qemu-system-i386: /home/jsnow/src/qemu/hw/ide/core.c:732: void 
ide_cancel_dma_sync(IDEState *): Assertion `s->bus->dma->aiocb == NULL' failed.

We still have a DMA callback out, so we try to synchronously cancel it;
but the blk_drain doesn't appear to be effective!

We apparently wind up here:

if (dbs->iov.size == 0) {
trace_dma_map_wait(dbs);
dbs->bh = aio_bh_new(dbs->ctx, reschedule_dma, dbs);
cpu_register_map_client(dbs->bh);
return;
}


... The DMA simply re-schedules itself (?) when iov.size is zero. Unfortunately
for us, that means the whole point of scheduling the drain is defeated,
because the DMA never returns all the way to the IDE device emulation code.
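
Schematically (a standalone restatement of the stall, not QEMU code):

    #include <stdbool.h>
    #include <stdio.h>

    /* When the mapping yields an empty iov, the DMA helper re-arms itself
     * via a bottom half instead of completing, so a drain loop that waits
     * for completion never makes progress. */
    struct dma_state {
        int  iov_size;
        bool completed;
        int  reschedules;
    };

    static void dma_cb(struct dma_state *dbs)
    {
        if (dbs->iov_size == 0) {
            dbs->reschedules++;   /* cpu_register_map_client() + BH in QEMU */
            return;               /* never reaches the IDE completion path */
        }
        dbs->completed = true;    /* would clear s->bus->dma->aiocb */
    }

    int main(void)
    {
        struct dma_state dbs = { .iov_size = 0 };

        for (int i = 0; i < 3 && !dbs.completed; i++) {
            dma_cb(&dbs);         /* a synchronous drain would spin here */
        }
        printf("completed=%d reschedules=%d\n", dbs.completed, dbs.reschedules);
        return 0;
    }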

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1681439

Title:
  qemu-system-x86_64: hw/ide/core.c:685: ide_cancel_dma_sync: Assertion
  `s->bus->dma->aiocb == NULL' failed.

Status in QEMU:
  Confirmed

Bug description:
  Since upgrading to QEMU 2.8.0, my Windows 7 64-bit virtual machines
  started crashing due to the assertion quoted in the summary failing.
  The assertion in question was added by commit 9972354856 ("block: add
  BDS field to count in-flight requests").  My tests show that setting
  discard=unmap is needed to reproduce the issue.  Speaking of
  reproduction, it is a bit flaky, because I have been unable to come up
  with specific instructions that would allow the issue to be triggered
  outside of my environment, but I do have a semi-sane way of testing that
  appears to depend on a specific initial state of data on the underlying
  storage volume, actions taken within the VM and waiting for about 20
  minutes.

  Here is the shortest QEMU command line that I managed to reproduce the
  bug with:

  qemu-system-x86_64 \
  -machine pc-i440fx-2.7,accel=kvm \
  -m 3072 \
  -drive file=/dev/lvm/qemu,format=raw,if=ide,discard=unmap \
-netdev tap,id=hostnet0,ifname=tap0,script=no,downscript=no,vhost=on \
  -device virtio-net-pci,netdev=hostnet0 \
-vnc :0

  The underlying storage (/dev/lvm/qemu) is a thin LVM snapshot.

  QEMU was compiled using:

  ./configure --python=/usr/bin/python2.7 --target-list=x86_64-softmmu
  make -j3

  My virtualization environment is not really a critical one and
  reproduction is not that much of a hassle, so if you need me to gather
  further diagnostic information or test patches, I will be happy to help.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1681439/+subscriptions



[PATCH 2/6 v3] KVM: SVM: Fill in conforming svm_x86_ops via macro

2020-07-27 Thread Krish Sadhukhan
The names of some of the svm_x86_ops functions do not have a corresponding
'svm_' prefix. Generate the names using a macro so that the names are
conformant. Fixing the naming improves readability and eases maintenance
of the code.
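
As a sketch of the idea (a hypothetical, standalone macro demo; the real
series defines its own helpers in the KVM headers):

    #include <stdio.h>

    struct x86_ops {
        int  (*hardware_enable)(void);
        void (*hardware_disable)(void);
    };

    static int  svm_hardware_enable(void)  { puts("svm_hardware_enable");  return 0; }
    static void svm_hardware_disable(void) { puts("svm_hardware_disable"); }

    /* Every member is filled with the identically named svm_-prefixed
     * function, so a non-conforming name fails to compile. */
    #define SVM_OP(name) .name = svm_##name

    static struct x86_ops svm_x86_ops = {
        SVM_OP(hardware_enable),
        SVM_OP(hardware_disable),
    };

    int main(void)
    {
        svm_x86_ops.hardware_enable();
        svm_x86_ops.hardware_disable();
        return 0;
    }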

Suggested-by: Vitaly Kuznetsov 
Suggested-by: Paolo Bonzini 
Signed-off-by: Sean Christopherson 
Signed-off-by: Krish Sadhukhan 
---
 arch/x86/kvm/svm/avic.c   |   4 +-
 arch/x86/kvm/svm/nested.c |   2 +-
 arch/x86/kvm/svm/sev.c    |   6 +-
 arch/x86/kvm/svm/svm.c    | 218 +++---
 arch/x86/kvm/svm/svm.h    |   8 +-
 5 files changed, 120 insertions(+), 118 deletions(-)

diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index e80daa9..619391e 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -579,7 +579,7 @@ int avic_init_vcpu(struct vcpu_svm *svm)
return ret;
 }
 
-void avic_post_state_restore(struct kvm_vcpu *vcpu)
+void svm_avic_post_state_restore(struct kvm_vcpu *vcpu)
 {
if (avic_handle_apic_id_update(vcpu) != 0)
return;
@@ -660,7 +660,7 @@ void svm_refresh_apicv_exec_ctrl(struct kvm_vcpu *vcpu)
 * we need to check and update the AVIC logical APIC ID table
 * accordingly before re-activating.
 */
-   avic_post_state_restore(vcpu);
+   svm_avic_post_state_restore(vcpu);
vmcb->control.int_ctl |= AVIC_ENABLE_MASK;
} else {
vmcb->control.int_ctl &= ~AVIC_ENABLE_MASK;
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 6bceafb..3be6256 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -348,7 +348,7 @@ static void nested_prepare_vmcb_control(struct vcpu_svm *svm)
/* Guest paging mode is active - reset mmu */
kvm_mmu_reset_context(>vcpu);
 
-   svm_flush_tlb(>vcpu);
+   svm_tlb_flush(>vcpu);
 
svm->vmcb->control.tsc_offset = svm->vcpu.arch.tsc_offset =
svm->vcpu.arch.l1_tsc_offset + svm->nested.ctl.tsc_offset;
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 5573a97..1ca9f60 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -969,7 +969,7 @@ int svm_mem_enc_op(struct kvm *kvm, void __user *argp)
return r;
 }
 
-int svm_register_enc_region(struct kvm *kvm,
+int svm_mem_enc_register_region(struct kvm *kvm,
struct kvm_enc_region *range)
 {
struct kvm_sev_info *sev = _kvm_svm(kvm)->sev_info;
@@ -1038,8 +1038,8 @@ static void __unregister_enc_region_locked(struct kvm *kvm,
kfree(region);
 }
 
-int svm_unregister_enc_region(struct kvm *kvm,
- struct kvm_enc_region *range)
+int svm_mem_enc_unregister_region(struct kvm *kvm,
+ struct kvm_enc_region *range)
 {
struct enc_region *region;
int ret;
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 24755eb..d63181e 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -254,7 +254,7 @@ static inline void invlpga(unsigned long addr, u32 asid)
asm volatile (__ex("invlpga %1, %0") : : "c"(asid), "a"(addr));
 }
 
-static int get_npt_level(struct kvm_vcpu *vcpu)
+static int svm_get_tdp_level(struct kvm_vcpu *vcpu)
 {
 #ifdef CONFIG_X86_64
return PT64_ROOT_4LEVEL;
@@ -312,7 +312,7 @@ static void svm_set_interrupt_shadow(struct kvm_vcpu *vcpu, int mask)
 
 }
 
-static int skip_emulated_instruction(struct kvm_vcpu *vcpu)
+static int svm_skip_emulated_instruction(struct kvm_vcpu *vcpu)
 {
struct vcpu_svm *svm = to_svm(vcpu);
 
@@ -351,7 +351,7 @@ static void svm_queue_exception(struct kvm_vcpu *vcpu)
 * raises a fault that is not intercepted. Still better than
 * failing in all cases.
 */
-   (void)skip_emulated_instruction(>vcpu);
+   (void)svm_skip_emulated_instruction(>vcpu);
rip = kvm_rip_read(>vcpu);
svm->int3_rip = rip + svm->vmcb->save.cs.base;
svm->int3_injected = rip - old_rip;
@@ -1153,7 +1153,7 @@ static void svm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
avic_update_vapic_bar(svm, APIC_DEFAULT_PHYS_BASE);
 }
 
-static int svm_create_vcpu(struct kvm_vcpu *vcpu)
+static int svm_vcpu_create(struct kvm_vcpu *vcpu)
 {
struct vcpu_svm *svm;
struct page *page;
@@ -1232,7 +1232,7 @@ static void svm_clear_current_vmcb(struct vmcb *vmcb)
 	cmpxchg(&per_cpu(svm_data, i)->current_vmcb, vmcb, NULL);
 }
 
-static void svm_free_vcpu(struct kvm_vcpu *vcpu)
+static void svm_vcpu_free(struct kvm_vcpu *vcpu)
 {
struct vcpu_svm *svm = to_svm(vcpu);
 
@@ -1585,7 +1585,7 @@ int svm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
return 1;
 
if (npt_enabled && ((old_cr4 ^ cr4) & X86_CR4_PGE))
-   svm_flush_tlb(vcpu);
+ 

[PATCH 4/6 v3] KVM: VMX: Fill in conforming vmx_x86_ops via macro

2020-07-27 Thread Krish Sadhukhan
The names of some of the vmx_x86_ops functions do not have a corresponding
'vmx_' prefix. Generate the names using a macro so that the names are
conformant. Fixing the naming improves readability and eases maintenance
of the code.

Suggested-by: Vitaly Kuznetsov 
Suggested-by: Paolo Bonzini 
Signed-off-by: Sean Christopherson 
Signed-off-by: Krish Sadhukhan 
---
 arch/x86/kvm/vmx/nested.c |   2 +-
 arch/x86/kvm/vmx/vmx.c    | 234 +++---
 arch/x86/kvm/vmx/vmx.h    |   2 +-
 3 files changed, 120 insertions(+), 118 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index d1af20b..a898b53 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -3016,7 +3016,7 @@ static int nested_vmx_check_vmentry_hw(struct kvm_vcpu *vcpu)
 
preempt_disable();
 
-   vmx_prepare_switch_to_guest(vcpu);
+   vmx_prepare_guest_switch(vcpu);
 
/*
 * Induce a consistency check VMExit by clearing bit 1 in GUEST_RFLAGS,
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 90d91524..f6a6674 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1125,7 +1125,7 @@ void vmx_set_host_fs_gs(struct vmcs_host_state *host, u16 fs_sel, u16 gs_sel,
}
 }
 
-void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
+void vmx_prepare_guest_switch(struct kvm_vcpu *vcpu)
 {
struct vcpu_vmx *vmx = to_vmx(vcpu);
struct vmcs_host_state *host_state;
@@ -2317,7 +2317,7 @@ static int kvm_cpu_vmxon(u64 vmxon_pointer)
return -EFAULT;
 }
 
-static int hardware_enable(void)
+static int vmx_hardware_enable(void)
 {
int cpu = raw_smp_processor_id();
u64 phys_addr = __pa(per_cpu(vmxarea, cpu));
@@ -2366,7 +2366,7 @@ static void kvm_cpu_vmxoff(void)
cr4_clear_bits(X86_CR4_VMXE);
 }
 
-static void hardware_disable(void)
+static void vmx_hardware_disable(void)
 {
vmclear_local_loaded_vmcss();
kvm_cpu_vmxoff();
@@ -2911,7 +2911,7 @@ static void exit_lmode(struct kvm_vcpu *vcpu)
 
 #endif
 
-static void vmx_flush_tlb_all(struct kvm_vcpu *vcpu)
+static void vmx_tlb_flush_all(struct kvm_vcpu *vcpu)
 {
struct vcpu_vmx *vmx = to_vmx(vcpu);
 
@@ -2934,7 +2934,7 @@ static void vmx_flush_tlb_all(struct kvm_vcpu *vcpu)
}
 }
 
-static void vmx_flush_tlb_current(struct kvm_vcpu *vcpu)
+static void vmx_tlb_flush_current(struct kvm_vcpu *vcpu)
 {
u64 root_hpa = vcpu->arch.mmu->root_hpa;
 
@@ -2950,16 +2950,16 @@ static void vmx_flush_tlb_current(struct kvm_vcpu *vcpu)
vpid_sync_context(nested_get_vpid02(vcpu));
 }
 
-static void vmx_flush_tlb_gva(struct kvm_vcpu *vcpu, gva_t addr)
+static void vmx_tlb_flush_gva(struct kvm_vcpu *vcpu, gva_t addr)
 {
/*
 * vpid_sync_vcpu_addr() is a nop if vmx->vpid==0, see the comment in
-* vmx_flush_tlb_guest() for an explanation of why this is ok.
+* vmx_tlb_flush_guest() for an explanation of why this is ok.
 */
vpid_sync_vcpu_addr(to_vmx(vcpu)->vpid, addr);
 }
 
-static void vmx_flush_tlb_guest(struct kvm_vcpu *vcpu)
+static void vmx_tlb_flush_guest(struct kvm_vcpu *vcpu)
 {
/*
 * vpid_sync_context() is a nop if vmx->vpid==0, e.g. if enable_vpid==0
@@ -4455,16 +4455,16 @@ static void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
vmx_clear_hlt(vcpu);
 }
 
-static void enable_irq_window(struct kvm_vcpu *vcpu)
+static void vmx_enable_irq_window(struct kvm_vcpu *vcpu)
 {
exec_controls_setbit(to_vmx(vcpu), CPU_BASED_INTR_WINDOW_EXITING);
 }
 
-static void enable_nmi_window(struct kvm_vcpu *vcpu)
+static void vmx_enable_nmi_window(struct kvm_vcpu *vcpu)
 {
if (!enable_vnmi ||
vmcs_read32(GUEST_INTERRUPTIBILITY_INFO) & GUEST_INTR_STATE_STI) {
-   enable_irq_window(vcpu);
+   vmx_enable_irq_window(vcpu);
return;
}
 
@@ -6173,7 +6173,7 @@ static void vmx_l1d_flush(struct kvm_vcpu *vcpu)
: "eax", "ebx", "ecx", "edx");
 }
 
-static void update_cr8_intercept(struct kvm_vcpu *vcpu, int tpr, int irr)
+static void vmx_update_cr8_intercept(struct kvm_vcpu *vcpu, int tpr, int irr)
 {
struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
int tpr_threshold;
@@ -6261,7 +6261,7 @@ static void vmx_set_apic_access_page_addr(struct kvm_vcpu *vcpu)
return;
 
vmcs_write64(APIC_ACCESS_ADDR, page_to_phys(page));
-   vmx_flush_tlb_current(vcpu);
+   vmx_tlb_flush_current(vcpu);
 
/*
 * Do not pin apic access page in memory, the MMU notifier
@@ -6837,7 +6837,7 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
return exit_fastpath;
 }
 
-static void vmx_free_vcpu(struct kvm_vcpu *vcpu)
+static void vmx_vcpu_free(struct kvm_vcpu *vcpu)
 {
struct vcpu_vmx *vmx = to_vmx(vcpu);
 
@@ -6848,7 +6848,7 @@ static void vmx_free_vcpu(struct kvm_vcpu *vcpu)

[PATCH 6/6 v3] QEMU: x86: Change KVM_MEMORY_ENCRYPT_* #defines to make them conformant to the kernel

2020-07-27 Thread Krish Sadhukhan
Suggested-by: Vitaly Kuznetsov 
Suggested-by: Paolo Bonzini 
Signed-off-by: Sean Christopherson 
Signed-off-by: Krish Sadhukhan 
---
 target/i386/sev.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/target/i386/sev.c b/target/i386/sev.c
index c3ecf86..0913782 100644
--- a/target/i386/sev.c
+++ b/target/i386/sev.c
@@ -113,7 +113,7 @@ sev_ioctl(int fd, int cmd, void *data, int *error)
 input.sev_fd = fd;
 input.data = (__u64)(unsigned long)data;
 
-    r = kvm_vm_ioctl(kvm_state, KVM_MEMORY_ENCRYPT_OP, &input);
+    r = kvm_vm_ioctl(kvm_state, KVM_MEM_ENC_OP, &input);
 
 if (error) {
 *error = input.error;
@@ -187,7 +187,7 @@ sev_ram_block_added(RAMBlockNotifier *n, void *host, size_t size)
 range.size = size;
 
 trace_kvm_memcrypt_register_region(host, size);
-    r = kvm_vm_ioctl(kvm_state, KVM_MEMORY_ENCRYPT_REG_REGION, &range);
+    r = kvm_vm_ioctl(kvm_state, KVM_MEM_ENC_REGISTER_REGION, &range);
 if (r) {
 error_report("%s: failed to register region (%p+%#zx) error '%s'",
  __func__, host, size, strerror(errno));
@@ -216,7 +216,7 @@ sev_ram_block_removed(RAMBlockNotifier *n, void *host, size_t size)
 range.size = size;
 
 trace_kvm_memcrypt_unregister_region(host, size);
-    r = kvm_vm_ioctl(kvm_state, KVM_MEMORY_ENCRYPT_UNREG_REGION, &range);
+    r = kvm_vm_ioctl(kvm_state, KVM_MEM_ENC_UNREGISTER_REGION, &range);
 if (r) {
 error_report("%s: failed to unregister region (%p+%#zx)",
  __func__, host, size);
@@ -454,7 +454,7 @@ sev_get_capabilities(Error **errp)
 error_setg(errp, "KVM not enabled");
 return NULL;
 }
-if (kvm_vm_ioctl(kvm_state, KVM_MEMORY_ENCRYPT_OP, NULL) < 0) {
+if (kvm_vm_ioctl(kvm_state, KVM_MEM_ENC_OP, NULL) < 0) {
 error_setg(errp, "SEV is not enabled in KVM");
 return NULL;
 }
-- 
1.8.3.1




Re: [PATCH] docs/nvdimm: add 'pmem=on' for the device dax backend file

2020-07-27 Thread Liu, Jingqi

Hi Paolo,

Any comments for this patch ?

Thanks,

Jingqi

On 7/15/2020 10:54 AM, Liu, Jingqi wrote:

At the end of live migration, QEMU uses msync() to flush the data to
the backend storage. When the backend file is a character device DAX,
the pages explicitly bypass the page cache, so msync() returns failure
and the following warning is output:

 "warning: qemu_ram_msync: failed to sync memory range"

So we add 'pmem=on' to avoid calling msync(); use the QEMU command line:

 -object memory-backend-file,id=mem1,pmem=on,mem-path=/dev/dax0.0,size=4G

Signed-off-by: Jingqi Liu 
---
  docs/nvdimm.txt | 7 +++
  1 file changed, 7 insertions(+)

diff --git a/docs/nvdimm.txt b/docs/nvdimm.txt
index c2c6e441b3..31048aff5e 100644
--- a/docs/nvdimm.txt
+++ b/docs/nvdimm.txt
@@ -243,6 +243,13 @@ use the QEMU command line:
  
  -object memory-backend-file,id=nv_mem,mem-path=/XXX/yyy,size=4G,pmem=on
  
+At the end of live migration, QEMU uses msync() to flush the data to the
+backend storage. When the backend file is a character device dax, the pages
+explicitly avoid the page cache. It will return failure from msync().
+So we add 'pmem=on' to avoid calling msync(), use the QEMU command line:
+
+-object memory-backend-file,id=mem1,pmem=on,mem-path=/dev/dax0.0,size=4G
+
  References
  ----------
  




[PATCH 1/6 v3] KVM: x86: Change names of some of the kvm_x86_ops functions to make them more semantical and readable

2020-07-27 Thread Krish Sadhukhan
Suggested-by: Vitaly Kuznetsov 
Suggested-by: Paolo Bonzini 
Signed-off-by: Sean Christopherson 
Signed-off-by: Krish Sadhukhan 
---
 arch/arm64/include/asm/kvm_host.h   |  2 +-
 arch/mips/include/asm/kvm_host.h|  2 +-
 arch/powerpc/include/asm/kvm_host.h |  2 +-
 arch/s390/kvm/kvm-s390.c|  2 +-
 arch/x86/include/asm/kvm_host.h | 12 ++--
 arch/x86/kvm/svm/svm.c  | 12 ++--
 arch/x86/kvm/vmx/vmx.c  |  8 
 arch/x86/kvm/x86.c  | 28 ++--
 include/linux/kvm_host.h|  2 +-
 include/uapi/linux/kvm.h|  6 +++---
 tools/include/uapi/linux/kvm.h  |  6 +++---
 virt/kvm/kvm_main.c |  4 ++--
 12 files changed, 43 insertions(+), 43 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h 
b/arch/arm64/include/asm/kvm_host.h
index c3e6fcc6..f5be4fa 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -545,7 +545,7 @@ static inline bool kvm_arch_requires_vhe(void)
 
 void kvm_arm_vcpu_ptrauth_trap(struct kvm_vcpu *vcpu);
 
-static inline void kvm_arch_hardware_unsetup(void) {}
+static inline void kvm_arch_hardware_teardown(void) {}
 static inline void kvm_arch_sync_events(struct kvm *kvm) {}
 static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
 static inline void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu) {}
diff --git a/arch/mips/include/asm/kvm_host.h b/arch/mips/include/asm/kvm_host.h
index 363e7a89..95cea05 100644
--- a/arch/mips/include/asm/kvm_host.h
+++ b/arch/mips/include/asm/kvm_host.h
@@ -1178,7 +1178,7 @@ extern int kvm_mips_trans_mtc0(union mips_instruction 
inst, u32 *opc,
 extern int kvm_vcpu_ioctl_interrupt(struct kvm_vcpu *vcpu,
 struct kvm_mips_interrupt *irq);
 
-static inline void kvm_arch_hardware_unsetup(void) {}
+static inline void kvm_arch_hardware_teardown(void) {}
 static inline void kvm_arch_sync_events(struct kvm *kvm) {}
 static inline void kvm_arch_free_memslot(struct kvm *kvm,
 struct kvm_memory_slot *slot) {}
diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 7e2d061..892b0e2 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -856,7 +856,7 @@ struct kvm_vcpu_arch {
 #define __KVM_HAVE_CREATE_DEVICE
 
 static inline void kvm_arch_hardware_disable(void) {}
-static inline void kvm_arch_hardware_unsetup(void) {}
+static inline void kvm_arch_hardware_teardown(void) {}
 static inline void kvm_arch_sync_events(struct kvm *kvm) {}
 static inline void kvm_arch_memslots_updated(struct kvm *kvm, u64 gen) {}
 static inline void kvm_arch_flush_shadow_all(struct kvm *kvm) {}
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index d47c197..5c9 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -312,7 +312,7 @@ int kvm_arch_hardware_setup(void *opaque)
return 0;
 }
 
-void kvm_arch_hardware_unsetup(void)
+void kvm_arch_hardware_teardown(void)
 {
gmap_unregister_pte_notifier(&gmap_notifier);
gmap_unregister_pte_notifier(&vsie_gmap_notifier);
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index be5363b..ccad66d 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1080,7 +1080,7 @@ static inline u16 kvm_lapic_irq_dest_mode(bool 
dest_mode_logical)
 struct kvm_x86_ops {
int (*hardware_enable)(void);
void (*hardware_disable)(void);
-   void (*hardware_unsetup)(void);
+   void (*hardware_teardown)(void);
bool (*cpu_has_accelerated_tpr)(void);
bool (*has_emulated_msr)(u32 index);
void (*cpuid_update)(struct kvm_vcpu *vcpu);
@@ -1141,7 +1141,7 @@ struct kvm_x86_ops {
 */
void (*tlb_flush_guest)(struct kvm_vcpu *vcpu);
 
-   enum exit_fastpath_completion (*run)(struct kvm_vcpu *vcpu);
+   enum exit_fastpath_completion (*vcpu_run)(struct kvm_vcpu *vcpu);
int (*handle_exit)(struct kvm_vcpu *vcpu,
enum exit_fastpath_completion exit_fastpath);
int (*skip_emulated_instruction)(struct kvm_vcpu *vcpu);
@@ -1150,8 +1150,8 @@ struct kvm_x86_ops {
u32 (*get_interrupt_shadow)(struct kvm_vcpu *vcpu);
void (*patch_hypercall)(struct kvm_vcpu *vcpu,
unsigned char *hypercall_addr);
-   void (*set_irq)(struct kvm_vcpu *vcpu);
-   void (*set_nmi)(struct kvm_vcpu *vcpu);
+   void (*inject_irq)(struct kvm_vcpu *vcpu);
+   void (*inject_nmi)(struct kvm_vcpu *vcpu);
void (*queue_exception)(struct kvm_vcpu *vcpu);
void (*cancel_injection)(struct kvm_vcpu *vcpu);
int (*interrupt_allowed)(struct kvm_vcpu *vcpu, bool for_injection);
@@ -1258,8 +1258,8 @@ struct kvm_x86_ops {
void (*enable_smi_window)(struct kvm_vcpu *vcpu);
 
int (*mem_enc_op)(struct kvm 

[PATCH 3/6 v3] KVM: nSVM: Fill in conforming svm_nested_ops via macro

2020-07-27 Thread Krish Sadhukhan
The names of the svm_nested_ops functions do not have a corresponding
'nested_svm_' prefix. Generate the names using a macro so that the names are
conformant. Fixing the naming improves the readability and maintainability
of the code.
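
For illustration, a compilable sketch (not part of the patch, with
simplified signatures) of how the token-pasting initializer keeps the op
name and the function name in sync:

    #include <stdio.h>

    struct nested_ops {
        int (*check_events)(void);
    };

    static int nested_svm_check_events(void)
    {
        puts("nested_svm_check_events called");
        return 0;
    }

    /* Same pattern as the patch: KVM_X86_NESTED_OP(check_events)
     * expands to .check_events = nested_svm_check_events */
    #define KVM_X86_NESTED_OP(name) .name = nested_svm_##name

    static struct nested_ops ops = {
        KVM_X86_NESTED_OP(check_events),
    };

    int main(void)
    {
        return ops.check_events();
    }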

Suggested-by: Vitaly Kuznetsov 
Suggested-by: Paolo Bonzini 
Signed-off-by: Sean Christopherson 
Signed-off-by: Krish Sadhukhan 
---
 arch/x86/kvm/svm/nested.c | 16 +---
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 3be6256..7cb834a 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -718,7 +718,7 @@ static int nested_svm_intercept(struct vcpu_svm *svm)
/*
 * Host-intercepted exceptions have been checked already in
 * nested_svm_exit_special.  There is nothing to do here,
-* the vmexit is injected by svm_check_nested_events.
+* the vmexit is injected by nested_svm_check_events().
 */
vmexit = NESTED_EXIT_DONE;
break;
@@ -850,7 +850,7 @@ static void nested_svm_init(struct vcpu_svm *svm)
 }
 
 
-static int svm_check_nested_events(struct kvm_vcpu *vcpu)
+static int nested_svm_check_events(struct kvm_vcpu *vcpu)
 {
struct vcpu_svm *svm = to_svm(vcpu);
bool block_nested_events =
@@ -933,7 +933,7 @@ int nested_svm_exit_special(struct vcpu_svm *svm)
return NESTED_EXIT_CONTINUE;
 }
 
-static int svm_get_nested_state(struct kvm_vcpu *vcpu,
+static int nested_svm_get_state(struct kvm_vcpu *vcpu,
struct kvm_nested_state __user 
*user_kvm_nested_state,
u32 user_data_size)
 {
@@ -990,7 +990,7 @@ static int svm_get_nested_state(struct kvm_vcpu *vcpu,
return kvm_state.size;
 }
 
-static int svm_set_nested_state(struct kvm_vcpu *vcpu,
+static int nested_svm_set_state(struct kvm_vcpu *vcpu,
struct kvm_nested_state __user 
*user_kvm_nested_state,
struct kvm_nested_state *kvm_state)
 {
@@ -1075,8 +1075,10 @@ static int svm_set_nested_state(struct kvm_vcpu *vcpu,
return 0;
 }
 
+#define KVM_X86_NESTED_OP(name) .name = nested_svm_##name
+
 struct kvm_x86_nested_ops svm_nested_ops = {
-   .check_events = svm_check_nested_events,
-   .get_state = svm_get_nested_state,
-   .set_state = svm_set_nested_state,
+   KVM_X86_NESTED_OP(check_events),
+   KVM_X86_NESTED_OP(get_state),
+   KVM_X86_NESTED_OP(set_state),
 };
-- 
1.8.3.1




[PATCH 0/6 v3] KVM: x86: Fill in conforming {vmx|svm}_x86_ops and {vmx|svm}_nested_ops via macros

2020-07-27 Thread Krish Sadhukhan
v2 -> v3:
1. kvm_arch_hardware_unsetup() is changed to
   kvm_arch_hardware_teardown() on non-x86 arches as well.

2. The following #defines

KVM_MEMORY_ENCRYPT_OP
KVM_MEMORY_ENCRYPT_REG_REGION
KVM_MEMORY_ENCRYPT_UNREG_REGION

   have been changed to:

KVM_MEM_ENC_OP
KVM_MEM_ENC_REGISTER_REGION
KVM_MEM_ENC_UNREGISTER_REGION

3. Patch# 6 is new. It changes the KVM_MEMORY_ENCRYPT_* #defines in
   QEMU to make them conformant to those in the kernel.


[PATCH 1/6 v3] KVM: x86: Change names of some of the kvm_x86_ops
[PATCH 2/6 v3] KVM: SVM: Fill in conforming svm_x86_ops via macro
[PATCH 3/6 v3] KVM: nSVM: Fill in conforming svm_nested_ops via macro
[PATCH 4/6 v3] KVM: VMX: Fill in conforming vmx_x86_ops via macro
[PATCH 5/6 v3] KVM: nVMX: Fill in conforming vmx_nested_ops via macro
[PATCH 6/6 v3] QEMU: x86: Change KVM_MEMORY_ENCRYPT_*  #defines to make them

 arch/arm64/include/asm/kvm_host.h   |   2 +-
 arch/mips/include/asm/kvm_host.h|   2 +-
 arch/powerpc/include/asm/kvm_host.h |   2 +-
 arch/s390/kvm/kvm-s390.c|   2 +-
 arch/x86/include/asm/kvm_host.h |  12 +-
 arch/x86/kvm/svm/avic.c |   4 +-
 arch/x86/kvm/svm/nested.c   |  18 +--
 arch/x86/kvm/svm/sev.c  |   6 +-
 arch/x86/kvm/svm/svm.c  | 218 +
 arch/x86/kvm/svm/svm.h  |   8 +-
 arch/x86/kvm/vmx/nested.c   |  26 ++--
 arch/x86/kvm/vmx/nested.h   |   2 +-
 arch/x86/kvm/vmx/vmx.c  | 238 ++--
 arch/x86/kvm/vmx/vmx.h  |   2 +-
 arch/x86/kvm/x86.c  |  28 ++---
 include/linux/kvm_host.h|   2 +-
 include/uapi/linux/kvm.h|   6 +-
 tools/include/uapi/linux/kvm.h  |   6 +-
 virt/kvm/kvm_main.c |   4 +-
 19 files changed, 298 insertions(+), 290 deletions(-)

Krish Sadhukhan (5):
  KVM: x86: Change names of some of the kvm_x86_ops functions to make them m
  KVM: SVM: Fill in conforming svm_x86_ops via macro
  KVM: nSVM: Fill in conforming svm_nested_ops via macro
  KVM: VMX: Fill in conforming vmx_x86_ops via macro
  KVM: nVMX: Fill in conforming vmx_nested_ops via macro

 target/i386/sev.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

Krish Sadhukhan (1):
  QEMU: x86: Change KVM_MEMORY_ENCRYPT_*  #defines to make them conformant t



[PATCH 5/6 v3] KVM: nVMX: Fill in conforming vmx_nested_ops via macro

2020-07-27 Thread Krish Sadhukhan
The names of some of the vmx_nested_ops functions do not have a corresponding
'nested_vmx_' prefix. Generate the names using a macro so that the names are
conformant. Fixing the naming improves the readability and maintainability
of the code.

Suggested-by: Vitaly Kuznetsov 
Suggested-by: Paolo Bonzini 
Signed-off-by: Sean Christopherson 
Signed-off-by: Krish Sadhukhan 
---
 arch/x86/kvm/vmx/nested.c | 24 +---
 arch/x86/kvm/vmx/nested.h |  2 +-
 arch/x86/kvm/vmx/vmx.c|  4 ++--
 3 files changed, 16 insertions(+), 14 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index a898b53..fc09bb0 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -3105,7 +3105,7 @@ static int nested_vmx_check_vmentry_hw(struct kvm_vcpu 
*vcpu)
return 0;
 }
 
-static bool nested_get_vmcs12_pages(struct kvm_vcpu *vcpu)
+static bool nested_vmx_get_vmcs12_pages(struct kvm_vcpu *vcpu)
 {
struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
struct vcpu_vmx *vmx = to_vmx(vcpu);
@@ -3295,7 +3295,7 @@ enum nvmx_vmentry_status 
nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu,
prepare_vmcs02_early(vmx, vmcs12);
 
if (from_vmentry) {
-   if (unlikely(!nested_get_vmcs12_pages(vcpu)))
+   if (unlikely(!nested_vmx_get_vmcs12_pages(vcpu)))
return NVMX_VMENTRY_KVM_INTERNAL_ERROR;
 
if (nested_vmx_check_vmentry_hw(vcpu)) {
@@ -3711,7 +3711,7 @@ static bool nested_vmx_preemption_timer_pending(struct 
kvm_vcpu *vcpu)
   to_vmx(vcpu)->nested.preemption_timer_expired;
 }
 
-static int vmx_check_nested_events(struct kvm_vcpu *vcpu)
+static int nested_vmx_check_events(struct kvm_vcpu *vcpu)
 {
struct vcpu_vmx *vmx = to_vmx(vcpu);
unsigned long exit_qual;
@@ -5907,7 +5907,7 @@ bool nested_vmx_reflect_vmexit(struct kvm_vcpu *vcpu)
return true;
 }
 
-static int vmx_get_nested_state(struct kvm_vcpu *vcpu,
+static int nested_vmx_get_state(struct kvm_vcpu *vcpu,
struct kvm_nested_state __user 
*user_kvm_nested_state,
u32 user_data_size)
 {
@@ -6031,7 +6031,7 @@ void vmx_leave_nested(struct kvm_vcpu *vcpu)
free_nested(vcpu);
 }
 
-static int vmx_set_nested_state(struct kvm_vcpu *vcpu,
+static int nested_vmx_set_state(struct kvm_vcpu *vcpu,
struct kvm_nested_state __user 
*user_kvm_nested_state,
struct kvm_nested_state *kvm_state)
 {
@@ -6448,7 +6448,7 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs 
*msrs, u32 ept_caps)
msrs->vmcs_enum = VMCS12_MAX_FIELD_INDEX << 1;
 }
 
-void nested_vmx_hardware_unsetup(void)
+void nested_vmx_hardware_teardown(void)
 {
int i;
 
@@ -6473,7 +6473,7 @@ __init int nested_vmx_hardware_setup(int 
(*exit_handlers[])(struct kvm_vcpu *))
vmx_bitmap[i] = (unsigned long *)
__get_free_page(GFP_KERNEL);
if (!vmx_bitmap[i]) {
-   nested_vmx_hardware_unsetup();
+   nested_vmx_hardware_teardown();
return -ENOMEM;
}
}
@@ -6497,12 +6497,14 @@ __init int nested_vmx_hardware_setup(int 
(*exit_handlers[])(struct kvm_vcpu *))
return 0;
 }
 
+#define KVM_X86_NESTED_OP(name) .name = nested_vmx_##name
+
 struct kvm_x86_nested_ops vmx_nested_ops = {
-   .check_events = vmx_check_nested_events,
+   KVM_X86_NESTED_OP(check_events),
.hv_timer_pending = nested_vmx_preemption_timer_pending,
-   .get_state = vmx_get_nested_state,
-   .set_state = vmx_set_nested_state,
-   .get_vmcs12_pages = nested_get_vmcs12_pages,
+   KVM_X86_NESTED_OP(get_state),
+   KVM_X86_NESTED_OP(set_state),
+   KVM_X86_NESTED_OP(get_vmcs12_pages),
.enable_evmcs = nested_enable_evmcs,
.get_evmcs_version = nested_get_evmcs_version,
 };
diff --git a/arch/x86/kvm/vmx/nested.h b/arch/x86/kvm/vmx/nested.h
index 758bccc..ac6b561 100644
--- a/arch/x86/kvm/vmx/nested.h
+++ b/arch/x86/kvm/vmx/nested.h
@@ -18,7 +18,7 @@ enum nvmx_vmentry_status {
 
 void vmx_leave_nested(struct kvm_vcpu *vcpu);
 void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps);
-void nested_vmx_hardware_unsetup(void);
+void nested_vmx_hardware_teardown(void);
 __init int nested_vmx_hardware_setup(int (*exit_handlers[])(struct kvm_vcpu 
*));
 void nested_vmx_set_vmcs_shadowing_bitmap(void);
 void nested_vmx_free_vcpu(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index f6a6674..6512e6e 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7830,7 +7830,7 @@ static void vmx_migrate_timers(struct kvm_vcpu *vcpu)
 static void vmx_hardware_teardown(void)
 {
if (nested)
-   

RE: [PATCH v2 3/3] virtiofsd: probe unshare(CLONE_FS) and print an error

2020-07-27 Thread misono.tomoh...@fujitsu.com
> Subject: [PATCH v2 3/3] virtiofsd: probe unshare(CLONE_FS) and print an error
> 
> An assertion failure is raised during request processing if
> unshare(CLONE_FS) fails. Implement a probe at startup so the problem can
> be detected right away.
> 
> Unfortunately Docker/Moby does not include unshare in the seccomp.json
> list unless CAP_SYS_ADMIN is given. Other seccomp.json lists always
> include unshare (e.g. podman is unaffected):
> https://raw.githubusercontent.com/seccomp/containers-golang/master/seccomp.json
> 
> Use "docker run --security-opt seccomp=path/to/seccomp.json ..." if the
> default seccomp.json is missing unshare.

Hi, sorry for being a bit late.

unshare() was added to fix an xattr problem: 
  https://github.com/qemu/qemu/commit/bdfd66788349acc43cd3f1298718ad491663cfcc#
In theory we don't need to call unshare() if xattr is disabled, but it is
hard to know whether xattr is enabled or disabled in fv_queue_worker(), right?

So, it looks good to me.
Reviewed-by: Misono Tomohiro 

Regards,
Misono

> 
> Cc: Misono Tomohiro 
> Signed-off-by: Stefan Hajnoczi 
> ---
>  tools/virtiofsd/fuse_virtio.c | 16 
>  1 file changed, 16 insertions(+)
> 
> diff --git a/tools/virtiofsd/fuse_virtio.c b/tools/virtiofsd/fuse_virtio.c
> index 3b6d16a041..9e5537506c 100644
> --- a/tools/virtiofsd/fuse_virtio.c
> +++ b/tools/virtiofsd/fuse_virtio.c
> @@ -949,6 +949,22 @@ int virtio_session_mount(struct fuse_session *se)
>  {
>  int ret;
> 
> +/*
> + * Test that unshare(CLONE_FS) works. fv_queue_worker() will need it. 
> It's
> + * an unprivileged system call but some Docker/Moby versions are known to
> + * reject it via seccomp when CAP_SYS_ADMIN is not given.
> + *
> + * Note that the program is single-threaded here so this syscall has no
> + * visible effect and is safe to make.
> + */
> +ret = unshare(CLONE_FS);
> +if (ret == -1 && errno == EPERM) {
> +fuse_log(FUSE_LOG_ERR, "unshare(CLONE_FS) failed with EPERM. If "
> +"running in a container please check that the container "
> +"runtime seccomp policy allows unshare.\n");
> +return -1;
> +}
> +
>  ret = fv_create_listen_socket(se);
>  if (ret < 0) {
>  return ret;
> --
> 2.26.2
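
A standalone sketch of the same probe (assumes Linux with glibc; this is
not the virtiofsd code itself, just the pattern it relies on):

    #define _GNU_SOURCE
    #include <errno.h>
    #include <sched.h>
    #include <stdio.h>

    int main(void)
    {
        /* unshare(CLONE_FS) is an unprivileged syscall, so EPERM here
         * almost always means a seccomp policy rejected it. */
        if (unshare(CLONE_FS) == -1 && errno == EPERM) {
            fprintf(stderr, "unshare(CLONE_FS) blocked by seccomp\n");
            return 1;
        }
        return 0;
    }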




RE: [PATCH v2 1/3] hw/i386: Initialize topo_ids from CpuInstanceProperties

2020-07-27 Thread Babu Moger



> -Original Message-
> From: Igor Mammedov 
> Sent: Monday, July 27, 2020 12:14 PM
> To: Moger, Babu 
> Cc: qemu-devel@nongnu.org; pbonz...@redhat.com; ehabk...@redhat.com;
> r...@twiddle.net
> Subject: Re: [PATCH v2 1/3] hw/i386: Initialize topo_ids from
> CpuInstanceProperties
> 
> On Mon, 27 Jul 2020 10:49:08 -0500
> Babu Moger  wrote:
> 
> > > -Original Message-
> > > From: Igor Mammedov 
> > > Sent: Friday, July 24, 2020 12:05 PM
> > > To: Moger, Babu 
> > > Cc: qemu-devel@nongnu.org; pbonz...@redhat.com;
> ehabk...@redhat.com;
> > > r...@twiddle.net
> > > Subject: Re: [PATCH v2 1/3] hw/i386: Initialize topo_ids from
> > > CpuInstanceProperties
> > >
> > > On Mon, 13 Jul 2020 14:30:29 -0500
> > > Babu Moger  wrote:
> > >
> > > > > -Original Message-
> > > > > From: Igor Mammedov 
> > > > > Sent: Monday, July 13, 2020 12:32 PM
> > > > > To: Moger, Babu 
> > > > > Cc: pbonz...@redhat.com; r...@twiddle.net; ehabk...@redhat.com;
> > > > > qemu- de...@nongnu.org
> > > > > Subject: Re: [PATCH v2 1/3] hw/i386: Initialize topo_ids from
> > > > > CpuInstanceProperties
> > > > >
> > > > > On Mon, 13 Jul 2020 11:43:33 -0500 Babu Moger
> > > > >  wrote:
> > > > >
> > > > > > On 7/13/20 11:17 AM, Igor Mammedov wrote:
> > > > > > > On Mon, 13 Jul 2020 10:02:22 -0500 Babu Moger
> > > > > > >  wrote:
> > > > > > >
> > > > > > >>> -Original Message-
> > > > > > >>> From: Igor Mammedov 
> > > > > > >>> Sent: Monday, July 13, 2020 4:08 AM
> > > > > > >>> To: Moger, Babu 
> > > > > > >>> Cc: pbonz...@redhat.com; r...@twiddle.net;
> > > > > > >>> ehabk...@redhat.com;
> > > > > > >>> qemu- de...@nongnu.org
> > > > > > >>> Subject: Re: [PATCH v2 1/3] hw/i386: Initialize topo_ids
> > > > > > >>> from CpuInstanceProperties
> > > > > > > [...]
> > > > > >  +
> > > > > >  +/*
> > > > > >  + * Initialize topo_ids from CpuInstanceProperties
> > > > > >  + * node_id in CpuInstanceProperties(or in CPU device) is a sequential
> > > > > >  + * number, but while building the topology we need to separate it for
> > > > > >  + * each socket(mod nodes_per_pkg).
> > > > > > >>> could you clarify a bit more on why this is necessary?
> > > > > > >>
> > > > > > >> If you have two sockets and 4 numa nodes, node_id in
> > > > > > >> CpuInstanceProperties will be numbered sequentially as 0, 1, 2, 3.
> > > > > > >> But in EPYC topology, it will be 0, 1, 0, 1 (basically node_id mod
> > > > > > >> the number of nodes per socket).
> > > > > > >
> > > > > > > I'm confused, let's suppose we have 2 EPYC sockets with 2
> > > > > > > nodes per socket so APIC id woulbe be composed like:
> > > > > > >
> > > > > > >  1st socket
> > > > > > >pkg_id(0) | node_id(0)
> > > > > > >pkg_id(0) | node_id(1)
> > > > > > >
> > > > > > >  2nd socket
> > > > > > >pkg_id(1) | node_id(0)
> > > > > > >pkg_id(1) | node_id(1)
> > > > > > >
> > > > > > > if that's the case, then EPYC's node_id here doesn't look
> > > > > > > like a NUMA node in the sense it's usually used (above
> > > > > > > config would have 4 different memory controllers => 4 conventional
> > > > > > > NUMA nodes).
> > > > > >
> > > > > > EPIC model uses combination of socket id and node id to
> > > > > > identify the numa nodes. So, it internally uses all the information.
> > > > >
> > > > > well with above values, EPYC's node_id doesn't look like it's
> > > > > specifying a machine numa node, but rather a node index within
> > > > > single socket. In which case, it doesn't make much sense calling
> > > > > it NUMA node_id, it's rather some index within a socket. (it
> > > > > starts looking like terminology is all mixed up)
> > > > >
> > > > > If you have access to a milti-socket EPYC machine, can you dump
> > > > > and post here its apic ids, pls?
> > > >
> > > > Here is the output from my EPYC machine with 2 sockets and totally
> > > > 8 nodes(SMT disabled). The cpus 0-31 are in socket 0 and  cpus
> > > > 32-63 in socket 1.
> > > >
> > > > # lscpu
> > > > Architecture:x86_64
> > > > CPU op-mode(s):  32-bit, 64-bit
> > > > Byte Order:  Little Endian
> > > > CPU(s):  64
> > > > On-line CPU(s) list: 0-63
> > > > Thread(s) per core:  1
> > > > Core(s) per socket:  32
> > > > Socket(s):   2
> > > > NUMA node(s):8
> > > > Vendor ID:   AuthenticAMD
> > > > CPU family:  23
> > > > Model:   1
> > > > Model name:  AMD Eng Sample: 1S1901A4VIHF5_30/19_N
> > > > Stepping:2
> > > > CPU MHz: 2379.233
> > > > CPU max MHz: 1900.0000
> > > > CPU min MHz: 1200.0000
> > > > BogoMIPS:3792.81
> > > > Virtualization:  AMD-V
> > > > L1d cache:   32K
> > > > L1i cache:   64K
> > > > L2 cache:512K
> > > > L3 cache:8192K
> > > > NUMA node0 CPU(s):   0-7
> > > > NUMA node1 CPU(s):   8-15
> > > > NUMA node2 CPU(s):   16-23
> > > > NUMA node3 
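
The folding Babu describes earlier in this thread (a machine-wide
sequential node_id taken mod the number of nodes per socket) can be
sketched as follows; the function name is hypothetical:

    #include <stdio.h>

    /* nodes_per_pkg = NUMA nodes per socket */
    static unsigned pkg_node_index(unsigned node_id, unsigned nodes_per_pkg)
    {
        return node_id % nodes_per_pkg;
    }

    int main(void)
    {
        /* 2 sockets, 4 nodes: sequential 0,1,2,3 folds to 0,1,0,1 */
        for (unsigned n = 0; n < 4; n++) {
            printf("%u -> %u\n", n, pkg_node_index(n, 2));
        }
        return 0;
    }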

Re: [PATCH v2 0/7] target/riscv: NaN-boxing for multiple precison

2020-07-27 Thread Alistair Francis
On Thu, Jul 23, 2020 at 5:28 PM Richard Henderson
 wrote:
>
> This is my take on Liu Zhiwei's patch set:
> https://patchew.org/QEMU/20200626205917.4545-1-zhiwei_...@c-sky.com
>
> This differs from Zhiwei's v1 in:
>
>  * If a helper is involved, the helper does the boxing and unboxing.
>
>  * Which leaves only LDW and FSGN*.S as the only instructions that
>are expanded inline which need to handle nanboxing.
>
>  * All mention of RVD is dropped vs boxing.  This means that an
>RVF-only cpu will still generate and check nanboxes into the
>64-bit cpu_fpu slots.  There should be no way an RVF-only cpu
>can generate an unboxed cpu_fpu value.
>
>This choice is made to speed up the common case: RVF+RVD, so
>that we do not have to check whether RVD is enabled.
>
>  * The translate.c primitives take TCGv values rather than fpu
>regno, which will make it possible to use them with RVV,
>since v0.9 does proper nanboxing.
>
>  * I have adjusted the current naming to be float32 specific ("*_s"),
>to avoid confusion with the float16 data type supported by RVV.

Thanks Richard. As Zhiwei has reviewed all of these I have applied
them to the riscv-to-apply.next tree for 5.2.

Alistair

>
>
> r~
>
>
> LIU Zhiwei (2):
>   target/riscv: Clean up fmv.w.x
>   target/riscv: check before allocating TCG temps
>
> Richard Henderson (5):
>   target/riscv: Generate nanboxed results from fp helpers
>   target/riscv: Generalize gen_nanbox_fpr to gen_nanbox_s
>   target/riscv: Generate nanboxed results from trans_rvf.inc.c
>   target/riscv: Check nanboxed inputs to fp helpers
>   target/riscv: Check nanboxed inputs in trans_rvf.inc.c
>
>  target/riscv/internals.h|  16 
>  target/riscv/fpu_helper.c   | 102 
>  target/riscv/insn_trans/trans_rvd.inc.c |   8 +-
>  target/riscv/insn_trans/trans_rvf.inc.c |  99 ++-
>  target/riscv/translate.c|  29 +++
>  5 files changed, 178 insertions(+), 76 deletions(-)
>
> --
> 2.25.1
>
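
As a sketch of the NaN-boxing convention the cover letter refers to (this
follows the RISC-V spec's rule, not the patches themselves): a 32-bit value
held in a 64-bit FP register slot is valid only if its upper 32 bits are
all ones; anything else reads back as the canonical float32 NaN.

    #include <stdint.h>
    #include <stdio.h>

    static uint64_t nanbox_s(uint32_t f)
    {
        return 0xffffffff00000000ULL | f;
    }

    static uint32_t unbox_s(uint64_t r)
    {
        if ((r & 0xffffffff00000000ULL) == 0xffffffff00000000ULL) {
            return (uint32_t)r;      /* properly boxed float32 */
        }
        return 0x7fc00000;           /* canonical float32 NaN */
    }

    int main(void)
    {
        uint64_t boxed = nanbox_s(0x3f800000);            /* 1.0f */
        printf("%08x\n", unbox_s(boxed));                 /* 3f800000 */
        printf("%08x\n", unbox_s(0x123456789abcdef0ULL)); /* 7fc00000 */
        return 0;
    }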



Re: [PATCH v10] qga: add command guest-get-devices for reporting VirtIO devices

2020-07-27 Thread Michael Roth
Quoting Tomáš Golembiovský (2020-07-21 10:40:41)
> Add a command for reporting devices on a Windows guest. The intent is not so
> much to report the devices but more importantly the driver (and its
> version) that is assigned to the device. This gives the caller the
> information whether VirtIO drivers are installed and/or whether an
> inadequate driver is used on a device (e.g. a QXL device with the base VGA
> driver).
> 
> Example:
> [
> {
>   "driver-date": "2019-08-12",
>   "driver-name": "Red Hat VirtIO SCSI controller",
>   "driver-version": "100.80.104.17300",
>   "address": {
> "type": "pci",
> "data": {
>   "device-id": 4162,
>   "vendor-id": 6900
> }
>   }
> },
> ...
> ]
> 
> Signed-off-by: Tomáš Golembiovský 
> Reviewed-by: Marc-André Lureau 
> Reviewed-by: Philippe Mathieu-Daudé 

Thanks, applied to qga-staging tree for 5.2:
  https://github.com/mdroth/qemu/commits/qga-staging

Sorry for the delays in processing this.

> ---
> 
> Changes in v10:
> - rebased to current master
> - changed `since` tag in schema to 5.2
> 
>  qga/commands-posix.c |   9 ++
>  qga/commands-win32.c | 212 ++-
>  qga/qapi-schema.json |  51 +++
>  3 files changed, 271 insertions(+), 1 deletion(-)
> 
> diff --git a/qga/commands-posix.c b/qga/commands-posix.c
> index 1a62a3a70d..f509a1f525 100644
> --- a/qga/commands-posix.c
> +++ b/qga/commands-posix.c
> @@ -2761,6 +2761,8 @@ GList *ga_command_blacklist_init(GList *blacklist)
>  blacklist = g_list_append(blacklist, g_strdup("guest-fstrim"));
>  #endif
> 
> +blacklist = g_list_append(blacklist, g_strdup("guest-get-devices"));
> +
>  return blacklist;
>  }
> 
> @@ -2981,3 +2983,10 @@ GuestOSInfo *qmp_guest_get_osinfo(Error **errp)
> 
>  return info;
>  }
> +
> +GuestDeviceInfoList *qmp_guest_get_devices(Error **errp)
> +{
> +error_setg(errp, QERR_UNSUPPORTED);
> +
> +return NULL;
> +}
> diff --git a/qga/commands-win32.c b/qga/commands-win32.c
> index aaa71f147b..1302bae9eb 100644
> --- a/qga/commands-win32.c
> +++ b/qga/commands-win32.c
> @@ -21,10 +21,11 @@
>  #ifdef CONFIG_QGA_NTDDSCSI
>  #include 
>  #include 
> +#endif
>  #include 
>  #include 
>  #include 
> -#endif
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -39,6 +40,36 @@
>  #include "qemu/base64.h"
>  #include "commands-common.h"
> 
> +/*
> + * The following should be in devpkey.h, but it isn't. The key names were
> + * prefixed to avoid (future) name clashes. Once the definitions get into
> + * mingw the following lines can be removed.
> + */
> +DEFINE_DEVPROPKEY(qga_DEVPKEY_NAME, 0xb725f130, 0x47ef, 0x101a, 0xa5,
> +0xf1, 0x02, 0x60, 0x8c, 0x9e, 0xeb, 0xac, 10);
> +/* DEVPROP_TYPE_STRING */
> +DEFINE_DEVPROPKEY(qga_DEVPKEY_Device_HardwareIds, 0xa45c254e, 0xdf1c,
> +0x4efd, 0x80, 0x20, 0x67, 0xd1, 0x46, 0xa8, 0x50, 0xe0, 3);
> +/* DEVPROP_TYPE_STRING_LIST */
> +DEFINE_DEVPROPKEY(qga_DEVPKEY_Device_DriverDate, 0xa8b865dd, 0x2e3d,
> +0x4094, 0xad, 0x97, 0xe5, 0x93, 0xa7, 0xc, 0x75, 0xd6, 2);
> +/* DEVPROP_TYPE_FILETIME */
> +DEFINE_DEVPROPKEY(qga_DEVPKEY_Device_DriverVersion, 0xa8b865dd, 0x2e3d,
> +0x4094, 0xad, 0x97, 0xe5, 0x93, 0xa7, 0xc, 0x75, 0xd6, 3);
> +/* DEVPROP_TYPE_STRING */
> +/* The following should be in cfgmgr32.h, but it isn't */
> +#ifndef CM_Get_DevNode_Property
> +CMAPI CONFIGRET WINAPI CM_Get_DevNode_PropertyW(
> +DEVINST  dnDevInst,
> +CONST DEVPROPKEY * PropertyKey,
> +DEVPROPTYPE  * PropertyType,
> +PBYTEPropertyBuffer,
> +PULONG   PropertyBufferSize,
> +ULONGulFlags
> +);
> +#define CM_Get_DevNode_Property CM_Get_DevNode_PropertyW
> +#endif
> +
>  #ifndef SHTDN_REASON_FLAG_PLANNED
>  #define SHTDN_REASON_FLAG_PLANNED 0x8000
>  #endif
> @@ -93,6 +124,8 @@ static OpenFlags guest_file_open_modes[] = {
>  g_free(suffix); \
>  } while (0)
> 
> +G_DEFINE_AUTOPTR_CLEANUP_FUNC(GuestDeviceInfo, qapi_free_GuestDeviceInfo)
> +
>  static OpenFlags *find_open_flag(const char *mode_str)
>  {
>  int mode;
> @@ -2229,3 +2262,180 @@ GuestOSInfo *qmp_guest_get_osinfo(Error **errp)
> 
>  return info;
>  }
> +
> +/*
> + * Safely get device property. Returned strings are using wide characters.
> + * Caller is responsible for freeing the buffer.
> + */
> +static LPBYTE cm_get_property(DEVINST devInst, const DEVPROPKEY *propName,
> +PDEVPROPTYPE propType)
> +{
> +CONFIGRET cr;
> +g_autofree LPBYTE buffer = NULL;
> +ULONG buffer_len = 0;
> +
> +/* First query for needed space */
> +cr = CM_Get_DevNode_PropertyW(devInst, propName, propType,
> +buffer, &buffer_len, 0);
> +if (cr != CR_SUCCESS && cr != CR_BUFFER_SMALL) {
> +
> +slog("failed to get property size, error=0x%lx", cr);
> +return NULL;
> +}
> +buffer = g_new0(BYTE, buffer_len + 1);
> +cr = CM_Get_DevNode_PropertyW(devInst, propName, propType,
> +

Re: [PATCH v2 0/4] Allow guest-get-fsinfo also for non-PCI devices

2020-07-27 Thread Michael Roth
Quoting Thomas Huth (2020-07-21 23:40:24)
> The information that can be retrieved via UDEV is also usable for non-PCI
> devices. So let's allow build_guest_fsinfo_for_real_device() on non-PCI
> devices, too. This is required to fix the bug that CCW devices show up
> without "Target" when running libvirt's "virsh domfsinfo" command (see
> https://bugzilla.redhat.com/show_bug.cgi?id=1755075 for details).
> 
> v2:
>  - Use g_new0 instead of g_malloc0 (as suggested by Daniel)
>  - Init fields to -1 explicitly, not via memset (Daniel)
>  - Add the fourth patch to also fill in virtio information on s390x

Thanks, patches 2-4 applied to qga-staging tree for 5.2:
  https://github.com/mdroth/qemu/commits/qga-staging

I've sent a pull request for 5.1 with patch 1/4

> 
> Thomas Huth (4):
>   qga/qapi-schema: Document -1 for invalid PCI address fields
>   qga/commands-posix: Rework build_guest_fsinfo_for_real_device()
> function
>   qga/commands-posix: Move the udev code from the pci to the generic
> function
>   qga/commands-posix: Support fsinfo for non-PCI virtio devices, too
> 
>  qga/commands-posix.c | 157 ++-
>  qga/qapi-schema.json |   2 +-
>  2 files changed, 110 insertions(+), 49 deletions(-)
> 
> -- 
> 2.18.1
> 



[PULL for-5.1 0/2] qemu-ga patch queue for hard-freeze

2020-07-27 Thread Michael Roth
The following changes since commit 9303ecb658a0194560d1eecde165a1511223c2d8:

  Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20200727' into 
staging (2020-07-27 17:25:06 +0100)

are available in the Git repository at:

  git://github.com/mdroth/qemu.git tags/qga-pull-2020-07-27-tag

for you to fetch changes up to ba620541d0db7e3433babbd97c0413a371e6fb4a:

  qga/qapi-schema: Document -1 for invalid PCI address fields (2020-07-27 
18:03:55 -0500)


qemu-ga patch queue for hard-freeze

* document use of -1 when pci_controller field can't be retrieved for
  guest-get-fsinfo
* fix incorrect filesystem type reporting on w32 for guest-get-fsinfo
  when a volume is not mounted


Basil Salman (1):
  qga-win: fix "guest-get-fsinfo" wrong filesystem type

Thomas Huth (1):
  qga/qapi-schema: Document -1 for invalid PCI address fields

 qga/commands-win32.c | 29 +++--
 qga/qapi-schema.json |  2 +-
 2 files changed, 24 insertions(+), 7 deletions(-)





[PULL for-5.1 1/2] qga-win: fix "guest-get-fsinfo" wrong filesystem type

2020-07-27 Thread Michael Roth
From: Basil Salman 

This patch handles the case where unmounted volumes exist:
for such volumes GetVolumePathNamesForVolumeName() returns an
empty path, and GetVolumeInformation() would then use the
current working directory instead.
This patch fixes the issue by opening a handle to the volume
and using GetVolumeInformationByHandleW() instead.

Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1746667

Signed-off-by: Basil Salman 
Signed-off-by: Basil Salman 
*fix crash when guest_build_fsinfo() sets errp multiple times
*make new error message more distinct from existing ones
Signed-off-by: Michael Roth 
---
 qga/commands-win32.c | 29 +++--
 1 file changed, 23 insertions(+), 6 deletions(-)

diff --git a/qga/commands-win32.c b/qga/commands-win32.c
index aaa71f147b..15c9d7944b 100644
--- a/qga/commands-win32.c
+++ b/qga/commands-win32.c
@@ -958,11 +958,13 @@ static GuestFilesystemInfo *build_guest_fsinfo(char 
*guid, Error **errp)
 {
 DWORD info_size;
 char mnt, *mnt_point;
+wchar_t wfs_name[32];
 char fs_name[32];
-char vol_info[MAX_PATH+1];
+wchar_t vol_info[MAX_PATH + 1];
 size_t len;
 uint64_t i64FreeBytesToCaller, i64TotalBytes, i64FreeBytes;
 GuestFilesystemInfo *fs = NULL;
+HANDLE hLocalDiskHandle = NULL;
 
 GetVolumePathNamesForVolumeName(guid, (LPCH)&mnt, 0, &info_size);
 if (GetLastError() != ERROR_MORE_DATA) {
@@ -977,18 +979,27 @@ static GuestFilesystemInfo *build_guest_fsinfo(char 
*guid, Error **errp)
 goto free;
 }
 
+hLocalDiskHandle = CreateFile(guid, 0 , 0, NULL, OPEN_EXISTING,
+  FILE_ATTRIBUTE_NORMAL |
+  FILE_FLAG_BACKUP_SEMANTICS, NULL);
+if (INVALID_HANDLE_VALUE == hLocalDiskHandle) {
+error_setg_win32(errp, GetLastError(), "failed to get handle for 
volume");
+goto free;
+}
+
 len = strlen(mnt_point);
 mnt_point[len] = '\\';
 mnt_point[len+1] = 0;
-if (!GetVolumeInformation(mnt_point, vol_info, sizeof(vol_info), NULL, 
NULL,
-  NULL, (LPSTR)&fs_name, sizeof(fs_name))) {
+
+if (!GetVolumeInformationByHandleW(hLocalDiskHandle, vol_info,
+   sizeof(vol_info), NULL, NULL, NULL,
+   (LPWSTR) & wfs_name, sizeof(wfs_name))) 
{
 if (GetLastError() != ERROR_NOT_READY) {
 error_setg_win32(errp, GetLastError(), "failed to get volume 
info");
 }
 goto free;
 }
 
-fs_name[sizeof(fs_name) - 1] = 0;
 fs = g_malloc(sizeof(*fs));
 fs->name = g_strdup(guid);
 fs->has_total_bytes = false;
@@ -1007,9 +1018,11 @@ static GuestFilesystemInfo *build_guest_fsinfo(char 
*guid, Error **errp)
 fs->has_used_bytes = true;
 }
 }
+wcstombs(fs_name, wfs_name, sizeof(wfs_name));
 fs->type = g_strdup(fs_name);
 fs->disk = build_guest_disk_info(guid, errp);
 free:
+CloseHandle(hLocalDiskHandle);
 g_free(mnt_point);
 return fs;
 }
@@ -1027,8 +1040,12 @@ GuestFilesystemInfoList *qmp_guest_get_fsinfo(Error 
**errp)
 }
 
 do {
-GuestFilesystemInfo *info = build_guest_fsinfo(guid, errp);
-if (info == NULL) {
+Error *local_err = NULL;
+GuestFilesystemInfo *info = build_guest_fsinfo(guid, &local_err);
+if (local_err) {
+g_debug("failed to get filesystem info, ignoring error: %s",
+error_get_pretty(local_err));
+error_free(local_err);
 continue;
 }
 new = g_malloc(sizeof(*ret));
-- 
2.17.1




[PULL for-5.1 2/2] qga/qapi-schema: Document -1 for invalid PCI address fields

2020-07-27 Thread Michael Roth
From: Thomas Huth 

The "guest-get-fsinfo" could also be used for non-PCI devices in the
future. And the code in GuestPCIAddress() in qga/commands-win32.c already
seems to use "-1" for fields that it cannot determine. Thus let's
properly document "-1" as the value for invalid PCI address fields.

Reviewed-by: Daniel P. Berrangé 
Signed-off-by: Thomas Huth 
Signed-off-by: Michael Roth 
---
 qga/qapi-schema.json | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/qga/qapi-schema.json b/qga/qapi-schema.json
index 4be9aad48e..408a662ea5 100644
--- a/qga/qapi-schema.json
+++ b/qga/qapi-schema.json
@@ -846,7 +846,7 @@
 ##
 # @GuestDiskAddress:
 #
-# @pci-controller: controller's PCI address
+# @pci-controller: controller's PCI address (fields are set to -1 if invalid)
 # @bus-type: bus type
 # @bus: bus id
 # @target: target id
-- 
2.17.1




Re: migration: broken snapshot saves appear on s390 when small fields in migration stream removed

2020-07-27 Thread Bruce Rogers
On Tue, 2020-07-21 at 10:22 +0200, Claudio Fontana wrote:
> On 7/20/20 8:24 PM, Claudio Fontana wrote:
> > I have now been able to reproduce this on X86 as well.
> > 
> > It happens much more rarely, about once every 10 times.
> > 
> > I will sort out the data and try to make it even more reproducible,
> > then post my findings in detail.
> > 
> > Overall I proceeded as follows:
> > 
> > 1) hooked the savevm code to skip all fields with the exception of
> > "s390-skeys". So only s390-skeys are actually saved.
> > 
> > 2) reimplemented "s390-skeys" in a common implementation in cpus.c,
> > used on both x86 and s390, modeling the behaviour of save/load from
> > hw/s390
> > 
> > 3) ran ./check -qcow2 267 on both x86 and s390.
> > 
> > In the case of s390, the failure seems to be reproducible 100% of the
> > time.
> > On X86, it is, as mentioned, failing about 10% of the time.
> > 
> > Ciao,
> > 
> > Claudio
> 
> And here is a small series of two patches that can be used to
> reproduce the problem.
> 
> Clearly, this is not directly related to s390 or to skeys or to
> icount in particular, it is just an issue that happened to be more
> visible there.
> 
> If you could help with this, please apply the attached patches.
> 
> Patch 1 just adds a new "300" iotest. It is way easier to extract the
> relevant part out of test 267, which does a bit too much in the same
> file.
> Also this allows easier use of valgrind, since it does not "require"
> anything.
> 
> Patch 2 hooks the savevm code to skip all fields during the snapshot
> with the exception of "s390-skeys", a new artificial field
> implemented to
> model what the real s390-skeys is doing.
> 
> After applying patch 1 and patch 2, you can test (also on X86), with:
> 
> ./check -qcow2 300
> 
> On X86 many runs will be successful, but a certain % of them will
> instead fail like this:
> 
> 
> claudio@linux-ch70:~/git/qemu-pristine/qemu-build/tests/qemu-iotests> 
> ./check -qcow2 300
> QEMU  -- "/home/claudio/git/qemu-pristine/qemu-
> build/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64"
> -nodefaults -display none -accel qtest
> QEMU_IMG  -- "/home/claudio/git/qemu-pristine/qemu-
> build/tests/qemu-iotests/../../qemu-img" 
> QEMU_IO   -- "/home/claudio/git/qemu-pristine/qemu-
> build/tests/qemu-iotests/../../qemu-io"  --cache writeback --aio
> threads -f qcow2
> QEMU_NBD  -- "/home/claudio/git/qemu-pristine/qemu-
> build/tests/qemu-iotests/../../qemu-nbd" 
> IMGFMT-- qcow2 (compat=1.1)
> IMGPROTO  -- file
> PLATFORM  -- Linux/x86_64 linux-ch70 4.12.14-lp151.28.36-default
> TEST_DIR  -- /home/claudio/git/qemu-pristine/qemu-
> build/tests/qemu-iotests/scratch
> SOCK_DIR  -- /tmp/tmp.gdcUu3l0SM
> SOCKET_SCM_HELPER -- /home/claudio/git/qemu-pristine/qemu-
> build/tests/qemu-iotests/socket_scm_helper
> 
> 300  fail   [10:14:05] [10:14:06]  (last: 0s)output
> mismatch (see 300.out.bad)
> --- /home/claudio/git/qemu-pristine/qemu/tests/qemu-
> iotests/300.out 2020-07-21 10:03:54.468104764 +0200
> +++ /home/claudio/git/qemu-pristine/qemu-build/tests/qemu-
> iotests/300.out.bad   2020-07-21 10:14:06.098090543 +0200
> @@ -12,6 +12,9 @@
>  ID        TAG                 VM SIZE                DATE       VM CLOCK
>  --        snap0                  SIZE yyyy-mm-dd hh:mm:ss   00:00:00.000
>  (qemu) loadvm snap0
> +Unexpected storage key data: 0
> +error while loading state for instance 0x0 of device 's390-skeys'
> +Error: Error -22 while loading VM state
>  (qemu) quit
>  
>  *** done
> Failures: 300
> Failed 1 of 1 iotests
> 
> 
> At this point somebody more knowledgeable about QCOW2, coroutines and
> backing files could chime in?
> 


I used the reproducer you provided here to do a git bisect, as I assume
whatever is now broken wasn't always broken, and it pointed to the
following commit:

commit df893d25ceea3c0dcbe6d6b425309317fab6b22e (refs/bisect/bad)
Author: Vladimir Sementsov-Ogievskiy 
Date:   Tue Jun 4 19:15:13 2019 +0300

block/qcow2: implement .bdrv_co_preadv_part

Indeed, I am currently able to reliably reproduce the issue with this
commit applied, and not reproduce it without it.

That said, I've not been able to identify exactly what is going wrong.
I'm fairly confident the savevm data is correctly written out, but on
the loadvm side, somehow the last part of the s390 data is not read
back correctly (it's in the second pass through the while loop in
qcow2_co_preadv_part() that this happens.)

If anyone familiar with this code can have a look or provide some
pointers, it would be much appreciated.

Adding commit author to CC.

Thanks,

Bruce
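
As an aside, a bisect over a failure this flaky (~10%) can be automated by
repeating the test and treating a single failure as "bad"; a sketch, where
the commit hashes and repeat count are placeholders:

    git bisect start <bad-commit> <good-commit>
    git bisect run sh -c 'make -j8 && cd tests/qemu-iotests && \
        for i in $(seq 20); do ./check -qcow2 300 || exit 1; done'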




Re: [PATCH v5 3/4] target/riscv: Fix the translation of physical address

2020-07-27 Thread Alistair Francis
On Sat, Jul 25, 2020 at 8:05 AM Zong Li  wrote:
>
> The real physical address should include the 12-bit page offset. The
> current code also makes the PMP check wrong: the minimum granularity of
> PMP is 4 bytes, but we always get a physical address that is 4KB aligned,
> which means we always use the start address of the page to check PMP for
> all addresses within the same page.

So riscv_cpu_tlb_fill() will clear these bits when calling
tlb_set_page(), so this won't have an impact on actual translation
(although it will change the input address for 2-stage translation, but
that seems fine).

Your point about PMP seems correct: as we allow smaller-than-page
granularity, this seems like the right approach.

Can you edit riscv_cpu_get_phys_page_debug() to mask these bits out at
the end? Otherwise we will break what callers to
cpu_get_phys_page_attrs_debug() expect.

Alistair

>
> Signed-off-by: Zong Li 
> ---
>  target/riscv/cpu_helper.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> index 75d2ae3434..08b069f0c9 100644
> --- a/target/riscv/cpu_helper.c
> +++ b/target/riscv/cpu_helper.c
> @@ -543,7 +543,8 @@ restart:
>  /* for superpage mappings, make a fake leaf PTE for the TLB's
> benefit. */
>  target_ulong vpn = addr >> PGSHIFT;
> -*physical = (ppn | (vpn & ((1L << ptshift) - 1))) << PGSHIFT;
> +*physical = ((ppn | (vpn & ((1L << ptshift) - 1))) << PGSHIFT) |
> +(addr & ~TARGET_PAGE_MASK);
>
>  /* set permissions on the TLB entry */
>  if ((pte & PTE_R) || ((pte & PTE_X) && mxr)) {
> --
> 2.27.0
>
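
A minimal sketch of the two pieces under discussion, reduced to plain C
(illustrative only, not the actual target/riscv code; PGSHIFT assumed 12
for 4KB pages):

    #include <stdint.h>
    #include <stdio.h>

    #define PGSHIFT 12
    #define PAGE_MASK (~((1ULL << PGSHIFT) - 1))

    /* The fix: keep the in-page offset in the translated address. */
    static uint64_t translate(uint64_t page_base, uint64_t vaddr)
    {
        return (page_base & PAGE_MASK) | (vaddr & ~PAGE_MASK);
    }

    /* The suggested debug hook behaviour: mask the offset back off so
     * cpu_get_phys_page_attrs_debug() callers keep seeing page-aligned
     * addresses. */
    static uint64_t phys_page_debug(uint64_t phys)
    {
        return phys & PAGE_MASK;
    }

    int main(void)
    {
        uint64_t p = translate(0x80042000, 0xffff0abc);
        printf("%llx %llx\n", (unsigned long long)p,
               (unsigned long long)phys_page_debug(p)); /* 80042abc 80042000 */
        return 0;
    }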
>



[Bug 1681439] Re: qemu-system-x86_64: hw/ide/core.c:685: ide_cancel_dma_sync: Assertion `s->bus->dma->aiocb == NULL' failed.

2020-07-27 Thread John Snow
** Changed in: qemu
   Status: New => Confirmed

** Changed in: qemu
 Assignee: (unassigned) => John Snow (jnsnow)

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1681439

Title:
  qemu-system-x86_64: hw/ide/core.c:685: ide_cancel_dma_sync: Assertion
  `s->bus->dma->aiocb == NULL' failed.

Status in QEMU:
  Confirmed

Bug description:
  Since upgrading to QEMU 2.8.0, my Windows 7 64-bit virtual machines
  started crashing due to the assertion quoted in the summary failing.
  The assertion in question was added by commit 9972354856 ("block: add
  BDS field to count in-flight requests").  My tests show that setting
  discard=unmap is needed to reproduce the issue.  Speaking of
  reproduction, it is a bit flaky, because I have been unable to come up
  with specific instructions that would allow the issue to be triggered
  outside of my environment, but I do have a semi-sane way of testing that
  appears to depend on a specific initial state of data on the underlying
  storage volume, actions taken within the VM and waiting for about 20
  minutes.

  Here is the shortest QEMU command line that I managed to reproduce the
  bug with:

  qemu-system-x86_64 \
  -machine pc-i440fx-2.7,accel=kvm \
  -m 3072 \
  -drive file=/dev/lvm/qemu,format=raw,if=ide,discard=unmap \
-netdev tap,id=hostnet0,ifname=tap0,script=no,downscript=no,vhost=on \
  -device virtio-net-pci,netdev=hostnet0 \
-vnc :0

  The underlying storage (/dev/lvm/qemu) is a thin LVM snapshot.

  QEMU was compiled using:

  ./configure --python=/usr/bin/python2.7 --target-list=x86_64-softmmu
  make -j3

  My virtualization environment is not really a critical one and
  reproduction is not that much of a hassle, so if you need me to gather
  further diagnostic information or test patches, I will be happy to help.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1681439/+subscriptions



Re: [PATCH v5 2/4] target/riscv/pmp.c: Fix the index offset on RV64

2020-07-27 Thread Alistair Francis
On Sat, Jul 25, 2020 at 8:04 AM Zong Li  wrote:
>
> On RV64, the reg_index is 2 (pmpcfg2 CSR) after the seventh pmp
> entry, not 1 (pmpcfg1 CSR) as on RV32. In the original
> implementation, the second parameter of pmp_write_cfg is
> "reg_index * sizeof(target_ulong)", and we get a result
> that starts from 16 if reg_index is 2, but we expect it
> to start from 8. Separate the implementations for
> RV32 and RV64 respectively.
>
> Signed-off-by: Zong Li 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  target/riscv/pmp.c | 11 ++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/target/riscv/pmp.c b/target/riscv/pmp.c
> index 2a2b9f5363..aeba796484 100644
> --- a/target/riscv/pmp.c
> +++ b/target/riscv/pmp.c
> @@ -318,6 +318,10 @@ void pmpcfg_csr_write(CPURISCVState *env, uint32_t 
> reg_index,
>  return;
>  }
>
> +#if defined(TARGET_RISCV64)
> +reg_index >>= 1;
> +#endif
> +
>  for (i = 0; i < sizeof(target_ulong); i++) {
>  cfg_val = (val >> 8 * i)  & 0xff;
>  pmp_write_cfg(env, (reg_index * sizeof(target_ulong)) + i,
> @@ -335,11 +339,16 @@ target_ulong pmpcfg_csr_read(CPURISCVState *env, 
> uint32_t reg_index)
>  target_ulong cfg_val = 0;
>  target_ulong val = 0;
>
> +trace_pmpcfg_csr_read(env->mhartid, reg_index, cfg_val);
> +
> +#if defined(TARGET_RISCV64)
> +reg_index >>= 1;
> +#endif
> +
>  for (i = 0; i < sizeof(target_ulong); i++) {
>  val = pmp_read_cfg(env, (reg_index * sizeof(target_ulong)) + i);
>  cfg_val |= (val << (i * 8));
>  }
> -trace_pmpcfg_csr_read(env->mhartid, reg_index, cfg_val);
>
>  return cfg_val;
>  }
> --
> 2.27.0
>
>
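
A worked example of the indexing being fixed (a sketch; sizeof(uint64_t)
stands in for sizeof(target_ulong) on RV64): each pmpcfgN CSR packs eight
config bytes and only even N exist on RV64, so pmpcfg2 must map to byte
offset 8, which requires halving reg_index first.

    #include <stdint.h>
    #include <stdio.h>

    static uint32_t pmp_cfg_base_rv64(uint32_t reg_index)
    {
        /* pmpcfg0 -> byte 0, pmpcfg2 -> byte 8; without halving
         * reg_index first, pmpcfg2 would wrongly start at byte 16. */
        return (reg_index >> 1) * (uint32_t)sizeof(uint64_t);
    }

    int main(void)
    {
        printf("%u %u\n", pmp_cfg_base_rv64(0), pmp_cfg_base_rv64(2)); /* 0 8 */
        return 0;
    }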



[Bug 1884693] Re: Assertion failure in address_space_unmap through ahci_map_clb_address

2020-07-27 Thread John Snow
** Changed in: qemu
   Status: Confirmed => In Progress

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1884693

Title:
  Assertion failure in address_space_unmap through ahci_map_clb_address

Status in QEMU:
  In Progress

Bug description:
  Hello,
  Reproducer:
  cat << EOF | ./i386-softmmu/qemu-system-i386 -qtest stdio -monitor none 
-serial none -M pc-q35-5.0 -nographic
  outl 0xcf8 0x8000fa24
  outl 0xcfc 0xe1068000
  outl 0xcf8 0x8000fa04
  outw 0xcfc 0x7
  outl 0xcf8 0x8000fb20
  write 0xe1068304 0x1 0x21
  write 0xe1068318 0x1 0x21
  write 0xe1068384 0x1 0x21
  write 0xe1068398 0x2 0x21
  EOF

  Stack trace:
  #0 0x55bfabfe9ea0 in __libc_start_main 
/build/glibc-GwnBeO/glibc-2.30/csu/../csu/libc-start.c:308:16
  #1 0x55bfabfc8ef9 in __sanitizer_print_stack_trace 
(build/i386-softmmu/qemu-fuzz-i386+0x7b7ef9)
  #2 0x55bfabfaf933 in fuzzer::PrintStackTrace() FuzzerUtil.cpp:210:5
  #3 0x7f88df76110f  (/lib/x86_64-linux-gnu/libpthread.so.0+0x1410f)
  #4 0x7f88df5a4760 in __libc_signal_restore_set 
/build/glibc-GwnBeO/glibc-2.30/signal/../sysdeps/unix/sysv/linux/internal-signals.h:84:10
  #5 0x7f88df5a4760 in raise 
/build/glibc-GwnBeO/glibc-2.30/signal/../sysdeps/unix/sysv/linux/raise.c:48:3
  #6 0x7f88df58e55a in abort /build/glibc-GwnBeO/glibc-2.30/stdlib/abort.c:79:7
  #7 0x7f88df58e42e in __assert_fail_base 
/build/glibc-GwnBeO/glibc-2.30/assert/assert.c:92:3
  #8 0x7f88df59d091 in __assert_fail 
/build/glibc-GwnBeO/glibc-2.30/assert/assert.c:101:3
  #9 0x55bfabff7182 in address_space_unmap exec.c:3602:9
  #10 0x55bfac4a452f in dma_memory_unmap include/sysemu/dma.h:148:5
  #11 0x55bfac4a452f in map_page hw/ide/ahci.c:254:9
  #12 0x55bfac4a1f98 in ahci_map_clb_address hw/ide/ahci.c:748:5
  #13 0x55bfac4a1f98 in ahci_cond_start_engines hw/ide/ahci.c:276:14
  #14 0x55bfac4a074e in ahci_port_write hw/ide/ahci.c:339:9
  #15 0x55bfac4a074e in ahci_mem_write hw/ide/ahci.c:513:9
  #16 0x55bfac0e0dc2 in memory_region_write_accessor memory.c:483:5
  #17 0x55bfac0e0bde in access_with_adjusted_size memory.c:544:18
  #18 0x55bfac0e0917 in memory_region_dispatch_write memory.c
  #19 0x55bfabffa4fd in flatview_write_continue exec.c:3146:23
  #20 0x55bfabff569b in flatview_write exec.c:3186:14
  #21 0x55bfabff569b in address_space_write exec.c:3280:18
  #22 0x55bfac8982a9 in op_write_pattern tests/qtest/fuzz/general_fuzz.c:407:5
  #23 0x55bfac897749 in general_fuzz tests/qtest/fuzz/general_fuzz.c:481:17
  #24 0x55bfac8930a2 in LLVMFuzzerTestOneInput tests/qtest/fuzz/fuzz.c:136:5
  #25 0x55bfabfb0e68 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, 
unsigned long) FuzzerLoop.cpp:558:15
  #26 0x55bfabfb0485 in fuzzer::Fuzzer::RunOne(unsigned char const*, unsigned 
long, bool, fuzzer::InputInfo*, bool*) FuzzerLoop.cpp:470:3
  #27 0x55bfabfb18a1 in fuzzer::Fuzzer::MutateAndTestOne() FuzzerLoop.cpp:701:19
  #28 0x55bfabfb2305 in fuzzer::Fuzzer::Loop(std::vector >&) FuzzerLoop.cpp:837:5
  #29 0x55bfabfa2018 in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned 
char const*, unsigned long)) FuzzerDriver.cpp:846:6
  #30 0x55bfabfb8722 in main FuzzerMain.cpp:19:10
  #31 0x7f88df58fe0a in __libc_start_main 
/build/glibc-GwnBeO/glibc-2.30/csu/../csu/libc-start.c:308:16
  #32 0x55bfabf97869 in _start (build/i386-softmmu/qemu-fuzz-i386+0x786869)

  The same error can be triggered through  ahci_map_fis_address @ 
hw/ide/ahci.c:721:5
  Found with generic device fuzzer: 
https://patchew.org/QEMU/20200611055651.13784-1-alx...@bu.edu/

  Please let me know if I can provide any further info.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1884693/+subscriptions



Re: [PATCH for-5.1] hw/arm/netduino2, netduinoplus2: Set system_clock_scale

2020-07-27 Thread Alistair Francis
On Mon, Jul 27, 2020 at 9:26 AM Peter Maydell  wrote:
>
> The netduino2 and netduinoplus2 boards forgot to set the system_clock_scale
> global, which meant that if guest code used the systick timer in "use
> the processor clock" mode it would hang because time never advances.
>
> Set the global to match the documented CPU clock speed of these boards.
> Judging by the data sheet this is slightly simplistic because the
> SoC allows configuration of the SYSCLK source and frequency via the
> RCC (reset and clock control) module, but we don't model that.
>
> Fixes: https://bugs.launchpad.net/qemu/+bug/1876187
> Signed-off-by: Peter Maydell 

Reviewed-by: Alistair Francis 

Alistair

> ---
> NB: tested with "make check" only...
>
>  hw/arm/netduino2.c | 10 ++
>  hw/arm/netduinoplus2.c | 10 ++
>  2 files changed, 20 insertions(+)
>
> diff --git a/hw/arm/netduino2.c b/hw/arm/netduino2.c
> index 79e19392b56..8f103341443 100644
> --- a/hw/arm/netduino2.c
> +++ b/hw/arm/netduino2.c
> @@ -30,10 +30,20 @@
>  #include "hw/arm/stm32f205_soc.h"
>  #include "hw/arm/boot.h"
>
> +/* Main SYSCLK frequency in Hz (120MHz) */
> +#define SYSCLK_FRQ 120000000ULL
> +
>  static void netduino2_init(MachineState *machine)
>  {
>  DeviceState *dev;
>
> +/*
> + * TODO: ideally we would model the SoC RCC and let it handle
> + * system_clock_scale, including its ability to define different
> + * possible SYSCLK sources.
> + */
> +system_clock_scale = NANOSECONDS_PER_SECOND / SYSCLK_FRQ;
> +
>  dev = qdev_new(TYPE_STM32F205_SOC);
>  qdev_prop_set_string(dev, "cpu-type", ARM_CPU_TYPE_NAME("cortex-m3"));
>  sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
> diff --git a/hw/arm/netduinoplus2.c b/hw/arm/netduinoplus2.c
> index 958d21dd9f9..68abd3ec69d 100644
> --- a/hw/arm/netduinoplus2.c
> +++ b/hw/arm/netduinoplus2.c
> @@ -30,10 +30,20 @@
>  #include "hw/arm/stm32f405_soc.h"
>  #include "hw/arm/boot.h"
>
> +/* Main SYSCLK frequency in Hz (168MHz) */
> +#define SYSCLK_FRQ 168000000ULL
> +
>  static void netduinoplus2_init(MachineState *machine)
>  {
>  DeviceState *dev;
>
> +/*
> + * TODO: ideally we would model the SoC RCC and let it handle
> + * system_clock_scale, including its ability to define different
> + * possible SYSCLK sources.
> + */
> +system_clock_scale = NANOSECONDS_PER_SECOND / SYSCLK_FRQ;
> +
>  dev = qdev_new(TYPE_STM32F405_SOC);
>  qdev_prop_set_string(dev, "cpu-type", ARM_CPU_TYPE_NAME("cortex-m4"));
>  sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
> --
> 2.20.1
>
>
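
The scale is computed with integer division, so a quick sanity check of the
values the patch produces (sketch):

    #include <stdio.h>

    #define NANOSECONDS_PER_SECOND 1000000000LL

    int main(void)
    {
        /* 120 MHz -> 8 ns per tick, 168 MHz -> 5 ns per tick
         * (truncated from 8.33 and 5.95 respectively). */
        printf("%lld\n", NANOSECONDS_PER_SECOND / 120000000LL);
        printf("%lld\n", NANOSECONDS_PER_SECOND / 168000000LL);
        return 0;
    }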



[Bug 1883739] Re: ide_dma_cb: Assertion `prep_size >= 0 && prep_size <= n * 512' failed.

2020-07-27 Thread John Snow
** Changed in: qemu
 Assignee: (unassigned) => John Snow (jnsnow)

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1883739

Title:
  ide_dma_cb: Assertion `prep_size >= 0 && prep_size <= n * 512' failed.

Status in QEMU:
  Confirmed

Bug description:
  To reproduce run the QEMU with the following command line:
  ```
  qemu-system-x86_64 -cdrom hypertrash.iso -nographic -m 100 -enable-kvm -net 
none -drive id=disk,file=hda.img,if=none -device ahci,id=ahci -device 
ide-hd,drive=disk,bus=ahci.0
  ```

  QEMU Version:
  ```
  # qemu-5.0.0
  $ ./configure --target-list=x86_64-softmmu --enable-sanitizers; make
  $ x86_64-softmmu/qemu-system-x86_64 --version
  QEMU emulator version 5.0.0
  Copyright (c) 2003-2020 Fabrice Bellard and the QEMU Project developers
  ```

  To create disk image run:
  ```
  dd if=/dev/zero of=hda.img bs=1024 count=1024
  ```

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1883739/+subscriptions



[Bug 1887303] Re: Assertion failure in *bmdma_active_if `bmdma->bus->retry_unit != (uint8_t)-1' failed.

2020-07-27 Thread John Snow
This is another manifestation of the SRST bug.

New proposal:
https://lists.gnu.org/archive/html/qemu-devel/2020-07/msg06974.html

More analysis of the problem in response to Philippe's proposed fix:
https://lists.gnu.org/archive/html/qemu-devel/2020-07/msg06237.html

** Changed in: qemu
   Status: New => In Progress

** Changed in: qemu
 Assignee: (unassigned) => John Snow (jnsnow)

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1887303

Title:
  Assertion failure in *bmdma_active_if `bmdma->bus->retry_unit !=
  (uint8_t)-1' failed.

Status in QEMU:
  In Progress

Bug description:
  Hello,
  Here is a QTest Reproducer:

  cat << EOF | ./i386-softmmu/qemu-system-i386 -M pc,accel=qtest\
   -qtest null -nographic -vga qxl -qtest stdio -nodefaults\
   -drive if=none,id=drive0,file=null-co://,file.read-zeroes=on,format=raw\
   -drive if=none,id=drive1,file=null-co://,file.read-zeroes=on,format=raw\
   -device ide-cd,drive=drive0 -device ide-hd,drive=drive1
  outw 0x176 0x3538
  outw 0x376 0x6007
  outw 0x376 0x6b6b
  outw 0x176 0x985c
  outl 0xcf8 0x8903
  outl 0xcfc 0x2f2931
  outl 0xcf8 0x8920
  outb 0xcfc 0x6b
  outb 0x68 0x7
  outw 0x176 0x2530
  EOF

  Here is the call-stack:

  #8 0x7f00e0443091 in __assert_fail 
/build/glibc-GwnBeO/glibc-2.30/assert/assert.c:101:3
  #9 0x55e163f5a1af in bmdma_active_if 
/home/alxndr/Development/qemu/include/hw/ide/pci.h:59:5
  #10 0x55e163f5a1af in bmdma_prepare_buf 
/home/alxndr/Development/qemu/hw/ide/pci.c:132:19
  #11 0x55e163f4f00d in ide_dma_cb 
/home/alxndr/Development/qemu/hw/ide/core.c:898:17
  #12 0x55e163de86ad in dma_complete 
/home/alxndr/Development/qemu/dma-helpers.c:120:9
  #13 0x55e163de86ad in dma_blk_cb 
/home/alxndr/Development/qemu/dma-helpers.c:138:9
  #14 0x55e1642ade85 in blk_aio_complete 
/home/alxndr/Development/qemu/block/block-backend.c:1402:9
  #15 0x55e1642ade85 in blk_aio_complete_bh 
/home/alxndr/Development/qemu/block/block-backend.c:1412:5
  #16 0x55e16443556f in aio_bh_call 
/home/alxndr/Development/qemu/util/async.c:136:5
  #17 0x55e16443556f in aio_bh_poll 
/home/alxndr/Development/qemu/util/async.c:164:13
  #18 0x55e16440fac3 in aio_dispatch 
/home/alxndr/Development/qemu/util/aio-posix.c:380:5
  #19 0x55e164436dac in aio_ctx_dispatch 
/home/alxndr/Development/qemu/util/async.c:306:5
  #20 0x7f00e16e29ed in g_main_context_dispatch 
(/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x4e9ed)
  #21 0x55e164442f2b in glib_pollfds_poll 
/home/alxndr/Development/qemu/util/main-loop.c:219:9
  #22 0x55e164442f2b in os_host_main_loop_wait 
/home/alxndr/Development/qemu/util/main-loop.c:242:5
  #23 0x55e164442f2b in main_loop_wait 
/home/alxndr/Development/qemu/util/main-loop.c:518:11
  #24 0x55e164376953 in flush_events 
/home/alxndr/Development/qemu/tests/qtest/fuzz/fuzz.c:47:9
  #25 0x55e16437b8fa in general_fuzz 
/home/alxndr/Development/qemu/tests/qtest/fuzz/general_fuzz.c:544:17

  =

  Here is the same assertion failure but triggered through a different
  call-stack:

  cat << EOF | ./i386-softmmu/qemu-system-i386 -M pc,accel=qtest\
   -qtest null -nographic -vga qxl -qtest stdio -nodefaults\
   -drive if=none,id=drive0,file=null-co://,file.read-zeroes=on,format=raw\
   -drive if=none,id=drive1,file=null-co://,file.read-zeroes=on,format=raw\
   -device ide-cd,drive=drive0 -device ide-hd,drive=drive1
  outw 0x171 0x2fe9
  outb 0x177 0xa0
  outl 0x170 0x928
  outl 0x170 0x2b923b31
  outl 0x170 0x800a24d7
  outl 0xcf8 0x8903
  outl 0xcfc 0x842700
  outl 0xcf8 0x8920
  outb 0xcfc 0x5e
  outb 0x58 0x7
  outb 0x376 0x5
  outw 0x376 0x11
  outw 0x176 0x3538
  EOF

  Call-stack:
  #8 0x7f00e0443091 in __assert_fail 
/build/glibc-GwnBeO/glibc-2.30/assert/assert.c:101:3
  #9 0x55e163f5a622 in bmdma_active_if 
/home/alxndr/Development/qemu/include/hw/ide/pci.h:59:5
  #10 0x55e163f5a622 in bmdma_rw_buf 
/home/alxndr/Development/qemu/hw/ide/pci.c:187:19
  #11 0x55e163f57577 in ide_atapi_cmd_read_dma_cb 
/home/alxndr/Development/qemu/hw/ide/atapi.c:375:13
  #12 0x55e163f44c55 in ide_buffered_readv_cb 
/home/alxndr/Development/qemu/hw/ide/core.c:650:9
  #13 0x55e1642ade85 in blk_aio_complete 
/home/alxndr/Development/qemu/block/block-backend.c:1402:9
  #14 0x55e1642ade85 in blk_aio_complete_bh 
/home/alxndr/Development/qemu/block/block-backend.c:1412:5
  #15 0x55e16443556f in aio_bh_call 
/home/alxndr/Development/qemu/util/async.c:136:5
  #16 0x55e16443556f in aio_bh_poll 
/home/alxndr/Development/qemu/util/async.c:164:13
  #17 0x55e16440fac3 in aio_dispatch 
/home/alxndr/Development/qemu/util/aio-posix.c:380:5
  #18 0x55e164436dac in aio_ctx_dispatch 
/home/alxndr/Development/qemu/util/async.c:306:5
  #19 0x7f00e16e29ed in g_main_context_dispatch 
(/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x4e9ed)

Re: device compatibility interface for live migration with assigned devices

2020-07-27 Thread Alex Williamson
On Mon, 27 Jul 2020 15:24:40 +0800
Yan Zhao  wrote:

> > > As you indicate, the vendor driver is responsible for checking version
> > > information embedded within the migration stream.  Therefore a
> > > migration should fail early if the devices are incompatible.  Is it  
> > but as I know, currently in VFIO migration protocol, we have no way to
> > get vendor specific compatibility checking string in migration setup stage
> > (i.e. .save_setup stage) before the device is set to _SAVING state.
> > In this way, for devices that do not save device data in the precopy
> > stage, the migration compatibility checking happens as late as the
> > stop-and-copy stage, which is too late.
> > do you think we need to add the getting/checking of vendor specific
> > compatibility string early in save_setup stage?
> >  
> hi Alex,
> after an offline discussion with Kevin, I realized that it may not be a
> problem if migration compatibility check in vendor driver occurs late in
> stop-and-copy phase for some devices, because if we report device
> compatibility attributes clearly in an interface, the chances of
> libvirt/openstack making a wrong decision are small.

I think it would be wise for a vendor driver to implement a pre-copy
phase, even if only to send version information and verify it at the
target.  Deciding you have no device state to send during pre-copy does
not mean your vendor driver needs to opt-out of the pre-copy phase
entirely.  Please also note that pre-copy is at the user's discretion,
we've defined that we can enter stop-and-copy at any point, including
without a pre-copy phase, so I would recommend that vendor drivers
validate compatibility at the start of both the pre-copy and the
stop-and-copy phases.

> so, do you think we are now arriving at an agreement that we'll give up
> the read-and-test scheme and start to defining one interface (perhaps in
> json format), from which libvirt/openstack is able to parse and find out
> compatibility list of a source mdev/physical device?

Based on the feedback we've received, the previously proposed interface
is not viable.  I think there's agreement that the user needs to be
able to parse and interpret the version information.  Using json seems
viable, but I don't know if it's the best option.  Is there any
precedent of markup strings returned via sysfs we could follow?

Your idea of having both a "self" object and an array of "compatible"
objects is perhaps something we can build on, but we must not assume
PCI devices at the root level of the object.  Providing both the
mdev-type and the driver is a bit redundant, since the former includes
the latter.  We can't have vendor specific versioning schemes though,
ie. gvt-version. We need to agree on a common scheme and decide which
fields the version is relative to, ex. just the mdev type?

I had also proposed fields that provide information to create a
compatible type, for example to create a type_x2 device from a type_x1
mdev type, they need to know to apply an aggregation attribute.  If we
need to explicitly list every aggregation value and the resulting type,
I think we run afoul of what aggregation was trying to avoid anyway,
so we might need to pick a language that defines variable substitution
or some kind of tagging.  For example, if we could define ${aggr} as an
integer within a specified range, then we might be able to define a type
relative to that value (type_x${aggr}) which requires an aggregation
attribute using the same value.  I dunno, just spitballing.
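
As a purely hypothetical sketch of the above (every field name below is
invented for illustration, not a proposal), such a sysfs attribute might
return something like:

{
  "self": {
    "mdev_type": "type_x1",
    "version": "1.2"
  },
  "compatible": [
    {
      "mdev_type": "type_x${aggr}",
      "version": "1.2",
      "aggr": {"type": "integer", "min": 1, "max": 4}
    }
  ]
}

where a management tool substitutes ${aggr} within the declared range to
derive both the compatible type name and the aggregation attribute it
must set.  Thanks,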

Alex




[Bug 1887309] Re: Floating-point exception in ide_set_sector

2020-07-27 Thread John Snow
New proposal: https://lists.gnu.org/archive/html/qemu-
devel/2020-07/msg06974.html

(The root cause is that SRST is not handled correctly.)

More analysis in the replies to Philippe's patch:
https://lists.gnu.org/archive/html/qemu-devel/2020-07/msg05949.html

** Changed in: qemu
 Assignee: (unassigned) => John Snow (jnsnow)

** Changed in: qemu
   Status: Confirmed => In Progress

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1887309

Title:
  Floating-point exception in ide_set_sector

Status in QEMU:
  In Progress

Bug description:
  Hello,
  Here is a reproducer:
  cat << EOF | ./i386-softmmu/qemu-system-i386 -M pc,accel=qtest\
   -qtest null -nographic -vga qxl -qtest stdio -nodefaults\
   -drive if=none,id=drive0,file=null-co://,file.read-zeroes=on,format=raw\
   -drive if=none,id=drive1,file=null-co://,file.read-zeroes=on,format=raw\
   -device ide-cd,drive=drive0 -device ide-hd,drive=drive1
  outw 0x176 0x3538
  outl 0xcf8 0x8903
  outl 0xcfc 0x184275c
  outb 0x376 0x2f
  outb 0x376 0x0
  outw 0x176 0xa1a4
  outl 0xcf8 0x8920
  outb 0xcfc 0xff
  outb 0xf8 0x9
  EOF

  The stack-trace:
  ==16513==ERROR: UndefinedBehaviorSanitizer: FPE on unknown address 
0x556783603fdc (pc 0x556783603fdc bp 0x7fff82143b10 sp 0x7fff82143ab0 T16513)
  #0 0x556783603fdc in ide_set_sector 
/home/alxndr/Development/qemu/hw/ide/core.c:626:26
  #1 0x556783603fdc in ide_dma_cb 
/home/alxndr/Development/qemu/hw/ide/core.c:883:9
  #2 0x55678349d74d in dma_complete 
/home/alxndr/Development/qemu/dma-helpers.c:120:9
  #3 0x55678349d74d in dma_blk_cb 
/home/alxndr/Development/qemu/dma-helpers.c:138:9
  #4 0x556783962f25 in blk_aio_complete 
/home/alxndr/Development/qemu/block/block-backend.c:1402:9
  #5 0x556783962f25 in blk_aio_complete_bh 
/home/alxndr/Development/qemu/block/block-backend.c:1412:5
  #6 0x556783ac030f in aio_bh_call 
/home/alxndr/Development/qemu/util/async.c:136:5
  #7 0x556783ac030f in aio_bh_poll 
/home/alxndr/Development/qemu/util/async.c:164:13
  #8 0x556783a9a863 in aio_dispatch 
/home/alxndr/Development/qemu/util/aio-posix.c:380:5
  #9 0x556783ac1b4c in aio_ctx_dispatch 
/home/alxndr/Development/qemu/util/async.c:306:5
  #10 0x7f4f1c0fe9ed in g_main_context_dispatch 
(/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x4e9ed)
  #11 0x556783acdccb in glib_pollfds_poll 
/home/alxndr/Development/qemu/util/main-loop.c:219:9
  #12 0x556783acdccb in os_host_main_loop_wait 
/home/alxndr/Development/qemu/util/main-loop.c:242:5
  #13 0x556783acdccb in main_loop_wait 
/home/alxndr/Development/qemu/util/main-loop.c:518:11
  #14 0x5567833613e5 in qemu_main_loop 
/home/alxndr/Development/qemu/softmmu/vl.c:1664:9
  #15 0x556783a07a4d in main 
/home/alxndr/Development/qemu/softmmu/main.c:49:5
  #16 0x7f4f1ac84e0a in __libc_start_main 
/build/glibc-GwnBeO/glibc-2.30/csu/../csu/libc-start.c:308:16
  #17 0x5567830a9089 in _start 
(/home/alxndr/Development/qemu/build/i386-softmmu/qemu-system-i386+0x7d2089)

  With -trace ide*

  12163@1594585516.671265:ide_reset IDEstate 0x56162a269058
  [R +0.024963] outw 0x176 0x3538
  12163@1594585516.673676:ide_ioport_write IDE PIO wr @ 0x176 (Device/Head); 
val 0x38; bus 0x56162a268c00 IDEState 0x56162a268c88
  12163@1594585516.673683:ide_ioport_write IDE PIO wr @ 0x177 (Command); val 
0x35; bus 0x56162a268c00 IDEState 0x56162a269058
  12163@1594585516.673686:ide_exec_cmd IDE exec cmd: bus 0x56162a268c00; state 
0x56162a269058; cmd 0x35
  OK
  [S +0.025002] OK
  [R +0.025012] outl 0xcf8 0x8903
  OK
  [S +0.025018] OK
  [R +0.025026] outl 0xcfc 0x184275c
  OK
  [S +0.025210] OK
  [R +0.025219] outb 0x376 0x2f
  12163@1594585516.673916:ide_cmd_write IDE PIO wr @ 0x376 (Device Control); 
val 0x2f; bus 0x56162a268c00
  OK
  [S +0.025229] OK
  [R +0.025234] outb 0x376 0x0
  12163@1594585516.673928:ide_cmd_write IDE PIO wr @ 0x376 (Device Control); 
val 0x00; bus 0x56162a268c00
  OK
  [S +0.025240] OK
  [R +0.025246] outw 0x176 0xa1a4
  12163@1594585516.673940:ide_ioport_write IDE PIO wr @ 0x176 (Device/Head); 
val 0xa4; bus 0x56162a268c00 IDEState 0x56162a269058
  12163@1594585516.673943:ide_ioport_write IDE PIO wr @ 0x177 (Command); val 
0xa1; bus 0x56162a268c00 IDEState 0x56162a268c88
  12163@1594585516.673946:ide_exec_cmd IDE exec cmd: bus 0x56162a268c00; state 
0x56162a268c88; cmd 0xa1
  OK
  [S +0.025265] OK
  [R +0.025270] outl 0xcf8 0x8920
  OK
  [S +0.025274] OK
  [R +0.025279] outb 0xcfc 0xff
  OK
  [S +0.025443] OK
  [R +0.025451] outb 0xf8 0x9
  12163@1594585516.674221:ide_dma_cb IDEState 0x56162a268c88; sector_num=0 n=1 
cmd=DMA READ
  OK
  [S +0.025724] OK
  UndefinedBehaviorSanitizer:DEADLYSIGNAL
  ==12163==ERROR: UndefinedBehaviorSanitizer: FPE on unknown address 
0x5616279cffdc (pc 0x5616279cffdc bp 0x7ffcdaabae90 sp 0x7ffcdaabae30 T12163)

  -Alex

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1887309/+subscriptions

[Bug 1878253] Re: null-ptr dereference in address_space_to_flatview through ide

2020-07-27 Thread John Snow
Proposed fix: https://lists.gnu.org/archive/html/qemu-
devel/2020-07/msg06974.html

** Changed in: qemu
 Assignee: (unassigned) => John Snow (jnsnow)

** Changed in: qemu
   Status: New => In Progress

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1878253

Title:
  null-ptr dereference in address_space_to_flatview through ide

Status in QEMU:
  In Progress

Bug description:
  Hello,
  While fuzzing, I found an input that triggers a null-ptr dereference in
  address_space_to_flatview through ide:

  ==31699==ERROR: AddressSanitizer: SEGV on unknown address 0x0020 (pc 
0x55e0f562bafd bp 0x7ffee92355b0 sp 0x7ffee92354e0 T0)
  ==31699==The signal is caused by a READ memory access.
  ==31699==Hint: address points to the zero page.
  #0 0x55e0f562bafd in address_space_to_flatview 
/home/alxndr/Development/qemu/include/exec/memory.h:693:12
  #1 0x55e0f562bafd in address_space_write 
/home/alxndr/Development/qemu/exec.c:3267:14
  #2 0x55e0f562dd9c in address_space_unmap 
/home/alxndr/Development/qemu/exec.c:3592:9
  #3 0x55e0f5ab8277 in dma_memory_unmap 
/home/alxndr/Development/qemu/include/sysemu/dma.h:145:5
  #4 0x55e0f5ab8277 in dma_blk_unmap 
/home/alxndr/Development/qemu/dma-helpers.c:104:9
  #5 0x55e0f5ab8277 in dma_blk_cb 
/home/alxndr/Development/qemu/dma-helpers.c:139:5
  #6 0x55e0f617a6b8 in blk_aio_complete 
/home/alxndr/Development/qemu/block/block-backend.c:1398:9
  #7 0x55e0f617a6b8 in blk_aio_complete_bh 
/home/alxndr/Development/qemu/block/block-backend.c:1408:5
  #8 0x55e0f6355efb in aio_bh_call 
/home/alxndr/Development/qemu/util/async.c:136:5
  #9 0x55e0f6355efb in aio_bh_poll 
/home/alxndr/Development/qemu/util/async.c:164:13
  #10 0x55e0f63608ce in aio_dispatch 
/home/alxndr/Development/qemu/util/aio-posix.c:380:5
  #11 0x55e0f635799a in aio_ctx_dispatch 
/home/alxndr/Development/qemu/util/async.c:306:5
  #12 0x7f16e85d69ed in g_main_context_dispatch 
(/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x4e9ed)
  #13 0x55e0f635e384 in glib_pollfds_poll 
/home/alxndr/Development/qemu/util/main-loop.c:219:9
  #14 0x55e0f635e384 in os_host_main_loop_wait 
/home/alxndr/Development/qemu/util/main-loop.c:242:5
  #15 0x55e0f635e384 in main_loop_wait 
/home/alxndr/Development/qemu/util/main-loop.c:518:11
  #16 0x55e0f593d676 in qemu_main_loop 
/home/alxndr/Development/qemu/softmmu/vl.c:1664:9
  #17 0x55e0f6267c6a in main 
/home/alxndr/Development/qemu/softmmu/main.c:49:5
  #18 0x7f16e7186e0a in __libc_start_main 
/build/glibc-GwnBeO/glibc-2.30/csu/../csu/libc-start.c:308:16
  #19 0x55e0f55727b9 in _start 
(/home/alxndr/Development/qemu/build/i386-softmmu/qemu-system-i386+0x9027b9)

  AddressSanitizer can not provide additional info.
  SUMMARY: AddressSanitizer: SEGV 
/home/alxndr/Development/qemu/include/exec/memory.h:693:12 in 
address_space_to_flatview

  I can reproduce it in qemu 5.0 using:

  cat << EOF | ~/Development/qemu/build/i386-softmmu/qemu-system-i386 -M pc 
-nographic -drive file=null-co://,if=ide,cache=writeback,format=raw -nodefaults 
-display none -nographic -qtest stdio -monitor none -serial none
  outl 0xcf8 0x8920
  outl 0xcfc 0xc001
  outl 0xcf8 0x8924
  outl 0xcf8 0x8904
  outw 0xcfc 0x7
  outb 0x1f7 0xc8
  outw 0x3f6 0xe784
  outw 0x3f6 0xeb01
  outb 0xc005 0x21
  write 0x2103 0x1 0x4e
  outb 0xc000 0x1b
  outw 0x1f7 0xff35
  EOF

  I also attached the traces to this launchpad report, in case the
  formatting is broken:

  qemu-system-i386 -M pc -nographic -drive file=null-
  co://,if=ide,cache=writeback,format=raw -nodefaults -display none
  -nographic -qtest stdio -monitor none -serial none < attachment

  Please let me know if I can provide any further info.
  -Alex

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1878253/+subscriptions



[Bug 1878255] Re: Assertion failure in bdrv_aio_cancel, through ide

2020-07-27 Thread John Snow
Thank you, Stefan!

Fix: https://gitlab.com/qemu-
project/qemu/-/commit/1d719ddc35e9827b6e5df771555874df34301a0d


** Changed in: qemu
   Status: New => Fix Committed

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1878255

Title:
  Assertion failure in bdrv_aio_cancel, through ide

Status in QEMU:
  Fix Committed

Bug description:
  Hello,
  While fuzzing, I found an input that triggers an assertion failure in 
bdrv_aio_cancel, through ide:

  #1  0x7685755b in __GI_abort () at abort.c:79
  #2  0x56a8d396 in bdrv_aio_cancel (acb=0x60761290) at 
/home/alxndr/Development/qemu/block/io.c:2746
  #3  0x56a58525 in blk_aio_cancel (acb=0x2) at 
/home/alxndr/Development/qemu/block/block-backend.c:1540
  #4  0x56552f5b in ide_reset (s=) at 
/home/alxndr/Development/qemu/hw/ide/core.c:1318
  #5  0x56552aeb in ide_bus_reset (bus=0x62d17398) at 
/home/alxndr/Development/qemu/hw/ide/core.c:2422
  #6  0x56579ba5 in ahci_reset_port (s=, port=) at /home/alxndr/Development/qemu/hw/ide/ahci.c:650
  #7  0x5657bd8d in ahci_port_write (s=0x61e14d70, port=0x2, 
offset=, val=0x10) at 
/home/alxndr/Development/qemu/hw/ide/ahci.c:360
  #8  0x5657bd8d in ahci_mem_write (opaque=, 
addr=, val=, size=) at 
/home/alxndr/Development/qemu/hw/ide/ahci.c:513
  #9  0x560028d7 in memory_region_write_accessor (mr=, 
addr=, value=, size=, 
shift=, mask=, attrs=...) at 
/home/alxndr/Development/qemu/memory.c:483
  #10 0x56002280 in access_with_adjusted_size (addr=, 
value=, size=, access_size_min=, 
access_size_max=, access_fn=, mr=0x61e14da0, 
attrs=...) at /home/alxndr/Development/qemu/memory.c:544
  #11 0x56002280 in memory_region_dispatch_write (mr=, 
addr=, data=0x10, op=, attrs=...) at 
/home/alxndr/Development/qemu/memory.c:1476
  #12 0x55f171d4 in flatview_write_continue (fv=, 
addr=0xe106c22c, attrs=..., ptr=, len=0x1, addr1=0x7fffb8d0, 
l=, mr=0x61e14da0) at 
/home/alxndr/Development/qemu/exec.c:3137
  #13 0x55f0fb98 in flatview_write (fv=0x6063b180, addr=, attrs=..., buf=, len=) at 
/home/alxndr/Development/qemu/exec.c:3177

  I can reproduce it in qemu 5.0 using:

  cat << EOF | ~/Development/qemu/build/i386-softmmu/qemu-system-i386 -qtest 
stdio -monitor none -serial none -M pc-q35-5.0  -nographic
  outl 0xcf8 0x8000fa24
  outl 0xcfc 0xe106c000
  outl 0xcf8 0x8000fa04
  outw 0xcfc 0x7
  outl 0xcf8 0x8000fb20
  write 0x0 0x3 0x2780e7
  write 0xe106c22c 0xd 0x1130c218021130c218021130c2
  write 0xe106c218 0x15 0x110010110010110010110010110010110010110010
  EOF

  I also attached the commands to this launchpad report, in case the
  formatting is broken:

  qemu-system-i386 -qtest stdio -monitor none -serial none -M pc-q35-5.0
  -nographic < attachment

  Please let me know if I can provide any further info.
  -Alex

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1878255/+subscriptions



Re: [PATCH 1/1] scripts/performance: Add bisect.py script

2020-07-27 Thread Aleksandar Markovic
On Monday, July 27, 2020, John Snow  wrote:

> On 7/25/20 8:31 AM, Aleksandar Markovic wrote:
>
>>
>>
>> On Wednesday, July 22, 2020, Ahmed Karaman wrote:
>>
>> Python script that locates the commit that caused a performance
>> degradation or improvement in QEMU using the git bisect command
>> (binary search).
>>
>> Syntax:
>> bisect.py [-h] -s,--start START [-e,--end END] [-q,--qemu QEMU] \
>> --target TARGET --tool {perf,callgrind} -- \
>>  []
>>
>> [-h] - Print the script arguments help message
>> -s,--start START - First commit hash in the search range
>> [-e,--end END] - Last commit hash in the search range
>>  (default: Latest commit)
>> [-q,--qemu QEMU] - QEMU path.
>>  (default: Path to a GitHub QEMU clone)
>> --target TARGET - QEMU target name
>> --tool {perf,callgrind} - Underlying tool used for measurements
>>
>> Example of usage:
>> bisect.py --start=fdd76fecdd --qemu=/path/to/qemu --target=ppc \
>> --tool=perf -- coulomb_double-ppc -n 1000
>>
>> Example output:
>> Start Commit Instructions: 12,710,790,060
>> End Commit Instructions:   13,031,083,512
>> Performance Change:-2.458%
>>
>> Estimated Number of Steps: 10
>>
>> *BISECT STEP 1*
>> Instructions:13,031,097,790
>> Status:  slow commit
>> *BISECT STEP 2*
>> Instructions:12,710,805,265
>> Status:  fast commit
>> *BISECT STEP 3*
>> Instructions:13,031,028,053
>> Status:  slow commit
>> *BISECT STEP 4*
>> Instructions:12,711,763,211
>> Status:  fast commit
>> *BISECT STEP 5*
>> Instructions:13,031,027,292
>> Status:  slow commit
>> *BISECT STEP 6*
>> Instructions:12,711,748,738
>> Status:  fast commit
>> *BISECT STEP 7*
>> Instructions:12,711,748,788
>> Status:  fast commit
>> *BISECT STEP 8*
>> Instructions:13,031,100,493
>> Status:  slow commit
>> *BISECT STEP 9*
>> Instructions:12,714,472,954
>> Status:  fast commit
>> *BISECT STEP 10*
>> Instructions:12,715,409,153
>> Status:  fast commit
>> *BISECT STEP 11*
>> Instructions:12,715,394,739
>> Status:  fast commit
>>
>> *BISECT RESULT*
>> commit 0673ecdf6cb2b1445a85283db8cbacb251c46516
>> Author: Richard Henderson
>> Date:   Tue May 5 10:40:23 2020 -0700
>>
>>  softfloat: Inline float64 compare specializations
>>
>>  Replace the float64 compare specializations with inline functions
>>  that call the standard float64_compare{,_quiet} functions.
>>  Use bool as the return type.
>> ***
>>
>> Signed-off-by: Ahmed Karaman
>> ---
>>   scripts/performance/bisect.py | 374 ++
>> 
>>   1 file changed, 374 insertions(+)
>>   create mode 100755 scripts/performance/bisect.py
>>
>> diff --git a/scripts/performance/bisect.py
>> b/scripts/performance/bisect.py
>> new file mode 100755
>> index 00..869cc69ef4
>> --- /dev/null
>> +++ b/scripts/performance/bisect.py
>> @@ -0,0 +1,374 @@
>> +#!/usr/bin/env python3
>> +
>> +#  Locate the commit that caused a performance degradation or
>> improvement in
>> +#  QEMU using the git bisect command (binary search).
>> +#
>> +#  Syntax:
>> +#  bisect.py [-h] -s,--start START [-e,--end END] [-q,--qemu QEMU] \
>> +#  --target TARGET --tool {perf,callgrind} -- \
>> +#   []
>> +#
>> +#  [-h] - Print the script arguments help message
>> +#  -s,--start START - First commit hash in the search range
>> +#  [-e,--end END] - Last commit hash in the search range
>> +# (default: Latest commit)
>> +#  [-q,--qemu QEMU] - QEMU path.
>> +#  (default: Path to a GitHub QEMU clone)
>> +#  --target TARGET - QEMU target name
>> +#  --tool {perf,callgrind} - Underlying tool used for measurements
>> +
>> +#  Example of usage:
>> +#  bisect.py --start=fdd76fecdd --qemu=/path/to/qemu --target=ppc
>> --tool=perf \
>> +#  -- coulomb_double-ppc -n 1000
>> +#
>> +#  This file is a part of the 

Re: [PATCH v2 for-5.1? 0/5] Fix nbd reconnect dead-locks

2020-07-27 Thread Eric Blake

On 7/27/20 1:47 PM, Vladimir Sementsov-Ogievskiy wrote:

Hi all!

v2: it's a slightly updated "[PATCH for-5.1? 0/3] Fix nbd reconnect dead-locks"
plus a completely rewritten "[PATCH for-5.1? 0/4] non-blocking connect"
(which is now just patch 05)

01: new
02: rebased on 01, fix (add outer "if")
03-04: add Eric's r-b:
05: new

If 05 is too big for 5.1, it's OK to take only 01-04 or fewer, or to
postpone everything to 5.2, as none of this is a degradation of 5.1
(it's a degradation of 4.2, together with the whole reconnect feature).


I think I like where 5/5 is headed, but am not sure yet whether all 
paths are thread-safe or if there is anything we can reuse to make its 
implementation smaller.  You are right that it's probably best to defer 
that to 5.2.  In the meantime, I'll queue 1-4 for my NBD pull request 
for -rc2.




Vladimir Sementsov-Ogievskiy (5):
   block/nbd: split nbd_establish_connection out of nbd_client_connect
   block/nbd: allow drain during reconnect attempt
   block/nbd: on shutdown terminate connection attempt
   block/nbd: nbd_co_reconnect_loop(): don't sleep if drained
   block/nbd: use non-blocking connect: fix vm hang on connect()

  block/nbd.c| 360 +
  block/trace-events |   4 +-
  2 files changed, 331 insertions(+), 33 deletions(-)



--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




[PATCH v2 2/4] iotests: Make qemu_nbd_popen() a contextmanager

2020-07-27 Thread Nir Soffer
Instead of duplicating the code to wait until the server is ready and
remembering to terminate the server and wait for it, make it possible to
use it like this:

with qemu_nbd_popen('-k', sock, image):
# Access image via qemu-nbd socket...

Only test 264 used this helper, but I had to modify the output since it
did not log consistently when starting and stopping qemu-nbd.

Signed-off-by: Nir Soffer 
---
 tests/qemu-iotests/264| 76 +--
 tests/qemu-iotests/264.out|  2 +
 tests/qemu-iotests/iotests.py | 28 -
 3 files changed, 56 insertions(+), 50 deletions(-)

diff --git a/tests/qemu-iotests/264 b/tests/qemu-iotests/264
index 304a7443d7..666f164ed8 100755
--- a/tests/qemu-iotests/264
+++ b/tests/qemu-iotests/264
@@ -36,48 +36,32 @@ wait_step = 0.2
 
 qemu_img_create('-f', iotests.imgfmt, disk_a, str(size))
 qemu_img_create('-f', iotests.imgfmt, disk_b, str(size))
-srv = qemu_nbd_popen('-k', nbd_sock, '-f', iotests.imgfmt, disk_b)
 
-# Wait for NBD server availability
-t = 0
-ok = False
-while t < wait_limit:
-ok = qemu_io_silent_check('-f', 'raw', '-c', 'read 0 512', nbd_uri)
-if ok:
-break
-time.sleep(wait_step)
-t += wait_step
+with qemu_nbd_popen('-k', nbd_sock, '-f', iotests.imgfmt, disk_b):
+vm = iotests.VM().add_drive(disk_a)
+vm.launch()
+vm.hmp_qemu_io('drive0', 'write 0 {}'.format(size))
+
+vm.qmp_log('blockdev-add', filters=[iotests.filter_qmp_testfiles],
+   **{'node_name': 'backup0',
+  'driver': 'raw',
+  'file': {'driver': 'nbd',
+   'server': {'type': 'unix', 'path': nbd_sock},
+   'reconnect-delay': 10}})
+vm.qmp_log('blockdev-backup', device='drive0', sync='full', 
target='backup0',
+   speed=(1 * 1024 * 1024))
+
+# Wait for some progress
+t = 0
+while t < wait_limit:
+jobs = vm.qmp('query-block-jobs')['return']
+if jobs and jobs[0]['offset'] > 0:
+break
+time.sleep(wait_step)
+t += wait_step
 
-assert ok
-
-vm = iotests.VM().add_drive(disk_a)
-vm.launch()
-vm.hmp_qemu_io('drive0', 'write 0 {}'.format(size))
-
-vm.qmp_log('blockdev-add', filters=[iotests.filter_qmp_testfiles],
-   **{'node_name': 'backup0',
-  'driver': 'raw',
-  'file': {'driver': 'nbd',
-   'server': {'type': 'unix', 'path': nbd_sock},
-   'reconnect-delay': 10}})
-vm.qmp_log('blockdev-backup', device='drive0', sync='full', target='backup0',
-   speed=(1 * 1024 * 1024))
-
-# Wait for some progress
-t = 0
-while t < wait_limit:
-jobs = vm.qmp('query-block-jobs')['return']
 if jobs and jobs[0]['offset'] > 0:
-break
-time.sleep(wait_step)
-t += wait_step
-
-if jobs and jobs[0]['offset'] > 0:
-log('Backup job is started')
-
-log('Kill NBD server')
-srv.kill()
-srv.wait()
+log('Backup job is started')
 
 jobs = vm.qmp('query-block-jobs')['return']
 if jobs and jobs[0]['offset'] < jobs[0]['len']:
@@ -88,12 +72,8 @@ vm.qmp_log('block-job-set-speed', device='drive0', speed=0)
 # Emulate server down time for 1 second
 time.sleep(1)
 
-log('Start NBD server')
-srv = qemu_nbd_popen('-k', nbd_sock, '-f', iotests.imgfmt, disk_b)
-
-e = vm.event_wait('BLOCK_JOB_COMPLETED')
-log('Backup completed: {}'.format(e['data']['offset']))
-
-vm.qmp_log('blockdev-del', node_name='backup0')
-srv.kill()
-vm.shutdown()
+with qemu_nbd_popen('-k', nbd_sock, '-f', iotests.imgfmt, disk_b):
+e = vm.event_wait('BLOCK_JOB_COMPLETED')
+log('Backup completed: {}'.format(e['data']['offset']))
+vm.qmp_log('blockdev-del', node_name='backup0')
+vm.shutdown()
diff --git a/tests/qemu-iotests/264.out b/tests/qemu-iotests/264.out
index 3000944b09..c45b1e81ef 100644
--- a/tests/qemu-iotests/264.out
+++ b/tests/qemu-iotests/264.out
@@ -1,3 +1,4 @@
+Start NBD server
 {"execute": "blockdev-add", "arguments": {"driver": "raw", "file": {"driver": 
"nbd", "reconnect-delay": 10, "server": {"path": "TEST_DIR/PID-nbd-sock", 
"type": "unix"}}, "node-name": "backup0"}}
 {"return": {}}
 {"execute": "blockdev-backup", "arguments": {"device": "drive0", "speed": 
1048576, "sync": "full", "target": "backup0"}}
@@ -11,3 +12,4 @@ Start NBD server
 Backup completed: 5242880
 {"execute": "blockdev-del", "arguments": {"node-name": "backup0"}}
 {"return": {}}
+Kill NBD server
diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 3590ed78a0..8f79668435 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -28,10 +28,13 @@ import signal
 import struct
 import subprocess
 import sys
+import time
 from typing import (Any, Callable, Dict, Iterable,
 List, Optional, Sequence, Tuple, TypeVar)
 import unittest
 
+from contextlib import contextmanager
+
 # pylint: disable=import-error, wrong-import-position
 

[PATCH v2 3/4] iotests: Add more qemu_img helpers

2020-07-27 Thread Nir Soffer
Add 2 helpers for measuring and checking images:
- qemu_img_measure()
- qemu_img_check()

Both use --output=json and parse the returned JSON to make it easy to
use in other tests. I'm going to use them in a new test, and I hope they
will be useful in many other tests.
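
For illustration, the new test in this series uses them roughly like
this ("required" and "image-end-offset" are keys of qemu-img's JSON
output; src_disk, tar, offset and nbd_uri are names from that test):

# Reserve worst-case space in the tar for the compressed image
measure = qemu_img_measure("-O", "qcow2", src_disk)
tar.fileobj.truncate(offset + measure["required"])

# After conversion, find where the image data actually ends
actual_size = qemu_img_check(nbd_uri)["image-end-offset"]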

Signed-off-by: Nir Soffer 
---
 tests/qemu-iotests/iotests.py | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 8f79668435..717b5b652c 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -141,6 +141,12 @@ def qemu_img_create(*args):
 
 return qemu_img(*args)
 
+def qemu_img_measure(*args):
+return json.loads(qemu_img_pipe("measure", "--output", "json", *args))
+
+def qemu_img_check(*args):
+return json.loads(qemu_img_pipe("check", "--output", "json", *args))
+
 def qemu_img_verbose(*args):
 '''Run qemu-img without suppressing its output and return the exit code'''
 exitcode = subprocess.call(qemu_img_args + list(args))
-- 
2.25.4




[PATCH v2 0/4] Fix convert to qcow2 compressed to NBD

2020-07-27 Thread Nir Soffer
Fix qemu-img convert -O qcow2 -c to an NBD URL and add a missing test
for this usage.

The conversion itself already works, but unfortunately qemu-img fails
when trying to truncate the target image to the same size at the end of
the operation.

Changes since v1:
- Include complete code for creating OVA file [Eric]
- Use qcow2 for source file to avoid issues with random CI filesystem [Max]
- Fix many typos [Eric, Max]
- Make qemu_nbd_popen a context manager
- Add more qemu_img_* helpers
- Verify OVA file contents

v1 was here:
https://lists.nongnu.org/archive/html/qemu-block/2020-07/msg01543.html

Nir Soffer (4):
  block: nbd: Fix convert qcow2 compressed to nbd
  iotests: Make qemu_nbd_popen() a contextmanager
  iotests: Add more qemu_img helpers
  iotests: Test convert to qcow2 compressed to NBD

 block/nbd.c   |  30 
 tests/qemu-iotests/264|  76 
 tests/qemu-iotests/264.out|   2 +
 tests/qemu-iotests/302| 127 ++
 tests/qemu-iotests/302.out|  31 +
 tests/qemu-iotests/group  |   1 +
 tests/qemu-iotests/iotests.py |  34 -
 7 files changed, 251 insertions(+), 50 deletions(-)
 create mode 100755 tests/qemu-iotests/302
 create mode 100644 tests/qemu-iotests/302.out

-- 
2.25.4




[PATCH v2 4/4] iotests: Test convert to qcow2 compressed to NBD

2020-07-27 Thread Nir Soffer
Add test for "qemu-img convert -O qcow2 -c" to NBD target. The tests    
create a OVA file and write compressed qcow2 disk content directly into
the OVA file via qemu-nbd.

Signed-off-by: Nir Soffer 
---
 tests/qemu-iotests/302 | 127 +
 tests/qemu-iotests/302.out |  31 +
 tests/qemu-iotests/group   |   1 +
 3 files changed, 159 insertions(+)
 create mode 100755 tests/qemu-iotests/302
 create mode 100644 tests/qemu-iotests/302.out

diff --git a/tests/qemu-iotests/302 b/tests/qemu-iotests/302
new file mode 100755
index 00..a8506bda15
--- /dev/null
+++ b/tests/qemu-iotests/302
@@ -0,0 +1,127 @@
+#!/usr/bin/env python3
+#
+# Tests converting qcow2 compressed to NBD
+#
+# Copyright (c) 2020 Nir Soffer 
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+# owner=nir...@gmail.com
+
+import io
+import tarfile
+
+import iotests
+
+from iotests import (
+file_path,
+qemu_img,
+qemu_img_check,
+qemu_img_create,
+qemu_img_log,
+qemu_img_measure,
+qemu_io,
+qemu_nbd_popen,
+)
+
+iotests.script_initialize(supported_fmts=["qcow2"])
+
+# Create source disk. Using qcow2 to enable strict comparing later, and
+# avoid issues with random filesystem on CI environment.
+src_disk = file_path("disk.qcow2")
+qemu_img_create("-f", iotests.imgfmt, src_disk, "1g")
+qemu_io("-f", iotests.imgfmt, "-c", "write 1m 64k", src_disk)
+
+# The use case is writing a qcow2 image directly into an OVA file, which
+# is a tar file with a specific layout. This is tricky since we don't know
+# the size of the image before compressing, so we have to do:
+# 1. Add an ovf file.
+# 2. Find the offset of the next member data.
+# 3. Make room for image data, allocating for the worst case.
+# 4. Write compressed image data into the tar.
+# 5. Add a tar entry with the actual image size.
+# 6. Shrink the tar to the actual size, aligned to 512 bytes.
+
+tar_file = file_path("test.ova")
+
+with tarfile.open(tar_file, "w") as tar:
+
+# 1. Add an ovf file.
+
+ovf_data = b""
+ovf = tarfile.TarInfo("vm.ovf")
+ovf.size = len(ovf_data)
+tar.addfile(ovf, io.BytesIO(ovf_data))
+
+# 2. Find the offset of the next member data.
+
+offset = tar.fileobj.tell() + 512
+
+# 3. Make room for image data, allocating for the worst case.
+
+measure = qemu_img_measure("-O", "qcow2", src_disk)
+tar.fileobj.truncate(offset + measure["required"])
+
+# 4. Write compressed image data into the tar.
+
+nbd_sock = file_path("nbd-sock", base_dir=iotests.sock_dir)
+nbd_uri = "nbd+unix:///exp?socket=" + nbd_sock
+
+# Use raw format to allow creating qcow2 directly into tar file.
+with qemu_nbd_popen(
+"--socket", nbd_sock,
+"--export-name", "exp",
+"--format", "raw",
+"--offset", str(offset),
+tar_file):
+
+iotests.log("=== Target image info ===")
+qemu_img_log("info", nbd_uri)
+
+qemu_img(
+"convert",
+"-f", iotests.imgfmt,
+"-O", "qcow2",
+"-c",
+src_disk,
+nbd_uri)
+
+iotests.log("=== Converted image info ===")
+qemu_img_log("info", nbd_uri)
+
+iotests.log("=== Converted image check ===")
+qemu_img_log("check", nbd_uri)
+
+iotests.log("=== Comparing to source disk ===")
+qemu_img_log("compare", src_disk, nbd_uri)
+
+actual_size = qemu_img_check(nbd_uri)["image-end-offset"]
+
+# 5. Add a tar entry with the actual image size.
+
+disk = tarfile.TarInfo("disk")
+disk.size = actual_size
+tar.addfile(disk)
+
+# 6. Shrink the tar to the actual size, aligned to 512 bytes.
+
+tar_size = offset + ((disk.size + 511) & ~511)
+tar.fileobj.seek(tar_size)
+tar.fileobj.truncate(tar_size)
+
+with tarfile.open(tar_file) as tar:
+members = [{"name": m.name, "size": m.size, "offset": m.offset_data}
+   for m in tar]
+iotests.log("=== OVA file contents ===")
+iotests.log(members)
diff --git a/tests/qemu-iotests/302.out b/tests/qemu-iotests/302.out
new file mode 100644
index 00..e37d3a1030
--- /dev/null
+++ b/tests/qemu-iotests/302.out
@@ -0,0 +1,31 @@
+Start NBD server
+=== Target image info ===
+image: nbd+unix:///exp?socket=SOCK_DIR/PID-nbd-sock
+file format: raw
+virtual size: 448 KiB (458752 bytes)
+disk 

[PATCH v2 1/4] block: nbd: Fix convert qcow2 compressed to nbd

2020-07-27 Thread Nir Soffer
When converting to qcow2 compressed format, the last step is a special
zero length compressed write, ending in a call to bdrv_co_truncate(). This
call always fails for the nbd driver since it does not implement
bdrv_co_truncate().

For block devices, which have the same limits, the call succeeds since
the file driver implements bdrv_co_truncate(). If the caller asked to
truncate to the same or smaller size with exact=false, the truncate
succeeds. Implement the same logic for nbd.

Example failing without this change:

In one shell starts qemu-nbd:

$ truncate -s 1g test.tar
$ qemu-nbd --socket=/tmp/nbd.sock --persistent --format=raw --offset 1536 
test.tar

In another shell convert an image to qcow2 compressed via NBD:

$ echo "disk data" > disk.raw
$ truncate -s 1g disk.raw
$ qemu-img convert -f raw -O qcow2 -c disk.raw 
nbd+unix:///?socket=/tmp/nbd.sock; echo $?
1

qemu-img failed, but the conversion was successful:

$ qemu-img info nbd+unix:///?socket=/tmp/nbd.sock
image: nbd+unix://?socket=/tmp/nbd.sock
file format: qcow2
virtual size: 1 GiB (1073741824 bytes)
...

$ qemu-img check nbd+unix:///?socket=/tmp/nbd.sock
No errors were found on the image.
1/16384 = 0.01% allocated, 100.00% fragmented, 100.00% compressed clusters
Image end offset: 393216

$ qemu-img compare disk.raw nbd+unix:///?socket=/tmp/nbd.sock
Images are identical.

Fixes: https://bugzilla.redhat.com/1860627
Signed-off-by: Nir Soffer 
---
 block/nbd.c | 30 ++
 1 file changed, 30 insertions(+)

diff --git a/block/nbd.c b/block/nbd.c
index 65a4f56924..dcb0b03641 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -1966,6 +1966,33 @@ static void nbd_close(BlockDriverState *bs)
 nbd_clear_bdrvstate(s);
 }
 
+/*
+ * NBD cannot truncate, but if the caller asks to truncate to the same size, or
+ * to a smaller size with exact=false, there is no reason to fail the
+ * operation.
+ *
+ * Preallocation mode is ignored since it does not seem useful to fail
+ * when we never change anything.
+ */
+static int coroutine_fn nbd_co_truncate(BlockDriverState *bs, int64_t offset,
+bool exact, PreallocMode prealloc,
+BdrvRequestFlags flags, Error **errp)
+{
+BDRVNBDState *s = bs->opaque;
+
+if (offset != s->info.size && exact) {
+error_setg(errp, "Cannot resize NBD nodes");
+return -ENOTSUP;
+}
+
+if (offset > s->info.size) {
+error_setg(errp, "Cannot grow NBD nodes");
+return -EINVAL;
+}
+
+return 0;
+}
+
 static int64_t nbd_getlength(BlockDriverState *bs)
 {
 BDRVNBDState *s = bs->opaque;
@@ -2045,6 +2072,7 @@ static BlockDriver bdrv_nbd = {
 .bdrv_co_flush_to_os= nbd_co_flush,
 .bdrv_co_pdiscard   = nbd_client_co_pdiscard,
 .bdrv_refresh_limits= nbd_refresh_limits,
+.bdrv_co_truncate   = nbd_co_truncate,
 .bdrv_getlength = nbd_getlength,
 .bdrv_detach_aio_context= nbd_client_detach_aio_context,
 .bdrv_attach_aio_context= nbd_client_attach_aio_context,
@@ -2072,6 +2100,7 @@ static BlockDriver bdrv_nbd_tcp = {
 .bdrv_co_flush_to_os= nbd_co_flush,
 .bdrv_co_pdiscard   = nbd_client_co_pdiscard,
 .bdrv_refresh_limits= nbd_refresh_limits,
+.bdrv_co_truncate   = nbd_co_truncate,
 .bdrv_getlength = nbd_getlength,
 .bdrv_detach_aio_context= nbd_client_detach_aio_context,
 .bdrv_attach_aio_context= nbd_client_attach_aio_context,
@@ -2099,6 +2128,7 @@ static BlockDriver bdrv_nbd_unix = {
 .bdrv_co_flush_to_os= nbd_co_flush,
 .bdrv_co_pdiscard   = nbd_client_co_pdiscard,
 .bdrv_refresh_limits= nbd_refresh_limits,
+.bdrv_co_truncate   = nbd_co_truncate,
 .bdrv_getlength = nbd_getlength,
 .bdrv_detach_aio_context= nbd_client_detach_aio_context,
 .bdrv_attach_aio_context= nbd_client_attach_aio_context,
-- 
2.25.4




[PATCH 1/4] hw/hppa: Sync hppa_hardware.h file with SeaBIOS sources

2020-07-27 Thread Helge Deller
The hppa_hardware.h file is shared with SeaBIOS. Sync it.

Signed-off-by: Helge Deller 
---
 hw/hppa/hppa_hardware.h | 6 ++
 hw/hppa/lasi.c  | 2 --
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/hw/hppa/hppa_hardware.h b/hw/hppa/hppa_hardware.h
index 4a2fe2df60..cdb7fa6240 100644
--- a/hw/hppa/hppa_hardware.h
+++ b/hw/hppa/hppa_hardware.h
@@ -17,6 +17,7 @@
 #define LASI_UART_HPA   0xffd05000
 #define LASI_SCSI_HPA   0xffd06000
 #define LASI_LAN_HPA0xffd07000
+#define LASI_RTC_HPA0xffd09000
 #define LASI_LPT_HPA0xffd02000
 #define LASI_AUDIO_HPA  0xffd04000
 #define LASI_PS2KBD_HPA 0xffd08000
@@ -37,10 +38,15 @@
 #define PORT_PCI_CMD(PCI_HPA + DINO_PCI_ADDR)
 #define PORT_PCI_DATA   (PCI_HPA + DINO_CONFIG_DATA)

+/* QEMU fw_cfg interface port */
+#define QEMU_FW_CFG_IO_BASE (MEMORY_HPA + 0x80)
+
 #define PORT_SERIAL1(DINO_UART_HPA + 0x800)
 #define PORT_SERIAL2(LASI_UART_HPA + 0x800)

 #define HPPA_MAX_CPUS   8   /* max. number of SMP CPUs */
 #define CPU_CLOCK_MHZ   250 /* emulate a 250 MHz CPU */

+#define CPU_HPA_CR_REG  7   /* store CPU HPA in cr7 (SeaBIOS internal) */
+
 #endif
diff --git a/hw/hppa/lasi.c b/hw/hppa/lasi.c
index 19974034f3..ffcbb988b8 100644
--- a/hw/hppa/lasi.c
+++ b/hw/hppa/lasi.c
@@ -54,8 +54,6 @@
 #define LASI_CHIP(obj) \
 OBJECT_CHECK(LasiState, (obj), TYPE_LASI_CHIP)

-#define LASI_RTC_HPA(LASI_HPA + 0x9000)
-
 typedef struct LasiState {
 PCIHostState parent_obj;

--
2.21.3




[PATCH 0/4] Various fixes for hppa architecture

2020-07-27 Thread Helge Deller
This patch series fixes a few issues with the hppa emulation:

* The artist framebuffer emulation reports:
  "write outside bounds: wants 1256x1023, max size 1280x1024"
  This is fixed by a patch from Sven Schnelle.

* Fix a SeaBIOS hppa compilation issue with gcc-10.

* Implement a proper SeaBIOS firmware version check to prevent
  incompatibility issues between emulation and firmware.

* The hppa_hardware.h file is shared with SeaBIOS. Sync it.

The series can be pulled from the fw_cfg-3 branch at:
https://github.com/hdeller/qemu-hppa.git  fw_cfg-3

Helge

Helge Deller (3):
  hw/hppa: Sync hppa_hardware.h file with SeaBIOS sources
  seabios-hppa: Update to SeaBIOS hppa version 1
  hw/hppa: Implement proper SeaBIOS version check

Sven Schnelle (1):
  hw/display/artist.c: fix out of bounds check

 hw/display/artist.c   |  18 ++
 hw/hppa/hppa_hardware.h   |   6 ++
 hw/hppa/lasi.c|   2 --
 hw/hppa/machine.c |  22 ++
 pc-bios/hppa-firmware.img | Bin 766136 -> 783144 bytes
 roms/seabios-hppa |   2 +-
 6 files changed, 35 insertions(+), 15 deletions(-)

--
2.21.3




[PATCH 3/4] hw/hppa: Implement proper SeaBIOS version check

2020-07-27 Thread Helge Deller
It's important that the SeaBIOS hppa firmware is at least at a minimal
version level to ensure proper interaction between QEMU and the firmware.

Implement a proper firmware version check by telling SeaBIOS via the
fw_cfg interface which minimal SeaBIOS version is required by this
running qemu instance. If the firmware detects that it's too old, it
will stop.
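
The version is handed to the firmware as an 8-byte little-endian value
in a fw_cfg file. A minimal decoding sketch (illustration only; SeaBIOS
consumes this in C, and the exact consumption details are assumptions
here):

import struct

# Example blob as QEMU stores it: cpu_to_le64(MIN_SEABIOS_HPPA_VERSION)
blob = struct.pack('<Q', 1)
min_version = struct.unpack('<Q', blob)[0]
assert min_version == 1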

Signed-off-by: Helge Deller 
---
 hw/hppa/machine.c | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/hw/hppa/machine.c b/hw/hppa/machine.c
index 49155537cd..90aeefe2a4 100644
--- a/hw/hppa/machine.c
+++ b/hw/hppa/machine.c
@@ -25,6 +25,8 @@

 #define MAX_IDE_BUS 2

+#define MIN_SEABIOS_HPPA_VERSION 1 /* require at least this fw version */
+
 static ISABus *hppa_isa_bus(void)
 {
 ISABus *isa_bus;
@@ -56,6 +58,23 @@ static uint64_t cpu_hppa_to_phys(void *opaque, uint64_t addr)
 static HPPACPU *cpu[HPPA_MAX_CPUS];
 static uint64_t firmware_entry;

+static FWCfgState *create_fw_cfg(MachineState *ms)
+{
+FWCfgState *fw_cfg;
+uint64_t val;
+
+fw_cfg = fw_cfg_init_mem(QEMU_FW_CFG_IO_BASE, QEMU_FW_CFG_IO_BASE + 4);
+fw_cfg_add_i16(fw_cfg, FW_CFG_NB_CPUS, ms->smp.cpus);
+fw_cfg_add_i16(fw_cfg, FW_CFG_MAX_CPUS, HPPA_MAX_CPUS);
+fw_cfg_add_i64(fw_cfg, FW_CFG_RAM_SIZE, ram_size);
+
+val = cpu_to_le64(MIN_SEABIOS_HPPA_VERSION);
+fw_cfg_add_file(fw_cfg, "/etc/firmware-min-version",
+g_memdup(&val, sizeof(val)), sizeof(val));
+
+return fw_cfg;
+}
+
 static void machine_hppa_init(MachineState *machine)
 {
 const char *kernel_filename = machine->kernel_filename;
@@ -118,6 +137,9 @@ static void machine_hppa_init(MachineState *machine)
115200, serial_hd(0), DEVICE_BIG_ENDIAN);
 }

+/* fw_cfg configuration interface */
+create_fw_cfg(machine);
+
 /* SCSI disk setup. */
 dev = DEVICE(pci_create_simple(pci_bus, -1, "lsi53c895a"));
 lsi53c8xx_handle_legacy_cmdline(dev);
--
2.21.3




[PATCH 4/4] hw/display/artist.c: fix out of bounds check

2020-07-27 Thread Helge Deller
From: Sven Schnelle 

Signed-off-by: Sven Schnelle 
Signed-off-by: Helge Deller 
---
 hw/display/artist.c | 18 ++
 1 file changed, 6 insertions(+), 12 deletions(-)

diff --git a/hw/display/artist.c b/hw/display/artist.c
index 6261bfe65b..46043ec895 100644
--- a/hw/display/artist.c
+++ b/hw/display/artist.c
@@ -340,14 +340,13 @@ static void vram_bit_write(ARTISTState *s, int posx, int 
posy, bool incr_x,
 {
 struct vram_buffer *buf;
 uint32_t vram_bitmask = s->vram_bitmask;
-int mask, i, pix_count, pix_length, offset, height, width;
+int mask, i, pix_count, pix_length, offset, width;
 uint8_t *data8, *p;

 pix_count = vram_write_pix_per_transfer(s);
 pix_length = vram_pixel_length(s);

 buf = vram_write_buffer(s);
-height = buf->height;
 width = buf->width;

 if (s->cmap_bm_access) {
@@ -367,20 +366,13 @@ static void vram_bit_write(ARTISTState *s, int posx, int 
posy, bool incr_x,
 pix_count = size * 8;
 }

-if (posy * width + posx + pix_count > buf->size) {
-qemu_log("write outside bounds: wants %dx%d, max size %dx%d\n",
- posx, posy, width, height);
-return;
-}
-
-
 switch (pix_length) {
 case 0:
 if (s->image_bitmap_op & 0x2000) {
 data &= vram_bitmask;
 }

-for (i = 0; i < pix_count; i++) {
+for (i = 0; i < pix_count && offset + i < buf->size; i++) {
 artist_rop8(s, p + offset + pix_count - 1 - i,
 (data & 1) ? (s->plane_mask >> 24) : 0);
 data >>= 1;
@@ -398,7 +390,9 @@ static void vram_bit_write(ARTISTState *s, int posx, int 
posy, bool incr_x,
 for (i = 3; i >= 0; i--) {
 if (!(s->image_bitmap_op & 0x2000) ||
 s->vram_bitmask & (1 << (28 + i))) {
-artist_rop8(s, p + offset + 3 - i, data8[ROP8OFF(i)]);
+if (offset + 3 - i < buf->size) {
+artist_rop8(s, p + offset + 3 - i, data8[ROP8OFF(i)]);
+}
 }
 }
 memory_region_set_dirty(&buf->mr, offset, 3);
@@ -420,7 +414,7 @@ static void vram_bit_write(ARTISTState *s, int posx, int 
posy, bool incr_x,
 break;
 }

-for (i = 0; i < pix_count; i++) {
+for (i = 0; i < pix_count && offset + i < buf->size; i++) {
 mask = 1 << (pix_count - 1 - i);

 if (!(s->image_bitmap_op & 0x2000) ||
--
2.21.3




Re: [PATCH v2 2/5] block/nbd: allow drain during reconnect attempt

2020-07-27 Thread Eric Blake

On 7/27/20 1:47 PM, Vladimir Sementsov-Ogievskiy wrote:

It should be to reenter qio_channel_yield() on io/channel read/write


be safe


path, so it's safe to reduce in_flight and allow attaching new aio
context. And no problem to allow drain itself: connection attempt is
not a guest request. Moreover, if remote server is down, we can hang
in negotiation, blocking drain section and provoking a dead lock.

How to reproduce the dead lock:

1. Create nbd-fault-injector.conf with the following contents:

[inject-error "mega1"]
event=data
io=readwrite
when=before

2. In one terminal run nbd-fault-injector in a loop, like this:

n=1; while true; do
 echo $n; ((n++));
 ./nbd-fault-injector.py 127.0.0.1:1 nbd-fault-injector.conf;
done

3. In another terminal run qemu-io in a loop, like this:

n=1; while true; do
 echo $n; ((n++));
 ./qemu-io -c 'read 0 512' nbd://127.0.0.1:1;
done





Note that the hang may be
triggered by another bug, so the whole case is fixed only together with
commit "block/nbd: on shutdown terminate connection attempt".

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  block/nbd.c | 14 ++
  1 file changed, 14 insertions(+)

diff --git a/block/nbd.c b/block/nbd.c
index 2ec6623c18..6d19f3c660 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -291,8 +291,22 @@ static coroutine_fn void 
nbd_reconnect_attempt(BDRVNBDState *s)
  goto out;
  }
  
+bdrv_dec_in_flight(s->bs);

+
  ret = nbd_client_handshake(s->bs, sioc, _err);
  
+if (s->drained) {

+s->wait_drained_end = true;
+while (s->drained) {
+/*
+ * We may be entered once from nbd_client_attach_aio_context_bh
+ * and then from nbd_client_co_drain_end. So here is a loop.
+ */
+qemu_coroutine_yield();
+}
+}
+bdrv_inc_in_flight(s->bs);
+
  out:
  s->connect_status = ret;
  error_free(s->connect_err);



Reviewed-by: Eric Blake 

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PATCH v2 1/5] block/nbd: split nbd_establish_connection out of nbd_client_connect

2020-07-27 Thread Eric Blake

On 7/27/20 1:47 PM, Vladimir Sementsov-Ogievskiy wrote:

We are going to implement a non-blocking version of
nbd_establish_connection, which for a while will be used only for
nbd_reconnect_attempt, not for nbd_open, so we need to call it
separately.

Refactor nbd_reconnect_attempt in a way which makes next commit
simpler.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  block/nbd.c| 60 +++---
  block/trace-events |  4 ++--
  2 files changed, 38 insertions(+), 26 deletions(-)



Reviewed-by: Eric Blake 

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




[PULL 24/24] migration: Fix typos in bitmap migration comments

2020-07-27 Thread Eric Blake
Noticed while reviewing the file for newer patches.

Fixes: b35ebdf076
Signed-off-by: Eric Blake 
Message-Id: <20200727203206.134996-1-ebl...@redhat.com>
---
 migration/block-dirty-bitmap.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 1f675b792fc9..784330ebe130 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -97,7 +97,7 @@

 #define DIRTY_BITMAP_MIG_START_FLAG_ENABLED  0x01
 #define DIRTY_BITMAP_MIG_START_FLAG_PERSISTENT   0x02
-/* 0x04 was "AUTOLOAD" flags on elder versions, no it is ignored */
+/* 0x04 was "AUTOLOAD" flags on older versions, now it is ignored */
 #define DIRTY_BITMAP_MIG_START_FLAG_RESERVED_MASK0xf8

 /* State of one bitmap during save process */
@@ -180,7 +180,7 @@ static uint32_t qemu_get_bitmap_flags(QEMUFile *f)

 static void qemu_put_bitmap_flags(QEMUFile *f, uint32_t flags)
 {
-/* The code currently do not send flags more than one byte */
+/* The code currently does not send flags as more than one byte */
 assert(!(flags & (0xffffff00 | DIRTY_BITMAP_MIG_EXTRA_FLAGS)));

 qemu_put_byte(f, flags);
-- 
2.27.0




[PULL 22/24] qemu-iotests/199: add source-killed case to bitmaps postcopy

2020-07-27 Thread Eric Blake
From: Vladimir Sementsov-Ogievskiy 

Previous patches fixed the behavior of bitmaps migration, so that errors
are handled by just removing unfinished bitmaps, rather than failing or
trying to recover the postcopy migration. Add a corresponding test.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
Tested-by: Eric Blake 
Message-Id: <20200727194236.19551-22-vsement...@virtuozzo.com>
Signed-off-by: Eric Blake 
---
 tests/qemu-iotests/199 | 15 +++
 tests/qemu-iotests/199.out |  4 ++--
 2 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index 140930b2b12e..58fad872a12c 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -241,6 +241,21 @@ class 
TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 self.vm_a.launch()
 check_bitmaps(self.vm_a, 0)

+def test_early_kill_source(self):
+self.start_postcopy()
+
+self.vm_a_events = self.vm_a.get_qmp_events()
+self.vm_a.kill()
+
+self.vm_a.launch()
+
+match = {'data': {'status': 'completed'}}
+e_complete = self.vm_b.event_wait('MIGRATION', match=match)
+self.vm_b_events.append(e_complete)
+
+check_bitmaps(self.vm_a, 0)
+check_bitmaps(self.vm_b, 0)
+

 if __name__ == '__main__':
 iotests.main(supported_fmts=['qcow2'])
diff --git a/tests/qemu-iotests/199.out b/tests/qemu-iotests/199.out
index fbc63e62f885..8d7e99670093 100644
--- a/tests/qemu-iotests/199.out
+++ b/tests/qemu-iotests/199.out
@@ -1,5 +1,5 @@
-..
+...
 --
-Ran 2 tests
+Ran 3 tests

 OK
-- 
2.27.0




[PULL 19/24] qemu-iotests/199: prepare for new test-cases addition

2020-07-27 Thread Eric Blake
From: Vladimir Sementsov-Ogievskiy 

Move the future common part to the start_postcopy() method. Move the
check of the number of bitmaps to check_bitmaps().

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
Message-Id: <20200727194236.19551-19-vsement...@virtuozzo.com>
Signed-off-by: Eric Blake 
---
 tests/qemu-iotests/199 | 36 +++-
 1 file changed, 23 insertions(+), 13 deletions(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index d8532e49da00..355c0b288592 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -29,6 +29,8 @@ disk_b = os.path.join(iotests.test_dir, 'disk_b')
 size = '256G'
 fifo = os.path.join(iotests.test_dir, 'mig_fifo')

+granularity = 512
+nb_bitmaps = 15

 GiB = 1024 * 1024 * 1024

@@ -61,6 +63,15 @@ def event_dist(e1, e2):
 return event_seconds(e2) - event_seconds(e1)


+def check_bitmaps(vm, count):
+result = vm.qmp('query-block')
+
+if count == 0:
+assert 'dirty-bitmaps' not in result['return'][0]
+else:
+assert len(result['return'][0]['dirty-bitmaps']) == count
+
+
 class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 def tearDown(self):
 if debug:
@@ -101,10 +112,8 @@ class 
TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 self.vm_a_events = []
 self.vm_b_events = []

-def test_postcopy(self):
-granularity = 512
-nb_bitmaps = 15
-
+def start_postcopy(self):
+""" Run migration until RESUME event on target. Return this event. """
 for i in range(nb_bitmaps):
 result = self.vm_a.qmp('block-dirty-bitmap-add', node='drive0',
name='bitmap{}'.format(i),
@@ -119,10 +128,10 @@ class 
TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):

 result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',
node='drive0', name='bitmap0')
-discards1_sha256 = result['return']['sha256']
+self.discards1_sha256 = result['return']['sha256']

 # Check, that updating the bitmap by discards works
-assert discards1_sha256 != empty_sha256
+assert self.discards1_sha256 != empty_sha256

 # We want to calculate resulting sha256. Do it in bitmap0, so, disable
 # other bitmaps
@@ -135,7 +144,7 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):

 result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',
node='drive0', name='bitmap0')
-all_discards_sha256 = result['return']['sha256']
+self.all_discards_sha256 = result['return']['sha256']

 # Now, enable some bitmaps, to be updated during migration
 for i in range(2, nb_bitmaps, 2):
@@ -160,6 +169,10 @@ class 
TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):

 event_resume = self.vm_b.event_wait('RESUME')
 self.vm_b_events.append(event_resume)
+return event_resume
+
+def test_postcopy_success(self):
+event_resume = self.start_postcopy()

 # enabled bitmaps should be updated
 apply_discards(self.vm_b, discards2)
@@ -180,18 +193,15 @@ class 
TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 print('downtime:', downtime)
 print('postcopy_time:', postcopy_time)

-# Assert that bitmap migration is finished (check that successor bitmap
-# is removed)
-result = self.vm_b.qmp('query-block')
-assert len(result['return'][0]['dirty-bitmaps']) == nb_bitmaps
+check_bitmaps(self.vm_b, nb_bitmaps)

 # Check content of migrated bitmaps. Still, don't waste time checking
 # every bitmap
 for i in range(0, nb_bitmaps, 5):
 result = self.vm_b.qmp('x-debug-block-dirty-bitmap-sha256',
node='drive0', name='bitmap{}'.format(i))
-sha256 = discards1_sha256 if i % 2 else all_discards_sha256
-self.assert_qmp(result, 'return/sha256', sha256)
+sha = self.discards1_sha256 if i % 2 else self.all_discards_sha256
+self.assert_qmp(result, 'return/sha256', sha)


 if __name__ == '__main__':
-- 
2.27.0




[PULL 21/24] qemu-iotests/199: add early shutdown case to bitmaps postcopy

2020-07-27 Thread Eric Blake
From: Vladimir Sementsov-Ogievskiy 

Previous patches fixed two crashes which may occur on shutdown before
bitmaps postcopy has finished. Check that it works now.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
Tested-by: Eric Blake 
Message-Id: <20200727194236.19551-21-vsement...@virtuozzo.com>
Signed-off-by: Eric Blake 
---
 tests/qemu-iotests/199 | 24 
 tests/qemu-iotests/199.out |  4 ++--
 2 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index 5fd34f0fcdfa..140930b2b12e 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -217,6 +217,30 @@ class 
TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 sha = self.discards1_sha256 if i % 2 else self.all_discards_sha256
 self.assert_qmp(result, 'return/sha256', sha)

+def test_early_shutdown_destination(self):
+self.start_postcopy()
+
+self.vm_b_events += self.vm_b.get_qmp_events()
+self.vm_b.shutdown()
+# recreate vm_b, so there is no incoming option, which prevents
+# loading bitmaps from disk
+self.vm_b = iotests.VM(path_suffix='b').add_drive(disk_b)
+self.vm_b.launch()
+check_bitmaps(self.vm_b, 0)
+
+# Bitmaps will be lost if we just shutdown the vm, as they are marked
+# to skip storing to disk when prepared for migration. And that's
+# correct, as actual data may be modified in target vm, so we play
+# safe.
+# Still, this mark would be taken away if we do 'cont', and bitmaps
+# become persistent again. (see iotest 169 for such behavior case)
+result = self.vm_a.qmp('query-status')
+assert not result['return']['running']
+self.vm_a_events += self.vm_a.get_qmp_events()
+self.vm_a.shutdown()
+self.vm_a.launch()
+check_bitmaps(self.vm_a, 0)
+

 if __name__ == '__main__':
 iotests.main(supported_fmts=['qcow2'])
diff --git a/tests/qemu-iotests/199.out b/tests/qemu-iotests/199.out
index ae1213e6f863..fbc63e62f885 100644
--- a/tests/qemu-iotests/199.out
+++ b/tests/qemu-iotests/199.out
@@ -1,5 +1,5 @@
-.
+..
 --
-Ran 1 tests
+Ran 2 tests

 OK
-- 
2.27.0




[PULL 23/24] iotests: Adjust which migration tests are quick

2020-07-27 Thread Eric Blake
A quick run of './check -qcow2 -g migration' shows that test 169 is
NOT quick, but meanwhile several other tests ARE quick.  Let's adjust
the test designations accordingly.

Signed-off-by: Eric Blake 
Message-Id: <20200727195117.132151-1-ebl...@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy 
---
 tests/qemu-iotests/group | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index 1d0252e1f051..806044642c69 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -112,7 +112,7 @@
 088 rw quick
 089 rw auto quick
 090 rw auto quick
-091 rw migration
+091 rw migration quick
 092 rw quick
 093 throttle
 094 rw quick
@@ -186,7 +186,7 @@
 162 quick
 163 rw
 165 rw quick
-169 rw quick migration
+169 rw migration
 170 rw auto quick
 171 rw quick
 172 auto
@@ -197,9 +197,9 @@
 177 rw auto quick
 178 img
 179 rw auto quick
-181 rw auto migration
+181 rw auto migration quick
 182 rw quick
-183 rw migration
+183 rw migration quick
 184 rw auto quick
 185 rw
 186 rw auto
@@ -216,9 +216,9 @@
 198 rw
 199 rw migration
 200 rw
-201 rw migration
+201 rw migration quick
 202 rw quick
-203 rw auto migration
+203 rw auto migration quick
 204 rw quick
 205 rw quick
 206 rw
-- 
2.27.0




[PULL 17/24] migration/block-dirty-bitmap: cancel migration on shutdown

2020-07-27 Thread Eric Blake
From: Vladimir Sementsov-Ogievskiy 

If the target is turned off before postcopy has finished, the target
crashes because busy bitmaps are found at shutdown.
Canceling the incoming migration helps, as it removes all unfinished
(and therefore busy) bitmaps.

Similarly, on the source we crash in bdrv_close_all(), which asserts
that all bdrv states are removed, because the bdrv states involved in
dirty bitmap migration are still referenced by it. So, we need to
cancel the outgoing migration as well.
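
To illustrate the source-side failure mode, here is a standalone sketch
(not QEMU code; the names and glib usage are invented): shutdown asserts
that no references to block nodes remain, so the cancel path has to drop
the migration's references first.

    /* build (assumed): gcc sketch.c $(pkg-config --cflags --libs glib-2.0) */
    #include <assert.h>
    #include <glib.h>

    static GSList *referenced_nodes;        /* stand-in for bdrv states */

    static void cancel_outgoing(void)
    {
        g_slist_free(referenced_nodes);     /* drop the migration's refs */
        referenced_nodes = NULL;
    }

    static void close_all(void)
    {
        /* the assertion that crashes on the source without the fix */
        assert(referenced_nodes == NULL);
    }

    int main(void)
    {
        referenced_nodes = g_slist_prepend(referenced_nodes,
                                           (gpointer)"drive0");
        cancel_outgoing();                  /* must run on shutdown ... */
        close_all();                        /* ... before close-all */
        return 0;
    }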

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
Message-Id: <20200727194236.19551-17-vsement...@virtuozzo.com>
Signed-off-by: Eric Blake 
---
 migration/migration.h  |  2 ++
 migration/block-dirty-bitmap.c | 16 
 migration/migration.c  | 13 +
 3 files changed, 31 insertions(+)

diff --git a/migration/migration.h b/migration/migration.h
index ab20c756f549..6c6a931d0dc2 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -335,6 +335,8 @@ void migrate_send_rp_recv_bitmap(MigrationIncomingState 
*mis,
 void migrate_send_rp_resume_ack(MigrationIncomingState *mis, uint32_t value);

 void dirty_bitmap_mig_before_vm_start(void);
+void dirty_bitmap_mig_cancel_outgoing(void);
+void dirty_bitmap_mig_cancel_incoming(void);
 void migrate_add_address(SocketAddress *address);

 int foreach_not_ignored_block(RAMBlockIterFunc func, void *opaque);
diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index f91015a4f88f..1f675b792fc9 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -657,6 +657,22 @@ static void cancel_incoming_locked(DBMLoadState *s)
 s->bitmaps = NULL;
 }

+void dirty_bitmap_mig_cancel_outgoing(void)
+{
+dirty_bitmap_do_save_cleanup(&dbm_state.save);
+}
+
+void dirty_bitmap_mig_cancel_incoming(void)
+{
+DBMLoadState *s = &dbm_state.load;
+
+qemu_mutex_lock(&s->lock);
+
+cancel_incoming_locked(s);
+
+qemu_mutex_unlock(&s->lock);
+}
+
 static void dirty_bitmap_load_complete(QEMUFile *f, DBMLoadState *s)
 {
 GSList *item;
diff --git a/migration/migration.c b/migration/migration.c
index 1c61428988e9..8fe36339dbe8 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -188,6 +188,19 @@ void migration_shutdown(void)
  */
 migrate_fd_cancel(current_migration);
 object_unref(OBJECT(current_migration));
+
+/*
+ * Cancel outgoing migration of dirty bitmaps. It should
+ * at least unref used block nodes.
+ */
+dirty_bitmap_mig_cancel_outgoing();
+
+/*
+ * Cancel incoming migration of dirty bitmaps. Dirty bitmaps
+ * are non-critical data, and their loss never considered as
+ * something serious.
+ */
+dirty_bitmap_mig_cancel_incoming();
 }

 /* For outgoing */
-- 
2.27.0




[PULL 11/24] migration/block-dirty-bitmap: move mutex init to dirty_bitmap_mig_init

2020-07-27 Thread Eric Blake
From: Vladimir Sementsov-Ogievskiy 

There is no reason to keep two public init functions.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
Reviewed-by: Dr. David Alan Gilbert 
Message-Id: <20200727194236.19551-11-vsement...@virtuozzo.com>
Signed-off-by: Eric Blake 
---
 migration/migration.h  | 1 -
 migration/block-dirty-bitmap.c | 6 +-
 migration/migration.c  | 2 --
 3 files changed, 1 insertion(+), 8 deletions(-)

diff --git a/migration/migration.h b/migration/migration.h
index f617960522aa..ab20c756f549 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -335,7 +335,6 @@ void migrate_send_rp_recv_bitmap(MigrationIncomingState 
*mis,
 void migrate_send_rp_resume_ack(MigrationIncomingState *mis, uint32_t value);

 void dirty_bitmap_mig_before_vm_start(void);
-void init_dirty_bitmap_incoming_migration(void);
 void migrate_add_address(SocketAddress *address);

 int foreach_not_ignored_block(RAMBlockIterFunc func, void *opaque);
diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 01a536d7d3d3..4b67e4f4fbcd 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -148,11 +148,6 @@ typedef struct LoadBitmapState {
 static GSList *enabled_bitmaps;
 QemuMutex finish_lock;

-void init_dirty_bitmap_incoming_migration(void)
-{
-qemu_mutex_init(&finish_lock);
-}
-
 static uint32_t qemu_get_bitmap_flags(QEMUFile *f)
 {
 uint8_t flags = qemu_get_byte(f);
@@ -801,6 +796,7 @@ static SaveVMHandlers savevm_dirty_bitmap_handlers = {
 void dirty_bitmap_mig_init(void)
 {
 QSIMPLEQ_INIT(&dirty_bitmap_mig_state.dbms_list);
+qemu_mutex_init(&finish_lock);
 
 register_savevm_live("dirty-bitmap", 0, 1,
  &savevm_dirty_bitmap_handlers,
diff --git a/migration/migration.c b/migration/migration.c
index 2ed99232272e..1c61428988e9 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -165,8 +165,6 @@ void migration_object_init(void)
 qemu_sem_init(&current_incoming->postcopy_pause_sem_dst, 0);
 qemu_sem_init(&current_incoming->postcopy_pause_sem_fault, 0);
 
-init_dirty_bitmap_incoming_migration();
-
 if (!migration_object_check(current_migration, &err)) {
 error_report_err(err);
 exit(1);
-- 
2.27.0




[PULL 14/24] migration/block-dirty-bitmap: simplify dirty_bitmap_load_complete

2020-07-27 Thread Eric Blake
From: Vladimir Sementsov-Ogievskiy 

The bdrv_enable_dirty_bitmap_locked() call does nothing: if we are in
postcopy, the bitmap successor must be enabled, and the reclaim
operation will enable the bitmap.

So we actually just need to call _reclaim_ in both if branches, and
differentiating them only to add an assertion does not seem worthwhile.
The logic becomes simple: on load completion we reclaim, and that's all.
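
As a toy model of why the enable call is redundant (a standalone
sketch, not QEMU code; the two-field Bitmap struct is invented):
reclaim merges the successor back, and the bitmap adopts the
successor's enabled state.

    #include <assert.h>
    #include <stdbool.h>
    #include <stdlib.h>

    typedef struct Bitmap {
        bool enabled;
        struct Bitmap *successor;     /* collects writes while frozen */
    } Bitmap;

    static void reclaim(Bitmap *b)
    {
        /* merging back also adopts the successor's enabled state */
        b->enabled = b->successor->enabled;
        free(b->successor);
        b->successor = NULL;
    }

    int main(void)
    {
        Bitmap b = { false, calloc(1, sizeof(Bitmap)) };
        b.successor->enabled = true;  /* in postcopy it must be enabled */
        reclaim(&b);                  /* no separate enable step needed */
        assert(b.enabled && !b.successor);
        return 0;
    }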

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
Message-Id: <20200727194236.19551-14-vsement...@virtuozzo.com>
Signed-off-by: Eric Blake 
---
 migration/block-dirty-bitmap.c | 25 -
 1 file changed, 4 insertions(+), 21 deletions(-)

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 9194807b54f1..405a259296d9 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -603,6 +603,10 @@ static void dirty_bitmap_load_complete(QEMUFile *f, 
DBMLoadState *s)

 qemu_mutex_lock(&s->lock);
 
+if (bdrv_dirty_bitmap_has_successor(s->bitmap)) {
+bdrv_reclaim_dirty_bitmap(s->bitmap, &error_abort);
+}
+
 for (item = s->enabled_bitmaps; item; item = g_slist_next(item)) {
 LoadBitmapState *b = item->data;

@@ -612,27 +616,6 @@ static void dirty_bitmap_load_complete(QEMUFile *f, 
DBMLoadState *s)
 }
 }

-if (bdrv_dirty_bitmap_has_successor(s->bitmap)) {
-bdrv_dirty_bitmap_lock(s->bitmap);
-if (s->enabled_bitmaps == NULL) {
-/* in postcopy */
-bdrv_reclaim_dirty_bitmap_locked(s->bitmap, &error_abort);
-bdrv_enable_dirty_bitmap_locked(s->bitmap);
-} else {
-/* target not started, successor must be empty */
-int64_t count = bdrv_get_dirty_count(s->bitmap);
-BdrvDirtyBitmap *ret = bdrv_reclaim_dirty_bitmap_locked(s->bitmap,
-NULL);
-/* bdrv_reclaim_dirty_bitmap can fail only on no successor (it
- * must be) or on merge fail, but merge can't fail when second
- * bitmap is empty
- */
-assert(ret == s->bitmap &&
-   count == bdrv_get_dirty_count(s->bitmap));
-}
-bdrv_dirty_bitmap_unlock(s->bitmap);
-}
-
 qemu_mutex_unlock(&s->lock);
 }

-- 
2.27.0




[PULL 20/24] qemu-iotests/199: check persistent bitmaps

2020-07-27 Thread Eric Blake
From: Vladimir Sementsov-Ogievskiy 

Check that persistent bitmaps are not stored on the source, and that
the bitmaps are persistent on the destination.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
Message-Id: <20200727194236.19551-20-vsement...@virtuozzo.com>
Signed-off-by: Eric Blake 
---
 tests/qemu-iotests/199 | 16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index 355c0b288592..5fd34f0fcdfa 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -117,7 +117,8 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 for i in range(nb_bitmaps):
 result = self.vm_a.qmp('block-dirty-bitmap-add', node='drive0',
name='bitmap{}'.format(i),
-   granularity=granularity)
+   granularity=granularity,
+   persistent=True)
 self.assert_qmp(result, 'return', {})

 result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',
@@ -193,6 +194,19 @@ class 
TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 print('downtime:', downtime)
 print('postcopy_time:', postcopy_time)

+# check that there are no bitmaps stored on source
+self.vm_a_events += self.vm_a.get_qmp_events()
+self.vm_a.shutdown()
+self.vm_a.launch()
+check_bitmaps(self.vm_a, 0)
+
+# check that bitmaps are migrated and persistence works
+check_bitmaps(self.vm_b, nb_bitmaps)
+self.vm_b.shutdown()
+# recreate vm_b, so there is no incoming option, which prevents
+# loading bitmaps from disk
+self.vm_b = iotests.VM(path_suffix='b').add_drive(disk_b)
+self.vm_b.launch()
 check_bitmaps(self.vm_b, nb_bitmaps)

 # Check content of migrated bitmaps. Still, don't waste time checking
-- 
2.27.0




[PULL 13/24] migration/block-dirty-bitmap: rename finish_lock to just lock

2020-07-27 Thread Eric Blake
From: Vladimir Sementsov-Ogievskiy 

finish_lock is a bad name, as the lock is used not only at the end of
the process.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
Message-Id: <20200727194236.19551-13-vsement...@virtuozzo.com>
Signed-off-by: Eric Blake 
---
 migration/block-dirty-bitmap.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 9b39e7aa2b4f..9194807b54f1 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -143,7 +143,7 @@ typedef struct DBMLoadState {
 BdrvDirtyBitmap *bitmap;

 GSList *enabled_bitmaps;
-QemuMutex finish_lock;
+QemuMutex lock; /* protect enabled_bitmaps */
 } DBMLoadState;

 typedef struct DBMState {
@@ -575,7 +575,7 @@ void dirty_bitmap_mig_before_vm_start(void)
 DBMLoadState *s = &dbm_state.load;
 GSList *item;

-qemu_mutex_lock(&s->finish_lock);
+qemu_mutex_lock(&s->lock);

 for (item = s->enabled_bitmaps; item; item = g_slist_next(item)) {
 LoadBitmapState *b = item->data;
@@ -592,7 +592,7 @@ void dirty_bitmap_mig_before_vm_start(void)
 g_slist_free(s->enabled_bitmaps);
 s->enabled_bitmaps = NULL;

-qemu_mutex_unlock(&s->finish_lock);
+qemu_mutex_unlock(&s->lock);
 }

 static void dirty_bitmap_load_complete(QEMUFile *f, DBMLoadState *s)
@@ -601,7 +601,7 @@ static void dirty_bitmap_load_complete(QEMUFile *f, 
DBMLoadState *s)
 trace_dirty_bitmap_load_complete();
 bdrv_dirty_bitmap_deserialize_finish(s->bitmap);

-qemu_mutex_lock(&s->finish_lock);
+qemu_mutex_lock(&s->lock);

 for (item = s->enabled_bitmaps; item; item = g_slist_next(item)) {
 LoadBitmapState *b = item->data;
@@ -633,7 +633,7 @@ static void dirty_bitmap_load_complete(QEMUFile *f, 
DBMLoadState *s)
 bdrv_dirty_bitmap_unlock(s->bitmap);
 }

-qemu_mutex_unlock(&s->finish_lock);
+qemu_mutex_unlock(&s->lock);
 }

 static int dirty_bitmap_load_bits(QEMUFile *f, DBMLoadState *s)
@@ -815,7 +815,7 @@ static SaveVMHandlers savevm_dirty_bitmap_handlers = {
 void dirty_bitmap_mig_init(void)
 {
 QSIMPLEQ_INIT(&dbm_state.save.dbms_list);
-qemu_mutex_init(&dbm_state.load.finish_lock);
+qemu_mutex_init(&dbm_state.load.lock);
 
 register_savevm_live("dirty-bitmap", 0, 1,
  &savevm_dirty_bitmap_handlers,
-- 
2.27.0




[PULL 18/24] migration/savevm: don't worry if bitmap migration postcopy failed

2020-07-27 Thread Eric Blake
From: Vladimir Sementsov-Ogievskiy 

First, if only bitmap postcopy is enabled (and not RAM postcopy),
postcopy_pause_incoming() crashes on the assertion
assert(mis->to_src_file).

In any case, bitmap postcopy is not prepared to be recovered. The
original idea is instead that if bitmap postcopy fails, we just lose
some bitmaps, which is not critical. So, on failure we just need to
remove the unfinished bitmaps, and the guest should continue execution
on the destination.
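
Condensed into a standalone sketch (the predicate names are
placeholders, not the real QEMU symbols), the new policy is:

    #include <stdbool.h>
    #include <stdio.h>

    /* placeholders for the real capability/state queries */
    static bool postcopy_running = true;
    static bool postcopy_ram;                  /* only bitmaps postcopy */
    static bool postcopy_dirty_bitmaps = true;

    static int handle_load_result(int load_res)
    {
        if (load_res >= 0) {
            return load_res;
        }
        /* unfinished bitmaps are dropped first in either case */
        if (postcopy_running && !postcopy_ram && postcopy_dirty_bitmaps) {
            fprintf(stderr, "bitmap postcopy failed; continuing without "
                            "the unfinished bitmaps\n");
            return 0;                          /* non-fatal for the guest */
        }
        return load_res;                       /* fatal: migration FAILED */
    }

    int main(void)
    {
        return handle_load_result(-1) ? 1 : 0; /* 0: guest keeps running */
    }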

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Dr. David Alan Gilbert 
Reviewed-by: Andrey Shinkevich 
Reviewed-by: Eric Blake 
Message-Id: <20200727194236.19551-18-vsement...@virtuozzo.com>
Signed-off-by: Eric Blake 
---
 migration/savevm.c | 37 -
 1 file changed, 32 insertions(+), 5 deletions(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index 45c9dd9d8a6d..a843d202b5b4 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1813,6 +1813,9 @@ static void *postcopy_ram_listen_thread(void *opaque)
 MigrationIncomingState *mis = migration_incoming_get_current();
 QEMUFile *f = mis->from_src_file;
 int load_res;
+MigrationState *migr = migrate_get_current();
+
+object_ref(OBJECT(migr));

 migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
MIGRATION_STATUS_POSTCOPY_ACTIVE);
@@ -1839,11 +1842,24 @@ static void *postcopy_ram_listen_thread(void *opaque)

 trace_postcopy_ram_listen_thread_exit();
 if (load_res < 0) {
-error_report("%s: loadvm failed: %d", __func__, load_res);
 qemu_file_set_error(f, load_res);
-migrate_set_state(&mis->state, MIGRATION_STATUS_POSTCOPY_ACTIVE,
-   MIGRATION_STATUS_FAILED);
-} else {
+dirty_bitmap_mig_cancel_incoming();
+if (postcopy_state_get() == POSTCOPY_INCOMING_RUNNING &&
+!migrate_postcopy_ram() && migrate_dirty_bitmaps())
+{
+error_report("%s: loadvm failed during postcopy: %d. All states "
+ "are migrated except dirty bitmaps. Some dirty "
+ "bitmaps may be lost, and present migrated dirty "
+ "bitmaps are correctly migrated and valid.",
+ __func__, load_res);
+load_res = 0; /* prevent further exit() */
+} else {
+error_report("%s: loadvm failed: %d", __func__, load_res);
+migrate_set_state(&mis->state, MIGRATION_STATUS_POSTCOPY_ACTIVE,
+   MIGRATION_STATUS_FAILED);
+}
+}
+if (load_res >= 0) {
 /*
  * This looks good, but it's possible that the device loading in the
  * main thread hasn't finished yet, and so we might not be in 'RUN'
@@ -1879,6 +1895,8 @@ static void *postcopy_ram_listen_thread(void *opaque)
 mis->have_listen_thread = false;
 postcopy_state_set(POSTCOPY_INCOMING_END);

+object_unref(OBJECT(migr));
+
 return NULL;
 }

@@ -2437,6 +2455,8 @@ static bool 
postcopy_pause_incoming(MigrationIncomingState *mis)
 {
 trace_postcopy_pause_incoming();

+assert(migrate_postcopy_ram());
+
 /* Clear the triggered bit to allow one recovery */
 mis->postcopy_recover_triggered = false;

@@ -2521,15 +2541,22 @@ out:
 if (ret < 0) {
 qemu_file_set_error(f, ret);

+/* Cancel bitmaps incoming regardless of recovery */
+dirty_bitmap_mig_cancel_incoming();
+
 /*
  * If we are during an active postcopy, then we pause instead
  * of bail out to at least keep the VM's dirty data.  Note
  * that POSTCOPY_INCOMING_LISTENING stage is still not enough,
  * during which we're still receiving device states and we
  * still haven't yet started the VM on destination.
+ *
+ * Only RAM postcopy supports recovery. Still, if RAM postcopy is
+ * enabled, canceled bitmaps postcopy will not affect RAM postcopy
+ * recovering.
  */
 if (postcopy_state_get() == POSTCOPY_INCOMING_RUNNING &&
-postcopy_pause_incoming(mis)) {
+migrate_postcopy_ram() && postcopy_pause_incoming(mis)) {
 /* Reset f to point to the newly created channel */
 f = mis->from_src_file;
 goto retry;
-- 
2.27.0




[PULL 16/24] migration/block-dirty-bitmap: relax error handling in incoming part

2020-07-27 Thread Eric Blake
From: Vladimir Sementsov-Ogievskiy 

Bitmap data is not critical, and we should not fail the migration (or
use postcopy recovery) because of a dirty-bitmap migration failure.
Instead we should just lose the unfinished bitmaps.

Still, we have to report I/O stream violation errors, as they affect
the whole migration stream.

While touching this, tighten code that was previously blindly calling
malloc on a size read from the migration stream, as a corrupted stream
(perhaps from a malicious user) should not be able to convince us to
allocate an inordinate amount of memory.
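
(The hunk containing the size check is cut off in this archive; the
idea, as a standalone sketch with placeholder constants and error text,
is to cap the allocation against the largest chunk a well-formed sender
can produce:)

    #include <glib.h>
    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    #define CHUNK_SIZE (1 << 10)             /* placeholder value */

    /* read one chunk whose length comes from an untrusted stream */
    static uint8_t *read_chunk(FILE *f, uint64_t buf_size)
    {
        if (buf_size > 10 * CHUNK_SIZE) {    /* cap before allocating */
            fprintf(stderr, "corrupted stream: %" PRIu64 "-byte chunk\n",
                    buf_size);
            return NULL;                     /* cancel bitmap migration */
        }
        uint8_t *buf = g_malloc(buf_size);
        if (fread(buf, 1, buf_size, f) != buf_size) {
            g_free(buf);
            return NULL;
        }
        return buf;
    }

    int main(void)
    {
        /* an absurd size is rejected before any allocation happens */
        return read_chunk(stdin, UINT64_MAX) ? 1 : 0;
    }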

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Message-Id: <20200727194236.19551-16-vsement...@virtuozzo.com>
Reviewed-by: Eric Blake 
[eblake: typo fixes, enhance commit message]
Signed-off-by: Eric Blake 
---
 migration/block-dirty-bitmap.c | 162 +
 1 file changed, 126 insertions(+), 36 deletions(-)

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index eb4ffeac4d1b..f91015a4f88f 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -145,6 +145,15 @@ typedef struct DBMLoadState {

 bool before_vm_start_handled; /* set in dirty_bitmap_mig_before_vm_start */

+/*
+ * cancelled
+ * Incoming migration is cancelled for some reason. That means that we
+ * still should read our chunks from migration stream, to not affect other
+ * migration objects (like RAM), but just ignore them and do not touch any
+ * bitmaps or nodes.
+ */
+bool cancelled;
+
 GSList *bitmaps;
 QemuMutex lock; /* protect bitmaps */
 } DBMLoadState;
@@ -531,6 +540,10 @@ static int dirty_bitmap_load_start(QEMUFile *f, 
DBMLoadState *s)
 uint8_t flags = qemu_get_byte(f);
 LoadBitmapState *b;

+if (s->cancelled) {
+return 0;
+}
+
 if (s->bitmap) {
 error_report("Bitmap with the same name ('%s') already exists on "
  "destination", bdrv_dirty_bitmap_name(s->bitmap));
@@ -613,14 +626,48 @@ void dirty_bitmap_mig_before_vm_start(void)
 qemu_mutex_unlock(>lock);
 }

+static void cancel_incoming_locked(DBMLoadState *s)
+{
+GSList *item;
+
+if (s->cancelled) {
+return;
+}
+
+s->cancelled = true;
+s->bs = NULL;
+s->bitmap = NULL;
+
+/* Drop all unfinished bitmaps */
+for (item = s->bitmaps; item; item = g_slist_next(item)) {
+LoadBitmapState *b = item->data;
+
+/*
+ * Bitmap must be unfinished, as finished bitmaps should already be
+ * removed from the list.
+ */
+assert(!s->before_vm_start_handled || !b->migrated);
+if (bdrv_dirty_bitmap_has_successor(b->bitmap)) {
+bdrv_reclaim_dirty_bitmap(b->bitmap, &error_abort);
+}
+bdrv_release_dirty_bitmap(b->bitmap);
+}
+
+g_slist_free_full(s->bitmaps, g_free);
+s->bitmaps = NULL;
+}
+
 static void dirty_bitmap_load_complete(QEMUFile *f, DBMLoadState *s)
 {
 GSList *item;
 trace_dirty_bitmap_load_complete();
+
+if (s->cancelled) {
+return;
+}
+
 bdrv_dirty_bitmap_deserialize_finish(s->bitmap);

-qemu_mutex_lock(&s->lock);
-
 if (bdrv_dirty_bitmap_has_successor(s->bitmap)) {
 bdrv_reclaim_dirty_bitmap(s->bitmap, &error_abort);
 }
@@ -637,8 +684,6 @@ static void dirty_bitmap_load_complete(QEMUFile *f, 
DBMLoadState *s)
 break;
 }
 }
-
-qemu_mutex_unlock(&s->lock);
 }

 static int dirty_bitmap_load_bits(QEMUFile *f, DBMLoadState *s)
@@ -650,15 +695,46 @@ static int dirty_bitmap_load_bits(QEMUFile *f, 
DBMLoadState *s)

 if (s->flags & DIRTY_BITMAP_MIG_FLAG_ZEROES) {
 trace_dirty_bitmap_load_bits_zeroes();
-bdrv_dirty_bitmap_deserialize_zeroes(s->bitmap, first_byte, nr_bytes,
- false);
+if (!s->cancelled) {
+bdrv_dirty_bitmap_deserialize_zeroes(s->bitmap, first_byte,
+ nr_bytes, false);
+}
 } else {
 size_t ret;
-uint8_t *buf;
+g_autofree uint8_t *buf = NULL;
 uint64_t buf_size = qemu_get_be64(f);
-uint64_t needed_size =
-bdrv_dirty_bitmap_serialization_size(s->bitmap,
- first_byte, nr_bytes);
+uint64_t needed_size;
+
+/*
+ * The actual check for buf_size is done a bit later. We can't do it in
+ * cancelled mode as we don't have the bitmap to check the constraints
+ * (so, we allocate a buffer and read prior to the check). On the other
+ * hand, we shouldn't blindly g_malloc the number from the stream.
+ * Actually one chunk should not be larger than CHUNK_SIZE. Let's allow
+ * a bit larger (which means that bitmap migration will fail anyway and
+ * the whole migration will most probably fail soon due to broken
+ * stream).
+ */
+if 

[PULL 08/24] migration/block-dirty-bitmap: fix dirty_bitmap_mig_before_vm_start

2020-07-27 Thread Eric Blake
From: Vladimir Sementsov-Ogievskiy 

Using the _locked version of bdrv_enable_dirty_bitmap to bypass locking
is wrong as we do not already own the mutex.  Moreover, the adjacent
call to bdrv_dirty_bitmap_enable_successor grabs the mutex.

Fixes: 58f72b965e9e1
Cc: qemu-sta...@nongnu.org # v3.0
Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
Reviewed-by: Eric Blake 
Message-Id: <20200727194236.19551-8-vsement...@virtuozzo.com>
Signed-off-by: Eric Blake 
---
 migration/block-dirty-bitmap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index b0dbf9eeed43..0739f1259e05 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -566,7 +566,7 @@ void dirty_bitmap_mig_before_vm_start(void)
 DirtyBitmapLoadBitmapState *b = item->data;

 if (b->migrated) {
-bdrv_enable_dirty_bitmap_locked(b->bitmap);
+bdrv_enable_dirty_bitmap(b->bitmap);
 } else {
 bdrv_dirty_bitmap_enable_successor(b->bitmap);
 }
-- 
2.27.0




[PULL 15/24] migration/block-dirty-bitmap: keep bitmap state for all bitmaps

2020-07-27 Thread Eric Blake
From: Vladimir Sementsov-Ogievskiy 

Keep bitmap state for disabled bitmaps too, and keep the state until
the end of the process. It's needed by the following commit to
implement bitmap postcopy canceling.

To clean up the new list, the following logic is used (see the sketch
below):
We need two events to consider a bitmap's migration finished:
1. a chunk with the DIRTY_BITMAP_MIG_FLAG_COMPLETE flag is received
2. dirty_bitmap_mig_before_vm_start is called
These two events may come in either order, so we track which one comes
last, and on the last of them we remove the bitmap migration state from
the list.
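
That ownership rule, as a standalone sketch (not the patch itself; the
patch below wires it through LoadBitmapState and the bitmaps list):

    #include <glib.h>

    typedef struct { gboolean migrated; } BmState;

    static GSList *bitmaps;
    static gboolean vm_start_handled;

    /* event 1: the COMPLETE chunk for this bitmap was received */
    static void on_complete(BmState *b)
    {
        b->migrated = TRUE;
        if (vm_start_handled) {          /* we are the later event */
            bitmaps = g_slist_remove(bitmaps, b);
            g_free(b);
        }
    }

    /* event 2: the before-vm-start hook ran */
    static void on_vm_start(void)
    {
        GSList *item = bitmaps, *next;
        for (; item; item = next) {
            next = item->next;
            BmState *b = item->data;
            if (b->migrated) {           /* we are the later event */
                bitmaps = g_slist_remove(bitmaps, b);
                g_free(b);
            }
        }
        vm_start_handled = TRUE;
    }

    int main(void)
    {
        BmState *b = g_new0(BmState, 1);
        bitmaps = g_slist_prepend(bitmaps, b);
        on_complete(b);                  /* either order works: */
        on_vm_start();                   /* the later event frees b */
        return 0;
    }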

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
Message-Id: <20200727194236.19551-15-vsement...@virtuozzo.com>
Signed-off-by: Eric Blake 
---
 migration/block-dirty-bitmap.c | 64 +++---
 1 file changed, 43 insertions(+), 21 deletions(-)

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 405a259296d9..eb4ffeac4d1b 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -132,6 +132,7 @@ typedef struct LoadBitmapState {
 BlockDriverState *bs;
 BdrvDirtyBitmap *bitmap;
 bool migrated;
+bool enabled;
 } LoadBitmapState;

 /* State of the dirty bitmap migration (DBM) during load process */
@@ -142,8 +143,10 @@ typedef struct DBMLoadState {
 BlockDriverState *bs;
 BdrvDirtyBitmap *bitmap;

-GSList *enabled_bitmaps;
-QemuMutex lock; /* protect enabled_bitmaps */
+bool before_vm_start_handled; /* set in dirty_bitmap_mig_before_vm_start */
+
+GSList *bitmaps;
+QemuMutex lock; /* protect bitmaps */
 } DBMLoadState;

 typedef struct DBMState {
@@ -526,6 +529,7 @@ static int dirty_bitmap_load_start(QEMUFile *f, 
DBMLoadState *s)
 Error *local_err = NULL;
 uint32_t granularity = qemu_get_be32(f);
 uint8_t flags = qemu_get_byte(f);
+LoadBitmapState *b;

 if (s->bitmap) {
 error_report("Bitmap with the same name ('%s') already exists on "
@@ -552,45 +556,59 @@ static int dirty_bitmap_load_start(QEMUFile *f, 
DBMLoadState *s)

 bdrv_disable_dirty_bitmap(s->bitmap);
 if (flags & DIRTY_BITMAP_MIG_START_FLAG_ENABLED) {
-LoadBitmapState *b;
-
 bdrv_dirty_bitmap_create_successor(s->bitmap, &local_err);
 if (local_err) {
 error_report_err(local_err);
 return -EINVAL;
 }
-
-b = g_new(LoadBitmapState, 1);
-b->bs = s->bs;
-b->bitmap = s->bitmap;
-b->migrated = false;
-s->enabled_bitmaps = g_slist_prepend(s->enabled_bitmaps, b);
 }

+b = g_new(LoadBitmapState, 1);
+b->bs = s->bs;
+b->bitmap = s->bitmap;
+b->migrated = false;
+b->enabled = flags & DIRTY_BITMAP_MIG_START_FLAG_ENABLED;
+
+s->bitmaps = g_slist_prepend(s->bitmaps, b);
+
 return 0;
 }

-void dirty_bitmap_mig_before_vm_start(void)
+/*
+ * before_vm_start_handle_item
+ *
+ * g_slist_foreach helper
+ *
+ * item is LoadBitmapState*
+ * opaque is DBMLoadState*
+ */
+static void before_vm_start_handle_item(void *item, void *opaque)
 {
-DBMLoadState *s = &dbm_state.load;
-GSList *item;
-
-qemu_mutex_lock(&s->lock);
-
-for (item = s->enabled_bitmaps; item; item = g_slist_next(item)) {
-LoadBitmapState *b = item->data;
+DBMLoadState *s = opaque;
+LoadBitmapState *b = item;

+if (b->enabled) {
 if (b->migrated) {
 bdrv_enable_dirty_bitmap(b->bitmap);
 } else {
 bdrv_dirty_bitmap_enable_successor(b->bitmap);
 }
+}

+if (b->migrated) {
+s->bitmaps = g_slist_remove(s->bitmaps, b);
 g_free(b);
 }
+}

-g_slist_free(s->enabled_bitmaps);
-s->enabled_bitmaps = NULL;
+void dirty_bitmap_mig_before_vm_start(void)
+{
+DBMLoadState *s = &dbm_state.load;
+qemu_mutex_lock(&s->lock);
+
+assert(!s->before_vm_start_handled);
+g_slist_foreach(s->bitmaps, before_vm_start_handle_item, s);
+s->before_vm_start_handled = true;

 qemu_mutex_unlock(&s->lock);
 }
@@ -607,11 +625,15 @@ static void dirty_bitmap_load_complete(QEMUFile *f, 
DBMLoadState *s)
 bdrv_reclaim_dirty_bitmap(s->bitmap, &error_abort);
 }

-for (item = s->enabled_bitmaps; item; item = g_slist_next(item)) {
+for (item = s->bitmaps; item; item = g_slist_next(item)) {
 LoadBitmapState *b = item->data;

 if (b->bitmap == s->bitmap) {
 b->migrated = true;
+if (s->before_vm_start_handled) {
+s->bitmaps = g_slist_remove(s->bitmaps, b);
+g_free(b);
+}
 break;
 }
 }
-- 
2.27.0




[PULL 12/24] migration/block-dirty-bitmap: refactor state global variables

2020-07-27 Thread Eric Blake
From: Vladimir Sementsov-Ogievskiy 

Move all state variables into one global struct. Reduce global
variable usage by passing the state through the opaque pointer where
possible.
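
The shape of the refactor, as a generic standalone sketch (simplified;
QEMU's registration call and handler signatures differ):

    #include <stdio.h>

    typedef struct SaveState { int bulk_completed; } SaveState;
    typedef struct LoadState { unsigned flags; } LoadState;

    typedef struct State {
        SaveState save;
        LoadState load;
    } State;

    static State state;                 /* the one remaining global */

    /* handlers get the state via opaque instead of naming the global */
    static int save_iterate(void *opaque)
    {
        SaveState *s = &((State *)opaque)->save;
        return s->bulk_completed;
    }

    int main(void)
    {
        /* registration hands out &state exactly once */
        printf("%d\n", save_iterate(&state));
        return 0;
    }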

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
Message-Id: <20200727194236.19551-12-vsement...@virtuozzo.com>
Signed-off-by: Eric Blake 
---
 migration/block-dirty-bitmap.c | 179 ++---
 1 file changed, 99 insertions(+), 80 deletions(-)

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 4b67e4f4fbcd..9b39e7aa2b4f 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -128,6 +128,12 @@ typedef struct DBMSaveState {
 BdrvDirtyBitmap *prev_bitmap;
 } DBMSaveState;

+typedef struct LoadBitmapState {
+BlockDriverState *bs;
+BdrvDirtyBitmap *bitmap;
+bool migrated;
+} LoadBitmapState;
+
 /* State of the dirty bitmap migration (DBM) during load process */
 typedef struct DBMLoadState {
 uint32_t flags;
@@ -135,18 +141,17 @@ typedef struct DBMLoadState {
 char bitmap_name[256];
 BlockDriverState *bs;
 BdrvDirtyBitmap *bitmap;
+
+GSList *enabled_bitmaps;
+QemuMutex finish_lock;
 } DBMLoadState;

-static DBMSaveState dirty_bitmap_mig_state;
+typedef struct DBMState {
+DBMSaveState save;
+DBMLoadState load;
+} DBMState;

-/* State of one bitmap during load process */
-typedef struct LoadBitmapState {
-BlockDriverState *bs;
-BdrvDirtyBitmap *bitmap;
-bool migrated;
-} LoadBitmapState;
-static GSList *enabled_bitmaps;
-QemuMutex finish_lock;
+static DBMState dbm_state;

 static uint32_t qemu_get_bitmap_flags(QEMUFile *f)
 {
@@ -169,21 +174,21 @@ static void qemu_put_bitmap_flags(QEMUFile *f, uint32_t 
flags)
 qemu_put_byte(f, flags);
 }

-static void send_bitmap_header(QEMUFile *f, SaveBitmapState *dbms,
-   uint32_t additional_flags)
+static void send_bitmap_header(QEMUFile *f, DBMSaveState *s,
+   SaveBitmapState *dbms, uint32_t 
additional_flags)
 {
 BlockDriverState *bs = dbms->bs;
 BdrvDirtyBitmap *bitmap = dbms->bitmap;
 uint32_t flags = additional_flags;
 trace_send_bitmap_header_enter();

-if (bs != dirty_bitmap_mig_state.prev_bs) {
-dirty_bitmap_mig_state.prev_bs = bs;
+if (bs != s->prev_bs) {
+s->prev_bs = bs;
 flags |= DIRTY_BITMAP_MIG_FLAG_DEVICE_NAME;
 }

-if (bitmap != dirty_bitmap_mig_state.prev_bitmap) {
-dirty_bitmap_mig_state.prev_bitmap = bitmap;
+if (bitmap != s->prev_bitmap) {
+s->prev_bitmap = bitmap;
 flags |= DIRTY_BITMAP_MIG_FLAG_BITMAP_NAME;
 }

@@ -198,19 +203,22 @@ static void send_bitmap_header(QEMUFile *f, 
SaveBitmapState *dbms,
 }
 }

-static void send_bitmap_start(QEMUFile *f, SaveBitmapState *dbms)
+static void send_bitmap_start(QEMUFile *f, DBMSaveState *s,
+  SaveBitmapState *dbms)
 {
-send_bitmap_header(f, dbms, DIRTY_BITMAP_MIG_FLAG_START);
+send_bitmap_header(f, s, dbms, DIRTY_BITMAP_MIG_FLAG_START);
 qemu_put_be32(f, bdrv_dirty_bitmap_granularity(dbms->bitmap));
 qemu_put_byte(f, dbms->flags);
 }

-static void send_bitmap_complete(QEMUFile *f, SaveBitmapState *dbms)
+static void send_bitmap_complete(QEMUFile *f, DBMSaveState *s,
+ SaveBitmapState *dbms)
 {
-send_bitmap_header(f, dbms, DIRTY_BITMAP_MIG_FLAG_COMPLETE);
+send_bitmap_header(f, s, dbms, DIRTY_BITMAP_MIG_FLAG_COMPLETE);
 }

-static void send_bitmap_bits(QEMUFile *f, SaveBitmapState *dbms,
+static void send_bitmap_bits(QEMUFile *f, DBMSaveState *s,
+ SaveBitmapState *dbms,
  uint64_t start_sector, uint32_t nr_sectors)
 {
 /* align for buffer_is_zero() */
@@ -235,7 +243,7 @@ static void send_bitmap_bits(QEMUFile *f, SaveBitmapState 
*dbms,

 trace_send_bitmap_bits(flags, start_sector, nr_sectors, buf_size);

-send_bitmap_header(f, dbms, flags);
+send_bitmap_header(f, s, dbms, flags);

 qemu_put_be64(f, start_sector);
 qemu_put_be32(f, nr_sectors);
@@ -254,12 +262,12 @@ static void send_bitmap_bits(QEMUFile *f, SaveBitmapState 
*dbms,
 }

 /* Called with iothread lock taken.  */
-static void dirty_bitmap_do_save_cleanup(void)
+static void dirty_bitmap_do_save_cleanup(DBMSaveState *s)
 {
 SaveBitmapState *dbms;

-while ((dbms = QSIMPLEQ_FIRST(&dirty_bitmap_mig_state.dbms_list)) != NULL) {
-QSIMPLEQ_REMOVE_HEAD(&dirty_bitmap_mig_state.dbms_list, entry);
+while ((dbms = QSIMPLEQ_FIRST(&s->dbms_list)) != NULL) {
+QSIMPLEQ_REMOVE_HEAD(&s->dbms_list, entry);
 bdrv_dirty_bitmap_set_busy(dbms->bitmap, false);
 bdrv_unref(dbms->bs);
 g_free(dbms);
@@ -267,7 +275,8 @@ static void dirty_bitmap_do_save_cleanup(void)
 }

 /* Called with iothread lock taken. */
-static int add_bitmaps_to_list(BlockDriverState *bs, const char *bs_name)
+static int 

[PULL 06/24] qemu-iotests/199: change discard patterns

2020-07-27 Thread Eric Blake
From: Vladimir Sementsov-Ogievskiy 

iotest 199 takes too long because of its many discard operations. At
the same time, the postcopy period is very short, in spite of all these
operations.

So, let's use fewer discards (with more interesting patterns) to reduce
the test's run time. In the next commit we'll increase the postcopy
period.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
Tested-by: Eric Blake 
Message-Id: <20200727194236.19551-6-vsement...@virtuozzo.com>
Signed-off-by: Eric Blake 
---
 tests/qemu-iotests/199 | 44 +-
 1 file changed, 26 insertions(+), 18 deletions(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index 190e820b8408..da4dae01fb5d 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -30,6 +30,28 @@ size = '256G'
 fifo = os.path.join(iotests.test_dir, 'mig_fifo')


+GiB = 1024 * 1024 * 1024
+
+discards1 = (
+(0, GiB),
+(2 * GiB + 512 * 5, 512),
+(3 * GiB + 512 * 5, 512),
+(100 * GiB, GiB)
+)
+
+discards2 = (
+(3 * GiB + 512 * 8, 512),
+(4 * GiB + 512 * 8, 512),
+(50 * GiB, GiB),
+(100 * GiB + GiB // 2, GiB)
+)
+
+
+def apply_discards(vm, discards):
+for d in discards:
+vm.hmp_qemu_io('drive0', 'discard {} {}'.format(*d))
+
+
 def event_seconds(event):
 return event['timestamp']['seconds'] + \
+event['timestamp']['microseconds'] / 1000000.0
@@ -80,9 +102,7 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 self.vm_b_events = []

 def test_postcopy(self):
-discard_size = 0x4000
 granularity = 512
-chunk = 4096

 result = self.vm_a.qmp('block-dirty-bitmap-add', node='drive0',
name='bitmap', granularity=granularity)
@@ -92,14 +112,7 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
node='drive0', name='bitmap')
 empty_sha256 = result['return']['sha256']

-s = 0
-while s < discard_size:
-self.vm_a.hmp_qemu_io('drive0', 'discard %d %d' % (s, chunk))
-s += 0x10000
-s = 0x8000
-while s < discard_size:
-self.vm_a.hmp_qemu_io('drive0', 'discard %d %d' % (s, chunk))
-s += 0x10000
+apply_discards(self.vm_a, discards1 + discards2)

 result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',
node='drive0', name='bitmap')
@@ -111,10 +124,8 @@ class 
TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 result = self.vm_a.qmp('block-dirty-bitmap-clear', node='drive0',
name='bitmap')
 self.assert_qmp(result, 'return', {})
-s = 0
-while s < discard_size:
-self.vm_a.hmp_qemu_io('drive0', 'discard %d %d' % (s, chunk))
-s += 0x1
+
+apply_discards(self.vm_a, discards1)

 caps = [{'capability': 'dirty-bitmaps', 'state': True},
 {'capability': 'events', 'state': True}]
@@ -134,10 +145,7 @@ class 
TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 event_resume = self.vm_b.event_wait('RESUME')
 self.vm_b_events.append(event_resume)

-s = 0x8000
-while s < discard_size:
-self.vm_b.hmp_qemu_io('drive0', 'discard %d %d' % (s, chunk))
-s += 0x1
+apply_discards(self.vm_b, discards2)

 match = {'data': {'status': 'completed'}}
 event_complete = self.vm_b.event_wait('MIGRATION', match=match)
-- 
2.27.0




[PULL 09/24] migration/block-dirty-bitmap: rename state structure types

2020-07-27 Thread Eric Blake
From: Vladimir Sementsov-Ogievskiy 

Rename the types to be shorter and symmetrical between the load and
save parts.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
Reviewed-by: Eric Blake 
Message-Id: <20200727194236.19551-9-vsement...@virtuozzo.com>
Signed-off-by: Eric Blake 
---
 migration/block-dirty-bitmap.c | 70 ++
 1 file changed, 37 insertions(+), 33 deletions(-)

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 0739f1259e05..1d57bff4f6c7 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -100,23 +100,25 @@
 /* 0x04 was "AUTOLOAD" flags on elder versions, no it is ignored */
 #define DIRTY_BITMAP_MIG_START_FLAG_RESERVED_MASK0xf8

-typedef struct DirtyBitmapMigBitmapState {
+/* State of one bitmap during save process */
+typedef struct SaveBitmapState {
 /* Written during setup phase. */
 BlockDriverState *bs;
 const char *node_name;
 BdrvDirtyBitmap *bitmap;
 uint64_t total_sectors;
 uint64_t sectors_per_chunk;
-QSIMPLEQ_ENTRY(DirtyBitmapMigBitmapState) entry;
+QSIMPLEQ_ENTRY(SaveBitmapState) entry;
 uint8_t flags;

 /* For bulk phase. */
 bool bulk_completed;
 uint64_t cur_sector;
-} DirtyBitmapMigBitmapState;
+} SaveBitmapState;

-typedef struct DirtyBitmapMigState {
-QSIMPLEQ_HEAD(, DirtyBitmapMigBitmapState) dbms_list;
+/* State of the dirty bitmap migration (DBM) during save process */
+typedef struct DBMSaveState {
+QSIMPLEQ_HEAD(, SaveBitmapState) dbms_list;

 bool bulk_completed;
 bool no_bitmaps;
@@ -124,23 +126,25 @@ typedef struct DirtyBitmapMigState {
 /* for send_bitmap_bits() */
 BlockDriverState *prev_bs;
 BdrvDirtyBitmap *prev_bitmap;
-} DirtyBitmapMigState;
+} DBMSaveState;

-typedef struct DirtyBitmapLoadState {
+/* State of the dirty bitmap migration (DBM) during load process */
+typedef struct DBMLoadState {
 uint32_t flags;
 char node_name[256];
 char bitmap_name[256];
 BlockDriverState *bs;
 BdrvDirtyBitmap *bitmap;
-} DirtyBitmapLoadState;
+} DBMLoadState;

-static DirtyBitmapMigState dirty_bitmap_mig_state;
+static DBMSaveState dirty_bitmap_mig_state;

-typedef struct DirtyBitmapLoadBitmapState {
+/* State of one bitmap during load process */
+typedef struct LoadBitmapState {
 BlockDriverState *bs;
 BdrvDirtyBitmap *bitmap;
 bool migrated;
-} DirtyBitmapLoadBitmapState;
+} LoadBitmapState;
 static GSList *enabled_bitmaps;
 QemuMutex finish_lock;

@@ -170,7 +174,7 @@ static void qemu_put_bitmap_flags(QEMUFile *f, uint32_t 
flags)
 qemu_put_byte(f, flags);
 }

-static void send_bitmap_header(QEMUFile *f, DirtyBitmapMigBitmapState *dbms,
+static void send_bitmap_header(QEMUFile *f, SaveBitmapState *dbms,
uint32_t additional_flags)
 {
 BlockDriverState *bs = dbms->bs;
@@ -199,19 +203,19 @@ static void send_bitmap_header(QEMUFile *f, 
DirtyBitmapMigBitmapState *dbms,
 }
 }

-static void send_bitmap_start(QEMUFile *f, DirtyBitmapMigBitmapState *dbms)
+static void send_bitmap_start(QEMUFile *f, SaveBitmapState *dbms)
 {
 send_bitmap_header(f, dbms, DIRTY_BITMAP_MIG_FLAG_START);
 qemu_put_be32(f, bdrv_dirty_bitmap_granularity(dbms->bitmap));
 qemu_put_byte(f, dbms->flags);
 }

-static void send_bitmap_complete(QEMUFile *f, DirtyBitmapMigBitmapState *dbms)
+static void send_bitmap_complete(QEMUFile *f, SaveBitmapState *dbms)
 {
 send_bitmap_header(f, dbms, DIRTY_BITMAP_MIG_FLAG_COMPLETE);
 }

-static void send_bitmap_bits(QEMUFile *f, DirtyBitmapMigBitmapState *dbms,
+static void send_bitmap_bits(QEMUFile *f, SaveBitmapState *dbms,
  uint64_t start_sector, uint32_t nr_sectors)
 {
 /* align for buffer_is_zero() */
@@ -257,7 +261,7 @@ static void send_bitmap_bits(QEMUFile *f, 
DirtyBitmapMigBitmapState *dbms,
 /* Called with iothread lock taken.  */
 static void dirty_bitmap_mig_cleanup(void)
 {
-DirtyBitmapMigBitmapState *dbms;
+SaveBitmapState *dbms;

 while ((dbms = QSIMPLEQ_FIRST(&dirty_bitmap_mig_state.dbms_list)) != NULL) {
 QSIMPLEQ_REMOVE_HEAD(&dirty_bitmap_mig_state.dbms_list, entry);
@@ -271,7 +275,7 @@ static void dirty_bitmap_mig_cleanup(void)
 static int add_bitmaps_to_list(BlockDriverState *bs, const char *bs_name)
 {
 BdrvDirtyBitmap *bitmap;
-DirtyBitmapMigBitmapState *dbms;
+SaveBitmapState *dbms;
 Error *local_err = NULL;

 FOR_EACH_DIRTY_BITMAP(bs, bitmap) {
@@ -309,7 +313,7 @@ static int add_bitmaps_to_list(BlockDriverState *bs, const 
char *bs_name)
 bdrv_ref(bs);
 bdrv_dirty_bitmap_set_busy(bitmap, true);

-dbms = g_new0(DirtyBitmapMigBitmapState, 1);
+dbms = g_new0(SaveBitmapState, 1);
 dbms->bs = bs;
 dbms->node_name = bs_name;
 dbms->bitmap = bitmap;
@@ -334,7 +338,7 @@ static int add_bitmaps_to_list(BlockDriverState *bs, const 
char *bs_name)
 static int 

[PULL 10/24] migration/block-dirty-bitmap: rename dirty_bitmap_mig_cleanup

2020-07-27 Thread Eric Blake
From: Vladimir Sementsov-Ogievskiy 

Rename dirty_bitmap_mig_cleanup to dirty_bitmap_do_save_cleanup to
stress that it is part of the save path.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
Reviewed-by: Eric Blake 
Message-Id: <20200727194236.19551-10-vsement...@virtuozzo.com>
Signed-off-by: Eric Blake 
---
 migration/block-dirty-bitmap.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 1d57bff4f6c7..01a536d7d3d3 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -259,7 +259,7 @@ static void send_bitmap_bits(QEMUFile *f, SaveBitmapState 
*dbms,
 }

 /* Called with iothread lock taken.  */
-static void dirty_bitmap_mig_cleanup(void)
+static void dirty_bitmap_do_save_cleanup(void)
 {
 SaveBitmapState *dbms;

@@ -406,7 +406,7 @@ static int init_dirty_bitmap_migration(void)

 fail:
 g_hash_table_destroy(handled_by_blk);
-dirty_bitmap_mig_cleanup();
+dirty_bitmap_do_save_cleanup();

 return -1;
 }
@@ -445,7 +445,7 @@ static void bulk_phase(QEMUFile *f, bool limit)
 /* for SaveVMHandlers */
 static void dirty_bitmap_save_cleanup(void *opaque)
 {
-dirty_bitmap_mig_cleanup();
+dirty_bitmap_do_save_cleanup();
 }

 static int dirty_bitmap_save_iterate(QEMUFile *f, void *opaque)
@@ -480,7 +480,7 @@ static int dirty_bitmap_save_complete(QEMUFile *f, void 
*opaque)

 trace_dirty_bitmap_save_complete_finish();

-dirty_bitmap_mig_cleanup();
+dirty_bitmap_do_save_cleanup();
 return 0;
 }

-- 
2.27.0




[PULL 07/24] qemu-iotests/199: increase postcopy period

2020-07-27 Thread Eric Blake
From: Vladimir Sementsov-Ogievskiy 

The test wants to force a bitmap postcopy. Still, the resulting
postcopy period is very small. Let's increase it by adding more
bitmaps to migrate. Also, test migration of disabled bitmaps.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
Tested-by: Eric Blake 
Message-Id: <20200727194236.19551-7-vsement...@virtuozzo.com>
Signed-off-by: Eric Blake 
---
 tests/qemu-iotests/199 | 66 +++---
 1 file changed, 43 insertions(+), 23 deletions(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index da4dae01fb5d..d8532e49da00 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -103,30 +103,46 @@ class 
TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):

 def test_postcopy(self):
 granularity = 512
+nb_bitmaps = 15

-result = self.vm_a.qmp('block-dirty-bitmap-add', node='drive0',
-   name='bitmap', granularity=granularity)
-self.assert_qmp(result, 'return', {})
+for i in range(nb_bitmaps):
+result = self.vm_a.qmp('block-dirty-bitmap-add', node='drive0',
+   name='bitmap{}'.format(i),
+   granularity=granularity)
+self.assert_qmp(result, 'return', {})

 result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',
-   node='drive0', name='bitmap')
+   node='drive0', name='bitmap0')
 empty_sha256 = result['return']['sha256']

-apply_discards(self.vm_a, discards1 + discards2)
-
-result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',
-   node='drive0', name='bitmap')
-sha256 = result['return']['sha256']
-
-# Check, that updating the bitmap by discards works
-assert sha256 != empty_sha256
-
-result = self.vm_a.qmp('block-dirty-bitmap-clear', node='drive0',
-   name='bitmap')
-self.assert_qmp(result, 'return', {})
-
 apply_discards(self.vm_a, discards1)

+result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',
+   node='drive0', name='bitmap0')
+discards1_sha256 = result['return']['sha256']
+
+# Check, that updating the bitmap by discards works
+assert discards1_sha256 != empty_sha256
+
+# We want to calculate resulting sha256. Do it in bitmap0, so, disable
+# other bitmaps
+for i in range(1, nb_bitmaps):
+result = self.vm_a.qmp('block-dirty-bitmap-disable', node='drive0',
+   name='bitmap{}'.format(i))
+self.assert_qmp(result, 'return', {})
+
+apply_discards(self.vm_a, discards2)
+
+result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',
+   node='drive0', name='bitmap0')
+all_discards_sha256 = result['return']['sha256']
+
+# Now, enable some bitmaps, to be updated during migration
+for i in range(2, nb_bitmaps, 2):
+result = self.vm_a.qmp('block-dirty-bitmap-enable', node='drive0',
+   name='bitmap{}'.format(i))
+self.assert_qmp(result, 'return', {})
+
 caps = [{'capability': 'dirty-bitmaps', 'state': True},
 {'capability': 'events', 'state': True}]

@@ -145,6 +161,7 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 event_resume = self.vm_b.event_wait('RESUME')
 self.vm_b_events.append(event_resume)

+# enabled bitmaps should be updated
 apply_discards(self.vm_b, discards2)

 match = {'data': {'status': 'completed'}}
@@ -158,7 +175,7 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 downtime = event_dist(event_stop, event_resume)
 postcopy_time = event_dist(event_resume, event_complete)

-# TODO: assert downtime * 10 < postcopy_time
+assert downtime * 10 < postcopy_time
 if debug:
 print('downtime:', downtime)
 print('postcopy_time:', postcopy_time)
@@ -166,12 +183,15 @@ class 
TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 # Assert that bitmap migration is finished (check that successor bitmap
 # is removed)
 result = self.vm_b.qmp('query-block')
-assert len(result['return'][0]['dirty-bitmaps']) == 1
+assert len(result['return'][0]['dirty-bitmaps']) == nb_bitmaps

-# Check content of migrated (and updated by new writes) bitmap
-result = self.vm_b.qmp('x-debug-block-dirty-bitmap-sha256',
-   node='drive0', name='bitmap')
-self.assert_qmp(result, 'return/sha256', sha256)
+# Check content of migrated bitmaps. Still, don't waste time checking
+# every bitmap
+for i in range(0, nb_bitmaps, 5):
+  

[PULL 05/24] qemu-iotests/199: improve performance: set bitmap by discard

2020-07-27 Thread Eric Blake
From: Vladimir Sementsov-Ogievskiy 

Discard dirties the dirty bitmap just as write does, but works faster.
Let's use it instead.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
Tested-by: Eric Blake 
Message-Id: <20200727194236.19551-5-vsement...@virtuozzo.com>
Signed-off-by: Eric Blake 
---
 tests/qemu-iotests/199 | 31 ---
 1 file changed, 20 insertions(+), 11 deletions(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index dd6044768c76..190e820b8408 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -67,8 +67,10 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 os.mkfifo(fifo)
 qemu_img('create', '-f', iotests.imgfmt, disk_a, size)
 qemu_img('create', '-f', iotests.imgfmt, disk_b, size)
-self.vm_a = iotests.VM(path_suffix='a').add_drive(disk_a)
-self.vm_b = iotests.VM(path_suffix='b').add_drive(disk_b)
+self.vm_a = iotests.VM(path_suffix='a').add_drive(disk_a,
+  'discard=unmap')
+self.vm_b = iotests.VM(path_suffix='b').add_drive(disk_b,
+  'discard=unmap')
 self.vm_b.add_incoming("exec: cat '" + fifo + "'")
 self.vm_a.launch()
 self.vm_b.launch()
@@ -78,7 +80,7 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 self.vm_b_events = []

 def test_postcopy(self):
-write_size = 0x40000000
+discard_size = 0x40000000
 granularity = 512
 chunk = 4096

@@ -86,25 +88,32 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
name='bitmap', granularity=granularity)
 self.assert_qmp(result, 'return', {})

+result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',
+   node='drive0', name='bitmap')
+empty_sha256 = result['return']['sha256']
+
 s = 0
-while s < write_size:
-self.vm_a.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
+while s < discard_size:
+self.vm_a.hmp_qemu_io('drive0', 'discard %d %d' % (s, chunk))
 s += 0x10000
 s = 0x8000
-while s < write_size:
-self.vm_a.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
+while s < discard_size:
+self.vm_a.hmp_qemu_io('drive0', 'discard %d %d' % (s, chunk))
 s += 0x10000

 result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',
node='drive0', name='bitmap')
 sha256 = result['return']['sha256']

+# Check, that updating the bitmap by discards works
+assert sha256 != empty_sha256
+
 result = self.vm_a.qmp('block-dirty-bitmap-clear', node='drive0',
name='bitmap')
 self.assert_qmp(result, 'return', {})
 s = 0
-while s < write_size:
-self.vm_a.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
+while s < discard_size:
+self.vm_a.hmp_qemu_io('drive0', 'discard %d %d' % (s, chunk))
 s += 0x10000

 caps = [{'capability': 'dirty-bitmaps', 'state': True},
@@ -126,8 +135,8 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 self.vm_b_events.append(event_resume)

 s = 0x8000
-while s < write_size:
-self.vm_b.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
+while s < discard_size:
+self.vm_b.hmp_qemu_io('drive0', 'discard %d %d' % (s, chunk))
 s += 0x10000

 match = {'data': {'status': 'completed'}}
-- 
2.27.0




[PULL 02/24] qemu-iotests/199: fix style

2020-07-27 Thread Eric Blake
From: Vladimir Sementsov-Ogievskiy 

Mostly, satisfy pep8 complaints.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
Tested-by: Eric Blake 
Message-Id: <20200727194236.19551-2-vsement...@virtuozzo.com>
Signed-off-by: Eric Blake 
---
 tests/qemu-iotests/199 | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index 40774eed74c2..de9ba8d94c23 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -28,8 +28,8 @@ disk_b = os.path.join(iotests.test_dir, 'disk_b')
 size = '256G'
 fifo = os.path.join(iotests.test_dir, 'mig_fifo')

+
 class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
-
 def tearDown(self):
 self.vm_a.shutdown()
 self.vm_b.shutdown()
@@ -54,7 +54,7 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):

 result = self.vm_a.qmp('block-dirty-bitmap-add', node='drive0',
name='bitmap', granularity=granularity)
-self.assert_qmp(result, 'return', {});
+self.assert_qmp(result, 'return', {})

 s = 0
 while s < write_size:
@@ -71,7 +71,7 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):

 result = self.vm_a.qmp('block-dirty-bitmap-clear', node='drive0',
name='bitmap')
-self.assert_qmp(result, 'return', {});
+self.assert_qmp(result, 'return', {})
 s = 0
 while s < write_size:
 self.vm_a.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
@@ -104,15 +104,16 @@ class 
TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 self.vm_b.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
 s += 0x10000

-result = self.vm_b.qmp('query-block');
+result = self.vm_b.qmp('query-block')
 while len(result['return'][0]['dirty-bitmaps']) > 1:
 time.sleep(2)
-result = self.vm_b.qmp('query-block');
+result = self.vm_b.qmp('query-block')

 result = self.vm_b.qmp('x-debug-block-dirty-bitmap-sha256',
node='drive0', name='bitmap')

-self.assert_qmp(result, 'return/sha256', sha256);
+self.assert_qmp(result, 'return/sha256', sha256)
+

 if __name__ == '__main__':
 iotests.main(supported_fmts=['qcow2'], supported_cache_modes=['none'],
-- 
2.27.0




[PULL 03/24] qemu-iotests/199: drop extra constraints

2020-07-27 Thread Eric Blake
From: Vladimir Sementsov-Ogievskiy 

We don't need any specific format constraints here. Still, keep qcow2
for two reasons:
1. Avoid extra runs of this format-unrelated test
2. Allow adding checks around persistent bitmaps in the future (they
require qcow2)

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
Tested-by: Eric Blake 
Message-Id: <20200727194236.19551-3-vsement...@virtuozzo.com>
Signed-off-by: Eric Blake 
---
 tests/qemu-iotests/199 | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index de9ba8d94c23..dda918450a8b 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -116,5 +116,4 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):


 if __name__ == '__main__':
-iotests.main(supported_fmts=['qcow2'], supported_cache_modes=['none'],
- supported_protocols=['file'])
+iotests.main(supported_fmts=['qcow2'])
-- 
2.27.0




[PULL 04/24] qemu-iotests/199: better catch postcopy time

2020-07-27 Thread Eric Blake
From: Vladimir Sementsov-Ogievskiy 

The test aims to test _postcopy_ migration, and wants to do some write
operations during the postcopy phase.

The test considers the migrate status=completed event on the source as
the start of postcopy. This is completely wrong: that completion is the
completion of the whole migration process. Let's instead consider the
destination's start as the start of postcopy, and use the RESUME event
for it.

Next, as the migration finish, let's use the migration status=completed
event on the target, as such a method is closer to what libvirt or
another user will do than tracking the number of dirty bitmaps.

Finally, add a possibility to dump events for debugging. If we set
debug to True, we see that the actual postcopy period is very small
relative to the whole test duration (~0.2 seconds against >40 seconds
for me). This means the test is very inefficient at what it is supposed
to do. Let's improve it in the following commits.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Andrey Shinkevich 
Tested-by: Eric Blake 
Message-Id: <20200727194236.19551-4-vsement...@virtuozzo.com>
Signed-off-by: Eric Blake 
---
 tests/qemu-iotests/199 | 72 +-
 1 file changed, 57 insertions(+), 15 deletions(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index dda918450a8b..dd6044768c76 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -20,17 +20,43 @@

 import os
 import iotests
-import time
 from iotests import qemu_img

+debug = False
+
 disk_a = os.path.join(iotests.test_dir, 'disk_a')
 disk_b = os.path.join(iotests.test_dir, 'disk_b')
 size = '256G'
 fifo = os.path.join(iotests.test_dir, 'mig_fifo')


+def event_seconds(event):
+return event['timestamp']['seconds'] + \
+event['timestamp']['microseconds'] / 1000000.0
+
+
+def event_dist(e1, e2):
+return event_seconds(e2) - event_seconds(e1)
+
+
 class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 def tearDown(self):
+if debug:
+self.vm_a_events += self.vm_a.get_qmp_events()
+self.vm_b_events += self.vm_b.get_qmp_events()
+for e in self.vm_a_events:
+e['vm'] = 'SRC'
+for e in self.vm_b_events:
+e['vm'] = 'DST'
+events = (self.vm_a_events + self.vm_b_events)
+events = [(e['timestamp']['seconds'],
+   e['timestamp']['microseconds'],
+   e['vm'],
+   e['event'],
+   e.get('data', '')) for e in events]
+for e in sorted(events):
+print('{}.{:06} {} {} {}'.format(*e))
+
 self.vm_a.shutdown()
 self.vm_b.shutdown()
 os.remove(disk_a)
@@ -47,6 +73,10 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 self.vm_a.launch()
 self.vm_b.launch()

+# collect received events for debug
+self.vm_a_events = []
+self.vm_b_events = []
+
 def test_postcopy(self):
 write_size = 0x40000000
 granularity = 512
@@ -77,15 +107,13 @@ class 
TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 self.vm_a.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
 s += 0x10000

-bitmaps_cap = {'capability': 'dirty-bitmaps', 'state': True}
-events_cap = {'capability': 'events', 'state': True}
+caps = [{'capability': 'dirty-bitmaps', 'state': True},
+{'capability': 'events', 'state': True}]

-result = self.vm_a.qmp('migrate-set-capabilities',
-   capabilities=[bitmaps_cap, events_cap])
+result = self.vm_a.qmp('migrate-set-capabilities', capabilities=caps)
 self.assert_qmp(result, 'return', {})

-result = self.vm_b.qmp('migrate-set-capabilities',
-   capabilities=[bitmaps_cap])
+result = self.vm_b.qmp('migrate-set-capabilities', capabilities=caps)
 self.assert_qmp(result, 'return', {})

 result = self.vm_a.qmp('migrate', uri='exec:cat>' + fifo)
@@ -94,24 +122,38 @@ class 
TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 result = self.vm_a.qmp('migrate-start-postcopy')
 self.assert_qmp(result, 'return', {})

-while True:
-event = self.vm_a.event_wait('MIGRATION')
-if event['data']['status'] == 'completed':
-break
+event_resume = self.vm_b.event_wait('RESUME')
+self.vm_b_events.append(event_resume)

 s = 0x8000
 while s < write_size:
 self.vm_b.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
 s += 0x10000

+match = {'data': {'status': 'completed'}}
+event_complete = self.vm_b.event_wait('MIGRATION', match=match)
+self.vm_b_events.append(event_complete)
+
+# take queued event, should already been happened
+event_stop = self.vm_a.event_wait('STOP')
+self.vm_a_events.append(event_stop)
+
+   

[PULL 00/24] bitmaps patches for -rc2, 2020-07-27

2020-07-27 Thread Eric Blake
The following changes since commit 9303ecb658a0194560d1eecde165a1511223c2d8:

  Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20200727' into 
staging (2020-07-27 17:25:06 +0100)

are available in the Git repository at:

  https://repo.or.cz/qemu/ericb.git tags/pull-bitmaps-2020-07-27

for you to fetch changes up to 37931e006f05cb768b78dcc47453b13f76ea43c5:

  migration: Fix typos in bitmap migration comments (2020-07-27 15:42:21 -0500)


bitmaps patches for 2020-07-27

- Improve handling of various post-copy bitmap migration scenarios. A lost
bitmap should merely mean that the next backup must be full rather than
incremental, instead of abruptly breaking the entire guest migration.
- Associated iotest improvements


Andrey Shinkevich (1):
  qcow2: Fix capitalization of header extension constant.

Eric Blake (2):
  iotests: Adjust which migration tests are quick
  migration: Fix typos in bitmap migration comments

Vladimir Sementsov-Ogievskiy (21):
  qemu-iotests/199: fix style
  qemu-iotests/199: drop extra constraints
  qemu-iotests/199: better catch postcopy time
  qemu-iotests/199: improve performance: set bitmap by discard
  qemu-iotests/199: change discard patterns
  qemu-iotests/199: increase postcopy period
  migration/block-dirty-bitmap: fix dirty_bitmap_mig_before_vm_start
  migration/block-dirty-bitmap: rename state structure types
  migration/block-dirty-bitmap: rename dirty_bitmap_mig_cleanup
  migration/block-dirty-bitmap: move mutex init to dirty_bitmap_mig_init
  migration/block-dirty-bitmap: refactor state global variables
  migration/block-dirty-bitmap: rename finish_lock to just lock
  migration/block-dirty-bitmap: simplify dirty_bitmap_load_complete
  migration/block-dirty-bitmap: keep bitmap state for all bitmaps
  migration/block-dirty-bitmap: relax error handling in incoming part
  migration/block-dirty-bitmap: cancel migration on shutdown
  migration/savevm: don't worry if bitmap migration postcopy failed
  qemu-iotests/199: prepare for new test-cases addition
  qemu-iotests/199: check persistent bitmaps
  qemu-iotests/199: add early shutdown case to bitmaps postcopy
  qemu-iotests/199: add source-killed case to bitmaps postcopy

 docs/interop/qcow2.txt |   2 +-
 migration/migration.h  |   3 +-
 block/qcow2.c  |   2 +-
 migration/block-dirty-bitmap.c | 472 ++---
 migration/migration.c  |  15 +-
 migration/savevm.c |  37 +++-
 tests/qemu-iotests/199 | 254 +-
 tests/qemu-iotests/199.out |   4 +-
 tests/qemu-iotests/group   |  12 +-
 9 files changed, 556 insertions(+), 245 deletions(-)

-- 
2.27.0




[PULL 01/24] qcow2: Fix capitalization of header extension constant.

2020-07-27 Thread Eric Blake
From: Andrey Shinkevich 

Make the capitalization of the hexadecimal numbers consistent for the
QCOW2 header extension constants in docs/interop/qcow2.txt.

Suggested-by: Eric Blake 
Signed-off-by: Andrey Shinkevich 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
Message-Id: <1594973699-781898-2-git-send-email-andrey.shinkev...@virtuozzo.com>
Reviewed-by: Eric Blake 
Signed-off-by: Eric Blake 
---
 docs/interop/qcow2.txt | 2 +-
 block/qcow2.c  | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/interop/qcow2.txt b/docs/interop/qcow2.txt
index cb723463f241..f072e27900e6 100644
--- a/docs/interop/qcow2.txt
+++ b/docs/interop/qcow2.txt
@@ -231,7 +231,7 @@ be stored. Each extension has a structure like the 
following:

 Byte  0 -  3:   Header extension type:
 0x - End of the header extension area
-0xE2792ACA - Backing file format name string
+0xe2792aca - Backing file format name string
 0x6803f857 - Feature name table
 0x23852875 - Bitmaps extension
 0x0537be77 - Full disk encryption header pointer
diff --git a/block/qcow2.c b/block/qcow2.c
index fadf3422f8c5..6ad6bdc166ea 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -66,7 +66,7 @@ typedef struct {
 } QEMU_PACKED QCowExtension;

 #define  QCOW2_EXT_MAGIC_END 0
-#define  QCOW2_EXT_MAGIC_BACKING_FORMAT 0xE2792ACA
+#define  QCOW2_EXT_MAGIC_BACKING_FORMAT 0xe2792aca
 #define  QCOW2_EXT_MAGIC_FEATURE_TABLE 0x6803f857
 #define  QCOW2_EXT_MAGIC_CRYPTO_HEADER 0x0537be77
 #define  QCOW2_EXT_MAGIC_BITMAPS 0x23852875
-- 
2.27.0




Re: [PATCH] migration: Fix typos in bitmap migration comments

2020-07-27 Thread Vladimir Sementsov-Ogievskiy

27.07.2020 23:32, Eric Blake wrote:

Noticed while reviewing the file for newer patches.

Fixes: b35ebdf076
Signed-off-by: Eric Blake 
---

This is trivial enough that I'll throw it in my pull request today.

  migration/block-dirty-bitmap.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 1f675b792fc9..784330ebe130 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -97,7 +97,7 @@

  #define DIRTY_BITMAP_MIG_START_FLAG_ENABLED  0x01
  #define DIRTY_BITMAP_MIG_START_FLAG_PERSISTENT   0x02
-/* 0x04 was "AUTOLOAD" flags on elder versions, no it is ignored */
+/* 0x04 was "AUTOLOAD" flags on older versions, now it is ignored */


maybe also s/flags/flag


  #define DIRTY_BITMAP_MIG_START_FLAG_RESERVED_MASK0xf8

  /* State of one bitmap during save process */
@@ -180,7 +180,7 @@ static uint32_t qemu_get_bitmap_flags(QEMUFile *f)

  static void qemu_put_bitmap_flags(QEMUFile *f, uint32_t flags)
  {
-/* The code currently do not send flags more than one byte */
+/* The code currently does not send flags as more than one byte */


Hmm, why "as more than", not just "more than"?
(This note is about the following: the protocol allows adding more than
one byte of flags with the use of DIRTY_BITMAP_MIG_EXTRA_FLAGS. Still,
this possibility is currently not used, and we assert that.)


  assert(!(flags & (0xff00 | DIRTY_BITMAP_MIG_EXTRA_FLAGS)));

  qemu_put_byte(f, flags);



Anyway:
Reviewed-by: Vladimir Sementsov-Ogievskiy 

--
Best regards,
Vladimir



Re: [PATCH 00/16] hw/block/nvme: dma handling and address mapping cleanup

2020-07-27 Thread Keith Busch
On Mon, Jul 27, 2020 at 11:42:46AM +0200, Klaus Jensen wrote:
> On Jul 20 13:37, Klaus Jensen wrote:
> > From: Klaus Jensen 
> > 
> > This series consists of patches that refactors dma read/write and adds a
> > number of address mapping helper functions.
> > 
> > Based-on: <20200706061303.246057-1-...@irrelevant.dk>
> > 
> > Klaus Jensen (16):
> >   hw/block/nvme: memset preallocated requests structures
> >   hw/block/nvme: add mapping helpers
> >   hw/block/nvme: replace dma_acct with blk_acct equivalent
> >   hw/block/nvme: remove redundant has_sg member
> >   hw/block/nvme: refactor dma read/write
> >   hw/block/nvme: pass request along for tracing
> >   hw/block/nvme: add request mapping helper
> >   hw/block/nvme: verify validity of prp lists in the cmb
> >   hw/block/nvme: refactor request bounds checking
> >   hw/block/nvme: add check for mdts
> >   hw/block/nvme: be consistent about zeros vs zeroes
> >   hw/block/nvme: refactor NvmeRequest clearing
> >   hw/block/nvme: add a namespace reference in NvmeRequest
> >   hw/block/nvme: consolidate qsg/iov clearing
> >   hw/block/nvme: remove NvmeCmd parameter
> >   hw/block/nvme: use preallocated qsg/iov in nvme_dma_prp
> > 
> >  block/nvme.c  |   4 +-
> >  hw/block/nvme.c   | 498 +++---
> >  hw/block/nvme.h   |   4 +-
> >  hw/block/trace-events |   4 +
> >  include/block/nvme.h  |   4 +-
> >  5 files changed, 331 insertions(+), 183 deletions(-)
> > 
> > -- 
> > 2.27.0
> > 
> 
> Gentle ping on this.

I'll have free time to get back to this probably end of the week,
possibly early next week.



[PATCH] migration: Fix typos in bitmap migration comments

2020-07-27 Thread Eric Blake
Noticed while reviewing the file for newer patches.

Fixes: b35ebdf076
Signed-off-by: Eric Blake 
---

This is trivial enough that I'll throw it in my pull request today.

 migration/block-dirty-bitmap.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 1f675b792fc9..784330ebe130 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -97,7 +97,7 @@

 #define DIRTY_BITMAP_MIG_START_FLAG_ENABLED  0x01
 #define DIRTY_BITMAP_MIG_START_FLAG_PERSISTENT   0x02
-/* 0x04 was "AUTOLOAD" flags on elder versions, no it is ignored */
+/* 0x04 was "AUTOLOAD" flags on older versions, now it is ignored */
 #define DIRTY_BITMAP_MIG_START_FLAG_RESERVED_MASK0xf8

 /* State of one bitmap during save process */
@@ -180,7 +180,7 @@ static uint32_t qemu_get_bitmap_flags(QEMUFile *f)

 static void qemu_put_bitmap_flags(QEMUFile *f, uint32_t flags)
 {
-/* The code currently do not send flags more than one byte */
+/* The code currently does not send flags as more than one byte */
 assert(!(flags & (0xff00 | DIRTY_BITMAP_MIG_EXTRA_FLAGS)));

 qemu_put_byte(f, flags);
-- 
2.27.0




Re: [PATCH] linux-user: Fix 'clock_nanosleep()' implementation

2020-07-27 Thread Laurent Vivier
Le 27/07/2020 à 22:13, Filip Bozuta a écrit :
> Implementation of syscall 'clock_nanosleep()' in 'syscall.c' uses
> functions 'target_to_host_timespec()' and 'host_to_target_timespec()'
> to transfer the value of 'struct timespec' between target and host.
> However, the implementation doesn't check whether this conversion
> succeeds and thus can return an inappropriate error instead of the 'EFAULT'
> that is expected. This was confirmed with the modified LTP test suite
> where testcases with a bad 'struct timespec' address for 'clock_nanosleep()'
> were added. This modified LTP suite can be found at:
> https://github.com/bozutaf/ltp
> 
> (Patch with this new test case will be sent to LTP mailing list soon)
> 
> Signed-off-by: Filip Bozuta 
> ---
>  linux-user/syscall.c | 9 ++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/linux-user/syscall.c b/linux-user/syscall.c
> index f5c4f6b95d..9f06dde947 100644
> --- a/linux-user/syscall.c
> +++ b/linux-user/syscall.c
> @@ -11828,7 +11828,9 @@ static abi_long do_syscall1(void *cpu_env, int num, 
> abi_long arg1,
>  case TARGET_NR_clock_nanosleep:
>  {
>  struct timespec ts;
> -target_to_host_timespec(&ts, arg3);
> +if (target_to_host_timespec(&ts, arg3)) {
> +return -TARGET_EFAULT;
> +}
>  ret = get_errno(safe_clock_nanosleep(arg1, arg2,
>   &ts, arg4 ? &ts : NULL));
>  /*
> @@ -11836,8 +11838,9 @@ static abi_long do_syscall1(void *cpu_env, int num, 
> abi_long arg1,
>   * with error -TARGET_EINTR and if arg4 is not NULL and arg2 is not
>   * TIMER_ABSTIME, it returns the remaining unslept time in arg4.
>   */
> -if (ret == -TARGET_EINTR && arg4 && arg2 != TIMER_ABSTIME) {
> -host_to_target_timespec(arg4, &ts);
> +if (ret == -TARGET_EINTR && arg4 && arg2 != TIMER_ABSTIME &&
> +host_to_target_timespec(arg4, &ts)) {
> +  return -TARGET_EFAULT;
>  }
>  
>  return ret;
> 

Reviewed-by: Laurent Vivier 



[Bug 1876187] Re: qemu-system-arm freezes when using SystickTimer on netduinoplus2

2020-07-27 Thread Peter Maydell
Patch sent to list:
https://patchew.org/QEMU/20200727162617.26227-1-peter.mayd...@linaro.org/


** Changed in: qemu
   Status: New => In Progress

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1876187

Title:
  qemu-system-arm freezes when using SystickTimer on netduinoplus2

Status in QEMU:
  In Progress

Bug description:
  git commit 27c94566379069fb8930bb1433dcffbf7df3203d

  The global variable system_clock_scale used in
  hw/timer/armv7m_systick.c is never set on the netduinoplus2 platform,
  it stays initialized as zero. Using the timer with the CPU clock as the
  clock source leads to an infinite loop because systick_timer->tick always
  stays the same.

  To reproduce, use the CMSIS function SysTick_Config(uint32_t ticks) to
  set up the timer.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1876187/+subscriptions
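
The fix, in general terms, is for the board or SoC model to initialize the
global before the guest can program the timer. A minimal sketch of the usual
pattern follows; SYSCLK_FRQ is an assumed placeholder frequency, not the
value from the actual patch:

/* Sketch only: how an armv7m SoC model typically sets the systick
 * clock rate. Leaving system_clock_scale at zero makes the computed
 * tick period zero, so the emulated timer never advances.
 */
#include "qemu/osdep.h"
#include "qemu/timer.h"              /* NANOSECONDS_PER_SECOND */
#include "hw/timer/armv7m_systick.h" /* system_clock_scale */

#define SYSCLK_FRQ 168000000ULL      /* assumed core clock, in Hz */

static void soc_init_clocks(void)
{
    system_clock_scale = NANOSECONDS_PER_SECOND / SYSCLK_FRQ;
}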



Re: [PATCH v4 15/21] migration/block-dirty-bitmap: relax error handling in incoming part

2020-07-27 Thread Eric Blake

On 7/27/20 2:42 PM, Vladimir Sementsov-Ogievskiy wrote:

Bitmap data is not critical, and we should not fail the migration (or
use postcopy recovery) because of a dirty-bitmap migration failure.
Instead we should just lose unfinished bitmaps.

Still we have to report io stream violation errors, as they affect the
whole migration stream.



I'm amending this to also add:

While touching this, tighten code that was previously blindly calling 
malloc on a size read from the migration stream, as a corrupted stream 
(perhaps from a malicious user) should not be able to convince us to 
allocate an inordinate amount of memory.



Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  migration/block-dirty-bitmap.c | 164 +
  1 file changed, 127 insertions(+), 37 deletions(-)




@@ -650,15 +695,46 @@ static int dirty_bitmap_load_bits(QEMUFile *f, 
DBMLoadState *s)
  
  if (s->flags & DIRTY_BITMAP_MIG_FLAG_ZEROES) {

  trace_dirty_bitmap_load_bits_zeroes();
-bdrv_dirty_bitmap_deserialize_zeroes(s->bitmap, first_byte, nr_bytes,
- false);
+if (!s->cancelled) {
+bdrv_dirty_bitmap_deserialize_zeroes(s->bitmap, first_byte,
+ nr_bytes, false);
+}
  } else {
  size_t ret;
-uint8_t *buf;
+g_autofree uint8_t *buf = NULL;
  uint64_t buf_size = qemu_get_be64(f);
-uint64_t needed_size =
-bdrv_dirty_bitmap_serialization_size(s->bitmap,
- first_byte, nr_bytes);
+uint64_t needed_size;
+
+/*
+ * Actual check for buf_size is done a bit later. We can't do it in


s/Actual/The actual/


+ * cancelled mode as we don't have the bitmap to check the constraints
+ * (so, we do allocate buffer and read prior to the check). On the 
other
+ * hand, we shouldn't blindly g_malloc the number from the stream.
+ * Actually one chunk should not be larger thatn CHUNK_SIZE. Let's 
allow


than


+ * a bit larger (which means that bitmap migration will fail anyway and
+ * the whole migration will most probably fail soon due to broken
+ * stream).
+ */
+if (buf_size > 10 * CHUNK_SIZE) {
+error_report("Bitmap migration stream requests too large buffer "
+ "size to allocate");


Bitmap migration stream buffer allocation request is too large

I'll make those touchups.

Reviewed-by: Eric Blake 

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org
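
The principle behind the touch-up - bound any length field read from an
untrusted stream before allocating - can be sketched standalone as below.
read_chunk(), read_be64() and MAX_CHUNK are illustrative stand-ins, not the
real migration-stream API:

/* Sketch: cap an attacker-controlled size before allocating. This only
 * illustrates the validate-then-allocate ordering the patch enforces.
 */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_CHUNK (10 * (1 << 20))   /* assumed 10 MiB upper bound */

static uint64_t read_be64(FILE *f)
{
    uint8_t b[8];
    uint64_t v = 0;

    if (fread(b, 1, sizeof(b), f) == sizeof(b)) {
        for (int i = 0; i < 8; i++) {
            v = (v << 8) | b[i];
        }
    }
    return v;
}

static uint8_t *read_chunk(FILE *f, uint64_t *len_out)
{
    uint64_t len = read_be64(f);     /* untrusted value from the stream */

    if (len > MAX_CHUNK) {
        fprintf(stderr, "stream requests too large a buffer\n");
        return NULL;                 /* fail instead of allocating blindly */
    }
    *len_out = len;
    return malloc(len);
}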




Re: [PATCH for 5.1] docs: fix trace docs build with sphinx 3.1.1

2020-07-27 Thread Peter Maydell
On Mon, 27 Jul 2020 at 20:52, John Snow  wrote:
> ... Should we say goodbye to Sphinx 1.7.x, or is there a workaround that
> keeps support from 1.6.1 through to 3.1.1?

I think we need to keep 1.7.x because it's the Sphinx shipped
by some LTS distros we support, don't we?

I do feel we probably need to defend our Sphinx-version-support
more actively by having oldest-supported and bleeding-edge
both tested in the CI setup...

thanks
-- PMM



[PATCH] linux-user: Fix 'clock_nanosleep()' implementation

2020-07-27 Thread Filip Bozuta
Implementation of syscall 'clock_nanosleep()' in 'syscall.c' uses
functions 'target_to_host_timespec()' and 'host_to_target_timespec()'
to transfer the value of 'struct timespec' between target and host.
However, the implementation doesn't check whether this conversion
succeeds and thus can return an inappropriate error instead of the 'EFAULT'
that is expected. This was confirmed with the modified LTP test suite
where testcases with a bad 'struct timespec' address for 'clock_nanosleep()'
were added. This modified LTP suite can be found at:
https://github.com/bozutaf/ltp

(Patch with this new test case will be sent to LTP mailing list soon)

Signed-off-by: Filip Bozuta 
---
 linux-user/syscall.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index f5c4f6b95d..9f06dde947 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -11828,7 +11828,9 @@ static abi_long do_syscall1(void *cpu_env, int num, 
abi_long arg1,
 case TARGET_NR_clock_nanosleep:
 {
 struct timespec ts;
-target_to_host_timespec(&ts, arg3);
+if (target_to_host_timespec(&ts, arg3)) {
+return -TARGET_EFAULT;
+}
 ret = get_errno(safe_clock_nanosleep(arg1, arg2,
  &ts, arg4 ? &ts : NULL));
 /*
@@ -11836,8 +11838,9 @@ static abi_long do_syscall1(void *cpu_env, int num, 
abi_long arg1,
  * with error -TARGET_EINTR and if arg4 is not NULL and arg2 is not
  * TIMER_ABSTIME, it returns the remaining unslept time in arg4.
  */
-if (ret == -TARGET_EINTR && arg4 && arg2 != TIMER_ABSTIME) {
-host_to_target_timespec(arg4, &ts);
+if (ret == -TARGET_EINTR && arg4 && arg2 != TIMER_ABSTIME &&
+host_to_target_timespec(arg4, &ts)) {
+  return -TARGET_EFAULT;
 }
 
 return ret;
-- 
2.25.1




Windows 10 client 4k

2020-07-27 Thread Jerry Geis
How do I get 4K resolution on a Windows 10 client?
I am using CentOS 7 or 8 hosts; both have the issue.
I am set up with QXL for the guest.

I tried to look at the VirtIO drivers, stable and new - neither has a Windows 10
driver under QXL.
They stop at Windows 7. That leads me to think there is a different way to do
this - but I don't know what that is.

Suggestions? My Linux host is running in 4K.

Thanks,


Jerry


Re: [PATCH for-5.1?] iotests: Adjust which tests are quick

2020-07-27 Thread Vladimir Sementsov-Ogievskiy

27.07.2020 22:51, Eric Blake wrote:

A quick run of './check -qcow2 -g migration' shows that test 169 is
NOT quick, but meanwhile several other tests ARE quick.  Let's adjust
the test designations accordingly.

Signed-off-by: Eric Blake 


Reviewed-by: Vladimir Sementsov-Ogievskiy 

Still, why do we need the quick group? make check uses the "auto" group.
Some tests are considered important enough to run even without being quick.
Probably everyone who doesn't want to run all tests should run the "auto"
group, not "quick"?
When I want to check my changes, I run all tests or limit them with the
help of grep. I mostly run tests on tmpfs, so they are all quick enough.
Saving several minutes of CPU work isn't worth missing a bug.


--
Best regards,
Vladimir
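
For readers who have not looked at it: tests/qemu-iotests/group lists one
test per line followed by its group names, so a redesignation like this
patch's is a one-word change per test. Illustrative lines only (NNN is a
hypothetical test number, and the exact groups of 169 may differ):

# format: <test number> <group> [<group> ...]
169 rw migration
NNN rw auto quick migration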



Re: [PATCH] linux-user: Use getcwd syscall directly

2020-07-27 Thread Laurent Vivier
Le 23/07/2020 à 12:27, Andreas Schwab a écrit :
> The glibc getcwd function returns different errors than the getcwd
> syscall, which triggers an assertion failure in the glibc getcwd function
> when running under emulation.
> 
> Signed-off-by: Andreas Schwab 
> ---
>  linux-user/syscall.c | 9 +
>  1 file changed, 1 insertion(+), 8 deletions(-)
> 
> diff --git a/linux-user/syscall.c b/linux-user/syscall.c
> index b9144b18fc..e4e46867e8 100644
> --- a/linux-user/syscall.c
> +++ b/linux-user/syscall.c
> @@ -388,14 +388,7 @@ static bitmask_transtbl fcntl_flags_tbl[] = {
>{ 0, 0, 0, 0 }
>  };
>  
> -static int sys_getcwd1(char *buf, size_t size)
> -{
> -  if (getcwd(buf, size) == NULL) {
> -  /* getcwd() sets errno */
> -  return (-1);
> -  }
> -  return strlen(buf)+1;
> -}
> +_syscall2(int, sys_getcwd1, char *, buf, size_t, size)
>  
>  #ifdef TARGET_NR_utimensat
>  #if defined(__NR_utimensat)
> 

Applied to my linux-user-for-5.1 branch.

Thanks,
Laurent
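
The behavioural difference the patch relies on can be seen outside QEMU:
the raw syscall returns the path length (terminating NUL included) rather
than a pointer, and does not go through glibc's error rewriting. A
standalone sketch, assuming a Linux host:

/* Sketch: invoke the getcwd syscall directly via syscall(2). On
 * success the kernel returns the number of bytes written, including
 * the terminating NUL; on failure the syscall() wrapper returns -1
 * with errno set.
 */
#define _GNU_SOURCE
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>

int main(void)
{
    char buf[4096];
    long n = syscall(SYS_getcwd, buf, sizeof(buf));

    if (n < 0) {
        perror("getcwd syscall");
        return 1;
    }
    printf("cwd (%ld bytes): %s\n", n, buf);
    return 0;
}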



Re: [RFC v2 01/76] target/riscv: drop vector 0.7.1 support

2020-07-27 Thread Alistair Francis
On Mon, Jul 27, 2020 at 12:54 PM Palmer Dabbelt  wrote:
>
> On Wed, 22 Jul 2020 02:15:24 PDT (-0700), frank.ch...@sifive.com wrote:
> > From: Frank Chang 
> >
> > Signed-off-by: Frank Chang 
> > ---
> >  target/riscv/cpu.c | 24 ++--
> >  target/riscv/cpu.h |  2 --
> >  2 files changed, 6 insertions(+), 20 deletions(-)
> >
> > diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> > index 228b9bdb5d..2800953e6c 100644
> > --- a/target/riscv/cpu.c
> > +++ b/target/riscv/cpu.c
> > @@ -106,11 +106,6 @@ static void set_priv_version(CPURISCVState *env, int 
> > priv_ver)
> >  env->priv_ver = priv_ver;
> >  }
> >
> > -static void set_vext_version(CPURISCVState *env, int vext_ver)
> > -{
> > -env->vext_ver = vext_ver;
> > -}
> > -
> >  static void set_feature(CPURISCVState *env, int feature)
> >  {
> >  env->features |= (1ULL << feature);
> > @@ -339,7 +334,6 @@ static void riscv_cpu_realize(DeviceState *dev, Error 
> > **errp)
> >  CPURISCVState *env = &cpu->env;
> >  RISCVCPUClass *mcc = RISCV_CPU_GET_CLASS(dev);
> >  int priv_version = PRIV_VERSION_1_11_0;
> > -int vext_version = VEXT_VERSION_0_07_1;
> >  target_ulong target_misa = 0;
> >  Error *local_err = NULL;
> >
> > @@ -363,7 +357,6 @@ static void riscv_cpu_realize(DeviceState *dev, Error 
> > **errp)
> >  }
> >
> >  set_priv_version(env, priv_version);
> > -set_vext_version(env, vext_version);
> >
> >  if (cpu->cfg.mmu) {
> >  set_feature(env, RISCV_FEATURE_MMU);
> > @@ -455,19 +448,14 @@ static void riscv_cpu_realize(DeviceState *dev, Error 
> > **errp)
> >  return;
> >  }
> >  if (cpu->cfg.vext_spec) {
> > -if (!g_strcmp0(cpu->cfg.vext_spec, "v0.7.1")) {
> > -vext_version = VEXT_VERSION_0_07_1;
> > -} else {
> > -error_setg(errp,
> > -   "Unsupported vector spec version '%s'",
> > -   cpu->cfg.vext_spec);
> > -return;
> > -}
> > +error_setg(errp,
> > +   "Unsupported vector spec version '%s'",
> > +   cpu->cfg.vext_spec);
> > +return;
> >  } else {
> > -qemu_log("vector verison is not specified, "
> > -"use the default value v0.7.1\n");
> > +qemu_log("vector version is not specified\n");
> > +return;
> >  }
> > -set_vext_version(env, vext_version);
> >  }
> >
> >  set_misa(env, RVXLEN | target_misa);
> > diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
> > index eef20ca6e5..6766dcd914 100644
> > --- a/target/riscv/cpu.h
> > +++ b/target/riscv/cpu.h
> > @@ -79,8 +79,6 @@ enum {
> >  #define PRIV_VERSION_1_10_0 0x00011000
> >  #define PRIV_VERSION_1_11_0 0x00011100
> >
> > -#define VEXT_VERSION_0_07_1 0x0701
> > -
> >  #define TRANSLATE_PMP_FAIL 2
> >  #define TRANSLATE_FAIL 1
> >  #define TRANSLATE_SUCCESS 0
>
> If I'm reading things correctly, 5.0 did not have the V extension.  This means
> that we can technically drop 0.7.1 from QEMU, as it's never been released.

There is no intention of this making it into 5.1. We are currently in
hard freeze.

The idea is that QEMU 5.1 will support v0.7.1 and then hopefully 5.2
will support v0.9.

> That said, I'd still prefer to avoid dropping 0.7.1 so late in the release
> cycle (it's already soft freeze, right?).  Given the extended length of the V
> extension development process it seems likely that 0.7.1 is going to end up in
> some silicon, which means it would be quite useful to have it in QEMU.

Agreed!

>
> I understand it's a lot more work to maintain multiple vector extensions, but
> it was very useful to have multiple privileged extensions supported in QEMU
> while that was all getting sorted out, and as the vector drafts have massive
> differences it'll probably be even more useful.

Hopefully a release will be enough for this as managing both will be a
huge maintenance burden.

Alistair

>
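
For reference, the draft vector support under discussion is opt-in per
CPU. With the 5.1-era code it is enabled through experimental CPU
properties, roughly as below; treat the exact property spellings as an
assumption rather than a stable interface:

qemu-system-riscv64 -cpu rv64,x-v=true,vlen=128,elen=64,vext_spec=v0.7.1 ...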



Re: [PATCH] linux-user: Fix syscall rt_sigtimedwait() implementation

2020-07-27 Thread Laurent Vivier
Le 24/07/2020 à 20:16, Filip Bozuta a écrit :
> Implementation of 'rt_sigtimedwait()' in 'syscall.c' uses the
> function 'target_to_host_timespec()' to transfer the value of
> 'struct timespec' from target to host. However, the implementation
> doesn't check whether this conversion succeeds and thus can cause
> an inappropriate error instead of the 'EFAULT (Bad address)' which
> is supposed to be set if the conversion from target to host fails.
> 
> This was confirmed with the LTP test for rt_sigtimedwait:
> "/testcases/kernel/syscalls/rt_sigtimedwait/rt_sigtimedwait01.c"
> which causes an inappropriate error in the test case "test_bad_address3"
> which is run with a bad address for the 'struct timespec' argument:
> 
> FAIL: test_bad_address3 (349): Unexpected failure: EAGAIN/EWOULDBLOCK (11)
> 
> The test fails with an unexpected errno 'EAGAIN/EWOULDBLOCK' instead
> of the expected EFAULT.
> 
> After the changes from this patch, the test case is executed successfully
> along with the other LTP test cases for 'rt_sigtimedwait()':
> 
> PASS: test_bad_address3 (349): Test passed
> 
> Signed-off-by: Filip Bozuta 
> ---
>  linux-user/syscall.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/linux-user/syscall.c b/linux-user/syscall.c
> index 1211e759c2..72735682cb 100644
> --- a/linux-user/syscall.c
> +++ b/linux-user/syscall.c
> @@ -8868,7 +8868,9 @@ static abi_long do_syscall1(void *cpu_env, int num, 
> abi_long arg1,
>  unlock_user(p, arg1, 0);
>  if (arg3) {
> puts = &uts;
> -target_to_host_timespec(puts, arg3);
> +if (target_to_host_timespec(puts, arg3)) {
> +return -TARGET_EFAULT;
> +}
>  } else {
>  puts = NULL;
>  }
> 

Applied to my linux-user-for-5.1 branch.

Thanks,
Laurent



Re: [PATCH] linux-user: Ensure mmap_min_addr is non-zero

2020-07-27 Thread Laurent Vivier
Le 24/07/2020 à 23:23, Richard Henderson a écrit :
> When the chroot does not have /proc mounted, we can read neither
> /proc/sys/vm/mmap_min_addr nor /proc/self/maps.
> 
> The enforcement of mmap_min_addr in the host kernel is done by
> the security module, and so does not apply to processes owned
> by root.  Which leads pgd_find_hole_fallback to succeed in probing
> a reservation at address 0.  Which confuses pgb_reserved_va to
> believe that guest_base has not actually been initialized.
> 
> We don't actually want NULL addresses to become accessible, so
> make sure that mmap_min_addr is initialized with a non-zero value.
> 
> Buglink: https://bugs.launchpad.net/qemu/+bug/1888728
> Reported-by: John Paul Adrian Glaubitz 
> Signed-off-by: Richard Henderson 
> ---
>  linux-user/main.c | 16 ++--
>  1 file changed, 14 insertions(+), 2 deletions(-)
> 
> diff --git a/linux-user/main.c b/linux-user/main.c
> index 3597e99bb1..75c9785157 100644
> --- a/linux-user/main.c
> +++ b/linux-user/main.c
> @@ -758,14 +758,26 @@ int main(int argc, char **argv, char **envp)
>  
>  if ((fp = fopen("/proc/sys/vm/mmap_min_addr", "r")) != NULL) {
>  unsigned long tmp;
> -if (fscanf(fp, "%lu", &tmp) == 1) {
> +if (fscanf(fp, "%lu", &tmp) == 1 && tmp != 0) {
>  mmap_min_addr = tmp;
> -qemu_log_mask(CPU_LOG_PAGE, "host mmap_min_addr=0x%lx\n", 
> mmap_min_addr);
> +qemu_log_mask(CPU_LOG_PAGE, "host mmap_min_addr=0x%lx\n",
> +  mmap_min_addr);
>  }
>  fclose(fp);
>  }
>  }
>  
> +/*
> + * We prefer to not make NULL pointers accessible to QEMU.
> + * If we're in a chroot with no /proc, fall back to 1 page.
> + */
> +if (mmap_min_addr == 0) {
> +mmap_min_addr = qemu_host_page_size;
> +qemu_log_mask(CPU_LOG_PAGE,
> +  "host mmap_min_addr=0x%lx (fallback)\n",
> +  mmap_min_addr);
> +}
> +
>  /*
>   * Prepare copy of argv vector for target.
>   */
> 

Applied to my linux-user-for-5.1 branch.

Thanks,
Laurent


