Re: [PATCH] migration/block-dirty-bitmap: fix uninitialized variable warning

2020-10-12 Thread Li Qiang
Laurent Vivier  于2020年10月12日周一 下午11:33写道:
>
> Le 10/10/2020 à 13:07, Chen Qun a écrit :
> > This if statement judgment is redundant and it will cause a warning:
> >
> > migration/block-dirty-bitmap.c:1090:13: warning: ‘bitmap_name’ may be used
> >  uninitialized in this function [-Wmaybe-uninitialized]
> >  g_strlcpy(s->bitmap_name, bitmap_name, sizeof(s->bitmap_name));
> >  ^~
> >
> > Reported-by: Euler Robot 
> > Signed-off-by: Chen Qun 
> > ---
> >  migration/block-dirty-bitmap.c | 2 --
> >  1 file changed, 2 deletions(-)
> >
> > diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
> > index 5bef793ac0..e09ea4f22b 100644
> > --- a/migration/block-dirty-bitmap.c
> > +++ b/migration/block-dirty-bitmap.c
> > @@ -1084,9 +1084,7 @@ static int dirty_bitmap_load_header(QEMUFile *f, 
> > DBMLoadState *s,
> >  } else {
> >  bitmap_name = s->bitmap_alias;
> >  }
> > -}
> >
> > -if (!s->cancelled) {
> >  g_strlcpy(s->bitmap_name, bitmap_name, sizeof(s->bitmap_name));
> >  s->bitmap = bdrv_find_dirty_bitmap(s->bs, s->bitmap_name);
> >
> >
>
> I don't think it's correct as "cancel_incoming_locked(s)" can change the
> value of "s->cancelled".
>

Hi Laurent,

You're right. So I think this can simply assign 'bitmap_name' to NULL
to make compiler happy.

Thanks,
Li Qiang



> Thanks,
> Laurent
>



Re: Which qemu change corresponds to RedHat bug 1655408

2020-10-12 Thread Jakob Bohm

On 2020-10-12 13:47, Max Reitz wrote:

On 09.10.20 14:55, Jakob Bohm wrote:

On 2020-10-09 10:48, Max Reitz wrote:

On 08.10.20 18:49, Jakob Bohm wrote:

(Top posting because previous reply did so):

If the bug was closed as "can't reproduce", why was a very similar bug
listed as fixed in RHSA-2019:2553-01 ?


Hi,

Which very similar bug do you mean?  I can only guess that perhaps you
mean 1603104 or 1551486.

Bug 1603104 was about qemu not ignoring errors when releasing file locks
fails (we should ignore errors then, because they're not fatal, and we
often cannot return errors, so they ended up as aborts).  (To give more
context, this error generally appeared only when the storage the image
is on somehow disappeared while qemu is running.  E.g. when the
connection to an NFS server was lost.)

Bug 1551486 entailed a bit of a rewrite of the whole locking code, which
may have resulted in the bug 1655408 no longer appearing for our QE
team.  But it was a different bug, as it wasn’t about any error, but
just about the fact that qemu used more FDs than necessary.

(Although I see 1655408 was reported for RHEL 8, whereas 1603104 and
1551486 (as part of RHSA-2019:2553) were reported for RHEL 7.  The
corresponding RHEL 8 bug for those two is 1694148.)

Either way, both of those bugs are fixed in 5.0.


1655408 in contrast reports an error at startup; locking itself failed.
   I couldn’t reproduce it, and I still can’t; neither with the image
mounted concurrently, nor with an RO NFS mount.

(For example:

exports:
[...]/test-nfs-ro
127.0.0.1(ro,sync,no_subtree_check,fsid=0,insecure,crossmnt)

$ for i in $(seq 100); do \
  echo -e '\033[1m---\033[0m'; \
  x86_64-softmmu/qemu-system-x86_64 \
    -drive \
  if=none,id=drv0,readonly=on,file=/mnt/tmp/arch.iso,format=raw \
    -device ide-cd,drive=drv0 \
    -enable-kvm -m 2048 -display none &; \
  pid=$!; \
  sleep 1; \
  kill $pid; \
    done

(Where x86_64-softmmu/qemu-system-x86_64 is upstream 5.0.1.)

All I see is something like:

---
qemu-system-x86_64: terminating on signal 15 from pid 7278 (/bin/zsh)
[2] 34103
[3]  - 34095 terminated  x86_64-softmmu/qemu-system-x86_64 -drive
-device ide-cd,drive=drv0  -m 2048

So no file locking errors.)



The error I got was specifically "Failed to lock byte 100" and VM not
starting.  The ISO file was on a R/W NFS3 share, but was itself R/O for
the user that root was mapped to by linux-nfs-server via /etc/exports
options, specifically the file iso file was mode 0444 in a 0755
directory, and the exports line was (simplified)

/share1
:::/64(ro,sync,mp,subtree_check,anonuid=1000,anongid=1000)

where :::/64 is the numeric IPv6 prefix of the LAN

NFS kernel Server ran Debian Stretch kernel 4.19.0-0.bpo.8-amd64 #1 SMP
Debian 4.19.98-1~bpo9+1 (2020-03-09) x86_64 GNU/Linux

NFS client mount options were:

rw,nosuid,nodev,noatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,
soft,proto=tcp6,timeo=600,retrans=6,sec=sys,mountaddr=:::::xxff:fexx:,

mountvers=3,mountport=45327,mountproto=udp6,local_lock=none,addr=:::::xxff:fexx:


I’ve tried using these settings, but still can’t reproduce the bug.

Nothing else uses the image when you try to attach it to qemu, right?
(Your last email noted something about a loop mount, but I’m not sure
whether that just referred to the RH Bugzilla entry.)

(local_lock=none means that all locks are relayed to the server, correct?)

Max



Nothing else was supposed to access that ISO at the time, but at various 
times that ISO has been accessed by different virtualization systems for 
different virtual machines, and maybe something didn't release its own 
locks from much earlier (virtualization hosts tend to accumulate a lot 
of uptime).


Coordinating locking of shared disk images between multiple qemu
instances should ideally try to emulate what happens when a SCSI disk is
shared over a SAN (fibre channel, iSCSI, shared parallel SCSI bus etc.), 
so if a VM issues the SCSI lock management commands, they should behave 
as they would for real hardware.


My reference to loop mounts refers to the (common) scenario where
someone tries to mount a raw image file using both qemu and OS methods,
with the loop block driver being the traditional POSIX method that would
be invoked by not using the qemu NBD server.

My large batch job is still running...








Enjoy

Jakob
--
Jakob Bohm, CIO, Partner, WiseMo A/S.  https://www.wisemo.com
Transformervej 29, 2860 Søborg, Denmark.  Direct +45 31 13 16 10
This public discussion message is non-binding and may contain errors.
WiseMo - Remote Service Management for PCs, Phones and Embedded



[PATCH 10/10] block: check availablity for preadv/pwritev on mac

2020-10-12 Thread Joelle van Dyne
From: osy 

macOS 11/iOS 14 added preadv/pwritev APIs. Due to weak linking, configure
will succeed with CONFIG_PREADV even when targeting a lower OS version. We
therefore need to check at run time if we can actually use these APIs.

Signed-off-by: Joelle van Dyne 
---
 block/file-posix.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/block/file-posix.c b/block/file-posix.c
index cdc73b5f1d..d7482036a3 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -1393,12 +1393,24 @@ static bool preadv_present = true;
 static ssize_t
 qemu_preadv(int fd, const struct iovec *iov, int nr_iov, off_t offset)
 {
+#ifdef CONFIG_DARWIN /* preadv introduced in macOS 11 */
+if (!__builtin_available(macOS 11, iOS 14, watchOS 7, tvOS 14, *)) {
+preadv_present = false;
+return -ENOSYS;
+} else
+#endif
 return preadv(fd, iov, nr_iov, offset);
 }
 
 static ssize_t
 qemu_pwritev(int fd, const struct iovec *iov, int nr_iov, off_t offset)
 {
+#ifdef CONFIG_DARWIN /* pwritev introduced in macOS 11 */
+if (!__builtin_available(macOS 11, iOS 14, watchOS 7, tvOS 14, *)) {
+preadv_present = false;
+return -ENOSYS;
+} else
+#endif
 return pwritev(fd, iov, nr_iov, offset);
 }
 
-- 
2.24.3 (Apple Git-128)




[PATCH 03/10] qemu: add support for iOS host

2020-10-12 Thread Joelle van Dyne
From: osy 

This introduces support for building for iOS hosts. When the correct Xcode
toolchain is used, iOS host will be detected automatically.

block: disable features not supported by iOS sandbox
slirp: disable SMB features for iOS
target: disable system() calls for iOS
tcg: use sys_icache_invalidate() instead of GCC builtin for iOS
tests: disable tests on iOS which uses system()
Signed-off-by: Joelle van Dyne 
---
 block.c   |  2 +-
 block/file-posix.c| 30 ---
 configure | 43 ++-
 meson.build   |  2 +-
 net/slirp.c   | 16 +++
 qga/commands-posix.c  |  6 ++
 target/arm/arm-semi.c |  2 ++
 target/m68k/m68k-semi.c   |  2 ++
 target/nios2/nios2-semi.c |  2 ++
 tcg/aarch64/tcg-target.h  | 10 +
 tests/qtest/meson.build   |  7 +++
 11 files changed, 95 insertions(+), 27 deletions(-)

diff --git a/block.c b/block.c
index 430edf79bb..5d49869d02 100644
--- a/block.c
+++ b/block.c
@@ -53,7 +53,7 @@
 #ifdef CONFIG_BSD
 #include 
 #include 
-#ifndef __DragonFly__
+#if !defined(__DragonFly__) && !defined(CONFIG_IOS)
 #include 
 #endif
 #endif
diff --git a/block/file-posix.c b/block/file-posix.c
index 52f7c20525..cdc73b5f1d 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -181,7 +181,16 @@ typedef struct BDRVRawReopenState {
 bool check_cache_dropped;
 } BDRVRawReopenState;
 
-static int fd_open(BlockDriverState *bs);
+static int fd_open(BlockDriverState *bs)
+{
+BDRVRawState *s = bs->opaque;
+
+/* this is just to ensure s->fd is sane (its called by io ops) */
+if (s->fd >= 0)
+return 0;
+return -EIO;
+}
+
 static int64_t raw_getlength(BlockDriverState *bs);
 
 typedef struct RawPosixAIOData {
@@ -252,6 +261,12 @@ static int raw_normalize_devicepath(const char **filename, 
Error **errp)
 }
 #endif
 
+#if defined(CONFIG_IOS)
+static int probe_logical_blocksize(int fd, unsigned int *sector_size_p)
+{
+return -ENOTSUP; /* not supported on iOS */
+}
+#else /* CONFIG_IOS */
 /*
  * Get logical block size via ioctl. On success store it in @sector_size_p.
  */
@@ -284,6 +299,7 @@ static int probe_logical_blocksize(int fd, unsigned int 
*sector_size_p)
 
 return success ? 0 : -errno;
 }
+#endif /* !CONFIG_IOS */
 
 /**
  * Get physical block size of @fd.
@@ -2306,7 +2322,7 @@ again:
 }
 if (size == 0)
 #endif
-#if defined(__APPLE__) && defined(__MACH__)
+#if !defined(CONFIG_IOS) && defined(__APPLE__) && defined(__MACH__)
 {
 uint64_t sectors = 0;
 uint32_t sector_size = 0;
@@ -3541,16 +3557,6 @@ hdev_co_ioctl(BlockDriverState *bs, unsigned long int 
req, void *buf)
 }
 #endif /* linux */
 
-static int fd_open(BlockDriverState *bs)
-{
-BDRVRawState *s = bs->opaque;
-
-/* this is just to ensure s->fd is sane (its called by io ops) */
-if (s->fd >= 0)
-return 0;
-return -EIO;
-}
-
 static coroutine_fn int
 hdev_co_pdiscard(BlockDriverState *bs, int64_t offset, int bytes)
 {
diff --git a/configure b/configure
index 46d5db63e8..c474d7c221 100755
--- a/configure
+++ b/configure
@@ -561,6 +561,19 @@ EOF
   compile_object
 }
 
+check_ios() {
+  cat > $TMPC < $TMPC <
@@ -603,7 +616,11 @@ elif check_define __DragonFly__ ; then
 elif check_define __NetBSD__; then
   targetos='NetBSD'
 elif check_define __APPLE__; then
-  targetos='Darwin'
+  if check_ios ; then
+targetos='iOS'
+  else
+targetos='Darwin'
+  fi
 else
   # This is a fatal error, but don't report it yet, because we
   # might be going to just print the --help text, or it might
@@ -780,6 +797,22 @@ Darwin)
   # won't work when we're compiling with gcc as a C compiler.
   QEMU_CFLAGS="-DOS_OBJECT_USE_OBJC=0 $QEMU_CFLAGS"
 ;;
+iOS)
+  bsd="yes"
+  darwin="yes"
+  ios="yes"
+  if [ "$cpu" = "x86_64" ] ; then
+QEMU_CFLAGS="-arch x86_64 $QEMU_CFLAGS"
+QEMU_LDFLAGS="-arch x86_64 $QEMU_LDFLAGS"
+  fi
+  host_block_device_support="no"
+  audio_drv_list=""
+  audio_possible_drivers=""
+  QEMU_LDFLAGS="-framework CoreFoundation $QEMU_LDFLAGS"
+  # Disable attempts to use ObjectiveC features in os/object.h since they
+  # won't work when we're compiling with gcc as a C compiler.
+  QEMU_CFLAGS="-DOS_OBJECT_USE_OBJC=0 $QEMU_CFLAGS"
+;;
 SunOS)
   solaris="yes"
   make="${MAKE-gmake}"
@@ -6162,6 +6195,10 @@ if test "$darwin" = "yes" ; then
   echo "CONFIG_DARWIN=y" >> $config_host_mak
 fi
 
+if test "$ios" = "yes" ; then
+  echo "CONFIG_IOS=y" >> $config_host_mak
+fi
+
 if test "$solaris" = "yes" ; then
   echo "CONFIG_SOLARIS=y" >> $config_host_mak
 fi
@@ -7166,6 +7203,7 @@ echo "cpp_link_args = [${LDFLAGS:+$(meson_quote 
$LDFLAGS)}]" >> $cross
 echo "[binaries]" >> $cross
 echo "c = [$(meson_quote $cc)]" >> $cross
 test -n "$cxx" && echo "cpp = [$(meson_quote $cxx)]" >> $cross
+test -n "$objcc" && echo "objc = [$(meson_quote $objcc)]" >> $cross
 echo "ar = [$(meson_quote $ar)]" >> $cross
 

[PATCH 01/10] configure: option to disable host block devices

2020-10-12 Thread Joelle van Dyne
From: osy 

Some hosts (iOS) have a sandboxed filesystem and do not provide low-level
APIs for interfacing with host block devices.

Signed-off-by: Joelle van Dyne 
---
 block/file-posix.c | 8 +++-
 configure  | 4 
 meson.build| 1 +
 3 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index c63926d592..52f7c20525 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -41,7 +41,7 @@
 #include "scsi/pr-manager.h"
 #include "scsi/constants.h"
 
-#if defined(__APPLE__) && (__MACH__)
+#if defined(CONFIG_HOST_BLOCK_DEVICE) && defined(__APPLE__) && (__MACH__)
 #include 
 #include 
 #include 
@@ -3247,6 +3247,8 @@ BlockDriver bdrv_file = {
 /***/
 /* host device */
 
+#if defined(CONFIG_HOST_BLOCK_DEVICE)
+
 #if defined(__APPLE__) && defined(__MACH__)
 static kern_return_t GetBSDPath(io_iterator_t mediaIterator, char *bsdPath,
 CFIndex maxPathSize, int flags);
@@ -3872,6 +3874,8 @@ static BlockDriver bdrv_host_cdrom = {
 };
 #endif /* __FreeBSD__ */
 
+#endif /* CONFIG_HOST_BLOCK_DEVICE */
+
 static void bdrv_file_init(void)
 {
 /*
@@ -3879,6 +3883,7 @@ static void bdrv_file_init(void)
  * registered last will get probed first.
  */
 bdrv_register(_file);
+#if defined(CONFIG_HOST_BLOCK_DEVICE)
 bdrv_register(_host_device);
 #ifdef __linux__
 bdrv_register(_host_cdrom);
@@ -3886,6 +3891,7 @@ static void bdrv_file_init(void)
 #if defined(__FreeBSD__) || defined(__FreeBSD_kernel__)
 bdrv_register(_host_cdrom);
 #endif
+#endif /* CONFIG_HOST_BLOCK_DEVICE */
 }
 
 block_init(bdrv_file_init);
diff --git a/configure b/configure
index b553288c5e..3c63879750 100755
--- a/configure
+++ b/configure
@@ -446,6 +446,7 @@ meson=""
 ninja=""
 skip_meson=no
 gettext=""
+host_block_device_support="yes"
 
 bogus_os="no"
 malloc_trim="auto"
@@ -6098,6 +6099,9 @@ if test "$default_devices" = "yes" ; then
 else
   echo "CONFIG_MINIKCONF_MODE=--allnoconfig" >> $config_host_mak
 fi
+if test "$host_block_device_support" = "yes" ; then
+  echo "CONFIG_HOST_BLOCK_DEVICE=y" >> $config_host_mak
+fi
 if test "$debug_tcg" = "yes" ; then
   echo "CONFIG_DEBUG_TCG=y" >> $config_host_mak
 fi
diff --git a/meson.build b/meson.build
index 17c89c87c6..5d3a47784b 100644
--- a/meson.build
+++ b/meson.build
@@ -1947,6 +1947,7 @@ summary_info += {'vvfat support': 
config_host.has_key('CONFIG_VVFAT')}
 summary_info += {'qed support':   config_host.has_key('CONFIG_QED')}
 summary_info += {'parallels support': config_host.has_key('CONFIG_PARALLELS')}
 summary_info += {'sheepdog support':  config_host.has_key('CONFIG_SHEEPDOG')}
+summary_info += {'host block dev support': 
config_host.has_key('CONFIG_HOST_BLOCK_DEVICE')}
 summary_info += {'capstone':  capstone_opt == 'disabled' ? false : 
capstone_opt}
 summary_info += {'libpmem support':   config_host.has_key('CONFIG_LIBPMEM')}
 summary_info += {'libdaxctl support': config_host.has_key('CONFIG_LIBDAXCTL')}
-- 
2.24.3 (Apple Git-128)




[PATCH v2 1/2] block: Fixes nfs compiling error on msys2/mingw

2020-10-12 Thread Yonggang Luo
These compiling errors are fixed:
../block/nfs.c:27:10: fatal error: poll.h: No such file or directory
   27 | #include 
  |  ^~~~
compilation terminated.

../block/nfs.c:63:5: error: unknown type name 'blkcnt_t'
   63 | blkcnt_t st_blocks;
  | ^~~~
../block/nfs.c: In function 'nfs_client_open':
../block/nfs.c:550:27: error: 'struct _stat64' has no member named 'st_blocks'
  550 | client->st_blocks = st.st_blocks;
  |   ^
../block/nfs.c: In function 'nfs_get_allocated_file_size':
../block/nfs.c:751:41: error: 'struct _stat64' has no member named 'st_blocks'
  751 | return (task.ret < 0 ? task.ret : st.st_blocks * 512);
  | ^
../block/nfs.c: In function 'nfs_reopen_prepare':
../block/nfs.c:805:31: error: 'struct _stat64' has no member named 'st_blocks'
  805 | client->st_blocks = st.st_blocks;
  |   ^
../block/nfs.c: In function 'nfs_get_allocated_file_size':
../block/nfs.c:752:1: error: control reaches end of non-void function 
[-Werror=return-type]
  752 | }
  | ^

On msys2/mingw, there is no st_blocks in struct _stat64 yet, we disable the 
usage of it
on msys2/mingw, and create a typedef long long blkcnt_t; for further 
implementation

Signed-off-by: Yonggang Luo 
---
 block/nfs.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/block/nfs.c b/block/nfs.c
index f86e660374..cf8795fb49 100644
--- a/block/nfs.c
+++ b/block/nfs.c
@@ -24,7 +24,9 @@
 
 #include "qemu/osdep.h"
 
+#if !defined(_WIN32)
 #include 
+#endif
 #include "qemu/config-file.h"
 #include "qemu/error-report.h"
 #include "qapi/error.h"
@@ -51,6 +53,10 @@
 #define QEMU_NFS_MAX_PAGECACHE_SIZE (8388608 / NFS_BLKSIZE)
 #define QEMU_NFS_MAX_DEBUG_LEVEL 2
 
+#if defined(_WIN32)
+typedef long long blkcnt_t;
+#endif
+
 typedef struct NFSClient {
 struct nfs_context *context;
 struct nfsfh *fh;
@@ -545,7 +551,9 @@ static int64_t nfs_client_open(NFSClient *client, 
BlockdevOptionsNfs *opts,
 }
 
 ret = DIV_ROUND_UP(st.st_size, BDRV_SECTOR_SIZE);
+#if !defined(_WIN32)
 client->st_blocks = st.st_blocks;
+#endif
 client->has_zero_init = S_ISREG(st.st_mode);
 *strp = '/';
 goto out;
@@ -706,6 +714,7 @@ static int nfs_has_zero_init(BlockDriverState *bs)
 return client->has_zero_init;
 }
 
+#if !defined(_WIN32)
 /* Called (via nfs_service) with QemuMutex held.  */
 static void
 nfs_get_allocated_file_size_cb(int ret, struct nfs_context *nfs, void *data,
@@ -748,6 +757,7 @@ static int64_t nfs_get_allocated_file_size(BlockDriverState 
*bs)
 
 return (task.ret < 0 ? task.ret : st.st_blocks * 512);
 }
+#endif
 
 static int coroutine_fn
 nfs_file_co_truncate(BlockDriverState *bs, int64_t offset, bool exact,
@@ -800,7 +810,9 @@ static int nfs_reopen_prepare(BDRVReopenState *state,
nfs_get_error(client->context));
 return ret;
 }
+#if !defined(_WIN32)
 client->st_blocks = st.st_blocks;
+#endif
 }
 
 return 0;
@@ -869,7 +881,10 @@ static BlockDriver bdrv_nfs = {
 .create_opts= _create_opts,
 
 .bdrv_has_zero_init = nfs_has_zero_init,
+/* libnfs does not provide the allocated filesize of a file on win32. */
+#if !defined(_WIN32)
 .bdrv_get_allocated_file_size   = nfs_get_allocated_file_size,
+#endif
 .bdrv_co_truncate   = nfs_file_co_truncate,
 
 .bdrv_file_open = nfs_file_open,
-- 
2.28.0.windows.1




[PATCH v2 2/2] block: enable libnfs on msys2/mingw in cirrus.yml

2020-10-12 Thread Yonggang Luo
At the begging libnfs are not enabled because of compiling error,
now it's fixed so enable it

Signed-off-by: Yonggang Luo 
---
 .cirrus.yml | 1 +
 1 file changed, 1 insertion(+)

diff --git a/.cirrus.yml b/.cirrus.yml
index f42ccb956a..2c6bf45e6d 100644
--- a/.cirrus.yml
+++ b/.cirrus.yml
@@ -109,6 +109,7 @@ windows_msys2_task:
   mingw-w64-x86_64-cyrus-sasl \
   mingw-w64-x86_64-curl \
   mingw-w64-x86_64-gnutls \
+  mingw-w64-x86_64-libnfs \
   "
 bitsadmin /transfer msys_download /dynamic /download /priority 
FOREGROUND `
   
https://repo.msys2.org/mingw/x86_64/mingw-w64-x86_64-python-sphinx-2.3.1-1-any.pkg.tar.xz
 `
-- 
2.28.0.windows.1




[PATCH v2 0/2] Fixes building nfs on msys2/mingw

2020-10-12 Thread Yonggang Luo
V1-V2
Apply suggestion from  Peter Lieven

Yonggang Luo (2):
  block: Fixes nfs compiling error on msys2/mingw
  block: enable libnfs on msys2/mingw in cirrus.yml

 .cirrus.yml |  1 +
 block/nfs.c | 15 +++
 2 files changed, 16 insertions(+)

-- 
2.28.0.windows.1




Re: [PULL v2 00/30] Block patches

2020-10-12 Thread Eric Blake

On 10/12/20 4:48 PM, Peter Maydell wrote:



Build failures, OSX and the BSDs:



I'll let you find and fix those...


Build failure, Windows:

../../qemu-nbd.c:158:5: error: "CONFIG_POSIX" is not defined [-Werror=undef]
  #if CONFIG_POSIX
  ^



but this one is easy.  In 22/30 block: move block exports to 
libblockdev, replace '#if CONFIG_POSIX' with '#ifdef CONFIG_POSIX' 
around qemu_system_killed().



'make check' failure, s390x and ppc64 (ie big-endian host):

ERROR:../../tests/qtest/vhost-user-blk-test.c:448:idx: assertion
failed (capacity == TEST_IMAGE_SIZE / 512): (219902322 == 131072)
ERROR qtest-i386: qos-test - Bail out!
ERROR:../../tests/qtest/vhost-user-blk-test.c:448:idx: assertion
failed (capacity == TEST_IMAGE_SIZE / 512): (219902322 == 131072)

I also got this on aarch64 host:

ERROR:../../tests/qtest/boot-sector.c:161:boot_sector_test: assertion
failed (signature == SIGNATURE): (0x == 0xdead)
ERROR qtest-s390x: cdrom-test - Bail out!
ERROR:../../tests/qtest/boot-sector.c:161:boot_sector_test: assertion
failed (signature == SIGNATURE): (0x == 0xdead)

but it might be an unrelated intermittent.

Also a hang on x86-64 host running qtest-ppc64 qos-test, ditto.

thanks
-- PMM



--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PULL v2 00/30] Block patches

2020-10-12 Thread Peter Maydell
On Mon, 12 Oct 2020 at 19:28, Stefan Hajnoczi  wrote:
>
> The following changes since commit 2387df497b4b4bcf754eb7398edca82889e2ef54:
>
>   Merge remote-tracking branch 'remotes/armbru/tags/pull-qapi-2020-10-10' 
> into staging (2020-10-12 11:29:42 +0100)
>
> are available in the Git repository at:
>
>   https://gitlab.com/stefanha/qemu.git tags/block-pull-request
>
> for you to fetch changes up to 3664ec6bbe236126b79d251d4037889e7181ab55:
>
>   iotests: add commit top->base cases to 274 (2020-10-12 16:47:58 +0100)
>
> 
> Pull request
>
> v2:
>  * Rebase and resolve conflict with commit 029a88c9a7e3 ("qemu-nbd: Honor
>SIGINT and SIGHUP") [Peter]
>

Build failures, OSX and the BSDs:

Compiling C object
contrib/libvhost-user/libvhost-user.a.p/libvhost-user-glib.c.o
../../contrib/libvhost-user/libvhost-user.c:27:10: fatal error:
'sys/eventfd.h' file not found
#include 
 ^~~
In file included from ../../contrib/libvhost-user/libvhost-user-glib.c:17:
In file included from ../../contrib/libvhost-user/libvhost-user-glib.h:19:
../../contrib/libvhost-user/libvhost-user.h:21:10: fatal error:
'linux/vhost.h' file not found
#include 
 ^~~
1 error generated.


Build failure, 32-bit:

../../util/vhost-user-server.c: In function 'vu_message_read':
../../util/vhost-user-server.c:119:30: error: format '%lu' expects
argument of type 'long unsigned int', but argument 3 has type 'size_t
{aka unsigned int}' [-Werror=format=]
 error_report("A maximum of %zu fds are allowed, "
  ^~~~
../../util/vhost-user-server.c:121:39:
  max_fds, vmsg->fd_num + nfds);
   ~~~
../../util/vhost-user-server.c:120:45: note: format string is defined here
  "however got %lu fds now",
   ~~^
   %u

(you might also want to wordsmith the English in that error message)

Build failure, Windows:

../../qemu-nbd.c:158:5: error: "CONFIG_POSIX" is not defined [-Werror=undef]
 #if CONFIG_POSIX
 ^

'make check' failure, s390x and ppc64 (ie big-endian host):

ERROR:../../tests/qtest/vhost-user-blk-test.c:448:idx: assertion
failed (capacity == TEST_IMAGE_SIZE / 512): (219902322 == 131072)
ERROR qtest-i386: qos-test - Bail out!
ERROR:../../tests/qtest/vhost-user-blk-test.c:448:idx: assertion
failed (capacity == TEST_IMAGE_SIZE / 512): (219902322 == 131072)

I also got this on aarch64 host:

ERROR:../../tests/qtest/boot-sector.c:161:boot_sector_test: assertion
failed (signature == SIGNATURE): (0x == 0xdead)
ERROR qtest-s390x: cdrom-test - Bail out!
ERROR:../../tests/qtest/boot-sector.c:161:boot_sector_test: assertion
failed (signature == SIGNATURE): (0x == 0xdead)

but it might be an unrelated intermittent.

Also a hang on x86-64 host running qtest-ppc64 qos-test, ditto.

thanks
-- PMM



Re: [PULL v2 00/30] Block patches

2020-10-12 Thread no-reply
Patchew URL: 
https://patchew.org/QEMU/20201012182800.157697-1-stefa...@redhat.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 20201012182800.157697-1-stefa...@redhat.com
Subject: [PULL v2 00/30] Block patches

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag] patchew/20201012182800.157697-1-stefa...@redhat.com -> 
patchew/20201012182800.157697-1-stefa...@redhat.com
Switched to a new branch 'test'
0a960f8 iotests: add commit top->base cases to 274
2904098 block/io: fix bdrv_is_allocated_above
d4a7d3e block/io: bdrv_common_block_status_above: support bs == base
b49dc55 block/io: bdrv_common_block_status_above: support include_base
d855c38 block/io: fix bdrv_co_block_status_above
617ab32 tests/qtest: add multi-queue test case to vhost-user-blk-test
1ba3da8 block/export: add vhost-user-blk multi-queue support
c9cda63 block/export: add iothread and fixed-iothread options
8d7ed6c block: move block exports to libblockdev
c68e244 qemu-storage-daemon: avoid compiling blockdev_ss twice
05df47a util/vhost-user-server: use static library in meson.build
55e251f util/vhost-user-server: move header to include/
b1d101e block/export: convert vhost-user-blk server to block export API
21c3c13 block/export: report flush errors
559e3b3 util/vhost-user-server: rework vu_client_trip() coroutine lifecycle
c4ccc02 util/vhost-user-server: check EOF when reading payload
db49ac1 util/vhost-user-server: fix memory leak in vu_message_read()
5b01ca0 util/vhost-user-server: drop unused DevicePanicNotifier
1aab372 block/export: consolidate request structs into VuBlockReq
1df6ecd util/vhost-user-server: drop unnecessary watch deletion
a8933a8 util/vhost-user-server: drop unnecessary QOM cast
bd4edb0 util/vhost-user-server: s/fileds/fields/ typo fix
bec8cbf MAINTAINERS: Add vhost-user block device backend server maintainer
de2268c test: new qTest case to test the vhost-user-blk-server
ec6d467 block/export: vhost-user block device backend server
6ac349d block: move logical block size check function to a common utility 
function
9036dfc util/vhost-user-server: generic vhost user server
a70c2f0 libvhost-user: remove watch for kick_fd when de-initialize vu-dev
cea6c96 libvhost-user: Allow vu_message_read to be replaced
5023e8a block/nvme: Add driver statistics for access alignment and hw errors

=== OUTPUT BEGIN ===
1/30 Checking commit 5023e8a8bf04 (block/nvme: Add driver statistics for access 
alignment and hw errors)
2/30 Checking commit cea6c96deb4e (libvhost-user: Allow vu_message_read to be 
replaced)
WARNING: Block comments use a leading /* on a separate line
#130: FILE: contrib/libvhost-user/libvhost-user.h:395:
+/* @read_msg: custom method to read vhost-user message

total: 0 errors, 1 warnings, 139 lines checked

Patch 2/30 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
3/30 Checking commit a70c2f01bb3f (libvhost-user: remove watch for kick_fd when 
de-initialize vu-dev)
4/30 Checking commit 9036dfc2c475 (util/vhost-user-server: generic vhost user 
server)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#33: 
new file mode 100644

WARNING: line over 80 characters
#85: FILE: util/vhost-user-server.c:48:
+/* When this is set vu_client_trip will stop new processing vhost-user 
message */

total: 0 errors, 2 warnings, 500 lines checked

Patch 4/30 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
5/30 Checking commit 6ac349dc1fbf (block: move logical block size check 
function to a common utility function)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#85: 
new file mode 100644

total: 0 errors, 1 warnings, 129 lines checked

Patch 5/30 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
6/30 Checking commit ec6d467efa05 (block/export: vhost-user block device 
backend server)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#28: 
new file mode 100644

WARNING: line over 80 characters
#474: FILE: block/export/vhost-user-blk-server.c:442:
+blk_remove_aio_context_notifier(vu_block_device->backend, 
blk_aio_attached,

ERROR: g_free(NULL) is safe this check is probably not required
#534: FILE: block/export/vhost-user-blk-server.c:502:
+if (vus->node_name) {
+g_free(vus->node_name);

total: 1 errors, 2 warnings, 714 lines checked

Patch 6/30 has style problems, 

Overlay limit bug

2020-10-12 Thread Yoonho Park
I stumbled on a bug in qemu 4.2.0 (virsh 6.0.0) with a large number of
overlays. I am using "qemu-img create" and "virsh snapshot-create-as" to
create each overlay. When I run "virsh snapshot-create-as" for the 42nd
overlay, I get "error: No complete monitor response found in 10485760
bytes: Numerical result out of range". However, I pulled down qemu 5.1.50
(still using virsh 6.0.0), and it looks like the problem has disappeared
which is great. Does anyone know the patch set that addressed this bug?
Also, does anyone know the "official" limit on the number of overlays that
can be created and is there a qemu test that exercises this? I could not
find an overlay limit test in tests/qemu-iotests.


[PULL v2 26/30] block/io: fix bdrv_co_block_status_above

2020-10-12 Thread Stefan Hajnoczi
From: Vladimir Sementsov-Ogievskiy 

bdrv_co_block_status_above has several design problems with handling
short backing files:

1. With want_zeros=true, it may return ret with BDRV_BLOCK_ZERO but
without BDRV_BLOCK_ALLOCATED flag, when actually short backing file
which produces these after-EOF zeros is inside requested backing
sequence.

2. With want_zero=false, it may return pnum=0 prior to actual EOF,
because of EOF of short backing file.

Fix these things, making logic about short backing files clearer.

With fixed bdrv_block_status_above we also have to improve is_zero in
qcow2 code, otherwise iotest 154 will fail, because with this patch we
stop to merge zeros of different types (produced by fully unallocated
in the whole backing chain regions vs produced by short backing files).

Note also, that this patch leaves for another day the general problem
around block-status: misuse of BDRV_BLOCK_ALLOCATED as is-fs-allocated
vs go-to-backing.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Alberto Garcia 
Reviewed-by: Eric Blake 
Message-id: 20200924194003.22080-2-vsement...@virtuozzo.com
[Fix s/comes/come/ as suggested by Eric Blake
--Stefan]
Signed-off-by: Stefan Hajnoczi 
---
 block/io.c| 68 ---
 block/qcow2.c | 16 ++--
 2 files changed, 68 insertions(+), 16 deletions(-)

diff --git a/block/io.c b/block/io.c
index 54f0968aee..a718d50ca2 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2350,34 +2350,74 @@ bdrv_co_common_block_status_above(BlockDriverState *bs,
   int64_t *map,
   BlockDriverState **file)
 {
+int ret;
 BlockDriverState *p;
-int ret = 0;
-bool first = true;
+int64_t eof = 0;
 
 assert(bs != base);
-for (p = bs; p != base; p = bdrv_filter_or_cow_bs(p)) {
+
+ret = bdrv_co_block_status(bs, want_zero, offset, bytes, pnum, map, file);
+if (ret < 0 || *pnum == 0 || ret & BDRV_BLOCK_ALLOCATED) {
+return ret;
+}
+
+if (ret & BDRV_BLOCK_EOF) {
+eof = offset + *pnum;
+}
+
+assert(*pnum <= bytes);
+bytes = *pnum;
+
+for (p = bdrv_filter_or_cow_bs(bs); p != base;
+ p = bdrv_filter_or_cow_bs(p))
+{
 ret = bdrv_co_block_status(p, want_zero, offset, bytes, pnum, map,
file);
 if (ret < 0) {
-break;
+return ret;
 }
-if (ret & BDRV_BLOCK_ZERO && ret & BDRV_BLOCK_EOF && !first) {
+if (*pnum == 0) {
 /*
- * Reading beyond the end of the file continues to read
- * zeroes, but we can only widen the result to the
- * unallocated length we learned from an earlier
- * iteration.
+ * The top layer deferred to this layer, and because this layer is
+ * short, any zeroes that we synthesize beyond EOF behave as if 
they
+ * were allocated at this layer.
+ *
+ * We don't include BDRV_BLOCK_EOF into ret, as upper layer may be
+ * larger. We'll add BDRV_BLOCK_EOF if needed at function end, see
+ * below.
  */
+assert(ret & BDRV_BLOCK_EOF);
 *pnum = bytes;
+if (file) {
+*file = p;
+}
+ret = BDRV_BLOCK_ZERO | BDRV_BLOCK_ALLOCATED;
+break;
 }
-if (ret & (BDRV_BLOCK_ZERO | BDRV_BLOCK_DATA)) {
+if (ret & BDRV_BLOCK_ALLOCATED) {
+/*
+ * We've found the node and the status, we must break.
+ *
+ * Drop BDRV_BLOCK_EOF, as it's not for upper layer, which may be
+ * larger. We'll add BDRV_BLOCK_EOF if needed at function end, see
+ * below.
+ */
+ret &= ~BDRV_BLOCK_EOF;
 break;
 }
-/* [offset, pnum] unallocated on this layer, which could be only
- * the first part of [offset, bytes].  */
-bytes = MIN(bytes, *pnum);
-first = false;
+
+/*
+ * OK, [offset, offset + *pnum) region is unallocated on this layer,
+ * let's continue the diving.
+ */
+assert(*pnum <= bytes);
+bytes = *pnum;
+}
+
+if (offset + *pnum == eof) {
+ret |= BDRV_BLOCK_EOF;
 }
+
 return ret;
 }
 
diff --git a/block/qcow2.c b/block/qcow2.c
index b05512718c..b6cb4db8bb 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -3860,8 +3860,20 @@ static bool is_zero(BlockDriverState *bs, int64_t 
offset, int64_t bytes)
 if (!bytes) {
 return true;
 }
-res = bdrv_block_status_above(bs, NULL, offset, bytes, , NULL, NULL);
-return res >= 0 && (res & BDRV_BLOCK_ZERO) && nr == bytes;
+
+/*
+ * bdrv_block_status_above doesn't merge different types of zeros, for
+ * example, zeros which come from the region which is unallocated in
+ * the whole backing 

[PULL v2 30/30] iotests: add commit top->base cases to 274

2020-10-12 Thread Stefan Hajnoczi
From: Vladimir Sementsov-Ogievskiy 

These cases are fixed by previous patches around block_status and
is_allocated.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Eric Blake 
Reviewed-by: Alberto Garcia 
Message-id: 20200924194003.22080-6-vsement...@virtuozzo.com
Signed-off-by: Stefan Hajnoczi 
---
 tests/qemu-iotests/274 | 20 +++
 tests/qemu-iotests/274.out | 68 ++
 2 files changed, 88 insertions(+)

diff --git a/tests/qemu-iotests/274 b/tests/qemu-iotests/274
index d4571c5465..76b1ba6a52 100755
--- a/tests/qemu-iotests/274
+++ b/tests/qemu-iotests/274
@@ -115,6 +115,26 @@ with iotests.FilePath('base') as base, \
 iotests.qemu_io_log('-c', 'read -P 1 0 %d' % size_short, mid)
 iotests.qemu_io_log('-c', 'read -P 0 %d %d' % (size_short, size_diff), mid)
 
+iotests.log('=== Testing qemu-img commit (top -> base) ===')
+
+create_chain()
+iotests.qemu_img_log('commit', '-b', base, top)
+iotests.img_info_log(base)
+iotests.qemu_io_log('-c', 'read -P 1 0 %d' % size_short, base)
+iotests.qemu_io_log('-c', 'read -P 0 %d %d' % (size_short, size_diff), 
base)
+
+iotests.log('=== Testing QMP active commit (top -> base) ===')
+
+create_chain()
+with create_vm() as vm:
+vm.launch()
+vm.qmp_log('block-commit', device='top', base_node='base',
+   job_id='job0', auto_dismiss=False)
+vm.run_job('job0', wait=5)
+
+iotests.img_info_log(mid)
+iotests.qemu_io_log('-c', 'read -P 1 0 %d' % size_short, base)
+iotests.qemu_io_log('-c', 'read -P 0 %d %d' % (size_short, size_diff), 
base)
 
 iotests.log('== Resize tests ==')
 
diff --git a/tests/qemu-iotests/274.out b/tests/qemu-iotests/274.out
index bf5abd4c10..cfe17a8659 100644
--- a/tests/qemu-iotests/274.out
+++ b/tests/qemu-iotests/274.out
@@ -135,6 +135,74 @@ read 1048576/1048576 bytes at offset 0
 read 1048576/1048576 bytes at offset 1048576
 1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 
+=== Testing qemu-img commit (top -> base) ===
+Formatting 'TEST_DIR/PID-base', fmt=qcow2 cluster_size=65536 extended_l2=off 
compression_type=zlib size=2097152 lazy_refcounts=off refcount_bits=16
+
+Formatting 'TEST_DIR/PID-mid', fmt=qcow2 cluster_size=65536 extended_l2=off 
compression_type=zlib size=1048576 backing_file=TEST_DIR/PID-base 
backing_fmt=qcow2 lazy_refcounts=off refcount_bits=16
+
+Formatting 'TEST_DIR/PID-top', fmt=qcow2 cluster_size=65536 extended_l2=off 
compression_type=zlib size=2097152 backing_file=TEST_DIR/PID-mid 
backing_fmt=qcow2 lazy_refcounts=off refcount_bits=16
+
+wrote 2097152/2097152 bytes at offset 0
+2 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+Image committed.
+
+image: TEST_IMG
+file format: IMGFMT
+virtual size: 2 MiB (2097152 bytes)
+cluster_size: 65536
+Format specific information:
+compat: 1.1
+compression type: zlib
+lazy refcounts: false
+refcount bits: 16
+corrupt: false
+extended l2: false
+
+read 1048576/1048576 bytes at offset 0
+1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+read 1048576/1048576 bytes at offset 1048576
+1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+=== Testing QMP active commit (top -> base) ===
+Formatting 'TEST_DIR/PID-base', fmt=qcow2 cluster_size=65536 extended_l2=off 
compression_type=zlib size=2097152 lazy_refcounts=off refcount_bits=16
+
+Formatting 'TEST_DIR/PID-mid', fmt=qcow2 cluster_size=65536 extended_l2=off 
compression_type=zlib size=1048576 backing_file=TEST_DIR/PID-base 
backing_fmt=qcow2 lazy_refcounts=off refcount_bits=16
+
+Formatting 'TEST_DIR/PID-top', fmt=qcow2 cluster_size=65536 extended_l2=off 
compression_type=zlib size=2097152 backing_file=TEST_DIR/PID-mid 
backing_fmt=qcow2 lazy_refcounts=off refcount_bits=16
+
+wrote 2097152/2097152 bytes at offset 0
+2 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+{"execute": "block-commit", "arguments": {"auto-dismiss": false, "base-node": 
"base", "device": "top", "job-id": "job0"}}
+{"return": {}}
+{"execute": "job-complete", "arguments": {"id": "job0"}}
+{"return": {}}
+{"data": {"device": "job0", "len": 1048576, "offset": 1048576, "speed": 0, 
"type": "commit"}, "event": "BLOCK_JOB_READY", "timestamp": {"microseconds": 
"USECS", "seconds": "SECS"}}
+{"data": {"device": "job0", "len": 1048576, "offset": 1048576, "speed": 0, 
"type": "commit"}, "event": "BLOCK_JOB_COMPLETED", "timestamp": 
{"microseconds": "USECS", "seconds": "SECS"}}
+{"execute": "job-dismiss", "arguments": {"id": "job0"}}
+{"return": {}}
+image: TEST_IMG
+file format: IMGFMT
+virtual size: 1 MiB (1048576 bytes)
+cluster_size: 65536
+backing file: TEST_DIR/PID-base
+backing file format: IMGFMT
+Format specific information:
+compat: 1.1
+compression type: zlib
+lazy refcounts: false
+refcount bits: 16
+corrupt: false
+extended l2: false
+
+read 1048576/1048576 bytes at offset 0
+1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+

[PULL v2 27/30] block/io: bdrv_common_block_status_above: support include_base

2020-10-12 Thread Stefan Hajnoczi
From: Vladimir Sementsov-Ogievskiy 

In order to reuse bdrv_common_block_status_above in
bdrv_is_allocated_above, let's support include_base parameter.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Alberto Garcia 
Reviewed-by: Eric Blake 
Message-id: 20200924194003.22080-3-vsement...@virtuozzo.com
Signed-off-by: Stefan Hajnoczi 
---
 block/coroutines.h |  2 ++
 block/io.c | 21 ++---
 2 files changed, 16 insertions(+), 7 deletions(-)

diff --git a/block/coroutines.h b/block/coroutines.h
index f69179f5ef..1cb3128b94 100644
--- a/block/coroutines.h
+++ b/block/coroutines.h
@@ -41,6 +41,7 @@ bdrv_pwritev(BdrvChild *child, int64_t offset, unsigned int 
bytes,
 int coroutine_fn
 bdrv_co_common_block_status_above(BlockDriverState *bs,
   BlockDriverState *base,
+  bool include_base,
   bool want_zero,
   int64_t offset,
   int64_t bytes,
@@ -50,6 +51,7 @@ bdrv_co_common_block_status_above(BlockDriverState *bs,
 int generated_co_wrapper
 bdrv_common_block_status_above(BlockDriverState *bs,
BlockDriverState *base,
+   bool include_base,
bool want_zero,
int64_t offset,
int64_t bytes,
diff --git a/block/io.c b/block/io.c
index a718d50ca2..86f76d04bf 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2343,6 +2343,7 @@ early_out:
 int coroutine_fn
 bdrv_co_common_block_status_above(BlockDriverState *bs,
   BlockDriverState *base,
+  bool include_base,
   bool want_zero,
   int64_t offset,
   int64_t bytes,
@@ -2354,10 +2355,11 @@ bdrv_co_common_block_status_above(BlockDriverState *bs,
 BlockDriverState *p;
 int64_t eof = 0;
 
-assert(bs != base);
+assert(include_base || bs != base);
+assert(!include_base || base); /* Can't include NULL base */
 
 ret = bdrv_co_block_status(bs, want_zero, offset, bytes, pnum, map, file);
-if (ret < 0 || *pnum == 0 || ret & BDRV_BLOCK_ALLOCATED) {
+if (ret < 0 || *pnum == 0 || ret & BDRV_BLOCK_ALLOCATED || bs == base) {
 return ret;
 }
 
@@ -2368,7 +2370,7 @@ bdrv_co_common_block_status_above(BlockDriverState *bs,
 assert(*pnum <= bytes);
 bytes = *pnum;
 
-for (p = bdrv_filter_or_cow_bs(bs); p != base;
+for (p = bdrv_filter_or_cow_bs(bs); include_base || p != base;
  p = bdrv_filter_or_cow_bs(p))
 {
 ret = bdrv_co_block_status(p, want_zero, offset, bytes, pnum, map,
@@ -2406,6 +2408,11 @@ bdrv_co_common_block_status_above(BlockDriverState *bs,
 break;
 }
 
+if (p == base) {
+assert(include_base);
+break;
+}
+
 /*
  * OK, [offset, offset + *pnum) region is unallocated on this layer,
  * let's continue the diving.
@@ -2425,7 +2432,7 @@ int bdrv_block_status_above(BlockDriverState *bs, 
BlockDriverState *base,
 int64_t offset, int64_t bytes, int64_t *pnum,
 int64_t *map, BlockDriverState **file)
 {
-return bdrv_common_block_status_above(bs, base, true, offset, bytes,
+return bdrv_common_block_status_above(bs, base, false, true, offset, bytes,
   pnum, map, file);
 }
 
@@ -2442,9 +2449,9 @@ int coroutine_fn bdrv_is_allocated(BlockDriverState *bs, 
int64_t offset,
 int ret;
 int64_t dummy;
 
-ret = bdrv_common_block_status_above(bs, bdrv_filter_or_cow_bs(bs), false,
- offset, bytes, pnum ? pnum : ,
- NULL, NULL);
+ret = bdrv_common_block_status_above(bs, bs, true, false, offset,
+ bytes, pnum ? pnum : , NULL,
+ NULL);
 if (ret < 0) {
 return ret;
 }
-- 
2.26.2



[PULL v2 20/30] util/vhost-user-server: use static library in meson.build

2020-10-12 Thread Stefan Hajnoczi
Don't compile contrib/libvhost-user/libvhost-user.c again. Instead build
the static library once and then reuse it throughout QEMU.

Also switch from CONFIG_LINUX to CONFIG_VHOST_USER, which is what the
vhost-user tools (vhost-user-gpu, etc) do.

Signed-off-by: Stefan Hajnoczi 
Message-id: 20200924151549.913737-14-stefa...@redhat.com
Signed-off-by: Stefan Hajnoczi 
---
 block/export/export.c | 8 
 block/export/meson.build  | 2 +-
 contrib/libvhost-user/meson.build | 1 +
 meson.build   | 6 +-
 tests/qtest/meson.build   | 2 +-
 util/meson.build  | 4 +++-
 6 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/block/export/export.c b/block/export/export.c
index bd7cac241f..550897e236 100644
--- a/block/export/export.c
+++ b/block/export/export.c
@@ -17,17 +17,17 @@
 #include "sysemu/block-backend.h"
 #include "block/export.h"
 #include "block/nbd.h"
-#if CONFIG_LINUX
-#include "block/export/vhost-user-blk-server.h"
-#endif
 #include "qapi/error.h"
 #include "qapi/qapi-commands-block-export.h"
 #include "qapi/qapi-events-block-export.h"
 #include "qemu/id.h"
+#ifdef CONFIG_VHOST_USER
+#include "vhost-user-blk-server.h"
+#endif
 
 static const BlockExportDriver *blk_exp_drivers[] = {
 _exp_nbd,
-#if CONFIG_LINUX
+#ifdef CONFIG_VHOST_USER
 _exp_vhost_user_blk,
 #endif
 };
diff --git a/block/export/meson.build b/block/export/meson.build
index ef3a9576f7..469a7aa0f5 100644
--- a/block/export/meson.build
+++ b/block/export/meson.build
@@ -1,2 +1,2 @@
 block_ss.add(files('export.c'))
-block_ss.add(when: 'CONFIG_LINUX', if_true: files('vhost-user-blk-server.c', 
'../../contrib/libvhost-user/libvhost-user.c'))
+block_ss.add(when: 'CONFIG_VHOST_USER', if_true: 
files('vhost-user-blk-server.c'))
diff --git a/contrib/libvhost-user/meson.build 
b/contrib/libvhost-user/meson.build
index e68dd1a581..a261e7665f 100644
--- a/contrib/libvhost-user/meson.build
+++ b/contrib/libvhost-user/meson.build
@@ -1,3 +1,4 @@
 libvhost_user = static_library('vhost-user',
files('libvhost-user.c', 
'libvhost-user-glib.c'),
build_by_default: false)
+vhost_user = declare_dependency(link_with: libvhost_user)
diff --git a/meson.build b/meson.build
index 17c89c87c6..4ddd899fdd 100644
--- a/meson.build
+++ b/meson.build
@@ -1297,6 +1297,11 @@ trace_events_subdirs += [
   'util',
 ]
 
+vhost_user = not_found
+if 'CONFIG_VHOST_USER' in config_host
+  subdir('contrib/libvhost-user')
+endif
+
 subdir('qapi')
 subdir('qobject')
 subdir('stubs')
@@ -1680,7 +1685,6 @@ if have_tools
  install: true)
 
   if 'CONFIG_VHOST_USER' in config_host
-subdir('contrib/libvhost-user')
 subdir('contrib/vhost-user-blk')
 subdir('contrib/vhost-user-gpu')
 subdir('contrib/vhost-user-input')
diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build
index 6fe217262b..6fb450ddce 100644
--- a/tests/qtest/meson.build
+++ b/tests/qtest/meson.build
@@ -192,7 +192,7 @@ qos_test_ss.add(
 )
 qos_test_ss.add(when: 'CONFIG_VIRTFS', if_true: files('virtio-9p-test.c'))
 qos_test_ss.add(when: 'CONFIG_VHOST_USER', if_true: files('vhost-user-test.c'))
-qos_test_ss.add(when: ['CONFIG_LINUX', 'CONFIG_TOOLS'], if_true: 
files('vhost-user-blk-test.c'))
+qos_test_ss.add(when: ['CONFIG_VHOST_USER', 'CONFIG_TOOLS'], if_true: 
files('vhost-user-blk-test.c'))
 
 extra_qtest_deps = {
   'bios-tables-test': [io],
diff --git a/util/meson.build b/util/meson.build
index 2296e81b34..9b2a7a5de9 100644
--- a/util/meson.build
+++ b/util/meson.build
@@ -66,7 +66,9 @@ if have_block
   util_ss.add(files('main-loop.c'))
   util_ss.add(files('nvdimm-utils.c'))
   util_ss.add(files('qemu-coroutine.c', 'qemu-coroutine-lock.c', 
'qemu-coroutine-io.c'))
-  util_ss.add(when: 'CONFIG_LINUX', if_true: files('vhost-user-server.c'))
+  util_ss.add(when: 'CONFIG_VHOST_USER', if_true: [
+files('vhost-user-server.c'), vhost_user
+  ])
   util_ss.add(files('block-helpers.c'))
   util_ss.add(files('qemu-coroutine-sleep.c'))
   util_ss.add(files('qemu-co-shared-resource.c'))
-- 
2.26.2



[PULL v2 29/30] block/io: fix bdrv_is_allocated_above

2020-10-12 Thread Stefan Hajnoczi
From: Vladimir Sementsov-Ogievskiy 

bdrv_is_allocated_above wrongly handles short backing files: it reports
after-EOF space as UNALLOCATED which is wrong, as on read the data is
generated on the level of short backing file (if all overlays have
unallocated areas at that place).

Reusing bdrv_common_block_status_above fixes the issue and unifies code
path.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Eric Blake 
Reviewed-by: Alberto Garcia 
Message-id: 20200924194003.22080-5-vsement...@virtuozzo.com
[Fix s/has/have/ as suggested by Eric Blake. Fix s/area/areas/.
--Stefan]
Signed-off-by: Stefan Hajnoczi 
---
 block/io.c | 43 +--
 1 file changed, 5 insertions(+), 38 deletions(-)

diff --git a/block/io.c b/block/io.c
index b616bc4ada..02528b3823 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2477,52 +2477,19 @@ int coroutine_fn bdrv_is_allocated(BlockDriverState 
*bs, int64_t offset,
  * at 'offset + *pnum' may return the same allocation status (in other
  * words, the result is not necessarily the maximum possible range);
  * but 'pnum' will only be 0 when end of file is reached.
- *
  */
 int bdrv_is_allocated_above(BlockDriverState *top,
 BlockDriverState *base,
 bool include_base, int64_t offset,
 int64_t bytes, int64_t *pnum)
 {
-BlockDriverState *intermediate;
-int ret;
-int64_t n = bytes;
-
-assert(base || !include_base);
-
-intermediate = top;
-while (include_base || intermediate != base) {
-int64_t pnum_inter;
-int64_t size_inter;
-
-assert(intermediate);
-ret = bdrv_is_allocated(intermediate, offset, bytes, _inter);
-if (ret < 0) {
-return ret;
-}
-if (ret) {
-*pnum = pnum_inter;
-return 1;
-}
-
-size_inter = bdrv_getlength(intermediate);
-if (size_inter < 0) {
-return size_inter;
-}
-if (n > pnum_inter &&
-(intermediate == top || offset + pnum_inter < size_inter)) {
-n = pnum_inter;
-}
-
-if (intermediate == base) {
-break;
-}
-
-intermediate = bdrv_filter_or_cow_bs(intermediate);
+int ret = bdrv_common_block_status_above(top, base, include_base, false,
+ offset, bytes, pnum, NULL, NULL);
+if (ret < 0) {
+return ret;
 }
 
-*pnum = n;
-return 0;
+return !!(ret & BDRV_BLOCK_ALLOCATED);
 }
 
 int coroutine_fn
-- 
2.26.2



[PULL v2 19/30] util/vhost-user-server: move header to include/

2020-10-12 Thread Stefan Hajnoczi
Headers used by other subsystems are located in include/. Also add the
vhost-user-server and vhost-user-blk-server headers to MAINTAINERS.

Signed-off-by: Stefan Hajnoczi 
Message-id: 20200924151549.913737-13-stefa...@redhat.com
Signed-off-by: Stefan Hajnoczi 
---
 MAINTAINERS| 4 +++-
 {util => include/qemu}/vhost-user-server.h | 0
 block/export/vhost-user-blk-server.c   | 2 +-
 util/vhost-user-server.c   | 2 +-
 4 files changed, 5 insertions(+), 3 deletions(-)
 rename {util => include/qemu}/vhost-user-server.h (100%)

diff --git a/MAINTAINERS b/MAINTAINERS
index 28262319db..59fcaf9706 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3072,9 +3072,11 @@ Vhost-user block device backend server
 M: Coiby Xu 
 S: Maintained
 F: block/export/vhost-user-blk-server.c
-F: util/vhost-user-server.c
+F: block/export/vhost-user-blk-server.h
+F: include/qemu/vhost-user-server.h
 F: tests/qtest/vhost-user-blk-test.c
 F: tests/qtest/libqos/vhost-user-blk.c
+F: util/vhost-user-server.c
 
 Replication
 M: Wen Congyang 
diff --git a/util/vhost-user-server.h b/include/qemu/vhost-user-server.h
similarity index 100%
rename from util/vhost-user-server.h
rename to include/qemu/vhost-user-server.h
diff --git a/block/export/vhost-user-blk-server.c 
b/block/export/vhost-user-blk-server.c
index 91fc7040b2..81072a5a46 100644
--- a/block/export/vhost-user-blk-server.c
+++ b/block/export/vhost-user-blk-server.c
@@ -13,7 +13,7 @@
 #include "block/block.h"
 #include "contrib/libvhost-user/libvhost-user.h"
 #include "standard-headers/linux/virtio_blk.h"
-#include "util/vhost-user-server.h"
+#include "qemu/vhost-user-server.h"
 #include "vhost-user-blk-server.h"
 #include "qapi/error.h"
 #include "qom/object_interfaces.h"
diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
index 2a27139eb8..3fd26c9f30 100644
--- a/util/vhost-user-server.c
+++ b/util/vhost-user-server.c
@@ -9,8 +9,8 @@
  */
 #include "qemu/osdep.h"
 #include "qemu/main-loop.h"
+#include "qemu/vhost-user-server.h"
 #include "block/aio-wait.h"
-#include "vhost-user-server.h"
 
 /*
  * Theory of operation:
-- 
2.26.2



[PULL v2 28/30] block/io: bdrv_common_block_status_above: support bs == base

2020-10-12 Thread Stefan Hajnoczi
From: Vladimir Sementsov-Ogievskiy 

We are going to reuse bdrv_common_block_status_above in
bdrv_is_allocated_above. bdrv_is_allocated_above may be called with
include_base == false and still bs == base (for ex. from img_rebase()).

So, support this corner case.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Kevin Wolf 
Reviewed-by: Eric Blake 
Reviewed-by: Alberto Garcia 
Message-id: 20200924194003.22080-4-vsement...@virtuozzo.com
Signed-off-by: Stefan Hajnoczi 
---
 block/io.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/block/io.c b/block/io.c
index 86f76d04bf..b616bc4ada 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2355,9 +2355,13 @@ bdrv_co_common_block_status_above(BlockDriverState *bs,
 BlockDriverState *p;
 int64_t eof = 0;
 
-assert(include_base || bs != base);
 assert(!include_base || base); /* Can't include NULL base */
 
+if (!include_base && bs == base) {
+*pnum = bytes;
+return 0;
+}
+
 ret = bdrv_co_block_status(bs, want_zero, offset, bytes, pnum, map, file);
 if (ret < 0 || *pnum == 0 || ret & BDRV_BLOCK_ALLOCATED || bs == base) {
 return ret;
-- 
2.26.2



[PULL v2 21/30] qemu-storage-daemon: avoid compiling blockdev_ss twice

2020-10-12 Thread Stefan Hajnoczi
Introduce libblkdev.fa to avoid recompiling blockdev_ss twice.

Suggested-by: Paolo Bonzini 
Reviewed-by: Paolo Bonzini 
Signed-off-by: Stefan Hajnoczi 
Message-id: 20200929125516.186715-3-stefa...@redhat.com
Signed-off-by: Stefan Hajnoczi 
---
 meson.build| 12 ++--
 storage-daemon/meson.build |  3 +--
 2 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/meson.build b/meson.build
index 4ddd899fdd..2fb0bb7cb4 100644
--- a/meson.build
+++ b/meson.build
@@ -1366,7 +1366,6 @@ blockdev_ss.add(files(
 blockdev_ss.add(when: 'CONFIG_POSIX', if_true: files('os-posix.c'))
 softmmu_ss.add(when: 'CONFIG_WIN32', if_true: [files('os-win32.c')])
 
-softmmu_ss.add_all(blockdev_ss)
 softmmu_ss.add(files(
   'bootdevice.c',
   'dma-helpers.c',
@@ -1462,6 +1461,15 @@ block = declare_dependency(link_whole: [libblock],
link_args: '@block.syms',
dependencies: [crypto, io])
 
+blockdev_ss = blockdev_ss.apply(config_host, strict: false)
+libblockdev = static_library('blockdev', blockdev_ss.sources() + genh,
+ dependencies: blockdev_ss.dependencies(),
+ name_suffix: 'fa',
+ build_by_default: false)
+
+blockdev = declare_dependency(link_whole: [libblockdev],
+  dependencies: [block])
+
 qmp_ss = qmp_ss.apply(config_host, strict: false)
 libqmp = static_library('qmp', qmp_ss.sources() + genh,
 dependencies: qmp_ss.dependencies(),
@@ -1478,7 +1486,7 @@ foreach m : block_mods + softmmu_mods
 install_dir: config_host['qemu_moddir'])
 endforeach
 
-softmmu_ss.add(authz, block, chardev, crypto, io, qmp)
+softmmu_ss.add(authz, blockdev, chardev, crypto, io, qmp)
 common_ss.add(qom, qemuutil)
 
 common_ss.add_all(when: 'CONFIG_SOFTMMU', if_true: [softmmu_ss])
diff --git a/storage-daemon/meson.build b/storage-daemon/meson.build
index 0409acc3f5..c5adce81c3 100644
--- a/storage-daemon/meson.build
+++ b/storage-daemon/meson.build
@@ -1,7 +1,6 @@
 qsd_ss = ss.source_set()
 qsd_ss.add(files('qemu-storage-daemon.c'))
-qsd_ss.add(block, chardev, qmp, qom, qemuutil)
-qsd_ss.add_all(blockdev_ss)
+qsd_ss.add(blockdev, chardev, qmp, qom, qemuutil)
 
 subdir('qapi')
 
-- 
2.26.2



[PULL v2 24/30] block/export: add vhost-user-blk multi-queue support

2020-10-12 Thread Stefan Hajnoczi
Allow the number of queues to be configured using --export
vhost-user-blk,num-queues=N. This setting should match the QEMU --device
vhost-user-blk-pci,num-queues=N setting but QEMU vhost-user-blk.c lowers
its own value if the vhost-user-blk backend offers fewer queues than
QEMU.

The vhost-user-blk-server.c code is already capable of multi-queue. All
virtqueue processing runs in the same AioContext. No new locking is
needed.

Add the num-queues=N option and set the VIRTIO_BLK_F_MQ feature bit.
Note that the feature bit only announces the presence of the num_queues
configuration space field. It does not promise that there is more than 1
virtqueue, so we can set it unconditionally.

I tested multi-queue by running a random read fio test with numjobs=4 on
an -smp 4 guest. After the benchmark finished the guest /proc/interrupts
file showed activity on all 4 virtio-blk MSI-X. The /sys/block/vda/mq/
directory shows that Linux blk-mq has 4 queues configured.

An automated test is included in the next commit.

Signed-off-by: Stefan Hajnoczi 
Acked-by: Markus Armbruster 
Message-id: 20201001144604.559733-2-stefa...@redhat.com
[Fixed accidental tab characters as suggested by Markus Armbruster
--Stefan]
Signed-off-by: Stefan Hajnoczi 
---
 qapi/block-export.json   | 10 +++---
 block/export/vhost-user-blk-server.c | 24 ++--
 2 files changed, 25 insertions(+), 9 deletions(-)

diff --git a/qapi/block-export.json b/qapi/block-export.json
index 8a4ced817f..480c497690 100644
--- a/qapi/block-export.json
+++ b/qapi/block-export.json
@@ -93,11 +93,15 @@
 #SocketAddress types are supported. Passed fds must be UNIX domain
 #sockets.
 # @logical-block-size: Logical block size in bytes. Defaults to 512 bytes.
+# @num-queues: Number of request virtqueues. Must be greater than 0. Defaults
+#  to 1.
 #
 # Since: 5.2
 ##
 { 'struct': 'BlockExportOptionsVhostUserBlk',
-  'data': { 'addr': 'SocketAddress', '*logical-block-size': 'size' } }
+  'data': { 'addr': 'SocketAddress',
+   '*logical-block-size': 'size',
+'*num-queues': 'uint16'} }
 
 ##
 # @NbdServerAddOptions:
@@ -233,8 +237,8 @@
 { 'union': 'BlockExportOptions',
   'base': { 'type': 'BlockExportType',
 'id': 'str',
-   '*fixed-iothread': 'bool',
-   '*iothread': 'str',
+'*fixed-iothread': 'bool',
+'*iothread': 'str',
 'node-name': 'str',
 '*writable': 'bool',
 '*writethrough': 'bool' },
diff --git a/block/export/vhost-user-blk-server.c 
b/block/export/vhost-user-blk-server.c
index a1c37548e1..385c04da75 100644
--- a/block/export/vhost-user-blk-server.c
+++ b/block/export/vhost-user-blk-server.c
@@ -21,7 +21,7 @@
 #include "util/block-helpers.h"
 
 enum {
-VHOST_USER_BLK_MAX_QUEUES = 1,
+VHOST_USER_BLK_NUM_QUEUES_DEFAULT = 1,
 };
 struct virtio_blk_inhdr {
 unsigned char status;
@@ -242,6 +242,7 @@ static uint64_t vu_blk_get_features(VuDev *dev)
1ull << VIRTIO_BLK_F_DISCARD |
1ull << VIRTIO_BLK_F_WRITE_ZEROES |
1ull << VIRTIO_BLK_F_CONFIG_WCE |
+   1ull << VIRTIO_BLK_F_MQ |
1ull << VIRTIO_F_VERSION_1 |
1ull << VIRTIO_RING_F_INDIRECT_DESC |
1ull << VIRTIO_RING_F_EVENT_IDX |
@@ -338,7 +339,9 @@ static void blk_aio_detach(void *opaque)
 
 static void
 vu_blk_initialize_config(BlockDriverState *bs,
-   struct virtio_blk_config *config, uint32_t blk_size)
+ struct virtio_blk_config *config,
+ uint32_t blk_size,
+ uint16_t num_queues)
 {
 config->capacity = bdrv_getlength(bs) >> BDRV_SECTOR_BITS;
 config->blk_size = blk_size;
@@ -346,7 +349,7 @@ vu_blk_initialize_config(BlockDriverState *bs,
 config->seg_max = 128 - 2;
 config->min_io_size = 1;
 config->opt_io_size = 1;
-config->num_queues = VHOST_USER_BLK_MAX_QUEUES;
+config->num_queues = num_queues;
 config->max_discard_sectors = 32768;
 config->max_discard_seg = 1;
 config->discard_sector_alignment = config->blk_size >> 9;
@@ -368,6 +371,7 @@ static int vu_blk_exp_create(BlockExport *exp, 
BlockExportOptions *opts,
 BlockExportOptionsVhostUserBlk *vu_opts = >u.vhost_user_blk;
 Error *local_err = NULL;
 uint64_t logical_block_size;
+uint16_t num_queues = VHOST_USER_BLK_NUM_QUEUES_DEFAULT;
 
 vexp->writable = opts->writable;
 vexp->blkcfg.wce = 0;
@@ -385,15 +389,23 @@ static int vu_blk_exp_create(BlockExport *exp, 
BlockExportOptions *opts,
 }
 vexp->blk_size = logical_block_size;
 blk_set_guest_block_size(exp->blk, logical_block_size);
+
+if (vu_opts->has_num_queues) {
+num_queues = vu_opts->num_queues;
+}
+if (num_queues == 0) {
+error_setg(errp, "num-queues must be greater than 0");
+return -EINVAL;
+}
+
 

[PULL v2 23/30] block/export: add iothread and fixed-iothread options

2020-10-12 Thread Stefan Hajnoczi
Make it possible to specify the iothread where the export will run. By
default the block node can be moved to other AioContexts later and the
export will follow. The fixed-iothread option forces strict behavior
that prevents changing AioContext while the export is active. See the
QAPI docs for details.

Signed-off-by: Stefan Hajnoczi 
Message-id: 20200929125516.186715-5-stefa...@redhat.com
[Fix stray '#' character in block-export.json and add missing "(since:
5.2)" as suggested by Eric Blake.
--Stefan]
Signed-off-by: Stefan Hajnoczi 
---
 qapi/block-export.json   | 11 ++
 block/export/export.c| 31 +++-
 block/export/vhost-user-blk-server.c |  5 -
 nbd/server.c |  2 --
 4 files changed, 45 insertions(+), 4 deletions(-)

diff --git a/qapi/block-export.json b/qapi/block-export.json
index a793e34af9..8a4ced817f 100644
--- a/qapi/block-export.json
+++ b/qapi/block-export.json
@@ -219,11 +219,22 @@
 #export before completion is signalled. (since: 5.2;
 #default: false)
 #
+# @iothread: The name of the iothread object where the export will run. The
+#default is to use the thread currently associated with the
+#block node. (since: 5.2)
+#
+# @fixed-iothread: True prevents the block node from being moved to another
+#  thread while the export is active. If true and @iothread is
+#  given, export creation fails if the block node cannot be
+#  moved to the iothread. The default is false. (since: 5.2)
+#
 # Since: 4.2
 ##
 { 'union': 'BlockExportOptions',
   'base': { 'type': 'BlockExportType',
 'id': 'str',
+   '*fixed-iothread': 'bool',
+   '*iothread': 'str',
 'node-name': 'str',
 '*writable': 'bool',
 '*writethrough': 'bool' },
diff --git a/block/export/export.c b/block/export/export.c
index 550897e236..a5b6b02703 100644
--- a/block/export/export.c
+++ b/block/export/export.c
@@ -15,6 +15,7 @@
 
 #include "block/block.h"
 #include "sysemu/block-backend.h"
+#include "sysemu/iothread.h"
 #include "block/export.h"
 #include "block/nbd.h"
 #include "qapi/error.h"
@@ -63,10 +64,11 @@ static const BlockExportDriver 
*blk_exp_find_driver(BlockExportType type)
 
 BlockExport *blk_exp_add(BlockExportOptions *export, Error **errp)
 {
+bool fixed_iothread = export->has_fixed_iothread && export->fixed_iothread;
 const BlockExportDriver *drv;
 BlockExport *exp = NULL;
 BlockDriverState *bs;
-BlockBackend *blk;
+BlockBackend *blk = NULL;
 AioContext *ctx;
 uint64_t perm;
 int ret;
@@ -102,6 +104,28 @@ BlockExport *blk_exp_add(BlockExportOptions *export, Error 
**errp)
 ctx = bdrv_get_aio_context(bs);
 aio_context_acquire(ctx);
 
+if (export->has_iothread) {
+IOThread *iothread;
+AioContext *new_ctx;
+
+iothread = iothread_by_id(export->iothread);
+if (!iothread) {
+error_setg(errp, "iothread \"%s\" not found", export->iothread);
+goto fail;
+}
+
+new_ctx = iothread_get_aio_context(iothread);
+
+ret = bdrv_try_set_aio_context(bs, new_ctx, errp);
+if (ret == 0) {
+aio_context_release(ctx);
+aio_context_acquire(new_ctx);
+ctx = new_ctx;
+} else if (fixed_iothread) {
+goto fail;
+}
+}
+
 /*
  * Block exports are used for non-shared storage migration. Make sure
  * that BDRV_O_INACTIVE is cleared and the image is ready for write
@@ -116,6 +140,11 @@ BlockExport *blk_exp_add(BlockExportOptions *export, Error 
**errp)
 }
 
 blk = blk_new(ctx, perm, BLK_PERM_ALL);
+
+if (!fixed_iothread) {
+blk_set_allow_aio_context_change(blk, true);
+}
+
 ret = blk_insert_bs(blk, bs, errp);
 if (ret < 0) {
 goto fail;
diff --git a/block/export/vhost-user-blk-server.c 
b/block/export/vhost-user-blk-server.c
index 81072a5a46..a1c37548e1 100644
--- a/block/export/vhost-user-blk-server.c
+++ b/block/export/vhost-user-blk-server.c
@@ -323,13 +323,17 @@ static const VuDevIface vu_blk_iface = {
 static void blk_aio_attached(AioContext *ctx, void *opaque)
 {
 VuBlkExport *vexp = opaque;
+
+vexp->export.ctx = ctx;
 vhost_user_server_attach_aio_context(>vu_server, ctx);
 }
 
 static void blk_aio_detach(void *opaque)
 {
 VuBlkExport *vexp = opaque;
+
 vhost_user_server_detach_aio_context(>vu_server);
+vexp->export.ctx = NULL;
 }
 
 static void
@@ -384,7 +388,6 @@ static int vu_blk_exp_create(BlockExport *exp, 
BlockExportOptions *opts,
 vu_blk_initialize_config(blk_bs(exp->blk), >blkcfg,
logical_block_size);
 
-blk_set_allow_aio_context_change(exp->blk, true);
 blk_add_aio_context_notifier(exp->blk, blk_aio_attached, blk_aio_detach,
  vexp);
 
diff --git 

[PULL v2 25/30] tests/qtest: add multi-queue test case to vhost-user-blk-test

2020-10-12 Thread Stefan Hajnoczi
Signed-off-by: Stefan Hajnoczi 
Message-id: 20201001144604.559733-3-stefa...@redhat.com
Signed-off-by: Stefan Hajnoczi 
---
 tests/qtest/vhost-user-blk-test.c | 81 +--
 1 file changed, 76 insertions(+), 5 deletions(-)

diff --git a/tests/qtest/vhost-user-blk-test.c 
b/tests/qtest/vhost-user-blk-test.c
index 42e4cfde82..b9f35191df 100644
--- a/tests/qtest/vhost-user-blk-test.c
+++ b/tests/qtest/vhost-user-blk-test.c
@@ -559,6 +559,67 @@ static void pci_hotplug(void *obj, void *data, 
QGuestAllocator *t_alloc)
 qpci_unplug_acpi_device_test(qts, "drv1", PCI_SLOT_HP);
 }
 
+static void multiqueue(void *obj, void *data, QGuestAllocator *t_alloc)
+{
+QVirtioPCIDevice *pdev1 = obj;
+QVirtioDevice *dev1 = >vdev;
+QVirtioPCIDevice *pdev8;
+QVirtioDevice *dev8;
+QTestState *qts = pdev1->pdev->bus->qts;
+uint64_t features;
+uint16_t num_queues;
+
+/*
+ * The primary device has 1 queue and VIRTIO_BLK_F_MQ is not enabled. The
+ * VIRTIO specification allows VIRTIO_BLK_F_MQ to be enabled when there is
+ * only 1 virtqueue, but --device vhost-user-blk-pci doesn't do this (which
+ * is also spec-compliant).
+ */
+features = qvirtio_get_features(dev1);
+g_assert_cmpint(features & (1u << VIRTIO_BLK_F_MQ), ==, 0);
+features = features & ~(QVIRTIO_F_BAD_FEATURE |
+(1u << VIRTIO_RING_F_INDIRECT_DESC) |
+(1u << VIRTIO_F_NOTIFY_ON_EMPTY) |
+(1u << VIRTIO_BLK_F_SCSI));
+qvirtio_set_features(dev1, features);
+
+/* Hotplug a secondary device with 8 queues */
+qtest_qmp_device_add(qts, "vhost-user-blk-pci", "drv1",
+ "{'addr': %s, 'chardev': 'char2', 'num-queues': 8}",
+ stringify(PCI_SLOT_HP) ".0");
+
+pdev8 = virtio_pci_new(pdev1->pdev->bus,
+   &(QPCIAddress) {
+   .devfn = QPCI_DEVFN(PCI_SLOT_HP, 0)
+   });
+g_assert_nonnull(pdev8);
+g_assert_cmpint(pdev8->vdev.device_type, ==, VIRTIO_ID_BLOCK);
+
+qos_object_start_hw(>obj);
+
+dev8 = >vdev;
+features = qvirtio_get_features(dev8);
+g_assert_cmpint(features & (1u << VIRTIO_BLK_F_MQ),
+==,
+(1u << VIRTIO_BLK_F_MQ));
+features = features & ~(QVIRTIO_F_BAD_FEATURE |
+(1u << VIRTIO_RING_F_INDIRECT_DESC) |
+(1u << VIRTIO_F_NOTIFY_ON_EMPTY) |
+(1u << VIRTIO_BLK_F_SCSI) |
+(1u << VIRTIO_BLK_F_MQ));
+qvirtio_set_features(dev8, features);
+
+num_queues = qvirtio_config_readw(dev8,
+offsetof(struct virtio_blk_config, num_queues));
+g_assert_cmpint(num_queues, ==, 8);
+
+qvirtio_pci_device_disable(pdev8);
+qos_object_destroy(>obj);
+
+/* unplug secondary disk */
+qpci_unplug_acpi_device_test(qts, "drv1", PCI_SLOT_HP);
+}
+
 /*
  * Check that setting the vring addr on a non-existent virtqueue does
  * not crash.
@@ -643,7 +704,8 @@ static void quit_storage_daemon(void *qmp_test_state)
 g_free(qmp_test_state);
 }
 
-static char *start_vhost_user_blk(GString *cmd_line, int vus_instances)
+static char *start_vhost_user_blk(GString *cmd_line, int vus_instances,
+  int num_queues)
 {
 const char *vhost_user_blk_bin = qtest_qemu_storage_daemon_binary();
 int fd, qmp_fd, i;
@@ -675,8 +737,8 @@ static char *start_vhost_user_blk(GString *cmd_line, int 
vus_instances)
 g_string_append_printf(storage_daemon_command,
 "--blockdev driver=file,node-name=disk%d,filename=%s "
 "--export 
type=vhost-user-blk,id=disk%d,addr.type=unix,addr.path=%s,"
-"node-name=disk%i,writable=on ",
-i, img_path, i, sock_path, i);
+"node-name=disk%i,writable=on,num-queues=%d ",
+i, img_path, i, sock_path, i, num_queues);
 
 g_string_append_printf(cmd_line, "-chardev socket,id=char%d,path=%s ",
i + 1, sock_path);
@@ -705,7 +767,7 @@ static char *start_vhost_user_blk(GString *cmd_line, int 
vus_instances)
 
 static void *vhost_user_blk_test_setup(GString *cmd_line, void *arg)
 {
-start_vhost_user_blk(cmd_line, 1);
+start_vhost_user_blk(cmd_line, 1, 1);
 return arg;
 }
 
@@ -719,7 +781,13 @@ static void *vhost_user_blk_test_setup(GString *cmd_line, 
void *arg)
 static void *vhost_user_blk_hotplug_test_setup(GString *cmd_line, void *arg)
 {
 /* "-chardev socket,id=char2" is used for pci_hotplug*/
-start_vhost_user_blk(cmd_line, 2);
+start_vhost_user_blk(cmd_line, 2, 1);
+return arg;
+}
+
+static void *vhost_user_blk_multiqueue_test_setup(GString *cmd_line, void *arg)
+{
+start_vhost_user_blk(cmd_line, 2, 8);
 return arg;
 }
 
@@ -746,6 +814,9 @@ static void register_vhost_user_blk_test(void)
 
 

[PULL v2 16/30] util/vhost-user-server: rework vu_client_trip() coroutine lifecycle

2020-10-12 Thread Stefan Hajnoczi
The vu_client_trip() coroutine is leaked during AioContext switching. It
is also unsafe to destroy the vu_dev in panic_cb() since its callers
still access it in some cases.

Rework the lifecycle to solve these safety issues.

Signed-off-by: Stefan Hajnoczi 
Message-id: 20200924151549.913737-10-stefa...@redhat.com
Signed-off-by: Stefan Hajnoczi 
---
 util/vhost-user-server.h |  29 ++--
 block/export/vhost-user-blk-server.c |   9 +-
 util/vhost-user-server.c | 245 +++
 3 files changed, 155 insertions(+), 128 deletions(-)

diff --git a/util/vhost-user-server.h b/util/vhost-user-server.h
index 92177fc911..0da4c2cc4c 100644
--- a/util/vhost-user-server.h
+++ b/util/vhost-user-server.h
@@ -19,34 +19,36 @@
 #include "qapi/error.h"
 #include "standard-headers/linux/virtio_blk.h"
 
+/* A kick fd that we monitor on behalf of libvhost-user */
 typedef struct VuFdWatch {
 VuDev *vu_dev;
 int fd; /*kick fd*/
 void *pvt;
 vu_watch_cb cb;
-bool processing;
 QTAILQ_ENTRY(VuFdWatch) next;
 } VuFdWatch;
 
-typedef struct VuServer VuServer;
-
-struct VuServer {
+/**
+ * VuServer:
+ * A vhost-user server instance with user-defined VuDevIface callbacks.
+ * Vhost-user device backends can be implemented using VuServer. VuDevIface
+ * callbacks and virtqueue kicks run in the given AioContext.
+ */
+typedef struct {
 QIONetListener *listener;
+QEMUBH *restart_listener_bh;
 AioContext *ctx;
 int max_queues;
 const VuDevIface *vu_iface;
+
+/* Protected by ctx lock */
 VuDev vu_dev;
 QIOChannel *ioc; /* The I/O channel with the client */
 QIOChannelSocket *sioc; /* The underlying data channel with the client */
-/* IOChannel for fd provided via VHOST_USER_SET_SLAVE_REQ_FD */
-QIOChannel *ioc_slave;
-QIOChannelSocket *sioc_slave;
-Coroutine *co_trip; /* coroutine for processing VhostUserMsg */
 QTAILQ_HEAD(, VuFdWatch) vu_fd_watches;
-/* restart coroutine co_trip if AIOContext is changed */
-bool aio_context_changed;
-bool processing_msg;
-};
+
+Coroutine *co_trip; /* coroutine for processing VhostUserMsg */
+} VuServer;
 
 bool vhost_user_server_start(VuServer *server,
  SocketAddress *unix_socket,
@@ -57,6 +59,7 @@ bool vhost_user_server_start(VuServer *server,
 
 void vhost_user_server_stop(VuServer *server);
 
-void vhost_user_server_set_aio_context(VuServer *server, AioContext *ctx);
+void vhost_user_server_attach_aio_context(VuServer *server, AioContext *ctx);
+void vhost_user_server_detach_aio_context(VuServer *server);
 
 #endif /* VHOST_USER_SERVER_H */
diff --git a/block/export/vhost-user-blk-server.c 
b/block/export/vhost-user-blk-server.c
index f2bfddbf26..b609a3e4d6 100644
--- a/block/export/vhost-user-blk-server.c
+++ b/block/export/vhost-user-blk-server.c
@@ -313,18 +313,13 @@ static const VuDevIface vu_block_iface = {
 static void blk_aio_attached(AioContext *ctx, void *opaque)
 {
 VuBlockDev *vub_dev = opaque;
-aio_context_acquire(ctx);
-vhost_user_server_set_aio_context(_dev->vu_server, ctx);
-aio_context_release(ctx);
+vhost_user_server_attach_aio_context(_dev->vu_server, ctx);
 }
 
 static void blk_aio_detach(void *opaque)
 {
 VuBlockDev *vub_dev = opaque;
-AioContext *ctx = vub_dev->vu_server.ctx;
-aio_context_acquire(ctx);
-vhost_user_server_set_aio_context(_dev->vu_server, NULL);
-aio_context_release(ctx);
+vhost_user_server_detach_aio_context(_dev->vu_server);
 }
 
 static void
diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
index ec555abcb2..d8b8c08b5f 100644
--- a/util/vhost-user-server.c
+++ b/util/vhost-user-server.c
@@ -9,8 +9,50 @@
  */
 #include "qemu/osdep.h"
 #include "qemu/main-loop.h"
+#include "block/aio-wait.h"
 #include "vhost-user-server.h"
 
+/*
+ * Theory of operation:
+ *
+ * VuServer is started and stopped by vhost_user_server_start() and
+ * vhost_user_server_stop() from the main loop thread. Starting the server
+ * opens a vhost-user UNIX domain socket and listens for incoming connections.
+ * Only one connection is allowed at a time.
+ *
+ * The connection is handled by the vu_client_trip() coroutine in the
+ * VuServer->ctx AioContext. The coroutine consists of a vu_dispatch() loop
+ * where libvhost-user calls vu_message_read() to receive the next vhost-user
+ * protocol messages over the UNIX domain socket.
+ *
+ * When virtqueues are set up libvhost-user calls set_watch() to monitor kick
+ * fds. These fds are also handled in the VuServer->ctx AioContext.
+ *
+ * Both vu_client_trip() and kick fd monitoring can be stopped by shutting down
+ * the socket connection. Shutting down the socket connection causes
+ * vu_message_read() to fail since no more data can be received from the 
socket.
+ * After vu_dispatch() fails, vu_client_trip() calls vu_deinit() to stop
+ * libvhost-user before terminating the coroutine. vu_deinit() calls
+ * remove_watch() to 

[PULL v2 17/30] block/export: report flush errors

2020-10-12 Thread Stefan Hajnoczi
Propagate the flush return value since errors are possible.

Signed-off-by: Stefan Hajnoczi 
Message-id: 20200924151549.913737-11-stefa...@redhat.com
Signed-off-by: Stefan Hajnoczi 
---
 block/export/vhost-user-blk-server.c | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/block/export/vhost-user-blk-server.c 
b/block/export/vhost-user-blk-server.c
index b609a3e4d6..44d3c45fa2 100644
--- a/block/export/vhost-user-blk-server.c
+++ b/block/export/vhost-user-blk-server.c
@@ -78,11 +78,11 @@ vu_block_discard_write_zeroes(VuBlockReq *req, struct iovec 
*iov,
 return -EINVAL;
 }
 
-static void coroutine_fn vu_block_flush(VuBlockReq *req)
+static int coroutine_fn vu_block_flush(VuBlockReq *req)
 {
 VuBlockDev *vdev_blk = get_vu_block_device_by_server(req->server);
 BlockBackend *backend = vdev_blk->backend;
-blk_co_flush(backend);
+return blk_co_flush(backend);
 }
 
 static void coroutine_fn vu_block_virtio_process_req(void *opaque)
@@ -152,8 +152,11 @@ static void coroutine_fn vu_block_virtio_process_req(void 
*opaque)
 break;
 }
 case VIRTIO_BLK_T_FLUSH:
-vu_block_flush(req);
-req->in->status = VIRTIO_BLK_S_OK;
+if (vu_block_flush(req) == 0) {
+req->in->status = VIRTIO_BLK_S_OK;
+} else {
+req->in->status = VIRTIO_BLK_S_IOERR;
+}
 break;
 case VIRTIO_BLK_T_GET_ID: {
 size_t size = MIN(iov_size(>in_sg[0], in_num),
-- 
2.26.2



[PULL v2 18/30] block/export: convert vhost-user-blk server to block export API

2020-10-12 Thread Stefan Hajnoczi
Use the new QAPI block exports API instead of defining our own QOM
objects.

This is a large change because the lifecycle of VuBlockDev needs to
follow BlockExportDriver. QOM properties are replaced by QAPI options
objects.

VuBlockDev is renamed VuBlkExport and contains a BlockExport field.
Several fields can be dropped since BlockExport already has equivalents.

The file names and meson build integration will be adjusted in a future
patch. libvhost-user should probably be built as a static library that
is linked into QEMU instead of as a .c file that results in duplicate
compilation.

The new command-line syntax is:

  $ qemu-storage-daemon \
  --blockdev file,node-name=drive0,filename=test.img \
  --export 
vhost-user-blk,node-name=drive0,id=export0,unix-socket=/tmp/vhost-user-blk.sock

Note that unix-socket is optional because we may wish to accept chardevs
too in the future.

Markus noted that supported address families are not explicit in the
QAPI schema. It is unlikely that support for more address families will
be added since file descriptor passing is required and few address
families support it. If a new address family needs to be added, then the
QAPI 'features' syntax can be used to advertize them.

Signed-off-by: Stefan Hajnoczi 
Acked-by: Markus Armbruster 
Message-id: 20200924151549.913737-12-stefa...@redhat.com
Signed-off-by: Stefan Hajnoczi 
---
 qapi/block-export.json   |  21 +-
 block/export/vhost-user-blk-server.h |  23 +-
 block/export/export.c|   6 +
 block/export/vhost-user-blk-server.c | 452 +++
 tests/qtest/vhost-user-blk-test.c|   2 +-
 util/vhost-user-server.c |  10 +-
 block/export/meson.build |   1 +
 block/meson.build|   1 -
 8 files changed, 157 insertions(+), 359 deletions(-)

diff --git a/qapi/block-export.json b/qapi/block-export.json
index 65804834d9..a793e34af9 100644
--- a/qapi/block-export.json
+++ b/qapi/block-export.json
@@ -84,6 +84,21 @@
   'data': { '*name': 'str', '*description': 'str',
 '*bitmap': 'str' } }
 
+##
+# @BlockExportOptionsVhostUserBlk:
+#
+# A vhost-user-blk block export.
+#
+# @addr: The vhost-user socket on which to listen. Both 'unix' and 'fd'
+#SocketAddress types are supported. Passed fds must be UNIX domain
+#sockets.
+# @logical-block-size: Logical block size in bytes. Defaults to 512 bytes.
+#
+# Since: 5.2
+##
+{ 'struct': 'BlockExportOptionsVhostUserBlk',
+  'data': { 'addr': 'SocketAddress', '*logical-block-size': 'size' } }
+
 ##
 # @NbdServerAddOptions:
 #
@@ -180,11 +195,12 @@
 # An enumeration of block export types
 #
 # @nbd: NBD export
+# @vhost-user-blk: vhost-user-blk export (since 5.2)
 #
 # Since: 4.2
 ##
 { 'enum': 'BlockExportType',
-  'data': [ 'nbd' ] }
+  'data': [ 'nbd', 'vhost-user-blk' ] }
 
 ##
 # @BlockExportOptions:
@@ -213,7 +229,8 @@
 '*writethrough': 'bool' },
   'discriminator': 'type',
   'data': {
-  'nbd': 'BlockExportOptionsNbd'
+  'nbd': 'BlockExportOptionsNbd',
+  'vhost-user-blk': 'BlockExportOptionsVhostUserBlk'
} }
 
 ##
diff --git a/block/export/vhost-user-blk-server.h 
b/block/export/vhost-user-blk-server.h
index f06f37c4c8..fcf46fc8a5 100644
--- a/block/export/vhost-user-blk-server.h
+++ b/block/export/vhost-user-blk-server.h
@@ -10,27 +10,10 @@
 
 #ifndef VHOST_USER_BLK_SERVER_H
 #define VHOST_USER_BLK_SERVER_H
-#include "util/vhost-user-server.h"
 
-typedef struct VuBlockDev VuBlockDev;
-#define TYPE_VHOST_USER_BLK_SERVER "vhost-user-blk-server"
-#define VHOST_USER_BLK_SERVER(obj) \
-   OBJECT_CHECK(VuBlockDev, obj, TYPE_VHOST_USER_BLK_SERVER)
+#include "block/export.h"
 
-/* vhost user block device */
-struct VuBlockDev {
-Object parent_obj;
-char *node_name;
-SocketAddress *addr;
-AioContext *ctx;
-VuServer vu_server;
-bool running;
-uint32_t blk_size;
-BlockBackend *backend;
-QIOChannelSocket *sioc;
-QTAILQ_ENTRY(VuBlockDev) next;
-struct virtio_blk_config blkcfg;
-bool writable;
-};
+/* For block/export/export.c */
+extern const BlockExportDriver blk_exp_vhost_user_blk;
 
 #endif /* VHOST_USER_BLK_SERVER_H */
diff --git a/block/export/export.c b/block/export/export.c
index f2c00d13bf..bd7cac241f 100644
--- a/block/export/export.c
+++ b/block/export/export.c
@@ -17,6 +17,9 @@
 #include "sysemu/block-backend.h"
 #include "block/export.h"
 #include "block/nbd.h"
+#if CONFIG_LINUX
+#include "block/export/vhost-user-blk-server.h"
+#endif
 #include "qapi/error.h"
 #include "qapi/qapi-commands-block-export.h"
 #include "qapi/qapi-events-block-export.h"
@@ -24,6 +27,9 @@
 
 static const BlockExportDriver *blk_exp_drivers[] = {
 _exp_nbd,
+#if CONFIG_LINUX
+_exp_vhost_user_blk,
+#endif
 };
 
 /* Only accessed from the main thread */
diff --git a/block/export/vhost-user-blk-server.c 
b/block/export/vhost-user-blk-server.c
index 44d3c45fa2..91fc7040b2 100644
--- 

[PULL v2 15/30] util/vhost-user-server: check EOF when reading payload

2020-10-12 Thread Stefan Hajnoczi
Unexpected EOF is an error that must be reported.

Signed-off-by: Stefan Hajnoczi 
Message-id: 20200924151549.913737-9-stefa...@redhat.com
Signed-off-by: Stefan Hajnoczi 
---
 util/vhost-user-server.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
index 5a60e2ca2a..ec555abcb2 100644
--- a/util/vhost-user-server.c
+++ b/util/vhost-user-server.c
@@ -169,8 +169,10 @@ vu_message_read(VuDev *vu_dev, int conn_fd, VhostUserMsg 
*vmsg)
 };
 if (vmsg->size) {
 rc = qio_channel_readv_all_eof(ioc, _payload, 1, _err);
-if (rc == -1) {
-error_report_err(local_err);
+if (rc != 1) {
+if (local_err) {
+error_report_err(local_err);
+}
 goto fail;
 }
 }
-- 
2.26.2



[PULL v2 14/30] util/vhost-user-server: fix memory leak in vu_message_read()

2020-10-12 Thread Stefan Hajnoczi
fds[] is leaked when qio_channel_readv_full() fails.

Use vmsg->fds[] instead of keeping a local fds[] array. Then we can
reuse goto fail to clean up fds. vmsg->fd_num must be zeroed before the
loop to make this safe.

Signed-off-by: Stefan Hajnoczi 
Message-id: 20200924151549.913737-8-stefa...@redhat.com
Signed-off-by: Stefan Hajnoczi 
---
 util/vhost-user-server.c | 50 ++--
 1 file changed, 23 insertions(+), 27 deletions(-)

diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
index 892815827d..5a60e2ca2a 100644
--- a/util/vhost-user-server.c
+++ b/util/vhost-user-server.c
@@ -100,21 +100,11 @@ vu_message_read(VuDev *vu_dev, int conn_fd, VhostUserMsg 
*vmsg)
 };
 int rc, read_bytes = 0;
 Error *local_err = NULL;
-/*
- * Store fds/nfds returned from qio_channel_readv_full into
- * temporary variables.
- *
- * VhostUserMsg is a packed structure, gcc will complain about passing
- * pointer to a packed structure member if we pass _num
- * and  directly when calling qio_channel_readv_full,
- * thus two temporary variables nfds and fds are used here.
- */
-size_t nfds = 0, nfds_t = 0;
 const size_t max_fds = G_N_ELEMENTS(vmsg->fds);
-int *fds_t = NULL;
 VuServer *server = container_of(vu_dev, VuServer, vu_dev);
 QIOChannel *ioc = server->ioc;
 
+vmsg->fd_num = 0;
 if (!ioc) {
 error_report_err(local_err);
 goto fail;
@@ -122,41 +112,47 @@ vu_message_read(VuDev *vu_dev, int conn_fd, VhostUserMsg 
*vmsg)
 
 assert(qemu_in_coroutine());
 do {
+size_t nfds = 0;
+int *fds = NULL;
+
 /*
  * qio_channel_readv_full may have short reads, keeping calling it
  * until getting VHOST_USER_HDR_SIZE or 0 bytes in total
  */
-rc = qio_channel_readv_full(ioc, , 1, _t, _t, _err);
+rc = qio_channel_readv_full(ioc, , 1, , , _err);
 if (rc < 0) {
 if (rc == QIO_CHANNEL_ERR_BLOCK) {
+assert(local_err == NULL);
 qio_channel_yield(ioc, G_IO_IN);
 continue;
 } else {
 error_report_err(local_err);
-return false;
+goto fail;
 }
 }
-read_bytes += rc;
-if (nfds_t > 0) {
-if (nfds + nfds_t > max_fds) {
+
+if (nfds > 0) {
+if (vmsg->fd_num + nfds > max_fds) {
 error_report("A maximum of %zu fds are allowed, "
  "however got %lu fds now",
- max_fds, nfds + nfds_t);
+ max_fds, vmsg->fd_num + nfds);
+g_free(fds);
 goto fail;
 }
-memcpy(vmsg->fds + nfds, fds_t,
-   nfds_t *sizeof(vmsg->fds[0]));
-nfds += nfds_t;
-g_free(fds_t);
+memcpy(vmsg->fds + vmsg->fd_num, fds, nfds * sizeof(vmsg->fds[0]));
+vmsg->fd_num += nfds;
+g_free(fds);
 }
-if (read_bytes == VHOST_USER_HDR_SIZE || rc == 0) {
-break;
+
+if (rc == 0) { /* socket closed */
+goto fail;
 }
-iov.iov_base = (char *)vmsg + read_bytes;
-iov.iov_len = VHOST_USER_HDR_SIZE - read_bytes;
-} while (true);
 
-vmsg->fd_num = nfds;
+iov.iov_base += rc;
+iov.iov_len -= rc;
+read_bytes += rc;
+} while (read_bytes != VHOST_USER_HDR_SIZE);
+
 /* qio_channel_readv_full will make socket fds blocking, unblock them */
 vmsg_unblock_fds(vmsg);
 if (vmsg->size > sizeof(vmsg->payload)) {
-- 
2.26.2



[PULL v2 22/30] block: move block exports to libblockdev

2020-10-12 Thread Stefan Hajnoczi
Block exports are used by softmmu, qemu-storage-daemon, and qemu-nbd.
They are not used by other programs and are not otherwise needed in
libblock.

Undo the recent move of blockdev-nbd.c from blockdev_ss into block_ss.
Since bdrv_close_all() (libblock) calls blk_exp_close_all()
(libblockdev) a stub function is required..

Make qemu-nbd.c use signal handling utility functions instead of
duplicating the code. This helps because os-posix.c is in libblockdev
and it depends on a qemu_system_killed() symbol that qemu-nbd.c lacks.
Once we use the signal handling utility functions we also end up
providing the necessary symbol.

Signed-off-by: Stefan Hajnoczi 
Reviewed-by: Paolo Bonzini 
Reviewed-by: Eric Blake 
Message-id: 20200929125516.186715-4-stefa...@redhat.com
[Fixed s/ndb/nbd/ typo in commit description as suggested by Eric Blake
--Stefan]
Signed-off-by: Stefan Hajnoczi 
---
 qemu-nbd.c| 25 +
 stubs/blk-exp-close-all.c |  7 +++
 block/export/meson.build  |  4 ++--
 meson.build   |  4 ++--
 nbd/meson.build   |  2 ++
 stubs/meson.build |  1 +
 6 files changed, 23 insertions(+), 20 deletions(-)
 create mode 100644 stubs/blk-exp-close-all.c

diff --git a/qemu-nbd.c b/qemu-nbd.c
index bc644a0670..369b00e3b9 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -25,6 +25,7 @@
 #include "qapi/error.h"
 #include "qemu/cutils.h"
 #include "sysemu/block-backend.h"
+#include "sysemu/runstate.h" /* for qemu_system_killed() prototype */
 #include "block/block_int.h"
 #include "block/nbd.h"
 #include "qemu/main-loop.h"
@@ -154,8 +155,12 @@ QEMU_COPYRIGHT "\n"
 , name);
 }
 
-#ifdef CONFIG_POSIX
-static void termsig_handler(int signum)
+#if CONFIG_POSIX
+/*
+ * The client thread uses SIGTERM to interrupt the server.  A signal
+ * handler ensures that "qemu-nbd -v -c" exits with a nice status code.
+ */
+void qemu_system_killed(int signum, pid_t pid)
 {
 qatomic_cmpxchg(, RUNNING, TERMINATE);
 qemu_notify_event();
@@ -581,20 +586,8 @@ int main(int argc, char **argv)
 const char *pid_file_name = NULL;
 BlockExportOptions *export_opts;
 
-#ifdef CONFIG_POSIX
-/*
- * Exit gracefully on various signals, which includes SIGTERM used
- * by 'qemu-nbd -v -c'.
- */
-struct sigaction sa_sigterm;
-memset(_sigterm, 0, sizeof(sa_sigterm));
-sa_sigterm.sa_handler = termsig_handler;
-sigaction(SIGTERM, _sigterm, NULL);
-sigaction(SIGINT, _sigterm, NULL);
-sigaction(SIGHUP, _sigterm, NULL);
-
-signal(SIGPIPE, SIG_IGN);
-#endif
+os_setup_early_signal_handling();
+os_setup_signal_handling();
 
 socket_init();
 error_init(argv[0]);
diff --git a/stubs/blk-exp-close-all.c b/stubs/blk-exp-close-all.c
new file mode 100644
index 00..1c71316763
--- /dev/null
+++ b/stubs/blk-exp-close-all.c
@@ -0,0 +1,7 @@
+#include "qemu/osdep.h"
+#include "block/export.h"
+
+/* Only used in programs that support block exports (libblockdev.fa) */
+void blk_exp_close_all(void)
+{
+}
diff --git a/block/export/meson.build b/block/export/meson.build
index 469a7aa0f5..a2772a0dce 100644
--- a/block/export/meson.build
+++ b/block/export/meson.build
@@ -1,2 +1,2 @@
-block_ss.add(files('export.c'))
-block_ss.add(when: 'CONFIG_VHOST_USER', if_true: 
files('vhost-user-blk-server.c'))
+blockdev_ss.add(files('export.c'))
+blockdev_ss.add(when: 'CONFIG_VHOST_USER', if_true: 
files('vhost-user-blk-server.c'))
diff --git a/meson.build b/meson.build
index 2fb0bb7cb4..9964ed43a1 100644
--- a/meson.build
+++ b/meson.build
@@ -1344,7 +1344,6 @@ subdir('dump')
 
 block_ss.add(files(
   'block.c',
-  'blockdev-nbd.c',
   'blockjob.c',
   'job.c',
   'qemu-io-cmds.c',
@@ -1357,6 +1356,7 @@ subdir('block')
 
 blockdev_ss.add(files(
   'blockdev.c',
+  'blockdev-nbd.c',
   'iothread.c',
   'job-qmp.c',
 ))
@@ -1682,7 +1682,7 @@ if have_tools
   qemu_io = executable('qemu-io', files('qemu-io.c'),
  dependencies: [block, qemuutil], install: true)
   qemu_nbd = executable('qemu-nbd', files('qemu-nbd.c'),
-   dependencies: [block, qemuutil], install: true)
+   dependencies: [blockdev, qemuutil], install: true)
 
   subdir('storage-daemon')
   subdir('contrib/rdmacm-mux')
diff --git a/nbd/meson.build b/nbd/meson.build
index 0c00a776d3..2baaa36948 100644
--- a/nbd/meson.build
+++ b/nbd/meson.build
@@ -1,5 +1,7 @@
 block_ss.add(files(
   'client.c',
   'common.c',
+))
+blockdev_ss.add(files(
   'server.c',
 ))
diff --git a/stubs/meson.build b/stubs/meson.build
index 67f2a8c069..7b733fadb7 100644
--- a/stubs/meson.build
+++ b/stubs/meson.build
@@ -1,6 +1,7 @@
 stub_ss.add(files('arch_type.c'))
 stub_ss.add(files('bdrv-next-monitor-owned.c'))
 stub_ss.add(files('blk-commit-all.c'))
+stub_ss.add(files('blk-exp-close-all.c'))
 stub_ss.add(files('blockdev-close-all-bdrv-states.c'))
 stub_ss.add(files('change-state-handler.c'))
 stub_ss.add(files('cmos.c'))
-- 
2.26.2



[PULL v2 12/30] block/export: consolidate request structs into VuBlockReq

2020-10-12 Thread Stefan Hajnoczi
Only one struct is needed per request. Drop req_data and the separate
VuBlockReq instance. Instead let vu_queue_pop() allocate everything at
once.

This fixes the req_data memory leak in vu_block_virtio_process_req().

Signed-off-by: Stefan Hajnoczi 
Message-id: 20200924151549.913737-6-stefa...@redhat.com
Signed-off-by: Stefan Hajnoczi 
---
 block/export/vhost-user-blk-server.c | 68 +---
 1 file changed, 21 insertions(+), 47 deletions(-)

diff --git a/block/export/vhost-user-blk-server.c 
b/block/export/vhost-user-blk-server.c
index fb7764a730..ef07a87eb1 100644
--- a/block/export/vhost-user-blk-server.c
+++ b/block/export/vhost-user-blk-server.c
@@ -25,7 +25,7 @@ struct virtio_blk_inhdr {
 };
 
 typedef struct VuBlockReq {
-VuVirtqElement *elem;
+VuVirtqElement elem;
 int64_t sector_num;
 size_t size;
 struct virtio_blk_inhdr *in;
@@ -39,14 +39,10 @@ static void vu_block_req_complete(VuBlockReq *req)
 VuDev *vu_dev = >server->vu_dev;
 
 /* IO size with 1 extra status byte */
-vu_queue_push(vu_dev, req->vq, req->elem, req->size + 1);
+vu_queue_push(vu_dev, req->vq, >elem, req->size + 1);
 vu_queue_notify(vu_dev, req->vq);
 
-if (req->elem) {
-free(req->elem);
-}
-
-g_free(req);
+free(req);
 }
 
 static VuBlockDev *get_vu_block_device_by_server(VuServer *server)
@@ -89,20 +85,12 @@ static void coroutine_fn vu_block_flush(VuBlockReq *req)
 blk_co_flush(backend);
 }
 
-struct req_data {
-VuServer *server;
-VuVirtq *vq;
-VuVirtqElement *elem;
-};
-
 static void coroutine_fn vu_block_virtio_process_req(void *opaque)
 {
-struct req_data *data = opaque;
-VuServer *server = data->server;
-VuVirtq *vq = data->vq;
-VuVirtqElement *elem = data->elem;
+VuBlockReq *req = opaque;
+VuServer *server = req->server;
+VuVirtqElement *elem = >elem;
 uint32_t type;
-VuBlockReq *req;
 
 VuBlockDev *vdev_blk = get_vu_block_device_by_server(server);
 BlockBackend *backend = vdev_blk->backend;
@@ -111,18 +99,13 @@ static void coroutine_fn vu_block_virtio_process_req(void 
*opaque)
 struct iovec *out_iov = elem->out_sg;
 unsigned in_num = elem->in_num;
 unsigned out_num = elem->out_num;
+
 /* refer to hw/block/virtio_blk.c */
 if (elem->out_num < 1 || elem->in_num < 1) {
 error_report("virtio-blk request missing headers");
-free(elem);
-return;
+goto err;
 }
 
-req = g_new0(VuBlockReq, 1);
-req->server = server;
-req->vq = vq;
-req->elem = elem;
-
 if (unlikely(iov_to_buf(out_iov, out_num, 0, >out,
 sizeof(req->out)) != sizeof(req->out))) {
 error_report("virtio-blk request outhdr too short");
@@ -202,36 +185,27 @@ static void coroutine_fn vu_block_virtio_process_req(void 
*opaque)
 
 err:
 free(elem);
-g_free(req);
-return;
 }
 
 static void vu_block_process_vq(VuDev *vu_dev, int idx)
 {
-VuServer *server;
-VuVirtq *vq;
-struct req_data *req_data;
+VuServer *server = container_of(vu_dev, VuServer, vu_dev);
+VuVirtq *vq = vu_get_queue(vu_dev, idx);
 
-server = container_of(vu_dev, VuServer, vu_dev);
-assert(server);
-
-vq = vu_get_queue(vu_dev, idx);
-assert(vq);
-VuVirtqElement *elem;
 while (1) {
-elem = vu_queue_pop(vu_dev, vq, sizeof(VuVirtqElement) +
-sizeof(VuBlockReq));
-if (elem) {
-req_data = g_new0(struct req_data, 1);
-req_data->server = server;
-req_data->vq = vq;
-req_data->elem = elem;
-Coroutine *co = qemu_coroutine_create(vu_block_virtio_process_req,
-  req_data);
-aio_co_enter(server->ioc->ctx, co);
-} else {
+VuBlockReq *req;
+
+req = vu_queue_pop(vu_dev, vq, sizeof(VuBlockReq));
+if (!req) {
 break;
 }
+
+req->server = server;
+req->vq = vq;
+
+Coroutine *co =
+qemu_coroutine_create(vu_block_virtio_process_req, req);
+qemu_coroutine_enter(co);
 }
 }
 
-- 
2.26.2



[PULL v2 11/30] util/vhost-user-server: drop unnecessary watch deletion

2020-10-12 Thread Stefan Hajnoczi
Explicitly deleting watches is not necessary since libvhost-user calls
remove_watch() during vu_deinit(). Add an assertion to check this
though.

Signed-off-by: Stefan Hajnoczi 
Message-id: 20200924151549.913737-5-stefa...@redhat.com
Signed-off-by: Stefan Hajnoczi 
---
 util/vhost-user-server.c | 19 ---
 1 file changed, 4 insertions(+), 15 deletions(-)

diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
index ebe47885f5..ebb850731b 100644
--- a/util/vhost-user-server.c
+++ b/util/vhost-user-server.c
@@ -48,21 +48,6 @@ static void close_client(VuServer *server)
 /* When this is set vu_client_trip will stop new processing vhost-user 
message */
 server->sioc = NULL;
 
-VuFdWatch *vu_fd_watch, *next;
-QTAILQ_FOREACH_SAFE(vu_fd_watch, >vu_fd_watches, next, next) {
-aio_set_fd_handler(server->ioc->ctx, vu_fd_watch->fd, true, NULL,
-   NULL, NULL, NULL);
-}
-
-while (!QTAILQ_EMPTY(>vu_fd_watches)) {
-QTAILQ_FOREACH_SAFE(vu_fd_watch, >vu_fd_watches, next, next) {
-if (!vu_fd_watch->processing) {
-QTAILQ_REMOVE(>vu_fd_watches, vu_fd_watch, next);
-g_free(vu_fd_watch);
-}
-}
-}
-
 while (server->processing_msg) {
 if (server->ioc->read_coroutine) {
 server->ioc->read_coroutine = NULL;
@@ -73,6 +58,10 @@ static void close_client(VuServer *server)
 }
 
 vu_deinit(>vu_dev);
+
+/* vu_deinit() should have called remove_watch() */
+assert(QTAILQ_EMPTY(>vu_fd_watches));
+
 object_unref(OBJECT(sioc));
 object_unref(OBJECT(server->ioc));
 }
-- 
2.26.2



[PULL v2 05/30] block: move logical block size check function to a common utility function

2020-10-12 Thread Stefan Hajnoczi
From: Coiby Xu 

Move the constants from hw/core/qdev-properties.c to
util/block-helpers.h so that knowledge of the min/max values is

Signed-off-by: Stefan Hajnoczi 
Signed-off-by: Coiby Xu 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Marc-André Lureau 
Acked-by: Eduardo Habkost 
Message-id: 20200918080912.321299-5-coiby...@gmail.com
Signed-off-by: Stefan Hajnoczi 
---
 util/block-helpers.h | 19 +
 hw/core/qdev-properties-system.c | 31 -
 util/block-helpers.c | 46 
 util/meson.build |  1 +
 4 files changed, 71 insertions(+), 26 deletions(-)
 create mode 100644 util/block-helpers.h
 create mode 100644 util/block-helpers.c

diff --git a/util/block-helpers.h b/util/block-helpers.h
new file mode 100644
index 00..b53295a529
--- /dev/null
+++ b/util/block-helpers.h
@@ -0,0 +1,19 @@
+#ifndef BLOCK_HELPERS_H
+#define BLOCK_HELPERS_H
+
+#include "qemu/units.h"
+
+/* lower limit is sector size */
+#define MIN_BLOCK_SIZE  INT64_C(512)
+#define MIN_BLOCK_SIZE_STR  "512 B"
+/*
+ * upper limit is arbitrary, 2 MiB looks sufficient for all sensible uses, and
+ * matches qcow2 cluster size limit
+ */
+#define MAX_BLOCK_SIZE  (2 * MiB)
+#define MAX_BLOCK_SIZE_STR  "2 MiB"
+
+void check_block_size(const char *id, const char *name, int64_t value,
+  Error **errp);
+
+#endif /* BLOCK_HELPERS_H */
diff --git a/hw/core/qdev-properties-system.c b/hw/core/qdev-properties-system.c
index 49bdd12581..b81a4e8d14 100644
--- a/hw/core/qdev-properties-system.c
+++ b/hw/core/qdev-properties-system.c
@@ -30,6 +30,7 @@
 #include "sysemu/blockdev.h"
 #include "net/net.h"
 #include "hw/pci/pci.h"
+#include "util/block-helpers.h"
 
 static bool check_prop_still_unset(DeviceState *dev, const char *name,
const void *old_val, const char *new_val,
@@ -576,16 +577,6 @@ const PropertyInfo qdev_prop_losttickpolicy = {
 
 /* --- blocksize --- */
 
-/* lower limit is sector size */
-#define MIN_BLOCK_SIZE  512
-#define MIN_BLOCK_SIZE_STR  "512 B"
-/*
- * upper limit is arbitrary, 2 MiB looks sufficient for all sensible uses, and
- * matches qcow2 cluster size limit
- */
-#define MAX_BLOCK_SIZE  (2 * MiB)
-#define MAX_BLOCK_SIZE_STR  "2 MiB"
-
 static void set_blocksize(Object *obj, Visitor *v, const char *name,
   void *opaque, Error **errp)
 {
@@ -593,6 +584,7 @@ static void set_blocksize(Object *obj, Visitor *v, const 
char *name,
 Property *prop = opaque;
 uint32_t *ptr = qdev_get_prop_ptr(dev, prop);
 uint64_t value;
+Error *local_err = NULL;
 
 if (dev->realized) {
 qdev_prop_set_after_realize(dev, name, errp);
@@ -602,24 +594,11 @@ static void set_blocksize(Object *obj, Visitor *v, const 
char *name,
 if (!visit_type_size(v, name, , errp)) {
 return;
 }
-/* value of 0 means "unset" */
-if (value && (value < MIN_BLOCK_SIZE || value > MAX_BLOCK_SIZE)) {
-error_setg(errp,
-   "Property %s.%s doesn't take value %" PRIu64
-   " (minimum: " MIN_BLOCK_SIZE_STR
-   ", maximum: " MAX_BLOCK_SIZE_STR ")",
-   dev->id ? : "", name, value);
+check_block_size(dev->id ? : "", name, value, _err);
+if (local_err) {
+error_propagate(errp, local_err);
 return;
 }
-
-/* We rely on power-of-2 blocksizes for bitmasks */
-if ((value & (value - 1)) != 0) {
-error_setg(errp,
-  "Property %s.%s doesn't take value '%" PRId64 "', "
-  "it's not a power of 2", dev->id ?: "", name, 
(int64_t)value);
-return;
-}
-
 *ptr = value;
 }
 
diff --git a/util/block-helpers.c b/util/block-helpers.c
new file mode 100644
index 00..c4851432f5
--- /dev/null
+++ b/util/block-helpers.c
@@ -0,0 +1,46 @@
+/*
+ * Block utility functions
+ *
+ * Copyright IBM, Corp. 2011
+ * Copyright (c) 2020 Coiby Xu 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "qapi/qmp/qerror.h"
+#include "block-helpers.h"
+
+/**
+ * check_block_size:
+ * @id: The unique ID of the object
+ * @name: The name of the property being validated
+ * @value: The block size in bytes
+ * @errp: A pointer to an area to store an error
+ *
+ * This function checks that the block size meets the following conditions:
+ * 1. At least MIN_BLOCK_SIZE
+ * 2. No larger than MAX_BLOCK_SIZE
+ * 3. A power of 2
+ */
+void check_block_size(const char *id, const char *name, int64_t value,
+  Error **errp)
+{
+/* value of 0 means "unset" */
+if (value && (value < MIN_BLOCK_SIZE || value > MAX_BLOCK_SIZE)) {
+error_setg(errp, QERR_PROPERTY_VALUE_OUT_OF_RANGE,
+   id, name, value, 

[PULL v2 09/30] util/vhost-user-server: s/fileds/fields/ typo fix

2020-10-12 Thread Stefan Hajnoczi
Signed-off-by: Stefan Hajnoczi 
Message-id: 20200924151549.913737-3-stefa...@redhat.com
Signed-off-by: Stefan Hajnoczi 
---
 util/vhost-user-server.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
index 7b50a2b1fd..2cd0cf8277 100644
--- a/util/vhost-user-server.c
+++ b/util/vhost-user-server.c
@@ -407,7 +407,7 @@ bool vhost_user_server_start(VuServer *server,
 return false;
 }
 
-/* zero out unspecified fileds */
+/* zero out unspecified fields */
 *server = (VuServer) {
 .listener  = listener,
 .vu_iface  = vu_iface,
-- 
2.26.2



[PULL v2 10/30] util/vhost-user-server: drop unnecessary QOM cast

2020-10-12 Thread Stefan Hajnoczi
We already have access to the value with the correct type (ioc and sioc
are the same QIOChannel).

Signed-off-by: Stefan Hajnoczi 
Message-id: 20200924151549.913737-4-stefa...@redhat.com
Signed-off-by: Stefan Hajnoczi 
---
 util/vhost-user-server.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
index 2cd0cf8277..ebe47885f5 100644
--- a/util/vhost-user-server.c
+++ b/util/vhost-user-server.c
@@ -337,7 +337,7 @@ static void vu_accept(QIONetListener *listener, 
QIOChannelSocket *sioc,
 server->ioc = QIO_CHANNEL(sioc);
 object_ref(OBJECT(server->ioc));
 qio_channel_attach_aio_context(server->ioc, server->ctx);
-qio_channel_set_blocking(QIO_CHANNEL(server->sioc), false, NULL);
+qio_channel_set_blocking(server->ioc, false, NULL);
 vu_client_start(server);
 }
 
-- 
2.26.2



[PULL v2 06/30] block/export: vhost-user block device backend server

2020-10-12 Thread Stefan Hajnoczi
From: Coiby Xu 

By making use of libvhost-user, block device drive can be shared to
the connected vhost-user client. Only one client can connect to the
server one time.

Since vhost-user-server needs a block drive to be created first, delay
the creation of this object.

Suggested-by: Kevin Wolf 
Signed-off-by: Stefan Hajnoczi 
Signed-off-by: Coiby Xu 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Marc-André Lureau 
Message-id: 20200918080912.321299-6-coiby...@gmail.com
[Shorten "vhost_user_blk_server" string to "vhost_user_blk" to avoid the
following compiler warning:
../block/export/vhost-user-blk-server.c:178:50: error: ‘%s’ directive output 
truncated writing 21 bytes into a region of size 20 [-Werror=format-truncation=]
--Stefan]
Signed-off-by: Stefan Hajnoczi 
---
 block/export/vhost-user-blk-server.h |  36 ++
 block/export/vhost-user-blk-server.c | 661 +++
 softmmu/vl.c |   4 +
 block/meson.build|   1 +
 4 files changed, 702 insertions(+)
 create mode 100644 block/export/vhost-user-blk-server.h
 create mode 100644 block/export/vhost-user-blk-server.c

diff --git a/block/export/vhost-user-blk-server.h 
b/block/export/vhost-user-blk-server.h
new file mode 100644
index 00..f06f37c4c8
--- /dev/null
+++ b/block/export/vhost-user-blk-server.h
@@ -0,0 +1,36 @@
+/*
+ * Sharing QEMU block devices via vhost-user protocal
+ *
+ * Copyright (c) Coiby Xu .
+ * Copyright (c) 2020 Red Hat, Inc.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ */
+
+#ifndef VHOST_USER_BLK_SERVER_H
+#define VHOST_USER_BLK_SERVER_H
+#include "util/vhost-user-server.h"
+
+typedef struct VuBlockDev VuBlockDev;
+#define TYPE_VHOST_USER_BLK_SERVER "vhost-user-blk-server"
+#define VHOST_USER_BLK_SERVER(obj) \
+   OBJECT_CHECK(VuBlockDev, obj, TYPE_VHOST_USER_BLK_SERVER)
+
+/* vhost user block device */
+struct VuBlockDev {
+Object parent_obj;
+char *node_name;
+SocketAddress *addr;
+AioContext *ctx;
+VuServer vu_server;
+bool running;
+uint32_t blk_size;
+BlockBackend *backend;
+QIOChannelSocket *sioc;
+QTAILQ_ENTRY(VuBlockDev) next;
+struct virtio_blk_config blkcfg;
+bool writable;
+};
+
+#endif /* VHOST_USER_BLK_SERVER_H */
diff --git a/block/export/vhost-user-blk-server.c 
b/block/export/vhost-user-blk-server.c
new file mode 100644
index 00..fb7764a730
--- /dev/null
+++ b/block/export/vhost-user-blk-server.c
@@ -0,0 +1,661 @@
+/*
+ * Sharing QEMU block devices via vhost-user protocal
+ *
+ * Parts of the code based on nbd/server.c.
+ *
+ * Copyright (c) Coiby Xu .
+ * Copyright (c) 2020 Red Hat, Inc.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ */
+#include "qemu/osdep.h"
+#include "block/block.h"
+#include "vhost-user-blk-server.h"
+#include "qapi/error.h"
+#include "qom/object_interfaces.h"
+#include "sysemu/block-backend.h"
+#include "util/block-helpers.h"
+
+enum {
+VHOST_USER_BLK_MAX_QUEUES = 1,
+};
+struct virtio_blk_inhdr {
+unsigned char status;
+};
+
+typedef struct VuBlockReq {
+VuVirtqElement *elem;
+int64_t sector_num;
+size_t size;
+struct virtio_blk_inhdr *in;
+struct virtio_blk_outhdr out;
+VuServer *server;
+struct VuVirtq *vq;
+} VuBlockReq;
+
+static void vu_block_req_complete(VuBlockReq *req)
+{
+VuDev *vu_dev = >server->vu_dev;
+
+/* IO size with 1 extra status byte */
+vu_queue_push(vu_dev, req->vq, req->elem, req->size + 1);
+vu_queue_notify(vu_dev, req->vq);
+
+if (req->elem) {
+free(req->elem);
+}
+
+g_free(req);
+}
+
+static VuBlockDev *get_vu_block_device_by_server(VuServer *server)
+{
+return container_of(server, VuBlockDev, vu_server);
+}
+
+static int coroutine_fn
+vu_block_discard_write_zeroes(VuBlockReq *req, struct iovec *iov,
+  uint32_t iovcnt, uint32_t type)
+{
+struct virtio_blk_discard_write_zeroes desc;
+ssize_t size = iov_to_buf(iov, iovcnt, 0, , sizeof(desc));
+if (unlikely(size != sizeof(desc))) {
+error_report("Invalid size %ld, expect %ld", size, sizeof(desc));
+return -EINVAL;
+}
+
+VuBlockDev *vdev_blk = get_vu_block_device_by_server(req->server);
+uint64_t range[2] = { le64_to_cpu(desc.sector) << 9,
+  le32_to_cpu(desc.num_sectors) << 9 };
+if (type == VIRTIO_BLK_T_DISCARD) {
+if (blk_co_pdiscard(vdev_blk->backend, range[0], range[1]) == 0) {
+return 0;
+}
+} else if (type == VIRTIO_BLK_T_WRITE_ZEROES) {
+if (blk_co_pwrite_zeroes(vdev_blk->backend,
+ range[0], range[1], 0) == 0) {
+return 0;
+}
+}
+
+return -EINVAL;
+}
+
+static void coroutine_fn vu_block_flush(VuBlockReq *req)
+{
+VuBlockDev *vdev_blk = 

[PULL v2 08/30] MAINTAINERS: Add vhost-user block device backend server maintainer

2020-10-12 Thread Stefan Hajnoczi
From: Coiby Xu 

Suggested-by: Stefano Garzarella 
Signed-off-by: Coiby Xu 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Marc-André Lureau 
Message-id: 20200918080912.321299-8-coiby...@gmail.com
Signed-off-by: Stefan Hajnoczi 
---
 MAINTAINERS | 8 
 1 file changed, 8 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index e9d85cc873..28262319db 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3068,6 +3068,14 @@ L: qemu-block@nongnu.org
 S: Supported
 F: tests/image-fuzzer/
 
+Vhost-user block device backend server
+M: Coiby Xu 
+S: Maintained
+F: block/export/vhost-user-blk-server.c
+F: util/vhost-user-server.c
+F: tests/qtest/vhost-user-blk-test.c
+F: tests/qtest/libqos/vhost-user-blk.c
+
 Replication
 M: Wen Congyang 
 M: Xie Changlong 
-- 
2.26.2



[PULL v2 07/30] test: new qTest case to test the vhost-user-blk-server

2020-10-12 Thread Stefan Hajnoczi
From: Coiby Xu 

This test case has the same tests as tests/virtio-blk-test.c except for
tests have block_resize. Since vhost-user server can only server one
client one time, two instances of vhost-user-blk-server are started by
qemu-storage-daemon for the hotplug test.

In order to not block scripts/tap-driver.pl, vhost-user-blk-server will
send "quit" command to qemu-storage-daemon's QMP monitor. So a function
is added to libqtest.c to establish socket connection with socket
server.

Suggested-by: Thomas Huth 
Signed-off-by: Coiby Xu 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Marc-André Lureau 
Message-id: 20200918080912.321299-7-coiby...@gmail.com
[Update meson.build to only test when CONFIG_TOOLS has built
qemu-storage-daemon. This prevents CI failures with --disable-tools.
--Stefan]
Signed-off-by: Stefan Hajnoczi 
---
 tests/qtest/libqos/libqtest.h   |  17 +
 tests/qtest/libqos/vhost-user-blk.h |  48 ++
 tests/qtest/libqos/vhost-user-blk.c | 129 +
 tests/qtest/libqtest.c  |  36 +-
 tests/qtest/vhost-user-blk-test.c   | 751 
 tests/qtest/libqos/meson.build  |   1 +
 tests/qtest/meson.build |   4 +-
 7 files changed, 983 insertions(+), 3 deletions(-)
 create mode 100644 tests/qtest/libqos/vhost-user-blk.h
 create mode 100644 tests/qtest/libqos/vhost-user-blk.c
 create mode 100644 tests/qtest/vhost-user-blk-test.c

diff --git a/tests/qtest/libqos/libqtest.h b/tests/qtest/libqos/libqtest.h
index a6ee1654f2..2c20381cee 100644
--- a/tests/qtest/libqos/libqtest.h
+++ b/tests/qtest/libqos/libqtest.h
@@ -132,6 +132,23 @@ void qtest_qmp_send(QTestState *s, const char *fmt, ...)
 void qtest_qmp_send_raw(QTestState *s, const char *fmt, ...)
 GCC_FMT_ATTR(2, 3);
 
+/**
+ * qtest_socket_client:
+ * @server_socket_path: the socket server's path
+ *
+ * Connect to a socket server.
+ */
+int qtest_socket_client(char *server_socket_path);
+
+/**
+ * qtest_create_state_with_qmp_fd:
+ * @fd: socket fd
+ *
+ * Wrap socket fd in QTestState to make use of qtest_qmp*
+ * functions
+ */
+QTestState *qtest_create_state_with_qmp_fd(int fd);
+
 /**
  * qtest_vqmp_fds:
  * @s: #QTestState instance to operate on.
diff --git a/tests/qtest/libqos/vhost-user-blk.h 
b/tests/qtest/libqos/vhost-user-blk.h
new file mode 100644
index 00..2a03456a45
--- /dev/null
+++ b/tests/qtest/libqos/vhost-user-blk.h
@@ -0,0 +1,48 @@
+/*
+ * libqos driver framework
+ *
+ * Based on tests/qtest/libqos/virtio-blk.c
+ *
+ * Copyright (c) 2020 Coiby Xu 
+ *
+ * Copyright (c) 2018 Emanuele Giuseppe Esposito 
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License version 2 as published by the Free Software Foundation.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see 
+ */
+
+#ifndef TESTS_LIBQOS_VHOST_USER_BLK_H
+#define TESTS_LIBQOS_VHOST_USER_BLK_H
+
+#include "qgraph.h"
+#include "virtio.h"
+#include "virtio-pci.h"
+
+typedef struct QVhostUserBlk QVhostUserBlk;
+typedef struct QVhostUserBlkPCI QVhostUserBlkPCI;
+typedef struct QVhostUserBlkDevice QVhostUserBlkDevice;
+
+struct QVhostUserBlk {
+QVirtioDevice *vdev;
+};
+
+struct QVhostUserBlkPCI {
+QVirtioPCIDevice pci_vdev;
+QVhostUserBlk blk;
+};
+
+struct QVhostUserBlkDevice {
+QOSGraphObject obj;
+QVhostUserBlk blk;
+};
+
+#endif
diff --git a/tests/qtest/libqos/vhost-user-blk.c 
b/tests/qtest/libqos/vhost-user-blk.c
new file mode 100644
index 00..58c7e1eb69
--- /dev/null
+++ b/tests/qtest/libqos/vhost-user-blk.c
@@ -0,0 +1,129 @@
+/*
+ * libqos driver framework
+ *
+ * Based on tests/qtest/libqos/virtio-blk.c
+ *
+ * Copyright (c) 2020 Coiby Xu 
+ *
+ * Copyright (c) 2018 Emanuele Giuseppe Esposito 
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License version 2.1 as published by the Free Software Foundation.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see 
+ */
+
+#include "qemu/osdep.h"
+#include "libqtest.h"
+#include "qemu/module.h"
+#include "standard-headers/linux/virtio_blk.h"
+#include "vhost-user-blk.h"
+
+#define PCI_SLOT0x04
+#define PCI_FN  0x00
+
+/* 

[PULL v2 04/30] util/vhost-user-server: generic vhost user server

2020-10-12 Thread Stefan Hajnoczi
From: Coiby Xu 

Sharing QEMU devices via vhost-user protocol.

Only one vhost-user client can connect to the server one time.

Suggested-by: Kevin Wolf 
Signed-off-by: Stefan Hajnoczi 
Signed-off-by: Coiby Xu 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Marc-André Lureau 
Message-id: 20200918080912.321299-4-coiby...@gmail.com
Signed-off-by: Stefan Hajnoczi 
---
 util/vhost-user-server.h |  65 ++
 util/vhost-user-server.c | 428 +++
 util/meson.build |   1 +
 3 files changed, 494 insertions(+)
 create mode 100644 util/vhost-user-server.h
 create mode 100644 util/vhost-user-server.c

diff --git a/util/vhost-user-server.h b/util/vhost-user-server.h
new file mode 100644
index 00..5232f96718
--- /dev/null
+++ b/util/vhost-user-server.h
@@ -0,0 +1,65 @@
+/*
+ * Sharing QEMU devices via vhost-user protocol
+ *
+ * Copyright (c) Coiby Xu .
+ * Copyright (c) 2020 Red Hat, Inc.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ */
+
+#ifndef VHOST_USER_SERVER_H
+#define VHOST_USER_SERVER_H
+
+#include "contrib/libvhost-user/libvhost-user.h"
+#include "io/channel-socket.h"
+#include "io/channel-file.h"
+#include "io/net-listener.h"
+#include "qemu/error-report.h"
+#include "qapi/error.h"
+#include "standard-headers/linux/virtio_blk.h"
+
+typedef struct VuFdWatch {
+VuDev *vu_dev;
+int fd; /*kick fd*/
+void *pvt;
+vu_watch_cb cb;
+bool processing;
+QTAILQ_ENTRY(VuFdWatch) next;
+} VuFdWatch;
+
+typedef struct VuServer VuServer;
+typedef void DevicePanicNotifierFn(VuServer *server);
+
+struct VuServer {
+QIONetListener *listener;
+AioContext *ctx;
+DevicePanicNotifierFn *device_panic_notifier;
+int max_queues;
+const VuDevIface *vu_iface;
+VuDev vu_dev;
+QIOChannel *ioc; /* The I/O channel with the client */
+QIOChannelSocket *sioc; /* The underlying data channel with the client */
+/* IOChannel for fd provided via VHOST_USER_SET_SLAVE_REQ_FD */
+QIOChannel *ioc_slave;
+QIOChannelSocket *sioc_slave;
+Coroutine *co_trip; /* coroutine for processing VhostUserMsg */
+QTAILQ_HEAD(, VuFdWatch) vu_fd_watches;
+/* restart coroutine co_trip if AIOContext is changed */
+bool aio_context_changed;
+bool processing_msg;
+};
+
+bool vhost_user_server_start(VuServer *server,
+ SocketAddress *unix_socket,
+ AioContext *ctx,
+ uint16_t max_queues,
+ DevicePanicNotifierFn *device_panic_notifier,
+ const VuDevIface *vu_iface,
+ Error **errp);
+
+void vhost_user_server_stop(VuServer *server);
+
+void vhost_user_server_set_aio_context(VuServer *server, AioContext *ctx);
+
+#endif /* VHOST_USER_SERVER_H */
diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c
new file mode 100644
index 00..7b50a2b1fd
--- /dev/null
+++ b/util/vhost-user-server.c
@@ -0,0 +1,428 @@
+/*
+ * Sharing QEMU devices via vhost-user protocol
+ *
+ * Copyright (c) Coiby Xu .
+ * Copyright (c) 2020 Red Hat, Inc.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ */
+#include "qemu/osdep.h"
+#include "qemu/main-loop.h"
+#include "vhost-user-server.h"
+
+static void vmsg_close_fds(VhostUserMsg *vmsg)
+{
+int i;
+for (i = 0; i < vmsg->fd_num; i++) {
+close(vmsg->fds[i]);
+}
+}
+
+static void vmsg_unblock_fds(VhostUserMsg *vmsg)
+{
+int i;
+for (i = 0; i < vmsg->fd_num; i++) {
+qemu_set_nonblock(vmsg->fds[i]);
+}
+}
+
+static void vu_accept(QIONetListener *listener, QIOChannelSocket *sioc,
+  gpointer opaque);
+
+static void close_client(VuServer *server)
+{
+/*
+ * Before closing the client
+ *
+ * 1. Let vu_client_trip stop processing new vhost-user msg
+ *
+ * 2. remove kick_handler
+ *
+ * 3. wait for the kick handler to be finished
+ *
+ * 4. wait for the current vhost-user msg to be finished processing
+ */
+
+QIOChannelSocket *sioc = server->sioc;
+/* When this is set vu_client_trip will stop new processing vhost-user 
message */
+server->sioc = NULL;
+
+VuFdWatch *vu_fd_watch, *next;
+QTAILQ_FOREACH_SAFE(vu_fd_watch, >vu_fd_watches, next, next) {
+aio_set_fd_handler(server->ioc->ctx, vu_fd_watch->fd, true, NULL,
+   NULL, NULL, NULL);
+}
+
+while (!QTAILQ_EMPTY(>vu_fd_watches)) {
+QTAILQ_FOREACH_SAFE(vu_fd_watch, >vu_fd_watches, next, next) {
+if (!vu_fd_watch->processing) {
+QTAILQ_REMOVE(>vu_fd_watches, vu_fd_watch, next);
+g_free(vu_fd_watch);
+}
+}
+}
+
+while (server->processing_msg) {
+if 

[PULL v2 00/30] Block patches

2020-10-12 Thread Stefan Hajnoczi
The following changes since commit 2387df497b4b4bcf754eb7398edca82889e2ef54:

  Merge remote-tracking branch 'remotes/armbru/tags/pull-qapi-2020-10-10' into 
staging (2020-10-12 11:29:42 +0100)

are available in the Git repository at:

  https://gitlab.com/stefanha/qemu.git tags/block-pull-request

for you to fetch changes up to 3664ec6bbe236126b79d251d4037889e7181ab55:

  iotests: add commit top->base cases to 274 (2020-10-12 16:47:58 +0100)


Pull request

v2:
 * Rebase and resolve conflict with commit 029a88c9a7e3 ("qemu-nbd: Honor
   SIGINT and SIGHUP") [Peter]



Coiby Xu (7):
  libvhost-user: Allow vu_message_read to be replaced
  libvhost-user: remove watch for kick_fd when de-initialize vu-dev
  util/vhost-user-server: generic vhost user server
  block: move logical block size check function to a common utility
function
  block/export: vhost-user block device backend server
  test: new qTest case to test the vhost-user-blk-server
  MAINTAINERS: Add vhost-user block device backend server maintainer

Philippe Mathieu-Daudé (1):
  block/nvme: Add driver statistics for access alignment and hw errors

Stefan Hajnoczi (17):
  util/vhost-user-server: s/fileds/fields/ typo fix
  util/vhost-user-server: drop unnecessary QOM cast
  util/vhost-user-server: drop unnecessary watch deletion
  block/export: consolidate request structs into VuBlockReq
  util/vhost-user-server: drop unused DevicePanicNotifier
  util/vhost-user-server: fix memory leak in vu_message_read()
  util/vhost-user-server: check EOF when reading payload
  util/vhost-user-server: rework vu_client_trip() coroutine lifecycle
  block/export: report flush errors
  block/export: convert vhost-user-blk server to block export API
  util/vhost-user-server: move header to include/
  util/vhost-user-server: use static library in meson.build
  qemu-storage-daemon: avoid compiling blockdev_ss twice
  block: move block exports to libblockdev
  block/export: add iothread and fixed-iothread options
  block/export: add vhost-user-blk multi-queue support
  tests/qtest: add multi-queue test case to vhost-user-blk-test

Vladimir Sementsov-Ogievskiy (5):
  block/io: fix bdrv_co_block_status_above
  block/io: bdrv_common_block_status_above: support include_base
  block/io: bdrv_common_block_status_above: support bs == base
  block/io: fix bdrv_is_allocated_above
  iotests: add commit top->base cases to 274

 MAINTAINERS|  10 +
 qapi/block-core.json   |  24 +-
 qapi/block-export.json |  36 +-
 block/coroutines.h |   2 +
 block/export/vhost-user-blk-server.h   |  19 +
 contrib/libvhost-user/libvhost-user.h  |  21 +
 include/qemu/vhost-user-server.h   |  65 ++
 tests/qtest/libqos/libqtest.h  |  17 +
 tests/qtest/libqos/vhost-user-blk.h|  48 ++
 util/block-helpers.h   |  19 +
 block/export/export.c  |  37 +-
 block/export/vhost-user-blk-server.c   | 431 +++
 block/io.c | 132 ++--
 block/nvme.c   |  27 +
 block/qcow2.c  |  16 +-
 contrib/libvhost-user/libvhost-user-glib.c |   2 +-
 contrib/libvhost-user/libvhost-user.c  |  15 +-
 hw/core/qdev-properties-system.c   |  31 +-
 nbd/server.c   |   2 -
 qemu-nbd.c |  25 +-
 softmmu/vl.c   |   4 +
 stubs/blk-exp-close-all.c  |   7 +
 tests/qtest/libqos/vhost-user-blk.c| 129 
 tests/qtest/libqtest.c |  36 +-
 tests/qtest/vhost-user-blk-test.c  | 822 +
 tests/vhost-user-bridge.c  |   2 +
 tools/virtiofsd/fuse_virtio.c  |   4 +-
 util/block-helpers.c   |  46 ++
 util/vhost-user-server.c   | 446 +++
 block/export/meson.build   |   3 +-
 contrib/libvhost-user/meson.build  |   1 +
 meson.build|  22 +-
 nbd/meson.build|   2 +
 storage-daemon/meson.build |   3 +-
 stubs/meson.build  |   1 +
 tests/qemu-iotests/274 |  20 +
 tests/qemu-iotests/274.out |  68 ++
 tests/qtest/libqos/meson.build |   1 +
 tests/qtest/meson.build|   4 +-
 util/meson.build   |   4 +
 40 files changed, 2476 insertions(+), 128 deletions(-)
 create mode 100644 block/export/vhost-user-blk-server.h
 create mode 100644 include/qemu/vhost-user-server.h
 create mode 100644 tests/qtest/libqos/vhost-user-blk.h
 create mode 100644 util/block-helpers.h
 create mode 100644 block/export/vhost-user-blk-server.c
 create 

[PULL v2 03/30] libvhost-user: remove watch for kick_fd when de-initialize vu-dev

2020-10-12 Thread Stefan Hajnoczi
From: Coiby Xu 

When the client is running in gdb and quit command is run in gdb,
QEMU will still dispatch the event which will cause segment fault in
the callback function.

Signed-off-by: Coiby Xu 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Marc-André Lureau 
Message-id: 20200918080912.321299-3-coiby...@gmail.com
Signed-off-by: Stefan Hajnoczi 
---
 contrib/libvhost-user/libvhost-user.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/contrib/libvhost-user/libvhost-user.c 
b/contrib/libvhost-user/libvhost-user.c
index 09bdff18f3..bfec8a881a 100644
--- a/contrib/libvhost-user/libvhost-user.c
+++ b/contrib/libvhost-user/libvhost-user.c
@@ -1918,6 +1918,7 @@ vu_deinit(VuDev *dev)
 }
 
 if (vq->kick_fd != -1) {
+dev->remove_watch(dev, vq->kick_fd);
 close(vq->kick_fd);
 vq->kick_fd = -1;
 }
-- 
2.26.2



[PULL v2 02/30] libvhost-user: Allow vu_message_read to be replaced

2020-10-12 Thread Stefan Hajnoczi
From: Coiby Xu 

Allow vu_message_read to be replaced by one which will make use of the
QIOChannel functions. Thus reading vhost-user message won't stall the
guest. For slave channel, we still use the default vu_message_read.

Reviewed-by: Marc-André Lureau 
Signed-off-by: Coiby Xu 
Reviewed-by: Stefan Hajnoczi 
Message-id: 20200918080912.321299-2-coiby...@gmail.com
Signed-off-by: Stefan Hajnoczi 
---
 contrib/libvhost-user/libvhost-user.h  | 21 +
 contrib/libvhost-user/libvhost-user-glib.c |  2 +-
 contrib/libvhost-user/libvhost-user.c  | 14 +++---
 tests/vhost-user-bridge.c  |  2 ++
 tools/virtiofsd/fuse_virtio.c  |  4 ++--
 5 files changed, 33 insertions(+), 10 deletions(-)

diff --git a/contrib/libvhost-user/libvhost-user.h 
b/contrib/libvhost-user/libvhost-user.h
index 287ac5fec7..3bbeae8587 100644
--- a/contrib/libvhost-user/libvhost-user.h
+++ b/contrib/libvhost-user/libvhost-user.h
@@ -36,6 +36,8 @@
  */
 #define VHOST_USER_MAX_RAM_SLOTS 32
 
+#define VHOST_USER_HDR_SIZE offsetof(VhostUserMsg, payload.u64)
+
 typedef enum VhostSetConfigType {
 VHOST_SET_CONFIG_TYPE_MASTER = 0,
 VHOST_SET_CONFIG_TYPE_MIGRATION = 1,
@@ -221,6 +223,7 @@ typedef uint64_t (*vu_get_features_cb) (VuDev *dev);
 typedef void (*vu_set_features_cb) (VuDev *dev, uint64_t features);
 typedef int (*vu_process_msg_cb) (VuDev *dev, VhostUserMsg *vmsg,
   int *do_reply);
+typedef bool (*vu_read_msg_cb) (VuDev *dev, int sock, VhostUserMsg *vmsg);
 typedef void (*vu_queue_set_started_cb) (VuDev *dev, int qidx, bool started);
 typedef bool (*vu_queue_is_processed_in_order_cb) (VuDev *dev, int qidx);
 typedef int (*vu_get_config_cb) (VuDev *dev, uint8_t *config, uint32_t len);
@@ -389,6 +392,23 @@ struct VuDev {
 bool broken;
 uint16_t max_queues;
 
+/* @read_msg: custom method to read vhost-user message
+ *
+ * Read data from vhost_user socket fd and fill up
+ * the passed VhostUserMsg *vmsg struct.
+ *
+ * If reading fails, it should close the received set of file
+ * descriptors as socket message's auxiliary data.
+ *
+ * For the details, please refer to vu_message_read in libvhost-user.c
+ * which will be used by default if not custom method is provided when
+ * calling vu_init
+ *
+ * Returns: true if vhost-user message successfully received,
+ *  otherwise return false.
+ *
+ */
+vu_read_msg_cb read_msg;
 /* @set_watch: add or update the given fd to the watch set,
  * call cb when condition is met */
 vu_set_watch_cb set_watch;
@@ -432,6 +452,7 @@ bool vu_init(VuDev *dev,
  uint16_t max_queues,
  int socket,
  vu_panic_cb panic,
+ vu_read_msg_cb read_msg,
  vu_set_watch_cb set_watch,
  vu_remove_watch_cb remove_watch,
  const VuDevIface *iface);
diff --git a/contrib/libvhost-user/libvhost-user-glib.c 
b/contrib/libvhost-user/libvhost-user-glib.c
index 53f1ca4cdd..0df2ec9271 100644
--- a/contrib/libvhost-user/libvhost-user-glib.c
+++ b/contrib/libvhost-user/libvhost-user-glib.c
@@ -147,7 +147,7 @@ vug_init(VugDev *dev, uint16_t max_queues, int socket,
 g_assert(dev);
 g_assert(iface);
 
-if (!vu_init(>parent, max_queues, socket, panic, set_watch,
+if (!vu_init(>parent, max_queues, socket, panic, NULL, set_watch,
  remove_watch, iface)) {
 return false;
 }
diff --git a/contrib/libvhost-user/libvhost-user.c 
b/contrib/libvhost-user/libvhost-user.c
index 9f1285b8a1..09bdff18f3 100644
--- a/contrib/libvhost-user/libvhost-user.c
+++ b/contrib/libvhost-user/libvhost-user.c
@@ -68,8 +68,6 @@
 /* The version of inflight buffer */
 #define INFLIGHT_VERSION 1
 
-#define VHOST_USER_HDR_SIZE offsetof(VhostUserMsg, payload.u64)
-
 /* The version of the protocol we support */
 #define VHOST_USER_VERSION 1
 #define LIBVHOST_USER_DEBUG 0
@@ -268,7 +266,7 @@ have_userfault(void)
 }
 
 static bool
-vu_message_read(VuDev *dev, int conn_fd, VhostUserMsg *vmsg)
+vu_message_read_default(VuDev *dev, int conn_fd, VhostUserMsg *vmsg)
 {
 char control[CMSG_SPACE(VHOST_MEMORY_BASELINE_NREGIONS * sizeof(int))] = 
{};
 struct iovec iov = {
@@ -416,7 +414,7 @@ vu_process_message_reply(VuDev *dev, const VhostUserMsg 
*vmsg)
 goto out;
 }
 
-if (!vu_message_read(dev, dev->slave_fd, _reply)) {
+if (!vu_message_read_default(dev, dev->slave_fd, _reply)) {
 goto out;
 }
 
@@ -907,7 +905,7 @@ vu_set_mem_table_exec_postcopy(VuDev *dev, VhostUserMsg 
*vmsg)
 /* Wait for QEMU to confirm that it's registered the handler for the
  * faults.
  */
-if (!vu_message_read(dev, dev->sock, vmsg) ||
+if (!dev->read_msg(dev, dev->sock, vmsg) ||
 vmsg->size != sizeof(vmsg->payload.u64) ||
 vmsg->payload.u64 != 0) {
 vu_panic(dev, "failed to receive valid ack for postcopy 

[PULL v2 01/30] block/nvme: Add driver statistics for access alignment and hw errors

2020-10-12 Thread Stefan Hajnoczi
From: Philippe Mathieu-Daudé 

Keep statistics of some hardware errors, and number of
aligned/unaligned I/O accesses.

QMP example booting a full RHEL 8.3 aarch64 guest:

{ "execute": "query-blockstats" }
{
"return": [
{
"device": "",
"node-name": "drive0",
"stats": {
"flush_total_time_ns": 6026948,
"wr_highest_offset": 3383991230464,
"wr_total_time_ns": 807450995,
"failed_wr_operations": 0,
"failed_rd_operations": 0,
"wr_merged": 3,
"wr_bytes": 50133504,
"failed_unmap_operations": 0,
"failed_flush_operations": 0,
"account_invalid": false,
"rd_total_time_ns": 1846979900,
"flush_operations": 130,
"wr_operations": 659,
"rd_merged": 1192,
"rd_bytes": 218244096,
"account_failed": false,
"idle_time_ns": 2678641497,
"rd_operations": 7406,
},
"driver-specific": {
"driver": "nvme",
"completion-errors": 0,
"unaligned-accesses": 2959,
"aligned-accesses": 4477
},
"qdev": "/machine/peripheral-anon/device[0]/virtio-backend"
}
]
}

Suggested-by: Stefan Hajnoczi 
Signed-off-by: Philippe Mathieu-Daudé 
Acked-by: Markus Armbruster 
Message-id: 20201001162939.1567915-1-phi...@redhat.com
Signed-off-by: Stefan Hajnoczi 
---
 qapi/block-core.json | 24 +++-
 block/nvme.c | 27 +++
 2 files changed, 50 insertions(+), 1 deletion(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index ee5ebef7f2..e00fc27b5e 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -947,6 +947,27 @@
   'discard-nb-failed': 'uint64',
   'discard-bytes-ok': 'uint64' } }
 
+##
+# @BlockStatsSpecificNvme:
+#
+# NVMe driver statistics
+#
+# @completion-errors: The number of completion errors.
+#
+# @aligned-accesses: The number of aligned accesses performed by
+#the driver.
+#
+# @unaligned-accesses: The number of unaligned accesses performed by
+#  the driver.
+#
+# Since: 5.2
+##
+{ 'struct': 'BlockStatsSpecificNvme',
+  'data': {
+  'completion-errors': 'uint64',
+  'aligned-accesses': 'uint64',
+  'unaligned-accesses': 'uint64' } }
+
 ##
 # @BlockStatsSpecific:
 #
@@ -959,7 +980,8 @@
   'discriminator': 'driver',
   'data': {
   'file': 'BlockStatsSpecificFile',
-  'host_device': 'BlockStatsSpecificFile' } }
+  'host_device': 'BlockStatsSpecificFile',
+  'nvme': 'BlockStatsSpecificNvme' } }
 
 ##
 # @BlockStats:
diff --git a/block/nvme.c b/block/nvme.c
index b48f6f2588..739a0a700c 100644
--- a/block/nvme.c
+++ b/block/nvme.c
@@ -128,6 +128,12 @@ struct BDRVNVMeState {
 
 /* PCI address (required for nvme_refresh_filename()) */
 char *device;
+
+struct {
+uint64_t completion_errors;
+uint64_t aligned_accesses;
+uint64_t unaligned_accesses;
+} stats;
 };
 
 #define NVME_BLOCK_OPT_DEVICE "device"
@@ -384,6 +390,9 @@ static bool nvme_process_completion(NVMeQueuePair *q)
 break;
 }
 ret = nvme_translate_error(c);
+if (ret) {
+s->stats.completion_errors++;
+}
 q->cq.head = (q->cq.head + 1) % NVME_QUEUE_SIZE;
 if (!q->cq.head) {
 q->cq_phase = !q->cq_phase;
@@ -1155,8 +1164,10 @@ static int nvme_co_prw(BlockDriverState *bs, uint64_t 
offset, uint64_t bytes,
 assert(QEMU_IS_ALIGNED(bytes, s->page_size));
 assert(bytes <= s->max_transfer);
 if (nvme_qiov_aligned(bs, qiov)) {
+s->stats.aligned_accesses++;
 return nvme_co_prw_aligned(bs, offset, bytes, qiov, is_write, flags);
 }
+s->stats.unaligned_accesses++;
 trace_nvme_prw_buffered(s, offset, bytes, qiov->niov, is_write);
 buf = qemu_try_memalign(s->page_size, bytes);
 
@@ -1452,6 +1463,21 @@ static void nvme_unregister_buf(BlockDriverState *bs, 
void *host)
 qemu_vfio_dma_unmap(s->vfio, host);
 }
 
+static BlockStatsSpecific *nvme_get_specific_stats(BlockDriverState *bs)
+{
+BlockStatsSpecific *stats = g_new(BlockStatsSpecific, 1);
+BDRVNVMeState *s = bs->opaque;
+
+stats->driver = BLOCKDEV_DRIVER_NVME;
+stats->u.nvme = (BlockStatsSpecificNvme) {
+.completion_errors = s->stats.completion_errors,
+.aligned_accesses = s->stats.aligned_accesses,
+.unaligned_accesses = s->stats.unaligned_accesses,
+};
+
+return stats;
+}
+
 static const char *const nvme_strong_runtime_opts[] = {
 NVME_BLOCK_OPT_DEVICE,
 NVME_BLOCK_OPT_NAMESPACE,
@@ -1485,6 +1511,7 @@ static BlockDriver bdrv_nvme = {
 .bdrv_refresh_filename= nvme_refresh_filename,
 .bdrv_refresh_limits  = nvme_refresh_limits,
   

[PATCH v11 13/13] block: apply COR-filter to block-stream jobs

2020-10-12 Thread Andrey Shinkevich via
This patch completes the series with the COR-filter insertion for
block-stream operations. Adding the filter makes it possible for copied
regions to be discarded in backing files during the block-stream job,
what will reduce the disk overuse.
The COR-filter insertion incurs changes in the iotests case
245:test_block_stream_4 that reopens the backing chain during a
block-stream job. There are changes in the iotests #030 as well.
The iotests case 030:test_stream_parallel was deleted due to multiple
conflicts between the concurrent job operations over the same backing
chain. The base backing node for one job is the top node for another
job. It may change due to the filter node inserted into the backing
chain while both jobs are running. Another issue is that the parts of
the backing chain are being frozen by the running job and may not be
changed by the concurrent job when needed. The concept of the parallel
jobs with common nodes is considered vital no more.

Signed-off-by: Andrey Shinkevich 
---
 block/stream.c | 93 +-
 tests/qemu-iotests/030 | 51 +++--
 tests/qemu-iotests/030.out |  4 +-
 tests/qemu-iotests/141.out |  2 +-
 tests/qemu-iotests/245 | 19 +++---
 5 files changed, 81 insertions(+), 88 deletions(-)

diff --git a/block/stream.c b/block/stream.c
index d3e1812..93564db 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -17,8 +17,10 @@
 #include "block/blockjob_int.h"
 #include "qapi/error.h"
 #include "qapi/qmp/qerror.h"
+#include "qapi/qmp/qdict.h"
 #include "qemu/ratelimit.h"
 #include "sysemu/block-backend.h"
+#include "block/copy-on-read.h"
 
 enum {
 /*
@@ -33,6 +35,8 @@ typedef struct StreamBlockJob {
 BlockJob common;
 BlockDriverState *base_overlay; /* COW overlay (stream from this) */
 BlockDriverState *above_base;   /* Node directly above the base */
+BlockDriverState *cor_filter_bs;
+BlockDriverState *target_bs;
 BlockdevOnError on_error;
 bool bs_read_only;
 bool chain_frozen;
@@ -43,8 +47,7 @@ static int coroutine_fn stream_populate(BlockBackend *blk,
 {
 assert(bytes < SIZE_MAX);
 
-return blk_co_preadv(blk, offset, bytes, NULL,
- BDRV_REQ_COPY_ON_READ | BDRV_REQ_PREFETCH);
+return blk_co_preadv(blk, offset, bytes, NULL, BDRV_REQ_PREFETCH);
 }
 
 static void stream_abort(Job *job)
@@ -52,23 +55,20 @@ static void stream_abort(Job *job)
 StreamBlockJob *s = container_of(job, StreamBlockJob, common.job);
 
 if (s->chain_frozen) {
-BlockJob *bjob = >common;
-bdrv_unfreeze_backing_chain(blk_bs(bjob->blk), s->above_base);
+bdrv_unfreeze_backing_chain(s->cor_filter_bs, s->above_base);
 }
 }
 
 static int stream_prepare(Job *job)
 {
 StreamBlockJob *s = container_of(job, StreamBlockJob, common.job);
-BlockJob *bjob = >common;
-BlockDriverState *bs = blk_bs(bjob->blk);
-BlockDriverState *unfiltered_bs = bdrv_skip_filters(bs);
+BlockDriverState *unfiltered_bs = bdrv_skip_filters(s->target_bs);
 BlockDriverState *base = bdrv_filter_or_cow_bs(s->above_base);
 BlockDriverState *base_unfiltered = bdrv_skip_filters(base);
 Error *local_err = NULL;
 int ret = 0;
 
-bdrv_unfreeze_backing_chain(bs, s->above_base);
+bdrv_unfreeze_backing_chain(s->cor_filter_bs, s->above_base);
 s->chain_frozen = false;
 
 if (bdrv_cow_child(unfiltered_bs)) {
@@ -94,13 +94,14 @@ static void stream_clean(Job *job)
 {
 StreamBlockJob *s = container_of(job, StreamBlockJob, common.job);
 BlockJob *bjob = >common;
-BlockDriverState *bs = blk_bs(bjob->blk);
+
+bdrv_cor_filter_drop(s->cor_filter_bs);
 
 /* Reopen the image back in read-only mode if necessary */
 if (s->bs_read_only) {
 /* Give up write permissions before making it read-only */
 blk_set_perm(bjob->blk, 0, BLK_PERM_ALL, _abort);
-bdrv_reopen_set_read_only(bs, true, NULL);
+bdrv_reopen_set_read_only(s->target_bs, true, NULL);
 }
 }
 
@@ -108,9 +109,7 @@ static int coroutine_fn stream_run(Job *job, Error **errp)
 {
 StreamBlockJob *s = container_of(job, StreamBlockJob, common.job);
 BlockBackend *blk = s->common.blk;
-BlockDriverState *bs = blk_bs(blk);
-BlockDriverState *unfiltered_bs = bdrv_skip_filters(bs);
-bool enable_cor = !bdrv_cow_child(s->base_overlay);
+BlockDriverState *unfiltered_bs = bdrv_skip_filters(s->target_bs);
 int64_t len;
 int64_t offset = 0;
 uint64_t delay_ns = 0;
@@ -122,21 +121,12 @@ static int coroutine_fn stream_run(Job *job, Error **errp)
 return 0;
 }
 
-len = bdrv_getlength(bs);
+len = bdrv_getlength(s->target_bs);
 if (len < 0) {
 return len;
 }
 job_progress_set_remaining(>common.job, len);
 
-/* Turn on copy-on-read for the whole block device so that guest read
- * requests help us make progress.  Only do this when copying the entire
- * 

[PATCH v11 12/13] stream: remove unused backing-file name parameter

2020-10-12 Thread Andrey Shinkevich via
The 'backing-file' argument is not used by the block-stream job. It
designates a backing file name to set in QCOW2 image header after the
block-stream job finished. A backing file name of the node above base
is used instead.

Signed-off-by: Andrey Shinkevich 
---
 block/stream.c|  6 +-
 blockdev.c| 21 ++---
 include/block/block_int.h |  2 +-
 3 files changed, 8 insertions(+), 21 deletions(-)

diff --git a/block/stream.c b/block/stream.c
index 51462bd..d3e1812 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -34,7 +34,6 @@ typedef struct StreamBlockJob {
 BlockDriverState *base_overlay; /* COW overlay (stream from this) */
 BlockDriverState *above_base;   /* Node directly above the base */
 BlockdevOnError on_error;
-char *backing_file_str;
 bool bs_read_only;
 bool chain_frozen;
 } StreamBlockJob;
@@ -103,8 +102,6 @@ static void stream_clean(Job *job)
 blk_set_perm(bjob->blk, 0, BLK_PERM_ALL, _abort);
 bdrv_reopen_set_read_only(bs, true, NULL);
 }
-
-g_free(s->backing_file_str);
 }
 
 static int coroutine_fn stream_run(Job *job, Error **errp)
@@ -220,7 +217,7 @@ static const BlockJobDriver stream_job_driver = {
 };
 
 void stream_start(const char *job_id, BlockDriverState *bs,
-  BlockDriverState *base, const char *backing_file_str,
+  BlockDriverState *base,
   int creation_flags, int64_t speed,
   BlockdevOnError on_error,
   const char *filter_node_name,
@@ -295,7 +292,6 @@ void stream_start(const char *job_id, BlockDriverState *bs,
 
 s->base_overlay = base_overlay;
 s->above_base = above_base;
-s->backing_file_str = g_strdup(backing_file_str);
 s->bs_read_only = bs_read_only;
 s->chain_frozen = true;
 
diff --git a/blockdev.c b/blockdev.c
index d719c47..019b6e0 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2498,7 +2498,6 @@ void qmp_block_stream(bool has_job_id, const char 
*job_id, const char *device,
 BlockDriverState *base_bs = NULL;
 AioContext *aio_context;
 Error *local_err = NULL;
-const char *base_name = NULL;
 int job_flags = JOB_DEFAULT;
 
 if (!has_on_error) {
@@ -2526,7 +2525,6 @@ void qmp_block_stream(bool has_job_id, const char 
*job_id, const char *device,
 goto out;
 }
 assert(bdrv_get_aio_context(base_bs) == aio_context);
-base_name = base;
 }
 
 if (has_base_node) {
@@ -2541,7 +2539,11 @@ void qmp_block_stream(bool has_job_id, const char 
*job_id, const char *device,
 }
 assert(bdrv_get_aio_context(base_bs) == aio_context);
 bdrv_refresh_filename(base_bs);
-base_name = base_bs->filename;
+}
+
+if (has_backing_file) {
+warn_report("Use of \"backing-file\" argument is deprecated; "
+"a backing file of the node above base is used instead");
 }
 
 /* Check for op blockers in the whole chain between bs and base */
@@ -2553,17 +2555,6 @@ void qmp_block_stream(bool has_job_id, const char 
*job_id, const char *device,
 }
 }
 
-/* if we are streaming the entire chain, the result will have no backing
- * file, and specifying one is therefore an error */
-if (base_bs == NULL && has_backing_file) {
-error_setg(errp, "backing file specified, but streaming the "
- "entire chain");
-goto out;
-}
-
-/* backing_file string overrides base bs filename */
-base_name = has_backing_file ? backing_file : base_name;
-
 if (has_auto_finalize && !auto_finalize) {
 job_flags |= JOB_MANUAL_FINALIZE;
 }
@@ -2571,7 +2562,7 @@ void qmp_block_stream(bool has_job_id, const char 
*job_id, const char *device,
 job_flags |= JOB_MANUAL_DISMISS;
 }
 
-stream_start(has_job_id ? job_id : NULL, bs, base_bs, base_name,
+stream_start(has_job_id ? job_id : NULL, bs, base_bs,
  job_flags, has_speed ? speed : 0, on_error,
  filter_node_name, _err);
 if (local_err) {
diff --git a/include/block/block_int.h b/include/block/block_int.h
index a142867..4f523c3 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -1151,7 +1151,7 @@ int is_windows_drive(const char *filename);
  * BlockDriverState.
  */
 void stream_start(const char *job_id, BlockDriverState *bs,
-  BlockDriverState *base, const char *backing_file_str,
+  BlockDriverState *base,
   int creation_flags, int64_t speed,
   BlockdevOnError on_error,
   const char *filter_node_name,
-- 
1.8.3.1




[PATCH v11 11/13] stream: mark backing-file argument as deprecated

2020-10-12 Thread Andrey Shinkevich via
Whereas the block-stream job starts using a backing file name of the
base node overlay after the block-stream job completes, mark the QMP
'backing-file' argument as deprecated.

Signed-off-by: Andrey Shinkevich 
---
 docs/system/deprecated.rst | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/docs/system/deprecated.rst b/docs/system/deprecated.rst
index 8b3ab5b..7491fcf 100644
--- a/docs/system/deprecated.rst
+++ b/docs/system/deprecated.rst
@@ -285,6 +285,12 @@ details.
 The ``query-events`` command has been superseded by the more powerful
 and accurate ``query-qmp-schema`` command.
 
+``block-stream`` argument ``backing-file`` (since 5.2)
+'
+
+The argument ``backing-file`` is deprecated. QEMU uses a backing file
+name of the base node overlay after the block-stream job completes.
+
 chardev client socket with ``wait`` option (since 4.0)
 ''
 
-- 
1.8.3.1




[PATCH v11 08/13] copy-on-read: add support for BDRV_REQ_PREFETCH to COR-filter

2020-10-12 Thread Andrey Shinkevich via
Add support for the BDRV_REQ_PREFETCH flag to the supported_write_flags
of the COR-filter.

Signed-off-by: Andrey Shinkevich 
---
 block/copy-on-read.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/block/copy-on-read.c b/block/copy-on-read.c
index dfbd6ad..b136895 100644
--- a/block/copy-on-read.c
+++ b/block/copy-on-read.c
@@ -50,6 +50,7 @@ static int cor_open(BlockDriverState *bs, QDict *options, int 
flags,
 return -EINVAL;
 }
 
+bs->supported_read_flags = BDRV_REQ_PREFETCH;
 bs->supported_write_flags = BDRV_REQ_WRITE_UNCHANGED |
 (BDRV_REQ_FUA & bs->file->bs->supported_write_flags);
 
-- 
1.8.3.1




[PATCH v11 09/13] copy-on-read: skip non-guest reads if no copy needed

2020-10-12 Thread Andrey Shinkevich via
If the flag BDRV_REQ_PREFETCH was set, pass it further to the
COR-driver to skip unneeded reading. It can be taken into account for
the COR-algorithms optimization. That check is being made during the
block stream job by the moment.

Signed-off-by: Andrey Shinkevich 
---
 block/copy-on-read.c | 13 +
 block/io.c   |  3 ++-
 2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/block/copy-on-read.c b/block/copy-on-read.c
index b136895..278a11a 100644
--- a/block/copy-on-read.c
+++ b/block/copy-on-read.c
@@ -148,10 +148,15 @@ static int coroutine_fn 
cor_co_preadv_part(BlockDriverState *bs,
 }
 }
 
-ret = bdrv_co_preadv_part(bs->file, offset, n, qiov, qiov_offset,
-  local_flags);
-if (ret < 0) {
-return ret;
+if (!!(flags & BDRV_REQ_PREFETCH) &
+!(local_flags & BDRV_REQ_COPY_ON_READ)) {
+/* Skip non-guest reads if no copy needed */
+} else {
+ret = bdrv_co_preadv_part(bs->file, offset, n, qiov, qiov_offset,
+  local_flags);
+if (ret < 0) {
+return ret;
+}
 }
 
 offset += n;
diff --git a/block/io.c b/block/io.c
index 11df188..bff1808 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1512,7 +1512,8 @@ static int coroutine_fn bdrv_aligned_preadv(BdrvChild 
*child,
 
 max_bytes = ROUND_UP(MAX(0, total_bytes - offset), align);
 if (bytes <= max_bytes && bytes <= max_transfer) {
-ret = bdrv_driver_preadv(bs, offset, bytes, qiov, qiov_offset, 0);
+ret = bdrv_driver_preadv(bs, offset, bytes, qiov, qiov_offset,
+ flags & bs->supported_read_flags);
 goto out;
 }
 
-- 
1.8.3.1




[PATCH v11 10/13] stream: skip filters when writing backing file name to QCOW2 header

2020-10-12 Thread Andrey Shinkevich via
Avoid writing a filter JSON-name to QCOW2 image when the backing file
is changed after the block stream job.

Signed-off-by: Andrey Shinkevich 
---
 block/stream.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/block/stream.c b/block/stream.c
index e0540ee..51462bd 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -65,6 +65,7 @@ static int stream_prepare(Job *job)
 BlockDriverState *bs = blk_bs(bjob->blk);
 BlockDriverState *unfiltered_bs = bdrv_skip_filters(bs);
 BlockDriverState *base = bdrv_filter_or_cow_bs(s->above_base);
+BlockDriverState *base_unfiltered = bdrv_skip_filters(base);
 Error *local_err = NULL;
 int ret = 0;
 
@@ -73,10 +74,10 @@ static int stream_prepare(Job *job)
 
 if (bdrv_cow_child(unfiltered_bs)) {
 const char *base_id = NULL, *base_fmt = NULL;
-if (base) {
-base_id = s->backing_file_str;
-if (base->drv) {
-base_fmt = base->drv->format_name;
+if (base_unfiltered) {
+base_id = base_unfiltered->filename;
+if (base_unfiltered->drv) {
+base_fmt = base_unfiltered->drv->format_name;
 }
 }
 bdrv_set_backing_hd(unfiltered_bs, base, _err);
-- 
1.8.3.1




[PATCH v11 07/13] block: include supported_read_flags into BDS structure

2020-10-12 Thread Andrey Shinkevich via
Add the new member supported_read_flags to BlockDriverState structure.
It will control the BDRV_REQ_PREFETCH flag set for copy-on-read
operations.

Signed-off-by: Andrey Shinkevich 
---
 include/block/block_int.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index f782737..a142867 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -873,6 +873,10 @@ struct BlockDriverState {
 /* I/O Limits */
 BlockLimits bl;
 
+/*
+ * Flags honored during pread (so far: BDRV_REQ_PREFETCH)
+ */
+unsigned int supported_read_flags;
 /* Flags honored during pwrite (so far: BDRV_REQ_FUA,
  * BDRV_REQ_WRITE_UNCHANGED).
  * If a driver does not support BDRV_REQ_WRITE_UNCHANGED, those
-- 
1.8.3.1




[PATCH v11 06/13] block: modify the comment for BDRV_REQ_PREFETCH flag

2020-10-12 Thread Andrey Shinkevich via
Modify the comment for the flag BDRV_REQ_PREFETCH as we are going to
use it alone and pass it to the COR-filter driver for further
processing.

Signed-off-by: Andrey Shinkevich 
---
 include/block/block.h | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index 981ab5b..2b7efd1 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -71,9 +71,10 @@ typedef enum {
 BDRV_REQ_NO_FALLBACK= 0x100,
 
 /*
- * BDRV_REQ_PREFETCH may be used only together with BDRV_REQ_COPY_ON_READ
- * on read request and means that caller doesn't really need data to be
- * written to qiov parameter which may be NULL.
+ * BDRV_REQ_PREFETCH may be used together with the BDRV_REQ_COPY_ON_READ
+ * flag or when the COR-filter applied to read operations and means that
+ * caller doesn't really need data to be written to qiov parameter which
+ * may be NULL.
  */
 BDRV_REQ_PREFETCH  = 0x200,
 /* Mask of valid flags */
-- 
1.8.3.1




[PATCH v11 05/13] copy-on-read: limit COR operations to base in COR driver

2020-10-12 Thread Andrey Shinkevich via
Limit COR operations by the base node in the backing chain when the
overlay base node name is given. It will be useful for a block stream
job when the COR-filter is applied. The overlay base node is passed as
the base itself may change due to concurrent commit jobs on the same
backing chain.

Signed-off-by: Andrey Shinkevich 
---
 block/copy-on-read.c | 39 +--
 1 file changed, 37 insertions(+), 2 deletions(-)

diff --git a/block/copy-on-read.c b/block/copy-on-read.c
index c578b1b..dfbd6ad 100644
--- a/block/copy-on-read.c
+++ b/block/copy-on-read.c
@@ -122,8 +122,43 @@ static int coroutine_fn 
cor_co_preadv_part(BlockDriverState *bs,
size_t qiov_offset,
int flags)
 {
-return bdrv_co_preadv_part(bs->file, offset, bytes, qiov, qiov_offset,
-   flags | BDRV_REQ_COPY_ON_READ);
+int64_t n = 0;
+int64_t size = offset + bytes;
+int local_flags;
+int ret;
+BDRVStateCOR *state = bs->opaque;
+
+if (!state->base_overlay) {
+return bdrv_co_preadv_part(bs->file, offset, bytes, qiov, qiov_offset,
+   flags | BDRV_REQ_COPY_ON_READ);
+}
+
+while (offset < size) {
+local_flags = flags;
+
+/* In case of failure, try to copy-on-read anyway */
+ret = bdrv_is_allocated(bs->file->bs, offset, bytes, );
+if (!ret) {
+ret = bdrv_is_allocated_above(bdrv_cow_bs(bs->file->bs),
+  state->base_overlay, true, offset,
+  n, );
+if (ret) {
+local_flags |= BDRV_REQ_COPY_ON_READ;
+}
+}
+
+ret = bdrv_co_preadv_part(bs->file, offset, n, qiov, qiov_offset,
+  local_flags);
+if (ret < 0) {
+return ret;
+}
+
+offset += n;
+qiov_offset += n;
+bytes -= n;
+}
+
+return 0;
 }
 
 
-- 
1.8.3.1




[PATCH v11 02/13] copy-on-read: add filter append/drop functions

2020-10-12 Thread Andrey Shinkevich via
Provide API for the COR-filter insertion/removal.
Also, drop the filter child permissions for an inactive state when the
filter node is being removed.

Signed-off-by: Andrey Shinkevich 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
---
 block/copy-on-read.c | 88 
 block/copy-on-read.h | 35 +
 2 files changed, 123 insertions(+)
 create mode 100644 block/copy-on-read.h

diff --git a/block/copy-on-read.c b/block/copy-on-read.c
index cb03e0f..bcccf0f 100644
--- a/block/copy-on-read.c
+++ b/block/copy-on-read.c
@@ -23,11 +23,21 @@
 #include "qemu/osdep.h"
 #include "block/block_int.h"
 #include "qemu/module.h"
+#include "qapi/error.h"
+#include "qapi/qmp/qdict.h"
+#include "block/copy-on-read.h"
+
+
+typedef struct BDRVStateCOR {
+bool active;
+} BDRVStateCOR;
 
 
 static int cor_open(BlockDriverState *bs, QDict *options, int flags,
 Error **errp)
 {
+BDRVStateCOR *state = bs->opaque;
+
 bs->file = bdrv_open_child(NULL, options, "file", bs, _of_bds,
BDRV_CHILD_FILTERED | BDRV_CHILD_PRIMARY,
false, errp);
@@ -42,6 +52,13 @@ static int cor_open(BlockDriverState *bs, QDict *options, 
int flags,
 ((BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP | BDRV_REQ_NO_FALLBACK) &
 bs->file->bs->supported_zero_flags);
 
+state->active = true;
+
+/*
+ * We don't need to call bdrv_child_refresh_perms() now as the permissions
+ * will be updated later when the filter node gets its parent.
+ */
+
 return 0;
 }
 
@@ -57,6 +74,17 @@ static void cor_child_perm(BlockDriverState *bs, BdrvChild 
*c,
uint64_t perm, uint64_t shared,
uint64_t *nperm, uint64_t *nshared)
 {
+BDRVStateCOR *s = bs->opaque;
+
+if (!s->active) {
+/*
+ * While the filter is being removed
+ */
+*nperm = 0;
+*nshared = BLK_PERM_ALL;
+return;
+}
+
 *nperm = perm & PERM_PASSTHROUGH;
 *nshared = (shared & PERM_PASSTHROUGH) | PERM_UNCHANGED;
 
@@ -135,6 +163,7 @@ static void cor_lock_medium(BlockDriverState *bs, bool 
locked)
 
 static BlockDriver bdrv_copy_on_read = {
 .format_name= "copy-on-read",
+.instance_size  = sizeof(BDRVStateCOR),
 
 .bdrv_open  = cor_open,
 .bdrv_child_perm= cor_child_perm,
@@ -159,4 +188,63 @@ static void bdrv_copy_on_read_init(void)
 bdrv_register(_copy_on_read);
 }
 
+
+BlockDriverState *bdrv_cor_filter_append(BlockDriverState *bs,
+ QDict *node_options,
+ int flags, Error **errp)
+{
+BlockDriverState *cor_filter_bs;
+Error *local_err = NULL;
+
+cor_filter_bs = bdrv_open(NULL, NULL, node_options, flags, errp);
+if (cor_filter_bs == NULL) {
+error_prepend(errp, "Could not create COR-filter node: ");
+return NULL;
+}
+
+if (!qdict_get_try_str(node_options, "node-name")) {
+cor_filter_bs->implicit = true;
+}
+
+bdrv_drained_begin(bs);
+bdrv_replace_node(bs, cor_filter_bs, _err);
+bdrv_drained_end(bs);
+
+if (local_err) {
+bdrv_unref(cor_filter_bs);
+error_propagate(errp, local_err);
+return NULL;
+}
+
+return cor_filter_bs;
+}
+
+
+void bdrv_cor_filter_drop(BlockDriverState *cor_filter_bs)
+{
+BdrvChild *child;
+BlockDriverState *bs;
+BDRVStateCOR *s = cor_filter_bs->opaque;
+
+child = bdrv_filter_child(cor_filter_bs);
+if (!child) {
+return;
+}
+bs = child->bs;
+
+/* Retain the BDS until we complete the graph change. */
+bdrv_ref(bs);
+/* Hold a guest back from writing while permissions are being reset. */
+bdrv_drained_begin(bs);
+/* Drop permissions before the graph change. */
+s->active = false;
+bdrv_child_refresh_perms(cor_filter_bs, child, _abort);
+bdrv_replace_node(cor_filter_bs, bs, _abort);
+
+bdrv_drained_end(bs);
+bdrv_unref(bs);
+bdrv_unref(cor_filter_bs);
+}
+
+
 block_init(bdrv_copy_on_read_init);
diff --git a/block/copy-on-read.h b/block/copy-on-read.h
new file mode 100644
index 000..d6f2422
--- /dev/null
+++ b/block/copy-on-read.h
@@ -0,0 +1,35 @@
+/*
+ * Copy-on-read filter block driver
+ *
+ * The filter driver performs Copy-On-Read (COR) operations
+ *
+ * Copyright (c) 2018-2020 Virtuozzo International GmbH.
+ *
+ * Author:
+ *   Andrey Shinkevich 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of

[PATCH v11 04/13] copy-on-read: pass overlay base node name to COR driver

2020-10-12 Thread Andrey Shinkevich via
We are going to use the COR-filter for a block-stream job.
To limit COR operations by the base node in the backing chain during
stream job, pass the name of overlay base node to the copy-on-read
driver as base node itself may change due to possible concurrent jobs.
The rest of the functionality will be implemented in the patch that
follows.

Signed-off-by: Andrey Shinkevich 
---
 block/copy-on-read.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/block/copy-on-read.c b/block/copy-on-read.c
index bcccf0f..c578b1b 100644
--- a/block/copy-on-read.c
+++ b/block/copy-on-read.c
@@ -24,19 +24,24 @@
 #include "block/block_int.h"
 #include "qemu/module.h"
 #include "qapi/error.h"
+#include "qapi/qmp/qerror.h"
 #include "qapi/qmp/qdict.h"
 #include "block/copy-on-read.h"
 
 
 typedef struct BDRVStateCOR {
 bool active;
+BlockDriverState *base_overlay;
 } BDRVStateCOR;
 
 
 static int cor_open(BlockDriverState *bs, QDict *options, int flags,
 Error **errp)
 {
+BlockDriverState *base_overlay = NULL;
 BDRVStateCOR *state = bs->opaque;
+/* We need the base overlay node rather than the base itself */
+const char *base_overlay_node = qdict_get_try_str(options, "base");
 
 bs->file = bdrv_open_child(NULL, options, "file", bs, _of_bds,
BDRV_CHILD_FILTERED | BDRV_CHILD_PRIMARY,
@@ -52,7 +57,16 @@ static int cor_open(BlockDriverState *bs, QDict *options, 
int flags,
 ((BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP | BDRV_REQ_NO_FALLBACK) &
 bs->file->bs->supported_zero_flags);
 
+if (base_overlay_node) {
+qdict_del(options, "base");
+base_overlay = bdrv_lookup_bs(NULL, base_overlay_node, errp);
+if (!base_overlay) {
+error_setg(errp, QERR_BASE_NOT_FOUND, base_overlay_node);
+return -EINVAL;
+}
+}
 state->active = true;
+state->base_overlay = base_overlay;
 
 /*
  * We don't need to call bdrv_child_refresh_perms() now as the permissions
-- 
1.8.3.1




[PATCH v11 03/13] qapi: add filter-node-name to block-stream

2020-10-12 Thread Andrey Shinkevich via
Provide the possibility to pass the 'filter-node-name' parameter to the
block-stream job as it is done for the commit block job.

Signed-off-by: Andrey Shinkevich 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
---
 block/monitor/block-hmp-cmds.c | 4 ++--
 block/stream.c | 4 +++-
 blockdev.c | 4 +++-
 include/block/block_int.h  | 7 ++-
 qapi/block-core.json   | 6 ++
 5 files changed, 20 insertions(+), 5 deletions(-)

diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c
index 4d3db5e..4e66775 100644
--- a/block/monitor/block-hmp-cmds.c
+++ b/block/monitor/block-hmp-cmds.c
@@ -507,8 +507,8 @@ void hmp_block_stream(Monitor *mon, const QDict *qdict)
 
 qmp_block_stream(true, device, device, base != NULL, base, false, NULL,
  false, NULL, qdict_haskey(qdict, "speed"), speed, true,
- BLOCKDEV_ON_ERROR_REPORT, false, false, false, false,
- );
+ BLOCKDEV_ON_ERROR_REPORT, false, NULL, false, false, 
false,
+ false, );
 
 hmp_handle_error(mon, error);
 }
diff --git a/block/stream.c b/block/stream.c
index 8ce6729..e0540ee 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -221,7 +221,9 @@ static const BlockJobDriver stream_job_driver = {
 void stream_start(const char *job_id, BlockDriverState *bs,
   BlockDriverState *base, const char *backing_file_str,
   int creation_flags, int64_t speed,
-  BlockdevOnError on_error, Error **errp)
+  BlockdevOnError on_error,
+  const char *filter_node_name,
+  Error **errp)
 {
 StreamBlockJob *s;
 BlockDriverState *iter;
diff --git a/blockdev.c b/blockdev.c
index bebd3ba..d719c47 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2489,6 +2489,7 @@ void qmp_block_stream(bool has_job_id, const char 
*job_id, const char *device,
   bool has_backing_file, const char *backing_file,
   bool has_speed, int64_t speed,
   bool has_on_error, BlockdevOnError on_error,
+  bool has_filter_node_name, const char *filter_node_name,
   bool has_auto_finalize, bool auto_finalize,
   bool has_auto_dismiss, bool auto_dismiss,
   Error **errp)
@@ -2571,7 +2572,8 @@ void qmp_block_stream(bool has_job_id, const char 
*job_id, const char *device,
 }
 
 stream_start(has_job_id ? job_id : NULL, bs, base_bs, base_name,
- job_flags, has_speed ? speed : 0, on_error, _err);
+ job_flags, has_speed ? speed : 0, on_error,
+ filter_node_name, _err);
 if (local_err) {
 error_propagate(errp, local_err);
 goto out;
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 38cad9d..f782737 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -1134,6 +1134,9 @@ int is_windows_drive(const char *filename);
  *  See @BlockJobCreateFlags
  * @speed: The maximum speed, in bytes per second, or 0 for unlimited.
  * @on_error: The action to take upon error.
+ * @filter_node_name: The node name that should be assigned to the filter
+ * driver that the commit job inserts into the graph above @bs. NULL means
+ * that a node name should be autogenerated.
  * @errp: Error object.
  *
  * Start a streaming operation on @bs.  Clusters that are unallocated
@@ -1146,7 +1149,9 @@ int is_windows_drive(const char *filename);
 void stream_start(const char *job_id, BlockDriverState *bs,
   BlockDriverState *base, const char *backing_file_str,
   int creation_flags, int64_t speed,
-  BlockdevOnError on_error, Error **errp);
+  BlockdevOnError on_error,
+  const char *filter_node_name,
+  Error **errp);
 
 /**
  * commit_start:
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 3c16f1e..32fb097 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -2533,6 +2533,11 @@
 #'stop' and 'enospc' can only be used if the block device
 #supports io-status (see BlockInfo).  Since 1.3.
 #
+# @filter-node-name: the node name that should be assigned to the
+#filter driver that the stream job inserts into the graph
+#above @device. If this option is not given, a node name is
+#autogenerated. (Since: 5.2)
+#
 # @auto-finalize: When false, this job will wait in a PENDING state after it 
has
 # finished its work, waiting for @block-job-finalize before
 # making any block graph changes.
@@ -2563,6 +2568,7 @@
   'data': { '*job-id': 'str', 'device': 'str', '*base': 'str',
 '*base-node': 'str', '*backing-file': 'str', '*speed': 'int',
 '*on-error': 'BlockdevOnError',
+ 

[PATCH v11 01/13] copy-on-read: Support preadv/pwritev_part functions

2020-10-12 Thread Andrey Shinkevich via
Add support for the recently introduced functions
bdrv_co_preadv_part()
and
bdrv_co_pwritev_part()
to the COR-filter driver.

Signed-off-by: Andrey Shinkevich 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
---
 block/copy-on-read.c | 28 
 1 file changed, 16 insertions(+), 12 deletions(-)

diff --git a/block/copy-on-read.c b/block/copy-on-read.c
index 2816e61..cb03e0f 100644
--- a/block/copy-on-read.c
+++ b/block/copy-on-read.c
@@ -74,21 +74,25 @@ static int64_t cor_getlength(BlockDriverState *bs)
 }
 
 
-static int coroutine_fn cor_co_preadv(BlockDriverState *bs,
-  uint64_t offset, uint64_t bytes,
-  QEMUIOVector *qiov, int flags)
+static int coroutine_fn cor_co_preadv_part(BlockDriverState *bs,
+   uint64_t offset, uint64_t bytes,
+   QEMUIOVector *qiov,
+   size_t qiov_offset,
+   int flags)
 {
-return bdrv_co_preadv(bs->file, offset, bytes, qiov,
-  flags | BDRV_REQ_COPY_ON_READ);
+return bdrv_co_preadv_part(bs->file, offset, bytes, qiov, qiov_offset,
+   flags | BDRV_REQ_COPY_ON_READ);
 }
 
 
-static int coroutine_fn cor_co_pwritev(BlockDriverState *bs,
-   uint64_t offset, uint64_t bytes,
-   QEMUIOVector *qiov, int flags)
+static int coroutine_fn cor_co_pwritev_part(BlockDriverState *bs,
+uint64_t offset,
+uint64_t bytes,
+QEMUIOVector *qiov,
+size_t qiov_offset, int flags)
 {
-
-return bdrv_co_pwritev(bs->file, offset, bytes, qiov, flags);
+return bdrv_co_pwritev_part(bs->file, offset, bytes, qiov, qiov_offset,
+flags);
 }
 
 
@@ -137,8 +141,8 @@ static BlockDriver bdrv_copy_on_read = {
 
 .bdrv_getlength = cor_getlength,
 
-.bdrv_co_preadv = cor_co_preadv,
-.bdrv_co_pwritev= cor_co_pwritev,
+.bdrv_co_preadv_part= cor_co_preadv_part,
+.bdrv_co_pwritev_part   = cor_co_pwritev_part,
 .bdrv_co_pwrite_zeroes  = cor_co_pwrite_zeroes,
 .bdrv_co_pdiscard   = cor_co_pdiscard,
 .bdrv_co_pwritev_compressed = cor_co_pwritev_compressed,
-- 
1.8.3.1




[PATCH v11 00/13] Apply COR-filter to the block-stream permanently

2020-10-12 Thread Andrey Shinkevich via
The iotest case test_stream_parallel still does not pass after the
COR-filter is inserted into the backing chain. As the test case may not
be initialized, it does not make a sense and was removed again.

v11:
  04: Base node overlay is used instead of base.
  05: Base node overlay is used instead of base.
  06: New.
  07: New.
  08: New.
  09: The new BDS-member 'supported_read_flags' is applied.
  10: The 'base_metadata' variable renamed to 'base_unfiltered'.
  11: New.
  12: The backing-file argument is left in the QMP interface. Warning added.
  13: The BDRV_REQ_COPY_ON_READ removed from the stream_populate();
  The 'implicit' initialization moved back to COR-filter driver.
  Base node overlay is used instead of base.

The v8 Message-Id:
<1601383109-110988-1-git-send-email-andrey.shinkev...@virtuozzo.com>

Andrey Shinkevich (13):
  copy-on-read: Support preadv/pwritev_part functions
  copy-on-read: add filter append/drop functions
  qapi: add filter-node-name to block-stream
  copy-on-read: pass overlay base node name to COR driver
  copy-on-read: limit COR operations to base in COR driver
  block: modify the comment for BDRV_REQ_PREFETCH flag
  block: include supported_read_flags into BDS structure
  copy-on-read: add support for BDRV_REQ_PREFETCH to COR-filter
  copy-on-read: skip non-guest reads if no copy needed
  stream: skip filters when writing backing file name to QCOW2 header
  stream: mark backing-file argument as deprecated
  stream: remove unused backing-file name parameter
  block: apply COR-filter to block-stream jobs

 block/copy-on-read.c   | 171 ++---
 block/copy-on-read.h   |  35 +
 block/io.c |   3 +-
 block/monitor/block-hmp-cmds.c |   4 +-
 block/stream.c | 112 ---
 blockdev.c |  25 +++---
 docs/system/deprecated.rst |   6 ++
 include/block/block.h  |   7 +-
 include/block/block_int.h  |  13 +++-
 qapi/block-core.json   |   6 ++
 tests/qemu-iotests/030 |  51 ++--
 tests/qemu-iotests/030.out |   4 +-
 tests/qemu-iotests/141.out |   2 +-
 tests/qemu-iotests/245 |  19 +++--
 14 files changed, 324 insertions(+), 134 deletions(-)
 create mode 100644 block/copy-on-read.h

-- 
1.8.3.1




Re: [PATCH v7 02/13] qtest: Reintroduce qtest_qmp_receive

2020-10-12 Thread Paolo Bonzini
On 12/10/20 15:49, Thomas Huth wrote:
>> We chose to use the same name because the new version generally is the
>> one you want and, except for the handling of events, is exactly the same
>> as before.  In other words, I'm treating the new semantics more as a
>> bugfix than a feature.
>>
>> The only trap that backports of later patches could fall into is if they
>> want to look at events, but it would be caught easily because the test
>> would fail.
> Ok, thanks for the explanation! ... but I think it might be good to have
> this information in the patch description, though.

Ok, I'll add a couple words.

Paolo




Re: [PULL 00/30] Block patches

2020-10-12 Thread Stefan Hajnoczi
On Sun, Oct 11, 2020 at 07:32:58PM +0100, Peter Maydell wrote:
> On Fri, 9 Oct 2020 at 20:35, Stefan Hajnoczi  wrote:
> >
> > The following changes since commit 497d415d76b9f59fcae27f22df1ca2c3fa4df64e:
> >
> >   Merge remote-tracking branch 
> > 'remotes/pmaydell/tags/pull-target-arm-20201008-1' into staging (2020-10-08 
> > 21:41:20 +0100)
> >
> > are available in the Git repository at:
> >
> >   https://gitlab.com/stefanha/qemu.git tags/block-pull-request
> >
> > for you to fetch changes up to e969c7b045c90368bc3a5db3479e70b6f0ecb828:
> >
> >   iotests: add commit top->base cases to 274 (2020-10-09 14:32:24 +0100)
> >
> > 
> > Pull request
> >
> > This pull request includes the vhost-user-blk server by Coiby Xu, the block
> > coroutine code generator by Vladimir Sementsov-Ogievskiy, nvme block driver
> > statistics by Philippe Mathieu-Daudé, and cleanups/fixes/additions to the
> > vhost-user-blk server by me.
> >
> 
> Hi; this seems to have a conflict in qemu-nbd.c with something
> that landed in the latest nbd pullreq. Could you fix up and
> resend, please?

Thanks for letting me. I will send a v2.

Stefan


signature.asc
Description: PGP signature


Re: [PATCH] migration/block-dirty-bitmap: fix uninitialized variable warning

2020-10-12 Thread Laurent Vivier
Le 10/10/2020 à 13:07, Chen Qun a écrit :
> This if statement judgment is redundant and it will cause a warning:
> 
> migration/block-dirty-bitmap.c:1090:13: warning: ‘bitmap_name’ may be used
>  uninitialized in this function [-Wmaybe-uninitialized]
>  g_strlcpy(s->bitmap_name, bitmap_name, sizeof(s->bitmap_name));
>  ^~
> 
> Reported-by: Euler Robot 
> Signed-off-by: Chen Qun 
> ---
>  migration/block-dirty-bitmap.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
> index 5bef793ac0..e09ea4f22b 100644
> --- a/migration/block-dirty-bitmap.c
> +++ b/migration/block-dirty-bitmap.c
> @@ -1084,9 +1084,7 @@ static int dirty_bitmap_load_header(QEMUFile *f, 
> DBMLoadState *s,
>  } else {
>  bitmap_name = s->bitmap_alias;
>  }
> -}
>  
> -if (!s->cancelled) {
>  g_strlcpy(s->bitmap_name, bitmap_name, sizeof(s->bitmap_name));
>  s->bitmap = bdrv_find_dirty_bitmap(s->bs, s->bitmap_name);
>  
> 

I don't think it's correct as "cancel_incoming_locked(s)" can change the
value of "s->cancelled".

Thanks,
Laurent



Re: [PATCH v2 1/3] block: push error reporting into bdrv_all_*_snapshot functions

2020-10-12 Thread Max Reitz
On 12.10.20 14:27, Philippe Mathieu-Daudé wrote:
> From: Daniel P. Berrangé 
> 
> The bdrv_all_*_snapshot functions return a BlockDriverState pointer
> for the invalid backend, which the callers then use to report an
> error message. In some cases multiple callers are reporting the
> same error message, but with slightly different text. In the future
> there will be more error scenarios for some of these methods, which
> will benefit from fine grained error message reporting. So it is
> helpful to push error reporting down a level.
> 
> Signed-off-by: Daniel P. Berrangé 
> [PMD: Initialize variables]
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  include/block/snapshot.h   | 14 +++
>  block/monitor/block-hmp-cmds.c |  7 ++--
>  block/snapshot.c   | 77 +-
>  migration/savevm.c | 37 +---
>  monitor/hmp-cmds.c |  7 +---
>  replay/replay-debugging.c  |  4 +-
>  tests/qemu-iotests/267.out | 10 ++---
>  7 files changed, 67 insertions(+), 89 deletions(-)

Reviewed-by: Max Reitz 




Re: [PATCH v4 0/7] qemu-storage-daemon: Remove QemuOpts from --object

2020-10-12 Thread Kevin Wolf
Am 11.10.2020 um 09:34 hat Markus Armbruster geschrieben:
> This replaces the QemuOpts-based help code for --object in the storage
> daemon with code based on the keyval parser.
> 
> Review of v3 led me to preexisting issues.  Instead of posting my
> fixes separately, then working with Kevin to rebase his work on top of
> mine, I did the rebase myself and am posting it together with my fixes
> as v4.  Hope that's okay.
> 
> Note: to test qemu-storage-daemon --object help, you need Philippe's
> "[PATCH v3] hw/nvram: Always register FW_CFG_DATA_GENERATOR_INTERFACE"
> and its prerequisites (first 10 patches of Paolo's "[PULL 00/39] SCSI,
> qdev, qtest, meson patches for 2020-10-10"), so it doesn't crash with
> "missing interface 'fw_cfg-data-generator' for object 'tls-creds'".

Thanks, fixed up the typos in patch 4 and applied patches 1-4. I'm
taking patches 5-7 from my original series because you didn't add your
S-o-b (they are unchanged anyway, so no practical difference).

Kevin




Re: [PATCH] block/blkdebug: fix memory leak

2020-10-12 Thread Laurent Vivier
Le 09/10/2020 à 21:09, Elena Afanasova a écrit :
> Spotted by PVS-Studio
> 
> Signed-off-by: Elena Afanasova 
> ---
>  block/blkdebug.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/block/blkdebug.c b/block/blkdebug.c
> index eecbf3e5c4..54da719dd1 100644
> --- a/block/blkdebug.c
> +++ b/block/blkdebug.c
> @@ -215,6 +215,7 @@ static int add_rule(void *opaque, QemuOpts *opts, Error 
> **errp)
>   BLKDEBUG_IO_TYPE__MAX, _error);
>  if (local_error) {
>  error_propagate(errp, local_error);
> +g_free(rule);
>  return -1;
>  }
>  if (iotype != BLKDEBUG_IO_TYPE__MAX) {
> 

Applied to my trivial-patches branch.

Thanks,
Laurent




Re: [PATCH v4 4/7] keyval: Parse help options

2020-10-12 Thread Kevin Wolf
Am 11.10.2020 um 09:35 hat Markus Armbruster geschrieben:
> From: Kevin Wolf 
> 
> This adds a special meaning for 'help' and '?' as options to the keyval
> parser. Instead of being an error (because of a missing value) or a
> value for an implied key, they now request help, which is a new boolean
> output of the parser in addition to the QDict.
> 
> A new parameter 'p_help' is added to keyval_parse() that contains on
> return whether help was requested. If NULL is passed, requesting help
> results in an error and all other cases work like before.
> 
> Turning previous error cases into help is a compatible extension. The
> behaviour potentially changes for implied keys: They could previously
> get 'help' as their value, which is now interpreted as requesting help.
> 
> This is not a problem in practice because 'help' and '?' are not a valid
> values for the implied key of any option parsed with keyval_parse():
> 
> * audiodev: union Audiodev, implied key "driver" is enum AudiodevDriver,
>   "help" and "?" are not among its values
> 
> * display: union DisplayOptions, implied key "type" is enum
>   DisplayType, "help" and "?" are not among its values
> 
> * blockdev: union BlockdevOptions, implied key "driver is enum
>   BlockdevDriver, "help" and "?" are not among its values
> 
> * export: union BlockExport, implied key "type" is enum BlockExportType,
>   "help" and "?" are not among its values
> 
> * monitor: struct MonitorOptions, implied key "mode" is enum MonitorMode,
>   "help" and "?" are not among its values
> 
> * nbd-server: struct NbdServerOptions, no implied key.
> 
> Signed-off-by: Kevin Wolf 
> Signed-off-by: Markus Armbruster 

> diff --git a/tests/test-keyval.c b/tests/test-keyval.c
> index 04c62cf045..7b459900af 100644
> --- a/tests/test-keyval.c
> +++ b/tests/test-keyval.c
> @@ -1,3 +1,4 @@
> +
>  /*
>   * Unit tests for parsing of KEY=VALUE,... strings
>   *

This hunk looks unintentional.

Kevin




Re: [PATCH 05/10] hw/isa: Add the ISA_IRQ_FDC_DEFAULT definition

2020-10-12 Thread John Snow

On 10/11/20 3:32 PM, Philippe Mathieu-Daudé wrote:

The floppy disk controller uses IRQ #6 by default. Add this
default definition to the IsaIrqNumber enum.

Avoid magic values in the code, replace them by the
newly introduced definition.

Signed-off-by: Philippe Mathieu-Daudé 


Acked-by: John Snow 




Re: [PATCH 1/4] vmdk: fix maybe uninitialized warnings

2020-10-12 Thread Laurent Vivier
Le 05/10/2020 à 08:26, Christian Borntraeger a écrit :
> On 30.09.20 18:36, Fam Zheng wrote:
>> On Wed, 2020-09-30 at 17:58 +0200, Christian Borntraeger wrote:
>>> Fedora 32 gcc 10 seems to give false positives:
>>>
>>> Compiling C object libblock.fa.p/block_vmdk.c.o
>>> ../block/vmdk.c: In function ‘vmdk_parse_extents’:
>>> ../block/vmdk.c:587:5: error: ‘extent’ may be used uninitialized in
>>> this function [-Werror=maybe-uninitialized]
>>>   587 | g_free(extent->l1_table);
>>>   | ^~~~
>>> ../block/vmdk.c:754:17: note: ‘extent’ was declared here
>>>   754 | VmdkExtent *extent;
>>>   | ^~
>>> ../block/vmdk.c:620:11: error: ‘extent’ may be used uninitialized in
>>> this function [-Werror=maybe-uninitialized]
>>>   620 | ret = vmdk_init_tables(bs, extent, errp);
>>>   |   ^~
>>> ../block/vmdk.c:598:17: note: ‘extent’ was declared here
>>>   598 | VmdkExtent *extent;
>>>   | ^~
>>> ../block/vmdk.c:1178:39: error: ‘extent’ may be used uninitialized in
>>> this function [-Werror=maybe-uninitialized]
>>>  1178 | extent->flat_start_offset = flat_offset << 9;
>>>   | ~~^~
>>> ../block/vmdk.c: In function ‘vmdk_open_vmdk4’:
>>> ../block/vmdk.c:581:22: error: ‘extent’ may be used uninitialized in
>>> this function [-Werror=maybe-uninitialized]
>>>   581 | extent->l2_cache =
>>>   | ~^
>>>   582 | g_malloc(extent->entry_size * extent->l2_size *
>>> L2_CACHE_SIZE);
>>>   | ~
>>> ~
>>> ../block/vmdk.c:872:17: note: ‘extent’ was declared here
>>>   872 | VmdkExtent *extent;
>>>   | ^~
>>> ../block/vmdk.c: In function ‘vmdk_open’:
>>> ../block/vmdk.c:620:11: error: ‘extent’ may be used uninitialized in
>>> this function [-Werror=maybe-uninitialized]
>>>   620 | ret = vmdk_init_tables(bs, extent, errp);
>>>   |   ^~
>>> ../block/vmdk.c:598:17: note: ‘extent’ was declared here
>>>   598 | VmdkExtent *extent;
>>>   | ^~
>>> cc1: all warnings being treated as errors
>>> make: *** [Makefile.ninja:884: libblock.fa.p/block_vmdk.c.o] Error 1
>>>
>>> fix them by assigning a default value.
>>>
>>> Signed-off-by: Christian Borntraeger 
>>> ---
>>>  block/vmdk.c | 8 
>>>  1 file changed, 4 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/block/vmdk.c b/block/vmdk.c
>>> index 8ec62c7ab798..a00dc00eb47a 100644
>>> --- a/block/vmdk.c
>>> +++ b/block/vmdk.c
>>> @@ -595,7 +595,7 @@ static int vmdk_open_vmfs_sparse(BlockDriverState
>>> *bs,
>>>  int ret;
>>>  uint32_t magic;
>>>  VMDK3Header header;
>>> -VmdkExtent *extent;
>>> +VmdkExtent *extent = NULL;
>>>  
>>>  ret = bdrv_pread(file, sizeof(magic), , sizeof(header));
>>>  if (ret < 0) {
>>> @@ -751,7 +751,7 @@ static int vmdk_open_se_sparse(BlockDriverState
>>> *bs,
>>>  int ret;
>>>  VMDKSESparseConstHeader const_header;
>>>  VMDKSESparseVolatileHeader volatile_header;
>>> -VmdkExtent *extent;
>>> +VmdkExtent *extent = NULL;
>>>  
>>>  ret = bdrv_apply_auto_read_only(bs,
>>>  "No write support for seSparse images available", errp);
>>> @@ -869,7 +869,7 @@ static int vmdk_open_vmdk4(BlockDriverState *bs,
>>>  uint32_t magic;
>>>  uint32_t l1_size, l1_entry_sectors;
>>>  VMDK4Header header;
>>> -VmdkExtent *extent;
>>> +VmdkExtent *extent = NULL;
>>>  BDRVVmdkState *s = bs->opaque;
>>>  int64_t l1_backup_offset = 0;
>>>  bool compressed;
>>> @@ -1088,7 +1088,7 @@ static int vmdk_parse_extents(const char *desc,
>>> BlockDriverState *bs,
>>>  BdrvChild *extent_file;
>>>  BdrvChildRole extent_role;
>>>  BDRVVmdkState *s = bs->opaque;
>>> -VmdkExtent *extent;
>>> +VmdkExtent *extent = NULL;
>>>  char extent_opt_prefix[32];
>>>  Error *local_err = NULL;
>>>  
>>
>> Looks trivial, and correct.
>>
>> Reviewed-by: Fam Zheng 
> 
> 
> Will this go via the block or trivial tree (cced). 
> 

Applied to my trivial-patches branch.

Thanks,
Laurent




Re: [PATCH 10/10] hw/isa: Add the ISA_IRQ_IDE_DEFAULT definition

2020-10-12 Thread John Snow

On 10/11/20 3:32 PM, Philippe Mathieu-Daudé wrote:

The IDE controller uses IRQ #14 by default. Add this
default definition to the IsaIrqNumber enum.

Avoid magic values in the code, replace them by the
newly introduced definition.

Signed-off-by: Philippe Mathieu-Daudé 
---
  include/hw/isa/isa.h | 1 +
  hw/ide/isa.c | 2 +-
  2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/hw/isa/isa.h b/include/hw/isa/isa.h
index 43cdc3c47b6..05622ee11e2 100644
--- a/include/hw/isa/isa.h
+++ b/include/hw/isa/isa.h
@@ -17,6 +17,7 @@ enum IsaIrqNumber {
  ISA_IRQ_RTC_DEFAULT =  8,
  ISA_IRQ_NET_DEFAULT =  9,
  ISA_IRQ_MOU_DEFAULT = 12,
+ISA_IRQ_IDE_DEFAULT = 14,
  ISA_NUM_IRQS= 16
  };
  
diff --git a/hw/ide/isa.c b/hw/ide/isa.c

index 6bc19de2265..2412d568937 100644
--- a/hw/ide/isa.c
+++ b/hw/ide/isa.c
@@ -108,7 +108,7 @@ ISADevice *isa_ide_init(ISABus *bus, int iobase, int 
iobase2, int isairq,
  static Property isa_ide_properties[] = {
  DEFINE_PROP_UINT32("iobase",  ISAIDEState, iobase,  0x1f0),
  DEFINE_PROP_UINT32("iobase2", ISAIDEState, iobase2, 0x3f6),
-DEFINE_PROP_UINT32("irq",ISAIDEState, isairq,  14),
+DEFINE_PROP_UINT32("irq", ISAIDEState, isairq, ISA_IRQ_IDE_DEFAULT),
  DEFINE_PROP_END_OF_LIST(),
  };
  



Acked-by: John Snow 




Re: [PATCH v7 02/13] qtest: Reintroduce qtest_qmp_receive

2020-10-12 Thread Thomas Huth
On 12/10/2020 15.47, Paolo Bonzini wrote:
> On 12/10/20 13:14, Thomas Huth wrote:
>>> +/**
>>> + * qtest_qmp_receive:
>>> + * @s: #QTestState instance to operate on.
>>> + *
>>> + * Reads a QMP message from QEMU and returns the response.
>>> + * Buffers all the events received meanwhile, until a
>>> + * call to qtest_qmp_eventwait
>>> + */
>>> +QDict *qtest_qmp_receive(QTestState *s);
>> Re-introducing qtest_qmp_receive() with different behavior than before will
>> likely make backports of other later patches a pain, and might also break
>> other patches that use this function but are not merged yet. Could you
>> please use a different name for this function instead? Maye
>> qtest_qmp_receive_buffered() or something like that?
> 
> We chose to use the same name because the new version generally is the
> one you want and, except for the handling of events, is exactly the same
> as before.  In other words, I'm treating the new semantics more as a
> bugfix than a feature.
> 
> The only trap that backports of later patches could fall into is if they
> want to look at events, but it would be caught easily because the test
> would fail.

Ok, thanks for the explanation! ... but I think it might be good to have
this information in the patch description, though.

 Thomas




Re: [PATCH v7 02/13] qtest: Reintroduce qtest_qmp_receive

2020-10-12 Thread Paolo Bonzini
On 12/10/20 13:14, Thomas Huth wrote:
>> +/**
>> + * qtest_qmp_receive:
>> + * @s: #QTestState instance to operate on.
>> + *
>> + * Reads a QMP message from QEMU and returns the response.
>> + * Buffers all the events received meanwhile, until a
>> + * call to qtest_qmp_eventwait
>> + */
>> +QDict *qtest_qmp_receive(QTestState *s);
> Re-introducing qtest_qmp_receive() with different behavior than before will
> likely make backports of other later patches a pain, and might also break
> other patches that use this function but are not merged yet. Could you
> please use a different name for this function instead? Maye
> qtest_qmp_receive_buffered() or something like that?

We chose to use the same name because the new version generally is the
one you want and, except for the handling of events, is exactly the same
as before.  In other words, I'm treating the new semantics more as a
bugfix than a feature.

The only trap that backports of later patches could fall into is if they
want to look at events, but it would be caught easily because the test
would fail.

Paolo




Re: [PATCH] hw/scsi: remove dead code

2020-10-12 Thread Philippe Mathieu-Daudé

Hi Elena,

On 10/12/20 3:32 PM, Elena Afanasova wrote:

Since MEGASAS_MAX_SGE is defined to be 128 the following if statement can be 
removed.
Spotted by PVS-Studio.

Signed-off-by: Elena Afanasova 
---
  hw/scsi/megasas.c | 2 --
  1 file changed, 2 deletions(-)

diff --git a/hw/scsi/megasas.c b/hw/scsi/megasas.c
index e24c12d7ee..6dcaad55fa 100644
--- a/hw/scsi/megasas.c
+++ b/hw/scsi/megasas.c
@@ -2393,8 +2393,6 @@ static void megasas_scsi_realize(PCIDevice *dev, Error 
**errp)
  }
  if (s->fw_sge >= MEGASAS_MAX_SGE - MFI_PASS_FRAME_SIZE) {
  s->fw_sge = MEGASAS_MAX_SGE - MFI_PASS_FRAME_SIZE;
-} else if (s->fw_sge >= 128 - MFI_PASS_FRAME_SIZE) {
-s->fw_sge = 128 - MFI_PASS_FRAME_SIZE;


See previous discussion:
https://www.mail-archive.com/qemu-devel@nongnu.org/msg668850.html


  } else {
  s->fw_sge = 64 - MFI_PASS_FRAME_SIZE;
  }






[PATCH] hw/scsi: remove dead code

2020-10-12 Thread Elena Afanasova
Since MEGASAS_MAX_SGE is defined to be 128 the following if statement can be 
removed.
Spotted by PVS-Studio.

Signed-off-by: Elena Afanasova 
---
 hw/scsi/megasas.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/hw/scsi/megasas.c b/hw/scsi/megasas.c
index e24c12d7ee..6dcaad55fa 100644
--- a/hw/scsi/megasas.c
+++ b/hw/scsi/megasas.c
@@ -2393,8 +2393,6 @@ static void megasas_scsi_realize(PCIDevice *dev, Error 
**errp)
 }
 if (s->fw_sge >= MEGASAS_MAX_SGE - MFI_PASS_FRAME_SIZE) {
 s->fw_sge = MEGASAS_MAX_SGE - MFI_PASS_FRAME_SIZE;
-} else if (s->fw_sge >= 128 - MFI_PASS_FRAME_SIZE) {
-s->fw_sge = 128 - MFI_PASS_FRAME_SIZE;
 } else {
 s->fw_sge = 64 - MFI_PASS_FRAME_SIZE;
 }
-- 
2.25.1





[PATCH v2 3/3] hw/ssi: Rename SSI 'slave' as 'peripheral'

2020-10-12 Thread Philippe Mathieu-Daudé
In order to use inclusive terminology, rename SSI 'slave' as
'peripheral', following the specification resolution:
https://www.oshwa.org/a-resolution-to-redefine-spi-signal-names/

Patch created mechanically using:

  $ sed -i s/SSISlave/SSIPeripheral/ $(git grep -l SSISlave)
  $ sed -i s/SSI_SLAVE/SSI_PERIPHERAL/ $(git grep -l SSI_SLAVE)
  $ sed -i s/ssi-slave/ssi-peripheral/ $(git grep -l ssi-slave)
  $ sed -i s/ssi_slave/ssi_peripheral/ $(git grep -l ssi_slave)
  $ sed -i s/ssi_create_slave/ssi_create_peripheral/ \
$(git grep -l ssi_create_slave)

Then in VMStateDescription vmstate_ssi_peripheral we restored
the "SSISlave" migration stream name (to avoid breaking migration).

Finally the following files have been manually tweaked:
 - hw/ssi/pl022.c
 - hw/ssi/xilinx_spips.c

Suggested-by: Paolo Bonzini 
Signed-off-by: Philippe Mathieu-Daudé 
---
 include/hw/misc/max111x.h |  2 +-
 include/hw/ssi/ssi.h  | 46 ++---
 hw/arm/spitz.c| 32 +-
 hw/arm/stellaris.c|  4 ++--
 hw/arm/tosa.c | 12 +-
 hw/arm/z2.c   | 14 ++--
 hw/block/m25p80.c | 14 ++--
 hw/display/ads7846.c  | 12 +-
 hw/display/ssd0323.c  | 12 +-
 hw/misc/max111x.c | 18 +++
 hw/sd/ssi-sd.c| 12 +-
 hw/ssi/pl022.c|  2 +-
 hw/ssi/ssi.c  | 48 +++
 hw/ssi/xilinx_spips.c |  7 +++---
 14 files changed, 118 insertions(+), 117 deletions(-)

diff --git a/include/hw/misc/max111x.h b/include/hw/misc/max111x.h
index 606cf1e0a2a..beff59c815d 100644
--- a/include/hw/misc/max111x.h
+++ b/include/hw/misc/max111x.h
@@ -33,7 +33,7 @@
  *be lowered once it has been asserted.
  */
 struct MAX111xState {
-SSISlave parent_obj;
+SSIPeripheral parent_obj;
 
 qemu_irq interrupt;
 /* Values of inputs at system reset (settable by QOM property) */
diff --git a/include/hw/ssi/ssi.h b/include/hw/ssi/ssi.h
index c15548425a3..f411858ab08 100644
--- a/include/hw/ssi/ssi.h
+++ b/include/hw/ssi/ssi.h
@@ -2,11 +2,11 @@
 
 /*
  * In principle SSI is a point-point interface.  As such the qemu
- * implementation has a single slave device on a "bus".
- * However it is fairly common for boards to have multiple slaves
+ * implementation has a single peripheral on a "bus".
+ * However it is fairly common for boards to have multiple peripherals
  * connected to a single master, and select devices with an external
  * chip select.  This is implemented in qemu by having an explicit mux device.
- * It is assumed that master and slave are both using the same transfer
+ * It is assumed that master and peripheral are both using the same transfer
  * width.
  */
 
@@ -18,9 +18,9 @@
 
 typedef enum SSICSMode SSICSMode;
 
-#define TYPE_SSI_SLAVE "ssi-slave"
-OBJECT_DECLARE_TYPE(SSISlave, SSISlaveClass,
-SSI_SLAVE)
+#define TYPE_SSI_PERIPHERAL "ssi-peripheral"
+OBJECT_DECLARE_TYPE(SSIPeripheral, SSIPeripheralClass,
+SSI_PERIPHERAL)
 
 #define SSI_GPIO_CS "ssi-gpio-cs"
 
@@ -30,21 +30,21 @@ enum SSICSMode {
 SSI_CS_HIGH,
 };
 
-/* Slave devices.  */
-struct SSISlaveClass {
+/* Peripherals.  */
+struct SSIPeripheralClass {
 DeviceClass parent_class;
 
-void (*realize)(SSISlave *dev, Error **errp);
+void (*realize)(SSIPeripheral *dev, Error **errp);
 
 /* if you have standard or no CS behaviour, just override transfer.
  * This is called when the device cs is active (true by default).
  */
-uint32_t (*transfer)(SSISlave *dev, uint32_t val);
+uint32_t (*transfer)(SSIPeripheral *dev, uint32_t val);
 /* called when the CS line changes. Optional, devices only need to 
implement
  * this if they have side effects associated with the cs line (beyond
  * tristating the txrx lines).
  */
-int (*set_cs)(SSISlave *dev, bool select);
+int (*set_cs)(SSIPeripheral *dev, bool select);
 /* define whether or not CS exists and is active low/high */
 SSICSMode cs_polarity;
 
@@ -53,30 +53,30 @@ struct SSISlaveClass {
  * cs_polarity are unused if this is overwritten. Transfer_raw will
  * always be called for the device for every txrx access to the parent bus
  */
-uint32_t (*transfer_raw)(SSISlave *dev, uint32_t val);
+uint32_t (*transfer_raw)(SSIPeripheral *dev, uint32_t val);
 };
 
-struct SSISlave {
+struct SSIPeripheral {
 DeviceState parent_obj;
 
 /* Chip select state */
 bool cs;
 };
 
-extern const VMStateDescription vmstate_ssi_slave;
+extern const VMStateDescription vmstate_ssi_peripheral;
 
-#define VMSTATE_SSI_SLAVE(_field, _state) {  \
+#define VMSTATE_SSI_PERIPHERAL(_field, _state) { \
 .name   = (stringify(_field)),   \
-.size   = sizeof(SSISlave),

[PATCH v2 1/3] hw/ssi/aspeed_smc: Rename 'max_slaves' variable as 'max_peripherals'

2020-10-12 Thread Philippe Mathieu-Daudé
From: Philippe Mathieu-Daudé 

In order to use inclusive terminology, rename max_slaves
as max_peripherals.

Patch generated using:

  $ sed -i s/slave/peripheral/ \
hw/ssi/aspeed_smc.c include/hw/ssi/aspeed_smc.h

One line in aspeed_smc_read() has been manually tweaked
to pass checkpatch.

Reviewed-by: Thomas Huth 
Reviewed-by: Cédric Le Goater 
Signed-off-by: Philippe Mathieu-Daudé 
---
 include/hw/ssi/aspeed_smc.h |  2 +-
 hw/ssi/aspeed_smc.c | 53 +++--
 2 files changed, 28 insertions(+), 27 deletions(-)

diff --git a/include/hw/ssi/aspeed_smc.h b/include/hw/ssi/aspeed_smc.h
index 3dd354b52ec..16c03fe64f3 100644
--- a/include/hw/ssi/aspeed_smc.h
+++ b/include/hw/ssi/aspeed_smc.h
@@ -43,7 +43,7 @@ typedef struct AspeedSMCController {
 uint8_t r_timings;
 uint8_t nregs_timings;
 uint8_t conf_enable_w0;
-uint8_t max_slaves;
+uint8_t max_peripherals;
 const AspeedSegments *segments;
 hwaddr flash_window_base;
 uint32_t flash_window_size;
diff --git a/hw/ssi/aspeed_smc.c b/hw/ssi/aspeed_smc.c
index 795784e5f36..2780cac71d2 100644
--- a/hw/ssi/aspeed_smc.c
+++ b/hw/ssi/aspeed_smc.c
@@ -181,7 +181,7 @@
 #define SNOOP_START   0x0
 
 /*
- * Default segments mapping addresses and size for each slave per
+ * Default segments mapping addresses and size for each peripheral per
  * controller. These can be changed when board is initialized with the
  * Segment Address Registers.
  */
@@ -259,7 +259,7 @@ static const AspeedSMCController controllers[] = {
 .r_timings = R_TIMINGS,
 .nregs_timings = 1,
 .conf_enable_w0= CONF_ENABLE_W0,
-.max_slaves= 1,
+.max_peripherals   = 1,
 .segments  = aspeed_segments_legacy,
 .flash_window_base = ASPEED_SOC_SMC_FLASH_BASE,
 .flash_window_size = 0x600,
@@ -275,7 +275,7 @@ static const AspeedSMCController controllers[] = {
 .r_timings = R_TIMINGS,
 .nregs_timings = 1,
 .conf_enable_w0= CONF_ENABLE_W0,
-.max_slaves= 5,
+.max_peripherals   = 5,
 .segments  = aspeed_segments_fmc,
 .flash_window_base = ASPEED_SOC_FMC_FLASH_BASE,
 .flash_window_size = 0x1000,
@@ -293,7 +293,7 @@ static const AspeedSMCController controllers[] = {
 .r_timings = R_SPI_TIMINGS,
 .nregs_timings = 1,
 .conf_enable_w0= SPI_CONF_ENABLE_W0,
-.max_slaves= 1,
+.max_peripherals   = 1,
 .segments  = aspeed_segments_spi,
 .flash_window_base = ASPEED_SOC_SPI_FLASH_BASE,
 .flash_window_size = 0x1000,
@@ -309,7 +309,7 @@ static const AspeedSMCController controllers[] = {
 .r_timings = R_TIMINGS,
 .nregs_timings = 1,
 .conf_enable_w0= CONF_ENABLE_W0,
-.max_slaves= 3,
+.max_peripherals   = 3,
 .segments  = aspeed_segments_ast2500_fmc,
 .flash_window_base = ASPEED_SOC_FMC_FLASH_BASE,
 .flash_window_size = 0x1000,
@@ -327,7 +327,7 @@ static const AspeedSMCController controllers[] = {
 .r_timings = R_TIMINGS,
 .nregs_timings = 1,
 .conf_enable_w0= CONF_ENABLE_W0,
-.max_slaves= 2,
+.max_peripherals   = 2,
 .segments  = aspeed_segments_ast2500_spi1,
 .flash_window_base = ASPEED_SOC_SPI_FLASH_BASE,
 .flash_window_size = 0x800,
@@ -343,7 +343,7 @@ static const AspeedSMCController controllers[] = {
 .r_timings = R_TIMINGS,
 .nregs_timings = 1,
 .conf_enable_w0= CONF_ENABLE_W0,
-.max_slaves= 2,
+.max_peripherals   = 2,
 .segments  = aspeed_segments_ast2500_spi2,
 .flash_window_base = ASPEED_SOC_SPI2_FLASH_BASE,
 .flash_window_size = 0x800,
@@ -359,7 +359,7 @@ static const AspeedSMCController controllers[] = {
 .r_timings = R_TIMINGS,
 .nregs_timings = 1,
 .conf_enable_w0= CONF_ENABLE_W0,
-.max_slaves= 3,
+.max_peripherals   = 3,
 .segments  = aspeed_segments_ast2600_fmc,
 .flash_window_base = ASPEED26_SOC_FMC_FLASH_BASE,
 .flash_window_size = 0x1000,
@@ -377,7 +377,7 @@ static const AspeedSMCController controllers[] = {
 .r_timings = R_TIMINGS,
 .nregs_timings = 2,
 .conf_enable_w0= CONF_ENABLE_W0,
-.max_slaves= 2,
+.max_peripherals   = 2,
 .segments  = aspeed_segments_ast2600_spi1,
 .flash_window_base = ASPEED26_SOC_SPI_FLASH_BASE,
 .flash_window_size = 0x1000,
@@ -395,7 +395,7 @@ static const AspeedSMCController controllers[] = {
 .r_timings = R_TIMINGS,
 .nregs_timings = 3,
 .conf_enable_w0= CONF_ENABLE_W0,
-

[PATCH v2 0/3] hw/ssi: Rename SSI 'slave' as 'peripheral'

2020-10-12 Thread Philippe Mathieu-Daudé
Since v1:
- Fixed patch #1 subject (Kevin)

In order to use inclusive terminology, rename SSI 'slave' as
'peripheral', following the resolution Paolo pointed in [*]:
https://www.oshwa.org/a-resolution-to-redefine-spi-signal-names/

Candidate to be merged via the ARM or Trivial trees.

[*] https://www.mail-archive.com/qemu-devel@nongnu.org/msg739108.html

Philippe Mathieu-Daudé (3):
  hw/ssi/aspeed_smc: Rename 'max_slaves' variable as 'max_peripherals'
  hw/ssi: Update coding style to make checkpatch.pl happy
  hw/ssi: Rename SSI 'slave' as 'peripheral'

 include/hw/misc/max111x.h   |  2 +-
 include/hw/ssi/aspeed_smc.h |  2 +-
 include/hw/ssi/ssi.h| 56 +++--
 hw/arm/spitz.c  | 32 ++---
 hw/arm/stellaris.c  |  4 +--
 hw/arm/tosa.c   | 12 
 hw/arm/z2.c | 14 +-
 hw/block/m25p80.c   | 14 +-
 hw/display/ads7846.c| 12 
 hw/display/ssd0323.c| 12 
 hw/misc/max111x.c   | 18 ++--
 hw/sd/ssi-sd.c  | 12 
 hw/ssi/aspeed_smc.c | 53 ++-
 hw/ssi/pl022.c  |  2 +-
 hw/ssi/ssi.c| 48 +++
 hw/ssi/xilinx_spips.c   |  7 +++--
 16 files changed, 152 insertions(+), 148 deletions(-)

-- 
2.26.2




[PATCH v2 2/3] hw/ssi: Update coding style to make checkpatch.pl happy

2020-10-12 Thread Philippe Mathieu-Daudé
To make the next commit easier to review, clean this code first.

Signed-off-by: Philippe Mathieu-Daudé 
---
 include/hw/ssi/ssi.h | 16 +---
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/include/hw/ssi/ssi.h b/include/hw/ssi/ssi.h
index fe3028c39dc..c15548425a3 100644
--- a/include/hw/ssi/ssi.h
+++ b/include/hw/ssi/ssi.h
@@ -1,12 +1,14 @@
 /* QEMU Synchronous Serial Interface support.  */
 
-/* In principle SSI is a point-point interface.  As such the qemu
-   implementation has a single slave device on a "bus".
-   However it is fairly common for boards to have multiple slaves
-   connected to a single master, and select devices with an external
-   chip select.  This is implemented in qemu by having an explicit mux device.
-   It is assumed that master and slave are both using the same transfer width.
-   */
+/*
+ * In principle SSI is a point-point interface.  As such the qemu
+ * implementation has a single slave device on a "bus".
+ * However it is fairly common for boards to have multiple slaves
+ * connected to a single master, and select devices with an external
+ * chip select.  This is implemented in qemu by having an explicit mux device.
+ * It is assumed that master and slave are both using the same transfer
+ * width.
+ */
 
 #ifndef QEMU_SSI_H
 #define QEMU_SSI_H
-- 
2.26.2




[PATCH v2 1/3] block: push error reporting into bdrv_all_*_snapshot functions

2020-10-12 Thread Philippe Mathieu-Daudé
From: Daniel P. Berrangé 

The bdrv_all_*_snapshot functions return a BlockDriverState pointer
for the invalid backend, which the callers then use to report an
error message. In some cases multiple callers are reporting the
same error message, but with slightly different text. In the future
there will be more error scenarios for some of these methods, which
will benefit from fine grained error message reporting. So it is
helpful to push error reporting down a level.

Signed-off-by: Daniel P. Berrangé 
[PMD: Initialize variables]
Signed-off-by: Philippe Mathieu-Daudé 
---
 include/block/snapshot.h   | 14 +++
 block/monitor/block-hmp-cmds.c |  7 ++--
 block/snapshot.c   | 77 +-
 migration/savevm.c | 37 +---
 monitor/hmp-cmds.c |  7 +---
 replay/replay-debugging.c  |  4 +-
 tests/qemu-iotests/267.out | 10 ++---
 7 files changed, 67 insertions(+), 89 deletions(-)

diff --git a/include/block/snapshot.h b/include/block/snapshot.h
index b0fe42993dc..5cb2b696ad0 100644
--- a/include/block/snapshot.h
+++ b/include/block/snapshot.h
@@ -77,17 +77,15 @@ int bdrv_snapshot_load_tmp_by_id_or_name(BlockDriverState 
*bs,
  * These functions will properly handle dataplane (take aio_context_acquire
  * when appropriate for appropriate block drivers */
 
-bool bdrv_all_can_snapshot(BlockDriverState **first_bad_bs);
-int bdrv_all_delete_snapshot(const char *name, BlockDriverState **first_bsd_bs,
- Error **errp);
-int bdrv_all_goto_snapshot(const char *name, BlockDriverState **first_bad_bs,
-   Error **errp);
-int bdrv_all_find_snapshot(const char *name, BlockDriverState **first_bad_bs);
+bool bdrv_all_can_snapshot(Error **errp);
+int bdrv_all_delete_snapshot(const char *name, Error **errp);
+int bdrv_all_goto_snapshot(const char *name, Error **errp);
+int bdrv_all_find_snapshot(const char *name, Error **errp);
 int bdrv_all_create_snapshot(QEMUSnapshotInfo *sn,
  BlockDriverState *vm_state_bs,
  uint64_t vm_state_size,
- BlockDriverState **first_bad_bs);
+ Error **errp);
 
-BlockDriverState *bdrv_all_find_vmstate_bs(void);
+BlockDriverState *bdrv_all_find_vmstate_bs(Error **errp);
 
 #endif
diff --git a/block/monitor/block-hmp-cmds.c b/block/monitor/block-hmp-cmds.c
index d15a2be827b..548bb6b5e3f 100644
--- a/block/monitor/block-hmp-cmds.c
+++ b/block/monitor/block-hmp-cmds.c
@@ -899,10 +899,11 @@ void hmp_info_snapshots(Monitor *mon, const QDict *qdict)
 
 ImageEntry *image_entry, *next_ie;
 SnapshotEntry *snapshot_entry;
+Error *err = NULL;
 
-bs = bdrv_all_find_vmstate_bs();
+bs = bdrv_all_find_vmstate_bs();
 if (!bs) {
-monitor_printf(mon, "No available block device supports snapshots\n");
+error_report_err(err);
 return;
 }
 aio_context = bdrv_get_aio_context(bs);
@@ -952,7 +953,7 @@ void hmp_info_snapshots(Monitor *mon, const QDict *qdict)
 total = 0;
 for (i = 0; i < nb_sns; i++) {
 SnapshotEntry *next_sn;
-if (bdrv_all_find_snapshot(sn_tab[i].name, ) == 0) {
+if (bdrv_all_find_snapshot(sn_tab[i].name, NULL) == 0) {
 global_snapshots[total] = i;
 total++;
 QTAILQ_FOREACH(image_entry, _list, next) {
diff --git a/block/snapshot.c b/block/snapshot.c
index a2bf3a54eb0..482e3fc7b72 100644
--- a/block/snapshot.c
+++ b/block/snapshot.c
@@ -462,14 +462,14 @@ static bool 
bdrv_all_snapshots_includes_bs(BlockDriverState *bs)
  * These functions will properly handle dataplane (take aio_context_acquire
  * when appropriate for appropriate block drivers) */
 
-bool bdrv_all_can_snapshot(BlockDriverState **first_bad_bs)
+bool bdrv_all_can_snapshot(Error **errp)
 {
-bool ok = true;
 BlockDriverState *bs;
 BdrvNextIterator it;
 
 for (bs = bdrv_first(); bs; bs = bdrv_next()) {
 AioContext *ctx = bdrv_get_aio_context(bs);
+bool ok = true;
 
 aio_context_acquire(ctx);
 if (bdrv_all_snapshots_includes_bs(bs)) {
@@ -477,26 +477,25 @@ bool bdrv_all_can_snapshot(BlockDriverState 
**first_bad_bs)
 }
 aio_context_release(ctx);
 if (!ok) {
+error_setg(errp, "Device '%s' is writable but does not support "
+   "snapshots", bdrv_get_device_or_node_name(bs));
 bdrv_next_cleanup();
-goto fail;
+return false;
 }
 }
 
-fail:
-*first_bad_bs = bs;
-return ok;
+return true;
 }
 
-int bdrv_all_delete_snapshot(const char *name, BlockDriverState **first_bad_bs,
- Error **errp)
+int bdrv_all_delete_snapshot(const char *name, Error **errp)
 {
-int ret = 0;
 BlockDriverState *bs;
 BdrvNextIterator it;
 QEMUSnapshotInfo sn1, *snapshot = 
 
 for (bs = bdrv_first(); bs; bs = 

[PATCH v2 2/3] migration: Make save_snapshot() return bool, not 0/-1

2020-10-12 Thread Philippe Mathieu-Daudé
Just for consistency, following the example documented since
commit e3fe3988d7 ("error: Document Error API usage rules"),
return a boolean value indicating an error is set or not.

Acked-by: Pavel Dovgalyuk 
Signed-off-by: Philippe Mathieu-Daudé 
---
 include/migration/snapshot.h |  9 -
 migration/savevm.c   | 16 
 replay/replay-debugging.c|  2 +-
 replay/replay-snapshot.c |  2 +-
 4 files changed, 18 insertions(+), 11 deletions(-)

diff --git a/include/migration/snapshot.h b/include/migration/snapshot.h
index c85b6ec75b7..a40c34307b5 100644
--- a/include/migration/snapshot.h
+++ b/include/migration/snapshot.h
@@ -15,7 +15,14 @@
 #ifndef QEMU_MIGRATION_SNAPSHOT_H
 #define QEMU_MIGRATION_SNAPSHOT_H
 
-int save_snapshot(const char *name, Error **errp);
+/**
+ * save_snapshot: Save a snapshot.
+ * @name: path to snapshot
+ * @errp: pointer to error object
+ * On success, return %true.
+ * On failure, store an error through @errp and return %false.
+ */
+bool save_snapshot(const char *name, Error **errp);
 int load_snapshot(const char *name, Error **errp);
 
 #endif
diff --git a/migration/savevm.c b/migration/savevm.c
index a52da440f4b..fd2e5e8b663 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2658,7 +2658,7 @@ int qemu_load_device_state(QEMUFile *f)
 return 0;
 }
 
-int save_snapshot(const char *name, Error **errp)
+bool save_snapshot(const char *name, Error **errp)
 {
 BlockDriverState *bs;
 QEMUSnapshotInfo sn1, *sn = , old_sn1, *old_sn = _sn1;
@@ -2671,29 +2671,29 @@ int save_snapshot(const char *name, Error **errp)
 AioContext *aio_context;
 
 if (migration_is_blocked(errp)) {
-return ret;
+return false;
 }
 
 if (!replay_can_snapshot()) {
 error_setg(errp, "Record/replay does not allow making snapshot "
"right now. Try once more later.");
-return ret;
+return false;
 }
 
 if (!bdrv_all_can_snapshot(errp)) {
-return ret;
+return false;
 }
 
 /* Delete old snapshots of the same name */
 if (name) {
 if (bdrv_all_delete_snapshot(name, errp) < 0) {
-return ret;
+return false;
 }
 }
 
 bs = bdrv_all_find_vmstate_bs(errp);
 if (bs == NULL) {
-return ret;
+return false;
 }
 aio_context = bdrv_get_aio_context(bs);
 
@@ -2702,7 +2702,7 @@ int save_snapshot(const char *name, Error **errp)
 ret = global_state_store();
 if (ret) {
 error_setg(errp, "Error saving global state");
-return ret;
+return false;
 }
 vm_stop(RUN_STATE_SAVE_VM);
 
@@ -2779,7 +2779,7 @@ int save_snapshot(const char *name, Error **errp)
 if (saved_vm_running) {
 vm_start();
 }
-return ret;
+return ret == 0;
 }
 
 void qmp_xen_save_devices_state(const char *filename, bool has_live, bool live,
diff --git a/replay/replay-debugging.c b/replay/replay-debugging.c
index 8785489c02d..5458a73fced 100644
--- a/replay/replay-debugging.c
+++ b/replay/replay-debugging.c
@@ -327,7 +327,7 @@ void replay_gdb_attached(void)
  */
 if (replay_mode == REPLAY_MODE_PLAY
 && !replay_snapshot) {
-if (save_snapshot("start_debugging", NULL) != 0) {
+if (!save_snapshot("start_debugging", NULL)) {
 /* Can't create the snapshot. Continue conventional debugging. */
 }
 }
diff --git a/replay/replay-snapshot.c b/replay/replay-snapshot.c
index e26fa4c892e..4f2560d1568 100644
--- a/replay/replay-snapshot.c
+++ b/replay/replay-snapshot.c
@@ -77,7 +77,7 @@ void replay_vmstate_init(void)
 
 if (replay_snapshot) {
 if (replay_mode == REPLAY_MODE_RECORD) {
-if (save_snapshot(replay_snapshot, ) != 0) {
+if (!save_snapshot(replay_snapshot, )) {
 error_report_err(err);
 error_report("Could not create snapshot for icount record");
 exit(1);
-- 
2.26.2




[PATCH v2 0/3] migration: Make save/load_snapshot() return boolean

2020-10-12 Thread Philippe Mathieu-Daudé
Since v1:
- Initialize variables in first patch (Max)
- Added Pavel's A-b tag

I had a pair of patches making save_snapshot/load_snapshot()
return a boolean (like Markus recent qdev/QOM cleanup), then
realized Daniel already has series changing migration/, so I
rebased my patches in the first part of his v6 series:
"migration: bring improved savevm/loadvm/delvm to QMP"
https://lists.gnu.org/archive/html/qemu-devel/2020-10/msg02158.html

I included patch #1/#3 from Daniel, #2 is my first patch,
and my second patch is squashed with Daniel's #3.

Daniel if you find these patches worthwhile, please consider them
while merging your series (or respining).

Regards,

Phil.

Daniel P. Berrangé (2):
  block: push error reporting into bdrv_all_*_snapshot functions
  migration: stop returning errno from load_snapshot()

Philippe Mathieu-Daudé (1):
  migration: Make save_snapshot() return bool, not 0/-1

 include/block/snapshot.h   | 14 +++
 include/migration/snapshot.h   | 18 +++-
 block/monitor/block-hmp-cmds.c |  7 ++--
 block/snapshot.c   | 77 +-
 migration/savevm.c | 72 ---
 monitor/hmp-cmds.c |  9 +---
 replay/replay-debugging.c  |  6 +--
 replay/replay-snapshot.c   |  4 +-
 softmmu/vl.c   |  2 +-
 tests/qemu-iotests/267.out | 10 ++---
 10 files changed, 105 insertions(+), 114 deletions(-)

-- 
2.26.2





[PATCH v2 3/3] migration: stop returning errno from load_snapshot()

2020-10-12 Thread Philippe Mathieu-Daudé
From: Daniel P. Berrangé 

None of the callers care about the errno value since there is a full
Error object populated. This gives consistency with save_snapshot()
which already just returns a boolean value.

Reviewed-by: Dr. David Alan Gilbert 
Signed-off-by: Daniel P. Berrangé 
[PMD: Return false/true instead of -1/0, document function]
Acked-by: Pavel Dovgalyuk 
Signed-off-by: Philippe Mathieu-Daudé 
---
 include/migration/snapshot.h |  9 -
 migration/savevm.c   | 19 +--
 monitor/hmp-cmds.c   |  2 +-
 replay/replay-snapshot.c |  2 +-
 softmmu/vl.c |  2 +-
 5 files changed, 20 insertions(+), 14 deletions(-)

diff --git a/include/migration/snapshot.h b/include/migration/snapshot.h
index a40c34307b5..9bc989a6b49 100644
--- a/include/migration/snapshot.h
+++ b/include/migration/snapshot.h
@@ -23,6 +23,13 @@
  * On failure, store an error through @errp and return %false.
  */
 bool save_snapshot(const char *name, Error **errp);
-int load_snapshot(const char *name, Error **errp);
+/**
+ * save_snapshot: Load a snapshot.
+ * @name: path to snapshot
+ * @errp: pointer to error object
+ * On success, return %true.
+ * On failure, store an error through @errp and return %false.
+ */
+bool load_snapshot(const char *name, Error **errp);
 
 #endif
diff --git a/migration/savevm.c b/migration/savevm.c
index fd2e5e8b663..531bb2eca1e 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2864,7 +2864,7 @@ void qmp_xen_load_devices_state(const char *filename, 
Error **errp)
 migration_incoming_state_destroy();
 }
 
-int load_snapshot(const char *name, Error **errp)
+bool load_snapshot(const char *name, Error **errp)
 {
 BlockDriverState *bs_vm_state;
 QEMUSnapshotInfo sn;
@@ -2874,16 +2874,16 @@ int load_snapshot(const char *name, Error **errp)
 MigrationIncomingState *mis = migration_incoming_get_current();
 
 if (!bdrv_all_can_snapshot(errp)) {
-return -ENOTSUP;
+return false;
 }
 ret = bdrv_all_find_snapshot(name, errp);
 if (ret < 0) {
-return ret;
+return false;
 }
 
 bs_vm_state = bdrv_all_find_vmstate_bs(errp);
 if (!bs_vm_state) {
-return -ENOTSUP;
+return false;
 }
 aio_context = bdrv_get_aio_context(bs_vm_state);
 
@@ -2892,11 +2892,11 @@ int load_snapshot(const char *name, Error **errp)
 ret = bdrv_snapshot_find(bs_vm_state, , name);
 aio_context_release(aio_context);
 if (ret < 0) {
-return ret;
+return false;
 } else if (sn.vm_state_size == 0) {
 error_setg(errp, "This is a disk-only snapshot. Revert to it "
" offline using qemu-img");
-return -EINVAL;
+return false;
 }
 
 /*
@@ -2917,7 +2917,6 @@ int load_snapshot(const char *name, Error **errp)
 f = qemu_fopen_bdrv(bs_vm_state, 0);
 if (!f) {
 error_setg(errp, "Could not open VM state file");
-ret = -EINVAL;
 goto err_drain;
 }
 
@@ -2933,14 +2932,14 @@ int load_snapshot(const char *name, Error **errp)
 
 if (ret < 0) {
 error_setg(errp, "Error %d while loading VM state", ret);
-return ret;
+return false;
 }
 
-return 0;
+return true;
 
 err_drain:
 bdrv_drain_all_end();
-return ret;
+return false;
 }
 
 void vmstate_register_ram(MemoryRegion *mr, DeviceState *dev)
diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index f48173820ff..521fcf96eba 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -1123,7 +1123,7 @@ void hmp_loadvm(Monitor *mon, const QDict *qdict)
 
 vm_stop(RUN_STATE_RESTORE_VM);
 
-if (load_snapshot(name, ) == 0 && saved_vm_running) {
+if (!load_snapshot(name, ) && saved_vm_running) {
 vm_start();
 }
 hmp_handle_error(mon, err);
diff --git a/replay/replay-snapshot.c b/replay/replay-snapshot.c
index 4f2560d1568..b2893659370 100644
--- a/replay/replay-snapshot.c
+++ b/replay/replay-snapshot.c
@@ -83,7 +83,7 @@ void replay_vmstate_init(void)
 exit(1);
 }
 } else if (replay_mode == REPLAY_MODE_PLAY) {
-if (load_snapshot(replay_snapshot, ) != 0) {
+if (!load_snapshot(replay_snapshot, )) {
 error_report_err(err);
 error_report("Could not load snapshot for icount replay");
 exit(1);
diff --git a/softmmu/vl.c b/softmmu/vl.c
index 5a11a62f78a..6eaa6b3a09a 100644
--- a/softmmu/vl.c
+++ b/softmmu/vl.c
@@ -4478,7 +4478,7 @@ void qemu_init(int argc, char **argv, char **envp)
 register_global_state();
 if (loadvm) {
 Error *local_err = NULL;
-if (load_snapshot(loadvm, _err) < 0) {
+if (!load_snapshot(loadvm, _err)) {
 error_report_err(local_err);
 autostart = 0;
 exit(1);
-- 
2.26.2




Re: [PATCH] hw/block/nvme: Simplify timestamp sum

2020-10-12 Thread Laurent Vivier
Le 02/10/2020 à 09:57, Philippe Mathieu-Daudé a écrit :
> As the 'timestamp' variable is declared as a 48-bit bitfield,
> we do not need to wrap the sum result.
> 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  hw/block/nvme.c | 7 +--
>  1 file changed, 1 insertion(+), 6 deletions(-)
> 
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index 63078f6009..44fa5b9076 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -1280,12 +1280,7 @@ static inline uint64_t nvme_get_timestamp(const 
> NvmeCtrl *n)
>  
>  union nvme_timestamp ts;
>  ts.all = 0;
> -
> -/*
> - * If the sum of the Timestamp value set by the host and the elapsed
> - * time exceeds 2^48, the value returned should be reduced modulo 2^48.
> - */
> -ts.timestamp = (n->host_timestamp + elapsed_time) & 0x;
> +ts.timestamp = n->host_timestamp + elapsed_time;
>  
>  /* If the host timestamp is non-zero, set the timestamp origin */
>  ts.origin = n->host_timestamp ? 0x01 : 0x00;
> 

Applied to my trivial-patches branch.

Thanks,
Laurent



Re: [PATCH 1/3] block: push error reporting into bdrv_all_*_snapshot functions

2020-10-12 Thread Philippe Mathieu-Daudé

On 10/12/20 1:09 PM, Max Reitz wrote:

On 12.10.20 12:16, Philippe Mathieu-Daudé wrote:

On 10/12/20 12:07 PM, Max Reitz wrote:

On 08.10.20 19:48, Philippe Mathieu-Daudé wrote:

From: Daniel P. Berrangé 

The bdrv_all_*_snapshot functions return a BlockDriverState pointer
for the invalid backend, which the callers then use to report an
error message. In some cases multiple callers are reporting the
same error message, but with slightly different text. In the future
there will be more error scenarios for some of these methods, which
will benefit from fine grained error message reporting. So it is
helpful to push error reporting down a level.

Signed-off-by: Daniel P. Berrangé 
---
   include/block/snapshot.h   | 14 +++
   block/monitor/block-hmp-cmds.c |  7 ++--
   block/snapshot.c   | 77 +-
   migration/savevm.c | 37 +---
   monitor/hmp-cmds.c |  7 +---
   replay/replay-debugging.c  |  4 +-
   tests/qemu-iotests/267.out | 10 ++---
   7 files changed, 67 insertions(+), 89 deletions(-)


Looks good overall to me, but for some reason this patch pulls in the
@ok and @ret variables from the top scope of all concerned functions
into the inner scopes of the BDS loops, and drops their initialization.
   That’s wrong, because we only call the respective snapshotting
functions on some BDSs, so the return value stays uninitialized for all
other BDSs:


Indeed, thanks for catching that.

[...]

   int bdrv_all_create_snapshot(QEMUSnapshotInfo *sn,
    BlockDriverState *vm_state_bs,
    uint64_t vm_state_size,
- BlockDriverState **first_bad_bs)
+ Error **errp)
   {
-    int err = 0;
   BlockDriverState *bs;
   BdrvNextIterator it;
     for (bs = bdrv_first(); bs; bs = bdrv_next()) {
   AioContext *ctx = bdrv_get_aio_context(bs);
+    int ret;


And one final time.

Max


   aio_context_acquire(ctx);
   if (bs == vm_state_bs) {
   sn->vm_state_size = vm_state_size;
-    err = bdrv_snapshot_create(bs, sn);
+    ret = bdrv_snapshot_create(bs, sn);
   } else if (bdrv_all_snapshots_includes_bs(bs)) {
   sn->vm_state_size = 0;
-    err = bdrv_snapshot_create(bs, sn);
+    ret = bdrv_snapshot_create(bs, sn);


This one is not needed.


Why not?  Is bdrv_all_snapshots_includes_bs(bs) guaranteed to be true?
(I don’t see any a plain “else” branch, or where ret would be set
outside of these two “if” blocks.)


Oops, I misread the 'else' branch, you are right again :)



Max


   }
   aio_context_release(ctx);
-    if (err < 0) {
+    if (ret < 0) {
+    error_setg(errp, "Could not create snapshot '%s' on '%s'",
+   sn->name, bdrv_get_device_or_node_name(bs));
   bdrv_next_cleanup();
-    goto fail;
+    return -1;
   }
   }











Re: [PATCH v4 4/7] keyval: Parse help options

2020-10-12 Thread Eric Blake

On 10/11/20 2:35 AM, Markus Armbruster wrote:

From: Kevin Wolf 

This adds a special meaning for 'help' and '?' as options to the keyval
parser. Instead of being an error (because of a missing value) or a
value for an implied key, they now request help, which is a new boolean
output of the parser in addition to the QDict.

A new parameter 'p_help' is added to keyval_parse() that contains on
return whether help was requested. If NULL is passed, requesting help
results in an error and all other cases work like before.




+
+/* "help" by itself, without implied key */
+qdict = keyval_parse("help", NULL, , _abort);
+g_assert_cmpuint(qdict_size(qdict), ==, 0);
+g_assert(help);
+qobject_unref(qdict);
+
+/* "help" by itself, with implied key */
+qdict = keyval_parse("help", "implied", , _abort);
+g_assert_cmpuint(qdict_size(qdict), ==, 0);
+g_assert(help);
+qobject_unref(qdict);
+
+/* "help" when no help is available, without implied key */
+qdict = keyval_parse("help", NULL, NULL, );
+error_free_or_abort();
+g_assert(!qdict);
+
+/* "help" when no help is available, with implied key */
+qdict = keyval_parse("help", "implied", NULL, );
+error_free_or_abort();
+g_assert(!qdict);
+
+/* Key "help" */
+qdict = keyval_parse("help=on", NULL, , _abort);
+g_assert_cmpuint(qdict_size(qdict), ==, 1);
+g_assert_cmpstr(qdict_get_try_str(qdict, "help"), ==, "on");
+g_assert(!help);
+qobject_unref(qdict);
+
+/* "help" followed by crap, without implied key */
+qdict = keyval_parse("help.abc", NULL, , );
+error_free_or_abort();
+g_assert(!qdict);
+
+/* "help" followed by crap, with implied key */
+qdict = keyval_parse("help.abc", "implied", , );
+g_assert_cmpuint(qdict_size(qdict), ==, 1);
+g_assert_cmpstr(qdict_get_try_str(qdict, "implied"), ==, "help.abc");
+g_assert(!help);
+qobject_unref(qdict);
+
+/* "help" with other stuff, without implied key */
+qdict = keyval_parse("number=42,help,foo=bar", NULL, , _abort);
+g_assert_cmpuint(qdict_size(qdict), ==, 2);
+g_assert_cmpstr(qdict_get_try_str(qdict, "number"), ==, "42");
+g_assert_cmpstr(qdict_get_try_str(qdict, "foo"), ==, "bar");
+g_assert(help);
+qobject_unref(qdict);
+
+/* "help" with other stuff, with implied key */
+qdict = keyval_parse("val,help,foo=bar", "implied", , _abort);
+g_assert_cmpuint(qdict_size(qdict), ==, 2);
+g_assert_cmpstr(qdict_get_try_str(qdict, "implied"), ==, "val");
+g_assert_cmpstr(qdict_get_try_str(qdict, "foo"), ==, "bar");
+g_assert(help);
+qobject_unref(qdict);


Is it worth checking that "helper" with implied key is a value, not help?


+++ b/util/keyval.c
@@ -14,10 +14,11 @@
   * KEY=VALUE,... syntax:
   *
   *   key-vals = [ key-val { ',' key-val } [ ',' ] ]
- *   key-val  = key '=' val
+ *   key-val  = key '=' val | help
   *   key  = key-fragment { '.' key-fragment }
   *   key-fragment = / [^=,.]+ /
   *   val  = { / [^,]+ / | ',,' }
+ *   help = 'help | '?'


Missing '

Otherwise
Reviewed-by: Eric Blake 

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PATCH v4 3/7] keyval: Fix parsing of ', ' in value of implied key

2020-10-12 Thread Eric Blake

On 10/11/20 2:35 AM, Markus Armbruster wrote:

The previous commit demonstrated documentation and code disagree on
parsing of ',' in the value of an implied key.  Fix the code to match
the documentation.

This breaks uses of keyval_parse() that pass an implied key and accept
a value containing ','.  None of the existing uses does:

* audiodev: implied key "driver" is enum AudiodevDriver, none of the
   values contains ','

* display: implied key "type" is enum DisplayType, none of the values
   contains ','

* blockdev: implied key "driver is enum BlockdevDriver, none of the
   values contains ','

* export: implied key "type" is enum BlockExportType, none of the
   values contains ','

* monitor: implied key "mode" is enum MonitorMode, none of the values
   contains ','

* nbd-server: no implied key.

Signed-off-by: Markus Armbruster 
---
  tests/test-keyval.c |  8 +++-
  util/keyval.c   | 28 +---
  2 files changed, 20 insertions(+), 16 deletions(-)



Reviewed-by: Eric Blake 

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: Which qemu change corresponds to RedHat bug 1655408

2020-10-12 Thread Max Reitz
On 09.10.20 14:55, Jakob Bohm wrote:
> On 2020-10-09 10:48, Max Reitz wrote:
>> On 08.10.20 18:49, Jakob Bohm wrote:
>>> (Top posting because previous reply did so):
>>>
>>> If the bug was closed as "can't reproduce", why was a very similar bug
>>> listed as fixed in RHSA-2019:2553-01 ?
>>
>> Hi,
>>
>> Which very similar bug do you mean?  I can only guess that perhaps you
>> mean 1603104 or 1551486.
>>
>> Bug 1603104 was about qemu not ignoring errors when releasing file locks
>> fails (we should ignore errors then, because they're not fatal, and we
>> often cannot return errors, so they ended up as aborts).  (To give more
>> context, this error generally appeared only when the storage the image
>> is on somehow disappeared while qemu is running.  E.g. when the
>> connection to an NFS server was lost.)
>>
>> Bug 1551486 entailed a bit of a rewrite of the whole locking code, which
>> may have resulted in the bug 1655408 no longer appearing for our QE
>> team.  But it was a different bug, as it wasn’t about any error, but
>> just about the fact that qemu used more FDs than necessary.
>>
>> (Although I see 1655408 was reported for RHEL 8, whereas 1603104 and
>> 1551486 (as part of RHSA-2019:2553) were reported for RHEL 7.  The
>> corresponding RHEL 8 bug for those two is 1694148.)
>>
>> Either way, both of those bugs are fixed in 5.0.
>>
>>
>> 1655408 in contrast reports an error at startup; locking itself failed.
>>   I couldn’t reproduce it, and I still can’t; neither with the image
>> mounted concurrently, nor with an RO NFS mount.
>>
>> (For example:
>>
>> exports:
>> [...]/test-nfs-ro
>> 127.0.0.1(ro,sync,no_subtree_check,fsid=0,insecure,crossmnt)
>>
>> $ for i in $(seq 100); do \
>>  echo -e '\033[1m---\033[0m'; \
>>  x86_64-softmmu/qemu-system-x86_64 \
>>    -drive \
>>  if=none,id=drv0,readonly=on,file=/mnt/tmp/arch.iso,format=raw \
>>    -device ide-cd,drive=drv0 \
>>    -enable-kvm -m 2048 -display none &; \
>>  pid=$!; \
>>  sleep 1; \
>>  kill $pid; \
>>    done
>>
>> (Where x86_64-softmmu/qemu-system-x86_64 is upstream 5.0.1.)
>>
>> All I see is something like:
>>
>> ---
>> qemu-system-x86_64: terminating on signal 15 from pid 7278 (/bin/zsh)
>> [2] 34103
>> [3]  - 34095 terminated  x86_64-softmmu/qemu-system-x86_64 -drive
>> -device ide-cd,drive=drv0  -m 2048
>>
>> So no file locking errors.)
>>
> 
> The error I got was specifically "Failed to lock byte 100" and VM not
> starting.  The ISO file was on a R/W NFS3 share, but was itself R/O for
> the user that root was mapped to by linux-nfs-server via /etc/exports
> options, specifically the file iso file was mode 0444 in a 0755
> directory, and the exports line was (simplified)
> 
> /share1
> :::/64(ro,sync,mp,subtree_check,anonuid=1000,anongid=1000)
> 
> where :::/64 is the numeric IPv6 prefix of the LAN
> 
> NFS kernel Server ran Debian Stretch kernel 4.19.0-0.bpo.8-amd64 #1 SMP
> Debian 4.19.98-1~bpo9+1 (2020-03-09) x86_64 GNU/Linux
> 
> NFS client mount options were:
> 
> rw,nosuid,nodev,noatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,
> soft,proto=tcp6,timeo=600,retrans=6,sec=sys,mountaddr=:::::xxff:fexx:,
> 
> mountvers=3,mountport=45327,mountproto=udp6,local_lock=none,addr=:::::xxff:fexx:

I’ve tried using these settings, but still can’t reproduce the bug.

Nothing else uses the image when you try to attach it to qemu, right?
(Your last email noted something about a loop mount, but I’m not sure
whether that just referred to the RH Bugzilla entry.)

(local_lock=none means that all locks are relayed to the server, correct?)

Max




Re: [PATCH v4 2/7] test-keyval: Demonstrate misparse of ',' with implied key

2020-10-12 Thread Eric Blake

On 10/11/20 2:35 AM, Markus Armbruster wrote:

Add a test for "val,,ue" with implied key.  Documentation says this
should parse as implied key with value "val", then fail.  The code
parses it as implied key with value "val,ue", then succeeds.  The next
commit will fix it.

Signed-off-by: Markus Armbruster 
---
  tests/test-keyval.c | 7 +++
  1 file changed, 7 insertions(+)


Reviewed-by: Eric Blake 



diff --git a/tests/test-keyval.c b/tests/test-keyval.c
index e331a84149..f02bdf7029 100644
--- a/tests/test-keyval.c
+++ b/tests/test-keyval.c
@@ -182,6 +182,13 @@ static void test_keyval_parse(void)
  error_free_or_abort();
  g_assert(!qdict);
  
+/* Implied key's value can't have comma (qemu_opts_parse(): it can) */

+/* BUG: it can */
+qdict = keyval_parse("val,,ue", "implied", _abort);
+g_assert_cmpuint(qdict_size(qdict), ==, 1);
+g_assert_cmpstr(qdict_get_try_str(qdict, "implied"), ==, "val,ue");
+qobject_unref(qdict);
+
  /* Empty key is not an implied key */
  qdict = keyval_parse("=val", "implied", );
  error_free_or_abort();



--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PATCH v4 1/7] keyval: Fix and clarify grammar

2020-10-12 Thread Eric Blake

On 10/11/20 2:34 AM, Markus Armbruster wrote:

The grammar has a few issues:

* key-fragment = / [^=,.]* /

   Prose restricts key fragments: they "must be valid QAPI names or
   consist only of decimal digits".  Technically, '' consists only of
   decimal digits.  The code rejects that.  Fix the grammar.

* val  = { / [^,]* / | ',,' }

   Use + instead of *.  Accepts the same language.

* val-no-key   = / [^=,]* /

   The code rejects an empty value.  Fix the grammar.

* Section "Additional syntax for use with an implied key" is
   confusing.  Rewrite it.

Signed-off-by: Markus Armbruster 
---
  util/keyval.c | 16 ++--
  1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/util/keyval.c b/util/keyval.c
index 13def4af54..82d8497c71 100644
--- a/util/keyval.c
+++ b/util/keyval.c
@@ -16,8 +16,8 @@
   *   key-vals = [ key-val { ',' key-val } [ ',' ] ]
   *   key-val  = key '=' val
   *   key  = key-fragment { '.' key-fragment }
- *   key-fragment = / [^=,.]* /
- *   val  = { / [^,]* / | ',,' }
+ *   key-fragment = / [^=,.]+ /


This requires a non-empty string.  Good (we don't allow an empty key).


+ *   val  = { / [^,]+ / | ',,' }


I agree that this has no real change.  Previously, you allowed zero or 
more repetitions of a regex that could produce zero characters; now, 
each outer repetition must make progress.



   *
   * Semantics defined by reduction to JSON:
   *
@@ -71,12 +71,16 @@
   * Awkward.  Note that we carefully restrict alternate types to avoid
   * similar ambiguity.
   *
- * Additional syntax for use with an implied key:
+ * Alternative syntax for use with an implied key:
   *
- *   key-vals-ik  = val-no-key [ ',' key-vals ]
- *   val-no-key   = / [^=,]* /
+ *   key-vals = [ key-val-1st { ',' key-val } [ ',' ] ]
+ *   key-val-1st  = val-no-key | key-val
+ *   val-no-key   = / [^=,]+ /
   *
- * where no-key is syntactic sugar for implied-key=val-no-key.
+ * where val-no-key is syntactic sugar for implied-key=val-no-key.
+ *
+ * Note that you can't use the sugared form when the value contains
+ * '=' or ','.


Nor can you use the sugared form when the value is intended to be empty 
(although this may be academic, as your other patches enumerate the list 
of clients, and none of them seem to allow an empty value even when 
desugared).


Reviewed-by: Eric Blake 

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PATCH v7 02/13] qtest: Reintroduce qtest_qmp_receive

2020-10-12 Thread Thomas Huth
On 06/10/2020 14.38, Maxim Levitsky wrote:
> The new qtest_qmp_receive buffers all the received qmp events, allowing
> qtest_qmp_eventwait_ref to return them.
> 
> This is intended to solve the race in regard to ordering of qmp events
> vs qmp responses, as soon as the callers start using the new interface.
> 
> In addition to that, define qtest_qmp_event_ref a function which only scans
> the buffer that qtest_qmp_receive stores the events to.
> 
> This is intended for callers that are only interested in events that were
> received during the last call to the qtest_qmp_receive.
> 
> Suggested-by: Paolo Bonzini 
> Signed-off-by: Maxim Levitsky 
> ---
>  tests/qtest/libqos/libqtest.h | 23 
>  tests/qtest/libqtest.c| 49 ++-
>  2 files changed, 71 insertions(+), 1 deletion(-)
> 
> diff --git a/tests/qtest/libqos/libqtest.h b/tests/qtest/libqos/libqtest.h
> index a41135fc92..19429a536d 100644
> --- a/tests/qtest/libqos/libqtest.h
> +++ b/tests/qtest/libqos/libqtest.h
> @@ -198,6 +198,16 @@ void qtest_qmp_vsend(QTestState *s, const char *fmt, 
> va_list ap)
>   */
>  QDict *qtest_qmp_receive_dict(QTestState *s);
>  
> +/**
> + * qtest_qmp_receive:
> + * @s: #QTestState instance to operate on.
> + *
> + * Reads a QMP message from QEMU and returns the response.
> + * Buffers all the events received meanwhile, until a
> + * call to qtest_qmp_eventwait
> + */
> +QDict *qtest_qmp_receive(QTestState *s);

Re-introducing qtest_qmp_receive() with different behavior than before will
likely make backports of other later patches a pain, and might also break
other patches that use this function but are not merged yet. Could you
please use a different name for this function instead? Maye
qtest_qmp_receive_buffered() or something like that?

 Thomas




Re: [PATCH v7 01/13] qtest: rename qtest_qmp_receive to qtest_qmp_receive_dict

2020-10-12 Thread Thomas Huth
On 06/10/2020 14.38, Maxim Levitsky wrote:
> In the next patch a new version of qtest_qmp_receive will be
> reintroduced that will buffer received qmp events for later
> consumption in qtest_qmp_eventwait_ref
> 
> No functional change intended.
> 
> Suggested-by: Paolo Bonzini 
> Signed-off-by: Maxim Levitsky 
> ---
>  tests/qtest/ahci-test.c|  4 ++--
>  tests/qtest/device-plug-test.c |  2 +-
>  tests/qtest/drive_del-test.c   |  2 +-
>  tests/qtest/libqos/libqtest.h  |  4 ++--
>  tests/qtest/libqtest.c | 16 
>  tests/qtest/pvpanic-test.c |  2 +-
>  tests/qtest/qmp-test.c | 18 +-
>  7 files changed, 24 insertions(+), 24 deletions(-)


Acked-by: Thomas Huth 




Re: [PATCH 1/3] block: push error reporting into bdrv_all_*_snapshot functions

2020-10-12 Thread Max Reitz
On 12.10.20 12:16, Philippe Mathieu-Daudé wrote:
> On 10/12/20 12:07 PM, Max Reitz wrote:
>> On 08.10.20 19:48, Philippe Mathieu-Daudé wrote:
>>> From: Daniel P. Berrangé 
>>>
>>> The bdrv_all_*_snapshot functions return a BlockDriverState pointer
>>> for the invalid backend, which the callers then use to report an
>>> error message. In some cases multiple callers are reporting the
>>> same error message, but with slightly different text. In the future
>>> there will be more error scenarios for some of these methods, which
>>> will benefit from fine grained error message reporting. So it is
>>> helpful to push error reporting down a level.
>>>
>>> Signed-off-by: Daniel P. Berrangé 
>>> ---
>>>   include/block/snapshot.h   | 14 +++
>>>   block/monitor/block-hmp-cmds.c |  7 ++--
>>>   block/snapshot.c   | 77 +-
>>>   migration/savevm.c | 37 +---
>>>   monitor/hmp-cmds.c |  7 +---
>>>   replay/replay-debugging.c  |  4 +-
>>>   tests/qemu-iotests/267.out | 10 ++---
>>>   7 files changed, 67 insertions(+), 89 deletions(-)
>>
>> Looks good overall to me, but for some reason this patch pulls in the
>> @ok and @ret variables from the top scope of all concerned functions
>> into the inner scopes of the BDS loops, and drops their initialization.
>>   That’s wrong, because we only call the respective snapshotting
>> functions on some BDSs, so the return value stays uninitialized for all
>> other BDSs:
> 
> Indeed, thanks for catching that.
> 
> [...]
>>>   int bdrv_all_create_snapshot(QEMUSnapshotInfo *sn,
>>>    BlockDriverState *vm_state_bs,
>>>    uint64_t vm_state_size,
>>> - BlockDriverState **first_bad_bs)
>>> + Error **errp)
>>>   {
>>> -    int err = 0;
>>>   BlockDriverState *bs;
>>>   BdrvNextIterator it;
>>>     for (bs = bdrv_first(); bs; bs = bdrv_next()) {
>>>   AioContext *ctx = bdrv_get_aio_context(bs);
>>> +    int ret;
>>
>> And one final time.
>>
>> Max
>>
>>>   aio_context_acquire(ctx);
>>>   if (bs == vm_state_bs) {
>>>   sn->vm_state_size = vm_state_size;
>>> -    err = bdrv_snapshot_create(bs, sn);
>>> +    ret = bdrv_snapshot_create(bs, sn);
>>>   } else if (bdrv_all_snapshots_includes_bs(bs)) {
>>>   sn->vm_state_size = 0;
>>> -    err = bdrv_snapshot_create(bs, sn);
>>> +    ret = bdrv_snapshot_create(bs, sn);
> 
> This one is not needed.

Why not?  Is bdrv_all_snapshots_includes_bs(bs) guaranteed to be true?
(I don’t see any a plain “else” branch, or where ret would be set
outside of these two “if” blocks.)

Max

>>>   }
>>>   aio_context_release(ctx);
>>> -    if (err < 0) {
>>> +    if (ret < 0) {
>>> +    error_setg(errp, "Could not create snapshot '%s' on '%s'",
>>> +   sn->name, bdrv_get_device_or_node_name(bs));
>>>   bdrv_next_cleanup();
>>> -    goto fail;
>>> +    return -1;
>>>   }
>>>   }
>>
> 




Re: [PATCH 1/3] block: push error reporting into bdrv_all_*_snapshot functions

2020-10-12 Thread Philippe Mathieu-Daudé

On 10/12/20 12:07 PM, Max Reitz wrote:

On 08.10.20 19:48, Philippe Mathieu-Daudé wrote:

From: Daniel P. Berrangé 

The bdrv_all_*_snapshot functions return a BlockDriverState pointer
for the invalid backend, which the callers then use to report an
error message. In some cases multiple callers are reporting the
same error message, but with slightly different text. In the future
there will be more error scenarios for some of these methods, which
will benefit from fine grained error message reporting. So it is
helpful to push error reporting down a level.

Signed-off-by: Daniel P. Berrangé 
---
  include/block/snapshot.h   | 14 +++
  block/monitor/block-hmp-cmds.c |  7 ++--
  block/snapshot.c   | 77 +-
  migration/savevm.c | 37 +---
  monitor/hmp-cmds.c |  7 +---
  replay/replay-debugging.c  |  4 +-
  tests/qemu-iotests/267.out | 10 ++---
  7 files changed, 67 insertions(+), 89 deletions(-)


Looks good overall to me, but for some reason this patch pulls in the
@ok and @ret variables from the top scope of all concerned functions
into the inner scopes of the BDS loops, and drops their initialization.
  That’s wrong, because we only call the respective snapshotting
functions on some BDSs, so the return value stays uninitialized for all
other BDSs:


Indeed, thanks for catching that.

[...]

  int bdrv_all_create_snapshot(QEMUSnapshotInfo *sn,
   BlockDriverState *vm_state_bs,
   uint64_t vm_state_size,
- BlockDriverState **first_bad_bs)
+ Error **errp)
  {
-int err = 0;
  BlockDriverState *bs;
  BdrvNextIterator it;
  
  for (bs = bdrv_first(); bs; bs = bdrv_next()) {

  AioContext *ctx = bdrv_get_aio_context(bs);
+int ret;


And one final time.

Max


  aio_context_acquire(ctx);
  if (bs == vm_state_bs) {
  sn->vm_state_size = vm_state_size;
-err = bdrv_snapshot_create(bs, sn);
+ret = bdrv_snapshot_create(bs, sn);
  } else if (bdrv_all_snapshots_includes_bs(bs)) {
  sn->vm_state_size = 0;
-err = bdrv_snapshot_create(bs, sn);
+ret = bdrv_snapshot_create(bs, sn);


This one is not needed.


  }
  aio_context_release(ctx);
-if (err < 0) {
+if (ret < 0) {
+error_setg(errp, "Could not create snapshot '%s' on '%s'",
+   sn->name, bdrv_get_device_or_node_name(bs));
  bdrv_next_cleanup();
-goto fail;
+return -1;
  }
  }







Re: [PATCH 1/3] block: push error reporting into bdrv_all_*_snapshot functions

2020-10-12 Thread Max Reitz
On 08.10.20 19:48, Philippe Mathieu-Daudé wrote:
> From: Daniel P. Berrangé 
> 
> The bdrv_all_*_snapshot functions return a BlockDriverState pointer
> for the invalid backend, which the callers then use to report an
> error message. In some cases multiple callers are reporting the
> same error message, but with slightly different text. In the future
> there will be more error scenarios for some of these methods, which
> will benefit from fine grained error message reporting. So it is
> helpful to push error reporting down a level.
> 
> Signed-off-by: Daniel P. Berrangé 
> ---
>  include/block/snapshot.h   | 14 +++
>  block/monitor/block-hmp-cmds.c |  7 ++--
>  block/snapshot.c   | 77 +-
>  migration/savevm.c | 37 +---
>  monitor/hmp-cmds.c |  7 +---
>  replay/replay-debugging.c  |  4 +-
>  tests/qemu-iotests/267.out | 10 ++---
>  7 files changed, 67 insertions(+), 89 deletions(-)

Looks good overall to me, but for some reason this patch pulls in the
@ok and @ret variables from the top scope of all concerned functions
into the inner scopes of the BDS loops, and drops their initialization.
 That’s wrong, because we only call the respective snapshotting
functions on some BDSs, so the return value stays uninitialized for all
other BDSs:

> diff --git a/block/snapshot.c b/block/snapshot.c
> index a2bf3a54eb..50e35bb9fa 100644
> --- a/block/snapshot.c
> +++ b/block/snapshot.c
> @@ -462,14 +462,14 @@ static bool 
> bdrv_all_snapshots_includes_bs(BlockDriverState *bs)
>   * These functions will properly handle dataplane (take aio_context_acquire
>   * when appropriate for appropriate block drivers) */
>  
> -bool bdrv_all_can_snapshot(BlockDriverState **first_bad_bs)
> +bool bdrv_all_can_snapshot(Error **errp)
>  {
> -bool ok = true;
>  BlockDriverState *bs;
>  BdrvNextIterator it;
>  
>  for (bs = bdrv_first(); bs; bs = bdrv_next()) {
>  AioContext *ctx = bdrv_get_aio_context(bs);
> +bool ok;

So I think @ok must be initialized.

>  
>  aio_context_acquire(ctx);
>  if (bdrv_all_snapshots_includes_bs(bs)) {
> @@ -477,26 +477,25 @@ bool bdrv_all_can_snapshot(BlockDriverState 
> **first_bad_bs)
>  }
>  aio_context_release(ctx);
>  if (!ok) {
> +error_setg(errp, "Device '%s' is writable but does not support "
> +   "snapshots", bdrv_get_device_or_node_name(bs));
>  bdrv_next_cleanup();
> -goto fail;
> +return false;
>  }
>  }
>  
> -fail:
> -*first_bad_bs = bs;
> -return ok;
> +return true;
>  }
>  
> -int bdrv_all_delete_snapshot(const char *name, BlockDriverState 
> **first_bad_bs,
> - Error **errp)
> +int bdrv_all_delete_snapshot(const char *name, Error **errp)
>  {
> -int ret = 0;
>  BlockDriverState *bs;
>  BdrvNextIterator it;
>  QEMUSnapshotInfo sn1, *snapshot = 
>  
>  for (bs = bdrv_first(); bs; bs = bdrv_next()) {
>  AioContext *ctx = bdrv_get_aio_context(bs);
> +int ret;

Same here, @ret must be initialized.

>  
>  aio_context_acquire(ctx);
>  if (bdrv_all_snapshots_includes_bs(bs) &&
> @@ -507,26 +506,25 @@ int bdrv_all_delete_snapshot(const char *name, 
> BlockDriverState **first_bad_bs,
>  }
>  aio_context_release(ctx);
>  if (ret < 0) {
> +error_prepend(errp, "Could not delete snapshot '%s' on '%s': ",
> +  name, bdrv_get_device_or_node_name(bs));
>  bdrv_next_cleanup();
> -goto fail;
> +return -1;
>  }
>  }
>  
> -fail:
> -*first_bad_bs = bs;
> -return ret;
> +return 0;
>  }
>  
>  
> -int bdrv_all_goto_snapshot(const char *name, BlockDriverState **first_bad_bs,
> -   Error **errp)
> +int bdrv_all_goto_snapshot(const char *name, Error **errp)
>  {
> -int ret = 0;
>  BlockDriverState *bs;
>  BdrvNextIterator it;
>  
>  for (bs = bdrv_first(); bs; bs = bdrv_next()) {
>  AioContext *ctx = bdrv_get_aio_context(bs);
> +int ret;

And again.

>  
>  aio_context_acquire(ctx);
>  if (bdrv_all_snapshots_includes_bs(bs)) {
> @@ -534,75 +532,75 @@ int bdrv_all_goto_snapshot(const char *name, 
> BlockDriverState **first_bad_bs,
>  }
>  aio_context_release(ctx);
>  if (ret < 0) {
> +error_prepend(errp, "Could not load snapshot '%s' on '%s': ",
> +  name, bdrv_get_device_or_node_name(bs));
>  bdrv_next_cleanup();
> -goto fail;
> +return -1;
>  }
>  }
>  
> -fail:
> -*first_bad_bs = bs;
> -return ret;
> +return 0;
>  }
>  
> -int bdrv_all_find_snapshot(const char *name, BlockDriverState **first_bad_bs)
> +int bdrv_all_find_snapshot(const char *name, Error **errp)
>  {
>  

Re: [PATCH 1/3] hw/ssi/aspeed_smc: Rename max_slaves as max_devices

2020-10-12 Thread Kevin Wolf
Am 11.10.2020 um 23:05 hat Philippe Mathieu-Daudé geschrieben:
> From: Philippe Mathieu-Daudé 
> 
> In order to use inclusive terminology, rename max_slaves
> as max_peripherals.

This is inconsistent with the subject line which talks about
"max_devices".

Kevin




Re: [PATCH 1/3] hw/ssi/aspeed_smc: Rename max_slaves as max_devices

2020-10-12 Thread Philippe Mathieu-Daudé

On 10/12/20 12:04 PM, Kevin Wolf wrote:

Am 11.10.2020 um 23:05 hat Philippe Mathieu-Daudé geschrieben:

From: Philippe Mathieu-Daudé 

In order to use inclusive terminology, rename max_slaves
as max_peripherals.


This is inconsistent with the subject line which talks about
"max_devices".


Ah I missed that change, thanks.



Re: Which qemu change corresponds to RedHat bug 1655408

2020-10-12 Thread Max Reitz
On 10.10.20 00:54, Jakob Bohm wrote:

[...]

> Theoretically, locking on a raw file needs to be protocol-compatible
> with loop-mounting the same raw file, so if the loop driver doesn't
> probe those magic byte offsets to prevent out-of-order block writes,
> then there is little point for the qemu raw driver to do so.
> 
> This applies to both the loop driver in the host kernel and the loop
> driver on any other machine with file share access to the sane image
> file.

The file access locks aren’t meant to prevent arbitrary other programs
from accessing those specific few bytes, but to prevent concurrently
running qemu processes from accessing the image.  That’s why qemu
doesn’t lock the whole image file, but only a small and very specific
set of bytes.

The problem we originally faced was that some people would run qemu-img
to modify qcow2 images while a VM was running, leading to metadata
corruption and thus data loss.  This is the main reason the locks were
introduced.  However, the problem isn’t limited to qcow2 images.  Even
for raw images, we generally have to prevent e.g. concurrent writes to
the image (from other VMs that might want to use that image) because the
guest won’t be able to deal with that.  Hence why we take locks even on
non-qcow2 images.

> As for upgrading, I will try newer kernels packaged for the Debian
> version used, once the current large batch job has completed, but I
> doubt it will make much difference given the principles I just stated.

I’m not sure whether I understood your paragraphs above wrong, but I
don’t see how they explain why the bug would appear (i.e., why qemu
would fail taking the file lock).  It only seems to explain why it seems
superfluous for qemu to take locks when the loop driver will be the only
concurrent user of the image (presumably because the loop driver just
doesn’t take any locks; which means that qemu should be able to).

Max