Re: making a qdev bus available from a (non-qtree?) device
On 5/11/21 8:17 PM, Klaus Jensen wrote: > Hi all, > > I need some help with grok'ing qdev busses. Stefan, Michael - David > suggested on IRC that I CC'ed you guys since you might have solved a > similar issue with virtio devices. I've tried to study how that works, > but I'm not exactly sure how to apply it to the issue I'm having. The experts on this topic are Peter/Markus/Eduardo. > Currently, to support multiple namespaces on the emulated nvme device, > one can do something like this: > > -device nvme,id=nvme-ctrl-0,serial=foo,... > -device nvme-ns,id=nvme-ns-0,bus=nvme-ctrl-0,... > -device nvme-ns,id-nvme-ns-1,bus=nvme-ctrl-0,... > > The nvme device creates an 'nvme-bus' and the nvme-ns devices has > dc->bus_type = TYPE_NVME_BUS. This all works very well and provides a > nice overview in `info qtree`: > > bus: main-system-bus > type System > ... > dev: q35-pcihost, id "" > .. > bus: pcie.0 > type PCIE > .. > dev: nvme, id "nvme-ctrl-0" > .. > bus: nvme-ctrl-0 > type nvme-bus > dev: nvme-ns, id "nvme-ns-0" > .. > dev: nvme-ns, id "nvme-ns-1" > .. > > > Nice and qdevy. > > We have since introduced support for NVM Subsystems through an > nvme-subsys device. The nvme-subsys device is just a TYPE_DEVICE and > does not show in `info qtree` (I wonder if this should actually just > have been an -object?). Anyway. The nvme device has a 'subsys' link > parameter and we use this to manage the namespaces across the subsystem > that may contain several nvme devices (controllers). The problem is that > this doesnt work too well with unplugging since if the nvme device is > `device_del`'ed, the nvme-ns devices on the nvme-bus are unrealized > which is not what we want. We really want the namespaces to linger, > preferably on an nvme-bus of the nvme-subsys device so they can be > attached to other nvme devices that may show up (or already exist) in > the subsystem. IIUC, while we can have unattached drives, we can't (by design) have qdev unattached to qbus. Not sure this is a good suggestion (bad design IMO) but you could add a fake nvme qbus to hold the lingering nvme devices... > The core problem I'm having is that I can't seem to create an nvme-bus > from the nvme-subsys device and make it available to the nvme-ns device > on the command line: > > -device nvme-subsys,id=nvme-subsys-0,... > -device nvme-ns,bus=nvme-subsys-0 > > The above results in 'No 'nvme-bus' bus found for device 'nvme-ns', even > though I do `qbus_create_inplace()` just like the nvme device. However, > I *can* reparent the nvme-ns device in its realize() method, so if I > instead define it like so: > > -device nvme-subsys,id=nvme-subsys-0,... > -device nvme,id=nvme-ctrl-0,subsys=nvme-subsys-0 > -device nvme-ns,bus=nvme-ctrl-0 > > I can then call `qdev_set_parent_bus()` and set the parent bus to the > bus creates in the nvme-subsys device. This solves the problem since the > namespaces are not "garbage collected" when the nvme device is removed, > but it just feels wrong you know? Also, if possible, I'd of course > really like to retain the nice entries in `info qtree`. > > > Thanks, > Klaus
Re: [PATCH 3/3] virtio-net: Constify VirtIOFeature feature_sizes[]
在 2021/5/11 下午6:41, Philippe Mathieu-Daudé 写道: Signed-off-by: Philippe Mathieu-Daudé --- hw/net/virtio-net.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c index 66b9ff45118..6b7e8dd04ef 100644 --- a/hw/net/virtio-net.c +++ b/hw/net/virtio-net.c @@ -89,7 +89,7 @@ VIRTIO_NET_RSS_HASH_TYPE_TCP_EX | \ VIRTIO_NET_RSS_HASH_TYPE_UDP_EX) -static VirtIOFeature feature_sizes[] = { +static const VirtIOFeature feature_sizes[] = { {.flags = 1ULL << VIRTIO_NET_F_MAC, .end = endof(struct virtio_net_config, mac)}, {.flags = 1ULL << VIRTIO_NET_F_STATUS, Acked-by: Jason Wang
Re: [PATCH 2/3] virtio-blk: Constify VirtIOFeature feature_sizes[]
在 2021/5/11 下午6:41, Philippe Mathieu-Daudé 写道: Signed-off-by: Philippe Mathieu-Daudé --- hw/block/virtio-blk.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c index d28979efb8d..f139cd7cc9c 100644 --- a/hw/block/virtio-blk.c +++ b/hw/block/virtio-blk.c @@ -40,7 +40,7 @@ * Starting from the discard feature, we can use this array to properly * set the config size depending on the features enabled. */ -static VirtIOFeature feature_sizes[] = { +static const VirtIOFeature feature_sizes[] = { {.flags = 1ULL << VIRTIO_BLK_F_DISCARD, .end = endof(struct virtio_blk_config, discard_sector_alignment)}, {.flags = 1ULL << VIRTIO_BLK_F_WRITE_ZEROES, Acked-by: Jason Wang
Re: [PATCH 1/3] hw/virtio: Pass virtio_feature_get_config_size() a const argument
在 2021/5/11 下午6:41, Philippe Mathieu-Daudé 写道: The VirtIOFeature structure isn't modified, mark it const. Signed-off-by: Philippe Mathieu-Daudé Acked-by: Jason Wang --- include/hw/virtio/virtio.h | 2 +- hw/virtio/virtio.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h index b7ece7a6a89..8bab9cfb750 100644 --- a/include/hw/virtio/virtio.h +++ b/include/hw/virtio/virtio.h @@ -43,7 +43,7 @@ typedef struct VirtIOFeature { size_t end; } VirtIOFeature; -size_t virtio_feature_get_config_size(VirtIOFeature *features, +size_t virtio_feature_get_config_size(const VirtIOFeature *features, uint64_t host_features); typedef struct VirtQueue VirtQueue; diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c index 9e13cb9e3ad..e02544b2df7 100644 --- a/hw/virtio/virtio.c +++ b/hw/virtio/virtio.c @@ -2981,7 +2981,7 @@ int virtio_set_features(VirtIODevice *vdev, uint64_t val) return ret; } -size_t virtio_feature_get_config_size(VirtIOFeature *feature_sizes, +size_t virtio_feature_get_config_size(const VirtIOFeature *feature_sizes, uint64_t host_features) { size_t config_size = 0;
Re: [PATCH v4 06/11] block: make BlockLimits::max_pwrite_zeroes 64bit
On 3/24/21 3:51 PM, Vladimir Sementsov-Ogievskiy wrote: > We are going to support 64 bit write-zeroes requests. Now update the > limit variable. It's absolutely safe. The variable is set in some > drivers, and used in bdrv_co_do_pwrite_zeroes(). > > Update also max_write_zeroes variable in bdrv_co_do_pwrite_zeroes(), so > that bdrv_co_do_pwrite_zeroes() is now prepared to 64bit requests. The > remaining logic including num, offset and bytes variables is already > supporting 64bit requests. > > So the only thing that prevents 64 bit requests is limiting > max_write_zeroes variable to INT_MAX in bdrv_co_do_pwrite_zeroes(). > We'll drop this limitation after updating all block drivers. > > Ah, we also have bdrv_check_request32() in bdrv_co_pwritev_part(). It > will be modified to do bdrv_check_request() for write-zeroes path. > > Signed-off-by: Vladimir Sementsov-Ogievskiy > --- > include/block/block_int.h | 7 +++ > block/io.c| 2 +- > 2 files changed, 4 insertions(+), 5 deletions(-) > > +++ b/include/block/block_int.h > @@ -676,10 +676,9 @@ typedef struct BlockLimits { > * that is set. May be 0 if bl.request_alignment is good enough */ > uint32_t pdiscard_alignment; > > -/* Maximum number of bytes that can zeroized at once (since it is > - * signed, it must be < 2G, if set). Must be multiple of > - * pwrite_zeroes_alignment. May be 0 if no inherent 32-bit limit */ > -int32_t max_pwrite_zeroes; > +/* Maximum number of bytes that can zeroized at once. Must be multiple of > + * pwrite_zeroes_alignment. May be 0 if no inherent 64-bit limit */ Is the comment still right? Leaving as 0 is the easiest way for a driver to say "default limit", but I would feel safer with the default being 2G-align rather than 63-bit limit. And it is a 63-bit limit, not 64-bit, if the driver opts in to INT64_MAX. > +int64_t max_pwrite_zeroes; > > /* Optimal alignment for write zeroes requests in bytes. A power > * of 2 is best but not mandatory. Must be a multiple of > diff --git a/block/io.c b/block/io.c > index 55095dd08e..79e600af27 100644 > --- a/block/io.c > +++ b/block/io.c > @@ -1836,7 +1836,7 @@ static int coroutine_fn > bdrv_co_do_pwrite_zeroes(BlockDriverState *bs, > int head = 0; > int tail = 0; > > -int max_write_zeroes = MIN_NON_ZERO(bs->bl.max_pwrite_zeroes, INT_MAX); > +int64_t max_write_zeroes = MIN_NON_ZERO(bs->bl.max_pwrite_zeroes, > INT_MAX); You are correct that for now we have no behavior change; a driver opting in to a larger limit will still be clamped until we revisit this patch later to drop the MIN() - but I agree with your approach of keeping MIN() here until all drivers are audited. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
Re: [PATCH v4 05/11] block: use int64_t instead of uint64_t in copy_range driver handlers
On 3/24/21 3:51 PM, Vladimir Sementsov-Ogievskiy wrote: > We are generally moving to int64_t for both offset and bytes parameters > on all io paths. > > Main motivation is realization of 64-bit write_zeroes operation for > fast zeroing large disk chunks, up to the whole disk. > > We chose signed type, to be consistent with off_t (which is signed) and > with possibility for signed return type (where negative value means > error). > > So, convert driver copy_range handlers parameters which are already > 64bit to signed type. > > Now let's consider all callers. Simple > > git grep '\->bdrv_co_copy_range' > > shows the only caller: > > bdrv_co_copy_range_internal(), which doesn bdrv_check_request32(), s/doesn/does/ > so everything is OK. > > Still, the functions may be called directly, not only by drv->... > Let's check: > > git grep '\.bdrv_co_copy_range_\(from\|to\)\s*=' | \ > awk '{print $4}' | sed 's/,//' | sed 's/&//' | sort | uniq | \ > while read func; do git grep "$func(" | \ > grep -v "$func(BlockDriverState"; done > > shows no more callers. So, we are done. > > Signed-off-by: Vladimir Sementsov-Ogievskiy > --- > include/block/block_int.h | 12 ++-- > block/file-posix.c| 10 +- > block/iscsi.c | 12 ++-- > block/qcow2.c | 12 ++-- > block/raw-format.c| 16 > 5 files changed, 31 insertions(+), 31 deletions(-) Fewer drivers implement this, so easier review :) > +++ b/block/qcow2.c > @@ -3975,9 +3975,9 @@ static coroutine_fn int > qcow2_co_pdiscard(BlockDriverState *bs, > > static int coroutine_fn > qcow2_co_copy_range_from(BlockDriverState *bs, > - BdrvChild *src, uint64_t src_offset, > - BdrvChild *dst, uint64_t dst_offset, > - uint64_t bytes, BdrvRequestFlags read_flags, > + BdrvChild *src, int64_t src_offset, > + BdrvChild *dst, int64_t dst_offset, > + int64_t bytes, BdrvRequestFlags read_flags, > BdrvRequestFlags write_flags) > { > BDRVQcow2State *s = bs->opaque; The use of cur_bytes = MIN(bytes, INT_MAX) looks odd, when we could instead clamp to a nicer aligned boundary. As noted before, qcow2_get_host_offset() then further clamps cur_bytes, and the caller is already splitting up requests larger than 2G, but copy_from is one of those interfaces that is actually designed to have potentially nice speedups with large byte ranges within the same filesystem (faster than what is possible with usual pread/pwrite). Then again, that's probably more for file-posix (where we know the bytes are contiguous) than qcow2 (where we do have to worry about whether clusters are contiguous), so we'll still have to keep a while(bytes) fragmenting loop in this function. Thoughts for a future patch, doesn't affect correctness of this one. Reviewed-by: Eric Blake -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
Re: [PATCH v3 18/33] nbd/client-connection: shutdown connection on release
On Fri, Apr 16, 2021 at 11:08:56AM +0300, Vladimir Sementsov-Ogievskiy wrote: > Now, when thread can do negotiation and retry, it may run relatively > long. We need a mechanism to stop it, when user is not interested in > result anymore. So, on nbd_client_connection_release() let's shutdown > the socked, and do not retry connection if thread is detached. This kinda answers my question to the previous patch about reconnect cancellation. > Signed-off-by: Vladimir Sementsov-Ogievskiy > --- > nbd/client-connection.c | 36 ++-- > 1 file changed, 26 insertions(+), 10 deletions(-) > > diff --git a/nbd/client-connection.c b/nbd/client-connection.c > index 002bd91f42..54f73c6c24 100644 > --- a/nbd/client-connection.c > +++ b/nbd/client-connection.c > @@ -158,9 +158,13 @@ static void *connect_thread_func(void *opaque) > uint64_t timeout = 1; > uint64_t max_timeout = 16; > > -while (true) { > +qemu_mutex_lock(>mutex); > +while (!conn->detached) { > +assert(!conn->sioc); > conn->sioc = qio_channel_socket_new(); > > +qemu_mutex_unlock(>mutex); > + > error_free(conn->err); > conn->err = NULL; > conn->updated_info = conn->initial_info; > @@ -171,14 +175,20 @@ static void *connect_thread_func(void *opaque) > conn->updated_info.x_dirty_bitmap = NULL; > conn->updated_info.name = NULL; > > +qemu_mutex_lock(>mutex); > + > if (ret < 0) { > object_unref(OBJECT(conn->sioc)); > conn->sioc = NULL; > -if (conn->do_retry) { > +if (conn->do_retry && !conn->detached) { > +qemu_mutex_unlock(>mutex); > + > sleep(timeout); > if (timeout < max_timeout) { > timeout *= 2; > } > + > +qemu_mutex_lock(>mutex); > continue; > } > } > @@ -186,15 +196,17 @@ static void *connect_thread_func(void *opaque) > break; > } > > -WITH_QEMU_LOCK_GUARD(>mutex) { > -assert(conn->running); > -conn->running = false; > -if (conn->wait_co) { > -aio_co_schedule(NULL, conn->wait_co); > -conn->wait_co = NULL; > -} > -do_free = conn->detached; > +/* mutex is locked */ > + > +assert(conn->running); > +conn->running = false; > +if (conn->wait_co) { > +aio_co_schedule(NULL, conn->wait_co); > +conn->wait_co = NULL; > } > +do_free = conn->detached; > + > +qemu_mutex_unlock(>mutex); This basically reverts some hunks from patch 15 "nbd/client-connection: use QEMU_LOCK_GUARD". How about dropping them there (they weren't an obvious improvement even then). Roman. > > if (do_free) { > nbd_client_connection_do_free(conn); > @@ -215,6 +227,10 @@ void nbd_client_connection_release(NBDClientConnection > *conn) > if (conn->running) { > conn->detached = true; > } > +if (conn->sioc) { > +qio_channel_shutdown(QIO_CHANNEL(conn->sioc), > + QIO_CHANNEL_SHUTDOWN_BOTH, NULL); > +} > do_free = !conn->running && !conn->detached; > qemu_mutex_unlock(>mutex); > > -- > 2.29.2 >
Re: [PATCH v4 04/11] block: use int64_t instead of uint64_t in driver write handlers
On 3/24/21 3:51 PM, Vladimir Sementsov-Ogievskiy wrote: > We are generally moving to int64_t for both offset and bytes parameters > on all io paths. > > Main motivation is realization of 64-bit write_zeroes operation for > fast zeroing large disk chunks, up to the whole disk. > > We chose signed type, to be consistent with off_t (which is signed) and > with possibility for signed return type (where negative value means > error). > > So, convert driver write handlers parameters which are already 64bit to > signed type. > > While being here, convert also flags parameter to be BdrvRequestFlags. > > Now let's consider all callers. Simple > > git grep '\->bdrv_\(aio\|co\)_pwritev\(_part\)\?' > > shows that's there two callers of driver function: > > bdrv_driver_pwritev() in block/io.c, passes int64_t, checked by >bdrv_check_qiov_request() to be non-negative. > > qcow2_save_vmstate() does bdrv_check_qiov_request(). > > Still, the functions may be called directly, not only by drv->... > Let's check: > > git grep '\.bdrv_\(aio\|co\)_pwritev\(_part\)\?\s*=' | \ > awk '{print $4}' | sed 's/,//' | sed 's/&//' | sort | uniq | \ > while read func; do git grep "$func(" | \ > grep -v "$func(BlockDriverState"; done > > shows several callers: ... Thanks for recording the audit. > > Signed-off-by: Vladimir Sementsov-Ogievskiy > --- > include/block/block_int.h| 16 > block/backup-top.c | 6 +++--- > 30 files changed, 95 insertions(+), 86 deletions(-) > > diff --git a/include/block/block_int.h b/include/block/block_int.h > index e6bcf74e46..928369e0bc 100644 > --- a/include/block/block_int.h > +++ b/include/block/block_int.h > @@ -237,8 +237,8 @@ struct BlockDriver { > int64_t offset, int64_t bytes, QEMUIOVector *qiov, > BdrvRequestFlags flags, BlockCompletionFunc *cb, void *opaque); > BlockAIOCB *(*bdrv_aio_pwritev)(BlockDriverState *bs, > -uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags, > -BlockCompletionFunc *cb, void *opaque); > +int64_t offset, int64_t bytes, QEMUIOVector *qiov, > +BdrvRequestFlags flags, BlockCompletionFunc *cb, void *opaque); > BlockAIOCB *(*bdrv_aio_flush)(BlockDriverState *bs, > BlockCompletionFunc *cb, void *opaque); > BlockAIOCB *(*bdrv_aio_pdiscard)(BlockDriverState *bs, > @@ -287,10 +287,11 @@ struct BlockDriver { > * The buffer in @qiov may point directly to guest memory. > */ > int coroutine_fn (*bdrv_co_pwritev)(BlockDriverState *bs, > -uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags); > +int64_t offset, int64_t bytes, QEMUIOVector *qiov, > +BdrvRequestFlags flags); > int coroutine_fn (*bdrv_co_pwritev_part)(BlockDriverState *bs, > -uint64_t offset, uint64_t bytes, > -QEMUIOVector *qiov, size_t qiov_offset, int flags); > +int64_t offset, int64_t bytes, QEMUIOVector *qiov, size_t > qiov_offset, > +BdrvRequestFlags flags); > > /* > * Efficiently zero a region of the disk image. Typically an image > format > @@ -428,10 +429,9 @@ struct BlockDriver { >Error **errp); > > int coroutine_fn (*bdrv_co_pwritev_compressed)(BlockDriverState *bs, > -uint64_t offset, uint64_t bytes, QEMUIOVector *qiov); > +int64_t offset, int64_t bytes, QEMUIOVector *qiov); > int coroutine_fn (*bdrv_co_pwritev_compressed_part)(BlockDriverState *bs, > -uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, > -size_t qiov_offset); > +int64_t offset, int64_t bytes, QEMUIOVector *qiov, size_t > qiov_offset); Someday it may be nice to convert remaining drivers of the old callbacks to use the newer ones instead, for fewer callbacks to manage here. But not the problem of this patch. > +++ b/block/backup-top.c > @@ -97,9 +97,9 @@ static int coroutine_fn > backup_top_co_pwrite_zeroes(BlockDriverState *bs, > } > > static coroutine_fn int backup_top_co_pwritev(BlockDriverState *bs, > - uint64_t offset, > - uint64_t bytes, > - QEMUIOVector *qiov, int flags) > + int64_t offset, int64_t bytes, > + QEMUIOVector *qiov, > + BdrvRequestFlags flags) > { > int ret = backup_top_cbw(bs, offset, bytes, flags); We should clean up the signature of backup_top_cbw() in a followup, now that pdiscard, pwrite_zeroes, pwritev all pass int64_t. > +++ b/block/blkdebug.c > @@ -635,8 +635,8 @@ blkdebug_co_preadv(BlockDriverState *bs, int64_t offset, > int64_t bytes, > } > > static int coroutine_fn > -blkdebug_co_pwritev(BlockDriverState *bs, uint64_t offset, uint64_t bytes, > -QEMUIOVector *qiov, int
Re: [PATCH v3 17/33] nbd/client-connection: implement connection retry
On Fri, Apr 16, 2021 at 11:08:55AM +0300, Vladimir Sementsov-Ogievskiy wrote: > Add an option for thread to retry connection until success. We'll use > nbd/client-connection both for reconnect and for initial connection in > nbd_open(), so we need a possibility to use same NBDClientConnection > instance to connect once in nbd_open() and then use retry semantics for > reconnect. > > Signed-off-by: Vladimir Sementsov-Ogievskiy > --- > include/block/nbd.h | 2 ++ > nbd/client-connection.c | 55 + > 2 files changed, 41 insertions(+), 16 deletions(-) > > diff --git a/include/block/nbd.h b/include/block/nbd.h > index 5d86e6a393..5bb54d831c 100644 > --- a/include/block/nbd.h > +++ b/include/block/nbd.h > @@ -409,6 +409,8 @@ const char *nbd_err_lookup(int err); > /* nbd/client-connection.c */ > typedef struct NBDClientConnection NBDClientConnection; > > +void nbd_client_connection_enable_retry(NBDClientConnection *conn); > + > NBDClientConnection *nbd_client_connection_new(const SocketAddress *saddr, > bool do_negotiation, > const char *export_name, > diff --git a/nbd/client-connection.c b/nbd/client-connection.c > index ae4a77f826..002bd91f42 100644 > --- a/nbd/client-connection.c > +++ b/nbd/client-connection.c > @@ -36,6 +36,8 @@ struct NBDClientConnection { > NBDExportInfo initial_info; > bool do_negotiation; > > +bool do_retry; > + > /* > * Result of last attempt. Valid in FAIL and SUCCESS states. > * If you want to steal error, don't forget to set pointer to NULL. > @@ -52,6 +54,15 @@ struct NBDClientConnection { > Coroutine *wait_co; /* nbd_co_establish_connection() wait in yield() */ > }; > > +/* > + * The function isn't protected by any mutex, so call it when thread is not > + * running. > + */ > +void nbd_client_connection_enable_retry(NBDClientConnection *conn) > +{ > +conn->do_retry = true; > +} > + > NBDClientConnection *nbd_client_connection_new(const SocketAddress *saddr, > bool do_negotiation, > const char *export_name, > @@ -144,24 +155,37 @@ static void *connect_thread_func(void *opaque) > NBDClientConnection *conn = opaque; > bool do_free; > int ret; > +uint64_t timeout = 1; > +uint64_t max_timeout = 16; > + > +while (true) { > +conn->sioc = qio_channel_socket_new(); > + > +error_free(conn->err); > +conn->err = NULL; > +conn->updated_info = conn->initial_info; > + > +ret = nbd_connect(conn->sioc, conn->saddr, > + conn->do_negotiation ? >updated_info : NULL, > + conn->tlscreds, >ioc, >err); > +conn->updated_info.x_dirty_bitmap = NULL; > +conn->updated_info.name = NULL; > + > +if (ret < 0) { > +object_unref(OBJECT(conn->sioc)); > +conn->sioc = NULL; > +if (conn->do_retry) { > +sleep(timeout); > +if (timeout < max_timeout) { > +timeout *= 2; > +} > +continue; > +} > +} How is it supposed to get canceled? Roman. > -conn->sioc = qio_channel_socket_new(); > - > -error_free(conn->err); > -conn->err = NULL; > -conn->updated_info = conn->initial_info; > - > -ret = nbd_connect(conn->sioc, conn->saddr, > - conn->do_negotiation ? >updated_info : NULL, > - conn->tlscreds, >ioc, >err); > -if (ret < 0) { > -object_unref(OBJECT(conn->sioc)); > -conn->sioc = NULL; > +break; > } > > -conn->updated_info.x_dirty_bitmap = NULL; > -conn->updated_info.name = NULL; > - > WITH_QEMU_LOCK_GUARD(>mutex) { > assert(conn->running); > conn->running = false; > @@ -172,7 +196,6 @@ static void *connect_thread_func(void *opaque) > do_free = conn->detached; > } > > - > if (do_free) { > nbd_client_connection_do_free(conn); > } > -- > 2.29.2 >
Re: [PATCH v4 03/11] block: use int64_t instead of uint64_t in driver read handlers
On 3/24/21 3:51 PM, Vladimir Sementsov-Ogievskiy wrote: > We are going to convert .bdrv_co_preadv_part and .bdrv_co_pwritev_part > to int64_t type for offset and bytes parameters (as it's already done > for generic block/io.c layer). > > In qcow2 .bdrv_co_preadv_part is used in some places, so let's add > corresponding checks and assertions. > > Signed-off-by: Vladimir Sementsov-Ogievskiy Unusual line, especially since... > > block: use int64_t instead of uint64_t in driver read handlers > > We are generally moving to int64_t for both offset and bytes parameters > on all io paths. > > Main motivation is realization of 64-bit write_zeroes operation for > fast zeroing large disk chunks, up to the whole disk. > > We chose signed type, to be consistent with off_t (which is signed) and > with possibility for signed return type (where negative value means > error). > > So, convert driver read handlers parameters which are already 64bit to > signed type. > > While being here, convert also flags parameter to be BdrvRequestFlags. > > Now let's consider all callers. Simple > > git grep '\->bdrv_\(aio\|co\)_preadv\(_part\)\?' > > shows that's there three callers of driver function: > > bdrv_driver_preadv() in block/io.c, passes int64_t, checked by >bdrv_check_qiov_request() to be non-negative. > > qcow2_load_vmstate() does bdrv_check_qiov_request(). > > do_perform_cow_read() has uint64_t argument. And a lot of things in > qcow2 driver are uint64_t, so converting it is big job. But we must > not work with requests that doesn't satisfy bdrv_check_qiov_request(), s/doesn't/don't/ > so let's just assert it here. > > Still, the functions may be called directly, not only by drv->... > Let's check: > > git grep '\.bdrv_\(aio\|co\)_preadv\(_part\)\?\s*=' | \ > awk '{print $4}' | sed 's/,//' | sed 's/&//' | sort | uniq | \ > while read func; do git grep "$func(" | \ > grep -v "$func(BlockDriverState"; done > > The only one such caller: > > QEMUIOVector qiov = QEMU_IOVEC_INIT_BUF(qiov, , 1); > ... > ret = bdrv_replace_test_co_preadv(bs, 0, 1, , 0); > > in tesTS/unit/test-bdrv-drain.c, and it's OK obviously. > > Signed-off-by: Vladimir Sementsov-Ogievskiy ...it is repeated here. I'm fine dropping the first. > --- > include/block/block_int.h| 11 ++- > block/backup-top.c | 4 ++-- > 35 files changed, 120 insertions(+), 89 deletions(-) > > diff --git a/include/block/block_int.h b/include/block/block_int.h > index db7a909ea9..e6bcf74e46 100644 > --- a/include/block/block_int.h > +++ b/include/block/block_int.h > @@ -234,8 +234,8 @@ struct BlockDriver { > > /* aio */ > BlockAIOCB *(*bdrv_aio_preadv)(BlockDriverState *bs, > -uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags, > -BlockCompletionFunc *cb, void *opaque); > +int64_t offset, int64_t bytes, QEMUIOVector *qiov, > +BdrvRequestFlags flags, BlockCompletionFunc *cb, void *opaque); > BlockAIOCB *(*bdrv_aio_pwritev)(BlockDriverState *bs, > uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags, > BlockCompletionFunc *cb, void *opaque); > @@ -264,10 +264,11 @@ struct BlockDriver { > * The buffer in @qiov may point directly to guest memory. > */ > int coroutine_fn (*bdrv_co_preadv)(BlockDriverState *bs, > -uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags); > +int64_t offset, int64_t bytes, QEMUIOVector *qiov, > +BdrvRequestFlags flags); > int coroutine_fn (*bdrv_co_preadv_part)(BlockDriverState *bs, > -uint64_t offset, uint64_t bytes, > -QEMUIOVector *qiov, size_t qiov_offset, int flags); > +int64_t offset, int64_t bytes, > +QEMUIOVector *qiov, size_t qiov_offset, BdrvRequestFlags flags); Lots of fallout, but I'm in favor of this signature change. > +++ b/block/blkdebug.c > @@ -614,8 +614,8 @@ static int rule_check(BlockDriverState *bs, uint64_t > offset, uint64_t bytes, > } > > static int coroutine_fn > -blkdebug_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes, > - QEMUIOVector *qiov, int flags) > +blkdebug_co_preadv(BlockDriverState *bs, int64_t offset, int64_t bytes, > + QEMUIOVector *qiov, BdrvRequestFlags flags) > { > int err; Still calls rule_check() with uint64_t parameters, but since we've asserted the caller was within range, no behavior change. > +++ b/block/blkverify.c > @@ -221,8 +221,8 @@ blkverify_co_prwv(BlockDriverState *bs, BlkverifyRequest > *r, uint64_t offset, > } > > static int coroutine_fn > -blkverify_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes, > -QEMUIOVector *qiov, int flags) > +blkverify_co_preadv(BlockDriverState *bs, int64_t offset, int64_t bytes, > +QEMUIOVector *qiov, BdrvRequestFlags flags) > { Similarly for the call to blkverify_co_prwv(), and probably
Re: [PATCH] docs: add table of contents to QAPI references
On 5/11/21 4:25 AM, Daniel P. Berrangé wrote: > The QAPI reference docs for the guest agent, storage daemon and QMP are > all rather long and hard to navigate unless you already know the name of > the command and can do full text search for it. > > A table of contents in each doc will help people locate stuff much more > easily. > > Signed-off-by: Daniel P. Berrangé > --- This looks so much better! Reviewed-by: Connor Kuehl
making a qdev bus available from a (non-qtree?) device
Hi all, I need some help with grok'ing qdev busses. Stefan, Michael - David suggested on IRC that I CC'ed you guys since you might have solved a similar issue with virtio devices. I've tried to study how that works, but I'm not exactly sure how to apply it to the issue I'm having. Currently, to support multiple namespaces on the emulated nvme device, one can do something like this: -device nvme,id=nvme-ctrl-0,serial=foo,... -device nvme-ns,id=nvme-ns-0,bus=nvme-ctrl-0,... -device nvme-ns,id-nvme-ns-1,bus=nvme-ctrl-0,... The nvme device creates an 'nvme-bus' and the nvme-ns devices has dc->bus_type = TYPE_NVME_BUS. This all works very well and provides a nice overview in `info qtree`: bus: main-system-bus type System ... dev: q35-pcihost, id "" .. bus: pcie.0 type PCIE .. dev: nvme, id "nvme-ctrl-0" .. bus: nvme-ctrl-0 type nvme-bus dev: nvme-ns, id "nvme-ns-0" .. dev: nvme-ns, id "nvme-ns-1" .. Nice and qdevy. We have since introduced support for NVM Subsystems through an nvme-subsys device. The nvme-subsys device is just a TYPE_DEVICE and does not show in `info qtree` (I wonder if this should actually just have been an -object?). Anyway. The nvme device has a 'subsys' link parameter and we use this to manage the namespaces across the subsystem that may contain several nvme devices (controllers). The problem is that this doesnt work too well with unplugging since if the nvme device is `device_del`'ed, the nvme-ns devices on the nvme-bus are unrealized which is not what we want. We really want the namespaces to linger, preferably on an nvme-bus of the nvme-subsys device so they can be attached to other nvme devices that may show up (or already exist) in the subsystem. The core problem I'm having is that I can't seem to create an nvme-bus from the nvme-subsys device and make it available to the nvme-ns device on the command line: -device nvme-subsys,id=nvme-subsys-0,... -device nvme-ns,bus=nvme-subsys-0 The above results in 'No 'nvme-bus' bus found for device 'nvme-ns', even though I do `qbus_create_inplace()` just like the nvme device. However, I *can* reparent the nvme-ns device in its realize() method, so if I instead define it like so: -device nvme-subsys,id=nvme-subsys-0,... -device nvme,id=nvme-ctrl-0,subsys=nvme-subsys-0 -device nvme-ns,bus=nvme-ctrl-0 I can then call `qdev_set_parent_bus()` and set the parent bus to the bus creates in the nvme-subsys device. This solves the problem since the namespaces are not "garbage collected" when the nvme device is removed, but it just feels wrong you know? Also, if possible, I'd of course really like to retain the nice entries in `info qtree`. Thanks, Klaus signature.asc Description: PGP signature
Re: [PATCH v4 02/11] qcow2: check request on vmstate save/load path
On 3/24/21 3:51 PM, Vladimir Sementsov-Ogievskiy wrote: > We modify the request by adding an offset to vmstate. Let's check the > modified request. It will help us to safely move .bdrv_co_preadv_part > and .bdrv_co_pwritev_part to int64_t type of offset and bytes. > > Signed-off-by: Vladimir Sementsov-Ogievskiy > --- > include/block/block_int.h | 3 +++ > block/io.c| 6 +++--- > block/qcow2.c | 43 +-- > 3 files changed, 43 insertions(+), 9 deletions(-) > Reviewed-by: Eric Blake -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
Re: [PATCH v4 01/11] block/io: bring request check to bdrv_co_{read,write}v_vmstate
On 3/24/21 3:51 PM, Vladimir Sementsov-Ogievskiy wrote: > There are only two drivers supporting vmstate: qcow2 and sheepdog. > Sheepdog is deprecated. In qcow2 these requests go through > .bdrv_co_p{read,write}v_part handlers. > > So, let's do our basic check for the request on vmstate generic > handlers. > > Signed-off-by: Vladimir Sementsov-Ogievskiy > --- > block/io.c | 18 -- > 1 file changed, 16 insertions(+), 2 deletions(-) > Reviewed-by: Eric Blake -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
Re: [PATCH v2 10/10] qcow2-refcount: check_refblocks(): add separate message for reserved
On 05.05.2021 09:59, Vladimir Sementsov-Ogievskiy wrote: > Split checking for reserved bits out of aligned offset check. > > Signed-off-by: Vladimir Sementsov-Ogievskiy > Reviewed-by: Eric Blake Tested-by: Kirill Tkhai > --- > block/qcow2.h | 1 + > block/qcow2-refcount.c | 10 +- > 2 files changed, 10 insertions(+), 1 deletion(-) > > diff --git a/block/qcow2.h b/block/qcow2.h > index 58fd7f1678..fd48a89d45 100644 > --- a/block/qcow2.h > +++ b/block/qcow2.h > @@ -591,6 +591,7 @@ typedef enum QCow2MetadataOverlap { > #define L2E_STD_RESERVED_MASK 0x3f0001feULL > > #define REFT_OFFSET_MASK 0xfe00ULL > +#define REFT_RESERVED_MASK 0x1ffULL > > #define INV_OFFSET (-1ULL) > > diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c > index 15c4f6b075..472a7026db 100644 > --- a/block/qcow2-refcount.c > +++ b/block/qcow2-refcount.c > @@ -2091,9 +2091,17 @@ static int check_refblocks(BlockDriverState *bs, > BdrvCheckResult *res, > > for(i = 0; i < s->refcount_table_size; i++) { > uint64_t offset, cluster; > -offset = s->refcount_table[i]; > +offset = s->refcount_table[i] & REFT_OFFSET_MASK; > cluster = offset >> s->cluster_bits; > > +if (s->refcount_table[i] & REFT_RESERVED_MASK) { > +fprintf(stderr, "ERROR refcount table entry %" PRId64 " has " > +"reserved bits set\n", i); > +res->corruptions++; > +*rebuild = true; > +continue; > +} > + > /* Refcount blocks are cluster aligned */ > if (offset_into_cluster(s, offset)) { > fprintf(stderr, "ERROR refcount block %" PRId64 " is not " >
Re: [PATCH] hmp: Fix loadvm to resume the VM on success instead of failure
On Tue, May 11, 2021 at 06:58:01PM +0200, Kevin Wolf wrote: > Am 11.05.2021 um 18:49 hat Daniel P. Berrangé geschrieben: > > On Tue, May 11, 2021 at 06:31:51PM +0200, Kevin Wolf wrote: > > > Commit f61fe11aa6f broke hmp_loadvm() by adding an incorrect negation > > > when converting from 0/-errno return values to a bool value. The result > > > is that loadvm resumes the VM now if it failed and keeps it stopped if > > > it failed. Fix it to restore the old behaviour and do it the other way > > > around. > > > > > > Fixes: f61fe11aa6f7f8f0ffe4ddaa56a8108f3ab57854 > > > Cc: qemu-sta...@nongnu.org > > > Reported-by: Yanhui Ma > > > Signed-off-by: Kevin Wolf > > > --- > > > monitor/hmp-cmds.c | 2 +- > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c > > > index 0ad5b77477..cc15d9b6ee 100644 > > > --- a/monitor/hmp-cmds.c > > > +++ b/monitor/hmp-cmds.c > > > @@ -1133,7 +1133,7 @@ void hmp_loadvm(Monitor *mon, const QDict *qdict) > > > > > > vm_stop(RUN_STATE_RESTORE_VM); > > > > > > -if (!load_snapshot(name, NULL, false, NULL, ) && > > > saved_vm_running) { > > > +if (load_snapshot(name, NULL, false, NULL, ) && > > > saved_vm_running) { > > > vm_start(); > > > } > > > hmp_handle_error(mon, err); > > > > Paolo had sent a different fix here: > > > > https://lists.gnu.org/archive/html/qemu-devel/2021-05/msg01093.html > > Hm... I missed that patch, but doesn't it just generalise the buggy HMP > code instead of fixing it? That is, we still resume the VM on failure > rather than on success? Yes, if I use my iotest patch on Paolo's patch it shows that it is still broken. Regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
Re: [PATCH] hmp: Fix loadvm to resume the VM on success instead of failure
Am 11.05.2021 um 18:49 hat Daniel P. Berrangé geschrieben: > On Tue, May 11, 2021 at 06:31:51PM +0200, Kevin Wolf wrote: > > Commit f61fe11aa6f broke hmp_loadvm() by adding an incorrect negation > > when converting from 0/-errno return values to a bool value. The result > > is that loadvm resumes the VM now if it failed and keeps it stopped if > > it failed. Fix it to restore the old behaviour and do it the other way > > around. > > > > Fixes: f61fe11aa6f7f8f0ffe4ddaa56a8108f3ab57854 > > Cc: qemu-sta...@nongnu.org > > Reported-by: Yanhui Ma > > Signed-off-by: Kevin Wolf > > --- > > monitor/hmp-cmds.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c > > index 0ad5b77477..cc15d9b6ee 100644 > > --- a/monitor/hmp-cmds.c > > +++ b/monitor/hmp-cmds.c > > @@ -1133,7 +1133,7 @@ void hmp_loadvm(Monitor *mon, const QDict *qdict) > > > > vm_stop(RUN_STATE_RESTORE_VM); > > > > -if (!load_snapshot(name, NULL, false, NULL, ) && saved_vm_running) > > { > > +if (load_snapshot(name, NULL, false, NULL, ) && saved_vm_running) { > > vm_start(); > > } > > hmp_handle_error(mon, err); > > Paolo had sent a different fix here: > > https://lists.gnu.org/archive/html/qemu-devel/2021-05/msg01093.html Hm... I missed that patch, but doesn't it just generalise the buggy HMP code instead of fixing it? That is, we still resume the VM on failure rather than on success? > As with my feedback there, I think we should be adding test coverage > when fixing this. How about this: > [...] Yes, this looks good. Kevin
Re: [PATCH v2 09/10] qcow2-refcount: check_refcounts_l1(): check reserved bits
On 05.05.2021 09:59, Vladimir Sementsov-Ogievskiy wrote: > Signed-off-by: Vladimir Sementsov-Ogievskiy > Reviewed-by: Eric Blake > --- > block/qcow2.h | 1 + > block/qcow2-refcount.c | 6 ++ > 2 files changed, 7 insertions(+) > > diff --git a/block/qcow2.h b/block/qcow2.h > index b8b1093b61..58fd7f1678 100644 > --- a/block/qcow2.h > +++ b/block/qcow2.h > @@ -586,6 +586,7 @@ typedef enum QCow2MetadataOverlap { > (QCOW2_OL_CACHED | QCOW2_OL_INACTIVE_L2) > > #define L1E_OFFSET_MASK 0x00fffe00ULL > +#define L1E_RESERVED_MASK 0x7f0001ffULL > #define L2E_OFFSET_MASK 0x00fffe00ULL > #define L2E_STD_RESERVED_MASK 0x3f0001feULL > > diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c > index 69294a94fe..15c4f6b075 100644 > --- a/block/qcow2-refcount.c > +++ b/block/qcow2-refcount.c > @@ -1904,6 +1904,12 @@ static int check_refcounts_l1(BlockDriverState *bs, > continue; > } > > +if (l1_table[i] & L1E_RESERVED_MASK) { > +fprintf(stderr, "ERROR found L1 entry with reserved bits set: " > +"%" PRIx64, l1_table[i]); '\n' is missed. The rest is OK for me. Tested-by: Kirill Tkhai > +res->corruptions++; > +} > + > l2_offset = l1_table[i] & L1E_OFFSET_MASK; > > /* Mark L2 table as used */ >
Re: [PATCH v2 07/10] qcow2-refcount: check_refcounts_l2(): check reserved bits
On 05.05.2021 09:59, Vladimir Sementsov-Ogievskiy wrote: > Signed-off-by: Vladimir Sementsov-Ogievskiy > Reviewed-by: Eric Blake > --- > block/qcow2.h | 1 + > block/qcow2-refcount.c | 12 +++- > 2 files changed, 12 insertions(+), 1 deletion(-) > > diff --git a/block/qcow2.h b/block/qcow2.h > index c0e1e83796..b8b1093b61 100644 > --- a/block/qcow2.h > +++ b/block/qcow2.h > @@ -587,6 +587,7 @@ typedef enum QCow2MetadataOverlap { > > #define L1E_OFFSET_MASK 0x00fffe00ULL > #define L2E_OFFSET_MASK 0x00fffe00ULL > +#define L2E_STD_RESERVED_MASK 0x3f0001feULL > > #define REFT_OFFSET_MASK 0xfe00ULL > > diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c > index 062ec48a15..47cc82449b 100644 > --- a/block/qcow2-refcount.c > +++ b/block/qcow2-refcount.c > @@ -1682,8 +1682,18 @@ static int check_refcounts_l2(BlockDriverState *bs, > BdrvCheckResult *res, > int csize; > l2_entry = get_l2_entry(s, l2_table, i); > uint64_t l2_bitmap = get_l2_bitmap(s, l2_table, i); > +QCow2ClusterType type = qcow2_get_cluster_type(bs, l2_entry); > > -switch (qcow2_get_cluster_type(bs, l2_entry)) { > +if (type != QCOW2_CLUSTER_COMPRESSED) { > +/* Check reserved bits of Standard Cluster Descriptor */ > +if (l2_entry & L2E_STD_RESERVED_MASK) { > +fprintf(stderr, "ERROR found l2 entry with reserved bits > set: " > +"%" PRIx64, l2_entry); '\n' is missed. The rest is OK for me. Tested-by: Kirill Tkhai > +res->corruptions++; > +} > +} > + > +switch (type) { > case QCOW2_CLUSTER_COMPRESSED: > /* Compressed clusters don't have QCOW_OFLAG_COPIED */ > if (l2_entry & QCOW_OFLAG_COPIED) { >
Re: [PATCH 0/3] hw/virtio: Constify VirtIOFeature
On 5/11/21 5:41 AM, Philippe Mathieu-Daudé wrote: Philippe Mathieu-Daudé (3): hw/virtio: Pass virtio_feature_get_config_size() a const argument virtio-blk: Constify VirtIOFeature feature_sizes[] virtio-net: Constify VirtIOFeature feature_sizes[] Reviewed-by: Richard Henderson r~
Re: [PATCH v2 06/10] qcow2-refcount: check_refcounts_l2(): check l2_bitmap
On 05.05.2021 09:59, Vladimir Sementsov-Ogievskiy wrote: > Check subcluster bitmap of the l2 entry for different types of > clusters: > > - for compressed it must be zero > - for allocated check consistency of two parts of the bitmap > - for unallocated all subclusters should be unallocated >(or zero-plain) > > For unallocated clusters we can safely fix the entry by making it > zero-plain. > > Signed-off-by: Vladimir Sementsov-Ogievskiy > Reviewed-by: Eric Blake Tested-by: Kirill Tkhai > --- > block/qcow2-refcount.c | 30 +- > 1 file changed, 29 insertions(+), 1 deletion(-) > > diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c > index f48c5e1b5d..062ec48a15 100644 > --- a/block/qcow2-refcount.c > +++ b/block/qcow2-refcount.c > @@ -1681,6 +1681,7 @@ static int check_refcounts_l2(BlockDriverState *bs, > BdrvCheckResult *res, > uint64_t coffset; > int csize; > l2_entry = get_l2_entry(s, l2_table, i); > +uint64_t l2_bitmap = get_l2_bitmap(s, l2_table, i); > > switch (qcow2_get_cluster_type(bs, l2_entry)) { > case QCOW2_CLUSTER_COMPRESSED: > @@ -1700,6 +1701,14 @@ static int check_refcounts_l2(BlockDriverState *bs, > BdrvCheckResult *res, > break; > } > > +if (l2_bitmap) { > +fprintf(stderr, "ERROR compressed cluster %d with non-zero " > +"subcluster allocation bitmap, entry=0x%" PRIx64 > "\n", > +i, l2_entry); > +res->corruptions++; > +break; > +} > + > /* Mark cluster as used */ > qcow2_parse_compressed_l2_entry(bs, l2_entry, , ); > ret = qcow2_inc_refcounts_imrt( > @@ -1727,13 +1736,19 @@ static int check_refcounts_l2(BlockDriverState *bs, > BdrvCheckResult *res, > { > uint64_t offset = l2_entry & L2E_OFFSET_MASK; > > +if ((l2_bitmap >> 32) & l2_bitmap) { > +res->corruptions++; > +fprintf(stderr, "ERROR offset=%" PRIx64 ": Allocated " > +"cluster has corrupted subcluster allocation > bitmap\n", > +offset); > +} > + > /* Correct offsets are cluster aligned */ > if (offset_into_cluster(s, offset)) { > bool contains_data; > res->corruptions++; > > if (has_subclusters(s)) { > -uint64_t l2_bitmap = get_l2_bitmap(s, l2_table, i); > contains_data = (l2_bitmap & QCOW_L2_BITMAP_ALL_ALLOC); > } else { > contains_data = !(l2_entry & QCOW_OFLAG_ZERO); > @@ -1800,6 +1815,19 @@ static int check_refcounts_l2(BlockDriverState *bs, > BdrvCheckResult *res, > > case QCOW2_CLUSTER_ZERO_PLAIN: > case QCOW2_CLUSTER_UNALLOCATED: > +if (l2_bitmap & QCOW_L2_BITMAP_ALL_ALLOC) { > +res->corruptions++; > +fprintf(stderr, "%s: Unallocated " > +"cluster has non-zero subcluster allocation map\n", > +fix & BDRV_FIX_ERRORS ? "Repairing" : "ERROR"); > +if (fix & BDRV_FIX_ERRORS) { > +ret = fix_l2_entry_by_zero(bs, res, l2_offset, l2_table, > i, > + active, _overlap); > +if (metadata_overlap) { > +return ret; > +} > +} > +} > break; > > default: >
Re: [PATCH] hmp: Fix loadvm to resume the VM on success instead of failure
On Tue, May 11, 2021 at 06:31:51PM +0200, Kevin Wolf wrote: > Commit f61fe11aa6f broke hmp_loadvm() by adding an incorrect negation > when converting from 0/-errno return values to a bool value. The result > is that loadvm resumes the VM now if it failed and keeps it stopped if > it failed. Fix it to restore the old behaviour and do it the other way > around. > > Fixes: f61fe11aa6f7f8f0ffe4ddaa56a8108f3ab57854 > Cc: qemu-sta...@nongnu.org > Reported-by: Yanhui Ma > Signed-off-by: Kevin Wolf > --- > monitor/hmp-cmds.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c > index 0ad5b77477..cc15d9b6ee 100644 > --- a/monitor/hmp-cmds.c > +++ b/monitor/hmp-cmds.c > @@ -1133,7 +1133,7 @@ void hmp_loadvm(Monitor *mon, const QDict *qdict) > > vm_stop(RUN_STATE_RESTORE_VM); > > -if (!load_snapshot(name, NULL, false, NULL, ) && saved_vm_running) { > +if (load_snapshot(name, NULL, false, NULL, ) && saved_vm_running) { > vm_start(); > } > hmp_handle_error(mon, err); Paolo had sent a different fix here: https://lists.gnu.org/archive/html/qemu-devel/2021-05/msg01093.html As with my feedback there, I think we should be adding test coverage when fixing this. How about this: diff --git a/tests/qemu-iotests/068 b/tests/qemu-iotests/068 index 54e49c8ffa..137c5d0577 100755 --- a/tests/qemu-iotests/068 +++ b/tests/qemu-iotests/068 @@ -77,7 +77,7 @@ for extra_args in \ # Give qemu some time to boot before saving the VM state { sleep 1; printf "savevm 0\nquit\n"; } | _qemu $extra_args # Now try to continue from that VM state (this should just work) -{ sleep 1; printf "loadvm 0\nloadvm 0\nquit\n"; } | _qemu $extra_args -S +{ sleep 1; printf "info status\nloadvm 0\ninfo status\ncont\ninfo status\nloadvm 0\ninfo status\nquit\n"; } | _qemu $extra_args -S done # success, all done diff --git a/tests/qemu-iotests/068.out b/tests/qemu-iotests/068.out index f07a938a38..75c0a5df5f 100644 --- a/tests/qemu-iotests/068.out +++ b/tests/qemu-iotests/068.out @@ -7,8 +7,17 @@ QEMU X.Y.Z monitor - type 'help' for more information (qemu) savevm 0 (qemu) quit QEMU X.Y.Z monitor - type 'help' for more information +(qemu) info status +VM status: paused (prelaunch) (qemu) loadvm 0 +(qemu) info status +VM status: paused (prelaunch) +(qemu) cont +(qemu) info status +VM status: running (qemu) loadvm 0 +(qemu) info status +VM status: running (qemu) quit === Saving and reloading a VM state to/from a qcow2 image (-object iothread,id=iothread0 -set device.hba0.iothread=iothread0) === @@ -18,7 +27,16 @@ QEMU X.Y.Z monitor - type 'help' for more information (qemu) savevm 0 (qemu) quit QEMU X.Y.Z monitor - type 'help' for more information +(qemu) info status +VM status: paused (prelaunch) (qemu) loadvm 0 +(qemu) info status +VM status: paused (prelaunch) +(qemu) cont +(qemu) info status +VM status: running (qemu) loadvm 0 +(qemu) info status +VM status: running (qemu) quit *** done Regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
[PATCH] hmp: Fix loadvm to resume the VM on success instead of failure
Commit f61fe11aa6f broke hmp_loadvm() by adding an incorrect negation when converting from 0/-errno return values to a bool value. The result is that loadvm resumes the VM now if it failed and keeps it stopped if it failed. Fix it to restore the old behaviour and do it the other way around. Fixes: f61fe11aa6f7f8f0ffe4ddaa56a8108f3ab57854 Cc: qemu-sta...@nongnu.org Reported-by: Yanhui Ma Signed-off-by: Kevin Wolf --- monitor/hmp-cmds.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c index 0ad5b77477..cc15d9b6ee 100644 --- a/monitor/hmp-cmds.c +++ b/monitor/hmp-cmds.c @@ -1133,7 +1133,7 @@ void hmp_loadvm(Monitor *mon, const QDict *qdict) vm_stop(RUN_STATE_RESTORE_VM); -if (!load_snapshot(name, NULL, false, NULL, ) && saved_vm_running) { +if (load_snapshot(name, NULL, false, NULL, ) && saved_vm_running) { vm_start(); } hmp_handle_error(mon, err); -- 2.30.2
[PULL 1/2] hw/block/pflash_cfi02: Set romd mode in pflash_cfi02_realize()
From: Philippe Mathieu-Daudé The ROMD mode isn't related to mapping setup. Ideally we'd set this mode when the state machine resets, but for now simply move it to pflash_cfi02_realize() to not introduce logical change. Signed-off-by: Philippe Mathieu-Daudé Reviewed-by: Richard Henderson Message-Id: <20210325120921.858993-2-f4...@amsat.org> Signed-off-by: Philippe Mathieu-Daudé --- hw/block/pflash_cfi02.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/block/pflash_cfi02.c b/hw/block/pflash_cfi02.c index 25c053693ce..35e30bb812c 100644 --- a/hw/block/pflash_cfi02.c +++ b/hw/block/pflash_cfi02.c @@ -173,7 +173,6 @@ static void pflash_setup_mappings(PFlashCFI02 *pfl) "pflash-alias", >orig_mem, 0, size); memory_region_add_subregion(>mem, i * size, >mem_mappings[i]); } -pfl->rom_mode = true; } static void pflash_reset_state_machine(PFlashCFI02 *pfl) @@ -917,6 +916,7 @@ static void pflash_cfi02_realize(DeviceState *dev, Error **errp) /* Allocate memory for a bitmap for sectors being erased. */ pfl->sector_erase_map = bitmap_new(pfl->total_sectors); +pfl->rom_mode = true; pflash_setup_mappings(pfl); sysbus_init_mmio(SYS_BUS_DEVICE(dev), >mem); -- 2.26.3
[PULL 2/2] hw/block/pflash_cfi02: Do not create aliases when not necessary
From: Philippe Mathieu-Daudé When no mapping is requested, it is pointless to create alias regions. Only create them when multiple mappings are requested to simplify the memory layout. The flatview is not changed. For example using 'qemu-system-sh4 -M r2d -S -monitor stdio', * before: (qemu) info mtree address-space: memory - (prio 0, i/o): system -00ff (prio 0, i/o): pflash -00ff (prio 0, romd): alias pflash-alias @r2d.flash -00ff 0400-043f (prio 0, i/o): r2d-fpga 0c00-0fff (prio 0, ram): r2d.sdram (qemu) info mtree -f FlatView #0 AS "memory", root: system AS "cpu-memory-0", root: system Root memory region: system -00ff (prio 0, romd): r2d.flash 0400-043f (prio 0, i/o): r2d-fpga 0c00-0fff (prio 0, ram): r2d.sdram * after: (qemu) info mtree address-space: memory - (prio 0, i/o): system -00ff (prio 0, romd): r2d.flash 0400-043f (prio 0, i/o): r2d-fpga 0c00-0fff (prio 0, ram): r2d.sdram (qemu) info mtree -f FlatView #0 AS "memory", root: system AS "cpu-memory-0", root: system Root memory region: system -00ff (prio 0, romd): r2d.flash 0400-043f (prio 0, i/o): r2d-fpga 0c00-0fff (prio 0, ram): r2d.sdram Signed-off-by: Philippe Mathieu-Daudé Reviewed-by: Richard Henderson Message-Id: <20210325120921.858993-3-f4...@amsat.org> Signed-off-by: Philippe Mathieu-Daudé --- hw/block/pflash_cfi02.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/hw/block/pflash_cfi02.c b/hw/block/pflash_cfi02.c index 35e30bb812c..02c514fb6e0 100644 --- a/hw/block/pflash_cfi02.c +++ b/hw/block/pflash_cfi02.c @@ -917,8 +917,12 @@ static void pflash_cfi02_realize(DeviceState *dev, Error **errp) pfl->sector_erase_map = bitmap_new(pfl->total_sectors); pfl->rom_mode = true; -pflash_setup_mappings(pfl); -sysbus_init_mmio(SYS_BUS_DEVICE(dev), >mem); +if (pfl->mappings > 1) { +pflash_setup_mappings(pfl); +sysbus_init_mmio(SYS_BUS_DEVICE(dev), >mem); +} else { +sysbus_init_mmio(SYS_BUS_DEVICE(dev), >orig_mem); +} timer_init_ns(>timer, QEMU_CLOCK_VIRTUAL, pflash_timer, pfl); pfl->status = 0; -- 2.26.3
[PULL 0/2] pflash patches for 2021-05-11
The following changes since commit f9a576a818044133f8564e0d243ebd97df0b3280: Merge remote-tracking branch 'remotes/dgilbert-gitlab/tags/pull-virtiofs-20210506' into staging (2021-05-11 13:03:44 +0100) are available in the Git repository at: https://github.com/philmd/qemu.git tags/pflash-20210511 for you to fetch changes up to 27545c9df24f509c6d1c1f17478281a357125554: hw/block/pflash_cfi02: Do not create aliases when not necessary (2021-05-11 18:11:02 +0200) Parallel NOR Flash patches queue - Simplify memory layout when no pflash_cfi02 mapping requested Philippe Mathieu-Daudé (2): hw/block/pflash_cfi02: Set romd mode in pflash_cfi02_realize() hw/block/pflash_cfi02: Do not create aliases when not necessary hw/block/pflash_cfi02.c | 10 +++--- 1 file changed, 7 insertions(+), 3 deletions(-) -- 2.26.3
Re: [PATCH] hw/block/nvme: re-enable NVMe PCI hotplug
On 5/11/21 6:03 PM, Klaus Jensen wrote: On May 11 16:54, Hannes Reinecke wrote: On 5/11/21 3:37 PM, Klaus Jensen wrote: On May 11 15:12, Hannes Reinecke wrote: On 5/11/21 2:22 PM, Klaus Jensen wrote: [ .. ] The hotplug fix looks good - I'll post a series that tries to integrate both. Ta. The more I think about it, the more I think we should be looking into reparenting the namespaces to the subsystem. That would have the _immediate_ benefit that 'device_del' and 'device_add' becomes symmetric (ie one doesn't have to do a separate 'device_add nvme-ns'), as the nvme namespace is not affected by the hotplug event. I have that working, but I'm struggling with a QEMU API technicality in that I apparently cannot simply move the NvmeBus creation to the nvme-subsys device. For some reason the bus is not available for the nvme-ns devices. That is, if one does something like this: -device nvme-subsys,... -device nvme-ns,... Then I get an error that "no 'nvme-bus' bus found for device 'nvme'ns". This is probably just me not grok'ing the qdev well enough, so I'll keep trying to fix that. What works now is to have the regular setup: _Normally_ the 'id' of the parent device spans a bus, so the syntax should be -device nvme-subsys,id=subsys1,... -device nvme-ns,bus=subsys1,... Yeah, I know, I just oversimplified the example. This *is* how I wanted it to work ;) As for the nvme device I would initially expose any namespace from the subsystem to the controller; the nvme spec has some concept of 'active' or 'inactive' namespaces which would allow us to blank out individual namespaces on a per-controller basis, but I fear that's not easy to model with qdev and the structure above. The nvme-ns device already supports the boolean 'detached' parameter to support the concept of an inactive namespace. Yeah, but that doesn't really work if we move the namespace to the subsystem; the 'detached' parameter is for the controller<->namespace relationship. That's why I meant we have to have some sort of NSID map for the controller such that the controller knows with NSID to access. I guess we can copy the trick with the NSID array, and reverse the operation we have now wrt subsystem; keep the main NSID array in the subsystem, and per-controller NSID arrays holding those which can be accessed. And ignore the commandline for now; figure that one out later. Cheers, Hannes -device nvme-subsys,... -device nvme,... -device nvme-ns,... And the nvme-ns device will then reparent to the NvmeBus on nvme-subsys (which magically now IS available when nvme-ns is realized). This has the same end result, but I really would like that the namespaces could be specified as children of the subsys directly. Shudder. Automatic reparenting. To my understanding from qdev that shouldn't even be possible. Please don't. It's perfectly possible with the API and used to implement stuff like failover. We are not changing the parent object, we are changing the parent bus. hw/sd does something like this (but does mention that its a bit of a hack). In this case I'd say we could argue to get away with it as well. Allowing the nvme-ns device to be a child of the controller allows the initially attached controller of non-shared namespaces to be easily expressible. But I agree, the approach is a bit wacky, which is why I havnt posted anything yet. Yep, I did try to implement multipathing for SCSI at one point, and that became patently horrible as it would run against qdev where everything is ordered within a tree. Cheers, Hannes -- Dr. Hannes ReineckeKernel Storage Architect h...@suse.de +49 911 74053 688 SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer
Re: [PATCH] hw/block/nvme: re-enable NVMe PCI hotplug
On May 11 16:54, Hannes Reinecke wrote: On 5/11/21 3:37 PM, Klaus Jensen wrote: On May 11 15:12, Hannes Reinecke wrote: On 5/11/21 2:22 PM, Klaus Jensen wrote: [ .. ] The hotplug fix looks good - I'll post a series that tries to integrate both. Ta. The more I think about it, the more I think we should be looking into reparenting the namespaces to the subsystem. That would have the _immediate_ benefit that 'device_del' and 'device_add' becomes symmetric (ie one doesn't have to do a separate 'device_add nvme-ns'), as the nvme namespace is not affected by the hotplug event. I have that working, but I'm struggling with a QEMU API technicality in that I apparently cannot simply move the NvmeBus creation to the nvme-subsys device. For some reason the bus is not available for the nvme-ns devices. That is, if one does something like this: -device nvme-subsys,... -device nvme-ns,... Then I get an error that "no 'nvme-bus' bus found for device 'nvme'ns". This is probably just me not grok'ing the qdev well enough, so I'll keep trying to fix that. What works now is to have the regular setup: _Normally_ the 'id' of the parent device spans a bus, so the syntax should be -device nvme-subsys,id=subsys1,... -device nvme-ns,bus=subsys1,... Yeah, I know, I just oversimplified the example. This *is* how I wanted it to work ;) As for the nvme device I would initially expose any namespace from the subsystem to the controller; the nvme spec has some concept of 'active' or 'inactive' namespaces which would allow us to blank out individual namespaces on a per-controller basis, but I fear that's not easy to model with qdev and the structure above. The nvme-ns device already supports the boolean 'detached' parameter to support the concept of an inactive namespace. -device nvme-subsys,... -device nvme,... -device nvme-ns,... And the nvme-ns device will then reparent to the NvmeBus on nvme-subsys (which magically now IS available when nvme-ns is realized). This has the same end result, but I really would like that the namespaces could be specified as children of the subsys directly. Shudder. Automatic reparenting. To my understanding from qdev that shouldn't even be possible. Please don't. It's perfectly possible with the API and used to implement stuff like failover. We are not changing the parent object, we are changing the parent bus. hw/sd does something like this (but does mention that its a bit of a hack). In this case I'd say we could argue to get away with it as well. Allowing the nvme-ns device to be a child of the controller allows the initially attached controller of non-shared namespaces to be easily expressible. But I agree, the approach is a bit wacky, which is why I havnt posted anything yet. signature.asc Description: PGP signature
[PATCH] block: add more commands to preconfig mode
Most block device commands do not require a fully constructed machine. Allow running them before machine initialization has concluded. Signed-off-by: Paolo Bonzini --- hmp-commands.hx| 14 + qapi/block-core.json | 117 +++-- qapi/block-export.json | 21 +--- qapi/block.json| 6 ++- 4 files changed, 110 insertions(+), 48 deletions(-) diff --git a/hmp-commands.hx b/hmp-commands.hx index 146a13c896..901a50ebd1 100644 --- a/hmp-commands.hx +++ b/hmp-commands.hx @@ -78,6 +78,7 @@ ERST .help = "resize a block image", .cmd= hmp_block_resize, .coroutine = true, +.flags = "p", }, SRST @@ -94,6 +95,7 @@ ERST .params = "device [speed [base]]", .help = "copy data from a backing file into a block device", .cmd= hmp_block_stream, +.flags = "p", }, SRST @@ -107,6 +109,7 @@ ERST .params = "device speed", .help = "set maximum speed for a background block operation", .cmd= hmp_block_job_set_speed, +.flags = "p", }, SRST @@ -122,6 +125,7 @@ ERST "\n\t\t\t if you want to abort the operation immediately" "\n\t\t\t instead of keep running until data is in sync)", .cmd= hmp_block_job_cancel, +.flags = "p", }, SRST @@ -135,6 +139,7 @@ ERST .params = "device", .help = "stop an active background block operation", .cmd= hmp_block_job_complete, +.flags = "p", }, SRST @@ -149,6 +154,7 @@ ERST .params = "device", .help = "pause an active background block operation", .cmd= hmp_block_job_pause, +.flags = "p", }, SRST @@ -162,6 +168,7 @@ ERST .params = "device", .help = "resume a paused background block operation", .cmd= hmp_block_job_resume, +.flags = "p", }, SRST @@ -1396,6 +1403,7 @@ ERST .params = "nbd_server_start [-a] [-w] host:port", .help = "serve block devices on the given host and port", .cmd= hmp_nbd_server_start, +.flags = "p", }, SRST ``nbd_server_start`` *host*:*port* @@ -1411,6 +1419,7 @@ ERST .params = "nbd_server_add [-w] device [name]", .help = "export a block device via NBD", .cmd= hmp_nbd_server_add, +.flags = "p", }, SRST ``nbd_server_add`` *device* [ *name* ] @@ -1426,6 +1435,7 @@ ERST .params = "nbd_server_remove [-f] name", .help = "remove an export previously exposed via NBD", .cmd= hmp_nbd_server_remove, +.flags = "p", }, SRST ``nbd_server_remove [-f]`` *name* @@ -1442,6 +1452,7 @@ ERST .params = "nbd_server_stop", .help = "stop serving block devices using the NBD protocol", .cmd= hmp_nbd_server_stop, +.flags = "p", }, SRST ``nbd_server_stop`` @@ -1471,6 +1482,7 @@ ERST .params = "getfd name", .help = "receive a file descriptor via SCM rights and assign it a name", .cmd= hmp_getfd, +.flags = "p", }, SRST @@ -1486,6 +1498,7 @@ ERST .params = "closefd name", .help = "close a file descriptor previously passed via SCM rights", .cmd= hmp_closefd, +.flags = "p", }, SRST @@ -1501,6 +1514,7 @@ ERST .params = "device bps bps_rd bps_wr iops iops_rd iops_wr", .help = "change I/O throttle limits for a block drive", .cmd= hmp_block_set_io_throttle, +.flags = "p", }, SRST diff --git a/qapi/block-core.json b/qapi/block-core.json index 6d227924d0..8bbc06ac86 100644 --- a/qapi/block-core.json +++ b/qapi/block-core.json @@ -680,7 +680,8 @@ #} # ## -{ 'command': 'query-block', 'returns': ['BlockInfo'] } +{ 'command': 'query-block', 'returns': ['BlockInfo'], + 'allow-preconfig': true } ## @@ -1056,7 +1057,8 @@ ## { 'command': 'query-blockstats', 'data': { '*query-nodes': 'bool' }, - 'returns': ['BlockStats'] } + 'returns': ['BlockStats'], + 'allow-preconfig': true } ## # @BlockdevOnError: @@ -1205,7 +1207,8 @@ # # Since: 1.1 ## -{ 'command': 'query-block-jobs', 'returns': ['BlockJobInfo'] } +{ 'command': 'query-block-jobs', 'returns': ['BlockJobInfo'], + 'allow-preconfig': true } ## # @block_resize: @@ -1236,7 +1239,8 @@ 'data': { '*device': 'str', '*node-name': 'str', 'size': 'int' }, - 'coroutine': true } + 'coroutine': true, + 'allow-preconfig': true } ## # @NewImageMode: @@ -1447,7 +1451,8 @@ # ## { 'command': 'blockdev-snapshot-sync', - 'data': 'BlockdevSnapshotSync'
Re: [PATCH] hw/block/nvme: re-enable NVMe PCI hotplug
On 5/11/21 3:37 PM, Klaus Jensen wrote: > On May 11 15:12, Hannes Reinecke wrote: >> On 5/11/21 2:22 PM, Klaus Jensen wrote: [ .. ] >>> The hotplug fix looks good - I'll post a series that tries to integrate >>> both. >>> >> Ta. >> >> The more I think about it, the more I think we should be looking into >> reparenting the namespaces to the subsystem. >> That would have the _immediate_ benefit that 'device_del' and >> 'device_add' becomes symmetric (ie one doesn't have to do a separate >> 'device_add nvme-ns'), as the nvme namespace is not affected by the >> hotplug event. >> > > I have that working, but I'm struggling with a QEMU API technicality in > that I apparently cannot simply move the NvmeBus creation to the > nvme-subsys device. For some reason the bus is not available for the > nvme-ns devices. That is, if one does something like this: > > -device nvme-subsys,... > -device nvme-ns,... > > Then I get an error that "no 'nvme-bus' bus found for device 'nvme'ns". > This is probably just me not grok'ing the qdev well enough, so I'll keep > trying to fix that. What works now is to have the regular setup: > _Normally_ the 'id' of the parent device spans a bus, so the syntax should be -device nvme-subsys,id=subsys1,... -device nvme-ns,bus=subsys1,... As for the nvme device I would initially expose any namespace from the subsystem to the controller; the nvme spec has some concept of 'active' or 'inactive' namespaces which would allow us to blank out individual namespaces on a per-controller basis, but I fear that's not easy to model with qdev and the structure above. > -device nvme-subsys,... > -device nvme,... > -device nvme-ns,... > > And the nvme-ns device will then reparent to the NvmeBus on nvme-subsys > (which magically now IS available when nvme-ns is realized). This has > the same end result, but I really would like that the namespaces could > be specified as children of the subsys directly. > Shudder. Automatic reparenting. To my understanding from qdev that shouldn't even be possible. Please don't. Cheers, Hannes -- Dr. Hannes Reinecke Kernel Storage Architect h...@suse.de +49 911 74053 688 SUSE Software Solutions Germany GmbH, 90409 Nürnberg GF: F. Imendörffer, HRB 36809 (AG Nürnberg)
Re: [PATCH] block: Improve backing file validation
On 2021/5/10 16:41, Daniel P. Berrangé wrote: > On Mon, May 10, 2021 at 12:30:45PM +0800, Li Zhijian wrote: >> Image below user cases: >> case 1: >> ``` >> $ qemu-img create -f raw source.raw 1G >> $ qemu-img create -f qcow2 -F raw -b source.raw ./source.raw >> qemu-img info source.raw >> image: source.raw >> file format: qcow2 >> virtual size: 193K (197120 bytes) >> disk size: 196K >> cluster_size: 65536 >> backing file: source.raw << >> backing file format: raw >> Format specific information: >> compat: 1.1 >> lazy refcounts: false >> refcount bits: 16 >> corrupt: false >> ``` >> >> case 2: >> ``` >> $ qemu-img create -f raw source.raw 1G >> $ ln -sf source.raw destination.qcow2 >> $ qemu-img create -f qcow2 -F raw -b source.raw ./destination.qcow2 >> qemu-img info source.raw >> image: source.raw >> file format: qcow2 << >> virtual size: 2.0G (2147483648 bytes) >> disk size: 196K >> cluster_size: 65536 >> backing file: source.raw >> backing file format: raw >> Format specific information: >> compat: 1.1 >> lazy refcounts: false >> refcount bits: 16 >> corrupt: false >> ``` >> Generally, we don't expect to corrupte the source.raw anyway, while >> actually it does. >> >> Here we validate the realpath of file instead the input string. > That still won't handle the case where you use hard links > >$ ln source.raw destination.qcow2 > > To properly validate the scenarios I think it is neccessary > to ignore the filename sentirely. > > Instead attempt to open both files, and if successful, fstat() > them both, and then compare the st_dev + st_ino fields. Sounds great, i will update it. Thanks Zhijian
Re: [PATCH] hw/block/nvme: re-enable NVMe PCI hotplug
On May 11 15:12, Hannes Reinecke wrote: On 5/11/21 2:22 PM, Klaus Jensen wrote: On May 11 09:35, Hannes Reinecke wrote: Ever since commit e570768566 ("hw/block/nvme: support for shared namespace in subsystem") NVMe PCI hotplug is broken, as the PCI hotplug infrastructure will only work for the nvme devices (which are PCI devices), but not for any attached namespaces. So when re-adding the NVMe PCI device via 'device_add' the NVMe controller is added, but all namespaces are missing. This patch adds device hotplug hooks for NVMe namespaces, such that one can call 'device_add nvme-ns' to (re-)attach the namespaces after the PCI NVMe device 'device_add nvme' hotplug call. Hi Hannes, Thanks for this. The real fix here is that namespaces are properly detached from other controllers that it may be shared on. But is this really the behavior we want? That nvme-ns devices always "belongs to" (in QEMU qdev terms) an nvme device is an artifact of the Bus/Device architecture and not really how an NVM subsystem should behave. Removing a controller should not cause shared namespaces to disappear from other controllers. I have a WIP that instead adds an NvmeBus to the nvme-subsys device and reparents the nvme-ns devices to that if the parent controller is linked to a sybsystem. This way, nvme-ns devices wont be unrealized under the feet of other controllers. That would be the other direction I thought of; _technically_ NVMe namespaces are objects of the subsystem, and 'controllers' are just temporary objects providing access to the namespaces presented by the subsystem. So if you are going to rework it I'd rather make the namespaces children objects of the subsystem, and have nsid maps per controller detailing which nsids are accessible from the individual controllers. That would probably a simple memcpy() to start with, but it would allow us to modify that map via NVMe-MI and stuff. However, if you do that you'll find that subsystems can't be hotplugged, too; but I'm sure you'll be able to fix it up :-) The hotplug fix looks good - I'll post a series that tries to integrate both. Ta. The more I think about it, the more I think we should be looking into reparenting the namespaces to the subsystem. That would have the _immediate_ benefit that 'device_del' and 'device_add' becomes symmetric (ie one doesn't have to do a separate 'device_add nvme-ns'), as the nvme namespace is not affected by the hotplug event. I have that working, but I'm struggling with a QEMU API technicality in that I apparently cannot simply move the NvmeBus creation to the nvme-subsys device. For some reason the bus is not available for the nvme-ns devices. That is, if one does something like this: -device nvme-subsys,... -device nvme-ns,... Then I get an error that "no 'nvme-bus' bus found for device 'nvme'ns". This is probably just me not grok'ing the qdev well enough, so I'll keep trying to fix that. What works now is to have the regular setup: -device nvme-subsys,... -device nvme,... -device nvme-ns,... And the nvme-ns device will then reparent to the NvmeBus on nvme-subsys (which magically now IS available when nvme-ns is realized). This has the same end result, but I really would like that the namespaces could be specified as children of the subsys directly. Any help from qdev experts would be appreciated! :) This really was a quick hack to demonstrate a shortcoming in the linux NVMe stack (cf 'nvme-mpath: delete disk after last connection' if you are interested in details), so I'm sure there is room for improvement. I follow linux-nvme, so I saw the patch ;) And the prime reason for sending it out was to gauge interest by qemu-devel; I have a somewhat mixed experience when sending patches to the qemu ML ... Your contribution is very much appreciated :) signature.asc Description: PGP signature
Re: [PATCH] hw/block/nvme: re-enable NVMe PCI hotplug
On 5/11/21 2:22 PM, Klaus Jensen wrote: > On May 11 09:35, Hannes Reinecke wrote: >> Ever since commit e570768566 ("hw/block/nvme: support for shared >> namespace in subsystem") NVMe PCI hotplug is broken, as the PCI >> hotplug infrastructure will only work for the nvme devices (which >> are PCI devices), but not for any attached namespaces. >> So when re-adding the NVMe PCI device via 'device_add' the NVMe >> controller is added, but all namespaces are missing. >> This patch adds device hotplug hooks for NVMe namespaces, such that one >> can call 'device_add nvme-ns' to (re-)attach the namespaces after >> the PCI NVMe device 'device_add nvme' hotplug call. >> > > Hi Hannes, > > Thanks for this. > > The real fix here is that namespaces are properly detached from other > controllers that it may be shared on. > > But is this really the behavior we want? That nvme-ns devices always > "belongs to" (in QEMU qdev terms) an nvme device is an artifact of the > Bus/Device architecture and not really how an NVM subsystem should > behave. Removing a controller should not cause shared namespaces to > disappear from other controllers. > > I have a WIP that instead adds an NvmeBus to the nvme-subsys device and > reparents the nvme-ns devices to that if the parent controller is linked > to a sybsystem. This way, nvme-ns devices wont be unrealized under the > feet of other controllers. > That would be the other direction I thought of; _technically_ NVMe namespaces are objects of the subsystem, and 'controllers' are just temporary objects providing access to the namespaces presented by the subsystem. So if you are going to rework it I'd rather make the namespaces children objects of the subsystem, and have nsid maps per controller detailing which nsids are accessible from the individual controllers. That would probably a simple memcpy() to start with, but it would allow us to modify that map via NVMe-MI and stuff. However, if you do that you'll find that subsystems can't be hotplugged, too; but I'm sure you'll be able to fix it up :-) > The hotplug fix looks good - I'll post a series that tries to integrate > both. > Ta. The more I think about it, the more I think we should be looking into reparenting the namespaces to the subsystem. That would have the _immediate_ benefit that 'device_del' and 'device_add' becomes symmetric (ie one doesn't have to do a separate 'device_add nvme-ns'), as the nvme namespace is not affected by the hotplug event. This really was a quick hack to demonstrate a shortcoming in the linux NVMe stack (cf 'nvme-mpath: delete disk after last connection' if you are interested in details), so I'm sure there is room for improvement. And the prime reason for sending it out was to gauge interest by qemu-devel; I have a somewhat mixed experience when sending patches to the qemu ML ... Cheers, Hannes -- Dr. Hannes Reinecke Kernel Storage Architect h...@suse.de +49 911 74053 688 SUSE Software Solutions Germany GmbH, 90409 Nürnberg GF: F. Imendörffer, HRB 36809 (AG Nürnberg)
Re: [PATCH v2 0/6] hw/southbridge: QOM'ify vt82c686 as VT82C686B_SOUTHBRIDGE
On Tue, 11 May 2021, Philippe Mathieu-Daudé wrote: Hi Zoltan, On 5/11/21 1:28 PM, BALATON Zoltan wrote: On Tue, 11 May 2021, Philippe Mathieu-Daudé wrote: The motivation behind this series is to remove the isa_get_irq(NULL) call to simplify the ISA generic model. Since v1: - rebased on top of remotes/dg-gitlab/tags/ppc-for-6.1-20210504 I'll try to have a look at these later but some notes: The pegasos2 changes are now in master so if this was before that maybe rebasing on master is now enough. This is what this series does, simply rebase on top of your merged patches. However I wonder if any changes to pegasos2.c is needed due to changed init of the chip model or is that only affecting 82c686b? There is no change in 'init' in this series, it is only QOM boilerplate code churn, no logical change intended. Please also note that pegasos2 is not enabled by default due to needing undistributable firmware ROM so to test it you need to enable it in default-configs/devices/ppc-softmmu.mak I remember you said you were mostly interested in the VT8231, not the VT82C686. This series only QOM'ify the latter. OK as I said I haven't looked at it in detail. What is your idea? Send the firmware off-list and explain how the OS works and how (what) to test? I've already sent you this info: https://lists.nongnu.org/archive/html/qemu-devel/2021-01/msg01553.html but I can't write a test case so if you want to automate this and make it part of QEMU tests then some help with that would be appreciated. As for the firmware, once Alexey's VOF (Virtual Open Firmware, minimial OF emulation in QEMU) is merged I plan to try to use that to make it possible to boot some guests with that so no firmware image would be needed and pegasos2 could be enabled by default. But for now a firmware image is needed as guests expect an OF environment to boot. Regards, BALATON Zoltan
Re: [PATCH 4/6] progressmeter: protect with a mutex
On 10/05/21 13:28, Vladimir Sementsov-Ogievskiy wrote: Could we instead add a c file and add the structure private? Then we'll have progress_new() and progress_free() APIs instead. This way, it would be a lot simpler to control that nobady use structure fields directly. I don't know... I prefer embedding structs to heap allocation. Paolo
Re: [PATCH v2 0/6] hw/southbridge: QOM'ify vt82c686 as VT82C686B_SOUTHBRIDGE
Hi Zoltan, On 5/11/21 1:28 PM, BALATON Zoltan wrote: > On Tue, 11 May 2021, Philippe Mathieu-Daudé wrote: >> The motivation behind this series is to remove the >> isa_get_irq(NULL) call to simplify the ISA generic model. >> >> Since v1: >> - rebased on top of remotes/dg-gitlab/tags/ppc-for-6.1-20210504 > > I'll try to have a look at these later but some notes: The pegasos2 > changes are now in master so if this was before that maybe rebasing on > master is now enough. This is what this series does, simply rebase on top of your merged patches. > However I wonder if any changes to pegasos2.c is > needed due to changed init of the chip model or is that only affecting > 82c686b? There is no change in 'init' in this series, it is only QOM boilerplate code churn, no logical change intended. > Please also note that pegasos2 is not enabled by default due to > needing undistributable firmware ROM so to test it you need to enable it > in default-configs/devices/ppc-softmmu.mak I remember you said you were mostly interested in the VT8231, not the VT82C686. This series only QOM'ify the latter. What is your idea? Send the firmware off-list and explain how the OS works and how (what) to test? Regards, Phil. >> Philippe Mathieu-Daudé (6): >> hw/isa/vt82c686: Name output IRQ as 'intr' >> hw/isa/vt82c686: Simplify removing unuseful qemu_allocate_irqs() call >> hw/isa/vt82c686: Let ISA function expose ISA IRQs >> hw/ide/via: Replace magic 2 value by ARRAY_SIZE / MAX_IDE_DEVS >> hw/ide/via: Connect IDE function output IRQs to the ISA function input >> hw/southbridge/vt82c686: Introduce VT82C686B_SOUTHBRIDGE
Re: [PATCH] hw/block/nvme: re-enable NVMe PCI hotplug
On May 11 09:35, Hannes Reinecke wrote: Ever since commit e570768566 ("hw/block/nvme: support for shared namespace in subsystem") NVMe PCI hotplug is broken, as the PCI hotplug infrastructure will only work for the nvme devices (which are PCI devices), but not for any attached namespaces. So when re-adding the NVMe PCI device via 'device_add' the NVMe controller is added, but all namespaces are missing. This patch adds device hotplug hooks for NVMe namespaces, such that one can call 'device_add nvme-ns' to (re-)attach the namespaces after the PCI NVMe device 'device_add nvme' hotplug call. Hi Hannes, Thanks for this. The real fix here is that namespaces are properly detached from other controllers that it may be shared on. But is this really the behavior we want? That nvme-ns devices always "belongs to" (in QEMU qdev terms) an nvme device is an artifact of the Bus/Device architecture and not really how an NVM subsystem should behave. Removing a controller should not cause shared namespaces to disappear from other controllers. I have a WIP that instead adds an NvmeBus to the nvme-subsys device and reparents the nvme-ns devices to that if the parent controller is linked to a sybsystem. This way, nvme-ns devices wont be unrealized under the feet of other controllers. The hotplug fix looks good - I'll post a series that tries to integrate both. Fixes: e570768566 ("hw/block/nvme: support for shared namespace in subsystem") Signed-off-by: Hannes Reinecke --- capstone | 2 +- hw/block/nvme-ns.c | 31 ++ hw/block/nvme-subsys.c | 12 + hw/block/nvme-subsys.h | 1 + hw/block/nvme.c| 60 +++--- hw/block/nvme.h| 1 + roms/SLOF | 2 +- roms/openbios | 2 +- roms/u-boot| 2 +- 9 files changed, 93 insertions(+), 20 deletions(-) diff --git a/capstone b/capstone index f8b1b83301..22ead3e0bf 16 --- a/capstone +++ b/capstone @@ -1 +1 @@ -Subproject commit f8b1b833015a4ae47110ed068e0deb7106ced66d +Subproject commit 22ead3e0bfdb87516656453336160e0a37b066bf diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c index 7bb618f182..3a7e01f10f 100644 --- a/hw/block/nvme-ns.c +++ b/hw/block/nvme-ns.c @@ -526,6 +526,36 @@ static void nvme_ns_realize(DeviceState *dev, Error **errp) nvme_attach_ns(n, ns); } +static void nvme_ns_unrealize(DeviceState *dev) +{ +NvmeNamespace *ns = NVME_NS(dev); +BusState *s = qdev_get_parent_bus(dev); +NvmeCtrl *n = NVME(s->parent); +NvmeSubsystem *subsys = n->subsys; +uint32_t nsid = ns->params.nsid; +int i; + +nvme_ns_drain(ns); +nvme_ns_shutdown(ns); +nvme_ns_cleanup(ns); + +if (subsys) { +subsys->namespaces[nsid] = NULL; + +if (ns->params.shared) { +for (i = 0; i < ARRAY_SIZE(subsys->ctrls); i++) { +NvmeCtrl *ctrl = subsys->ctrls[i]; + +if (ctrl) { +nvme_detach_ns(ctrl, ns); +} +} +return; +} +} +nvme_detach_ns(n, ns); +} + static Property nvme_ns_props[] = { DEFINE_BLOCK_PROPERTIES(NvmeNamespace, blkconf), DEFINE_PROP_BOOL("detached", NvmeNamespace, params.detached, false), @@ -563,6 +593,7 @@ static void nvme_ns_class_init(ObjectClass *oc, void *data) dc->bus_type = TYPE_NVME_BUS; dc->realize = nvme_ns_realize; +dc->unrealize = nvme_ns_unrealize; device_class_set_props(dc, nvme_ns_props); dc->desc = "Virtual NVMe namespace"; } diff --git a/hw/block/nvme-subsys.c b/hw/block/nvme-subsys.c index 9604c19117..1c00508f33 100644 --- a/hw/block/nvme-subsys.c +++ b/hw/block/nvme-subsys.c @@ -42,6 +42,18 @@ int nvme_subsys_register_ctrl(NvmeCtrl *n, Error **errp) return cntlid; } +void nvme_subsys_unregister_ctrl(NvmeCtrl *n) +{ +NvmeSubsystem *subsys = n->subsys; +int cntlid = n->cntlid; + +if (!n->subsys) +return; +assert(cntlid < ARRAY_SIZE(subsys->ctrls)); +subsys->ctrls[cntlid] = NULL; +n->cntlid = -1; +} + static void nvme_subsys_setup(NvmeSubsystem *subsys) { const char *nqn = subsys->params.nqn ? diff --git a/hw/block/nvme-subsys.h b/hw/block/nvme-subsys.h index 7d7ef5f7f1..2d8a146c4f 100644 --- a/hw/block/nvme-subsys.h +++ b/hw/block/nvme-subsys.h @@ -32,6 +32,7 @@ typedef struct NvmeSubsystem { } NvmeSubsystem; int nvme_subsys_register_ctrl(NvmeCtrl *n, Error **errp); +void nvme_subsys_unregister_ctrl(NvmeCtrl *n); static inline NvmeCtrl *nvme_subsys_ctrl(NvmeSubsystem *subsys, uint32_t cntlid) diff --git a/hw/block/nvme.c b/hw/block/nvme.c index 5fe082ec34..515678b686 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -4963,26 +4963,12 @@ static uint16_t nvme_ns_attachment(NvmeCtrl *n, NvmeRequest *req) } nvme_attach_ns(ctrl, ns); -__nvme_select_ns_iocs(ctrl, ns); } else { if (!nvme_ns(ctrl, nsid)) { return
Re: [PATCH v2 0/6] hw/southbridge: QOM'ify vt82c686 as VT82C686B_SOUTHBRIDGE
On Tue, 11 May 2021, Philippe Mathieu-Daudé wrote: The motivation behind this series is to remove the isa_get_irq(NULL) call to simplify the ISA generic model. Since v1: - rebased on top of remotes/dg-gitlab/tags/ppc-for-6.1-20210504 I'll try to have a look at these later but some notes: The pegasos2 changes are now in master so if this was before that maybe rebasing on master is now enough. However I wonder if any changes to pegasos2.c is needed due to changed init of the chip model or is that only affecting 82c686b? Please also note that pegasos2 is not enabled by default due to needing undistributable firmware ROM so to test it you need to enable it in default-configs/devices/ppc-softmmu.mak Regards, BALATON Zoltan Philippe Mathieu-Daudé (6): hw/isa/vt82c686: Name output IRQ as 'intr' hw/isa/vt82c686: Simplify removing unuseful qemu_allocate_irqs() call hw/isa/vt82c686: Let ISA function expose ISA IRQs hw/ide/via: Replace magic 2 value by ARRAY_SIZE / MAX_IDE_DEVS hw/ide/via: Connect IDE function output IRQs to the ISA function input hw/southbridge/vt82c686: Introduce VT82C686B_SOUTHBRIDGE hw/ide/via.c | 31 --- hw/isa/vt82c686.c | 27 +- hw/mips/fuloong2e.c| 35 +++- hw/southbridge/vt82c686.c | 107 + MAINTAINERS| 1 + hw/Kconfig | 1 + hw/isa/Kconfig | 9 hw/meson.build | 1 + hw/southbridge/Kconfig | 8 +++ hw/southbridge/meson.build | 1 + 10 files changed, 164 insertions(+), 57 deletions(-) create mode 100644 hw/southbridge/vt82c686.c create mode 100644 hw/southbridge/Kconfig create mode 100644 hw/southbridge/meson.build -- 2.26.3
[PATCH 0/3] hw/virtio: Constify VirtIOFeature
Trivial patches to keep VirtIOFeature arrays read-only (better safe than sorry). Philippe Mathieu-Daudé (3): hw/virtio: Pass virtio_feature_get_config_size() a const argument virtio-blk: Constify VirtIOFeature feature_sizes[] virtio-net: Constify VirtIOFeature feature_sizes[] include/hw/virtio/virtio.h | 2 +- hw/block/virtio-blk.c | 2 +- hw/net/virtio-net.c| 2 +- hw/virtio/virtio.c | 2 +- 4 files changed, 4 insertions(+), 4 deletions(-) -- 2.26.3
[PATCH 3/3] virtio-net: Constify VirtIOFeature feature_sizes[]
Signed-off-by: Philippe Mathieu-Daudé --- hw/net/virtio-net.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c index 66b9ff45118..6b7e8dd04ef 100644 --- a/hw/net/virtio-net.c +++ b/hw/net/virtio-net.c @@ -89,7 +89,7 @@ VIRTIO_NET_RSS_HASH_TYPE_TCP_EX | \ VIRTIO_NET_RSS_HASH_TYPE_UDP_EX) -static VirtIOFeature feature_sizes[] = { +static const VirtIOFeature feature_sizes[] = { {.flags = 1ULL << VIRTIO_NET_F_MAC, .end = endof(struct virtio_net_config, mac)}, {.flags = 1ULL << VIRTIO_NET_F_STATUS, -- 2.26.3
[PATCH 1/3] hw/virtio: Pass virtio_feature_get_config_size() a const argument
The VirtIOFeature structure isn't modified, mark it const. Signed-off-by: Philippe Mathieu-Daudé --- include/hw/virtio/virtio.h | 2 +- hw/virtio/virtio.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h index b7ece7a6a89..8bab9cfb750 100644 --- a/include/hw/virtio/virtio.h +++ b/include/hw/virtio/virtio.h @@ -43,7 +43,7 @@ typedef struct VirtIOFeature { size_t end; } VirtIOFeature; -size_t virtio_feature_get_config_size(VirtIOFeature *features, +size_t virtio_feature_get_config_size(const VirtIOFeature *features, uint64_t host_features); typedef struct VirtQueue VirtQueue; diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c index 9e13cb9e3ad..e02544b2df7 100644 --- a/hw/virtio/virtio.c +++ b/hw/virtio/virtio.c @@ -2981,7 +2981,7 @@ int virtio_set_features(VirtIODevice *vdev, uint64_t val) return ret; } -size_t virtio_feature_get_config_size(VirtIOFeature *feature_sizes, +size_t virtio_feature_get_config_size(const VirtIOFeature *feature_sizes, uint64_t host_features) { size_t config_size = 0; -- 2.26.3
Re: [PATCH v3 16/33] nbd/client-connection: add possibility of negotiation
On Fri, Apr 16, 2021 at 11:08:54AM +0300, Vladimir Sementsov-Ogievskiy wrote: > Add arguments and logic to support nbd negotiation in the same thread > after successful connection. > > Signed-off-by: Vladimir Sementsov-Ogievskiy > --- > include/block/nbd.h | 9 +++- > block/nbd.c | 4 +- > nbd/client-connection.c | 105 ++-- > 3 files changed, 109 insertions(+), 9 deletions(-) > > diff --git a/include/block/nbd.h b/include/block/nbd.h > index 57381be76f..5d86e6a393 100644 > --- a/include/block/nbd.h > +++ b/include/block/nbd.h > @@ -409,11 +409,16 @@ const char *nbd_err_lookup(int err); > /* nbd/client-connection.c */ > typedef struct NBDClientConnection NBDClientConnection; > > -NBDClientConnection *nbd_client_connection_new(const SocketAddress *saddr); > +NBDClientConnection *nbd_client_connection_new(const SocketAddress *saddr, > + bool do_negotiation, > + const char *export_name, > + const char *x_dirty_bitmap, > + QCryptoTLSCreds *tlscreds); > void nbd_client_connection_release(NBDClientConnection *conn); > > QIOChannelSocket *coroutine_fn > -nbd_co_establish_connection(NBDClientConnection *conn, Error **errp); > +nbd_co_establish_connection(NBDClientConnection *conn, NBDExportInfo *info, > +QIOChannel **ioc, Error **errp); > > void coroutine_fn nbd_co_establish_connection_cancel(NBDClientConnection > *conn); > > diff --git a/block/nbd.c b/block/nbd.c > index 9bd68dcf10..5e63caaf4b 100644 > --- a/block/nbd.c > +++ b/block/nbd.c > @@ -361,7 +361,7 @@ static coroutine_fn void > nbd_reconnect_attempt(BDRVNBDState *s) > s->ioc = NULL; > } > > -s->sioc = nbd_co_establish_connection(s->conn, NULL); > +s->sioc = nbd_co_establish_connection(s->conn, NULL, NULL, NULL); > if (!s->sioc) { > ret = -ECONNREFUSED; > goto out; > @@ -2033,7 +2033,7 @@ static int nbd_open(BlockDriverState *bs, QDict > *options, int flags, > goto fail; > } > > -s->conn = nbd_client_connection_new(s->saddr); > +s->conn = nbd_client_connection_new(s->saddr, false, NULL, NULL, NULL); > > /* > * establish TCP connection, return error if it fails > diff --git a/nbd/client-connection.c b/nbd/client-connection.c > index b45a0bd5f6..ae4a77f826 100644 > --- a/nbd/client-connection.c > +++ b/nbd/client-connection.c > @@ -30,14 +30,19 @@ > #include "qapi/clone-visitor.h" > > struct NBDClientConnection { > -/* Initialization constants */ > +/* Initialization constants, never change */ > SocketAddress *saddr; /* address to connect to */ > +QCryptoTLSCreds *tlscreds; > +NBDExportInfo initial_info; > +bool do_negotiation; > > /* > * Result of last attempt. Valid in FAIL and SUCCESS states. > * If you want to steal error, don't forget to set pointer to NULL. > */ > +NBDExportInfo updated_info; > QIOChannelSocket *sioc; > +QIOChannel *ioc; > Error *err; > > QemuMutex mutex; > @@ -47,12 +52,25 @@ struct NBDClientConnection { > Coroutine *wait_co; /* nbd_co_establish_connection() wait in yield() */ > }; > > -NBDClientConnection *nbd_client_connection_new(const SocketAddress *saddr) > +NBDClientConnection *nbd_client_connection_new(const SocketAddress *saddr, > + bool do_negotiation, > + const char *export_name, > + const char *x_dirty_bitmap, > + QCryptoTLSCreds *tlscreds) > { > NBDClientConnection *conn = g_new(NBDClientConnection, 1); > > +object_ref(OBJECT(tlscreds)); > *conn = (NBDClientConnection) { > .saddr = QAPI_CLONE(SocketAddress, saddr), > +.tlscreds = tlscreds, > +.do_negotiation = do_negotiation, > + > +.initial_info.request_sizes = true, > +.initial_info.structured_reply = true, > +.initial_info.base_allocation = true, > +.initial_info.x_dirty_bitmap = g_strdup(x_dirty_bitmap), > +.initial_info.name = g_strdup(export_name ?: "") > }; > > qemu_mutex_init(>mutex); > @@ -68,9 +86,59 @@ static void > nbd_client_connection_do_free(NBDClientConnection *conn) > } > error_free(conn->err); > qapi_free_SocketAddress(conn->saddr); > +object_unref(OBJECT(conn->tlscreds)); > +g_free(conn->initial_info.x_dirty_bitmap); > +g_free(conn->initial_info.name); > g_free(conn); > } > > +/* > + * Connect to @addr and do NBD negotiation if @info is not null. If @tlscreds > + * given @outioc is provided. @outioc is provided only on success. The call > may s/given/are given/ s/provided/returned/g > + *
[PATCH 2/3] virtio-blk: Constify VirtIOFeature feature_sizes[]
Signed-off-by: Philippe Mathieu-Daudé --- hw/block/virtio-blk.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c index d28979efb8d..f139cd7cc9c 100644 --- a/hw/block/virtio-blk.c +++ b/hw/block/virtio-blk.c @@ -40,7 +40,7 @@ * Starting from the discard feature, we can use this array to properly * set the config size depending on the features enabled. */ -static VirtIOFeature feature_sizes[] = { +static const VirtIOFeature feature_sizes[] = { {.flags = 1ULL << VIRTIO_BLK_F_DISCARD, .end = endof(struct virtio_blk_config, discard_sector_alignment)}, {.flags = 1ULL << VIRTIO_BLK_F_WRITE_ZEROES, -- 2.26.3
Re: [PATCH v3 0/2] virtio-blk: Convert QEMUBH callback to "bitops.h" API
On Mon, May 10, 2021 at 10:07:56PM +0200, Philippe Mathieu-Daudé wrote: > This series follow a suggestion from Stefan to use the bitops > API in virtio-blk: > https://www.mail-archive.com/qemu-devel@nongnu.org/msg805139.html > > Since v2: > - clear bitmap to avoid spurious interrupts! (Stefan) > - use 'further' in find_next docstring (Eric) > - added Richard R-b tag > > Since v1: > - improved "bitops.h" docstring > - addressed Richard's review comments > > Philippe Mathieu-Daudé (2): > bitops.h: Improve find_xxx_bit() documentation > virtio-blk: Convert QEMUBH callback to "bitops.h" API > > include/qemu/bitops.h | 15 --- > hw/block/dataplane/virtio-blk.c | 20 +--- > 2 files changed, 17 insertions(+), 18 deletions(-) > > -- > 2.26.3 > > Reviewed-by: Stefan Hajnoczi signature.asc Description: PGP signature
Re: [PULL 0/9] scripts/simplebench patches
11.05.2021 10:39, Peter Maydell wrote: On Tue, 11 May 2021 at 06:59, Vladimir Sementsov-Ogievskiy wrote: Hi! Kindly ping, or what's wrong with it? You've never sent me a pullreq before. Pull requests from new people are a pain and take more time to deal with, so I only look at them when I have the time to do that. Pull requests from established submaintainers are easy because I know they know the process and they have their gpg key set up and so on. OK, I'll wait) -- Best regards, Vladimir
[PATCH] docs: add table of contents to QAPI references
The QAPI reference docs for the guest agent, storage daemon and QMP are all rather long and hard to navigate unless you already know the name of the command and can do full text search for it. A table of contents in each doc will help people locate stuff much more easily. Signed-off-by: Daniel P. Berrangé --- docs/interop/qemu-ga-ref.rst | 3 +++ docs/interop/qemu-qmp-ref.rst| 3 +++ docs/interop/qemu-storage-daemon-qmp-ref.rst | 3 +++ 3 files changed, 9 insertions(+) diff --git a/docs/interop/qemu-ga-ref.rst b/docs/interop/qemu-ga-ref.rst index 3f1c4f908f..db1e946124 100644 --- a/docs/interop/qemu-ga-ref.rst +++ b/docs/interop/qemu-ga-ref.rst @@ -10,4 +10,7 @@ QEMU Guest Agent Protocol Reference TODO: display the QEMU version, both here and in our Sphinx manuals more generally. +.. contents:: + :depth: 3 + .. qapi-doc:: qga/qapi-schema.json diff --git a/docs/interop/qemu-qmp-ref.rst b/docs/interop/qemu-qmp-ref.rst index c8abaaf8e3..b5bebf6b9a 100644 --- a/docs/interop/qemu-qmp-ref.rst +++ b/docs/interop/qemu-qmp-ref.rst @@ -10,4 +10,7 @@ QEMU QMP Reference Manual TODO: display the QEMU version, both here and in our Sphinx manuals more generally. +.. contents:: + :depth: 3 + .. qapi-doc:: qapi/qapi-schema.json diff --git a/docs/interop/qemu-storage-daemon-qmp-ref.rst b/docs/interop/qemu-storage-daemon-qmp-ref.rst index caf9dad23a..d0ebb42ebd 100644 --- a/docs/interop/qemu-storage-daemon-qmp-ref.rst +++ b/docs/interop/qemu-storage-daemon-qmp-ref.rst @@ -10,4 +10,7 @@ QEMU Storage Daemon QMP Reference Manual TODO: display the QEMU version, both here and in our Sphinx manuals more generally. +.. contents:: + :depth: 3 + .. qapi-doc:: storage-daemon/qapi/qapi-schema.json -- 2.31.1
Re: [PATCH v2 5/5] blkdebug: protect rules and suspended_reqs with a lock
On 11/05/2021 10:37, Paolo Bonzini wrote: On 07/05/21 17:29, Eric Blake wrote: + qemu_mutex_lock(>lock); QLIST_FOREACH(r, >suspended_reqs, next) { if (!strcmp(r->tag, tag)) { + qemu_mutex_unlock(>lock); return true; } } + qemu_mutex_unlock(>lock); return false; Would code like this be easier to write by using QEMU_LOCK_GUARD from lockable.h? Yes, this one would. In other cases (rule_check) it's not so clear cut. It depends whether you prefer to have the simplest code, or rather to have homogeneous use of either guards or lock/unlock. Makes sense. I will use the lock guard and fix the "yield" typos in the other patches. Thank you, Emanuele
Re: [PATCH v2 5/5] blkdebug: protect rules and suspended_reqs with a lock
On 07/05/21 17:29, Eric Blake wrote: +qemu_mutex_lock(>lock); QLIST_FOREACH(r, >suspended_reqs, next) { if (!strcmp(r->tag, tag)) { +qemu_mutex_unlock(>lock); return true; } } +qemu_mutex_unlock(>lock); return false; Would code like this be easier to write by using QEMU_LOCK_GUARD from lockable.h? Yes, this one would. In other cases (rule_check) it's not so clear cut. It depends whether you prefer to have the simplest code, or rather to have homogeneous use of either guards or lock/unlock. Paolo
Re: [PATCH v2] block: Improve backing file validation
On Tue, May 11, 2021 at 01:55:18PM +0800, Li Zhijian wrote: > Image below user cases: > case 1: > ``` > $ qemu-img create -f raw source.raw 1G > $ qemu-img create -f qcow2 -F raw -b source.raw ./source.raw > qemu-img info source.raw > image: source.raw > file format: qcow2 > virtual size: 193K (197120 bytes) > disk size: 196K > cluster_size: 65536 > backing file: source.raw << > backing file format: raw > Format specific information: > compat: 1.1 > lazy refcounts: false > refcount bits: 16 > corrupt: false > ``` > > case 2: > ``` > $ qemu-img create -f raw source.raw 1G > $ ln -sf source.raw destination.qcow2 > $ qemu-img create -f qcow2 -F raw -b source.raw ./destination.qcow2 > qemu-img info source.raw > image: source.raw > file format: qcow2 << > virtual size: 2.0G (2147483648 bytes) > disk size: 196K > cluster_size: 65536 > backing file: source.raw > backing file format: raw > Format specific information: > compat: 1.1 > lazy refcounts: false > refcount bits: 16 > corrupt: false > ``` > Generally, we don't expect to corrupte the source.raw anyway, while > actually it does. > > Here we check their inode number instead of file name. > > Suggested-by: Daniel P. Berrangé > Signed-off-by: Li Zhijian > > --- > v2: utilize stat() instead of realpath() (Daniel) > > Signed-off-by: Li Zhijian > --- > block.c | 39 --- > 1 file changed, 32 insertions(+), 7 deletions(-) > > diff --git a/block.c b/block.c > index 9ad725d205..db4ae57959 100644 > --- a/block.c > +++ b/block.c > @@ -6431,6 +6431,37 @@ bool bdrv_op_blocker_is_empty(BlockDriverState *bs) > return true; > } > > +static bool validate_backing_file(const char *filename, > + const char *backing_file, Error **errp) > +{ > +struct stat filename_stat, backing_stat; > + > +if (backing_file[0] == '\0') { > +error_setg(errp, "Expected backing file name, got empty string"); > +goto out; > +} > + > +/* check whether filename and backing_file are refering to the same file > */ > +if (stat(backing_file, _stat) == -1) { > +error_setg(errp, "Cannot stat backing file %s", backing_file); > +goto out; > +} > +if (stat(filename, _stat) == -1) { > +/* Simply consider filename doesn't exist, no need to further check > */ > +return true; > +} > +if ((filename_stat.st_dev == backing_stat.st_dev) && > +(filename_stat.st_ino == backing_stat.st_ino)) { > +error_setg(errp, "Error: Trying to create an image with the " > + "same filename as the backing file"); > +goto out; > +} > + > +return true; > +out: > +return false; > +} > + > void bdrv_img_create(const char *filename, const char *fmt, > const char *base_filename, const char *base_fmt, > char *options, uint64_t img_size, int flags, bool quiet, > @@ -6507,13 +6538,7 @@ void bdrv_img_create(const char *filename, const char > *fmt, > > backing_file = qemu_opt_get(opts, BLOCK_OPT_BACKING_FILE); > if (backing_file) { > -if (!strcmp(filename, backing_file)) { > -error_setg(errp, "Error: Trying to create an image with the " > - "same filename as the backing file"); > -goto out; > -} > -if (backing_file[0] == '\0') { > -error_setg(errp, "Expected backing file name, got empty string"); > +if (!validate_backing_file(filename, backing_file, errp)) { > goto out; > } > } Thinking about this again, this seems to be quite high in the generic block layer code. As such I don't think we can assume that the backing file here is actually a plain file on disk. IIUC the backing file could still be any of the block drivers. Only once we get down into the protocol specific drivers can be validate the type of backend. I'm not sure what the right way to deal with that is, so perhaps Kevin or Max can make a suggestion. Regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
Re: [PATCH 6/6] aiopool: protect with a mutex
On 10/05/21 13:56, Vladimir Sementsov-Ogievskiy wrote: + } - if (task->ret < 0 && pool->status == 0) { - pool->status = task->ret; + if (ret < 0) { + qatomic_cmpxchg(>status, 0, ret); } Can we just do it inside critical section above and avoid extra cmpxchg? We'll need just qatomic_set as a pair to qatomic_read() Good idea. g_free(task); - if (pool->waiting) { - pool->waiting = false; - aio_co_wake(pool->main_co); - } + qemu_co_queue_next(>queue); this call doesn't need mutex protection? It does indeed. I second the idea of "stealing" Denis's two patches to block/aio_task and only adding the mutex (plus qatomic_read/set) here. Paolo Then we should modify comment insid AioTaskPool structure. Anyway, I think it's simpler to just have one QEMU_MUTEX_GUARD() for the whole function.
Re: [PATCH 0/3] vhost-user-blk-test: add tests for the vhost-user-blk server
On Mon, Mar 22, 2021 at 09:23:24AM +, Stefan Hajnoczi wrote: > These patches add a qtest for the vhost-user-blk server. CI found several > issues that caused these patches to be dropped from Michael Tsirkin and Kevin > Wolf's pull requests in the past. Hopefully they will go in smoothly this > time! > > Coiby Xu (1): > test: new qTest case to test the vhost-user-blk-server > > Stefan Hajnoczi (2): > tests/qtest: add multi-queue test case to vhost-user-blk-test > vhost-user-blk-test: test discard/write zeroes invalid inputs > > MAINTAINERS | 2 + > tests/qtest/libqos/vhost-user-blk.h | 48 ++ > tests/qtest/libqos/vhost-user-blk.c | 130 > tests/qtest/vhost-user-blk-test.c | 989 > tests/qtest/libqos/meson.build | 1 + > tests/qtest/meson.build | 4 + > 6 files changed, 1174 insertions(+) > create mode 100644 tests/qtest/libqos/vhost-user-blk.h > create mode 100644 tests/qtest/libqos/vhost-user-blk.c > create mode 100644 tests/qtest/vhost-user-blk-test.c Ping for QEMU 6.1 Stefan signature.asc Description: PGP signature
Re: [PATCH 5/6] co-shared-resource: protect with a mutex
On 10/05/21 13:40, Vladimir Sementsov-Ogievskiy wrote: co-shared-resource is currently not thread-safe, as also reported in co-shared-resource.h. Add a QemuMutex because co_try_get_from_shres can also be invoked from non-coroutine context. But it doesn't. It's called only from co_get_from_shres(). So, better make it a static function first. It's a sensible interface though. It lets you sleep or retry in your own way if you cannot get the resources, so (apart from the unlocked/locked confusion in the names) I like keeping it in the public API. Paolo
Re: [PATCH] block/export: improve vu_blk_sect_range_ok()
On Wed, Mar 31, 2021 at 03:27:27PM +0100, Stefan Hajnoczi wrote: > The checks in vu_blk_sect_range_ok() assume VIRTIO_BLK_SECTOR_SIZE is > equal to BDRV_SECTOR_SIZE. This is true, but let's add a > QEMU_BUILD_BUG_ON() to make it explicit. > > We might as well check that the request buffer size is a multiple of > VIRTIO_BLK_SECTOR_SIZE while we're at it. > > Suggested-by: Max Reitz > Signed-off-by: Stefan Hajnoczi > --- > block/export/vhost-user-blk-server.c | 9 - > 1 file changed, 8 insertions(+), 1 deletion(-) Ping for QEMU 6.1. Stefan > > diff --git a/block/export/vhost-user-blk-server.c > b/block/export/vhost-user-blk-server.c > index fa06996d37..1862563336 100644 > --- a/block/export/vhost-user-blk-server.c > +++ b/block/export/vhost-user-blk-server.c > @@ -70,9 +70,16 @@ static void vu_blk_req_complete(VuBlkReq *req) > static bool vu_blk_sect_range_ok(VuBlkExport *vexp, uint64_t sector, > size_t size) > { > -uint64_t nb_sectors = size >> BDRV_SECTOR_BITS; > +uint64_t nb_sectors; > uint64_t total_sectors; > > +if (size % VIRTIO_BLK_SECTOR_SIZE) { > +return false; > +} > + > +nb_sectors = size >> VIRTIO_BLK_SECTOR_BITS; > + > +QEMU_BUILD_BUG_ON(BDRV_SECTOR_SIZE != VIRTIO_BLK_SECTOR_SIZE); > if (nb_sectors > BDRV_REQUEST_MAX_SECTORS) { > return false; > } > -- > 2.30.2 > signature.asc Description: PGP signature
Re: [PATCH] hw/block/nvme: re-enable NVMe PCI hotplug
Cc'ing Klaus (maintainer) On 5/11/21 9:35 AM, Hannes Reinecke wrote: > Ever since commit e570768566 ("hw/block/nvme: support for shared > namespace in subsystem") NVMe PCI hotplug is broken, as the PCI > hotplug infrastructure will only work for the nvme devices (which > are PCI devices), but not for any attached namespaces. > So when re-adding the NVMe PCI device via 'device_add' the NVMe > controller is added, but all namespaces are missing. > This patch adds device hotplug hooks for NVMe namespaces, such that one > can call 'device_add nvme-ns' to (re-)attach the namespaces after > the PCI NVMe device 'device_add nvme' hotplug call. > > Fixes: e570768566 ("hw/block/nvme: support for shared namespace in subsystem") > Signed-off-by: Hannes Reinecke > --- > capstone | 2 +- > roms/SLOF | 2 +- > roms/openbios | 2 +- > roms/u-boot| 2 +- > 9 files changed, 93 insertions(+), 20 deletions(-) > > diff --git a/capstone b/capstone > index f8b1b83301..22ead3e0bf 16 > --- a/capstone > +++ b/capstone > @@ -1 +1 @@ > -Subproject commit f8b1b833015a4ae47110ed068e0deb7106ced66d > +Subproject commit 22ead3e0bfdb87516656453336160e0a37b066bf > index 33a7322de1..e18ddad851 16 > --- a/roms/SLOF > +++ b/roms/SLOF > @@ -1 +1 @@ > -Subproject commit 33a7322de13e9dca4b38851a345a58d37e7a441d > +Subproject commit e18ddad8516ff2cfe36ec130200318f7251aa78c > diff --git a/roms/openbios b/roms/openbios > index 4a0041107b..7f28286f5c 16 > --- a/roms/openbios > +++ b/roms/openbios > @@ -1 +1 @@ > -Subproject commit 4a0041107b8ef77e0e8337bfcb5f8078887261a7 > +Subproject commit 7f28286f5cb1ca682e3ba0a8706d8884f12bc49e > diff --git a/roms/u-boot b/roms/u-boot > index b46dd116ce..d3689267f9 16 > --- a/roms/u-boot > +++ b/roms/u-boot > @@ -1 +1 @@ > -Subproject commit b46dd116ce03e235f2a7d4843c6278e1da44b5e1 > +Subproject commit d3689267f92c5956e09cc7d1baa4700141662bff > Submodule changes unlikely related.
Re: [PATCH] hw/block/nvme: re-enable NVMe PCI hotplug
Patchew URL: https://patchew.org/QEMU/20210511073511.32511-1-h...@suse.de/ Hi, This series seems to have some coding style problems. See output below for more information: Type: series Message-id: 20210511073511.32511-1-h...@suse.de Subject: [PATCH] hw/block/nvme: re-enable NVMe PCI hotplug === TEST SCRIPT BEGIN === #!/bin/bash git rev-parse base > /dev/null || exit 0 git config --local diff.renamelimit 0 git config --local diff.renames True git config --local diff.algorithm histogram ./scripts/checkpatch.pl --mailback base.. === TEST SCRIPT END === Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384 From https://github.com/patchew-project/qemu e58c7a3..e4f3ede master -> master * [new tag] patchew/20210511073511.32511-1-h...@suse.de -> patchew/20210511073511.32511-1-h...@suse.de Switched to a new branch 'test' 1d24622 hw/block/nvme: re-enable NVMe PCI hotplug === OUTPUT BEGIN === ERROR: braces {} are necessary for all arms of this statement #101: FILE: hw/block/nvme-subsys.c:50: +if (!n->subsys) [...] total: 1 errors, 0 warnings, 182 lines checked Commit 1d24622c5d19 (hw/block/nvme: re-enable NVMe PCI hotplug) has style problems, please review. If any of these errors are false positives report them to the maintainer, see CHECKPATCH in MAINTAINERS. === OUTPUT END === Test command exited with code: 1 The full log is available at http://patchew.org/logs/20210511073511.32511-1-h...@suse.de/testing.checkpatch/?type=message. --- Email generated automatically by Patchew [https://patchew.org/]. Please send your feedback to patchew-de...@redhat.com
Re: [PULL 0/9] scripts/simplebench patches
On Tue, 11 May 2021 at 06:59, Vladimir Sementsov-Ogievskiy wrote: > > Hi! > > Kindly ping, or what's wrong with it? You've never sent me a pullreq before. Pull requests from new people are a pain and take more time to deal with, so I only look at them when I have the time to do that. Pull requests from established submaintainers are easy because I know they know the process and they have their gpg key set up and so on. -- PMM
[PATCH] hw/block/nvme: re-enable NVMe PCI hotplug
Ever since commit e570768566 ("hw/block/nvme: support for shared namespace in subsystem") NVMe PCI hotplug is broken, as the PCI hotplug infrastructure will only work for the nvme devices (which are PCI devices), but not for any attached namespaces. So when re-adding the NVMe PCI device via 'device_add' the NVMe controller is added, but all namespaces are missing. This patch adds device hotplug hooks for NVMe namespaces, such that one can call 'device_add nvme-ns' to (re-)attach the namespaces after the PCI NVMe device 'device_add nvme' hotplug call. Fixes: e570768566 ("hw/block/nvme: support for shared namespace in subsystem") Signed-off-by: Hannes Reinecke --- capstone | 2 +- hw/block/nvme-ns.c | 31 ++ hw/block/nvme-subsys.c | 12 + hw/block/nvme-subsys.h | 1 + hw/block/nvme.c| 60 +++--- hw/block/nvme.h| 1 + roms/SLOF | 2 +- roms/openbios | 2 +- roms/u-boot| 2 +- 9 files changed, 93 insertions(+), 20 deletions(-) diff --git a/capstone b/capstone index f8b1b83301..22ead3e0bf 16 --- a/capstone +++ b/capstone @@ -1 +1 @@ -Subproject commit f8b1b833015a4ae47110ed068e0deb7106ced66d +Subproject commit 22ead3e0bfdb87516656453336160e0a37b066bf diff --git a/hw/block/nvme-ns.c b/hw/block/nvme-ns.c index 7bb618f182..3a7e01f10f 100644 --- a/hw/block/nvme-ns.c +++ b/hw/block/nvme-ns.c @@ -526,6 +526,36 @@ static void nvme_ns_realize(DeviceState *dev, Error **errp) nvme_attach_ns(n, ns); } +static void nvme_ns_unrealize(DeviceState *dev) +{ +NvmeNamespace *ns = NVME_NS(dev); +BusState *s = qdev_get_parent_bus(dev); +NvmeCtrl *n = NVME(s->parent); +NvmeSubsystem *subsys = n->subsys; +uint32_t nsid = ns->params.nsid; +int i; + +nvme_ns_drain(ns); +nvme_ns_shutdown(ns); +nvme_ns_cleanup(ns); + +if (subsys) { +subsys->namespaces[nsid] = NULL; + +if (ns->params.shared) { +for (i = 0; i < ARRAY_SIZE(subsys->ctrls); i++) { +NvmeCtrl *ctrl = subsys->ctrls[i]; + +if (ctrl) { +nvme_detach_ns(ctrl, ns); +} +} +return; +} +} +nvme_detach_ns(n, ns); +} + static Property nvme_ns_props[] = { DEFINE_BLOCK_PROPERTIES(NvmeNamespace, blkconf), DEFINE_PROP_BOOL("detached", NvmeNamespace, params.detached, false), @@ -563,6 +593,7 @@ static void nvme_ns_class_init(ObjectClass *oc, void *data) dc->bus_type = TYPE_NVME_BUS; dc->realize = nvme_ns_realize; +dc->unrealize = nvme_ns_unrealize; device_class_set_props(dc, nvme_ns_props); dc->desc = "Virtual NVMe namespace"; } diff --git a/hw/block/nvme-subsys.c b/hw/block/nvme-subsys.c index 9604c19117..1c00508f33 100644 --- a/hw/block/nvme-subsys.c +++ b/hw/block/nvme-subsys.c @@ -42,6 +42,18 @@ int nvme_subsys_register_ctrl(NvmeCtrl *n, Error **errp) return cntlid; } +void nvme_subsys_unregister_ctrl(NvmeCtrl *n) +{ +NvmeSubsystem *subsys = n->subsys; +int cntlid = n->cntlid; + +if (!n->subsys) +return; +assert(cntlid < ARRAY_SIZE(subsys->ctrls)); +subsys->ctrls[cntlid] = NULL; +n->cntlid = -1; +} + static void nvme_subsys_setup(NvmeSubsystem *subsys) { const char *nqn = subsys->params.nqn ? diff --git a/hw/block/nvme-subsys.h b/hw/block/nvme-subsys.h index 7d7ef5f7f1..2d8a146c4f 100644 --- a/hw/block/nvme-subsys.h +++ b/hw/block/nvme-subsys.h @@ -32,6 +32,7 @@ typedef struct NvmeSubsystem { } NvmeSubsystem; int nvme_subsys_register_ctrl(NvmeCtrl *n, Error **errp); +void nvme_subsys_unregister_ctrl(NvmeCtrl *n); static inline NvmeCtrl *nvme_subsys_ctrl(NvmeSubsystem *subsys, uint32_t cntlid) diff --git a/hw/block/nvme.c b/hw/block/nvme.c index 5fe082ec34..515678b686 100644 --- a/hw/block/nvme.c +++ b/hw/block/nvme.c @@ -4963,26 +4963,12 @@ static uint16_t nvme_ns_attachment(NvmeCtrl *n, NvmeRequest *req) } nvme_attach_ns(ctrl, ns); -__nvme_select_ns_iocs(ctrl, ns); } else { if (!nvme_ns(ctrl, nsid)) { return NVME_NS_NOT_ATTACHED | NVME_DNR; } -ctrl->namespaces[nsid - 1] = NULL; -ns->attached--; - -nvme_update_dmrsl(ctrl); -} - -/* - * Add namespace id to the changed namespace id list for event clearing - * via Get Log Page command. - */ -if (!test_and_set_bit(nsid, ctrl->changed_nsids)) { -nvme_enqueue_event(ctrl, NVME_AER_TYPE_NOTICE, - NVME_AER_INFO_NOTICE_NS_ATTR_CHANGED, - NVME_LOG_CHANGED_NSLIST); +nvme_detach_ns(ctrl, ns); } } @@ -6166,6 +6152,34 @@ void nvme_attach_ns(NvmeCtrl *n, NvmeNamespace *ns) n->dmrsl = MIN_NON_ZERO(n->dmrsl,
[PATCH v2] block: Improve backing file validation
Image below user cases: case 1: ``` $ qemu-img create -f raw source.raw 1G $ qemu-img create -f qcow2 -F raw -b source.raw ./source.raw qemu-img info source.raw image: source.raw file format: qcow2 virtual size: 193K (197120 bytes) disk size: 196K cluster_size: 65536 backing file: source.raw << backing file format: raw Format specific information: compat: 1.1 lazy refcounts: false refcount bits: 16 corrupt: false ``` case 2: ``` $ qemu-img create -f raw source.raw 1G $ ln -sf source.raw destination.qcow2 $ qemu-img create -f qcow2 -F raw -b source.raw ./destination.qcow2 qemu-img info source.raw image: source.raw file format: qcow2 << virtual size: 2.0G (2147483648 bytes) disk size: 196K cluster_size: 65536 backing file: source.raw backing file format: raw Format specific information: compat: 1.1 lazy refcounts: false refcount bits: 16 corrupt: false ``` Generally, we don't expect to corrupte the source.raw anyway, while actually it does. Here we check their inode number instead of file name. Suggested-by: Daniel P. Berrangé Signed-off-by: Li Zhijian --- v2: utilize stat() instead of realpath() (Daniel) Signed-off-by: Li Zhijian --- block.c | 39 --- 1 file changed, 32 insertions(+), 7 deletions(-) diff --git a/block.c b/block.c index 9ad725d205..db4ae57959 100644 --- a/block.c +++ b/block.c @@ -6431,6 +6431,37 @@ bool bdrv_op_blocker_is_empty(BlockDriverState *bs) return true; } +static bool validate_backing_file(const char *filename, + const char *backing_file, Error **errp) +{ +struct stat filename_stat, backing_stat; + +if (backing_file[0] == '\0') { +error_setg(errp, "Expected backing file name, got empty string"); +goto out; +} + +/* check whether filename and backing_file are refering to the same file */ +if (stat(backing_file, _stat) == -1) { +error_setg(errp, "Cannot stat backing file %s", backing_file); +goto out; +} +if (stat(filename, _stat) == -1) { +/* Simply consider filename doesn't exist, no need to further check */ +return true; +} +if ((filename_stat.st_dev == backing_stat.st_dev) && +(filename_stat.st_ino == backing_stat.st_ino)) { +error_setg(errp, "Error: Trying to create an image with the " + "same filename as the backing file"); +goto out; +} + +return true; +out: +return false; +} + void bdrv_img_create(const char *filename, const char *fmt, const char *base_filename, const char *base_fmt, char *options, uint64_t img_size, int flags, bool quiet, @@ -6507,13 +6538,7 @@ void bdrv_img_create(const char *filename, const char *fmt, backing_file = qemu_opt_get(opts, BLOCK_OPT_BACKING_FILE); if (backing_file) { -if (!strcmp(filename, backing_file)) { -error_setg(errp, "Error: Trying to create an image with the " - "same filename as the backing file"); -goto out; -} -if (backing_file[0] == '\0') { -error_setg(errp, "Expected backing file name, got empty string"); +if (!validate_backing_file(filename, backing_file, errp)) { goto out; } } -- 2.30.2