Re: [PATCH 00/13] block: remove aio_disable_external() API
On Tue, Apr 04, 2023 at 03:43:20PM +0200, Paolo Bonzini wrote: > On 4/3/23 20:29, Stefan Hajnoczi wrote: > > The aio_disable_external() API temporarily suspends file descriptor > > monitoring > > in the event loop. The block layer uses this to prevent new I/O requests > > being > > submitted from the guest and elsewhere between bdrv_drained_begin() and > > bdrv_drained_end(). > > > > While the block layer still needs to prevent new I/O requests in drained > > sections, the aio_disable_external() API can be replaced with > > .drained_begin/end/poll() callbacks that have been added to BdrvChildClass > > and > > BlockDevOps. > > > > This newer .bdrained_begin/end/poll() approach is attractive because it > > works > > without specifying a specific AioContext. The block layer is moving towards > > multi-queue and that means multiple AioContexts may be processing I/O > > simultaneously. > > > > The aio_disable_external() was always somewhat hacky. It suspends all file > > descriptors that were registered with is_external=true, even if they have > > nothing to do with the BlockDriverState graph nodes that are being drained. > > It's better to solve a block layer problem in the block layer than to have > > an > > odd event loop API solution. > > > > That covers the motivation for this change, now on to the specifics of this > > series: > > > > While it would be nice if a single conceptual approach could be applied to > > all > > is_external=true file descriptors, I ended up looking at callers on a > > case-by-case basis. There are two general ways I migrated code away from > > is_external=true: > > > > 1. Block exports are typically best off unregistering fds in > > .drained_begin() > > and registering them again in .drained_end(). The .drained_poll() > > function > > waits for in-flight requests to finish using a reference counter. > > > > 2. Emulated storage controllers like virtio-blk and virtio-scsi are a little > > simpler. They can rely on BlockBackend's request queuing during drain > > feature. Guest I/O request coroutines are suspended in a drained > > section and > > resume upon the end of the drained section. > > Sorry, I disagree with this. > > Request queuing was shown to cause deadlocks; Hanna's latest patch is piling > another hack upon it, instead in my opinion we should go in the direction of > relying _less_ (or not at all) on request queuing. > > I am strongly convinced that request queuing must apply only after > bdrv_drained_begin has returned, which would also fix the IDE TRIM bug > reported by Fiona Ebner. The possible livelock scenario is generally not a > problem because 1) outside an iothread you have anyway the BQL that prevents > a vCPU from issuing more I/O operations during bdrv_drained_begin 2) in > iothreads you have aio_disable_external() instead of .drained_begin(). > > It is also less tidy to start a request during the drained_begin phase, > because a request that has been submitted has to be completed (cancel > doesn't really work). > > So in an ideal world, request queuing would not only apply only after > bdrv_drained_begin has returned, it would log a warning and .drained_begin() > should set up things so that there are no such warnings. That's fine, I will give .drained_begin/end/poll() a try with virtio-blk and virtio-scsi in the next revision. Stefan signature.asc Description: PGP signature
Re: [PATCH 11/13] block/fuse: take AioContext lock around blk_exp_ref/unref()
On Tue, Apr 04, 2023 at 03:46:34PM +0200, Paolo Bonzini wrote: > On 4/3/23 20:30, Stefan Hajnoczi wrote: > > These functions must be called with the AioContext acquired: > > > >/* Callers must hold exp->ctx lock */ > >void blk_exp_ref(BlockExport *exp) > >... > >/* Callers must hold exp->ctx lock */ > >void blk_exp_unref(BlockExport *exp) > > > > Signed-off-by: Stefan Hajnoczi > > --- > > block/export/fuse.c | 4 > > 1 file changed, 4 insertions(+) > > > > diff --git a/block/export/fuse.c b/block/export/fuse.c > > index 06fa41079e..18394f9e07 100644 > > --- a/block/export/fuse.c > > +++ b/block/export/fuse.c > > @@ -244,7 +244,9 @@ static void read_from_fuse_export(void *opaque) > > FuseExport *exp = opaque; > > int ret; > > +aio_context_acquire(exp->common.ctx); > > blk_exp_ref(&exp->common); > > +aio_context_release(exp->common.ctx); > > do { > > ret = fuse_session_receive_buf(exp->fuse_session, &exp->fuse_buf); > > @@ -256,7 +258,9 @@ static void read_from_fuse_export(void *opaque) > > fuse_session_process_buf(exp->fuse_session, &exp->fuse_buf); > > out: > > +aio_context_acquire(exp->common.ctx); > > blk_exp_unref(&exp->common); > > +aio_context_release(exp->common.ctx); > > } > > Since the actual thread-unsafe work is done in a bottom half, perhaps > instead you can use qatomic_inc and qatomic_fetch_dec in > blk_exp_{ref,unref}? Sure, I'll give that a try in the next revision. Stefan signature.asc Description: PGP signature
Re: [PATCH v9 0/5] Add zoned storage emulation to virtio-blk driver
On Tue, Apr 04, 2023 at 11:46:13PM +0800, Sam Li wrote: > Stefan Hajnoczi 于2023年4月3日周一 20:18写道: > > > > On Wed, 29 Mar 2023 at 01:01, Michael S. Tsirkin wrote: > > > > > > On Mon, Mar 27, 2023 at 10:45:48PM +0800, Sam Li wrote: > > > > > > virtio bits look ok. > > > > > > Reviewed-by: Michael S. Tsirkin > > > > > > merge through block layer tree I'm guessing? > > > > Sounds good. Thank you! > > Hi Stefan, > > I've sent the v8 zone append write to the list where I move the wps > field to BlockDriverState. It will make a small change the emulation > code, which is in hw/block/virtio-blk.c of [2/5] virtio-blk: add zoned > storage emulation for zoned devices: > - if (BDRV_ZT_IS_CONV(bs->bl.wps->wp[index])) { > + if (BDRV_ZT_IS_CONV(bs->wps->wp[index])) { > > Please let me know if you prefer a new version or not. Yes, please. Stefan signature.asc Description: PGP signature
Re: [PATCH 2/4] block: Split padded I/O vectors exceeding IOV_MAX
On 04.04.23 10:10, Vladimir Sementsov-Ogievskiy wrote: On 03.04.23 16:33, Hanna Czenczek wrote: (Sorry for the rather late reply... Thanks for the review!) On 20.03.23 11:31, Vladimir Sementsov-Ogievskiy wrote: On 17.03.23 20:50, Hanna Czenczek wrote: [...] diff --git a/block/io.c b/block/io.c index 8974d46941..1e9cdba17a 100644 --- a/block/io.c +++ b/block/io.c [..] + pad->write = write; + return true; } @@ -1545,6 +1561,18 @@ zero_mem: static void bdrv_padding_destroy(BdrvRequestPadding *pad) Maybe, rename to _finalize, to stress that it's not only freeing memory. Sounds good! [...] @@ -1552,6 +1580,101 @@ static void bdrv_padding_destroy(BdrvRequestPadding *pad) memset(pad, 0, sizeof(*pad)); } +/* + * Create pad->local_qiov by wrapping @iov in the padding head and tail, while + * ensuring that the resulting vector will not exceed IOV_MAX elements. + * + * To ensure this, when necessary, the first couple of elements (up to three) maybe, "first two-three elements" Sure (here and... [...] + /* + * If padded_niov > IOV_MAX, we cannot just concatenate everything. + * Instead, merge the first couple of elements of @iov to reduce the number maybe, "first two-three elements" ...here). + * of vector elements as necessary. + */ + if (padded_niov > IOV_MAX) { [..] @@ -1653,8 +1786,8 @@ int coroutine_fn bdrv_co_preadv_part(BdrvChild *child, flags |= BDRV_REQ_COPY_ON_READ; } - ret = bdrv_pad_request(bs, &qiov, &qiov_offset, &offset, &bytes, &pad, - NULL, &flags); + ret = bdrv_pad_request(bs, &qiov, &qiov_offset, &offset, &bytes, false, + &pad, NULL, &flags); if (ret < 0) { goto fail; } a bit later: tracked_request_end(&req); bdrv_padding_destroy(&pad); Now, the request is formally finished inside bdrv_padding_destroy().. Not sure, does it really violate something, but seems safer to swap these two calls. I’d rather not, for two reasons: First, tracked requests are (as far as I understand) only there to implement request serialization, and so only care about metadata (offset, length, and type), which is not changed by changes to the I/O vector. Second, even if the state of the I/O vector were relevant to tracked requests, I think it would actually be the other way around, i.e. the tracked request must be ended before the padding is finalized/destroyed. The tracked request is about the actual request we submit to `child` (which is why tracked_request_begin() is called after bdrv_pad_request()), and that request is done using the modified I/O vector. So if the tracked request had any connection to the request’s I/O vector (which it doesn’t), it would be to this modified one, so we mustn’t invalidate it via bdrv_padding_finalize() while the tracked request lives. Or, said differently: I generally try to clean up things in the inverse way they were set up, and because bdrv_pad_requests() comes before tracked_request_begin(), I think tracked_request_end() should come before bdrv_padding_finalize(). Note, that it's wise-versa in bdrv_co_pwritev_part(). Well, and it’s this way here. We agree that for clean-up, the order doesn’t functionally matter, so either way is actually fine. For me it's just simpler to think that the whole request, including filling user-given qiov with data on read part is inside tracked_request_begin() / tracked_request_end(). It isn’t, though, because padding must be done before the tracked request is created. The tracked request uses the request’s actual offset and length, after padding, so bdrv_pad_request() must always be done before (i.e., outside) tracked_request_begin(). And moving the last manipulation with qiov out of it breaks this simple thought. Guest should not care of it, as it doesn't know about request tracking.. But what about internal code? Some code may depend on some requests be finished after bdrv_drained_begin() call, but now they may be not fully finished, and some data may be not copied back to original qiov. I agree with your point about sequence of objects finalization, but maybe, that just shows that copying data back to qiov should not be a part of bdrv_padding_finalize(), but instead be a separate function, called before tracked_request_end(). But my thought is that copying back shouldn’t be done before tracked_request_end(), because copying back is not part of the tracked request. What we track is the padded request, which uses a modified I/O vector, so undoing that modification shouldn’t be done while the tracked request lives. I know I’m inconsistent with regards to bdrv_co_pwritev_part(), which is because it doesn’t matter. My actual position is that tracked requests are about metadata, so undoing/finalizing the padding (including potentially copying data back) has nothing to do with a tracked request
Re: [PATCH 01/16] qga/qapi-schema: Tidy up documentation of guest-fsfreeze-status
Reviewed-by: Konstantin Kostiuk On Tue, Apr 4, 2023 at 2:59 PM Markus Armbruster wrote: > Delete "error state indicates", because it doesn't make sense. > I suspect it was an accident. > > Signed-off-by: Markus Armbruster > --- > qga/qapi-schema.json | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/qga/qapi-schema.json b/qga/qapi-schema.json > index 796434ed34..f349345116 100644 > --- a/qga/qapi-schema.json > +++ b/qga/qapi-schema.json > @@ -420,7 +420,7 @@ > ## > # @guest-fsfreeze-status: > # > -# Get guest fsfreeze state. error state indicates > +# Get guest fsfreeze state. > # > # Returns: GuestFsfreezeStatus ("thawed", "frozen", etc., as defined > below) > # > -- > 2.39.2 > >
Re: [PATCH] aio-wait: avoid AioContext lock in aio_wait_bh_oneshot()
On 4/4/23 17:33, Stefan Hajnoczi wrote: There is no need for the AioContext lock in aio_wait_bh_oneshot(). It's easy to remove the lock from existing callers and then switch from AIO_WAIT_WHILE() to AIO_WAIT_WHILE_UNLOCKED() in aio_wait_bh_oneshot(). Document that the AioContext lock should not be held across aio_wait_bh_oneshot(). Holding a lock across aio_poll() can cause deadlock so we don't want callers to do that. This is a step towards getting rid of the AioContext lock. Reviewed-by: Paolo Bonzini Paolo
Re: [PATCH v2 05/11] qemu-options: finesse the recommendations around -blockdev
Am 04.04.2023 um 17:07 hat Michael Tokarev geschrieben: > 04.04.2023 16:57, Kevin Wolf пишет: > > Let's not make the use of -drive look more advisable than it really is. > > If you're writing a management tool/script and you're still using -drive > > today, you're doing it wrong. > > Kevin, maybe I'm wrong here, but what to do with the situation which > started it all, -- with -snapshot? > > If anything, I think there should be a bold note that -snapshot is > broken by -blockdev. Users are learning that the *hard* way, after > losing their data.. Ah, I missed this context. Maybe -snapshot should error out if -blockdev is in use. You'd generally expect that either -blockdev is used primarily and snapshots are done externally (if the command line is generated by some management tool), or that -drive is used consistently (by a human who likes the convenience). In both cases, we wouldn't hit the error path. There may be some exceptional cases where you have both -drive and -blockdev (maybe because a human users needs more control for one specific disk). This is the case where you can get a nasty surprise and that would error out. If you legitimately want the -drive images snapshotted, but not the -blockdev ones, you can still use individual '-drive snapshot=on' options instead of the global '-snapshot' (and the error message should mention this). Would you see any problems with such an approach? Kevin
Re: [PATCH v9 0/5] Add zoned storage emulation to virtio-blk driver
Stefan Hajnoczi 于2023年4月3日周一 20:18写道: > > On Wed, 29 Mar 2023 at 01:01, Michael S. Tsirkin wrote: > > > > On Mon, Mar 27, 2023 at 10:45:48PM +0800, Sam Li wrote: > > > > virtio bits look ok. > > > > Reviewed-by: Michael S. Tsirkin > > > > merge through block layer tree I'm guessing? > > Sounds good. Thank you! Hi Stefan, I've sent the v8 zone append write to the list where I move the wps field to BlockDriverState. It will make a small change the emulation code, which is in hw/block/virtio-blk.c of [2/5] virtio-blk: add zoned storage emulation for zoned devices: - if (BDRV_ZT_IS_CONV(bs->bl.wps->wp[index])) { + if (BDRV_ZT_IS_CONV(bs->wps->wp[index])) { Please let me know if you prefer a new version or not. Thanks, Sam
[PATCH] aio-wait: avoid AioContext lock in aio_wait_bh_oneshot()
There is no need for the AioContext lock in aio_wait_bh_oneshot(). It's easy to remove the lock from existing callers and then switch from AIO_WAIT_WHILE() to AIO_WAIT_WHILE_UNLOCKED() in aio_wait_bh_oneshot(). Document that the AioContext lock should not be held across aio_wait_bh_oneshot(). Holding a lock across aio_poll() can cause deadlock so we don't want callers to do that. This is a step towards getting rid of the AioContext lock. Cc: Paolo Bonzini Signed-off-by: Stefan Hajnoczi --- include/block/aio-wait.h| 2 +- hw/block/dataplane/virtio-blk.c | 3 ++- hw/scsi/virtio-scsi-dataplane.c | 2 -- util/aio-wait.c | 2 +- 4 files changed, 4 insertions(+), 5 deletions(-) diff --git a/include/block/aio-wait.h b/include/block/aio-wait.h index da13357bb8..fdd14294ec 100644 --- a/include/block/aio-wait.h +++ b/include/block/aio-wait.h @@ -131,7 +131,7 @@ void aio_wait_kick(void); * * Run a BH in @ctx and wait for it to complete. * - * Must be called from the main loop thread with @ctx acquired exactly once. + * Must be called from the main loop thread without @ctx acquired. * Note that main loop event processing may occur. */ void aio_wait_bh_oneshot(AioContext *ctx, QEMUBHFunc *cb, void *opaque); diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c index b28d81737e..e0111efd6d 100644 --- a/hw/block/dataplane/virtio-blk.c +++ b/hw/block/dataplane/virtio-blk.c @@ -314,9 +314,10 @@ void virtio_blk_data_plane_stop(VirtIODevice *vdev) s->stopping = true; trace_virtio_blk_data_plane_stop(s); -aio_context_acquire(s->ctx); aio_wait_bh_oneshot(s->ctx, virtio_blk_data_plane_stop_bh, s); +aio_context_acquire(s->ctx); + /* Wait for virtio_blk_dma_restart_bh() and in flight I/O to complete */ blk_drain(s->conf->conf.blk); diff --git a/hw/scsi/virtio-scsi-dataplane.c b/hw/scsi/virtio-scsi-dataplane.c index 20bb91766e..f3214e1c57 100644 --- a/hw/scsi/virtio-scsi-dataplane.c +++ b/hw/scsi/virtio-scsi-dataplane.c @@ -197,9 +197,7 @@ void virtio_scsi_dataplane_stop(VirtIODevice *vdev) } s->dataplane_stopping = true; -aio_context_acquire(s->ctx); aio_wait_bh_oneshot(s->ctx, virtio_scsi_dataplane_stop_bh, s); -aio_context_release(s->ctx); blk_drain_all(); /* ensure there are no in-flight requests */ diff --git a/util/aio-wait.c b/util/aio-wait.c index 98c5accd29..b5336cf5fd 100644 --- a/util/aio-wait.c +++ b/util/aio-wait.c @@ -82,5 +82,5 @@ void aio_wait_bh_oneshot(AioContext *ctx, QEMUBHFunc *cb, void *opaque) assert(qemu_get_current_aio_context() == qemu_get_aio_context()); aio_bh_schedule_oneshot(ctx, aio_wait_bh, &data); -AIO_WAIT_WHILE(ctx, !data.done); +AIO_WAIT_WHILE_UNLOCKED(NULL, !data.done); } -- 2.39.2
[PATCH v8 4/4] block: add some trace events for zone append
Signed-off-by: Sam Li Reviewed-by: Dmitry Fomichev Reviewed-by: Stefan Hajnoczi --- block/file-posix.c | 3 +++ block/trace-events | 2 ++ 2 files changed, 5 insertions(+) diff --git a/block/file-posix.c b/block/file-posix.c index a7130b1024..825301467e 100644 --- a/block/file-posix.c +++ b/block/file-posix.c @@ -2502,6 +2502,8 @@ out: if (!BDRV_ZT_IS_CONV(*wp)) { if (type & QEMU_AIO_ZONE_APPEND) { *s->offset = *wp; +trace_zbd_zone_append_complete(bs, *s->offset +>> BDRV_SECTOR_BITS); } /* Advance the wp if needed */ if (offset + bytes > *wp) { @@ -3546,6 +3548,7 @@ static int coroutine_fn raw_co_zone_append(BlockDriverState *bs, len += iov_len; } +trace_zbd_zone_append(bs, *offset >> BDRV_SECTOR_BITS); return raw_co_prw(bs, *offset, len, qiov, QEMU_AIO_ZONE_APPEND); } #endif diff --git a/block/trace-events b/block/trace-events index 3f4e1d088a..32665158d6 100644 --- a/block/trace-events +++ b/block/trace-events @@ -211,6 +211,8 @@ file_hdev_is_sg(int type, int version) "SG device found: type=%d, version=%d" file_flush_fdatasync_failed(int err) "errno %d" zbd_zone_report(void *bs, unsigned int nr_zones, int64_t sector) "bs %p report %d zones starting at sector offset 0x%" PRIx64 "" zbd_zone_mgmt(void *bs, const char *op_name, int64_t sector, int64_t len) "bs %p %s starts at sector offset 0x%" PRIx64 " over a range of 0x%" PRIx64 " sectors" +zbd_zone_append(void *bs, int64_t sector) "bs %p append at sector offset 0x%" PRIx64 "" +zbd_zone_append_complete(void *bs, int64_t sector) "bs %p returns append sector 0x%" PRIx64 "" # ssh.c sftp_error(const char *op, const char *ssh_err, int ssh_err_code, int sftp_err_code) "%s failed: %s (libssh error code: %d, sftp error code: %d)" -- 2.39.2
[PATCH v8 3/4] qemu-iotests: test zone append operation
The patch tests zone append writes by reporting the zone wp after the completion of the call. "zap -p" option can print the sector offset value after completion, which should be the start sector where the append write begins. Signed-off-by: Sam Li Reviewed-by: Stefan Hajnoczi --- qemu-io-cmds.c | 75 ++ tests/qemu-iotests/tests/zoned | 16 +++ tests/qemu-iotests/tests/zoned.out | 16 +++ 3 files changed, 107 insertions(+) diff --git a/qemu-io-cmds.c b/qemu-io-cmds.c index f35ea627d7..3f75d2f5a6 100644 --- a/qemu-io-cmds.c +++ b/qemu-io-cmds.c @@ -1874,6 +1874,80 @@ static const cmdinfo_t zone_reset_cmd = { .oneline = "reset a zone write pointer in zone block device", }; +static int do_aio_zone_append(BlockBackend *blk, QEMUIOVector *qiov, + int64_t *offset, int flags, int *total) +{ +int async_ret = NOT_DONE; + +blk_aio_zone_append(blk, offset, qiov, flags, aio_rw_done, &async_ret); +while (async_ret == NOT_DONE) { +main_loop_wait(false); +} + +*total = qiov->size; +return async_ret < 0 ? async_ret : 1; +} + +static int zone_append_f(BlockBackend *blk, int argc, char **argv) +{ +int ret; +bool pflag = false; +int flags = 0; +int total = 0; +int64_t offset; +char *buf; +int c, nr_iov; +int pattern = 0xcd; +QEMUIOVector qiov; + +if (optind > argc - 3) { +return -EINVAL; +} + +if ((c = getopt(argc, argv, "p")) != -1) { +pflag = true; +} + +offset = cvtnum(argv[optind]); +if (offset < 0) { +print_cvtnum_err(offset, argv[optind]); +return offset; +} +optind++; +nr_iov = argc - optind; +buf = create_iovec(blk, &qiov, &argv[optind], nr_iov, pattern, + flags & BDRV_REQ_REGISTERED_BUF); +if (buf == NULL) { +return -EINVAL; +} +ret = do_aio_zone_append(blk, &qiov, &offset, flags, &total); +if (ret < 0) { +printf("zone append failed: %s\n", strerror(-ret)); +goto out; +} + +if (pflag) { +printf("After zap done, the append sector is 0x%" PRIx64 "\n", + tosector(offset)); +} + +out: +qemu_io_free(blk, buf, qiov.size, + flags & BDRV_REQ_REGISTERED_BUF); +qemu_iovec_destroy(&qiov); +return ret; +} + +static const cmdinfo_t zone_append_cmd = { +.name = "zone_append", +.altname = "zap", +.cfunc = zone_append_f, +.argmin = 3, +.argmax = 4, +.args = "offset len [len..]", +.oneline = "append write a number of bytes at a specified offset", +}; + static int truncate_f(BlockBackend *blk, int argc, char **argv); static const cmdinfo_t truncate_cmd = { .name = "truncate", @@ -2672,6 +2746,7 @@ static void __attribute((constructor)) init_qemuio_commands(void) qemuio_add_command(&zone_close_cmd); qemuio_add_command(&zone_finish_cmd); qemuio_add_command(&zone_reset_cmd); +qemuio_add_command(&zone_append_cmd); qemuio_add_command(&truncate_cmd); qemuio_add_command(&length_cmd); qemuio_add_command(&info_cmd); diff --git a/tests/qemu-iotests/tests/zoned b/tests/qemu-iotests/tests/zoned index 56f60616b5..3d23ce9cc1 100755 --- a/tests/qemu-iotests/tests/zoned +++ b/tests/qemu-iotests/tests/zoned @@ -82,6 +82,22 @@ echo "(5) resetting the second zone" $QEMU_IO $IMG -c "zrs 268435456 268435456" echo "After resetting a zone:" $QEMU_IO $IMG -c "zrp 268435456 1" +echo +echo +echo "(6) append write" # the physical block size of the device is 4096 +$QEMU_IO $IMG -c "zrp 0 1" +$QEMU_IO $IMG -c "zap -p 0 0x1000 0x2000" +echo "After appending the first zone firstly:" +$QEMU_IO $IMG -c "zrp 0 1" +$QEMU_IO $IMG -c "zap -p 0 0x1000 0x2000" +echo "After appending the first zone secondly:" +$QEMU_IO $IMG -c "zrp 0 1" +$QEMU_IO $IMG -c "zap -p 268435456 0x1000 0x2000" +echo "After appending the second zone firstly:" +$QEMU_IO $IMG -c "zrp 268435456 1" +$QEMU_IO $IMG -c "zap -p 268435456 0x1000 0x2000" +echo "After appending the second zone secondly:" +$QEMU_IO $IMG -c "zrp 268435456 1" # success, all done echo "*** done" diff --git a/tests/qemu-iotests/tests/zoned.out b/tests/qemu-iotests/tests/zoned.out index b2d061da49..fe53ba4744 100644 --- a/tests/qemu-iotests/tests/zoned.out +++ b/tests/qemu-iotests/tests/zoned.out @@ -50,4 +50,20 @@ start: 0x8, len 0x8, cap 0x8, wptr 0x10, zcond:14, [type: 2] (5) resetting the second zone After resetting a zone: start: 0x8, len 0x8, cap 0x8, wptr 0x8, zcond:1, [type: 2] + + +(6) append write +start: 0x0, len 0x8, cap 0x8, wptr 0x0, zcond:1, [type: 2] +After zap done, the append sector is 0x0 +After appending the first zone firstly: +start: 0x0, len 0x8, cap 0x8, wptr 0x18, zcond:2, [type: 2] +After zap done, the append sector is 0x18 +After appending the first zone secondly: +start: 0x0, len 0x8, cap 0x8, wptr 0x30,
[PATCH v8 1/4] file-posix: add tracking of the zone write pointers
Since Linux doesn't have a user API to issue zone append operations to zoned devices from user space, the file-posix driver is modified to add zone append emulation using regular writes. To do this, the file-posix driver tracks the wp location of all zones of the device. It uses an array of uint64_t. The most significant bit of each wp location indicates if the zone type is conventional zones. The zones wp can be changed due to the following operations issued: - zone reset: change the wp to the start offset of that zone - zone finish: change to the end location of that zone - write to a zone - zone append Signed-off-by: Sam Li --- block/file-posix.c | 168 ++- include/block/block-common.h | 14 +++ include/block/block_int-common.h | 5 + 3 files changed, 184 insertions(+), 3 deletions(-) diff --git a/block/file-posix.c b/block/file-posix.c index 65efe5147e..bc58f7193b 100644 --- a/block/file-posix.c +++ b/block/file-posix.c @@ -1324,6 +1324,88 @@ static int hdev_get_max_segments(int fd, struct stat *st) #endif } +#if defined(CONFIG_BLKZONED) +/* + * If the reset_all flag is true, then the wps of zone whose state is + * not readonly or offline should be all reset to the start sector. + * Else, take the real wp of the device. + */ +static int get_zones_wp(int fd, BlockZoneWps *wps, int64_t offset, +unsigned int nrz, bool reset_all) +{ +struct blk_zone *blkz; +size_t rep_size; +uint64_t sector = offset >> BDRV_SECTOR_BITS; +int ret, n = 0, i = 0; +rep_size = sizeof(struct blk_zone_report) + nrz * sizeof(struct blk_zone); +g_autofree struct blk_zone_report *rep = NULL; + +rep = g_malloc(rep_size); +blkz = (struct blk_zone *)(rep + 1); +while (n < nrz) { +memset(rep, 0, rep_size); +rep->sector = sector; +rep->nr_zones = nrz - n; + +do { +ret = ioctl(fd, BLKREPORTZONE, rep); +} while (ret != 0 && errno == EINTR); +if (ret != 0) { +error_report("%d: ioctl BLKREPORTZONE at %" PRId64 " failed %d", +fd, offset, errno); +return -errno; +} + +if (!rep->nr_zones) { +break; +} + +for (i = 0; i < rep->nr_zones; i++, n++) { +/* + * The wp tracking cares only about sequential writes required and + * sequential write preferred zones so that the wp can advance to + * the right location. + * Use the most significant bit of the wp location to indicate the + * zone type: 0 for SWR/SWP zones and 1 for conventional zones. + */ +if (blkz[i].type == BLK_ZONE_TYPE_CONVENTIONAL) { +wps->wp[i] &= 1ULL << 63; +} else { +switch(blkz[i].cond) { +case BLK_ZONE_COND_FULL: +case BLK_ZONE_COND_READONLY: +/* Zone not writable */ +wps->wp[i] = (blkz[i].start + blkz[i].len) << BDRV_SECTOR_BITS; +break; +case BLK_ZONE_COND_OFFLINE: +/* Zone not writable nor readable */ +wps->wp[i] = (blkz[i].start) << BDRV_SECTOR_BITS; +break; +default: +if (reset_all) { +wps->wp[i] = blkz[i].start << BDRV_SECTOR_BITS; +} else { +wps->wp[i] = blkz[i].wp << BDRV_SECTOR_BITS; +} +break; +} +} +} +sector = blkz[i - 1].start + blkz[i - 1].len; +} + +return 0; +} + +static void update_zones_wp(int fd, BlockZoneWps *wps, int64_t offset, +unsigned int nrz) +{ +if (get_zones_wp(fd, wps, offset, nrz, 0) < 0) { +error_report("update zone wp failed"); +} +} +#endif + static void raw_refresh_limits(BlockDriverState *bs, Error **errp) { BDRVRawState *s = bs->opaque; @@ -1413,6 +1495,23 @@ static void raw_refresh_limits(BlockDriverState *bs, Error **errp) if (ret >= 0) { bs->bl.max_active_zones = ret; } + +ret = get_sysfs_long_val(&st, "physical_block_size"); +if (ret >= 0) { +bs->bl.write_granularity = ret; +} + +/* The refresh_limits() function can be called multiple times. */ +bs->wps = NULL; +bs->wps = g_malloc(sizeof(BlockZoneWps) + +sizeof(int64_t) * bs->bl.nr_zones); +ret = get_zones_wp(s->fd, bs->wps, 0, bs->bl.nr_zones, 0); +if (ret < 0) { +error_setg_errno(errp, -ret, "report wps failed"); +bs->wps = NULL; +return; +} +qemu_co_mutex_init(&bs->wps->colock); return; } out: @@ -2338,9 +2437,15 @@ static int coroutine_fn raw_co_prw(BlockDriverState *bs, uint64_t offset, {
[PATCH v8 0/4] Add zone append write for zoned device
This patch series add zone append operation based on the previous zoned device support part. The file-posix driver is modified to add zone append emulation using regular writes. v8: - address review comments [Stefan] * fix zone_mgmt covering multiple zones case * fix memory leak bug of wps in refresh_limits() * mv BlockZoneWps field from BlockLimits to BlockDriverState * add check_qiov_request() to bdrv_co_zone_append v7: - address review comments * fix wp assignment [Stefan] * fix reset_all cases, skip R/O & offline zones [Dmitry, Damien] * fix locking on non-zap related cases [Stefan] * cleanups and typos correction - add "zap -p" option to qemuio-cmds [Stefan] v6: - add small fixes v5: - fix locking conditions and error handling - drop some trival optimizations - add tracing points for zone append v4: - fix lock related issues[Damien] - drop all field in zone_mgmt op [Damien] - fix state checks in zong_mgmt command [Damien] - return start sector of wp when issuing zap req [Damien] v3: - only read wps when it is locked [Damien] - allow last smaller zone case [Damien] - add zone type and state checks in zone_mgmt command [Damien] - fix RESET_ALL related problems v2: - split patch to two patches for better reviewing - change BlockZoneWps's structure to an array of integers - use only mutex lock on locking conditions of zone wps - coding styles and clean-ups v1: - introduce zone append write Sam Li (4): file-posix: add tracking of the zone write pointers block: introduce zone append write for zoned devices qemu-iotests: test zone append operation block: add some trace events for zone append block/block-backend.c | 60 block/file-posix.c | 221 - block/io.c | 27 block/io_uring.c | 4 + block/linux-aio.c | 3 + block/raw-format.c | 8 ++ block/trace-events | 2 + include/block/block-common.h | 14 ++ include/block/block-io.h | 4 + include/block/block_int-common.h | 8 ++ include/block/raw-aio.h| 4 +- include/sysemu/block-backend-io.h | 9 ++ qemu-io-cmds.c | 75 ++ tests/qemu-iotests/tests/zoned | 16 +++ tests/qemu-iotests/tests/zoned.out | 16 +++ 15 files changed, 464 insertions(+), 7 deletions(-) -- 2.39.2
[PATCH v8 2/4] block: introduce zone append write for zoned devices
A zone append command is a write operation that specifies the first logical block of a zone as the write position. When writing to a zoned block device using zone append, the byte offset of the call may point at any position within the zone to which the data is being appended. Upon completion the device will respond with the position where the data has been written in the zone. Signed-off-by: Sam Li Reviewed-by: Dmitry Fomichev --- block/block-backend.c | 60 +++ block/file-posix.c| 56 + block/io.c| 27 ++ block/io_uring.c | 4 +++ block/linux-aio.c | 3 ++ block/raw-format.c| 8 + include/block/block-io.h | 4 +++ include/block/block_int-common.h | 3 ++ include/block/raw-aio.h | 4 ++- include/sysemu/block-backend-io.h | 9 + 10 files changed, 171 insertions(+), 7 deletions(-) diff --git a/block/block-backend.c b/block/block-backend.c index f70b08e3f6..bcb3a1eff0 100644 --- a/block/block-backend.c +++ b/block/block-backend.c @@ -1888,6 +1888,45 @@ BlockAIOCB *blk_aio_zone_mgmt(BlockBackend *blk, BlockZoneOp op, return &acb->common; } +static void coroutine_fn blk_aio_zone_append_entry(void *opaque) +{ +BlkAioEmAIOCB *acb = opaque; +BlkRwCo *rwco = &acb->rwco; + +rwco->ret = blk_co_zone_append(rwco->blk, (int64_t *)acb->bytes, + rwco->iobuf, rwco->flags); +blk_aio_complete(acb); +} + +BlockAIOCB *blk_aio_zone_append(BlockBackend *blk, int64_t *offset, +QEMUIOVector *qiov, BdrvRequestFlags flags, +BlockCompletionFunc *cb, void *opaque) { +BlkAioEmAIOCB *acb; +Coroutine *co; +IO_CODE(); + +blk_inc_in_flight(blk); +acb = blk_aio_get(&blk_aio_em_aiocb_info, blk, cb, opaque); +acb->rwco = (BlkRwCo) { +.blk= blk, +.ret= NOT_DONE, +.flags = flags, +.iobuf = qiov, +}; +acb->bytes = (int64_t)offset; +acb->has_returned = false; + +co = qemu_coroutine_create(blk_aio_zone_append_entry, acb); +aio_co_enter(blk_get_aio_context(blk), co); +acb->has_returned = true; +if (acb->rwco.ret != NOT_DONE) { +replay_bh_schedule_oneshot_event(blk_get_aio_context(blk), + blk_aio_complete_bh, acb); +} + +return &acb->common; +} + /* * Send a zone_report command. * offset is a byte offset from the start of the device. No alignment @@ -1939,6 +1978,27 @@ int coroutine_fn blk_co_zone_mgmt(BlockBackend *blk, BlockZoneOp op, return ret; } +/* + * Send a zone_append command. + */ +int coroutine_fn blk_co_zone_append(BlockBackend *blk, int64_t *offset, +QEMUIOVector *qiov, BdrvRequestFlags flags) +{ +int ret; +IO_CODE(); + +blk_inc_in_flight(blk); +blk_wait_while_drained(blk); +if (!blk_is_available(blk)) { +blk_dec_in_flight(blk); +return -ENOMEDIUM; +} + +ret = bdrv_co_zone_append(blk_bs(blk), offset, qiov, flags); +blk_dec_in_flight(blk); +return ret; +} + void blk_drain(BlockBackend *blk) { BlockDriverState *bs = blk_bs(blk); diff --git a/block/file-posix.c b/block/file-posix.c index bc58f7193b..a7130b1024 100644 --- a/block/file-posix.c +++ b/block/file-posix.c @@ -160,6 +160,7 @@ typedef struct BDRVRawState { bool has_write_zeroes:1; bool use_linux_aio:1; bool use_linux_io_uring:1; +int64_t *offset; /* offset of zone append operation */ int page_cache_inconsistent; /* errno from fdatasync failure */ bool has_fallocate; bool needs_alignment; @@ -1685,7 +1686,7 @@ static ssize_t handle_aiocb_rw_vector(RawPosixAIOData *aiocb) ssize_t len; len = RETRY_ON_EINTR( -(aiocb->aio_type & QEMU_AIO_WRITE) ? +(aiocb->aio_type & (QEMU_AIO_WRITE | QEMU_AIO_ZONE_APPEND)) ? qemu_pwritev(aiocb->aio_fildes, aiocb->io.iov, aiocb->io.niov, @@ -1714,7 +1715,7 @@ static ssize_t handle_aiocb_rw_linear(RawPosixAIOData *aiocb, char *buf) ssize_t len; while (offset < aiocb->aio_nbytes) { -if (aiocb->aio_type & QEMU_AIO_WRITE) { +if (aiocb->aio_type & (QEMU_AIO_WRITE | QEMU_AIO_ZONE_APPEND)) { len = pwrite(aiocb->aio_fildes, (const char *)buf + offset, aiocb->aio_nbytes - offset, @@ -1807,7 +1808,7 @@ static int handle_aiocb_rw(void *opaque) } nbytes = handle_aiocb_rw_linear(aiocb, buf); -if (!(aiocb->aio_type & QEMU_AIO_WRITE)) { +if (!(aiocb->aio_type & (QEMU_AIO_WRITE | QEMU_AIO_ZONE_APPEND))) { char *p = buf; size_t count = aiocb->aio_nbytes, copy; int i; @@ -2442,8 +2443,12 @@ static int coroutine_fn raw_co_prw(BlockDriverState *bs, ui
[PULL 1/1] nbd/server: Request TCP_NODELAY
Nagle's algorithm adds latency in order to reduce network packet overhead on small packets. But when we are already using corking to merge smaller packets into transactional requests, the extra delay from TCP defaults just gets in the way (see recent commit bd2cd4a4). For reference, qemu as an NBD client already requests TCP_NODELAY (see nbd_connect() in nbd/client-connection.c); as does libnbd as a client [1], and nbdkit as a server [2]. Furthermore, the NBD spec recommends the use of TCP_NODELAY [3]. [1] https://gitlab.com/nbdkit/libnbd/-/blob/a48a1142/generator/states-connect.c#L39 [2] https://gitlab.com/nbdkit/nbdkit/-/blob/45b72f5b/server/sockets.c#L430 [3] https://github.com/NetworkBlockDevice/nbd/blob/master/doc/proto.md#protocol-phases CC: Florian Westphal Signed-off-by: Eric Blake Message-Id: <20230404004047.142086-1-ebl...@redhat.com> Reviewed-by: Philippe Mathieu-Daudé --- nbd/server.c | 1 + 1 file changed, 1 insertion(+) diff --git a/nbd/server.c b/nbd/server.c index 848836d4140..3d8d0d81df2 100644 --- a/nbd/server.c +++ b/nbd/server.c @@ -2758,6 +2758,7 @@ void nbd_client_new(QIOChannelSocket *sioc, } client->tlsauthz = g_strdup(tlsauthz); client->sioc = sioc; +qio_channel_set_delay(QIO_CHANNEL(sioc), false); object_ref(OBJECT(client->sioc)); client->ioc = QIO_CHANNEL(sioc); object_ref(OBJECT(client->ioc)); -- 2.39.2
Re: s390x TCG migration failure
On 29/03/2023 08.36, Thomas Huth wrote: On 29/03/2023 00.21, Nina Schoetterl-Glausch wrote: On Tue, 2023-03-28 at 15:01 +0200, Thomas Huth wrote: On 24/03/2023 19.41, Nina Schoetterl-Glausch wrote: Hi, We're seeing failures running s390x migration kvm-unit-tests tests with TCG. Some initial findings: What seems to be happening is that after migration a control block header accessed by the test code is all zeros which causes an unexpected exception. I did a bisection which points to c8df4a7aef ("migration: Split save_live_pending() into state_pending_*") as the culprit. The migration issue persists after applying the fix e264705012 ("migration: I messed state_pending_exact/estimate") on top of c8df4a7aef. Hi Nina, could you please open a ticket in the QEMU bug tracker and add the "8.0" label there, so that it correctly shows up in the list of things that should be fixed before the 8.0 release? https://gitlab.com/qemu-project/qemu/-/issues/1565 Thanks! Ping! Juan, did you have a chance to look at this issue yet? ... we're getting quite close to the 8.0 release, and IMHO this sounds like a critical bug? Thomas
Re: [PATCH 01/13] virtio-scsi: avoid race between unplug and transport event
On Mon, Apr 03, 2023 at 02:29:52PM -0400, Stefan Hajnoczi wrote: > Only report a transport reset event to the guest after the SCSIDevice > has been unrealized by qdev_simple_device_unplug_cb(). > > qdev_simple_device_unplug_cb() sets the SCSIDevice's qdev.realized field > to false so that scsi_device_find/get() no longer see it. > > scsi_target_emulate_report_luns() also needs to be updated to filter out > SCSIDevices that are unrealized. > > These changes ensure that the guest driver does not see the SCSIDevice > that's being unplugged if it responds very quickly to the transport > reset event. > > Signed-off-by: Stefan Hajnoczi Reviewed-by: Michael S. Tsirkin Feel free to merge. > --- > hw/scsi/scsi-bus.c| 3 ++- > hw/scsi/virtio-scsi.c | 18 +- > 2 files changed, 11 insertions(+), 10 deletions(-) > > diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c > index c97176110c..f9bd064833 100644 > --- a/hw/scsi/scsi-bus.c > +++ b/hw/scsi/scsi-bus.c > @@ -487,7 +487,8 @@ static bool scsi_target_emulate_report_luns(SCSITargetReq > *r) > DeviceState *qdev = kid->child; > SCSIDevice *dev = SCSI_DEVICE(qdev); > > -if (dev->channel == channel && dev->id == id && dev->lun != 0) { > +if (dev->channel == channel && dev->id == id && dev->lun != 0 && > +qatomic_load_acquire(&dev->qdev.realized)) { > store_lun(tmp, dev->lun); > g_byte_array_append(buf, tmp, 8); > len += 8; > diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c > index 612c525d9d..000961446c 100644 > --- a/hw/scsi/virtio-scsi.c > +++ b/hw/scsi/virtio-scsi.c > @@ -1063,15 +1063,6 @@ static void virtio_scsi_hotunplug(HotplugHandler > *hotplug_dev, DeviceState *dev, > SCSIDevice *sd = SCSI_DEVICE(dev); > AioContext *ctx = s->ctx ?: qemu_get_aio_context(); > > -if (virtio_vdev_has_feature(vdev, VIRTIO_SCSI_F_HOTPLUG)) { > -virtio_scsi_acquire(s); > -virtio_scsi_push_event(s, sd, > - VIRTIO_SCSI_T_TRANSPORT_RESET, > - VIRTIO_SCSI_EVT_RESET_REMOVED); > -scsi_bus_set_ua(&s->bus, SENSE_CODE(REPORTED_LUNS_CHANGED)); > -virtio_scsi_release(s); > -} > - > aio_disable_external(ctx); > qdev_simple_device_unplug_cb(hotplug_dev, dev, errp); > aio_enable_external(ctx); > @@ -1082,6 +1073,15 @@ static void virtio_scsi_hotunplug(HotplugHandler > *hotplug_dev, DeviceState *dev, > blk_set_aio_context(sd->conf.blk, qemu_get_aio_context(), NULL); > virtio_scsi_release(s); > } > + > +if (virtio_vdev_has_feature(vdev, VIRTIO_SCSI_F_HOTPLUG)) { > +virtio_scsi_acquire(s); > +virtio_scsi_push_event(s, sd, > + VIRTIO_SCSI_T_TRANSPORT_RESET, > + VIRTIO_SCSI_EVT_RESET_REMOVED); > +scsi_bus_set_ua(&s->bus, SENSE_CODE(REPORTED_LUNS_CHANGED)); > +virtio_scsi_release(s); > +} > } > > static struct SCSIBusInfo virtio_scsi_scsi_info = { > -- > 2.39.2
Re: [PATCH v2 05/11] qemu-options: finesse the recommendations around -blockdev
04.04.2023 16:57, Kevin Wolf пишет: Let's not make the use of -drive look more advisable than it really is. If you're writing a management tool/script and you're still using -drive today, you're doing it wrong. Kevin, maybe I'm wrong here, but what to do with the situation which started it all, -- with -snapshot? If anything, I think there should be a bold note that -snapshot is broken by -blockdev. Users are learning that the *hard* way, after losing their data.. /mjt
[PULL 07/10] tests/qemu-iotests: explicitly invoke 'check' via 'python'
From: Daniel P. Berrangé The 'check' script will use "#!/usr/bin/env python3" by default to locate python, but this doesn't work in distros which lack a bare 'python3' binary like NetBSD. We need to explicitly invoke 'check' by referring to the 'python' variable in meson, which resolves to the detected python binary that QEMU intends to use. This fixes a regression introduced by commit 51ab5f8bd795d8980351f8531e54995ff9e6d163 Author: Daniel P. Berrangé Date: Wed Mar 15 17:43:23 2023 + iotests: register each I/O test separately with meson Signed-off-by: Daniel P. Berrangé Reviewed-by: Paolo Bonzini Reviewed-by: Philippe Mathieu-Daudé Message-Id: <20230329124539.822022-1-berra...@redhat.com> Signed-off-by: Alex Bennée Reviewed-by: Thomas Huth Message-Id: <20230403134920.2132362-9-alex.ben...@linaro.org> diff --git a/tests/qemu-iotests/meson.build b/tests/qemu-iotests/meson.build index a162f683ef..9735071a29 100644 --- a/tests/qemu-iotests/meson.build +++ b/tests/qemu-iotests/meson.build @@ -47,19 +47,20 @@ foreach format, speed: qemu_iotests_formats endif rc = run_command( - [qemu_iotests_check_cmd] + args + ['-n'], + [python, qemu_iotests_check_cmd] + args + ['-n'], check: true, ) foreach item: rc.stdout().strip().split() - args = ['-tap', '-' + format, item, + args = [qemu_iotests_check_cmd, + '-tap', '-' + format, item, '--source-dir', meson.current_source_dir(), '--build-dir', meson.current_build_dir()] # Some individual tests take as long as 45 seconds # Bump the timeout to 3 minutes for some headroom # on slow machines to minimize spurious failures test('io-' + format + '-' + item, - qemu_iotests_check_cmd, + python, args: args, depends: qemu_iotests_binaries, env: qemu_iotests_env, -- 2.39.2
Re: [PATCH v2 05/11] qemu-options: finesse the recommendations around -blockdev
Kevin Wolf writes: > Am 03.04.2023 um 15:49 hat Alex Bennée geschrieben: >> We are a bit premature in recommending -blockdev/-device as the best >> way to configure block devices, especially in the common case. >> Improve the language to hopefully make things clearer. >> >> Suggested-by: Michael Tokarev >> Signed-off-by: Alex Bennée >> Reviewed-by: Thomas Huth >> Message-Id: <20230330101141.30199-5-alex.ben...@linaro.org> >> --- >> qemu-options.hx | 8 ++-- >> 1 file changed, 6 insertions(+), 2 deletions(-) >> >> diff --git a/qemu-options.hx b/qemu-options.hx >> index 59bdf67a2c..9a69ed838e 100644 >> --- a/qemu-options.hx >> +++ b/qemu-options.hx >> @@ -1143,10 +1143,14 @@ have gone through several iterations as the feature >> set and complexity >> of the block layer have grown. Many online guides to QEMU often >> reference older and deprecated options, which can lead to confusion. >> >> -The recommended modern way to describe disks is to use a combination of >> +The most explicit way to describe disks is to use a combination of >> ``-device`` to specify the hardware device and ``-blockdev`` to >> describe the backend. The device defines what the guest sees and the >> -backend describes how QEMU handles the data. >> +backend describes how QEMU handles the data. The ``-drive`` option >> +combines the device and backend into a single command line options >> +which is useful in the majority of cases. Older options like ``-hda`` >> +bake in a lot of assumptions from the days when QEMU was emulating a >> +legacy PC, they are not recommended for modern configurations. > > Let's not make the use of -drive look more advisable than it really is. > If you're writing a management tool/script and you're still using -drive > today, you're doing it wrong. > > Maybe this is actually the point where we should just clearly define > that -blockdev is the only supported stable API (like QMP), and that > -drive etc. are convenient shortcuts for human users with no > compatibility promise (like HMP). OK I'll drop this patch from today's PR and await a better description in due course. > > What stopped us from doing so is that there are certain boards that > don't allow the user to configure the onboard devices, but that look at > -drive. These wouldn't provide any stable API any more after this > change. However, if this hasn't been solved in many years, maybe it's > time to view it as the board's problem, and use this change to motivate > them to implement ways to configure the devices. Or maybe some don't > even want to bother with a stable API, who knows. > > Kevin -- Alex Bennée Virtualisation Tech Lead @ Linaro
Re: [PATCH v7 1/4] file-posix: add tracking of the zone write pointers
Stefan Hajnoczi 于2023年4月4日周二 01:04写道: > > On Thu, Mar 23, 2023 at 01:19:04PM +0800, Sam Li wrote: > > Since Linux doesn't have a user API to issue zone append operations to > > zoned devices from user space, the file-posix driver is modified to add > > zone append emulation using regular writes. To do this, the file-posix > > driver tracks the wp location of all zones of the device. It uses an > > array of uint64_t. The most significant bit of each wp location indicates > > if the zone type is conventional zones. > > > > The zones wp can be changed due to the following operations issued: > > - zone reset: change the wp to the start offset of that zone > > - zone finish: change to the end location of that zone > > - write to a zone > > - zone append > > > > Signed-off-by: Sam Li > > --- > > block/file-posix.c | 168 ++- > > include/block/block-common.h | 14 +++ > > include/block/block_int-common.h | 5 + > > 3 files changed, 183 insertions(+), 4 deletions(-) > > > > diff --git a/block/file-posix.c b/block/file-posix.c > > index 65efe5147e..0fb425dcae 100644 > > --- a/block/file-posix.c > > +++ b/block/file-posix.c > > @@ -1324,6 +1324,85 @@ static int hdev_get_max_segments(int fd, struct stat > > *st) > > #endif > > } > > > > +#if defined(CONFIG_BLKZONED) > > +/* > > + * If the ra (reset_all) flag > 0, then the wp of that zone should be > > reset to > > + * the start sector. Else, take the real wp of the device. > > + */ > > +static int get_zones_wp(int fd, BlockZoneWps *wps, int64_t offset, > > +unsigned int nrz, int ra) { > > Please use bool for true/false and use clear variable names: > int ra -> bool reset_all > > > +struct blk_zone *blkz; > > +size_t rep_size; > > +uint64_t sector = offset >> BDRV_SECTOR_BITS; > > +int ret, n = 0, i = 0; > > +rep_size = sizeof(struct blk_zone_report) + nrz * sizeof(struct > > blk_zone); > > +g_autofree struct blk_zone_report *rep = NULL; > > + > > +rep = g_malloc(rep_size); > > +blkz = (struct blk_zone *)(rep + 1); > > +while (n < nrz) { > > +memset(rep, 0, rep_size); > > +rep->sector = sector; > > +rep->nr_zones = nrz - n; > > + > > +do { > > +ret = ioctl(fd, BLKREPORTZONE, rep); > > +} while (ret != 0 && errno == EINTR); > > +if (ret != 0) { > > +error_report("%d: ioctl BLKREPORTZONE at %" PRId64 " failed > > %d", > > +fd, offset, errno); > > +return -errno; > > +} > > + > > +if (!rep->nr_zones) { > > +break; > > +} > > + > > +for (i = 0; i < rep->nr_zones; i++, n++) { > > +/* > > + * The wp tracking cares only about sequential writes required > > and > > + * sequential write preferred zones so that the wp can advance > > to > > + * the right location. > > + * Use the most significant bit of the wp location to indicate > > the > > + * zone type: 0 for SWR/SWP zones and 1 for conventional zones. > > + */ > > +if (blkz[i].type == BLK_ZONE_TYPE_CONVENTIONAL) { > > +wps->wp[i] &= 1ULL << 63; > > +} else { > > +switch(blkz[i].cond) { > > +case BLK_ZONE_COND_FULL: > > +case BLK_ZONE_COND_READONLY: > > +/* Zone not writable */ > > +wps->wp[i] = (blkz[i].start + blkz[i].len) << > > BDRV_SECTOR_BITS; > > +break; > > +case BLK_ZONE_COND_OFFLINE: > > +/* Zone not writable nor readable */ > > +wps->wp[i] = (blkz[i].start) << BDRV_SECTOR_BITS; > > +break; > > +default: > > +if (ra > 0) { > > +wps->wp[i] = blkz[i].start << BDRV_SECTOR_BITS; > > +} else { > > +wps->wp[i] = blkz[i].wp << BDRV_SECTOR_BITS; > > +} > > +break; > > +} > > +} > > +} > > +sector = blkz[i - 1].start + blkz[i - 1].len; > > +} > > + > > +return 0; > > +} > > + > > +static void update_zones_wp(int fd, BlockZoneWps *wps, int64_t offset, > > +unsigned int nrz) { > > QEMU coding style puts the opening curly bracket on a new line: > > static void update_zones_wp(int fd, BlockZoneWps *wps, int64_t offset, > unsigned int nrz) > { > > > +if (get_zones_wp(fd, wps, offset, nrz, 0) < 0) { > > +error_report("update zone wp failed"); > > +} > > +} > > +#endif > > + > > static void raw_refresh_limits(BlockDriverState *bs, Error **errp) > > { > > BDRVRawState *s = bs->opaque; > > @@ -1413,6 +1492,21 @@ static void raw_refresh_limits(BlockDriverState *bs, > > Error **errp) > >
Re: [PATCH v2 05/11] qemu-options: finesse the recommendations around -blockdev
Am 03.04.2023 um 15:49 hat Alex Bennée geschrieben: > We are a bit premature in recommending -blockdev/-device as the best > way to configure block devices, especially in the common case. > Improve the language to hopefully make things clearer. > > Suggested-by: Michael Tokarev > Signed-off-by: Alex Bennée > Reviewed-by: Thomas Huth > Message-Id: <20230330101141.30199-5-alex.ben...@linaro.org> > --- > qemu-options.hx | 8 ++-- > 1 file changed, 6 insertions(+), 2 deletions(-) > > diff --git a/qemu-options.hx b/qemu-options.hx > index 59bdf67a2c..9a69ed838e 100644 > --- a/qemu-options.hx > +++ b/qemu-options.hx > @@ -1143,10 +1143,14 @@ have gone through several iterations as the feature > set and complexity > of the block layer have grown. Many online guides to QEMU often > reference older and deprecated options, which can lead to confusion. > > -The recommended modern way to describe disks is to use a combination of > +The most explicit way to describe disks is to use a combination of > ``-device`` to specify the hardware device and ``-blockdev`` to > describe the backend. The device defines what the guest sees and the > -backend describes how QEMU handles the data. > +backend describes how QEMU handles the data. The ``-drive`` option > +combines the device and backend into a single command line options > +which is useful in the majority of cases. Older options like ``-hda`` > +bake in a lot of assumptions from the days when QEMU was emulating a > +legacy PC, they are not recommended for modern configurations. Let's not make the use of -drive look more advisable than it really is. If you're writing a management tool/script and you're still using -drive today, you're doing it wrong. Maybe this is actually the point where we should just clearly define that -blockdev is the only supported stable API (like QMP), and that -drive etc. are convenient shortcuts for human users with no compatibility promise (like HMP). What stopped us from doing so is that there are certain boards that don't allow the user to configure the onboard devices, but that look at -drive. These wouldn't provide any stable API any more after this change. However, if this hasn't been solved in many years, maybe it's time to view it as the board's problem, and use this change to motivate them to implement ways to configure the devices. Or maybe some don't even want to bother with a stable API, who knows. Kevin
Re: [PATCH v2 for 8.0?] nbd/server: Request TCP_NODELAY
On 4/4/23 02:40, Eric Blake wrote: Nagle's algorithm adds latency in order to reduce network packet overhead on small packets. But when we are already using corking to merge smaller packets into transactional requests, the extra delay from TCP defaults just gets in the way (see recent commit bd2cd4a4). For reference, qemu as an NBD client already requests TCP_NODELAY (see nbd_connect() in nbd/client-connection.c); as does libnbd as a client [1], and nbdkit as a server [2]. Furthermore, the NBD spec recommends the use of TCP_NODELAY [3]. [1] https://gitlab.com/nbdkit/libnbd/-/blob/a48a1142/generator/states-connect.c#L39 [2] https://gitlab.com/nbdkit/nbdkit/-/blob/45b72f5b/server/sockets.c#L430 [3] https://github.com/NetworkBlockDevice/nbd/blob/master/doc/proto.md#protocol-phases CC: Florian Westphal Signed-off-by: Eric Blake Message-Id: <20230327192947.1324372-1-ebl...@redhat.com> --- v2 fix typo, enhance commit message Given that corking made it in through Kevin's tree for 8.0-rc2 but this one did not, but I didn't get any R-b, is there any objection to me doing a pull request to get this into 8.0-rc3? nbd/server.c | 1 + 1 file changed, 1 insertion(+) diff --git a/nbd/server.c b/nbd/server.c index 848836d4140..3d8d0d81df2 100644 --- a/nbd/server.c +++ b/nbd/server.c @@ -2758,6 +2758,7 @@ void nbd_client_new(QIOChannelSocket *sioc, } client->tlsauthz = g_strdup(tlsauthz); client->sioc = sioc; +qio_channel_set_delay(QIO_CHANNEL(sioc), false); object_ref(OBJECT(client->sioc)); client->ioc = QIO_CHANNEL(sioc); object_ref(OBJECT(client->ioc)); base-commit: efcd0ec14b0fe9ee0ee70277763b2d538d19238d Acked-by: Paolo Bonzini
Re: [PATCH] block/nvme: use AIO_WAIT_WHILE_UNLOCKED()
On 4/4/23 13:20, Stefan Hajnoczi wrote: A few Admin Queue commands are submitted during nvme_file_open(). They are synchronous since device initialization cannot continue until the commands complete. AIO_WAIT_WHILE() is currently used, but the block/nvme.c code actually doesn't rely on the AioContext lock. Replace it with AIO_WAIT_WHILE_UNLOCKED(NULL, condition). There is no change in behavior and the dependency on the AioContext lock is eliminated. This is a step towards removing the AioContext lock. Signed-off-by: Stefan Hajnoczi --- block/nvme.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/block/nvme.c b/block/nvme.c index 5b744c2bda..829b9c04db 100644 --- a/block/nvme.c +++ b/block/nvme.c @@ -512,7 +512,6 @@ static int nvme_admin_cmd_sync(BlockDriverState *bs, NvmeCmd *cmd) { BDRVNVMeState *s = bs->opaque; NVMeQueuePair *q = s->queues[INDEX_ADMIN]; -AioContext *aio_context = bdrv_get_aio_context(bs); NVMeRequest *req; int ret = -EINPROGRESS; req = nvme_get_free_req_nowait(q); @@ -521,7 +520,7 @@ static int nvme_admin_cmd_sync(BlockDriverState *bs, NvmeCmd *cmd) } nvme_submit_command(q, req, cmd, nvme_admin_cmd_sync_cb, &ret); -AIO_WAIT_WHILE(aio_context, ret == -EINPROGRESS); +AIO_WAIT_WHILE_UNLOCKED(NULL, ret == -EINPROGRESS); return ret; } Reviewed-by: Paolo Bonzini
Re: [PATCH 01/13] virtio-scsi: avoid race between unplug and transport event
On 4/4/23 15:06, Stefan Hajnoczi wrote: Would this be more useful as a qdev_is_realized() helper? Yes. There are no other users, but I think a helper makes sense. Agreed; anyway, Reviewed-by: Paolo Bonzini Paolo
Re: [PATCH 11/13] block/fuse: take AioContext lock around blk_exp_ref/unref()
On 4/3/23 20:30, Stefan Hajnoczi wrote: These functions must be called with the AioContext acquired: /* Callers must hold exp->ctx lock */ void blk_exp_ref(BlockExport *exp) ... /* Callers must hold exp->ctx lock */ void blk_exp_unref(BlockExport *exp) Signed-off-by: Stefan Hajnoczi --- block/export/fuse.c | 4 1 file changed, 4 insertions(+) diff --git a/block/export/fuse.c b/block/export/fuse.c index 06fa41079e..18394f9e07 100644 --- a/block/export/fuse.c +++ b/block/export/fuse.c @@ -244,7 +244,9 @@ static void read_from_fuse_export(void *opaque) FuseExport *exp = opaque; int ret; +aio_context_acquire(exp->common.ctx); blk_exp_ref(&exp->common); +aio_context_release(exp->common.ctx); do { ret = fuse_session_receive_buf(exp->fuse_session, &exp->fuse_buf); @@ -256,7 +258,9 @@ static void read_from_fuse_export(void *opaque) fuse_session_process_buf(exp->fuse_session, &exp->fuse_buf); out: +aio_context_acquire(exp->common.ctx); blk_exp_unref(&exp->common); +aio_context_release(exp->common.ctx); } Since the actual thread-unsafe work is done in a bottom half, perhaps instead you can use qatomic_inc and qatomic_fetch_dec in blk_exp_{ref,unref}? Paolo
Re: [PATCH 00/13] block: remove aio_disable_external() API
On 4/3/23 20:29, Stefan Hajnoczi wrote: The aio_disable_external() API temporarily suspends file descriptor monitoring in the event loop. The block layer uses this to prevent new I/O requests being submitted from the guest and elsewhere between bdrv_drained_begin() and bdrv_drained_end(). While the block layer still needs to prevent new I/O requests in drained sections, the aio_disable_external() API can be replaced with .drained_begin/end/poll() callbacks that have been added to BdrvChildClass and BlockDevOps. This newer .bdrained_begin/end/poll() approach is attractive because it works without specifying a specific AioContext. The block layer is moving towards multi-queue and that means multiple AioContexts may be processing I/O simultaneously. The aio_disable_external() was always somewhat hacky. It suspends all file descriptors that were registered with is_external=true, even if they have nothing to do with the BlockDriverState graph nodes that are being drained. It's better to solve a block layer problem in the block layer than to have an odd event loop API solution. That covers the motivation for this change, now on to the specifics of this series: While it would be nice if a single conceptual approach could be applied to all is_external=true file descriptors, I ended up looking at callers on a case-by-case basis. There are two general ways I migrated code away from is_external=true: 1. Block exports are typically best off unregistering fds in .drained_begin() and registering them again in .drained_end(). The .drained_poll() function waits for in-flight requests to finish using a reference counter. 2. Emulated storage controllers like virtio-blk and virtio-scsi are a little simpler. They can rely on BlockBackend's request queuing during drain feature. Guest I/O request coroutines are suspended in a drained section and resume upon the end of the drained section. Sorry, I disagree with this. Request queuing was shown to cause deadlocks; Hanna's latest patch is piling another hack upon it, instead in my opinion we should go in the direction of relying _less_ (or not at all) on request queuing. I am strongly convinced that request queuing must apply only after bdrv_drained_begin has returned, which would also fix the IDE TRIM bug reported by Fiona Ebner. The possible livelock scenario is generally not a problem because 1) outside an iothread you have anyway the BQL that prevents a vCPU from issuing more I/O operations during bdrv_drained_begin 2) in iothreads you have aio_disable_external() instead of .drained_begin(). It is also less tidy to start a request during the drained_begin phase, because a request that has been submitted has to be completed (cancel doesn't really work). So in an ideal world, request queuing would not only apply only after bdrv_drained_begin has returned, it would log a warning and .drained_begin() should set up things so that there are no such warnings. Thanks, Paolo
Re: [PATCH 04/13] util/vhost-user-server: rename refcount to in_flight counter
On 4/3/23 20:29, Stefan Hajnoczi wrote: The VuServer object has a refcount field and ref/unref APIs. The name is confusing because it's actually an in-flight request counter instead of a refcount. Normally a refcount destroys the object upon reaching zero. The VuServer counter is used to wake up the vhost-user coroutine when there are no more requests. Avoid confusing by renaming refcount and ref/unref to in_flight and inc/dec. Signed-off-by: Stefan Hajnoczi --- include/qemu/vhost-user-server.h | 6 +++--- block/export/vhost-user-blk-server.c | 11 +++ util/vhost-user-server.c | 14 +++--- 3 files changed, 17 insertions(+), 14 deletions(-) diff --git a/include/qemu/vhost-user-server.h b/include/qemu/vhost-user-server.h index 25c72433ca..bc0ac9ddb6 100644 --- a/include/qemu/vhost-user-server.h +++ b/include/qemu/vhost-user-server.h @@ -41,7 +41,7 @@ typedef struct { const VuDevIface *vu_iface; /* Protected by ctx lock */ -unsigned int refcount; +unsigned int in_flight; bool wait_idle; VuDev vu_dev; QIOChannel *ioc; /* The I/O channel with the client */ @@ -60,8 +60,8 @@ bool vhost_user_server_start(VuServer *server, void vhost_user_server_stop(VuServer *server); -void vhost_user_server_ref(VuServer *server); -void vhost_user_server_unref(VuServer *server); +void vhost_user_server_inc_in_flight(VuServer *server); +void vhost_user_server_dec_in_flight(VuServer *server); void vhost_user_server_attach_aio_context(VuServer *server, AioContext *ctx); void vhost_user_server_detach_aio_context(VuServer *server); diff --git a/block/export/vhost-user-blk-server.c b/block/export/vhost-user-blk-server.c index 3409d9e02e..e93f2ed6b4 100644 --- a/block/export/vhost-user-blk-server.c +++ b/block/export/vhost-user-blk-server.c @@ -49,7 +49,10 @@ static void vu_blk_req_complete(VuBlkReq *req, size_t in_len) free(req); } -/* Called with server refcount increased, must decrease before returning */ +/* + * Called with server in_flight counter increased, must decrease before + * returning. + */ static void coroutine_fn vu_blk_virtio_process_req(void *opaque) { VuBlkReq *req = opaque; @@ -67,12 +70,12 @@ static void coroutine_fn vu_blk_virtio_process_req(void *opaque) in_num, out_num); if (in_len < 0) { free(req); -vhost_user_server_unref(server); +vhost_user_server_dec_in_flight(server); return; } vu_blk_req_complete(req, in_len); -vhost_user_server_unref(server); +vhost_user_server_dec_in_flight(server); } static void vu_blk_process_vq(VuDev *vu_dev, int idx) @@ -94,7 +97,7 @@ static void vu_blk_process_vq(VuDev *vu_dev, int idx) Coroutine *co = qemu_coroutine_create(vu_blk_virtio_process_req, req); -vhost_user_server_ref(server); +vhost_user_server_inc_in_flight(server); qemu_coroutine_enter(co); } } diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c index 5b6216069c..1622f8cfb3 100644 --- a/util/vhost-user-server.c +++ b/util/vhost-user-server.c @@ -75,16 +75,16 @@ static void panic_cb(VuDev *vu_dev, const char *buf) error_report("vu_panic: %s", buf); } -void vhost_user_server_ref(VuServer *server) +void vhost_user_server_inc_in_flight(VuServer *server) { assert(!server->wait_idle); -server->refcount++; +server->in_flight++; } -void vhost_user_server_unref(VuServer *server) +void vhost_user_server_dec_in_flight(VuServer *server) { -server->refcount--; -if (server->wait_idle && !server->refcount) { +server->in_flight--; +if (server->wait_idle && !server->in_flight) { aio_co_wake(server->co_trip); } } @@ -192,13 +192,13 @@ static coroutine_fn void vu_client_trip(void *opaque) /* Keep running */ } -if (server->refcount) { +if (server->in_flight) { /* Wait for requests to complete before we can unmap the memory */ server->wait_idle = true; qemu_coroutine_yield(); server->wait_idle = false; } -assert(server->refcount == 0); +assert(server->in_flight == 0); vu_deinit(vu_dev); Reviewed-by: Paolo Bonzini
Re: [PATCH 02/13] virtio-scsi: stop using aio_disable_external() during unplug
On 4/3/23 20:29, Stefan Hajnoczi wrote: This patch is part of an effort to remove the aio_disable_external() API because it does not fit in a multi-queue block layer world where many AioContexts may be submitting requests to the same disk. The SCSI emulation code is already in good shape to stop using aio_disable_external(). It was only used by commit 9c5aad84da1c ("virtio-scsi: fixed virtio_scsi_ctx_check failed when detaching scsi disk") to ensure that virtio_scsi_hotunplug() works while the guest driver is submitting I/O. Ensure virtio_scsi_hotunplug() is safe as follows: 1. qdev_simple_device_unplug_cb() -> qdev_unrealize() -> device_set_realized() calls qatomic_set(&dev->realized, false) so that future scsi_device_get() calls return NULL because they exclude SCSIDevices with realized=false. That means virtio-scsi will reject new I/O requests to this SCSIDevice with VIRTIO_SCSI_S_BAD_TARGET even while virtio_scsi_hotunplug() is still executing. We are protected against new requests! 2. Add a call to scsi_device_purge_requests() from scsi_unrealize() so that in-flight requests are cancelled synchronously. This ensures that no in-flight requests remain once qdev_simple_device_unplug_cb() returns. Thanks to these two conditions we don't need aio_disable_external() anymore. Cc: Zhengui Li Signed-off-by: Stefan Hajnoczi --- hw/scsi/scsi-disk.c | 1 + hw/scsi/virtio-scsi.c | 3 --- 2 files changed, 1 insertion(+), 3 deletions(-) diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c index 97c9b1c8cd..e01bd84541 100644 --- a/hw/scsi/scsi-disk.c +++ b/hw/scsi/scsi-disk.c @@ -2522,6 +2522,7 @@ static void scsi_realize(SCSIDevice *dev, Error **errp) static void scsi_unrealize(SCSIDevice *dev) { +scsi_device_purge_requests(dev, SENSE_CODE(RESET)); del_boot_device_lchs(&dev->qdev, NULL); } diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c index 000961446c..a02f9233ec 100644 --- a/hw/scsi/virtio-scsi.c +++ b/hw/scsi/virtio-scsi.c @@ -1061,11 +1061,8 @@ static void virtio_scsi_hotunplug(HotplugHandler *hotplug_dev, DeviceState *dev, VirtIODevice *vdev = VIRTIO_DEVICE(hotplug_dev); VirtIOSCSI *s = VIRTIO_SCSI(vdev); SCSIDevice *sd = SCSI_DEVICE(dev); -AioContext *ctx = s->ctx ?: qemu_get_aio_context(); -aio_disable_external(ctx); qdev_simple_device_unplug_cb(hotplug_dev, dev, errp); -aio_enable_external(ctx); if (s->ctx) { virtio_scsi_acquire(s); Reviewed-by: Paolo Bonzini
Re: [PATCH 15/16] qapi: Format since information the conventional way: (since X.Y)
Vladimir Sementsov-Ogievskiy writes: > On 04.04.23 14:59, Markus Armbruster wrote: >> @@ -2741,7 +2741,7 @@ >> # >> # @on-error: the action to take on an error (default report). >> #'stop' and 'enospc' can only be used if the block device >> -#supports io-status (see BlockInfo). Since 1.3. >> +#supports io-status (see BlockInfo). (Since 1.3) >> # >> # @filter-node-name: the node name that should be assigned to the >> #filter driver that the stream job inserts into the >> graph >> diff --git a/qapi/stats.json b/qapi/stats.json >> index f17495ee65..36d5f4dc94 100644 >> --- a/qapi/stats.json >> +++ b/qapi/stats.json >> @@ -69,7 +69,7 @@ >> # >> # @vcpu: statistics that apply to a single virtual CPU. >> # >> -# @cryptodev: statistics that apply to a crypto device. since 8.0 >> +# @cryptodev: statistics that apply to a crypto device (since 8.0) >> # >> # Since: 7.1 >> ## >> diff --git a/qapi/tpm.json b/qapi/tpm.json >> index 4e2ea9756a..eac87d30b2 100644 >> --- a/qapi/tpm.json >> +++ b/qapi/tpm.json >> @@ -44,8 +44,7 @@ >> # An enumeration of TPM types >> # >> # @passthrough: TPM passthrough type >> -# @emulator: Software Emulator TPM type >> -#Since: 2.11 >> +# @emulator: Software Emulator TPM type (since 2.11) > > Seems, we don't have any preference between "some text (since VER)" vs "some > text. (Since VER)" ? I personally use (Since VER) after a full sentence ending in punctuation, and (since VER) after something that doesn't end in punctuation.
Re: [PATCH 16/16] qapi storage-daemon/qapi: Fix documentation section structure
Looks good to me, thanks! Acked-by: zhenwei pi On 4/4/23 19:59, Markus Armbruster wrote: In the QEMU QMP Reference Manual, subsection "Block core (VM unrelated)" is empty. Its contents is at the end of subsection "Background jobs" instead. That's because qapi/job.json is includeded first from qapi/block-core.json, which makes qapi/job.json's documentation go between qapi/block-core.json's subsection heading and contents. In the QEMU Storage Daemon QMP Reference Manual, section "Block Devices" contains nothing but an empty subsection "Block core (VM unrelated)". The latter's contents is at the end section "Socket data types", along with subsection "Block device exports". Subsection "Background jobs" is at the end of section "Cryptography". All this is because storage-daemon/qapi/qapi-schema.json includes modules in a confused order. Fix both as follows. Turn subsection "Background jobs" into a section. Move it before section "Block devices" in the QEMU QMP Reference Manual, by including qapi/jobs.json right before qapi/block.json. Reorder include directives in storage-daemon/qapi/qapi-schema.json to match the order in qapi/qapi-schema.json, so that the QEMU Storage Daemon QMP Reference Manual's section structure the QEMU QMP Reference Manual's. In the QEMU QMP Reference Manual, qapi/cryptodev.json's documentation is at the end of section "Virtio devices". That's because it lacks a section heading, and therefore gets squashed into whatever section happens to precede it. Add section heading so it's in section "Cryptography devices". Signed-off-by: Markus Armbruster --- qapi/cryptodev.json | 4 qapi/job.json| 2 +- qapi/qapi-schema.json| 2 +- storage-daemon/qapi/qapi-schema.json | 22 +++--- 4 files changed, 21 insertions(+), 9 deletions(-) diff --git a/qapi/cryptodev.json b/qapi/cryptodev.json index f33f96a692..cf960ea81f 100644 --- a/qapi/cryptodev.json +++ b/qapi/cryptodev.json @@ -4,6 +4,10 @@ # This work is licensed under the terms of the GNU GPL, version 2 or later. # See the COPYING file in the top-level directory. +## +# = Cryptography devices +## + ## # @QCryptodevBackendAlgType: # diff --git a/qapi/job.json b/qapi/job.json index bc4104757a..9e29a796c5 100644 --- a/qapi/job.json +++ b/qapi/job.json @@ -2,7 +2,7 @@ # vim: filetype=python ## -# == Background jobs +# = Background jobs ## ## diff --git a/qapi/qapi-schema.json b/qapi/qapi-schema.json index e57d8ff801..bb7217da26 100644 --- a/qapi/qapi-schema.json +++ b/qapi/qapi-schema.json @@ -43,11 +43,11 @@ { 'include': 'sockets.json' } { 'include': 'run-state.json' } { 'include': 'crypto.json' } +{ 'include': 'job.json' } { 'include': 'block.json' } { 'include': 'block-export.json' } { 'include': 'char.json' } { 'include': 'dump.json' } -{ 'include': 'job.json' } { 'include': 'net.json' } { 'include': 'rdma.json' } { 'include': 'rocker.json' } diff --git a/storage-daemon/qapi/qapi-schema.json b/storage-daemon/qapi/qapi-schema.json index 67749d1101..f10c949490 100644 --- a/storage-daemon/qapi/qapi-schema.json +++ b/storage-daemon/qapi/qapi-schema.json @@ -15,18 +15,26 @@ { 'include': '../../qapi/pragma.json' } +# Documentation generated with qapi-gen.py is in source order, with +# included sub-schemas inserted at the first include directive +# (subsequent include directives have no effect). To get a sane and +# stable order, it's best to include each sub-schema just once, or +# include it first right here. + +{ 'include': '../../qapi/common.json' } +{ 'include': '../../qapi/sockets.json' } +{ 'include': '../../qapi/crypto.json' } +{ 'include': '../../qapi/job.json' } + ## # = Block devices ## { 'include': '../../qapi/block-core.json' } { 'include': '../../qapi/block-export.json' } + { 'include': '../../qapi/char.json' } -{ 'include': '../../qapi/common.json' } -{ 'include': '../../qapi/control.json' } -{ 'include': '../../qapi/crypto.json' } -{ 'include': '../../qapi/introspect.json' } -{ 'include': '../../qapi/job.json' } { 'include': '../../qapi/authz.json' } -{ 'include': '../../qapi/qom.json' } -{ 'include': '../../qapi/sockets.json' } { 'include': '../../qapi/transaction.json' } +{ 'include': '../../qapi/control.json' } +{ 'include': '../../qapi/introspect.json' } +{ 'include': '../../qapi/qom.json' } -- zhenwei pi
Re: [PATCH 16/16] qapi storage-daemon/qapi: Fix documentation section structure
On Tue, Apr 04, 2023 at 01:59:12PM +0200, Markus Armbruster wrote: > In the QEMU QMP Reference Manual, subsection "Block core (VM > unrelated)" is empty. Its contents is at the end of subsection > "Background jobs" instead. That's because qapi/job.json is includeded included > first from qapi/block-core.json, which makes qapi/job.json's > documentation go between qapi/block-core.json's subsection heading and > contents. > > In the QEMU Storage Daemon QMP Reference Manual, section "Block > Devices" contains nothing but an empty subsection "Block core (VM > unrelated)". The latter's contents is at the end section "Socket data > types", along with subsection "Block device exports". Subsection > "Background jobs" is at the end of section "Cryptography". All this > is because storage-daemon/qapi/qapi-schema.json includes modules in a > confused order. > > Fix both as follows. > > Turn subsection "Background jobs" into a section. > > Move it before section "Block devices" in the QEMU QMP Reference > Manual, by including qapi/jobs.json right before qapi/block.json. > > Reorder include directives in storage-daemon/qapi/qapi-schema.json to > match the order in qapi/qapi-schema.json, so that the QEMU Storage > Daemon QMP Reference Manual's section structure the QEMU QMP Reference > Manual's. > > In the QEMU QMP Reference Manual, qapi/cryptodev.json's documentation > is at the end of section "Virtio devices". That's because it lacks a > section heading, and therefore gets squashed into whatever section > happens to precede it. > > Add section heading so it's in section "Cryptography devices". > > Signed-off-by: Markus Armbruster > --- > qapi/cryptodev.json | 4 > qapi/job.json| 2 +- > qapi/qapi-schema.json| 2 +- > storage-daemon/qapi/qapi-schema.json | 22 +++--- > 4 files changed, 21 insertions(+), 9 deletions(-) Reviewed-by: Eric Blake -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
Re: [PATCH 00/16] qapi qga/qapi-schema: Doc fixes
On Tue, Apr 4, 2023 at 4:02 PM Markus Armbruster wrote: > > It's always nice to get doc fixes into the release, but if it's too > late, it's too late. > > Generated code does not change, except for the last patch, which moves > a bit of code without changing it. > > Markus Armbruster (16): > qga/qapi-schema: Tidy up documentation of guest-fsfreeze-status > qga/qapi-schema: Fix a misspelled reference > qapi: Fix misspelled references > qapi: Fix up references to long gone error classes > qapi/block-core: Clean up after removal of dirty bitmap @status > qapi: @foo should be used to reference, not ``foo`` > qapi: Tidy up examples > qapi: Delete largely misleading "Stability Considerations" > qapi: Fix bullet list markup in documentation > qapi: Fix unintended definition lists in documentation > qga/qapi-schema: Fix member documentation markup > qapi: Fix argument documentation markup > qapi: Replace ad hoc "since" documentation by member documentation > qapi: Fix misspelled section tags in doc comments > qapi: Format since information the conventional way: (since X.Y) > qapi storage-daemon/qapi: Fix documentation section structure > > docs/devel/qapi-code-gen.rst | 8 ++- > docs/interop/firmware.json | 6 +- > qapi/block-core.json | 82 ++-- > qapi/block.json | 2 +- > qapi/char.json | 4 +- > qapi/control.json| 2 +- > qapi/cryptodev.json | 4 ++ > qapi/job.json| 4 +- > qapi/machine-target.json | 2 +- > qapi/machine.json| 26 + > qapi/migration.json | 37 - > qapi/misc.json | 6 +- > qapi/net.json| 25 +++-- > qapi/qapi-schema.json| 24 +--- > qapi/qdev.json | 2 +- > qapi/qom.json| 4 +- > qapi/rdma.json | 2 +- > qapi/replay.json | 3 + > qapi/run-state.json | 6 +- > qapi/stats.json | 3 +- > qapi/tpm.json| 3 +- > qapi/trace.json | 1 + > qapi/ui.json | 12 ++-- > qga/qapi-schema.json | 10 ++-- > storage-daemon/qapi/qapi-schema.json | 22 +--- > 25 files changed, 154 insertions(+), 146 deletions(-) > > -- > 2.39.2 > > lgtm, Reviewed-by: Marc-André Lureau -- Marc-André Lureau
Re: [PATCH 01/13] virtio-scsi: avoid race between unplug and transport event
On Mon, Apr 03, 2023 at 10:47:11PM +0200, Philippe Mathieu-Daudé wrote: > On 3/4/23 20:29, Stefan Hajnoczi wrote: > > Only report a transport reset event to the guest after the SCSIDevice > > has been unrealized by qdev_simple_device_unplug_cb(). > > > > qdev_simple_device_unplug_cb() sets the SCSIDevice's qdev.realized field > > to false so that scsi_device_find/get() no longer see it. > > > > scsi_target_emulate_report_luns() also needs to be updated to filter out > > SCSIDevices that are unrealized. > > > > These changes ensure that the guest driver does not see the SCSIDevice > > that's being unplugged if it responds very quickly to the transport > > reset event. > > > > Signed-off-by: Stefan Hajnoczi > > --- > > hw/scsi/scsi-bus.c| 3 ++- > > hw/scsi/virtio-scsi.c | 18 +- > > 2 files changed, 11 insertions(+), 10 deletions(-) > > > > diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c > > index c97176110c..f9bd064833 100644 > > --- a/hw/scsi/scsi-bus.c > > +++ b/hw/scsi/scsi-bus.c > > @@ -487,7 +487,8 @@ static bool > > scsi_target_emulate_report_luns(SCSITargetReq *r) > > DeviceState *qdev = kid->child; > > SCSIDevice *dev = SCSI_DEVICE(qdev); > > -if (dev->channel == channel && dev->id == id && dev->lun != 0) > > { > > +if (dev->channel == channel && dev->id == id && dev->lun != 0 > > && > > +qatomic_load_acquire(&dev->qdev.realized)) { > > Would this be more useful as a qdev_is_realized() helper? Yes. There are no other users, but I think a helper makes sense. Stefan signature.asc Description: PGP signature
Re: [PATCH 00/16] qapi qga/qapi-schema: Doc fixes
On 04.04.23 14:58, Markus Armbruster wrote: It's always nice to get doc fixes into the release, but if it's too late, it's too late. Generated code does not change, except for the last patch, which moves a bit of code without changing it. I didn't deeply check the details, but looked through and nothing seems wrong to me. Good cleanup! all patches: Reviewed-by: Vladimir Sementsov-Ogievskiy PS: do you plan some automatic checks in build process to avoid similar style/naming problems in future? -- Best regards, Vladimir
Re: [PATCH 15/16] qapi: Format since information the conventional way: (since X.Y)
On 04.04.23 14:59, Markus Armbruster wrote: @@ -2741,7 +2741,7 @@ # # @on-error: the action to take on an error (default report). #'stop' and 'enospc' can only be used if the block device -#supports io-status (see BlockInfo). Since 1.3. +#supports io-status (see BlockInfo). (Since 1.3) # # @filter-node-name: the node name that should be assigned to the #filter driver that the stream job inserts into the graph diff --git a/qapi/stats.json b/qapi/stats.json index f17495ee65..36d5f4dc94 100644 --- a/qapi/stats.json +++ b/qapi/stats.json @@ -69,7 +69,7 @@ # # @vcpu: statistics that apply to a single virtual CPU. # -# @cryptodev: statistics that apply to a crypto device. since 8.0 +# @cryptodev: statistics that apply to a crypto device (since 8.0) # # Since: 7.1 ## diff --git a/qapi/tpm.json b/qapi/tpm.json index 4e2ea9756a..eac87d30b2 100644 --- a/qapi/tpm.json +++ b/qapi/tpm.json @@ -44,8 +44,7 @@ # An enumeration of TPM types # # @passthrough: TPM passthrough type -# @emulator: Software Emulator TPM type -#Since: 2.11 +# @emulator: Software Emulator TPM type (since 2.11) Seems, we don't have any preference between "some text (since VER)" vs "some text. (Since VER)" ? -- Best regards, Vladimir
Re: [PATCH 10/16] qapi: Fix unintended definition lists in documentation
On Tue, 4 Apr 2023 at 12:59, Markus Armbruster wrote: > > rST parses something like > > first line > second line > > as a definition list item, where "first line" is the term being > defined by "second line". > > This bites us in a couple of places. Here's one: > > # @bps_max: total throughput limit during bursts, > # in bytes (Since 1.7) > > scripts/qapi/parser.py parses this into an "argument section" with > name "bps_max" and text > > total throughput limit during bursts, > in bytes (Since 1.7) > > docs/sphinx/qapidoc.py duly passes the text to the rST parser, which > parses it as another definition list. Comes out as nested > definitions: term "bps_max: int (optional)" defined as term "total > throughput limit during bursts," defined as "in bytes (Since 1.7)". > > rST truly is the Perl of ASCII-based markups. > > Fix by deleting the extra indentation. > > Reported-by: Peter Maydell > Signed-off-by: Markus Armbruster Reviewed-by: Peter Maydell thanks -- PMM
[PATCH 07/16] qapi: Tidy up examples
A few examples neglect to prefix QMP input with '->'. Fix that. A few examples neglect to show output. Provide some. The example output for query-vcpu-dirty-limit could use further improvement. Add a TODO comment. Use "Examples:" instead of "Example:" where multiple examples are given. One example section numbers its two examples. Not done elswehre; drop. Another example section separates them with "or". Likewise. Signed-off-by: Markus Armbruster --- qapi/block-core.json | 14 ++ qapi/block.json | 2 +- qapi/char.json | 4 ++-- qapi/machine.json| 7 --- qapi/migration.json | 33 ++--- qapi/net.json| 4 +--- qapi/qdev.json | 2 +- qapi/qom.json| 2 +- qapi/replay.json | 3 +++ qapi/ui.json | 2 +- 10 files changed, 42 insertions(+), 31 deletions(-) diff --git a/qapi/block-core.json b/qapi/block-core.json index eeb2ed3f16..a5a5007b28 100644 --- a/qapi/block-core.json +++ b/qapi/block-core.json @@ -4574,9 +4574,8 @@ # # Since: 2.9 # -# Example: +# Examples: # -# 1. # -> { "execute": "blockdev-add", # "arguments": { # "driver": "qcow2", @@ -4589,7 +4588,6 @@ # } # <- { "return": {} } # -# 2. # -> { "execute": "blockdev-add", # "arguments": { # "driver": "qcow2", @@ -5596,7 +5594,7 @@ # # Since: 2.7 # -# Example: +# Examples: # # 1. Add a new node to a quorum # -> { "execute": "blockdev-add", @@ -5646,7 +5644,7 @@ # # Since: 2.12 # -# Example: +# Examples: # # 1. Move a node into an IOThread # -> { "execute": "x-blockdev-set-iothread", @@ -5731,18 +5729,18 @@ # # Since: 2.0 # -# Example: +# Examples: # # 1. Read operation # -# { "event": "QUORUM_REPORT_BAD", +# <- { "event": "QUORUM_REPORT_BAD", # "data": { "node-name": "node0", "sector-num": 345435, "sectors-count": 5, #"type": "read" }, # "timestamp": { "seconds": 1344522075, "microseconds": 745528 } } # # 2. Flush operation # -# { "event": "QUORUM_REPORT_BAD", +# <- { "event": "QUORUM_REPORT_BAD", # "data": { "node-name": "node0", "sector-num": 0, "sectors-count": 2097120, #"type": "flush", "error": "Broken pipe" }, # "timestamp": { "seconds": 1456406829, "microseconds": 291763 } } diff --git a/qapi/block.json b/qapi/block.json index 5fe068f903..94339a1761 100644 --- a/qapi/block.json +++ b/qapi/block.json @@ -457,7 +457,7 @@ # # Since: 1.1 # -# Example: +# Examples: # # -> { "execute": "block_set_io_throttle", # "arguments": { "id": "virtio-blk-pci0/virtio-backend", diff --git a/qapi/char.json b/qapi/char.json index 923dc5056d..c9431dd0a7 100644 --- a/qapi/char.json +++ b/qapi/char.json @@ -637,7 +637,7 @@ # # Since: 1.4 # -# Example: +# Examples: # # -> { "execute" : "chardev-add", # "arguments" : { "id" : "foo", @@ -673,7 +673,7 @@ # # Since: 2.10 # -# Example: +# Examples: # # -> { "execute" : "chardev-change", # "arguments" : { "id" : "baz", diff --git a/qapi/machine.json b/qapi/machine.json index 8c3c58c763..20541cb319 100644 --- a/qapi/machine.json +++ b/qapi/machine.json @@ -954,7 +954,7 @@ # # Since: 2.7 # -# Example: +# Examples: # # For pseries machine type started with -smp 2,cores=2,maxcpus=4 -cpu POWER8: # @@ -1677,8 +1677,9 @@ # Since: 7.2 # # Example: -# {"execute": "dumpdtb"} -#"arguments": { "filename": "fdt.dtb" } } +# -> { "execute": "dumpdtb" } +# "arguments": { "filename": "fdt.dtb" } } +# <- { "return": {} } # ## { 'command': 'dumpdtb', diff --git a/qapi/migration.json b/qapi/migration.json index 87c174dca2..477ee4d35b 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -273,7 +273,7 @@ # # Since: 0.14 # -# Example: +# Examples: # # 1. Before the first migration # @@ -521,6 +521,7 @@ # # -> { "execute": "migrate-set-capabilities" , "arguments": # { "capabilities": [ { "capability": "xbzrle", "state": true } ] } } +# <- { "return": {} } # ## { 'command': 'migrate-set-capabilities', @@ -989,6 +990,7 @@ # # -> { "execute": "migrate-set-parameters" , # "arguments": { "compress-level": 1 } } +# <- { "return": {} } # ## { 'command': 'migrate-set-parameters', 'boxed': true, @@ -1279,8 +1281,8 @@ # # Example: # -# { "timestamp": {"seconds": 1449669631, "microseconds": 239225}, -# "event": "MIGRATION_PASS", "data": {"pass": 2} } +# <- { "timestamp": {"seconds": 1449669631, "microseconds": 239225}, +# "event": "MIGRATION_PASS", "data": {"pass": 2} } # ## { 'event': 'MIGRATION_PASS', @@ -1861,8 +1863,9 @@ # # Example: # -# {"execute": "calc-dirty-rate", "arguments": {"calc-time": 1, -#'sample-pages': 512} } +# -> {"execute": "calc-dirty-rate", "arguments": {"calc-time": 1, +# 'sample-pages': 512} } +# <- { "return": {} } # ## { 'command': 'calc-dirty-rate', 'data': {'calc-time': 'int64', @@ -1914,9
[PATCH 14/16] qapi: Fix misspelled section tags in doc comments
Section tags are case sensitive and end with a colon. Screwing up either gets them interpreted as ordinary paragraph. Fix a few. Signed-off-by: Markus Armbruster --- qapi/machine.json | 4 ++-- qapi/migration.json | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/qapi/machine.json b/qapi/machine.json index 172ef0ad96..135a4926a9 100644 --- a/qapi/machine.json +++ b/qapi/machine.json @@ -517,7 +517,7 @@ # @targets: Target root bridge IDs from -device ...,id= for each root # bridge. # -# Since 7.1 +# Since: 7.1 ## { 'struct': 'CXLFixedMemoryWindowOptions', 'data': { @@ -532,7 +532,7 @@ # # @cxl-fmw: List of CXLFixedMemoryWindowOptions # -# Since 7.1 +# Since: 7.1 ## { 'struct' : 'CXLFMWProperties', 'data': { 'cxl-fmw': ['CXLFixedMemoryWindowOptions'] } diff --git a/qapi/migration.json b/qapi/migration.json index 477ee4d35b..1b6c7b19e4 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -1991,7 +1991,7 @@ # # data: migration thread name # -# returns: information about migration threads +# Returns: information about migration threads # # Since: 7.2 ## -- 2.39.2
[PATCH 06/16] qapi: @foo should be used to reference, not ``foo``
Documentation suggests @foo is merely shorthand for ``foo``. It's not, it carries additional meaning: it's a reference to a QAPI schema name. Reword the documentation to spell that out. Fix up the few ``foo`` that should be @foo. Signed-off-by: Markus Armbruster --- docs/devel/qapi-code-gen.rst | 8 +--- docs/interop/firmware.json | 6 +++--- qapi/qom.json| 2 +- qapi/ui.json | 2 +- 4 files changed, 10 insertions(+), 8 deletions(-) diff --git a/docs/devel/qapi-code-gen.rst b/docs/devel/qapi-code-gen.rst index 23e7f2fb1c..23f9059db2 100644 --- a/docs/devel/qapi-code-gen.rst +++ b/docs/devel/qapi-code-gen.rst @@ -925,9 +925,11 @@ first character of the first line. The usual strong, *\*emphasized\** and literal markup should be used. If you need a single literal ``*``, you will need to -backslash-escape it. As an extension beyond the usual rST syntax, you -can also use ``@foo`` to reference a name in the schema; this is rendered -the same way as foo. +backslash-escape it. + +Use ``@foo`` to reference a name in the schema. This is an rST +extension. It is rendered the same way as foo, but carries +additional meaning. Example:: diff --git a/docs/interop/firmware.json b/docs/interop/firmware.json index 56814f02b3..cc8f869186 100644 --- a/docs/interop/firmware.json +++ b/docs/interop/firmware.json @@ -258,7 +258,7 @@ # # @mode: Describes how the firmware build handles code versus variable #storage. If not present, it must be treated as if it was -#configured with value ``split``. Since: 7.0.0 +#configured with value @split. Since: 7.0.0 # # @executable: Identifies the firmware executable. The @mode # indicates whether there will be an associated @@ -267,13 +267,13 @@ # -drive if=none,id=pflash0,readonly=on,file=@executable.@filename,format=@executable.@format # -machine pflash0=pflash0 # or equivalent -blockdev instead of -drive. When -# @mode is ``combined`` the executable must be +# @mode is @combined the executable must be # cloned before use and configured with readonly=off. # With QEMU versions older than 4.0, you have to use # -drive if=pflash,unit=0,readonly=on,file=@executable.@filename,format=@executable.@format # # @nvram-template: Identifies the NVRAM template compatible with -# @executable, when @mode is set to ``split``, +# @executable, when @mode is set to @split, # otherwise it should not be present. # Management software instantiates an # individual copy -- a specific NVRAM file -- from diff --git a/qapi/qom.json b/qapi/qom.json index a877b879b9..4fe7a93a75 100644 --- a/qapi/qom.json +++ b/qapi/qom.json @@ -637,7 +637,7 @@ # # @discard-data: if true, the file contents can be destroyed when QEMU exits, #to avoid unnecessarily flushing data to the backing file. Note -#that ``discard-data`` is only an optimization, and QEMU might +#that @discard-data is only an optimization, and QEMU might #not discard file contents if it aborts unexpectedly or is #terminated using SIGKILL. (default: false) # diff --git a/qapi/ui.json b/qapi/ui.json index 25f9d731df..c383c11097 100644 --- a/qapi/ui.json +++ b/qapi/ui.json @@ -1247,7 +1247,7 @@ # available node on the host. # # @p2p: Whether to use peer-to-peer connections (accepted through -# ``add_client``). +# @add_client). # # @audiodev: Use the specified DBus audiodev to export audio. # -- 2.39.2
[PATCH 02/16] qga/qapi-schema: Fix a misspelled reference
Code returns a list of GuestNetworkInterface, documentation claims GuestNetworkInfo, which doesn't exist. Fix the documentation. Fixes: 3424fc9f16a1 (qemu-ga: add guest-network-get-interfaces command) Signed-off-by: Markus Armbruster --- qga/qapi-schema.json | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/qga/qapi-schema.json b/qga/qapi-schema.json index f349345116..bb9bac0655 100644 --- a/qga/qapi-schema.json +++ b/qga/qapi-schema.json @@ -722,7 +722,7 @@ # Get list of guest IP addresses, MAC addresses # and netmasks. # -# Returns: List of GuestNetworkInfo on success. +# Returns: List of GuestNetworkInterface on success. # # Since: 1.1 ## -- 2.39.2
[PATCH 16/16] qapi storage-daemon/qapi: Fix documentation section structure
In the QEMU QMP Reference Manual, subsection "Block core (VM unrelated)" is empty. Its contents is at the end of subsection "Background jobs" instead. That's because qapi/job.json is includeded first from qapi/block-core.json, which makes qapi/job.json's documentation go between qapi/block-core.json's subsection heading and contents. In the QEMU Storage Daemon QMP Reference Manual, section "Block Devices" contains nothing but an empty subsection "Block core (VM unrelated)". The latter's contents is at the end section "Socket data types", along with subsection "Block device exports". Subsection "Background jobs" is at the end of section "Cryptography". All this is because storage-daemon/qapi/qapi-schema.json includes modules in a confused order. Fix both as follows. Turn subsection "Background jobs" into a section. Move it before section "Block devices" in the QEMU QMP Reference Manual, by including qapi/jobs.json right before qapi/block.json. Reorder include directives in storage-daemon/qapi/qapi-schema.json to match the order in qapi/qapi-schema.json, so that the QEMU Storage Daemon QMP Reference Manual's section structure the QEMU QMP Reference Manual's. In the QEMU QMP Reference Manual, qapi/cryptodev.json's documentation is at the end of section "Virtio devices". That's because it lacks a section heading, and therefore gets squashed into whatever section happens to precede it. Add section heading so it's in section "Cryptography devices". Signed-off-by: Markus Armbruster --- qapi/cryptodev.json | 4 qapi/job.json| 2 +- qapi/qapi-schema.json| 2 +- storage-daemon/qapi/qapi-schema.json | 22 +++--- 4 files changed, 21 insertions(+), 9 deletions(-) diff --git a/qapi/cryptodev.json b/qapi/cryptodev.json index f33f96a692..cf960ea81f 100644 --- a/qapi/cryptodev.json +++ b/qapi/cryptodev.json @@ -4,6 +4,10 @@ # This work is licensed under the terms of the GNU GPL, version 2 or later. # See the COPYING file in the top-level directory. +## +# = Cryptography devices +## + ## # @QCryptodevBackendAlgType: # diff --git a/qapi/job.json b/qapi/job.json index bc4104757a..9e29a796c5 100644 --- a/qapi/job.json +++ b/qapi/job.json @@ -2,7 +2,7 @@ # vim: filetype=python ## -# == Background jobs +# = Background jobs ## ## diff --git a/qapi/qapi-schema.json b/qapi/qapi-schema.json index e57d8ff801..bb7217da26 100644 --- a/qapi/qapi-schema.json +++ b/qapi/qapi-schema.json @@ -43,11 +43,11 @@ { 'include': 'sockets.json' } { 'include': 'run-state.json' } { 'include': 'crypto.json' } +{ 'include': 'job.json' } { 'include': 'block.json' } { 'include': 'block-export.json' } { 'include': 'char.json' } { 'include': 'dump.json' } -{ 'include': 'job.json' } { 'include': 'net.json' } { 'include': 'rdma.json' } { 'include': 'rocker.json' } diff --git a/storage-daemon/qapi/qapi-schema.json b/storage-daemon/qapi/qapi-schema.json index 67749d1101..f10c949490 100644 --- a/storage-daemon/qapi/qapi-schema.json +++ b/storage-daemon/qapi/qapi-schema.json @@ -15,18 +15,26 @@ { 'include': '../../qapi/pragma.json' } +# Documentation generated with qapi-gen.py is in source order, with +# included sub-schemas inserted at the first include directive +# (subsequent include directives have no effect). To get a sane and +# stable order, it's best to include each sub-schema just once, or +# include it first right here. + +{ 'include': '../../qapi/common.json' } +{ 'include': '../../qapi/sockets.json' } +{ 'include': '../../qapi/crypto.json' } +{ 'include': '../../qapi/job.json' } + ## # = Block devices ## { 'include': '../../qapi/block-core.json' } { 'include': '../../qapi/block-export.json' } + { 'include': '../../qapi/char.json' } -{ 'include': '../../qapi/common.json' } -{ 'include': '../../qapi/control.json' } -{ 'include': '../../qapi/crypto.json' } -{ 'include': '../../qapi/introspect.json' } -{ 'include': '../../qapi/job.json' } { 'include': '../../qapi/authz.json' } -{ 'include': '../../qapi/qom.json' } -{ 'include': '../../qapi/sockets.json' } { 'include': '../../qapi/transaction.json' } +{ 'include': '../../qapi/control.json' } +{ 'include': '../../qapi/introspect.json' } +{ 'include': '../../qapi/qom.json' } -- 2.39.2
[PATCH 09/16] qapi: Fix bullet list markup in documentation
Peter Maydell's commit 100cc4fe0f08 explains: rST insists on a blank line before and after a bulleted list [...] Add some extra blank lines in the doc comments so they're acceptable rST input. It missed one in qapi/trace.json. Paolo Bonzini later added another instance in qapi/stats.json, providing further, if unintended, evidence for his quip that rST is the Perl of ASCII-based markups. Both are parsed as ordinary paragraph, resulting in garbled output. Add the blank lines we need to get the bullet lists recognized as such. Fixes: 100cc4fe0f0827f8da1a5c05f9c65e2aaa40e03d (qapi: Add blank lines before bulleted lists) Fixes: 467ef823d83e (qmp: add filtering of statistics by target vCPU) Signed-off-by: Markus Armbruster --- qapi/stats.json | 1 + qapi/trace.json | 1 + 2 files changed, 2 insertions(+) diff --git a/qapi/stats.json b/qapi/stats.json index 1f5d3c59ab..f17495ee65 100644 --- a/qapi/stats.json +++ b/qapi/stats.json @@ -107,6 +107,7 @@ # The arguments to the query-stats command; specifies a target for which to # request statistics and optionally the required subset of information for # that target: +# # - which vCPUs to request statistics for # - which providers to request statistics from # - which named values to return within each provider diff --git a/qapi/trace.json b/qapi/trace.json index 6c6982a587..f425d10764 100644 --- a/qapi/trace.json +++ b/qapi/trace.json @@ -87,6 +87,7 @@ # @vcpu: The vCPU to act upon (all by default; since 2.7). # # An event's state is modified if: +# # - its name matches the @name pattern, and # - if @vcpu is given, the event has the "vcpu" property. # -- 2.39.2
[PATCH 10/16] qapi: Fix unintended definition lists in documentation
rST parses something like first line second line as a definition list item, where "first line" is the term being defined by "second line". This bites us in a couple of places. Here's one: # @bps_max: total throughput limit during bursts, # in bytes (Since 1.7) scripts/qapi/parser.py parses this into an "argument section" with name "bps_max" and text total throughput limit during bursts, in bytes (Since 1.7) docs/sphinx/qapidoc.py duly passes the text to the rST parser, which parses it as another definition list. Comes out as nested definitions: term "bps_max: int (optional)" defined as term "total throughput limit during bursts," defined as "in bytes (Since 1.7)". rST truly is the Perl of ASCII-based markups. Fix by deleting the extra indentation. Reported-by: Peter Maydell Signed-off-by: Markus Armbruster --- qapi/block-core.json | 52 ++-- qapi/control.json| 2 +- 2 files changed, 27 insertions(+), 27 deletions(-) diff --git a/qapi/block-core.json b/qapi/block-core.json index a5a5007b28..2382772e17 100644 --- a/qapi/block-core.json +++ b/qapi/block-core.json @@ -488,41 +488,41 @@ # # @image: the info of image used (since: 1.6) # -# @bps_max: total throughput limit during bursts, -# in bytes (Since 1.7) +# @bps_max: total throughput limit during bursts, in bytes +# (Since 1.7) # -# @bps_rd_max: read throughput limit during bursts, -#in bytes (Since 1.7) +# @bps_rd_max: read throughput limit during bursts, in bytes +# (Since 1.7) # -# @bps_wr_max: write throughput limit during bursts, -#in bytes (Since 1.7) +# @bps_wr_max: write throughput limit during bursts, in bytes +# (Since 1.7) # -# @iops_max: total I/O operations per second during bursts, -# in bytes (Since 1.7) +# @iops_max: total I/O operations per second during bursts, in bytes +#(Since 1.7) # -# @iops_rd_max: read I/O operations per second during bursts, -# in bytes (Since 1.7) +# @iops_rd_max: read I/O operations per second during bursts, in bytes +# (Since 1.7) # -# @iops_wr_max: write I/O operations per second during bursts, -# in bytes (Since 1.7) +# @iops_wr_max: write I/O operations per second during bursts, in +# bytes (Since 1.7) # -# @bps_max_length: maximum length of the @bps_max burst -#period, in seconds. (Since 2.6) +# @bps_max_length: maximum length of the @bps_max burst period, in +# seconds. (Since 2.6) # -# @bps_rd_max_length: maximum length of the @bps_rd_max -# burst period, in seconds. (Since 2.6) +# @bps_rd_max_length: maximum length of the @bps_rd_max burst period, +# in seconds. (Since 2.6) # -# @bps_wr_max_length: maximum length of the @bps_wr_max -# burst period, in seconds. (Since 2.6) +# @bps_wr_max_length: maximum length of the @bps_wr_max burst period, +# in seconds. (Since 2.6) # -# @iops_max_length: maximum length of the @iops burst -# period, in seconds. (Since 2.6) +# @iops_max_length: maximum length of the @iops burst period, in +# seconds. (Since 2.6) # -# @iops_rd_max_length: maximum length of the @iops_rd_max -#burst period, in seconds. (Since 2.6) +# @iops_rd_max_length: maximum length of the @iops_rd_max burst +# period, in seconds. (Since 2.6) # -# @iops_wr_max_length: maximum length of the @iops_wr_max -#burst period, in seconds. (Since 2.6) +# @iops_wr_max_length: maximum length of the @iops_wr_max burst +# period, in seconds. (Since 2.6) # # @iops_size: an I/O size in bytes (Since 1.7) # @@ -948,7 +948,7 @@ # by the device (Since 4.2) # # @invalid_rd_operations: The number of invalid read operations -# performed by the device (Since 2.5) +# performed by the device (Since 2.5) # # @invalid_wr_operations: The number of invalid write operations # performed by the device (Since 2.5) @@ -3735,7 +3735,7 @@ # Driver specific block device options for Quorum # # @blkverify: true if the driver must print content mismatch -# set to false by default +# set to false by default # # @children: the children block devices to use # diff --git a/qapi/control.json b/qapi/control.json index afca2043af..f83499280a 100644 --- a/qapi/control.json +++ b/qapi/control.json @@ -195,7 +195,7 @@ # @id: Name of the monitor # # @mode: Selects the monitor mode (default: readline in the system -# emulator, control in qemu-storage-daemon) +#emula
[PATCH 01/16] qga/qapi-schema: Tidy up documentation of guest-fsfreeze-status
Delete "error state indicates", because it doesn't make sense. I suspect it was an accident. Signed-off-by: Markus Armbruster --- qga/qapi-schema.json | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/qga/qapi-schema.json b/qga/qapi-schema.json index 796434ed34..f349345116 100644 --- a/qga/qapi-schema.json +++ b/qga/qapi-schema.json @@ -420,7 +420,7 @@ ## # @guest-fsfreeze-status: # -# Get guest fsfreeze state. error state indicates +# Get guest fsfreeze state. # # Returns: GuestFsfreezeStatus ("thawed", "frozen", etc., as defined below) # -- 2.39.2
[PATCH 12/16] qapi: Fix argument documentation markup
Member / argument documentation of BlockdevAmendOptionsQcow2, job-resume, and RDMA_GID_STATUS_CHANGED is parsed as ordinary text due to missing colon or space before the colon. The generated documentation shows these members / arguments as "Not documented". The fix is obvious: add missing colons, delete extra spaces. Signed-off-by: Markus Armbruster --- qapi/block-core.json | 2 +- qapi/job.json| 2 +- qapi/rdma.json | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/qapi/block-core.json b/qapi/block-core.json index 2382772e17..9dd5ed9a47 100644 --- a/qapi/block-core.json +++ b/qapi/block-core.json @@ -5155,7 +5155,7 @@ # Driver specific image amend options for qcow2. # For now, only encryption options can be amended # -# @encrypt Encryption options to be amended +# @encrypt: Encryption options to be amended # # Since: 5.1 ## diff --git a/qapi/job.json b/qapi/job.json index d5f84e9615..bc4104757a 100644 --- a/qapi/job.json +++ b/qapi/job.json @@ -148,7 +148,7 @@ # This command returns immediately after resuming a paused job. Resuming an # already running job is an error. # -# @id : The job identifier. +# @id: The job identifier. # # Since: 3.0 ## diff --git a/qapi/rdma.json b/qapi/rdma.json index a1d2175a8b..5b6b66afa4 100644 --- a/qapi/rdma.json +++ b/qapi/rdma.json @@ -17,7 +17,7 @@ # # @subnet-prefix: Subnet Prefix # -# @interface-id : Interface ID +# @interface-id: Interface ID # # Since: 4.0 # -- 2.39.2
[PATCH 08/16] qapi: Delete largely misleading "Stability Considerations"
Documentation section "Stability Considerations" dates back to the early days of QMP (commit 82a56f0d83d (Monitor: Introduce the qmp-commands.hx file)). It became largely misleading years ago. Delete it. Signed-off-by: Markus Armbruster --- qapi/qapi-schema.json | 22 -- 1 file changed, 22 deletions(-) diff --git a/qapi/qapi-schema.json b/qapi/qapi-schema.json index 7c09af5cc8..e57d8ff801 100644 --- a/qapi/qapi-schema.json +++ b/qapi/qapi-schema.json @@ -28,28 +28,6 @@ # # Please, refer to the QMP specification (docs/interop/qmp-spec.txt) for # detailed information on the Server command and response formats. -# -# = Stability Considerations -# -# The current QMP command set (described in this file) may be useful for a -# number of use cases, however it's limited and several commands have bad -# defined semantics, specially with regard to command completion. -# -# These problems are going to be solved incrementally in the next QEMU releases -# and we're going to establish a deprecation policy for badly defined commands. -# -# If you're planning to adopt QMP, please observe the following: -# -# 1. The deprecation policy will take effect and be documented soon, please -#check the documentation of each used command as soon as a new release of -#QEMU is available -# -# 2. DO NOT rely on anything which is not explicit documented -# -# 3. Errors, in special, are not documented. Applications should NOT check -#for specific errors classes or data (it's strongly recommended to only -#check for the "error" key) -# ## { 'include': 'pragma.json' } -- 2.39.2
[PATCH 05/16] qapi/block-core: Clean up after removal of dirty bitmap @status
Commit 81cbfd50886 (block: remove dirty bitmaps 'status' field) removed deprecated BlockDirtyInfo member @status. It neglected to remove references to its enumeration values from the documentation of its replacements. Do that now. Signed-off-by: Markus Armbruster --- qapi/block-core.json | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/qapi/block-core.json b/qapi/block-core.json index 75f7c62405..eeb2ed3f16 100644 --- a/qapi/block-core.json +++ b/qapi/block-core.json @@ -582,11 +582,11 @@ # @granularity: granularity of the dirty bitmap in bytes (since 1.4) # # @recording: true if the bitmap is recording new writes from the guest. -# Replaces ``active`` and ``disabled`` statuses. (since 4.0) +# (since 4.0) # # @busy: true if the bitmap is in-use by some operation (NBD or jobs) #and cannot be modified via QMP or used by another operation. -#Replaces ``locked`` and ``frozen`` statuses. (since 4.0) +#(since 4.0) # # @persistent: true if the bitmap was stored on disk, is scheduled to be stored # on disk, or both. (since 4.0) -- 2.39.2
[PATCH 15/16] qapi: Format since information the conventional way: (since X.Y)
Signed-off-by: Markus Armbruster --- qapi/block-core.json | 6 +++--- qapi/stats.json | 2 +- qapi/tpm.json| 3 +-- qapi/ui.json | 6 +++--- 4 files changed, 8 insertions(+), 9 deletions(-) diff --git a/qapi/block-core.json b/qapi/block-core.json index 9dd5ed9a47..b57978957f 100644 --- a/qapi/block-core.json +++ b/qapi/block-core.json @@ -1318,10 +1318,10 @@ # value is monotonically increasing. # # @busy: false if the job is known to be in a quiescent state, with -#no pending I/O. Since 1.3. +#no pending I/O. (Since 1.3) # # @paused: whether the job is paused or, if @busy is true, will -# pause itself as soon as possible. Since 1.3. +# pause itself as soon as possible. (Since 1.3) # # @speed: the rate limit, bytes per second # @@ -2741,7 +2741,7 @@ # # @on-error: the action to take on an error (default report). #'stop' and 'enospc' can only be used if the block device -#supports io-status (see BlockInfo). Since 1.3. +#supports io-status (see BlockInfo). (Since 1.3) # # @filter-node-name: the node name that should be assigned to the #filter driver that the stream job inserts into the graph diff --git a/qapi/stats.json b/qapi/stats.json index f17495ee65..36d5f4dc94 100644 --- a/qapi/stats.json +++ b/qapi/stats.json @@ -69,7 +69,7 @@ # # @vcpu: statistics that apply to a single virtual CPU. # -# @cryptodev: statistics that apply to a crypto device. since 8.0 +# @cryptodev: statistics that apply to a crypto device (since 8.0) # # Since: 7.1 ## diff --git a/qapi/tpm.json b/qapi/tpm.json index 4e2ea9756a..eac87d30b2 100644 --- a/qapi/tpm.json +++ b/qapi/tpm.json @@ -44,8 +44,7 @@ # An enumeration of TPM types # # @passthrough: TPM passthrough type -# @emulator: Software Emulator TPM type -#Since: 2.11 +# @emulator: Software Emulator TPM type (since 2.11) # # Since: 1.5 ## diff --git a/qapi/ui.json b/qapi/ui.json index e9599dea50..88de458ba9 100644 --- a/qapi/ui.json +++ b/qapi/ui.json @@ -1207,13 +1207,13 @@ # window resizes (virtio-gpu) this will default to "on", # assuming the guest will resize the display to match # the window size then. Otherwise it defaults to "off". -# Since 3.1 +# (Since 3.1) # @show-tabs: Display the tab bar for switching between the various graphical # interfaces (e.g. VGA and virtual console character devices) # by default. -# Since 7.1 +# (Since 7.1) # @show-menubar: Display the main window menubar. Defaults to "on". -#Since 8.0 +#(Since 8.0) # # Since: 2.12 ## -- 2.39.2
[PATCH 00/16] qapi qga/qapi-schema: Doc fixes
It's always nice to get doc fixes into the release, but if it's too late, it's too late. Generated code does not change, except for the last patch, which moves a bit of code without changing it. Markus Armbruster (16): qga/qapi-schema: Tidy up documentation of guest-fsfreeze-status qga/qapi-schema: Fix a misspelled reference qapi: Fix misspelled references qapi: Fix up references to long gone error classes qapi/block-core: Clean up after removal of dirty bitmap @status qapi: @foo should be used to reference, not ``foo`` qapi: Tidy up examples qapi: Delete largely misleading "Stability Considerations" qapi: Fix bullet list markup in documentation qapi: Fix unintended definition lists in documentation qga/qapi-schema: Fix member documentation markup qapi: Fix argument documentation markup qapi: Replace ad hoc "since" documentation by member documentation qapi: Fix misspelled section tags in doc comments qapi: Format since information the conventional way: (since X.Y) qapi storage-daemon/qapi: Fix documentation section structure docs/devel/qapi-code-gen.rst | 8 ++- docs/interop/firmware.json | 6 +- qapi/block-core.json | 82 ++-- qapi/block.json | 2 +- qapi/char.json | 4 +- qapi/control.json| 2 +- qapi/cryptodev.json | 4 ++ qapi/job.json| 4 +- qapi/machine-target.json | 2 +- qapi/machine.json| 26 + qapi/migration.json | 37 - qapi/misc.json | 6 +- qapi/net.json| 25 +++-- qapi/qapi-schema.json| 24 +--- qapi/qdev.json | 2 +- qapi/qom.json| 4 +- qapi/rdma.json | 2 +- qapi/replay.json | 3 + qapi/run-state.json | 6 +- qapi/stats.json | 3 +- qapi/tpm.json| 3 +- qapi/trace.json | 1 + qapi/ui.json | 12 ++-- qga/qapi-schema.json | 10 ++-- storage-daemon/qapi/qapi-schema.json | 22 +--- 25 files changed, 154 insertions(+), 146 deletions(-) -- 2.39.2
[PATCH 03/16] qapi: Fix misspelled references
query-cpu-definitions returns a list of CpuDefinitionInfo, but documentation claims CpuDefInfo, which doesn't exist. query-migrate-capabilities returns a list of MigrationCapabilityStatus, but documentation claims MigrationCapabilitiesStatus, which doesn't exist. balloon and query-balloon can fail with KVMMissingCap, but documentation claims KvmMissingCap, which doesn't exist. Fix the documentation. Fixes: e4e31c6324af (qapi: add query-cpu-definitions command (v2)) Fixes: bbf6da32b5bd (Add migration capabilities) Fixes: d72f326431e2 (qapi: Convert balloon) Fixes: 96637bcdf9e0 (qapi: Convert query-balloon) Signed-off-by: Markus Armbruster --- qapi/machine-target.json | 2 +- qapi/machine.json| 4 ++-- qapi/migration.json | 2 +- 3 files changed, 4 insertions(+), 4 deletions(-) diff --git a/qapi/machine-target.json b/qapi/machine-target.json index 2e267fa458..b94fbdb65e 100644 --- a/qapi/machine-target.json +++ b/qapi/machine-target.json @@ -331,7 +331,7 @@ # # Return a list of supported virtual CPU definitions # -# Returns: a list of CpuDefInfo +# Returns: a list of CpuDefinitionInfo # # Since: 1.2 ## diff --git a/qapi/machine.json b/qapi/machine.json index 604b686e59..8c3c58c763 100644 --- a/qapi/machine.json +++ b/qapi/machine.json @@ -1029,7 +1029,7 @@ # # Returns: - Nothing on success # - If the balloon driver is enabled but not functional because the KVM -#kernel module cannot support it, KvmMissingCap +#kernel module cannot support it, KVMMissingCap # - If no balloon device is present, DeviceNotActive # # Notes: This command just issues a request to the guest. When it returns, @@ -1067,7 +1067,7 @@ # # Returns: - @BalloonInfo on success # - If the balloon driver is enabled but not functional because the KVM -#kernel module cannot support it, KvmMissingCap +#kernel module cannot support it, KVMMissingCap # - If no balloon device is present, DeviceNotActive # # Since: 0.14 diff --git a/qapi/migration.json b/qapi/migration.json index c84fa10e86..87c174dca2 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -531,7 +531,7 @@ # # Returns information about the current migration capabilities status # -# Returns: @MigrationCapabilitiesStatus +# Returns: @MigrationCapabilityStatus # # Since: 1.2 # -- 2.39.2
[PATCH 11/16] qga/qapi-schema: Fix member documentation markup
GuestDiskStatsInfo's member documentation is parsed as ordinary text due to missing colons. The generated documentation shows these members as "Not documented". The fix is obvious: add the missing colons. Signed-off-by: Markus Armbruster --- qga/qapi-schema.json | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/qga/qapi-schema.json b/qga/qapi-schema.json index bb9bac0655..6a20eeb297 100644 --- a/qga/qapi-schema.json +++ b/qga/qapi-schema.json @@ -1553,11 +1553,11 @@ ## # @GuestDiskStatsInfo: # -# @name disk name +# @name: disk name # -# @major major device number of disk +# @major: major device number of disk # -# @minor minor device number of disk +# @minor: minor device number of disk ## { 'struct': 'GuestDiskStatsInfo', 'data': {'name': 'str', -- 2.39.2
[PATCH 04/16] qapi: Fix up references to long gone error classes
Commit de253f14912e88f4 (qmp: switch to the new error format on the wire) removed most error classes. Several later commits mistakenly mentioned them in documentation. Replace them by the actual error class there. Fixes: 44e3e053af56 (qmp: add interface blockdev-snapshot-delete-internal-sync) Fixes: f323bc9e8b3b (qmp: add interface blockdev-snapshot-internal-sync) Fixes: ba1c048a8f9c (qapi: Introduce add-fd, remove-fd, query-fdsets) Fixes: ed61fc10e8c8 (QAPI: add command for live block commit, 'block-commit') Fixes: e4c8f004c55d (qapi: convert sendkey) Signed-off-by: Markus Armbruster --- qapi/block-core.json | 4 ++-- qapi/misc.json | 6 +++--- qapi/ui.json | 2 +- 3 files changed, 6 insertions(+), 6 deletions(-) diff --git a/qapi/block-core.json b/qapi/block-core.json index c05ad0c07e..75f7c62405 100644 --- a/qapi/block-core.json +++ b/qapi/block-core.json @@ -5784,7 +5784,7 @@ # - If any snapshot matching @name exists, or @name is empty, #GenericError # - If the format of the image used does not support it, -#BlockFormatFeatureNotSupported +#GenericError # # Since: 1.7 # @@ -5820,7 +5820,7 @@ # - If @device is not a valid block device, GenericError # - If snapshot not found, GenericError # - If the format of the image used does not support it, -#BlockFormatFeatureNotSupported +#GenericError # - If @id and @name are both not specified, GenericError # # Since: 1.7 diff --git a/qapi/misc.json b/qapi/misc.json index 6ddd16ea28..7e278ca1eb 100644 --- a/qapi/misc.json +++ b/qapi/misc.json @@ -349,8 +349,8 @@ # @opaque: A free-form string that can be used to describe the fd. # # Returns: - @AddfdInfo on success -# - If file descriptor was not received, FdNotSupplied -# - If @fdset-id is a negative value, InvalidParameterValue +# - If file descriptor was not received, GenericError +# - If @fdset-id is a negative value, GenericError # # Notes: The list of fd sets is shared by all monitor connections. # @@ -379,7 +379,7 @@ # @fd: The file descriptor that is to be removed. # # Returns: - Nothing on success -# - If @fdset-id or @fd is not found, FdNotFound +# - If @fdset-id or @fd is not found, GenericError # # Since: 1.2 # diff --git a/qapi/ui.json b/qapi/ui.json index 98322342f7..25f9d731df 100644 --- a/qapi/ui.json +++ b/qapi/ui.json @@ -985,7 +985,7 @@ # to 100 # # Returns: - Nothing on success -# - If key is unknown or redundant, InvalidParameter +# - If key is unknown or redundant, GenericError # # Since: 1.3 # -- 2.39.2
[PATCH 13/16] qapi: Replace ad hoc "since" documentation by member documentation
MemoryDeviceInfoKind, NetClientDriver, and GuestPanicAction mention some members only in ad hoc since documentation. The generated documentation shows these members as "Not documented". Replace by formal member documentation. Add actual documentation text for the GuestPanicAction members, to match existing member documentation there. For the others, merely move existing "since" information. Signed-off-by: Markus Armbruster --- qapi/machine.json | 11 --- qapi/net.json | 21 - qapi/run-state.json | 6 +- 3 files changed, 21 insertions(+), 17 deletions(-) diff --git a/qapi/machine.json b/qapi/machine.json index 20541cb319..172ef0ad96 100644 --- a/qapi/machine.json +++ b/qapi/machine.json @@ -1260,6 +1260,14 @@ ## # @MemoryDeviceInfoKind: # +# @nvdimm: since 2.12 +# +# @virtio-pmem: since 4.1 +# +# @virtio-mem: since 5.1 +# +# @sgx-epc: since 6.2. +# # Since: 2.1 ## { 'enum': 'MemoryDeviceInfoKind', @@ -1302,9 +1310,6 @@ # # Union containing information about a memory device # -# nvdimm is included since 2.12. virtio-pmem is included since 4.1. -# virtio-mem is included since 5.1. sgx-epc is included since 6.2. -# # Since: 2.1 ## { 'union': 'MemoryDeviceInfo', diff --git a/qapi/net.json b/qapi/net.json index 1f1e148f01..3f9db0a718 100644 --- a/qapi/net.json +++ b/qapi/net.json @@ -641,14 +641,15 @@ # # Available netdev drivers. # -# Since: 2.7 +# @l2tpv3: since 2.1 +# @vhost-vdpa: since 5.1 +# @vmnet-host: since 7.1 +# @vmnet-shared: since 7.1 +# @vmnet-bridged: since 7.1 +# @stream: since 7.2 +# @dgram: since 7.2 # -#@vhost-vdpa since 5.1 -#@vmnet-host since 7.1 -#@vmnet-shared since 7.1 -#@vmnet-bridged since 7.1 -#@stream since 7.2 -#@dgram since 7.2 +# Since: 2.7 ## { 'enum': 'NetClientDriver', 'data': [ 'none', 'nic', 'user', 'tap', 'l2tpv3', 'socket', 'stream', @@ -669,12 +670,6 @@ # # Since: 1.2 # -#'l2tpv3' - since 2.1 -#'vmnet-host' - since 7.1 -#'vmnet-shared' - since 7.1 -#'vmnet-bridged' - since 7.1 -#'stream' since 7.2 -#'dgram' since 7.2 ## { 'union': 'Netdev', 'base': { 'id': 'str', 'type': 'NetClientDriver' }, diff --git a/qapi/run-state.json b/qapi/run-state.json index 419c188dd1..52495884c5 100644 --- a/qapi/run-state.json +++ b/qapi/run-state.json @@ -468,7 +468,11 @@ # # @pause: system pauses # -# Since: 2.1 (poweroff since 2.8, run since 5.0) +# @poweroff: system powers off (since 2.8) +# +# @run: system continues to run (since 5.0) +# +# Since: 2.1 ## { 'enum': 'GuestPanicAction', 'data': [ 'pause', 'poweroff', 'run' ] } -- 2.39.2
Re: [PATCH] block/nvme: use AIO_WAIT_WHILE_UNLOCKED()
On 4/4/23 13:20, Stefan Hajnoczi wrote: A few Admin Queue commands are submitted during nvme_file_open(). They are synchronous since device initialization cannot continue until the commands complete. AIO_WAIT_WHILE() is currently used, but the block/nvme.c code actually doesn't rely on the AioContext lock. Replace it with AIO_WAIT_WHILE_UNLOCKED(NULL, condition). There is no change in behavior and the dependency on the AioContext lock is eliminated. This is a step towards removing the AioContext lock. Signed-off-by: Stefan Hajnoczi --- block/nvme.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) Reviewed-by: Philippe Mathieu-Daudé
[PATCH] block/nvme: use AIO_WAIT_WHILE_UNLOCKED()
A few Admin Queue commands are submitted during nvme_file_open(). They are synchronous since device initialization cannot continue until the commands complete. AIO_WAIT_WHILE() is currently used, but the block/nvme.c code actually doesn't rely on the AioContext lock. Replace it with AIO_WAIT_WHILE_UNLOCKED(NULL, condition). There is no change in behavior and the dependency on the AioContext lock is eliminated. This is a step towards removing the AioContext lock. Signed-off-by: Stefan Hajnoczi --- block/nvme.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/block/nvme.c b/block/nvme.c index 5b744c2bda..829b9c04db 100644 --- a/block/nvme.c +++ b/block/nvme.c @@ -512,7 +512,6 @@ static int nvme_admin_cmd_sync(BlockDriverState *bs, NvmeCmd *cmd) { BDRVNVMeState *s = bs->opaque; NVMeQueuePair *q = s->queues[INDEX_ADMIN]; -AioContext *aio_context = bdrv_get_aio_context(bs); NVMeRequest *req; int ret = -EINPROGRESS; req = nvme_get_free_req_nowait(q); @@ -521,7 +520,7 @@ static int nvme_admin_cmd_sync(BlockDriverState *bs, NvmeCmd *cmd) } nvme_submit_command(q, req, cmd, nvme_admin_cmd_sync_cb, &ret); -AIO_WAIT_WHILE(aio_context, ret == -EINPROGRESS); +AIO_WAIT_WHILE_UNLOCKED(NULL, ret == -EINPROGRESS); return ret; } -- 2.39.2
Re: [PATCH v2 09/11] tests/vm: use the default system python for NetBSD
On 3/4/23 15:49, Alex Bennée wrote: From: Daniel P. Berrangé Currently our NetBSD VM recipe requests instal of the python37 package and explicitly tells QEMU to use that version of python. Since the NetBSD base ISO was updated to version 9.3 though, the default system python version is 3.9 which is sufficiently new for QEMU to rely on. Rather than requesting an older python, just test against the default system python which is what most users will have. Signed-off-by: Daniel P. Berrangé Reviewed-by: Paolo Bonzini Reviewed-by: Philippe Mathieu-Daudé Message-Id: <20230329124601.822209-1-berra...@redhat.com> Signed-off-by: Alex Bennée Reviewed-by: Thomas Huth Message-Id: <20230330101141.30199-9-alex.ben...@linaro.org> --- tests/vm/netbsd | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) Tested-by: Philippe Mathieu-Daudé
Re: [PATCH 0/2] virtio-scsi: stop using aio_disable_external() during unplug
On 3/23/23 9:56 PM, Stefan Hajnoczi wrote: The aio_disable_external() API is a solution for stopping I/O during critical sections. The newer BlockDevOps->drained_begin/end/poll() callbacks offer a cleaner solution that supports the upcoming multi-queue block layer. This series removes aio_disable_external() from the virtio-scsi emulation code. Patch 1 is a fix for something I noticed when reading the code. Patch 2 replaces aio_disable_external() with functionality for safe hot unplug that's mostly already present in the SCSI emulation code. Stefan Hajnoczi (2): virtio-scsi: avoid race between unplug and transport event virtio-scsi: stop using aio_disable_external() during unplug hw/scsi/scsi-bus.c| 3 ++- hw/scsi/scsi-disk.c | 1 + hw/scsi/virtio-scsi.c | 21 + 3 files changed, 12 insertions(+), 13 deletions(-) For both patches: Reviewed-by: Daniil Tatianin
Re: [PATCH 08/13] hw/xen: do not use aio_set_fd_handler(is_external=true) in xen_xenstore
On Mon, 2023-04-03 at 14:29 -0400, Stefan Hajnoczi wrote: > There is no need to suspend activity between aio_disable_external() and > aio_enable_external(), which is mainly used for the block layer's drain > operation. > > This is part of ongoing work to remove the aio_disable_external() API. > > Signed-off-by: Stefan Hajnoczi Reviewed-by: David Woodhouse Thanks. > --- > hw/i386/kvm/xen_xenstore.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/hw/i386/kvm/xen_xenstore.c b/hw/i386/kvm/xen_xenstore.c > index 900679af8a..6e81bc8791 100644 > --- a/hw/i386/kvm/xen_xenstore.c > +++ b/hw/i386/kvm/xen_xenstore.c > @@ -133,7 +133,7 @@ static void xen_xenstore_realize(DeviceState *dev, Error > **errp) > error_setg(errp, "Xenstore evtchn port init failed"); > return; > } > - aio_set_fd_handler(qemu_get_aio_context(), xen_be_evtchn_fd(s->eh), true, > + aio_set_fd_handler(qemu_get_aio_context(), xen_be_evtchn_fd(s->eh), > false, > xen_xenstore_event, NULL, NULL, NULL, s); > > s->impl = xs_impl_create(xen_domid); smime.p7s Description: S/MIME cryptographic signature
Re: [PATCH 13/13] aio: remove aio_disable_external() API
* Juan Quintela (quint...@redhat.com) wrote: > Stefan Hajnoczi wrote: > > All callers now pass is_external=false to aio_set_fd_handler() and > > aio_set_event_notifier(). The aio_disable_external() API that > > temporarily disables fd handlers that were registered is_external=true > > is therefore dead code. > > > > Remove aio_disable_external(), aio_enable_external(), and the > > is_external arguments to aio_set_fd_handler() and > > aio_set_event_notifier(). > > > > The entire test-fdmon-epoll test is removed because its sole purpose was > > testing aio_disable_external(). > > > > Parts of this patch were generated using the following coccinelle > > (https://coccinelle.lip6.fr/) semantic patch: > > > > @@ > > expression ctx, fd, is_external, io_read, io_write, io_poll, > > io_poll_ready, opaque; > > @@ > > - aio_set_fd_handler(ctx, fd, is_external, io_read, io_write, io_poll, > > io_poll_ready, opaque) > > + aio_set_fd_handler(ctx, fd, io_read, io_write, io_poll, io_poll_ready, > > opaque) > > > > @@ > > expression ctx, notifier, is_external, io_read, io_poll, io_poll_ready; > > @@ > > - aio_set_event_notifier(ctx, notifier, is_external, io_read, io_poll, > > io_poll_ready) > > + aio_set_event_notifier(ctx, notifier, io_read, io_poll, io_poll_ready) > > > > Signed-off-by: Stefan Hajnoczi > > [] > > > diff --git a/migration/rdma.c b/migration/rdma.c > > index df646be35e..aee41ca43e 100644 > > --- a/migration/rdma.c > > +++ b/migration/rdma.c > > @@ -3104,15 +3104,15 @@ static void > > qio_channel_rdma_set_aio_fd_handler(QIOChannel *ioc, > > { > > QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc); > > if (io_read) { > > -aio_set_fd_handler(ctx, rioc->rdmain->recv_comp_channel->fd, > > - false, io_read, io_write, NULL, NULL, opaque); > > -aio_set_fd_handler(ctx, rioc->rdmain->send_comp_channel->fd, > > - false, io_read, io_write, NULL, NULL, opaque); > > +aio_set_fd_handler(ctx, rioc->rdmain->recv_comp_channel->fd, > > io_read, > > + io_write, NULL, NULL, opaque); > > +aio_set_fd_handler(ctx, rioc->rdmain->send_comp_channel->fd, > > io_read, > > + io_write, NULL, NULL, opaque); > > } else { > > -aio_set_fd_handler(ctx, rioc->rdmaout->recv_comp_channel->fd, > > - false, io_read, io_write, NULL, NULL, opaque); > > -aio_set_fd_handler(ctx, rioc->rdmaout->send_comp_channel->fd, > > - false, io_read, io_write, NULL, NULL, opaque); > > +aio_set_fd_handler(ctx, rioc->rdmaout->recv_comp_channel->fd, > > io_read, > > + io_write, NULL, NULL, opaque); > > +aio_set_fd_handler(ctx, rioc->rdmaout->send_comp_channel->fd, > > io_read, > > + io_write, NULL, NULL, opaque); > > } > > } > > Reviewed-by: Juan Quintela > > For the migration bits. > I don't even want to know why the RDMA code uses a low level block layer API. I don't think it's block specific. It looks like it's because qio_channel uses aio in the case where something QIO_CHANNEL_ERR_BLOCK and then waits for the recovery; see 4d9f675 that added it. Dave -- Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
Re: [PATCH 13/13] aio: remove aio_disable_external() API
Stefan Hajnoczi wrote: > All callers now pass is_external=false to aio_set_fd_handler() and > aio_set_event_notifier(). The aio_disable_external() API that > temporarily disables fd handlers that were registered is_external=true > is therefore dead code. > > Remove aio_disable_external(), aio_enable_external(), and the > is_external arguments to aio_set_fd_handler() and > aio_set_event_notifier(). > > The entire test-fdmon-epoll test is removed because its sole purpose was > testing aio_disable_external(). > > Parts of this patch were generated using the following coccinelle > (https://coccinelle.lip6.fr/) semantic patch: > > @@ > expression ctx, fd, is_external, io_read, io_write, io_poll, io_poll_ready, > opaque; > @@ > - aio_set_fd_handler(ctx, fd, is_external, io_read, io_write, io_poll, > io_poll_ready, opaque) > + aio_set_fd_handler(ctx, fd, io_read, io_write, io_poll, io_poll_ready, > opaque) > > @@ > expression ctx, notifier, is_external, io_read, io_poll, io_poll_ready; > @@ > - aio_set_event_notifier(ctx, notifier, is_external, io_read, io_poll, > io_poll_ready) > + aio_set_event_notifier(ctx, notifier, io_read, io_poll, io_poll_ready) > > Signed-off-by: Stefan Hajnoczi [] > diff --git a/migration/rdma.c b/migration/rdma.c > index df646be35e..aee41ca43e 100644 > --- a/migration/rdma.c > +++ b/migration/rdma.c > @@ -3104,15 +3104,15 @@ static void > qio_channel_rdma_set_aio_fd_handler(QIOChannel *ioc, > { > QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc); > if (io_read) { > -aio_set_fd_handler(ctx, rioc->rdmain->recv_comp_channel->fd, > - false, io_read, io_write, NULL, NULL, opaque); > -aio_set_fd_handler(ctx, rioc->rdmain->send_comp_channel->fd, > - false, io_read, io_write, NULL, NULL, opaque); > +aio_set_fd_handler(ctx, rioc->rdmain->recv_comp_channel->fd, io_read, > + io_write, NULL, NULL, opaque); > +aio_set_fd_handler(ctx, rioc->rdmain->send_comp_channel->fd, io_read, > + io_write, NULL, NULL, opaque); > } else { > -aio_set_fd_handler(ctx, rioc->rdmaout->recv_comp_channel->fd, > - false, io_read, io_write, NULL, NULL, opaque); > -aio_set_fd_handler(ctx, rioc->rdmaout->send_comp_channel->fd, > - false, io_read, io_write, NULL, NULL, opaque); > +aio_set_fd_handler(ctx, rioc->rdmaout->recv_comp_channel->fd, > io_read, > + io_write, NULL, NULL, opaque); > +aio_set_fd_handler(ctx, rioc->rdmaout->send_comp_channel->fd, > io_read, > + io_write, NULL, NULL, opaque); > } > } Reviewed-by: Juan Quintela For the migration bits. I don't even want to know why the RDMA code uses a low level block layer API.
Re: [PATCH v2 for 8.0?] nbd/server: Request TCP_NODELAY
On 4/4/23 02:40, Eric Blake wrote: Nagle's algorithm adds latency in order to reduce network packet overhead on small packets. But when we are already using corking to merge smaller packets into transactional requests, the extra delay from TCP defaults just gets in the way (see recent commit bd2cd4a4). For reference, qemu as an NBD client already requests TCP_NODELAY (see nbd_connect() in nbd/client-connection.c); as does libnbd as a client [1], and nbdkit as a server [2]. Furthermore, the NBD spec recommends the use of TCP_NODELAY [3]. [1] https://gitlab.com/nbdkit/libnbd/-/blob/a48a1142/generator/states-connect.c#L39 [2] https://gitlab.com/nbdkit/nbdkit/-/blob/45b72f5b/server/sockets.c#L430 [3] https://github.com/NetworkBlockDevice/nbd/blob/master/doc/proto.md#protocol-phases CC: Florian Westphal Signed-off-by: Eric Blake Message-Id: <20230327192947.1324372-1-ebl...@redhat.com> --- v2 fix typo, enhance commit message Given that corking made it in through Kevin's tree for 8.0-rc2 but this one did not, but I didn't get any R-b, is there any objection to me doing a pull request to get this into 8.0-rc3? FWIW, no objection. nbd/server.c | 1 + 1 file changed, 1 insertion(+) Reviewed-by: Philippe Mathieu-Daudé
Re: [PATCH 2/4] block: Split padded I/O vectors exceeding IOV_MAX
On 03.04.23 16:33, Hanna Czenczek wrote: (Sorry for the rather late reply... Thanks for the review!) On 20.03.23 11:31, Vladimir Sementsov-Ogievskiy wrote: On 17.03.23 20:50, Hanna Czenczek wrote: [...] diff --git a/block/io.c b/block/io.c index 8974d46941..1e9cdba17a 100644 --- a/block/io.c +++ b/block/io.c [..] + pad->write = write; + return true; } @@ -1545,6 +1561,18 @@ zero_mem: static void bdrv_padding_destroy(BdrvRequestPadding *pad) Maybe, rename to _finalize, to stress that it's not only freeing memory. Sounds good! [...] @@ -1552,6 +1580,101 @@ static void bdrv_padding_destroy(BdrvRequestPadding *pad) memset(pad, 0, sizeof(*pad)); } +/* + * Create pad->local_qiov by wrapping @iov in the padding head and tail, while + * ensuring that the resulting vector will not exceed IOV_MAX elements. + * + * To ensure this, when necessary, the first couple of elements (up to three) maybe, "first two-three elements" Sure (here and... [...] + /* + * If padded_niov > IOV_MAX, we cannot just concatenate everything. + * Instead, merge the first couple of elements of @iov to reduce the number maybe, "first two-three elements" ...here). + * of vector elements as necessary. + */ + if (padded_niov > IOV_MAX) { [..] @@ -1653,8 +1786,8 @@ int coroutine_fn bdrv_co_preadv_part(BdrvChild *child, flags |= BDRV_REQ_COPY_ON_READ; } - ret = bdrv_pad_request(bs, &qiov, &qiov_offset, &offset, &bytes, &pad, - NULL, &flags); + ret = bdrv_pad_request(bs, &qiov, &qiov_offset, &offset, &bytes, false, + &pad, NULL, &flags); if (ret < 0) { goto fail; } a bit later: tracked_request_end(&req); bdrv_padding_destroy(&pad); Now, the request is formally finished inside bdrv_padding_destroy().. Not sure, does it really violate something, but seems safer to swap these two calls. I’d rather not, for two reasons: First, tracked requests are (as far as I understand) only there to implement request serialization, and so only care about metadata (offset, length, and type), which is not changed by changes to the I/O vector. Second, even if the state of the I/O vector were relevant to tracked requests, I think it would actually be the other way around, i.e. the tracked request must be ended before the padding is finalized/destroyed. The tracked request is about the actual request we submit to `child` (which is why tracked_request_begin() is called after bdrv_pad_request()), and that request is done using the modified I/O vector. So if the tracked request had any connection to the request’s I/O vector (which it doesn’t), it would be to this modified one, so we mustn’t invalidate it via bdrv_padding_finalize() while the tracked request lives. Or, said differently: I generally try to clean up things in the inverse way they were set up, and because bdrv_pad_requests() comes before tracked_request_begin(), I think tracked_request_end() should come before bdrv_padding_finalize(). Note, that it's wise-versa in bdrv_co_pwritev_part(). For me it's just simpler to think that the whole request, including filling user-given qiov with data on read part is inside tracked_request_begin() / tracked_request_end(). And moving the last manipulation with qiov out of it breaks this simple thought. Guest should not care of it, as it doesn't know about request tracking.. But what about internal code? Some code may depend on some requests be finished after bdrv_drained_begin() call, but now they may be not fully finished, and some data may be not copied back to original qiov. I agree with your point about sequence of objects finalization, but maybe, that just shows that copying data back to qiov should not be a part of bdrv_padding_finalize(), but instead be a separate function, called before tracked_request_end(). With that: Reviewed-by: Vladimir Sementsov-Ogievskiy PS, I feel here still exists small space for optimization: The question is whether any optimization is really worth it, and I’m not sure it is. The bug has been in qemu for over two years, and because the only report I’ve seen about it came from our QE department, it seems like a very rare case, so I find it more important for the code to be as simple as possible than to optimize. move the logic to bdrv_init_padding(), and 1. allocate only one buffer 2. make the new collpase are to be attached to head or tail padding 3. avoid creating extra iov-slice, maybe with help of some new qemu_iovec_* API that can control number of copied/to-be-copied iovs and/or calculation number of iovs in qiov/qiov_offset/bytes slice I’ve actually begun by trying to reuse the padding buffer, and to collapse head/tail into it, but found it to be rather complicated. See also my reply to Stefan here: https://lists.nongnu.org/archive/html/qemu-devel/2023-03/msg04774.html Hanna -- Best re