[PATCH v3 5/5] block-coroutine-wrapper: use qemu_get_current_aio_context()

2023-09-12 Thread Stefan Hajnoczi
Use qemu_get_current_aio_context() in mixed wrappers and coroutine wrappers so that code runs in the caller's AioContext instead of moving to the BlockDriverState's AioContext. This change is necessary for the multi-queue block layer where any thread can call into the block layer. Most wrappers ar…
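
A minimal hand-written sketch of the pattern described (the real wrappers are generated by scripts/block-coroutine-wrapper.py; all "example" names below are illustrative, not QEMU code):

    #include "qemu/osdep.h"
    #include "block/aio-wait.h"
    #include "block/block_int.h"

    typedef struct {
        BlockDriverState *bs;
        int ret;
    } ExampleCo;

    static void coroutine_fn example_co_entry(void *opaque)
    {
        ExampleCo *s = opaque;
        s->ret = bdrv_co_flush(s->bs);   /* stand-in for any coroutine_fn */
        aio_wait_kick();                 /* wake the polling caller */
    }

    static int example_flush(BlockDriverState *bs)
    {
        ExampleCo s = { .bs = bs, .ret = -EINPROGRESS };
        /* The change in question: enter and poll in the caller's
         * AioContext instead of bdrv_get_aio_context(bs). */
        AioContext *ctx = qemu_get_current_aio_context();
        Coroutine *co = qemu_coroutine_create(example_co_entry, &s);
        aio_co_enter(ctx, co);
        AIO_WAIT_WHILE(ctx, s.ret == -EINPROGRESS);
        return s.ret;
    }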

[PATCH v3 2/5] test-bdrv-drain: avoid race with BH in IOThread drain test

2023-09-12 Thread Stefan Hajnoczi
This patch fixes a race condition in test-bdrv-drain that is difficult to reproduce. test-bdrv-drain sometimes fails without an error message on the block pull request sent by Kevin Wolf on Sep 4, 2023. I was able to reproduce it locally and found that "block-backend: process I/O in the current Aio…

[PATCH v3 4/5] block-backend: process zoned requests in the current AioContext

2023-09-12 Thread Stefan Hajnoczi
Process zoned requests in the current thread's AioContext instead of in the BlockBackend's AioContext. There is no need to use the BlockBackend's AioContext thanks to CoMutex bs->wps->colock, which protects zone metadata. Signed-off-by: Stefan Hajnoczi --- block/block-backend.c | 12 ++-…
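
The serialization the message relies on looks roughly like this inside the zoned request path (a sketch; only bs->wps->colock is named in the message, the surrounding code is illustrative):

    /* Zone metadata is guarded by a CoMutex rather than by AioContext
     * confinement, so any thread's AioContext may run the request. */
    qemu_co_mutex_lock(&bs->wps->colock);
    /* ... read/update the zone write pointer state for this request ... */
    qemu_co_mutex_unlock(&bs->wps->colock);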

[PATCH v3 3/5] block-backend: process I/O in the current AioContext

2023-09-12 Thread Stefan Hajnoczi
Switch blk_aio_*() APIs over to multi-queue by using qemu_get_current_aio_context() instead of blk_get_aio_context(). This change will allow devices to process I/O in multiple IOThreads in the future. I audited existing blk_aio_*() callers: - migration/block.c: blk_mig_lock() protects the data acc…
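
The heart of the switch is the substitution below, applied in the blk_aio_*() machinery (illustrative diff, not the exact hunk):

    -    AioContext *ctx = blk_get_aio_context(blk);
    +    AioContext *ctx = qemu_get_current_aio_context();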

[PATCH v3 0/5] block-backend: process I/O in the current AioContext

2023-09-12 Thread Stefan Hajnoczi
v3 - Add Patch 2 to fix a race condition in test-bdrv-drain. This was the CI failure that bumped this patch series from Kevin's pull request. - Add missing 051.pc.out file. I tried qemu-system-aarch64 to see if 051.out also needs to be updated, but no changes were necessary. [Kevin] v2 - Add pa…

[PATCH v3 1/5] block: remove AIOCBInfo->get_aio_context()

2023-09-12 Thread Stefan Hajnoczi
The synchronous bdrv_aio_cancel() function needs the acb's AioContext so it can call aio_poll() to wait for cancellation. It turns out that all users run under the BQL in the main AioContext, so this callback is not needed. Remove the callback, mark bdrv_aio_cancel() GLOBAL_STATE_CODE just like i…
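
A sketch of the shape this leaves the function in, inferred from the description (details may differ from the actual patch):

    void bdrv_aio_cancel(BlockAIOCB *acb)
    {
        GLOBAL_STATE_CODE();   /* BQL / main AioContext only */
        qemu_aio_ref(acb);
        bdrv_aio_cancel_async(acb);
        /* poll the main context until the request completes or cancels */
        AIO_WAIT_WHILE_UNLOCKED(NULL, acb->refcnt > 1);
        qemu_aio_unref(acb);
    }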

Re: [PATCH 3/3] iotests: distinguish 'skipped' and 'not run' states

2023-09-12 Thread Denis V. Lunev
On 9/12/23 22:03, Vladimir Sementsov-Ogievskiy wrote: On 06.09.23 17:09, Denis V. Lunev wrote: Each particular testcase could be skipped intentionally or accidentally. For example, the test is not designed for a particular image format, or is not run due to a missing library. The latter case is un…

Re: [PATCH v2 4/4] block-coroutine-wrapper: use qemu_get_current_aio_context()

2023-09-12 Thread Stefan Hajnoczi
On Fri, Sep 01, 2023 at 07:01:37PM +0200, Kevin Wolf wrote: > Am 24.08.2023 um 01:59 hat Stefan Hajnoczi geschrieben: > > Use qemu_get_current_aio_context() in mixed wrappers and coroutine > > wrappers so that code runs in the caller's AioContext instead of moving > > to the BlockDriverState's AioC…

Re: [PATCH 3/3] iotests: distinguish 'skipped' and 'not run' states

2023-09-12 Thread Vladimir Sementsov-Ogievskiy
On 06.09.23 17:09, Denis V. Lunev wrote: Each particular testcase could be skipped intentionally or accidentally. For example, the test is not designed for a particular image format, or is not run due to a missing library. The latter case is unwanted in reality. Though the discussion has revealed t…

Re: [PATCH 2/3] iotests: improve 'not run' message for nbd-multiconn test

2023-09-12 Thread Vladimir Sementsov-Ogievskiy
On 06.09.23 17:09, Denis V. Lunev wrote: The test actually requires the Python bindings for libnbd rather than libnbd itself. Clarify that in the message. Signed-off-by: Denis V. Lunev CC: Kevin Wolf CC: Hanna Reitz CC: Eric Blake Reviewed-by: Vladimir Sementsov-Ogievskiy -- Best regards, Vla…

Re: [PATCH 1/2] blockdev: qmp_transaction: harden transaction properties for bitmaps

2023-09-12 Thread Vladimir Sementsov-Ogievskiy
On 04.09.23 11:31, Andrey Zhadchenko wrote: Unlike other transaction commands, bitmap operations do not drain the target bds. If we have an IOThread, this may result in inconsistencies, as bitmap content may change during the transaction command. Add bdrv_drained_begin()/end() to bitmap operations.
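
The hardening being added is the standard drained-section pattern, roughly (a sketch; the exact placement in blockdev.c is in the patch itself):

    bdrv_drained_begin(bs);   /* quiesce in-flight I/O, incl. the IOThread */
    /* ... apply the bitmap add/clear/merge action ... */
    bdrv_drained_end(bs);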

Re: [PATCH v2 17/21] block: Take graph rdlock in bdrv_drop_intermediate()

2023-09-12 Thread Stefan Hajnoczi
On Mon, Sep 11, 2023 at 11:46:16AM +0200, Kevin Wolf wrote: > The function reads the parents list, so it needs to hold the graph lock. > > Signed-off-by: Kevin Wolf > Reviewed-by: Emanuele Giuseppe Esposito > --- > block.c | 2 ++ > 1 file changed, 2 insertions(+) Reviewed-by: Stefan Hajnoczi
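
For reference, the GRAPH_RDLOCK pattern these patches apply from a main-loop caller looks like this (a sketch, not the exact hunk):

    bdrv_graph_rdlock_main_loop();
    QLIST_FOREACH(c, &bs->parents, next_parent) {
        /* ... read the parents list under the reader lock ... */
    }
    bdrv_graph_rdunlock_main_loop();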

Re: [PATCH v2 18/21] block: Take graph rdlock in bdrv_change_aio_context()

2023-09-12 Thread Stefan Hajnoczi
On Mon, Sep 11, 2023 at 11:46:17AM +0200, Kevin Wolf wrote: > The function reads the parents list, so it needs to hold the graph lock. > > Signed-off-by: Kevin Wolf > Reviewed-by: Emanuele Giuseppe Esposito > --- > block.c | 4 ++++ > 1 file changed, 4 insertions(+) Reviewed-by: Stefan Hajnoczi

Re: [PATCH v2 00/21] Graph locking part 4 (node management)

2023-09-12 Thread Stefan Hajnoczi
On Mon, Sep 11, 2023 at 11:45:59AM +0200, Kevin Wolf wrote: > The previous parts of the graph locking changes focussed mostly on the > BlockDriver side and taking reader locks while performing I/O. This > series focusses more on the functions managing the graph structure, i.e. > adding, removing and…

Re: [PATCH v2 05/21] block: Introduce bdrv_schedule_unref()

2023-09-12 Thread Stefan Hajnoczi
On Mon, Sep 11, 2023 at 11:46:04AM +0200, Kevin Wolf wrote: > bdrv_unref() is called by a lot of places that need to hold the graph > lock (it naturally happens in the context of operations that change the > graph). However, bdrv_unref() takes the graph writer lock internally, so > it can't actuall…
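
The resulting usage pattern, as the message describes it (a sketch; example_detach is illustrative, and the caller is assumed to already hold the writer lock):

    static void GRAPH_WRLOCK example_detach(BdrvChild *child)
    {
        BlockDriverState *bs = child->bs;

        /* bdrv_unref() takes the graph writer lock internally, so calling
         * it here would deadlock; defer until the lock is released. */
        bdrv_schedule_unref(bs);
    }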

[PULL 0/2] hw/nvme: updates

2023-09-12 Thread Klaus Jensen
From: Klaus Jensen Hi, The following changes since commit 9ef497755afc252fb8e060c9ea6b0987abfd20b6: Merge tag 'pull-vfio-20230911' of https://github.com/legoater/qemu into staging (2023-09-11 09:13:08 -0400) are available in the Git repository at: https://gitlab.com/birkelund/qemu.git ta…

[PULL 2/2] hw/nvme: Avoid dynamic stack allocation

2023-09-12 Thread Klaus Jensen
From: Peter Maydell Instead of using a variable-length array in nvme_map_prp(), allocate on the heap with a g_autofree pointer. The codebase has very few VLAs, and if we can get rid of them all we can make the compiler error on new additions. This is a defensive measure against security bugs w…
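
The substitution described, illustratively ('nents' stands in for whatever length nvme_map_prp() computes):

    /* before: VLA, stack usage depends on a runtime value */
    uint64_t prp_list[nents];

    /* after: heap allocation, freed automatically at end of scope */
    g_autofree uint64_t *prp_list = g_new(uint64_t, nents);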

[PULL 1/2] hw/nvme: Use #define to avoid variable length array

2023-09-12 Thread Klaus Jensen
From: Philippe Mathieu-Daudé In nvme_map_sgl() we create an array segment[] whose size is the 'const int SEG_CHUNK_SIZE'. Since this is C, rather than C++, a "const int foo" is not a true constant, it's merely a variable with a constant value, and so semantically segment[] is a variable-length a…
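
The distinction drawn above, in miniature (a before/after sketch; the chunk size value is illustrative):

    const int SEG_CHUNK_SIZE = 256;             /* not a constant expression in C */
    NvmeSglDescriptor segment[SEG_CHUNK_SIZE];  /* => semantically a VLA */

    #define SEG_CHUNK_SIZE 256                  /* a true constant */
    NvmeSglDescriptor segment[SEG_CHUNK_SIZE];  /* => fixed-size array */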

Re: [PATCH 0/2] nvme: avoid dynamic stack allocations

2023-09-12 Thread Klaus Jensen
On Sep 12 15:15, Peter Maydell wrote: > On Mon, 14 Aug 2023 at 08:09, Klaus Jensen wrote: > > > > On Aug 11 18:47, Peter Maydell wrote: > > > The QEMU codebase has very few C variable length arrays, and if we can > > > get rid of them all we can make the compiler error on new additions. > > > This…

Re: [PATCH 0/2] nvme: avoid dynamic stack allocations

2023-09-12 Thread Peter Maydell
On Mon, 14 Aug 2023 at 08:09, Klaus Jensen wrote: > > On Aug 11 18:47, Peter Maydell wrote: > > The QEMU codebase has very few C variable length arrays, and if we can > > get rid of them all we can make the compiler error on new additions. > > This is a defensive measure against security bugs wher…

Re: [PATCH v5 3/3] hw/nvme: add nvme management interface model

2023-09-12 Thread Andrew Jeffery
Hi Klaus, On Tue, 2023-09-05 at 10:38 +0200, Klaus Jensen wrote: > > +static void nmi_handle_mi_config_get(NMIDevice *nmi, NMIRequest *request) > > +{ > > +    uint32_t dw0 = le32_to_cpu(request->dw0); > > +    uint8_t identifier = FIELD_EX32(dw0, NMI_CMD_CONFIGURATION_GET_DW0, …

Re: [PULL 00/14] Block patches

2023-09-12 Thread Hanna Czenczek
On 07.09.23 13:21, Hanna Czenczek wrote: On 06.09.23 15:18, Stefan Hajnoczi wrote: On Fri, 1 Sept 2023 at 04:18, Hanna Czenczek wrote: The following changes since commit f5fe7c17ac4e309e47e78f0f9761aebc8d2f2c81: Merge tag 'pull-tcg-20230823-2' of https://gitlab.com/rth7680/qemu into stag…

[PATCH v4 3/5] vhost-user-scsi: support reconnect to backend

2023-09-12 Thread Li Feng
If the backend crashes and restarts, the device is broken. This patch adds reconnect support for vhost-user-scsi. It also improves the error messages and reports some previously silent errors. Tested with the spdk backend. Signed-off-by: Li Feng --- hw/scsi/vhost-scsi-common.c | 16 +- hw/scsi/vh…

[PATCH v4 5/5] vhost-user: fix lost reconnect

2023-09-12 Thread Li Feng
When vhost-user is reconnecting to the backend and fails at get_features in vhost_dev_init(), the reconnect fails and is never retriggered. The reason is that when vhost-user fails at get_features, vhost_dev_cleanup is called immediately.

[PATCH v4 1/5] vhost-user-common: send get_inflight_fd once

2023-09-12 Thread Li Feng
Currently the get_inflight_fd will be sent every time the device is started, and the backend will allocate shared memory to save the inflight state. If the backend finds that it receives a second get_inflight_fd, it will release the previous shared memory, which breaks the inflight tracking logic. Th…
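
The gist of the fix, as described: request the inflight fd only on first start, so the backend does not discard in-flight state on restart (a sketch; field names follow the analogous vhost-user code and may differ in the patch):

    /* only on the first start: later starts reuse the existing region */
    if (vsc->inflight->addr == NULL) {
        ret = vhost_dev_get_inflight(&vsc->dev, vs->conf.virtqueue_size,
                                     vsc->inflight);
    }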

[PATCH v4 0/5] Implement reconnect for vhost-user-scsi

2023-09-12 Thread Li Feng
This patchset adds reconnect support for vhost-user-scsi. At the same time, the error messages are improved and previously silent errors are now reported. It also fixes a lost-reconnect issue for all vhost-user backends. Changes for v4: - Merge https://lore.kernel.org/all/20230830045722.611224-1-fen...@smartx.com/ to…

[PATCH v4 4/5] vhost-user-scsi: start vhost when guest kicks

2023-09-12 Thread Li Feng
Let's keep the same behavior as vhost-user-blk. Some old guests kick the virtqueue before setting VIRTIO_CONFIG_S_DRIVER_OK. Signed-off-by: Li Feng --- hw/scsi/vhost-user-scsi.c | 51 +++ 1 file changed, 46 insertions(+), 5 deletions(-) diff --git a/hw/scsi/vhos…
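
A rough sketch of the vhost-user-blk behavior being mirrored (the field and helper names here are illustrative, not the patch's):

    static void vhost_user_scsi_handle_output(VirtIODevice *vdev, VirtQueue *vq)
    {
        VHostUserSCSI *s = VHOST_USER_SCSI(vdev);

        /* Old guests may kick before VIRTIO_CONFIG_S_DRIVER_OK, so start
         * the backend on the first kick rather than only in set_status. */
        if (!s->started_vu) {
            vhost_user_scsi_start_example(s);   /* hypothetical helper */
        }
    }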

[PATCH v4 2/5] vhost: move and rename the conn retry times

2023-09-12 Thread Li Feng
Multiple devices need this macro, move it to a common header. Signed-off-by: Li Feng Reviewed-by: Raphael Norwitz --- hw/block/vhost-user-blk.c | 4 +--- hw/virtio/vhost-user-gpio.c | 3 +-- include/hw/virtio/vhost.h | 2 ++ 3 files changed, 4 insertions(+), 5 deletions(-) diff --git a/hw/…

Re: [PATCH v3 2/2] vhost: Add Error parameter to vhost_scsi_common_start()

2023-09-12 Thread Li Feng
> On 1 Sep 2023, at 8:00 PM, Markus Armbruster wrote: > > Li Feng <fen...@smartx.com> writes: > >> Add an Error parameter to report the real error, like vhost-user-blk. >> >> Signed-off-by: Li Feng >> --- >> hw/scsi/vhost-scsi-common.c | 16 +--- >> hw/scsi/vhost-s…

Re: [PATCH v3 4/5] vhost-user-scsi: support reconnect to backend

2023-09-12 Thread Li Feng
> On 1 Sep 2023, at 8:00 PM, Markus Armbruster wrote: > > Li Feng <fen...@smartx.com> writes: > >> If the backend crashes and restarts, the device is broken. >> This patch adds reconnect for vhost-user-scsi. >> >> Tested with spdk backend. >> >> Signed-off-by: Li Feng >> --- >> hw/sc…

Re: [PATCH v3 5/5] vhost-user-scsi: start vhost when guest kicks

2023-09-12 Thread Li Feng
> On 1 Sep 2023, at 7:44 PM, Markus Armbruster wrote: > > Li Feng <fen...@smartx.com> writes: > >> Let's keep the same behavior as vhost-user-blk. >> >> Some old guests kick virtqueue before setting VIRTIO_CONFIG_S_DRIVER_OK. >> >> Signed-off-by: Li Feng >> --- >> hw/scsi/vhost-user-…