Re: [PATCH 0/9] hw/block: m25p80: Fix the mess of dummy bytes needed for fast read commands
On Fri, Apr 23, 2021 at 4:46 PM Bin Meng wrote:
>
> On Mon, Feb 8, 2021 at 10:41 PM Bin Meng wrote:
> >
> > On Thu, Jan 21, 2021 at 10:18 PM Francisco Iglesias wrote:
> > >
> > > Hi Bin,
> > >
> > > On [2021 Jan 21] Thu 16:59:51, Bin Meng wrote:
> > > > Hi Francisco,
> > > >
> > > > On Thu, Jan 21, 2021 at 4:50 PM Francisco Iglesias wrote:
> > > > >
> > > > > Dear Bin,
> > > > >
> > > > > On [2021 Jan 20] Wed 22:20:25, Bin Meng wrote:
> > > > > > Hi Francisco,
> > > > > >
> > > > > > On Tue, Jan 19, 2021 at 9:01 PM Francisco Iglesias wrote:
> > > > > > >
> > > > > > > Hi Bin,
> > > > > > >
> > > > > > > On [2021 Jan 18] Mon 20:32:19, Bin Meng wrote:
> > > > > > > > Hi Francisco,
> > > > > > > >
> > > > > > > > On Mon, Jan 18, 2021 at 6:06 PM Francisco Iglesias wrote:
> > > > > > > > >
> > > > > > > > > Hi Bin,
> > > > > > > > >
> > > > > > > > > On [2021 Jan 15] Fri 22:38:18, Bin Meng wrote:
> > > > > > > > > > Hi Francisco,
> > > > > > > > > >
> > > > > > > > > > On Fri, Jan 15, 2021 at 8:26 PM Francisco Iglesias wrote:
> > > > > > > > > > >
> > > > > > > > > > > Hi Bin,
> > > > > > > > > > >
> > > > > > > > > > > On [2021 Jan 15] Fri 10:07:52, Bin Meng wrote:
> > > > > > > > > > > > Hi Francisco,
> > > > > > > > > > > >
> > > > > > > > > > > > On Fri, Jan 15, 2021 at 2:13 AM Francisco Iglesias wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > Hi Bin,
> > > > > > > > > > > > >
> > > > > > > > > > > > > On [2021 Jan 14] Thu 23:08:53, Bin Meng wrote:
> > > > > > > > > > > > > > From: Bin Meng
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > The m25p80 model uses s->needed_bytes to indicate how many follow-up
> > > > > > > > > > > > > > bytes are expected to be received after it receives a command. For
> > > > > > > > > > > > > > example, depending on the address mode, either 3-byte address or
> > > > > > > > > > > > > > 4-byte address is needed.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > For fast read family commands, some dummy cycles are required after
> > > > > > > > > > > > > > sending the address bytes, and the dummy cycles need to be counted
> > > > > > > > > > > > > > in s->needed_bytes. This is where the mess began.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > As the variable name (needed_bytes) indicates, the unit is in byte.
> > > > > > > > > > > > > > It is not in bit, or cycle. However for some reason the model has
> > > > > > > > > > > > > > been using the number of dummy cycles for s->needed_bytes. The right
> > > > > > > > > > > > > > approach is to convert the number of dummy cycles to bytes based on
> > > > > > > > > > > > > > the SPI protocol, for example, 6 dummy cycles for the Fast Read Quad
> > > > > > > > > > > > > > I/O (EBh) should be converted to 3 bytes per the formula (6 * 4 / 8).
> > > > > > > > > > > > >
> > > > > > > > > > > > > While not being the original implementor I must assume that above solution was
> > > > > > > > > > > > > considered but not chosen by the developers due to its inaccuracy (it
> > > > > > > > > > > > > wouldn't be possible to model exactly 6 dummy cycles, only a multiple of 8,
> > > > > > > > > > > > > meaning that if the controller is wrongly programmed to generate 7 the error
> > > > > > > > > > > > > wouldn't be caught and the controller will still be considered "correct"). Now
> > > > > > > > > > > > > that we have this detail in the implementation I'm in favor of keeping it, this
> > > > > > > > > > > > > also because the detail is already in use for catching exactly above error.
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > I found no clue from the commit message that my proposed solution here
> > > > > > > > > > > > was ever considered, otherwise all SPI controller models supporting
> > > > > > > > > > > > software generation should have been found out seriously broken long
> > > > > > > > > > > > time ago!
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > The controllers you are referring to might lack support for commands requiring
> > > > > > > > > > > dummy clock cycles but I really hope they work with the other commands? If so I
> > > > > > > > > >
> > > > > > > > > > I am not sure why you view dummy clock cycles as something
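The conversion proposed in the quoted patch description, (6 * 4 / 8), can be illustrated with a small sketch. This is not code from the m25p80 model: the helper name and the `io_lines` parameter are made up for illustration, and the only assumption is that each dummy cycle clocks one bit per I/O line. It also shows the case Francisco raises, where a cycle count does not map to a whole number of bytes.

```python
from typing import Tuple


def dummy_cycles_to_bytes(cycles: int, io_lines: int) -> Tuple[int, int]:
    """Hypothetical helper: return (whole_bytes, leftover_bits).

    Assumes one bit is clocked per I/O line on every dummy cycle.
    """
    bits = cycles * io_lines
    return divmod(bits, 8)


if __name__ == "__main__":
    # Fast Read Quad I/O (EBh): 6 dummy cycles on 4 lines, as in the patch text.
    print(dummy_cycles_to_bytes(6, 4))
    # 7 cycles on a single line do not fill a whole byte, so a byte-based
    # count alone cannot tell this apart from a correctly programmed 8.
    print(dummy_cycles_to_bytes(7, 1))
```

A non-zero second element is exactly the information that is lost when only whole bytes are counted.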
Re: [PATCH 1/5] hw/ppc/spapr_iommu: Register machine reset handler
On Sat, Apr 24, 2021 at 06:22:25PM +0200, Philippe Mathieu-Daudé wrote:
> The TYPE_SPAPR_TCE_TABLE device is bus-less, thus isn't reset
> automatically.  Register a reset handler to get reset with the
> machine.
>
> It doesn't seem to be an issue because it is that way since the
> device QDev'ifycation 8 years ago, in commit a83000f5e3f
> ("spapr-tce: make sPAPRTCETable a proper device").
> Still, correct to have a proper API usage.

So, the reason this works now is that we explicitly call
device_reset() on the TCE table from the TCE table's "owner", either a
PHB (spapr_phb_reset()) or a VIO device (spapr_vio_quiesce_one()).

I think we want either that, or the register_reset(), not both.

>
> Signed-off-by: Philippe Mathieu-Daudé
> ---
>  hw/ppc/spapr_iommu.c | 10 ++
>  1 file changed, 10 insertions(+)
>
> diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c
> index 24537ffcbd3..f7dad1dc0fe 100644
> --- a/hw/ppc/spapr_iommu.c
> +++ b/hw/ppc/spapr_iommu.c
> @@ -24,6 +24,7 @@
>  #include "sysemu/kvm.h"
>  #include "kvm_ppc.h"
>  #include "migration/vmstate.h"
> +#include "sysemu/reset.h"
>  #include "sysemu/dma.h"
>  #include "exec/address-spaces.h"
>  #include "trace.h"
> @@ -302,6 +303,11 @@ static const VMStateDescription vmstate_spapr_tce_table = {
>      }
>  };
>
> +static void spapr_tce_reset_handler(void *dev)
> +{
> +    device_legacy_reset(DEVICE(dev));
> +}
> +
>  static void spapr_tce_table_realize(DeviceState *dev, Error **errp)
>  {
>      SpaprTceTable *tcet = SPAPR_TCE_TABLE(dev);
> @@ -324,6 +330,8 @@ static void spapr_tce_table_realize(DeviceState *dev, Error **errp)
>
>      vmstate_register(VMSTATE_IF(tcet), tcet->liobn, &vmstate_spapr_tce_table,
>                       tcet);
> +
> +    qemu_register_reset(spapr_tce_reset_handler, dev);
>  }
>
>  void spapr_tce_set_need_vfio(SpaprTceTable *tcet, bool need_vfio)
> @@ -425,6 +433,8 @@ static void spapr_tce_table_unrealize(DeviceState *dev)
>  {
>      SpaprTceTable *tcet = SPAPR_TCE_TABLE(dev);
>
> +    qemu_unregister_reset(spapr_tce_reset_handler, dev);
> +
>      vmstate_unregister(VMSTATE_IF(tcet), &vmstate_spapr_tce_table, tcet);
>
>      QLIST_REMOVE(tcet, list);

--
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
Re: [PATCH 4/5] hw/pci-host/raven: Manually reset the OR_IRQ device
On Sat, Apr 24, 2021 at 06:22:28PM +0200, Philippe Mathieu-Daudé wrote:
> The OR_IRQ device is bus-less, thus isn't reset automatically.
> Add the raven_pcihost_reset() handler to manually reset the OR IRQ.
>
> Fixes: f40b83a4e31 ("40p: use OR gate to wire up raven PCI interrupts")
> Signed-off-by: Philippe Mathieu-Daudé

Acked-by: David Gibson

> ---
>  hw/pci-host/prep.c | 10 ++
>  1 file changed, 10 insertions(+)
>
> diff --git a/hw/pci-host/prep.c b/hw/pci-host/prep.c
> index 0a9162fba97..275379e4c78 100644
> --- a/hw/pci-host/prep.c
> +++ b/hw/pci-host/prep.c
> @@ -230,6 +230,15 @@ static void raven_change_gpio(void *opaque, int n, int level)
>      s->contiguous_map = level;
>  }
>
> +static void raven_pcihost_reset(DeviceState *dev)
> +{
> +    PREPPCIState *s = RAVEN_PCI_HOST_BRIDGE(dev);
> +
> +    if (!s->is_legacy_prep) {
> +        device_legacy_reset(DEVICE(&s->or_irq));
> +    }
> +}
> +
>  static void raven_pcihost_realizefn(DeviceState *d, Error **errp)
>  {
>      SysBusDevice *dev = SYS_BUS_DEVICE(d);
> @@ -422,6 +431,7 @@ static void raven_pcihost_class_init(ObjectClass *klass, void *data)
>
>      set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
>      dc->realize = raven_pcihost_realizefn;
> +    dc->reset = raven_pcihost_reset;
>      device_class_set_props(dc, raven_pcihost_properties);
>      dc->fw_name = "pci";
>  }

--
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
Re: [PATCH] qapi: deprecate drive-backup
26.04.2021 21:30, John Snow wrote:
> On 4/26/21 2:05 PM, Daniel P. Berrangé wrote:
> > On Mon, Apr 26, 2021 at 09:00:36PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> > > 26.04.2021 20:34, John Snow wrote:
> > > > On 4/23/21 8:59 AM, Vladimir Sementsov-Ogievskiy wrote:
> > > > > The modern way is to use blockdev-add + blockdev-backup, which provides
> > > > > a lot more control over how the target is opened.
> > > > >
> > > > > As an example of drive-backup problems, consider the following:
> > > > >
> > > > > A user of drive-backup expects that the target will be opened in the
> > > > > same cache and aio mode as the source. The corresponding logic is in
> > > > > drive_backup_prepare(), where we take bs->open_flags of the source.
> > > > >
> > > > > It works rather badly if the source was added by blockdev-add. Assume
> > > > > the source is a qcow2 image. On blockdev-add we should specify aio and
> > > > > cache options for the file child of the qcow2 node. What happens next:
> > > > >
> > > > > drive_backup_prepare() looks at bs->open_flags of the qcow2 source
> > > > > node. But neither BDRV_O_NOCACHE nor BDRV_O_NATIVE_AIO is there:
> > > > > BDRV_O_NOCACHE is placed in bs->file->bs->open_flags, and
> > > > > BDRV_O_NATIVE_AIO is nowhere, as file-posix parses options and simply
> > > > > sets s->use_linux_aio.
> > > > >
> > > >
> > > > No complaints from me, especially if Virtuozzo is on board. I would like
> > > > to see some documentation changes alongside this deprecation, though.
> > > >
> > > > > Signed-off-by: Vladimir Sementsov-Ogievskiy
> > > > > ---
> > > > >
> > > > > Hi all! I remember I suggested deprecating drive-backup some time ago,
> > > > > and nobody complained. But that old patch was inside a series with
> > > > > other, more questionable deprecations, and it did not land.
> > > > >
> > > > > Let's finally deprecate what should have been deprecated long ago.
> > > > >
> > > > > We have now faced a problem in our downstream, described in the commit
> > > > > message. In downstream I've fixed it by simply enabling O_DIRECT and
> > > > > linux_aio unconditionally for the drive_backup target. But actually this
> > > > > just shows that using drive-backup in the blockdev era is a bad idea.
> > > > > So let's motivate everyone (including Virtuozzo, of course) to move to
> > > > > the new interfaces and avoid problems with all that outdated option
> > > > > inheritance.
> > > > >
> > > > >  docs/system/deprecated.rst | 5 +
> > > > >  qapi/block-core.json | 5 -
> > > > >  2 files changed, 9 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/docs/system/deprecated.rst b/docs/system/deprecated.rst
> > > > > index 80cae86252..b6f5766e17 100644
> > > > > --- a/docs/system/deprecated.rst
> > > > > +++ b/docs/system/deprecated.rst
> > > > > @@ -186,6 +186,11 @@ Use the more generic commands ``block-export-add`` and ``block-export-del``
> > > > >  instead. As part of this deprecation, where ``nbd-server-add`` used a
> > > > >  single ``bitmap``, the new ``block-export-add`` uses a list of ``bitmaps``.
> > > > > +``drive-backup`` (since 6.0)
> > > > > +
> > > > > +
> > > > > +Use ``blockdev-backup`` in pair with ``blockdev-add`` instead.
> > > > > +
> > > >
> > > > 1) Let's add a sphinx reference to
> > > > https://qemu-project.gitlab.io/qemu/interop/live-block-operations.html#live-disk-backup-drive-backup-and-blockdev-backup
> > > >
> > > > 2) Just a thought, not a request: We also may wish to update
> > > > https://qemu-project.gitlab.io/qemu/interop/bitmaps.html to use the new,
> > > > preferred method. However, this doc is a bit old and is in need of an
> > > > overhaul anyway (Especially to add the NBD pull workflow.) Since the doc
> > > > is in need of an overhaul anyway, can we ask Kashyap to help us here, if
> > > > he has time?
> > > >
> > > > 3) Let's add a small explanation here that outlines the differences in
> > > > using these two commands. Here's a suggestion:
> > > >
> > > > This change primarily separates the creation/opening process of the
> > > > backup target with explicit, separate steps. BlockdevBackup uses mostly
> > > > the same arguments as DriveBackup, except the "format" and "mode"
> > > > options are removed in favor of using explicit "blockdev-create" and
> > > > "blockdev-add" calls.
> > > >
> > > > The "target" argument changes semantics. It no longer accepts filenames,
> > > > and will now additionally accept arbitrary node names in addition to
> > > > device names.
> > > >
> > > > 4) Also not a request: If we want to go above and beyond, it might be
> > > > nice to spell out the exact steps required to transition from the old
> > > > interface to the new one. Here's a (hasty) suggestion for how that might
> > > > look:
> > > >
> > > > - The MODE argument is deprecated.
> > > >   - "existing" is replaced by using "blockdev-add" commands.
> > > >   - "absolute-paths" is replaced by using "blockdev-add" and
> > > >     "blockdev-create" commands.
> > > >
> > > > - The FORMAT argument is deprecated.
> > > >   - Format information is given to "blockdev-add"/"blockdev-create".
> > > >
> > > > - The TARGET argument has new semantics:
> > > >   - Filenames are no longer supported, use blockdev-add/blockdev-create
> > > >     as necessary instead.
> > > >   - Device targets remain supported.
> > > >
> > > > Example:
> > > >
> > > > drive-backup $ARGS format=$FORMAT mode=$MODE target=$FILENAME
> > > >
> > > > becomes:
> > > >
> > > > (taking some liberties with syntax to just illustrate the idea ...)
> > > >
> > > > blockdev-create options={
> > > >     "driver": "file",
> > > >     "filename": $FILENAME,
> > > >     "size": 0,
> > > > }
> > > >
> > > > blockdev-add arguments={
> > > >     "driver": "file",
> > > >     "filename": $FILENAME,
> > > >     "node-name": "Example_Filenode0"
> > > > }
> > > >
> > > > blockdev-create options={
> > > >     "driver": $FORMAT,
> > > >     "file": "Example_Filenode0",
> > > >     "size": $SIZE,
> > > > }
> > > >
> > > > blockdev-add arguments={
> > > >     "driver": $FORMAT,
> > > >     "file":
Re: [PATCH] qapi: deprecate drive-backup
On 4/26/21 2:41 PM, Vladimir Sementsov-Ogievskiy wrote:
> 26.04.2021 21:30, John Snow wrote:
> > On 4/26/21 2:05 PM, Daniel P. Berrangé wrote:
> > > On Mon, Apr 26, 2021 at 09:00:36PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> > > > 26.04.2021 20:34, John Snow wrote:
> > > > > On 4/23/21 8:59 AM, Vladimir Sementsov-Ogievskiy wrote:
> > > > > > The modern way is to use blockdev-add + blockdev-backup, which provides
> > > > > > a lot more control over how the target is opened.
> > > > > >
> > > > > > As an example of drive-backup problems, consider the following:
> > > > > >
> > > > > > A user of drive-backup expects that the target will be opened in the
> > > > > > same cache and aio mode as the source. The corresponding logic is in
> > > > > > drive_backup_prepare(), where we take bs->open_flags of the source.
> > > > > >
> > > > > > It works rather badly if the source was added by blockdev-add. Assume
> > > > > > the source is a qcow2 image. On blockdev-add we should specify aio and
> > > > > > cache options for the file child of the qcow2 node. What happens next:
> > > > > >
> > > > > > drive_backup_prepare() looks at bs->open_flags of the qcow2 source
> > > > > > node. But neither BDRV_O_NOCACHE nor BDRV_O_NATIVE_AIO is there:
> > > > > > BDRV_O_NOCACHE is placed in bs->file->bs->open_flags, and
> > > > > > BDRV_O_NATIVE_AIO is nowhere, as file-posix parses options and simply
> > > > > > sets s->use_linux_aio.
> > > > > >
> > > > >
> > > > > No complaints from me, especially if Virtuozzo is on board. I would like
> > > > > to see some documentation changes alongside this deprecation, though.
> > > > >
> > > > > > Signed-off-by: Vladimir Sementsov-Ogievskiy
> > > > > > ---
> > > > > >
> > > > > > Hi all! I remember I suggested deprecating drive-backup some time ago,
> > > > > > and nobody complained. But that old patch was inside a series with
> > > > > > other, more questionable deprecations, and it did not land.
> > > > > >
> > > > > > Let's finally deprecate what should have been deprecated long ago.
> > > > > >
> > > > > > We have now faced a problem in our downstream, described in the commit
> > > > > > message. In downstream I've fixed it by simply enabling O_DIRECT and
> > > > > > linux_aio unconditionally for the drive_backup target. But actually this
> > > > > > just shows that using drive-backup in the blockdev era is a bad idea.
> > > > > > So let's motivate everyone (including Virtuozzo, of course) to move to
> > > > > > the new interfaces and avoid problems with all that outdated option
> > > > > > inheritance.
> > > > > >
> > > > > >  docs/system/deprecated.rst | 5 +
> > > > > >  qapi/block-core.json | 5 -
> > > > > >  2 files changed, 9 insertions(+), 1 deletion(-)
> > > > > >
> > > > > > diff --git a/docs/system/deprecated.rst b/docs/system/deprecated.rst
> > > > > > index 80cae86252..b6f5766e17 100644
> > > > > > --- a/docs/system/deprecated.rst
> > > > > > +++ b/docs/system/deprecated.rst
> > > > > > @@ -186,6 +186,11 @@ Use the more generic commands ``block-export-add`` and ``block-export-del``
> > > > > >  instead. As part of this deprecation, where ``nbd-server-add`` used a
> > > > > >  single ``bitmap``, the new ``block-export-add`` uses a list of ``bitmaps``.
> > > > > > +``drive-backup`` (since 6.0)
> > > > > > +
> > > > > > +
> > > > > > +Use ``blockdev-backup`` in pair with ``blockdev-add`` instead.
> > > > > > +
> > > > >
> > > > > 1) Let's add a sphinx reference to
> > > > > https://qemu-project.gitlab.io/qemu/interop/live-block-operations.html#live-disk-backup-drive-backup-and-blockdev-backup
> > > > >
> > > > > 2) Just a thought, not a request: We also may wish to update
> > > > > https://qemu-project.gitlab.io/qemu/interop/bitmaps.html to use the new,
> > > > > preferred method. However, this doc is a bit old and is in need of an
> > > > > overhaul anyway (Especially to add the NBD pull workflow.) Since the doc
> > > > > is in need of an overhaul anyway, can we ask Kashyap to help us here, if
> > > > > he has time?
> > > > >
> > > > > 3) Let's add a small explanation here that outlines the differences in
> > > > > using these two commands. Here's a suggestion:
> > > > >
> > > > > This change primarily separates the creation/opening process of the
> > > > > backup target with explicit, separate steps. BlockdevBackup uses mostly
> > > > > the same arguments as DriveBackup, except the "format" and "mode"
> > > > > options are removed in favor of using explicit "blockdev-create" and
> > > > > "blockdev-add" calls.

(Here, I accidentally used the names of the argument objects instead of
the names of the commands. It's likely better to spell out the names of
the commands instead.)

> > > > > The "target" argument changes semantics. It no longer accepts filenames,
> > > > > and will now additionally accept arbitrary node names in addition to
> > > > > device names.
> > > > >
> > > > > 4) Also not a request: If we want to go above and beyond, it might be
> > > > > nice to spell out the exact steps required to transition from the old
> > > > > interface to the new one. Here's a (hasty) suggestion for how that might
> > > > > look:
> > > > >
> > > > > - The MODE argument is deprecated.
> > > > >   - "existing" is replaced by using "blockdev-add" commands.
> > > > >   - "absolute-paths" is replaced by using "blockdev-add" and
> > > > >     "blockdev-create" commands.
> > > > >
> > > > > - The FORMAT argument is deprecated.
> > > > >   - Format information is given to "blockdev-add"/"blockdev-create".
> > > > >
> > > > > - The TARGET argument has new semantics:
> > > > >   - Filenames are no longer supported, use blockdev-add/blockdev-create
> > > > >     as necessary instead.
> > > > >   - Device targets remain supported.
> > > > >
> > > > > Example:
> > > > >
> > > > > drive-backup $ARGS format=$FORMAT mode=$MODE target=$FILENAME
> > > > >
> > > > > becomes:
> > > > >
> > > > > (taking some liberties with syntax to just illustrate the idea ...)
> > > > >
> > > > > blockdev-create options={
> > > > >     "driver": "file",
> > > > >     "filename": $FILENAME,
> > > > >     "size": 0,
> > > > > }
> > > > >
> > > > > blockdev-add arguments={
Re: [PATCH] qapi: deprecate drive-backup
On 4/26/21 2:05 PM, Daniel P. Berrangé wrote:
> On Mon, Apr 26, 2021 at 09:00:36PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> > 26.04.2021 20:34, John Snow wrote:
> > > On 4/23/21 8:59 AM, Vladimir Sementsov-Ogievskiy wrote:
> > > > The modern way is to use blockdev-add + blockdev-backup, which provides
> > > > a lot more control over how the target is opened.
> > > >
> > > > As an example of drive-backup problems, consider the following:
> > > >
> > > > A user of drive-backup expects that the target will be opened in the
> > > > same cache and aio mode as the source. The corresponding logic is in
> > > > drive_backup_prepare(), where we take bs->open_flags of the source.
> > > >
> > > > It works rather badly if the source was added by blockdev-add. Assume
> > > > the source is a qcow2 image. On blockdev-add we should specify aio and
> > > > cache options for the file child of the qcow2 node. What happens next:
> > > >
> > > > drive_backup_prepare() looks at bs->open_flags of the qcow2 source
> > > > node. But neither BDRV_O_NOCACHE nor BDRV_O_NATIVE_AIO is there:
> > > > BDRV_O_NOCACHE is placed in bs->file->bs->open_flags, and
> > > > BDRV_O_NATIVE_AIO is nowhere, as file-posix parses options and simply
> > > > sets s->use_linux_aio.
> > > >
> > >
> > > No complaints from me, especially if Virtuozzo is on board. I would like
> > > to see some documentation changes alongside this deprecation, though.
> > >
> > > > Signed-off-by: Vladimir Sementsov-Ogievskiy
> > > > ---
> > > >
> > > > Hi all! I remember I suggested deprecating drive-backup some time ago,
> > > > and nobody complained. But that old patch was inside a series with
> > > > other, more questionable deprecations, and it did not land.
> > > >
> > > > Let's finally deprecate what should have been deprecated long ago.
> > > >
> > > > We have now faced a problem in our downstream, described in the commit
> > > > message. In downstream I've fixed it by simply enabling O_DIRECT and
> > > > linux_aio unconditionally for the drive_backup target. But actually this
> > > > just shows that using drive-backup in the blockdev era is a bad idea.
> > > > So let's motivate everyone (including Virtuozzo, of course) to move to
> > > > the new interfaces and avoid problems with all that outdated option
> > > > inheritance.
> > > >
> > > >  docs/system/deprecated.rst | 5 +
> > > >  qapi/block-core.json | 5 -
> > > >  2 files changed, 9 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/docs/system/deprecated.rst b/docs/system/deprecated.rst
> > > > index 80cae86252..b6f5766e17 100644
> > > > --- a/docs/system/deprecated.rst
> > > > +++ b/docs/system/deprecated.rst
> > > > @@ -186,6 +186,11 @@ Use the more generic commands ``block-export-add`` and ``block-export-del``
> > > >  instead. As part of this deprecation, where ``nbd-server-add`` used a
> > > >  single ``bitmap``, the new ``block-export-add`` uses a list of ``bitmaps``.
> > > > +``drive-backup`` (since 6.0)
> > > > +
> > > > +
> > > > +Use ``blockdev-backup`` in pair with ``blockdev-add`` instead.
> > > > +
> > >
> > > 1) Let's add a sphinx reference to
> > > https://qemu-project.gitlab.io/qemu/interop/live-block-operations.html#live-disk-backup-drive-backup-and-blockdev-backup
> > >
> > > 2) Just a thought, not a request: We also may wish to update
> > > https://qemu-project.gitlab.io/qemu/interop/bitmaps.html to use the new,
> > > preferred method. However, this doc is a bit old and is in need of an
> > > overhaul anyway (Especially to add the NBD pull workflow.) Since the doc
> > > is in need of an overhaul anyway, can we ask Kashyap to help us here, if
> > > he has time?
> > >
> > > 3) Let's add a small explanation here that outlines the differences in
> > > using these two commands. Here's a suggestion:
> > >
> > > This change primarily separates the creation/opening process of the
> > > backup target with explicit, separate steps. BlockdevBackup uses mostly
> > > the same arguments as DriveBackup, except the "format" and "mode"
> > > options are removed in favor of using explicit "blockdev-create" and
> > > "blockdev-add" calls.
> > >
> > > The "target" argument changes semantics. It no longer accepts filenames,
> > > and will now additionally accept arbitrary node names in addition to
> > > device names.
> > >
> > > 4) Also not a request: If we want to go above and beyond, it might be
> > > nice to spell out the exact steps required to transition from the old
> > > interface to the new one. Here's a (hasty) suggestion for how that might
> > > look:
> > >
> > > - The MODE argument is deprecated.
> > >   - "existing" is replaced by using "blockdev-add" commands.
> > >   - "absolute-paths" is replaced by using "blockdev-add" and
> > >     "blockdev-create" commands.
> > >
> > > - The FORMAT argument is deprecated.
> > >   - Format information is given to "blockdev-add"/"blockdev-create".
> > >
> > > - The TARGET argument has new semantics:
> > >   - Filenames are no longer supported, use blockdev-add/blockdev-create
> > >     as necessary instead.
> > >   - Device targets remain supported.
> > >
> > > Example:
> > >
> > > drive-backup $ARGS format=$FORMAT mode=$MODE target=$FILENAME
> > >
> > > becomes:
> > >
> > > (taking some liberties with syntax to just illustrate the idea ...)
> > >
> > > blockdev-create options={
> > >     "driver": "file",
> > >     "filename": $FILENAME,
> > >     "size": 0,
> > > }
> > >
> > > blockdev-add arguments={
> > >     "driver": "file",
> > >     "filename": $FILENAME,
> > >     "node-name": "Example_Filenode0"
> > > }
> > >
> > > blockdev-create options={
> > >     "driver": $FORMAT,
> > >     "file": "Example_Filenode0",
> > >     "size": $SIZE,
> > > }
> > >
> > > blockdev-add arguments={
> > >     "driver": $FORMAT,
> > >     "file": "Example_Filenode0",
> > >     "node-name":
Re: [PATCH] qapi: deprecate drive-backup
On Mon, Apr 26, 2021 at 09:00:36PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> 26.04.2021 20:34, John Snow wrote:
> > On 4/23/21 8:59 AM, Vladimir Sementsov-Ogievskiy wrote:
> > > The modern way is to use blockdev-add + blockdev-backup, which provides
> > > a lot more control over how the target is opened.
> > >
> > > As an example of drive-backup problems, consider the following:
> > >
> > > A user of drive-backup expects that the target will be opened in the
> > > same cache and aio mode as the source. The corresponding logic is in
> > > drive_backup_prepare(), where we take bs->open_flags of the source.
> > >
> > > It works rather badly if the source was added by blockdev-add. Assume
> > > the source is a qcow2 image. On blockdev-add we should specify aio and
> > > cache options for the file child of the qcow2 node. What happens next:
> > >
> > > drive_backup_prepare() looks at bs->open_flags of the qcow2 source
> > > node. But neither BDRV_O_NOCACHE nor BDRV_O_NATIVE_AIO is there:
> > > BDRV_O_NOCACHE is placed in bs->file->bs->open_flags, and
> > > BDRV_O_NATIVE_AIO is nowhere, as file-posix parses options and simply
> > > sets s->use_linux_aio.
> > >
> >
> > No complaints from me, especially if Virtuozzo is on board. I would like
> > to see some documentation changes alongside this deprecation, though.
> >
> > > Signed-off-by: Vladimir Sementsov-Ogievskiy
> > > ---
> > >
> > > Hi all! I remember I suggested deprecating drive-backup some time ago,
> > > and nobody complained. But that old patch was inside a series with
> > > other, more questionable deprecations, and it did not land.
> > >
> > > Let's finally deprecate what should have been deprecated long ago.
> > >
> > > We have now faced a problem in our downstream, described in the commit
> > > message. In downstream I've fixed it by simply enabling O_DIRECT and
> > > linux_aio unconditionally for the drive_backup target. But actually this
> > > just shows that using drive-backup in the blockdev era is a bad idea.
> > > So let's motivate everyone (including Virtuozzo, of course) to move to
> > > the new interfaces and avoid problems with all that outdated option
> > > inheritance.
> > >
> > >  docs/system/deprecated.rst | 5 +
> > >  qapi/block-core.json | 5 -
> > >  2 files changed, 9 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/docs/system/deprecated.rst b/docs/system/deprecated.rst
> > > index 80cae86252..b6f5766e17 100644
> > > --- a/docs/system/deprecated.rst
> > > +++ b/docs/system/deprecated.rst
> > > @@ -186,6 +186,11 @@ Use the more generic commands ``block-export-add`` and ``block-export-del``
> > >  instead. As part of this deprecation, where ``nbd-server-add`` used a
> > >  single ``bitmap``, the new ``block-export-add`` uses a list of ``bitmaps``.
> > > +``drive-backup`` (since 6.0)
> > > +
> > > +
> > > +Use ``blockdev-backup`` in pair with ``blockdev-add`` instead.
> > > +
> >
> > 1) Let's add a sphinx reference to
> > https://qemu-project.gitlab.io/qemu/interop/live-block-operations.html#live-disk-backup-drive-backup-and-blockdev-backup
> >
> > 2) Just a thought, not a request: We also may wish to update
> > https://qemu-project.gitlab.io/qemu/interop/bitmaps.html to use the new,
> > preferred method. However, this doc is a bit old and is in need of an
> > overhaul anyway (Especially to add the NBD pull workflow.) Since the doc
> > is in need of an overhaul anyway, can we ask Kashyap to help us here, if
> > he has time?
> >
> > 3) Let's add a small explanation here that outlines the differences in
> > using these two commands. Here's a suggestion:
> >
> > This change primarily separates the creation/opening process of the
> > backup target with explicit, separate steps. BlockdevBackup uses mostly
> > the same arguments as DriveBackup, except the "format" and "mode"
> > options are removed in favor of using explicit "blockdev-create" and
> > "blockdev-add" calls.
> >
> > The "target" argument changes semantics. It no longer accepts filenames,
> > and will now additionally accept arbitrary node names in addition to
> > device names.
> >
> > 4) Also not a request: If we want to go above and beyond, it might be
> > nice to spell out the exact steps required to transition from the old
> > interface to the new one. Here's a (hasty) suggestion for how that might
> > look:
> >
> > - The MODE argument is deprecated.
> >   - "existing" is replaced by using "blockdev-add" commands.
> >   - "absolute-paths" is replaced by using "blockdev-add" and
> >     "blockdev-create" commands.
> >
> > - The FORMAT argument is deprecated.
> >   - Format information is given to "blockdev-add"/"blockdev-create".
> >
> > - The TARGET argument has new semantics:
> >   - Filenames are no longer supported, use blockdev-add/blockdev-create
> >     as necessary instead.
> >   - Device targets remain supported.
> >
> > Example:
> >
> > drive-backup $ARGS format=$FORMAT mode=$MODE target=$FILENAME
> >
> > becomes:
> >
> > (taking some liberties
Re: [PATCH] qapi: deprecate drive-backup
26.04.2021 20:34, John Snow wrote:
> On 4/23/21 8:59 AM, Vladimir Sementsov-Ogievskiy wrote:
> > The modern way is to use blockdev-add + blockdev-backup, which provides
> > a lot more control over how the target is opened.
> >
> > As an example of drive-backup problems, consider the following:
> >
> > A user of drive-backup expects that the target will be opened in the
> > same cache and aio mode as the source. The corresponding logic is in
> > drive_backup_prepare(), where we take bs->open_flags of the source.
> >
> > It works rather badly if the source was added by blockdev-add. Assume
> > the source is a qcow2 image. On blockdev-add we should specify aio and
> > cache options for the file child of the qcow2 node. What happens next:
> >
> > drive_backup_prepare() looks at bs->open_flags of the qcow2 source
> > node. But neither BDRV_O_NOCACHE nor BDRV_O_NATIVE_AIO is there:
> > BDRV_O_NOCACHE is placed in bs->file->bs->open_flags, and
> > BDRV_O_NATIVE_AIO is nowhere, as file-posix parses options and simply
> > sets s->use_linux_aio.
> >
>
> No complaints from me, especially if Virtuozzo is on board. I would like
> to see some documentation changes alongside this deprecation, though.
>
> > Signed-off-by: Vladimir Sementsov-Ogievskiy
> > ---
> >
> > Hi all! I remember I suggested deprecating drive-backup some time ago,
> > and nobody complained. But that old patch was inside a series with
> > other, more questionable deprecations, and it did not land.
> >
> > Let's finally deprecate what should have been deprecated long ago.
> >
> > We have now faced a problem in our downstream, described in the commit
> > message. In downstream I've fixed it by simply enabling O_DIRECT and
> > linux_aio unconditionally for the drive_backup target. But actually this
> > just shows that using drive-backup in the blockdev era is a bad idea.
> > So let's motivate everyone (including Virtuozzo, of course) to move to
> > the new interfaces and avoid problems with all that outdated option
> > inheritance.
> >
> >  docs/system/deprecated.rst | 5 +
> >  qapi/block-core.json | 5 -
> >  2 files changed, 9 insertions(+), 1 deletion(-)
> >
> > diff --git a/docs/system/deprecated.rst b/docs/system/deprecated.rst
> > index 80cae86252..b6f5766e17 100644
> > --- a/docs/system/deprecated.rst
> > +++ b/docs/system/deprecated.rst
> > @@ -186,6 +186,11 @@ Use the more generic commands ``block-export-add`` and ``block-export-del``
> >  instead. As part of this deprecation, where ``nbd-server-add`` used a
> >  single ``bitmap``, the new ``block-export-add`` uses a list of ``bitmaps``.
> > +``drive-backup`` (since 6.0)
> > +
> > +
> > +Use ``blockdev-backup`` in pair with ``blockdev-add`` instead.
> > +
>
> 1) Let's add a sphinx reference to
> https://qemu-project.gitlab.io/qemu/interop/live-block-operations.html#live-disk-backup-drive-backup-and-blockdev-backup
>
> 2) Just a thought, not a request: We also may wish to update
> https://qemu-project.gitlab.io/qemu/interop/bitmaps.html to use the new,
> preferred method. However, this doc is a bit old and is in need of an
> overhaul anyway (Especially to add the NBD pull workflow.) Since the doc
> is in need of an overhaul anyway, can we ask Kashyap to help us here, if
> he has time?
>
> 3) Let's add a small explanation here that outlines the differences in
> using these two commands. Here's a suggestion:
>
> This change primarily separates the creation/opening process of the
> backup target with explicit, separate steps. BlockdevBackup uses mostly
> the same arguments as DriveBackup, except the "format" and "mode"
> options are removed in favor of using explicit "blockdev-create" and
> "blockdev-add" calls.
>
> The "target" argument changes semantics. It no longer accepts filenames,
> and will now additionally accept arbitrary node names in addition to
> device names.
>
> 4) Also not a request: If we want to go above and beyond, it might be
> nice to spell out the exact steps required to transition from the old
> interface to the new one. Here's a (hasty) suggestion for how that might
> look:
>
> - The MODE argument is deprecated.
>   - "existing" is replaced by using "blockdev-add" commands.
>   - "absolute-paths" is replaced by using "blockdev-add" and
>     "blockdev-create" commands.
>
> - The FORMAT argument is deprecated.
>   - Format information is given to "blockdev-add"/"blockdev-create".
>
> - The TARGET argument has new semantics:
>   - Filenames are no longer supported, use blockdev-add/blockdev-create
>     as necessary instead.
>   - Device targets remain supported.
>
> Example:
>
> drive-backup $ARGS format=$FORMAT mode=$MODE target=$FILENAME
>
> becomes:
>
> (taking some liberties with syntax to just illustrate the idea ...)
>
> blockdev-create options={
>     "driver": "file",
>     "filename": $FILENAME,
>     "size": 0,
> }
>
> blockdev-add arguments={
>     "driver": "file",
>     "filename": $FILENAME,
>     "node-name": "Example_Filenode0"
> }
>
> blockdev-create options={
>     "driver": $FORMAT,
>     "file": "Example_Filenode0",
>     "size": $SIZE,
> }
>
> blockdev-add arguments={
>     "driver": $FORMAT,
>     "file": "Example_Filenode0",
>     "node-name": "Example_Formatnode0",
> }
>
> blockdev-backup arguments={
>     $ARGS ...,
>     "target": "Example_Formatnode0",
> }

Good ideas. Hmm. Do you think that the whole
Re: [PATCH] qapi: deprecate drive-backup
On 4/23/21 8:59 AM, Vladimir Sementsov-Ogievskiy wrote:

The modern way is to use blockdev-add + blockdev-backup, which provides
a lot more control over how the target is opened.

As an example of drive-backup problems, consider the following:

A user of drive-backup expects that the target will be opened in the
same cache and aio mode as the source. The corresponding logic is in
drive_backup_prepare(), where we take bs->open_flags of the source.

It works rather badly if the source was added by blockdev-add. Assume
the source is a qcow2 image. On blockdev-add we should specify aio and
cache options for the file child of the qcow2 node. What happens next:

drive_backup_prepare() looks at bs->open_flags of the qcow2 source
node. But there is neither BDRV_O_NOCACHE nor BDRV_O_NATIVE_AIO there:
BDRV_O_NOCACHE is placed in bs->file->bs->open_flags, and
BDRV_O_NATIVE_AIO is nowhere, as file-posix parses the options and
simply sets s->use_linux_aio.

No complaints from me, especially if Virtuozzo is on board. I would
like to see some documentation changes alongside this deprecation,
though.

Signed-off-by: Vladimir Sementsov-Ogievskiy
---

Hi all! I remember I suggested deprecating drive-backup some time ago,
and nobody complained. But that old patch was inside a series with
other, more questionable deprecations and it did not land.

Let's finally deprecate what should have been deprecated long ago.

We have now faced a problem in our downstream, described in the commit
message. In downstream I've fixed it by simply enabling O_DIRECT and
linux_aio unconditionally for the drive_backup target. But actually
this just shows that using drive-backup in the blockdev era is a bad
idea. So let's motivate everyone (including Virtuozzo of course) to
move to the new interfaces and avoid problems with all that outdated
option inheritance.

 docs/system/deprecated.rst | 5 +
 qapi/block-core.json | 5 -
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/docs/system/deprecated.rst b/docs/system/deprecated.rst
index 80cae86252..b6f5766e17 100644
--- a/docs/system/deprecated.rst
+++ b/docs/system/deprecated.rst
@@ -186,6 +186,11 @@
 Use the more generic commands ``block-export-add`` and ``block-export-del``
 instead. As part of this deprecation, where ``nbd-server-add`` used a
 single ``bitmap``, the new ``block-export-add`` uses a list of ``bitmaps``.

+``drive-backup`` (since 6.0)
+
+
+Use ``blockdev-backup`` in pair with ``blockdev-add`` instead.
+

1) Let's add a sphinx reference to
https://qemu-project.gitlab.io/qemu/interop/live-block-operations.html#live-disk-backup-drive-backup-and-blockdev-backup

2) Just a thought, not a request: We also may wish to update
https://qemu-project.gitlab.io/qemu/interop/bitmaps.html to use the
new, preferred method. However, this doc is a bit old and is in need of
an overhaul anyway (especially to add the NBD pull workflow). Since the
doc is in need of an overhaul anyway, can we ask Kashyap to help us
here, if he has time?

3) Let's add a small explanation here that outlines the differences in
using these two commands. Here's a suggestion:

This change primarily separates the creation/opening process of the
backup target with explicit, separate steps. BlockdevBackup uses mostly
the same arguments as DriveBackup, except the "format" and "mode"
options are removed in favor of using explicit "blockdev-create" and
"blockdev-add" calls.

The "target" argument changes semantics. It no longer accepts
filenames, and will now accept arbitrary node names in addition to
device names.

4) Also not a request: If we want to go above and beyond, it might be
nice to spell out the exact steps required to transition from the old
interface to the new one. Here's a (hasty) suggestion for how that
might look:

- The MODE argument is deprecated.
  - "existing" is replaced by using "blockdev-add" commands.
  - "absolute-paths" is replaced by using "blockdev-add" and
    "blockdev-create" commands.
- The FORMAT argument is deprecated.
  - Format information is given to "blockdev-add"/"blockdev-create".
- The TARGET argument has new semantics:
  - Filenames are no longer supported, use
    blockdev-add/blockdev-create as necessary instead.
  - Device targets remain supported.

Example:

drive-backup $ARGS format=$FORMAT mode=$MODE target=$FILENAME

becomes:

(taking some liberties with syntax to just illustrate the idea ...)

blockdev-create options={
    "driver": "file",
    "filename": $FILENAME,
    "size": 0,
}

blockdev-add arguments={
    "driver": "file",
    "filename": $FILENAME,
    "node-name": "Example_Filenode0"
}

blockdev-create options={
    "driver": $FORMAT,
    "file": "Example_Filenode0",
    "size": $SIZE,
}

blockdev-add arguments={
    "driver": $FORMAT,
    "file": "Example_Filenode0",
    "node-name": "Example_Formatnode0",
}

blockdev-backup arguments={
    $ARGS ...,
    "target": "Example_Formatnode0",
}

 System accelerators
 ---
diff --git
Re: [PATCH v3 22/36] block: add bdrv_remove_filter_or_cow transaction action
26.04.2021 19:26, Kevin Wolf wrote:
On 17.03.2021 at 15:35, Vladimir Sementsov-Ogievskiy wrote:

Signed-off-by: Vladimir Sementsov-Ogievskiy
---
 block.c | 78 +++--
 1 file changed, 76 insertions(+), 2 deletions(-)

diff --git a/block.c b/block.c
index 11f7ad0818..2fca1f2ad5 100644
--- a/block.c
+++ b/block.c
@@ -2929,12 +2929,19 @@ static void bdrv_replace_child(BdrvChild *child, BlockDriverState *new_bs)
     }
 }

+static void bdrv_child_free(void *opaque)
+{
+    BdrvChild *c = opaque;
+
+    g_free(c->name);
+    g_free(c);
+}
+
 static void bdrv_remove_empty_child(BdrvChild *child)
 {
     assert(!child->bs);
     QLIST_SAFE_REMOVE(child, next);
-    g_free(child->name);
-    g_free(child);
+    bdrv_child_free(child);
 }

 typedef struct BdrvAttachChildCommonState {
@@ -4956,6 +4963,73 @@ static bool should_update_child(BdrvChild *c, BlockDriverState *to)
     return ret;
 }

+typedef struct BdrvRemoveFilterOrCowChild {
+    BdrvChild *child;
+    bool is_backing;
+} BdrvRemoveFilterOrCowChild;
+
+/* this doesn't restore original child bs, only the child itself */

Hm, this comment tells me that it's intentional, but why is it correct?

That's because bdrv_remove_filter_or_cow_child_abort() aborts only part
of bdrv_remove_filter_or_cow_child(). Look:
bdrv_remove_filter_or_cow_child() first does
bdrv_replace_child_safe(child, NULL, tran);, so the bs would be restored
by the .abort() of the bdrv_replace_child_safe() action.

So, an improved comment may look like:

This doesn't restore the original child bs, only the child itself. The
bs would be restored by the .abort() of the bdrv_replace_child_safe()
subaction of the bdrv_remove_filter_or_cow_child() action.

Probably it would be more correct to rename

BdrvRemoveFilterOrCowChild -> BdrvRemoveFilterOrCowChildNoBs
bdrv_remove_filter_or_cow_child_abort -> bdrv_remove_filter_or_cow_child_no_bs_abort
bdrv_remove_filter_or_cow_child_commit -> bdrv_remove_filter_or_cow_child_no_bs_commit

and assert on .abort() and .commit() that s->child->bs is NULL.

+static void bdrv_remove_filter_or_cow_child_abort(void *opaque)
+{
+    BdrvRemoveFilterOrCowChild *s = opaque;
+    BlockDriverState *parent_bs = s->child->opaque;
+
+    QLIST_INSERT_HEAD(&parent_bs->children, s->child, next);
+    if (s->is_backing) {
+        parent_bs->backing = s->child;
+    } else {
+        parent_bs->file = s->child;
+    }
+}

Kevin

--
Best regards,
Vladimir
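The ordering argument in the reply above can be sketched with a toy model. This is not QEMU code: every `toy_*` / `Toy*` name is hypothetical, the node is reduced to an `int`, and the sketch assumes that a transaction runs its undo callbacks in reverse order of registration. Under that assumption the outer abort only relinks the child, and the earlier-registered replace sub-action restores the node afterwards.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model, not QEMU code: a transaction is a stack of undo callbacks. */
#define TOY_MAX_ACTIONS 8

typedef struct ToyTran {
    void (*abort_fn[TOY_MAX_ACTIONS])(void *opaque);
    void *opaque[TOY_MAX_ACTIONS];
    int n;
} ToyTran;

static void toy_tran_add(ToyTran *t, void (*fn)(void *), void *opaque)
{
    assert(t->n < TOY_MAX_ACTIONS);
    t->abort_fn[t->n] = fn;
    t->opaque[t->n] = opaque;
    t->n++;
}

/* Assumption: undo callbacks run in reverse order of registration. */
static void toy_tran_abort(ToyTran *t)
{
    while (t->n > 0) {
        t->n--;
        t->abort_fn[t->n](t->opaque[t->n]);
    }
}

typedef struct ToyChild {
    int bs;        /* 0 means "no node attached" */
    int saved_bs;  /* value to restore if the replace is aborted */
} ToyChild;

typedef struct ToyParent {
    ToyChild *backing;
} ToyParent;

typedef struct ToyRemoveState {
    ToyParent *parent;
    ToyChild *child;
} ToyRemoveState;

/* Undo of the "replace" sub-action: this is what restores the node. */
static void toy_replace_abort(void *opaque)
{
    ToyChild *c = opaque;
    c->bs = c->saved_bs;
}

static void toy_replace_child_safe(ToyChild *c, int new_bs, ToyTran *t)
{
    c->saved_bs = c->bs;
    c->bs = new_bs;
    toy_tran_add(t, toy_replace_abort, c);
}

/* Undo of the "remove" action: relinks the child only, not its node. */
static void toy_remove_abort(void *opaque)
{
    ToyRemoveState *s = opaque;
    assert(s->child->bs == 0);  /* node is restored later, by the sub-action */
    s->parent->backing = s->child;
}

static void toy_remove_child(ToyParent *p, ToyRemoveState *s, ToyTran *t)
{
    s->parent = p;
    s->child = p->backing;
    toy_replace_child_safe(s->child, 0, t);  /* step 1: detach the node */
    p->backing = NULL;                       /* step 2: unlink the child */
    toy_tran_add(t, toy_remove_abort, s);
}
```

Calling `toy_remove_child()` and then `toy_tran_abort()` leaves both the link and the node as they were, even though `toy_remove_abort()` never touches the node itself.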
Re: [PATCH v3 18/36] block: add bdrv_attach_child_common() transaction action
26.04.2021 19:14, Kevin Wolf wrote:
On 17.03.2021 at 15:35, Vladimir Sementsov-Ogievskiy wrote:

Split out no-perm part of bdrv_root_attach_child() into separate
transaction action. bdrv_root_attach_child() now moves to new
permission update paradigm: first update graph relations then update
permissions.

Signed-off-by: Vladimir Sementsov-Ogievskiy
---
 block.c | 189
 1 file changed, 135 insertions(+), 54 deletions(-)

diff --git a/block.c b/block.c
index 98ff44dbf7..b6bdc534d2 100644
--- a/block.c
+++ b/block.c
@@ -2921,37 +2921,73 @@ static void bdrv_replace_child(BdrvChild *child, BlockDriverState *new_bs)
     }
 }

-/*
- * This function steals the reference to child_bs from the caller.
- * That reference is later dropped by bdrv_root_unref_child().
- *
- * On failure NULL is returned, errp is set and the reference to
- * child_bs is also dropped.
- *
- * The caller must hold the AioContext lock @child_bs, but not that of @ctx
- * (unless @child_bs is already in @ctx).
- */
-BdrvChild *bdrv_root_attach_child(BlockDriverState *child_bs,
-                                  const char *child_name,
-                                  const BdrvChildClass *child_class,
-                                  BdrvChildRole child_role,
-                                  uint64_t perm, uint64_t shared_perm,
-                                  void *opaque, Error **errp)
+static void bdrv_remove_empty_child(BdrvChild *child)
 {
-    BdrvChild *child;
-    Error *local_err = NULL;
-    int ret;
-    AioContext *ctx;
+    assert(!child->bs);
+    QLIST_SAFE_REMOVE(child, next);
+    g_free(child->name);
+    g_free(child);
+}

-    ret = bdrv_check_update_perm(child_bs, NULL, perm, shared_perm, NULL, errp);
-    if (ret < 0) {
-        bdrv_abort_perm_update(child_bs);
-        bdrv_unref(child_bs);
-        return NULL;
+typedef struct BdrvAttachChildCommonState {
+    BdrvChild **child;
+    AioContext *old_parent_ctx;
+    AioContext *old_child_ctx;
+} BdrvAttachChildCommonState;
+
+static void bdrv_attach_child_common_abort(void *opaque)
+{
+    BdrvAttachChildCommonState *s = opaque;
+    BdrvChild *child = *s->child;
+    BlockDriverState *bs = child->bs;
+
+    bdrv_replace_child_noperm(child, NULL);
+
+    if (bdrv_get_aio_context(bs) != s->old_child_ctx) {
+        bdrv_try_set_aio_context(bs, s->old_child_ctx, &error_abort);
     }

-    child = g_new(BdrvChild, 1);
-    *child = (BdrvChild) {
+    if (bdrv_child_get_parent_aio_context(child) != s->old_parent_ctx) {
+        GSList *ignore = g_slist_prepend(NULL, child);
+
+        child->klass->can_set_aio_ctx(child, s->old_parent_ctx, &ignore,
+                                      &error_abort);
+        g_slist_free(ignore);
+        ignore = g_slist_prepend(NULL, child);
+        child->klass->set_aio_ctx(child, s->old_parent_ctx, &ignore);
+
+        g_slist_free(ignore);
+    }
+
+    bdrv_unref(bs);
+    bdrv_remove_empty_child(child);
+    *s->child = NULL;
+}
+
+static TransactionActionDrv bdrv_attach_child_common_drv = {
+    .abort = bdrv_attach_child_common_abort,
+};
+
+/*
+ * Common part of attaching bdrv child to bs or to blk or to job
+ */
+static int bdrv_attach_child_common(BlockDriverState *child_bs,
+                                    const char *child_name,
+                                    const BdrvChildClass *child_class,
+                                    BdrvChildRole child_role,
+                                    uint64_t perm, uint64_t shared_perm,
+                                    void *opaque, BdrvChild **child,
+                                    Transaction *tran, Error **errp)
+{
+    BdrvChild *new_child;
+    AioContext *parent_ctx;
+    AioContext *child_ctx = bdrv_get_aio_context(child_bs);
+
+    assert(child);
+    assert(*child == NULL);
+
+    new_child = g_new(BdrvChild, 1);
+    *new_child = (BdrvChild) {
         .bs = NULL,
         .name = g_strdup(child_name),
         .klass = child_class,
@@ -2961,37 +2997,92 @@ BdrvChild *bdrv_root_attach_child(BlockDriverState *child_bs,
         .opaque = opaque,
     };

-    ctx = bdrv_child_get_parent_aio_context(child);
-
-    /* If the AioContexts don't match, first try to move the subtree of
+    /*
+     * If the AioContexts don't match, first try to move the subtree of
      * child_bs into the AioContext of the new parent. If this doesn't work,
-     * try moving the parent into the AioContext of child_bs instead. */
-    if (bdrv_get_aio_context(child_bs) != ctx) {
-        ret = bdrv_try_set_aio_context(child_bs, ctx, &local_err);
+     * try moving the parent into the AioContext of child_bs instead.
+     */
+    parent_ctx = bdrv_child_get_parent_aio_context(new_child);
+    if (child_ctx != parent_ctx) {
+        Error *local_err = NULL;
+        int ret = bdrv_try_set_aio_context(child_bs, parent_ctx, &local_err);
+
         if (ret < 0 &&
Re: [PATCH v3 22/36] block: add bdrv_remove_filter_or_cow transaction action
On 17.03.2021 at 15:35, Vladimir Sementsov-Ogievskiy wrote:
> Signed-off-by: Vladimir Sementsov-Ogievskiy
> ---
>  block.c | 78 +++--
>  1 file changed, 76 insertions(+), 2 deletions(-)
>
> diff --git a/block.c b/block.c
> index 11f7ad0818..2fca1f2ad5 100644
> --- a/block.c
> +++ b/block.c
> @@ -2929,12 +2929,19 @@ static void bdrv_replace_child(BdrvChild *child, BlockDriverState *new_bs)
>      }
>  }
>
> +static void bdrv_child_free(void *opaque)
> +{
> +    BdrvChild *c = opaque;
> +
> +    g_free(c->name);
> +    g_free(c);
> +}
> +
>  static void bdrv_remove_empty_child(BdrvChild *child)
>  {
>      assert(!child->bs);
>      QLIST_SAFE_REMOVE(child, next);
> -    g_free(child->name);
> -    g_free(child);
> +    bdrv_child_free(child);
>  }
>
>  typedef struct BdrvAttachChildCommonState {
> @@ -4956,6 +4963,73 @@ static bool should_update_child(BdrvChild *c, BlockDriverState *to)
>      return ret;
>  }
>
> +typedef struct BdrvRemoveFilterOrCowChild {
> +    BdrvChild *child;
> +    bool is_backing;
> +} BdrvRemoveFilterOrCowChild;
> +
> +/* this doesn't restore original child bs, only the child itself */

Hm, this comment tells me that it's intentional, but why is it correct?

> +static void bdrv_remove_filter_or_cow_child_abort(void *opaque)
> +{
> +    BdrvRemoveFilterOrCowChild *s = opaque;
> +    BlockDriverState *parent_bs = s->child->opaque;
> +
> +    QLIST_INSERT_HEAD(&parent_bs->children, s->child, next);
> +    if (s->is_backing) {
> +        parent_bs->backing = s->child;
> +    } else {
> +        parent_bs->file = s->child;
> +    }
> +}

Kevin
Re: [PATCH v3 18/36] block: add bdrv_attach_child_common() transaction action
On 17.03.2021 at 15:35, Vladimir Sementsov-Ogievskiy wrote:
> Split out no-perm part of bdrv_root_attach_child() into separate
> transaction action. bdrv_root_attach_child() now moves to new
> permission update paradigm: first update graph relations then update
> permissions.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy
> ---
>  block.c | 189
>  1 file changed, 135 insertions(+), 54 deletions(-)
>
> diff --git a/block.c b/block.c
> index 98ff44dbf7..b6bdc534d2 100644
> --- a/block.c
> +++ b/block.c
> @@ -2921,37 +2921,73 @@ static void bdrv_replace_child(BdrvChild *child, BlockDriverState *new_bs)
>      }
>  }
>
> -/*
> - * This function steals the reference to child_bs from the caller.
> - * That reference is later dropped by bdrv_root_unref_child().
> - *
> - * On failure NULL is returned, errp is set and the reference to
> - * child_bs is also dropped.
> - *
> - * The caller must hold the AioContext lock @child_bs, but not that of @ctx
> - * (unless @child_bs is already in @ctx).
> - */
> -BdrvChild *bdrv_root_attach_child(BlockDriverState *child_bs,
> -                                  const char *child_name,
> -                                  const BdrvChildClass *child_class,
> -                                  BdrvChildRole child_role,
> -                                  uint64_t perm, uint64_t shared_perm,
> -                                  void *opaque, Error **errp)
> +static void bdrv_remove_empty_child(BdrvChild *child)
>  {
> -    BdrvChild *child;
> -    Error *local_err = NULL;
> -    int ret;
> -    AioContext *ctx;
> +    assert(!child->bs);
> +    QLIST_SAFE_REMOVE(child, next);
> +    g_free(child->name);
> +    g_free(child);
> +}
>
> -    ret = bdrv_check_update_perm(child_bs, NULL, perm, shared_perm, NULL, errp);
> -    if (ret < 0) {
> -        bdrv_abort_perm_update(child_bs);
> -        bdrv_unref(child_bs);
> -        return NULL;
> +typedef struct BdrvAttachChildCommonState {
> +    BdrvChild **child;
> +    AioContext *old_parent_ctx;
> +    AioContext *old_child_ctx;
> +} BdrvAttachChildCommonState;
> +
> +static void bdrv_attach_child_common_abort(void *opaque)
> +{
> +    BdrvAttachChildCommonState *s = opaque;
> +    BdrvChild *child = *s->child;
> +    BlockDriverState *bs = child->bs;
> +
> +    bdrv_replace_child_noperm(child, NULL);
> +
> +    if (bdrv_get_aio_context(bs) != s->old_child_ctx) {
> +        bdrv_try_set_aio_context(bs, s->old_child_ctx, &error_abort);
>      }
>
> -    child = g_new(BdrvChild, 1);
> -    *child = (BdrvChild) {
> +    if (bdrv_child_get_parent_aio_context(child) != s->old_parent_ctx) {
> +        GSList *ignore = g_slist_prepend(NULL, child);
> +
> +        child->klass->can_set_aio_ctx(child, s->old_parent_ctx, &ignore,
> +                                      &error_abort);
> +        g_slist_free(ignore);
> +        ignore = g_slist_prepend(NULL, child);
> +        child->klass->set_aio_ctx(child, s->old_parent_ctx, &ignore);
> +
> +        g_slist_free(ignore);
> +    }
> +
> +    bdrv_unref(bs);
> +    bdrv_remove_empty_child(child);
> +    *s->child = NULL;
> +}
> +
> +static TransactionActionDrv bdrv_attach_child_common_drv = {
> +    .abort = bdrv_attach_child_common_abort,
> +};
> +
> +/*
> + * Common part of attaching bdrv child to bs or to blk or to job
> + */
> +static int bdrv_attach_child_common(BlockDriverState *child_bs,
> +                                    const char *child_name,
> +                                    const BdrvChildClass *child_class,
> +                                    BdrvChildRole child_role,
> +                                    uint64_t perm, uint64_t shared_perm,
> +                                    void *opaque, BdrvChild **child,
> +                                    Transaction *tran, Error **errp)
> +{
> +    BdrvChild *new_child;
> +    AioContext *parent_ctx;
> +    AioContext *child_ctx = bdrv_get_aio_context(child_bs);
> +
> +    assert(child);
> +    assert(*child == NULL);
> +
> +    new_child = g_new(BdrvChild, 1);
> +    *new_child = (BdrvChild) {
>          .bs = NULL,
>          .name = g_strdup(child_name),
>          .klass = child_class,
> @@ -2961,37 +2997,92 @@ BdrvChild *bdrv_root_attach_child(BlockDriverState *child_bs,
>          .opaque = opaque,
>      };
>
> -    ctx = bdrv_child_get_parent_aio_context(child);
> -
> -    /* If the AioContexts don't match, first try to move the subtree of
> +    /*
> +     * If the AioContexts don't match, first try to move the subtree of
>       * child_bs into the AioContext of the new parent. If this doesn't work,
> -     * try moving the parent into the AioContext of child_bs instead. */
> -    if (bdrv_get_aio_context(child_bs) != ctx) {
> -        ret = bdrv_try_set_aio_context(child_bs, ctx, &local_err);
> +     * try moving the parent into the AioContext of child_bs instead.
> +     */
> +    parent_ctx =
Re: [PATCH for-6.0 v2 1/2] hw/block/nvme: fix invalid msix exclusive uninit
On Fri, 23 Apr 2021 at 06:21, Klaus Jensen wrote:
>
> From: Klaus Jensen
>
> Commit 1901b4967c3f changed the nvme device from using a bar exclusive
> for MSI-x to sharing it on bar0.
>
> Unfortunately, the msix_uninit_exclusive_bar() call remains in
> nvme_exit() which causes havoc when the device is removed with, say,
> device_del. Fix this.
>
> Additionally, a subregion is added but it is not removed on exit which
> causes a reference to linger and the drive to never be unlocked.
>
> Fixes: 1901b4967c3f ("hw/block/nvme: move msix table and pba to BAR 0")
> Signed-off-by: Klaus Jensen
> ---
>  hw/block/nvme.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> index 624a1431d072..5fe082ec34c5 100644
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -6235,7 +6235,8 @@ static void nvme_exit(PCIDevice *pci_dev)
>      if (n->pmr.dev) {
>          host_memory_backend_set_mapped(n->pmr.dev, false);
>      }
> -    msix_uninit_exclusive_bar(pci_dev);
> +    msix_uninit(pci_dev, &n->bar0, &n->bar0);
> +    memory_region_del_subregion(&n->bar0, &n->iomem);
>  }
>
>  static Property nvme_props[] = {
> --

Applied this patch (but not patch 2) to master for 6.0; thanks.

-- PMM
Re: [PATCH 2/5] hw/pcmcia/microdrive: Register machine reset handler
On 4/25/21 8:36 PM, Peter Maydell wrote:
> On Sat, 24 Apr 2021 at 17:22, Philippe Mathieu-Daudé wrote:
>>
>> The abstract PCMCIA_CARD is a bus-less TYPE_DEVICE, so devices
>> implementing it are not reset automatically.
>> Register a reset handler so children get reset on machine reset.
>>
>> Note, the DSCM-1 device (TYPE_DSCM1) which inherits
>> TYPE_MICRODRIVE and PCMCIA_CARD resets itself when a disk is
>> attached or detached, but was not resetting itself on machine
>> reset.
>>
>> It doesn't seem to be an issue because it has been that way since the
>> device QDev'ification 8 years ago, in commit d1f2c96a81a
>> ("pcmcia: QOM'ify PCMCIACardState and MicroDriveState").
>> Still, it is correct to have proper API usage.
>>
>> Signed-off-by: Philippe Mathieu-Daudé
>> ---
>>  hw/pcmcia/pcmcia.c | 25 +
>>  1 file changed, 25 insertions(+)
>>
>> diff --git a/hw/pcmcia/pcmcia.c b/hw/pcmcia/pcmcia.c
>> index 03d13e7d670..73656257227 100644
>> --- a/hw/pcmcia/pcmcia.c
>> +++ b/hw/pcmcia/pcmcia.c
>> @@ -6,14 +6,39 @@
>>
>>  #include "qemu/osdep.h"
>>  #include "qemu/module.h"
>> +#include "sysemu/reset.h"
>>  #include "hw/pcmcia.h"
>>
>> +static void pcmcia_card_reset_handler(void *dev)
>> +{
>> +    device_legacy_reset(DEVICE(dev));
>> +}
>> +
>> +static void pcmcia_card_realize(DeviceState *dev, Error **errp)
>> +{
>> +    qemu_register_reset(pcmcia_card_reset_handler, dev);
>> +}
>> +
>> +static void pcmcia_card_unrealize(DeviceState *dev)
>> +{
>> +    qemu_unregister_reset(pcmcia_card_reset_handler, dev);
>> +}
>
> Why isn't a pcmcia card something that plugs into a bus ?

No clue, looks like a very old device with unfinished qdev-ification?

See pxa2xx_pcmcia_attach():

  /* Insert a new card into a slot */
  int pxa2xx_pcmcia_attach(void *opaque, PCMCIACardState *card)
  {
      PXA2xxPCMCIAState *s = (PXA2xxPCMCIAState *) opaque;
      PCMCIACardClass *pcc;
      ...
      s->card = card;
      pcc = PCMCIA_CARD_GET_CLASS(s->card);
      ...
      s->card->slot = &s->slot;
      pcc->attach(s->card);
      ...
  }
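The register-on-realize / unregister-on-unrealize pairing from the quoted patch can be illustrated with a toy registry. This is a sketch only, not the QEMU implementation: all `toy_*` / `Toy*` names are hypothetical, and the fixed-size table merely stands in for whatever list the real reset code keeps.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model, not QEMU code: a fixed-size table of reset callbacks. */
#define TOY_MAX_HANDLERS 8

typedef void ToyResetFn(void *opaque);

static struct {
    ToyResetFn *fn;
    void *opaque;
} toy_handlers[TOY_MAX_HANDLERS];

static void toy_register_reset(ToyResetFn *fn, void *opaque)
{
    for (int i = 0; i < TOY_MAX_HANDLERS; i++) {
        if (!toy_handlers[i].fn) {
            toy_handlers[i].fn = fn;
            toy_handlers[i].opaque = opaque;
            return;
        }
    }
    assert(0 && "no free handler slot");
}

static void toy_unregister_reset(ToyResetFn *fn, void *opaque)
{
    for (int i = 0; i < TOY_MAX_HANDLERS; i++) {
        if (toy_handlers[i].fn == fn && toy_handlers[i].opaque == opaque) {
            toy_handlers[i].fn = NULL;
            toy_handlers[i].opaque = NULL;
            return;
        }
    }
}

/* A machine reset walks the table and calls every registered handler. */
static void toy_machine_reset(void)
{
    for (int i = 0; i < TOY_MAX_HANDLERS; i++) {
        if (toy_handlers[i].fn) {
            toy_handlers[i].fn(toy_handlers[i].opaque);
        }
    }
}

/* A bus-less device: nothing resets it unless it registers itself. */
typedef struct ToyCard {
    int state;
} ToyCard;

static void toy_card_reset(void *opaque)
{
    ToyCard *card = opaque;
    card->state = 0;
}

static void toy_card_realize(ToyCard *card)
{
    toy_register_reset(toy_card_reset, card);
}

static void toy_card_unrealize(ToyCard *card)
{
    toy_unregister_reset(toy_card_reset, card);
}
```

Unregistering in the unrealize path matters in this model because a handler left behind would be called with a pointer to a device that no longer exists.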
Re: [PATCH for-6.0 v2 1/2] hw/block/nvme: fix invalid msix exclusive uninit
On Mon, Apr 26, 2021 at 11:27:04AM +0200, Philippe Mathieu-Daudé wrote:
> On 4/26/21 6:40 AM, Klaus Jensen wrote:
> > On Apr 23 07:21, Klaus Jensen wrote:
> >> From: Klaus Jensen
> >>
> >> Commit 1901b4967c3f changed the nvme device from using a bar exclusive
> >> for MSI-x to sharing it on bar0.
> >>
> >> Unfortunately, the msix_uninit_exclusive_bar() call remains in
> >> nvme_exit() which causes havoc when the device is removed with, say,
> >> device_del. Fix this.
> >>
> >> Additionally, a subregion is added but it is not removed on exit which
> >> causes a reference to linger and the drive to never be unlocked.
> >>
> >> Fixes: 1901b4967c3f ("hw/block/nvme: move msix table and pba to BAR 0")
> >> Signed-off-by: Klaus Jensen

Reviewed-by: Michael S. Tsirkin

> >> ---
> >>  hw/block/nvme.c | 3 ++-
> >>  1 file changed, 2 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> >> index 624a1431d072..5fe082ec34c5 100644
> >> --- a/hw/block/nvme.c
> >> +++ b/hw/block/nvme.c
> >> @@ -6235,7 +6235,8 @@ static void nvme_exit(PCIDevice *pci_dev)
> >>      if (n->pmr.dev) {
> >>          host_memory_backend_set_mapped(n->pmr.dev, false);
> >>      }
> >> -    msix_uninit_exclusive_bar(pci_dev);
> >> +    msix_uninit(pci_dev, &n->bar0, &n->bar0);
> >> +    memory_region_del_subregion(&n->bar0, &n->iomem);
> >>  }
> >>
> >>  static Property nvme_props[] = {
> >> --
> >> 2.31.1
> >>
> >
> > Ping for a review on this please :)
>
> You forgot to Cc the maintainers :/ (doing it now).
>
> $ ./scripts/get_maintainer.pl -f include/hw/pci/msix.h
> "Michael S. Tsirkin" (supporter:PCI)
> Marcel Apfelbaum (supporter:PCI)
Re: [PATCH 03/11] block/block-gen.h: bind monitor
Vladimir Sementsov-Ogievskiy writes:

> 24.04.2021 08:23, Markus Armbruster wrote:
>> Vladimir Sementsov-Ogievskiy writes:
>>
>>> If we have current monitor, let's bind it to wrapper coroutine too.
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy
>>> ---
>>>  block/block-gen.h | 10 ++
>>>  1 file changed, 10 insertions(+)
>>>
>>> diff --git a/block/block-gen.h b/block/block-gen.h
>>> index c1fd3f40de..61f055a8cc 100644
>>> --- a/block/block-gen.h
>>> +++ b/block/block-gen.h
>>> @@ -27,6 +27,7 @@
>>>  #define BLOCK_BLOCK_GEN_H
>>>
>>>  #include "block/block_int.h"
>>> +#include "monitor/monitor.h"
>>>
>>>  /* Base structure for argument packing structures */
>>>  typedef struct AioPollCo {
>>> @@ -38,11 +39,20 @@ typedef struct AioPollCo {
>>>
>>>  static inline int aio_poll_co(AioPollCo *s)
>>>  {
>>> +    Monitor *mon = monitor_cur();
>>
>> This gets the currently executing coroutine's monitor from the hash
>> table.
>>
>>>      assert(!qemu_in_coroutine());
>>>
>>> +    if (mon) {
>>> +        monitor_set_cur(s->co, mon);
>>
>> This writes it back.  No-op, since the coroutine hasn't changed.  Why?
>
> No. s->co != qemu_coroutine_self(), so it's not a write back, but creating
> a new entry in the hash map. s->co is a new coroutine which we are going to
> start.

Ah, that's what I missed.  Thanks!

[...]
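The point that settled the exchange above (binding the monitor to `s->co` adds a new map entry instead of rewriting the caller's) can be sketched with a toy table keyed by coroutine. This is a simplified model, not the QEMU data structure: the `toy_*` / `Toy*` names are hypothetical and a small array stands in for the hash table.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model, not QEMU code: a tiny coroutine -> monitor map. */
#define TOY_SLOTS 4

typedef struct ToyCo { int id; } ToyCo;
typedef struct ToyMon { const char *name; } ToyMon;

static struct {
    ToyCo *co;
    ToyMon *mon;   /* NULL marks a free slot */
} toy_map[TOY_SLOTS];

static ToyCo *toy_current_co;  /* stands in for "the running coroutine" */

static ToyMon *toy_monitor_get(ToyCo *co)
{
    for (int i = 0; i < TOY_SLOTS; i++) {
        if (toy_map[i].mon && toy_map[i].co == co) {
            return toy_map[i].mon;
        }
    }
    return NULL;
}

/* Look up the monitor bound to whichever coroutine is running now. */
static ToyMon *toy_monitor_cur(void)
{
    return toy_monitor_get(toy_current_co);
}

/* Bind a monitor to a specific coroutine: update or create an entry. */
static void toy_monitor_set(ToyCo *co, ToyMon *mon)
{
    int free_slot = -1;

    for (int i = 0; i < TOY_SLOTS; i++) {
        if (toy_map[i].mon && toy_map[i].co == co) {
            toy_map[i].mon = mon;  /* same key: overwrite in place */
            return;
        }
        if (!toy_map[i].mon && free_slot < 0) {
            free_slot = i;
        }
    }
    assert(free_slot >= 0);
    toy_map[free_slot].co = co;    /* different key: new entry */
    toy_map[free_slot].mon = mon;
}

static int toy_entries(void)
{
    int n = 0;

    for (int i = 0; i < TOY_SLOTS; i++) {
        n += toy_map[i].mon != NULL;
    }
    return n;
}
```

Because the wrapper coroutine is a different key from the caller, setting it grows the map by one entry, and a later lookup from inside the wrapper finds the same monitor.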
Re: [PATCH v6 00/12] qcow2: fix parallel rewrite and discard (lockless)
22.04.2021 19:30, Vladimir Sementsov-Ogievskiy wrote:

Hi all!

It's an alternative lock-less solution to
[PATCH v4 0/3] qcow2: fix parallel rewrite and discard (rw-lock)

In v6 a lot of things are rewritten. What is changed:

1. rename the feature to host_range_refcnt, move it to a separate file
2. better naming for everything (I hope)
3. cover reads, not only writes
4. do "ref" in qcow2_get_host_offset(), qcow2_alloc_host_offset() and
   qcow2_alloc_compressed_cluster_offset(); callers do "unref"
   appropriately.

About performance.

With this series we do extra allocations and hash-map operations.
Still, testing with

  ./build/qemu-img bench -c 100 -s 4K --image-opts driver=null-co,size=5G

and

  ./build/qemu-img bench -c 100 -s 4K -w --image-opts driver=null-co,size=5G

I see a difference of less than 1%.

--
Best regards,
Vladimir
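The "ref"/"unref" idea in item 4 can be pictured with a toy model. The cover letter does not show the implementation, so everything here is an assumption made for illustration: the `toy_*` names are hypothetical, a plain array replaces the hash map, and the rule that a free requested while a range is in flight is deferred until the last unref is a guess at the intent behind fixing parallel rewrite and discard.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not qcow2 code: per-cluster in-flight reference counts. */
#define TOY_CLUSTERS 16

typedef struct ToyRange {
    unsigned inflight;     /* readers/writers currently using the range */
    bool discard_pending;  /* a free arrived while the range was in use */
    bool allocated;
} ToyRange;

static ToyRange toy_ranges[TOY_CLUSTERS];

/* Taken when a request resolves a guest offset to this host range. */
static void toy_range_ref(unsigned idx)
{
    assert(idx < TOY_CLUSTERS);
    toy_ranges[idx].inflight++;
}

/* Assumption: freeing a range that is still in use is deferred. */
static void toy_range_free(unsigned idx)
{
    assert(idx < TOY_CLUSTERS);
    if (toy_ranges[idx].inflight > 0) {
        toy_ranges[idx].discard_pending = true;
    } else {
        toy_ranges[idx].allocated = false;
    }
}

/* Dropped when the request completes; the last unref runs a deferred free. */
static void toy_range_unref(unsigned idx)
{
    assert(idx < TOY_CLUSTERS);
    assert(toy_ranges[idx].inflight > 0);
    toy_ranges[idx].inflight--;
    if (toy_ranges[idx].inflight == 0 && toy_ranges[idx].discard_pending) {
        toy_ranges[idx].discard_pending = false;
        toy_ranges[idx].allocated = false;
    }
}
```

In this model a range cannot be handed out again while an I/O request still targets it, and no lock is held across the request.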
Re: [PATCH] hw/block/nvme: fix csi field for cns 0x00 and 0x11
On Apr 26 13:16, Gollu Appalanaidu wrote:

As per TP 4056d (Namespace Types), the CSI field shouldn't be used for
CNS 0x00 and CNS 0x11, but it is being used for these two Identify
command CNS values. Fix that.

Signed-off-by: Gollu Appalanaidu
---
 hw/nvme/ctrl.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index 2e7498a73e..1657b1d04a 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -4244,11 +4244,16 @@ static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeRequest *req, bool active)
         }
     }

-    if (c->csi == NVME_CSI_NVM && nvme_csi_has_nvm_support(ns)) {
-        return nvme_c2h(n, (uint8_t *)&ns->id_ns, sizeof(NvmeIdNs), req);
+    if (active && nvme_csi_has_nvm_support(ns)) {
+        goto out;
+    } else if (!active && ns->csi == NVME_CSI_NVM) {
+        goto out;
+    } else {
+        return NVME_INVALID_CMD_SET | NVME_DNR;
     }

-    return NVME_INVALID_CMD_SET | NVME_DNR;
+out:
+    return nvme_c2h(n, (uint8_t *)&ns->id_ns, sizeof(NvmeIdNs), req);
 }

 static uint16_t nvme_identify_ns_attached_list(NvmeCtrl *n, NvmeRequest *req)
--
2.17.1

Looking closer at this, since we only support the NVM and Zoned command
sets, we can get rid of the `nvme_csi_has_nvm_support()` helper and
just assume NVM command set support for all namespaces.

The way different command sets are handled doesn't scale anyway, so we
might as well simplify the logic a bit.

Something like this (compile-tested only) patch maybe?

diff --git i/hw/nvme/ctrl.c w/hw/nvme/ctrl.c
index 2e7498a73e70..7fcd6992358d 100644
--- i/hw/nvme/ctrl.c
+++ w/hw/nvme/ctrl.c
@@ -4178,16 +4178,6 @@ static uint16_t nvme_rpt_empty_id_struct(NvmeCtrl *n, NvmeRequest *req)
     return nvme_c2h(n, id, sizeof(id), req);
 }

-static inline bool nvme_csi_has_nvm_support(NvmeNamespace *ns)
-{
-    switch (ns->csi) {
-    case NVME_CSI_NVM:
-    case NVME_CSI_ZONED:
-        return true;
-    }
-    return false;
-}
-
 static uint16_t nvme_identify_ctrl(NvmeCtrl *n, NvmeRequest *req)
 {
     trace_pci_nvme_identify_ctrl();
@@ -4244,7 +4234,7 @@ static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeRequest *req, bool active)
         }
     }

-    if (c->csi == NVME_CSI_NVM && nvme_csi_has_nvm_support(ns)) {
+    if (active || ns->csi == NVME_CSI_NVM) {
         return nvme_c2h(n, (uint8_t *)&ns->id_ns, sizeof(NvmeIdNs), req);
     }

@@ -4315,7 +4305,7 @@ static uint16_t nvme_identify_ns_csi(NvmeCtrl *n, NvmeRequest *req,
         }
     }

-    if (c->csi == NVME_CSI_NVM && nvme_csi_has_nvm_support(ns)) {
+    if (c->csi == NVME_CSI_NVM) {
         return nvme_rpt_empty_id_struct(n, req);
     } else if (c->csi == NVME_CSI_ZONED && ns->csi == NVME_CSI_ZONED) {
         return nvme_c2h(n, (uint8_t *)ns->id_ns_zoned, sizeof(NvmeIdNsZoned),
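Whether the simplified condition in `nvme_identify_ns()` matches the posted one can be checked with a small truth-table sketch. This is a standalone model, not the device code: the `toy_*` names are hypothetical, and only the two command sets mentioned in the thread (NVM and Zoned) are modelled, which is exactly the assumption the simplification rests on.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the two supported command set identifiers. */
enum toy_csi {
    TOY_CSI_NVM = 0x00,
    TOY_CSI_ZONED = 0x02,
};

/* Mirrors the helper the follow-up patch removes. */
static bool toy_has_nvm_support(enum toy_csi csi)
{
    return csi == TOY_CSI_NVM || csi == TOY_CSI_ZONED;
}

/* Decision in the posted patch: true means "return Identify data". */
static bool toy_ok_posted(bool active, enum toy_csi csi)
{
    if (active && toy_has_nvm_support(csi)) {
        return true;
    }
    if (!active && csi == TOY_CSI_NVM) {
        return true;
    }
    return false;  /* would map to Invalid Command Set */
}

/* Decision in the simplified follow-up. */
static bool toy_ok_simplified(bool active, enum toy_csi csi)
{
    return active || csi == TOY_CSI_NVM;
}

/* Exhaustively compare both variants over the modelled inputs. */
static bool toy_variants_agree(void)
{
    const enum toy_csi all[] = { TOY_CSI_NVM, TOY_CSI_ZONED };

    for (int active = 0; active <= 1; active++) {
        for (unsigned i = 0; i < sizeof(all) / sizeof(all[0]); i++) {
            if (toy_ok_posted(active, all[i]) !=
                toy_ok_simplified(active, all[i])) {
                return false;
            }
        }
    }
    return true;
}
```

The two variants agree only because every modelled namespace has NVM support; a third command set without it would make them diverge in the `active` case.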
Re: [PATCH 1/2] block/export: Free ignored Error
26.04.2021 13:33, Max Reitz wrote:
On 26.04.21 11:44, Vladimir Sementsov-Ogievskiy wrote:
22.04.2021 17:53, Max Reitz wrote:

When invoking block-export-add with some iothread and
fixed-iothread=false, and changing the node's iothread fails, the error
is supposed to be ignored.

However, it is still stored in *errp, which is wrong. If a second error
occurs, the "*errp must be NULL" assertion in error_setv() fails:

  qemu-system-x86_64: ../util/error.c:59: error_setv: Assertion
  `*errp == NULL' failed.

So the error from bdrv_try_set_aio_context() must be freed when it is
ignored.

Fixes: f51d23c80af73c95e0ce703ad06a300f1b3d63ef
       ("block/export: add iothread and fixed-iothread options")
Signed-off-by: Max Reitz
---
 block/export/export.c | 4
 1 file changed, 4 insertions(+)

diff --git a/block/export/export.c b/block/export/export.c
index fec7d9f738..ce5dd3e59b 100644
--- a/block/export/export.c
+++ b/block/export/export.c
@@ -68,6 +68,7 @@ static const BlockExportDriver *blk_exp_find_driver(BlockExportType type)

 BlockExport *blk_exp_add(BlockExportOptions *export, Error **errp)
 {
+    ERRP_GUARD();
     bool fixed_iothread = export->has_fixed_iothread && export->fixed_iothread;
     const BlockExportDriver *drv;
     BlockExport *exp = NULL;
@@ -127,6 +128,9 @@ BlockExport *blk_exp_add(BlockExportOptions *export, Error **errp)
             ctx = new_ctx;
         } else if (fixed_iothread) {
             goto fail;
+        } else {
+            error_free(*errp);
+            *errp = NULL;
         }
     }

I don't think ERRP_GUARD is needed in this case: we don't need to
handle errp in any way except to free it if it was set.

Perhaps not, but style-wise, I prefer not special-casing the
errp == NULL case.

(It can be argued that ERRP_GUARD similarly special-cases it, but
that’s hidden from my view. Also, the errp == NULL case actually
doesn’t even happen, so ERRP_GUARD is effectively a no-op and it won’t
cost performance (not that it really matters).)

Hm. I don't know. Maybe you are right. Actually, I don't care too much,
so the patch is OK as is:

Reviewed-by: Vladimir Sementsov-Ogievskiy

Of course we could also do this:

  ret = bdrv_try_set_aio_context(bs, new_ctx, fixed_iothread ? errp : NULL);

Would be even shorter.

So we can simply do:

  } else if (errp) {
      error_free(*errp);
      *errp = NULL;
  }

Let's only check that errp is really set on the failure path of
bdrv_try_set_aio_context():

OK, but out of interest, why? error_free() doesn’t care. I mean it
might be a problem if blk_exp_add() returns an error without setting
*errp, but that’d’ve been pre-existing.

I remember we still have some functions not setting errp on some error
paths. bdrv_open_driver() has a work-around for such bad .*open
handlers of some drivers. So I decided to look through.

bdrv_try_set_aio_context() fails iff bdrv_can_set_aio_context() fails,
which in turn may fail iff bdrv_parent_can_set_aio_context() or
bdrv_child_can_set_aio_context() fails.

bdrv_parent_can_set_aio_context() has two failure paths: on the first
it sets errp by hand, and on the second it has an assertion that errp
is set.

bdrv_child_can_set_aio_context() may fail only if the nested call to
bdrv_can_set_aio_context() fails, so the recursion is closed.

--
Best regards,
Vladimir
Re: [PATCH 1/2] block/export: Free ignored Error
On 26.04.21 11:44, Vladimir Sementsov-Ogievskiy wrote:
> 22.04.2021 17:53, Max Reitz wrote:
>> When invoking block-export-add with some iothread and
>> fixed-iothread=false, and changing the node's iothread fails, the error
>> is supposed to be ignored.
>>
>> However, it is still stored in *errp, which is wrong. If a second error
>> occurs, the "*errp must be NULL" assertion in error_setv() fails:
>>
>> qemu-system-x86_64: ../util/error.c:59: error_setv: Assertion
>> `*errp == NULL' failed.
>>
>> So the error from bdrv_try_set_aio_context() must be freed when it is
>> ignored.
>>
>> Fixes: f51d23c80af73c95e0ce703ad06a300f1b3d63ef
>>        ("block/export: add iothread and fixed-iothread options")
>> Signed-off-by: Max Reitz
>> ---
>>  block/export/export.c | 4 ++++
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/block/export/export.c b/block/export/export.c
>> index fec7d9f738..ce5dd3e59b 100644
>> --- a/block/export/export.c
>> +++ b/block/export/export.c
>> @@ -68,6 +68,7 @@ static const BlockExportDriver *blk_exp_find_driver(BlockExportType type)
>>  BlockExport *blk_exp_add(BlockExportOptions *export, Error **errp)
>>  {
>> +    ERRP_GUARD();
>>      bool fixed_iothread = export->has_fixed_iothread && export->fixed_iothread;
>>      const BlockExportDriver *drv;
>>      BlockExport *exp = NULL;
>> @@ -127,6 +128,9 @@ BlockExport *blk_exp_add(BlockExportOptions *export, Error **errp)
>>              ctx = new_ctx;
>>          } else if (fixed_iothread) {
>>              goto fail;
>> +        } else {
>> +            error_free(*errp);
>> +            *errp = NULL;
>>          }
>>      }
>
> I don't think ERRP_GUARD is needed in this case: we don't need to handle
> errp in any way other than freeing it if it was set.

Perhaps not, but style-wise, I prefer not special-casing the
errp == NULL case.

(It can be argued that ERRP_GUARD similarly special-cases it, but that’s
hidden from my view. Also, the errp == NULL case actually doesn’t even
happen, so ERRP_GUARD is effectively a no-op and it won’t cost
performance (not that it really matters).)

Of course we could also do this:

ret = bdrv_try_set_aio_context(bs, new_ctx, fixed_iothread ? errp : NULL);

Would be even shorter.

> So we can simply do:
>
> } else if (errp) {
>     error_free(*errp);
>     *errp = NULL;
> }
>
> Let's only check that errp is really set on the failure path of
> bdrv_try_set_aio_context():

OK, but out of interest, why? error_free() doesn’t care. I mean it
might be a problem if blk_exp_add() returns an error without setting
*errp, but that’d’ve been pre-existing.

Max

> bdrv_try_set_aio_context() fails iff bdrv_can_set_aio_context() fails,
> which in turn may fail iff bdrv_parent_can_set_aio_context() or
> bdrv_child_can_set_aio_context() fails.
>
> bdrv_parent_can_set_aio_context() has two failure paths: on the first it
> sets errp by hand, and on the second it has an assertion that errp is set.
>
> bdrv_child_can_set_aio_context() may fail only if the nested call to
> bdrv_can_set_aio_context() fails, so the recursion is closed.
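The variants discussed in this thread (free the ignored error, optionally guarded by `if (errp)`, or pass NULL when the failure is not fatal) can be compared in a small stand-alone sketch. Everything prefixed `toy_` is a hypothetical stand-in written for illustration; it is not QEMU's Error API and only mimics the `*errp == NULL` assertion quoted above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for QEMU's Error object (assumption, not the real type). */
typedef struct ToyError {
    const char *msg;
} ToyError;

/* Like error_setv(): storing into an already-set *errp trips an assertion. */
static void toy_error_set(ToyError **errp, const char *msg)
{
    if (!errp) {
        return;                     /* caller does not want the error */
    }
    assert(*errp == NULL);          /* the assertion from the bug report */
    *errp = malloc(sizeof(**errp));
    assert(*errp != NULL);
    (*errp)->msg = msg;
}

static void toy_error_free(ToyError *err)
{
    free(err);                      /* free(NULL) is a no-op */
}

/* Hypothetical step that fails on request. */
static int toy_step(int fail, const char *msg, ToyError **errp)
{
    if (fail) {
        toy_error_set(errp, msg);
        return -1;
    }
    return 0;
}

/* Variant A: let the first step set *errp, then free it when it is ignored. */
static int toy_add_free_ignored(int fixed, int fail1, int fail2,
                                ToyError **errp)
{
    if (toy_step(fail1, "cannot change iothread", errp) < 0) {
        if (fixed) {
            return -1;
        } else if (errp) {          /* not needed if errp is never NULL */
            toy_error_free(*errp);
            *errp = NULL;
        }
    }
    return toy_step(fail2, "later failure", errp);
}

/* Variant B: only hand errp down when the failure would be fatal. */
static int toy_add_pass_null(int fixed, int fail1, int fail2,
                             ToyError **errp)
{
    if (toy_step(fail1, "cannot change iothread", fixed ? errp : NULL) < 0
        && fixed) {
        return -1;
    }
    return toy_step(fail2, "later failure", errp);
}
```

In both variants a second failure can be reported after an ignored first one without hitting the assertion. Dropping the free in variant A would reproduce the reported crash in this toy model.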
Re: [PATCH 2/2] iotests/307: Test iothread conflict for exports
22.04.2021 17:53, Max Reitz wrote:
> Passing fixed-iothread=true should make iothread conflicts fatal,
> whereas fixed-iothread=false should not.
>
> Combine the second case with an error condition that is checked after
> the iothread is handled, to verify that qemu does not crash if there is
> such an error after changing the iothread failed.
>
> Signed-off-by: Max Reitz

Reviewed-by: Vladimir Sementsov-Ogievskiy
Tested-by: Vladimir Sementsov-Ogievskiy

--
Best regards,
Vladimir
Re: [PATCH 0/2] iotests/qsd-jobs: Use common.qemu for the QSD
On Thu, Apr 01, 2021 at 03:28:13PM +0200, Max Reitz wrote:
> (Alternative to: “iotests/qsd-jobs: Filter events in the first test”)
>
> Hi,
>
> The qsd-jobs test has kind of unreliable output, because sometimes the
> job is ready before ‘quit’, and sometimes it is not. This series
> presents one approach to fix that, which is to extend common.qemu to
> allow running the storage daemon instead of qemu, and then to use that
> in qsd-jobs to wait for the BLOCK_JOB_READY event before issuing the
> ‘quit’ command.
>
> I took patch 1 from my “qcow2: Improve refcount structure rebuilding”
> series.
> (https://lists.nongnu.org/archive/html/qemu-block/2021-03/msg00654.html)
>
> As noted above, this series is an alternative to “iotests/qsd-jobs:
> Filter events in the first test”. I like this series here better
> because I’d prefer it if tests that do QMP actually check the output so
> they control what’s really happening.
> On the other hand, this may be too complicated for 6.0, and we might
> want to fix qsd-jobs in 6.0.
>
>
> Max Reitz (2):
>   iotests/common.qemu: Allow using the QSD
>   iotests/qsd-jobs: Use common.qemu for the QSD
>
>  tests/qemu-iotests/common.qemu        | 53 +-
>  tests/qemu-iotests/tests/qsd-jobs     | 55 ---
>  tests/qemu-iotests/tests/qsd-jobs.out | 10 -
>  3 files changed, 92 insertions(+), 26 deletions(-)
>
> --
> 2.29.2
>

Acked-by: Stefan Hajnoczi
Re: [PATCH 1/2] block/export: Free ignored Error
22.04.2021 17:53, Max Reitz wrote:
> When invoking block-export-add with some iothread and
> fixed-iothread=false, and changing the node's iothread fails, the error
> is supposed to be ignored.
>
> However, it is still stored in *errp, which is wrong. If a second error
> occurs, the "*errp must be NULL" assertion in error_setv() fails:
>
> qemu-system-x86_64: ../util/error.c:59: error_setv: Assertion
> `*errp == NULL' failed.
>
> So the error from bdrv_try_set_aio_context() must be freed when it is
> ignored.
>
> Fixes: f51d23c80af73c95e0ce703ad06a300f1b3d63ef
>        ("block/export: add iothread and fixed-iothread options")
> Signed-off-by: Max Reitz
> ---
>  block/export/export.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/block/export/export.c b/block/export/export.c
> index fec7d9f738..ce5dd3e59b 100644
> --- a/block/export/export.c
> +++ b/block/export/export.c
> @@ -68,6 +68,7 @@ static const BlockExportDriver *blk_exp_find_driver(BlockExportType type)
>  BlockExport *blk_exp_add(BlockExportOptions *export, Error **errp)
>  {
> +    ERRP_GUARD();
>      bool fixed_iothread = export->has_fixed_iothread && export->fixed_iothread;
>      const BlockExportDriver *drv;
>      BlockExport *exp = NULL;
> @@ -127,6 +128,9 @@ BlockExport *blk_exp_add(BlockExportOptions *export, Error **errp)
>              ctx = new_ctx;
>          } else if (fixed_iothread) {
>              goto fail;
> +        } else {
> +            error_free(*errp);
> +            *errp = NULL;
>          }
>      }

I don't think ERRP_GUARD is needed in this case: we don't need to handle
errp in any way other than freeing it if it was set. So we can simply do:

} else if (errp) {
    error_free(*errp);
    *errp = NULL;
}

Let's only check that errp is really set on the failure path of
bdrv_try_set_aio_context():

bdrv_try_set_aio_context() fails iff bdrv_can_set_aio_context() fails,
which in turn may fail iff bdrv_parent_can_set_aio_context() or
bdrv_child_can_set_aio_context() fails.

bdrv_parent_can_set_aio_context() has two failure paths: on the first it
sets errp by hand, and on the second it has an assertion that errp is set.

bdrv_child_can_set_aio_context() may fail only if the nested call to
bdrv_can_set_aio_context() fails, so the recursion is closed.

--
Best regards,
Vladimir
Re: [PATCH for-6.0 v2 1/2] hw/block/nvme: fix invalid msix exclusive uninit
On Apr 26 11:27, Philippe Mathieu-Daudé wrote:
> On 4/26/21 6:40 AM, Klaus Jensen wrote:
>> On Apr 23 07:21, Klaus Jensen wrote:
>>> From: Klaus Jensen
>>>
>>> Commit 1901b4967c3f changed the nvme device from using a bar exclusive
>>> for MSI-x to sharing it on bar0.
>>>
>>> Unfortunately, the msix_uninit_exclusive_bar() call remains in
>>> nvme_exit() which causes havoc when the device is removed with, say,
>>> device_del. Fix this.
>>>
>>> Additionally, a subregion is added but it is not removed on exit which
>>> causes a reference to linger and the drive to never be unlocked.
>>>
>>> Fixes: 1901b4967c3f ("hw/block/nvme: move msix table and pba to BAR 0")
>>> Signed-off-by: Klaus Jensen
>>> ---
>>>  hw/block/nvme.c | 3 ++-
>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
>>> index 624a1431d072..5fe082ec34c5 100644
>>> --- a/hw/block/nvme.c
>>> +++ b/hw/block/nvme.c
>>> @@ -6235,7 +6235,8 @@ static void nvme_exit(PCIDevice *pci_dev)
>>>      if (n->pmr.dev) {
>>>          host_memory_backend_set_mapped(n->pmr.dev, false);
>>>      }
>>> -    msix_uninit_exclusive_bar(pci_dev);
>>> +    msix_uninit(pci_dev, &n->bar0, &n->bar0);
>>> +    memory_region_del_subregion(&n->bar0, &n->iomem);
>>>  }
>>>
>>>  static Property nvme_props[] = {
>>> --
>>> 2.31.1
>>
>> Ping for a review on this please :)
>
> You forgot to Cc the maintainers :/ (doing it now).
>
> $ ./scripts/get_maintainer.pl -f include/hw/pci/msix.h
> "Michael S. Tsirkin" (supporter:PCI)
> Marcel Apfelbaum (supporter:PCI)

I didn't consider CC'ing the PCI maintainers directly, but it makes total
sense here, thanks!
Re: [PATCH for-6.0 v2 1/2] hw/block/nvme: fix invalid msix exclusive uninit
On 4/26/21 6:40 AM, Klaus Jensen wrote:
> On Apr 23 07:21, Klaus Jensen wrote:
>> From: Klaus Jensen
>>
>> Commit 1901b4967c3f changed the nvme device from using a bar exclusive
>> for MSI-x to sharing it on bar0.
>>
>> Unfortunately, the msix_uninit_exclusive_bar() call remains in
>> nvme_exit() which causes havoc when the device is removed with, say,
>> device_del. Fix this.
>>
>> Additionally, a subregion is added but it is not removed on exit which
>> causes a reference to linger and the drive to never be unlocked.
>>
>> Fixes: 1901b4967c3f ("hw/block/nvme: move msix table and pba to BAR 0")
>> Signed-off-by: Klaus Jensen
>> ---
>>  hw/block/nvme.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/hw/block/nvme.c b/hw/block/nvme.c
>> index 624a1431d072..5fe082ec34c5 100644
>> --- a/hw/block/nvme.c
>> +++ b/hw/block/nvme.c
>> @@ -6235,7 +6235,8 @@ static void nvme_exit(PCIDevice *pci_dev)
>>      if (n->pmr.dev) {
>>          host_memory_backend_set_mapped(n->pmr.dev, false);
>>      }
>> -    msix_uninit_exclusive_bar(pci_dev);
>> +    msix_uninit(pci_dev, &n->bar0, &n->bar0);
>> +    memory_region_del_subregion(&n->bar0, &n->iomem);
>>  }
>>
>>  static Property nvme_props[] = {
>> --
>> 2.31.1
>>
>
> Ping for a review on this please :)

You forgot to Cc the maintainers :/ (doing it now).

$ ./scripts/get_maintainer.pl -f include/hw/pci/msix.h
"Michael S. Tsirkin" (supporter:PCI)
Marcel Apfelbaum (supporter:PCI)
Re: [PATCH 03/11] block/block-gen.h: bind monitor
24.04.2021 08:23, Markus Armbruster wrote:
> Vladimir Sementsov-Ogievskiy writes:
>
>> If we have current monitor, let's bind it to wrapper coroutine too.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy
>> ---
>>  block/block-gen.h | 10 ++++++++++
>>  1 file changed, 10 insertions(+)
>>
>> diff --git a/block/block-gen.h b/block/block-gen.h
>> index c1fd3f40de..61f055a8cc 100644
>> --- a/block/block-gen.h
>> +++ b/block/block-gen.h
>> @@ -27,6 +27,7 @@
>>  #define BLOCK_BLOCK_GEN_H
>>
>>  #include "block/block_int.h"
>> +#include "monitor/monitor.h"
>>
>>  /* Base structure for argument packing structures */
>>  typedef struct AioPollCo {
>> @@ -38,11 +39,20 @@ typedef struct AioPollCo {
>>
>>  static inline int aio_poll_co(AioPollCo *s)
>>  {
>> +    Monitor *mon = monitor_cur();
>
> This gets the currently executing coroutine's monitor from the hash
> table.
>
>>      assert(!qemu_in_coroutine());
>>
>> +    if (mon) {
>> +        monitor_set_cur(s->co, mon);
>
> This writes it back. No-op, since the coroutine hasn't changed. Why?

No. s->co != qemu_coroutine_current(), so it's not a write back, but
creating a new entry in the hash map. s->co is a new coroutine which we
are going to start.

>> +    }
>> +
>>      aio_co_enter(s->ctx, s->co);
>>      AIO_WAIT_WHILE(s->ctx, s->in_progress);
>>
>> +    if (mon) {
>> +        monitor_set_cur(s->co, NULL);
>
> This removes s->co's monitor from the hash table. Why?
>
>> +    }
>> +
>>      return s->ret;
>>  }

If I comment out the new code of this patch (keeping the whole series
applied), 249 fails, as the error message simply goes to stderr, not to
the monitor:

249   fail   [11:56:54] [11:56:54]   0.3s   (last: 0.2s)   output mismatch (see 249.out.bad)
--- /work/src/qemu/up/hmp-qemu-io/tests/qemu-iotests/249.out
+++ 249.out.bad
@@ -9,7 +9,8 @@
 { 'execute': 'human-monitor-command',
        'arguments': {'command-line': 'qemu-io none0 "aio_write 0 2k"'}}
-{"return": "Block node is read-only\r\n"}
+QEMU_PROG: Block node is read-only
+{"return": ""}

 === Run block-commit on base using an invalid filter node name

@@ -24,7 +25,8 @@
 { 'execute': 'human-monitor-command',
        'arguments': {'command-line': 'qemu-io none0 "aio_write 0 2k"'}}
-{"return": "Block node is read-only\r\n"}
+QEMU_PROG: Block node is read-only
+{"return": ""}

 === Run block-commit on base using the default filter node name

@@ -43,5 +45,6 @@
 { 'execute': 'human-monitor-command',
        'arguments': {'command-line': 'qemu-io none0 "aio_write 0 2k"'}}
-{"return": "Block node is read-only\r\n"}
+QEMU_PROG: Block node is read-only
+{"return": ""}
 *** done
Failures: 249
Failed 1 of 1 iotests

--
Best regards,
Vladimir
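The point made in the reply (the wrapper coroutine is a different key than the caller, so setting its monitor adds an entry instead of rewriting one) can be illustrated with a toy table. This is a minimal sketch under stated assumptions: `ToyCo`, `ToyMon`, `toy_monitor_get()`, `toy_monitor_set()` and `toy_poll_co()` are hypothetical names, and a linear array stands in for the hash table mentioned in the thread.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_SLOTS 8

typedef struct ToyCo { int id; } ToyCo;               /* stand-in coroutine */
typedef struct ToyMon { const char *name; } ToyMon;   /* stand-in monitor */

/* Hypothetical coroutine -> monitor table (array instead of a hash table). */
static struct {
    ToyCo *co;
    ToyMon *mon;
} toy_map[TOY_SLOTS];

/* Look up the monitor bound to @co, or NULL if there is none. */
static ToyMon *toy_monitor_get(const ToyCo *co)
{
    for (size_t i = 0; i < TOY_SLOTS; i++) {
        if (toy_map[i].co == co) {
            return toy_map[i].mon;
        }
    }
    return NULL;
}

/* Bind @mon to @co; passing NULL removes the entry again. */
static void toy_monitor_set(ToyCo *co, ToyMon *mon)
{
    size_t free_slot = TOY_SLOTS;

    for (size_t i = 0; i < TOY_SLOTS; i++) {
        if (toy_map[i].co == co) {
            toy_map[i].mon = mon;
            if (!mon) {
                toy_map[i].co = NULL;
            }
            return;
        }
        if (!toy_map[i].co && free_slot == TOY_SLOTS) {
            free_slot = i;
        }
    }
    if (mon) {
        assert(free_slot < TOY_SLOTS);
        toy_map[free_slot].co = co;
        toy_map[free_slot].mon = mon;
    }
}

/* Mimics the patched wrapper; returns the monitor the worker would see. */
static ToyMon *toy_poll_co(ToyCo *caller, ToyCo *worker)
{
    ToyMon *mon = toy_monitor_get(caller);
    ToyMon *seen;

    if (mon) {
        toy_monitor_set(worker, mon);   /* new entry: worker != caller */
    }
    seen = toy_monitor_get(worker);     /* lookup done "inside" the worker */
    if (mon) {
        toy_monitor_set(worker, NULL);  /* drop the temporary binding */
    }
    return seen;
}
```

Without the binding step the worker's lookup returns NULL, which in this toy model corresponds to the message going to stderr instead of the monitor in the failing 249 run.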
Re: [PATCH v3 06/33] util/async: aio_co_schedule(): support reschedule in same ctx
23.04.2021 13:09, Roman Kagan wrote:
> On Fri, Apr 16, 2021 at 11:08:44AM +0300, Vladimir Sementsov-Ogievskiy wrote:
>> With the following patch we want to wake a coroutine from a thread. And
>> it doesn't work with aio_co_wake():
>> Assume we have no iothreads.
>> Assume we have a coroutine A, which waits at a yield point for an
>> external aio_co_wake(), and no progress can be made until it happens.
>> The main thread is in a blocking aio_poll() (for example, in blk_read()).
>>
>> Now, in a separate thread we do aio_co_wake(). It calls aio_co_enter(),
>> which goes through the last "else" branch and does
>> aio_context_acquire(ctx).
>>
>> Now we have a deadlock, as aio_poll() will not release the context lock
>> until some progress is made, and progress can't be made until
>> aio_co_wake() wakes coroutine A. And it can't, because it waits for
>> aio_context_acquire().
>>
>> Still, aio_co_schedule() works well in parallel with a blocking
>> aio_poll(). So we want to use it. Let's add a possibility of
>> rescheduling a coroutine in the same ctx where it yielded.
>>
>> Fetch co->ctx in the same way as it is done in aio_co_wake().
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy
>> ---
>>  include/block/aio.h | 2 +-
>>  util/async.c        | 8 ++++++++
>>  2 files changed, 9 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/block/aio.h b/include/block/aio.h
>> index 5f342267d5..744b695525 100644
>> --- a/include/block/aio.h
>> +++ b/include/block/aio.h
>> @@ -643,7 +643,7 @@ static inline bool aio_node_check(AioContext *ctx, bool is_external)
>>
>>  /**
>>   * aio_co_schedule:
>> - * @ctx: the aio context
>> + * @ctx: the aio context, if NULL, the current ctx of @co will be used.
>>   * @co: the coroutine
>>   *
>>   * Start a coroutine on a remote AioContext.
>> diff --git a/util/async.c b/util/async.c
>> index 674dbefb7c..750be555c6 100644
>> --- a/util/async.c
>> +++ b/util/async.c
>> @@ -545,6 +545,14 @@ fail:
>>
>>  void aio_co_schedule(AioContext *ctx, Coroutine *co)
>>  {
>> +    if (!ctx) {
>> +        /*
>> +         * Read coroutine before co->ctx. Matches smp_wmb in
>> +         * qemu_coroutine_enter.
>> +         */
>> +        smp_read_barrier_depends();
>> +        ctx = qatomic_read(&co->ctx);
>> +    }
>
> I'd rather not extend this interface, but add a new one on top. And
> document how it's different from aio_co_wake().

Agree, that's better. Will do.

--
Best regards,
Vladimir
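The review suggestion, a separate helper layered on top of the existing scheduling call instead of a NULL-accepting parameter, could look roughly like the sketch below. It is a toy model under stated assumptions: `ToyCtx`, `ToyCo`, `toy_co_schedule()` and `toy_co_reschedule_self()` are hypothetical names, and a C11 acquire load is swapped in for the `smp_read_barrier_depends()` plus `qatomic_read()` pair from the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Toy stand-ins; not QEMU's AioContext or Coroutine. */
typedef struct ToyCtx {
    int scheduled;              /* number of coroutines queued on this ctx */
} ToyCtx;

typedef struct ToyCo {
    _Atomic(ToyCtx *) ctx;      /* context the coroutine last ran in */
} ToyCo;

/* Existing-style entry point: the caller names the target context. */
static void toy_co_schedule(ToyCtx *ctx, ToyCo *co)
{
    (void)co;                   /* a real scheduler would queue @co on @ctx */
    ctx->scheduled++;
}

/*
 * Hypothetical separate helper: reschedule @co in the context it last ran
 * in. The acquire load is an assumption standing in for the barrier and
 * atomic read used in the patch.
 */
static ToyCtx *toy_co_reschedule_self(ToyCo *co)
{
    ToyCtx *ctx = atomic_load_explicit(&co->ctx, memory_order_acquire);

    toy_co_schedule(ctx, co);
    return ctx;
}
```

Keeping the NULL handling out of the base function leaves its contract unchanged, and the helper gives a natural place to document how it differs from aio_co_wake().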
Re: [PATCH 00/14] hw(/block/)nvme: spring cleaning
On Apr 19 21:27, Klaus Jensen wrote:
> From: Klaus Jensen
>
> This series consists of various clean up patches.
>
> The final patch moves nvme emulation from hw/block to hw/nvme.
>
> Klaus Jensen (14):
>   hw/block/nvme: rename __nvme_zrm_open
>   hw/block/nvme: rename __nvme_advance_zone_wp
>   hw/block/nvme: rename __nvme_select_ns_iocs
>   hw/block/nvme: consolidate header files
>   hw/block/nvme: cleanup includes
>   hw/block/nvme: remove non-shared defines from header file
>   hw/block/nvme: replace nvme_ns_status
>   hw/block/nvme: cache lba and ms sizes
>   hw/block/nvme: add metadata offset helper
>   hw/block/nvme: streamline namespace array indexing
>   hw/block/nvme: remove num_namespaces member
>   hw/block/nvme: remove irrelevant zone resource checks
>   hw/block/nvme: move zoned constraints checks
>   hw/nvme: move nvme emulation out of hw/block
>
>  meson.build                               |   1 +
>  hw/block/nvme-dif.h                       |  63 ---
>  hw/block/nvme-ns.h                        | 229 -
>  hw/block/nvme-subsys.h                    |  59 ---
>  hw/block/nvme.h                           | 266 ---
>  hw/nvme/nvme.h                            | 547 ++
>  hw/nvme/trace.h                           |   1 +
>  hw/{block/nvme.c => nvme/ctrl.c}          | 204
>  hw/{block/nvme-dif.c => nvme/dif.c}       |  57 +--
>  hw/{block/nvme-ns.c => nvme/ns.c}         | 104 ++--
>  hw/{block/nvme-subsys.c => nvme/subsys.c} |  13 +-
>  MAINTAINERS                               |   2 +-
>  hw/Kconfig                                |   1 +
>  hw/block/Kconfig                          |   5 -
>  hw/block/meson.build                      |   1 -
>  hw/block/trace-events                     | 206
>  hw/meson.build                            |   1 +
>  hw/nvme/Kconfig                           |   4 +
>  hw/nvme/meson.build                       |   1 +
>  hw/nvme/trace-events                      | 204
>  20 files changed, 928 insertions(+), 1041 deletions(-)
>  delete mode 100644 hw/block/nvme-dif.h
>  delete mode 100644 hw/block/nvme-ns.h
>  delete mode 100644 hw/block/nvme-subsys.h
>  delete mode 100644 hw/block/nvme.h
>  create mode 100644 hw/nvme/nvme.h
>  create mode 100644 hw/nvme/trace.h
>  rename hw/{block/nvme.c => nvme/ctrl.c} (98%)
>  rename hw/{block/nvme-dif.c => nvme/dif.c} (90%)
>  rename hw/{block/nvme-ns.c => nvme/ns.c} (87%)
>  rename hw/{block/nvme-subsys.c => nvme/subsys.c} (85%)
>  create mode 100644 hw/nvme/Kconfig
>  create mode 100644 hw/nvme/meson.build
>  create mode 100644 hw/nvme/trace-events
>
> --
> 2.31.1

Applied to nvme-next.
[PATCH] hw/block/nvme: fix csi field for cns 0x00 and 0x11
As per TP 4056d (Namespace Types), the CSI field should not be used for
CNS 0x00 and CNS 0x11, but it is currently being used for these two
Identify command CNS values. Fix that.

Signed-off-by: Gollu Appalanaidu
---
 hw/nvme/ctrl.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index 2e7498a73e..1657b1d04a 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -4244,11 +4244,16 @@ static uint16_t nvme_identify_ns(NvmeCtrl *n, NvmeRequest *req, bool active)
         }
     }

-    if (c->csi == NVME_CSI_NVM && nvme_csi_has_nvm_support(ns)) {
-        return nvme_c2h(n, (uint8_t *)&ns->id_ns, sizeof(NvmeIdNs), req);
+    if (active && nvme_csi_has_nvm_support(ns)) {
+        goto out;
+    } else if (!active && ns->csi == NVME_CSI_NVM) {
+        goto out;
+    } else {
+        return NVME_INVALID_CMD_SET | NVME_DNR;
     }

-    return NVME_INVALID_CMD_SET | NVME_DNR;
+out:
+    return nvme_c2h(n, (uint8_t *)&ns->id_ns, sizeof(NvmeIdNs), req);
 }

 static uint16_t nvme_identify_ns_attached_list(NvmeCtrl *n, NvmeRequest *req)
--
2.17.1
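The decision the patched code makes can be restated as a small predicate that depends only on the namespace and the `active` flag, no longer on the CSI field of the request. This is a sketch under stated assumptions: the `toy_` names are hypothetical, `active` is assumed to be true for CNS 0x00 and false for CNS 0x11, and the helper is assumed to treat NVM and zoned namespaces as having NVM support.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_CSI_NVM   0x00      /* NVM command set identifier */
#define TOY_CSI_ZONED 0x02      /* Zoned Namespace command set identifier */

/* Assumption: NVM and zoned namespaces both count as NVM-capable. */
static bool toy_csi_has_nvm_support(uint8_t ns_csi)
{
    return ns_csi == TOY_CSI_NVM || ns_csi == TOY_CSI_ZONED;
}

/*
 * Returns true when the Identify Namespace data would be returned, false
 * when the command would fail with "invalid command set". The CSI field
 * of the request is deliberately not an input.
 */
static bool toy_identify_ns_allowed(bool active, uint8_t ns_csi)
{
    if (active) {
        return toy_csi_has_nvm_support(ns_csi);
    }
    return ns_csi == TOY_CSI_NVM;
}
```

For example, under these assumptions a zoned namespace is accepted on the `active` path but rejected on the other one.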
Re: [PATCH] qapi: deprecate drive-backup
On Fri, Apr 23, 2021 at 15:59:00 +0300, Vladimir Sementsov-Ogievskiy wrote:
> The modern way is using blockdev-add + blockdev-backup, which provides a
> lot more control over how the target is opened.
>
> As an example of drive-backup problems consider the following:
>
> A user of drive-backup expects that the target will be opened in the same
> cache and aio mode as the source. The corresponding logic is in
> drive_backup_prepare(), where we take bs->open_flags of the source.
>
> It works rather badly if the source was added by blockdev-add. Assume the
> source is a qcow2 image. On blockdev-add we should specify aio and cache
> options for the file child of the qcow2 node. What happens next:
>
> drive_backup_prepare() looks at bs->open_flags of the qcow2 source node.
> But there is neither BDRV_O_NOCACHE nor BDRV_O_NATIVE_AIO: BDRV_O_NOCACHE
> is placed in bs->file->bs->open_flags, and BDRV_O_NATIVE_AIO is nowhere,
> as file-posix parses the options and simply sets s->use_linux_aio.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy
> ---
>
> Hi all! I remember I suggested deprecating drive-backup some time ago,
> and nobody complained. But that old patch was inside a series with
> other, more questionable deprecations and it did not land.
>
> Let's finally deprecate what should have been deprecated long ago.
>
> We have now faced a problem in our downstream, described in the commit
> message. In downstream I've fixed it by simply enabling O_DIRECT and
> linux_aio unconditionally for the drive_backup target. But actually this
> just shows that using drive-backup in the blockdev era is a bad idea. So
> let's motivate everyone (including Virtuozzo of course) to move to the
> new interfaces and avoid problems with all that outdated option
> inheritance.

libvirt never used 'drive-backup' thus

Reviewed-by: Peter Krempa