Re: [PATCH v1 2/8] virtio-pci: Lock ioeventfd state with VIRTIO_F_NOTIFICATION_DATA

2024-03-08 Thread Michael S. Tsirkin
On Fri, Mar 08, 2024 at 12:45:13PM -0500, Jonah Palmer wrote:
> 
> 
> On 3/8/24 12:36 PM, Eugenio Perez Martin wrote:
> > On Fri, Mar 8, 2024 at 6: 01 PM Michael S. Tsirkin 
> > wrote: > > On Mon, Mar 04, 2024 at 02: 46: 06PM -0500, Jonah Palmer
> > wrote: > > Prevent ioeventfd from being enabled/disabled when a
> > virtio-pci > > device
> > ZjQcmQRYFpfptBannerStart
> > This Message Is From an External Sender
> > This message came from outside your organization.
> > Report Suspicious
> > 
> > ZjQcmQRYFpfptBannerEnd
> > 
> > On Fri, Mar 8, 2024 at 6:01 PM Michael S. Tsirkin  wrote:
> > > 
> > > On Mon, Mar 04, 2024 at 02:46:06PM -0500, Jonah Palmer wrote:
> > > > Prevent ioeventfd from being enabled/disabled when a virtio-pci
> > > > device has negotiated the VIRTIO_F_NOTIFICATION_DATA transport
> > > > feature.
> > > >
> > > > Due to ioeventfd not being able to carry the extra data associated with
> > > > this feature, the ioeventfd should be left in a disabled state for
> > > > emulated virtio-pci devices using this feature.
> > > >
> > > > Reviewed-by: Eugenio Pérez 
> > > > Signed-off-by: Jonah Palmer 
> > > 
> > > I thought hard about this. I propose that for now,
> > > instead of disabling ioevetfd silently we error out unless
> > > user disabled it for us.
> > > WDYT?
> > > 
> > 
> > Yes, error is a better plan than silently disabling it. In the
> > (unlikely?) case we are able to make notification data work with
> > eventfd in the future, it makes the change more evident.
> > 
> 
> Will do in v2. I assume we'll also make this the case for virtio-mmio and
> virtio-ccw?

Guess so. Pls note freeze is imminent.
> > > 
> > > > ---
> > > >  hw/virtio/virtio-pci.c | 6 --
> > > >  1 file changed, 4 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> > > > index d12edc567f..287b8f7720 100644
> > > > --- a/hw/virtio/virtio-pci.c
> > > > +++ b/hw/virtio/virtio-pci.c
> > > > @@ -417,13 +417,15 @@ static void virtio_ioport_write(void *opaque, 
> > > > uint32_t addr, uint32_t val)
> > > >  }
> > > >  break;
> > > >  case VIRTIO_PCI_STATUS:
> > > > -if (!(val & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > > > +if (!(val & VIRTIO_CONFIG_S_DRIVER_OK) &&
> > > > +!virtio_vdev_has_feature(vdev, 
> > > > VIRTIO_F_NOTIFICATION_DATA)) {
> > > >  virtio_pci_stop_ioeventfd(proxy);
> > > >  }
> > > >
> > > >  virtio_set_status(vdev, val & 0xFF);
> > > >
> > > > -if (val & VIRTIO_CONFIG_S_DRIVER_OK) {
> > > > +if ((val & VIRTIO_CONFIG_S_DRIVER_OK) &&
> > > > +!virtio_vdev_has_feature(vdev, 
> > > > VIRTIO_F_NOTIFICATION_DATA)) {
> > > >  virtio_pci_start_ioeventfd(proxy);
> > > >  }
> > > >
> > > > --
> > > > 2.39.3
> > > 
> > 




Re: [PATCH v1 2/8] virtio-pci: Lock ioeventfd state with VIRTIO_F_NOTIFICATION_DATA

2024-03-08 Thread Jonah Palmer




On 3/8/24 12:36 PM, Eugenio Perez Martin wrote:
On Fri, Mar 8, 2024 at 6: 01 PM Michael S. Tsirkin  
wrote: > > On Mon, Mar 04, 2024 at 02: 46: 06PM -0500, Jonah Palmer 
wrote: > > Prevent ioeventfd from being enabled/disabled when a 
virtio-pci > > device

ZjQcmQRYFpfptBannerStart
This Message Is From an External Sender
This message came from outside your organization.
Report Suspicious

ZjQcmQRYFpfptBannerEnd

On Fri, Mar 8, 2024 at 6:01 PM Michael S. Tsirkin  wrote:


On Mon, Mar 04, 2024 at 02:46:06PM -0500, Jonah Palmer wrote:
> Prevent ioeventfd from being enabled/disabled when a virtio-pci
> device has negotiated the VIRTIO_F_NOTIFICATION_DATA transport
> feature.
>
> Due to ioeventfd not being able to carry the extra data associated with
> this feature, the ioeventfd should be left in a disabled state for
> emulated virtio-pci devices using this feature.
>
> Reviewed-by: Eugenio Pérez 
> Signed-off-by: Jonah Palmer 

I thought hard about this. I propose that for now,
instead of disabling ioevetfd silently we error out unless
user disabled it for us.
WDYT?



Yes, error is a better plan than silently disabling it. In the
(unlikely?) case we are able to make notification data work with
eventfd in the future, it makes the change more evident.



Will do in v2. I assume we'll also make this the case for virtio-mmio 
and virtio-ccw?




> ---
>  hw/virtio/virtio-pci.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> index d12edc567f..287b8f7720 100644
> --- a/hw/virtio/virtio-pci.c
> +++ b/hw/virtio/virtio-pci.c
> @@ -417,13 +417,15 @@ static void virtio_ioport_write(void *opaque, uint32_t 
addr, uint32_t val)
>  }
>  break;
>  case VIRTIO_PCI_STATUS:
> -if (!(val & VIRTIO_CONFIG_S_DRIVER_OK)) {
> +if (!(val & VIRTIO_CONFIG_S_DRIVER_OK) &&
> +!virtio_vdev_has_feature(vdev, VIRTIO_F_NOTIFICATION_DATA)) {
>  virtio_pci_stop_ioeventfd(proxy);
>  }
>
>  virtio_set_status(vdev, val & 0xFF);
>
> -if (val & VIRTIO_CONFIG_S_DRIVER_OK) {
> +if ((val & VIRTIO_CONFIG_S_DRIVER_OK) &&
> +!virtio_vdev_has_feature(vdev, VIRTIO_F_NOTIFICATION_DATA)) {
>  virtio_pci_start_ioeventfd(proxy);
>  }
>
> --
> 2.39.3







Re: [PATCH v1 2/8] virtio-pci: Lock ioeventfd state with VIRTIO_F_NOTIFICATION_DATA

2024-03-08 Thread Eugenio Perez Martin
On Fri, Mar 8, 2024 at 6:01 PM Michael S. Tsirkin  wrote:
>
> On Mon, Mar 04, 2024 at 02:46:06PM -0500, Jonah Palmer wrote:
> > Prevent ioeventfd from being enabled/disabled when a virtio-pci
> > device has negotiated the VIRTIO_F_NOTIFICATION_DATA transport
> > feature.
> >
> > Due to ioeventfd not being able to carry the extra data associated with
> > this feature, the ioeventfd should be left in a disabled state for
> > emulated virtio-pci devices using this feature.
> >
> > Reviewed-by: Eugenio Pérez 
> > Signed-off-by: Jonah Palmer 
>
> I thought hard about this. I propose that for now,
> instead of disabling ioevetfd silently we error out unless
> user disabled it for us.
> WDYT?
>

Yes, error is a better plan than silently disabling it. In the
(unlikely?) case we are able to make notification data work with
eventfd in the future, it makes the change more evident.

>
> > ---
> >  hw/virtio/virtio-pci.c | 6 --
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> > index d12edc567f..287b8f7720 100644
> > --- a/hw/virtio/virtio-pci.c
> > +++ b/hw/virtio/virtio-pci.c
> > @@ -417,13 +417,15 @@ static void virtio_ioport_write(void *opaque, 
> > uint32_t addr, uint32_t val)
> >  }
> >  break;
> >  case VIRTIO_PCI_STATUS:
> > -if (!(val & VIRTIO_CONFIG_S_DRIVER_OK)) {
> > +if (!(val & VIRTIO_CONFIG_S_DRIVER_OK) &&
> > +!virtio_vdev_has_feature(vdev, VIRTIO_F_NOTIFICATION_DATA)) {
> >  virtio_pci_stop_ioeventfd(proxy);
> >  }
> >
> >  virtio_set_status(vdev, val & 0xFF);
> >
> > -if (val & VIRTIO_CONFIG_S_DRIVER_OK) {
> > +if ((val & VIRTIO_CONFIG_S_DRIVER_OK) &&
> > +!virtio_vdev_has_feature(vdev, VIRTIO_F_NOTIFICATION_DATA)) {
> >  virtio_pci_start_ioeventfd(proxy);
> >  }
> >
> > --
> > 2.39.3
>




Re: [PATCH v1 2/8] virtio-pci: Lock ioeventfd state with VIRTIO_F_NOTIFICATION_DATA

2024-03-08 Thread Michael S. Tsirkin
On Mon, Mar 04, 2024 at 02:46:06PM -0500, Jonah Palmer wrote:
> Prevent ioeventfd from being enabled/disabled when a virtio-pci
> device has negotiated the VIRTIO_F_NOTIFICATION_DATA transport
> feature.
> 
> Due to ioeventfd not being able to carry the extra data associated with
> this feature, the ioeventfd should be left in a disabled state for
> emulated virtio-pci devices using this feature.
> 
> Reviewed-by: Eugenio Pérez 
> Signed-off-by: Jonah Palmer 

I thought hard about this. I propose that for now,
instead of disabling ioevetfd silently we error out unless
user disabled it for us.
WDYT?


> ---
>  hw/virtio/virtio-pci.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> index d12edc567f..287b8f7720 100644
> --- a/hw/virtio/virtio-pci.c
> +++ b/hw/virtio/virtio-pci.c
> @@ -417,13 +417,15 @@ static void virtio_ioport_write(void *opaque, uint32_t 
> addr, uint32_t val)
>  }
>  break;
>  case VIRTIO_PCI_STATUS:
> -if (!(val & VIRTIO_CONFIG_S_DRIVER_OK)) {
> +if (!(val & VIRTIO_CONFIG_S_DRIVER_OK) &&
> +!virtio_vdev_has_feature(vdev, VIRTIO_F_NOTIFICATION_DATA)) {
>  virtio_pci_stop_ioeventfd(proxy);
>  }
>  
>  virtio_set_status(vdev, val & 0xFF);
>  
> -if (val & VIRTIO_CONFIG_S_DRIVER_OK) {
> +if ((val & VIRTIO_CONFIG_S_DRIVER_OK) &&
> +!virtio_vdev_has_feature(vdev, VIRTIO_F_NOTIFICATION_DATA)) {
>  virtio_pci_start_ioeventfd(proxy);
>  }
>  
> -- 
> 2.39.3




Re: [PULL 34/56] hw/intc/grlib_irqmp: add ncpus property

2024-03-08 Thread Clément Chigot
On Fri, Mar 8, 2024 at 2:27 PM Peter Maydell  wrote:
>
> On Thu, 15 Feb 2024 at 18:04, Philippe Mathieu-Daudé  
> wrote:
> >
> > From: Clément Chigot 
> >
> > This adds a "ncpus" property to the "grlib-irqmp" device to be used
> > later, this required a little refactoring of how we initialize the
> > device (ie: use realize instead of init).
> >
> > Co-developed-by: Frederic Konrad 
> > Signed-off-by: Clément Chigot 
> > Reviewed-by: Philippe Mathieu-Daudé 
> > Message-ID: <20240131085047.18458-3-chi...@adacore.com>
> > Signed-off-by: Philippe Mathieu-Daudé 
>
> Hi; Coverity points out a bug in this commit (CID 1534914):
>
>
> > -static void grlib_irqmp_init(Object *obj)
> > +static void grlib_irqmp_realize(DeviceState *dev, Error **errp)
> >  {
> > -IRQMP *irqmp = GRLIB_IRQMP(obj);
> > -SysBusDevice *dev = SYS_BUS_DEVICE(obj);
> > +IRQMP *irqmp = GRLIB_IRQMP(dev);
> >
> > -qdev_init_gpio_in(DEVICE(obj), grlib_irqmp_set_irq, MAX_PILS);
> > -qdev_init_gpio_out_named(DEVICE(obj), &irqmp->irq, "grlib-irq", 1);
> > -memory_region_init_io(&irqmp->iomem, obj, &grlib_irqmp_ops, irqmp,
> > +if ((!irqmp->ncpus) || (irqmp->ncpus > IRQMP_MAX_CPU)) {
> > +error_setg(errp, "Invalid ncpus properties: "
> > +   "%u, must be 0 < ncpus =< %u.", irqmp->ncpus,
> > +   IRQMP_MAX_CPU);
> > +}
>
> We detect the out-of-range 'ncpus' value, but forget the "return"
> statement, so execution will continue onward regardless, and
> overrun the irqmp->irq[] array when we call qdev_init_gpio_out_named().

Indeed, I'll send a patch.
Thanks for pointing that out.

Clément

> > +
> > +qdev_init_gpio_in(dev, grlib_irqmp_set_irq, MAX_PILS);
> > +qdev_init_gpio_out_named(dev, &irqmp->irq, "grlib-irq", 1);
> > +memory_region_init_io(&irqmp->iomem, OBJECT(dev), &grlib_irqmp_ops, 
> > irqmp,
> >"irqmp", IRQMP_REG_SIZE);
> >
> >  irqmp->state = g_malloc0(sizeof *irqmp->state);
> >
> > -sysbus_init_mmio(dev, &irqmp->iomem);
> > +sysbus_init_mmio(SYS_BUS_DEVICE(dev), &irqmp->iomem);
> >  }
>
> thanks
> -- PMM



[PATCH 1/2] copy-before-write: allow specifying minimum cluster size

2024-03-08 Thread Fiona Ebner
Useful to make discard-source work in the context of backup fleecing
when the fleecing image has a larger granularity than the backup
target.

Copy-before-write operations will use at least this granularity and in
particular, discard requests to the source node will too. If the
granularity is too small, they will just be aligned down in
cbw_co_pdiscard_snapshot() and thus effectively ignored.

The QAPI uses uint32 so the value will be non-negative, but still fit
into a uint64_t.

Suggested-by: Vladimir Sementsov-Ogievskiy 
Signed-off-by: Fiona Ebner 
---
 block/block-copy.c | 17 +
 block/copy-before-write.c  |  3 ++-
 include/block/block-copy.h |  1 +
 qapi/block-core.json   |  8 +++-
 4 files changed, 23 insertions(+), 6 deletions(-)

diff --git a/block/block-copy.c b/block/block-copy.c
index 7e3b378528..adb1cbb440 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -310,6 +310,7 @@ void block_copy_set_copy_opts(BlockCopyState *s, bool 
use_copy_range,
 }
 
 static int64_t block_copy_calculate_cluster_size(BlockDriverState *target,
+ int64_t min_cluster_size,
  Error **errp)
 {
 int ret;
@@ -335,7 +336,7 @@ static int64_t 
block_copy_calculate_cluster_size(BlockDriverState *target,
 "used. If the actual block size of the target exceeds "
 "this default, the backup may be unusable",
 BLOCK_COPY_CLUSTER_SIZE_DEFAULT);
-return BLOCK_COPY_CLUSTER_SIZE_DEFAULT;
+return MAX(min_cluster_size, BLOCK_COPY_CLUSTER_SIZE_DEFAULT);
 } else if (ret < 0 && !target_does_cow) {
 error_setg_errno(errp, -ret,
 "Couldn't determine the cluster size of the target image, "
@@ -345,16 +346,18 @@ static int64_t 
block_copy_calculate_cluster_size(BlockDriverState *target,
 return ret;
 } else if (ret < 0 && target_does_cow) {
 /* Not fatal; just trudge on ahead. */
-return BLOCK_COPY_CLUSTER_SIZE_DEFAULT;
+return MAX(min_cluster_size, BLOCK_COPY_CLUSTER_SIZE_DEFAULT);
 }
 
-return MAX(BLOCK_COPY_CLUSTER_SIZE_DEFAULT, bdi.cluster_size);
+return MAX(min_cluster_size,
+   MAX(BLOCK_COPY_CLUSTER_SIZE_DEFAULT, bdi.cluster_size));
 }
 
 BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
  BlockDriverState *copy_bitmap_bs,
  const BdrvDirtyBitmap *bitmap,
  bool discard_source,
+ int64_t min_cluster_size,
  Error **errp)
 {
 ERRP_GUARD();
@@ -365,7 +368,13 @@ BlockCopyState *block_copy_state_new(BdrvChild *source, 
BdrvChild *target,
 
 GLOBAL_STATE_CODE();
 
-cluster_size = block_copy_calculate_cluster_size(target->bs, errp);
+if (min_cluster_size && !is_power_of_2(min_cluster_size)) {
+error_setg(errp, "min-cluster-size needs to be a power of 2");
+return NULL;
+}
+
+cluster_size = block_copy_calculate_cluster_size(target->bs,
+ min_cluster_size, errp);
 if (cluster_size < 0) {
 return NULL;
 }
diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index dac57481c5..f9896c6c1e 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c
@@ -476,7 +476,8 @@ static int cbw_open(BlockDriverState *bs, QDict *options, 
int flags,
 
 s->discard_source = flags & BDRV_O_CBW_DISCARD_SOURCE;
 s->bcs = block_copy_state_new(bs->file, s->target, bs, bitmap,
-  flags & BDRV_O_CBW_DISCARD_SOURCE, errp);
+  flags & BDRV_O_CBW_DISCARD_SOURCE,
+  opts->min_cluster_size, errp);
 if (!s->bcs) {
 error_prepend(errp, "Cannot create block-copy-state: ");
 return -EINVAL;
diff --git a/include/block/block-copy.h b/include/block/block-copy.h
index bdc703bacd..77857c6c68 100644
--- a/include/block/block-copy.h
+++ b/include/block/block-copy.h
@@ -28,6 +28,7 @@ BlockCopyState *block_copy_state_new(BdrvChild *source, 
BdrvChild *target,
  BlockDriverState *copy_bitmap_bs,
  const BdrvDirtyBitmap *bitmap,
  bool discard_source,
+ int64_t min_cluster_size,
  Error **errp);
 
 /* Function should be called prior any actual copy request */
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 0a72c590a8..85c8f88f6e 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -4625,12 +4625,18 @@
 # @on-cbw-error parameter will decide how this failure is handled.
 # Default 0. (Since 7.1)
 #
+# @min-cluster-size: Minimum size of b

[PATCH 0/2] backup: allow specifying minimum cluster size

2024-03-08 Thread Fiona Ebner
Based-on: 
https://lore.kernel.org/qemu-devel/20240228141501.455989-1-vsement...@yandex-team.ru/

Useful to make discard-source work in the context of backup fleecing
when the fleecing image has a larger granularity than the backup
target.

Backup/block-copy will use at least this granularity for copy operations
and in particular, discard requests to the backup source will too. If
the granularity is too small, they will just be aligned down in
cbw_co_pdiscard_snapshot() and thus effectively ignored.

Fiona Ebner (2):
  copy-before-write: allow specifying minimum cluster size
  backup: add minimum cluster size to performance options

 block/backup.c |  2 +-
 block/block-copy.c | 17 +
 block/copy-before-write.c  |  5 -
 block/copy-before-write.h  |  1 +
 blockdev.c |  3 +++
 include/block/block-copy.h |  1 +
 qapi/block-core.json   | 17 ++---
 7 files changed, 37 insertions(+), 9 deletions(-)

-- 
2.39.2





[PATCH 2/2] backup: add minimum cluster size to performance options

2024-03-08 Thread Fiona Ebner
Useful to make discard-source work in the context of backup fleecing
when the fleecing image has a larger granularity than the backup
target.

Backup/block-copy will use at least this granularity for copy operations
and in particular, discard requests to the backup source will too. If
the granularity is too small, they will just be aligned down in
cbw_co_pdiscard_snapshot() and thus effectively ignored.

Suggested-by: Vladimir Sementsov-Ogievskiy 
Signed-off-by: Fiona Ebner 
---
 block/backup.c| 2 +-
 block/copy-before-write.c | 2 ++
 block/copy-before-write.h | 1 +
 blockdev.c| 3 +++
 qapi/block-core.json  | 9 +++--
 5 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index 3dd2e229d2..a1292c01ec 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -458,7 +458,7 @@ BlockJob *backup_job_create(const char *job_id, 
BlockDriverState *bs,
 }
 
 cbw = bdrv_cbw_append(bs, target, filter_node_name, discard_source,
-  &bcs, errp);
+  perf->min_cluster_size, &bcs, errp);
 if (!cbw) {
 goto error;
 }
diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index f9896c6c1e..55a9272485 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c
@@ -545,6 +545,7 @@ BlockDriverState *bdrv_cbw_append(BlockDriverState *source,
   BlockDriverState *target,
   const char *filter_node_name,
   bool discard_source,
+  int64_t min_cluster_size,
   BlockCopyState **bcs,
   Error **errp)
 {
@@ -563,6 +564,7 @@ BlockDriverState *bdrv_cbw_append(BlockDriverState *source,
 }
 qdict_put_str(opts, "file", bdrv_get_node_name(source));
 qdict_put_str(opts, "target", bdrv_get_node_name(target));
+qdict_put_int(opts, "min-cluster-size", min_cluster_size);
 
 top = bdrv_insert_node(source, opts, flags, errp);
 if (!top) {
diff --git a/block/copy-before-write.h b/block/copy-before-write.h
index 01af0cd3c4..dc6cafe7fa 100644
--- a/block/copy-before-write.h
+++ b/block/copy-before-write.h
@@ -40,6 +40,7 @@ BlockDriverState *bdrv_cbw_append(BlockDriverState *source,
   BlockDriverState *target,
   const char *filter_node_name,
   bool discard_source,
+  int64_t min_cluster_size,
   BlockCopyState **bcs,
   Error **errp);
 void bdrv_cbw_drop(BlockDriverState *bs);
diff --git a/blockdev.c b/blockdev.c
index daceb50460..8e6bdbc94a 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2653,6 +2653,9 @@ static BlockJob *do_backup_common(BackupCommon *backup,
 if (backup->x_perf->has_max_chunk) {
 perf.max_chunk = backup->x_perf->max_chunk;
 }
+if (backup->x_perf->has_min_cluster_size) {
+perf.min_cluster_size = backup->x_perf->min_cluster_size;
+}
 }
 
 if ((backup->sync == MIRROR_SYNC_MODE_BITMAP) ||
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 85c8f88f6e..ba0836892f 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1551,11 +1551,16 @@
 # it should not be less than job cluster size which is calculated
 # as maximum of target image cluster size and 64k.  Default 0.
 #
+# @min-cluster-size: Minimum size of blocks used by copy-before-write
+# and background copy operations.  Has to be a power of 2.  No
+# effect if smaller than the maximum of the target's cluster size
+# and 64 KiB.  Default 0.  (Since 9.0)
+#
 # Since: 6.0
 ##
 { 'struct': 'BackupPerf',
-  'data': { '*use-copy-range': 'bool',
-'*max-workers': 'int', '*max-chunk': 'int64' } }
+  'data': { '*use-copy-range': 'bool', '*max-workers': 'int',
+'*max-chunk': 'int64', '*min-cluster-size': 'uint32' } }
 
 ##
 # @BackupCommon:
-- 
2.39.2





Re: [PATCH v3 3/5] block/copy-before-write: create block_copy bitmap in filter node

2024-03-08 Thread Fiona Ebner
Am 28.02.24 um 15:14 schrieb Vladimir Sementsov-Ogievskiy:
> Currently block_copy creates copy_bitmap in source node. But that is in
> bad relation with .independent_close=true of copy-before-write filter:
> source node may be detached and removed before .bdrv_close() handler
> called, which should call block_copy_state_free(), which in turn should
> remove copy_bitmap.
> 
> That's all not ideal: it would be better if internal bitmap of
> block-copy object is not attached to any node. But that is not possible
> now.
> 
> The simplest solution is just create copy_bitmap in filter node, where
> anyway two other bitmaps are created.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 

Reviewed-by: Fiona Ebner 




Re: [PATCH v3 0/5] backup: discard-source parameter

2024-03-08 Thread Fiona Ebner
Am 28.02.24 um 15:14 schrieb Vladimir Sementsov-Ogievskiy:
> Hi all! The main patch is 04, please look at it for description and
> diagram.
> 
> v3:
> 02: new patch
> 04: take WRITE permission only when discard_source is required
> 
> Vladimir Sementsov-Ogievskiy (5):
>   block/copy-before-write: fix permission
>   block/copy-before-write: support unligned snapshot-discard
>   block/copy-before-write: create block_copy bitmap in filter node
>   qapi: blockdev-backup: add discard-source parameter
>   iotests: add backup-discard-source
> 
>  block/backup.c|   5 +-
>  block/block-copy.c|  12 +-
>  block/copy-before-write.c |  39 -
>  block/copy-before-write.h |   1 +
>  block/replication.c   |   4 +-
>  blockdev.c|   2 +-
>  include/block/block-common.h  |   2 +
>  include/block/block-copy.h|   2 +
>  include/block/block_int-global-state.h|   2 +-
>  qapi/block-core.json  |   4 +
>  tests/qemu-iotests/257.out| 112 ++---
>  .../qemu-iotests/tests/backup-discard-source  | 151 ++
>  .../tests/backup-discard-source.out   |   5 +
>  13 files changed, 271 insertions(+), 70 deletions(-)
>  create mode 100755 tests/qemu-iotests/tests/backup-discard-source
>  create mode 100644 tests/qemu-iotests/tests/backup-discard-source.out
> 

Tested-by: Fiona Ebner 




Re: [PATCH v3 5/5] iotests: add backup-discard-source

2024-03-08 Thread Fiona Ebner
Am 28.02.24 um 15:15 schrieb Vladimir Sementsov-Ogievskiy:
> Add test for a new backup option: discard-source.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  .../qemu-iotests/tests/backup-discard-source  | 151 ++
>  .../tests/backup-discard-source.out   |   5 +
>  2 files changed, 156 insertions(+)
>  create mode 100755 tests/qemu-iotests/tests/backup-discard-source
>  create mode 100644 tests/qemu-iotests/tests/backup-discard-source.out
> 
> diff --git a/tests/qemu-iotests/tests/backup-discard-source 
> b/tests/qemu-iotests/tests/backup-discard-source
> new file mode 100755
> index 00..8a88b0f6c4
> --- /dev/null
> +++ b/tests/qemu-iotests/tests/backup-discard-source
> @@ -0,0 +1,151 @@
> +#!/usr/bin/env python3
> +#
> +# Test removing persistent bitmap from backing
> +#
> +# Copyright (c) 2022 Virtuozzo International GmbH.
> +#

Title and copyright year are wrong.

Apart from that:

Reviewed-by: Fiona Ebner 




Re: [PATCH v3 2/5] block/copy-before-write: support unligned snapshot-discard

2024-03-08 Thread Fiona Ebner
Am 28.02.24 um 15:14 schrieb Vladimir Sementsov-Ogievskiy:
> First thing that crashes on unligned access here is
> bdrv_reset_dirty_bitmap(). Correct way is to align-down the
> snapshot-discard request.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 

Reviewed-by: Fiona Ebner 




Re: [PATCH v3 4/5] qapi: blockdev-backup: add discard-source parameter

2024-03-08 Thread Fiona Ebner
Am 28.02.24 um 15:15 schrieb Vladimir Sementsov-Ogievskiy:
> Add a parameter that enables discard-after-copy. That is mostly useful
> in "push backup with fleecing" scheme, when source is snapshot-access
> format driver node, based on copy-before-write filter snapshot-access
> API:
> 
> [guest]  [snapshot-access] ~~ blockdev-backup ~~> [backup target]
>||
>| root   | file
>vv
> [copy-before-write]
>| |
>| file| target
>v v
> [active disk]   [temp.img]
> 
> In this case discard-after-copy does two things:
> 
>  - discard data in temp.img to save disk space
>  - avoid further copy-before-write operation in discarded area
> 
> Note that we have to declare WRITE permission on source in
> copy-before-write filter, for discard to work. Still we can't take it
> unconditionally, as it will break normal backup from RO source. So, we
> have to add a parameter and pass it thorough bdrv_open flags.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 

Reviewed-by: Fiona Ebner 




Re: [PATCH v1 0/8] virtio, vhost: Add VIRTIO_F_NOTIFICATION_DATA support

2024-03-08 Thread Jonah Palmer




On 3/8/24 8:28 AM, Lei Yang wrote:

Hi Jonah

QE tested this series v1 with a tap device with vhost=off with
regression tests, everything works fine. And QE also add
"notification_data=true" to the qemu command line then got "1" when
performing the command [1] inside the guest.
[1] cut -c39 /sys/devices/pci:00/:00:01.3/:05:00.0/virtio1/features

Tested-by: Lei Yang 



Thank you for double-checking this series for me Lei! I appreciate it :)

Jonah


On Thu, Mar 7, 2024 at 7:18 PM Eugenio Perez Martin  wrote:


On Wed, Mar 6, 2024 at 8:34 AM Michael S. Tsirkin  wrote:


On Wed, Mar 06, 2024 at 08:07:31AM +0100, Eugenio Perez Martin wrote:

On Wed, Mar 6, 2024 at 6:34 AM Jason Wang  wrote:


On Tue, Mar 5, 2024 at 3:46 AM Jonah Palmer  wrote:


The goal of these patches are to add support to a variety of virtio and
vhost devices for the VIRTIO_F_NOTIFICATION_DATA transport feature. This
feature indicates that a driver will pass extra data (instead of just a
virtqueue's index) when notifying the corresponding device.

The data passed in by the driver when this feature is enabled varies in
format depending on if the device is using a split or packed virtqueue
layout:

  Split VQ
   - Upper 16 bits: shadow_avail_idx
   - Lower 16 bits: virtqueue index

  Packed VQ
   - Upper 16 bits: 1-bit wrap counter & 15-bit shadow_avail_idx
   - Lower 16 bits: virtqueue index

Also, due to the limitations of ioeventfd not being able to carry the
extra provided by the driver, ioeventfd is left disabled for any devices
using this feature.


Is there any method to overcome this? This might help for vhost.



As a half-baked idea, read(2)ing an eventfd descriptor returns an
8-byte integer already. The returned value of read depends on eventfd
flags, but both have to do with the number of writes of the other end.

My proposal is to replace this value with the last value written by
the guest, so we can extract the virtio notification data from there.
The behavior of read is similar to not-EFD_SEMAPHORE, reading a value
and then blocking if read again without writes. The behavior of KVM
writes is different, as it is not a counter anymore.

Thanks!



I doubt you will be able to support this in ioeventfd...


I agree.


But vhost does not really need the value at all.
So why mask out ioeventfd with vhost?


The interface should not be able to start with vhost-kernel because
the feature is not offered by the vhost-kernel device. So ioeventfd is
always enabled with vhost-kernel.

Or do you mean we should allow it and let vhost-kernel fetch data from
the avail ring as usual? I'm ok with that but then the guest can place
any value to it, so the driver cannot be properly "validated by
software" that way.


vhost-vdpa is probably the only one that might need it...


Right, but vhost-vdpa already supports doorbell memory regions so I
guess it has little use, isn't it?

Thanks!






Thanks



A significant aspect of this effort has been to maintain compatibility
across different backends. As such, the feature is offered by backend
devices only when supported, with fallback mechanisms where backend
support is absent.

Jonah Palmer (8):
   virtio/virtio-pci: Handle extra notification data
   virtio-pci: Lock ioeventfd state with VIRTIO_F_NOTIFICATION_DATA
   virtio-mmio: Handle extra notification data
   virtio-mmio: Lock ioeventfd state with VIRTIO_F_NOTIFICATION_DATA
   virtio-ccw: Handle extra notification data
   virtio-ccw: Lock ioeventfd state with VIRTIO_F_NOTIFICATION_DATA
   vhost/vhost-user: Add VIRTIO_F_NOTIFICATION_DATA to vhost feature bits
   virtio: Add VIRTIO_F_NOTIFICATION_DATA property definition

  hw/block/vhost-user-blk.c|  1 +
  hw/net/vhost_net.c   |  2 ++
  hw/s390x/s390-virtio-ccw.c   | 16 
  hw/s390x/virtio-ccw.c|  6 --
  hw/scsi/vhost-scsi.c |  1 +
  hw/scsi/vhost-user-scsi.c|  1 +
  hw/virtio/vhost-user-fs.c|  2 +-
  hw/virtio/vhost-user-vsock.c |  1 +
  hw/virtio/virtio-mmio.c  | 15 +++
  hw/virtio/virtio-pci.c   | 16 +++-
  hw/virtio/virtio.c   | 18 ++
  include/hw/virtio/virtio.h   |  5 -
  net/vhost-vdpa.c |  1 +
  13 files changed, 68 insertions(+), 17 deletions(-)

--
2.39.3














Re: [PATCH v1 0/8] virtio, vhost: Add VIRTIO_F_NOTIFICATION_DATA support

2024-03-08 Thread Lei Yang
Hi Jonah

QE tested this series v1 with a tap device with vhost=off with
regression tests, everything works fine. And QE also add
"notification_data=true" to the qemu command line then got "1" when
performing the command [1] inside the guest.
[1] cut -c39 /sys/devices/pci:00/:00:01.3/:05:00.0/virtio1/features

Tested-by: Lei Yang 

On Thu, Mar 7, 2024 at 7:18 PM Eugenio Perez Martin  wrote:
>
> On Wed, Mar 6, 2024 at 8:34 AM Michael S. Tsirkin  wrote:
> >
> > On Wed, Mar 06, 2024 at 08:07:31AM +0100, Eugenio Perez Martin wrote:
> > > On Wed, Mar 6, 2024 at 6:34 AM Jason Wang  wrote:
> > > >
> > > > On Tue, Mar 5, 2024 at 3:46 AM Jonah Palmer  
> > > > wrote:
> > > > >
> > > > > The goal of these patches are to add support to a variety of virtio 
> > > > > and
> > > > > vhost devices for the VIRTIO_F_NOTIFICATION_DATA transport feature. 
> > > > > This
> > > > > feature indicates that a driver will pass extra data (instead of just 
> > > > > a
> > > > > virtqueue's index) when notifying the corresponding device.
> > > > >
> > > > > The data passed in by the driver when this feature is enabled varies 
> > > > > in
> > > > > format depending on if the device is using a split or packed virtqueue
> > > > > layout:
> > > > >
> > > > >  Split VQ
> > > > >   - Upper 16 bits: shadow_avail_idx
> > > > >   - Lower 16 bits: virtqueue index
> > > > >
> > > > >  Packed VQ
> > > > >   - Upper 16 bits: 1-bit wrap counter & 15-bit shadow_avail_idx
> > > > >   - Lower 16 bits: virtqueue index
> > > > >
> > > > > Also, due to the limitations of ioeventfd not being able to carry the
> > > > > extra provided by the driver, ioeventfd is left disabled for any 
> > > > > devices
> > > > > using this feature.
> > > >
> > > > Is there any method to overcome this? This might help for vhost.
> > > >
> > >
> > > As a half-baked idea, read(2)ing an eventfd descriptor returns an
> > > 8-byte integer already. The returned value of read depends on eventfd
> > > flags, but both have to do with the number of writes of the other end.
> > >
> > > My proposal is to replace this value with the last value written by
> > > the guest, so we can extract the virtio notification data from there.
> > > The behavior of read is similar to not-EFD_SEMAPHORE, reading a value
> > > and then blocking if read again without writes. The behavior of KVM
> > > writes is different, as it is not a counter anymore.
> > >
> > > Thanks!
> >
> >
> > I doubt you will be able to support this in ioeventfd...
>
> I agree.
>
> > But vhost does not really need the value at all.
> > So why mask out ioeventfd with vhost?
>
> The interface should not be able to start with vhost-kernel because
> the feature is not offered by the vhost-kernel device. So ioeventfd is
> always enabled with vhost-kernel.
>
> Or do you mean we should allow it and let vhost-kernel fetch data from
> the avail ring as usual? I'm ok with that but then the guest can place
> any value to it, so the driver cannot be properly "validated by
> software" that way.
>
> > vhost-vdpa is probably the only one that might need it...
>
> Right, but vhost-vdpa already supports doorbell memory regions so I
> guess it has little use, isn't it?
>
> Thanks!
>
> >
> >
> >
> > > > Thanks
> > > >
> > > > >
> > > > > A significant aspect of this effort has been to maintain compatibility
> > > > > across different backends. As such, the feature is offered by backend
> > > > > devices only when supported, with fallback mechanisms where backend
> > > > > support is absent.
> > > > >
> > > > > Jonah Palmer (8):
> > > > >   virtio/virtio-pci: Handle extra notification data
> > > > >   virtio-pci: Lock ioeventfd state with VIRTIO_F_NOTIFICATION_DATA
> > > > >   virtio-mmio: Handle extra notification data
> > > > >   virtio-mmio: Lock ioeventfd state with VIRTIO_F_NOTIFICATION_DATA
> > > > >   virtio-ccw: Handle extra notification data
> > > > >   virtio-ccw: Lock ioeventfd state with VIRTIO_F_NOTIFICATION_DATA
> > > > >   vhost/vhost-user: Add VIRTIO_F_NOTIFICATION_DATA to vhost feature 
> > > > > bits
> > > > >   virtio: Add VIRTIO_F_NOTIFICATION_DATA property definition
> > > > >
> > > > >  hw/block/vhost-user-blk.c|  1 +
> > > > >  hw/net/vhost_net.c   |  2 ++
> > > > >  hw/s390x/s390-virtio-ccw.c   | 16 
> > > > >  hw/s390x/virtio-ccw.c|  6 --
> > > > >  hw/scsi/vhost-scsi.c |  1 +
> > > > >  hw/scsi/vhost-user-scsi.c|  1 +
> > > > >  hw/virtio/vhost-user-fs.c|  2 +-
> > > > >  hw/virtio/vhost-user-vsock.c |  1 +
> > > > >  hw/virtio/virtio-mmio.c  | 15 +++
> > > > >  hw/virtio/virtio-pci.c   | 16 +++-
> > > > >  hw/virtio/virtio.c   | 18 ++
> > > > >  include/hw/virtio/virtio.h   |  5 -
> > > > >  net/vhost-vdpa.c |  1 +
> > > > >  13 files changed, 68 insertions(+), 17 deletions(-)
> > > > >
> > > > > --
> > > > > 2.39.3
> > > > >
> > > >
> >
>
>




Re: [PULL 34/56] hw/intc/grlib_irqmp: add ncpus property

2024-03-08 Thread Peter Maydell
On Thu, 15 Feb 2024 at 18:04, Philippe Mathieu-Daudé  wrote:
>
> From: Clément Chigot 
>
> This adds a "ncpus" property to the "grlib-irqmp" device to be used
> later, this required a little refactoring of how we initialize the
> device (ie: use realize instead of init).
>
> Co-developed-by: Frederic Konrad 
> Signed-off-by: Clément Chigot 
> Reviewed-by: Philippe Mathieu-Daudé 
> Message-ID: <20240131085047.18458-3-chi...@adacore.com>
> Signed-off-by: Philippe Mathieu-Daudé 

Hi; Coverity points out a bug in this commit (CID 1534914):


> -static void grlib_irqmp_init(Object *obj)
> +static void grlib_irqmp_realize(DeviceState *dev, Error **errp)
>  {
> -IRQMP *irqmp = GRLIB_IRQMP(obj);
> -SysBusDevice *dev = SYS_BUS_DEVICE(obj);
> +IRQMP *irqmp = GRLIB_IRQMP(dev);
>
> -qdev_init_gpio_in(DEVICE(obj), grlib_irqmp_set_irq, MAX_PILS);
> -qdev_init_gpio_out_named(DEVICE(obj), &irqmp->irq, "grlib-irq", 1);
> -memory_region_init_io(&irqmp->iomem, obj, &grlib_irqmp_ops, irqmp,
> +if ((!irqmp->ncpus) || (irqmp->ncpus > IRQMP_MAX_CPU)) {
> +error_setg(errp, "Invalid ncpus properties: "
> +   "%u, must be 0 < ncpus =< %u.", irqmp->ncpus,
> +   IRQMP_MAX_CPU);
> +}

We detect the out-of-range 'ncpus' value, but forget the "return"
statement, so execution will continue onward regardless, and
overrun the irqmp->irq[] array when we call qdev_init_gpio_out_named().

> +
> +qdev_init_gpio_in(dev, grlib_irqmp_set_irq, MAX_PILS);
> +qdev_init_gpio_out_named(dev, &irqmp->irq, "grlib-irq", 1);
> +memory_region_init_io(&irqmp->iomem, OBJECT(dev), &grlib_irqmp_ops, 
> irqmp,
>"irqmp", IRQMP_REG_SIZE);
>
>  irqmp->state = g_malloc0(sizeof *irqmp->state);
>
> -sysbus_init_mmio(dev, &irqmp->iomem);
> +sysbus_init_mmio(SYS_BUS_DEVICE(dev), &irqmp->iomem);
>  }

thanks
-- PMM



Re: [PATCH] linux-aio: add IO_CMD_FDSYNC command support

2024-03-08 Thread Prasad Pandit
Hello Kevin,

On Fri, 8 Mar 2024 at 14:35, Kevin Wolf  wrote:
> Hm... This might make it more challenging because then not only one
> specific request fails, but the whole submission batch.

* Yes exactly.
* I've updated yesterday's patch to return an error (-EINVAL) from
ioq_submit to laio_co_submit and fallback on thread-pool as you said.
It seems to be working fine when the kernel supports an AIO_FDSYNC
call. I'm trying to test it against the Fedora-26 kernel, which was <
4.13.0, and did not support the AIO_FDSYNC call.

> Do we know if it can include unrelated requests?

* Unrelated requests?

> It passes the return value to the request:
>
> if (ret < 0) {
> /* Fail the first request, retry the rest */
> aiocb = QSIMPLEQ_FIRST(&s->io_q.pending);
> QSIMPLEQ_REMOVE_HEAD(&s->io_q.pending, next);
> s->io_q.in_queue--;
> aiocb->ret = ret;
> qemu_laio_process_completion(aiocb);
> continue;
> }
>
> laio_co_submit() then fetches it from laiocb.ret and returns it to its
> caller (in this case, your new code).

* I did not get what's happening here. It looks like when 'io_submit'
returns an error, ioq_submit plucks out the first 'laiocb' object from
QSIMPLEQ and sets the error code (-EINVAL) in it, and continues to
submit further requests in the batch. But what if 'io_submit' of those
further requests also returns an error? ie. For each error code the
first object in the queue is removed and error code is saved in it?
But in laio_co_submit() it is only looking at the 'ret' value of one
local laiocb object, not all in the batch. So which return code 'ret'
is it really checking against -EINPROGRESS?

* If a user submits DEFAULT_MAX_BATCH 32=aio requests, in
laio_co_submit() there does not seem to be a way to see 'io_submit'
return value for all of them, right? Should it check the io_submit
status of them all?

Thank you.
---
  - Prasad




Re: [PATCH] linux-aio: add IO_CMD_FDSYNC command support

2024-03-08 Thread Kevin Wolf
Am 07.03.2024 um 18:19 hat Prasad Pandit geschrieben:
> Hi,
> 
> On Thu, 7 Mar 2024 at 19:21, Kevin Wolf  wrote:
> > Kernel support for this is "relatively" new, added in 2018. Should we
> > fall back to the thread pool if we receive -EINVAL?
> 
> laio_co_submit
>   laio_do_submit
> ioq_submit
>   io_submit
> 
> * When an aio operation is not supported by the kernel, io_submit()
> call would return '-EINVAL', indicating that the specified (_FDSYNC)
> operation is invalid for the file descriptor. ie. an error (-EINVAL)
> needs to reach the 'laio_co_submit' routine above, to make its caller
> fall back on the thread-pool functionality as default.

Hm... This might make it more challenging because then not only one
specific request fails, but the whole submission batch. Do we know if it
can include unrelated requests?

> * Strangely the 'ioq_submit' routine above ignores the return value
> from io_submit and returns void. I think we need to change that to be
> able to fall back on the thread-pool functionality. I'll try to see if
> such a change works as expected. Please let me know if there's another
> way to fix this.

It passes the return value to the request:

if (ret < 0) {
/* Fail the first request, retry the rest */
aiocb = QSIMPLEQ_FIRST(&s->io_q.pending);
QSIMPLEQ_REMOVE_HEAD(&s->io_q.pending, next);
s->io_q.in_queue--;
aiocb->ret = ret;
qemu_laio_process_completion(aiocb);
continue;
}

laio_co_submit() then fetches it from laiocb.ret and returns it to its
caller (in this case, your new code).

Kevin




Re: [PATCH v2 00/10] mirror: allow switching from background to active mode

2024-03-08 Thread Kevin Wolf
Am 07.03.2024 um 20:42 hat Vladimir Sementsov-Ogievskiy geschrieben:
> On 04.03.24 14:09, Peter Krempa wrote:
> > On Mon, Mar 04, 2024 at 11:48:54 +0100, Kevin Wolf wrote:
> > > Am 28.02.2024 um 19:07 hat Vladimir Sementsov-Ogievskiy geschrieben:
> > > > On 03.11.23 18:56, Markus Armbruster wrote:
> > > > > Kevin Wolf  writes:
> > 
> > [...]
> > 
> > > > > Is the job abstraction a failure?
> > > > > 
> > > > > We have
> > > > > 
> > > > >   block-job- command  since   job- commandsince
> > > > >   -
> > > > >   block-job-set-speed 1.1
> > > > >   block-job-cancel1.1 job-cancel  3.0
> > > > >   block-job-pause 1.3 job-pause   3.0
> > > > >   block-job-resume1.3 job-resume  3.0
> > > > >   block-job-complete  1.3 job-complete3.0
> > > > >   block-job-dismiss   2.12job-dismiss 3.0
> > > > >   block-job-finalize  2.12job-finalize3.0
> > > > >   block-job-change8.2
> > > > >   query-block-jobs1.1 query-jobs
> > 
> > [...]
> > 
> > > I consider these strictly optional. We don't really have strong reasons
> > > to deprecate these commands (they are just thin wrappers), and I think
> > > libvirt still uses block-job-* in some places.
> > 
> > Libvirt uses 'block-job-cancel' because it has different semantics from
> > 'job-cancel' which libvirt documented as the behaviour of the API that
> > uses it. (Semantics regarding the expectation of what is written to the
> > destination node at the point when the job is cancelled).
> > 
> 
> That's the following semantics:
> 
>   # Note that if you issue 'block-job-cancel' after 'drive-mirror' has
>   # indicated (via the event BLOCK_JOB_READY) that the source and
>   # destination are synchronized, then the event triggered by this
>   # command changes to BLOCK_JOB_COMPLETED, to indicate that the
>   # mirroring has ended and the destination now has a point-in-time copy
>   # tied to the time of the cancellation.
> 
> Hmm. Looking at this, it looks for me, that should probably a
> 'block-job-complete" command (as leading to BLOCK_JOB_COMPLETED).

Yes, it's just a different completion mode.

> Actually, what is the difference between block-job-complete and
> block-job-cancel(force=false) for mirror in ready state?
> 
> I only see the following differencies:
> 
> 1. block-job-complete documents that it completes the job
>synchronously.. But looking at mirror code I see it just set
>s->should_complete = true, which will be then handled
>asynchronously..  So I doubt that documentation is correct.
> 
> 2. block-job-complete will trigger final graph changes.
>block-job-cancel will not.
> 
> Is [2] really useful? Seems yes: in case of some failure before
> starting migration target, we'd like to continue executing source. So,
> no reason to break block-graph in source, better keep it unchanged.
> 
> But I think, such behavior better be setup by mirror-job start
> parameter, rather then by special option for cancel (or even
> compelete) command, useful only for mirror.

I'm not sure, having the option on the complete command makes more sense
to me than having it in blockdev-mirror.

I do see the challenge of representing this meaningfully in QAPI,
though. Semantically it should be a union with job-specific options and
only mirror adds the graph-changes option. But the union variant
can't be directly selected from another option - instead we have a job
ID, and the variant is the job type of the job with this ID.

Practically speaking, we would probably indeed end up with an optional
field in the existing completion command.

> So, what about the following substitution for block-job-cancel:
> 
> block-job-cancel(force=true)  -->  use job-cancel
> 
> block-job-cancel(force=false) for backup, stream, commit  -->  use job-cancel
> 
> block-job-cancel(force=false) for mirror in ready mode  -->
> 
>   instead, use block-job-complete. If you don't need final graph
>   modification which mirror job normally does, use graph-change=false
>   parameter for blockdev-mirror command.

Apart from the open question where to put the option, agreed.

> (I can hardly remember, that we've already discussed something like
> this long time ago, but I don't remember the results)

I think everyone agreed that this is how things should be, and nobody
did anything to achieve it.

> I also a bit unsure about active commit soft-cancelling semantics. Is
> it actually useful? If yes, block-commit command will need similar
> option.

Hm... That would commit everything down to the lower layer and then keep
the old overlays still around?

I could see a limited use case for committing into the immediate backing
file and then keep using the overlay to accumulate new changes (would be
more useful if we discarded the old content). Once you have intermediate
backing files, I don't think it makes

Re: [PATCH v2 00/10] mirror: allow switching from background to active mode

2024-03-08 Thread Fiona Ebner
Am 07.03.24 um 20:42 schrieb Vladimir Sementsov-Ogievskiy:
> On 04.03.24 14:09, Peter Krempa wrote:
>> On Mon, Mar 04, 2024 at 11:48:54 +0100, Kevin Wolf wrote:
>>> Am 28.02.2024 um 19:07 hat Vladimir Sementsov-Ogievskiy geschrieben:
 On 03.11.23 18:56, Markus Armbruster wrote:
> Kevin Wolf  writes:
>>
>> [...]
>>
> Is the job abstraction a failure?
>
> We have
>
>   block-job- command  since   job- command    since
>   -
>   block-job-set-speed 1.1
>   block-job-cancel    1.1 job-cancel  3.0
>   block-job-pause 1.3 job-pause   3.0
>   block-job-resume    1.3 job-resume  3.0
>   block-job-complete  1.3 job-complete    3.0
>   block-job-dismiss   2.12    job-dismiss 3.0
>   block-job-finalize  2.12    job-finalize    3.0
>   block-job-change    8.2
>   query-block-jobs    1.1 query-jobs
>>
>> [...]
>>
>>> I consider these strictly optional. We don't really have strong reasons
>>> to deprecate these commands (they are just thin wrappers), and I think
>>> libvirt still uses block-job-* in some places.
>>
>> Libvirt uses 'block-job-cancel' because it has different semantics from
>> 'job-cancel' which libvirt documented as the behaviour of the API that
>> uses it. (Semantics regarding the expectation of what is written to the
>> destination node at the point when the job is cancelled).
>>
> 
> That's the following semantics:
> 
>   # Note that if you issue 'block-job-cancel' after 'drive-mirror' has
>   # indicated (via the event BLOCK_JOB_READY) that the source and
>   # destination are synchronized, then the event triggered by this
>   # command changes to BLOCK_JOB_COMPLETED, to indicate that the
>   # mirroring has ended and the destination now has a point-in-time copy
>   # tied to the time of the cancellation.
> 
> Hmm. Looking at this, it looks for me, that should probably a
> 'block-job-complete" command (as leading to BLOCK_JOB_COMPLETED).
> 
> Actually, what is the difference between block-job-complete and
> block-job-cancel(force=false) for mirror in ready state?
> 
> I only see the following differencies:
> 
> 1. block-job-complete documents that it completes the job
> synchronously.. But looking at mirror code I see it just set
> s->should_complete = true, which will be then handled asynchronously..
> So I doubt that documentation is correct.
> 
> 2. block-job-complete will trigger final graph changes. block-job-cancel
> will not.
> 
> Is [2] really useful? Seems yes: in case of some failure before starting
> migration target, we'd like to continue executing source. So, no reason
> to break block-graph in source, better keep it unchanged.
> 

FWIW, we also rely on these special semantics. We allow cloning the disk
state of a running guest using drive-mirror (and before finishing,
fsfreeze in the guest for consistency). We cannot use block-job-complete
there, because we do not want to switch the source's drive.

> But I think, such behavior better be setup by mirror-job start
> parameter, rather then by special option for cancel (or even compelete)
> command, useful only for mirror.
> 
> So, what about the following substitution for block-job-cancel:
> 
> block-job-cancel(force=true)  -->  use job-cancel
> 
> block-job-cancel(force=false) for backup, stream, commit  -->  use
> job-cancel
> 
> block-job-cancel(force=false) for mirror in ready mode  -->
> 
>   instead, use block-job-complete. If you don't need final graph
> modification which mirror job normally does, use graph-change=false
> parameter for blockdev-mirror command.
> 

But yes, having a graph-change parameter would work for us too :)

Best Regards,
Fiona