Re: [PATCH for 6.1] tests: filter out TLS distinguished name in certificate checks

2021-08-04 Thread Eric Blake
On Wed, Aug 04, 2021 at 07:03:30PM +0100, Daniel P. Berrangé wrote:
> The version of GNUTLS in Fedora 34 has changed the order in which encodes
> fields when generating new TLS certificates. This in turn changes the
> order seen when querying the distinguished name. This ultimately breaks
> the expected output in the NBD TLS iotests. We don't need to be
> comparing the exact distinguished name text for the purpose of the test
> though, so it is fine to filter it out.
> 
> Reported-by: Eric Blake 
> Signed-off-by: Daniel P. Berrangé 
> ---

Reviewed-by: Eric Blake 
Tested-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org




Re: [PATCH 5/9] tests/acceptance: Use image_expand() in test_arm_orangepi_uboot_netbsd9

2021-08-04 Thread Philippe Mathieu-Daudé
Hi Niek,

On 6/23/21 8:00 PM, Philippe Mathieu-Daudé wrote:
> The NetBSD OrangePi image must be at least 2GiB, not less.
> Expand the SD card image to this size before using it.
> 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  tests/acceptance/boot_linux_console.py | 9 +++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/tests/acceptance/boot_linux_console.py 
> b/tests/acceptance/boot_linux_console.py
> index 61069f0064f..b10f7257503 100644
> --- a/tests/acceptance/boot_linux_console.py
> +++ b/tests/acceptance/boot_linux_console.py
> @@ -868,7 +868,12 @@ def test_arm_orangepi_uboot_netbsd9(self):
>  :avocado: tags=device:sd
>  :avocado: tags=os:netbsd
>  """
> -# This test download a 304MB compressed image and expand it to 2GB
> +# This test download a 304MB compressed image and expand it to 2GB,
> +# which is the minimum card size required by the NetBSD installer:
> +# https://wiki.netbsd.org/ports/evbarm/raspberry_pi/#index7h2
> +# "A 2 GB card is the smallest workable size that the installation
> +# image will fit on."

Do you agree with this comment and the one in the next patch?

> +NETBSD_SDCARD_MINSIZE = 2 * 1024 * 1024 * 1024
>  deb_url = ('http://snapshot.debian.org/archive/debian/'
> '20200108T145233Z/pool/main/u/u-boot/'
> 'u-boot-sunxi_2020.01%2Bdfsg-1_armhf.deb')
> @@ -886,7 +891,7 @@ def test_arm_orangepi_uboot_netbsd9(self):
>  image_path_gz = self.fetch_asset(image_url, asset_hash=image_hash)
>  image_path = os.path.join(self.workdir, 'armv7.img')
>  archive.gzip_uncompress(image_path_gz, image_path)
> -image_pow2ceil_expand(image_path)
> +image_expand(image_path, NETBSD_SDCARD_MINSIZE)
>  image_drive_args = 'if=sd,format=raw,snapshot=on,file=' + image_path
>  
>  # dd if=u-boot-sunxi-with-spl.bin of=armv7.img bs=1K seek=8 
> conv=notrunc
> 




[PATCH for 6.1] tests: filter out TLS distinguished name in certificate checks

2021-08-04 Thread Daniel P . Berrangé
The version of GNUTLS in Fedora 34 has changed the order in which encodes
fields when generating new TLS certificates. This in turn changes the
order seen when querying the distinguished name. This ultimately breaks
the expected output in the NBD TLS iotests. We don't need to be
comparing the exact distinguished name text for the purpose of the test
though, so it is fine to filter it out.

Reported-by: Eric Blake 
Signed-off-by: Daniel P. Berrangé 
---
 tests/qemu-iotests/233   | 2 +-
 tests/qemu-iotests/233.out   | 4 ++--
 tests/qemu-iotests/common.filter | 5 +
 3 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/tests/qemu-iotests/233 b/tests/qemu-iotests/233
index da150cd27b..9ca7b68f42 100755
--- a/tests/qemu-iotests/233
+++ b/tests/qemu-iotests/233
@@ -148,7 +148,7 @@ $QEMU_IMG info --image-opts \
 
 echo
 echo "== final server log =="
-cat "$TEST_DIR/server.log"
+cat "$TEST_DIR/server.log" | _filter_authz_check_tls
 rm -f "$TEST_DIR/server.log"
 
 # success, all done
diff --git a/tests/qemu-iotests/233.out b/tests/qemu-iotests/233.out
index c3c344811b..4b1f6a0e15 100644
--- a/tests/qemu-iotests/233.out
+++ b/tests/qemu-iotests/233.out
@@ -65,6 +65,6 @@ qemu-img: Could not open 
'driver=nbd,host=127.0.0.1,port=PORT,tls-creds=tls0': F
 == final server log ==
 qemu-nbd: option negotiation failed: Verify failed: No certificate was found.
 qemu-nbd: option negotiation failed: Verify failed: No certificate was found.
-qemu-nbd: option negotiation failed: TLS x509 authz check for 
CN=localhost,O=Cthulhu Dark Lord Enterprises client1,L=R'lyeh,C=South Pacific 
is denied
-qemu-nbd: option negotiation failed: TLS x509 authz check for 
CN=localhost,O=Cthulhu Dark Lord Enterprises client3,L=R'lyeh,C=South Pacific 
is denied
+qemu-nbd: option negotiation failed: TLS x509 authz check for 
DISTINGUISHED-NAME is denied
+qemu-nbd: option negotiation failed: TLS x509 authz check for 
DISTINGUISHED-NAME is denied
 *** done
diff --git a/tests/qemu-iotests/common.filter b/tests/qemu-iotests/common.filter
index 268b749e2f..2b2b53946c 100644
--- a/tests/qemu-iotests/common.filter
+++ b/tests/qemu-iotests/common.filter
@@ -332,5 +332,10 @@ for fname in fnames:
 sys.stdout.write(result)'
 }
 
+_filter_authz_check_tls()
+{
+$SED -e 's/TLS x509 authz check for .* is denied/TLS x509 authz check for 
DISTINGUISHED-NAME is denied/'
+}
+
 # make sure this script returns success
 true
-- 
2.31.1




Re: [PATCH v7 25/33] iotests.py: VM: add own __enter__ method

2021-08-04 Thread John Snow
On Wed, Aug 4, 2021 at 5:39 AM Vladimir Sementsov-Ogievskiy <
vsement...@virtuozzo.com> wrote:

> In superclass __enter__ method is defined with return value type hint
> 'QEMUMachine'. So, mypy thinks that return value of VM.__enter__ is
> QEMUMachine. Let's redefine __enter__ in VM class, to give it correct
> type hint.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> Reviewed-by: Max Reitz 
> ---
>  tests/qemu-iotests/iotests.py | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
> index 89663dac06..025e288ddd 100644
> --- a/tests/qemu-iotests/iotests.py
> +++ b/tests/qemu-iotests/iotests.py
> @@ -568,6 +568,10 @@ def remote_filename(path):
>  class VM(qtest.QEMUQtestMachine):
>  '''A QEMU VM'''
>
> +# Redefine __enter__ with proper type hint
> +def __enter__(self) -> 'VM':
> +return self
> +
>  def __init__(self, path_suffix=''):
>  name = "qemu%s-%d" % (path_suffix, os.getpid())
>  super().__init__(qemu_prog, qemu_opts, name=name,
> --
> 2.29.2
>

First and foremost:

Reviewed-by: John Snow 
Acked-by: John Snow 

A more durable approach might be to annotate QEMUMachine differently such
that subclasses get the right types automatically. See if this following
snippet works instead of adding a new __enter__ method.

diff --git a/python/qemu/machine/machine.py b/python/qemu/machine/machine.py
index 971ed7e8c6a..2e410103569 100644
--- a/python/qemu/machine/machine.py
+++ b/python/qemu/machine/machine.py
@@ -36,6 +36,7 @@
 Sequence,
 Tuple,
 Type,
+TypeVar,
 )

 from qemu.qmp import (  # pylint: disable=import-error
@@ -73,6 +74,9 @@ class AbnormalShutdown(QEMUMachineError):
 """


+_T = TypeVar('_T', bound='QEMUMachine')
+
+
 class QEMUMachine:
 """
 A QEMU VM.
@@ -166,7 +170,7 @@ def __init__(self,
 self._remove_files: List[str] = []
 self._user_killed = False

-def __enter__(self) -> 'QEMUMachine':
+def __enter__(self: _T) -> _T:
 return self

 def __exit__(self,
@@ -182,8 +186,8 @@ def add_monitor_null(self) -> None:
 self._args.append('-monitor')
 self._args.append('null')

-def add_fd(self, fd: int, fdset: int,
-   opaque: str, opts: str = '') -> 'QEMUMachine':
+def add_fd(self: _T, fd: int, fdset: int,
+   opaque: str, opts: str = '') -> _T:
 """
 Pass a file descriptor to the VM
 """


Re: [PATCH v2] block/io_uring: resubmit when result is -EAGAIN

2021-08-04 Thread Kevin Wolf
Am 04.08.2021 um 16:50 hat Stefano Garzarella geschrieben:
> On Mon, Aug 02, 2021 at 02:40:36PM +0200, Kevin Wolf wrote:
> > Am 29.07.2021 um 11:10 hat Fabian Ebner geschrieben:
> > > Linux SCSI can throw spurious -EAGAIN in some corner cases in its
> > > completion path, which will end up being the result in the completed
> > > io_uring request.
> > > 
> > > Resubmitting such requests should allow block jobs to complete, even
> > > if such spurious errors are encountered.
> > > 
> > > Co-authored-by: Stefan Hajnoczi 
> > > Reviewed-by: Stefano Garzarella 
> > > Signed-off-by: Fabian Ebner 
> > > ---
> > > 
> > > Changes from v1:
> > > * Focus on what's relevant for the patch itself in the commit
> > >   message.
> > > * Add Stefan's comment.
> > > * Add Stefano's R-b tag (I hope that's fine, since there was no
> > >   change code-wise).
> > > 
> > >  block/io_uring.c | 16 +++-
> > >  1 file changed, 15 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/block/io_uring.c b/block/io_uring.c
> > > index 00a3ee9fb8..dfa475cc87 100644
> > > --- a/block/io_uring.c
> > > +++ b/block/io_uring.c
> > > @@ -165,7 +165,21 @@ static void luring_process_completions(LuringState 
> > > *s)
> > >  total_bytes = ret + luringcb->total_read;
> > > 
> > >  if (ret < 0) {
> > > -if (ret == -EINTR) {
> > > +/*
> > > + * Only writev/readv/fsync requests on regular files or host 
> > > block
> > > + * devices are submitted. Therefore -EAGAIN is not expected 
> > > but it's
> > > + * known to happen sometimes with Linux SCSI. Submit again 
> > > and hope
> > > + * the request completes successfully.
> > > + *
> > > + * For more information, see:
> > > + * 
> > > https://lore.kernel.org/io-uring/20210727165811.284510-3-ax...@kernel.dk/T/#u
> > > + *
> > > + * If the code is changed to submit other types of requests 
> > > in the
> > > + * future, then this workaround may need to be extended to 
> > > deal with
> > > + * genuine -EAGAIN results that should not be resubmitted
> > > + * immediately.
> > > + */
> > > +if (ret == -EINTR || ret == -EAGAIN) {
> > >  luring_resubmit(s, luringcb);
> > >  continue;
> > >  }
> > 
> > Reviewed-by: Kevin Wolf 
> > 
> > Question about the preexisting code, though: luring_resubmit() requires
> > that the caller calls ioq_submit() later so that the request doesn't
> > just sit in a queue without getting any attention, but actually gets
> > submitted to the kernel.
> > 
> > In the call chain ioq_submit() -> luring_process_completions() ->
> > luring_resubmit(), who takes care of that?
> 
> Mmm, good point.
> There should be the same problem with ioq_submit() ->
> luring_process_completions() -> luring_resubmit_short_read() ->
> luring_resubmit().
> 
> Should we schedule a BH for example in luring_resubmit() to make sure that
> ioq_submit() is invoked after a resubmission?

Or just loop in ioq_submit() after calling luring_process_completions()
if new requests were added to the queue?

Kevin




Re: [PATCH v7 29/33] iotests.py: hmp_qemu_io: support qdev

2021-08-04 Thread John Snow
On Wed, Aug 4, 2021 at 5:39 AM Vladimir Sementsov-Ogievskiy <
vsement...@virtuozzo.com> wrote:

> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> Reviewed-by: Max Reitz 
> ---
>  tests/qemu-iotests/iotests.py | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
> index 025e288ddd..9d0031a0e8 100644
> --- a/tests/qemu-iotests/iotests.py
> +++ b/tests/qemu-iotests/iotests.py
> @@ -655,9 +655,10 @@ def resume_drive(self, drive: str) -> None:
>  self.hmp(f'qemu-io {drive} "remove_break bp_{drive}"')
>
>  def hmp_qemu_io(self, drive: str, cmd: str,
> -use_log: bool = False) -> QMPMessage:
> +use_log: bool = False, qdev: bool = False) ->
> QMPMessage:
>  """Write to a given drive using an HMP command"""
> -return self.hmp(f'qemu-io {drive} "{cmd}"', use_log=use_log)
> +d = '-d ' if qdev else ''
> +return self.hmp(f'qemu-io {d}{drive} "{cmd}"', use_log=use_log)
>
>  def flatten_qmp_object(self, obj, output=None, basestr=''):
>  if output is None:
> --
> 2.29.2
>
>
Guess that's really the only flag that this HMP command supports. I was
gonna suggest abstracting to {args} ... but, uh, that's the only one, so...
sure!

Reviewed-by: John Snow 


Re: [PATCH 0/7] floppy: build as modules.

2021-08-04 Thread Philippe Mathieu-Daudé
+Mark

On 8/4/21 4:27 PM, Gerd Hoffmann wrote:
> Some code shuffling needed beforehand due to floppy being part of
> several platforms.  While being at it also make floppy optional
> in pc machine type.

>   floppy: move fdctrl_init_sysbus
>   floppy: move sun4m_fdctrl_init

https://www.mail-archive.com/qemu-block@nongnu.org/msg84008.html

Mark suggested:

  You may be able to simplify this further by removing the
  global legacy init functions fdctrl_init_sysbus() and
  sun4m_fdctrl_init(): from what I can see fdctrl_init_sysbus()
  is only used in hw/mips/jazz.c and sun4m_fdctrl_init() is only
  used in hw/sparc/sun4m.c so you might as well inline them or
  move the functions to the relevant files.

I did it and plan to send during 6.2. Sounds simpler than module.
You could easily rebase your series on top (or I can include your
patches while sending).



Re: [PATCH v2] block/io_uring: resubmit when result is -EAGAIN

2021-08-04 Thread Stefano Garzarella

On Mon, Aug 02, 2021 at 02:40:36PM +0200, Kevin Wolf wrote:

Am 29.07.2021 um 11:10 hat Fabian Ebner geschrieben:

Linux SCSI can throw spurious -EAGAIN in some corner cases in its
completion path, which will end up being the result in the completed
io_uring request.

Resubmitting such requests should allow block jobs to complete, even
if such spurious errors are encountered.

Co-authored-by: Stefan Hajnoczi 
Reviewed-by: Stefano Garzarella 
Signed-off-by: Fabian Ebner 
---

Changes from v1:
* Focus on what's relevant for the patch itself in the commit
  message.
* Add Stefan's comment.
* Add Stefano's R-b tag (I hope that's fine, since there was no
  change code-wise).

 block/io_uring.c | 16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/block/io_uring.c b/block/io_uring.c
index 00a3ee9fb8..dfa475cc87 100644
--- a/block/io_uring.c
+++ b/block/io_uring.c
@@ -165,7 +165,21 @@ static void luring_process_completions(LuringState *s)
 total_bytes = ret + luringcb->total_read;

 if (ret < 0) {
-if (ret == -EINTR) {
+/*
+ * Only writev/readv/fsync requests on regular files or host block
+ * devices are submitted. Therefore -EAGAIN is not expected but 
it's
+ * known to happen sometimes with Linux SCSI. Submit again and hope
+ * the request completes successfully.
+ *
+ * For more information, see:
+ * 
https://lore.kernel.org/io-uring/20210727165811.284510-3-ax...@kernel.dk/T/#u
+ *
+ * If the code is changed to submit other types of requests in the
+ * future, then this workaround may need to be extended to deal 
with
+ * genuine -EAGAIN results that should not be resubmitted
+ * immediately.
+ */
+if (ret == -EINTR || ret == -EAGAIN) {
 luring_resubmit(s, luringcb);
 continue;
 }


Reviewed-by: Kevin Wolf 

Question about the preexisting code, though: luring_resubmit() requires
that the caller calls ioq_submit() later so that the request doesn't
just sit in a queue without getting any attention, but actually gets
submitted to the kernel.

In the call chain ioq_submit() -> luring_process_completions() ->
luring_resubmit(), who takes care of that?


Mmm, good point.
There should be the same problem with ioq_submit() -> 
luring_process_completions() -> luring_resubmit_short_read() -> 
luring_resubmit().


Should we schedule a BH for example in luring_resubmit() to make sure 
that ioq_submit() is invoked after a resubmission?


Thanks,
Stefano




[PATCH 7/7] pc: add floppy=OnOffAuto

2021-08-04 Thread Gerd Hoffmann
Allows to enable/disable the floppy controller.  Default depends on
MachineClass->no_floppy.  It's ON for now, but we can flip the default
for 6.2+ machine types.

NOTE: This requires -nodefaults or no_floppy=1 to actually have an
effect.  Otherwise the default floppy drive created by qemu will
auto-enable the floppy controller.  Not sure how to deal with that best.
IMHO we should simply stop creating a default floppy, unfortunaly
that will break live migration.

Signed-off-by: Gerd Hoffmann 
---
 include/hw/i386/pc.h |  2 ++
 hw/i386/pc.c | 23 +++
 hw/i386/pc_piix.c|  8 +++-
 3 files changed, 32 insertions(+), 1 deletion(-)

diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 88dffe751724..b418ead6c260 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -38,6 +38,7 @@ typedef struct PCMachineState {
 /* Configuration options: */
 uint64_t max_ram_below_4g;
 OnOffAuto vmport;
+OnOffAuto floppy;
 
 bool acpi_build_enabled;
 bool smbus_enabled;
@@ -59,6 +60,7 @@ typedef struct PCMachineState {
 #define PC_MACHINE_MAX_RAM_BELOW_4G "max-ram-below-4g"
 #define PC_MACHINE_DEVMEM_REGION_SIZE "device-memory-region-size"
 #define PC_MACHINE_VMPORT   "vmport"
+#define PC_MACHINE_FLOPPY   "floppy"
 #define PC_MACHINE_SMBUS"smbus"
 #define PC_MACHINE_SATA "sata"
 #define PC_MACHINE_PIT  "pit"
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index c2b9d62a358f..832ea9cc8ef8 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1468,6 +1468,23 @@ static void pc_machine_set_vmport(Object *obj, Visitor 
*v, const char *name,
 visit_type_OnOffAuto(v, name, &pcms->vmport, errp);
 }
 
+static void pc_machine_get_floppy(Object *obj, Visitor *v, const char *name,
+  void *opaque, Error **errp)
+{
+PCMachineState *pcms = PC_MACHINE(obj);
+OnOffAuto floppy = pcms->floppy;
+
+visit_type_OnOffAuto(v, name, &floppy, errp);
+}
+
+static void pc_machine_set_floppy(Object *obj, Visitor *v, const char *name,
+  void *opaque, Error **errp)
+{
+PCMachineState *pcms = PC_MACHINE(obj);
+
+visit_type_OnOffAuto(v, name, &pcms->floppy, errp);
+}
+
 static bool pc_machine_get_smbus(Object *obj, Error **errp)
 {
 PCMachineState *pcms = PC_MACHINE(obj);
@@ -1751,6 +1768,12 @@ static void pc_machine_class_init(ObjectClass *oc, void 
*data)
 object_class_property_set_description(oc, PC_MACHINE_VMPORT,
 "Enable vmport (pc & q35)");
 
+object_class_property_add(oc, PC_MACHINE_FLOPPY, "OnOffAuto",
+pc_machine_get_floppy, pc_machine_set_floppy,
+NULL, NULL);
+object_class_property_set_description(oc, PC_MACHINE_FLOPPY,
+"Enable floppy (pc only)");
+
 object_class_property_add_bool(oc, PC_MACHINE_SMBUS,
 pc_machine_get_smbus, pc_machine_set_smbus);
 
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 30b8bd6ea92d..7f81729e42cd 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -235,8 +235,14 @@ static void pc_init1(MachineState *machine,
 pcms->vmport = xen_enabled() ? ON_OFF_AUTO_OFF : ON_OFF_AUTO_ON;
 }
 
+if (pcms->floppy == ON_OFF_AUTO_AUTO) {
+pcms->floppy = MACHINE_CLASS(pcmc)->no_floppy
+? ON_OFF_AUTO_OFF : ON_OFF_AUTO_ON;
+}
+
 /* init basic PC hardware */
-pc_basic_device_init(pcms, isa_bus, x86ms->gsi, &rtc_state, true,
+pc_basic_device_init(pcms, isa_bus, x86ms->gsi, &rtc_state,
+ pcms->floppy == ON_OFF_AUTO_ON,
  0x4);
 
 pc_nic_init(pcmc, isa_bus, pci_bus);
-- 
2.31.1




[PATCH 6/7] floppy: build as modules.

2021-08-04 Thread Gerd Hoffmann
Add module_obj() annotations, update meson build rules.

Signed-off-by: Gerd Hoffmann 
---
 hw/block/fdc-isa.c|  2 ++
 hw/block/fdc-sysbus.c |  4 
 hw/block/fdc.c|  2 ++
 hw/block/meson.build  | 17 ++---
 4 files changed, 22 insertions(+), 3 deletions(-)

diff --git a/hw/block/fdc-isa.c b/hw/block/fdc-isa.c
index a5a124fb9236..fb2139760f8c 100644
--- a/hw/block/fdc-isa.c
+++ b/hw/block/fdc-isa.c
@@ -259,6 +259,7 @@ static const TypeInfo isa_fdc_info = {
 .class_init= isabus_fdc_class_init,
 .instance_init = isabus_fdc_instance_init,
 };
+module_obj(TYPE_ISA_FDC);
 
 static void isa_fdc_register_types(void)
 {
@@ -266,3 +267,4 @@ static void isa_fdc_register_types(void)
 }
 
 type_init(isa_fdc_register_types)
+module_dep("hw-block-fdc");
diff --git a/hw/block/fdc-sysbus.c b/hw/block/fdc-sysbus.c
index b358b6467ef5..8cc65cd92642 100644
--- a/hw/block/fdc-sysbus.c
+++ b/hw/block/fdc-sysbus.c
@@ -137,6 +137,7 @@ static const TypeInfo sysbus_fdc_common_typeinfo = {
 .class_init= sysbus_fdc_common_class_init,
 .class_size= sizeof(FDCtrlSysBusClass),
 };
+module_obj(TYPE_SYSBUS_FDC);
 
 static Property sysbus_fdc_properties[] = {
 DEFINE_PROP_SIGNED("fdtypeA", FDCtrlSysBus, state.qdev_for_drives[0].type,
@@ -164,6 +165,7 @@ static const TypeInfo sysbus_fdc_typeinfo = {
 .parent= TYPE_SYSBUS_FDC,
 .class_init= sysbus_fdc_class_init,
 };
+module_obj("sysbus-fdc");
 
 static Property sun4m_fdc_properties[] = {
 DEFINE_PROP_SIGNED("fdtype", FDCtrlSysBus, state.qdev_for_drives[0].type,
@@ -190,6 +192,7 @@ static const TypeInfo sun4m_fdc_typeinfo = {
 .parent= TYPE_SYSBUS_FDC,
 .class_init= sun4m_fdc_class_init,
 };
+module_obj("sun-fdtwo");
 
 static void sysbus_fdc_register_types(void)
 {
@@ -199,3 +202,4 @@ static void sysbus_fdc_register_types(void)
 }
 
 type_init(sysbus_fdc_register_types)
+module_dep("hw-block-fdc");
diff --git a/hw/block/fdc.c b/hw/block/fdc.c
index ba42537e8d26..95a1467f3faf 100644
--- a/hw/block/fdc.c
+++ b/hw/block/fdc.c
@@ -74,6 +74,7 @@ static const TypeInfo floppy_bus_info = {
 .parent = TYPE_BUS,
 .instance_size = sizeof(FloppyBus),
 };
+module_obj(TYPE_FLOPPY_BUS);
 
 static void floppy_bus_create(FDCtrl *fdc, FloppyBus *bus, DeviceState *dev)
 {
@@ -564,6 +565,7 @@ static const TypeInfo floppy_drive_info = {
 .instance_size = sizeof(FloppyDrive),
 .class_init = floppy_drive_class_init,
 };
+module_obj(TYPE_FLOPPY_DRIVE);
 
 //
 /* Intel 82078 floppy disk controller emulation  */
diff --git a/hw/block/meson.build b/hw/block/meson.build
index 8460042fe320..b336773ac591 100644
--- a/hw/block/meson.build
+++ b/hw/block/meson.build
@@ -1,3 +1,5 @@
+hw_block_modules = {}
+
 softmmu_ss.add(files(
   'block.c',
   'cdrom.c',
@@ -5,9 +7,6 @@ softmmu_ss.add(files(
 ))
 softmmu_ss.add(when: 'CONFIG_ECC', if_true: files('ecc.c'))
 softmmu_ss.add(when: 'CONFIG_FDC', if_true: files('fdc-module.c'))
-softmmu_ss.add(when: 'CONFIG_FDC', if_true: files('fdc.c'))
-softmmu_ss.add(when: 'CONFIG_FDC_ISA', if_true: files('fdc-isa.c'))
-softmmu_ss.add(when: 'CONFIG_FDC_SYSBUS', if_true: files('fdc-sysbus.c'))
 softmmu_ss.add(when: 'CONFIG_NAND', if_true: files('nand.c'))
 softmmu_ss.add(when: 'CONFIG_ONENAND', if_true: files('onenand.c'))
 softmmu_ss.add(when: 'CONFIG_PFLASH_CFI01', if_true: files('pflash_cfi01.c'))
@@ -20,4 +19,16 @@ softmmu_ss.add(when: 'CONFIG_TC58128', if_true: 
files('tc58128.c'))
 specific_ss.add(when: 'CONFIG_VIRTIO_BLK', if_true: files('virtio-blk.c'))
 specific_ss.add(when: 'CONFIG_VHOST_USER_BLK', if_true: 
files('vhost-user-blk.c'))
 
+fdc_ss = ss.source_set()
+fdc_isa_ss = ss.source_set()
+fdc_sysbus_ss = ss.source_set()
+fdc_ss.add(when: 'CONFIG_FDC', if_true: files('fdc.c'))
+fdc_isa_ss.add(when: 'CONFIG_FDC_ISA', if_true: files('fdc-isa.c'))
+fdc_sysbus_ss.add(when: 'CONFIG_FDC_SYSBUS', if_true: files('fdc-sysbus.c'))
+hw_block_modules += {'fdc': fdc_ss,
+ 'fdc-isa' : fdc_isa_ss,
+ 'fdc-sysbus' : fdc_sysbus_ss }
+
 subdir('dataplane')
+
+modules += { 'hw-block': hw_block_modules }
-- 
2.31.1




[PATCH 5/7] floppy: move cmos_get_fd_drive_type

2021-08-04 Thread Gerd Hoffmann
Needed by pc machine init.

Move to separate source file so we can keep it in core qemu
when building floppy as module.

Signed-off-by: Gerd Hoffmann 
---
 hw/block/fdc-isa.c| 25 -
 hw/block/fdc-module.c | 25 +
 2 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/hw/block/fdc-isa.c b/hw/block/fdc-isa.c
index 36c246efa822..a5a124fb9236 100644
--- a/hw/block/fdc-isa.c
+++ b/hw/block/fdc-isa.c
@@ -162,31 +162,6 @@ static Aml *build_fdinfo_aml(int idx, FloppyDriveType type)
 return dev;
 }
 
-int cmos_get_fd_drive_type(FloppyDriveType fd0)
-{
-int val;
-
-switch (fd0) {
-case FLOPPY_DRIVE_TYPE_144:
-/* 1.44 Mb 3"5 drive */
-val = 4;
-break;
-case FLOPPY_DRIVE_TYPE_288:
-/* 2.88 Mb 3"5 drive */
-val = 5;
-break;
-case FLOPPY_DRIVE_TYPE_120:
-/* 1.2 Mb 5"5 drive */
-val = 2;
-break;
-case FLOPPY_DRIVE_TYPE_NONE:
-default:
-val = 0;
-break;
-}
-return val;
-}
-
 static void fdc_isa_build_aml(ISADevice *isadev, Aml *scope)
 {
 Aml *dev;
diff --git a/hw/block/fdc-module.c b/hw/block/fdc-module.c
index 3682b5c1ebd1..7e2aaf86a0c0 100644
--- a/hw/block/fdc-module.c
+++ b/hw/block/fdc-module.c
@@ -98,3 +98,28 @@ FloppyDriveType isa_fdc_get_drive_type(ISADevice *fdc, int i)
 
 return isa->state.drives[i].drive;
 }
+
+int cmos_get_fd_drive_type(FloppyDriveType fd0)
+{
+int val;
+
+switch (fd0) {
+case FLOPPY_DRIVE_TYPE_144:
+/* 1.44 Mb 3"5 drive */
+val = 4;
+break;
+case FLOPPY_DRIVE_TYPE_288:
+/* 2.88 Mb 3"5 drive */
+val = 5;
+break;
+case FLOPPY_DRIVE_TYPE_120:
+/* 1.2 Mb 5"5 drive */
+val = 2;
+break;
+case FLOPPY_DRIVE_TYPE_NONE:
+default:
+val = 0;
+break;
+}
+return val;
+}
-- 
2.31.1




[PATCH 3/7] floppy: move fdctrl_init_sysbus

2021-08-04 Thread Gerd Hoffmann
Needed by mips machine init.

Move to separate source file so we can keep it in core qemu
when building floppy as module.

Signed-off-by: Gerd Hoffmann 
---
 hw/block/fdc-internal.h | 15 +++
 hw/block/fdc-module.c   | 21 +
 hw/block/fdc-sysbus.c   | 34 --
 3 files changed, 36 insertions(+), 34 deletions(-)

diff --git a/hw/block/fdc-internal.h b/hw/block/fdc-internal.h
index a74cd5e4b9f2..f6c56f6e827f 100644
--- a/hw/block/fdc-internal.h
+++ b/hw/block/fdc-internal.h
@@ -30,6 +30,7 @@
 #include "hw/block/block.h"
 #include "hw/block/fdc.h"
 #include "hw/isa/isa.h"
+#include "hw/sysbus.h"
 #include "qapi/qapi-types-block.h"
 
 typedef struct FDCtrl FDCtrl;
@@ -159,6 +160,20 @@ struct FDCtrlISABus {
 int32_t bootindexB;
 };
 
+#define TYPE_SYSBUS_FDC "base-sysbus-fdc"
+typedef struct FDCtrlSysBusClass FDCtrlSysBusClass;
+typedef struct FDCtrlSysBus FDCtrlSysBus;
+DECLARE_OBJ_CHECKERS(FDCtrlSysBus, FDCtrlSysBusClass,
+ SYSBUS_FDC, TYPE_SYSBUS_FDC)
+
+struct FDCtrlSysBus {
+/*< private >*/
+SysBusDevice parent_obj;
+/*< public >*/
+
+struct FDCtrl state;
+};
+
 extern const FDFormat fd_formats[];
 extern const VMStateDescription vmstate_fdc;
 
diff --git a/hw/block/fdc-module.c b/hw/block/fdc-module.c
index 8309e99361bc..11e6ae7c0cb9 100644
--- a/hw/block/fdc-module.c
+++ b/hw/block/fdc-module.c
@@ -29,10 +29,31 @@
 #include "qemu/osdep.h"
 #include "hw/isa/isa.h"
 #include "hw/block/fdc.h"
+#include "hw/sysbus.h"
 #include "qapi/error.h"
 #include "sysemu/blockdev.h"
 #include "fdc-internal.h"
 
+void fdctrl_init_sysbus(qemu_irq irq, int dma_chann,
+hwaddr mmio_base, DriveInfo **fds)
+{
+FDCtrl *fdctrl;
+DeviceState *dev;
+SysBusDevice *sbd;
+FDCtrlSysBus *sys;
+
+dev = qdev_new("sysbus-fdc");
+sys = SYSBUS_FDC(dev);
+fdctrl = &sys->state;
+fdctrl->dma_chann = dma_chann; /* FIXME */
+sbd = SYS_BUS_DEVICE(dev);
+sysbus_realize_and_unref(sbd, &error_fatal);
+sysbus_connect_irq(sbd, 0, irq);
+sysbus_mmio_map(sbd, 0, mmio_base);
+
+fdctrl_init_drives(&sys->state.bus, fds);
+}
+
 void fdctrl_init_drives(FloppyBus *bus, DriveInfo **fds)
 {
 DeviceState *dev;
diff --git a/hw/block/fdc-sysbus.c b/hw/block/fdc-sysbus.c
index 57fc8773f124..5a8d393d31c2 100644
--- a/hw/block/fdc-sysbus.c
+++ b/hw/block/fdc-sysbus.c
@@ -32,12 +32,6 @@
 #include "fdc-internal.h"
 #include "trace.h"
 
-#define TYPE_SYSBUS_FDC "base-sysbus-fdc"
-typedef struct FDCtrlSysBusClass FDCtrlSysBusClass;
-typedef struct FDCtrlSysBus FDCtrlSysBus;
-DECLARE_OBJ_CHECKERS(FDCtrlSysBus, FDCtrlSysBusClass,
- SYSBUS_FDC, TYPE_SYSBUS_FDC)
-
 struct FDCtrlSysBusClass {
 /*< private >*/
 SysBusDeviceClass parent_class;
@@ -46,14 +40,6 @@ struct FDCtrlSysBusClass {
 bool use_strict_io;
 };
 
-struct FDCtrlSysBus {
-/*< private >*/
-SysBusDevice parent_obj;
-/*< public >*/
-
-struct FDCtrl state;
-};
-
 static uint64_t fdctrl_read_mem(void *opaque, hwaddr reg, unsigned ize)
 {
 return fdctrl_read(opaque, (uint32_t)reg);
@@ -94,26 +80,6 @@ static void fdctrl_handle_tc(void *opaque, int irq, int 
level)
 trace_fdctrl_tc_pulse(level);
 }
 
-void fdctrl_init_sysbus(qemu_irq irq, int dma_chann,
-hwaddr mmio_base, DriveInfo **fds)
-{
-FDCtrl *fdctrl;
-DeviceState *dev;
-SysBusDevice *sbd;
-FDCtrlSysBus *sys;
-
-dev = qdev_new("sysbus-fdc");
-sys = SYSBUS_FDC(dev);
-fdctrl = &sys->state;
-fdctrl->dma_chann = dma_chann; /* FIXME */
-sbd = SYS_BUS_DEVICE(dev);
-sysbus_realize_and_unref(sbd, &error_fatal);
-sysbus_connect_irq(sbd, 0, irq);
-sysbus_mmio_map(sbd, 0, mmio_base);
-
-fdctrl_init_drives(&sys->state.bus, fds);
-}
-
 void sun4m_fdctrl_init(qemu_irq irq, hwaddr io_base,
DriveInfo **fds, qemu_irq *fdc_tc)
 {
-- 
2.31.1




[PATCH 4/7] floppy: move sun4m_fdctrl_init

2021-08-04 Thread Gerd Hoffmann
Needed by sparc machine init.

Move to separate source file so we can keep it in core qemu
when building floppy as module.

Signed-off-by: Gerd Hoffmann 
---
 hw/block/fdc-module.c | 16 
 hw/block/fdc-sysbus.c | 16 
 2 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/hw/block/fdc-module.c b/hw/block/fdc-module.c
index 11e6ae7c0cb9..3682b5c1ebd1 100644
--- a/hw/block/fdc-module.c
+++ b/hw/block/fdc-module.c
@@ -54,6 +54,22 @@ void fdctrl_init_sysbus(qemu_irq irq, int dma_chann,
 fdctrl_init_drives(&sys->state.bus, fds);
 }
 
+void sun4m_fdctrl_init(qemu_irq irq, hwaddr io_base,
+   DriveInfo **fds, qemu_irq *fdc_tc)
+{
+DeviceState *dev;
+FDCtrlSysBus *sys;
+
+dev = qdev_new("sun-fdtwo");
+sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
+sys = SYSBUS_FDC(dev);
+sysbus_connect_irq(SYS_BUS_DEVICE(sys), 0, irq);
+sysbus_mmio_map(SYS_BUS_DEVICE(sys), 0, io_base);
+*fdc_tc = qdev_get_gpio_in(dev, 0);
+
+fdctrl_init_drives(&sys->state.bus, fds);
+}
+
 void fdctrl_init_drives(FloppyBus *bus, DriveInfo **fds)
 {
 DeviceState *dev;
diff --git a/hw/block/fdc-sysbus.c b/hw/block/fdc-sysbus.c
index 5a8d393d31c2..b358b6467ef5 100644
--- a/hw/block/fdc-sysbus.c
+++ b/hw/block/fdc-sysbus.c
@@ -80,22 +80,6 @@ static void fdctrl_handle_tc(void *opaque, int irq, int 
level)
 trace_fdctrl_tc_pulse(level);
 }
 
-void sun4m_fdctrl_init(qemu_irq irq, hwaddr io_base,
-   DriveInfo **fds, qemu_irq *fdc_tc)
-{
-DeviceState *dev;
-FDCtrlSysBus *sys;
-
-dev = qdev_new("sun-fdtwo");
-sysbus_realize_and_unref(SYS_BUS_DEVICE(dev), &error_fatal);
-sys = SYSBUS_FDC(dev);
-sysbus_connect_irq(SYS_BUS_DEVICE(sys), 0, irq);
-sysbus_mmio_map(SYS_BUS_DEVICE(sys), 0, io_base);
-*fdc_tc = qdev_get_gpio_in(dev, 0);
-
-fdctrl_init_drives(&sys->state.bus, fds);
-}
-
 static void sysbus_fdc_common_instance_init(Object *obj)
 {
 DeviceState *dev = DEVICE(obj);
-- 
2.31.1




[PATCH 1/7] floppy: move isa_fdc_get_drive_type to separate source file.

2021-08-04 Thread Gerd Hoffmann
isa_fdc_get_drive_type() is needed by pc machine types
when setting up the cmos.

Move to separate source file so we can keep it in core qemu
when building floppy as module.

Signed-off-by: Gerd Hoffmann 
---
 hw/block/fdc-internal.h | 16 
 hw/block/fdc-isa.c  | 22 --
 hw/block/fdc-module.c   | 39 +++
 hw/block/meson.build|  1 +
 4 files changed, 56 insertions(+), 22 deletions(-)
 create mode 100644 hw/block/fdc-module.c

diff --git a/hw/block/fdc-internal.h b/hw/block/fdc-internal.h
index 036392e9fc10..a74cd5e4b9f2 100644
--- a/hw/block/fdc-internal.h
+++ b/hw/block/fdc-internal.h
@@ -29,6 +29,7 @@
 #include "exec/ioport.h"
 #include "hw/block/block.h"
 #include "hw/block/fdc.h"
+#include "hw/isa/isa.h"
 #include "qapi/qapi-types-block.h"
 
 typedef struct FDCtrl FDCtrl;
@@ -143,6 +144,21 @@ struct FDCtrl {
 PortioList portio_list;
 };
 
+OBJECT_DECLARE_SIMPLE_TYPE(FDCtrlISABus, ISA_FDC)
+
+struct FDCtrlISABus {
+/*< private >*/
+ISADevice parent_obj;
+/*< public >*/
+
+uint32_t iobase;
+uint32_t irq;
+uint32_t dma;
+struct FDCtrl state;
+int32_t bootindexA;
+int32_t bootindexB;
+};
+
 extern const FDFormat fd_formats[];
 extern const VMStateDescription vmstate_fdc;
 
diff --git a/hw/block/fdc-isa.c b/hw/block/fdc-isa.c
index 3bf64e06657b..dd7e1669f862 100644
--- a/hw/block/fdc-isa.c
+++ b/hw/block/fdc-isa.c
@@ -49,21 +49,6 @@
 #include "qom/object.h"
 #include "fdc-internal.h"
 
-OBJECT_DECLARE_SIMPLE_TYPE(FDCtrlISABus, ISA_FDC)
-
-struct FDCtrlISABus {
-/*< private >*/
-ISADevice parent_obj;
-/*< public >*/
-
-uint32_t iobase;
-uint32_t irq;
-uint32_t dma;
-struct FDCtrl state;
-int32_t bootindexA;
-int32_t bootindexB;
-};
-
 static void fdctrl_external_reset_isa(DeviceState *d)
 {
 FDCtrlISABus *isa = ISA_FDC(d);
@@ -117,13 +102,6 @@ static void isabus_fdc_realize(DeviceState *dev, Error 
**errp)
 }
 }
 
-FloppyDriveType isa_fdc_get_drive_type(ISADevice *fdc, int i)
-{
-FDCtrlISABus *isa = ISA_FDC(fdc);
-
-return isa->state.drives[i].drive;
-}
-
 static void isa_fdc_get_drive_max_chs(FloppyDriveType type, uint8_t *maxc,
   uint8_t *maxh, uint8_t *maxs)
 {
diff --git a/hw/block/fdc-module.c b/hw/block/fdc-module.c
new file mode 100644
index ..93953bf0aa57
--- /dev/null
+++ b/hw/block/fdc-module.c
@@ -0,0 +1,39 @@
+/*
+ * QEMU Floppy disk emulator (Intel 82078)
+ *
+ * Some small helper functions which must be built into core qemu when
+ * building floppy as module.
+ *
+ * Copyright (c) 2003, 2007 Jocelyn Mayer
+ * Copyright (c) 2008 Hervé Poussineau
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to 
deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include "qemu/osdep.h"
+#include "hw/isa/isa.h"
+#include "hw/block/fdc.h"
+#include "fdc-internal.h"
+
+FloppyDriveType isa_fdc_get_drive_type(ISADevice *fdc, int i)
+{
+FDCtrlISABus *isa = ISA_FDC(fdc);
+
+return isa->state.drives[i].drive;
+}
diff --git a/hw/block/meson.build b/hw/block/meson.build
index 2389326112ae..8460042fe320 100644
--- a/hw/block/meson.build
+++ b/hw/block/meson.build
@@ -4,6 +4,7 @@ softmmu_ss.add(files(
   'hd-geometry.c'
 ))
 softmmu_ss.add(when: 'CONFIG_ECC', if_true: files('ecc.c'))
+softmmu_ss.add(when: 'CONFIG_FDC', if_true: files('fdc-module.c'))
 softmmu_ss.add(when: 'CONFIG_FDC', if_true: files('fdc.c'))
 softmmu_ss.add(when: 'CONFIG_FDC_ISA', if_true: files('fdc-isa.c'))
 softmmu_ss.add(when: 'CONFIG_FDC_SYSBUS', if_true: files('fdc-sysbus.c'))
-- 
2.31.1




[PATCH 2/7] floppy: move isa_fdc_init_drives + fdctrl_init_drives

2021-08-04 Thread Gerd Hoffmann
isa_fdc_init_drives() is called by pc machine setup,
and it depends on fdctrl_init_drives().

Move both functions to separate source file so we can
keep them in core qemu when building floppy as module.

Signed-off-by: Gerd Hoffmann 
---
 hw/block/fdc-isa.c|  5 -
 hw/block/fdc-module.c | 24 
 hw/block/fdc.c| 17 -
 3 files changed, 24 insertions(+), 22 deletions(-)

diff --git a/hw/block/fdc-isa.c b/hw/block/fdc-isa.c
index dd7e1669f862..36c246efa822 100644
--- a/hw/block/fdc-isa.c
+++ b/hw/block/fdc-isa.c
@@ -57,11 +57,6 @@ static void fdctrl_external_reset_isa(DeviceState *d)
 fdctrl_reset(s, 0);
 }
 
-void isa_fdc_init_drives(ISADevice *fdc, DriveInfo **fds)
-{
-fdctrl_init_drives(&ISA_FDC(fdc)->state.bus, fds);
-}
-
 static const MemoryRegionPortio fdc_portio_list[] = {
 { 1, 5, 1, .read = fdctrl_read, .write = fdctrl_write },
 { 7, 1, 1, .read = fdctrl_read, .write = fdctrl_write },
diff --git a/hw/block/fdc-module.c b/hw/block/fdc-module.c
index 93953bf0aa57..8309e99361bc 100644
--- a/hw/block/fdc-module.c
+++ b/hw/block/fdc-module.c
@@ -29,8 +29,32 @@
 #include "qemu/osdep.h"
 #include "hw/isa/isa.h"
 #include "hw/block/fdc.h"
+#include "qapi/error.h"
+#include "sysemu/blockdev.h"
 #include "fdc-internal.h"
 
+void fdctrl_init_drives(FloppyBus *bus, DriveInfo **fds)
+{
+DeviceState *dev;
+int i;
+
+for (i = 0; i < MAX_FD; i++) {
+if (fds[i]) {
+dev = qdev_new("floppy");
+qdev_prop_set_uint32(dev, "unit", i);
+qdev_prop_set_enum(dev, "drive-type", FLOPPY_DRIVE_TYPE_AUTO);
+qdev_prop_set_drive_err(dev, "drive", blk_by_legacy_dinfo(fds[i]),
+&error_fatal);
+qdev_realize_and_unref(dev, &bus->bus, &error_fatal);
+}
+}
+}
+
+void isa_fdc_init_drives(ISADevice *fdc, DriveInfo **fds)
+{
+fdctrl_init_drives(&ISA_FDC(fdc)->state.bus, fds);
+}
+
 FloppyDriveType isa_fdc_get_drive_type(ISADevice *fdc, int i)
 {
 FDCtrlISABus *isa = ISA_FDC(fdc);
diff --git a/hw/block/fdc.c b/hw/block/fdc.c
index 9014cd30b3ab..ba42537e8d26 100644
--- a/hw/block/fdc.c
+++ b/hw/block/fdc.c
@@ -2297,23 +2297,6 @@ static void fdctrl_result_timer(void *opaque)
 
 /* Init functions */
 
-void fdctrl_init_drives(FloppyBus *bus, DriveInfo **fds)
-{
-DeviceState *dev;
-int i;
-
-for (i = 0; i < MAX_FD; i++) {
-if (fds[i]) {
-dev = qdev_new("floppy");
-qdev_prop_set_uint32(dev, "unit", i);
-qdev_prop_set_enum(dev, "drive-type", FLOPPY_DRIVE_TYPE_AUTO);
-qdev_prop_set_drive_err(dev, "drive", blk_by_legacy_dinfo(fds[i]),
-&error_fatal);
-qdev_realize_and_unref(dev, &bus->bus, &error_fatal);
-}
-}
-}
-
 void fdctrl_realize_common(DeviceState *dev, FDCtrl *fdctrl, Error **errp)
 {
 int i, j;
-- 
2.31.1




[PATCH 0/7] floppy: build as modules.

2021-08-04 Thread Gerd Hoffmann
Some code shuffling needed beforehand due to floppy being part of
several platforms.  While being at it also make floppy optional
in pc machine type.

Gerd Hoffmann (7):
  floppy: move isa_fdc_get_drive_type to separate source file.
  floppy: move isa_fdc_init_drives + fdctrl_init_drives
  floppy: move fdctrl_init_sysbus
  floppy: move sun4m_fdctrl_init
  floppy: move cmos_get_fd_drive_type
  floppy: build as modules.
  pc: add floppy=OnOffAuto

 hw/block/fdc-internal.h |  31 ++
 include/hw/i386/pc.h|   2 +
 hw/block/fdc-isa.c  |  54 +
 hw/block/fdc-module.c   | 125 
 hw/block/fdc-sysbus.c   |  54 ++---
 hw/block/fdc.c  |  19 +-
 hw/i386/pc.c|  23 
 hw/i386/pc_piix.c   |   8 ++-
 hw/block/meson.build|  18 +-
 9 files changed, 211 insertions(+), 123 deletions(-)
 create mode 100644 hw/block/fdc-module.c

-- 
2.31.1





Re: [PATCH for-6.1? v2 5/7] job: Add job_cancel_requested()

2021-08-04 Thread Max Reitz

On 04.08.21 12:34, Kevin Wolf wrote:

[ Peter, the question for you is at the end. ]

Am 04.08.2021 um 10:07 hat Max Reitz geschrieben:

On 03.08.21 16:25, Kevin Wolf wrote:

Am 26.07.2021 um 16:46 hat Max Reitz geschrieben:

Most callers of job_is_cancelled() actually want to know whether the job
is on its way to immediate termination.  For example, we refuse to pause
jobs that are cancelled; but this only makes sense for jobs that are
really actually cancelled.

A mirror job that is cancelled during READY with force=false should
absolutely be allowed to pause.  This "cancellation" (which is actually
a kind of completion) may take an indefinite amount of time, and so
should behave like any job during normal operation.  For example, with
on-target-error=stop, the job should stop on write errors.  (In
contrast, force-cancelled jobs should not get write errors, as they
should just terminate and not do further I/O.)

Therefore, redefine job_is_cancelled() to only return true for jobs that
are force-cancelled (which as of HEAD^ means any job that interprets the
cancellation request as a request for immediate termination), and add
job_cancel_request() as the general variant, which returns true for any
jobs which have been requested to be cancelled, whether it be
immediately or after an arbitrarily long completion phase.

Buglink: https://gitlab.com/qemu-project/qemu/-/issues/462
Signed-off-by: Max Reitz 
---
   include/qemu/job.h |  8 +++-
   block/mirror.c | 10 --
   job.c  |  7 ++-
   3 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/include/qemu/job.h b/include/qemu/job.h
index 8aa90f7395..032edf3c5f 100644
--- a/include/qemu/job.h
+++ b/include/qemu/job.h
@@ -436,9 +436,15 @@ const char *job_type_str(const Job *job);
   /** Returns true if the job should not be visible to the management layer. */
   bool job_is_internal(Job *job);
-/** Returns whether the job is scheduled for cancellation. */
+/** Returns whether the job is being cancelled. */
   bool job_is_cancelled(Job *job);
+/**
+ * Returns whether the job is scheduled for cancellation (at an
+ * indefinite point).
+ */
+bool job_cancel_requested(Job *job);

I don't think non-force blockdev-cancel for mirror should actually be
considered cancellation, so what is the question that this function
answers?

"Is this a cancelled job, or a mirror block job that is supposed to
complete soon, but only if it doesn't switch over the users to the
target on completion"?

Well, technically yes, but it was more intended as “Has the user ever
invoked (block-)job-cancel on this job?”.

I understand this, but is this much more useful to know than "Has the
user ever called HMP 'change'?", if you know what I mean?


Hm.  Not really.  It’s still a crutch that shouldn’t be there ideally.

But I like this crutch for this series so I can get this batch done, and 
then worry about all the other bugs that keep popping up (and where 
job_cancel_requested() is a nice sign that something’s off).



Is this ever a reasonable question to ask, except maybe inside the
mirror implementation itself?

I asked myself the same for v3, but found two places in job.c where I
would like to keep it:

First, there’s an assertion in job_completed_txn_abort().  All jobs
other than @job have been force-cancelled, and so job_is_cancelled()
would be true for them.  As for @job itself, the function is mostly
called when the job’s return value is not 0, but a soft-cancelled
mirror does have a return value of 0 and so would not end up in that
function.
But job_cancel() invokes job_completed_txn_abort() if the job has been
deferred to the main loop, which mostly correlates with the job having
been completed (in which case the assertion is skipped), but not 100 %
(there’s a small window between setting deferred_to_main_loop and the
job changing to a completed state).
So I’d prefer to keep the assertion as-is functionally, i.e. to only
check job->cancelled.

Well, you don't. It's still job_is_cancelled() after this patch.


No: I didn’t. O:)

For v3, I had absolutely planned to use job_cancel_requested(), and I 
wanted to put the above explanation into the commit message.



So the scenario you're concerned about is a job that has just finished
successfully (job->ret = 0) and then gets a cancel request?


Yes.


With force=false, I'm pretty sure the code is wrong anyway because
calling job_completed_txn_abort() is not the right response.


Absolutely possible, I just didn’t want to deal with this, too… :/


It should
return an error because you're trying to complete twice, possibly with
conflicting completion modes. Second best is just ignoring the cancel
request because we obviously already fulfilled the request of completing
the job (the completion mode might be different, though).

With force=true, arguably still letting the job fail is correct.
However, letting it fail involves more than just letting the transaction
fail. We would have to call job_update_

Re: Failing iotest 206

2021-08-04 Thread Daniel P . Berrangé
On Tue, Aug 03, 2021 at 07:17:47PM +0200, Kevin Wolf wrote:
> Am 20.07.2021 um 10:32 hat Daniel P. Berrangé geschrieben:
> > On Mon, Jul 19, 2021 at 08:12:58PM -0500, Eric Blake wrote:
> > > On Mon, Jul 19, 2021 at 10:06:01AM +0200, Thomas Huth wrote:
> > > >  Hi,
> > > > 
> > > > iotest 206 fails for me with:
> > > > 
> > > 
> > > > --- 206.out
> > > > +++ 206.out.bad
> > > > @@ -99,55 +99,19 @@
> > > > 
> > > >  {"execute": "blockdev-create", "arguments": {"job-id": "job0", 
> > > > "options":
> > > > {"driver": "qcow2", "encrypt": {"cipher-alg": "twofish-128", 
> > > > "cipher-mode":
> > > > "ctr", "format": "luks", "hash-alg": "sha1", "iter-time": 10, 
> > > > "ivgen-alg":
> > > > "plain64", "ivgen-hash-alg": "md5", "key-secret": "keysec0"}, "file":
> > > > {"driver": "file", "filename": "TEST_DIR/PID-t.qcow2"}, "size": 
> > > > 33554432}}}
> > > >  {"return": {}}
> > > > +Job failed: Unsupported cipher algorithm twofish-128 with ctr mode
> > > >  {"execute": "job-dismiss", "arguments": {"id": "job0"}}
> > > >  {"return": {}}
> > > 
> > > > 
> > > > Looks like it is missing a check for the availability of the 
> > > > corresponding
> > > > crypto stuff? Does anybody got a clue how to fix this?
> > > 
> > > What system is this on? Which crypto library versions are installed?
> > > I suspect this is related to Dan's effort to speed up crypto by
> > > favoring gnutls over nettle, where the switch in favored libraries
> > > failed to account for whether twofish-128 is supported?
> > > 
> > > https://lists.gnu.org/archive/html/qemu-devel/2021-07/msg03886.html
> > 
> > Yes, the gnutls provider doesn't support twofish. This doesn't matter
> > in real world usage because no one is seriously going to ask for twofish
> > instead of AES for luks encryption.
> > 
> > I guess that test suite was simply trying to ask for some non-default
> > values though.
> 
> Do we already have a patch somewhere that makes it use a different
> value? Or if not, which value would be most likely to work everywhere?

Ultimately there is only one cipher alg that is guaranteed 'aes',
which can be used in two keysizes 128/256, and two modes cbc/xts.

Sine aes-128 with xts is the default, if you want to exercise
a non-default codepath for LUKS support, i'd suggest aes-256
with cbc mode, and essiv IV generator.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




[PATCH 11/11] block: introduce fleecing block driver

2021-08-04 Thread Vladimir Sementsov-Ogievskiy
Introduce a new driver, that works in pair with copy-before-write to
improve fleecing.

Without fleecing driver, fleecing scheme may look as follows:

[guest]
  |
  |root
  v
[copy-before-write] -> [temp.qcow2] <--- [nbd export]
  | target  |
  |file |backing
  v |
[active disk] <-+

With fleecing driver, new scheme is:

   bdrv_ref()
[guest]<~~~[fleecing] <--- [nbd export]
  | ||
  |root  +--source--+|file
  v  v   v
[copy-before-write] -> [temp.img]
  | target
  |file
  v
[active disk]

Benefits of new scheme:

1. Access control: if remote client try to read data that not covered
   by original dirty bitmap used on copy-before-write open, client gets
   -EACCES.

2. Discard support: if remote client do DISCARD, this additionally to
   discarding data in temp.img informs block-copy process to not copy
   these clusters. Next read from discarded area will return -EACCES.

3. Synchronisation between client reads and block-copy write is more
   efficient: it doesn't block intersecting block-copy write during
   client read (hmm, we still needlesly block it, as block-copy
   always serialize writes, it's a TODO to stop doing so).

4. We don't rely on backing feature: active disk should not be backing
   of temp image, so we avoid some permission-related difficulties
   (cleaning them up in copy-before-write filter is a TODO) and temp
   image now is not required to support backing, it may be simple raw
   image.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 block/fleecing.c  | 183 ++
 block/meson.build |   1 +
 2 files changed, 184 insertions(+)
 create mode 100644 block/fleecing.c

diff --git a/block/fleecing.c b/block/fleecing.c
new file mode 100644
index 00..7d213da1ca
--- /dev/null
+++ b/block/fleecing.c
@@ -0,0 +1,183 @@
+/*
+ * copy-before-write filter driver
+ *
+ * The driver performs Copy-Before-Write (CBW) operation: it is injected above
+ * some node, and before each write it copies _old_ data to the target node.
+ *
+ * Copyright (c) 2018-2021 Virtuozzo International GmbH.
+ *
+ * Author:
+ *  Sementsov-Ogievskiy Vladimir 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see .
+ */
+
+#include "qemu/osdep.h"
+
+#include "sysemu/block-backend.h"
+#include "qemu/cutils.h"
+#include "qapi/error.h"
+#include "block/block_int.h"
+#include "block/qdict.h"
+#include "block/block-copy.h"
+
+#include "block/copy-before-write.h"
+
+typedef struct BDRVFleecingState {
+BlockDriverState *cbw;
+BdrvChild *source;
+} BDRVFleecingState;
+
+static coroutine_fn int fleecing_co_preadv_part(
+BlockDriverState *bs, uint64_t offset, uint64_t bytes,
+QEMUIOVector *qiov, size_t qiov_offset, int flags)
+{
+BDRVFleecingState *s = bs->opaque;
+const BlockReq *req;
+int ret;
+
+/* TODO: upgrade to async loop using AioTask */
+while (bytes) {
+int64_t cur_bytes;
+
+ret = cbw_snapshot_read_lock(s->cbw, offset, bytes, &req, &cur_bytes);
+if (ret < 0) {
+return ret;
+}
+
+if (req) {
+ret = bdrv_co_preadv_part(s->source, offset, cur_bytes,
+  qiov, qiov_offset, flags);
+cbw_snapshot_read_unlock(s->cbw, req);
+} else {
+ret = bdrv_co_preadv_part(bs->file, offset, cur_bytes,
+  qiov, qiov_offset, flags);
+}
+if (ret < 0) {
+return ret;
+}
+
+bytes -= cur_bytes;
+offset += cur_bytes;
+qiov_offset += cur_bytes;
+}
+
+return 0;
+}
+
+static int coroutine_fn fleecing_co_pdiscard(BlockDriverState *bs,
+ int64_t offset, int bytes)
+{
+BDRVFleecingState *s = bs->opaque;
+
+cbw_snapshot_discard(s->cbw, offset, bytes);
+
+bdrv_co_pdiscard(bs->file, offset, bytes);
+
+/*
+ * Ignore bdrv_co_pdiscard() result: cbw_snapshot_discard() succeeded, that
+ * means that next read from this area will fail with -EACCES. More correct
+ * to report success now.
+ */
+return 0;
+}
+
+static int coroutine_fn fleecing_co_pwrite_zeroes(BlockDriverState *bs,
+int64_t 

[PATCH 08/11] block/reqlist: add reqlist_wait_all()

2021-08-04 Thread Vladimir Sementsov-Ogievskiy
To be used in the next commit.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 include/block/reqlist.h | 2 ++
 block/reqlist.c | 6 ++
 2 files changed, 8 insertions(+)

diff --git a/include/block/reqlist.h b/include/block/reqlist.h
index 2de86be300..59d5c24cda 100644
--- a/include/block/reqlist.h
+++ b/include/block/reqlist.h
@@ -39,6 +39,8 @@ BlockReq *reqlist_find_conflict(BlockReqList *reqs, int64_t 
offset,
 int64_t bytes);
 bool coroutine_fn reqlist_wait_one(BlockReqList *reqs, int64_t offset,
int64_t bytes, CoMutex *lock);
+void coroutine_fn reqlist_wait_all(BlockReqList *reqs, int64_t offset,
+   int64_t bytes, CoMutex *lock);
 void coroutine_fn reqlist_shrink_req(BlockReq *req, int64_t new_bytes);
 void coroutine_fn reqlist_remove_req(BlockReq *req);
 
diff --git a/block/reqlist.c b/block/reqlist.c
index c41415c16a..939437621d 100644
--- a/block/reqlist.c
+++ b/block/reqlist.c
@@ -68,6 +68,12 @@ bool coroutine_fn reqlist_wait_one(BlockReqList *reqs, 
int64_t offset,
 return true;
 }
 
+void coroutine_fn reqlist_wait_all(BlockReqList *reqs, int64_t offset,
+   int64_t bytes, CoMutex *lock)
+{
+while (reqlist_wait_one(reqs, offset, bytes, lock)) { }
+}
+
 /*
  * Shrink request and wake all waiting coroutines (may be some of them are not
  * intersecting with shrunk request).
-- 
2.29.2




[PATCH 06/11] block: intoduce reqlist

2021-08-04 Thread Vladimir Sementsov-Ogievskiy
Split intersecting-requests functionality out of block-copy to be
reused in copy-before-write filter.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 include/block/reqlist.h |  45 
 block/block-copy.c  | 116 +---
 block/reqlist.c |  94 
 block/meson.build   |   1 +
 4 files changed, 177 insertions(+), 79 deletions(-)
 create mode 100644 include/block/reqlist.h
 create mode 100644 block/reqlist.c

diff --git a/include/block/reqlist.h b/include/block/reqlist.h
new file mode 100644
index 00..2de86be300
--- /dev/null
+++ b/include/block/reqlist.h
@@ -0,0 +1,45 @@
+/*
+ * block_copy API
+ *
+ * Copyright (C) 2013 Proxmox Server Solutions
+ * Copyright (c) 2021 Virtuozzo International GmbH.
+ *
+ * Authors:
+ *  Dietmar Maurer (diet...@proxmox.com)
+ *  Vladimir Sementsov-Ogievskiy 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#ifndef REQLIST_H
+#define REQLIST_H
+
+#include "qemu/coroutine.h"
+
+/*
+ * The API is not thread-safe and shouldn't be. The struct is public to be part
+ * of other structures and protected by third-party locks, see
+ * block/block-copy.c for example.
+ */
+
+typedef struct BlockReq {
+int64_t offset;
+int64_t bytes;
+
+CoQueue wait_queue; /* coroutines blocked on this req */
+QLIST_ENTRY(BlockReq) list;
+} BlockReq;
+
+typedef QLIST_HEAD(, BlockReq) BlockReqList;
+
+void reqlist_init_req(BlockReqList *reqs, BlockReq *req, int64_t offset,
+  int64_t bytes);
+BlockReq *reqlist_find_conflict(BlockReqList *reqs, int64_t offset,
+int64_t bytes);
+bool coroutine_fn reqlist_wait_one(BlockReqList *reqs, int64_t offset,
+   int64_t bytes, CoMutex *lock);
+void coroutine_fn reqlist_shrink_req(BlockReq *req, int64_t new_bytes);
+void coroutine_fn reqlist_remove_req(BlockReq *req);
+
+#endif /* REQLIST_H */
diff --git a/block/block-copy.c b/block/block-copy.c
index 5cd291727b..de388b7b96 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -17,6 +17,7 @@
 #include "trace.h"
 #include "qapi/error.h"
 #include "block/block-copy.h"
+#include "block/reqlist.h"
 #include "sysemu/block-backend.h"
 #include "qemu/units.h"
 #include "qemu/coroutine.h"
@@ -82,7 +83,6 @@ typedef struct BlockCopyTask {
  */
 BlockCopyState *s;
 BlockCopyCallState *call_state;
-int64_t offset;
 /*
  * @method can also be set again in the while loop of
  * block_copy_dirty_clusters(), but it is never accessed concurrently
@@ -93,21 +93,17 @@ typedef struct BlockCopyTask {
 BlockCopyMethod method;
 
 /*
- * Fields whose state changes throughout the execution
- * Protected by lock in BlockCopyState.
+ * Generally, req is protected by lock in BlockCopyState, Still req.offset
+ * is only set on task creation, so may be read concurrently after 
creation.
+ * req.bytes is changed at most once, and need only protecting the case of
+ * parallel read while updating @bytes value in block_copy_task_shrink().
  */
-CoQueue wait_queue; /* coroutines blocked on this task */
-/*
- * Only protect the case of parallel read while updating @bytes
- * value in block_copy_task_shrink().
- */
-int64_t bytes;
-QLIST_ENTRY(BlockCopyTask) list;
+BlockReq req;
 } BlockCopyTask;
 
 static int64_t task_end(BlockCopyTask *task)
 {
-return task->offset + task->bytes;
+return task->req.offset + task->req.bytes;
 }
 
 typedef struct BlockCopyState {
@@ -135,7 +131,7 @@ typedef struct BlockCopyState {
 CoMutex lock;
 int64_t in_flight_bytes;
 BlockCopyMethod method;
-QLIST_HEAD(, BlockCopyTask) tasks; /* All tasks from all block-copy calls 
*/
+BlockReqList reqs;
 QLIST_HEAD(, BlockCopyCallState) calls;
 /*
  * skip_unallocated:
@@ -159,42 +155,6 @@ typedef struct BlockCopyState {
 RateLimit rate_limit;
 } BlockCopyState;
 
-/* Called with lock held */
-static BlockCopyTask *find_conflicting_task(BlockCopyState *s,
-int64_t offset, int64_t bytes)
-{
-BlockCopyTask *t;
-
-QLIST_FOREACH(t, &s->tasks, list) {
-if (offset + bytes > t->offset && offset < t->offset + t->bytes) {
-return t;
-}
-}
-
-return NULL;
-}
-
-/*
- * If there are no intersecting tasks return false. Otherwise, wait for the
- * first found intersecting tasks to finish and return true.
- *
- * Called with lock held. May temporary release the lock.
- * Return value of 0 proves that lock was NOT released.
- */
-static bool coroutine_fn block_copy_wait_one(BlockCopyState *s, int64_t offset,
- int64_t bytes)
-{
-BlockCopyTask *task = find_conflicting_task(s, offset, bytes);
-
-if (!task) {
-return false;
-

[PATCH 10/11] block/copy-before-write: add cbw_snapshot_discard()

2021-08-04 Thread Vladimir Sementsov-Ogievskiy
To be used soon.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 block/copy-before-write.h |  1 +
 block/copy-before-write.c | 11 +++
 2 files changed, 12 insertions(+)

diff --git a/block/copy-before-write.h b/block/copy-before-write.h
index a7e286620c..5809ffc7d0 100644
--- a/block/copy-before-write.h
+++ b/block/copy-before-write.h
@@ -40,5 +40,6 @@ void bdrv_cbw_drop(BlockDriverState *bs);
 int cbw_snapshot_read_lock(BlockDriverState *bs, int64_t offset,
int64_t bytes, const BlockReq **req, int64_t *pnum);
 void cbw_snapshot_read_unlock(BlockDriverState *bs, const BlockReq *req);
+void cbw_snapshot_discard(BlockDriverState *bs, int64_t offset, int64_t bytes);
 
 #endif /* COPY_BEFORE_WRITE_H */
diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index a96131358e..a9fc8e34e9 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c
@@ -103,6 +103,17 @@ void cbw_snapshot_read_unlock(BlockDriverState *bs, const 
BlockReq *req)
 drop_read_req(s, (BlockReq *)req);
 }
 
+void cbw_snapshot_discard(BlockDriverState *bs, int64_t offset, int64_t bytes)
+{
+BDRVCopyBeforeWriteState *s = bs->opaque;
+
+WITH_QEMU_LOCK_GUARD(&s->lock) {
+bdrv_reset_dirty_bitmap(s->access_bitmap, offset, bytes);
+}
+
+block_copy_reset(s->bcs, offset, bytes);
+}
+
 static coroutine_fn int cbw_co_preadv(
 BlockDriverState *bs, uint64_t offset, uint64_t bytes,
 QEMUIOVector *qiov, int flags)
-- 
2.29.2




[PATCH 09/11] block/copy-before-write: add cbw_snapshot_read_{lock, unlock}()

2021-08-04 Thread Vladimir Sementsov-Ogievskiy via
Add interface which help to do fleecing read. To be used in the next
commit.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 block/copy-before-write.h |   5 ++
 block/copy-before-write.c | 103 +-
 2 files changed, 106 insertions(+), 2 deletions(-)

diff --git a/block/copy-before-write.h b/block/copy-before-write.h
index 51847e711a..a7e286620c 100644
--- a/block/copy-before-write.h
+++ b/block/copy-before-write.h
@@ -28,6 +28,7 @@
 
 #include "block/block_int.h"
 #include "block/block-copy.h"
+#include "block/reqlist.h"
 
 BlockDriverState *bdrv_cbw_append(BlockDriverState *source,
   BlockDriverState *target,
@@ -36,4 +37,8 @@ BlockDriverState *bdrv_cbw_append(BlockDriverState *source,
   Error **errp);
 void bdrv_cbw_drop(BlockDriverState *bs);
 
+int cbw_snapshot_read_lock(BlockDriverState *bs, int64_t offset,
+   int64_t bytes, const BlockReq **req, int64_t *pnum);
+void cbw_snapshot_read_unlock(BlockDriverState *bs, const BlockReq *req);
+
 #endif /* COPY_BEFORE_WRITE_H */
diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index b58a5e8b48..a96131358e 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c
@@ -31,14 +31,78 @@
 #include "block/block_int.h"
 #include "block/qdict.h"
 #include "block/block-copy.h"
+#include "block/reqlist.h"
 
 #include "block/copy-before-write.h"
 
 typedef struct BDRVCopyBeforeWriteState {
 BlockCopyState *bcs;
 BdrvChild *target;
+CoMutex lock;
+
+BdrvDirtyBitmap *access_bitmap;
+BdrvDirtyBitmap *done_bitmap;
+
+BlockReqList frozen_read_reqs;
 } BDRVCopyBeforeWriteState;
 
+static BlockReq *add_read_req(BDRVCopyBeforeWriteState *s, uint64_t offset,
+  uint64_t bytes)
+{
+BlockReq *req = g_new(BlockReq, 1);
+
+reqlist_init_req(&s->frozen_read_reqs, req, offset, bytes);
+
+return req;
+}
+
+static void drop_read_req(BDRVCopyBeforeWriteState *s, BlockReq *req)
+{
+reqlist_remove_req(req);
+g_free(req);
+}
+
+/*
+ * Convenient function for thous who want to do fleecing read.
+ *
+ * If requested region starts in "done" area, i.e. data is already copied to
+ * copy-before-write target node, req is set to NULL, pnum is set to available
+ * bytes to read from target. User is free to read @pnum bytes from target.
+ * Still, user is responsible for concurrent discards on target.
+ *
+ * If requests region starts in "not done" area, i.e. we have to read from
+ * source node directly, than @pnum bytes of source node are frozen and
+ * guaranteed not be rewritten until user calls cbw_snapshot_read_unlock().
+ */
+int cbw_snapshot_read_lock(BlockDriverState *bs, int64_t offset,
+   int64_t bytes, const BlockReq **req, int64_t *pnum)
+{
+BDRVCopyBeforeWriteState *s = bs->opaque;
+bool done;
+
+QEMU_LOCK_GUARD(&s->lock);
+
+if (bdrv_dirty_bitmap_next_zero(s->access_bitmap, offset, bytes) != -1) {
+return -EACCES;
+}
+
+bdrv_dirty_bitmap_status(s->done_bitmap, offset, bytes, &done, pnum);
+if (!done) {
+*req = add_read_req(s, offset, *pnum);
+}
+
+return 0;
+}
+
+void cbw_snapshot_read_unlock(BlockDriverState *bs, const BlockReq *req)
+{
+BDRVCopyBeforeWriteState *s = bs->opaque;
+
+QEMU_LOCK_GUARD(&s->lock);
+
+drop_read_req(s, (BlockReq *)req);
+}
+
 static coroutine_fn int cbw_co_preadv(
 BlockDriverState *bs, uint64_t offset, uint64_t bytes,
 QEMUIOVector *qiov, int flags)
@@ -50,6 +114,7 @@ static coroutine_fn int 
cbw_do_copy_before_write(BlockDriverState *bs,
 uint64_t offset, uint64_t bytes, BdrvRequestFlags flags)
 {
 BDRVCopyBeforeWriteState *s = bs->opaque;
+int ret;
 uint64_t off, end;
 int64_t cluster_size = block_copy_cluster_size(s->bcs);
 
@@ -60,7 +125,17 @@ static coroutine_fn int 
cbw_do_copy_before_write(BlockDriverState *bs,
 off = QEMU_ALIGN_DOWN(offset, cluster_size);
 end = QEMU_ALIGN_UP(offset + bytes, cluster_size);
 
-return block_copy(s->bcs, off, end - off, true);
+ret = block_copy(s->bcs, off, end - off, true);
+if (ret < 0) {
+return ret;
+}
+
+WITH_QEMU_LOCK_GUARD(&s->lock) {
+bdrv_set_dirty_bitmap(s->done_bitmap, off, end - off);
+reqlist_wait_all(&s->frozen_read_reqs, off, end - off, &s->lock);
+}
+
+return 0;
 }
 
 static int coroutine_fn cbw_co_pdiscard(BlockDriverState *bs,
@@ -148,7 +223,11 @@ static int cbw_open(BlockDriverState *bs, QDict *options, 
int flags,
 Error **errp)
 {
 BDRVCopyBeforeWriteState *s = bs->opaque;
-BdrvDirtyBitmap *bitmap = NULL;
+BdrvDirtyBitmap *bcs_bitmap, *bitmap = NULL;
+bool ok;
+
+qemu_co_mutex_init(&s->lock);
+QLIST_INIT(&s->frozen_read_reqs);
 
 bs->file = bdrv_open_child(NULL, options, "file", bs, &child_of_bds,
BDRV_CHILD_F

[PATCH 07/11] block/dirty-bitmap: introduce bdrv_dirty_bitmap_status()

2021-08-04 Thread Vladimir Sementsov-Ogievskiy
Add a convenient function similar with bdrv_block_status() to get
status of dirty bitmap.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 include/block/dirty-bitmap.h |  2 ++
 include/qemu/hbitmap.h   | 11 +++
 block/dirty-bitmap.c |  6 ++
 util/hbitmap.c   | 36 
 4 files changed, 55 insertions(+)

diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h
index f95d350b70..2ae7dc3d1d 100644
--- a/include/block/dirty-bitmap.h
+++ b/include/block/dirty-bitmap.h
@@ -115,6 +115,8 @@ int64_t bdrv_dirty_bitmap_next_zero(BdrvDirtyBitmap 
*bitmap, int64_t offset,
 bool bdrv_dirty_bitmap_next_dirty_area(BdrvDirtyBitmap *bitmap,
 int64_t start, int64_t end, int64_t max_dirty_count,
 int64_t *dirty_start, int64_t *dirty_count);
+void bdrv_dirty_bitmap_status(BdrvDirtyBitmap *bitmap, int64_t offset,
+  int64_t bytes, bool *is_dirty, int64_t *count);
 BdrvDirtyBitmap *bdrv_reclaim_dirty_bitmap_locked(BdrvDirtyBitmap *bitmap,
   Error **errp);
 
diff --git a/include/qemu/hbitmap.h b/include/qemu/hbitmap.h
index 5e71b6d6f7..845fda12db 100644
--- a/include/qemu/hbitmap.h
+++ b/include/qemu/hbitmap.h
@@ -340,6 +340,17 @@ bool hbitmap_next_dirty_area(const HBitmap *hb, int64_t 
start, int64_t end,
  int64_t max_dirty_count,
  int64_t *dirty_start, int64_t *dirty_count);
 
+/*
+ * bdrv_dirty_bitmap_status:
+ * @hb: The HBitmap to operate on
+ * @start: the offset to start from
+ * @end: end of requested area
+ * @is_dirty: is bitmap dirty at @offset
+ * @pnum: how many bits has same value starting from @offset
+ */
+void hbitmap_status(const HBitmap *hb, int64_t offset, int64_t bytes,
+bool *is_dirty, int64_t *pnum);
+
 /**
  * hbitmap_iter_next:
  * @hbi: HBitmapIter to operate on.
diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index 8f8b428baa..ecbd6703dc 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -875,6 +875,12 @@ bool bdrv_dirty_bitmap_next_dirty_area(BdrvDirtyBitmap 
*bitmap,
dirty_start, dirty_count);
 }
 
+void bdrv_dirty_bitmap_status(BdrvDirtyBitmap *bitmap, int64_t offset,
+  int64_t bytes, bool *is_dirty, int64_t *count)
+{
+hbitmap_status(bitmap->bitmap, offset, bytes, is_dirty, count);
+}
+
 /**
  * bdrv_merge_dirty_bitmap: merge src into dest.
  * Ensures permissions on bitmaps are reasonable; use for public API.
diff --git a/util/hbitmap.c b/util/hbitmap.c
index 305b894a63..91956263ef 100644
--- a/util/hbitmap.c
+++ b/util/hbitmap.c
@@ -301,6 +301,42 @@ bool hbitmap_next_dirty_area(const HBitmap *hb, int64_t 
start, int64_t end,
 return true;
 }
 
+void hbitmap_status(const HBitmap *hb, int64_t start, int64_t count,
+bool *is_dirty, int64_t *pnum)
+{
+int64_t next_dirty, next_zero;
+
+assert(start >= 0);
+assert(count > 0);
+assert(start + count <= hb->orig_size);
+
+next_dirty = hbitmap_next_dirty(hb, start, count);
+if (next_dirty == -1) {
+*pnum = count;
+*is_dirty = false;
+return;
+}
+
+if (next_dirty > start) {
+*pnum = next_dirty - start;
+*is_dirty = false;
+return;
+}
+
+assert(next_dirty == start);
+
+next_zero = hbitmap_next_zero(hb, start, count);
+if (next_zero == -1) {
+*pnum = next_zero - start;
+*is_dirty = true;
+return;
+}
+
+assert(next_zero > start);
+*pnum = next_zero - start;
+*is_dirty = false;
+}
+
 bool hbitmap_empty(const HBitmap *hb)
 {
 return hb->count == 0;
-- 
2.29.2




[PATCH 03/11] block/block-copy: block_copy_state_new(): add bitmap parameter

2021-08-04 Thread Vladimir Sementsov-Ogievskiy
This will be used in the following commit to bring "incremental" mode
to copy-before-write filter.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 include/block/block-copy.h |  2 +-
 block/block-copy.c | 16 +---
 block/copy-before-write.c  |  3 ++-
 3 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/include/block/block-copy.h b/include/block/block-copy.h
index 9c24e8f38e..7d40bf2ac2 100644
--- a/include/block/block-copy.h
+++ b/include/block/block-copy.h
@@ -26,7 +26,7 @@ typedef struct BlockCopyCallState BlockCopyCallState;
 
 BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
  bool use_copy_range, bool compress,
- Error **errp);
+ BdrvDirtyBitmap *bitmap, Error **errp);
 
 void block_copy_set_copy_opts(BlockCopyState *s, bool use_copy_range,
   bool compress);
diff --git a/block/block-copy.c b/block/block-copy.c
index 307045a59f..ec6a39b2ed 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -400,9 +400,10 @@ static int64_t 
block_copy_calculate_cluster_size(BlockDriverState *target,
 }
 
 BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
- bool use_copy_range,
- bool compress, Error **errp)
+ bool use_copy_range, bool compress,
+ BdrvDirtyBitmap *bitmap, Error **errp)
 {
+ERRP_GUARD();
 BlockCopyState *s;
 int64_t cluster_size;
 BdrvDirtyBitmap *copy_bitmap;
@@ -418,7 +419,16 @@ BlockCopyState *block_copy_state_new(BdrvChild *source, 
BdrvChild *target,
 return NULL;
 }
 bdrv_disable_dirty_bitmap(copy_bitmap);
-bdrv_set_dirty_bitmap(copy_bitmap, 0, bdrv_dirty_bitmap_size(copy_bitmap));
+if (bitmap) {
+if (!bdrv_merge_dirty_bitmap(copy_bitmap, bitmap, NULL, errp)) {
+error_prepend(errp, "Failed to merge bitmap '%s' to internal "
+  "copy-bitmap: ", bdrv_dirty_bitmap_name(bitmap));
+return NULL;
+}
+} else {
+bdrv_set_dirty_bitmap(copy_bitmap, 0,
+  bdrv_dirty_bitmap_size(copy_bitmap));
+}
 
 s = g_new(BlockCopyState, 1);
 *s = (BlockCopyState) {
diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index b36ede3186..dbafee1f03 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c
@@ -169,7 +169,8 @@ static int cbw_open(BlockDriverState *bs, QDict *options, 
int flags,
 ((BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP | BDRV_REQ_NO_FALLBACK) &
  bs->file->bs->supported_zero_flags);
 
-s->bcs = block_copy_state_new(bs->file, s->target, false, false, errp);
+s->bcs = block_copy_state_new(bs->file, s->target, false, false, NULL,
+  errp);
 if (!s->bcs) {
 error_prepend(errp, "Cannot create block-copy-state: ");
 return -EINVAL;
-- 
2.29.2




Re: [PATCH v7 22/33] qapi: publish copy-before-write filter

2021-08-04 Thread Markus Armbruster
Vladimir Sementsov-Ogievskiy  writes:

> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> Reviewed-by: Max Reitz 
> ---
>  qapi/block-core.json | 25 +++--
>  1 file changed, 23 insertions(+), 2 deletions(-)
>
> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index 675d8265eb..59d3e5e42d 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -2825,13 +2825,14 @@
>  # @blklogwrites: Since 3.0
>  # @blkreplay: Since 4.2
>  # @compress: Since 5.0
> +# @copy-before-write: Since 6.1
>  #
>  # Since: 2.9
>  ##
>  { 'enum': 'BlockdevDriver',
>'data': [ 'blkdebug', 'blklogwrites', 'blkreplay', 'blkverify', 'bochs',
> -'cloop', 'compress', 'copy-on-read', 'dmg', 'file', 'ftp', 
> 'ftps',
> -'gluster',
> +'cloop', 'compress', 'copy-before-write', 'copy-on-read', 'dmg',
> +'file', 'ftp', 'ftps', 'gluster',
>  {'name': 'host_cdrom', 'if': 'defined(HAVE_HOST_BLOCK_DEVICE)' },
>  {'name': 'host_device', 'if': 'defined(HAVE_HOST_BLOCK_DEVICE)' 
> },
>  'http', 'https', 'iscsi',
> @@ -4049,6 +4050,25 @@
>'base': 'BlockdevOptionsGenericFormat',
>'data': { '*bottom': 'str' } }
>  
> +##
> +# @BlockdevOptionsCbw:
> +#
> +# Driver specific block device options for the copy-before-write driver,
> +# which does so called copy-before-write operations: when data is
> +# written to the filter, the filter firstly reads corresponding blocks

s/firstly/first/

> +# from its file child and copies them to @target child. After successful

s/successful/successfully/

> +# copying the write request is propagated to file child. If copying

Comma after copying, I think.

> +# failed, the original write request is failed too and no data is written

s/failed,/fails,/

> +# to file child.
> +#
> +# @target: The target for copy-before-write operations.
> +#
> +# Since: 6.1
> +##
> +{ 'struct': 'BlockdevOptionsCbw',
> +  'base': 'BlockdevOptionsGenericFormat',
> +  'data': { 'target': 'BlockdevRef' } }
> +
>  ##
>  # @BlockdevOptions:
>  #
> @@ -4101,6 +4121,7 @@
>'bochs':  'BlockdevOptionsGenericFormat',
>'cloop':  'BlockdevOptionsGenericFormat',
>'compress':   'BlockdevOptionsGenericFormat',
> +  'copy-before-write':'BlockdevOptionsCbw',
>'copy-on-read':'BlockdevOptionsCor',
>'dmg':'BlockdevOptionsGenericFormat',
>'file':   'BlockdevOptionsFile',

With the doc phrasing tweaks duly considered:
Acked-by: Markus Armbruster 




[PATCH 04/11] block/copy-before-write: add bitmap open parameter

2021-08-04 Thread Vladimir Sementsov-Ogievskiy
This brings "incremental" mode to copy-before-write filter: user can
specify bitmap so that filter will copy only "dirty" areas.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 qapi/block-core.json  | 10 +-
 block/copy-before-write.c | 28 +++-
 2 files changed, 36 insertions(+), 2 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 59d3e5e42d..d061204cb0 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -4063,11 +4063,19 @@
 #
 # @target: The target for copy-before-write operations.
 #
+# @bitmap: If specified, copy-before-write filter will do
+#  copy-before-write operations only for dirty regions of the
+#  bitmap. Bitmap size must be equal to length of file and
+#  target child of the filter. Note also, that bitmap is used
+#  only to initialize internal bitmap of the process, so further
+#  modifications (or removing) of specified bitmap doesn't
+#  influence the filter.
+#
 # Since: 6.1
 ##
 { 'struct': 'BlockdevOptionsCbw',
   'base': 'BlockdevOptionsGenericFormat',
-  'data': { 'target': 'BlockdevRef' } }
+  'data': { 'target': 'BlockdevRef', '*bitmap': 'BlockDirtyBitmap' } }
 
 ##
 # @BlockdevOptions:
diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index dbafee1f03..b58a5e8b48 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c
@@ -148,6 +148,7 @@ static int cbw_open(BlockDriverState *bs, QDict *options, 
int flags,
 Error **errp)
 {
 BDRVCopyBeforeWriteState *s = bs->opaque;
+BdrvDirtyBitmap *bitmap = NULL;
 
 bs->file = bdrv_open_child(NULL, options, "file", bs, &child_of_bds,
BDRV_CHILD_FILTERED | BDRV_CHILD_PRIMARY,
@@ -162,6 +163,31 @@ static int cbw_open(BlockDriverState *bs, QDict *options, 
int flags,
 return -EINVAL;
 }
 
+if (qdict_haskey(options, "bitmap.node") ||
+qdict_haskey(options, "bitmap.name"))
+{
+const char *bitmap_node, *bitmap_name;
+
+if (!qdict_haskey(options, "bitmap.node")) {
+error_setg(errp, "bitmap.node is not specified");
+return -EINVAL;
+}
+
+if (!qdict_haskey(options, "bitmap.name")) {
+error_setg(errp, "bitmap.name is not specified");
+return -EINVAL;
+}
+
+bitmap_node = qdict_get_str(options, "bitmap.node");
+bitmap_name = qdict_get_str(options, "bitmap.name");
+
+bitmap = block_dirty_bitmap_lookup(bitmap_node, bitmap_name, NULL,
+   errp);
+if (!bitmap) {
+return -EINVAL;
+}
+}
+
 bs->total_sectors = bs->file->bs->total_sectors;
 bs->supported_write_flags = BDRV_REQ_WRITE_UNCHANGED |
 (BDRV_REQ_FUA & bs->file->bs->supported_write_flags);
@@ -169,7 +195,7 @@ static int cbw_open(BlockDriverState *bs, QDict *options, 
int flags,
 ((BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP | BDRV_REQ_NO_FALLBACK) &
  bs->file->bs->supported_zero_flags);
 
-s->bcs = block_copy_state_new(bs->file, s->target, false, false, NULL,
+s->bcs = block_copy_state_new(bs->file, s->target, false, false, bitmap,
   errp);
 if (!s->bcs) {
 error_prepend(errp, "Cannot create block-copy-state: ");
-- 
2.29.2




[PATCH RFC DRAFT 00/11] Make image fleecing more usable

2021-08-04 Thread Vladimir Sementsov-Ogievskiy
Hi all!

That's an untested draft. I'll be on vocation 05-22, so no reason for
this just lay in my hard drive. Any comments are welcome (mostly about
general design), but don't waste time on careful reviewing.

What this series brings to image-fleecing:

1. support for bitmap (patch 04). So, we can do incremental external
backups and not do extra copy-before-write operations for non-dirty
regions.

2. fleecing block driver - see lat patch for details and list of
benefits.

Based-on: <20210804093813.20688-1-vsement...@virtuozzo.com>
  ([PATCH v7 00/33] block: publish backup-top filter)

Vladimir Sementsov-Ogievskiy (11):
  block/block-copy: move copy_bitmap initialization to
block_copy_state_new()
  block/dirty-bitmap: bdrv_merge_dirty_bitmap(): add return value
  block/block-copy: block_copy_state_new(): add bitmap parameter
  block/copy-before-write: add bitmap open parameter
  block/block-copy: add block_copy_reset()
  block: intoduce reqlist
  block/dirty-bitmap: introduce bdrv_dirty_bitmap_status()
  block/reqlist: add reqlist_wait_all()
  block/copy-before-write: add cbw_snapshot_read_{lock,unlock}()
  block/copy-before-write: add cbw_snapshot_discard()
  block: introduce fleecing block driver

 qapi/block-core.json|  10 +-
 block/copy-before-write.h   |   6 ++
 include/block/block-copy.h  |   3 +-
 include/block/dirty-bitmap.h|   4 +-
 include/block/reqlist.h |  47 
 include/qemu/hbitmap.h  |  11 ++
 block/block-copy.c  | 152 +++---
 block/copy-before-write.c   | 143 -
 block/dirty-bitmap.c|  14 ++-
 block/fleecing.c| 183 
 block/monitor/bitmap-qmp-cmds.c |   5 +-
 block/reqlist.c | 100 +
 util/hbitmap.c  |  36 +++
 block/meson.build   |   2 +
 14 files changed, 613 insertions(+), 103 deletions(-)
 create mode 100644 include/block/reqlist.h
 create mode 100644 block/fleecing.c
 create mode 100644 block/reqlist.c

-- 
2.29.2




[PATCH 05/11] block/block-copy: add block_copy_reset()

2021-08-04 Thread Vladimir Sementsov-Ogievskiy
Split block_copy_reset() out of block_copy_reset_unallocated() to be
used in separate later.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 include/block/block-copy.h |  1 +
 block/block-copy.c | 21 +
 2 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/include/block/block-copy.h b/include/block/block-copy.h
index 7d40bf2ac2..599af02bef 100644
--- a/include/block/block-copy.h
+++ b/include/block/block-copy.h
@@ -34,6 +34,7 @@ void block_copy_set_progress_meter(BlockCopyState *s, 
ProgressMeter *pm);
 
 void block_copy_state_free(BlockCopyState *s);
 
+void block_copy_reset(BlockCopyState *s, int64_t offset, int64_t bytes);
 int64_t block_copy_reset_unallocated(BlockCopyState *s,
  int64_t offset, int64_t *count);
 
diff --git a/block/block-copy.c b/block/block-copy.c
index ec6a39b2ed..5cd291727b 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -689,6 +689,18 @@ static int block_copy_is_cluster_allocated(BlockCopyState 
*s, int64_t offset,
 }
 }
 
+void block_copy_reset(BlockCopyState *s, int64_t offset, int64_t bytes)
+{
+QEMU_LOCK_GUARD(&s->lock);
+
+bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, bytes);
+if (s->progress) {
+progress_set_remaining(s->progress,
+   bdrv_get_dirty_count(s->copy_bitmap) +
+   s->in_flight_bytes);
+}
+}
+
 /*
  * Reset bits in copy_bitmap starting at offset if they represent unallocated
  * data in the image. May reset subsequent contiguous bits.
@@ -709,14 +721,7 @@ int64_t block_copy_reset_unallocated(BlockCopyState *s,
 bytes = clusters * s->cluster_size;
 
 if (!ret) {
-qemu_co_mutex_lock(&s->lock);
-bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, bytes);
-if (s->progress) {
-progress_set_remaining(s->progress,
-   bdrv_get_dirty_count(s->copy_bitmap) +
-   s->in_flight_bytes);
-}
-qemu_co_mutex_unlock(&s->lock);
+block_copy_reset(s, offset, bytes);
 }
 
 *count = bytes;
-- 
2.29.2




[PATCH 02/11] block/dirty-bitmap: bdrv_merge_dirty_bitmap(): add return value

2021-08-04 Thread Vladimir Sementsov-Ogievskiy
That simplifies handling failure in existing code and in further new
usage of bdrv_merge_dirty_bitmap().

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 include/block/dirty-bitmap.h| 2 +-
 block/dirty-bitmap.c| 8 ++--
 block/monitor/bitmap-qmp-cmds.c | 5 +
 3 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h
index 40950ae3d5..f95d350b70 100644
--- a/include/block/dirty-bitmap.h
+++ b/include/block/dirty-bitmap.h
@@ -77,7 +77,7 @@ void bdrv_dirty_bitmap_set_persistence(BdrvDirtyBitmap 
*bitmap,
bool persistent);
 void bdrv_dirty_bitmap_set_inconsistent(BdrvDirtyBitmap *bitmap);
 void bdrv_dirty_bitmap_set_busy(BdrvDirtyBitmap *bitmap, bool busy);
-void bdrv_merge_dirty_bitmap(BdrvDirtyBitmap *dest, const BdrvDirtyBitmap *src,
+bool bdrv_merge_dirty_bitmap(BdrvDirtyBitmap *dest, const BdrvDirtyBitmap *src,
  HBitmap **backup, Error **errp);
 void bdrv_dirty_bitmap_skip_store(BdrvDirtyBitmap *bitmap, bool skip);
 bool bdrv_dirty_bitmap_get(BdrvDirtyBitmap *bitmap, int64_t offset);
diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index 0ef46163e3..8f8b428baa 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -881,10 +881,10 @@ bool bdrv_dirty_bitmap_next_dirty_area(BdrvDirtyBitmap 
*bitmap,
  *
  * @backup: If provided, make a copy of dest here prior to merge.
  */
-void bdrv_merge_dirty_bitmap(BdrvDirtyBitmap *dest, const BdrvDirtyBitmap *src,
+bool bdrv_merge_dirty_bitmap(BdrvDirtyBitmap *dest, const BdrvDirtyBitmap *src,
  HBitmap **backup, Error **errp)
 {
-bool ret;
+bool ret = false;
 
 bdrv_dirty_bitmaps_lock(dest->bs);
 if (src->bs != dest->bs) {
@@ -907,11 +907,15 @@ void bdrv_merge_dirty_bitmap(BdrvDirtyBitmap *dest, const 
BdrvDirtyBitmap *src,
 ret = bdrv_dirty_bitmap_merge_internal(dest, src, backup, false);
 assert(ret);
 
+ret = true;
+
 out:
 bdrv_dirty_bitmaps_unlock(dest->bs);
 if (src->bs != dest->bs) {
 bdrv_dirty_bitmaps_unlock(src->bs);
 }
+
+return ret;
 }
 
 /**
diff --git a/block/monitor/bitmap-qmp-cmds.c b/block/monitor/bitmap-qmp-cmds.c
index 9f11deec64..83970b22fa 100644
--- a/block/monitor/bitmap-qmp-cmds.c
+++ b/block/monitor/bitmap-qmp-cmds.c
@@ -259,7 +259,6 @@ BdrvDirtyBitmap *block_dirty_bitmap_merge(const char *node, 
const char *target,
 BlockDriverState *bs;
 BdrvDirtyBitmap *dst, *src, *anon;
 BlockDirtyBitmapMergeSourceList *lst;
-Error *local_err = NULL;
 
 dst = block_dirty_bitmap_lookup(node, target, &bs, errp);
 if (!dst) {
@@ -297,9 +296,7 @@ BdrvDirtyBitmap *block_dirty_bitmap_merge(const char *node, 
const char *target,
 abort();
 }
 
-bdrv_merge_dirty_bitmap(anon, src, NULL, &local_err);
-if (local_err) {
-error_propagate(errp, local_err);
+if (!bdrv_merge_dirty_bitmap(anon, src, NULL, errp)) {
 dst = NULL;
 goto out;
 }
-- 
2.29.2




[PATCH 01/11] block/block-copy: move copy_bitmap initialization to block_copy_state_new()

2021-08-04 Thread Vladimir Sementsov-Ogievskiy
We are going to complicate bitmap initialization in the following
commit. And in future, backup job will be able to work without filter
(when source is immutable), so we'll need same bitmap initialization in
copy-before-write filter and in backup job. So, it's reasonable to do
it in block-copy.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 block/block-copy.c| 1 +
 block/copy-before-write.c | 4 
 2 files changed, 1 insertion(+), 4 deletions(-)

diff --git a/block/block-copy.c b/block/block-copy.c
index 90664ee0ab..307045a59f 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -418,6 +418,7 @@ BlockCopyState *block_copy_state_new(BdrvChild *source, 
BdrvChild *target,
 return NULL;
 }
 bdrv_disable_dirty_bitmap(copy_bitmap);
+bdrv_set_dirty_bitmap(copy_bitmap, 0, bdrv_dirty_bitmap_size(copy_bitmap));
 
 s = g_new(BlockCopyState, 1);
 *s = (BlockCopyState) {
diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index 2cd68b480a..b36ede3186 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c
@@ -148,7 +148,6 @@ static int cbw_open(BlockDriverState *bs, QDict *options, 
int flags,
 Error **errp)
 {
 BDRVCopyBeforeWriteState *s = bs->opaque;
-BdrvDirtyBitmap *copy_bitmap;
 
 bs->file = bdrv_open_child(NULL, options, "file", bs, &child_of_bds,
BDRV_CHILD_FILTERED | BDRV_CHILD_PRIMARY,
@@ -176,9 +175,6 @@ static int cbw_open(BlockDriverState *bs, QDict *options, 
int flags,
 return -EINVAL;
 }
 
-copy_bitmap = block_copy_dirty_bitmap(s->bcs);
-bdrv_set_dirty_bitmap(copy_bitmap, 0, bdrv_dirty_bitmap_size(copy_bitmap));
-
 return 0;
 }
 
-- 
2.29.2




Re: [PULL 0/2] SD/MMC patches for 2021-08-03

2021-08-04 Thread Peter Maydell
On Tue, 3 Aug 2021 at 18:42, Philippe Mathieu-Daudé  wrote:
>
> The following changes since commit acf8200722251a0a995cfa75fe5c15aea0886418:
>
>   Merge remote-tracking branch 
> 'remotes/mdroth/tags/qga-pull-2021-08-03-pull-tag' into staging (2021-08-03 
> 14:48:57 +0100)
>
> are available in the Git repository at:
>
>   https://github.com/philmd/qemu.git tags/sdmmc-20210803
>
> for you to fetch changes up to 4ac0b72bae85cf94ae0e5153b9c2c288c71667d4:
>
>   hw/sd/sdcard: Fix assertion accessing out-of-range addresses with CMD30 
> (2021-08-03 19:34:51 +0200)
>
> 
> SD/MMC patches queue
>
> - sdcard: Fix assertion accessing out-of-range addresses
>   with SEND_WRITE_PROT (CMD30)
>
> 


Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/6.1
for any user-visible changes.

-- PMM



Re: [PATCH for-6.1? v2 5/7] job: Add job_cancel_requested()

2021-08-04 Thread Peter Krempa
On Wed, Aug 04, 2021 at 12:34:31 +0200, Kevin Wolf wrote:
> We could in theory keep allowing redundant completion requests when the
> completion mode doesn't conflict, but I don't see the point of that.

I don't see either. Especially since ...



> Unless libvirt can actually issue multiple completion requests (again,
> this includes both (block-)job-complete and non-force block-job-cancel
> for mirror) for the same block job - Peter, I hope it doesn't?

... the regular job completion code in libvirt which is meant for user
interaction (qemuDomainBlockJobAbort) has the following interlock:

if (job->state == QEMU_BLOCKJOB_STATE_ABORTING ||
job->state == QEMU_BLOCKJOB_STATE_PIVOTING) {
virReportError(VIR_ERR_OPERATION_INVALID,
   _("block job on disk '%s' is still being ended"),
   disk->dst);
goto endjob;
}

.. the other two uses of blockjobs are internal for handling migration
with non shared storage and there we also issue exactly one cancel
request and backup jobs were we too make sure to cancel it just once.

As of such it's okay to forbid the case you are mentioning.




Re: [PULL 0/1] Block layer patches

2021-08-04 Thread Peter Maydell
On Tue, 3 Aug 2021 at 15:41, Kevin Wolf  wrote:
>
> The following changes since commit 7f1cab9c628a798ae2607940993771e6300e9e00:
>
>   Merge remote-tracking branch 'remotes/bonzini-gitlab/tags/for-upstream' 
> into staging (2021-08-02 17:21:50 +0100)
>
> are available in the Git repository at:
>
>   git://repo.or.cz/qemu/kevin.git tags/for-upstream
>
> for you to fetch changes up to 87ab88025247b893aad5071fd38301b67be76d1a:
>
>   block: Fix in_flight leak in request padding error path (2021-08-03 
> 15:43:30 +0200)
>
> 
> Block layer patches
>
> - Fix hang after request padding error (Windows + 512-on-4k emulation)
>
> 


Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/6.1
for any user-visible changes.

-- PMM



Re: [PATCH for-6.1? v2 5/7] job: Add job_cancel_requested()

2021-08-04 Thread Kevin Wolf
[ Peter, the question for you is at the end. ]

Am 04.08.2021 um 10:07 hat Max Reitz geschrieben:
> On 03.08.21 16:25, Kevin Wolf wrote:
> > Am 26.07.2021 um 16:46 hat Max Reitz geschrieben:
> > > Most callers of job_is_cancelled() actually want to know whether the job
> > > is on its way to immediate termination.  For example, we refuse to pause
> > > jobs that are cancelled; but this only makes sense for jobs that are
> > > really actually cancelled.
> > > 
> > > A mirror job that is cancelled during READY with force=false should
> > > absolutely be allowed to pause.  This "cancellation" (which is actually
> > > a kind of completion) may take an indefinite amount of time, and so
> > > should behave like any job during normal operation.  For example, with
> > > on-target-error=stop, the job should stop on write errors.  (In
> > > contrast, force-cancelled jobs should not get write errors, as they
> > > should just terminate and not do further I/O.)
> > > 
> > > Therefore, redefine job_is_cancelled() to only return true for jobs that
> > > are force-cancelled (which as of HEAD^ means any job that interprets the
> > > cancellation request as a request for immediate termination), and add
> > > job_cancel_request() as the general variant, which returns true for any
> > > jobs which have been requested to be cancelled, whether it be
> > > immediately or after an arbitrarily long completion phase.
> > > 
> > > Buglink: https://gitlab.com/qemu-project/qemu/-/issues/462
> > > Signed-off-by: Max Reitz 
> > > ---
> > >   include/qemu/job.h |  8 +++-
> > >   block/mirror.c | 10 --
> > >   job.c  |  7 ++-
> > >   3 files changed, 17 insertions(+), 8 deletions(-)
> > > 
> > > diff --git a/include/qemu/job.h b/include/qemu/job.h
> > > index 8aa90f7395..032edf3c5f 100644
> > > --- a/include/qemu/job.h
> > > +++ b/include/qemu/job.h
> > > @@ -436,9 +436,15 @@ const char *job_type_str(const Job *job);
> > >   /** Returns true if the job should not be visible to the management 
> > > layer. */
> > >   bool job_is_internal(Job *job);
> > > -/** Returns whether the job is scheduled for cancellation. */
> > > +/** Returns whether the job is being cancelled. */
> > >   bool job_is_cancelled(Job *job);
> > > +/**
> > > + * Returns whether the job is scheduled for cancellation (at an
> > > + * indefinite point).
> > > + */
> > > +bool job_cancel_requested(Job *job);
> > I don't think non-force blockdev-cancel for mirror should actually be
> > considered cancellation, so what is the question that this function
> > answers?
> > 
> > "Is this a cancelled job, or a mirror block job that is supposed to
> > complete soon, but only if it doesn't switch over the users to the
> > target on completion"?
> 
> Well, technically yes, but it was more intended as “Has the user ever
> invoked (block-)job-cancel on this job?”.

I understand this, but is this much more useful to know than "Has the
user ever called HMP 'change'?", if you know what I mean?

> > Is this ever a reasonable question to ask, except maybe inside the
> > mirror implementation itself?
> 
> I asked myself the same for v3, but found two places in job.c where I
> would like to keep it:
> 
> First, there’s an assertion in job_completed_txn_abort().  All jobs
> other than @job have been force-cancelled, and so job_is_cancelled()
> would be true for them.  As for @job itself, the function is mostly
> called when the job’s return value is not 0, but a soft-cancelled
> mirror does have a return value of 0 and so would not end up in that
> function.
> But job_cancel() invokes job_completed_txn_abort() if the job has been
> deferred to the main loop, which mostly correlates with the job having
> been completed (in which case the assertion is skipped), but not 100 %
> (there’s a small window between setting deferred_to_main_loop and the
> job changing to a completed state).
> So I’d prefer to keep the assertion as-is functionally, i.e. to only
> check job->cancelled.

Well, you don't. It's still job_is_cancelled() after this patch.

So the scenario you're concerned about is a job that has just finished
successfully (job->ret = 0) and then gets a cancel request?

With force=false, I'm pretty sure the code is wrong anyway because
calling job_completed_txn_abort() is not the right response. It should
return an error because you're trying to complete twice, possibly with
conflicting completion modes. Second best is just ignoring the cancel
request because we obviously already fulfilled the request of completing
the job (the completion mode might be different, though).

With force=true, arguably still letting the job fail is correct.
However, letting it fail involves more than just letting the transaction
fail. We would have to call job_update_rc() as well so that instead of
reporting success for the job, ECANCELED is returned and the job
transitions to JOB_STATUS_ABORTING (after which job_is_completed()
returns true).

So, just bugs to be fixed.

After thi

Re: [PATCH for-6.1? v2 6/7] mirror: Check job_is_cancelled() earlier

2021-08-04 Thread Max Reitz

On 04.08.21 11:48, Kevin Wolf wrote:

Am 04.08.2021 um 10:25 hat Max Reitz geschrieben:

On 03.08.21 16:34, Kevin Wolf wrote:

Am 26.07.2021 um 16:46 hat Max Reitz geschrieben:

We must check whether the job is force-cancelled early in our main loop,
most importantly before any `continue` statement.  For example, we used
to have `continue`s before our current checking location that are
triggered by `mirror_flush()` failing.  So, if `mirror_flush()` kept
failing, force-cancelling the job would not terminate it.

A job being force-cancelled should be treated the same as the job having
failed, so put the check in the same place where we check `s->ret < 0`.

Buglink: https://gitlab.com/qemu-project/qemu/-/issues/462
Signed-off-by: Max Reitz 
---
   block/mirror.c | 7 +--
   1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/block/mirror.c b/block/mirror.c
index 72e02fa34e..46d1a1e5a2 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -993,7 +993,7 @@ static int coroutine_fn mirror_run(Job *job, Error **errp)
   mirror_wait_for_any_operation(s, true);
   }
-if (s->ret < 0) {
+if (s->ret < 0 || job_is_cancelled(&s->common.job)) {
   ret = s->ret;
   goto immediate_exit;
   }
@@ -1078,8 +1078,6 @@ static int coroutine_fn mirror_run(Job *job, Error **errp)
   break;
   }
-ret = 0;
-
   if (job_is_ready(&s->common.job) && !should_complete) {
   delay_ns = (s->in_flight == 0 &&
   cnt == 0 ? BLOCK_JOB_SLICE_TIME : 0);
@@ -1087,9 +1085,6 @@ static int coroutine_fn mirror_run(Job *job, Error **errp)
   trace_mirror_before_sleep(s, cnt, job_is_ready(&s->common.job),
 delay_ns);
   job_sleep_ns(&s->common.job, delay_ns);
-if (job_is_cancelled(&s->common.job)) {
-break;
-}

I think it was intentional that the check is here because it means
skipping the job_sleep_ns() and instead cancelling immediately, and we
probably still want that. Between your check above and here, the
coroutine can yield, so cancellation could have been newly requested.

I’m afraid I don’t quite understand.

Hm, I don't either. Somehow I thought job_sleep_ns() was after the
check, while quoting the exact hunk that shows that it comes before
it...

I'm still not sure if sleeping before exiting is really useful, but it
seems we never cared about that.


Jobs that are (force-)cancelled cannot yield or sleep anyway 
(job_sleep_ns(), job_yield(), and job_pause_point() will all return 
immediately when called on a cancelled job).


So I thought you meant that a job can only be cancelled while it is 
yielding, so we should prefer to put the is_cancelled check after a 
yield point (like job_pause_point()) than before it.


But I mean, if you’re happy, I’ll be happy, too. :)

Max




[PATCH v7 33/33] iotests/image-fleecing: add test-case for copy-before-write filter

2021-08-04 Thread Vladimir Sementsov-Ogievskiy
New fleecing method becomes available: copy-before-write filter.

Actually we don't need backup job to setup image fleecing. Add test
for new recommended way of image fleecing.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
---
 tests/qemu-iotests/tests/image-fleecing | 50 +-
 tests/qemu-iotests/tests/image-fleecing.out | 72 +
 2 files changed, 107 insertions(+), 15 deletions(-)

diff --git a/tests/qemu-iotests/tests/image-fleecing 
b/tests/qemu-iotests/tests/image-fleecing
index e210c00d28..f6318492c6 100755
--- a/tests/qemu-iotests/tests/image-fleecing
+++ b/tests/qemu-iotests/tests/image-fleecing
@@ -48,7 +48,7 @@ remainder = [('0xd5', '0x108000',  '32k'), # Right-end of 
partial-left [1]
  ('0xdc', '32M',   '32k'), # Left-end of partial-right [2]
  ('0xcd', '0x3ff', '64k')] # patterns[3]
 
-def do_test(base_img_path, fleece_img_path, nbd_sock_path, vm):
+def do_test(use_cbw, base_img_path, fleece_img_path, nbd_sock_path, vm):
 log('--- Setting up images ---')
 log('')
 
@@ -67,6 +67,7 @@ def do_test(base_img_path, fleece_img_path, nbd_sock_path, 
vm):
 
 src_node = 'source'
 tmp_node = 'temp'
+qom_path = '/machine/peripheral/sda'
 vm.add_blockdev(f'driver={iotests.imgfmt},file.driver=file,'
 f'file.filename={base_img_path},node-name={src_node}')
 vm.add_device('virtio-scsi')
@@ -90,12 +91,22 @@ def do_test(base_img_path, fleece_img_path, nbd_sock_path, 
vm):
 'backing': src_node,
 }))
 
-# Establish COW from source to fleecing node
-log(vm.qmp('blockdev-backup',
-   job_id='fleecing',
-   device=src_node,
-   target=tmp_node,
-   sync='none'))
+# Establish CBW from source to fleecing node
+if use_cbw:
+log(vm.qmp('blockdev-add', {
+'driver': 'copy-before-write',
+'node-name': 'fl-cbw',
+'file': src_node,
+'target': tmp_node
+}))
+
+log(vm.qmp('qom-set', path=qom_path, property='drive', value='fl-cbw'))
+else:
+log(vm.qmp('blockdev-backup',
+   job_id='fleecing',
+   device=src_node,
+   target=tmp_node,
+   sync='none'))
 
 log('')
 log('--- Setting up NBD Export ---')
@@ -124,7 +135,7 @@ def do_test(base_img_path, fleece_img_path, nbd_sock_path, 
vm):
 for p in overwrite:
 cmd = 'write -P%s %s %s' % p
 log(cmd)
-log(vm.hmp_qemu_io('/machine/peripheral/sda', cmd, qdev=True))
+log(vm.hmp_qemu_io(qom_path, cmd, qdev=True))
 
 log('')
 log('--- Verifying Data ---')
@@ -139,10 +150,15 @@ def do_test(base_img_path, fleece_img_path, 
nbd_sock_path, vm):
 log('--- Cleanup ---')
 log('')
 
-log(vm.qmp('block-job-cancel', device='fleecing'))
-e = vm.event_wait('BLOCK_JOB_CANCELLED')
-assert e is not None
-log(e, filters=[iotests.filter_qmp_event])
+if use_cbw:
+log(vm.qmp('qom-set', path=qom_path, property='drive', value=src_node))
+log(vm.qmp('blockdev-del', node_name='fl-cbw'))
+else:
+log(vm.qmp('block-job-cancel', device='fleecing'))
+e = vm.event_wait('BLOCK_JOB_CANCELLED')
+assert e is not None
+log(e, filters=[iotests.filter_qmp_event])
+
 log(vm.qmp('nbd-server-stop'))
 log(vm.qmp('blockdev-del', node_name=tmp_node))
 vm.shutdown()
@@ -160,13 +176,17 @@ def do_test(base_img_path, fleece_img_path, 
nbd_sock_path, vm):
 log('Done')
 
 
-def test():
+def test(use_cbw):
 with iotests.FilePath('base.img') as base_img_path, \
  iotests.FilePath('fleece.img') as fleece_img_path, \
  iotests.FilePath('nbd.sock',
   base_dir=iotests.sock_dir) as nbd_sock_path, \
  iotests.VM() as vm:
-do_test(base_img_path, fleece_img_path, nbd_sock_path, vm)
+do_test(use_cbw, base_img_path, fleece_img_path, nbd_sock_path, vm)
+
 
+log('=== Test backup(sync=none) based fleecing ===\n')
+test(False)
 
-test()
+log('=== Test filter based fleecing ===\n')
+test(True)
diff --git a/tests/qemu-iotests/tests/image-fleecing.out 
b/tests/qemu-iotests/tests/image-fleecing.out
index 314a1b54e9..e96d122a8b 100644
--- a/tests/qemu-iotests/tests/image-fleecing.out
+++ b/tests/qemu-iotests/tests/image-fleecing.out
@@ -1,3 +1,5 @@
+=== Test backup(sync=none) based fleecing ===
+
 --- Setting up images ---
 
 Done
@@ -65,3 +67,73 @@ read -P0xdc 32M 32k
 read -P0xcd 0x3ff 64k
 
 Done
+=== Test filter based fleecing ===
+
+--- Setting up images ---
+
+Done
+
+--- Launching VM ---
+
+Done
+
+--- Setting up Fleecing Graph ---
+
+{"return": {}}
+{"return": {}}
+{"return": {}}
+
+--- Setting up NBD Export ---
+
+{"return": {}}
+{"return": {}}
+
+--- Sanity Check ---
+
+read -P0x5d 0 64k
+read -P0xd5 1M 64k
+read -P0xdc 32M 64k
+read -P0xcd 0x3ff 64k
+read -P0 0x00f8000 32k
+r

Re: [PATCH v2 for-6.2 0/6] push backup with fleecing

2021-08-04 Thread Vladimir Sementsov-Ogievskiy

Consider that as RFC.

I'm preparing an alternative and more efficient fleecing scheme, based on 
special block driver, not on backing link.

So, this will be rebased, and probably some permission-related difficulties may 
be avoided.

21.07.2021 17:04, Vladimir Sementsov-Ogievskiy wrote:

Hi all!

Here is push-backup with fleecing. What is it:

1. Make fleecing scheme

guest blk
   |
   |root
   v
copy-before-write filter  ---> temp qcow2
   |  |
   |file  | backing
   V  |
active disk <-

2. Start backup job from temp qcow2 to final remote target (NBD or
something)

Benefit in comparison with simple backup job: for remote final target
write operations are not very fast. And guest have to wait for
copy-before-write operations. With fleecing scheme target for
copy-before-write operations is local qcow2 file with faster access than
actual backup target. So, guest is less disturbed by copy-before-write
operations.

Based-on: <20210721100555.45594-1-vsement...@virtuozzo.com>
([PATCH v6 00/33] block: publish backup-top filter)

v2:
01: changed to simply check s->target->perm
06: make explicit immutable-source parameter instead of auto-detecting

Vladimir Sementsov-Ogievskiy (6):
   block/block-copy: use write-unchanged for fleecing scheme
   block/copy-before-write: require BLK_PERM_WRITE_UNCHANGED for fleecing
   block: share writes on backing child of fleecing node
   block: blk_root(): return non-const pointer
   qapi: backup: add immutable-source paramter
   iotests/image-fleecing: test push backup with fleecing

  qapi/block-core.json|  12 ++-
  block/copy-before-write.h   |   1 +
  include/block/block_int.h   |   1 +
  include/sysemu/block-backend.h  |   2 +-
  block.c |   3 +-
  block/backup.c  |  71 -
  block/block-backend.c   |   2 +-
  block/block-copy.c  |  18 +++-
  block/copy-before-write.c   |  48 -
  block/replication.c |   2 +-
  blockdev.c  |   1 +
  tests/qemu-iotests/tests/image-fleecing | 105 +++-
  tests/qemu-iotests/tests/image-fleecing.out |  61 
  13 files changed, 287 insertions(+), 40 deletions(-)




--
Best regards,
Vladimir



[PATCH v7 31/33] iotests/image-fleecing: rename tgt_node

2021-08-04 Thread Vladimir Sementsov-Ogievskiy
Actually target of backup(sync=None) is not a final backup target:
image fleecing is intended to be used with external tool, which will
copy data from fleecing node to some real backup target.

Also, we are going to add a test case for "push backup with fleecing",
where instead of exporting fleecing node by NBD, we'll start a backup
job from fleecing node to real backup target.

To avoid confusion, let's rename temporary fleecing node now.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
---
 tests/qemu-iotests/tests/image-fleecing | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/tests/qemu-iotests/tests/image-fleecing 
b/tests/qemu-iotests/tests/image-fleecing
index 961941bb27..ec4ef5f3f6 100755
--- a/tests/qemu-iotests/tests/image-fleecing
+++ b/tests/qemu-iotests/tests/image-fleecing
@@ -71,6 +71,7 @@ with iotests.FilePath('base.img') as base_img_path, \
 log('')
 
 src_node = 'source'
+tmp_node = 'temp'
 vm.add_blockdev(f'driver={iotests.imgfmt},file.driver=file,'
 f'file.filename={base_img_path},node-name={src_node}')
 vm.add_device('virtio-scsi')
@@ -82,12 +83,11 @@ with iotests.FilePath('base.img') as base_img_path, \
 log('--- Setting up Fleecing Graph ---')
 log('')
 
-tgt_node = 'fleeceNode'
 
-# create tgt_node backed by src_node
+# create tmp_node backed by src_node
 log(vm.qmp('blockdev-add', {
 'driver': 'qcow2',
-'node-name': tgt_node,
+'node-name': tmp_node,
 'file': {
 'driver': 'file',
 'filename': fleece_img_path,
@@ -99,19 +99,19 @@ with iotests.FilePath('base.img') as base_img_path, \
 log(vm.qmp('blockdev-backup',
job_id='fleecing',
device=src_node,
-   target=tgt_node,
+   target=tmp_node,
sync='none'))
 
 log('')
 log('--- Setting up NBD Export ---')
 log('')
 
-nbd_uri = 'nbd+unix:///%s?socket=%s' % (tgt_node, nbd_sock_path)
+nbd_uri = 'nbd+unix:///%s?socket=%s' % (tmp_node, nbd_sock_path)
 log(vm.qmp('nbd-server-start',
{'addr': { 'type': 'unix',
   'data': { 'path': nbd_sock_path } } }))
 
-log(vm.qmp('nbd-server-add', device=tgt_node))
+log(vm.qmp('nbd-server-add', device=tmp_node))
 
 log('')
 log('--- Sanity Check ---')
@@ -149,7 +149,7 @@ with iotests.FilePath('base.img') as base_img_path, \
 assert e is not None
 log(e, filters=[iotests.filter_qmp_event])
 log(vm.qmp('nbd-server-stop'))
-log(vm.qmp('blockdev-del', node_name=tgt_node))
+log(vm.qmp('blockdev-del', node_name=tmp_node))
 vm.shutdown()
 
 log('')
-- 
2.29.2




[PATCH v7 26/33] iotests/222: fix pylint and mypy complains

2021-08-04 Thread Vladimir Sementsov-Ogievskiy
Here:
 - long line
 - move to new interface of vm.qmp() (direct passing dict), to avoid
   mypy false-positive, as it thinks that unpacked dict is a positional
   argument.
 - extra parenthesis
 - handle event_wait possible None value

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
---
 tests/qemu-iotests/222 | 20 +++-
 tests/qemu-iotests/297 |  2 +-
 2 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/tests/qemu-iotests/222 b/tests/qemu-iotests/222
index b48afe623e..5e2556f8df 100755
--- a/tests/qemu-iotests/222
+++ b/tests/qemu-iotests/222
@@ -50,7 +50,8 @@ remainder = [("0xd5", "0x108000",  "32k"), # Right-end of 
partial-left [1]
 
 with iotests.FilePath('base.img') as base_img_path, \
  iotests.FilePath('fleece.img') as fleece_img_path, \
- iotests.FilePath('nbd.sock', base_dir=iotests.sock_dir) as nbd_sock_path, 
\
+ iotests.FilePath('nbd.sock',
+  base_dir=iotests.sock_dir) as nbd_sock_path, \
  iotests.VM() as vm:
 
 log('--- Setting up images ---')
@@ -81,7 +82,7 @@ with iotests.FilePath('base.img') as base_img_path, \
 tgt_node = "fleeceNode"
 
 # create tgt_node backed by src_node
-log(vm.qmp("blockdev-add", **{
+log(vm.qmp("blockdev-add", {
 "driver": "qcow2",
 "node-name": tgt_node,
 "file": {
@@ -103,8 +104,8 @@ with iotests.FilePath('base.img') as base_img_path, \
 
 nbd_uri = 'nbd+unix:///%s?socket=%s' % (tgt_node, nbd_sock_path)
 log(vm.qmp("nbd-server-start",
-   **{"addr": { "type": "unix",
-"data": { "path": nbd_sock_path } } }))
+   {"addr": { "type": "unix",
+  "data": { "path": nbd_sock_path } } }))
 
 log(vm.qmp("nbd-server-add", device=tgt_node))
 
@@ -112,7 +113,7 @@ with iotests.FilePath('base.img') as base_img_path, \
 log('--- Sanity Check ---')
 log('')
 
-for p in (patterns + zeroes):
+for p in patterns + zeroes:
 cmd = "read -P%s %s %s" % p
 log(cmd)
 assert qemu_io_silent('-r', '-f', 'raw', '-c', cmd, nbd_uri) == 0
@@ -130,7 +131,7 @@ with iotests.FilePath('base.img') as base_img_path, \
 log('--- Verifying Data ---')
 log('')
 
-for p in (patterns + zeroes):
+for p in patterns + zeroes:
 cmd = "read -P%s %s %s" % p
 log(cmd)
 assert qemu_io_silent('-r', '-f', 'raw', '-c', cmd, nbd_uri) == 0
@@ -140,8 +141,9 @@ with iotests.FilePath('base.img') as base_img_path, \
 log('')
 
 log(vm.qmp('block-job-cancel', device=src_node))
-log(vm.event_wait('BLOCK_JOB_CANCELLED'),
-filters=[iotests.filter_qmp_event])
+e = vm.event_wait('BLOCK_JOB_CANCELLED')
+assert e is not None
+log(e, filters=[iotests.filter_qmp_event])
 log(vm.qmp('nbd-server-stop'))
 log(vm.qmp('blockdev-del', node_name=tgt_node))
 vm.shutdown()
@@ -150,7 +152,7 @@ with iotests.FilePath('base.img') as base_img_path, \
 log('--- Confirming writes ---')
 log('')
 
-for p in (overwrite + remainder):
+for p in overwrite + remainder:
 cmd = "read -P%s %s %s" % p
 log(cmd)
 assert qemu_io_silent(base_img_path, '-c', cmd) == 0
diff --git a/tests/qemu-iotests/297 b/tests/qemu-iotests/297
index 433b732336..345b617b34 100755
--- a/tests/qemu-iotests/297
+++ b/tests/qemu-iotests/297
@@ -31,7 +31,7 @@ SKIP_FILES = (
 '096', '118', '124', '132', '136', '139', '147', '148', '149',
 '151', '152', '155', '163', '165', '169', '194', '196', '199', '202',
 '203', '205', '206', '207', '208', '210', '211', '212', '213', '216',
-'218', '219', '222', '224', '228', '234', '235', '236', '237', '238',
+'218', '219', '224', '228', '234', '235', '236', '237', '238',
 '240', '242', '245', '246', '248', '255', '256', '257', '258', '260',
 '262', '264', '266', '274', '277', '280', '281', '295', '296', '298',
 '299', '302', '303', '304', '307',
-- 
2.29.2




Re: [PATCH for-6.1? v2 6/7] mirror: Check job_is_cancelled() earlier

2021-08-04 Thread Kevin Wolf
Am 04.08.2021 um 10:25 hat Max Reitz geschrieben:
> On 03.08.21 16:34, Kevin Wolf wrote:
> > Am 26.07.2021 um 16:46 hat Max Reitz geschrieben:
> > > We must check whether the job is force-cancelled early in our main loop,
> > > most importantly before any `continue` statement.  For example, we used
> > > to have `continue`s before our current checking location that are
> > > triggered by `mirror_flush()` failing.  So, if `mirror_flush()` kept
> > > failing, force-cancelling the job would not terminate it.
> > > 
> > > A job being force-cancelled should be treated the same as the job having
> > > failed, so put the check in the same place where we check `s->ret < 0`.
> > > 
> > > Buglink: https://gitlab.com/qemu-project/qemu/-/issues/462
> > > Signed-off-by: Max Reitz 
> > > ---
> > >   block/mirror.c | 7 +--
> > >   1 file changed, 1 insertion(+), 6 deletions(-)
> > > 
> > > diff --git a/block/mirror.c b/block/mirror.c
> > > index 72e02fa34e..46d1a1e5a2 100644
> > > --- a/block/mirror.c
> > > +++ b/block/mirror.c
> > > @@ -993,7 +993,7 @@ static int coroutine_fn mirror_run(Job *job, Error 
> > > **errp)
> > >   mirror_wait_for_any_operation(s, true);
> > >   }
> > > -if (s->ret < 0) {
> > > +if (s->ret < 0 || job_is_cancelled(&s->common.job)) {
> > >   ret = s->ret;
> > >   goto immediate_exit;
> > >   }
> > > @@ -1078,8 +1078,6 @@ static int coroutine_fn mirror_run(Job *job, Error 
> > > **errp)
> > >   break;
> > >   }
> > > -ret = 0;
> > > -
> > >   if (job_is_ready(&s->common.job) && !should_complete) {
> > >   delay_ns = (s->in_flight == 0 &&
> > >   cnt == 0 ? BLOCK_JOB_SLICE_TIME : 0);
> > > @@ -1087,9 +1085,6 @@ static int coroutine_fn mirror_run(Job *job, Error 
> > > **errp)
> > >   trace_mirror_before_sleep(s, cnt, job_is_ready(&s->common.job),
> > > delay_ns);
> > >   job_sleep_ns(&s->common.job, delay_ns);
> > > -if (job_is_cancelled(&s->common.job)) {
> > > -break;
> > > -}
> > I think it was intentional that the check is here because it means
> > skipping the job_sleep_ns() and instead cancelling immediately, and we
> > probably still want that. Between your check above and here, the
> > coroutine can yield, so cancellation could have been newly requested.
> 
> I’m afraid I don’t quite understand.

Hm, I don't either. Somehow I thought job_sleep_ns() was after the
check, while quoting the exact hunk that shows that it comes before
it...

I'm still not sure if sleeping before exiting is really useful, but it
seems we never cared about that.

Kevin




[PATCH v7 32/33] iotests/image-fleecing: prepare for adding new test-case

2021-08-04 Thread Vladimir Sementsov-Ogievskiy
We are going to add a test-case with some behavior modifications. So,
let's prepare a function to be reused.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
---
 tests/qemu-iotests/tests/image-fleecing | 19 +--
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/tests/qemu-iotests/tests/image-fleecing 
b/tests/qemu-iotests/tests/image-fleecing
index ec4ef5f3f6..e210c00d28 100755
--- a/tests/qemu-iotests/tests/image-fleecing
+++ b/tests/qemu-iotests/tests/image-fleecing
@@ -48,12 +48,7 @@ remainder = [('0xd5', '0x108000',  '32k'), # Right-end of 
partial-left [1]
  ('0xdc', '32M',   '32k'), # Left-end of partial-right [2]
  ('0xcd', '0x3ff', '64k')] # patterns[3]
 
-with iotests.FilePath('base.img') as base_img_path, \
- iotests.FilePath('fleece.img') as fleece_img_path, \
- iotests.FilePath('nbd.sock',
-  base_dir=iotests.sock_dir) as nbd_sock_path, \
- iotests.VM() as vm:
-
+def do_test(base_img_path, fleece_img_path, nbd_sock_path, vm):
 log('--- Setting up images ---')
 log('')
 
@@ -163,3 +158,15 @@ with iotests.FilePath('base.img') as base_img_path, \
 
 log('')
 log('Done')
+
+
+def test():
+with iotests.FilePath('base.img') as base_img_path, \
+ iotests.FilePath('fleece.img') as fleece_img_path, \
+ iotests.FilePath('nbd.sock',
+  base_dir=iotests.sock_dir) as nbd_sock_path, \
+ iotests.VM() as vm:
+do_test(base_img_path, fleece_img_path, nbd_sock_path, vm)
+
+
+test()
-- 
2.29.2




[PATCH v7 25/33] iotests.py: VM: add own __enter__ method

2021-08-04 Thread Vladimir Sementsov-Ogievskiy
In superclass __enter__ method is defined with return value type hint
'QEMUMachine'. So, mypy thinks that return value of VM.__enter__ is
QEMUMachine. Let's redefine __enter__ in VM class, to give it correct
type hint.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
---
 tests/qemu-iotests/iotests.py | 4 
 1 file changed, 4 insertions(+)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 89663dac06..025e288ddd 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -568,6 +568,10 @@ def remote_filename(path):
 class VM(qtest.QEMUQtestMachine):
 '''A QEMU VM'''
 
+# Redefine __enter__ with proper type hint
+def __enter__(self) -> 'VM':
+return self
+
 def __init__(self, path_suffix=''):
 name = "qemu%s-%d" % (path_suffix, os.getpid())
 super().__init__(qemu_prog, qemu_opts, name=name,
-- 
2.29.2




[PATCH v7 28/33] iotests: move 222 to tests/image-fleecing

2021-08-04 Thread Vladimir Sementsov-Ogievskiy
Give a good name to test file.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
---
 tests/qemu-iotests/{222 => tests/image-fleecing} | 0
 tests/qemu-iotests/{222.out => tests/image-fleecing.out} | 0
 2 files changed, 0 insertions(+), 0 deletions(-)
 rename tests/qemu-iotests/{222 => tests/image-fleecing} (100%)
 rename tests/qemu-iotests/{222.out => tests/image-fleecing.out} (100%)

diff --git a/tests/qemu-iotests/222 b/tests/qemu-iotests/tests/image-fleecing
similarity index 100%
rename from tests/qemu-iotests/222
rename to tests/qemu-iotests/tests/image-fleecing
diff --git a/tests/qemu-iotests/222.out 
b/tests/qemu-iotests/tests/image-fleecing.out
similarity index 100%
rename from tests/qemu-iotests/222.out
rename to tests/qemu-iotests/tests/image-fleecing.out
-- 
2.29.2




[PATCH v7 30/33] iotests/image-fleecing: proper source device

2021-08-04 Thread Vladimir Sementsov-Ogievskiy
Define scsi device to operate with it by qom-set in further patch.

Give a new node-name to source block node, to not look like device
name.

Job now don't want to work without giving explicit id, so, let's call
it "fleecing".

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
---
 tests/qemu-iotests/tests/image-fleecing | 12 
 tests/qemu-iotests/tests/image-fleecing.out |  2 +-
 2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/tests/qemu-iotests/tests/image-fleecing 
b/tests/qemu-iotests/tests/image-fleecing
index 799369e290..961941bb27 100755
--- a/tests/qemu-iotests/tests/image-fleecing
+++ b/tests/qemu-iotests/tests/image-fleecing
@@ -70,7 +70,11 @@ with iotests.FilePath('base.img') as base_img_path, \
 log('--- Launching VM ---')
 log('')
 
-vm.add_drive(base_img_path)
+src_node = 'source'
+vm.add_blockdev(f'driver={iotests.imgfmt},file.driver=file,'
+f'file.filename={base_img_path},node-name={src_node}')
+vm.add_device('virtio-scsi')
+vm.add_device(f'scsi-hd,id=sda,drive={src_node}')
 vm.launch()
 log('Done')
 
@@ -78,7 +82,6 @@ with iotests.FilePath('base.img') as base_img_path, \
 log('--- Setting up Fleecing Graph ---')
 log('')
 
-src_node = 'drive0'
 tgt_node = 'fleeceNode'
 
 # create tgt_node backed by src_node
@@ -94,6 +97,7 @@ with iotests.FilePath('base.img') as base_img_path, \
 
 # Establish COW from source to fleecing node
 log(vm.qmp('blockdev-backup',
+   job_id='fleecing',
device=src_node,
target=tgt_node,
sync='none'))
@@ -125,7 +129,7 @@ with iotests.FilePath('base.img') as base_img_path, \
 for p in overwrite:
 cmd = 'write -P%s %s %s' % p
 log(cmd)
-log(vm.hmp_qemu_io(src_node, cmd))
+log(vm.hmp_qemu_io('/machine/peripheral/sda', cmd, qdev=True))
 
 log('')
 log('--- Verifying Data ---')
@@ -140,7 +144,7 @@ with iotests.FilePath('base.img') as base_img_path, \
 log('--- Cleanup ---')
 log('')
 
-log(vm.qmp('block-job-cancel', device=src_node))
+log(vm.qmp('block-job-cancel', device='fleecing'))
 e = vm.event_wait('BLOCK_JOB_CANCELLED')
 assert e is not None
 log(e, filters=[iotests.filter_qmp_event])
diff --git a/tests/qemu-iotests/tests/image-fleecing.out 
b/tests/qemu-iotests/tests/image-fleecing.out
index 16643dde30..314a1b54e9 100644
--- a/tests/qemu-iotests/tests/image-fleecing.out
+++ b/tests/qemu-iotests/tests/image-fleecing.out
@@ -50,7 +50,7 @@ read -P0 0x3fe 64k
 --- Cleanup ---
 
 {"return": {}}
-{"data": {"device": "drive0", "len": 67108864, "offset": 393216, "speed": 0, 
"type": "backup"}, "event": "BLOCK_JOB_CANCELLED", "timestamp": 
{"microseconds": "USECS", "seconds": "SECS"}}
+{"data": {"device": "fleecing", "len": 67108864, "offset": 393216, "speed": 0, 
"type": "backup"}, "event": "BLOCK_JOB_CANCELLED", "timestamp": 
{"microseconds": "USECS", "seconds": "SECS"}}
 {"return": {}}
 {"return": {}}
 
-- 
2.29.2




[PATCH v7 27/33] iotests/222: constantly use single quotes for strings

2021-08-04 Thread Vladimir Sementsov-Ogievskiy
The file use both single and double quotes for strings. Let's be
consistent.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
---
 tests/qemu-iotests/222 | 68 +-
 1 file changed, 34 insertions(+), 34 deletions(-)

diff --git a/tests/qemu-iotests/222 b/tests/qemu-iotests/222
index 5e2556f8df..799369e290 100755
--- a/tests/qemu-iotests/222
+++ b/tests/qemu-iotests/222
@@ -30,23 +30,23 @@ iotests.script_initialize(
 supported_platforms=['linux'],
 )
 
-patterns = [("0x5d", "0", "64k"),
-("0xd5", "1M","64k"),
-("0xdc", "32M",   "64k"),
-("0xcd", "0x3ff", "64k")]  # 64M - 64K
+patterns = [('0x5d', '0', '64k'),
+('0xd5', '1M','64k'),
+('0xdc', '32M',   '64k'),
+('0xcd', '0x3ff', '64k')]  # 64M - 64K
 
-overwrite = [("0xab", "0", "64k"), # Full overwrite
- ("0xad", "0x00f8000", "64k"), # Partial-left (1M-32K)
- ("0x1d", "0x2008000", "64k"), # Partial-right (32M+32K)
- ("0xea", "0x3fe", "64k")] # Adjacent-left (64M - 128K)
+overwrite = [('0xab', '0', '64k'), # Full overwrite
+ ('0xad', '0x00f8000', '64k'), # Partial-left (1M-32K)
+ ('0x1d', '0x2008000', '64k'), # Partial-right (32M+32K)
+ ('0xea', '0x3fe', '64k')] # Adjacent-left (64M - 128K)
 
-zeroes = [("0", "0x00f8000", "32k"), # Left-end of partial-left (1M-32K)
-  ("0", "0x201", "32k"), # Right-end of partial-right (32M+64K)
-  ("0", "0x3fe", "64k")] # overwrite[3]
+zeroes = [('0', '0x00f8000', '32k'), # Left-end of partial-left (1M-32K)
+  ('0', '0x201', '32k'), # Right-end of partial-right (32M+64K)
+  ('0', '0x3fe', '64k')] # overwrite[3]
 
-remainder = [("0xd5", "0x108000",  "32k"), # Right-end of partial-left [1]
- ("0xdc", "32M",   "32k"), # Left-end of partial-right [2]
- ("0xcd", "0x3ff", "64k")] # patterns[3]
+remainder = [('0xd5', '0x108000',  '32k'), # Right-end of partial-left [1]
+ ('0xdc', '32M',   '32k'), # Left-end of partial-right [2]
+ ('0xcd', '0x3ff', '64k')] # patterns[3]
 
 with iotests.FilePath('base.img') as base_img_path, \
  iotests.FilePath('fleece.img') as fleece_img_path, \
@@ -58,7 +58,7 @@ with iotests.FilePath('base.img') as base_img_path, \
 log('')
 
 assert qemu_img('create', '-f', iotests.imgfmt, base_img_path, '64M') == 0
-assert qemu_img('create', '-f', "qcow2", fleece_img_path, '64M') == 0
+assert qemu_img('create', '-f', 'qcow2', fleece_img_path, '64M') == 0
 
 for p in patterns:
 qemu_io('-f', iotests.imgfmt,
@@ -78,43 +78,43 @@ with iotests.FilePath('base.img') as base_img_path, \
 log('--- Setting up Fleecing Graph ---')
 log('')
 
-src_node = "drive0"
-tgt_node = "fleeceNode"
+src_node = 'drive0'
+tgt_node = 'fleeceNode'
 
 # create tgt_node backed by src_node
-log(vm.qmp("blockdev-add", {
-"driver": "qcow2",
-"node-name": tgt_node,
-"file": {
-"driver": "file",
-"filename": fleece_img_path,
+log(vm.qmp('blockdev-add', {
+'driver': 'qcow2',
+'node-name': tgt_node,
+'file': {
+'driver': 'file',
+'filename': fleece_img_path,
 },
-"backing": src_node,
+'backing': src_node,
 }))
 
 # Establish COW from source to fleecing node
-log(vm.qmp("blockdev-backup",
+log(vm.qmp('blockdev-backup',
device=src_node,
target=tgt_node,
-   sync="none"))
+   sync='none'))
 
 log('')
 log('--- Setting up NBD Export ---')
 log('')
 
 nbd_uri = 'nbd+unix:///%s?socket=%s' % (tgt_node, nbd_sock_path)
-log(vm.qmp("nbd-server-start",
-   {"addr": { "type": "unix",
-  "data": { "path": nbd_sock_path } } }))
+log(vm.qmp('nbd-server-start',
+   {'addr': { 'type': 'unix',
+  'data': { 'path': nbd_sock_path } } }))
 
-log(vm.qmp("nbd-server-add", device=tgt_node))
+log(vm.qmp('nbd-server-add', device=tgt_node))
 
 log('')
 log('--- Sanity Check ---')
 log('')
 
 for p in patterns + zeroes:
-cmd = "read -P%s %s %s" % p
+cmd = 'read -P%s %s %s' % p
 log(cmd)
 assert qemu_io_silent('-r', '-f', 'raw', '-c', cmd, nbd_uri) == 0
 
@@ -123,7 +123,7 @@ with iotests.FilePath('base.img') as base_img_path, \
 log('')
 
 for p in overwrite:
-cmd = "write -P%s %s %s" % p
+cmd = 'write -P%s %s %s' % p
 log(cmd)
 log(vm.hmp_qemu_io(src_node, cmd))
 
@@ -132,7 +132,7 @@ with iotests.FilePath('base.img') as base_img_path, \
 log('')
 
 for p in patterns + zeroes:
-cmd = "read -P%s %s %s" % p
+cmd = 'read -P%s %s 

[PATCH v7 22/33] qapi: publish copy-before-write filter

2021-08-04 Thread Vladimir Sementsov-Ogievskiy
Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
---
 qapi/block-core.json | 25 +++--
 1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 675d8265eb..59d3e5e42d 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -2825,13 +2825,14 @@
 # @blklogwrites: Since 3.0
 # @blkreplay: Since 4.2
 # @compress: Since 5.0
+# @copy-before-write: Since 6.1
 #
 # Since: 2.9
 ##
 { 'enum': 'BlockdevDriver',
   'data': [ 'blkdebug', 'blklogwrites', 'blkreplay', 'blkverify', 'bochs',
-'cloop', 'compress', 'copy-on-read', 'dmg', 'file', 'ftp', 'ftps',
-'gluster',
+'cloop', 'compress', 'copy-before-write', 'copy-on-read', 'dmg',
+'file', 'ftp', 'ftps', 'gluster',
 {'name': 'host_cdrom', 'if': 'defined(HAVE_HOST_BLOCK_DEVICE)' },
 {'name': 'host_device', 'if': 'defined(HAVE_HOST_BLOCK_DEVICE)' },
 'http', 'https', 'iscsi',
@@ -4049,6 +4050,25 @@
   'base': 'BlockdevOptionsGenericFormat',
   'data': { '*bottom': 'str' } }
 
+##
+# @BlockdevOptionsCbw:
+#
+# Driver specific block device options for the copy-before-write driver,
+# which does so called copy-before-write operations: when data is
+# written to the filter, the filter firstly reads corresponding blocks
+# from its file child and copies them to @target child. After successful
+# copying the write request is propagated to file child. If copying
+# failed, the original write request is failed too and no data is written
+# to file child.
+#
+# @target: The target for copy-before-write operations.
+#
+# Since: 6.1
+##
+{ 'struct': 'BlockdevOptionsCbw',
+  'base': 'BlockdevOptionsGenericFormat',
+  'data': { 'target': 'BlockdevRef' } }
+
 ##
 # @BlockdevOptions:
 #
@@ -4101,6 +4121,7 @@
   'bochs':  'BlockdevOptionsGenericFormat',
   'cloop':  'BlockdevOptionsGenericFormat',
   'compress':   'BlockdevOptionsGenericFormat',
+  'copy-before-write':'BlockdevOptionsCbw',
   'copy-on-read':'BlockdevOptionsCor',
   'dmg':'BlockdevOptionsGenericFormat',
   'file':   'BlockdevOptionsFile',
-- 
2.29.2




[PATCH v7 21/33] block/copy-before-write: make public block driver

2021-08-04 Thread Vladimir Sementsov-Ogievskiy
Finally, copy-before-write gets own .bdrv_open and .bdrv_close
handlers, block_init() call and becomes available through bdrv_open().

To achieve this:

 - cbw_init gets unused flags argument and becomes cbw_open
 - block_copy_state_free() call moved to new cbw_close()
 - in bdrv_cbw_append:
   - options are completed with driver and node-name, and we can simply
 use bdrv_insert_node() to do both open and drained replacing
 - in bdrv_cbw_drop:
   - cbw_close() is now responsible for freeing s->bcs, so don't do it
 here

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
---
 block/copy-before-write.c | 60 ++-
 1 file changed, 28 insertions(+), 32 deletions(-)

diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index 2efe098aae..2cd68b480a 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c
@@ -144,7 +144,8 @@ static void cbw_child_perm(BlockDriverState *bs, BdrvChild 
*c,
 }
 }
 
-static int cbw_init(BlockDriverState *bs, QDict *options, Error **errp)
+static int cbw_open(BlockDriverState *bs, QDict *options, int flags,
+Error **errp)
 {
 BDRVCopyBeforeWriteState *s = bs->opaque;
 BdrvDirtyBitmap *copy_bitmap;
@@ -181,10 +182,21 @@ static int cbw_init(BlockDriverState *bs, QDict *options, 
Error **errp)
 return 0;
 }
 
+static void cbw_close(BlockDriverState *bs)
+{
+BDRVCopyBeforeWriteState *s = bs->opaque;
+
+block_copy_state_free(s->bcs);
+s->bcs = NULL;
+}
+
 BlockDriver bdrv_cbw_filter = {
 .format_name = "copy-before-write",
 .instance_size = sizeof(BDRVCopyBeforeWriteState),
 
+.bdrv_open  = cbw_open,
+.bdrv_close = cbw_close,
+
 .bdrv_co_preadv = cbw_co_preadv,
 .bdrv_co_pwritev= cbw_co_pwritev,
 .bdrv_co_pwrite_zeroes  = cbw_co_pwrite_zeroes,
@@ -205,56 +217,40 @@ BlockDriverState *bdrv_cbw_append(BlockDriverState 
*source,
   Error **errp)
 {
 ERRP_GUARD();
-int ret;
 BDRVCopyBeforeWriteState *state;
 BlockDriverState *top;
 QDict *opts;
 
 assert(source->total_sectors == target->total_sectors);
 
-top = bdrv_new_open_driver(&bdrv_cbw_filter, filter_node_name,
-   BDRV_O_RDWR, errp);
-if (!top) {
-error_prepend(errp, "Cannot open driver: ");
-return NULL;
-}
-state = top->opaque;
-
 opts = qdict_new();
+qdict_put_str(opts, "driver", "copy-before-write");
+if (filter_node_name) {
+qdict_put_str(opts, "node-name", filter_node_name);
+}
 qdict_put_str(opts, "file", bdrv_get_node_name(source));
 qdict_put_str(opts, "target", bdrv_get_node_name(target));
 
-ret = cbw_init(top, opts, errp);
-qobject_unref(opts);
-if (ret < 0) {
-goto fail;
-}
-
-bdrv_drained_begin(source);
-ret = bdrv_replace_node(source, top, errp);
-bdrv_drained_end(source);
-if (ret < 0) {
-error_prepend(errp, "Cannot append copy-before-write filter: ");
-goto fail;
+top = bdrv_insert_node(source, opts, BDRV_O_RDWR, errp);
+if (!top) {
+return NULL;
 }
 
+state = top->opaque;
 *bcs = state->bcs;
 
 return top;
-
-fail:
-block_copy_state_free(state->bcs);
-bdrv_unref(top);
-return NULL;
 }
 
 void bdrv_cbw_drop(BlockDriverState *bs)
 {
-BDRVCopyBeforeWriteState *s = bs->opaque;
-
 bdrv_drop_filter(bs, &error_abort);
-
-block_copy_state_free(s->bcs);
-
 bdrv_unref(bs);
 }
+
+static void cbw_init(void)
+{
+bdrv_register(&bdrv_cbw_filter);
+}
+
+block_init(cbw_init);
-- 
2.29.2




[PATCH v7 19/33] block/copy-before-write: initialize block-copy bitmap

2021-08-04 Thread Vladimir Sementsov-Ogievskiy
We are going to publish copy-before-write filter to be used in separate
of backup. Future step would support bitmap for the filter. But let's
start from full set bitmap.

We have to modify backup, as bitmap is first initialized by
copy-before-write filter, and then backup modifies it.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
---
 block/backup.c| 16 +++-
 block/copy-before-write.c |  4 
 2 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index 4869f1e5da..687d2882bc 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -233,18 +233,16 @@ static void backup_init_bcs_bitmap(BackupBlockJob *job)
 BdrvDirtyBitmap *bcs_bitmap = block_copy_dirty_bitmap(job->bcs);
 
 if (job->sync_mode == MIRROR_SYNC_MODE_BITMAP) {
+bdrv_clear_dirty_bitmap(bcs_bitmap, NULL);
 ret = bdrv_dirty_bitmap_merge_internal(bcs_bitmap, job->sync_bitmap,
NULL, true);
 assert(ret);
-} else {
-if (job->sync_mode == MIRROR_SYNC_MODE_TOP) {
-/*
- * We can't hog the coroutine to initialize this thoroughly.
- * Set a flag and resume work when we are able to yield safely.
- */
-block_copy_set_skip_unallocated(job->bcs, true);
-}
-bdrv_set_dirty_bitmap(bcs_bitmap, 0, job->len);
+} else if (job->sync_mode == MIRROR_SYNC_MODE_TOP) {
+/*
+ * We can't hog the coroutine to initialize this thoroughly.
+ * Set a flag and resume work when we are able to yield safely.
+ */
+block_copy_set_skip_unallocated(job->bcs, true);
 }
 
 estimate = bdrv_get_dirty_count(bcs_bitmap);
diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index 1cefcade78..2efe098aae 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c
@@ -147,6 +147,7 @@ static void cbw_child_perm(BlockDriverState *bs, BdrvChild 
*c,
 static int cbw_init(BlockDriverState *bs, QDict *options, Error **errp)
 {
 BDRVCopyBeforeWriteState *s = bs->opaque;
+BdrvDirtyBitmap *copy_bitmap;
 
 bs->file = bdrv_open_child(NULL, options, "file", bs, &child_of_bds,
BDRV_CHILD_FILTERED | BDRV_CHILD_PRIMARY,
@@ -174,6 +175,9 @@ static int cbw_init(BlockDriverState *bs, QDict *options, 
Error **errp)
 return -EINVAL;
 }
 
+copy_bitmap = block_copy_dirty_bitmap(s->bcs);
+bdrv_set_dirty_bitmap(copy_bitmap, 0, bdrv_dirty_bitmap_size(copy_bitmap));
+
 return 0;
 }
 
-- 
2.29.2




[PATCH v7 29/33] iotests.py: hmp_qemu_io: support qdev

2021-08-04 Thread Vladimir Sementsov-Ogievskiy
Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
---
 tests/qemu-iotests/iotests.py | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 025e288ddd..9d0031a0e8 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -655,9 +655,10 @@ def resume_drive(self, drive: str) -> None:
 self.hmp(f'qemu-io {drive} "remove_break bp_{drive}"')
 
 def hmp_qemu_io(self, drive: str, cmd: str,
-use_log: bool = False) -> QMPMessage:
+use_log: bool = False, qdev: bool = False) -> QMPMessage:
 """Write to a given drive using an HMP command"""
-return self.hmp(f'qemu-io {drive} "{cmd}"', use_log=use_log)
+d = '-d ' if qdev else ''
+return self.hmp(f'qemu-io {d}{drive} "{cmd}"', use_log=use_log)
 
 def flatten_qmp_object(self, obj, output=None, basestr=''):
 if output is None:
-- 
2.29.2




[PATCH v7 24/33] python/qemu/machine: QEMUMachine: improve qmp() method

2021-08-04 Thread Vladimir Sementsov-Ogievskiy
We often call qmp() with unpacking dict, like qmp('foo', **{...}).
mypy don't really like it, it thinks that passed unpacked dict is a
positional argument and complains that it type should be bool (because
second argument of qmp() is conv_keys: bool).

Allow passing dict directly, simplifying interface, and giving a way to
satisfy mypy.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
Reviewed-by: John Snow 
---
 python/qemu/machine/machine.py | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/python/qemu/machine/machine.py b/python/qemu/machine/machine.py
index 5eab31aeec..ce1e130c13 100644
--- a/python/qemu/machine/machine.py
+++ b/python/qemu/machine/machine.py
@@ -572,11 +572,21 @@ def _qmp_args(cls, conv_keys: bool,
 return args
 
 def qmp(self, cmd: str,
-conv_keys: bool = True,
+args_dict: Optional[Dict[str, object]] = None,
+conv_keys: Optional[bool] = None,
 **args: Any) -> QMPMessage:
 """
 Invoke a QMP command and return the response dict
 """
+if args_dict is not None:
+assert not args
+assert conv_keys is None
+args = args_dict
+conv_keys = False
+
+if conv_keys is None:
+conv_keys = True
+
 qmp_args = self._qmp_args(conv_keys, args)
 return self._qmp.cmd(cmd, args=qmp_args)
 
-- 
2.29.2




[PATCH v7 20/33] block/block-copy: make setting progress optional

2021-08-04 Thread Vladimir Sementsov-Ogievskiy
Now block-copy will crash if user don't set progress meter by
block_copy_set_progress_meter(). copy-before-write filter will be used
in separate of backup job, and it doesn't want any progress meter (for
now). So, allow not setting it.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
---
 block/block-copy.c | 18 +++---
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/block/block-copy.c b/block/block-copy.c
index 78ee255fd0..90664ee0ab 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -291,9 +291,11 @@ static void coroutine_fn block_copy_task_end(BlockCopyTask 
*task, int ret)
 bdrv_set_dirty_bitmap(task->s->copy_bitmap, task->offset, task->bytes);
 }
 QLIST_REMOVE(task, list);
-progress_set_remaining(task->s->progress,
-   bdrv_get_dirty_count(task->s->copy_bitmap) +
-   task->s->in_flight_bytes);
+if (task->s->progress) {
+progress_set_remaining(task->s->progress,
+   bdrv_get_dirty_count(task->s->copy_bitmap) +
+   task->s->in_flight_bytes);
+}
 qemu_co_queue_restart_all(&task->wait_queue);
 }
 
@@ -592,7 +594,7 @@ static coroutine_fn int block_copy_task_entry(AioTask *task)
 t->call_state->ret = ret;
 t->call_state->error_is_read = error_is_read;
 }
-} else {
+} else if (s->progress) {
 progress_work_done(s->progress, t->bytes);
 }
 }
@@ -698,9 +700,11 @@ int64_t block_copy_reset_unallocated(BlockCopyState *s,
 if (!ret) {
 qemu_co_mutex_lock(&s->lock);
 bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, bytes);
-progress_set_remaining(s->progress,
-   bdrv_get_dirty_count(s->copy_bitmap) +
-   s->in_flight_bytes);
+if (s->progress) {
+progress_set_remaining(s->progress,
+   bdrv_get_dirty_count(s->copy_bitmap) +
+   s->in_flight_bytes);
+}
 qemu_co_mutex_unlock(&s->lock);
 }
 
-- 
2.29.2




[PATCH v7 14/33] block/copy-before-write: introduce cbw_init()

2021-08-04 Thread Vladimir Sementsov-Ogievskiy
Move part of bdrv_cbw_append() to new function cbw_open(). It's an
intermediate step for adding normal .bdrv_open() handler to the
filter. With this commit no logic is changed, but we have a function
which will be turned into .bdrv_open() handler in future commit.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
---
 block/copy-before-write.c | 69 +++
 1 file changed, 41 insertions(+), 28 deletions(-)

diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index adbbc64aa9..a4fee645fd 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c
@@ -144,6 +144,45 @@ static void cbw_child_perm(BlockDriverState *bs, BdrvChild 
*c,
 }
 }
 
+static int cbw_init(BlockDriverState *top, BlockDriverState *source,
+BlockDriverState *target, bool compress, Error **errp)
+{
+BDRVCopyBeforeWriteState *state = top->opaque;
+
+top->total_sectors = source->total_sectors;
+top->supported_write_flags = BDRV_REQ_WRITE_UNCHANGED |
+(BDRV_REQ_FUA & source->supported_write_flags);
+top->supported_zero_flags = BDRV_REQ_WRITE_UNCHANGED |
+((BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP | BDRV_REQ_NO_FALLBACK) &
+ source->supported_zero_flags);
+
+bdrv_ref(target);
+state->target = bdrv_attach_child(top, target, "target", &child_of_bds,
+  BDRV_CHILD_DATA, errp);
+if (!state->target) {
+error_prepend(errp, "Cannot attach target child: ");
+return -EINVAL;
+}
+
+bdrv_ref(source);
+top->file = bdrv_attach_child(top, source, "file", &child_of_bds,
+  BDRV_CHILD_FILTERED | BDRV_CHILD_PRIMARY,
+  errp);
+if (!top->file) {
+error_prepend(errp, "Cannot attach file child: ");
+return -EINVAL;
+}
+
+state->bcs = block_copy_state_new(top->file, state->target, false, 
compress,
+  errp);
+if (!state->bcs) {
+error_prepend(errp, "Cannot create block-copy-state: ");
+return -EINVAL;
+}
+
+return 0;
+}
+
 BlockDriver bdrv_cbw_filter = {
 .format_name = "copy-before-write",
 .instance_size = sizeof(BDRVCopyBeforeWriteState),
@@ -181,36 +220,10 @@ BlockDriverState *bdrv_cbw_append(BlockDriverState 
*source,
 error_prepend(errp, "Cannot open driver: ");
 return NULL;
 }
-
 state = top->opaque;
-top->total_sectors = source->total_sectors;
-top->supported_write_flags = BDRV_REQ_WRITE_UNCHANGED |
-(BDRV_REQ_FUA & source->supported_write_flags);
-top->supported_zero_flags = BDRV_REQ_WRITE_UNCHANGED |
-((BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP | BDRV_REQ_NO_FALLBACK) &
- source->supported_zero_flags);
-
-bdrv_ref(target);
-state->target = bdrv_attach_child(top, target, "target", &child_of_bds,
-  BDRV_CHILD_DATA, errp);
-if (!state->target) {
-error_prepend(errp, "Cannot attach target child: ");
-goto fail;
-}
 
-bdrv_ref(source);
-top->file = bdrv_attach_child(top, source, "file", &child_of_bds,
-  BDRV_CHILD_FILTERED | BDRV_CHILD_PRIMARY,
-  errp);
-if (!top->file) {
-error_prepend(errp, "Cannot attach file child: ");
-goto fail;
-}
-
-state->bcs = block_copy_state_new(top->file, state->target, false, 
compress,
-  errp);
-if (!state->bcs) {
-error_prepend(errp, "Cannot create block-copy-state: ");
+ret = cbw_init(top, source, target, compress, errp);
+if (ret < 0) {
 goto fail;
 }
 
-- 
2.29.2




[PATCH v7 23/33] python/qemu/machine.py: refactor _qemu_args()

2021-08-04 Thread Vladimir Sementsov-Ogievskiy
 - use shorter construction
 - don't create new dict if not needed
 - drop extra unpacking key-val arguments
 - drop extra default values

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
Reviewed-by: John Snow 
---
 python/qemu/machine/machine.py | 18 --
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/python/qemu/machine/machine.py b/python/qemu/machine/machine.py
index 971ed7e8c6..5eab31aeec 100644
--- a/python/qemu/machine/machine.py
+++ b/python/qemu/machine/machine.py
@@ -564,14 +564,12 @@ def _qmp(self) -> QEMUMonitorProtocol:
 return self._qmp_connection
 
 @classmethod
-def _qmp_args(cls, _conv_keys: bool = True, **args: Any) -> Dict[str, Any]:
-qmp_args = dict()
-for key, value in args.items():
-if _conv_keys:
-qmp_args[key.replace('_', '-')] = value
-else:
-qmp_args[key] = value
-return qmp_args
+def _qmp_args(cls, conv_keys: bool,
+  args: Dict[str, Any]) -> Dict[str, object]:
+if conv_keys:
+return {k.replace('_', '-'): v for k, v in args.items()}
+
+return args
 
 def qmp(self, cmd: str,
 conv_keys: bool = True,
@@ -579,7 +577,7 @@ def qmp(self, cmd: str,
 """
 Invoke a QMP command and return the response dict
 """
-qmp_args = self._qmp_args(conv_keys, **args)
+qmp_args = self._qmp_args(conv_keys, args)
 return self._qmp.cmd(cmd, args=qmp_args)
 
 def command(self, cmd: str,
@@ -590,7 +588,7 @@ def command(self, cmd: str,
 On success return the response dict.
 On failure raise an exception.
 """
-qmp_args = self._qmp_args(conv_keys, **args)
+qmp_args = self._qmp_args(conv_keys, args)
 return self._qmp.command(cmd, **qmp_args)
 
 def get_qmp_event(self, wait: bool = False) -> Optional[QMPMessage]:
-- 
2.29.2




[PATCH v7 17/33] block/copy-before-write: bdrv_cbw_append(): drop unused compress arg

2021-08-04 Thread Vladimir Sementsov-Ogievskiy
Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 block/copy-before-write.h | 1 -
 block/backup.c| 2 +-
 block/copy-before-write.c | 7 +++
 3 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/block/copy-before-write.h b/block/copy-before-write.h
index b386fd8f01..51847e711a 100644
--- a/block/copy-before-write.h
+++ b/block/copy-before-write.h
@@ -32,7 +32,6 @@
 BlockDriverState *bdrv_cbw_append(BlockDriverState *source,
   BlockDriverState *target,
   const char *filter_node_name,
-  bool compress,
   BlockCopyState **bcs,
   Error **errp);
 void bdrv_cbw_drop(BlockDriverState *bs);
diff --git a/block/backup.c b/block/backup.c
index 83516297cb..4869f1e5da 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -452,7 +452,7 @@ BlockJob *backup_job_create(const char *job_id, 
BlockDriverState *bs,
 goto error;
 }
 
-cbw = bdrv_cbw_append(bs, target, filter_node_name, false, &bcs, errp);
+cbw = bdrv_cbw_append(bs, target, filter_node_name, &bcs, errp);
 if (!cbw) {
 goto error;
 }
diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index 4858dcf8ff..1e7180760a 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c
@@ -145,7 +145,7 @@ static void cbw_child_perm(BlockDriverState *bs, BdrvChild 
*c,
 }
 
 static int cbw_init(BlockDriverState *bs, BlockDriverState *source,
-BlockDriverState *target, bool compress, Error **errp)
+BlockDriverState *target, Error **errp)
 {
 BDRVCopyBeforeWriteState *s = bs->opaque;
 
@@ -173,7 +173,7 @@ static int cbw_init(BlockDriverState *bs, BlockDriverState 
*source,
 ((BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP | BDRV_REQ_NO_FALLBACK) &
  bs->file->bs->supported_zero_flags);
 
-s->bcs = block_copy_state_new(bs->file, s->target, false, compress, errp);
+s->bcs = block_copy_state_new(bs->file, s->target, false, false, errp);
 if (!s->bcs) {
 error_prepend(errp, "Cannot create block-copy-state: ");
 return -EINVAL;
@@ -202,7 +202,6 @@ BlockDriver bdrv_cbw_filter = {
 BlockDriverState *bdrv_cbw_append(BlockDriverState *source,
   BlockDriverState *target,
   const char *filter_node_name,
-  bool compress,
   BlockCopyState **bcs,
   Error **errp)
 {
@@ -221,7 +220,7 @@ BlockDriverState *bdrv_cbw_append(BlockDriverState *source,
 }
 state = top->opaque;
 
-ret = cbw_init(top, source, target, compress, errp);
+ret = cbw_init(top, source, target, errp);
 if (ret < 0) {
 goto fail;
 }
-- 
2.29.2




[PATCH v7 16/33] block/copy-before-write: cbw_init(): use file child after attaching

2021-08-04 Thread Vladimir Sementsov-Ogievskiy
In the next commit we'll get rid of source argument of cbw_init().
Prepare to it now, to make next commit simpler: move the code block
that uses source below attaching the child and use bs->file->bs instead
of source variable.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
---
 block/copy-before-write.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index d7f1833efa..4858dcf8ff 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c
@@ -149,13 +149,6 @@ static int cbw_init(BlockDriverState *bs, BlockDriverState 
*source,
 {
 BDRVCopyBeforeWriteState *s = bs->opaque;
 
-bs->total_sectors = source->total_sectors;
-bs->supported_write_flags = BDRV_REQ_WRITE_UNCHANGED |
-(BDRV_REQ_FUA & source->supported_write_flags);
-bs->supported_zero_flags = BDRV_REQ_WRITE_UNCHANGED |
-((BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP | BDRV_REQ_NO_FALLBACK) &
- source->supported_zero_flags);
-
 bdrv_ref(target);
 s->target = bdrv_attach_child(bs, target, "target", &child_of_bds,
   BDRV_CHILD_DATA, errp);
@@ -173,6 +166,13 @@ static int cbw_init(BlockDriverState *bs, BlockDriverState 
*source,
 return -EINVAL;
 }
 
+bs->total_sectors = bs->file->bs->total_sectors;
+bs->supported_write_flags = BDRV_REQ_WRITE_UNCHANGED |
+(BDRV_REQ_FUA & bs->file->bs->supported_write_flags);
+bs->supported_zero_flags = BDRV_REQ_WRITE_UNCHANGED |
+((BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP | BDRV_REQ_NO_FALLBACK) &
+ bs->file->bs->supported_zero_flags);
+
 s->bcs = block_copy_state_new(bs->file, s->target, false, compress, errp);
 if (!s->bcs) {
 error_prepend(errp, "Cannot create block-copy-state: ");
-- 
2.29.2




[PATCH v7 18/33] block/copy-before-write: cbw_init(): use options

2021-08-04 Thread Vladimir Sementsov-Ogievskiy
One more step closer to .bdrv_open(): use options instead of plain
arguments. Move to bdrv_open_child() calls, native for drive open
handlers.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 block/copy-before-write.c | 29 +++--
 1 file changed, 15 insertions(+), 14 deletions(-)

diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index 1e7180760a..1cefcade78 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c
@@ -144,25 +144,20 @@ static void cbw_child_perm(BlockDriverState *bs, 
BdrvChild *c,
 }
 }
 
-static int cbw_init(BlockDriverState *bs, BlockDriverState *source,
-BlockDriverState *target, Error **errp)
+static int cbw_init(BlockDriverState *bs, QDict *options, Error **errp)
 {
 BDRVCopyBeforeWriteState *s = bs->opaque;
 
-bdrv_ref(target);
-s->target = bdrv_attach_child(bs, target, "target", &child_of_bds,
-  BDRV_CHILD_DATA, errp);
-if (!s->target) {
-error_prepend(errp, "Cannot attach target child: ");
+bs->file = bdrv_open_child(NULL, options, "file", bs, &child_of_bds,
+   BDRV_CHILD_FILTERED | BDRV_CHILD_PRIMARY,
+   false, errp);
+if (!bs->file) {
 return -EINVAL;
 }
 
-bdrv_ref(source);
-bs->file = bdrv_attach_child(bs, source, "file", &child_of_bds,
- BDRV_CHILD_FILTERED | BDRV_CHILD_PRIMARY,
- errp);
-if (!bs->file) {
-error_prepend(errp, "Cannot attach file child: ");
+s->target = bdrv_open_child(NULL, options, "target", bs, &child_of_bds,
+BDRV_CHILD_DATA, false, errp);
+if (!s->target) {
 return -EINVAL;
 }
 
@@ -209,6 +204,7 @@ BlockDriverState *bdrv_cbw_append(BlockDriverState *source,
 int ret;
 BDRVCopyBeforeWriteState *state;
 BlockDriverState *top;
+QDict *opts;
 
 assert(source->total_sectors == target->total_sectors);
 
@@ -220,7 +216,12 @@ BlockDriverState *bdrv_cbw_append(BlockDriverState *source,
 }
 state = top->opaque;
 
-ret = cbw_init(top, source, target, errp);
+opts = qdict_new();
+qdict_put_str(opts, "file", bdrv_get_node_name(source));
+qdict_put_str(opts, "target", bdrv_get_node_name(target));
+
+ret = cbw_init(top, opts, errp);
+qobject_unref(opts);
 if (ret < 0) {
 goto fail;
 }
-- 
2.29.2




[PATCH v7 09/33] block/backup: move cluster size calculation to block-copy

2021-08-04 Thread Vladimir Sementsov-Ogievskiy
The main consumer of cluster-size is block-copy. Let's calculate it
here instead of passing through backup-top.

We are going to publish copy-before-write filter soon, so it will be
created through options. But we don't want for now to make explicit
option for cluster-size, let's continue to calculate it automatically.
So, now is the time to get rid of cluster_size argument for
bdrv_cbw_append().

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
---
 block/copy-before-write.h  |  1 -
 include/block/block-copy.h |  5 +--
 block/backup.c | 62 ++
 block/block-copy.c | 51 ++-
 block/copy-before-write.c  | 10 +++---
 5 files changed, 66 insertions(+), 63 deletions(-)

diff --git a/block/copy-before-write.h b/block/copy-before-write.h
index 538aab8bdb..b386fd8f01 100644
--- a/block/copy-before-write.h
+++ b/block/copy-before-write.h
@@ -32,7 +32,6 @@
 BlockDriverState *bdrv_cbw_append(BlockDriverState *source,
   BlockDriverState *target,
   const char *filter_node_name,
-  uint64_t cluster_size,
   bool compress,
   BlockCopyState **bcs,
   Error **errp);
diff --git a/include/block/block-copy.h b/include/block/block-copy.h
index f0ba7bc828..9c24e8f38e 100644
--- a/include/block/block-copy.h
+++ b/include/block/block-copy.h
@@ -25,8 +25,8 @@ typedef struct BlockCopyState BlockCopyState;
 typedef struct BlockCopyCallState BlockCopyCallState;
 
 BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
- int64_t cluster_size, bool use_copy_range,
- bool compress, Error **errp);
+ bool use_copy_range, bool compress,
+ Error **errp);
 
 void block_copy_set_copy_opts(BlockCopyState *s, bool use_copy_range,
   bool compress);
@@ -90,6 +90,7 @@ void block_copy_kick(BlockCopyCallState *call_state);
 void block_copy_call_cancel(BlockCopyCallState *call_state);
 
 BdrvDirtyBitmap *block_copy_dirty_bitmap(BlockCopyState *s);
+int64_t block_copy_cluster_size(BlockCopyState *s);
 void block_copy_set_skip_unallocated(BlockCopyState *s, bool skip);
 
 #endif /* BLOCK_COPY_H */
diff --git a/block/backup.c b/block/backup.c
index b31fd99ab3..83516297cb 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -29,8 +29,6 @@
 
 #include "block/copy-before-write.h"
 
-#define BACKUP_CLUSTER_SIZE_DEFAULT (1 << 16)
-
 typedef struct BackupBlockJob {
 BlockJob common;
 BlockDriverState *cbw;
@@ -354,43 +352,6 @@ static const BlockJobDriver backup_job_driver = {
 .set_speed = backup_set_speed,
 };
 
-static int64_t backup_calculate_cluster_size(BlockDriverState *target,
- Error **errp)
-{
-int ret;
-BlockDriverInfo bdi;
-bool target_does_cow = bdrv_backing_chain_next(target);
-
-/*
- * If there is no backing file on the target, we cannot rely on COW if our
- * backup cluster size is smaller than the target cluster size. Even for
- * targets with a backing file, try to avoid COW if possible.
- */
-ret = bdrv_get_info(target, &bdi);
-if (ret == -ENOTSUP && !target_does_cow) {
-/* Cluster size is not defined */
-warn_report("The target block device doesn't provide "
-"information about the block size and it doesn't have a "
-"backing file. The default block size of %u bytes is "
-"used. If the actual block size of the target exceeds "
-"this default, the backup may be unusable",
-BACKUP_CLUSTER_SIZE_DEFAULT);
-return BACKUP_CLUSTER_SIZE_DEFAULT;
-} else if (ret < 0 && !target_does_cow) {
-error_setg_errno(errp, -ret,
-"Couldn't determine the cluster size of the target image, "
-"which has no backing file");
-error_append_hint(errp,
-"Aborting, since this may create an unusable destination image\n");
-return ret;
-} else if (ret < 0 && target_does_cow) {
-/* Not fatal; just trudge on ahead. */
-return BACKUP_CLUSTER_SIZE_DEFAULT;
-}
-
-return MAX(BACKUP_CLUSTER_SIZE_DEFAULT, bdi.cluster_size);
-}
-
 BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
   BlockDriverState *target, int64_t speed,
   MirrorSyncMode sync_mode, BdrvDirtyBitmap *sync_bitmap,
@@ -448,11 +409,6 @@ BlockJob *backup_job_create(const char *job_id, 
BlockDriverState *bs,
 return NULL;
 }
 
-cluster_size = backup_calculate_cluster_size(target, errp);
-if (cluster_size < 0) {
-goto error;
-}
-
 if (perf->max_wo

[PATCH v7 06/33] block-copy: move detecting fleecing scheme to block-copy

2021-08-04 Thread Vladimir Sementsov-Ogievskiy
We want to simplify initialization interface of copy-before-write
filter as we are going to make it public. So, let's detect fleecing
scheme exactly in block-copy code, to not pass this information through
extra levels.

Why not just set BDRV_REQ_SERIALISING unconditionally: because we are
going to implement new more efficient fleecing scheme which will not
rely on backing feature.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 block/copy-before-write.h  |  2 +-
 include/block/block-copy.h |  3 +--
 block/backup.c | 21 +
 block/block-copy.c | 24 +---
 block/copy-before-write.c  |  4 ++--
 5 files changed, 26 insertions(+), 28 deletions(-)

diff --git a/block/copy-before-write.h b/block/copy-before-write.h
index 5977b7aa31..f37e2249ae 100644
--- a/block/copy-before-write.h
+++ b/block/copy-before-write.h
@@ -34,7 +34,7 @@ BlockDriverState *bdrv_cbw_append(BlockDriverState *source,
   const char *filter_node_name,
   uint64_t cluster_size,
   BackupPerf *perf,
-  BdrvRequestFlags write_flags,
+  bool compress,
   BlockCopyState **bcs,
   Error **errp);
 void bdrv_cbw_drop(BlockDriverState *bs);
diff --git a/include/block/block-copy.h b/include/block/block-copy.h
index 5c8278895c..734389d32a 100644
--- a/include/block/block-copy.h
+++ b/include/block/block-copy.h
@@ -26,8 +26,7 @@ typedef struct BlockCopyCallState BlockCopyCallState;
 
 BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
  int64_t cluster_size, bool use_copy_range,
- BdrvRequestFlags write_flags,
- Error **errp);
+ bool compress, Error **errp);
 
 void block_copy_set_progress_meter(BlockCopyState *s, ProgressMeter *pm);
 
diff --git a/block/backup.c b/block/backup.c
index ac91821b08..84f9a5f02c 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -407,7 +407,6 @@ BlockJob *backup_job_create(const char *job_id, 
BlockDriverState *bs,
 int64_t len, target_len;
 BackupBlockJob *job = NULL;
 int64_t cluster_size;
-BdrvRequestFlags write_flags;
 BlockDriverState *cbw = NULL;
 BlockCopyState *bcs = NULL;
 
@@ -504,26 +503,8 @@ BlockJob *backup_job_create(const char *job_id, 
BlockDriverState *bs,
 goto error;
 }
 
-/*
- * If source is in backing chain of target assume that target is going to 
be
- * used for "image fleecing", i.e. it should represent a kind of snapshot 
of
- * source at backup-start point in time. And target is going to be read by
- * somebody (for example, used as NBD export) during backup job.
- *
- * In this case, we need to add BDRV_REQ_SERIALISING write flag to avoid
- * intersection of backup writes and third party reads from target,
- * otherwise reading from target we may occasionally read already updated 
by
- * guest data.
- *
- * For more information see commit f8d59dfb40bb and test
- * tests/qemu-iotests/222
- */
-write_flags = (bdrv_chain_contains(target, bs) ? BDRV_REQ_SERIALISING : 0) 
|
-  (compress ? BDRV_REQ_WRITE_COMPRESSED : 0),
-
 cbw = bdrv_cbw_append(bs, target, filter_node_name,
-cluster_size, perf,
-write_flags, &bcs, errp);
+  cluster_size, perf, compress, &bcs, errp);
 if (!cbw) {
 goto error;
 }
diff --git a/block/block-copy.c b/block/block-copy.c
index 0becad52da..58b4345a5a 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -317,10 +317,11 @@ static uint32_t block_copy_max_transfer(BdrvChild 
*source, BdrvChild *target)
 
 BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
  int64_t cluster_size, bool use_copy_range,
- BdrvRequestFlags write_flags, Error 
**errp)
+ bool compress, Error **errp)
 {
 BlockCopyState *s;
 BdrvDirtyBitmap *copy_bitmap;
+bool is_fleecing;
 
 copy_bitmap = bdrv_create_dirty_bitmap(source->bs, cluster_size, NULL,
errp);
@@ -329,6 +330,22 @@ BlockCopyState *block_copy_state_new(BdrvChild *source, 
BdrvChild *target,
 }
 bdrv_disable_dirty_bitmap(copy_bitmap);
 
+/*
+ * If source is in backing chain of target assume that target is going to 
be
+ * used for "image fleecing", i.e. it should represent a kind of snapshot 
of
+ * source at backup-start point in time. And target is going to be read by
+ * somebody (for example, used as NBD export) during backup job.
+ *
+ * In

[PATCH v7 11/33] block/copy-before-write: drop extra bdrv_unref on failure path

2021-08-04 Thread Vladimir Sementsov-Ogievskiy
bdrv_attach_child() do bdrv_unref() on failure, so we shouldn't do it
by hand here.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
---
 block/copy-before-write.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index 2a51cc64e4..945d9340f4 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c
@@ -201,7 +201,6 @@ BlockDriverState *bdrv_cbw_append(BlockDriverState *source,
 state->target = bdrv_attach_child(top, target, "target", &child_of_bds,
   BDRV_CHILD_DATA, errp);
 if (!state->target) {
-bdrv_unref(target);
 bdrv_unref(top);
 return NULL;
 }
-- 
2.29.2




[PATCH v7 15/33] block/copy-before-write: cbw_init(): rename variables

2021-08-04 Thread Vladimir Sementsov-Ogievskiy
One more step closer to real .bdrv_open() handler: use more usual names
for bs being initialized and its state.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
---
 block/copy-before-write.c | 29 ++---
 1 file changed, 14 insertions(+), 15 deletions(-)

diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index a4fee645fd..d7f1833efa 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c
@@ -144,38 +144,37 @@ static void cbw_child_perm(BlockDriverState *bs, 
BdrvChild *c,
 }
 }
 
-static int cbw_init(BlockDriverState *top, BlockDriverState *source,
+static int cbw_init(BlockDriverState *bs, BlockDriverState *source,
 BlockDriverState *target, bool compress, Error **errp)
 {
-BDRVCopyBeforeWriteState *state = top->opaque;
+BDRVCopyBeforeWriteState *s = bs->opaque;
 
-top->total_sectors = source->total_sectors;
-top->supported_write_flags = BDRV_REQ_WRITE_UNCHANGED |
+bs->total_sectors = source->total_sectors;
+bs->supported_write_flags = BDRV_REQ_WRITE_UNCHANGED |
 (BDRV_REQ_FUA & source->supported_write_flags);
-top->supported_zero_flags = BDRV_REQ_WRITE_UNCHANGED |
+bs->supported_zero_flags = BDRV_REQ_WRITE_UNCHANGED |
 ((BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP | BDRV_REQ_NO_FALLBACK) &
  source->supported_zero_flags);
 
 bdrv_ref(target);
-state->target = bdrv_attach_child(top, target, "target", &child_of_bds,
-  BDRV_CHILD_DATA, errp);
-if (!state->target) {
+s->target = bdrv_attach_child(bs, target, "target", &child_of_bds,
+  BDRV_CHILD_DATA, errp);
+if (!s->target) {
 error_prepend(errp, "Cannot attach target child: ");
 return -EINVAL;
 }
 
 bdrv_ref(source);
-top->file = bdrv_attach_child(top, source, "file", &child_of_bds,
-  BDRV_CHILD_FILTERED | BDRV_CHILD_PRIMARY,
-  errp);
-if (!top->file) {
+bs->file = bdrv_attach_child(bs, source, "file", &child_of_bds,
+ BDRV_CHILD_FILTERED | BDRV_CHILD_PRIMARY,
+ errp);
+if (!bs->file) {
 error_prepend(errp, "Cannot attach file child: ");
 return -EINVAL;
 }
 
-state->bcs = block_copy_state_new(top->file, state->target, false, 
compress,
-  errp);
-if (!state->bcs) {
+s->bcs = block_copy_state_new(bs->file, s->target, false, compress, errp);
+if (!s->bcs) {
 error_prepend(errp, "Cannot create block-copy-state: ");
 return -EINVAL;
 }
-- 
2.29.2




[PATCH v7 13/33] block/copy-before-write: bdrv_cbw_append(): replace child at last

2021-08-04 Thread Vladimir Sementsov-Ogievskiy
Refactor the function to replace child at last. Thus we don't need to
revert it and code is simplified.

block-copy state initialization being done before replacing the child
doesn't need any drained section.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
---
 block/copy-before-write.c | 33 +++--
 1 file changed, 11 insertions(+), 22 deletions(-)

diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index 7a6c15f141..adbbc64aa9 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c
@@ -172,7 +172,6 @@ BlockDriverState *bdrv_cbw_append(BlockDriverState *source,
 int ret;
 BDRVCopyBeforeWriteState *state;
 BlockDriverState *top;
-bool appended = false;
 
 assert(source->total_sectors == target->total_sectors);
 
@@ -196,8 +195,7 @@ BlockDriverState *bdrv_cbw_append(BlockDriverState *source,
   BDRV_CHILD_DATA, errp);
 if (!state->target) {
 error_prepend(errp, "Cannot attach target child: ");
-bdrv_unref(top);
-return NULL;
+goto fail;
 }
 
 bdrv_ref(source);
@@ -206,18 +204,8 @@ BlockDriverState *bdrv_cbw_append(BlockDriverState *source,
   errp);
 if (!top->file) {
 error_prepend(errp, "Cannot attach file child: ");
-bdrv_unref(top);
-return NULL;
-}
-
-bdrv_drained_begin(source);
-
-ret = bdrv_replace_node(source, top, errp);
-if (ret < 0) {
-error_prepend(errp, "Cannot append copy-before-write filter: ");
 goto fail;
 }
-appended = true;
 
 state->bcs = block_copy_state_new(top->file, state->target, false, 
compress,
   errp);
@@ -225,21 +213,22 @@ BlockDriverState *bdrv_cbw_append(BlockDriverState 
*source,
 error_prepend(errp, "Cannot create block-copy-state: ");
 goto fail;
 }
-*bcs = state->bcs;
 
+bdrv_drained_begin(source);
+ret = bdrv_replace_node(source, top, errp);
 bdrv_drained_end(source);
+if (ret < 0) {
+error_prepend(errp, "Cannot append copy-before-write filter: ");
+goto fail;
+}
+
+*bcs = state->bcs;
 
 return top;
 
 fail:
-if (appended) {
-bdrv_cbw_drop(top);
-} else {
-bdrv_unref(top);
-}
-
-bdrv_drained_end(source);
-
+block_copy_state_free(state->bcs);
+bdrv_unref(top);
 return NULL;
 }
 
-- 
2.29.2




[PATCH v7 12/33] block/copy-before-write: use file child instead of backing

2021-08-04 Thread Vladimir Sementsov-Ogievskiy
We are going to publish copy-before-write filter, and there no public
backing-child-based filter in Qemu. No reason to create a precedent, so
let's refactor copy-before-write filter instead.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
---
 block/copy-before-write.c | 39 ++-
 1 file changed, 22 insertions(+), 17 deletions(-)

diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index 945d9340f4..7a6c15f141 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c
@@ -43,7 +43,7 @@ static coroutine_fn int cbw_co_preadv(
 BlockDriverState *bs, uint64_t offset, uint64_t bytes,
 QEMUIOVector *qiov, int flags)
 {
-return bdrv_co_preadv(bs->backing, offset, bytes, qiov, flags);
+return bdrv_co_preadv(bs->file, offset, bytes, qiov, flags);
 }
 
 static coroutine_fn int cbw_do_copy_before_write(BlockDriverState *bs,
@@ -71,7 +71,7 @@ static int coroutine_fn cbw_co_pdiscard(BlockDriverState *bs,
 return ret;
 }
 
-return bdrv_co_pdiscard(bs->backing, offset, bytes);
+return bdrv_co_pdiscard(bs->file, offset, bytes);
 }
 
 static int coroutine_fn cbw_co_pwrite_zeroes(BlockDriverState *bs,
@@ -82,7 +82,7 @@ static int coroutine_fn cbw_co_pwrite_zeroes(BlockDriverState 
*bs,
 return ret;
 }
 
-return bdrv_co_pwrite_zeroes(bs->backing, offset, bytes, flags);
+return bdrv_co_pwrite_zeroes(bs->file, offset, bytes, flags);
 }
 
 static coroutine_fn int cbw_co_pwritev(BlockDriverState *bs,
@@ -95,29 +95,22 @@ static coroutine_fn int cbw_co_pwritev(BlockDriverState *bs,
 return ret;
 }
 
-return bdrv_co_pwritev(bs->backing, offset, bytes, qiov, flags);
+return bdrv_co_pwritev(bs->file, offset, bytes, qiov, flags);
 }
 
 static int coroutine_fn cbw_co_flush(BlockDriverState *bs)
 {
-if (!bs->backing) {
+if (!bs->file) {
 return 0;
 }
 
-return bdrv_co_flush(bs->backing->bs);
+return bdrv_co_flush(bs->file->bs);
 }
 
 static void cbw_refresh_filename(BlockDriverState *bs)
 {
-if (bs->backing == NULL) {
-/*
- * we can be here after failed bdrv_attach_child in
- * bdrv_set_backing_hd
- */
-return;
-}
 pstrcpy(bs->exact_filename, sizeof(bs->exact_filename),
-bs->backing->bs->filename);
+bs->file->bs->filename);
 }
 
 static void cbw_child_perm(BlockDriverState *bs, BdrvChild *c,
@@ -186,6 +179,7 @@ BlockDriverState *bdrv_cbw_append(BlockDriverState *source,
 top = bdrv_new_open_driver(&bdrv_cbw_filter, filter_node_name,
BDRV_O_RDWR, errp);
 if (!top) {
+error_prepend(errp, "Cannot open driver: ");
 return NULL;
 }
 
@@ -201,21 +195,32 @@ BlockDriverState *bdrv_cbw_append(BlockDriverState 
*source,
 state->target = bdrv_attach_child(top, target, "target", &child_of_bds,
   BDRV_CHILD_DATA, errp);
 if (!state->target) {
+error_prepend(errp, "Cannot attach target child: ");
+bdrv_unref(top);
+return NULL;
+}
+
+bdrv_ref(source);
+top->file = bdrv_attach_child(top, source, "file", &child_of_bds,
+  BDRV_CHILD_FILTERED | BDRV_CHILD_PRIMARY,
+  errp);
+if (!top->file) {
+error_prepend(errp, "Cannot attach file child: ");
 bdrv_unref(top);
 return NULL;
 }
 
 bdrv_drained_begin(source);
 
-ret = bdrv_append(top, source, errp);
+ret = bdrv_replace_node(source, top, errp);
 if (ret < 0) {
 error_prepend(errp, "Cannot append copy-before-write filter: ");
 goto fail;
 }
 appended = true;
 
-state->bcs = block_copy_state_new(top->backing, state->target,
-  false, compress, errp);
+state->bcs = block_copy_state_new(top->file, state->target, false, 
compress,
+  errp);
 if (!state->bcs) {
 error_prepend(errp, "Cannot create block-copy-state: ");
 goto fail;
-- 
2.29.2




[PATCH v7 02/33] block: introduce blk_replace_bs

2021-08-04 Thread Vladimir Sementsov-Ogievskiy
Add function to change bs inside blk.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
---
 include/sysemu/block-backend.h | 1 +
 block/block-backend.c  | 8 
 2 files changed, 9 insertions(+)

diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h
index 9ac5f7bbd3..29d4fdbf63 100644
--- a/include/sysemu/block-backend.h
+++ b/include/sysemu/block-backend.h
@@ -102,6 +102,7 @@ BlockBackend *blk_by_public(BlockBackendPublic *public);
 BlockDriverState *blk_bs(BlockBackend *blk);
 void blk_remove_bs(BlockBackend *blk);
 int blk_insert_bs(BlockBackend *blk, BlockDriverState *bs, Error **errp);
+int blk_replace_bs(BlockBackend *blk, BlockDriverState *new_bs, Error **errp);
 bool bdrv_has_blk(BlockDriverState *bs);
 bool bdrv_is_root_node(BlockDriverState *bs);
 int blk_set_perm(BlockBackend *blk, uint64_t perm, uint64_t shared_perm,
diff --git a/block/block-backend.c b/block/block-backend.c
index deb55c272e..6140d133e2 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -869,6 +869,14 @@ int blk_insert_bs(BlockBackend *blk, BlockDriverState *bs, 
Error **errp)
 return 0;
 }
 
+/*
+ * Change BlockDriverState associated with @blk.
+ */
+int blk_replace_bs(BlockBackend *blk, BlockDriverState *new_bs, Error **errp)
+{
+return bdrv_replace_child_bs(blk->root, new_bs, errp);
+}
+
 /*
  * Sets the permission bitmasks that the user of the BlockBackend needs.
  */
-- 
2.29.2




[PATCH v7 03/33] qdev-properties: PropertyInfo: add realized_set_allowed field

2021-08-04 Thread Vladimir Sementsov-Ogievskiy
Add field, so property can declare support for setting the property
when device is realized. To be used in the following commit.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
---
 include/hw/qdev-properties.h | 1 +
 hw/core/qdev-properties.c| 6 +++---
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/include/hw/qdev-properties.h b/include/hw/qdev-properties.h
index 0ef97d60ce..f7925f67d0 100644
--- a/include/hw/qdev-properties.h
+++ b/include/hw/qdev-properties.h
@@ -32,6 +32,7 @@ struct PropertyInfo {
 const char *name;
 const char *description;
 const QEnumLookup *enum_table;
+bool realized_set_allowed; /* allow setting property on realized device */
 int (*print)(Object *obj, Property *prop, char *dest, size_t len);
 void (*set_default_value)(ObjectProperty *op, const Property *prop);
 ObjectProperty *(*create)(ObjectClass *oc, const char *name,
diff --git a/hw/core/qdev-properties.c b/hw/core/qdev-properties.c
index 50f40949f5..c34aac6ebc 100644
--- a/hw/core/qdev-properties.c
+++ b/hw/core/qdev-properties.c
@@ -26,11 +26,11 @@ void qdev_prop_set_after_realize(DeviceState *dev, const 
char *name,
 
 /* returns: true if property is allowed to be set, false otherwise */
 static bool qdev_prop_allow_set(Object *obj, const char *name,
-Error **errp)
+const PropertyInfo *info, Error **errp)
 {
 DeviceState *dev = DEVICE(obj);
 
-if (dev->realized) {
+if (dev->realized && !info->realized_set_allowed) {
 qdev_prop_set_after_realize(dev, name, errp);
 return false;
 }
@@ -79,7 +79,7 @@ static void field_prop_set(Object *obj, Visitor *v, const 
char *name,
 {
 Property *prop = opaque;
 
-if (!qdev_prop_allow_set(obj, name, errp)) {
+if (!qdev_prop_allow_set(obj, name, prop->info, errp)) {
 return;
 }
 
-- 
2.29.2




[PATCH v7 08/33] block/backup: set copy_range and compress after filter insertion

2021-08-04 Thread Vladimir Sementsov-Ogievskiy
We are going to publish copy-before-write filter, so it would be
initialized through options. Still we don't want to publish compress
and copy-range options, as

1. Modern way to enable compression is to use compress filter.

2. For copy-range it's unclean how to make proper interface:
 - it's has experimental prefix for backup job anyway
 - the whole BackupPerf structure doesn't make sense for the filter
 So, let's just add copy-range possibility to the filter later if
 needed.

Still, we are going to continue support for compression and
experimental copy-range in backup job. So, set these options after
filter creation.

Note, that we can drop "compress" argument of bdrv_cbw_append() now, as
well as "perf". The only reason not doing so is that now, when I
prepare this patch the big series around it is already reviewed and I
want to avoid extra rebase conflicts to simplify review of the
following version.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 block/copy-before-write.h | 1 -
 block/backup.c| 3 ++-
 block/copy-before-write.c | 4 +---
 3 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/block/copy-before-write.h b/block/copy-before-write.h
index f37e2249ae..538aab8bdb 100644
--- a/block/copy-before-write.h
+++ b/block/copy-before-write.h
@@ -33,7 +33,6 @@ BlockDriverState *bdrv_cbw_append(BlockDriverState *source,
   BlockDriverState *target,
   const char *filter_node_name,
   uint64_t cluster_size,
-  BackupPerf *perf,
   bool compress,
   BlockCopyState **bcs,
   Error **errp);
diff --git a/block/backup.c b/block/backup.c
index 84f9a5f02c..b31fd99ab3 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -504,7 +504,7 @@ BlockJob *backup_job_create(const char *job_id, 
BlockDriverState *bs,
 }
 
 cbw = bdrv_cbw_append(bs, target, filter_node_name,
-  cluster_size, perf, compress, &bcs, errp);
+  cluster_size, false, &bcs, errp);
 if (!cbw) {
 goto error;
 }
@@ -530,6 +530,7 @@ BlockJob *backup_job_create(const char *job_id, 
BlockDriverState *bs,
 job->len = len;
 job->perf = *perf;
 
+block_copy_set_copy_opts(bcs, perf->use_copy_range, compress);
 block_copy_set_progress_meter(bcs, &job->common.job.progress);
 block_copy_set_speed(bcs, speed);
 
diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index 4337076c1c..235251a640 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c
@@ -170,7 +170,6 @@ BlockDriverState *bdrv_cbw_append(BlockDriverState *source,
   BlockDriverState *target,
   const char *filter_node_name,
   uint64_t cluster_size,
-  BackupPerf *perf,
   bool compress,
   BlockCopyState **bcs,
   Error **errp)
@@ -217,8 +216,7 @@ BlockDriverState *bdrv_cbw_append(BlockDriverState *source,
 
 state->cluster_size = cluster_size;
 state->bcs = block_copy_state_new(top->backing, state->target,
-  cluster_size, perf->use_copy_range,
-  compress, errp);
+  cluster_size, false, compress, errp);
 if (!state->bcs) {
 error_prepend(errp, "Cannot create block-copy-state: ");
 goto fail;
-- 
2.29.2




[PATCH v7 07/33] block/block-copy: introduce block_copy_set_copy_opts()

2021-08-04 Thread Vladimir Sementsov-Ogievskiy
We'll need a possibility to set compress and use_copy_range options
after initialization of the state. So make corresponding part of
block_copy_state_new() separate and public.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 include/block/block-copy.h |  2 ++
 block/block-copy.c | 66 +-
 2 files changed, 39 insertions(+), 29 deletions(-)

diff --git a/include/block/block-copy.h b/include/block/block-copy.h
index 734389d32a..f0ba7bc828 100644
--- a/include/block/block-copy.h
+++ b/include/block/block-copy.h
@@ -28,6 +28,8 @@ BlockCopyState *block_copy_state_new(BdrvChild *source, 
BdrvChild *target,
  int64_t cluster_size, bool use_copy_range,
  bool compress, Error **errp);
 
+void block_copy_set_copy_opts(BlockCopyState *s, bool use_copy_range,
+  bool compress);
 void block_copy_set_progress_meter(BlockCopyState *s, ProgressMeter *pm);
 
 void block_copy_state_free(BlockCopyState *s);
diff --git a/block/block-copy.c b/block/block-copy.c
index 58b4345a5a..84c29fb233 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -315,21 +315,11 @@ static uint32_t block_copy_max_transfer(BdrvChild 
*source, BdrvChild *target)
  target->bs->bl.max_transfer));
 }
 
-BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
- int64_t cluster_size, bool use_copy_range,
- bool compress, Error **errp)
+/* Function should be called prior any actual copy request */
+void block_copy_set_copy_opts(BlockCopyState *s, bool use_copy_range,
+  bool compress)
 {
-BlockCopyState *s;
-BdrvDirtyBitmap *copy_bitmap;
 bool is_fleecing;
-
-copy_bitmap = bdrv_create_dirty_bitmap(source->bs, cluster_size, NULL,
-   errp);
-if (!copy_bitmap) {
-return NULL;
-}
-bdrv_disable_dirty_bitmap(copy_bitmap);
-
 /*
  * If source is in backing chain of target assume that target is going to 
be
  * used for "image fleecing", i.e. it should represent a kind of snapshot 
of
@@ -344,24 +334,12 @@ BlockCopyState *block_copy_state_new(BdrvChild *source, 
BdrvChild *target,
  * For more information see commit f8d59dfb40bb and test
  * tests/qemu-iotests/222
  */
-is_fleecing = bdrv_chain_contains(target->bs, source->bs);
+is_fleecing = bdrv_chain_contains(s->target->bs, s->source->bs);
 
-s = g_new(BlockCopyState, 1);
-*s = (BlockCopyState) {
-.source = source,
-.target = target,
-.copy_bitmap = copy_bitmap,
-.cluster_size = cluster_size,
-.len = bdrv_dirty_bitmap_size(copy_bitmap),
-.write_flags = (is_fleecing ? BDRV_REQ_SERIALISING : 0) |
-(compress ? BDRV_REQ_WRITE_COMPRESSED : 0),
-.mem = shres_create(BLOCK_COPY_MAX_MEM),
-.max_transfer = QEMU_ALIGN_DOWN(
-block_copy_max_transfer(source, target),
-cluster_size),
-};
+s->write_flags = (is_fleecing ? BDRV_REQ_SERIALISING : 0) |
+(compress ? BDRV_REQ_WRITE_COMPRESSED : 0);
 
-if (s->max_transfer < cluster_size) {
+if (s->max_transfer < s->cluster_size) {
 /*
  * copy_range does not respect max_transfer. We don't want to bother
  * with requests smaller than block-copy cluster size, so fallback to
@@ -379,6 +357,36 @@ BlockCopyState *block_copy_state_new(BdrvChild *source, 
BdrvChild *target,
  */
 s->method = use_copy_range ? COPY_RANGE_SMALL : COPY_READ_WRITE;
 }
+}
+
+BlockCopyState *block_copy_state_new(BdrvChild *source, BdrvChild *target,
+ int64_t cluster_size, bool use_copy_range,
+ bool compress, Error **errp)
+{
+BlockCopyState *s;
+BdrvDirtyBitmap *copy_bitmap;
+
+copy_bitmap = bdrv_create_dirty_bitmap(source->bs, cluster_size, NULL,
+   errp);
+if (!copy_bitmap) {
+return NULL;
+}
+bdrv_disable_dirty_bitmap(copy_bitmap);
+
+s = g_new(BlockCopyState, 1);
+*s = (BlockCopyState) {
+.source = source,
+.target = target,
+.copy_bitmap = copy_bitmap,
+.cluster_size = cluster_size,
+.len = bdrv_dirty_bitmap_size(copy_bitmap),
+.mem = shres_create(BLOCK_COPY_MAX_MEM),
+.max_transfer = QEMU_ALIGN_DOWN(
+block_copy_max_transfer(source, target),
+cluster_size),
+};
+
+block_copy_set_copy_opts(s, use_copy_range, compress);
 
 ratelimit_init(&s->rate_limit);
 qemu_co_mutex_init(&s->lock);
-- 
2.29.2




[PATCH v7 04/33] qdev: allow setting drive property for realized device

2021-08-04 Thread Vladimir Sementsov-Ogievskiy
We need an ability to insert filters above top block node, attached to
block device. It can't be achieved with blockdev-reopen command. So, we
want do it with help of qom-set.

Intended usage:

Assume there is a node A that is attached to some guest device.

1. blockdev-add to create a filter node B that has A as its child.

2. qom-set to change the node attached to the guest device’s
   BlockBackend from A to B.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
---
 hw/core/qdev-properties-system.c | 43 +++-
 1 file changed, 31 insertions(+), 12 deletions(-)

diff --git a/hw/core/qdev-properties-system.c b/hw/core/qdev-properties-system.c
index 2760c21f11..e71f5d64d1 100644
--- a/hw/core/qdev-properties-system.c
+++ b/hw/core/qdev-properties-system.c
@@ -36,11 +36,11 @@
 
 static bool check_prop_still_unset(Object *obj, const char *name,
const void *old_val, const char *new_val,
-   Error **errp)
+   bool allow_override, Error **errp)
 {
 const GlobalProperty *prop = qdev_find_global_prop(obj, name);
 
-if (!old_val) {
+if (!old_val || (!prop && allow_override)) {
 return true;
 }
 
@@ -93,16 +93,34 @@ static void set_drive_helper(Object *obj, Visitor *v, const 
char *name,
 BlockBackend *blk;
 bool blk_created = false;
 int ret;
+BlockDriverState *bs;
+AioContext *ctx;
 
 if (!visit_type_str(v, name, &str, errp)) {
 return;
 }
 
-/*
- * TODO Should this really be an error?  If no, the old value
- * needs to be released before we store the new one.
- */
-if (!check_prop_still_unset(obj, name, *ptr, str, errp)) {
+if (!check_prop_still_unset(obj, name, *ptr, str, true, errp)) {
+return;
+}
+
+if (*ptr) {
+/* BlockBackend alread exists. So, we want to change attached node */
+blk = *ptr;
+ctx = blk_get_aio_context(blk);
+bs = bdrv_lookup_bs(NULL, str, errp);
+if (!bs) {
+return;
+}
+
+if (ctx != bdrv_get_aio_context(bs)) {
+error_setg(errp, "Different aio context is not supported for new "
+   "node");
+}
+
+aio_context_acquire(ctx);
+blk_replace_bs(blk, bs, errp);
+aio_context_release(ctx);
 return;
 }
 
@@ -114,7 +132,7 @@ static void set_drive_helper(Object *obj, Visitor *v, const 
char *name,
 
 blk = blk_by_name(str);
 if (!blk) {
-BlockDriverState *bs = bdrv_lookup_bs(NULL, str, NULL);
+bs = bdrv_lookup_bs(NULL, str, NULL);
 if (bs) {
 /*
  * If the device supports iothreads, it will make sure to move the
@@ -123,8 +141,7 @@ static void set_drive_helper(Object *obj, Visitor *v, const 
char *name,
  * aware of iothreads require their BlockBackends to be in the main
  * AioContext.
  */
-AioContext *ctx = iothread ? bdrv_get_aio_context(bs) :
- qemu_get_aio_context();
+ctx = iothread ? bdrv_get_aio_context(bs) : qemu_get_aio_context();
 blk = blk_new(ctx, 0, BLK_PERM_ALL);
 blk_created = true;
 
@@ -196,6 +213,7 @@ static void release_drive(Object *obj, const char *name, 
void *opaque)
 const PropertyInfo qdev_prop_drive = {
 .name  = "str",
 .description = "Node name or ID of a block device to use as a backend",
+.realized_set_allowed = true,
 .get   = get_drive,
 .set   = set_drive,
 .release = release_drive,
@@ -204,6 +222,7 @@ const PropertyInfo qdev_prop_drive = {
 const PropertyInfo qdev_prop_drive_iothread = {
 .name  = "str",
 .description = "Node name or ID of a block device to use as a backend",
+.realized_set_allowed = true,
 .get   = get_drive,
 .set   = set_drive_iothread,
 .release = release_drive,
@@ -238,7 +257,7 @@ static void set_chr(Object *obj, Visitor *v, const char 
*name, void *opaque,
  * TODO Should this really be an error?  If no, the old value
  * needs to be released before we store the new one.
  */
-if (!check_prop_still_unset(obj, name, be->chr, str, errp)) {
+if (!check_prop_still_unset(obj, name, be->chr, str, false, errp)) {
 return;
 }
 
@@ -408,7 +427,7 @@ static void set_netdev(Object *obj, Visitor *v, const char 
*name,
  * TODO Should this really be an error?  If no, the old value
  * needs to be released before we store the new one.
  */
-if (!check_prop_still_unset(obj, name, ncs[i], str, errp)) {
+if (!check_prop_still_unset(obj, name, ncs[i], str, false, errp)) {
 goto out;
 }
 
-- 
2.29.2




[PATCH v7 10/33] block/copy-before-write: relax permission requirements when no parents

2021-08-04 Thread Vladimir Sementsov-Ogievskiy
We are going to publish copy-before-write filter. So, user should be
able to create it with blockdev-add first, specifying both filtered and
target children. And then do blockdev-reopen, to actually insert the
filter where needed.

Currently, filter unshares write permission unconditionally on source
node. It's good, but it will not allow to do blockdev-add. So, let's
relax restrictions when filter doesn't have any parent.

Test output is modified, as now permission conflict happens only when
job creates a blk parent for filter node.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
---
 block/copy-before-write.c  | 8 +---
 tests/qemu-iotests/283.out | 2 +-
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/block/copy-before-write.c b/block/copy-before-write.c
index a7996d54db..2a51cc64e4 100644
--- a/block/copy-before-write.c
+++ b/block/copy-before-write.c
@@ -142,10 +142,12 @@ static void cbw_child_perm(BlockDriverState *bs, 
BdrvChild *c,
 bdrv_default_perms(bs, c, role, reopen_queue,
perm, shared, nperm, nshared);
 
-if (perm & BLK_PERM_WRITE) {
-*nperm = *nperm | BLK_PERM_CONSISTENT_READ;
+if (!QLIST_EMPTY(&bs->parents)) {
+if (perm & BLK_PERM_WRITE) {
+*nperm = *nperm | BLK_PERM_CONSISTENT_READ;
+}
+*nshared &= ~(BLK_PERM_WRITE | BLK_PERM_RESIZE);
 }
-*nshared &= ~(BLK_PERM_WRITE | BLK_PERM_RESIZE);
 }
 }
 
diff --git a/tests/qemu-iotests/283.out b/tests/qemu-iotests/283.out
index f2b7219632..5bb75952ef 100644
--- a/tests/qemu-iotests/283.out
+++ b/tests/qemu-iotests/283.out
@@ -5,7 +5,7 @@
 {"execute": "blockdev-add", "arguments": {"driver": "blkdebug", "image": 
"base", "node-name": "other", "take-child-perms": ["write"]}}
 {"return": {}}
 {"execute": "blockdev-backup", "arguments": {"device": "source", "sync": 
"full", "target": "target"}}
-{"error": {"class": "GenericError", "desc": "Cannot append copy-before-write 
filter: Permission conflict on node 'base': permissions 'write' are both 
required by node 'other' (uses node 'base' as 'image' child) and unshared by 
node 'source' (uses node 'base' as 'image' child)."}}
+{"error": {"class": "GenericError", "desc": "Permission conflict on node 
'base': permissions 'write' are both required by node 'other' (uses node 'base' 
as 'image' child) and unshared by node 'source' (uses node 'base' as 'image' 
child)."}}
 
 === copy-before-write filter should be gone after job-finalize ===
 
-- 
2.29.2




[PATCH v7 01/33] block: introduce bdrv_replace_child_bs()

2021-08-04 Thread Vladimir Sementsov-Ogievskiy
Add function to transactionally replace bs inside BdrvChild.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
---
 include/block/block.h |  2 ++
 block.c   | 31 +++
 2 files changed, 33 insertions(+)

diff --git a/include/block/block.h b/include/block/block.h
index 3477290f9a..740038a892 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -361,6 +361,8 @@ int bdrv_append(BlockDriverState *bs_new, BlockDriverState 
*bs_top,
 Error **errp);
 int bdrv_replace_node(BlockDriverState *from, BlockDriverState *to,
   Error **errp);
+int bdrv_replace_child_bs(BdrvChild *child, BlockDriverState *new_bs,
+  Error **errp);
 BlockDriverState *bdrv_insert_node(BlockDriverState *bs, QDict *node_options,
int flags, Error **errp);
 int bdrv_drop_filter(BlockDriverState *bs, Error **errp);
diff --git a/block.c b/block.c
index e97ce0b1c8..b2b66263f9 100644
--- a/block.c
+++ b/block.c
@@ -5048,6 +5048,37 @@ out:
 return ret;
 }
 
+/* Not for empty child */
+int bdrv_replace_child_bs(BdrvChild *child, BlockDriverState *new_bs,
+  Error **errp)
+{
+int ret;
+Transaction *tran = tran_new();
+g_autoptr(GHashTable) found = NULL;
+g_autoptr(GSList) refresh_list = NULL;
+BlockDriverState *old_bs = child->bs;
+
+bdrv_ref(old_bs);
+bdrv_drained_begin(old_bs);
+bdrv_drained_begin(new_bs);
+
+bdrv_replace_child_tran(child, new_bs, tran);
+
+found = g_hash_table_new(NULL, NULL);
+refresh_list = bdrv_topological_dfs(refresh_list, found, old_bs);
+refresh_list = bdrv_topological_dfs(refresh_list, found, new_bs);
+
+ret = bdrv_list_refresh_perms(refresh_list, NULL, tran, errp);
+
+tran_finalize(tran, ret);
+
+bdrv_drained_end(old_bs);
+bdrv_drained_end(new_bs);
+bdrv_unref(old_bs);
+
+return ret;
+}
+
 static void bdrv_delete(BlockDriverState *bs)
 {
 assert(bdrv_op_blocker_is_empty(bs));
-- 
2.29.2




[PATCH v7 05/33] block: rename backup-top to copy-before-write

2021-08-04 Thread Vladimir Sementsov-Ogievskiy
We are going to convert backup_top to full featured public filter,
which can be used in separate of backup job. Start from renaming from
"how it used" to "what it does".

While updating comments in 283 iotest, drop and rephrase also things
about ".active", as this field is now dropped, and filter doesn't have
"inactive" mode.

Note that this change may be considered as incompatible interface
change, as backup-top filter format name was visible through
query-block and query-named-block-nodes.

Still, consider the following reasoning:

1. backup-top was never documented, so if someone depends on format
   name (for driver that can't be used other than it is automatically
   inserted on backup job start), it's a kind of "undocumented feature
   use". So I think we are free to change it.

2. There is a hope, that there is no such users: it's a lot more native
   to give a good node-name to backup-top filter if need to operate
   with it somehow, and don't touch format name.

3. Another "incompatible" change in further commit would be moving
   copy-before-write filter from using backing child to file child. And
   this is even more reasonable than renaming: for now all public
   filters are file-child based.

So, it's a risky change, but risk seems small and good interface worth
it.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Max Reitz 
---
 block/{backup-top.h => copy-before-write.h} |  28 +++---
 block/backup.c  |  22 ++---
 block/{backup-top.c => copy-before-write.c} | 100 ++--
 MAINTAINERS |   4 +-
 block/meson.build   |   2 +-
 tests/qemu-iotests/283  |  35 +++
 tests/qemu-iotests/283.out  |   4 +-
 7 files changed, 95 insertions(+), 100 deletions(-)
 rename block/{backup-top.h => copy-before-write.h} (56%)
 rename block/{backup-top.c => copy-before-write.c} (62%)

diff --git a/block/backup-top.h b/block/copy-before-write.h
similarity index 56%
rename from block/backup-top.h
rename to block/copy-before-write.h
index b28b0031c4..5977b7aa31 100644
--- a/block/backup-top.h
+++ b/block/copy-before-write.h
@@ -1,10 +1,10 @@
 /*
- * backup-top filter driver
+ * copy-before-write filter driver
  *
  * The driver performs Copy-Before-Write (CBW) operation: it is injected above
  * some node, and before each write it copies _old_ data to the target node.
  *
- * Copyright (c) 2018-2019 Virtuozzo International GmbH.
+ * Copyright (c) 2018-2021 Virtuozzo International GmbH.
  *
  * Author:
  *  Sementsov-Ogievskiy Vladimir 
@@ -23,20 +23,20 @@
  * along with this program. If not, see .
  */
 
-#ifndef BACKUP_TOP_H
-#define BACKUP_TOP_H
+#ifndef COPY_BEFORE_WRITE_H
+#define COPY_BEFORE_WRITE_H
 
 #include "block/block_int.h"
 #include "block/block-copy.h"
 
-BlockDriverState *bdrv_backup_top_append(BlockDriverState *source,
- BlockDriverState *target,
- const char *filter_node_name,
- uint64_t cluster_size,
- BackupPerf *perf,
- BdrvRequestFlags write_flags,
- BlockCopyState **bcs,
- Error **errp);
-void bdrv_backup_top_drop(BlockDriverState *bs);
+BlockDriverState *bdrv_cbw_append(BlockDriverState *source,
+  BlockDriverState *target,
+  const char *filter_node_name,
+  uint64_t cluster_size,
+  BackupPerf *perf,
+  BdrvRequestFlags write_flags,
+  BlockCopyState **bcs,
+  Error **errp);
+void bdrv_cbw_drop(BlockDriverState *bs);
 
-#endif /* BACKUP_TOP_H */
+#endif /* COPY_BEFORE_WRITE_H */
diff --git a/block/backup.c b/block/backup.c
index bd3614ce70..ac91821b08 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -27,13 +27,13 @@
 #include "qemu/bitmap.h"
 #include "qemu/error-report.h"
 
-#include "block/backup-top.h"
+#include "block/copy-before-write.h"
 
 #define BACKUP_CLUSTER_SIZE_DEFAULT (1 << 16)
 
 typedef struct BackupBlockJob {
 BlockJob common;
-BlockDriverState *backup_top;
+BlockDriverState *cbw;
 BlockDriverState *source_bs;
 BlockDriverState *target_bs;
 
@@ -104,7 +104,7 @@ static void backup_clean(Job *job)
 {
 BackupBlockJob *s = container_of(job, BackupBlockJob, common.job);
 block_job_remove_all_bdrv(&s->common);
-bdrv_backup_top_drop(s->backup_top);
+bdrv_cbw_drop(s->cbw);
 }
 
 void backup_do_checkpoint(BlockJob *job, Error **errp)
@@ -408,7 +408,7 @@ BlockJob *backup_job_create(const char *job_id, 
BlockDriverState *bs,
 BackupBlockJob *job = NULL;
 int64_t cluster_size;

[PATCH v7 00/33] block: publish backup-top filter

2021-08-04 Thread Vladimir Sementsov-Ogievskiy
Hi all!

v7: small change: keep fleecing detection logic. I'm now implementing a
more effective way to do fleecing, that doesn't rely on backing chain
and on serializing requests. So, for this alternative way we'll not need
BDRV_REQ_SERIALISING flag. Let's keep automatic addition of this flag
when backing-based fleecing detected.
So, 06 is changed and renamed, 07 rebased on it.

Patches without r-b: 6, 7, 8, 17, 18

Vladimir Sementsov-Ogievskiy (33):
  block: introduce bdrv_replace_child_bs()
  block: introduce blk_replace_bs
  qdev-properties: PropertyInfo: add realized_set_allowed field
  qdev: allow setting drive property for realized device
  block: rename backup-top to copy-before-write
  block-copy: move detecting fleecing scheme to block-copy
  block/block-copy: introduce block_copy_set_copy_opts()
  block/backup: set copy_range and compress after filter insertion
  block/backup: move cluster size calculation to block-copy
  block/copy-before-write: relax permission requirements when no parents
  block/copy-before-write: drop extra bdrv_unref on failure path
  block/copy-before-write: use file child instead of backing
  block/copy-before-write: bdrv_cbw_append(): replace child at last
  block/copy-before-write: introduce cbw_init()
  block/copy-before-write: cbw_init(): rename variables
  block/copy-before-write: cbw_init(): use file child after attaching
  block/copy-before-write: bdrv_cbw_append(): drop unused compress arg
  block/copy-before-write: cbw_init(): use options
  block/copy-before-write: initialize block-copy bitmap
  block/block-copy: make setting progress optional
  block/copy-before-write: make public block driver
  qapi: publish copy-before-write filter
  python/qemu/machine.py: refactor _qemu_args()
  python/qemu/machine: QEMUMachine: improve qmp() method
  iotests.py: VM: add own __enter__ method
  iotests/222: fix pylint and mypy complains
  iotests/222: constantly use single quotes for strings
  iotests: move 222 to tests/image-fleecing
  iotests.py: hmp_qemu_io: support qdev
  iotests/image-fleecing: proper source device
  iotests/image-fleecing: rename tgt_node
  iotests/image-fleecing: prepare for adding new test-case
  iotests/image-fleecing: add test-case for copy-before-write filter

 qapi/block-core.json|  25 +-
 block/{backup-top.h => copy-before-write.h} |  25 +-
 include/block/block-copy.h  |   6 +-
 include/block/block.h   |   2 +
 include/hw/qdev-properties.h|   1 +
 include/sysemu/block-backend.h  |   1 +
 block.c |  31 +++
 block/backup-top.c  | 253 ---
 block/backup.c  | 116 ++---
 block/block-backend.c   |   8 +
 block/block-copy.c  | 135 ---
 block/copy-before-write.c   | 256 
 hw/core/qdev-properties-system.c|  43 +++-
 hw/core/qdev-properties.c   |   6 +-
 MAINTAINERS |   4 +-
 block/meson.build   |   2 +-
 python/qemu/machine/machine.py  |  30 ++-
 tests/qemu-iotests/222  | 159 
 tests/qemu-iotests/222.out  |  67 -
 tests/qemu-iotests/283  |  35 ++-
 tests/qemu-iotests/283.out  |   4 +-
 tests/qemu-iotests/297  |   2 +-
 tests/qemu-iotests/iotests.py   |   9 +-
 tests/qemu-iotests/tests/image-fleecing | 192 +++
 tests/qemu-iotests/tests/image-fleecing.out | 139 +++
 25 files changed, 882 insertions(+), 669 deletions(-)
 rename block/{backup-top.h => copy-before-write.h} (56%)
 delete mode 100644 block/backup-top.c
 create mode 100644 block/copy-before-write.c
 delete mode 100755 tests/qemu-iotests/222
 delete mode 100644 tests/qemu-iotests/222.out
 create mode 100755 tests/qemu-iotests/tests/image-fleecing
 create mode 100644 tests/qemu-iotests/tests/image-fleecing.out

-- 
2.29.2




Re: [PATCH for-6.1? v2 6/7] mirror: Check job_is_cancelled() earlier

2021-08-04 Thread Max Reitz

On 03.08.21 16:34, Kevin Wolf wrote:

Am 26.07.2021 um 16:46 hat Max Reitz geschrieben:

We must check whether the job is force-cancelled early in our main loop,
most importantly before any `continue` statement.  For example, we used
to have `continue`s before our current checking location that are
triggered by `mirror_flush()` failing.  So, if `mirror_flush()` kept
failing, force-cancelling the job would not terminate it.

A job being force-cancelled should be treated the same as the job having
failed, so put the check in the same place where we check `s->ret < 0`.

Buglink: https://gitlab.com/qemu-project/qemu/-/issues/462
Signed-off-by: Max Reitz 
---
  block/mirror.c | 7 +--
  1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/block/mirror.c b/block/mirror.c
index 72e02fa34e..46d1a1e5a2 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -993,7 +993,7 @@ static int coroutine_fn mirror_run(Job *job, Error **errp)
  mirror_wait_for_any_operation(s, true);
  }
  
-if (s->ret < 0) {

+if (s->ret < 0 || job_is_cancelled(&s->common.job)) {
  ret = s->ret;
  goto immediate_exit;
  }
@@ -1078,8 +1078,6 @@ static int coroutine_fn mirror_run(Job *job, Error **errp)
  break;
  }
  
-ret = 0;

-
  if (job_is_ready(&s->common.job) && !should_complete) {
  delay_ns = (s->in_flight == 0 &&
  cnt == 0 ? BLOCK_JOB_SLICE_TIME : 0);
@@ -1087,9 +1085,6 @@ static int coroutine_fn mirror_run(Job *job, Error **errp)
  trace_mirror_before_sleep(s, cnt, job_is_ready(&s->common.job),
delay_ns);
  job_sleep_ns(&s->common.job, delay_ns);
-if (job_is_cancelled(&s->common.job)) {
-break;
-}

I think it was intentional that the check is here because it means
skipping the job_sleep_ns() and instead cancelling immediately, and we
probably still want that. Between your check above and here, the
coroutine can yield, so cancellation could have been newly requested.


I’m afraid I don’t quite understand.  If cancel is requested in 
job_sleep_ns(), then we will go back to the top of the loop, wait for 
in-flight active requests and then break.  Waiting for the in-flight 
requests seems unnecessary, but does it really make a difference in 
practice?  We don’t start new requests, so it should be legal to wait 
for existing ones to settle, and also I believe someone will have to 
wait for those in-flight requests anyway (when the mirror top node is 
removed).  (The only thing we could do is to cancel the in-flight 
requests, but that is what mirror_cancel() does.)


Looking more at the whole loop, there are a couple of places that can 
yield.  Of course we can check whether the job has been cancelled after 
every single one of them, but that would be a bit strange.  We only 
really need to check before we initiate new requests or want to change 
the state.  I believe the right place to do the check would be after the 
job_pause_point().


And perhaps the active write functions (bdrv_mirror_top_do_write() and 
bdrv_mirror_top_pwritev()) should stop copying to the target if the job 
has been cancelled.


Max


So have the check in both places, I guess? And a comment to explain why
neither is redundant.


  s->last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
  }

Kevin






Re: [PATCH for-6.1? v2 5/7] job: Add job_cancel_requested()

2021-08-04 Thread Max Reitz

On 03.08.21 16:25, Kevin Wolf wrote:

Am 26.07.2021 um 16:46 hat Max Reitz geschrieben:

Most callers of job_is_cancelled() actually want to know whether the job
is on its way to immediate termination.  For example, we refuse to pause
jobs that are cancelled; but this only makes sense for jobs that are
really actually cancelled.

A mirror job that is cancelled during READY with force=false should
absolutely be allowed to pause.  This "cancellation" (which is actually
a kind of completion) may take an indefinite amount of time, and so
should behave like any job during normal operation.  For example, with
on-target-error=stop, the job should stop on write errors.  (In
contrast, force-cancelled jobs should not get write errors, as they
should just terminate and not do further I/O.)

Therefore, redefine job_is_cancelled() to only return true for jobs that
are force-cancelled (which as of HEAD^ means any job that interprets the
cancellation request as a request for immediate termination), and add
job_cancel_request() as the general variant, which returns true for any
jobs which have been requested to be cancelled, whether it be
immediately or after an arbitrarily long completion phase.

Buglink: https://gitlab.com/qemu-project/qemu/-/issues/462
Signed-off-by: Max Reitz 
---
  include/qemu/job.h |  8 +++-
  block/mirror.c | 10 --
  job.c  |  7 ++-
  3 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/include/qemu/job.h b/include/qemu/job.h
index 8aa90f7395..032edf3c5f 100644
--- a/include/qemu/job.h
+++ b/include/qemu/job.h
@@ -436,9 +436,15 @@ const char *job_type_str(const Job *job);
  /** Returns true if the job should not be visible to the management layer. */
  bool job_is_internal(Job *job);
  
-/** Returns whether the job is scheduled for cancellation. */

+/** Returns whether the job is being cancelled. */
  bool job_is_cancelled(Job *job);
  
+/**

+ * Returns whether the job is scheduled for cancellation (at an
+ * indefinite point).
+ */
+bool job_cancel_requested(Job *job);

I don't think non-force blockdev-cancel for mirror should actually be
considered cancellation, so what is the question that this function
answers?

"Is this a cancelled job, or a mirror block job that is supposed to
complete soon, but only if it doesn't switch over the users to the
target on completion"?


Well, technically yes, but it was more intended as “Has the user ever 
invoked (block-)job-cancel on this job?”.



Is this ever a reasonable question to ask, except maybe inside the
mirror implementation itself?


I asked myself the same for v3, but found two places in job.c where I 
would like to keep it:


First, there’s an assertion in job_completed_txn_abort().  All jobs 
other than @job have been force-cancelled, and so job_is_cancelled() 
would be true for them.  As for @job itself, the function is mostly 
called when the job’s return value is not 0, but a soft-cancelled mirror 
does have a return value of 0 and so would not end up in that function.
But job_cancel() invokes job_completed_txn_abort() if the job has been 
deferred to the main loop, which mostly correlates with the job having 
been completed (in which case the assertion is skipped), but not 100 % 
(there’s a small window between setting deferred_to_main_loop and the 
job changing to a completed state).
So I’d prefer to keep the assertion as-is functionally, i.e. to only 
check job->cancelled.


Second, job_complete() refuses to let a job complete that has been 
cancelled.  This function is basically only invoked by the user (through 
qmp_block_job_complete()/qmp_job_complete(), or job_complete_sync(), 
which comes from qemu-img), so I believe that it should correspond to 
the external interface we have right now; i.e., if the user has invoked 
(block-)job-cancel at one point, job_complete() should generally return 
an error.



job_complete() is the only function outside of mirror that seems to use
it. But even there, it feels wrong to make a difference. Either we
accept redundant completion requests, or we don't. It doesn't really
matter how the job reconfigures the graph on completion. (Also, I feel
this should really have been part of the state machine, but I'm not sure
if we want to touch it now...)


Well, yes, I don’t think it makes a difference because I don’t think 
anyone will first tell the job via block-job-cancel to complete without 
pivoting, and then change their mind and call block-job-complete after 
all.  (Not least because that’s an error pre-series.)


Also, I’m not even sure whether completing after a soft cancel request 
works.  I don’t think any of our code accounts for such a case, so I’d 
rather avoid allowing it if there’s no need to allow it anyway.


Max




Re: [Question] qemu-img convert block alignment

2021-08-04 Thread Zhenyu Ye
On 2021/8/3 23:03, Eric Blake wrote:
> On Fri, Apr 02, 2021 at 11:52:25AM +0800, Zhenyu Ye wrote:
>> Hi all,
>>
>> commit 8dcd3c9b91 ("qemu-img: align result of is_allocated_sectors")
>> introduces block alignment when doing qemu-img convert. However, the
>> alignment is:
>>
>>  s.alignment = MAX(pow2floor(s.min_sparse),
>>   DIV_ROUND_UP(out_bs->bl.request_alignment,
>>BDRV_SECTOR_SIZE));
>>
>> (where the default s.min_sparse is 8)
>> When the target device's bl.request_alignment is smaller than 4K, this
>> will cause additional write-zero overhead and makes the size of target
>> file larger.
>>
>> Is this as expected?  Should we change the MAX() to MIN()?
> 
> Yes it is expected, and no we shouldn't change it.  Even when a target
> advertises a bl.request_alignment of 512, our goal is to avoid needing
> read-modify-write cycles when that target is really on top of a 4k
> sector disk.  Writing extra 0s out to the 4k boundaries does not
> change the fact that allocation is in 4k chunks anyways, regardless of
> whether the disk supports smaller 512-byte reads.
> 

Thanks for your reply.

Zhenyu