Re: [PATCH v3 0/2] ide: Fix incorrect handling of some PRDTs and add the corresponding unit-test

2020-01-22 Thread John Snow



On 12/23/19 12:51 PM, Alexander Popov wrote:
> Fuzzing the Linux kernel with syzkaller revealed a way to crash qemu
> using a special SCSI_IOCTL_SEND_COMMAND. It hits the assertion in
> ide_dma_cb() introduced in commit a718978ed58a in July 2015.
> 
> This patch series fixes incorrect handling of some PRDTs in ide_dma_cb()
> and improves the ide-test to cover more PRDT cases (including one
> that causes that particular qemu crash).
> 
> Changes from v2 (thanks to Kevin Wolf for the feedback):
>  - the assertion about prepare_buf() return value is improved;
>  - the patch order is reversed to keep the tree bisectable;
>  - the unit-test performance is improved -- now it runs in 8 seconds
>    instead of 3 minutes on my laptop.
> 
> Alexander Popov (2):
>   ide: Fix incorrect handling of some PRDTs in ide_dma_cb()
>   tests/ide-test: Create a single unit-test covering more PRDT cases
> 
>  hw/ide/core.c|  30 +---
>  tests/ide-test.c | 174 ---
>  2 files changed, 96 insertions(+), 108 deletions(-)
> 

Thanks, applied to my IDE tree:

https://github.com/jnsnow/qemu/commits/ide
https://github.com/jnsnow/qemu.git

--js




Re: [PATCH] qemu-nbd: Removed deprecated --partition option

2020-01-22 Thread no-reply
Patchew URL: 
https://patchew.org/QEMU/20200122214328.1413664-1-ebl...@redhat.com/



Hi,

This series failed the docker-quick@centos7 build test. Please find the testing
commands and their output below. If you have Docker installed, you can probably
reproduce it locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
make docker-image-centos7 V=1 NETWORK=1
time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1
=== TEST SCRIPT END ===

  TEST    check-qtest-aarch64: tests/qtest/boot-serial-test
  TEST    check-qtest-aarch64: tests/qtest/migration-test
**
ERROR:/tmp/qemu-test/src/tests/test-char.c:1125:char_serial_test: 'chr' should 
not be NULL
ERROR - Bail out! 
ERROR:/tmp/qemu-test/src/tests/test-char.c:1125:char_serial_test: 'chr' should 
not be NULL
make: *** [check-unit] Error 1
make: *** Waiting for unfinished jobs
qemu-system-aarch64: -accel kvm: invalid accelerator kvm
qemu-system-aarch64: falling back to tcg
---
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', 
'--label', 'com.qemu.instance.uuid=bcad5894bb1b43f2b125072f6631cf33', '-u', 
'1003', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', 
'-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 
'SHOW_ENV=1', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', 
'/home/patchew2/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', 
'/var/tmp/patchew-tester-tmp-7pp43kkd/src/docker-src.2020-01-22-16.49.30.24833:/var/tmp/qemu:z,ro',
 'qemu:centos7', '/var/tmp/qemu/run', 'test-quick']' returned non-zero exit 
status 2.
filter=--filter=label=com.qemu.instance.uuid=bcad5894bb1b43f2b125072f6631cf33
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-7pp43kkd/src'
make: *** [docker-run-test-quick@centos7] Error 2

real    11m21.905s
user    0m8.570s


The full log is available at
http://patchew.org/logs/20200122214328.1413664-1-ebl...@redhat.com/testing.docker-quick@centos7/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

[PATCH] qemu-nbd: Removed deprecated --partition option

2020-01-22 Thread Eric Blake
The option was deprecated in 4.0.0 (commit 0ae2d546); it's now been
long enough with no complaints to follow through with that process.

Signed-off-by: Eric Blake 
---
 qemu-deprecated.texi |  49 ++--
 qemu-nbd.c   | 133 +--
 qemu-nbd.texi|  13 ++---
 3 files changed, 24 insertions(+), 171 deletions(-)

diff --git a/qemu-deprecated.texi b/qemu-deprecated.texi
index 8471eef9c22d..1b4c638db8e0 100644
--- a/qemu-deprecated.texi
+++ b/qemu-deprecated.texi
@@ -304,37 +304,6 @@ The above, converted to the current supported format:

 @section Related binaries

-@subsection qemu-nbd --partition (since 4.0.0)
-
-The ``qemu-nbd --partition $digit'' code (also spelled @option{-P})
-can only handle MBR partitions, and has never correctly handled
-logical partitions beyond partition 5.  If you know the offset and
-length of the partition (perhaps by using @code{sfdisk} within the
-guest), you can achieve the effect of exporting just that subset of
-the disk by use of the @option{--image-opts} option with a raw
-blockdev using the @code{offset} and @code{size} parameters layered on
-top of any other existing blockdev. For example, if partition 1 is
-100MiB long starting at 1MiB, the old command:
-
-@code{qemu-nbd -t -P 1 -f qcow2 file.qcow2}
-
-can be rewritten as:
-
-@code{qemu-nbd -t --image-opts 
driver=raw,offset=1M,size=100M,file.driver=qcow2,file.backing.driver=file,file.backing.filename=file.qcow2}
-
-Alternatively, the @code{nbdkit} project provides a more powerful
-partition filter on top of its nbd plugin, which can be used to select
-an arbitrary MBR or GPT partition on top of any other full-image NBD
-export.  Using this to rewrite the above example results in:
-
-@code{qemu-nbd -t -k /tmp/sock -f qcow2 file.qcow2 &}
-@code{nbdkit -f --filter=partition nbd socket=/tmp/sock partition=1}
-
-Note that if you are exposing the export via /dev/nbd0, it is easier
-to just export the entire image and then mount only /dev/nbd0p1 than
-it is to reinvoke @command{qemu-nbd -c /dev/nbd0} limited to just a
-subset of the image.
-
 @subsection qemu-img convert -n -o (since 4.2.0)

 All options specified in @option{-o} are image creation options, so
@@ -383,3 +352,21 @@ trouble after a recent upgrade.

 The "autoload" parameter has been ignored since 2.12.0. All bitmaps
 are automatically loaded from qcow2 images.
+
+@section Related binaries
+
+@subsection qemu-nbd --partition (removed in 5.0.0)
+
+The ``qemu-nbd --partition $digit'' code (also spelled @option{-P})
+could only handle MBR partitions, and never correctly handled logical
+partitions beyond partition 5.  Exporting a partition can still be
+done by utilizing the @option{--image-opts} option with a raw blockdev
+using the @code{offset} and @code{size} parameters layered on top of
+any other existing blockdev. For example, if partition 1 is 100MiB
+long starting at 1MiB, the old command:
+
+@code{qemu-nbd -t -P 1 -f qcow2 file.qcow2}
+
+can be rewritten as:
+
+@code{qemu-nbd -t --image-opts 
driver=raw,offset=1M,size=100M,file.driver=qcow2,file.backing.driver=file,file.backing.filename=file.qcow2}
diff --git a/qemu-nbd.c b/qemu-nbd.c
index bc125370dd36..e6f2eb76a3f2 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -100,7 +100,6 @@ static void usage(const char *name)
 "\n"
 "Exposing part of the image:\n"
 "  -o, --offset=OFFSET   offset into the image\n"
-"  -P, --partition=NUM   only expose partition NUM\n"
 "  -B, --bitmap=NAME expose a persistent dirty bitmap\n"
 "\n"
 "General purpose options:\n"
@@ -156,96 +155,6 @@ QEMU_COPYRIGHT "\n"
 , name);
 }

-struct partition_record
-{
-uint8_t bootable;
-uint8_t start_head;
-uint32_t start_cylinder;
-uint8_t start_sector;
-uint8_t system;
-uint8_t end_head;
-uint8_t end_cylinder;
-uint8_t end_sector;
-uint32_t start_sector_abs;
-uint32_t nb_sectors_abs;
-};
-
-static void read_partition(uint8_t *p, struct partition_record *r)
-{
-r->bootable = p[0];
-r->start_head = p[1];
-r->start_cylinder = p[3] | ((p[2] << 2) & 0x0300);
-r->start_sector = p[2] & 0x3f;
-r->system = p[4];
-r->end_head = p[5];
-r->end_cylinder = p[7] | ((p[6] << 2) & 0x300);
-r->end_sector = p[6] & 0x3f;
-
-r->start_sector_abs = ldl_le_p(p + 8);
-r->nb_sectors_abs   = ldl_le_p(p + 12);
-}
-
-static int find_partition(BlockBackend *blk, int partition,
-  uint64_t *offset, uint64_t *size)
-{
-struct partition_record mbr[4];
-uint8_t data[MBR_SIZE];
-int i;
-int ext_partnum = 4;
-int ret;
-
-ret = blk_pread(blk, 0, data, sizeof(data));
-if (ret < 0) {
-error_report("error while reading: %s", strerror(-ret));
-exit(EXIT_FAILURE);
-}
-
-if (data[510] != 0x55 || data[511] != 0xaa) {
-return -EINVAL;
-}
-
-for (i = 0; i < 4; i++) {
-read_partition(&data[446 + 16 * i], &mbr[i]);
-
-if 

Re: [PATCH v3 0/2] ide: Fix incorrect handling of some PRDTs and add the corresponding unit-test

2020-01-22 Thread John Snow



On 1/22/20 7:23 AM, Kevin Wolf wrote:
> On 22.01.2020 at 12:53, Alexander Popov wrote:
>> On 23.12.2019 20:51, Alexander Popov wrote:
>>> Fuzzing the Linux kernel with syzkaller revealed a way to crash qemu
>>> using a special SCSI_IOCTL_SEND_COMMAND. It hits the assertion in
>>> ide_dma_cb() introduced in commit a718978ed58a in July 2015.
>>>
>>> This patch series fixes incorrect handling of some PRDTs in ide_dma_cb()
>>> and improves the ide-test to cover more PRDT cases (including one
>>> that causes that particular qemu crash).
>>>
>>> Changes from v2 (thanks to Kevin Wolf for the feedback):
>>>  - the assertion about prepare_buf() return value is improved;
>>>  - the patch order is reversed to keep the tree bisectable;
>>>  - the unit-test performance is improved -- now it runs in 8 seconds
>>>    instead of 3 minutes on my laptop.
>>>
>>> Alexander Popov (2):
>>>   ide: Fix incorrect handling of some PRDTs in ide_dma_cb()
>>>   tests/ide-test: Create a single unit-test covering more PRDT cases
>>>
>>>  hw/ide/core.c|  30 +---
>>>  tests/ide-test.c | 174 ---
>>>  2 files changed, 96 insertions(+), 108 deletions(-)
>>
>> Hello!
>>
>> Pinging again about this fix and unit-test...
>>
>> It's ready. Kevin Wolf has reviewed this (thanks a lot!).
>>
>> What is next?
> 
> I asked John about it just yesterday (if he will merge it or if he would
> prefer me to take it through my tree) and he promised to take a look
> very soon.
> 
> Kevin
> 

Going to merge it today.

--js




Re: [PATCH v2 5/5] iotests: Add test for image creation fallback

2020-01-22 Thread Eric Blake

On 1/22/20 10:45 AM, Max Reitz wrote:

Signed-off-by: Max Reitz 
---



+
+_make_test_img 64M
+
+echo
+echo '--- Testing creation ---'
+
+$QEMU_IMG create -f qcow2 "$TEST_IMG" 64M | _filter_img_create
+$QEMU_IMG info "$TEST_IMG" | _filter_img_info
+
+echo
+echo '--- Testing creation for which the node would need to grow ---'
+
+$QEMU_IMG create -f qcow2 -o preallocation=metadata "$TEST_IMG" 64M 2>&1 \
+| _filter_img_create
+
+# success, all done
+echo "*** done"
+rm -f $seq.full
+status=0
diff --git a/tests/qemu-iotests/259.out b/tests/qemu-iotests/259.out


Reviewed-by: Eric Blake 

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PATCH 3/3] qemu-block-drivers: Convert to rST

2020-01-22 Thread Alex Bennée


Peter Maydell  writes:

> The qemu-block-drivers documentation is currently in
> docs/qemu-block-drivers.texi in Texinfo format, which we present
> to the user as:
>  * a qemu-block-drivers manpage
>  * a section of the main qemu-doc HTML documentation
>
> Convert the documentation to rST format, and present it to
> the user as:
>  * a qemu-block-drivers manpage
>  * part of the system/ Sphinx manual
>
> This follows the same pattern we've done for qemu-ga and qemu-nbd.
>
> We have to drop a cross-reference from the documentation of the
> -cdrom option back to the qemu-block-drivers documentation, since
> they're no longer within the same texinfo document.
>
> As noted in a comment, the manpage output is slightly compromised
> due to limitations in Sphinx. In an ideal world, the HTML output
> would have the various headings like 'Disk image file formats'
> as top-level section headings (which then appear in the overall
> system manual's table-of-contents), and it would not have the
> section headings which make sense only for the manpage like
> 'synopsis', 'description', and 'see also'. Unfortunately, the
> mechanism Sphinx provides for restricting pieces of documentation
> is limited to the point of being flawed: the 'only::' directive
> is implemented as a filter that is applied at a very late stage
> in the document processing pipeline, rather than as an early
> equivalent of an #ifdef. This means that Sphinx's process of
> identifying which section heading markup styles are which levels
> of heading gets confused if the 'only::' directive contains
> section headings which would affect the heading-level of a
> later heading. I have opted to prioritise making the HTML format
> look better, with the compromise being that in the manpage
> the 'Disk image file formats'  headings are top-level headings
> rather than being sub-headings under the traditional 'Description'
> top-level section title.
>
> Signed-off-by: Peter Maydell 

Reviewed-by: Alex Bennée 
Tested-by: Alex Bennée 

> ---
> I enquired on sphinx-users (without response so far)
> https://www.mail-archive.com/sphinx-users@googlegroups.com/msg03837.html
> whether there's a better approach than using 'only::' to delimit
> output for a specific format.  That directive seems fundamentally
> flawed because it's implemented as a filter very late in the document
> processing, so for instance text goes into the search index even if
> it's marked as not-for-HTML:
> https://github.com/sphinx-doc/sphinx/issues/1420
> What we want ideally would be some equivalent to #ifdef that happens
> early in the processing pipeline, but I'm not sure Sphinx has it and
> I'm not convinced that "run everything through the C preprocessor so
> we can use #ifdef" is worthwhile for the minor benefit here.
> ---
>  Makefile   |  11 +-
>  docs/qemu-block-drivers.texi   | 889 --
>  docs/system/conf.py|   7 +
>  docs/system/index.rst  |   1 +
>  docs/system/qemu-block-drivers.rst | 985 +
>  qemu-doc.texi  |  12 -
>  qemu-options.hx|   2 +-
>  7 files changed, 1000 insertions(+), 907 deletions(-)
>  delete mode 100644 docs/qemu-block-drivers.texi
>  create mode 100644 docs/system/qemu-block-drivers.rst
>
> diff --git a/Makefile b/Makefile
> index 4372d0d1fbd..318d1046c6c 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -342,9 +342,9 @@ ifdef BUILD_DOCS
>  DOCS=qemu-doc.html qemu-doc.txt qemu.1 qemu-img.1
>  DOCS+=$(MANUAL_BUILDDIR)/interop/qemu-nbd.8
>  DOCS+=$(MANUAL_BUILDDIR)/interop/qemu-ga.8
> +DOCS+=$(MANUAL_BUILDDIR)/system/qemu-block-drivers.7
>  DOCS+=docs/interop/qemu-qmp-ref.html docs/interop/qemu-qmp-ref.txt 
> docs/interop/qemu-qmp-ref.7
>  DOCS+=docs/interop/qemu-ga-ref.html docs/interop/qemu-ga-ref.txt 
> docs/interop/qemu-ga-ref.7
> -DOCS+=docs/qemu-block-drivers.7
>  DOCS+=docs/qemu-cpu-models.7
>  DOCS+=$(MANUAL_BUILDDIR)/index.html
>  ifdef CONFIG_VIRTFS
> @@ -751,7 +751,6 @@ distclean: clean
>   rm -f docs/interop/qemu-qmp-ref.txt docs/interop/qemu-ga-ref.txt
>   rm -f docs/interop/qemu-qmp-ref.pdf docs/interop/qemu-ga-ref.pdf
>   rm -f docs/interop/qemu-qmp-ref.html docs/interop/qemu-ga-ref.html
> - rm -f docs/qemu-block-drivers.7
>   rm -f docs/qemu-cpu-models.7
>   rm -rf .doctrees
>   $(call clean-manual,devel)
> @@ -828,7 +827,7 @@ ifdef CONFIG_POSIX
>   $(INSTALL_DATA) qemu.1 "$(DESTDIR)$(mandir)/man1"
>   $(INSTALL_DIR) "$(DESTDIR)$(mandir)/man7"
>   $(INSTALL_DATA) docs/interop/qemu-qmp-ref.7 "$(DESTDIR)$(mandir)/man7"
> - $(INSTALL_DATA) docs/qemu-block-drivers.7 "$(DESTDIR)$(mandir)/man7"
> + $(INSTALL_DATA) $(MANUAL_BUILDDIR)/system/qemu-block-drivers.7 
> "$(DESTDIR)$(mandir)/man7"
>   $(INSTALL_DATA) docs/qemu-cpu-models.7 "$(DESTDIR)$(mandir)/man7"
>  ifeq ($(CONFIG_TOOLS),y)
>   $(INSTALL_DATA) qemu-img.1 "$(DESTDIR)$(mandir)/man1"
> @@ -1036,6 

Re: [PATCH 2/3] docs: Create stub system manual

2020-01-22 Thread Alex Bennée


Peter Maydell  writes:

> On Thu, 16 Jan 2020 at 14:15, Peter Maydell  wrote:
>>
>> We want a user-facing manual which contains system emulation
>> documentation. Create an empty one which we can populate.
>>
>> Signed-off-by: Peter Maydell 
>> ---
>>  Makefile  | 10 +-
>>  docs/system/conf.py   | 15 +++
>>  docs/system/index.rst | 16 
>>  3 files changed, 40 insertions(+), 1 deletion(-)
>>  create mode 100644 docs/system/conf.py
>>  create mode 100644 docs/system/index.rst
>
> I think this may have crossed in the post with the
> commit adding index.html.in. Anyway, here's the obvious
> fixup, which I plan to squash into this patch without
> doing a respin unless there's some other respin needed:
>
> diff --git a/docs/index.html.in b/docs/index.html.in
> index 94eb782cf7e..573c543c02b 100644
> --- a/docs/index.html.in
> +++ b/docs/index.html.in
> @@ -11,6 +11,7 @@
>  QMP Reference Manual
>  Guest Agent Protocol
> Reference
>  System Emulation
> Management and Interoperability Guide
> +System Emulation User's
> Guide
>  System Emulation Guest
> Hardware Specifications
>  
>  


This didn't seem to make a difference on readthedocs so I assume this is
for different tooling?

Anyway:

Reviewed-by: Alex Bennée 

>
> thanks
> -- PMM


-- 
Alex Bennée



Re: [PATCH v2 1/5] block/nbd: Fix hang in .bdrv_close()

2020-01-22 Thread Eric Blake

On 1/22/20 10:45 AM, Max Reitz wrote:

When nbd_close() is called from a coroutine, the connection_co never
gets to run, and thus nbd_teardown_connection() hangs.

This is because aio_co_enter() only puts the connection_co into the main
coroutine's wake-up queue, so this main coroutine needs to yield and
wait for connection_co to terminate.

Suggested-by: Kevin Wolf 
Signed-off-by: Max Reitz 
---
  block/nbd.c | 14 +-
  1 file changed, 13 insertions(+), 1 deletion(-)


Reviewed-by: Eric Blake 

I'm assuming it is better to take the entire series through somewhere 
other than my NBD tree, since the remaining patches touch more than NBD?


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




[PATCH v2 0/5] block: Generic file creation fallback

2020-01-22 Thread Max Reitz
Hi,

Like version 1, this series adds a fallback path for creating files (on
the protocol layer) if the protocol driver does not support file
creation, but the file already exists.


Branch: https://github.com/XanClic/qemu.git skip-proto-create-v2
Branch: https://git.xanclic.moe/XanClic/qemu.git skip-proto-create-v2


v2:
- Drop blk_truncate_for_formatting(): It doesn’t make sense to introduce
  this function any more after 26536c7fc25917d1bd13781f81fe3ab039643bff
  (“block: Do not truncate file node when formatting”), because we’d
  only use it in bdrv_create_file_fallback().
  Thus, it makes more sense to create special helper functions
  specifically for bdrv_create_file_fallback().

- Thus, dropped patches 2 and 3.

- And changed patch 4 to include those helper functions.

- Rebased, which was a bit of a pain.


git-backport-diff against v1:

Key:
[----] : patches are identical
[####] : number of functional differences between upstream/downstream patch
[down] : patch is downstream-only
The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively

001/5:[----] [-C] 'block/nbd: Fix hang in .bdrv_close()'
002/5:[0080] [FC] 'block: Generic file creation fallback'
003/5:[----] [--] 'file-posix: Drop hdev_co_create_opts()'
004/5:[----] [--] 'iscsi: Drop iscsi_co_create_opts()'
005/5:[----] [-C] 'iotests: Add test for image creation fallback'


Max Reitz (5):
  block/nbd: Fix hang in .bdrv_close()
  block: Generic file creation fallback
  file-posix: Drop hdev_co_create_opts()
  iscsi: Drop iscsi_co_create_opts()
  iotests: Add test for image creation fallback

 block.c| 159 ++---
 block/file-posix.c |  67 
 block/iscsi.c  |  56 -
 block/nbd.c|  14 +++-
 tests/qemu-iotests/259 |  61 ++
 tests/qemu-iotests/259.out |  14 
 tests/qemu-iotests/group   |   1 +
 7 files changed, 236 insertions(+), 136 deletions(-)
 create mode 100755 tests/qemu-iotests/259
 create mode 100644 tests/qemu-iotests/259.out

-- 
2.24.1




[PATCH v2 5/5] iotests: Add test for image creation fallback

2020-01-22 Thread Max Reitz
Signed-off-by: Max Reitz 
---
 tests/qemu-iotests/259 | 61 ++
 tests/qemu-iotests/259.out | 14 +
 tests/qemu-iotests/group   |  1 +
 3 files changed, 76 insertions(+)
 create mode 100755 tests/qemu-iotests/259
 create mode 100644 tests/qemu-iotests/259.out

diff --git a/tests/qemu-iotests/259 b/tests/qemu-iotests/259
new file mode 100755
index 00..22b4c10241
--- /dev/null
+++ b/tests/qemu-iotests/259
@@ -0,0 +1,61 @@
+#!/usr/bin/env bash
+#
+# Test generic image creation fallback (by using NBD)
+#
+# Copyright (C) 2019 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+# creator
+owner=mre...@redhat.com
+
+seq=$(basename $0)
+echo "QA output created by $seq"
+
+status=1   # failure is the default!
+
+_cleanup()
+{
+_cleanup_test_img
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+
+_supported_fmt raw
+_supported_proto nbd
+_supported_os Linux
+
+
+_make_test_img 64M
+
+echo
+echo '--- Testing creation ---'
+
+$QEMU_IMG create -f qcow2 "$TEST_IMG" 64M | _filter_img_create
+$QEMU_IMG info "$TEST_IMG" | _filter_img_info
+
+echo
+echo '--- Testing creation for which the node would need to grow ---'
+
+$QEMU_IMG create -f qcow2 -o preallocation=metadata "$TEST_IMG" 64M 2>&1 \
+| _filter_img_create
+
+# success, all done
+echo "*** done"
+rm -f $seq.full
+status=0
diff --git a/tests/qemu-iotests/259.out b/tests/qemu-iotests/259.out
new file mode 100644
index 00..ffed19c2a0
--- /dev/null
+++ b/tests/qemu-iotests/259.out
@@ -0,0 +1,14 @@
+QA output created by 259
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864
+
+--- Testing creation ---
+Formatting 'TEST_DIR/t.IMGFMT', fmt=qcow2 size=67108864
+image: TEST_DIR/t.IMGFMT
+file format: qcow2
+virtual size: 64 MiB (67108864 bytes)
+disk size: unavailable
+
+--- Testing creation for which the node would need to grow ---
+qemu-img: TEST_DIR/t.IMGFMT: Could not resize image: Image format driver does 
not support resize
+Formatting 'TEST_DIR/t.IMGFMT', fmt=qcow2 size=67108864 preallocation=metadata
+*** done
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index 1904223020..ec47c2216a 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -273,6 +273,7 @@
 256 rw auto quick
 257 rw
 258 rw quick
+259 rw auto quick
 260 rw quick
 261 rw
 262 rw quick migration
-- 
2.24.1




[PATCH v2 4/5] iscsi: Drop iscsi_co_create_opts()

2020-01-22 Thread Max Reitz
The generic fallback implementation effectively does the same.

Reviewed-by: Maxim Levitsky 
Signed-off-by: Max Reitz 
---
 block/iscsi.c | 56 ---
 1 file changed, 56 deletions(-)

diff --git a/block/iscsi.c b/block/iscsi.c
index 2aea7e3f13..68562e108f 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -2163,58 +2163,6 @@ static int coroutine_fn 
iscsi_co_truncate(BlockDriverState *bs, int64_t offset,
 return 0;
 }
 
-static int coroutine_fn iscsi_co_create_opts(const char *filename, QemuOpts 
*opts,
- Error **errp)
-{
-int ret = 0;
-int64_t total_size = 0;
-BlockDriverState *bs;
-IscsiLun *iscsilun = NULL;
-QDict *bs_options;
-Error *local_err = NULL;
-
-bs = bdrv_new();
-
-/* Read out options */
-total_size = DIV_ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
-  BDRV_SECTOR_SIZE);
-bs->opaque = g_new0(struct IscsiLun, 1);
-iscsilun = bs->opaque;
-
-bs_options = qdict_new();
-iscsi_parse_filename(filename, bs_options, &local_err);
-if (local_err) {
-error_propagate(errp, local_err);
-ret = -EINVAL;
-} else {
-ret = iscsi_open(bs, bs_options, 0, NULL);
-}
-qobject_unref(bs_options);
-
-if (ret != 0) {
-goto out;
-}
-iscsi_detach_aio_context(bs);
-if (iscsilun->type != TYPE_DISK) {
-ret = -ENODEV;
-goto out;
-}
-if (bs->total_sectors < total_size) {
-ret = -ENOSPC;
-goto out;
-}
-
-ret = 0;
-out:
-if (iscsilun->iscsi != NULL) {
-iscsi_destroy_context(iscsilun->iscsi);
-}
-g_free(bs->opaque);
-bs->opaque = NULL;
-bdrv_unref(bs);
-return ret;
-}
-
 static int iscsi_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
 {
 IscsiLun *iscsilun = bs->opaque;
@@ -2485,8 +2433,6 @@ static BlockDriver bdrv_iscsi = {
 .bdrv_parse_filename= iscsi_parse_filename,
 .bdrv_file_open = iscsi_open,
 .bdrv_close = iscsi_close,
-.bdrv_co_create_opts= iscsi_co_create_opts,
-.create_opts= &iscsi_create_opts,
 .bdrv_reopen_prepare= iscsi_reopen_prepare,
 .bdrv_reopen_commit = iscsi_reopen_commit,
 .bdrv_co_invalidate_cache = iscsi_co_invalidate_cache,
@@ -2524,8 +2470,6 @@ static BlockDriver bdrv_iser = {
 .bdrv_parse_filename= iscsi_parse_filename,
 .bdrv_file_open = iscsi_open,
 .bdrv_close = iscsi_close,
-.bdrv_co_create_opts= iscsi_co_create_opts,
-.create_opts= &iscsi_create_opts,
 .bdrv_reopen_prepare= iscsi_reopen_prepare,
 .bdrv_reopen_commit = iscsi_reopen_commit,
 .bdrv_co_invalidate_cache  = iscsi_co_invalidate_cache,
-- 
2.24.1




[PATCH v2 3/5] file-posix: Drop hdev_co_create_opts()

2020-01-22 Thread Max Reitz
The generic fallback implementation effectively does the same.

Reviewed-by: Maxim Levitsky 
Signed-off-by: Max Reitz 
---
 block/file-posix.c | 67 --
 1 file changed, 67 deletions(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index 1b805bd938..fd29372599 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -3418,67 +3418,6 @@ static coroutine_fn int 
hdev_co_pwrite_zeroes(BlockDriverState *bs,
 return raw_do_pwrite_zeroes(bs, offset, bytes, flags, true);
 }
 
-static int coroutine_fn hdev_co_create_opts(const char *filename, QemuOpts 
*opts,
-Error **errp)
-{
-int fd;
-int ret = 0;
-struct stat stat_buf;
-int64_t total_size = 0;
-bool has_prefix;
-
-/* This function is used by both protocol block drivers and therefore 
either
- * of these prefixes may be given.
- * The return value has to be stored somewhere, otherwise this is an error
- * due to -Werror=unused-value. */
-has_prefix =
-strstart(filename, "host_device:", &filename) ||
-strstart(filename, "host_cdrom:" , &filename);
-
-(void)has_prefix;
-
-ret = raw_normalize_devicepath(&filename, errp);
-if (ret < 0) {
-return ret;
-}
-
-/* Read out options */
-total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
-  BDRV_SECTOR_SIZE);
-
-fd = qemu_open(filename, O_WRONLY | O_BINARY);
-if (fd < 0) {
-ret = -errno;
-error_setg_errno(errp, -ret, "Could not open device");
-return ret;
-}
-
-if (fstat(fd, &stat_buf) < 0) {
-ret = -errno;
-error_setg_errno(errp, -ret, "Could not stat device");
-} else if (!S_ISBLK(stat_buf.st_mode) && !S_ISCHR(stat_buf.st_mode)) {
-error_setg(errp,
-   "The given file is neither a block nor a character device");
-ret = -ENODEV;
-} else if (lseek(fd, 0, SEEK_END) < total_size) {
-error_setg(errp, "Device is too small");
-ret = -ENOSPC;
-}
-
-if (!ret && total_size) {
-uint8_t buf[BDRV_SECTOR_SIZE] = { 0 };
-int64_t zero_size = MIN(BDRV_SECTOR_SIZE, total_size);
-if (lseek(fd, 0, SEEK_SET) == -1) {
-ret = -errno;
-} else {
-ret = qemu_write_full(fd, buf, zero_size);
-ret = ret == zero_size ? 0 : -errno;
-}
-}
-qemu_close(fd);
-return ret;
-}
-
 static BlockDriver bdrv_host_device = {
 .format_name= "host_device",
 .protocol_name= "host_device",
@@ -3491,8 +3430,6 @@ static BlockDriver bdrv_host_device = {
 .bdrv_reopen_prepare = raw_reopen_prepare,
 .bdrv_reopen_commit  = raw_reopen_commit,
 .bdrv_reopen_abort   = raw_reopen_abort,
-.bdrv_co_create_opts = hdev_co_create_opts,
-.create_opts = &raw_create_opts,
 .mutable_opts= mutable_opts,
 .bdrv_co_invalidate_cache = raw_co_invalidate_cache,
 .bdrv_co_pwrite_zeroes = hdev_co_pwrite_zeroes,
@@ -3619,8 +3556,6 @@ static BlockDriver bdrv_host_cdrom = {
 .bdrv_reopen_prepare = raw_reopen_prepare,
 .bdrv_reopen_commit  = raw_reopen_commit,
 .bdrv_reopen_abort   = raw_reopen_abort,
-.bdrv_co_create_opts = hdev_co_create_opts,
-.create_opts = &raw_create_opts,
 .mutable_opts= mutable_opts,
 .bdrv_co_invalidate_cache = raw_co_invalidate_cache,
 
@@ -3753,8 +3688,6 @@ static BlockDriver bdrv_host_cdrom = {
 .bdrv_reopen_prepare = raw_reopen_prepare,
 .bdrv_reopen_commit  = raw_reopen_commit,
 .bdrv_reopen_abort   = raw_reopen_abort,
-.bdrv_co_create_opts = hdev_co_create_opts,
-.create_opts= &raw_create_opts,
 .mutable_opts   = mutable_opts,
 
 .bdrv_co_preadv = raw_co_preadv,
-- 
2.24.1




[PATCH v2 2/5] block: Generic file creation fallback

2020-01-22 Thread Max Reitz
If a protocol driver does not support image creation, we can see whether
maybe the file exists already.  If so, just truncating it will be
sufficient.

Signed-off-by: Max Reitz 
---
 block.c | 159 +++-
 1 file changed, 147 insertions(+), 12 deletions(-)

diff --git a/block.c b/block.c
index 99ce26d64d..e167eca04b 100644
--- a/block.c
+++ b/block.c
@@ -532,20 +532,139 @@ out:
 return ret;
 }
 
-int bdrv_create_file(const char *filename, QemuOpts *opts, Error **errp)
+/**
+ * Helper function for bdrv_create_file_fallback(): Resize @blk to at
+ * least the given @minimum_size.
+ *
+ * On success, return @blk's actual length.
+ * Otherwise, return -errno.
+ */
+static int64_t create_file_fallback_truncate(BlockBackend *blk,
+ int64_t minimum_size, Error 
**errp)
 {
-BlockDriver *drv;
+Error *local_err = NULL;
+int64_t size;
+int ret;
+
+ret = blk_truncate(blk, minimum_size, false, PREALLOC_MODE_OFF, 
&local_err);
+if (ret < 0 && ret != -ENOTSUP) {
+error_propagate(errp, local_err);
+return ret;
+}
+
+size = blk_getlength(blk);
+if (size < 0) {
+error_free(local_err);
+error_setg_errno(errp, -size,
+ "Failed to inquire the new image file's length");
+return size;
+}
+
+if (size < minimum_size) {
+/* Need to grow the image, but we failed to do that */
+error_propagate(errp, local_err);
+return -ENOTSUP;
+}
+
+error_free(local_err);
+local_err = NULL;
+
+return size;
+}
+
+/**
+ * Helper function for bdrv_create_file_fallback(): Zero the first
+ * sector to remove any potentially pre-existing image header.
+ */
+static int create_file_fallback_zero_first_sector(BlockBackend *blk,
+  int64_t current_size,
+  Error **errp)
+{
+int64_t bytes_to_clear;
+int ret;
+
+bytes_to_clear = MIN(current_size, BDRV_SECTOR_SIZE);
+if (bytes_to_clear) {
+ret = blk_pwrite_zeroes(blk, 0, bytes_to_clear, BDRV_REQ_MAY_UNMAP);
+if (ret < 0) {
+error_setg_errno(errp, -ret,
+ "Failed to clear the new image's first sector");
+return ret;
+}
+}
+
+return 0;
+}
+
+static int bdrv_create_file_fallback(const char *filename, BlockDriver *drv,
+ QemuOpts *opts, Error **errp)
+{
+BlockBackend *blk;
+QDict *options = qdict_new();
+int64_t size = 0;
+char *buf = NULL;
+PreallocMode prealloc;
 Error *local_err = NULL;
 int ret;
 
+size = qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0);
+buf = qemu_opt_get_del(opts, BLOCK_OPT_PREALLOC);
+prealloc = qapi_enum_parse(&PreallocMode_lookup, buf,
+   PREALLOC_MODE_OFF, _err);
+g_free(buf);
+if (local_err) {
+error_propagate(errp, local_err);
+return -EINVAL;
+}
+
+if (prealloc != PREALLOC_MODE_OFF) {
+error_setg(errp, "Unsupported preallocation mode '%s'",
+   PreallocMode_str(prealloc));
+return -ENOTSUP;
+}
+
+qdict_put_str(options, "driver", drv->format_name);
+
+blk = blk_new_open(filename, NULL, options,
+   BDRV_O_RDWR | BDRV_O_RESIZE, errp);
+if (!blk) {
+error_prepend(errp, "Protocol driver '%s' does not support image "
+  "creation, and opening the image failed: ",
+  drv->format_name);
+return -EINVAL;
+}
+
+size = create_file_fallback_truncate(blk, size, errp);
+if (size < 0) {
+ret = size;
+goto out;
+}
+
+ret = create_file_fallback_zero_first_sector(blk, size, errp);
+if (ret < 0) {
+goto out;
+}
+
+ret = 0;
+out:
+blk_unref(blk);
+return ret;
+}
+
+int bdrv_create_file(const char *filename, QemuOpts *opts, Error **errp)
+{
+BlockDriver *drv;
+
 drv = bdrv_find_protocol(filename, true, errp);
 if (drv == NULL) {
 return -ENOENT;
 }
 
-ret = bdrv_create(drv, filename, opts, &local_err);
-error_propagate(errp, local_err);
-return ret;
+if (drv->bdrv_co_create_opts) {
+return bdrv_create(drv, filename, opts, errp);
+} else {
+return bdrv_create_file_fallback(filename, drv, opts, errp);
+}
 }
 
 /**
@@ -1422,6 +1541,24 @@ QemuOptsList bdrv_runtime_opts = {
 },
 };
 
+static QemuOptsList fallback_create_opts = {
+.name = "fallback-create-opts",
+.head = QTAILQ_HEAD_INITIALIZER(fallback_create_opts.head),
+.desc = {
+{
+.name = BLOCK_OPT_SIZE,
+.type = QEMU_OPT_SIZE,
+.help = "Virtual disk size"
+},
+{
+.name = BLOCK_OPT_PREALLOC,
+.type = QEMU_OPT_STRING,
+.help = "Preallocation mode (allowed 

[PATCH v2 1/5] block/nbd: Fix hang in .bdrv_close()

2020-01-22 Thread Max Reitz
When nbd_close() is called from a coroutine, the connection_co never
gets to run, and thus nbd_teardown_connection() hangs.

This is because aio_co_enter() only puts the connection_co into the main
coroutine's wake-up queue, so this main coroutine needs to yield and
wait for connection_co to terminate.

Suggested-by: Kevin Wolf 
Signed-off-by: Max Reitz 
---
 block/nbd.c | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/block/nbd.c b/block/nbd.c
index d085554f21..6d3b22f844 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -70,6 +70,7 @@ typedef struct BDRVNBDState {
 CoMutex send_mutex;
 CoQueue free_sema;
 Coroutine *connection_co;
+Coroutine *teardown_co;
 QemuCoSleepState *connection_co_sleep_ns_state;
 bool drained;
 bool wait_drained_end;
@@ -203,7 +204,15 @@ static void nbd_teardown_connection(BlockDriverState *bs)
 qemu_co_sleep_wake(s->connection_co_sleep_ns_state);
 }
 }
-BDRV_POLL_WHILE(bs, s->connection_co);
+if (qemu_in_coroutine()) {
+s->teardown_co = qemu_coroutine_self();
+/* connection_co resumes us when it terminates */
+qemu_coroutine_yield();
+s->teardown_co = NULL;
+} else {
+BDRV_POLL_WHILE(bs, s->connection_co);
+}
+assert(!s->connection_co);
 }
 
 static bool nbd_client_connecting(BDRVNBDState *s)
@@ -395,6 +404,9 @@ static coroutine_fn void nbd_connection_entry(void *opaque)
 s->ioc = NULL;
 }
 
+if (s->teardown_co) {
+aio_co_wake(s->teardown_co);
+}
 aio_wait_kick();
 }
 
-- 
2.24.1




Re: [PATCH v4 1/4] qapi: Add a 'coroutine' flag for commands

2020-01-22 Thread Kevin Wolf
On 22.01.2020 at 13:15, Markus Armbruster wrote:
> Kevin Wolf  writes:
> 
> > On 22.01.2020 at 07:32, Markus Armbruster wrote:
> >> Kevin Wolf  writes:
> >> 
> >> > This patch adds a new 'coroutine' flag to QMP command definitions that
> >> > tells the QMP dispatcher that the command handler is safe to be run in a
> >> > coroutine.
> >> 
> >> I'm afraid I missed this question in my review of v3: when is a handler
> >> *not* safe to be run in a coroutine?
> >
> > That's a hard one to answer fully.
> 
> You're welcome ;)
> 
> > Basically, I think the biggest problem is with calling functions that
> > change their behaviour if run in a coroutine compared to running them
> > outside of coroutine context. In most cases the differences like having
> > a nested event loop instead of yielding are just fine, but they are
> > still subtly different.
> >
> > I know this is vague, but I can assure you that problematic cases exist.
> > I hit one of them with my initial hack that just moved everything into a
> > coroutine. It was related to graph modifications and bdrv_drain and
> > resulted in a hang. For the specifics, I would have to try and reproduce
> > the problem again.
> 
> Interesting.
> 
> Is coroutine-incompatible command handler code necessary or accidental?
> 
> By "necessary" I mean there are (and likely always will be) commands
> that need to do stuff that cannot or should not be done on coroutine
> context.
> 
> "Accidental" is the opposite: coroutine-incompatibility can be regarded
> as a fixable flaw.

I expect it's mostly accidental, but maybe not all of it.

bdrv_drain() is an example of a function that has a problem with running
inside a coroutine: If you schedule another coroutine to run, it will
only run after the currently active coroutine yields. However,
bdrv_drain() doesn't yield, but runs a nested event loop, so this can
lead to deadlocks.

For this reason, bdrv_drain() has code to drop out of coroutine context,
so that you can call it from a coroutine anyway:

/* Calling bdrv_drain() from a BH ensures the current coroutine yields and
 * other coroutines run if they were queued by aio_co_enter(). */

Now, after reading this, I think the problem in the QMP handler I
observed was that it called some parts of the drain code without
dropping out of coroutine context first. It may well have been the
exact deadlock described by this comment.

(Even in bdrv_drain() it might not be strictly necessary. Maybe it's
possible to redesign bdrv_drain() so that it doesn't have a nested event
loop, but can somehow rely on the normal main loop. Touching it is
scary, though, so I'd rather not unless we have a good reason.)
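
Just to make the pattern above concrete, here is a rough sketch (not the
actual bdrv_drain() code; DrainData, do_drain_bh and drain_from_coroutine
are made-up names) of how a coroutine can hand the nested event loop off
to a BH and yield until it has finished:

typedef struct DrainData {
    Coroutine *co;
    BlockDriverState *bs;
    bool done;
} DrainData;

/* Runs as a BH, i.e. outside of coroutine context, so the nested event
 * loop in bdrv_drain() can't deadlock against the calling coroutine. */
static void do_drain_bh(void *opaque)
{
    DrainData *data = opaque;

    bdrv_drain(data->bs);
    data->done = true;
    aio_co_wake(data->co);   /* resume the coroutine that yielded below */
}

static void coroutine_fn drain_from_coroutine(BlockDriverState *bs)
{
    DrainData data = {
        .co = qemu_coroutine_self(),
        .bs = bs,
    };

    aio_bh_schedule_oneshot(bdrv_get_aio_context(bs), do_drain_bh, &data);
    /* Yielding here lets coroutines queued by aio_co_enter() run before
     * the drain starts, which is what the comment quoted above is about. */
    qemu_coroutine_yield();
    assert(data.done);
}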

> How widespread is coroutine-incompatibility?  Common, or just a few
> commands?

Answering this would require a massive code audit. Way out of scope for
this series. This is the reason why I decided to just make it optional
instead of trying to find and fix the cases that would need a fix if we
enabled coroutine handlers unconditionally.

> If coroutine-incompatibility is accidental, then your code to drop out
> of coroutine context can be regarded as a temporary work-around.  Such
> work-arounds receive a modest extra ugliness & complexity budget.
> 
> If coroutine-incompatibility is rare, we'll eventually want "mark the
> known-bad ones with 'coroutine': false" instead of "mark the known-good
> ones with 'coroutine': true".  I'm okay with marking the known-good ones
> initially, and flipping only later.
> 
> Inability to recognize coroutine-incompatibility by inspection worries
> me.  Can you think of ways to identify causes other than testing things
> to death?

I think it is possible to recognise by inspection; it would just be a
massive effort. Maybe a way to start would be adding something like a
non_coroutine_fn marker to functions that we know don't want to be
called from coroutine context, and propagating it to their callers.

I guess I would start by looking for nested event loops (AIO_WAIT_WHILE
or BDRV_POLL_WHILE), and then you would have to decide for each whether
it could run into problems. I can't promise this will cover everything,
but it would at least cover the case known from bdrv_drain.
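
For what it's worth, such a marker could be as simple as the sketch below
(purely illustrative; nothing like this exists in the tree today, and the
annotated function name is made up):

/* Like coroutine_fn in include/qemu/coroutine.h, this would have no
 * runtime effect; it only documents -- and could later let a static
 * checker verify -- that a function must never be called from
 * coroutine context. */
#define non_coroutine_fn

/* Candidates would be functions that run a nested event loop via
 * AIO_WAIT_WHILE() / BDRV_POLL_WHILE(), e.g.: */
non_coroutine_fn void qmp_handler_that_polls(void);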

Kevin




[PATCH] docs/devel: Fix qtest paths and info about check-block in testing.rst

2020-01-22 Thread Thomas Huth
The qtests have recently been moved to a separate subdirectory, so
the paths that are mentioned in the documentation have to be adjusted
accordingly. And some of the iotests are now always run as part of
"make check", so this information has to be adjusted here, too.

Signed-off-by: Thomas Huth 
---
 docs/devel/testing.rst | 23 ---
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/docs/devel/testing.rst b/docs/devel/testing.rst
index ab5be0c729..770a987ea4 100644
--- a/docs/devel/testing.rst
+++ b/docs/devel/testing.rst
@@ -16,8 +16,8 @@ The usual way to run these tests is:
 
   make check
 
-which includes QAPI schema tests, unit tests, and QTests. Different sub-types
-of "make check" tests will be explained below.
+which includes QAPI schema tests, unit tests, QTests and some iotests.
+Different sub-types of "make check" tests will be explained below.
 
 Before running tests, it is best to build QEMU programs first. Some tests
 expect the executables to exist and will fail with obscure messages if they
@@ -79,8 +79,8 @@ QTest cases can be executed with
 
make check-qtest
 
-The QTest library is implemented by ``tests/libqtest.c`` and the API is defined
-in ``tests/libqtest.h``.
+The QTest library is implemented by ``tests/qtest/libqtest.c`` and the API is
+defined in ``tests/qtest/libqtest.h``.
 
 Consider adding a new QTest case when you are introducing a new virtual
 hardware, or extending one if you are adding functionalities to an existing
@@ -94,20 +94,20 @@ libqos instead of directly calling into libqtest.
 Steps to add a new QTest case are:
 
 1. Create a new source file for the test. (More than one file can be added as
-   necessary.) For example, ``tests/test-foo-device.c``.
+   necessary.) For example, ``tests/qtest/foo-test.c``.
 
 2. Write the test code with the glib and libqtest/libqos API. See also existing
tests and the library headers for reference.
 
-3. Register the new test in ``tests/Makefile.include``. Add the test executable
-   name to an appropriate ``check-qtest-*-y`` variable. For example:
+3. Register the new test in ``tests/qtest/Makefile.include``. Add the test
+   executable name to an appropriate ``check-qtest-*-y`` variable. For example:
 
-   ``check-qtest-generic-y = tests/test-foo-device$(EXESUF)``
+   ``check-qtest-generic-y = tests/qtest/foo-test$(EXESUF)``
 
 4. Add object dependencies of the executable in the Makefile, including the
test source file(s) and other interesting objects. For example:
 
-   ``tests/test-foo-device$(EXESUF): tests/test-foo-device.o $(libqos-obj-y)``
+   ``tests/qtest/foo-test$(EXESUF): tests/qtest/foo-test.o $(libqos-obj-y)``
 
 Debugging a QTest failure is slightly harder than the unit test because the
 tests look up QEMU program names in the environment variables, such as
@@ -152,8 +152,9 @@ parser (either fixing a bug or extending/modifying the 
syntax). To do this:
 check-block
 ---
 
-``make check-block`` is a legacy command to invoke block layer iotests and is
-rarely used. See "QEMU iotests" section below for more information.
+``make check-block`` runs a subset of the block layer iotests (the tests that
+are in the "auto" group in ``tests/qemu-iotests/group``).
+See the "QEMU iotests" section below for more information.
 
 GCC gcov support
 
-- 
2.18.1




Re: [PATCH 1/3] qemu-nbd: Convert invocation documentation to rST

2020-01-22 Thread Alex Bennée


Peter Maydell  writes:

> The qemu-nbd documentation is currently in qemu-nbd.texi in Texinfo
> format, which we present to the user as:
>  * a qemu-nbd manpage
>  * a section of the main qemu-doc HTML documentation
>
> Convert the documentation to rST format, and present it to the user as:
>  * a qemu-nbd manpage
>  * part of the interop/ Sphinx manual
>
> This follows the same pattern as commit 27a296fce982 did for the
> qemu-ga manpage.
>
> All the content of the old manpage is retained, except that I have
> dropped the "This is free software; see the source for copying
> conditions.  There is NO warranty..." text that was in the old AUTHOR
> section; Sphinx's manpage builder doesn't expect that much text in
> the AUTHOR section, and since none of our other manpages have it it
> seems easiest to delete it rather than try to figure out where else
> in the manpage to put it.
>
> The only other textual change is that I have had to give the
> --nocache option its own description ("Equivalent to --cache=none")
> because Sphinx doesn't have an equivalent of using item/itemx
> to share a description between two options.
>
> Some minor aspects of the formatting have changed, to suit what is
> easiest for Sphinx to output. (The most notable is that Sphinx
> option section option syntax doesn't support '--option foo=bar'
> with bar underlined rather than bold, so we have to switch to
> '--option foo=BAR' instead.)
>
> The contents of qemu-option-trace.texi are now duplicated in
> docs/interop/qemu-option-trace.rst.inc, until such time as we complete
> the conversion of the other files which use it; since it has had only
> 3 changes in 3 years, this shouldn't be too awkward a burden.
> (We use .rst.inc because if this file fragment has a .rst extension
> then Sphinx complains about not seeing it in a toctree.)
>
> Signed-off-by: Peter Maydell 

Reviewed-by: Alex Bennée 
Tested-by: Alex Bennée 

> ---
> Another step forward for https://wiki.qemu.org/Features/Documentation
> (this is part of step 3 in the 'transition plan').
>
> v1->v2:
>  * avoided some long lines in the Makefile
> ---
>  Makefile   |  16 +-
>  MAINTAINERS|   1 +
>  docs/interop/conf.py   |   4 +-
>  docs/interop/index.rst |   1 +
>  docs/interop/qemu-nbd.rst  | 263 +
>  docs/interop/qemu-option-trace.rst.inc |  30 +++
>  qemu-doc.texi  |   6 -
>  qemu-nbd.texi  | 214 
>  qemu-option-trace.texi |   4 +
>  9 files changed, 313 insertions(+), 226 deletions(-)
>  create mode 100644 docs/interop/qemu-nbd.rst
>  create mode 100644 docs/interop/qemu-option-trace.rst.inc
>  delete mode 100644 qemu-nbd.texi
>
> diff --git a/Makefile b/Makefile
> index afa5cb6548d..7a8a50d8700 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -339,7 +339,9 @@ MANUAL_BUILDDIR := docs
>  endif
>  
>  ifdef BUILD_DOCS
> -DOCS=qemu-doc.html qemu-doc.txt qemu.1 qemu-img.1 qemu-nbd.8 
> $(MANUAL_BUILDDIR)/interop/qemu-ga.8
> +DOCS=qemu-doc.html qemu-doc.txt qemu.1 qemu-img.1
> +DOCS+=$(MANUAL_BUILDDIR)/interop/qemu-nbd.8
> +DOCS+=$(MANUAL_BUILDDIR)/interop/qemu-ga.8
>  DOCS+=docs/interop/qemu-qmp-ref.html docs/interop/qemu-qmp-ref.txt 
> docs/interop/qemu-qmp-ref.7
>  DOCS+=docs/interop/qemu-ga-ref.html docs/interop/qemu-ga-ref.txt 
> docs/interop/qemu-ga-ref.7
>  DOCS+=docs/qemu-block-drivers.7
> @@ -829,7 +831,7 @@ ifdef CONFIG_POSIX
>  ifeq ($(CONFIG_TOOLS),y)
>   $(INSTALL_DATA) qemu-img.1 "$(DESTDIR)$(mandir)/man1"
>   $(INSTALL_DIR) "$(DESTDIR)$(mandir)/man8"
> - $(INSTALL_DATA) qemu-nbd.8 "$(DESTDIR)$(mandir)/man8"
> + $(INSTALL_DATA) $(MANUAL_BUILDDIR)/interop/qemu-nbd.8 
> "$(DESTDIR)$(mandir)/man8"
>  endif
>  ifdef CONFIG_TRACE_SYSTEMTAP
>   $(INSTALL_DATA) scripts/qemu-trace-stap.1 "$(DESTDIR)$(mandir)/man1"
> @@ -1007,7 +1009,9 @@ sphinxdocs: $(MANUAL_BUILDDIR)/devel/index.html 
> $(MANUAL_BUILDDIR)/interop/index
>  # a single doctree: https://github.com/sphinx-doc/sphinx/issues/2946
>  build-manual = $(call quiet-command,CONFDIR="$(qemu_confdir)" sphinx-build 
> $(if $(V),,-q) -W -b $2 -D version=$(VERSION) -D release="$(FULL_VERSION)" -d 
> .doctrees/$1-$2 $(SRC_PATH)/docs/$1 $(MANUAL_BUILDDIR)/$1 
> ,"SPHINX","$(MANUAL_BUILDDIR)/$1")
>  # We assume all RST files in the manual's directory are used in it
> -manual-deps = $(wildcard $(SRC_PATH)/docs/$1/*.rst) 
> $(SRC_PATH)/docs/$1/conf.py $(SRC_PATH)/docs/conf.py
> +manual-deps = $(wildcard $(SRC_PATH)/docs/$1/*.rst) \
> +  $(wildcard $(SRC_PATH)/docs/$1/*.rst.inc) \
> +  $(SRC_PATH)/docs/$1/conf.py $(SRC_PATH)/docs/conf.py
>  
>  $(MANUAL_BUILDDIR)/devel/index.html: $(call manual-deps,devel)
>   $(call build-manual,devel,html)
> @@ -1021,6 +1025,9 @@ $(MANUAL_BUILDDIR)/specs/index.html: $(call 
> manual-deps,specs)
>  $(MANUAL_BUILDDIR)/interop/qemu-ga.8: $(call 

[PATCH 5/7] migration/block-dirty-bitmap: cancel migration on shutdown

2020-01-22 Thread Vladimir Sementsov-Ogievskiy
If the target is turned off before postcopy has finished, we crash
because busy bitmaps are found at shutdown.

Let's fix it by removing all unfinished bitmaps on shutdown.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 migration/migration.h  |  1 +
 migration/block-dirty-bitmap.c | 44 ++
 migration/migration.c  |  7 ++
 3 files changed, 48 insertions(+), 4 deletions(-)

diff --git a/migration/migration.h b/migration/migration.h
index aa9ff6f27b..a3927b93bb 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -331,6 +331,7 @@ void migrate_send_rp_recv_bitmap(MigrationIncomingState 
*mis,
 void migrate_send_rp_resume_ack(MigrationIncomingState *mis, uint32_t value);
 
 void dirty_bitmap_mig_before_vm_start(void);
+void dirty_bitmap_mig_cancel_incoming(void);
 void init_dirty_bitmap_incoming_migration(void);
 void migrate_add_address(SocketAddress *address);
 
diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index f96458113c..5a98543672 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -143,6 +143,7 @@ typedef struct DirtyBitmapLoadState {
 
 bool bitmaps_enabled; /* set in dirty_bitmap_mig_before_vm_start */
 bool stream_ended; /* set when all migrated data handled */
+bool cancelled;
 
 GSList *bitmaps;
 QemuMutex lock; /* protect bitmaps */
@@ -533,8 +534,6 @@ static void dirty_bitmap_load_complete(QEMUFile *f)
 trace_dirty_bitmap_load_complete();
 bdrv_dirty_bitmap_deserialize_finish(s->bitmap);
 
-qemu_mutex_lock(&dbm_load_state.lock);
-
 if (bdrv_dirty_bitmap_has_successor(s->bitmap)) {
 bdrv_reclaim_dirty_bitmap(s->bitmap, &error_abort);
 }
@@ -547,8 +546,6 @@ static void dirty_bitmap_load_complete(QEMUFile *f)
 break;
 }
 }
-
-qemu_mutex_unlock(&dbm_load_state.lock);
 }
 
 static int dirty_bitmap_load_bits(QEMUFile *f)
@@ -656,6 +653,13 @@ static int dirty_bitmap_load(QEMUFile *f, void *opaque, 
int version_id)
 }
 
 do {
+qemu_mutex_lock(&dbm_load_state.lock);
+
+if (dbm_load_state.cancelled) {
+qemu_mutex_unlock(&dbm_load_state.lock);
+break;
+}
+
 ret = dirty_bitmap_load_header(f);
 if (ret < 0) {
 return ret;
@@ -676,6 +680,8 @@ static int dirty_bitmap_load(QEMUFile *f, void *opaque, int 
version_id)
 if (ret) {
 return ret;
 }
+
+qemu_mutex_unlock(&dbm_load_state.lock);
 } while (!(dbm_load_state.flags & DIRTY_BITMAP_MIG_FLAG_EOS));
 
 qemu_mutex_lock(&dbm_load_state.lock);
@@ -692,6 +698,36 @@ static int dirty_bitmap_load(QEMUFile *f, void *opaque, 
int version_id)
 return 0;
 }
 
+void dirty_bitmap_mig_cancel_incoming(void)
+{
+GSList *item;
+
+qemu_mutex_lock(&dbm_load_state.lock);
+
+if (dbm_load_state.bitmaps_enabled && dbm_load_state.stream_ended) {
+qemu_mutex_unlock(&dbm_load_state.lock);
+return;
+}
+
+dbm_load_state.cancelled = true;
+
+for (item = dbm_load_state.bitmaps; item; item = g_slist_next(item)) {
+DirtyBitmapLoadBitmapState *b = item->data;
+
+if (!dbm_load_state.bitmaps_enabled || !b->migrated) {
+if (bdrv_dirty_bitmap_has_successor(b->bitmap)) {
+bdrv_reclaim_dirty_bitmap(b->bitmap, &error_abort);
+}
+bdrv_release_dirty_bitmap(b->bitmap);
+}
+}
+
+g_slist_free_full(dbm_load_state.bitmaps, g_free);
+dbm_load_state.bitmaps = NULL;
+
+qemu_mutex_unlock(&dbm_load_state.lock);
+}
+
 static int dirty_bitmap_save_setup(QEMUFile *f, void *opaque)
 {
 DirtyBitmapMigBitmapState *dbms = NULL;
diff --git a/migration/migration.c b/migration/migration.c
index 990bff00c0..12d161165d 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -182,6 +182,13 @@ void migration_shutdown(void)
  */
 migrate_fd_cancel(current_migration);
 object_unref(OBJECT(current_migration));
+
+/*
+ * Cancel incoming migration of dirty bitmaps. Dirty bitmaps
+ * are non-critical data, and their loss never considered as
+ * something serious.
+ */
+dirty_bitmap_mig_cancel_incoming();
 }
 
 /* For outgoing */
-- 
2.21.0




[PATCH 1/7] migration/block-dirty-bitmap: refactor incoming state to be one struct

2020-01-22 Thread Vladimir Sementsov-Ogievskiy
Move enabled_bitmaps and finish_lock, which are part of the incoming
state, into DirtyBitmapLoadState, and use a static global variable to
store the state instead of a static local one.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 migration/block-dirty-bitmap.c | 77 +++---
 1 file changed, 43 insertions(+), 34 deletions(-)

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 7eafface61..281d20f41d 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -125,6 +125,13 @@ typedef struct DirtyBitmapMigState {
 BlockDriverState *prev_bs;
 BdrvDirtyBitmap *prev_bitmap;
 } DirtyBitmapMigState;
+static DirtyBitmapMigState dirty_bitmap_mig_state;
+
+typedef struct DirtyBitmapLoadBitmapState {
+BlockDriverState *bs;
+BdrvDirtyBitmap *bitmap;
+bool migrated;
+} DirtyBitmapLoadBitmapState;
 
 typedef struct DirtyBitmapLoadState {
 uint32_t flags;
@@ -132,21 +139,15 @@ typedef struct DirtyBitmapLoadState {
 char bitmap_name[256];
 BlockDriverState *bs;
 BdrvDirtyBitmap *bitmap;
-} DirtyBitmapLoadState;
 
-static DirtyBitmapMigState dirty_bitmap_mig_state;
-
-typedef struct DirtyBitmapLoadBitmapState {
-BlockDriverState *bs;
-BdrvDirtyBitmap *bitmap;
-bool migrated;
-} DirtyBitmapLoadBitmapState;
-static GSList *enabled_bitmaps;
-QemuMutex finish_lock;
+GSList *enabled_bitmaps;
+QemuMutex finish_lock;
+} DirtyBitmapLoadState;
+static DirtyBitmapLoadState dbm_load_state;
 
 void init_dirty_bitmap_incoming_migration(void)
 {
-qemu_mutex_init(&finish_lock);
+qemu_mutex_init(&dbm_load_state.finish_lock);
 }
 
 static uint32_t qemu_get_bitmap_flags(QEMUFile *f)
@@ -439,8 +440,9 @@ static void dirty_bitmap_save_pending(QEMUFile *f, void 
*opaque,
 }
 
 /* First occurrence of this bitmap. It should be created if doesn't exist */
-static int dirty_bitmap_load_start(QEMUFile *f, DirtyBitmapLoadState *s)
+static int dirty_bitmap_load_start(QEMUFile *f)
 {
+DirtyBitmapLoadState *s = &dbm_load_state;
 Error *local_err = NULL;
 uint32_t granularity = qemu_get_be32(f);
 uint8_t flags = qemu_get_byte(f);
@@ -482,7 +484,8 @@ static int dirty_bitmap_load_start(QEMUFile *f, 
DirtyBitmapLoadState *s)
 b->bs = s->bs;
 b->bitmap = s->bitmap;
 b->migrated = false;
-enabled_bitmaps = g_slist_prepend(enabled_bitmaps, b);
+dbm_load_state.enabled_bitmaps =
+g_slist_prepend(dbm_load_state.enabled_bitmaps, b);
 }
 
 return 0;
@@ -492,9 +495,11 @@ void dirty_bitmap_mig_before_vm_start(void)
 {
 GSList *item;
 
-qemu_mutex_lock(&finish_lock);
+qemu_mutex_lock(&dbm_load_state.finish_lock);
 
-for (item = enabled_bitmaps; item; item = g_slist_next(item)) {
+for (item = dbm_load_state.enabled_bitmaps; item;
+ item = g_slist_next(item))
+{
 DirtyBitmapLoadBitmapState *b = item->data;
 
 if (b->migrated) {
@@ -506,21 +511,24 @@ void dirty_bitmap_mig_before_vm_start(void)
 g_free(b);
 }
 
-g_slist_free(enabled_bitmaps);
-enabled_bitmaps = NULL;
+g_slist_free(dbm_load_state.enabled_bitmaps);
+dbm_load_state.enabled_bitmaps = NULL;
 
-qemu_mutex_unlock(&finish_lock);
+qemu_mutex_unlock(&dbm_load_state.finish_lock);
 }
 
-static void dirty_bitmap_load_complete(QEMUFile *f, DirtyBitmapLoadState *s)
+static void dirty_bitmap_load_complete(QEMUFile *f)
 {
+DirtyBitmapLoadState *s = &dbm_load_state;
 GSList *item;
 trace_dirty_bitmap_load_complete();
 bdrv_dirty_bitmap_deserialize_finish(s->bitmap);
 
-qemu_mutex_lock(&finish_lock);
+qemu_mutex_lock(&dbm_load_state.finish_lock);
 
-for (item = enabled_bitmaps; item; item = g_slist_next(item)) {
+for (item = dbm_load_state.enabled_bitmaps; item;
+ item = g_slist_next(item))
+{
 DirtyBitmapLoadBitmapState *b = item->data;
 
 if (b->bitmap == s->bitmap) {
@@ -531,7 +539,7 @@ static void dirty_bitmap_load_complete(QEMUFile *f, 
DirtyBitmapLoadState *s)
 
 if (bdrv_dirty_bitmap_has_successor(s->bitmap)) {
 bdrv_dirty_bitmap_lock(s->bitmap);
-if (enabled_bitmaps == NULL) {
+if (dbm_load_state.enabled_bitmaps == NULL) {
 /* in postcopy */
 bdrv_reclaim_dirty_bitmap_locked(s->bitmap, &error_abort);
 bdrv_enable_dirty_bitmap_locked(s->bitmap);
@@ -550,11 +558,12 @@ static void dirty_bitmap_load_complete(QEMUFile *f, 
DirtyBitmapLoadState *s)
 bdrv_dirty_bitmap_unlock(s->bitmap);
 }
 
-qemu_mutex_unlock(&finish_lock);
+qemu_mutex_unlock(&dbm_load_state.finish_lock);
 }
 
-static int dirty_bitmap_load_bits(QEMUFile *f, DirtyBitmapLoadState *s)
+static int dirty_bitmap_load_bits(QEMUFile *f)
 {
+DirtyBitmapLoadState *s = &dbm_load_state;
 uint64_t first_byte = qemu_get_be64(f) << BDRV_SECTOR_BITS;
 uint64_t nr_bytes = (uint64_t)qemu_get_be32(f) << BDRV_SECTOR_BITS;
 trace_dirty_bitmap_load_bits_enter(first_byte >> BDRV_SECTOR_BITS,
@@ 

[PATCH 2/7] migration/block-dirty-bitmap: rename finish_lock to just lock

2020-01-22 Thread Vladimir Sementsov-Ogievskiy
finish_lock is a bad name, as the lock is not used only at the end of the process.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 migration/block-dirty-bitmap.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 281d20f41d..502e858c31 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -141,13 +141,13 @@ typedef struct DirtyBitmapLoadState {
 BdrvDirtyBitmap *bitmap;
 
 GSList *enabled_bitmaps;
-QemuMutex finish_lock;
+QemuMutex lock; /* protect enabled_bitmaps */
 } DirtyBitmapLoadState;
 static DirtyBitmapLoadState dbm_load_state;
 
 void init_dirty_bitmap_incoming_migration(void)
 {
-qemu_mutex_init(&dbm_load_state.finish_lock);
+qemu_mutex_init(&dbm_load_state.lock);
 }
 
 static uint32_t qemu_get_bitmap_flags(QEMUFile *f)
@@ -495,7 +495,7 @@ void dirty_bitmap_mig_before_vm_start(void)
 {
 GSList *item;
 
-qemu_mutex_lock(&dbm_load_state.finish_lock);
+qemu_mutex_lock(&dbm_load_state.lock);
 
 for (item = dbm_load_state.enabled_bitmaps; item;
  item = g_slist_next(item))
@@ -514,7 +514,7 @@ void dirty_bitmap_mig_before_vm_start(void)
 g_slist_free(dbm_load_state.enabled_bitmaps);
 dbm_load_state.enabled_bitmaps = NULL;
 
-qemu_mutex_unlock(&dbm_load_state.finish_lock);
+qemu_mutex_unlock(&dbm_load_state.lock);
 }
 
 static void dirty_bitmap_load_complete(QEMUFile *f)
@@ -524,7 +524,7 @@ static void dirty_bitmap_load_complete(QEMUFile *f)
 trace_dirty_bitmap_load_complete();
 bdrv_dirty_bitmap_deserialize_finish(s->bitmap);
 
-qemu_mutex_lock(&dbm_load_state.finish_lock);
+qemu_mutex_lock(&dbm_load_state.lock);
 
 for (item = dbm_load_state.enabled_bitmaps; item;
  item = g_slist_next(item))
@@ -558,7 +558,7 @@ static void dirty_bitmap_load_complete(QEMUFile *f)
 bdrv_dirty_bitmap_unlock(s->bitmap);
 }
 
-qemu_mutex_unlock(&dbm_load_state.finish_lock);
+qemu_mutex_unlock(&dbm_load_state.lock);
 }
 
 static int dirty_bitmap_load_bits(QEMUFile *f)
-- 
2.21.0




[PATCH 0/7] Fix crashes on early shutdown during bitmaps postcopy

2020-01-22 Thread Vladimir Sementsov-Ogievskiy
Hi all!

Patches 5 and 6 fix two crashes, triggered by the new test case in
patch 7.

Vladimir Sementsov-Ogievskiy (7):
  migration/block-dirty-bitmap: refactor incoming state to be one struct
  migration/block-dirty-bitmap: rename finish_lock to just lock
  migration/block-dirty-bitmap: simplify dirty_bitmap_load_complete
  migration/block-dirty-bitmap: keep bitmap state for all bitmaps
  migration/block-dirty-bitmap: cancel migration on shutdown
  migration: handle to_src_file on target only for ram postcopy
  qemu-iotests/199: add early shutdown case to bitmaps postcopy

 migration/migration.h  |   1 +
 migration/block-dirty-bitmap.c | 171 +
 migration/migration.c  |   7 ++
 migration/savevm.c |  19 ++--
 tests/qemu-iotests/199 |  12 ++-
 tests/qemu-iotests/199.out |   4 +-
 6 files changed, 142 insertions(+), 72 deletions(-)

-- 
2.21.0




[PATCH 6/7] migration: handle to_src_file on target only for ram postcopy

2020-01-22 Thread Vladimir Sementsov-Ogievskiy
If only bitmap postcopy migration is enabled, and not RAM postcopy, this
assertion will fire, as we don't have a to_src_file for bitmap postcopy
migration.

migrate_postcopy_ram() accesses the migration state, which may be freed in
the main thread, so we should ref/unref it in the postcopy incoming thread.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 migration/savevm.c | 19 +--
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index adfdca26ac..143755389e 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1832,6 +1832,9 @@ static void *postcopy_ram_listen_thread(void *opaque)
 MigrationIncomingState *mis = migration_incoming_get_current();
 QEMUFile *f = mis->from_src_file;
 int load_res;
+MigrationState *migr = migrate_get_current();
+
+object_ref(OBJECT(migr));
 
 migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
MIGRATION_STATUS_POSTCOPY_ACTIVE);
@@ -1898,6 +1901,8 @@ static void *postcopy_ram_listen_thread(void *opaque)
 mis->have_listen_thread = false;
 postcopy_state_set(POSTCOPY_INCOMING_END);
 
+object_unref(OBJECT(migr));
+
 return NULL;
 }
 
@@ -2457,12 +2462,14 @@ static bool postcopy_pause_incoming(MigrationIncomingState *mis)
 qemu_fclose(mis->from_src_file);
 mis->from_src_file = NULL;
 
-assert(mis->to_src_file);
-qemu_file_shutdown(mis->to_src_file);
-qemu_mutex_lock(&mis->rp_mutex);
-qemu_fclose(mis->to_src_file);
-mis->to_src_file = NULL;
-qemu_mutex_unlock(&mis->rp_mutex);
+if (migrate_postcopy_ram()) {
+assert(mis->to_src_file);
+qemu_file_shutdown(mis->to_src_file);
+qemu_mutex_lock(&mis->rp_mutex);
+qemu_fclose(mis->to_src_file);
+mis->to_src_file = NULL;
+qemu_mutex_unlock(&mis->rp_mutex);
+}
 
 migrate_set_state(&mis->state, MIGRATION_STATUS_POSTCOPY_ACTIVE,
   MIGRATION_STATUS_POSTCOPY_PAUSED);
-- 
2.21.0
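
The object_ref()/object_unref() pair added around postcopy_ram_listen_thread()
follows the usual keep-alive pattern for reference-counted shared state: the
worker thread holds its own reference, so the object cannot be finalized under
its feet even if the rest of the process drops theirs. A minimal self-contained
sketch of that pattern, using plain C11 atomics and pthreads with made-up names
rather than QEMU's QOM API:

#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical reference-counted state shared between threads. */
typedef struct SharedState {
    atomic_int refcount;
    int value;
} SharedState;

static SharedState *state_ref(SharedState *s)
{
    atomic_fetch_add(&s->refcount, 1);
    return s;
}

static void state_unref(SharedState *s)
{
    if (atomic_fetch_sub(&s->refcount, 1) == 1) {
        free(s);                       /* last reference gone */
    }
}

static void *worker(void *opaque)
{
    SharedState *s = opaque;           /* a reference was taken for us */

    printf("worker sees value %d\n", s->value);

    state_unref(s);                    /* drop it when we are done */
    return NULL;
}

int main(void)
{
    SharedState *s = calloc(1, sizeof(*s));
    pthread_t tid;

    atomic_init(&s->refcount, 1);
    s->value = 42;

    /* Take a reference on behalf of the worker before starting it, so the
     * object is guaranteed to survive until the worker's unref. */
    state_ref(s);
    pthread_create(&tid, NULL, worker, s);

    state_unref(s);                    /* main drops its own reference */
    pthread_join(tid, NULL);           /* worker's unref may be the last */
    return 0;
}

In the patch the thread takes the reference itself, which is fine because the
migration state is known to be alive when the thread is created; the sketch
takes it in the creator only to keep the toy example race-free.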




[PATCH 3/7] migration/block-dirty-bitmap: simplify dirty_bitmap_load_complete

2020-01-22 Thread Vladimir Sementsov-Ogievskiy
The bdrv_enable_dirty_bitmap_locked() call does nothing: if we are in
postcopy, the bitmap successor must be enabled, and the reclaim operation
will enable the bitmap.

So, actually, we just need to call _reclaim_ in both if branches, and
keeping the branches different only to add an assertion does not seem
worthwhile. The logic becomes simple: on load complete we reclaim, and
that's all.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 migration/block-dirty-bitmap.c | 25 -
 1 file changed, 4 insertions(+), 21 deletions(-)

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 502e858c31..eeaab2174e 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -526,6 +526,10 @@ static void dirty_bitmap_load_complete(QEMUFile *f)
 
 qemu_mutex_lock(&dbm_load_state.lock);
 
+if (bdrv_dirty_bitmap_has_successor(s->bitmap)) {
+bdrv_reclaim_dirty_bitmap(s->bitmap, &error_abort);
+}
+
 for (item = dbm_load_state.enabled_bitmaps; item;
  item = g_slist_next(item))
 {
@@ -537,27 +541,6 @@ static void dirty_bitmap_load_complete(QEMUFile *f)
 }
 }
 
-if (bdrv_dirty_bitmap_has_successor(s->bitmap)) {
-bdrv_dirty_bitmap_lock(s->bitmap);
-if (dbm_load_state.enabled_bitmaps == NULL) {
-/* in postcopy */
-bdrv_reclaim_dirty_bitmap_locked(s->bitmap, &error_abort);
-bdrv_enable_dirty_bitmap_locked(s->bitmap);
-} else {
-/* target not started, successor must be empty */
-int64_t count = bdrv_get_dirty_count(s->bitmap);
-BdrvDirtyBitmap *ret = bdrv_reclaim_dirty_bitmap_locked(s->bitmap,
-NULL);
-/* bdrv_reclaim_dirty_bitmap can fail only on no successor (it
- * must be) or on merge fail, but merge can't fail when second
- * bitmap is empty
- */
-assert(ret == s->bitmap &&
-   count == bdrv_get_dirty_count(s->bitmap));
-}
-bdrv_dirty_bitmap_unlock(s->bitmap);
-}
-
 qemu_mutex_unlock(&dbm_load_state.lock);
 }
 
-- 
2.21.0




[PATCH 4/7] migration/block-dirty-bitmap: keep bitmap state for all bitmaps

2020-01-22 Thread Vladimir Sementsov-Ogievskiy
Keep bitmap state for disabled bitmaps too, and keep the state until the
end of the process. This is needed by the following commit to implement
cancelling of bitmap postcopy.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 migration/block-dirty-bitmap.c | 59 ++
 1 file changed, 38 insertions(+), 21 deletions(-)

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index eeaab2174e..f96458113c 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -131,6 +131,7 @@ typedef struct DirtyBitmapLoadBitmapState {
 BlockDriverState *bs;
 BdrvDirtyBitmap *bitmap;
 bool migrated;
+bool enabled;
 } DirtyBitmapLoadBitmapState;
 
 typedef struct DirtyBitmapLoadState {
@@ -140,8 +141,11 @@ typedef struct DirtyBitmapLoadState {
 BlockDriverState *bs;
 BdrvDirtyBitmap *bitmap;
 
-GSList *enabled_bitmaps;
-QemuMutex lock; /* protect enabled_bitmaps */
+bool bitmaps_enabled; /* set in dirty_bitmap_mig_before_vm_start */
+bool stream_ended; /* set when all migrated data handled */
+
+GSList *bitmaps;
+QemuMutex lock; /* protect bitmaps */
 } DirtyBitmapLoadState;
 static DirtyBitmapLoadState dbm_load_state;
 
@@ -446,6 +450,7 @@ static int dirty_bitmap_load_start(QEMUFile *f)
 Error *local_err = NULL;
 uint32_t granularity = qemu_get_be32(f);
 uint8_t flags = qemu_get_byte(f);
+DirtyBitmapLoadBitmapState *b;
 
 if (s->bitmap) {
 error_report("Bitmap with the same name ('%s') already exists on "
@@ -472,22 +477,23 @@ static int dirty_bitmap_load_start(QEMUFile *f)
 
 bdrv_disable_dirty_bitmap(s->bitmap);
 if (flags & DIRTY_BITMAP_MIG_START_FLAG_ENABLED) {
-DirtyBitmapLoadBitmapState *b;
-
 bdrv_dirty_bitmap_create_successor(s->bitmap, &local_err);
 if (local_err) {
 error_report_err(local_err);
 return -EINVAL;
 }
-
-b = g_new(DirtyBitmapLoadBitmapState, 1);
-b->bs = s->bs;
-b->bitmap = s->bitmap;
-b->migrated = false;
-dbm_load_state.enabled_bitmaps =
-g_slist_prepend(dbm_load_state.enabled_bitmaps, b);
 }
 
+b = g_new(DirtyBitmapLoadBitmapState, 1);
+*b = (DirtyBitmapLoadBitmapState) {
+.bs = s->bs,
+.bitmap = s->bitmap,
+.migrated = false,
+.enabled = flags & DIRTY_BITMAP_MIG_START_FLAG_ENABLED,
+};
+
+dbm_load_state.bitmaps = g_slist_prepend(dbm_load_state.bitmaps, b);
+
 return 0;
 }
 
@@ -497,22 +503,25 @@ void dirty_bitmap_mig_before_vm_start(void)
 
 qemu_mutex_lock(&dbm_load_state.lock);
 
-for (item = dbm_load_state.enabled_bitmaps; item;
- item = g_slist_next(item))
-{
+for (item = dbm_load_state.bitmaps; item; item = g_slist_next(item)) {
 DirtyBitmapLoadBitmapState *b = item->data;
 
+if (!b->enabled) {
+continue;
+}
+
 if (b->migrated) {
 bdrv_enable_dirty_bitmap_locked(b->bitmap);
 } else {
 bdrv_dirty_bitmap_enable_successor(b->bitmap);
 }
-
-g_free(b);
 }
 
-g_slist_free(dbm_load_state.enabled_bitmaps);
-dbm_load_state.enabled_bitmaps = NULL;
+dbm_load_state.bitmaps_enabled = true;
+if (dbm_load_state.stream_ended) {
+g_slist_free_full(dbm_load_state.bitmaps, g_free);
+dbm_load_state.bitmaps = NULL;
+}
 
 qemu_mutex_unlock(&dbm_load_state.lock);
 }
@@ -530,9 +539,7 @@ static void dirty_bitmap_load_complete(QEMUFile *f)
 bdrv_reclaim_dirty_bitmap(s->bitmap, &error_abort);
 }
 
-for (item = dbm_load_state.enabled_bitmaps; item;
- item = g_slist_next(item))
-{
+for (item = dbm_load_state.bitmaps; item; item = g_slist_next(item)) {
 DirtyBitmapLoadBitmapState *b = item->data;
 
 if (b->bitmap == s->bitmap) {
@@ -671,6 +678,16 @@ static int dirty_bitmap_load(QEMUFile *f, void *opaque, int version_id)
 }
 } while (!(dbm_load_state.flags & DIRTY_BITMAP_MIG_FLAG_EOS));
 
+qemu_mutex_lock(&dbm_load_state.lock);
+
+dbm_load_state.stream_ended = true;
+if (dbm_load_state.bitmaps_enabled) {
+g_slist_free_full(dbm_load_state.bitmaps, g_free);
+dbm_load_state.bitmaps = NULL;
+}
+
+qemu_mutex_unlock(&dbm_load_state.lock);
+
 trace_dirty_bitmap_load_success();
 return 0;
 }
-- 
2.21.0
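
The bitmaps_enabled/stream_ended pair above implements a small hand-off:
dirty_bitmap_mig_before_vm_start() and dirty_bitmap_load() may finish in either
order, and whichever runs second frees the list of per-bitmap states, under the
same lock. A self-contained sketch of that "whoever finishes last cleans up"
idea, with invented names and a plain pthread mutex instead of QemuMutex:

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Shared state: a resource plus one flag per event.  Whichever of the two
 * events happens last releases the resource. */
static struct {
    pthread_mutex_t lock;
    bool event_a_done;   /* e.g. "bitmaps enabled" */
    bool event_b_done;   /* e.g. "stream ended" */
    char *resource;      /* e.g. the list of per-bitmap states */
} st = { .lock = PTHREAD_MUTEX_INITIALIZER };

static void finish_event_a(void)
{
    pthread_mutex_lock(&st.lock);
    st.event_a_done = true;
    if (st.event_b_done) {            /* B already finished: we are last */
        free(st.resource);
        st.resource = NULL;
    }
    pthread_mutex_unlock(&st.lock);
}

static void finish_event_b(void)
{
    pthread_mutex_lock(&st.lock);
    st.event_b_done = true;
    if (st.event_a_done) {            /* A already finished: we are last */
        free(st.resource);
        st.resource = NULL;
    }
    pthread_mutex_unlock(&st.lock);
}

int main(void)
{
    st.resource = strdup("per-bitmap state");

    /* The two events may come from different threads and in any order;
     * exactly one of the two calls frees the resource. */
    finish_event_a();
    finish_event_b();

    printf("resource is %s\n", st.resource ? "still allocated" : "freed");
    return 0;
}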




[PATCH 7/7] qemu-iotests/199: add early shutdown case to bitmaps postcopy

2020-01-22 Thread Vladimir Sementsov-Ogievskiy
Previous patches fixed two crashes which may occur on shutdown before
bitmap postcopy has finished. Check that this now works.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 tests/qemu-iotests/199 | 12 +++-
 tests/qemu-iotests/199.out |  4 ++--
 2 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index a2c8ecab5a..a3f6c73aed 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -47,7 +47,7 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 self.vm_a.launch()
 self.vm_b.launch()
 
-def test_postcopy(self):
+def do_test_postcopy(self, early_shutdown):
 write_size = 0x4000
 granularity = 512
 chunk = 4096
@@ -99,6 +99,10 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 if event['data']['status'] == 'completed':
 break
 
+if early_shutdown:
+self.vm_b.qmp('quit')
+return
+
 s = 0x8000
 while s < write_size:
 self.vm_b.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
@@ -114,6 +118,12 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 
 self.assert_qmp(result, 'return/sha256', sha256);
 
+def test_postcopy(self):
+self.do_test_postcopy(False)
+
+def test_postcopy_early_shutdown(self):
+self.do_test_postcopy(True)
+
 if __name__ == '__main__':
 iotests.main(supported_fmts=['qcow2'], supported_cache_modes=['none'],
  supported_protocols=['file'])
diff --git a/tests/qemu-iotests/199.out b/tests/qemu-iotests/199.out
index ae1213e6f8..fbc63e62f8 100644
--- a/tests/qemu-iotests/199.out
+++ b/tests/qemu-iotests/199.out
@@ -1,5 +1,5 @@
-.
+..
 --
-Ran 1 tests
+Ran 2 tests
 
 OK
-- 
2.21.0




Re: [PATCH v3 0/2] ide: Fix incorrect handling of some PRDTs and add the corresponding unit-test

2020-01-22 Thread Kevin Wolf
Am 22.01.2020 um 12:53 hat Alexander Popov geschrieben:
> On 23.12.2019 20:51, Alexander Popov wrote:
> > Fuzzing the Linux kernel with syzkaller allowed to find how to crash qemu
> > using a special SCSI_IOCTL_SEND_COMMAND. It hits the assertion in
> > ide_dma_cb() introduced in the commit a718978ed58a in July 2015.
> > 
> > This patch series fixes incorrect handling of some PRDTs in ide_dma_cb()
> > and improves the ide-test to cover more PRDT cases (including one
> > that causes that particular qemu crash).
> > 
> > Changes from v2 (thanks to Kevin Wolf for the feedback):
> >  - the assertion about prepare_buf() return value is improved;
> >  - the patch order is reversed to keep the tree bisectable;
> >  - the unit-test performance is improved -- now it runs 8 seconds
> >instead of 3 minutes on my laptop.
> > 
> > Alexander Popov (2):
> >   ide: Fix incorrect handling of some PRDTs in ide_dma_cb()
> >   tests/ide-test: Create a single unit-test covering more PRDT cases
> > 
> >  hw/ide/core.c|  30 +---
> >  tests/ide-test.c | 174 ---
> >  2 files changed, 96 insertions(+), 108 deletions(-)
> 
> Hello!
> 
> Pinging again about this fix and unit-test...
> 
> It's ready. Kevin Wolf has reviewed this (thanks a lot!).
> 
> What is next?

I asked John about it just yesterday (if he will merge it or if he would
prefer me to take it through my tree) and he promised to take a look
very soon.

Kevin




Re: [PATCH v4 1/4] qapi: Add a 'coroutine' flag for commands

2020-01-22 Thread Markus Armbruster
Kevin Wolf  writes:

> Am 22.01.2020 um 07:32 hat Markus Armbruster geschrieben:
>> Kevin Wolf  writes:
>> 
>> > This patch adds a new 'coroutine' flag to QMP command definitions that
>> > tells the QMP dispatcher that the command handler is safe to be run in a
>> > coroutine.
>> 
>> I'm afraid I missed this question in my review of v3: when is a handler
>> *not* safe to be run in a coroutine?
>
> That's a hard one to answer fully.

You're welcome ;)

> Basically, I think the biggest problem is with calling functions that
> change their behaviour if run in a coroutine compared to running them
> outside of coroutine context. In most cases the differences like having
> a nested event loop instead of yielding are just fine, but they are
> still subtly different.
>
> I know this is vague, but I can assure you that problematic cases exist.
> I hit one of them with my initial hack that just moved everything into a
> coroutine. It was related to graph modifications and bdrv_drain and
> resulted in a hang. For the specifics, I would have to try and reproduce
> the problem again.

Interesting.

Is coroutine-incompatible command handler code necessary or accidental?

By "necessary" I mean there are (and likely always will be) commands
that need to do stuff that cannot or should not be done on coroutine
context.

"Accidental" is the opposite: coroutine-incompatibility can be regarded
as a fixable flaw.

How widespread is coroutine-incompatibility?  Common, or just a few
commands?

If coroutine-incompatibility is accidental, then your code to drop out
of coroutine context can be regarded as a temporary work-around.  Such
work-arounds receive a modest extra ugliness & complexity budget.

If coroutine-incompatibility is rare, we'll eventually want "mark the
known-bad ones with 'coroutine': false" instead of "mark the known-good
ones with 'coroutine': true".  I'm okay with marking the known-good ones
initially, and flipping only later.

Inability to recognize coroutine-incompatibility by inspection worries
me.  Can you think of ways to identify causes other than testing things
to death?




Re: [PATCH v3 0/2] ide: Fix incorrect handling of some PRDTs and add the corresponding unit-test

2020-01-22 Thread Alexander Popov
On 23.12.2019 20:51, Alexander Popov wrote:
> Fuzzing the Linux kernel with syzkaller allowed to find how to crash qemu
> using a special SCSI_IOCTL_SEND_COMMAND. It hits the assertion in
> ide_dma_cb() introduced in the commit a718978ed58a in July 2015.
> 
> This patch series fixes incorrect handling of some PRDTs in ide_dma_cb()
> and improves the ide-test to cover more PRDT cases (including one
> that causes that particular qemu crash).
> 
> Changes from v2 (thanks to Kevin Wolf for the feedback):
>  - the assertion about prepare_buf() return value is improved;
>  - the patch order is reversed to keep the tree bisectable;
>  - the unit-test performance is improved -- now it runs 8 seconds
>instead of 3 minutes on my laptop.
> 
> Alexander Popov (2):
>   ide: Fix incorrect handling of some PRDTs in ide_dma_cb()
>   tests/ide-test: Create a single unit-test covering more PRDT cases
> 
>  hw/ide/core.c|  30 +---
>  tests/ide-test.c | 174 ---
>  2 files changed, 96 insertions(+), 108 deletions(-)

Hello!

Pinging again about this fix and unit-test...

It's ready. Kevin Wolf has reviewed this (thanks a lot!).

What is next?

Best regards,
Alexander



Re: [PATCH v4 1/4] qapi: Add a 'coroutine' flag for commands

2020-01-22 Thread Kevin Wolf
Am 22.01.2020 um 07:32 hat Markus Armbruster geschrieben:
> Kevin Wolf  writes:
> 
> > This patch adds a new 'coroutine' flag to QMP command definitions that
> > tells the QMP dispatcher that the command handler is safe to be run in a
> > coroutine.
> 
> I'm afraid I missed this question in my review of v3: when is a handler
> *not* safe to be run in a coroutine?

That's a hard one to answer fully.

Basically, I think the biggest problem is with calling functions that
change their behaviour if run in a coroutine compared to running them
outside of coroutine context. In most cases the differences like having
a nested event loop instead of yielding are just fine, but they are
still subtly different.

I know this is vague, but I can assure you that problematic cases exist.
I hit one of them with my initial hack that just moved everything into a
coroutine. It was related to graph modifications and bdrv_drain and
resulted in a hang. For the specifics, I would have to try and reproduce
the problem again.

Kevin
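
A concrete illustration of the kind of difference being described: the same
helper can take one path when called from coroutine context (yield and be
resumed later) and another when called as a plain function (fall back to
something like a nested event loop). A toy, self-contained version of that
dual behaviour, using POSIX ucontext and made-up names rather than QEMU's
coroutine API:

#include <stdbool.h>
#include <stdio.h>
#include <ucontext.h>

static ucontext_t main_ctx, co_ctx;
static bool in_coroutine;

/* Helper whose behaviour depends on the calling context: in a coroutine it
 * yields back to the caller, otherwise it runs a (pretend) nested event
 * loop.  The two paths are subtly different, which is exactly what a given
 * handler may or may not tolerate. */
static void wait_for_io(void)
{
    if (in_coroutine) {
        swapcontext(&co_ctx, &main_ctx);      /* yield */
    } else {
        printf("  nested event loop iteration\n");
    }
}

static void handler(void)
{
    printf(" handler: before wait\n");
    wait_for_io();
    printf(" handler: after wait\n");
}

int main(void)
{
    static char stack[64 * 1024];

    printf("outside coroutine:\n");
    in_coroutine = false;
    handler();

    printf("inside coroutine:\n");
    getcontext(&co_ctx);
    co_ctx.uc_stack.ss_sp = stack;
    co_ctx.uc_stack.ss_size = sizeof(stack);
    co_ctx.uc_link = &main_ctx;               /* resume main when handler returns */
    makecontext(&co_ctx, handler, 0);

    in_coroutine = true;
    swapcontext(&main_ctx, &co_ctx);          /* enter; comes back at the yield */
    printf(" main: handler yielded, resuming it\n");
    swapcontext(&main_ctx, &co_ctx);          /* resume after the yield */
    in_coroutine = false;
    printf(" main: handler finished\n");
    return 0;
}

Whether a handler copes with both variants is what the proposed 'coroutine'
flag is meant to declare.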