Re: [PATCH v2 00/22] Fix error handling during bitmap postcopy

2020-02-17 Thread no-reply
Patchew URL: 
https://patchew.org/QEMU/20200217150246.29180-1-vsement...@virtuozzo.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Subject: [PATCH v2 00/22] Fix error handling during bitmap postcopy
Message-id: 20200217150246.29180-1-vsement...@virtuozzo.com
Type: series

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
fatal: git fetch_pack: expected ACK/NAK, got 'ERR upload-pack: not our ref 
247b588c357694c896d056836da2341d75451c4f'
fatal: The remote end hung up unexpectedly
error: Could not fetch 3c8cf5a9c21ff8782164d1def7f44bd888713384
Traceback (most recent call last):
  File "patchew-tester/src/patchew-cli", line 521, in test_one
git_clone_repo(clone, r["repo"], r["head"], logf, True)
  File "patchew-tester/src/patchew-cli", line 48, in git_clone_repo
stdout=logf, stderr=logf)
  File "/opt/rh/rh-python36/root/usr/lib64/python3.6/subprocess.py", line 291, 
in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['git', 'remote', 'add', '-f', 
'--mirror=fetch', '3c8cf5a9c21ff8782164d1def7f44bd888713384', 
'https://github.com/patchew-project/qemu']' returned non-zero exit status 1.



The full log is available at
http://patchew.org/logs/20200217150246.29180-1-vsement...@virtuozzo.com/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

Re: x-blockdev-reopen & block-dirty-bitmaps

2020-02-17 Thread John Snow



On 2/17/20 4:52 AM, Kevin Wolf wrote:
> On 14.02.2020 at 21:32, John Snow wrote:
>> On 2/14/20 3:19 PM, Kevin Wolf wrote:

>>> I think I've asked this before, but I don't remember the answer...
>>>
>>> What would be the problem with letting the enable/disable command
>>> temporarily reopen the backing file read-write, like the commit job [1]
>>> does?
>>>
>>
>> I guess no problem? I wouldn't want it to do this automatically, but
>> perhaps we could add a "force=True" bool where it tries to do just this.
>>
>> (And once reopen works properly we could deprecate this workaround again.)
> 
> I'm not sure if I would view this only as a workaround, but if you like
> it better with force=true, I won't object either.
> 

It feels like the wrong place to manage node permissions. I could be
wrong, but I feel like bitmap commands should stay focused on managing
bitmaps rather than on graph management.

At the very least, I feel like we ought to require 'force' so that
management clients like libvirt can be aware that this will (attempt to)
change the graph in those cases.
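
To illustrate (a purely hypothetical sketch; no such parameter exists
today), the QAPI handler in blockdev.c would grow an optional flag along
these lines:

    /*
     * Sketch only: the existing handler is
     *   qmp_block_dirty_bitmap_enable(const char *node, const char *name,
     *                                 Error **errp);
     * A 'force' flag permitting the temporary read-write reopen of the
     * backing file (as the commit job does) might look like this.
     */
    void qmp_block_dirty_bitmap_enable(const char *node, const char *name,
                                       bool has_force, bool force,
                                       Error **errp);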

Thanks for the suggestion!

>>> [1] I mean, I know that this code is wrong strictly speaking because we
>>> really should be counting read-write users [2] rather than blindly
>>> making the node read-only at the end of the operation - but somehow
>>> it seems to work in practice for commit jobs.
>>>
>>> [2] Counting really means just looking at parent BdrvChild links that
>>> have WRITE permissions. I guess doing it right would mean getting
>>> rid of BlockDriverState.read_only (which is redundant) and using
>>> only permissions.
>>
>> OK, sounds good. I'll make a mockup that tries to accurately detect
>> read-only-ness and reverts changes only if it made any to begin with.
> 
> Fixing it for commit would be appreciated, though as I said it seems to
> be a theoretical case mostly because we never got bug reports for it.
> 
> For bitmaps it's even more theoretical because we hold the BQL between
> the switch to read-write and the switch back. So I don't think we can
> actually run into this case for the bitmap enable/disable operation.
> 

Yes, only a theoretical problem -- but personally I find it confusing to
have competing APIs that don't always agree, so it's just in my nature
to try and "use the one that's more correct."
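
For what it's worth, the "counting" from [2] can be a simple scan over
the parent links; a minimal sketch (the helper itself is hypothetical,
but BdrvChild, the parents list and BLK_PERM_WRITE are existing QEMU
internals):

    /* Does any parent BdrvChild hold WRITE permission on this node? */
    static bool bdrv_has_writable_parent(BlockDriverState *bs)
    {
        BdrvChild *c;

        QLIST_FOREACH(c, &bs->parents, next_parent) {
            if (c->perm & BLK_PERM_WRITE) {
                return true;
            }
        }
        return false;
    }

Only if this returns false for every parent would the commit or bitmap
code be allowed to flip the node back to read-only.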

Thanks again.
--js




qcow2: Zero-initialization of external data files

2020-02-17 Thread Max Reitz
Hi,

AFAIU, external data files with data_file_raw=on are supposed to return
the same data as the qcow2 file when read.  But we still use the qcow2
metadata structures (which are by default initialized to “everything
unallocated”), even though we never ensure that the external data file
is zero, too, so this can happen:

$ dd if=/dev/urandom of=foo.raw bs=1M count=64
[...]

$ sudo losetup -f --show foo.raw
/dev/loop0

$ sudo ./qemu-img create -f qcow2 -o \
data_file=/dev/loop0,data_file_raw=on foo.qcow2 64M
[...]

$ sudo ./qemu-io -c 'read -P 0 0 64M' foo.qcow2
read 67108864/67108864 bytes at offset 0
64 MiB, 1 ops; 00.00 sec (25.036 GiB/sec and 400.5751 ops/sec)

$ sudo ./qemu-io -c 'read -P 0 0 64M' -f raw foo.raw
Pattern verification failed at offset 0, 67108864 bytes
read 67108864/67108864 bytes at offset 0
64 MiB, 1 ops; 00.01 sec (5.547 GiB/sec and 88.7484 ops/sec)

I suppose this behavior is fine for blockdev-create because I guess it’s
the user’s responsibility to ensure that the external data file is zero.
But maybe it isn’t, so that’s my first question: Is it really the
user’s responsibility or should we always ensure it’s zero?

My second question is: If we decide that this is fine for
blockdev-create, should at least qcow2_co_create_opts() ensure the data
file it just created is zero?
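
For the second question, a minimal sketch of what that could look like,
assuming qcow2_co_create_opts() holds a BlockBackend *data_blk for the
data file it just created (the call site, data_blk and the out label are
assumptions; blk_make_zero() is the existing helper):

    /* Hypothetically zero the just-created data file, so that the
     * default "everything unallocated" qcow2 metadata is truthful. */
    ret = blk_make_zero(data_blk, BDRV_REQ_MAY_UNMAP);
    if (ret < 0) {
        error_setg_errno(errp, -ret, "Could not zero the data file");
        goto out;
    }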

Max





Re: [PATCH] q800: fix coverity warning CID 1412799

2020-02-17 Thread Philippe Mathieu-Daudé

Hi Laurent,

Cc'ing qemu-block@

On 2/10/20 3:56 PM, Philippe Mathieu-Daudé wrote:

On 2/10/20 2:22 PM, Laurent Vivier wrote:

Check the return value of blk_pwrite() and log an error if any



Fixes: Coverity CID 1412799 (Error handling issues)


Signed-off-by: Laurent Vivier 
---
  hw/misc/mac_via.c | 7 +--
  1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/hw/misc/mac_via.c b/hw/misc/mac_via.c
index b7d0012794..81343301b1 100644
--- a/hw/misc/mac_via.c
+++ b/hw/misc/mac_via.c
@@ -30,6 +30,7 @@
  #include "hw/qdev-properties.h"
  #include "sysemu/block-backend.h"
  #include "trace.h"
+#include "qemu/log.h"
  /*
   * VIAs: There are two in every machine,
@@ -381,8 +382,10 @@ static void via2_irq_request(void *opaque, int 
irq, int level)

  static void pram_update(MacVIAState *m)
  {
  if (m->blk) {
-    blk_pwrite(m->blk, 0, m->mos6522_via1.PRAM,
-   sizeof(m->mos6522_via1.PRAM), 0);
+    if (blk_pwrite(m->blk, 0, m->mos6522_via1.PRAM,
+   sizeof(m->mos6522_via1.PRAM), 0) < 0) {
+    qemu_log("pram_update: cannot write to file\n");


I am not comfortable reviewing this patch, because it uses an undocumented
function. If I understand the coroutine code well enough, this eventually
calls blk_co_pwritev_part(), which on success returns bdrv_co_pwritev_part(),
which on success returns bdrv_aligned_pwritev(), which itself returns 0
or an errno (as a negative value).


I don't understand how to treat the error, but apparently other devices
do the same (only report some error and discard the block that was not
written).


So this can happen if your filesystem gets full while the VM is running:
the device cannot sync the VIA PRAM but keeps running. You leave the user
the possibility to make some space on the filesystem so that the next
sync will succeed.


This does not seem critical, so I dare to add:
Reviewed-by: Philippe Mathieu-Daudé 

But I'd rather have one of the block folks reviewing this pattern.

Regards,

Phil.


+    }
  }
  }
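
For readers skimming the fragmented quoting above, the patched function,
reassembled from the hunks, reads:

    static void pram_update(MacVIAState *m)
    {
        if (m->blk) {
            if (blk_pwrite(m->blk, 0, m->mos6522_via1.PRAM,
                           sizeof(m->mos6522_via1.PRAM), 0) < 0) {
                qemu_log("pram_update: cannot write to file\n");
            }
        }
    }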






Re: [PATCH v2 1/4] m25p80: Convert to support tracing

2020-02-17 Thread Cédric Le Goater
Hello all, 

On 2/6/20 7:32 PM, Guenter Roeck wrote:
> While at it, add some trace messages to help debug problems
> seen when running the latest Linux kernel.

Through which tree do you think it is best to merge this patchset?
block or arm?

Thanks,

C.


> Signed-off-by: Guenter Roeck 
> ---
> v2: Print pointer to Flash data structure as flash ID with each trace
> message to support systems with more than one instantiated flash.
> 
>  hw/block/m25p80.c | 48 ---
>  hw/block/trace-events | 16 +++
>  2 files changed, 38 insertions(+), 26 deletions(-)
> 
> diff --git a/hw/block/m25p80.c b/hw/block/m25p80.c
> index 61f2fb8f8f..5ff8d270c4 100644
> --- a/hw/block/m25p80.c
> +++ b/hw/block/m25p80.c
> @@ -32,17 +32,7 @@
>  #include "qemu/module.h"
>  #include "qemu/error-report.h"
>  #include "qapi/error.h"
> -
> -#ifndef M25P80_ERR_DEBUG
> -#define M25P80_ERR_DEBUG 0
> -#endif
> -
> -#define DB_PRINT_L(level, ...) do { \
> -if (M25P80_ERR_DEBUG > (level)) { \
> -fprintf(stderr,  ": %s: ", __func__); \
> -fprintf(stderr, ## __VA_ARGS__); \
> -} \
> -} while (0)
> +#include "trace.h"
>  
>  /* Fields for FlashPartInfo->flags */
>  
> @@ -574,7 +564,8 @@ static void flash_erase(Flash *s, int offset, FlashCMD 
> cmd)
>  abort();
>  }
>  
> -DB_PRINT_L(0, "offset = %#x, len = %d\n", offset, len);
> +trace_m25p80_flash_erase(s, offset, len);
> +
>  if ((s->pi->flags & capa_to_assert) != capa_to_assert) {
>  qemu_log_mask(LOG_GUEST_ERROR, "M25P80: %d erase size not supported 
> by"
>" device\n", len);
> @@ -607,8 +598,7 @@ void flash_write8(Flash *s, uint32_t addr, uint8_t data)
>  }
>  
>  if ((prev ^ data) & data) {
> -DB_PRINT_L(1, "programming zero to one! addr=%" PRIx32 "  %" PRIx8
> -   " -> %" PRIx8 "\n", addr, prev, data);
> +trace_m25p80_programming_zero_to_one(s, addr, prev, data);
>  }
>  
>  if (s->pi->flags & EEPROM) {
> @@ -662,6 +652,9 @@ static void complete_collecting_data(Flash *s)
>  
>  s->state = STATE_IDLE;
>  
> +trace_m25p80_complete_collecting(s, s->cmd_in_progress, n, s->ear,
> + s->cur_addr);
> +
>  switch (s->cmd_in_progress) {
>  case DPP:
>  case QPP:
> @@ -825,7 +818,7 @@ static void reset_memory(Flash *s)
>  break;
>  }
>  
> -DB_PRINT_L(0, "Reset done.\n");
> +trace_m25p80_reset_done(s);
>  }
>  
>  static void decode_fast_read_cmd(Flash *s)
> @@ -941,9 +934,10 @@ static void decode_qio_read_cmd(Flash *s)
>  
>  static void decode_new_cmd(Flash *s, uint32_t value)
>  {
> -s->cmd_in_progress = value;
>  int i;
> -DB_PRINT_L(0, "decoded new command:%x\n", value);
> +
> +s->cmd_in_progress = value;
> +trace_m25p80_command_decoded(s, value);
>  
>  if (value != RESET_MEMORY) {
>  s->reset_enable = false;
> @@ -1042,7 +1036,7 @@ static void decode_new_cmd(Flash *s, uint32_t value)
>  break;
>  
>  case JEDEC_READ:
> -DB_PRINT_L(0, "populated jedec code\n");
> +trace_m25p80_populated_jedec(s);
>  for (i = 0; i < s->pi->id_len; i++) {
>  s->data[i] = s->pi->id[i];
>  }
> @@ -1063,7 +1057,7 @@ static void decode_new_cmd(Flash *s, uint32_t value)
>  case BULK_ERASE_60:
>  case BULK_ERASE:
>  if (s->write_enable) {
> -DB_PRINT_L(0, "chip erase\n");
> +trace_m25p80_chip_erase(s);
>  flash_erase(s, 0, BULK_ERASE);
>  } else {
>  qemu_log_mask(LOG_GUEST_ERROR, "M25P80: chip erase with write "
> @@ -1184,7 +1178,7 @@ static int m25p80_cs(SSISlave *ss, bool select)
>  s->data_read_loop = false;
>  }
>  
> -DB_PRINT_L(0, "%sselect\n", select ? "de" : "");
> +trace_m25p80_select(s, select ? "de" : "");
>  
>  return 0;
>  }
> @@ -1194,19 +1188,20 @@ static uint32_t m25p80_transfer8(SSISlave *ss, 
> uint32_t tx)
>  Flash *s = M25P80(ss);
>  uint32_t r = 0;
>  
> +trace_m25p80_transfer(s, s->state, s->len, s->needed_bytes, s->pos,
> +  s->cur_addr, (uint8_t)tx);
> +
>  switch (s->state) {
>  
>  case STATE_PAGE_PROGRAM:
> -DB_PRINT_L(1, "page program cur_addr=%#" PRIx32 " data=%" PRIx8 "\n",
> -   s->cur_addr, (uint8_t)tx);
> +trace_m25p80_page_program(s, s->cur_addr, (uint8_t)tx);
>  flash_write8(s, s->cur_addr, (uint8_t)tx);
>  s->cur_addr = (s->cur_addr + 1) & (s->size - 1);
>  break;
>  
>  case STATE_READ:
>  r = s->storage[s->cur_addr];
> -DB_PRINT_L(1, "READ 0x%" PRIx32 "=%" PRIx8 "\n", s->cur_addr,
> -   (uint8_t)r);
> +trace_m25p80_read_byte(s, s->cur_addr, (uint8_t)r);
>  s->cur_addr = (s->cur_addr + 1) & (s->size - 1);
>  break;
>  
> @@ -1244,6 +1239,7 @@ static uint32_t 
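
The hw/block/trace-events hunk is cut off above. For orientation,
entries in that file take this general shape (illustrative only, not
the exact lines from the patch):

    # hw/block/m25p80.c
    m25p80_flash_erase(void *s, int offset, uint32_t len) "[%p] offset = 0x%x, len = %u"
    m25p80_reset_done(void *s) "[%p] reset done"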

[PATCH v2 01/22] migration/block-dirty-bitmap: fix dirty_bitmap_mig_before_vm_start

2020-02-17 Thread Vladimir Sementsov-Ogievskiy
No reason to use the _locked version of bdrv_enable_dirty_bitmap(), as we
don't hold this mutex beforehand. Moreover, the adjacent
bdrv_dirty_bitmap_enable_successor() does lock the mutex.

Fixes: 58f72b965e9e1
Cc: qemu-sta...@nongnu.org # v3.0
Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 migration/block-dirty-bitmap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 7eafface61..16f1793ee3 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -498,7 +498,7 @@ void dirty_bitmap_mig_before_vm_start(void)
 DirtyBitmapLoadBitmapState *b = item->data;
 
 if (b->migrated) {
-bdrv_enable_dirty_bitmap_locked(b->bitmap);
+bdrv_enable_dirty_bitmap(b->bitmap);
 } else {
 bdrv_dirty_bitmap_enable_successor(b->bitmap);
 }
-- 
2.21.0




[PATCH v2 03/22] migration/block-dirty-bitmap: rename dirty_bitmap_mig_cleanup

2020-02-17 Thread Vladimir Sementsov-Ogievskiy
Rename dirty_bitmap_mig_cleanup() to dirty_bitmap_do_save_cleanup(), to
stress that it is part of the save path.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 migration/block-dirty-bitmap.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 73792ab005..4e8959ae52 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -259,7 +259,7 @@ static void send_bitmap_bits(QEMUFile *f, SaveBitmapState 
*dbms,
 }
 
 /* Called with iothread lock taken.  */
-static void dirty_bitmap_mig_cleanup(void)
+static void dirty_bitmap_do_save_cleanup(void)
 {
 SaveBitmapState *dbms;
 
@@ -338,7 +338,7 @@ static int init_dirty_bitmap_migration(void)
 return 0;
 
 fail:
-dirty_bitmap_mig_cleanup();
+dirty_bitmap_do_save_cleanup();
 
 return -1;
 }
@@ -377,7 +377,7 @@ static void bulk_phase(QEMUFile *f, bool limit)
 /* for SaveVMHandlers */
 static void dirty_bitmap_save_cleanup(void *opaque)
 {
-dirty_bitmap_mig_cleanup();
+dirty_bitmap_do_save_cleanup();
 }
 
 static int dirty_bitmap_save_iterate(QEMUFile *f, void *opaque)
@@ -412,7 +412,7 @@ static int dirty_bitmap_save_complete(QEMUFile *f, void 
*opaque)
 
 trace_dirty_bitmap_save_complete_finish();
 
-dirty_bitmap_mig_cleanup();
+dirty_bitmap_do_save_cleanup();
 return 0;
 }
 
-- 
2.21.0




[PATCH v2 09/22] migration/block-dirty-bitmap: relax error handling in incoming part

2020-02-17 Thread Vladimir Sementsov-Ogievskiy
Bitmap data is not critical, and we should not fail the migration (or
trigger postcopy recovery) because of a dirty-bitmap migration failure.
Instead we should just lose the unfinished bitmaps.

Still, we have to report I/O stream violation errors, as they affect the
whole migration stream.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 migration/block-dirty-bitmap.c | 148 +
 1 file changed, 113 insertions(+), 35 deletions(-)

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 1329db8d7d..aea5326804 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -145,6 +145,15 @@ typedef struct DBMLoadState {
 
 bool before_vm_start_handled; /* set in dirty_bitmap_mig_before_vm_start */
 
+/*
+ * cancelled
+ * Incoming migration is cancelled for some reason. That means that we
+ * still should read our chunks from migration stream, to not affect other
+ * migration objects (like RAM), but just ignore them and do not touch any
+ * bitmaps or nodes.
+ */
+bool cancelled;
+
 GSList *bitmaps;
 QemuMutex lock; /* protect bitmaps */
 } DBMLoadState;
@@ -545,13 +554,47 @@ void dirty_bitmap_mig_before_vm_start(void)
 qemu_mutex_unlock(&s->lock);
 }
 
+static void cancel_incoming_locked(DBMLoadState *s)
+{
+GSList *item;
+
+if (s->cancelled) {
+return;
+}
+
+s->cancelled = true;
+s->bs = NULL;
+s->bitmap = NULL;
+
+/* Drop all unfinished bitmaps */
+for (item = s->bitmaps; item; item = g_slist_next(item)) {
+LoadBitmapState *b = item->data;
+
+/*
+ * Bitmap must be unfinished, as finished bitmaps should already be
+ * removed from the list.
+ */
+assert(!s->before_vm_start_handled || !b->migrated);
+if (bdrv_dirty_bitmap_has_successor(b->bitmap)) {
+bdrv_reclaim_dirty_bitmap(b->bitmap, &error_abort);
+}
+bdrv_release_dirty_bitmap(b->bitmap);
+}
+
+g_slist_free_full(s->bitmaps, g_free);
+s->bitmaps = NULL;
+}
+
 static void dirty_bitmap_load_complete(QEMUFile *f, DBMLoadState *s)
 {
 GSList *item;
 trace_dirty_bitmap_load_complete();
-bdrv_dirty_bitmap_deserialize_finish(s->bitmap);
 
-qemu_mutex_lock(&s->lock);
+if (s->cancelled) {
+return;
+}
+
+bdrv_dirty_bitmap_deserialize_finish(s->bitmap);
 
 if (bdrv_dirty_bitmap_has_successor(s->bitmap)) {
 bdrv_reclaim_dirty_bitmap(s->bitmap, &error_abort);
@@ -569,8 +612,6 @@ static void dirty_bitmap_load_complete(QEMUFile *f, 
DBMLoadState *s)
 break;
 }
 }
-
-qemu_mutex_unlock(&s->lock);
 }
 
 static int dirty_bitmap_load_bits(QEMUFile *f, DBMLoadState *s)
@@ -582,15 +623,32 @@ static int dirty_bitmap_load_bits(QEMUFile *f, 
DBMLoadState *s)
 
 if (s->flags & DIRTY_BITMAP_MIG_FLAG_ZEROES) {
 trace_dirty_bitmap_load_bits_zeroes();
-bdrv_dirty_bitmap_deserialize_zeroes(s->bitmap, first_byte, nr_bytes,
- false);
+if (!s->cancelled) {
+bdrv_dirty_bitmap_deserialize_zeroes(s->bitmap, first_byte,
+ nr_bytes, false);
+}
 } else {
 size_t ret;
 uint8_t *buf;
 uint64_t buf_size = qemu_get_be64(f);
-uint64_t needed_size =
-bdrv_dirty_bitmap_serialization_size(s->bitmap,
- first_byte, nr_bytes);
+uint64_t needed_size;
+
+buf = g_malloc(buf_size);
+ret = qemu_get_buffer(f, buf, buf_size);
+if (ret != buf_size) {
+error_report("Failed to read bitmap bits");
+g_free(buf);
+return -EIO;
+}
+
+if (s->cancelled) {
+g_free(buf);
+return 0;
+}
+
+needed_size = bdrv_dirty_bitmap_serialization_size(s->bitmap,
+   first_byte,
+   nr_bytes);
 
 if (needed_size > buf_size ||
 buf_size > QEMU_ALIGN_UP(needed_size, 4 * sizeof(long))
@@ -599,15 +657,8 @@ static int dirty_bitmap_load_bits(QEMUFile *f, 
DBMLoadState *s)
 error_report("Migrated bitmap granularity doesn't "
  "match the destination bitmap '%s' granularity",
  bdrv_dirty_bitmap_name(s->bitmap));
-return -EINVAL;
-}
-
-buf = g_malloc(buf_size);
-ret = qemu_get_buffer(f, buf, buf_size);
-if (ret != buf_size) {
-error_report("Failed to read bitmap bits");
-g_free(buf);
-return -EIO;
+cancel_incoming_locked(s);
+return 0;
 }
 
 bdrv_dirty_bitmap_deserialize_part(s->bitmap, buf, first_byte, 
nr_bytes,
@@ -632,14 +683,16 @@ static int dirty_bitmap_load_header(QEMUFile 

Re: Cross-project NBD extension proposal: NBD_INFO_INIT_STATE

2020-02-17 Thread Max Reitz
Hi,

It’s my understanding that without some is_zero infrastructure for QEMU,
it’s impossible to implement this flag in qemu’s NBD server.

At the same time, I still haven’t understood what we need the flag for.

As far as I understood in our discussion on your qemu series, there is
no case where anyone would need to know whether an image is zero.  All
practical cases involve someone having to ensure that some image is
zero.  Knowing whether an image is zero can help with that, but that can
be an implementation detail.

For qcow2, the idea would be that there is some flag that remains true
as long as the image is guaranteed to be zero.  Then we’d have some
bdrv_make_zero function, and qcow2’s implementation would use this
information to gauge whether there’s something to do at all.

For NBD, we cannot use this idea directly because to implement such a
flag (as you’re describing in this mail), we’d need separate is_zero
infrastructure, and that kind of makes the point of “drivers’
bdrv_make_zero() implementations do the right thing by themselves” moot.

OTOH, we wouldn’t need such a flag for the implementation, because we
could just send a 64-bit discard/make_zero over the whole block device
length to the NBD server, and then the server internally does the right
thing(TM).  AFAIU discard and write_zeroes currently have only 32 bit
length fields, but there were plans for adding support for 64 bit
versions anyway.  From my naïve outsider perspective, doing that doesn’t
seem a more complicated protocol addition than adding some way to tell
whether an NBD export is zero.

So I’m still wondering whether there are actually cases where we need to
tell whether some image or NBD export is zero that do not involve making
it zero if it isn’t.

(I keep asking because it seems to me that if all we ever really want to
do is to ensure that some images/exports are zero, we should implement
that.)
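
As a concrete sketch of that "just make it zero" flow on the client
side (the wrapper is hypothetical; bdrv_make_zero() is the existing
helper):

    /* Ensure the image reads as zero without first asking whether it
     * already does; a driver that tracks an "all zero" flag can turn
     * this into a no-op internally. */
    static int ensure_image_zero(BdrvChild *child, Error **errp)
    {
        int ret = bdrv_make_zero(child, BDRV_REQ_MAY_UNMAP);

        if (ret < 0) {
            error_setg_errno(errp, -ret, "Failed to zero the image");
        }
        return ret;
    }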

Max





[PATCH v2 13/22] qemu-iotests/199: drop extra constraints

2020-02-17 Thread Vladimir Sementsov-Ogievskiy
We don't need any specific format constraints here. Still, keep qcow2
for two reasons:
1. Avoid extra runs of a format-unrelated test.
2. We will add some checks around persistent bitmaps in the future
   (which require qcow2).

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 tests/qemu-iotests/199 | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index de9ba8d94c..dda918450a 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -116,5 +116,4 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 
 
 if __name__ == '__main__':
-iotests.main(supported_fmts=['qcow2'], supported_cache_modes=['none'],
- supported_protocols=['file'])
+iotests.main(supported_fmts=['qcow2'])
-- 
2.21.0




[PATCH v2 15/22] qemu-iotests/199: improve performance: set bitmap by discard

2020-02-17 Thread Vladimir Sementsov-Ogievskiy
Discard dirties the dirty bitmap just as write does, but works faster.
Let's use it instead.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 tests/qemu-iotests/199 | 31 ---
 1 file changed, 20 insertions(+), 11 deletions(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index 6599fc6fb4..d78f81b71c 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -67,8 +67,10 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 os.mkfifo(fifo)
 qemu_img('create', '-f', iotests.imgfmt, disk_a, size)
 qemu_img('create', '-f', iotests.imgfmt, disk_b, size)
-self.vm_a = iotests.VM(path_suffix='a').add_drive(disk_a)
-self.vm_b = iotests.VM(path_suffix='b').add_drive(disk_b)
+self.vm_a = iotests.VM(path_suffix='a').add_drive(disk_a,
+  'discard=unmap')
+self.vm_b = iotests.VM(path_suffix='b').add_drive(disk_b,
+  'discard=unmap')
 self.vm_b.add_incoming("exec: cat '" + fifo + "'")
 self.vm_a.launch()
 self.vm_b.launch()
@@ -78,7 +80,7 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 self.vm_b_events = []
 
 def test_postcopy(self):
-write_size = 0x40000000
+discard_size = 0x40000000
 granularity = 512
 chunk = 4096
 
@@ -86,25 +88,32 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
name='bitmap', granularity=granularity)
 self.assert_qmp(result, 'return', {})
 
+result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',
+   node='drive0', name='bitmap')
+empty_sha256 = result['return']['sha256']
+
 s = 0
-while s < write_size:
-self.vm_a.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
+while s < discard_size:
+self.vm_a.hmp_qemu_io('drive0', 'discard %d %d' % (s, chunk))
 s += 0x10000
 s = 0x8000000
-while s < write_size:
-self.vm_a.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
+while s < discard_size:
+self.vm_a.hmp_qemu_io('drive0', 'discard %d %d' % (s, chunk))
 s += 0x10000
 
 result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',
node='drive0', name='bitmap')
 sha256 = result['return']['sha256']
 
+# Check, that updating the bitmap by discards works
+assert sha256 != empty_sha256
+
 result = self.vm_a.qmp('block-dirty-bitmap-clear', node='drive0',
name='bitmap')
 self.assert_qmp(result, 'return', {})
 s = 0
-while s < write_size:
-self.vm_a.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
+while s < discard_size:
+self.vm_a.hmp_qemu_io('drive0', 'discard %d %d' % (s, chunk))
 s += 0x10000
 
 caps = [{'capability': 'dirty-bitmaps', 'state': True},
@@ -126,8 +135,8 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 self.vm_b_events.append(e_resume)
 
 s = 0x8000000
-while s < write_size:
-self.vm_b.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
+while s < discard_size:
+self.vm_b.hmp_qemu_io('drive0', 'discard %d %d' % (s, chunk))
 s += 0x10000
 
 match = {'data': {'status': 'completed'}}
-- 
2.21.0




[PATCH v2 08/22] migration/block-dirty-bitmap: keep bitmap state for all bitmaps

2020-02-17 Thread Vladimir Sementsov-Ogievskiy
Keep bitmap state for disabled bitmaps too, and keep the state until
the end of the process. It's needed for the following commit, which
implements bitmap postcopy canceling.

To clean up the new list, the following logic is used:
We need two events to consider a bitmap's migration finished:
1. a chunk with the DIRTY_BITMAP_MIG_FLAG_COMPLETE flag is received
2. dirty_bitmap_mig_before_vm_start is called
These two events may come in any order, so we track which one comes
last, and on the last of them we remove the bitmap's migration state
from the list.
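
In other words, whichever event comes second frees the state. Stripped
to its core, using the patch's own types (a sketch, not the exact code
from the diff below):

    /* Called from both event handlers under s->lock; the entry is
     * removed and freed only once both events have happened. */
    static void maybe_remove_bitmap_state(DBMLoadState *s, LoadBitmapState *b)
    {
        if (b->migrated && s->before_vm_start_handled) {
            s->bitmaps = g_slist_remove(s->bitmaps, b);
            g_free(b);
        }
    }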

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 migration/block-dirty-bitmap.c | 64 +++---
 1 file changed, 43 insertions(+), 21 deletions(-)

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 9cc750d93b..1329db8d7d 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -132,6 +132,7 @@ typedef struct LoadBitmapState {
 BlockDriverState *bs;
 BdrvDirtyBitmap *bitmap;
 bool migrated;
+bool enabled;
 } LoadBitmapState;
 
 /* State of the dirty bitmap migration (DBM) during load process */
@@ -142,8 +143,10 @@ typedef struct DBMLoadState {
 BlockDriverState *bs;
 BdrvDirtyBitmap *bitmap;
 
-GSList *enabled_bitmaps;
-QemuMutex lock; /* protect enabled_bitmaps */
+bool before_vm_start_handled; /* set in dirty_bitmap_mig_before_vm_start */
+
+GSList *bitmaps;
+QemuMutex lock; /* protect bitmaps */
 } DBMLoadState;
 
 typedef struct DBMState {
@@ -458,6 +461,7 @@ static int dirty_bitmap_load_start(QEMUFile *f, 
DBMLoadState *s)
 Error *local_err = NULL;
 uint32_t granularity = qemu_get_be32(f);
 uint8_t flags = qemu_get_byte(f);
+LoadBitmapState *b;
 
 if (s->bitmap) {
 error_report("Bitmap with the same name ('%s') already exists on "
@@ -484,45 +488,59 @@ static int dirty_bitmap_load_start(QEMUFile *f, 
DBMLoadState *s)
 
 bdrv_disable_dirty_bitmap(s->bitmap);
 if (flags & DIRTY_BITMAP_MIG_START_FLAG_ENABLED) {
-LoadBitmapState *b;
-
 bdrv_dirty_bitmap_create_successor(s->bitmap, &local_err);
 if (local_err) {
 error_report_err(local_err);
 return -EINVAL;
 }
-
-b = g_new(LoadBitmapState, 1);
-b->bs = s->bs;
-b->bitmap = s->bitmap;
-b->migrated = false;
-s->enabled_bitmaps = g_slist_prepend(s->enabled_bitmaps, b);
 }
 
+b = g_new(LoadBitmapState, 1);
+b->bs = s->bs;
+b->bitmap = s->bitmap;
+b->migrated = false;
+b->enabled = flags & DIRTY_BITMAP_MIG_START_FLAG_ENABLED;
+
+s->bitmaps = g_slist_prepend(s->bitmaps, b);
+
 return 0;
 }
 
-void dirty_bitmap_mig_before_vm_start(void)
+/*
+ * before_vm_start_handle_item
+ *
+ * g_slist_foreach helper
+ *
+ * item is LoadBitmapState*
+ * opaque is DBMLoadState*
+ */
+static void before_vm_start_handle_item(void *item, void *opaque)
 {
-DBMLoadState *s = &dbm_state.load;
-GSList *item;
-
-qemu_mutex_lock(&s->lock);
-
-for (item = s->enabled_bitmaps; item; item = g_slist_next(item)) {
-LoadBitmapState *b = item->data;
+DBMLoadState *s = opaque;
+LoadBitmapState *b = item;
 
+if (b->enabled) {
 if (b->migrated) {
 bdrv_enable_dirty_bitmap(b->bitmap);
 } else {
 bdrv_dirty_bitmap_enable_successor(b->bitmap);
 }
+}
 
+if (b->migrated) {
+s->bitmaps = g_slist_remove(s->bitmaps, b);
 g_free(b);
 }
+}
 
-g_slist_free(s->enabled_bitmaps);
-s->enabled_bitmaps = NULL;
+void dirty_bitmap_mig_before_vm_start(void)
+{
+DBMLoadState *s = &dbm_state.load;
+qemu_mutex_lock(&s->lock);
+
+assert(!s->before_vm_start_handled);
+g_slist_foreach(s->bitmaps, before_vm_start_handle_item, s);
+s->before_vm_start_handled = true;
 
 qemu_mutex_unlock(&s->lock);
 }
@@ -539,11 +557,15 @@ static void dirty_bitmap_load_complete(QEMUFile *f, 
DBMLoadState *s)
 bdrv_reclaim_dirty_bitmap(s->bitmap, &error_abort);
 }
 
-for (item = s->enabled_bitmaps; item; item = g_slist_next(item)) {
+for (item = s->bitmaps; item; item = g_slist_next(item)) {
 LoadBitmapState *b = item->data;
 
 if (b->bitmap == s->bitmap) {
 b->migrated = true;
+if (s->before_vm_start_handled) {
+s->bitmaps = g_slist_remove(s->bitmaps, b);
+g_free(b);
+}
 break;
 }
 }
-- 
2.21.0




[PATCH v2 02/22] migration/block-dirty-bitmap: rename state structure types

2020-02-17 Thread Vladimir Sementsov-Ogievskiy
Rename the types to be symmetrical between the load and save parts, and
shorter.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 migration/block-dirty-bitmap.c | 68 ++
 1 file changed, 36 insertions(+), 32 deletions(-)

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 16f1793ee3..73792ab005 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -100,23 +100,25 @@
 /* 0x04 was "AUTOLOAD" flags on elder versions, no it is ignored */
 #define DIRTY_BITMAP_MIG_START_FLAG_RESERVED_MASK0xf8
 
-typedef struct DirtyBitmapMigBitmapState {
+/* State of one bitmap during save process */
+typedef struct SaveBitmapState {
 /* Written during setup phase. */
 BlockDriverState *bs;
 const char *node_name;
 BdrvDirtyBitmap *bitmap;
 uint64_t total_sectors;
 uint64_t sectors_per_chunk;
-QSIMPLEQ_ENTRY(DirtyBitmapMigBitmapState) entry;
+QSIMPLEQ_ENTRY(SaveBitmapState) entry;
 uint8_t flags;
 
 /* For bulk phase. */
 bool bulk_completed;
 uint64_t cur_sector;
-} DirtyBitmapMigBitmapState;
+} SaveBitmapState;
 
-typedef struct DirtyBitmapMigState {
-QSIMPLEQ_HEAD(, DirtyBitmapMigBitmapState) dbms_list;
+/* State of the dirty bitmap migration (DBM) during save process */
+typedef struct DBMSaveState {
+QSIMPLEQ_HEAD(, SaveBitmapState) dbms_list;
 
 bool bulk_completed;
 bool no_bitmaps;
@@ -124,23 +126,25 @@ typedef struct DirtyBitmapMigState {
 /* for send_bitmap_bits() */
 BlockDriverState *prev_bs;
 BdrvDirtyBitmap *prev_bitmap;
-} DirtyBitmapMigState;
+} DBMSaveState;
 
-typedef struct DirtyBitmapLoadState {
+/* State of the dirty bitmap migration (DBM) during load process */
+typedef struct DBMLoadState {
 uint32_t flags;
 char node_name[256];
 char bitmap_name[256];
 BlockDriverState *bs;
 BdrvDirtyBitmap *bitmap;
-} DirtyBitmapLoadState;
+} DBMLoadState;
 
-static DirtyBitmapMigState dirty_bitmap_mig_state;
+static DBMSaveState dirty_bitmap_mig_state;
 
-typedef struct DirtyBitmapLoadBitmapState {
+/* State of one bitmap during load process */
+typedef struct LoadBitmapState {
 BlockDriverState *bs;
 BdrvDirtyBitmap *bitmap;
 bool migrated;
-} DirtyBitmapLoadBitmapState;
+} LoadBitmapState;
 static GSList *enabled_bitmaps;
 QemuMutex finish_lock;
 
@@ -170,7 +174,7 @@ static void qemu_put_bitmap_flags(QEMUFile *f, uint32_t 
flags)
 qemu_put_byte(f, flags);
 }
 
-static void send_bitmap_header(QEMUFile *f, DirtyBitmapMigBitmapState *dbms,
+static void send_bitmap_header(QEMUFile *f, SaveBitmapState *dbms,
uint32_t additional_flags)
 {
 BlockDriverState *bs = dbms->bs;
@@ -199,19 +203,19 @@ static void send_bitmap_header(QEMUFile *f, 
DirtyBitmapMigBitmapState *dbms,
 }
 }
 
-static void send_bitmap_start(QEMUFile *f, DirtyBitmapMigBitmapState *dbms)
+static void send_bitmap_start(QEMUFile *f, SaveBitmapState *dbms)
 {
 send_bitmap_header(f, dbms, DIRTY_BITMAP_MIG_FLAG_START);
 qemu_put_be32(f, bdrv_dirty_bitmap_granularity(dbms->bitmap));
 qemu_put_byte(f, dbms->flags);
 }
 
-static void send_bitmap_complete(QEMUFile *f, DirtyBitmapMigBitmapState *dbms)
+static void send_bitmap_complete(QEMUFile *f, SaveBitmapState *dbms)
 {
 send_bitmap_header(f, dbms, DIRTY_BITMAP_MIG_FLAG_COMPLETE);
 }
 
-static void send_bitmap_bits(QEMUFile *f, DirtyBitmapMigBitmapState *dbms,
+static void send_bitmap_bits(QEMUFile *f, SaveBitmapState *dbms,
  uint64_t start_sector, uint32_t nr_sectors)
 {
 /* align for buffer_is_zero() */
@@ -257,7 +261,7 @@ static void send_bitmap_bits(QEMUFile *f, 
DirtyBitmapMigBitmapState *dbms,
 /* Called with iothread lock taken.  */
 static void dirty_bitmap_mig_cleanup(void)
 {
-DirtyBitmapMigBitmapState *dbms;
+SaveBitmapState *dbms;
 
 while ((dbms = QSIMPLEQ_FIRST(&dirty_bitmap_mig_state.dbms_list)) != NULL) {
 QSIMPLEQ_REMOVE_HEAD(&dirty_bitmap_mig_state.dbms_list, entry);
@@ -272,7 +276,7 @@ static int init_dirty_bitmap_migration(void)
 {
 BlockDriverState *bs;
 BdrvDirtyBitmap *bitmap;
-DirtyBitmapMigBitmapState *dbms;
+SaveBitmapState *dbms;
 Error *local_err = NULL;
 
 dirty_bitmap_mig_state.bulk_completed = false;
@@ -303,7 +307,7 @@ static int init_dirty_bitmap_migration(void)
 bdrv_ref(bs);
 bdrv_dirty_bitmap_set_busy(bitmap, true);
 
-dbms = g_new0(DirtyBitmapMigBitmapState, 1);
+dbms = g_new0(SaveBitmapState, 1);
 dbms->bs = bs;
 dbms->node_name = name;
 dbms->bitmap = bitmap;
@@ -340,7 +344,7 @@ fail:
 }
 
 /* Called with no lock taken.  */
-static void bulk_phase_send_chunk(QEMUFile *f, DirtyBitmapMigBitmapState *dbms)
+static void bulk_phase_send_chunk(QEMUFile *f, SaveBitmapState *dbms)
 {
 uint32_t nr_sectors = MIN(dbms->total_sectors - dbms->cur_sector,
  

[PATCH v2 04/22] migration/block-dirty-bitmap: move mutex init to dirty_bitmap_mig_init

2020-02-17 Thread Vladimir Sementsov-Ogievskiy
No reason to keep two public init functions.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 migration/migration.h  | 1 -
 migration/block-dirty-bitmap.c | 6 +-
 migration/migration.c  | 2 --
 3 files changed, 1 insertion(+), 8 deletions(-)

diff --git a/migration/migration.h b/migration/migration.h
index 8473ddfc88..2948f2387b 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -332,7 +332,6 @@ void migrate_send_rp_recv_bitmap(MigrationIncomingState 
*mis,
 void migrate_send_rp_resume_ack(MigrationIncomingState *mis, uint32_t value);
 
 void dirty_bitmap_mig_before_vm_start(void);
-void init_dirty_bitmap_incoming_migration(void);
 void migrate_add_address(SocketAddress *address);
 
 int foreach_not_ignored_block(RAMBlockIterFunc func, void *opaque);
diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 4e8959ae52..49d4cf8810 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -148,11 +148,6 @@ typedef struct LoadBitmapState {
 static GSList *enabled_bitmaps;
 QemuMutex finish_lock;
 
-void init_dirty_bitmap_incoming_migration(void)
-{
-qemu_mutex_init(&finish_lock);
-}
-
 static uint32_t qemu_get_bitmap_flags(QEMUFile *f)
 {
 uint8_t flags = qemu_get_byte(f);
@@ -733,6 +728,7 @@ static SaveVMHandlers savevm_dirty_bitmap_handlers = {
 void dirty_bitmap_mig_init(void)
 {
 QSIMPLEQ_INIT(&dirty_bitmap_mig_state.dbms_list);
+qemu_mutex_init(&finish_lock);
 
 register_savevm_live("dirty-bitmap", 0, 1,
 &savevm_dirty_bitmap_handlers,
diff --git a/migration/migration.c b/migration/migration.c
index 8fb68795dc..515047932c 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -158,8 +158,6 @@ void migration_object_init(void)
 qemu_sem_init(&current_incoming->postcopy_pause_sem_dst, 0);
 qemu_sem_init(&current_incoming->postcopy_pause_sem_fault, 0);
 
-init_dirty_bitmap_incoming_migration();
-
 if (!migration_object_check(current_migration, &err)) {
 error_report_err(err);
 exit(1);
-- 
2.21.0




[PATCH v2 00/22] Fix error handling during bitmap postcopy

2020-02-17 Thread Vladimir Sementsov-Ogievskiy
The original idea of bitmap postcopy migration is that bitmaps are
non-critical data, and their loss is not a serious problem. So, when
using the postcopy method, on any failure we should just drop the
unfinished bitmaps and continue guest execution.

However, it doesn't work that way. It crashes, fails, or falls into the
postcopy-recovery feature. It does anything except the behavior we want.
This series fixes at least some of the problems with error handling
during bitmap migration postcopy.

v1 was "[PATCH 0/7] Fix crashes on early shutdown during bitmaps postcopy"

v2:

Most of the patches are new or have changed a lot.
Only patches 06 and 07 are mostly unchanged, just rebased on the
refactorings.

Vladimir Sementsov-Ogievskiy (22):
  migration/block-dirty-bitmap: fix dirty_bitmap_mig_before_vm_start
  migration/block-dirty-bitmap: rename state structure types
  migration/block-dirty-bitmap: rename dirty_bitmap_mig_cleanup
  migration/block-dirty-bitmap: move mutex init to dirty_bitmap_mig_init
  migration/block-dirty-bitmap: refactor state global variables
  migration/block-dirty-bitmap: rename finish_lock to just lock
  migration/block-dirty-bitmap: simplify dirty_bitmap_load_complete
  migration/block-dirty-bitmap: keep bitmap state for all bitmaps
  migration/block-dirty-bitmap: relax error handling in incoming part
  migration/block-dirty-bitmap: cancel migration on shutdown
  migration/savevm: don't worry if bitmap migration postcopy failed
  qemu-iotests/199: fix style
  qemu-iotests/199: drop extra constraints
  qemu-iotests/199: better catch postcopy time
  qemu-iotests/199: improve performance: set bitmap by discard
  qemu-iotests/199: change discard patterns
  qemu-iotests/199: increase postcopy period
  python/qemu/machine: add kill() method
  qemu-iotests/199: prepare for new test-cases addition
  qemu-iotests/199: check persistent bitmaps
  qemu-iotests/199: add early shutdown case to bitmaps postcopy
  qemu-iotests/199: add source-killed case to bitmaps postcopy

Cc: John Snow 
Cc: Vladimir Sementsov-Ogievskiy 
Cc: Stefan Hajnoczi 
Cc: Fam Zheng 
Cc: Juan Quintela 
Cc: "Dr. David Alan Gilbert" 
Cc: Eduardo Habkost 
Cc: Cleber Rosa 
Cc: Kevin Wolf 
Cc: Max Reitz 
Cc: qemu-block@nongnu.org
Cc: qemu-de...@nongnu.org
Cc: qemu-sta...@nongnu.org # for patch 01

 migration/migration.h  |   3 +-
 migration/block-dirty-bitmap.c | 444 +
 migration/migration.c  |  15 +-
 migration/savevm.c |  37 ++-
 python/qemu/machine.py |  12 +-
 tests/qemu-iotests/199 | 244 ++
 tests/qemu-iotests/199.out |   4 +-
 7 files changed, 529 insertions(+), 230 deletions(-)

-- 
2.21.0




[PATCH v2 17/22] qemu-iotests/199: increase postcopy period

2020-02-17 Thread Vladimir Sementsov-Ogievskiy
The test wants to force bitmap postcopy. Still, the resulting postcopy
period is very small. Let's increase it by adding more bitmaps to
migrate. Also, test migration of disabled bitmaps.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 tests/qemu-iotests/199 | 58 --
 1 file changed, 39 insertions(+), 19 deletions(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index 7914fd0b2b..9a6e8dcb9d 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -103,29 +103,45 @@ class 
TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 
 def test_postcopy(self):
 granularity = 512
+nb_bitmaps = 15
 
-result = self.vm_a.qmp('block-dirty-bitmap-add', node='drive0',
-   name='bitmap', granularity=granularity)
-self.assert_qmp(result, 'return', {})
+for i in range(nb_bitmaps):
+result = self.vm_a.qmp('block-dirty-bitmap-add', node='drive0',
+   name='bitmap{}'.format(i),
+   granularity=granularity)
+self.assert_qmp(result, 'return', {})
 
 result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',
-   node='drive0', name='bitmap')
+   node='drive0', name='bitmap0')
 empty_sha256 = result['return']['sha256']
 
-apply_discards(self.vm_a, discards1 + discards2)
+apply_discards(self.vm_a, discards1)
 
 result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',
-   node='drive0', name='bitmap')
-sha256 = result['return']['sha256']
+   node='drive0', name='bitmap0')
+discards1_sha256 = result['return']['sha256']
 
 # Check, that updating the bitmap by discards works
-assert sha256 != empty_sha256
+assert discards1_sha256 != empty_sha256
 
-result = self.vm_a.qmp('block-dirty-bitmap-clear', node='drive0',
-   name='bitmap')
-self.assert_qmp(result, 'return', {})
+# We want to calculate resulting sha256. Do it in bitmap0, so, disable
+# other bitmaps
+for i in range(1, nb_bitmaps):
+result = self.vm_a.qmp('block-dirty-bitmap-disable', node='drive0',
+   name='bitmap{}'.format(i))
+self.assert_qmp(result, 'return', {})
 
-apply_discards(self.vm_a, discards1)
+apply_discards(self.vm_a, discards2)
+
+result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',
+   node='drive0', name='bitmap0')
+all_discards_sha256 = result['return']['sha256']
+
+# Now, enable some bitmaps, to be updated during migration
+for i in range(2, nb_bitmaps, 2):
+result = self.vm_a.qmp('block-dirty-bitmap-enable', node='drive0',
+   name='bitmap{}'.format(i))
+self.assert_qmp(result, 'return', {})
 
 caps = [{'capability': 'dirty-bitmaps', 'state': True},
 {'capability': 'events', 'state': True}]
@@ -145,6 +161,7 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 e_resume = self.vm_b.event_wait('RESUME')
 self.vm_b_events.append(e_resume)
 
+# enabled bitmaps should be updated
 apply_discards(self.vm_b, discards2)
 
 match = {'data': {'status': 'completed'}}
@@ -158,7 +175,7 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 downtime = event_dist(e_stop, e_resume)
 postcopy_time = event_dist(e_resume, e_complete)
 
-# TODO: assert downtime * 10 < postcopy_time
+assert downtime * 10 < postcopy_time
 if debug:
 print('downtime:', downtime)
 print('postcopy_time:', postcopy_time)
@@ -166,12 +183,15 @@ class 
TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 # Assert that bitmap migration is finished (check that successor bitmap
 # is removed)
 result = self.vm_b.qmp('query-block')
-assert len(result['return'][0]['dirty-bitmaps']) == 1
-
-# Check content of migrated (and updated by new writes) bitmap
-result = self.vm_b.qmp('x-debug-block-dirty-bitmap-sha256',
-   node='drive0', name='bitmap')
-self.assert_qmp(result, 'return/sha256', sha256)
+assert len(result['return'][0]['dirty-bitmaps']) == nb_bitmaps
+
+# Check content of migrated bitmaps. Still, don't waste time checking
+# every bitmap
+for i in range(0, nb_bitmaps, 5):
+result = self.vm_b.qmp('x-debug-block-dirty-bitmap-sha256',
+   node='drive0', name='bitmap{}'.format(i))
+sha256 = discards1_sha256 if i % 2 else all_discards_sha256
+self.assert_qmp(result, 'return/sha256', sha256)
 
 
 if __name__ == 

[PATCH v2 19/22] qemu-iotests/199: prepare for new test-cases addition

2020-02-17 Thread Vladimir Sementsov-Ogievskiy
Move the future common part into a start_postcopy() method. Move the
check of the number of bitmaps into check_bitmaps().

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 tests/qemu-iotests/199 | 36 +++-
 1 file changed, 23 insertions(+), 13 deletions(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index 9a6e8dcb9d..969620b103 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -29,6 +29,8 @@ disk_b = os.path.join(iotests.test_dir, 'disk_b')
 size = '256G'
 fifo = os.path.join(iotests.test_dir, 'mig_fifo')
 
+granularity = 512
+nb_bitmaps = 15
 
 GiB = 1024 * 1024 * 1024
 
@@ -61,6 +63,15 @@ def event_dist(e1, e2):
 return event_seconds(e2) - event_seconds(e1)
 
 
+def check_bitmaps(vm, count):
+result = vm.qmp('query-block')
+
+if count == 0:
+assert 'dirty-bitmaps' not in result['return'][0]
+else:
+assert len(result['return'][0]['dirty-bitmaps']) == count
+
+
 class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 def tearDown(self):
 if debug:
@@ -101,10 +112,8 @@ class 
TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 self.vm_a_events = []
 self.vm_b_events = []
 
-def test_postcopy(self):
-granularity = 512
-nb_bitmaps = 15
-
+def start_postcopy(self):
+""" Run migration until RESUME event on target. Return this event. """
 for i in range(nb_bitmaps):
 result = self.vm_a.qmp('block-dirty-bitmap-add', node='drive0',
name='bitmap{}'.format(i),
@@ -119,10 +128,10 @@ class 
TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 
 result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',
node='drive0', name='bitmap0')
-discards1_sha256 = result['return']['sha256']
+self.discards1_sha256 = result['return']['sha256']
 
 # Check, that updating the bitmap by discards works
-assert discards1_sha256 != empty_sha256
+assert self.discards1_sha256 != empty_sha256
 
 # We want to calculate resulting sha256. Do it in bitmap0, so, disable
 # other bitmaps
@@ -135,7 +144,7 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 
 result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',
node='drive0', name='bitmap0')
-all_discards_sha256 = result['return']['sha256']
+self.all_discards_sha256 = result['return']['sha256']
 
 # Now, enable some bitmaps, to be updated during migration
 for i in range(2, nb_bitmaps, 2):
@@ -160,6 +169,10 @@ class 
TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 
 e_resume = self.vm_b.event_wait('RESUME')
 self.vm_b_events.append(e_resume)
+return e_resume
+
+def test_postcopy_success(self):
+e_resume = self.start_postcopy()
 
 # enabled bitmaps should be updated
 apply_discards(self.vm_b, discards2)
@@ -180,18 +193,15 @@ class 
TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 print('downtime:', downtime)
 print('postcopy_time:', postcopy_time)
 
-# Assert that bitmap migration is finished (check that successor bitmap
-# is removed)
-result = self.vm_b.qmp('query-block')
-assert len(result['return'][0]['dirty-bitmaps']) == nb_bitmaps
+check_bitmaps(self.vm_b, nb_bitmaps)
 
 # Check content of migrated bitmaps. Still, don't waste time checking
 # every bitmap
 for i in range(0, nb_bitmaps, 5):
 result = self.vm_b.qmp('x-debug-block-dirty-bitmap-sha256',
node='drive0', name='bitmap{}'.format(i))
-sha256 = discards1_sha256 if i % 2 else all_discards_sha256
-self.assert_qmp(result, 'return/sha256', sha256)
+sha = self.discards1_sha256 if i % 2 else self.all_discards_sha256
+self.assert_qmp(result, 'return/sha256', sha)
 
 
 if __name__ == '__main__':
-- 
2.21.0




[PATCH v2 07/22] migration/block-dirty-bitmap: simplify dirty_bitmap_load_complete

2020-02-17 Thread Vladimir Sementsov-Ogievskiy
The bdrv_enable_dirty_bitmap_locked() call does nothing: if we are in
postcopy, the bitmap's successor must be enabled, and the reclaim
operation will enable the bitmap.

So, actually we just need to call _reclaim_ in both if branches, and
differentiating them only to add an assertion doesn't seem worthwhile.
The logic becomes simple: on load complete we do reclaim and that's all.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 migration/block-dirty-bitmap.c | 25 -
 1 file changed, 4 insertions(+), 21 deletions(-)

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 440c41cfca..9cc750d93b 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -535,6 +535,10 @@ static void dirty_bitmap_load_complete(QEMUFile *f, 
DBMLoadState *s)
 
 qemu_mutex_lock(&s->lock);
 
+if (bdrv_dirty_bitmap_has_successor(s->bitmap)) {
+bdrv_reclaim_dirty_bitmap(s->bitmap, &error_abort);
+}
+
 for (item = s->enabled_bitmaps; item; item = g_slist_next(item)) {
 LoadBitmapState *b = item->data;
 
@@ -544,27 +548,6 @@ static void dirty_bitmap_load_complete(QEMUFile *f, 
DBMLoadState *s)
 }
 }
 
-if (bdrv_dirty_bitmap_has_successor(s->bitmap)) {
-bdrv_dirty_bitmap_lock(s->bitmap);
-if (s->enabled_bitmaps == NULL) {
-/* in postcopy */
-bdrv_reclaim_dirty_bitmap_locked(s->bitmap, &error_abort);
-bdrv_enable_dirty_bitmap_locked(s->bitmap);
-} else {
-/* target not started, successor must be empty */
-int64_t count = bdrv_get_dirty_count(s->bitmap);
-BdrvDirtyBitmap *ret = bdrv_reclaim_dirty_bitmap_locked(s->bitmap,
-NULL);
-/* bdrv_reclaim_dirty_bitmap can fail only on no successor (it
- * must be) or on merge fail, but merge can't fail when second
- * bitmap is empty
- */
-assert(ret == s->bitmap &&
-   count == bdrv_get_dirty_count(s->bitmap));
-}
-bdrv_dirty_bitmap_unlock(s->bitmap);
-}
-
 qemu_mutex_unlock(&s->lock);
 }
 
-- 
2.21.0




[PATCH v2 10/22] migration/block-dirty-bitmap: cancel migration on shutdown

2020-02-17 Thread Vladimir Sementsov-Ogievskiy
If the target is turned off before postcopy has finished, the target
crashes because busy bitmaps are found at shutdown. Canceling the
incoming migration helps, as it removes all unfinished (and therefore
busy) bitmaps.

Similarly, on the source we crash in bdrv_close_all(), which asserts
that all bdrv states are removed, because the bdrv states involved in
dirty bitmap migration are referenced by it. So, we need to cancel the
outgoing migration as well.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 migration/migration.h  |  2 ++
 migration/block-dirty-bitmap.c | 16 
 migration/migration.c  | 13 +
 3 files changed, 31 insertions(+)

diff --git a/migration/migration.h b/migration/migration.h
index 2948f2387b..2de6b8bbe2 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -332,6 +332,8 @@ void migrate_send_rp_recv_bitmap(MigrationIncomingState 
*mis,
 void migrate_send_rp_resume_ack(MigrationIncomingState *mis, uint32_t value);
 
 void dirty_bitmap_mig_before_vm_start(void);
+void dirty_bitmap_mig_cancel_outgoing(void);
+void dirty_bitmap_mig_cancel_incoming(void);
 void migrate_add_address(SocketAddress *address);
 
 int foreach_not_ignored_block(RAMBlockIterFunc func, void *opaque);
diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index aea5326804..3ca425d95e 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -585,6 +585,22 @@ static void cancel_incoming_locked(DBMLoadState *s)
 s->bitmaps = NULL;
 }
 
+void dirty_bitmap_mig_cancel_outgoing(void)
+{
+dirty_bitmap_do_save_cleanup(&dbm_state.save);
+}
+
+void dirty_bitmap_mig_cancel_incoming(void)
+{
+DBMLoadState *s = &dbm_state.load;
+
+qemu_mutex_lock(&s->lock);
+
+cancel_incoming_locked(s);
+
+qemu_mutex_unlock(&s->lock);
+}
+
 static void dirty_bitmap_load_complete(QEMUFile *f, DBMLoadState *s)
 {
 GSList *item;
diff --git a/migration/migration.c b/migration/migration.c
index 515047932c..7c605ba218 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -181,6 +181,19 @@ void migration_shutdown(void)
  */
 migrate_fd_cancel(current_migration);
 object_unref(OBJECT(current_migration));
+
+/*
+ * Cancel outgoing migration of dirty bitmaps. It should
+ * at least unref used block nodes.
+ */
+dirty_bitmap_mig_cancel_outgoing();
+
+/*
+ * Cancel incoming migration of dirty bitmaps. Dirty bitmaps
+ * are non-critical data, and their loss never considered as
+ * something serious.
+ */
+dirty_bitmap_mig_cancel_incoming();
 }
 
 /* For outgoing */
-- 
2.21.0




[PATCH v2 14/22] qemu-iotests/199: better catch postcopy time

2020-02-17 Thread Vladimir Sementsov-Ogievskiy
The test aims to test _postcopy_ migration, and wants to do some write
operations during the postcopy phase.

The test considers the migration status=completed event on the source
as the start of postcopy. This is completely wrong: that is the
completion of the whole migration process. Let's instead consider the
destination start as the start of postcopy, and use the RESUME event
for it.

Next, as the end of migration, let's use the migration status=completed
event on the target, as this method is closer to what libvirt or
another user would do than tracking the number of dirty bitmaps.

Finally, add a possibility to dump events for debugging. With debug set
to True, we see that the actual postcopy period is very small relative
to the whole test duration (~0.2 seconds out of >40 seconds for me).
This means the test is very inefficient at what it is supposed to do.
Let's improve it in the following commits.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 tests/qemu-iotests/199 | 72 +-
 1 file changed, 57 insertions(+), 15 deletions(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index dda918450a..6599fc6fb4 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -20,17 +20,43 @@
 
 import os
 import iotests
-import time
 from iotests import qemu_img
 
+debug = False
+
 disk_a = os.path.join(iotests.test_dir, 'disk_a')
 disk_b = os.path.join(iotests.test_dir, 'disk_b')
 size = '256G'
 fifo = os.path.join(iotests.test_dir, 'mig_fifo')
 
 
+def event_seconds(event):
+return event['timestamp']['seconds'] + \
+event['timestamp']['microseconds'] / 1000000.0
+
+
+def event_dist(e1, e2):
+return event_seconds(e2) - event_seconds(e1)
+
+
 class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 def tearDown(self):
+if debug:
+self.vm_a_events += self.vm_a.get_qmp_events()
+self.vm_b_events += self.vm_b.get_qmp_events()
+for e in self.vm_a_events:
+e['vm'] = 'SRC'
+for e in self.vm_b_events:
+e['vm'] = 'DST'
+events = (self.vm_a_events + self.vm_b_events)
+events = [(e['timestamp']['seconds'],
+   e['timestamp']['microseconds'],
+   e['vm'],
+   e['event'],
+   e.get('data', '')) for e in events]
+for e in sorted(events):
+print('{}.{:06} {} {} {}'.format(*e))
+
 self.vm_a.shutdown()
 self.vm_b.shutdown()
 os.remove(disk_a)
@@ -47,6 +73,10 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 self.vm_a.launch()
 self.vm_b.launch()
 
+# collect received events for debug
+self.vm_a_events = []
+self.vm_b_events = []
+
 def test_postcopy(self):
 write_size = 0x40000000
 granularity = 512
@@ -77,15 +107,13 @@ class 
TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 self.vm_a.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
 s += 0x10000
 
-bitmaps_cap = {'capability': 'dirty-bitmaps', 'state': True}
-events_cap = {'capability': 'events', 'state': True}
+caps = [{'capability': 'dirty-bitmaps', 'state': True},
+{'capability': 'events', 'state': True}]
 
-result = self.vm_a.qmp('migrate-set-capabilities',
-   capabilities=[bitmaps_cap, events_cap])
+result = self.vm_a.qmp('migrate-set-capabilities', capabilities=caps)
 self.assert_qmp(result, 'return', {})
 
-result = self.vm_b.qmp('migrate-set-capabilities',
-   capabilities=[bitmaps_cap])
+result = self.vm_b.qmp('migrate-set-capabilities', capabilities=caps)
 self.assert_qmp(result, 'return', {})
 
 result = self.vm_a.qmp('migrate', uri='exec:cat>' + fifo)
@@ -94,24 +122,38 @@ class 
TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 result = self.vm_a.qmp('migrate-start-postcopy')
 self.assert_qmp(result, 'return', {})
 
-while True:
-event = self.vm_a.event_wait('MIGRATION')
-if event['data']['status'] == 'completed':
-break
+e_resume = self.vm_b.event_wait('RESUME')
+self.vm_b_events.append(e_resume)
 
 s = 0x8000000
 while s < write_size:
 self.vm_b.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
 s += 0x10000
 
+match = {'data': {'status': 'completed'}}
+e_complete = self.vm_b.event_wait('MIGRATION', match=match)
+self.vm_b_events.append(e_complete)
+
+# take queued event, should already been happened
+e_stop = self.vm_a.event_wait('STOP')
+self.vm_a_events.append(e_stop)
+
+downtime = event_dist(e_stop, e_resume)
+postcopy_time = event_dist(e_resume, e_complete)
+
+# TODO: assert downtime * 10 < postcopy_time
+if debug:
+

[PATCH v2 06/22] migration/block-dirty-bitmap: rename finish_lock to just lock

2020-02-17 Thread Vladimir Sementsov-Ogievskiy
finish_lock is a bad name, as the lock is used not only at the end of
the process.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 migration/block-dirty-bitmap.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 7a82b76809..440c41cfca 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -143,7 +143,7 @@ typedef struct DBMLoadState {
 BdrvDirtyBitmap *bitmap;
 
 GSList *enabled_bitmaps;
-QemuMutex finish_lock;
+QemuMutex lock; /* protect enabled_bitmaps */
 } DBMLoadState;
 
 typedef struct DBMState {
@@ -507,7 +507,7 @@ void dirty_bitmap_mig_before_vm_start(void)
 DBMLoadState *s = &dbm_state.load;
 GSList *item;
 
-qemu_mutex_lock(&s->finish_lock);
+qemu_mutex_lock(&s->lock);
 
 for (item = s->enabled_bitmaps; item; item = g_slist_next(item)) {
 LoadBitmapState *b = item->data;
@@ -524,7 +524,7 @@ void dirty_bitmap_mig_before_vm_start(void)
 g_slist_free(s->enabled_bitmaps);
 s->enabled_bitmaps = NULL;
 
-qemu_mutex_unlock(&s->finish_lock);
+qemu_mutex_unlock(&s->lock);
 }
 
 static void dirty_bitmap_load_complete(QEMUFile *f, DBMLoadState *s)
@@ -533,7 +533,7 @@ static void dirty_bitmap_load_complete(QEMUFile *f, 
DBMLoadState *s)
 trace_dirty_bitmap_load_complete();
 bdrv_dirty_bitmap_deserialize_finish(s->bitmap);
 
-qemu_mutex_lock(&s->finish_lock);
+qemu_mutex_lock(&s->lock);
 
 for (item = s->enabled_bitmaps; item; item = g_slist_next(item)) {
 LoadBitmapState *b = item->data;
@@ -565,7 +565,7 @@ static void dirty_bitmap_load_complete(QEMUFile *f, 
DBMLoadState *s)
 bdrv_dirty_bitmap_unlock(s->bitmap);
 }
 
-qemu_mutex_unlock(&s->finish_lock);
+qemu_mutex_unlock(&s->lock);
 }
 
 static int dirty_bitmap_load_bits(QEMUFile *f, DBMLoadState *s)
@@ -747,7 +747,7 @@ static SaveVMHandlers savevm_dirty_bitmap_handlers = {
 void dirty_bitmap_mig_init(void)
 {
 QSIMPLEQ_INIT(&dbm_state.save.dbms_list);
-qemu_mutex_init(&dbm_state.load.finish_lock);
+qemu_mutex_init(&dbm_state.load.lock);
 
 register_savevm_live("dirty-bitmap", 0, 1,
  &savevm_dirty_bitmap_handlers,
-- 
2.21.0




[PATCH v2 22/22] qemu-iotests/199: add source-killed case to bitmaps postcopy

2020-02-17 Thread Vladimir Sementsov-Ogievskiy
Previous patches fix the behavior of bitmap migration so that errors
are handled by simply removing unfinished bitmaps rather than failing or
trying to recover the postcopy migration. Add a corresponding test.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 tests/qemu-iotests/199 | 15 +++
 tests/qemu-iotests/199.out |  4 ++--
 2 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index 0d12e6b1ae..d38913fa44 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -235,6 +235,21 @@ class 
TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 self.vm_a.launch()
 check_bitmaps(self.vm_a, 0)
 
+def test_early_kill_source(self):
+self.start_postcopy()
+
+self.vm_a_events = self.vm_a.get_qmp_events()
+self.vm_a.kill()
+
+self.vm_a.launch()
+
+match = {'data': {'status': 'completed'}}
+e_complete = self.vm_b.event_wait('MIGRATION', match=match)
+self.vm_b_events.append(e_complete)
+
+check_bitmaps(self.vm_a, 0)
+check_bitmaps(self.vm_b, 0)
+
 
 if __name__ == '__main__':
 iotests.main(supported_fmts=['qcow2'])
diff --git a/tests/qemu-iotests/199.out b/tests/qemu-iotests/199.out
index fbc63e62f8..8d7e996700 100644
--- a/tests/qemu-iotests/199.out
+++ b/tests/qemu-iotests/199.out
@@ -1,5 +1,5 @@
-..
+...
 ----------------------------------------------------------------------
-Ran 2 tests
+Ran 3 tests
 
 OK
-- 
2.21.0
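
Patches 20-22 call a check_bitmaps() helper that is introduced earlier
in the series and does not appear in this digest. A plausible minimal
sketch, assuming it merely asserts how many dirty bitmaps query-block
reports for the first device (a hedged reconstruction, not the exact
code from the series):

    def check_bitmaps(vm, count):
        # assumed behavior: count == 0 means no 'dirty-bitmaps' member
        result = vm.qmp('query-block')
        if count == 0:
            assert 'dirty-bitmaps' not in result['return'][0]
        else:
            assert len(result['return'][0]['dirty-bitmaps']) == count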




[PATCH v2 21/22] qemu-iotests/199: add early shutdown case to bitmaps postcopy

2020-02-17 Thread Vladimir Sementsov-Ogievskiy
Previous patches fixed two crashes which may occur on shutdown before
bitmap postcopy has finished. Check that this works now.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 tests/qemu-iotests/199 | 18 ++
 tests/qemu-iotests/199.out |  4 ++--
 2 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index 8baa078151..0d12e6b1ae 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -217,6 +217,24 @@ class 
TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 sha = self.discards1_sha256 if i % 2 else self.all_discards_sha256
 self.assert_qmp(result, 'return/sha256', sha)
 
+def test_early_shutdown_destination(self):
+self.start_postcopy()
+
+self.vm_b_events += self.vm_b.get_qmp_events()
+self.vm_b.shutdown()
+# recreate vm_b, so there is no incoming option, which prevents
+# loading bitmaps from disk
+self.vm_b = iotests.VM(path_suffix='b').add_drive(disk_b)
+self.vm_b.launch()
+check_bitmaps(self.vm_b, 0)
+
+result = self.vm_a.qmp('query-status')
+assert not result['return']['running']
+self.vm_a_events += self.vm_a.get_qmp_events()
+self.vm_a.shutdown()
+self.vm_a.launch()
+check_bitmaps(self.vm_a, 0)
+
 
 if __name__ == '__main__':
 iotests.main(supported_fmts=['qcow2'])
diff --git a/tests/qemu-iotests/199.out b/tests/qemu-iotests/199.out
index ae1213e6f8..fbc63e62f8 100644
--- a/tests/qemu-iotests/199.out
+++ b/tests/qemu-iotests/199.out
@@ -1,5 +1,5 @@
-.
+..
 ----------------------------------------------------------------------
-Ran 1 tests
+Ran 2 tests
 
 OK
-- 
2.21.0




[PATCH v2 05/22] migration/block-dirty-bitmap: refactor state global variables

2020-02-17 Thread Vladimir Sementsov-Ogievskiy
Move all state variables into one global struct. Reduce global
variable usage, utilizing an opaque pointer where possible.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 migration/block-dirty-bitmap.c | 171 ++---
 1 file changed, 95 insertions(+), 76 deletions(-)

diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
index 49d4cf8810..7a82b76809 100644
--- a/migration/block-dirty-bitmap.c
+++ b/migration/block-dirty-bitmap.c
@@ -128,6 +128,12 @@ typedef struct DBMSaveState {
 BdrvDirtyBitmap *prev_bitmap;
 } DBMSaveState;
 
+typedef struct LoadBitmapState {
+BlockDriverState *bs;
+BdrvDirtyBitmap *bitmap;
+bool migrated;
+} LoadBitmapState;
+
 /* State of the dirty bitmap migration (DBM) during load process */
 typedef struct DBMLoadState {
 uint32_t flags;
@@ -135,18 +141,17 @@ typedef struct DBMLoadState {
 char bitmap_name[256];
 BlockDriverState *bs;
 BdrvDirtyBitmap *bitmap;
+
+GSList *enabled_bitmaps;
+QemuMutex finish_lock;
 } DBMLoadState;
 
-static DBMSaveState dirty_bitmap_mig_state;
+typedef struct DBMState {
+DBMSaveState save;
+DBMLoadState load;
+} DBMState;
 
-/* State of one bitmap during load process */
-typedef struct LoadBitmapState {
-BlockDriverState *bs;
-BdrvDirtyBitmap *bitmap;
-bool migrated;
-} LoadBitmapState;
-static GSList *enabled_bitmaps;
-QemuMutex finish_lock;
+static DBMState dbm_state;
 
 static uint32_t qemu_get_bitmap_flags(QEMUFile *f)
 {
@@ -169,21 +174,21 @@ static void qemu_put_bitmap_flags(QEMUFile *f, uint32_t 
flags)
 qemu_put_byte(f, flags);
 }
 
-static void send_bitmap_header(QEMUFile *f, SaveBitmapState *dbms,
-   uint32_t additional_flags)
+static void send_bitmap_header(QEMUFile *f, DBMSaveState *s,
+   SaveBitmapState *dbms, uint32_t additional_flags)
 {
 BlockDriverState *bs = dbms->bs;
 BdrvDirtyBitmap *bitmap = dbms->bitmap;
 uint32_t flags = additional_flags;
 trace_send_bitmap_header_enter();
 
-if (bs != dirty_bitmap_mig_state.prev_bs) {
-dirty_bitmap_mig_state.prev_bs = bs;
+if (bs != s->prev_bs) {
+s->prev_bs = bs;
 flags |= DIRTY_BITMAP_MIG_FLAG_DEVICE_NAME;
 }
 
-if (bitmap != dirty_bitmap_mig_state.prev_bitmap) {
-dirty_bitmap_mig_state.prev_bitmap = bitmap;
+if (bitmap != s->prev_bitmap) {
+s->prev_bitmap = bitmap;
 flags |= DIRTY_BITMAP_MIG_FLAG_BITMAP_NAME;
 }
 
@@ -198,19 +203,22 @@ static void send_bitmap_header(QEMUFile *f, 
SaveBitmapState *dbms,
 }
 }
 
-static void send_bitmap_start(QEMUFile *f, SaveBitmapState *dbms)
+static void send_bitmap_start(QEMUFile *f, DBMSaveState *s,
+  SaveBitmapState *dbms)
 {
-send_bitmap_header(f, dbms, DIRTY_BITMAP_MIG_FLAG_START);
+send_bitmap_header(f, s, dbms, DIRTY_BITMAP_MIG_FLAG_START);
 qemu_put_be32(f, bdrv_dirty_bitmap_granularity(dbms->bitmap));
 qemu_put_byte(f, dbms->flags);
 }
 
-static void send_bitmap_complete(QEMUFile *f, SaveBitmapState *dbms)
+static void send_bitmap_complete(QEMUFile *f, DBMSaveState *s,
+ SaveBitmapState *dbms)
 {
-send_bitmap_header(f, dbms, DIRTY_BITMAP_MIG_FLAG_COMPLETE);
+send_bitmap_header(f, s, dbms, DIRTY_BITMAP_MIG_FLAG_COMPLETE);
 }
 
-static void send_bitmap_bits(QEMUFile *f, SaveBitmapState *dbms,
+static void send_bitmap_bits(QEMUFile *f, DBMSaveState *s,
+ SaveBitmapState *dbms,
  uint64_t start_sector, uint32_t nr_sectors)
 {
 /* align for buffer_is_zero() */
@@ -235,7 +243,7 @@ static void send_bitmap_bits(QEMUFile *f, SaveBitmapState 
*dbms,
 
 trace_send_bitmap_bits(flags, start_sector, nr_sectors, buf_size);
 
-send_bitmap_header(f, dbms, flags);
+send_bitmap_header(f, s, dbms, flags);
 
 qemu_put_be64(f, start_sector);
 qemu_put_be32(f, nr_sectors);
@@ -254,12 +262,12 @@ static void send_bitmap_bits(QEMUFile *f, SaveBitmapState 
*dbms,
 }
 
 /* Called with iothread lock taken.  */
-static void dirty_bitmap_do_save_cleanup(void)
+static void dirty_bitmap_do_save_cleanup(DBMSaveState *s)
 {
 SaveBitmapState *dbms;
 
-while ((dbms = QSIMPLEQ_FIRST(&dirty_bitmap_mig_state.dbms_list)) != NULL) {
-QSIMPLEQ_REMOVE_HEAD(&dirty_bitmap_mig_state.dbms_list, entry);
+while ((dbms = QSIMPLEQ_FIRST(&s->dbms_list)) != NULL) {
+QSIMPLEQ_REMOVE_HEAD(&s->dbms_list, entry);
 bdrv_dirty_bitmap_set_busy(dbms->bitmap, false);
 bdrv_unref(dbms->bs);
 g_free(dbms);
@@ -267,17 +275,17 @@ static void dirty_bitmap_do_save_cleanup(void)
 }
 
 /* Called with iothread lock taken. */
-static int init_dirty_bitmap_migration(void)
+static int init_dirty_bitmap_migration(DBMSaveState *s)
 {
 BlockDriverState *bs;
 BdrvDirtyBitmap *bitmap;
 SaveBitmapState *dbms;
 Error *local_err = NULL;
 
-

[PATCH v2 20/22] qemu-iotests/199: check persistent bitmaps

2020-02-17 Thread Vladimir Sementsov-Ogievskiy
Check that persistent bitmaps are not stored on the source and that
bitmaps are persistent on the destination.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 tests/qemu-iotests/199 | 16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index 969620b103..8baa078151 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -117,7 +117,8 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 for i in range(nb_bitmaps):
 result = self.vm_a.qmp('block-dirty-bitmap-add', node='drive0',
name='bitmap{}'.format(i),
-   granularity=granularity)
+   granularity=granularity,
+   persistent=True)
 self.assert_qmp(result, 'return', {})
 
 result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',
@@ -193,6 +194,19 @@ class 
TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 print('downtime:', downtime)
 print('postcopy_time:', postcopy_time)
 
+# check that there are no bitmaps stored on source
+self.vm_a_events += self.vm_a.get_qmp_events()
+self.vm_a.shutdown()
+self.vm_a.launch()
+check_bitmaps(self.vm_a, 0)
+
+# check that bitmaps are migrated and persistence works
+check_bitmaps(self.vm_b, nb_bitmaps)
+self.vm_b.shutdown()
+# recreate vm_b, so there is no incoming option, which prevents
+# loading bitmaps from disk
+self.vm_b = iotests.VM(path_suffix='b').add_drive(disk_b)
+self.vm_b.launch()
 check_bitmaps(self.vm_b, nb_bitmaps)
 
 # Check content of migrated bitmaps. Still, don't waste time checking
-- 
2.21.0




[PATCH v2 12/22] qemu-iotests/199: fix style

2020-02-17 Thread Vladimir Sementsov-Ogievskiy
Mostly, satisfy pep8 complaints.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 tests/qemu-iotests/199 | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index 40774eed74..de9ba8d94c 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -28,8 +28,8 @@ disk_b = os.path.join(iotests.test_dir, 'disk_b')
 size = '256G'
 fifo = os.path.join(iotests.test_dir, 'mig_fifo')
 
-class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 
+class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 def tearDown(self):
 self.vm_a.shutdown()
 self.vm_b.shutdown()
@@ -54,7 +54,7 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 
 result = self.vm_a.qmp('block-dirty-bitmap-add', node='drive0',
name='bitmap', granularity=granularity)
-self.assert_qmp(result, 'return', {});
+self.assert_qmp(result, 'return', {})
 
 s = 0
 while s < write_size:
@@ -71,7 +71,7 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 
 result = self.vm_a.qmp('block-dirty-bitmap-clear', node='drive0',
name='bitmap')
-self.assert_qmp(result, 'return', {});
+self.assert_qmp(result, 'return', {})
 s = 0
 while s < write_size:
 self.vm_a.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
@@ -104,15 +104,16 @@ class 
TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 self.vm_b.hmp_qemu_io('drive0', 'write %d %d' % (s, chunk))
 s += 0x10000
 
-result = self.vm_b.qmp('query-block');
+result = self.vm_b.qmp('query-block')
 while len(result['return'][0]['dirty-bitmaps']) > 1:
 time.sleep(2)
-result = self.vm_b.qmp('query-block');
+result = self.vm_b.qmp('query-block')
 
 result = self.vm_b.qmp('x-debug-block-dirty-bitmap-sha256',
node='drive0', name='bitmap')
 
-self.assert_qmp(result, 'return/sha256', sha256);
+self.assert_qmp(result, 'return/sha256', sha256)
+
 
 if __name__ == '__main__':
 iotests.main(supported_fmts=['qcow2'], supported_cache_modes=['none'],
-- 
2.21.0




[PATCH v2 16/22] qemu-iotests/199: change discard patterns

2020-02-17 Thread Vladimir Sementsov-Ogievskiy
iotest 199 takes too long because of its many discard operations. At the
same time, the postcopy period is very short, in spite of all these
efforts.

So, let's use fewer discards (with more interesting patterns) to reduce
the test run time. In the next commit we'll increase the postcopy period.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 tests/qemu-iotests/199 | 44 +-
 1 file changed, 26 insertions(+), 18 deletions(-)

diff --git a/tests/qemu-iotests/199 b/tests/qemu-iotests/199
index d78f81b71c..7914fd0b2b 100755
--- a/tests/qemu-iotests/199
+++ b/tests/qemu-iotests/199
@@ -30,6 +30,28 @@ size = '256G'
 fifo = os.path.join(iotests.test_dir, 'mig_fifo')
 
 
+GiB = 1024 * 1024 * 1024
+
+discards1 = (
+(0, GiB),
+(2 * GiB + 512 * 5, 512),
+(3 * GiB + 512 * 5, 512),
+(100 * GiB, GiB)
+)
+
+discards2 = (
+(3 * GiB + 512 * 8, 512),
+(4 * GiB + 512 * 8, 512),
+(50 * GiB, GiB),
+(100 * GiB + GiB // 2, GiB)
+)
+
+
+def apply_discards(vm, discards):
+for d in discards:
+vm.hmp_qemu_io('drive0', 'discard {} {}'.format(*d))
+
+
 def event_seconds(event):
 return event['timestamp']['seconds'] + \
 event['timestamp']['microseconds'] / 1000000.0
@@ -80,9 +102,7 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 self.vm_b_events = []
 
 def test_postcopy(self):
-discard_size = 0x40000000
 granularity = 512
-chunk = 4096
 
 result = self.vm_a.qmp('block-dirty-bitmap-add', node='drive0',
name='bitmap', granularity=granularity)
@@ -92,14 +112,7 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
node='drive0', name='bitmap')
 empty_sha256 = result['return']['sha256']
 
-s = 0
-while s < discard_size:
-self.vm_a.hmp_qemu_io('drive0', 'discard %d %d' % (s, chunk))
-s += 0x10000
-s = 0x8000
-while s < discard_size:
-self.vm_a.hmp_qemu_io('drive0', 'discard %d %d' % (s, chunk))
-s += 0x10000
+apply_discards(self.vm_a, discards1 + discards2)
 
 result = self.vm_a.qmp('x-debug-block-dirty-bitmap-sha256',
node='drive0', name='bitmap')
@@ -111,10 +124,8 @@ class 
TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 result = self.vm_a.qmp('block-dirty-bitmap-clear', node='drive0',
name='bitmap')
 self.assert_qmp(result, 'return', {})
-s = 0
-while s < discard_size:
-self.vm_a.hmp_qemu_io('drive0', 'discard %d %d' % (s, chunk))
-s += 0x10000
+
+apply_discards(self.vm_a, discards1)
 
 caps = [{'capability': 'dirty-bitmaps', 'state': True},
 {'capability': 'events', 'state': True}]
@@ -134,10 +145,7 @@ class 
TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 e_resume = self.vm_b.event_wait('RESUME')
 self.vm_b_events.append(e_resume)
 
-s = 0x8000
-while s < discard_size:
-self.vm_b.hmp_qemu_io('drive0', 'discard %d %d' % (s, chunk))
-s += 0x10000
+apply_discards(self.vm_b, discards2)
 
 match = {'data': {'status': 'completed'}}
 e_complete = self.vm_b.event_wait('MIGRATION', match=match)
-- 
2.21.0
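
To make the new pattern concrete, apply_discards(vm, discards1) under
the definitions above expands to plain qemu-io discard commands (offsets
computed from the tuples; shown only for illustration):

    vm.hmp_qemu_io('drive0', 'discard 0 1073741824')            # (0, GiB)
    vm.hmp_qemu_io('drive0', 'discard 2147486208 512')          # (2 * GiB + 512 * 5, 512)
    vm.hmp_qemu_io('drive0', 'discard 3221228032 512')          # (3 * GiB + 512 * 5, 512)
    vm.hmp_qemu_io('drive0', 'discard 107374182400 1073741824') # (100 * GiB, GiB)

Four requests instead of thousands of small loop iterations, which is
where the test-time saving comes from.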




Re: [PATCH v2 03/33] block: Add BdrvChildRole

2020-02-17 Thread Max Reitz
On 11.02.20 16:41, Alberto Garcia wrote:
> On Tue 04 Feb 2020 06:08:18 PM CET, Max Reitz wrote:
>> +/* Child to COW from (backing child) */
>> +BDRV_CHILD_COW  = (1 << 3),
> 
> Without the comment in brackets I'm not sure that I would have
> understood that this is meant for backing files.

I put it in brackets because bs->backing isn’t always such a child (for
filters it isn’t).  That’s also the reason why I prefer to stress the
COW aspect.

> This is the "child that contains the data that is not allocated in the
> parent", or something like that, right?

Hm, so I suppose the problem is that I didn’t describe in which event
the COW is to occur.  (I didn’t because we only have one kind of COW in
the block layer, namely for backing chains.)

So maybe “Child from which to read all data that isn’t allocated in the
parent (backing child); such data may be copied to the parent by means
of COW or COR”?

(The problem I see with this description is that it is kind of a
tautology, because “allocation” is in turn defined by differentiating
between layers of a backing chain, i.e. this layer and that backing/COW
child we’re talking about here.)

Max





Re: [PATCH v2 7/7] block/block-copy: hide structure definitions

2020-02-17 Thread Max Reitz
On 27.11.19 19:08, Vladimir Sementsov-Ogievskiy wrote:
> Hide structure definitions and add an explicit API instead, to keep an
> eye on the scope of the shared fields.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  include/block/block-copy.h | 57 +++--
>  block/backup-top.c |  6 ++--
>  block/backup.c | 27 
>  block/block-copy.c | 64 ++
>  4 files changed, 86 insertions(+), 68 deletions(-)

[...]

> diff --git a/block/backup.c b/block/backup.c
> index cf62b1a38c..acab0d08da 100644
> --- a/block/backup.c
> +++ b/block/backup.c

[...]

> @@ -458,6 +458,7 @@ BlockJob *backup_job_create(const char *job_id, 
> BlockDriverState *bs,
>  job->sync_bitmap = sync_bitmap;
>  job->bitmap_mode = bitmap_mode;
>  job->bcs = bcs;
> +job->bcs_bitmap = block_copy_dirty_bitmap(bcs);

It seems a bit weird to me to store a pointer to the BCS-owned bitmap
here, because, well, it’s a BCS-owned object, and just calling
block_copy_dirty_bitmap() every time wouldn’t be prohibitively expensive.

I feel sufficiently bad about this to warrant not giving an R-b, but I
know I shouldn’t withhold an R-b over this, so:

Reviewed-by: Max Reitz 

>  job->cluster_size = cluster_size;
>  job->len = len;





Re: [GEDI] [PATCH 07/17] gluster: Drop useless has_zero_init callback

2020-02-17 Thread Niels de Vos
On Mon, Feb 17, 2020 at 06:03:40AM -0600, Eric Blake wrote:
> On 2/17/20 2:06 AM, Niels de Vos wrote:
> > On Fri, Jan 31, 2020 at 11:44:26AM -0600, Eric Blake wrote:
> > > block.c already defaults to 0 if we don't provide a callback; there's
> > > no need to write a callback that always fails.
> > > 
> > > Signed-off-by: Eric Blake 
> > 
> > Reviewed-by: Niels de Vos 
> > 
> 
> Per your other message,
> 
> On 2/17/20 2:16 AM, Niels de Vos wrote:
> > On Fri, Jan 31, 2020 at 11:44:31AM -0600, Eric Blake wrote:
> >> Since gluster already copies file-posix for lseek usage in block
> >> status, it also makes sense to copy it for learning if the image
> >> currently reads as all zeroes.
> >>
> 
> >> +static int qemu_gluster_known_zeroes(BlockDriverState *bs)
> >> +{
> >> +/*
> >> + * GlusterFS volume could be backed by a block device, with no way
> >
> > Actually, Gluster dropped support for volumes backed by block devices
> > (LVM) a few releases back. Nobody could be found that used it, and it
> > could not be combined with other Gluster features. All contents on a
> > Gluster volume is now always backed by 'normal' files on a filesystem.
> 
> That's useful to know.  Thanks!
> 
> >
> > Creation or truncation should behave just as on a file on a local
> > filesystem. So maybe qemu_gluster_known_zeroes is not needed at all?
> 
> Which version of gluster first required a regular filesystem backing for all
> gluster files?  Does qemu support older versions (in which case, what is the
> correct version-probing invocation to return 0 prior to that point, and 1
> after), or do all versions supported by qemu already guarantee zero
> initialization on creation or widening truncation by virtue of POSIX file
> semantics (in which case, patch 7 should instead switch to using
> .bdrv_has_zero_init_1 for both functions)?  Per configure, we probe for
> glusterfs_xlator_opt from gluster 4, which implies the code still tries to
> be portable to even older gluster, but I'm not sure if this squares with
> qemu-doc.texi which mentions our minimum distro policy (for example, now
> that qemu requires python 3 consistent with our distro policy, that rules
> out several older systems where older gluster was likely to be present).

The block device feature (storage/bd xlator) got deprecated in Gluster
5.0, and was removed with Gluster 6.0. Fedora 29 is the last version
that contained the bd.so xlator (glusterfs-server 5.0, deprecated).

All currently maintained and available Gluster releases should have
glusterfs_xlator_opt (introduced with glusterfs-3.5 in 2014). However, I
am not sure what versions are provided with different distributions. The
expectation is that at least Gluster 5 is provided, as older releases
will not get any updates anymore. See
https://www.gluster.org/release-schedule/ for a more detailed timeline.

Unfortunately there is no reasonable way to probe for the type of
backend (block or filesystem) that is used. So, a runtime check that
errs on the extremely safe side and falls back for block-device backends
is not an option.

HTH,
Niels




Re: x-blockdev-reopen & block-dirty-bitmaps

2020-02-17 Thread Kevin Wolf
Am 17.02.2020 um 14:03 hat Alberto Garcia geschrieben:
> On Mon 17 Feb 2020 11:58:18 AM CET, Kevin Wolf wrote:
> >> As far as I'm aware there's nothing needed to make x-blockdev-reopen
> >> stable. It was just marked experimental because it's a relatively
> >> complex operation and I wanted to have some margin to detect bugs or
> >> other possible problems.
> >
> > Maybe some more test cases for potentially problematic things like
> > attaching a backing file node that is in a different AioContext than
> > the new parent. But I think the recent AioContext management fixes
> > should have taken care of this. (Actually, bdrv_reopen_parse_backing()
> > still forbids this case and has a "TODO" comment there.)
> >
> > Hm... In fact I guess it would still be broken:
> >
> > bdrv_set_backing_hd(bs, reopen_state->new_backing_bs,
> > &error_abort);
> 
> I can give that a try and add a new test case if necessary.
> 
> > Another thing that we probably want to add is changing bs->file,
> > specifically in order to insert or remove filter nodes. If we don't do
> > this before marking blockdev-reopen stable, introspection wouldn't be
> > able to detect when we later add it. (We have QAPI features now to
> > work around this, but...)
> 
> I'm not sure if introspection can track all the changes in this feature,
> though. The list of runtime options that can be modified for each driver
> can change as we add support to new ones, as can change the list of
> drivers that allow reopening.

Yes, I'm sure we'll have to resort to QAPI features here and there to
make things introspectable. But the more options you can change from the
start, the less we'll need to do this.

Kevin




Re: [PATCH v2 6/7] block/block-copy: reduce intersecting request lock

2020-02-17 Thread Max Reitz
On 27.11.19 19:08, Vladimir Sementsov-Ogievskiy wrote:
> Currently, block_copy operation lock the whole requested region. But
> there is no reason to lock clusters, which are already copied, it will
> disturb other parallel block_copy requests for no reason.
> 
> Let's instead do the following:
> 
> Lock only the sub-region which we are going to operate on. Then, after
> copying all dirty sub-regions, we should wait for intersecting
> block-copy requests; if they failed, we should retry these new dirty
> clusters.

Just a thought spoken aloud:

I would expect the number of intersecting CBW requests to be low in
general, so I don’t know how useful this change is in practice.  OTOH,
it makes block_copy call the existing implementation in a loop, which
seems just worse.

But then again, in the common case, block_copy_dirty_clusters() won’t
copy anything because it’s all been copied already, so there is no
change; and even if something is copied, the second call will just
re-check the dirty bitmap to see that the area’s clean (which will be
quick compared to the copy operation).  So there’s probably nothing to
worry about.

> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  block/block-copy.c | 116 +
>  1 file changed, 95 insertions(+), 21 deletions(-)
> 
> diff --git a/block/block-copy.c b/block/block-copy.c
> index 20068cd699..aca44b13fb 100644
> --- a/block/block-copy.c
> +++ b/block/block-copy.c
> @@ -39,29 +39,62 @@ static BlockCopyInFlightReq 
> *block_copy_find_inflight_req(BlockCopyState *s,
>  return NULL;
>  }
>  
> -static void coroutine_fn block_copy_wait_inflight_reqs(BlockCopyState *s,
> -   int64_t offset,
> -   int64_t bytes)
> +/*
> + * If there are no intersecting requests return false. Otherwise, wait for the
> + * first found intersecting request to finish and return true.
> + */
> +static bool coroutine_fn block_copy_wait_one(BlockCopyState *s, int64_t start,
> + int64_t end)

s/end/bytes/?

(And maybe s/start/offset/, too)

>  {
> -BlockCopyInFlightReq *req;
> +BlockCopyInFlightReq *req = block_copy_find_inflight_req(s, start, end);
>  
> -while ((req = block_copy_find_inflight_req(s, offset, bytes))) {
> -qemu_co_queue_wait(&req->wait_queue, NULL);
> +if (!req) {
> +return false;
>  }
> +
> +qemu_co_queue_wait(&req->wait_queue, NULL);
> +
> +return true;
>  }
>  
> +/* Called only on full-dirty region */
>  static void block_copy_inflight_req_begin(BlockCopyState *s,
>BlockCopyInFlightReq *req,
>int64_t offset, int64_t bytes)
>  {
> +assert(!block_copy_find_inflight_req(s, offset, bytes));
> +
> +bdrv_reset_dirty_bitmap(s->copy_bitmap, offset, bytes);
> +
>  req->offset = offset;
>  req->bytes = bytes;
>  qemu_co_queue_init(&req->wait_queue);
>  QLIST_INSERT_HEAD(&s->inflight_reqs, req, list);
>  }
>  
> -static void coroutine_fn block_copy_inflight_req_end(BlockCopyInFlightReq 
> *req)
> +static void coroutine_fn block_copy_inflight_req_shrink(BlockCopyState *s,
> +BlockCopyInFlightReq *req, int64_t new_bytes)

It took me a while to understand that this operation drops the tail
of the request.  I think there should be a comment on this.

(I thought it would successively drop the head after each copy, and so I
was wondering why the code didn’t match that.)

>  {
> +if (new_bytes == req->bytes) {
> +return;
> +}
> +
> +assert(new_bytes > 0 && new_bytes < req->bytes);
> +
> +bdrv_set_dirty_bitmap(s->copy_bitmap,
> +  req->offset + new_bytes, req->bytes - new_bytes);
> +
> +req->bytes = new_bytes;
> +qemu_co_queue_restart_all(&req->wait_queue);
> +}
> +
> +static void coroutine_fn block_copy_inflight_req_end(BlockCopyState *s,
> + BlockCopyInFlightReq *req,
> + int ret)
> +{
> +if (ret < 0) {
> +bdrv_set_dirty_bitmap(s->copy_bitmap, req->offset, req->bytes);
> +}
>  QLIST_REMOVE(req, list);
>  qemu_co_queue_restart_all(&req->wait_queue);
>  }
> @@ -344,12 +377,19 @@ int64_t block_copy_reset_unallocated(BlockCopyState *s,
>  return ret;
>  }
>  
> -int coroutine_fn block_copy(BlockCopyState *s,
> -int64_t offset, uint64_t bytes,
> -bool *error_is_read)
> +/*
> + * block_copy_dirty_clusters
> + *
> + * Copy dirty clusters in @start/@bytes range.
> + * Returns 1 if dirty clusters found and successfully copied, 0 if no dirty
> + * clusters found and -errno on failure.
> + */
> +static int coroutine_fn block_copy_dirty_clusters(BlockCopyState *s,
> +  int64_t offset, 
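
The retry loop described in the commit message can be transliterated
into Python-style pseudocode to keep the control flow in view; the
helper names mirror the C functions above, but this is a hedged sketch,
not the actual implementation:

    def block_copy(s, offset, count):
        while True:
            # copy everything still dirty in the range: ret is negative
            # on error, 0 if nothing was dirty, positive if clusters
            # were copied
            ret = block_copy_dirty_clusters(s, offset, count)
            if ret < 0:
                return ret
            # if a concurrent request still overlaps the range, wait for
            # it; it may fail and re-dirty clusters, so loop and re-check
            if not block_copy_wait_one(s, offset, count):
                return 0

This is the structure Max describes: in the common case the second pass
finds the bitmap clean and the loop exits after one cheap re-check.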

[PATCH v3] configure: Avoid compiling system tools on user build by default

2020-02-17 Thread Philippe Mathieu-Daudé
User-mode does not need the system tools. Do not build them by
default if the user specifies --disable-system.

This disables building the following binaries on a user-only build:

- elf2dmp
- qemu-edid
- qemu-ga
- qemu-img
- qemu-io
- qemu-nbd
- ivshmem-client
- ivshmem-server

The qemu-user binaries are not affected by this change.

Signed-off-by: Philippe Mathieu-Daudé 
---
v3:
- fixed typos (Aleksandar)
v2:
- use simpler if/else statement (therefore not adding Richard R-b)
- improved description (Aleksandar)
---
 configure | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/configure b/configure
index 6f5d850949..efe00dd497 100755
--- a/configure
+++ b/configure
@@ -455,7 +455,7 @@ guest_agent_ntddscsi="no"
 guest_agent_msi=""
 vss_win32_sdk=""
 win_sdk="no"
-want_tools="yes"
+want_tools=""
 libiscsi=""
 libnfs=""
 coroutine=""
@@ -2213,6 +2213,16 @@ else
 echo big/little test failed
 fi
 
+##
+# system tools
+if test -z "$want_tools"; then
+if test "$softmmu" = "no"; then
+want_tools=no
+else
+want_tools=yes
+fi
+fi
+
 ##
 # cocoa implies not SDL or GTK
 # (the cocoa UI code currently assumes it is always the active UI
-- 
2.21.1




Re: [PATCH v3] configure: Avoid compiling system tools on user build by default

2020-02-17 Thread Philippe Mathieu-Daudé
On Mon, Feb 17, 2020 at 2:33 PM Philippe Mathieu-Daudé  wrote:
>
> User-mode does not need the system tools. Do not build them by
> default if the user specifies --disable-system.
>
> This disables building the following binaries on a user-only build:
>
> - elf2dmp
> - qemu-edid
> - qemu-ga
> - qemu-img
> - qemu-io
> - qemu-nbd
> - ivshmem-client
> - ivshmem-server
>
> The qemu-user binaries are not affected by this change.
>

I forgot to add:
Reviewed-by: Laurent Vivier 

> Signed-off-by: Philippe Mathieu-Daudé 
> ---
> v3:
> - fixed typos (Aleksandar)
> v2:
> - use simpler if/else statement (therefore not adding Richard R-b)
> - improved description (Aleksandar)
> ---
>  configure | 12 +++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/configure b/configure
> index 6f5d850949..efe00dd497 100755
> --- a/configure
> +++ b/configure
> @@ -455,7 +455,7 @@ guest_agent_ntddscsi="no"
>  guest_agent_msi=""
>  vss_win32_sdk=""
>  win_sdk="no"
> -want_tools="yes"
> +want_tools=""
>  libiscsi=""
>  libnfs=""
>  coroutine=""
> @@ -2213,6 +2213,16 @@ else
>  echo big/little test failed
>  fi
>
> +##
> +# system tools
> +if test -z "$want_tools"; then
> +if test "$softmmu" = "no"; then
> +want_tools=no
> +else
> +want_tools=yes
> +fi
> +fi
> +
>  ##
>  # cocoa implies not SDL or GTK
>  # (the cocoa UI code currently assumes it is always the active UI
> --
> 2.21.1
>



Re: x-blockdev-reopen & block-dirty-bitmaps

2020-02-17 Thread Alberto Garcia
On Mon 17 Feb 2020 11:58:18 AM CET, Kevin Wolf wrote:
>> As far as I'm aware there's nothing needed to make x-blockdev-reopen
>> stable. It was just marked experimental because it's a relatively
>> complex operation and I wanted to have some margin to detect bugs or
>> other possible problems.
>
> Maybe some more test cases for potentially problematic things like
> attaching a backing file node that is in a different AioContext than
> the new parent. But I think the recent AioContext management fixes
> should have taken care of this. (Actually, bdrv_reopen_parse_backing()
> still forbids this case and has a "TODO" comment there.)
>
> Hm... In fact I guess it would still be broken:
>
> bdrv_set_backing_hd(bs, reopen_state->new_backing_bs,
> &error_abort);

I can give that a try and add a new test case if necessary.

> Another thing that we probably want to add is changing bs->file,
> specifically in order to insert or remove filter nodes. If we don't do
> this before marking blockdev-reopen stable, introspection wouldn't be
> able to detect when we later add it. (We have QAPI features now to
> work around this, but...)

I'm not sure if introspection can track all the changes in this feature,
though. The list of runtime options that can be modified for each driver
can change as we add support to new ones, as can change the list of
drivers that allow reopening.

Berto
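
For readers following along, a hypothetical shape for such a test case;
the node names and option values here are invented, and x-blockdev-reopen
takes a full BlockdevOptions description of the node being reopened:

    # hypothetical iotests snippet: reopen 'top' so that it points at a
    # different backing node (possibly one in another AioContext)
    result = vm.qmp('x-blockdev-reopen',
                    driver='qcow2',
                    node_name='top',
                    file='top-file',
                    backing='new-backing')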



Re: QAPI schema for desired state of LUKS keyslots

2020-02-17 Thread Eric Blake

On 2/17/20 6:28 AM, Markus Armbruster wrote:


Proposal:

 { 'enum': 'LUKSKeyslotState',
   'data': [ 'active', 'inactive' ] }

 { 'struct': 'LUKSKeyslotActive',
   'data': { 'secret': 'str',
 '*iter-time': 'int' } }

 { 'struct': 'LUKSKeyslotInactive',
   'data': { '*old-secret': 'str' } }

 { 'union': 'LUKSKeyslotAmend',
   'base': { '*keyslot': 'int',
 'state': 'LUKSKeyslotState' }
   'discriminator': 'state',
   'data': { 'active': 'LUKSKeyslotActive',
 'inactive': 'LUKSKeyslotInactive' } }

LUKSKeyslotAmend specifies desired state for a set of keyslots.


Though not arbitrary sets of keyslots, it's only a single keyslot or
multiple keyslots containing the same secret. Might be good enough in
practice, though it means that you may have to issue multiple amend
commands to get to the final state that you really want (even if doing
everything at once would be safe).


True.  I traded expressiveness for simplicity.

Here's the only practical case I can think of where the lack of
expressiveness may hurt: replace secrets.

With this interface, you need two operations: activate a free slot with
the new secret, deactivate the slot(s) with the old secret.  There is an
intermediate state with both secrets active.

A more expressive interface could let you do both in one step.  Relevant
only if the implementation actually provides atomicity.  Can it?


Or put another way, can atomicity be added via 'transaction' later?  If 
so, reusing one common interface to provide atomicity is nicer than 
making every interface reimplement atomicity on an ad hoc basis.


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PATCH v2] configure: Avoid compiling system tools on user build by default

2020-02-17 Thread Aleksandar Markovic
11:00 AM Pon, 17.02.2020. Philippe Mathieu-Daudé  је
написао/ла:
>
> From: Philippe Mathieu-Daudé 
>
> User-mode does not need the sytem tools.

system

> Do not build them by
> default if user specified --disable-system.

specifies

>
> This disables building the following binary on a user-only build:

binaries

>
> - elf2dmp
> - qemu-edid
> - qemu-ga
> - qemu-img
> - qemu-io
> - qemu-nbd
> - ivshmem-client
> - ivshmem-server
>
> The qemu-user binary is not affected by this change.

binaries are

>
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
> v2:
> - use simpler if/else statement (therefore not adding Richard R-b)
> - improved description (Aleksandar)
> ---
>  configure | 12 +++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/configure b/configure
> index 16f94cd96b..d1877a60f5 100755
> --- a/configure
> +++ b/configure
> @@ -455,7 +455,7 @@ guest_agent_ntddscsi="no"
>  guest_agent_msi=""
>  vss_win32_sdk=""
>  win_sdk="no"
> -want_tools="yes"
> +want_tools=""
>  libiscsi=""
>  libnfs=""
>  coroutine=""
> @@ -2199,6 +2199,16 @@ else
>  echo big/little test failed
>  fi
>
> +##
> +# system tools
> +if test -z "$want_tools"; then
> +if test "$softmmu" = "no"; then
> +want_tools=no
> +else
> +want_tools=yes
> +fi
> +fi
> +
>  ##
>  # cocoa implies not SDL or GTK
>  # (the cocoa UI code currently assumes it is always the active UI
> --
> 2.21.1
>


Re: [PATCH v4 3/4] qmp: Move dispatcher to a coroutine

2020-02-17 Thread Kevin Wolf
Am 17.02.2020 um 12:08 hat Markus Armbruster geschrieben:
> This is the hairy one.  Please bear with me while I try to grok it.
> 
> Kevin Wolf  writes:
> 
> > This moves the QMP dispatcher to a coroutine and runs all QMP command
> > handlers that declare 'coroutine': true in coroutine context so they
> > can avoid blocking the main loop while doing I/O or waiting for other
> > events.
> >
> > For commands that are not declared safe to run in a coroutine, the
> > dispatcher drops out of coroutine context by calling the QMP command
> > handler from a bottom half.
> >
> > Signed-off-by: Kevin Wolf 
> > ---
> >  include/qapi/qmp/dispatch.h |   1 +
> >  monitor/monitor-internal.h  |   6 +-
> >  monitor/monitor.c   |  33 ---
> >  monitor/qmp.c   | 110 ++--
> >  qapi/qmp-dispatch.c |  44 ++-
> >  qapi/qmp-registry.c |   3 +
> >  util/aio-posix.c|   7 ++-
> >  7 files changed, 162 insertions(+), 42 deletions(-)
> >
> > diff --git a/include/qapi/qmp/dispatch.h b/include/qapi/qmp/dispatch.h
> > index d6ce9efc8e..6812e49b5f 100644
> > --- a/include/qapi/qmp/dispatch.h
> > +++ b/include/qapi/qmp/dispatch.h
> > @@ -30,6 +30,7 @@ typedef enum QmpCommandOptions
> >  typedef struct QmpCommand
> >  {
> >  const char *name;
> > +/* Runs in coroutine context if QCO_COROUTINE is set */
> >  QmpCommandFunc *fn;
> >  QmpCommandOptions options;
> >  QTAILQ_ENTRY(QmpCommand) node;
> > diff --git a/monitor/monitor-internal.h b/monitor/monitor-internal.h
> > index d78f5ca190..f180d03368 100644
> > --- a/monitor/monitor-internal.h
> > +++ b/monitor/monitor-internal.h
> > @@ -154,7 +154,9 @@ static inline bool monitor_is_qmp(const Monitor *mon)
> >  
> >  typedef QTAILQ_HEAD(MonitorList, Monitor) MonitorList;
> >  extern IOThread *mon_iothread;
> > -extern QEMUBH *qmp_dispatcher_bh;
> > +extern Coroutine *qmp_dispatcher_co;
> > +extern bool qmp_dispatcher_co_shutdown;
> > +extern bool qmp_dispatcher_co_busy;
> >  extern QmpCommandList qmp_commands, qmp_cap_negotiation_commands;
> >  extern QemuMutex monitor_lock;
> >  extern MonitorList mon_list;
> > @@ -172,7 +174,7 @@ void monitor_fdsets_cleanup(void);
> >  
> >  void qmp_send_response(MonitorQMP *mon, const QDict *rsp);
> >  void monitor_data_destroy_qmp(MonitorQMP *mon);
> > -void monitor_qmp_bh_dispatcher(void *data);
> > +void coroutine_fn monitor_qmp_dispatcher_co(void *data);
> >  
> >  int get_monitor_def(int64_t *pval, const char *name);
> >  void help_cmd(Monitor *mon, const char *name);
> > diff --git a/monitor/monitor.c b/monitor/monitor.c
> > index 12898b6448..e753fa435d 100644
> > --- a/monitor/monitor.c
> > +++ b/monitor/monitor.c
> > @@ -53,8 +53,18 @@ typedef struct {
> >  /* Shared monitor I/O thread */
> >  IOThread *mon_iothread;
> >  
> > -/* Bottom half to dispatch the requests received from I/O thread */
> > -QEMUBH *qmp_dispatcher_bh;
> > +/* Coroutine to dispatch the requests received from I/O thread */
> > +Coroutine *qmp_dispatcher_co;
> > +
> > +/* Set to true when the dispatcher coroutine should terminate */
> > +bool qmp_dispatcher_co_shutdown;
> > +
> > +/*
> > + * true if the coroutine is active and processing requests. The coroutine 
> > may
> > + * only be woken up externally (e.g. from the monitor thread) after 
> > changing
> > + * qmp_dispatcher_co_busy from false to true (e.g. using atomic_xchg).
> > + */
> 
> I'm not sure what you mean by "externally".

By anyone outside the coroutine itself. Maybe just dropping the word
"externally" avoids the confusion because it's an implementation detail
that the coroutine can schedule itself while it is marked busy.

> Also mention how it changes from true to false?

Something like: "The coroutine will automatically change it back to false
after processing all pending requests"?

> Note to self: monitor_qmp_dispatcher_co() checks busy is true on resume.
> 
> Nitpick: wrap around column 70, two spaces between sentences for
> consistency with other comments in this file, please.

Any specific reason why comments (but not code) in this file use a
different text width than everything else in QEMU? My editor is set to
use 80 characters to conform to CODING_STYLE.rst.

> > +bool qmp_dispatcher_co_busy;
> >  
> >  /* Protects mon_list, monitor_qapi_event_state, monitor_destroyed.  */
> >  QemuMutex monitor_lock;
> > @@ -579,9 +589,16 @@ void monitor_cleanup(void)
> 
> monitor_cleanup() runs in the main thread.
> 
> Coroutine qmp_dispatcher_co also runs in the main thread, right?

Yes.

> >  }
> >  qemu_mutex_unlock(&monitor_lock);
> >  
> > -/* QEMUBHs needs to be deleted before destroying the I/O thread */
> > -qemu_bh_delete(qmp_dispatcher_bh);
> > -qmp_dispatcher_bh = NULL;
> > +/* The dispatcher needs to stop before destroying the I/O thread */
> > +qmp_dispatcher_co_shutdown = true;
> 
> The coroutine switch ensures qmp_dispatcher_co sees this write, so no
> need for a 

Re: QAPI schema for desired state of LUKS keyslots

2020-02-17 Thread Markus Armbruster
Kevin Wolf  writes:

> Am 15.02.2020 um 15:51 hat Markus Armbruster geschrieben:
>> Review of this patch led to a lengthy QAPI schema design discussion.
>> Let me try to condense it into a concrete proposal.
>> 
>> This is about the QAPI schema, and therefore about QMP.  The
>> human-friendly interface is out of scope.  Not because it's not
>> important (it clearly is!), only because we need to *focus* to have a
>> chance at success.
>> 
>> I'm going to include a few design options.  I'll mark them "Option:".
>> 
>> The proposed "amend" interface takes a specification of desired state,
>> and figures out how to get from here to there by itself.  LUKS keyslots
>> are one part of desired state.
>> 
>> We commonly have eight LUKS keyslots.  Each keyslot is either active or
>> inactive.  An active keyslot holds a secret.
>> 
>> Goal: a QAPI type for specifying desired state of LUKS keyslots.
>> 
>> Proposal:
>> 
>> { 'enum': 'LUKSKeyslotState',
>>   'data': [ 'active', 'inactive' ] }
>> 
>> { 'struct': 'LUKSKeyslotActive',
>>   'data': { 'secret': 'str',
>> '*iter-time': 'int' } }
>> 
>> { 'struct': 'LUKSKeyslotInactive',
>>   'data': { '*old-secret': 'str' } }
>> 
>> { 'union': 'LUKSKeyslotAmend',
>>   'base': { '*keyslot': 'int',
>> 'state': 'LUKSKeyslotState' }
>>   'discriminator': 'state',
>>   'data': { 'active': 'LUKSKeyslotActive',
>> 'inactive': 'LUKSKeyslotInactive' } }
>> 
>> LUKSKeyslotAmend specifies desired state for a set of keyslots.
>
> Though not arbitrary sets of keyslots, it's only a single keyslot or
> multiple keyslots containing the same secret. Might be good enough in
> practice, though it means that you may have to issue multiple amend
> commands to get to the final state that you really want (even if doing
> everything at once would be safe).

True.  I traded expressiveness for simplicity.

Here's the only practical case I can think of where the lack of
expressiveness may hurt: replace secrets.

With this interface, you need two operations: activate a free slot with
the new secret, deactivate the slot(s) with the old secret.  There is an
intermediate state with both secrets active.

A more expressive interface could let you do both in one step.  Relevant
only if the implementation actually provides atomicity.  Can it?

>> Four cases:
>> 
>> * @state is "active"
>> 
>>   Desired state is active holding the secret given by @secret.  Optional
>>   @iter-time tweaks key stretching.
>> 
>>   The keyslot is chosen either by the user or by the system, as follows:
>> 
>>   - @keyslot absent
>> 
>> One inactive keyslot chosen by the system.  If none exists, error.
>> 
>>   - @keyslot present
>> 
>> The keyslot given by @keyslot.
>> 
>> If it's already active holding @secret, no-op.  Rationale: the
>> current state is the desired state.
>> 
>> If it's already active holding another secret, error.  Rationale:
>> update in place is unsafe.
>> 
>> Option: delete the "already active holding @secret" case.  Feels
>> inelegant to me.  Okay if it makes things substantially simpler.
>> 
>> * @state is "inactive"
>> 
>>   Desired state is inactive.
>> 
>>   Error if the current state has active keyslots, but the desired state
>>   has none.
>> 
>>   The user choses the keyslot by number and/or by the secret it holds,
>>   as follows:
>> 
>>   - @keyslot absent, @old-secret present
>> 
>> All active keyslots holding @old-secret.  If none exists, error.
>> 
>>   - @keyslot present, @old-secret absent
>> 
>> The keyslot given by @keyslot.
>> 
>> If it's already inactive, no-op.  Rationale: the current state is
>> the desired state.
>> 
>>   - both @keyslot and @old-secret present
>> 
>> The keyslot given by keyslot.
>> 
>> If it's inactive or holds a secret other than @old-secret, error.
>> 
>> Option: error regardless of @old-secret, if that makes things
>> simpler.
>> 
>>   - neither @keyslot not @old-secret present
>> 
>> All keyslots.  Note that this will error out due to "desired state
>> has no active keyslots" unless the current state has none, either.
>> 
>> Option: error out unconditionally.
>> 
>> Note that LUKSKeyslotAmend can specify only one desired state for
>> commonly just one keyslot.  Rationale: this satisfies practical needs.
>> An array of LUKSKeyslotAmend could specify desired state for all
>> keyslots.  However, multiple array elements could then apply to the same
>> slot.  We'd have to specify how to resolve such conflicts, and we'd have
>> to code up conflict detection.  Not worth it.
>> 
>> Examples:
>> 
>> * Add a secret to some free keyslot:
>> 
>>   { "state": "active", "secret": "CIA/GRU/MI6" }
>> 
>> * Deactivate all keyslots holding a secret:
>> 
>>   { "state": "inactive", "old-secret": "CIA/GRU/MI6" }
>> 
>> * Add a secret to a specific keyslot:
>> 
>>   { "state": "active", "secret": "CIA/GRU/MI6", "keyslot": 0 
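
As an illustration only, the "active" arm of the case analysis above can
be written out as slot-selection logic; the names and types here are
invented, and this sketches the proposal's semantics rather than any
actual QEMU code:

    from dataclasses import dataclass
    from typing import List, Optional

    @dataclass
    class Slot:
        active: bool
        secret: Optional[str] = None

    def select_active_slot(slots: List[Slot], keyslot, secret):
        """Return the slot index to write, None for a no-op; raise on error."""
        if keyslot is None:
            free = [i for i, s in enumerate(slots) if not s.active]
            if not free:
                raise ValueError('no inactive keyslot left')
            return free[0]                  # system-chosen slot
        slot = slots[keyslot]
        if slot.active and slot.secret == secret:
            return None                     # current state == desired state
        if slot.active:
            raise ValueError('refusing in-place update of an active keyslot')
        return keyslot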

Re: [GEDI] [PATCH 07/17] gluster: Drop useless has_zero_init callback

2020-02-17 Thread Eric Blake

On 2/17/20 6:03 AM, Eric Blake wrote:


 >
 > Creation or truncation should behave just as on a file on a local
 > filesystem. So maybe qemu_gluster_known_zeroes is not needed at all?

Which version of gluster first required a regular filesystem backing for 
all gluster files?  Does qemu support older versions (in which case, 
what is the correct version-probing invocation to return 0 prior to that 
point, and 1 after), or do all versions supported by qemu already 
guarantee zero initialization on creation or widening truncation by 
virtue of POSIX file semantics (in which case, patch 7 should instead 
switch to using .bdrv_has_zero_init_1 for both functions)?  Per 
configure, we probe for glusterfs_xlator_opt from gluster 4, which 
implies the code still tries to be portable to even older gluster, but 
I'm not sure if this squares with qemu-doc.texi which mentions our 
minimum distro policy (for example, now that qemu requires python 3 
consistent with our distro policy, that rules out several older systems 
where older gluster was likely to be present).


For reference, I quickly found commit efc6c070ac as an example of 
bumping minimum versions (however, that commit is from 2018, so I'm sure 
there are even more recent examples, just not with the same keywords 
that I was searching for).


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [GEDI] [PATCH 07/17] gluster: Drop useless has_zero_init callback

2020-02-17 Thread Eric Blake

On 2/17/20 2:06 AM, Niels de Vos wrote:

On Fri, Jan 31, 2020 at 11:44:26AM -0600, Eric Blake wrote:

block.c already defaults to 0 if we don't provide a callback; there's
no need to write a callback that always fails.

Signed-off-by: Eric Blake 


Reviewed-by: Niels de Vos 



Per your other message,

On 2/17/20 2:16 AM, Niels de Vos wrote:
> On Fri, Jan 31, 2020 at 11:44:31AM -0600, Eric Blake wrote:
>> Since gluster already copies file-posix for lseek usage in block
>> status, it also makes sense to copy it for learning if the image
>> currently reads as all zeroes.
>>

>> +static int qemu_gluster_known_zeroes(BlockDriverState *bs)
>> +{
>> +/*
>> + * GlusterFS volume could be backed by a block device, with no way
>
> Actually, Gluster dropped support for volumes backed by block devices
> (LVM) a few releases back. Nobody could be found that used it, and it
> could not be combined with other Gluster features. All contents on a
> Gluster volume is now always backed by 'normal' files on a filesystem.

That's useful to know.  Thanks!

>
> Creation or truncation should behave just as on a file on a local
> filesystem. So maybe qemu_gluster_known_zeroes is not needed at all?

Which version of gluster first required a regular filesystem backing for 
all gluster files?  Does qemu support older versions (in which case, 
what is the correct version-probing invocation to return 0 prior to that 
point, and 1 after), or do all versions supported by qemu already 
guarantee zero initialization on creation or widening truncation by 
virtue of POSIX file semantics (in which case, patch 7 should instead 
switch to using .bdrv_has_zero_init_1 for both functions)?  Per 
configure, we probe for glusterfs_xlator_opt from gluster 4, which 
implies the code still tries to be portable to even older gluster, but 
I'm not sure if this squares with qemu-doc.texi which mentions our 
minimum distro policy (for example, now that qemu requires python 3 
consistent with our distro policy, that rules out several older systems 
where older gluster was likely to be present).


I'm respinning the series for other reasons, but would like to get this 
right as part of that respin.


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PATCH v2 2/7] block/block-copy: use block_status

2020-02-17 Thread Max Reitz
On 08.02.20 13:25, Vladimir Sementsov-Ogievskiy wrote:
> 07.02.2020 20:46, Max Reitz wrote:
>> On 27.11.19 19:08, Vladimir Sementsov-Ogievskiy wrote:
>>> Use bdrv_block_status_above to choose the effective chunk size and to handle
>>> zeroes effectively.
>>>
>>> This substitutes checking for just being allocated or not, and drops
>>> the old code path for it. Assistance by the backup job is dropped too, as
>>> caching block-status information is more difficult than just caching
>>> is-allocated information in our dirty bitmap, and the backup job is not a
>>> good place for this caching anyway.
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy 
>>> ---
>>>   block/block-copy.c | 67 +-
>>>   block/trace-events |  1 +
>>>   2 files changed, 55 insertions(+), 13 deletions(-)
>>>
>>> diff --git a/block/block-copy.c b/block/block-copy.c
>>> index 8602e2cae7..74295d93d5 100644
>>> --- a/block/block-copy.c
>>> +++ b/block/block-copy.c
>>
>> [...]
>>
>>> @@ -336,23 +375,25 @@ int coroutine_fn block_copy(BlockCopyState *s,
>>>   chunk_end = next_zero;
>>>   }
>>>   -    if (s->skip_unallocated) {
>>> -    ret = block_copy_reset_unallocated(s, start,
>>> _bytes);
>>> -    if (ret == 0) {
>>> -    trace_block_copy_skip_range(s, start, status_bytes);
>>> -    start += status_bytes;
>>> -    continue;
>>> -    }
>>> -    /* Clamp to known allocated region */
>>> -    chunk_end = MIN(chunk_end, start + status_bytes);
>>> +    ret = block_copy_block_status(s, start, chunk_end - start,
>>> +  _bytes);
>>> +    if (s->skip_unallocated && !(ret & BDRV_BLOCK_ALLOCATED)) {
>>> +    bdrv_reset_dirty_bitmap(s->copy_bitmap, start,
>>> status_bytes);
>>> +    s->progress_reset_callback(s->progress_opaque);
>>> +    trace_block_copy_skip_range(s, start, status_bytes);
>>> +    start += status_bytes;
>>> +    continue;
>>>   }
>>>   +    chunk_end = MIN(chunk_end, start + status_bytes);
>>
>> I’m not sure how much the old “Clamp to known allocated region” actually
>> helps, but I wouldn’t drop it anyway.
> 
> But it may be not allocated now. We just clamp to status_bytes.
> It's "known allocated" only if s->skip_unallocated is true.

Ah, yes, I suppose I was just thinking about that case when trying to
understand how the code changes.  So:

Reviewed-by: Max Reitz 

Max





Re: [PATCH v4 3/4] qmp: Move dispatcher to a coroutine

2020-02-17 Thread Markus Armbruster
This is the hairy one.  Please bear with me while I try to grok it.

Kevin Wolf  writes:

> This moves the QMP dispatcher to a coroutine and runs all QMP command
> handlers that declare 'coroutine': true in coroutine context so they
> can avoid blocking the main loop while doing I/O or waiting for other
> events.
>
> For commands that are not declared safe to run in a coroutine, the
> dispatcher drops out of coroutine context by calling the QMP command
> handler from a bottom half.
>
> Signed-off-by: Kevin Wolf 
> ---
>  include/qapi/qmp/dispatch.h |   1 +
>  monitor/monitor-internal.h  |   6 +-
>  monitor/monitor.c   |  33 ---
>  monitor/qmp.c   | 110 ++--
>  qapi/qmp-dispatch.c |  44 ++-
>  qapi/qmp-registry.c |   3 +
>  util/aio-posix.c|   7 ++-
>  7 files changed, 162 insertions(+), 42 deletions(-)
>
> diff --git a/include/qapi/qmp/dispatch.h b/include/qapi/qmp/dispatch.h
> index d6ce9efc8e..6812e49b5f 100644
> --- a/include/qapi/qmp/dispatch.h
> +++ b/include/qapi/qmp/dispatch.h
> @@ -30,6 +30,7 @@ typedef enum QmpCommandOptions
>  typedef struct QmpCommand
>  {
>  const char *name;
> +/* Runs in coroutine context if QCO_COROUTINE is set */
>  QmpCommandFunc *fn;
>  QmpCommandOptions options;
>  QTAILQ_ENTRY(QmpCommand) node;
> diff --git a/monitor/monitor-internal.h b/monitor/monitor-internal.h
> index d78f5ca190..f180d03368 100644
> --- a/monitor/monitor-internal.h
> +++ b/monitor/monitor-internal.h
> @@ -154,7 +154,9 @@ static inline bool monitor_is_qmp(const Monitor *mon)
>  
>  typedef QTAILQ_HEAD(MonitorList, Monitor) MonitorList;
>  extern IOThread *mon_iothread;
> -extern QEMUBH *qmp_dispatcher_bh;
> +extern Coroutine *qmp_dispatcher_co;
> +extern bool qmp_dispatcher_co_shutdown;
> +extern bool qmp_dispatcher_co_busy;
>  extern QmpCommandList qmp_commands, qmp_cap_negotiation_commands;
>  extern QemuMutex monitor_lock;
>  extern MonitorList mon_list;
> @@ -172,7 +174,7 @@ void monitor_fdsets_cleanup(void);
>  
>  void qmp_send_response(MonitorQMP *mon, const QDict *rsp);
>  void monitor_data_destroy_qmp(MonitorQMP *mon);
> -void monitor_qmp_bh_dispatcher(void *data);
> +void coroutine_fn monitor_qmp_dispatcher_co(void *data);
>  
>  int get_monitor_def(int64_t *pval, const char *name);
>  void help_cmd(Monitor *mon, const char *name);
> diff --git a/monitor/monitor.c b/monitor/monitor.c
> index 12898b6448..e753fa435d 100644
> --- a/monitor/monitor.c
> +++ b/monitor/monitor.c
> @@ -53,8 +53,18 @@ typedef struct {
>  /* Shared monitor I/O thread */
>  IOThread *mon_iothread;
>  
> -/* Bottom half to dispatch the requests received from I/O thread */
> -QEMUBH *qmp_dispatcher_bh;
> +/* Coroutine to dispatch the requests received from I/O thread */
> +Coroutine *qmp_dispatcher_co;
> +
> +/* Set to true when the dispatcher coroutine should terminate */
> +bool qmp_dispatcher_co_shutdown;
> +
> +/*
> + * true if the coroutine is active and processing requests. The coroutine may
> + * only be woken up externally (e.g. from the monitor thread) after changing
> + * qmp_dispatcher_co_busy from false to true (e.g. using atomic_xchg).
> + */

I'm not sure what you mean by "externally".

Also mention how it changes from true to false?

Note to self: monitor_qmp_dispatcher_co() checks busy is true on resume.

Nitpick: wrap around column 70, two spaces between sentences for
consistency with other comments in this file, please.

> +bool qmp_dispatcher_co_busy;
>  
>  /* Protects mon_list, monitor_qapi_event_state, monitor_destroyed.  */
>  QemuMutex monitor_lock;
> @@ -579,9 +589,16 @@ void monitor_cleanup(void)

monitor_cleanup() runs in the main thread.

Coroutine qmp_dispatcher_co also runs in the main thread, right?

>  }
>  qemu_mutex_unlock(&monitor_lock);
>  
> -/* QEMUBHs needs to be deleted before destroying the I/O thread */
> -qemu_bh_delete(qmp_dispatcher_bh);
> -qmp_dispatcher_bh = NULL;
> +/* The dispatcher needs to stop before destroying the I/O thread */
> +qmp_dispatcher_co_shutdown = true;

The coroutine switch ensures qmp_dispatcher_co sees this write, so no
need for a barrier.  Correct?

> +if (!atomic_xchg(&qmp_dispatcher_co_busy, true)) {

Why do we need atomic?  I figure it's because qmp_dispatcher_co_busy is
accessed from multiple threads (main thread and mon_iothread), unlike
qmp_dispatcher_co_shutdown.

What kind of atomic?  I'm asking because you use sequentially consistent
atomic_xchg() together with the weaker atomic_mb_set() and
atomic_mb_read().

> +aio_co_wake(qmp_dispatcher_co);
> +}
> +
> +AIO_WAIT_WHILE(qemu_get_aio_context(),
> +   (aio_poll(iohandler_get_aio_context(), false),
> +atomic_mb_read(&qmp_dispatcher_co_busy)));

This waits for qmp_dispatcher_co_busy to become false again.  While
waiting, pending AIO work is given a chance to progress, as long as it
doesn't block.

The only 
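
The busy-flag handshake being discussed can be modeled outside QEMU; in
the following small Python sketch a lock-protected exchange stands in
for atomic_xchg() and all names are invented. It shows why gating the
wakeup on the false-to-true transition guarantees a single waker:

    import threading

    class AtomicFlag:
        """Minimal stand-in for QEMU's atomic_xchg() on a bool."""
        def __init__(self):
            self._lock = threading.Lock()
            self._value = False

        def exchange(self, new):
            with self._lock:
                old, self._value = self._value, new
                return old

    busy = AtomicFlag()

    def wake_dispatcher(schedule):
        # only the caller that flips busy from False to True may schedule
        # the coroutine, so it is never entered twice concurrently; the
        # dispatcher sets busy back to False when it goes idle
        if not busy.exchange(True):
            schedule()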

Re: QAPI schema for desired state of LUKS keyslots (was: [PATCH 02/13] qcrypto-luks: implement encryption key management)

2020-02-17 Thread Maxim Levitsky
On Mon, 2020-02-17 at 11:37 +0100, Kevin Wolf wrote:
> Am 15.02.2020 um 15:51 hat Markus Armbruster geschrieben:
> > Review of this patch led to a lengthy QAPI schema design discussion.
> > Let me try to condense it into a concrete proposal.
> > 
> > This is about the QAPI schema, and therefore about QMP.  The
> > human-friendly interface is out of scope.  Not because it's not
> > important (it clearly is!), only because we need to *focus* to have a
> > chance at success.
> > 
> > I'm going to include a few design options.  I'll mark them "Option:".
> > 
> > The proposed "amend" interface takes a specification of desired state,
> > and figures out how to get from here to there by itself.  LUKS keyslots
> > are one part of desired state.
> > 
> > We commonly have eight LUKS keyslots.  Each keyslot is either active or
> > inactive.  An active keyslot holds a secret.
> > 
> > Goal: a QAPI type for specifying desired state of LUKS keyslots.
> > 
> > Proposal:
> > 
> > { 'enum': 'LUKSKeyslotState',
> >   'data': [ 'active', 'inactive' ] }
> > 
> > { 'struct': 'LUKSKeyslotActive',
> >   'data': { 'secret': 'str',
> > '*iter-time': 'int' } }
> > 
> > { 'struct': 'LUKSKeyslotInactive',
> >   'data': { '*old-secret': 'str' } }
> > 
> > { 'union': 'LUKSKeyslotAmend',
> >   'base': { '*keyslot': 'int',
> > 'state': 'LUKSKeyslotState' }
> >   'discriminator': 'state',
> >   'data': { 'active': 'LUKSKeyslotActive',
> > 'inactive': 'LUKSKeyslotInactive' } }
> > 
> > LUKSKeyslotAmend specifies desired state for a set of keyslots.
> 
> Though not arbitrary sets of keyslots, it's only a single keyslot or
> multiple keyslots containing the same secret. Might be good enough in
> practice, though it means that you may have to issue multiple amend
> commands to get to the final state that you really want (even if doing
> everything at once would be safe).
> 
> > Four cases:
> > 
> > * @state is "active"
> > 
> >   Desired state is active holding the secret given by @secret.  Optional
> >   @iter-time tweaks key stretching.
> > 
> >   The keyslot is chosen either by the user or by the system, as follows:
> > 
> >   - @keyslot absent
> > 
> > One inactive keyslot chosen by the system.  If none exists, error.
> > 
> >   - @keyslot present
> > 
> > The keyslot given by @keyslot.
> > 
> > If it's already active holding @secret, no-op.  Rationale: the
> > current state is the desired state.
> > 
> > If it's already active holding another secret, error.  Rationale:
> > update in place is unsafe.
> > 
> > Option: delete the "already active holding @secret" case.  Feels
> > inelegant to me.  Okay if it makes things substantially simpler.
> > 
> > * @state is "inactive"
> > 
> >   Desired state is inactive.
> > 
> >   Error if the current state has active keyslots, but the desired state
> >   has none.
> > 
> >   The user chooses the keyslot by number and/or by the secret it holds,
> >   as follows:
> > 
> >   - @keyslot absent, @old-secret present
> > 
> > All active keyslots holding @old-secret.  If none exists, error.
> > 
> >   - @keyslot present, @old-secret absent
> > 
> > The keyslot given by @keyslot.
> > 
> > If it's already inactive, no-op.  Rationale: the current state is
> > the desired state.
> > 
> >   - both @keyslot and @old-secret present
> > 
> > The keyslot given by @keyslot.
> > 
> > If it's inactive or holds a secret other than @old-secret, error.
> > 
> > Option: error regardless of @old-secret, if that makes things
> > simpler.
> > 
> >   - neither @keyslot nor @old-secret present
> > 
> > All keyslots.  Note that this will error out due to "desired state
> > has no active keyslots" unless the current state has none, either.
> > 
> > Option: error out unconditionally.
> > 
> > Note that LUKSKeyslotAmend can specify only one desired state for
> > commonly just one keyslot.  Rationale: this satisfies practical needs.
> > An array of LUKSKeyslotAmend could specify desired state for all
> > keyslots.  However, multiple array elements could then apply to the same
> > slot.  We'd have to specify how to resolve such conflicts, and we'd have
> > to code up conflict detection.  Not worth it.
> > 
> > Examples:
> > 
> > * Add a secret to some free keyslot:
> > 
> >   { "state": "active", "secret": "CIA/GRU/MI6" }
> > 
> > * Deactivate all keyslots holding a secret:
> > 
> >   { "state": "inactive", "old-secret": "CIA/GRU/MI6" }
> > 
> > * Add a secret to a specific keyslot:
> > 
> >   { "state": "active", "secret": "CIA/GRU/MI6", "keyslot": 0 }
> > 
> > * Deactivate a specific keyslot:
> > 
> >   { "state": "inactive", "keyslot": 0 }
> > 
> >   Possibly less dangerous:
> > 
> >   { "state": "inactive", "keyslot": 0, "old-secret": "CIA/GRU/MI6" }
> > 
> > Option: Make use of Max's patches to support optional union tag with
> > default value to let us default @state to "active".

Re: [PATCH 1/3] block/qcow2-bitmap: Remove unneeded variable assignment

2020-02-17 Thread Kevin Wolf
On 15.02.2020 at 17:15, Philippe Mathieu-Daudé wrote:
> Fix warning reported by Clang static code analyzer:
> 
> CC  block/qcow2-bitmap.o
>   block/qcow2-bitmap.c:650:5: warning: Value stored to 'ret' is never read
>   ret = -EINVAL;
>   ^ ~~~
> 
> Reported-by: Clang Static Analyzer
> Signed-off-by: Philippe Mathieu-Daudé 

This isn't hw/, so I'm taking it through my tree. Thanks, applied to the
block branch.

Kevin




Re: x-blockdev-reopen & block-dirty-bitmaps

2020-02-17 Thread Kevin Wolf
On 17.02.2020 at 11:28, Alberto Garcia wrote:
> On Fri 14 Feb 2020 07:54:36 PM CET, John Snow wrote:
> 
> > Probably a loaded question, but:
> >
> > - What's needed to make the interface stable?
> > - Are there known problem points?
> 
> As far as I'm aware there's nothing needed to make x-blockdev-reopen
> stable. It was just marked experimental because it's a relatively
> complex operation and I wanted to have some margin to detect bugs or
> other possible problems.

Maybe some more test cases for potentially problematic things like
attaching a backing file node that is in a different AioContext than the
new parent. But I think the recent AioContext management fixes should
have taken care of this. (Actually, bdrv_reopen_parse_backing() still
forbids this case and has a "TODO" comment there.)

Hm... In fact I guess it would still be broken:

bdrv_set_backing_hd(bs, reopen_state->new_backing_bs, &error_abort);

This is bdrv_reopen_commit(), so we can't return an error, but
bdrv_set_backing_hd() can potentially fail if an AioContext switch is
involved. If we move it to bdrv_reopen_prepare() instead, it could work,
though.
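
Something like this, I suppose (only a sketch of the idea, untested;
the surrounding plumbing and return code are assumed):

    /* In bdrv_reopen_prepare(), where we can still fail cleanly: */
    Error *local_err = NULL;

    bdrv_set_backing_hd(bs, reopen_state->new_backing_bs, &local_err);
    if (local_err) {
        error_propagate(errp, local_err);
        return -EINVAL;
    }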

Another thing that we probably want to add is changing bs->file,
specifically in order to insert or remove filter nodes. If we don't do
this before marking blockdev-reopen stable, introspection wouldn't be
able to detect when we later add it. (We have QAPI features now to work
around this, but...)

Kevin




Re: QAPI schema for desired state of LUKS keyslots (was: [PATCH 02/13] qcrypto-luks: implement encryption key management)

2020-02-17 Thread Kevin Wolf
On 15.02.2020 at 15:51, Markus Armbruster wrote:
> Review of this patch led to a lengthy QAPI schema design discussion.
> Let me try to condense it into a concrete proposal.
> 
> This is about the QAPI schema, and therefore about QMP.  The
> human-friendly interface is out of scope.  Not because it's not
> important (it clearly is!), only because we need to *focus* to have a
> chance at success.
> 
> I'm going to include a few design options.  I'll mark them "Option:".
> 
> The proposed "amend" interface takes a specification of desired state,
> and figures out how to get from here to there by itself.  LUKS keyslots
> are one part of desired state.
> 
> We commonly have eight LUKS keyslots.  Each keyslot is either active or
> inactive.  An active keyslot holds a secret.
> 
> Goal: a QAPI type for specifying desired state of LUKS keyslots.
> 
> Proposal:
> 
> { 'enum': 'LUKSKeyslotState',
>   'data': [ 'active', 'inactive' ] }
> 
> { 'struct': 'LUKSKeyslotActive',
>   'data': { 'secret': 'str',
>             '*iter-time': 'int' } }
> 
> { 'struct': 'LUKSKeyslotInactive',
>   'data': { '*old-secret': 'str' } }
> 
> { 'union': 'LUKSKeyslotAmend',
>   'base': { '*keyslot': 'int',
>             'state': 'LUKSKeyslotState' },
>   'discriminator': 'state',
>   'data': { 'active': 'LUKSKeyslotActive',
>             'inactive': 'LUKSKeyslotInactive' } }
> 
> LUKSKeyslotAmend specifies desired state for a set of keyslots.

Though not arbitrary sets of keyslots, it's only a single keyslot or
multiple keyslots containing the same secret. Might be good enough in
practice, though it means that you may have to issue multiple amend
commands to get to the final state that you really want (even if doing
everything at once would be safe).

> Four cases:
> 
> * @state is "active"
> 
>   Desired state is active holding the secret given by @secret.  Optional
>   @iter-time tweaks key stretching.
> 
>   The keyslot is chosen either by the user or by the system, as follows:
> 
>   - @keyslot absent
> 
> One inactive keyslot chosen by the system.  If none exists, error.
> 
>   - @keyslot present
> 
> The keyslot given by @keyslot.
> 
> If it's already active holding @secret, no-op.  Rationale: the
> current state is the desired state.
> 
> If it's already active holding another secret, error.  Rationale:
> update in place is unsafe.
> 
> Option: delete the "already active holding @secret" case.  Feels
> inelegant to me.  Okay if it makes things substantially simpler.
> 
> * @state is "inactive"
> 
>   Desired state is inactive.
> 
>   Error if the current state has active keyslots, but the desired state
>   has none.
> 
>   The user chooses the keyslot by number and/or by the secret it holds,
>   as follows:
> 
>   - @keyslot absent, @old-secret present
> 
> All active keyslots holding @old-secret.  If none exists, error.
> 
>   - @keyslot present, @old-secret absent
> 
> The keyslot given by @keyslot.
> 
> If it's already inactive, no-op.  Rationale: the current state is
> the desired state.
> 
>   - both @keyslot and @old-secret present
> 
> The keyslot given by @keyslot.
> 
> If it's inactive or holds a secret other than @old-secret, error.
> 
> Option: error regardless of @old-secret, if that makes things
> simpler.
> 
>   - neither @keyslot nor @old-secret present
> 
> All keyslots.  Note that this will error out due to "desired state
> has no active keyslots" unless the current state has none, either.
> 
> Option: error out unconditionally.
> 
> Note that LUKSKeyslotAmend can specify only one desired state for
> commonly just one keyslot.  Rationale: this satisfies practical needs.
> An array of LUKSKeyslotAmend could specify desired state for all
> keyslots.  However, multiple array elements could then apply to the same
> slot.  We'd have to specify how to resolve such conflicts, and we'd have
> to code up conflict detection.  Not worth it.
> 
> Examples:
> 
> * Add a secret to some free keyslot:
> 
>   { "state": "active", "secret": "CIA/GRU/MI6" }
> 
> * Deactivate all keyslots holding a secret:
> 
>   { "state": "inactive", "old-secret": "CIA/GRU/MI6" }
> 
> * Add a secret to a specific keyslot:
> 
>   { "state": "active", "secret": "CIA/GRU/MI6", "keyslot": 0 }
> 
> * Deactivate a specific keyslot:
> 
>   { "state": "inactive", "keyslot": 0 }
> 
>   Possibly less dangerous:
> 
>   { "state": "inactive", "keyslot": 0, "old-secret": "CIA/GRU/MI6" }
> 
> Option: Make use of Max's patches to support optional union tag with
> default value to let us default @state to "active".  I doubt this makes
> much of a difference in QMP.  A human-friendly interface should probably
> be higher level anyway (Daniel pointed to cryptsetup).
> 
> Option: LUKSKeyslotInactive member @old-secret could also be named
> @secret.  I don't care.
> 
> Option: delete @keyslot.  It provides low-level slot access.
> 

Re: x-blockdev-reopen & block-dirty-bitmaps

2020-02-17 Thread Alberto Garcia
On Fri 14 Feb 2020 07:54:36 PM CET, John Snow wrote:

> Probably a loaded question, but:
>
> - What's needed to make the interface stable?
> - Are there known problem points?

As far as I'm aware there's nothing needed to make x-blockdev-reopen
stable. It was just marked experimental because it's a relatively
complex operation and I wanted to have some margin to detect bugs or
other possible problems.

Berto



Re: [PATCH v2] configure: Avoid compiling system tools on user build by default

2020-02-17 Thread Laurent Vivier
On 17/02/2020 at 10:59, Philippe Mathieu-Daudé wrote:
> From: Philippe Mathieu-Daudé 
> 
> User-mode does not need the system tools. Do not build them by
> default if the user specified --disable-system.
> 
> This disables building the following binaries on a user-only build:
> 
> - elf2dmp
> - qemu-edid
> - qemu-ga
> - qemu-img
> - qemu-io
> - qemu-nbd
> - ivshmem-client
> - ivshmem-server
> 
> The qemu-user binary is not affected by this change.
> 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
> v2:
> - use simpler if/else statement (therefore not adding Richard R-b)
> - improved description (Aleksandar)
> ---
>  configure | 12 +++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/configure b/configure
> index 16f94cd96b..d1877a60f5 100755
> --- a/configure
> +++ b/configure
> @@ -455,7 +455,7 @@ guest_agent_ntddscsi="no"
>  guest_agent_msi=""
>  vss_win32_sdk=""
>  win_sdk="no"
> -want_tools="yes"
> +want_tools=""
>  libiscsi=""
>  libnfs=""
>  coroutine=""
> @@ -2199,6 +2199,16 @@ else
>  echo big/little test failed
>  fi
>  
> +##
> +# system tools
> +if test -z "$want_tools"; then
> +if test "$softmmu" = "no"; then
> +want_tools=no
> +else
> +want_tools=yes
> +fi
> +fi
> +
>  ##
>  # cocoa implies not SDL or GTK
>  # (the cocoa UI code currently assumes it is always the active UI
> 

Reviewed-by: Laurent Vivier 



Re: [PATCH 1/7] qapi: Document meaning of 'ignore' BlockdevOnError for jobs

2020-02-17 Thread Kevin Wolf
On 16.02.2020 at 22:44, Ján Tomko wrote:
> On Fri, Feb 14, 2020 at 09:08:06PM +0100, Kevin Wolf wrote:
> > It is not obvious what 'ignore' actually means for block jobs: It could
> > be continuing the job and returning success in the end despite the error
> > (no block job does this). It could also mean continuing and returning
> > failure in the end (this is what stream does). And it can mean retrying
> > the failed request later (this is what backup, commit and mirror do).
> > 
> > This (somewhat inconsistent) behaviour was introduced and described for
> > stream and mirror in commit ae586d6158. backup and commit were
> 
> fatal: ambiguous argument 'ae586d6158': unknown revision or path not in the 
> working tree.

Oops, thanks for catching this. Not sure how this happened, but
32c81a4a6ec is the correct commit.

Kevin


signature.asc
Description: PGP signature


[PATCH v2] configure: Avoid compiling system tools on user build by default

2020-02-17 Thread Philippe Mathieu-Daudé
From: Philippe Mathieu-Daudé 

User-mode does not need the system tools. Do not build them by
default if the user specified --disable-system.

This disables building the following binaries on a user-only build:

- elf2dmp
- qemu-edid
- qemu-ga
- qemu-img
- qemu-io
- qemu-nbd
- ivshmem-client
- ivshmem-server

The qemu-user binary is not affected by this change.

Signed-off-by: Philippe Mathieu-Daudé 
---
v2:
- use simpler if/else statement (therefore not adding Richard R-b)
- improved description (Aleksandar)
---
 configure | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/configure b/configure
index 16f94cd96b..d1877a60f5 100755
--- a/configure
+++ b/configure
@@ -455,7 +455,7 @@ guest_agent_ntddscsi="no"
 guest_agent_msi=""
 vss_win32_sdk=""
 win_sdk="no"
-want_tools="yes"
+want_tools=""
 libiscsi=""
 libnfs=""
 coroutine=""
@@ -2199,6 +2199,16 @@ else
 echo big/little test failed
 fi
 
+##
+# system tools
+if test -z "$want_tools"; then
+if test "$softmmu" = "no"; then
+want_tools=no
+else
+want_tools=yes
+fi
+fi
+
 ##
 # cocoa implies not SDL or GTK
 # (the cocoa UI code currently assumes it is always the active UI
-- 
2.21.1




Re: x-blockdev-reopen & block-dirty-bitmaps

2020-02-17 Thread Kevin Wolf
On 14.02.2020 at 21:32, John Snow wrote:
> On 2/14/20 3:19 PM, Kevin Wolf wrote:
> > On 14.02.2020 at 19:54, John Snow wrote:
> >> Hi, what work remains to make this a stable interface, is it known?
> >>
> >> We're having a problem with bitmaps where in some cases libvirt wants to
> >> commit an image back down to a base image but -- for various reasons --
> >> the bitmap was left enabled in the backing image, so it would accrue new
> >> writes during the commit.
> >>
> >> Normally, when creating a snapshot using blockdev-snapshot, the backing
> >> file becomes RO and all of the bitmaps become RO too.
> >>
> >> The goal is to be able to disable (or enable) bitmaps from a backing
> >> file before (or atomically just before) a commit operation to allow
> >> libvirt greater control on snapshot commit.
> >>
> >> Now, in my own testing, we can reopen a backing file just fine, delete
> >> or disable a bitmap and be done with it -- but the interface isn't
> >> stable, so libvirt will be reluctant to use such tricks.
> >>
> >> Probably a loaded question, but:
> >>
> >> - What's needed to make the interface stable?
> >> - Are there known problem points?
> >> - Any suggestions for workarounds in the meantime?
> > 
> > I think I've asked this before, but I don't remember the answer...
> > 
> > What would be the problem with letting the enable/disable command
> > temporarily reopen the backing file read-write, like the commit job [1]
> > does?
> > 
> 
> I guess no problem? I wouldn't want it to do this automatically, but
> perhaps we could add a "force=True" bool where it tries to do just this.
> 
> (And once reopen works properly we could deprecate this workaround again.)

I'm not sure if I would view this only as a workaround, but if you like
it better with force=true, I won't object either.

> > [1] I mean, I know that this code is wrong strictly speaking because we
> > really should be counting read-write users [2] rather than blindly
> > making the node read-only at the end of the operation - but somehow
> > it seems to work in practice for commit jobs.
> > 
> > [2] Counting really means just looking at parent BdrvChild links that
> > have WRITE permissions. I guess doing it right would mean getting
> > rid of BlockDriverState.read_only (which is redundant) and using
> > only permissions.
> 
> OK, sounds good. I'll make a mockup that tries to accurately detect
> read-only-ness and reverts changes only if it made any to begin with.

Fixing it for commit would be appreciated, though as I said it seems to
be a theoretical case mostly because we never got bug reports for it.

For bitmaps it's even more theoretical because we hold the BQL between
the switch to read-write and the switch back. So I don't think we can
actually run into this case for the bitmap enable/disable operation.
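
For completeness, the counting from [2] would look something like
this (untested sketch; field and permission names as in the current
block layer):

    static bool bdrv_has_writable_parent(BlockDriverState *bs)
    {
        BdrvChild *c;

        /* A node is effectively read-write as long as at least one
         * parent BdrvChild still holds WRITE permission on it. */
        QLIST_FOREACH(c, &bs->parents, next_parent) {
            if (c->perm & BLK_PERM_WRITE) {
                return true;
            }
        }
        return false;
    }

Only when this returns false may the operation flip the node back to
read-only.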

Kevin




Re: [PATCH 1/3] block/qcow2-bitmap: Remove unneeded variable assignment

2020-02-17 Thread Vladimir Sementsov-Ogievskiy

15.02.2020 19:15, Philippe Mathieu-Daudé wrote:

Fix warning reported by Clang static code analyzer:

 CC  block/qcow2-bitmap.o
   block/qcow2-bitmap.c:650:5: warning: Value stored to 'ret' is never read
   ret = -EINVAL;
   ^ ~~~

Reported-by: Clang Static Analyzer
Signed-off-by: Philippe Mathieu-Daudé 
---
  block/qcow2-bitmap.c | 1 -
  1 file changed, 1 deletion(-)

diff --git a/block/qcow2-bitmap.c b/block/qcow2-bitmap.c
index d41f5d049b..82c9f3 100644
--- a/block/qcow2-bitmap.c
+++ b/block/qcow2-bitmap.c
@@ -647,7 +647,6 @@ static Qcow2BitmapList *bitmap_list_load(BlockDriverState *bs, uint64_t offset,
     return bm_list;
 
 broken_dir:
-    ret = -EINVAL;
     error_setg(errp, "Broken bitmap directory");
 
 fail:




Reviewed-by: Vladimir Sementsov-Ogievskiy 


--
Best regards,
Vladimir



Re: QAPI schema for desired state of LUKS keyslots

2020-02-17 Thread Maxim Levitsky
On Mon, 2020-02-17 at 07:45 +0100, Markus Armbruster wrote:
> Maxim Levitsky  writes:
> 
> > On Sat, 2020-02-15 at 15:51 +0100, Markus Armbruster wrote:
> > > Review of this patch led to a lengthy QAPI schema design discussion.
> > > Let me try to condense it into a concrete proposal.
> > > 
> > > This is about the QAPI schema, and therefore about QMP.  The
> > > human-friendly interface is out of scope.  Not because it's not
> > > important (it clearly is!), only because we need to *focus* to have a
> > > chance at success.
> > 
> > 100% agree.
> > > 
> > > I'm going to include a few design options.  I'll mark them "Option:".
> > > 
> > > The proposed "amend" interface takes a specification of desired state,
> > > and figures out how to get from here to there by itself.  LUKS keyslots
> > > are one part of desired state.
> > > 
> > > We commonly have eight LUKS keyslots.  Each keyslot is either active or
> > > inactive.  An active keyslot holds a secret.
> > > 
> > > Goal: a QAPI type for specifying desired state of LUKS keyslots.
> > > 
> > > Proposal:
> > > 
> > > { 'enum': 'LUKSKeyslotState',
> > >   'data': [ 'active', 'inactive' ] }
> > > 
> > > { 'struct': 'LUKSKeyslotActive',
> > >   'data': { 'secret': 'str',
> > >             '*iter-time': 'int' } }
> > > 
> > > { 'struct': 'LUKSKeyslotInactive',
> > >   'data': { '*old-secret': 'str' } }
> > > 
> > > { 'union': 'LUKSKeyslotAmend',
> > >   'base': { '*keyslot': 'int',
> > >             'state': 'LUKSKeyslotState' },
> > >   'discriminator': 'state',
> > >   'data': { 'active': 'LUKSKeyslotActive',
> > >             'inactive': 'LUKSKeyslotInactive' } }
> > > 
> > > LUKSKeyslotAmend specifies desired state for a set of keyslots.
> > > 
> > > Four cases:
> > > 
> > > * @state is "active"
> > > 
> > >   Desired state is active holding the secret given by @secret.  Optional
> > >   @iter-time tweaks key stretching.
> > > 
> > >   The keyslot is chosen either by the user or by the system, as follows:
> > > 
> > >   - @keyslot absent
> > > 
> > > One inactive keyslot chosen by the system.  If none exists, error.
> > > 
> > >   - @keyslot present
> > > 
> > > The keyslot given by @keyslot.
> > > 
> > > If it's already active holding @secret, no-op.  Rationale: the
> > > current state is the desired state.
> > > 
> > > If it's already active holding another secret, error.  Rationale:
> > > update in place is unsafe.
> > > 
> > > Option: delete the "already active holding @secret" case.  Feels
> > > inelegant to me.  Okay if it makes things substantially simpler.
> > 
> > I didn't really understand this, since in state=active we shouldn't
> > delete anything. Looks OK otherwise.
> 
> Let me try to clarify.
> 
> Option: make the "already active holding @secret" case an error like the
> "already active holding another secret" case.  In longhand:
> 
>  - @keyslot present
> 
>The keyslot given by @keyslot.
> 
>If it's already active, error.
> 
> It feels inelegant to me, because it deviates from "specify desired
> state" paradigm: the specified desired state is fine, the way to get
> there from current state is obvious (do nothing), yet it's still an
> error.
Yep, although in theory we also specify the iteration count, which might
not match (and it will never match exactly, since it is benchmark-based);
thus if the user specified it we might err out, and otherwise indeed
ignore it. This is of course a very minor issue.

> 
> > > * @state is "inactive"
> > > 
> > >   Desired state is inactive.
> > > 
> > >   Error if the current state has active keyslots, but the desired state
> > >   has none.
> > > 
> > >   The user chooses the keyslot by number and/or by the secret it holds,
> > >   as follows:
> > > 
> > >   - @keyslot absent, @old-secret present
> > > 
> > > All active keyslots holding @old-secret.  If none exists, error.
> > > 
> > >   - @keyslot present, @old-secret absent
> > > 
> > > The keyslot given by @keyslot.
> > > 
> > > If it's already inactive, no-op.  Rationale: the current state is
> > > the desired state.
> > > 
> > >   - both @keyslot and @old-secret present
> > > 
> > > The keyslot given by @keyslot.
> > > 
> > > If it's inactive or holds a secret other than @old-secret, error.
> > 
> > Yea, that would be very nice to have.
> > > 
> > > Option: error regardless of @old-secret, if that makes things
> > > simpler.
> > > 
> > >   - neither @keyslot nor @old-secret present
> > > 
> > > All keyslots.  Note that this will error out due to "desired state
> > > has no active keyslots" unless the current state has none, either.
> > > 
> > > Option: error out unconditionally.
> > 
> > Yep, that the best IMHO.
> 
> It's a matter of taste.
> 
> If we interpret "both absent" as "all keyslots", then all cases flow out
> of the following simple spec:
> 
> 0. Start with the set of all keyslots
> 
> 1. If 

Re: [GEDI] [PATCH 12/17] gluster: Support BDRV_ZERO_OPEN

2020-02-17 Thread Niels de Vos
On Fri, Jan 31, 2020 at 11:44:31AM -0600, Eric Blake wrote:
> Since gluster already copies file-posix for lseek usage in block
> status, it also makes sense to copy it for learning if the image
> currently reads as all zeroes.
> 
> Signed-off-by: Eric Blake 
> ---
>  block/gluster.c | 20 
>  1 file changed, 20 insertions(+)
> 
> diff --git a/block/gluster.c b/block/gluster.c
> index 9d952c70981b..0417a86547c8 100644
> --- a/block/gluster.c
> +++ b/block/gluster.c
> @@ -1464,6 +1464,22 @@ exit:
>  return -ENOTSUP;
>  }
> 
> +static int qemu_gluster_known_zeroes(BlockDriverState *bs)
> +{
> +/*
> + * GlusterFS volume could be backed by a block device, with no way

Actually, Gluster dropped support for volumes backed by block devices
(LVM) a few releases back. Nobody could be found who used it, and it
could not be combined with other Gluster features. All contents on a
Gluster volume are now always backed by 'normal' files on a filesystem.

Creation or truncation should behave just as on a file on a local
filesystem. So maybe qemu_gluster_known_zeroes is not needed at all?

Niels


> + * to query if regions added by creation or truncation will read
> + * as zeroes.  However, we can use lseek(SEEK_DATA) to check if
> + * contents currently read as zero.
> + */
> +off_t data, hole;
> +
> +if (find_allocation(bs, 0, &data, &hole) == -ENXIO) {
> +return BDRV_ZERO_OPEN;
> +}
> +return 0;
> +}
> +
>  /*
>   * Returns the allocation status of the specified offset.
>   *
> @@ -1561,6 +1577,7 @@ static BlockDriver bdrv_gluster = {
>  .bdrv_co_readv= qemu_gluster_co_readv,
>  .bdrv_co_writev   = qemu_gluster_co_writev,
>  .bdrv_co_flush_to_disk= qemu_gluster_co_flush_to_disk,
> +.bdrv_known_zeroes= qemu_gluster_known_zeroes,
>  #ifdef CONFIG_GLUSTERFS_DISCARD
>  .bdrv_co_pdiscard = qemu_gluster_co_pdiscard,
>  #endif
> @@ -1591,6 +1608,7 @@ static BlockDriver bdrv_gluster_tcp = {
>  .bdrv_co_readv= qemu_gluster_co_readv,
>  .bdrv_co_writev   = qemu_gluster_co_writev,
>  .bdrv_co_flush_to_disk= qemu_gluster_co_flush_to_disk,
> +.bdrv_known_zeroes= qemu_gluster_known_zeroes,
>  #ifdef CONFIG_GLUSTERFS_DISCARD
>  .bdrv_co_pdiscard = qemu_gluster_co_pdiscard,
>  #endif
> @@ -1621,6 +1639,7 @@ static BlockDriver bdrv_gluster_unix = {
>  .bdrv_co_readv= qemu_gluster_co_readv,
>  .bdrv_co_writev   = qemu_gluster_co_writev,
>  .bdrv_co_flush_to_disk= qemu_gluster_co_flush_to_disk,
> +.bdrv_known_zeroes= qemu_gluster_known_zeroes,
>  #ifdef CONFIG_GLUSTERFS_DISCARD
>  .bdrv_co_pdiscard = qemu_gluster_co_pdiscard,
>  #endif
> @@ -1657,6 +1676,7 @@ static BlockDriver bdrv_gluster_rdma = {
>  .bdrv_co_readv= qemu_gluster_co_readv,
>  .bdrv_co_writev   = qemu_gluster_co_writev,
>  .bdrv_co_flush_to_disk= qemu_gluster_co_flush_to_disk,
> +.bdrv_known_zeroes= qemu_gluster_known_zeroes,
>  #ifdef CONFIG_GLUSTERFS_DISCARD
>  .bdrv_co_pdiscard = qemu_gluster_co_pdiscard,
>  #endif
> -- 
> 2.24.1
> 
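
For reference, the lseek(SEEK_DATA) probe described in the comment
above boils down to this (plain POSIX sketch rather than the gluster
API; assumes the Linux semantics where ENXIO at offset 0 means the
file has no data extents, i.e. it currently reads as zeroes):

    #define _GNU_SOURCE
    #include <unistd.h>
    #include <errno.h>

    static int reads_as_zeroes(int fd)
    {
        errno = 0;
        /* No data extent at or after offset 0 => entirely a hole. */
        return lseek(fd, 0, SEEK_DATA) < 0 && errno == ENXIO;
    }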




Re: [GEDI] [PATCH 07/17] gluster: Drop useless has_zero_init callback

2020-02-17 Thread Niels de Vos
On Fri, Jan 31, 2020 at 11:44:26AM -0600, Eric Blake wrote:
> block.c already defaults to 0 if we don't provide a callback; there's
> no need to write a callback that always fails.
> 
> Signed-off-by: Eric Blake 

Reviewed-by: Niels de Vos 

> ---
>  block/gluster.c | 14 --
>  1 file changed, 14 deletions(-)
> 
> diff --git a/block/gluster.c b/block/gluster.c
> index 4fa4a77a4777..9d952c70981b 100644
> --- a/block/gluster.c
> +++ b/block/gluster.c
> @@ -1357,12 +1357,6 @@ static int64_t 
> qemu_gluster_allocated_file_size(BlockDriverState *bs)
>  }
>  }
> 
> -static int qemu_gluster_has_zero_init(BlockDriverState *bs)
> -{
> -/* GlusterFS volume could be backed by a block device */
> -return 0;
> -}
> -
>  /*
>   * Find allocation range in @bs around offset @start.
>   * May change underlying file descriptor's file offset.
> @@ -1567,8 +1561,6 @@ static BlockDriver bdrv_gluster = {
>  .bdrv_co_readv= qemu_gluster_co_readv,
>  .bdrv_co_writev   = qemu_gluster_co_writev,
>  .bdrv_co_flush_to_disk= qemu_gluster_co_flush_to_disk,
> -.bdrv_has_zero_init   = qemu_gluster_has_zero_init,
> -.bdrv_has_zero_init_truncate  = qemu_gluster_has_zero_init,
>  #ifdef CONFIG_GLUSTERFS_DISCARD
>  .bdrv_co_pdiscard = qemu_gluster_co_pdiscard,
>  #endif
> @@ -1599,8 +1591,6 @@ static BlockDriver bdrv_gluster_tcp = {
>  .bdrv_co_readv= qemu_gluster_co_readv,
>  .bdrv_co_writev   = qemu_gluster_co_writev,
>  .bdrv_co_flush_to_disk= qemu_gluster_co_flush_to_disk,
> -.bdrv_has_zero_init   = qemu_gluster_has_zero_init,
> -.bdrv_has_zero_init_truncate  = qemu_gluster_has_zero_init,
>  #ifdef CONFIG_GLUSTERFS_DISCARD
>  .bdrv_co_pdiscard = qemu_gluster_co_pdiscard,
>  #endif
> @@ -1631,8 +1621,6 @@ static BlockDriver bdrv_gluster_unix = {
>  .bdrv_co_readv= qemu_gluster_co_readv,
>  .bdrv_co_writev   = qemu_gluster_co_writev,
>  .bdrv_co_flush_to_disk= qemu_gluster_co_flush_to_disk,
> -.bdrv_has_zero_init   = qemu_gluster_has_zero_init,
> -.bdrv_has_zero_init_truncate  = qemu_gluster_has_zero_init,
>  #ifdef CONFIG_GLUSTERFS_DISCARD
>  .bdrv_co_pdiscard = qemu_gluster_co_pdiscard,
>  #endif
> @@ -1669,8 +1657,6 @@ static BlockDriver bdrv_gluster_rdma = {
>  .bdrv_co_readv= qemu_gluster_co_readv,
>  .bdrv_co_writev   = qemu_gluster_co_writev,
>  .bdrv_co_flush_to_disk= qemu_gluster_co_flush_to_disk,
> -.bdrv_has_zero_init   = qemu_gluster_has_zero_init,
> -.bdrv_has_zero_init_truncate  = qemu_gluster_has_zero_init,
>  #ifdef CONFIG_GLUSTERFS_DISCARD
>  .bdrv_co_pdiscard = qemu_gluster_co_pdiscard,
>  #endif
> -- 
> 2.24.1
> 