from:"John Snow"

Re: [Qemu-block] [Qemu-devel] [2.4 PATCH v3 04/19] qmp: Add block-dirty-bitmap-add and block-dirty-bitmap-remove

2015-03-16 Thread John Snow




On 03/16/2015 04:44 PM, Max Reitz wrote:

On 2015-03-13 at 14:30, John Snow wrote:

The new command pair is added to manage a user created dirty bitmap. The
dirty bitmap's name is mandatory and must be unique for the same device,
but different devices can have bitmaps with the same names.

The granularity is an optional field. If it is not specified, we will
choose a default granularity based on the cluster size if available,
clamped to between 4K and 64K to mirror how the 'mirror' code was
already choosing granularity. If we do not have cluster size info
available, we choose 64K. This code has been factored out into a helper
shared with block/mirror.
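
A minimal sketch of the default-granularity rule described above, for illustration only. It is not the helper from this patch: `default_granularity` and the two constants are hypothetical names, and a non-positive cluster size stands in for "no cluster size info available".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bounds taken from the commit message: 4K and 64K. */
#define GRANULARITY_MIN (4 * 1024)
#define GRANULARITY_MAX (64 * 1024)

/*
 * Illustrative only: pick a bitmap granularity from the image's cluster
 * size. cluster_size <= 0 models "no cluster size info available".
 */
static int64_t default_granularity(int64_t cluster_size)
{
    if (cluster_size <= 0) {
        return GRANULARITY_MAX;      /* no info: fall back to 64K */
    }
    if (cluster_size < GRANULARITY_MIN) {
        return GRANULARITY_MIN;      /* clamp small clusters up to 4K */
    }
    if (cluster_size > GRANULARITY_MAX) {
        return GRANULARITY_MAX;      /* clamp large clusters down to 64K */
    }
    return cluster_size;
}
```

With these assumptions, a 512-byte cluster would map to 4K and a 2M cluster to 64K.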

This patch also introduces the 'block_dirty_bitmap_lookup' helper,
which takes a device name and a dirty bitmap name and validates the
lookup, returning NULL and setting errp if there is a problem with
either field. This helper will be re-used in future patches in this
series.
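
The validate-then-look-up behaviour described above can be sketched with a toy table. This is an assumption-laden model, not QEMU code: `ToyBitmapEntry`, `toy_bitmaps` and `toy_lookup` are hypothetical, and a plain `const char **` stands in for `Error **errp`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a (device, bitmap name) pair; not a QEMU type. */
typedef struct ToyBitmapEntry {
    const char *node;
    const char *name;
} ToyBitmapEntry;

static const ToyBitmapEntry toy_bitmaps[] = {
    { "drive0", "bitmap0" },
    { "drive1", "bitmap0" },   /* same name on another device is allowed */
};

/* Returns the entry, or NULL with *err set if either field is bad. */
static const ToyBitmapEntry *toy_lookup(const char *node, const char *name,
                                        const char **err)
{
    int node_seen = 0;

    if (!node) {
        *err = "Node cannot be NULL";
        return NULL;
    }
    if (!name) {
        *err = "Bitmap name cannot be NULL";
        return NULL;
    }
    for (size_t i = 0; i < sizeof(toy_bitmaps) / sizeof(toy_bitmaps[0]); i++) {
        if (strcmp(toy_bitmaps[i].node, node) != 0) {
            continue;
        }
        node_seen = 1;
        if (strcmp(toy_bitmaps[i].name, name) == 0) {
            return &toy_bitmaps[i];
        }
    }
    *err = node_seen ? "Dirty bitmap not found" : "Node not found";
    return NULL;
}
```

The point of the single helper is that every caller gets the same error for the same mistake, whichever field was wrong.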

The types added to block-core.json will be re-used in future patches
in this series, see:
'qapi: Add transaction support to block-dirty-bitmap-{add, enable,
disable}'

Signed-off-by: John Snow js...@redhat.com
---
  block.c   |  20 ++
  block/mirror.c|  10 +
  blockdev.c| 102
++
  include/block/block.h |   1 +
  qapi/block-core.json  |  55 +++
  qmp-commands.hx   |  56 +++
  6 files changed, 235 insertions(+), 9 deletions(-)



[snip]


diff --git a/blockdev.c b/blockdev.c
index b9c1c0c..b8455b9 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1161,6 +1161,53 @@ out_aio_context:
  return NULL;
  }
+/**
+ * Return a dirty bitmap (if present), after validating
+ * the node reference and bitmap names. Returns NULL on error,
+ * including when the BDS and/or bitmap is not found.
+ */
+static BdrvDirtyBitmap *block_dirty_bitmap_lookup(const char *node,
+                                                  const char *name,
+                                                  BlockDriverState **pbs,
+                                                  AioContext **paio,
+                                                  Error **errp)
+{
+    BlockDriverState *bs;
+    BdrvDirtyBitmap *bitmap;
+    AioContext *aio_context;
+
+    if (!node) {
+        error_setg(errp, "Node cannot be NULL");
+        return NULL;
+    }
+    if (!name) {
+        error_setg(errp, "Bitmap name cannot be NULL");
+        return NULL;
+    }
+    bs = bdrv_lookup_bs(node, node, NULL);
+    if (!bs) {
+        error_setg(errp, "Node '%s' not found", node);
+        return NULL;
+    }
+
+    /* If caller provided a BDS*, provide the result of that lookup, too. */
+    if (pbs) {
+        assert(paio);
+        aio_context = bdrv_get_aio_context(bs);
+        aio_context_acquire(aio_context);
+        *pbs = bs;
+        *paio = aio_context;


General question (because I'm too lazy to look up, or find out where to
look it up in the first place): Do you maybe want to acquire the AIO
context always before bdrv_find_dirty_bitmap(), even if paio == pbs ==
NULL?



Somewhat a leftover from an earlier revision when not every caller 
actually cared to receive the BDS for a bitmap lookup -- There is the 
assumption that maybe certain callers don't care, already know the BDS, etc.


In these cases maybe the lock isn't important because they already have 
a lock from acquiring the BDS.


Impossible to say, anyway, since nobody uses the function as such right 
now, so it might be just as good to remove the optional-ness of these 
parameters for now.



+    }
+
+    bitmap = bdrv_find_dirty_bitmap(bs, name);
+    if (!bitmap) {
+        error_setg(errp, "Dirty bitmap '%s' not found", name);


I'd propose an aio_context_release(aio_context); *paio = NULL; *pbs =
NULL; here. Makes error handling easier.


+        return NULL;
+    }
+
+    return bitmap;
+}
+
  /* New and old BlockDriverState structs for atomic group operations */
  typedef struct BlkTransactionState BlkTransactionState;
@@ -1941,6 +1988,61 @@ void qmp_block_set_io_throttle(const char
*device, int64_t bps, int64_t bps_rd,
  aio_context_release(aio_context);
  }
+void qmp_block_dirty_bitmap_add(const char *node, const char *name,
+                                bool has_granularity, uint32_t granularity,
+                                Error **errp)
+{
+    AioContext *aio_context;
+    BlockDriverState *bs;
+
+    if (!name || name[0] == '\0') {
+        error_setg(errp, "Bitmap name cannot be empty");
+        return;
+    }
+
+    bs = bdrv_lookup_bs(node, node, errp);
+    if (!bs) {
+        return;
+    }
+
+    aio_context = bdrv_get_aio_context(bs);
+    aio_context_acquire(aio_context);
+
+    if (has_granularity) {
+        if (granularity < 512 || !is_power_of_2(granularity)) {
+            error_setg(errp, "Granularity must be power of 2 "
+                             "and at least 512");
+            goto out

Re: [Qemu-block] [Qemu-devel] [PATCH] fdc: remove sparc sun4m mutations

2015-03-16 Thread John Snow




On 03/14/2015 12:50 PM, Hervé Poussineau wrote:

They were introduced in 6f7e9aec5eb5bdfa57a9e458e391b785c283a007 and
82407d1a4035e5bfefb53ffdcb270872f813b34c and lots of bug fixes were done after 
that.

This fixes (at least) the detection of the floppy controller on Debian 
4.0r9/SPARC,
and SS-5's OBP initialization routine still works.



Removing workaround code from six years ago in a device we hardly touch 
seems sane to me if it doesn't appear to break the machine it was 
originally architected for (SS-5, from 82407d1a's commit message), but I 
am not well versed in SPARC configurations, unfortunately for us :)


It appears this quirk is active for a wide number of machine 
configurations (basically all that appear under sun4m_machine_init) -- 
What's the risk of us breaking one of those configurations?


How did you test SS-5? (Can we test the others similarly? Is there a 
justification for not doing so?)


Thanks,
--js


Signed-off-by: Hervé Poussineau hpous...@reactos.org
---
  hw/block/fdc.c |   17 -
  1 file changed, 17 deletions(-)

diff --git a/hw/block/fdc.c b/hw/block/fdc.c
index 2bf87c9..f72a392 100644
--- a/hw/block/fdc.c
+++ b/hw/block/fdc.c
@@ -535,8 +535,6 @@ struct FDCtrl {
      uint8_t pwrd;
      /* Floppy drives */
      uint8_t num_floppies;
-    /* Sun4m quirks? */
-    int sun4m;
      FDrive drives[MAX_FD];
      int reset_sensei;
      uint32_t check_media_rate;
@@ -885,13 +883,6 @@ static void fdctrl_reset_irq(FDCtrl *fdctrl)

  static void fdctrl_raise_irq(FDCtrl *fdctrl)
  {
-    /* Sparc mutation */
-    if (fdctrl->sun4m && (fdctrl->msr & FD_MSR_CMDBUSY)) {
-        /* XXX: not sure */
-        fdctrl->msr &= ~FD_MSR_CMDBUSY;
-        fdctrl->msr |= FD_MSR_RQM | FD_MSR_DIO;
-        return;
-    }
      if (!(fdctrl->sra & FD_SRA_INTPEND)) {
          qemu_set_irq(fdctrl->irq, 1);
          fdctrl->sra |= FD_SRA_INTPEND;
@@ -1080,12 +1071,6 @@ static uint32_t fdctrl_read_main_status(FDCtrl *fdctrl)
      fdctrl->dsr &= ~FD_DSR_PWRDOWN;
      fdctrl->dor |= FD_DOR_nRESET;

-    /* Sparc mutation */
-    if (fdctrl->sun4m) {
-        retval |= FD_MSR_DIO;
-        fdctrl_reset_irq(fdctrl);
-    };
-
      FLOPPY_DPRINTF("main status register: 0x%02x\n", retval);

      return retval;
@@ -2241,8 +2226,6 @@ static void sun4m_fdc_initfn(Object *obj)
      FDCtrlSysBus *sys = SYSBUS_FDC(obj);
      FDCtrl *fdctrl = &sys->state;

-    fdctrl->sun4m = 1;
-
      memory_region_init_io(&fdctrl->iomem, obj, &fdctrl_mem_strict_ops,
                            fdctrl, "fdctrl", 0x08);
      sysbus_init_mmio(sbd, &fdctrl->iomem);

Re: [Qemu-block] [Qemu-devel] [PATCH 11/11] iotests: 124 - transactional failure test

2015-03-17 Thread John Snow




On 03/17/2015 04:59 PM, Max Reitz wrote:

On 2015-03-04 at 23:15, John Snow wrote:

Use a transaction to request an incremental backup across two drives.
Coerce one of the jobs to fail, and then re-run the transaction.

Verify that no bitmap data was lost due to the partial transaction
failure.

Signed-off-by: John Snow js...@redhat.com
---
  tests/qemu-iotests/124 | 119
+
  tests/qemu-iotests/124.out |   4 +-
  2 files changed, 121 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/124 b/tests/qemu-iotests/124
index 4afdca1..48571a5 100644
--- a/tests/qemu-iotests/124
+++ b/tests/qemu-iotests/124
@@ -331,6 +331,125 @@ class TestIncrementalBackup(iotests.QMPTestCase):
          self.create_incremental()
+    def test_transaction_failure(self):
+        '''Test: Verify backups made from a transaction that partially fails.
+
+        Add a second drive with its own unique pattern, and add a bitmap to each
+        drive. Use blkdebug to interfere with the backup on just one drive and
+        attempt to create a coherent incremental backup across both drives.
+
+        verify a failure in one but not both, then delete the failed stubs and
+        re-run the same transaction.
+
+        verify that both incrementals are created successfully.
+        '''
+
+        # Create a second drive, with pattern:
+        drive1 = self.add_node('drive1')
+        self.img_create(drive1['file'], drive1['fmt'])
+        io_write_patterns(drive1['file'], (('0x14', 0, 512),
+                                           ('0x5d', '1M', '32k'),
+                                           ('0xcd', '32M', '124k')))
+
+        # Create a blkdebug interface to this img as 'drive1'
+        result = self.vm.qmp('blockdev-add', options={
+            'id': drive1['id'],
+            'driver': drive1['fmt'],
+            'file': {
+                'driver': 'blkdebug',
+                'image': {
+                    'driver': 'file',
+                    'filename': drive1['file']
+                },
+                'set-state': [{
+                    'event': 'flush_to_disk',
+                    'state': 1,
+                    'new_state': 2
+                }],
+                'inject-error': [{
+                    'event': 'read_aio',
+                    'errno': 5,
+                    'state': 2,
+                    'immediately': False,
+                    'once': True
+                }],
+            }
+        })
+        self.assert_qmp(result, 'return', {})
+
+        # Create bitmaps and full backups for both drives
+        drive0 = self.drives[0]
+        dr0bm0 = self.add_bitmap('bitmap0', drive0)
+        dr1bm0 = self.add_bitmap('bitmap0', drive1)
+        self.create_full_backup(drive0)
+        self.create_full_backup(drive1)
+        self.assert_no_active_block_jobs()
+        self.assertFalse(self.vm.get_qmp_events(wait=False))
+
+        # Emulate some writes
+        self.hmp_io_writes(drive0['id'], (('0xab', 0, 512),
+                                          ('0xfe', '16M', '256k'),
+                                          ('0x64', '32736k', '64k')))
+        self.hmp_io_writes(drive1['id'], (('0xba', 0, 512),
+                                          ('0xef', '16M', '256k'),
+                                          ('0x46', '32736k', '64k')))
+
+        # Create incremental backup targets
+        target0 = self.prepare_backup(dr0bm0)
+        target1 = self.prepare_backup(dr1bm0)
+
+        # Ask for a new incremental backup per-each drive,
+        # expecting drive1's backup to fail:
+        transaction = [
+            {
+                'type': 'drive-backup',
+                'data': { 'device': drive0['id'],
+                          'sync': 'dirty-bitmap',
+                          'format': drive0['fmt'],
+                          'target': target0,
+                          'mode': 'existing',
+                          'bitmap': dr0bm0.name },
+            },
+            {
+                'type': 'drive-backup',
+                'data': { 'device': drive1['id'],
+                          'sync': 'dirty-bitmap',
+                          'format': drive1['fmt'],
+                          'target': target1,
+                          'mode': 'existing',
+                          'bitmap': dr1bm0.name }
+            }
+        ]
+        result = self.vm.qmp('transaction', actions=transaction)
+        self.assert_qmp(result, 'return', {})
+
+        # Observe that drive0's backup completes, but drive1's does not.
+        # Consume drive1's error and ensure all pending actions are completed.
+        self.wait_incremental(dr0bm0, validate=True)
+        self.wait_incremental(dr1bm0, validate=False)
+        error = self.vm.event_wait('BLOCK_JOB_ERROR')
+        self.assert_qmp(error, 'data', {'device': drive1['id'],
+                                        'action': 'report

Re: [Qemu-block] [Qemu-devel] [2.4 PATCH v3 15/19] block: Resize bitmaps on bdrv_truncate

2015-03-17 Thread John Snow




On 03/17/2015 09:50 AM, Max Reitz wrote:

On 2015-03-13 at 14:30, John Snow wrote:

Signed-off-by: John Snow js...@redhat.com
---
  block.c| 18 +
  include/qemu/hbitmap.h | 10 ++
  util/hbitmap.c | 52
++
  3 files changed, 80 insertions(+)

diff --git a/block.c b/block.c
index 1eee394..f40b014 100644
--- a/block.c
+++ b/block.c
@@ -113,6 +113,7 @@ static void bdrv_set_dirty(BlockDriverState *bs, int64_t cur_sector,
                             int nr_sectors);
  static void bdrv_reset_dirty(BlockDriverState *bs, int64_t cur_sector,
                               int nr_sectors);
+static void bdrv_dirty_bitmap_truncate(BlockDriverState *bs);
  /* If non-zero, use only whitelisted block drivers */
  static int use_bdrv_whitelist;
@@ -3543,6 +3544,7 @@ int bdrv_truncate(BlockDriverState *bs, int64_t offset)
      ret = drv->bdrv_truncate(bs, offset);
      if (ret == 0) {
          ret = refresh_total_sectors(bs, offset >> BDRV_SECTOR_BITS);
+        bdrv_dirty_bitmap_truncate(bs);
          if (bs->blk) {
              blk_dev_resize_cb(bs->blk);
          }
@@ -5562,6 +5564,22 @@ BdrvDirtyBitmap *bdrv_reclaim_dirty_bitmap(BlockDriverState *bs,
      return parent;
  }
+/**
+ * Truncates _all_ bitmaps attached to a BDS.
+ */
+static void bdrv_dirty_bitmap_truncate(BlockDriverState *bs)
+{
+    BdrvDirtyBitmap *bitmap;
+    uint64_t size = bdrv_nb_sectors(bs);
+
+    QLIST_FOREACH(bitmap, &bs->dirty_bitmaps, list) {
+        if (bdrv_dirty_bitmap_frozen(bitmap)) {
+            continue;
+        }
+        hbitmap_truncate(bitmap->bitmap, size);
+    }
+}
+
  void bdrv_release_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap)
  {
      BdrvDirtyBitmap *bm, *next;
diff --git a/include/qemu/hbitmap.h b/include/qemu/hbitmap.h
index c19c1cb..a75157e 100644
--- a/include/qemu/hbitmap.h
+++ b/include/qemu/hbitmap.h
@@ -65,6 +65,16 @@ struct HBitmapIter {
  HBitmap *hbitmap_alloc(uint64_t size, int granularity);
  /**
+ * hbitmap_truncate:
+ * @hb: The bitmap to change the size of.
+ * @size: The number of elements to change the bitmap to accommodate.
+ *
+ * truncate or grow an existing bitmap to accommodate a new number of elements.
+ * This may invalidate existing HBitmapIterators.
+ */
+void hbitmap_truncate(HBitmap *hb, uint64_t size);
+
+/**
   * hbitmap_merge:
   * @a: The bitmap to store the result in.
   * @b: The bitmap to merge into @a.
diff --git a/util/hbitmap.c b/util/hbitmap.c
index ba11fd3..4505ef7 100644
--- a/util/hbitmap.c
+++ b/util/hbitmap.c
@@ -400,6 +400,58 @@ HBitmap *hbitmap_alloc(uint64_t size, int
granularity)
  return hb;
  }
+void hbitmap_truncate(HBitmap *hb, uint64_t size)
+{
+    bool shrink;
+    unsigned i;
+    uint64_t num_elements = size;
+    uint64_t old;
+
+    /* Size comes in as logical elements, adjust for granularity. */
+    size = (size + (1ULL << hb->granularity) - 1) >> hb->granularity;
+    assert(size <= ((uint64_t)1 << HBITMAP_LOG_MAX_SIZE));
+    shrink = size < hb->size;
+
+    /* bit sizes are identical; nothing to do. */
+    if (size == hb->size) {
+        return;
+    }
+
+    /* If we're losing bits, let's clear those bits before we invalidate all of
+     * our invariants. This helps keep the bitcount consistent, and will prevent
+     * us from carrying around garbage bits beyond the end of the map.
+     *
+     * Because clearing bits past the end of map might reset bits we care about
+     * within the array, record the current value of the last bit we're keeping.
+     */
+    if (shrink) {
+        bool set = hbitmap_get(hb, num_elements - 1);
+        uint64_t fix_count = (hb->size << hb->granularity) - num_elements;
+
+        assert(fix_count);
+        hbitmap_reset(hb, num_elements, fix_count);
+        if (set) {
+            hbitmap_set(hb, num_elements - 1, 1);
+        }
+    }
+
+    hb->size = size;
+    for (i = HBITMAP_LEVELS; i-- > 0; ) {
+        size = MAX(BITS_TO_LONGS(size), 1);


Shouldn't this be size = MAX(BITS_TO_LONGS(size) >> BITS_PER_LEVEL, 1);?



I don't think so;

BITS_TO_LONGS(X) replaces the original construct:
(size + BITS_PER_LONG - 1) >> BITS_PER_LEVEL
which takes a size, adds 31|63 and then divides by 32|64.

BITS_TO_LONGS performs DIV_ROUND_UP(nr, 32|64), which will do 
effectively the same thing. (Actually, a little less efficiently, but I 
found this macro was nicer to read.)
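
A small sketch of the equivalence claimed above, using local stand-ins for the macros rather than QEMU's headers. `longs_by_shift` and `longs_by_div` are hypothetical names for the two forms; the macro definitions here are assumptions that mirror the description in this mail.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Local stand-ins for the macros discussed above (not QEMU's headers). */
#define BITS_PER_LONG   ((uint64_t)(sizeof(unsigned long) * CHAR_BIT))
#define BITS_PER_LEVEL  (BITS_PER_LONG == 32 ? 5 : 6)
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* The original construct: add 31|63, then shift right by 5|6. */
static uint64_t longs_by_shift(uint64_t nbits)
{
    return (nbits + BITS_PER_LONG - 1) >> BITS_PER_LEVEL;
}

/* The BITS_TO_LONGS-style form: round-up division by 32|64. */
static uint64_t longs_by_div(uint64_t nbits)
{
    return DIV_ROUND_UP(nbits, BITS_PER_LONG);
}
```

Both forms already divide by the word size, which is why an additional shift by BITS_PER_LEVEL would divide twice.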



+        if (hb->sizes[i] == size) {
+            break;
+        }
+        old = hb->sizes[i];
+        hb->sizes[i] = size;
+        hb->levels[i] = g_realloc(hb->levels[i], size * sizeof(unsigned long));


Any specific reason you got rid of the g_realloc_n()?



Not available in glib 2.12 (or 2.22.)


Apart from these, the changes to v2 look good.

Max


+        if (!shrink) {
+            memset(&hb->levels[i][old], 0x00,
+                   (size - old) * sizeof(*hb->levels[i]));
+        }
+    }
+}
+
+
  /**
   * Given HBitmaps A and B, let A := A (BITOR) B

Re: [Qemu-block] [Qemu-devel] [2.4 PATCH v3 16/19] hbitmap: truncate tests

2015-03-17 Thread John Snow




On 03/17/2015 10:53 AM, Max Reitz wrote:

On 2015-03-13 at 14:30, John Snow wrote:

The general approach is to set bits close to the boundaries of
where we are truncating and ensure that everything appears to
have gone OK.

We test growing and shrinking by different amounts:
- Less than the granularity
- Less than the granularity, but across a boundary
- Less than sizeof(unsigned long)
- Less than sizeof(unsigned long), but across a ulong boundary
- More than sizeof(unsigned long)

Signed-off-by: John Snow js...@redhat.com
---
  tests/test-hbitmap.c | 247
+++
  1 file changed, 247 insertions(+)

diff --git a/tests/test-hbitmap.c b/tests/test-hbitmap.c
index 8c902f2..65401ab 100644
--- a/tests/test-hbitmap.c
+++ b/tests/test-hbitmap.c
@@ -11,6 +11,8 @@
  #include <glib.h>
  #include <stdarg.h>
+#include <string.h>
+#include <sys/types.h>
  #include "qemu/hbitmap.h"
  #define LOG_BITS_PER_LONG  (BITS_PER_LONG == 32 ? 5 : 6)
@@ -23,6 +25,7 @@ typedef struct TestHBitmapData {
      HBitmap       *hb;
      unsigned long *bits;
      size_t         size;
+    size_t         old_size;
      int            granularity;
  } TestHBitmapData;
@@ -91,6 +94,44 @@ static void hbitmap_test_init(TestHBitmapData *data,
      }
  }
+static inline size_t hbitmap_test_array_size(size_t bits)
+{
+    size_t n = (bits + BITS_PER_LONG - 1) / BITS_PER_LONG;
+    return n ? n : 1;
+}
+
+static void hbitmap_test_truncate_impl(TestHBitmapData *data,
+                                       size_t size)
+{
+    size_t n;
+    size_t m;
+    data->old_size = data->size;
+    data->size = size;
+
+    if (data->size == data->old_size) {
+        return;
+    }
+
+    n = hbitmap_test_array_size(size);
+    m = hbitmap_test_array_size(data->old_size);
+    data->bits = g_realloc(data->bits, sizeof(unsigned long) * n);
+    if (n > m) {
+        memset(&data->bits[m], 0x00, sizeof(unsigned long) * (n - m));
+    }
+
+    /* If we shrink to an uneven multiple of sizeof(unsigned long),
+     * scrub the leftover memory. */
+    if (data->size < data->old_size) {
+        m = size % (sizeof(unsigned long) * 8);
+        if (m) {
+            unsigned long mask = (1ULL << m) - 1;
+            data->bits[n-1] &= mask;
+        }
+    }
+
+    hbitmap_truncate(data->hb, size);
+}
+
  static void hbitmap_test_teardown(TestHBitmapData *data,
                                    const void *unused)
  {
@@ -369,6 +410,190 @@ static void test_hbitmap_iter_granularity(TestHBitmapData *data,
      g_assert_cmpint(hbitmap_iter_next(&hbi), <, 0);
  }
+static void hbitmap_test_set_boundary_bits(TestHBitmapData *data, ssize_t diff)
+{
+    size_t size = data->size;
+
+    /* First bit */
+    hbitmap_test_set(data, 0, 1);
+    if (diff < 0) {
+        /* Last bit in new, shortened map */
+        hbitmap_test_set(data, size + diff - 1, 1);
+
+        /* First bit to be truncated away */
+        hbitmap_test_set(data, size + diff, 1);
+    }
+    /* Last bit */
+    hbitmap_test_set(data, size - 1, 1);
+    if (data->granularity == 0) {
+        hbitmap_test_check_get(data);
+    }
+}
+
+static void hbitmap_test_check_boundary_bits(TestHBitmapData *data)
+{
+    size_t size = MIN(data->size, data->old_size);
+
+    if (data->granularity == 0) {
+        hbitmap_test_check_get(data);
+        hbitmap_test_check(data, 0);
+    } else {
+        g_assert(hbitmap_get(data->hb, 0));
+        g_assert(hbitmap_get(data->hb, size - 1));
+        g_assert_cmpint(2 << data->granularity, ==, hbitmap_count(data->hb));


Hm, where does this come from?



I assume you are referring to specifically the population count. On both 
grow and shrink operations, we should be left with only two 
real/physical bits set: the first and either the last or the formerly 
last bit in the bitmap.


For shrink operations, we truncate off two extra bits that exist within 
the now 'dead space.', leaving us with two.


For grow operations, we add empty space, leaving the first and formerly 
last bit set. (This is the MIN() call above.)


In both cases, we should have two real bits left. Adjusting for 
granularity (g=1 in my tests, here, when used) we should always find 
four virtual bits set.


Confusingly, this even happens when the bitmap ends or is truncated on a 
virtual granularity boundary: e.g. a bitmap of 3 bits with a granularity 
of g=1 (2^1 = 2 bits). Setting the 3rd bit will set two virtual bits, 
giving us a popcount of 2, even though one of those bits is a phantom.


The boundary bits that I am checking here are set in 
test_set_boundary_bits, and are not checked explicitly for g=0 cases 
where we can rely on the shadow data that Paolo keeps track of. For g=1 
cases, I check manually.


The implication here is that test_check_boundary_bits is only expected 
to avoid an assertion if it is called after test_set_boundary_bits 
and, in the shrinking case, a truncate operation.
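
To make the "virtual bits" arithmetic above concrete, here is a toy model. It is a sketch under stated assumptions, not the HBitmap implementation: `ToyBitmap`, `toy_set` and `toy_count` are hypothetical, the model holds at most 64 physical bits, and the count is reported in virtual (item) units, i.e. physical popcount shifted left by the granularity.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model, not QEMU's HBitmap: at most 64 physical bits. */
typedef struct ToyBitmap {
    uint64_t phys;       /* one physical bit covers 2^granularity items */
    int granularity;
} ToyBitmap;

/* Mark one item dirty; this sets the physical bit that covers it. */
static void toy_set(ToyBitmap *b, uint64_t item)
{
    b->phys |= UINT64_C(1) << (item >> b->granularity);
}

/* Count in virtual (item) units: physical popcount << granularity. */
static uint64_t toy_count(const ToyBitmap *b)
{
    uint64_t n = 0;

    for (uint64_t p = b->phys; p; p &= p - 1) {
        n++;   /* each iteration clears the lowest set bit */
    }
    return n << b->granularity;
}
```

With g=1, setting only the 3rd item (index 2) of a 3-item map gives a count of 2, and setting the first item as well gives 4, which matches the `2 << granularity` expectation in the test.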



+}
+}
+
+/* Generic truncate test. */
+static void hbitmap_test_truncate

[Qemu-block] [PATCH for-2.3 1/4] ide: fix cmd_write_pio when nsectors > 1

2015-03-19 Thread John Snow

We need to adjust the sector being written to
prior to calling ide_transfer_start, otherwise
we'll write to the same sector again.

Signed-off-by: John Snow js...@redhat.com
---
 hw/ide/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/ide/core.c b/hw/ide/core.c
index ef52f35..0e9da64 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -846,6 +846,7 @@ static void ide_sector_write_cb(void *opaque, int ret)
     s->nsector -= n;
     s->io_buffer_offset += 512 * n;
 
+    ide_set_sector(s, ide_get_sector(s) + n);
     if (s->nsector == 0) {
         /* no more sectors to write */
         ide_transfer_stop(s);
@@ -857,7 +858,6 @@ static void ide_sector_write_cb(void *opaque, int ret)
         ide_transfer_start(s, s->io_buffer, n1 * BDRV_SECTOR_SIZE,
                            ide_sector_write);
     }
-    ide_set_sector(s, ide_get_sector(s) + n);
 
     if (win2k_install_hack && ((++s->irq_count % 16) == 0)) {
         /* It seems there is a bug in the Windows 2000 installer HDD
-- 
2.1.0

[Qemu-block] [PATCH for-2.3 4/4] ahci-test: improve rw buffer patterns

2015-03-19 Thread John Snow

My pattern was cyclical every 256 bytes, so it missed a fairly obvious
failure case. Add some rand() pepper into the test pattern, and for large
patterns that exceed 256 sectors, start writing an ID per-sector so that
we never generate identical sector patterns.
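
A short sketch of why the old fill was weak and what the per-sector ID buys us. This is an illustration only, not the code in this patch: `fill_old`, `stamp_ids` and `sectors_equal` are hypothetical helpers, `SECTOR_SIZE` is assumed to be 512, and the ID is written with memcpy() instead of a pointer cast to sidestep alignment concerns.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SECTOR_SIZE 512   /* assumed sector size for this sketch */

/* The old fill: the byte value wraps every 256 bytes. */
static void fill_old(unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        buf[i] = (unsigned char)(len - i);
    }
}

/* Stamp each sector's index into its first bytes so no two match. */
static void stamp_ids(unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len / SECTOR_SIZE; i++) {
        memcpy(&buf[i * SECTOR_SIZE], &i, sizeof(i));
    }
}

/* Return 1 if sectors a and b of buf hold identical bytes. */
static int sectors_equal(const unsigned char *buf, size_t a, size_t b)
{
    return memcmp(&buf[a * SECTOR_SIZE], &buf[b * SECTOR_SIZE],
                  SECTOR_SIZE) == 0;
}
```

Because 512 is a multiple of 256, every sector produced by the old fill is byte-for-byte identical, so a read that returns the wrong sector goes unnoticed. Stamping an index makes each sector distinguishable.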

Signed-off-by: John Snow js...@redhat.com
---
 tests/ahci-test.c | 36 
 1 file changed, 28 insertions(+), 8 deletions(-)

diff --git a/tests/ahci-test.c b/tests/ahci-test.c
index 169e83b..ea62e24 100644
--- a/tests/ahci-test.c
+++ b/tests/ahci-test.c
@@ -68,6 +68,32 @@ static void string_bswap16(uint16_t *s, size_t bytes)
 }
 }
 
+static void generate_pattern(void *buffer, size_t len, size_t cycle_len)
+{
+    int i, j;
+    unsigned char *tx = (unsigned char *)buffer;
+    unsigned char p;
+    size_t *sx;
+
+    /* Write an indicative pattern that varies and is unique per-cycle */
+    p = rand() % 256;
+    for (i = j = 0; i < len; i++, j++) {
+        tx[i] = p;
+        if (j % cycle_len == 0) {
+            p = rand() % 256;
+        }
+    }
+
+    /* force uniqueness by writing an id per-cycle */
+    for (i = 0; i < len / cycle_len; i++) {
+        j = i * cycle_len;
+        if (j + sizeof(*sx) <= len) {
+            sx = (size_t *)&tx[j];
+            *sx = i;
+        }
+    }
+}
+
 /*** Test Setup & Teardown ***/
 
 /**
@@ -736,7 +762,6 @@ static void ahci_test_io_rw_simple(AHCIQState *ahci, 
unsigned bufsize,
 {
 uint64_t ptr;
 uint8_t port;
-unsigned i;
 unsigned char *tx = g_malloc(bufsize);
 unsigned char *rx = g_malloc0(bufsize);
 
@@ -752,9 +777,7 @@ static void ahci_test_io_rw_simple(AHCIQState *ahci, 
unsigned bufsize,
 g_assert(ptr);
 
     /* Write some indicative pattern to our buffer. */
-    for (i = 0; i < bufsize; i++) {
-        tx[i] = (bufsize - i);
-    }
+    generate_pattern(tx, bufsize, AHCI_SECTOR_SIZE);
     memwrite(ptr, tx, bufsize);
 
 /* Write this buffer to disk, then read it back to the DMA buffer. */
@@ -865,7 +888,6 @@ static void test_dma_fragmented(void)
 size_t bufsize = 4096;
 unsigned char *tx = g_malloc(bufsize);
 unsigned char *rx = g_malloc0(bufsize);
-unsigned i;
 uint64_t ptr;
 
 ahci = ahci_boot_and_enable();
@@ -873,9 +895,7 @@ static void test_dma_fragmented(void)
 ahci_port_clear(ahci, px);
 
     /* create pattern */
-    for (i = 0; i < bufsize; i++) {
-        tx[i] = (bufsize - i);
-    }
+    generate_pattern(tx, bufsize, AHCI_SECTOR_SIZE);
 
     /* Create a DMA buffer in guest memory, and write our pattern to it. */
     ptr = guest_alloc(ahci->parent->alloc, bufsize);
-- 
2.1.0

Re: [Qemu-block] [Qemu-devel] [PATCH v2 0/2] ahci: test varying sector offsets

2015-03-24 Thread John Snow

Ping: I'll pull both this series and the 'ahci: rerror/werror=stop 
resume tests' series into my ide-next branch if just these two patches 
get a re-review.


They were excised from a pullreq due to glib compatibility issues, so 
the following series has no changes.


--js

On 03/13/2015 03:22 PM, John Snow wrote:

This is a re-send of patches 7 & 8 from an earlier series,
"[PATCH v2 0/8] ahci: add more IO tests", which ultimately got bounced
back because I used some glib functions that were too new.

v2:
- Patchew caught a pathing problem with the qemu-img binary;
   the relative path produced by the Makefile does not prepend
   ./, so I was relying on the /distro's/ qemu-img by accident.
   Fix that by using realpath().

v1:
- Removed ./ from the execution CLI. Now you can set an absolute or
   relative path for QTEST_QEMU_IMG and it will work either way. The default
   as generated by the Makefile will be a relative path.

- Removed the g_spawn_check_exit_status glib call from mkimg(). See the
   in-line comments in patch 1/2 for correctness justification.
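
For readers unfamiliar with the realpath() fix mentioned under v2, a minimal sketch follows. It is not the code from the patch: `absolute_path` is a hypothetical helper, and it assumes a POSIX.1-2008 system where realpath() accepts a NULL buffer and allocates the result.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* for realpath() with a NULL buffer */
#endif
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical helper: turn a possibly-relative path into an absolute one.
 * Returns a malloc()ed string the caller must free(), or NULL on error
 * (for example, if the file does not exist).
 */
static char *absolute_path(const char *path)
{
    return realpath(path, NULL);
}
```

Resolving the path up front means a relative QTEST_QEMU_IMG is looked up relative to the current directory, not through a search of the distro's install locations.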

John Snow (2):
   qtest/ahci: add qcow2 support to ahci-test
   qtest/ahci: test different disk sectors

  tests/Makefile|  1 +
  tests/ahci-test.c | 84 +--
  tests/libqos/ahci.c   | 10 +++---
  tests/libqos/ahci.h   |  4 +--
  tests/libqos/libqos.c | 44 +++
  tests/libqos/libqos.h |  2 ++
  6 files changed, 116 insertions(+), 29 deletions(-)

Re: [Qemu-block] [Qemu-devel] [PATCH v2 0/6] ahci: rerror/werror=stop resume tests

2015-03-25 Thread John Snow



On 03/10/2015 04:14 PM, John Snow wrote:

This series is based on:
[Qemu-devel] [PATCH 0/2] ahci: test varying sector offsets

There appear to be some upstream issues for iotests 051 and 061,
but this series does not appear to alter the existing bad behavior
of those tests.

This patchset brings us up to feature parity with the ide-test that
was checked in for the rerror/werror migration fixes series.

With the expanded functionality of libqos, we test error injection
and error recovery for the AHCI device.

v1 got bounced due to a prerequisite failing a test during a pull req,
so v2 is nearly unchanged:

v2:
  - Rebased to master
  - Fixed an include issue for patch 5.

John Snow (6):
   qtest/ahci: Add simple flush test
   qtest/ahci: Allow override of default CLI options
   libqtest: add qmp_eventwait
   libqtest: add qmp_async
   libqos: add blkdebug_prepare_script
   qtest/ahci: add flush retry test

  tests/ahci-test.c| 143 ---
  tests/ide-test.c |  34 +--
  tests/libqos/libqos-pc.c |   5 ++
  tests/libqos/libqos-pc.h |   1 +
  tests/libqos/libqos.c|  22 
  tests/libqos/libqos.h|   1 +
  tests/libqtest.c |  46 ++-
  tests/libqtest.h |  47 
  8 files changed, 245 insertions(+), 54 deletions(-)



After fixing the series this depends upon after it was booted from a 2.3 
pullreq for glib issues, I am staging this for 2.4.


Thanks!

Re: [Qemu-block] [Qemu-devel] [PATCH RFC for-2.3? 5/8] fdb: Move FDCtrlISABus to header

2015-03-30 Thread John Snow


You probably meant 'fdc' !

On 03/29/2015 01:53 PM, Andreas Färber wrote:

To be used for embedding the device.

Add gtk-doc private/public markers for parent field.

Signed-off-by: Andreas Färber afaer...@suse.de
---
  hw/block/fdc.c | 87 -
  include/hw/block/fdc.h | 88 ++
  2 files changed, 88 insertions(+), 87 deletions(-)

diff --git a/hw/block/fdc.c b/hw/block/fdc.c
index 2bf87c9..da521f1 100644
--- a/hw/block/fdc.c
+++ b/hw/block/fdc.c
@@ -31,7 +31,6 @@
  #include "hw/block/fdc.h"
  #include "qemu/error-report.h"
  #include "qemu/timer.h"
-#include "hw/isa/isa.h"
  #include "hw/sysbus.h"
  #include "sysemu/block-backend.h"
  #include "sysemu/blockdev.h"
@@ -167,33 +166,7 @@ static void pick_geometry(BlockBackend *blk, int *nb_heads,
  #define FD_SECTOR_SC   2   /* Sector size code */
  #define FD_RESET_SENSEI_COUNT  4   /* Number of sense interrupts on RESET */

-typedef struct FDCtrl FDCtrl;
-
  /* Floppy disk drive emulation */
-typedef enum FDiskFlags {
-FDISK_DBL_SIDES  = 0x01,
-} FDiskFlags;
-
-typedef struct FDrive {
-FDCtrl *fdctrl;
-BlockBackend *blk;
-/* Drive status */
-FDriveType drive;
-uint8_t perpendicular;/* 2.88 MB access mode*/
-/* Position */
-uint8_t head;
-uint8_t track;
-uint8_t sect;
-/* Media */
-FDiskFlags flags;
-uint8_t last_sect;/* Nb sector per track*/
-uint8_t max_track;/* Nb of tracks   */
-uint16_t bps; /* Bytes per sector   */
-uint8_t ro;   /* Is read-only   */
-uint8_t media_changed;/* Is media changed   */
-uint8_t media_rate;   /* Data rate of medium*/
-} FDrive;
-
  static void fd_init(FDrive *drv)
  {
  /* Drive */
@@ -498,53 +471,6 @@ enum {
  #define FD_MULTI_TRACK(state) ((state) & FD_STATE_MULTI)
  #define FD_FORMAT_CMD(state) ((state) & FD_STATE_FORMAT)

-struct FDCtrl {
-MemoryRegion iomem;
-qemu_irq irq;
-/* Controller state */
-QEMUTimer *result_timer;
-int dma_chann;
-/* Controller's identification */
-uint8_t version;
-/* HW */
-uint8_t sra;
-uint8_t srb;
-uint8_t dor;
-uint8_t dor_vmstate; /* only used as temp during vmstate */
-uint8_t tdr;
-uint8_t dsr;
-uint8_t msr;
-uint8_t cur_drv;
-uint8_t status0;
-uint8_t status1;
-uint8_t status2;
-/* Command FIFO */
-uint8_t *fifo;
-int32_t fifo_size;
-uint32_t data_pos;
-uint32_t data_len;
-uint8_t data_state;
-uint8_t data_dir;
-uint8_t eot; /* last wanted sector */
-/* States kept only to be returned back */
-/* precompensation */
-uint8_t precomp_trk;
-uint8_t config;
-uint8_t lock;
-/* Power down config (also with status regB access mode */
-uint8_t pwrd;
-/* Floppy drives */
-uint8_t num_floppies;
-/* Sun4m quirks? */
-int sun4m;
-FDrive drives[MAX_FD];
-int reset_sensei;
-uint32_t check_media_rate;
-/* Timers state */
-uint8_t timer0;
-uint8_t timer1;
-};
-
  #define TYPE_SYSBUS_FDC "base-sysbus-fdc"
  #define SYSBUS_FDC(obj) OBJECT_CHECK(FDCtrlSysBus, (obj), TYPE_SYSBUS_FDC)

@@ -556,19 +482,6 @@ typedef struct FDCtrlSysBus {
  struct FDCtrl state;
  } FDCtrlSysBus;

-#define ISA_FDC(obj) OBJECT_CHECK(FDCtrlISABus, (obj), TYPE_ISA_FDC)
-
-typedef struct FDCtrlISABus {
-ISADevice parent_obj;
-
-uint32_t iobase;
-uint32_t irq;
-uint32_t dma;
-struct FDCtrl state;
-int32_t bootindexA;
-int32_t bootindexB;
-} FDCtrlISABus;
-
  static uint32_t fdctrl_read (void *opaque, uint32_t reg)
  {
  FDCtrl *fdctrl = opaque;
diff --git a/include/hw/block/fdc.h b/include/hw/block/fdc.h
index d48b2f8..86d852d 100644
--- a/include/hw/block/fdc.h
+++ b/include/hw/block/fdc.h
@@ -2,6 +2,7 @@
  #define HW_FDC_H

  #include "qemu-common.h"
+#include "hw/isa/isa.h"

  /* fdc.c */
  #define MAX_FD 2
@@ -13,7 +14,94 @@ typedef enum FDriveType {
  FDRIVE_DRV_NONE = 0x03,   /* No drive connected */
  } FDriveType;

+typedef enum FDiskFlags {
+FDISK_DBL_SIDES  = 0x01,
+} FDiskFlags;
+
+typedef struct FDCtrl FDCtrl;
+
+typedef struct FDrive {
+FDCtrl *fdctrl;
+BlockBackend *blk;
+/* Drive status */
+FDriveType drive;
+uint8_t perpendicular;/* 2.88 MB access mode*/
+/* Position */
+uint8_t head;
+uint8_t track;
+uint8_t sect;
+/* Media */
+FDiskFlags flags;
+uint8_t last_sect;/* Nb sector per track*/
+uint8_t max_track;/* Nb of tracks   */
+uint16_t bps; /* Bytes per sector   */
+uint8_t ro;   /* Is read-only   */
+uint8_t media_changed;/* Is media changed   */
+uint8_t media_rate;   /* Data rate of medium*/
+} FDrive;
+
+struct FDCtrl {
+MemoryRegion iomem;
+qemu_irq irq;
+/* Controller state */

[Qemu-block] qemu-img behavior for locating backing files

2015-04-01 Thread John Snow

Kevin, what's the correct behavior for qemu-img and relative paths when 
creating a new qcow2 file?


Example:

(in e.g. /home/qemu/build/ or anywhere not /home: )
qemu-img create -f qcow2 base.qcow2 32G
qemu-img create -f qcow2 -F qcow2 -b base.qcow2 /home/overlay.qcow2

In 1.7.0, this produces a warning that the base object cannot be found 
(because it does not exist at that location relative to overlay.qcow2), 
but qemu-img will create the qcow2 for you regardless.


2.0, 2.1 and 2.2 all will create the image successfully, with no warnings.

2.3-rc1/master as they exist now will emit an error message and create 
no image.


Since this is a change in behavior for the pending release, is this the 
correct/desired behavior?
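
To spell out the lookup rule implied by the example above (a relative backing filename being interpreted relative to the overlay's directory), here is a hedged sketch. It is not qemu-img's implementation: `resolve_backing` is a hypothetical helper that only does string manipulation and never touches the filesystem.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Illustrative only: resolve `backing` relative to the directory that
 * holds `overlay`. Absolute backing paths are returned unchanged.
 * Returns 0 on success, -1 if the result does not fit in `out`.
 */
static int resolve_backing(const char *overlay, const char *backing,
                           char *out, size_t out_len)
{
    const char *slash = strrchr(overlay, '/');
    int dir_len = 0;
    int n;

    if (backing[0] != '/' && slash) {
        dir_len = (int)(slash - overlay + 1);   /* keep the trailing '/' */
    }
    n = snprintf(out, out_len, "%.*s%s", dir_len, overlay, backing);

    return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}
```

Under this rule, the second command in the example would look for /home/base.qcow2, which does not exist when base.qcow2 was created in the build directory.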

[Qemu-block] [PATCH v2 09/11] qmp: Add an implementation wrapper for qmp_drive_backup

2015-03-27 Thread John Snow

We'd like to be able to specify the callback given to backup_start
manually in the case of transactions, so split apart qmp_drive_backup
into an implementation and a wrapper.

Switch drive_backup_prepare to use the new wrapper, but don't overload
the callback and closure yet.
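
The implementation/wrapper split described above follows a common pattern, sketched here in isolation. Everything in this block is hypothetical (`do_job`, `run_job`, `default_cb`, `last_opaque`); it only shows the shape of the idea, with NULL meaning "use the defaults".

```c
#include <assert.h>
#include <stddef.h>

typedef void CompletionFunc(void *opaque, int ret);

static int default_cb_calls;   /* lets a caller observe that the default ran */
static void *last_opaque;      /* the opaque pointer the default was given */

static void default_cb(void *opaque, int ret)
{
    (void)ret;
    default_cb_calls++;
    last_opaque = opaque;
}

/* Implementation: accepts an optional callback/opaque override. */
static void do_job(void *device, CompletionFunc *cb, void *opaque)
{
    if (cb == NULL) {
        cb = default_cb;
    }
    if (opaque == NULL) {
        opaque = device;
    }
    cb(opaque, 0);   /* a real job would call this on completion */
}

/* Public wrapper: keeps the old signature and always uses the defaults. */
static void run_job(void *device)
{
    do_job(device, NULL, NULL);
}
```

Existing callers keep calling the wrapper unchanged, while a transaction can later call the implementation directly with its own callback and closure.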

Signed-off-by: John Snow <js...@redhat.com>
---
 blockdev.c | 65 --
 1 file changed, 46 insertions(+), 19 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index fa954e9..16b2cf7 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1847,15 +1847,18 @@ out:
 aio_context_release(aio_context);
 }
 
-void qmp_drive_backup(const char *device, const char *target,
-  bool has_format, const char *format,
-  enum MirrorSyncMode sync,
-  bool has_mode, enum NewImageMode mode,
-  bool has_speed, int64_t speed,
-  bool has_bitmap, const char *bitmap,
-  bool has_on_source_error, BlockdevOnError on_source_error,
-  bool has_on_target_error, BlockdevOnError on_target_error,
-  Error **errp)
+static void do_drive_backup(const char *device, const char *target,
+bool has_format, const char *format,
+enum MirrorSyncMode sync,
+bool has_mode, enum NewImageMode mode,
+bool has_speed, int64_t speed,
+bool has_bitmap, const char *bitmap,
+bool has_on_source_error,
+BlockdevOnError on_source_error,
+bool has_on_target_error,
+BlockdevOnError on_target_error,
+BlockCompletionFunc *cb, void *opaque,
+Error **errp)
 {
 BlockBackend *blk;
 BlockDriverState *bs;
@@ -1969,9 +1972,16 @@ void qmp_drive_backup(const char *device, const char *target,
 }
 }
 
+/* If we are not supplied with callback override info, use our defaults */
+if (cb == NULL) {
+cb = block_job_cb;
+}
+if (opaque == NULL) {
+opaque = bs;
+}
 backup_start(bs, target_bs, speed, sync, bmap,
  on_source_error, on_target_error,
- block_job_cb, bs, &local_err);
+ cb, opaque, &local_err);
 if (local_err != NULL) {
 bdrv_unref(target_bs);
 error_propagate(errp, local_err);
@@ -1982,6 +1992,22 @@ out:
 aio_context_release(aio_context);
 }
 
+void qmp_drive_backup(const char *device, const char *target,
+  bool has_format, const char *format,
+  enum MirrorSyncMode sync,
+  bool has_mode, enum NewImageMode mode,
+  bool has_speed, int64_t speed,
+  bool has_bitmap, const char *bitmap,
+  bool has_on_source_error, BlockdevOnError on_source_error,
+  bool has_on_target_error, BlockdevOnError on_target_error,
+  Error **errp)
+{
+do_drive_backup(device, target, has_format, format, sync, has_mode, mode,
+has_speed, speed, has_bitmap, bitmap, has_on_source_error,
+on_source_error, has_on_target_error, on_target_error,
+NULL, NULL, errp);
+}
+
 BlockDeviceInfoList *qmp_query_named_block_nodes(Error **errp)
 {
 return bdrv_named_nodes_list();
@@ -3112,15 +3138,16 @@ static void drive_backup_prepare(BlkActionState *common, Error **errp)
 state->aio_context = bdrv_get_aio_context(bs);
 aio_context_acquire(state->aio_context);
 
-qmp_drive_backup(backup->device, backup->target,
- backup->has_format, backup->format,
- backup->sync,
- backup->has_mode, backup->mode,
- backup->has_speed, backup->speed,
- backup->has_bitmap, backup->bitmap,
- backup->has_on_source_error, backup->on_source_error,
- backup->has_on_target_error, backup->on_target_error,
- &local_err);
+do_drive_backup(backup->device, backup->target,
+backup->has_format, backup->format,
+backup->sync,
+backup->has_mode, backup->mode,
+backup->has_speed, backup->speed,
+backup->has_bitmap, backup->bitmap,
+backup->has_on_source_error, backup->on_source_error,
+backup->has_on_target_error, backup->on_target_error,
+NULL, NULL,
+&local_err);
 if (local_err) {
 error_propagate(errp, local_err);
 return;
-- 
2.1.0

[Qemu-block] [PATCH v2 11/11] iotests: 124 - transactional failure test

2015-03-27 Thread John Snow

Use a transaction to request an incremental backup across two drives.
Coerce one of the jobs to fail, and then re-run the transaction.

Verify that no bitmap data was lost due to the partial transaction
failure.

Signed-off-by: John Snow <js...@redhat.com>
---
 tests/qemu-iotests/124 | 119 +
 tests/qemu-iotests/124.out |   4 +-
 2 files changed, 121 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/124 b/tests/qemu-iotests/124
index 31946f9..ad82076 100644
--- a/tests/qemu-iotests/124
+++ b/tests/qemu-iotests/124
@@ -332,6 +332,125 @@ class TestIncrementalBackup(iotests.QMPTestCase):
 self.create_incremental()
 
 
+def test_transaction_failure(self):
+'''Test: Verify backups made from a transaction that partially fails.
+
+Add a second drive with its own unique pattern, and add a bitmap to each
+drive. Use blkdebug to interfere with the backup on just one drive and
+attempt to create a coherent incremental backup across both drives.
+
+verify a failure in one but not both, then delete the failed stubs and
+re-run the same transaction.
+
+verify that both incrementals are created successfully.
+'''
+
+# Create a second drive, with pattern:
+drive1 = self.add_node('drive1')
+self.img_create(drive1['file'], drive1['fmt'])
+io_write_patterns(drive1['file'], (('0x14', 0, 512),
+   ('0x5d', '1M', '32k'),
+   ('0xcd', '32M', '124k')))
+
+# Create a blkdebug interface to this img as 'drive1'
+result = self.vm.qmp('blockdev-add', options={
+'id': drive1['id'],
+'driver': drive1['fmt'],
+'file': {
+'driver': 'blkdebug',
+'image': {
+'driver': 'file',
+'filename': drive1['file']
+},
+'set-state': [{
+'event': 'flush_to_disk',
+'state': 1,
+'new_state': 2
+}],
+'inject-error': [{
+'event': 'read_aio',
+'errno': 5,
+'state': 2,
+'immediately': False,
+'once': True
+}],
+}
+})
+self.assert_qmp(result, 'return', {})
+
+# Create bitmaps and full backups for both drives
+drive0 = self.drives[0]
+dr0bm0 = self.add_bitmap('bitmap0', drive0)
+dr1bm0 = self.add_bitmap('bitmap0', drive1)
+self.create_full_backup(drive0)
+self.create_full_backup(drive1)
+self.assert_no_active_block_jobs()
+self.assertFalse(self.vm.get_qmp_events(wait=False))
+
+# Emulate some writes
+self.hmp_io_writes(drive0['id'], (('0xab', 0, 512),
+  ('0xfe', '16M', '256k'),
+  ('0x64', '32736k', '64k')))
+self.hmp_io_writes(drive1['id'], (('0xba', 0, 512),
+  ('0xef', '16M', '256k'),
+  ('0x46', '32736k', '64k')))
+
+# Create incremental backup targets
+target0 = self.prepare_backup(dr0bm0)
+target1 = self.prepare_backup(dr1bm0)
+
+# Ask for a new incremental backup per-each drive,
+# expecting drive1's backup to fail:
+transaction = [
+{
+'type': 'drive-backup',
+'data': { 'device': drive0['id'],
+  'sync': 'dirty-bitmap',
+  'format': drive0['fmt'],
+  'target': target0,
+  'mode': 'existing',
+  'bitmap': dr0bm0.name },
+},
+{
+'type': 'drive-backup',
+'data': { 'device': drive1['id'],
+  'sync': 'dirty-bitmap',
+  'format': drive1['fmt'],
+  'target': target1,
+  'mode': 'existing',
+  'bitmap': dr1bm0.name }
+}
+]
+result = self.vm.qmp('transaction', actions=transaction)
+self.assert_qmp(result, 'return', {})
+
+# Observe that drive0's backup completes, but drive1's does not.
+# Consume drive1's error and ensure all pending actions are completed.
+self.wait_incremental(dr0bm0, validate=True)
+self.wait_incremental(dr1bm0, validate=False)
+error = self.vm.event_wait('BLOCK_JOB_ERROR')
+self.assert_qmp(error, 'data', {'device': drive1['id'],
+'action': 'report',
+'operation': 'read'})
+self.assertFalse(self.vm.get_qmp_events(wait

[Qemu-block] [PATCH v2 01/11] qapi: Add transaction support to block-dirty-bitmap operations

2015-03-27 Thread John Snow

This adds two qmp commands to transactions.

block-dirty-bitmap-add allows you to create a bitmap simultaneously
alongside a new full backup to accomplish a clean synchronization
point.

block-dirty-bitmap-clear allows you to reset a bitmap back to as-if
it were new, which can also be used alongside a full backup to
accomplish a clean synchronization point.
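
To illustrate how an "add" action can take part in a transaction, here is a toy Python model of the prepare/commit/abort flow. It is only a sketch: the class and function names are made up, a plain set stands in for a node's bitmaps, and the real transaction code differs in detail. The point it shows is that abort undoes only what this action's own prepare step created.

```python
class AddBitmap:
    """Toy model of a block-dirty-bitmap-add transaction action."""

    def __init__(self, bitmaps, name):
        self.bitmaps = bitmaps  # shared set standing in for a node's bitmaps
        self.name = name
        self.prepared = False

    def prepare(self):
        if self.name in self.bitmaps:
            raise ValueError('bitmap already exists: ' + self.name)
        self.bitmaps.add(self.name)
        self.prepared = True

    def commit(self):
        pass  # nothing left to do: prepare() already added the bitmap

    def abort(self):
        # Only remove the bitmap if this action actually created it.
        if self.prepared:
            self.bitmaps.discard(self.name)


def run_transaction(actions):
    """Prepare each action in order; abort all started ones on failure."""
    started = []
    try:
        for action in actions:
            started.append(action)
            action.prepare()
    except ValueError:
        for action in started:
            action.abort()
        return False
    for action in started:
        action.commit()
    return True
```

Without the prepared flag, aborting an action whose prepare step failed because the name was already taken would delete a bitmap that existed before the transaction started.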

Signed-off-by: Fam Zheng <f...@redhat.com>
Signed-off-by: John Snow <js...@redhat.com>
---
 blockdev.c   | 100 +++
 qapi-schema.json |   6 +++-
 2 files changed, 105 insertions(+), 1 deletion(-)

diff --git a/blockdev.c b/blockdev.c
index ab67b4d..d5ea75e 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1692,6 +1692,95 @@ static void blockdev_backup_clean(BlkTransactionState *common)
 }
 }
 
+typedef struct BlockDirtyBitmapState {
+BlkTransactionState common;
+BdrvDirtyBitmap *bitmap;
+BlockDriverState *bs;
+AioContext *aio_context;
+bool prepared;
+} BlockDirtyBitmapState;
+
+static void block_dirty_bitmap_add_prepare(BlkTransactionState *common,
+   Error **errp)
+{
+Error *local_err = NULL;
+BlockDirtyBitmapAdd *action;
+BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
+ common, common);
+
+action = common->action->block_dirty_bitmap_add;
+/* AIO context taken within qmp_block_dirty_bitmap_add */
+qmp_block_dirty_bitmap_add(action->node, action->name,
+   action->has_granularity, action->granularity,
+   &local_err);
+
+if (!local_err) {
+state->prepared = true;
+} else {
+error_propagate(errp, local_err);
+}
+}
+
+static void block_dirty_bitmap_add_abort(BlkTransactionState *common)
+{
+BlockDirtyBitmapAdd *action;
+BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
+ common, common);
+
+action = common->action->block_dirty_bitmap_add;
+/* Should not be able to fail: IF the bitmap was added via .prepare(),
+ * then the node reference and bitmap name must have been valid.
+ */
+if (state->prepared) {
+qmp_block_dirty_bitmap_remove(action->node, action->name, &error_abort);
+}
+}
+
+static void block_dirty_bitmap_clear_prepare(BlkTransactionState *common,
+ Error **errp)
+{
+BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
+ common, common);
+BlockDirtyBitmap *action;
+
+action = common->action->block_dirty_bitmap_clear;
+state->bitmap = block_dirty_bitmap_lookup(action->node,
+  action->name,
+  &state->bs,
+  &state->aio_context,
+  errp);
+if (!state->bitmap) {
+return;
+}
+
+if (bdrv_dirty_bitmap_frozen(state->bitmap)) {
+error_setg(errp, "Cannot modify a frozen bitmap");
+return;
+} else if (!bdrv_dirty_bitmap_enabled(state->bitmap)) {
+error_setg(errp, "Cannot clear a disabled bitmap");
+return;
+}
+
+/* AioContext is released in .clean() */
+}
+
+static void block_dirty_bitmap_clear_commit(BlkTransactionState *common)
+{
+BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
+ common, common);
+bdrv_clear_dirty_bitmap(state->bitmap);
+}
+
+static void block_dirty_bitmap_clear_clean(BlkTransactionState *common)
+{
+BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
+ common, common);
+
+if (state->aio_context) {
+aio_context_release(state->aio_context);
+}
+}
+
 static void abort_prepare(BlkTransactionState *common, Error **errp)
 {
 error_setg(errp, "Transaction aborted using Abort action");
@@ -1732,6 +1821,17 @@ static const BdrvActionOps actions[] = {
 .abort = internal_snapshot_abort,
 .clean = internal_snapshot_clean,
 },
+[TRANSACTION_ACTION_KIND_BLOCK_DIRTY_BITMAP_ADD] = {
+.instance_size = sizeof(BlockDirtyBitmapState),
+.prepare = block_dirty_bitmap_add_prepare,
+.abort = block_dirty_bitmap_add_abort,
+},
+[TRANSACTION_ACTION_KIND_BLOCK_DIRTY_BITMAP_CLEAR] = {
+.instance_size = sizeof(BlockDirtyBitmapState),
+.prepare = block_dirty_bitmap_clear_prepare,
+.commit = block_dirty_bitmap_clear_commit,
+.clean = block_dirty_bitmap_clear_clean,
+}
 };
 
 /*
diff --git a/qapi-schema.json b/qapi-schema.json
index ac9594d..f6fe2b3 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1356,6 +1356,8 @@
 # abort since 1.6
 # blockdev-snapshot-internal-sync since 1.7
 # blockdev-backup since 2.3
+# block-dirty-bitmap-add

[Qemu-block] [PATCH v2 00/11] block: incremental backup transactions

2015-03-27 Thread John Snow

requires: 1426879023-18151-1-git-send-email-js...@redhat.com
  [PATCH v4 00/20] block: transactionless incremental backup series

This series adds support for incremental backup primitives in QMP
transactions. It requires my transactionless incremental backup series,
currently at v4.

Patch 1 adds basic support for add and clear transactions.
Patch 2 tests this basic support.
Patches 3-4 refactor transactions a little bit, to add clarity.
Patch 5 adds the framework for error scenarios where only
some jobs that were launched by a transaction complete successfully,
and we need to perform context-sensitive cleanup after the transaction
itself has already completed.
Patches 6-7 add necessary bookkeeping information to backup job
data structures to take advantage of this new feature.
Patch 8 just moves code.
Patch 9 modifies qmp_drive_backup to support the new feature.
Patch 10 implements the new feature for drive_backup transaction actions.
Patch 11 tests the new feature.

Lingering questions:
 - Is it worth it to add a post-transaction completion event to QMP that
   signifies all jobs launched by the transaction have completed? This
   would be of primary interest to libvirt, in particular, but only as
   a convenience feature.

Thank you,
--John Snow

v2:

Key:
[----] : patches are identical
[####] : number of functional differences between upstream/downstream patch
[down] : patch is downstream-only
The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively

001/11:[0009] [FC] 'qapi: Add transaction support to block-dirty-bitmap operations'
002/11:[0036] [FC] 'iotests: add transactional incremental backup test'
003/11:[down] 'blockdev: rename BlkTransactionState and BdrvActionOps'
004/11:[down] 'block: re-add BlkTransactionState'
005/11:[0274] [FC] 'block: add transactional callbacks feature'
006/11:[0004] [FC] 'block: add refcount to Job object'
007/11:[0048] [FC] 'block: add delayed bitmap successor cleanup'
008/11:[down] 'block: move transactions beneath qmp interfaces'
009/11:[0084] [FC] 'qmp: Add an implementation wrapper for qmp_drive_backup'
010/11:[0050] [FC] 'block: drive_backup transaction callback support'
011/11:[0004] [FC] 'iotests: 124 - transactional failure test'

 01: Fixed indentation.
     Fixed QMP commands to behave with new bitmap_lookup from
       transactionless-v4.
     2.3 --> 2.4.
 02: Folded in improvements to qemu-iotest 124 from transactional-v1.
 03: NEW
 04: NEW
 05: A lot:
     Don't delete the comma in the transaction actions config
     use g_new0 instead of g_malloc0
     Phrasing: retcode --> Return code
     Use GCC attributes to mark functions as unused until future patches.
     Added some data structure documentation.
     Many structure and function renames, hopefully to improve readability.
     Use just one list for all Actions instead of two separate lists.
     Remove ActionState from the list upon deletion/decref
     And many other small tweaks.
 06: Comment phrasing.
 07: Removed errp parameter from all functions introduced by this commit.
     bdrv_dirty_bitmap_decref --> bdrv_frozen_bitmap_decref
 08: NEW
 09: _drive_backup --> do_drive_backup()
     Forward declarations removed.
 10: Rebased on top of drastically modified #05.
     Phrasing: BackupBlockJob instead of BackupJob in comments.
 11: Removed extra parameters to wait_incremental() in
       test_transaction_failure()

John Snow (11):
  qapi: Add transaction support to block-dirty-bitmap operations
  iotests: add transactional incremental backup test
  block: rename BlkTransactionState and BdrvActionOps
  block: re-add BlkTransactionState
  block: add transactional callbacks feature
  block: add refcount to Job object
  block: add delayed bitmap successor cleanup
  block: move transactions beneath qmp interfaces
  qmp: Add an implementation wrapper for qmp_drive_backup
  block: drive_backup transaction callback support
  iotests: 124 - transactional failure test

 block.c|   65 +-
 block/backup.c |   29 +-
 blockdev.c | 1599 
 blockjob.c |   18 +-
 include/block/block.h  |   10 +-
 include/block/block_int.h  |8 +
 include/block/blockjob.h   |   21 +
 qapi-schema.json   |6 +-
 tests/qemu-iotests/124 |  170 +
 tests/qemu-iotests/124.out |4 +-
 10 files changed, 1313 insertions(+), 617 deletions(-)

-- 
2.1.0

[Qemu-block] [PATCH v2 08/11] block: move transactions beneath qmp interfaces

2015-03-27 Thread John Snow

In general, since transactions may reference QMP function helpers,
it would be nice for them to sit beneath them.

This will avoid the need for forward declaring any QMP interfaces,
which would be aggravating to update in so many places.

Signed-off-by: John Snow <js...@redhat.com>
---
 blockdev.c | 2581 ++--
 1 file changed, 1292 insertions(+), 1289 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index d404251..fa954e9 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1225,7 +1225,1297 @@ static BdrvDirtyBitmap *block_dirty_bitmap_lookup(const char *node,
 return NULL;
 }
 
-/* New and old BlockDriverState structs for atomic group operations */
+static void eject_device(BlockBackend *blk, int force, Error **errp)
+{
+BlockDriverState *bs = blk_bs(blk);
+AioContext *aio_context;
+
+aio_context = bdrv_get_aio_context(bs);
+aio_context_acquire(aio_context);
+
+if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_EJECT, errp)) {
+goto out;
+}
+if (!blk_dev_has_removable_media(blk)) {
+error_setg(errp, "Device '%s' is not removable",
+   bdrv_get_device_name(bs));
+goto out;
+}
+
+if (blk_dev_is_medium_locked(blk) && !blk_dev_is_tray_open(blk)) {
+blk_dev_eject_request(blk, force);
+if (!force) {
+error_setg(errp, "Device '%s' is locked",
+   bdrv_get_device_name(bs));
+goto out;
+}
+}
+
+bdrv_close(bs);
+
+out:
+aio_context_release(aio_context);
+}
+
+void qmp_eject(const char *device, bool has_force, bool force, Error **errp)
+{
+BlockBackend *blk;
+
+blk = blk_by_name(device);
+if (!blk) {
+error_set(errp, QERR_DEVICE_NOT_FOUND, device);
+return;
+}
+
+eject_device(blk, force, errp);
+}
+
+void qmp_block_passwd(bool has_device, const char *device,
+  bool has_node_name, const char *node_name,
+  const char *password, Error **errp)
+{
+Error *local_err = NULL;
+BlockDriverState *bs;
+AioContext *aio_context;
+
+bs = bdrv_lookup_bs(has_device ? device : NULL,
+has_node_name ? node_name : NULL,
+&local_err);
+if (local_err) {
+error_propagate(errp, local_err);
+return;
+}
+
+aio_context = bdrv_get_aio_context(bs);
+aio_context_acquire(aio_context);
+
+bdrv_add_key(bs, password, errp);
+
+aio_context_release(aio_context);
+}
+
+/* Assumes AioContext is held */
+static void qmp_bdrv_open_encrypted(BlockDriverState *bs, const char *filename,
+int bdrv_flags, BlockDriver *drv,
+const char *password, Error **errp)
+{
+Error *local_err = NULL;
+int ret;
+
+ret = bdrv_open(&bs, filename, NULL, NULL, bdrv_flags, drv, &local_err);
+if (ret < 0) {
+error_propagate(errp, local_err);
+return;
+}
+
+bdrv_add_key(bs, password, errp);
+}
+
+void qmp_change_blockdev(const char *device, const char *filename,
+ const char *format, Error **errp)
+{
+BlockBackend *blk;
+BlockDriverState *bs;
+AioContext *aio_context;
+BlockDriver *drv = NULL;
+int bdrv_flags;
+Error *err = NULL;
+
+blk = blk_by_name(device);
+if (!blk) {
+error_set(errp, QERR_DEVICE_NOT_FOUND, device);
+return;
+}
+bs = blk_bs(blk);
+
+aio_context = bdrv_get_aio_context(bs);
+aio_context_acquire(aio_context);
+
+if (format) {
+drv = bdrv_find_whitelisted_format(format, bs->read_only);
+if (!drv) {
+error_set(errp, QERR_INVALID_BLOCK_FORMAT, format);
+goto out;
+}
+}
+
+eject_device(blk, 0, &err);
+if (err) {
+error_propagate(errp, err);
+goto out;
+}
+
+bdrv_flags = bdrv_is_read_only(bs) ? 0 : BDRV_O_RDWR;
+bdrv_flags |= bdrv_is_snapshot(bs) ? BDRV_O_SNAPSHOT : 0;
+
+qmp_bdrv_open_encrypted(bs, filename, bdrv_flags, drv, NULL, errp);
+
+out:
+aio_context_release(aio_context);
+}
+
+/* throttling disk I/O limits */
+void qmp_block_set_io_throttle(const char *device, int64_t bps, int64_t bps_rd,
+   int64_t bps_wr,
+   int64_t iops,
+   int64_t iops_rd,
+   int64_t iops_wr,
+   bool has_bps_max,
+   int64_t bps_max,
+   bool has_bps_rd_max,
+   int64_t bps_rd_max,
+   bool has_bps_wr_max,
+   int64_t bps_wr_max,
+   bool has_iops_max,
+   int64_t iops_max,
+   bool has_iops_rd_max,
+   int64_t iops_rd_max

Re: [Qemu-block] [PATCH v6 00/21] block: transactionless incremental backup series

2015-04-23 Thread John Snow




On 04/23/2015 09:19 AM, Stefan Hajnoczi wrote:

On Fri, Apr 17, 2015 at 07:49:48PM -0400, John Snow wrote:

===
v6:
===

01: s/underlaying/underlying/
 Removed a reference to 'disabled' bitmaps.
 Touching up inconsistent list indentation.
 Added FreeBSD Documentation License, primarily to be difficult


Please stick to the currently used set of licenses in the future, unless
you have a strong reason.  It's not a good use of anyone's time to fuss
with licenses when we have enough of them in the codebase already.

In my non-lawyer opinion the license you chose seems okay but I'd rather
avoid the risk and hassle.

Thanks,
Stefan



I know I said "primarily to be difficult" but I was just being 
facetious. I didn't find the GPL2+ to be suitable for documentation, 
strictly, so I went to read up on the documentation licenses that the 
fsf support/recommend.


There's the GNU documentation license, but I found that unsuitable for a 
couple reasons -- one of them was that you are forbidden(!) from 
changing the text of the license, and there are some provisions in there 
I didn't like, such as requiring the full text of the license to be 
included with compiled copies of the document. That's not something I 
care about -- a reference in source, for instance, is sufficient, or a 
copy of the license being distributed *with* the compiled source is 
fine, but I have no need for the full license to be copied with the 
compiled version.


The other documentation license the fsf recommends is the FreeBSD one, 
and that one looked appealing, short, and to the point, so it is the one 
I chose. It is essentially the FreeBSD license with words altered to 
clarify what code and source means with respect to a document.


Sorry for /actually/ being difficult; but Eric Blake was urging me to 
select a license instead of relying on the implicit GPL, so I did go out 
of my way to choose one I found appropriate.


I stand by my pick.

--js

Re: [Qemu-block] [Qemu-devel] [PATCH v6 00/21] block: transactionless incremental backup series

2015-04-23 Thread John Snow




On 04/23/2015 03:18 PM, Eric Blake wrote:

On 04/23/2015 08:41 AM, John Snow wrote:


I know I said "primarily to be difficult" but I was just being
facetious. I didn't find the GPL2+ to be suitable for documentation,
strictly, so I went to read up on the documentation licenses that the
fsf support/recommend.

There's the GNU documentation license, but I found that unsuitable for a
couple reasons -- one of them was that you are forbidden(!) from
changing the text of the license,


Note that it is usually only the license text proper that is locked
down; the rest of the documentation is not under the same restriction
unless you declare specific invariant sections such as a cover page. But
I know that the Debian project has typically frowned upon any use of FDL
with invariant sections, and the FDL has therefore earned a somewhat
questionable reputation outside of FSF projects.



Understood; however the GNU FDL specifies within the license where and 
how the GNU FDL must be displayed. I didn't like these requirements, and 
might've used the FDL, but you are prohibited from altering the license, 
so I chose against this license.


It's too restrictive for me.


and there are some provisions in there
I didn't like, such as requiring the full text of the license to be
included with compiled copies of the document. That's not something I
care about -- a reference in source, for instance, is sufficient, or a
copy of the license being distributed *with* the compiled source is
fine, but I have no need for the full license to be copied with the
compiled version.


Yes, I like those benefits of the FreeBSD Documentation License.



The other documentation license the fsf recommends is the FreeBSD one,
and that one looked appealing, short, and to the point, so it is the one
I chose. It is essentially the FreeBSD license with words altered to
clarify what code and source means with respect to a document.


In particular, according to the FSF, the FreeBSD Documentation License
_should be_ acceptable for use with a GPLv2 program:

https://www.gnu.org/philosophy/license-list.html#FreeDocumentationLicenses

although this is probably not the right list to get a definitive answer
from a lawyer familiar with the various copyright licenses and laws.



Sorry for /actually/ being difficult; but Eric Blake was urging me to
select a license instead of relying on the implicit GPL, so I did go out
of my way to choose one I found appropriate.

I stand by my pick.


I also agree with the pick; I think that GPLv2+ on documentation is a
bit questionable - if someone else implements the same interface using
just the documentation, is their code required to be under the GPL by
virtue of using the documentation?  Using a more permissive
documentation license feels nicer to me, as it would allow non-GPL
implementations if someone is so inclined.  Sorry if encouraging the
issue has made matters more difficult.



It's too late! You've opened Pandora's Box!

[Qemu-block] [PATCH v3 10/10] iotests: 124 - transactional failure test

2015-04-22 Thread John Snow

Use a transaction to request an incremental backup across two drives.
Coerce one of the jobs to fail, and then re-run the transaction.

Verify that no bitmap data was lost due to the partial transaction
failure.
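
The diff below builds the transaction with transaction_drive_backup(), whose definition is not part of this patch. A plausible shape for it is sketched here as an assumption: it wraps its arguments in the {'type': ..., 'data': ...} layout that the QMP 'transaction' command expects, matching the dictionaries that v2 of this test spelled out by hand. The drive name, target path and bitmap name used in the example are made up.

```python
def transaction_action(action, **kwargs):
    """Assumed helper: wrap arguments as one QMP transaction action."""
    return {'type': action, 'data': kwargs}


def transaction_drive_backup(device, target, **kwargs):
    """Assumed helper: build a 'drive-backup' transaction action."""
    return transaction_action('drive-backup', device=device, target=target,
                              **kwargs)


# Example with made-up values:
example = transaction_drive_backup('drive0', '/tmp/inc0.qcow2',
                                   sync='dirty-bitmap', format='qcow2',
                                   mode='existing', bitmap='bitmap0')
```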

Signed-off-by: John Snow <js...@redhat.com>
---
 tests/qemu-iotests/124 | 120 -
 tests/qemu-iotests/124.out |   4 +-
 2 files changed, 121 insertions(+), 3 deletions(-)

diff --git a/tests/qemu-iotests/124 b/tests/qemu-iotests/124
index 2d50594..772edd4 100644
--- a/tests/qemu-iotests/124
+++ b/tests/qemu-iotests/124
@@ -139,9 +139,12 @@ class TestIncrementalBackup(iotests.QMPTestCase):
 def do_qmp_backup(self, error='Input/output error', **kwargs):
 res = self.vm.qmp('drive-backup', **kwargs)
 self.assert_qmp(res, 'return', {})
+return self.wait_qmp_backup(kwargs['device'], error)
 
+
+def wait_qmp_backup(self, device, error='Input/output error'):
 event = self.vm.event_wait(name="BLOCK_JOB_COMPLETED",
-   match={'data': {'device': kwargs['device']}})
+   match={'data': {'device': device}})
 self.assertIsNotNone(event)
 
 try:
@@ -375,6 +378,121 @@ class TestIncrementalBackup(iotests.QMPTestCase):
 self.check_backups()
 
 
+def test_transaction_failure(self):
+'''Test: Verify backups made from a transaction that partially fails.
+
+Add a second drive with its own unique pattern, and add a bitmap to each
+drive. Use blkdebug to interfere with the backup on just one drive and
+attempt to create a coherent incremental backup across both drives.
+
+verify a failure in one but not both, then delete the failed stubs and
+re-run the same transaction.
+
+verify that both incrementals are created successfully.
+'''
+
+# Create a second drive, with pattern:
+drive1 = self.add_node('drive1')
+self.img_create(drive1['file'], drive1['fmt'])
+io_write_patterns(drive1['file'], (('0x14', 0, 512),
+   ('0x5d', '1M', '32k'),
+   ('0xcd', '32M', '124k')))
+
+# Create a blkdebug interface to this img as 'drive1'
+result = self.vm.qmp('blockdev-add', options={
+'id': drive1['id'],
+'driver': drive1['fmt'],
+'file': {
+'driver': 'blkdebug',
+'image': {
+'driver': 'file',
+'filename': drive1['file']
+},
+'set-state': [{
+'event': 'flush_to_disk',
+'state': 1,
+'new_state': 2
+}],
+'inject-error': [{
+'event': 'read_aio',
+'errno': 5,
+'state': 2,
+'immediately': False,
+'once': True
+}],
+}
+})
+self.assert_qmp(result, 'return', {})
+
+# Create bitmaps and full backups for both drives
+drive0 = self.drives[0]
+dr0bm0 = self.add_bitmap('bitmap0', drive0)
+dr1bm0 = self.add_bitmap('bitmap0', drive1)
+self.create_anchor_backup(drive0)
+self.create_anchor_backup(drive1)
+self.assert_no_active_block_jobs()
+self.assertFalse(self.vm.get_qmp_events(wait=False))
+
+# Emulate some writes
+self.hmp_io_writes(drive0['id'], (('0xab', 0, 512),
+  ('0xfe', '16M', '256k'),
+  ('0x64', '32736k', '64k')))
+self.hmp_io_writes(drive1['id'], (('0xba', 0, 512),
+  ('0xef', '16M', '256k'),
+  ('0x46', '32736k', '64k')))
+
+# Create incremental backup targets
+target0 = self.prepare_backup(dr0bm0)
+target1 = self.prepare_backup(dr1bm0)
+
+# Ask for a new incremental backup per-each drive,
+# expecting drive1's backup to fail:
+transaction = [
+transaction_drive_backup(drive0['id'], target0, sync='dirty-bitmap',
+ format=drive0['fmt'], mode='existing',
+ bitmap=dr0bm0.name),
+transaction_drive_backup(drive1['id'], target1, sync='dirty-bitmap',
+ format=drive1['fmt'], mode='existing',
+ bitmap=dr1bm0.name),
+]
+result = self.vm.qmp('transaction', actions=transaction)
+self.assert_qmp(result, 'return', {})
+
+# Observe that drive0's backup completes, but drive1's does not.
+# Consume drive1's error and ensure all pending actions are completed.
+self.assertTrue(self.wait_qmp_backup(drive0['id

[Qemu-block] [PATCH v3 05/10] block: add transactional callbacks feature

2015-04-22 Thread John Snow

The goal here is to add a new method to transactions that allows
developers to specify a callback that will get invoked only once
all jobs spawned by a transaction are completed, allowing developers
the chance to perform actions conditionally pending complete success,
partial failure, or complete failure.

In order to register the new callback to be invoked, a user must request
a callback pointer and closure by calling new_action_cb_wrapper, which
creates a wrapper around an opaque pointer and callback that would have
originally been passed to e.g. backup_start().

The function will return a function pointer and a new opaque pointer to
be passed instead. The transaction system will effectively intercept the
original callbacks and perform book-keeping on the transaction after it
has delivered the original enveloped callback.

This means that Transaction Action callback methods will be called after
all callbacks triggered by all Actions in the Transactional group have
been received.

This feature has no knowledge of any jobs spawned by Actions that do not
inform the system via new_action_cb_wrapper().

For an example of how to use the feature, please skip ahead to:
'block: drive_backup transaction callback support' which serves as an example
for how to hook up a post-transaction callback to the Drive Backup action.


Note 1: Defining a callback method alone is not sufficient to have the new
method invoked. You must call new_action_cb_wrapper() AND ensure the
callback it returns is the one used as the callback for the job
launched by the action.

Note 2: You can use this feature for any system that registers completions of
an asynchronous task via a callback of the form
(void *opaque, int ret), not just block job callbacks.
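
The interception scheme can be modelled in a few lines of Python. This is a toy sketch, not the C implementation: the wrapper signature, the post_cb argument and the rule used for the cumulative status (keep the first non-zero return code) are all assumptions chosen for illustration.

```python
class Transaction:
    """Toy model: counts outstanding jobs and accumulates their status."""

    def __init__(self):
        self.jobs = 0
        self.status = 0
        self.post_callbacks = []

    def new_action_cb_wrapper(self, callback, opaque, post_cb):
        """Return (wrapped_callback, wrapped_opaque) to hand to the job."""
        self.jobs += 1
        self.post_callbacks.append(post_cb)
        return self._intercept, (callback, opaque)

    def _intercept(self, wrapped_opaque, ret):
        callback, opaque = wrapped_opaque
        callback(opaque, ret)  # deliver the original callback first
        if ret != 0 and self.status == 0:
            self.status = ret  # assumed rule: remember the first failure
        self.jobs -= 1
        if self.jobs == 0:
            # Last job finished: run every post-transaction callback.
            for post_cb in self.post_callbacks:
                post_cb(self.status)
```

A job launched without going through new_action_cb_wrapper() never touches the counter, which corresponds to Note 1 above: defining the callback alone is not enough.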

Signed-off-by: John Snow <js...@redhat.com>
---
 blockdev.c | 183 +++--
 1 file changed, 179 insertions(+), 4 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 2ab63ed..31ccb1b 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1240,6 +1240,8 @@ typedef struct BlkActionState BlkActionState;
  * @abort: Abort the changes on fail, can be NULL.
  * @clean: Clean up resources after all transaction actions have called
  * commit() or abort(). Can be NULL.
+ * @cb: Executed after all jobs launched by actions in the transaction finish,
+ *  but only if requested by new_action_cb_wrapper() prior to clean().
  *
  * Only prepare() may fail. In a single transaction, only one of commit() or
  * abort() will be called. clean() will always be called if it is present.
@@ -1250,6 +1252,7 @@ typedef struct BlkActionOps {
 void (*commit)(BlkActionState *common);
 void (*abort)(BlkActionState *common);
 void (*clean)(BlkActionState *common);
+void (*cb)(BlkActionState *common);
 } BlkActionOps;
 
 /**
@@ -1258,19 +1261,46 @@ typedef struct BlkActionOps {
  * by a transaction group.
  *
  * @jobs: A reference count that tracks how many jobs still need to complete.
+ * @status: A cumulative return code for all actions that have reported
+ *  a return code via callback in the transaction.
  * @actions: A list of all Actions in the Transaction.
+ *   However, once the transaction has completed, it will be only a list
+ *   of transactions that have registered a post-transaction callback.
  */
 typedef struct BlkTransactionState {
 int jobs;
+int status;
 QTAILQ_HEAD(actions, BlkActionState) actions;
 } BlkTransactionState;
 
+typedef void (CallbackFn)(void *opaque, int ret);
+
+/**
+ * BlkActionCallbackData:
+ * Necessary state for intercepting and
+ * re-delivering a callback triggered by an Action.
+ *
+ * @opaque: The data to be given to the encapsulated callback when
+ *  a job launched by an Action completes.
+ * @ret: The status code that was delivered to the encapsulated callback.
+ * @callback: The encapsulated callback to invoke upon completion of
+ *the Job launched by the Action.
+ */
+typedef struct BlkActionCallbackData {
+    void *opaque;
+    int ret;
+    CallbackFn *callback;
+} BlkActionCallbackData;
+
 /**
  * BlkActionState:
  * Describes one Action's state within a Transaction.
  *
  * @action: QAPI-defined enum identifying which Action to perform.
  * @ops: Table of ActionOps this Action can perform.
+ * @transaction: A pointer back to the Transaction this Action belongs to.
+ * @cb_data: Information on this Action's encapsulated callback, if any.
+ * @refcount: reference count, allowing access to this state beyond clean().
  * @entry: List membership for all Actions in this Transaction.
  *
  * This structure must be arranged as first member in a subclassed type,
@@ -1280,6 +1310,9 @@ typedef struct BlkTransactionState {
 struct BlkActionState {
     TransactionAction *action;
     const BlkActionOps *ops;
+    BlkTransactionState *transaction;
+    BlkActionCallbackData *cb_data;
+    int refcount
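
For readers following along, the interception idea described in the commit message can be sketched outside of QEMU. This is only an illustration under stated assumptions: TxnSketch, CbSketch, cb_sketch_init(), cb_sketch_intercept() and record_ret() are hypothetical names and not the code in this patch, and the sketch re-delivers the original callback immediately instead of modelling the deferred cb() step.

```c
#include <assert.h>
#include <stddef.h>

typedef void (CallbackFn)(void *opaque, int ret);

/* Hypothetical stand-in for a transaction: counts outstanding jobs
 * and keeps the first non-zero return code that is reported. */
typedef struct TxnSketch {
    int jobs;
    int status;
} TxnSketch;

/* Hypothetical stand-in for the state needed to intercept one callback. */
typedef struct CbSketch {
    TxnSketch *txn;
    void *opaque;          /* data for the original callback */
    int ret;               /* return code seen by the interceptor */
    CallbackFn *callback;  /* the original callback */
} CbSketch;

/* Remember the original callback and count one more outstanding job. */
void cb_sketch_init(CbSketch *cb, TxnSketch *txn,
                    CallbackFn *callback, void *opaque)
{
    cb->txn = txn;
    cb->opaque = opaque;
    cb->ret = 0;
    cb->callback = callback;
    txn->jobs++;
}

/* Has the (void *opaque, int ret) shape, so it can be handed to a job
 * in place of the original callback. */
void cb_sketch_intercept(void *opaque, int ret)
{
    CbSketch *cb = opaque;
    TxnSketch *txn = cb->txn;

    assert(txn->jobs > 0);
    cb->ret = ret;
    if (ret && !txn->status) {
        txn->status = ret;   /* cumulative status: first failure wins */
    }
    txn->jobs--;

    /* Re-deliver to the original callback, if there is one. */
    if (cb->callback) {
        cb->callback(cb->opaque, cb->ret);
    }
}

/* Example original callback: stores ret in the int that opaque points at. */
void record_ret(void *opaque, int ret)
{
    *(int *)opaque = ret;
}
```

A caller would pass cb_sketch_intercept together with a CbSketch pointer wherever a job expects a completion callback and its opaque data. Once txn.jobs drops back to zero, txn.status says whether any job in the group failed.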

[Qemu-block] [PATCH v3 06/10] block: add refcount to Job object

2015-04-22 Thread John Snow

If we want to get at the job after the life of the job,
we'll need a refcount for this object.

This may occur for example if we wish to inspect the actions
taken by a particular job after a transactional group of jobs
runs, and further actions are required.

Signed-off-by: John Snow js...@redhat.com
Reviewed-by: Max Reitz mre...@redhat.com
---
 blockjob.c   | 18 --
 include/block/blockjob.h | 21 +
 2 files changed, 37 insertions(+), 2 deletions(-)

diff --git a/blockjob.c b/blockjob.c
index ba2255d..d620082 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -35,6 +35,19 @@
 #include "qemu/timer.h"
 #include "qapi-event.h"
 
+void block_job_incref(BlockJob *job)
+{
+    job->refcount++;
+}
+
+void block_job_decref(BlockJob *job)
+{
+    job->refcount--;
+    if (job->refcount == 0) {
+        g_free(job);
+    }
+}
+
 void *block_job_create(const BlockJobDriver *driver, BlockDriverState *bs,
                        int64_t speed, BlockCompletionFunc *cb,
                        void *opaque, Error **errp)
@@ -57,6 +70,7 @@ void *block_job_create(const BlockJobDriver *driver, BlockDriverState *bs,
     job->cb            = cb;
     job->opaque        = opaque;
     job->busy          = true;
+    job->refcount      = 1;
     bs->job = job;
 
     /* Only set speed when necessary to avoid NotSupported error */
@@ -68,7 +82,7 @@ void *block_job_create(const BlockJobDriver *driver, BlockDriverState *bs,
             bs->job = NULL;
             bdrv_op_unblock_all(bs, job->blocker);
             error_free(job->blocker);
-            g_free(job);
+            block_job_decref(job);
             error_propagate(errp, local_err);
             return NULL;
         }
@@ -85,7 +99,7 @@ void block_job_completed(BlockJob *job, int ret)
     bs->job = NULL;
     bdrv_op_unblock_all(bs, job->blocker);
     error_free(job->blocker);
-    g_free(job);
+    block_job_decref(job);
 }
 
 void block_job_set_speed(BlockJob *job, int64_t speed, Error **errp)
diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index b6d4ebb..dcc0596 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -116,6 +116,9 @@ struct BlockJob {
 
 /** The opaque value that is passed to the completion function.  */
 void *opaque;
+
+    /** A reference count, allowing for post-job actions in e.g. transactions */
+    int refcount;
 };
 
 /**
@@ -141,6 +144,24 @@ void *block_job_create(const BlockJobDriver *driver, BlockDriverState *bs,
void *opaque, Error **errp);
 
 /**
+ * block_job_incref:
+ * @job: The job to pick up a handle to
+ *
+ * Increment the refcount on @job, to be able to use it asynchronously
+ * from the job it is being used for. Put down the reference when done
+ * with block_job_decref().
+ */
+void block_job_incref(BlockJob *job);
+
+/**
+ * block_job_decref:
+ * @job: The job to unreference and delete.
+ *
+ * Decrement the refcount on @job and delete it if there are no more references.
+ */
+void block_job_decref(BlockJob *job);
+
+/**
  * block_job_sleep_ns:
  * @job: The job that calls the function.
  * @clock: The clock to sleep on.
-- 
2.1.0
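
As an aside for anyone reading this patch in isolation, the lifetime rule it introduces can be sketched as a standalone program. The names here (JobSketch, job_sketch_new(), job_sketch_incref(), job_sketch_decref()) are hypothetical, and the freed flag exists only so the sketch can observe the release; none of this is the blockjob code itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical refcounted object standing in for a job. */
typedef struct JobSketch {
    int refcount;
    bool *freed;   /* demo only: set to true when the object is released */
} JobSketch;

/* The creator owns the first reference, as block_job_create() does. */
JobSketch *job_sketch_new(bool *freed)
{
    JobSketch *job = calloc(1, sizeof(*job));

    if (!job) {
        return NULL;
    }
    job->refcount = 1;
    job->freed = freed;
    return job;
}

/* Take an extra reference so the object outlives its normal completion. */
void job_sketch_incref(JobSketch *job)
{
    assert(job->refcount > 0);
    job->refcount++;
}

/* Drop a reference; the last holder releases the object. */
void job_sketch_decref(JobSketch *job)
{
    assert(job->refcount > 0);
    if (--job->refcount == 0) {
        if (job->freed) {
            *job->freed = true;
        }
        free(job);
    }
}
```

The point of the pattern is that completion no longer implies destruction: code that took an extra reference can still inspect the object after the job finishes, and the object goes away only when that reference is dropped too.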

[Qemu-block] [PATCH v3 02/10] iotests: add transactional incremental backup test

2015-04-22 Thread John Snow

Test simple usage cases for using transactions to create
and synchronize incremental backups.

Signed-off-by: John Snow js...@redhat.com
---
 tests/qemu-iotests/124 | 54 ++
 tests/qemu-iotests/124.out |  4 ++--
 2 files changed, 56 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/124 b/tests/qemu-iotests/124
index 3ee78cd..2d50594 100644
--- a/tests/qemu-iotests/124
+++ b/tests/qemu-iotests/124
@@ -36,6 +36,23 @@ def try_remove(img):
 pass
 
 
+def transaction_action(action, **kwargs):
+    return {
+        'type': action,
+        'data': kwargs
+    }
+
+
+def transaction_bitmap_clear(node, name, **kwargs):
+    return transaction_action('block-dirty-bitmap-clear',
+                              node=node, name=name, **kwargs)
+
+
+def transaction_drive_backup(device, target, **kwargs):
+    return transaction_action('drive-backup', device=device, target=target,
+                              **kwargs)
+
+
 class Bitmap:
     def __init__(self, name, drive):
         self.name = name
@@ -264,6 +281,43 @@ class TestIncrementalBackup(iotests.QMPTestCase):
         return self.do_incremental_simple(granularity=131072)
 
 
+    def test_incremental_transaction(self):
+        '''Test: Verify backups made from transactionally created bitmaps.
+
+        Create a bitmap before VM execution begins, then create a second
+        bitmap AFTER writes have already occurred. Use transactions to create
+        a full backup and synchronize both bitmaps to this backup.
+        Create an incremental backup through both bitmaps and verify that
+        both backups match the current drive0 image.
+        '''
+
+        drive0 = self.drives[0]
+        bitmap0 = self.add_bitmap('bitmap0', drive0)
+        self.hmp_io_writes(drive0['id'], (('0xab', 0, 512),
+                                          ('0xfe', '16M', '256k'),
+                                          ('0x64', '32736k', '64k')))
+        bitmap1 = self.add_bitmap('bitmap1', drive0)
+
+        result = self.vm.qmp('transaction', actions=[
+            transaction_bitmap_clear(bitmap0.drive['id'], bitmap0.name),
+            transaction_bitmap_clear(bitmap1.drive['id'], bitmap1.name),
+            transaction_drive_backup(drive0['id'], drive0['backup'],
+                                     sync='full', format=drive0['fmt'])
+        ])
+        self.assert_qmp(result, 'return', {})
+        self.wait_until_completed(drive0['id'])
+        self.files.append(drive0['backup'])
+
+        self.hmp_io_writes(drive0['id'], (('0x9a', 0, 512),
+                                          ('0x55', '8M', '352k'),
+                                          ('0x78', '15872k', '1M')))
+        # Both bitmaps should be correctly in sync.
+        self.create_incremental(bitmap0)
+        self.create_incremental(bitmap1)
+        self.vm.shutdown()
+        self.check_backups()
+
+
     def test_incremental_failure(self):
         '''Test: Verify backups made after a failure are correct.
 
diff --git a/tests/qemu-iotests/124.out b/tests/qemu-iotests/124.out
index 2f7d390..594c16f 100644
--- a/tests/qemu-iotests/124.out
+++ b/tests/qemu-iotests/124.out
@@ -1,5 +1,5 @@
-.......
+........
 ----------------------------------------------------------------------
-Ran 7 tests
+Ran 8 tests
 
 OK
-- 
2.1.0

[Qemu-block] [PATCH v3 07/10] block: add delayed bitmap successor cleanup

2015-04-22 Thread John Snow

Allow bitmap successors to carry reference counts.

We can in a later patch use this ability to clean up the dirty bitmap
according to both the individual job's success and the success of all
jobs in the transaction group.

The code for cleaning up a bitmap is also moved from backup_run to
backup_complete.

Signed-off-by: John Snow js...@redhat.com
Reviewed-by: Max Reitz mre...@redhat.com
---
 block.c   | 65 ++-
 block/backup.c| 20 ++--
 include/block/block.h | 10 
 3 files changed, 70 insertions(+), 25 deletions(-)

diff --git a/block.c b/block.c
index b29aafe..0e7308c 100644
--- a/block.c
+++ b/block.c
@@ -51,6 +51,12 @@
 #include <windows.h>
 #endif
 
+typedef enum BitmapSuccessorAction {
+    SUCCESSOR_ACTION_UNDEFINED = 0,
+    SUCCESSOR_ACTION_ABDICATE,
+    SUCCESSOR_ACTION_RECLAIM
+} BitmapSuccessorAction;
+
 /**
  * A BdrvDirtyBitmap can be in three possible states:
  * (1) successor is NULL and disabled is false: full r/w mode
@@ -65,6 +71,8 @@ struct BdrvDirtyBitmap {
     char *name;                 /* Optional non-empty unique ID */
     int64_t size;               /* Size of the bitmap (Number of sectors) */
     bool disabled;              /* Bitmap is read-only */
+    int successor_refcount;     /* Number of active handles to the successor */
+    BitmapSuccessorAction act;  /* Action to take on successor upon release */
     QLIST_ENTRY(BdrvDirtyBitmap) list;
 };
 
@@ -5540,6 +5548,7 @@ int bdrv_dirty_bitmap_create_successor(BlockDriverState *bs,
 
     /* Install the successor and freeze the parent */
     bitmap->successor = child;
+    bitmap->successor_refcount = 1;
     return 0;
 }
 
@@ -5547,9 +5556,9 @@ int bdrv_dirty_bitmap_create_successor(BlockDriverState *bs,
  * For a bitmap with a successor, yield our name to the successor,
  * delete the old bitmap, and return a handle to the new bitmap.
  */
-BdrvDirtyBitmap *bdrv_dirty_bitmap_abdicate(BlockDriverState *bs,
-                                            BdrvDirtyBitmap *bitmap,
-                                            Error **errp)
+static BdrvDirtyBitmap *bdrv_dirty_bitmap_abdicate(BlockDriverState *bs,
+                                                   BdrvDirtyBitmap *bitmap,
+                                                   Error **errp)
 {
     char *name;
     BdrvDirtyBitmap *successor = bitmap->successor;
@@ -5574,9 +5583,9 @@ BdrvDirtyBitmap *bdrv_dirty_bitmap_abdicate(BlockDriverState *bs,
  * we may wish to re-join the parent and child/successor.
  * The merged parent will be un-frozen, but not explicitly re-enabled.
  */
-BdrvDirtyBitmap *bdrv_reclaim_dirty_bitmap(BlockDriverState *bs,
-                                           BdrvDirtyBitmap *parent,
-                                           Error **errp)
+static BdrvDirtyBitmap *bdrv_reclaim_dirty_bitmap(BlockDriverState *bs,
+                                                  BdrvDirtyBitmap *parent,
+                                                  Error **errp)
 {
     BdrvDirtyBitmap *successor = parent->successor;
 
@@ -5595,6 +5604,50 @@ BdrvDirtyBitmap *bdrv_reclaim_dirty_bitmap(BlockDriverState *bs,
     return parent;
 }
 
+static BdrvDirtyBitmap *bdrv_free_bitmap_successor(BlockDriverState *bs,
+                                                   BdrvDirtyBitmap *parent)
+{
+    assert(!parent->successor_refcount);
+
+    switch (parent->act) {
+    case SUCCESSOR_ACTION_RECLAIM:
+        return bdrv_reclaim_dirty_bitmap(bs, parent, NULL);
+    case SUCCESSOR_ACTION_ABDICATE:
+        return bdrv_dirty_bitmap_abdicate(bs, parent, NULL);
+    case SUCCESSOR_ACTION_UNDEFINED:
+    default:
+        g_assert_not_reached();
+    }
+}
+
+BdrvDirtyBitmap *bdrv_frozen_bitmap_decref(BlockDriverState *bs,
+                                           BdrvDirtyBitmap *parent,
+                                           int ret)
+{
+    assert(bdrv_dirty_bitmap_frozen(parent));
+    assert(parent->successor);
+
+    if (ret) {
+        parent->act = SUCCESSOR_ACTION_RECLAIM;
+    } else if (parent->act != SUCCESSOR_ACTION_RECLAIM) {
+        parent->act = SUCCESSOR_ACTION_ABDICATE;
+    }
+
+    parent->successor_refcount--;
+    if (parent->successor_refcount == 0) {
+        return bdrv_free_bitmap_successor(bs, parent);
+    }
+    return parent;
+}
+
+void bdrv_dirty_bitmap_incref(BdrvDirtyBitmap *parent)
+{
+    assert(bdrv_dirty_bitmap_frozen(parent));
+    assert(parent->successor);
+
+    parent->successor_refcount++;
+}
+
+
 /**
  * Truncates _all_ bitmaps attached to a BDS.
  */
diff --git a/block/backup.c b/block/backup.c
index a297df6..62f8d2b 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -240,6 +240,12 @@ static void backup_complete(BlockJob *job, void *opaque)
 
     bdrv_unref(s->target);
 
+    if (s->sync_bitmap) {
+        BdrvDirtyBitmap *bm;
+        bm = bdrv_frozen_bitmap_decref(job->bs, s->sync_bitmap, data->ret);
+        assert(bm
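
The rule that bdrv_frozen_bitmap_decref() applies (any failure forces a reclaim, success alone leads to abdication, and nothing happens until the last handle is dropped) can be tried out in isolation. The sketch below is a simplified model with hypothetical names (FrozenBitmapSketch, frozen_bitmap_sketch_decref()); it returns the chosen action instead of performing it and is not the block.c code.

```c
#include <assert.h>

typedef enum SuccessorActionSketch {
    ACTION_UNDEFINED = 0,
    ACTION_ABDICATE,
    ACTION_RECLAIM
} SuccessorActionSketch;

/* Hypothetical stand-in for a frozen bitmap with a successor. */
typedef struct FrozenBitmapSketch {
    int successor_refcount;
    SuccessorActionSketch act;
} FrozenBitmapSketch;

/*
 * Drop one handle, reporting this holder's result in ret.
 * Returns the action to carry out once the last handle is gone,
 * or ACTION_UNDEFINED while other holders remain.
 */
SuccessorActionSketch frozen_bitmap_sketch_decref(FrozenBitmapSketch *bm,
                                                  int ret)
{
    assert(bm->successor_refcount > 0);

    if (ret) {
        /* A single failure is sticky. */
        bm->act = ACTION_RECLAIM;
    } else if (bm->act != ACTION_RECLAIM) {
        bm->act = ACTION_ABDICATE;
    }

    if (--bm->successor_refcount == 0) {
        return bm->act;
    }
    return ACTION_UNDEFINED;
}
```

With two holders, a failure reported by either one yields ACTION_RECLAIM on the final decref regardless of the order in which they finish, which is the property a transaction group relies on.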

[Qemu-block] [PATCH v3 7/9] qtest/ahci: add flush migrate test

2015-04-30 Thread John Snow

Use blkdebug to inject an error on first flush, then attempt to flush
on the first guest. When the error halts the VM, migrate to the
second VM, and attempt to resume the command.

Signed-off-by: John Snow js...@redhat.com
---
 tests/ahci-test.c | 52 +++-
 1 file changed, 51 insertions(+), 1 deletion(-)

diff --git a/tests/ahci-test.c b/tests/ahci-test.c
index 90f631c..f770c9d 100644
--- a/tests/ahci-test.c
+++ b/tests/ahci-test.c
@@ -1069,7 +1069,7 @@ static void test_flush_retry(void)
 debug_path,
 tmp_path);
 
-/* Issue Flush Command */
+/* Issue Flush Command and wait for error */
 port = ahci_port_select(ahci);
 ahci_port_clear(ahci, port);
 cmd = ahci_command_create(CMD_FLUSH_CACHE);
@@ -1152,6 +1152,55 @@ static void test_migrate_dma(void)
 g_free(tx);
 }
 
+/**
+ * Migration test: Try to flush, migrate, then resume.
+ */
+static void test_flush_migrate(void)
+{
+    AHCIQState *src, *dst;
+    AHCICommand *cmd;
+    uint8_t px;
+    const char *s;
+    const char *uri = "tcp:127.0.0.1:1234";
+
+    prepare_blkdebug_script(debug_path, "flush_to_disk");
+
+    src = ahci_boot_and_enable("-drive file=blkdebug:%s:%s,if=none,id=drive0,"
+                               "cache=writeback,rerror=stop,werror=stop "
+                               "-M q35 "
+                               "-device ide-hd,drive=drive0 ",
+                               debug_path, tmp_path);
+    dst = ahci_boot("-drive file=%s,if=none,id=drive0,"
+                    "cache=writeback,rerror=stop,werror=stop "
+                    "-M q35 "
+                    "-device ide-hd,drive=drive0 "
+                    "-incoming %s", tmp_path, uri);
+
+    set_context(src->parent);
+
+    /* Issue Flush Command */
+    px = ahci_port_select(src);
+    ahci_port_clear(src, px);
+    cmd = ahci_command_create(CMD_FLUSH_CACHE);
+    ahci_command_commit(src, cmd, px);
+    ahci_command_issue_async(src, cmd);
+    qmp_eventwait("STOP");
+
+    /* Migrate over */
+    ahci_migrate(src, dst, uri);
+
+    /* Complete the command */
+    s = "{'execute':'cont' }";
+    qmp_async(s);
+    qmp_eventwait("RESUME");
+    ahci_command_wait(dst, cmd);
+    ahci_command_verify(dst, cmd);
+
+    ahci_command_free(cmd);
+    ahci_shutdown(src);
+    ahci_shutdown(dst);
+}
+
 
 /******************************************************************************/
 /* AHCI I/O Test Matrix Definitions                                           */
 
@@ -1400,6 +1449,7 @@ int main(int argc, char **argv)
 
     qtest_add_func("/ahci/flush/simple", test_flush);
     qtest_add_func("/ahci/flush/retry", test_flush_retry);
+    qtest_add_func("/ahci/flush/migrate", test_flush_migrate);
 
     qtest_add_func("/ahci/migrate/sanity", test_migrate_sanity);
     qtest_add_func("/ahci/migrate/dma", test_migrate_dma);
-- 
2.1.0

[Qemu-block] [PATCH v3 6/9] qtest/ahci: add migrate dma test

2015-04-30 Thread John Snow

Write to one guest, migrate, and then read from the other.
Adjust ahci_io to clear any buffers it creates, so that we
can use ahci_io safely on both guests knowing we are using
empty buffers and not accidentally re-using data.

Signed-off-by: John Snow js...@redhat.com
---
 tests/ahci-test.c   | 45 +
 tests/libqos/ahci.c |  1 +
 2 files changed, 46 insertions(+)

diff --git a/tests/ahci-test.c b/tests/ahci-test.c
index 2656b37..90f631c 100644
--- a/tests/ahci-test.c
+++ b/tests/ahci-test.c
@@ -1108,6 +1108,50 @@ static void test_migrate_sanity(void)
 ahci_shutdown(dst);
 }
 
+/**
+ * DMA Migration test: Write a pattern, migrate, then read.
+ */
+static void test_migrate_dma(void)
+{
+    AHCIQState *src, *dst;
+    uint8_t px;
+    size_t bufsize = 4096;
+    unsigned char *tx = g_malloc(bufsize);
+    unsigned char *rx = g_malloc0(bufsize);
+    unsigned i;
+    const char *uri = "tcp:127.0.0.1:1234";
+
+    src = ahci_boot_and_enable("-m 1024 -M q35 "
+                               "-hda %s ", tmp_path);
+    dst = ahci_boot("-m 1024 -M q35 "
+                    "-hda %s "
+                    "-incoming %s", tmp_path, uri);
+
+    set_context(src->parent);
+
+    /* initialize */
+    px = ahci_port_select(src);
+    ahci_port_clear(src, px);
+
+    /* create pattern */
+    for (i = 0; i < bufsize; i++) {
+        tx[i] = (bufsize - i);
+    }
+
+    /* Write, migrate, then read. */
+    ahci_io(src, px, CMD_WRITE_DMA, tx, bufsize, 0);
+    ahci_migrate(src, dst, uri);
+    ahci_io(dst, px, CMD_READ_DMA, rx, bufsize, 0);
+
+    /* Verify pattern */
+    g_assert_cmphex(memcmp(tx, rx, bufsize), ==, 0);
+
+    ahci_shutdown(src);
+    ahci_shutdown(dst);
+    g_free(rx);
+    g_free(tx);
+}
+
 
 /******************************************************************************/
 /* AHCI I/O Test Matrix Definitions                                           */
 
@@ -1358,6 +1402,7 @@ int main(int argc, char **argv)
     qtest_add_func("/ahci/flush/retry", test_flush_retry);
 
     qtest_add_func("/ahci/migrate/sanity", test_migrate_sanity);
+    qtest_add_func("/ahci/migrate/dma", test_migrate_dma);
 
 ret = g_test_run();
 
diff --git a/tests/libqos/ahci.c b/tests/libqos/ahci.c
index 29e12f9..e2ac0d7 100644
--- a/tests/libqos/ahci.c
+++ b/tests/libqos/ahci.c
@@ -650,6 +650,7 @@ void ahci_io(AHCIQState *ahci, uint8_t port, uint8_t ide_cmd,
     g_assert(props);
     ptr = ahci_alloc(ahci, bufsize);
     g_assert(ptr);
+    qmemset(ptr, 0x00, bufsize);
 
     if (props->write) {
         memwrite(ptr, buffer, bufsize);
-- 
2.1.0

[Qemu-block] [PATCH v3 8/9] qtest/ahci: add halted dma test

2015-04-30 Thread John Snow

If we're going to test the migration of halted DMA jobs,
we should probably check to make sure we can resume them
locally as a first step.

Signed-off-by: John Snow js...@redhat.com
---
 tests/ahci-test.c | 60 +++
 1 file changed, 60 insertions(+)

diff --git a/tests/ahci-test.c b/tests/ahci-test.c
index f770c9d..aa0db92 100644
--- a/tests/ahci-test.c
+++ b/tests/ahci-test.c
@@ -1153,6 +1153,65 @@ static void test_migrate_dma(void)
 }
 
 /**
+ * DMA Error Test
+ *
+ * Simulate an error on first write, Try to write a pattern,
+ * Confirm the VM has stopped, resume the VM, verify command
+ * has completed, then read back the data and verify.
+ */
+static void test_halted_dma(void)
+{
+    AHCIQState *ahci;
+    uint8_t port;
+    size_t bufsize = 4096;
+    unsigned char *tx = g_malloc(bufsize);
+    unsigned char *rx = g_malloc0(bufsize);
+    unsigned i;
+    uint64_t ptr;
+    AHCICommand *cmd;
+
+    prepare_blkdebug_script(debug_path, "write_aio");
+
+    ahci = ahci_boot_and_enable("-drive file=blkdebug:%s:%s,if=none,id=drive0,"
+                                "format=qcow2,cache=writeback,"
+                                "rerror=stop,werror=stop "
+                                "-M q35 "
+                                "-device ide-hd,drive=drive0 ",
+                                debug_path,
+                                tmp_path);
+
+    /* Initialize and prepare */
+    port = ahci_port_select(ahci);
+    ahci_port_clear(ahci, port);
+
+    for (i = 0; i < bufsize; i++) {
+        tx[i] = (bufsize - i);
+    }
+
+    /* create DMA source buffer and write pattern */
+    ptr = ahci_alloc(ahci, bufsize);
+    g_assert(ptr);
+    memwrite(ptr, tx, bufsize);
+
+    /* Attempt to write (and fail) */
+    cmd = ahci_guest_io_halt(ahci, port, CMD_WRITE_DMA,
+                             ptr, bufsize, 0);
+
+    /* Attempt to resume the command */
+    ahci_guest_io_resume(ahci, cmd);
+    ahci_free(ahci, ptr);
+
+    /* Read back and verify */
+    ahci_io(ahci, port, CMD_READ_DMA, rx, bufsize, 0);
+    g_assert_cmphex(memcmp(tx, rx, bufsize), ==, 0);
+
+    /* Cleanup and go home */
+    ahci_shutdown(ahci);
+    g_free(rx);
+    g_free(tx);
+}
+
+/**
  * Migration test: Try to flush, migrate, then resume.
  */
 static void test_flush_migrate(void)
@@ -1453,6 +1512,7 @@ int main(int argc, char **argv)
 
     qtest_add_func("/ahci/migrate/sanity", test_migrate_sanity);
     qtest_add_func("/ahci/migrate/dma", test_migrate_dma);
+    qtest_add_func("/ahci/io/dma/lba28/retry", test_halted_dma);
 
 ret = g_test_run();
 
-- 
2.1.0

[Qemu-block] [PATCH v3 1/9] libqos/ahci: Add halted command helpers

2015-04-30 Thread John Snow

Sometimes we want a command to halt the VM instead
of completing successfully, so it'd be nice to let the
libqos/ahci functions cope with such scenarios.

Signed-off-by: John Snow js...@redhat.com
---
 tests/libqos/ahci.c | 27 +++
 tests/libqos/ahci.h |  3 +++
 2 files changed, 30 insertions(+)

diff --git a/tests/libqos/ahci.c b/tests/libqos/ahci.c
index a18c12b..05dd04d 100644
--- a/tests/libqos/ahci.c
+++ b/tests/libqos/ahci.c
@@ -566,6 +566,33 @@ inline unsigned size_to_prdtl(unsigned bytes, unsigned bytes_per_prd)
     return (bytes + bytes_per_prd - 1) / bytes_per_prd;
 }
 
+/* Issue a command, expecting it to fail and STOP the VM */
+AHCICommand *ahci_guest_io_halt(AHCIQState *ahci, uint8_t port,
+uint8_t ide_cmd, uint64_t buffer,
+size_t bufsize, uint64_t sector)
+{
+    AHCICommand *cmd;
+
+    cmd = ahci_command_create(ide_cmd);
+    ahci_command_adjust(cmd, sector, buffer, bufsize, 0);
+    ahci_command_commit(ahci, cmd, port);
+    ahci_command_issue_async(ahci, cmd);
+    qmp_eventwait("STOP");
+
+    return cmd;
+}
+
+/* Resume a previously failed command and verify/finalize */
+void ahci_guest_io_resume(AHCIQState *ahci, AHCICommand *cmd)
+{
+    /* Complete the command */
+    qmp_async("{'execute':'cont' }");
+    qmp_eventwait("RESUME");
+    ahci_command_wait(ahci, cmd);
+    ahci_command_verify(ahci, cmd);
+    ahci_command_free(cmd);
+}
+
 /* Given a guest buffer address, perform an IO operation */
 void ahci_guest_io(AHCIQState *ahci, uint8_t port, uint8_t ide_cmd,
uint64_t buffer, size_t bufsize, uint64_t sector)
diff --git a/tests/libqos/ahci.h b/tests/libqos/ahci.h
index 40e8ca4..779e812 100644
--- a/tests/libqos/ahci.h
+++ b/tests/libqos/ahci.h
@@ -524,6 +524,9 @@ unsigned ahci_pick_cmd(AHCIQState *ahci, uint8_t port);
 unsigned size_to_prdtl(unsigned bytes, unsigned bytes_per_prd);
 void ahci_guest_io(AHCIQState *ahci, uint8_t port, uint8_t ide_cmd,
uint64_t gbuffer, size_t size, uint64_t sector);
+AHCICommand *ahci_guest_io_halt(AHCIQState *ahci, uint8_t port, uint8_t ide_cmd,
+                                uint64_t gbuffer, size_t size, uint64_t sector);
+void ahci_guest_io_resume(AHCIQState *ahci, AHCICommand *cmd);
 void ahci_io(AHCIQState *ahci, uint8_t port, uint8_t ide_cmd,
  void *buffer, size_t bufsize, uint64_t sector);
 
-- 
2.1.0

[Qemu-block] [PATCH v3 0/9] ahci: enable migration

2015-04-30 Thread John Snow

The day we all feared is here, and I am proposing we allow the migration
of the AHCI device tentatively for the 2.4 development window.

Some more NCQ migration tests are still needed, but I felt that it was
important to get migration enabled as close to the start of the 2.4
development window as possible.

If the NCQ patches don't pan out by the time the 2.4 freeze occurs, we can
revert the migration boolean and add a conditional around the ahci tests
that rely on the migration feature being enabled.

I am justifying this checkin based on a series of ping-pong
migration tests I ran under heavy load (using google's stressapptest),
in which I saw over 300 successful migrations without a single failure.

This series does a few things:
(1) Add migration facilities to libqos
(2) Enable AHCI and ICH9 migration
(3) Add a series of migration tests to ahci-test

v3:
 - Rebase and resend for 2.4.
 - Minor style guide fix.

v2:
 - Added a URI parameter to the migrate() helper
 - Adjust ahci_shutdown to set qtest context for itself
 - Make sure verify() is part of ahci_migrate() and redundant
   calls are eliminated
 - Add new helpers to make tests with blkdebug injections more
   succinct
 - Change the flush migrate test to not load the blkdebug rule
   on the destination host
 - Modify the migrate() function so that it does not poll the
   VM for migration status if it can rely on RESUME events.
 - New patch: Repair the ahci_command_set_offset helper.
 - New test: Test DMA halt and resume.
 - New test: Test DMA halt, migrate, and resume.

==
For convenience, this branch is available at:
https://github.com/jnsnow/qemu.git branch ahci-migration-test
https://github.com/jnsnow/qemu/tree/ahci-migration-test

This version is tagged ahci-migration-test-v3:
https://github.com/jnsnow/qemu/releases/tag/ahci-migration-test-v3
==

John Snow (9):
  libqos/ahci: Add halted command helpers
  libqos/ahci: Fix sector set method
  libqos: Add migration helpers
  ich9/ahci: Enable Migration
  qtest/ahci: Add migration test
  qtest/ahci: add migrate dma test
  qtest/ahci: add flush migrate test
  qtest/ahci: add halted dma test
  qtest/ahci: add migrate halted dma test

 hw/ide/ahci.c |   1 -
 hw/ide/ich.c  |   1 -
 tests/ahci-test.c | 318 +-
 tests/libqos/ahci.c   |  34 +-
 tests/libqos/ahci.h   |   3 +
 tests/libqos/libqos.c |  84 +
 tests/libqos/libqos.h |   2 +
 tests/libqos/malloc.c |  74 +---
 tests/libqos/malloc.h |   1 +
 9 files changed, 496 insertions(+), 22 deletions(-)

-- 
2.1.0

[Qemu-block] [PATCH v3 2/9] libqos/ahci: Fix sector set method

2015-04-30 Thread John Snow

|| probably does not mean the same thing as |.

Additionally, allow users to submit a prd_size of 0
to indicate that they'd like to continue using the default.

Signed-off-by: John Snow js...@redhat.com
---
 tests/libqos/ahci.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/tests/libqos/ahci.c b/tests/libqos/ahci.c
index 05dd04d..29e12f9 100644
--- a/tests/libqos/ahci.c
+++ b/tests/libqos/ahci.c
@@ -769,7 +769,7 @@ void ahci_command_set_offset(AHCICommand *cmd, uint64_t lba_sect)
     fis->lba_lo[1] = (lba_sect >> 8) & 0xFF;
     fis->lba_lo[2] = (lba_sect >> 16) & 0xFF;
     if (cmd->props->lba28) {
-        fis->device = (fis->device & 0xF0) || (lba_sect >> 24) & 0x0F;
+        fis->device = (fis->device & 0xF0) | ((lba_sect >> 24) & 0x0F);
     }
     fis->lba_hi[0] = (lba_sect >> 24) & 0xFF;
     fis->lba_hi[1] = (lba_sect >> 32) & 0xFF;
@@ -787,7 +787,9 @@ void ahci_command_set_sizes(AHCICommand *cmd, uint64_t xbytes,
     /* Each PRD can describe up to 4MiB, and must not be odd. */
     g_assert_cmphex(prd_size, <=, 4096 * 1024);
     g_assert_cmphex(prd_size & 0x01, ==, 0x00);
-    cmd->prd_size = prd_size;
+    if (prd_size) {
+        cmd->prd_size = prd_size;
+    }
     cmd->xbytes = xbytes;
     cmd->fis.count = (cmd->xbytes / AHCI_SECTOR_SIZE);
     cmd->header.prdtl = size_to_prdtl(cmd->xbytes, cmd->prd_size);
-- 
2.1.0
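
To make the one-line commit message concrete: the logical OR collapses the whole expression to 0 or 1, while the bitwise OR keeps the high nibble of the device register and merges in LBA bits 27:24. The two helpers below are a standalone sketch with hypothetical names (device_logical_or(), device_bitwise_or()); they only mirror the expressions from the hunk and are not part of libqos.

```c
#include <stdint.h>

/* Buggy form: '||' yields 0 or 1, discarding every bit of both operands. */
uint8_t device_logical_or(uint8_t device, uint64_t lba_sect)
{
    return (device & 0xF0) || ((lba_sect >> 24) & 0x0F);
}

/* Fixed form: keep the high nibble of device, merge in LBA bits 27:24. */
uint8_t device_bitwise_or(uint8_t device, uint64_t lba_sect)
{
    return (device & 0xF0) | ((lba_sect >> 24) & 0x0F);
}
```

For a device value of 0xE0 and an LBA of 0x0A000000, the bitwise form gives 0xEA and the logical form gives 1, so with the old code the top four bits of a 28-bit LBA never reached the register.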

[Qemu-block] [PATCH v3 4/9] ich9/ahci: Enable Migration

2015-04-30 Thread John Snow

Lift the flag preventing the migration of the ICH9/AHCI devices.

Signed-off-by: John Snow js...@redhat.com
---
 hw/ide/ahci.c | 1 -
 hw/ide/ich.c  | 1 -
 2 files changed, 2 deletions(-)

diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
index 833fd45..8e36dec 100644
--- a/hw/ide/ahci.c
+++ b/hw/ide/ahci.c
@@ -1461,7 +1461,6 @@ typedef struct SysbusAHCIState {
 
 static const VMStateDescription vmstate_sysbus_ahci = {
     .name = "sysbus-ahci",
-    .unmigratable = 1, /* Still buggy under I/O load */
     .fields = (VMStateField[]) {
         VMSTATE_AHCI(ahci, SysbusAHCIState),
         VMSTATE_END_OF_LIST()
diff --git a/hw/ide/ich.c b/hw/ide/ich.c
index b1d8874..350c7f1 100644
--- a/hw/ide/ich.c
+++ b/hw/ide/ich.c
@@ -82,7 +82,6 @@
 
 static const VMStateDescription vmstate_ich9_ahci = {
     .name = "ich9_ahci",
-    .unmigratable = 1, /* Still buggy under I/O load */
     .version_id = 1,
     .fields = (VMStateField[]) {
         VMSTATE_PCI_DEVICE(parent_obj, AHCIPCIState),
-- 
2.1.0

[Qemu-block] [PATCH v3 9/9] qtest/ahci: add migrate halted dma test

2015-04-30 Thread John Snow

Test migrating a halted DMA transaction.
Resume, then test data integrity.

Signed-off-by: John Snow js...@redhat.com
---
 tests/ahci-test.c | 75 ++-
 1 file changed, 74 insertions(+), 1 deletion(-)

diff --git a/tests/ahci-test.c b/tests/ahci-test.c
index aa0db92..020acdd 100644
--- a/tests/ahci-test.c
+++ b/tests/ahci-test.c
@@ -1212,6 +1212,78 @@ static void test_halted_dma(void)
 }
 
 /**
+ * DMA Error Migration Test
+ *
+ * Simulate an error on first write, Try to write a pattern,
+ * Confirm the VM has stopped, migrate, resume the VM,
+ * verify command has completed, then read back the data and verify.
+ */
+static void test_migrate_halted_dma(void)
+{
+    AHCIQState *src, *dst;
+    uint8_t port;
+    size_t bufsize = 4096;
+    unsigned char *tx = g_malloc(bufsize);
+    unsigned char *rx = g_malloc0(bufsize);
+    unsigned i;
+    uint64_t ptr;
+    AHCICommand *cmd;
+    const char *uri = "tcp:127.0.0.1:1234";
+
+    prepare_blkdebug_script(debug_path, "write_aio");
+
+    src = ahci_boot_and_enable("-drive file=blkdebug:%s:%s,if=none,id=drive0,"
+                               "format=qcow2,cache=writeback,"
+                               "rerror=stop,werror=stop "
+                               "-M q35 "
+                               "-device ide-hd,drive=drive0 ",
+                               debug_path,
+                               tmp_path);
+
+    dst = ahci_boot("-drive file=%s,if=none,id=drive0,"
+                    "format=qcow2,cache=writeback,"
+                    "rerror=stop,werror=stop "
+                    "-M q35 "
+                    "-device ide-hd,drive=drive0 "
+                    "-incoming %s",
+                    tmp_path, uri);
+
+    set_context(src->parent);
+
+    /* Initialize and prepare */
+    port = ahci_port_select(src);
+    ahci_port_clear(src, port);
+
+    for (i = 0; i < bufsize; i++) {
+        tx[i] = (bufsize - i);
+    }
+
+    /* create DMA source buffer and write pattern */
+    ptr = ahci_alloc(src, bufsize);
+    g_assert(ptr);
+    memwrite(ptr, tx, bufsize);
+
+    /* Write, trigger the VM to stop, migrate, then resume. */
+    cmd = ahci_guest_io_halt(src, port, CMD_WRITE_DMA,
+                             ptr, bufsize, 0);
+    ahci_migrate(src, dst, uri);
+    ahci_guest_io_resume(dst, cmd);
+    ahci_free(dst, ptr);
+
+    /* Read back */
+    ahci_io(dst, port, CMD_READ_DMA, rx, bufsize, 0);
+
+    /* Verify TX and RX are identical */
+    g_assert_cmphex(memcmp(tx, rx, bufsize), ==, 0);
+
+    /* Cleanup and go home. */
+    ahci_shutdown(src);
+    ahci_shutdown(dst);
+    g_free(rx);
+    g_free(tx);
+}
+
+/**
  * Migration test: Try to flush, migrate, then resume.
  */
 static void test_flush_migrate(void)
@@ -1511,8 +1583,9 @@ int main(int argc, char **argv)
     qtest_add_func("/ahci/flush/migrate", test_flush_migrate);
 
     qtest_add_func("/ahci/migrate/sanity", test_migrate_sanity);
-    qtest_add_func("/ahci/migrate/dma", test_migrate_dma);
+    qtest_add_func("/ahci/migrate/dma/simple", test_migrate_dma);
     qtest_add_func("/ahci/io/dma/lba28/retry", test_halted_dma);
+    qtest_add_func("/ahci/migrate/dma/halted", test_migrate_halted_dma);
 
 ret = g_test_run();
 
-- 
2.1.0

Re: [Qemu-block] [PATCH v3 0/9] ahci: enable migration

2015-05-04 Thread John Snow




On 05/04/2015 08:29 AM, Kevin Wolf wrote:

Am 30.04.2015 um 20:07 hat John Snow geschrieben:

The day we all feared is here, and I am proposing we allow the migration
of the AHCI device tentatively for the 2.4 development window.

Some more NCQ migration tests are still needed, but I felt that it was
important to get migration enabled as close to the start of the 2.4
development window as possible.

If the NCQ patches don't pan out by the time the 2.4 freeze occurs, we can
revert the migration boolean and add a conditional around the ahci tests
that rely on the migration feature being enabled.

I am justifying this checkin based on a series of ping-pong
migration tests I ran under heavy load (using google's stressapptest)
and saw over 300 successful migrations without a single failure.

This series does a few things:
(1) Add migration facilities to libqos
(2) Enable AHCI and ICH9 migration
(3) Add a series of migration tests to ahci-test


Reviewed-by: Kevin Wolf kw...@redhat.com

I think besides the NCQ tests, we'll definitely also want to test
migration with READ DMA in flight (this series tests only WRITE DMA and
FLUSH CACHE). Probably also discard.



Added to the todo list, thanks.


Nice to have, but less important, would be the other variants that
exist, like the EXT version of each command, and READ/WRITE SECTOR.

But all of that can be added in a follow-up series, it's not a reason to
hold up what's already there.

Kevin



Thanks!

--js

Re: [Qemu-block] [Qemu-devel] [PATCH v3 3/9] libqos: Add migration helpers

2015-05-04 Thread John Snow




On 05/04/2015 08:07 AM, Kevin Wolf wrote:

Am 30.04.2015 um 20:07 hat John Snow geschrieben:

libqos.c:
 -set_context for addressing which commands go where
 -migrate performs the actual migration

malloc.c:
 - Structure of the allocator is adjusted slightly with
   a second-tier malloc to make swapping around the allocators
   easy when we migrate the lists from the source to the destination.

Signed-off-by: John Snow js...@redhat.com
---
  tests/libqos/libqos.c | 84 +++
  tests/libqos/libqos.h |  2 ++
  tests/libqos/malloc.c | 74 ++---
  tests/libqos/malloc.h |  1 +
  4 files changed, 144 insertions(+), 17 deletions(-)

diff --git a/tests/libqos/libqos.c b/tests/libqos/libqos.c
index 7e72078..ac1bae1 100644
--- a/tests/libqos/libqos.c
+++ b/tests/libqos/libqos.c
@@ -1,5 +1,6 @@
  #include <stdio.h>
  #include <stdlib.h>
+#include <string.h>
  #include <glib.h>
  #include <unistd.h>
  #include <fcntl.h>
@@ -62,6 +63,89 @@ void qtest_shutdown(QOSState *qs)
  g_free(qs);
  }

+void set_context(QOSState *s)
+{
+    global_qtest = s->qts;
+}
+
+static QDict *qmp_execute(const char *command)
+{
+    char *fmt;
+    QDict *rsp;
+
+    fmt = g_strdup_printf("{ 'execute': '%s' }", command);
+    rsp = qmp(fmt);
+    g_free(fmt);
+
+    return rsp;
+}
+
+void migrate(QOSState *from, QOSState *to, const char *uri)
+{
+    const char *st;
+    char *s;
+    QDict *rsp, *sub;
+    bool running;
+
+    set_context(from);
+
+    /* Is the machine currently running? */
+    rsp = qmp_execute("query-status");
+    g_assert(qdict_haskey(rsp, "return"));
+    sub = qdict_get_qdict(rsp, "return");
+    g_assert(qdict_haskey(sub, "running"));
+    running = qdict_get_bool(sub, "running");
+    QDECREF(rsp);
+
+    /* Issue the migrate command. */
+    s = g_strdup_printf("{ 'execute': 'migrate',"
+                        "'arguments': { 'uri': '%s' } }",
+                        uri);
+    rsp = qmp(s);
+    g_free(s);
+    g_assert(qdict_haskey(rsp, "return"));
+    QDECREF(rsp);
+
+    /* Wait for STOP event, but only if we were running: */
+    if (running) {
+        qmp_eventwait("STOP");
+    }
+
+    /* If we were running, we can wait for an event. */
+    if (running) {
+        migrate_allocator(from->alloc, to->alloc);
+        set_context(to);
+        qmp_eventwait("RESUME");
+        return;
+    }
+
+    /* Otherwise, we need to wait: poll until migration is completed. */
+    while (1) {
+        rsp = qmp_execute("query-migrate");
+        g_assert(qdict_haskey(rsp, "return"));
+        sub = qdict_get_qdict(rsp, "return");
+        g_assert(qdict_haskey(sub, "status"));
+        st = qdict_get_str(sub, "status");
+
+        /* setup, active, completed, failed, cancelled */
+        if (strcmp(st, "completed") == 0) {
+            QDECREF(rsp);
+            break;
+        }
+
+        if ((strcmp(st, "setup") == 0) || (strcmp(st, "active") == 0)) {
+            QDECREF(rsp);
+            continue;


Wouldn't it be nicer to sleep a bit before retrying?



I actually figured that all the string and stream manipulation for 
sending and receiving QMP queries was enough sleep because of how 
quick a migration without any guest should complete -- in practice this 
loop doesn't ever seem to trigger more than once.


If you still think sleep is necessary, I can add some very small sleep 
in a separate patch, or when I merge the tree. Something like:


g_usleep(5000) /* 5 msec */


+        }
+
+        fprintf(stderr, "Migration did not complete, status: %s\n", st);
+        g_assert_not_reached();
+    }
+
+    migrate_allocator(from->alloc, to->alloc);
+    set_context(to);
+}


Kevin

Re: [Qemu-block] [PATCH COLO v3 10/14] util/hbitmap: Add an API to reset all set bits in hbitmap

2015-05-01 Thread John Snow




On 04/03/2015 07:05 AM, Paolo Bonzini wrote:



On 03/04/2015 12:01, Wen Congyang wrote:

Signed-off-by: Wen Congyang we...@cn.fujitsu.com
Signed-off-by: zhanghailiang zhang.zhanghaili...@huawei.com
Signed-off-by: Gonglei arei.gong...@huawei.com
---
  include/qemu/hbitmap.h |  8 
  tests/test-hbitmap.c   | 39 +++
  util/hbitmap.c | 16 
  3 files changed, 63 insertions(+)

diff --git a/include/qemu/hbitmap.h b/include/qemu/hbitmap.h
index 550d7ce..95a55e4 100644
--- a/include/qemu/hbitmap.h
+++ b/include/qemu/hbitmap.h
@@ -109,6 +109,14 @@ void hbitmap_set(HBitmap *hb, uint64_t start, uint64_t 
count);
  void hbitmap_reset(HBitmap *hb, uint64_t start, uint64_t count);

  /**
+ * hbitmap_reset_all:
+ * @hb: HBitmap to operate on.
+ *
+ * Reset all bits in an HBitmap.
+ */
+void hbitmap_reset_all(HBitmap *hb);
+
+/**
   * hbitmap_get:
   * @hb: HBitmap to operate on.
   * @item: Bit to query (0-based).
diff --git a/tests/test-hbitmap.c b/tests/test-hbitmap.c
index 8c902f2..1f0078a 100644
--- a/tests/test-hbitmap.c
+++ b/tests/test-hbitmap.c
@@ -11,6 +11,7 @@

  #include <glib.h>
  #include <stdarg.h>
+#include <string.h>
  #include "qemu/hbitmap.h"

  #define LOG_BITS_PER_LONG  (BITS_PER_LONG == 32 ? 5 : 6)
@@ -143,6 +144,23 @@ static void hbitmap_test_reset(TestHBitmapData *data,
  }
  }

+static void hbitmap_test_reset_all(TestHBitmapData *data)
+{
+    size_t n;
+
+    hbitmap_reset_all(data->hb);
+
+    n = (data->size + BITS_PER_LONG - 1) / BITS_PER_LONG;
+    if (n == 0) {
+        n = 1;
+    }
+    memset(data->bits, 0, n * sizeof(unsigned long));
+
+    if (data->granularity == 0) {
+        hbitmap_test_check(data, 0);
+    }
+}
+
  static void hbitmap_test_check_get(TestHBitmapData *data)
  {
  uint64_t count = 0;
@@ -323,6 +341,26 @@ static void test_hbitmap_reset(TestHBitmapData *data,
  hbitmap_test_set(data, L3 / 2, L3);
  }

+static void test_hbitmap_reset_all(TestHBitmapData *data,
+   const void *unused)
+{
+hbitmap_test_init(data, L3 * 2, 0);
+hbitmap_test_set(data, L1 - 1, L1 + 2);
+hbitmap_test_reset_all(data);
+hbitmap_test_set(data, 0, L1 * 3);
+hbitmap_test_reset_all(data);
+hbitmap_test_set(data, L2, L1);
+hbitmap_test_reset_all(data);
+hbitmap_test_set(data, L2, L3 - L2 + 1);
+hbitmap_test_reset_all(data);
+hbitmap_test_set(data, L3 - 1, 3);
+hbitmap_test_reset_all(data);
+hbitmap_test_set(data, 0, L3 * 2);
+hbitmap_test_reset_all(data);
+hbitmap_test_set(data, L3 / 2, L3);
+hbitmap_test_reset_all(data);
+}
+
  static void test_hbitmap_granularity(TestHBitmapData *data,
   const void *unused)
  {
@@ -394,6 +432,7 @@ int main(int argc, char **argv)
      hbitmap_test_add("/hbitmap/set/overlap", test_hbitmap_set_overlap);
      hbitmap_test_add("/hbitmap/reset/empty", test_hbitmap_reset_empty);
      hbitmap_test_add("/hbitmap/reset/general", test_hbitmap_reset);
+    hbitmap_test_add("/hbitmap/reset/all", test_hbitmap_reset_all);
      hbitmap_test_add("/hbitmap/granularity", test_hbitmap_granularity);
      g_test_run();

diff --git a/util/hbitmap.c b/util/hbitmap.c
index ab13971..acce93c 100644
--- a/util/hbitmap.c
+++ b/util/hbitmap.c
@@ -353,6 +353,22 @@ void hbitmap_reset(HBitmap *hb, uint64_t start, uint64_t 
count)
  hb_reset_between(hb, HBITMAP_LEVELS - 1, start, last);
  }

+void hbitmap_reset_all(HBitmap *hb)
+{
+    uint64_t size = hb->size;
+    unsigned int i;
+
+    /* Same as hbitmap_alloc() except memset() */
+    for (i = HBITMAP_LEVELS; --i >= 1; ) {
+        size = MAX((size + BITS_PER_LONG - 1) >> BITS_PER_LEVEL, 1);
+        memset(hb->levels[i], 0, size * sizeof(unsigned long));
+    }
+


For what it's worth, I recently added a hb->sizes[i] cache to store 
the size of each array so you don't have to recompute this all the time.



+    assert(size == 1);
+    hb->levels[0][0] = 1UL << (BITS_PER_LONG - 1);
+    hb->count = 0;
+}
+
  bool hbitmap_get(const HBitmap *hb, uint64_t item)
  {
  /* Compute position and bit in the last layer.  */



Acked-by: Paolo Bonzini pbonz...@redhat.com

[Qemu-block] [RFC] Differential Backups

2015-04-29 Thread John Snow

This is a feature that should be very easy to add on top of the existing 
incremental feature, since it's just a difference in how the bitmap is 
treated:


Incremental
- Links to the last incremental (managed by libvirt)
- Clears the bitmap after creation

Differential:
- Links to the last full backup always (managed by libvirt)
- Does not clear the bitmap after creation

No biggie.
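
To make the distinction concrete, here is a toy Python model of the two modes. It is not QEMU code: `DirtyBitmap`, `backup`, `run` and the mode strings are names made up for this sketch. The only behaviour taken from the description above is whether the bitmap is cleared once the backup has been created.

```python
class DirtyBitmap:
    """Toy stand-in for a dirty bitmap: a set of dirty cluster indices."""

    def __init__(self):
        self.dirty = set()

    def mark(self, cluster):
        self.dirty.add(cluster)

    def clear(self):
        self.dirty.clear()


def backup(bitmap, mode):
    """Return the clusters a backup in the given mode would copy.

    mode is 'incremental' or 'differential' (hypothetical names).
    """
    copied = sorted(bitmap.dirty)
    if mode == "incremental":
        # Incremental: the next backup only needs changes made after this one.
        bitmap.clear()
    elif mode != "differential":
        raise ValueError("unknown mode: %s" % mode)
    # Differential: the bits are kept, so every backup covers all changes
    # since the last full backup.
    return copied


def run(mode):
    """Write cluster 1, back up, write cluster 5, back up again."""
    bitmap = DirtyBitmap()
    bitmap.mark(1)
    first = backup(bitmap, mode)
    bitmap.mark(5)
    second = backup(bitmap, mode)
    return first, second
```

In this model the second incremental backup contains only cluster 5, while the second differential backup contains clusters 1 and 5.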

How it works currently: Incremental backups are created via the 
MIRROR_SYNC_MODE_DIRTY_BITMAP backup mode. An early version of the 
patchset actually had an additional parameter called the BitmapUseMode 
that controlled how the bitmap was cleared, a concern that was later 
made obsolete for other reasons.


I can add Differential backups in two ways:

(1) rename MIRROR_SYNC_MODE_DIRTY_BITMAP to 
MIRROR_SYNC_MODE_INCREMENTAL, and then add 
MIRROR_SYNC_MODE_DIFFERENTIAL. It's not too late to do this, since 2.4 
has just started.


(2) Re-add the BitmapUseMode parameter and add some enums:
BITMAP_USE_MODE_INCREMENTAL
BITMAP_USE_MODE_DIFFERENTIAL

I think I am partial to #1, if only to cut down on additional 
parameters, especially ones that are only useful to a small subset of 
backup types.




I am also considering adding a QMP primitive to allow people to /copy/ 
bitmaps. This would allow users to differentiate backup chains after 
they've already been started -- e.g.:


- User creates drive0, a full backup, and bitmap monthly0.
- Six days in, user decides it would be very nice to add a weekly 
incremental backup series to the same drive, but doesn't want to lose 
out on the monthly chain that was already started.

- User copies the monthly0 bitmap to a new weekly0 target.
- User can continue two independent incremental series intended for 
different periodicity.


Actually, the copy primitive would allow people to do both differentials 
and incrementals with the existing backup mode by copying the bitmap 
before each usage, but that's slightly yucky, so I'd rather just add 
both features to increase the flexibility of the delta backup system in 
general.
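
A rough sketch of the copy idea, again as a toy Python model rather than an actual QMP command. `Bitmap`, `copy_bitmap`, `incremental` and `guest_write` are hypothetical helpers; the assumption is simply that every bitmap attached to a drive records every guest write.

```python
class Bitmap:
    """Toy dirty bitmap: a name plus a set of dirty cluster indices."""

    def __init__(self, name):
        self.name = name
        self.dirty = set()


def copy_bitmap(src, new_name):
    """Hypothetical copy primitive: clone the current dirty state."""
    dst = Bitmap(new_name)
    dst.dirty = set(src.dirty)
    return dst


def incremental(bitmap):
    """Copy out the dirty clusters, then clear the bitmap."""
    copied = sorted(bitmap.dirty)
    bitmap.dirty.clear()
    return copied


def guest_write(bitmaps, cluster):
    """Every bitmap attached to the drive records the write."""
    for b in bitmaps:
        b.dirty.add(cluster)


monthly = Bitmap("monthly0")
guest_write([monthly], 10)              # only the monthly chain exists so far
weekly = copy_bitmap(monthly, "weekly0")
guest_write([monthly, weekly], 20)      # later writes land in both bitmaps

weekly_first = incremental(weekly)      # clears only the weekly bitmap
guest_write([monthly, weekly], 30)
monthly_first = incremental(monthly)    # still sees everything since the full backup
weekly_second = incremental(weekly)     # sees only what changed since weekly_first
```

The two series stay independent because clearing one bitmap never touches the other.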

Re: [Qemu-block] [Qemu-devel] [PATCH COLO v3 10/14] util/hbitmap: Add an API to reset all set bits in hbitmap

2015-05-07 Thread John Snow



On 05/06/2015 10:20 PM, Wen Congyang wrote:
 On 05/02/2015 12:47 AM, John Snow wrote:


 On 04/03/2015 07:05 AM, Paolo Bonzini wrote:


 On 03/04/2015 12:01, Wen Congyang wrote:
 Signed-off-by: Wen Congyang we...@cn.fujitsu.com
 Signed-off-by: zhanghailiang zhang.zhanghaili...@huawei.com
 Signed-off-by: Gonglei arei.gong...@huawei.com
 ---
   include/qemu/hbitmap.h |  8 
   tests/test-hbitmap.c   | 39 +++
   util/hbitmap.c | 16 
   3 files changed, 63 insertions(+)

 diff --git a/include/qemu/hbitmap.h b/include/qemu/hbitmap.h
 index 550d7ce..95a55e4 100644
 --- a/include/qemu/hbitmap.h
 +++ b/include/qemu/hbitmap.h
 @@ -109,6 +109,14 @@ void hbitmap_set(HBitmap *hb, uint64_t start, 
 uint64_t count);
   void hbitmap_reset(HBitmap *hb, uint64_t start, uint64_t count);

   /**
 + * hbitmap_reset_all:
 + * @hb: HBitmap to operate on.
 + *
 + * Reset all bits in an HBitmap.
 + */
 +void hbitmap_reset_all(HBitmap *hb);
 +
 +/**
* hbitmap_get:
* @hb: HBitmap to operate on.
* @item: Bit to query (0-based).
 diff --git a/tests/test-hbitmap.c b/tests/test-hbitmap.c
 index 8c902f2..1f0078a 100644
 --- a/tests/test-hbitmap.c
 +++ b/tests/test-hbitmap.c
 @@ -11,6 +11,7 @@

   #include <glib.h>
   #include <stdarg.h>
 +#include <string.h>
   #include "qemu/hbitmap.h"

   #define LOG_BITS_PER_LONG  (BITS_PER_LONG == 32 ? 5 : 6)
 @@ -143,6 +144,23 @@ static void hbitmap_test_reset(TestHBitmapData *data,
   }
   }

 +static void hbitmap_test_reset_all(TestHBitmapData *data)
 +{
 +    size_t n;
 +
 +    hbitmap_reset_all(data->hb);
 +
 +    n = (data->size + BITS_PER_LONG - 1) / BITS_PER_LONG;
 +    if (n == 0) {
 +        n = 1;
 +    }
 +    memset(data->bits, 0, n * sizeof(unsigned long));
 +
 +    if (data->granularity == 0) {
 +        hbitmap_test_check(data, 0);
 +    }
 +}
 +
   static void hbitmap_test_check_get(TestHBitmapData *data)
   {
   uint64_t count = 0;
 @@ -323,6 +341,26 @@ static void test_hbitmap_reset(TestHBitmapData *data,
   hbitmap_test_set(data, L3 / 2, L3);
   }

 +static void test_hbitmap_reset_all(TestHBitmapData *data,
 +   const void *unused)
 +{
 +hbitmap_test_init(data, L3 * 2, 0);
 +hbitmap_test_set(data, L1 - 1, L1 + 2);
 +hbitmap_test_reset_all(data);
 +hbitmap_test_set(data, 0, L1 * 3);
 +hbitmap_test_reset_all(data);
 +hbitmap_test_set(data, L2, L1);
 +hbitmap_test_reset_all(data);
 +hbitmap_test_set(data, L2, L3 - L2 + 1);
 +hbitmap_test_reset_all(data);
 +hbitmap_test_set(data, L3 - 1, 3);
 +hbitmap_test_reset_all(data);
 +hbitmap_test_set(data, 0, L3 * 2);
 +hbitmap_test_reset_all(data);
 +hbitmap_test_set(data, L3 / 2, L3);
 +hbitmap_test_reset_all(data);
 +}
 +
   static void test_hbitmap_granularity(TestHBitmapData *data,
const void *unused)
   {
 @@ -394,6 +432,7 @@ int main(int argc, char **argv)
       hbitmap_test_add("/hbitmap/set/overlap", test_hbitmap_set_overlap);
       hbitmap_test_add("/hbitmap/reset/empty", test_hbitmap_reset_empty);
       hbitmap_test_add("/hbitmap/reset/general", test_hbitmap_reset);
 +    hbitmap_test_add("/hbitmap/reset/all", test_hbitmap_reset_all);
       hbitmap_test_add("/hbitmap/granularity", test_hbitmap_granularity);
       g_test_run();

 diff --git a/util/hbitmap.c b/util/hbitmap.c
 index ab13971..acce93c 100644
 --- a/util/hbitmap.c
 +++ b/util/hbitmap.c
 @@ -353,6 +353,22 @@ void hbitmap_reset(HBitmap *hb, uint64_t start, 
 uint64_t count)
   hb_reset_between(hb, HBITMAP_LEVELS - 1, start, last);
   }

 +void hbitmap_reset_all(HBitmap *hb)
 +{
 +    uint64_t size = hb->size;
 +    unsigned int i;
 +
 +    /* Same as hbitmap_alloc() except memset() */
 +    for (i = HBITMAP_LEVELS; --i >= 1; ) {
 +        size = MAX((size + BITS_PER_LONG - 1) >> BITS_PER_LEVEL, 1);
 +        memset(hb->levels[i], 0, size * sizeof(unsigned long));
 +    }
 +

 For what it's worth, I recently added a hb->sizes[i] cache to store the 
 size of each array so you don't have to recompute this all the time.
 
 Yes, will fix it in the next version.
 
 Thanks
 Wen Congyang
 

Since the reset stuff is useful all by itself, you can send that patch
by itself, CC me, and I'll review it.

You can update the existing call in block.c:

bdrv_clear_dirty_bitmap() {
    hbitmap_reset(bitmap->bitmap, 0, bitmap->size);
}

to using your faster hbitmap_reset_all call.
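
For anyone following along, the cached-sizes idea can be sketched with a toy Python model. This is not the util/hbitmap.c implementation: `ToyHBitmap` is a made-up class, and the constants (64-bit words, 7 levels) are assumptions chosen for illustration. The per-level size formula and the sentinel bit in the top level are taken from the quoted patch.

```python
BITS_PER_LONG = 64   # assumption: 64-bit host
BITS_PER_LEVEL = 6   # log2(BITS_PER_LONG)
LEVELS = 7           # assumption: number of levels in this toy model


class ToyHBitmap:
    def __init__(self, nbits):
        self.sizes = [0] * LEVELS
        self.levels = [None] * LEVELS
        size = nbits
        # Walk from the bottom level up; each level holds one bit per
        # word of the level below it.
        for i in range(LEVELS - 1, -1, -1):
            size = max((size + BITS_PER_LONG - 1) >> BITS_PER_LEVEL, 1)
            self.sizes[i] = size          # cache the size once, at allocation
            self.levels[i] = [0] * size
        self.count = 0

    def reset_all(self):
        # Zero every level below the top using the cached sizes, so
        # nothing has to be recomputed here.
        for i in range(LEVELS - 1, 0, -1):
            self.levels[i] = [0] * self.sizes[i]
        # The top level keeps only the sentinel bit, as in the quoted patch.
        self.levels[0][0] = 1 << (BITS_PER_LONG - 1)
        self.count = 0


hb = ToyHBitmap(1 << 20)
hb.levels[LEVELS - 1][3] = 0xFF   # pretend eight bits were set
hb.count = 8
hb.reset_all()
```

The point of the cache is that reset_all() becomes a plain loop of memset-like operations with no size arithmetic.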

Thanks,

--js


 +    assert(size == 1);
 +    hb->levels[0][0] = 1UL << (BITS_PER_LONG - 1);
 +    hb->count = 0;
 +}
 +
   bool hbitmap_get(const HBitmap *hb, uint64_t item)
   {
   /* Compute position and bit in the last layer.  */


 Acked-by: Paolo Bonzini pbonz...@redhat.com


Re: [Qemu-block] [Qemu-devel] [PATCH v3 0/9] ahci: enable migration

2015-05-05 Thread John Snow



On 04/30/2015 02:07 PM, John Snow wrote:
 The day we all feared is here, and I am proposing we allow the migration
 of the AHCI device tentatively for the 2.4 development window.
 
 Some more NCQ migration tests are still needed, but I felt that it was
 important to get migration enabled as close to the start of the 2.4
 development window as possible.
 
 If the NCQ patches don't pan out by the time the 2.4 freeze occurs, we can
 revert the migration boolean and add a conditional around the ahci tests
 that rely on the migration feature being enabled.
 
 I am justifying this checkin based on a series of ping-pong
 migration tests I ran under heavy load (using google's stressapptest)
 and saw over 300 successful migrations without a single failure.
 
 This series does a few things:
 (1) Add migration facilities to libqos
 (2) Enable AHCI and ICH9 migration
 (3) Add a series of migration tests to ahci-test
 
 v3:
  - Rebase and resend for 2.4.
  - Minor style guide fix.
 
 v2:
  - Added a URI parameter to the migrate() helper
  - Adjust ahci_shutdown to set qtest context for itself
  - Make sure verify() is part of ahci_migrate() and redundant
calls are eliminated
  - Add new helpers to make tests with blkdebug injections more
 succinct
  - Change the flush migrate test to not load the blkdebug rule
on the destination host
  - Modify the migrate() function so that it does not poll the
VM for migration status if it can rely on RESUME events.
  - New patch: Repair the ahci_command_set_offset helper.
  - New test: Test DMA halt and resume.
  - New test: Test DMA halt, migrate, and resume.
 
 ==
 For convenience, this branch is available at:
 https://github.com/jnsnow/qemu.git branch ahci-migration-test
 https://github.com/jnsnow/qemu/tree/ahci-migration-test
 
 This version is tagged ahci-migration-test-v3:
 https://github.com/jnsnow/qemu/releases/tag/ahci-migration-test-v3
 ==
 
 John Snow (9):
   libqos/ahci: Add halted command helpers
   libqos/ahci: Fix sector set method
   libqos: Add migration helpers
   ich9/ahci: Enable Migration
   qtest/ahci: Add migration test
   qtest/ahci: add migrate dma test
   qtest/ahci: add flush migrate test
   qtest/ahci: add halted dma test
   qtest/ahci: add migrate halted dma test
 
  hw/ide/ahci.c |   1 -
  hw/ide/ich.c  |   1 -
  tests/ahci-test.c | 318 
 +-
  tests/libqos/ahci.c   |  34 +-
  tests/libqos/ahci.h   |   3 +
  tests/libqos/libqos.c |  84 +
  tests/libqos/libqos.h |   2 +
  tests/libqos/malloc.c |  74 +---
  tests/libqos/malloc.h |   1 +
  9 files changed, 496 insertions(+), 22 deletions(-)
 

Staged: https://github.com/jnsnow/qemu/commits/ide
(with one edit to patch 3 as suggested by Kevin.)

--js

Re: [Qemu-block] [Qemu-devel] [PATCH v3 01/10] qapi: Add transaction support to block-dirty-bitmap operations

2015-05-08 Thread John Snow



On 05/08/2015 09:17 AM, Max Reitz wrote:
 On 08.05.2015 15:14, Stefan Hajnoczi wrote:
 On Thu, May 07, 2015 at 01:22:26PM -0400, John Snow wrote:

 On 05/07/2015 10:54 AM, Stefan Hajnoczi wrote:
 On Wed, Apr 22, 2015 at 08:04:44PM -0400, John Snow wrote:
 +static void block_dirty_bitmap_clear_prepare(BlkTransactionState *common,
 +                                             Error **errp)
 +{
 +    BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
 +                                             common, common);
 +    BlockDirtyBitmap *action;
 +
 +    action = common->action->block_dirty_bitmap_clear;
 +    state->bitmap = block_dirty_bitmap_lookup(action->node,
 +                                              action->name,
 +                                              &state->bs,
 +                                              &state->aio_context,
 +                                              errp);
 +    if (!state->bitmap) {
 +        return;
 +    }
 +
 +    if (bdrv_dirty_bitmap_frozen(state->bitmap)) {
 +        error_setg(errp, "Cannot modify a frozen bitmap");
 +        return;
 +    } else if (!bdrv_dirty_bitmap_enabled(state->bitmap)) {
 +        error_setg(errp, "Cannot clear a disabled bitmap");
 +        return;
 +    }
 +
 +    /* AioContext is released in .clean() */
 +}
 +
 +static void block_dirty_bitmap_clear_commit(BlkTransactionState *common)
 +{
 +    BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
 +                                             common, common);
 +    bdrv_clear_dirty_bitmap(state->bitmap);
 +}
 These semantics don't work in this example:

 [block-dirty-bitmap-clear, drive-backup]

 Since drive-backup starts the blockjob in .prepare() but
 block-dirty-bitmap-clear only clears the bitmap in .commit() the
 order is wrong.
 
 Well, "starts the block job" is technically correct, but the block job
 doesn't run until later. If it were to really start in prepare, that
 would be wrong. Actually, the block job is initialized and yields,
 allowing the code handling the QMP transaction command to continue. I
 think in your example that means that the block job won't actually run
 until after block-dirty-bitmap-clear has been committed.
 
 Max
 

The important thing is that we get the contents of the drive as it was,
according to the bitmap as it was, when we start the job.

So if clear doesn't actually modify the bitmap until the commit() phase,
but drive-backup actually goes until the first yield() in prepare...

We're actually going to see an assertion() failure, likely. drive-backup
will freeze the bitmap with a successor, but then the clear transaction
will try to commit the changes and try to modify a frozen bitmap.

Oops.

What we (The royal we...) need to figure out is how we want to solve
this problem.


 .prepare() has to do something non-destructive, like stashing away
 the HBitmap and replacing it with an empty one.  Then .commit() can
 discard the old bitmap while .abort() can move the old bitmap back
 to undo the operation.

 Stefan

 Hmm, that's sort of gross. That means that any transactional command
 *ever* destined to be used with drive-backup in any conceivable way
 needs to move a lot more of its action forward to .prepare().

 That sort of defeats the premise of .prepare() and .commit(), no? And
 all because drive-backup jumped the gun.
 No it doesn't.  Actions have to appear atomic to the qmp_transaction
 caller.  Both approaches achieve that so they are both correct in
 isolation.

 The ambiguity is whether "commit the changes" for .commit() means
 "changes take effect" or "discard stashed state, making undo
 impossible".

 I think the "discard stashed state, making undo impossible"
 interpretation is good because .commit() is not allowed to fail.  That
 function should only do things that never fail.


To be clear, you are favoring drive-backup's interpretation of the
prepare and commit phases. I had been operating under the other
interpretation.

I think I like the semantics of my interpretation better, but have to
admit it's a lot harder programmatically to enforce "commit cannot fail"
for all of the transactions we support under mine, so your
interpretation is probably the right way to go for sanity's sake -- we
just need to add a bit of documentation to make it clear.

I suppose in this case clear() isn't too hard to modify -- just rip the
HBitmap out of it and replace it with a new empty one. Don't free() the
old one until commit(). Should be a fairly inexpensive operation.
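
Roughly what I have in mind, as a toy Python sketch rather than the eventual blockdev.c code. `Bitmap` and `ClearAction` are invented for illustration; the three hooks mirror the prepare/commit/abort phases discussed above.

```python
class Bitmap:
    """Toy bitmap whose storage can be swapped out wholesale."""

    def __init__(self, dirty=()):
        self.bits = set(dirty)


class ClearAction:
    """Transactional clear with hypothetical prepare/commit/abort hooks."""

    def __init__(self, bitmap):
        self.bitmap = bitmap
        self.stashed = None

    def prepare(self):
        # The clear takes effect now, but the old storage is kept so the
        # operation can still be undone.
        self.stashed = self.bitmap.bits
        self.bitmap.bits = set()

    def commit(self):
        # Cannot fail: it only drops the stashed state.
        self.stashed = None

    def abort(self):
        # Undo: put the old storage back.
        if self.stashed is not None:
            self.bitmap.bits = self.stashed
            self.stashed = None


bitmap = Bitmap({1, 2})
action = ClearAction(bitmap)
action.prepare()
cleared_in_prepare = (bitmap.bits == set())
action.abort()
restored = set(bitmap.bits)
```

Swapping the storage pointer is cheap, and commit() has nothing left to do that could fail.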

I will have to re-audit all the existing transactions to make sure these
semantics are consistent, though.

 That's going to get hard to maintain as we add more transactions.
 Yes, we need to be consistent and stick to one of the interpretations in
 order to guarantee ordering.

 Unfortunately, there is already an inconsistency:

 1. internal_snapshot - snapshot taken in .prepare()
 2. external_snapshot - BDS node appended in .commit()
 3. drive_backup - block job started in .prepare()
 4. blockdev_backup - block job started in .prepare()

 external_snapshot followed by internal_snapshot acts like the reverse
 ordering!

 Stefan
 

What a mess!
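
To spell out the reordering Stefan describes, here is a toy Python model. It assumes the transaction code prepares every action in order and then commits every action in order; `Action` and `run_transaction` are hypothetical names, and `phase` records where each action's visible effect happens according to Stefan's list.

```python
class Action:
    """Toy transaction action; 'phase' says where its visible effect happens."""

    def __init__(self, name, phase, log):
        self.name = name
        self.phase = phase      # 'prepare' or 'commit'
        self.log = log

    def prepare(self):
        if self.phase == "prepare":
            self.log.append(self.name)

    def commit(self):
        if self.phase == "commit":
            self.log.append(self.name)


def run_transaction(actions):
    """Prepare every action in order, then commit every action in order."""
    for a in actions:
        a.prepare()
    for a in actions:
        a.commit()


log = []
run_transaction([
    Action("external_snapshot", "commit", log),
    Action("internal_snapshot", "prepare", log),
])
```

Although external_snapshot is listed first, the log ends up with internal_snapshot before it, which is the inconsistency in question.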

--js

Re: [Qemu-block] [Qemu-devel] [PATCH v3 3/9] libqos: Add migration helpers

2015-05-05 Thread John Snow




On 05/05/2015 07:35 AM, Kevin Wolf wrote:

On 04.05.2015 at 19:52, John Snow wrote:



On 05/04/2015 08:07 AM, Kevin Wolf wrote:

On 30.04.2015 at 20:07, John Snow wrote:

+    /* Otherwise, we need to wait: poll until migration is completed. */
+    while (1) {
+        rsp = qmp_execute("query-migrate");
+        g_assert(qdict_haskey(rsp, "return"));
+        sub = qdict_get_qdict(rsp, "return");
+        g_assert(qdict_haskey(sub, "status"));
+        st = qdict_get_str(sub, "status");
+
+        /* "setup", "active", "completed", "failed", "cancelled" */
+        if (strcmp(st, "completed") == 0) {
+            QDECREF(rsp);
+            break;
+        }
+
+        if ((strcmp(st, "setup") == 0) || (strcmp(st, "active") == 0)) {
+            QDECREF(rsp);
+            continue;


Wouldn't it be nicer to sleep a bit before retrying?



I actually figured that all the string and stream manipulation for
sending and receiving QMP queries was enough sleep because of how
quick a migration without any guest should complete -- in practice
this loop doesn't ever seem to trigger more than once.


This surprised me a bit at first because there's no way that string
operations are _that_ slow. You would definitely spin a while in this
loop (and potentially slow down the migration by that).

I think what saves you is that you wait for the STOP event first, and
when qemu's migration thread sends that event, it happens to have
already taken the global mutex. This means that you get your "enough
sleep" from the qemu monitor, which won't respond before migration has
completed.


If you still think sleep is necessary, I can add some very small
sleep in a separate patch, or when I merge the tree. Something like:

g_usleep(5000) /* 5 msec */


If I were you, I'd add it just to be nice (just applying it to your tree
instead of sending out a new version would be okay). If you don't want
to, I won't insist, though. I mean, I already gave my R-b...

Kevin



It's worth finding out if my reasoning is sane, and you cared enough to 
comment.


I'll add the sleep when I merge, no problem :)
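
For reference, the shape of the loop with the sleep added, sketched in Python rather than the libqos C. `poll_until_completed` and its `query_status` callable are hypothetical, and the `max_polls` cap is an extra safety net that is not part of the patch. The 5 ms default matches the g_usleep(5000) proposed above.

```python
import time


def poll_until_completed(query_status, interval=0.005, max_polls=1000):
    """Poll query_status() until it returns 'completed'.

    query_status is a caller-supplied callable (hypothetical here) that
    returns a migration status string.
    """
    for _ in range(max_polls):
        status = query_status()
        if status == "completed":
            return
        if status in ("setup", "active"):
            time.sleep(interval)   # back off instead of spinning
            continue
        raise RuntimeError("Migration did not complete, status: %s" % status)
    raise TimeoutError("gave up after %d polls" % max_polls)
```
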

Thanks!
--js

Re: [Qemu-block] [Qemu-devel] [RFC] Differential Backups

2015-05-05 Thread John Snow




On 05/05/2015 06:25 AM, Stefan Hajnoczi wrote:

On Wed, Apr 29, 2015 at 06:51:08PM -0400, John Snow wrote:

This is a feature that should be very easy to add on top of the existing
incremental feature, since it's just a difference in how the bitmap is
treated:

Incremental
- Links to the last incremental (managed by libvirt)
- Clears the bitmap after creation

Differential:
- Links to the last full backup always (managed by libvirt)
- Does not clear the bitmap after creation

No biggie.


Differential backups can be done using incremental backup functionality
in QEMU:

The client application points QEMU to the same target repeatedly instead
of keeping separate incremental backups.

Stefan



Oh, so you're saying:

[anchor]--[diff1]

And then when making a new incremental, we re-use diff1 as a target and 
overwrite it so that it becomes:


[anchor]--[diff2]

In effect giving us a differential.

OK, so it's possible, but we still lose out on some flexibility that a 
slightly different mode would provide us, like the ability to keep 
multiple differentials if desired. (Well, I suppose we *can* create 
those by manually copying differentials after we create them, but that 
seems hackier than necessary.)


Still, it would be such a paltry few lines of code and introduce no real 
complexity to the subsystem, and it might make libvirt's time a little 
easier for managing such things.


--js

Re: [Qemu-block] [Qemu-devel] [PATCH v2 2/6] block: Fix dirty bitmap in bdrv_co_discard

2015-05-11 Thread John Snow



On 05/06/2015 12:52 AM, Fam Zheng wrote:
 Unsetting dirty globally with discard is not very correct. The discard may
 zero out sectors (depending on can_write_zeroes_with_unmap), we should
 replicate this change to destination side to make sure that the guest sees
 the same data.
 
 Calling bdrv_reset_dirty also troubles mirror job because the hbitmap iterator
 doesn't expect unsetting of bits after current position.
 
 So let's do it the opposite way which fixes both problems: set the dirty bits
 if we are to discard it.
 
 Reported-by: wangxiaol...@ucloud.cn
 Signed-off-by: Fam Zheng f...@redhat.com
 ---
  block/io.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)
 
 diff --git a/block/io.c b/block/io.c
 index 1ce62c4..809688b 100644
 --- a/block/io.c
 +++ b/block/io.c
 @@ -2343,8 +2343,6 @@ int coroutine_fn bdrv_co_discard(BlockDriverState *bs, 
 int64_t sector_num,
  return -EROFS;
  }
  
 -bdrv_reset_dirty(bs, sector_num, nb_sectors);
 -
  /* Do nothing if disabled.  */
      if (!(bs->open_flags & BDRV_O_UNMAP)) {
  return 0;
 @@ -2354,6 +2352,8 @@ int coroutine_fn bdrv_co_discard(BlockDriverState *bs, 
 int64_t sector_num,
  return 0;
  }
  
 +bdrv_set_dirty(bs, sector_num, nb_sectors);
 +
      max_discard = MIN_NON_ZERO(bs->bl.max_discard, BDRV_REQUEST_MAX_SECTORS);
      while (nb_sectors > 0) {
  int ret;
 

For the clueless: will discard *always* change the data, or is it
possible that some implementations might do nothing?

Is it possible to just omit a set/reset from this function altogether
and let whatever function that is called later (e.g. a write_zeroes
call) worry about setting the dirty bits?

What I wonder about: Is it possible that we are needlessly marking data
as dirty when it has not changed?
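
To frame the question, here is a toy Python model of a mirror job that has already copied every sector when the guest issues a discard. It assumes the backend zeroes the discarded sector; `mirror_after_discard` and its flag are invented for this sketch and say nothing about what any particular driver really does.

```python
def mirror_after_discard(mark_dirty_on_discard):
    """Toy mirror: return (source, target) after a discard that zeroes data."""
    source = [0x5D] * 4
    target = list(source)        # sectors already copied by the mirror job
    dirty = set()

    # Guest discards sector 2, and the backend happens to zero it.
    source[2] = 0
    if mark_dirty_on_discard:
        dirty.add(2)             # patched behaviour: schedule a re-copy
    else:
        dirty.discard(2)         # old behaviour: reset the dirty bit

    # The mirror job copies whatever is still marked dirty.
    for sector in sorted(dirty):
        target[sector] = source[sector]
    return source, target
```

If the backend had left the data alone instead, setting the bit would only cost one redundant copy, which is the "needlessly dirty" case I am asking about.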

Re: [Qemu-block] [Qemu-devel] [PATCH v2 5/6] qemu-iotests: Add test case for mirror with unmap

2015-05-11 Thread John Snow



On 05/06/2015 12:52 AM, Fam Zheng wrote:
 This checks that the discard on mirror source that effectively zeroes
 data is also reflected by the data of target.
 
 Signed-off-by: Fam Zheng f...@redhat.com
 ---
  tests/qemu-iotests/131 | 59 
 ++
  tests/qemu-iotests/131.out |  5 
  tests/qemu-iotests/group   |  1 +
  3 files changed, 65 insertions(+)
  create mode 100644 tests/qemu-iotests/131
  create mode 100644 tests/qemu-iotests/131.out
 
 diff --git a/tests/qemu-iotests/131 b/tests/qemu-iotests/131
 new file mode 100644
 index 000..f53ef6e
 --- /dev/null
 +++ b/tests/qemu-iotests/131
 @@ -0,0 +1,59 @@
 +#!/usr/bin/env python
 +#
 +# Test mirror with unmap
 +#
 +# Copyright (C) 2015 Red Hat, Inc.
 +#
 +# This program is free software; you can redistribute it and/or modify
 +# it under the terms of the GNU General Public License as published by
 +# the Free Software Foundation; either version 2 of the License, or
 +# (at your option) any later version.
 +#
 +# This program is distributed in the hope that it will be useful,
 +# but WITHOUT ANY WARRANTY; without even the implied warranty of
 +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 +# GNU General Public License for more details.
 +#
 +# You should have received a copy of the GNU General Public License
 +# along with this program.  If not, see http://www.gnu.org/licenses/.
 +#
 +
 +import time
 +import os
 +import iotests
 +from iotests import qemu_img, qemu_io
 +
 +test_img = os.path.join(iotests.test_dir, 'test.img')
 +target_img = os.path.join(iotests.test_dir, 'target.img')
 +
 +class TestSingleDrive(iotests.QMPTestCase):
 +    image_len = 2 * 1024 * 1024 # MB
 +
 +    def setUp(self):
 +        # Write data to the image so we can compare later
 +        qemu_img('create', '-f', iotests.imgfmt, test_img, str(TestSingleDrive.image_len))
 +        qemu_io('-f', iotests.imgfmt, '-c', 'write -P0x5d 0 2M', test_img)
 +
 +        self.vm = iotests.VM().add_drive(test_img, 'discard=unmap')
 +        self.vm.launch()
 +
 +    def tearDown(self):
 +        self.vm.shutdown()
 +        os.remove(test_img)
 +        try:
 +            os.remove(target_img)
 +        except OSError:
 +            pass
 +
 +    def test_mirror_discard(self):
 +        result = self.vm.qmp('drive-mirror', device='drive0', sync='full',
 +                             target=target_img)
 +        self.assert_qmp(result, 'return', {})
 +        self.vm.hmp_qemu_io('drive0', 'discard 0 64k')
 +        self.complete_and_wait('drive0')
 +        self.vm.shutdown()
 +        self.assertTrue(iotests.compare_images(test_img, target_img),
 +                        'target image does not match source after mirroring')
 +
 +if __name__ == '__main__':
 +    iotests.main(supported_fmts=['raw', 'qcow2'])
 diff --git a/tests/qemu-iotests/131.out b/tests/qemu-iotests/131.out
 new file mode 100644
 index 000..ae1213e
 --- /dev/null
 +++ b/tests/qemu-iotests/131.out
 @@ -0,0 +1,5 @@
 +.
 +--
 +Ran 1 tests
 +
 +OK
 diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
 index 6ca3466..34b16cb 100644
 --- a/tests/qemu-iotests/group
 +++ b/tests/qemu-iotests/group
 @@ -128,3 +128,4 @@
  128 rw auto quick
  129 rw auto quick
  130 rw auto quick
 +131 rw auto quick
 

Reviewed-by: John Snow js...@redhat.com

Re: [Qemu-block] [Qemu-devel] [PATCH v2 4/6] qemu-iotests: Make block job methods common

2015-05-11 Thread John Snow

(ImageMirroringTestCase):
  result = self.vm.qmp('query-block')
  self.assert_qmp(result, 'return[0]/inserted/file', target_img)
  self.vm.shutdown()
 -self.assertTrue(self.compare_images(test_img, target_img),
 +self.assertTrue(iotests.compare_images(test_img, target_img),
  'target image does not match source after mirroring')
  
 -class TestMirrorResized(ImageMirroringTestCase):
 +class TestMirrorResized(iotests.QMPTestCase):
  backing_len = 1 * 1024 * 1024 # MB
  image_len = 2 * 1024 * 1024 # MB
  
 @@ -344,7 +308,7 @@ class TestMirrorResized(ImageMirroringTestCase):
  self.assertTrue(iotests.compare_images(test_img, target_img),
  'target image does not match source after mirroring')
  
 -class TestReadErrors(ImageMirroringTestCase):
 +class TestReadErrors(iotests.QMPTestCase):
  image_len = 2 * 1024 * 1024 # MB
  
  # this should be a multiple of twice the default granularity
 @@ -498,7 +462,7 @@ new_state = 1
  self.assert_no_active_block_jobs()
  self.vm.shutdown()
  
 -class TestWriteErrors(ImageMirroringTestCase):
 +class TestWriteErrors(iotests.QMPTestCase):
  image_len = 2 * 1024 * 1024 # MB
  
  # this should be a multiple of twice the default granularity
 @@ -624,7 +588,7 @@ new_state = 1
  self.assert_no_active_block_jobs()
  self.vm.shutdown()
  
 -class TestSetSpeed(ImageMirroringTestCase):
 +class TestSetSpeed(iotests.QMPTestCase):
  image_len = 80 * 1024 * 1024 # MB
  
  def setUp(self):
 @@ -690,7 +654,7 @@ class TestSetSpeed(ImageMirroringTestCase):
  
  self.wait_ready_and_cancel()
  
 -class TestUnbackedSource(ImageMirroringTestCase):
 +class TestUnbackedSource(iotests.QMPTestCase):
  image_len = 2 * 1024 * 1024 # MB
  
  def setUp(self):
 @@ -731,7 +695,7 @@ class TestUnbackedSource(ImageMirroringTestCase):
  self.complete_and_wait()
  self.assert_no_active_block_jobs()
  
 -class TestRepairQuorum(ImageMirroringTestCase):
 +class TestRepairQuorum(iotests.QMPTestCase):
       """ This class test quorum file repair using drive-mirror.
           It's mostly a fork of TestSingleDrive """
  image_len = 1 * 1024 * 1024 # MB
 diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
 index e93e623..2e07cc4 100644
 --- a/tests/qemu-iotests/iotests.py
 +++ b/tests/qemu-iotests/iotests.py
 @@ -326,6 +326,34 @@ class QMPTestCase(unittest.TestCase):
  self.assert_no_active_block_jobs()
  return event
  
 +    def wait_ready(self, drive='drive0'):
 +        '''Wait until a block job BLOCK_JOB_READY event'''
 +        ready = False
 +        while not ready:
 +            for event in self.vm.get_qmp_events(wait=True):
 +                if event['event'] == 'BLOCK_JOB_READY':
 +                    self.assert_qmp(event, 'data/type', 'mirror')
 +                    self.assert_qmp(event, 'data/device', drive)
 +                    ready = True
 +
 +    def wait_ready_and_cancel(self, drive='drive0'):
 +        self.wait_ready(drive=drive)
 +        event = self.cancel_and_wait(drive=drive)
 +        self.assertEquals(event['event'], 'BLOCK_JOB_COMPLETED')
 +        self.assert_qmp(event, 'data/type', 'mirror')
 +        self.assert_qmp(event, 'data/offset', event['data']['len'])
 +
 +    def complete_and_wait(self, drive='drive0', wait_ready=True):
 +        '''Complete a block job and wait for it to finish'''
 +        if wait_ready:
 +            self.wait_ready(drive=drive)
 +
 +        result = self.vm.qmp('block-job-complete', device=drive)
 +        self.assert_qmp(result, 'return', {})
 +
 +        event = self.wait_until_completed(drive=drive)
 +        self.assert_qmp(event, 'data/type', 'mirror')
 +
  def notrun(reason):
  '''Skip this test suite'''
  # Each test in qemu-iotests has a number (seq)
 

Reviewed-by: John Snow js...@redhat.com

Side note: at some point we should clean up the images this test leaves
lying around. Not caused by this patch, though.

Re: [Qemu-block] [Qemu-devel] [PATCH v2 6/6] iotests: Use event_wait in wait_ready

2015-05-11 Thread John Snow



On 05/06/2015 12:52 AM, Fam Zheng wrote:
 Only poll the specific type of event we are interested in, to avoid
 stealing events that should be consumed by someone else.
 
 Suggested-by: John Snow js...@redhat.com
 Signed-off-by: Fam Zheng f...@redhat.com
 ---
  tests/qemu-iotests/iotests.py | 9 ++---
  1 file changed, 2 insertions(+), 7 deletions(-)
 
 diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
 index 2e07cc4..0ddc513 100644
 --- a/tests/qemu-iotests/iotests.py
 +++ b/tests/qemu-iotests/iotests.py
 @@ -328,13 +328,8 @@ class QMPTestCase(unittest.TestCase):
  
  def wait_ready(self, drive='drive0'):
  '''Wait until a block job BLOCK_JOB_READY event'''
 -ready = False
 -while not ready:
 -for event in self.vm.get_qmp_events(wait=True):
 -if event['event'] == 'BLOCK_JOB_READY':
 -self.assert_qmp(event, 'data/type', 'mirror')
 -self.assert_qmp(event, 'data/device', drive)
 -ready = True
 +f = {'data': {'type': 'mirror', 'device': drive } }
 +event = self.vm.event_wait(name='BLOCK_JOB_READY', match=f)
  
  def wait_ready_and_cancel(self, drive='drive0'):
  self.wait_ready(drive=drive)
 

Thanks for appeasing me :)

Reviewed-by: John Snow js...@redhat.com
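
For readers following along, the idea behind the change (take only the event you asked for, leave the rest queued for whoever wants them) can be sketched roughly as below. This is not the iotests implementation of event_wait; event_matches and take_event are hypothetical names used only for illustration, and the queue is a plain list.

```python
def event_matches(event, match):
    """Return True if every key in `match` exists in `event` with an equal
    value. Nested dicts are compared recursively."""
    for key, expected in match.items():
        if key not in event:
            return False
        if isinstance(expected, dict):
            if not isinstance(event[key], dict):
                return False
            if not event_matches(event[key], expected):
                return False
        elif event[key] != expected:
            return False
    return True


def take_event(queue, name, match=None):
    """Remove and return the first queued event called `name` that satisfies
    `match`. Every other event stays in the queue for other consumers."""
    for i, event in enumerate(queue):
        if event['event'] != name:
            continue
        if match is None or event_matches(event, match):
            return queue.pop(i)
    return None
```

With a filter such as {'data': {'type': 'mirror', 'device': drive}}, an unrelated BLOCK_JOB_COMPLETED event sitting earlier in the queue is not consumed.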

[Qemu-block] [PATCH v4 05/11] block: add transactional callbacks feature

2015-05-11 Thread John Snow

The goal here is to add a new method to transactions that allows
developers to specify a callback that will get invoked only once
all jobs spawned by a transaction are completed, allowing developers
the chance to perform actions conditionally pending complete success,
partial failure, or complete failure.

In order to register the new callback to be invoked, a user must request
a callback pointer and closure by calling new_action_cb_wrapper, which
creates a wrapper around an opaque pointer and callback that would have
originally been passed to e.g. backup_start().

The function will return a function pointer and a new opaque pointer to
be passed instead. The transaction system will effectively intercept the
original callbacks and perform book-keeping on the transaction after it
has delivered the original enveloped callback.

This means that Transaction Action callback methods will be called after
all callbacks triggered by all Actions in the Transactional group have
been received.

This feature has no knowledge of any jobs spawned by Actions that do not
inform the system via new_action_cb_wrapper().

For an example of how to use the feature, please skip ahead to:
'block: drive_backup transaction callback support' which serves as an example
for how to hook up a post-transaction callback to the Drive Backup action.


Note 1: Defining a callback method alone is not sufficient to have the new
method invoked. You must call new_action_cb_wrapper() AND ensure the
callback it returns is the one used as the callback for the job
launched by the action.

Note 2: You can use this feature for any system that registers completions of
an asynchronous task via a callback of the form
(void *opaque, int ret), not just block job callbacks.
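
As a rough model of the interception described above (not the C code in this patch), the bookkeeping can be sketched in Python. The class and method names are hypothetical, and treating the cumulative status as "first non-zero return code" is an assumption made only for this sketch.

```python
class Transaction:
    """Toy model: post-transaction callbacks run only after every wrapped
    job callback has been delivered."""

    def __init__(self):
        # The transaction command itself counts as one pending job.
        self.jobs = 1
        self.status = 0      # assumption: first non-zero return code wins
        self.pending = []    # (action_cb, ret) pairs awaiting delivery

    def new_action_cb_wrapper(self, callback, action_cb):
        """Return a callback to hand to the job in place of `callback`."""
        self.jobs += 1

        def wrapper(opaque, ret):
            callback(opaque, ret)            # deliver the original first
            if ret and not self.status:
                self.status = ret
            self.pending.append((action_cb, ret))
            self.job_complete()

        return wrapper

    def job_complete(self):
        """Drop one pending job; fire the action callbacks on the last."""
        self.jobs -= 1
        if self.jobs == 0:
            for action_cb, ret in self.pending:
                action_cb(ret, self.status)
            self.pending.clear()
```

An action that never asks for a wrapper is invisible to this accounting, which matches the limitation noted in the commit message.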

Signed-off-by: John Snow js...@redhat.com
Reviewed-by: Max Reitz mre...@redhat.com
---
 blockdev.c | 183 +++--
 1 file changed, 179 insertions(+), 4 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 068eccb..27db1b4 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1240,6 +1240,8 @@ typedef struct BlkActionState BlkActionState;
  * @abort: Abort the changes on fail, can be NULL.
  * @clean: Clean up resources after all transaction actions have called
  * commit() or abort(). Can be NULL.
+ * @cb: Executed after all jobs launched by actions in the transaction finish,
+ *  but only if requested by new_action_cb_wrapper() prior to clean().
  *
  * Only prepare() may fail. In a single transaction, only one of commit() or
  * abort() will be called. clean() will always be called if it is present.
@@ -1250,6 +1252,7 @@ typedef struct BlkActionOps {
 void (*commit)(BlkActionState *common);
 void (*abort)(BlkActionState *common);
 void (*clean)(BlkActionState *common);
+void (*cb)(BlkActionState *common);
 } BlkActionOps;
 
 /**
@@ -1258,19 +1261,46 @@ typedef struct BlkActionOps {
  * by a transaction group.
  *
  * @jobs: A reference count that tracks how many jobs still need to complete.
+ * @status: A cumulative return code for all actions that have reported
+ *  a return code via callback in the transaction.
  * @actions: A list of all Actions in the Transaction.
+ *   However, once the transaction has completed, it will be only a list
+ *   of transactions that have registered a post-transaction callback.
  */
 typedef struct BlkTransactionState {
 int jobs;
+int status;
 QTAILQ_HEAD(actions, BlkActionState) actions;
 } BlkTransactionState;
 
+typedef void (CallbackFn)(void *opaque, int ret);
+
+/**
+ * BlkActionCallbackData:
+ * Necessary state for intercepting and
+ * re-delivering a callback triggered by an Action.
+ *
+ * @opaque: The data to be given to the encapsulated callback when
+ *  a job launched by an Action completes.
+ * @ret: The status code that was delivered to the encapsulated callback.
+ * @callback: The encapsulated callback to invoke upon completion of
+ *the Job launched by the Action.
+ */
+typedef struct BlkActionCallbackData {
+void *opaque;
+int ret;
+CallbackFn *callback;
+} BlkActionCallbackData;
+
 /**
  * BlkActionState:
  * Describes one Action's state within a Transaction.
  *
  * @action: QAPI-defined enum identifying which Action to perform.
  * @ops: Table of ActionOps this Action can perform.
+ * @transaction: A pointer back to the Transaction this Action belongs to.
+ * @cb_data: Information on this Action's encapsulated callback, if any.
+ * @refcount: reference count, allowing access to this state beyond clean().
  * @entry: List membership for all Actions in this Transaction.
  *
  * This structure must be arranged as first member in a subclassed type,
@@ -1280,6 +1310,9 @@ typedef struct BlkTransactionState {
 struct BlkActionState {
 TransactionAction *action;
 const BlkActionOps *ops;
+BlkTransactionState *transaction

[Qemu-block] [PATCH v4 06/11] block: add refcount to Job object

2015-05-11 Thread John Snow

If we want to get at the job after the life of the job,
we'll need a refcount for this object.

This may occur for example if we wish to inspect the actions
taken by a particular job after a transactional group of jobs
runs, and further actions are required.
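
The lifetime rule is simple enough to model in a few lines. The following is a toy Python sketch, not the C in this patch; the Job class and its on_free hook are hypothetical stand-ins for BlockJob and g_free().

```python
class Job:
    """Toy refcounted object: it is released when the last holder lets go."""

    def __init__(self, on_free):
        self.refcount = 1        # the creator holds the first reference
        self._on_free = on_free  # stand-in for g_free()

    def incref(self):
        self.refcount += 1

    def decref(self):
        assert self.refcount > 0, "decref without a matching reference"
        self.refcount -= 1
        if self.refcount == 0:
            self._on_free(self)
```

A transaction that wants to inspect the job after completion takes its own reference, so the job's own final decref no longer frees it.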

Signed-off-by: John Snow js...@redhat.com
Reviewed-by: Max Reitz mre...@redhat.com
---
 blockjob.c   | 18 --
 include/block/blockjob.h | 21 +
 2 files changed, 37 insertions(+), 2 deletions(-)

diff --git a/blockjob.c b/blockjob.c
index 2755465..9b3456f 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -35,6 +35,19 @@
 #include "qemu/timer.h"
 #include "qapi-event.h"
 
+void block_job_incref(BlockJob *job)
+{
+    job->refcount++;
+}
+
+void block_job_decref(BlockJob *job)
+{
+    job->refcount--;
+    if (job->refcount == 0) {
+        g_free(job);
+    }
+}
+
 void *block_job_create(const BlockJobDriver *driver, BlockDriverState *bs,
int64_t speed, BlockCompletionFunc *cb,
void *opaque, Error **errp)
@@ -57,6 +70,7 @@ void *block_job_create(const BlockJobDriver *driver, BlockDriverState *bs,
     job->cb            = cb;
     job->opaque        = opaque;
     job->busy          = true;
+    job->refcount      = 1;
     bs->job = job;
 
 /* Only set speed when necessary to avoid NotSupported error */
@@ -68,7 +82,7 @@ void *block_job_create(const BlockJobDriver *driver, BlockDriverState *bs,
             bs->job = NULL;
             bdrv_op_unblock_all(bs, job->blocker);
             error_free(job->blocker);
-            g_free(job);
+            block_job_decref(job);
             error_propagate(errp, local_err);
             return NULL;
         }
@@ -85,7 +99,7 @@ void block_job_completed(BlockJob *job, int ret)
     bs->job = NULL;
     bdrv_op_unblock_all(bs, job->blocker);
     error_free(job->blocker);
-    g_free(job);
+    block_job_decref(job);
 }
 
 void block_job_set_speed(BlockJob *job, int64_t speed, Error **errp)
diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index 57d8ef1..86d770a 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -122,6 +122,9 @@ struct BlockJob {
 
 /** The opaque value that is passed to the completion function.  */
 void *opaque;
+
+    /** A reference count, allowing for post-job actions in e.g. transactions */
+    int refcount;
 };
 
 /**
@@ -147,6 +150,24 @@ void *block_job_create(const BlockJobDriver *driver, BlockDriverState *bs,
void *opaque, Error **errp);
 
 /**
+ * block_job_incref:
+ * @job: The job to pick up a handle to
+ *
+ * Increment the refcount on @job, to be able to use it asynchronously
+ * from the job it is being used for. Put down the reference when done
+ * with @block_job_decref.
+ */
+void block_job_incref(BlockJob *job);
+
+/**
+ * block_job_decref:
+ * @job: The job to unreference and delete.
+ *
+ * Decrement the refcount on @job and delete it if there are no more references.
+ */
+void block_job_decref(BlockJob *job);
+
+/**
  * block_job_sleep_ns:
  * @job: The job that calls the function.
  * @clock: The clock to sleep on.
-- 
2.1.0

[Qemu-block] [PATCH v4 09/11] block: drive_backup transaction callback support

2015-05-11 Thread John Snow

This patch actually implements the transactional callback system
for the drive_backup action.

(1) We manually pick up a reference to the bitmap if present to allow
its cleanup to be delayed until after all drive_backup jobs launched
by the transaction have fully completed.

(2) We create a functional closure that envelops the original drive_backup
callback, to be able to intercept the completion status and return code
for the job.

(3) We add the drive_backup_cb method for the drive_backup action, which
unpacks the completion information and invokes the final cleanup.

(4) backup_transaction_complete will perform the final cleanup on the
backup job.

(5) In the case of transaction cancellation, drive_backup_cb is still
responsible for cleaning up the mess we may have already made.

Signed-off-by: John Snow js...@redhat.com
Reviewed-by: Max Reitz mre...@redhat.com
---
 block/backup.c|  9 
 blockdev.c| 53 ---
 include/block/block_int.h |  8 +++
 3 files changed, 67 insertions(+), 3 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index 4ac0be8..1634c88 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -233,6 +233,15 @@ typedef struct {
 int ret;
 } BackupCompleteData;
 
+void backup_transaction_complete(BlockJob *job, int ret)
+{
+    BackupBlockJob *s = container_of(job, BackupBlockJob, common);
+
+    if (s->sync_bitmap) {
+        bdrv_frozen_bitmap_decref(job->bs, s->sync_bitmap, ret);
+    }
+}
+
 static void backup_complete(BlockJob *job, void *opaque)
 {
 BackupBlockJob *s = container_of(job, BackupBlockJob, common);
diff --git a/blockdev.c b/blockdev.c
index f391e18..c438949 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1426,7 +1426,6 @@ static void transaction_action_callback(void *opaque, int 
ret)
  *
  * @return The callback to be used instead of @callback.
  */
-__attribute__((__unused__))
 static CallbackFn *new_action_cb_wrapper(BlkActionState *common,
  void *opaque,
  CallbackFn *callback,
@@ -1454,7 +1453,6 @@ static CallbackFn *new_action_cb_wrapper(BlkActionState 
*common,
 /**
  * Undo any actions performed by the above call.
  */
-__attribute__((__unused__))
 static void cancel_action_cb_wrapper(BlkActionState *common)
 {
 /* Stage 0: Wrapper was never created: */
@@ -1806,6 +1804,7 @@ static void do_drive_backup(const char *device, const 
char *target,
 BlockdevOnError on_target_error,
 BlockCompletionFunc *cb, void *opaque,
 Error **errp);
+static void block_job_cb(void *opaque, int ret);
 
 static void drive_backup_prepare(BlkActionState *common, Error **errp)
 {
@@ -1814,6 +1813,9 @@ static void drive_backup_prepare(BlkActionState *common, 
Error **errp)
 BlockBackend *blk;
 DriveBackup *backup;
 Error *local_err = NULL;
+CallbackFn *cb;
+void *opaque;
+BdrvDirtyBitmap *bmap = NULL;
 
 assert(common-action-kind == TRANSACTION_ACTION_KIND_DRIVE_BACKUP);
 backup = common-action-drive_backup;
@@ -1825,6 +1827,19 @@ static void drive_backup_prepare(BlkActionState *common, 
Error **errp)
 }
 bs = blk_bs(blk);
 
+    /* BackupBlockJob is opaque to us, so look up the bitmap ourselves */
+    if (backup->has_bitmap) {
+        bmap = bdrv_find_dirty_bitmap(bs, backup->bitmap);
+        if (!bmap) {
+            error_setg(errp, "Bitmap '%s' could not be found", backup->bitmap);
+            return;
+        }
+    }
+
+    /* Create our transactional callback wrapper,
+       and register that we'd like to call .cb() later. */
+    cb = new_action_cb_wrapper(common, bs, block_job_cb, &opaque);
+
 /* AioContext is released in .clean() */
 state-aio_context = bdrv_get_aio_context(bs);
 aio_context_acquire(state-aio_context);
@@ -1837,7 +1852,7 @@ static void drive_backup_prepare(BlkActionState *common, 
Error **errp)
 backup-has_bitmap, backup-bitmap,
 backup-has_on_source_error, backup-on_source_error,
 backup-has_on_target_error, backup-on_target_error,
-NULL, NULL,
+cb, opaque,
 local_err);
 if (local_err) {
 error_propagate(errp, local_err);
@@ -1846,6 +1861,12 @@ static void drive_backup_prepare(BlkActionState *common, 
Error **errp)
 
     state->bs = bs;
     state->job = state->bs->job;
+    /* Keep the job alive until .cb(), too:
+     * References are only incremented after the job launches successfully. */
+    block_job_incref(state->job);
+    if (bmap) {
+        bdrv_dirty_bitmap_incref(bmap);
+    }
 }
 
 static void drive_backup_abort(BlkActionState *common)
@@ -1857,6 +1878,10 @@ static void drive_backup_abort(BlkActionState *common)
     if (bs && bs->job && bs->job == state->job) {
 block_job_cancel_sync(bs

[Qemu-block] [PATCH v4 01/11] qapi: Add transaction support to block-dirty-bitmap operations

2015-05-11 Thread John Snow

This adds two qmp commands to transactions.

block-dirty-bitmap-add allows you to create a bitmap simultaneously
alongside a new full backup to accomplish a clean synchronization
point.

block-dirty-bitmap-clear allows you to reset a bitmap back to as-if
it were new, which can also be used alongside a full backup to
accomplish a clean synchronization point.
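
To make the "clear, but be able to undo it" idea concrete, here is a small Python model of a clear that can be rolled back if a later action in the same transaction fails. It is only a sketch: DirtyBitmap and clear_all are hypothetical names, and a set of dirty sector numbers stands in for the real HBitmap.

```python
class DirtyBitmap:
    """Toy bitmap: a set of dirty sector numbers."""

    def __init__(self, bits=()):
        self.bits = set(bits)

    def clear(self, keep_backup=False):
        """Reset the bitmap. With keep_backup, swap in a fresh set and
        return the old contents so the clear can be undone later."""
        if not keep_backup:
            self.bits.clear()
            return None
        backup = self.bits
        self.bits = set()
        return backup

    def undo_clear(self, backup):
        """Restore the contents saved by clear(keep_backup=True)."""
        self.bits = backup


def clear_all(bitmaps, can_clear):
    """Clear every bitmap, or none of them if any prepare step fails."""
    undo = []
    for bm in bitmaps:
        if not can_clear(bm):
            for done, backup in undo:   # abort: roll back earlier clears
                done.undo_clear(backup)
            return False
        undo.append((bm, bm.clear(keep_backup=True)))
    return True                         # commit: backups are simply dropped
```

The swap-and-keep approach makes abort cheap: nothing is copied, the old contents are just put back.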

Signed-off-by: Fam Zheng f...@redhat.com
Signed-off-by: John Snow js...@redhat.com
---
 block.c   |  19 +++-
 blockdev.c| 114 +-
 docs/bitmaps.md   |   6 +--
 include/block/block.h |   1 -
 include/block/block_int.h |   3 ++
 qapi-schema.json  |   6 ++-
 6 files changed, 139 insertions(+), 10 deletions(-)

diff --git a/block.c b/block.c
index 7904098..ca5b1e9 100644
--- a/block.c
+++ b/block.c
@@ -3306,10 +3306,25 @@ void bdrv_reset_dirty_bitmap(BdrvDirtyBitmap *bitmap,
     hbitmap_reset(bitmap->bitmap, cur_sector, nr_sectors);
 }
 
-void bdrv_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap)
+void bdrv_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap, HBitmap **out)
 {
     assert(bdrv_dirty_bitmap_enabled(bitmap));
-    hbitmap_reset(bitmap->bitmap, 0, bitmap->size);
+    if (!out) {
+        hbitmap_reset(bitmap->bitmap, 0, bitmap->size);
+    } else {
+        HBitmap *backup = bitmap->bitmap;
+        bitmap->bitmap = hbitmap_alloc(bitmap->size,
+                                       hbitmap_granularity(backup));
+        *out = backup;
+    }
+}
+
+void bdrv_undo_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap, HBitmap *in)
+{
+    HBitmap *tmp = bitmap->bitmap;
+    assert(bdrv_dirty_bitmap_enabled(bitmap));
+    bitmap->bitmap = in;
+    hbitmap_free(tmp);
 }
 
 void bdrv_set_dirty(BlockDriverState *bs, int64_t cur_sector,
diff --git a/blockdev.c b/blockdev.c
index 5eaf77e..a62cc4b 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1694,6 +1694,106 @@ static void blockdev_backup_clean(BlkTransactionState 
*common)
 }
 }
 
+typedef struct BlockDirtyBitmapState {
+BlkTransactionState common;
+BdrvDirtyBitmap *bitmap;
+BlockDriverState *bs;
+AioContext *aio_context;
+HBitmap *backup;
+bool prepared;
+} BlockDirtyBitmapState;
+
+static void block_dirty_bitmap_add_prepare(BlkTransactionState *common,
+   Error **errp)
+{
+Error *local_err = NULL;
+BlockDirtyBitmapAdd *action;
+BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
+ common, common);
+
+action = common-action-block_dirty_bitmap_add;
+/* AIO context taken and released within qmp_block_dirty_bitmap_add */
+qmp_block_dirty_bitmap_add(action-node, action-name,
+   action-has_granularity, action-granularity,
+   local_err);
+
+if (!local_err) {
+state-prepared = true;
+} else {
+error_propagate(errp, local_err);
+}
+}
+
+static void block_dirty_bitmap_add_abort(BlkTransactionState *common)
+{
+BlockDirtyBitmapAdd *action;
+BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
+ common, common);
+
+action = common-action-block_dirty_bitmap_add;
+/* Should not be able to fail: IF the bitmap was added via .prepare(),
+ * then the node reference and bitmap name must have been valid.
+ */
+if (state-prepared) {
+qmp_block_dirty_bitmap_remove(action-node, action-name, 
error_abort);
+}
+}
+
+static void block_dirty_bitmap_clear_prepare(BlkTransactionState *common,
+ Error **errp)
+{
+BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
+ common, common);
+BlockDirtyBitmap *action;
+
+action = common-action-block_dirty_bitmap_clear;
+state-bitmap = block_dirty_bitmap_lookup(action-node,
+  action-name,
+  state-bs,
+  state-aio_context,
+  errp);
+if (!state-bitmap) {
+return;
+}
+
+if (bdrv_dirty_bitmap_frozen(state-bitmap)) {
+error_setg(errp, Cannot modify a frozen bitmap);
+return;
+} else if (!bdrv_dirty_bitmap_enabled(state-bitmap)) {
+error_setg(errp, Cannot clear a disabled bitmap);
+return;
+}
+
+bdrv_clear_dirty_bitmap(state-bitmap, state-backup);
+/* AioContext is released in .clean() */
+}
+
+static void block_dirty_bitmap_clear_abort(BlkTransactionState *common)
+{
+BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
+ common, common);
+
+bdrv_undo_clear_dirty_bitmap(state-bitmap, state-backup);
+}
+
+static void block_dirty_bitmap_clear_commit(BlkTransactionState *common

[Qemu-block] [PATCH v4 07/11] block: add delayed bitmap successor cleanup

2015-05-11 Thread John Snow

Allow bitmap successors to carry reference counts.

We can in a later patch use this ability to clean up the dirty bitmap
according to both the individual job's success and the success of all
jobs in the transaction group.

The code for cleaning up a bitmap is also moved from backup_run to
backup_complete.
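
The policy implemented by the decref helper in this patch (any failure forces a reclaim, otherwise abdicate, and nothing happens until the last reference is dropped) can be modelled as below. This is a Python sketch of the decision logic only; FrozenBitmap and the string constants are hypothetical, and the actual merge or rename of the successor is left out.

```python
RECLAIM = 'reclaim'    # merge the successor back into the parent
ABDICATE = 'abdicate'  # let the successor take over the parent's name


class FrozenBitmap:
    """Toy model of a frozen parent bitmap with a refcounted successor."""

    def __init__(self):
        self.successor_refcount = 1
        self.act = None

    def incref(self):
        self.successor_refcount += 1

    def decref(self, ret):
        """Record one holder's result. Returns the action to perform once
        the last reference is dropped, else None."""
        if ret:
            self.act = RECLAIM          # a failure is sticky
        elif self.act != RECLAIM:
            self.act = ABDICATE
        self.successor_refcount -= 1
        if self.successor_refcount == 0:
            return self.act
        return None
```

The order in which holders report does not matter: one failing job is enough to keep the old bitmap data.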

Signed-off-by: John Snow js...@redhat.com
Reviewed-by: Max Reitz mre...@redhat.com
---
 block.c   | 65 ++-
 block/backup.c| 20 ++--
 include/block/block.h | 10 
 3 files changed, 70 insertions(+), 25 deletions(-)

diff --git a/block.c b/block.c
index ca5b1e9..d964564 100644
--- a/block.c
+++ b/block.c
@@ -51,6 +51,12 @@
 #include <windows.h>
 #endif
 
+typedef enum BitmapSuccessorAction {
+SUCCESSOR_ACTION_UNDEFINED = 0,
+SUCCESSOR_ACTION_ABDICATE,
+SUCCESSOR_ACTION_RECLAIM
+} BitmapSuccessorAction;
+
 /**
  * A BdrvDirtyBitmap can be in three possible states:
  * (1) successor is NULL and disabled is false: full r/w mode
@@ -65,6 +71,8 @@ struct BdrvDirtyBitmap {
 char *name; /* Optional non-empty unique ID */
 int64_t size;   /* Size of the bitmap (Number of sectors) */
 bool disabled;  /* Bitmap is read-only */
+int successor_refcount; /* Number of active handles to the successor */
+BitmapSuccessorAction act;  /* Action to take on successor upon release */
 QLIST_ENTRY(BdrvDirtyBitmap) list;
 };
 
@@ -3133,6 +3141,7 @@ int bdrv_dirty_bitmap_create_successor(BlockDriverState 
*bs,
 
 /* Install the successor and freeze the parent */
 bitmap-successor = child;
+bitmap-successor_refcount = 1;
 return 0;
 }
 
@@ -3140,9 +3149,9 @@ int bdrv_dirty_bitmap_create_successor(BlockDriverState 
*bs,
  * For a bitmap with a successor, yield our name to the successor,
  * delete the old bitmap, and return a handle to the new bitmap.
  */
-BdrvDirtyBitmap *bdrv_dirty_bitmap_abdicate(BlockDriverState *bs,
-BdrvDirtyBitmap *bitmap,
-Error **errp)
+static BdrvDirtyBitmap *bdrv_dirty_bitmap_abdicate(BlockDriverState *bs,
+   BdrvDirtyBitmap *bitmap,
+   Error **errp)
 {
 char *name;
 BdrvDirtyBitmap *successor = bitmap-successor;
@@ -3167,9 +3176,9 @@ BdrvDirtyBitmap 
*bdrv_dirty_bitmap_abdicate(BlockDriverState *bs,
  * we may wish to re-join the parent and child/successor.
  * The merged parent will be un-frozen, but not explicitly re-enabled.
  */
-BdrvDirtyBitmap *bdrv_reclaim_dirty_bitmap(BlockDriverState *bs,
-   BdrvDirtyBitmap *parent,
-   Error **errp)
+static BdrvDirtyBitmap *bdrv_reclaim_dirty_bitmap(BlockDriverState *bs,
+  BdrvDirtyBitmap *parent,
+  Error **errp)
 {
 BdrvDirtyBitmap *successor = parent-successor;
 
@@ -3188,6 +3197,50 @@ BdrvDirtyBitmap 
*bdrv_reclaim_dirty_bitmap(BlockDriverState *bs,
 return parent;
 }
 
+static BdrvDirtyBitmap *bdrv_free_bitmap_successor(BlockDriverState *bs,
+   BdrvDirtyBitmap *parent)
+{
+assert(!parent-successor_refcount);
+
+switch (parent-act) {
+case SUCCESSOR_ACTION_RECLAIM:
+return bdrv_reclaim_dirty_bitmap(bs, parent, NULL);
+case SUCCESSOR_ACTION_ABDICATE:
+return bdrv_dirty_bitmap_abdicate(bs, parent, NULL);
+case SUCCESSOR_ACTION_UNDEFINED:
+default:
+g_assert_not_reached();
+}
+}
+
+BdrvDirtyBitmap *bdrv_frozen_bitmap_decref(BlockDriverState *bs,
+                                           BdrvDirtyBitmap *parent,
+                                           int ret)
+{
+    assert(bdrv_dirty_bitmap_frozen(parent));
+    assert(parent->successor);
+
+    if (ret) {
+        parent->act = SUCCESSOR_ACTION_RECLAIM;
+    } else if (parent->act != SUCCESSOR_ACTION_RECLAIM) {
+        parent->act = SUCCESSOR_ACTION_ABDICATE;
+    }
+
+    parent->successor_refcount--;
+    if (parent->successor_refcount == 0) {
+        return bdrv_free_bitmap_successor(bs, parent);
+    }
+    return parent;
+}
+
+void bdrv_dirty_bitmap_incref(BdrvDirtyBitmap *parent)
+{
+    assert(bdrv_dirty_bitmap_frozen(parent));
+    assert(parent->successor);
+
+    parent->successor_refcount++;
+}
+
 /**
  * Truncates _all_ bitmaps attached to a BDS.
  */
diff --git a/block/backup.c b/block/backup.c
index d3f648d..4ac0be8 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -240,6 +240,12 @@ static void backup_complete(BlockJob *job, void *opaque)
 
 bdrv_unref(s-target);
 
+if (s-sync_bitmap) {
+BdrvDirtyBitmap *bm;
+bm = bdrv_frozen_bitmap_decref(job-bs, s-sync_bitmap, data-ret);
+assert(bm

[Qemu-block] [PATCH v4 04/11] block: re-add BlkTransactionState

2015-05-11 Thread John Snow

Now that the structure formerly known as BlkTransactionState has been
renamed to something sensible (BlkActionState), re-introduce an actual
BlkTransactionState that actually manages state for the entire Transaction.

In the process, convert the old QSIMPLEQ list of actions into a QTAILQ,
to let us more efficiently delete items in arbitrary order, which will
be more important in the future when some actions will expire at the end
of the transaction, but others may persist until all callbacks triggered
by the transaction are recollected.

Signed-off-by: John Snow js...@redhat.com
Reviewed-by: Max Reitz mre...@redhat.com
---
 blockdev.c | 66 +++---
 1 file changed, 59 insertions(+), 7 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 6df575d..068eccb 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1253,6 +1253,19 @@ typedef struct BlkActionOps {
 } BlkActionOps;
 
 /**
+ * BlkTransactionState:
+ * Object to track the job completion info for jobs launched
+ * by a transaction group.
+ *
+ * @jobs: A reference count that tracks how many jobs still need to complete.
+ * @actions: A list of all Actions in the Transaction.
+ */
+typedef struct BlkTransactionState {
+int jobs;
+QTAILQ_HEAD(actions, BlkActionState) actions;
+} BlkTransactionState;
+
+/**
  * BlkActionState:
  * Describes one Action's state within a Transaction.
  *
@@ -1267,9 +1280,45 @@ typedef struct BlkActionOps {
 struct BlkActionState {
 TransactionAction *action;
 const BlkActionOps *ops;
-QSIMPLEQ_ENTRY(BlkActionState) entry;
+QTAILQ_ENTRY(BlkActionState) entry;
 };
 
+static BlkTransactionState *new_blk_transaction_state(void)
+{
+    BlkTransactionState *bts = g_new0(BlkTransactionState, 1);
+
+    /* The qmp_transaction function itself can be considered a pending job
+     * that should complete before pending action callbacks are executed,
+     * so increment the jobs remaining refcount to indicate this. */
+    bts->jobs = 1;
+    QTAILQ_INIT(&bts->actions);
+    return bts;
+}
+
+static void destroy_blk_transaction_state(BlkTransactionState *bts)
+{
+    BlkActionState *bas, *bas_next;
+
+    /* The list should in normal cases be empty,
+     * but in case someone really just wants to kibosh the whole deal: */
+    QTAILQ_FOREACH_SAFE(bas, &bts->actions, entry, bas_next) {
+        QTAILQ_REMOVE(&bts->actions, bas, entry);
+        g_free(bas);
+    }
+
+    g_free(bts);
+}
+
+static BlkTransactionState *transaction_job_complete(BlkTransactionState *bts)
+{
+    bts->jobs--;
+    if (bts->jobs == 0) {
+        destroy_blk_transaction_state(bts);
+        return NULL;
+    }
+    return bts;
+}
+
 /* internal snapshot private data */
 typedef struct InternalSnapshotState {
 BlkActionState common;
@@ -1870,10 +1919,10 @@ void qmp_transaction(TransactionActionList *dev_list, 
Error **errp)
 {
 TransactionActionList *dev_entry = dev_list;
 BlkActionState *state, *next;
+BlkTransactionState *bts;
 Error *local_err = NULL;
 
-QSIMPLEQ_HEAD(snap_bdrv_states, BlkActionState) snap_bdrv_states;
-QSIMPLEQ_INIT(snap_bdrv_states);
+bts = new_blk_transaction_state();
 
 /* drain all i/o before any operations */
 bdrv_drain_all();
@@ -1894,7 +1943,7 @@ void qmp_transaction(TransactionActionList *dev_list, 
Error **errp)
 state = g_malloc0(ops-instance_size);
 state-ops = ops;
 state-action = dev_info;
-QSIMPLEQ_INSERT_TAIL(snap_bdrv_states, state, entry);
+QTAILQ_INSERT_TAIL(bts-actions, state, entry);
 
 state-ops-prepare(state, local_err);
 if (local_err) {
@@ -1903,7 +1952,7 @@ void qmp_transaction(TransactionActionList *dev_list, 
Error **errp)
 }
 }
 
-QSIMPLEQ_FOREACH(state, snap_bdrv_states, entry) {
+QTAILQ_FOREACH(state, bts-actions, entry) {
 if (state-ops-commit) {
 state-ops-commit(state);
 }
@@ -1914,18 +1963,21 @@ void qmp_transaction(TransactionActionList *dev_list, 
Error **errp)
 
 delete_and_fail:
 /* failure, and it is all-or-none; roll back all operations */
-QSIMPLEQ_FOREACH(state, snap_bdrv_states, entry) {
+QTAILQ_FOREACH(state, bts-actions, entry) {
 if (state-ops-abort) {
 state-ops-abort(state);
 }
 }
 exit:
-QSIMPLEQ_FOREACH_SAFE(state, snap_bdrv_states, entry, next) {
+QTAILQ_FOREACH_SAFE(state, bts-actions, entry, next) {
 if (state-ops-clean) {
 state-ops-clean(state);
 }
+QTAILQ_REMOVE(bts-actions, state, entry);
 g_free(state);
 }
+
+transaction_job_complete(bts);
 }
 
 
-- 
2.1.0

[Qemu-block] [PATCH v4 02/11] iotests: add transactional incremental backup test

2015-05-11 Thread John Snow

Test simple usage cases for using transactions to create
and synchronize incremental backups.

Signed-off-by: John Snow js...@redhat.com
Reviewed-by: Max Reitz mre...@redhat.com
Reviewed-by: Stefan Hajnoczi stefa...@redhat.com
---
 tests/qemu-iotests/124 | 54 ++
 tests/qemu-iotests/124.out |  4 ++--
 2 files changed, 56 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/124 b/tests/qemu-iotests/124
index 3ee78cd..2d50594 100644
--- a/tests/qemu-iotests/124
+++ b/tests/qemu-iotests/124
@@ -36,6 +36,23 @@ def try_remove(img):
 pass
 
 
+def transaction_action(action, **kwargs):
+return {
+'type': action,
+'data': kwargs
+}
+
+
+def transaction_bitmap_clear(node, name, **kwargs):
+return transaction_action('block-dirty-bitmap-clear',
+  node=node, name=name, **kwargs)
+
+
+def transaction_drive_backup(device, target, **kwargs):
+return transaction_action('drive-backup', device=device, target=target,
+  **kwargs)
+
+
 class Bitmap:
 def __init__(self, name, drive):
 self.name = name
@@ -264,6 +281,43 @@ class TestIncrementalBackup(iotests.QMPTestCase):
 return self.do_incremental_simple(granularity=131072)
 
 
+def test_incremental_transaction(self):
+'''Test: Verify backups made from transactionally created bitmaps.
+
+Create a bitmap before VM execution begins, then create a second
+bitmap AFTER writes have already occurred. Use transactions to create
+a full backup and synchronize both bitmaps to this backup.
+Create an incremental backup through both bitmaps and verify that
+both backups match the current drive0 image.
+'''
+
+drive0 = self.drives[0]
+bitmap0 = self.add_bitmap('bitmap0', drive0)
+self.hmp_io_writes(drive0['id'], (('0xab', 0, 512),
+  ('0xfe', '16M', '256k'),
+  ('0x64', '32736k', '64k')))
+bitmap1 = self.add_bitmap('bitmap1', drive0)
+
+result = self.vm.qmp('transaction', actions=[
+transaction_bitmap_clear(bitmap0.drive['id'], bitmap0.name),
+transaction_bitmap_clear(bitmap1.drive['id'], bitmap1.name),
+transaction_drive_backup(drive0['id'], drive0['backup'],
+ sync='full', format=drive0['fmt'])
+])
+self.assert_qmp(result, 'return', {})
+self.wait_until_completed(drive0['id'])
+self.files.append(drive0['backup'])
+
+self.hmp_io_writes(drive0['id'], (('0x9a', 0, 512),
+  ('0x55', '8M', '352k'),
+  ('0x78', '15872k', '1M')))
+# Both bitmaps should be correctly in sync.
+self.create_incremental(bitmap0)
+self.create_incremental(bitmap1)
+self.vm.shutdown()
+self.check_backups()
+
+
 def test_incremental_failure(self):
 '''Test: Verify backups made after a failure are correct.
 
diff --git a/tests/qemu-iotests/124.out b/tests/qemu-iotests/124.out
index 2f7d390..594c16f 100644
--- a/tests/qemu-iotests/124.out
+++ b/tests/qemu-iotests/124.out
@@ -1,5 +1,5 @@
-...
+
 --
-Ran 7 tests
+Ran 8 tests
 
 OK
-- 
2.1.0

[Qemu-block] [PATCH v4 10/11] iotests: 124 - transactional failure test

2015-05-11 Thread John Snow

Use a transaction to request an incremental backup across two drives.
Coerce one of the jobs to fail, and then re-run the transaction.

Verify that no bitmap data was lost due to the partial transaction
failure.

Signed-off-by: John Snow js...@redhat.com
Reviewed-by: Max Reitz mre...@redhat.com
---
 tests/qemu-iotests/124 | 120 -
 tests/qemu-iotests/124.out |   4 +-
 2 files changed, 121 insertions(+), 3 deletions(-)

diff --git a/tests/qemu-iotests/124 b/tests/qemu-iotests/124
index 2d50594..772edd4 100644
--- a/tests/qemu-iotests/124
+++ b/tests/qemu-iotests/124
@@ -139,9 +139,12 @@ class TestIncrementalBackup(iotests.QMPTestCase):
 def do_qmp_backup(self, error='Input/output error', **kwargs):
 res = self.vm.qmp('drive-backup', **kwargs)
 self.assert_qmp(res, 'return', {})
+return self.wait_qmp_backup(kwargs['device'], error)
 
+
+def wait_qmp_backup(self, device, error='Input/output error'):
         event = self.vm.event_wait(name="BLOCK_JOB_COMPLETED",
-                                   match={'data': {'device': kwargs['device']}})
+                                   match={'data': {'device': device}})
 self.assertIsNotNone(event)
 
 try:
@@ -375,6 +378,121 @@ class TestIncrementalBackup(iotests.QMPTestCase):
 self.check_backups()
 
 
+def test_transaction_failure(self):
+'''Test: Verify backups made from a transaction that partially fails.
+
+Add a second drive with its own unique pattern, and add a bitmap to each
+drive. Use blkdebug to interfere with the backup on just one drive and
+attempt to create a coherent incremental backup across both drives.
+
+verify a failure in one but not both, then delete the failed stubs and
+re-run the same transaction.
+
+verify that both incrementals are created successfully.
+'''
+
+# Create a second drive, with pattern:
+drive1 = self.add_node('drive1')
+self.img_create(drive1['file'], drive1['fmt'])
+io_write_patterns(drive1['file'], (('0x14', 0, 512),
+   ('0x5d', '1M', '32k'),
+   ('0xcd', '32M', '124k')))
+
+# Create a blkdebug interface to this img as 'drive1'
+result = self.vm.qmp('blockdev-add', options={
+'id': drive1['id'],
+'driver': drive1['fmt'],
+'file': {
+'driver': 'blkdebug',
+'image': {
+'driver': 'file',
+'filename': drive1['file']
+},
+'set-state': [{
+'event': 'flush_to_disk',
+'state': 1,
+'new_state': 2
+}],
+'inject-error': [{
+'event': 'read_aio',
+'errno': 5,
+'state': 2,
+'immediately': False,
+'once': True
+}],
+}
+})
+self.assert_qmp(result, 'return', {})
+
+# Create bitmaps and full backups for both drives
+drive0 = self.drives[0]
+dr0bm0 = self.add_bitmap('bitmap0', drive0)
+dr1bm0 = self.add_bitmap('bitmap0', drive1)
+self.create_anchor_backup(drive0)
+self.create_anchor_backup(drive1)
+self.assert_no_active_block_jobs()
+self.assertFalse(self.vm.get_qmp_events(wait=False))
+
+# Emulate some writes
+self.hmp_io_writes(drive0['id'], (('0xab', 0, 512),
+  ('0xfe', '16M', '256k'),
+  ('0x64', '32736k', '64k')))
+self.hmp_io_writes(drive1['id'], (('0xba', 0, 512),
+  ('0xef', '16M', '256k'),
+  ('0x46', '32736k', '64k')))
+
+# Create incremental backup targets
+target0 = self.prepare_backup(dr0bm0)
+target1 = self.prepare_backup(dr1bm0)
+
+# Ask for a new incremental backup per-each drive,
+# expecting drive1's backup to fail:
+transaction = [
+transaction_drive_backup(drive0['id'], target0, 
sync='dirty-bitmap',
+ format=drive0['fmt'], mode='existing',
+ bitmap=dr0bm0.name),
+transaction_drive_backup(drive1['id'], target1, 
sync='dirty-bitmap',
+ format=drive1['fmt'], mode='existing',
+ bitmap=dr1bm0.name),
+]
+result = self.vm.qmp('transaction', actions=transaction)
+self.assert_qmp(result, 'return', {})
+
+# Observe that drive0's backup completes, but drive1's does not.
+# Consume drive1's error and ensure all pending actions are completed.
+self.assertTrue

[Qemu-block] [PATCH v4 11/11] qmp-commands.hx: Update the supported 'transaction' operations

2015-05-11 Thread John Snow

From: Kashyap Chamarthy kcham...@redhat.com

Although the canonical source of reference for QMP commands is
qapi-schema.json, for consistency's sake, update qmp-commands.hx to
state the list of supported transactionable operations, namely:

drive-backup
blockdev-backup
blockdev-snapshot-internal-sync
abort
block-dirty-bitmap-add
block-dirty-bitmap-clear

Signed-off-by: Kashyap Chamarthy kcham...@redhat.com
Reviewed-by: Eric Blake ebl...@redhat.com
Signed-off-by: John Snow js...@redhat.com
---
 qmp-commands.hx | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/qmp-commands.hx b/qmp-commands.hx
index 7506774..363126a 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -1238,11 +1238,14 @@ SQMP
 transaction
 ---
 
-Atomically operate on one or more block devices.  The only supported operations
-for now are drive-backup, internal and external snapshotting.  A list of
-dictionaries is accepted, that contains the actions to be performed.
-If there is any failure performing any of the operations, all operations
-for the group are abandoned.
+Atomically operate on one or more block devices.  Operations that are
+currently supported: drive-backup, blockdev-backup,
+blockdev-snapshot-sync, blockdev-snapshot-internal-sync, abort,
+block-dirty-bitmap-add, block-dirty-bitmap-clear (refer to the
+qemu/qapi-schema.json file for minimum required QEMU versions for these
+operations).  A list of dictionaries is accepted, that contains the
+actions to be performed.  If there is any failure performing any of the
+operations, all operations for the group are abandoned.
 
 For external snapshots, the dictionary contains the device, the file to use for
 the new snapshot, and the format.  The default format, if not specified, is
-- 
2.1.0

[Qemu-block] [PATCH v4 08/11] qmp: Add an implementation wrapper for qmp_drive_backup

2015-05-11 Thread John Snow

We'd like to be able to specify the callback given to backup_start
manually in the case of transactions, so split apart qmp_drive_backup
into an implementation and a wrapper.

Switch drive_backup_prepare to use the new wrapper, but don't overload
the callback and closure yet.

Signed-off-by: John Snow js...@redhat.com
Reviewed-by: Max Reitz mre...@redhat.com
---
 blockdev.c | 78 +++---
 1 file changed, 59 insertions(+), 19 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index 27db1b4..f391e18 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1794,6 +1794,19 @@ typedef struct DriveBackupState {
     BlockJob *job;
 } DriveBackupState;
 
+static void do_drive_backup(const char *device, const char *target,
+                            bool has_format, const char *format,
+                            enum MirrorSyncMode sync,
+                            bool has_mode, enum NewImageMode mode,
+                            bool has_speed, int64_t speed,
+                            bool has_bitmap, const char *bitmap,
+                            bool has_on_source_error,
+                            BlockdevOnError on_source_error,
+                            bool has_on_target_error,
+                            BlockdevOnError on_target_error,
+                            BlockCompletionFunc *cb, void *opaque,
+                            Error **errp);
+
 static void drive_backup_prepare(BlkActionState *common, Error **errp)
 {
     DriveBackupState *state = DO_UPCAST(DriveBackupState, common, common);
@@ -1816,15 +1829,16 @@ static void drive_backup_prepare(BlkActionState *common, Error **errp)
     state->aio_context = bdrv_get_aio_context(bs);
     aio_context_acquire(state->aio_context);
 
-    qmp_drive_backup(backup->device, backup->target,
-                     backup->has_format, backup->format,
-                     backup->sync,
-                     backup->has_mode, backup->mode,
-                     backup->has_speed, backup->speed,
-                     backup->has_bitmap, backup->bitmap,
-                     backup->has_on_source_error, backup->on_source_error,
-                     backup->has_on_target_error, backup->on_target_error,
-                     &local_err);
+    do_drive_backup(backup->device, backup->target,
+                    backup->has_format, backup->format,
+                    backup->sync,
+                    backup->has_mode, backup->mode,
+                    backup->has_speed, backup->speed,
+                    backup->has_bitmap, backup->bitmap,
+                    backup->has_on_source_error, backup->on_source_error,
+                    backup->has_on_target_error, backup->on_target_error,
+                    NULL, NULL,
+                    &local_err);
     if (local_err) {
         error_propagate(errp, local_err);
         return;
@@ -2778,15 +2792,18 @@ out:
     aio_context_release(aio_context);
 }
 
-void qmp_drive_backup(const char *device, const char *target,
-                      bool has_format, const char *format,
-                      enum MirrorSyncMode sync,
-                      bool has_mode, enum NewImageMode mode,
-                      bool has_speed, int64_t speed,
-                      bool has_bitmap, const char *bitmap,
-                      bool has_on_source_error, BlockdevOnError on_source_error,
-                      bool has_on_target_error, BlockdevOnError on_target_error,
-                      Error **errp)
+static void do_drive_backup(const char *device, const char *target,
+                            bool has_format, const char *format,
+                            enum MirrorSyncMode sync,
+                            bool has_mode, enum NewImageMode mode,
+                            bool has_speed, int64_t speed,
+                            bool has_bitmap, const char *bitmap,
+                            bool has_on_source_error,
+                            BlockdevOnError on_source_error,
+                            bool has_on_target_error,
+                            BlockdevOnError on_target_error,
+                            BlockCompletionFunc *cb, void *opaque,
+                            Error **errp)
 {
     BlockBackend *blk;
     BlockDriverState *bs;
@@ -2900,9 +2917,16 @@ void qmp_drive_backup(const char *device, const char *target,
         }
     }
 
+    /* If we are not supplied with callback override info, use our defaults */
+    if (cb == NULL) {
+        cb = block_job_cb;
+    }
+    if (opaque == NULL) {
+        opaque = bs;
+    }
     backup_start(bs, target_bs, speed, sync, bmap,
                  on_source_error, on_target_error,
-                 block_job_cb, bs, &local_err);
+                 cb, opaque, &local_err);
     if (local_err != NULL) {
         bdrv_unref(target_bs);
         error_propagate(errp, local_err);
@@ -2913,6 +2937,22 @@ out:
     aio_context_release(aio_context);
 }
 
+void qmp_drive_backup(const char *device, const char *target

Re: [Qemu-block] [Qemu-devel] [PATCH v3 01/10] qapi: Add transaction support to block-dirty-bitmap operations

2015-05-07 Thread John Snow



On 05/07/2015 10:54 AM, Stefan Hajnoczi wrote:
> On Wed, Apr 22, 2015 at 08:04:44PM -0400, John Snow wrote:
>> +static void block_dirty_bitmap_clear_prepare(BlkTransactionState *common,
>> +                                             Error **errp)
>> +{
>> +    BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
>> +                                             common, common);
>> +    BlockDirtyBitmap *action;
>> +
>> +    action = common->action->block_dirty_bitmap_clear;
>> +    state->bitmap = block_dirty_bitmap_lookup(action->node,
>> +                                              action->name,
>> +                                              &state->bs,
>> +                                              &state->aio_context,
>> +                                              errp);
>> +    if (!state->bitmap) {
>> +        return;
>> +    }
>> +
>> +    if (bdrv_dirty_bitmap_frozen(state->bitmap)) {
>> +        error_setg(errp, "Cannot modify a frozen bitmap");
>> +        return;
>> +    } else if (!bdrv_dirty_bitmap_enabled(state->bitmap)) {
>> +        error_setg(errp, "Cannot clear a disabled bitmap");
>> +        return;
>> +    }
>> +
>> +    /* AioContext is released in .clean() */
>> +}
>> +
>> +static void block_dirty_bitmap_clear_commit(BlkTransactionState *common)
>> +{
>> +    BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
>> +                                             common, common);
>> +    bdrv_clear_dirty_bitmap(state->bitmap);
>> +}
>
> These semantics don't work in this example:
>
> [block-dirty-bitmap-clear, drive-backup]
>
> Since drive-backup starts the blockjob in .prepare() but
> block-dirty-bitmap-clear only clears the bitmap in .commit() the
> order is wrong.
>
> .prepare() has to do something non-destructive, like stashing away
> the HBitmap and replacing it with an empty one.  Then .commit() can
> discard the old bitmap while .abort() can move the old bitmap back
> to undo the operation.
>
> Stefan

Hmm, that's sort of gross. That means that any transactional command
*ever* destined to be used with drive-backup in any conceivable way
needs to move a lot more of its action forward to .prepare().

That sort of defeats the premise of .prepare() and .commit(), no? And
all because drive-backup jumped the gun.

That's going to get hard to maintain as we add more transactions.

--js
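
For reference, the non-destructive .prepare() that Stefan describes above can be sketched as a toy model in Python. This is only an illustration under stated assumptions, not QEMU code: ToyBitmap, ClearAction, run_transaction and the fail_at knob are hypothetical names, the bitmap is modeled as a plain set of dirty sectors, and guest writes that land between prepare and abort are not modeled.

```python
class ToyBitmap:
    """Stand-in for a dirty bitmap: a set of dirty sector numbers."""

    def __init__(self, dirty=()):
        self.dirty = set(dirty)


class ClearAction:
    """Transactional 'clear' that does its visible work in prepare()."""

    def __init__(self, bitmap):
        self.bitmap = bitmap
        self.stash = None

    def prepare(self):
        # Non-destructive: keep the old contents, install an empty bitmap.
        self.stash = self.bitmap.dirty
        self.bitmap.dirty = set()

    def commit(self):
        # The transaction succeeded; the old contents can be discarded.
        self.stash = None

    def abort(self):
        # Undo: move the old contents back.
        if self.stash is not None:
            self.bitmap.dirty = self.stash
            self.stash = None


def run_transaction(actions, fail_at=None):
    """Prepare every action in order; abort the prepared ones on failure.

    fail_at is a hypothetical knob that simulates a failing prepare().
    """
    prepared = []
    for index, action in enumerate(actions):
        if index == fail_at:
            for done in reversed(prepared):
                done.abort()
            return False
        action.prepare()
        prepared.append(action)
    for action in prepared:
        action.commit()
    return True
```

With this shape, a job started by a later action's prepare() already sees the cleared bitmap, and a failure anywhere in the group still restores the old contents.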

Re: [Qemu-block] [Qemu-devel] [PATCH 5/5] tests: add test case for encrypted qcow2 read/write

2015-05-12 Thread John Snow



On 05/12/2015 03:52 PM, Eric Blake wrote:
> On 05/12/2015 01:06 PM, John Snow wrote:
>>> tests/qemu-iotests/131     | 69 ++
>>> tests/qemu-iotests/131.out | 46 +++
>>
>> Fam Zheng already has a patch on-list that uses test 131, and I
>> think his patch was submitted first.
>>
>> (Unless we want to play the "Who gets merged first?" game.)
>
> That's the sort of conflict that I expect a maintainer can clean
> up, if there is no other reason for a respin (although it is not
> always easy to coax git into understanding that a patch would be
> valid if the file is renamed).

Sure, whoever fixes it. Just pointing it out.

--js

Re: [Qemu-block] [PATCH v4 08/11] qmp: Add an implementation wrapper for qmp_drive_backup

2015-05-18 Thread John Snow



On 05/18/2015 10:42 AM, Stefan Hajnoczi wrote:
> On Mon, May 11, 2015 at 07:04:23PM -0400, John Snow wrote:
>> @@ -2900,9 +2917,16 @@ void qmp_drive_backup(const char *device, const char *target,
>>          }
>>      }
>>
>> +    /* If we are not supplied with callback override info, use our defaults */
>> +    if (cb == NULL) {
>> +        cb = block_job_cb;
>> +    }
>> +    if (opaque == NULL) {
>> +        opaque = bs;
>> +    }
>
> Why assign opaque separately, it raises the question what happens
> if a custom cb is given but the caller really wants opaque to be
> NULL?
>
> The following might be clearer:
>
>     if (cb == NULL) {
>         cb = block_job_cb;
>         opaque = bs;
>     }

It just wasn't a consideration when I was writing it, since the
transaction system won't ever want to pass NULL here.

It's easy enough to fix, though.
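
To make the difference between the two forms concrete, here is a small Python model of the defaulting logic. It is a sketch only: pick_separately and pick_together are hypothetical helpers, and None stands in for NULL.

```python
def pick_separately(cb, opaque, default_cb, default_opaque):
    """Mirror of the posted hunk: each argument is defaulted on its own."""
    if cb is None:
        cb = default_cb
    if opaque is None:
        opaque = default_opaque
    return cb, opaque


def pick_together(cb, opaque, default_cb, default_opaque):
    """Mirror of the suggested form: opaque is only defaulted along with cb."""
    if cb is None:
        cb = default_cb
        opaque = default_opaque
    return cb, opaque
```

The two only differ when a custom callback is passed with a NULL opaque: the first form silently substitutes the default opaque, while the second passes NULL through.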

Re: [Qemu-block] [PATCH v4 09/11] block: drive_backup transaction callback support

2015-05-18 Thread John Snow



On 05/18/2015 11:35 AM, Stefan Hajnoczi wrote:
> On Mon, May 11, 2015 at 07:04:24PM -0400, John Snow wrote:
>> +static void drive_backup_cb(BlkActionState *common)
>> +{
>> +    BlkActionCallbackData *cb_data = common->cb_data;
>> +    BlockDriverState *bs = cb_data->opaque;
>> +    DriveBackupState *state = DO_UPCAST(DriveBackupState, common, common);
>> +
>> +    assert(state->bs == bs);
>> +    if (bs->job) {
>> +        assert(state->job == bs->job);
>> +    }
>
> What is the purpose of the if statement?
>
> Why is it not okay for a new job to have started?

Hmm, maybe it's fine -- It was just my thought that it probably
/shouldn't/ occur under normal circumstances.

I think my assumption was that we want to impose an ordering that job
cleanup occurs before another job launches, in general.

I think, though, that you wanted to start allowing non-conflicting
jobs to run concurrently, so I'll just look over this series again to
make sure it's okay for cleanup to happen after another job
starts ...

...Provided the second job does not fiddle with bitmaps, of course. We
should clean those up before another bitmap job starts, definitely.

>> +
>> +    state->aio_context = bdrv_get_aio_context(bs);
>> +    aio_context_acquire(state->aio_context);
>
> The bs->job access above should be protected by
> aio_context_acquire().

Thanks,
--js

[Qemu-block] [PATCH v5 10/21] qmp: Add support of dirty-bitmap sync mode for drive-backup

2015-04-08 Thread John Snow

For dirty-bitmap sync mode, the block job will iterate through the
given dirty bitmap to decide if a sector needs backup (backup all the
dirty clusters and skip clean ones), just as allocation conditions of
top sync mode.
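
As a rough illustration of the selection described above (not the block job itself), the following Python sketch maps dirty sectors to the backup clusters that need copying and counts the bytes that are skipped. The constants and the incremental_plan helper are assumptions for the example: a 64 KiB backup cluster, 512-byte sectors, and a bitmap granularity no coarser than the cluster size.

```python
CLUSTER_SIZE = 64 * 1024                      # assumed backup cluster size, bytes
SECTOR_SIZE = 512                             # assumed sector size, bytes
SECTORS_PER_CLUSTER = CLUSTER_SIZE // SECTOR_SIZE


def incremental_plan(dirty_sectors, total_bytes):
    """Return (clusters_to_copy, skipped_bytes) for one incremental pass.

    dirty_sectors: iterable of dirty sector numbers taken from the bitmap.
    total_bytes:   length of the device being backed up.
    """
    # Round up so a partial trailing cluster still counts as one cluster.
    total_clusters = -(-total_bytes // CLUSTER_SIZE)
    # Every cluster holding at least one dirty sector must be copied once.
    to_copy = sorted({sector // SECTORS_PER_CLUSTER for sector in dirty_sectors})
    # Clean clusters are skipped, but still count toward reported progress.
    skipped_bytes = (total_clusters - len(to_copy)) * CLUSTER_SIZE
    return to_copy, skipped_bytes
```

For example, dirty sectors 0, 1, 128 and 300 on a four-cluster device select clusters 0, 1 and 2, and one cluster's worth of bytes is accounted for as skipped.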

Signed-off-by: Fam Zheng f...@redhat.com
Signed-off-by: John Snow js...@redhat.com
---
 block.c   |   9 +++
 block/backup.c| 156 +++---
 block/mirror.c|   4 ++
 blockdev.c|  18 +-
 hmp.c |   3 +-
 include/block/block.h |   1 +
 include/block/block_int.h |   2 +
 qapi/block-core.json  |  13 ++--
 qmp-commands.hx   |   7 ++-
 9 files changed, 180 insertions(+), 33 deletions(-)

diff --git a/block.c b/block.c
index 9d30379..2367311 100644
--- a/block.c
+++ b/block.c
@@ -5717,6 +5717,15 @@ static void bdrv_reset_dirty(BlockDriverState *bs, int64_t cur_sector,
     }
 }
 
+/**
+ * Advance an HBitmapIter to an arbitrary offset.
+ */
+void bdrv_set_dirty_iter(HBitmapIter *hbi, int64_t offset)
+{
+    assert(hbi->hb);
+    hbitmap_iter_init(hbi, hbi->hb, offset);
+}
+
 int64_t bdrv_get_dirty_count(BlockDriverState *bs, BdrvDirtyBitmap *bitmap)
 {
     return hbitmap_count(bitmap->bitmap);
diff --git a/block/backup.c b/block/backup.c
index 1c535b1..8513917 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -37,6 +37,8 @@ typedef struct CowRequest {
 typedef struct BackupBlockJob {
     BlockJob common;
     BlockDriverState *target;
+    /* bitmap for sync=dirty-bitmap */
+    BdrvDirtyBitmap *sync_bitmap;
     MirrorSyncMode sync_mode;
     RateLimit limit;
     BlockdevOnError on_source_error;
@@ -242,6 +244,92 @@ static void backup_complete(BlockJob *job, void *opaque)
     g_free(data);
 }
 
+static bool coroutine_fn yield_and_check(BackupBlockJob *job)
+{
+    if (block_job_is_cancelled(&job->common)) {
+        return true;
+    }
+
+    /* we need to yield so that qemu_aio_flush() returns.
+     * (without, VM does not reboot)
+     */
+    if (job->common.speed) {
+        uint64_t delay_ns = ratelimit_calculate_delay(&job->limit,
+                                                      job->sectors_read);
+        job->sectors_read = 0;
+        block_job_sleep_ns(&job->common, QEMU_CLOCK_REALTIME, delay_ns);
+    } else {
+        block_job_sleep_ns(&job->common, QEMU_CLOCK_REALTIME, 0);
+    }
+
+    if (block_job_is_cancelled(&job->common)) {
+        return true;
+    }
+
+    return false;
+}
+
+static int coroutine_fn backup_run_incremental(BackupBlockJob *job)
+{
+    bool error_is_read;
+    int ret = 0;
+    int clusters_per_iter;
+    uint32_t granularity;
+    int64_t sector;
+    int64_t cluster;
+    int64_t end;
+    int64_t last_cluster = -1;
+    BlockDriverState *bs = job->common.bs;
+    HBitmapIter hbi;
+
+    granularity = bdrv_dirty_bitmap_granularity(job->sync_bitmap);
+    clusters_per_iter = MAX((granularity / BACKUP_CLUSTER_SIZE), 1);
+    bdrv_dirty_iter_init(bs, job->sync_bitmap, &hbi);
+
+    /* Find the next dirty sector(s) */
+    while ((sector = hbitmap_iter_next(&hbi)) != -1) {
+        cluster = sector / BACKUP_SECTORS_PER_CLUSTER;
+
+        /* Fake progress updates for any clusters we skipped */
+        if (cluster != last_cluster + 1) {
+            job->common.offset += ((cluster - last_cluster - 1) *
+                                   BACKUP_CLUSTER_SIZE);
+        }
+
+        for (end = cluster + clusters_per_iter; cluster < end; cluster++) {
+            if (yield_and_check(job)) {
+                return ret;
+            }
+
+            do {
+                ret = backup_do_cow(bs, cluster * BACKUP_SECTORS_PER_CLUSTER,
+                                    BACKUP_SECTORS_PER_CLUSTER, &error_is_read);
+                if ((ret < 0) &&
+                    backup_error_action(job, error_is_read, -ret) ==
+                    BLOCK_ERROR_ACTION_REPORT) {
+                    return ret;
+                }
+            } while (ret < 0);
+        }
+
+        /* If the bitmap granularity is smaller than the backup granularity,
+         * we need to advance the iterator pointer to the next cluster. */
+        if (granularity < BACKUP_CLUSTER_SIZE) {
+            bdrv_set_dirty_iter(&hbi, cluster * BACKUP_SECTORS_PER_CLUSTER);
+        }
+
+        last_cluster = cluster - 1;
+    }
+
+    /* Play some final catchup with the progress meter */
+    end = DIV_ROUND_UP(job->common.len, BACKUP_CLUSTER_SIZE);
+    if (last_cluster + 1 < end) {
+        job->common.offset += ((end - last_cluster - 1) * BACKUP_CLUSTER_SIZE);
+    }
+
+    return ret;
+}
+
 static void coroutine_fn backup_run(void *opaque)
 {
     BackupBlockJob *job = opaque;
@@ -259,8 +347,7 @@ static void coroutine_fn backup_run(void *opaque)
     qemu_co_rwlock_init(&job->flush_rwlock);
 
     start = 0;
-    end = DIV_ROUND_UP(job->common.len / BDRV_SECTOR_SIZE,
-                       BACKUP_SECTORS_PER_CLUSTER);
+    end = DIV_ROUND_UP(job->common.len, BACKUP_CLUSTER_SIZE

[Qemu-block] [PATCH v5 15/21] block: Resize bitmaps on bdrv_truncate

2015-04-08 Thread John Snow

Signed-off-by: John Snow js...@redhat.com
---
 block.c| 18 ++
 include/qemu/hbitmap.h | 10 ++
 util/hbitmap.c | 48 
 3 files changed, 76 insertions(+)

diff --git a/block.c b/block.c
index 16209a2..42839a0 100644
--- a/block.c
+++ b/block.c
@@ -113,6 +113,7 @@ static void bdrv_set_dirty(BlockDriverState *bs, int64_t cur_sector,
                            int nr_sectors);
 static void bdrv_reset_dirty(BlockDriverState *bs, int64_t cur_sector,
                              int nr_sectors);
+static void bdrv_dirty_bitmap_truncate(BlockDriverState *bs);
 /* If non-zero, use only whitelisted block drivers */
 static int use_bdrv_whitelist;
 
@@ -3583,6 +3584,7 @@ int bdrv_truncate(BlockDriverState *bs, int64_t offset)
     ret = drv->bdrv_truncate(bs, offset);
     if (ret == 0) {
         ret = refresh_total_sectors(bs, offset >> BDRV_SECTOR_BITS);
+        bdrv_dirty_bitmap_truncate(bs);
         if (bs->blk) {
             blk_dev_resize_cb(bs->blk);
         }
@@ -5593,6 +5595,22 @@ BdrvDirtyBitmap *bdrv_reclaim_dirty_bitmap(BlockDriverState *bs,
     return parent;
 }
 
+/**
+ * Truncates _all_ bitmaps attached to a BDS.
+ */
+static void bdrv_dirty_bitmap_truncate(BlockDriverState *bs)
+{
+    BdrvDirtyBitmap *bitmap;
+    uint64_t size = bdrv_nb_sectors(bs);
+
+    QLIST_FOREACH(bitmap, &bs->dirty_bitmaps, list) {
+        if (bdrv_dirty_bitmap_frozen(bitmap)) {
+            continue;
+        }
+        hbitmap_truncate(bitmap->bitmap, size);
+    }
+}
+
 void bdrv_release_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap)
 {
     BdrvDirtyBitmap *bm, *next;
diff --git a/include/qemu/hbitmap.h b/include/qemu/hbitmap.h
index c19c1cb..a75157e 100644
--- a/include/qemu/hbitmap.h
+++ b/include/qemu/hbitmap.h
@@ -65,6 +65,16 @@ struct HBitmapIter {
 HBitmap *hbitmap_alloc(uint64_t size, int granularity);
 
 /**
+ * hbitmap_truncate:
+ * @hb: The bitmap to change the size of.
+ * @size: The number of elements to change the bitmap to accommodate.
+ *
+ * truncate or grow an existing bitmap to accommodate a new number of elements.
+ * This may invalidate existing HBitmapIterators.
+ */
+void hbitmap_truncate(HBitmap *hb, uint64_t size);
+
+/**
  * hbitmap_merge:
  * @a: The bitmap to store the result in.
  * @b: The bitmap to merge into @a.
diff --git a/util/hbitmap.c b/util/hbitmap.c
index ba11fd3..1ad3bf3 100644
--- a/util/hbitmap.c
+++ b/util/hbitmap.c
@@ -400,6 +400,54 @@ HBitmap *hbitmap_alloc(uint64_t size, int granularity)
     return hb;
 }
 
+void hbitmap_truncate(HBitmap *hb, uint64_t size)
+{
+    bool shrink;
+    unsigned i;
+    uint64_t num_elements = size;
+    uint64_t old;
+
+    /* Size comes in as logical elements, adjust for granularity. */
+    size = (size + (1ULL << hb->granularity) - 1) >> hb->granularity;
+    assert(size <= ((uint64_t)1 << HBITMAP_LOG_MAX_SIZE));
+    shrink = size < hb->size;
+
+    /* bit sizes are identical; nothing to do. */
+    if (size == hb->size) {
+        return;
+    }
+
+    /* If we're losing bits, let's clear those bits before we invalidate all of
+     * our invariants. This helps keep the bitcount consistent, and will prevent
+     * us from carrying around garbage bits beyond the end of the map.
+     */
+    if (shrink) {
+        /* Don't clear partial granularity groups;
+         * start at the first full one. */
+        uint64_t start = QEMU_ALIGN_UP(num_elements, 1 << hb->granularity);
+        uint64_t fix_count = (hb->size << hb->granularity) - num_elements;
+
+        assert(fix_count);
+        hbitmap_reset(hb, start, fix_count);
+    }
+
+    hb->size = size;
+    for (i = HBITMAP_LEVELS; i-- > 0; ) {
+        size = MAX(BITS_TO_LONGS(size), 1);
+        if (hb->sizes[i] == size) {
+            break;
+        }
+        old = hb->sizes[i];
+        hb->sizes[i] = size;
+        hb->levels[i] = g_realloc(hb->levels[i], size * sizeof(unsigned long));
+        if (!shrink) {
+            memset(&hb->levels[i][old], 0x00,
+                   (size - old) * sizeof(*hb->levels[i]));
+        }
+    }
+}
+
+
 /**
  * Given HBitmaps A and B, let A := A (BITOR) B.
  * Bitmap B will not be modified.
-- 
2.1.0

[Qemu-block] [PATCH v5 14/21] block: Ensure consistent bitmap function prototypes

2015-04-08 Thread John Snow

We often don't need the BlockDriverState for functions
that operate on bitmaps. Remove it.

Signed-off-by: John Snow js...@redhat.com
Reviewed-by: Max Reitz mre...@redhat.com
Reviewed-by: Stefan Hajnoczi stefa...@redhat.com
---
 block.c   | 13 ++---
 block/backup.c|  2 +-
 block/mirror.c| 26 ++
 blockdev.c|  2 +-
 include/block/block.h | 11 +--
 migration/block.c |  7 +++
 6 files changed, 26 insertions(+), 35 deletions(-)

diff --git a/block.c b/block.c
index 843b0bf..16209a2 100644
--- a/block.c
+++ b/block.c
@@ -5460,7 +5460,7 @@ BdrvDirtyBitmap *bdrv_find_dirty_bitmap(BlockDriverState *bs, const char *name)
     return NULL;
 }
 
-void bdrv_dirty_bitmap_make_anon(BlockDriverState *bs, BdrvDirtyBitmap *bitmap)
+void bdrv_dirty_bitmap_make_anon(BdrvDirtyBitmap *bitmap)
 {
     assert(!bdrv_dirty_bitmap_frozen(bitmap));
     g_free(bitmap->name);
@@ -5629,7 +5629,7 @@ BlockDirtyInfoList *bdrv_query_dirty_bitmaps(BlockDriverState *bs)
     QLIST_FOREACH(bm, &bs->dirty_bitmaps, list) {
         BlockDirtyInfo *info = g_new0(BlockDirtyInfo, 1);
         BlockDirtyInfoList *entry = g_new0(BlockDirtyInfoList, 1);
-        info->count = bdrv_get_dirty_count(bs, bm);
+        info->count = bdrv_get_dirty_count(bm);
         info->granularity = bdrv_dirty_bitmap_granularity(bm);
         info->has_name = !!bm->name;
         info->name = g_strdup(bm->name);
@@ -5676,20 +5676,19 @@ uint32_t bdrv_dirty_bitmap_granularity(BdrvDirtyBitmap *bitmap)
     return BDRV_SECTOR_SIZE << hbitmap_granularity(bitmap->bitmap);
 }
 
-void bdrv_dirty_iter_init(BlockDriverState *bs,
-                          BdrvDirtyBitmap *bitmap, HBitmapIter *hbi)
+void bdrv_dirty_iter_init(BdrvDirtyBitmap *bitmap, HBitmapIter *hbi)
 {
     hbitmap_iter_init(hbi, bitmap->bitmap, 0);
 }
 
-void bdrv_set_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap,
+void bdrv_set_dirty_bitmap(BdrvDirtyBitmap *bitmap,
                            int64_t cur_sector, int nr_sectors)
 {
     assert(bdrv_dirty_bitmap_enabled(bitmap));
     hbitmap_set(bitmap->bitmap, cur_sector, nr_sectors);
 }
 
-void bdrv_reset_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap,
+void bdrv_reset_dirty_bitmap(BdrvDirtyBitmap *bitmap,
                              int64_t cur_sector, int nr_sectors)
 {
     assert(bdrv_dirty_bitmap_enabled(bitmap));
@@ -5735,7 +5734,7 @@ void bdrv_set_dirty_iter(HBitmapIter *hbi, int64_t offset)
     hbitmap_iter_init(hbi, hbi->hb, offset);
 }
 
-int64_t bdrv_get_dirty_count(BlockDriverState *bs, BdrvDirtyBitmap *bitmap)
+int64_t bdrv_get_dirty_count(BdrvDirtyBitmap *bitmap)
 {
     return hbitmap_count(bitmap->bitmap);
 }
diff --git a/block/backup.c b/block/backup.c
index 8513917..cdd41c5 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -284,7 +284,7 @@ static int coroutine_fn backup_run_incremental(BackupBlockJob *job)
 
     granularity = bdrv_dirty_bitmap_granularity(job->sync_bitmap);
     clusters_per_iter = MAX((granularity / BACKUP_CLUSTER_SIZE), 1);
-    bdrv_dirty_iter_init(bs, job->sync_bitmap, &hbi);
+    bdrv_dirty_iter_init(job->sync_bitmap, &hbi);
 
     /* Find the next dirty sector(s) */
     while ((sector = hbitmap_iter_next(&hbi)) != -1) {
diff --git a/block/mirror.c b/block/mirror.c
index f89eccf..dcd6f65 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -125,11 +125,9 @@ static void mirror_write_complete(void *opaque, int ret)
     MirrorOp *op = opaque;
     MirrorBlockJob *s = op->s;
     if (ret < 0) {
-        BlockDriverState *source = s->common.bs;
         BlockErrorAction action;
 
-        bdrv_set_dirty_bitmap(source, s->dirty_bitmap, op->sector_num,
-                              op->nb_sectors);
+        bdrv_set_dirty_bitmap(s->dirty_bitmap, op->sector_num, op->nb_sectors);
         action = mirror_error_action(s, false, -ret);
         if (action == BLOCK_ERROR_ACTION_REPORT && s->ret >= 0) {
             s->ret = ret;
@@ -143,11 +141,9 @@ static void mirror_read_complete(void *opaque, int ret)
     MirrorOp *op = opaque;
     MirrorBlockJob *s = op->s;
     if (ret < 0) {
-        BlockDriverState *source = s->common.bs;
         BlockErrorAction action;
 
-        bdrv_set_dirty_bitmap(source, s->dirty_bitmap, op->sector_num,
-                              op->nb_sectors);
+        bdrv_set_dirty_bitmap(s->dirty_bitmap, op->sector_num, op->nb_sectors);
         action = mirror_error_action(s, true, -ret);
         if (action == BLOCK_ERROR_ACTION_REPORT && s->ret >= 0) {
             s->ret = ret;
@@ -170,10 +166,9 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
 
     s->sector_num = hbitmap_iter_next(&s->hbi);
     if (s->sector_num < 0) {
-        bdrv_dirty_iter_init(source, s->dirty_bitmap, &s->hbi);
+        bdrv_dirty_iter_init(s->dirty_bitmap, &s->hbi);
         s->sector_num = hbitmap_iter_next(&s->hbi);
-        trace_mirror_restart_iter(s,
-                                  bdrv_get_dirty_count(source, s->dirty_bitmap

[Qemu-block] [PATCH v5 01/21] docs: incremental backup documentation

2015-04-08 Thread John Snow

Reviewed-by: Max Reitz mre...@redhat.com
Signed-off-by: John Snow js...@redhat.com
---
 docs/bitmaps.md | 311 
 1 file changed, 311 insertions(+)
 create mode 100644 docs/bitmaps.md

diff --git a/docs/bitmaps.md b/docs/bitmaps.md
new file mode 100644
index 000..ad8c33b
--- /dev/null
+++ b/docs/bitmaps.md
@@ -0,0 +1,311 @@
+# Dirty Bitmaps and Incremental Backup
+
+* Dirty Bitmaps are objects that track which data needs to be backed up for the
+  next incremental backup.
+
+* Dirty bitmaps can be created at any time and attached to any node
+  (not just complete drives.)
+
+## Dirty Bitmap Names
+
+* A dirty bitmap's name is unique to the node, but bitmaps attached to different
+nodes can share the same name.
+
+## Bitmap Modes
+
+* A Bitmap can be frozen, which means that it is currently in-use by a backup
+operation and cannot be deleted, renamed, written to, reset,
+etc.
+
+## Basic QMP Usage
+
+### Supported Commands ###
+
+* block-dirty-bitmap-add
+* block-dirty-bitmap-remove
+* block-dirty-bitmap-clear
+
+### Creation
+
+* To create a new bitmap, enabled, on the drive with id=drive0:
+
+```json
+{ "execute": "block-dirty-bitmap-add",
+  "arguments": {
+    "node": "drive0",
+    "name": "bitmap0"
+  }
+}
+```
+
+* This bitmap will have a default granularity that matches the cluster size of
+its associated drive, if available, clamped to between [4KiB, 64KiB].
+The current default for qcow2 is 64KiB.
+
+* To create a new bitmap that tracks changes in 32KiB segments:
+
+```json
+{ "execute": "block-dirty-bitmap-add",
+  "arguments": {
+    "node": "drive0",
+    "name": "bitmap0",
+    "granularity": 32768
+  }
+}
+```
+
+### Deletion
+
+* Can be performed on a disabled bitmap, but not a frozen one.
+
+* Because bitmaps are only unique to the node to which they are attached,
+you must specify the node/drive name here, too.
+
+```json
+{ "execute": "block-dirty-bitmap-remove",
+  "arguments": {
+    "node": "drive0",
+    "name": "bitmap0"
+  }
+}
+```
+
+### Resetting
+
+* Resetting a bitmap will clear all information it holds.
+* An incremental backup created from an empty bitmap will copy no data,
+as if nothing has changed.
+
+```json
+{ "execute": "block-dirty-bitmap-clear",
+  "arguments": {
+    "node": "drive0",
+    "name": "bitmap0"
+  }
+}
+```
+
+## Transactions (Not yet implemented)
+
+* Transactional commands are forthcoming in a future version,
+  and are not yet available for use. This section serves as
+  documentation of intent for their design and usage.
+
+### Justification
+Bitmaps can be safely modified when the VM is paused or halted by using
+the basic QMP commands. For instance, you might perform the following actions:
+
+1. Boot the VM in a paused state.
+2. Create a full drive backup of drive0.
+3. Create a new bitmap attached to drive0.
+4. Resume execution of the VM.
+5. Incremental backups are ready to be created.
+
+At this point, the bitmap and drive backup would be correctly in sync,
+and incremental backups made from this point forward would be correctly aligned
+to the full drive backup.
+
+This is not particularly useful if we decide we want to start incremental
+backups after the VM has been running for a while, for which we will need to
+perform actions such as the following:
+
+1. Boot the VM and begin execution.
+2. Using a single transaction, perform the following operations:
+* Create bitmap0.
+* Create a full drive backup of drive0.
+3. Incremental backups are now ready to be created.
+
+### Supported Bitmap Transactions
+
+* block-dirty-bitmap-add
+* block-dirty-bitmap-clear
+
+The usages are identical to their respective QMP commands, but see below
+for examples.
+
+### Example: New Incremental Backup
+
+As outlined in the justification, perhaps we want to create a new incremental
+backup chain attached to a drive.
+
+```json
+{ "execute": "transaction",
+  "arguments": {
+    "actions": [
+      {"type": "block-dirty-bitmap-add",
+       "data": {"node": "drive0", "name": "bitmap0"} },
+      {"type": "drive-backup",
+       "data": {"device": "drive0", "target": "/path/to/full_backup.img",
+                "sync": "full", "format": "qcow2"} }
+    ]
+  }
+}
+```
+
+### Example: New Incremental Backup Anchor Point
+
+Maybe we just want to create a new full backup with an existing bitmap and
+want to reset the bitmap to track the new chain.
+
+```json
+{ execute: transaction,
+  arguments: {
+actions: [
+  {type: block-dirty-bitmap-clear,
+   data: {node: drive0, name: bitmap0} },
+  {type: drive-backup,
+   data: {device: drive0, target: /path/to/new_full_backup.img,
+sync: full, format: qcow2} }
+]
+  }
+}
+```
+
+## Incremental Backups
+
+The star of the show.
+
+**Nota Bene!** Only incremental backups of entire drives are supported for now.
+So despite the fact that you can attach a bitmap to any arbitrary node, they are
+only currently useful when attached to the root node. This is because
+drive-backup only supports drives/devices instead

[Qemu-block] [PATCH v5 08/21] block: Add bitmap disabled status

2015-04-08 Thread John Snow

Add a status indicating the enabled/disabled state of the bitmap.
A bitmap is by default enabled, but you can lock the bitmap into
a read-only state by setting disabled = true.

A previous version of this patch added a QMP interface for changing
the state of the bitmap, but it has since been removed for now until
a use case emerges where this state must be revealed to the user.

The disabled state WILL be used internally for bitmap migration and
bitmap persistence.
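
The intended behaviour can be pictured with a small Python model. This is a sketch under stated assumptions, not the QEMU implementation: ToyDirtyBitmap and set_dirty are hypothetical names, and the bitmap is modeled as a set of dirty sector numbers.

```python
class ToyDirtyBitmap:
    """Toy bitmap with the enabled/disabled status described above."""

    def __init__(self, name):
        self.name = name
        self.disabled = False   # bitmaps start out enabled
        self.dirty = set()


def set_dirty(bitmaps, sector):
    """Model of the node-wide write path: disabled bitmaps are skipped."""
    for bitmap in bitmaps:
        if bitmap.disabled:
            continue
        bitmap.dirty.add(sector)
```

A disabled bitmap therefore keeps exactly the contents it had when it was disabled, while enabled bitmaps on the same node keep tracking writes.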

Signed-off-by: Fam Zheng f...@redhat.com
Signed-off-by: John Snow js...@redhat.com
Reviewed-by: Max Reitz mre...@redhat.com
Reviewed-by: Stefan Hajnoczi stefa...@redhat.com
---
 block.c   | 25 +
 include/block/block.h |  3 +++
 2 files changed, 28 insertions(+)

diff --git a/block.c b/block.c
index 41c5a67..db742a9 100644
--- a/block.c
+++ b/block.c
@@ -54,6 +54,7 @@
 struct BdrvDirtyBitmap {
 HBitmap *bitmap;
 char *name;
+bool disabled;
 QLIST_ENTRY(BdrvDirtyBitmap) list;
 };
 
@@ -5481,10 +5482,16 @@ BdrvDirtyBitmap 
*bdrv_create_dirty_bitmap(BlockDriverState *bs,
 bitmap = g_new0(BdrvDirtyBitmap, 1);
 bitmap-bitmap = hbitmap_alloc(bitmap_size, ffs(sector_granularity) - 1);
 bitmap-name = g_strdup(name);
+bitmap-disabled = false;
 QLIST_INSERT_HEAD(bs-dirty_bitmaps, bitmap, list);
 return bitmap;
 }
 
+bool bdrv_dirty_bitmap_enabled(BdrvDirtyBitmap *bitmap)
+{
+return !bitmap-disabled;
+}
+
 void bdrv_release_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap)
 {
 BdrvDirtyBitmap *bm, *next;
@@ -5499,6 +5506,16 @@ void bdrv_release_dirty_bitmap(BlockDriverState *bs, 
BdrvDirtyBitmap *bitmap)
 }
 }
 
+void bdrv_disable_dirty_bitmap(BdrvDirtyBitmap *bitmap)
+{
+bitmap-disabled = true;
+}
+
+void bdrv_enable_dirty_bitmap(BdrvDirtyBitmap *bitmap)
+{
+bitmap-disabled = false;
+}
+
 BlockDirtyInfoList *bdrv_query_dirty_bitmaps(BlockDriverState *bs)
 {
 BdrvDirtyBitmap *bm;
@@ -5563,12 +5580,14 @@ void bdrv_dirty_iter_init(BlockDriverState *bs,
 void bdrv_set_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap,
                            int64_t cur_sector, int nr_sectors)
 {
+    assert(bdrv_dirty_bitmap_enabled(bitmap));
     hbitmap_set(bitmap->bitmap, cur_sector, nr_sectors);
 }
 
 void bdrv_reset_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap,
                              int64_t cur_sector, int nr_sectors)
 {
+    assert(bdrv_dirty_bitmap_enabled(bitmap));
     hbitmap_reset(bitmap->bitmap, cur_sector, nr_sectors);
 }
 
@@ -5577,6 +5596,9 @@ static void bdrv_set_dirty(BlockDriverState *bs, int64_t cur_sector,
 {
     BdrvDirtyBitmap *bitmap;
     QLIST_FOREACH(bitmap, &bs->dirty_bitmaps, list) {
+        if (!bdrv_dirty_bitmap_enabled(bitmap)) {
+            continue;
+        }
         hbitmap_set(bitmap->bitmap, cur_sector, nr_sectors);
     }
 }
@@ -5586,6 +5608,9 @@ static void bdrv_reset_dirty(BlockDriverState *bs, int64_t cur_sector,
 {
     BdrvDirtyBitmap *bitmap;
     QLIST_FOREACH(bitmap, &bs->dirty_bitmaps, list) {
+        if (!bdrv_dirty_bitmap_enabled(bitmap)) {
+            continue;
+        }
         hbitmap_reset(bitmap->bitmap, cur_sector, nr_sectors);
     }
 }
diff --git a/include/block/block.h b/include/block/block.h
index 493b7c5..029a8a7 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -457,9 +457,12 @@ BdrvDirtyBitmap *bdrv_find_dirty_bitmap(BlockDriverState *bs,
                                         const char *name);
 void bdrv_dirty_bitmap_make_anon(BlockDriverState *bs, BdrvDirtyBitmap *bitmap);
 void bdrv_release_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap);
+void bdrv_disable_dirty_bitmap(BdrvDirtyBitmap *bitmap);
+void bdrv_enable_dirty_bitmap(BdrvDirtyBitmap *bitmap);
 BlockDirtyInfoList *bdrv_query_dirty_bitmaps(BlockDriverState *bs);
 uint32_t bdrv_get_default_bitmap_granularity(BlockDriverState *bs);
 uint32_t bdrv_dirty_bitmap_granularity(BdrvDirtyBitmap *bitmap);
+bool bdrv_dirty_bitmap_enabled(BdrvDirtyBitmap *bitmap);
 int bdrv_get_dirty(BlockDriverState *bs, BdrvDirtyBitmap *bitmap, int64_t sector);
 void bdrv_set_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap,
                            int64_t cur_sector, int nr_sectors);
-- 
2.1.0

Re: [Qemu-block] [Qemu-devel] [PATCH v5 10/21] qmp: Add support of dirty-bitmap sync mode for drive-backup

2015-04-17 Thread John Snow




On 04/17/2015 09:17 AM, Max Reitz wrote:

On 09.04.2015 00:19, John Snow wrote:

For dirty-bitmap sync mode, the block job will iterate through the
given dirty bitmap to decide if a sector needs backup (backup all the
dirty clusters and skip clean ones), just as allocation conditions of
top sync mode.
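
To make the idea concrete, here is a minimal sketch of "copy only what the bitmap marks dirty". It is not the patch's implementation: the per-cluster flag array, the in-memory buffers and the name sketch_copy_dirty are assumptions made for illustration, whereas the real job walks an HBitmap with an iterator and copies through backup_do_cow().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy only the clusters whose dirty flag is set.
 * dirty[c] != 0 means cluster c changed since the last backup.
 * Returns the number of clusters copied; clean clusters are skipped. */
static size_t sketch_copy_dirty(const unsigned char *dirty, size_t nclusters,
                                size_t cluster_size,
                                const unsigned char *src, unsigned char *dst)
{
    size_t copied = 0;

    assert(cluster_size > 0);
    for (size_t c = 0; c < nclusters; c++) {
        if (!dirty[c]) {
            continue; /* clean cluster: nothing to back up */
        }
        memcpy(dst + c * cluster_size, src + c * cluster_size, cluster_size);
        copied++;
    }
    return copied;
}
```

With four two-byte clusters of which the second and fourth are dirty, only those two are written to the destination; the others keep whatever the previous backup left there.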

Signed-off-by: Fam Zheng f...@redhat.com
Signed-off-by: John Snow js...@redhat.com
---
  block.c   |   9 +++
  block/backup.c| 156
+++---
  block/mirror.c|   4 ++
  blockdev.c|  18 +-
  hmp.c |   3 +-
  include/block/block.h |   1 +
  include/block/block_int.h |   2 +
  qapi/block-core.json  |  13 ++--
  qmp-commands.hx   |   7 ++-
  9 files changed, 180 insertions(+), 33 deletions(-)

diff --git a/block.c b/block.c
index 9d30379..2367311 100644
--- a/block.c
+++ b/block.c
@@ -5717,6 +5717,15 @@ static void bdrv_reset_dirty(BlockDriverState *bs, int64_t cur_sector,
      }
  }
+/**
+ * Advance an HBitmapIter to an arbitrary offset.
+ */
+void bdrv_set_dirty_iter(HBitmapIter *hbi, int64_t offset)
+{
+    assert(hbi->hb);
+    hbitmap_iter_init(hbi, hbi->hb, offset);
+}
+
  int64_t bdrv_get_dirty_count(BlockDriverState *bs, BdrvDirtyBitmap *bitmap)
  {
      return hbitmap_count(bitmap->bitmap);
diff --git a/block/backup.c b/block/backup.c
index 1c535b1..8513917 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -37,6 +37,8 @@ typedef struct CowRequest {
  typedef struct BackupBlockJob {
  BlockJob common;
  BlockDriverState *target;
+/* bitmap for sync=dirty-bitmap */
+BdrvDirtyBitmap *sync_bitmap;
  MirrorSyncMode sync_mode;
  RateLimit limit;
  BlockdevOnError on_source_error;
@@ -242,6 +244,92 @@ static void backup_complete(BlockJob *job, void *opaque)
      g_free(data);
  }
+static bool coroutine_fn yield_and_check(BackupBlockJob *job)
+{
+    if (block_job_is_cancelled(&job->common)) {
+        return true;
+    }
+
+    /* we need to yield so that qemu_aio_flush() returns.
+     * (without, VM does not reboot)
+     */
+    if (job->common.speed) {
+        uint64_t delay_ns = ratelimit_calculate_delay(&job->limit,
+                                                      job->sectors_read);
+        job->sectors_read = 0;
+        block_job_sleep_ns(&job->common, QEMU_CLOCK_REALTIME, delay_ns);
+    } else {
+        block_job_sleep_ns(&job->common, QEMU_CLOCK_REALTIME, 0);
+    }
+
+    if (block_job_is_cancelled(&job->common)) {
+        return true;
+    }
+
+    return false;
+}
+
+static int coroutine_fn backup_run_incremental(BackupBlockJob *job)
+{
+    bool error_is_read;
+    int ret = 0;
+    int clusters_per_iter;
+    uint32_t granularity;
+    int64_t sector;
+    int64_t cluster;
+    int64_t end;
+    int64_t last_cluster = -1;
+    BlockDriverState *bs = job->common.bs;
+    HBitmapIter hbi;
+
+    granularity = bdrv_dirty_bitmap_granularity(job->sync_bitmap);
+    clusters_per_iter = MAX((granularity / BACKUP_CLUSTER_SIZE), 1);


DIV_ROUND_UP(granularity, BACKUP_CLUSTER_SIZE) would've worked, too
(instead of the MAX()), but since both are powers of two, this is
equivalent.



But this way we get to put your name in the source code.
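
The equivalence Max points out can be checked with a small sketch. The macro and function names here are hypothetical stand-ins (not QEMU's own MAX and DIV_ROUND_UP definitions), and the check only claims to hold when both inputs are powers of two, which is the case under discussion.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical local stand-ins for QEMU's MAX() and DIV_ROUND_UP(). */
#define SKETCH_MAX(a, b) ((a) > (b) ? (a) : (b))
#define SKETCH_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

static int sketch_is_pow2(uint32_t v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

/* Returns 1 if MAX(g / c, 1) and DIV_ROUND_UP(g, c) agree for the given
 * power-of-two granularity g and cluster size c. */
static int sketch_forms_agree(uint32_t granularity, uint32_t cluster_size)
{
    assert(sketch_is_pow2(granularity) && sketch_is_pow2(cluster_size));

    uint32_t via_max = SKETCH_MAX(granularity / cluster_size, 1u);
    uint32_t via_div = SKETCH_DIV_ROUND_UP(granularity, cluster_size);
    return via_max == via_div;
}
```

When the granularity is at least the cluster size, the division is exact and both forms return the quotient; when it is smaller, the plain division yields 0 (clamped to 1 by MAX) and the rounded-up division yields 1.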


+    bdrv_dirty_iter_init(bs, job->sync_bitmap, &hbi);
+
+    /* Find the next dirty sector(s) */
+    while ((sector = hbitmap_iter_next(&hbi)) != -1) {
+        cluster = sector / BACKUP_SECTORS_PER_CLUSTER;
+
+        /* Fake progress updates for any clusters we skipped */
+        if (cluster != last_cluster + 1) {
+            job->common.offset += ((cluster - last_cluster - 1) *
+                                   BACKUP_CLUSTER_SIZE);
+        }
+
+        for (end = cluster + clusters_per_iter; cluster < end; cluster++) {
+            if (yield_and_check(job)) {
+                return ret;
+            }
+
+            do {
+                ret = backup_do_cow(bs, cluster * BACKUP_SECTORS_PER_CLUSTER,
+                                    BACKUP_SECTORS_PER_CLUSTER, &error_is_read);
+                if ((ret < 0) &&
+                    backup_error_action(job, error_is_read, -ret) ==
+                    BLOCK_ERROR_ACTION_REPORT) {
+                    return ret;
+                }


Now that I'm reading this code again... The other backup implementation
handles retries differently; it redoes the whole loop, with the
effective difference being that it calls yield_and_check() between every
retry. Would it make sense to move the yield_and_check() call into this
loop?



Yes, I should be mindful of the case where we might have to copy many 
clusters per dirty bit. I don't think we lose anything by inserting it 
at the top of the do{}while(), but we will potentially exit the loop 
quicker on cancellation cases.



+            } while (ret < 0);
+        }
+
+        /* If the bitmap granularity is smaller than the backup granularity,
+         * we need to advance the iterator pointer to the next cluster. */
+        if (granularity < BACKUP_CLUSTER_SIZE

Re: [Qemu-block] [PATCH v5 01/21] docs: incremental backup documentation

2015-04-17 Thread John Snow




On 04/17/2015 11:06 AM, Eric Blake wrote:

On 04/08/2015 04:19 PM, John Snow wrote:

Reviewed-by: Max Reitz mre...@redhat.com
Signed-off-by: John Snow js...@redhat.com
---
  docs/bitmaps.md | 311 
  1 file changed, 311 insertions(+)
  create mode 100644 docs/bitmaps.md

diff --git a/docs/bitmaps.md b/docs/bitmaps.md
new file mode 100644
index 000..ad8c33b
--- /dev/null
+++ b/docs/bitmaps.md
@@ -0,0 +1,311 @@
+# Dirty Bitmaps and Incremental Backup
+


Still might be nice to list explicit copyright/license instead of
relying on implicit top-level GPLv2+, but I won't insist.



I think I would rather not clutter up the document itself, if that 
remains suitable. I don't mind those declarations in source code, but 
for a document like this, it seems weird to have it in the preamble.


I can attach a license to the footer, if that's suitable?




+### Deletion
+
+* Can be performed on a disabled bitmap, but not a frozen one.


Do you still have a notion of disabled bitmaps?  Earlier, in '## Bitmap
Modes', you only document 'frozen' (as opposed to the default unnamed
state).



We do internally. It's not likely to come up from a user's perspective, 
but we do intend to disable the bitmap during e.g. migration, boot, etc.


I did pull the disabled bit out because it's not a necessary detail yet.

I'll tidy this up and reintroduce the language alongside the patch that 
may expose the user to witnessing a disabled bitmap.





+
+## Transactions (Not yet implemented)


I'm assuming that [PATCH v2 00/11] block: incremental backup
transactions is incomplete, because it forgot to clean this up as part
of adding transaction support.



Fixed in my local copy, yes.




+
+5. Retry the command after fixing the underlaying problem,


s/underlaying/underlying/



:(

[Qemu-block] [PATCH v6 06/21] hbitmap: cache array lengths

2015-04-17 Thread John Snow

As a convenience: between incremental backups, bitmap migrations
and bitmap persistence we seem to need to recalculate these a lot.

Because the lengths are a little bit-twiddly, let's just solidly
cache them and be done with it.
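
For readers wondering what "bit-twiddly" means here, the sketch below reproduces the per-level length calculation as it appears in the hbitmap_alloc() loop in this patch. The constants are assumptions for a 64-bit host (6 bits per level, 7 levels) and the function name is hypothetical; the input is the bit count after any granularity adjustment.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for a 64-bit host; the real constants live in hbitmap.c. */
#define SKETCH_BITS_PER_LEVEL 6
#define SKETCH_BITS_PER_LONG  (1 << SKETCH_BITS_PER_LEVEL)
#define SKETCH_LEVELS         7

/* Fill sizes[] with the number of words in each level, from the last
 * (largest) level down to level 0. Each level needs one bit per word of
 * the level below it, and every level has at least one word. */
static void sketch_level_sizes(uint64_t bits, uint64_t sizes[SKETCH_LEVELS])
{
    uint64_t size = bits;

    for (int i = SKETCH_LEVELS; i-- > 0; ) {
        uint64_t words = (size + SKETCH_BITS_PER_LONG - 1) >> SKETCH_BITS_PER_LEVEL;
        size = words > 0 ? words : 1;
        sizes[i] = size;
    }
}
```

Caching these values once at allocation time means later code (merge, truncate, migration) can loop over sizes[i] directly instead of redoing the shifts.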

Reviewed-by: Max Reitz mre...@redhat.com
Reviewed-by: Eric Blake ebl...@redhat.com
Signed-off-by: John Snow js...@redhat.com
---
 util/hbitmap.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/util/hbitmap.c b/util/hbitmap.c
index ab13971..5b78613 100644
--- a/util/hbitmap.c
+++ b/util/hbitmap.c
@@ -90,6 +90,9 @@ struct HBitmap {
      * bitmap will still allocate HBITMAP_LEVELS arrays.
      */
     unsigned long *levels[HBITMAP_LEVELS];
+
+    /* The length of each levels[] array. */
+    uint64_t sizes[HBITMAP_LEVELS];
 };
 
 /* Advance hbi to the next nonzero word and return it.  hbi->pos
@@ -384,6 +387,7 @@ HBitmap *hbitmap_alloc(uint64_t size, int granularity)
     hb->granularity = granularity;
     for (i = HBITMAP_LEVELS; i-- > 0; ) {
         size = MAX((size + BITS_PER_LONG - 1) >> BITS_PER_LEVEL, 1);
+        hb->sizes[i] = size;
         hb->levels[i] = g_new0(unsigned long, size);
     }
 
-- 
2.1.0

[Qemu-block] [PATCH v6 05/21] block: Introduce bdrv_dirty_bitmap_granularity()

2015-04-17 Thread John Snow

This returns the granularity (in bytes) of a dirty bitmap,
which matches the QMP interface and the existing query
interface.
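
As a rough illustration of the unit conversion involved: the HBitmap stores its granularity as a log2 number of sectors, and the byte value is the sector size shifted left by that amount. The sketch assumes a 512-byte sector (the usual value of BDRV_SECTOR_SIZE); the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_SECTOR_SIZE 512u /* assumed value of BDRV_SECTOR_SIZE */

/* Convert a granularity expressed as log2(sectors per bit) into bytes. */
static uint32_t sketch_granularity_bytes(int log2_sectors)
{
    /* 512 << 22 is 2^31, the largest result that fits in uint32_t. */
    assert(log2_sectors >= 0 && log2_sectors < 23);
    return SKETCH_SECTOR_SIZE << log2_sectors;
}
```

For example, a log2 granularity of 7 (128 sectors per bit) corresponds to 64K, the upper end of the default range chosen earlier in the series.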

Signed-off-by: John Snow js...@redhat.com
Reviewed-by: Max Reitz mre...@redhat.com
Reviewed-by: Eric Blake ebl...@redhat.com
Reviewed-by: Stefan Hajnoczi stefa...@redhat.com
---
 block.c   | 8 ++--
 include/block/block.h | 1 +
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/block.c b/block.c
index ebea54d..41c5a67 100644
--- a/block.c
+++ b/block.c
@@ -5509,8 +5509,7 @@ BlockDirtyInfoList *bdrv_query_dirty_bitmaps(BlockDriverState *bs)
         BlockDirtyInfo *info = g_new0(BlockDirtyInfo, 1);
         BlockDirtyInfoList *entry = g_new0(BlockDirtyInfoList, 1);
         info->count = bdrv_get_dirty_count(bs, bm);
-        info->granularity =
-            ((uint32_t) BDRV_SECTOR_SIZE << hbitmap_granularity(bm->bitmap));
+        info->granularity = bdrv_dirty_bitmap_granularity(bm);
         info->has_name = !!bm->name;
         info->name = g_strdup(bm->name);
         entry->value = info;
@@ -5550,6 +5549,11 @@ uint32_t bdrv_get_default_bitmap_granularity(BlockDriverState *bs)
     return granularity;
 }
 
+uint32_t bdrv_dirty_bitmap_granularity(BdrvDirtyBitmap *bitmap)
+{
+    return BDRV_SECTOR_SIZE << hbitmap_granularity(bitmap->bitmap);
+}
+
 void bdrv_dirty_iter_init(BlockDriverState *bs,
   BdrvDirtyBitmap *bitmap, HBitmapIter *hbi)
 {
diff --git a/include/block/block.h b/include/block/block.h
index 0f014a3..493b7c5 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -459,6 +459,7 @@ void bdrv_dirty_bitmap_make_anon(BlockDriverState *bs, BdrvDirtyBitmap *bitmap);
 void bdrv_release_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap);
 BlockDirtyInfoList *bdrv_query_dirty_bitmaps(BlockDriverState *bs);
 uint32_t bdrv_get_default_bitmap_granularity(BlockDriverState *bs);
+uint32_t bdrv_dirty_bitmap_granularity(BdrvDirtyBitmap *bitmap);
 int bdrv_get_dirty(BlockDriverState *bs, BdrvDirtyBitmap *bitmap, int64_t sector);
 void bdrv_set_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap,
                            int64_t cur_sector, int nr_sectors);
-- 
2.1.0

[Qemu-block] [PATCH v6 11/21] qmp: add block-dirty-bitmap-clear

2015-04-17 Thread John Snow

Add bdrv_clear_dirty_bitmap and a matching QMP command,
qmp_block_dirty_bitmap_clear that enables a user to reset
the bitmap attached to a drive.

This allows us to reset a bitmap in the event of a full
drive backup.

Signed-off-by: John Snow js...@redhat.com
Reviewed-by: Max Reitz mre...@redhat.com
Reviewed-by: Stefan Hajnoczi stefa...@redhat.com
Reviewed-by: Eric Blake ebl...@redhat.com
---
 block.c   |  8 
 blockdev.c| 34 ++
 include/block/block.h |  1 +
 qapi/block-core.json  | 14 ++
 qmp-commands.hx   | 29 +
 5 files changed, 86 insertions(+)

diff --git a/block.c b/block.c
index 185cd7f..679991b 100644
--- a/block.c
+++ b/block.c
@@ -62,6 +62,7 @@
 struct BdrvDirtyBitmap {
 HBitmap *bitmap;
 BdrvDirtyBitmap *successor;
+int64_t size;
 char *name;
 bool disabled;
 QLIST_ENTRY(BdrvDirtyBitmap) list;
@@ -5491,6 +5492,7 @@ BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState *bs,
     }
     bitmap = g_new0(BdrvDirtyBitmap, 1);
     bitmap->bitmap = hbitmap_alloc(bitmap_size, ffs(sector_granularity) - 1);
+    bitmap->size = bitmap_size;
     bitmap->name = g_strdup(name);
     bitmap->disabled = false;
     QLIST_INSERT_HEAD(&bs->dirty_bitmaps, bitmap, list);
@@ -5693,6 +5695,12 @@ void bdrv_reset_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap,
     hbitmap_reset(bitmap->bitmap, cur_sector, nr_sectors);
 }
 
+void bdrv_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap)
+{
+    assert(bdrv_dirty_bitmap_enabled(bitmap));
+    hbitmap_reset(bitmap->bitmap, 0, bitmap->size);
+}
+
 static void bdrv_set_dirty(BlockDriverState *bs, int64_t cur_sector,
int nr_sectors)
 {
diff --git a/blockdev.c b/blockdev.c
index 90ba5b6..df96959 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2078,6 +2078,40 @@ void qmp_block_dirty_bitmap_remove(const char *node, const char *name,
     aio_context_release(aio_context);
 }
 
+/**
+ * Completely clear a bitmap, for the purposes of synchronizing a bitmap
+ * immediately after a full backup operation.
+ */
+void qmp_block_dirty_bitmap_clear(const char *node, const char *name,
+                                  Error **errp)
+{
+    AioContext *aio_context;
+    BdrvDirtyBitmap *bitmap;
+    BlockDriverState *bs;
+
+    bitmap = block_dirty_bitmap_lookup(node, name, &bs, &aio_context, errp);
+    if (!bitmap || !bs) {
+        return;
+    }
+
+    if (bdrv_dirty_bitmap_frozen(bitmap)) {
+        error_setg(errp,
+                   "Bitmap '%s' is currently frozen and cannot be modified",
+                   name);
+        goto out;
+    } else if (!bdrv_dirty_bitmap_enabled(bitmap)) {
+        error_setg(errp,
+                   "Bitmap '%s' is currently disabled and cannot be cleared",
+                   name);
+        goto out;
+    }
+
+    bdrv_clear_dirty_bitmap(bitmap);
+
+ out:
+    aio_context_release(aio_context);
+}
+
 int hmp_drive_del(Monitor *mon, const QDict *qdict, QObject **ret_data)
 {
     const char *id = qdict_get_str(qdict, "id");
diff --git a/include/block/block.h b/include/block/block.h
index 80ac2cc..0961b1e 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -478,6 +478,7 @@ void bdrv_set_dirty_bitmap(BlockDriverState *bs, 
BdrvDirtyBitmap *bitmap,
int64_t cur_sector, int nr_sectors);
 void bdrv_reset_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap,
  int64_t cur_sector, int nr_sectors);
+void bdrv_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap);
 void bdrv_dirty_iter_init(BlockDriverState *bs,
   BdrvDirtyBitmap *bitmap, struct HBitmapIter *hbi);
 void bdrv_set_dirty_iter(struct HBitmapIter *hbi, int64_t offset);
diff --git a/qapi/block-core.json b/qapi/block-core.json
index a4e2897..fc9ca04 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -1022,6 +1022,20 @@
   'data': 'BlockDirtyBitmap' }
 
 ##
+# @block-dirty-bitmap-clear
+#
+# Clear (reset) a dirty bitmap on the device
+#
+# Returns: nothing on success
+#  If @node is not a valid block device, DeviceNotFound
+#  If @name is not found, GenericError with an explanation
+#
+# Since 2.4
+##
+{ 'command': 'block-dirty-bitmap-clear',
+  'data': 'BlockDirtyBitmap' }
+
+##
 # @block_set_io_throttle:
 #
 # Change I/O throttle limits for a block drive.
diff --git a/qmp-commands.hx b/qmp-commands.hx
index eb54dcd..e7db3a3 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -1328,6 +1328,35 @@ Example:
 EQMP
 
     {
+        .name       = "block-dirty-bitmap-clear",
+        .args_type  = "node:B,name:s",
+        .mhandler.cmd_new = qmp_marshal_input_block_dirty_bitmap_clear,
+    },
+
+SQMP
+
+block-dirty-bitmap-clear
+
+Since 2.4
+
+Reset the dirty bitmap associated with a node so that an incremental backup
+from this point in time forward will only backup clusters modified after this
+clear operation

[Qemu-block] [PATCH v6 07/21] hbitmap: add hbitmap_merge

2015-04-17 Thread John Snow

We add a bitmap merge operation to assist in error cases
where we wish to combine two bitmaps together.

This is algorithmically O(bits) provided HBITMAP_LEVELS remains
constant. For a full bitmap on a 64bit machine:
sum(bits/64^k, k, 0, HBITMAP_LEVELS) ~= 1.01587 * bits

We may be able to improve running speed for particularly sparse
bitmaps by using iterators, but the running time for dense maps
will be worse.

We present the simpler solution first, and we can refine it later
if needed.
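
The constant in the estimate above can be reproduced with a short sketch. It sums the geometric series from the commit message for 64-bit words; the function name is hypothetical, and passing 7 for the level count is an assumption about HBITMAP_LEVELS on a 64-bit host.

```c
#include <assert.h>

/* Sum of 1/64^k for k = 0..levels: the factor by which the work over all
 * levels exceeds the work over the bottom level alone, assuming each
 * level is 64 times smaller than the one below it. */
static double sketch_merge_cost_ratio(int levels)
{
    double ratio = 0.0;
    double term = 1.0;

    assert(levels >= 0);
    for (int k = 0; k <= levels; k++) {
        ratio += term;
        term /= 64.0;
    }
    return ratio;
}
```

The series converges to 64/63, so the upper levels add under 2% on top of the bottom level, which is why the merge can be described as linear in the number of bits.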

Signed-off-by: John Snow js...@redhat.com
---
 include/qemu/hbitmap.h | 13 +
 util/hbitmap.c | 33 +
 2 files changed, 46 insertions(+)

diff --git a/include/qemu/hbitmap.h b/include/qemu/hbitmap.h
index 550d7ce..6cb2d0e 100644
--- a/include/qemu/hbitmap.h
+++ b/include/qemu/hbitmap.h
@@ -65,6 +65,19 @@ struct HBitmapIter {
 HBitmap *hbitmap_alloc(uint64_t size, int granularity);
 
 /**
+ * hbitmap_merge:
+ * @a: The bitmap to store the result in.
+ * @b: The bitmap to merge into @a.
+ * @return true if the merge was successful,
+ * false if it was not attempted.
+ *
+ * Merge two bitmaps together.
+ * A := A (BITOR) B.
+ * B is left unmodified.
+ */
+bool hbitmap_merge(HBitmap *a, const HBitmap *b);
+
+/**
  * hbitmap_empty:
  * @hb: HBitmap to operate on.
  *
diff --git a/util/hbitmap.c b/util/hbitmap.c
index 5b78613..150d6e9 100644
--- a/util/hbitmap.c
+++ b/util/hbitmap.c
@@ -399,3 +399,36 @@ HBitmap *hbitmap_alloc(uint64_t size, int granularity)
     hb->levels[0][0] |= 1UL << (BITS_PER_LONG - 1);
     return hb;
 }
+
+/**
+ * Given HBitmaps A and B, let A := A (BITOR) B.
+ * Bitmap B will not be modified.
+ *
+ * @return true if the merge was successful,
+ *         false if it was not attempted.
+ */
+bool hbitmap_merge(HBitmap *a, const HBitmap *b)
+{
+    int i;
+    uint64_t j;
+
+    if ((a->size != b->size) || (a->granularity != b->granularity)) {
+        return false;
+    }
+
+    if (hbitmap_count(b) == 0) {
+        return true;
+    }
+
+    /* This merge is O(size), as BITS_PER_LONG and HBITMAP_LEVELS are constant.
+     * It may be possible to improve running times for sparsely populated maps
+     * by using hbitmap_iter_next, but this is suboptimal for dense maps.
+     */
+    for (i = HBITMAP_LEVELS - 1; i >= 0; i--) {
+        for (j = 0; j < a->sizes[i]; j++) {
+            a->levels[i][j] |= b->levels[i][j];
+        }
+    }
+
+    return true;
+}
-- 
2.1.0

[Qemu-block] [PATCH v6 16/21] hbitmap: truncate tests

2015-04-17 Thread John Snow

The general approach is to set bits close to the boundaries of
where we are truncating and ensure that everything appears to
have gone OK.

We test growing and shrinking by different amounts:
- Less than the granularity
- Less than the granularity, but across a boundary
- Less than sizeof(unsigned long)
- Less than sizeof(unsigned long), but across a ulong boundary
- More than sizeof(unsigned long)

Signed-off-by: John Snow js...@redhat.com
Reviewed-by: Max Reitz mre...@redhat.com
Reviewed-by: Stefan Hajnoczi stefa...@redhat.com
---
 tests/test-hbitmap.c | 255 +++
 1 file changed, 255 insertions(+)

diff --git a/tests/test-hbitmap.c b/tests/test-hbitmap.c
index 8c902f2..9f41b5f 100644
--- a/tests/test-hbitmap.c
+++ b/tests/test-hbitmap.c
@@ -11,6 +11,8 @@
 
 #include <glib.h>
 #include <stdarg.h>
+#include <string.h>
+#include <sys/types.h>
 #include "qemu/hbitmap.h"
 
 #define LOG_BITS_PER_LONG  (BITS_PER_LONG == 32 ? 5 : 6)
@@ -23,6 +25,7 @@ typedef struct TestHBitmapData {
     HBitmap       *hb;
     unsigned long *bits;
     size_t         size;
+    size_t         old_size;
     int            granularity;
 } TestHBitmapData;
 
@@ -91,6 +94,44 @@ static void hbitmap_test_init(TestHBitmapData *data,
 }
 }
 
+static inline size_t hbitmap_test_array_size(size_t bits)
+{
+    size_t n = (bits + BITS_PER_LONG - 1) / BITS_PER_LONG;
+    return n ? n : 1;
+}
+
+static void hbitmap_test_truncate_impl(TestHBitmapData *data,
+                                       size_t size)
+{
+    size_t n;
+    size_t m;
+    data->old_size = data->size;
+    data->size = size;
+
+    if (data->size == data->old_size) {
+        return;
+    }
+
+    n = hbitmap_test_array_size(size);
+    m = hbitmap_test_array_size(data->old_size);
+    data->bits = g_realloc(data->bits, sizeof(unsigned long) * n);
+    if (n > m) {
+        memset(&data->bits[m], 0x00, sizeof(unsigned long) * (n - m));
+    }
+
+    /* If we shrink to an uneven multiple of sizeof(unsigned long),
+     * scrub the leftover memory. */
+    if (data->size < data->old_size) {
+        m = size % (sizeof(unsigned long) * 8);
+        if (m) {
+            unsigned long mask = (1ULL << m) - 1;
+            data->bits[n-1] &= mask;
+        }
+    }
+
+    hbitmap_truncate(data->hb, size);
+}
+
 static void hbitmap_test_teardown(TestHBitmapData *data,
   const void *unused)
 {
@@ -369,6 +410,198 @@ static void test_hbitmap_iter_granularity(TestHBitmapData *data,
     g_assert_cmpint(hbitmap_iter_next(&hbi), <, 0);
 }
 
+static void hbitmap_test_set_boundary_bits(TestHBitmapData *data, ssize_t diff)
+{
+    size_t size = data->size;
+
+    /* First bit */
+    hbitmap_test_set(data, 0, 1);
+    if (diff < 0) {
+        /* Last bit in new, shortened map */
+        hbitmap_test_set(data, size + diff - 1, 1);
+
+        /* First bit to be truncated away */
+        hbitmap_test_set(data, size + diff, 1);
+    }
+    /* Last bit */
+    hbitmap_test_set(data, size - 1, 1);
+    if (data->granularity == 0) {
+        hbitmap_test_check_get(data);
+    }
+}
+
+static void hbitmap_test_check_boundary_bits(TestHBitmapData *data)
+{
+    size_t size = MIN(data->size, data->old_size);
+
+    if (data->granularity == 0) {
+        hbitmap_test_check_get(data);
+        hbitmap_test_check(data, 0);
+    } else {
+        /* If a granularity was set, note that every distinct
+         * (bit >> granularity) value that was set will increase
+         * the bit pop count by 2^granularity, not just 1.
+         *
+         * The hbitmap_test_check facility does not currently tolerate
+         * non-zero granularities, so test the boundaries and the population
+         * count manually.
+         */
+        g_assert(hbitmap_get(data->hb, 0));
+        g_assert(hbitmap_get(data->hb, size - 1));
+        g_assert_cmpint(2 << data->granularity, ==, hbitmap_count(data->hb));
+    }
+}
+
+/* Generic truncate test. */
+static void hbitmap_test_truncate(TestHBitmapData *data,
+  size_t size,
+  ssize_t diff,
+  int granularity)
+{
+hbitmap_test_init(data, size, granularity);
+hbitmap_test_set_boundary_bits(data, diff);
+hbitmap_test_truncate_impl(data, size + diff);
+hbitmap_test_check_boundary_bits(data);
+}
+
+static void test_hbitmap_truncate_nop(TestHBitmapData *data,
+  const void *unused)
+{
+hbitmap_test_truncate(data, L2, 0, 0);
+}
+
+/**
+ * Grow by an amount smaller than the granularity, without crossing
+ * a granularity alignment boundary. Effectively a NOP.
+ */
+static void test_hbitmap_truncate_grow_negligible(TestHBitmapData *data,
+  const void *unused)
+{
+size_t size = L2 - 1;
+size_t diff = 1;
+int granularity = 1;
+
+hbitmap_test_truncate(data, size, diff, granularity);
+}
+
+/**
+ * Shrink

[Qemu-block] [PATCH v6 19/21] iotests: add simple incremental backup case

2015-04-17 Thread John Snow

Signed-off-by: John Snow js...@redhat.com
Reviewed-by: Max Reitz mre...@redhat.com
---
 tests/qemu-iotests/124 | 174 +++--
 tests/qemu-iotests/124.out |   4 +-
 2 files changed, 172 insertions(+), 6 deletions(-)

diff --git a/tests/qemu-iotests/124 b/tests/qemu-iotests/124
index 85675ec..5c3b434 100644
--- a/tests/qemu-iotests/124
+++ b/tests/qemu-iotests/124
@@ -29,6 +29,51 @@ def io_write_patterns(img, patterns):
 iotests.qemu_io('-c', 'write -P%s %s %s' % pattern, img)
 
 
+def try_remove(img):
+    try:
+        os.remove(img)
+    except OSError:
+        pass
+
+
+class Bitmap:
+    def __init__(self, name, drive):
+        self.name = name
+        self.drive = drive
+        self.num = 0
+        self.backups = list()
+
+    def base_target(self):
+        return (self.drive['backup'], None)
+
+    def new_target(self, num=None):
+        if num is None:
+            num = self.num
+        self.num = num + 1
+        base = os.path.join(iotests.test_dir,
+                            "%s.%s." % (self.drive['id'], self.name))
+        suff = "%i.%s" % (num, self.drive['fmt'])
+        target = base + "inc" + suff
+        reference = base + "ref" + suff
+        self.backups.append((target, reference))
+        return (target, reference)
+
+    def last_target(self):
+        if self.backups:
+            return self.backups[-1]
+        return self.base_target()
+
+    def del_target(self):
+        for image in self.backups.pop():
+            try_remove(image)
+        self.num -= 1
+
+    def cleanup(self):
+        for backup in self.backups:
+            for image in backup:
+                try_remove(image)
+
+
 class TestIncrementalBackup(iotests.QMPTestCase):
     def setUp(self):
         self.bitmaps = list()
@@ -73,6 +118,128 @@ class TestIncrementalBackup(iotests.QMPTestCase):
 iotests.qemu_img('create', '-f', fmt, img, size)
 self.files.append(img)
 
+
+    def do_qmp_backup(self, error='Input/output error', **kwargs):
+        res = self.vm.qmp('drive-backup', **kwargs)
+        self.assert_qmp(res, 'return', {})
+
+        event = self.vm.event_wait(name="BLOCK_JOB_COMPLETED",
+                                   match={'data': {'device': kwargs['device']}})
+        self.assertIsNotNone(event)
+
+        try:
+            failure = self.dictpath(event, 'data/error')
+        except AssertionError:
+            # Backup succeeded.
+            self.assert_qmp(event, 'data/offset', event['data']['len'])
+            return True
+        else:
+            # Backup failed.
+            self.assert_qmp(event, 'data/error', error)
+            return False
+
+
+    def create_anchor_backup(self, drive=None):
+        if drive is None:
+            drive = self.drives[-1]
+        res = self.do_qmp_backup(device=drive['id'], sync='full',
+                                 format=drive['fmt'], target=drive['backup'])
+        self.assertTrue(res)
+        self.files.append(drive['backup'])
+        return drive['backup']
+
+
+    def make_reference_backup(self, bitmap=None):
+        if bitmap is None:
+            bitmap = self.bitmaps[-1]
+        _, reference = bitmap.last_target()
+        res = self.do_qmp_backup(device=bitmap.drive['id'], sync='full',
+                                 format=bitmap.drive['fmt'], target=reference)
+        self.assertTrue(res)
+
+
+    def add_bitmap(self, name, drive):
+        bitmap = Bitmap(name, drive)
+        self.bitmaps.append(bitmap)
+        result = self.vm.qmp('block-dirty-bitmap-add', node=drive['id'],
+                             name=bitmap.name)
+        self.assert_qmp(result, 'return', {})
+        return bitmap
+
+
+    def prepare_backup(self, bitmap=None, parent=None):
+        if bitmap is None:
+            bitmap = self.bitmaps[-1]
+        if parent is None:
+            parent, _ = bitmap.last_target()
+
+        target, _ = bitmap.new_target()
+        self.img_create(target, bitmap.drive['fmt'], parent=parent)
+        return target
+
+
+    def create_incremental(self, bitmap=None, parent=None,
+                           parentFormat=None, validate=True):
+        if bitmap is None:
+            bitmap = self.bitmaps[-1]
+        if parent is None:
+            parent, _ = bitmap.last_target()
+
+        target = self.prepare_backup(bitmap, parent)
+        res = self.do_qmp_backup(device=bitmap.drive['id'],
+                                 sync='dirty-bitmap', bitmap=bitmap.name,
+                                 format=bitmap.drive['fmt'], target=target,
+                                 mode='existing')
+        if not res:
+            bitmap.del_target();
+            self.assertFalse(validate)
+        else:
+            self.make_reference_backup(bitmap)
+        return res
+
+
+    def check_backups(self):
+        for bitmap in self.bitmaps:
+            for incremental, reference in bitmap.backups:
+                self.assertTrue

[Qemu-block] [PATCH v6 00/21] block: transactionless incremental backup series

2015-04-17 Thread John Snow

 of that feature.

===
v3:
===

01: Removed enabled/disabled modes information.
Elaborated on events that can occur during error cases.
04: Added an AioContext out parameter to block_dirty_bitmap_lookup.
06: NEW:
Cache the array lengths for hbitmap.
07: hbitmap_merge now uses the cached array lengths.
11: block-dirty-bitmap-clear is edited for the new block_dirty_bitmap_lookup.
12: Removed the disabled status, leaving just Frozen.
15: Moved bdrv_truncate_dirty_bitmap to be static local
Inlined dirty_bitmap_truncate function.
Removed size[] caching into new patch (06, above)
hbitmap_truncate now keeps correct bit population count
hbitmap_truncate now uses hbitmap_reset BEFORE the truncate,
to avoid tricky out-of-bounds usages.
Remove g_realloc_n call that is not available in glib 2.12 (or 2.22)
Renamed truncate to shrink to make that more clear
to people who aren't me (at last count: 7+ billion)
16: NEW:
    hbitmap_truncate tests.

===
v2:
===

01: Added a new opening blurb.
Adjusted codeblock indentations to be 4 spaces instead of 3,
so it works as MD or GFMD.
Adjusted errors explanation.
Make visual separations between json data and shell commands
Eliminate any lingering single quotes

07: Remember that memset takes bytes, not n_items ...

===
v1:
===

Deletions:
 - Removed Transactions, to be added later.
 - Removed Transaction tests, as above.

Changes:
01: Indentation fixes.
Removed enable/disable documentation.
Added a note that transactions aren't implemented yet.
Removed my needless commas
Added error case documentation.

07: QMP enable/disable commands are deleted.

14: Some comments concerning assertions.
Scrub re-alloc memory if we expand the array.
Do not attempt to scrub memory if fix_count is 0

Changes made with Reviews kept:

02: Since 2.4
04: Since 2.4
Demingled the QMP command documentation.
08: Additions to what was qmp_block_dirty_enable/disable
are no longer present as those functions no longer exist.
09: Since 2.4
10: Since 2.4
Demingled QMP command documentation.
11: Since 2.4
15: Test 112 --> 124
17: Number of tests altered. (Only 4, now.)

Fam Zheng (1):
  qapi: Add optional field name to block dirty bitmap

John Snow (20):
  docs: incremental backup documentation
  qmp: Ensure consistent granularity type
  qmp: Add block-dirty-bitmap-add and block-dirty-bitmap-remove
  block: Introduce bdrv_dirty_bitmap_granularity()
  hbitmap: cache array lengths
  hbitmap: add hbitmap_merge
  block: Add bitmap disabled status
  block: Add bitmap successors
  qmp: Add support of dirty-bitmap sync mode for drive-backup
  qmp: add block-dirty-bitmap-clear
  qmp: Add dirty bitmap status field in query-block
  block: add BdrvDirtyBitmap documentation
  block: Ensure consistent bitmap function prototypes
  block: Resize bitmaps on bdrv_truncate
  hbitmap: truncate tests
  iotests: add invalid input incremental backup tests
  iotests: add QMP event waiting queue
  iotests: add simple incremental backup case
  iotests: add incremental backup failure recovery test
  iotests: add incremental backup granularity tests

 block.c   | 243 ++--
 block/backup.c| 155 +++---
 block/mirror.c|  46 +++---
 blockdev.c| 176 +++-
 docs/bitmaps.md   | 352 
 hmp.c |   3 +-
 include/block/block.h |  33 +++-
 include/block/block_int.h |   4 +-
 include/qemu/hbitmap.h|  23 +++
 migration/block.c |   9 +-
 qapi/block-core.json  |  91 ++-
 qmp-commands.hx   |  93 ++-
 scripts/qmp/qmp.py|  95 +++
 tests/qemu-iotests/124| 363 ++
 tests/qemu-iotests/124.out|   5 +
 tests/qemu-iotests/group  |   1 +
 tests/qemu-iotests/iotests.py |  38 +
 tests/test-hbitmap.c  | 255 +
 util/hbitmap.c|  85 ++
 19 files changed, 1953 insertions(+), 117 deletions(-)
 create mode 100644 docs/bitmaps.md
 create mode 100644 tests/qemu-iotests/124
 create mode 100644 tests/qemu-iotests/124.out

-- 
2.1.0

[Qemu-block] [PATCH v6 15/21] block: Resize bitmaps on bdrv_truncate

2015-04-17 Thread John Snow

Signed-off-by: John Snow js...@redhat.com
---
 block.c| 18 ++
 include/qemu/hbitmap.h | 10 ++
 util/hbitmap.c | 48 
 3 files changed, 76 insertions(+)

diff --git a/block.c b/block.c
index 735acff..b29aafe 100644
--- a/block.c
+++ b/block.c
@@ -113,6 +113,7 @@ static void bdrv_set_dirty(BlockDriverState *bs, int64_t cur_sector,
                            int nr_sectors);
 static void bdrv_reset_dirty(BlockDriverState *bs, int64_t cur_sector,
                              int nr_sectors);
+static void bdrv_dirty_bitmap_truncate(BlockDriverState *bs);
 /* If non-zero, use only whitelisted block drivers */
 static int use_bdrv_whitelist;
 
@@ -3583,6 +3584,7 @@ int bdrv_truncate(BlockDriverState *bs, int64_t offset)
     ret = drv->bdrv_truncate(bs, offset);
     if (ret == 0) {
         ret = refresh_total_sectors(bs, offset >> BDRV_SECTOR_BITS);
+        bdrv_dirty_bitmap_truncate(bs);
         if (bs->blk) {
             blk_dev_resize_cb(bs->blk);
         }
@@ -5593,6 +5595,22 @@ BdrvDirtyBitmap *bdrv_reclaim_dirty_bitmap(BlockDriverState *bs,
     return parent;
 }
 
+/**
+ * Truncates _all_ bitmaps attached to a BDS.
+ */
+static void bdrv_dirty_bitmap_truncate(BlockDriverState *bs)
+{
+    BdrvDirtyBitmap *bitmap;
+    uint64_t size = bdrv_nb_sectors(bs);
+
+    QLIST_FOREACH(bitmap, &bs->dirty_bitmaps, list) {
+        if (bdrv_dirty_bitmap_frozen(bitmap)) {
+            continue;
+        }
+        hbitmap_truncate(bitmap->bitmap, size);
+    }
+}
+
 void bdrv_release_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap)
 {
 BdrvDirtyBitmap *bm, *next;
diff --git a/include/qemu/hbitmap.h b/include/qemu/hbitmap.h
index 6cb2d0e..f0a85f8 100644
--- a/include/qemu/hbitmap.h
+++ b/include/qemu/hbitmap.h
@@ -65,6 +65,16 @@ struct HBitmapIter {
 HBitmap *hbitmap_alloc(uint64_t size, int granularity);
 
 /**
+ * hbitmap_truncate:
+ * @hb: The bitmap to change the size of.
+ * @size: The number of elements to change the bitmap to accommodate.
+ *
+ * truncate or grow an existing bitmap to accommodate a new number of elements.
+ * This may invalidate existing HBitmapIterators.
+ */
+void hbitmap_truncate(HBitmap *hb, uint64_t size);
+
+/**
  * hbitmap_merge:
  * @a: The bitmap to store the result in.
  * @b: The bitmap to merge into @a.
diff --git a/util/hbitmap.c b/util/hbitmap.c
index 150d6e9..a10c7ae 100644
--- a/util/hbitmap.c
+++ b/util/hbitmap.c
@@ -400,6 +400,54 @@ HBitmap *hbitmap_alloc(uint64_t size, int granularity)
 return hb;
 }
 
+void hbitmap_truncate(HBitmap *hb, uint64_t size)
+{
+    bool shrink;
+    unsigned i;
+    uint64_t num_elements = size;
+    uint64_t old;
+
+    /* Size comes in as logical elements, adjust for granularity. */
+    size = (size + (1ULL << hb->granularity) - 1) >> hb->granularity;
+    assert(size <= ((uint64_t)1 << HBITMAP_LOG_MAX_SIZE));
+    shrink = size < hb->size;
+
+    /* bit sizes are identical; nothing to do. */
+    if (size == hb->size) {
+        return;
+    }
+
+    /* If we're losing bits, let's clear those bits before we invalidate all of
+     * our invariants. This helps keep the bitcount consistent, and will prevent
+     * us from carrying around garbage bits beyond the end of the map.
+     */
+    if (shrink) {
+        /* Don't clear partial granularity groups;
+         * start at the first full one. */
+        uint64_t start = QEMU_ALIGN_UP(num_elements, 1 << hb->granularity);
+        uint64_t fix_count = (hb->size << hb->granularity) - start;
+
+        assert(fix_count);
+        hbitmap_reset(hb, start, fix_count);
+    }
+
+    hb->size = size;
+    for (i = HBITMAP_LEVELS; i-- > 0; ) {
+        size = MAX(BITS_TO_LONGS(size), 1);
+        if (hb->sizes[i] == size) {
+            break;
+        }
+        old = hb->sizes[i];
+        hb->sizes[i] = size;
+        hb->levels[i] = g_realloc(hb->levels[i], size * sizeof(unsigned long));
+        if (!shrink) {
+            memset(&hb->levels[i][old], 0x00,
+                   (size - old) * sizeof(*hb->levels[i]));
+        }
+    }
+}
+
+
 /**
  * Given HBitmaps A and B, let A := A (BITOR) B.
  * Bitmap B will not be modified.
-- 
2.1.0

[Qemu-block] [PATCH v6 08/21] block: Add bitmap disabled status

2015-04-17 Thread John Snow

Add a status indicating the enabled/disabled state of the bitmap.
A bitmap is by default enabled, but you can lock the bitmap into
a read-only state by setting disabled = true.

A previous version of this patch added a QMP interface for changing
the state of the bitmap, but it has since been removed for now until
a use case emerges where this state must be revealed to the user.

The disabled state WILL be used internally for bitmap migration and
bitmap persistence.

Signed-off-by: Fam Zheng f...@redhat.com
Signed-off-by: John Snow js...@redhat.com
Reviewed-by: Max Reitz mre...@redhat.com
Reviewed-by: Stefan Hajnoczi stefa...@redhat.com
---
 block.c   | 25 +
 include/block/block.h |  3 +++
 2 files changed, 28 insertions(+)

diff --git a/block.c b/block.c
index 41c5a67..db742a9 100644
--- a/block.c
+++ b/block.c
@@ -54,6 +54,7 @@
 struct BdrvDirtyBitmap {
 HBitmap *bitmap;
 char *name;
+bool disabled;
 QLIST_ENTRY(BdrvDirtyBitmap) list;
 };
 
@@ -5481,10 +5482,16 @@ BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState *bs,
     bitmap = g_new0(BdrvDirtyBitmap, 1);
     bitmap->bitmap = hbitmap_alloc(bitmap_size, ffs(sector_granularity) - 1);
     bitmap->name = g_strdup(name);
+    bitmap->disabled = false;
     QLIST_INSERT_HEAD(&bs->dirty_bitmaps, bitmap, list);
     return bitmap;
 }
 
+bool bdrv_dirty_bitmap_enabled(BdrvDirtyBitmap *bitmap)
+{
+    return !bitmap->disabled;
+}
+
 void bdrv_release_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap)
 {
 BdrvDirtyBitmap *bm, *next;
@@ -5499,6 +5506,16 @@ void bdrv_release_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap)
     }
 }
 
+void bdrv_disable_dirty_bitmap(BdrvDirtyBitmap *bitmap)
+{
+    bitmap->disabled = true;
+}
+
+void bdrv_enable_dirty_bitmap(BdrvDirtyBitmap *bitmap)
+{
+    bitmap->disabled = false;
+}
+
 BlockDirtyInfoList *bdrv_query_dirty_bitmaps(BlockDriverState *bs)
 {
 BdrvDirtyBitmap *bm;
@@ -5563,12 +5580,14 @@ void bdrv_dirty_iter_init(BlockDriverState *bs,
 void bdrv_set_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap,
                            int64_t cur_sector, int nr_sectors)
 {
+    assert(bdrv_dirty_bitmap_enabled(bitmap));
     hbitmap_set(bitmap->bitmap, cur_sector, nr_sectors);
 }
 
 void bdrv_reset_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap,
                              int64_t cur_sector, int nr_sectors)
 {
+    assert(bdrv_dirty_bitmap_enabled(bitmap));
     hbitmap_reset(bitmap->bitmap, cur_sector, nr_sectors);
 }
 
@@ -5577,6 +5596,9 @@ static void bdrv_set_dirty(BlockDriverState *bs, int64_t cur_sector,
 {
     BdrvDirtyBitmap *bitmap;
     QLIST_FOREACH(bitmap, &bs->dirty_bitmaps, list) {
+        if (!bdrv_dirty_bitmap_enabled(bitmap)) {
+            continue;
+        }
         hbitmap_set(bitmap->bitmap, cur_sector, nr_sectors);
     }
 }
@@ -5586,6 +5608,9 @@ static void bdrv_reset_dirty(BlockDriverState *bs, int64_t cur_sector,
 {
     BdrvDirtyBitmap *bitmap;
     QLIST_FOREACH(bitmap, &bs->dirty_bitmaps, list) {
+        if (!bdrv_dirty_bitmap_enabled(bitmap)) {
+            continue;
+        }
         hbitmap_reset(bitmap->bitmap, cur_sector, nr_sectors);
     }
 }
diff --git a/include/block/block.h b/include/block/block.h
index 493b7c5..029a8a7 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -457,9 +457,12 @@ BdrvDirtyBitmap *bdrv_find_dirty_bitmap(BlockDriverState *bs,
                                         const char *name);
 void bdrv_dirty_bitmap_make_anon(BlockDriverState *bs, BdrvDirtyBitmap *bitmap);
 void bdrv_release_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap);
+void bdrv_disable_dirty_bitmap(BdrvDirtyBitmap *bitmap);
+void bdrv_enable_dirty_bitmap(BdrvDirtyBitmap *bitmap);
 BlockDirtyInfoList *bdrv_query_dirty_bitmaps(BlockDriverState *bs);
 uint32_t bdrv_get_default_bitmap_granularity(BlockDriverState *bs);
 uint32_t bdrv_dirty_bitmap_granularity(BdrvDirtyBitmap *bitmap);
+bool bdrv_dirty_bitmap_enabled(BdrvDirtyBitmap *bitmap);
 int bdrv_get_dirty(BlockDriverState *bs, BdrvDirtyBitmap *bitmap, int64_t sector);
 void bdrv_set_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap,
                            int64_t cur_sector, int nr_sectors);
-- 
2.1.0

[Qemu-block] [PATCH v6 17/21] iotests: add invalid input incremental backup tests

2015-04-17 Thread John Snow

Signed-off-by: John Snow js...@redhat.com
Reviewed-by: Max Reitz mre...@redhat.com
Reviewed-by: Stefan Hajnoczi stefa...@redhat.com
---
 tests/qemu-iotests/124 | 104 +
 tests/qemu-iotests/124.out |   5 +++
 tests/qemu-iotests/group   |   1 +
 3 files changed, 110 insertions(+)
 create mode 100644 tests/qemu-iotests/124
 create mode 100644 tests/qemu-iotests/124.out

diff --git a/tests/qemu-iotests/124 b/tests/qemu-iotests/124
new file mode 100644
index 000..85675ec
--- /dev/null
+++ b/tests/qemu-iotests/124
@@ -0,0 +1,104 @@
+#!/usr/bin/env python
+#
+# Tests for incremental drive-backup
+#
+# Copyright (C) 2015 John Snow for Red Hat, Inc.
+#
+# Based on 056.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see http://www.gnu.org/licenses/.
+#
+
+import os
+import iotests
+
+
+def io_write_patterns(img, patterns):
+    for pattern in patterns:
+        iotests.qemu_io('-c', 'write -P%s %s %s' % pattern, img)
+
+
+class TestIncrementalBackup(iotests.QMPTestCase):
+    def setUp(self):
+        self.bitmaps = list()
+        self.files = list()
+        self.drives = list()
+        self.vm = iotests.VM()
+        self.err_img = os.path.join(iotests.test_dir, 'err.%s' % iotests.imgfmt)
+
+        # Create a base image with a distinctive patterning
+        drive0 = self.add_node('drive0')
+        self.img_create(drive0['file'], drive0['fmt'])
+        self.vm.add_drive(drive0['file'])
+        io_write_patterns(drive0['file'], (('0x41', 0, 512),
+                                           ('0xd5', '1M', '32k'),
+                                           ('0xdc', '32M', '124k')))
+        self.vm.launch()
+
+
+    def add_node(self, node_id, fmt=iotests.imgfmt, path=None, backup=None):
+        if path is None:
+            path = os.path.join(iotests.test_dir, '%s.%s' % (node_id, fmt))
+        if backup is None:
+            backup = os.path.join(iotests.test_dir,
+                                  '%s.full.backup.%s' % (node_id, fmt))
+
+        self.drives.append({
+            'id': node_id,
+            'file': path,
+            'backup': backup,
+            'fmt': fmt })
+        return self.drives[-1]
+
+
+    def img_create(self, img, fmt=iotests.imgfmt, size='64M',
+                   parent=None, parentFormat=None):
+        if parent:
+            if parentFormat is None:
+                parentFormat = fmt
+            iotests.qemu_img('create', '-f', fmt, img, size,
+                             '-b', parent, '-F', parentFormat)
+        else:
+            iotests.qemu_img('create', '-f', fmt, img, size)
+        self.files.append(img)
+
+    def test_sync_dirty_bitmap_missing(self):
+        self.assert_no_active_block_jobs()
+        self.files.append(self.err_img)
+        result = self.vm.qmp('drive-backup', device=self.drives[0]['id'],
+                             sync='dirty-bitmap', format=self.drives[0]['fmt'],
+                             target=self.err_img)
+        self.assert_qmp(result, 'error/class', 'GenericError')
+
+
+    def test_sync_dirty_bitmap_not_found(self):
+        self.assert_no_active_block_jobs()
+        self.files.append(self.err_img)
+        result = self.vm.qmp('drive-backup', device=self.drives[0]['id'],
+                             sync='dirty-bitmap', bitmap='unknown',
+                             format=self.drives[0]['fmt'], target=self.err_img)
+        self.assert_qmp(result, 'error/class', 'GenericError')
+
+
+    def tearDown(self):
+        self.vm.shutdown()
+        for filename in self.files:
+            try:
+                os.remove(filename)
+            except OSError:
+                pass
+
+
+if __name__ == '__main__':
+    iotests.main(supported_fmts=['qcow2'])
diff --git a/tests/qemu-iotests/124.out b/tests/qemu-iotests/124.out
new file mode 100644
index 000..fbc63e6
--- /dev/null
+++ b/tests/qemu-iotests/124.out
@@ -0,0 +1,5 @@
+..
+--
+Ran 2 tests
+
+OK
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index bcf2578..f9830d2 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -123,5 +123,6 @@
 116 rw auto quick
 121 rw auto
 123 rw auto quick
+124 rw auto backing
 128 rw auto quick
 130 rw auto quick
-- 
2.1.0

[Qemu-block] [PATCH v6 21/21] iotests: add incremental backup granularity tests

2015-04-17 Thread John Snow

Test what happens if you fiddle with the granularity.

Reviewed-by: Max Reitz mre...@redhat.com
Signed-off-by: John Snow js...@redhat.com
---
 tests/qemu-iotests/124 | 58 +-
 tests/qemu-iotests/124.out |  4 ++--
 2 files changed, 49 insertions(+), 13 deletions(-)

diff --git a/tests/qemu-iotests/124 b/tests/qemu-iotests/124
index 95f6de5..3ee78cd 100644
--- a/tests/qemu-iotests/124
+++ b/tests/qemu-iotests/124
@@ -158,11 +158,11 @@ class TestIncrementalBackup(iotests.QMPTestCase):
         self.assertTrue(res)
 
 
-    def add_bitmap(self, name, drive):
+    def add_bitmap(self, name, drive, **kwargs):
         bitmap = Bitmap(name, drive)
         self.bitmaps.append(bitmap)
         result = self.vm.qmp('block-dirty-bitmap-add', node=drive['id'],
-                             name=bitmap.name)
+                             name=bitmap.name, **kwargs)
         self.assert_qmp(result, 'return', {})
         return bitmap
 
@@ -212,16 +212,9 @@ class TestIncrementalBackup(iotests.QMPTestCase):
         self.vm.hmp_qemu_io(drive, 'flush')
 
 
-    def test_incremental_simple(self):
-        '''
-        Test: Create and verify three incremental backups.
-
-        Create a bitmap and a full backup before VM execution begins,
-        then create a series of three incremental backups during execution,
-        i.e.; after IO requests begin modifying the drive.
-        '''
+    def do_incremental_simple(self, **kwargs):
         self.create_anchor_backup()
-        self.add_bitmap('bitmap0', self.drives[0])
+        self.add_bitmap('bitmap0', self.drives[0], **kwargs)
 
         # Sanity: Create a hollow incremental backup
         self.create_incremental()
@@ -240,6 +233,37 @@ class TestIncrementalBackup(iotests.QMPTestCase):
         self.check_backups()
 
 
+    def test_incremental_simple(self):
+        '''
+        Test: Create and verify three incremental backups.
+
+        Create a bitmap and a full backup before VM execution begins,
+        then create a series of three incremental backups during execution,
+        i.e.; after IO requests begin modifying the drive.
+        '''
+        return self.do_incremental_simple()
+
+
+    def test_small_granularity(self):
+        '''
+        Test: Create and verify backups made with a small granularity bitmap.
+
+        Perform the same test as test_incremental_simple, but with a granularity
+        of only 32KiB instead of the present default of 64KiB.
+        '''
+        return self.do_incremental_simple(granularity=32768)
+
+
+    def test_large_granularity(self):
+        '''
+        Test: Create and verify backups made with a large granularity bitmap.
+
+        Perform the same test as test_incremental_simple, but with a granularity
+        of 128KiB instead of the present default of 64KiB.
+        '''
+        return self.do_incremental_simple(granularity=131072)
+
+
     def test_incremental_failure(self):
         '''Test: Verify backups made after a failure are correct.
 
@@ -315,6 +339,18 @@ class TestIncrementalBackup(iotests.QMPTestCase):
         self.assert_qmp(result, 'error/class', 'GenericError')
 
 
+    def test_sync_dirty_bitmap_bad_granularity(self):
+        '''
+        Test: Test what happens if we provide an improper granularity.
+
+        The granularity must always be a power of 2.
+        '''
+        self.assert_no_active_block_jobs()
+        self.assertRaises(AssertionError, self.add_bitmap,
+                          'bitmap0', self.drives[0],
+                          granularity=64000)
+
+
     def tearDown(self):
         self.vm.shutdown()
         for bitmap in self.bitmaps:
diff --git a/tests/qemu-iotests/124.out b/tests/qemu-iotests/124.out
index 89968f3..2f7d390 100644
--- a/tests/qemu-iotests/124.out
+++ b/tests/qemu-iotests/124.out
@@ -1,5 +1,5 @@
-
+...
 --
-Ran 4 tests
+Ran 7 tests
 
 OK
-- 
2.1.0

[Qemu-block] [PATCH v6 13/21] block: add BdrvDirtyBitmap documentation

2015-04-17 Thread John Snow

Signed-off-by: John Snow js...@redhat.com
Reviewed-by: Max Reitz mre...@redhat.com
Reviewed-by: Stefan Hajnoczi stefa...@redhat.com
---
 block.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/block.c b/block.c
index ca73a6a..30c8568 100644
--- a/block.c
+++ b/block.c
@@ -60,11 +60,11 @@
  * or enabled. A frozen bitmap can only abdicate() or reclaim().
  */
 struct BdrvDirtyBitmap {
-    HBitmap *bitmap;
-    BdrvDirtyBitmap *successor;
-    int64_t size;
-    char *name;
-    bool disabled;
+    HBitmap *bitmap;            /* Dirty sector bitmap implementation */
+    BdrvDirtyBitmap *successor; /* Anonymous child; implies frozen status */
+    char *name;                 /* Optional non-empty unique ID */
+    int64_t size;               /* Size of the bitmap (Number of sectors) */
+    bool disabled;              /* Bitmap is read-only */
     QLIST_ENTRY(BdrvDirtyBitmap) list;
 };
 
-- 
2.1.0

[Qemu-block] [PATCH v6 10/21] qmp: Add support of dirty-bitmap sync mode for drive-backup

2015-04-17 Thread John Snow

For dirty-bitmap sync mode, the block job iterates through the given
dirty bitmap to decide whether a sector needs to be backed up: dirty
clusters are copied and clean ones are skipped, much as top sync mode
uses allocation status to make the same decision.

Signed-off-by: Fam Zheng f...@redhat.com
Signed-off-by: John Snow js...@redhat.com
---
 block.c   |   9 +++
 block/backup.c| 155 +++---
 block/mirror.c|   4 ++
 blockdev.c|  18 +-
 hmp.c |   3 +-
 include/block/block.h |   1 +
 include/block/block_int.h |   2 +
 qapi/block-core.json  |  14 +++--
 qmp-commands.hx   |   8 ++-
 9 files changed, 181 insertions(+), 33 deletions(-)

diff --git a/block.c b/block.c
index 8f08b6e..185cd7f 100644
--- a/block.c
+++ b/block.c
@@ -5717,6 +5717,15 @@ static void bdrv_reset_dirty(BlockDriverState *bs, int64_t cur_sector,
     }
 }
 
+/**
+ * Advance an HBitmapIter to an arbitrary offset.
+ */
+void bdrv_set_dirty_iter(HBitmapIter *hbi, int64_t offset)
+{
+    assert(hbi->hb);
+    hbitmap_iter_init(hbi, hbi->hb, offset);
+}
+
 int64_t bdrv_get_dirty_count(BlockDriverState *bs, BdrvDirtyBitmap *bitmap)
 {
     return hbitmap_count(bitmap->bitmap);
diff --git a/block/backup.c b/block/backup.c
index 1c535b1..e77f7e8 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -37,6 +37,8 @@ typedef struct CowRequest {
 typedef struct BackupBlockJob {
     BlockJob common;
     BlockDriverState *target;
+    /* bitmap for sync=dirty-bitmap */
+    BdrvDirtyBitmap *sync_bitmap;
     MirrorSyncMode sync_mode;
     RateLimit limit;
     BlockdevOnError on_source_error;
@@ -242,6 +244,91 @@ static void backup_complete(BlockJob *job, void *opaque)
     g_free(data);
 }
 
+static bool coroutine_fn yield_and_check(BackupBlockJob *job)
+{
+    if (block_job_is_cancelled(&job->common)) {
+        return true;
+    }
+
+    /* we need to yield so that qemu_aio_flush() returns.
+     * (without, VM does not reboot)
+     */
+    if (job->common.speed) {
+        uint64_t delay_ns = ratelimit_calculate_delay(&job->limit,
+                                                      job->sectors_read);
+        job->sectors_read = 0;
+        block_job_sleep_ns(&job->common, QEMU_CLOCK_REALTIME, delay_ns);
+    } else {
+        block_job_sleep_ns(&job->common, QEMU_CLOCK_REALTIME, 0);
+    }
+
+    if (block_job_is_cancelled(&job->common)) {
+        return true;
+    }
+
+    return false;
+}
+
+static int coroutine_fn backup_run_incremental(BackupBlockJob *job)
+{
+    bool error_is_read;
+    int ret = 0;
+    int clusters_per_iter;
+    uint32_t granularity;
+    int64_t sector;
+    int64_t cluster;
+    int64_t end;
+    int64_t last_cluster = -1;
+    BlockDriverState *bs = job->common.bs;
+    HBitmapIter hbi;
+
+    granularity = bdrv_dirty_bitmap_granularity(job->sync_bitmap);
+    clusters_per_iter = MAX((granularity / BACKUP_CLUSTER_SIZE), 1);
+    bdrv_dirty_iter_init(bs, job->sync_bitmap, &hbi);
+
+    /* Find the next dirty sector(s) */
+    while ((sector = hbitmap_iter_next(&hbi)) != -1) {
+        cluster = sector / BACKUP_SECTORS_PER_CLUSTER;
+
+        /* Fake progress updates for any clusters we skipped */
+        if (cluster != last_cluster + 1) {
+            job->common.offset += ((cluster - last_cluster - 1) *
+                                   BACKUP_CLUSTER_SIZE);
+        }
+
+        for (end = cluster + clusters_per_iter; cluster < end; cluster++) {
+            do {
+                if (yield_and_check(job)) {
+                    return ret;
+                }
+                ret = backup_do_cow(bs, cluster * BACKUP_SECTORS_PER_CLUSTER,
+                                    BACKUP_SECTORS_PER_CLUSTER, &error_is_read);
+                if ((ret < 0) &&
+                    backup_error_action(job, error_is_read, -ret) ==
+                    BLOCK_ERROR_ACTION_REPORT) {
+                    return ret;
+                }
+            } while (ret < 0);
+        }
+
+        /* If the bitmap granularity is smaller than the backup granularity,
+         * we need to advance the iterator pointer to the next cluster. */
+        if (granularity < BACKUP_CLUSTER_SIZE) {
+            bdrv_set_dirty_iter(&hbi, cluster * BACKUP_SECTORS_PER_CLUSTER);
+        }
+
+        last_cluster = cluster - 1;
+    }
+
+    /* Play some final catchup with the progress meter */
+    end = DIV_ROUND_UP(job->common.len, BACKUP_CLUSTER_SIZE);
+    if (last_cluster + 1 < end) {
+        job->common.offset += ((end - last_cluster - 1) * BACKUP_CLUSTER_SIZE);
+    }
+
+    return ret;
+}
+
 static void coroutine_fn backup_run(void *opaque)
 {
     BackupBlockJob *job = opaque;
@@ -259,8 +346,7 @@ static void coroutine_fn backup_run(void *opaque)
     qemu_co_rwlock_init(&job->flush_rwlock);
 
     start = 0;
-    end = DIV_ROUND_UP(job->common.len / BDRV_SECTOR_SIZE,
-                       BACKUP_SECTORS_PER_CLUSTER);
+    end = DIV_ROUND_UP(job->common.len

Re: [Qemu-block] [Qemu-devel] [PATCH v4 19/20] iotests: add simple incremental backup case

2015-04-06 Thread John Snow




On 04/02/2015 10:27 AM, Stefan Hajnoczi wrote:

On Fri, Mar 20, 2015 at 03:17:02PM -0400, John Snow wrote:

Signed-off-by: John Snow js...@redhat.com
---
  tests/qemu-iotests/124 | 153 +
  tests/qemu-iotests/124.out |   4 +-
  2 files changed, 155 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/124 b/tests/qemu-iotests/124
index 85675ec..ce2cda7 100644
--- a/tests/qemu-iotests/124
+++ b/tests/qemu-iotests/124


[snip, snip, snip]



  class TestIncrementalBackup(iotests.QMPTestCase):
      def setUp(self):
@@ -73,6 +109,123 @@ class TestIncrementalBackup(iotests.QMPTestCase):
              iotests.qemu_img('create', '-f', fmt, img, size)
          self.files.append(img)

+
+    def create_full_backup(self, drive=None):
+        if drive is None:
+            drive = self.drives[-1]
+
+        res = self.vm.qmp('drive-backup', device=drive['id'],
+                          sync='full', format=drive['fmt'],
+                          target=drive['backup'])
+        self.assert_qmp(res, 'return', {})
+        self.wait_until_completed(drive['id'])
+        self.check_full_backup(drive)
+        self.files.append(drive['backup'])
+        return drive['backup']
+
+
+    def check_full_backup(self, drive=None):
+        if drive is None:
+            drive = self.drives[-1]
+        self.assertTrue(iotests.compare_images(drive['file'], drive['backup']))


I think QEMU still has at least drive['file'] open?  It's not safe to
access the file from another program while it is open.

The simplest solution is to terminate the VM before calling
iotests.compare_images().



Oh, that's unfortunate. I have been checking images after every creation 
as a nice incremental sanity check. That will be hard to do if I have to 
actually close QEMU.


My reasoning was:

(1) We're explicitly flushing after every write,
(2) We're in a qtest mode so there is no guest activity,
(3) Nobody is writing to this image by the time we call compare_images().

So it should be safe to /read/ the files while QEMU is occupied doing 
nothing.


I could delay all tests until completion, but I'd lose out on the 
ability to test the equivalence of any incrementals that are not their 
most recent versions, unless I also start creating a lot of full 
backups, but I think this starts to get messy and makes the tests hard 
to follow.


Is it really that unsafe? I could add in an explicit pause/resume 
barrier around the check if that would help inspire some confidence in 
the test.


--js

Re: [Qemu-block] [Qemu-devel] qemu-img behavior for locating backing files

2015-04-07 Thread John Snow




On 04/07/2015 04:44 AM, Kevin Wolf wrote:

Am 07.04.2015 um 02:31 hat John Snow geschrieben:



On 04/02/2015 05:38 AM, Kevin Wolf wrote:

Am 01.04.2015 um 18:16 hat John Snow geschrieben:

Kevin, what's the correct behavior for qemu-img and relative paths
when creating a new qcow2 file?

Example:

(in e.g. /home/qemu/build/ or anywhere not /home: )
qemu-img create -f qcow2 base.qcow2 32G
qemu-img create -f qcow2 -F qcow2 -b base.qcow2 /home/overlay.qcow2

In 1.7.0, this produces a warning that the base object cannot be
found (because it does not exist at that location relative to
overlay.qcow2), but qemu-img will create the qcow2 for you
regardless.

2.0, 2.1 and 2.2 all will create the image successfully, with no warnings.

2.3-rc1/master as they exist now will emit an error message and
create no image.





Since this is a change in behavior for the pending release, is this
the correct/desired behavior?


Part one of the answer is easy: qemu-img create should succeed if, and
only if, a usable image is created. This requires that the backing file
exists.



So far so good.


Part two is a bit harder: Should base.qcow2 be found in the current
directory even if the new image is somewhere else? We must give
preference to an existing base.qcow2 relative to the new image path, but
if it doesn't exist, we could in theory try to find it relative to the
working directory.



Nack. This seems like inviting heartbreak unnecessarily.


If we then find it, we have two options: Either we use that image
(probably with an absolute path then?) or we print a useful error
message that instructs the user how relative paths work with images.
I think the latter is better because the other option feels like too
much magic.



Too much magic indeed. I think where ambiguity of paths is
concerned, it is best to stick to one particular path and make it
very explicit.

In this case, if we cannot find some relative path offered by the
user, an error message such as:

Hey! We can't find /absolute/path/to/../your/relative/file.qcow2

should be sufficient to clue the user in to where qemu-img is
looking for this backing file.


Yes, printing the combined path sounds like a good option.


A usability bonus might be when we go to whine at the user, if the
file exists relative to the PWD:

Qemu noticed a file at /your/pwd/.../your/relative/file.qcow2, but
Qemu expects relative paths for backing files to be relative to the
image referencing them, not your current PWD

but this is just a Deluxe Niceness.
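
The lookup rule and the extra hint discussed here can be sketched in Python. This is only an illustration, not qemu-img's code: the function names and the message wording are invented for the example, and the only behaviour taken from the thread is that a relative backing file name is resolved against the directory of the image that references it.

```python
import os


def resolve_backing_path(image_path, backing):
    """Resolve a backing file name relative to the referencing image.

    Relative names are joined to the directory of image_path, not to
    the current working directory. Hypothetical helper.
    """
    if os.path.isabs(backing):
        return backing
    image_dir = os.path.dirname(os.path.abspath(image_path))
    return os.path.normpath(os.path.join(image_dir, backing))


def backing_file_error(image_path, backing, cwd=None):
    """Return None if the backing file exists, else an error string.

    The message text is made up for this sketch.
    """
    resolved = resolve_backing_path(image_path, backing)
    if os.path.exists(resolved):
        return None
    msg = "Could not find backing file '%s'" % resolved
    # Optional hint: the file exists relative to the working directory.
    cwd = os.getcwd() if cwd is None else cwd
    candidate = os.path.normpath(os.path.join(cwd, backing))
    if not os.path.isabs(backing) and os.path.exists(candidate):
        msg += ("\nNote: '%s' exists, but relative backing file paths are "
                "resolved relative to the image, not the working directory"
                % candidate)
    return msg
```

Printing the fully combined path in the first message is what tells the user where the lookup actually happened; the note is appended only when the working-directory candidate exists.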


If we touch it anyway, why not have Deluxe Niceness?



Your wish is my command!


In any case, the behaviour you describe for 2.3-rc1 seems to be the best
that we've had until now; 1.7.0 looks like the second best. We should
probably document the 2.3-rc1 behaviour with a qemu-iotests case.



Absolutely.


Are you going to take care of that, or should I write one?


Oh, and we still have a bug: If you specify an image size, qemu-img
doesn't check at all whether the backing file exists.


Same question here.

Kevin



I meant to imply I'd do it, thanks!

--js

Re: [Qemu-block] [PATCH v4 15/20] block: Resize bitmaps on bdrv_truncate

2015-04-07 Thread John Snow




On 04/07/2015 08:57 AM, Stefan Hajnoczi wrote:

On Thu, Apr 02, 2015 at 11:57:59AM -0400, John Snow wrote:



On 04/02/2015 09:37 AM, Stefan Hajnoczi wrote:

On Fri, Mar 20, 2015 at 03:16:58PM -0400, John Snow wrote:

+void hbitmap_truncate(HBitmap *hb, uint64_t size)
+{
+    bool shrink;
+    unsigned i;
+    uint64_t num_elements = size;
+    uint64_t old;
+
+    /* Size comes in as logical elements, adjust for granularity. */
+    size = (size + (1ULL << hb->granularity) - 1) >> hb->granularity;
+    assert(size <= ((uint64_t)1 << HBITMAP_LOG_MAX_SIZE));
+    shrink = size < hb->size;
+
+    /* bit sizes are identical; nothing to do. */
+    if (size == hb->size) {
+        return;
+    }
+
+    /* If we're losing bits, let's clear those bits before we invalidate all of
+     * our invariants. This helps keep the bitcount consistent, and will prevent
+     * us from carrying around garbage bits beyond the end of the map.
+     *
+     * Because clearing bits past the end of map might reset bits we care about
+     * within the array, record the current value of the last bit we're keeping.
+     */
+    if (shrink) {
+        bool set = hbitmap_get(hb, num_elements - 1);
+        uint64_t fix_count = (hb->size << hb->granularity) - num_elements;
+
+        assert(fix_count);
+        hbitmap_reset(hb, num_elements, fix_count);
+        if (set) {
+            hbitmap_set(hb, num_elements - 1, 1);
+        }


Why is it necessary to set the last bit (if it was set)?  The comment
isn't clear to me.



Sure. The granularity of the bitmap provides us with virtual bit groups. For
a granularity of, say, g=2, we have 2^2 virtual bits for every real bit:

101 in memory is treated, virtually, as 1111 0000 1111.

The get/set calls operate on virtual bits, not concrete ones, so if we were
to reset virtual bits 2-11:
11|11 0000 1111

We'd set the real bits to '000', because we clear or set the entire virtual
group.

This is probably not what we really want, so as a shortcut I just read and
then re-set the final bit.

It is programmatically avoidable (Are we truncating into a granularity
group?) but in the case that we are, I'd need to read/reset the bit anyway,
so it seemed fine to just unconditionally apply the fix.
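
The effect described above can be reproduced with a small Python model. It is a sketch under simplifying assumptions, not the HBitmap implementation: the bitmap is a flat list of real bits, the helper names are invented, and the model only clears bits (it does not resize anything). It contrasts clearing from num_elements directly with clearing from the first fully discarded group.

```python
def reset_virtual(real_bits, granularity, start, count):
    """Clear virtual bits [start, start + count).

    Each real bit covers 2**granularity virtual bits, so any real bit
    touched by the range is cleared as a whole.
    """
    if count <= 0:
        return list(real_bits)
    first = start >> granularity
    last = (start + count - 1) >> granularity
    return [0 if first <= i <= last else b for i, b in enumerate(real_bits)]


def align_up(n, m):
    """Round n up to the next multiple of m."""
    return (n + m - 1) // m * m


def truncate_naive(real_bits, granularity, num_elements):
    """Clear everything from num_elements onward, mid-group or not."""
    total = len(real_bits) << granularity
    return reset_virtual(real_bits, granularity, num_elements,
                         total - num_elements)


def truncate_aligned(real_bits, granularity, num_elements):
    """Clear only whole groups that lie entirely past num_elements."""
    total = len(real_bits) << granularity
    start = align_up(num_elements, 1 << granularity)
    return reset_virtual(real_bits, granularity, start, total - start)
```

With real bits 101, g=2 and a new size of 2 elements, truncate_naive clears all three real bits, while truncate_aligned keeps the first group intact and clears only the last two.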


I see.  This is equivalent to:

uint64_t start = QEMU_ALIGN_UP(num_elements, hb->granularity);


Probably you mean QEMU_ALIGN_UP(num_elements, 1 << hb->granularity)


uint64_t fix_count = (hb->size << hb->granularity) - start;
hbitmap_reset(hb, start, fix_count);

The explicit QEMU_ALIGN_UP(num_elements, hb->granularity) calculation
shows that we're working around the granularity.  I find this easier to
understand.

If you keep the get/set version, please extend the comment to explain
that clearing the first bit could destroy up to granularity - 1 bits
that must be preserved.



Your solution will read more nicely, so I'll just adopt that, thanks.


Stefan

Re: [Qemu-block] [Qemu-devel] Migration sometimes fails with IDE and Qemu 2.2.1

2015-04-07 Thread John Snow

On 04/07/2015 02:44 PM, Peter Lieven wrote:

Am 07.04.2015 um 17:29 schrieb Dr. David Alan Gilbert:

* Peter Lieven (p...@kamp.de) wrote:

Hi David,

Am 07.04.2015 um 10:43 schrieb Dr. David Alan Gilbert:

Any particular workload or reproducer?

Workload is almost zero. I try to figure out if there is a way to trigger it.

Maybe playing a role: Machine type is -M pc1.2 and we set -kvmclock as
CPU flag since kvmclock seemed to be quite buggy in 2.6.16...

Exact cmdline is:
/usr/bin/qemu-2.2.1 -enable-kvm -M pc-1.2 -nodefaults -netdev
type=tap,id=guest2,script=no,downscript=no,ifname=tap2 -device
e1000,netdev=guest2,mac=52:54:00:ff:00:65 -drive
format=raw,file=iscsi://172.21.200.53/iqn.2001-05.com.equallogic:4-52aed6-88a7e99a4-d9e00040fdc509a3-XXX-hd0/0,if=ide,cache=writeback,aio=native
-serial null -parallel null -m 1024 -smp 2,sockets=1,cores=2,threads=1
-monitor tcp:0:4003,server,nowait -vnc :3 -qmp tcp:0:3003,server,nowait -name
'XXX' -boot order=c,once=dc,menu=off -drive
index=2,media=cdrom,if=ide,cache=unsafe,aio=native,readonly=on -k de
-incoming tcp:0:5003 -pidfile /var/run/qemu/vm-146.pid -mem-path /hugepages
-mem-prealloc -rtc base=utc -usb -usbdevice tablet -no-hpet -vga cirrus -cpu
qemu64,-kvmclock

Exact kernel is:
2.6.16.46-0.12-smp (i think this is SLES10 or sth.)

The machine does not hang. It seems just I/O is hanging. So you can type at the
console or ping the system, but no longer login.

Thank you,
Peter

Interesting observation: Migrating the vServer again seems to fix the problem
(at least in one case I could test just now).

2.6.8-24-smp is also affected.

How often does it fail - you say 'sometimes' - is it a 1/10 or a 1/1000 ?

It's more often than 1/10 I would say.

OK, that's not too bad - it's the 1/1000 that are really nasty to find.
In your setup, how easy would it be for you to try :
with either 2.1 or current head?
with a newer machine-type?
without the cdrom?

It's all possible. I can clone the system and try everything on my test systems.
I hope it reproduces there.

Has the cdrom the power of taking down the bus?

Peter

I don't know if CDROM could stall the entire bus, but I suspect the
reason for asking is this: dgilbert and I had tracked down a problem
previously where outstanding requests being handled by the ATAPI code
can get lost during migration if, for instance, the user has only
prepared the command (via bmdma) but has not yet written to the
register to activate it.

So if something like this happens:

- User writes to the ATA registers to program a command
- Migration occurs
- User writes to the BMDMA register to initiate the command

We can lose some of the state and data of the request. David had checked
in a workaround for at least ATAPI that simply coaxes the guest OS into
trying the command again to unstick it.

I think we determined last time that we couldn't fix this problem
without changing the migration format, so we opted not to do it for 2.3.
We had also only noticed it with ATAPI drives, not HDDs, so a proper fix
got kicked down the road since we thought the workaround was sufficient.

IIRC our success rate with reproducing it was something on the order of
1/50, too.

If you can reproduce it without a CDROM but using the BMDMA interface,
that's a good data point. If you can't reproduce it using the ISA
interface, that's a phenomenal data point and implicates BMDMA pretty
heavily.

--js

Re: [Qemu-block] [PATCH v4 10/20] qmp: Add support of dirty-bitmap sync mode for drive-backup

2015-04-07 Thread John Snow




On 04/02/2015 08:44 AM, Stefan Hajnoczi wrote:

On Fri, Mar 20, 2015 at 03:16:53PM -0400, John Snow wrote:

+    } else if (job->sync_mode == MIRROR_SYNC_MODE_DIRTY_BITMAP) {
+        /* Dirty Bitmap sync has a slightly different iteration method */
+        HBitmapIter hbi;
+        int64_t sector;
+        int64_t cluster;
+        int64_t last_cluster = -1;
+        bool polyrhythmic;
+
+        bdrv_dirty_iter_init(bs, job->sync_bitmap, &hbi);
+        /* Does the granularity happen to match our backup cluster size? */
+        polyrhythmic = (bdrv_dirty_bitmap_granularity(job->sync_bitmap) !=
+                        BACKUP_CLUSTER_SIZE);
+
+        /* Find the next dirty /sector/ and copy that /cluster/ */
+        while ((sector = hbitmap_iter_next(&hbi)) != -1) {
+            cluster = sector / BACKUP_SECTORS_PER_CLUSTER;
+
+            /* Fake progress updates for any clusters we skipped,
+             * excluding this current cluster. */
+            if (cluster != last_cluster + 1) {
+                job->common.offset += ((cluster - last_cluster - 1) *
+                                       BACKUP_CLUSTER_SIZE);
+            }
+
+            if (yield_and_check(job)) {
+                goto leave;
+            }
+
+            do {
+                ret = backup_do_cow(bs, cluster * BACKUP_SECTORS_PER_CLUSTER,
+                                    BACKUP_SECTORS_PER_CLUSTER, &error_is_read);
+                if ((ret < 0) &&
+                    backup_error_action(job, error_is_read, -ret) ==
+                    BLOCK_ERROR_ACTION_REPORT) {
+                    goto leave;
+                }
+            } while (ret < 0);
+
+            /* Advance (or rewind) our iterator if we need to. */
+            if (polyrhythmic) {
+                bdrv_set_dirty_iter(&hbi,
+                                    (cluster + 1) * BACKUP_SECTORS_PER_CLUSTER);
+            }
+
+            last_cluster = cluster;
+        }


What happens when the dirty bitmap granularity is larger than
BACKUP_SECTORS_PER_CLUSTER?

|-----bitmap granularity-----|
|---backup cluster---|
                      ~~~~~~~
Will these sectors ever be copied?

I think this case causes an infinite loop:

   cluster = hbitmap_iter_next(&hbi) / BACKUP_SECTORS_PER_CLUSTER

The iterator is reset:

   bdrv_set_dirty_iter(&hbi, (cluster + 1) * BACKUP_SECTORS_PER_CLUSTER);

So we get the same cluster every time and never advance?



I had mistakenly thought that if I advanced to the middle of a
granularity grouping, the iterator might return the next (virtual)
index to me, instead of the beginning of the current group.


Anyway, I've fixed this up a bit and added in some granularity variance 
tests for the iotests to test what happens if the granularity is larger, 
equal, or smaller.


v5 is ready to send out, but I need to test it first, so I will probably 
send that out Wednesday night.


Thanks,
--js

Re: [Qemu-block] [PATCH v4 15/20] block: Resize bitmaps on bdrv_truncate

2015-04-02 Thread John Snow




On 04/02/2015 09:37 AM, Stefan Hajnoczi wrote:

On Fri, Mar 20, 2015 at 03:16:58PM -0400, John Snow wrote:

+void hbitmap_truncate(HBitmap *hb, uint64_t size)
+{
+    bool shrink;
+    unsigned i;
+    uint64_t num_elements = size;
+    uint64_t old;
+
+    /* Size comes in as logical elements, adjust for granularity. */
+    size = (size + (1ULL << hb->granularity) - 1) >> hb->granularity;
+    assert(size <= ((uint64_t)1 << HBITMAP_LOG_MAX_SIZE));
+    shrink = size < hb->size;
+
+    /* bit sizes are identical; nothing to do. */
+    if (size == hb->size) {
+        return;
+    }
+
+    /* If we're losing bits, let's clear those bits before we invalidate all of
+     * our invariants. This helps keep the bitcount consistent, and will prevent
+     * us from carrying around garbage bits beyond the end of the map.
+     *
+     * Because clearing bits past the end of map might reset bits we care about
+     * within the array, record the current value of the last bit we're keeping.
+     */
+    if (shrink) {
+        bool set = hbitmap_get(hb, num_elements - 1);
+        uint64_t fix_count = (hb->size << hb->granularity) - num_elements;
+
+        assert(fix_count);
+        hbitmap_reset(hb, num_elements, fix_count);
+        if (set) {
+            hbitmap_set(hb, num_elements - 1, 1);
+        }


Why is it necessary to set the last bit (if it was set)?  The comment
isn't clear to me.



Sure. The granularity of the bitmap provides us with virtual bit groups.
For a granularity of, say, g=2, we have 2^2 virtual bits for every real bit:

101 in memory is treated, virtually, as 1111 0000 1111.

The get/set calls operate on virtual bits, not concrete ones, so if we
were to reset virtual bits 2-11:

11|11 0000 1111

We'd set the real bits to '000', because we clear or set the entire
virtual group.


This is probably not what we really want, so as a shortcut I just read 
and then re-set the final bit.


It is programmatically avoidable (Are we truncating into a granularity 
group?) but in the case that we are, I'd need to read/reset the bit 
anyway, so it seemed fine to just unconditionally apply the fix.
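
To make the grouping concrete, here is a toy Python model of the behaviour
described above. It is only a sketch: GroupedBitmap and truncate_tail are
hypothetical names, and a plain list stands in for the real multi-level
HBitmap. All it is meant to show is why the last kept bit is read before the
reset and set again afterwards.

```python
class GroupedBitmap:
    """Toy model: one real bit covers 2**granularity virtual bits."""

    def __init__(self, real_bits, granularity):
        self.real = list(real_bits)
        self.granularity = granularity

    def get(self, virtual_index):
        # A virtual bit reads back the real bit of the group it belongs to.
        return self.real[virtual_index >> self.granularity]

    def set(self, start, count):
        self._fill(start, count, 1)

    def reset(self, start, count):
        self._fill(start, count, 0)

    def _fill(self, start, count, value):
        # Every real bit whose group overlaps [start, start + count) changes.
        first = start >> self.granularity
        last = (start + count - 1) >> self.granularity
        for i in range(first, last + 1):
            self.real[i] = value


def truncate_tail(bitmap, num_elements, old_virtual_size):
    """Clear virtual bits past num_elements, preserving the last kept bit."""
    keep = bitmap.get(num_elements - 1)
    bitmap.reset(num_elements, old_virtual_size - num_elements)
    if keep:
        bitmap.set(num_elements - 1, 1)
```

With real bits 101 and g=2, a bare reset of virtual bits 2-11 leaves 000,
while truncate_tail(bitmap, 2, 12) leaves 100.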

Re: [Qemu-block] [PATCH v4 18/20] iotests: add QMP event waiting queue

2015-04-02 Thread John Snow




On 04/02/2015 09:57 AM, Stefan Hajnoczi wrote:

On Fri, Mar 20, 2015 at 03:17:01PM -0400, John Snow wrote:

+# Test if 'match' is a recursive subset of 'event'
+def event_match(event, match = None):


Not worth respinning but PEP8 says there should be no spaces around the
'=' for keyword arguments:
https://www.python.org/dev/peps/pep-0008/#whitespace-in-expressions-and-statements


+    def event_wait(self, name='BLOCK_JOB_COMPLETED', maxtries=3, match=None):
+        # Search cached events
+        for event in self._events:
+            if (event['event'] == name) and event_match(event, match):
+                self._events.remove(event)
+                return event
+
+        # Poll for new events
+        for _ in range(maxtries):
+            event = self._qmp.pull_event(wait=True)
+            if (event['event'] == name) and event_match(event, match):
+                return event
+            self._events.append(event)
+
+        return None


I'm not sure if maxtries is useful.  Why is a particular number of
skipped events useful to the caller and how do they pick the magic
number?

If you added the argument because this is a blocking operation then we
should probably use timeouts instead.  Timeouts will bail out even if
QEMU is unresponsive.



Yeah, this was just a poor man's timeout.

I'll make it nicer.
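
For reference, a deadline-based wait could look roughly like the sketch
below. This is not what v5 will contain; EventQueue and the pull_event
callable (which takes a timeout in seconds and returns an event dict or None)
are assumptions made up for the example, and the body of event_match is my
guess at "recursive subset", since only its signature appears in the quoted
patch.

```python
import time


def event_match(event, match=None):
    """Return True if 'match' is a recursive subset of 'event' (assumed)."""
    if match is None:
        return True
    for key, value in match.items():
        if key not in event:
            return False
        if isinstance(value, dict) and isinstance(event[key], dict):
            if not event_match(event[key], value):
                return False
        elif event[key] != value:
            return False
    return True


class EventQueue:
    def __init__(self, pull_event):
        # pull_event(timeout) is a hypothetical callable: it returns an event
        # dict, or None if nothing arrived within 'timeout' seconds.
        self._pull_event = pull_event
        self._events = []

    def event_wait(self, name, timeout=60.0, match=None):
        # Search cached events first
        for event in self._events:
            if event['event'] == name and event_match(event, match):
                self._events.remove(event)
                return event

        # Poll for new events until the deadline passes
        deadline = time.monotonic() + timeout
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                return None
            event = self._pull_event(remaining)
            if event is None:
                return None
            if event['event'] == name and event_match(event, match):
                return event
            self._events.append(event)
```

The caller then picks a wall-clock budget instead of a count of skipped
events, and an unresponsive source ends the wait once the budget is spent.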

Re: [Qemu-block] [PATCH v4 10/20] qmp: Add support of dirty-bitmap sync mode for drive-backup

2015-04-02 Thread John Snow




On 04/02/2015 08:44 AM, Stefan Hajnoczi wrote:

On Fri, Mar 20, 2015 at 03:16:53PM -0400, John Snow wrote:

+    } else if (job->sync_mode == MIRROR_SYNC_MODE_DIRTY_BITMAP) {
+        /* Dirty Bitmap sync has a slightly different iteration method */
+        HBitmapIter hbi;
+        int64_t sector;
+        int64_t cluster;
+        int64_t last_cluster = -1;
+        bool polyrhythmic;
+
+        bdrv_dirty_iter_init(bs, job->sync_bitmap, &hbi);
+        /* Does the granularity happen to match our backup cluster size? */
+        polyrhythmic = (bdrv_dirty_bitmap_granularity(job->sync_bitmap) !=
+                        BACKUP_CLUSTER_SIZE);
+
+        /* Find the next dirty /sector/ and copy that /cluster/ */
+        while ((sector = hbitmap_iter_next(&hbi)) != -1) {
+            cluster = sector / BACKUP_SECTORS_PER_CLUSTER;
+
+            /* Fake progress updates for any clusters we skipped,
+             * excluding this current cluster. */
+            if (cluster != last_cluster + 1) {
+                job->common.offset += ((cluster - last_cluster - 1) *
+                                       BACKUP_CLUSTER_SIZE);
+            }
+
+            if (yield_and_check(job)) {
+                goto leave;
+            }
+
+            do {
+                ret = backup_do_cow(bs, cluster * BACKUP_SECTORS_PER_CLUSTER,
+                                    BACKUP_SECTORS_PER_CLUSTER, &error_is_read);
+                if ((ret < 0) &&
+                    backup_error_action(job, error_is_read, -ret) ==
+                    BLOCK_ERROR_ACTION_REPORT) {
+                    goto leave;
+                }
+            } while (ret < 0);
+
+            /* Advance (or rewind) our iterator if we need to. */
+            if (polyrhythmic) {
+                bdrv_set_dirty_iter(&hbi,
+                                    (cluster + 1) * BACKUP_SECTORS_PER_CLUSTER);
+            }
+
+            last_cluster = cluster;
+        }


What happens when the dirty bitmap granularity is larger than
BACKUP_SECTORS_PER_CLUSTER?

|-----bitmap granularity-----|
|---backup cluster---|
                      ~~~~~~~
Will these sectors ever be copied?

I think this case causes an infinite loop:

   cluster = hbitmap_iter_next(&hbi) / BACKUP_SECTORS_PER_CLUSTER

The iterator is reset:

   bdrv_set_dirty_iter(&hbi, (cluster + 1) * BACKUP_SECTORS_PER_CLUSTER);

So we get the same cluster every time and never advance?



That is indeed a bug. Tracking to the middle of a granularity group will 
return the index of the group you're in the middle of, not the next group.
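
A small Python model of that behaviour shows the loop getting stuck. This is
a sketch under assumptions, not the hbitmap code: iter_next and
backup_clusters are hypothetical helpers, the dirty bitmap is a list with one
entry per granularity group, and max_steps only exists so the demonstration
terminates.

```python
def iter_next(dirty_groups, position, sectors_per_group):
    """Return the first sector of the next dirty group at or after position.

    A position in the middle of a group maps back to that group's start,
    as described above. Returns -1 when no dirty group is left.
    """
    group = position // sectors_per_group
    for g in range(group, len(dirty_groups)):
        if dirty_groups[g]:
            return g * sectors_per_group
    return -1


def backup_clusters(dirty_groups, sectors_per_group, sectors_per_cluster,
                    max_steps=16):
    """Follow the v4 loop; return the clusters copied, capped at max_steps."""
    copied = []
    position = 0
    while len(copied) < max_steps:
        sector = iter_next(dirty_groups, position, sectors_per_group)
        if sector == -1:
            break
        cluster = sector // sectors_per_cluster
        copied.append(cluster)
        # The v4 patch re-seeks to the start of the next backup cluster.
        position = (cluster + 1) * sectors_per_cluster
    return copied
```

With one dirty group of 256 sectors and 128-sector backup clusters, the
re-seek lands at sector 128, which is still inside group 0, so cluster 0 is
copied over and over. With groups smaller than a cluster the loop finishes.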


Thanks for catching this.

Re: [Qemu-block] [Qemu-devel] [PATCH 6/8] fdc: Disentangle phases in fdctrl_read_data()

2015-05-20 Thread John Snow



On 05/20/2015 04:25 AM, Kevin Wolf wrote:
 Am 19.05.2015 um 22:40 hat John Snow geschrieben:


 On 05/19/2015 11:36 AM, Kevin Wolf wrote:
 This commit makes similar improvements as have already been made to the
 write function: Instead of relying on a flag in the MSR to distinguish
 controller phases, use the explicit phase that we store now. Assertions
 of the right MSR flags are added.

 Signed-off-by: Kevin Wolf kw...@redhat.com
 ---
  hw/block/fdc.c | 33 +++--
  1 file changed, 23 insertions(+), 10 deletions(-)

 diff --git a/hw/block/fdc.c b/hw/block/fdc.c
 index cbf7abf..8d322e0 100644
 --- a/hw/block/fdc.c
 +++ b/hw/block/fdc.c
 @@ -1533,9 +1533,16 @@ static uint32_t fdctrl_read_data(FDCtrl *fdctrl)
          FLOPPY_DPRINTF("error: controller not ready for reading\n");
          return 0;
      }
 +
 +    /* If data_len spans multiple sectors, the current position in the FIFO
 +     * wraps around while fdctrl->data_pos is the real position in the whole
 +     * request. */
      pos = fdctrl->data_pos;
      pos %= FD_SECTOR_LEN;
 -    if (fdctrl->msr & FD_MSR_NONDMA) {
 +
 +    switch (fdctrl->phase) {
 +    case FD_PHASE_EXECUTION:
 +        assert(fdctrl->msr & FD_MSR_NONDMA);
          if (pos == 0) {
              if (fdctrl->data_pos != 0)
                  if (!fdctrl_seek_to_next_sect(fdctrl, cur_drv)) {
 @@ -1551,20 +1558,26 @@ static uint32_t fdctrl_read_data(FDCtrl *fdctrl)
                  memset(fdctrl->fifo, 0, FD_SECTOR_LEN);
              }
          }
 -    }
 -    retval = fdctrl->fifo[pos];
 -    if (++fdctrl->data_pos == fdctrl->data_len) {
 -        fdctrl->data_pos = 0;

 I suppose data_pos is now reset by either stop_transfer (via
 to_result_phase) or to_command_phase, so this is OK.
 
 Yes, that was redundant code.
 
 -        /* Switch from transfer mode to status mode
 -         * then from status mode to command mode
 -         */
 -        if (fdctrl->msr & FD_MSR_NONDMA) {
 +
 +        if (++fdctrl->data_pos == fdctrl->data_len) {
              fdctrl_stop_transfer(fdctrl, 0x00, 0x00, 0x00);
 -        } else {
 +        }
 +        break;
 +
 +    case FD_PHASE_RESULT:
 +        assert(!(fdctrl->msr & FD_MSR_NONDMA));
 +        if (++fdctrl->data_pos == fdctrl->data_len) {

 Not a terribly big fan of moving this pointer independently inside of
 each case statement, but I guess the alternative does look a lot worse.
 Having things separated by phases is a lot easier to follow.
 
 I'm not too happy about it either, but I couldn't think of anything
 better. Having two different switches almost immediately after each
 other, with only the if line in between, would look really awkward and
 be hard to read. And the old code isn't nice either.
 
 If you have any idea for a better solution, let me know.
 
 Kevin
 

I'm all complaints and no solutions. I believe I gave you my R-b anyway. :)

Re: [Qemu-block] [Qemu-devel] [PATCH 3/8] fdc: Introduce fdctrl-phase

2015-05-20 Thread John Snow



On 05/20/2015 05:24 AM, Peter Maydell wrote:
 On 20 May 2015 at 09:43, Kevin Wolf kw...@redhat.com wrote:
 Am 20.05.2015 um 10:06 hat Peter Maydell geschrieben:
 That handles migration, which is good. But I still think that
 storing the same information in two places in the device
 state (phase field and the register fields) is error-prone.

 That's actually my point. The registers are accessed everywhere in the
 code, whereas phase transitions are in very few well-defined places
 (there are exactly four of them, iirc). If they get out of sync, chances
 are that the bug is in the register value, not in the phase. When we
 know what phase we're in, we can assert the bits and actually catch such
 bugs.

 If we want to switch to having a phase field we should calculate
 the relevant register bits on demand based on the phase, rather
 than keeping both copies of the state in sync manually.

 That doesn't work, unfortunately. Some register bits imply a specific
 phase (assuming correct code), but you can't derive the exact bits just
 from the phase.
 
 Having now dug out a copy of the 82078 spec, I agree that the state
 isn't derivable purely from the register values in the general case.
 The controller clearly has a state machine internally but it doesn't
 surface that in the register state except indirectly.
 
 -- PMM
 

So even if /currently/ we can reconstitute it from the register values,
we may eventually be unable to.

post_load will work for now, but I fear the case (in ten years) when
someone else cleans up FDC code but fails to realize that the phase is
not explicitly migrated.

Re: [Qemu-block] [Qemu-devel] [PATCH 7/8] fdc: Fix MSR.RQM flag

2015-05-20 Thread John Snow



On 05/20/2015 04:14 AM, Kevin Wolf wrote:
 Am 19.05.2015 um 22:40 hat John Snow geschrieben:


 On 05/19/2015 11:36 AM, Kevin Wolf wrote:
 The RQM bit in MSR should be set whenever the guest is supposed to
 access the FIFO, and it should be cleared in all other cases. This is
 important so the guest can't continue writing/reading the FIFO beyond
 the length that it's supposed to access (see CVE-2015-3456).

 Commit e9077462 fixed the CVE by adding code that avoids the buffer
 overflow; however it doesn't correct the wrong behaviour of the floppy
 controller which should already have cleared RQM.

 Currently, RQM stays set all the time and during all phases while a
 command is being processed. This is error-prone because the command has
 to explicitly clear the flag if it doesn't need data (and indeed, the
 two buggy commands that are the culprits for the CVE just forgot to do
 that).

 This patch clears RQM immediately as soon as all bytes that are expected
 have been received. If the FIFO is used in the next phase, the flag
 has to be set explicitly there.

 This alone should have been enough to fix the CVE, but now we have two
 lines of defense - even better.

 Signed-off-by: Kevin Wolf kw...@redhat.com
 ---
  hw/block/fdc.c | 13 -
  1 file changed, 12 insertions(+), 1 deletion(-)

 diff --git a/hw/block/fdc.c b/hw/block/fdc.c
 index 8d322e0..c6a046e 100644
 --- a/hw/block/fdc.c
 +++ b/hw/block/fdc.c
 @@ -1165,7 +1165,9 @@ static void fdctrl_to_command_phase(FDCtrl *fdctrl)
      fdctrl->phase = FD_PHASE_COMMAND;
      fdctrl->data_dir = FD_DIR_WRITE;
      fdctrl->data_pos = 0;
 +    fdctrl->data_len = 1; /* Accept command byte, adjust for params later */
      fdctrl->msr &= ~(FD_MSR_CMDBUSY | FD_MSR_DIO);
 +    fdctrl->msr |= FD_MSR_RQM;
  }
  
  /* Update the state to allow the guest to read out the command status.
 @@ -1380,7 +1382,7 @@ static void fdctrl_start_transfer(FDCtrl *fdctrl, int direction)
          }
      }
      FLOPPY_DPRINTF("start non-DMA transfer\n");
 -    fdctrl->msr |= FD_MSR_NONDMA;
 +    fdctrl->msr |= FD_MSR_NONDMA | FD_MSR_RQM;
      if (direction != FD_DIR_WRITE)
          fdctrl->msr |= FD_MSR_DIO;
      /* IO based transfer: calculate len */
 @@ -1560,6 +1562,7 @@ static uint32_t fdctrl_read_data(FDCtrl *fdctrl)
          }
  
          if (++fdctrl->data_pos == fdctrl->data_len) {
 +            fdctrl->msr &= ~FD_MSR_RQM;

 Doesn't stop_transfer set this flag back right away?
 
 It does, by switching to the result phase.
 
 I think it's clearer to disable the bit anywhere where the FIFO has
 received as many bytes as it's supposed to, even if the next phase is
 started immediately and reenables it.
 
 In real hardware, sending a byte causes the FDC to disable RQM, then
 process the byte (which means completing command execution for this code
 path), then reenable RQM if needed.
 
 Currently our code is completely synchronous, so we could ignore this
 detail because the state between clearing and setting RQM isn't
 observable by the guest. If we ever introduce something asynchronous in
 the path, we will need this though - and modelling real hardware more
 precisely has never hurt anyway.
 
 Kevin
 

OK, just amend the commit message to explain that clearing the bits here
is to accommodate a possible asynchronous refactor, or to be more
explicit, or etc etc etc.
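
For whoever ends up writing that commit message, here is a toy Python model
of the handshake Kevin describes for the command phase: RQM drops as soon as
every expected byte has arrived and is raised again only if more bytes are
wanted. It is a sketch under assumptions, not fdc.c: ToyCommandFifo and the
PARAMS table are made up, the 0x80 bit value is assumed, and command
execution is reduced to going straight back to the command phase.

```python
FD_MSR_RQM = 0x80  # bit value assumed for this sketch

# Hypothetical command table: command byte -> number of parameter bytes.
PARAMS = {0x08: 0, 0x0F: 2}


class ToyCommandFifo:
    """Toy model of the command phase only; every RQM change is logged."""

    def __init__(self):
        self.msr = 0
        self.rqm_log = []
        self.to_command_phase()

    def _update_rqm(self, on):
        self.msr = (self.msr | FD_MSR_RQM) if on else (self.msr & ~FD_MSR_RQM)
        self.rqm_log.append(on)

    def to_command_phase(self):
        self.data_pos = 0
        self.data_len = 1  # accept the command byte, adjust for params later
        self._update_rqm(True)

    def write_data(self, value):
        if not self.msr & FD_MSR_RQM:
            raise RuntimeError("guest wrote to the FIFO while RQM was clear")
        if self.data_pos == 0:
            self.command = value
        self.data_pos += 1
        # Every expected byte has arrived: clear RQM before anything else.
        if self.data_pos == self.data_len:
            self._update_rqm(False)
        if self.data_pos == 1:
            self.data_len = PARAMS[self.command] + 1
            if PARAMS[self.command]:
                self._update_rqm(True)  # parameters are still expected
        if self.data_pos == self.data_len:
            # "Execute" the command, then accept the next one.
            self.to_command_phase()
```

In this synchronous model the cleared state is never visible between two
guest accesses, which is the point made above: the clear only becomes
observable once something in the path is asynchronous.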

--js

Re: [Qemu-block] [Qemu-devel] [PATCH v4 06/11] block: add refcount to Job object

2015-05-19 Thread John Snow



On 05/18/2015 11:45 AM, Stefan Hajnoczi wrote:
 On Mon, May 11, 2015 at 07:04:21PM -0400, John Snow wrote:
 If we want to get at the job after the life of the job,
 we'll need a refcount for this object.

 This may occur for example if we wish to inspect the actions
 taken by a particular job after a transactional group of jobs
 runs, and further actions are required.

 Signed-off-by: John Snow js...@redhat.com
 Reviewed-by: Max Reitz mre...@redhat.com
 ---
  blockjob.c   | 18 --
  include/block/blockjob.h | 21 +
  2 files changed, 37 insertions(+), 2 deletions(-)
 
 I think the only reason for this refcount is so that
 backup_transaction_complete() can be called.  It accesses
 BackupBlockJob-sync_bitmap so the BackupBlockJob instance needs to be
 alive.
 
 The bitmap refcount is incremented in blockdev.c, not block/backup.c, so
 it is fine to drop backup_transaction_complete() and decrement the
 bitmap refcount in blockdev.c instead.
 
 If you do that then there is no need to add a refcount to block job.
 This would simplify things.
 

So you are suggesting that I cache the bitmap reference (instead of the
job reference) and then just increment/decrement it directly in
.prepare, .abort and .cb as needed.

You did find the disparity with the reference count for the bitmap, at
least: that is kind of gross. I was coincidentally thinking of punting
it back into a backup_transaction_start to keep more code /out/ of
blockdev...

I'll sit on this one for a few more minutes. I'll try to get rid of the
job refcnt, but I also want to keep the transaction actions as tidy as I
can.

Maybe it's too much abstraction for a simple task, but I wanted to make
sure I wasn't hacking in transaction callbacks in a manner where they'd
only be useful to me, for only this one case. It's conceivable that if
anyone else attempts to use this callback hijacking mechanism that
they'll need to find a way to modify objects within the Job without
pulling everything up to the transaction actions, too.

Ho hum.

--js

Re: [Qemu-block] [Qemu-devel] [PATCH 2/8] fdc: Rename fdctrl_set_fifo() to fdctrl_to_result_phase()

2015-05-19 Thread John Snow

 direct
          /* ERROR */
          fdctrl->fifo[0] = 0x80 |
              (cur_drv->head << 2) | GET_CUR_DRV(fdctrl);
 -        fdctrl_set_fifo(fdctrl, 1);
 +        fdctrl_to_result_phase(fdctrl, 1);
      }
  }
  
 

Similar bike-shedding comment here to match patch #1, but that won't
stop this:

Reviewed-by: John Snow js...@redhat.com

Re: [Qemu-block] [Qemu-devel] [PATCH 3/8] fdc: Introduce fdctrl-phase

2015-05-19 Thread John Snow



On 05/19/2015 11:35 AM, Kevin Wolf wrote:
 The floppy controller spec describes three different controller phases,
 which are currently not explicitly modelled in our emulation. Instead,
 each phase is represented by a combination of flags in registers.
 
 This patch makes explicit in which phase the controller currently is.
 
 Signed-off-by: Kevin Wolf kw...@redhat.com
 ---
  hw/block/fdc.c | 31 +++
  1 file changed, 31 insertions(+)
 
 diff --git a/hw/block/fdc.c b/hw/block/fdc.c
 index 8c41434..4d4868e 100644
 --- a/hw/block/fdc.c
 +++ b/hw/block/fdc.c

[snip]

Reviewed-by: John Snow js...@redhat.com

Re: [Qemu-block] [Qemu-devel] [PATCH 1/8] fdc: Rename fdctrl_reset_fifo() to fdctrl_to_command_phase()

2015-05-19 Thread John Snow

);
 +            fdctrl_to_command_phase(fdctrl);
          }
      } else if (fdctrl->data_len > 7) {
          /* ERROR */
 @@ -1887,7 +1887,7 @@ static void fdctrl_handle_relative_seek_in(FDCtrl *fdctrl, int direction)
          fd_seek(cur_drv, cur_drv->head,
                  cur_drv->track + fdctrl->fifo[2], cur_drv->sect, 1);
      }
 -    fdctrl_reset_fifo(fdctrl);
 +    fdctrl_to_command_phase(fdctrl);
      /* Raise Interrupt */
      fdctrl->status0 |= FD_SR0_SEEK;
      fdctrl_raise_irq(fdctrl);
 @@ -1905,7 +1905,7 @@ static void fdctrl_handle_relative_seek_out(FDCtrl *fdctrl, int direction)
          fd_seek(cur_drv, cur_drv->head,
                  cur_drv->track - fdctrl->fifo[2], cur_drv->sect, 1);
      }
 -    fdctrl_reset_fifo(fdctrl);
 +    fdctrl_to_command_phase(fdctrl);
      /* Raise Interrupt */
      fdctrl->status0 |= FD_SR0_SEEK;
      fdctrl_raise_irq(fdctrl);
 

Why 'to' instead of 'start' or 'init'? It seems weird to describe it in
a third-party descriptive way instead of with the imperative.

Bike-shedding aside:

Reviewed-by: John Snow js...@redhat.com

Re: [Qemu-block] [Qemu-devel] [PATCH 7/8] fdc: Fix MSR.RQM flag

2015-05-19 Thread John Snow



On 05/19/2015 11:36 AM, Kevin Wolf wrote:
 The RQM bit in MSR should be set whenever the guest is supposed to
 access the FIFO, and it should be cleared in all other cases. This is
 important so the guest can't continue writing/reading the FIFO beyond
 the length that it's supposed to access (see CVE-2015-3456).
 
 Commit e9077462 fixed the CVE by adding code that avoids the buffer
 overflow; however it doesn't correct the wrong behaviour of the floppy
 controller which should already have cleared RQM.
 
 Currently, RQM stays set all the time and during all phases while a
 command is being processed. This is error-prone because the command has
 to explicitly clear the flag if it doesn't need data (and indeed, the
 two buggy commands that are the culprits for the CVE just forgot to do
 that).
 
 This patch clears RQM immediately as soon as all bytes that are expected
 have been received. If the FIFO is used in the next phase, the flag
 has to be set explicitly there.
 
 This alone should have been enough to fix the CVE, but now we have two
 lines of defense - even better.
 
 Signed-off-by: Kevin Wolf kw...@redhat.com
 ---
  hw/block/fdc.c | 13 -
  1 file changed, 12 insertions(+), 1 deletion(-)
 
 diff --git a/hw/block/fdc.c b/hw/block/fdc.c
 index 8d322e0..c6a046e 100644
 --- a/hw/block/fdc.c
 +++ b/hw/block/fdc.c
 @@ -1165,7 +1165,9 @@ static void fdctrl_to_command_phase(FDCtrl *fdctrl)
      fdctrl->phase = FD_PHASE_COMMAND;
      fdctrl->data_dir = FD_DIR_WRITE;
      fdctrl->data_pos = 0;
 +    fdctrl->data_len = 1; /* Accept command byte, adjust for params later */
      fdctrl->msr &= ~(FD_MSR_CMDBUSY | FD_MSR_DIO);
 +    fdctrl->msr |= FD_MSR_RQM;
  }
  
  /* Update the state to allow the guest to read out the command status.
 @@ -1380,7 +1382,7 @@ static void fdctrl_start_transfer(FDCtrl *fdctrl, int direction)
          }
      }
      FLOPPY_DPRINTF("start non-DMA transfer\n");
 -    fdctrl->msr |= FD_MSR_NONDMA;
 +    fdctrl->msr |= FD_MSR_NONDMA | FD_MSR_RQM;
      if (direction != FD_DIR_WRITE)
          fdctrl->msr |= FD_MSR_DIO;
      /* IO based transfer: calculate len */
 @@ -1560,6 +1562,7 @@ static uint32_t fdctrl_read_data(FDCtrl *fdctrl)
          }
  
          if (++fdctrl->data_pos == fdctrl->data_len) {
 +            fdctrl->msr &= ~FD_MSR_RQM;

Doesn't stop_transfer set this flag back right away?

              fdctrl_stop_transfer(fdctrl, 0x00, 0x00, 0x00);
          }
          break;
 @@ -1567,6 +1570,7 @@ static uint32_t fdctrl_read_data(FDCtrl *fdctrl)
      case FD_PHASE_RESULT:
          assert(!(fdctrl->msr & FD_MSR_NONDMA));
          if (++fdctrl->data_pos == fdctrl->data_len) {
 +            fdctrl->msr &= ~FD_MSR_RQM;

Same here with to_command_phase.

              fdctrl_to_command_phase(fdctrl);
              fdctrl_reset_irq(fdctrl);
          }
 @@ -2036,6 +2040,10 @@ static void fdctrl_write_data(FDCtrl *fdctrl, uint32_t value)
      pos %= FD_SECTOR_LEN;
      fdctrl->fifo[pos] = value;
  
 +    if (fdctrl->data_pos == fdctrl->data_len) {
 +        fdctrl->msr &= ~FD_MSR_RQM;
 +    }
 +
      switch (fdctrl->phase) {
      case FD_PHASE_EXECUTION:
          assert(fdctrl->msr & FD_MSR_NONDMA);
 @@ -2071,6 +2079,9 @@ static void fdctrl_write_data(FDCtrl *fdctrl, uint32_t value)
               * as many parameters as this command requires. */
              cmd = get_command(value);
              fdctrl->data_len = cmd->parameters + 1;
 +            if (cmd->parameters) {
 +                fdctrl->msr |= FD_MSR_RQM;
 +            }
              fdctrl->msr |= FD_MSR_CMDBUSY;
          }

Re: [Qemu-block] [Qemu-devel] [PATCH 5/8] fdc: Code cleanup in fdctrl_write_data()

2015-05-19 Thread John Snow



On 05/19/2015 11:35 AM, Kevin Wolf wrote:
 Factor out a few common lines of code, reformat, improve comments.
 
 Signed-off-by: Kevin Wolf kw...@redhat.com
 ---
  hw/block/fdc.c | 62 
 +++---
  1 file changed, 38 insertions(+), 24 deletions(-)
 
 diff --git a/hw/block/fdc.c b/hw/block/fdc.c
 index a13e0ce..cbf7abf 100644
 --- a/hw/block/fdc.c
 +++ b/hw/block/fdc.c
 @@ -1942,14 +1942,16 @@ static void fdctrl_handle_relative_seek_out(FDCtrl *fdctrl, int direction)
  /*
   * Handlers for the execution phase of each command
   */
 -static const struct {
 +typedef struct FDCtrlCommand {
      uint8_t value;
      uint8_t mask;
      const char* name;
      int parameters;
      void (*handler)(FDCtrl *fdctrl, int direction);
      int direction;
 -} handlers[] = {
 +} FDCtrlCommand;
 +
 +static const FDCtrlCommand handlers[] = {
      { FD_CMD_READ, 0x1f, "READ", 8, fdctrl_start_transfer, FD_DIR_READ },
      { FD_CMD_WRITE, 0x3f, "WRITE", 8, fdctrl_start_transfer, FD_DIR_WRITE },
      { FD_CMD_SEEK, 0xff, "SEEK", 2, fdctrl_handle_seek },
 @@ -1986,9 +1988,19 @@ static const struct {
  /* Associate command to an index in the 'handlers' array */
  static uint8_t command_to_handler[256];
  
 +static const FDCtrlCommand *get_command(uint8_t cmd)
 +{
 +    int idx;
 +
 +    idx = command_to_handler[cmd];
 +    FLOPPY_DPRINTF("%s command\n", handlers[idx].name);
 +    return &handlers[idx];
 +}
 +
  static void fdctrl_write_data(FDCtrl *fdctrl, uint32_t value)
  {
      FDrive *cur_drv;
 +    const FDCtrlCommand *cmd;
      uint32_t pos;
  
      /* Reset mode */
 @@ -2002,13 +2014,20 @@ static void fdctrl_write_data(FDCtrl *fdctrl, uint32_t value)
      }
      fdctrl->dsr &= ~FD_DSR_PWRDOWN;
  
 +    FLOPPY_DPRINTF("%s: %02x\n", __func__, value);
 +
 +    /* If data_len spans multiple sectors, the current position in the FIFO
 +     * wraps around while fdctrl->data_pos is the real position in the whole
 +     * request. */
 +    pos = fdctrl->data_pos++;
 +    pos %= FD_SECTOR_LEN;
 +    fdctrl->fifo[pos] = value;
 +
      switch (fdctrl->phase) {
      case FD_PHASE_EXECUTION:
          assert(fdctrl->msr & FD_MSR_NONDMA);
 +
          /* FIFO data write */
 -        pos = fdctrl->data_pos++;
 -        pos %= FD_SECTOR_LEN;
 -        fdctrl->fifo[pos] = value;
          if (pos == FD_SECTOR_LEN - 1 ||
              fdctrl->data_pos == fdctrl->data_len) {
              cur_drv = get_cur_drv(fdctrl);
 @@ -2024,41 +2043,36 @@ static void fdctrl_write_data(FDCtrl *fdctrl, uint32_t value)
                  break;
              }
          }
 -        /* Switch from transfer mode to status mode
 -         * then from status mode to command mode
 -         */
 -        if (fdctrl->data_pos == fdctrl->data_len)
 +
 +        /* Switch to result phase when done with the transfer */
 +        if (fdctrl->data_pos == fdctrl->data_len) {
              fdctrl_stop_transfer(fdctrl, 0x00, 0x00, 0x00);
 +        }
          break;
  
      case FD_PHASE_COMMAND:
          assert(!(fdctrl->msr & FD_MSR_NONDMA));
  
 -        if (fdctrl->data_pos == 0) {
 -            /* Command */
 -            pos = command_to_handler[value & 0xff];
 -            FLOPPY_DPRINTF("%s command\n", handlers[pos].name);
 -            fdctrl->data_len = handlers[pos].parameters + 1;
 +        if (fdctrl->data_pos == 1) {

This reads more awkwardly than the previous ifz, but it's not
like I have a better idea. (I just had a momentary pause of Why 1?)

 +            /* The first byte specifies the command. Now we start reading
 +             * as many parameters as this command requires. */
 +            cmd = get_command(value);
 +            fdctrl->data_len = cmd->parameters + 1;
              fdctrl->msr |= FD_MSR_CMDBUSY;
          }
  
 -        FLOPPY_DPRINTF("%s: %02x\n", __func__, value);
 -        pos = fdctrl->data_pos++;
 -        pos %= FD_SECTOR_LEN;
 -        fdctrl->fifo[pos] = value;
          if (fdctrl->data_pos == fdctrl->data_len) {
 -            /* We now have all parameters
 -             * and will be able to treat the command
 -             */
 +            /* We have all parameters now, execute the command */
              fdctrl->phase = FD_PHASE_EXECUTION;
 +
              if (fdctrl->data_state & FD_STATE_FORMAT) {
                  fdctrl_format_sector(fdctrl);
                  break;
              }
  
 -            pos = command_to_handler[fdctrl->fifo[0] & 0xff];
 -            FLOPPY_DPRINTF("treat %s command\n", handlers[pos].name);
 -            (*handlers[pos].handler)(fdctrl, handlers[pos].direction);
 +            cmd = get_command(fdctrl->fifo[0]);
 +            FLOPPY_DPRINTF("Calling handler for '%s'\n", cmd->name);
 +            cmd->handler(fdctrl, cmd->direction);
          }
          break;
  
 

Reviewed-by: John Snow js...@redhat.com

Re: [Qemu-block] [Qemu-devel] [PATCH 4/8] fdc: Use phase in fdctrl_write_data()

2015-05-19 Thread John Snow



On 05/19/2015 11:35 AM, Kevin Wolf wrote:
 Instead of relying on a flag in the MSR to distinguish controller phases,
 use the explicit phase that we store now. Assertions of the right MSR
 flags are added.
 
 Signed-off-by: Kevin Wolf kw...@redhat.com
 ---
  hw/block/fdc.c | 67 
 ++
  1 file changed, 39 insertions(+), 28 deletions(-)
 
 diff --git a/hw/block/fdc.c b/hw/block/fdc.c
 index 4d4868e..a13e0ce 100644
 --- a/hw/block/fdc.c
 +++ b/hw/block/fdc.c

[snip]

Reviewed-by: John Snow js...@redhat.com

Re: [Qemu-block] [Qemu-devel] [PATCH 6/8] fdc: Disentangle phases in fdctrl_read_data()

2015-05-19 Thread John Snow



On 05/19/2015 11:36 AM, Kevin Wolf wrote:
 This commit makes similar improvements as have already been made to the
 write function: Instead of relying on a flag in the MSR to distinguish
 controller phases, use the explicit phase that we store now. Assertions
 of the right MSR flags are added.
 
 Signed-off-by: Kevin Wolf kw...@redhat.com
 ---
  hw/block/fdc.c | 33 +++--
  1 file changed, 23 insertions(+), 10 deletions(-)
 
 diff --git a/hw/block/fdc.c b/hw/block/fdc.c
 index cbf7abf..8d322e0 100644
 --- a/hw/block/fdc.c
 +++ b/hw/block/fdc.c
 @@ -1533,9 +1533,16 @@ static uint32_t fdctrl_read_data(FDCtrl *fdctrl)
          FLOPPY_DPRINTF("error: controller not ready for reading\n");
          return 0;
      }
 +
 +    /* If data_len spans multiple sectors, the current position in the FIFO
 +     * wraps around while fdctrl->data_pos is the real position in the whole
 +     * request. */
      pos = fdctrl->data_pos;
      pos %= FD_SECTOR_LEN;
 -    if (fdctrl->msr & FD_MSR_NONDMA) {
 +
 +    switch (fdctrl->phase) {
 +    case FD_PHASE_EXECUTION:
 +        assert(fdctrl->msr & FD_MSR_NONDMA);
          if (pos == 0) {
              if (fdctrl->data_pos != 0)
                  if (!fdctrl_seek_to_next_sect(fdctrl, cur_drv)) {
 @@ -1551,20 +1558,26 @@ static uint32_t fdctrl_read_data(FDCtrl *fdctrl)
                  memset(fdctrl->fifo, 0, FD_SECTOR_LEN);
              }
          }
 -    }
 -    retval = fdctrl->fifo[pos];
 -    if (++fdctrl->data_pos == fdctrl->data_len) {
 -        fdctrl->data_pos = 0;

I suppose data_pos is now reset by either stop_transfer (via
to_result_phase) or to_command_phase, so this is OK.

 -        /* Switch from transfer mode to status mode
 -         * then from status mode to command mode
 -         */
 -        if (fdctrl->msr & FD_MSR_NONDMA) {
 +
 +        if (++fdctrl->data_pos == fdctrl->data_len) {
              fdctrl_stop_transfer(fdctrl, 0x00, 0x00, 0x00);
 -        } else {
 +        }
 +        break;
 +
 +    case FD_PHASE_RESULT:
 +        assert(!(fdctrl->msr & FD_MSR_NONDMA));
 +        if (++fdctrl->data_pos == fdctrl->data_len) {

Not a terribly big fan of moving this pointer independently inside of
each case statement, but I guess the alternative does look a lot worse.
Having things separated by phases is a lot easier to follow.

              fdctrl_to_command_phase(fdctrl);
              fdctrl_reset_irq(fdctrl);
          }
 +        break;
 +
 +    case FD_PHASE_COMMAND:
 +    default:
 +        abort();
      }
 +
 +    retval = fdctrl->fifo[pos];
      FLOPPY_DPRINTF("data register: 0x%02x\n", retval);
  
      return retval;
 

Reviewed-by: John Snow js...@redhat.com

[Qemu-block] [PATCH 7/9] block: add differential backup mode

2015-06-04 Thread John Snow

This is simple: instead of clearing the bitmap, just leave the bitmap
data intact even in case of success.
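
As a rough illustration only (not part of the patch), the difference between
the two modes can be modelled on sets of dirty sectors. bitmap_after_backup
is a hypothetical helper, and the successor bitmap is reduced to "whatever
was written while the job ran":

```python
def bitmap_after_backup(bitmap, writes_during_job, mode, success=True):
    """Return the dirty set left behind once a toy backup job finishes.

    bitmap: sectors dirty when the job starts (frozen for its duration).
    writes_during_job: sectors dirtied meanwhile, recorded by the successor.
    """
    successor = set(writes_during_job)
    if success and mode == 'incremental':
        # Success: the old bits are dropped, only the new writes remain.
        return successor
    # Differential mode, or a failed job: merge the successor back in.
    return set(bitmap) | successor
```

So a successful differential backup leaves the bitmap in the same state that
a failed incremental one would.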

Signed-off-by: John Snow js...@redhat.com
---
 block.c   |  9 -
 block/backup.c| 17 ++---
 block/mirror.c|  9 +++--
 include/block/block.h |  1 +
 qapi/block-core.json  |  6 --
 5 files changed, 30 insertions(+), 12 deletions(-)

diff --git a/block.c b/block.c
index 5551f79..3e780f9 100644
--- a/block.c
+++ b/block.c
@@ -3166,7 +3166,9 @@ DirtyBitmapStatus bdrv_dirty_bitmap_status(BdrvDirtyBitmap *bitmap)
  * Requires that the bitmap is not frozen and has no successor.
  */
 int bdrv_dirty_bitmap_create_successor(BlockDriverState *bs,
-                                       BdrvDirtyBitmap *bitmap, Error **errp)
+                                       BdrvDirtyBitmap *bitmap,
+                                       MirrorSyncMode sync_mode,
+                                       Error **errp)
 {
     uint64_t granularity;
     BdrvDirtyBitmap *child;
@@ -3191,6 +3193,11 @@ int bdrv_dirty_bitmap_create_successor(BlockDriverState *bs,
     /* Install the successor and freeze the parent */
     bitmap->successor = child;
     bitmap->successor_refcount = 1;
+
+    if (sync_mode == MIRROR_SYNC_MODE_DIFFERENTIAL) {
+        bitmap->act = SUCCESSOR_ACTION_RECLAIM;
+    }
+
     return 0;
 }
 
diff --git a/block/backup.c b/block/backup.c
index a8f7c43..dd808c2 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -390,7 +390,8 @@ static void coroutine_fn backup_run(void *opaque)
 qemu_coroutine_yield();
 job-common.busy = true;
 }
-} else if (job-sync_mode == MIRROR_SYNC_MODE_INCREMENTAL) {
+} else if ((job-sync_mode == MIRROR_SYNC_MODE_INCREMENTAL) ||
+   (job-sync_mode == MIRROR_SYNC_MODE_DIFFERENTIAL)) {
 ret = backup_run_incremental(job);
 } else {
 /* Both FULL and TOP SYNC_MODE's require copying.. */
@@ -510,15 +511,18 @@ void backup_start(BlockDriverState *bs, BlockDriverState 
*target,
 return;
 }
 
-if (sync_mode == MIRROR_SYNC_MODE_INCREMENTAL) {
+if ((sync_mode == MIRROR_SYNC_MODE_INCREMENTAL) ||
+(sync_mode == MIRROR_SYNC_MODE_DIFFERENTIAL)) {
 if (!sync_bitmap) {
-error_setg(errp, "must provide a valid bitmap name for "
- "\"incremental\" sync mode");
+error_setg(errp,
+   "must provide a valid bitmap name for \"%s\" sync mode",
+   MirrorSyncMode_lookup[sync_mode]);
 return;
 }
 
 /* Create a new bitmap, and freeze/disable this one. */
-if (bdrv_dirty_bitmap_create_successor(bs, sync_bitmap, errp) < 0) {
+if (bdrv_dirty_bitmap_create_successor(bs, sync_bitmap,
+   sync_mode, errp) < 0) {
 return;
 }
 } else if (sync_bitmap) {
@@ -548,8 +552,7 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
 job->on_target_error = on_target_error;
 job->target = target;
 job->sync_mode = sync_mode;
-job->sync_bitmap = sync_mode == MIRROR_SYNC_MODE_INCREMENTAL ?
-   sync_bitmap : NULL;
+job->sync_bitmap = sync_bitmap;
 job->common.len = len;
 job->common.co = qemu_coroutine_create(backup_run);
 qemu_coroutine_enter(job->common.co, job);
diff --git a/block/mirror.c b/block/mirror.c
index adf391c..1cde86b 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -709,9 +709,14 @@ void mirror_start(BlockDriverState *bs, BlockDriverState *target,
 bool is_none_mode;
 BlockDriverState *base;
 
-if (mode == MIRROR_SYNC_MODE_INCREMENTAL) {
-error_setg(errp, "Sync mode 'incremental' not supported");
+switch (mode) {
+case MIRROR_SYNC_MODE_INCREMENTAL:
+case MIRROR_SYNC_MODE_DIFFERENTIAL:
+error_setg(errp, "Sync mode \"%s\" not supported",
+   MirrorSyncMode_lookup[mode]);
 return;
+default:
+break;
 }
 is_none_mode = mode == MIRROR_SYNC_MODE_NONE;
 base = mode == MIRROR_SYNC_MODE_TOP ? bs->backing_hd : NULL;
diff --git a/include/block/block.h b/include/block/block.h
index e88a332..8169a60 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -462,6 +462,7 @@ BdrvDirtyBitmap *bdrv_copy_dirty_bitmap(BlockDriverState *bs,
 Error **errp);
 int bdrv_dirty_bitmap_create_successor(BlockDriverState *bs,
BdrvDirtyBitmap *bitmap,
+   MirrorSyncMode sync_mode,
Error **errp);
 BdrvDirtyBitmap *bdrv_frozen_bitmap_decref(BlockDriverState *bs,
BdrvDirtyBitmap *parent,
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 92c9e53..421fd25 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -534,12 +534,14

[Qemu-block] [PATCH 1/9] qapi: Rename 'dirty-bitmap' mode to 'incremental'

2015-06-04 Thread John Snow

If we wish to make differential backups a feature that's easy to access,
it might be pertinent to rename the dirty-bitmap mode to incremental
to make it clear what /type/ of backup the dirty-bitmap is helping us
perform.

This is an API breaking change, but 2.4 has not yet gone live,
so we have this flexibility.

Signed-off-by: John Snow js...@redhat.com
---
 block/backup.c| 10 +-
 block/mirror.c|  4 ++--
 docs/bitmaps.md   |  8 
 include/block/block_int.h |  2 +-
 qapi/block-core.json  |  8 
 qmp-commands.hx   |  6 +++---
 tests/qemu-iotests/124| 10 +-
 7 files changed, 24 insertions(+), 24 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index e681f1b..a8f7c43 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -37,7 +37,7 @@ typedef struct CowRequest {
 typedef struct BackupBlockJob {
 BlockJob common;
 BlockDriverState *target;
-/* bitmap for sync=dirty-bitmap */
+/* bitmap for sync=incremental */
 BdrvDirtyBitmap *sync_bitmap;
 MirrorSyncMode sync_mode;
 RateLimit limit;
@@ -390,7 +390,7 @@ static void coroutine_fn backup_run(void *opaque)
 qemu_coroutine_yield();
 job->common.busy = true;
 }
-} else if (job->sync_mode == MIRROR_SYNC_MODE_DIRTY_BITMAP) {
+} else if (job->sync_mode == MIRROR_SYNC_MODE_INCREMENTAL) {
 ret = backup_run_incremental(job);
 } else {
 /* Both FULL and TOP SYNC_MODE's require copying.. */
@@ -510,10 +510,10 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
 return;
 }
 
-if (sync_mode == MIRROR_SYNC_MODE_DIRTY_BITMAP) {
+if (sync_mode == MIRROR_SYNC_MODE_INCREMENTAL) {
 if (!sync_bitmap) {
 error_setg(errp, "must provide a valid bitmap name for "
- "\"dirty-bitmap\" sync mode");
+ "\"incremental\" sync mode");
 return;
 }
 
@@ -548,7 +548,7 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
 job->on_target_error = on_target_error;
 job->target = target;
 job->sync_mode = sync_mode;
-job->sync_bitmap = sync_mode == MIRROR_SYNC_MODE_DIRTY_BITMAP ?
+job->sync_bitmap = sync_mode == MIRROR_SYNC_MODE_INCREMENTAL ?
sync_bitmap : NULL;
 job->common.len = len;
 job->common.co = qemu_coroutine_create(backup_run);
diff --git a/block/mirror.c b/block/mirror.c
index 58f391a..adf391c 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -709,8 +709,8 @@ void mirror_start(BlockDriverState *bs, BlockDriverState *target,
 bool is_none_mode;
 BlockDriverState *base;
 
-if (mode == MIRROR_SYNC_MODE_DIRTY_BITMAP) {
-error_setg(errp, "Sync mode 'dirty-bitmap' not supported");
+if (mode == MIRROR_SYNC_MODE_INCREMENTAL) {
+error_setg(errp, "Sync mode 'incremental' not supported");
 return;
 }
 is_none_mode = mode == MIRROR_SYNC_MODE_NONE;
diff --git a/docs/bitmaps.md b/docs/bitmaps.md
index a60fee1..9fd8ea6 100644
--- a/docs/bitmaps.md
+++ b/docs/bitmaps.md
@@ -206,7 +206,7 @@ full backup as a backing image.
 "bitmap": "bitmap0",
 "target": "incremental.0.img",
 "format": "qcow2",
-"sync": "dirty-bitmap",
+"sync": "incremental",
 "mode": "existing"
   }
 }
@@ -231,7 +231,7 @@ full backup as a backing image.
 "bitmap": "bitmap0",
 "target": "incremental.1.img",
 "format": "qcow2",
-"sync": "dirty-bitmap",
+"sync": "incremental",
 "mode": "existing"
   }
 }
@@ -271,7 +271,7 @@ full backup as a backing image.
 "bitmap": "bitmap0",
 "target": "incremental.0.img",
 "format": "qcow2",
-"sync": "dirty-bitmap",
+"sync": "incremental",
 "mode": "existing"
   }
 }
@@ -304,7 +304,7 @@ full backup as a backing image.
 "bitmap": "bitmap0",
 "target": "incremental.0.img",
 "format": "qcow2",
-"sync": "dirty-bitmap",
+"sync": "incremental",
 "mode": "existing"
   }
 }
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 5ca5f15..656abcf 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -613,7 +613,7 @@ void mirror_start(BlockDriverState *bs, BlockDriverState *target,
  * @target: Block device to write to.
  * @speed: The maximum speed, in bytes per second, or 0 for unlimited.
  * @sync_mode: What parts of the disk image should be copied to the destination.
- * @sync_bitmap: The dirty bitmap if sync_mode is MIRROR_SYNC_MODE_DIRTY_BITMAP.
+ * @sync_bitmap: The dirty bitmap if sync_mode is MIRROR_SYNC_MODE_INCREMENTAL.
  * @on_source_error: The action to take upon error reading from the source.
  * @on_target_error: The action to take upon error writing to the target.
  * @cb: Completion function for the job.
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 8411d4f..0061713 100644
--- a/qapi/block-core.json
+++ b/qapi/block

[Qemu-block] [PATCH 0/9] block: add differential backup support

2015-06-04 Thread John Snow

Requires: 1433454372-16356-1-git-send-email-js...@redhat.com
  [0/10] block: incremental backup transactions

It's entirely possible to use the incremental backup primitives to
achieve a differential backup mechanism, but in the interest of
ease of use, I am proposing the explicit addition of the mechanism
because it does not particularly complicate the code, add new edge
cases, or present itself as difficult to test.

This series actually adds two ease of use features:

(1) Add a copy primitive for bitmaps to add flexibility to the
backup system in case users would like to later run multiple
backup chains (weekly vs. monthly or perhaps incremental vs.
differential)

(2) Add a 'differential' backup mode that does what the name says
on the tin.
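
To make the two features concrete, here is a toy model in C of how a bitmap might evolve after a successful backup in each mode. It is only a sketch under assumptions: the names ToyBitmap, ToySyncMode, toy_mark_dirty, toy_bitmap_copy and toy_backup_success are hypothetical and are not part of QEMU, the bitmap is a single 64-bit word with one bit per cluster, and the "writes during the job" argument stands in for whatever the successor bitmap would have recorded.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for a dirty bitmap: one bit per cluster, 64 clusters. */
typedef struct ToyBitmap {
    uint64_t bits;
} ToyBitmap;

/* Hypothetical mirror of the two bitmap-driven sync modes. */
typedef enum ToySyncMode {
    TOY_SYNC_INCREMENTAL,
    TOY_SYNC_DIFFERENTIAL,
} ToySyncMode;

/* Record a guest write to the given cluster. */
static void toy_mark_dirty(ToyBitmap *bm, unsigned cluster)
{
    assert(cluster < 64);
    bm->bits |= UINT64_C(1) << cluster;
}

/* Feature (1): copy a bitmap so a second backup chain can start from it. */
static ToyBitmap toy_bitmap_copy(const ToyBitmap *src)
{
    return *src;
}

/*
 * Simulate a backup job that completes successfully.
 * writes_during_job holds the clusters dirtied while the job ran.
 * Returns the set of clusters the job copied to the target.
 */
static uint64_t toy_backup_success(ToyBitmap *bm, ToySyncMode mode,
                                   uint64_t writes_during_job)
{
    uint64_t copied = bm->bits;

    if (mode == TOY_SYNC_INCREMENTAL) {
        /* The next backup only needs what changed since this one. */
        bm->bits = writes_during_job;
    } else {
        /* Feature (2): keep the old bits and merge in the new writes. */
        bm->bits |= writes_during_job;
    }
    return copied;
}
```

In this model, two copies of the same bitmap driven in different modes diverge after the first successful job: the incremental one restarts from the writes seen during the job, while the differential one keeps accumulating everything since the bitmap was created.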

==
For convenience, this branch is available at:
https://github.com/jnsnow/qemu.git branch differential-backup
https://github.com/jnsnow/qemu/tree/differential-backup

This version is tagged differential-backup-v1:
https://github.com/jnsnow/qemu/releases/tag/differential-backup-v1
==

John Snow (9):
  qapi: Rename 'dirty-bitmap' mode to 'incremental'
  hbitmap: add hbitmap_copy
  block: add bdrv_copy_dirty_bitmap
  qapi: add Copy data type for bitmaps
  qmp: add qmp cmd block-dirty-bitmap-copy
  qmp: add block-dirty-bitmap-copy transaction
  block: add differential backup mode
  iotests: 124: support differential backups
  iotests: add differential backup test

 block.c| 35 +-
 block/backup.c | 19 ++
 block/mirror.c |  9 -
 blockdev.c | 61 +++
 docs/bitmaps.md|  8 ++--
 include/block/block.h  |  5 +++
 include/block/block_int.h  |  2 +-
 include/qemu/hbitmap.h |  9 +
 qapi-schema.json   |  4 +-
 qapi/block-core.json   | 40 ++--
 qmp-commands.hx| 36 --
 tests/qemu-iotests/124 | 91 +-
 tests/qemu-iotests/124.out |  4 +-
 util/hbitmap.c | 17 +
 14 files changed, 280 insertions(+), 60 deletions(-)

-- 
2.1.0
