date:20170830

Re: [Qemu-block] [Qemu-devel] [PATCH v2] qemu-img: Clarify about relative backing file options

2017-08-30 Thread Fam Zheng

On Fri, 08/04 22:36, Fam Zheng wrote:
> It's not too surprising when a user specifies the backing file relative
> to the current working directory instead of the top layer image. This
> causes error when they differ. Though the error message has enough
> information to infer the fact about the misunderstanding, it is better
> if we document this explicitly, so that users don't have to learn from
> mistakes.

Gentle ping as a reminder for 2.11, as 2.10 is now out the door.

Fam

Re: [Qemu-block] [Qemu-devel] [PATCH for-2.11 v2] file-posix: Clear out first sector in hdev_create

2017-08-30 Thread Fam Zheng

On Fri, 08/11 16:09, Fam Zheng wrote:
> People get surprised when, after "qemu-img create -f raw /dev/sdX", they
> still see qcow2 with "qemu-img info", if previously the bdev had a qcow2
> header. While this is natural because raw doesn't need to write any
> magic bytes during creation, hdev_create is free to clear out the first
> sector to make sure the stale qcow2 header doesn't cause such confusion.
> 
> Signed-off-by: Fam Zheng 

Gentle ping as a reminder for 2.11 as we have now released 2.10.

Fam
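
For readers who want to see what "clearing out the first sector" amounts to, here is a minimal standalone sketch in plain POSIX C. It is not the code from the patch: the helper name demo_clear_first_sector and the 512-byte DEMO_SECTOR_SIZE are assumptions made for illustration only.

```c
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumed sector size for this sketch; not taken from the patch. */
#define DEMO_SECTOR_SIZE 512

/*
 * Hypothetical helper: overwrite the first sector of an already-open
 * file or block device with zeroes, so that a stale format header
 * (for example qcow2 magic bytes) is no longer detected.
 * Returns 0 on success, -1 on error or short write.
 */
static int demo_clear_first_sector(int fd)
{
    uint8_t zeroes[DEMO_SECTOR_SIZE];
    ssize_t n;

    memset(zeroes, 0, sizeof(zeroes));
    n = pwrite(fd, zeroes, sizeof(zeroes), 0);
    return n == (ssize_t)sizeof(zeroes) ? 0 : -1;
}
```

After such a write, format probing of the device no longer finds the old magic bytes at offset 0, which is the confusion the patch description is about.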

Re: [Qemu-block] [Qemu-devel] reduce write bandwidth of qcow2 driver while allocating new cluster

2017-08-30 Thread Liu Qing

On Wed, Aug 30, 2017 at 01:15:33PM +0300, Anton Nefedov wrote:
> 
> On 29/08/2017 05:56, Liu Qing wrote:
> >On Mon, Aug 28, 2017 at 10:46:34AM -0500, Eric Blake wrote:
> >>[adding qemu-block]
> >>
> >>On 08/28/2017 12:56 AM, Liu Qing wrote:
> >>>Dear list,
> >>>Recently I used fio to test the qcow2 driver in the guest OS, and found
> >>>that when a new cluster is allocated, a 4K IO occupies 64K (the default
> >>>cluster size) of bandwidth.
> >>>From the code, the qcow2 driver fills the unused part of a newly allocated
> >>>cluster with 0 in perform_cow. These 0s are set in qcow2_co_readv when the
> >>>read destination is not allocated and it has no backing file. Could I forbid
> >>>any further write in copy_sectors if the copy source is not allocated and it
> >>>has no backing file? Then only the requested data is written to the cluster.
> >>>Function copy_sectors is only used by perform_cow in the master branch.
> >>
> >>There have already been discussions on optimizing COW writes in a manner
> >>similar to what you are describing; for example,
> >>
> >>https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg00109.html
> >Thanks Eric, this is what I am looking for.
> >The only concern I have is in patch '[Qemu-devel] [PATCH v4 12/15] qcow2:
> >skip writing zero buffers to empty' it says:
> >
> >It can be detected that
> >  1. COW alignment of a write request is zeroes
> >  2. Respective areas on the underlying BDS already read as zeroes
> > after being preallocated previously
> >  If both of these true, COW may be skipped
> >
> >Will writing zero be skipped if the disk is not preallocated? @Anton
> >
> 
> Hi,
> 
> In short, no, it will not (with my patches), but there might be some way
> if that's what you really need.
> 
> 
> First of all, this might be undesirable as you lose the cluster-size
> data locality: now the whole cluster is written at once and is expected
> to reside in the contiguous area on the physical drive.
> 
> Secondly, I think there is no guarantee that the underlying bs->file
> image reads back as zeroes if the cluster is unallocated on qcow2 level.
Why do we need this guarantee? If the cluster is unallocated, it means no
one used these clusters previously. So why should these unallocated
clusters be read back as zeroes?
> 
> For example, the unallocated cluster could have been used earlier but
> then discarded. Discard passthrough is configurable so discard may not
> be passed down to the underlying image. And I guess that in general,
> even if it is passed, there is no strong requirement on reading back as
> zeroes - look at qcow2 discard handling - discard head and tail which do
> not cover full clusters are ignored.
> 
> _perhaps_, one may expect that there will be zeroes if the cluster is
> allocated at the end of file
> (see 'clusters_are_trailing' detection here
> https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg00122.html)
> 
> but I haven't thought about all corner cases here.
> 
> 
> /Anton
> 
> >BTW: why is the code in the patch a little different from the latest
> >master branch? For example, I don't have the is_zero function, only
> >is_zero_sectors. Is there something wrong with my settings?
> >
> >My repo:
> ># git remote -v
> >origin  git://git.qemu-project.org/qemu.git (fetch)
> >origin  git://git.qemu-project.org/qemu.git (push)
> >
> >Thanks.
> >>
> >>--
> >>Eric Blake, Principal Software Engineer
> >>Red Hat, Inc.   +1-919-301-3266
> >>Virtualization:  qemu.org | libvirt.org
> >>
> >
> >
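
The 4K-versus-64K arithmetic from the original question can be made concrete with a small model. The sketch below is not qcow2 code; the names demo_cow_regions and DemoCowRegions are made up, and it assumes a write that fits inside one newly allocated cluster. It computes how many bytes in front of (head) and behind (tail) the guest data have to be filled when the whole cluster is written at once.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: bytes to fill before and after the guest data. */
typedef struct DemoCowRegions {
    uint64_t head;
    uint64_t tail;
} DemoCowRegions;

/*
 * Simplified model, assuming the write lies within a single cluster:
 * head is the distance from the cluster start to the write, tail is
 * the distance from the end of the write to the cluster end.
 */
static DemoCowRegions demo_cow_regions(uint64_t offset, uint64_t bytes,
                                       uint64_t cluster_size)
{
    DemoCowRegions r;
    uint64_t in_cluster = offset % cluster_size;

    assert(bytes > 0 && in_cluster + bytes <= cluster_size);
    r.head = in_cluster;
    r.tail = cluster_size - (in_cluster + bytes);
    return r;
}
```

With a 64K cluster, a 4K write at offset 0 gives head = 0 and tail = 61440, so 65536 bytes reach the image file for 4096 bytes of guest data. That is the bandwidth amplification being discussed.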

Re: [Qemu-block] [Qemu-devel] [PATCH v2 2/3] block-jobs: Optionally unregister live block operations

2017-08-30 Thread Markus Armbruster

Eric Blake  writes:

> On 08/30/2017 12:24 PM, Eduardo Habkost wrote:
>> On Wed, Aug 30, 2017 at 01:01:41PM -0400, Jeff Cody wrote:
>>> From: Jeffrey Cody 
>>>
>>> If configured without live block operations enabled, unregister the
>>> live block operation commands.
>>>
>>> Signed-off-by: Jeff Cody 
>>> ---
>>>  monitor.c | 16 
>>>  1 file changed, 16 insertions(+)
>>>
>
>> 
>> I suggest using the new mechanisms added by:
>> 
>>   [PATCH 00/26] qapi: add #if pre-processor conditions to generated code
>
> Those haven't landed yet, but as both series are proposed for 2.11, I
> indeed agree that basing this series on top of that one will be a bit
> cleaner.

Rebasing shouldn't be hard.  However, we then have to hold it until the
QAPI series lands.  I don't think holding is necessary, as the conflicts
between the two are obvious, and should be straightforward to resolve.

Re: [Qemu-block] [Qemu-devel] [PATCH] block: Cleanup BMDS in bdrv_close_all

2017-08-30 Thread Fam Zheng

On Wed, 08/30 16:04, Juan Quintela wrote:
> Fam Zheng  wrote:
> > On Wed, 08/30 13:49, Juan Quintela wrote:
> >> Fam Zheng  wrote:
> >> > This fixes the assertion due to op blockers added by BMDS:
> >> >
> >> > block.c:3248: bdrv_delete: Assertion `bdrv_op_blocker_is_empty(bs)' 
> >> > failed.
> >> >
> >> > Reproducer: simply start block migration and quit QEMU before it ends.
> >> >
> >> > Cc: qemu-sta...@nongnu.org
> >> > Signed-off-by: Fam Zheng 
> >> 
> >> No need for one stub, see later.
> >> 
> >> 
> >> > ---
> >> >  block.c | 2 ++
> >> >  migration/block.c   | 2 +-
> >> >  migration/block.h   | 1 +
> >> >  stubs/Makefile.objs | 1 +
> >> >  stubs/block-migration.c | 6 ++
> >> >  5 files changed, 11 insertions(+), 1 deletion(-)
> >> >  create mode 100644 stubs/block-migration.c
> >> >
> >> > diff --git a/block.c b/block.c
> >> > index 3308814bba..508a57274d 100644
> >> > --- a/block.c
> >> > +++ b/block.c
> >> > @@ -43,6 +43,7 @@
> >> >  #include "qemu/cutils.h"
> >> >  #include "qemu/id.h"
> >> >  #include "qapi/util.h"
> >> > +#include "migration/block.h"
> >> 
> >> this should be misc.h
> >> 
> >> >  
> >> >  #ifdef CONFIG_BSD
> >> >  #include 
> >> > @@ -3111,6 +3112,7 @@ static void bdrv_close(BlockDriverState *bs)
> >> >  
> >> >  void bdrv_close_all(void)
> >> >  {
> >> > +block_migration_cleanup_bmds();
> >> >  block_job_cancel_sync_all();
> >> >  nbd_export_close_all();
> >> >  
> >> 
> >> > diff --git a/migration/block.h b/migration/block.h
> >> > index 22ebe94259..8bae1cf55a 100644
> >> > --- a/migration/block.h
> >> > +++ b/migration/block.h
> >> > @@ -42,4 +42,5 @@ static inline uint64_t blk_mig_bytes_total(void)
> >> >  #endif /* CONFIG_LIVE_BLOCK_MIGRATION */
> >> >  
> >> >  void migrate_set_block_enabled(bool value, Error **errp);
> >> > +void block_migration_cleanup_bmds(void);
> >> >  #endif /* MIGRATION_BLOCK_H */
> >> > diff --git a/stubs/Makefile.objs b/stubs/Makefile.objs
> >> > index e69c217aff..7540913767 100644
> >> > --- a/stubs/Makefile.objs
> >> > +++ b/stubs/Makefile.objs
> >> > @@ -19,6 +19,7 @@ stub-obj-y += is-daemonized.o
> >> >  stub-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
> >> >  stub-obj-y += machine-init-done.o
> >> >  stub-obj-y += migr-blocker.o
> >> > +stub-obj-y += block-migration.o
> >> >  stub-obj-y += change-state-handler.o
> >> >  stub-obj-y += monitor.o
> >> >  stub-obj-y += notify-event.o
> >> > diff --git a/stubs/block-migration.c b/stubs/block-migration.c
> >> > new file mode 100644
> >> > index 00..855f15c757
> >> > --- /dev/null
> >> > +++ b/stubs/block-migration.c
> >> > @@ -0,0 +1,6 @@
> >> > +#include "qemu/osdep.h"
> >> > +#include "migration/block.h"
> >> > +
> >> > +void block_migration_cleanup_bmds(void)
> >> > +{
> >> > +}
> >> 
> >> You can add this inside include/migration/misc.h
> >> 
> >> #ifdef CONFIG_LIVE_BLOCK_MIGRATION
> >> void blk_mig_init(void);
> >> #else
> >> static inline void blk_mig_init(void) {}
> >> 
> >> // And then you add the stub here?
> >
> > This doesn't work.  The function is not stubbed for
> > !CONFIG_LIVE_BLOCK_MIGRATION configs, but for tools that don't link to
> > common-obj-y. For example with your proposed change, I get:
> >
> >   LINK    qemu-nbd
> > block.o: In function `bdrv_close_all':
> > /home/fam/work/qemu/block.c:3115: undefined reference to
> > `block_migration_cleanup_bmds'
> > collect2: error: ld returned 1 exit status
> > make: *** [/home/fam/work/qemu/rules.mak:121: qemu-nbd] Error 1
> > make: Leaving directory '/home/fam/work/q/build'
> 
> 
> This works for me, for both CONFIG_LIVE_BLOCK_MIGRATION enabled and not.
> For qemu-system-x86_64 and qemu-nbd.  Could you test?

I get the same error:

  LINK    qemu-nbd
block.o: In function `bdrv_close_all':
/home/fam/work/qemu/block.c:3115: undefined reference to `block_migration_cleanup_bmds'
collect2: error: ld returned 1 exit status
make: *** [/home/fam/work/qemu/rules.mak:121: qemu-nbd] Error 1
make: Leaving directory '/home/fam/work/q/build'

(also applies to qemu-img etc.)

Fam

---

$ cat config.status 
#!/bin/sh
# Generated by configure.
# Run this file to recreate the current configuration.
# Compiler output produced by configure, useful for debugging
# configure, is in config.log if it exists.
exec '/home/fam/work/qemu/configure' 
'--prefix=/home/fam/work/q/install/bdrv_close_all-bmds' '--enable-debug' 
'--extra-cflags=-Wno-error=format-truncation' '--target-list=x86_64-softmmu' 
"$@"

$ grep CONFIG_LIVE_BLOCK_MIGRATION config-host.h
#define CONFIG_LIVE_BLOCK_MIGRATION 1

$ make qemu-nbd V=1
(cd /home/fam/work/qemu; printf '#define QEMU_PKGVERSION '; if test -n ""; then 
printf '""\n'; else if test -d .git; then printf '" ('; git describe --match 
'v*' 2>/dev/null | tr -d '\n'; if ! git diff-index --quiet HEAD &>/dev/null; 
then printf -- '-dirty'; fi; printf ')"\n'; else printf '""\n'; fi; fi) > 
qemu-version.h.tmp
if ! cmp -s qemu-version.h qemu-version.h.tmp; then mv qemu-version.h

Re: [Qemu-block] [Qemu-devel] [PATCH] qcow2: allocate cluster_cache/cluster_data on demand

2017-08-30 Thread Alexey Kardashevskiy

On 31/08/17 03:20, Stefan Hajnoczi wrote:
> On Tue, Aug 22, 2017 at 02:56:00PM +1000, Alexey Kardashevskiy wrote:
>> On 19/08/17 12:46, Alexey Kardashevskiy wrote:
>>> On 19/08/17 01:18, Eric Blake wrote:
>>>> On 08/18/2017 08:31 AM, Stefan Hajnoczi wrote:
>>>>> Most qcow2 files are uncompressed so it is wasteful to allocate (32 + 1)
>>>>> * cluster_size + 512 bytes upfront.  Allocate s->cluster_cache and
>>>>> s->cluster_data when the first read operation is performed on a
>>>>> compressed cluster.
>>>>>
>>>>> The buffers are freed in .bdrv_close().  .bdrv_open() no longer has any
>>>>> code paths that can allocate these buffers, so remove the free functions
>>>>> in the error code path.
>>>>>
>>>>> Reported-by: Alexey Kardashevskiy 
>>>>> Cc: Kevin Wolf 
>>>>> Signed-off-by: Stefan Hajnoczi 
>>>>> ---
>>>>> Alexey: Does this improve your memory profiling results?
>>>>
>>>> Is this a regression from earlier versions?
>>>
>>> Hm, I have not thought about this.
>>>
>>> So. I did bisect and this started happening from
>>> 9a4c0e220d8a4f82b5665d0ee95ef94d8e1509d5
>>> "hw/virtio-pci: fix virtio behaviour"
>>>
>>> Before that, the very same command line would take less than 1GB of
>>> resident memory. That thing basically enforces virtio-1.0 for QEMU <=2.6
>>> which means that upstream with "-machine pseries-2.6" works fine (less than
>>> 1GB), "-machine pseries-2.7" does not (close to 7GB, sometimes even 9GB).
>>>
>>> Then I tried bisecting again, with
>>> "scsi=off,disable-modern=off,disable-legacy=on" on my 150 virtio-block
>>> devices, started from
>>> e266d421490e0 "virtio-pci: add flags to enable/disable legacy/modern" (it
>>> added the disable-modern switch) which uses 2GB of memory.
>>>
>>> I ended up with ada434cd0b44 "virtio-pci: implement cfg capability".
>>>
>>> Then I removed proxy->modern_as on v2.10.0-rc3 (see below) and got 1.5GB of
>>> used memory (yay!)
>>>
>>> I do not really know how to interpret all of this, do you?
>>
>>
>> Anyone, ping? Should I move the conversation to the original thread? Any
>> hacks to try with libc?
> 
> I suggest a new top-level thread with Michael Tsirkin CCed.


I am continuing in the original "Memory use with >100 virtio devices"
thread. The problem is more generic than virtio; it is just easier to
reproduce with virtio, that's all.



-- 
Alexey

Re: [Qemu-block] [Qemu-devel] [PATCH v2 7/9] AHCI: Rework IRQ constants

2017-08-30 Thread John Snow



On 08/29/2017 11:54 PM, Philippe Mathieu-Daudé wrote:
> On 08/29/2017 05:49 PM, John Snow wrote:
>> Create a new enum so that we can name the IRQ bits, which will make
>> debugging them a little nicer if we can print them out. Not handled in
>> this patch, but this will make it possible to get a nice debug printf
>> detailing exactly which status bits are set, as it can be multiple at
>> any given time.
>>
>> As a consequence of this patch, it is no longer possible to set multiple
>> IRQ codes at once, but nothing was utilizing this ability anyway.
>>
>> Signed-off-by: John Snow 
>> ---
>>   hw/ide/ahci.c  | 49
>> ++---
>>   hw/ide/ahci_internal.h | 44
>> +++-
>>   hw/ide/trace-events|  2 +-
>>   3 files changed, 74 insertions(+), 21 deletions(-)
>>
>> diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
>> index c60a000..a0a4dd6 100644
>> --- a/hw/ide/ahci.c
>> +++ b/hw/ide/ahci.c
>> @@ -56,6 +56,27 @@ static bool ahci_map_fis_address(AHCIDevice *ad);
>>   static void ahci_unmap_clb_address(AHCIDevice *ad);
>>   static void ahci_unmap_fis_address(AHCIDevice *ad);
>>   +static const char *AHCIPortIRQ_lookup[AHCI_PORT_IRQ__END] = {
>> +[AHCI_PORT_IRQ_BIT_DHRS] = "DHRS",
>> +[AHCI_PORT_IRQ_BIT_PSS]  = "PSS",
>> +[AHCI_PORT_IRQ_BIT_DSS]  = "DSS",
>> +[AHCI_PORT_IRQ_BIT_SDBS] = "SDBS",
>> +[AHCI_PORT_IRQ_BIT_UFS]  = "UFS",
>> +[AHCI_PORT_IRQ_BIT_DPS]  = "DPS",
>> +[AHCI_PORT_IRQ_BIT_PCS]  = "PCS",
>> +[AHCI_PORT_IRQ_BIT_DMPS] = "DMPS",
>> +[8 ... 21]   = "RESERVED",
>> +[AHCI_PORT_IRQ_BIT_PRCS] = "PRCS",
>> +[AHCI_PORT_IRQ_BIT_IPMS] = "IPMS",
>> +[AHCI_PORT_IRQ_BIT_OFS]  = "OFS",
>> +[25] = "RESERVED",
>> +[AHCI_PORT_IRQ_BIT_INFS] = "INFS",
>> +[AHCI_PORT_IRQ_BIT_IFS]  = "IFS",
>> +[AHCI_PORT_IRQ_BIT_HBDS] = "HBDS",
>> +[AHCI_PORT_IRQ_BIT_HBFS] = "HBFS",
>> +[AHCI_PORT_IRQ_BIT_TFES] = "TFES",
>> +[AHCI_PORT_IRQ_BIT_CPDS] = "CPDS"
>> +};
>> static uint32_t  ahci_port_read(AHCIState *s, int port, int offset)
>>   {
>> @@ -170,12 +191,18 @@ static void ahci_check_irq(AHCIState *s)
>>   }
>> static void ahci_trigger_irq(AHCIState *s, AHCIDevice *d,
>> - int irq_type)
>> + enum AHCIPortIRQ irqbit)
>>   {
>> -DPRINTF(d->port_no, "trigger irq %#x -> %x\n",
>> -irq_type, d->port_regs.irq_mask & irq_type);
>> +g_assert(irqbit >= 0 && irqbit < 32);
> 
> I still think this assert is superfluous, anyway (and having hard time
> reading C99 statement before declarations - I need to grow):
> 
> Reviewed-by: Philippe Mathieu-Daudé 
> 

Left in because of my distrust of compilers as explained in my reply to
#05. We'll get to the bottom of it ;)

Thank you for the reviews.

--js
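
The "nice debug printf" mentioned in the commit message is not part of the patch. As a rough sketch of how a bit-indexed name table could be used for it, the standalone example below walks a 32-bit status word and joins the names of the set bits. The table demo_irq_names and the function demo_irq_mask_to_string are hypothetical, and the table is deliberately abbreviated; the bit positions are assumed to mirror the order of the lookup array in the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

enum { DEMO_IRQ_BITS = 32 };

/* Abbreviated, illustrative table; unnamed slots stay NULL. */
static const char *demo_irq_names[DEMO_IRQ_BITS] = {
    [0]  = "DHRS",
    [1]  = "PSS",
    [2]  = "DSS",
    [30] = "TFES",
    [31] = "CPDS",
};

/*
 * Write the space-separated names of all set bits into buf.
 * Returns the number of characters written (excluding the NUL).
 * Stops early if the buffer is too small; buf stays NUL-terminated.
 */
static size_t demo_irq_mask_to_string(uint32_t mask, char *buf, size_t buflen)
{
    size_t used = 0;

    if (buflen == 0) {
        return 0;
    }
    buf[0] = '\0';
    for (int bit = 0; bit < DEMO_IRQ_BITS; bit++) {
        const char *name;
        int n;

        if (!(mask & (UINT32_C(1) << bit))) {
            continue;
        }
        name = demo_irq_names[bit] ? demo_irq_names[bit] : "RESERVED";
        n = snprintf(buf + used, buflen - used, "%s%s",
                     used ? " " : "", name);
        if (n < 0 || (size_t)n >= buflen - used) {
            break; /* output truncated */
        }
        used += (size_t)n;
    }
    return used;
}
```

A caller would pass the port's interrupt status word and hand the resulting string to a trace event or debug print.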

Re: [Qemu-block] [Qemu-devel] [PATCH v2 5/9] IDE: replace DEBUG_AIO with trace events

2017-08-30 Thread John Snow

CCing Laszlo Ersek literally just for laughs, as he is the most
entertaining language lawyer I know of.

Laszlo, please feel free to ignore this if you don't care :P

On 08/29/2017 11:14 PM, Philippe Mathieu-Daudé wrote:
> Hi John,
> 
> On 08/29/2017 05:49 PM, John Snow wrote:
>> Signed-off-by: John Snow 
>> ---
>>   hw/ide/atapi.c|  5 +
>>   hw/ide/core.c | 17 ++---
>>   hw/ide/trace-events   |  3 +++
>>   include/hw/ide/internal.h |  6 --
>>   4 files changed, 18 insertions(+), 13 deletions(-)
>>
>> diff --git a/hw/ide/atapi.c b/hw/ide/atapi.c
>> index 37fa699..b8fc51e 100644
>> --- a/hw/ide/atapi.c
>> +++ b/hw/ide/atapi.c
>> @@ -416,10 +416,7 @@ static void ide_atapi_cmd_read_dma_cb(void
>> *opaque, int ret)
>>   s->io_buffer_size = n * 2048;
>>   data_offset = 0;
>>   }
>> -#ifdef DEBUG_AIO
>> -printf("aio_read_cd: lba=%u n=%d\n", s->lba, n);
>> -#endif
>> -
>> +trace_ide_atapi_cmd_read_dma_cb_aio(s, s->lba, n);
>>   s->bus->dma->iov.iov_base = (void *)(s->io_buffer + data_offset);
>>   s->bus->dma->iov.iov_len = n * ATAPI_SECTOR_SIZE;
>>   qemu_iovec_init_external(&s->bus->dma->qiov, &s->bus->dma->iov, 1);
>> diff --git a/hw/ide/core.c b/hw/ide/core.c
>> index 82a19b1..a1c90e9 100644
>> --- a/hw/ide/core.c
>> +++ b/hw/ide/core.c
>> @@ -58,6 +58,13 @@ static const int smart_attributes[][12] = {
>>   { 190,  0x03, 0x00, 0x45, 0x45, 0x1f, 0x00, 0x1f, 0x1f, 0x00,
>> 0x00, 0x32},
>>   };
>>   +const char *IDE_DMA_CMD_lookup[IDE_DMA__COUNT] = {
>> +[IDE_DMA_READ] = "DMA READ",
>> +[IDE_DMA_WRITE] = "DMA WRITE",
>> +[IDE_DMA_TRIM] = "DMA TRIM",
>> +[IDE_DMA_ATAPI] = "DMA ATAPI"
>> +};
>> +
>>   static void ide_dummy_transfer_stop(IDEState *s);
>> static void padstr(char *str, const char *src, int len)
>> @@ -860,10 +867,8 @@ static void ide_dma_cb(void *opaque, int ret)
>>   goto eot;
>>   }
>>   -#ifdef DEBUG_AIO
>> -printf("ide_dma_cb: sector_num=%" PRId64 " n=%d, cmd_cmd=%d\n",
>> -   sector_num, n, s->dma_cmd);
>> -#endif
>> +trace_ide_dma_cb(s, sector_num, n,
>> + IDE_DMA_CMD_lookup[s->dma_cmd]);
>> if ((s->dma_cmd == IDE_DMA_READ || s->dma_cmd ==
>> IDE_DMA_WRITE) &&
>>   !ide_sect_range_ok(s, sector_num, n)) {
>> @@ -2391,9 +2396,7 @@ void ide_bus_reset(IDEBus *bus)
>> /* pending async DMA */
>>   if (bus->dma->aiocb) {
>> -#ifdef DEBUG_AIO
>> -printf("aio_cancel\n");
>> -#endif
>> +trace_ide_bus_reset_aio();
>>   blk_aio_cancel(bus->dma->aiocb);
>>   bus->dma->aiocb = NULL;
>>   }
>> diff --git a/hw/ide/trace-events b/hw/ide/trace-events
>> index 8c79a6c..cc8949c 100644
>> --- a/hw/ide/trace-events
>> +++ b/hw/ide/trace-events
>> @@ -18,6 +18,8 @@ ide_cancel_dma_sync_remaining(void) "draining all
>> remaining requests"
>>   ide_sector_read(int64_t sector_num, int nsectors) "sector=%"PRId64"
>> nsectors=%d"
>>   ide_sector_write(int64_t sector_num, int nsectors) "sector=%"PRId64"
>> nsectors=%d"
>>   ide_reset(void *s) "IDEstate %p"
>> +ide_bus_reset_aio(void) "aio_cancel"
>> +ide_dma_cb(void *s, int64_t sector_num, int n, const char *dma)
>> "IDEState %p; sector_num=%"PRId64" n=%d cmd=%s"
>> # BMDMA HBAs:
>>   @@ -51,5 +53,6 @@ ide_atapi_cmd_reply_end_new(void *s, int status)
>> "IDEState: %p; new transfer sta
>>   ide_atapi_cmd_check_status(void *s) "IDEState: %p"
>>   ide_atapi_cmd_read(void *s, const char *method, int lba, int
>> nb_sectors) "IDEState: %p; read %s: LBA=%d nb_sectors=%d"
>>   ide_atapi_cmd(void *s, uint8_t cmd) "IDEState: %p; cmd: 0x%02x"
>> +ide_atapi_cmd_read_dma_cb_aio(void *s, int lba, int n) "IDEState: %p;
>> aio read: lba=%d n=%d"
>>   # Warning: Verbose
>>   ide_atapi_cmd_packet(void *s, uint16_t limit, const char *packet)
>> "IDEState: %p; limit=0x%x packet: %s"
>> diff --git a/include/hw/ide/internal.h b/include/hw/ide/internal.h
>> index 74efe8a..db9fde0 100644
>> --- a/include/hw/ide/internal.h
>> +++ b/include/hw/ide/internal.h
>> @@ -14,7 +14,6 @@
>>   #include "block/scsi.h"
>> /* debug IDE devices */
>> -//#define DEBUG_AIO
>>   #define USE_DMA_CDROM
>> typedef struct IDEBus IDEBus;
>> @@ -333,12 +332,15 @@ struct unreported_events {
>>   };
>> enum ide_dma_cmd {
>> -IDE_DMA_READ,
>> +IDE_DMA_READ = 0,
>>   IDE_DMA_WRITE,
>>   IDE_DMA_TRIM,
>>   IDE_DMA_ATAPI,
>> +IDE_DMA__COUNT
>>   };
>>   +extern const char *IDE_DMA_CMD_lookup[IDE_DMA__COUNT];
> 
> I recommend you avoid declaring this extern const array with a size; I
> remember some compilers (old GCC?) ignoring the array size in extern
> declarations. Eric will correct me!
> 
> It is much safer to use a getter:
> 

Well, whether or not the compiler ignores it, you're right that it's
safer to use a getter. I don't think the width being declared HURTS any
compiler though, does it?

> const char *IDE_DMA_CMD_lookup(enum ide_dma_cmd cmd)
> {
> static const char *IDE_DM
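
The quoted getter is cut off in the archive. The following is not a reconstruction of it, only a minimal self-contained sketch of the general pattern under discussion: keep the name table private to one function and bounds-check the index. All demo_ and DEMO_ names are invented for this example.

```c
#include <stddef.h>

/* Stand-in enum for illustration; mirrors the shape of ide_dma_cmd. */
enum demo_dma_cmd {
    DEMO_DMA_READ = 0,
    DEMO_DMA_WRITE,
    DEMO_DMA_TRIM,
    DEMO_DMA_ATAPI,
    DEMO_DMA__COUNT
};

/*
 * Getter with a function-local table: callers never see the array or
 * its size, and an out-of-range value yields a fixed fallback string
 * instead of an out-of-bounds read.
 */
static const char *demo_dma_cmd_name(enum demo_dma_cmd cmd)
{
    static const char *const names[DEMO_DMA__COUNT] = {
        [DEMO_DMA_READ]  = "DMA READ",
        [DEMO_DMA_WRITE] = "DMA WRITE",
        [DEMO_DMA_TRIM]  = "DMA TRIM",
        [DEMO_DMA_ATAPI] = "DMA ATAPI",
    };

    if ((unsigned)cmd >= DEMO_DMA__COUNT) {
        return "UNKNOWN";
    }
    return names[cmd];
}
```

Compared with an extern array, the getter sidesteps the question of whether the declared size is honoured across translation units, at the cost of one function call per lookup.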

Re: [Qemu-block] [Qemu-devel] [PATCH v2 8/9] AHCI: pretty-print FIS to buffer instead of stderr

2017-08-30 Thread John Snow



On 08/30/2017 05:17 AM, Stefan Hajnoczi wrote:
> On Tue, Aug 29, 2017 at 04:49:33PM -0400, John Snow wrote:
> >> The current FIS printing routines dump the FIS to screen. Adjust this
>> such that it dumps to buffer instead, then use this ability to have
>> FIS dump mechanisms via trace-events instead of compiled defines.
>>
>> Signed-off-by: John Snow 
>> ---
>>  hw/ide/ahci.c   | 54 
>> +++--
>>  hw/ide/trace-events |  4 
>>  2 files changed, 48 insertions(+), 10 deletions(-)
>>
>> diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
>> index a0a4dd6..2e75f9b 100644
>> --- a/hw/ide/ahci.c
>> +++ b/hw/ide/ahci.c
>> @@ -644,20 +644,45 @@ static void ahci_reset_port(AHCIState *s, int port)
>>  ahci_init_d2h(d);
>>  }
>>  
>> -static void debug_print_fis(uint8_t *fis, int cmd_len)
>> +/* Buffer pretty output based on a raw FIS structure. */
>> +static void ahci_pretty_buffer_fis(uint8_t *fis, int cmd_len, char **out)
> 
> Simplified function using GString:
> 
>   static char *ahci_pretty_buffer_fis(const uint8_t *fis, int cmd_len)
>   {
>       GString *s = g_string_new("FIS:");
>       int i;
> 
>       for (i = 0; i < cmd_len; i++) {
>           if (i % 16 == 0) {
>               g_string_append_printf(s, "\n0x%02x:", i);
>           }
> 
>           g_string_append_printf(s, " %02x", fis[i]);
>       }
> 
>       g_string_append_c(s, '\n');
>       return g_string_free(s, FALSE);
>   }
> 
> It's less efficient due to extra mallocs but a lot easier to read.
> 

eeeea I guess I don't need to be bleedingly efficient with debug
printfs...

And in this example I don't need to worry about the math being precisely
correct. I'll make the change :(
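
To show what "the math being precisely correct" involves in the hand-sized variant, here is a plain C sketch without GLib that reproduces the size formula from the patch comment (4 bytes of preamble, 7 per line prefix, 3 per source byte, a trailing newline and a NUL) and fills a buffer of exactly that size. The names demo_fis_dump_size and demo_fis_dump are hypothetical; this is an illustration of the arithmetic, not the patch code.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Bytes needed for the dump of len input bytes, including the NUL. */
static size_t demo_fis_dump_size(size_t len)
{
    size_t lines = (len + 15) / 16;

    return 4 + 7 * lines + 3 * len + 1 + 1;
}

/*
 * Format len bytes as "FIS:\n0x00: aa bb ... \n0x10: ... \n".
 * Returns a malloc()ed string the caller must free(), or NULL on
 * allocation failure or if len exceeds 0x100 (so that the offset
 * always fits the two-digit "0x%02x" prefix).
 */
static char *demo_fis_dump(const uint8_t *fis, size_t len)
{
    size_t size = demo_fis_dump_size(len);
    size_t pos = 0;
    char *buf;

    if (len > 0x100) {
        return NULL;
    }
    buf = malloc(size);
    if (!buf) {
        return NULL;
    }
    pos += (size_t)snprintf(buf + pos, size - pos, "FIS:");
    for (size_t i = 0; i < len; i++) {
        if (i % 16 == 0) {
            pos += (size_t)snprintf(buf + pos, size - pos,
                                    "\n0x%02x: ", (unsigned)i);
        }
        pos += (size_t)snprintf(buf + pos, size - pos,
                                "%02x ", (unsigned)fis[i]);
    }
    pos += (size_t)snprintf(buf + pos, size - pos, "\n");
    return buf;
}
```

For a 0x10-byte FIS the formula gives 4 + 7 + 48 + 1 + 1 = 61 bytes. Any change to the format string silently invalidates that constant, which is the maintenance burden the GString version avoids.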

>>  {
>> -#if DEBUG_AHCI
>> +size_t bufsize;
>> +char *pbuf;
>> +char *pptr;
>> +size_t lines = DIV_ROUND_UP(cmd_len, 16);
>> +const char *preamble = "FIS:";
>>  int i;
>>  
>> -fprintf(stderr, "fis:");
>> +/* Total amount of memory to store FISes in HBA memory */
>> +g_assert_cmpint(cmd_len, <=, 0x100);
>> +g_assert(out);
>> +
>> +/* Printed like:
>> + * FIS:\n
>> + * 0x00: 00 11 22 33 44 55 66 77 88 99 aa bb cc dd ee \n
>> + * 0x10: ff \n
>> + * \0
>> + *
>> + * Four bytes for the preamble, seven for each line prefix (including a
>> + * newline to start a new line), three bytes for each source byte,
>> + * a trailing newline and a terminal null byte.
>> + */
>> +
>> +bufsize = strlen(preamble) + ((6 + 1) * lines) + (3 * cmd_len) + 1 + 1;
>> +pbuf = g_malloc(bufsize);
>> +pptr = pbuf;
>> +pptr += sprintf(pptr, "%s", preamble);
>>  for (i = 0; i < cmd_len; i++) {
>>  if ((i & 0xf) == 0) {
>> -fprintf(stderr, "\n%02x:",i);
>> +pptr += sprintf(pptr, "\n0x%02x: ", i);
>>  }
>> -fprintf(stderr, "%02x ",fis[i]);
>> +pptr += sprintf(pptr, "%02x ", fis[i]);
>>  }
>> -fprintf(stderr, "\n");
>> -#endif
>> +pptr += sprintf(pptr, "\n");
>> +pptr += 1; /* \0 */
>> +g_assert(pbuf + bufsize == pptr);
>> +*out = pbuf;
>>  }
>>  
>>  static bool ahci_map_fis_address(AHCIDevice *ad)
>> @@ -1201,7 +1226,12 @@ static void handle_reg_h2d_fis(AHCIState *s, int port,
>>   * table to ide_state->io_buffer */
>>  if (opts & AHCI_CMD_ATAPI) {
>>  memcpy(ide_state->io_buffer, &cmd_fis[AHCI_COMMAND_TABLE_ACMD], 
>> 0x10);
>> -debug_print_fis(ide_state->io_buffer, 0x10);
>> +if (TRACE_HANDLE_REG_H2D_FIS_DUMP_ENABLED) {
> 
> This should probably be:
> 
>   if (trace_event_get_state_backends(TRACE_HANDLE_REG_H2D_FIS_DUMP)) {
> 
> The difference is that TRACE_HANDLE_REG_H2D_FIS_DUMP_ENABLED is set at
> compile time while trace_event_get_state_backends() checks if the event
> is enabled at run-time.
> 
> Therefore TRACE_HANDLE_REG_H2D_FIS_DUMP_ENABLED causes the trace event
> to fire even when the user hasn't enabled the trace event yet.  That
> would be a waste of CPU.
> 

Oh, cool! Nice tip!

>> +char *pretty_fis;
>> +ahci_pretty_buffer_fis(ide_state->io_buffer, 0x10, &pretty_fis);
>> +trace_handle_reg_h2d_fis_dump(s, port, pretty_fis);
>> +g_free(pretty_fis);
>> +}
>>  s->dev[port].done_atapi_packet = false;
>>  /* XXX send PIO setup FIS */
>>  }
>> @@ -1256,8 +1286,12 @@ static int handle_cmd(AHCIState *s, int port, uint8_t 
>> slot)
>>  trace_handle_cmd_badmap(s, port, cmd_len);
>>  goto out;
>>  }
>> -debug_print_fis(cmd_fis, 0x80);
>> -
>> +if (TRACE_HANDLE_CMD_FIS_DUMP_ENABLED) {
> 
> Same here.
> 

Sure thing. There's probably a block in ATAPI that needs this treatment,
too.

>> +char *pretty_fis;
>> +ahci_pretty_buffer_fis(cmd_fis, 0x80, &pretty_fis);
>> +trace_handle_cmd_fis_dump(s, port, pretty_fis);
>> +g_free(pretty_fis);
>> +}
>>  switch (cmd_fis[0]) {
>>  case SATA_FIS_TYPE_REG

Re: [Qemu-block] [Qemu-devel] [PATCH v6 00/18] make dirty-bitmap byte-based

2017-08-30 Thread John Snow



On 08/30/2017 05:05 PM, Eric Blake wrote:
> There are patches floating around to add NBD_CMD_BLOCK_STATUS,
> but NBD wants to report status on byte granularity (even if the
> reporting will probably be naturally aligned to sectors or even
> much higher levels).  I've therefore started the task of
> converting our block status code to report at a byte granularity
> rather than sectors.
> 
> Now that 2.11 is open, I'm rebasing/reposting the remaining patches.
> 
> The overall conversion currently looks like:
> part 1: bdrv_is_allocated (merged in 2.10, commit 51b0a488)
> part 2: dirty-bitmap (this series, v5 was here [1])
> part 3: bdrv_get_block_status (v3 is posted [2] and is mostly reviewed, but
> needs a rebase)
> part 4: .bdrv_co_block_status (v2 is posted [3], but needs a rebase)
> 
> Available as a tag at:
> git fetch git://repo.or.cz/qemu/ericb.git nbd-byte-dirty-v6
> 
> [1] https://lists.gnu.org/archive/html/qemu-devel/2017-07/msg03512.html
> [2] https://lists.gnu.org/archive/html/qemu-devel/2017-07/msg03853.html
> [3] https://lists.gnu.org/archive/html/qemu-devel/2017-07/msg04370.html
> 
> Diff from v5:
> - add another patch (more for ease of bookkeeping, as it was previously
> posted independently)
> - drop bug fixes that were hoisted into 2.10 (v5 1/18, plus 14/18)
> 
> 001/18:[down] 'block: Make bdrv_img_create() size selection easier to read'
> 002/18:[----] [--] 'hbitmap: Rename serialization_granularity to serialization_align'
> 003/18:[----] [--] 'qcow2: Ensure bitmap serialization is aligned'
> 004/18:[----] [--] 'dirty-bitmap: Drop unused functions'
> 005/18:[----] [--] 'dirty-bitmap: Change bdrv_dirty_bitmap_size() to report bytes'
> 006/18:[----] [--] 'dirty-bitmap: Change bdrv_dirty_bitmap_*serialize*() to take bytes'
> 007/18:[----] [--] 'qcow2: Switch sectors_covered_by_bitmap_cluster() to byte-based'
> 008/18:[----] [--] 'dirty-bitmap: Set iterator start by offset, not sector'
> 009/18:[----] [--] 'dirty-bitmap: Change bdrv_dirty_iter_next() to report byte offset'
> 010/18:[----] [--] 'dirty-bitmap: Change bdrv_get_dirty_count() to report bytes'
> 011/18:[----] [--] 'dirty-bitmap: Change bdrv_get_dirty_locked() to take bytes'
> 012/18:[----] [--] 'dirty-bitmap: Change bdrv_[re]set_dirty_bitmap() to use bytes'
> 013/18:[----] [--] 'mirror: Switch mirror_dirty_init() to byte-based iteration'
> 014/18:[0004] [FC] 'qcow2: Switch qcow2_measure() to byte-based iteration'
> 015/18:[----] [--] 'qcow2: Switch load_bitmap_data() to byte-based iteration'
> 016/18:[----] [--] 'qcow2: Switch store_bitmap_data() to byte-based iteration'
> 017/18:[----] [--] 'dirty-bitmap: Switch bdrv_set_dirty() to bytes'
> 018/18:[----] [--] 'dirty-bitmap: Convert internal hbitmap size/granularity'
> 
> Eric Blake (18):
>   block: Make bdrv_img_create() size selection easier to read
>   hbitmap: Rename serialization_granularity to serialization_align
>   qcow2: Ensure bitmap serialization is aligned
>   dirty-bitmap: Drop unused functions
>   dirty-bitmap: Change bdrv_dirty_bitmap_size() to report bytes
>   dirty-bitmap: Change bdrv_dirty_bitmap_*serialize*() to take bytes
>   qcow2: Switch sectors_covered_by_bitmap_cluster() to byte-based
>   dirty-bitmap: Set iterator start by offset, not sector
>   dirty-bitmap: Change bdrv_dirty_iter_next() to report byte offset
>   dirty-bitmap: Change bdrv_get_dirty_count() to report bytes
>   dirty-bitmap: Change bdrv_get_dirty_locked() to take bytes
>   dirty-bitmap: Change bdrv_[re]set_dirty_bitmap() to use bytes
>   mirror: Switch mirror_dirty_init() to byte-based iteration
>   qcow2: Switch qcow2_measure() to byte-based iteration
>   qcow2: Switch load_bitmap_data() to byte-based iteration
>   qcow2: Switch store_bitmap_data() to byte-based iteration
>   dirty-bitmap: Switch bdrv_set_dirty() to bytes
>   dirty-bitmap: Convert internal hbitmap size/granularity
> 
>  include/block/block_int.h|   2 +-
>  include/block/dirty-bitmap.h |  41 +-
>  include/qemu/hbitmap.h   |   8 +--
>  block/io.c   |   6 +-
>  block.c  |   2 +-
>  block/backup.c   |   7 +--
>  block/dirty-bitmap.c | 130 
> ++-
>  block/mirror.c   |  76 +++--
>  block/qcow2-bitmap.c |  57 +--
>  block/qcow2.c|  22 
>  migration/block.c|  12 ++--
>  tests/test-hbitmap.c |  10 ++--
>  util/hbitmap.c   |   8 +--
>  13 files changed, 154 insertions(+), 227 deletions(-)
> 

Should this go through the bitmap tree, or, since it touches qcow2,
should I let Kevin/Max/Stefan stage it?

--js
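
As background for why byte-based interfaces are being introduced, the sketch below shows the rounding a caller has to do when a byte range must be expressed in 512-byte sectors. It is a standalone illustration, not code from the series; DEMO_SECTOR_BITS and demo_sector_count are assumed names.

```c
#include <stdint.h>

/* Assumed 512-byte sectors for this sketch. */
#define DEMO_SECTOR_BITS 9

/*
 * Number of whole sectors touched by the byte range [offset, offset + bytes).
 * An unaligned range is widened to sector boundaries, which is exactly the
 * loss of precision that byte-granularity reporting avoids.
 */
static uint64_t demo_sector_count(uint64_t offset, uint64_t bytes)
{
    uint64_t first, last;

    if (bytes == 0) {
        return 0;
    }
    first = offset >> DEMO_SECTOR_BITS;
    last = (offset + bytes - 1) >> DEMO_SECTOR_BITS;
    return last - first + 1;
}
```

For example, a 2-byte range starting at offset 511 straddles a boundary and is reported as two full sectors, that is 1024 bytes.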

Re: [Qemu-block] [Qemu-devel] [PATCH v3 4/5] qemu-iotests: make python tests attempt to leave intermediate files

2017-08-30 Thread John Snow



On 08/30/2017 06:35 PM, Eric Blake wrote:
> On 08/30/2017 05:28 PM, John Snow wrote:
> 
>> I'm a little iffy on this patch; I know that ./check can take care of
>> our temp files for us now, but because each python test is itself a
>> little mini-harness, I'm a little leery of moving the teardown to setup
>> and trying to pre-clean the confetti before the test begins.
>>
>> What's the benefit? We still have to clean up these files per-test, but
>> now it's slightly more error-prone and in a weird place.
>>
>> If we want to try to preserve the most-recent-failure-files, perhaps we
>> can define a setting in the python test-runner that allows us to
>> globally skip file cleanup.
> 
> On the other hand, since each test is a mini-harness, globally skipping
> cleanup will make a two-part test fail on the second because of garbage
> left behind by the first.
> 

The subtext was to have per-subtest files.

> Patch 5 adds a comment with another possible solution: teach the python
> mini-harness to either clean all files in the directory, or to relocate
> the directory according to test name, so that each mini-test starts with
> a fresh location, and cleanup is then handled by the harness rather than
> spaghetti pre-cleanup.  But any solution is better than our current
> situation of nothing, so that's why I'm still okay with this patch as-is
> as offering more (even if not perfect) than before.
> 

I guess where I am unsure is really if this is better than what we
currently do, which is to (try to) clean up after each test as best as
we can. I don't see it as too different from trying to clean up before
each test.

It does give us the ability to leave behind a little detritus after a
failed run, but it's so imperfect that I wonder if it's worth shifting
this code around to change not much.

I won't die on this hill, it just strikes me a slightly less intuitive
use of the python unittest framework.

--js

Re: [Qemu-block] [Qemu-devel] [PATCH v3 1/5] qemu-iotests: set TEST_DIR to a unique dir for each test

2017-08-30 Thread Jeff Cody

On Wed, Aug 30, 2017 at 06:15:05PM -0400, John Snow wrote:
> 
> 
> On 08/30/2017 12:52 PM, Jeff Cody wrote:
> > Right now, all qemu-iotests output data into the same scratch directory,
> > and so each test needs to be responsible for cleaning up its own files.
> > 
> > Have each test use 'scratch/$seq' as its temp directory, so the check
> > script can do simple cleanup of removing the whole temporary directory.
> > 
> > Reviewed-by: Eric Blake 
> > Signed-off-by: Jeff Cody 
> > ---
> >  tests/qemu-iotests/check | 21 +
> >  1 file changed, 17 insertions(+), 4 deletions(-)
> > 
> > diff --git a/tests/qemu-iotests/check b/tests/qemu-iotests/check
> > index d504b6e..f6ca85d 100755
> > --- a/tests/qemu-iotests/check
> > +++ b/tests/qemu-iotests/check
> > @@ -243,6 +243,7 @@ seq="check"
> >  
> >  for seq in $list
> >  do
> > +TEST_DIR_SEQ=$TEST_DIR/$seq
> >  err=false
> >  printf %s "$seq"
> >  if [ -n "$TESTS_REMAINING_LOG" ] ; then
> > @@ -289,13 +290,23 @@ do
> >  fi
> >  export OUTPUT_DIR=$PWD
> >  if $debug; then
> > -(cd "$source_iotests";
> > +(
> > +export TEST_DIR=$TEST_DIR_SEQ
> > +. "$source_iotests/common.config"
> > +. "$source_iotests/common.rc"
> 
> What purpose do these serve?
>

This sets $TEST_DIR according to $seq (the number of the test being run) in
the bash subshell that the test runs in.  Several other variables are
derived from $TEST_DIR, so common.config and common.rc are sourced in that
subshell, before the test runs, to set them appropriately.  That way the
test's environment is based on $TEST_DIR_SEQ rather than the original base
$TEST_DIR.

> > +cd "$source_iotests" &&
> >  MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(($RANDOM % 255 + 1))} \
> > -$run_command -d 2>&1 | tee $tmp.out)
> > +$run_command -d 2>&1 | tee $tmp.out
> > +)
> >  else
> > -(cd "$source_iotests";
> > +(
> > +export TEST_DIR=$TEST_DIR_SEQ
> > +. "$source_iotests/common.config"
> > +. "$source_iotests/common.rc"
> > + cd "$source_iotests" &&
> >  MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(($RANDOM % 255 + 1))} \
> > -$run_command >$tmp.out 2>&1)
> > +$run_command >$tmp.out 2>&1
> > +)
> >  fi
> >  sts=$?
> >  $timestamp && _timestamp
> > @@ -359,6 +370,8 @@ do
> >  fi
> >  fi
> >  
> > +rm -rf "$TEST_DIR_SEQ"
> > +
> >  fi
> >  
> >  # come here for each test, except when $showme is true
> > 
> 
> Seems OK to me, though I am not able to answer all doubts about exactly
> how this may affect the strange pipe/subshell arrangements that occur
> deeper in the bowels of the included files for launching QEMU and so
> on. I suppose that might be related to the inclusion of those
> common.XYZ files?
> 
> Tested-by: John Snow 
> Reviewed-by: John Snow

Re: [Qemu-block] [Qemu-devel] [PATCH v3 5/5] qemu-iotests: add option to save temp files on error

2017-08-30 Thread John Snow



On 08/30/2017 12:52 PM, Jeff Cody wrote:
> Now that ./check takes care of cleaning up after each test, it
> can also selectively not clean up.  Add an option to leave all output from
> tests intact if that test encountered an error.
> 
> Note: this currently only works for bash tests, as the python tests
> still clean up after themselves manually.
> 
> Signed-off-by: Jeff Cody 
> ---
>  tests/qemu-iotests/check  | 10 +-
>  tests/qemu-iotests/common |  6 ++
>  2 files changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/tests/qemu-iotests/check b/tests/qemu-iotests/check
> index f6ca85d..8a5fc0d 100755
> --- a/tests/qemu-iotests/check
> +++ b/tests/qemu-iotests/check
> @@ -370,7 +370,15 @@ do
>  fi
>  fi
>  
> -rm -rf "$TEST_DIR_SEQ"
> +#TODO: There is some initial work to save intermediate files
> +#  in python tests, but it is imperfect.  Having each
> +#  test record its test name, and the tearDown function
> +#  just move intermediate images to a subdirectory with
> +#  the test name may prove more useful.
> +if [ "$save_on_err" != "true" ] || [ "$err" != "true" ]
> +then
> +rm -rf "$TEST_DIR_SEQ"
> +fi
>  
>  fi
>  
> diff --git a/tests/qemu-iotests/common b/tests/qemu-iotests/common
> index d34c11c..d08b233 100644
> --- a/tests/qemu-iotests/common
> +++ b/tests/qemu-iotests/common
> @@ -42,6 +42,7 @@ expunge=true
>  have_test_arg=false
>  randomize=false
>  cachemode=false
> +save_on_err=false
>  rm -f $tmp.list $tmp.tmp $tmp.sed
>  
>  export IMGFMT=raw
> @@ -172,6 +173,7 @@ other options
>  -T  output timestamps
>  -r  randomize test order
>  -c mode cache mode
> +-s  save test scratch directory on test failure
>  
>  testlist options
>  -g group[,group...]include tests from these groups
> @@ -349,6 +351,10 @@ testlist options
>  xgroup=true
>  xpand=false
>  ;;
> +-s)
> +save_on_err=true
> +xpand=false
> +;;
>  '[0-9][0-9][0-9] [0-9][0-9][0-9][0-9]')
>  echo "No tests?"
>  status=1
> 

This, however, is definitely awesome.

Tested-by: John Snow 
Reviewed-by: John Snow
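
The cleanup condition in the hunk above can be restated outside of shell.
This is only a sketch: `run_one_test` and `test_func` are hypothetical names
standing in for the body of the ./check loop and for running one test, and
the per-test directory mirrors $TEST_DIR/$seq from patch 1/5.

```python
import os
import shutil


def run_one_test(base_dir, seq, test_func, save_on_err=False):
    """Run test_func in base_dir/seq; return True if the test passed.

    test_func is a hypothetical callable that receives the scratch
    directory and returns True on success.
    """
    test_dir = os.path.join(base_dir, seq)
    os.makedirs(test_dir)
    err = not test_func(test_dir)
    # Same condition as the shell hunk: remove the directory unless
    # saving was requested AND this particular test failed.
    if not save_on_err or not err:
        shutil.rmtree(test_dir)
    return not err
```

With `save_on_err=True`, only the directories of failing tests survive the
run, which is the behaviour the new -s option asks for.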

Re: [Qemu-block] [Qemu-devel] [PATCH v3 4/5] qemu-iotests: make python tests attempt to leave intermediate files

2017-08-30 Thread Eric Blake

On 08/30/2017 05:28 PM, John Snow wrote:

> I'm a little iffy on this patch; I know that ./check can take care of
> our temp files for us now, but because each python test is itself a
> little mini-harness, I'm a little leery of moving the teardown to setup
> and trying to pre-clean the confetti before the test begins.
> 
> What's the benefit? We still have to clean up these files per-test, but
> now it's slightly more error-prone and in a weird place.
> 
> If we want to try to preserve the most-recent-failure-files, perhaps we
> can define a setting in the python test-runner that allows us to
> globally skip file cleanup.

On the other hand, since each test is a mini-harness, globally skipping
cleanup will make a two-part test fail on the second because of garbage
left behind by the first.

Patch 5 adds a comment with another possible solution: teach the python
mini-harness to either clean all files in the directory, or to relocate
the directory according to test name, so that each mini-test starts with
a fresh location, and cleanup is then handled by the harness rather than
spaghetti pre-cleanup.  But any solution is better than our current
situation of nothing, so that's why I'm still okay with this patch as-is
as offering more (even if not perfect) than before.
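
A minimal sketch of the "relocate the directory according to test name"
idea, assuming nothing about how iotests.py would really do it:
`PerTestDirCase` and `base_dir` are hypothetical, with `base_dir` standing in
for iotests.test_dir.

```python
import os
import shutil
import tempfile
import unittest


class PerTestDirCase(unittest.TestCase):
    """Hypothetical base class: one scratch subdirectory per test method."""

    # Stand-in for iotests.test_dir; an assumption made for this sketch.
    base_dir = os.path.join(tempfile.gettempdir(), 'iotests-scratch-sketch')

    def setUp(self):
        # self.id() looks like "module.Class.test_method", so it is unique
        # per mini-test and maps leftover files back to the test.
        self.test_dir = os.path.join(self.base_dir, self.id())
        # Start from a fresh location even if an earlier run left files.
        shutil.rmtree(self.test_dir, ignore_errors=True)
        os.makedirs(self.test_dir)

    # Deliberately no tearDown: files stay behind for inspection, and the
    # outer harness removes base_dir as a whole when it decides to.
```

Each test method then builds its image paths from `self.test_dir` rather
than from a shared module-level directory, so a failed first test cannot
leave garbage in the second test's way.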

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


Re: [Qemu-block] [Qemu-devel] [PATCH v3 4/5] qemu-iotests: make python tests attempt to leave intermediate files

2017-08-30 Thread John Snow



On 08/30/2017 02:33 PM, Eric Blake wrote:
> On 08/30/2017 11:52 AM, Jeff Cody wrote:
>> Now that 'check' will clean up after tests, try and make python
>> tests leave intermediate files so that they might be inspectable
>> on failure.
>>
>> This isn't perfect; the python unittest framework runs multiple
>> tests, even if previous tests failed.  So we need to make sure that
>> each test still begins with a "clean" slate, to prevent false
>> positives or tainted test runs.
>>
>> Rather than delete images in the unittest tearDown, invert this
>> and delete images to be used in that test at the beginning of the
>> setUp.  This is to make sure that the test run is not inadvertently
>> using file droppings from previous runs.  We must use 'blind_remove'
>> then for these, as the files might not exist yet, but we don't want
>> to throw an error for that.
>>
>> Signed-off-by: Jeff Cody 
>> ---
> 
>> +++ b/tests/qemu-iotests/030
>> @@ -21,7 +21,7 @@
>>  import time
>>  import os
>>  import iotests
>> -from iotests import qemu_img, qemu_io
>> +from iotests import qemu_img, qemu_io, blind_remove
>>  
>>  backing_img = os.path.join(iotests.test_dir, 'backing.img')
>>  mid_img = os.path.join(iotests.test_dir, 'mid.img')
>> @@ -31,6 +31,9 @@ class TestSingleDrive(iotests.QMPTestCase):
>>  image_len = 1 * 1024 * 1024 # MB
>>  
>>  def setUp(self):
>> +blind_remove(test_img)
>> +blind_remove(mid_img)
>> +blind_remove(backing_img)
> 
> Would it be any more pythonic to have support for:
> 
> blind_remove(test_img, mid_img, backing_img)
> 
> built into the previous patch?
> 

It should probably either take an iterable, or an arbitrary number of
arguments, or both, I dunno. I'm not a python.
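
A rough sketch of the "both" option, assuming the helper keeps the semantics
from patch 3/5 (ignore a missing file, re-raise anything else). The
list/tuple/set check is an assumption of this sketch, not something from the
series.

```python
import errno
import os


def blind_remove(*paths):
    """Remove each path, ignoring files that do not exist.

    Each argument may be a single path or a list/tuple/set of paths.
    """
    for path in paths:
        if isinstance(path, (list, tuple, set)):
            # Recurse so callers can pass a collection without unpacking.
            blind_remove(*path)
            continue
        try:
            os.remove(path)
        except OSError as error:
            if error.errno != errno.ENOENT:
                raise
```

Callers could then write blind_remove(test_img, mid_img, backing_img), or
hand over self.IMAGES + [quorum_repair_img, quorum_snapshot_file] directly
instead of looping.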

>>  def tearDown(self):
>>  self.vm.shutdown()
>> -os.remove(self.test_img)
>> -os.remove(self.mid_img_abs)
>> -os.remove(self.backing_img_abs)
>> -try:
>> -os.rmdir(os.path.join(iotests.test_dir, self.dir1))
>> -os.rmdir(os.path.join(iotests.test_dir, self.dir3))
>> -os.rmdir(os.path.join(iotests.test_dir, self.dir2))
>> -except OSError as exception:
>> -if exception.errno != errno.EEXIST and exception.errno != errno.ENOTEMPTY:
>> -raise
> 
> The code removed here is using a syntax that differs from what you used
> in 3/5 when defining blind_remove; does that matter for 3/5?
> 
>> +++ b/tests/qemu-iotests/041
> 
>> +blind_remove(target_img)
>>  iotests.create_image(backing_img, self.image_len)
>>  qemu_img('create', '-f', iotests.imgfmt, '-o', 'backing_file=%s' % backing_img, test_img)
>>  self.vm = iotests.VM().add_drive(test_img, "node-name=top,backing.node-name=base")
>> @@ -49,12 +52,6 @@ class TestSingleDrive(iotests.QMPTestCase):
>>  
>>  def tearDown(self):
>>  self.vm.shutdown()
>> -os.remove(test_img)
>> -os.remove(backing_img)
>> -try:
>> -os.remove(target_img)
>> -except OSError:
>> -pass
> 
> You're changing failures other than ENOENT from ignored to explicit -
> nice little bug-fix along the way :)  I notice this pattern in multiple
> tests; is it worth mentioning in the commit message as intentional?
> 
>> @@ -797,6 +788,9 @@ class TestRepairQuorum(iotests.QMPTestCase):
>>  IMAGES = [ quorum_img1, quorum_img2, quorum_img3 ]
>>  
>>  def setUp(self):
>> +for i in self.IMAGES + [ quorum_repair_img, quorum_snapshot_file ]:
>> +blind_remove(i)
> 
> Again, would it be more pythonic if blind_remove() could take a list and
> automatically work on each element of the list, rather than having to
> make the caller iterate?
> 
>> +++ b/tests/qemu-iotests/057
>> @@ -23,7 +23,7 @@
>>  import time
>>  import os
>>  import iotests
>> -from iotests import qemu_img, qemu_io
>> +from iotests import qemu_img, qemu_io, blind_remove
>>  
>>  test_drv_base_name = 'drive'
>>  
>> @@ -36,6 +36,8 @@ class ImageSnapshotTestCase(iotests.QMPTestCase):
>>  
>>  def _setUp(self, test_img_base_name, image_num):
>>  self.vm = iotests.VM()
>> +for dev_expect in self.expect:
>> +blind_remove(dev_expect['image'])
> 
> Another place where python magic could make the caller nicer?
> 
>> +++ b/tests/qemu-iotests/118
> 
>> @@ -411,16 +411,16 @@ class TestFloppyInitiallyEmpty(TestInitiallyEmpty):
>>  
>>  class TestChangeReadOnly(ChangeBaseClass):
>>  def setUp(self):
>> -qemu_img('create', '-f', iotests.imgfmt, old_img, '1440k')
>> -qemu_img('create', '-f', iotests.imgfmt, new_img, '1440k')
>> -self.vm = iotests.VM()
>> -
>> -def tearDown(self):
>> -self.vm.shutdown()
>>  os.chmod(old_img, 0666)
>>  os.chmod(new_img, 0666)
>> -os.remove(old_img)
>> -os.remove(new_img)
>> +blind_remove(old_img)
>> +blind_remove(new_img)
>> +qemu_img('create', '-f', iot

Re: [Qemu-block] [Qemu-devel] [PATCH v3 3/5] qemu-iotests: add 'blind_remove' for python tests

2017-08-30 Thread John Snow



On 08/30/2017 02:13 PM, Eric Blake wrote:
> On 08/30/2017 11:52 AM, Jeff Cody wrote:
>> Add a function to attempt to 'blindly' remove a file, without
>> throwing an error if the file doesn't exist.
>>
>> Signed-off-by: Jeff Cody 
>> ---
>>  tests/qemu-iotests/iotests.py | 7 +++
>>  1 file changed, 7 insertions(+)
>>
>> diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
>> index 7233983..a2088c7 100644
>> --- a/tests/qemu-iotests/iotests.py
>> +++ b/tests/qemu-iotests/iotests.py
>> @@ -57,6 +57,13 @@ qemu_default_machine = os.environ.get('QEMU_DEFAULT_MACHINE')
>>  socket_scm_helper = os.environ.get('SOCKET_SCM_HELPER', 'socket_scm_helper')
>>  debug = False
>>  
>> +def blind_remove(filename):
>> +try:
>> +os.remove(filename)
>> +except OSError, error:
> 
> I'm assuming this works for both python 2 and 3?
> 

That appears to be Python 2-specific syntax, actually. Using "as error"
appears to work in both 2.7 and 3.whatever, and according to
http://python3porting.com/differences.html it will work in 2.6 too.
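
A minimal sketch of the portable spelling, assuming the helper keeps the
name and behaviour from the patch. The `import errno` line is an assumption:
the hunk does not show whether iotests.py already imports it.

```python
import errno
import os


def blind_remove(filename):
    """Remove filename, ignoring the case where it does not exist."""
    try:
        os.remove(filename)
    except OSError as error:  # "as" form instead of "OSError, error"
        # Only a missing file is tolerated; permission problems and the
        # like still propagate to the caller.
        if error.errno != errno.ENOENT:
            raise
```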

>> +if error.errno != errno.ENOENT:
>> +raise
>> +
> 
> Weak, since I'm not the strongest at python, but you can add:
> Reviewed-by: Eric Blake 
>

Re: [Qemu-block] [Qemu-devel] [PATCH v3 2/5] qemu-iotests: remove file cleanup from bash tests

2017-08-30 Thread John Snow



On 08/30/2017 12:52 PM, Jeff Cody wrote:
> All files for a given test are now self-contained in a subdirectory,
> and therefore the "./check" script can do all file-related cleanup
> without any help.
> 
> This removes file cleanups from the bash tests.  The only cleanup left
> is whatever is needed to kill any spawned processes; e.g. _cleanup_qemu.
> 
> Reviewed-by: Eric Blake 
> Signed-off-by: Jeff Cody 
> ---
>  tests/qemu-iotests/001 |  6 --
>  tests/qemu-iotests/002 |  6 --
>  tests/qemu-iotests/003 |  6 --
>  tests/qemu-iotests/004 |  6 --
>  tests/qemu-iotests/005 |  6 --
>  tests/qemu-iotests/007 |  7 ---
>  tests/qemu-iotests/008 |  6 --
>  tests/qemu-iotests/009 |  6 --
>  tests/qemu-iotests/010 |  6 --
>  tests/qemu-iotests/011 |  6 --
>  tests/qemu-iotests/012 |  6 --
>  tests/qemu-iotests/013 |  6 --
>  tests/qemu-iotests/014 |  6 --
>  tests/qemu-iotests/015 |  7 ---
>  tests/qemu-iotests/017 |  6 --
>  tests/qemu-iotests/018 |  6 --
>  tests/qemu-iotests/019 |  8 
>  tests/qemu-iotests/020 |  8 
>  tests/qemu-iotests/021 |  6 --
>  tests/qemu-iotests/022 |  6 --
>  tests/qemu-iotests/023 |  6 --
>  tests/qemu-iotests/024 |  8 
>  tests/qemu-iotests/025 |  6 --
>  tests/qemu-iotests/026 |  7 ---
>  tests/qemu-iotests/027 |  6 --
>  tests/qemu-iotests/028 |  8 
>  tests/qemu-iotests/029 |  7 ---
>  tests/qemu-iotests/031 |  6 --
>  tests/qemu-iotests/032 |  6 --
>  tests/qemu-iotests/033 |  6 --
>  tests/qemu-iotests/034 |  6 --
>  tests/qemu-iotests/035 |  6 --
>  tests/qemu-iotests/036 |  6 --
>  tests/qemu-iotests/037 |  6 --
>  tests/qemu-iotests/038 |  6 --
>  tests/qemu-iotests/039 |  6 --
>  tests/qemu-iotests/042 |  6 --
>  tests/qemu-iotests/043 |  7 ---
>  tests/qemu-iotests/046 |  6 --
>  tests/qemu-iotests/047 |  6 --
>  tests/qemu-iotests/048 |  8 
>  tests/qemu-iotests/048.out |  1 -
>  tests/qemu-iotests/049 |  6 --
>  tests/qemu-iotests/050 |  8 
>  tests/qemu-iotests/051 |  6 --
>  tests/qemu-iotests/052 |  6 --
>  tests/qemu-iotests/053 |  7 ---
>  tests/qemu-iotests/054 |  6 --
>  tests/qemu-iotests/058 |  8 +---
>  tests/qemu-iotests/059 |  7 ---
>  tests/qemu-iotests/060 |  6 --
>  tests/qemu-iotests/061 |  6 --
>  tests/qemu-iotests/062 |  6 --
>  tests/qemu-iotests/063 |  7 ---
>  tests/qemu-iotests/064 |  6 --
>  tests/qemu-iotests/066 |  6 --
>  tests/qemu-iotests/068 |  6 --
>  tests/qemu-iotests/069 |  6 --
>  tests/qemu-iotests/070 |  6 --
>  tests/qemu-iotests/071 |  6 --
>  tests/qemu-iotests/072 |  6 --
>  tests/qemu-iotests/073 |  6 --
>  tests/qemu-iotests/074 |  9 -
>  tests/qemu-iotests/074.out |  1 -
>  tests/qemu-iotests/075 |  6 --
>  tests/qemu-iotests/076 |  6 --
>  tests/qemu-iotests/077 |  6 --
>  tests/qemu-iotests/078 |  6 --
>  tests/qemu-iotests/079 |  6 --
>  tests/qemu-iotests/080 |  7 ---
>  tests/qemu-iotests/081 |  8 
>  tests/qemu-iotests/082 |  6 --
>  tests/qemu-iotests/084 |  6 --
>  tests/qemu-iotests/085 | 13 +
>  tests/qemu-iotests/086 |  6 --
>  tests/qemu-iotests/088 |  7 ---
>  tests/qemu-iotests/089 |  6 --
>  tests/qemu-iotests/090 |  6 --
>  tests/qemu-iotests/091 |  8 +---
>  tests/qemu-iotests/092 |  7 ---
>  tests/qemu-iotests/094 |  9 +
>  tests/qemu-iotests/095 |  8 +---
>  tests/qemu-iotests/097 |  7 ---
>  tests/qemu-iotests/098 |  7 ---
>  tests/qemu-iotests/099 |  6 --
>  tests/qemu-iotests/101 |  6 --
>  tests/qemu-iotests/102 |  7 +--
>  tests/qemu-iotests/103 |  6 --
>  tests/qemu-iotests/104 |  2 --
>  tests/qemu-iotests/105 |  6 --
>  tests/qemu-iotests/106 |  6 --
>  tests/qemu-iotests/107 |  6 --
>  tests/qemu-iotests/108 |  6 --
>  tests/qemu-iotests/109 |  8 +---
>  tests/qemu-iotests/110 |  6 --
>  tests/qemu-iotests/111 |  6 --
>  tests/qemu-iotests/112 |  6 --
>  tests/qemu-iotests/113 |  6 --
>  tests/qemu-iotests/114 |  6 --
>  tests/qemu-iotests/115 |  6 --
>  tests/qemu-iotests/116 |  6 --
>  tests/qemu-iotests/117 |  7 +--
>  tests/qemu-iotests/119 |  6 --
>  tests/qemu-iotests/120 |  6 --
>  tests/qemu-iotests/121 |  6 --
>  tests/qemu-iotests/122 |  7 ---
>  tests/qemu-iotests/123 |  7 ---
>  tests/qemu-iotests/125 |  6 --
>  test

Re: [Qemu-block] [Qemu-devel] [PATCH v3 1/5] qemu-iotests: set TEST_DIR to a unique dir for each test

2017-08-30 Thread John Snow



On 08/30/2017 12:52 PM, Jeff Cody wrote:
> Right now, all qemu-iotests output data into the same scratch directory,
> and so each test needs to be responsible for cleaning up its own files.
> 
> Have each test use 'scratch/$seq' as its temp directory, so the check
> script can do simple cleanup of removing the whole temporary directory.
> 
> Reviewed-by: Eric Blake 
> Signed-off-by: Jeff Cody 
> ---
>  tests/qemu-iotests/check | 21 +
>  1 file changed, 17 insertions(+), 4 deletions(-)
> 
> diff --git a/tests/qemu-iotests/check b/tests/qemu-iotests/check
> index d504b6e..f6ca85d 100755
> --- a/tests/qemu-iotests/check
> +++ b/tests/qemu-iotests/check
> @@ -243,6 +243,7 @@ seq="check"
>  
>  for seq in $list
>  do
> +TEST_DIR_SEQ=$TEST_DIR/$seq
>  err=false
>  printf %s "$seq"
>  if [ -n "$TESTS_REMAINING_LOG" ] ; then
> @@ -289,13 +290,23 @@ do
>  fi
>  export OUTPUT_DIR=$PWD
>  if $debug; then
> -(cd "$source_iotests";
> +(
> +export TEST_DIR=$TEST_DIR_SEQ
> +. "$source_iotests/common.config"
> +. "$source_iotests/common.rc"

What purpose do these serve?

> +cd "$source_iotests" &&
>  MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(($RANDOM % 255 + 1))} \
> -$run_command -d 2>&1 | tee $tmp.out)
> +$run_command -d 2>&1 | tee $tmp.out
> +)
>  else
> -(cd "$source_iotests";
> +(
> +export TEST_DIR=$TEST_DIR_SEQ
> +. "$source_iotests/common.config"
> +. "$source_iotests/common.rc"
> + cd "$source_iotests" &&
>  MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(($RANDOM % 255 + 1))} \
> -$run_command >$tmp.out 2>&1)
> +$run_command >$tmp.out 2>&1
> +)
>  fi
>  sts=$?
>  $timestamp && _timestamp
> @@ -359,6 +370,8 @@ do
>  fi
>  fi
>  
> +rm -rf "$TEST_DIR_SEQ"
> +
>  fi
>  
>  # come here for each test, except when $showme is true
> 

Seems OK to me, though I am not able to answer all doubts about exactly
how this may affect the strange pipe/subshell arrangements that occur
deeper in the bowels of the included files for launching QEMU and so
on. I suppose that might be related to the inclusion of those
common.XYZ files?

Tested-by: John Snow 
Reviewed-by: John Snow

[Qemu-block] [PATCH v2 13/17] MAINTAINERS: add missing AIO entry

2017-08-30 Thread Philippe Mathieu-Daudé

Signed-off-by: Philippe Mathieu-Daudé 
Acked-by: Fam Zheng 
Reviewed-by: Stefan Hajnoczi 
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 20d65dca73..a48f633cad 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1225,6 +1225,7 @@ F: util/aio-*.c
 F: block/io.c
 F: migration/block*
 F: include/block/aio.h
+F: scripts/qemugdb/aio.py
 T: git git://github.com/stefanha/qemu.git block
 
 Block Jobs
-- 
2.14.1

[Qemu-block] [PATCH v2 17/17] MAINTAINERS: update docs/interop/ entries

2017-08-30 Thread Philippe Mathieu-Daudé

moved in commit 7746cf8aab68

Signed-off-by: Philippe Mathieu-Daudé 
Acked-by: Fam Zheng 
Acked-by: John Snow 
---
 MAINTAINERS | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 452ccd71b4..833a7a6778 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1259,7 +1259,7 @@ F: block/dirty-bitmap.c
 F: include/qemu/hbitmap.h
 F: include/block/dirty-bitmap.h
 F: tests/test-hbitmap.c
-F: docs/bitmaps.md
+F: docs/interop/bitmaps.rst
 T: git git://github.com/famz/qemu.git bitmaps
 T: git git://github.com/jnsnow/qemu.git bitmaps
 
@@ -1828,7 +1828,7 @@ M: Denis V. Lunev 
 L: qemu-block@nongnu.org
 S: Supported
 F: block/parallels.c
-F: docs/specs/parallels.txt
+F: docs/interop/parallels.txt
 
 qed
 M: Stefan Hajnoczi 
-- 
2.14.1

[Qemu-block] [PATCH v2 12/17] MAINTAINERS: add missing megasas test entry

2017-08-30 Thread Philippe Mathieu-Daudé

Signed-off-by: Philippe Mathieu-Daudé 
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index fa74b7254b..20d65dca73 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1118,6 +1118,7 @@ L: qemu-block@nongnu.org
 S: Supported
 F: hw/scsi/megasas.c
 F: hw/scsi/mfi.h
+F: tests/megasas-test.c
 
 Network packet abstractions
 M: Dmitry Fleytman 
-- 
2.14.1

[Qemu-block] [PATCH v2 07/17] MAINTAINERS: add missing qcow2 entry

2017-08-30 Thread Philippe Mathieu-Daudé

Signed-off-by: Philippe Mathieu-Daudé 
Acked-by: Kevin Wolf 
Reviewed-by: Stefan Hajnoczi 
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index f4e07173c8..eb20365fbb 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1841,6 +1841,7 @@ M: Max Reitz 
 L: qemu-block@nongnu.org
 S: Supported
 F: block/qcow2*
+F: docs/interop/qcow2.txt
 
 qcow
 M: Kevin Wolf 
-- 
2.14.1

Re: [Qemu-block] [Qemu-devel] Persistent bitmaps for non-qcow2 formats

2017-08-30 Thread John Snow



On 08/30/2017 09:45 AM, Daniel P. Berrange wrote:
> On Wed, Aug 30, 2017 at 02:36:11PM +0100, Stefan Hajnoczi wrote:
>> On Tue, Aug 22, 2017 at 03:07:04PM -0400, John Snow wrote:
>>> (3) Add either a new flag that turns qcow2's backing file into a full
>>> R/W backing file, or add a new extension to qcow2 entirely (bypassing
>>> the traditional backing file mechanism to avoid confusion for older
>>> tooling) that adds a new read-write backing file field.
>>>
>>> This RW backing file field will be used for all reads AND writes; the
>>> qcow2 in question becomes a metadata container on top of the BDS chain.
>>> We can re-use Vladimir's bitmap persistence extension to save bitmaps in
>>> a qcow2 shell.
>>>
>>> The qcow2 becomes effectively a metadata cache for a new (essentially)
>>> filter node that handles features such as bitmaps. This could also be
>>> used to provide allocation map data for RAW files and other goodies down
>>> the road.
>>>
>>> Hopefully this achieves our desire to not create new formats AND our
>>> desire to concentrate features (and debugging, testing, etc) into qcow2,
>>> while allowing users to "have bitmaps with raw files."
>>>
>>> Of course, in this scenario, users now have two files: a qcow2 wrapper
>>> and the actual raw file in question; but regardless of how we were going
>>> to solve this, a raw file necessitates an external file of some sort,
>>> else we give up the idea that it was a raw file.
>>
>> There is some complexity here for management tools:
>>
>> If the underlying image is resized, who resizes the qcow2 and how do
>> they know to do it?
>>
>> If QEMU's resize/truncate command is used, does it first try to resize the
>> underlying image and then resize the qcow2?  This is probably the sanest
>> approach.
>>
>> If the underlying image is moved to a new location, does the qcow2 file
>> need to be modified and who does that?
>>
>> Management tools need to figure out how to represent and manage this extra
>> qcow2 file.  The easiest solution is to punt it to the user and treat it
>> as part of a backing file chain.  If the management tool wants to
>> automatically manage the qcow2 so the user just specifies the underlying
>> image and enables the persistent bitmap checkbox, then it becomes more
>> complicated.
> 
> Indeed, I don't think it is practical to have libvirt / QEMU automagically
> create a qcow2 overlay on disk. Something has to decide where this would
> be stored. You might say just put it alongside the raw file, but it might
> not be a local file at all, it could be an NBD or RBD raw "file". So do
> we create a local qcow2 file, or store a qcow2 file inside another RBD
> volume to hold the persistent bitmap? This kind of decision needs to be
> made by the mgmt app since only it knows about its storage mgmt model.

Oh, you mean a mgmt app like VMM or something even above libvirt,
yes? Who currently decides where snapshot files and the like
go: is it not libvirt, but the app using it?

> At this point you might as well just let the mgmt app take care of it
> all and not try to do anything magical with qcow2 overlays in libvirt/QEMU
> 

Might be the sanest, yes -- but say I don't offer an "automagic" way to
add a bitmap to a raw file in a running QEMU instance, but rather offer a
way to qcow2ify an existing node:

We could create a wrapper:

qemu-img create -f qcow2 -o wrapper wrapper.qcow2

Then add a node:

blockdev_add driver=qcow2 node-name=foo file.driver=file
file.filename=wrapper.qcow2

(The size of the drive could perhaps remain as zero temporarily here)

Then some magic to make it the active target of whatever blockbackend
was using the old image, perhaps?:

blockdev_magic target=virtioblk0 node-name=foo

At this point, virtioblk0 is now backed by our new magic qcow2 node
which provides bitmap features for whatever kind of backing storage
virtioblk0 had, and nothing automagic happened -- we passed explicit
file references.

Any magic that we wish to provide can happen in libvirt or above.

--js

Re: [Qemu-block] [Qemu-devel] Persistent bitmaps for non-qcow2 formats

2017-08-30 Thread John Snow

On 08/30/2017 08:58 AM, Yaniv Lavi (Dary) wrote:
> 
> 
> We had no reason to switch to anything else so far and I'm sure this
> option was not available when we started supporting raw format. 
>  
> 

Yeah, they don't exist yet...! I've looped you in to see if the proposal
being discussed would alleviate the need for bitmaps for "other formats"
if we can offer a "raw mode qcow2."

At the moment I am going to still try to add bitmaps to other formats
(through the use of a qcow2 wrapper) but it sounds like a raw-layout
qcow2 might provide some benefits too.

-js

[Qemu-block] [PATCH v6 15/18] qcow2: Switch load_bitmap_data() to byte-based iteration

2017-08-30 Thread Eric Blake

Now that we have adjusted the majority of the calls this function
makes to be byte-based, it is easier to read the code if it makes
passes over the image using bytes rather than sectors.
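
The shape of the converted loop can be illustrated outside of C. This is
only a sketch: `byte_ranges` is a hypothetical helper, `limit` plays the
role of bytes_covered_by_bitmap_cluster(), and the loop is bounded by the
bitmap size rather than by the table size, which is a simplification.

```python
def byte_ranges(bm_size, limit):
    """Yield (offset, count) pairs covering bm_size bytes in steps of limit."""
    offset = 0
    while offset < bm_size:
        # The final chunk may be shorter than limit.
        count = min(bm_size - offset, limit)
        yield offset, count
        offset += limit
```

Both values are already in bytes, so they can be passed to the
deserialize calls without any sector scaling.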

Signed-off-by: Eric Blake 
Reviewed-by: John Snow 
Reviewed-by: Vladimir Sementsov-Ogievskiy 

---
v5: no change
v4: new patch
---
 block/qcow2-bitmap.c | 22 --
 1 file changed, 8 insertions(+), 14 deletions(-)

diff --git a/block/qcow2-bitmap.c b/block/qcow2-bitmap.c
index c7c60dfca2..b807298484 100644
--- a/block/qcow2-bitmap.c
+++ b/block/qcow2-bitmap.c
@@ -291,9 +291,8 @@ static int load_bitmap_data(BlockDriverState *bs,
 {
 int ret = 0;
 BDRVQcow2State *s = bs->opaque;
-uint64_t sector, limit, sbc;
+uint64_t offset, limit;
 uint64_t bm_size = bdrv_dirty_bitmap_size(bitmap);
-uint64_t bm_sectors = DIV_ROUND_UP(bm_size, BDRV_SECTOR_SIZE);
 uint8_t *buf = NULL;
 uint64_t i, tab_size =
 size_to_clusters(s,
@@ -305,32 +304,27 @@ static int load_bitmap_data(BlockDriverState *bs,

 buf = g_malloc(s->cluster_size);
 limit = bytes_covered_by_bitmap_cluster(s, bitmap);
-sbc = limit >> BDRV_SECTOR_BITS;
-for (i = 0, sector = 0; i < tab_size; ++i, sector += sbc) {
-uint64_t count = MIN(bm_sectors - sector, sbc);
+for (i = 0, offset = 0; i < tab_size; ++i, offset += limit) {
+uint64_t count = MIN(bm_size - offset, limit);
 uint64_t entry = bitmap_table[i];
-uint64_t offset = entry & BME_TABLE_ENTRY_OFFSET_MASK;
+uint64_t data_offset = entry & BME_TABLE_ENTRY_OFFSET_MASK;

 assert(check_table_entry(entry, s->cluster_size) == 0);

-if (offset == 0) {
+if (data_offset == 0) {
 if (entry & BME_TABLE_ENTRY_FLAG_ALL_ONES) {
-bdrv_dirty_bitmap_deserialize_ones(bitmap,
-   sector * BDRV_SECTOR_SIZE,
-   count * BDRV_SECTOR_SIZE,
+bdrv_dirty_bitmap_deserialize_ones(bitmap, offset, count,
false);
 } else {
 /* No need to deserialize zeros because the dirty bitmap is
  * already cleared */
 }
 } else {
-ret = bdrv_pread(bs->file, offset, buf, s->cluster_size);
+ret = bdrv_pread(bs->file, data_offset, buf, s->cluster_size);
 if (ret < 0) {
 goto finish;
 }
-bdrv_dirty_bitmap_deserialize_part(bitmap, buf,
-   sector * BDRV_SECTOR_SIZE,
-   count * BDRV_SECTOR_SIZE,
+bdrv_dirty_bitmap_deserialize_part(bitmap, buf, offset, count,
false);
 }
 }
-- 
2.13.5

[Qemu-block] [PATCH v6 18/18] dirty-bitmap: Convert internal hbitmap size/granularity

2017-08-30 Thread Eric Blake

Now that all callers are using byte-based interfaces, there's no
reason for our internal hbitmap to remain with sector-based
granularity.  It also simplifies our internal scaling, since we
already know that hbitmap widens requests out to granularity
boundaries.

Signed-off-by: Eric Blake 
Reviewed-by: John Snow 

---
v6: no change
v5: fix bdrv_dirty_bitmap_truncate [John]
v4: rebase to earlier changes, include serialization, R-b dropped
v3: no change
v2: no change
---
 block/dirty-bitmap.c | 61 
 1 file changed, 18 insertions(+), 43 deletions(-)

diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index b54eed46e4..0b349f0b5a 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -38,7 +38,7 @@
  */
 struct BdrvDirtyBitmap {
 QemuMutex *mutex;
-HBitmap *bitmap;/* Dirty sector bitmap implementation */
+HBitmap *bitmap;/* Dirty bitmap implementation */
 HBitmap *meta;  /* Meta dirty bitmap */
 BdrvDirtyBitmap *successor; /* Anonymous child; implies frozen status */
 char *name; /* Optional non-empty unique ID */
@@ -130,12 +130,7 @@ BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState *bs,
 }
 bitmap = g_new0(BdrvDirtyBitmap, 1);
 bitmap->mutex = &bs->dirty_bitmap_mutex;
-/*
- * TODO - let hbitmap track full granularity. For now, it is tracking
- * only sector granularity, as a shortcut for our iterators.
- */
-bitmap->bitmap = hbitmap_alloc(DIV_ROUND_UP(bitmap_size, BDRV_SECTOR_SIZE),
-   ctz32(granularity) - BDRV_SECTOR_BITS);
+bitmap->bitmap = hbitmap_alloc(bitmap_size, ctz32(granularity));
 bitmap->size = bitmap_size;
 bitmap->name = g_strdup(name);
 bitmap->disabled = false;
@@ -314,7 +309,7 @@ void bdrv_dirty_bitmap_truncate(BlockDriverState *bs)
 QLIST_FOREACH(bitmap, &bs->dirty_bitmaps, list) {
 assert(!bdrv_dirty_bitmap_frozen(bitmap));
 assert(!bitmap->active_iterators);
-hbitmap_truncate(bitmap->bitmap, DIV_ROUND_UP(size, BDRV_SECTOR_SIZE));
+hbitmap_truncate(bitmap->bitmap, size);
 bitmap->size = size;
 }
 bdrv_dirty_bitmaps_unlock(bs);
@@ -444,7 +439,7 @@ bool bdrv_get_dirty_locked(BlockDriverState *bs, BdrvDirtyBitmap *bitmap,
int64_t offset)
 {
 if (bitmap) {
-return hbitmap_get(bitmap->bitmap, offset >> BDRV_SECTOR_BITS);
+return hbitmap_get(bitmap->bitmap, offset);
 } else {
 return false;
 }
@@ -472,7 +467,7 @@ uint32_t bdrv_get_default_bitmap_granularity(BlockDriverState *bs)

 uint32_t bdrv_dirty_bitmap_granularity(const BdrvDirtyBitmap *bitmap)
 {
-return BDRV_SECTOR_SIZE << hbitmap_granularity(bitmap->bitmap);
+return 1U << hbitmap_granularity(bitmap->bitmap);
 }

 BdrvDirtyBitmapIter *bdrv_dirty_iter_new(BdrvDirtyBitmap *bitmap)
@@ -505,19 +500,16 @@ void bdrv_dirty_iter_free(BdrvDirtyBitmapIter *iter)

 int64_t bdrv_dirty_iter_next(BdrvDirtyBitmapIter *iter)
 {
-return hbitmap_iter_next(&iter->hbi) * BDRV_SECTOR_SIZE;
+return hbitmap_iter_next(&iter->hbi);
 }

 /* Called within bdrv_dirty_bitmap_lock..unlock */
 void bdrv_set_dirty_bitmap_locked(BdrvDirtyBitmap *bitmap,
   int64_t offset, int64_t bytes)
 {
-int64_t end_sector = DIV_ROUND_UP(offset + bytes, BDRV_SECTOR_SIZE);
-
 assert(bdrv_dirty_bitmap_enabled(bitmap));
 assert(!bdrv_dirty_bitmap_readonly(bitmap));
-hbitmap_set(bitmap->bitmap, offset >> BDRV_SECTOR_BITS,
-end_sector - (offset >> BDRV_SECTOR_BITS));
+hbitmap_set(bitmap->bitmap, offset, bytes);
 }

 void bdrv_set_dirty_bitmap(BdrvDirtyBitmap *bitmap,
@@ -532,12 +524,9 @@ void bdrv_set_dirty_bitmap(BdrvDirtyBitmap *bitmap,
 void bdrv_reset_dirty_bitmap_locked(BdrvDirtyBitmap *bitmap,
 int64_t offset, int64_t bytes)
 {
-int64_t end_sector = DIV_ROUND_UP(offset + bytes, BDRV_SECTOR_SIZE);
-
 assert(bdrv_dirty_bitmap_enabled(bitmap));
 assert(!bdrv_dirty_bitmap_readonly(bitmap));
-hbitmap_reset(bitmap->bitmap, offset >> BDRV_SECTOR_BITS,
-  end_sector - (offset >> BDRV_SECTOR_BITS));
+hbitmap_reset(bitmap->bitmap, offset, bytes);
 }

 void bdrv_reset_dirty_bitmap(BdrvDirtyBitmap *bitmap,
@@ -557,8 +546,7 @@ void bdrv_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap, HBitmap **out)
 hbitmap_reset_all(bitmap->bitmap);
 } else {
 HBitmap *backup = bitmap->bitmap;
-bitmap->bitmap = hbitmap_alloc(DIV_ROUND_UP(bitmap->size,
-BDRV_SECTOR_SIZE),
+bitmap->bitmap = hbitmap_alloc(bitmap->size,
hbitmap_granularity(backup));
 *out = backup;
 }
@@ -577,51 +565,40 @@ void bdrv_undo_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap, HBitmap *in)
 uint64_t bdrv_dirty_bit

[Qemu-block] [PATCH v6 13/18] mirror: Switch mirror_dirty_init() to byte-based iteration

2017-08-30 Thread Eric Blake

Now that we have adjusted the majority of the calls this function
makes to be byte-based, it is easier to read the code if it makes
passes over the image using bytes rather than sectors.
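
As a rough illustration of the byte-based loop described above, the following is a minimal standalone sketch, not the QEMU code itself. The names `align_down`, `next_chunk_bytes` and `count_chunks` are hypothetical; the sketch assumes each step must fit in an `int` and should stay a multiple of the granularity, which is what the `MIN(..., QEMU_ALIGN_DOWN(INT_MAX, s->granularity))` expression in the patch suggests.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper: round n down to a multiple of m (m > 0). */
static int64_t align_down(int64_t n, int64_t m)
{
    return n / m * m;
}

/*
 * Size in bytes of the next step when walking [offset, length).
 * Every step fits in an int and, except possibly the last one,
 * is a multiple of the granularity.
 */
static int next_chunk_bytes(int64_t length, int64_t offset,
                            int64_t granularity)
{
    int64_t limit, remaining;

    assert(granularity > 0 && granularity <= INT_MAX);
    assert(offset >= 0 && offset < length);
    limit = align_down(INT_MAX, granularity);
    remaining = length - offset;
    return (int)(remaining < limit ? remaining : limit);
}

/* Count how many steps one full pass over the image takes. */
static int64_t count_chunks(int64_t length, int64_t granularity)
{
    int64_t offset = 0, chunks = 0;

    while (offset < length) {
        offset += next_chunk_bytes(length, offset, granularity);
        chunks++;
    }
    return chunks;
}
```

With a 4 GiB image and a 64 KiB granularity, this walk takes three steps: two of 2147418112 bytes and a final one of 131072 bytes.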

Signed-off-by: Eric Blake 
Reviewed-by: John Snow 

---
v6: no change
v5: rebase to earlier changes
v2-v4: no change
---
 block/mirror.c | 38 ++
 1 file changed, 14 insertions(+), 24 deletions(-)

diff --git a/block/mirror.c b/block/mirror.c
index 18172a56a3..87d9857475 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -613,15 +613,13 @@ static void mirror_throttle(MirrorBlockJob *s)

 static int coroutine_fn mirror_dirty_init(MirrorBlockJob *s)
 {
-int64_t sector_num, end;
+int64_t offset;
 BlockDriverState *base = s->base;
 BlockDriverState *bs = s->source;
 BlockDriverState *target_bs = blk_bs(s->target);
-int ret, n;
+int ret;
 int64_t count;

-end = s->bdev_length / BDRV_SECTOR_SIZE;
-
 if (base == NULL && !bdrv_has_zero_init(target_bs)) {
 if (!bdrv_can_write_zeroes_with_unmap(target_bs)) {
 bdrv_set_dirty_bitmap(s->dirty_bitmap, 0, s->bdev_length);
@@ -629,9 +627,9 @@ static int coroutine_fn mirror_dirty_init(MirrorBlockJob *s)
 }

 s->initial_zeroing_ongoing = true;
-for (sector_num = 0; sector_num < end; ) {
-int nb_sectors = MIN(end - sector_num,
-QEMU_ALIGN_DOWN(INT_MAX, s->granularity) >> BDRV_SECTOR_BITS);
+for (offset = 0; offset < s->bdev_length; ) {
+int bytes = MIN(s->bdev_length - offset,
+QEMU_ALIGN_DOWN(INT_MAX, s->granularity));

 mirror_throttle(s);

@@ -647,9 +645,8 @@ static int coroutine_fn mirror_dirty_init(MirrorBlockJob *s)
 continue;
 }

-mirror_do_zero_or_discard(s, sector_num * BDRV_SECTOR_SIZE,
-  nb_sectors * BDRV_SECTOR_SIZE, false);
-sector_num += nb_sectors;
+mirror_do_zero_or_discard(s, offset, bytes, false);
+offset += bytes;
 }

 mirror_wait_for_all_io(s);
@@ -657,10 +654,10 @@ static int coroutine_fn mirror_dirty_init(MirrorBlockJob *s)
 }

 /* First part, loop on the sectors and initialize the dirty bitmap.  */
-for (sector_num = 0; sector_num < end; ) {
+for (offset = 0; offset < s->bdev_length; ) {
 /* Just to make sure we are not exceeding int limit. */
-int nb_sectors = MIN(INT_MAX >> BDRV_SECTOR_BITS,
- end - sector_num);
+int bytes = MIN(s->bdev_length - offset,
+QEMU_ALIGN_DOWN(INT_MAX, s->granularity));

 mirror_throttle(s);

@@ -668,23 +665,16 @@ static int coroutine_fn mirror_dirty_init(MirrorBlockJob *s)
 return 0;
 }

-ret = bdrv_is_allocated_above(bs, base, sector_num * BDRV_SECTOR_SIZE,
-  nb_sectors * BDRV_SECTOR_SIZE, &count);
+ret = bdrv_is_allocated_above(bs, base, offset, bytes, &count);
 if (ret < 0) {
 return ret;
 }

-/* TODO: Relax this once bdrv_is_allocated_above and dirty
- * bitmaps no longer require sector alignment. */
-assert(QEMU_IS_ALIGNED(count, BDRV_SECTOR_SIZE));
-n = count >> BDRV_SECTOR_BITS;
-assert(n > 0);
+assert(count);
 if (ret == 1) {
-bdrv_set_dirty_bitmap(s->dirty_bitmap,
-  sector_num * BDRV_SECTOR_SIZE,
-  n * BDRV_SECTOR_SIZE);
+bdrv_set_dirty_bitmap(s->dirty_bitmap, offset, count);
 }
-sector_num += n;
+offset += count;
 }
 return 0;
 }
-- 
2.13.5

[Qemu-block] [PATCH v6 12/18] dirty-bitmap: Change bdrv_[re]set_dirty_bitmap() to use bytes

2017-08-30 Thread Eric Blake

Some of the callers were already scaling bytes to sectors; others
can be easily converted to pass byte offsets, all in our shift
towards a consistent byte interface everywhere.  Making the change
will also make it easier to convert the hold-out callers to use bytes
rather than sectors for their iterations; it also makes it easier
for a future dirty-bitmap patch to offload scaling over to the
internal hbitmap.  Although all callers happen to pass
sector-aligned values, make the internal scaling robust to any
sub-sector requests.
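
The rounding that makes the scaling robust to sub-sector requests can be sketched in isolation. This is only an illustration under stated assumptions: `SECTOR_BITS`, `SECTOR_SIZE`, `SectorRange` and `bytes_to_sectors` are hypothetical stand-ins that mirror the `offset >> BDRV_SECTOR_BITS` and `DIV_ROUND_UP(offset + bytes, BDRV_SECTOR_SIZE)` expressions in the diff, with a 512-byte sector assumed.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_BITS 9
#define SECTOR_SIZE (INT64_C(1) << SECTOR_BITS)  /* 512 bytes, assumed */

/* Sector range covering the byte range [offset, offset + bytes). */
typedef struct {
    int64_t first;  /* first sector touched */
    int64_t count;  /* number of sectors touched */
} SectorRange;

static SectorRange bytes_to_sectors(int64_t offset, int64_t bytes)
{
    SectorRange r;
    int64_t end_sector;

    assert(offset >= 0 && bytes >= 0);
    /* Round the end up so a partial trailing sector is included. */
    end_sector = (offset + bytes + SECTOR_SIZE - 1) / SECTOR_SIZE;
    /* Round the start down so a partial leading sector is included. */
    r.first = offset >> SECTOR_BITS;
    r.count = end_sector - r.first;
    return r;
}
```

A 24-byte request at offset 500 straddles a sector boundary, so it maps to two sectors starting at sector 0, while a sector-aligned request maps exactly as before.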

Signed-off-by: Eric Blake 
Reviewed-by: John Snow 

---
v5: only context change
v4: only context change, due to rebasing to persistent bitmaps
v3: rebase to addition of _locked interfaces; complex enough that I
dropped R-b
v2: no change
---
 include/block/dirty-bitmap.h |  8 
 block/dirty-bitmap.c | 22 ++
 block/mirror.c   | 16 
 migration/block.c|  7 +--
 4 files changed, 31 insertions(+), 22 deletions(-)

diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h
index 1dddcd320b..0341a605d7 100644
--- a/include/block/dirty-bitmap.h
+++ b/include/block/dirty-bitmap.h
@@ -40,9 +40,9 @@ const char *bdrv_dirty_bitmap_name(const BdrvDirtyBitmap *bitmap);
 int64_t bdrv_dirty_bitmap_size(const BdrvDirtyBitmap *bitmap);
 DirtyBitmapStatus bdrv_dirty_bitmap_status(BdrvDirtyBitmap *bitmap);
 void bdrv_set_dirty_bitmap(BdrvDirtyBitmap *bitmap,
-   int64_t cur_sector, int64_t nr_sectors);
+   int64_t offset, int64_t bytes);
 void bdrv_reset_dirty_bitmap(BdrvDirtyBitmap *bitmap,
- int64_t cur_sector, int64_t nr_sectors);
+ int64_t offset, int64_t bytes);
 BdrvDirtyBitmapIter *bdrv_dirty_meta_iter_new(BdrvDirtyBitmap *bitmap);
 BdrvDirtyBitmapIter *bdrv_dirty_iter_new(BdrvDirtyBitmap *bitmap);
 void bdrv_dirty_iter_free(BdrvDirtyBitmapIter *iter);
@@ -75,9 +75,9 @@ void bdrv_dirty_bitmap_unlock(BdrvDirtyBitmap *bitmap);
 bool bdrv_get_dirty_locked(BlockDriverState *bs, BdrvDirtyBitmap *bitmap,
int64_t offset);
 void bdrv_set_dirty_bitmap_locked(BdrvDirtyBitmap *bitmap,
-  int64_t cur_sector, int64_t nr_sectors);
+  int64_t offset, int64_t bytes);
 void bdrv_reset_dirty_bitmap_locked(BdrvDirtyBitmap *bitmap,
-int64_t cur_sector, int64_t nr_sectors);
+int64_t offset, int64_t bytes);
 int64_t bdrv_dirty_iter_next(BdrvDirtyBitmapIter *iter);
 void bdrv_set_dirty_iter(BdrvDirtyBitmapIter *hbi, int64_t offset);
 int64_t bdrv_get_dirty_count(BdrvDirtyBitmap *bitmap);
diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index 8b3c0221c6..9821225523 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -510,35 +510,41 @@ int64_t bdrv_dirty_iter_next(BdrvDirtyBitmapIter *iter)

 /* Called within bdrv_dirty_bitmap_lock..unlock */
 void bdrv_set_dirty_bitmap_locked(BdrvDirtyBitmap *bitmap,
-  int64_t cur_sector, int64_t nr_sectors)
+  int64_t offset, int64_t bytes)
 {
+int64_t end_sector = DIV_ROUND_UP(offset + bytes, BDRV_SECTOR_SIZE);
+
 assert(bdrv_dirty_bitmap_enabled(bitmap));
 assert(!bdrv_dirty_bitmap_readonly(bitmap));
-hbitmap_set(bitmap->bitmap, cur_sector, nr_sectors);
+hbitmap_set(bitmap->bitmap, offset >> BDRV_SECTOR_BITS,
+end_sector - (offset >> BDRV_SECTOR_BITS));
 }

 void bdrv_set_dirty_bitmap(BdrvDirtyBitmap *bitmap,
-   int64_t cur_sector, int64_t nr_sectors)
+   int64_t offset, int64_t bytes)
 {
 bdrv_dirty_bitmap_lock(bitmap);
-bdrv_set_dirty_bitmap_locked(bitmap, cur_sector, nr_sectors);
+bdrv_set_dirty_bitmap_locked(bitmap, offset, bytes);
 bdrv_dirty_bitmap_unlock(bitmap);
 }

 /* Called within bdrv_dirty_bitmap_lock..unlock */
 void bdrv_reset_dirty_bitmap_locked(BdrvDirtyBitmap *bitmap,
-int64_t cur_sector, int64_t nr_sectors)
+int64_t offset, int64_t bytes)
 {
+int64_t end_sector = DIV_ROUND_UP(offset + bytes, BDRV_SECTOR_SIZE);
+
 assert(bdrv_dirty_bitmap_enabled(bitmap));
 assert(!bdrv_dirty_bitmap_readonly(bitmap));
-hbitmap_reset(bitmap->bitmap, cur_sector, nr_sectors);
+hbitmap_reset(bitmap->bitmap, offset >> BDRV_SECTOR_BITS,
+  end_sector - (offset >> BDRV_SECTOR_BITS));
 }

 void bdrv_reset_dirty_bitmap(BdrvDirtyBitmap *bitmap,
- int64_t cur_sector, int64_t nr_sectors)
+ int64_t offset, int64_t bytes)
 {
 bdrv_dirty_bitmap_lock(bitmap);
-bdrv_reset_dirty_bitmap_locked(bitmap, cur_sector, nr_sectors);
+bdrv_reset_dirty_bitmap_locked(bitmap, offset, bytes);
 bdrv_dir

[Qemu-block] [PATCH v6 17/18] dirty-bitmap: Switch bdrv_set_dirty() to bytes

2017-08-30 Thread Eric Blake

Both callers already had bytes available, but were scaling to
sectors.  Move the scaling to internal code.  In the case of
bdrv_aligned_pwritev(), we are now passing the exact offset
rather than a rounded sector-aligned value, but that's okay
as long as the dirty bitmap widens start/bytes to granularity
boundaries.
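
To make the widening concrete, here is a small sketch of what "widening start/bytes to granularity boundaries" could look like. It is an assumption-laden illustration rather than the hbitmap implementation: `widened_start` and `widened_bytes` are hypothetical names, and the granularity is taken as a plain byte count.

```c
#include <assert.h>
#include <stdint.h>

/* Start of the granularity chunk that contains offset. */
static int64_t widened_start(int64_t offset, int64_t granularity)
{
    assert(offset >= 0 && granularity > 0);
    return offset / granularity * granularity;
}

/*
 * Length of the smallest granularity-aligned range that covers
 * [offset, offset + bytes).
 */
static int64_t widened_bytes(int64_t offset, int64_t bytes,
                             int64_t granularity)
{
    int64_t start = widened_start(offset, granularity);
    int64_t end;

    assert(bytes >= 0);
    /* Round the end up to the next granularity boundary. */
    end = (offset + bytes + granularity - 1) / granularity * granularity;
    return end - start;
}
```

Under this model, a 10-byte write at offset 4097 with a 4096-byte granularity dirties the single chunk [4096, 8192), and a 10-byte write at offset 4090 dirties two chunks because it crosses a boundary.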

Signed-off-by: Eric Blake 
Reviewed-by: John Snow 

---
v4: only context changes
v3: rebase to lock context changes, R-b kept
v2: no change
---
 include/block/block_int.h | 2 +-
 block/io.c| 6 ++
 block/dirty-bitmap.c  | 7 ---
 3 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/include/block/block_int.h b/include/block/block_int.h
index 7571c0aaaf..4d01dc3fa6 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -969,7 +969,7 @@ void blk_dev_eject_request(BlockBackend *blk, bool force);
 bool blk_dev_is_tray_open(BlockBackend *blk);
 bool blk_dev_is_medium_locked(BlockBackend *blk);

-void bdrv_set_dirty(BlockDriverState *bs, int64_t cur_sector, int64_t nr_sect);
+void bdrv_set_dirty(BlockDriverState *bs, int64_t offset, int64_t bytes);
 bool bdrv_requests_pending(BlockDriverState *bs);

 void bdrv_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap, HBitmap **out);
diff --git a/block/io.c b/block/io.c
index 26003814eb..926beebc8f 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1334,7 +1334,6 @@ static int coroutine_fn bdrv_aligned_pwritev(BdrvChild *child,
 bool waited;
 int ret;

-int64_t start_sector = offset >> BDRV_SECTOR_BITS;
 int64_t end_sector = DIV_ROUND_UP(offset + bytes, BDRV_SECTOR_SIZE);
 uint64_t bytes_remaining = bytes;
 int max_transfer;
@@ -1409,7 +1408,7 @@ static int coroutine_fn bdrv_aligned_pwritev(BdrvChild *child,
 bdrv_debug_event(bs, BLKDBG_PWRITEV_DONE);

 atomic_inc(&bs->write_gen);
-bdrv_set_dirty(bs, start_sector, end_sector - start_sector);
+bdrv_set_dirty(bs, offset, bytes);

 stat64_max(&bs->wr_highest_offset, offset + bytes);

@@ -2412,8 +2411,7 @@ int coroutine_fn bdrv_co_pdiscard(BlockDriverState *bs, int64_t offset,
 ret = 0;
 out:
 atomic_inc(&bs->write_gen);
-bdrv_set_dirty(bs, req.offset >> BDRV_SECTOR_BITS,
-   req.bytes >> BDRV_SECTOR_BITS);
+bdrv_set_dirty(bs, req.offset, req.bytes);
 tracked_request_end(&req);
 bdrv_dec_in_flight(bs);
 return ret;
diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index 9821225523..b54eed46e4 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -629,10 +629,10 @@ void bdrv_dirty_bitmap_deserialize_finish(BdrvDirtyBitmap *bitmap)
 hbitmap_deserialize_finish(bitmap->bitmap);
 }

-void bdrv_set_dirty(BlockDriverState *bs, int64_t cur_sector,
-int64_t nr_sectors)
+void bdrv_set_dirty(BlockDriverState *bs, int64_t offset, int64_t bytes)
 {
 BdrvDirtyBitmap *bitmap;
+int64_t end_sector = DIV_ROUND_UP(offset + bytes, BDRV_SECTOR_SIZE);

 if (QLIST_EMPTY(&bs->dirty_bitmaps)) {
 return;
@@ -644,7 +644,8 @@ void bdrv_set_dirty(BlockDriverState *bs, int64_t cur_sector,
 continue;
 }
 assert(!bdrv_dirty_bitmap_readonly(bitmap));
-hbitmap_set(bitmap->bitmap, cur_sector, nr_sectors);
+hbitmap_set(bitmap->bitmap, offset >> BDRV_SECTOR_BITS,
+end_sector - (offset >> BDRV_SECTOR_BITS));
 }
 bdrv_dirty_bitmaps_unlock(bs);
 }
-- 
2.13.5

[Qemu-block] [PATCH v6 11/18] dirty-bitmap: Change bdrv_get_dirty_locked() to take bytes

2017-08-30 Thread Eric Blake

Half the callers were already scaling bytes to sectors; the other
half can eventually be simplified to use byte iteration.  Both
callers were already using the result as a bool, so make that
explicit.  Making the change also makes it easier for a future
dirty-bitmap patch to offload scaling over to the internal hbitmap.

Remember, asking whether a byte is dirty is effectively asking
whether the entire granularity containing the byte is dirty, since
we only track dirtiness by granularity.
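
The point about granularity can be shown with a toy bitmap. This is a deliberately tiny model, not QEMU's hbitmap: `ToyBitmap`, `toy_set_dirty` and `toy_get_dirty` are hypothetical, and the model only covers 64 chunks of `1 << gran_bits` bytes each.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy dirty bitmap: one bit per chunk of (1 << gran_bits) bytes. */
typedef struct {
    uint64_t bits;       /* bit i set means chunk i is dirty */
    unsigned gran_bits;  /* log2 of the granularity in bytes */
} ToyBitmap;

/* Mark the chunk containing the given byte offset as dirty. */
static void toy_set_dirty(ToyBitmap *bm, int64_t offset)
{
    assert(offset >= 0 && (offset >> bm->gran_bits) < 64);
    bm->bits |= UINT64_C(1) << (offset >> bm->gran_bits);
}

/* A byte is reported dirty if the whole chunk containing it is dirty. */
static bool toy_get_dirty(const ToyBitmap *bm, int64_t offset)
{
    assert(offset >= 0 && (offset >> bm->gran_bits) < 64);
    return (bm->bits >> (offset >> bm->gran_bits)) & 1;
}
```

With a 64 KiB granularity, dirtying byte 70000 makes every offset from 65536 through 131071 report dirty, while offset 131072 in the next chunk stays clean.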

Signed-off-by: Eric Blake 
Reviewed-by: John Snow 
Reviewed-by: Juan Quintela 

---
v4: only context change
v3: rebase to _locked rename was straightforward enough that R-b kept
v2: tweak commit message, no code change
---
 include/block/dirty-bitmap.h | 4 ++--
 block/dirty-bitmap.c | 8 
 block/mirror.c   | 3 +--
 migration/block.c| 3 ++-
 4 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h
index ece28e1edc..1dddcd320b 100644
--- a/include/block/dirty-bitmap.h
+++ b/include/block/dirty-bitmap.h
@@ -72,8 +72,8 @@ void bdrv_dirty_bitmap_set_persistance(BdrvDirtyBitmap *bitmap,
 /* Functions that require manual locking.  */
 void bdrv_dirty_bitmap_lock(BdrvDirtyBitmap *bitmap);
 void bdrv_dirty_bitmap_unlock(BdrvDirtyBitmap *bitmap);
-int bdrv_get_dirty_locked(BlockDriverState *bs, BdrvDirtyBitmap *bitmap,
-  int64_t sector);
+bool bdrv_get_dirty_locked(BlockDriverState *bs, BdrvDirtyBitmap *bitmap,
+   int64_t offset);
 void bdrv_set_dirty_bitmap_locked(BdrvDirtyBitmap *bitmap,
   int64_t cur_sector, int64_t nr_sectors);
 void bdrv_reset_dirty_bitmap_locked(BdrvDirtyBitmap *bitmap,
diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index f983d99def..8b3c0221c6 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -440,13 +440,13 @@ BlockDirtyInfoList *bdrv_query_dirty_bitmaps(BlockDriverState *bs)
 }

 /* Called within bdrv_dirty_bitmap_lock..unlock */
-int bdrv_get_dirty_locked(BlockDriverState *bs, BdrvDirtyBitmap *bitmap,
-  int64_t sector)
+bool bdrv_get_dirty_locked(BlockDriverState *bs, BdrvDirtyBitmap *bitmap,
+   int64_t offset)
 {
 if (bitmap) {
-return hbitmap_get(bitmap->bitmap, sector);
+return hbitmap_get(bitmap->bitmap, offset >> BDRV_SECTOR_BITS);
 } else {
-return 0;
+return false;
 }
 }

diff --git a/block/mirror.c b/block/mirror.c
index cc47e21814..42888ebd04 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -362,8 +362,7 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
 int64_t next_offset = offset + nb_chunks * s->granularity;
 int64_t next_chunk = next_offset / s->granularity;
 if (next_offset >= s->bdev_length ||
-!bdrv_get_dirty_locked(source, s->dirty_bitmap,
-   next_offset >> BDRV_SECTOR_BITS)) {
+!bdrv_get_dirty_locked(source, s->dirty_bitmap, next_offset)) {
 break;
 }
 if (test_bit(next_chunk, s->in_flight_bitmap)) {
diff --git a/migration/block.c b/migration/block.c
index a3512945da..b618869661 100644
--- a/migration/block.c
+++ b/migration/block.c
@@ -530,7 +530,8 @@ static int mig_save_device_dirty(QEMUFile *f, BlkMigDevState *bmds,
 blk_mig_unlock();
 }
 bdrv_dirty_bitmap_lock(bmds->dirty_bitmap);
-if (bdrv_get_dirty_locked(bs, bmds->dirty_bitmap, sector)) {
+if (bdrv_get_dirty_locked(bs, bmds->dirty_bitmap,
+  sector * BDRV_SECTOR_SIZE)) {
 if (total_sectors - sector < BDRV_SECTORS_PER_DIRTY_CHUNK) {
 nr_sectors = total_sectors - sector;
 } else {
-- 
2.13.5

[Qemu-block] [PATCH v6 10/18] dirty-bitmap: Change bdrv_get_dirty_count() to report bytes

2017-08-30 Thread Eric Blake

Thanks to recent cleanups, all callers were scaling a return value
of sectors into bytes; do the scaling internally instead.

Signed-off-by: Eric Blake 
Reviewed-by: John Snow 
Reviewed-by: Juan Quintela 

---
v4: no change
v3: no change, add R-b
v2: no change
---
 block/dirty-bitmap.c |  4 ++--
 block/mirror.c   | 13 +
 migration/block.c|  2 +-
 3 files changed, 8 insertions(+), 11 deletions(-)

diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index 20f230867d..f983d99def 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -425,7 +425,7 @@ BlockDirtyInfoList *bdrv_query_dirty_bitmaps(BlockDriverState *bs)
 QLIST_FOREACH(bm, &bs->dirty_bitmaps, list) {
 BlockDirtyInfo *info = g_new0(BlockDirtyInfo, 1);
 BlockDirtyInfoList *entry = g_new0(BlockDirtyInfoList, 1);
-info->count = bdrv_get_dirty_count(bm) << BDRV_SECTOR_BITS;
+info->count = bdrv_get_dirty_count(bm);
 info->granularity = bdrv_dirty_bitmap_granularity(bm);
 info->has_name = !!bm->name;
 info->name = g_strdup(bm->name);
@@ -653,7 +653,7 @@ void bdrv_set_dirty_iter(BdrvDirtyBitmapIter *iter, int64_t offset)

 int64_t bdrv_get_dirty_count(BdrvDirtyBitmap *bitmap)
 {
-return hbitmap_count(bitmap->bitmap);
+return hbitmap_count(bitmap->bitmap) << BDRV_SECTOR_BITS;
 }

 int64_t bdrv_get_meta_dirty_count(BdrvDirtyBitmap *bitmap)
diff --git a/block/mirror.c b/block/mirror.c
index af13f5d658..cc47e21814 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -811,11 +811,10 @@ static void coroutine_fn mirror_run(void *opaque)

 cnt = bdrv_get_dirty_count(s->dirty_bitmap);
 /* s->common.offset contains the number of bytes already processed so
- * far, cnt is the number of dirty sectors remaining and
+ * far, cnt is the number of dirty bytes remaining and
  * s->bytes_in_flight is the number of bytes currently being
  * processed; together those are the current total operation length */
-s->common.len = s->common.offset + s->bytes_in_flight +
-cnt * BDRV_SECTOR_SIZE;
+s->common.len = s->common.offset + s->bytes_in_flight + cnt;

 /* Note that even when no rate limit is applied we need to yield
  * periodically with no pending I/O so that bdrv_drain_all() returns.
@@ -827,8 +826,7 @@ static void coroutine_fn mirror_run(void *opaque)
 s->common.iostatus == BLOCK_DEVICE_IO_STATUS_OK) {
 if (s->in_flight >= MAX_IN_FLIGHT || s->buf_free_count == 0 ||
 (cnt == 0 && s->in_flight > 0)) {
-trace_mirror_yield(s, cnt * BDRV_SECTOR_SIZE,
-   s->buf_free_count, s->in_flight);
+trace_mirror_yield(s, cnt, s->buf_free_count, s->in_flight);
 mirror_wait_for_io(s);
 continue;
 } else if (cnt != 0) {
@@ -869,7 +867,7 @@ static void coroutine_fn mirror_run(void *opaque)
  * whether to switch to target check one last time if I/O has
  * come in the meanwhile, and if not flush the data to disk.
  */
-trace_mirror_before_drain(s, cnt * BDRV_SECTOR_SIZE);
+trace_mirror_before_drain(s, cnt);

 bdrv_drained_begin(bs);
 cnt = bdrv_get_dirty_count(s->dirty_bitmap);
@@ -888,8 +886,7 @@ static void coroutine_fn mirror_run(void *opaque)
 }

 ret = 0;
-trace_mirror_before_sleep(s, cnt * BDRV_SECTOR_SIZE,
-  s->synced, delay_ns);
+trace_mirror_before_sleep(s, cnt, s->synced, delay_ns);
 if (!s->synced) {
 block_job_sleep_ns(&s->common, QEMU_CLOCK_REALTIME, delay_ns);
 if (block_job_is_cancelled(&s->common)) {
diff --git a/migration/block.c b/migration/block.c
index 9171f60028..a3512945da 100644
--- a/migration/block.c
+++ b/migration/block.c
@@ -667,7 +667,7 @@ static int64_t get_remaining_dirty(void)
 aio_context_release(blk_get_aio_context(bmds->blk));
 }

-return dirty << BDRV_SECTOR_BITS;
+return dirty;
 }


-- 
2.13.5

[Qemu-block] [PATCH v6 08/18] dirty-bitmap: Set iterator start by offset, not sector

2017-08-30 Thread Eric Blake

All callers of bdrv_dirty_iter_new() passed 0 for their initial
starting point, so drop that parameter.

Most callers to bdrv_set_dirty_iter() were scaling a byte offset to
a sector number; the exception qcow2-bitmap will be converted later
to use byte rather than sector iteration.  Move the scaling to occur
internally to dirty bitmap code instead, so that callers now pass
in bytes.

Signed-off-by: Eric Blake 
Reviewed-by: John Snow 

---
v5: no change
v4: rebase to persistent bitmaps
v3: no change
v2: no change
---
 include/block/dirty-bitmap.h | 5 ++---
 block/backup.c   | 5 ++---
 block/dirty-bitmap.c | 9 -
 block/mirror.c   | 4 ++--
 block/qcow2-bitmap.c | 4 ++--
 5 files changed, 12 insertions(+), 15 deletions(-)

diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h
index f4ccd3f33f..ece28e1edc 100644
--- a/include/block/dirty-bitmap.h
+++ b/include/block/dirty-bitmap.h
@@ -44,8 +44,7 @@ void bdrv_set_dirty_bitmap(BdrvDirtyBitmap *bitmap,
 void bdrv_reset_dirty_bitmap(BdrvDirtyBitmap *bitmap,
  int64_t cur_sector, int64_t nr_sectors);
 BdrvDirtyBitmapIter *bdrv_dirty_meta_iter_new(BdrvDirtyBitmap *bitmap);
-BdrvDirtyBitmapIter *bdrv_dirty_iter_new(BdrvDirtyBitmap *bitmap,
- uint64_t first_sector);
+BdrvDirtyBitmapIter *bdrv_dirty_iter_new(BdrvDirtyBitmap *bitmap);
 void bdrv_dirty_iter_free(BdrvDirtyBitmapIter *iter);

 uint64_t bdrv_dirty_bitmap_serialization_size(const BdrvDirtyBitmap *bitmap,
@@ -80,7 +79,7 @@ void bdrv_set_dirty_bitmap_locked(BdrvDirtyBitmap *bitmap,
 void bdrv_reset_dirty_bitmap_locked(BdrvDirtyBitmap *bitmap,
 int64_t cur_sector, int64_t nr_sectors);
 int64_t bdrv_dirty_iter_next(BdrvDirtyBitmapIter *iter);
-void bdrv_set_dirty_iter(BdrvDirtyBitmapIter *hbi, int64_t sector_num);
+void bdrv_set_dirty_iter(BdrvDirtyBitmapIter *hbi, int64_t offset);
 int64_t bdrv_get_dirty_count(BdrvDirtyBitmap *bitmap);
 int64_t bdrv_get_meta_dirty_count(BdrvDirtyBitmap *bitmap);
 void bdrv_dirty_bitmap_truncate(BlockDriverState *bs);
diff --git a/block/backup.c b/block/backup.c
index 504a089150..1a6ec7d079 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -372,7 +372,7 @@ static int coroutine_fn backup_run_incremental(BackupBlockJob *job)

 granularity = bdrv_dirty_bitmap_granularity(job->sync_bitmap);
 clusters_per_iter = MAX((granularity / job->cluster_size), 1);
-dbi = bdrv_dirty_iter_new(job->sync_bitmap, 0);
+dbi = bdrv_dirty_iter_new(job->sync_bitmap);

 /* Find the next dirty sector(s) */
 while ((offset = bdrv_dirty_iter_next(dbi) * BDRV_SECTOR_SIZE) >= 0) {
@@ -403,8 +403,7 @@ static int coroutine_fn backup_run_incremental(BackupBlockJob *job)
 /* If the bitmap granularity is smaller than the backup granularity,
  * we need to advance the iterator pointer to the next cluster. */
 if (granularity < job->cluster_size) {
-bdrv_set_dirty_iter(dbi,
-cluster * job->cluster_size / BDRV_SECTOR_SIZE);
+bdrv_set_dirty_iter(dbi, cluster * job->cluster_size);
 }

 last_cluster = cluster - 1;
diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index 034effc8cd..c091f0c7ba 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -475,11 +475,10 @@ uint32_t bdrv_dirty_bitmap_granularity(const BdrvDirtyBitmap *bitmap)
 return BDRV_SECTOR_SIZE << hbitmap_granularity(bitmap->bitmap);
 }

-BdrvDirtyBitmapIter *bdrv_dirty_iter_new(BdrvDirtyBitmap *bitmap,
- uint64_t first_sector)
+BdrvDirtyBitmapIter *bdrv_dirty_iter_new(BdrvDirtyBitmap *bitmap)
 {
 BdrvDirtyBitmapIter *iter = g_new(BdrvDirtyBitmapIter, 1);
-hbitmap_iter_init(&iter->hbi, bitmap->bitmap, first_sector);
+hbitmap_iter_init(&iter->hbi, bitmap->bitmap, 0);
 iter->bitmap = bitmap;
 bitmap->active_iterators++;
 return iter;
@@ -647,9 +646,9 @@ void bdrv_set_dirty(BlockDriverState *bs, int64_t cur_sector,
 /**
  * Advance a BdrvDirtyBitmapIter to an arbitrary offset.
  */
-void bdrv_set_dirty_iter(BdrvDirtyBitmapIter *iter, int64_t sector_num)
+void bdrv_set_dirty_iter(BdrvDirtyBitmapIter *iter, int64_t offset)
 {
-hbitmap_iter_init(&iter->hbi, iter->hbi.hb, sector_num);
+hbitmap_iter_init(&iter->hbi, iter->hbi.hb, offset >> BDRV_SECTOR_BITS);
 }

 int64_t bdrv_get_dirty_count(BdrvDirtyBitmap *bitmap)
diff --git a/block/mirror.c b/block/mirror.c
index 429751b9fe..51824750ac 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -373,7 +373,7 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
 next_dirty = bdrv_dirty_iter_next(s->dbi) * BDRV_SECTOR_SIZE;
 if (next_dirty > next_offset || next_dirty < 0) {
 /* The bitmap iterator's cache is stale, refresh it */
-bdrv_set_dirty_iter(s->dbi, next_offset >> BDRV_SECTOR_BITS

[Qemu-block] [PATCH v6 16/18] qcow2: Switch store_bitmap_data() to byte-based iteration

2017-08-30 Thread Eric Blake

Now that we have adjusted the majority of the calls this function
makes to be byte-based, it is easier to read the code if it makes
passes over the image using bytes rather than sectors.
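
The per-cluster arithmetic that the byte-based loop performs can be sketched on its own. The names `ByteRange` and `chunk_for_offset` are hypothetical; `limit` stands for the number of image bytes covered by one bitmap cluster and `bm_size` for the bitmap size in bytes, matching how those variables are used in the diff.

```c
#include <assert.h>
#include <stdint.h>

/* Half-open byte range [start, end). */
typedef struct {
    uint64_t start;
    uint64_t end;
} ByteRange;

/*
 * Given a dirty byte offset, return the range of image bytes that is
 * serialized into the same bitmap cluster, clipped to the bitmap size.
 */
static ByteRange chunk_for_offset(uint64_t offset, uint64_t limit,
                                  uint64_t bm_size)
{
    ByteRange r;
    uint64_t cluster;

    assert(limit > 0 && offset < bm_size);
    cluster = offset / limit;        /* index into the bitmap table */
    r.start = cluster * limit;       /* round the offset down */
    r.end = r.start + limit < bm_size ? r.start + limit : bm_size;
    return r;
}
```

For a 10000-byte bitmap with `limit` 4096, a dirty offset of 5000 maps to [4096, 8192), and a dirty offset of 9000 maps to the short final range [8192, 10000).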

Signed-off-by: Eric Blake 
Reviewed-by: John Snow 
Reviewed-by: Vladimir Sementsov-Ogievskiy 

---
v5: no change
v4: new patch
---
 block/qcow2-bitmap.c | 26 +++---
 1 file changed, 11 insertions(+), 15 deletions(-)

diff --git a/block/qcow2-bitmap.c b/block/qcow2-bitmap.c
index b807298484..63d845e35f 100644
--- a/block/qcow2-bitmap.c
+++ b/block/qcow2-bitmap.c
@@ -1072,10 +1072,9 @@ static uint64_t *store_bitmap_data(BlockDriverState *bs,
 {
 int ret;
 BDRVQcow2State *s = bs->opaque;
-int64_t sector;
-uint64_t limit, sbc;
+int64_t offset;
+uint64_t limit;
 uint64_t bm_size = bdrv_dirty_bitmap_size(bitmap);
-uint64_t bm_sectors = DIV_ROUND_UP(bm_size, BDRV_SECTOR_SIZE);
 const char *bm_name = bdrv_dirty_bitmap_name(bitmap);
 uint8_t *buf = NULL;
 BdrvDirtyBitmapIter *dbi;
@@ -1100,18 +1099,17 @@ static uint64_t *store_bitmap_data(BlockDriverState *bs,
 dbi = bdrv_dirty_iter_new(bitmap);
 buf = g_malloc(s->cluster_size);
 limit = bytes_covered_by_bitmap_cluster(s, bitmap);
-sbc = limit >> BDRV_SECTOR_BITS;
 assert(DIV_ROUND_UP(bm_size, limit) == tb_size);

-while ((sector = bdrv_dirty_iter_next(dbi) >> BDRV_SECTOR_BITS) != -1) {
-uint64_t cluster = sector / sbc;
+while ((offset = bdrv_dirty_iter_next(dbi)) != -1) {
+uint64_t cluster = offset / limit;
 uint64_t end, write_size;
 int64_t off;

-sector = cluster * sbc;
-end = MIN(bm_sectors, sector + sbc);
-write_size = bdrv_dirty_bitmap_serialization_size(bitmap,
-sector * BDRV_SECTOR_SIZE, (end - sector) * BDRV_SECTOR_SIZE);
+offset = cluster * limit;
+end = MIN(bm_size, offset + limit);
+write_size = bdrv_dirty_bitmap_serialization_size(bitmap, offset,
+  end - offset);
 assert(write_size <= s->cluster_size);

 off = qcow2_alloc_clusters(bs, s->cluster_size);
@@ -1123,9 +1121,7 @@ static uint64_t *store_bitmap_data(BlockDriverState *bs,
 }
 tb[cluster] = off;

-bdrv_dirty_bitmap_serialize_part(bitmap, buf,
- sector * BDRV_SECTOR_SIZE,
- (end - sector) * BDRV_SECTOR_SIZE);
+bdrv_dirty_bitmap_serialize_part(bitmap, buf, offset, end - offset);
 if (write_size < s->cluster_size) {
 memset(buf + write_size, 0, s->cluster_size - write_size);
 }
@@ -1143,11 +1139,11 @@ static uint64_t *store_bitmap_data(BlockDriverState *bs,
 goto fail;
 }

-if (end >= bm_sectors) {
+if (end >= bm_size) {
 break;
 }

-bdrv_set_dirty_iter(dbi, end * BDRV_SECTOR_SIZE);
+bdrv_set_dirty_iter(dbi, end);
 }

 *bitmap_table_size = tb_size;
-- 
2.13.5

[Qemu-block] [PATCH v6 04/18] dirty-bitmap: Drop unused functions

2017-08-30 Thread Eric Blake

We had several functions that no one is currently using, and which
use sector-based interfaces.  I'm trying to convert towards byte-based
interfaces, so it's easier to just drop the unused functions:

bdrv_dirty_bitmap_get_meta
bdrv_dirty_bitmap_get_meta_locked
bdrv_dirty_bitmap_reset_meta
bdrv_dirty_bitmap_meta_granularity

Signed-off-by: Eric Blake 
Reviewed-by: John Snow 

---
v5: no change
v4: rebase to Vladimir's persistent bitmaps (bdrv_dirty_bitmap_size now
in use), dropped R-b
v3: rebase to upstream changes (bdrv_dirty_bitmap_get_meta_locked was
added in b64bd51e with no clients), kept R-b
v2: tweak commit message based on review, no code change
---
 include/block/dirty-bitmap.h | 10 --
 block/dirty-bitmap.c | 44 
 2 files changed, 54 deletions(-)

diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h
index a79a58d2c3..8fd842eac9 100644
--- a/include/block/dirty-bitmap.h
+++ b/include/block/dirty-bitmap.h
@@ -34,7 +34,6 @@ void bdrv_enable_dirty_bitmap(BdrvDirtyBitmap *bitmap);
 BlockDirtyInfoList *bdrv_query_dirty_bitmaps(BlockDriverState *bs);
 uint32_t bdrv_get_default_bitmap_granularity(BlockDriverState *bs);
 uint32_t bdrv_dirty_bitmap_granularity(const BdrvDirtyBitmap *bitmap);
-uint32_t bdrv_dirty_bitmap_meta_granularity(BdrvDirtyBitmap *bitmap);
 bool bdrv_dirty_bitmap_enabled(BdrvDirtyBitmap *bitmap);
 bool bdrv_dirty_bitmap_frozen(BdrvDirtyBitmap *bitmap);
 const char *bdrv_dirty_bitmap_name(const BdrvDirtyBitmap *bitmap);
@@ -44,15 +43,6 @@ void bdrv_set_dirty_bitmap(BdrvDirtyBitmap *bitmap,
int64_t cur_sector, int64_t nr_sectors);
 void bdrv_reset_dirty_bitmap(BdrvDirtyBitmap *bitmap,
  int64_t cur_sector, int64_t nr_sectors);
-int bdrv_dirty_bitmap_get_meta(BlockDriverState *bs,
-   BdrvDirtyBitmap *bitmap, int64_t sector,
-   int nb_sectors);
-int bdrv_dirty_bitmap_get_meta_locked(BlockDriverState *bs,
-  BdrvDirtyBitmap *bitmap, int64_t sector,
-  int nb_sectors);
-void bdrv_dirty_bitmap_reset_meta(BlockDriverState *bs,
-  BdrvDirtyBitmap *bitmap, int64_t sector,
-  int nb_sectors);
 BdrvDirtyBitmapIter *bdrv_dirty_meta_iter_new(BdrvDirtyBitmap *bitmap);
 BdrvDirtyBitmapIter *bdrv_dirty_iter_new(BdrvDirtyBitmap *bitmap,
  uint64_t first_sector);
diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index 0490ca3aff..42a55e4a4b 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -173,45 +173,6 @@ void bdrv_release_meta_dirty_bitmap(BdrvDirtyBitmap *bitmap)
 qemu_mutex_unlock(bitmap->mutex);
 }

-int bdrv_dirty_bitmap_get_meta_locked(BlockDriverState *bs,
-  BdrvDirtyBitmap *bitmap, int64_t sector,
-  int nb_sectors)
-{
-uint64_t i;
-int sectors_per_bit = 1 << hbitmap_granularity(bitmap->meta);
-
-/* To optimize: we can make hbitmap to internally check the range in a
- * coarse level, or at least do it word by word. */
-for (i = sector; i < sector + nb_sectors; i += sectors_per_bit) {
-if (hbitmap_get(bitmap->meta, i)) {
-return true;
-}
-}
-return false;
-}
-
-int bdrv_dirty_bitmap_get_meta(BlockDriverState *bs,
-   BdrvDirtyBitmap *bitmap, int64_t sector,
-   int nb_sectors)
-{
-bool dirty;
-
-qemu_mutex_lock(bitmap->mutex);
-dirty = bdrv_dirty_bitmap_get_meta_locked(bs, bitmap, sector, nb_sectors);
-qemu_mutex_unlock(bitmap->mutex);
-
-return dirty;
-}
-
-void bdrv_dirty_bitmap_reset_meta(BlockDriverState *bs,
-  BdrvDirtyBitmap *bitmap, int64_t sector,
-  int nb_sectors)
-{
-qemu_mutex_lock(bitmap->mutex);
-hbitmap_reset(bitmap->meta, sector, nb_sectors);
-qemu_mutex_unlock(bitmap->mutex);
-}
-
 int64_t bdrv_dirty_bitmap_size(const BdrvDirtyBitmap *bitmap)
 {
 return bitmap->size;
@@ -511,11 +472,6 @@ uint32_t bdrv_dirty_bitmap_granularity(const BdrvDirtyBitmap *bitmap)
 return BDRV_SECTOR_SIZE << hbitmap_granularity(bitmap->bitmap);
 }

-uint32_t bdrv_dirty_bitmap_meta_granularity(BdrvDirtyBitmap *bitmap)
-{
-return BDRV_SECTOR_SIZE << hbitmap_granularity(bitmap->meta);
-}
-
 BdrvDirtyBitmapIter *bdrv_dirty_iter_new(BdrvDirtyBitmap *bitmap,
  uint64_t first_sector)
 {
-- 
2.13.5

[Qemu-block] [PATCH v6 09/18] dirty-bitmap: Change bdrv_dirty_iter_next() to report byte offset

2017-08-30 Thread Eric Blake

Thanks to recent cleanups, most callers were scaling a return value
of sectors into bytes (the exception, in qcow2-bitmap, will be
converted to byte-based iteration later).  Update the interface to
do the scaling internally instead.
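
One detail worth spelling out is what happens to the end-of-iteration marker once the result is scaled. The sketch below assumes the underlying iterator reports -1 when no dirty sector remains; under that assumption the scaled value is -512, not -1, so byte-based callers would test for a negative result, as the `offset < 0` and `>= 0` checks in the diff do. `scale_iter_result` and `iter_done` are hypothetical names, and a 512-byte sector is assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SECTOR_SIZE 512  /* assumed sector size in bytes */

/*
 * Convert an iterator result from sectors to bytes.  A negative
 * input (assumed end marker -1) stays negative after scaling.
 */
static int64_t scale_iter_result(int64_t sector)
{
    return sector * SECTOR_SIZE;
}

/* Byte-based callers treat any negative offset as "no more dirty data". */
static bool iter_done(int64_t offset)
{
    return offset < 0;
}
```

Comparing the scaled value against -1 would miss the end of iteration in this model, which is why a sign test is the safer check.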

Signed-off-by: Eric Blake 
Reviewed-by: John Snow 

---
v5: no change
v4: rebase to persistent bitmap
v3: no change
v2: no change
---
 block/backup.c   | 2 +-
 block/dirty-bitmap.c | 2 +-
 block/mirror.c   | 8 
 block/qcow2-bitmap.c | 2 +-
 4 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index 1a6ec7d079..f53cde90d6 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -375,7 +375,7 @@ static int coroutine_fn backup_run_incremental(BackupBlockJob *job)
 dbi = bdrv_dirty_iter_new(job->sync_bitmap);

 /* Find the next dirty sector(s) */
-while ((offset = bdrv_dirty_iter_next(dbi) * BDRV_SECTOR_SIZE) >= 0) {
+while ((offset = bdrv_dirty_iter_next(dbi)) >= 0) {
 cluster = offset / job->cluster_size;

 /* Fake progress updates for any clusters we skipped */
diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index c091f0c7ba..20f230867d 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -505,7 +505,7 @@ void bdrv_dirty_iter_free(BdrvDirtyBitmapIter *iter)

 int64_t bdrv_dirty_iter_next(BdrvDirtyBitmapIter *iter)
 {
-return hbitmap_iter_next(&iter->hbi);
+return hbitmap_iter_next(&iter->hbi) * BDRV_SECTOR_SIZE;
 }

 /* Called within bdrv_dirty_bitmap_lock..unlock */
diff --git a/block/mirror.c b/block/mirror.c
index 51824750ac..af13f5d658 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -336,10 +336,10 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
 int max_io_bytes = MAX(s->buf_size / MAX_IN_FLIGHT, MAX_IO_BYTES);

 bdrv_dirty_bitmap_lock(s->dirty_bitmap);
-offset = bdrv_dirty_iter_next(s->dbi) * BDRV_SECTOR_SIZE;
+offset = bdrv_dirty_iter_next(s->dbi);
 if (offset < 0) {
 bdrv_set_dirty_iter(s->dbi, 0);
-offset = bdrv_dirty_iter_next(s->dbi) * BDRV_SECTOR_SIZE;
+offset = bdrv_dirty_iter_next(s->dbi);
 trace_mirror_restart_iter(s, bdrv_get_dirty_count(s->dirty_bitmap) *
   BDRV_SECTOR_SIZE);
 assert(offset >= 0);
@@ -370,11 +370,11 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
 break;
 }

-next_dirty = bdrv_dirty_iter_next(s->dbi) * BDRV_SECTOR_SIZE;
+next_dirty = bdrv_dirty_iter_next(s->dbi);
 if (next_dirty > next_offset || next_dirty < 0) {
 /* The bitmap iterator's cache is stale, refresh it */
 bdrv_set_dirty_iter(s->dbi, next_offset);
-next_dirty = bdrv_dirty_iter_next(s->dbi) * BDRV_SECTOR_SIZE;
+next_dirty = bdrv_dirty_iter_next(s->dbi);
 }
 assert(next_dirty == next_offset);
 nb_chunks++;
diff --git a/block/qcow2-bitmap.c b/block/qcow2-bitmap.c
index 44329fc74f..c7c60dfca2 100644
--- a/block/qcow2-bitmap.c
+++ b/block/qcow2-bitmap.c
@@ -1109,7 +1109,7 @@ static uint64_t *store_bitmap_data(BlockDriverState *bs,
 sbc = limit >> BDRV_SECTOR_BITS;
 assert(DIV_ROUND_UP(bm_size, limit) == tb_size);

-while ((sector = bdrv_dirty_iter_next(dbi)) != -1) {
+while ((sector = bdrv_dirty_iter_next(dbi) >> BDRV_SECTOR_BITS) != -1) {
 uint64_t cluster = sector / sbc;
 uint64_t end, write_size;
 int64_t off;
-- 
2.13.5

[Qemu-block] [PATCH v6 14/18] qcow2: Switch qcow2_measure() to byte-based iteration

2017-08-30 Thread Eric Blake

This is new code, but it is easier to read if it makes passes over
the image using bytes rather than sectors (and will get easier in
the future when bdrv_get_block_status is converted to byte-based).
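
The per-extent arithmetic in byte terms can be sketched as below. This is an illustration only: toy_round_up and toy_extent_cost are hypothetical helpers, and the sketch works purely in bytes, whereas the patch still converts pnum to and from sectors around the block-status call:

```c
#include <assert.h>
#include <stdint.h>

/* Round x up to a multiple of align (align > 0). */
static uint64_t toy_round_up(uint64_t x, uint64_t align)
{
    return (x + align - 1) / align * align;
}

/*
 * Hypothetical helper: given an allocated extent [offset, offset + len),
 * extend it to the end of its last cluster, store where the next
 * iteration should resume in *next_offset, and return the bytes charged
 * (counted from the start of the first cluster touched).
 */
static uint64_t toy_extent_cost(uint64_t offset, uint64_t len,
                                uint64_t cluster_size, uint64_t *next_offset)
{
    uint64_t end = toy_round_up(offset + len, cluster_size);

    *next_offset = end;
    return offset % cluster_size + (end - offset);
}
```

For example, a 4 KiB extent starting at byte 4096 with 64 KiB clusters is charged one full cluster, and the scan resumes at byte 65536.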

Signed-off-by: Eric Blake 
Reviewed-by: John Snow 

---
v6: separate bug fix to earlier patch
v5: new patch
---
 block/qcow2.c | 22 ++
 1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 40ba26c111..57e3c5e7d5 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -3666,20 +3666,19 @@ static BlockMeasureInfo *qcow2_measure(QemuOpts *opts, BlockDriverState *in_bs,
  */
 required = virtual_size;
 } else {
-int cluster_sectors = cluster_size / BDRV_SECTOR_SIZE;
-int64_t sector_num;
+int64_t offset;
 int pnum = 0;

-for (sector_num = 0;
- sector_num < ssize / BDRV_SECTOR_SIZE;
- sector_num += pnum) {
-int nb_sectors = MIN(ssize / BDRV_SECTOR_SIZE - sector_num,
- BDRV_REQUEST_MAX_SECTORS);
+for (offset = 0; offset < ssize;
+ offset += pnum * BDRV_SECTOR_SIZE) {
+int nb_sectors = MIN(ssize - offset,
+ INT_MAX) / BDRV_SECTOR_SIZE;
 BlockDriverState *file;
 int64_t ret;

 ret = bdrv_get_block_status_above(in_bs, NULL,
-  sector_num, nb_sectors,
+  offset >> BDRV_SECTOR_BITS,
+  nb_sectors,
   &pnum, &file);
 if (ret < 0) {
 error_setg_errno(&local_err, -ret,
@@ -3692,12 +3691,11 @@ static BlockMeasureInfo *qcow2_measure(QemuOpts *opts, BlockDriverState *in_bs,
 } else if ((ret & (BDRV_BLOCK_DATA | BDRV_BLOCK_ALLOCATED)) ==
(BDRV_BLOCK_DATA | BDRV_BLOCK_ALLOCATED)) {
 /* Extend pnum to end of cluster for next iteration */
-pnum = ROUND_UP(sector_num + pnum, cluster_sectors) -
-   sector_num;
+pnum = (ROUND_UP(offset + pnum * BDRV_SECTOR_SIZE,
+ cluster_size) - offset) >> BDRV_SECTOR_BITS;

 /* Count clusters we've seen */
-required += (sector_num % cluster_sectors + pnum) *
-BDRV_SECTOR_SIZE;
+required += offset % cluster_size + pnum * BDRV_SECTOR_SIZE;
 }
 }
 }
-- 
2.13.5

[Qemu-block] [PATCH v6 06/18] dirty-bitmap: Change bdrv_dirty_bitmap_serialize() to take bytes

2017-08-30 Thread Eric Blake

Right now, the dirty-bitmap code exposes the fact that we use
a scale of sector granularity in the underlying hbitmap to anything
that wants to serialize a dirty bitmap.  It's nicer to uniformly
expose bytes as our dirty-bitmap interface, matching the previous
change to bitmap size.  The only caller to serialization is currently
qcow2-bitmap.c, which becomes a bit more verbose because it is still
tracking sectors for other reasons, but a later patch will fix that
to more uniformly use byte offsets everywhere.  Likewise, within
dirty-bitmap, we have to add more assertions that we are not
truncating incorrectly, which can go away once the internal hbitmap
is byte-based rather than sector-based.
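
The wrapper pattern described above (assert that the byte arguments are sector-aligned, then shift down for the sector-based backend) can be sketched roughly as follows. ToySectorRange and toy_bytes_to_sectors are hypothetical names used only for illustration, and the sector size of 512 is assumed:

```c
#include <assert.h>
#include <stdint.h>

#define TOY_SECTOR_BITS 9
#define TOY_SECTOR_SIZE (UINT64_C(1) << TOY_SECTOR_BITS)
#define TOY_IS_ALIGNED(n, m) (((n) % (m)) == 0)

/* Sector-based range, as a sector-granularity backend would expect it. */
typedef struct {
    uint64_t start;
    uint64_t count;
} ToySectorRange;

/*
 * Convert a byte-based (offset, bytes) pair into sectors.  Because the
 * sector size is a power of two, checking (offset | bytes) verifies both
 * values with a single test, so the shifts below cannot truncate.
 */
static ToySectorRange toy_bytes_to_sectors(uint64_t offset, uint64_t bytes)
{
    ToySectorRange r;

    assert(TOY_IS_ALIGNED(offset | bytes, TOY_SECTOR_SIZE));
    r.start = offset >> TOY_SECTOR_BITS;
    r.count = bytes >> TOY_SECTOR_BITS;
    return r;
}
```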

Signed-off-by: Eric Blake 
Reviewed-by: John Snow 

---
v5: no change
v4: new patch
---
 include/block/dirty-bitmap.h | 14 +++---
 block/dirty-bitmap.c | 37 -
 block/qcow2-bitmap.c | 22 ++
 3 files changed, 45 insertions(+), 28 deletions(-)

diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h
index 8fd842eac9..f4ccd3f33f 100644
--- a/include/block/dirty-bitmap.h
+++ b/include/block/dirty-bitmap.h
@@ -49,19 +49,19 @@ BdrvDirtyBitmapIter *bdrv_dirty_iter_new(BdrvDirtyBitmap *bitmap,
 void bdrv_dirty_iter_free(BdrvDirtyBitmapIter *iter);

 uint64_t bdrv_dirty_bitmap_serialization_size(const BdrvDirtyBitmap *bitmap,
-  uint64_t start, uint64_t count);
+  uint64_t offset, uint64_t bytes);
 uint64_t bdrv_dirty_bitmap_serialization_align(const BdrvDirtyBitmap *bitmap);
 void bdrv_dirty_bitmap_serialize_part(const BdrvDirtyBitmap *bitmap,
-  uint8_t *buf, uint64_t start,
-  uint64_t count);
+  uint8_t *buf, uint64_t offset,
+  uint64_t bytes);
 void bdrv_dirty_bitmap_deserialize_part(BdrvDirtyBitmap *bitmap,
-uint8_t *buf, uint64_t start,
-uint64_t count, bool finish);
+uint8_t *buf, uint64_t offset,
+uint64_t bytes, bool finish);
 void bdrv_dirty_bitmap_deserialize_zeroes(BdrvDirtyBitmap *bitmap,
-  uint64_t start, uint64_t count,
+  uint64_t offset, uint64_t bytes,
   bool finish);
 void bdrv_dirty_bitmap_deserialize_ones(BdrvDirtyBitmap *bitmap,
-uint64_t start, uint64_t count,
+uint64_t offset, uint64_t bytes,
 bool finish);
 void bdrv_dirty_bitmap_deserialize_finish(BdrvDirtyBitmap *bitmap);

diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index e65ec4f7ec..034effc8cd 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -570,42 +570,53 @@ void bdrv_undo_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap, HBitmap *in)
 }

 uint64_t bdrv_dirty_bitmap_serialization_size(const BdrvDirtyBitmap *bitmap,
-  uint64_t start, uint64_t count)
+  uint64_t offset, uint64_t bytes)
 {
-return hbitmap_serialization_size(bitmap->bitmap, start, count);
+assert(QEMU_IS_ALIGNED(offset | bytes, BDRV_SECTOR_SIZE));
+return hbitmap_serialization_size(bitmap->bitmap,
+  offset >> BDRV_SECTOR_BITS,
+  bytes >> BDRV_SECTOR_BITS);
 }

 uint64_t bdrv_dirty_bitmap_serialization_align(const BdrvDirtyBitmap *bitmap)
 {
-return hbitmap_serialization_align(bitmap->bitmap);
+return hbitmap_serialization_align(bitmap->bitmap) * BDRV_SECTOR_SIZE;
 }

 void bdrv_dirty_bitmap_serialize_part(const BdrvDirtyBitmap *bitmap,
-  uint8_t *buf, uint64_t start,
-  uint64_t count)
+  uint8_t *buf, uint64_t offset,
+  uint64_t bytes)
 {
-hbitmap_serialize_part(bitmap->bitmap, buf, start, count);
+assert(QEMU_IS_ALIGNED(offset | bytes, BDRV_SECTOR_SIZE));
+hbitmap_serialize_part(bitmap->bitmap, buf, offset >> BDRV_SECTOR_BITS,
+   bytes >> BDRV_SECTOR_BITS);
 }

 void bdrv_dirty_bitmap_deserialize_part(BdrvDirtyBitmap *bitmap,
-uint8_t *buf, uint64_t start,
-uint64_t count, bool finish)
+uint8_t *buf, uint64_t offset,
+uint64_t bytes, bool finish)
 {
-hbitmap_deserialize_part(bitmap->bitmap, buf, start, count, finish);
+

[Qemu-block] [PATCH v6 02/18] hbitmap: Rename serialization_granularity to serialization_align

2017-08-30 Thread Eric Blake

The only client of hbitmap_serialization_granularity() is dirty-bitmap's
bdrv_dirty_bitmap_serialization_align().  Keeping the two names consistent
is worthwhile, and the shorter name is more representative of what the
function returns (the required alignment to be used for start/count of
other serialization functions, where violating the alignment causes
assertion failures).
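
As a rough sketch of the alignment rule, based on the comments quoted in the diff below (the value is 64 << granularity, and only the last chunk may have an unaligned count): toy_serialization_align and toy_chunk_is_valid are hypothetical names, and the real code asserts rather than returning a boolean:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Required alignment, in items covered, for a given granularity shift. */
static uint64_t toy_serialization_align(int granularity)
{
    assert(granularity >= 0 && granularity < 58);  /* keep 64 << g in range */
    return UINT64_C(64) << granularity;
}

/*
 * Check a (start, count) chunk against the rule: start must be aligned,
 * and count must be aligned unless the chunk ends at the bitmap's size.
 */
static bool toy_chunk_is_valid(uint64_t start, uint64_t count,
                               uint64_t size, int granularity)
{
    uint64_t align = toy_serialization_align(granularity);

    if (start % align != 0) {
        return false;
    }
    return count % align == 0 || start + count == size;
}
```

With a granularity of 3 the alignment is 512 items, matching the `64 << 3` expectation in the unit test touched by this patch.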

Signed-off-by: Eric Blake 
Reviewed-by: John Snow 

---
v5: no change
v4: new patch
---
 include/qemu/hbitmap.h |  8 
 block/dirty-bitmap.c   |  2 +-
 tests/test-hbitmap.c   | 10 +-
 util/hbitmap.c |  8 
 4 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/include/qemu/hbitmap.h b/include/qemu/hbitmap.h
index d3a74a21fc..81e78043d1 100644
--- a/include/qemu/hbitmap.h
+++ b/include/qemu/hbitmap.h
@@ -159,16 +159,16 @@ bool hbitmap_get(const HBitmap *hb, uint64_t item);
 bool hbitmap_is_serializable(const HBitmap *hb);

 /**
- * hbitmap_serialization_granularity:
+ * hbitmap_serialization_align:
  * @hb: HBitmap to operate on.
  *
- * Granularity of serialization chunks, used by other serialization functions.
- * For every chunk:
+ * Required alignment of serialization chunks, used by other serialization
+ * functions. For every chunk:
  * 1. Chunk start should be aligned to this granularity.
  * 2. Chunk size should be aligned too, except for last chunk (for which
  *  start + count == hb->size)
  */
-uint64_t hbitmap_serialization_granularity(const HBitmap *hb);
+uint64_t hbitmap_serialization_align(const HBitmap *hb);

 /**
  * hbitmap_serialization_size:
diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index 30462d4f9a..0490ca3aff 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -617,7 +617,7 @@ uint64_t bdrv_dirty_bitmap_serialization_size(const BdrvDirtyBitmap *bitmap,

 uint64_t bdrv_dirty_bitmap_serialization_align(const BdrvDirtyBitmap *bitmap)
 {
-return hbitmap_serialization_granularity(bitmap->bitmap);
+return hbitmap_serialization_align(bitmap->bitmap);
 }

 void bdrv_dirty_bitmap_serialize_part(const BdrvDirtyBitmap *bitmap,
diff --git a/tests/test-hbitmap.c b/tests/test-hbitmap.c
index 1acb353889..af41642346 100644
--- a/tests/test-hbitmap.c
+++ b/tests/test-hbitmap.c
@@ -738,15 +738,15 @@ static void test_hbitmap_meta_one(TestHBitmapData *data, const void *unused)
 }
 }

-static void test_hbitmap_serialize_granularity(TestHBitmapData *data,
-   const void *unused)
+static void test_hbitmap_serialize_align(TestHBitmapData *data,
+ const void *unused)
 {
 int r;

 hbitmap_test_init(data, L3 * 2, 3);
 g_assert(hbitmap_is_serializable(data->hb));

-r = hbitmap_serialization_granularity(data->hb);
+r = hbitmap_serialization_align(data->hb);
 g_assert_cmpint(r, ==, 64 << 3);
 }

@@ -974,8 +974,8 @@ int main(int argc, char **argv)
 hbitmap_test_add("/hbitmap/meta/word", test_hbitmap_meta_word);
 hbitmap_test_add("/hbitmap/meta/sector", test_hbitmap_meta_sector);

-hbitmap_test_add("/hbitmap/serialize/granularity",
- test_hbitmap_serialize_granularity);
+hbitmap_test_add("/hbitmap/serialize/align",
+ test_hbitmap_serialize_align);
 hbitmap_test_add("/hbitmap/serialize/basic",
  test_hbitmap_serialize_basic);
 hbitmap_test_add("/hbitmap/serialize/part",
diff --git a/util/hbitmap.c b/util/hbitmap.c
index 21535cc90b..2f9d0fdbd0 100644
--- a/util/hbitmap.c
+++ b/util/hbitmap.c
@@ -413,14 +413,14 @@ bool hbitmap_is_serializable(const HBitmap *hb)
 {
 /* Every serialized chunk must be aligned to 64 bits so that endianness
  * requirements can be fulfilled on both 64 bit and 32 bit hosts.
- * We have hbitmap_serialization_granularity() which converts this
+ * We have hbitmap_serialization_align() which converts this
  * alignment requirement from bitmap bits to items covered (e.g. sectors).
  * That value is:
  *64 << hb->granularity
  * Since this value must not exceed UINT64_MAX, hb->granularity must be
  * less than 58 (== 64 - 6, where 6 is ld(64), i.e. 1 << 6 == 64).
  *
- * In order for hbitmap_serialization_granularity() to always return a
+ * In order for hbitmap_serialization_align() to always return a
  * meaningful value, bitmaps that are to be serialized must have a
  * granularity of less than 58. */

@@ -437,7 +437,7 @@ bool hbitmap_get(const HBitmap *hb, uint64_t item)
 return (hb->levels[HBITMAP_LEVELS - 1][pos >> BITS_PER_LEVEL] & bit) != 0;
 }

-uint64_t hbitmap_serialization_granularity(const HBitmap *hb)
+uint64_t hbitmap_serialization_align(const HBitmap *hb)
 {
 assert(hbitmap_is_serializable(hb));

@@ -454,7 +454,7 @@ static void serialization_chunk(const HBitmap *hb,
 unsigned long **first_el, uint64_t *el_count)
 {
 uint64_t last = sta

[Qemu-block] [PATCH v6 07/18] qcow2: Switch sectors_covered_by_bitmap_cluster() to byte-based

2017-08-30 Thread Eric Blake

We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based.  Change the qcow2 bitmap
helper function sectors_covered_by_bitmap_cluster(), renaming it
to bytes_covered_by_bitmap_cluster() in the process.
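
The computation itself is small: each bit of bitmap data covers one granularity's worth of disk, and a cluster holds cluster_size * 8 bits. A standalone sketch, with a hypothetical function name and plain integers in place of the QEMU structures:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes of guest disk described by one qcow2 cluster of bitmap data:
 * (bits per cluster) * (bytes tracked per bit).
 */
static uint64_t toy_bytes_covered_by_bitmap_cluster(uint64_t cluster_size,
                                                    uint64_t granularity)
{
    return granularity * (cluster_size << 3);
}
```

With 64 KiB clusters and a 64 KiB granularity, one bitmap cluster covers 2^35 bytes (32 GiB) of disk.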

Signed-off-by: Eric Blake 
Reviewed-by: John Snow 

---
v5: no change
v4: new patch
---
 block/qcow2-bitmap.c | 28 ++--
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/block/qcow2-bitmap.c b/block/qcow2-bitmap.c
index 92098bfa49..4475273d8c 100644
--- a/block/qcow2-bitmap.c
+++ b/block/qcow2-bitmap.c
@@ -269,18 +269,16 @@ static int free_bitmap_clusters(BlockDriverState *bs, Qcow2BitmapTable *tb)
 return 0;
 }

-/* This function returns the number of disk sectors covered by a single qcow2
- * cluster of bitmap data. */
-static uint64_t sectors_covered_by_bitmap_cluster(const BDRVQcow2State *s,
-  const BdrvDirtyBitmap *bitmap)
+/* Return the disk size covered by a single qcow2 cluster of bitmap data. */
+static uint64_t bytes_covered_by_bitmap_cluster(const BDRVQcow2State *s,
+const BdrvDirtyBitmap *bitmap)
 {
-uint64_t sector_granularity =
-bdrv_dirty_bitmap_granularity(bitmap) >> BDRV_SECTOR_BITS;
-uint64_t sbc = sector_granularity * (s->cluster_size << 3);
+uint64_t granularity = bdrv_dirty_bitmap_granularity(bitmap);
+uint64_t limit = granularity * (s->cluster_size << 3);

-assert(QEMU_IS_ALIGNED(sbc << BDRV_SECTOR_BITS,
+assert(QEMU_IS_ALIGNED(limit,
bdrv_dirty_bitmap_serialization_align(bitmap)));
-return sbc;
+return limit;
 }

 /* load_bitmap_data
@@ -293,7 +291,7 @@ static int load_bitmap_data(BlockDriverState *bs,
 {
 int ret = 0;
 BDRVQcow2State *s = bs->opaque;
-uint64_t sector, sbc;
+uint64_t sector, limit, sbc;
 uint64_t bm_size = bdrv_dirty_bitmap_size(bitmap);
 uint64_t bm_sectors = DIV_ROUND_UP(bm_size, BDRV_SECTOR_SIZE);
 uint8_t *buf = NULL;
@@ -306,7 +304,8 @@ static int load_bitmap_data(BlockDriverState *bs,
 }

 buf = g_malloc(s->cluster_size);
-sbc = sectors_covered_by_bitmap_cluster(s, bitmap);
+limit = bytes_covered_by_bitmap_cluster(s, bitmap);
+sbc = limit >> BDRV_SECTOR_BITS;
 for (i = 0, sector = 0; i < tab_size; ++i, sector += sbc) {
 uint64_t count = MIN(bm_sectors - sector, sbc);
 uint64_t entry = bitmap_table[i];
@@ -1080,7 +1079,7 @@ static uint64_t *store_bitmap_data(BlockDriverState *bs,
 int ret;
 BDRVQcow2State *s = bs->opaque;
 int64_t sector;
-uint64_t sbc;
+uint64_t limit, sbc;
 uint64_t bm_size = bdrv_dirty_bitmap_size(bitmap);
 uint64_t bm_sectors = DIV_ROUND_UP(bm_size, BDRV_SECTOR_SIZE);
 const char *bm_name = bdrv_dirty_bitmap_name(bitmap);
@@ -1106,8 +1105,9 @@ static uint64_t *store_bitmap_data(BlockDriverState *bs,

 dbi = bdrv_dirty_iter_new(bitmap, 0);
 buf = g_malloc(s->cluster_size);
-sbc = sectors_covered_by_bitmap_cluster(s, bitmap);
-assert(DIV_ROUND_UP(bm_sectors, sbc) == tb_size);
+limit = bytes_covered_by_bitmap_cluster(s, bitmap);
+sbc = limit >> BDRV_SECTOR_BITS;
+assert(DIV_ROUND_UP(bm_size, limit) == tb_size);

 while ((sector = bdrv_dirty_iter_next(dbi)) != -1) {
 uint64_t cluster = sector / sbc;
-- 
2.13.5

[Qemu-block] [PATCH v6 01/18] block: Make bdrv_img_create() size selection easier to read

2017-08-30 Thread Eric Blake

All callers of bdrv_img_create() pass in a size, or -1 to read the
size from the backing file.  We then set that size as the QemuOpt
default, which means we will reuse that default rather than the
final parameter to qemu_opt_get_size() several lines later.  But
it is rather confusing to read subsequent checks of 'size == -1'
when it looks (without seeing the full context) like size defaults
to 0; it also doesn't help that a size of 0 is valid (for some
formats).

Rework the logic to make things more legible.

Signed-off-by: Eric Blake 
Reviewed-by: John Snow 

---
v6: Combine into a series rather than being a standalone patch (more for
ease of tracking than for being on topic)
---
 block.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block.c b/block.c
index 3308814bba..3f84d141a7 100644
--- a/block.c
+++ b/block.c
@@ -4392,7 +4392,7 @@ void bdrv_img_create(const char *filename, const char *fmt,

 /* The size for the image must always be specified, unless we have a backing
  * file and we have not been forbidden from opening it. */
-size = qemu_opt_get_size(opts, BLOCK_OPT_SIZE, 0);
+size = qemu_opt_get_size(opts, BLOCK_OPT_SIZE, img_size);
 if (backing_file && !(flags & BDRV_O_NO_BACKING)) {
 BlockDriverState *bs;
 char *full_backing = g_new0(char, PATH_MAX);
-- 
2.13.5

[Qemu-block] [PATCH v6 00/18] make dirty-bitmap byte-based

2017-08-30 Thread Eric Blake

There are patches floating around to add NBD_CMD_BLOCK_STATUS,
but NBD wants to report status on byte granularity (even if the
reporting will probably be naturally aligned to sectors or even
much higher levels).  I've therefore started the task of
converting our block status code to report at a byte granularity
rather than sectors.

Now that 2.11 is open, I'm rebasing/reposting the remaining patches.

The overall conversion currently looks like:
part 1: bdrv_is_allocated (merged in 2.10, commit 51b0a488)
part 2: dirty-bitmap (this series, v5 was here [1])
part 3: bdrv_get_block_status (v3 is posted [2] and is mostly reviewed, but
needs a rebase)
part 4: .bdrv_co_block_status (v2 is posted [3], but needs a rebase)

Available as a tag at:
git fetch git://repo.or.cz/qemu/ericb.git nbd-byte-dirty-v6

[1] https://lists.gnu.org/archive/html/qemu-devel/2017-07/msg03512.html
[2] https://lists.gnu.org/archive/html/qemu-devel/2017-07/msg03853.html
[3] https://lists.gnu.org/archive/html/qemu-devel/2017-07/msg04370.html

Diff from v5:
- add another patch (more for ease of bookkeeping, as it was previously
posted independently)
- drop bug fixes that were hoisted into 2.10 (v5 1/18, plus 14/18)

001/18:[down] 'block: Make bdrv_img_create() size selection easier to read'
002/18:[] [--] 'hbitmap: Rename serialization_granularity to serialization_align'
003/18:[] [--] 'qcow2: Ensure bitmap serialization is aligned'
004/18:[] [--] 'dirty-bitmap: Drop unused functions'
005/18:[] [--] 'dirty-bitmap: Change bdrv_dirty_bitmap_size() to report bytes'
006/18:[] [--] 'dirty-bitmap: Change bdrv_dirty_bitmap_*serialize*() to take bytes'
007/18:[] [--] 'qcow2: Switch sectors_covered_by_bitmap_cluster() to byte-based'
008/18:[] [--] 'dirty-bitmap: Set iterator start by offset, not sector'
009/18:[] [--] 'dirty-bitmap: Change bdrv_dirty_iter_next() to report byte offset'
010/18:[] [--] 'dirty-bitmap: Change bdrv_get_dirty_count() to report bytes'
011/18:[] [--] 'dirty-bitmap: Change bdrv_get_dirty_locked() to take bytes'
012/18:[] [--] 'dirty-bitmap: Change bdrv_[re]set_dirty_bitmap() to use bytes'
013/18:[] [--] 'mirror: Switch mirror_dirty_init() to byte-based iteration'
014/18:[0004] [FC] 'qcow2: Switch qcow2_measure() to byte-based iteration'
015/18:[] [--] 'qcow2: Switch load_bitmap_data() to byte-based iteration'
016/18:[] [--] 'qcow2: Switch store_bitmap_data() to byte-based iteration'
017/18:[] [--] 'dirty-bitmap: Switch bdrv_set_dirty() to bytes'
018/18:[] [--] 'dirty-bitmap: Convert internal hbitmap size/granularity'

Eric Blake (18):
  block: Make bdrv_img_create() size selection easier to read
  hbitmap: Rename serialization_granularity to serialization_align
  qcow2: Ensure bitmap serialization is aligned
  dirty-bitmap: Drop unused functions
  dirty-bitmap: Change bdrv_dirty_bitmap_size() to report bytes
  dirty-bitmap: Change bdrv_dirty_bitmap_*serialize*() to take bytes
  qcow2: Switch sectors_covered_by_bitmap_cluster() to byte-based
  dirty-bitmap: Set iterator start by offset, not sector
  dirty-bitmap: Change bdrv_dirty_iter_next() to report byte offset
  dirty-bitmap: Change bdrv_get_dirty_count() to report bytes
  dirty-bitmap: Change bdrv_get_dirty_locked() to take bytes
  dirty-bitmap: Change bdrv_[re]set_dirty_bitmap() to use bytes
  mirror: Switch mirror_dirty_init() to byte-based iteration
  qcow2: Switch qcow2_measure() to byte-based iteration
  qcow2: Switch load_bitmap_data() to byte-based iteration
  qcow2: Switch store_bitmap_data() to byte-based iteration
  dirty-bitmap: Switch bdrv_set_dirty() to bytes
  dirty-bitmap: Convert internal hbitmap size/granularity

 include/block/block_int.h|   2 +-
 include/block/dirty-bitmap.h |  41 +-
 include/qemu/hbitmap.h   |   8 +--
 block/io.c   |   6 +-
 block.c  |   2 +-
 block/backup.c   |   7 +--
 block/dirty-bitmap.c | 130 ++-
 block/mirror.c   |  76 +++--
 block/qcow2-bitmap.c |  57 +--
 block/qcow2.c|  22 
 migration/block.c|  12 ++--
 tests/test-hbitmap.c |  10 ++--
 util/hbitmap.c   |   8 +--
 13 files changed, 154 insertions(+), 227 deletions(-)

-- 
2.13.5

[Qemu-block] [PATCH v6 05/18] dirty-bitmap: Change bdrv_dirty_bitmap_size() to report bytes

2017-08-30 Thread Eric Blake

We are still using an internal hbitmap that tracks a size in sectors,
with the granularity scaled down accordingly, because it lets us
use a shortcut for our iterators which are currently sector-based.
But there's no reason we can't track the dirty bitmap size in bytes,
since it is (mostly) an internal-only variable (remember, the size
is how many bytes are covered by the bitmap, not how many bytes the
bitmap occupies).  Furthermore, we're already reporting bytes for
bdrv_dirty_bitmap_granularity(); mixing bytes and sectors in our
return values is a recipe for confusion.  A later cleanup will
convert dirty bitmap internals to be entirely byte-based,
eliminating the intermediate sector rounding added here; and
technically, since bdrv_getlength() already rounds up to sectors,
our use of DIV_ROUND_UP is more for theoretical completeness than
for any actual rounding.
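
The two conversions mentioned here (rounding a byte size up to whole sectors, and turning a byte granularity into a sector-relative shift) can be sketched as follows. All toy_* names are hypothetical, and the portable bit-counting loop stands in for QEMU's ctz32():

```c
#include <assert.h>
#include <stdint.h>

#define TOY_SECTOR_BITS 9
#define TOY_SECTOR_SIZE (UINT64_C(1) << TOY_SECTOR_BITS)
#define TOY_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Count trailing zero bits of a non-zero 32-bit value. */
static int toy_ctz32(uint32_t v)
{
    int n = 0;

    assert(v != 0);
    while (!(v & 1)) {
        v >>= 1;
        n++;
    }
    return n;
}

/* Number of sectors needed to cover size_bytes, rounding up. */
static uint64_t toy_hbitmap_sectors(uint64_t size_bytes)
{
    return TOY_DIV_ROUND_UP(size_bytes, TOY_SECTOR_SIZE);
}

/* Granularity shift for a sector-based bitmap, from a byte granularity
 * that must be a power of two and at least one sector. */
static int toy_hbitmap_shift(uint32_t granularity_bytes)
{
    assert((granularity_bytes & (granularity_bytes - 1)) == 0 &&
           granularity_bytes >= TOY_SECTOR_SIZE);
    return toy_ctz32(granularity_bytes) - TOY_SECTOR_BITS;
}
```

The bitmap's reported size stays in bytes; only the value handed to the sector-based backend is rounded.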

The only external caller in qcow2-bitmap.c is temporarily more verbose
(because it is still using sector-based math), but will later be
switched to track progress by bytes instead of sectors.

Use is_power_of_2() while at it, instead of open-coding that, and
add an assertion where bdrv_getlength() should not fail.

Signed-off-by: Eric Blake 
Reviewed-by: John Snow 

---
v6: no change
v5: fix bdrv_dirty_bitmap_truncate [John], drop R-b
v4: retitle from "Track size in bytes", rebase to persistent bitmaps,
round up when converting bytes to sectors
v3: no change
v2: tweak commit message, no code change
---
 block/dirty-bitmap.c | 26 +++---
 block/qcow2-bitmap.c | 14 --
 2 files changed, 23 insertions(+), 17 deletions(-)

diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index 42a55e4a4b..e65ec4f7ec 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -1,7 +1,7 @@
 /*
  * Block Dirty Bitmap
  *
- * Copyright (c) 2016 Red Hat. Inc
+ * Copyright (c) 2016-2017 Red Hat. Inc
  *
  * Permission is hereby granted, free of charge, to any person obtaining a copy
  * of this software and associated documentation files (the "Software"), to deal
@@ -42,7 +42,7 @@ struct BdrvDirtyBitmap {
 HBitmap *meta;  /* Meta dirty bitmap */
 BdrvDirtyBitmap *successor; /* Anonymous child; implies frozen status */
 char *name; /* Optional non-empty unique ID */
-int64_t size;   /* Size of the bitmap (Number of sectors) */
+int64_t size;   /* Size of the bitmap, in bytes */
 bool disabled;  /* Bitmap is disabled. It ignores all writes to
the device */
 int active_iterators;   /* How many iterators are active */
@@ -115,17 +115,14 @@ BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState *bs,
 {
 int64_t bitmap_size;
 BdrvDirtyBitmap *bitmap;
-uint32_t sector_granularity;

-assert((granularity & (granularity - 1)) == 0);
+assert(is_power_of_2(granularity) && granularity >= BDRV_SECTOR_SIZE);

 if (name && bdrv_find_dirty_bitmap(bs, name)) {
 error_setg(errp, "Bitmap already exists: %s", name);
 return NULL;
 }
-sector_granularity = granularity >> BDRV_SECTOR_BITS;
-assert(sector_granularity);
-bitmap_size = bdrv_nb_sectors(bs);
+bitmap_size = bdrv_getlength(bs);
 if (bitmap_size < 0) {
 error_setg_errno(errp, -bitmap_size, "could not get length of device");
 errno = -bitmap_size;
@@ -133,7 +130,12 @@ BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState *bs,
 }
 bitmap = g_new0(BdrvDirtyBitmap, 1);
 bitmap->mutex = &bs->dirty_bitmap_mutex;
-bitmap->bitmap = hbitmap_alloc(bitmap_size, ctz32(sector_granularity));
+/*
+ * TODO - let hbitmap track full granularity. For now, it is tracking
+ * only sector granularity, as a shortcut for our iterators.
+ */
+bitmap->bitmap = hbitmap_alloc(DIV_ROUND_UP(bitmap_size, BDRV_SECTOR_SIZE),
+   ctz32(granularity) - BDRV_SECTOR_BITS);
 bitmap->size = bitmap_size;
 bitmap->name = g_strdup(name);
 bitmap->disabled = false;
@@ -305,13 +307,14 @@ BdrvDirtyBitmap *bdrv_reclaim_dirty_bitmap(BlockDriverState *bs,
 void bdrv_dirty_bitmap_truncate(BlockDriverState *bs)
 {
 BdrvDirtyBitmap *bitmap;
-uint64_t size = bdrv_nb_sectors(bs);
+int64_t size = bdrv_getlength(bs);

+assert(size >= 0);
 bdrv_dirty_bitmaps_lock(bs);
 QLIST_FOREACH(bitmap, &bs->dirty_bitmaps, list) {
 assert(!bdrv_dirty_bitmap_frozen(bitmap));
 assert(!bitmap->active_iterators);
-hbitmap_truncate(bitmap->bitmap, size);
+hbitmap_truncate(bitmap->bitmap, DIV_ROUND_UP(size, BDRV_SECTOR_SIZE));
 bitmap->size = size;
 }
 bdrv_dirty_bitmaps_unlock(bs);
@@ -549,7 +552,8 @@ void bdrv_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap, HBitmap **out)
 hbitmap_reset_all(bitmap->bitmap);
 } else {
 HBitmap *backup = bitmap->bitmap;
-bitmap->bitmap = hbitm

[Qemu-block] [PATCH v6 03/18] qcow2: Ensure bitmap serialization is aligned

2017-08-30 Thread Eric Blake

When subdividing a bitmap serialization, the code in hbitmap.c
enforces that start/count parameters are aligned (except that
count can end early at end-of-bitmap).  We exposed this required
alignment through bdrv_dirty_bitmap_serialization_align(), but
forgot to actually check that we comply with it.

Fortunately, qcow2 is never dividing bitmap serialization smaller
than one cluster (which is a minimum of 512 bytes); so we are
always compliant with the serialization alignment (which insists
that we partition at least 64 bits per chunk) because we are doing
at least 4k bits per chunk.

Still, it's safer to add an assertion (for the unlikely case that
we'd ever support a cluster smaller than 512 bytes, or if the
hbitmap implementation changes what it considers to be aligned),
rather than leaving bdrv_dirty_bitmap_serialization_align()
without a caller.

Signed-off-by: Eric Blake 
Reviewed-by: John Snow 

---
v5: no change
v4: new patch
---
 block/qcow2-bitmap.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/block/qcow2-bitmap.c b/block/qcow2-bitmap.c
index e8d3bdbd6e..b3ee4c794a 100644
--- a/block/qcow2-bitmap.c
+++ b/block/qcow2-bitmap.c
@@ -274,10 +274,13 @@ static int free_bitmap_clusters(BlockDriverState *bs, Qcow2BitmapTable *tb)
 static uint64_t sectors_covered_by_bitmap_cluster(const BDRVQcow2State *s,
   const BdrvDirtyBitmap *bitmap)
 {
-uint32_t sector_granularity =
+uint64_t sector_granularity =
 bdrv_dirty_bitmap_granularity(bitmap) >> BDRV_SECTOR_BITS;
+uint64_t sbc = sector_granularity * (s->cluster_size << 3);

-return (uint64_t)sector_granularity * (s->cluster_size << 3);
+assert(QEMU_IS_ALIGNED(sbc,
+   bdrv_dirty_bitmap_serialization_align(bitmap)));
+return sbc;
 }

 /* load_bitmap_data
-- 
2.13.5

Re: [Qemu-block] [Qemu-devel] [PATCH v2 1/7] block/ssh: don't call libssh2_init() in block_init()

2017-08-30 Thread Eric Blake

On 08/30/2017 03:11 PM, Jeff Cody wrote:
> On Wed, Aug 30, 2017 at 02:40:16PM -0500, Eric Blake wrote:
>> On 08/30/2017 11:56 AM, Jeff Cody wrote:
>>> We don't need libssh2 failure to be fatal (we could just opt to not
>>> register the driver on failure). But, it is probably a good idea to
>>> avoid external library calls during the block_init(), and call the
>>> libssh2 global init function on the first usage, returning any errors.
>>>

>>
>> Is returning 'ret' from libssh2_init() wise?
>>
>> If 'ret' is not important, you could simplify this:
>>
>> if (ssh_state_init(s, errp)) {
>> return -1;
>> }
> 
> Good point, not sure if a non-zero ret from libssh2_init() provides meaning
> beyond 'error' for us.  Maybe return -EIO instead of -1, though?

You used -EIO for 5/7, so that does sound a bit better.
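
To make the suggestion concrete, here is a minimal sketch of lazy one-time initialization that maps any failure to -EIO. It deliberately uses a stub in place of the real library call: toy_lib_init, toy_lib_init_result and toy_state_init are hypothetical, and error reporting through Error **errp is left out:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

static bool toy_libinit_called;   /* set once init has succeeded */
static int toy_lib_init_result;   /* knob for the stub below */

/* Stub standing in for an external library's global init function;
 * returns 0 on success, non-zero on failure. */
static int toy_lib_init(void)
{
    return toy_lib_init_result;
}

/* Initialize the library on first use.  Any non-zero result from the
 * library is reported as -EIO rather than passed through. */
static int toy_state_init(void)
{
    if (!toy_libinit_called) {
        if (toy_lib_init() != 0) {
            return -EIO;
        }
        toy_libinit_called = true;
    }
    return 0;
}
```

Because the flag is only set on success, a later open would retry the initialization after a failure.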

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org




Re: [Qemu-block] [Qemu-devel] [PATCH v2 1/7] block/ssh: don't call libssh2_init() in block_init()

2017-08-30 Thread Jeff Cody

On Wed, Aug 30, 2017 at 02:40:16PM -0500, Eric Blake wrote:
> On 08/30/2017 11:56 AM, Jeff Cody wrote:
> > We don't need libssh2 failure to be fatal (we could just opt to not
> > register the driver on failure). But, it is probably a good idea to
> > avoid external library calls during the block_init(), and call the
> > libssh2 global init function on the first usage, returning any errors.
> > 
> > Signed-off-by: Jeff Cody 
> > ---
> >  block/ssh.c | 40 +---
> >  1 file changed, 29 insertions(+), 11 deletions(-)
> > 
> 
> > +static int ssh_state_init(BDRVSSHState *s, Error **errp)
> >  {
> > +int ret;
> > +
> > +if (!ssh_libinit_called) {
> > +ret = libssh2_init(0);
> > +if (ret) {
> > +error_setg(errp, "libssh2 initialization failed with %d", ret);
> > +return ret;
> 
> Do we know if this number is always positive or negative?
> 

From the documentation [1], it returns 0 on success, or a negative value
for error.  (I guess presumably that means a positive value is by definition
an error, as well).

[1] https://www.libssh2.org/libssh2_init.html

> > @@ -772,8 +788,13 @@ static int ssh_file_open(BlockDriverState *bs, QDict *options, int bdrv_flags,
> >  BDRVSSHState *s = bs->opaque;
> >  int ret;
> >  int ssh_flags;
> > +Error *local_err = NULL;
> >  
> > -ssh_state_init(s);
> > +ret = ssh_state_init(s, &local_err);
> > +if (local_err) {
> > +error_propagate(errp, local_err);
> > +return ret;
> 
> Is returning 'ret' from libssh2_init() wise?
> 
> If 'ret' is not important, you could simplify this:
> 
> if (ssh_state_init(s, errp)) {
> return -1;
> }

Good point, not sure if a non-zero ret from libssh2_init() provides meaning
beyond 'error' for us.  Maybe return -EIO instead of -1, though?

Re: [Qemu-block] [Qemu-devel] [PATCH v2 7/7] block/curl: code cleanup to comply with coding style

2017-08-30 Thread Eric Blake

On 08/30/2017 11:57 AM, Jeff Cody wrote:
> This addresses non-functional changes to help curl.c better comply
> with the coding styles (comments, indentation, brackets, etc.).
> 
> One minor code change is the combination of two if statements into
> a single if statement.
> 
> Signed-off-by: Jeff Cody 
> ---
>  block/curl.c | 100 
> +++
>  1 file changed, 52 insertions(+), 48 deletions(-)
> 

> -#define CURL_BLOCK_OPT_URL   "url"
> -#define CURL_BLOCK_OPT_READAHEAD "readahead"
> -#define CURL_BLOCK_OPT_SSLVERIFY "sslverify"
> -#define CURL_BLOCK_OPT_TIMEOUT "timeout"
> -#define CURL_BLOCK_OPT_COOKIE"cookie"
> -#define CURL_BLOCK_OPT_COOKIE_SECRET "cookie-secret"
> -#define CURL_BLOCK_OPT_USERNAME "username"
> -#define CURL_BLOCK_OPT_PASSWORD_SECRET "password-secret"
> -#define CURL_BLOCK_OPT_PROXY_USERNAME "proxy-username"

Another change from inconsistent spacing,

> +#define CURL_BLOCK_OPT_URL   "url"
> +#define CURL_BLOCK_OPT_READAHEAD "readahead"
> +#define CURL_BLOCK_OPT_SSLVERIFY "sslverify"
> +#define CURL_BLOCK_OPT_TIMEOUT   "timeout"
> +#define CURL_BLOCK_OPT_COOKIE"cookie"
> +#define CURL_BLOCK_OPT_COOKIE_SECRET "cookie-secret"
> +#define CURL_BLOCK_OPT_USERNAME  "username"
> +#define CURL_BLOCK_OPT_PASSWORD_SECRET   "password-secret"
> +#define CURL_BLOCK_OPT_PROXY_USERNAME"proxy-username"
>  #define CURL_BLOCK_OPT_PROXY_PASSWORD_SECRET "proxy-password-secret"

to something that is consistent but requires a mass-edit if we add a
longer name.


> @@ -235,7 +236,7 @@ static size_t curl_header_cb(void *ptr, size_t size, 
> size_t nmemb, void *opaque)
>  /* Called from curl_multi_do_locked, with s->mutex held.  */
>  static size_t curl_read_cb(void *ptr, size_t size, size_t nmemb, void 
> *opaque)
>  {
> -CURLState *s = ((CURLState*)opaque);
> +CURLState *s = ((CURLState *)opaque);

Umm, this is C, so why not just:

CURLState *s = opaque;

Those are minor findings, so whether or not you address them,
Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org




Re: [Qemu-block] [Qemu-devel] [PATCH v2 6/7] block/curl: fix minor memory leaks

2017-08-30 Thread Eric Blake

On 08/30/2017 11:57 AM, Jeff Cody wrote:
> Signed-off-by: Jeff Cody 
> ---
>  block/curl.c | 6 ++
>  1 file changed, 6 insertions(+)
> 

> +++ b/block/curl.c
> @@ -857,6 +857,9 @@ out_noclean:
>  qemu_mutex_destroy(&s->mutex);
>  g_free(s->cookie);
>  g_free(s->url);
> +g_free(s->username);
> +g_free(s->proxyusername);
> +g_free(s->proxypassword);

Would it be any simpler to call curl_close(s) instead of open-coding it
here in this cleanup path?

> @@ -955,6 +958,9 @@ static void curl_close(BlockDriverState *bs)
>  
>  g_free(s->cookie);
>  g_free(s->url);
> +g_free(s->username);
> +g_free(s->proxyusername);
> +g_free(s->proxypassword);
>  }

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org




Re: [Qemu-block] [Qemu-devel] [PATCH v2 5/7] block/curl: check error return of curl_global_init()

2017-08-30 Thread Eric Blake

On 08/30/2017 11:57 AM, Jeff Cody wrote:
> If curl_global_init() fails, per the documentation no other curl
> functions may be called, so make sure to check the return value.
> 
> Also, some minor changes to the initialization latch variable 'inited':
> 
> - Make it static in the file, for clarity
> - Change the name for clarity
> - Make it a bool
> 
> Signed-off-by: Jeff Cody 
> ---
>  block/curl.c | 18 --
>  1 file changed, 12 insertions(+), 6 deletions(-)
> 
> diff --git a/block/curl.c b/block/curl.c

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org




Re: [Qemu-block] [Qemu-devel] [PATCH v2 4/7] block/sheepdog: code beautification

2017-08-30 Thread Eric Blake

On 08/30/2017 11:57 AM, Jeff Cody wrote:
> No functional changes, just whitespace manipulation.
> 
> Signed-off-by: Jeff Cody 
> ---
>  block/sheepdog.c | 162 
> +++
>  1 file changed, 81 insertions(+), 81 deletions(-)
> 

>  static BlockDriver bdrv_sheepdog = {
> -.format_name= "sheepdog",
> -.protocol_name  = "sheepdog",
> -.instance_size  = sizeof(BDRVSheepdogState),
> -.bdrv_parse_filename= sd_parse_filename,
> -.bdrv_file_open = sd_open,
> -.bdrv_reopen_prepare= sd_reopen_prepare,
> -.bdrv_reopen_commit = sd_reopen_commit,
> -.bdrv_reopen_abort  = sd_reopen_abort,
> -.bdrv_close = sd_close,
> -.bdrv_create= sd_create,
> -.bdrv_has_zero_init = bdrv_has_zero_init_1,
> -.bdrv_getlength = sd_getlength,

The existing style is indeed ugly since it has no consistency,...

> +.format_name  = "sheepdog",
> +.protocol_name= "sheepdog",
> +.instance_size= sizeof(BDRVSheepdogState),
> +.bdrv_parse_filename  = sd_parse_filename,
> +.bdrv_file_open   = sd_open,
> +.bdrv_reopen_prepare  = sd_reopen_prepare,
> +.bdrv_reopen_commit   = sd_reopen_commit,
> +.bdrv_reopen_abort= sd_reopen_abort,
> +.bdrv_close   = sd_close,
> +.bdrv_create  = sd_create,
> +.bdrv_has_zero_init   = bdrv_has_zero_init_1,
> +.bdrv_getlength   = sd_getlength,
>  .bdrv_get_allocated_file_size = sd_get_allocated_file_size,

...but aligning '=' requires mass reformatting if you get any longer
.bdrv_ function callback added down the road.  If it were me, I'd just
consistently use a single space everywhere, as in:

.format_name = "sheepdog",
.protocol_name = "sheepdog",

but that's my personal opinion, and not a hard rule, so I won't be
bothered if you don't take it.

>  static BlockDriver bdrv_sheepdog_tcp = {
> -.format_name= "sheepdog",
> -.protocol_name  = "sheepdog+tcp",
> -.instance_size  = sizeof(BDRVSheepdogState),
> -.bdrv_parse_filename= sd_parse_filename,
> +.format_name  = "sheepdog",
> +.protocol_name= "sheepdog+tcp",
> +.instance_size= sizeof(BDRVSheepdogState),
> +.bdrv_parse_filename  = sd_parse_filename,
>  .bdrv_file_open = sd_open,

See, with your style, this row now stands out as being missed; with my
style, this line needs no edits.

But your changes are indeed semantic no-ops, so whether you fix the few
unaligned lines per your style, or redo to a consistent style of only a
single space, you can add:
Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org




Re: [Qemu-block] [Qemu-devel] [PATCH v2 3/7] block/sheepdog: remove spurious NULL check

2017-08-30 Thread Eric Blake

On 08/30/2017 11:57 AM, Jeff Cody wrote:
> 'tag' is already checked in the lines immediately preceding this check,
> and set to non-NULL if NULL.  No need to check again, it hasn't changed.
> 
> Signed-off-by: Jeff Cody 
> ---
>  block/sheepdog.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org




Re: [Qemu-block] [Qemu-devel] [PATCH v2 2/7] block/ssh: make compliant with coding guidelines

2017-08-30 Thread Eric Blake

On 08/30/2017 11:56 AM, Jeff Cody wrote:
> Signed-off-by: Jeff Cody 
> ---
>  block/ssh.c | 32 ++--
>  1 file changed, 18 insertions(+), 14 deletions(-)

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org




Re: [Qemu-block] [Qemu-devel] [PATCH v2 1/7] block/ssh: don't call libssh2_init() in block_init()

2017-08-30 Thread Eric Blake

On 08/30/2017 11:56 AM, Jeff Cody wrote:
> We don't need libssh2 failure to be fatal (we could just opt to not
> register the driver on failure). But, it is probably a good idea to
> avoid external library calls during the block_init(), and call the
> libssh2 global init function on the first usage, returning any errors.
> 
> Signed-off-by: Jeff Cody 
> ---
>  block/ssh.c | 40 +---
>  1 file changed, 29 insertions(+), 11 deletions(-)
> 

> +static int ssh_state_init(BDRVSSHState *s, Error **errp)
>  {
> +int ret;
> +
> +if (!ssh_libinit_called) {
> +ret = libssh2_init(0);
> +if (ret) {
> +error_setg(errp, "libssh2 initialization failed with %d", ret);
> +return ret;

Do we know if this number is always positive or negative?

> @@ -772,8 +788,13 @@ static int ssh_file_open(BlockDriverState *bs, QDict 
> *options, int bdrv_flags,
>  BDRVSSHState *s = bs->opaque;
>  int ret;
>  int ssh_flags;
> +Error *local_err = NULL;
>  
> -ssh_state_init(s);
> +ret = ssh_state_init(s, &local_err);
> +if (local_err) {
> +error_propagate(errp, local_err);
> +return ret;

Is returning 'ret' from libssh2_init() wise?

If 'ret' is not important, you could simplify this:

if (ssh_state_init(s, errp)) {
return -1;
}

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org




Re: [Qemu-block] [Qemu-devel] [PATCH v2 0/3] Live block optional disable

2017-08-30 Thread Eric Blake

On 08/30/2017 12:01 PM, Jeff Cody wrote:
> This series adds a configurable option to disable live block operations.
> 
> The default is that live block operations are 'enabled'.
> 
> Jeffrey Cody (3):
>   configure: Add option in configure to disable live block ops
>   block-jobs: Optionally unregister live block operations
>   hmp: Optionally disable live block operations in HMP monitor

Series:
Reviewed-by: Eric Blake 

However, per the discussion on 2/3, you probably want a v2 of this
series after we land conditional QAPI support first.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org




Re: [Qemu-block] [Qemu-devel] [PATCH v2 2/3] block-jobs: Optionally unregister live block operations

2017-08-30 Thread Eric Blake

On 08/30/2017 12:24 PM, Eduardo Habkost wrote:
> On Wed, Aug 30, 2017 at 01:01:41PM -0400, Jeff Cody wrote:
>> From: Jeffrey Cody 
>>
>> If configured without live block operations enabled, unregister the
>> live block operation commands.
>>
>> Signed-off-by: Jeff Cody 
>> ---
>>  monitor.c | 16 
>>  1 file changed, 16 insertions(+)
>>

> 
> I suggest using the new mechanisms added by:
> 
>   [PATCH 00/26] qapi: add #if pre-processor conditions to generated code

Those haven't landed yet, but as both series are proposed for 2.11, I
indeed agree that basing this series on top of that one will be a bit
cleaner.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org




Re: [Qemu-block] [Qemu-devel] [PATCH v2 0/3] Live block optional disable

2017-08-30 Thread Eric Blake

On 08/30/2017 12:01 PM, Jeff Cody wrote:
> This series adds a configurable option to disable live block operations.
> 
> The default is that live block operations are 'enabled'.
> 
> Jeffrey Cody (3):
>   configure: Add option in configure to disable live block ops
>   block-jobs: Optionally unregister live block operations
>   hmp: Optionally disable live block operations in HMP monitor

That doesn't match the spelling you've used on other contributions (see:
 git shortlog --author=Cody | grep -v "^ "
for a demonstration); is it intentional, and do you want a .mailmap
entry to merge your contributions under a preferred name?

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org




Re: [Qemu-block] [PATCH v3 5/5] qemu-iotests: add option to save temp files on error

2017-08-30 Thread Eric Blake

On 08/30/2017 11:52 AM, Jeff Cody wrote:
> Now that ./check takes care of cleaning up after each tests, it
> can also selectively not clean up.  Add option to leave all output from
> tests intact if that test encountered an error.
> 
> Note: this currently only works for bash tests, as the python tests
> still clean up after themselves manually.
> 
> Signed-off-by: Jeff Cody 
> ---
>  tests/qemu-iotests/check  | 10 +-
>  tests/qemu-iotests/common |  6 ++
>  2 files changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/tests/qemu-iotests/check b/tests/qemu-iotests/check
> index f6ca85d..8a5fc0d 100755
> --- a/tests/qemu-iotests/check
> +++ b/tests/qemu-iotests/check
> @@ -370,7 +370,15 @@ do
>  fi
>  fi
>  
> -rm -rf "$TEST_DIR_SEQ"
> +#TODO: There is some intial work to save intermediate files

s/intial/initial/

> +#  in python tests, but it is imperfect.  Having each
> +#  test record its test name, and the tearDown function
> +#  just move intermediate images to a subdirectory with
> +#  the test name may prove more useful.

Comment works for me, and I'm fine with the idea you present here being
in a followup patch.  So with the typo fixed,
Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org




Re: [Qemu-block] [PATCH v3 4/5] qemu-iotests: make python tests attempt to leave intermediate files

2017-08-30 Thread Eric Blake

On 08/30/2017 11:52 AM, Jeff Cody wrote:
> Now that 'check' will clean up after tests, try and make python
> tests leave intermediate files so that they might be inspectable
> on failure.
> 
> This isn't perfect; the python unittest framework runs multiple
> tests, even if previous tests failed.  So we need to make sure that
> each test still begins with a "clean" slate, to prevent false
> positives or tainted test runs.
> 
> Rather than delete images in the unittest tearDown, invert this
> and delete images to be used in that test at the beginning of the
> setUp.  This is to make sure that the test run is not inadvertently
> using file droppings from previous runs.  We must use 'blind_remove'
> then for these, as the files might not exist yet, but we don't want
> to throw an error for that.
> 
> Signed-off-by: Jeff Cody 
> ---

> +++ b/tests/qemu-iotests/030
> @@ -21,7 +21,7 @@
>  import time
>  import os
>  import iotests
> -from iotests import qemu_img, qemu_io
> +from iotests import qemu_img, qemu_io, blind_remove
>  
>  backing_img = os.path.join(iotests.test_dir, 'backing.img')
>  mid_img = os.path.join(iotests.test_dir, 'mid.img')
> @@ -31,6 +31,9 @@ class TestSingleDrive(iotests.QMPTestCase):
>  image_len = 1 * 1024 * 1024 # MB
>  
>  def setUp(self):
> +blind_remove(test_img)
> +blind_remove(mid_img)
> +blind_remove(backing_img)

Would it be any more pythonic to have support for:

blind_remove(test_img, mid_img, backing_img)

built into the previous patch?
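
As a rough illustration of the suggestion above, blind_remove could accept any number of paths. This is only a sketch: the variadic signature is an assumption, not what patch 3/5 actually defines, and the scratch file names are made up for the example.

```python
import errno
import os
import tempfile


def blind_remove(*filenames):
    """Remove each named file, ignoring only 'file does not exist'."""
    for filename in filenames:
        try:
            os.remove(filename)
        except OSError as error:
            # Anything other than ENOENT (permissions, path is a
            # directory, ...) is still reported to the caller.
            if error.errno != errno.ENOENT:
                raise


# Usage: one file that exists and one path that was never created.
scratch_dir = tempfile.mkdtemp()
existing = os.path.join(scratch_dir, 'test.img')
missing = os.path.join(scratch_dir, 'mid.img')
open(existing, 'w').close()

blind_remove(existing, missing)
os.rmdir(scratch_dir)
```

With this shape, a caller holding a list can unpack it, as in blind_remove(*images), so no explicit loop is needed at the call site.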

>  def tearDown(self):
>  self.vm.shutdown()
> -os.remove(self.test_img)
> -os.remove(self.mid_img_abs)
> -os.remove(self.backing_img_abs)
> -try:
> -os.rmdir(os.path.join(iotests.test_dir, self.dir1))
> -os.rmdir(os.path.join(iotests.test_dir, self.dir3))
> -os.rmdir(os.path.join(iotests.test_dir, self.dir2))
> -except OSError as exception:
> -if exception.errno != errno.EEXIST and exception.errno != 
> errno.ENOTEMPTY:
> -raise

The code removed here is using a syntax that differs from what you used
in 3/5 when defining blind_remove; does that matter for 3/5?

> +++ b/tests/qemu-iotests/041

> +blind_remove(target_img)
>  iotests.create_image(backing_img, self.image_len)
>  qemu_img('create', '-f', iotests.imgfmt, '-o', 'backing_file=%s' % 
> backing_img, test_img)
>  self.vm = iotests.VM().add_drive(test_img, 
> "node-name=top,backing.node-name=base")
> @@ -49,12 +52,6 @@ class TestSingleDrive(iotests.QMPTestCase):
>  
>  def tearDown(self):
>  self.vm.shutdown()
> -os.remove(test_img)
> -os.remove(backing_img)
> -try:
> -os.remove(target_img)
> -except OSError:
> -pass

You're changing failures other than ENOENT from ignored to explicit -
nice little bug-fix along the way :)  I notice this pattern in multiple
tests; is it worth mentioning in the commit message as intentional?

> @@ -797,6 +788,9 @@ class TestRepairQuorum(iotests.QMPTestCase):
>  IMAGES = [ quorum_img1, quorum_img2, quorum_img3 ]
>  
>  def setUp(self):
> +for i in self.IMAGES + [ quorum_repair_img, quorum_snapshot_file ]:
> +blind_remove(i)

Again, would it be more pythonic if blind_remove() could take a list and
automatically work on each element of the list, rather than having to
make the caller iterate?

> +++ b/tests/qemu-iotests/057
> @@ -23,7 +23,7 @@
>  import time
>  import os
>  import iotests
> -from iotests import qemu_img, qemu_io
> +from iotests import qemu_img, qemu_io, blind_remove
>  
>  test_drv_base_name = 'drive'
>  
> @@ -36,6 +36,8 @@ class ImageSnapshotTestCase(iotests.QMPTestCase):
>  
>  def _setUp(self, test_img_base_name, image_num):
>  self.vm = iotests.VM()
> +for dev_expect in self.expect:
> +blind_remove(dev_expect['image'])

Another place where python magic could make the caller nicer?

> +++ b/tests/qemu-iotests/118

> @@ -411,16 +411,16 @@ class TestFloppyInitiallyEmpty(TestInitiallyEmpty):
>  
>  class TestChangeReadOnly(ChangeBaseClass):
>  def setUp(self):
> -qemu_img('create', '-f', iotests.imgfmt, old_img, '1440k')
> -qemu_img('create', '-f', iotests.imgfmt, new_img, '1440k')
> -self.vm = iotests.VM()
> -
> -def tearDown(self):
> -self.vm.shutdown()
>  os.chmod(old_img, 0666)
>  os.chmod(new_img, 0666)
> -os.remove(old_img)
> -os.remove(new_img)
> +blind_remove(old_img)
> +blind_remove(new_img)
> +qemu_img('create', '-f', iotests.imgfmt, old_img, '1440k')
> +qemu_img('create', '-f', iotests.imgfmt, new_img, '1440k')
> +self.vm = iotests.VM()
> +
> +def tearDown(self):
> +self.vm.shutdown()

The script framework doesn't have any problem removing left-over
read-only files, correct?  (If it does, then earlier in the

Re: [Qemu-block] [PATCH v3 3/5] qemu-iotests: add 'blind_remove' for python tests

2017-08-30 Thread Eric Blake

On 08/30/2017 11:52 AM, Jeff Cody wrote:
> Add a function to attempt to 'blindly' remove a file, without
> throwing an error if the file doesn't exist.
> 
> Signed-off-by: Jeff Cody 
> ---
>  tests/qemu-iotests/iotests.py | 7 +++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
> index 7233983..a2088c7 100644
> --- a/tests/qemu-iotests/iotests.py
> +++ b/tests/qemu-iotests/iotests.py
> @@ -57,6 +57,13 @@ qemu_default_machine = 
> os.environ.get('QEMU_DEFAULT_MACHINE')
>  socket_scm_helper = os.environ.get('SOCKET_SCM_HELPER', 'socket_scm_helper')
>  debug = False
>  
> +def blind_remove(filename):
> +try:
> +os.remove(filename)
> +except OSError, error:

I'm assuming this works for both python 2 and 3?

> +if error.errno != errno.ENOENT:
> +raise
> +
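
On the Python 2 versus 3 question: the comma form "except OSError, error:" is Python 2 syntax, and Python 3 does not accept it with this meaning. The "as" form is understood by Python 2.6 and later as well as by Python 3. A minimal sketch of the same helper written that way follows; the temporary file is only there to exercise it.

```python
import errno
import os
import tempfile


def blind_remove(filename):
    try:
        os.remove(filename)
    except OSError as error:  # 'as' form: Python 2.6+ and Python 3
        if error.errno != errno.ENOENT:
            raise


# Removing the same path twice: the second call hits ENOENT and is silent.
handle, path = tempfile.mkstemp()
os.close(handle)
blind_remove(path)
blind_remove(path)
```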

Weak, since I'm not the strongest at python, but you can add:
Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature

[Qemu-block] [PULL 11/11] block/nbd-client: refactor request send/receive

2017-08-30 Thread Eric Blake

From: Vladimir Sementsov-Ogievskiy 

Add nbd_co_request to remove code duplication in the
nbd_client_co_{pwrite,pread,...} functions. This is also
needed for further refactoring.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Message-Id: <20170804151440.320927-8-vsement...@virtuozzo.com>
[eblake: make nbd_co_request a wrapper, rather than merging two
existing functions]
Signed-off-by: Eric Blake 
---
 block/nbd-client.c | 73 +++---
 1 file changed, 26 insertions(+), 47 deletions(-)

diff --git a/block/nbd-client.c b/block/nbd-client.c
index 322b725ff9..f0dbea24d3 100644
--- a/block/nbd-client.c
+++ b/block/nbd-client.c
@@ -220,28 +220,40 @@ static void nbd_co_receive_reply(NBDClientSession *s,
 qemu_co_mutex_unlock(&s->send_mutex);
 }

+static int nbd_co_request(BlockDriverState *bs,
+  NBDRequest *request,
+  QEMUIOVector *qiov)
+{
+NBDClientSession *client = nbd_get_client_session(bs);
+NBDReply reply;
+int ret;
+
+assert(!qiov || request->type == NBD_CMD_WRITE ||
+   request->type == NBD_CMD_READ);
+ret = nbd_co_send_request(bs, request,
+  request->type == NBD_CMD_WRITE ? qiov : NULL);
+if (ret < 0) {
+reply.error = -ret;
+} else {
+nbd_co_receive_reply(client, request, &reply,
+ request->type == NBD_CMD_READ ? qiov : NULL);
+}
+return -reply.error;
+}
+
 int nbd_client_co_preadv(BlockDriverState *bs, uint64_t offset,
  uint64_t bytes, QEMUIOVector *qiov, int flags)
 {
-NBDClientSession *client = nbd_get_client_session(bs);
 NBDRequest request = {
 .type = NBD_CMD_READ,
 .from = offset,
 .len = bytes,
 };
-NBDReply reply;
-int ret;

 assert(bytes <= NBD_MAX_BUFFER_SIZE);
 assert(!flags);

-ret = nbd_co_send_request(bs, &request, NULL);
-if (ret < 0) {
-reply.error = -ret;
-} else {
-nbd_co_receive_reply(client, &request, &reply, qiov);
-}
-return -reply.error;
+return nbd_co_request(bs, &request, qiov);
 }

 int nbd_client_co_pwritev(BlockDriverState *bs, uint64_t offset,
@@ -253,8 +265,6 @@ int nbd_client_co_pwritev(BlockDriverState *bs, uint64_t 
offset,
 .from = offset,
 .len = bytes,
 };
-NBDReply reply;
-int ret;

 if (flags & BDRV_REQ_FUA) {
 assert(client->info.flags & NBD_FLAG_SEND_FUA);
@@ -263,26 +273,18 @@ int nbd_client_co_pwritev(BlockDriverState *bs, uint64_t 
offset,

 assert(bytes <= NBD_MAX_BUFFER_SIZE);

-ret = nbd_co_send_request(bs, &request, qiov);
-if (ret < 0) {
-reply.error = -ret;
-} else {
-nbd_co_receive_reply(client, &request, &reply, NULL);
-}
-return -reply.error;
+return nbd_co_request(bs, &request, qiov);
 }

 int nbd_client_co_pwrite_zeroes(BlockDriverState *bs, int64_t offset,
 int bytes, BdrvRequestFlags flags)
 {
-int ret;
 NBDClientSession *client = nbd_get_client_session(bs);
 NBDRequest request = {
 .type = NBD_CMD_WRITE_ZEROES,
 .from = offset,
 .len = bytes,
 };
-NBDReply reply;

 if (!(client->info.flags & NBD_FLAG_SEND_WRITE_ZEROES)) {
 return -ENOTSUP;
@@ -296,21 +298,13 @@ int nbd_client_co_pwrite_zeroes(BlockDriverState *bs, 
int64_t offset,
 request.flags |= NBD_CMD_FLAG_NO_HOLE;
 }

-ret = nbd_co_send_request(bs, &request, NULL);
-if (ret < 0) {
-reply.error = -ret;
-} else {
-nbd_co_receive_reply(client, &request, &reply, NULL);
-}
-return -reply.error;
+return nbd_co_request(bs, &request, NULL);
 }

 int nbd_client_co_flush(BlockDriverState *bs)
 {
 NBDClientSession *client = nbd_get_client_session(bs);
 NBDRequest request = { .type = NBD_CMD_FLUSH };
-NBDReply reply;
-int ret;

 if (!(client->info.flags & NBD_FLAG_SEND_FLUSH)) {
 return 0;
@@ -319,13 +313,7 @@ int nbd_client_co_flush(BlockDriverState *bs)
 request.from = 0;
 request.len = 0;

-ret = nbd_co_send_request(bs, &request, NULL);
-if (ret < 0) {
-reply.error = -ret;
-} else {
-nbd_co_receive_reply(client, &request, &reply, NULL);
-}
-return -reply.error;
+return nbd_co_request(bs, &request, NULL);
 }

 int nbd_client_co_pdiscard(BlockDriverState *bs, int64_t offset, int bytes)
@@ -336,21 +324,12 @@ int nbd_client_co_pdiscard(BlockDriverState *bs, int64_t 
offset, int bytes)
 .from = offset,
 .len = bytes,
 };
-NBDReply reply;
-int ret;

 if (!(client->info.flags & NBD_FLAG_SEND_TRIM)) {
 return 0;
 }

-ret = nbd_co_send_request(bs, &request, NULL);
-if (ret < 0) {
-reply.error = -ret;
-} else {
-nbd_co_receive_reply(client, &request, &reply, NULL);
-}
-return -reply.error;
-
+return nbd_co_req

[Qemu-block] [PULL 10/11] block/nbd-client: rename nbd_recv_coroutines_enter_all

2017-08-30 Thread Eric Blake

From: Vladimir Sementsov-Ogievskiy 

Rename nbd_recv_coroutines_enter_all to nbd_recv_coroutines_wake_all,
as it most probably just adds all recv coroutines into co_queue_wakeup,
rather than directly entering them.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Message-Id: <20170804151440.320927-9-vsement...@virtuozzo.com>
[eblake: tweak commit message]
Signed-off-by: Eric Blake 
---
 block/nbd-client.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/nbd-client.c b/block/nbd-client.c
index 1e393cf26f..322b725ff9 100644
--- a/block/nbd-client.c
+++ b/block/nbd-client.c
@@ -34,7 +34,7 @@
 #define HANDLE_TO_INDEX(bs, handle) ((handle) ^ ((uint64_t)(intptr_t)bs))
 #define INDEX_TO_HANDLE(bs, index)  ((index)  ^ ((uint64_t)(intptr_t)bs))

-static void nbd_recv_coroutines_enter_all(NBDClientSession *s)
+static void nbd_recv_coroutines_wake_all(NBDClientSession *s)
 {
 int i;

@@ -112,7 +112,7 @@ static coroutine_fn void nbd_read_reply_entry(void *opaque)
 }

 s->quit = true;
-nbd_recv_coroutines_enter_all(s);
+nbd_recv_coroutines_wake_all(s);
 s->read_reply_co = NULL;
 }

-- 
2.13.5

[Qemu-block] [PULL 06/11] nbd/client: refactor nbd_read_eof

2017-08-30 Thread Eric Blake

From: Vladimir Sementsov-Ogievskiy 

Refactor nbd_read_eof to return 1 on success, 0 on EOF when no data
was read, and <0 in all other cases, because the returned size of the
read data is not actually used.
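
The three-way return convention described above can be modelled outside of C. The following is a Python sketch only, not the QEMU code: read_exact and its recv callable are hypothetical names, and the short-read case maps to -EINVAL to mirror the "End of file" branch in the diff below.

```python
import errno
import io


def read_exact(recv, size):
    """Return (status, data): status is 1 on success, 0 on EOF before any
    byte arrived, and a negative errno value otherwise."""
    assert size > 0
    chunks = []
    remaining = size
    while remaining:
        try:
            chunk = recv(remaining)
        except OSError as error:
            return -(error.errno or errno.EIO), b''
        if not chunk:
            break  # the peer closed the stream
        chunks.append(chunk)
        remaining -= len(chunk)
    data = b''.join(chunks)
    if not data:
        return 0, data              # clean EOF, nothing was read
    if len(data) != size:
        return -errno.EINVAL, data  # EOF in the middle of the read
    return 1, data


full = read_exact(io.BytesIO(b'abcdefgh').read, 8)
short = read_exact(io.BytesIO(b'abcd').read, 8)
empty = read_exact(io.BytesIO(b'').read, 8)
```

Callers then only need to compare the status against 0 and never against the requested size, which is the simplification the patch is after.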

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Message-Id: <20170804151440.320927-3-vsement...@virtuozzo.com>
[eblake: tweak function comments, rebase to test 083 enhancements]
Signed-off-by: Eric Blake 
---
 nbd/nbd-internal.h | 33 -
 nbd/client.c   |  5 -
 tests/qemu-iotests/083.out |  8 
 3 files changed, 28 insertions(+), 18 deletions(-)

diff --git a/nbd/nbd-internal.h b/nbd/nbd-internal.h
index 396ddb4d3e..03549e3f39 100644
--- a/nbd/nbd-internal.h
+++ b/nbd/nbd-internal.h
@@ -77,21 +77,36 @@
 #define NBD_ESHUTDOWN  108

 /* nbd_read_eof
- * Tries to read @size bytes from @ioc. Returns number of bytes actually read.
- * May return a value >= 0 and < size only on EOF, i.e. when iteratively called
- * qio_channel_readv() returns 0. So, there is no need to call nbd_read_eof
- * iteratively.
+ * Tries to read @size bytes from @ioc.
+ * Returns 1 on success
+ * 0 on eof, when no data was read (errp is not set)
+ * negative errno on failure (errp is set)
  */
-static inline ssize_t nbd_read_eof(QIOChannel *ioc, void *buffer, size_t size,
-   Error **errp)
+static inline int nbd_read_eof(QIOChannel *ioc, void *buffer, size_t size,
+   Error **errp)
 {
 struct iovec iov = { .iov_base = buffer, .iov_len = size };
+ssize_t ret;
+
 /* Sockets are kept in blocking mode in the negotiation phase.  After
  * that, a non-readable socket simply means that another thread stole
  * our request/reply.  Synchronization is done with recv_coroutine, so
  * that this is coroutine-safe.
  */
-return nbd_rwv(ioc, &iov, 1, size, true, errp);
+
+assert(size);
+
+ret = nbd_rwv(ioc, &iov, 1, size, true, errp);
+if (ret <= 0) {
+return ret;
+}
+
+if (ret != size) {
+error_setg(errp, "End of file");
+return -EINVAL;
+}
+
+return 1;
 }

 /* nbd_read
@@ -100,9 +115,9 @@ static inline ssize_t nbd_read_eof(QIOChannel *ioc, void 
*buffer, size_t size,
 static inline int nbd_read(QIOChannel *ioc, void *buffer, size_t size,
Error **errp)
 {
-ssize_t ret = nbd_read_eof(ioc, buffer, size, errp);
+int ret = nbd_read_eof(ioc, buffer, size, errp);

-if (ret >= 0 && ret != size) {
+if (ret == 0) {
 ret = -EINVAL;
 error_setg(errp, "End of file");
 }
diff --git a/nbd/client.c b/nbd/client.c
index f1c16b588f..4556056daa 100644
--- a/nbd/client.c
+++ b/nbd/client.c
@@ -925,11 +925,6 @@ ssize_t nbd_receive_reply(QIOChannel *ioc, NBDReply 
*reply, Error **errp)
 return ret;
 }

-if (ret != sizeof(buf)) {
-error_setg(errp, "read failed");
-return -EINVAL;
-}
-
 /* Reply
[ 0 ..  3]magic   (NBD_REPLY_MAGIC)
[ 4 ..  7]error   (0 == no error)
diff --git a/tests/qemu-iotests/083.out b/tests/qemu-iotests/083.out
index a7fb081889..fb71b6f8ad 100644
--- a/tests/qemu-iotests/083.out
+++ b/tests/qemu-iotests/083.out
@@ -69,12 +69,12 @@ read failed: Input/output error

 === Check disconnect 4 reply ===

-read failed
+End of file
 read failed: Input/output error

 === Check disconnect 8 reply ===

-read failed
+End of file
 read failed: Input/output error

 === Check disconnect before data ===
@@ -180,12 +180,12 @@ read failed: Input/output error

 === Check disconnect 4 reply ===

-read failed
+End of file
 read failed: Input/output error

 === Check disconnect 8 reply ===

-read failed
+End of file
 read failed: Input/output error

 === Check disconnect before data ===
-- 
2.13.5

[Qemu-block] [PULL 08/11] nbd/client: fix nbd_send_request to return int

2017-08-30 Thread Eric Blake

From: Vladimir Sementsov-Ogievskiy 

Fix nbd_send_request to return int, as it returns the return value
of nbd_write (which is int), and the only user of nbd_send_request's
return value (nbd_co_send_request) treats it as int too.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Message-Id: <20170804151440.320927-5-vsement...@virtuozzo.com>
Signed-off-by: Eric Blake 
---
 include/block/nbd.h | 2 +-
 nbd/client.c| 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/block/nbd.h b/include/block/nbd.h
index f7450608b4..040cdd2e60 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -163,7 +163,7 @@ int nbd_receive_negotiate(QIOChannel *ioc, const char *name,
   Error **errp);
 int nbd_init(int fd, QIOChannelSocket *sioc, NBDExportInfo *info,
  Error **errp);
-ssize_t nbd_send_request(QIOChannel *ioc, NBDRequest *request);
+int nbd_send_request(QIOChannel *ioc, NBDRequest *request);
 int nbd_receive_reply(QIOChannel *ioc, NBDReply *reply, Error **errp);
 int nbd_client(int fd);
 int nbd_disconnect(int fd);
diff --git a/nbd/client.c b/nbd/client.c
index f8c213bc96..68a0bc1ffc 100644
--- a/nbd/client.c
+++ b/nbd/client.c
@@ -896,7 +896,7 @@ int nbd_disconnect(int fd)
 }
 #endif

-ssize_t nbd_send_request(QIOChannel *ioc, NBDRequest *request)
+int nbd_send_request(QIOChannel *ioc, NBDRequest *request)
 {
 uint8_t buf[NBD_REQUEST_SIZE];

-- 
2.13.5

[Qemu-block] [PULL 09/11] block/nbd-client: get rid of ssize_t

2017-08-30 Thread Eric Blake

From: Vladimir Sementsov-Ogievskiy 

Use int variable for nbd_co_send_request return value (as
nbd_co_send_request returns int).

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Message-Id: <20170804151440.320927-6-vsement...@virtuozzo.com>
Signed-off-by: Eric Blake 
---
 block/nbd-client.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/block/nbd-client.c b/block/nbd-client.c
index ea728fffc8..1e393cf26f 100644
--- a/block/nbd-client.c
+++ b/block/nbd-client.c
@@ -230,7 +230,7 @@ int nbd_client_co_preadv(BlockDriverState *bs, uint64_t 
offset,
 .len = bytes,
 };
 NBDReply reply;
-ssize_t ret;
+int ret;

 assert(bytes <= NBD_MAX_BUFFER_SIZE);
 assert(!flags);
@@ -254,7 +254,7 @@ int nbd_client_co_pwritev(BlockDriverState *bs, uint64_t 
offset,
 .len = bytes,
 };
 NBDReply reply;
-ssize_t ret;
+int ret;

 if (flags & BDRV_REQ_FUA) {
 assert(client->info.flags & NBD_FLAG_SEND_FUA);
@@ -275,7 +275,7 @@ int nbd_client_co_pwritev(BlockDriverState *bs, uint64_t 
offset,
 int nbd_client_co_pwrite_zeroes(BlockDriverState *bs, int64_t offset,
 int bytes, BdrvRequestFlags flags)
 {
-ssize_t ret;
+int ret;
 NBDClientSession *client = nbd_get_client_session(bs);
 NBDRequest request = {
 .type = NBD_CMD_WRITE_ZEROES,
@@ -310,7 +310,7 @@ int nbd_client_co_flush(BlockDriverState *bs)
 NBDClientSession *client = nbd_get_client_session(bs);
 NBDRequest request = { .type = NBD_CMD_FLUSH };
 NBDReply reply;
-ssize_t ret;
+int ret;

 if (!(client->info.flags & NBD_FLAG_SEND_FLUSH)) {
 return 0;
@@ -337,7 +337,7 @@ int nbd_client_co_pdiscard(BlockDriverState *bs, int64_t 
offset, int bytes)
 .len = bytes,
 };
 NBDReply reply;
-ssize_t ret;
+int ret;

 if (!(client->info.flags & NBD_FLAG_SEND_TRIM)) {
 return 0;
-- 
2.13.5

[Qemu-block] [PULL 03/11] qemu-iotests: improve nbd-fault-injector.py startup protocol

2017-08-30 Thread Eric Blake

From: Stefan Hajnoczi 

Currently 083 waits for the nbd-fault-injector.py server to start up by
looping until netstat shows the TCP listen socket.

The startup protocol can be simplified by passing a 0 port number to
nbd-fault-injector.py.  The kernel will allocate a port in bind(2) and
the final port number can be printed by nbd-fault-injector.py.

This should make it slightly nicer and less TCP-specific to wait for
server startup.  This patch changes nbd-fault-injector.py, the next one
will rewrite server startup in 083.
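
The port-0 startup protocol described above can be sketched roughly as follows. This is only an illustration in Python 3 (the script in this patch uses Python 2 syntax), and the function name `listen_on_any_port` is a hypothetical helper, not part of nbd-fault-injector.py.

```python
import socket
import sys


def listen_on_any_port(host="127.0.0.1"):
    """Bind to port 0 so the kernel picks a free port, then report it."""
    sock = socket.socket()
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, 0))  # port 0: the kernel allocates a free port in bind(2)
    sock.listen(0)
    # getsockname() now returns the final (host, port) pair
    addr = "%s:%d" % sock.getsockname()
    print("Listening on %s" % addr)
    sys.stdout.flush()  # a waiting process may be reading this line
    return sock, addr
```

A caller that starts the server in the background can then wait for the "Listening on" line instead of polling netstat.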

Reviewed-by: Eric Blake 
Signed-off-by: Stefan Hajnoczi 
Message-Id: <20170829122745.14309-3-stefa...@redhat.com>
Signed-off-by: Eric Blake 
---
 tests/qemu-iotests/nbd-fault-injector.py | 4 
 1 file changed, 4 insertions(+)

diff --git a/tests/qemu-iotests/nbd-fault-injector.py b/tests/qemu-iotests/nbd-fault-injector.py
index 6c07191a5a..1c10dcb51c 100755
--- a/tests/qemu-iotests/nbd-fault-injector.py
+++ b/tests/qemu-iotests/nbd-fault-injector.py
@@ -235,11 +235,15 @@ def open_socket(path):
 sock = socket.socket()
 sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
 sock.bind((host, int(port)))
+
+# If given port was 0 the final port number is now available
+path = '%s:%d' % sock.getsockname()
 else:
 sock = socket.socket(socket.AF_UNIX)
 sock.bind(path)
 sock.listen(0)
 print 'Listening on %s' % path
+sys.stdout.flush() # another process may be waiting, show message now
 return sock

 def usage(args):
-- 
2.13.5

[Qemu-block] [PULL 07/11] nbd/client: refactor nbd_receive_reply

2017-08-30 Thread Eric Blake

From: Vladimir Sementsov-Ogievskiy 

Refactor nbd_receive_reply to return 1 on success, 0 on EOF when no
data was read, and <0 in all other cases, because the returned size of
the read data is not actually used.
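
As a rough illustration of this three-way return convention, here is a minimal Python sketch. It is not the QEMU implementation: `read_exact_or_eof` is a hypothetical name, and using EIO for a short read is an assumption made only for the example.

```python
import errno
import io


def read_exact_or_eof(stream, size):
    """Return (1, data) on success, (0, b"") on clean EOF, (-errno, b"") otherwise."""
    data = b""
    while len(data) < size:
        chunk = stream.read(size - len(data))
        if not chunk:
            break  # end of stream
        data += chunk
    if len(data) == size:
        return 1, data  # full header read
    if not data:
        return 0, b""  # EOF before any byte arrived: not an error
    return -errno.EIO, b""  # EOF in the middle of a header (assumed error code)
```

The caller only has to distinguish "got a reply", "peer went away cleanly" and "failure", so the byte count itself carries no extra information.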

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Message-Id: <20170804151440.320927-4-vsement...@virtuozzo.com>
[eblake: tweak function comments]
Signed-off-by: Eric Blake 
---
 include/block/nbd.h |  2 +-
 nbd/client.c| 12 +---
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/include/block/nbd.h b/include/block/nbd.h
index 9c3d0a5868..f7450608b4 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -164,7 +164,7 @@ int nbd_receive_negotiate(QIOChannel *ioc, const char *name,
 int nbd_init(int fd, QIOChannelSocket *sioc, NBDExportInfo *info,
  Error **errp);
 ssize_t nbd_send_request(QIOChannel *ioc, NBDRequest *request);
-ssize_t nbd_receive_reply(QIOChannel *ioc, NBDReply *reply, Error **errp);
+int nbd_receive_reply(QIOChannel *ioc, NBDReply *reply, Error **errp);
 int nbd_client(int fd);
 int nbd_disconnect(int fd);

diff --git a/nbd/client.c b/nbd/client.c
index 4556056daa..f8c213bc96 100644
--- a/nbd/client.c
+++ b/nbd/client.c
@@ -914,11 +914,16 @@ ssize_t nbd_send_request(QIOChannel *ioc, NBDRequest *request)
 return nbd_write(ioc, buf, sizeof(buf), NULL);
 }

-ssize_t nbd_receive_reply(QIOChannel *ioc, NBDReply *reply, Error **errp)
+/* nbd_receive_reply
+ * Returns 1 on success
+ * 0 on eof, when no data was read (errp is not set)
+ * negative errno on failure (errp is set)
+ */
+int nbd_receive_reply(QIOChannel *ioc, NBDReply *reply, Error **errp)
 {
 uint8_t buf[NBD_REPLY_SIZE];
 uint32_t magic;
-ssize_t ret;
+int ret;

 ret = nbd_read_eof(ioc, buf, sizeof(buf), errp);
 if (ret <= 0) {
@@ -948,6 +953,7 @@ ssize_t nbd_receive_reply(QIOChannel *ioc, NBDReply *reply, Error **errp)
 error_setg(errp, "invalid magic (got 0x%" PRIx32 ")", magic);
 return -EINVAL;
 }
-return sizeof(buf);
+
+return 1;
 }

-- 
2.13.5

[Qemu-block] [PULL 02/11] nbd-client: avoid read_reply_co entry if send failed

2017-08-30 Thread Eric Blake

From: Stefan Hajnoczi 

The following segfault is encountered if the NBD server closes the UNIX
domain socket immediately after negotiation:

  Program terminated with signal SIGSEGV, Segmentation fault.
  #0  aio_co_schedule (ctx=0x0, co=0xd3c0ff2ef0) at util/async.c:441
  441   QSLIST_INSERT_HEAD_ATOMIC(&ctx->scheduled_coroutines,
  (gdb) bt
  #0  0x00d3c01a50f8 in aio_co_schedule (ctx=0x0, co=0xd3c0ff2ef0) at util/async.c:441
  #1  0x00d3c012fa90 in nbd_coroutine_end (bs=bs@entry=0xd3c0fec650, request=) at block/nbd-client.c:207
  #2  0x00d3c012fb58 in nbd_client_co_preadv (bs=0xd3c0fec650, offset=0, bytes=, qiov=0x7ffc10a91b20, flags=0) at block/nbd-client.c:237
  #3  0x00d3c0128e63 in bdrv_driver_preadv (bs=bs@entry=0xd3c0fec650, offset=offset@entry=0, bytes=bytes@entry=512, qiov=qiov@entry=0x7ffc10a91b20, flags=0) at block/io.c:836
  #4  0x00d3c012c3e0 in bdrv_aligned_preadv (child=child@entry=0xd3c0ff51d0, req=req@entry=0x7f31885d6e90, offset=offset@entry=0, bytes=bytes@entry=512, align=align@entry=1, qiov=qiov@entry=0x7ffc10a91b20, flags=0) at block/io.c:1086
  #5  0x00d3c012c6b8 in bdrv_co_preadv (child=0xd3c0ff51d0, offset=offset@entry=0, bytes=bytes@entry=512, qiov=qiov@entry=0x7ffc10a91b20, flags=flags@entry=0) at block/io.c:1182
  #6  0x00d3c011cc17 in blk_co_preadv (blk=0xd3c0ff4f80, offset=0, bytes=512, qiov=0x7ffc10a91b20, flags=0) at block/block-backend.c:1032
  #7  0x00d3c011ccec in blk_read_entry (opaque=0x7ffc10a91b40) at block/block-backend.c:1079
  #8  0x00d3c01bbb96 in coroutine_trampoline (i0=, i1=) at util/coroutine-ucontext.c:79
  #9  0x7f3196cb8600 in __start_context () at /lib64/libc.so.6

The problem is that nbd_client_init() uses
nbd_client_attach_aio_context() -> aio_co_schedule(new_context,
client->read_reply_co).  Execution of read_reply_co is deferred to a BH
which doesn't run until later.

In the meantime blk_co_preadv() can be called and nbd_coroutine_end()
calls aio_wake() on read_reply_co.  At this point in time
read_reply_co's ctx isn't set because it has never been entered yet.

This patch simplifies the nbd_co_send_request() ->
nbd_co_receive_reply() -> nbd_coroutine_end() lifecycle to just
nbd_co_send_request() -> nbd_co_receive_reply().  The request is "ended"
if an error occurs at any point.  Callers no longer have to invoke
nbd_coroutine_end().

This cleanup also eliminates the segfault because we don't call
aio_co_schedule() to wake up s->read_reply_co if sending the request
failed.  It is only necessary to wake up s->read_reply_co if a reply was
received.

Note this only happens with UNIX domain sockets on Linux.  It doesn't
seem possible to reproduce this with TCP sockets.
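
The simplified lifecycle can be pictured with a small toy model. The Python sketch below is only an analogy under stated assumptions: `ClientSession`, its fields and the `send` callback are hypothetical stand-ins, not the QEMU data structures, and it leaves out coroutines and locking entirely.

```python
import errno


class ClientSession:
    """Toy stand-in for an NBD client session; all names are hypothetical."""

    def __init__(self, max_requests=16):
        self.requests = [None] * max_requests  # one slot per in-flight request
        self.in_flight = 0
        self.quit = False

    def send_request(self, request, send):
        """Claim a slot and send; on any failure release the slot before returning."""
        index = self.requests.index(None)  # assumes a free slot exists
        self.requests[index] = request
        self.in_flight += 1

        if self.quit:
            rc = -errno.EIO
        else:
            try:
                send(request)
                rc = 0
            except OSError:
                rc = -errno.EIO

        if rc < 0:
            # The request is "ended" right here, so the caller has nothing
            # left to clean up and nothing to wake.
            self.quit = True
            self.requests[index] = None
            self.in_flight -= 1
        return rc
```

Because the failing path releases the slot itself, a caller never needs a separate "end" step after a failed send, which is the property the commit message relies on.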

Suggested-by: Paolo Bonzini 
Signed-off-by: Stefan Hajnoczi 
Message-Id: <20170829122745.14309-2-stefa...@redhat.com>
Signed-off-by: Eric Blake 
---
 block/nbd-client.c | 25 +
 1 file changed, 9 insertions(+), 16 deletions(-)

diff --git a/block/nbd-client.c b/block/nbd-client.c
index 25bcaa2346..ea728fffc8 100644
--- a/block/nbd-client.c
+++ b/block/nbd-client.c
@@ -144,12 +144,12 @@ static int nbd_co_send_request(BlockDriverState *bs,
 request->handle = INDEX_TO_HANDLE(s, i);

 if (s->quit) {
-qemu_co_mutex_unlock(&s->send_mutex);
-return -EIO;
+rc = -EIO;
+goto err;
 }
 if (!s->ioc) {
-qemu_co_mutex_unlock(&s->send_mutex);
-return -EPIPE;
+rc = -EPIPE;
+goto err;
 }

 if (qiov) {
@@ -166,8 +166,13 @@ static int nbd_co_send_request(BlockDriverState *bs,
 } else {
 rc = nbd_send_request(s->ioc, request);
 }
+
+err:
 if (rc < 0) {
 s->quit = true;
+s->requests[i].coroutine = NULL;
+s->in_flight--;
+qemu_co_queue_next(&s->free_sema);
 }
 qemu_co_mutex_unlock(&s->send_mutex);
 return rc;
@@ -201,13 +206,6 @@ static void nbd_co_receive_reply(NBDClientSession *s,
 /* Tell the read handler to read another header.  */
 s->reply.handle = 0;
 }
-}
-
-static void nbd_coroutine_end(BlockDriverState *bs,
-  NBDRequest *request)
-{
-NBDClientSession *s = nbd_get_client_session(bs);
-int i = HANDLE_TO_INDEX(s, request->handle);

 s->requests[i].coroutine = NULL;

@@ -243,7 +241,6 @@ int nbd_client_co_preadv(BlockDriverState *bs, uint64_t offset,
 } else {
 nbd_co_receive_reply(client, &request, &reply, qiov);
 }
-nbd_coroutine_end(bs, &request);
 return -reply.error;
 }

@@ -272,7 +269,6 @@ int nbd_client_co_pwritev(BlockDriverState *bs, uint64_t offset,
 } else {
 nbd_co_receive_reply(client, &request, &reply, NULL);
 }
-nbd_coroutine_end(bs, &request);
 return -reply.error;
 }

@@ -306,7 +302,6 @@ int nbd_client_co_pwrite_zeroes(BlockDriverState *bs, int64_t offset,
 } else {
 nbd_co_receive_reply(client, &request, &reply, NULL);
 }
-nbd_co

[Qemu-block] [PULL 05/11] nbd/client: fix nbd_opt_go

2017-08-30 Thread Eric Blake

From: Vladimir Sementsov-Ogievskiy 

Do not send NBD_OPT_ABORT to a broken server. After sending
NBD_REP_ACK in reply to NBD_OPT_GO, the server is most probably already
in the transmission phase, where option negotiation is finished.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Message-Id: <20170804151440.320927-2-vsement...@virtuozzo.com>
Signed-off-by: Eric Blake 
---
 nbd/client.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/nbd/client.c b/nbd/client.c
index 0a17de80b5..f1c16b588f 100644
--- a/nbd/client.c
+++ b/nbd/client.c
@@ -399,12 +399,10 @@ static int nbd_opt_go(QIOChannel *ioc, const char *wantname,
phase, but make sure it sent flags */
 if (len) {
 error_setg(errp, "server sent invalid NBD_REP_ACK");
-nbd_send_opt_abort(ioc);
 return -1;
 }
 if (!info->flags) {
 error_setg(errp, "broken server omitted NBD_INFO_EXPORT");
-nbd_send_opt_abort(ioc);
 return -1;
 }
 trace_nbd_opt_go_success();
-- 
2.13.5

[Qemu-block] [PULL 04/11] qemu-iotests: test NBD over UNIX domain sockets in 083

2017-08-30 Thread Eric Blake

From: Stefan Hajnoczi 

083 only tests TCP.  Some failures might be specific to UNIX domain
sockets.

A few adjustments are necessary:

1. Generating a port number and waiting for server startup is
   TCP-specific.  Use the new nbd-fault-injector.py startup protocol to
   fetch the address.  This is a little more elegant because we don't
   need netstat anymore.

2. The NBD filter does not work for the UNIX domain sockets URIs we
   generate and must be extended.

3. Run all tests twice: once for TCP and once for UNIX domain sockets.
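
Steps 1 and 2 boil down to reading the address the server printed and turning it into a protocol-specific URL. The Python sketch below mirrors the URL shapes used by the shell code in this patch; the helper names `parse_listening_line` and `build_nbd_url` are hypothetical and exist only for this example.

```python
def parse_listening_line(line):
    """Extract the address from a 'Listening on <addr>' startup line."""
    prefix = "Listening on "
    if not line.startswith(prefix):
        raise ValueError("not a startup line: %r" % line)
    return line[len(prefix):].rstrip("\n")


def build_nbd_url(proto, addr, export_name="foo"):
    """Build the NBD URL for a TCP address or a UNIX domain socket path."""
    if proto == "tcp":
        return "nbd+tcp://%s/%s" % (addr, export_name)
    if proto == "unix":
        return "nbd+unix:///%s?socket=%s" % (export_name, addr)
    raise ValueError("unknown protocol: %r" % proto)
```

For classic negotiation the export name is empty, so the UNIX form becomes `nbd+unix:///?socket=<path>`.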

Reviewed-by: Eric Blake 
Signed-off-by: Stefan Hajnoczi 
Message-Id: <20170829122745.14309-4-stefa...@redhat.com>
Signed-off-by: Eric Blake 
---
 tests/qemu-iotests/common.filter |   4 +-
 tests/qemu-iotests/083   | 138 +++--
 tests/qemu-iotests/083.out   | 145 ++-
 3 files changed, 215 insertions(+), 72 deletions(-)

diff --git a/tests/qemu-iotests/common.filter b/tests/qemu-iotests/common.filter
index 7a58e57317..9d5442ecd9 100644
--- a/tests/qemu-iotests/common.filter
+++ b/tests/qemu-iotests/common.filter
@@ -170,9 +170,9 @@ _filter_nbd()
 #
 # Filter out the TCP port number since this changes between runs.
 sed -e '/nbd\/.*\.c:/d' \
--e 's#nbd:\(//\)\?127\.0\.0\.1:[0-9]*#nbd:\1127.0.0.1:PORT#g' \
+-e 's#127\.0\.0\.1:[0-9]*#127.0.0.1:PORT#g' \
 -e "s#?socket=$TEST_DIR#?socket=TEST_DIR#g" \
--e 's#\(exportname=foo\|PORT\): Failed to .*$#\1#'
+-e 's#\(foo\|PORT/\?\|.sock\): Failed to .*$#\1#'
 }

 # make sure this script returns success
diff --git a/tests/qemu-iotests/083 b/tests/qemu-iotests/083
index bff9360048..0306f112da 100755
--- a/tests/qemu-iotests/083
+++ b/tests/qemu-iotests/083
@@ -27,6 +27,14 @@ echo "QA output created by $seq"
 here=`pwd`
 status=1   # failure is the default!

+_cleanup()
+{
+   rm -f nbd.sock
+   rm -f nbd-fault-injector.out
+   rm -f nbd-fault-injector.conf
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
 # get standard environment, filters and checks
 . ./common.rc
 . ./common.filter
@@ -35,81 +43,105 @@ _supported_fmt generic
 _supported_proto nbd
 _supported_os Linux

-# Pick a TCP port based on our pid.  This way multiple instances of this test
-# can run in parallel without conflicting.
-choose_tcp_port() {
-   echo $((($$ % 31744) + 1024)) # 1024 <= port < 32768
-}
-
-wait_for_tcp_port() {
-   while ! (netstat --tcp --listening --numeric | \
-grep "$1.*0\\.0\\.0\\.0:\\*.*LISTEN") >/dev/null 2>&1; do
-   sleep 0.1
-   done
-}
-
 check_disconnect() {
+   local event export_name=foo extra_args nbd_addr nbd_url proto when
+
+   while true; do
+   case $1 in
+   --classic-negotiation)
+   shift
+   extra_args=--classic-negotiation
+   export_name=
+   ;;
+   --tcp)
+   shift
+   proto=tcp
+   ;;
+   --unix)
+   shift
+   proto=unix
+   ;;
+   *)
+   break
+   ;;
+   esac
+   done
+
event=$1
when=$2
-   negotiation=$3
echo "=== Check disconnect $when $event ==="
echo

-   port=$(choose_tcp_port)
-
cat > "$TEST_DIR/nbd-fault-injector.conf" <"$TEST_DIR/nbd-fault-injector.out" 2>&1 &
+
+   # Wait for server to be ready
+   while ! grep -q 'Listening on ' "$TEST_DIR/nbd-fault-injector.out"; do
+   sleep 0.1
+   done
+
+   # Extract the final address (port number has now been assigned in tcp case)
+   nbd_addr=$(sed 's/Listening on \(.*\)$/\1/' "$TEST_DIR/nbd-fault-injector.out")
+
+   if [ "$proto" = "tcp" ]; then
+   nbd_url="nbd+tcp://$nbd_addr/$export_name"
+   else
+   nbd_url="nbd+unix:///$export_name?socket=$nbd_addr"
fi

-   $PYTHON nbd-fault-injector.py $extra_args "127.0.0.1:$port" "$TEST_DIR/nbd-fault-injector.conf" >/dev/null 2>&1 &
-   wait_for_tcp_port "127\\.0\\.0\\.1:$port"
$QEMU_IO -c "read 0 512" "$nbd_url" 2>&1 | _filter_qemu_io | _filter_nbd

echo
 }

-for event in neg1 "export" neg2 request reply data; do
-   for when in before after; do
-   check_disconnect "$event" "$when"
-   done
-
-   # Also inject short replies from the NBD server
-   case "$event" in
-   neg1)
-   for when in 8 16; do
-   check_disconnect "$event" "$when"
-   done
-   ;;
-   "export")
-   for when in 4 12 16; do
-   check_disconnect "$event" "$when"
-   done
-   ;;
-   neg2)
-   for when in 8 10; do
-   check_disconnect "$even

[Qemu-block] [PULL 01/11] qemu-iotests: Extend non-shared storage migration test (194)

2017-08-30 Thread Eric Blake

From: Kashyap Chamarthy 

This is the follow-up patch that was discussed[*] as part of feedback to
qemu-iotest 194.

Changes in this patch:

  - Supply 'job-id' parameter to `drive-mirror` invocation.

  - Once migration completes, issue QMP `block-job-cancel` command on
the source QEMU to gracefully complete `drive-mirror` operation.

  - Once the BLOCK_JOB_COMPLETED event is emitted, stop the NBD server
on the destination QEMU.

  - Check for both the events: MIGRATION and BLOCK_JOB_COMPLETED.

With the above, the test will also be (almost) in sync with the
procedure outlined in the document 'live-block-operations.rst'[+]
(section: "QMP invocation for live storage migration with
``drive-mirror`` + NBD").

[*] https://lists.nongnu.org/archive/html/qemu-devel/2017-08/msg04820.html
-- qemu-iotests: add 194 non-shared storage migration test
[+] https://git.qemu.org/gitweb.cgi?p=qemu.git;a=blob;f=docs/interop/live-block-operations.rst

Signed-off-by: Kashyap Chamarthy 
Message-Id: <20170829165058.8229-1-kcham...@redhat.com>
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Eric Blake 
---
 tests/qemu-iotests/194 | 23 +--
 tests/qemu-iotests/194.out | 11 ---
 2 files changed, 25 insertions(+), 9 deletions(-)

diff --git a/tests/qemu-iotests/194 b/tests/qemu-iotests/194
index 8028111e21..a3e3bad664 100755
--- a/tests/qemu-iotests/194
+++ b/tests/qemu-iotests/194
@@ -46,16 +46,17 @@ iotests.log('Launching NBD server on destination...')
 iotests.log(dest_vm.qmp('nbd-server-start', addr={'type': 'unix', 'data': {'path': nbd_sock_path}}))
 iotests.log(dest_vm.qmp('nbd-server-add', device='drive0', writable=True))

-iotests.log('Starting drive-mirror on source...')
+iotests.log('Starting `drive-mirror` on source...')
 iotests.log(source_vm.qmp(
   'drive-mirror',
   device='drive0',
   target='nbd+unix:///drive0?socket={0}'.format(nbd_sock_path),
   sync='full',
   format='raw', # always raw, the server handles the format
-  mode='existing'))
+  mode='existing',
+  job_id='mirror-job0'))

-iotests.log('Waiting for drive-mirror to complete...')
+iotests.log('Waiting for `drive-mirror` to complete...')
 iotests.log(source_vm.event_wait('BLOCK_JOB_READY'),
 filters=[iotests.filter_qmp_event])

@@ -67,7 +68,17 @@ dest_vm.qmp('migrate-set-capabilities',
 iotests.log(source_vm.qmp('migrate', uri='unix:{0}'.format(migration_sock_path)))

 while True:
-event = source_vm.event_wait('MIGRATION')
-iotests.log(event, filters=[iotests.filter_qmp_event])
-if event['data']['status'] in ('completed', 'failed'):
+event1 = source_vm.event_wait('MIGRATION')
+iotests.log(event1, filters=[iotests.filter_qmp_event])
+if event1['data']['status'] in ('completed', 'failed'):
+iotests.log('Gracefully ending the `drive-mirror` job on source...')
+iotests.log(source_vm.qmp('block-job-cancel', device='mirror-job0'))
+break
+
+while True:
+event2 = source_vm.event_wait('BLOCK_JOB_COMPLETED')
+iotests.log(event2, filters=[iotests.filter_qmp_event])
+if event2['event'] == 'BLOCK_JOB_COMPLETED':
+iotests.log('Stopping the NBD server on destination...')
+iotests.log(dest_vm.qmp('nbd-server-stop'))
 break
diff --git a/tests/qemu-iotests/194.out b/tests/qemu-iotests/194.out
index ae501fecac..50ac50da5e 100644
--- a/tests/qemu-iotests/194.out
+++ b/tests/qemu-iotests/194.out
@@ -2,12 +2,17 @@ Launching VMs...
 Launching NBD server on destination...
 {u'return': {}}
 {u'return': {}}
-Starting drive-mirror on source...
+Starting `drive-mirror` on source...
 {u'return': {}}
-Waiting for drive-mirror to complete...
-{u'timestamp': {u'seconds': 'SECS', u'microseconds': 'USECS'}, u'data': {u'device': u'drive0', u'type': u'mirror', u'speed': 0, u'len': 1073741824, u'offset': 1073741824}, u'event': u'BLOCK_JOB_READY'}
+Waiting for `drive-mirror` to complete...
+{u'timestamp': {u'seconds': 'SECS', u'microseconds': 'USECS'}, u'data': {u'device': u'mirror-job0', u'type': u'mirror', u'speed': 0, u'len': 1073741824, u'offset': 1073741824}, u'event': u'BLOCK_JOB_READY'}
 Starting migration...
 {u'return': {}}
 {u'timestamp': {u'seconds': 'SECS', u'microseconds': 'USECS'}, u'data': {u'status': u'setup'}, u'event': u'MIGRATION'}
 {u'timestamp': {u'seconds': 'SECS', u'microseconds': 'USECS'}, u'data': {u'status': u'active'}, u'event': u'MIGRATION'}
 {u'timestamp': {u'seconds': 'SECS', u'microseconds': 'USECS'}, u'data': {u'status': u'completed'}, u'event': u'MIGRATION'}
+Gracefully ending the `drive-mirror` job on source...
+{u'return': {}}
+{u'timestamp': {u'seconds': 'SECS', u'microseconds': 'USECS'}, u'data': {u'device': u'mirror-job0', u'type': u'mirror', u'speed': 0, u'len': 1073741824, u'offset': 1073741824}, u'event': u'BLOCK_JOB_COMPLETED'}
+Stopping the NBD server on destination...
+{u'return': {}}
-- 
2.13.5

Re: [Qemu-block] [Qemu-devel] [PATCH] qcow2: allocate cluster_cache/cluster_data on demand

2017-08-30 Thread Stefan Hajnoczi

On Tue, Aug 22, 2017 at 02:56:00PM +1000, Alexey Kardashevskiy wrote:
> On 19/08/17 12:46, Alexey Kardashevskiy wrote:
> > On 19/08/17 01:18, Eric Blake wrote:
> >> On 08/18/2017 08:31 AM, Stefan Hajnoczi wrote:
> >>> Most qcow2 files are uncompressed so it is wasteful to allocate (32 + 1)
> >>> * cluster_size + 512 bytes upfront.  Allocate s->cluster_cache and
> > >>> s->cluster_data when the first read operation is performed on a
> >>> compressed cluster.
> >>>
> >>> The buffers are freed in .bdrv_close().  .bdrv_open() no longer has any
> >>> code paths that can allocate these buffers, so remove the free functions
> >>> in the error code path.
> >>>
> >>> Reported-by: Alexey Kardashevskiy 
> >>> Cc: Kevin Wolf 
> >>> Signed-off-by: Stefan Hajnoczi 
> >>> ---
> >>> Alexey: Does this improve your memory profiling results?
> >>
> >> Is this a regression from earlier versions? 
> > 
> > Hm, I have not thought about this.
> > 
> > So. I did bisect and this started happening from
> > 9a4c0e220d8a4f82b5665d0ee95ef94d8e1509d5
> > "hw/virtio-pci: fix virtio behaviour"
> > 
> > Before that, the very same command line would take less than 1GB of
> > resident memory. That thing basically enforces virtio-1.0 for QEMU <=2.6
> > which means that upstream with "-machine pseries-2.6" works fine (less than
> > 1GB), "-machine pseries-2.7" does not (close to 7GB, sometimes even 9GB).
> > 
> > Then I tried bisecting again, with
> > "scsi=off,disable-modern=off,disable-legacy=on" on my 150 virtio-block
> > devices, started from
> > e266d421490e0 "virtio-pci: add flags to enable/disable legacy/modern" (it
> > added the disable-modern switch) which uses 2GB of memory.
> > 
> > I ended up with ada434cd0b44 "virtio-pci: implement cfg capability".
> > 
> > Then I removed proxy->modern_as on v2.10.0-rc3 (see below) and got 1.5GB of
> > used memory (yay!)
> > 
> > I do not really know how to reinterpret all of this, do you?
> 
> 
> Anyone, ping? Should I move the conversation to the original thread? Any
> hacks to try with libc?

I suggest a new top-level thread with Michael Tsirkin CCed.

Stefan

[Qemu-block] [PATCH v2 3/3] hmp: Optionally disable live block operations in HMP monitor

2017-08-30 Thread Jeff Cody

From: Jeffrey Cody 

If live block operations are disabled, disable the corresponding
HMP commands.

Signed-off-by: Jeff Cody 
---
 hmp-commands-info.hx |  4 
 hmp-commands.hx  | 12 
 hmp.c| 12 
 3 files changed, 28 insertions(+)

diff --git a/hmp-commands-info.hx b/hmp-commands-info.hx
index d9df238..0967e41 100644
--- a/hmp-commands-info.hx
+++ b/hmp-commands-info.hx
@@ -84,6 +84,8 @@ STEXI
 Show block device statistics.
 ETEXI
 
+#ifdef CONFIG_LIVE_BLOCK_OPS
+
 {
 .name   = "block-jobs",
 .args_type  = "",
@@ -98,6 +100,8 @@ STEXI
 Show progress of ongoing block device operations.
 ETEXI
 
+#endif /* CONFIG_LIVE_BLOCK_OPS */
+
 {
 .name   = "registers",
 .args_type  = "cpustate_all:-a",
diff --git a/hmp-commands.hx b/hmp-commands.hx
index 1941e19..2d137a1 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -73,6 +73,8 @@ but should be used with extreme caution.  Note that this command only
 resizes image files, it can not resize block devices like LVM volumes.
 ETEXI
 
+#ifdef CONFIG_LIVE_BLOCK_OPS
+
 {
 .name   = "block_stream",
 .args_type  = "device:B,speed:o?,base:s?",
@@ -159,6 +161,8 @@ STEXI
 Resume a paused block streaming operation.
 ETEXI
 
+#endif /* CONFIG_LIVE_BLOCK_OPS */
+
 {
 .name   = "eject",
 .args_type  = "force:-f,device:B",
@@ -1169,6 +1173,8 @@ STEXI
 Enables or disables migration mode.
 ETEXI
 
+#ifdef CONFIG_LIVE_BLOCK_OPS
+
 {
 .name   = "snapshot_blkdev",
 .args_type  = "reuse:-n,device:B,snapshot-file:s?,format:s?",
@@ -1190,6 +1196,8 @@ STEXI
 Snapshot device, using snapshot file as target if provided
 ETEXI
 
+#endif /* CONFIG_LIVE_BLOCK_OPS */
+
 {
 .name   = "snapshot_blkdev_internal",
 .args_type  = "device:B,name:s",
@@ -1224,6 +1232,8 @@ STEXI
 Delete an internal snapshot on device if it support
 ETEXI
 
+#ifdef CONFIG_LIVE_BLOCK_OPS
+
 {
 .name   = "drive_mirror",
 .args_type  = "reuse:-n,full:-f,device:B,target:s,format:s?",
@@ -1267,6 +1277,8 @@ STEXI
 Start a point-in-time copy of a block device to a specificed target.
 ETEXI
 
+#endif /* CONFIG_LIVE_BLOCK_OPS */
+
 {
 .name   = "drive_add",
 .args_type  = "node:-n,pci_addr:s,opts:s",
diff --git a/hmp.c b/hmp.c
index fd80dce..ab985c6 100644
--- a/hmp.c
+++ b/hmp.c
@@ -951,6 +951,8 @@ void hmp_info_pci(Monitor *mon, const QDict *qdict)
 qapi_free_PciInfoList(info_list);
 }
 
+#ifdef CONFIG_LIVE_BLOCK_OPS
+
 void hmp_info_block_jobs(Monitor *mon, const QDict *qdict)
 {
 BlockJobInfoList *list;
@@ -989,6 +991,8 @@ void hmp_info_block_jobs(Monitor *mon, const QDict *qdict)
 qapi_free_BlockJobInfoList(list);
 }
 
+#endif /* CONFIG_LIVE_BLOCK_OPS */
+
 void hmp_info_tpm(Monitor *mon, const QDict *qdict)
 {
 TPMInfoList *info_list, *info;
@@ -1197,6 +1201,8 @@ void hmp_block_resize(Monitor *mon, const QDict *qdict)
 hmp_handle_error(mon, &err);
 }
 
+#ifdef CONFIG_LIVE_BLOCK_OPS
+
 void hmp_drive_mirror(Monitor *mon, const QDict *qdict)
 {
 const char *filename = qdict_get_str(qdict, "target");
@@ -1280,6 +1286,8 @@ void hmp_snapshot_blkdev(Monitor *mon, const QDict *qdict)
 hmp_handle_error(mon, &err);
 }
 
+#endif /* CONFIG_LIVE_BLOCK_OPS */
+
 void hmp_snapshot_blkdev_internal(Monitor *mon, const QDict *qdict)
 {
 const char *device = qdict_get_str(qdict, "device");
@@ -1776,6 +1784,8 @@ void hmp_block_set_io_throttle(Monitor *mon, const QDict *qdict)
 hmp_handle_error(mon, &err);
 }
 
+#ifdef CONFIG_LIVE_BLOCK_OPS
+
 void hmp_block_stream(Monitor *mon, const QDict *qdict)
 {
 Error *error = NULL;
@@ -1842,6 +1852,8 @@ void hmp_block_job_complete(Monitor *mon, const QDict *qdict)
 hmp_handle_error(mon, &error);
 }
 
+#endif /* CONFIG_LIVE_BLOCK_OPS */
+
 typedef struct HMPMigrationStatus
 {
 QEMUTimer *timer;
-- 
2.9.5

Re: [Qemu-block] [Qemu-devel] [PATCH v2 3/7] block/sheepdog: remove spurious NULL check

2017-08-30 Thread Philippe Mathieu-Daudé

On Wed, Aug 30, 2017 at 1:57 PM, Jeff Cody  wrote:
> 'tag' is already checked in the lines immediately preceding this check,
> and set to non-NULL if NULL.  No need to check again, it hasn't changed.
>
> Signed-off-by: Jeff Cody 
> ---
>  block/sheepdog.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/block/sheepdog.c b/block/sheepdog.c
> index abb2e79..bbbfa72 100644
> --- a/block/sheepdog.c
> +++ b/block/sheepdog.c
> @@ -1632,7 +1632,7 @@ static int sd_open(BlockDriverState *bs, QDict *options, int flags,
>  if (!tag) {
>  tag = "";
>  }
> -if (tag && strlen(tag) >= SD_MAX_VDI_TAG_LEN) {
> +if (strlen(tag) >= SD_MAX_VDI_TAG_LEN) {

you can even prepend an 'else'! :D

>  error_setg(errp, "value of parameter 'tag' is too long");
>  ret = -EINVAL;
>  goto err_no_fd;
> --
> 2.9.5
>
>

Re: [Qemu-block] [PATCH v2 2/3] block-jobs: Optionally unregister live block operations

2017-08-30 Thread Eduardo Habkost

On Wed, Aug 30, 2017 at 01:01:41PM -0400, Jeff Cody wrote:
> From: Jeffrey Cody 
> 
> If configured without live block operations enabled, unregister the
> live block operation commands.
> 
> Signed-off-by: Jeff Cody 
> ---
>  monitor.c | 16 
>  1 file changed, 16 insertions(+)
> 
> diff --git a/monitor.c b/monitor.c
> index e0f8801..de0a70e 100644
> --- a/monitor.c
> +++ b/monitor.c
> @@ -998,6 +998,22 @@ static void qmp_unregister_commands_hack(void)
>  && !defined(TARGET_S390X)
>  qmp_unregister_command(&qmp_commands, "query-cpu-definitions");
>  #endif
> +#ifndef CONFIG_LIVE_BLOCK_OPS
> +qmp_unregister_command(&qmp_commands, "block-stream");
> +qmp_unregister_command(&qmp_commands, "block-commit");
> +qmp_unregister_command(&qmp_commands, "drive-mirror");
> +qmp_unregister_command(&qmp_commands, "blockdev-mirror");
> +qmp_unregister_command(&qmp_commands, "drive-backup");
> +qmp_unregister_command(&qmp_commands, "blockdev-backup");
> +qmp_unregister_command(&qmp_commands, "blockdev-snapshot");
> +qmp_unregister_command(&qmp_commands, "blockdev-snapshot-sync");
> +qmp_unregister_command(&qmp_commands, "block-job-set-speed");
> +qmp_unregister_command(&qmp_commands, "block-job-cancel");
> +qmp_unregister_command(&qmp_commands, "block-job-pause");
> +qmp_unregister_command(&qmp_commands, "block-job-resume");
> +qmp_unregister_command(&qmp_commands, "block-job-complete");
> +qmp_unregister_command(&qmp_commands, "query-block-jobs");
> +#endif

I suggest using the new mechanisms added by:

  [PATCH 00/26] qapi: add #if pre-processor conditions to generated code

-- 
Eduardo

[Qemu-block] [PATCH v2 4/7] block/sheepdog: code beautification

2017-08-30 Thread Jeff Cody

No functional changes, just whitespace manipulation.

Signed-off-by: Jeff Cody 
---
 block/sheepdog.c | 162 +++
 1 file changed, 81 insertions(+), 81 deletions(-)

diff --git a/block/sheepdog.c b/block/sheepdog.c
index bbbfa72..ad461f1 100644
--- a/block/sheepdog.c
+++ b/block/sheepdog.c
@@ -400,7 +400,7 @@ typedef struct BDRVSheepdogReopenState {
 int cache_flags;
 } BDRVSheepdogReopenState;
 
-static const char * sd_strerror(int err)
+static const char *sd_strerror(int err)
 {
 int i;
 
@@ -3078,111 +3078,111 @@ static QemuOptsList sd_create_opts = {
 };
 
 static BlockDriver bdrv_sheepdog = {
-.format_name= "sheepdog",
-.protocol_name  = "sheepdog",
-.instance_size  = sizeof(BDRVSheepdogState),
-.bdrv_parse_filename= sd_parse_filename,
-.bdrv_file_open = sd_open,
-.bdrv_reopen_prepare= sd_reopen_prepare,
-.bdrv_reopen_commit = sd_reopen_commit,
-.bdrv_reopen_abort  = sd_reopen_abort,
-.bdrv_close = sd_close,
-.bdrv_create= sd_create,
-.bdrv_has_zero_init = bdrv_has_zero_init_1,
-.bdrv_getlength = sd_getlength,
+.format_name  = "sheepdog",
+.protocol_name= "sheepdog",
+.instance_size= sizeof(BDRVSheepdogState),
+.bdrv_parse_filename  = sd_parse_filename,
+.bdrv_file_open   = sd_open,
+.bdrv_reopen_prepare  = sd_reopen_prepare,
+.bdrv_reopen_commit   = sd_reopen_commit,
+.bdrv_reopen_abort= sd_reopen_abort,
+.bdrv_close   = sd_close,
+.bdrv_create  = sd_create,
+.bdrv_has_zero_init   = bdrv_has_zero_init_1,
+.bdrv_getlength   = sd_getlength,
 .bdrv_get_allocated_file_size = sd_get_allocated_file_size,
-.bdrv_truncate  = sd_truncate,
+.bdrv_truncate= sd_truncate,
 
-.bdrv_co_readv  = sd_co_readv,
-.bdrv_co_writev = sd_co_writev,
-.bdrv_co_flush_to_disk  = sd_co_flush_to_disk,
-.bdrv_co_pdiscard = sd_co_pdiscard,
-.bdrv_co_get_block_status = sd_co_get_block_status,
+.bdrv_co_readv= sd_co_readv,
+.bdrv_co_writev   = sd_co_writev,
+.bdrv_co_flush_to_disk= sd_co_flush_to_disk,
+.bdrv_co_pdiscard = sd_co_pdiscard,
+.bdrv_co_get_block_status = sd_co_get_block_status,
 
-.bdrv_snapshot_create   = sd_snapshot_create,
-.bdrv_snapshot_goto = sd_snapshot_goto,
-.bdrv_snapshot_delete   = sd_snapshot_delete,
-.bdrv_snapshot_list = sd_snapshot_list,
+.bdrv_snapshot_create = sd_snapshot_create,
+.bdrv_snapshot_goto   = sd_snapshot_goto,
+.bdrv_snapshot_delete = sd_snapshot_delete,
+.bdrv_snapshot_list   = sd_snapshot_list,
 
-.bdrv_save_vmstate  = sd_save_vmstate,
-.bdrv_load_vmstate  = sd_load_vmstate,
+.bdrv_save_vmstate= sd_save_vmstate,
+.bdrv_load_vmstate= sd_load_vmstate,
 
-.bdrv_detach_aio_context = sd_detach_aio_context,
-.bdrv_attach_aio_context = sd_attach_aio_context,
+.bdrv_detach_aio_context  = sd_detach_aio_context,
+.bdrv_attach_aio_context  = sd_attach_aio_context,
 
-.create_opts= &sd_create_opts,
+.create_opts  = &sd_create_opts,
 };
 
 static BlockDriver bdrv_sheepdog_tcp = {
-.format_name= "sheepdog",
-.protocol_name  = "sheepdog+tcp",
-.instance_size  = sizeof(BDRVSheepdogState),
-.bdrv_parse_filename= sd_parse_filename,
+.format_name  = "sheepdog",
+.protocol_name= "sheepdog+tcp",
+.instance_size= sizeof(BDRVSheepdogState),
+.bdrv_parse_filename  = sd_parse_filename,
 .bdrv_file_open = sd_open,
-.bdrv_reopen_prepare= sd_reopen_prepare,
-.bdrv_reopen_commit = sd_reopen_commit,
-.bdrv_reopen_abort  = sd_reopen_abort,
-.bdrv_close = sd_close,
-.bdrv_create= sd_create,
-.bdrv_has_zero_init = bdrv_has_zero_init_1,
-.bdrv_getlength = sd_getlength,
+.bdrv_reopen_prepare  = sd_reopen_prepare,
+.bdrv_reopen_commit   = sd_reopen_commit,
+.bdrv_reopen_abort= sd_reopen_abort,
+.bdrv_close   = sd_close,
+.bdrv_create  = sd_create,
+.bdrv_has_zero_init   = bdrv_has_zero_init_1,
+.bdrv_getlength   = sd_getlength,
 .bdrv_get_allocated_file_size = sd_get_allocated_file_size,
-.bdrv_truncate  = sd_truncate,
+.bdrv_truncate= sd_truncate,
 
-.bdrv_co_readv  = sd_co_readv,
-.bdrv_co_writev = sd_co_writev,
-.bdrv_co_flush_to_disk  = sd_co_flush_to_disk,
-.bdrv_co_pdiscard = sd_co_pdiscard,
-.bdrv_co_get_block_status = sd_co_get_block_status,
+.bdrv_co_readv= sd_co_readv,
+.bdrv_co_writev

[Qemu-block] [PATCH v2 6/7] block/curl: fix minor memory leaks

2017-08-30 Thread Jeff Cody

Signed-off-by: Jeff Cody 
---
 block/curl.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/block/curl.c b/block/curl.c
index 00a9879..35cf417 100644
--- a/block/curl.c
+++ b/block/curl.c
@@ -857,6 +857,9 @@ out_noclean:
 qemu_mutex_destroy(&s->mutex);
 g_free(s->cookie);
 g_free(s->url);
+g_free(s->username);
+g_free(s->proxyusername);
+g_free(s->proxypassword);
 qemu_opts_del(opts);
 return -EINVAL;
 }
@@ -955,6 +958,9 @@ static void curl_close(BlockDriverState *bs)
 
 g_free(s->cookie);
 g_free(s->url);
+g_free(s->username);
+g_free(s->proxyusername);
+g_free(s->proxypassword);
 }
 
 static int64_t curl_getlength(BlockDriverState *bs)
-- 
2.9.5

Re: [Qemu-block] [PATCH 10/10] scsi: add persistent reservation manager using qemu-pr-helper

2017-08-30 Thread Stefan Hajnoczi

On Tue, Aug 22, 2017 at 03:18:32PM +0200, Paolo Bonzini wrote:
> +/* Called with lock held.  */
> +static int pr_manager_helper_read(PRManagerHelper *pr_mgr,
> +  void *buf, int sz, Error **errp)
> +{
> +ssize_t r = qio_channel_read_all(pr_mgr->ioc, buf, sz, errp);
> +
> +if (r < 0) {
> +object_unref(OBJECT(pr_mgr->ioc));
> +pr_mgr->ioc = NULL;
> +return r;
> +}
> +
> +return r < 0 ? r : 0;

At this point we know r >= 0:

  return r;

> +if (pr_manager_helper_read(pr_mgr, &resp, sizeof(resp), NULL) < 0) {
> +ret = -EINVAL;
> +goto out;
> +}

resp.result is big-endian and accessed without byteswaps below.  We
need:

  resp.result = be32_to_cpu(resp.result);
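
To illustrate the byteswap requested here, a minimal sketch in plain C, assuming a hypothetical 4-byte big-endian result field as it arrives from the helper socket. `load_be32` is an illustrative name, not a QEMU function; in the QEMU tree the equivalent conversion helper is `be32_to_cpu()`.

```c
#include <stdint.h>

/*
 * Assemble a host-order 32-bit value from four big-endian bytes.
 * Works regardless of host endianness because it never reinterprets
 * the buffer as an integer; it only shifts individual bytes.
 */
static uint32_t load_be32(const void *p)
{
    const unsigned char *b = p;

    return ((uint32_t)b[0] << 24) |
           ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8)  |
            (uint32_t)b[3];
}
```

Without such a conversion, a comparison like `resp.result == CHECK_CONDITION` would only behave as intended on big-endian hosts.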

> +if (expected_dir == SG_DXFER_FROM_DEV && resp.result == 0) {
> +if (pr_manager_helper_read(pr_mgr, io_hdr->dxferp, len, NULL) < 0) {
> +ret = -EINVAL;
> +goto out;
> +}
> +}
> +
> +io_hdr->status = resp.result;
> +if (resp.result == CHECK_CONDITION) {
> +io_hdr->driver_status = SG_ERR_DRIVER_SENSE;
> +io_hdr->sb_len_wr = MIN(io_hdr->mx_sb_len, PR_HELPER_SENSE_SIZE);
> +memcpy(io_hdr->sbp, resp.sense, io_hdr->sb_len_wr);
> +}
> +
> +out:
> +if (ret < 0) {
> +int sense_len = scsi_build_sense(io_hdr->sbp,
> + SENSE_CODE(LUN_COMM_FAILURE));
> +io_hdr->driver_status = SG_ERR_DRIVER_SENSE;
> +io_hdr->sb_len_wr = MIN(io_hdr->mx_sb_len, sense_len);
> +io_hdr->status = CHECK_CONDITION;
> +}
> +qemu_mutex_unlock(&pr_mgr->lock);
> +return ret;
> +}
> +static void pr_manager_helper_instance_finalize(Object *obj)
> +{
> +PRManagerHelper *pr_mgr = PR_MANAGER_HELPER(obj);
> +
> +g_free(pr_mgr->path);

Double free, the "path" property already has a release function that
frees the string.
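
The ownership rule behind this comment can be sketched in plain C. This is not QOM code: `Helper`, `path_prop_release`, `helper_finalize`, `helper_destroy` and `helper_set_path` are hypothetical names, and the teardown order (property release first, instance finalizer second) is an assumption made for the illustration.

```c
#include <stdlib.h>
#include <string.h>

typedef struct Helper {
    char *path;            /* owned by the hypothetical "path" property */
} Helper;

static int path_frees;     /* counts how often path was actually freed */

/* Property release hook: the single place that frees path. */
static void path_prop_release(Helper *h)
{
    if (h->path) {
        free(h->path);
        h->path = NULL;
        path_frees++;
    }
}

/* Instance finalizer: must not free path a second time. */
static void helper_finalize(Helper *h)
{
    (void)h;               /* nothing left to release for path */
}

/* Assumed teardown order: properties first, then the finalizer. */
static void helper_destroy(Helper *h)
{
    path_prop_release(h);
    helper_finalize(h);
}

/* Property setter: stores a private copy of the string. */
static int helper_set_path(Helper *h, const char *path)
{
    char *copy = malloc(strlen(path) + 1);

    if (!copy) {
        return -1;
    }
    strcpy(copy, path);
    free(h->path);
    h->path = copy;
    return 0;
}
```

With a single owner for the string, the finalizer has nothing to free, which is the change the review asks for.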

[Qemu-block] [PATCH v2 1/3] configure: Add option in configure to disable live block ops

2017-08-30 Thread Jeff Cody

From: Jeffrey Cody 

This adds in the option to disable the live block operations.  The
resultant config option is not checked until subsequent patches.

Signed-off-by: Jeff Cody 
---
 configure | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/configure b/configure
index dd73cce..24bde07 100755
--- a/configure
+++ b/configure
@@ -400,6 +400,7 @@ virglrenderer=""
 tpm="yes"
 libssh2=""
 live_block_migration="yes"
+live_block_ops="yes"
 numa=""
 tcmalloc="no"
 jemalloc="no"
@@ -1263,6 +1264,10 @@ for opt do
   ;;
   --enable-live-block-migration) live_block_migration="yes"
   ;;
+  --disable-live-block-ops) live_block_ops="no"
+  ;;
+  --enable-live-block-ops) live_block_ops="yes"
+  ;;
   --disable-numa) numa="no"
   ;;
   --enable-numa) numa="yes"
@@ -1513,6 +1518,7 @@ disabled with --disable-FEATURE, default is enabled if available:
   smartcard   smartcard support (libcacard)
   libusb  libusb (for usb passthrough)
   live-block-migration   Block migration in the main migration stream
+  live-block-ops  live block operations support
   usb-redir   usb network redirection support
   lzo support of lzo compression library
   snappy  support of snappy compression library
@@ -5398,6 +5404,7 @@ echo "libssh2 support   $libssh2"
 echo "TPM passthrough   $tpm_passthrough"
 echo "QOM debugging $qom_cast_debug"
 echo "Live block migration $live_block_migration"
+echo "Live block ops$live_block_ops"
 echo "lzo support   $lzo"
 echo "snappy support$snappy"
 echo "bzip2 support $bzip2"
@@ -5976,6 +5983,10 @@ if test "$live_block_migration" = "yes" ; then
   echo "CONFIG_LIVE_BLOCK_MIGRATION=y" >> $config_host_mak
 fi
 
+if test "$live_block_ops" = "yes" ; then
+  echo "CONFIG_LIVE_BLOCK_OPS=y" >> $config_host_mak
+fi
+
 # USB host support
 if test "$libusb" = "yes"; then
   echo "HOST_USB=libusb legacy" >> $config_host_mak
-- 
2.9.5

[Qemu-block] [PATCH v2 7/7] block/curl: code cleanup to comply with coding style

2017-08-30 Thread Jeff Cody

This addresses non-functional changes to help curl.c better comply
with the coding styles (comments, indentation, brackets, etc.).

One minor code change is the combination of two if statements into
a single if statement.

Signed-off-by: Jeff Cody 
---
 block/curl.c | 100 +++
 1 file changed, 52 insertions(+), 48 deletions(-)

diff --git a/block/curl.c b/block/curl.c
index 35cf417..c557b59 100644
--- a/block/curl.c
+++ b/block/curl.c
@@ -32,8 +32,10 @@
 #include <curl/curl.h>
 #include "qemu/cutils.h"
 
-// #define DEBUG_CURL
-// #define DEBUG_VERBOSE
+/*
+ #define DEBUG_CURL
+ #define DEBUG_VERBOSE
+*/
 
 #ifdef DEBUG_CURL
 #define DEBUG_CURL_PRINT 1
@@ -76,15 +78,15 @@ static CURLMcode __curl_multi_socket_action(CURLM *multi_handle,
 #define CURL_TIMEOUT_DEFAULT 5
 #define CURL_TIMEOUT_MAX 1
 
-#define CURL_BLOCK_OPT_URL   "url"
-#define CURL_BLOCK_OPT_READAHEAD "readahead"
-#define CURL_BLOCK_OPT_SSLVERIFY "sslverify"
-#define CURL_BLOCK_OPT_TIMEOUT "timeout"
-#define CURL_BLOCK_OPT_COOKIE"cookie"
-#define CURL_BLOCK_OPT_COOKIE_SECRET "cookie-secret"
-#define CURL_BLOCK_OPT_USERNAME "username"
-#define CURL_BLOCK_OPT_PASSWORD_SECRET "password-secret"
-#define CURL_BLOCK_OPT_PROXY_USERNAME "proxy-username"
+#define CURL_BLOCK_OPT_URL   "url"
+#define CURL_BLOCK_OPT_READAHEAD "readahead"
+#define CURL_BLOCK_OPT_SSLVERIFY "sslverify"
+#define CURL_BLOCK_OPT_TIMEOUT   "timeout"
+#define CURL_BLOCK_OPT_COOKIE"cookie"
+#define CURL_BLOCK_OPT_COOKIE_SECRET "cookie-secret"
+#define CURL_BLOCK_OPT_USERNAME  "username"
+#define CURL_BLOCK_OPT_PASSWORD_SECRET   "password-secret"
+#define CURL_BLOCK_OPT_PROXY_USERNAME"proxy-username"
 #define CURL_BLOCK_OPT_PROXY_PASSWORD_SECRET "proxy-password-secret"
 
 struct BDRVCURLState;
@@ -110,8 +112,7 @@ typedef struct CURLSocket {
 QLIST_ENTRY(CURLSocket) next;
 } CURLSocket;
 
-typedef struct CURLState
-{
+typedef struct CURLState {
 struct BDRVCURLState *s;
 CURLAIOCB *acb[CURL_NUM_ACB];
 CURL *curl;
@@ -196,22 +197,22 @@ static int curl_sock_cb(CURL *curl, curl_socket_t fd, int action,
 
 DPRINTF("CURL (AIO): Sock action %d on fd %d\n", action, (int)fd);
 switch (action) {
-case CURL_POLL_IN:
-aio_set_fd_handler(s->aio_context, fd, false,
-   curl_multi_read, NULL, NULL, state);
-break;
-case CURL_POLL_OUT:
-aio_set_fd_handler(s->aio_context, fd, false,
-   NULL, curl_multi_do, NULL, state);
-break;
-case CURL_POLL_INOUT:
-aio_set_fd_handler(s->aio_context, fd, false,
-   curl_multi_read, curl_multi_do, NULL, state);
-break;
-case CURL_POLL_REMOVE:
-aio_set_fd_handler(s->aio_context, fd, false,
-   NULL, NULL, NULL, NULL);
-break;
+case CURL_POLL_IN:
+aio_set_fd_handler(s->aio_context, fd, false,
+   curl_multi_read, NULL, NULL, state);
+break;
+case CURL_POLL_OUT:
+aio_set_fd_handler(s->aio_context, fd, false,
+   NULL, curl_multi_do, NULL, state);
+break;
+case CURL_POLL_INOUT:
+aio_set_fd_handler(s->aio_context, fd, false,
+   curl_multi_read, curl_multi_do, NULL, state);
+break;
+case CURL_POLL_REMOVE:
+aio_set_fd_handler(s->aio_context, fd, false,
+   NULL, NULL, NULL, NULL);
+break;
 }
 
 return 0;
@@ -235,7 +236,7 @@ static size_t curl_header_cb(void *ptr, size_t size, size_t nmemb, void *opaque)
 /* Called from curl_multi_do_locked, with s->mutex held.  */
 static size_t curl_read_cb(void *ptr, size_t size, size_t nmemb, void *opaque)
 {
-CURLState *s = ((CURLState*)opaque);
+CURLState *s = ((CURLState *)opaque);
 size_t realsize = size * nmemb;
 int i;
 
@@ -253,11 +254,12 @@ static size_t curl_read_cb(void *ptr, size_t size, size_t nmemb, void *opaque)
 memcpy(s->orig_buf + s->buf_off, ptr, realsize);
 s->buf_off += realsize;
 
-for(i=0; i<CURL_NUM_ACB; i++) {
+for (i = 0; i < CURL_NUM_ACB; i++) {
 CURLAIOCB *acb = s->acb[i];
 
-if (!acb)
+if (!acb) {
 continue;
+}
 
 if ((s->buf_off >= acb->end)) {
 size_t request_length = acb->bytes;
@@ -293,17 +295,16 @@ static bool curl_find_buf(BDRVCURLState *s, uint64_t start, uint64_t len,
 uint64_t clamped_end = MIN(end, s->len);
 uint64_t clamped_len = clamped_end - start;
 
-for (i=0; i<CURL_NUM_STATES; i++) {
+for (i = 0; i < CURL_NUM_STATES; i++) {
 CURLState *state = &s->states[i];
 uint64_t buf_end = (state->buf_start + state->buf_off);
 uint64_t buf_fend = (state->buf_start + state->buf_len);
 
-if (!state->orig_buf)
-continue;
-if (!state->buf_off)
+if (!state->orig_buf || !state->buf_off) {
 continue;
+}
 
-

[Qemu-block] [PATCH v2 2/7] block/ssh: make compliant with coding guidelines

2017-08-30 Thread Jeff Cody

Signed-off-by: Jeff Cody 
---
 block/ssh.c | 32 ++--
 1 file changed, 18 insertions(+), 14 deletions(-)

diff --git a/block/ssh.c b/block/ssh.c
index cbb0e34..97f7673 100644
--- a/block/ssh.c
+++ b/block/ssh.c
@@ -241,7 +241,7 @@ static int parse_uri(const char *filename, QDict *options, Error **errp)
 goto err;
 }
 
-if(uri->user && strcmp(uri->user, "") != 0) {
+if (uri->user && strcmp(uri->user, "") != 0) {
 qdict_put_str(options, "user", uri->user);
 }
 
@@ -268,7 +268,7 @@ static int parse_uri(const char *filename, QDict *options, Error **errp)
 
  err:
 if (uri) {
-  uri_free(uri);
+uri_free(uri);
 }
 return -EINVAL;
 }
@@ -342,7 +342,7 @@ static int check_host_key_knownhosts(BDRVSSHState *s,
 libssh2_knownhost_readfile(knh, knh_file, LIBSSH2_KNOWNHOST_FILE_OPENSSH);
 
 r = libssh2_knownhost_checkp(knh, host, port, hostkey, len,
- LIBSSH2_KNOWNHOST_TYPE_PLAIN|
+ LIBSSH2_KNOWNHOST_TYPE_PLAIN |
  LIBSSH2_KNOWNHOST_KEYENC_RAW,
  &found);
 switch (r) {
@@ -405,15 +405,18 @@ static int compare_fingerprint(const unsigned char *fingerprint, size_t len,
 unsigned c;
 
 while (len > 0) {
-while (*host_key_check == ':')
+while (*host_key_check == ':') {
 host_key_check++;
+}
 if (!qemu_isxdigit(host_key_check[0]) ||
-!qemu_isxdigit(host_key_check[1]))
+!qemu_isxdigit(host_key_check[1])) {
 return 1;
+}
 c = hex2decimal(host_key_check[0]) * 16 +
 hex2decimal(host_key_check[1]);
-if (c - *fingerprint != 0)
+if (c - *fingerprint != 0) {
 return c - *fingerprint;
+}
 fingerprint++;
 len--;
 host_key_check += 2;
@@ -433,8 +436,8 @@ check_host_key_hash(BDRVSSHState *s, const char *hash,
 return -EINVAL;
 }
 
-if(compare_fingerprint((unsigned char *) fingerprint, fingerprint_len,
-   hash) != 0) {
+if (compare_fingerprint((unsigned char *) fingerprint, fingerprint_len,
+hash) != 0) {
 error_setg(errp, "remote host key does not match host_key_check '%s'",
hash);
 return -EPERM;
@@ -507,7 +510,7 @@ static int authenticate(BDRVSSHState *s, const char *user, Error **errp)
 goto out;
 }
 
-for(;;) {
+for (;;) {
 r = libssh2_agent_get_identity(agent, &identity, prev_identity);
 if (r == 1) {   /* end of list */
 break;
@@ -863,8 +866,8 @@ static int ssh_create(const char *filename, QemuOpts *opts, Error **errp)
 }
 
 r = connect_to_ssh(&s, uri_options,
-   LIBSSH2_FXF_READ|LIBSSH2_FXF_WRITE|
-   LIBSSH2_FXF_CREAT|LIBSSH2_FXF_TRUNC,
+   LIBSSH2_FXF_READ  | LIBSSH2_FXF_WRITE |
+   LIBSSH2_FXF_CREAT | LIBSSH2_FXF_TRUNC,
0644, errp);
 if (r < 0) {
 ret = r;
@@ -872,7 +875,7 @@ static int ssh_create(const char *filename, QemuOpts *opts, Error **errp)
 }
 
 if (total_size > 0) {
-libssh2_sftp_seek64(s.sftp_handle, total_size-1);
+libssh2_sftp_seek64(s.sftp_handle, total_size - 1);
 r2 = libssh2_sftp_write(s.sftp_handle, c, 1);
 if (r2 < 0) {
 sftp_error_setg(errp, &s, "truncate failed");
@@ -,7 +1114,7 @@ static int ssh_write(BDRVSSHState *s, BlockDriverState *bs,
  * works for me.
  */
 if (r == 0) {
-ssh_seek(s, offset + written, SSH_SEEK_WRITE|SSH_SEEK_FORCE);
+ssh_seek(s, offset + written, SSH_SEEK_WRITE | SSH_SEEK_FORCE);
 co_yield(s, bs);
 goto again;
 }
@@ -1125,8 +1128,9 @@ static int ssh_write(BDRVSSHState *s, BlockDriverState *bs,
 end_of_vec = i->iov_base + i->iov_len;
 }
 
-if (offset + written > s->attrs.filesize)
+if (offset + written > s->attrs.filesize) {
 s->attrs.filesize = offset + written;
+}
 }
 
 return 0;
-- 
2.9.5

[Qemu-block] [PATCH v2 3/7] block/sheepdog: remove spurious NULL check

2017-08-30 Thread Jeff Cody

'tag' is already checked in the lines immediately preceding this check,
and set to non-NULL if NULL.  No need to check again, it hasn't changed.

Signed-off-by: Jeff Cody 
---
 block/sheepdog.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/sheepdog.c b/block/sheepdog.c
index abb2e79..bbbfa72 100644
--- a/block/sheepdog.c
+++ b/block/sheepdog.c
@@ -1632,7 +1632,7 @@ static int sd_open(BlockDriverState *bs, QDict *options, int flags,
 if (!tag) {
 tag = "";
 }
-if (tag && strlen(tag) >= SD_MAX_VDI_TAG_LEN) {
+if (strlen(tag) >= SD_MAX_VDI_TAG_LEN) {
 error_setg(errp, "value of parameter 'tag' is too long");
 ret = -EINVAL;
 goto err_no_fd;
-- 
2.9.5
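
The reasoning in this commit message (a pointer that was just normalized to a non-NULL default needs no second NULL test) can be sketched as a standalone function. `check_tag` and `MAX_TAG_LEN` are hypothetical stand-ins for the logic in `sd_open()` and for `SD_MAX_VDI_TAG_LEN`; the value 256 is an assumption for the example.

```c
#include <errno.h>
#include <string.h>

#define MAX_TAG_LEN 256   /* hypothetical limit, stands in for SD_MAX_VDI_TAG_LEN */

/* Returns 0 if the tag is acceptable, -EINVAL if it is too long. */
static int check_tag(const char *tag)
{
    if (!tag) {
        tag = "";          /* normalize: tag is non-NULL from here on */
    }
    /* No "tag &&" needed: the block above guarantees a valid pointer. */
    if (strlen(tag) >= MAX_TAG_LEN) {
        return -EINVAL;
    }
    return 0;
}
```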

[Qemu-block] [PATCH v2 0/7] Code cleanup and minor fixes

2017-08-30 Thread Jeff Cody

Minor bug fixes and code cleanup.

Changes from v1 -> v2:

Rebased to v2.10.0

Jeff Cody (7):
  block/ssh: don't call libssh2_init() in block_init()
  block/ssh: make compliant with coding guidelines
  block/sheepdog: remove spurious NULL check
  block/sheepdog: code beautification
  block/curl: check error return of curl_global_init()
  block/curl: fix minor memory leaks
  block/curl: code cleanup to comply with coding style

 block/curl.c | 124 +++--
 block/sheepdog.c | 164 +++
 block/ssh.c  |  72 +++-
 3 files changed, 199 insertions(+), 161 deletions(-)

-- 
2.9.5

Re: [Qemu-block] [PATCH v2] qcow2: allocate cluster_cache/cluster_data on demand

2017-08-30 Thread Stefan Hajnoczi

On Mon, Aug 21, 2017 at 02:55:30PM +0100, Stefan Hajnoczi wrote:
> Most qcow2 files are uncompressed so it is wasteful to allocate (32 + 1)
> * cluster_size + 512 bytes upfront.  Allocate s->cluster_cache and
> s->cluster_data when the first read operation is performed on a
> compressed cluster.
> 
> The buffers are freed in .bdrv_close().  .bdrv_open() no longer has any
> code paths that can allocate these buffers, so remove the free functions
> in the error code path.
> 
> This patch can result in significant memory savings when many qcow2
> disks are attached or backing file chains are long:
> 
> Before 12.81% (1,023,193,088B)
> After   5.36% (393,893,888B)
> 
> Reported-by: Alexey Kardashevskiy 
> Tested-by: Alexey Kardashevskiy 
> Cc: Kevin Wolf 
> Signed-off-by: Stefan Hajnoczi 
> ---
> v2:
>  * Changed EIO to ENOMEM [Eric]
>  * Added Alexey's Tested-by
> ---
>  block/qcow2-cluster.c | 17 +
>  block/qcow2.c | 12 
>  2 files changed, 17 insertions(+), 12 deletions(-)

Thanks, applied to my block-next tree:
https://github.com/stefanha/qemu/commits/block-next

Stefan
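
For readers skimming the archive, the allocate-on-demand idea described in the quoted commit message can be sketched as follows. This is not the qcow2 code: `State`, `ensure_compress_buffers` and `state_close` are hypothetical names, and the split of the (32 + 1) * cluster_size + 512 bytes into one cluster-sized cache plus a 32-cluster data buffer is an assumption made for the example.

```c
#include <errno.h>
#include <stdlib.h>

typedef struct State {
    size_t cluster_size;
    unsigned char *cluster_cache;  /* NULL until the first compressed read */
    unsigned char *cluster_data;   /* NULL until the first compressed read */
} State;

/* Called on the compressed-read path only; cheap once buffers exist. */
static int ensure_compress_buffers(State *s)
{
    if (!s->cluster_cache) {
        s->cluster_cache = malloc(s->cluster_size);
        if (!s->cluster_cache) {
            return -ENOMEM;
        }
    }
    if (!s->cluster_data) {
        s->cluster_data = malloc(32 * s->cluster_size + 512);
        if (!s->cluster_data) {
            return -ENOMEM;
        }
    }
    return 0;
}

/* Counterpart of the close path: frees whatever was allocated. */
static void state_close(State *s)
{
    free(s->cluster_cache);
    free(s->cluster_data);
    s->cluster_cache = NULL;
    s->cluster_data = NULL;
}
```

An image that never reads a compressed cluster never calls the first function, so it never pays for either buffer.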

[Qemu-block] [PATCH v2 0/3] Live block optional disable

2017-08-30 Thread Jeff Cody

This series adds a configurable option to disable live block operations.

The default is that live block operations are 'enabled'.

Jeffrey Cody (3):
  configure: Add option in configure to disable live block ops
  block-jobs: Optionally unregister live block operations
  hmp: Optionally disable live block operations in HMP monitor

 configure| 11 +++
 hmp-commands-info.hx |  4 
 hmp-commands.hx  | 12 
 hmp.c| 12 
 monitor.c| 16 
 5 files changed, 55 insertions(+)

-- 
2.9.5

[Qemu-block] [PATCH v2 2/3] block-jobs: Optionally unregister live block operations

2017-08-30 Thread Jeff Cody

From: Jeffrey Cody 

If configured without live block operations enabled, unregister the
live block operation commands.

Signed-off-by: Jeff Cody 
---
 monitor.c | 16 
 1 file changed, 16 insertions(+)

diff --git a/monitor.c b/monitor.c
index e0f8801..de0a70e 100644
--- a/monitor.c
+++ b/monitor.c
@@ -998,6 +998,22 @@ static void qmp_unregister_commands_hack(void)
 && !defined(TARGET_S390X)
 qmp_unregister_command(&qmp_commands, "query-cpu-definitions");
 #endif
+#ifndef CONFIG_LIVE_BLOCK_OPS
+qmp_unregister_command(&qmp_commands, "block-stream");
+qmp_unregister_command(&qmp_commands, "block-commit");
+qmp_unregister_command(&qmp_commands, "drive-mirror");
+qmp_unregister_command(&qmp_commands, "blockdev-mirror");
+qmp_unregister_command(&qmp_commands, "drive-backup");
+qmp_unregister_command(&qmp_commands, "blockdev-backup");
+qmp_unregister_command(&qmp_commands, "blockdev-snapshot");
+qmp_unregister_command(&qmp_commands, "blockdev-snapshot-sync");
+qmp_unregister_command(&qmp_commands, "block-job-set-speed");
+qmp_unregister_command(&qmp_commands, "block-job-cancel");
+qmp_unregister_command(&qmp_commands, "block-job-pause");
+qmp_unregister_command(&qmp_commands, "block-job-resume");
+qmp_unregister_command(&qmp_commands, "block-job-complete");
+qmp_unregister_command(&qmp_commands, "query-block-jobs");
+#endif
 }
 
 void monitor_init_qmp_commands(void)
-- 
2.9.5
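
The compile-time gating used by this patch can be tried outside QEMU with a small sketch. The command table, `find_cmd`, `unregister_cmd` and `cmd_registered` are hypothetical stand-ins for QEMU's QMP command list; only the `CONFIG_LIVE_BLOCK_OPS` macro name and the command names are taken from the patch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical command table; QEMU keeps its own command list type. */
typedef struct Cmd {
    const char *name;
    int registered;
} Cmd;

static Cmd cmds[] = {
    { "query-status",     1 },
    { "block-stream",     1 },
    { "block-commit",     1 },
    { "query-block-jobs", 1 },
};

#define NUM_CMDS (sizeof(cmds) / sizeof(cmds[0]))

static Cmd *find_cmd(const char *name)
{
    size_t i;

    for (i = 0; i < NUM_CMDS; i++) {
        if (strcmp(cmds[i].name, name) == 0) {
            return &cmds[i];
        }
    }
    return NULL;
}

static void unregister_cmd(const char *name)
{
    Cmd *c = find_cmd(name);

    if (c) {
        c->registered = 0;
    }
}

static int cmd_registered(const char *name)
{
    Cmd *c = find_cmd(name);

    return c && c->registered;
}

/* Compiled without -DCONFIG_LIVE_BLOCK_OPS, the block commands vanish. */
static void unregister_commands_hack(void)
{
#ifndef CONFIG_LIVE_BLOCK_OPS
    unregister_cmd("block-stream");
    unregister_cmd("block-commit");
    unregister_cmd("query-block-jobs");
#endif
}
```

Building the sketch with `-DCONFIG_LIVE_BLOCK_OPS` leaves every entry registered, which mirrors the default of the configure option.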

[Qemu-block] [PATCH v2 5/7] block/curl: check error return of curl_global_init()

2017-08-30 Thread Jeff Cody

If curl_global_init() fails, per the documentation no other curl
functions may be called, so make sure to check the return value.

Also, some minor changes to the initialization latch variable 'inited':

- Make it static in the file, for clarity
- Change the name for clarity
- Make it a bool

Signed-off-by: Jeff Cody 
---
 block/curl.c | 18 --
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/block/curl.c b/block/curl.c
index 2a244e2..00a9879 100644
--- a/block/curl.c
+++ b/block/curl.c
@@ -89,6 +89,8 @@ static CURLMcode __curl_multi_socket_action(CURLM *multi_handle,
 
 struct BDRVCURLState;
 
+static bool libcurl_initialized;
+
 typedef struct CURLAIOCB {
 Coroutine *co;
 QEMUIOVector *qiov;
@@ -686,14 +688,23 @@ static int curl_open(BlockDriverState *bs, QDict *options, int flags,
 double d;
 const char *secretid;
 const char *protocol_delimiter;
+int ret;
 
-static int inited = 0;
 
 if (flags & BDRV_O_RDWR) {
 error_setg(errp, "curl block device does not support writes");
 return -EROFS;
 }
 
+if (!libcurl_initialized) {
+ret = curl_global_init(CURL_GLOBAL_ALL);
+if (ret) {
+error_setg(errp, "libcurl initialization failed with %d", ret);
+return -EIO;
+}
+libcurl_initialized = true;
+}
+
 qemu_mutex_init(&s->mutex);
 opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
 qemu_opts_absorb_qdict(opts, options, &local_err);
@@ -772,11 +783,6 @@ static int curl_open(BlockDriverState *bs, QDict *options, int flags,
 }
 }
 
-if (!inited) {
-curl_global_init(CURL_GLOBAL_ALL);
-inited = 1;
-}
-
 DPRINTF("CURL: Opening %s\n", file);
 QSIMPLEQ_INIT(&s->free_state_waitq);
 s->aio_context = bdrv_get_aio_context(bs);
-- 
2.9.5
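
The latch pattern from this patch (initialize once, set the flag only on success, report failure to the caller) can be sketched without linking libcurl. `fake_global_init` is a hypothetical stand-in for `curl_global_init()`, and `ensure_lib_initialized` is an illustrative name; like the patch, the sketch makes no attempt at thread safety.

```c
#include <errno.h>
#include <stdbool.h>

static bool lib_initialized;   /* file-scope latch, as in the patch */
static int fake_init_calls;    /* instrumentation for the example */
static int fake_init_result;   /* what the stand-in init returns */

/* Stand-in for the library's global init; nonzero means failure. */
static int fake_global_init(void)
{
    fake_init_calls++;
    return fake_init_result;
}

/* Returns 0 once the library is usable, -EIO if init failed. */
static int ensure_lib_initialized(void)
{
    if (!lib_initialized) {
        int ret = fake_global_init();

        if (ret) {
            return -EIO;       /* latch stays false, so a later open retries */
        }
        lib_initialized = true;
    }
    return 0;
}
```

Setting the flag only after a successful call is the important detail: a failed attempt must not leave the driver believing the library is ready.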

[Qemu-block] [PATCH v2 1/7] block/ssh: don't call libssh2_init() in block_init()

2017-08-30 Thread Jeff Cody

We don't need libssh2 failure to be fatal (we could just opt to not
register the driver on failure). But, it is probably a good idea to
avoid external library calls during the block_init(), and call the
libssh2 global init function on the first usage, returning any errors.

Signed-off-by: Jeff Cody 
---
 block/ssh.c | 40 +---
 1 file changed, 29 insertions(+), 11 deletions(-)

diff --git a/block/ssh.c b/block/ssh.c
index e8f0404..cbb0e34 100644
--- a/block/ssh.c
+++ b/block/ssh.c
@@ -83,12 +83,28 @@ typedef struct BDRVSSHState {
 bool unsafe_flush_warning;
 } BDRVSSHState;
 
-static void ssh_state_init(BDRVSSHState *s)
+static bool ssh_libinit_called;
+
+static int ssh_state_init(BDRVSSHState *s, Error **errp)
 {
+int ret;
+
+if (!ssh_libinit_called) {
+ret = libssh2_init(0);
+if (ret) {
+error_setg(errp, "libssh2 initialization failed with %d", ret);
+return ret;
+}
+ssh_libinit_called = true;
+}
+
+
 memset(s, 0, sizeof *s);
 s->sock = -1;
 s->offset = -1;
 qemu_co_mutex_init(&s->lock);
+
+return 0;
 }
 
 static void ssh_state_free(BDRVSSHState *s)
@@ -772,8 +788,13 @@ static int ssh_file_open(BlockDriverState *bs, QDict *options, int bdrv_flags,
 BDRVSSHState *s = bs->opaque;
 int ret;
 int ssh_flags;
+Error *local_err = NULL;
 
-ssh_state_init(s);
+ret = ssh_state_init(s, &local_err);
+if (local_err) {
+error_propagate(errp, local_err);
+return ret;
+}
 
 ssh_flags = LIBSSH2_FXF_READ;
 if (bdrv_flags & BDRV_O_RDWR) {
@@ -821,8 +842,13 @@ static int ssh_create(const char *filename, QemuOpts *opts, Error **errp)
 BDRVSSHState s;
 ssize_t r2;
 char c[1] = { '\0' };
+Error *local_err = NULL;
 
-ssh_state_init(&s);
+ret = ssh_state_init(&s, &local_err);
+if (local_err) {
+error_propagate(errp, local_err);
+return ret;
+}
 
 /* Get desired file size. */
 total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
@@ -1213,14 +1239,6 @@ static BlockDriver bdrv_ssh = {
 
 static void bdrv_ssh_init(void)
 {
-int r;
-
-r = libssh2_init(0);
-if (r != 0) {
-fprintf(stderr, "libssh2 initialization failed, %d\n", r);
-exit(EXIT_FAILURE);
-}
-
 bdrv_register(&bdrv_ssh);
 }
 
-- 
2.9.5

[Qemu-block] [PATCH v3 5/5] qemu-iotests: add option to save temp files on error

2017-08-30 Thread Jeff Cody

Now that ./check takes care of cleaning up after each test, it
can also selectively not clean up.  Add option to leave all output from
tests intact if that test encountered an error.

Note: this currently only works for bash tests, as the python tests
still clean up after themselves manually.

Signed-off-by: Jeff Cody 
---
 tests/qemu-iotests/check  | 10 +-
 tests/qemu-iotests/common |  6 ++
 2 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/check b/tests/qemu-iotests/check
index f6ca85d..8a5fc0d 100755
--- a/tests/qemu-iotests/check
+++ b/tests/qemu-iotests/check
@@ -370,7 +370,15 @@ do
 fi
 fi
 
-rm -rf "$TEST_DIR_SEQ"
+#TODO: There is some initial work to save intermediate files
+#  in python tests, but it is imperfect.  Having each
+#  test record its test name, and the tearDown function
+#  just move intermediate images to a subdirectory with
+#  the test name may prove more useful.
+if [ "$save_on_err" != "true" ] || [ "$err" != "true" ]
+then
+rm -rf "$TEST_DIR_SEQ"
+fi
 
 fi
 
diff --git a/tests/qemu-iotests/common b/tests/qemu-iotests/common
index d34c11c..d08b233 100644
--- a/tests/qemu-iotests/common
+++ b/tests/qemu-iotests/common
@@ -42,6 +42,7 @@ expunge=true
 have_test_arg=false
 randomize=false
 cachemode=false
+save_on_err=false
 rm -f $tmp.list $tmp.tmp $tmp.sed
 
 export IMGFMT=raw
@@ -172,6 +173,7 @@ other options
 -T  output timestamps
 -r  randomize test order
 -c mode cache mode
+-s  save test scratch directory on test failure
 
 testlist options
 -g group[,group...]include tests from these groups
@@ -349,6 +351,10 @@ testlist options
 xgroup=true
 xpand=false
 ;;
+-s)
+save_on_err=true
+xpand=false
+;;
 '[0-9][0-9][0-9] [0-9][0-9][0-9][0-9]')
 echo "No tests?"
 status=1
-- 
2.9.5

[Qemu-block] [PATCH v3 4/5] qemu-iotests: make python tests attempt to leave intermediate files

2017-08-30 Thread Jeff Cody

Now that 'check' will clean up after tests, try and make python
tests leave intermediate files so that they might be inspectable
on failure.

This isn't perfect; the python unittest framework runs multiple
tests, even if previous tests failed.  So we need to make sure that
each test still begins with a "clean" slate, to prevent false
positives or tainted test runs.

Rather than delete images in the unittest tearDown, invert this
and delete images to be used in that test at the beginning of the
setUp.  This is to make sure that the test run is not inadvertently
using file droppings from previous runs.  We must use 'blind_remove'
then for these, as the files might not exist yet, but we don't want
to throw an error for that.

Signed-off-by: Jeff Cody 
---
 tests/qemu-iotests/030 |  8 +++---
 tests/qemu-iotests/040 | 35 ++---
 tests/qemu-iotests/041 | 70 +-
 tests/qemu-iotests/044 |  8 ++
 tests/qemu-iotests/045 | 14 +-
 tests/qemu-iotests/055 | 36 +-
 tests/qemu-iotests/056 | 13 --
 tests/qemu-iotests/057 |  6 ++---
 tests/qemu-iotests/065 |  6 ++---
 tests/qemu-iotests/096 |  5 ++--
 tests/qemu-iotests/118 | 31 ++
 tests/qemu-iotests/124 | 21 +--
 tests/qemu-iotests/132 |  9 +++
 tests/qemu-iotests/136 |  3 ++-
 tests/qemu-iotests/139 |  6 ++---
 tests/qemu-iotests/147 | 16 
 tests/qemu-iotests/148 |  7 ++---
 tests/qemu-iotests/152 |  9 +++
 tests/qemu-iotests/155 | 15 +--
 tests/qemu-iotests/165 |  6 ++---
 20 files changed, 130 insertions(+), 194 deletions(-)

diff --git a/tests/qemu-iotests/030 b/tests/qemu-iotests/030
index d745cb4..051fb0c 100755
--- a/tests/qemu-iotests/030
+++ b/tests/qemu-iotests/030
@@ -21,7 +21,7 @@
 import time
 import os
 import iotests
-from iotests import qemu_img, qemu_io
+from iotests import qemu_img, qemu_io, blind_remove
 
 backing_img = os.path.join(iotests.test_dir, 'backing.img')
 mid_img = os.path.join(iotests.test_dir, 'mid.img')
@@ -31,6 +31,9 @@ class TestSingleDrive(iotests.QMPTestCase):
 image_len = 1 * 1024 * 1024 # MB
 
 def setUp(self):
+blind_remove(test_img)
+blind_remove(mid_img)
+blind_remove(backing_img)
 iotests.create_image(backing_img, TestSingleDrive.image_len)
 qemu_img('create', '-f', iotests.imgfmt, '-o', 'backing_file=%s' % backing_img, mid_img)
 qemu_img('create', '-f', iotests.imgfmt, '-o', 'backing_file=%s' % mid_img, test_img)
@@ -41,9 +44,6 @@ class TestSingleDrive(iotests.QMPTestCase):
 
 def tearDown(self):
 self.vm.shutdown()
-os.remove(test_img)
-os.remove(mid_img)
-os.remove(backing_img)
 
 def test_stream(self):
 self.assert_no_active_block_jobs()
diff --git a/tests/qemu-iotests/040 b/tests/qemu-iotests/040
index 95b7510..736afa7 100755
--- a/tests/qemu-iotests/040
+++ b/tests/qemu-iotests/040
@@ -24,7 +24,7 @@
 import time
 import os
 import iotests
-from iotests import qemu_img, qemu_io
+from iotests import qemu_img, qemu_io, blind_remove
 import struct
 import errno
 
@@ -76,6 +76,9 @@ class TestSingleDrive(ImageCommitTestCase):
 test_len = 1 * 1024 * 256
 
 def setUp(self):
+blind_remove(test_img)
+blind_remove(mid_img)
+blind_remove(backing_img)
 iotests.create_image(backing_img, self.image_len)
 qemu_img('create', '-f', iotests.imgfmt, '-o', 'backing_file=%s' % backing_img, mid_img)
 qemu_img('create', '-f', iotests.imgfmt, '-o', 'backing_file=%s' % mid_img, test_img)
@@ -88,9 +91,6 @@ class TestSingleDrive(ImageCommitTestCase):
 
 def tearDown(self):
 self.vm.shutdown()
-os.remove(test_img)
-os.remove(mid_img)
-os.remove(backing_img)
 
 def test_commit(self):
 self.run_commit_test(mid_img, backing_img)
@@ -214,6 +214,9 @@ class TestRelativePaths(ImageCommitTestCase):
 except OSError as exception:
 if exception.errno != errno.EEXIST:
 raise
+blind_remove(self.test_img)
+blind_remove(self.mid_img_abs)
+blind_remove(self.backing_img_abs)
 iotests.create_image(self.backing_img_abs, TestRelativePaths.image_len)
 qemu_img('create', '-f', iotests.imgfmt, '-o', 'backing_file=%s' % self.backing_img_abs, self.mid_img_abs)
 qemu_img('create', '-f', iotests.imgfmt, '-o', 'backing_file=%s' % self.mid_img_abs, self.test_img)
@@ -226,16 +229,6 @@ class TestRelativePaths(ImageCommitTestCase):
 
 def tearDown(self):
 self.vm.shutdown()
-os.remove(self.test_img)
-os.remove(self.mid_img_abs)
-os.remove(self.backing_img_abs)
-try:
-os.rmdir(os.path.join(iotests.test_dir, self.dir1))
-os.rmdir(os.path.join(iotests.test_dir, self.dir3))
-os.rmdir(os.path.join(iotests.test_dir, self.dir2))
-except OSError

[Qemu-block] [PATCH v3 2/5] qemu-iotests: remove file cleanup from bash tests

2017-08-30 Thread Jeff Cody

All files for a given test are now self-contained in a subdirectory,
and therefore the "./check" script can do all file-related cleanup
without any help.

This removes file cleanups from the bash tests.  The only cleanup left
is whatever is needed to kill any spawned processes; e.g. _cleanup_qemu.

Reviewed-by: Eric Blake 
Signed-off-by: Jeff Cody 
---
 tests/qemu-iotests/001 |  6 --
 tests/qemu-iotests/002 |  6 --
 tests/qemu-iotests/003 |  6 --
 tests/qemu-iotests/004 |  6 --
 tests/qemu-iotests/005 |  6 --
 tests/qemu-iotests/007 |  7 ---
 tests/qemu-iotests/008 |  6 --
 tests/qemu-iotests/009 |  6 --
 tests/qemu-iotests/010 |  6 --
 tests/qemu-iotests/011 |  6 --
 tests/qemu-iotests/012 |  6 --
 tests/qemu-iotests/013 |  6 --
 tests/qemu-iotests/014 |  6 --
 tests/qemu-iotests/015 |  7 ---
 tests/qemu-iotests/017 |  6 --
 tests/qemu-iotests/018 |  6 --
 tests/qemu-iotests/019 |  8 
 tests/qemu-iotests/020 |  8 
 tests/qemu-iotests/021 |  6 --
 tests/qemu-iotests/022 |  6 --
 tests/qemu-iotests/023 |  6 --
 tests/qemu-iotests/024 |  8 
 tests/qemu-iotests/025 |  6 --
 tests/qemu-iotests/026 |  7 ---
 tests/qemu-iotests/027 |  6 --
 tests/qemu-iotests/028 |  8 
 tests/qemu-iotests/029 |  7 ---
 tests/qemu-iotests/031 |  6 --
 tests/qemu-iotests/032 |  6 --
 tests/qemu-iotests/033 |  6 --
 tests/qemu-iotests/034 |  6 --
 tests/qemu-iotests/035 |  6 --
 tests/qemu-iotests/036 |  6 --
 tests/qemu-iotests/037 |  6 --
 tests/qemu-iotests/038 |  6 --
 tests/qemu-iotests/039 |  6 --
 tests/qemu-iotests/042 |  6 --
 tests/qemu-iotests/043 |  7 ---
 tests/qemu-iotests/046 |  6 --
 tests/qemu-iotests/047 |  6 --
 tests/qemu-iotests/048 |  8 
 tests/qemu-iotests/048.out |  1 -
 tests/qemu-iotests/049 |  6 --
 tests/qemu-iotests/050 |  8 
 tests/qemu-iotests/051 |  6 --
 tests/qemu-iotests/052 |  6 --
 tests/qemu-iotests/053 |  7 ---
 tests/qemu-iotests/054 |  6 --
 tests/qemu-iotests/058 |  8 +---
 tests/qemu-iotests/059 |  7 ---
 tests/qemu-iotests/060 |  6 --
 tests/qemu-iotests/061 |  6 --
 tests/qemu-iotests/062 |  6 --
 tests/qemu-iotests/063 |  7 ---
 tests/qemu-iotests/064 |  6 --
 tests/qemu-iotests/066 |  6 --
 tests/qemu-iotests/068 |  6 --
 tests/qemu-iotests/069 |  6 --
 tests/qemu-iotests/070 |  6 --
 tests/qemu-iotests/071 |  6 --
 tests/qemu-iotests/072 |  6 --
 tests/qemu-iotests/073 |  6 --
 tests/qemu-iotests/074 |  9 -
 tests/qemu-iotests/074.out |  1 -
 tests/qemu-iotests/075 |  6 --
 tests/qemu-iotests/076 |  6 --
 tests/qemu-iotests/077 |  6 --
 tests/qemu-iotests/078 |  6 --
 tests/qemu-iotests/079 |  6 --
 tests/qemu-iotests/080 |  7 ---
 tests/qemu-iotests/081 |  8 
 tests/qemu-iotests/082 |  6 --
 tests/qemu-iotests/084 |  6 --
 tests/qemu-iotests/085 | 13 +
 tests/qemu-iotests/086 |  6 --
 tests/qemu-iotests/088 |  7 ---
 tests/qemu-iotests/089 |  6 --
 tests/qemu-iotests/090 |  6 --
 tests/qemu-iotests/091 |  8 +---
 tests/qemu-iotests/092 |  7 ---
 tests/qemu-iotests/094 |  9 +
 tests/qemu-iotests/095 |  8 +---
 tests/qemu-iotests/097 |  7 ---
 tests/qemu-iotests/098 |  7 ---
 tests/qemu-iotests/099 |  6 --
 tests/qemu-iotests/101 |  6 --
 tests/qemu-iotests/102 |  7 +--
 tests/qemu-iotests/103 |  6 --
 tests/qemu-iotests/104 |  2 --
 tests/qemu-iotests/105 |  6 --
 tests/qemu-iotests/106 |  6 --
 tests/qemu-iotests/107 |  6 --
 tests/qemu-iotests/108 |  6 --
 tests/qemu-iotests/109 |  8 +---
 tests/qemu-iotests/110 |  6 --
 tests/qemu-iotests/111 |  6 --
 tests/qemu-iotests/112 |  6 --
 tests/qemu-iotests/113 |  6 --
 tests/qemu-iotests/114 |  6 --
 tests/qemu-iotests/115 |  6 --
 tests/qemu-iotests/116 |  6 --
 tests/qemu-iotests/117 |  7 +--
 tests/qemu-iotests/119 |  6 --
 tests/qemu-iotests/120 |  6 --
 tests/qemu-iotests/121 |  6 --
 tests/qemu-iotests/122 |  7 ---
 tests/qemu-iotests/123 |  7 ---
 tests/qemu-iotests/125 |  6 --
 tests/qemu-iotests/130 |  7 +--
 tests/qemu-iotests/131 |  6 --
 tests/qemu-iotests/133 |  6 --
 tests/qemu-iotests/134 |  6 --
 tests/qemu-iotests/135 |  6 --
 tests/qemu-iotests/137 |  6 --
 tests/qemu-iotests/138 |  6 --
 test

[Qemu-block] [PATCH v3 0/5] qemu-iotests: place output in unique dir

2017-08-30 Thread Jeff Cody

Differences v2 -> v3:

git-backport-diff -r qemu/master..devel-iotests -u github/devel-next
Key:
[----] : patches are identical
[####] : number of functional differences between upstream/downstream patch
[down] : patch is downstream-only
The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively

001/5:[----] [--] 'qemu-iotests: set TEST_DIR to a unique dir for each test'
  ^^
  No patch diff, but fixed typo in commit message. [Thanks Eric]


002/5:[0006] [FC] 'qemu-iotests: remove file cleanup from bash tests'
  ^^
  Picked up test 192 in the patch.

003/5:[down] 'qemu-iotests: add 'blind_remove' for python tests'
004/5:[down] 'qemu-iotests: make python tests attempt to leave intermediate files'
  ^^
  Two new tests, to help address python intermediate files.


005/5:[0005] [FC] 'qemu-iotests: add option to save temp files on error'
  ^^
  Added TODO regarding python tests. [Thanks Markus]
  Dropped Eric's r-b, in case he is not OK with the TODO wording


Rebased to v2.10.0.


This series does 3 things:

1.) Sets TEST_DIR to a unique subdirectory for each test
2.) Has './check' be responsible for removing temporary files
3.) Add option to './check' to retain temporary files in case of error


Jeff Cody (5):
  qemu-iotests: set TEST_DIR to a unique dir for each test
  qemu-iotests: remove file cleanup from bash tests
  qemu-iotests: add 'blind_remove' for python tests
  qemu-iotests: make python tests attempt to leave intermediate files
  qemu-iotests: add option to save temp files on error

 tests/qemu-iotests/001|  6 
 tests/qemu-iotests/002|  6 
 tests/qemu-iotests/003|  6 
 tests/qemu-iotests/004|  6 
 tests/qemu-iotests/005|  6 
 tests/qemu-iotests/007|  7 -
 tests/qemu-iotests/008|  6 
 tests/qemu-iotests/009|  6 
 tests/qemu-iotests/010|  6 
 tests/qemu-iotests/011|  6 
 tests/qemu-iotests/012|  6 
 tests/qemu-iotests/013|  6 
 tests/qemu-iotests/014|  6 
 tests/qemu-iotests/015|  7 -
 tests/qemu-iotests/017|  6 
 tests/qemu-iotests/018|  6 
 tests/qemu-iotests/019|  8 -
 tests/qemu-iotests/020|  8 -
 tests/qemu-iotests/021|  6 
 tests/qemu-iotests/022|  6 
 tests/qemu-iotests/023|  6 
 tests/qemu-iotests/024|  8 -
 tests/qemu-iotests/025|  6 
 tests/qemu-iotests/026|  7 -
 tests/qemu-iotests/027|  6 
 tests/qemu-iotests/028|  8 -
 tests/qemu-iotests/029|  7 -
 tests/qemu-iotests/030|  8 ++---
 tests/qemu-iotests/031|  6 
 tests/qemu-iotests/032|  6 
 tests/qemu-iotests/033|  6 
 tests/qemu-iotests/034|  6 
 tests/qemu-iotests/035|  6 
 tests/qemu-iotests/036|  6 
 tests/qemu-iotests/037|  6 
 tests/qemu-iotests/038|  6 
 tests/qemu-iotests/039|  6 
 tests/qemu-iotests/040| 35 +-
 tests/qemu-iotests/041| 70 ++-
 tests/qemu-iotests/042|  6 
 tests/qemu-iotests/043|  7 -
 tests/qemu-iotests/044|  8 ++---
 tests/qemu-iotests/045| 14 -
 tests/qemu-iotests/046|  6 
 tests/qemu-iotests/047|  6 
 tests/qemu-iotests/048|  8 -
 tests/qemu-iotests/048.out|  1 -
 tests/qemu-iotests/049|  6 
 tests/qemu-iotests/050|  8 -
 tests/qemu-iotests/051|  6 
 tests/qemu-iotests/052|  6 
 tests/qemu-iotests/053|  7 -
 tests/qemu-iotests/054|  6 
 tests/qemu-iotests/055| 36 --
 tests/qemu-iotests/056| 13 
 tests/qemu-iotests/057|  6 ++--
 tests/qemu-iotests/058|  8 +
 tests/qemu-iotests/059|  7 -
 tests/qemu-iotests/060|  6 
 tests/qemu-iotests/061|  6 
 tests/qemu-iotests/062|  6 
 tests/qemu-iotests/063|  7 -
 tests/qemu-iotests/064|  6 
 tests/qemu-iotests/065|  6 ++--
 tests/qemu-iotests/066|  6 
 tests/qemu-iotests/068|  6 
 tests/qemu-iotests/069|  6 
 tests/qemu-iotests/070|  6 
 tests/qemu-iotests/071|  6 
 tests/qemu-iotests/072|  6 
 tests/qemu-iotests/073|  6 
 tests/qemu-iotests/074|  9 --
 tests/qemu-iotests/074.out|  1 -
 tests/qemu-iotests/075|  6 
 tests/qemu-iotests/076|  6 
 tests/qemu-iotests/077|  6 
 tests/qemu-iotests/078|  6 
 tests/qemu-iotests/079|  6 
 tests/qemu-iotests/080|  7 -
 tests/qemu-iotests/081|  8 -
 tests/qemu-iotests/082

[Qemu-block] [PATCH v3 1/5] qemu-iotests: set TEST_DIR to a unique dir for each test

2017-08-30 Thread Jeff Cody

Right now, all qemu-iotests output data into the same scratch directory,
and so each test needs to be responsible for cleaning up its own files.

Have each test use 'scratch/$seq' as its temp directory, so the check
script can do simple cleanup of removing the whole temporary directory.

Reviewed-by: Eric Blake 
Signed-off-by: Jeff Cody 
---
 tests/qemu-iotests/check | 21 +
 1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/tests/qemu-iotests/check b/tests/qemu-iotests/check
index d504b6e..f6ca85d 100755
--- a/tests/qemu-iotests/check
+++ b/tests/qemu-iotests/check
@@ -243,6 +243,7 @@ seq="check"
 
 for seq in $list
 do
+TEST_DIR_SEQ=$TEST_DIR/$seq
 err=false
 printf %s "$seq"
 if [ -n "$TESTS_REMAINING_LOG" ] ; then
@@ -289,13 +290,23 @@ do
 fi
 export OUTPUT_DIR=$PWD
 if $debug; then
-(cd "$source_iotests";
+(
+export TEST_DIR=$TEST_DIR_SEQ
+. "$source_iotests/common.config"
+. "$source_iotests/common.rc"
+cd "$source_iotests" &&
 MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(($RANDOM % 255 + 1))} \
-$run_command -d 2>&1 | tee $tmp.out)
+$run_command -d 2>&1 | tee $tmp.out
+)
 else
-(cd "$source_iotests";
+(
+export TEST_DIR=$TEST_DIR_SEQ
+. "$source_iotests/common.config"
+. "$source_iotests/common.rc"
+ cd "$source_iotests" &&
 MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(($RANDOM % 255 + 1))} \
-$run_command >$tmp.out 2>&1)
+$run_command >$tmp.out 2>&1
+)
 fi
 sts=$?
 $timestamp && _timestamp
@@ -359,6 +370,8 @@ do
 fi
 fi
 
+rm -rf "$TEST_DIR_SEQ"
+
 fi
 
 # come here for each test, except when $showme is true
-- 
2.9.5

[Qemu-block] [PATCH v3 3/5] qemu-iotests: add 'blind_remove' for python tests

2017-08-30 Thread Jeff Cody

Add a function to attempt to 'blindly' remove a file, without
throwing an error if the file doesn't exist.

Signed-off-by: Jeff Cody 
---
 tests/qemu-iotests/iotests.py | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 7233983..a2088c7 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -57,6 +57,13 @@ qemu_default_machine = os.environ.get('QEMU_DEFAULT_MACHINE')
 socket_scm_helper = os.environ.get('SOCKET_SCM_HELPER', 'socket_scm_helper')
 debug = False
 
+def blind_remove(filename):
+    try:
+        os.remove(filename)
+    except OSError, error:
+        if error.errno != errno.ENOENT:
+            raise
+
 def qemu_img(*args):
     '''Run qemu-img and return the exit code'''
     devnull = open('/dev/null', 'r+')
-- 
2.9.5
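
For readers on Python 3, where the "except OSError, error" form is no longer valid syntax, an equivalent helper might look like the sketch below. It is an illustration of the same behaviour and not part of the patch.

```python
import errno
import os


def blind_remove(filename):
    """Remove filename, ignoring only the 'file does not exist' case."""
    try:
        os.remove(filename)
    except OSError as error:
        # Any error other than ENOENT (permissions, is-a-directory, ...)
        # is still reported to the caller.
        if error.errno != errno.ENOENT:
            raise
```

Calling it twice on the same path is harmless: the second call hits ENOENT and returns silently.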

Re: [Qemu-block] [Qemu-devel] [PATCH 09/10] scsi: add multipath support to qemu-pr-helper

2017-08-30 Thread Stefan Hajnoczi

On Tue, Aug 22, 2017 at 03:18:31PM +0200, Paolo Bonzini wrote:
> +static int multipath_pr_in(int fd, const uint8_t *cdb, uint8_t *sense,
> +   uint8_t *data, int sz)
> +{
> +int rq_servact = cdb[1];
> +struct prin_resp resp;
> +size_t written;
> +int r;
> +
> +switch (rq_servact) {
> +case MPATH_PRIN_RKEY_SA:
> +case MPATH_PRIN_RRES_SA:
> +case MPATH_PRIN_RCAP_SA:
> +break;
> +case MPATH_PRIN_RFSTAT_SA:
> +/* Nobody implements it anyway, so bail out. */
> +default:
> +/* Cannot parse any other output.  */
> +scsi_build_sense(sense, SENSE_CODE(INVALID_FIELD));
> +return CHECK_CONDITION;
> +}
> +
> +r = mpath_persistent_reserve_in(fd, rq_servact, &resp, noisy, verbose);
> +if (r == MPATH_PR_SUCCESS) {
> +switch (rq_servact) {

The case statements assume sz has a certain minimum value.  I didn't
see a check anywhere that guarantees this.  It may be easier to hide the
client's sz value and instead use sizeof(client->data).  The caller can
worry about sz.

> +case MPATH_PRIN_RKEY_SA:
> +case MPATH_PRIN_RRES_SA: {
> +struct prin_readdescr *out = &resp.prin_descriptor.prin_readkeys;
> +stl_be_p(&data[0], out->prgeneration);
> +stl_be_p(&data[4], out->additional_length);
> +memcpy(&data[8], out->key_list, MIN(out->additional_length, sz - 8));

sz < 8 is possible, please handle this case.

> +written = MIN(out->additional_length + 8, sz);
> +break;
> +}
> +case MPATH_PRIN_RCAP_SA: {
> +struct prin_capdescr *out = &resp.prin_descriptor.prin_readcap;
> +stw_be_p(&data[0], out->length);
> +data[2] = out->flags[0];
> +data[3] = out->flags[1];
> +stw_be_p(&data[4], out->pr_type_mask);
> +written = MIN(6, sz);
> +break;
> +}
> +default:
> +scsi_build_sense(sense, SENSE_CODE(INVALID_OPCODE));
> +return CHECK_CONDITION;
> +}
> +assert(written < sz);

Why is written == sz not allowed?
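
To make the two size concerns concrete (sz < 8, and written == sz), here is a small Python model of a clamped READ KEYS style response. It is a sketch only: build_read_keys_response is a hypothetical name, and building the full response before truncating is one possible approach, not necessarily what the patch should do.

```python
import struct


def build_read_keys_response(generation, key_list, sz):
    """Return (buffer, written) for a response clamped to sz bytes.

    The 8-byte header holds the generation and the additional length as
    big-endian 32-bit values, followed by the key list.
    """
    full = struct.pack(">II", generation, len(key_list)) + key_list
    written = min(len(full), sz)
    buf = bytearray(sz)            # zero-filled, so the tail needs no memset
    buf[:written] = full[:written]
    # A completely filled buffer is valid, hence <= rather than <.
    assert written <= sz
    return bytes(buf), written
```

With this shape a caller passing sz = 4 gets a truncated header instead of an out-of-bounds write, and a response that exactly fills the buffer does not trip the assertion.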

> +memset(data + written, 0, sz - written);
> +}
> +
> +return mpath_reconstruct_sense(fd, r, sense);
> +}
> +
> +static int multipath_pr_out(int fd, const uint8_t *cdb, uint8_t *sense,
> +const uint8_t *param, int sz)
> +{
> +int rq_servact = cdb[1];
> +int rq_scope = cdb[2] >> 4;
> +int rq_type = cdb[2] & 0xf;
> +struct prout_param_descriptor paramp;
> +char transportids[PR_HELPER_DATA_SIZE];
> +int r;
> +int i, j;
> +
> +switch (rq_servact) {
> +case MPATH_PROUT_REG_SA:
> +case MPATH_PROUT_RES_SA:
> +case MPATH_PROUT_REL_SA:
> +case MPATH_PROUT_CLEAR_SA:
> +case MPATH_PROUT_PREE_SA:
> +case MPATH_PROUT_PREE_AB_SA:
> +case MPATH_PROUT_REG_IGN_SA:
> +case MPATH_PROUT_REG_MOV_SA:
> +break;
> +default:
> +/* Cannot parse any other input.  */
> +scsi_build_sense(sense, SENSE_CODE(INVALID_FIELD));
> +return CHECK_CONDITION;
> +}
> +
> +/* Convert input data, especially transport IDs, to the structs
> + * used by libmpathpersist (which, of course, will immediately
> + * do the opposite).
> + */
> +memset(&paramp, 0, sizeof(paramp));
> +memcpy(&paramp.key, &param[0], 8);
> +memcpy(&paramp.sa_key, &param[8], 8);
> +paramp.sa_flags = param[10];
> +for (i = PR_OUT_FIXED_PARAM_SIZE, j = 0; i < sz; ) {
> +struct transportid *id = (struct transportid *) &transportids[j];
> +int len;
> +
> +id->format_code = param[i] & 0xc0;
> +id->protocol_id = param[i] & 0x0f;
> +switch (param[i] & 0xcf) {

At this point we know sz > PR_OUT_FIXED_PARAM_SIZE && i < sz.  I think
the following case statements can read beyond the end of client->data[]
because nothing checks sz before accessing param[].

Missing sz checks?

> +case 0:
> +/* FC transport.  */
> +memcpy(id->n_port_name, &param[i + 8], 8);
> +j += offsetof(struct transportid, n_port_name[8]);
> +i += 24;
> +break;
> +case 3:
> +case 0x43:
> +/* iSCSI transport.  */
> +len = lduw_be_p(&param[i + 2]);
> +if (len > 252 || (len & 3)) {

int len can be negative here :(.  Please use the size_t type - it's
unsigned and used by memchr(3)/memcpy(3).
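
As an illustration of the missing sz checks, the following Python sketch parses the transport ID list with an explicit bounds check before every access. The FC layout (name at offset 8, 24 bytes per entry) and the iSCSI length checks come from the quoted code; the assumption that the iSCSI name starts at offset 4 and that the entry occupies 4 + len bytes is mine, since that part of the patch is not quoted here. parse_transport_ids is a hypothetical name.

```python
import struct


def parse_transport_ids(param, start):
    """Parse transport IDs from param[start:], checking bounds first."""
    ids = []
    i = start
    while i < len(param):
        kind = param[i] & 0xcf
        if kind == 0:
            # FC transport: fixed 24-byte entry.
            if i + 24 > len(param):
                raise ValueError("truncated FC transport ID")
            ids.append(("fc", bytes(param[i + 8:i + 16])))
            i += 24
        elif kind in (3, 0x43):
            # iSCSI transport: 16-bit big-endian length at offset 2.
            if i + 4 > len(param):
                raise ValueError("truncated iSCSI header")
            (length,) = struct.unpack_from(">H", param, i + 2)
            if length > 252 or (length & 3):
                raise ValueError("bad iSCSI length")
            # Assumed layout: name follows the 4-byte header.
            if i + 4 + length > len(param):
                raise ValueError("truncated iSCSI name")
            ids.append(("iscsi", bytes(param[i + 4:i + 4 + length])))
            i += 4 + length
        else:
            raise ValueError("unsupported transport ID")
    return ids
```

The length is decoded as an unsigned value and compared against the remaining buffer, so neither a negative length nor a read past the end of the buffer can occur.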

Re: [Qemu-block] [PATCH 08/10] scsi: build qemu-pr-helper

2017-08-30 Thread Stefan Hajnoczi

On Tue, Aug 22, 2017 at 03:18:30PM +0200, Paolo Bonzini wrote:
> +#ifdef CONFIG_MPATH
> +dm_init();
> +multipath_pr_init();
> +#endif

This should be in the next patch.

Stefan

Re: [Qemu-block] [Qemu-devel] [PATCH 09/10] scsi: add multipath support to qemu-pr-helper

2017-08-30 Thread Stefan Hajnoczi

On Tue, Aug 22, 2017 at 03:18:31PM +0200, Paolo Bonzini wrote:
> @@ -444,6 +740,11 @@ static int drop_privileges(void)
>   CAP_SYS_RAWIO) < 0) {
>  return -1;
>  }
> +/* For /dev/mapper/control ioctls */
> +if (capng_update(CAPNG_ADD, CAPNG_EFFECTIVE | CAPNG_PERMITTED,
> + CAP_SYS_ADMIN) < 0) {
> +return -1;
> +}

Only if mpath is being used?  This capability isn't necessary with
ordinary sg_io so it would be nice to avoid keeping it in that case.

Re: [Qemu-block] [PATCH 08/10] scsi: build qemu-pr-helper

2017-08-30 Thread Stefan Hajnoczi

On Tue, Aug 22, 2017 at 03:18:30PM +0200, Paolo Bonzini wrote:
> diff --git a/docs/interop/pr-helper.rst b/docs/interop/pr-helper.rst
> new file mode 100644
> index 00..765174c31f
> --- /dev/null
> +++ b/docs/interop/pr-helper.rst
> @@ -0,0 +1,78 @@
> +..
> +
> +======================================
> +Persistent reservation helper protocol
> +======================================
> +
> +QEMU's SCSI passthrough devices, ``scsi-block`` and ``scsi-generic``,
> +can delegate implementation of persistent reservations to an external
> +(and typically privilege) program.  Persistent Reservations allow

privileged

> diff --git a/scsi/pr-helper.h b/scsi/pr-helper.h
> new file mode 100644
> index 00..2c7ccc9928
> --- /dev/null
> +++ b/scsi/pr-helper.h
> @@ -0,0 +1,13 @@

Do you want to license this file under the BSD license just in case
someone wants to copy it into an external helper implementation?  The
file is trivial but still.

> +#ifndef QEMU_PR_HELPER_H
> +#define QEMU_PR_HELPER_H 1
> +

Missing #include <stdint.h> for int32_t and uint8_t.

> +#include "qemu/osdep.h"
> +#include 
> +#include "qapi/error.h"
> +#include "qemu-common.h"
> +#include "qemu/cutils.h"
> +#include "qemu/main-loop.h"
> +#include "qemu/error-report.h"
> +#include "qemu/config-file.h"
> +#include "qemu/bswap.h"
> +#include "qemu/log.h"
> +#include "qemu/systemd.h"
> +#include "qapi/util.h"
> +#include "qapi/qmp/qstring.h"
> +#include "io/channel-socket.h"
> +#include "trace/control.h"
> +#include "qemu-version.h"
> +
> +#include "block/aio.h"
> +#include "block/thread-pool.h"
> +
> +#include "scsi/constants.h"
> +#include "scsi/utils.h"
> +#include "pr-helper.h"
> +#include 
> +#include 
> +#include 
> +
> +#ifdef CONFIG_LIBCAP
> +#include 
> +#endif
> +#include 
> +#include 

#include ordering

> +static int prh_read(PRHelperClient *client, void *buf, int sz, Error **errp)
> +{
> +while (sz > 0) {
> +int *fds = NULL;
> +size_t nfds = 0;
> +int i;
> +struct iovec iov;
> +ssize_t n_read;
> +
> +iov.iov_base = buf;
> +iov.iov_len = sz;
> +n_read = qio_channel_readv_full(QIO_CHANNEL(client->ioc), &iov, 1,
> +&fds, &nfds, errp);
> +
> +if (n_read == QIO_CHANNEL_ERR_BLOCK) {
> +qio_channel_yield(QIO_CHANNEL(client->ioc), G_IO_IN);
> +continue;
> +}
> +if (n_read <= 0) {
> +return n_read ? n_read : -1;

This assumes that client->fd == -1.  It's probably true on Linux but I'm
not sure.  What happens if the client sends an fd with a write that is
smaller than sz, and then follows up by closing the socket?  In the
worst case this would leak client->fd (the caller assumes it's -1 on
failure).

> +}
> +
> +/* Stash one file descriptor per request.  */
> +if (nfds) {
> +for (i = 0; i < nfds; i++) {
> +if (client->fd == -1) {
> +client->fd = fds[i++];

i++ looks like a bug.  The loop is already iterating fds[i] so we don't
need to increment it.  This would leak the following file descriptor.
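
The effect of the extra increment can be reproduced with a small Python model. This is a sketch: the quoted hunk is cut off after the assignment, so the assumption that every descriptor other than the stashed one gets closed is mine, and both function names are hypothetical.

```python
def stash_one_fd_buggy(current_fd, fds, close_fn):
    """Model of the loop as posted: fds[i++] inside a for loop over i."""
    i = 0
    while i < len(fds):
        if current_fd == -1:
            current_fd = fds[i]
            i += 1              # the extra i++ from the patch
        else:
            close_fn(fds[i])    # assumed: extra descriptors are closed
        i += 1                  # the for loop's own increment
    return current_fd


def stash_one_fd(current_fd, fds, close_fn):
    """Keep the first descriptor and close every other one."""
    for fd in fds:
        if current_fd == -1:
            current_fd = fd
        else:
            close_fn(fd)
    return current_fd
```

With three received descriptors the buggy variant stashes the first, skips the second without closing it, and closes only the third.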

> +static void prh_co_entry(void *opaque)

coroutine_fn

Re: [Qemu-block] [PATCH 07/10] io: add qio_channel_read/write_all

2017-08-30 Thread Stefan Hajnoczi

On Tue, Aug 22, 2017 at 03:18:29PM +0200, Paolo Bonzini wrote:
> @@ -315,6 +315,23 @@ ssize_t qio_channel_read(QIOChannel *ioc,
>   Error **errp);
>  
>  /**
> + * qio_channel_read_all:
> + * @ioc: the channel object
> + * @buf: the memory region to read data into
> + * @buflen: the number of bytes to @buf
> + * @errp: pointer to a NULL-initialized error object
> + *
> + * Reads @buflen bytes into @buf, possibly blocking or (if the
> + * channel is non-blocking) yielding from the current coroutine
> + * multiple times until the entire content is read.  Otherwise
> + * behaves as qio_channel_read().
> + */
> +ssize_t coroutine_fn qio_channel_read_all(QIOChannel *ioc,

This function is not coroutine_fn.  It only assumes coroutine context
when called on a non-blocking socket.

> +  char *buf,
> +  size_t buflen,
> +  Error **errp);
> +
> +/**
>   * qio_channel_write:
>   * @ioc: the channel object
>   * @buf: the memory regions to send data from
> @@ -331,6 +348,23 @@ ssize_t qio_channel_write(QIOChannel *ioc,
>Error **errp);
>  
>  /**
> + * qio_channel_write_all:
> + * @ioc: the channel object
> + * @buf: the memory region to write data into
> + * @buflen: the number of bytes to @buf
> + * @errp: pointer to a NULL-initialized error object
> + *
> + * Writes @buflen bytes from @buf, possibly blocking or (if the
> + * channel is non-blocking) yielding from the current coroutine
> + * multiple times until the entire content is written.  Otherwise
> + * behaves as qio_channel_write().
> + */
> +ssize_t coroutine_fn qio_channel_write_all(QIOChannel *ioc,

This function is not coroutine_fn.  It only assumes coroutine context
when called on a non-blocking socket.
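
For reference, the "read until the entire content is read" behaviour described in the doc comment boils down to a loop like the Python sketch below. It models only the blocking case; read_all and read_fn are hypothetical names, and treating a zero-length read as an error is an assumption about what a caller would want rather than something stated in the patch.

```python
def read_all(read_fn, buflen):
    """Call read_fn(n) repeatedly until exactly buflen bytes are read.

    read_fn may return fewer bytes than requested (a short read); an
    empty result is treated as end of stream.
    """
    data = bytearray()
    while len(data) < buflen:
        chunk = read_fn(buflen - len(data))
        if not chunk:
            raise EOFError("stream ended after %d of %d bytes"
                           % (len(data), buflen))
        data += chunk
    return bytes(data)
```

In the non-blocking case the real function would yield to the coroutine scheduler instead of looping straight back into the read, which is why it only needs coroutine context for non-blocking channels.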

Re: [Qemu-block] [PATCH 06/10] scsi, file-posix: add support for persistent reservation management

2017-08-30 Thread Stefan Hajnoczi

On Tue, Aug 22, 2017 at 03:18:28PM +0200, Paolo Bonzini wrote:
> +#ifdef CONFIG_LINUX
> +PRManager *pr_manager_lookup(const char *id, Error **errp);
> +#else
> +static inline PRManager *pr_manager_lookup(const char *id,
> +  Error **errp)

Indentation

> diff --git a/scsi/pr-manager.c b/scsi/pr-manager.c
> new file mode 100644
> index 00..e80f8d9b31
> --- /dev/null
> +++ b/scsi/pr-manager.c
> @@ -0,0 +1,109 @@
> +/*
> + * Persistent reservation manager abstract class
> + *
> + * Copyright (c) 2017 Red Hat, Inc.
> + *
> + * Author: Paolo Bonzini 
> + *
> + * This code is licensed under the LGPL.
> + *
> + */
> +
> +#include "qemu/osdep.h"
> +#include "qapi/error.h"
> +#include "block/aio.h"
> +#include "block/thread-pool.h"
> +#include "scsi/pr-manager.h"
> +#include "scsi/trace.h"
> +
> +#include 

HACKING "1.2. Include directives" defines the order of #includes.  It
should be "qemu/osdep.h", <system headers>, followed by all other QEMU
headers.

Re: [Qemu-block] [PATCH] block: Cleanup BMDS in bdrv_close_all

2017-08-30 Thread Juan Quintela

Fam Zheng  wrote:
> On Wed, 08/30 13:49, Juan Quintela wrote:
>> Fam Zheng  wrote:
>> > This fixes the assertion due to op blockers added by BMDS:
>> >
>> > block.c:3248: bdrv_delete: Assertion `bdrv_op_blocker_is_empty(bs)' failed.
>> >
>> > Reproducer: simply start block migration and quit QEMU before it ends.
>> >
>> > Cc: qemu-sta...@nongnu.org
>> > Signed-off-by: Fam Zheng 
>> 
>> No need for one stub, see later.
>> 
>> 
>> > ---
>> >  block.c | 2 ++
>> >  migration/block.c   | 2 +-
>> >  migration/block.h   | 1 +
>> >  stubs/Makefile.objs | 1 +
>> >  stubs/block-migration.c | 6 ++
>> >  5 files changed, 11 insertions(+), 1 deletion(-)
>> >  create mode 100644 stubs/block-migration.c
>> >
>> > diff --git a/block.c b/block.c
>> > index 3308814bba..508a57274d 100644
>> > --- a/block.c
>> > +++ b/block.c
>> > @@ -43,6 +43,7 @@
>> >  #include "qemu/cutils.h"
>> >  #include "qemu/id.h"
>> >  #include "qapi/util.h"
>> > +#include "migration/block.h"
>> 
>> this should be misc.h
>> 
>> >  
>> >  #ifdef CONFIG_BSD
>> >  #include 
>> > @@ -3111,6 +3112,7 @@ static void bdrv_close(BlockDriverState *bs)
>> >  
>> >  void bdrv_close_all(void)
>> >  {
>> > +block_migration_cleanup_bmds();
>> >  block_job_cancel_sync_all();
>> >  nbd_export_close_all();
>> >  
>> 
>> > diff --git a/migration/block.h b/migration/block.h
>> > index 22ebe94259..8bae1cf55a 100644
>> > --- a/migration/block.h
>> > +++ b/migration/block.h
>> > @@ -42,4 +42,5 @@ static inline uint64_t blk_mig_bytes_total(void)
>> >  #endif /* CONFIG_LIVE_BLOCK_MIGRATION */
>> >  
>> >  void migrate_set_block_enabled(bool value, Error **errp);
>> > +void block_migration_cleanup_bmds(void);
>> >  #endif /* MIGRATION_BLOCK_H */
>> > diff --git a/stubs/Makefile.objs b/stubs/Makefile.objs
>> > index e69c217aff..7540913767 100644
>> > --- a/stubs/Makefile.objs
>> > +++ b/stubs/Makefile.objs
>> > @@ -19,6 +19,7 @@ stub-obj-y += is-daemonized.o
>> >  stub-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
>> >  stub-obj-y += machine-init-done.o
>> >  stub-obj-y += migr-blocker.o
>> > +stub-obj-y += block-migration.o
>> >  stub-obj-y += change-state-handler.o
>> >  stub-obj-y += monitor.o
>> >  stub-obj-y += notify-event.o
>> > diff --git a/stubs/block-migration.c b/stubs/block-migration.c
>> > new file mode 100644
>> > index 00..855f15c757
>> > --- /dev/null
>> > +++ b/stubs/block-migration.c
>> > @@ -0,0 +1,6 @@
>> > +#include "qemu/osdep.h"
>> > +#include "migration/block.h"
>> > +
>> > +void block_migration_cleanup_bmds(void)
>> > +{
>> > +}
>> 
>> You can add this inside include/migration/misc.h
>> 
>> #ifdef CONFIG_LIVE_BLOCK_MIGRATION
>> void blk_mig_init(void);
>> #else
>> static inline void blk_mig_init(void) {}
>> 
>> // And then you add the stub here?
>
> This doesn't work.  The function is not stubbed for
> !CONFIG_LIVE_BLOCK_MIGRATION configs, but for tools that don't link to
> common-obj-y. For example with your proposed change, I get:
>
>   LINKqemu-nbd
> block.o: In function `bdrv_close_all':
> /home/fam/work/qemu/block.c:3115: undefined reference to
> `block_migration_cleanup_bmds'
> collect2: error: ld returned 1 exit status
> make: *** [/home/fam/work/qemu/rules.mak:121: qemu-nbd] Error 1
> make: Leaving directory '/home/fam/work/q/build'


This works for me, for both CONFIG_LIVE_BLOCK_MIGRATION enabled and not.
For qemu-system-x86_64 and qemu-nbd.  Could you test?

commit 2888b96dfe5ea9c7901990f54e14b1a7ed3e46b9
Author: Fam Zheng 
Date:   Wed Aug 30 18:06:05 2017 +0800

block: Cleanup BMDS in bdrv_close_all

This fixes the assertion due to op blockers added by BMDS:

block.c:3248: bdrv_delete: Assertion `bdrv_op_blocker_is_empty(bs)' failed.

Reproducer: simply start block migration and quit QEMU before it ends.

Cc: qemu-sta...@nongnu.org
Signed-off-by: Fam Zheng 

--

Don't use stub

diff --git a/block.c b/block.c
index 3615a6809e..4268f892da 100644
--- a/block.c
+++ b/block.c
@@ -43,6 +43,7 @@
 #include "qemu/cutils.h"
 #include "qemu/id.h"
 #include "qapi/util.h"
+#include "migration/misc.h"
 
 #ifdef CONFIG_BSD
 #include 
@@ -3111,6 +3112,7 @@ static void bdrv_close(BlockDriverState *bs)
 
 void bdrv_close_all(void)
 {
+block_migration_cleanup_bmds();
 block_job_cancel_sync_all();
 nbd_export_close_all();
 
diff --git a/include/migration/misc.h b/include/migration/misc.h
index c079b7771b..6ecb7068d9 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -24,8 +24,10 @@ void ram_mig_init(void);
 
 #ifdef CONFIG_LIVE_BLOCK_MIGRATION
 void blk_mig_init(void);
+void block_migration_cleanup_bmds(void);
 #else
 static inline void blk_mig_init(void) {}
+static inline void block_migration_cleanup_bmds(void) {}
 #endif
 
 #define SELF_ANNOUNCE_ROUNDS 5
diff --git a/migration/block.c b/migration/block.c
index 9171f60028..c059e48484 100644
--- a/migration/block.c
+++ b/migration/block.c
@@ -673,7 +673,7 @@ static int64_

Re: [Qemu-block] [Qemu-devel] Persistent bitmaps for non-qcow2 formats

2017-08-30 Thread Daniel P. Berrange

On Wed, Aug 30, 2017 at 02:36:11PM +0100, Stefan Hajnoczi wrote:
> On Tue, Aug 22, 2017 at 03:07:04PM -0400, John Snow wrote:
> > (3) Add either a new flag that turns qcow2's backing file into a full
> > R/W backing file, or add a new extension to qcow2 entirely (bypassing
> > the traditional backing file mechanism to avoid confusion for older
> > tooling) that adds a new read-write backing file field.
> > 
> > This RW backing file field will be used for all reads AND writes; the
> > qcow2 in question becomes a metadata container on top of the BDS chain.
> > We can re-use Vladimir's bitmap persistence extension to save bitmaps in
> > a qcow2 shell.
> > 
> > The qcow2 becomes effectively a metadata cache for a new (essentially)
> > filter node that handles features such as bitmaps. This could also be
> > used to provide allocation map data for RAW files and other goodies down
> > the road.
> > 
> > Hopefully this achieves our desire to not create new formats AND our
> > desire to concentrate features (and debugging, testing, etc) into qcow2,
> > while allowing users to "have bitmaps with raw files."
> > 
> > Of course, in this scenario, users now have two files: a qcow2 wrapper
> > and the actual raw file in question; but regardless of how we were going
> > to solve this, a raw file necessitates an external file of some sort,
> > else we give up the idea that it was a raw file.
> 
> There is some complexity here for management tools:
> 
> If the underlying image is resized, who resizes the qcow2 and how do
> they know to do it?
> 
> If QEMU's resize/truncate command is used, does it first try to resize the
> underlying image and then resize the qcow2?  This is probably the sanest
> approach.
> 
> If the underlying image is moved to a new location, does the qcow2 file
> need to be modified and who does that?
> 
> Management tools need to figure out how to represent and manage this extra
> qcow2 file.  The easiest solution is to punt it to the user and treat it
> as part of a backing file chain.  If the management tool wants to
> automatically manage the qcow2 so the user just specifies the underlying
> image and enables the persistent bitmap checkbox, then it becomes more
> complicated.

Indeed, I don't think it is practical to have libvirt / QEMU automagically
create a qcow2 overlay on disk. Something has to decide where this would
be stored. You might say just put it alongside the raw file, but it might
not be a local file at all; it could be an NBD or RBD raw "file". So do
we create a local qcow2 file, or store a qcow2 file inside another RBD
volume to hold the persistent bitmap? This kind of decision needs to be
made by the mgmt app since only it knows about its storage mgmt model.
At this point you might as well just let the mgmt app take care of it
all and not try to do anything magical with qcow2 overlays in libvirt/QEMU.
Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
