Re: [Qemu-block] [Qemu-devel] [PATCH] block: implement the bdrv_reopen_prepare helper for LUKS driver

2018-01-19 Thread Daniel P. Berrange
On Thu, Jan 18, 2018 at 01:51:36PM -0600, Eric Blake wrote:
> On 01/18/2018 04:31 AM, Daniel P. Berrange wrote:
> > If the bdrv_reopen_prepare helper isn't provided, the qemu-img commit
> > command fails to re-open the base layer after committing changes into
> > it. Provide a no-op implementation for the LUKS driver, since there
> > is not any custom work that needs doing to re-open it.
> > 
> > Signed-off-by: Daniel P. Berrange 
> > ---
> >  block/crypto.c | 7 +++
> >  1 file changed, 7 insertions(+)
> 
> I'm hoping another block-layer expert chimes in, as I'm not quite sure
> what the full reopen rules are; but the idea makes sense to me.

Yeah, likewise - it is hard to understand what is required, but I see
lots of other block drivers just put a no-op impl here. I'm relying on
Kevin/Max to tell me if i'm wrong here 

> > diff --git a/block/crypto.c b/block/crypto.c
> > index 60ddf8623e..bb9a8f5376 100644
> > --- a/block/crypto.c
> > +++ b/block/crypto.c
> > @@ -382,6 +382,12 @@ static void block_crypto_close(BlockDriverState *bs)
> >  qcrypto_block_free(crypto->block);
> >  }
> >  
> > +static int block_crypto_reopen_prepare(BDRVReopenState *state,
> > +   BlockReopenQueue *queue, Error 
> > **errp)
> > +{
> > +/* nothing needs checking */
> 
> Are we sure that even changes such as moving from read-only to
> read-write need no checking?

LUKS doesn't do anything differently with ro vs rw images, so I'm assuming
any such checks are handled by the layer below.



> 
> > +return 0;
> > +}
> >  
> >  /*
> >   * 1 MB bounce buffer gives good performance / memory tradeoff
> > @@ -620,6 +626,7 @@ BlockDriver bdrv_crypto_luks = {
> >  .bdrv_truncate  = block_crypto_truncate,
> >  .create_opts= &block_crypto_create_opts_luks,
> >  
> > +.bdrv_reopen_prepare = block_crypto_reopen_prepare,
> >  .bdrv_refresh_limits = block_crypto_refresh_limits,
> >  .bdrv_co_preadv = block_crypto_co_preadv,
> >  .bdrv_co_pwritev= block_crypto_co_pwritev,

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|



Re: [Qemu-block] [PATCH v2 2/6] hmp: add name parameter to nbd_server_add

2018-01-19 Thread Vladimir Sementsov-Ogievskiy

19.01.2018 00:08, Eric Blake wrote:

On 01/18/2018 12:11 PM, Vladimir Sementsov-Ogievskiy wrote:

Support name parameter for HMP too.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  hmp.c   | 3 ++-
  hmp-commands.hx | 9 +
  2 files changed, 7 insertions(+), 5 deletions(-)

+++ b/hmp-commands.hx
@@ -1553,8 +1553,8 @@ ETEXI
  
  {

  .name   = "nbd_server_add",
-.args_type  = "writable:-w,device:B",
-.params = "nbd_server_add [-w] device",
+.args_type  = "name:-n,writable:-w,device:B",
+.params = "nbd_server_add [-n] [-w] device",

Doesn't quite look like my proposal:
https://lists.gnu.org/archive/html/qemu-devel/2018-01/msg01639.html

-.args_type  = "writable:-w,device:B",
-.params = "nbd_server_add [-w] device",
+.args_type  = "writable:-w,device:B,name:s?",
+.params = "nbd_server_add [-w] device [name]",

In fact, using -n is not quite right, because that's just a boolean flag
rather than a string.



Strange, I don't have your letter. Let's use yours, of course.

--
Best regards,
Vladimir




Re: [Qemu-block] [PATCH v2 6/6] iotest 201: new test for qmp nbd-server-remove

2018-01-19 Thread Vladimir Sementsov-Ogievskiy

19.01.2018 01:43, Eric Blake wrote:

On 01/18/2018 12:11 PM, Vladimir Sementsov-Ogievskiy wrote:

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  tests/qemu-iotests/201 | 159 +
  tests/qemu-iotests/201.out |   5 ++
  tests/qemu-iotests/group   |   1 +
  3 files changed, 165 insertions(+)
  create mode 100644 tests/qemu-iotests/201
  create mode 100644 tests/qemu-iotests/201.out

+
+def assertExistingClients(self, result):
+self.assert_qmp(result, 'error/desc',
+"NBD export 'exp' has 1 active connection. To force "
+"remove it (and hard disconnect clients) use "
+"mode='hard'")

Needs tweaking if we massage the error message earlier in the series.

I'm still worried that this test may fail spuriously due to a hard-coded
port, but some testing is better than none, and if the CI engines don't
immediately reject it, whoever first encounters it will be nice and let
us know.

Reviewed-by: Eric Blake 



Looked through other tests, looks like same approach is only in 147, and 
other iotests uses unix sockets for nbd-server-start. May be, it is 
better to move to unix socket here.

I'll resend today with unix-socket.

--
Best regards,
Vladimir




Re: [Qemu-block] [PATCH v2 0/6] nbd export qmp interface

2018-01-19 Thread Kevin Wolf
Am 18.01.2018 um 23:45 hat Eric Blake geschrieben:
> On 01/18/2018 12:11 PM, Vladimir Sementsov-Ogievskiy wrote:
> > v2:
> > 01: tweak comment
> > add Eric's r-b
> > 02: new patch
> > 03: rewritten, to move form 'bool force' flag to 'enum mode' parameter
> > 04: add Eric's r-b
> > 05: improve commit message
> > tweak comment
> > 06: rebase on 03 changes
> > make PEP8 happy
> > some other tweaks
> > I've left nbd_port variable hard-set to 10900. I think all such things
> > should be fixed together, and it is simple to change in future
> > nbd_port = '10900'
> > to
> > nbd_port = iotests.get_free_port()
> > if needed.
> > 
> > [Unfortunately, qmp query-nbd-server is not finished yet, coming soon,
> >  but may be after my vocation on the next week]
> 
> Enjoy your time off. I think the series is nearly ready to go; I had
> some tweaks that I suggested, and will probably replace your 2/6 with my
> counterproposal, but I don't mind doing that cleanup if you don't have
> time to respin.  I'll give it a few more days in case anyone else has
> comments, then add it to my NBD queue.

I haven't reviewed the patches in detail, but the API changes look good
to me.

Kevin


signature.asc
Description: PGP signature


Re: [Qemu-block] [Qemu-devel] [PATCH 0/5 v3] preparation for Parallels Disk xml driver

2018-01-19 Thread klim

On 01/12/2018 12:01 PM, Klim Kireev wrote:

ping

Parallels Desktop and Parallels Cloud Server uses images glued with the
bundle description in XML format. This series contains very basic
description of this XML files and makes preparations for actual
implementation to be followed.

Signed-off-by: Edgar Kaziakhmedov 
Signed-off-by: Klim Kireev 
Signed-off-by: Vladimir Sementsov-Ogievskiy 
Signed-off-by: Denis V. Lunev 

Changelog:

v2:
PATCH 1/5: Fix some places noticed by Stefan Hajnoczi 
PATCH 2/5: Rebase to upstream
PATCH 3/5: Fix includes

v3:
PATCH 1/5: Fix the place about GUID, add emails of the authors






Re: [Qemu-block] [PATCH] block/vmdk: Report failures in vmdk_read_cid()

2018-01-19 Thread Paolo Bonzini
On 28/07/2017 14:54, Kevin Wolf wrote:
> Am 09.07.2017 um 19:06 hat Peter Maydell geschrieben:
>> The function vmdk_read_cid() can fail if the read on the underlying
>> block device fails, or if there's a format error in the VMDK file.
>> However its API doesn't provide a mechanism to report these errors,
>> and in some cases we were returning a CID of 0 and in some cases a
>> CID of 0x, either of which might potentially be valid values.
>>
>> Change the function to return 0 on success or a negative errno, and
>> return the CID via a uint32_t* argument. Update the callsites to
>> handle and propagate the error appropriately.
>>
>> This fixes in passing a Coverity-spotted issue (CID 1350038) where
>> we weren't checking the return value from sscanf().
>>
>> Signed-off-by: Peter Maydell 
> 
> Fam, this is the commit that introduced the qemu-iotests 059 failure for
> vmdk. I think what's happening is that we use an image produced by a
> fuzzer, and with the additional checks introduced in this patch, we now
> fail earlier and don't test the condition any more that we wanted to
> test.
> 
> So do we need a new version of sample_images/afl9.vmdk.bz2 that has a
> valid CID?

This has never been fixed, has it?  I still see the failure.

Paolo



[Qemu-block] [PATCH v2 2/8] qapi: add unmap to BlockDeviceStats

2018-01-19 Thread Anton Nefedov
Signed-off-by: Anton Nefedov 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Alberto Garcia 
---
 qapi/block-core.json   | 29 +++--
 include/block/accounting.h |  1 +
 block/qapi.c   |  6 ++
 3 files changed, 30 insertions(+), 6 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 2e0665e..3fa2d3a 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -674,6 +674,8 @@
 #
 # @wr_bytes:  The number of bytes written by the device.
 #
+# @unmap_bytes: The number of bytes unmapped by the device (Since 2.12)
+#
 # @rd_operations: The number of read operations performed by the device.
 #
 # @wr_operations: The number of write operations performed by the device.
@@ -681,6 +683,9 @@
 # @flush_operations: The number of cache flush operations performed by the
 #device (since 0.15.0)
 #
+# @unmap_operations: The number of unmap operations performed by the device
+#(Since 2.12)
+#
 # @flush_total_time_ns: Total time spend on cache flushes in nano-seconds
 #   (since 0.15.0).
 #
@@ -688,6 +693,9 @@
 #
 # @rd_total_time_ns: Total_time_spend on reads in nano-seconds (since 0.15.0).
 #
+# @unmap_total_time_ns: Total time spent on unmap operations in nano-seconds
+#   (Since 2.12)
+#
 # @wr_highest_offset: The offset after the greatest byte written to the
 # device.  The intended use of this information is for
 # growable sparse files (like qcow2) that are used on top
@@ -699,6 +707,9 @@
 # @wr_merged: Number of write requests that have been merged into another
 # request (Since 2.3).
 #
+# @unmap_merged: Number of unmap requests that have been merged into another
+#request (Since 2.12)
+#
 # @idle_time_ns: Time since the last I/O operation, in
 #nanoseconds. If the field is absent it means that
 #there haven't been any operations yet (Since 2.5).
@@ -712,6 +723,9 @@
 # @failed_flush_operations: The number of failed flush operations
 #   performed by the device (Since 2.5)
 #
+# @failed_unmap_operations: The number of failed unmap operations performed
+#   by the device (Since 2.12)
+#
 # @invalid_rd_operations: The number of invalid read operations
 #  performed by the device (Since 2.5)
 #
@@ -721,6 +735,9 @@
 # @invalid_flush_operations: The number of invalid flush operations
 #performed by the device (Since 2.5)
 #
+# @invalid_unmap_operations: The number of invalid unmap operations performed
+#by the device (Since 2.12)
+#
 # @account_invalid: Whether invalid operations are included in the
 #   last access statistics (Since 2.5)
 #
@@ -733,25 +750,25 @@
 # Since: 0.14.0
 ##
 { 'struct': 'BlockDeviceStats',
-  'data': {'rd_bytes': 'int', 'wr_bytes': 'int',
+  'data': {'rd_bytes': 'int', 'wr_bytes': 'int', 'unmap_bytes' : 'int',
 
'rd_operations': 'int', 'wr_operations': 'int',
-   'flush_operations': 'int',
+   'flush_operations': 'int', 'unmap_operations': 'int',
 
'flush_total_time_ns': 'int', 'wr_total_time_ns': 'int',
-   'rd_total_time_ns': 'int',
+   'rd_total_time_ns': 'int', 'unmap_total_time_ns': 'int',
 
'wr_highest_offset': 'int',
 
-   'rd_merged': 'int', 'wr_merged': 'int',
+   'rd_merged': 'int', 'wr_merged': 'int', 'unmap_merged': 'int',
 
'*idle_time_ns': 'int',
 
'failed_rd_operations': 'int', 'failed_wr_operations': 'int',
-   'failed_flush_operations': 'int',
+   'failed_flush_operations': 'int', 'failed_unmap_operations': 'int',
 
'invalid_rd_operations': 'int', 'invalid_wr_operations': 'int',
-   'invalid_flush_operations': 'int',
+   'invalid_flush_operations': 'int', 'invalid_unmap_operations': 
'int',
 
'account_invalid': 'bool', 'account_failed': 'bool',
'timed_stats': ['BlockDeviceTimedStats'] } }
diff --git a/include/block/accounting.h b/include/block/accounting.h
index b833d26..4e53c4a 100644
--- a/include/block/accounting.h
+++ b/include/block/accounting.h
@@ -35,6 +35,7 @@ enum BlockAcctType {
 BLOCK_ACCT_READ,
 BLOCK_ACCT_WRITE,
 BLOCK_ACCT_FLUSH,
+BLOCK_ACCT_UNMAP,
 BLOCK_MAX_IOTYPE,
 };
 
diff --git a/block/qapi.c b/block/qapi.c
index fc10f0a..6e110f2 100644
--- a/block/qapi.c
+++ b/block/qapi.c
@@ -396,24 +396,30 @@ static void bdrv_query_blk_stats(BlockDeviceStats *ds, 
BlockBackend *blk)
 
 ds->rd_bytes = stats->nr_bytes[BLOCK_ACCT_READ];
 ds->wr_bytes = stats->nr_bytes[BLOCK_ACCT_WRITE];
+ds->unmap_bytes = stats->nr_bytes[BLOCK_ACCT_UNMAP];
 ds->rd_operations = stats->nr_ops[BLOCK_ACCT_READ];
 ds->wr_operations = stats->nr_ops[BLOCK_ACCT_WRITE];
+ds->unmap_opera

[Qemu-block] [PATCH v2 1/8] qapi: group BlockDeviceStats fields

2018-01-19 Thread Anton Nefedov
Make the stat fields definition slightly more readable.
Cosmetic change only.

Signed-off-by: Anton Nefedov 
---
 qapi/block-core.json | 24 ++--
 1 file changed, 18 insertions(+), 6 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index e94a688..2e0665e 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -733,14 +733,26 @@
 # Since: 0.14.0
 ##
 { 'struct': 'BlockDeviceStats',
-  'data': {'rd_bytes': 'int', 'wr_bytes': 'int', 'rd_operations': 'int',
-   'wr_operations': 'int', 'flush_operations': 'int',
+  'data': {'rd_bytes': 'int', 'wr_bytes': 'int',
+
+   'rd_operations': 'int', 'wr_operations': 'int',
+   'flush_operations': 'int',
+
'flush_total_time_ns': 'int', 'wr_total_time_ns': 'int',
-   'rd_total_time_ns': 'int', 'wr_highest_offset': 'int',
-   'rd_merged': 'int', 'wr_merged': 'int', '*idle_time_ns': 'int',
+   'rd_total_time_ns': 'int',
+
+   'wr_highest_offset': 'int',
+
+   'rd_merged': 'int', 'wr_merged': 'int',
+
+   '*idle_time_ns': 'int',
+
'failed_rd_operations': 'int', 'failed_wr_operations': 'int',
-   'failed_flush_operations': 'int', 'invalid_rd_operations': 'int',
-   'invalid_wr_operations': 'int', 'invalid_flush_operations': 'int',
+   'failed_flush_operations': 'int',
+
+   'invalid_rd_operations': 'int', 'invalid_wr_operations': 'int',
+   'invalid_flush_operations': 'int',
+
'account_invalid': 'bool', 'account_failed': 'bool',
'timed_stats': ['BlockDeviceTimedStats'] } }
 
-- 
2.7.4




[Qemu-block] [PATCH v2 6/8] scsi: account unmap operations

2018-01-19 Thread Anton Nefedov
Signed-off-by: Anton Nefedov 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
---
 hw/scsi/scsi-disk.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
index 693a754..6881664 100644
--- a/hw/scsi/scsi-disk.c
+++ b/hw/scsi/scsi-disk.c
@@ -1636,6 +1636,10 @@ static void scsi_unmap_complete_noio(UnmapCBData *data, 
int ret)
 goto done;
 }
 
+block_acct_start(blk_get_stats(s->qdev.conf.blk), &r->acct,
+ r->sector_count * s->qdev.blocksize,
+ BLOCK_ACCT_UNMAP);
+
 r->req.aiocb = blk_aio_pdiscard(s->qdev.conf.blk,
 r->sector * s->qdev.blocksize,
 r->sector_count * s->qdev.blocksize,
@@ -1662,10 +1666,11 @@ static void scsi_unmap_complete(void *opaque, int ret)
 r->req.aiocb = NULL;
 
 aio_context_acquire(blk_get_aio_context(s->qdev.conf.blk));
-if (scsi_disk_req_check_error(r, ret, false)) {
+if (scsi_disk_req_check_error(r, ret, true)) {
 scsi_req_unref(&r->req);
 g_free(data);
 } else {
+block_acct_done(blk_get_stats(s->qdev.conf.blk), &r->acct);
 scsi_unmap_complete_noio(data, ret);
 }
 aio_context_release(blk_get_aio_context(s->qdev.conf.blk));
@@ -1712,10 +1717,12 @@ static void scsi_disk_emulate_unmap(SCSIDiskReq *r, 
uint8_t *inbuf)
 return;
 
 invalid_param_len:
+block_acct_invalid(blk_get_stats(s->qdev.conf.blk), BLOCK_ACCT_UNMAP);
 scsi_check_condition(r, SENSE_CODE(INVALID_PARAM_LEN));
 return;
 
 invalid_field:
+block_acct_invalid(blk_get_stats(s->qdev.conf.blk), BLOCK_ACCT_UNMAP);
 scsi_check_condition(r, SENSE_CODE(INVALID_FIELD));
 }
 
-- 
2.7.4




[Qemu-block] [PATCH v2 3/8] ide: account UNMAP (TRIM) operations

2018-01-19 Thread Anton Nefedov
Signed-off-by: Anton Nefedov 
---
 hw/ide/core.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/hw/ide/core.c b/hw/ide/core.c
index 5be72d4..6fdc936 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -443,6 +443,14 @@ static void ide_issue_trim_cb(void *opaque, int ret)
 TrimAIOCB *iocb = opaque;
 IDEState *s = iocb->s;
 
+if (iocb->i >= 0) {
+if (ret >= 0) {
+block_acct_done(blk_get_stats(s->blk), &s->acct);
+} else {
+block_acct_failed(blk_get_stats(s->blk), &s->acct);
+}
+}
+
 if (ret >= 0) {
 while (iocb->j < iocb->qiov->niov) {
 int j = iocb->j;
@@ -460,10 +468,15 @@ static void ide_issue_trim_cb(void *opaque, int ret)
 }
 
 if (!ide_sect_range_ok(s, sector, count)) {
+block_acct_invalid(blk_get_stats(s->blk),
+   BLOCK_ACCT_UNMAP);
 iocb->is_invalid = true;
 goto done;
 }
 
+block_acct_start(blk_get_stats(s->blk), &s->acct,
+ count << BDRV_SECTOR_BITS, BLOCK_ACCT_UNMAP);
+
 /* Got an entry! Submit and exit.  */
 iocb->aiocb = blk_aio_pdiscard(s->blk,
sector << BDRV_SECTOR_BITS,
-- 
2.7.4




[Qemu-block] [PATCH v2 5/8] scsi: move unmap error checking to the complete callback

2018-01-19 Thread Anton Nefedov
This will help to account the operation in the following commit.

The difference is that we don't call scsi_disk_req_check_error() before
the 1st discard iteration anymore. That function also checks if
the request is cancelled, however it shouldn't get canceled until it
yields in blk_aio() functions anyway.
Same approach is already used for emulate_write_same.

Signed-off-by: Anton Nefedov 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
---
 hw/scsi/scsi-disk.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
index 7b8e0ed..693a754 100644
--- a/hw/scsi/scsi-disk.c
+++ b/hw/scsi/scsi-disk.c
@@ -1627,9 +1627,6 @@ static void scsi_unmap_complete_noio(UnmapCBData *data, 
int ret)
 SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, r->req.dev);
 
 assert(r->req.aiocb == NULL);
-if (scsi_disk_req_check_error(r, ret, false)) {
-goto done;
-}
 
 if (data->count > 0) {
 r->sector = ldq_be_p(&data->inbuf[0]);
@@ -1665,7 +1662,12 @@ static void scsi_unmap_complete(void *opaque, int ret)
 r->req.aiocb = NULL;
 
 aio_context_acquire(blk_get_aio_context(s->qdev.conf.blk));
-scsi_unmap_complete_noio(data, ret);
+if (scsi_disk_req_check_error(r, ret, false)) {
+scsi_req_unref(&r->req);
+g_free(data);
+} else {
+scsi_unmap_complete_noio(data, ret);
+}
 aio_context_release(blk_get_aio_context(s->qdev.conf.blk));
 }
 
-- 
2.7.4




[Qemu-block] [PATCH v2 0/8] discard blockstats

2018-01-19 Thread Anton Nefedov
v2: - rebased on top of series 'ide: abort TRIM operation for invalid range'
  (http://lists.nongnu.org/archive/html/qemu-devel/2017-12/msg01432.html)
  Now invalid trim requests are properly accounted

- patches 1/2 - qapi fields regrouped together

v1: http://lists.nongnu.org/archive/html/qemu-devel/2017-11/msg03664.html



qmp query-blockstats provides stats info for write/read/flush ops.

Patches 1-5 implement the similar for discard (unmap) command for scsi
and ide disks.
Discard stat "unmap_ops / unmap_bytes" is supposed to account the ops that
have completed without an error.

However, discard operation is advisory. Specifically,
 - common block layer ignores ENOTSUP error code.
   That might be returned if the block driver does not support discard,
   or discard has been configured to be ignored.
 - format drivers such as qcow2 may ignore discard if they were configured
   to ignore that, or if the corresponding area is already marked unused
   (unallocated / zero clusters).

And what is actually useful is the number of bytes actually discarded
down on the host filesystem.
To achieve that, driver-specific statistics has been added to blockstats
(patch 7).
With patch 6, file-posix driver accounts discard operations on its level too.

query-blockstat result:

(note the difference between blockdevice unmap and file discard stats. qcow2
sends a few ops down to the file as the clusters are actually unallocated
on qcow2 level)

{"return": [
{"device": "drive-scsi0-0-0-0"
  "parent": {
"stats": {
>  "unmap_operations": 0
>  "unmap_merged": 0
  "flush_total_time_ns": 0
  "wr_highest_offset": 8111718400
  [..]
  "invalid_wr_operations": 0
  "invalid_rd_operations": 0}
"node-name": "#block047"
>"driver_stats": {
>  "type": "file"
>  "data": {
>"discard_bytes_ok": 1572864
>"discard_nb_failed": 0
>"discard_nb_ok": 5}}}
  "stats": {
>   "unmap_operations": 472
>   "unmap_merged": 0
"flush_total_time_ns": 44530540
"wr_highest_offset": 7106662400
"wr_total_time_ns": 45518856
"failed_wr_operations": 0
"failed_rd_operations": 0
"wr_merged": 0
"wr_bytes": 889856
"timed_stats": []
>"failed_unmap_operations": 0
"failed_flush_operations": 0
"account_invalid": true
"rd_total_time_ns": 3306264098
>"invalid_unmap_operations": 0
"flush_operations": 18
"wr_operations": 120
>"unmap_bytes": 12312014848
"rd_merged": 0
"rd_bytes": 137103360
>"unmap_total_time_ns": 22664692
"invalid_flush_operations": 0
"account_failed": true
"idle_time_ns": 437316567
"rd_operations": 5636
"invalid_wr_operations": 0
"invalid_rd_operations": 0}
  "node-name": "#block128"}

  {"device": "drive-ide0-0-0"
  [..]

Anton Nefedov (8):
  qapi: group BlockDeviceStats fields
  qapi: add unmap to BlockDeviceStats
  ide: account UNMAP (TRIM) operations
  scsi: store unmap offset and nb_sectors in request struct
  scsi: move unmap error checking to the complete callback
  scsi: account unmap operations
  file-posix: account discard operations
  qapi: query-blockstat: add driver specific file-posix stats

 qapi/block-core.json   | 78 ++
 include/block/accounting.h |  1 +
 include/block/block.h  |  1 +
 include/block/block_int.h  |  1 +
 block.c|  9 ++
 block/file-posix.c | 42 +++--
 block/qapi.c   | 11 +++
 hw/ide/core.c  | 13 
 hw/scsi/scsi-disk.c| 29 ++---
 9 files changed, 166 insertions(+), 19 deletions(-)

-- 
2.7.4




[Qemu-block] [PATCH v2 8/8] qapi: query-blockstat: add driver specific file-posix stats

2018-01-19 Thread Anton Nefedov
A block driver can provide a callback to report driver-specific
statistics.

file-posix driver now reports discard statistics

Signed-off-by: Anton Nefedov 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
---
 qapi/block-core.json  | 37 +
 include/block/block.h |  1 +
 include/block/block_int.h |  1 +
 block.c   |  9 +
 block/file-posix.c| 21 +
 block/qapi.c  |  5 +
 6 files changed, 74 insertions(+)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index 3fa2d3a..0d9dc01 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -774,6 +774,40 @@
'timed_stats': ['BlockDeviceTimedStats'] } }
 
 ##
+# @BlockDriverStatsFile:
+#
+# File driver statistics
+#
+# @discard_nb_ok: The number of succeeded discard operations performed by
+# the driver.
+#
+# @discard_nb_failed: The number of failed discard operations performed by
+# the driver.
+#
+# @discard_bytes_ok: The number of bytes discarded by the driver.
+#
+# Since 2.12
+##
+{ 'struct': 'BlockDriverStatsFile',
+  'data': {
+  'discard_nb_ok': 'int',
+  'discard_nb_failed': 'int',
+  'discard_bytes_ok': 'int'
+  } }
+
+##
+# @BlockDriverStats:
+#
+# Statistics of a block driver (driver-specific)
+#
+# Since: 2.12
+##
+{ 'union': 'BlockDriverStats',
+  'data': {
+  'file': 'BlockDriverStatsFile'
+  } }
+
+##
 # @BlockStats:
 #
 # Statistics of a virtual block device or a block backing device.
@@ -785,6 +819,8 @@
 #
 # @stats:  A @BlockDeviceStats for the device.
 #
+# @driver-stats: Optional driver-specific statistics. (Since 2.12)
+#
 # @parent: This describes the file block device if it has one.
 #  Contains recursively the statistics of the underlying
 #  protocol (e.g. the host file for a qcow2 image). If there is
@@ -798,6 +834,7 @@
 { 'struct': 'BlockStats',
   'data': {'*device': 'str', '*node-name': 'str',
'stats': 'BlockDeviceStats',
+   '*driver-stats': 'BlockDriverStats',
'*parent': 'BlockStats',
'*backing': 'BlockStats'} }
 
diff --git a/include/block/block.h b/include/block/block.h
index 9b12774..2f20697 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -473,6 +473,7 @@ const char *bdrv_get_device_or_node_name(const 
BlockDriverState *bs);
 int bdrv_get_flags(BlockDriverState *bs);
 int bdrv_get_info(BlockDriverState *bs, BlockDriverInfo *bdi);
 ImageInfoSpecific *bdrv_get_specific_info(BlockDriverState *bs);
+BlockDriverStats *bdrv_get_driver_stats(BlockDriverState *bs);
 void bdrv_round_to_clusters(BlockDriverState *bs,
 int64_t offset, int64_t bytes,
 int64_t *cluster_offset,
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 29cafa4..6f56aeb 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -269,6 +269,7 @@ struct BlockDriver {
   Error **errp);
 int (*bdrv_get_info)(BlockDriverState *bs, BlockDriverInfo *bdi);
 ImageInfoSpecific *(*bdrv_get_specific_info)(BlockDriverState *bs);
+BlockDriverStats *(*bdrv_get_stats)(BlockDriverState *bs);
 
 int coroutine_fn (*bdrv_save_vmstate)(BlockDriverState *bs,
   QEMUIOVector *qiov,
diff --git a/block.c b/block.c
index a8da4f2..8c03587 100644
--- a/block.c
+++ b/block.c
@@ -4062,6 +4062,15 @@ ImageInfoSpecific 
*bdrv_get_specific_info(BlockDriverState *bs)
 return NULL;
 }
 
+BlockDriverStats *bdrv_get_driver_stats(BlockDriverState *bs)
+{
+BlockDriver *drv = bs->drv;
+if (!drv || !drv->bdrv_get_stats) {
+return NULL;
+}
+return drv->bdrv_get_stats(bs);
+}
+
 void bdrv_debug_event(BlockDriverState *bs, BlkdebugEvent event)
 {
 if (!bs || !bs->drv || !bs->drv->bdrv_debug_event) {
diff --git a/block/file-posix.c b/block/file-posix.c
index 544ae58..3ab92e6 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -2240,6 +2240,25 @@ static int raw_get_info(BlockDriverState *bs, 
BlockDriverInfo *bdi)
 return 0;
 }
 
+static BlockDriverStats *raw_get_stats(BlockDriverState *bs)
+{
+BDRVRawState *s = bs->opaque;
+BlockDriverStats *stats = g_new(BlockDriverStats, 1);
+
+*stats = (BlockDriverStats){
+.type  = BLOCK_DRIVER_STATS_KIND_FILE,
+.u.file.data = g_new(BlockDriverStatsFile, 1),
+};
+
+*stats->u.file.data = (BlockDriverStatsFile){
+.discard_nb_ok = s->stats.discard_nb_ok,
+.discard_nb_failed = s->stats.discard_nb_failed,
+.discard_bytes_ok = s->stats.discard_bytes_ok,
+};
+
+return stats;
+}
+
 static QemuOptsList raw_create_opts = {
 .name = "raw-create-opts",
 .head = QTAILQ_HEAD_INITIALIZER(raw_create_opts.head),
@@ -2312,6 +2331,7 @@ BlockDriver bdrv_file = {
 .bdrv_get_info = raw_get_info,
 .bdrv_get_allocated_file_size

[Qemu-block] [PATCH v2 4/8] scsi: store unmap offset and nb_sectors in request struct

2018-01-19 Thread Anton Nefedov
it allows to report it in the error handler

Signed-off-by: Anton Nefedov 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Alberto Garcia 
---
 hw/scsi/scsi-disk.c | 12 +---
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
index 49d2559..7b8e0ed 100644
--- a/hw/scsi/scsi-disk.c
+++ b/hw/scsi/scsi-disk.c
@@ -1625,8 +1625,6 @@ static void scsi_unmap_complete_noio(UnmapCBData *data, 
int ret)
 {
 SCSIDiskReq *r = data->r;
 SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, r->req.dev);
-uint64_t sector_num;
-uint32_t nb_sectors;
 
 assert(r->req.aiocb == NULL);
 if (scsi_disk_req_check_error(r, ret, false)) {
@@ -1634,16 +1632,16 @@ static void scsi_unmap_complete_noio(UnmapCBData *data, 
int ret)
 }
 
 if (data->count > 0) {
-sector_num = ldq_be_p(&data->inbuf[0]);
-nb_sectors = ldl_be_p(&data->inbuf[8]) & 0xULL;
-if (!check_lba_range(s, sector_num, nb_sectors)) {
+r->sector = ldq_be_p(&data->inbuf[0]);
+r->sector_count = ldl_be_p(&data->inbuf[8]) & 0xULL;
+if (!check_lba_range(s, r->sector, r->sector_count)) {
 scsi_check_condition(r, SENSE_CODE(LBA_OUT_OF_RANGE));
 goto done;
 }
 
 r->req.aiocb = blk_aio_pdiscard(s->qdev.conf.blk,
-sector_num * s->qdev.blocksize,
-nb_sectors * s->qdev.blocksize,
+r->sector * s->qdev.blocksize,
+r->sector_count * s->qdev.blocksize,
 scsi_unmap_complete, data);
 data->count--;
 data->inbuf += 16;
-- 
2.7.4




[Qemu-block] [PATCH v2 7/8] file-posix: account discard operations

2018-01-19 Thread Anton Nefedov
This will help to identify how many of the user-issued discard operations
(accounted on a device level) have actually suceeded down on the host file
(even though the numbers will not be exactly the same if non-raw format
driver is used (e.g. qcow2 sending metadata discards)).

Signed-off-by: Anton Nefedov 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
---
 block/file-posix.c | 21 +++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index 36ee89e..544ae58 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -158,6 +158,11 @@ typedef struct BDRVRawState {
 bool page_cache_inconsistent:1;
 bool has_fallocate;
 bool needs_alignment;
+struct {
+int64_t discard_nb_ok;
+int64_t discard_nb_failed;
+int64_t discard_bytes_ok;
+} stats;
 
 PRManager *pr_mgr;
 } BDRVRawState;
@@ -1458,6 +1463,16 @@ static ssize_t handle_aiocb_discard(RawPosixAIOData 
*aiocb)
 return ret;
 }
 
+static void raw_account_discard(BDRVRawState *s, uint64_t nbytes, int ret)
+{
+if (ret) {
+s->stats.discard_nb_failed++;
+} else {
+s->stats.discard_nb_ok++;
+s->stats.discard_bytes_ok += nbytes;
+}
+}
+
 static int aio_worker(void *arg)
 {
 RawPosixAIOData *aiocb = arg;
@@ -1494,6 +1509,7 @@ static int aio_worker(void *arg)
 break;
 case QEMU_AIO_DISCARD:
 ret = handle_aiocb_discard(aiocb);
+raw_account_discard(aiocb->bs->opaque, aiocb->aio_nbytes, ret);
 break;
 case QEMU_AIO_WRITE_ZEROES:
 ret = handle_aiocb_write_zeroes(aiocb);
@@ -2654,8 +2670,9 @@ static coroutine_fn BlockAIOCB 
*hdev_aio_pdiscard(BlockDriverState *bs,
 BlockCompletionFunc *cb, void *opaque)
 {
 BDRVRawState *s = bs->opaque;
-
-if (fd_open(bs) < 0) {
+int ret = fd_open(bs);
+if (ret < 0) {
+raw_account_discard(s, bytes, ret);
 return NULL;
 }
 return paio_submit(bs, s->fd, offset, NULL, bytes,
-- 
2.7.4




Re: [Qemu-block] [Qemu-devel] [PATCH v2 18/32] qcow2: Update qcow2_get_cluster_offset() to support L2 slices

2018-01-19 Thread Alberto Garcia
On Tue 16 Jan 2018 11:42:45 PM CET, Eric Blake wrote:
> Callers like 'qemu-img map' will potentially have to iterate in more
> steps over an entire image; but that's not a severe limitation.  The
> block_status() functions already document that as long as drivers make
> progress, callers must be prepared for back-to-back calls that return
> the same status, so you are not breaking semantics with a shorter
> clamping limit.
>
> Maybe some of that is worth mentioning in the commit message.  Either
> way,

Yes indeed, for 4KB slices and 64KB clusters each slice can only map
32MB. Then again I don't think that it makes sense to change the slice
size for qemu-img map, or in general for anything that does pure
sequential I/O.

But it's probably worth mentioning in the commit message.

Berto



[Qemu-block] [PATCH v3 5/5] iotest 201: new test for qmp nbd-server-remove

2018-01-19 Thread Vladimir Sementsov-Ogievskiy
Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 tests/qemu-iotests/201 | 156 +
 tests/qemu-iotests/201.out |   5 ++
 tests/qemu-iotests/group   |   1 +
 3 files changed, 162 insertions(+)
 create mode 100644 tests/qemu-iotests/201
 create mode 100644 tests/qemu-iotests/201.out

diff --git a/tests/qemu-iotests/201 b/tests/qemu-iotests/201
new file mode 100644
index 00..10388920dc
--- /dev/null
+++ b/tests/qemu-iotests/201
@@ -0,0 +1,156 @@
+#!/usr/bin/env python
+#
+# Tests for qmp command nbd-server-remove.
+#
+# Copyright (c) 2017 Virtuozzo International GmbH
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see .
+#
+
+import os
+import sys
+import iotests
+import time
+from iotests import qemu_img, qemu_io, filter_qemu_io, QemuIoInteractive
+
+nbd_sock = 'nbd_sock'
+nbd_uri = 'nbd+unix:///exp?socket=' + nbd_sock
+disk = os.path.join(iotests.test_dir, 'disk')
+
+
+class TestNbdServerRemove(iotests.QMPTestCase):
+def setUp(self):
+qemu_img('create', '-f', iotests.imgfmt, disk, '1M')
+
+self.vm = iotests.VM().add_drive(disk)
+self.vm.launch()
+
+address = {
+'type': 'unix',
+'data': {
+'path': nbd_sock
+}
+}
+
+result = self.vm.qmp('nbd-server-start', addr=address)
+self.assert_qmp(result, 'return', {})
+result = self.vm.qmp('nbd-server-add', device='drive0', name='exp')
+self.assert_qmp(result, 'return', {})
+
+def tearDown(self):
+self.vm.shutdown()
+os.remove(nbd_sock)
+os.remove(disk)
+
+def remove_export(self, name, mode=None):
+if mode is None:
+return self.vm.qmp('nbd-server-remove', name=name)
+else:
+return self.vm.qmp('nbd-server-remove', name=name, mode=mode)
+
+def assertExportNotFound(self, name):
+result = self.vm.qmp('nbd-server-remove', name=name)
+self.assert_qmp(result, 'error/desc', "Export 'exp' is not found")
+
+def assertExistingClients(self, result):
+self.assert_qmp(result, 'error/desc', "export 'exp' still in use")
+
+def assertReadOk(self, qemu_io_output):
+self.assertEqual(
+filter_qemu_io(qemu_io_output).strip(),
+'read 512/512 bytes at offset 0\n' +
+'512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)')
+
+def assertReadFailed(self, qemu_io_output):
+self.assertEqual(filter_qemu_io(qemu_io_output).strip(),
+ 'read failed: Input/output error')
+
+def assertConnectFailed(self, qemu_io_output):
+self.assertEqual(filter_qemu_io(qemu_io_output).strip(),
+ "can't open device " + nbd_uri +
+ ": Requested export not available\n"
+ "server reported: export 'exp' not present")
+
+def do_test_connect_after_remove(self, mode=None):
+args = ('-r', '-f', 'raw', '-c', 'read 0 512', nbd_uri)
+self.assertReadOk(qemu_io(*args))
+
+result = self.remove_export('exp', mode)
+self.assert_qmp(result, 'return', {})
+
+self.assertExportNotFound('exp')
+self.assertConnectFailed(qemu_io(*args))
+
+def test_connect_after_remove_default(self):
+self.do_test_connect_after_remove()
+
+def test_connect_after_remove_safe(self):
+self.do_test_connect_after_remove('safe')
+
+def test_connect_after_remove_force(self):
+self.do_test_connect_after_remove('hard')
+
+def do_test_remove_during_connect_safe(self, mode=None):
+qio = QemuIoInteractive('-r', '-f', 'raw', nbd_uri)
+self.assertReadOk(qio.cmd('read 0 512'))
+
+result = self.remove_export('exp', mode)
+self.assertExistingClients(result)
+
+self.assertReadOk(qio.cmd('read 0 512'))
+
+qio.close()
+
+result = self.remove_export('exp', mode)
+self.assert_qmp(result, 'return', {})
+
+self.assertExportNotFound('exp')
+
+def test_remove_during_connect_default(self):
+self.do_test_remove_during_connect_safe()
+
+def test_remove_during_connect_safe(self):
+self.do_test_remove_during_connect_safe('safe')
+
+def test_remove_during_connect_hard(self):
+qio = QemuIoInteractive('-r', '-f', 'raw', nbd_uri)
+self.assertR

[Qemu-block] [PATCH v3 0/5] nbd export qmp interface

2018-01-19 Thread Vladimir Sementsov-Ogievskiy
v3:
hmp patch deleted
02: tweak commit message and comments
tweak error message
05: use unix socket instead of tcp (more common practice in iotests,
and we do not need port)
rebase on new error message in 02

v2:
01: tweak comment
add Eric's r-b
02: new patch
03: rewritten, to move form 'bool force' flag to 'enum mode' parameter
04: add Eric's r-b
05: improve commit message
tweak comment
06: rebase on 03 changes
make PEP8 happy
some other tweaks
I've left nbd_port variable hard-set to 10900. I think all such things
should be fixed together, and it is simple to change in future
nbd_port = '10900'
to
nbd_port = iotests.get_free_port()
if needed.

[Unfortunately, qmp query-nbd-server is not finished yet, coming soon,
 but may be after my vocation on the next week]

Vladimir Sementsov-Ogievskiy (5):
  qapi: add name parameter to nbd-server-add
  qapi: add nbd-server-remove
  iotest 147: add cases to test new @name parameter of nbd-server-add
  iotests: implement QemuIoInteractive class
  iotest 201: new test for qmp nbd-server-remove

 qapi/block.json   |  50 +-
 include/block/nbd.h   |   1 +
 blockdev-nbd.c|  38 --
 hmp.c |   5 +-
 nbd/server.c  |  20 ++
 tests/qemu-iotests/147|  68 ++
 tests/qemu-iotests/147.out|   4 +-
 tests/qemu-iotests/201| 156 ++
 tests/qemu-iotests/201.out|   5 ++
 tests/qemu-iotests/group  |   1 +
 tests/qemu-iotests/iotests.py |  38 ++
 11 files changed, 362 insertions(+), 24 deletions(-)
 create mode 100644 tests/qemu-iotests/201
 create mode 100644 tests/qemu-iotests/201.out

-- 
2.11.1




[Qemu-block] [PATCH v3 4/5] iotests: implement QemuIoInteractive class

2018-01-19 Thread Vladimir Sementsov-Ogievskiy
Implement QemuIoInteractive to test nbd-server-remove command when
there are active connections.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 tests/qemu-iotests/iotests.py | 38 ++
 1 file changed, 38 insertions(+)

diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 44477e9295..5a10b2d534 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -93,6 +93,44 @@ def qemu_io(*args):
 sys.stderr.write('qemu-io received signal %i: %s\n' % (-exitcode, ' 
'.join(args)))
 return subp.communicate()[0]
 
+
+class QemuIoInteractive:
+def __init__(self, *args):
+self.args = qemu_io_args + list(args)
+self._p = subprocess.Popen(self.args, stdin=subprocess.PIPE,
+   stdout=subprocess.PIPE,
+   stderr=subprocess.STDOUT)
+assert self._p.stdout.read(9) == 'qemu-io> '
+
+def close(self):
+self._p.communicate('q\n')
+
+def _read_output(self):
+pattern = 'qemu-io> '
+n = len(pattern)
+pos = 0
+s = []
+while pos != n:
+c = self._p.stdout.read(1)
+# check unexpected EOF
+assert c != ''
+s.append(c)
+if c == pattern[pos]:
+pos += 1
+else:
+pos = 0
+
+return ''.join(s[:-n])
+
+def cmd(self, cmd):
+# quit command is in close(), '\n' is added automatically
+assert '\n' not in cmd
+cmd = cmd.strip()
+assert cmd != 'q' and cmd != 'quit'
+self._p.stdin.write(cmd + '\n')
+return self._read_output()
+
+
 def qemu_nbd(*args):
 '''Run qemu-nbd in daemon mode and return the parent's exit code'''
 return subprocess.call(qemu_nbd_args + ['--fork'] + list(args))
-- 
2.11.1




[Qemu-block] [PATCH v3 1/5] qapi: add name parameter to nbd-server-add

2018-01-19 Thread Vladimir Sementsov-Ogievskiy
Allow user to specify name for new export, to not reuse internal
node name and to not show it to clients.

This also allows creating several exports per device.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Eric Blake 
---
 qapi/block.json |  9 +++--
 blockdev-nbd.c  | 14 +-
 hmp.c   |  5 +++--
 3 files changed, 19 insertions(+), 9 deletions(-)

diff --git a/qapi/block.json b/qapi/block.json
index f093fa3f27..353e3a45bd 100644
--- a/qapi/block.json
+++ b/qapi/block.json
@@ -213,14 +213,19 @@
 #
 # @device: The device name or node name of the node to be exported
 #
+# @name: Export name. If unspecified, the @device parameter is used as the
+#export name. (Since 2.12)
+#
 # @writable: Whether clients should be able to write to the device via the
 # NBD connection (default false).
 #
-# Returns: error if the device is already marked for export.
+# Returns: error if the server is not running, or export with the same name
+#  already exists.
 #
 # Since: 1.3.0
 ##
-{ 'command': 'nbd-server-add', 'data': {'device': 'str', '*writable': 'bool'} }
+{ 'command': 'nbd-server-add',
+  'data': {'device': 'str', '*name': 'str', '*writable': 'bool'} }
 
 ##
 # @nbd-server-stop:
diff --git a/blockdev-nbd.c b/blockdev-nbd.c
index 9e3c22109c..104789e521 100644
--- a/blockdev-nbd.c
+++ b/blockdev-nbd.c
@@ -140,8 +140,8 @@ void qmp_nbd_server_start(SocketAddressLegacy *addr,
 qapi_free_SocketAddress(addr_flat);
 }
 
-void qmp_nbd_server_add(const char *device, bool has_writable, bool writable,
-Error **errp)
+void qmp_nbd_server_add(const char *device, bool has_name, const char *name,
+bool has_writable, bool writable, Error **errp)
 {
 BlockDriverState *bs = NULL;
 BlockBackend *on_eject_blk;
@@ -152,8 +152,12 @@ void qmp_nbd_server_add(const char *device, bool 
has_writable, bool writable,
 return;
 }
 
-if (nbd_export_find(device)) {
-error_setg(errp, "NBD server already exporting device '%s'", device);
+if (!has_name) {
+name = device;
+}
+
+if (nbd_export_find(name)) {
+error_setg(errp, "NBD server already has export named '%s'", name);
 return;
 }
 
@@ -177,7 +181,7 @@ void qmp_nbd_server_add(const char *device, bool 
has_writable, bool writable,
 return;
 }
 
-nbd_export_set_name(exp, device);
+nbd_export_set_name(exp, name);
 
 /* The list of named exports has a strong reference to this export now and
  * our only way of accessing it is through nbd_export_find(), so we can 
drop
diff --git a/hmp.c b/hmp.c
index c6bab5373b..37972f8322 100644
--- a/hmp.c
+++ b/hmp.c
@@ -2218,7 +2218,8 @@ void hmp_nbd_server_start(Monitor *mon, const QDict 
*qdict)
 continue;
 }
 
-qmp_nbd_server_add(info->value->device, true, writable, &local_err);
+qmp_nbd_server_add(info->value->device, false, NULL,
+   true, writable, &local_err);
 
 if (local_err != NULL) {
 qmp_nbd_server_stop(NULL);
@@ -2238,7 +2239,7 @@ void hmp_nbd_server_add(Monitor *mon, const QDict *qdict)
 bool writable = qdict_get_try_bool(qdict, "writable", false);
 Error *local_err = NULL;
 
-qmp_nbd_server_add(device, true, writable, &local_err);
+qmp_nbd_server_add(device, false, NULL, true, writable, &local_err);
 
 if (local_err != NULL) {
 hmp_handle_error(mon, &local_err);
-- 
2.11.1




[Qemu-block] [PATCH v3 3/5] iotest 147: add cases to test new @name parameter of nbd-server-add

2018-01-19 Thread Vladimir Sementsov-Ogievskiy
Signed-off-by: Vladimir Sementsov-Ogievskiy 
Reviewed-by: Eric Blake 
---
 tests/qemu-iotests/147 | 68 +-
 tests/qemu-iotests/147.out |  4 +--
 2 files changed, 57 insertions(+), 15 deletions(-)

diff --git a/tests/qemu-iotests/147 b/tests/qemu-iotests/147
index 90f40ed245..d2081df84b 100755
--- a/tests/qemu-iotests/147
+++ b/tests/qemu-iotests/147
@@ -38,8 +38,8 @@ def flatten_sock_addr(crumpled_address):
 
 
 class NBDBlockdevAddBase(iotests.QMPTestCase):
-def blockdev_add_options(self, address, export=None):
-options = { 'node-name': 'nbd-blockdev',
+def blockdev_add_options(self, address, export, node_name):
+options = { 'node-name': node_name,
 'driver': 'raw',
 'file': {
 'driver': 'nbd',
@@ -50,23 +50,28 @@ class NBDBlockdevAddBase(iotests.QMPTestCase):
 options['file']['export'] = export
 return options
 
-def client_test(self, filename, address, export=None):
-bao = self.blockdev_add_options(address, export)
+def client_test(self, filename, address, export=None,
+node_name='nbd-blockdev', delete=True):
+bao = self.blockdev_add_options(address, export, node_name)
 result = self.vm.qmp('blockdev-add', **bao)
 self.assert_qmp(result, 'return', {})
 
+found = False
 result = self.vm.qmp('query-named-block-nodes')
 for node in result['return']:
-if node['node-name'] == 'nbd-blockdev':
+if node['node-name'] == node_name:
+found = True
 if isinstance(filename, str):
 self.assert_qmp(node, 'image/filename', filename)
 else:
 self.assert_json_filename_equal(node['image']['filename'],
 filename)
 break
+self.assertTrue(found)
 
-result = self.vm.qmp('blockdev-del', node_name='nbd-blockdev')
-self.assert_qmp(result, 'return', {})
+if delete:
+result = self.vm.qmp('blockdev-del', node_name=node_name)
+self.assert_qmp(result, 'return', {})
 
 
 class QemuNBD(NBDBlockdevAddBase):
@@ -125,26 +130,63 @@ class BuiltinNBD(NBDBlockdevAddBase):
 except OSError:
 pass
 
-def _server_up(self, address):
+def _server_up(self, address, export_name=None, export_name2=None):
 result = self.server.qmp('nbd-server-start', addr=address)
 self.assert_qmp(result, 'return', {})
 
-result = self.server.qmp('nbd-server-add', device='nbd-export')
+if export_name is None:
+result = self.server.qmp('nbd-server-add', device='nbd-export')
+else:
+result = self.server.qmp('nbd-server-add', device='nbd-export',
+ name=export_name)
 self.assert_qmp(result, 'return', {})
 
+if export_name2 is not None:
+result = self.server.qmp('nbd-server-add', device='nbd-export',
+ name=export_name2)
+self.assert_qmp(result, 'return', {})
+
+
 def _server_down(self):
 result = self.server.qmp('nbd-server-stop')
 self.assert_qmp(result, 'return', {})
 
-def test_inet(self):
+def do_test_inet(self, export_name=None):
 address = { 'type': 'inet',
 'data': {
 'host': 'localhost',
 'port': str(NBD_PORT)
 } }
-self._server_up(address)
-self.client_test('nbd://localhost:%i/nbd-export' % NBD_PORT,
- flatten_sock_addr(address), 'nbd-export')
+self._server_up(address, export_name)
+export_name = export_name or 'nbd-export'
+self.client_test('nbd://localhost:%i/%s' % (NBD_PORT, export_name),
+ flatten_sock_addr(address), export_name)
+self._server_down()
+
+def test_inet_default_export_name(self):
+self.do_test_inet()
+
+def test_inet_same_export_name(self):
+self.do_test_inet('nbd-export')
+
+def test_inet_different_export_name(self):
+self.do_test_inet('shadow')
+
+def test_inet_two_exports(self):
+address = { 'type': 'inet',
+'data': {
+'host': 'localhost',
+'port': str(NBD_PORT)
+} }
+self._server_up(address, 'exp1', 'exp2')
+self.client_test('nbd://localhost:%i/%s' % (NBD_PORT, 'exp1'),
+ flatten_sock_addr(address), 'exp1', 'node1', False)
+self.client_test('nbd://localhost:%i/%s' % (NBD_PORT, 'exp2'),
+ flatten_sock_addr(address), 'exp2', 'node2', False)
+result = self.vm.qmp('blockdev-del', node_name='node1')
+self.assert_qmp(result, 'return', {})
+   

Re: [Qemu-block] [Qemu-devel] [PATCH v9 03/13] block/dirty-bitmap: add _locked version of bdrv_reclaim_dirty_bitmap

2018-01-19 Thread Vladimir Sementsov-Ogievskiy

18.01.2018 13:09, Paolo Bonzini wrote:

On 18/01/2018 10:55, Vladimir Sementsov-Ogievskiy wrote:

Most functions that looks at the list are "called with BQL taken".
Functions that write to the list are "called with BQL taken" and call
bdrv_dirty_bitmaps_lock/bdrv_dirty_bitmaps_unlock themselves.

Paolo, could you please explain about bitmap locking in more details?
Why do we need mutexes?

We have three cases:

1) monitor creates and destroy bitmaps.

2) monitor also has to read the list.  We know it operates with BQL.

3) users such as mirror.c create a dirty bitmap in the monitor command
(under BQL), but they can operate without BQL in a separate iothread so
we create a separate lock (bitmap->mutex).

While in the second and third case, bitmaps cannot disappear.  So in the
first case you operate with BQL+dirty bitmap mutex.  The result is that
you lock out both the second and the third case while creating and
destroying bitmaps.


Why do we do not need them
on read from the bitmap, only on write?

Indeed, reading the bitmap also requires taking the lock.  So
s/Modifying/Accessing/ in that comment.

Paolo


so,

    /* Writing to the list requires the BQL_and_  the dirty_bitmap_mutex.
 * Reading from the list can be done with either the BQL or the
 * dirty_bitmap_mutex.  Accessing a bitmap only requires
 * dirty_bitmap_mutex. */
    QemuMutex dirty_bitmap_mutex;



so, accessing the bitmap needs mutex lock?

Then what do you mean under accessing the bitmap? Any touch of 
BdrvDirtyBitmap fields? Then "reading the list" will require bitmap 
mutex too.
Or accessing the bitmap is accessing any field except 
BdrvDirtyBitmap.list? Then in (2), what do you mean? For example 
query-block will go through

the list, but it touches other fields too, so it should lock mutex.

--
Best regards,
Vladimir




[Qemu-block] [PATCH v3 2/5] qapi: add nbd-server-remove

2018-01-19 Thread Vladimir Sementsov-Ogievskiy
Add command for removing an export. It is needed for cases when we
don't want to keep export after the operation on it was completed.
The other example is temporary node, created with blockdev-add.
If we want to delete it we should firstly remove corresponding
NBD export.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 qapi/block.json | 41 +
 include/block/nbd.h |  1 +
 blockdev-nbd.c  | 24 
 nbd/server.c| 20 
 4 files changed, 86 insertions(+)

diff --git a/qapi/block.json b/qapi/block.json
index 353e3a45bd..c694524002 100644
--- a/qapi/block.json
+++ b/qapi/block.json
@@ -228,6 +228,47 @@
   'data': {'device': 'str', '*name': 'str', '*writable': 'bool'} }
 
 ##
+# @NbdServerRemoveMode:
+#
+# Mode for removing an NBD export.
+#
+# @safe: Remove export if there are no existing connections, fail otherwise.
+#
+# @hard: Drop all connections immediately and remove export.
+#
+# Potential additional modes to be added in the future:
+#
+# hide: Just hide export from new clients, leave existing connections as is.
+#   Remove export after all clients are disconnected.
+#
+# soft: Hide export from new clients, answer with ESHUTDOWN for all further
+#   requests from existing clients.
+#
+# Since: 2.12
+##
+{'enum': 'NbdServerRemoveMode', 'data': ['safe', 'hard']}
+
+##
+# @nbd-server-remove:
+#
+# Remove NBD export by name.
+#
+# @name: Export name.
+#
+# @mode: Mode of command operation. See @NbdServerRemoveMode description.
+#Default is 'safe'.
+#
+# Returns: error if
+#- the server is not running
+#- export is not found
+#- mode is 'safe' and there are existing connections
+#
+# Since: 2.12
+##
+{ 'command': 'nbd-server-remove',
+  'data': {'name': 'str', '*mode': 'NbdServerRemoveMode'} }
+
+##
 # @nbd-server-stop:
 #
 # Stop QEMU's embedded NBD server, and unregister all devices previously
diff --git a/include/block/nbd.h b/include/block/nbd.h
index 978e443366..ee74ec391a 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -261,6 +261,7 @@ NBDExport *nbd_export_new(BlockDriverState *bs, off_t 
dev_offset, off_t size,
   bool writethrough, BlockBackend *on_eject_blk,
   Error **errp);
 void nbd_export_close(NBDExport *exp);
+void nbd_export_remove(NBDExport *exp, NbdServerRemoveMode mode, Error **errp);
 void nbd_export_get(NBDExport *exp);
 void nbd_export_put(NBDExport *exp);
 
diff --git a/blockdev-nbd.c b/blockdev-nbd.c
index 104789e521..a9f79c6778 100644
--- a/blockdev-nbd.c
+++ b/blockdev-nbd.c
@@ -189,6 +189,30 @@ void qmp_nbd_server_add(const char *device, bool has_name, 
const char *name,
 nbd_export_put(exp);
 }
 
+void qmp_nbd_server_remove(const char *name,
+   bool has_mode, NbdServerRemoveMode mode,
+   Error **errp)
+{
+NBDExport *exp;
+
+if (!nbd_server) {
+error_setg(errp, "NBD server not running");
+return;
+}
+
+exp = nbd_export_find(name);
+if (exp == NULL) {
+error_setg(errp, "Export '%s' is not found", name);
+return;
+}
+
+if (!has_mode) {
+mode = NBD_SERVER_REMOVE_MODE_SAFE;
+}
+
+nbd_export_remove(exp, mode, errp);
+}
+
 void qmp_nbd_server_stop(Error **errp)
 {
 nbd_export_close_all();
diff --git a/nbd/server.c b/nbd/server.c
index 6cf2eeb2c1..bf2afd264a 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -1177,6 +1177,26 @@ void nbd_export_close(NBDExport *exp)
 nbd_export_put(exp);
 }
 
+void nbd_export_remove(NBDExport *exp, NbdServerRemoveMode mode, Error **errp)
+{
+NBDClient *client;
+int nb_clients = 0;
+
+if (mode == NBD_SERVER_REMOVE_MODE_HARD || QTAILQ_EMPTY(&exp->clients)) {
+nbd_export_close(exp);
+return;
+}
+
+assert(mode == NBD_SERVER_REMOVE_MODE_SAFE);
+
+QTAILQ_FOREACH(client, &exp->clients, next) {
+nb_clients++;
+}
+
+error_setg(errp, "export '%s' still in use", exp->name);
+error_append_hint(errp, "Use mode='hard' to force client disconnect\n");
+}
+
 void nbd_export_get(NBDExport *exp)
 {
 assert(exp->refcount > 0);
-- 
2.11.1




Re: [Qemu-block] [Qemu-devel] [PATCH v9 13/13] iotests: add dirty bitmap postcopy test

2018-01-19 Thread Vladimir Sementsov-Ogievskiy

18.01.2018 12:57, Vladimir Sementsov-Ogievskiy wrote:

17.01.2018 21:30, John Snow wrote:


On 12/28/2017 06:49 AM, Vladimir Sementsov-Ogievskiy wrote:

Thank you for reviewing my code!


Time for the re-spin? There's pretty good pressure to get this into 2.12
and say the non-nbd workflow model is feature complete.


Yes, you are right. Hope to do it today or tomorrow.



I've rebased, applied comments from review, and even one some new fixes, 
but now I understand that it is too early for re-spin.


Now this series depends on
1. [PATCH v2 0/3] fix bitmaps migration through shared storage
  - we need to decide, how to fix it. May be, we just do not need 
bitmaps in inactive state, and should not load them in do_qcow2_open in 
this case
  [I can ignore it, just dropping one test case from new iotest, and 
fix it later, but..

2. [PATCH v2 0/6] qmp dirty bitmap API
 - here autoload is dropped, and migration should be rebased on it.

so, I'll re-spin migration after these two questions are resolved.

--
Best regards,
Vladimir




Re: [Qemu-block] [Qemu-devel] [PATCH v9 13/13] iotests: add dirty bitmap postcopy test

2018-01-19 Thread John Snow


On 01/19/2018 01:08 PM, Vladimir Sementsov-Ogievskiy wrote:
> 18.01.2018 12:57, Vladimir Sementsov-Ogievskiy wrote:
>> 17.01.2018 21:30, John Snow wrote:
>>>
>>> On 12/28/2017 06:49 AM, Vladimir Sementsov-Ogievskiy wrote:
 Thank you for reviewing my code!

>>> Time for the re-spin? There's pretty good pressure to get this into 2.12
>>> and say the non-nbd workflow model is feature complete.
>>
>> Yes, you are right. Hope to do it today or tomorrow.
>>
> 
> I've rebased, applied comments from review, and even one some new fixes,
> but now I understand that it is too early for re-spin.
> 
> Now this series depends on
> 1. [PATCH v2 0/3] fix bitmaps migration through shared storage
>   - we need to decide, how to fix it. May be, we just do not need
> bitmaps in inactive state, and should not load them in do_qcow2_open in
> this case
>   [I can ignore it, just dropping one test case from new iotest, and fix
> it later, but..
> 2. [PATCH v2 0/6] qmp dirty bitmap API
>  - here autoload is dropped, and migration should be rebased on it.
> 
> so, I'll re-spin migration after these two questions are resolved.
> 

You got it, thanks.



[Qemu-block] [PATCH v2 03/13] blockjob: create block_job_relax

2018-01-19 Thread John Snow
This will replace mirror_throttle, for reuse in other jobs.

Signed-off-by: John Snow 
---
 block/mirror.c   | 15 ++-
 blockjob.c   | 11 +++
 include/block/blockjob_int.h |  9 +
 3 files changed, 22 insertions(+), 13 deletions(-)

diff --git a/block/mirror.c b/block/mirror.c
index 1fb5fc0cb8..36c681c7a1 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -590,17 +590,6 @@ static void mirror_exit(BlockJob *job, void *opaque)
 bdrv_unref(src);
 }
 
-static void mirror_throttle(MirrorBlockJob *s)
-{
-int64_t now = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
-
-if (now - s->common.last_enter_ns > SLICE_TIME) {
-block_job_sleep_ns(&s->common, 0);
-} else {
-block_job_pause_point(&s->common);
-}
-}
-
 static int coroutine_fn mirror_dirty_init(MirrorBlockJob *s)
 {
 int64_t offset;
@@ -621,7 +610,7 @@ static int coroutine_fn mirror_dirty_init(MirrorBlockJob *s)
 int bytes = MIN(s->bdev_length - offset,
 QEMU_ALIGN_DOWN(INT_MAX, s->granularity));
 
-mirror_throttle(s);
+block_job_relax(&s->common);
 
 if (block_job_is_cancelled(&s->common)) {
 s->initial_zeroing_ongoing = false;
@@ -649,7 +638,7 @@ static int coroutine_fn mirror_dirty_init(MirrorBlockJob *s)
 int bytes = MIN(s->bdev_length - offset,
 QEMU_ALIGN_DOWN(INT_MAX, s->granularity));
 
-mirror_throttle(s);
+block_job_relax(&s->common);
 
 if (block_job_is_cancelled(&s->common)) {
 return 0;
diff --git a/blockjob.c b/blockjob.c
index 2a9ff66b95..6f2e709b51 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -906,6 +906,17 @@ void block_job_yield(BlockJob *job)
 block_job_pause_point(job);
 }
 
+void block_job_relax(BlockJob *job)
+{
+int64_t now = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
+
+if (now - job->last_enter_ns > SLICE_TIME) {
+block_job_sleep_ns(job, 0);
+} else {
+block_job_pause_point(job);
+}
+}
+
 void block_job_iostatus_reset(BlockJob *job)
 {
 if (job->iostatus == BLOCK_DEVICE_IO_STATUS_OK) {
diff --git a/include/block/blockjob_int.h b/include/block/blockjob_int.h
index 209fa1bb3e..553784d86f 100644
--- a/include/block/blockjob_int.h
+++ b/include/block/blockjob_int.h
@@ -157,6 +157,15 @@ void block_job_sleep_ns(BlockJob *job, int64_t ns);
  */
 void block_job_yield(BlockJob *job);
 
+/**
+ * block_job_relax:
+ * @job: The job that calls the function.
+ *
+ * Yield if it has been SLICE_TIME nanoseconds since the last yield.
+ * Otherwise, check if we need to pause, and yield if so.
+ */
+void block_job_relax(BlockJob *job);
+
 /**
  * block_job_pause_all:
  *
-- 
2.14.3




[Qemu-block] [PATCH v2 02/13] blockjob: consolidate SLICE_TIME definition

2018-01-19 Thread John Snow
They're all the same. If it actually becomes important to configure it,
it can become a job or driver property.

Signed-off-by: John Snow 
Reviewed-by: Paolo Bonzini 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Jeff Cody 
---
 block/backup.c   | 1 -
 block/commit.c   | 2 --
 block/mirror.c   | 1 -
 block/stream.c   | 2 --
 include/block/blockjob_int.h | 2 ++
 5 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index 4a16a37229..7b1cdd038a 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -27,7 +27,6 @@
 #include "qemu/error-report.h"
 
 #define BACKUP_CLUSTER_SIZE_DEFAULT (1 << 16)
-#define SLICE_TIME 1ULL /* ns */
 
 typedef struct BackupBlockJob {
 BlockJob common;
diff --git a/block/commit.c b/block/commit.c
index bb6c904704..898545b318 100644
--- a/block/commit.c
+++ b/block/commit.c
@@ -31,8 +31,6 @@ enum {
 COMMIT_BUFFER_SIZE = 512 * 1024, /* in bytes */
 };
 
-#define SLICE_TIME 1ULL /* ns */
-
 typedef struct CommitBlockJob {
 BlockJob common;
 RateLimit limit;
diff --git a/block/mirror.c b/block/mirror.c
index 88f4e8964d..1fb5fc0cb8 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -22,7 +22,6 @@
 #include "qemu/ratelimit.h"
 #include "qemu/bitmap.h"
 
-#define SLICE_TIME1ULL /* ns */
 #define MAX_IN_FLIGHT 16
 #define MAX_IO_BYTES (1 << 20) /* 1 Mb */
 #define DEFAULT_MIRROR_BUF_SIZE (MAX_IN_FLIGHT * MAX_IO_BYTES)
diff --git a/block/stream.c b/block/stream.c
index 499cdacdb0..e85af18c54 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -29,8 +29,6 @@ enum {
 STREAM_BUFFER_SIZE = 512 * 1024, /* in bytes */
 };
 
-#define SLICE_TIME 1ULL /* ns */
-
 typedef struct StreamBlockJob {
 BlockJob common;
 RateLimit limit;
diff --git a/include/block/blockjob_int.h b/include/block/blockjob_int.h
index c9b23b0cc9..209fa1bb3e 100644
--- a/include/block/blockjob_int.h
+++ b/include/block/blockjob_int.h
@@ -29,6 +29,8 @@
 #include "block/blockjob.h"
 #include "block/block.h"
 
+#define SLICE_TIME 1ULL /* ns */
+
 /**
  * BlockJobDriver:
  *
-- 
2.14.3




[Qemu-block] [PATCH v2 00/13] blockjob: refactor mirror_throttle

2018-01-19 Thread John Snow
mirror_throttle attempts to make sure we yield every so often when we're
doing operations that may not otherwise be as cooperative as we'd like.

This pattern is useful to other jobs like commit as well; if commit
continuously decides not to copy data (and calculates delay_ns to be 0),
we'll frequently and aggressively yield, only to be rescheduled immediately.

This causes those "WARNING: I/O thread spun for 1000 iterations"
warnings that everyone loves so much.

Split out the mirror_throttle function to become a new internal
BlockJob API function, block_job_throttle; then use it for all
the other jobs in their runtime loops.

I decided to use it in stream and backup as well, so that regardless of if the
loop structure has the chance to be greedy or not (I did not audit this), the
BlockJob has some kind of failsafe that will occasionally perform a 0ns sleep
every SLICE_TIME.

v2:
 - Fixed spacing consistency (Paolo)
 - Renamed last_yield_ns to last_enter_ns (Stefan)
 - Renamed block_job_throttle to block_job_relax (Stefan)
 - Removed external usages of block_job_pause_point and
   block_job_sleep_ns (Paolo)
 - Further cut down block/backup code to use block_job_relax



For convenience, this branch is available at:
https://github.com/jnsnow/qemu.git branch block-job-yield
https://github.com/jnsnow/qemu/tree/block-job-yield

This version is tagged block-job-yield-v2:
https://github.com/jnsnow/qemu/releases/tag/block-job-yield-v2

John Snow (13):
  blockjob: record time of last entrance
  blockjob: consolidate SLICE_TIME definition
  blockjob: create block_job_relax
  blockjob: allow block_job_throttle to take delay_ns
  block/commit: use block_job_relax
  block/stream: use block_job_relax
  block/backup: use block_job_relax
  allow block_job_relax to return -ECANCELED
  block/backup: remove yield_and_check
  block/mirror: condense cancellation and relax calls
  block/mirror: remove block_job_sleep_ns calls
  blockjob: privatize block_job_sleep_ns
  blockjob: remove block_job_pause_point from interface

 block/backup.c   | 27 ++-
 block/commit.c   |  4 +---
 block/mirror.c   | 52 +++-
 block/stream.c   |  4 +---
 block/trace-events   |  2 +-
 blockjob.c   | 51 ---
 include/block/blockjob.h |  5 +
 include/block/blockjob_int.h | 37 +++
 tests/test-bdrv-drain.c  |  2 +-
 tests/test-blockjob-txn.c|  2 +-
 10 files changed, 94 insertions(+), 92 deletions(-)

-- 
2.14.3




[Qemu-block] [PATCH v2 01/13] blockjob: record time of last entrance

2018-01-19 Thread John Snow
The mirror job makes a semi-inaccurate record of the last time we yielded
by recording the last time we left a "pause", but this doesn't always
correlate to the time we actually last successfully ceded control.

Record the time we last *exited* a yield centrally. In other words, record
the time we began execution of this job to know how long we have been
selfish for.

Signed-off-by: John Snow 
---
 block/mirror.c   | 8 ++--
 blockjob.c   | 2 ++
 include/block/blockjob.h | 5 +
 3 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/block/mirror.c b/block/mirror.c
index c9badc1203..88f4e8964d 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -63,7 +63,6 @@ typedef struct MirrorBlockJob {
 QSIMPLEQ_HEAD(, MirrorBuffer) buf_free;
 int buf_free_count;
 
-uint64_t last_pause_ns;
 unsigned long *in_flight_bitmap;
 int in_flight;
 int64_t bytes_in_flight;
@@ -596,8 +595,7 @@ static void mirror_throttle(MirrorBlockJob *s)
 {
 int64_t now = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
 
-if (now - s->last_pause_ns > SLICE_TIME) {
-s->last_pause_ns = now;
+if (now - s->common.last_enter_ns > SLICE_TIME) {
 block_job_sleep_ns(&s->common, 0);
 } else {
 block_job_pause_point(&s->common);
@@ -769,7 +767,6 @@ static void coroutine_fn mirror_run(void *opaque)
 
 mirror_free_init(s);
 
-s->last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
 if (!s->is_none_mode) {
 ret = mirror_dirty_init(s);
 if (ret < 0 || block_job_is_cancelled(&s->common)) {
@@ -803,7 +800,7 @@ static void coroutine_fn mirror_run(void *opaque)
  * We do so every SLICE_TIME nanoseconds, or when there is an error,
  * or when the source is clean, whichever comes first.
  */
-delta = qemu_clock_get_ns(QEMU_CLOCK_REALTIME) - s->last_pause_ns;
+delta = qemu_clock_get_ns(QEMU_CLOCK_REALTIME) - 
s->common.last_enter_ns;
 if (delta < SLICE_TIME &&
 s->common.iostatus == BLOCK_DEVICE_IO_STATUS_OK) {
 if (s->in_flight >= MAX_IN_FLIGHT || s->buf_free_count == 0 ||
@@ -878,7 +875,6 @@ static void coroutine_fn mirror_run(void *opaque)
 delay_ns = (s->in_flight == 0 && cnt == 0 ? SLICE_TIME : 0);
 block_job_sleep_ns(&s->common, delay_ns);
 }
-s->last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
 }
 
 immediate_exit:
diff --git a/blockjob.c b/blockjob.c
index f5cea84e73..2a9ff66b95 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -321,6 +321,7 @@ void block_job_start(BlockJob *job)
 job->pause_count--;
 job->busy = true;
 job->paused = false;
+job->last_enter_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
 bdrv_coroutine_enter(blk_bs(job->blk), job->co);
 }
 
@@ -786,6 +787,7 @@ static void block_job_do_yield(BlockJob *job, uint64_t ns)
 job->busy = false;
 block_job_unlock();
 qemu_coroutine_yield();
+job->last_enter_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
 
 /* Set by block_job_enter before re-entering the coroutine.  */
 assert(job->busy);
diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index 00403d9482..e965845c94 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -141,6 +141,11 @@ typedef struct BlockJob {
  */
 QEMUTimer sleep_timer;
 
+/**
+ * Timestamp of the last yield
+ */
+uint64_t last_enter_ns;
+
 /** Non-NULL if this job is part of a transaction */
 BlockJobTxn *txn;
 QLIST_ENTRY(BlockJob) txn_list;
-- 
2.14.3




[Qemu-block] [PATCH v2 07/13] block/backup: use block_job_relax

2018-01-19 Thread John Snow
See two commits back for justification.

Signed-off-by: John Snow 
Reviewed-by: Paolo Bonzini 
Reviewed-by: Stefan Hajnoczi 
---
 block/backup.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index 7b1cdd038a..b4204c0ee4 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -336,6 +336,8 @@ static void backup_complete(BlockJob *job, void *opaque)
 
 static bool coroutine_fn yield_and_check(BackupBlockJob *job)
 {
+uint64_t delay_ns = 0;
+
 if (block_job_is_cancelled(&job->common)) {
 return true;
 }
@@ -344,14 +346,12 @@ static bool coroutine_fn yield_and_check(BackupBlockJob 
*job)
  * (without, VM does not reboot)
  */
 if (job->common.speed) {
-uint64_t delay_ns = ratelimit_calculate_delay(&job->limit,
-  job->bytes_read);
+delay_ns = ratelimit_calculate_delay(&job->limit,
+ job->bytes_read);
 job->bytes_read = 0;
-block_job_sleep_ns(&job->common, delay_ns);
-} else {
-block_job_sleep_ns(&job->common, 0);
 }
 
+block_job_relax(&job->common, delay_ns);
 if (block_job_is_cancelled(&job->common)) {
 return true;
 }
-- 
2.14.3




[Qemu-block] [PATCH v2 08/13] allow block_job_relax to return -ECANCELED

2018-01-19 Thread John Snow
This is just an optimization for callers who are likely going to
want to check quite close to this call if the job was canceled or
not anyway.

Along the same lines, add the return to block_job_pause_point and
block_job_sleep_ns, so we don't have to re-check it quite so
excessively.

Signed-off-by: John Snow 
---
 blockjob.c   | 28 +---
 include/block/blockjob_int.h |  8 +---
 2 files changed, 22 insertions(+), 14 deletions(-)

diff --git a/blockjob.c b/blockjob.c
index 51c0eb5d9e..b5a0cda412 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -793,15 +793,15 @@ static void block_job_do_yield(BlockJob *job, uint64_t ns)
 assert(job->busy);
 }
 
-void coroutine_fn block_job_pause_point(BlockJob *job)
+int coroutine_fn block_job_pause_point(BlockJob *job)
 {
 assert(job && block_job_started(job));
 
-if (!block_job_should_pause(job)) {
-return;
-}
 if (block_job_is_cancelled(job)) {
-return;
+return -ECANCELED;
+}
+if (!block_job_should_pause(job)) {
+return 0;
 }
 
 if (job->driver->pause) {
@@ -817,6 +817,8 @@ void coroutine_fn block_job_pause_point(BlockJob *job)
 if (job->driver->resume) {
 job->driver->resume(job);
 }
+
+return 0;
 }
 
 void block_job_resume_all(void)
@@ -874,20 +876,20 @@ bool block_job_is_cancelled(BlockJob *job)
 return job->cancelled;
 }
 
-void block_job_sleep_ns(BlockJob *job, int64_t ns)
+int block_job_sleep_ns(BlockJob *job, int64_t ns)
 {
 assert(job->busy);
 
 /* Check cancellation *before* setting busy = false, too!  */
 if (block_job_is_cancelled(job)) {
-return;
+return -ECANCELED;
 }
 
 if (!block_job_should_pause(job)) {
 block_job_do_yield(job, qemu_clock_get_ns(QEMU_CLOCK_REALTIME) + ns);
 }
 
-block_job_pause_point(job);
+return block_job_pause_point(job);
 }
 
 void block_job_yield(BlockJob *job)
@@ -906,13 +908,17 @@ void block_job_yield(BlockJob *job)
 block_job_pause_point(job);
 }
 
-void block_job_relax(BlockJob *job, int64_t delay_ns)
+int block_job_relax(BlockJob *job, int64_t delay_ns)
 {
+if (block_job_is_cancelled(job)) {
+return -ECANCELED;
+}
+
 if (delay_ns || (qemu_clock_get_ns(QEMU_CLOCK_REALTIME) - \
  job->last_enter_ns > SLICE_TIME)) {
-block_job_sleep_ns(job, delay_ns);
+return block_job_sleep_ns(job, delay_ns);
 } else {
-block_job_pause_point(job);
+return block_job_pause_point(job);
 }
 }
 
diff --git a/include/block/blockjob_int.h b/include/block/blockjob_int.h
index 5f1520fab7..1ceb47e1e6 100644
--- a/include/block/blockjob_int.h
+++ b/include/block/blockjob_int.h
@@ -147,7 +147,7 @@ void *block_job_create(const char *job_id, const 
BlockJobDriver *driver,
  * %QEMU_CLOCK_REALTIME nanoseconds.  Canceling the job will immediately
  * interrupt the wait.
  */
-void block_job_sleep_ns(BlockJob *job, int64_t ns);
+int block_job_sleep_ns(BlockJob *job, int64_t ns);
 
 /**
  * block_job_yield:
@@ -167,8 +167,10 @@ void block_job_yield(BlockJob *job);
  * If delay_ns is 0, yield if it has been SLICE_TIME
  * nanoseconds since the last yield. Otherwise, check
  * if we need to yield for a pause event.
+ *
+ * returns ECANCELED if the job has been canceled.
  */
-void block_job_relax(BlockJob *job, int64_t delay_ns);
+int block_job_relax(BlockJob *job, int64_t delay_ns);
 
 /**
  * block_job_pause_all:
@@ -217,7 +219,7 @@ bool block_job_is_cancelled(BlockJob *job);
  * Pause now if block_job_pause() has been called.  Block jobs that perform
  * lots of I/O must call this between requests so that the job can be paused.
  */
-void coroutine_fn block_job_pause_point(BlockJob *job);
+int coroutine_fn block_job_pause_point(BlockJob *job);
 
 /**
  * block_job_enter:
-- 
2.14.3




[Qemu-block] [PATCH v2 04/13] blockjob: allow block_job_throttle to take delay_ns

2018-01-19 Thread John Snow
Instead of only sleeping for 0ms when we've hit a timeout, optionally
take a longer more explicit delay_ns that always forces the sleep.

Signed-off-by: John Snow 
---
 block/mirror.c   |  4 ++--
 blockjob.c   |  9 -
 include/block/blockjob_int.h | 10 +++---
 3 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/block/mirror.c b/block/mirror.c
index 36c681c7a1..3c73caed5e 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -610,7 +610,7 @@ static int coroutine_fn mirror_dirty_init(MirrorBlockJob *s)
 int bytes = MIN(s->bdev_length - offset,
 QEMU_ALIGN_DOWN(INT_MAX, s->granularity));
 
-block_job_relax(&s->common);
+block_job_relax(&s->common, 0);
 
 if (block_job_is_cancelled(&s->common)) {
 s->initial_zeroing_ongoing = false;
@@ -638,7 +638,7 @@ static int coroutine_fn mirror_dirty_init(MirrorBlockJob *s)
 int bytes = MIN(s->bdev_length - offset,
 QEMU_ALIGN_DOWN(INT_MAX, s->granularity));
 
-block_job_relax(&s->common);
+block_job_relax(&s->common, 0);
 
 if (block_job_is_cancelled(&s->common)) {
 return 0;
diff --git a/blockjob.c b/blockjob.c
index 6f2e709b51..51c0eb5d9e 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -906,12 +906,11 @@ void block_job_yield(BlockJob *job)
 block_job_pause_point(job);
 }
 
-void block_job_relax(BlockJob *job)
+void block_job_relax(BlockJob *job, int64_t delay_ns)
 {
-int64_t now = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
-
-if (now - job->last_enter_ns > SLICE_TIME) {
-block_job_sleep_ns(job, 0);
+if (delay_ns || (qemu_clock_get_ns(QEMU_CLOCK_REALTIME) - \
+ job->last_enter_ns > SLICE_TIME)) {
+block_job_sleep_ns(job, delay_ns);
 } else {
 block_job_pause_point(job);
 }
diff --git a/include/block/blockjob_int.h b/include/block/blockjob_int.h
index 553784d86f..5f1520fab7 100644
--- a/include/block/blockjob_int.h
+++ b/include/block/blockjob_int.h
@@ -160,11 +160,15 @@ void block_job_yield(BlockJob *job);
 /**
  * block_job_relax:
  * @job: The job that calls the function.
+ * @delay_ns: The amount of time to sleep for
  *
- * Yield if it has been SLICE_TIME nanoseconds since the last yield.
- * Otherwise, check if we need to pause, and yield if so.
+ * Sleep for delay_ns nanoseconds.
+ *
+ * If delay_ns is 0, yield if it has been SLICE_TIME
+ * nanoseconds since the last yield. Otherwise, check
+ * if we need to yield for a pause event.
  */
-void block_job_relax(BlockJob *job);
+void block_job_relax(BlockJob *job, int64_t delay_ns);
 
 /**
  * block_job_pause_all:
-- 
2.14.3




[Qemu-block] [PATCH v2 05/13] block/commit: use block_job_relax

2018-01-19 Thread John Snow
Depending on the value of `speed` and how fast our backends are,
delay_ns might be 0 very, very often. This creates some warning
messages that spook users, but it's also pretty inefficient.

Use block_job_relax instead to yield a little more intelligently.

Signed-off-by: John Snow 
Reviewed-by: Paolo Bonzini 
Reviewed-by: Stefan Hajnoczi 
---
 block/commit.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/commit.c b/block/commit.c
index 898545b318..9cfe3ed76b 100644
--- a/block/commit.c
+++ b/block/commit.c
@@ -172,7 +172,7 @@ static void coroutine_fn commit_run(void *opaque)
 /* Note that even when no rate limit is applied we need to yield
  * with no pending I/O here so that bdrv_drain_all() returns.
  */
-block_job_sleep_ns(&s->common, delay_ns);
+block_job_relax(&s->common, delay_ns);
 if (block_job_is_cancelled(&s->common)) {
 break;
 }
-- 
2.14.3




[Qemu-block] [PATCH v2 06/13] block/stream: use block_job_relax

2018-01-19 Thread John Snow
See prior commit for justification.

Signed-off-by: John Snow 
Reviewed-by: Paolo Bonzini 
Reviewed-by: Stefan Hajnoczi 
---
 block/stream.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/stream.c b/block/stream.c
index e85af18c54..49dc102565 100644
--- a/block/stream.c
+++ b/block/stream.c
@@ -139,7 +139,7 @@ static void coroutine_fn stream_run(void *opaque)
 /* Note that even when no rate limit is applied we need to yield
  * with no pending I/O here so that bdrv_drain_all() returns.
  */
-block_job_sleep_ns(&s->common, delay_ns);
+block_job_relax(&s->common, delay_ns);
 if (block_job_is_cancelled(&s->common)) {
 break;
 }
-- 
2.14.3




[Qemu-block] [PATCH v2 12/13] blockjob: privatize block_job_sleep_ns

2018-01-19 Thread John Snow
There's not currently any external caller of it.

Except in tests, but we'll fix that here too.

Replace usages in test cases with block_job_relax, which functions
similarly enough to be used as a drop-in replacement.

Very technically block_job_sleep_ns(job, 0) behaves differently
from block_job_relax(job, 0) in that relax may resolve to a no-op,
but this makes no difference in the test in which it is used.

Signed-off-by: John Snow 
---
 blockjob.c   | 11 ++-
 include/block/blockjob_int.h | 11 ---
 tests/test-bdrv-drain.c  |  2 +-
 tests/test-blockjob-txn.c|  2 +-
 4 files changed, 12 insertions(+), 14 deletions(-)

diff --git a/blockjob.c b/blockjob.c
index b5a0cda412..40167d6896 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -876,7 +876,16 @@ bool block_job_is_cancelled(BlockJob *job)
 return job->cancelled;
 }
 
-int block_job_sleep_ns(BlockJob *job, int64_t ns)
+/**
+ * block_job_sleep_ns:
+ * @job: The job that calls the function.
+ * @ns: How many nanoseconds to stop for.
+ *
+ * Put the job to sleep (assuming that it wasn't canceled) for @ns
+ * %QEMU_CLOCK_REALTIME nanoseconds.  Canceling the job will immediately
+ * interrupt the wait.
+ */
+static int block_job_sleep_ns(BlockJob *job, int64_t ns)
 {
 assert(job->busy);
 
diff --git a/include/block/blockjob_int.h b/include/block/blockjob_int.h
index 1ceb47e1e6..c4891a5a9b 100644
--- a/include/block/blockjob_int.h
+++ b/include/block/blockjob_int.h
@@ -138,17 +138,6 @@ void *block_job_create(const char *job_id, const 
BlockJobDriver *driver,
uint64_t shared_perm, int64_t speed, int flags,
BlockCompletionFunc *cb, void *opaque, Error **errp);
 
-/**
- * block_job_sleep_ns:
- * @job: The job that calls the function.
- * @ns: How many nanoseconds to stop for.
- *
- * Put the job to sleep (assuming that it wasn't canceled) for @ns
- * %QEMU_CLOCK_REALTIME nanoseconds.  Canceling the job will immediately
- * interrupt the wait.
- */
-int block_job_sleep_ns(BlockJob *job, int64_t ns);
-
 /**
  * block_job_yield:
  * @job: The job that calls the function.
diff --git a/tests/test-bdrv-drain.c b/tests/test-bdrv-drain.c
index d760e2b243..5d47541b4c 100644
--- a/tests/test-bdrv-drain.c
+++ b/tests/test-bdrv-drain.c
@@ -506,7 +506,7 @@ static void coroutine_fn test_job_start(void *opaque)
 TestBlockJob *s = opaque;
 
 while (!s->should_complete) {
-block_job_sleep_ns(&s->common, 10);
+block_job_relax(&s->common, 10);
 }
 
 block_job_defer_to_main_loop(&s->common, test_job_completed, NULL);
diff --git a/tests/test-blockjob-txn.c b/tests/test-blockjob-txn.c
index 3591c9617f..edcf0de6bd 100644
--- a/tests/test-blockjob-txn.c
+++ b/tests/test-blockjob-txn.c
@@ -44,7 +44,7 @@ static void coroutine_fn test_block_job_run(void *opaque)
 
 while (s->iterations--) {
 if (s->use_timer) {
-block_job_sleep_ns(job, 0);
+block_job_relax(job, 0);
 } else {
 block_job_yield(job);
 }
-- 
2.14.3




[Qemu-block] [PATCH v2 11/13] block/mirror: remove block_job_sleep_ns calls

2018-01-19 Thread John Snow
We're attempting to slacken the mirror loop in three different places,
but we can combine these three attempts. Combine the early loop call to
block_job_pause_point with the two late-loop calls to block_job_sleep_ns.

When delay_ns is 0 and it has not been SLICE_TIME since the last yield,
block_job_relax is merely a call to block_job_pause_point, so this should
be equivalent with the exception that if we have managed to not yield at
all in the last SLICE_TIME ns, we will now do so.

I am not sure that condition was possible,
so this loop should be equivalent.

Signed-off-by: John Snow 
---
 block/mirror.c | 22 +++---
 block/trace-events |  2 +-
 2 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/block/mirror.c b/block/mirror.c
index a0e0044de2..192e03694f 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -761,7 +761,7 @@ static void coroutine_fn mirror_run(void *opaque)
 assert(!s->dbi);
 s->dbi = bdrv_dirty_iter_new(s->dirty_bitmap);
 for (;;) {
-uint64_t delay_ns = 0;
+static uint64_t delay_ns = 0;
 int64_t cnt, delta;
 bool should_complete;
 
@@ -770,9 +770,16 @@ static void coroutine_fn mirror_run(void *opaque)
 goto immediate_exit;
 }
 
-block_job_pause_point(&s->common);
-
 cnt = bdrv_get_dirty_count(s->dirty_bitmap);
+
+trace_mirror_before_relax(s, cnt, s->synced, delay_ns);
+if (block_job_relax(&s->common, delay_ns)) {
+if (!s->synced) {
+goto immediate_exit;
+}
+}
+delay_ns = 0;
+
 /* s->common.offset contains the number of bytes already processed so
  * far, cnt is the number of dirty bytes remaining and
  * s->bytes_in_flight is the number of bytes currently being
@@ -849,15 +856,8 @@ static void coroutine_fn mirror_run(void *opaque)
 }
 
 ret = 0;
-trace_mirror_before_sleep(s, cnt, s->synced, delay_ns);
-if (!s->synced) {
-block_job_sleep_ns(&s->common, delay_ns);
-if (block_job_is_cancelled(&s->common)) {
-break;
-}
-} else if (!should_complete) {
+if (s->synced && !should_complete) {
 delay_ns = (s->in_flight == 0 && cnt == 0 ? SLICE_TIME : 0);
-block_job_sleep_ns(&s->common, delay_ns);
 }
 }
 
diff --git a/block/trace-events b/block/trace-events
index 11c8d5f590..60f36f929a 100644
--- a/block/trace-events
+++ b/block/trace-events
@@ -27,7 +27,7 @@ mirror_start(void *bs, void *s, void *opaque) "bs %p s %p 
opaque %p"
 mirror_restart_iter(void *s, int64_t cnt) "s %p dirty count %"PRId64
 mirror_before_flush(void *s) "s %p"
 mirror_before_drain(void *s, int64_t cnt) "s %p dirty count %"PRId64
-mirror_before_sleep(void *s, int64_t cnt, int synced, uint64_t delay_ns) "s %p 
dirty count %"PRId64" synced %d delay %"PRIu64"ns"
+mirror_before_relax(void *s, int64_t cnt, int synced, uint64_t delay_ns) "s %p 
dirty count %"PRId64" synced %d delay %"PRIu64"ns"
 mirror_one_iteration(void *s, int64_t offset, uint64_t bytes) "s %p offset %" 
PRId64 " bytes %" PRIu64
 mirror_iteration_done(void *s, int64_t offset, uint64_t bytes, int ret) "s %p 
offset %" PRId64 " bytes %" PRIu64 " ret %d"
 mirror_yield(void *s, int64_t cnt, int buf_free_count, int in_flight) "s %p 
dirty count %"PRId64" free buffers %d in_flight %d"
-- 
2.14.3




[Qemu-block] [PATCH v2 09/13] block/backup: remove yield_and_check

2018-01-19 Thread John Snow
This is a respin of the same functionality as mirror_throttle,
so trash this and replace it with the generic version.

yield_and_check returned true if canceled, false otherwise.
block_job_relax returns -ECANCELED if canceled, 0 otherwise.

Signed-off-by: John Snow 
---
 block/backup.c | 20 
 1 file changed, 4 insertions(+), 16 deletions(-)

diff --git a/block/backup.c b/block/backup.c
index b4204c0ee4..0624c3b322 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -334,29 +334,17 @@ static void backup_complete(BlockJob *job, void *opaque)
 g_free(data);
 }
 
-static bool coroutine_fn yield_and_check(BackupBlockJob *job)
+static uint64_t get_delay_ns(BackupBlockJob *job)
 {
 uint64_t delay_ns = 0;
 
-if (block_job_is_cancelled(&job->common)) {
-return true;
-}
-
-/* we need to yield so that bdrv_drain_all() returns.
- * (without, VM does not reboot)
- */
 if (job->common.speed) {
 delay_ns = ratelimit_calculate_delay(&job->limit,
  job->bytes_read);
 job->bytes_read = 0;
 }
 
-block_job_relax(&job->common, delay_ns);
-if (block_job_is_cancelled(&job->common)) {
-return true;
-}
-
-return false;
+return delay_ns;
 }
 
 static int coroutine_fn backup_run_incremental(BackupBlockJob *job)
@@ -369,7 +357,7 @@ static int coroutine_fn 
backup_run_incremental(BackupBlockJob *job)
 hbitmap_iter_init(&hbi, job->copy_bitmap, 0);
 while ((cluster = hbitmap_iter_next(&hbi)) != -1) {
 do {
-if (yield_and_check(job)) {
+if (block_job_relax(&job->common, get_delay_ns(job))) {
 return 0;
 }
 ret = backup_do_cow(job, cluster * job->cluster_size,
@@ -465,7 +453,7 @@ static void coroutine_fn backup_run(void *opaque)
 bool error_is_read;
 int alloced = 0;
 
-if (yield_and_check(job)) {
+if (block_job_relax(&job->common, get_delay_ns(job))) {
 break;
 }
 
-- 
2.14.3




[Qemu-block] [PATCH v2 10/13] block/mirror: condense cancellation and relax calls

2018-01-19 Thread John Snow
We can count on the relax call to check cancellation for us, so
condense these concurrent calls.

Signed-off-by: John Snow 
---
 block/mirror.c | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/block/mirror.c b/block/mirror.c
index 3c73caed5e..a0e0044de2 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -610,9 +610,7 @@ static int coroutine_fn mirror_dirty_init(MirrorBlockJob *s)
 int bytes = MIN(s->bdev_length - offset,
 QEMU_ALIGN_DOWN(INT_MAX, s->granularity));
 
-block_job_relax(&s->common, 0);
-
-if (block_job_is_cancelled(&s->common)) {
+if (block_job_relax(&s->common, 0)) {
 s->initial_zeroing_ongoing = false;
 return 0;
 }
@@ -638,9 +636,7 @@ static int coroutine_fn mirror_dirty_init(MirrorBlockJob *s)
 int bytes = MIN(s->bdev_length - offset,
 QEMU_ALIGN_DOWN(INT_MAX, s->granularity));
 
-block_job_relax(&s->common, 0);
-
-if (block_job_is_cancelled(&s->common)) {
+if (block_job_relax(&s->common, 0)) {
 return 0;
 }
 
-- 
2.14.3




[Qemu-block] [PATCH v2 13/13] blockjob: remove block_job_pause_point from interface

2018-01-19 Thread John Snow
Remove the last call in block/mirror, using relax instead.
relax may do nothing if we are canceled, so allow iteration to return
prematurely and allow mirror_run to handle the cancellation logic.

This is a functional change to mirror that should have the effect of
cancelled mirror jobs being able to respond to that request a little
sooner instead of launching new requests.

Signed-off-by: John Snow 
---
 block/mirror.c   |  4 +++-
 blockjob.c   | 10 +-
 include/block/blockjob_int.h |  9 -
 3 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/block/mirror.c b/block/mirror.c
index 192e03694f..8e6b5b25a9 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -345,7 +345,9 @@ static uint64_t coroutine_fn 
mirror_iteration(MirrorBlockJob *s)
 mirror_wait_for_io(s);
 }
 
-block_job_pause_point(&s->common);
+if (block_job_relax(&s->common, 0)) {
+return 0;
+}
 
 /* Find the number of consective dirty chunks following the first dirty
  * one, and wait for in flight requests in them. */
diff --git a/blockjob.c b/blockjob.c
index 40167d6896..27c13fdd08 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -60,6 +60,7 @@ static void __attribute__((__constructor__)) 
block_job_init(void)
 static void block_job_event_cancelled(BlockJob *job);
 static void block_job_event_completed(BlockJob *job, const char *msg);
 static void block_job_enter_cond(BlockJob *job, bool(*fn)(BlockJob *job));
+static int coroutine_fn block_job_pause_point(BlockJob *job);
 
 /* Transactional group of block jobs */
 struct BlockJobTxn {
@@ -793,7 +794,14 @@ static void block_job_do_yield(BlockJob *job, uint64_t ns)
 assert(job->busy);
 }
 
-int coroutine_fn block_job_pause_point(BlockJob *job)
+/**
+ * block_job_pause_point:
+ * @job: The job that is ready to pause.
+ *
+ * Pause now if block_job_pause() has been called.  Block jobs that perform
+ * lots of I/O must call this between requests so that the job can be paused.
+ */
+static int coroutine_fn block_job_pause_point(BlockJob *job)
 {
 assert(job && block_job_started(job));
 
diff --git a/include/block/blockjob_int.h b/include/block/blockjob_int.h
index c4891a5a9b..57327cbc5a 100644
--- a/include/block/blockjob_int.h
+++ b/include/block/blockjob_int.h
@@ -201,15 +201,6 @@ void block_job_completed(BlockJob *job, int ret);
  */
 bool block_job_is_cancelled(BlockJob *job);
 
-/**
- * block_job_pause_point:
- * @job: The job that is ready to pause.
- *
- * Pause now if block_job_pause() has been called.  Block jobs that perform
- * lots of I/O must call this between requests so that the job can be paused.
- */
-int coroutine_fn block_job_pause_point(BlockJob *job);
-
 /**
  * block_job_enter:
  * @job: The job to enter.
-- 
2.14.3




Re: [Qemu-block] [Qemu-devel] [PATCH v2 00/13] blockjob: refactor mirror_throttle

2018-01-19 Thread no-reply
Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 20180119205847.7141-1-js...@redhat.com
Subject: [Qemu-devel] [PATCH v2 00/13] blockjob: refactor mirror_throttle

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

git config --local diff.renamelimit 0
git config --local diff.renames True

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
failed=1
echo
fi
n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 t [tag update]
patchew/20180119162554.27270-1-marcandre.lur...@redhat.com -> 
patchew/20180119162554.27270-1-marcandre.lur...@redhat.com
 * [new tag]   patchew/20180119205847.7141-1-js...@redhat.com -> 
patchew/20180119205847.7141-1-js...@redhat.com
Switched to a new branch 'test'
969dddc4fd blockjob: remove block_job_pause_point from interface
792539a07e blockjob: privatize block_job_sleep_ns
ea6f8f2359 block/mirror: remove block_job_sleep_ns calls
7f2ee60c8b block/mirror: condense cancellation and relax calls
9a16d6778e block/backup: remove yield_and_check
bbfcfe9c89 allow block_job_relax to return -ECANCELED
b1324b111c block/backup: use block_job_relax
d208db6fca block/stream: use block_job_relax
788ece436a block/commit: use block_job_relax
4e209e7aed blockjob: allow block_job_throttle to take delay_ns
76ce42710b blockjob: create block_job_relax
9771e84ab8 blockjob: consolidate SLICE_TIME definition
2a1eec68ac blockjob: record time of last entrance

=== OUTPUT BEGIN ===
Checking PATCH 1/13: blockjob: record time of last entrance...
WARNING: line over 80 characters
#52: FILE: block/mirror.c:803:
+delta = qemu_clock_get_ns(QEMU_CLOCK_REALTIME) - 
s->common.last_enter_ns;

total: 0 errors, 1 warnings, 63 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
Checking PATCH 2/13: blockjob: consolidate SLICE_TIME definition...
Checking PATCH 3/13: blockjob: create block_job_relax...
Checking PATCH 4/13: blockjob: allow block_job_throttle to take delay_ns...
Checking PATCH 5/13: block/commit: use block_job_relax...
Checking PATCH 6/13: block/stream: use block_job_relax...
Checking PATCH 7/13: block/backup: use block_job_relax...
Checking PATCH 8/13: allow block_job_relax to return -ECANCELED...
Checking PATCH 9/13: block/backup: remove yield_and_check...
Checking PATCH 10/13: block/mirror: condense cancellation and relax calls...
Checking PATCH 11/13: block/mirror: remove block_job_sleep_ns calls...
ERROR: do not initialise statics to 0 or NULL
#30: FILE: block/mirror.c:764:
+static uint64_t delay_ns = 0;

total: 1 errors, 0 warnings, 50 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 12/13: blockjob: privatize block_job_sleep_ns...
Checking PATCH 13/13: blockjob: remove block_job_pause_point from interface...
=== OUTPUT END ===

Test command exited with code: 1


---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-de...@freelists.org

Re: [Qemu-block] [Qemu-devel] [PATCH v2] hw/ide: Remove duplicated definitions from ahci_internal.h

2018-01-19 Thread John Snow


On 12/07/2017 12:47 AM, Thomas Huth wrote:
> On 06.12.2017 23:16, John Snow wrote:
>> I tweaked this again, sorry:
>>
>> The names need to stay public, but the wrappers to manipulate the
>> objects can stay internal. Minor difference.
>>
>> If that's okay, I'll just merge this in.
>> OK?
> 
> Sure. Feel also free to replace my "Signed-off-by" with "Reported-by" in
> that case if you like.
> 
>  Thomas
> 

I forgot about this. Sending the PR now.



Re: [Qemu-block] [Qemu-devel] [PATCH] ide-test: test trim requests

2018-01-19 Thread John Snow


On 01/19/2018 07:40 AM, Anton Nefedov wrote:
> Signed-off-by: Anton Nefedov 
> ---
>  tests/ide-test.c | 71 
> 
>  1 file changed, 71 insertions(+)
> 
> diff --git a/tests/ide-test.c b/tests/ide-test.c
> index aa9de06..259f39f 100644
> --- a/tests/ide-test.c
> +++ b/tests/ide-test.c
> @@ -52,6 +52,7 @@
>  enum {
>  reg_data= 0x0,
>  reg_feature = 0x1,
> +reg_error   = 0x1,
>  reg_nsectors= 0x2,
>  reg_lba_low = 0x3,
>  reg_lba_middle  = 0x4,
> @@ -69,6 +70,11 @@ enum {
>  ERR = 0x01,
>  };
>  
> +/* Error field */
> +enum {
> +ABRT= 0x04,
> +};
> +
>  enum {
>  DEV = 0x10,
>  LBA = 0x40,
> @@ -81,6 +87,7 @@ enum {
>  };
>  
>  enum {
> +CMD_DSM = 0x06,
>  CMD_READ_DMA= 0xc8,
>  CMD_WRITE_DMA   = 0xca,
>  CMD_FLUSH_CACHE = 0xe7,
> @@ -179,6 +186,12 @@ typedef struct PrdtEntry {
>  #define assert_bit_set(data, mask) g_assert_cmphex((data) & (mask), ==, 
> (mask))
>  #define assert_bit_clear(data, mask) g_assert_cmphex((data) & (mask), ==, 0)
>  
> +static uint64_t trim_range_le(uint64_t sector, uint64_t count)
> +{
> +/* 2-byte range, 6-byte LBA */
> +return cpu_to_le64((count << 48) + sector);
> +}
> +

if count can only be two bytes, why not make it uint16_t?

>  static int send_dma_request(int cmd, uint64_t sector, int nb_sectors,
>  PrdtEntry *prdt, int prdt_entries,
>  void(*post_exec)(QPCIDevice *dev, QPCIBar 
> ide_bar,
> @@ -204,6 +217,7 @@ static int send_dma_request(int cmd, uint64_t sector, int 
> nb_sectors,
>   * the SCSI command being sent in the packet, too. */
>  from_dev = true;
>  break;
> +case CMD_DSM:
>  case CMD_WRITE_DMA:
>  from_dev = false;
>  break;
> @@ -234,6 +248,10 @@ static int send_dma_request(int cmd, uint64_t sector, 
> int nb_sectors,
>  /* Enables ATAPI DMA; otherwise PIO is attempted */
>  qpci_io_writeb(dev, ide_bar, reg_feature, 0x01);
>  } else {
> +if (cmd == CMD_DSM) {
> +/* trim bit */
> +qpci_io_writeb(dev, ide_bar, reg_feature, 0x01);
> +}
>  qpci_io_writeb(dev, ide_bar, reg_nsectors, nb_sectors);
>  qpci_io_writeb(dev, ide_bar, reg_lba_low,sector & 0xff);
>  qpci_io_writeb(dev, ide_bar, reg_lba_middle, (sector >> 8) & 0xff);
> @@ -344,6 +362,58 @@ static void test_bmdma_simple_rw(void)
>  g_free(cmpbuf);
>  }
>  
> +static void test_bmdma_trim(void)
> +{
> +QPCIDevice *dev;
> +QPCIBar bmdma_bar, ide_bar;
> +uint8_t status;
> +const uint64_t random_range[] = { trim_range_le(0, 2),
> +  trim_range_le(6, 8),
> +  trim_range_le(10, 1),
> +};

Maybe just "trim_range" -- it's not /random/.

> +const uint64_t bad_range = trim_range_le(TEST_IMAGE_SIZE / 512 - 1, 2);
> +size_t len = 512;
> +uint8_t *buf;
> +uintptr_t guest_buf = guest_alloc(guest_malloc, len);
> +
> +PrdtEntry prdt[] = {
> +{
> +.addr = cpu_to_le32(guest_buf),
> +.size = cpu_to_le32(len | PRDT_EOT),
> +},
> +};
> +
> +dev = get_pci_device(&bmdma_bar, &ide_bar);
> +
> +buf = g_malloc(len);
> +
> +/* Normal request */
> +*((uint64_t *)buf) = random_range[0];
> +*((uint64_t *)buf + 1) = random_range[1];
> +
> +memwrite(guest_buf, buf, 2 * sizeof(uint64_t));
> +
> +status = send_dma_request(CMD_DSM, 0, 1, prdt,
> +  ARRAY_SIZE(prdt), NULL);
> +g_assert_cmphex(status, ==, BM_STS_INTR);
> +assert_bit_clear(qpci_io_readb(dev, ide_bar, reg_status), DF | ERR);
> +
> +/* Request contains invalid range */
> +*((uint64_t *)buf) = random_range[2];
> +*((uint64_t *)buf + 1) = bad_range;
> +
> +memwrite(guest_buf, buf, 2 * sizeof(uint64_t));
> +
> +status = send_dma_request(CMD_DSM, 0, 1, prdt,
> +  ARRAY_SIZE(prdt), NULL);
> +g_assert_cmphex(status, ==, BM_STS_INTR);
> +assert_bit_set(qpci_io_readb(dev, ide_bar, reg_status), ERR);
> +assert_bit_set(qpci_io_readb(dev, ide_bar, reg_error), ABRT);
> +
> +free_pci_device(dev);
> +g_free(buf);
> +}
> +
>  static void test_bmdma_short_prdt(void)
>  {
>  QPCIDevice *dev;
> @@ -963,6 +1033,7 @@ int main(int argc, char **argv)
>  
>  qtest_add_func("/ide/bmdma/setup", test_bmdma_setup);
>  qtest_add_func("/ide/bmdma/simple_rw", test_bmdma_simple_rw);
> +qtest_add_func("/ide/bmdma/trim", test_bmdma_trim);
>  qtest_add_func("/ide/bmdma/short_prdt", test_bmdma_short_prdt);
>  qtest_add_func("/ide/bmdma/one_sector_short_prdt",
> test_bmdma_one_sector_short_prdt);
> 

looks good except for those two very small nits.



Re: [Qemu-block] [Qemu-devel] [PATCH 0/3] ide: abort TRIM operation for invalid range

2018-01-19 Thread John Snow


On 12/08/2017 07:10 AM, Anton Nefedov wrote:
> Started from the separate series discussion (trim statistics) , see
> http://lists.nongnu.org/archive/html/qemu-devel/2017-12/msg01059.html
> 
> There is no range check for IDE trim requests now.
> Such request will likely be rejected by the block layer and count as
> failed and not an invalid/aborted operation.
> 
> Anton Nefedov (3):
>   ide: pass IDEState to trim AIO callback
>   ide: move ide_sect_range_ok() up
>   ide: abort TRIM operation for invalid range
> 
>  hw/ide/core.c | 53 +
>  1 file changed, 33 insertions(+), 20 deletions(-)
> 

I forgot about this series due to the 2.11 freeze and winter break. It
appears to still apply, so I'll send it along.



[Qemu-block] [PATCH v4] file-posix: specify expected filetypes

2018-01-19 Thread John Snow
Adjust each caller of raw_open_common to specify if they are expecting
host and character devices or not. Tighten expectations of file types upon
open in the common code and refuse types that are not expected.

This has two effects:

(1) Character and block devices are now considered deprecated for the
'file' driver, which expects only S_IFREG, and
(2) no file-posix driver (file, host_cdrom, or host_device) can open
directories now.

I don't think there's a legitimate reason to open directories as if
they were files. This prevents QEMU from opening and attempting to probe
a directory inode, which can break in exciting ways. One of those ways
is lseek on ext4/xfs, which will return 0x7fff as the file
size instead of EISDIR. This can coax QEMU into responding with a
confusing "file too big" instead of "Hey, that's not a file".

See: https://bugs.launchpad.net/qemu/+bug/1739304/
Signed-off-by: John Snow 
---

v4: fixing doc building bug, sorry :(

 block/file-posix.c | 37 +
 qemu-doc.texi  |  6 ++
 2 files changed, 35 insertions(+), 8 deletions(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index 36ee89e940..61fac1d2a4 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -417,7 +417,8 @@ static QemuOptsList raw_runtime_opts = {
 };
 
 static int raw_open_common(BlockDriverState *bs, QDict *options,
-   int bdrv_flags, int open_flags, Error **errp)
+   int bdrv_flags, int open_flags,
+   bool device, Error **errp)
 {
 BDRVRawState *s = bs->opaque;
 QemuOpts *opts;
@@ -556,10 +557,30 @@ static int raw_open_common(BlockDriverState *bs, QDict 
*options,
 error_setg_errno(errp, errno, "Could not stat file");
 goto fail;
 }
-if (S_ISREG(st.st_mode)) {
-s->discard_zeroes = true;
-s->has_fallocate = true;
+
+if (!device) {
+if (S_ISBLK(st.st_mode)) {
+warn_report("Opening a block device as file using 'file' "
+"driver is deprecated");
+} else if (S_ISCHR(st.st_mode)) {
+warn_report("Opening a character device as file using the 'file' "
+"driver is deprecated");
+} else if (!S_ISREG(st.st_mode)) {
+error_setg(errp, "A regular file was expected by the 'file' 
driver, "
+   "but something else was given");
+goto fail;
+} else {
+s->discard_zeroes = true;
+s->has_fallocate = true;
+}
+} else {
+if (!(S_ISCHR(st.st_mode) || S_ISBLK(st.st_mode))) {
+error_setg(errp, "host_device/host_cdrom driver expects either "
+   "a character or block device");
+goto fail;
+}
 }
+
 if (S_ISBLK(st.st_mode)) {
 #ifdef BLKDISCARDZEROES
 unsigned int arg;
@@ -611,7 +632,7 @@ static int raw_open(BlockDriverState *bs, QDict *options, 
int flags,
 BDRVRawState *s = bs->opaque;
 
 s->type = FTYPE_FILE;
-return raw_open_common(bs, options, flags, 0, errp);
+return raw_open_common(bs, options, flags, 0, false, errp);
 }
 
 typedef enum {
@@ -2575,7 +2596,7 @@ hdev_open_Mac_error:
 
 s->type = FTYPE_FILE;
 
-ret = raw_open_common(bs, options, flags, 0, &local_err);
+ret = raw_open_common(bs, options, flags, 0, true, &local_err);
 if (ret < 0) {
 error_propagate(errp, local_err);
 #if defined(__APPLE__) && defined(__MACH__)
@@ -2802,7 +2823,7 @@ static int cdrom_open(BlockDriverState *bs, QDict 
*options, int flags,
 s->type = FTYPE_CD;
 
 /* open will not fail even if no CD is inserted, so add O_NONBLOCK */
-return raw_open_common(bs, options, flags, O_NONBLOCK, errp);
+return raw_open_common(bs, options, flags, O_NONBLOCK, true, errp);
 }
 
 static int cdrom_probe_device(const char *filename)
@@ -2915,7 +2936,7 @@ static int cdrom_open(BlockDriverState *bs, QDict 
*options, int flags,
 
 s->type = FTYPE_CD;
 
-ret = raw_open_common(bs, options, flags, 0, &local_err);
+ret = raw_open_common(bs, options, flags, 0, true, &local_err);
 if (ret) {
 error_propagate(errp, local_err);
 return ret;
diff --git a/qemu-doc.texi b/qemu-doc.texi
index 3e9eb819a6..8c94dcf8a6 100644
--- a/qemu-doc.texi
+++ b/qemu-doc.texi
@@ -2708,6 +2708,12 @@ that can be specified with the ``-device'' parameter.
 The drive addr argument is replaced by the the addr argument
 that can be specified with the ``-device'' parameter.
 
+@subsection -drive file=json:@{...@{'driver':'file'@}@} (since 2.12.0)
+
+The 'file' driver for drives is no longer appropriate for character or host
+devices and will only accept regular files (S_IFREG). The correct driver
+for these file types is 'host_cdrom' or 'host_device' as appropriate.
+
 @subsection -net dump (since 2.10.0)
 
 The ``--net dump'' argument is now replaced with the
-- 
2.14.3




Re: [Qemu-block] [Qemu-devel] [PATCH v4] file-posix: specify expected filetypes

2018-01-19 Thread Eric Blake
On 01/19/2018 04:47 PM, John Snow wrote:
> Adjust each caller of raw_open_common to specify if they are expecting
> host and character devices or not. Tighten expectations of file types upon
> open in the common code and refuse types that are not expected.
> 
> This has two effects:
> 
> (1) Character and block devices are now considered deprecated for the
> 'file' driver, which expects only S_IFREG, and
> (2) no file-posix driver (file, host_cdrom, or host_device) can open
> directories now.
> 
> I don't think there's a legitimate reason to open directories as if
> they were files. This prevents QEMU from opening and attempting to probe
> a directory inode, which can break in exciting ways. One of those ways
> is lseek on ext4/xfs, which will return 0x7fff as the file
> size instead of EISDIR. This can coax QEMU into responding with a
> confusing "file too big" instead of "Hey, that's not a file".
> 
> See: https://bugs.launchpad.net/qemu/+bug/1739304/
> Signed-off-by: John Snow 
> ---

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [Qemu-devel] [PATCH v2 0/6] qmp dirty bitmap API

2018-01-19 Thread John Snow


On 01/16/2018 07:54 AM, Vladimir Sementsov-Ogievskiy wrote:
> Hi all.
> 
> There are three qmp commands, needed to implement external backup API.
> 
> Using these three commands, client may do all needed bitmap management by
> hand:
> 
> on backup start we need to do a transaction:
>  {disable old bitmap, create new bitmap}
> 
> on backup success:
>  drop old bitmap
> 
> on backup fail:
>  enable old bitmap
>  merge new bitmap to old bitmap
>  drop new bitmap
> 
> v2: fix merge command deadlock
>   add new patches: 1 and 6
> 
> Vladimir Sementsov-Ogievskiy (6):
>   block: maintain persistent disabled bitmaps
>   block/dirty-bitmap: add lock to bdrv_enable/disable_dirty_bitmap
>   qapi: add block-dirty-bitmap-enable/disable
>   qmp: transaction support for block-dirty-bitmap-enable/disable
>   qapi: add block-dirty-bitmap-merge
>   qapi: add disabled parameter to block-dirty-bitmap-add
> 
>  qapi/block-core.json |  92 ++-
>  qapi/transaction.json|   4 +
>  block/qcow2.h|   2 +-
>  include/block/dirty-bitmap.h |   3 +-
>  block/dirty-bitmap.c |  42 ++-
>  block/qcow2-bitmap.c |  12 +--
>  block/qcow2.c|   2 +-
>  blockdev.c   | 169 
> +--
>  8 files changed, 287 insertions(+), 39 deletions(-)
> 

Fails to apply to master (b384cd95) on patch four and five. Only
contextual problems, I've patched it up and I'll review that.

(mirrored here if you want to check my rebase work:
https://github.com/jnsnow/qemu/tree/vlad-review)

Since I was full of such bad and stupid ideas last time, I'd like
someone else to look over this one for design and I'll just review it
for accuracy.

--js



Re: [Qemu-block] [Qemu-devel] [PATCH v2 1/6] block: maintain persistent disabled bitmaps

2018-01-19 Thread John Snow


On 01/16/2018 07:54 AM, Vladimir Sementsov-Ogievskiy wrote:
> To maintain load/store disabled bitmap there is new approach:
> 
>  - deprecate @autoload flag of block-dirty-bitmap-add, make it ignored
>  - store enabled bitmaps as "auto" to qcow2
>  - store disabled bitmaps without "auto" flag to qcow2
>  - on qcow2 open load "auto" bitmaps as enabled and others
>as disabled (except in_use bitmaps)
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  qapi/block-core.json |  6 +++---
>  block/qcow2.h|  2 +-
>  include/block/dirty-bitmap.h |  1 -
>  block/dirty-bitmap.c | 18 --
>  block/qcow2-bitmap.c | 12 +++-
>  block/qcow2.c|  2 +-
>  blockdev.c   | 10 ++
>  7 files changed, 14 insertions(+), 37 deletions(-)
> 
> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index ab96e348e6..827254db22 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -1593,9 +1593,9 @@
>  #  Qcow2 disks support persistent bitmaps. Default is false for
>  #  block-dirty-bitmap-add. (Since: 2.10)
>  #
> -# @autoload: the bitmap will be automatically loaded when the image it is 
> stored
> -#in is opened. This flag may only be specified for persistent
> -#bitmaps. Default is false for block-dirty-bitmap-add. (Since: 
> 2.10)
> +# @autoload: ignored and deprecated since 2.12.
> +#Currently, all dirty tracking bitmaps are loaded from Qcow2 on
> +#open.

Hmm, so we're going to say that *all* persistent bitmaps are loaded into
memory, but they may-or-may-not-be enabled, is that the approach we're
taking now?

>  #
>  # Since: 2.4
>  ##
> diff --git a/block/qcow2.h b/block/qcow2.h
> index 782a206ecb..a3e29276fc 100644
> --- a/block/qcow2.h
> +++ b/block/qcow2.h
> @@ -672,7 +672,7 @@ void qcow2_cache_discard(BlockDriverState *bs, Qcow2Cache 
> *c, void *table);
>  int qcow2_check_bitmaps_refcounts(BlockDriverState *bs, BdrvCheckResult *res,
>void **refcount_table,
>int64_t *refcount_table_size);
> -bool qcow2_load_autoloading_dirty_bitmaps(BlockDriverState *bs, Error 
> **errp);
> +bool qcow2_load_dirty_bitmaps(BlockDriverState *bs, Error **errp);
>  int qcow2_reopen_bitmaps_rw(BlockDriverState *bs, Error **errp);
>  void qcow2_store_persistent_dirty_bitmaps(BlockDriverState *bs, Error 
> **errp);
>  int qcow2_reopen_bitmaps_ro(BlockDriverState *bs, Error **errp);
> diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h
> index 3579a7597c..144e77a879 100644
> --- a/include/block/dirty-bitmap.h
> +++ b/include/block/dirty-bitmap.h
> @@ -65,7 +65,6 @@ void bdrv_dirty_bitmap_deserialize_ones(BdrvDirtyBitmap 
> *bitmap,
>  void bdrv_dirty_bitmap_deserialize_finish(BdrvDirtyBitmap *bitmap);
>  
>  void bdrv_dirty_bitmap_set_readonly(BdrvDirtyBitmap *bitmap, bool value);
> -void bdrv_dirty_bitmap_set_autoload(BdrvDirtyBitmap *bitmap, bool autoload);
>  void bdrv_dirty_bitmap_set_persistance(BdrvDirtyBitmap *bitmap,
> bool persistent);
>  
> diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
> index bd04e991b1..3777be1985 100644
> --- a/block/dirty-bitmap.c
> +++ b/block/dirty-bitmap.c
> @@ -52,8 +52,6 @@ struct BdrvDirtyBitmap {
> Such operations must fail and both the 
> image
> and this bitmap must remain unchanged 
> while
> this flag is set. */
> -bool autoload;  /* For persistent bitmaps: bitmap must be
> -   autoloaded on image opening */
>  bool persistent;/* bitmap must be saved to owner disk image 
> */
>  QLIST_ENTRY(BdrvDirtyBitmap) list;
>  };
> @@ -104,7 +102,6 @@ void bdrv_dirty_bitmap_make_anon(BdrvDirtyBitmap *bitmap)
>  g_free(bitmap->name);
>  bitmap->name = NULL;
>  bitmap->persistent = false;
> -bitmap->autoload = false;
>  }
>  
>  /* Called with BQL taken.  */
> @@ -261,8 +258,6 @@ BdrvDirtyBitmap 
> *bdrv_dirty_bitmap_abdicate(BlockDriverState *bs,
>  bitmap->successor = NULL;
>  successor->persistent = bitmap->persistent;
>  bitmap->persistent = false;
> -successor->autoload = bitmap->autoload;
> -bitmap->autoload = false;
>  bdrv_release_dirty_bitmap(bs, bitmap);
>  
>  return successor;
> @@ -667,19 +662,6 @@ bool bdrv_has_readonly_bitmaps(BlockDriverState *bs)
>  }
>  
>  /* Called with BQL taken. */
> -void bdrv_dirty_bitmap_set_autoload(BdrvDirtyBitmap *bitmap, bool autoload)
> -{
> -qemu_mutex_lock(bitmap->mutex);
> -bitmap->autoload = autoload;
> -qemu_mutex_unlock(bitmap->mutex);
> -}
> -
> -bool bdrv_dirty_bitmap_get_autoload(const BdrvDirtyBitmap *bitmap)
> -{
> -return bitmap->autoload;
> -}
> -
> -/* Called with BQL taken. */
>  void b

Re: [Qemu-block] [Qemu-devel] [PATCH v2 2/6] block/dirty-bitmap: add lock to bdrv_enable/disable_dirty_bitmap

2018-01-19 Thread John Snow


On 01/16/2018 07:54 AM, Vladimir Sementsov-Ogievskiy wrote:
> Add locks and remove comments about BQL accordingly to
> dirty_bitmap_mutex definition in block_int.h.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  block/dirty-bitmap.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
> index 3777be1985..d0a10c4f5d 100644
> --- a/block/dirty-bitmap.c
> +++ b/block/dirty-bitmap.c
> @@ -389,18 +389,20 @@ void 
> bdrv_remove_persistent_dirty_bitmap(BlockDriverState *bs,
>  }
>  }
>  
> -/* Called with BQL taken.  */
>  void bdrv_disable_dirty_bitmap(BdrvDirtyBitmap *bitmap)
>  {
> +bdrv_dirty_bitmap_lock(bitmap);
>  assert(!bdrv_dirty_bitmap_frozen(bitmap));
>  bitmap->disabled = true;
> +bdrv_dirty_bitmap_unlock(bitmap);
>  }
>  
> -/* Called with BQL taken.  */
>  void bdrv_enable_dirty_bitmap(BdrvDirtyBitmap *bitmap)
>  {
> +bdrv_dirty_bitmap_lock(bitmap);
>  assert(!bdrv_dirty_bitmap_frozen(bitmap));
>  bitmap->disabled = false;
> +bdrv_dirty_bitmap_unlock(bitmap);
>  }
>  
>  BlockDirtyInfoList *bdrv_query_dirty_bitmaps(BlockDriverState *bs)
> 

Reviewed-by: John Snow 



Re: [Qemu-block] [Qemu-devel] [PATCH v2 3/6] qapi: add block-dirty-bitmap-enable/disable

2018-01-19 Thread John Snow


On 01/16/2018 07:54 AM, Vladimir Sementsov-Ogievskiy wrote:
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  qapi/block-core.json | 42 ++
>  blockdev.c   | 42 ++
>  2 files changed, 84 insertions(+)
> 
> diff --git a/qapi/block-core.json b/qapi/block-core.json
> index 827254db22..b3851ee760 100644
> --- a/qapi/block-core.json
> +++ b/qapi/block-core.json
> @@ -1672,6 +1672,48 @@
>'data': 'BlockDirtyBitmap' }
>  
>  ##
> +# @block-dirty-bitmap-enable:
> +#
> +# Enable dirty bitmap, so that it will continue tracking disk changes.
> +#

suggest:
"Enables a dirty bitmap so that it will begin tracking disk changes."

Key item here is "begin" instead of "continue".

> +# Returns: nothing on success
> +#  If @node is not a valid block device, DeviceNotFound
> +#  If @name is not found, GenericError with an explanation
> +#
> +# Since: 2.12
> +#
> +# Example:
> +#
> +# -> { "execute": "block-dirty-bitmap-enable",
> +#  "arguments": { "node": "drive0", "name": "bitmap0" } }
> +# <- { "return": {} }
> +#
> +##
> +  { 'command': 'block-dirty-bitmap-enable',
> +'data': 'BlockDirtyBitmap' }
> +
> +##
> +# @block-dirty-bitmap-disable:
> +#
> +# Disable dirty bitmap, so that it will stop tracking disk changes.
> +#

suggest:
"Disables a dirty bitmap so that it will stop tracking disk changes."

> +# Returns: nothing on success
> +#  If @node is not a valid block device, DeviceNotFound
> +#  If @name is not found, GenericError with an explanation
> +#
> +# Since: 2.12
> +#
> +# Example:
> +#
> +# -> { "execute": "block-dirty-bitmap-disable",
> +#  "arguments": { "node": "drive0", "name": "bitmap0" } }
> +# <- { "return": {} }
> +#
> +##
> +{ 'command': 'block-dirty-bitmap-disable',
> +  'data': 'BlockDirtyBitmap' }
> +
> +##
>  # @BlockDirtyBitmapSha256:
>  #
>  # SHA256 hash of dirty bitmap data
> diff --git a/blockdev.c b/blockdev.c
> index 8068cbd606..997a6f514c 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -2821,6 +2821,48 @@ void qmp_block_dirty_bitmap_clear(const char *node, 
> const char *name,
>  bdrv_clear_dirty_bitmap(bitmap, NULL);
>  }
>  
> +void qmp_block_dirty_bitmap_enable(const char *node, const char *name,
> +   Error **errp)
> +{
> +BlockDriverState *bs;
> +BdrvDirtyBitmap *bitmap;
> +
> +bitmap = block_dirty_bitmap_lookup(node, name, &bs, errp);
> +if (!bitmap || !bs) {
> +return;
> +}
> +
> +if (bdrv_dirty_bitmap_frozen(bitmap)) {
> +error_setg(errp,
> +   "Bitmap '%s' is currently frozen and cannot be enabled",
> +   name);
> +return;
> +}
> +
> +bdrv_enable_dirty_bitmap(bitmap);
> +}
> +
> +void qmp_block_dirty_bitmap_disable(const char *node, const char *name,
> +Error **errp)
> +{
> +BlockDriverState *bs;
> +BdrvDirtyBitmap *bitmap;
> +
> +bitmap = block_dirty_bitmap_lookup(node, name, &bs, errp);
> +if (!bitmap || !bs) {
> +return;
> +}
> +
> +if (bdrv_dirty_bitmap_frozen(bitmap)) {
> +error_setg(errp,
> +   "Bitmap '%s' is currently frozen and cannot be disabled",
> +   name);
> +return;
> +}
> +
> +bdrv_disable_dirty_bitmap(bitmap);
> +}
> +
>  BlockDirtyBitmapSha256 *qmp_x_debug_block_dirty_bitmap_sha256(const char 
> *node,
>const char 
> *name,
>Error **errp)
> 

I have to admit exposing this interface still makes me nervous, but :)

Mechanically correct, and with suggesting phrasing changes:

Reviewed-by: John Snow 



Re: [Qemu-block] [Qemu-devel] [PATCH v2 4/6] qmp: transaction support for block-dirty-bitmap-enable/disable

2018-01-19 Thread John Snow


On 01/17/2018 10:06 AM, Vladimir Sementsov-Ogievskiy wrote:
> 16.01.2018 15:54, Vladimir Sementsov-Ogievskiy wrote:
>> Signed-off-by: Vladimir Sementsov-Ogievskiy 
>> ---
>>   qapi/transaction.json |  4 +++
>>   blockdev.c    | 79
>> +++
>>   2 files changed, 83 insertions(+)
>>
>> diff --git a/qapi/transaction.json b/qapi/transaction.json
>> index bd312792da..b643d848f8 100644
>> --- a/qapi/transaction.json
>> +++ b/qapi/transaction.json
>> @@ -46,6 +46,8 @@
>>   # - @abort: since 1.6
>>   # - @block-dirty-bitmap-add: since 2.5
>>   # - @block-dirty-bitmap-clear: since 2.5
>> +# - @block-dirty-bitmap-enable: since 2.12
>> +# - @block-dirty-bitmap-disable: since 2.12
>>   # - @blockdev-backup: since 2.3
>>   # - @blockdev-snapshot: since 2.5
>>   # - @blockdev-snapshot-internal-sync: since 1.7
>> @@ -59,6 +61,8 @@
>>  'abort': 'Abort',
>>  'block-dirty-bitmap-add': 'BlockDirtyBitmapAdd',
>>  'block-dirty-bitmap-clear': 'BlockDirtyBitmap',
>> +   'block-dirty-bitmap-enable': 'BlockDirtyBitmap',
>> +   'block-dirty-bitmap-disable': 'BlockDirtyBitmap',
>>  'blockdev-backup': 'BlockdevBackup',
>>  'blockdev-snapshot': 'BlockdevSnapshot',
>>  'blockdev-snapshot-internal-sync': 'BlockdevSnapshotInternal',
>> diff --git a/blockdev.c b/blockdev.c
>> index 997a6f514c..d338363d78 100644
>> --- a/blockdev.c
>> +++ b/blockdev.c
>> @@ -1962,6 +1962,7 @@ typedef struct BlockDirtyBitmapState {
>>   AioContext *aio_context;
>>   HBitmap *backup;
>>   bool prepared;
>> +    bool was_enabled;
>>   } BlockDirtyBitmapState;
>>     static void block_dirty_bitmap_add_prepare(BlkActionState *common,
>> @@ -2069,6 +2070,74 @@ static void
>> block_dirty_bitmap_clear_clean(BlkActionState *common)
>>   }
>>   }
>>   +static void block_dirty_bitmap_enable_prepare(BlkActionState *common,
>> +  Error **errp)
>> +{
>> +    BlockDirtyBitmap *action;
>> +    BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
>> + common, common);
>> +
>> +    if (action_check_completion_mode(common, errp) < 0) {
>> +    return;
>> +    }
>> +
>> +    action = common->action->u.block_dirty_bitmap_enable.data;
>> +    state->bitmap = block_dirty_bitmap_lookup(action->node,
>> +  action->name,
>> +  &state->bs,
>> +  errp);
>> +    if (!state->bitmap) {
>> +    return;
>> +    }
>> +
>> +    state->was_enabled = bdrv_dirty_bitmap_enabled(state->bitmap);
>> +    bdrv_enable_dirty_bitmap(state->bitmap);
>> +}
>> +
>> +static void block_dirty_bitmap_enable_abort(BlkActionState *common)
>> +{
>> +    BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
>> + common, common);
>> +
>> +    if (!state->was_enabled) {
>> +    bdrv_disable_dirty_bitmap(state->bitmap);
>> +    }
>> +}
>> +
>> +static void block_dirty_bitmap_disable_prepare(BlkActionState *common,
>> +   Error **errp)
>> +{
>> +    BlockDirtyBitmap *action;
>> +    BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
>> + common, common);
>> +
>> +    if (action_check_completion_mode(common, errp) < 0) {
>> +    return;
>> +    }
>> +
>> +    action = common->action->u.block_dirty_bitmap_disable.data;
>> +    state->bitmap = block_dirty_bitmap_lookup(action->node,
>> +  action->name,
>> +  &state->bs,
>> +  errp);
>> +    if (!state->bitmap) {

&state->bs should be NULL if we're not going to use it. If we're going
to use it, we should check the error condition here because it might fail.

>> +    return;
>> +    }
>> +
>> +    state->was_enabled = bdrv_dirty_bitmap_enabled(state->bitmap);
>> +    bdrv_disable_dirty_bitmap(state->bitmap);
> 
> hm. actually, I just make it like _clear is made. But now I doubt,
> why do we need disable here? we can disable in commit, and do not need
> state->was_enabled..
> 

You need to make sure that there is no way for bdrv_disable_dirty_bitmap
to fail, so you need to add that frozen check in. Then you're alright,
and you can move the actual disable portion to the commit, and get rid
of the undo call.

Or, we can even do this kind of trick to remove the redundant error
checking:

void qmp_block_dirty_bitmap_enable(const char *node, const char *name,
   Error **errp)
{
BlockDirtyBitmap data = {
.node = node,
.name = name
};
TransactionAction action = {
.type = TRANSACTION_ACTION_KIND_BLOCK_DIRTY_BITMAP_ENABLE,
.u.block_dirty_bitmap_enable.d

Re: [Qemu-block] [Qemu-devel] [PATCH v9 13/13] iotests: add dirty bitmap postcopy test

2018-01-19 Thread John Snow


On 01/19/2018 01:08 PM, Vladimir Sementsov-Ogievskiy wrote:
> 18.01.2018 12:57, Vladimir Sementsov-Ogievskiy wrote:
>> 17.01.2018 21:30, John Snow wrote:
>>>
>>> On 12/28/2017 06:49 AM, Vladimir Sementsov-Ogievskiy wrote:
 Thank you for reviewing my code!

>>> Time for the re-spin? There's pretty good pressure to get this into 2.12
>>> and say the non-nbd workflow model is feature complete.
>>
>> Yes, you are right. Hope to do it today or tomorrow.
>>
> 
> I've rebased, applied comments from review, and even one some new fixes,
> but now I understand that it is too early for re-spin.
> 
> Now this series depends on
> 1. [PATCH v2 0/3] fix bitmaps migration through shared storage
>   - we need to decide, how to fix it. May be, we just do not need
> bitmaps in inactive state, and should not load them in do_qcow2_open in
> this case
>   [I can ignore it, just dropping one test case from new iotest, and fix
> it later, but..

I'll look at this next.

> 2. [PATCH v2 0/6] qmp dirty bitmap API
>  - here autoload is dropped, and migration should be rebased on it.
> 

For (2) I really want to see a test case showing the utility for enable,
disable and merge to be added to the API, *or* add an x- prefix to them
for now.

I want to see some use-case tests that demonstrate how they are to be
safely used, basically. You might want to expand test 124 for this.

I'm not sure we need this entire series for migration, but I don't want
to make you re-order absolutely everything. Can we merge patch one by
itself for the purposes of the migration patchset?

> so, I'll re-spin migration after these two questions are resolved.
> 

Thanks for your patience, as always. Please send me pings every day on
whatever you have to in order to get migration in to 2.12 -- I'm going
to try to stay focused too, but I've got the attention span of a
goldfish. If I miss something, please speak up.

--js