Re: [Qemu-block] [PATCH v3] virtio-blk: trivial code optimization

2015-11-09 Thread Stefan Hajnoczi
On Mon, Nov 09, 2015 at 05:03:30PM +0800, arei.gong...@huawei.com wrote:
> From: Gonglei 
> 
> 1. avoid possible superflous checking
> 2. make code more robustness
> 
> Signed-off-by: Gonglei 
> Reviewed-by: Fam Zheng 
> ---
> v3: change the third condition too [Paolo]
> add Fam's R-by
> ---
>  hw/block/virtio-blk.c | 27 +--
>  1 file changed, 9 insertions(+), 18 deletions(-)
> 
> diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
> index 093e475..9124358 100644
> --- a/hw/block/virtio-blk.c
> +++ b/hw/block/virtio-blk.c
> @@ -404,24 +404,15 @@ void virtio_blk_submit_multireq(BlockBackend *blk, 
> MultiReqBuffer *mrb)
>  for (i = 0; i < mrb->num_reqs; i++) {
>  VirtIOBlockReq *req = mrb->reqs[i];
>  if (num_reqs > 0) {
> -bool merge = true;
> -
> -/* merge would exceed maximum number of IOVs */
> -if (niov + req->qiov.niov > IOV_MAX) {
> -merge = false;
> -}
> -
> -/* merge would exceed maximum transfer length of backend device 
> */
> -if (req->qiov.size / BDRV_SECTOR_SIZE + nb_sectors > 
> max_xfer_len) {
> -merge = false;
> -}
> -
> -/* requests are not sequential */
> -if (sector_num + nb_sectors != req->sector_num) {
> -merge = false;
> -}
> -
> -if (!merge) {
> +/*
> + * NOTE: We cannot merge the requests in below situations:
> + * 1. requests are not sequential
> + * 2. merge would exceed maximum number of IOVs
> + * 3. merge would exceed maximum transfer length of backend 
> device
> + */
> +if (sector_num + nb_sectors != req->sector_num ||
> +niov > IOV_MAX - req->qiov.niov ||
> +nb_sectors > max_xfer_len - req->qiov.size / 
> BDRV_SECTOR_SIZE) {

nb_sectors - int
max_xfer_len - int
req->qiov.size - size_t
BDRV_SECTOR_SIZE - unsigned long long

Therefore this expression is an int > unsigned long long comparison.

req->qiov.size can be larger than max_xfer_len so this comparison
suffers from underflow.  Please check that req->qiov.size <=
max_xfer_len first.

Stefan


signature.asc
Description: PGP signature


Re: [Qemu-block] [PATCH v6 3/4] qmp: add monitor command to add/remove a child

2015-11-09 Thread Alberto Garcia
Sorry again for the late review, here are my comments:

On Fri 16 Oct 2015 10:57:45 AM CEST, Wen Congyang wrote:
> +void qmp_x_blockdev_change(ChangeOperation op, const char *parent,
> +   bool has_child, const char *child,
> +   bool has_new_node, const char *new_node,
> +   Error **errp)

You are using different names for the parameters here: 'op', 'parent',
'child', 'new_node'; in the JSON file the first and last one are named
'operation' and 'node'.

> +parent_bs = bdrv_lookup_bs(parent, parent, _err);
> +if (!parent_bs) {
> +error_propagate(errp, local_err);
> +return;
> +}

You don't need to change it if you don't want but you can use errp
directly here and spare the error_propagate() call.

> +
> +switch(op) {
> +case CHANGE_OPERATION_ADD:
> +if (has_child) {
> +error_setg(errp, "The operation %s doesn't support the parameter 
> child",
> +   ChangeOperation_lookup[op]);
> +return;
> +}

This line goes over 80 columns, please use scripts/checkpatch.pl to
check the style of the code.

> +if (!has_new_node) {
> +error_setg(errp, "The operation %s needs the parameter new_node",
> +   ChangeOperation_lookup[op]);
> +return;
> +}
> +break;
> +case CHANGE_OPERATION_DELETE:
> +if (has_new_node) {
> +error_setg(errp, "The operation %s doesn't support the parameter 
> node",
> +   ChangeOperation_lookup[op]);
> +return;
> +}
> +if (!has_child) {
> +error_setg(errp, "The operation %s needs the parameter child",
> +   ChangeOperation_lookup[op]);
> +return;
> +}

I still think that having two optional parameters here makes things
unnecessarily complex.

If in the future we want to add a new operation that needs a new
parameter then we can add it then, but I think that both 'add' and
'delete' can work perfectly fine with a single 'node' parameter.

Does anyone else have an opinion about this?

> +default:
> +break;
> +}

This is unreachable so you can add a g_assert_not_reached() here.

> +##
> +# @ChangeOperation:
> +#
> +# An enumeration of block device change operation.

How about something like "An enumeration of possible change operations
on a block device" ?

> +# @add: Add a new block driver state to a existed block driver state.

s/a existed/an existing/

> +# @x-blockdev-change
> +#
> +# Dynamic reconfigure the block driver state graph. It can be used to

"Dynamically reconfigure"

> +# add, remove, insert, replace a block driver state. Currently only

"insert or replace"

> +# the Quorum driver implements this feature to add and remove its child.
> +# This is useful to fix a broken quorum child.

"add and remove its child" -> "add or remove a child"

> +#
> +# @operation: the chanage operation. It can be add, delete.

s/chanage/change/

> +#
> +# @parent: the id or node name of which node will be changed.

How about "the id or name of the node that will be changed" ?

> +#
> +# @child: the child node-name which will be deleted.
> +#
> +# @node: the new node-name which will be added.

"The name of the node that will be deleted"
"The name of the node that will be added"

Or, if you merge both parameters, "...that will be added or deleted".

> +#
> +# Note: this command is experimental, and not a stable API.

"and not a stable API" -> "and does not have a stable API", or "and its
API is not stable".

Berto



Re: [Qemu-block] [PATCH v6 4/4] hmp: add monitor command to add/remove a child

2015-11-09 Thread Alberto Garcia
On Fri 16 Oct 2015 10:57:46 AM CEST, Wen Congyang wrote:

> +.name   = "blockdev_change",
> +.args_type  = "op:s,parent:B,child:B?,node:?",
> +.params = "operation parent [child] [node]",
  [...]
> +/*
> + * FIXME: we must specify the parameter child, otherwise,
> + * we can't specify the parameter node.
> + */
> +if (op == CHANGE_OPERATION_ADD) {
> +has_child = false;
> +}

So if you want to perform the 'add' operation you must pass both 'child'
and 'node' but the former will be discarded.

I don't think you really need to do this for the HMP interface, but it's
anyway one more good reason to merge 'child' and 'node'.

Berto



Re: [Qemu-block] [PATCH v3] virtio-blk: trivial code optimization

2015-11-09 Thread Paolo Bonzini


On 09/11/2015 10:03, arei.gong...@huawei.com wrote:
> From: Gonglei 
> 
> 1. avoid possible superflous checking
> 2. make code more robustness
> 
> Signed-off-by: Gonglei 
> Reviewed-by: Fam Zheng 
> ---
> v3: change the third condition too [Paolo]
> add Fam's R-by
> ---
>  hw/block/virtio-blk.c | 27 +--
>  1 file changed, 9 insertions(+), 18 deletions(-)
> 
> diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
> index 093e475..9124358 100644
> --- a/hw/block/virtio-blk.c
> +++ b/hw/block/virtio-blk.c
> @@ -404,24 +404,15 @@ void virtio_blk_submit_multireq(BlockBackend *blk, 
> MultiReqBuffer *mrb)
>  for (i = 0; i < mrb->num_reqs; i++) {
>  VirtIOBlockReq *req = mrb->reqs[i];
>  if (num_reqs > 0) {
> -bool merge = true;
> -
> -/* merge would exceed maximum number of IOVs */
> -if (niov + req->qiov.niov > IOV_MAX) {
> -merge = false;
> -}
> -
> -/* merge would exceed maximum transfer length of backend device 
> */
> -if (req->qiov.size / BDRV_SECTOR_SIZE + nb_sectors > 
> max_xfer_len) {
> -merge = false;
> -}
> -
> -/* requests are not sequential */
> -if (sector_num + nb_sectors != req->sector_num) {
> -merge = false;
> -}
> -
> -if (!merge) {
> +/*
> + * NOTE: We cannot merge the requests in below situations:
> + * 1. requests are not sequential
> + * 2. merge would exceed maximum number of IOVs
> + * 3. merge would exceed maximum transfer length of backend 
> device
> + */
> +if (sector_num + nb_sectors != req->sector_num ||
> +niov > IOV_MAX - req->qiov.niov ||
> +nb_sectors > max_xfer_len - req->qiov.size / 
> BDRV_SECTOR_SIZE) {
>  submit_requests(blk, mrb, start, num_reqs, niov);
>  num_reqs = 0;
>  }
> 

Reviewed-by: Paolo Bonzini 



Re: [Qemu-block] [PATCH v6 01/15] blockdev: Add missing bdrv_unref() in drive-backup

2015-11-09 Thread Alberto Garcia
On Wed 04 Nov 2015 07:57:33 PM CET, Max Reitz  wrote:
> All error paths after a successful bdrv_open() of target_bs should
> contain a bdrv_unref(target_bs). This one did not yet, so add it.
>
> Signed-off-by: Max Reitz 

Reviewed-by: Alberto Garcia 

Berto



Re: [Qemu-block] [PATCH v6 12/15] blockdev: Keep track of monitor-owned BDS

2015-11-09 Thread Alberto Garcia
On Wed 04 Nov 2015 07:57:44 PM CET, Max Reitz wrote:
> @@ -3519,11 +3537,18 @@ void qmp_x_blockdev_del(bool has_id, const char *id,
> bdrv_get_device_or_node_name(bs));
>  goto out;
>  }
> +
> +if (!blk && !bs->monitor_list.tqe_prev) {
> +error_setg(errp, "Node %s is not owned by the monitor",
> +   bs->node_name);
> +goto out;
> +}
>  }
>  
>  if (blk) {
>  blk_unref(blk);
>  } else {
> +QTAILQ_REMOVE(_bdrv_states, bs, monitor_list);
>  bdrv_unref(bs);
>  }

blk_unref(blk) will also unref the BDS (if there's any), so you also
need to update monitor_bdrv_states in that case, don't you?

Anyway, wouldn't it make more sense to do this in bdrv_delete() ?

Berto



Re: [Qemu-block] [PATCH v6 10/15] block: Make bdrv_close() static

2015-11-09 Thread Alberto Garcia
On Wed 04 Nov 2015 07:57:42 PM CET, Max Reitz wrote:
> There are no users of bdrv_close() left, except for one of bdrv_open()'s
> failure paths, bdrv_close_all() and bdrv_delete(), and that is good.
> Make bdrv_close() static so nobody makes the mistake of directly using
> bdrv_close() again.
>
> Signed-off-by: Max Reitz 
> Reviewed-by: Eric Blake 

Reviewed-by: Alberto Garcia 

Berto



Re: [Qemu-block] [Qemu-devel] [PATCH v3 0/9] block: Fixes for bdrv_drain

2015-11-09 Thread Fam Zheng
On Mon, 11/09 11:42, Stefan Hajnoczi wrote:
> On Mon, Nov 09, 2015 at 10:56:39AM +0800, Fam Zheng wrote:
> > v3: Don't reuse coroutine in bdrv_aio_ioctl. [Stefan]
> > Recursely call .bdrv_drain callback only. [Stefan, Paolo]
> 
> I don't understand this change.  I thought you want .bdrv_drain() to be
> called on the whole tree, but the latest code seems to call it on the
> root and children only.  It doesn't recurse the only the root and first
> level of the tree get .bdrv_drain() calls.  Is this intentional?

Right, v4 posted, please ignore this.



Re: [Qemu-block] [PATCH 0/2] atapi: fix NetBSD boot regression

2015-11-09 Thread Mark Cave-Ayland
On 09/11/15 19:05, John Snow wrote:

> Marc noticed that a recent ATAPI permissions fix broke NetBSD 7.0's
> installer ISO.
> 
> The problem is that it's meaningless to check for !(cmd->flags & nondata)
> if the command isn't supported, since all unsupported commands have
> _no_ flags. Effectively, all commands default to "Transfer Data" in our
> model until we classify them otherwise.
> 
> This leads to a problem where we reject a zero byte BCL PIO command that
> transfers no data, simply because we have no properties for the command
> at all.
> 
> Getting an ATA rejection for this command greatly confuses NetBSD.
> 
> Correct behavior is to reject the command at the SCSI layer for being
> unsupported.
> 
> 
> 
> For convenience, this branch is available at:
> https://github.com/jnsnow/qemu.git branch atapi-bclimit-netbsd
> https://github.com/jnsnow/qemu/tree/atapi-bclimit-netbsd
> 
> This version is tagged atapi-bclimit-netbsd-v1:
> https://github.com/jnsnow/qemu/releases/tag/atapi-bclimit-netbsd-v1
> 
> John Snow (2):
>   atapi: add byte_count_limit helper
>   atapi: Prioritize unknown cmd error over BCL error
> 
>  hw/ide/atapi.c | 38 +++---
>  1 file changed, 23 insertions(+), 15 deletions(-)

I've tested this against my OpenBIOS ISO images here and it fixes the
problem without introducing any further regressions, so:

Tested-by: Mark Cave-Ayland 


Many thanks,

Mark.




Re: [Qemu-block] [PATCH v6 03/15] block: Release dirty bitmaps in bdrv_close()

2015-11-09 Thread Max Reitz
On 09.11.2015 17:21, Max Reitz wrote:
> On 06.11.2015 19:59, John Snow wrote:
>>
>>
>> On 11/04/2015 01:57 PM, Max Reitz wrote:
>>> bdrv_delete() is not very happy about deleting BlockDriverStates with
>>> dirty bitmaps still attached to them. In the past, we got around that
>>> very easily by relying on bdrv_close_all() bypassing bdrv_delete(), and
>>> bdrv_close() simply ignoring that condition. We should fix that by
>>> releasing all dirty bitmaps in bdrv_close() and drop the assertion in
>>> bdrv_delete().
>>>
>>> Signed-off-by: Max Reitz 
>>> ---
>>>  block.c | 20 +++-
>>>  1 file changed, 19 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/block.c b/block.c
>>> index 3493501..23448ed 100644
>>> --- a/block.c
>>> +++ b/block.c
>>> @@ -87,6 +87,8 @@ static int bdrv_open_inherit(BlockDriverState **pbs, 
>>> const char *filename,
>>>   const BdrvChildRole *child_role, Error 
>>> **errp);
>>>  
>>>  static void bdrv_dirty_bitmap_truncate(BlockDriverState *bs);
>>> +static void bdrv_release_all_dirty_bitmaps(BlockDriverState *bs);
>>> +
>>>  /* If non-zero, use only whitelisted block drivers */
>>>  static int use_bdrv_whitelist;
>>>  
>>> @@ -1916,6 +1918,8 @@ void bdrv_close(BlockDriverState *bs)
>>>  bdrv_drain(bs); /* in case flush left pending I/O */
>>>  notifier_list_notify(>close_notifiers, bs);
>>>  
>>> +bdrv_release_all_dirty_bitmaps(bs);
>>> +
>>>  if (bs->blk) {
>>>  blk_dev_change_media_cb(bs->blk, false);
>>>  }
>>> @@ -2123,7 +2127,6 @@ static void bdrv_delete(BlockDriverState *bs)
>>>  assert(!bs->job);
>>>  assert(bdrv_op_blocker_is_empty(bs));
>>>  assert(!bs->refcnt);
>>> -assert(QLIST_EMPTY(>dirty_bitmaps));
>>>  
>>>  bdrv_close(bs);
>>>  
>>> @@ -3318,6 +3321,21 @@ void bdrv_release_dirty_bitmap(BlockDriverState *bs, 
>>> BdrvDirtyBitmap *bitmap)
>>>  }
>>>  }
>>>  
>>> +/**
>>> + * Release all dirty bitmaps attached to a BDS, independently of whether 
>>> they
>>> + * are frozen or not (for use in bdrv_close()).
>>> + */
>>
>> This comment caught my attention ...
>>
>>> +static void bdrv_release_all_dirty_bitmaps(BlockDriverState *bs)
>>> +{
>>> +BdrvDirtyBitmap *bm, *next;
>>> +QLIST_FOREACH_SAFE(bm, >dirty_bitmaps, list, next) {
>>> +QLIST_REMOVE(bm, list);
>>> +hbitmap_free(bm->bitmap);
>>> +g_free(bm->name);
>>> +g_free(bm);
>>> +}
>>> +}
>>> +
>>>  void bdrv_disable_dirty_bitmap(BdrvDirtyBitmap *bitmap)
>>>  {
>>>  assert(!bdrv_dirty_bitmap_frozen(bitmap));
>>>
>>
>> Currently, the only way a bitmap could/should be frozen is if it is in
>> use by a job. If a job is running, you probably shouldn't delete stuff
>> it is using out from under it.
> 
> Right. Good thing I'm fixing everything and bdrv_close() will no longer
> be called unless there are no references to the BDS anymore (and each
> block job holds an own reference to its BDS). :-)
> 
> (At least that's how it's supposed to be; also, there is an
> assert(!bs->job) in bdrv_close())
> 
> Also, I wanted to mirror current behavior. Right now, we are just
> leaking all dirty bitmaps on bdrv_close() (bdrv_delete() asserts that
> there are none, but if you invoke bdrv_close() directly...).
> 
>> I am assuming by now that it's actually likely that you've canceled any
>> jobs attached to this node before you go through the motions of deleting
>> it, and it should be safe to just call bdrv_release_dirty_bitmap ...
> 
> Well, yes, but then I'd have to iterate through all the dirty bitmaps to
> call bdrv_release_dirty_bitmap() for each of them, and then
> bdrv_release_dirty_bitmap() iterates through all of them again to check
> whether it really belongs to the BDS. That seemed a bit like a waste.
> 
>> We don't want the two foreach loops though, so maybe just factor out a
>> helper that bdrv_release_all_dirty_bitmaps and bdrv_release_dirty_bitmap
>> can share. You can leave the frozen assertion
>> in just bdrv_release_dirty_bitmap before it invokes the helper, so that
>> bdrv_delete always succeeds in case something gets out-of-sync in the
>> internals.
> 
> Hm, yes, that should do just fine, thanks.
> 
>> (Hmm, to prevent leaks, if you delete a bitmap that is frozen, you
>> should probably call the helper on its child object too... or just make
>> the helper recursive.)
> 
> I think it'll be actually better to just keep the assertion in and thus
> not allow releasing frozen dirty bitmaps in bdrv_close(). In addition
> I'll add an assertion that !bs->refcount to bdrv_close().

OK, I can't actually do that yet. There is a bdrv_close() in one of the
fail paths of bdrv_open_inherit(), and the refcount of that BDS will
probably be 1. There is generally no way that BDS has any bitmaps
attached to it, so we're fine in that regard.

I'll revisit this after this series and its follow-up.

Max



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH v2 0/8] blockdev: Further BlockBackend work

2015-11-09 Thread Jeff Cody
On Tue, Nov 10, 2015 at 04:49:12AM +0100, Max Reitz wrote:
> On 10.11.2015 04:45, Eric Blake wrote:
> > [side question]
> > 
> > On 11/09/2015 08:27 PM, Max Reitz wrote:
> >> Overall, this series adds more uses for BlockBackends and makes the code
> >> follow the "reference theory" more closely, in that any BlockBackend
> >> created (through -drive or blockdev-add) has a reference count of 1, and
> >> that reference should be held by the monitor. This is reflected here by
> >> introducing an explicit list of monitor-owned BlockBackends, which in
> >> turn allows us to remove bdrv_states, and, perhaps most importantly,
> >> blk_hide_on_behalf_of_do_drive_del().
> >>
> > 
> >> Although I don't suppose anyone will care much, here's that
> >> backport-diff against v1:
> >>
> >> Key:
> >> [] : patches are identical
> >> [] : number of functional differences between upstream/downstream patch
> >> [down] : patch is downstream-only
> >> The flags [FC] indicate (F)unctional and (C)ontextual differences, 
> >> respectively
> >>
> >> 001/8:[] [-C] 'block: Add blk_name_taken()'
> >> 002/8:[] [-C] 'block: Add blk_next_inserted()'
> >> 003/8:[] [-C] 'block: Add blk_commit_all() and 
> >> blk_invalidate_cache_all()'
> >> 004/8:[0073] [FC] 'block: Use BlockBackend more'
> >> 005/8:[0004] [FC] 'blockdev: Add list of monitor-owned BlockBackends'
> >> 006/8:[down] 'blockdev: Remove blk_hide_on_behalf_of_hmp_drive_del()'
> >>   (renamed from "..._on_behalf_of_do_drive_del()")
> > 
> > This is a nice note. Is there any way to make 'git backport-diff'
> > automatically do this, or does it have to be done manually when renaming
> > patches?
> 
> As far as I know, git backport-diff simply looks for matching patches
> based on the title alone, so I don't think there's a way for it to
> automatically figure out name changes.
> 
> I don't know of a way of telling it that a particular "downstream" patch
> belongs to another particular "upstream" patch, though. So in general,
> if I do want an accurate diff even for renamed patches (I did not here,
> because v1 has been more than eight months ago), I just diff manually
> and count the changes or temporarily rename the patch on the current
> (new) branch.
> 
> Max
> 

Yeah, it is a limitation of the "match-by-title" method.  I suppose we
could add a couple different matching "engines", some of which may
work better for some scenarios rather than others (for instance,
looking for cherry-pick lines, etc..).  Maybe even go so far as try
and match patches based on similar content.




Re: [Qemu-block] [Qemu-devel] [PATCH 1/8] qapi: Drop QERR_UNKNOWN_BLOCK_FORMAT_FEATURE

2015-11-09 Thread Eric Blake
On 11/09/2015 08:44 PM, Max Reitz wrote:
> Just specifying a custom string is simpler in basically all places that
> used it, and in addition, specifying the BB or node name is something we
> generally do not do in other error messages when opening a BDS, so we
> should not do it here.
> 
> This changes the output for iotest 036 (to the better, in my opinion),

s/to/for/ to make the idiomatic expression sound correct to a native speaker

> so the reference output needs to be changed accordingly.
> 
> Signed-off-by: Max Reitz 
> ---
>  block/qcow.c   |  6 +-
>  block/qcow2.c  | 24 +---
>  block/qed.c|  7 ++-
>  block/vmdk.c   |  7 ++-
>  include/qapi/qmp/qerror.h  |  3 ---
>  tests/qemu-iotests/036.out | 16 
>  6 files changed, 18 insertions(+), 45 deletions(-)
> 

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH v6 3/4] qmp: add monitor command to add/remove a child

2015-11-09 Thread Wen Congyang
On 11/10/2015 12:04 AM, Kevin Wolf wrote:
> Am 16.10.2015 um 10:57 hat Wen Congyang geschrieben:
>> +##
>> +# @ChangeOperation:
>> +#
>> +# An enumeration of block device change operation.
>> +#
>> +# @add: Add a new block driver state to a existed block driver state.
>> +#
>> +# @delete: Delete a block driver state's child.
>> +#
>> +# Since: 2.5
>> +##
>> +{ 'enum': 'ChangeOperation',
>> +  'data': [ 'add', 'delete' ] }
> 
> What's the advantage of this enum compared to separate QMP commands? The
> way it is specified here, ChangeOperation is already implicit by whether
> or not child and node are given.
> 
>> +##
>> +# @x-blockdev-change
>> +#
>> +# Dynamic reconfigure the block driver state graph. It can be used to
>> +# add, remove, insert, replace a block driver state. Currently only
>> +# the Quorum driver implements this feature to add and remove its child.
>> +# This is useful to fix a broken quorum child.
>> +#
>> +# @operation: the chanage operation. It can be add, delete.
>> +#
>> +# @parent: the id or node name of which node will be changed.
>> +#
>> +# @child: the child node-name which will be deleted.
> 
> #optional
> 
> Must be present for operation = delete, must not be present otherwise.
> 
>> +# @node: the new node-name which will be added.
> 
> #optional
> 
> Must be present for operation = add, must not be present otherwise.
> 
>> +#
>> +# Note: this command is experimental, and not a stable API.
>> +#
>> +# Since: 2.5
>> +##
>> +{ 'command': 'x-blockdev-change',
>> +  'data' : { 'operation': 'ChangeOperation',
>> + 'parent': 'str',
>> + '*child': 'str',
>> + '*node': 'str' } }
> 
> Let me suggest this alternative:
> 
> { 'command': 'x-blockdev-change',
>   'data' : { 'parent': 'str',
>  'child': 'str',
>  '*node': 'str' } }
> 
> child doesn't describe a node name then, but a child name (adds a
> dependency on my patches which add a name to BdrvChild, though).

Where is the patch? I don't find it.

> Depending on whether node is given and whether the child already exists,
> this may add, remove or replace a child.

If the user wants to insert a filter driver between parent and child, we
also needs three parameters: parent, child, node. So it is why I add the
parameter operation.

Thanks
Wen Congyang

> 
> Kevin
> .
> 




[Qemu-block] [PATCH v2 0/8] blockdev: Further BlockBackend work

2015-11-09 Thread Max Reitz
Overall, this series adds more uses for BlockBackends and makes the code
follow the "reference theory" more closely, in that any BlockBackend
created (through -drive or blockdev-add) has a reference count of 1, and
that reference should be held by the monitor. This is reflected here by
introducing an explicit list of monitor-owned BlockBackends, which in
turn allows us to remove bdrv_states, and, perhaps most importantly,
blk_hide_on_behalf_of_do_drive_del().

Also, it lays groundwork for a nice series that will make bdrv_open()
return a BDS pointer.


This series depends on v8 (or any later version) of my series
"block: Rework bdrv_close_all()" (or any later version). With v7 you
will get a contextual conflict in patch 5 which should be simple to
resolve, however.

(No, v8 is not yet out, what I am referring to as that is basically v7
 with a fix for the following:
 http://lists.nongnu.org/archive/html/qemu-block/2015-11/msg00330.html)


v2:
- Mostly a rebase, complicated by the fact that the underlying series
  have changed a lot in the meantime.


Although I don't suppose anyone will care much, here's that
backport-diff against v1:

Key:
[] : patches are identical
[] : number of functional differences between upstream/downstream patch
[down] : patch is downstream-only
The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively

001/8:[] [-C] 'block: Add blk_name_taken()'
002/8:[] [-C] 'block: Add blk_next_inserted()'
003/8:[] [-C] 'block: Add blk_commit_all() and blk_invalidate_cache_all()'
004/8:[0073] [FC] 'block: Use BlockBackend more'
005/8:[0004] [FC] 'blockdev: Add list of monitor-owned BlockBackends'
006/8:[down] 'blockdev: Remove blk_hide_on_behalf_of_hmp_drive_del()'
  (renamed from "..._on_behalf_of_do_drive_del()")
007/8:[0213] [FC] 'block: Move some bdrv_*_all() functions to BB'
008/8:[0027] [FC] 'block: Remove bdrv_states'



Max Reitz (8):
  block: Add blk_name_taken()
  block: Add blk_next_inserted()
  block: Add blk_commit_all() and blk_invalidate_cache_all()
  block: Use BlockBackend more
  blockdev: Add list of monitor-owned BlockBackends
  blockdev: Remove blk_hide_on_behalf_of_hmp_drive_del()
  block: Move some bdrv_*_all() functions to BB
  block: Remove bdrv_states

 block.c|  82 ++
 block/block-backend.c  | 245 ++---
 block/io.c |  88 ---
 block/qapi.c   |  13 ++-
 blockdev.c |  29 ++---
 cpus.c |   7 +-
 include/block/block.h  |   5 -
 include/block/block_int.h  |   4 -
 include/sysemu/block-backend.h |   8 +-
 migration/block.c  |   7 +-
 migration/migration.c  |   3 +-
 migration/savevm.c |  66 ++-
 monitor.c  |  15 ++-
 qemu-char.c|   3 +-
 qemu-io.c  |   2 +-
 qmp.c  |   9 +-
 stubs/Makefile.objs|   2 +-
 stubs/bdrv-commit-all.c|   7 --
 stubs/blk-commit-all.c |   7 ++
 xen-mapcache.c |   3 +-
 20 files changed, 322 insertions(+), 283 deletions(-)
 delete mode 100644 stubs/bdrv-commit-all.c
 create mode 100644 stubs/blk-commit-all.c

-- 
2.6.2




[Qemu-block] [PATCH v2 4/8] block: Use BlockBackend more

2015-11-09 Thread Max Reitz
Replace bdrv_drain_all(), bdrv_commmit_all(), bdrv_flush_all(),
bdrv_invalidate_cache_all(), bdrv_next() and occurrences of bdrv_states
by their BlockBackend equivalents.

Signed-off-by: Max Reitz 
---
 block.c   | 10 
 block/block-backend.c |  8 +++
 block/qapi.c  | 13 --
 blockdev.c| 12 ++
 cpus.c|  7 +++---
 migration/block.c |  7 --
 migration/migration.c |  3 ++-
 migration/savevm.c| 66 ++-
 monitor.c | 15 +++-
 qemu-char.c   |  3 ++-
 qemu-io.c |  2 +-
 qmp.c |  9 ---
 xen-mapcache.c|  3 ++-
 13 files changed, 95 insertions(+), 63 deletions(-)

diff --git a/block.c b/block.c
index faad9fb..f8c1e99 100644
--- a/block.c
+++ b/block.c
@@ -1717,7 +1717,7 @@ int bdrv_reopen_multiple(BlockReopenQueue *bs_queue, 
Error **errp)
 
 assert(bs_queue != NULL);
 
-bdrv_drain_all();
+blk_drain_all();
 
 QSIMPLEQ_FOREACH(bs_entry, bs_queue, entry) {
 if (bdrv_reopen_prepare(_entry->state, bs_queue, _err)) {
@@ -1974,7 +1974,7 @@ void bdrv_close_all(void)
 
 /* Drop references from requests still in flight, such as canceled block
  * jobs whose AIO context has not been polled yet */
-bdrv_drain_all();
+blk_drain_all();
 
 blockdev_close_all_bdrv_states();
 blk_remove_all_bs();
@@ -3908,14 +3908,14 @@ bool bdrv_recurse_is_first_non_filter(BlockDriverState 
*bs,
  */
 bool bdrv_is_first_non_filter(BlockDriverState *candidate)
 {
-BlockDriverState *bs;
+BlockBackend *blk = NULL;
 
 /* walk down the bs forest recursively */
-QTAILQ_FOREACH(bs, _states, device_list) {
+while ((blk = blk_next_inserted(blk)) != NULL) {
 bool perm;
 
 /* try to recurse in this top level bs */
-perm = bdrv_recurse_is_first_non_filter(bs, candidate);
+perm = bdrv_recurse_is_first_non_filter(blk_bs(blk), candidate);
 
 /* candidate is the first non filter */
 if (perm) {
diff --git a/block/block-backend.c b/block/block-backend.c
index b3d086d..c89340e 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -282,10 +282,10 @@ const char *blk_name(BlockBackend *blk)
  */
 BlockBackend *blk_by_name(const char *name)
 {
-BlockBackend *blk;
+BlockBackend *blk = NULL;
 
 assert(name);
-QTAILQ_FOREACH(blk, _backends, link) {
+while ((blk = blk_next(blk)) != NULL) {
 if (!strcmp(name, blk->name)) {
 return blk;
 }
@@ -360,9 +360,9 @@ DriveInfo *blk_set_legacy_dinfo(BlockBackend *blk, 
DriveInfo *dinfo)
  */
 BlockBackend *blk_by_legacy_dinfo(DriveInfo *dinfo)
 {
-BlockBackend *blk;
+BlockBackend *blk = NULL;
 
-QTAILQ_FOREACH(blk, _backends, link) {
+while ((blk = blk_next(blk)) != NULL) {
 if (blk->legacy_dinfo == dinfo) {
 return blk;
 }
diff --git a/block/qapi.c b/block/qapi.c
index 89d4274..82727fb 100644
--- a/block/qapi.c
+++ b/block/qapi.c
@@ -405,15 +405,24 @@ BlockStatsList *qmp_query_blockstats(bool has_query_nodes,
 {
 BlockStatsList *head = NULL, **p_next = 
 BlockDriverState *bs = NULL;
+BlockBackend *blk = NULL;
 
 /* Just to be safe if query_nodes is not always initialized */
 query_nodes = has_query_nodes && query_nodes;
 
-while ((bs = query_nodes ? bdrv_next_node(bs) : bdrv_next(bs))) {
+while (query_nodes ? (bs = bdrv_next_node(bs)) != NULL
+   : (blk = blk_next_inserted(blk)) != NULL)
+{
 BlockStatsList *info = g_malloc0(sizeof(*info));
-AioContext *ctx = bdrv_get_aio_context(bs);
+AioContext *ctx = blk ? blk_get_aio_context(blk)
+  : bdrv_get_aio_context(bs);
 
 aio_context_acquire(ctx);
+
+if (blk) {
+bs = blk_bs(blk);
+}
+
 info->value = bdrv_query_stats(bs, !query_nodes);
 aio_context_release(ctx);
 
diff --git a/blockdev.c b/blockdev.c
index 93a70b4..97e6897 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1134,7 +1134,7 @@ void hmp_commit(Monitor *mon, const QDict *qdict)
 int ret;
 
 if (!strcmp(device, "all")) {
-ret = bdrv_commit_all();
+ret = blk_commit_all();
 } else {
 BlockDriverState *bs;
 AioContext *aio_context;
@@ -1953,7 +1953,7 @@ void qmp_transaction(TransactionActionList *dev_list, 
Error **errp)
 QSIMPLEQ_INIT(_bdrv_states);
 
 /* drain all i/o before any operations */
-bdrv_drain_all();
+blk_drain_all();
 
 /* We don't do anything in this loop that commits us to the operations */
 while (NULL != dev_entry) {
@@ -2581,7 +2581,7 @@ void qmp_block_resize(bool has_device, const char *device,
 }
 
 /* complete all in-flight operations before resizing the device */
-bdrv_drain_all();
+blk_drain_all();
 
 ret = bdrv_truncate(bs, size);
 switch 

[Qemu-block] [PATCH v2 5/8] blockdev: Add list of monitor-owned BlockBackends

2015-11-09 Thread Max Reitz
The monitor does hold references to some BlockBackends so it should have
a list of those BBs; blk_backends is a different list, as it contains
references to all BBs (after a follow-up patch, that is), and that
should not be changed because we do need such a list.

monitor_remove_blk() is idempotent so that we can call it in
blockdev_auto_del() without having to care whether it had been called in
do_drive_del() before. monitor_add_blk() is idempotent for symmetry
reasons (monitor_remove_blk() is, so it would be strange for
monitor_add_blk() not to be).

Signed-off-by: Max Reitz 
---
 block/block-backend.c  | 34 +-
 blockdev.c |  8 
 include/sysemu/block-backend.h |  2 ++
 3 files changed, 43 insertions(+), 1 deletion(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index c89340e..e35d84f 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -29,6 +29,8 @@ struct BlockBackend {
 BlockDriverState *bs;
 DriveInfo *legacy_dinfo;/* null unless created by drive_new() */
 QTAILQ_ENTRY(BlockBackend) link; /* for blk_backends */
+QTAILQ_ENTRY(BlockBackend) monitor_link; /* for monitor_block_backends */
+bool in_monitor_list;
 
 void *dev;  /* attached device model, if any */
 /* TODO change to DeviceState when all users are qdevified */
@@ -70,6 +72,11 @@ static void drive_info_del(DriveInfo *dinfo);
 static QTAILQ_HEAD(, BlockBackend) blk_backends =
 QTAILQ_HEAD_INITIALIZER(blk_backends);
 
+/* All BlockBackends referenced by the monitor and which are iterated through 
by
+ * blk_next() */
+static QTAILQ_HEAD(, BlockBackend) monitor_block_backends =
+QTAILQ_HEAD_INITIALIZER(monitor_block_backends);
+
 /*
  * Create a new BlockBackend with @name, with a reference count of one.
  * @name must not be null or empty.
@@ -248,7 +255,8 @@ void blk_remove_all_bs(void)
  */
 BlockBackend *blk_next(BlockBackend *blk)
 {
-return blk ? QTAILQ_NEXT(blk, link) : QTAILQ_FIRST(_backends);
+return blk ? QTAILQ_NEXT(blk, monitor_link)
+   : QTAILQ_FIRST(_block_backends);
 }
 
 /*
@@ -267,6 +275,30 @@ BlockBackend *blk_next_inserted(BlockBackend *blk)
 }
 
 /*
+ * Add a BlockBackend into the list of backends referenced by the monitor.
+ * Strictly for use by blockdev.c.
+ */
+void monitor_add_blk(BlockBackend *blk)
+{
+if (!blk->in_monitor_list) {
+QTAILQ_INSERT_TAIL(_block_backends, blk, monitor_link);
+blk->in_monitor_list = true;
+}
+}
+
+/*
+ * Remove a BlockBackend from the list of backends referenced by the monitor.
+ * Strictly for use by blockdev.c.
+ */
+void monitor_remove_blk(BlockBackend *blk)
+{
+if (blk->in_monitor_list) {
+QTAILQ_REMOVE(_block_backends, blk, monitor_link);
+blk->in_monitor_list = false;
+}
+}
+
+/*
  * Return @blk's name, a non-null string.
  * Wart: the name is empty iff @blk has been hidden with
  * blk_hide_on_behalf_of_hmp_drive_del().
diff --git a/blockdev.c b/blockdev.c
index 97e6897..f5dde19 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -146,6 +146,7 @@ void blockdev_auto_del(BlockBackend *blk)
 DriveInfo *dinfo = blk_legacy_dinfo(blk);
 
 if (dinfo && dinfo->auto_del) {
+monitor_remove_blk(blk);
 blk_unref(blk);
 }
 }
@@ -606,6 +607,8 @@ static BlockBackend *blockdev_init(const char *file, QDict 
*bs_opts,
 
 blk_set_on_error(blk, on_read_error, on_write_error);
 
+monitor_add_blk(blk);
+
 err_no_bs_opts:
 qemu_opts_del(opts);
 return blk;
@@ -2528,6 +2531,8 @@ void hmp_drive_del(Monitor *mon, const QDict *qdict)
 blk_remove_bs(blk);
 }
 
+monitor_remove_blk(blk);
+
 /* if we have a device attached to this BlockDriverState
  * then we need to make the drive anonymous until the device
  * can be removed.  If this is a drive with no device backing
@@ -3486,6 +3491,7 @@ void qmp_blockdev_add(BlockdevOptions *options, Error 
**errp)
 
 if (bs && bdrv_key_required(bs)) {
 if (blk) {
+monitor_remove_blk(blk);
 blk_unref(blk);
 } else {
 QTAILQ_REMOVE(_bdrv_states, bs, monitor_list);
@@ -3515,6 +3521,7 @@ void qmp_x_blockdev_del(bool has_id, const char *id,
 }
 
 if (has_id) {
+/* blk_by_name() never returns a BB that is not owned by the monitor */
 blk = blk_by_name(id);
 if (!blk) {
 error_setg(errp, "Cannot find block backend %s", id);
@@ -3562,6 +3569,7 @@ void qmp_x_blockdev_del(bool has_id, const char *id,
 }
 
 if (blk) {
+monitor_remove_blk(blk);
 blk_unref(blk);
 } else {
 QTAILQ_REMOVE(_bdrv_states, bs, monitor_list);
diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h
index ee36f93..9a3d8db 100644
--- a/include/sysemu/block-backend.h
+++ b/include/sysemu/block-backend.h
@@ -74,6 +74,8 @@ BlockBackend *blk_by_name(const char 

[Qemu-block] [PATCH v2 7/8] block: Move some bdrv_*_all() functions to BB

2015-11-09 Thread Max Reitz
Move bdrv_drain_all(), bdrv_commit_all(), bdrv_flush_all() and
bdrv_invalidate_cache_all() to BB.

The only operation left is bdrv_close_all(), which cannot be moved to
the BB because it should not only close all BBs, but also all
monitor-owned BDSs.

Signed-off-by: Max Reitz 
---
 block.c |  38 -
 block/block-backend.c   | 138 +++-
 block/io.c  |  88 --
 include/block/block.h   |   4 --
 stubs/Makefile.objs |   2 +-
 stubs/bdrv-commit-all.c |   7 ---
 stubs/blk-commit-all.c  |   7 +++
 7 files changed, 134 insertions(+), 150 deletions(-)
 delete mode 100644 stubs/bdrv-commit-all.c
 create mode 100644 stubs/blk-commit-all.c

diff --git a/block.c b/block.c
index f8c1e99..c80675e 100644
--- a/block.c
+++ b/block.c
@@ -2297,26 +2297,6 @@ ro_cleanup:
 return ret;
 }
 
-int bdrv_commit_all(void)
-{
-BlockDriverState *bs;
-
-QTAILQ_FOREACH(bs, _states, device_list) {
-AioContext *aio_context = bdrv_get_aio_context(bs);
-
-aio_context_acquire(aio_context);
-if (bs->drv && bs->backing) {
-int ret = bdrv_commit(bs);
-if (ret < 0) {
-aio_context_release(aio_context);
-return ret;
-}
-}
-aio_context_release(aio_context);
-}
-return 0;
-}
-
 /*
  * Return values:
  * 0- success
@@ -3074,24 +3054,6 @@ void bdrv_invalidate_cache(BlockDriverState *bs, Error 
**errp)
 }
 }
 
-void bdrv_invalidate_cache_all(Error **errp)
-{
-BlockDriverState *bs;
-Error *local_err = NULL;
-
-QTAILQ_FOREACH(bs, _states, device_list) {
-AioContext *aio_context = bdrv_get_aio_context(bs);
-
-aio_context_acquire(aio_context);
-bdrv_invalidate_cache(bs, _err);
-aio_context_release(aio_context);
-if (local_err) {
-error_propagate(errp, local_err);
-return;
-}
-}
-}
-
 /**/
 /* removable device support */
 
diff --git a/block/block-backend.c b/block/block-backend.c
index 609a045..b0bf3b2 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -409,6 +409,9 @@ void blk_remove_bs(BlockBackend *blk)
 notifier_list_notify(>remove_bs_notifiers, blk);
 
 blk_update_root_state(blk);
+/* After this function, the BDS may longer be accessible to blk_*_all()
+ * functions */
+bdrv_drain(blk->bs);
 
 blk->bs->blk = NULL;
 bdrv_unref(blk->bs);
@@ -902,11 +905,6 @@ int blk_flush(BlockBackend *blk)
 return bdrv_flush(blk->bs);
 }
 
-int blk_flush_all(void)
-{
-return bdrv_flush_all();
-}
-
 void blk_drain(BlockBackend *blk)
 {
 if (blk->bs) {
@@ -914,11 +912,6 @@ void blk_drain(BlockBackend *blk)
 }
 }
 
-void blk_drain_all(void)
-{
-bdrv_drain_all();
-}
-
 void blk_set_on_error(BlockBackend *blk, BlockdevOnError on_read_error,
   BlockdevOnError on_write_error)
 {
@@ -1343,12 +1336,133 @@ BlockBackendRootState *blk_get_root_state(BlockBackend 
*blk)
 return >root_state;
 }
 
+/*
+ * Wait for pending requests to complete across all BlockBackends
+ *
+ * This function does not flush data to disk, use blk_flush_all() for that
+ * after calling this function.
+ */
+void blk_drain_all(void)
+{
+/* Always run first iteration so any pending completion BHs run */
+bool busy = true;
+BlockBackend *blk;
+GSList *aio_ctxs = NULL, *ctx;
+
+QTAILQ_FOREACH(blk, _backends, link) {
+AioContext *aio_context = blk_get_aio_context(blk);
+
+aio_context_acquire(aio_context);
+if (blk_is_inserted(blk) && blk->bs->job) {
+block_job_pause(blk->bs->job);
+}
+aio_context_release(aio_context);
+
+if (!g_slist_find(aio_ctxs, aio_context)) {
+aio_ctxs = g_slist_prepend(aio_ctxs, aio_context);
+}
+}
+
+/* Note that completion of an asynchronous I/O operation can trigger any
+ * number of other I/O operations on other devices---for example a
+ * coroutine can submit an I/O request to another device in response to
+ * request completion.  Therefore we must keep looping until there was no
+ * more activity rather than simply draining each device independently.
+ */
+while (busy) {
+busy = false;
+
+for (ctx = aio_ctxs; ctx != NULL; ctx = ctx->next) {
+AioContext *aio_context = ctx->data;
+
+aio_context_acquire(aio_context);
+QTAILQ_FOREACH(blk, _backends, link) {
+if (blk_is_inserted(blk) &&
+aio_context == blk_get_aio_context(blk))
+{
+bdrv_flush_io_queue(blk->bs);
+if (bdrv_requests_pending(blk->bs)) {
+busy = true;
+aio_poll(aio_context, busy);
+}
+

[Qemu-block] [PATCH 7/8] block: Make bdrv_open() return a BDS

2015-11-09 Thread Max Reitz
There are no callers to bdrv_open() or bdrv_open_inherit() left that
pass a pointer to a non-NULL BDS pointer as the first argument of these
functions, so we can finally drop that parameter and just make them
return the new BDS.

Generally, the following pattern is applied:

bs = NULL;
ret = bdrv_open(, ..., _err);
if (ret < 0) {
error_propagate(errp, local_err);
...
}

by

bs = bdrv_open(..., errp);
if (!bs) {
ret = -EINVAL;
...
}

Of course, there are only a few instances where the pattern is really
pure.

Signed-off-by: Max Reitz 
---
 block.c   | 121 +-
 block/block-backend.c |   6 +--
 block/parallels.c |   9 ++--
 block/qcow.c  |   9 ++--
 block/qcow2.c |  29 +---
 block/qed.c   |  11 ++---
 block/sheepdog.c  |  16 +++
 block/vdi.c   |   7 ++-
 block/vhdx.c  |   8 ++--
 block/vmdk.c  |  24 +-
 block/vpc.c   |   7 ++-
 block/vvfat.c |   9 ++--
 blockdev.c|  38 ++--
 include/block/block.h |   4 +-
 14 files changed, 113 insertions(+), 185 deletions(-)

diff --git a/block.c b/block.c
index 0531992..f258c54 100644
--- a/block.c
+++ b/block.c
@@ -82,10 +82,12 @@ static QTAILQ_HEAD(, BlockDriverState) all_bdrv_states =
 static QLIST_HEAD(, BlockDriver) bdrv_drivers =
 QLIST_HEAD_INITIALIZER(bdrv_drivers);
 
-static int bdrv_open_inherit(BlockDriverState **pbs, const char *filename,
- const char *reference, QDict *options, int flags,
- BlockDriverState *parent,
- const BdrvChildRole *child_role, Error **errp);
+static BlockDriverState *bdrv_open_inherit(const char *filename,
+   const char *reference,
+   QDict *options, int flags,
+   BlockDriverState *parent,
+   const BdrvChildRole *child_role,
+   Error **errp);
 
 static void bdrv_dirty_bitmap_truncate(BlockDriverState *bs);
 static void bdrv_release_all_dirty_bitmaps(BlockDriverState *bs);
@@ -1211,15 +1213,15 @@ int bdrv_open_backing_file(BlockDriverState *bs, QDict 
*options, Error **errp)
 qdict_put(options, "driver", qstring_from_str(bs->backing_format));
 }
 
-backing_hd = NULL;
-ret = bdrv_open_inherit(_hd,
-*backing_filename ? backing_filename : NULL,
-NULL, options, 0, bs, _backing, _err);
-if (ret < 0) {
+backing_hd = bdrv_open_inherit(*backing_filename ? backing_filename : NULL,
+   NULL, options, 0, bs, _backing,
+   _err);
+if (!backing_hd) {
 bs->open_flags |= BDRV_O_NO_BACKING;
 error_setg(errp, "Could not open backing file: %s",
error_get_pretty(local_err));
 error_free(local_err);
+ret = -EINVAL;
 goto free_exit;
 }
 
@@ -1256,7 +1258,6 @@ BdrvChild *bdrv_open_child(const char *filename,
 BdrvChild *c = NULL;
 BlockDriverState *bs;
 QDict *image_options;
-int ret;
 char *bdref_key_dot;
 const char *reference;
 
@@ -1276,10 +1277,9 @@ BdrvChild *bdrv_open_child(const char *filename,
 goto done;
 }
 
-bs = NULL;
-ret = bdrv_open_inherit(, filename, reference, image_options, 0,
-parent, child_role, errp);
-if (ret < 0) {
+bs = bdrv_open_inherit(filename, reference, image_options, 0,
+   parent, child_role, errp);
+if (!bs) {
 goto done;
 }
 
@@ -1342,11 +1342,9 @@ static BlockDriverState 
*bdrv_append_temp_snapshot(BlockDriverState *bs,
 qdict_put(snapshot_options, "driver",
   qstring_from_str("qcow2"));
 
-bs_snapshot = NULL;
-ret = bdrv_open(_snapshot, NULL, NULL, snapshot_options,
-flags, _err);
-if (ret < 0) {
-error_propagate(errp, local_err);
+bs_snapshot = bdrv_open(NULL, NULL, snapshot_options, flags, errp);
+if (!bs_snapshot) {
+ret = -EINVAL;
 goto out;
 }
 
@@ -1376,10 +1374,12 @@ out:
  * should be opened. If specified, neither options nor a filename may be given,
  * nor can an existing BDS be reused (that is, *pbs has to be NULL).
  */
-static int bdrv_open_inherit(BlockDriverState **pbs, const char *filename,
- const char *reference, QDict *options, int flags,
- BlockDriverState *parent,
- const BdrvChildRole *child_role, Error **errp)
+static BlockDriverState *bdrv_open_inherit(const char *filename,
+   const char *reference,
+ 

Re: [Qemu-block] [PATCH v2 0/8] blockdev: Further BlockBackend work

2015-11-09 Thread Max Reitz
On 10.11.2015 04:45, Eric Blake wrote:
> [side question]
> 
> On 11/09/2015 08:27 PM, Max Reitz wrote:
>> Overall, this series adds more uses for BlockBackends and makes the code
>> follow the "reference theory" more closely, in that any BlockBackend
>> created (through -drive or blockdev-add) has a reference count of 1, and
>> that reference should be held by the monitor. This is reflected here by
>> introducing an explicit list of monitor-owned BlockBackends, which in
>> turn allows us to remove bdrv_states, and, perhaps most importantly,
>> blk_hide_on_behalf_of_do_drive_del().
>>
> 
>> Although I don't suppose anyone will care much, here's that
>> backport-diff against v1:
>>
>> Key:
>> [] : patches are identical
>> [] : number of functional differences between upstream/downstream patch
>> [down] : patch is downstream-only
>> The flags [FC] indicate (F)unctional and (C)ontextual differences, 
>> respectively
>>
>> 001/8:[] [-C] 'block: Add blk_name_taken()'
>> 002/8:[] [-C] 'block: Add blk_next_inserted()'
>> 003/8:[] [-C] 'block: Add blk_commit_all() and 
>> blk_invalidate_cache_all()'
>> 004/8:[0073] [FC] 'block: Use BlockBackend more'
>> 005/8:[0004] [FC] 'blockdev: Add list of monitor-owned BlockBackends'
>> 006/8:[down] 'blockdev: Remove blk_hide_on_behalf_of_hmp_drive_del()'
>>   (renamed from "..._on_behalf_of_do_drive_del()")
> 
> This is a nice note. Is there any way to make 'git backport-diff'
> automatically do this, or does it have to be done manually when renaming
> patches?

As far as I know, git backport-diff simply looks for matching patches
based on the title alone, so I don't think there's a way for it to
automatically figure out name changes.

I don't know of a way of telling it that a particular "downstream" patch
belongs to another particular "upstream" patch, though. So in general,
if I do want an accurate diff even for renamed patches (I did not here,
because v1 has been more than eight months ago), I just diff manually
and count the changes or temporarily rename the patch on the current
(new) branch.

Max



signature.asc
Description: OpenPGP digital signature


[Qemu-block] [PATCH 3/8] block: Let bdrv_open_inherit() return the snapshot

2015-11-09 Thread Max Reitz
If bdrv_open_inherit() creates a snapshot BDS and *pbs is NULL, that
snapshot BDS should be returned instead of the BDS under it.

To this end, bdrv_append_temp_snapshot() now returns the snapshot BDS
instead of just appending it on top of the snapshotted BDS, and that
function is made static since bdrv_open_inherit() is its only user
anyway. Also, it calls bdrv_ref() before bdrv_append() (which
bdrv_open_inherit() has to undo if not returning the overlay).

Signed-off-by: Max Reitz 
---
 block.c   | 27 +++
 include/block/block.h |  1 -
 2 files changed, 23 insertions(+), 5 deletions(-)

diff --git a/block.c b/block.c
index aae1d69..c5ea5e7 100644
--- a/block.c
+++ b/block.c
@@ -1295,7 +1295,8 @@ done:
 return c;
 }
 
-int bdrv_append_temp_snapshot(BlockDriverState *bs, int flags, Error **errp)
+static BlockDriverState *bdrv_append_temp_snapshot(BlockDriverState *bs,
+   int flags, Error **errp)
 {
 /* TODO: extra byte is a hack to ensure MAX_PATH space on Windows. */
 char *tmp_filename = g_malloc0(PATH_MAX + 1);
@@ -1354,11 +1355,15 @@ int bdrv_append_temp_snapshot(BlockDriverState *bs, int 
flags, Error **errp)
 goto out;
 }
 
+bdrv_ref(bs_snapshot);
 bdrv_append(bs_snapshot, bs);
 
+g_free(tmp_filename);
+return bs_snapshot;
+
 out:
 g_free(tmp_filename);
-return ret;
+return NULL;
 }
 
 /*
@@ -1557,15 +1562,29 @@ static int bdrv_open_inherit(BlockDriverState **pbs, 
const char *filename,
 }
 
 QDECREF(options);
-*pbs = bs;
 
 /* For snapshot=on, create a temporary qcow2 overlay. bs points to the
  * temporary snapshot afterwards. */
 if (snapshot_flags) {
-ret = bdrv_append_temp_snapshot(bs, snapshot_flags, _err);
+BlockDriverState *snapshot_bs;
+snapshot_bs = bdrv_append_temp_snapshot(bs, snapshot_flags, 
_err);
 if (local_err) {
+ret = -EINVAL;
 goto close_and_fail;
 }
+if (!*pbs) {
+/* The reference is now held by the overlay BDS */
+bdrv_unref(bs);
+bs = snapshot_bs;
+} else {
+/* It is still referenced in the same way that *pbs was referenced,
+ * however that may be */
+bdrv_unref(snapshot_bs);
+}
+}
+
+if (!*pbs) {
+*pbs = bs;
 }
 
 return 0;
diff --git a/include/block/block.h b/include/block/block.h
index 1abfc70..37e49a1 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -210,7 +210,6 @@ BdrvChild *bdrv_open_child(const char *filename,
bool allow_none, Error **errp);
 void bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd);
 int bdrv_open_backing_file(BlockDriverState *bs, QDict *options, Error **errp);
-int bdrv_append_temp_snapshot(BlockDriverState *bs, int flags, Error **errp);
 int bdrv_open(BlockDriverState **pbs, const char *filename,
   const char *reference, QDict *options, int flags, Error **errp);
 BlockReopenQueue *bdrv_reopen_queue(BlockReopenQueue *bs_queue,
-- 
2.6.2




[Qemu-block] [PATCH 6/8] block: Drop bdrv_new_root()

2015-11-09 Thread Max Reitz
It is unused now, so we may just as well drop it.

Signed-off-by: Max Reitz 
---
 block.c   | 5 -
 include/block/block.h | 1 -
 2 files changed, 6 deletions(-)

diff --git a/block.c b/block.c
index 395b7b0..0531992 100644
--- a/block.c
+++ b/block.c
@@ -243,11 +243,6 @@ void bdrv_register(BlockDriver *bdrv)
 QLIST_INSERT_HEAD(_drivers, bdrv, list);
 }
 
-BlockDriverState *bdrv_new_root(void)
-{
-return bdrv_new();
-}
-
 BlockDriverState *bdrv_new(void)
 {
 BlockDriverState *bs;
diff --git a/include/block/block.h b/include/block/block.h
index 37e49a1..6638790 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -193,7 +193,6 @@ BlockDriver *bdrv_find_format(const char *format_name);
 int bdrv_create(BlockDriver *drv, const char* filename,
 QemuOpts *opts, Error **errp);
 int bdrv_create_file(const char *filename, QemuOpts *opts, Error **errp);
-BlockDriverState *bdrv_new_root(void);
 BlockDriverState *bdrv_new(void);
 void bdrv_make_anon(BlockDriverState *bs);
 void bdrv_swap(BlockDriverState *bs_new, BlockDriverState *bs_old);
-- 
2.6.2




[Qemu-block] [PATCH 5/8] block: Drop blk_new_with_bs()

2015-11-09 Thread Max Reitz
Its only caller is blk_new_open(), so we can just inline it there. Since
bdrv_new_root() is only a wrapper around bdrv_new(), we can just use
bdrv_new() instead.

Signed-off-by: Max Reitz 
---
 block/block-backend.c | 29 +++--
 1 file changed, 7 insertions(+), 22 deletions(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index b0bf3b2..13f5fef 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -114,26 +114,6 @@ BlockBackend *blk_new(const char *name, Error **errp)
 }
 
 /*
- * Create a new BlockBackend with a new BlockDriverState attached.
- * Otherwise just like blk_new(), which see.
- */
-BlockBackend *blk_new_with_bs(const char *name, Error **errp)
-{
-BlockBackend *blk;
-BlockDriverState *bs;
-
-blk = blk_new(name, errp);
-if (!blk) {
-return NULL;
-}
-
-bs = bdrv_new_root();
-blk->bs = bs;
-bs->blk = blk;
-return blk;
-}
-
-/*
  * Calls blk_new_with_bs() and then calls bdrv_open() on the BlockDriverState.
  *
  * Just as with bdrv_open(), after having called this function the reference to
@@ -150,20 +130,25 @@ BlockBackend *blk_new_open(const char *name, const char 
*filename,
Error **errp)
 {
 BlockBackend *blk;
+BlockDriverState *bs;
 int ret;
 
-blk = blk_new_with_bs(name, errp);
+blk = blk_new(name, errp);
 if (!blk) {
 QDECREF(options);
 return NULL;
 }
 
-ret = bdrv_open(>bs, filename, reference, options, flags, errp);
+bs = NULL;
+ret = bdrv_open(, filename, reference, options, flags, errp);
 if (ret < 0) {
 blk_unref(blk);
 return NULL;
 }
 
+blk->bs = bs;
+bs->blk = blk;
+
 return blk;
 }
 
-- 
2.6.2




Re: [Qemu-block] [PATCH v2 0/8] blockdev: Further BlockBackend work

2015-11-09 Thread Eric Blake
[side question]

On 11/09/2015 08:27 PM, Max Reitz wrote:
> Overall, this series adds more uses for BlockBackends and makes the code
> follow the "reference theory" more closely, in that any BlockBackend
> created (through -drive or blockdev-add) has a reference count of 1, and
> that reference should be held by the monitor. This is reflected here by
> introducing an explicit list of monitor-owned BlockBackends, which in
> turn allows us to remove bdrv_states, and, perhaps most importantly,
> blk_hide_on_behalf_of_do_drive_del().
> 

> Although I don't suppose anyone will care much, here's that
> backport-diff against v1:
> 
> Key:
> [] : patches are identical
> [] : number of functional differences between upstream/downstream patch
> [down] : patch is downstream-only
> The flags [FC] indicate (F)unctional and (C)ontextual differences, 
> respectively
> 
> 001/8:[] [-C] 'block: Add blk_name_taken()'
> 002/8:[] [-C] 'block: Add blk_next_inserted()'
> 003/8:[] [-C] 'block: Add blk_commit_all() and blk_invalidate_cache_all()'
> 004/8:[0073] [FC] 'block: Use BlockBackend more'
> 005/8:[0004] [FC] 'blockdev: Add list of monitor-owned BlockBackends'
> 006/8:[down] 'blockdev: Remove blk_hide_on_behalf_of_hmp_drive_del()'
>   (renamed from "..._on_behalf_of_do_drive_del()")

This is a nice note. Is there any way to make 'git backport-diff'
automatically do this, or does it have to be done manually when renaming
patches?

> 007/8:[0213] [FC] 'block: Move some bdrv_*_all() functions to BB'
> 008/8:[0027] [FC] 'block: Remove bdrv_states'
> 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH v6 04/15] iotests: Move _filter_nbd into common.filter

2015-11-09 Thread Max Reitz
On 09.11.2015 19:17, Max Reitz wrote:
> On 09.11.2015 17:04, Kevin Wolf wrote:
>> Am 04.11.2015 um 19:57 hat Max Reitz geschrieben:
>>> _filter_nbd can be useful for other NBD tests, too, therefore it should
>>> reside in common.filter, and it should support URLs of the "nbd://"
>>> format and export names.
>>>
>>> The NBD log lines ("/your/source/dir/nbd.c:function():line: error")
>>> should not be converted to empty lines but removed altogether.
>>>
>>> Signed-off-by: Max Reitz 
>>
>> Code motion and modification in the same patch is bad style. The changes
>> look good, though.
> 
> Considering splitting this into two patches will result basically in
> both of them each changing just as much as this single patch does
> (because test 083 uses tabs instead of spaces) I'm inclined to just
> change the commit title to "Remove filter_nbd and add _filter_nbd" instead.
> 
> There actually is no good style for this patch. One could argue that the
> coding style in all of test 083 is broken since it uses tabs instead of
> spaces, so first I'd need to fix up the style of 083 completely. Then,
> in a second patch, I can drop those empty lines, and in a third patch I
> can move the function. I consider that horrible when it's just about
> getting the code to common.filter.
> 
> The second variant would be not to move the code, but to "move" it, i.e.
> leave the coding style in 083 and just convert the style of this
> function when moving it to common.filter. Well, if I'm already doing
> that, I might just as well fix that empty line thing on the way.
> 
> The third variant would be to fix that empty line thing in 083, and fix
> the code style of that single function along with it, and then move it
> to common.filter in a separate patch.
> 
> And then we have the fourth way which would be to move nbd_filter to
> common.filter as it is, and then fix up the coding style there.
> 
> So let's look at my opinion for each of them:
> (1) I find it horrible, but I can do that.
> (2) That's what I did and that's what I'd do again.
> (3) I'm opposed to change the style of that one function inside of 083
> without changing the rest of the file, but not strongly enough not
> to do it.
> (4) I will definitely not insert tabs, even if it's just code movement.
> 
> I still consider 2 very reasonable for the issue at hand since the
> function is very small and it will have to be completely “rewritten”
> anyway in some patch (because the tabs to spaces change is absolutely
> necessary at some point when moving it from 083 to common.filter).
> Therefore, I don't think reviewers gain anything from doing it any other
> way.
> 
> I consider 1 (fixing up all of 083 just so that I can move this little
> function) so horrible that I won't do it unless there is another way,
> and 2 already is another way, so that's that.
> 
> I guess I'll go for 3 just so you can see why I chose 2 before. I can do
> 1 in v8 if you insist, so you can get to experience that, too.

Oh, even better, I'll go for a mix of 1, 3 and 4: I'll first fix the
code style of that function alone, then I'll rename it to _filter_nbd,
then I'll move it, and then I'll fix it. Even more patches than 1, but
not as many changes, as I don't have to fix all of 083.

Max



signature.asc
Description: OpenPGP digital signature


[Qemu-block] [PATCH 1/2] atapi: add byte_count_limit helper

2015-11-09 Thread John Snow
Signed-off-by: John Snow 
---
 hw/ide/atapi.c | 18 +-
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/hw/ide/atapi.c b/hw/ide/atapi.c
index 747f466..1471ae2 100644
--- a/hw/ide/atapi.c
+++ b/hw/ide/atapi.c
@@ -167,6 +167,17 @@ void ide_atapi_io_error(IDEState *s, int ret)
 }
 }
 
+static uint16_t atapi_byte_count_limit(IDEState *s)
+{
+uint16_t bcl;
+
+bcl = s->lcyl | (s->hcyl << 8);
+if (bcl == 0x) {
+return 0xfffe;
+}
+return bcl;
+}
+
 /* The whole ATAPI transfer logic is handled in this function */
 void ide_atapi_cmd_reply_end(IDEState *s)
 {
@@ -209,12 +220,10 @@ void ide_atapi_cmd_reply_end(IDEState *s)
 } else {
 /* a new transfer is needed */
 s->nsector = (s->nsector & ~7) | ATAPI_INT_REASON_IO;
-byte_count_limit = s->lcyl | (s->hcyl << 8);
+byte_count_limit = atapi_byte_count_limit(s);
 #ifdef DEBUG_IDE_ATAPI
 printf("byte_count_limit=%d\n", byte_count_limit);
 #endif
-if (byte_count_limit == 0x)
-byte_count_limit--;
 size = s->packet_transfer_size;
 if (size > byte_count_limit) {
 /* byte count limit must be even if this case */
@@ -1265,8 +1274,7 @@ void ide_atapi_cmd(IDEState *s)
  * See ATA8 ACS3 section 7.17.6.49 and 7.21.5 */
 if (!(atapi_cmd_table[s->io_buffer[0]].flags & NONDATA)) {
 /* TODO: Check IDENTIFY data word 125 for default BCL (currently 0) */
-uint16_t byte_count_limit = s->lcyl | (s->hcyl << 8);
-if (!(byte_count_limit || s->atapi_dma)) {
+if (!(atapi_byte_count_limit(s) || s->atapi_dma)) {
 /* TODO: Move abort back into core.c and make static inline again 
*/
 ide_abort_command(s);
 return;
-- 
2.4.3




[Qemu-block] [PATCH 0/2] atapi: fix NetBSD boot regression

2015-11-09 Thread John Snow
Marc noticed that a recent ATAPI permissions fix broke NetBSD 7.0's
installer ISO.

The problem is that it's meaningless to check for !(cmd->flags & nondata)
if the command isn't supported, since all unsupported commands have
_no_ flags. Effectively, all commands default to "Transfer Data" in our
model until we classify them otherwise.

This leads to a problem where we reject a zero byte BCL PIO command that
transfers no data, simply because we have no properties for the command
at all.

Getting an ATA rejection for this command greatly confuses NetBSD.

Correct behavior is to reject the command at the SCSI layer for being
unsupported.



For convenience, this branch is available at:
https://github.com/jnsnow/qemu.git branch atapi-bclimit-netbsd
https://github.com/jnsnow/qemu/tree/atapi-bclimit-netbsd

This version is tagged atapi-bclimit-netbsd-v1:
https://github.com/jnsnow/qemu/releases/tag/atapi-bclimit-netbsd-v1

John Snow (2):
  atapi: add byte_count_limit helper
  atapi: Prioritize unknown cmd error over BCL error

 hw/ide/atapi.c | 38 +++---
 1 file changed, 23 insertions(+), 15 deletions(-)

-- 
2.4.3




[Qemu-block] [PATCH 2/2] atapi: Prioritize unknown cmd error over BCL error

2015-11-09 Thread John Snow
If we don't know about the command at all, we need to prioritize
that failure above the zero byte-count-limit failure.

This fixes a failure in the sparc64 NetBSD 7.0 installer bootup.

Reported-by: Mark Cave-Ayland 
Signed-off-by: John Snow 
---
 hw/ide/atapi.c | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/hw/ide/atapi.c b/hw/ide/atapi.c
index 1471ae2..52a989e 100644
--- a/hw/ide/atapi.c
+++ b/hw/ide/atapi.c
@@ -1188,7 +1188,7 @@ enum {
 NONDATA = 0x04,
 };
 
-static const struct {
+static const struct AtapiCmd {
 void (*handler)(IDEState *s, uint8_t *buf);
 int flags;
 } atapi_cmd_table[0x100] = {
@@ -1215,9 +1215,9 @@ static const struct {
 
 void ide_atapi_cmd(IDEState *s)
 {
-uint8_t *buf;
+uint8_t *buf = s->io_buffer;
+const struct AtapiCmd *cmd = _cmd_table[s->io_buffer[0]];
 
-buf = s->io_buffer;
 #ifdef DEBUG_IDE_ATAPI
 {
 int i;
@@ -1228,14 +1228,14 @@ void ide_atapi_cmd(IDEState *s)
 printf("\n");
 }
 #endif
+
 /*
  * If there's a UNIT_ATTENTION condition pending, only command flagged with
  * ALLOW_UA are allowed to complete. with other commands getting a CHECK
  * condition response unless a higher priority status, defined by the drive
  * here, is pending.
  */
-if (s->sense_key == UNIT_ATTENTION &&
-!(atapi_cmd_table[s->io_buffer[0]].flags & ALLOW_UA)) {
+if (s->sense_key == UNIT_ATTENTION && !(cmd->flags & ALLOW_UA)) {
 ide_atapi_cmd_check_status(s);
 return;
 }
@@ -1246,7 +1246,7 @@ void ide_atapi_cmd(IDEState *s)
  * GET_EVENT_STATUS_NOTIFICATION to detect such tray open/close
  * states rely on this behavior.
  */
-if (!(atapi_cmd_table[s->io_buffer[0]].flags & ALLOW_UA) &&
+if (!(cmd->flags & ALLOW_UA) &&
 !s->tray_open && blk_is_inserted(s->blk) && s->cdrom_changed) {
 
 if (s->cdrom_changed == 1) {
@@ -1261,7 +1261,7 @@ void ide_atapi_cmd(IDEState *s)
 }
 
 /* Report a Not Ready condition if appropriate for the command */
-if ((atapi_cmd_table[s->io_buffer[0]].flags & CHECK_READY) &&
+if ((cmd->flags & CHECK_READY) &&
 (!media_present(s) || !blk_is_inserted(s->blk)))
 {
 ide_atapi_cmd_error(s, NOT_READY, ASC_MEDIUM_NOT_PRESENT);
@@ -1272,7 +1272,7 @@ void ide_atapi_cmd(IDEState *s)
  * If this is a data-transferring PIO command and BCL is 0,
  * we abort at the /ATA/ level, not the ATAPI level.
  * See ATA8 ACS3 section 7.17.6.49 and 7.21.5 */
-if (!(atapi_cmd_table[s->io_buffer[0]].flags & NONDATA)) {
+if (cmd->handler && !(cmd->flags & NONDATA)) {
 /* TODO: Check IDENTIFY data word 125 for default BCL (currently 0) */
 if (!(atapi_byte_count_limit(s) || s->atapi_dma)) {
 /* TODO: Move abort back into core.c and make static inline again 
*/
@@ -1282,8 +1282,8 @@ void ide_atapi_cmd(IDEState *s)
 }
 
 /* Execute the command */
-if (atapi_cmd_table[s->io_buffer[0]].handler) {
-atapi_cmd_table[s->io_buffer[0]].handler(s, buf);
+if (cmd->handler) {
+cmd->handler(s, buf);
 return;
 }
 
-- 
2.4.3




Re: [Qemu-block] [PATCH v4 0/9] block: Fixes for bdrv_drain

2015-11-09 Thread Stefan Hajnoczi
On Mon, Nov 09, 2015 at 06:16:45PM +0800, Fam Zheng wrote:
> v4: Don't miss children's children. [Paolo]
> 
> v3: Don't reuse coroutine in bdrv_aio_ioctl. [Stefan]
> Recursely call .bdrv_drain callback only. [Stefan, Paolo]
> Added Kevin's reviewed-by in other patches.
> 
> v2: Add Kevin's reviewed-by in patches 1, 2, 5-7, 9.
> Address Kevin's reviewing comments which are:
> - Explicit "ret = 0" before out label in patch 3.
> - Add missing qemu_aio_unref() in patch 4.
> - Recurse into all children in bdrv_drain in patch 8.
> 
> Previously bdrv_drain and bdrv_drain_all don't handle ioctl, flush and discard
> requests (which are fundamentally the same as read and write requests that
> change disk state).  Forgetting such requests leaves us in risk of violating
> the invariant that bdrv_drain() callers rely on - all asynchronous requests
> must have completed after bdrv_drain returns.
> 
> Enrich the tracked request types, and add tracked_request_begin/end pairs to
> all three code paths. As a prerequisite, ioctl code is moved into coroutine
> too.
> 
> The last two patches take care of QED's "need check" timer, so that after
> bdrv_drain returns, the driver is in a consistent state.
> 
> Fam
> 
> 
> Fam Zheng (9):
>   block: Add more types for tracked request
>   block: Track flush requests
>   block: Track discard requests
>   iscsi: Emulate commands in iscsi_aio_ioctl as iscsi_ioctl
>   block: Add ioctl parameter fields to BlockRequest
>   block: Emulate bdrv_ioctl with bdrv_aio_ioctl and track both
>   block: Drop BlockDriver.bdrv_ioctl
>   block: Introduce BlockDriver.bdrv_drain callback
>   qed: Implement .bdrv_drain
> 
>  block/io.c| 150 
> +++---
>  block/iscsi.c |  73 +++---
>  block/qed.c   |  13 
>  block/raw-posix.c |   8 ---
>  block/raw_bsd.c   |   6 --
>  include/block/block.h |  16 +++--
>  include/block/block_int.h |  17 +-
>  7 files changed, 208 insertions(+), 75 deletions(-)
> 
> -- 
> 2.4.3
> 

Thanks, applied to my block tree:
https://github.com/stefanha/qemu/commits/block

Stefan


signature.asc
Description: PGP signature


Re: [Qemu-block] [PATCH v6 07/15] block: Move BDS close notifiers into BB

2015-11-09 Thread Max Reitz
On 09.11.2015 17:04, Kevin Wolf wrote:
> Am 04.11.2015 um 19:57 hat Max Reitz geschrieben:
>> The only remaining user of the BDS close notifiers is NBD which uses
>> them to determine when a BDS tree is being ejected. This patch removes
>> the BDS-level close notifiers and adds a notifier list to the
>> BlockBackend structure that is invoked whenever a BDS is removed.
>>
>> Symmetrically to that, another notifier list is added that is invoked
>> whenever a BDS is inserted. The dataplane implementations for virtio-blk
>> and virtio-scsi use both notifier types for setting up and removing op
>> blockers. This is not only important for setting up the op blockers on
>> insertion, but also for removing them on ejection since bdrv_delete()
>> asserts that there are no op blockers set up.
>>
>> Signed-off-by: Max Reitz 
> 
> I think this needs to be split into smaller patches:
> 
> 1. Add the new BlockBackend notifiers
> 2. Use them in virtio-blk in order to fix... removable virtio-blk
>devices, or what is it?
> 3. Convert NBD
> 4. Remove old close notifiers

I'll do my best.

>>  block.c |  7 
>>  block/block-backend.c   | 19 +++---
>>  blockdev-nbd.c  | 37 +---
>>  hw/block/dataplane/virtio-blk.c | 77 
>> +++--
>>  hw/scsi/virtio-scsi.c   | 59 +++
>>  include/block/block.h   |  1 -
>>  include/block/block_int.h   |  2 --
>>  include/hw/virtio/virtio-scsi.h | 10 ++
>>  include/sysemu/block-backend.h  |  3 +-
>>  nbd.c   | 13 +++
>>  10 files changed, 159 insertions(+), 69 deletions(-)
> 
>> diff --git a/block/block-backend.c b/block/block-backend.c
>> index 6f9309f..38580f7 100644
>> --- a/block/block-backend.c
>> +++ b/block/block-backend.c
>> @@ -48,6 +48,8 @@ struct BlockBackend {
>>  BlockdevOnError on_read_error, on_write_error;
>>  bool iostatus_enabled;
>>  BlockDeviceIoStatus iostatus;
>> +
>> +NotifierList remove_bs_notifiers, insert_bs_notifiers;
>>  };
>>  
>>  typedef struct BlockBackendAIOCB {
>> @@ -98,6 +100,8 @@ BlockBackend *blk_new(const char *name, Error **errp)
>>  blk = g_new0(BlockBackend, 1);
>>  blk->name = g_strdup(name);
>>  blk->refcnt = 1;
>> +notifier_list_init(>remove_bs_notifiers);
>> +notifier_list_init(>insert_bs_notifiers);
>>  QTAILQ_INSERT_TAIL(_backends, blk, link);
>>  return blk;
>>  }
>> @@ -343,6 +347,8 @@ void blk_hide_on_behalf_of_hmp_drive_del(BlockBackend 
>> *blk)
>>   */
>>  void blk_remove_bs(BlockBackend *blk)
>>  {
>> +notifier_list_notify(>remove_bs_notifiers, blk);
>> +
>>  blk_update_root_state(blk);
>>  
>>  blk->bs->blk = NULL;
>> @@ -359,6 +365,8 @@ void blk_insert_bs(BlockBackend *blk, BlockDriverState 
>> *bs)
>>  bdrv_ref(bs);
>>  blk->bs = bs;
>>  bs->blk = blk;
>> +
>> +notifier_list_notify(>insert_bs_notifiers, blk);
>>  }
> 
> Do we want to notify on BB deletion, too? It's also some kind of removal
> of a connection between BB and BDS.  In other words, should blk_delete()
> call blk_remove_bs() rather than bdrv_unref()?
> 
> [ Edit: I see that's what the next patch does. Good. ]
> 
> Should blk_unref() also assert that the notifier list is empty?
> Otherwise we would be leaking notifiers.

You mean blk_delete()? I can do that, yes.

Max



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH v6 04/15] iotests: Move _filter_nbd into common.filter

2015-11-09 Thread Max Reitz
On 09.11.2015 17:04, Kevin Wolf wrote:
> Am 04.11.2015 um 19:57 hat Max Reitz geschrieben:
>> _filter_nbd can be useful for other NBD tests, too, therefore it should
>> reside in common.filter, and it should support URLs of the "nbd://"
>> format and export names.
>>
>> The NBD log lines ("/your/source/dir/nbd.c:function():line: error")
>> should not be converted to empty lines but removed altogether.
>>
>> Signed-off-by: Max Reitz 
> 
> Code motion and modification in the same patch is bad style. The changes
> look good, though.

Considering splitting this into two patches will result basically in
both of them each changing just as much as this single patch does
(because test 083 uses tabs instead of spaces) I'm inclined to just
change the commit title to "Remove filter_nbd and add _filter_nbd" instead.

There actually is no good style for this patch. One could argue that the
coding style in all of test 083 is broken since it uses tabs instead of
spaces, so first I'd need to fix up the style of 083 completely. Then,
in a second patch, I can drop those empty lines, and in a third patch I
can move the function. I consider that horrible when it's just about
getting the code to common.filter.

The second variant would be not to move the code, but to "move" it, i.e.
leave the coding style in 083 and just convert the style of this
function when moving it to common.filter. Well, if I'm already doing
that, I might just as well fix that empty line thing on the way.

The third variant would be to fix that empty line thing in 083, and fix
the code style of that single function along with it, and then move it
to common.filter in a separate patch.

And then we have the fourth way which would be to move nbd_filter to
common.filter as it is, and then fix up the coding style there.

So let's look at my opinion for each of them:
(1) I find it horrible, but I can do that.
(2) That's what I did and that's what I'd do again.
(3) I'm opposed to change the style of that one function inside of 083
without changing the rest of the file, but not strongly enough not
to do it.
(4) I will definitely not insert tabs, even if it's just code movement.

I still consider 2 very reasonable for the issue at hand since the
function is very small and it will have to be completely “rewritten”
anyway in some patch (because the tabs to spaces change is absolutely
necessary at some point when moving it from 083 to common.filter).
Therefore, I don't think reviewers gain anything from doing it any other
way.

I consider 1 (fixing up all of 083 just so that I can move this little
function) so horrible that I won't do it unless there is another way,
and 2 already is another way, so that's that.

I guess I'll go for 3 just so you can see why I chose 2 before. I can do
1 in v8 if you insist, so you can get to experience that, too.

Max



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH v7 12/35] util: let qemu_fd_getlength support block device

2015-11-09 Thread Eduardo Habkost
On Mon, Nov 09, 2015 at 01:58:27PM +0800, Xiao Guangrong wrote:
> 
> 
> On 11/06/2015 11:54 PM, Eduardo Habkost wrote:
> >On Mon, Nov 02, 2015 at 05:13:14PM +0800, Xiao Guangrong wrote:
> >>lseek can not work for all block devices as the man page says:
> >>| Some devices are incapable of seeking and POSIX does not specify
> >>| which devices must support lseek().
> >>
> >>This patch tries to add the support on Linux by using BLKGETSIZE64
> >>ioctl
> >>
> >>Signed-off-by: Xiao Guangrong 
> >
> >On which cases is this patch necessary? Do you know any examples of
> >Linux block devices that won't work with lseek(SEEK_END)?
> 
> To be honest, i have not checked all block device, this patch was made
> based on the man page. However, i do not mind drop this patch (and maybe
> other patches) to make this pachset smaller. BLKGETSIZE64 can be added
> in the future if we meet such device.

By looking at the Linux source code implementing BLKGETSIZE64, it looks
like it should work for all block devices where SEEK_END works:

* BLKGETSIZE64 returns i_size_read(bdev->bd_inode)
  (block/ioctl.c:blkdev_ioctl())
* llseek(SEEK_END) uses i_size_read(bd_inode) as the offset
  (fs/block_dev.c:block_llseek())

That's probably why raw_getlength() never needed a Linux-specific
BLKGETSIZE call.

-- 
Eduardo



Re: [Qemu-block] [PATCH v6 3/4] qmp: add monitor command to add/remove a child

2015-11-09 Thread Kevin Wolf
Am 16.10.2015 um 10:57 hat Wen Congyang geschrieben:
> +##
> +# @ChangeOperation:
> +#
> +# An enumeration of block device change operation.
> +#
> +# @add: Add a new block driver state to a existed block driver state.
> +#
> +# @delete: Delete a block driver state's child.
> +#
> +# Since: 2.5
> +##
> +{ 'enum': 'ChangeOperation',
> +  'data': [ 'add', 'delete' ] }

What's the advantage of this enum compared to separate QMP commands? The
way it is specified here, ChangeOperation is already implicit by whether
or not child and node are given.

> +##
> +# @x-blockdev-change
> +#
> +# Dynamic reconfigure the block driver state graph. It can be used to
> +# add, remove, insert, replace a block driver state. Currently only
> +# the Quorum driver implements this feature to add and remove its child.
> +# This is useful to fix a broken quorum child.
> +#
> +# @operation: the chanage operation. It can be add, delete.
> +#
> +# @parent: the id or node name of which node will be changed.
> +#
> +# @child: the child node-name which will be deleted.

#optional

Must be present for operation = delete, must not be present otherwise.

> +# @node: the new node-name which will be added.

#optional

Must be present for operation = add, must not be present otherwise.

> +#
> +# Note: this command is experimental, and not a stable API.
> +#
> +# Since: 2.5
> +##
> +{ 'command': 'x-blockdev-change',
> +  'data' : { 'operation': 'ChangeOperation',
> + 'parent': 'str',
> + '*child': 'str',
> + '*node': 'str' } }

Let me suggest this alternative:

{ 'command': 'x-blockdev-change',
  'data' : { 'parent': 'str',
 'child': 'str',
 '*node': 'str' } }

child doesn't describe a node name then, but a child name (adds a
dependency on my patches which add a name to BdrvChild, though).
Depending on whether node is given and whether the child already exists,
this may add, remove or replace a child.

Kevin



Re: [Qemu-block] [PATCH v6 03/15] block: Release dirty bitmaps in bdrv_close()

2015-11-09 Thread Kevin Wolf
Am 04.11.2015 um 19:57 hat Max Reitz geschrieben:
> bdrv_delete() is not very happy about deleting BlockDriverStates with
> dirty bitmaps still attached to them. In the past, we got around that
> very easily by relying on bdrv_close_all() bypassing bdrv_delete(), and
> bdrv_close() simply ignoring that condition. We should fix that by
> releasing all dirty bitmaps in bdrv_close()

This doesn't sound right. If there was a dirty bitmap, there must be a
user associated with it. Now that we simply free the bitmap, that user
has a dangling pointer.

An exception would be if we knew that the only "user" of this bitmap is
the monitor because the monitor doesn't actually maintain its own list
of bitmaps. However, it's doubtful whether bdrv_close() should remove
something that the QMP client added explicitly.

> and drop the assertion in bdrv_delete().

Why? It should still hold true.

Kevin



Re: [Qemu-block] [PATCH v6 12/15] blockdev: Keep track of monitor-owned BDS

2015-11-09 Thread Alberto Garcia
On Mon 09 Nov 2015 05:26:38 PM CET, Kevin Wolf wrote:

>> >  if (blk) {
>> >  blk_unref(blk);
>> >  } else {
>> > +QTAILQ_REMOVE(_bdrv_states, bs, monitor_list);
>> >  bdrv_unref(bs);
>> >  }
>> 
>> blk_unref(blk) will also unref the BDS (if there's any), so you also
>> need to update monitor_bdrv_states in that case, don't you?
>
> No, in that case the BDS referenced wasn't owned by the monitor in the
> first place. It was owned by the BB.

I see, I hadn't noticed that the BDS is added to monitor_bdrv_states
only if it is created without a BB.

You're right then, thanks!

Berto



Re: [Qemu-block] [PATCH] mirror: Improve zero-write and discard with fragmented image

2015-11-09 Thread Paolo Bonzini


On 09/11/2015 17:04, Kevin Wolf wrote:
> Am 06.11.2015 um 11:22 hat Fam Zheng geschrieben:
>> The "pnum < nb_sectors" condition in deciding whether to actually copy
>> data is unnecessarily strict, and the qiov initialization is
>> unnecessarily too, for both bdrv_aio_write_zeroes and bdrv_aio_discard
>> branches.
>>
>> Reorganize mirror_iteration flow so that we:
>>
>> 1) Find the contiguous zero/discarded sectors with
>> bdrv_get_block_status_above() before deciding what to do. We query
>> s->buf_size sized blocks at a time.
>>
>> 2) If the sectors in question are zeroed/discarded and aligned to
>> target cluster, issue zero write or discard accordingly. It's done
>> in mirror_do_zero_or_discard, where we don't add buffer to qiov.
>>
>> 3) Otherwise, do the same loop as before in mirror_do_read.
>>
>> Signed-off-by: Fam Zheng 
> 
> I'm not sure where in the patch to comment on this, so I'll just do it
> here right in the beginning.
> 
> I'm concerned that we need to be more careful about races in this patch,
> in particular regarding the bitmaps. I think the conditions for the two
> bitmaps are:
> 
> * Dirty bitmap: We must clear the bit after finding the next piece of
>   data to be mirrored, but before we yield after getting information
>   that we use for the decision which kind of operation we need.
> 
>   In other words, we need to clear the dirty bitmap bit before calling
>   bdrv_get_block_status_above(), because that's both the function that
>   retrieves information about the next chunk and also a function that
>   can yield.
> 
>   If after this point the data is written to, we need to mirror it
>   again.

With Fam's patch, that's not trivial for two reasons:

1) bdrv_get_block_status_above() can return a smaller amount than what
is asked.

2) the "read and write" case can handle s->granularity sectors per
iteration (many of them can be coalesced, but still that's how the
iteration works).

The simplest solution is to perform the query with s->granularity size
rather than s->buf_size.

Paolo

> * In-flight bitmap: We need to make sure that we never mirror the same
>   data twice at the same time as older data could overwrite newer data
>   then.
> 
>   Strictly speaking, it looks to me as if this meant we can delay
>   setting the bit until before we issue an AIO operation. It might be
>   more obviously correct to set it at the same time as the dirty bitmap
>   is cleared.





Re: [Qemu-block] [PATCH v6 03/15] block: Release dirty bitmaps in bdrv_close()

2015-11-09 Thread Max Reitz
On 09.11.2015 17:04, Kevin Wolf wrote:
> Am 04.11.2015 um 19:57 hat Max Reitz geschrieben:
>> bdrv_delete() is not very happy about deleting BlockDriverStates with
>> dirty bitmaps still attached to them. In the past, we got around that
>> very easily by relying on bdrv_close_all() bypassing bdrv_delete(), and
>> bdrv_close() simply ignoring that condition. We should fix that by
>> releasing all dirty bitmaps in bdrv_close()
> 
> This doesn't sound right. If there was a dirty bitmap, there must be a
> user associated with it. Now that we simply free the bitmap, that user
> has a dangling pointer.

Well, having an assertion there means that we already assumed that case
to be impossible. Even though it isn't, as you yourself describe:

> An exception would be if we knew that the only "user" of this bitmap is
> the monitor because the monitor doesn't actually maintain its own list
> of bitmaps. However, it's doubtful whether bdrv_close() should remove
> something that the QMP client added explicitly.

So you are proposing that bdrv_close() should fail if there are still
dirty bitmaps attached? I don't like that either.

The bitmaps are attached to the BDS, that much is exposed over QMP, too.
If the BDS is released it's only natural to assume that all its bitmaps
are released, too. If you don't want that, you need to make sure that
the monitor has a reference to the BDS itself so the user can defer the
call to blockdev-del until he's/she's ready.

Maybe we need some QMP command to fetch a reference for the monitor for
that to be more usable, I don't know. It will work with blockdev-add
alone, too, though.

>> and drop the assertion in bdrv_delete().
> 
> Why? It should still hold true.

But it does not for user-added bitmaps.

Actually, on master, you can't break that assertion by adding a bitmap
through QMP, but with the BB and media series, you can. And that's
because as of that series, eject will no longer force-call bdrv_close()
(which bypasses the assertion althogether) but bdrv_unref() instead
(leading to bdrv_delete(), and that gets you the assertion).

I'm not really keen on fixing that in that series, though, since I
consider leaking not much better than a failed assertion, especially
considering I'm trying to actually fix it here anyway.

Max



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH v6 12/15] blockdev: Keep track of monitor-owned BDS

2015-11-09 Thread Kevin Wolf
Am 09.11.2015 um 16:05 hat Alberto Garcia geschrieben:
> On Wed 04 Nov 2015 07:57:44 PM CET, Max Reitz wrote:
> > @@ -3519,11 +3537,18 @@ void qmp_x_blockdev_del(bool has_id, const char *id,
> > bdrv_get_device_or_node_name(bs));
> >  goto out;
> >  }
> > +
> > +if (!blk && !bs->monitor_list.tqe_prev) {
> > +error_setg(errp, "Node %s is not owned by the monitor",
> > +   bs->node_name);
> > +goto out;
> > +}
> >  }
> >  
> >  if (blk) {
> >  blk_unref(blk);
> >  } else {
> > +QTAILQ_REMOVE(_bdrv_states, bs, monitor_list);
> >  bdrv_unref(bs);
> >  }
> 
> blk_unref(blk) will also unref the BDS (if there's any), so you also
> need to update monitor_bdrv_states in that case, don't you?

No, in that case the BDS referenced wasn't owned by the monitor in the
first place. It was owned by the BB.

Kevin



Re: [Qemu-block] [PATCH v6 for-2.6 00/15] block: Rework bdrv_close_all()

2015-11-09 Thread Kevin Wolf
Am 04.11.2015 um 19:57 hat Max Reitz geschrieben:
> Currently, bdrv_close_all() force-closes all BDSs with a BlockBackend,
> which can lead to data corruption (see the iotest added in the final
> patch of this series) and is most certainly very ugly.
> 
> This series reworks bdrv_close_all() to instead eject the BDS trees from
> all BlockBackends and then close the monitor-owned BDS trees, which are
> the only BDSs without a BB. In effect, all BDSs are closed just by
> getting closed automatically due to their reference count becoming 0.
> 
> Note that the approach taken here leaks all BlockBackends. This does not
> really matter, however, since qemu is about to exit anyway.

Patches 1-2, 5-6, 8-15:
Reviewed-by: Kevin Wolf 



Re: [Qemu-block] [PATCH v6 04/15] iotests: Move _filter_nbd into common.filter

2015-11-09 Thread Kevin Wolf
Am 04.11.2015 um 19:57 hat Max Reitz geschrieben:
> _filter_nbd can be useful for other NBD tests, too, therefore it should
> reside in common.filter, and it should support URLs of the "nbd://"
> format and export names.
> 
> The NBD log lines ("/your/source/dir/nbd.c:function():line: error")
> should not be converted to empty lines but removed altogether.
> 
> Signed-off-by: Max Reitz 

Code motion and modification in the same patch is bad style. The changes
look good, though.

Kevin



Re: [Qemu-block] [PATCH v6 03/15] block: Release dirty bitmaps in bdrv_close()

2015-11-09 Thread Max Reitz
On 06.11.2015 19:59, John Snow wrote:
> 
> 
> On 11/04/2015 01:57 PM, Max Reitz wrote:
>> bdrv_delete() is not very happy about deleting BlockDriverStates with
>> dirty bitmaps still attached to them. In the past, we got around that
>> very easily by relying on bdrv_close_all() bypassing bdrv_delete(), and
>> bdrv_close() simply ignoring that condition. We should fix that by
>> releasing all dirty bitmaps in bdrv_close() and drop the assertion in
>> bdrv_delete().
>>
>> Signed-off-by: Max Reitz 
>> ---
>>  block.c | 20 +++-
>>  1 file changed, 19 insertions(+), 1 deletion(-)
>>
>> diff --git a/block.c b/block.c
>> index 3493501..23448ed 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -87,6 +87,8 @@ static int bdrv_open_inherit(BlockDriverState **pbs, const 
>> char *filename,
>>   const BdrvChildRole *child_role, Error **errp);
>>  
>>  static void bdrv_dirty_bitmap_truncate(BlockDriverState *bs);
>> +static void bdrv_release_all_dirty_bitmaps(BlockDriverState *bs);
>> +
>>  /* If non-zero, use only whitelisted block drivers */
>>  static int use_bdrv_whitelist;
>>  
>> @@ -1916,6 +1918,8 @@ void bdrv_close(BlockDriverState *bs)
>>  bdrv_drain(bs); /* in case flush left pending I/O */
>>  notifier_list_notify(>close_notifiers, bs);
>>  
>> +bdrv_release_all_dirty_bitmaps(bs);
>> +
>>  if (bs->blk) {
>>  blk_dev_change_media_cb(bs->blk, false);
>>  }
>> @@ -2123,7 +2127,6 @@ static void bdrv_delete(BlockDriverState *bs)
>>  assert(!bs->job);
>>  assert(bdrv_op_blocker_is_empty(bs));
>>  assert(!bs->refcnt);
>> -assert(QLIST_EMPTY(>dirty_bitmaps));
>>  
>>  bdrv_close(bs);
>>  
>> @@ -3318,6 +3321,21 @@ void bdrv_release_dirty_bitmap(BlockDriverState *bs, 
>> BdrvDirtyBitmap *bitmap)
>>  }
>>  }
>>  
>> +/**
>> + * Release all dirty bitmaps attached to a BDS, independently of whether 
>> they
>> + * are frozen or not (for use in bdrv_close()).
>> + */
> 
> This comment caught my attention ...
> 
>> +static void bdrv_release_all_dirty_bitmaps(BlockDriverState *bs)
>> +{
>> +BdrvDirtyBitmap *bm, *next;
>> +QLIST_FOREACH_SAFE(bm, >dirty_bitmaps, list, next) {
>> +QLIST_REMOVE(bm, list);
>> +hbitmap_free(bm->bitmap);
>> +g_free(bm->name);
>> +g_free(bm);
>> +}
>> +}
>> +
>>  void bdrv_disable_dirty_bitmap(BdrvDirtyBitmap *bitmap)
>>  {
>>  assert(!bdrv_dirty_bitmap_frozen(bitmap));
>>
> 
> Currently, the only way a bitmap could/should be frozen is if it is in
> use by a job. If a job is running, you probably shouldn't delete stuff
> it is using out from under it.

Right. Good thing I'm fixing everything and bdrv_close() will no longer
be called unless there are no references to the BDS anymore (and each
block job holds an own reference to its BDS). :-)

(At least that's how it's supposed to be; also, there is an
assert(!bs->job) in bdrv_close())

Also, I wanted to mirror current behavior. Right now, we are just
leaking all dirty bitmaps on bdrv_close() (bdrv_delete() asserts that
there are none, but if you invoke bdrv_close() directly...).

> I am assuming by now that it's actually likely that you've canceled any
> jobs attached to this node before you go through the motions of deleting
> it, and it should be safe to just call bdrv_release_dirty_bitmap ...

Well, yes, but then I'd have to iterate through all the dirty bitmaps to
call bdrv_release_dirty_bitmap() for each of them, and then
bdrv_release_dirty_bitmap() iterates through all of them again to check
whether it really belongs to the BDS. That seemed a bit like a waste.

> We don't want the two foreach loops though, so maybe just factor out a
> helper that bdrv_release_all_dirty_bitmaps and bdrv_release_dirty_bitmap
> can share. You can leave the frozen assertion
> in just bdrv_release_dirty_bitmap before it invokes the helper, so that
> bdrv_delete always succeeds in case something gets out-of-sync in the
> internals.

Hm, yes, that should do just fine, thanks.

> (Hmm, to prevent leaks, if you delete a bitmap that is frozen, you
> should probably call the helper on its child object too... or just make
> the helper recursive.)

I think it'll be actually better to just keep the assertion in and thus
not allow releasing frozen dirty bitmaps in bdrv_close(). In addition
I'll add an assertion that !bs->refcount to bdrv_close().

Max



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH v6 07/15] block: Move BDS close notifiers into BB

2015-11-09 Thread Kevin Wolf
Am 04.11.2015 um 19:57 hat Max Reitz geschrieben:
> The only remaining user of the BDS close notifiers is NBD which uses
> them to determine when a BDS tree is being ejected. This patch removes
> the BDS-level close notifiers and adds a notifier list to the
> BlockBackend structure that is invoked whenever a BDS is removed.
> 
> Symmetrically to that, another notifier list is added that is invoked
> whenever a BDS is inserted. The dataplane implementations for virtio-blk
> and virtio-scsi use both notifier types for setting up and removing op
> blockers. This is not only important for setting up the op blockers on
> insertion, but also for removing them on ejection since bdrv_delete()
> asserts that there are no op blockers set up.
> 
> Signed-off-by: Max Reitz 

I think this needs to be split into smaller patches:

1. Add the new BlockBackend notifiers
2. Use them in virtio-blk in order to fix... removable virtio-blk
   devices, or what is it?
3. Convert NBD
4. Remove old close notifiers

>  block.c |  7 
>  block/block-backend.c   | 19 +++---
>  blockdev-nbd.c  | 37 +---
>  hw/block/dataplane/virtio-blk.c | 77 
> +++--
>  hw/scsi/virtio-scsi.c   | 59 +++
>  include/block/block.h   |  1 -
>  include/block/block_int.h   |  2 --
>  include/hw/virtio/virtio-scsi.h | 10 ++
>  include/sysemu/block-backend.h  |  3 +-
>  nbd.c   | 13 +++
>  10 files changed, 159 insertions(+), 69 deletions(-)

> diff --git a/block/block-backend.c b/block/block-backend.c
> index 6f9309f..38580f7 100644
> --- a/block/block-backend.c
> +++ b/block/block-backend.c
> @@ -48,6 +48,8 @@ struct BlockBackend {
>  BlockdevOnError on_read_error, on_write_error;
>  bool iostatus_enabled;
>  BlockDeviceIoStatus iostatus;
> +
> +NotifierList remove_bs_notifiers, insert_bs_notifiers;
>  };
>  
>  typedef struct BlockBackendAIOCB {
> @@ -98,6 +100,8 @@ BlockBackend *blk_new(const char *name, Error **errp)
>  blk = g_new0(BlockBackend, 1);
>  blk->name = g_strdup(name);
>  blk->refcnt = 1;
> +notifier_list_init(>remove_bs_notifiers);
> +notifier_list_init(>insert_bs_notifiers);
>  QTAILQ_INSERT_TAIL(_backends, blk, link);
>  return blk;
>  }
> @@ -343,6 +347,8 @@ void blk_hide_on_behalf_of_hmp_drive_del(BlockBackend 
> *blk)
>   */
>  void blk_remove_bs(BlockBackend *blk)
>  {
> +notifier_list_notify(>remove_bs_notifiers, blk);
> +
>  blk_update_root_state(blk);
>  
>  blk->bs->blk = NULL;
> @@ -359,6 +365,8 @@ void blk_insert_bs(BlockBackend *blk, BlockDriverState 
> *bs)
>  bdrv_ref(bs);
>  blk->bs = bs;
>  bs->blk = blk;
> +
> +notifier_list_notify(>insert_bs_notifiers, blk);
>  }

Do we want to notify on BB deletion, too? It's also some kind of removal
of a connection between BB and BDS.  In other words, should blk_delete()
call blk_remove_bs() rather than bdrv_unref()?

[ Edit: I see that's what the next patch does. Good. ]

Should blk_unref() also assert that the notifier list is empty?
Otherwise we would be leaking notifiers.

Kevin



Re: [Qemu-block] [PATCH v6 07/15] block: Move BDS close notifiers into BB

2015-11-09 Thread Kevin Wolf
Am 09.11.2015 um 17:50 hat Max Reitz geschrieben:
> On 09.11.2015 17:04, Kevin Wolf wrote:
> > Am 04.11.2015 um 19:57 hat Max Reitz geschrieben:
> >> The only remaining user of the BDS close notifiers is NBD which uses
> >> them to determine when a BDS tree is being ejected. This patch removes
> >> the BDS-level close notifiers and adds a notifier list to the
> >> BlockBackend structure that is invoked whenever a BDS is removed.
> >>
> >> Symmetrically to that, another notifier list is added that is invoked
> >> whenever a BDS is inserted. The dataplane implementations for virtio-blk
> >> and virtio-scsi use both notifier types for setting up and removing op
> >> blockers. This is not only important for setting up the op blockers on
> >> insertion, but also for removing them on ejection since bdrv_delete()
> >> asserts that there are no op blockers set up.
> >>
> >> Signed-off-by: Max Reitz 
> > 
> > I think this needs to be split into smaller patches:
> > 
> > 1. Add the new BlockBackend notifiers
> > 2. Use them in virtio-blk in order to fix... removable virtio-blk
> >devices, or what is it?
> > 3. Convert NBD
> > 4. Remove old close notifiers
> 
> I'll do my best.
> 
> >>  block.c |  7 
> >>  block/block-backend.c   | 19 +++---
> >>  blockdev-nbd.c  | 37 +---
> >>  hw/block/dataplane/virtio-blk.c | 77 
> >> +++--
> >>  hw/scsi/virtio-scsi.c   | 59 +++
> >>  include/block/block.h   |  1 -
> >>  include/block/block_int.h   |  2 --
> >>  include/hw/virtio/virtio-scsi.h | 10 ++
> >>  include/sysemu/block-backend.h  |  3 +-
> >>  nbd.c   | 13 +++
> >>  10 files changed, 159 insertions(+), 69 deletions(-)
> > 
> >> diff --git a/block/block-backend.c b/block/block-backend.c
> >> index 6f9309f..38580f7 100644
> >> --- a/block/block-backend.c
> >> +++ b/block/block-backend.c
> >> @@ -48,6 +48,8 @@ struct BlockBackend {
> >>  BlockdevOnError on_read_error, on_write_error;
> >>  bool iostatus_enabled;
> >>  BlockDeviceIoStatus iostatus;
> >> +
> >> +NotifierList remove_bs_notifiers, insert_bs_notifiers;
> >>  };
> >>  
> >>  typedef struct BlockBackendAIOCB {
> >> @@ -98,6 +100,8 @@ BlockBackend *blk_new(const char *name, Error **errp)
> >>  blk = g_new0(BlockBackend, 1);
> >>  blk->name = g_strdup(name);
> >>  blk->refcnt = 1;
> >> +notifier_list_init(>remove_bs_notifiers);
> >> +notifier_list_init(>insert_bs_notifiers);
> >>  QTAILQ_INSERT_TAIL(_backends, blk, link);
> >>  return blk;
> >>  }
> >> @@ -343,6 +347,8 @@ void blk_hide_on_behalf_of_hmp_drive_del(BlockBackend 
> >> *blk)
> >>   */
> >>  void blk_remove_bs(BlockBackend *blk)
> >>  {
> >> +notifier_list_notify(>remove_bs_notifiers, blk);
> >> +
> >>  blk_update_root_state(blk);
> >>  
> >>  blk->bs->blk = NULL;
> >> @@ -359,6 +365,8 @@ void blk_insert_bs(BlockBackend *blk, BlockDriverState 
> >> *bs)
> >>  bdrv_ref(bs);
> >>  blk->bs = bs;
> >>  bs->blk = blk;
> >> +
> >> +notifier_list_notify(>insert_bs_notifiers, blk);
> >>  }
> > 
> > Do we want to notify on BB deletion, too? It's also some kind of removal
> > of a connection between BB and BDS.  In other words, should blk_delete()
> > call blk_remove_bs() rather than bdrv_unref()?
> > 
> > [ Edit: I see that's what the next patch does. Good. ]
> > 
> > Should blk_unref() also assert that the notifier list is empty?
> > Otherwise we would be leaking notifiers.
> 
> You mean blk_delete()? I can do that, yes.

Yes, sorry, that's what I meant.

Kevin


pgpbZhKTyGsG3.pgp
Description: PGP signature


Re: [Qemu-block] [PATCH] mirror: Improve zero-write and discard with fragmented image

2015-11-09 Thread Kevin Wolf
Am 09.11.2015 um 17:18 hat Paolo Bonzini geschrieben:
> 
> 
> On 09/11/2015 17:04, Kevin Wolf wrote:
> > Am 06.11.2015 um 11:22 hat Fam Zheng geschrieben:
> >> The "pnum < nb_sectors" condition in deciding whether to actually copy
> >> data is unnecessarily strict, and the qiov initialization is
> >> unnecessarily too, for both bdrv_aio_write_zeroes and bdrv_aio_discard
> >> branches.
> >>
> >> Reorganize mirror_iteration flow so that we:
> >>
> >> 1) Find the contiguous zero/discarded sectors with
> >> bdrv_get_block_status_above() before deciding what to do. We query
> >> s->buf_size sized blocks at a time.
> >>
> >> 2) If the sectors in question are zeroed/discarded and aligned to
> >> target cluster, issue zero write or discard accordingly. It's done
> >> in mirror_do_zero_or_discard, where we don't add buffer to qiov.
> >>
> >> 3) Otherwise, do the same loop as before in mirror_do_read.
> >>
> >> Signed-off-by: Fam Zheng 
> > 
> > I'm not sure where in the patch to comment on this, so I'll just do it
> > here right in the beginning.
> > 
> > I'm concerned that we need to be more careful about races in this patch,
> > in particular regarding the bitmaps. I think the conditions for the two
> > bitmaps are:
> > 
> > * Dirty bitmap: We must clear the bit after finding the next piece of
> >   data to be mirrored, but before we yield after getting information
> >   that we use for the decision which kind of operation we need.
> > 
> >   In other words, we need to clear the dirty bitmap bit before calling
> >   bdrv_get_block_status_above(), because that's both the function that
> >   retrieves information about the next chunk and also a function that
> >   can yield.
> > 
> >   If after this point the data is written to, we need to mirror it
> >   again.
> 
> With Fam's patch, that's not trivial for two reasons:
> 
> 1) bdrv_get_block_status_above() can return a smaller amount than what
> is asked.
> 
> 2) the "read and write" case can handle s->granularity sectors per
> iteration (many of them can be coalesced, but still that's how the
> iteration works).
> 
> The simplest solution is to perform the query with s->granularity size
> rather than s->buf_size.

Then we end up with many small operations, that's not what we want.

Why can't we mark up to s->buf_size dirty clusters as clean first, then
query the status, and mark all of those that we can't handle dirty
again?

Kevin

> > * In-flight bitmap: We need to make sure that we never mirror the same
> >   data twice at the same time as older data could overwrite newer data
> >   then.
> > 
> >   Strictly speaking, it looks to me as if this meant we can delay
> >   setting the bit until before we issue an AIO operation. It might be
> >   more obviously correct to set it at the same time as the dirty bitmap
> >   is cleared.
> 
> 



Re: [Qemu-block] [PATCH v6 12/15] blockdev: Keep track of monitor-owned BDS

2015-11-09 Thread Max Reitz
On 09.11.2015 16:05, Alberto Garcia wrote:
> On Wed 04 Nov 2015 07:57:44 PM CET, Max Reitz wrote:
>> @@ -3519,11 +3537,18 @@ void qmp_x_blockdev_del(bool has_id, const char *id,
>> bdrv_get_device_or_node_name(bs));
>>  goto out;
>>  }
>> +
>> +if (!blk && !bs->monitor_list.tqe_prev) {
>> +error_setg(errp, "Node %s is not owned by the monitor",
>> +   bs->node_name);
>> +goto out;
>> +}
>>  }
>>  
>>  if (blk) {
>>  blk_unref(blk);
>>  } else {
>> +QTAILQ_REMOVE(_bdrv_states, bs, monitor_list);
>>  bdrv_unref(bs);
>>  }
> 
> blk_unref(blk) will also unref the BDS (if there's any), so you also
> need to update monitor_bdrv_states in that case, don't you?

If we get to blk_unref(blk), that means there is a BB attached to that
BDS (if there is any). If that BDS is monitor-owned, that means its
refcount is at least 2 (one from the BB, one from the monitor).
Therefore, blockdev-del will fail before that point.

> Anyway, wouldn't it make more sense to do this in bdrv_delete() ?

No, I don't think so. The monitor_bdrv_states is not a plain list of
BDSs, but actually contains strong references. Every time a BDS is
entered there, that reference should be counted (which we don't actually
have to call bdrv_ref() for since we're just keeping the initial
refcount of 1 in qmp_blockdev_add()), and every time a BDS is removed,
its reference should be dropped, just like is done here.

That's the issue I had with this implementation of blockdev-del. It's
just dropping references without actually having those references. Now
we at least have the reference through monitor_bdrv_states, and in the
future, we'll have such a list of monitor references for BBs, too.

Max



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH v7 11/35] util: introduce qemu_file_getlength()

2015-11-09 Thread Eduardo Habkost
On Mon, Nov 09, 2015 at 12:44:55PM +0800, Xiao Guangrong wrote:
> On 11/06/2015 11:50 PM, Eduardo Habkost wrote:
> >As this patch affects raw_getlength(), CCing the raw block driver
> >maintainer and the qemu-block mailing list.
> 
> Eduardo, thanks for your reminder. I will keep CCing Kevin and qemu-block mail
> list for future version.
> 
> >
> >On Mon, Nov 02, 2015 at 05:13:13PM +0800, Xiao Guangrong wrote:
> >>It is used to get the size of the specified file, also qemu_fd_getlength()
> >>is introduced to unify the code with raw_getlength() in block/raw-posix.c
> >>
> >>Signed-off-by: Xiao Guangrong 
> >>---
> >>  block/raw-posix.c|  7 +--
> >>  include/qemu/osdep.h |  2 ++
> >>  util/osdep.c | 31 +++
> >
> >I know I was the one who suggested osdep.c, but maybe oslib-posix.c is a
> >more appropriate place for the new function?
> >
> 
> Since the function we introduced here can work on both windows and posix, so
> i thing osdep.c is the right place. Otherwise we should implement it for 
> multiple
> platforms.

I didn't notice it was going to be used by a platform-independent
qemu_file_getlength() function in addition to the posix-specific
raw_getlength(). Have you tested it on Windows, though?

If you didn't test it on Windows, what about keeping
qemu_file_getlength() available only on posix, by now? The only
users are raw-posix.c and hostmem-file.c, currently. If in the
future somebody need it on Windows, they can decide between
moving the SEEK_END code to osdep.c (after testing it), or moving
the existing raw-win32.c:raw_getlength() code to a
oslib-win32.c:qemu_file_getlength() function.

-- 
Eduardo



Re: [Qemu-block] [PATCH] mirror: Improve zero-write and discard with fragmented image

2015-11-09 Thread Fam Zheng
On Mon, 11/09 17:29, Kevin Wolf wrote:
> Am 09.11.2015 um 17:18 hat Paolo Bonzini geschrieben:
> > 
> > 
> > On 09/11/2015 17:04, Kevin Wolf wrote:
> > > Am 06.11.2015 um 11:22 hat Fam Zheng geschrieben:
> > >> The "pnum < nb_sectors" condition in deciding whether to actually copy
> > >> data is unnecessarily strict, and the qiov initialization is
> > >> unnecessarily too, for both bdrv_aio_write_zeroes and bdrv_aio_discard
> > >> branches.
> > >>
> > >> Reorganize mirror_iteration flow so that we:
> > >>
> > >> 1) Find the contiguous zero/discarded sectors with
> > >> bdrv_get_block_status_above() before deciding what to do. We query
> > >> s->buf_size sized blocks at a time.
> > >>
> > >> 2) If the sectors in question are zeroed/discarded and aligned to
> > >> target cluster, issue zero write or discard accordingly. It's done
> > >> in mirror_do_zero_or_discard, where we don't add buffer to qiov.
> > >>
> > >> 3) Otherwise, do the same loop as before in mirror_do_read.
> > >>
> > >> Signed-off-by: Fam Zheng 
> > > 
> > > I'm not sure where in the patch to comment on this, so I'll just do it
> > > here right in the beginning.
> > > 
> > > I'm concerned that we need to be more careful about races in this patch,
> > > in particular regarding the bitmaps. I think the conditions for the two
> > > bitmaps are:
> > > 
> > > * Dirty bitmap: We must clear the bit after finding the next piece of
> > >   data to be mirrored, but before we yield after getting information
> > >   that we use for the decision which kind of operation we need.
> > > 
> > >   In other words, we need to clear the dirty bitmap bit before calling
> > >   bdrv_get_block_status_above(), because that's both the function that
> > >   retrieves information about the next chunk and also a function that
> > >   can yield.
> > > 
> > >   If after this point the data is written to, we need to mirror it
> > >   again.
> > 
> > With Fam's patch, that's not trivial for two reasons:
> > 
> > 1) bdrv_get_block_status_above() can return a smaller amount than what
> > is asked.
> > 
> > 2) the "read and write" case can handle s->granularity sectors per
> > iteration (many of them can be coalesced, but still that's how the
> > iteration works).
> > 
> > The simplest solution is to perform the query with s->granularity size
> > rather than s->buf_size.
> 
> Then we end up with many small operations, that's not what we want.
> 
> Why can't we mark up to s->buf_size dirty clusters as clean first, then
> query the status, and mark all of those that we can't handle dirty
> again?

Then we may end up marking more clusters as dirty than it should be.

Because all bdrv_set_dirty() and bdrv_set_dirty_bitmap() callers are coroutine,
we can introduce a CoMutex to let bitmap reader block bdrv_set_dirty and
bdrv_set_dirty_bitmap.

Fam

> 
> Kevin
> 
> > > * In-flight bitmap: We need to make sure that we never mirror the same
> > >   data twice at the same time as older data could overwrite newer data
> > >   then.
> > > 
> > >   Strictly speaking, it looks to me as if this meant we can delay
> > >   setting the bit until before we issue an AIO operation. It might be
> > >   more obviously correct to set it at the same time as the dirty bitmap
> > >   is cleared.
> > 
> > 



Re: [Qemu-block] [PATCH v3] virtio-blk: trivial code optimization

2015-11-09 Thread Gonglei
On 2015/11/9 21:57, Stefan Hajnoczi wrote:
> On Mon, Nov 09, 2015 at 05:03:30PM +0800, arei.gong...@huawei.com wrote:
>> From: Gonglei 
>>
>> 1. avoid possible superflous checking
>> 2. make code more robustness
>>
>> Signed-off-by: Gonglei 
>> Reviewed-by: Fam Zheng 
>> ---
>> v3: change the third condition too [Paolo]
>> add Fam's R-by
>> ---
>>  hw/block/virtio-blk.c | 27 +--
>>  1 file changed, 9 insertions(+), 18 deletions(-)
>>
>> diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
>> index 093e475..9124358 100644
>> --- a/hw/block/virtio-blk.c
>> +++ b/hw/block/virtio-blk.c
>> @@ -404,24 +404,15 @@ void virtio_blk_submit_multireq(BlockBackend *blk, 
>> MultiReqBuffer *mrb)
>>  for (i = 0; i < mrb->num_reqs; i++) {
>>  VirtIOBlockReq *req = mrb->reqs[i];
>>  if (num_reqs > 0) {
>> -bool merge = true;
>> -
>> -/* merge would exceed maximum number of IOVs */
>> -if (niov + req->qiov.niov > IOV_MAX) {
>> -merge = false;
>> -}
>> -
>> -/* merge would exceed maximum transfer length of backend device 
>> */
>> -if (req->qiov.size / BDRV_SECTOR_SIZE + nb_sectors > 
>> max_xfer_len) {
>> -merge = false;
>> -}
>> -
>> -/* requests are not sequential */
>> -if (sector_num + nb_sectors != req->sector_num) {
>> -merge = false;
>> -}
>> -
>> -if (!merge) {
>> +/*
>> + * NOTE: We cannot merge the requests in below situations:
>> + * 1. requests are not sequential
>> + * 2. merge would exceed maximum number of IOVs
>> + * 3. merge would exceed maximum transfer length of backend 
>> device
>> + */
>> +if (sector_num + nb_sectors != req->sector_num ||
>> +niov > IOV_MAX - req->qiov.niov ||
>> +nb_sectors > max_xfer_len - req->qiov.size / 
>> BDRV_SECTOR_SIZE) {
> 
> nb_sectors - int
> max_xfer_len - int
> req->qiov.size - size_t
> BDRV_SECTOR_SIZE - unsigned long long
> 
> Therefore this expression is an int > unsigned long long comparison.
> 
Sorry, I'm confused.
max_xfer_len is int,
"req->qiov.size / BDRV_SECTOR_SIZE" is  unsigned long long,
so, "max_xfer_len - req->qiov.size / BDRV_SECTOR_SIZE" will be int,

and nb_sectors is int too, so this comparison is right. Am I wrong?

> req->qiov.size can be larger than max_xfer_len so this comparison
> suffers from underflow.  Please check that req->qiov.size <=
> max_xfer_len first.
> 

Regards,
-Gonglei






Re: [Qemu-block] [PATCH V5] block/nfs: add support for setting debug level

2015-11-09 Thread Fam Zheng
On Mon, 11/09 08:09, Peter Lieven wrote:
> recent libnfs versions support logging debug messages. Add
> support for it in qemu through an URL parameter.
> 
> Example:
>  qemu -cdrom nfs://127.0.0.1/iso/my.iso?debug=2
> 
> Signed-off-by: Peter Lieven 
> ---
> v4->v5: add a comment in the code why we limit the debug level [Stefan]
> v3->v4: revert to the initial version, but limit max debug level
> v2->v3: use a per-drive option instead of a global one. [Stefan]
> v1->v2: reworked patch to accept the debug level as a cmdline
> parameter instead of an URI parameter [Stefan]
> 
>  block/nfs.c | 12 
>  1 file changed, 12 insertions(+)
> 
> diff --git a/block/nfs.c b/block/nfs.c
> index fd79f89..ab1e267 100644
> --- a/block/nfs.c
> +++ b/block/nfs.c
> @@ -36,6 +36,7 @@
>  #include 
>  
>  #define QEMU_NFS_MAX_READAHEAD_SIZE 1048576
> +#define QEMU_NFS_MAX_DEBUG_LEVEL 2
>  
>  typedef struct NFSClient {
>  struct nfs_context *context;
> @@ -334,6 +335,17 @@ static int64_t nfs_client_open(NFSClient *client, const 
> char *filename,
>  }
>  nfs_set_readahead(client->context, val);
>  #endif
> +#ifdef LIBNFS_FEATURE_DEBUG
> +} else if (!strcmp(qp->p[i].name, "debug")) {
> +/* limit the maximum debug level to avoid potential flooding
> + * of our log files. */
> +if (val > QEMU_NFS_MAX_DEBUG_LEVEL) {
> +error_report("NFS Warning: Limiting NFS debug level"
> + " to %d", QEMU_NFS_MAX_DEBUG_LEVEL);
> +val = QEMU_NFS_MAX_DEBUG_LEVEL;
> +}
> +nfs_set_debug(client->context, val);
> +#endif

It would be slightly nicer to do:

} else if (!strcmp(qp->p[i].name, "debug")) {
#ifdef LIBNFS_FEATURE_DEBUG
/* limit the maximum debug level to avoid potential flooding
 * of our log files. */
if (val > QEMU_NFS_MAX_DEBUG_LEVEL) {
error_report("NFS Warning: Limiting NFS debug level"
 " to %d", QEMU_NFS_MAX_DEBUG_LEVEL);
val = QEMU_NFS_MAX_DEBUG_LEVEL;
}
nfs_set_debug(client->context, val);
#else
error_report("Specifying debug level is only supported by libnfs"
 " version >= XXX");
#endif

So that you know which one out of QEMU and libnfs to upgrade when the option is
refused. :)

But it's your call. Either way:

Reviewed-by: Fam Zheng 

>  } else {
>  error_setg(errp, "Unknown NFS parameter name: %s",
> qp->p[i].name);
> -- 
> 1.9.1
> 



[Qemu-block] [PATCH v2 6/8] blockdev: Remove blk_hide_on_behalf_of_hmp_drive_del()

2015-11-09 Thread Max Reitz
This function first removed the BlockBackend from the blk_backends list
and cleared its name so it would no longer be found by blk_name(); since
blk_next() now iterates through monitor_block_backends (which the BB is
removed from in hmp_drive_del()), this is no longer necessary.

Second, bdrv_make_anon() was called on the BDS. This was intended for
cases where the BDS was owned by that BB alone; in which case the BDS
will no longer exist at this point thanks to the blk_remove_bs() in
hmp_drive_del().

Therefore, this function does nothing useful anymore. Remove it.

Signed-off-by: Max Reitz 
---
 block/block-backend.c  | 25 ++---
 blockdev.c |  1 -
 include/sysemu/block-backend.h |  2 --
 3 files changed, 2 insertions(+), 26 deletions(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index e35d84f..609a045 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -68,7 +68,7 @@ static const AIOCBInfo block_backend_aiocb_info = {
 
 static void drive_info_del(DriveInfo *dinfo);
 
-/* All the BlockBackends (except for hidden ones) */
+/* All the BlockBackends */
 static QTAILQ_HEAD(, BlockBackend) blk_backends =
 QTAILQ_HEAD_INITIALIZER(blk_backends);
 
@@ -180,10 +180,7 @@ static void blk_delete(BlockBackend *blk)
 g_free(blk->root_state.throttle_group);
 throttle_group_unref(blk->root_state.throttle_state);
 }
-/* Avoid double-remove after blk_hide_on_behalf_of_hmp_drive_del() */
-if (blk->name[0]) {
-QTAILQ_REMOVE(_backends, blk, link);
-}
+QTAILQ_REMOVE(_backends, blk, link);
 g_free(blk->name);
 drive_info_del(blk->legacy_dinfo);
 g_free(blk);
@@ -403,24 +400,6 @@ BlockBackend *blk_by_legacy_dinfo(DriveInfo *dinfo)
 }
 
 /*
- * Hide @blk.
- * @blk must not have been hidden already.
- * Make attached BlockDriverState, if any, anonymous.
- * Once hidden, @blk is invisible to all functions that don't receive
- * it as argument.  For example, blk_by_name() won't return it.
- * Strictly for use by do_drive_del().
- * TODO get rid of it!
- */
-void blk_hide_on_behalf_of_hmp_drive_del(BlockBackend *blk)
-{
-QTAILQ_REMOVE(_backends, blk, link);
-blk->name[0] = 0;
-if (blk->bs) {
-bdrv_make_anon(blk->bs);
-}
-}
-
-/*
  * Disassociates the currently associated BlockDriverState from @blk.
  */
 void blk_remove_bs(BlockBackend *blk)
diff --git a/blockdev.c b/blockdev.c
index f5dde19..934b9d8 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2539,7 +2539,6 @@ void hmp_drive_del(Monitor *mon, const QDict *qdict)
  * then we can just get rid of the block driver state right here.
  */
 if (blk_get_attached_dev(blk)) {
-blk_hide_on_behalf_of_hmp_drive_del(blk);
 /* Further I/O must not pause the guest */
 blk_set_on_error(blk, BLOCKDEV_ON_ERROR_REPORT,
  BLOCKDEV_ON_ERROR_REPORT);
diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h
index 9a3d8db..5208c08 100644
--- a/include/sysemu/block-backend.h
+++ b/include/sysemu/block-backend.h
@@ -81,8 +81,6 @@ BlockDriverState *blk_bs(BlockBackend *blk);
 void blk_remove_bs(BlockBackend *blk);
 void blk_insert_bs(BlockBackend *blk, BlockDriverState *bs);
 
-void blk_hide_on_behalf_of_hmp_drive_del(BlockBackend *blk);
-
 void blk_iostatus_enable(BlockBackend *blk);
 bool blk_iostatus_is_enabled(const BlockBackend *blk);
 BlockDeviceIoStatus blk_iostatus(const BlockBackend *blk);
-- 
2.6.2




[Qemu-block] [PATCH 1/8] qapi: Drop QERR_UNKNOWN_BLOCK_FORMAT_FEATURE

2015-11-09 Thread Max Reitz
Just specifying a custom string is simpler in basically all places that
used it, and in addition, specifying the BB or node name is something we
generally do not do in other error messages when opening a BDS, so we
should not do it here.

This changes the output for iotest 036 (to the better, in my opinion),
so the reference output needs to be changed accordingly.

Signed-off-by: Max Reitz 
---
 block/qcow.c   |  6 +-
 block/qcow2.c  | 24 +---
 block/qed.c|  7 ++-
 block/vmdk.c   |  7 ++-
 include/qapi/qmp/qerror.h  |  3 ---
 tests/qemu-iotests/036.out | 16 
 6 files changed, 18 insertions(+), 45 deletions(-)

diff --git a/block/qcow.c b/block/qcow.c
index 635085e..2601954 100644
--- a/block/qcow.c
+++ b/block/qcow.c
@@ -119,11 +119,7 @@ static int qcow_open(BlockDriverState *bs, QDict *options, 
int flags,
 goto fail;
 }
 if (header.version != QCOW_VERSION) {
-char version[64];
-snprintf(version, sizeof(version), "QCOW version %" PRIu32,
- header.version);
-error_setg(errp, QERR_UNKNOWN_BLOCK_FORMAT_FEATURE,
-   bdrv_get_device_or_node_name(bs), "qcow", version);
+error_setg(errp, "Unsupported qcow version %" PRIu32, header.version);
 ret = -ENOTSUP;
 goto fail;
 }
diff --git a/block/qcow2.c b/block/qcow2.c
index 88f56c8..88e17aa 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -196,22 +196,8 @@ static void cleanup_unknown_header_ext(BlockDriverState 
*bs)
 }
 }
 
-static void GCC_FMT_ATTR(3, 4) report_unsupported(BlockDriverState *bs,
-Error **errp, const char *fmt, ...)
-{
-char msg[64];
-va_list ap;
-
-va_start(ap, fmt);
-vsnprintf(msg, sizeof(msg), fmt, ap);
-va_end(ap);
-
-error_setg(errp, QERR_UNKNOWN_BLOCK_FORMAT_FEATURE,
-   bdrv_get_device_or_node_name(bs), "qcow2", msg);
-}
-
-static void report_unsupported_feature(BlockDriverState *bs,
-Error **errp, Qcow2Feature *table, uint64_t mask)
+static void report_unsupported_feature(Error **errp, Qcow2Feature *table,
+   uint64_t mask)
 {
 char *features = g_strdup("");
 char *old;
@@ -236,7 +222,7 @@ static void report_unsupported_feature(BlockDriverState *bs,
 g_free(old);
 }
 
-report_unsupported(bs, errp, "%s", features);
+error_setg(errp, "Unsupported qcow2 feature(s): %s", features);
 g_free(features);
 }
 
@@ -853,7 +839,7 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, 
int flags,
 goto fail;
 }
 if (header.version < 2 || header.version > 3) {
-report_unsupported(bs, errp, "QCOW version %" PRIu32, header.version);
+error_setg(errp, "Unsupported qcow2 version %" PRIu32, header.version);
 ret = -ENOTSUP;
 goto fail;
 }
@@ -933,7 +919,7 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, 
int flags,
 void *feature_table = NULL;
 qcow2_read_extensions(bs, header.header_length, ext_end,
   _table, NULL);
-report_unsupported_feature(bs, errp, feature_table,
+report_unsupported_feature(errp, feature_table,
s->incompatible_features &
~QCOW2_INCOMPAT_MASK);
 ret = -ENOTSUP;
diff --git a/block/qed.c b/block/qed.c
index 5ea05d4..c15f0be 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -398,11 +398,8 @@ static int bdrv_qed_open(BlockDriverState *bs, QDict 
*options, int flags,
 }
 if (s->header.features & ~QED_FEATURE_MASK) {
 /* image uses unsupported feature bits */
-char buf[64];
-snprintf(buf, sizeof(buf), "%" PRIx64,
-s->header.features & ~QED_FEATURE_MASK);
-error_setg(errp, QERR_UNKNOWN_BLOCK_FORMAT_FEATURE,
-   bdrv_get_device_or_node_name(bs), "QED", buf);
+error_setg(errp, "Unsupported QED features: %" PRIx64,
+   s->header.features & ~QED_FEATURE_MASK);
 return -ENOTSUP;
 }
 if (!qed_is_cluster_size_valid(s->header.cluster_size)) {
diff --git a/block/vmdk.c b/block/vmdk.c
index 6f819e4..bcce9a7 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -645,11 +645,8 @@ static int vmdk_open_vmdk4(BlockDriverState *bs,
 }
 
 if (le32_to_cpu(header.version) > 3) {
-char buf[64];
-snprintf(buf, sizeof(buf), "VMDK version %" PRId32,
- le32_to_cpu(header.version));
-error_setg(errp, QERR_UNKNOWN_BLOCK_FORMAT_FEATURE,
-   bdrv_get_device_or_node_name(bs), "vmdk", buf);
+error_setg(errp, "Unsupported VMDK version %" PRIu32,
+   le32_to_cpu(header.version));
 return -ENOTSUP;
 } else if (le32_to_cpu(header.version) == 3 && (flags & BDRV_O_RDWR)) {
 /* VMware KB 2064959 explains that version 3 added support for

[Qemu-block] [PATCH 4/8] block: Drop BB name from bad option error

2015-11-09 Thread Max Reitz
The information which BB is concerned does not seem useful enough to
justify its existence in most other place (which may be related to qemu
printing the -drive parameter in question anyway, and for blockdev-add
the attribution is naturally unambiguous). Furthermore, as of a future
patch, bdrv_get_device_name(bs) will always return the empty string
before bdrv_open_inherit() returns.

Therefore, just dropping that information seems to be the best course of
action.

Signed-off-by: Max Reitz 
---
 block.c| 6 +++---
 tests/qemu-iotests/051.out | 8 
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/block.c b/block.c
index c5ea5e7..395b7b0 100644
--- a/block.c
+++ b/block.c
@@ -1539,9 +1539,9 @@ static int bdrv_open_inherit(BlockDriverState **pbs, 
const char *filename,
 error_setg(errp, "Block protocol '%s' doesn't support the option "
"'%s'", drv->format_name, entry->key);
 } else {
-error_setg(errp, "Block format '%s' used by device '%s' doesn't "
-   "support the option '%s'", drv->format_name,
-   bdrv_get_device_name(bs), entry->key);
+error_setg(errp,
+   "Block format '%s' doesn't support the option '%s'",
+   drv->format_name, entry->key);
 }
 
 ret = -EINVAL;
diff --git a/tests/qemu-iotests/051.out b/tests/qemu-iotests/051.out
index 7765aa0..c6df65f 100644
--- a/tests/qemu-iotests/051.out
+++ b/tests/qemu-iotests/051.out
@@ -5,16 +5,16 @@ Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728 
backing_file=TEST_DIR/
 === Unknown option ===
 
 Testing: -drive file=TEST_DIR/t.qcow2,format=qcow2,unknown_opt=
-QEMU_PROG: -drive file=TEST_DIR/t.qcow2,format=qcow2,unknown_opt=: Block 
format 'qcow2' used by device 'ide0-hd0' doesn't support the option 
'unknown_opt'
+QEMU_PROG: -drive file=TEST_DIR/t.qcow2,format=qcow2,unknown_opt=: Block 
format 'qcow2' doesn't support the option 'unknown_opt'
 
 Testing: -drive file=TEST_DIR/t.qcow2,format=qcow2,unknown_opt=on
-QEMU_PROG: -drive file=TEST_DIR/t.qcow2,format=qcow2,unknown_opt=on: Block 
format 'qcow2' used by device 'ide0-hd0' doesn't support the option 
'unknown_opt'
+QEMU_PROG: -drive file=TEST_DIR/t.qcow2,format=qcow2,unknown_opt=on: Block 
format 'qcow2' doesn't support the option 'unknown_opt'
 
 Testing: -drive file=TEST_DIR/t.qcow2,format=qcow2,unknown_opt=1234
-QEMU_PROG: -drive file=TEST_DIR/t.qcow2,format=qcow2,unknown_opt=1234: Block 
format 'qcow2' used by device 'ide0-hd0' doesn't support the option 
'unknown_opt'
+QEMU_PROG: -drive file=TEST_DIR/t.qcow2,format=qcow2,unknown_opt=1234: Block 
format 'qcow2' doesn't support the option 'unknown_opt'
 
 Testing: -drive file=TEST_DIR/t.qcow2,format=qcow2,unknown_opt=foo
-QEMU_PROG: -drive file=TEST_DIR/t.qcow2,format=qcow2,unknown_opt=foo: Block 
format 'qcow2' used by device 'ide0-hd0' doesn't support the option 
'unknown_opt'
+QEMU_PROG: -drive file=TEST_DIR/t.qcow2,format=qcow2,unknown_opt=foo: Block 
format 'qcow2' doesn't support the option 'unknown_opt'
 
 
 === Unknown protocol option ===
-- 
2.6.2




[Qemu-block] [PATCH 0/8] blockdev: (Nearly) free clean-up work

2015-11-09 Thread Max Reitz
After a lot has been restructed in the block layer in the past, we can
now reap at least one of the fruits: Make bdrv_open() return a BDS!

I tried to squeeze patch 8 of this series into my bdrv_close_all()
series but noticed that won't work because bdrv_open_inherit() still has
one instance of bdrv_close() which is actually pretty reasonable.
Therefore, that was impossible there.

However, that instance exists only because bdrv_open_inherit() does not
always create a new BDS but sometimes takes an existing one and opens a
file "into" that BDS, basically. That is ugly and we've tried to get rid
of that for a long time now. Then I noticed that nearly all the callers
of bdrv_open()/bdrv_open_inherit() no longer need that feature, there
was only one show-stopper left: blk_new_open(). That functions creates a
BDS using blk_new_with_bs(), which in turn uses bdrv_new_root() instead
of a plain bdrv_new().

However, the sole difference between bdrv_new_root() and bdrv_new() is
gone as of my "Further BlockBackend work" series, so we can make
blk_new_open() reuse the BDS creation capabilities of bdrv_open() as
well. And that's it!

And with bdv_open_inherit() always creating the BDS, we no longer need
to call bdrv_close() there, which allows us to assert that any BDS given
to bdrv_close() will have a refcount of 0 (i.e. is about to be deleted).


As hinted before, this series depends on v2 of my previous series
"blockdev: Further BlockBackend work".


Max Reitz (8):
  qapi: Drop QERR_UNKNOWN_BLOCK_FORMAT_FEATURE
  block: Drop useless bdrv_new() calls
  block: Let bdrv_open_inherit() return the snapshot
  block: Drop BB name from bad option error
  block: Drop blk_new_with_bs()
  block: Drop bdrv_new_root()
  block: Make bdrv_open() return a BDS
  block: Assert !bs->refcnt in bdrv_close()

 block.c| 139 ++---
 block/block-backend.c  |  31 +++---
 block/parallels.c  |   9 +--
 block/qcow.c   |  15 ++---
 block/qcow2.c  |  53 ++---
 block/qed.c|  18 ++
 block/sheepdog.c   |  16 +++---
 block/vdi.c|   7 +--
 block/vhdx.c   |   8 +--
 block/vmdk.c   |  31 +-
 block/vpc.c|   7 +--
 block/vvfat.c  |   9 ++-
 blockdev.c |  38 +
 include/block/block.h  |   6 +-
 include/qapi/qmp/qerror.h  |   3 -
 tests/qemu-iotests/036.out |  16 +++---
 tests/qemu-iotests/051.out |   8 +--
 17 files changed, 153 insertions(+), 261 deletions(-)

-- 
2.6.2




[Qemu-block] [PATCH 2/8] block: Drop useless bdrv_new() calls

2015-11-09 Thread Max Reitz
bdrv_append_temp_snapshot() uses bdrv_new() to create an empty BDS
before invoking bdrv_open() on that BDS. This is probably a relict from
when it used to do some modifications on that empty BDS, but now that is
unnecessary, so we can just set bs_snapshot to NULL and let bdrv_open()
do the rest.

The same applies to bdrv_open_backing_file(). There is even a bit more
cruft here: The assert() was intended for the BDS which is indirectly
passed as the first bdrv_open() argument, so we can now drop it, too.

Signed-off-by: Max Reitz 
---
 block.c | 9 ++---
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/block.c b/block.c
index 5b02990..aae1d69 100644
--- a/block.c
+++ b/block.c
@@ -1212,19 +1212,15 @@ int bdrv_open_backing_file(BlockDriverState *bs, QDict 
*options, Error **errp)
 goto free_exit;
 }
 
-backing_hd = bdrv_new();
-
 if (bs->backing_format[0] != '\0' && !qdict_haskey(options, "driver")) {
 qdict_put(options, "driver", qstring_from_str(bs->backing_format));
 }
 
-assert(bs->backing == NULL);
+backing_hd = NULL;
 ret = bdrv_open_inherit(_hd,
 *backing_filename ? backing_filename : NULL,
 NULL, options, 0, bs, _backing, _err);
 if (ret < 0) {
-bdrv_unref(backing_hd);
-backing_hd = NULL;
 bs->open_flags |= BDRV_O_NO_BACKING;
 error_setg(errp, "Could not open backing file: %s",
error_get_pretty(local_err));
@@ -1350,8 +1346,7 @@ int bdrv_append_temp_snapshot(BlockDriverState *bs, int 
flags, Error **errp)
 qdict_put(snapshot_options, "driver",
   qstring_from_str("qcow2"));
 
-bs_snapshot = bdrv_new();
-
+bs_snapshot = NULL;
 ret = bdrv_open(_snapshot, NULL, NULL, snapshot_options,
 flags, _err);
 if (ret < 0) {
-- 
2.6.2




[Qemu-block] [PATCH v2 3/8] block: Add blk_commit_all() and blk_invalidate_cache_all()

2015-11-09 Thread Max Reitz
These functions will be changed to iterate through the BDS trees as
defined by the BlockBackends instead of the list of root BDS, therefore
prepare moving their code to the BlockBackend level.

Signed-off-by: Max Reitz 
Reviewed-by: Eric Blake 
---
 block/block-backend.c  | 10 ++
 include/sysemu/block-backend.h |  2 ++
 2 files changed, 12 insertions(+)

diff --git a/block/block-backend.c b/block/block-backend.c
index ef387f8..b3d086d 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -1331,3 +1331,13 @@ BlockBackendRootState *blk_get_root_state(BlockBackend 
*blk)
 {
 return >root_state;
 }
+
+int blk_commit_all(void)
+{
+return bdrv_commit_all();
+}
+
+void blk_invalidate_cache_all(Error **errp)
+{
+bdrv_invalidate_cache_all(errp);
+}
diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h
index 967c768..ee36f93 100644
--- a/include/sysemu/block-backend.h
+++ b/include/sysemu/block-backend.h
@@ -129,8 +129,10 @@ int blk_co_discard(BlockBackend *blk, int64_t sector_num, 
int nb_sectors);
 int blk_co_flush(BlockBackend *blk);
 int blk_flush(BlockBackend *blk);
 int blk_flush_all(void);
+int blk_commit_all(void);
 void blk_drain(BlockBackend *blk);
 void blk_drain_all(void);
+void blk_invalidate_cache_all(Error **errp);
 void blk_set_on_error(BlockBackend *blk, BlockdevOnError on_read_error,
   BlockdevOnError on_write_error);
 BlockdevOnError blk_get_on_error(BlockBackend *blk, bool is_read);
-- 
2.6.2




[Qemu-block] [PATCH v2 1/8] block: Add blk_name_taken()

2015-11-09 Thread Max Reitz
There may be BlockBackends which are not returned by blk_by_name(), but
do exist and have a name. blk_name_taken() allows testing whether a
specific name is in use already, independent of whether the BlockBackend
with that name is accessible through blk_by_name().

Signed-off-by: Max Reitz 
Reviewed-by: Eric Blake 
---
 block.c|  4 ++--
 block/block-backend.c  | 19 ++-
 include/sysemu/block-backend.h |  1 +
 3 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/block.c b/block.c
index e3d5aea..faad9fb 100644
--- a/block.c
+++ b/block.c
@@ -777,8 +777,8 @@ static void bdrv_assign_node_name(BlockDriverState *bs,
 return;
 }
 
-/* takes care of avoiding namespaces collisions */
-if (blk_by_name(node_name)) {
+/* takes care of avoiding namespace collisions */
+if (blk_name_taken(node_name)) {
 error_setg(errp, "node-name=%s is conflicting with a device id",
node_name);
 goto out;
diff --git a/block/block-backend.c b/block/block-backend.c
index 7373093..7128adc 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -86,7 +86,7 @@ BlockBackend *blk_new(const char *name, Error **errp)
 error_setg(errp, "Invalid device name");
 return NULL;
 }
-if (blk_by_name(name)) {
+if (blk_name_taken(name)) {
 error_setg(errp, "Device with id '%s' already exists", name);
 return NULL;
 }
@@ -279,6 +279,23 @@ BlockBackend *blk_by_name(const char *name)
 }
 
 /*
+ * This function should be used to check whether a certain BlockBackend name is
+ * already taken; blk_by_name() will only search in the list of monitor-owned
+ * BlockBackends which is not necessarily complete.
+ */
+bool blk_name_taken(const char *name)
+{
+BlockBackend *blk;
+
+QTAILQ_FOREACH(blk, _backends, link) {
+if (!strcmp(name, blk->name)) {
+return true;
+}
+}
+return false;
+}
+
+/*
  * Return the BlockDriverState attached to @blk if any, else null.
  */
 BlockDriverState *blk_bs(BlockBackend *blk)
diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h
index 9fb3e54..0593231 100644
--- a/include/sysemu/block-backend.h
+++ b/include/sysemu/block-backend.h
@@ -71,6 +71,7 @@ void blk_unref(BlockBackend *blk);
 void blk_remove_all_bs(void);
 const char *blk_name(BlockBackend *blk);
 BlockBackend *blk_by_name(const char *name);
+bool blk_name_taken(const char *name);
 BlockBackend *blk_next(BlockBackend *blk);
 
 BlockDriverState *blk_bs(BlockBackend *blk);
-- 
2.6.2




[Qemu-block] [PATCH v2 2/8] block: Add blk_next_inserted()

2015-11-09 Thread Max Reitz
This function skips to the next BlockBackend for which blk_is_inserted()
is true.

Signed-off-by: Max Reitz 
Reviewed-by: Eric Blake 
---
 block/block-backend.c  | 15 +++
 include/sysemu/block-backend.h |  1 +
 2 files changed, 16 insertions(+)

diff --git a/block/block-backend.c b/block/block-backend.c
index 7128adc..ef387f8 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -252,6 +252,21 @@ BlockBackend *blk_next(BlockBackend *blk)
 }
 
 /*
+ * Like blk_next(), but skips all non-inserted BlockBackends (that is,
+ * BlockBackends for which blk_is_inserted() returns false)
+ */
+BlockBackend *blk_next_inserted(BlockBackend *blk)
+{
+while ((blk = blk_next(blk)) != NULL) {
+if (blk_is_inserted(blk)) {
+break;
+}
+}
+
+return blk;
+}
+
+/*
  * Return @blk's name, a non-null string.
  * Wart: the name is empty iff @blk has been hidden with
  * blk_hide_on_behalf_of_hmp_drive_del().
diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h
index 0593231..967c768 100644
--- a/include/sysemu/block-backend.h
+++ b/include/sysemu/block-backend.h
@@ -73,6 +73,7 @@ const char *blk_name(BlockBackend *blk);
 BlockBackend *blk_by_name(const char *name);
 bool blk_name_taken(const char *name);
 BlockBackend *blk_next(BlockBackend *blk);
+BlockBackend *blk_next_inserted(BlockBackend *blk);
 
 BlockDriverState *blk_bs(BlockBackend *blk);
 void blk_remove_bs(BlockBackend *blk);
-- 
2.6.2




[Qemu-block] [PATCH v7 10/24] iotests: Add test for eject under NBD server

2015-11-09 Thread Max Reitz
This patch adds a test for ejecting the BlockBackend an NBD server is
connected to (the NBD server is supposed to stop).

Signed-off-by: Max Reitz 
---
 tests/qemu-iotests/140 | 92 ++
 tests/qemu-iotests/140.out | 16 
 tests/qemu-iotests/group   |  1 +
 3 files changed, 109 insertions(+)
 create mode 100755 tests/qemu-iotests/140
 create mode 100644 tests/qemu-iotests/140.out

diff --git a/tests/qemu-iotests/140 b/tests/qemu-iotests/140
new file mode 100755
index 000..3434997
--- /dev/null
+++ b/tests/qemu-iotests/140
@@ -0,0 +1,92 @@
+#!/bin/bash
+#
+# Test case for ejecting a BB with an NBD server attached to it
+#
+# Copyright (C) 2015 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see .
+#
+
+# creator
+owner=mre...@redhat.com
+
+seq="$(basename $0)"
+echo "QA output created by $seq"
+
+here="$PWD"
+tmp=/tmp/$$
+status=1   # failure is the default!
+
+_cleanup()
+{
+_cleanup_test_img
+rm -f "$TEST_DIR/nbd"
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+. ./common.qemu
+
+_supported_fmt generic
+_supported_proto file
+_supported_os Linux
+
+_make_test_img 64k
+
+$QEMU_IO -c 'write -P 42 0 64k' "$TEST_IMG" | _filter_qemu_io
+
+keep_stderr=y \
+_launch_qemu -drive if=ide,media=cdrom,id=drv,file="$TEST_IMG",format=$IMGFMT \
+2> >(_filter_nbd)
+
+_send_qemu_cmd $QEMU_HANDLE \
+"{ 'execute': 'qmp_capabilities' }" \
+'return'
+
+_send_qemu_cmd $QEMU_HANDLE \
+"{ 'execute': 'nbd-server-start',
+   'arguments': { 'addr': { 'type': 'unix',
+'data': { 'path': '$TEST_DIR/nbd' " \
+'return'
+
+_send_qemu_cmd $QEMU_HANDLE \
+"{ 'execute': 'nbd-server-add',
+   'arguments': { 'device': 'drv' }}" \
+'return'
+
+$QEMU_IO_PROG -f raw -c 'read -P 42 0 64k' \
+"nbd+unix:///drv?socket=$TEST_DIR/nbd" 2>&1 \
+| _filter_qemu_io | _filter_nbd
+
+_send_qemu_cmd $QEMU_HANDLE \
+"{ 'execute': 'eject',
+   'arguments': { 'device': 'drv' }}" \
+'return'
+
+$QEMU_IO_PROG -f raw -c close \
+"nbd+unix:///drv?socket=$TEST_DIR/nbd" 2>&1 \
+| _filter_qemu_io | _filter_nbd
+
+_send_qemu_cmd $QEMU_HANDLE \
+"{ 'execute': 'quit' }" \
+'return'
+
+wait=1 _cleanup_qemu
+
+# success, all done
+echo '*** done'
+rm -f $seq.full
+status=0
diff --git a/tests/qemu-iotests/140.out b/tests/qemu-iotests/140.out
new file mode 100644
index 000..0087909
--- /dev/null
+++ b/tests/qemu-iotests/140.out
@@ -0,0 +1,16 @@
+QA output created by 140
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=65536
+wrote 65536/65536 bytes at offset 0
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+{"return": {}}
+{"return": {}}
+{"return": {}}
+read 65536/65536 bytes at offset 0
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
"DEVICE_TRAY_MOVED", "data": {"device": "drv", "tray-open": true}}
+{"return": {}}
+qemu-io: can't open device nbd+unix:///drv?socket=TEST_DIR/nbd: Failed to read 
export length
+no file open, try 'help open'
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
"SHUTDOWN"}
+*** done
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index c69265d..993711b 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -139,3 +139,4 @@
 137 rw auto
 138 rw auto quick
 139 rw auto quick
+140 rw auto quick
-- 
2.6.2




[Qemu-block] [PATCH v7 01/24] blockdev: Add missing bdrv_unref() in drive-backup

2015-11-09 Thread Max Reitz
All error paths after a successful bdrv_open() of target_bs should
contain a bdrv_unref(target_bs). This one did not yet, so add it.

Signed-off-by: Max Reitz 
Reviewed-by: Alberto Garcia 
Reviewed-by: Kevin Wolf 
---
 blockdev.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/blockdev.c b/blockdev.c
index 8607df9..3185808 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2914,6 +2914,7 @@ void qmp_drive_backup(const char *device, const char 
*target,
 bmap = bdrv_find_dirty_bitmap(bs, bitmap);
 if (!bmap) {
 error_setg(errp, "Bitmap '%s' could not be found", bitmap);
+bdrv_unref(target_bs);
 goto out;
 }
 }
-- 
2.6.2




[Qemu-block] [PATCH v7 for-2.6 00/24] block: Rework bdrv_close_all()

2015-11-09 Thread Max Reitz
Currently, bdrv_close_all() force-closes all BDSs with a BlockBackend,
which can lead to data corruption (see the iotest added in the final
patch of this series) and is most certainly very ugly.

This series reworks bdrv_close_all() to instead eject the BDS trees from
all BlockBackends and then close the monitor-owned BDS trees, which are
the only BDSs without a BB. In effect, all BDSs are closed just by
getting closed automatically due to their reference count becoming 0.

Note that the approach taken here leaks all BlockBackends. This does not
really matter, however, since qemu is about to exit anyway.


v7:
- Patch 3:
  - Put common functionality of bdrv_release_*dirty_bitmap*() into a new
function bdrv_do_release_matching_dirty_bitmap() [John]
  - Keep assertion against any bitmap being frozen in
bdrv_release_all_dirty_bitmaps() [John]
- Patches 4, 5, 6, 7, 8:
  - Split off from old patch 4 [Kevin]
  - Also: More code style changes ("() {" -> "()\n{")
  - Also: Support for nbd+unix URLs
- Patch 10: Use nbd+unix [Fam for v5]
- Patches 11, 12, 13, 14, 15: Split off from old patch 7 [Kevin]
- Patch 24: Added [Paolo]


git-backport-diff against v6:

Key:
[] : patches are identical
[] : number of functional differences between upstream/downstream patch
[down] : patch is downstream-only
The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively

001/24:[] [--] 'blockdev: Add missing bdrv_unref() in drive-backup'
002/24:[] [--] 'blockjob: Call bdrv_unref() on creation error'
003/24:[0035] [FC] 'block: Release dirty bitmaps in bdrv_close()'
004/24:[down] 'iotests: Rename filter_nbd to _filter_nbd in 083'
005/24:[down] 'iotests: Change coding style of _filter_nbd in 083'
006/24:[0035] [FC] 'iotests: Move _filter_nbd into common.filter'
007/24:[down] 'iotests: Make _filter_nbd drop log lines'
008/24:[down] 'iotests: Make _filter_nbd support more URL types'
009/24:[] [--] 'iotests: Make redirecting qemu's stderr optional'
010/24:[0016] [FC] 'iotests: Add test for eject under NBD server'
011/24:[down] 'block: Add BB-BDS remove/insert notifiers'
012/24:[down] 'virtio-blk: Functions for op blocker management'
013/24:[down] 'virtio-scsi: Catch BDS-BB removal/insertion'
014/24:[down] 'nbd: Switch from close to eject notifier'
015/24:[down] 'block: Remove BDS close notifier'
016/24:[] [-C] 'block: Use blk_remove_bs() in blk_delete()'
017/24:[] [--] 'blockdev: Use blk_remove_bs() in do_drive_del()'
018/24:[] [--] 'block: Make bdrv_close() static'
019/24:[] [--] 'block: Add list of all BlockDriverStates'
020/24:[] [--] 'blockdev: Keep track of monitor-owned BDS'
021/24:[] [--] 'block: Add blk_remove_all_bs()'
022/24:[] [-C] 'block: Rewrite bdrv_close_all()'
023/24:[] [--] 'iotests: Add test for multiple BB on BDS tree'
024/24:[down] 'iotests: Add test for block jobs and BDS ejection'


Max Reitz (24):
  blockdev: Add missing bdrv_unref() in drive-backup
  blockjob: Call bdrv_unref() on creation error
  block: Release dirty bitmaps in bdrv_close()
  iotests: Rename filter_nbd to _filter_nbd in 083
  iotests: Change coding style of _filter_nbd in 083
  iotests: Move _filter_nbd into common.filter
  iotests: Make _filter_nbd drop log lines
  iotests: Make _filter_nbd support more URL types
  iotests: Make redirecting qemu's stderr optional
  iotests: Add test for eject under NBD server
  block: Add BB-BDS remove/insert notifiers
  virtio-blk: Functions for op blocker management
  virtio-scsi: Catch BDS-BB removal/insertion
  nbd: Switch from close to eject notifier
  block: Remove BDS close notifier
  block: Use blk_remove_bs() in blk_delete()
  blockdev: Use blk_remove_bs() in do_drive_del()
  block: Make bdrv_close() static
  block: Add list of all BlockDriverStates
  blockdev: Keep track of monitor-owned BDS
  block: Add blk_remove_all_bs()
  block: Rewrite bdrv_close_all()
  iotests: Add test for multiple BB on BDS tree
  iotests: Add test for block jobs and BDS ejection

 block.c| 100 +++-
 block/block-backend.c  |  43 +++--
 blockdev-nbd.c |  37 +---
 blockdev.c |  28 +-
 blockjob.c |   1 +
 hw/block/dataplane/virtio-blk.c|  77 +++
 hw/scsi/virtio-scsi.c  |  59 
 include/block/block.h  |   2 -
 include/block/block_int.h  |   8 +-
 include/hw/virtio/virtio-scsi.h|  10 ++
 include/sysemu/block-backend.h |   4 +-
 nbd.c  |  13 +++
 stubs/Makefile.objs|   1 +
 stubs/blockdev-close-all-bdrv-states.c |   5 +
 tests/qemu-iotests/083 |  13 +--
 tests/qemu-iotests/083.out |  10 --
 tests/qemu-iotests/117 |  86 +
 tests/qemu-iotests/117.out |  14 +++
 tests/qemu-iotests/140  

[Qemu-block] [PATCH v7 04/24] iotests: Rename filter_nbd to _filter_nbd in 083

2015-11-09 Thread Max Reitz
In the patch after the next, this function is moved to common.filter.
Therefore, its name should be preceded by an underscore to signify its
global availability.

To keep the code motion patch clean, we cannot rename it in the same
patch, so we need to choose some order of renaming vs. motion. It is
better to keep a supposedly global function used by only a single test
in that test than to keep a supposedly local function in a common* file
and use it from a test, so we should rename the function before moving
it.

Signed-off-by: Max Reitz 
---
 tests/qemu-iotests/083 | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/083 b/tests/qemu-iotests/083
index 1b2d3f1..664f0cf 100755
--- a/tests/qemu-iotests/083
+++ b/tests/qemu-iotests/083
@@ -49,7 +49,7 @@ wait_for_tcp_port() {
done
 }
 
-filter_nbd() {
+_filter_nbd() {
# nbd.c error messages contain function names and line numbers that are 
prone
# to change.  Message ordering depends on timing between send and 
receive
# callbacks sometimes, making them unreliable.
@@ -84,7 +84,7 @@ EOF
 
$PYTHON nbd-fault-injector.py $extra_args "127.0.0.1:$port" 
"$TEST_DIR/nbd-fault-injector.conf" 2>&1 >/dev/null &
wait_for_tcp_port "127\\.0\\.0\\.1:$port"
-   $QEMU_IO -c "read 0 512" "$nbd_url" 2>&1 | _filter_qemu_io | filter_nbd
+   $QEMU_IO -c "read 0 512" "$nbd_url" 2>&1 | _filter_qemu_io | _filter_nbd
 
echo
 }
-- 
2.6.2




[Qemu-block] [PATCH v7 09/24] iotests: Make redirecting qemu's stderr optional

2015-11-09 Thread Max Reitz
Redirecting qemu's stderr to stdout makes working with the stderr output
difficult due to the other file descriptor magic performed in
_launch_qemu ("ambiguous redirect").

Add an option which specifies whether stderr should be redirected to
stdout or not (allowing for other modes to be added in the future).

Signed-off-by: Max Reitz 
Reviewed-by: Kevin Wolf 
---
 tests/qemu-iotests/common.qemu | 15 +--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/tests/qemu-iotests/common.qemu b/tests/qemu-iotests/common.qemu
index 8bf3969..2548a87 100644
--- a/tests/qemu-iotests/common.qemu
+++ b/tests/qemu-iotests/common.qemu
@@ -129,6 +129,8 @@ function _send_qemu_cmd()
 # $qemu_comm_method: set this variable to 'monitor' (case insensitive)
 #to use the QEMU HMP monitor for communication.
 #Otherwise, the default of QMP is used.
+# $keep_stderr: Set this variable to 'y' to keep QEMU's stderr output on 
stderr.
+#   If this variable is empty, stderr will be redirected to stdout.
 # Returns:
 # $QEMU_HANDLE: set to a handle value to communicate with this QEMU instance.
 #
@@ -151,11 +153,20 @@ function _launch_qemu()
 mkfifo "${fifo_out}"
 mkfifo "${fifo_in}"
 
-QEMU_NEED_PID='y'\
-${QEMU} -nographic -serial none ${comm} -machine accel=qtest "${@}" \
+if [ -z "$keep_stderr" ]; then
+QEMU_NEED_PID='y'\
+${QEMU} -nographic -serial none ${comm} -machine accel=qtest "${@}" \
 >"${fifo_out}" 
\
 2>&1 \
 <"${fifo_in}" &
+elif [ "$keep_stderr" = "y" ]; then
+QEMU_NEED_PID='y'\
+${QEMU} -nographic -serial none ${comm} -machine accel=qtest "${@}" \
+>"${fifo_out}" 
\
+<"${fifo_in}" &
+else
+exit 1
+fi
 
 if [[ "${BASH_VERSINFO[0]}" -ge "5" ||
 ("${BASH_VERSINFO[0]}" -ge "4"  &&  "${BASH_VERSINFO[1]}" -ge "1") ]]
-- 
2.6.2




[Qemu-block] [PATCH v7 08/24] iotests: Make _filter_nbd support more URL types

2015-11-09 Thread Max Reitz
This function should support URLs of the "nbd://" format (without
swallowing the export name), and for "nbd:///" URLs it should replace
"?socket=$TEST_DIR" by "?socket=TEST_DIR" because putting the Unix
socket files into the test directory makes sense.

Signed-off-by: Max Reitz 
---
 tests/qemu-iotests/common.filter | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/common.filter b/tests/qemu-iotests/common.filter
index 0fc443a..bd53cab 100644
--- a/tests/qemu-iotests/common.filter
+++ b/tests/qemu-iotests/common.filter
@@ -238,7 +238,8 @@ _filter_nbd()
 #
 # Filter out the TCP port number since this changes between runs.
 sed -e '/nbd\.c:/d' \
--e 's#nbd:127\.0\.0\.1:[^:]*:#nbd:127\.0\.0\.1:PORT:#g' \
+-e 's#nbd:\(//\)\?127\.0\.0\.1:[0-9]*#nbd:\1127.0.0.1:PORT#g' \
+-e "s#?socket=$TEST_DIR#?socket=TEST_DIR#g" \
 -e 's#\(exportname=foo\|PORT\): Failed to .*$#\1#'
 }
 
-- 
2.6.2




[Qemu-block] [PATCH v7 07/24] iotests: Make _filter_nbd drop log lines

2015-11-09 Thread Max Reitz
The NBD log lines ("/your/source/dir/nbd.c:function():line: error")
should not be converted to empty lines but removed altogether.

Signed-off-by: Max Reitz 
---
 tests/qemu-iotests/083.out   | 10 --
 tests/qemu-iotests/common.filter |  2 +-
 2 files changed, 1 insertion(+), 11 deletions(-)

diff --git a/tests/qemu-iotests/083.out b/tests/qemu-iotests/083.out
index 8c1441b..5c9141b 100644
--- a/tests/qemu-iotests/083.out
+++ b/tests/qemu-iotests/083.out
@@ -51,7 +51,6 @@ no file open, try 'help open'
 
 === Check disconnect after neg2 ===
 
-
 read failed: Input/output error
 
 === Check disconnect 8 neg2 ===
@@ -66,42 +65,34 @@ no file open, try 'help open'
 
 === Check disconnect before request ===
 
-
 read failed: Input/output error
 
 === Check disconnect after request ===
 
-
 read failed: Input/output error
 
 === Check disconnect before reply ===
 
-
 read failed: Input/output error
 
 === Check disconnect after reply ===
 
-
 read failed: Input/output error
 
 === Check disconnect 4 reply ===
 
-
 read failed: Input/output error
 
 === Check disconnect 8 reply ===
 
-
 read failed: Input/output error
 
 === Check disconnect before data ===
 
-
 read failed: Input/output error
 
 === Check disconnect after data ===
 
-
 read 512/512 bytes at offset 0
 512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 
@@ -132,7 +123,6 @@ no file open, try 'help open'
 
 === Check disconnect after neg-classic ===
 
-
 read failed: Input/output error
 
 *** done
diff --git a/tests/qemu-iotests/common.filter b/tests/qemu-iotests/common.filter
index aa2fb8d..0fc443a 100644
--- a/tests/qemu-iotests/common.filter
+++ b/tests/qemu-iotests/common.filter
@@ -237,7 +237,7 @@ _filter_nbd()
 # receive callbacks sometimes, making them unreliable.
 #
 # Filter out the TCP port number since this changes between runs.
-sed -e 's#^.*nbd\.c:.*##g' \
+sed -e '/nbd\.c:/d' \
 -e 's#nbd:127\.0\.0\.1:[^:]*:#nbd:127\.0\.0\.1:PORT:#g' \
 -e 's#\(exportname=foo\|PORT\): Failed to .*$#\1#'
 }
-- 
2.6.2




[Qemu-block] [PATCH v7 20/24] blockdev: Keep track of monitor-owned BDS

2015-11-09 Thread Max Reitz
Signed-off-by: Max Reitz 
Reviewed-by: Kevin Wolf 
---
 blockdev.c | 25 +
 include/block/block_int.h  |  4 
 stubs/Makefile.objs|  1 +
 stubs/blockdev-close-all-bdrv-states.c |  5 +
 4 files changed, 35 insertions(+)
 create mode 100644 stubs/blockdev-close-all-bdrv-states.c

diff --git a/blockdev.c b/blockdev.c
index a70a684..8211eae 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -50,6 +50,9 @@
 #include "trace.h"
 #include "sysemu/arch_init.h"
 
+static QTAILQ_HEAD(, BlockDriverState) monitor_bdrv_states =
+QTAILQ_HEAD_INITIALIZER(monitor_bdrv_states);
+
 static const char *const if_name[IF_COUNT] = {
 [IF_NONE] = "none",
 [IF_IDE] = "ide",
@@ -662,6 +665,19 @@ fail:
 return NULL;
 }
 
+void blockdev_close_all_bdrv_states(void)
+{
+BlockDriverState *bs, *next_bs;
+
+QTAILQ_FOREACH_SAFE(bs, _bdrv_states, monitor_list, next_bs) {
+AioContext *ctx = bdrv_get_aio_context(bs);
+
+aio_context_acquire(ctx);
+bdrv_unref(bs);
+aio_context_release(ctx);
+}
+}
+
 static void qemu_opt_rename(QemuOpts *opts, const char *from, const char *to,
 Error **errp)
 {
@@ -3464,6 +3480,8 @@ void qmp_blockdev_add(BlockdevOptions *options, Error 
**errp)
 if (!bs) {
 goto fail;
 }
+
+QTAILQ_INSERT_TAIL(_bdrv_states, bs, monitor_list);
 }
 
 if (bs && bdrv_key_required(bs)) {
@@ -3534,11 +3552,18 @@ void qmp_x_blockdev_del(bool has_id, const char *id,
bdrv_get_device_or_node_name(bs));
 goto out;
 }
+
+if (!blk && !bs->monitor_list.tqe_prev) {
+error_setg(errp, "Node %s is not owned by the monitor",
+   bs->node_name);
+goto out;
+}
 }
 
 if (blk) {
 blk_unref(blk);
 } else {
+QTAILQ_REMOVE(_bdrv_states, bs, monitor_list);
 bdrv_unref(bs);
 }
 
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 8c82747..7408eef 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -424,6 +424,8 @@ struct BlockDriverState {
 QTAILQ_ENTRY(BlockDriverState) device_list;
 /* element of the list of all BlockDriverStates (all_bdrv_states) */
 QTAILQ_ENTRY(BlockDriverState) bs_list;
+/* element of the list of monitor-owned BDS */
+QTAILQ_ENTRY(BlockDriverState) monitor_list;
 QLIST_HEAD(, BdrvDirtyBitmap) dirty_bitmaps;
 int refcnt;
 
@@ -680,4 +682,6 @@ void blk_dev_resize_cb(BlockBackend *blk);
 void bdrv_set_dirty(BlockDriverState *bs, int64_t cur_sector, int nr_sectors);
 bool bdrv_requests_pending(BlockDriverState *bs);
 
+void blockdev_close_all_bdrv_states(void);
+
 #endif /* BLOCK_INT_H */
diff --git a/stubs/Makefile.objs b/stubs/Makefile.objs
index 330e1a4..c1f02be 100644
--- a/stubs/Makefile.objs
+++ b/stubs/Makefile.objs
@@ -1,5 +1,6 @@
 stub-obj-y += arch-query-cpu-def.o
 stub-obj-y += bdrv-commit-all.o
+stub-obj-y += blockdev-close-all-bdrv-states.o
 stub-obj-y += clock-warp.o
 stub-obj-y += cpu-get-clock.o
 stub-obj-y += cpu-get-icount.o
diff --git a/stubs/blockdev-close-all-bdrv-states.c 
b/stubs/blockdev-close-all-bdrv-states.c
new file mode 100644
index 000..12d2442
--- /dev/null
+++ b/stubs/blockdev-close-all-bdrv-states.c
@@ -0,0 +1,5 @@
+#include "block/block_int.h"
+
+void blockdev_close_all_bdrv_states(void)
+{
+}
-- 
2.6.2




[Qemu-block] [PATCH v7 21/24] block: Add blk_remove_all_bs()

2015-11-09 Thread Max Reitz
When bdrv_close_all() is called, instead of force-closing all root
BlockDriverStates, it is better to just drop the reference from all
BlockBackends and let them be closed automatically. This prevents BDS
from getting closed that are still referenced by other BDS, which may
result in loss of cached data.

This patch adds a function for doing that, but does not yet incorporate
it in bdrv_close_all().

Signed-off-by: Max Reitz 
Reviewed-by: Kevin Wolf 
---
 block/block-backend.c  | 15 +++
 include/sysemu/block-backend.h |  1 +
 2 files changed, 16 insertions(+)

diff --git a/block/block-backend.c b/block/block-backend.c
index 4523387..7373093 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -221,6 +221,21 @@ void blk_unref(BlockBackend *blk)
 }
 }
 
+void blk_remove_all_bs(void)
+{
+BlockBackend *blk;
+
+QTAILQ_FOREACH(blk, _backends, link) {
+AioContext *ctx = blk_get_aio_context(blk);
+
+aio_context_acquire(ctx);
+if (blk->bs) {
+blk_remove_bs(blk);
+}
+aio_context_release(ctx);
+}
+}
+
 /*
  * Return the BlockBackend after @blk.
  * If @blk is null, return the first one.
diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h
index abb52be..9fb3e54 100644
--- a/include/sysemu/block-backend.h
+++ b/include/sysemu/block-backend.h
@@ -68,6 +68,7 @@ BlockBackend *blk_new_open(const char *name, const char 
*filename,
 int blk_get_refcnt(BlockBackend *blk);
 void blk_ref(BlockBackend *blk);
 void blk_unref(BlockBackend *blk);
+void blk_remove_all_bs(void);
 const char *blk_name(BlockBackend *blk);
 BlockBackend *blk_by_name(const char *name);
 BlockBackend *blk_next(BlockBackend *blk);
-- 
2.6.2




[Qemu-block] [PATCH 3/9] libqos/ahci: ATAPI identify

2015-11-09 Thread John Snow
We need to say "hello!" to our ATAPI friends
in a slightly different manner.

Signed-off-by: John Snow 
---
 tests/ahci-test.c   | 8 +++-
 tests/libqos/ahci.c | 5 +
 tests/libqos/ahci.h | 1 +
 3 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/tests/ahci-test.c b/tests/ahci-test.c
index 8370fa3..58d0545 100644
--- a/tests/ahci-test.c
+++ b/tests/ahci-test.c
@@ -204,6 +204,7 @@ static AHCIQState *ahci_boot_and_enable(const char *cli, 
...)
 va_list ap;
 uint16_t buff[256];
 uint8_t port;
+uint8_t hello;
 
 if (cli) {
 va_start(ap, cli);
@@ -218,7 +219,12 @@ static AHCIQState *ahci_boot_and_enable(const char *cli, 
...)
 /* Initialize test device */
 port = ahci_port_select(ahci);
 ahci_port_clear(ahci, port);
-ahci_io(ahci, port, CMD_IDENTIFY, , sizeof(buff), 0);
+if (is_atapi(ahci, port)) {
+hello = CMD_PACKET_ID;
+} else {
+hello = CMD_IDENTIFY;
+}
+ahci_io(ahci, port, hello, , sizeof(buff), 0);
 
 return ahci;
 }
diff --git a/tests/libqos/ahci.c b/tests/libqos/ahci.c
index 59bf893..81edf34 100644
--- a/tests/libqos/ahci.c
+++ b/tests/libqos/ahci.c
@@ -114,6 +114,11 @@ void ahci_free(AHCIQState *ahci, uint64_t addr)
 qfree(ahci->parent, addr);
 }
 
+bool is_atapi(AHCIQState *ahci, uint8_t port)
+{
+return ahci_px_rreg(ahci, port, AHCI_PX_SIG) == AHCI_SIGNATURE_CDROM;
+}
+
 /**
  * Locate, verify, and return a handle to the AHCI device.
  */
diff --git a/tests/libqos/ahci.h b/tests/libqos/ahci.h
index 9ffd415..705fbd6 100644
--- a/tests/libqos/ahci.h
+++ b/tests/libqos/ahci.h
@@ -596,5 +596,6 @@ void ahci_command_adjust(AHCICommand *cmd, uint64_t 
lba_sect, uint64_t gbuffer,
 
 /* Command Misc */
 uint8_t ahci_command_slot(AHCICommand *cmd);
+bool is_atapi(AHCIQState *ahci, uint8_t port);
 
 #endif
-- 
2.4.3




[Qemu-block] [PATCH 1/9] ahci-test: fix memory leak

2015-11-09 Thread John Snow
Use the proper free command to detroy an AHCICommand.

Signed-off-by: John Snow 
---
 tests/ahci-test.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tests/ahci-test.c b/tests/ahci-test.c
index 59d387c..8370fa3 100644
--- a/tests/ahci-test.c
+++ b/tests/ahci-test.c
@@ -1034,14 +1034,14 @@ static void test_dma_fragmented(void)
 ahci_command_commit(ahci, cmd, px);
 ahci_command_issue(ahci, cmd);
 ahci_command_verify(ahci, cmd);
-g_free(cmd);
+ahci_command_free(cmd);
 
 cmd = ahci_command_create(CMD_READ_DMA);
 ahci_command_adjust(cmd, 0, ptr, bufsize, 32);
 ahci_command_commit(ahci, cmd, px);
 ahci_command_issue(ahci, cmd);
 ahci_command_verify(ahci, cmd);
-g_free(cmd);
+ahci_command_free(cmd);
 
 /* Read back the guest's receive buffer into local memory */
 bufread(ptr, rx, bufsize);
-- 
2.4.3




[Qemu-block] [PATCH 9/9] libqos/ahci: organize header

2015-11-09 Thread John Snow
Organize the prototypes into nice little sections.

Signed-off-by: John Snow 
---
 tests/libqos/ahci.h | 36 
 1 file changed, 24 insertions(+), 12 deletions(-)

diff --git a/tests/libqos/ahci.h b/tests/libqos/ahci.h
index 2c2d2fc..69dc4d7 100644
--- a/tests/libqos/ahci.h
+++ b/tests/libqos/ahci.h
@@ -553,14 +553,28 @@ static inline void ahci_px_clr(AHCIQState *ahci, uint8_t 
port,
 /*** Prototypes ***/
 uint64_t ahci_alloc(AHCIQState *ahci, size_t bytes);
 void ahci_free(AHCIQState *ahci, uint64_t addr);
+void ahci_clean_mem(AHCIQState *ahci);
+
+/* Device management */
 QPCIDevice *get_ahci_device(uint32_t *fingerprint);
 void free_ahci_device(QPCIDevice *dev);
-void ahci_clean_mem(AHCIQState *ahci);
 void ahci_pci_enable(AHCIQState *ahci);
 void start_ahci_device(AHCIQState *ahci);
 void ahci_hba_enable(AHCIQState *ahci);
+
+/* Port Management */
 unsigned ahci_port_select(AHCIQState *ahci);
 void ahci_port_clear(AHCIQState *ahci, uint8_t port);
+
+/* Command header / table management */
+unsigned ahci_pick_cmd(AHCIQState *ahci, uint8_t port);
+void ahci_get_command_header(AHCIQState *ahci, uint8_t port,
+ uint8_t slot, AHCICommandHeader *cmd);
+void ahci_set_command_header(AHCIQState *ahci, uint8_t port,
+ uint8_t slot, AHCICommandHeader *cmd);
+void ahci_destroy_command(AHCIQState *ahci, uint8_t port, uint8_t slot);
+
+/* AHCI sanity check routines */
 void ahci_port_check_error(AHCIQState *ahci, uint8_t port);
 void ahci_port_check_interrupts(AHCIQState *ahci, uint8_t port,
 uint32_t intr_mask);
@@ -569,14 +583,12 @@ void ahci_port_check_d2h_sanity(AHCIQState *ahci, uint8_t 
port, uint8_t slot);
 void ahci_port_check_pio_sanity(AHCIQState *ahci, uint8_t port,
 uint8_t slot, size_t buffsize);
 void ahci_port_check_cmd_sanity(AHCIQState *ahci, AHCICommand *cmd);
-void ahci_get_command_header(AHCIQState *ahci, uint8_t port,
- uint8_t slot, AHCICommandHeader *cmd);
-void ahci_set_command_header(AHCIQState *ahci, uint8_t port,
- uint8_t slot, AHCICommandHeader *cmd);
-void ahci_destroy_command(AHCIQState *ahci, uint8_t port, uint8_t slot);
-void ahci_write_fis(AHCIQState *ahci, AHCICommand *cmd);
-unsigned ahci_pick_cmd(AHCIQState *ahci, uint8_t port);
+
+/* Misc */
+bool is_atapi(AHCIQState *ahci, uint8_t port);
 unsigned size_to_prdtl(unsigned bytes, unsigned bytes_per_prd);
+
+/* Command: Macro level execution */
 void ahci_guest_io(AHCIQState *ahci, uint8_t port, uint8_t ide_cmd,
uint64_t gbuffer, size_t size, uint64_t sector);
 AHCICommand *ahci_guest_io_halt(AHCIQState *ahci, uint8_t port, uint8_t 
ide_cmd,
@@ -587,7 +599,7 @@ void ahci_io(AHCIQState *ahci, uint8_t port, uint8_t 
ide_cmd,
 void ahci_exec(AHCIQState *ahci, uint8_t port,
uint8_t op, const AHCIOpts *opts);
 
-/* Command Lifecycle */
+/* Command: Fine-grained lifecycle */
 AHCICommand *ahci_command_create(uint8_t command_name);
 AHCICommand *ahci_atapi_command_create(uint8_t scsi_cmd);
 void ahci_command_commit(AHCIQState *ahci, AHCICommand *cmd, uint8_t port);
@@ -597,7 +609,7 @@ void ahci_command_wait(AHCIQState *ahci, AHCICommand *cmd);
 void ahci_command_verify(AHCIQState *ahci, AHCICommand *cmd);
 void ahci_command_free(AHCICommand *cmd);
 
-/* Command adjustments */
+/* Command: adjustments */
 void ahci_command_set_flags(AHCICommand *cmd, uint16_t cmdh_flags);
 void ahci_command_clr_flags(AHCICommand *cmd, uint16_t cmdh_flags);
 void ahci_command_set_offset(AHCICommand *cmd, uint64_t lba_sect);
@@ -611,8 +623,8 @@ void ahci_command_enable_atapi_dma(AHCICommand *cmd);
 void ahci_command_adjust(AHCICommand *cmd, uint64_t lba_sect, uint64_t gbuffer,
  uint64_t xbytes, unsigned prd_size);
 
-/* Command Misc */
+/* Command: Misc */
 uint8_t ahci_command_slot(AHCICommand *cmd);
-bool is_atapi(AHCIQState *ahci, uint8_t port);
+void ahci_write_fis(AHCIQState *ahci, AHCICommand *cmd);
 
 #endif
-- 
2.4.3




[Qemu-block] [PATCH] ahci/qtest: don't use tcp sockets for migration tests

2015-11-09 Thread John Snow
Signed-off-by: John Snow 
---
 tests/ahci-test.c | 25 -
 1 file changed, 20 insertions(+), 5 deletions(-)

diff --git a/tests/ahci-test.c b/tests/ahci-test.c
index 59d387c..1bb7410 100644
--- a/tests/ahci-test.c
+++ b/tests/ahci-test.c
@@ -47,6 +47,7 @@
 /*** Globals ***/
 static char tmp_path[] = "/tmp/qtest.XX";
 static char debug_path[] = "/tmp/qtest-blkdebug.XX";
+static char mig_socket[] = "/tmp/qtest-migration.XX";
 static bool ahci_pedantic;
 
 /*** Function Declarations ***/
@@ -114,8 +115,11 @@ static void ahci_migrate(AHCIQState *from, AHCIQState *to, 
const char *uri)
 {
 QOSState *tmp = to->parent;
 QPCIDevice *dev = to->dev;
+char *uri_local = NULL;
+
 if (uri == NULL) {
-uri = "tcp:127.0.0.1:1234";
+uri_local = g_strdup_printf("%s%s", "unix:", mig_socket);
+uri = uri_local;
 }
 
 /* context will be 'to' after completion. */
@@ -135,6 +139,7 @@ static void ahci_migrate(AHCIQState *from, AHCIQState *to, 
const char *uri)
 from->dev = dev;
 
 verify_state(to);
+g_free(uri_local);
 }
 
 /*** Test Setup & Teardown ***/
@@ -1105,7 +1110,7 @@ static void test_flush_retry(void)
 static void test_migrate_sanity(void)
 {
 AHCIQState *src, *dst;
-const char *uri = "tcp:127.0.0.1:1234";
+char *uri = g_strdup_printf("unix:%s", mig_socket);
 
 src = ahci_boot("-m 1024 -M q35 "
 "-hda %s ", tmp_path);
@@ -1117,6 +1122,7 @@ static void test_migrate_sanity(void)
 
 ahci_shutdown(src);
 ahci_shutdown(dst);
+g_free(uri);
 }
 
 /**
@@ -1129,7 +1135,7 @@ static void ahci_migrate_simple(uint8_t cmd_read, uint8_t 
cmd_write)
 size_t bufsize = 4096;
 unsigned char *tx = g_malloc(bufsize);
 unsigned char *rx = g_malloc0(bufsize);
-const char *uri = "tcp:127.0.0.1:1234";
+char *uri = g_strdup_printf("unix:%s", mig_socket);
 
 src = ahci_boot_and_enable("-m 1024 -M q35 "
"-hda %s ", tmp_path);
@@ -1158,6 +1164,7 @@ static void ahci_migrate_simple(uint8_t cmd_read, uint8_t 
cmd_write)
 ahci_shutdown(dst);
 g_free(rx);
 g_free(tx);
+g_free(uri);
 }
 
 static void test_migrate_dma(void)
@@ -1251,7 +1258,7 @@ static void ahci_migrate_halted_io(uint8_t cmd_read, 
uint8_t cmd_write)
 unsigned char *rx = g_malloc0(bufsize);
 uint64_t ptr;
 AHCICommand *cmd;
-const char *uri = "tcp:127.0.0.1:1234";
+char *uri = g_strdup_printf("unix:%s", mig_socket);
 
 prepare_blkdebug_script(debug_path, "write_aio");
 
@@ -1301,6 +1308,7 @@ static void ahci_migrate_halted_io(uint8_t cmd_read, 
uint8_t cmd_write)
 ahci_shutdown(dst);
 g_free(rx);
 g_free(tx);
+g_free(uri);
 }
 
 static void test_migrate_halted_dma(void)
@@ -1322,7 +1330,7 @@ static void test_flush_migrate(void)
 AHCICommand *cmd;
 uint8_t px;
 const char *s;
-const char *uri = "tcp:127.0.0.1:1234";
+char *uri = g_strdup_printf("unix:%s", mig_socket);
 
 prepare_blkdebug_script(debug_path, "flush_to_disk");
 
@@ -1360,6 +1368,7 @@ static void test_flush_migrate(void)
 ahci_command_free(cmd);
 ahci_shutdown(src);
 ahci_shutdown(dst);
+g_free(uri);
 }
 
 static void test_max(void)
@@ -1626,6 +1635,11 @@ int main(int argc, char **argv)
 g_assert(fd >= 0);
 close(fd);
 
+/* Reserve a hollow file to use as a socket for migration tests */
+fd = mkstemp(mig_socket);
+g_assert(fd >= 0);
+close(fd);
+
 /* Run the tests */
 qtest_add_func("/ahci/sanity", test_sanity);
 qtest_add_func("/ahci/pci_spec",   test_pci_spec);
@@ -1668,6 +1682,7 @@ int main(int argc, char **argv)
 /* Cleanup */
 unlink(tmp_path);
 unlink(debug_path);
+unlink(mig_socket);
 
 return ret;
 }
-- 
2.4.3




[Qemu-block] [PATCH v7 05/24] iotests: Change coding style of _filter_nbd in 083

2015-11-09 Thread Max Reitz
In order to be able to move _filter_nbd to common.filter in the next
patch, its coding style needs to be adapted to that of common.filter.
That means, we have to convert tabs to four spaces, adjust the alignment
of the last line (done with spaces already, assuming one tab equals
eight spaces), fix the line length of the comment, and add a line break
before the opening brace.

Signed-off-by: Max Reitz 
---
 tests/qemu-iotests/083 | 19 ++-
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/tests/qemu-iotests/083 b/tests/qemu-iotests/083
index 664f0cf..c00a66b 100755
--- a/tests/qemu-iotests/083
+++ b/tests/qemu-iotests/083
@@ -49,15 +49,16 @@ wait_for_tcp_port() {
done
 }
 
-_filter_nbd() {
-   # nbd.c error messages contain function names and line numbers that are 
prone
-   # to change.  Message ordering depends on timing between send and 
receive
-   # callbacks sometimes, making them unreliable.
-   #
-   # Filter out the TCP port number since this changes between runs.
-   sed -e 's#^.*nbd\.c:.*##g' \
-   -e 's#nbd:127\.0\.0\.1:[^:]*:#nbd:127\.0\.0\.1:PORT:#g' \
--e 's#\(exportname=foo\|PORT\): Failed to .*$#\1#'
+_filter_nbd()
+{
+# nbd.c error messages contain function names and line numbers that are
+# prone to change.  Message ordering depends on timing between send and
+# receive callbacks sometimes, making them unreliable.
+#
+# Filter out the TCP port number since this changes between runs.
+sed -e 's#^.*nbd\.c:.*##g' \
+-e 's#nbd:127\.0\.0\.1:[^:]*:#nbd:127\.0\.0\.1:PORT:#g' \
+-e 's#\(exportname=foo\|PORT\): Failed to .*$#\1#'
 }
 
 check_disconnect() {
-- 
2.6.2




[Qemu-block] [PATCH v7 02/24] blockjob: Call bdrv_unref() on creation error

2015-11-09 Thread Max Reitz
If block_job_create() fails, it should release its reference to the
job's BDS. Normally, this is done in the callback provided by the
caller, but that callback will not be invoked if the block job failed to
be created.

Signed-off-by: Max Reitz 
Reviewed-by: Alberto Garcia 
Reviewed-by: Kevin Wolf 
---
 blockjob.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/blockjob.c b/blockjob.c
index c02fe59..0886a4a 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -70,6 +70,7 @@ void *block_job_create(const BlockJobDriver *driver, 
BlockDriverState *bs,
 if (local_err) {
 block_job_release(bs);
 error_propagate(errp, local_err);
+bdrv_unref(bs);
 return NULL;
 }
 }
-- 
2.6.2




[Qemu-block] [PATCH v7 14/24] nbd: Switch from close to eject notifier

2015-11-09 Thread Max Reitz
The NBD code uses the BDS close notifier to determine when a medium is
ejected. However, now it should use the BB's BDS removal notifier for
that instead of the BDS's close notifier.

Signed-off-by: Max Reitz 
---
 blockdev-nbd.c | 37 +
 nbd.c  | 13 +
 2 files changed, 14 insertions(+), 36 deletions(-)

diff --git a/blockdev-nbd.c b/blockdev-nbd.c
index bcdd18b..b28a55b 100644
--- a/blockdev-nbd.c
+++ b/blockdev-nbd.c
@@ -46,37 +46,11 @@ void qmp_nbd_server_start(SocketAddress *addr, Error **errp)
 }
 }
 
-/*
- * Hook into the BlockBackend notifiers to close the export when the
- * backend is closed.
- */
-typedef struct NBDCloseNotifier {
-Notifier n;
-NBDExport *exp;
-QTAILQ_ENTRY(NBDCloseNotifier) next;
-} NBDCloseNotifier;
-
-static QTAILQ_HEAD(, NBDCloseNotifier) close_notifiers =
-QTAILQ_HEAD_INITIALIZER(close_notifiers);
-
-static void nbd_close_notifier(Notifier *n, void *data)
-{
-NBDCloseNotifier *cn = DO_UPCAST(NBDCloseNotifier, n, n);
-
-notifier_remove(>n);
-QTAILQ_REMOVE(_notifiers, cn, next);
-
-nbd_export_close(cn->exp);
-nbd_export_put(cn->exp);
-g_free(cn);
-}
-
 void qmp_nbd_server_add(const char *device, bool has_writable, bool writable,
 Error **errp)
 {
 BlockBackend *blk;
 NBDExport *exp;
-NBDCloseNotifier *n;
 
 if (server_fd == -1) {
 error_setg(errp, "NBD server not running");
@@ -113,20 +87,11 @@ void qmp_nbd_server_add(const char *device, bool 
has_writable, bool writable,
 }
 
 nbd_export_set_name(exp, device);
-
-n = g_new0(NBDCloseNotifier, 1);
-n->n.notify = nbd_close_notifier;
-n->exp = exp;
-blk_add_close_notifier(blk, >n);
-QTAILQ_INSERT_TAIL(_notifiers, n, next);
 }
 
 void qmp_nbd_server_stop(Error **errp)
 {
-while (!QTAILQ_EMPTY(_notifiers)) {
-NBDCloseNotifier *cn = QTAILQ_FIRST(_notifiers);
-nbd_close_notifier(>n, nbd_export_get_blockdev(cn->exp));
-}
+nbd_export_close_all();
 
 if (server_fd != -1) {
 qemu_set_fd_handler(server_fd, NULL, NULL, NULL);
diff --git a/nbd.c b/nbd.c
index b3d9654..ab2e725 100644
--- a/nbd.c
+++ b/nbd.c
@@ -162,6 +162,8 @@ struct NBDExport {
 QTAILQ_ENTRY(NBDExport) next;
 
 AioContext *ctx;
+
+Notifier eject_notifier;
 };
 
 static QTAILQ_HEAD(, NBDExport) exports = QTAILQ_HEAD_INITIALIZER(exports);
@@ -1053,6 +1055,12 @@ static void blk_aio_detach(void *opaque)
 exp->ctx = NULL;
 }
 
+static void nbd_eject_notifier(Notifier *n, void *data)
+{
+NBDExport *exp = container_of(n, NBDExport, eject_notifier);
+nbd_export_close(exp);
+}
+
 NBDExport *nbd_export_new(BlockBackend *blk, off_t dev_offset, off_t size,
   uint32_t nbdflags, void (*close)(NBDExport *),
   Error **errp)
@@ -1075,6 +1083,10 @@ NBDExport *nbd_export_new(BlockBackend *blk, off_t 
dev_offset, off_t size,
 exp->ctx = blk_get_aio_context(blk);
 blk_ref(blk);
 blk_add_aio_context_notifier(blk, blk_aio_attached, blk_aio_detach, exp);
+
+exp->eject_notifier.notify = nbd_eject_notifier;
+blk_add_remove_bs_notifier(blk, >eject_notifier);
+
 /*
  * NBD exports are used for non-shared storage migration.  Make sure
  * that BDRV_O_INCOMING is cleared and the image is ready for write
@@ -1154,6 +1166,7 @@ void nbd_export_put(NBDExport *exp)
 }
 
 if (exp->blk) {
+notifier_remove(>eject_notifier);
 blk_remove_aio_context_notifier(exp->blk, blk_aio_attached,
 blk_aio_detach, exp);
 blk_unref(exp->blk);
-- 
2.6.2




[Qemu-block] [PATCH v7 23/24] iotests: Add test for multiple BB on BDS tree

2015-11-09 Thread Max Reitz
This adds a test for having multiple BlockBackends in one BDS tree. In
this case, there is one BB for the protocol BDS and one BB for the
format BDS in a simple two-BDS tree (with the protocol BDS and BB added
first).

When bdrv_close_all() is executed, no cached data from any BDS should be
lost; the protocol BDS may not be closed until the format BDS is closed.
Otherwise, metadata updates may be lost.

Signed-off-by: Max Reitz 
Reviewed-by: Kevin Wolf 
---
 tests/qemu-iotests/117 | 86 ++
 tests/qemu-iotests/117.out | 14 
 tests/qemu-iotests/group   |  1 +
 3 files changed, 101 insertions(+)
 create mode 100755 tests/qemu-iotests/117
 create mode 100644 tests/qemu-iotests/117.out

diff --git a/tests/qemu-iotests/117 b/tests/qemu-iotests/117
new file mode 100755
index 000..7881fc7
--- /dev/null
+++ b/tests/qemu-iotests/117
@@ -0,0 +1,86 @@
+#!/bin/bash
+#
+# Test case for shared BDS between backend trees
+#
+# Copyright (C) 2015 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see .
+#
+
+# creator
+owner=mre...@redhat.com
+
+seq="$(basename $0)"
+echo "QA output created by $seq"
+
+here="$PWD"
+tmp=/tmp/$$
+status=1   # failure is the default!
+
+_cleanup()
+{
+   _cleanup_test_img
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+. ./common.qemu
+
+_supported_fmt qcow2
+_supported_proto file
+_supported_os Linux
+
+_make_test_img 64k
+
+_launch_qemu
+
+_send_qemu_cmd $QEMU_HANDLE \
+"{ 'execute': 'qmp_capabilities' }" \
+'return'
+
+_send_qemu_cmd $QEMU_HANDLE \
+"{ 'execute': 'blockdev-add',
+   'arguments': { 'options': { 'id': 'protocol',
+   'driver': 'file',
+   'filename': '$TEST_IMG' } } }" \
+'return'
+
+_send_qemu_cmd $QEMU_HANDLE \
+"{ 'execute': 'blockdev-add',
+   'arguments': { 'options': { 'id': 'format',
+   'driver': '$IMGFMT',
+   'file': 'protocol' } } }" \
+'return'
+
+_send_qemu_cmd $QEMU_HANDLE \
+"{ 'execute': 'human-monitor-command',
+   'arguments': { 'command-line': 'qemu-io format \"write -P 42 0 64k\"' } 
}" \
+'return'
+
+_send_qemu_cmd $QEMU_HANDLE \
+"{ 'execute': 'quit' }" \
+'return'
+
+wait=1 _cleanup_qemu
+
+_check_test_img
+
+$QEMU_IO -c 'read -P 42 0 64k' "$TEST_IMG" | _filter_qemu_io
+
+# success, all done
+echo '*** done'
+rm -f $seq.full
+status=0
diff --git a/tests/qemu-iotests/117.out b/tests/qemu-iotests/117.out
new file mode 100644
index 000..f52dc1a
--- /dev/null
+++ b/tests/qemu-iotests/117.out
@@ -0,0 +1,14 @@
+QA output created by 117
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=65536
+{"return": {}}
+{"return": {}}
+{"return": {}}
+wrote 65536/65536 bytes at offset 0
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+{"return": ""}
+{"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP}, "event": 
"SHUTDOWN"}
+No errors were found on the image.
+read 65536/65536 bytes at offset 0
+64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+*** done
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index 993711b..d157c07 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -122,6 +122,7 @@
 114 rw auto quick
 115 rw auto
 116 rw auto quick
+117 rw auto
 118 rw auto
 119 rw auto quick
 120 rw auto quick
-- 
2.6.2




Re: [Qemu-block] [PATCH v3 8/9] block: Introduce BlockDriver.bdrv_drain callback

2015-11-09 Thread Paolo Bonzini


On 09/11/2015 03:56, Fam Zheng wrote:
> +if (bs->drv && bs->drv->bdrv_drain) {
> +bs->drv->bdrv_drain(bs);
> +}
> +QLIST_FOREACH(child, >children, next) {
> +BlockDriverState *cbs = child->bs;
> +if (cbs->drv && cbs->drv->bdrv_drain) {
> +cbs->drv->bdrv_drain(bs);
> +}
> +}

I think this is not enough, because the children could have children as
well.

Perhaps you can do it like bdrv_flush: one function does the call to
bdrv_drain and the recursion on children; bdrv_drain calls that one
function and then does the "while (busy)" loop.

Paolo



[Qemu-block] [PATCH v4 5/9] block: Add ioctl parameter fields to BlockRequest

2015-11-09 Thread Fam Zheng
The two fields that will be used by ioctl handling code later are added
as union, because it's used exclusively by ioctl code which dosn't need
the four fields in the other struct of the union.

Signed-off-by: Fam Zheng 
Reviewed-by: Kevin Wolf 
---
 include/block/block.h | 16 
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index 610db92..c8b40b7 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -335,10 +335,18 @@ void bdrv_aio_cancel_async(BlockAIOCB *acb);
 
 typedef struct BlockRequest {
 /* Fields to be filled by multiwrite caller */
-int64_t sector;
-int nb_sectors;
-int flags;
-QEMUIOVector *qiov;
+union {
+struct {
+int64_t sector;
+int nb_sectors;
+int flags;
+QEMUIOVector *qiov;
+};
+struct {
+int req;
+void *buf;
+};
+};
 BlockCompletionFunc *cb;
 void *opaque;
 
-- 
2.4.3




[Qemu-block] [PATCH v4 8/9] block: Introduce BlockDriver.bdrv_drain callback

2015-11-09 Thread Fam Zheng
Drivers can have internal request sources that generate IO, like the
need_check_timer in QED. Since we want quiesced periods that contain
nested event loops in block layer, we need to have a way to disable such
event sources.

Block drivers must implement the "bdrv_drain" callback if it has any
internal sources that can generate I/O activity, like a timer or a
worker thread (even in a library) that can schedule QEMUBH in an
asynchronous callback.

Update the comments of bdrv_drain and bdrv_drained_begin accordingly.

Like bdrv_requests_pending(), we should consider all the children of bs.
Before, the while loop just works, as bdrv_requests_pending() already
tracks its children; now we mustn't miss the callback, so recurse down
explicitly.

Signed-off-by: Fam Zheng 
---
 block/io.c| 16 +++-
 include/block/block_int.h |  6 ++
 2 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/block/io.c b/block/io.c
index 4ecb171..adc1eab 100644
--- a/block/io.c
+++ b/block/io.c
@@ -237,8 +237,21 @@ bool bdrv_requests_pending(BlockDriverState *bs)
 return false;
 }
 
+static void bdrv_drain_recurse(BlockDriverState *bs)
+{
+BdrvChild *child;
+
+if (bs->drv && bs->drv->bdrv_drain) {
+bs->drv->bdrv_drain(bs);
+}
+QLIST_FOREACH(child, >children, next) {
+bdrv_drain_recurse(child->bs);
+}
+}
+
 /*
- * Wait for pending requests to complete on a single BlockDriverState subtree
+ * Wait for pending requests to complete on a single BlockDriverState subtree,
+ * and suspend block driver's internal I/O until next request arrives.
  *
  * Note that unlike bdrv_drain_all(), the caller must hold the BlockDriverState
  * AioContext.
@@ -251,6 +264,7 @@ void bdrv_drain(BlockDriverState *bs)
 {
 bool busy = true;
 
+bdrv_drain_recurse(bs);
 while (busy) {
 /* Keep iterating */
  bdrv_flush_io_queue(bs);
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 550ce18..4a9f8ff 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -295,6 +295,12 @@ struct BlockDriver {
  */
 int (*bdrv_probe_geometry)(BlockDriverState *bs, HDGeometry *geo);
 
+/**
+ * Drain and stop any internal sources of requests in the driver, and
+ * remain so until next I/O callback (e.g. bdrv_co_writev) is called.
+ */
+void (*bdrv_drain)(BlockDriverState *bs);
+
 QLIST_ENTRY(BlockDriver) list;
 };
 
-- 
2.4.3




[Qemu-block] [PATCH v4 7/9] block: Drop BlockDriver.bdrv_ioctl

2015-11-09 Thread Fam Zheng
Now the callback is not used any more, drop the field along with all
implementations in block drivers, which are iscsi and raw.

Signed-off-by: Fam Zheng 
Reviewed-by: Kevin Wolf 
---
 block/iscsi.c | 33 -
 block/raw-posix.c |  8 
 block/raw_bsd.c   |  6 --
 include/block/block_int.h |  1 -
 4 files changed, 48 deletions(-)

diff --git a/block/iscsi.c b/block/iscsi.c
index b3fa0a0..002d29b 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -845,38 +845,6 @@ static BlockAIOCB *iscsi_aio_ioctl(BlockDriverState *bs,
 return >common;
 }
 
-static void ioctl_cb(void *opaque, int status)
-{
-int *p_status = opaque;
-*p_status = status;
-}
-
-static int iscsi_ioctl(BlockDriverState *bs, unsigned long int req, void *buf)
-{
-IscsiLun *iscsilun = bs->opaque;
-int status;
-
-switch (req) {
-case SG_GET_VERSION_NUM:
-*(int *)buf = 3;
-break;
-case SG_GET_SCSI_ID:
-((struct sg_scsi_id *)buf)->scsi_type = iscsilun->type;
-break;
-case SG_IO:
-status = -EINPROGRESS;
-iscsi_aio_ioctl(bs, req, buf, ioctl_cb, );
-
-while (status == -EINPROGRESS) {
-aio_poll(iscsilun->aio_context, true);
-}
-
-return 0;
-default:
-return -1;
-}
-return 0;
-}
 #endif
 
 static int64_t
@@ -1807,7 +1775,6 @@ static BlockDriver bdrv_iscsi = {
 .bdrv_co_flush_to_disk = iscsi_co_flush,
 
 #ifdef __linux__
-.bdrv_ioctl   = iscsi_ioctl,
 .bdrv_aio_ioctl   = iscsi_aio_ioctl,
 #endif
 
diff --git a/block/raw-posix.c b/block/raw-posix.c
index 918c756..aec9ec6 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -2175,12 +2175,6 @@ static int hdev_open(BlockDriverState *bs, QDict 
*options, int flags,
 }
 
 #if defined(__linux__)
-static int hdev_ioctl(BlockDriverState *bs, unsigned long int req, void *buf)
-{
-BDRVRawState *s = bs->opaque;
-
-return ioctl(s->fd, req, buf);
-}
 
 static BlockAIOCB *hdev_aio_ioctl(BlockDriverState *bs,
 unsigned long int req, void *buf,
@@ -2338,7 +2332,6 @@ static BlockDriver bdrv_host_device = {
 
 /* generic scsi device */
 #ifdef __linux__
-.bdrv_ioctl = hdev_ioctl,
 .bdrv_aio_ioctl = hdev_aio_ioctl,
 #endif
 };
@@ -2471,7 +2464,6 @@ static BlockDriver bdrv_host_cdrom = {
 .bdrv_lock_medium   = cdrom_lock_medium,
 
 /* generic scsi device */
-.bdrv_ioctl = hdev_ioctl,
 .bdrv_aio_ioctl = hdev_aio_ioctl,
 };
 #endif /* __linux__ */
diff --git a/block/raw_bsd.c b/block/raw_bsd.c
index 0aded31..915d6fd 100644
--- a/block/raw_bsd.c
+++ b/block/raw_bsd.c
@@ -169,11 +169,6 @@ static void raw_lock_medium(BlockDriverState *bs, bool 
locked)
 bdrv_lock_medium(bs->file->bs, locked);
 }
 
-static int raw_ioctl(BlockDriverState *bs, unsigned long int req, void *buf)
-{
-return bdrv_ioctl(bs->file->bs, req, buf);
-}
-
 static BlockAIOCB *raw_aio_ioctl(BlockDriverState *bs,
  unsigned long int req, void *buf,
  BlockCompletionFunc *cb,
@@ -262,7 +257,6 @@ BlockDriver bdrv_raw = {
 .bdrv_media_changed   = _media_changed,
 .bdrv_eject   = _eject,
 .bdrv_lock_medium = _lock_medium,
-.bdrv_ioctl   = _ioctl,
 .bdrv_aio_ioctl   = _aio_ioctl,
 .create_opts  = _create_opts,
 .bdrv_has_zero_init   = _has_zero_init
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 7db9900..550ce18 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -227,7 +227,6 @@ struct BlockDriver {
 void (*bdrv_lock_medium)(BlockDriverState *bs, bool locked);
 
 /* to control generic scsi devices */
-int (*bdrv_ioctl)(BlockDriverState *bs, unsigned long int req, void *buf);
 BlockAIOCB *(*bdrv_aio_ioctl)(BlockDriverState *bs,
 unsigned long int req, void *buf,
 BlockCompletionFunc *cb, void *opaque);
-- 
2.4.3




Re: [Qemu-block] [PATCH v4 8/9] block: Introduce BlockDriver.bdrv_drain callback

2015-11-09 Thread Paolo Bonzini


On 09/11/2015 11:16, Fam Zheng wrote:
> Drivers can have internal request sources that generate IO, like the
> need_check_timer in QED. Since we want quiesced periods that contain
> nested event loops in block layer, we need to have a way to disable such
> event sources.
> 
> Block drivers must implement the "bdrv_drain" callback if it has any
> internal sources that can generate I/O activity, like a timer or a
> worker thread (even in a library) that can schedule QEMUBH in an
> asynchronous callback.
> 
> Update the comments of bdrv_drain and bdrv_drained_begin accordingly.
> 
> Like bdrv_requests_pending(), we should consider all the children of bs.
> Before, the while loop just works, as bdrv_requests_pending() already
> tracks its children; now we mustn't miss the callback, so recurse down
> explicitly.

Reviewed-by: Paolo Bonzini 

> Signed-off-by: Fam Zheng 
> ---
>  block/io.c| 16 +++-
>  include/block/block_int.h |  6 ++
>  2 files changed, 21 insertions(+), 1 deletion(-)
> 
> diff --git a/block/io.c b/block/io.c
> index 4ecb171..adc1eab 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -237,8 +237,21 @@ bool bdrv_requests_pending(BlockDriverState *bs)
>  return false;
>  }
>  
> +static void bdrv_drain_recurse(BlockDriverState *bs)
> +{
> +BdrvChild *child;
> +
> +if (bs->drv && bs->drv->bdrv_drain) {
> +bs->drv->bdrv_drain(bs);
> +}
> +QLIST_FOREACH(child, >children, next) {
> +bdrv_drain_recurse(child->bs);
> +}
> +}
> +
>  /*
> - * Wait for pending requests to complete on a single BlockDriverState subtree
> + * Wait for pending requests to complete on a single BlockDriverState 
> subtree,
> + * and suspend block driver's internal I/O until next request arrives.
>   *
>   * Note that unlike bdrv_drain_all(), the caller must hold the 
> BlockDriverState
>   * AioContext.
> @@ -251,6 +264,7 @@ void bdrv_drain(BlockDriverState *bs)
>  {
>  bool busy = true;
>  
> +bdrv_drain_recurse(bs);
>  while (busy) {
>  /* Keep iterating */
>   bdrv_flush_io_queue(bs);
> diff --git a/include/block/block_int.h b/include/block/block_int.h
> index 550ce18..4a9f8ff 100644
> --- a/include/block/block_int.h
> +++ b/include/block/block_int.h
> @@ -295,6 +295,12 @@ struct BlockDriver {
>   */
>  int (*bdrv_probe_geometry)(BlockDriverState *bs, HDGeometry *geo);
>  
> +/**
> + * Drain and stop any internal sources of requests in the driver, and
> + * remain so until next I/O callback (e.g. bdrv_co_writev) is called.
> + */
> +void (*bdrv_drain)(BlockDriverState *bs);
> +
>  QLIST_ENTRY(BlockDriver) list;
>  };
>  
> 



[Qemu-block] [PATCH v7 03/24] block: Release dirty bitmaps in bdrv_close()

2015-11-09 Thread Max Reitz
bdrv_delete() is not very happy about deleting BlockDriverStates with
dirty bitmaps still attached to them. In the past, we got around that
very easily by relying on bdrv_close_all() bypassing bdrv_delete(), and
bdrv_close() simply ignoring that condition. We should fix that by
releasing all dirty bitmaps in bdrv_close() and drop the assertion in
bdrv_delete().

Signed-off-by: Max Reitz 
---
 block.c | 37 +
 1 file changed, 29 insertions(+), 8 deletions(-)

diff --git a/block.c b/block.c
index cffac75..94c0ecf 100644
--- a/block.c
+++ b/block.c
@@ -87,6 +87,8 @@ static int bdrv_open_inherit(BlockDriverState **pbs, const 
char *filename,
  const BdrvChildRole *child_role, Error **errp);
 
 static void bdrv_dirty_bitmap_truncate(BlockDriverState *bs);
+static void bdrv_release_all_dirty_bitmaps(BlockDriverState *bs);
+
 /* If non-zero, use only whitelisted block drivers */
 static int use_bdrv_whitelist;
 
@@ -1916,6 +1918,8 @@ void bdrv_close(BlockDriverState *bs)
 bdrv_drain(bs); /* in case flush left pending I/O */
 notifier_list_notify(>close_notifiers, bs);
 
+bdrv_release_all_dirty_bitmaps(bs);
+
 if (bs->blk) {
 blk_dev_change_media_cb(bs->blk, false);
 }
@@ -2123,7 +2127,6 @@ static void bdrv_delete(BlockDriverState *bs)
 assert(!bs->job);
 assert(bdrv_op_blocker_is_empty(bs));
 assert(!bs->refcnt);
-assert(QLIST_EMPTY(>dirty_bitmaps));
 
 bdrv_close(bs);
 
@@ -3303,21 +3306,39 @@ static void bdrv_dirty_bitmap_truncate(BlockDriverState 
*bs)
 }
 }
 
-void bdrv_release_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap)
+static void bdrv_do_release_matching_dirty_bitmap(BlockDriverState *bs,
+  BdrvDirtyBitmap *bitmap)
 {
 BdrvDirtyBitmap *bm, *next;
 QLIST_FOREACH_SAFE(bm, >dirty_bitmaps, list, next) {
-if (bm == bitmap) {
+if (!bitmap || bm == bitmap) {
 assert(!bdrv_dirty_bitmap_frozen(bm));
-QLIST_REMOVE(bitmap, list);
-hbitmap_free(bitmap->bitmap);
-g_free(bitmap->name);
-g_free(bitmap);
-return;
+QLIST_REMOVE(bm, list);
+hbitmap_free(bm->bitmap);
+g_free(bm->name);
+g_free(bm);
+
+if (bitmap) {
+return;
+}
 }
 }
 }
 
+void bdrv_release_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap)
+{
+bdrv_do_release_matching_dirty_bitmap(bs, bitmap);
+}
+
+/**
+ * Release all dirty bitmaps attached to a BDS (for use in bdrv_close()). There
+ * must not be any frozen bitmaps attached.
+ */
+static void bdrv_release_all_dirty_bitmaps(BlockDriverState *bs)
+{
+bdrv_do_release_matching_dirty_bitmap(bs, NULL);
+}
+
 void bdrv_disable_dirty_bitmap(BdrvDirtyBitmap *bitmap)
 {
 assert(!bdrv_dirty_bitmap_frozen(bitmap));
-- 
2.6.2




[Qemu-block] [PATCH v7 06/24] iotests: Move _filter_nbd into common.filter

2015-11-09 Thread Max Reitz
_filter_nbd can be useful for other NBD tests, too, therefore it should
reside in common.filter.

Signed-off-by: Max Reitz 
---
 tests/qemu-iotests/083   | 12 
 tests/qemu-iotests/common.filter | 12 
 2 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/tests/qemu-iotests/083 b/tests/qemu-iotests/083
index c00a66b..aa99278 100755
--- a/tests/qemu-iotests/083
+++ b/tests/qemu-iotests/083
@@ -49,18 +49,6 @@ wait_for_tcp_port() {
done
 }
 
-_filter_nbd()
-{
-# nbd.c error messages contain function names and line numbers that are
-# prone to change.  Message ordering depends on timing between send and
-# receive callbacks sometimes, making them unreliable.
-#
-# Filter out the TCP port number since this changes between runs.
-sed -e 's#^.*nbd\.c:.*##g' \
--e 's#nbd:127\.0\.0\.1:[^:]*:#nbd:127\.0\.0\.1:PORT:#g' \
--e 's#\(exportname=foo\|PORT\): Failed to .*$#\1#'
-}
-
 check_disconnect() {
event=$1
when=$2
diff --git a/tests/qemu-iotests/common.filter b/tests/qemu-iotests/common.filter
index cfdb633..aa2fb8d 100644
--- a/tests/qemu-iotests/common.filter
+++ b/tests/qemu-iotests/common.filter
@@ -230,5 +230,17 @@ _filter_qemu_img_map()
 -e 's/Mapped to *//' | _filter_testdir | _filter_imgfmt
 }
 
+_filter_nbd()
+{
+# nbd.c error messages contain function names and line numbers that are
+# prone to change.  Message ordering depends on timing between send and
+# receive callbacks sometimes, making them unreliable.
+#
+# Filter out the TCP port number since this changes between runs.
+sed -e 's#^.*nbd\.c:.*##g' \
+-e 's#nbd:127\.0\.0\.1:[^:]*:#nbd:127\.0\.0\.1:PORT:#g' \
+-e 's#\(exportname=foo\|PORT\): Failed to .*$#\1#'
+}
+
 # make sure this script returns success
 true
-- 
2.6.2




[Qemu-block] [PATCH v7 11/24] block: Add BB-BDS remove/insert notifiers

2015-11-09 Thread Max Reitz
bdrv_close() no longer signifies ejection of a medium, this is now done
by removing the BDS from the BB. Therefore, we want to have a notifier
for that in the BB instead of a close notifier in the BDS. The former is
added now, the latter is removed later.

Symmetrically, another notifier list is added that is invoked whenever a
BDS is inserted. We will need that for virtio-blk and virtio-scsi, which
can then remove their op blockers on BDS ejection and set them up on
insertion.

Signed-off-by: Max Reitz 
---
 block/block-backend.c  | 20 
 include/sysemu/block-backend.h |  2 ++
 2 files changed, 22 insertions(+)

diff --git a/block/block-backend.c b/block/block-backend.c
index 6f9309f..8f74a54 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -48,6 +48,8 @@ struct BlockBackend {
 BlockdevOnError on_read_error, on_write_error;
 bool iostatus_enabled;
 BlockDeviceIoStatus iostatus;
+
+NotifierList remove_bs_notifiers, insert_bs_notifiers;
 };
 
 typedef struct BlockBackendAIOCB {
@@ -98,6 +100,8 @@ BlockBackend *blk_new(const char *name, Error **errp)
 blk = g_new0(BlockBackend, 1);
 blk->name = g_strdup(name);
 blk->refcnt = 1;
+notifier_list_init(>remove_bs_notifiers);
+notifier_list_init(>insert_bs_notifiers);
 QTAILQ_INSERT_TAIL(_backends, blk, link);
 return blk;
 }
@@ -166,6 +170,8 @@ static void blk_delete(BlockBackend *blk)
 bdrv_unref(blk->bs);
 blk->bs = NULL;
 }
+assert(QLIST_EMPTY(>remove_bs_notifiers.notifiers));
+assert(QLIST_EMPTY(>insert_bs_notifiers.notifiers));
 if (blk->root_state.throttle_state) {
 g_free(blk->root_state.throttle_group);
 throttle_group_unref(blk->root_state.throttle_state);
@@ -343,6 +349,8 @@ void blk_hide_on_behalf_of_hmp_drive_del(BlockBackend *blk)
  */
 void blk_remove_bs(BlockBackend *blk)
 {
+notifier_list_notify(>remove_bs_notifiers, blk);
+
 blk_update_root_state(blk);
 
 blk->bs->blk = NULL;
@@ -359,6 +367,8 @@ void blk_insert_bs(BlockBackend *blk, BlockDriverState *bs)
 bdrv_ref(bs);
 blk->bs = bs;
 bs->blk = blk;
+
+notifier_list_notify(>insert_bs_notifiers, blk);
 }
 
 /*
@@ -1105,6 +1115,16 @@ void blk_remove_aio_context_notifier(BlockBackend *blk,
 }
 }
 
+void blk_add_remove_bs_notifier(BlockBackend *blk, Notifier *notify)
+{
+notifier_list_add(>remove_bs_notifiers, notify);
+}
+
+void blk_add_insert_bs_notifier(BlockBackend *blk, Notifier *notify)
+{
+notifier_list_add(>insert_bs_notifiers, notify);
+}
+
 void blk_add_close_notifier(BlockBackend *blk, Notifier *notify)
 {
 if (blk->bs) {
diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h
index f4a68e2..632cdc0 100644
--- a/include/sysemu/block-backend.h
+++ b/include/sysemu/block-backend.h
@@ -162,6 +162,8 @@ void blk_remove_aio_context_notifier(BlockBackend *blk,
   void *),
  void (*detach_aio_context)(void *),
  void *opaque);
+void blk_add_remove_bs_notifier(BlockBackend *blk, Notifier *notify);
+void blk_add_insert_bs_notifier(BlockBackend *blk, Notifier *notify);
 void blk_add_close_notifier(BlockBackend *blk, Notifier *notify);
 void blk_io_plug(BlockBackend *blk);
 void blk_io_unplug(BlockBackend *blk);
-- 
2.6.2




[Qemu-block] [PATCH v7 18/24] block: Make bdrv_close() static

2015-11-09 Thread Max Reitz
There are no users of bdrv_close() left, except for one of bdrv_open()'s
failure paths, bdrv_close_all() and bdrv_delete(), and that is good.
Make bdrv_close() static so nobody makes the mistake of directly using
bdrv_close() again.

Signed-off-by: Max Reitz 
Reviewed-by: Eric Blake 
Reviewed-by: Alberto Garcia 
Reviewed-by: Kevin Wolf 
---
 block.c   | 4 +++-
 include/block/block.h | 1 -
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/block.c b/block.c
index 4ee3b01..b5874aa 100644
--- a/block.c
+++ b/block.c
@@ -92,6 +92,8 @@ static void bdrv_release_all_dirty_bitmaps(BlockDriverState 
*bs);
 /* If non-zero, use only whitelisted block drivers */
 static int use_bdrv_whitelist;
 
+static void bdrv_close(BlockDriverState *bs);
+
 #ifdef _WIN32
 static int is_windows_drive_prefix(const char *filename)
 {
@@ -1894,7 +1896,7 @@ void bdrv_reopen_abort(BDRVReopenState *reopen_state)
 }
 
 
-void bdrv_close(BlockDriverState *bs)
+static void bdrv_close(BlockDriverState *bs)
 {
 BdrvAioNotifier *ban, *ban_next;
 
diff --git a/include/block/block.h b/include/block/block.h
index 8d9a049..03551d3 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -222,7 +222,6 @@ int bdrv_reopen_prepare(BDRVReopenState *reopen_state,
 BlockReopenQueue *queue, Error **errp);
 void bdrv_reopen_commit(BDRVReopenState *reopen_state);
 void bdrv_reopen_abort(BDRVReopenState *reopen_state);
-void bdrv_close(BlockDriverState *bs);
 int bdrv_read(BlockDriverState *bs, int64_t sector_num,
   uint8_t *buf, int nb_sectors);
 int bdrv_read_unthrottled(BlockDriverState *bs, int64_t sector_num,
-- 
2.6.2




[Qemu-block] [PATCH v7 19/24] block: Add list of all BlockDriverStates

2015-11-09 Thread Max Reitz
We need this list so that bdrv_close_all() can keep track of which BDSs
are still open after having removed the BDSs from all of the BBs and
having released all monitor BDS references.

Signed-off-by: Max Reitz 
Reviewed-by: Kevin Wolf 
---
 block.c   | 7 +++
 include/block/block_int.h | 2 ++
 2 files changed, 9 insertions(+)

diff --git a/block.c b/block.c
index b5874aa..b935dff 100644
--- a/block.c
+++ b/block.c
@@ -78,6 +78,9 @@ struct BdrvStates bdrv_states = 
QTAILQ_HEAD_INITIALIZER(bdrv_states);
 static QTAILQ_HEAD(, BlockDriverState) graph_bdrv_states =
 QTAILQ_HEAD_INITIALIZER(graph_bdrv_states);
 
+static QTAILQ_HEAD(, BlockDriverState) all_bdrv_states =
+QTAILQ_HEAD_INITIALIZER(all_bdrv_states);
+
 static QLIST_HEAD(, BlockDriver) bdrv_drivers =
 QLIST_HEAD_INITIALIZER(bdrv_drivers);
 
@@ -266,6 +269,8 @@ BlockDriverState *bdrv_new(void)
 bs->refcnt = 1;
 bs->aio_context = qemu_get_aio_context();
 
+QTAILQ_INSERT_TAIL(_bdrv_states, bs, bs_list);
+
 return bs;
 }
 
@@ -2128,6 +2133,8 @@ static void bdrv_delete(BlockDriverState *bs)
 /* remove from list, if necessary */
 bdrv_make_anon(bs);
 
+QTAILQ_REMOVE(_bdrv_states, bs, bs_list);
+
 g_free(bs);
 }
 
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 617994d..8c82747 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -422,6 +422,8 @@ struct BlockDriverState {
 QTAILQ_ENTRY(BlockDriverState) node_list;
 /* element of the list of "drives" the guest sees */
 QTAILQ_ENTRY(BlockDriverState) device_list;
+/* element of the list of all BlockDriverStates (all_bdrv_states) */
+QTAILQ_ENTRY(BlockDriverState) bs_list;
 QLIST_HEAD(, BdrvDirtyBitmap) dirty_bitmaps;
 int refcnt;
 
-- 
2.6.2




[Qemu-block] [PATCH v7 24/24] iotests: Add test for block jobs and BDS ejection

2015-11-09 Thread Max Reitz
Suggested-by: Paolo Bonzini 
Signed-off-by: Max Reitz 
---
 tests/qemu-iotests/141 | 166 +
 tests/qemu-iotests/141.out |  47 +
 tests/qemu-iotests/group   |   1 +
 3 files changed, 214 insertions(+)
 create mode 100755 tests/qemu-iotests/141
 create mode 100644 tests/qemu-iotests/141.out

diff --git a/tests/qemu-iotests/141 b/tests/qemu-iotests/141
new file mode 100755
index 000..6a32d56
--- /dev/null
+++ b/tests/qemu-iotests/141
@@ -0,0 +1,166 @@
+#!/bin/bash
+#
+# Test case for ejecting BDSs with block jobs still running on them
+#
+# Copyright (C) 2015 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see .
+#
+
+# creator
+owner=mre...@redhat.com
+
+seq="$(basename $0)"
+echo "QA output created by $seq"
+
+here="$PWD"
+tmp=/tmp/$$
+status=1   # failure is the default!
+
+_cleanup()
+{
+_cleanup_test_img
+rm -f "$TEST_DIR/{b,o}.$IMGFMT"
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+. ./common.qemu
+
+# Needs backing file support
+_supported_fmt qcow qcow2 qed
+_supported_proto file
+_supported_os Linux
+
+
+test_blockjob()
+{
+_send_qemu_cmd $QEMU_HANDLE \
+"{'execute': 'blockdev-add',
+  'arguments': {
+  'options': {
+  'id': 'drv0',
+  'driver': '$IMGFMT',
+  'file': {
+  'driver': 'file',
+  'filename': '$TEST_IMG'
+  " \
+'return'
+
+_send_qemu_cmd $QEMU_HANDLE \
+"$1" \
+"$2" \
+| _filter_img_create
+
+# We want this to return an error because the block job is still running
+_send_qemu_cmd $QEMU_HANDLE \
+"{'execute': 'blockdev-remove-medium',
+  'arguments': {'device': 'drv0'}}" \
+'error'
+
+_send_qemu_cmd $QEMU_HANDLE \
+"{'execute': 'block-job-cancel',
+  'arguments': {'device': 'drv0'}}" \
+"$3"
+
+_send_qemu_cmd $QEMU_HANDLE \
+"{'execute': 'x-blockdev-del',
+  'arguments': {'id': 'drv0'}}" \
+'return'
+}
+
+
+TEST_IMG="$TEST_DIR/b.$IMGFMT" _make_test_img 1M
+_make_test_img -b "$TEST_DIR/b.$IMGFMT" 1M
+
+_launch_qemu -nodefaults
+
+_send_qemu_cmd $QEMU_HANDLE \
+"{'execute': 'qmp_capabilities'}" \
+'return'
+
+echo
+echo '=== Testing drive-backup ==='
+echo
+
+# drive-backup will not send BLOCK_JOB_READY by itself, and cancelling the job
+# will consequently result in BLOCK_JOB_CANCELLED being emitted.
+
+test_blockjob \
+"{'execute': 'drive-backup',
+  'arguments': {'device': 'drv0',
+'target': '$TEST_DIR/o.$IMGFMT',
+'format': '$IMGFMT',
+'sync': 'none'}}" \
+'return' \
+'BLOCK_JOB_CANCELLED'
+
+echo
+echo '=== Testing drive-mirror ==='
+echo
+
+# drive-mirror will send BLOCK_JOB_READY basically immediately, and cancelling
+# the job will consequently result in BLOCK_JOB_COMPLETED being emitted.
+
+test_blockjob \
+"{'execute': 'drive-mirror',
+  'arguments': {'device': 'drv0',
+'target': '$TEST_DIR/o.$IMGFMT',
+'format': '$IMGFMT',
+'sync': 'none'}}" \
+'BLOCK_JOB_READY' \
+'BLOCK_JOB_COMPLETED'
+
+echo
+echo '=== Testing block-commit ==='
+echo
+
+# block-commit will send BLOCK_JOB_READY basically immediately, and cancelling
+# the job will consequently result in BLOCK_JOB_COMPLETED being emitted.
+
+test_blockjob \
+"{'execute': 'block-commit',
+  'arguments': {'device': 'drv0'}}" \
+'BLOCK_JOB_READY' \
+'BLOCK_JOB_COMPLETED'
+
+echo
+echo '=== Testing block-stream ==='
+echo
+
+# Give block-stream something to work on, otherwise it would be done
+# immediately, send a BLOCK_JOB_COMPLETED and ejecting the BDS would work just
+# fine without the block job still running.
+
+$QEMU_IO -c 'write 0 1M' "$TEST_DIR/b.$IMGFMT" | _filter_qemu_io
+
+# With some data to stream (and @speed set to 1), block-stream will not 
complete
+# until we send the block-job-cancel command. Therefore, no event other than
+# BLOCK_JOB_CANCELLED will be emitted.
+
+test_blockjob \
+"{'execute': 'block-stream',
+  'arguments': {'device': 'drv0',
+'speed': 1}}" \

[Qemu-block] [PATCH 7/9] libqos/ahci: add ahci_exec

2015-11-09 Thread John Snow
add ahci_exec, which is a standard purpose flexible command dispatcher
and tester for the AHCI device. The intent is to eventually cut down on
the absurd amount of boilerplate inside of the AHCI qtest.

Signed-off-by: John Snow 
---
 tests/libqos/ahci.c | 76 +
 tests/libqos/ahci.h | 17 
 2 files changed, 93 insertions(+)

diff --git a/tests/libqos/ahci.c b/tests/libqos/ahci.c
index 0fa9bf2..6d1298b 100644
--- a/tests/libqos/ahci.c
+++ b/tests/libqos/ahci.c
@@ -601,6 +601,82 @@ inline unsigned size_to_prdtl(unsigned bytes, unsigned 
bytes_per_prd)
 return (bytes + bytes_per_prd - 1) / bytes_per_prd;
 }
 
+const AHCIOpts default_opts = { .size = 0 };
+
+/**
+ * ahci_exec: execute a given command on a specific
+ * AHCI port.
+ *
+ * @ahci: The device to send the command to
+ * @port: The port number of the SATA device we wish
+ *to have execute this command
+ * @op:   The S/ATA command to execute, or if opts.atapi
+ *is true, the SCSI command code.
+ * @opts: Optional arguments to modify execution behavior.
+ */
+void ahci_exec(AHCIQState *ahci, uint8_t port,
+   uint8_t op, const AHCIOpts *opts_in)
+{
+AHCICommand *cmd;
+int rc;
+AHCIOpts *opts;
+
+opts = g_memdup((opts_in == NULL ? _opts : opts_in),
+sizeof(AHCIOpts));
+
+/* No guest buffer provided, create one. */
+if (opts->size && !opts->buffer) {
+opts->buffer = ahci_alloc(ahci, opts->size);
+g_assert(opts->buffer);
+qmemset(opts->buffer, 0x00, opts->size);
+}
+
+/* Command creation */
+if (opts->atapi) {
+cmd = ahci_atapi_command_create(op);
+if (opts->atapi_dma) {
+ahci_command_enable_atapi_dma(cmd);
+}
+} else {
+cmd = ahci_command_create(op);
+}
+ahci_command_adjust(cmd, opts->lba, opts->buffer,
+opts->size, opts->prd_size);
+
+if (opts->pre_cb) {
+rc = opts->pre_cb(ahci, cmd, opts);
+g_assert_cmpint(rc, ==, 0);
+}
+
+/* Write command to memory and issue it */
+ahci_command_commit(ahci, cmd, port);
+ahci_command_issue_async(ahci, cmd);
+if (opts->error) {
+qmp_eventwait("STOP");
+}
+if (opts->mid_cb) {
+rc = opts->mid_cb(ahci, cmd, opts);
+g_assert_cmpint(rc, ==, 0);
+}
+if (opts->error) {
+qmp_async("{'execute':'cont' }");
+qmp_eventwait("RESUME");
+}
+
+/* Wait for command to complete and verify sanity */
+ahci_command_wait(ahci, cmd);
+ahci_command_verify(ahci, cmd);
+if (opts->post_cb) {
+rc = opts->post_cb(ahci, cmd, opts);
+g_assert_cmpint(rc, ==, 0);
+}
+ahci_command_free(cmd);
+if (opts->buffer != opts_in->buffer) {
+ahci_free(ahci, opts->buffer);
+}
+g_free(opts);
+}
+
 /* Issue a command, expecting it to fail and STOP the VM */
 AHCICommand *ahci_guest_io_halt(AHCIQState *ahci, uint8_t port,
 uint8_t ide_cmd, uint64_t buffer,
diff --git a/tests/libqos/ahci.h b/tests/libqos/ahci.h
index 705fbd6..2c2d2fc 100644
--- a/tests/libqos/ahci.h
+++ b/tests/libqos/ahci.h
@@ -462,6 +462,21 @@ typedef struct PRD {
 /* Opaque, defined within ahci.c */
 typedef struct AHCICommand AHCICommand;
 
+/* Options to ahci_exec */
+typedef struct AHCIOpts {
+size_t size;
+unsigned prd_size;
+uint64_t lba;
+uint64_t buffer;
+bool atapi;
+bool atapi_dma;
+bool error;
+int (*pre_cb)(AHCIQState*, AHCICommand*, const struct AHCIOpts *);
+int (*mid_cb)(AHCIQState*, AHCICommand*, const struct AHCIOpts *);
+int (*post_cb)(AHCIQState*, AHCICommand*, const struct AHCIOpts *);
+void *opaque;
+} AHCIOpts;
+
 /*** Macro Utilities ***/
 #define BITANY(data, mask) (((data) & (mask)) != 0)
 #define BITSET(data, mask) (((data) & (mask)) == (mask))
@@ -569,6 +584,8 @@ AHCICommand *ahci_guest_io_halt(AHCIQState *ahci, uint8_t 
port, uint8_t ide_cmd,
 void ahci_guest_io_resume(AHCIQState *ahci, AHCICommand *cmd);
 void ahci_io(AHCIQState *ahci, uint8_t port, uint8_t ide_cmd,
  void *buffer, size_t bufsize, uint64_t sector);
+void ahci_exec(AHCIQState *ahci, uint8_t port,
+   uint8_t op, const AHCIOpts *opts);
 
 /* Command Lifecycle */
 AHCICommand *ahci_command_create(uint8_t command_name);
-- 
2.4.3




[Qemu-block] [PATCH 5/9] libqos: allow zero-size allocations

2015-11-09 Thread John Snow
As part of streamlining the AHCI tests interface, it'd be nice
if specying a size of zero could be handled without special branches
and the allocator could handle this special case gracefully.

This lets me use the "ahci_io" macros for non-data commands, too,
which moves me forward towards shepherding all AHCI qtests into
a common set of commands in a unified pipeline.

Signed-off-by: John Snow 
---
 tests/ahci-test.c | 8 +---
 tests/libqos/ahci.c   | 6 +++---
 tests/libqos/malloc.c | 4 
 3 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/tests/ahci-test.c b/tests/ahci-test.c
index 58d0545..f18669c 100644
--- a/tests/ahci-test.c
+++ b/tests/ahci-test.c
@@ -879,18 +879,12 @@ static void ahci_test_io_rw_simple(AHCIQState *ahci, 
unsigned bufsize,
 static uint8_t ahci_test_nondata(AHCIQState *ahci, uint8_t ide_cmd)
 {
 uint8_t port;
-AHCICommand *cmd;
 
 /* Sanitize */
 port = ahci_port_select(ahci);
 ahci_port_clear(ahci, port);
 
-/* Issue Command */
-cmd = ahci_command_create(ide_cmd);
-ahci_command_commit(ahci, cmd, port);
-ahci_command_issue(ahci, cmd);
-ahci_command_verify(ahci, cmd);
-ahci_command_free(cmd);
+ahci_io(ahci, port, ide_cmd, NULL, 0, 0);
 
 return port;
 }
diff --git a/tests/libqos/ahci.c b/tests/libqos/ahci.c
index a219f67..df29560 100644
--- a/tests/libqos/ahci.c
+++ b/tests/libqos/ahci.c
@@ -668,16 +668,16 @@ void ahci_io(AHCIQState *ahci, uint8_t port, uint8_t 
ide_cmd,
 props = ahci_command_find(ide_cmd);
 g_assert(props);
 ptr = ahci_alloc(ahci, bufsize);
-g_assert(ptr);
+g_assert(!bufsize || ptr);
 qmemset(ptr, 0x00, bufsize);
 
-if (props->write) {
+if (bufsize && props->write) {
 bufwrite(ptr, buffer, bufsize);
 }
 
 ahci_guest_io(ahci, port, ide_cmd, ptr, bufsize, sector);
 
-if (props->read) {
+if (bufsize && props->read) {
 bufread(ptr, buffer, bufsize);
 }
 
diff --git a/tests/libqos/malloc.c b/tests/libqos/malloc.c
index 82b9df5..19d05ca 100644
--- a/tests/libqos/malloc.c
+++ b/tests/libqos/malloc.c
@@ -270,6 +270,10 @@ uint64_t guest_alloc(QGuestAllocator *allocator, size_t 
size)
 uint64_t rsize = size;
 uint64_t naddr;
 
+if (!size) {
+return 0;
+}
+
 rsize += (allocator->page_size - 1);
 rsize &= -allocator->page_size;
 g_assert_cmpint((allocator->start + rsize), <=, allocator->end);
-- 
2.4.3




[Qemu-block] [PATCH 0/9] ahci: atapi qtests

2015-11-09 Thread John Snow
Add ATAPI support into libqos/ahci, and write a few tests for it.
I intend this for 2.5, as it's just a test and might help unearth
some bugs during release candidate testing.

This is the last "batch" of planned qtests for s/ata -- basic i/o
testing of HDDs and CDROMs on both PCI and AHCI should be complete
after this series.



For convenience, this branch is available at:
https://github.com/jnsnow/qemu.git branch ahci-atapi-qtests
https://github.com/jnsnow/qemu/tree/ahci-atapi-qtests

This version is tagged ahci-atapi-qtests-v1:
https://github.com/jnsnow/qemu/releases/tag/ahci-atapi-qtests-v1

John Snow (9):
  ahci-test: fix memory leak
  libqos/ahci: ATAPI support
  libqos/ahci: ATAPI identify
  libqos/ahci: Switch to mutable properties
  libqos: allow zero-size allocations
  libqos/ahci: allow nondata commands for ahci_io variants
  libqos/ahci: add ahci_exec
  qtest/ahci: ATAPI data tests
  libqos/ahci: organize header

 tests/ahci-test.c | 131 ++--
 tests/libqos/ahci.c   | 181 +++---
 tests/libqos/ahci.h   |  66 +++---
 tests/libqos/malloc.c |   4 ++
 4 files changed, 341 insertions(+), 41 deletions(-)

-- 
2.4.3




[Qemu-block] [PATCH 2/9] libqos/ahci: ATAPI support

2015-11-09 Thread John Snow
Add pathways to tolerate ATAPI commands.

Notably, unlike ATA, each SCSI command's layout is a little different,
so support will have to be patched in for each command as we want to
test them in e.g. ahci_command_set_sizes and ahci_command_set_offset.

For now, I'm adding support for 0x28, READ (10).

Signed-off-by: John Snow 
---
 tests/libqos/ahci.c | 83 ++---
 tests/libqos/ahci.h | 14 +
 2 files changed, 93 insertions(+), 4 deletions(-)

diff --git a/tests/libqos/ahci.c b/tests/libqos/ahci.c
index adb2665..59bf893 100644
--- a/tests/libqos/ahci.c
+++ b/tests/libqos/ahci.c
@@ -74,7 +74,11 @@ AHCICommandProp ahci_command_properties[] = {
  .lba48 = true, .write = true, .ncq = true },
 { .cmd = CMD_READ_MAX,   .lba28 = true },
 { .cmd = CMD_READ_MAX_EXT,   .lba48 = true },
-{ .cmd = CMD_FLUSH_CACHE,.data = false }
+{ .cmd = CMD_FLUSH_CACHE,.data = false },
+{ .cmd = CMD_PACKET, .data = true,  .size = 16,
+ .atapi = true, },
+{ .cmd = CMD_PACKET_ID,  .data = true,  .pio = true,
+ .size = 512,   .read = true }
 };
 
 struct AHCICommand {
@@ -90,7 +94,7 @@ struct AHCICommand {
 /* Data to be transferred to the guest */
 AHCICommandHeader header;
 RegH2DFIS fis;
-void *atapi_cmd;
+unsigned char *atapi_cmd;
 };
 
 /**
@@ -731,6 +735,13 @@ static void command_table_init(AHCICommand *cmd)
 memset(fis->aux, 0x00, ARRAY_SIZE(fis->aux));
 }
 
+void ahci_command_enable_atapi_dma(AHCICommand *cmd)
+{
+RegH2DFIS *fis = &(cmd->fis);
+g_assert(cmd->props->atapi);
+fis->feature_low |= 0x01;
+}
+
 AHCICommand *ahci_command_create(uint8_t command_name)
 {
 AHCICommandProp *props = ahci_command_find(command_name);
@@ -767,8 +778,22 @@ AHCICommand *ahci_command_create(uint8_t command_name)
 return cmd;
 }
 
+AHCICommand *ahci_atapi_command_create(uint8_t scsi_cmd)
+{
+AHCICommand *cmd = ahci_command_create(CMD_PACKET);
+cmd->atapi_cmd = g_malloc0(16);
+cmd->atapi_cmd[0] = scsi_cmd;
+/* ATAPI needs a PIO transfer chunk size set inside of the LBA registers.
+ * The block/sector size is a natural default. */
+cmd->fis.lba_lo[1] = ATAPI_SECTOR_SIZE >> 8 & 0xFF;
+cmd->fis.lba_lo[2] = ATAPI_SECTOR_SIZE & 0xFF;
+
+return cmd;
+}
+
 void ahci_command_free(AHCICommand *cmd)
 {
+g_free(cmd->atapi_cmd);
 g_free(cmd);
 }
 
@@ -782,10 +807,33 @@ void ahci_command_clr_flags(AHCICommand *cmd, uint16_t 
cmdh_flags)
 cmd->header.flags &= ~cmdh_flags;
 }
 
+static void ahci_atapi_command_set_offset(AHCICommand *cmd, uint64_t lba)
+{
+unsigned char *cbd = cmd->atapi_cmd;
+uint32_t *lba32;
+g_assert(cbd);
+
+switch (cbd[0]) {
+case CMD_ATAPI_READ_10:
+g_assert_cmpuint(lba, <=, UINT32_MAX);
+lba32 = (uint32_t *)&(cbd[2]);
+*lba32 = cpu_to_be32(lba);
+break;
+default:
+/* SCSI doesn't have uniform packet formats,
+ * so you have to add support for it manually. Sorry! */
+g_assert_not_reached();
+}
+}
+
 void ahci_command_set_offset(AHCICommand *cmd, uint64_t lba_sect)
 {
 RegH2DFIS *fis = &(cmd->fis);
-if (cmd->props->lba28) {
+
+if (cmd->props->atapi) {
+ahci_atapi_command_set_offset(cmd, lba_sect);
+return;
+} else if (cmd->props->lba28) {
 g_assert_cmphex(lba_sect, <=, 0xFFF);
 } else if (cmd->props->lba48 || cmd->props->ncq) {
 g_assert_cmphex(lba_sect, <=, 0x);
@@ -811,6 +859,26 @@ void ahci_command_set_buffer(AHCICommand *cmd, uint64_t 
buffer)
 cmd->buffer = buffer;
 }
 
+static void ahci_atapi_set_size(AHCICommand *cmd, uint64_t xbytes)
+{
+unsigned char *cbd = cmd->atapi_cmd;
+uint64_t nsectors = xbytes / 2048;
+uint16_t *nsector16;
+g_assert(cbd);
+
+switch (cbd[0]) {
+case CMD_ATAPI_READ_10:
+g_assert_cmpuint(nsectors, <=, UINT16_MAX);
+nsector16 = (uint16_t *)&(cbd[7]);
+*nsector16 = cpu_to_be16(nsectors);
+break;
+default:
+/* SCSI doesn't have uniform packet formats,
+ * so you have to add support for it manually. Sorry! */
+g_assert_not_reached();
+}
+}
+
 void ahci_command_set_sizes(AHCICommand *cmd, uint64_t xbytes,
 unsigned prd_size)
 {
@@ -829,6 +897,8 @@ void ahci_command_set_sizes(AHCICommand *cmd, uint64_t 
xbytes,
 NCQFIS *nfis = (NCQFIS *)&(cmd->fis);
 nfis->sector_low = sect_count & 0xFF;
 nfis->sector_hi = (sect_count >> 8) & 0xFF;
+} else if (cmd->props->atapi) {
+ahci_atapi_set_size(cmd, xbytes);
 } else {
 cmd->fis.count = sect_count;
 }
@@ -877,9 +947,14 @@ void ahci_command_commit(AHCIQState *ahci, AHCICommand 
*cmd, uint8_t port)
 g_assert((table_ptr & 0x7F) == 0x00);
 cmd->header.ctba = 

[Qemu-block] [PATCH 4/9] libqos/ahci: Switch to mutable properties

2015-11-09 Thread John Snow
ATAPI commands are, unfortunately, weird in that they can
be either DMA or PIO depending on a header bit. In order to
accommodate them, I'll need to make AHCI command properties
mutable so we can toggle between which "flavor" of ATAPI command
we want to test.

The default ATAPI transfer mechanism is PIO and the default
properties are adjusted accordingly.

Signed-off-by: John Snow 
---
 tests/libqos/ahci.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/tests/libqos/ahci.c b/tests/libqos/ahci.c
index 81edf34..a219f67 100644
--- a/tests/libqos/ahci.c
+++ b/tests/libqos/ahci.c
@@ -76,7 +76,7 @@ AHCICommandProp ahci_command_properties[] = {
 { .cmd = CMD_READ_MAX_EXT,   .lba48 = true },
 { .cmd = CMD_FLUSH_CACHE,.data = false },
 { .cmd = CMD_PACKET, .data = true,  .size = 16,
- .atapi = true, },
+ .atapi = true, .pio = true },
 { .cmd = CMD_PACKET_ID,  .data = true,  .pio = true,
  .size = 512,   .read = true }
 };
@@ -745,6 +745,11 @@ void ahci_command_enable_atapi_dma(AHCICommand *cmd)
 RegH2DFIS *fis = &(cmd->fis);
 g_assert(cmd->props->atapi);
 fis->feature_low |= 0x01;
+cmd->interrupts &= ~AHCI_PX_IS_PSS;
+cmd->props->dma = true;
+cmd->props->pio = false;
+/* BUG: We expect the DMA Setup interrupt for DMA commands */
+/* cmd->interrupts |= AHCI_PX_IS_DSS; */
 }
 
 AHCICommand *ahci_command_create(uint8_t command_name)
@@ -761,7 +766,7 @@ AHCICommand *ahci_command_create(uint8_t command_name)
 g_assert(!props->ncq || props->lba48);
 
 /* Defaults and book-keeping */
-cmd->props = props;
+cmd->props = g_memdup(props, sizeof(AHCICommandProp));
 cmd->name = command_name;
 cmd->xbytes = props->size;
 cmd->prd_size = 4096;
@@ -799,6 +804,7 @@ AHCICommand *ahci_atapi_command_create(uint8_t scsi_cmd)
 void ahci_command_free(AHCICommand *cmd)
 {
 g_free(cmd->atapi_cmd);
+g_free(cmd->props);
 g_free(cmd);
 }
 
-- 
2.4.3




Re: [Qemu-block] [Qemu-devel] [PATCH V3 1/6] ide/atapi: make PIO read requests async

2015-11-09 Thread John Snow


On 11/06/2015 03:42 AM, Peter Lieven wrote:
> PIO read requests on the ATAPI interface used to be sync blk requests.
> This has two significant drawbacks. First the main loop hangs util an
> I/O request is completed and secondly if the I/O request does not
> complete (e.g. due to an unresponsive storage) Qemu hangs completely.
> 
> Note: Due to possible race conditions requests during an ongoing
> elementary transfer are still sync.
> 
> Signed-off-by: Peter Lieven 
> ---
>  hw/ide/atapi.c | 97 
> ++
>  1 file changed, 85 insertions(+), 12 deletions(-)
> 
> diff --git a/hw/ide/atapi.c b/hw/ide/atapi.c
> index 747f466..29fd131 100644
> --- a/hw/ide/atapi.c
> +++ b/hw/ide/atapi.c
> @@ -105,33 +105,98 @@ static void cd_data_to_raw(uint8_t *buf, int lba)
>  memset(buf, 0, 288);
>  }
>  
> -static int cd_read_sector(IDEState *s, int lba, uint8_t *buf, int 
> sector_size)
> +static int
> +cd_read_sector_sync(IDEState *s, uint8_t *buf)
>  {
>  int ret;
>  
> -switch(sector_size) {
> +#ifdef DEBUG_IDE_ATAPI
> +printf("cd_read_sector_sync: lba=%d\n", s->lba);
> +#endif
> +
> +switch (s->cd_sector_size) {
>  case 2048:
>  block_acct_start(blk_get_stats(s->blk), >acct,
>   4 * BDRV_SECTOR_SIZE, BLOCK_ACCT_READ);
> -ret = blk_read(s->blk, (int64_t)lba << 2, buf, 4);
> +ret = blk_read(s->blk, (int64_t)s->lba << 2, buf, 4);
>  block_acct_done(blk_get_stats(s->blk), >acct);
>  break;
>  case 2352:
>  block_acct_start(blk_get_stats(s->blk), >acct,
>   4 * BDRV_SECTOR_SIZE, BLOCK_ACCT_READ);
> -ret = blk_read(s->blk, (int64_t)lba << 2, buf + 16, 4);
> +ret = blk_read(s->blk, (int64_t)s->lba << 2, buf + 16, 4);
>  block_acct_done(blk_get_stats(s->blk), >acct);
>  if (ret < 0)
>  return ret;
> -cd_data_to_raw(buf, lba);
> +cd_data_to_raw(buf, s->lba);
>  break;
>  default:
>  ret = -EIO;
>  break;
>  }
> +
> +if (!ret) {
> +s->lba++;
> +s->io_buffer_index = 0;
> +}
> +
>  return ret;
>  }
>  
> +static void cd_read_sector_cb(void *opaque, int ret)
> +{
> +IDEState *s = opaque;
> +
> +block_acct_done(blk_get_stats(s->blk), >acct);
> +
> +#ifdef DEBUG_IDE_ATAPI
> +printf("cd_read_sector_cb: lba=%d ret=%d\n", s->lba, ret);
> +#endif
> +
> +if (ret < 0) {
> +ide_atapi_io_error(s, ret);
> +return;
> +}
> +
> +if (s->cd_sector_size == 2352) {
> +cd_data_to_raw(s->io_buffer, s->lba);
> +}
> +
> +s->lba++;
> +s->io_buffer_index = 0;
> +s->status &= ~BUSY_STAT;
> +
> +ide_atapi_cmd_reply_end(s);
> +}
> +
> +static int cd_read_sector(IDEState *s, void *buf)
> +{
> +if (s->cd_sector_size != 2048 && s->cd_sector_size != 2352) {
> +return -EINVAL;
> +}
> +
> +s->iov.iov_base = buf;
> +if (s->cd_sector_size == 2352) {
> +buf += 16;
> +}
> +
> +s->iov.iov_len = 4 * BDRV_SECTOR_SIZE;
> +qemu_iovec_init_external(>qiov, >iov, 1);
> +
> +#ifdef DEBUG_IDE_ATAPI
> +printf("cd_read_sector: lba=%d\n", s->lba);
> +#endif
> +
> +block_acct_start(blk_get_stats(s->blk), >acct,
> + 4 * BDRV_SECTOR_SIZE, BLOCK_ACCT_READ);
> +
> +blk_aio_readv(s->blk, (int64_t)s->lba << 2, >qiov, 4,
> +  cd_read_sector_cb, s);
> +
> +s->status |= BUSY_STAT;
> +return 0;
> +}
> +
>  void ide_atapi_cmd_ok(IDEState *s)
>  {
>  s->error = 0;
> @@ -182,18 +247,27 @@ void ide_atapi_cmd_reply_end(IDEState *s)
>  ide_atapi_cmd_ok(s);
>  ide_set_irq(s->bus);
>  #ifdef DEBUG_IDE_ATAPI
> -printf("status=0x%x\n", s->status);
> +printf("end of transfer, status=0x%x\n", s->status);
>  #endif
>  } else {
>  /* see if a new sector must be read */
>  if (s->lba != -1 && s->io_buffer_index >= s->cd_sector_size) {
> -ret = cd_read_sector(s, s->lba, s->io_buffer, s->cd_sector_size);
> -if (ret < 0) {
> -ide_atapi_io_error(s, ret);
> +if (!s->elementary_transfer_size) {
> +ret = cd_read_sector(s, s->io_buffer);
> +if (ret < 0) {
> +ide_atapi_io_error(s, ret);
> +}
>  return;
> +} else {
> +/* rebuffering within an elementary transfer is
> + * only possible with a sync request because we
> + * end up with a race condition otherwise */
> +ret = cd_read_sector_sync(s, s->io_buffer);
> +if (ret < 0) {
> +ide_atapi_io_error(s, ret);
> +return;
> +}
>  }
> -s->lba++;
> -s->io_buffer_index = 0;
>  }
>  if (s->elementary_transfer_size > 0) 

[Qemu-block] [PATCH v4 2/9] block: Track flush requests

2015-11-09 Thread Fam Zheng
Both bdrv_flush and bdrv_aio_flush eventually call bdrv_co_flush, add
tracked_request_begin and tracked_request_end pair in that function so
that all flush requests are now tracked.

Signed-off-by: Fam Zheng 
Reviewed-by: Kevin Wolf 
---
 block/io.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/block/io.c b/block/io.c
index 793809a..a9a49e4 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2318,18 +2318,20 @@ static void coroutine_fn bdrv_flush_co_entry(void 
*opaque)
 int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
 {
 int ret;
+BdrvTrackedRequest req;
 
 if (!bs || !bdrv_is_inserted(bs) || bdrv_is_read_only(bs) ||
 bdrv_is_sg(bs)) {
 return 0;
 }
 
+tracked_request_begin(, bs, 0, 0, BDRV_TRACKED_FLUSH);
 /* Write back cached data to the OS even with cache=unsafe */
 BLKDBG_EVENT(bs->file, BLKDBG_FLUSH_TO_OS);
 if (bs->drv->bdrv_co_flush_to_os) {
 ret = bs->drv->bdrv_co_flush_to_os(bs);
 if (ret < 0) {
-return ret;
+goto out;
 }
 }
 
@@ -2369,14 +2371,17 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
 ret = 0;
 }
 if (ret < 0) {
-return ret;
+goto out;
 }
 
 /* Now flush the underlying protocol.  It will also have BDRV_O_NO_FLUSH
  * in the case of cache=unsafe, so there are no useless flushes.
  */
 flush_parent:
-return bs->file ? bdrv_co_flush(bs->file->bs) : 0;
+ret = bs->file ? bdrv_co_flush(bs->file->bs) : 0;
+out:
+tracked_request_end();
+return ret;
 }
 
 int bdrv_flush(BlockDriverState *bs)
-- 
2.4.3




[Qemu-block] [PATCH v4 1/9] block: Add more types for tracked request

2015-11-09 Thread Fam Zheng
We'll track more request types besides read and write, change the
boolean field to an enum.

Signed-off-by: Fam Zheng 
Reviewed-by: Kevin Wolf 
---
 block/io.c|  9 +
 include/block/block_int.h | 10 +-
 2 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/block/io.c b/block/io.c
index 8dcad3b..793809a 100644
--- a/block/io.c
+++ b/block/io.c
@@ -348,13 +348,14 @@ static void tracked_request_end(BdrvTrackedRequest *req)
 static void tracked_request_begin(BdrvTrackedRequest *req,
   BlockDriverState *bs,
   int64_t offset,
-  unsigned int bytes, bool is_write)
+  unsigned int bytes,
+  enum BdrvTrackedRequestType type)
 {
 *req = (BdrvTrackedRequest){
 .bs = bs,
 .offset = offset,
 .bytes  = bytes,
-.is_write   = is_write,
+.type   = type,
 .co = qemu_coroutine_self(),
 .serialising= false,
 .overlap_offset = offset,
@@ -971,7 +972,7 @@ static int coroutine_fn bdrv_co_do_preadv(BlockDriverState 
*bs,
 bytes = ROUND_UP(bytes, align);
 }
 
-tracked_request_begin(, bs, offset, bytes, false);
+tracked_request_begin(, bs, offset, bytes, BDRV_TRACKED_READ);
 ret = bdrv_aligned_preadv(bs, , offset, bytes, align,
   use_local_qiov ? _qiov : qiov,
   flags);
@@ -1292,7 +1293,7 @@ static int coroutine_fn 
bdrv_co_do_pwritev(BlockDriverState *bs,
  * Pad qiov with the read parts and be sure to have a tracked request not
  * only for bdrv_aligned_pwritev, but also for the reads of the RMW cycle.
  */
-tracked_request_begin(, bs, offset, bytes, true);
+tracked_request_begin(, bs, offset, bytes, BDRV_TRACKED_WRITE);
 
 if (!qiov) {
 ret = bdrv_co_do_zero_pwritev(bs, offset, bytes, flags, );
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 3ceeb5a..7db9900 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -60,11 +60,19 @@
 
 #define BLOCK_PROBE_BUF_SIZE512
 
+enum BdrvTrackedRequestType {
+BDRV_TRACKED_READ,
+BDRV_TRACKED_WRITE,
+BDRV_TRACKED_FLUSH,
+BDRV_TRACKED_IOCTL,
+BDRV_TRACKED_DISCARD,
+};
+
 typedef struct BdrvTrackedRequest {
 BlockDriverState *bs;
 int64_t offset;
 unsigned int bytes;
-bool is_write;
+enum BdrvTrackedRequestType type;
 
 bool serialising;
 int64_t overlap_offset;
-- 
2.4.3




[Qemu-block] [PATCH v4 9/9] qed: Implement .bdrv_drain

2015-11-09 Thread Fam Zheng
The "need_check_timer" is used to clear the "NEED_CHECK" flag in the
image header after a grace period once metadata update has finished. In
compliance to the bdrv_drain semantics we should make sure it remains
deleted once .bdrv_drain is called.

We cannot reuse qed_need_check_timer_cb because here it doesn't satisfy
the assertion.  Do the "plug" and "flush" calls manually.

Signed-off-by: Fam Zheng 
Reviewed-by: Kevin Wolf 
---
 block/qed.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/block/qed.c b/block/qed.c
index 5ea05d4..9b88895 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -375,6 +375,18 @@ static void bdrv_qed_attach_aio_context(BlockDriverState 
*bs,
 }
 }
 
+static void bdrv_qed_drain(BlockDriverState *bs)
+{
+BDRVQEDState *s = bs->opaque;
+
+/* Cancel timer and start doing I/O that were meant to happen as if it
+ * fired, that way we get bdrv_drain() taking care of the ongoing requests
+ * correctly. */
+qed_cancel_need_check_timer(s);
+qed_plug_allocating_write_reqs(s);
+bdrv_aio_flush(s->bs, qed_clear_need_check, s);
+}
+
 static int bdrv_qed_open(BlockDriverState *bs, QDict *options, int flags,
  Error **errp)
 {
@@ -1676,6 +1688,7 @@ static BlockDriver bdrv_qed = {
 .bdrv_check   = bdrv_qed_check,
 .bdrv_detach_aio_context  = bdrv_qed_detach_aio_context,
 .bdrv_attach_aio_context  = bdrv_qed_attach_aio_context,
+.bdrv_drain   = bdrv_qed_drain,
 };
 
 static void bdrv_qed_init(void)
-- 
2.4.3




Re: [Qemu-block] [PATCH v2] virtio-blk: trivial code optimization

2015-11-09 Thread Paolo Bonzini


On 09/11/2015 04:19, arei.gong...@huawei.com wrote:
> + * 1. requests are not sequential
> + * 2. merge would exceed maximum number of IOVs
> + * 3. merge would exceed maximum transfer length of backend 
> device
> + */
> +if (sector_num + nb_sectors != req->sector_num ||
> +niov > IOV_MAX - req->qiov.niov ||
> +req->qiov.size / BDRV_SECTOR_SIZE + nb_sectors > 
> max_xfer_len) {

Hi Gonglei,

the third condition should also be changed to "new > max - old".

Thanks,

Paolo



Re: [Qemu-block] [PATCH v2] virtio-blk: trivial code optimization

2015-11-09 Thread Gonglei
On 2015/11/9 16:41, Paolo Bonzini wrote:
> 
> 
> On 09/11/2015 04:19, arei.gong...@huawei.com wrote:
>> + * 1. requests are not sequential
>> + * 2. merge would exceed maximum number of IOVs
>> + * 3. merge would exceed maximum transfer length of backend 
>> device
>> + */
>> +if (sector_num + nb_sectors != req->sector_num ||
>> +niov > IOV_MAX - req->qiov.niov ||
>> +req->qiov.size / BDRV_SECTOR_SIZE + nb_sectors > 
>> max_xfer_len) {
> 
> Hi Gonglei,
> 
> the third condition should also be changed to "new > max - old".
> 
Okay, will do.  :)

Thanks,
-Gonglei





[Qemu-block] [PATCH v3] virtio-blk: trivial code optimization

2015-11-09 Thread arei.gonglei
From: Gonglei 

1. avoid possible superflous checking
2. make code more robustness

Signed-off-by: Gonglei 
Reviewed-by: Fam Zheng 
---
v3: change the third condition too [Paolo]
add Fam's R-by
---
 hw/block/virtio-blk.c | 27 +--
 1 file changed, 9 insertions(+), 18 deletions(-)

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 093e475..9124358 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -404,24 +404,15 @@ void virtio_blk_submit_multireq(BlockBackend *blk, 
MultiReqBuffer *mrb)
 for (i = 0; i < mrb->num_reqs; i++) {
 VirtIOBlockReq *req = mrb->reqs[i];
 if (num_reqs > 0) {
-bool merge = true;
-
-/* merge would exceed maximum number of IOVs */
-if (niov + req->qiov.niov > IOV_MAX) {
-merge = false;
-}
-
-/* merge would exceed maximum transfer length of backend device */
-if (req->qiov.size / BDRV_SECTOR_SIZE + nb_sectors > max_xfer_len) 
{
-merge = false;
-}
-
-/* requests are not sequential */
-if (sector_num + nb_sectors != req->sector_num) {
-merge = false;
-}
-
-if (!merge) {
+/*
+ * NOTE: We cannot merge the requests in below situations:
+ * 1. requests are not sequential
+ * 2. merge would exceed maximum number of IOVs
+ * 3. merge would exceed maximum transfer length of backend device
+ */
+if (sector_num + nb_sectors != req->sector_num ||
+niov > IOV_MAX - req->qiov.niov ||
+nb_sectors > max_xfer_len - req->qiov.size / BDRV_SECTOR_SIZE) 
{
 submit_requests(blk, mrb, start, num_reqs, niov);
 num_reqs = 0;
 }
-- 
1.7.12.4





[Qemu-block] [PATCH v4 4/9] iscsi: Emulate commands in iscsi_aio_ioctl as iscsi_ioctl

2015-11-09 Thread Fam Zheng
iscsi_ioctl emulates SG_GET_VERSION_NUM and SG_GET_SCSI_ID. Now that
bdrv_ioctl() will be emulated with .bdrv_aio_ioctl, replicate the logic
into iscsi_aio_ioctl to make them consistent.

Signed-off-by: Fam Zheng 
Reviewed-by: Kevin Wolf 
---
 block/iscsi.c | 40 ++--
 1 file changed, 38 insertions(+), 2 deletions(-)

diff --git a/block/iscsi.c b/block/iscsi.c
index 9a628b7..b3fa0a0 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -96,6 +96,7 @@ typedef struct IscsiAIOCB {
 int status;
 int64_t sector_num;
 int nb_sectors;
+int ret;
 #ifdef __linux__
 sg_io_hdr_t *ioh;
 #endif
@@ -726,6 +727,38 @@ iscsi_aio_ioctl_cb(struct iscsi_context *iscsi, int status,
 iscsi_schedule_bh(acb);
 }
 
+static void iscsi_ioctl_bh_completion(void *opaque)
+{
+IscsiAIOCB *acb = opaque;
+
+qemu_bh_delete(acb->bh);
+acb->common.cb(acb->common.opaque, acb->ret);
+qemu_aio_unref(acb);
+}
+
+static void iscsi_ioctl_handle_emulated(IscsiAIOCB *acb, int req, void *buf)
+{
+BlockDriverState *bs = acb->common.bs;
+IscsiLun *iscsilun = bs->opaque;
+int ret = 0;
+
+switch (req) {
+case SG_GET_VERSION_NUM:
+*(int *)buf = 3;
+break;
+case SG_GET_SCSI_ID:
+((struct sg_scsi_id *)buf)->scsi_type = iscsilun->type;
+break;
+default:
+ret = -EINVAL;
+}
+assert(!acb->bh);
+acb->bh = aio_bh_new(bdrv_get_aio_context(bs),
+ iscsi_ioctl_bh_completion, acb);
+acb->ret = ret;
+qemu_bh_schedule(acb->bh);
+}
+
 static BlockAIOCB *iscsi_aio_ioctl(BlockDriverState *bs,
 unsigned long int req, void *buf,
 BlockCompletionFunc *cb, void *opaque)
@@ -735,8 +768,6 @@ static BlockAIOCB *iscsi_aio_ioctl(BlockDriverState *bs,
 struct iscsi_data data;
 IscsiAIOCB *acb;
 
-assert(req == SG_IO);
-
 acb = qemu_aio_get(_aiocb_info, bs, cb, opaque);
 
 acb->iscsilun = iscsilun;
@@ -745,6 +776,11 @@ static BlockAIOCB *iscsi_aio_ioctl(BlockDriverState *bs,
 acb->buf = NULL;
 acb->ioh = buf;
 
+if (req != SG_IO) {
+iscsi_ioctl_handle_emulated(acb, req, buf);
+return >common;
+}
+
 acb->task = malloc(sizeof(struct scsi_task));
 if (acb->task == NULL) {
 error_report("iSCSI: Failed to allocate task for scsi command. %s",
-- 
2.4.3




[Qemu-block] [PATCH v4 6/9] block: Emulate bdrv_ioctl with bdrv_aio_ioctl and track both

2015-11-09 Thread Fam Zheng
Currently all drivers that support .bdrv_aio_ioctl also implement
.bdrv_ioctl redundantly.  To track ioctl requests in block layer it is
easier if we unify the two paths, because we'll need to run it in a
coroutine, as required by tracked_request_begin. While we're at it, use
.bdrv_aio_ioctl plus aio_poll() to emulate bdrv_ioctl().

Signed-off-by: Fam Zheng 
---
 block/io.c | 101 +++--
 1 file changed, 92 insertions(+), 9 deletions(-)

diff --git a/block/io.c b/block/io.c
index 324ae5a..4ecb171 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2528,26 +2528,109 @@ int bdrv_discard(BlockDriverState *bs, int64_t 
sector_num, int nb_sectors)
 return rwco.ret;
 }
 
+typedef struct {
+CoroutineIOCompletion *co;
+QEMUBH *bh;
+} BdrvIoctlCompletionData;
+
+static void bdrv_ioctl_bh_cb(void *opaque)
+{
+BdrvIoctlCompletionData *data = opaque;
+
+bdrv_co_io_em_complete(data->co, -ENOTSUP);
+qemu_bh_delete(data->bh);
+}
+
+static int bdrv_co_do_ioctl(BlockDriverState *bs, int req, void *buf)
+{
+BlockDriver *drv = bs->drv;
+BdrvTrackedRequest tracked_req;
+CoroutineIOCompletion co = {
+.coroutine = qemu_coroutine_self(),
+};
+BlockAIOCB *acb;
+
+tracked_request_begin(_req, bs, 0, 0, BDRV_TRACKED_IOCTL);
+if (!drv || !drv->bdrv_aio_ioctl) {
+co.ret = -ENOTSUP;
+goto out;
+}
+
+acb = drv->bdrv_aio_ioctl(bs, req, buf, bdrv_co_io_em_complete, );
+if (!acb) {
+BdrvIoctlCompletionData *data = g_new(BdrvIoctlCompletionData, 1);
+data->bh = aio_bh_new(bdrv_get_aio_context(bs),
+bdrv_ioctl_bh_cb, data);
+data->co = 
+qemu_bh_schedule(data->bh);
+}
+qemu_coroutine_yield();
+out:
+tracked_request_end(_req);
+return co.ret;
+}
+
+typedef struct {
+BlockDriverState *bs;
+int req;
+void *buf;
+int ret;
+} BdrvIoctlCoData;
+
+static void coroutine_fn bdrv_co_ioctl_entry(void *opaque)
+{
+BdrvIoctlCoData *data = opaque;
+data->ret = bdrv_co_do_ioctl(data->bs, data->req, data->buf);
+}
+
 /* needed for generic scsi interface */
-
 int bdrv_ioctl(BlockDriverState *bs, unsigned long int req, void *buf)
 {
-BlockDriver *drv = bs->drv;
+BdrvIoctlCoData data = {
+.bs = bs,
+.req = req,
+.buf = buf,
+.ret = -EINPROGRESS,
+};
 
-if (drv && drv->bdrv_ioctl)
-return drv->bdrv_ioctl(bs, req, buf);
-return -ENOTSUP;
+if (qemu_in_coroutine()) {
+/* Fast-path if already in coroutine context */
+bdrv_co_ioctl_entry();
+} else {
+Coroutine *co = qemu_coroutine_create(bdrv_co_ioctl_entry);
+qemu_coroutine_enter(co, );
+}
+while (data.ret == -EINPROGRESS) {
+aio_poll(bdrv_get_aio_context(bs), true);
+}
+return data.ret;
+}
+
+static void coroutine_fn bdrv_co_aio_ioctl_entry(void *opaque)
+{
+BlockAIOCBCoroutine *acb = opaque;
+acb->req.error = bdrv_co_do_ioctl(acb->common.bs,
+  acb->req.req, acb->req.buf);
+bdrv_co_complete(acb);
 }
 
 BlockAIOCB *bdrv_aio_ioctl(BlockDriverState *bs,
 unsigned long int req, void *buf,
 BlockCompletionFunc *cb, void *opaque)
 {
-BlockDriver *drv = bs->drv;
+BlockAIOCBCoroutine *acb = qemu_aio_get(_em_co_aiocb_info,
+bs, cb, opaque);
+Coroutine *co;
 
-if (drv && drv->bdrv_aio_ioctl)
-return drv->bdrv_aio_ioctl(bs, req, buf, cb, opaque);
-return NULL;
+acb->need_bh = true;
+acb->req.error = -EINPROGRESS;
+acb->req.req = req;
+acb->req.buf = buf;
+co = qemu_coroutine_create(bdrv_co_aio_ioctl_entry);
+qemu_coroutine_enter(co, acb);
+
+bdrv_co_maybe_schedule_bh(acb);
+return >common;
 }
 
 void *qemu_blockalign(BlockDriverState *bs, size_t size)
-- 
2.4.3




Re: [Qemu-block] [PATCH v3 8/9] block: Introduce BlockDriver.bdrv_drain callback

2015-11-09 Thread Fam Zheng
On Mon, 11/09 10:24, Paolo Bonzini wrote:
> 
> 
> On 09/11/2015 03:56, Fam Zheng wrote:
> > +if (bs->drv && bs->drv->bdrv_drain) {
> > +bs->drv->bdrv_drain(bs);
> > +}
> > +QLIST_FOREACH(child, >children, next) {
> > +BlockDriverState *cbs = child->bs;
> > +if (cbs->drv && cbs->drv->bdrv_drain) {
> > +cbs->drv->bdrv_drain(bs);
> > +}
> > +}
> 
> I think this is not enough, because the children could have children as
> well.
> 
> Perhaps you can do it like bdrv_flush: one function does the call to
> bdrv_drain and the recursion on children; bdrv_drain calls that one
> function and then does the "while (busy)" loop.
> 

Right, this function is no longer recursive and only goes to one layer down.

Will fix.

Thanks,

Fam



Re: [Qemu-block] [PATCH v3 0/9] block: Fixes for bdrv_drain

2015-11-09 Thread Stefan Hajnoczi
On Mon, Nov 09, 2015 at 10:56:39AM +0800, Fam Zheng wrote:
> v3: Don't reuse coroutine in bdrv_aio_ioctl. [Stefan]
> Recursely call .bdrv_drain callback only. [Stefan, Paolo]

I don't understand this change.  I thought you want .bdrv_drain() to be
called on the whole tree, but the latest code seems to call it on the
root and children only.  It doesn't recurse the only the root and first
level of the tree get .bdrv_drain() calls.  Is this intentional?


signature.asc
Description: PGP signature


  1   2   >