Re: [Qemu-block] [PATCH v6 25/39] blockdev: Pull out blockdev option extraction

2015-10-14 Thread Kevin Wolf
Am 12.10.2015 um 22:00 hat Max Reitz geschrieben:
> Extract some of the blockdev option extraction code from blockdev_init()
> into its own function. This simplifies blockdev_init() and will allow
> reusing the code in a different function added in a follow-up patch.
> 
> Signed-off-by: Max Reitz 
> Reviewed-by: Alberto Garcia 

Reviewed-by: Kevin Wolf 

> +#ifdef CONFIG_LINUX_AIO
> +if ((aio = qemu_opt_get(opts, "aio")) != NULL) {
> +if (!strcmp(aio, "native")) {
> +*bdrv_flags |= BDRV_O_NATIVE_AIO;
> +} else if (!strcmp(aio, "threads")) {
> +/* this is the default */
> +} else {
> +   error_setg(errp, "invalid aio option");
> +   return;
> +}
> +}
> +#endif

Unrelated to your patch, but didn't we want to remove this #ifdef? I
thought we had already done so... Oh, I've never flushed block-next
after the 2.4 release. :-/

Expect a merge conflict coming up here...

Kevin



Re: [Qemu-block] [PATCH v6 23/39] block: Prepare for NULL BDS

2015-10-14 Thread Max Reitz
On 13.10.2015 17:37, Kevin Wolf wrote:
> Am 12.10.2015 um 22:00 hat Max Reitz geschrieben:
>> blk_bs() will not necessarily return a non-NULL value any more (unless
>> blk_is_available() is true or it can be assumed to otherwise, e.g.
>> because it is called immediately after a successful blk_new_with_bs() or
>> blk_new_open()).
>>
>> Signed-off-by: Max Reitz 
> 
>> @@ -1584,12 +1594,16 @@ static void drive_backup_prepare(BlkTransactionState 
>> *common, Error **errp)
>>"Device '%s' not found", backup->device);
>>  return;
>>  }
>> -bs = blk_bs(blk);
>>  
>>  /* AioContext is released in .clean() */
>> -state->aio_context = bdrv_get_aio_context(bs);
>> +state->aio_context = blk_get_aio_context(blk);
>>  aio_context_acquire(state->aio_context);
>>  
>> +if (!blk_is_available(blk)) {
>> +error_setg(errp, "Device '%s' has no medium", backup->device);
>> +return;
>> +}
>> +
>>  qmp_drive_backup(backup->device, backup->target,
>>   backup->has_format, backup->format,
>>   backup->sync,
>> @@ -1604,7 +1618,7 @@ static void drive_backup_prepare(BlkTransactionState 
>> *common, Error **errp)
>>  return;
>>  }
>>  
>> -state->bs = bs;
>> +state->bs = blk_bs(blk);
>>  state->job = state->bs->job;
>>  }
>>  
>> @@ -1639,8 +1653,7 @@ static void 
>> blockdev_backup_prepare(BlkTransactionState *common, Error **errp)
>>  {
>>  BlockdevBackupState *state = DO_UPCAST(BlockdevBackupState, common, 
>> common);
>>  BlockdevBackup *backup;
>> -BlockDriverState *bs, *target;
>> -BlockBackend *blk;
>> +BlockBackend *blk, *target;
>>  Error *local_err = NULL;
>>  
>>  assert(common->action->kind == TRANSACTION_ACTION_KIND_BLOCKDEV_BACKUP);
>> @@ -1651,18 +1664,16 @@ static void 
>> blockdev_backup_prepare(BlkTransactionState *common, Error **errp)
>>  error_setg(errp, "Device '%s' not found", backup->device);
>>  return;
>>  }
>> -bs = blk_bs(blk);
>>  
>> -blk = blk_by_name(backup->target);
>> -if (!blk) {
>> +target = blk_by_name(backup->target);
>> +if (!target) {
>>  error_setg(errp, "Device '%s' not found", backup->target);
>>  return;
>>  }
>> -target = blk_bs(blk);
>>  
>>  /* AioContext is released in .clean() */
>> -state->aio_context = bdrv_get_aio_context(bs);
>> -if (state->aio_context != bdrv_get_aio_context(target)) {
>> +state->aio_context = blk_get_aio_context(blk);
>> +if (state->aio_context != blk_get_aio_context(target)) {
>>  state->aio_context = NULL;
>>  error_setg(errp, "Backup between two IO threads is not 
>> implemented");
>>  return;
>> @@ -1680,7 +1691,7 @@ static void 
>> blockdev_backup_prepare(BlkTransactionState *common, Error **errp)
>>  return;
>>  }
>>  
>> -state->bs = bs;
>> +state->bs = blk_bs(blk);
>>  state->job = state->bs->job;
>>  }
> 
> It's somewhat inconsistent that blockdev_backup_prepare() doesn't do an
> explicit blk_is_available() check whereas drive_backup_prepare() does.
> As far as I can tell, both don't necessarily need their for their own
> code and in both cases the called qmp_*_backup() function checks it
> again.

Probably some kind of a rebase conflict. blockdev-backup was added only
rather recently, and so I made a different decision then than for
drive-backup.

I think I'll drop the check from drive_backup_prepare(). Having it in
qmp_drive_backup() is enough.

> Not really a problem, though, it just looks a bit odd.
> 
>> @@ -1818,10 +1829,10 @@ static void eject_device(BlockBackend *blk, int 
>> force, Error **errp)
>>  BlockDriverState *bs = blk_bs(blk);
>>  AioContext *aio_context;
>>  
>> -aio_context = bdrv_get_aio_context(bs);
>> +aio_context = blk_get_aio_context(blk);
>>  aio_context_acquire(aio_context);
>>  
>> -if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_EJECT, errp)) {
>> +if (bs && bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_EJECT, errp)) {
>>  goto out;
>>  }
>>  if (!blk_dev_has_removable_media(blk)) {
>>  error_setg(errp, "Device '%s' is not removable",
>> bdrv_get_device_name(bs));
> 
> bs isn't checked to be non-NULL here. You could argue that if it's not
> removable, it shouldn't be NULL, but I think we let drive_del forcibly
> remove the medium even for devices that can't deal with it and start
> producing I/O errors then.

I'd mainly argue that this function is to be removed anyway. :-)

I should just return immediately for NULL bs, as you're suggesting.

> 
>>  goto out;
>>  }
>>
>>  if (blk_dev_is_medium_locked(blk) && !blk_dev_is_tray_open(blk)) {
>>  blk_dev_eject_request(blk, force);
>>  if (!force) {
>>  error_setg(errp, "Device '%s' is locked",
>> bdrv_get_device_name(bs));
> 
> And here.
> 
>>   

Re: [Qemu-block] [PATCH v6 24/39] blockdev: Do not create BDS for empty drive

2015-10-14 Thread Max Reitz
On 14.10.2015 15:27, Kevin Wolf wrote:
> Am 12.10.2015 um 22:00 hat Max Reitz geschrieben:
>> Do not use "rudimentary" BDSs for empty drives any longer (for
>> freshly created drives).
>>
>> After a follow-up patch, empty drives will generally use a NULL BDS, not
>> only the freshly created drives.
>>
>> Signed-off-by: Max Reitz 
>> ---
>>  blockdev.c | 72 
>> ++
>>  1 file changed, 44 insertions(+), 28 deletions(-)
>>
>> diff --git a/blockdev.c b/blockdev.c
>> index 35efe84..845a1c1 100644
>> --- a/blockdev.c
>> +++ b/blockdev.c
>> @@ -514,16 +514,44 @@ static BlockBackend *blockdev_init(const char *file, 
>> QDict *bs_opts,
>>  goto early_err;
>>  }
>>  
>> +if (snapshot) {
>> +/* always use cache=unsafe with snapshot */
>> +bdrv_flags &= ~BDRV_O_CACHE_MASK;
>> +bdrv_flags |= (BDRV_O_SNAPSHOT|BDRV_O_CACHE_WB|BDRV_O_NO_FLUSH);
>> +}
>> +
>> +if (copy_on_read) {
>> +bdrv_flags |= BDRV_O_COPY_ON_READ;
>> +}
>> +
>> +if (runstate_check(RUN_STATE_INMIGRATE)) {
>> +bdrv_flags |= BDRV_O_INCOMING;
>> +}
> 
> Bad conflict resolution when you added patch 2, which moved the
> BDRV_O_INCOMING code to block.c?

Oops, indeed. I was wondering why I only removed it once from
blockdev_init(). Thanks.

> That patch is actually essential here as well: The flag would ends up in
> the root state, and bdrv_invalidate_cache_all() doesn't clear the flag
> any more. So we might end up opening an image with BDRV_O_INCOMING even
> though migration has long finished.

Yep. Will fix.

Max



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH V2 3/5] virtio-blk: disable scsi passthrough by default

2015-10-14 Thread Paolo Bonzini


On 14/10/2015 12:29, Cornelia Huck wrote:
> Do we want to change anything for 2.5 about the default?
> 
> Currently, we still default scsi to true, and you have to disable it
> explicitly if you want to use virtio-1 compliant virtio-blk devices
> (which is a bit annoying, as scsi passthrough is not something people
> usually want).

I think disabling it is best.  Can you send a patch?

Paolo



Re: [Qemu-block] [PATCH v10 08/10] Implement new driver for block replication

2015-10-14 Thread Stefan Hajnoczi
On Tue, Oct 13, 2015 at 05:08:17PM +0800, Wen Congyang wrote:
> On 10/13/2015 12:27 AM, Stefan Hajnoczi wrote:
> > On Fri, Sep 25, 2015 at 02:17:36PM +0800, Wen Congyang wrote:
> >> +/* start backup job now */
> >> +bdrv_op_unblock(s->hidden_disk, BLOCK_OP_TYPE_BACKUP_TARGET,
> >> +s->active_disk->backing_blocker);
> >> +bdrv_op_unblock(s->secondary_disk, BLOCK_OP_TYPE_BACKUP_SOURCE,
> >> +s->hidden_disk->backing_blocker);
> > 
> > Why is it safe to unblock these operations?
> > 
> > Why do they have to be blocked for non-replication users?
> 
> hidden_disk and secondary disk are opened as backing file, so it is blocked 
> for
> non-replication users.
> What can I do if I don't unblock it and want to do backup?

CCing Jeff Cody, block jobs maintainer

You need to explain why it is safe remove this protection.  We can't
merge code that may be unsafe.

I think we can investigate further by asking: when does QEMU code assume
the backing file is read-only?

I haven't checked but these cases come to mind:

Operations that move data between BDS in the backing chain (e.g. commit
and stream block jobs) will lose or overwrite data if the backing file
is being written to by another coroutine.

We need to prevent users from running these operations at the same time.

Also, accessing bs->backing_blocker is a layering violation.  No one
outside block.c:bdrv_set_backing_hd() is supposed to access this field.

Let's figure out the safety concerns first and then the
bs->backing_blocker access will probably be eliminated as part of the
solution.

Stefan



Re: [Qemu-block] [PATCH v6 25/39] blockdev: Pull out blockdev option extraction

2015-10-14 Thread Max Reitz
On 14.10.2015 17:12, Kevin Wolf wrote:
> Am 12.10.2015 um 22:00 hat Max Reitz geschrieben:
>> Extract some of the blockdev option extraction code from blockdev_init()
>> into its own function. This simplifies blockdev_init() and will allow
>> reusing the code in a different function added in a follow-up patch.
>>
>> Signed-off-by: Max Reitz 
>> Reviewed-by: Alberto Garcia 
> 
> Reviewed-by: Kevin Wolf 
> 
>> +#ifdef CONFIG_LINUX_AIO
>> +if ((aio = qemu_opt_get(opts, "aio")) != NULL) {
>> +if (!strcmp(aio, "native")) {
>> +*bdrv_flags |= BDRV_O_NATIVE_AIO;
>> +} else if (!strcmp(aio, "threads")) {
>> +/* this is the default */
>> +} else {
>> +   error_setg(errp, "invalid aio option");
>> +   return;
>> +}
>> +}
>> +#endif
> 
> Unrelated to your patch, but didn't we want to remove this #ifdef? I
> thought we had already done so... Oh, I've never flushed block-next
> after the 2.4 release. :-/
> 
> Expect a merge conflict coming up here...

It won't be the first to hit this series. ;-)

Max



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH 3/3] block: use bdrv_lookup_bs() over blk_by_name() for BDS only results

2015-10-14 Thread Jeff Cody
On Wed, Oct 14, 2015 at 08:05:34PM +0200, Max Reitz wrote:
> On 14.10.2015 15:16, Jeff Cody wrote:
> > To find a BlockDriverState interface, it can be done via blk_by_name(),
> > bdrv_find_node(), and bdrv_lookup_bs().  The latter can take the place
> > of the other two, in the instances where we are only concerned with the
> > BlockDriverState.
> > 
> > In much of the usage of blk_by_name(), we can simplify the code by
> > calling bdrv_lookup_bs() instead of blk_by_name(), when what we need is
> > just the BDS, and not the BB.
> > 
> > Signed-off-by: Jeff Cody 
> > ---
> >  blockdev.c| 84 
> > ---
> >  migration/block.c |  6 ++--
> >  2 files changed, 32 insertions(+), 58 deletions(-)
> 
> Again, if this series is based on Berto's blockdev-snapshot series, it
> should be based on my BB series as well. But with that series applied,
> this patch has some (non-trivial) rebase conflicts on it.
> 
> Also, it has a fundamental conflict with that series: If a BB can be
> found by bdrv_lookup_bs() but it doesn't have a BDS, it will return NULL
> and set *errp accordingly ("No medium in device"). However, this patch
> discards that error message and just keeps the one of the former
> blk_by_name() caller, which is generally "Device not found". That is
> wrong, however, if there is simply no medium in the device.
>

Good point.  We can just drop this patch, then - maybe at some point,
we can have a consolidated API to obtain a BDS (if having such a thing
is even all that important), that fits well with the design goal of
your series.

I think the first two patches are probably worth keeping, however.

> In some cases, that doesn't matter (since we just assume that if there
> is a BB, it will have a BDS, like for hmp_commit), but it most probably
> does matter for all the places which conflict with my BB series, the
> reason being that they generally conflict because I added a
> blk_is_available() check.
> 
> Note that this is a "design conflict", too: bdrv_lookup_bs() will return
> NULL only if blk_bs() is NULL. blk_is_available() may return false even
> if there is a BDS (if the BDS is unavailable, e.g. one of the BDSs in
> the tree not being inserted or the guest device's tray is open). But if
> you already have a BDS (because you used bdrv_lookup_bs()), calling
> blk_is_available() on bs->blk afterwards looks kind of strange...
> 
> Max
> 





Re: [Qemu-block] [PATCH 2/3] block: make bdrv_find_node() static

2015-10-14 Thread Max Reitz
On 14.10.2015 15:16, Jeff Cody wrote:
> This patch does two things: it moves bdrv_find_node() up before the
> first usage in block.c, and it makes the function static so that it
> is only internal to block.c.
> 
> Signed-off-by: Jeff Cody 
> ---
>  block.c   | 30 +++---
>  include/block/block.h |  1 -
>  2 files changed, 15 insertions(+), 16 deletions(-)

Reviewed-by: Max Reitz 



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH 3/3] block: use bdrv_lookup_bs() over blk_by_name() for BDS only results

2015-10-14 Thread Max Reitz
On 14.10.2015 15:16, Jeff Cody wrote:
> To find a BlockDriverState interface, it can be done via blk_by_name(),
> bdrv_find_node(), and bdrv_lookup_bs().  The latter can take the place
> of the other two, in the instances where we are only concerned with the
> BlockDriverState.
> 
> In much of the usage of blk_by_name(), we can simplify the code by
> calling bdrv_lookup_bs() instead of blk_by_name(), when what we need is
> just the BDS, and not the BB.
> 
> Signed-off-by: Jeff Cody 
> ---
>  blockdev.c| 84 
> ---
>  migration/block.c |  6 ++--
>  2 files changed, 32 insertions(+), 58 deletions(-)

Again, if this series is based on Berto's blockdev-snapshot series, it
should be based on my BB series as well. But with that series applied,
this patch has some (non-trivial) rebase conflicts on it.

Also, it has a fundamental conflict with that series: If a BB can be
found by bdrv_lookup_bs() but it doesn't have a BDS, it will return NULL
and set *errp accordingly ("No medium in device"). However, this patch
discards that error message and just keeps the one of the former
blk_by_name() caller, which is generally "Device not found". That is
wrong, however, if there is simply no medium in the device.

In some cases, that doesn't matter (since we just assume that if there
is a BB, it will have a BDS, like for hmp_commit), but it most probably
does matter for all the places which conflict with my BB series, the
reason being that they generally conflict because I added a
blk_is_available() check.

Note that this is a "design conflict", too: bdrv_lookup_bs() will return
NULL only if blk_bs() is NULL. blk_is_available() may return false even
if there is a BDS (if the BDS is unavailable, e.g. one of the BDSs in
the tree not being inserted or the guest device's tray is open). But if
you already have a BDS (because you used bdrv_lookup_bs()), calling
blk_is_available() on bs->blk afterwards looks kind of strange...

Max



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH 1/3] block: Use bdrv_lookup_bs() instead of bdrv_find_node()

2015-10-14 Thread Max Reitz
On 14.10.2015 19:29, Max Reitz wrote:
> On 14.10.2015 15:16, Jeff Cody wrote:
>> This is a precursor to making bdrv_find_node() static, and internal
>> to block.c
>>
>> To find a BlockDriverState interface, it can be done via blk_by_name(),
>> bdrv_find_node(), and bdrv_lookup_bs().  The latter can take the place
>> of the other two, in the instances where we are only concerned with
>> the BlockDriverState.
>>
>> There is no benefit in calling bdrv_find_node() directly.  This patch
>> replaces all calls to bdrv_find_node() outside of block.c with
>> bdrv_lookup_bs().
>>
>> Signed-off-by: Jeff Cody 
>> ---
>>  block/block-backend.c   | 2 +-
>>  block/mirror.c  | 2 +-
>>  block/write-threshold.c | 2 +-
>>  3 files changed, 3 insertions(+), 3 deletions(-)
> 
> Reviewed-by: Max Reitz 

Oh, wait, on patch 2 gcc tells me I should take that back. If this
series is based on Berto's series, that means it's also based on my
BlockBackend series, and that one adds another instance of bdrv_find_node().

Since I'll have to send a v7 anyway, I'll make it a bdrv_lookup_bs()
there, so my R-b stands.

Max



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH 1/3] block: Use bdrv_lookup_bs() instead of bdrv_find_node()

2015-10-14 Thread Jeff Cody
On Wed, Oct 14, 2015 at 07:34:43PM +0200, Max Reitz wrote:
> On 14.10.2015 19:29, Max Reitz wrote:
> > On 14.10.2015 15:16, Jeff Cody wrote:
> >> This is a precursor to making bdrv_find_node() static, and internal
> >> to block.c
> >>
> >> To find a BlockDriverState interface, it can be done via blk_by_name(),
> >> bdrv_find_node(), and bdrv_lookup_bs().  The latter can take the place
> >> of the other two, in the instances where we are only concerned with
> >> the BlockDriverState.
> >>
> >> There is no benefit in calling bdrv_find_node() directly.  This patch
> >> replaces all calls to bdrv_find_node() outside of block.c with
> >> bdrv_lookup_bs().
> >>
> >> Signed-off-by: Jeff Cody 
> >> ---
> >>  block/block-backend.c   | 2 +-
> >>  block/mirror.c  | 2 +-
> >>  block/write-threshold.c | 2 +-
> >>  3 files changed, 3 insertions(+), 3 deletions(-)
> > 
> > Reviewed-by: Max Reitz 
> 
> Oh, wait, on patch 2 gcc tells me I should take that back. If this
> series is based on Berto's series, that means it's also based on my
> BlockBackend series, and that one adds another instance of bdrv_find_node().
> 
> Since I'll have to send a v7 anyway, I'll make it a bdrv_lookup_bs()
> there, so my R-b stands.
> 
> Max
> 

Thanks.  I actually managed to apply Berto's without yours, but I just
went back and rectified that, and applied yours.

Jeff




Re: [Qemu-block] [Qemu-devel] [PATCH 1/5] ide/atapi: make PIO read requests async

2015-10-14 Thread John Snow


On 10/14/2015 02:19 PM, Peter Lieven wrote:
> Am 08.10.2015 um 18:44 schrieb John Snow:
>>
>> On 10/08/2015 08:06 AM, Peter Lieven wrote:
>>> Hi all,
>>>
>>> short summary from my side. The whole thing seems to get complicated,
>>> let me explain why:
>>>
>>> 1) During review I found that the code in ide_atapi_cmd_reply_end can't
>>> work correctly if the
>>> byte_count_limit is not a divider or a multiple of cd_sector_size. The
>>> reason is that as soon
>>> as we load the next sector we start at io_buffer offset 0 overwriting
>>> whatever is left in there
>>> for transfer. We also reset the io_buffer_index to 0 which means if we
>>> continue with the
>>> elementary transfer we always transfer a whole sector (of corrupt data)
>>> regardless if we
>>> are allowed to transfer that much data. Before we consider fixing this I
>>> wonder if it
>>> is legal at all to have an unaligned byte_count_limit. It obviously has
>>> never caused trouble in
>>> practice so maybe its not happening in real life.
>>>
>> I had overlooked that part. Good catch. I do suspect that in practice
>> nobody will be asking for bizarre values.
>>
>> There's no rule against an unaligned byte_count_limit as far as I have
>> read, but suspect nobody would have a reason to use it in practice.
>>
>>> 2) I found that whatever cool optimization I put in to buffer multiple
>>> sectors at once I end
>>> up with code that breaks migration because older versions would either
>>> not fill the io_buffer
>>> as expected or we introduce variables that older versions do not
>>> understand. This will
>>> lead to problems if we migrate in the middle of a transfer.
>>>
>> Ech. This sounds like a bit of a problem. I'll need to think about this
>> one...
>>
>>> 3) My current plan to get this patch to a useful state would be to use
>>> my initial patch and just
>>> change the code to use a sync request if we need to buffer additional
>>> sectors in an elementary
>>> transfer. I found that in real world operating systems the
>>> byte_count_limit seems to be equal to
>>> the cd_sector_size. After all its just a PIO transfer an operating
>>> system will likely switch to DMA
>>> as soon as the kernel ist loaded.
>>>
>>> Thanks,
>>> Peter
>>>
>> It sounds like that might be "good enough" for now, and won't make
>> behavior *worse* than it currently is. You can adjust the test I had
>> checked in to not use a "tricky" value and we can amend support for this
>> later if desired.
> 
> Have you had a chance to look at the series with the "good enough" fix?
> 
> Thanks,
> Peter
> 

Will do so Friday, thanks!

--js



Re: [Qemu-block] [Qemu-devel] [PATCH 1/5] ide/atapi: make PIO read requests async

2015-10-14 Thread Peter Lieven
Am 08.10.2015 um 18:44 schrieb John Snow:
>
> On 10/08/2015 08:06 AM, Peter Lieven wrote:
>> Hi all,
>>
>> short summary from my side. The whole thing seems to get complicated,
>> let me explain why:
>>
>> 1) During review I found that the code in ide_atapi_cmd_reply_end can't
>> work correctly if the
>> byte_count_limit is not a divider or a multiple of cd_sector_size. The
>> reason is that as soon
>> as we load the next sector we start at io_buffer offset 0 overwriting
>> whatever is left in there
>> for transfer. We also reset the io_buffer_index to 0 which means if we
>> continue with the
>> elementary transfer we always transfer a whole sector (of corrupt data)
>> regardless if we
>> are allowed to transfer that much data. Before we consider fixing this I
>> wonder if it
>> is legal at all to have an unaligned byte_count_limit. It obviously has
>> never caused trouble in
>> practice so maybe its not happening in real life.
>>
> I had overlooked that part. Good catch. I do suspect that in practice
> nobody will be asking for bizarre values.
>
> There's no rule against an unaligned byte_count_limit as far as I have
> read, but suspect nobody would have a reason to use it in practice.
>
>> 2) I found that whatever cool optimization I put in to buffer multiple
>> sectors at once I end
>> up with code that breaks migration because older versions would either
>> not fill the io_buffer
>> as expected or we introduce variables that older versions do not
>> understand. This will
>> lead to problems if we migrate in the middle of a transfer.
>>
> Ech. This sounds like a bit of a problem. I'll need to think about this
> one...
>
>> 3) My current plan to get this patch to a useful state would be to use
>> my initial patch and just
>> change the code to use a sync request if we need to buffer additional
>> sectors in an elementary
>> transfer. I found that in real world operating systems the
>> byte_count_limit seems to be equal to
>> the cd_sector_size. After all its just a PIO transfer an operating
>> system will likely switch to DMA
>> as soon as the kernel ist loaded.
>>
>> Thanks,
>> Peter
>>
> It sounds like that might be "good enough" for now, and won't make
> behavior *worse* than it currently is. You can adjust the test I had
> checked in to not use a "tricky" value and we can amend support for this
> later if desired.

Have you had a chance to look at the series with the "good enough" fix?

Thanks,
Peter




Re: [Qemu-block] [Qemu-devel] Question about xen disk unplug support for ahci missed in qemu

2015-10-14 Thread Kevin Wolf
Am 14.10.2015 um 13:06 hat Stefano Stabellini geschrieben:
> On Wed, 14 Oct 2015, Kevin Wolf wrote:
> > [ CC qemu-block ]
> > 
> > Am 13.10.2015 um 19:10 hat Stefano Stabellini geschrieben:
> > > On Tue, 13 Oct 2015, John Snow wrote:
> > > > On 10/13/2015 11:55 AM, Fabio Fantoni wrote:
> > > > > I added ahci disk support in libxl and using it for week seems that 
> > > > > was
> > > > > ok, after a reply of Stefano Stabellini seems that xen disk unplug
> > > > > support only ide disks:
> > > > > http://git.qemu.org/?p=qemu.git;a=commitdiff;h=679f4f8b178e7c66fbc2f39c905374ee8663d5d8
> > > > > 
> > > > > Today Paul Durrant told me that even if pv disk is ok also with ahci 
> > > > > and
> > > > > the emulated one is offline can be a risk:
> > > > > http://lists.xenproject.org/archives/html/win-pv-devel/2015-10/msg00021.html
> > > > > 
> > > > > 
> > > > > I tried to take a fast look in qemu code but I not understand the 
> > > > > needed
> > > > > thing for add the xen disk unplug support also for ahci, can someone 
> > > > > do
> > > > > it or tell me useful information for do it please?
> > > > > 
> > > > > Thanks for any reply and sorry for my bad english.
> > > > > 
> > > > 
> > > > I'm not entirely sure what features you need AHCI to support in order
> > > > for Xen to be happy.
> > > > 
> > > > I'd guess hotplugging, but where I get confused is that IDE disks don't
> > > > support hotplugging either, so I guess I'm not sure sure what you need.
> > > > 
> > > > Stefano, can you help bridge my Xen knowledge gap?
> > >  
> > > Hi John,
> > > 
> > > we need something like hw/i386/xen/xen_platform.c:unplug_disks but that
> > > can unplug AHCI disk. And by unplug, I mean "make disappear" like
> > > pci_piix3_xen_ide_unplug does for ide.
> > 
> > Maybe this would be the right time to stop the craziness with your
> > hybrid IDE/xendisk setup. It's a horrible thing that would never happen
> > on real hardware.
> 
> I would be quite happy to stop (or even get rid of) the craziness.
> 
> 
> > Can't you just teach SeaBIOS how to deal with your PV disks and then
> > only add that to your VM and forget about IDE/AHCI? I mean, that's how
> > it's done for virtio-blk, and it doesn't involve any insanities like
> > ripping out non-hotpluggable devices.
> 
> Teaching SeaBIOS to deal with PV disks can be done, in fact we already
> support PV disks in OVMF. It is possible to boot Windows with OVMF
> without any IDE disks (patch pending for libxl to create a VM without
> emulated IDE disks).
> 
> However we have to be honest that implementing PV disk support in
> SeaBIOS is a different magnitude of effort compared to implementing AHCI
> "unplug".

Agreed. But we generally try to do the right thing and not the easy
thing.

Maybe I'm missing something, but my impression was that the hybrid setup
was only used so that you can boot from a PV disk even though the BIOS
doesn't have a PV driver. In that case, why would you even want to use
AHCI instead of IDE? During the early boot performance shouldn't be much
different as there is no parallelism, and afterwards the PV driver is
used.

> I would suggest Fabio to avoid AHCI disks altogether and just use OVMF
> with PV disks only and Anthony's patch to libxl to avoid creating any
> IDE disks: http://marc.info/?l=xen-devel=144482080812353.
> 
> Would that work for you?

That sounds certainly like a better step forward than adding a crude
hack to AHCI.

And if you want to make me completely happy, plan to extend SeaBIOS so
that we can even drop the existing hack in IDE.

Kevin



Re: [Qemu-block] [PATCH V2 3/5] virtio-blk: disable scsi passthrough by default

2015-10-14 Thread Cornelia Huck


On Wed, 15 Jul 2015 16:45:52 +0200
Paolo Bonzini  wrote:

> On 15/07/2015 16:28, Michael S. Tsirkin wrote:
> > On Wed, Jul 15, 2015 at 04:18:49PM +0200, Paolo Bonzini wrote:
> >>
> >>
> >> On 15/07/2015 16:14, Michael S. Tsirkin wrote:
> >>> On Wed, Jul 15, 2015 at 02:47:24PM +0200, Paolo Bonzini wrote:
> 
> 
>  On 15/07/2015 14:21, Michael S. Tsirkin wrote:
> >>> Disable scsi passthrough by default since it was incompatible with
> >>> virtio 1.0. For legacy machine types, keep this on by default.
> >>>
> >>> Cc: Stefan Hajnoczi 
> >>> Cc: Kevin Wolf 
> >>> Cc: qemu-block@nongnu.org
> >>> Signed-off-by: Jason Wang 
> > Seems risky for 2.4.  modern is off by default for now. Can't we limit
> > the change to when modern is enabled?
> 
>  That would have the effect of disabling a feature when you turn on 
>  modern.
> >>>
> >>> What's wrong with that?
> >>
> >> Weren't you complaining about it a few hours ago? :)
> > 
> > No, I complained about guest driver update disabling it.
> 
> Ah, sorry for the confusion.
> 
> > I suggested changing this from bool to on/off/auto, and
> > make auto mean !modern.
> 
>  No, please do it like Jason did.  The SCSI feature effectively had to be
>  enabled explicitly already, the requests were marked as unsupported.
> >>>
> >>> I didn't know. How is it enabled?
> >>
> >> It's enabled by default in QEMU, but disabled by default in libvirt.
> >> And it only works if you pass a whole _disk_ (not a partition or logical
> >> volume) to QEMU, which is definitely not the common case.
> >>
> >> It can just be documented in the release notes; the feature is still
> >> available, and libvirt won't be broken because it adds explicitly both
> >> scsi=on and scsi=off.
> > 
> > So for libvirt, we don't really care about the default, right?
> > For command line, would it not be friendlier to make it follow the
> > modern flag automatically?
> 
> I wouldn't mind if we grabbed the occasion to disable it altogether.
> However, that would indeed work as well.

Do we want to change anything for 2.5 about the default?

Currently, we still default scsi to true, and you have to disable it
explicitly if you want to use virtio-1 compliant virtio-blk devices
(which is a bit annoying, as scsi passthrough is not something people
usually want).




Re: [Qemu-block] [Qemu-devel] Question about xen disk unplug support for ahci missed in qemu

2015-10-14 Thread Stefano Stabellini
On Wed, 14 Oct 2015, Kevin Wolf wrote:
> Am 14.10.2015 um 13:06 hat Stefano Stabellini geschrieben:
> > On Wed, 14 Oct 2015, Kevin Wolf wrote:
> > > [ CC qemu-block ]
> > > 
> > > Am 13.10.2015 um 19:10 hat Stefano Stabellini geschrieben:
> > > > On Tue, 13 Oct 2015, John Snow wrote:
> > > > > On 10/13/2015 11:55 AM, Fabio Fantoni wrote:
> > > > > > I added ahci disk support in libxl and using it for week seems that 
> > > > > > was
> > > > > > ok, after a reply of Stefano Stabellini seems that xen disk unplug
> > > > > > support only ide disks:
> > > > > > http://git.qemu.org/?p=qemu.git;a=commitdiff;h=679f4f8b178e7c66fbc2f39c905374ee8663d5d8
> > > > > > 
> > > > > > Today Paul Durrant told me that even if pv disk is ok also with 
> > > > > > ahci and
> > > > > > the emulated one is offline can be a risk:
> > > > > > http://lists.xenproject.org/archives/html/win-pv-devel/2015-10/msg00021.html
> > > > > > 
> > > > > > 
> > > > > > I tried to take a fast look in qemu code but I not understand the 
> > > > > > needed
> > > > > > thing for add the xen disk unplug support also for ahci, can 
> > > > > > someone do
> > > > > > it or tell me useful information for do it please?
> > > > > > 
> > > > > > Thanks for any reply and sorry for my bad english.
> > > > > > 
> > > > > 
> > > > > I'm not entirely sure what features you need AHCI to support in order
> > > > > for Xen to be happy.
> > > > > 
> > > > > I'd guess hotplugging, but where I get confused is that IDE disks 
> > > > > don't
> > > > > support hotplugging either, so I guess I'm not sure sure what you 
> > > > > need.
> > > > > 
> > > > > Stefano, can you help bridge my Xen knowledge gap?
> > > >  
> > > > Hi John,
> > > > 
> > > > we need something like hw/i386/xen/xen_platform.c:unplug_disks but that
> > > > can unplug AHCI disk. And by unplug, I mean "make disappear" like
> > > > pci_piix3_xen_ide_unplug does for ide.
> > > 
> > > Maybe this would be the right time to stop the craziness with your
> > > hybrid IDE/xendisk setup. It's a horrible thing that would never happen
> > > on real hardware.
> > 
> > I would be quite happy to stop (or even get rid of) the craziness.
> > 
> > 
> > > Can't you just teach SeaBIOS how to deal with your PV disks and then
> > > only add that to your VM and forget about IDE/AHCI? I mean, that's how
> > > it's done for virtio-blk, and it doesn't involve any insanities like
> > > ripping out non-hotpluggable devices.
> > 
> > Teaching SeaBIOS to deal with PV disks can be done, in fact we already
> > support PV disks in OVMF. It is possible to boot Windows with OVMF
> > without any IDE disks (patch pending for libxl to create a VM without
> > emulated IDE disks).
> > 
> > However we have to be honest that implementing PV disk support in
> > SeaBIOS is a different magnitude of effort compared to implementing AHCI
> > "unplug".
> 
> Agreed. But we generally try to do the right thing and not the easy
> thing.

Sure. I am quite happy to let other people do it though :-)


> Maybe I'm missing something, but my impression was that the hybrid setup
> was only used so that you can boot from a PV disk even though the BIOS
> doesn't have a PV driver. In that case, why would you even want to use
> AHCI instead of IDE? During the early boot performance shouldn't be much
> different as there is no parallelism, and afterwards the PV driver is
> used.

The "unplug" code and the design choice predate me -- I am not sure why
it was done like that.  In those days there was no SeaBIOS: maybe they
thought that writing a PV disk frontend in rombios would be madness? 

In addition isn't it true that some guests, once they see a PIIX3
chipset and they know that there is one disk, they might just try to
access it directly via the IDE interface?  I am thinking of some old
versions of Windows. Are they really able to cope with the case where
they can only access the disk via the legacy BIOS until non-IDE drivers
come along?


> > I would suggest Fabio to avoid AHCI disks altogether and just use OVMF
> > with PV disks only and Anthony's patch to libxl to avoid creating any
> > IDE disks: http://marc.info/?l=xen-devel=144482080812353.
> > 
> > Would that work for you?
> 
> That sounds certainly like a better step forward than adding a crude
> hack to AHCI.
> 
> And if you want to make me completely happy, plan to extend SeaBIOS so
> that we can even drop the existing hack in IDE.

There are lots of existing guests which write to the magic ioport to
"unplug" disks.  We would need some sort of deprecation plan.



Re: [Qemu-block] [Qemu-devel] Question about xen disk unplug support for ahci missed in qemu

2015-10-14 Thread Paul Durrant
> -Original Message-
> From: Fabio Fantoni [mailto:fabio.fant...@m2r.biz]
> Sent: 14 October 2015 12:12
> To: Kevin Wolf; Stefano Stabellini
> Cc: John Snow; Anthony Perard; qemu-de...@nongnu.org; xen-
> de...@lists.xen.org; qemu-block@nongnu.org; Paul Durrant
> Subject: Re: [Qemu-devel] Question about xen disk unplug support for ahci
> missed in qemu
> 
> 
> 
> Il 14/10/2015 11:47, Kevin Wolf ha scritto:
> > [ CC qemu-block ]
> >
> > Am 13.10.2015 um 19:10 hat Stefano Stabellini geschrieben:
> >> On Tue, 13 Oct 2015, John Snow wrote:
> >>> On 10/13/2015 11:55 AM, Fabio Fantoni wrote:
>  I added ahci disk support in libxl and using it for week seems that was
>  ok, after a reply of Stefano Stabellini seems that xen disk unplug
>  support only ide disks:
> 
> http://git.qemu.org/?p=qemu.git;a=commitdiff;h=679f4f8b178e7c66fbc2f39
> c905374ee8663d5d8
> 
>  Today Paul Durrant told me that even if pv disk is ok also with ahci and
>  the emulated one is offline can be a risk:
>  http://lists.xenproject.org/archives/html/win-pv-devel/2015-
> 10/msg00021.html
> 
> 
>  I tried to take a fast look in qemu code but I not understand the
> needed
>  thing for add the xen disk unplug support also for ahci, can someone do
>  it or tell me useful information for do it please?
> 
>  Thanks for any reply and sorry for my bad english.
> 
> >>> I'm not entirely sure what features you need AHCI to support in order
> >>> for Xen to be happy.
> >>>
> >>> I'd guess hotplugging, but where I get confused is that IDE disks don't
> >>> support hotplugging either, so I guess I'm not sure sure what you need.
> >>>
> >>> Stefano, can you help bridge my Xen knowledge gap?
> >>
> >> Hi John,
> >>
> >> we need something like hw/i386/xen/xen_platform.c:unplug_disks but
> that
> >> can unplug AHCI disk. And by unplug, I mean "make disappear" like
> >> pci_piix3_xen_ide_unplug does for ide.
> > Maybe this would be the right time to stop the craziness with your
> > hybrid IDE/xendisk setup. It's a horrible thing that would never happen
> > on real hardware.

Unfortunately, it's going to be difficult to remove such 'craziness' when you 
don't know a priori whether the VM has PV drivers or not. 

> >
> > Can't you just teach SeaBIOS how to deal with your PV disks and then
> > only add that to your VM and forget about IDE/AHCI? I mean, that's how
> > it's done for virtio-blk, and it doesn't involve any insanities like
> > ripping out non-hotpluggable devices.
> >

Does Windows have in-box virtio-blk drivers? If not, how does it boot?

> > Hm... How does a reboot of a machine that had its IDE already removed
> > actually work? Do you restart the qemu process on reboot?
> >

The IDE disks are always present during boot, but before the OS enumerates them 
they are 'unplugged' and then PV drivers are used instead. The only other way 
would be to somehow obscure them from OS enumeration so they could be left 
lying around but remain unused. That's feasible for Windows, but I don't know 
about other OS.

  Paul

> > Kevin
> I thinkthat would be a good idea, unfortunately I'm not able to do that
> and I do not know if the developers of xen would agree to such modification.
> 
> If will be done, for example having new disk type "xenpv" require the pv
> drivers installed, with linux having drivers included should not be a
> problem but on windows yes FWIK.
> Like virtio driver should be installed before (or adding them in windows
> install), I don't know if will be ok specifying them in install for with
> xen driver, another possibility can be start domU with ide disk type,
> install the xen pv drivers and reboot selecting xenpv disk type instead.
> Doing it FWIK require not only xen and qemu changes but also pv drivers
> change, I added on cc also Paul Durrant for see what think about.
> With actual unplug based on my tests in some years I had many problem
> with windows domUs (with both old and new pv drivers) resulting in
> "windows boot blue screen error", I suppose that changing/improving
> unplug method can decrease them, is it correct?
> 
> About reboot xen do different from kvm recreating full domU each time
> (and new qemu process), I don't know if is possible and good change the
> method.
> 
> Thanks for any reply and sorry for my bad english.



Re: [Qemu-block] [PATCH 1/3] block: Use bdrv_lookup_bs() instead of bdrv_find_node()

2015-10-14 Thread Max Reitz
On 14.10.2015 15:16, Jeff Cody wrote:
> This is a precursor to making bdrv_find_node() static, and internal
> to block.c
> 
> To find a BlockDriverState interface, it can be done via blk_by_name(),
> bdrv_find_node(), and bdrv_lookup_bs().  The latter can take the place
> of the other two, in the instances where we are only concerned with
> the BlockDriverState.
> 
> There is no benefit in calling bdrv_find_node() directly.  This patch
> replaces all calls to bdrv_find_node() outside of block.c with
> bdrv_lookup_bs().
> 
> Signed-off-by: Jeff Cody 
> ---
>  block/block-backend.c   | 2 +-
>  block/mirror.c  | 2 +-
>  block/write-threshold.c | 2 +-
>  3 files changed, 3 insertions(+), 3 deletions(-)

Reviewed-by: Max Reitz 



signature.asc
Description: OpenPGP digital signature


[Qemu-block] [PATCH 3/3] block: use bdrv_lookup_bs() over blk_by_name() for BDS only results

2015-10-14 Thread Jeff Cody
To find a BlockDriverState interface, it can be done via blk_by_name(),
bdrv_find_node(), and bdrv_lookup_bs().  The latter can take the place
of the other two, in the instances where we are only concerned with the
BlockDriverState.

In much of the usage of blk_by_name(), we can simplify the code by
calling bdrv_lookup_bs() instead of blk_by_name(), when what we need is
just the BDS, and not the BB.

Signed-off-by: Jeff Cody 
---
 blockdev.c| 84 ---
 migration/block.c |  6 ++--
 2 files changed, 32 insertions(+), 58 deletions(-)

diff --git a/blockdev.c b/blockdev.c
index fe182bb..6a893e1 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1033,18 +1033,18 @@ fail:
 void hmp_commit(Monitor *mon, const QDict *qdict)
 {
 const char *device = qdict_get_str(qdict, "device");
-BlockBackend *blk;
+BlockDriverState *bs;
 int ret;
 
 if (!strcmp(device, "all")) {
 ret = bdrv_commit_all();
 } else {
-blk = blk_by_name(device);
-if (!blk) {
+bs = bdrv_lookup_bs(device, NULL, NULL);
+if (!bs) {
 monitor_printf(mon, "Device '%s' not found\n", device);
 return;
 }
-ret = bdrv_commit(blk_bs(blk));
+ret = bdrv_commit(bs);
 }
 if (ret < 0) {
 monitor_printf(mon, "'commit' error for '%s': %s\n", device,
@@ -1122,20 +1122,18 @@ SnapshotInfo 
*qmp_blockdev_snapshot_delete_internal_sync(const char *device,
  Error **errp)
 {
 BlockDriverState *bs;
-BlockBackend *blk;
 AioContext *aio_context;
 QEMUSnapshotInfo sn;
 Error *local_err = NULL;
 SnapshotInfo *info = NULL;
 int ret;
 
-blk = blk_by_name(device);
-if (!blk) {
+bs = bdrv_lookup_bs(device, NULL, NULL);
+if (!bs) {
 error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND,
   "Device '%s' not found", device);
 return NULL;
 }
-bs = blk_bs(blk);
 
 if (!has_id) {
 id = NULL;
@@ -1300,7 +1298,6 @@ static void internal_snapshot_prepare(BlkTransactionState 
*common,
 Error *local_err = NULL;
 const char *device;
 const char *name;
-BlockBackend *blk;
 BlockDriverState *bs;
 QEMUSnapshotInfo old_sn, *sn;
 bool ret;
@@ -1319,13 +1316,12 @@ static void 
internal_snapshot_prepare(BlkTransactionState *common,
 name = internal->name;
 
 /* 2. check for validation */
-blk = blk_by_name(device);
-if (!blk) {
+bs = bdrv_lookup_bs(device, NULL, NULL);
+if (!bs) {
 error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND,
   "Device '%s' not found", device);
 return;
 }
-bs = blk_bs(blk);
 
 /* AioContext is released in .clean() */
 state->aio_context = bdrv_get_aio_context(bs);
@@ -1613,20 +1609,18 @@ static void drive_backup_prepare(BlkTransactionState 
*common, Error **errp)
 {
 DriveBackupState *state = DO_UPCAST(DriveBackupState, common, common);
 BlockDriverState *bs;
-BlockBackend *blk;
 DriveBackup *backup;
 Error *local_err = NULL;
 
 assert(common->action->kind == TRANSACTION_ACTION_KIND_DRIVE_BACKUP);
 backup = common->action->drive_backup;
 
-blk = blk_by_name(backup->device);
-if (!blk) {
+bs = bdrv_lookup_bs(backup->device, NULL, NULL);
+if (!bs) {
 error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND,
   "Device '%s' not found", backup->device);
 return;
 }
-bs = blk_bs(blk);
 
 /* AioContext is released in .clean() */
 state->aio_context = bdrv_get_aio_context(bs);
@@ -1682,25 +1676,22 @@ static void blockdev_backup_prepare(BlkTransactionState 
*common, Error **errp)
 BlockdevBackupState *state = DO_UPCAST(BlockdevBackupState, common, 
common);
 BlockdevBackup *backup;
 BlockDriverState *bs, *target;
-BlockBackend *blk;
 Error *local_err = NULL;
 
 assert(common->action->kind == TRANSACTION_ACTION_KIND_BLOCKDEV_BACKUP);
 backup = common->action->blockdev_backup;
 
-blk = blk_by_name(backup->device);
-if (!blk) {
+bs = bdrv_lookup_bs(backup->device, NULL, NULL);
+if (!bs) {
 error_setg(errp, "Device '%s' not found", backup->device);
 return;
 }
-bs = blk_bs(blk);
 
-blk = blk_by_name(backup->target);
-if (!blk) {
+target = bdrv_lookup_bs(backup->target, NULL, NULL);
+if (!target) {
 error_setg(errp, "Device '%s' not found", backup->target);
 return;
 }
-target = blk_bs(blk);
 
 /* AioContext is released in .clean() */
 state->aio_context = bdrv_get_aio_context(bs);
@@ -2324,7 +2315,6 @@ void qmp_block_stream(const char *device,
   bool has_on_error, BlockdevOnError on_error,
   Error **errp)
 {
-BlockBackend *blk;
 BlockDriverState *bs;
 BlockDriverState *base_bs = NULL;
 AioContext *aio_context;

[Qemu-block] [PATCH 2/3] block: make bdrv_find_node() static

2015-10-14 Thread Jeff Cody
This patch does two things: it moves bdrv_find_node() up before the
first usage in block.c, and it makes the function static so that it
is only internal to block.c.

Signed-off-by: Jeff Cody 
---
 block.c   | 30 +++---
 include/block/block.h |  1 -
 2 files changed, 15 insertions(+), 16 deletions(-)

diff --git a/block.c b/block.c
index f95a59d..da1a6a9 100644
--- a/block.c
+++ b/block.c
@@ -759,6 +759,21 @@ static int bdrv_open_flags(BlockDriverState *bs, int flags)
 return open_flags;
 }
 
+/* This function is to find a node in the bs graph */
+static BlockDriverState *bdrv_find_node(const char *node_name)
+{
+BlockDriverState *bs;
+
+assert(node_name);
+
+QTAILQ_FOREACH(bs, _bdrv_states, node_list) {
+if (!strcmp(node_name, bs->node_name)) {
+return bs;
+}
+}
+return NULL;
+}
+
 static void bdrv_assign_node_name(BlockDriverState *bs,
   const char *node_name,
   Error **errp)
@@ -2720,21 +2735,6 @@ void bdrv_iterate_format(void (*it)(void *opaque, const 
char *name),
 g_free(formats);
 }
 
-/* This function is to find a node in the bs graph */
-BlockDriverState *bdrv_find_node(const char *node_name)
-{
-BlockDriverState *bs;
-
-assert(node_name);
-
-QTAILQ_FOREACH(bs, _bdrv_states, node_list) {
-if (!strcmp(node_name, bs->node_name)) {
-return bs;
-}
-}
-return NULL;
-}
-
 /* Put this QMP function here so it can access the static graph_bdrv_states. */
 BlockDeviceInfoList *bdrv_named_nodes_list(Error **errp)
 {
diff --git a/include/block/block.h b/include/block/block.h
index 1520dee..2b6e1b2 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -404,7 +404,6 @@ int bdrv_media_changed(BlockDriverState *bs);
 void bdrv_lock_medium(BlockDriverState *bs, bool locked);
 void bdrv_eject(BlockDriverState *bs, bool eject_flag);
 const char *bdrv_get_format_name(BlockDriverState *bs);
-BlockDriverState *bdrv_find_node(const char *node_name);
 BlockDeviceInfoList *bdrv_named_nodes_list(Error **errp);
 BlockDriverState *bdrv_lookup_bs(const char *device,
  const char *node_name,
-- 
1.9.3




[Qemu-block] [PATCH 1/3] block: Use bdrv_lookup_bs() instead of bdrv_find_node()

2015-10-14 Thread Jeff Cody
This is a precursor to making bdrv_find_node() static, and internal
to block.c

To find a BlockDriverState interface, it can be done via blk_by_name(),
bdrv_find_node(), and bdrv_lookup_bs().  The latter can take the place
of the other two, in the instances where we are only concerned with
the BlockDriverState.

There is no benefit in calling bdrv_find_node() directly.  This patch
replaces all calls to bdrv_find_node() outside of block.c with
bdrv_lookup_bs().

Signed-off-by: Jeff Cody 
---
 block/block-backend.c   | 2 +-
 block/mirror.c  | 2 +-
 block/write-threshold.c | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index 2256551..7026a3f 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -67,7 +67,7 @@ BlockBackend *blk_new(const char *name, Error **errp)
 error_setg(errp, "Device with id '%s' already exists", name);
 return NULL;
 }
-if (bdrv_find_node(name)) {
+if (bdrv_lookup_bs(NULL, name, NULL)) {
 error_setg(errp,
"Device name '%s' conflicts with an existing node name",
name);
diff --git a/block/mirror.c b/block/mirror.c
index 7e43511..cb3c765 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -644,7 +644,7 @@ static void mirror_complete(BlockJob *job, Error **errp)
 if (s->replaces) {
 AioContext *replace_aio_context;
 
-s->to_replace = bdrv_find_node(s->replaces);
+s->to_replace = bdrv_lookup_bs(NULL, s->replaces, NULL);
 if (!s->to_replace) {
 error_setg(errp, "Node name '%s' not found", s->replaces);
 return;
diff --git a/block/write-threshold.c b/block/write-threshold.c
index a53c1f5..908fa7f 100644
--- a/block/write-threshold.c
+++ b/block/write-threshold.c
@@ -110,7 +110,7 @@ void qmp_block_set_write_threshold(const char *node_name,
 BlockDriverState *bs;
 AioContext *aio_context;
 
-bs = bdrv_find_node(node_name);
+bs = bdrv_lookup_bs(NULL, node_name, NULL);
 if (!bs) {
 error_setg(errp, "Device '%s' not found", node_name);
 return;
-- 
1.9.3




Re: [Qemu-block] [PATCH v6 24/39] blockdev: Do not create BDS for empty drive

2015-10-14 Thread Kevin Wolf
Am 12.10.2015 um 22:00 hat Max Reitz geschrieben:
> Do not use "rudimentary" BDSs for empty drives any longer (for
> freshly created drives).
> 
> After a follow-up patch, empty drives will generally use a NULL BDS, not
> only the freshly created drives.
> 
> Signed-off-by: Max Reitz 
> ---
>  blockdev.c | 72 
> ++
>  1 file changed, 44 insertions(+), 28 deletions(-)
> 
> diff --git a/blockdev.c b/blockdev.c
> index 35efe84..845a1c1 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -514,16 +514,44 @@ static BlockBackend *blockdev_init(const char *file, 
> QDict *bs_opts,
>  goto early_err;
>  }
>  
> +if (snapshot) {
> +/* always use cache=unsafe with snapshot */
> +bdrv_flags &= ~BDRV_O_CACHE_MASK;
> +bdrv_flags |= (BDRV_O_SNAPSHOT|BDRV_O_CACHE_WB|BDRV_O_NO_FLUSH);
> +}
> +
> +if (copy_on_read) {
> +bdrv_flags |= BDRV_O_COPY_ON_READ;
> +}
> +
> +if (runstate_check(RUN_STATE_INMIGRATE)) {
> +bdrv_flags |= BDRV_O_INCOMING;
> +}

Bad conflict resolution when you added patch 2, which moved the
BDRV_O_INCOMING code to block.c?

That patch is actually essential here as well: The flag would ends up in
the root state, and bdrv_invalidate_cache_all() doesn't clear the flag
any more. So we might end up opening an image with BDRV_O_INCOMING even
though migration has long finished.

Kevin



Re: [Qemu-block] [PATCH v10 08/10] Implement new driver for block replication

2015-10-14 Thread Wen Congyang
On 10/14/2015 10:27 PM, Stefan Hajnoczi wrote:
> On Tue, Oct 13, 2015 at 05:08:17PM +0800, Wen Congyang wrote:
>> On 10/13/2015 12:27 AM, Stefan Hajnoczi wrote:
>>> On Fri, Sep 25, 2015 at 02:17:36PM +0800, Wen Congyang wrote:
 +/* start backup job now */
 +bdrv_op_unblock(s->hidden_disk, BLOCK_OP_TYPE_BACKUP_TARGET,
 +s->active_disk->backing_blocker);
 +bdrv_op_unblock(s->secondary_disk, BLOCK_OP_TYPE_BACKUP_SOURCE,
 +s->hidden_disk->backing_blocker);
>>>
>>> Why is it safe to unblock these operations?
>>>
>>> Why do they have to be blocked for non-replication users?
>>
>> hidden_disk and secondary disk are opened as backing file, so it is blocked 
>> for
>> non-replication users.
>> What can I do if I don't unblock it and want to do backup?
> 
> CCing Jeff Cody, block jobs maintainer
> 
> You need to explain why it is safe remove this protection.  We can't
> merge code that may be unsafe.
> 
> I think we can investigate further by asking: when does QEMU code assume
> the backing file is read-only?

The backing file is opened in read-only mode. I want to reopen it in read-write
mode here in the next version(So the patch 1 will be dropped)

> 
> I haven't checked but these cases come to mind:
> 
> Operations that move data between BDS in the backing chain (e.g. commit
> and stream block jobs) will lose or overwrite data if the backing file
> is being written to by another coroutine.
> 
> We need to prevent users from running these operations at the same time.

Yes, but qemu doesn't provide such API.

> 
> Also, accessing bs->backing_blocker is a layering violation.  No one
> outside block.c:bdrv_set_backing_hd() is supposed to access this field.

I agree with it.

Thanks
Wen Congyang

> 
> Let's figure out the safety concerns first and then the
> bs->backing_blocker access will probably be eliminated as part of the
> solution.
> 
> Stefan
> .
> 




Re: [Qemu-block] [Qemu-devel] [PATCH] Revert "blockdev: add note that block_job_cb() must be thread-safe"

2015-10-14 Thread Fam Zheng


- Original Message -
> On Tue, Oct 13, 2015 at 06:16:15PM +0800, Fam Zheng wrote:
> > This reverts commit 723c5d93c51bdb3adbc238ce90195c0864aa6cd5.
> > 
> > block_job_cb is called by block_job_completed, which is always called in
> > a main loop bottom half in existing block jobs. So we don't need to
> > worry about thread-safety here.
> 
> This is not correct.  Search for block_job_completed() callers.
> 
> For example, block/stream.c has early exit cases that call
> block_job_completed() from the coroutine (i.e. dispatched from a
> coroutine in another AioContext+IOThread).
> 
> I think you are assuming that all block_job_completed() callers are
> called from a function scheduled using block_job_defer_to_main_loop().

No, I'm assuming all block_job_completed() callers are (and they should
be) from main thread. Even the early exit cases in stream are so, because
they are in the same thread as stream_start, which is main thread.

> 
> Please double-check this.
> 
> Thanks,
> Stefan
> 
> 



Re: [Qemu-block] [PATCH] Revert "blockdev: add note that block_job_cb() must be thread-safe"

2015-10-14 Thread Stefan Hajnoczi
On Tue, Oct 13, 2015 at 06:16:15PM +0800, Fam Zheng wrote:
> This reverts commit 723c5d93c51bdb3adbc238ce90195c0864aa6cd5.
> 
> block_job_cb is called by block_job_completed, which is always called in
> a main loop bottom half in existing block jobs. So we don't need to
> worry about thread-safety here.

This is not correct.  Search for block_job_completed() callers.

For example, block/stream.c has early exit cases that call
block_job_completed() from the coroutine (i.e. dispatched from a
coroutine in another AioContext+IOThread).

I think you are assuming that all block_job_completed() callers are
called from a function scheduled using block_job_defer_to_main_loop().

Please double-check this.

Thanks,
Stefan



Re: [Qemu-block] [PATCH v10 02/10] Backup: clear all bitmap when doing block checkpoint

2015-10-14 Thread Stefan Hajnoczi
On Tue, Oct 13, 2015 at 05:13:14PM +0800, Wen Congyang wrote:
> On 10/12/2015 09:45 PM, Stefan Hajnoczi wrote:
> > On Fri, Sep 25, 2015 at 02:17:30PM +0800, Wen Congyang wrote:
> >> Signed-off-by: Wen Congyang 
> >> Signed-off-by: zhanghailiang 
> >> Signed-off-by: Gonglei 
> >> Reviewed-by: Jeff Cody 
> >> ---
> >>  block/backup.c   | 14 ++
> >>  blockjob.c   | 11 +++
> >>  include/block/blockjob.h | 12 
> >>  3 files changed, 37 insertions(+)
> >>
> >> diff --git a/block/backup.c b/block/backup.c
> >> index c61e4c3..5e5995e 100644
> >> --- a/block/backup.c
> >> +++ b/block/backup.c
> >> @@ -214,11 +214,25 @@ static void backup_iostatus_reset(BlockJob *job)
> >>  }
> >>  }
> >>  
> >> +static void backup_do_checkpoint(BlockJob *job, Error **errp)
> >> +{
> >> +BackupBlockJob *backup_job = container_of(job, BackupBlockJob, 
> >> common);
> >> +
> >> +if (backup_job->sync_mode != MIRROR_SYNC_MODE_NONE) {
> >> +error_setg(errp, "The backup job only supports block checkpoint 
> >> in"
> >> +   " sync=none mode");
> >> +return;
> >> +}
> >> +
> >> +hbitmap_reset_all(backup_job->bitmap);
> >> +}
> > 
> > Is this a fast way to stop and then start a new backup blockjob without
> > emitting block job lifecycle events?
> > 
> > Not sure the blockjob_do_checkpoint() interface is appropriate.  Is
> > there any other block job type that will implement .do_checkpoint()?
> 
> Currently, the answer is no.
> 
> > 
> > COLO block replication could call a public backup_do_checkpoint()
> > function.  That way the direct coupling between COLO and the backup
> > block job is obvious.  I'm not convinced a generic interface like
> > blockjob_do_checkpoint() makes sense since it's really not a generic
> > operation that makes sense for other block job types.
> > 
> > void backup_do_checkpoint(BlockJob *job, Error **errp)
> > {
> > BackupBlockJob *s;
> > 
> > if (job->driver != backup_job_driver) {
> > error_setg(errp, "expected backup block job type for "
> >"checkpoint, got %d", job->driver->job_type);
> > return;
> > }
> > 
> > s = container_of(job, BackupBlockJob, common);
> > ...
> > }
> 
> In a older version, I implement it like this, but Paolo didn't like it.

It's a question of taste.  In this case it seems to me that there is
really a direct coupling between COLO and the backup block job.  This
isn't really a generic interface that makes sense in other scenarios.
That's why I advocate for direct coupling instead of pretending this is
a generic interface.

I wish COLO could just stop the existing block job and start a new one
for each checkpoint.  In reality we probably don't want QMP events and
the full block job lifecycle for each checkpoint...  But anyway, I like
this approach because it does not require a new interface at all.

> > Please also make the function name and documentation more specific.
> > Instead of "do" maybe this should be "pre" or "post" to indicate whether
> > this happens before or after the checkpoint commit.  What happens if
> 
> OK
> 
> > this function returns an error?
> 
> We just return this error to COLO, and COLO will do failover.

Okay, I ask these questions because the information should be part of
the doc comment for this new interface.

Stefan