[Qemu-block] [PATCH v2] migration/block: Avoid invoking blk_drain too frequently

2017-03-14 Thread Lidong Chen
Increase bmds->cur_dirty after submit io, so reduce the frequency
involve into blk_drain, and improve the performance obviously
when block migration.

The performance test result of this patch:

During the block dirty save phase, this patch improve guest os IOPS
from 4.0K to 9.5K. and improve the migration speed from
505856 rsec/s to 855756 rsec/s.

Signed-off-by: Lidong Chen 
---
 migration/block.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/migration/block.c b/migration/block.c
index 6741228..7734ff7 100644
--- a/migration/block.c
+++ b/migration/block.c
@@ -576,6 +576,9 @@ static int mig_save_device_dirty(QEMUFile *f, 
BlkMigDevState *bmds,
 }
 
 bdrv_reset_dirty_bitmap(bmds->dirty_bitmap, sector, nr_sectors);
+sector += nr_sectors;
+bmds->cur_dirty = sector;
+
 break;
 }
 sector += BDRV_SECTORS_PER_DIRTY_CHUNK;
-- 
1.8.3.1




Re: [Qemu-block] [Qemu-devel] [PATCH] migration/block: Avoid involve into blk_drain too frequently

2017-03-14 Thread 858585 jemmy
On Wed, Mar 15, 2017 at 10:57 AM, Fam Zheng  wrote:
> On Wed, 03/15 10:28, 858585 jemmy wrote:
>> On Tue, Mar 14, 2017 at 11:12 PM, Eric Blake  wrote:
>> > On 03/14/2017 02:57 AM, jemmy858...@gmail.com wrote:
>> >> From: Lidong Chen 
>> >>
>> >> Increase bmds->cur_dirty after submit io, so reduce the frequency involve 
>> >> into blk_drain, and improve the performance obviously when block 
>> >> migration.
>> >
>> > Long line; please wrap your commit messages, preferably around 70 bytes
>> > since 'git log' displays them indented, and it is still nice to read
>> > them in an 80-column window.
>> >
>> > Do you have benchmark numbers to prove the impact of this patch, or even
>> > a formula for reproducing the benchmark testing?
>> >
>>
>> the test result is base on current git master version.
>>
>> the xml of guest os:
>> 
>>   
>>   > file='/instanceimage/ab3ba978-c7a3-463d-a1d0-48649fb7df00/ab3ba978-c7a3-463d-a1d0-48649fb7df00_vda.qcow2'/>
>>   
>>   
>>   > function='0x0'/>
>> 
>> 
>>   
>>   
>>   
>>   
>>   > function='0x0'/>
>> 
>>
>> i used fio running in guest os.  and the context of  fio configuration is 
>> below:
>> [randwrite]
>> ioengine=libaio
>> iodepth=128
>> bs=512
>> filename=/dev/vdb
>> rw=randwrite
>> direct=1
>>
>> when the vm is not durning migrate, the iops is about 10.7K.
>>
>> then i used this command to start migrate virtual machine.
>>
>> virsh migrate-setspeed ab3ba978-c7a3-463d-a1d0-48649fb7df00 1000
>> virsh migrate --live ab3ba978-c7a3-463d-a1d0-48649fb7df00
>> --copy-storage-inc qemu+ssh://10.59.163.38/system
>>
>> before apply this patch, during the block dirty save phase, the iops
>> in guest os is  only 4.0K, the migrate speed is about 505856 rsec/s.
>> after apply this patch, during the block dirty save phase, the iops in
>> guest os is is 9.5K. the migrate speed is about 855756 rsec/s.
>
> Thanks, please include these numbers in the commit message too.
OK, i will.
>
> Fam



Re: [Qemu-block] [Qemu-devel] [PATCH] block: Always call bdrv_child_check_perm first

2017-03-14 Thread Fam Zheng
On Tue, 03/14 08:28, Eric Blake wrote:
> On 03/13/2017 09:30 PM, Fam Zheng wrote:
> > bdrv_child_set_perm alone is not very usable because the caller must
> > call bdrv_child_check_perm first. This is already encapsulated
> > conveniently in bdrv_child_try_set_perm, so remove the other prototypes
> > from the header and fix the one wrong caller, block/mirror.c.
> > 
> > Signed-off-by: Fam Zheng 
> > ---
> >  block.c   | 13 +
> >  block/mirror.c|  6 --
> >  include/block/block_int.h |  4 
> >  3 files changed, 13 insertions(+), 10 deletions(-)
> > 
> > diff --git a/block.c b/block.c
> > index cb57370..a77e8a0 100644
> > --- a/block.c
> > +++ b/block.c
> > @@ -1393,6 +1393,11 @@ static int bdrv_fill_options(QDict **options, const 
> > char *filename,
> >  return 0;
> >  }
> >  
> > +static int bdrv_child_check_perm(BdrvChild *c, uint64_t perm, uint64_t 
> > shared,
> > + GSList *ignore_children, Error **errp);
> > +static void bdrv_child_abort_perm_update(BdrvChild *c);
> > +static void bdrv_child_set_perm(BdrvChild *c, uint64_t perm, uint64_t 
> > shared);
> > +
> 
> Now that you have static prototypes, is it worth rearranging the file
> (in a followup) to sort the function implementations into topological
> order so that a prototype is not necessary?  [In general, I try to code
> so that static prototypes are only necessary if I am implementing
> mutually-referencing recursive code. But it's not a strict requirement]

Yes, thanks for pointing out, but it does have a recursion here:

bdrv_check_update_perm
-> bdrv_check_perm
-> bdrv_child_check_perm
-> bdrv_check_update_perm

Fam

> 
> Reviewed-by: Eric Blake 
> 
> -- 
> Eric Blake   eblake redhat com+1-919-301-3266
> Libvirt virtualization library http://libvirt.org
> 






Re: [Qemu-block] [Qemu-devel] [PATCH] migration/block: Avoid involve into blk_drain too frequently

2017-03-14 Thread Fam Zheng
On Wed, 03/15 10:28, 858585 jemmy wrote:
> On Tue, Mar 14, 2017 at 11:12 PM, Eric Blake  wrote:
> > On 03/14/2017 02:57 AM, jemmy858...@gmail.com wrote:
> >> From: Lidong Chen 
> >>
> >> Increase bmds->cur_dirty after submit io, so reduce the frequency involve 
> >> into blk_drain, and improve the performance obviously when block migration.
> >
> > Long line; please wrap your commit messages, preferably around 70 bytes
> > since 'git log' displays them indented, and it is still nice to read
> > them in an 80-column window.
> >
> > Do you have benchmark numbers to prove the impact of this patch, or even
> > a formula for reproducing the benchmark testing?
> >
> 
> the test result is base on current git master version.
> 
> the xml of guest os:
> 
>   
>file='/instanceimage/ab3ba978-c7a3-463d-a1d0-48649fb7df00/ab3ba978-c7a3-463d-a1d0-48649fb7df00_vda.qcow2'/>
>   
>   
>function='0x0'/>
> 
> 
>   
>   
>   
>   
>function='0x0'/>
> 
> 
> i used fio running in guest os.  and the context of  fio configuration is 
> below:
> [randwrite]
> ioengine=libaio
> iodepth=128
> bs=512
> filename=/dev/vdb
> rw=randwrite
> direct=1
> 
> when the vm is not durning migrate, the iops is about 10.7K.
> 
> then i used this command to start migrate virtual machine.
> 
> virsh migrate-setspeed ab3ba978-c7a3-463d-a1d0-48649fb7df00 1000
> virsh migrate --live ab3ba978-c7a3-463d-a1d0-48649fb7df00
> --copy-storage-inc qemu+ssh://10.59.163.38/system
> 
> before apply this patch, during the block dirty save phase, the iops
> in guest os is  only 4.0K, the migrate speed is about 505856 rsec/s.
> after apply this patch, during the block dirty save phase, the iops in
> guest os is is 9.5K. the migrate speed is about 855756 rsec/s.

Thanks, please include these numbers in the commit message too.

Fam



Re: [Qemu-block] [Qemu-devel] [PATCH] migration/block: Avoid involve into blk_drain too frequently

2017-03-14 Thread 858585 jemmy
On Tue, Mar 14, 2017 at 11:15 PM, Eric Blake  wrote:
>
> On 03/14/2017 10:12 AM, Eric Blake wrote:
> > On 03/14/2017 02:57 AM, jemmy858...@gmail.com wrote:
> >> From: Lidong Chen 
> >>
> >> Increase bmds->cur_dirty after submit io, so reduce the frequency involve 
> >> into blk_drain, and improve the performance obviously when block migration.
> >
> > Long line; please wrap your commit messages, preferably around 70 bytes
> > since 'git log' displays them indented, and it is still nice to read
> > them in an 80-column window.
>
> Also, in the subject:
>
> s/involve into/invoking/

thanks for you suggestion.
this is the first time i commit a patch for qemu, i will send a new version.

>
> --
> Eric Blake   eblake redhat com+1-919-301-3266
> Libvirt virtualization library http://libvirt.org
>



Re: [Qemu-block] [Qemu-devel] [PATCH] migration/block: Avoid involve into blk_drain too frequently

2017-03-14 Thread 858585 jemmy
On Tue, Mar 14, 2017 at 11:12 PM, Eric Blake  wrote:
> On 03/14/2017 02:57 AM, jemmy858...@gmail.com wrote:
>> From: Lidong Chen 
>>
>> Increase bmds->cur_dirty after submit io, so reduce the frequency involve 
>> into blk_drain, and improve the performance obviously when block migration.
>
> Long line; please wrap your commit messages, preferably around 70 bytes
> since 'git log' displays them indented, and it is still nice to read
> them in an 80-column window.
>
> Do you have benchmark numbers to prove the impact of this patch, or even
> a formula for reproducing the benchmark testing?
>

the test result is base on current git master version.

the xml of guest os:

  
  
  
  
  


  
  
  
  
  


i used fio running in guest os.  and the context of  fio configuration is below:
[randwrite]
ioengine=libaio
iodepth=128
bs=512
filename=/dev/vdb
rw=randwrite
direct=1

when the vm is not durning migrate, the iops is about 10.7K.

then i used this command to start migrate virtual machine.

virsh migrate-setspeed ab3ba978-c7a3-463d-a1d0-48649fb7df00 1000
virsh migrate --live ab3ba978-c7a3-463d-a1d0-48649fb7df00
--copy-storage-inc qemu+ssh://10.59.163.38/system

before apply this patch, during the block dirty save phase, the iops
in guest os is  only 4.0K, the migrate speed is about 505856 rsec/s.
after apply this patch, during the block dirty save phase, the iops in
guest os is is 9.5K. the migrate speed is about 855756 rsec/s.

with old version qemu(1.2.0), call bdrv_drain_all function to wait aio
complete, before apply this patch, the result is worse.
because the main thread is block by bdrv_drain_all for a long time,
the vnc is becoming response very slowly.
this bug is only obvious when set the migrate speed with a big number.

>>
>> Signed-off-by: Lidong Chen 
>> ---
>>  migration/block.c | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/migration/block.c b/migration/block.c
>> index 6741228..f059cca 100644
>> --- a/migration/block.c
>> +++ b/migration/block.c
>> @@ -576,6 +576,8 @@ static int mig_save_device_dirty(QEMUFile *f, 
>> BlkMigDevState *bmds,
>>  }
>>
>>  bdrv_reset_dirty_bitmap(bmds->dirty_bitmap, sector, nr_sectors);
>> +sector += nr_sectors;
>> +bmds->cur_dirty = sector;
>>  break;
>>  }
>>  sector += BDRV_SECTORS_PER_DIRTY_CHUNK;
>>
>
> --
> Eric Blake   eblake redhat com+1-919-301-3266
> Libvirt virtualization library http://libvirt.org
>



Re: [Qemu-block] [Qemu-devel] [PATCH v1 1/1] vmstate: fix failed iotests case 68 and 91

2017-03-14 Thread QingFeng Hao



在 2017/3/14 22:13, Dr. David Alan Gilbert 写道:

* QingFeng Hao (ha...@linux.vnet.ibm.com) wrote:

This problem affects s390x only if we are running without KVM.
Basically, S390CPU.irqstate is unused if we do not use KVM,
and thus no buffer is allocated.
This causes size=0, first_elem=NULL and n_elems=1 in
vmstate_load_state and vmstate_save_state. And the assert fails.
With this fix we can go back to the old behavior and support
VMS_VBUFFER with size 0 and nullptr.

Signed-off-by: QingFeng Hao 
Signed-off-by: Halil Pasic 

Thanks, and fixes problem with vmxnet3 migration.

Reviewed-by: Dr. David Alan Gilbert 

Thank you, Dave!


Dave


---
  migration/vmstate.c | 8 
  1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/migration/vmstate.c b/migration/vmstate.c
index 78b3cd4..7b4a607 100644
--- a/migration/vmstate.c
+++ b/migration/vmstate.c
@@ -109,7 +109,7 @@ int vmstate_load_state(QEMUFile *f, const 
VMStateDescription *vmsd,
  vmstate_handle_alloc(first_elem, field, opaque);
  if (field->flags & VMS_POINTER) {
  first_elem = *(void **)first_elem;
-assert(first_elem  || !n_elems);
+assert(first_elem || !n_elems || !size);
  }
  for (i = 0; i < n_elems; i++) {
  void *curr_elem = first_elem + size * i;
@@ -117,7 +117,7 @@ int vmstate_load_state(QEMUFile *f, const 
VMStateDescription *vmsd,
  if (field->flags & VMS_ARRAY_OF_POINTER) {
  curr_elem = *(void **)curr_elem;
  }
-if (!curr_elem) {
+if (!curr_elem && size) {
  /* if null pointer check placeholder and do not follow */
  assert(field->flags & VMS_ARRAY_OF_POINTER);
  ret = vmstate_info_nullptr.get(f, curr_elem, size, NULL);
@@ -325,7 +325,7 @@ void vmstate_save_state(QEMUFile *f, const 
VMStateDescription *vmsd,
  trace_vmstate_save_state_loop(vmsd->name, field->name, n_elems);
  if (field->flags & VMS_POINTER) {
  first_elem = *(void **)first_elem;
-assert(first_elem  || !n_elems);
+assert(first_elem || !n_elems || !size);
  }
  for (i = 0; i < n_elems; i++) {
  void *curr_elem = first_elem + size * i;
@@ -336,7 +336,7 @@ void vmstate_save_state(QEMUFile *f, const 
VMStateDescription *vmsd,
  assert(curr_elem);
  curr_elem = *(void **)curr_elem;
  }
-if (!curr_elem) {
+if (!curr_elem && size) {
  /* if null pointer write placeholder and do not follow */
  assert(field->flags & VMS_ARRAY_OF_POINTER);
  vmstate_info_nullptr.put(f, curr_elem, size, NULL, NULL);
--
1.8.3.1



--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



--
Regards
QingFeng Hao




Re: [Qemu-block] [Qemu-devel] [PATCH] file-posix: Remove unnecessary includes

2017-03-14 Thread Eric Blake
On 03/14/2017 11:03 AM, Kevin Wolf wrote:
> Signed-off-by: Kevin Wolf 
> ---
>  block/file-posix.c | 2 --
>  1 file changed, 2 deletions(-)

Reviewed-by: Eric Blake 

> 
> diff --git a/block/file-posix.c b/block/file-posix.c
> index 2cb2f64..a03a3e1 100644
> --- a/block/file-posix.c
> +++ b/block/file-posix.c
> @@ -25,8 +25,6 @@
>  #include "qapi/error.h"
>  #include "qemu/cutils.h"
>  #include "qemu/error-report.h"
> -#include "qemu/timer.h"
> -#include "qemu/log.h"
>  #include "block/block_int.h"
>  #include "qemu/module.h"
>  #include "trace.h"
> 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [Qemu-devel] [PATCH] file-win32: Remove unnecessary include

2017-03-14 Thread Eric Blake
On 03/14/2017 11:04 AM, Kevin Wolf wrote:
> Signed-off-by: Kevin Wolf 
> ---
>  block/file-win32.c | 1 -
>  1 file changed, 1 deletion(-)

Reviewed-by: Eric Blake 

> 
> diff --git a/block/file-win32.c b/block/file-win32.c
> index 800fabd..e132ba1 100644
> --- a/block/file-win32.c
> +++ b/block/file-win32.c
> @@ -24,7 +24,6 @@
>  #include "qemu/osdep.h"
>  #include "qapi/error.h"
>  #include "qemu/cutils.h"
> -#include "qemu/timer.h"
>  #include "block/block_int.h"
>  #include "qemu/module.h"
>  #include "block/raw-aio.h"
> 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [Qemu-devel] [PATCH v2] file-posix: Don't leak fd in hdev_get_max_segments

2017-03-14 Thread Eric Blake
On 03/14/2017 11:28 AM, Kevin Wolf wrote:
> Am 14.03.2017 um 17:20 hat Eric Blake geschrieben:
>> On 03/14/2017 11:12 AM, Fam Zheng wrote:
>>> Signed-off-by: Fam Zheng 
>>> ---
>>>  block/file-posix.c | 3 +++
>>>  1 file changed, 3 insertions(+)
>>>
>>> diff --git a/block/file-posix.c b/block/file-posix.c
>>> index c4c0663..d670be3 100644
>>> --- a/block/file-posix.c
>>> +++ b/block/file-posix.c
>>> @@ -703,6 +703,9 @@ static int hdev_get_max_segments(const struct stat *st)
>>>  }
>>>  
>>>  out:
>>> +if (fd != -1) {
>>> +close(fd);
>>> +}
>>
>> Mails crossed - I see I asked about this in reply to v1 after you
>> already sent v2 minutes earlier.
>>
>> I still think it might be nice to mention a commit id that you are
>> fixing (one of my reply-to-self on your v1), but either way,
>>
>> Reviewed-by: Eric Blake 
> 
> I added a sentence to the commit message: "This fixes a leaked fd
> introduced in commit 9103f1ce." Sounds good?

Works for me!

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


[Qemu-block] [PATCH] blk: fix aio context loss on media change

2017-03-14 Thread Vladimir Sementsov-Ogievskiy
If we have separate iothread for cdrom, we lose connection to it on
qmp_blockdev_change_medium, as aio_context is on bds which is dropped
and switched with new one.

As an example result, after such media change we have crash on
virtio_scsi_ctx_check: Assertion `blk_get_aio_context(d->conf.blk) == s->ctx' 
failed.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---

Hi all!

We've faced into this assert, and there some kind of fix. I don't sure that
such fix doesn't break some conceptions, in this case, I hope, someone will
propose a true-way solution.

==
Also, on master branch I can't reproduce it as vm crashed earlier, without any
eject/change, on assert(s->ctx && s->dataplane_started) in
virtio_scsi_data_plane_handle_ctrl(). It looks like race with
virtio_scsi_dataplane_start(), and for test (to reproduce assert described 
above),
I've "fixed" it with just:

@@ -63,6 +63,7 @@ static bool virtio_scsi_data_plane_handle_ctrl(VirtIODevice 
*vdev,
 {
 VirtIOSCSI *s = VIRTIO_SCSI(vdev);
 
+sleep(10);
 assert(s->ctx && s->dataplane_started);
 return virtio_scsi_handle_ctrl_vq(s, vq);
 }

This race is not reproduced for me in our 2.6 based branch.

 block/block-backend.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/block/block-backend.c b/block/block-backend.c
index 5742c09c2c..6d5044228e 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -65,6 +65,8 @@ struct BlockBackend {
 bool allow_write_beyond_eof;
 
 NotifierList remove_bs_notifiers, insert_bs_notifiers;
+
+AioContext *aio_context;
 };
 
 typedef struct BlockBackendAIOCB {
@@ -559,6 +561,10 @@ int blk_insert_bs(BlockBackend *blk, BlockDriverState *bs, 
Error **errp)
 }
 bdrv_ref(bs);
 
+if (blk->aio_context != NULL) {
+bdrv_set_aio_context(bs, blk->aio_context);
+}
+
 notifier_list_notify(>insert_bs_notifiers, blk);
 if (blk->public.throttle_state) {
 throttle_timers_attach_aio_context(
@@ -1607,6 +1613,7 @@ void blk_set_aio_context(BlockBackend *blk, AioContext 
*new_context)
 {
 BlockDriverState *bs = blk_bs(blk);
 
+blk->aio_context = new_context;
 if (bs) {
 if (blk->public.throttle_state) {
 throttle_timers_detach_aio_context(>public.throttle_timers);
-- 
2.11.1




Re: [Qemu-block] [Qemu-devel] [PATCH v2] file-posix: Don't leak fd in hdev_get_max_segments

2017-03-14 Thread Kevin Wolf
Am 14.03.2017 um 17:20 hat Eric Blake geschrieben:
> On 03/14/2017 11:12 AM, Fam Zheng wrote:
> > Signed-off-by: Fam Zheng 
> > ---
> >  block/file-posix.c | 3 +++
> >  1 file changed, 3 insertions(+)
> > 
> > diff --git a/block/file-posix.c b/block/file-posix.c
> > index c4c0663..d670be3 100644
> > --- a/block/file-posix.c
> > +++ b/block/file-posix.c
> > @@ -703,6 +703,9 @@ static int hdev_get_max_segments(const struct stat *st)
> >  }
> >  
> >  out:
> > +if (fd != -1) {
> > +close(fd);
> > +}
> 
> Mails crossed - I see I asked about this in reply to v1 after you
> already sent v2 minutes earlier.
> 
> I still think it might be nice to mention a commit id that you are
> fixing (one of my reply-to-self on your v1), but either way,
> 
> Reviewed-by: Eric Blake 

I added a sentence to the commit message: "This fixes a leaked fd
introduced in commit 9103f1ce." Sounds good?

Kevin


pgp3y1Q7KgGJs.pgp
Description: PGP signature


Re: [Qemu-block] [Qemu-devel] [PATCH v2] file-posix: Don't leak fd in hdev_get_max_segments

2017-03-14 Thread Eric Blake
On 03/14/2017 11:12 AM, Fam Zheng wrote:
> Signed-off-by: Fam Zheng 
> ---
>  block/file-posix.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/block/file-posix.c b/block/file-posix.c
> index c4c0663..d670be3 100644
> --- a/block/file-posix.c
> +++ b/block/file-posix.c
> @@ -703,6 +703,9 @@ static int hdev_get_max_segments(const struct stat *st)
>  }
>  
>  out:
> +if (fd != -1) {
> +close(fd);
> +}

Mails crossed - I see I asked about this in reply to v1 after you
already sent v2 minutes earlier.

I still think it might be nice to mention a commit id that you are
fixing (one of my reply-to-self on your v1), but either way,

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH v2] file-posix: Don't leak fd in hdev_get_max_segments

2017-03-14 Thread Kevin Wolf
Am 14.03.2017 um 17:12 hat Fam Zheng geschrieben:
> Signed-off-by: Fam Zheng 

Thanks, applied to the block branch.

Kevin



Re: [Qemu-block] [Qemu-devel] [PATCH] file-posix: Don't leak fd in hdev_get_max_segments

2017-03-14 Thread Eric Blake
On 03/14/2017 11:14 AM, Eric Blake wrote:
> On 03/14/2017 11:11 AM, Eric Blake wrote:
>> On 03/14/2017 10:43 AM, Fam Zheng wrote:
>>> Signed-off-by: Fam Zheng 
>>> ---
>>>  block/file-posix.c | 1 +
>>>  1 file changed, 1 insertion(+)
>>>
>>> diff --git a/block/file-posix.c b/block/file-posix.c
>>> index c4c0663..e6170f4 100644
>>> --- a/block/file-posix.c
>>> +++ b/block/file-posix.c
>>> @@ -703,6 +703,7 @@ static int hdev_get_max_segments(const struct stat *st)
>>
>> hdev_get_max_segments() is not part of master yet; should this just be
>> treated as a fixup! to a pending patch?
> 
> Scratch that - I forgot to 'git pull' on my end.
> 
> Commit message could usefully be improved by mentioning commit 9103f1ce
> as the spot the problem was introduced.
> 
> Reviewed-by: Eric Blake 
> 

Question: does valgrind or any other tool complain about:

+if (fd == -1) {
+ret = -errno;
+goto out;
+}
...
+out:
+close(fd);
+g_free(sysfspath);
+return ret;

this being a useless syscall to close(-1) ?  If so, you need to wrap in
an 'if (fd != -1)'

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [Qemu-devel] [PATCH] file-posix: Don't leak fd in hdev_get_max_segments

2017-03-14 Thread Kevin Wolf
Am 14.03.2017 um 17:11 hat Eric Blake geschrieben:
> On 03/14/2017 10:43 AM, Fam Zheng wrote:
> > Signed-off-by: Fam Zheng 
> > ---
> >  block/file-posix.c | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/block/file-posix.c b/block/file-posix.c
> > index c4c0663..e6170f4 100644
> > --- a/block/file-posix.c
> > +++ b/block/file-posix.c
> > @@ -703,6 +703,7 @@ static int hdev_get_max_segments(const struct stat *st)
> 
> hdev_get_max_segments() is not part of master yet; should this just be
> treated as a fixup! to a pending patch?

It is already in master, since yesterday, I think.

Kevin


pgp7bGdbbj9Ie.pgp
Description: PGP signature


[Qemu-block] [PATCH v2] file-posix: Don't leak fd in hdev_get_max_segments

2017-03-14 Thread Fam Zheng
Signed-off-by: Fam Zheng 
---
 block/file-posix.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/block/file-posix.c b/block/file-posix.c
index c4c0663..d670be3 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -703,6 +703,9 @@ static int hdev_get_max_segments(const struct stat *st)
 }
 
 out:
+if (fd != -1) {
+close(fd);
+}
 g_free(sysfspath);
 return ret;
 #else
-- 
2.9.3




Re: [Qemu-block] [Qemu-devel] [PATCH] file-posix: Don't leak fd in hdev_get_max_segments

2017-03-14 Thread Eric Blake
On 03/14/2017 11:11 AM, Eric Blake wrote:
> On 03/14/2017 10:43 AM, Fam Zheng wrote:
>> Signed-off-by: Fam Zheng 
>> ---
>>  block/file-posix.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/block/file-posix.c b/block/file-posix.c
>> index c4c0663..e6170f4 100644
>> --- a/block/file-posix.c
>> +++ b/block/file-posix.c
>> @@ -703,6 +703,7 @@ static int hdev_get_max_segments(const struct stat *st)
> 
> hdev_get_max_segments() is not part of master yet; should this just be
> treated as a fixup! to a pending patch?

Scratch that - I forgot to 'git pull' on my end.

Commit message could usefully be improved by mentioning commit 9103f1ce
as the spot the problem was introduced.

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [Qemu-devel] [PATCH] file-posix: Don't leak fd in hdev_get_max_segments

2017-03-14 Thread Eric Blake
On 03/14/2017 10:43 AM, Fam Zheng wrote:
> Signed-off-by: Fam Zheng 
> ---
>  block/file-posix.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/block/file-posix.c b/block/file-posix.c
> index c4c0663..e6170f4 100644
> --- a/block/file-posix.c
> +++ b/block/file-posix.c
> @@ -703,6 +703,7 @@ static int hdev_get_max_segments(const struct stat *st)

hdev_get_max_segments() is not part of master yet; should this just be
treated as a fixup! to a pending patch?

>  }
>  
>  out:
> +close(fd);
>  g_free(sysfspath);
>  return ret;
>  #else
> 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [PATCH] file-posix: Don't leak fd in hdev_get_max_segments

2017-03-14 Thread Fam Zheng
On Tue, 03/14 16:50, Kevin Wolf wrote:
> Am 14.03.2017 um 16:43 hat Fam Zheng geschrieben:
> > Signed-off-by: Fam Zheng 
> > ---
> >  block/file-posix.c | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/block/file-posix.c b/block/file-posix.c
> > index c4c0663..e6170f4 100644
> > --- a/block/file-posix.c
> > +++ b/block/file-posix.c
> > @@ -703,6 +703,7 @@ static int hdev_get_max_segments(const struct stat *st)
> >  }
> >  
> >  out:
> > +close(fd);
> 
> Should we make this conditional on fd != -1?

OK, that is cleaner.

Fam

> 
> >  g_free(sysfspath);
> >  return ret;
> >  #else
> 
> Kevin



[Qemu-block] [PATCH] file-win32: Remove unnecessary include

2017-03-14 Thread Kevin Wolf
Signed-off-by: Kevin Wolf 
---
 block/file-win32.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/block/file-win32.c b/block/file-win32.c
index 800fabd..e132ba1 100644
--- a/block/file-win32.c
+++ b/block/file-win32.c
@@ -24,7 +24,6 @@
 #include "qemu/osdep.h"
 #include "qapi/error.h"
 #include "qemu/cutils.h"
-#include "qemu/timer.h"
 #include "block/block_int.h"
 #include "qemu/module.h"
 #include "block/raw-aio.h"
-- 
1.8.3.1




[Qemu-block] [PATCH] file-posix: Remove unnecessary includes

2017-03-14 Thread Kevin Wolf
Signed-off-by: Kevin Wolf 
---
 block/file-posix.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index 2cb2f64..a03a3e1 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -25,8 +25,6 @@
 #include "qapi/error.h"
 #include "qemu/cutils.h"
 #include "qemu/error-report.h"
-#include "qemu/timer.h"
-#include "qemu/log.h"
 #include "block/block_int.h"
 #include "qemu/module.h"
 #include "trace.h"
-- 
1.8.3.1




[Qemu-block] [PATCH] file-posix: Don't leak fd in hdev_get_max_segments

2017-03-14 Thread Fam Zheng
Signed-off-by: Fam Zheng 
---
 block/file-posix.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/block/file-posix.c b/block/file-posix.c
index c4c0663..e6170f4 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -703,6 +703,7 @@ static int hdev_get_max_segments(const struct stat *st)
 }
 
 out:
+close(fd);
 g_free(sysfspath);
 return ret;
 #else
-- 
2.9.3




Re: [Qemu-block] [Qemu-devel] [PATCH] migration/block: Avoid involve into blk_drain too frequently

2017-03-14 Thread Eric Blake
On 03/14/2017 10:12 AM, Eric Blake wrote:
> On 03/14/2017 02:57 AM, jemmy858...@gmail.com wrote:
>> From: Lidong Chen 
>>
>> Increase bmds->cur_dirty after submit io, so reduce the frequency involve 
>> into blk_drain, and improve the performance obviously when block migration.
> 
> Long line; please wrap your commit messages, preferably around 70 bytes
> since 'git log' displays them indented, and it is still nice to read
> them in an 80-column window.

Also, in the subject:

s/involve into/invoking/

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [Qemu-devel] [PATCH] migration/block: Avoid involve into blk_drain too frequently

2017-03-14 Thread Eric Blake
On 03/14/2017 02:57 AM, jemmy858...@gmail.com wrote:
> From: Lidong Chen 
> 
> Increase bmds->cur_dirty after submit io, so reduce the frequency involve 
> into blk_drain, and improve the performance obviously when block migration.

Long line; please wrap your commit messages, preferably around 70 bytes
since 'git log' displays them indented, and it is still nice to read
them in an 80-column window.

Do you have benchmark numbers to prove the impact of this patch, or even
a formula for reproducing the benchmark testing?

> 
> Signed-off-by: Lidong Chen 
> ---
>  migration/block.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/migration/block.c b/migration/block.c
> index 6741228..f059cca 100644
> --- a/migration/block.c
> +++ b/migration/block.c
> @@ -576,6 +576,8 @@ static int mig_save_device_dirty(QEMUFile *f, 
> BlkMigDevState *bmds,
>  }
>  
>  bdrv_reset_dirty_bitmap(bmds->dirty_bitmap, sector, nr_sectors);
> +sector += nr_sectors;
> +bmds->cur_dirty = sector;
>  break;
>  }
>  sector += BDRV_SECTORS_PER_DIRTY_CHUNK;
> 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


[Qemu-block] [PATCH] migration/block: Avoid involve into blk_drain too frequently

2017-03-14 Thread jemmy858585
From: Lidong Chen 

Increase bmds->cur_dirty after submit io, so reduce the frequency involve into 
blk_drain, and improve the performance obviously when block migration.

Signed-off-by: Lidong Chen 
---
 migration/block.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/migration/block.c b/migration/block.c
index 6741228..f059cca 100644
--- a/migration/block.c
+++ b/migration/block.c
@@ -576,6 +576,8 @@ static int mig_save_device_dirty(QEMUFile *f, 
BlkMigDevState *bmds,
 }
 
 bdrv_reset_dirty_bitmap(bmds->dirty_bitmap, sector, nr_sectors);
+sector += nr_sectors;
+bmds->cur_dirty = sector;
 break;
 }
 sector += BDRV_SECTORS_PER_DIRTY_CHUNK;
-- 
1.8.3.1




Re: [Qemu-block] [Qemu-devel] [PATCH] nbd-client: fix handling of hungup connections

2017-03-14 Thread Eric Blake
On 03/14/2017 06:11 AM, Paolo Bonzini wrote:
> After the switch to reading replies in a coroutine, nothing is
> reentering pending receive coroutines if the connection hangs.
> Move nbd_recv_coroutines_enter_all to the reply read coroutine,
> which is the place where hangups are detected.  nbd_teardown_connection
> can simply wait for the reply read coroutine to detect the hangup
> and clean up after itself.
> 
> This wouldn't be enough though because nbd_receive_reply returns 0
> (rather than -EPIPE or similar) when reading from a hung connection.
> Fix the return value check in nbd_read_reply_entry.
> 
> This fixes qemu-iotests 083.
> 
> Reported-by: Max Reitz 
> Signed-off-by: Paolo Bonzini 
> ---
>  block/nbd-client.c | 12 ++--
>  nbd/client.c   |  2 +-
>  2 files changed, 7 insertions(+), 7 deletions(-)
> 

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [Qemu-devel] [PATCH] block: quiesce AioContext when detaching from it

2017-03-14 Thread Eric Blake
On 03/14/2017 06:11 AM, Paolo Bonzini wrote:
> While it is true that bdrv_set_aio_context only works on a single
> BlockDriverState subtree (see commit message for 53ec73e, "block: Use
> bdrv_drain to replace uncessary bdrv_drain_all", 2015-07-07), it works

I was about to correct the typo, but see it was in the original commit :)

> at the AioContext level rather than the BlockDriverState level.
> 
> Therefore, it is also necessary to trigger pending bottom halves too,
> even if no requests are pending.
> 
> For NBD this ensures that the aio_co_schedule of a previous call to
> nbd_attach_aio_context is completed before detaching from the old
> AioContext; it fixes qemu-iotest 094.  Another similar bug happens
> when the VM is stopped and the virtio-blk dataplane irqfd is torn down.
> In this case it's possible that guest I/O gets stuck if notify_guest_bh
> was scheduled but doesn't run.
> 
> Calling aio_poll from another AioContext is safe if non-blocking; races
> such as the one mentioned in the commit message for c9d1a56 ("block:
> only call aio_poll on the current thread's AioContext", 2016-10-28)
> are a concern for blocking calls.
> 
> I considered other options, including:
> 
> - moving the bs->wakeup mechanism to AioContext, and letting the caller
> check.  This might work for virtio which has a clear place to wakeup
> (notify_place_bh) and check the condition (virtio_blk_data_plane_stop).
> For aio_co_schedule I couldn't find a clear place to check the condition.
> 
> - adding a dummy oneshot bottom half and waiting for it to trigger.
> This has the complication that bottom half list is LIFO for historical
> reasons.  There were performance issues caused by bottom half ordering
> in the past, so I decided against it for 2.9.
> 
> Fixes: 99723548561978da8ef44cf804fb7912698f5d88
> Reported-by: Max Reitz 
> Reported-by: Halil Pasic 
> Tested-by: Halil Pasic 
> Signed-off-by: Paolo Bonzini 
> ---
>  block.c | 7 +++
>  1 file changed, 7 insertions(+)
> 

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-block] [Qemu-devel] [PATCH v1 1/1] vmstate: fix failed iotests case 68 and 91

2017-03-14 Thread Dr. David Alan Gilbert
* QingFeng Hao (ha...@linux.vnet.ibm.com) wrote:
> This problem affects s390x only if we are running without KVM.
> Basically, S390CPU.irqstate is unused if we do not use KVM,
> and thus no buffer is allocated.
> This causes size=0, first_elem=NULL and n_elems=1 in
> vmstate_load_state and vmstate_save_state. And the assert fails.
> With this fix we can go back to the old behavior and support
> VMS_VBUFFER with size 0 and nullptr.
> 
> Signed-off-by: QingFeng Hao 
> Signed-off-by: Halil Pasic 

Thanks, and fixes problem with vmxnet3 migration.

Reviewed-by: Dr. David Alan Gilbert 

Dave

> ---
>  migration/vmstate.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/migration/vmstate.c b/migration/vmstate.c
> index 78b3cd4..7b4a607 100644
> --- a/migration/vmstate.c
> +++ b/migration/vmstate.c
> @@ -109,7 +109,7 @@ int vmstate_load_state(QEMUFile *f, const 
> VMStateDescription *vmsd,
>  vmstate_handle_alloc(first_elem, field, opaque);
>  if (field->flags & VMS_POINTER) {
>  first_elem = *(void **)first_elem;
> -assert(first_elem  || !n_elems);
> +assert(first_elem || !n_elems || !size);
>  }
>  for (i = 0; i < n_elems; i++) {
>  void *curr_elem = first_elem + size * i;
> @@ -117,7 +117,7 @@ int vmstate_load_state(QEMUFile *f, const 
> VMStateDescription *vmsd,
>  if (field->flags & VMS_ARRAY_OF_POINTER) {
>  curr_elem = *(void **)curr_elem;
>  }
> -if (!curr_elem) {
> +if (!curr_elem && size) {
>  /* if null pointer check placeholder and do not follow */
>  assert(field->flags & VMS_ARRAY_OF_POINTER);
>  ret = vmstate_info_nullptr.get(f, curr_elem, size, NULL);
> @@ -325,7 +325,7 @@ void vmstate_save_state(QEMUFile *f, const 
> VMStateDescription *vmsd,
>  trace_vmstate_save_state_loop(vmsd->name, field->name, n_elems);
>  if (field->flags & VMS_POINTER) {
>  first_elem = *(void **)first_elem;
> -assert(first_elem  || !n_elems);
> +assert(first_elem || !n_elems || !size);
>  }
>  for (i = 0; i < n_elems; i++) {
>  void *curr_elem = first_elem + size * i;
> @@ -336,7 +336,7 @@ void vmstate_save_state(QEMUFile *f, const 
> VMStateDescription *vmsd,
>  assert(curr_elem);
>  curr_elem = *(void **)curr_elem;
>  }
> -if (!curr_elem) {
> +if (!curr_elem && size) {
>  /* if null pointer write placeholder and do not follow */
>  assert(field->flags & VMS_ARRAY_OF_POINTER);
>  vmstate_info_nullptr.put(f, curr_elem, size, NULL, NULL);
> -- 
> 1.8.3.1
> 
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-block] [Qemu-devel] [PATCH] block: Always call bdrv_child_check_perm first

2017-03-14 Thread Eric Blake
On 03/13/2017 09:30 PM, Fam Zheng wrote:
> bdrv_child_set_perm alone is not very usable because the caller must
> call bdrv_child_check_perm first. This is already encapsulated
> conveniently in bdrv_child_try_set_perm, so remove the other prototypes
> from the header and fix the one wrong caller, block/mirror.c.
> 
> Signed-off-by: Fam Zheng 
> ---
>  block.c   | 13 +
>  block/mirror.c|  6 --
>  include/block/block_int.h |  4 
>  3 files changed, 13 insertions(+), 10 deletions(-)
> 
> diff --git a/block.c b/block.c
> index cb57370..a77e8a0 100644
> --- a/block.c
> +++ b/block.c
> @@ -1393,6 +1393,11 @@ static int bdrv_fill_options(QDict **options, const 
> char *filename,
>  return 0;
>  }
>  
> +static int bdrv_child_check_perm(BdrvChild *c, uint64_t perm, uint64_t 
> shared,
> + GSList *ignore_children, Error **errp);
> +static void bdrv_child_abort_perm_update(BdrvChild *c);
> +static void bdrv_child_set_perm(BdrvChild *c, uint64_t perm, uint64_t 
> shared);
> +

Now that you have static prototypes, is it worth rearranging the file
(in a followup) to sort the function implementations into topological
order so that a prototype is not necessary?  [In general, I try to code
so that static prototypes are only necessary if I am implementing
mutually-referencing recursive code. But it's not a strict requirement]

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


[Qemu-block] follow file modifications made by guest os with qemu

2017-03-14 Thread Pascal
hi everybody,

how could I (easily) follow file modifications made by guest os (Windows)
with qemu ?
could I exploit the growing overlay image based on an original Windows
image ?

regards, lacsaP.


Re: [Qemu-block] [PATCH V1] replication: clarify permissions

2017-03-14 Thread Kevin Wolf
Am 14.03.2017 um 12:46 hat Changlong Xie geschrieben:
> Even if hidden_disk, secondary_disk are backing files, they all need
> write permissions in replication scenario. Otherwise we will encouter
> below exceptions on secondary side during adding nbd server:
> 
> {'execute': 'nbd-server-add', 'arguments': {'device': 'colo-disk', 
> 'writable': true } }
> {"error": {"class": "GenericError", "desc": "Conflicts with use by 
> hidden-qcow2-driver as 'backing', which does not allow 'write' on 
> sec-qcow2-driver-for-nbd"}}
> 
> CC: Zhang Hailiang 
> CC: Zhang Chen 
> CC: Wen Congyang 
> Signed-off-by: Changlong Xie 

Thanks, applied to the block branch.

Kevin



[Qemu-block] [PATCH V1] replication: clarify permissions

2017-03-14 Thread Changlong Xie
Even if hidden_disk, secondary_disk are backing files, they all need
write permissions in replication scenario. Otherwise we will encouter
below exceptions on secondary side during adding nbd server:

{'execute': 'nbd-server-add', 'arguments': {'device': 'colo-disk', 'writable': 
true } }
{"error": {"class": "GenericError", "desc": "Conflicts with use by 
hidden-qcow2-driver as 'backing', which does not allow 'write' on 
sec-qcow2-driver-for-nbd"}}

CC: Zhang Hailiang 
CC: Zhang Chen 
CC: Wen Congyang 
Signed-off-by: Changlong Xie 
---
 block/replication.c | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/block/replication.c b/block/replication.c
index 22f170f..bf3c395 100644
--- a/block/replication.c
+++ b/block/replication.c
@@ -155,6 +155,18 @@ static void replication_close(BlockDriverState *bs)
 replication_remove(s->rs);
 }
 
+static void replication_child_perm(BlockDriverState *bs, BdrvChild *c,
+   const BdrvChildRole *role,
+   uint64_t perm, uint64_t shared,
+   uint64_t *nperm, uint64_t *nshared)
+{
+*nperm = *nshared = BLK_PERM_CONSISTENT_READ \
+| BLK_PERM_WRITE \
+| BLK_PERM_WRITE_UNCHANGED;
+
+return;
+}
+
 static int64_t replication_getlength(BlockDriverState *bs)
 {
 return bdrv_getlength(bs->file->bs);
@@ -660,7 +672,7 @@ BlockDriver bdrv_replication = {
 
 .bdrv_open  = replication_open,
 .bdrv_close = replication_close,
-.bdrv_child_perm= bdrv_filter_default_perms,
+.bdrv_child_perm= replication_child_perm,
 
 .bdrv_getlength = replication_getlength,
 .bdrv_co_readv  = replication_co_readv,
-- 
1.9.3






[Qemu-block] [PATCH] block: quiesce AioContext when detaching from it

2017-03-14 Thread Paolo Bonzini
While it is true that bdrv_set_aio_context only works on a single
BlockDriverState subtree (see commit message for 53ec73e, "block: Use
bdrv_drain to replace uncessary bdrv_drain_all", 2015-07-07), it works
at the AioContext level rather than the BlockDriverState level.

Therefore, it is also necessary to trigger pending bottom halves too,
even if no requests are pending.

For NBD this ensures that the aio_co_schedule of a previous call to
nbd_attach_aio_context is completed before detaching from the old
AioContext; it fixes qemu-iotest 094.  Another similar bug happens
when the VM is stopped and the virtio-blk dataplane irqfd is torn down.
In this case it's possible that guest I/O gets stuck if notify_guest_bh
was scheduled but doesn't run.

Calling aio_poll from another AioContext is safe if non-blocking; races
such as the one mentioned in the commit message for c9d1a56 ("block:
only call aio_poll on the current thread's AioContext", 2016-10-28)
are a concern for blocking calls.

I considered other options, including:

- moving the bs->wakeup mechanism to AioContext, and letting the caller
check.  This might work for virtio which has a clear place to wakeup
(notify_place_bh) and check the condition (virtio_blk_data_plane_stop).
For aio_co_schedule I couldn't find a clear place to check the condition.

- adding a dummy oneshot bottom half and waiting for it to trigger.
This has the complication that bottom half list is LIFO for historical
reasons.  There were performance issues caused by bottom half ordering
in the past, so I decided against it for 2.9.

Fixes: 99723548561978da8ef44cf804fb7912698f5d88
Reported-by: Max Reitz 
Reported-by: Halil Pasic 
Tested-by: Halil Pasic 
Signed-off-by: Paolo Bonzini 
---
 block.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/block.c b/block.c
index f293ccb..e159251 100644
--- a/block.c
+++ b/block.c
@@ -4272,8 +4272,15 @@ void bdrv_attach_aio_context(BlockDriverState *bs,
 
 void bdrv_set_aio_context(BlockDriverState *bs, AioContext *new_context)
 {
+AioContext *ctx;
+
 bdrv_drain(bs); /* ensure there are no in-flight requests */
 
+ctx = bdrv_get_aio_context(bs);
+while (aio_poll(ctx, false)) {
+/* wait for all bottom halves to execute */
+}
+
 bdrv_detach_aio_context(bs);
 
 /* This function executes in the old AioContext so acquire the new one in
-- 
2.9.3




[Qemu-block] [PATCH] nbd-client: fix handling of hungup connections

2017-03-14 Thread Paolo Bonzini
After the switch to reading replies in a coroutine, nothing is
reentering pending receive coroutines if the connection hangs.
Move nbd_recv_coroutines_enter_all to the reply read coroutine,
which is the place where hangups are detected.  nbd_teardown_connection
can simply wait for the reply read coroutine to detect the hangup
and clean up after itself.

This wouldn't be enough though because nbd_receive_reply returns 0
(rather than -EPIPE or similar) when reading from a hung connection.
Fix the return value check in nbd_read_reply_entry.

This fixes qemu-iotests 083.

Reported-by: Max Reitz 
Signed-off-by: Paolo Bonzini 
---
 block/nbd-client.c | 12 ++--
 nbd/client.c   |  2 +-
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/block/nbd-client.c b/block/nbd-client.c
index 0dc12c2..1e2952f 100644
--- a/block/nbd-client.c
+++ b/block/nbd-client.c
@@ -33,17 +33,15 @@
 #define HANDLE_TO_INDEX(bs, handle) ((handle) ^ ((uint64_t)(intptr_t)bs))
 #define INDEX_TO_HANDLE(bs, index)  ((index)  ^ ((uint64_t)(intptr_t)bs))
 
-static void nbd_recv_coroutines_enter_all(BlockDriverState *bs)
+static void nbd_recv_coroutines_enter_all(NBDClientSession *s)
 {
-NBDClientSession *s = nbd_get_client_session(bs);
 int i;
 
 for (i = 0; i < MAX_NBD_REQUESTS; i++) {
 if (s->recv_coroutine[i]) {
-qemu_coroutine_enter(s->recv_coroutine[i]);
+aio_co_wake(s->recv_coroutine[i]);
 }
 }
-BDRV_POLL_WHILE(bs, s->read_reply_co);
 }
 
 static void nbd_teardown_connection(BlockDriverState *bs)
@@ -58,7 +56,7 @@ static void nbd_teardown_connection(BlockDriverState *bs)
 qio_channel_shutdown(client->ioc,
  QIO_CHANNEL_SHUTDOWN_BOTH,
  NULL);
-nbd_recv_coroutines_enter_all(bs);
+BDRV_POLL_WHILE(bs, client->read_reply_co);
 
 nbd_client_detach_aio_context(bs);
 object_unref(OBJECT(client->sioc));
@@ -76,7 +74,7 @@ static coroutine_fn void nbd_read_reply_entry(void *opaque)
 for (;;) {
 assert(s->reply.handle == 0);
 ret = nbd_receive_reply(s->ioc, >reply);
-if (ret < 0) {
+if (ret <= 0) {
 break;
 }
 
@@ -103,6 +101,8 @@ static coroutine_fn void nbd_read_reply_entry(void *opaque)
 aio_co_wake(s->recv_coroutine[i]);
 qemu_coroutine_yield();
 }
+
+nbd_recv_coroutines_enter_all(s);
 s->read_reply_co = NULL;
 }
 
diff --git a/nbd/client.c b/nbd/client.c
index 5c9dee3..746e9a7 100644
--- a/nbd/client.c
+++ b/nbd/client.c
@@ -812,6 +812,6 @@ ssize_t nbd_receive_reply(QIOChannel *ioc, NBDReply *reply)
 LOG("invalid magic (got 0x%" PRIx32 ")", magic);
 return -EINVAL;
 }
-return 0;
+return sizeof(buf);
 }
 
-- 
2.9.3




Re: [Qemu-block] [Qemu-devel] [PULL 00/12] Block layer fixes for 2.9.0-rc1

2017-03-14 Thread Peter Maydell
On 14 March 2017 at 10:29, Kevin Wolf  wrote:
> Am 13.03.2017 um 19:05 hat Peter Maydell geschrieben:
>> On 13 March 2017 at 15:54, Kevin Wolf  wrote:
>> > The following changes since commit 
>> > dd4d2578215cd380f40a38028a9904e15b135ef3:
>> >
>> >   Merge remote-tracking branch 'remotes/kraxel/tags/pull-fixes-20170309-1' 
>> > into staging (2017-03-09 13:16:05 +)
>> >
>> > are available in the git repository at:
>> >
>> >
>> >   git://repo.or.cz/qemu/kevin.git tags/for-upstream
>> >
>> > for you to fetch changes up to dcbf37ce41a52698550f8f8b2f14b5e6fee22d2d:
>> >
>> >   commit: Implement .bdrv_refresh_filename (2017-03-13 12:49:33 +0100)
>> >
>> > 
>> > Block layer fixes for 2.9.0-rc1
>>
>> We haven't had rc0 yet :-)
>
> I was assuming that we'd just skip -rc0 then. Are we instead shifting
> the release by a week or two?

I'm planning to push all the rcs back a week. Of course
if we get to rc1/rc2 and there's very little going into the
tree we can release early, but this basically never happens.

thanks
-- PMM



Re: [Qemu-block] [PATCH] file-posix: clean up max_segments buffer termination

2017-03-14 Thread Fam Zheng
On Tue, 03/14 17:09, Stefan Hajnoczi wrote:
> The following pattern is unsafe:
> 
>   char buf[32];
>   ret = read(fd, buf, sizeof(buf));
>   ...
>   buf[ret] = 0;
> 
> If read(2) returns 32 then a byte beyond the end of the buffer is
> zeroed.
> 
> In practice this buffer overflow does not occur because the sysfs
> max_segments file only contains an unsigned short + '\n'.  The string is
> always shorter than 32 bytes.
> 
> Regardless, avoid this pattern because static analysis tools might
> complain and it could lead to real buffer overflows if copy-pasted
> elsewhere in the codebase.

Yes that's a good point, thanks!

Fam

> 
> Signed-off-by: Stefan Hajnoczi 
> ---
>  block/file-posix.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/block/file-posix.c b/block/file-posix.c
> index c4c0663..ac6bd9f 100644
> --- a/block/file-posix.c
> +++ b/block/file-posix.c
> @@ -686,7 +686,7 @@ static int hdev_get_max_segments(const struct stat *st)
>  goto out;
>  }
>  do {
> -ret = read(fd, buf, sizeof(buf));
> +ret = read(fd, buf, sizeof(buf) - 1);
>  } while (ret == -1 && errno == EINTR);
>  if (ret < 0) {
>  ret = -errno;
> -- 
> 2.9.3
> 



Re: [Qemu-block] [Qemu-devel] [PULL 00/12] Block layer fixes for 2.9.0-rc1

2017-03-14 Thread Kevin Wolf
Am 13.03.2017 um 19:05 hat Peter Maydell geschrieben:
> On 13 March 2017 at 15:54, Kevin Wolf  wrote:
> > The following changes since commit dd4d2578215cd380f40a38028a9904e15b135ef3:
> >
> >   Merge remote-tracking branch 'remotes/kraxel/tags/pull-fixes-20170309-1' 
> > into staging (2017-03-09 13:16:05 +)
> >
> > are available in the git repository at:
> >
> >
> >   git://repo.or.cz/qemu/kevin.git tags/for-upstream
> >
> > for you to fetch changes up to dcbf37ce41a52698550f8f8b2f14b5e6fee22d2d:
> >
> >   commit: Implement .bdrv_refresh_filename (2017-03-13 12:49:33 +0100)
> >
> > 
> > Block layer fixes for 2.9.0-rc1
> 
> We haven't had rc0 yet :-)

I was assuming that we'd just skip -rc0 then. Are we instead shifting
the release by a week or two?

Kevin



Re: [Qemu-block] [PATCH] file-posix: clean up max_segments buffer termination

2017-03-14 Thread Kevin Wolf
Am 14.03.2017 um 10:09 hat Stefan Hajnoczi geschrieben:
> The following pattern is unsafe:
> 
>   char buf[32];
>   ret = read(fd, buf, sizeof(buf));
>   ...
>   buf[ret] = 0;
> 
> If read(2) returns 32 then a byte beyond the end of the buffer is
> zeroed.
> 
> In practice this buffer overflow does not occur because the sysfs
> max_segments file only contains an unsigned short + '\n'.  The string is
> always shorter than 32 bytes.
> 
> Regardless, avoid this pattern because static analysis tools might
> complain and it could lead to real buffer overflows if copy-pasted
> elsewhere in the codebase.
> 
> Signed-off-by: Stefan Hajnoczi 

Oops. I should have found this during review. Thanks for catching it.

Thanks, applied to the block branch.

Kevin



[Qemu-block] [PATCH] file-posix: clean up max_segments buffer termination

2017-03-14 Thread Stefan Hajnoczi
The following pattern is unsafe:

  char buf[32];
  ret = read(fd, buf, sizeof(buf));
  ...
  buf[ret] = 0;

If read(2) returns 32 then a byte beyond the end of the buffer is
zeroed.

In practice this buffer overflow does not occur because the sysfs
max_segments file only contains an unsigned short + '\n'.  The string is
always shorter than 32 bytes.

Regardless, avoid this pattern because static analysis tools might
complain and it could lead to real buffer overflows if copy-pasted
elsewhere in the codebase.

Signed-off-by: Stefan Hajnoczi 
---
 block/file-posix.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index c4c0663..ac6bd9f 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -686,7 +686,7 @@ static int hdev_get_max_segments(const struct stat *st)
 goto out;
 }
 do {
-ret = read(fd, buf, sizeof(buf));
+ret = read(fd, buf, sizeof(buf) - 1);
 } while (ret == -1 && errno == EINTR);
 if (ret < 0) {
 ret = -errno;
-- 
2.9.3




Re: [Qemu-block] [PATCH v2 25/30] trace: Fix parameter types in hw/scsi

2017-03-14 Thread Hannes Reinecke
On 03/13/2017 08:55 PM, Eric Blake wrote:
> An upcoming patch will let the compiler warn us when we are silently
> losing precision in traces; update the trace definitions to pass
> through the full value at the callsite.  Also update some callers
> to avoid variable-sized dma_addr_t (which cannot be easily used in
> trace headers) and needless casts for values used in tracing.
> 
> Signed-off-by: Eric Blake 
> ---
>  hw/scsi/esp-pci.c|   2 +-
>  hw/scsi/megasas.c|   8 ++--
>  hw/scsi/trace-events | 114 
> +--
>  3 files changed, 63 insertions(+), 61 deletions(-)
> 
Reviewed-by: Hannes Reinecke 

Cheers,

Hannes
-- 
Dr. Hannes ReineckeTeamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)



Re: [Qemu-block] [PATCH v2 02/30] trace: Fix incorrect megasas trace parameters

2017-03-14 Thread Hannes Reinecke
On 03/13/2017 08:55 PM, Eric Blake wrote:
> hw/scsi/trace-events lists cmd as the first parameter for both
> megasas_iovec_overflow and megasas_iovec_underflow, but the caller
> was mistakenly passing cmd->iov_size twice instead of the command
> index.  Also, trace_megasas_abort_invalid is called with parameters
> in the wrong order.  Broken since its introduction in commit
> e8f943c3.
> 
> Signed-off-by: Eric Blake 
> ---
>  hw/scsi/megasas.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/scsi/megasas.c b/hw/scsi/megasas.c
> index e3d59b7..84b8caf 100644
> --- a/hw/scsi/megasas.c
> +++ b/hw/scsi/megasas.c
> @@ -291,7 +291,7 @@ static int megasas_map_sgl(MegasasState *s, MegasasCmd 
> *cmd, union mfi_sgl *sgl)
>  if (cmd->iov_size > iov_size) {
>  trace_megasas_iovec_overflow(cmd->index, iov_size, cmd->iov_size);
>  } else if (cmd->iov_size < iov_size) {
> -trace_megasas_iovec_underflow(cmd->iov_size, iov_size, 
> cmd->iov_size);
> +trace_megasas_iovec_underflow(cmd->index, iov_size, cmd->iov_size);
>  }
>  cmd->iov_offset = 0;
>  return 0;
> @@ -1924,8 +1924,8 @@ static int megasas_handle_abort(MegasasState *s, 
> MegasasCmd *cmd)
>  abort_ctx &= (uint64_t)0x;
>  }
>  if (abort_cmd->context != abort_ctx) {
> -trace_megasas_abort_invalid_context(cmd->index, abort_cmd->index,
> -abort_cmd->context);
> +trace_megasas_abort_invalid_context(cmd->index, abort_cmd->context,
> +abort_cmd->index);
>  s->event_count++;
>  return MFI_STAT_ABORT_NOT_POSSIBLE;
>  }
> 
Reviewed-by: Hannes Reinecke 

Cheers,

Hannes
-- 
Dr. Hannes ReineckeTeamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)