Reply: [PATCH v1 2/2] system/cpus: Fix resume_all_vcpus() under vCPU hotplug condition

2024-03-18 Thread zhukeqian via
Hi David,

On 17.03.24 09:37, Keqian Zhu via wrote:
>> For a vCPU being hotplugged, qemu_init_vcpu() is called. In this 
>> function, we set the vCPU state to stopped, and then wait for the vCPU 
>> thread to be created.
>> 
>> As the vCPU state is stopped, the thread will inform us that it has been 
>> created and then wait on halt_cond. After we have realized the vCPU 
>> object, we will resume the vCPU thread.
>> 
>> However, while we wait for the vCPU thread to be created, the BQL is 
>> unlocked, and another thread is allowed to call resume_all_vcpus(), 
>> which will resume the un-realized vCPU.
>> 
>> This fixes the issue by filtering out un-realized vCPUs during 
>> resume_all_vcpus().
>
>Similar question: is there a reproducer? 
>
>How could we currently hotplug a vCPU, and while it is being created, see 
>pause_all_vcpus()/resume_all_vcpus() getting called? 
>
I described the reason for this in patch 1.

>If I am not getting this wrong, there seems to be some other mechanism missing 
>that makes sure that this cannot happen. Dropping the BQL half-way through 
>creating a VCPU might be the problem.
>
When we add a retry mechanism in pause_all_vcpus(), we can solve this problem 
with the semantics unchanged for users, which means:
with the BQL held, we can make sure all vCPUs are paused after pause_all_vcpus() 
finishes, and all vCPUs are resumed after resume_all_vcpus() finishes.
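
A minimal sketch of the resume-side filter (my shorthand, assuming the
CPU_FOREACH loop shape of system/cpus.c; the actual patch may use a different
un-realized test than the "realized" property):

void resume_all_vcpus(void)
{
    CPUState *cpu;

    qemu_clock_enable(QEMU_CLOCK_VIRTUAL, true);
    CPU_FOREACH(cpu) {
        /* Skip vCPUs whose object has not been realized yet */
        if (!object_property_get_bool(OBJECT(cpu), "realized", NULL)) {
            continue;
        }
        cpu_resume(cpu);
    }
}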

Thanks,
Keqian

>
>
--
Cheers,

David / dhildenb



Reply: [PATCH v1 1/2] system/cpus: Fix pause_all_vcpus() under concurrent environment

2024-03-18 Thread zhukeqian via
Hi David,

Thanks for reviewing.

On 17.03.24 09:37, Keqian Zhu via wrote:
>> Both the main loop thread and vCPU threads are allowed to call 
>> pause_all_vcpus(), and in general resume_all_vcpus() is called after 
>> it. Two issues live in pause_all_vcpus():
>
>In general, calling pause_all_vcpus() from VCPU threads is quite dangerous.
>
>Do we have reproducers for the cases below? 
>

I reproduced the issues by testing the ARM vCPU hotplug feature:
QEMU changes for vCPU hotplug can be cloned from:
 https://github.com/salil-mehta/qemu.git virt-cpuhp-armv8/rfc-v2
Guest kernel changes (by James Morse, ARM) are available here:
 https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git
 virtual_cpu_hotplug/rfc/v2

The procedure to reproduce the problems:
1. Start a Linux VM (e.g., named OS-vcpuhotplug) with 32 possible vCPUs and
   16 current vCPUs.
2. Log in to the guest OS and run script [1] to continuously online/offline CPUs.
3. On the host side, run script [2] to continuously hotplug/unplug vCPUs.
After several minutes, we can hit these problems.

Script [1] to online/offline CPUs:
for ((time=1;time<1000;time++));
do
    for ((cpu=16;cpu<32;cpu++));
    do
        echo 1 > /sys/devices/system/cpu/cpu$cpu/online
    done

    for ((cpu=16;cpu<32;cpu++));
    do
        echo 0 > /sys/devices/system/cpu/cpu$cpu/online
    done
done

Script [2] to hotplug/unplug vCPUs:
for ((time=1;time<100;time++));
do
    echo $time
    for ((cpu=16;cpu<=32;cpu++));
    do
        echo "virsh setvcpus OS-vcpuhotplug --count $cpu --live"
        virsh setvcpus OS-vcpuhotplug --count $cpu --live
        sleep 2
    done

    for ((cpu=32;cpu>=16;cpu--));
    do
        echo "virsh setvcpus OS-vcpuhotplug --count $cpu --live"
        virsh setvcpus OS-vcpuhotplug --count $cpu --live
        sleep 2
    done

    for ((cpu=16;cpu<=32;cpu+=2));
    do
        echo "virsh setvcpus OS-vcpuhotplug --count $cpu --live"
        virsh setvcpus OS-vcpuhotplug --count $cpu --live
        sleep 2
    done

    for ((cpu=32;cpu>=16;cpu-=2));
    do
        echo "virsh setvcpus OS-vcpuhotplug --count $cpu --live"
        virsh setvcpus OS-vcpuhotplug --count $cpu --live
        sleep 2
    done
done

Script [1] triggers PSCI CPU_ON, which is emulated by QEMU and results in 
calling cpu_reset() on a vCPU thread.
For the ARM architecture, this needs to reset the GICC registers, which is only 
possible when all vCPUs are paused. So script [1]
will call pause_all_vcpus() in a vCPU thread.
Script [2] also calls cpu_reset() for the newly hotplugged vCPU, which is done 
in the main loop thread.
So this scenario causes the problems I stated in the commit message.

>> 
>> 1. There is a possibility that while thread T1 waits on qemu_pause_cond 
>> with the BQL unlocked, another thread has called
>> pause_all_vcpus() and resume_all_vcpus(); then thread T1 will get stuck, 
>> because the condition all_vcpus_paused() is always false.
>
>How can this happen?
>
>Two threads calling pause_all_vcpus() is borderline broken, as you note. 
>
>IIRC, we should call pause_all_vcpus() only if some other mechanism prevents 
>these races. For example, based on runstate changes.
>

We already have the BQL to prevent concurrent calls to pause_all_vcpus() and 
resume_all_vcpus(). But pause_all_vcpus()
unlocks the BQL halfway through, which gives other threads a chance to call 
pause and resume. In the past, the code did not consider
this problem; now I add a retry mechanism to fix it.

>
>Just imagine one thread calling pause_all_vcpus() while another one 
>calls resume_all_vcpus(). It cannot possibly work.

With the BQL, we can make sure all vCPUs are paused after pause_all_vcpus()
finishes, and all vCPUs are resumed after resume_all_vcpus() finishes.

For example, the following interleaving may occur:
Thread T1: lock bql -> pause_all_vcpus -> wait on cond and unlock bql
           -> wait for T2 to unlock bql
           -> lock bql && all_vcpus_paused -> success, do other work -> unlock bql
Thread T2: wait for T1 to unlock bql
           -> lock bql -> resume_all_vcpus -> success, do other work -> unlock bql

Thanks,
Keqian

>
>
>> 
>> 2. After all_vcpus_paused() has been checked as true, we will
>> unlock the BQL to relock replay_mutex. While the BQL was unlocked,
>> the vCPUs' state may have been changed by another thread, so we
>> must retry.
>> 
>> Signed-off-by: Keqian Zhu 
>> ---
>>   system/cpus.c | 29 -
>>   1 file changed, 24 insertions(+), 5 deletions(-)
>> 
> diff --git a/system/cpus.c b/system/cpus.c
> index 68d161d96b..4e41abe23e 100644
> --- a/system/cpus.c
> +++ b/system/cpus.c
> @@ -571,12 +571,14 @@ static bool all_vcpus_paused(void)
>   return true;
>   }
>   
> -void 

[RFC] Is there a bug in pause_all_vcpus()

2024-03-15 Thread zhukeqian via


While we waited on qemu_pause_cond the BQL was unlocked, and
the vCPUs' state may have been changed by another thread, so
we must request the pause state on all vCPUs again.

For example:

Both main loop thread and vCPU thread are allowed to call
pause_all_vcpus(), and in general resume_all_vcpus() is called
after it.

Thus there is a possibility that while thread T1 waits on
qemu_pause_cond, another thread has called pause_all_vcpus() and
resume_all_vcpus(); then thread T1 will get stuck, because the condition
all_vcpus_paused() is always false.

PS:
I hit this bug when testing the RFC patch of the ARM vCPU hotplug feature.
This patch is not verified!!! It is just for RFC.

Signed-off-by: Keqian Zhu 
---
 system/cpus.c | 22 +-
 1 file changed, 17 insertions(+), 5 deletions(-)

diff --git a/system/cpus.c b/system/cpus.c
index 68d161d96b..24ac8085cd 100644
--- a/system/cpus.c
+++ b/system/cpus.c
@@ -571,12 +571,14 @@ static bool all_vcpus_paused(void)
 return true;
 }

-void pause_all_vcpus(void)
+static void request_pause_all_vcpus(void)
 {
 CPUState *cpu;

-qemu_clock_enable(QEMU_CLOCK_VIRTUAL, false);
 CPU_FOREACH(cpu) {
+if (cpu->stopped) {
+continue;
+}
 if (qemu_cpu_is_self(cpu)) {
 qemu_cpu_stop(cpu, true);
 } else {
@@ -584,6 +586,14 @@ void pause_all_vcpus(void)
 qemu_cpu_kick(cpu);
 }
 }
+}
+
+void pause_all_vcpus(void)
+{
+CPUState *cpu;
+
+qemu_clock_enable(QEMU_CLOCK_VIRTUAL, false);
+request_pause_all_vcpus();

 /* We need to drop the replay_lock so any vCPU threads woken up
  * can finish their replay tasks
@@ -592,9 +602,11 @@ void pause_all_vcpus(void)

 while (!all_vcpus_paused()) {
qemu_cond_wait(&qemu_pause_cond, &bql);
-CPU_FOREACH(cpu) {
-qemu_cpu_kick(cpu);
-}
+/* While we waited on qemu_pause_cond the bql was unlocked,
+ * and the vcpu's state may have been changed by another thread,
+ * so we must request the pause state on all vcpus again.
+ */
+request_pause_all_vcpus();
 }

 bql_unlock();
--
2.36.1





Reply: [PATCH V8 6/8] physmem: Add helper function to destroy CPU AddressSpace

2024-03-14 Thread zhukeqian via
Hi Salil,

[...]

+void cpu_address_space_destroy(CPUState *cpu, int asidx)
+{
+CPUAddressSpace *cpuas;
+
+assert(cpu->cpu_ases);
+assert(asidx >= 0 && asidx < cpu->num_ases);
+/* KVM cannot currently support multiple address spaces. */
+assert(asidx == 0 || !kvm_enabled());
+
+cpuas = &cpu->cpu_ases[asidx];
+if (tcg_enabled()) {
+memory_listener_unregister(&cpuas->tcg_as_listener);
+}
+
+address_space_destroy(cpuas->as);
+g_free_rcu(cpuas->as, rcu);

In address_space_destroy(), call_rcu1() is called on cpuas->as, which sets 
do_address_space_destroy() as the rcu func.
And g_free_rcu() also calls call_rcu1() on cpuas->as, which overwrites the 
rcu func with g_free().

So I think g_free() may be called twice in the rcu thread; please verify that.

The source code of call_rcu1:

void call_rcu1(struct rcu_head *node, void (*func)(struct rcu_head *node))
{
node->func = func;
enqueue(node);
qatomic_inc(&rcu_call_count);
qemu_event_set(&rcu_call_ready_event);
}
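
So the sequence, as I read it (illustrative comments only, matching the code
quoted above):

address_space_destroy(cpuas->as);  /* call_rcu1(&as->rcu, do_address_space_destroy) */
g_free_rcu(cpuas->as, rcu);        /* call_rcu1(&as->rcu, g_free): the same rcu_head
                                    * is enqueued twice and its func is overwritten,
                                    * so the rcu thread may call g_free() twice */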

Thanks,
Keqian

+
+if (asidx == 0) {
+/* reset the convenience alias for address space 0 */
+cpu->as = NULL;
+}
+
+if (--cpu->cpu_ases_count == 0) {
+g_free(cpu->cpu_ases);
+cpu->cpu_ases = NULL;
+}
+}
+
AddressSpace *cpu_get_address_space(CPUState *cpu, int asidx)
{
 /* Return the AddressSpace corresponding to the specified index */
--
2.34.1



RE: [QUESTION] About virtio and eventloop

2023-01-18 Thread zhukeqian via
Hi Stefan, this indeed helps, thank you.

Keqian

On Mon, 16 Jan 2023 at 03:20, zhukeqian via <qemu-devel@nongnu.org> wrote:
> And if an IO operation is blocked, will the vCPU thread be blocked during
> deactivation?

Yes, blk_drain() is a synchronous function. It blocks until in-flight
I/O has finished. The vcpu thread will be blocked in
virtio_pci_common_write().

Stefan



Reply: [QUESTION] About virtio and eventloop

2023-01-16 Thread zhukeqian via
I found that blk_drain() is invoked by virtio_blk_reset(), so only the second 
question remains :).
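
A simplified sketch of the reset path I am referring to (not the exact
upstream code; the drain may happen under additional locking):

static void virtio_blk_reset(VirtIODevice *vdev)
{
    VirtIOBlock *s = VIRTIO_BLK(vdev);

    /* Returns only after all in-flight requests have completed */
    blk_drain(s->blk);
}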

From: zhukeqian <>
Sent: 16 January 2023 16:18
To: 'Michael S. Tsirkin' ; 'Stefan Hajnoczi' 
; 'Peter Maydell' 
Cc: qemu-devel@nongnu.org; Wubin (H) ; Chentao (Boby) 
; Wanghaibin (D) ; Zhangbo 
(Oscar) ; limingwang (A) ; 
Wangyan ; lihuachao 
Subject: [QUESTION] About virtio and eventloop

Hi all maintainers and community friends,

Recently I have been reviewing and learning the virtio and eventloop
implementation of the latest QEMU, and now I have a question:

In general, the IO requests of virtio are popped in the iothread/mainloop and
may be submitted to an "async IO engine" (io_uring/linux aio/threadpool). Once
the IO operation is done, the "async IO engine" will send a notification to the
iothread/mainloop through eventfd or a bottom half, and the completion action
for the IO request (add to used ring and notify guest) is done in the
iothread/mainloop.

Now let's look at the "deactivate" procedure of virtio-pci devices (when the
guest writes 0 to the device status or the system triggers a reset). The basic
requirement is that the device should stop handling IO requests and accessing
the virtqueue before returning to the guest, as the guest may destroy the
virtqueue once deactivation is done.

QEMU invokes the stop_ioeventfd() callback to perform the above actions. It
unregisters the ioeventfd from the eventloop and KVM,

  1.  but I can't find the code that ensures IO operations in the "async IO
engine" are done.
  2.  And if an IO operation is blocked, will the vCPU thread be blocked
during deactivation?

It would be great if anyone can help!

Thanks,
Keqian


[QUESTION] About virtio and eventloop

2023-01-16 Thread zhukeqian via
Hi all maintainers and community friends,

Recently I have been reviewing and learning the virtio and eventloop
implementation of the latest QEMU, and now I have a question:

In general, the IO requests of virtio are popped in the iothread/mainloop and
may be submitted to an "async IO engine" (io_uring/linux aio/threadpool). Once
the IO operation is done, the "async IO engine" will send a notification to the
iothread/mainloop through eventfd or a bottom half, and the completion action
for the IO request (add to used ring and notify guest) is done in the
iothread/mainloop.

Now let's look at the "deactivate" procedure of virtio-pci devices (when the
guest writes 0 to the device status or the system triggers a reset). The basic
requirement is that the device should stop handling IO requests and accessing
the virtqueue before returning to the guest, as the guest may destroy the
virtqueue once deactivation is done.

QEMU invokes the stop_ioeventfd() callback to perform the above actions. It
unregisters the ioeventfd from the eventloop and KVM,

  1.  but I can't find the code that ensures IO operations in the "async IO
engine" are done.
  2.  And if an IO operation is blocked, will the vCPU thread be blocked
during deactivation?

It would be great if anyone can help!

Thanks,
Keqian


Re: [PATCH v2] hw/acpi: Add ospm_status hook implementation for acpi-ged

2022-09-23 Thread zhukeqian via
>> > I notice this doesn't seem to have gone in yet -- whose tree is it 
>> > going to go via?
>>
>> I'd guess ARM tree (due to almost sole user virt-arm).
>> (there are toy users like microvm and new loongarch)
>
>OK; applied to target-arm.next, thanks.

Thanks, Peter.

Keqian.


Reply: [PATCH] acpi_ged: Add ospm_status hook implementation

2022-08-16 Thread zhukeqian via
OK, I'll send v2 soon.

-----Original Message-----
From: Peter Maydell [mailto:peter.mayd...@linaro.org] 
Sent: 16 August 2022 17:42
To: zhukeqian 
Cc: qemu-devel@nongnu.org; qemu-...@nongnu.org; qemu-triv...@nongnu.org; 
Philippe Mathieu-Daudé ; Eric Auger ; 
Peter Xu ; Igor Mammedov ; Wanghaibin 
(D) 
Subject: Re: [PATCH] acpi_ged: Add ospm_status hook implementation

On Tue, 16 Aug 2022 at 10:40, zhukeqian  wrote:
>
> Hi Peter,
>
> Setting up an ARM virtual machine with the virt machine type and executing the 
> QMP command "query-acpi-ospm-status" can trigger this bug.

Thanks. That is worth stating in the commit message, I think.

-- PMM


Reply: [PATCH] acpi_ged: Add ospm_status hook implementation

2022-08-16 Thread zhukeqian via
Hi Peter,

Setting up an ARM virtual machine with the virt machine type and executing the 
QMP command "query-acpi-ospm-status" can trigger this bug.

Thanks.

-----Original Message-----
From: Qemu-devel [mailto:qemu-devel-bounces+zhukeqian1=huawei@nongnu.org] on behalf of 
Peter Maydell
Sent: 16 August 2022 17:30
To: zhukeqian 
Cc: qemu-devel@nongnu.org; qemu-...@nongnu.org; qemu-triv...@nongnu.org; 
Philippe Mathieu-Daudé ; Eric Auger ; 
Peter Xu ; Igor Mammedov ; Wanghaibin 
(D) 
Subject: Re: [PATCH] acpi_ged: Add ospm_status hook implementation

On Tue, 16 Aug 2022 at 10:26, Keqian Zhu  wrote:
>
> This fixes a bug that causes a segmentation fault with the following stack dump:
>  #1  0xab64235c in qmp_query_acpi_ospm_status 
> (errp=errp@entry=0xf030) at ../monitor/qmp-cmds.c:312
>  #2  0xabfc4e20 in qmp_marshal_query_acpi_ospm_status 
> (args=, ret=0xea4ffe90, errp=0xea4ffe88) at 
> qapi/qapi-commands-acpi.c:63
>  #3  0xabff8ba0 in do_qmp_dispatch_bh (opaque=0xea4ffe98) 
> at ../qapi/qmp-dispatch.c:128
>  #4  0xac02e594 in aio_bh_call (bh=0xe0004d80) at 
> ../util/async.c:150
>  #5  aio_bh_poll (ctx=ctx@entry=0xad0f6040) at ../util/async.c:178
>  #6  0xac00bd40 in aio_dispatch (ctx=ctx@entry=0xad0f6040) 
> at ../util/aio-posix.c:421
>  #7  0xac02e010 in aio_ctx_dispatch (source=0xad0f6040, 
> callback=, user_data=) at 
> ../util/async.c:320
>  #8  0xf76f6884 in g_main_context_dispatch () at 
> /usr/lib64/libglib-2.0.so.0
>  #9  0xac0452d4 in glib_pollfds_poll () at 
> ../util/main-loop.c:297
>  #10 os_host_main_loop_wait (timeout=0) at ../util/main-loop.c:320
>  #11 main_loop_wait (nonblocking=nonblocking@entry=0) at 
> ../util/main-loop.c:596
>  #12 0xab5c9e50 in qemu_main_loop () at 
> ../softmmu/runstate.c:734
>  #13 0xab185370 in qemu_main (argc=argc@entry=47, 
> argv=argv@entry=0xf518, envp=envp@entry=0x0) at 
> ../softmmu/main.c:38
>  #14 0xab16f99c in main (argc=47, argv=0xf518) at 
> ../softmmu/main.c:47

What are the conditions required to trigger the segfault?


> Fixes: ebb62075021a ("hw/acpi: Add ACPI Generic Event Device Support")
> Signed-off-by: Keqian Zhu 
> ---
>  hw/acpi/generic_event_device.c | 8 
>  1 file changed, 8 insertions(+)
>
> diff --git a/hw/acpi/generic_event_device.c 
> b/hw/acpi/generic_event_device.c index e28457a7d1..a3d31631fe 100644
> --- a/hw/acpi/generic_event_device.c
> +++ b/hw/acpi/generic_event_device.c
> @@ -267,6 +267,13 @@ static void acpi_ged_unplug_cb(HotplugHandler 
> *hotplug_dev,
>  }
>  }
>
> +static void acpi_ged_ospm_status(AcpiDeviceIf *adev, ACPIOSTInfoList ***list)
> +{
> +AcpiGedState *s = ACPI_GED(adev);
> +
> +acpi_memory_ospm_status(&s->memhp_state, list);
> +}
> +
>  static void acpi_ged_send_event(AcpiDeviceIf *adev, AcpiEventStatusBits ev)
>  {
>  AcpiGedState *s = ACPI_GED(adev);
> @@ -409,6 +416,7 @@ static void acpi_ged_class_init(ObjectClass *class, void *data)
>  hc->unplug_request = acpi_ged_unplug_request_cb;
>  hc->unplug = acpi_ged_unplug_cb;
>
> +adevc->ospm_status = acpi_ged_ospm_status;
>  adevc->send_event = acpi_ged_send_event;
>  }
>
> --

thanks
-- PMM



RE: The windows guest can't bootup

2021-03-18 Thread zhukeqian
Hi,

Yep, it is a known issue. Paolo will revert it.

Thanks.

Hello,

I synced the QEMU code today and found that QEMU can't boot the Windows guest.
This issue was caused by commit 39205528; after reverting that patch, the
Windows guest can boot.

qemu-system-x86_64: ../accel/kvm/kvm-all.c:690: kvm_log_clear_one_slot: 
Assertion `(((start | size) % (psize)) == 0)' failed.

I also enabled the VGA device in a Linux guest, and the same issue was found.

Regards,

Yang



RE: qemu crashes on changing display resolution within guest

2021-03-14 Thread zhukeqian

Thanks, drew. I'll be more careful in the future.

Keqian.

On Fri, Mar 12, 2021 at 11:39:49PM +0100, Igor Mammedov wrote:
> happens on current master,
>
> to reproduce start
> ./x86_64-softmmu/qemu-system-x86_64 -enable-kvm -m 1g -M pc -vnc localhost:0 \
>  -snapshot -cdrom Fedora-Workstation-Live-x86_64-33-1.2.iso
>
> connect to guest using 'Remote Desktop', wait till it boots to graphical 
> desktop
> then try to change resolution to 800x600
>
> QEMU will crash in a second or 2 with:
> qemu-system-x86_64: ../qemu/accel/kvm/kvm-all.c:690: kvm_log_clear_one_slot: 
> Assertion `QEMU_IS_ALIGNED(start | size, psize)' failed.
>
>
> offending commit:
>
> commit 3920552846e881bafa9f9aad0bb1a6eef874d7fb (HEAD, refs/bisect/bad)
> Author: Keqian Zhu < zhukeqi...@huawei.com>
> Date:   Thu Dec 17 09:49:41 2020 +0800
>
> accel: kvm: Add aligment assert for kvm_log_clear_one_slot
>
> PS:
> same happens when using spice client
>
>

Yup, this is an already reported, disappointing regression. As Paolo says,
a revert is on the way. It's extra disappointing because it actually isn't
that hard to reproduce. A kvm-unit-tests migration test reproduces it
(see below). I guess we should improve our QEMU CI to also run
kvm-unit-tests for accel=kvm related changes on all architectures that
support KVM (or at least x86_64 and aarch64).

Thanks,
drew

$ tests/its-migration
BUILD_HEAD=5f8efadf
run_migration timeout -k 1s --foreground 90s ../build/q/qemu-system-aarch64 
-nodefaults -machine virt,gic-version=host,accel=kvm -cpu host -device 
virtio-serial-device -device virtconsole,chardev=ctd -chardev testdev,id=ctd 
-device pci-testdev -display none -serial stdio -kernel /tmp/tmp.kbJOUcS86v 
-smp 48 -machine gic-version=3 -append its-migration # -initrd 
/tmp/tmp.vN8JxnjX7h
qemu-system-aarch64: -chardev 
socket,id=mon1,path=/tmp/mig-helper-qmp1.2e6Up9BrWK,server,nowait: warning: 
short-form boolean option 'server' deprecated
Please use server=on instead
qemu-system-aarch64: -chardev 
socket,id=mon1,path=/tmp/mig-helper-qmp1.2e6Up9BrWK,server,nowait: warning: 
short-form boolean option 'nowait' deprecated
Please use wait=off instead
qemu-system-aarch64: -chardev 
socket,id=mon2,path=/tmp/mig-helper-qmp2.HcqdylHwvn,server,nowait: warning: 
short-form boolean option 'server' deprecated
Please use server=on instead
qemu-system-aarch64: -chardev 
socket,id=mon2,path=/tmp/mig-helper-qmp2.HcqdylHwvn,server,nowait: warning: 
short-form boolean option 'nowait' deprecated
Please use wait=off instead
ITS: MAPD devid=2 size = 0x8 itt=0x40bc valid=1
ITS: MAPD devid=7 size = 0x8 itt=0x40bd valid=1
MAPC col_id=3 target_addr = 0x3 valid=1
MAPC col_id=2 target_addr = 0x2 valid=1
INVALL col_id=2
INVALL col_id=3
MAPTI dev_id=2 event_id=20 -> phys_id=8195, col_id=3
MAPTI dev_id=7 event_id=255 -> phys_id=8196, col_id=2
Now migrate the VM, then press a key to continue...
qemu-system-aarch64: ../../qemu/accel/kvm/kvm-all.c:690: 
kvm_log_clear_one_slot: Assertion `QEMU_IS_ALIGNED(start | size, psize)' failed.
qemu-system-aarch64: Not a migration stream
qemu-system-aarch64: load of migration failed: Invalid argument
Ncat: Connection reset by peer.
timeout: the monitored command dumped core
/tmp/tmp.M1473gQVZ0: line 126: 1545037 Aborted timeout -k 1s 
--foreground 90s ../build/q/qemu-system-aarch64 -nodefaults -machine 
virt,gic-version=host,accel=kvm -cpu host -device virtio-serial-device -device 
virtconsole,chardev=ctd -chardev testdev,id=ctd -device pci-testdev -display 
none -serial stdio -kernel /tmp/tmp.kbJOUcS86v -smp 48 -machine gic-version=3 
-append its-migration -initrd /tmp/tmp.vN8JxnjX7h -chardev 
socket,id=mon1,path=/tmp/mig-helper-qmp1.2e6Up9BrWK,server,nowait -mon 
chardev=mon1,mode=control
Ncat: Connection refused.
Ncat: Connection refused.
^Ctests/its-migration: line 1: 1545202 Terminated  summary=$(eval 
$cmdline 2> >(RUNTIME_log_stderr $testname)  > 
>(tee >(RUNTIME_log_stdout $testname $kernel) | extract_summary))




RE: [PATCH v2 2/2] accel: kvm: Add aligment assert for kvm_log_clear_one_slot

2021-03-09 Thread zhukeqian

Thanks for your bug report. I was just off work and will dig into it tomorrow.
Thanks :)

Keqian

On 09/03/2021 15.05, Keqian Zhu wrote:
>
>
> On 2021/3/9 21:48, Thomas Huth wrote:
>> On 17/12/2020 02.49, Keqian Zhu wrote:
>>> The parameters start and size are transfered from QEMU memory
>>> emulation layer. It can promise that they are TARGET_PAGE_SIZE
>>> aligned. However, KVM needs they are qemu_real_page_size aligned.
>>>
>>> Though no caller breaks this aligned requirement currently, we'd
>>> better add an explicit assert to avoid future breaking.
>>>
>>> Signed-off-by: Keqian Zhu < 
>>> zhukeqi...@huawei.com>
>>> ---
>>>accel/kvm/kvm-all.c | 7 +++
>>>1 file changed, 7 insertions(+)
>>>
>>> ---
>>> v2
>>>- Address Andrew's commment (Use assert instead of return err).
>>>
>>> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
>>> index f6b16a8df8..73b195cc41 100644
>>> --- a/accel/kvm/kvm-all.c
>>> +++ b/accel/kvm/kvm-all.c
>>> @@ -692,6 +692,10 @@ out:
>>>#define KVM_CLEAR_LOG_ALIGN (qemu_real_host_page_size << 
>>> KVM_CLEAR_LOG_SHIFT)
>>>#define KVM_CLEAR_LOG_MASK   (-KVM_CLEAR_LOG_ALIGN)
>>>+/*
>>> + * As the granule of kvm dirty log is qemu_real_host_page_size,
>>> + * @start and @size are expected and restricted to align to it.
>>> + */
>>>static int kvm_log_clear_one_slot(KVMSlot *mem, int as_id, uint64_t 
>>> start,
>>>  uint64_t size)
>>>{
>>> @@ -701,6 +705,9 @@ static int kvm_log_clear_one_slot(KVMSlot *mem, int 
>>> as_id, uint64_t start,
>>>unsigned long *bmap_clear = NULL, psize = qemu_real_host_page_size;
>>>int ret;
>>>+/* Make sure start and size are qemu_real_host_page_size aligned */
>>> +assert(QEMU_IS_ALIGNED(start | size, psize));
>>
>> Sorry, but that was a bad idea: It triggers and kills my Centos 6 VM:
>>
>> $ qemu-system-x86_64 -accel kvm -hda ~/virt/images/centos6.qcow2 -m 1G
>> qemu-system-x86_64: ../../devel/qemu/accel/kvm/kvm-all.c:690: 
>> kvm_log_clear_one_slot: Assertion `QEMU_IS_ALIGNED(start | size, psize)' 
>> failed.
>> Aborted (core dumped)
> Hi Thomas,
>
> I think this patch is OK; maybe it triggers a potential bug?

Well, sure, there is either a bug somewhere else or in this new code. But it's 
certainly not normal that the assert() triggers, is it?

FWIW, here's a backtrace:

#0 0x72c1584f in raise () at /lib64/libc.so.6
#1 0x72bffc45 in abort () at /lib64/libc.so.6
#2 0x72bffb19 in _nl_load_domain.cold.0 () at /lib64/libc.so.6
#3 0x72c0de36 in .annobin_assert.c_end () at /lib64/libc.so.6
#4 0x55ba25f3 in kvm_log_clear_one_slot
 (size=6910080, start=0, as_id=0, mem=0x56e1ee00)
 at ../../devel/qemu/accel/kvm/kvm-all.c:691
#5 0x55ba25f3 in kvm_physical_log_clear
 (section=0x7fffd0b0, section=0x7fffd0b0, kml=0x56dbaac0)
 at ../../devel/qemu/accel/kvm/kvm-all.c:843
#6 0x55ba25f3 in kvm_log_clear (listener=0x56dbaac0, 
section=0x7fffd0b0)
 at ../../devel/qemu/accel/kvm/kvm-all.c:1253
#7 0x55b023d8 in memory_region_clear_dirty_bitmap
 (mr=mr@entry=0x573394c0, start=start@entry=0, len=len@entry=6910080)
 at ../../devel/qemu/softmmu/memory.c:2132
#8 0x55b313d9 in cpu_physical_memory_snapshot_and_clear_dirty
 (mr=mr@entry=0x573394c0, offset=offset@entry=0, 
length=length@entry=6910080, client=client@entry=0) at 
../../devel/qemu/softmmu/physmem.c:1109
#9 0x55b02483 in memory_region_snapshot_and_clear_dirty
 (mr=mr@entry=0x573394c0, addr=addr@entry=0, size=size@entry=6910080, 
client=client@entry=0)
 at ../../devel/qemu/softmmu/memory.c:2146
#10 0x55babe99 in vga_draw_graphic (full_update=0, s=0x573394b0)
 at ../../devel/qemu/hw/display/vga.c:1661
#11 0x55babe99 in vga_update_display (opaque=0x573394b0)
 at ../../devel/qemu/hw/display/vga.c:1784
#12 0x55babe99 in vga_update_display (opaque=0x573394b0)
 at ../../devel/qemu/hw/display/vga.c:1757
#13 0x558ddd32 in graphic_hw_update (con=0x56a11800)
 at ../../devel/qemu/ui/console.c:279
#14 0x558dccd2 in dpy_refresh (s=0x56c17da0) at 
../../devel/qemu/ui/console.c:1742
#15 0x558dccd2 in gui_update (opaque=opaque@entry=0x56c17da0)
 at ../../devel/qemu/ui/console.c:209
#16 0x55dbd520 in timerlist_run_timers (timer_list=0x56937c50)
 at ../../devel/qemu/util/qemu-timer.c:574
#17 0x55dbd520 in timerlist_run_timers (timer_list=0x56937c50)
 at ../../devel/qemu/util/qemu-timer.c:499
#18 0x55dbd74a in qemu_clock_run_timers (type=)
 at ../../devel/qemu/util/qemu-timer.c:670
#19 0x55dbd74a in qemu_clock_run_all_timers () at 
../../devel/qemu/util/qemu-timer.c:670

Looks like something in the vga code calls this with size=6910080
and thus triggers the alignment assertion?

  Thomas




Re: [PULL v3 11/32] vfio: Get migration capability flags for container

2020-12-15 Thread zhukeqian
Hi Kirti,

On 2020/11/2 5:00, Alex Williamson wrote:
> From: Kirti Wankhede 
> 
> Added helper functions to get IOMMU info capability chain.
> Added function to get migration capability information from that
> capability chain for IOMMU container.
> 
> Similar change was proposed earlier:
> https://lists.gnu.org/archive/html/qemu-devel/2018-05/msg03759.html
> 
> Disable migration for devices if IOMMU module doesn't support migration
> capability.
> 
> Signed-off-by: Kirti Wankhede 
> Cc: Shameer Kolothum 
> Cc: Eric Auger 
> Signed-off-by: Alex Williamson 
> ---
>  hw/vfio/common.c  |   90 
> +
>  hw/vfio/migration.c   |7 +++
>  include/hw/vfio/vfio-common.h |3 +
>  3 files changed, 91 insertions(+), 9 deletions(-)
> 
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index c6e98b8d61be..d4959c036dd1 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -1228,6 +1228,75 @@ static int vfio_init_container(VFIOContainer 
> *container, int group_fd,
>  return 0;
>  }
>  
[...]

> +
> +/*
> + * cpu_physical_memory_set_dirty_lebitmap() expects pages in bitmap of
> + * TARGET_PAGE_SIZE to mark those dirty.
> + */
> +if (cap_mig->pgsize_bitmap & TARGET_PAGE_SIZE) {
> +container->dirty_pages_supported = true;
> +container->max_dirty_bitmap_size = cap_mig->max_dirty_bitmap_size;
> +container->dirty_pgsizes = cap_mig->pgsize_bitmap;
> +}
Maybe it's better to use qemu_real_host_page_size here, as some IOMMU drivers 
(such as SMMU) prefer
the host page size.
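
I.e., something like this (a sketch of the suggestion, not a tested patch):

if (cap_mig->pgsize_bitmap & qemu_real_host_page_size) {
    container->dirty_pages_supported = true;
    container->max_dirty_bitmap_size = cap_mig->max_dirty_bitmap_size;
    container->dirty_pgsizes = cap_mig->pgsize_bitmap;
}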

Thanks,
Keqian



Re: [PULL v3 15/32] vfio: Add ioctl to get dirty pages bitmap during dma unmap

2020-12-15 Thread zhukeqian
Hi Kirti,

On 2020/11/2 5:01, Alex Williamson wrote:
> From: Kirti Wankhede 
> 
> With vIOMMU, IO virtual address range can get unmapped while in pre-copy
> phase of migration. In that case, unmap ioctl should return pages pinned
> in that range and QEMU should find its correcponding guest physical
> addresses and report those dirty.
> 
> Suggested-by: Alex Williamson 
> Signed-off-by: Kirti Wankhede 
> Reviewed-by: Neo Jia 
> [aw: fix error_report types, fix cpu_physical_memory_set_dirty_lebitmap() 
> cast]
> Signed-off-by: Alex Williamson 
> ---
>  hw/vfio/common.c |   97 
> --
>  1 file changed, 93 insertions(+), 4 deletions(-)
> 
[...]

> +
> +unmap->argsz = sizeof(*unmap) + sizeof(*bitmap);
> +unmap->iova = iova;
> +unmap->size = size;
> +unmap->flags |= VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP;
> +bitmap = (struct vfio_bitmap *)&unmap->data;
> +
> +/*
> + * cpu_physical_memory_set_dirty_lebitmap() expects pages in bitmap of
> + * TARGET_PAGE_SIZE to mark those dirty. Hence set bitmap_pgsize to
> + * TARGET_PAGE_SIZE.
> + */

This may not be OK. cpu_physical_memory_set_dirty_lebitmap() expects the granule
of the bitmap to be qemu_real_host_page_size. It uses hpratio to convert this
bitmap to the QEMU dirty bitmap.
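
For reference, the conversion I mean (abridged from include/exec/ram_addr.h
from memory, so treat it as a sketch):

/* One bit per *host* page; each set bit is fanned out to hpratio target pages */
unsigned long hpratio = qemu_real_host_page_size / TARGET_PAGE_SIZE;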

Thanks,
Keqian



Re: [PATCH] kvm: Take into account the unaligned section size when preparing bitmap

2020-12-14 Thread zhukeqian


On 2020/12/14 23:36, Peter Xu wrote:
> On Mon, Dec 14, 2020 at 10:14:11AM +0800, zhukeqian wrote:
> 
> [...]
> 
>>>>> Though indeed I must confess I don't know how it worked in general when 
>>>>> host
>>>>> page size != target page size, at least for migration.  For example, I 
>>>>> believe
>>>>> kvm dirty logging is host page size based, though migration should be 
>>>>> migrating
>>>>> pages in guest page size granule when it spots a dirty bit set.
> 
> [1]
> 
>> Hi Peter,
> 
> Keqian,
> 
>>> OTOH I'm more worried on the other question on how we handle guest psize !=
>>> host psize case for migration now...
>> I think it does not matter when guest_psize != host_psize, as we only need 
>> to interact with
>> stage 2 page tables during migration. Stage 2 is enough for tracking guest 
>> dirty memory, and even
>> if the guest disables stage 1, we can still do a successful migration.
> 
> I don't know why 2-stage matters here, since I believe KVM can track dirty
> pages either using two dimentional paging or shadowing, however it's always
> done in host small page size.  The question I'm confused is there seems to 
> have
> a size mismatch between qemu migration and what kvm does [1].  For example, 
> how
> migration works on ARM64 where host has psize==4K while guest has psize=64K.
> 
Hi Peter,

OK, I got it ;-) Do you mean qemu_real_host_page_size != TARGET_PAGE_SIZE?
After my analysis, I see that when qemu_real_host_page_size != TARGET_PAGE_SIZE,
there are indeed some problems. I have sent out some patches; please check 
whether they solve this
problem, thanks!

Keqian.

> Thanks,
> 



Re: [PATCH] kvm: Take into account the unaligned section size when preparing bitmap

2020-12-13 Thread zhukeqian



On 2020/12/11 23:25, Peter Xu wrote:
> On Fri, Dec 11, 2020 at 09:13:10AM +0800, zhukeqian wrote:
>>
>> On 2020/12/10 22:50, Peter Xu wrote:
>>> On Thu, Dec 10, 2020 at 10:53:23AM +0800, zhukeqian wrote:
>>>>
>>>>
>>>> On 2020/12/10 10:08, Peter Xu wrote:
>>>>> Keqian,
>>>>>
>>>>> On Thu, Dec 10, 2020 at 09:46:06AM +0800, zhukeqian wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I see that if start or size is not PAGE aligned, it also clears areas
>>>>>> which beyond caller's expectation, so do we also need to consider this?
>>>>>
>>>>> Could you elaborate?
>>>>>
>>>>> If start_delta != 0, kvm_log_clear_one_slot() should already go the slow 
>>>>> path.
>>>>>
>>>>> Thanks,
>>>>>
>>>>
>>>> Hi Peter,
>>>>
>>>> start_delta /= psize;
>>>>
>>>> If start is not PAGE aligned, then start_delta is not PAGE aligned.
>>>> so I think the above code will implicitly extend our start to be PAGE 
>>>> aligned.
>>>>
>>>> I suggest that we should shrink the start and (start + size) to be PAGE 
>>>> aligned
>>>> at beginning of this function.
>>>
>>> Callers should be with TARGET_PAGE_SIZE aligned on the size, so at least 
>>> x86_64
>>> should be pretty safe since host/guest page sizes match.
>>>
>>> Though indeed I must confess I don't know how it worked in general when host
>>> page size != target page size, at least for migration.  For example, I 
>>> believe
>>> kvm dirty logging is host page size based, though migration should be 
>>> migrating
>>> pages in guest page size granule when it spots a dirty bit set.
>>>
>> Hi,
>>
>> Indeed, we handle target_page_size aligned @start and @size in general. 
>> Maybe we'd better
>> add explicit function comments about alignment requirement, and explicit 
>> alignment assert
>> on @start and @size?
> 

Hi Peter,

> Yes we can, but I think it's not strongly necessary.  As Zenghui pointed out,
> the callers of memory_region_clear_dirty_bitmap() should always be aware of 
> the
> fact that dirty bitmap is always page size based.
Agree.

> 
> OTOH I'm more worried on the other question on how we handle guest psize !=
> host psize case for migration now...
I think it does not matter when guest_psize != host_psize, as we only need to 
interact with
stage 2 page tables during migration. Stage 2 is enough for tracking guest dirty 
memory, and even
if the guest disables stage 1, we can still do a successful migration.

Please point out if I misunderstood what you meant.

Thanks,
Keqian

> 
> Thanks,
> 




Re: [PATCH] kvm: Take into account the unaligned section size when preparing bitmap

2020-12-10 Thread zhukeqian


On 2020/12/10 22:50, Peter Xu wrote:
> On Thu, Dec 10, 2020 at 10:53:23AM +0800, zhukeqian wrote:
>>
>>
>> On 2020/12/10 10:08, Peter Xu wrote:
>>> Keqian,
>>>
>>> On Thu, Dec 10, 2020 at 09:46:06AM +0800, zhukeqian wrote:
>>>> Hi,
>>>>
>>>> I see that if start or size is not PAGE aligned, it also clears areas
>>>> which beyond caller's expectation, so do we also need to consider this?
>>>
>>> Could you elaborate?
>>>
>>> If start_delta != 0, kvm_log_clear_one_slot() should already go the slow 
>>> path.
>>>
>>> Thanks,
>>>
>>
>> Hi Peter,
>>
>> start_delta /= psize;
>>
>> If start is not PAGE aligned, then start_delta is not PAGE aligned.
>> so I think the above code will implicitly extend our start to be PAGE 
>> aligned.
>>
>> I suggest that we should shrink the start and (start + size) to be PAGE 
>> aligned
>> at beginning of this function.
> 
> Callers should be with TARGET_PAGE_SIZE aligned on the size, so at least 
> x86_64
> should be pretty safe since host/guest page sizes match.
> 
> Though indeed I must confess I don't know how it worked in general when host
> page size != target page size, at least for migration.  For example, I believe
> kvm dirty logging is host page size based, though migration should be 
> migrating
> pages in guest page size granule when it spots a dirty bit set.
> 
Hi,

Indeed, we handle TARGET_PAGE_SIZE-aligned @start and @size in general. Maybe
we'd better add explicit function comments about the alignment requirement,
and an explicit alignment assert on @start and @size?
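
Something like this (a sketch; psize is qemu_real_host_page_size in this
function):

/* @start and @size must be qemu_real_host_page_size aligned */
assert(QEMU_IS_ALIGNED(start | size, psize));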

Keqian.



Re: [PATCH] kvm: Take into account the unaligned section size when preparing bitmap

2020-12-09 Thread zhukeqian



On 2020/12/10 10:08, Peter Xu wrote:
> Keqian,
> 
> On Thu, Dec 10, 2020 at 09:46:06AM +0800, zhukeqian wrote:
>> Hi,
>>
>> I see that if start or size is not PAGE aligned, it also clears areas
>> which beyond caller's expectation, so do we also need to consider this?
> 
> Could you elaborate?
> 
> If start_delta != 0, kvm_log_clear_one_slot() should already go the slow path.
> 
> Thanks,
> 

Hi Peter,

start_delta /= psize;

If start is not PAGE aligned, then start_delta is not PAGE aligned.
So I think the above code will implicitly extend our start to be PAGE aligned.

I suggest that we should shrink start and (start + size) to be PAGE aligned
at the beginning of this function.

Maybe I missed something?

Keqian.



Re: [PATCH] kvm: Take into account the unaligned section size when preparing bitmap

2020-12-09 Thread zhukeqian
Hi,

I see that if start or size is not PAGE aligned, it also clears areas
beyond the caller's expectation, so do we also need to consider this?

Thanks,
Keqian

On 2020/12/9 10:33, Zenghui Yu wrote:
> Hi Peter,
> 
> Thanks for having a look at it.
> 
> On 2020/12/8 23:16, Peter Xu wrote:
>> Hi, Zenghui,
>>
>> On Tue, Dec 08, 2020 at 07:40:13PM +0800, Zenghui Yu wrote:
>>> The kernel KVM_CLEAR_DIRTY_LOG interface has align requirement on both the
>>> start and the size of the given range of pages. We have been careful to
>>> handle the unaligned cases when performing CLEAR on one slot. But it seems
>>> that we forget to take the unaligned *size* case into account when
>>> preparing bitmap for the interface, and we may end up clearing dirty status
>>> for pages outside of [start, start + size).
>>
>> Thanks for the patch, though my understanding is that this is not a bug.
>>
>> Please have a look at kvm_memslot_init_dirty_bitmap() where we'll allocate 
>> the
>> dirty bitmap to be aligned to 8 bytes (assuming that's the possible max of 
>> the
>> value sizeof(unsigned long)).  That exactly covers 64 pages.
>>
>> So here as long as start_delta==0 (so the value of "bmap_npages - size / 
>> psize"
>> won't really matter a lot, imho), then we'll definitely have 
>> KVMSlot.dirty_bmap
>> long enough to cover the range we'd like to clear.
> 
> I agree.  But actually I'm not saying that KVMSlot.dirty_bmap is not
> long enough.  What I was having in mind is something like:
> 
> // psize = qemu_real_host_page_size;
> // slot.start_addr = 0;
> // slot.memory_size = 64 * psize;
> 
> kvm_log_clear_one_slot(slot, as, 0 * psize, 32 * psize);   --> [1]
> kvm_log_clear_one_slot(slot, as, 32 * psize, 32 * psize);  --> [2]
> 
> So the @size is not aligned with 64 pages.  Before this patch, we'll
> clear dirty status for all pages(0-63) through [1].  It looks to me that
> this violates the caller's expectation since we only want to clear
> pages(0-31).
> 
> As I said, I don't think this will happen in practice -- the migration
> code should always provide us with a 64-page aligned section (right?).
> I'm just thinking about the correctness of the specific algorithm used
> by kvm_log_clear_one_slot().
> 
> Or maybe I had missed some other points obvious ;-) ?
> 
> 
> Thanks,
> Zenghui
> 
>> Note that the size of KVMSlot.dirty_bmap can be bigger than the actually size
>> of the kvm memslot, however since kvm_memslot_init_dirty_bitmap() initialized
>> it to all zero so the extra bits will always be zero for the whole lifecycle 
>> of
>> the vm/bitmap.
>>
>> Thanks,
>>
>>>
>>> Signed-off-by: Zenghui Yu 
>>> ---
>>>   accel/kvm/kvm-all.c | 7 +--
>>>   1 file changed, 5 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
>>> index bed2455ca5..05d323ba1f 100644
>>> --- a/accel/kvm/kvm-all.c
>>> +++ b/accel/kvm/kvm-all.c
>>> @@ -747,7 +747,7 @@ static int kvm_log_clear_one_slot(KVMSlot *mem, int 
>>> as_id, uint64_t start,
>>>   assert(bmap_start % BITS_PER_LONG == 0);
>>>   /* We should never do log_clear before log_sync */
>>>   assert(mem->dirty_bmap);
>>> -if (start_delta) {
>>> +if (start_delta || bmap_npages - size / psize) {
>>>   /* Slow path - we need to manipulate a temp bitmap */
>>>   bmap_clear = bitmap_new(bmap_npages);
>>>   bitmap_copy_with_src_offset(bmap_clear, mem->dirty_bmap,
>>> @@ -760,7 +760,10 @@ static int kvm_log_clear_one_slot(KVMSlot *mem, int 
>>> as_id, uint64_t start,
>>>   bitmap_clear(bmap_clear, 0, start_delta);
>>>   d.dirty_bitmap = bmap_clear;
>>>   } else {
>>> -/* Fast path - start address aligns well with BITS_PER_LONG */
>>> +/*
>>> + * Fast path - both start and size align well with BITS_PER_LONG
>>> + * (or the end of memory slot)
>>> + */
>>>   d.dirty_bitmap = mem->dirty_bmap + BIT_WORD(bmap_start);
>>>   }
>>>   -- 
>>> 2.19.1
>>>
>>>
>>
> 
> .
> 



Ping: [PATCH v2 0/2] bugfix: Decrease dirty bitmap blocks after we remove ramblock

2020-12-03 Thread zhukeqian
Hi folks, kindly ping ...

This bugfix can save several MBs of memory and is waiting for review, thanks.

Keqian.

On 2020/11/30 21:11, Keqian Zhu wrote:
> Keqian Zhu (2):
>   ramlist: Make dirty bitmap blocks of ramlist resizable
>   ramlist: Resize dirty bitmap blocks after remove ramblock
> 
>  softmmu/physmem.c | 37 +
>  1 file changed, 29 insertions(+), 8 deletions(-)
> 



Re: [PATCH] bugfix: irq: Avoid covering object refcount of qemu_irq

2020-07-28 Thread zhukeqian
Hi Thomas,

On 2020/7/28 16:48, Thomas Huth wrote:
> On 27/07/2020 16.41, Peter Maydell wrote:
>> On Mon, 27 Jul 2020 at 14:03, Keqian Zhu  wrote:
>>>
>>> Avoid covering object refcount of qemu_irq, otherwise it may causes
>>> memory leak.
>>>
>>> Signed-off-by: Keqian Zhu 
>>> ---
>>>  hw/core/irq.c | 4 +++-
>>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/hw/core/irq.c b/hw/core/irq.c
>>> index fb3045b912..59af4dfc74 100644
>>> --- a/hw/core/irq.c
>>> +++ b/hw/core/irq.c
>>> @@ -125,7 +125,9 @@ void qemu_irq_intercept_in(qemu_irq *gpio_in, 
>>> qemu_irq_handler handler, int n)
>>>  int i;
>>>  qemu_irq *old_irqs = qemu_allocate_irqs(NULL, NULL, n);
>>>  for (i = 0; i < n; i++) {
>>> -*old_irqs[i] = *gpio_in[i];
>>> +old_irqs[i]->handler = gpio_in[i]->handler;
>>> +old_irqs[i]->opaque = gpio_in[i]->opaque;
>>> +
>>>  gpio_in[i]->handler = handler;
gpio_in[i]->opaque = &old_irqs[i];
>>>  }
>>
>> This function is leaky by design, because it doesn't do anything
>> with the old_irqs array and there's no function for un-intercepting
>> the IRQs (which would need to free that memory). This is not ideal
>> but OK because it's only used in the test suite.
> 
> I think this could better be done without calling qemu_allocate_irqs():
> Simply call qemu_allocate_irq() (without "s" at the end) within the
> for-loop for each irq instead. What do you think?
Yeah, this can save some memory. But I think it does not solve the refcount 
covering
problem.
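
For context, the struct copy clobbers the embedded Object (abridged
definition of IRQState from hw/core/irq.c; treat as a sketch):

struct IRQState {
    Object parent_obj;        /* holds the refcount that the struct copy overwrites */
    qemu_irq_handler handler;
    void *opaque;
    int n;
};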
> 
Thanks
Keqian
>  Thomas
> 
> 
> 



Re: [PATCH] bugfix: irq: Avoid covering object refcount of qemu_irq

2020-07-27 Thread zhukeqian
Hi Peter,

On 2020/7/27 22:41, Peter Maydell wrote:
> On Mon, 27 Jul 2020 at 14:03, Keqian Zhu  wrote:
>>
>> Avoid covering the object refcount of qemu_irq, otherwise it may cause a
>> memory leak.
>>
>> Signed-off-by: Keqian Zhu 
>> ---
>>  hw/core/irq.c | 4 +++-
>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/hw/core/irq.c b/hw/core/irq.c
>> index fb3045b912..59af4dfc74 100644
>> --- a/hw/core/irq.c
>> +++ b/hw/core/irq.c
>> @@ -125,7 +125,9 @@ void qemu_irq_intercept_in(qemu_irq *gpio_in, 
>> qemu_irq_handler handler, int n)
>>  int i;
>>  qemu_irq *old_irqs = qemu_allocate_irqs(NULL, NULL, n);
>>  for (i = 0; i < n; i++) {
>> -*old_irqs[i] = *gpio_in[i];
>> +old_irqs[i]->handler = gpio_in[i]->handler;
>> +old_irqs[i]->opaque = gpio_in[i]->opaque;
>> +
>>  gpio_in[i]->handler = handler;
>>  gpio_in[i]->opaque = &old_irqs[i];
>>  }
> 
> This function is leaky by design, because it doesn't do anything
> with the old_irqs array and there's no function for un-intercepting
> the IRQs (which would need to free that memory). This is not ideal
> but OK because it's only used in the test suite.
One of our internal self-developed modules also uses this function, and we
implemented a function to remove the interception, so there is no memory leak
after this bugfix.

I suggest merging this bugfix to prepare for future code that may invoke
this function.

> 
> Is there a specific bug you're trying to fix here?
The memory leak was reported by ASAN.
> 

Thanks,
Keqian
> thanks
> -- PMM
> .
> 



Re: [PATCH] bugfix: irq: Avoid covering object refcount of qemu_irq

2020-07-27 Thread zhukeqian
Hi Qiang,

On 2020/7/27 22:37, Li Qiang wrote:
> Keqian Zhu  于2020年7月27日周一 下午9:03写道:
>>
>> Avoid covering the object refcount of qemu_irq, otherwise it may cause a
>> memory leak.
> 
> Any reproducer?
> 
In mainline QEMU, this function is only used in qtest. One of our internal
self-developed modules also uses this function. The memory leak was reported
by ASAN.

Thanks,
Keqian

> Thanks,
> Li Qiang
> 
>>
>> Signed-off-by: Keqian Zhu 
>> ---
>>  hw/core/irq.c | 4 +++-
>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/hw/core/irq.c b/hw/core/irq.c
>> index fb3045b912..59af4dfc74 100644
>> --- a/hw/core/irq.c
>> +++ b/hw/core/irq.c
>> @@ -125,7 +125,9 @@ void qemu_irq_intercept_in(qemu_irq *gpio_in, 
>> qemu_irq_handler handler, int n)
>>  int i;
>>  qemu_irq *old_irqs = qemu_allocate_irqs(NULL, NULL, n);
>>  for (i = 0; i < n; i++) {
>> -*old_irqs[i] = *gpio_in[i];
>> +old_irqs[i]->handler = gpio_in[i]->handler;
>> +old_irqs[i]->opaque = gpio_in[i]->opaque;
>> +
>>  gpio_in[i]->handler = handler;
>>  gpio_in[i]->opaque = &old_irqs[i];
>>  }
>> --
>> 2.19.1
>>
> .
> 



Re: [PATCH v3] migration: Count new_dirty instead of real_dirty

2020-07-06 Thread zhukeqian
Hi Dave,

On 2020/7/3 22:20, Dr. David Alan Gilbert wrote:
> * Keqian Zhu (zhukeqi...@huawei.com) wrote:
>> real_dirty_pages becomes equal to total ram size after dirty log sync
>> in ram_init_bitmaps, the reason is that the bitmap of ramblock is
>> initialized to be all set, so old path counts them as "real dirty" at
>> beginning.
>>
>> This causes wrong dirty rate and false positive throttling.
>>
>> Signed-off-by: Keqian Zhu 
> 
> OK, 
> 
> Reviewed-by: Dr. David Alan Gilbert 
> 
> and queued.
> 
> you might still want to look at migration_trigger_throttle and see if
> you can stop the throttling if in the RAM bulk stage.
Yes, I tested it and it worked well.

Thanks,
Keqian



Re: [PATCH] migration: Assign current_migration as NULL after migration

2020-07-01 Thread zhukeqian
Please ignore this patch :-)

If we shut down the VM during migration, the migration thread may still
reference current_migration at this point.

On 2020/6/28 14:49, Keqian Zhu wrote:
> In migration_shutdown, the global var current_migration is freed but not
> assigned to NULL, which may cause a heap-use-after-free problem if the
> following code logic is abnormal.
> 
> Signed-off-by: Keqian Zhu 
> ---
>  migration/migration.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index 481a590f72..ed7332 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -189,6 +189,7 @@ void migration_shutdown(void)
>   */
>  migrate_fd_cancel(current_migration);
>  object_unref(OBJECT(current_migration));
> +current_migration = NULL;
>  }
>  
>  /* For outgoing */
> 



Re: [PATCH v2] migration: Count new_dirty instead of real_dirty

2020-06-16 Thread zhukeqian
Hi Dave,

On 2020/6/16 17:58, Dr. David Alan Gilbert wrote:
> * zhukeqian (zhukeqi...@huawei.com) wrote:
>> Hi Dave,
>>
>> On 2020/6/16 17:35, Dr. David Alan Gilbert wrote:
>>> * Keqian Zhu (zhukeqi...@huawei.com) wrote:
>>>> real_dirty_pages becomes equal to total ram size after dirty log sync
>>>> in ram_init_bitmaps, the reason is that the bitmap of ramblock is
>>>> initialized to be all set, so old path counts them as "real dirty" at
>>>> beginning.
>>>>
>>>> This causes wrong dirty rate and false positive throttling at the end
>>>> of first ram save iteration.
>>>>
>>>> Signed-off-by: Keqian Zhu 
>>>
>>> Since this function already returns num_dirty, why not just change the
>>> caller to increment a counter based off the return value?
>> Yes, that would be better :-) .
>>
>>>
>>> Can you point to the code which is using this value that triggers the
>>> throttle?
>>>
>> In migration_trigger_throttle(), rs->num_dirty_pages_period is used.
>> And it corresponds to real_dirty_pages here.
> 
> OK; so is the problem not the same as the check that's in there for
> blk_mig_bulk_activate - don't we need to do the same trick for ram bulk
> migration (i.e. the first pass).
> 
Sorry, I do not clearly get your idea. Could you give some sample
code?

> Dave
> 
>> Thanks,
>> Keqian
>>
>>> Dave
>>>
>>>
>> [...]
>>>>
>>>>
>>> --
>>> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
>>>
>>> .
>>>
>>
> --
> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
> 
> .
> 



Re: [PATCH v2] migration: Count new_dirty instead of real_dirty

2020-06-16 Thread zhukeqian
Hi Dave,

On 2020/6/16 17:35, Dr. David Alan Gilbert wrote:
> * Keqian Zhu (zhukeqi...@huawei.com) wrote:
>> real_dirty_pages becomes equal to total ram size after dirty log sync
>> in ram_init_bitmaps, the reason is that the bitmap of ramblock is
>> initialized to be all set, so old path counts them as "real dirty" at
>> beginning.
>>
>> This causes wrong dirty rate and false positive throttling at the end
>> of first ram save iteration.
>>
>> Signed-off-by: Keqian Zhu 
> 
> Since this function already returns num_dirty, why not just change the
> caller to increment a counter based off the return value?
Yes, that would be better :-) .

> 
> Can you point to the code which is using this value that triggers the
> throttle?
> 
In migration_trigger_throttle(), rs->num_dirty_pages_period is used.
And it corresponds to real_dirty_pages here.
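
Roughly (a sketch from memory; field and parameter names may differ slightly):

uint64_t bytes_dirty_period = rs->num_dirty_pages_period * TARGET_PAGE_SIZE;
uint64_t bytes_dirty_threshold = bytes_xfer_period * threshold / 100;

if (bytes_dirty_period > bytes_dirty_threshold) {
    mig_throttle_guest_down();  /* fires spuriously if the period count is inflated */
}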

Thanks,
Keqian

> Dave
> 
> 
[...]
>>
>>
> --
> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
> 
> .
> 



Re: [PATCH] migration: Count new_dirty instead of real_dirty

2020-06-15 Thread zhukeqian
Hi Jay Zhou,

On 2020/6/15 19:50, Zhoujian (jay) wrote:
> Hi Keqian,
> 
>> -Original Message-----
>> From: zhukeqian
>> Sent: Monday, June 15, 2020 11:19 AM
>> To: qemu-devel@nongnu.org; qemu-...@nongnu.org; Paolo Bonzini
>> ; Zhoujian (jay) 
>> Cc: Juan Quintela ; Chao Fan ;
>> Wanghaibin (D) 
>> Subject: Re: [PATCH] migration: Count new_dirty instead of real_dirty
>>
>> Hi Paolo and Jian Zhou,
>>
>> Do you have any suggestion on this patch?
>>
>> Thanks,
>> Keqian
>>
>> On 2020/6/1 12:02, Keqian Zhu wrote:
>>> DIRTY_LOG_INITIALLY_ALL_SET feature is on the queue. This fixs the
> 
> s/fixs/fixes
Thanks.
> 
>>> dirty rate calculation for this feature. After introducing this
>>> feature, real_dirty_pages is equal to total memory size at begining.
>>> This causing wrong dirty rate and false positive throttling.
> 
> I think it should be tested whether DIRTY_LOG_INITIALLY_ALL_SET is enabled
> in ram_init_bitmaps(maybe?) in order to be compatible with the old path.
Yeah, you are right ;-)

But after I tested the old path yesterday, I found that num_dirty_pages_period
becomes the total RAM size after the dirty log sync in ram_init_bitmaps. The 
reason is that
the bitmap of a ramblock is initialized to be all set, so the old path counts 
those pages as dirty
by mistake.

This bug causes false positive throttling at the end of the first ram save 
iteration,
even without our DIRTY_LOG_INITIALLY_ALL_SET feature.
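
For example (illustrative numbers): with a 4 GiB guest and 4 KiB target pages,
the first sync reports roughly one million "real dirty" pages, so the computed
dirty rate dwarfs any realistic bandwidth and the auto-converge throttle fires
immediately.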
> 
> Thanks,
> Jay Zhou
> 
>>>
>>> BTW, real dirty rate is not suitable and not very accurate.
>>>
>>> 1. For not suitable: We mainly concern on the relationship between
>>>dirty rate and network bandwidth. Net increasement of dirty pages
>>>makes more sense.
>>> 2. For not very accurate: With manual dirty log clear, some dirty pages
>>>will be cleared during each peroid, our "real dirty rate" is less
>>>than real "real dirty rate".
> 
I should correct these commit messages for the reason above :-)
> 
> 
>>>
>>> Signed-off-by: Keqian Zhu 
[...]

Thanks,
Keqian



Re: [PATCH] migration: Count new_dirty instead of real_dirty

2020-06-14 Thread zhukeqian
Hi Paolo and Jian Zhou,

Do you have any suggestion on this patch?

Thanks,
Keqian

On 2020/6/1 12:02, Keqian Zhu wrote:
> DIRTY_LOG_INITIALLY_ALL_SET feature is on the queue. This fixs the
> dirty rate calculation for this feature. After introducing this
> feature, real_dirty_pages is equal to the total memory size at the beginning.
> This causes a wrong dirty rate and false positive throttling.
> 
> BTW, the real dirty rate is not suitable and not very accurate.
> 
> 1. For not suitable: We are mainly concerned with the relationship between
>    dirty rate and network bandwidth. The net increase of dirty pages
>    makes more sense.
> 2. For not very accurate: With manual dirty log clear, some dirty pages
>    will be cleared during each period, so our "real dirty rate" is less
>    than the real "real dirty rate".
> 
> Signed-off-by: Keqian Zhu 
> ---
>  include/exec/ram_addr.h | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
> index 5e59a3d8d7..af9677e291 100644
> --- a/include/exec/ram_addr.h
> +++ b/include/exec/ram_addr.h
> @@ -443,7 +443,7 @@ static inline
>  uint64_t cpu_physical_memory_sync_dirty_bitmap(RAMBlock *rb,
> ram_addr_t start,
> ram_addr_t length,
> -   uint64_t *real_dirty_pages)
> +   uint64_t *accu_dirty_pages)
>  {
>  ram_addr_t addr;
>  unsigned long word = BIT_WORD((start + rb->offset) >> TARGET_PAGE_BITS);
> @@ -469,7 +469,6 @@ uint64_t cpu_physical_memory_sync_dirty_bitmap(RAMBlock 
> *rb,
>  if (src[idx][offset]) {
>  unsigned long bits = atomic_xchg(&src[idx][offset], 0);
>  unsigned long new_dirty;
> -*real_dirty_pages += ctpopl(bits);
>  new_dirty = ~dest[k];
>  dest[k] |= bits;
>  new_dirty &= bits;
> @@ -502,7 +501,6 @@ uint64_t cpu_physical_memory_sync_dirty_bitmap(RAMBlock 
> *rb,
>  start + addr + offset,
>  TARGET_PAGE_SIZE,
>  DIRTY_MEMORY_MIGRATION)) {
> -*real_dirty_pages += 1;
>  long k = (start + addr) >> TARGET_PAGE_BITS;
>  if (!test_and_set_bit(k, dest)) {
>  num_dirty++;
> @@ -511,6 +509,7 @@ uint64_t cpu_physical_memory_sync_dirty_bitmap(RAMBlock 
> *rb,
>  }
>  }
>  
> +*accu_dirty_pages += num_dirty;
>  return num_dirty;
>  }
>  #endif
> 



Re: [PATCH v3] migration/throttle: Add cpu-throttle-tailslow migration parameter

2020-05-07 Thread zhukeqian
Hi Dr.David,

Sorry for the reply delay, just come back from holiday.

On 2020/4/30 22:12, Dr. David Alan Gilbert wrote:
> * Keqian Zhu (zhukeqi...@huawei.com) wrote:
>> At the tail stage of throttling, the Guest is very sensitive to
>> CPU percentage while the @cpu-throttle-increment is excessive
>> usually at tail stage.
[...]
>> -static void mig_throttle_guest_down(void)
>> +static void mig_throttle_guest_down(uint64_t bytes_dirty_period,
>> +uint64_t bytes_dirty_threshold)
>>  {
>>  MigrationState *s = migrate_get_current();
>>  uint64_t pct_initial = s->parameters.cpu_throttle_initial;
>> -uint64_t pct_icrement = s->parameters.cpu_throttle_increment;
>> +uint64_t pct_increment = s->parameters.cpu_throttle_increment;
>> +bool pct_tailslow = s->parameters.cpu_throttle_tailslow;
>>  int pct_max = s->parameters.max_cpu_throttle;
>>  
>> +uint64_t throttle_now = cpu_throttle_get_percentage();
>> +uint64_t cpu_now, cpu_ideal, throttle_inc;
>> +
>>  /* We have not started throttling yet. Let's start it. */
>>  if (!cpu_throttle_active()) {
>>  cpu_throttle_set(pct_initial);
>>  } else {
>>  /* Throttling already on, just increase the rate */
>> -cpu_throttle_set(MIN(cpu_throttle_get_percentage() + pct_icrement,
>> - pct_max));
>> +if (!pct_tailslow) {
>> +throttle_inc = pct_increment;
>> +} else {
>> +/* Compute the ideal CPU percentage used by Guest, which may
>> + * make the dirty rate match the dirty rate threshold. */
>> +cpu_now = 100 - throttle_now;
>> +cpu_ideal = cpu_now * (bytes_dirty_threshold * 1.0 /
>> +bytes_dirty_period);
> 
> I worry if we need a divide-by-0 check; but that seems unlikely.
mig_throttle_guest_down() is called when bytes_dirty_period is bigger than
bytes_dirty_threshold, and bytes_dirty_threshold is of an unsigned type, so
bytes_dirty_period will not be zero here. I will add an assert check here
to make it clear.
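
I.e. (sketch of the planned check; bytes_dirty_period > bytes_dirty_threshold >= 0
holds at the call site):

assert(bytes_dirty_period != 0);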

> Now if that worked out as huge, then I think the MIN's guard it even
> with overflow below, so I think we're OK.
Yes, it will not exceed the legacy increment.
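
For instance (illustrative numbers): if the current throttle is 40% (so
cpu_now = 60) and the period dirtied twice the threshold
(bytes_dirty_threshold / bytes_dirty_period = 0.5), then cpu_ideal = 30 and
throttle_inc = MIN(60 - 30, pct_increment), so the step is capped by the
legacy increment.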

> 
>> +throttle_inc = MIN(cpu_now - cpu_ideal, pct_increment);
>> +}
>> +cpu_throttle_set(MIN(throttle_now + throttle_inc, pct_max));
>>  }
>>  }
>>  
[...]
>> -- 
>> 2.19.1
>>
> --
> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
> 
Thanks,
Keqian
> 
> .
> 




Re: [PATCH 2/3] intc/gicv3_kvm: use kvm_gicc_access to get ICC_CTLR_EL1

2020-04-17 Thread zhukeqian
Hi Peter,

On 2020/4/17 19:09, Peter Maydell wrote:
> On Mon, 13 Apr 2020 at 10:18, Keqian Zhu  wrote:
>>
>> Replace kvm_device_access with kvm_gicc_access to simplify
>> code.
>>
>> Signed-off-by: Keqian Zhu 
>> ---
>>  hw/intc/arm_gicv3_kvm.c | 5 ++---
>>  1 file changed, 2 insertions(+), 3 deletions(-)
>>
>> diff --git a/hw/intc/arm_gicv3_kvm.c b/hw/intc/arm_gicv3_kvm.c
>> index ca43bf87ca..85f6420498 100644
>> --- a/hw/intc/arm_gicv3_kvm.c
>> +++ b/hw/intc/arm_gicv3_kvm.c
>> @@ -678,9 +678,8 @@ static void arm_gicv3_icc_reset(CPUARMState *env, const 
>> ARMCPRegInfo *ri)
>>  }
>>
>>  /* Initialize to actual HW supported configuration */
>> -kvm_device_access(s->dev_fd, KVM_DEV_ARM_VGIC_GRP_CPU_SYSREGS,
>> -  KVM_VGIC_ATTR(ICC_CTLR_EL1, c->gicr_typer),
>> -  &s->icc_ctlr_el1[GICV3_NS], false, &error_abort);
>> +kvm_gicc_access(s, ICC_CTLR_EL1, c->cpu->cpu_index,
>> +&s->icc_ctlr_el1[GICV3_NS], false);
> 
> This works at the moment, but I'd rather we avoided looking into
> cpu->cpu_index inside the GIC code. The cpu_index is the overall
> index of the CPU of all CPUs in the system, which is not in
> theory the same as "index of this CPU for this GIC". The two
> currently match up because arm_gicv3_common_realize() populates
> its s->cpu[i].cpu by calling qemu_get_cpu(i), but in future
> we might change that code (eg so that the board code has to
> explicitly wire up the CPUs to the GIC object by passing
> pointers to the CPUs to the GIC via link properties). So I'd
> rather not have the internals of the GIC code bake in the
> assumption that 'global CPU index is the same as the index
> of the CPU for this GIC object'.
OK, I get it. This patch can be ignored.
> 
> (All the other places that call kvm_gicc_access() are doing it
> as part of a loop from 0 to n->num_cpus, so they don't have this
> issue.)
> 
> thanks
> -- PMM
> 
> .
> 
Thanks,
Keqian




Re: [PATCH v2] migration/throttle: Add cpu-throttle-tailslow migration parameter

2020-03-31 Thread zhukeqian
Hi Eric,

On 2020/3/31 23:03, Eric Blake wrote:
> On 3/15/20 11:29 PM, Keqian Zhu wrote:
>> At the tail stage of throttling, the Guest is very sensitive to
>> CPU percentage while the @cpu-throttle-increment is excessive
>> usually at tail stage.
>>
>> If this parameter is true, we will compute the ideal CPU percentage
>> used by the Guest, which may exactly makes dirty rate to be dirty
>> rate threshold. Then we will choose a smaller throttle increment
>> between the one specified by @cpu-throttle-increment and the one
>> generated by ideal CPU percentage.
>>
>> Therefore, it is compatible to traditional throttling, meanwhile
>> the throttle increment won't be excessive at tail stage. This may
>> make migration time longer, and is disabled by default.
>>
>> Signed-off-by: Keqian Zhu 
>> ---
>> Cc: Juan Quintela 
>> Cc: "Dr. David Alan Gilbert" 
>> Cc: Eric Blake 
>> Cc: Markus Armbruster 
>> ---
> 
>> +++ b/qapi/migration.json
>> @@ -552,6 +552,21 @@
>>   #  auto-converge detects that migration is not 
>> making
>>   #  progress. The default value is 10. (Since 2.7)
>>   #
>> +# @cpu-throttle-tailslow: Make CPU throttling slower at tail stage
>> +# At the tail stage of throttling, the Guest is very
>> +# sensitive to CPU percentage while the 
>> @cpu-throttle
>> +# -increment is excessive usually at tail stage.
>> +# If this parameter is true, we will compute the 
>> ideal
>> +# CPU percentage used by the Guest, which may 
>> exactly
>> +# makes dirty rate to be dirty rate threshold. Then 
>> we
> 
> Grammar is off here, but I don't know if you meant "which may exactly make 
> the dirty rate match the dirty rate threshold" or something else.
> 
yep, this is what I wanted to express. I will correct it in v3. Thanks.
>> +# will choose a smaller throttle increment between 
>> the
>> +# one specified by @cpu-throttle-increment and the 
>> one
>> +# generated by ideal CPU percentage.
>> +# Therefore, it is compatible to traditional 
>> throttling,
>> +# meanwhile the throttle increment won't be 
>> excessive
>> +# at tail stage.
>> +# The default value is false. (Since 5.0)
> 
> Is this a bug-fix that must make it into 5.0?  It seems more like a feature 
> addition, at which point listing it for 5.1 is probably a better idea.
> 
Thanks for pointing that out.
This patch was sent out before v5.0 was released.
I will correct it in v3.

Thanks,
Keqian




Re: [PATCH v2] migration/throttle: Add cpu-throttle-tailslow migration parameter

2020-03-31 Thread zhukeqian
Friendly ping...

Hi all,

Could you please review this patch. Thanks very much.

Thanks,
Keqian

On 2020/3/16 12:29, Keqian Zhu wrote:
> At the tail stage of throttling, the Guest is very sensitive to
> CPU percentage while the @cpu-throttle-increment is excessive
> usually at tail stage.
> 
> If this parameter is true, we will compute the ideal CPU percentage
> used by the Guest, which may exactly makes dirty rate to be dirty
> rate threshold. Then we will choose a smaller throttle increment
> between the one specified by @cpu-throttle-increment and the one
> generated by ideal CPU percentage.
> 
> Therefore, it is compatible to traditional throttling, meanwhile
> the throttle increment won't be excessive at tail stage. This may
> make migration time longer, and is disabled by default.
> 
> Signed-off-by: Keqian Zhu 
> ---
> Cc: Juan Quintela 
> Cc: "Dr. David Alan Gilbert" 
> Cc: Eric Blake 
> Cc: Markus Armbruster 
> ---
>  migration/migration.c | 13 
>  migration/ram.c   | 25 +-
>  monitor/hmp-cmds.c|  8 
>  qapi/migration.json   | 48 +++
>  4 files changed, 89 insertions(+), 5 deletions(-)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index c1d88ace7f..cc157cbf90 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -785,6 +785,8 @@ MigrationParameters *qmp_query_migrate_parameters(Error 
> **errp)
>  params->cpu_throttle_initial = s->parameters.cpu_throttle_initial;
>  params->has_cpu_throttle_increment = true;
>  params->cpu_throttle_increment = s->parameters.cpu_throttle_increment;
> +params->has_cpu_throttle_tailslow = true;
> +params->cpu_throttle_tailslow = s->parameters.cpu_throttle_tailslow;
>  params->has_tls_creds = true;
>  params->tls_creds = g_strdup(s->parameters.tls_creds);
>  params->has_tls_hostname = true;
> @@ -1323,6 +1325,10 @@ static void 
> migrate_params_test_apply(MigrateSetParameters *params,
>  dest->cpu_throttle_increment = params->cpu_throttle_increment;
>  }
>  
> +if (params->has_cpu_throttle_tailslow) {
> +dest->cpu_throttle_tailslow = params->cpu_throttle_tailslow;
> +}
> +
>  if (params->has_tls_creds) {
>  assert(params->tls_creds->type == QTYPE_QSTRING);
>  dest->tls_creds = g_strdup(params->tls_creds->u.s);
> @@ -1411,6 +1417,10 @@ static void migrate_params_apply(MigrateSetParameters 
> *params, Error **errp)
>  s->parameters.cpu_throttle_increment = 
> params->cpu_throttle_increment;
>  }
>  
> +if (params->has_cpu_throttle_tailslow) {
> +s->parameters.cpu_throttle_tailslow = params->cpu_throttle_tailslow;
> +}
> +
>  if (params->has_tls_creds) {
>  g_free(s->parameters.tls_creds);
>  assert(params->tls_creds->type == QTYPE_QSTRING);
> @@ -3588,6 +3598,8 @@ static Property migration_properties[] = {
>  DEFINE_PROP_UINT8("x-cpu-throttle-increment", MigrationState,
>parameters.cpu_throttle_increment,
>DEFAULT_MIGRATE_CPU_THROTTLE_INCREMENT),
> +DEFINE_PROP_BOOL("x-cpu-throttle-tailslow", MigrationState,
> +  parameters.cpu_throttle_tailslow, false),
>  DEFINE_PROP_SIZE("x-max-bandwidth", MigrationState,
>parameters.max_bandwidth, MAX_THROTTLE),
>  DEFINE_PROP_UINT64("x-downtime-limit", MigrationState,
> @@ -3694,6 +3706,7 @@ static void migration_instance_init(Object *obj)
>  params->has_throttle_trigger_threshold = true;
>  params->has_cpu_throttle_initial = true;
>  params->has_cpu_throttle_increment = true;
> +params->has_cpu_throttle_tailslow = true;
>  params->has_max_bandwidth = true;
>  params->has_downtime_limit = true;
>  params->has_x_checkpoint_delay = true;
> diff --git a/migration/ram.c b/migration/ram.c
> index c12cfdbe26..4b74461306 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -616,20 +616,34 @@ static size_t save_page_header(RAMState *rs, QEMUFile 
> *f,  RAMBlock *block,
>   * able to complete migration. Some workloads dirty memory way too
>   * fast and will not effectively converge, even with auto-converge.
>   */
> -static void mig_throttle_guest_down(void)
> +static void mig_throttle_guest_down(uint64_t bytes_dirty_period,
> +uint64_t bytes_dirty_threshold)
>  {
>  MigrationState *s = migrate_get_current();
>  uint64_t pct_initial = s->parameters.cpu_throttle_initial;
> -uint64_t pct_icrement = s->parameters.cpu_throttle_increment;
> +uint64_t pct_increment = s->parameters.cpu_throttle_increment;
> +bool pct_tailslow = s->parameters.cpu_throttle_tailslow;
>  int pct_max = s->parameters.max_cpu_throttle;
>  
> +uint64_t throttle_now = cpu_throttle_get_percentage();
> +uint64_t cpu_now, cpu_ideal, throttle_inc;
> +
>  /* We have not started throttling yet. Let's start it. */
> 

Re: [PATCH] hmp-cmd: fix a missing_break warning

2020-03-18 Thread zhukeqian
Hi Nengyuan,

On 2020/3/18 15:22, Pan Nengyuan wrote:
> Correcting zhang hailiang's email.
> 
> On 3/18/2020 3:16 PM, Pan Nengyuan wrote:
>> This fixes coverity issue 94417686:
>> 1260    break;
>> CID 94417686: (MISSING_BREAK)
>> 1261. unterminated_case: The case for value
>> "MIGRATION_PARAMETER_THROTTLE_TRIGGER_THRESHOLD" is not terminated by a
>> 'break' statement.
>> 1261    case MIGRATION_PARAMETER_THROTTLE_TRIGGER_THRESHOLD:
>> 1262        p->has_throttle_trigger_threshold = true;
>> 1263        visit_type_int(v, param, &p->throttle_trigger_threshold, &err);
>> 1264    case MIGRATION_PARAMETER_CPU_THROTTLE_INITIAL:
>>
>> Fixes: dc14a470763c96fd9d360e1028ce38e8c3613a77
>> Reported-by: Euler Robot 
>> Signed-off-by: Pan Nengyuan 
>> ---
>> Cc: zhukeqi...@huawei.com
>> ---
>>  monitor/hmp-cmds.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
>> index 58724031ea..c882c9f3cc 100644
>> --- a/monitor/hmp-cmds.c
>> +++ b/monitor/hmp-cmds.c
>> @@ -1261,6 +1261,7 @@ void hmp_migrate_set_parameter(Monitor *mon, const 
>> QDict *qdict)
>>  case MIGRATION_PARAMETER_THROTTLE_TRIGGER_THRESHOLD:
>>  p->has_throttle_trigger_threshold = true;
>>  visit_type_int(v, param, &p->throttle_trigger_threshold, &err);
>> +break;
Good fix :).
>>  case MIGRATION_PARAMETER_CPU_THROTTLE_INITIAL:
>>  p->has_cpu_throttle_initial = true;
>>  visit_type_int(v, param, &p->cpu_throttle_initial, &err);
>>
> 
> .
> 
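For readers unfamiliar with the bug class: without the break, the switch falls
through and the next parameter is silently overwritten as well. A minimal,
self-contained illustration (generic C, not QEMU code):

#include <stdio.h>

int main(void)
{
    int threshold = 0, initial = 0;
    int param = 0; /* the user asked to set only the threshold */

    switch (param) {
    case 0:
        threshold = 60;
        /* break;  <-- the missing statement this patch adds */
    case 1:
        initial = 60; /* unintentionally reached via fall-through */
    }
    printf("threshold=%d initial=%d\n", threshold, initial);
    return 0;
}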
Reviewed-by: Keqian Zhu 

Thanks,
Keqian




Re: [PATCH] migration/throttle: Add throttle-trig-thres migration parameter

2020-03-12 Thread zhukeqian
Hi Dr. David,

On 2020/3/13 2:07, Dr. David Alan Gilbert wrote:
> * Keqian Zhu (zhukeqi...@huawei.com) wrote:
>> Currently, if the bytes_dirty_period is more than the 50% of
>> bytes_xfer_period, we start or increase throttling.
>>
>> If we make this percentage higher, then we can tolerate higher
>> dirty rate during migration, which means less impact on guest.
>> The side effect of higher percentage is longer migration time.
>> We can make this parameter configurable to switch between mig-
>> ration time first or guest performance first.
>>
>> The default value is 50 and valid range is 1 to 100.
>>
>> Signed-off-by: Keqian Zhu 
> 
> Apologies for the delay.
It is not late, no worries ;).
> This looks fine now; so
> 
> Reviewed-by: Dr. David Alan Gilbert 
> 
> and I'll queue it.
Thanks.
> I think we could do with a better description than the current one if we
> can find it:
> 
>  The ratio of bytes_dirty_period and bytes_xfer_period
>  to trigger throttling. It is expressed as percentage.
> 
> assumes people understand what those bytes*period mean.
> 
> Still, until we do:
> 
> Queued for migration
> 
[...]
>> -- 
>> 2.19.1
>>
> --
> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
> 
> 
> .
> 
Thanks,
Keqian




Re: [PATCH] migration/throttle: Add throttle-trig-thres migration parameter

2020-02-23 Thread zhukeqian
Hi, Eric

On 2020/2/21 22:14, Eric Blake wrote:
> On 2/20/20 8:57 PM, Keqian Zhu wrote:
>> Currently, if the bytes_dirty_period is more than the 50% of
>> bytes_xfer_period, we start or increase throttling.
>>
>> If we make this percentage higher, then we can tolerate higher
>> dirty rate during migration, which means less impact on guest.
>> The side effect of higher percentage is longer migration time.
>>
>> We can configure this parameter to switch between migration time
>> firt or guest performance first. The default value is 50.
>>
>> Signed-off-by: Keqian Zhu 
>> ---
>> Cc: Juan Quintela 
>> Cc: "Dr. David Alan Gilbert" 
>> Cc: Eric Blake 
>> Cc: Markus Armbruster 
>> ---
> 
>> +++ b/qapi/migration.json
>> @@ -524,6 +524,10 @@
>>   #  compression, so set the decompress-threads to the 
>> number about 1/4
>>   #  of compress-threads is adequate.
>>   #
>> +# @throttle-trig-thres: The ratio of bytes_dirty_period and 
>> bytes_xfer_period to
>> +#   trigger throttling. It is expressed as percentage. 
>> The
>> +#   default value is 50. (Since 5.0)
>> +#
> 
> Abbreviating feels odd; can you please spell this out as 
> throttle-trigger-threshold?
OK, I will use the full name in v2.
> 
> Can the threshold exceed 100%?
If the threshold exceeds 100% and the dirty rate is between 100% and the
threshold, then throttling will not be started, so the migration will not
converge and will last an uncertain time until the workload in the guest
dies down by itself. So I think a threshold exceeding 100% may not be
suitable :).
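For reference, a sketch of the trigger check with the threshold made
configurable (names follow the patch; this is illustrative, not the exact
merged code):

/* Start/increase throttling once the bytes dirtied in a period exceed
 * threshold% of the bytes transferred in that period. With threshold
 * above 100, a guest dirtying at e.g. 1.2x the transfer rate would
 * never trip this check, so migration would never converge. */
if (bytes_dirty_period > bytes_xfer_period * threshold / 100) {
    mig_throttle_guest_down();
}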
> 

Thanks.
Keqian




Re: [PATCH] migration/throttle: Make throttle slower at tail stage

2020-02-18 Thread zhukeqian



On 2020/2/14 20:28, Dr. David Alan Gilbert wrote:
> * Keqian Zhu (zhukeqi...@huawei.com) wrote:
>> At the tail stage of throttle, VM is very sensitive to
>> CPU percentage. We just throttle 30% of remaining CPU
>> when throttle is more than 80 percentage.
> 
> This is a bit unusual;  all of the rest of the throttling has no
> fixed constants; all values are set by parameters.
> 
> You've got the two, the '80' and the '0.3'
> 
> I thinkt he easy solution is to replace your parameter 'tailslow' by two
> new parameters; 'tailstart' and 'tailrate';  both defaulting to 100%.
> 
> Then you make it:
> 
> if (cpu_throttle_now >= pct_tailstart) {
> /* Eat some scale of CPU from remaining */
> cpu_throttle_inc = ceil((100 - cpu_throttle_now) * pct_tailrate);
> 
> (with percentage scaling added).
> 
> Then setting 'tailstart' to 80 and 'tailrate' to 30 is equivalent to
> what you have, but means we have no magical constants in the code.
> 
Yes, this is a good suggestion. Though this patch is not the final idea,
I will apply it when the throttle approach is decided.
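A sketch of the suggestion with the two proposed parameters ('tailstart' and
'tailrate' are hypothetical names from this thread, not existing QEMU
parameters), with the percentage scaling David mentioned added:

/* Generalizes the hard-coded 80 / 0.3 pair. With pct_tailstart = 80 and
 * pct_tailrate = 30 this matches the patch; with both at 100 the branch
 * never fires, i.e. the existing behaviour. */
if (cpu_throttle_now >= pct_tailstart) {
    /* Eat a fixed fraction of the CPU the guest still has. */
    cpu_throttle_inc = ceil((100 - cpu_throttle_now) * pct_tailrate / 100.0);
}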
> Dave
> 
> 
[...]
>> -- 
>> 2.19.1
>>
> --
> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
> 
> 
> .
> 
Thanks,
Keqian




Re: [PATCH] migration/throttle: Make throttle slower at tail stage

2020-02-18 Thread zhukeqian



On 2020/2/14 19:46, Eric Blake wrote:
> On 2/13/20 9:27 PM, Keqian Zhu wrote:
>> At the tail stage of throttle, VM is very sensitive to
>> CPU percentage. We just throttle 30% of remaining CPU
>> when throttle is more than 80 percentage.
>>
>> This doesn't conflict with cpu_throttle_increment.
>>
>> This may make migration time longer, and is disabled
>> by default.
>>
>> Signed-off-by: Keqian Zhu 
>> ---
>> Cc: Juan Quintela 
>> Cc: "Dr. David Alan Gilbert" 
>> Cc: Eric Blake 
>> Cc: Markus Armbruster 
> 
>> +++ b/qapi/migration.json
>> @@ -532,6 +532,12 @@
>>   #  auto-converge detects that migration is not 
>> making
>>   #  progress. The default value is 10. (Since 2.7)
>>   #
>> +# @cpu-throttle-tailslow: Make throttle slower at tail stage
>> +# At the tail stage of throttle, VM is very 
>> sensitive to
>> +# CPU percentage. We just throttle 30% of remaining 
>> CPU
>> +# when throttle is more than 80 percentage. The 
>> default
>> +# value is false. (Since 4.1)
> 
> The next release is 5.0, not 4.1.
Thanks for reminding me.
> 
Thanks,
Keqian





Re: [PATCH] migration/throttle: Make throttle slower at tail stage

2020-02-18 Thread zhukeqian
Hi, Juan

On 2020/2/14 20:37, Juan Quintela wrote:
> Keqian Zhu  wrote:
>> At the tail stage of throttle, VM is very sensitive to
>> CPU percentage. We just throttle 30% of remaining CPU
>> when throttle is more than 80 percentage.
> 
> Why?
> 
My original idea is that if we throttle a fixed percentage of CPU every time,
then the VM becomes more and more sensitive to the performance decrease.

For example, suppose the initial throttle is 10% and we add 10% every time.
At the beginning, the performance changes from 100% to 90%, which has little
effect on the VM. However, if the dirty rate is very high and even an 80%
throttle is not enough, then the next step changes the performance from 20%
to 10%, which halves the performance and has a heavy effect on the VM.

In the example above, if an 85% throttle is enough, then throttling to 90%
causes unnecessary performance loss on the VM. So this is the reason for
slowing down throttling when we are about to reach the best throttle.
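The sensitivity can be quantified as the relative loss of the CPU share the
guest still has (an illustrative helper, not QEMU code):

/* Relative slowdown caused by one fixed 10-point throttle step:
 * early: 100% -> 90% guest CPU => (100 - 90) / 100 = 10% relative loss
 * late:   20% -> 10% guest CPU => (20 - 10) / 20  = 50% relative loss */
static double relative_loss(double cpu_before, double cpu_after)
{
    return (cpu_before - cpu_after) / cpu_before;
}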

> If we really think that this is better that current approarch, just do
> this _always_.  And throothre 30% of remaining CPU.  So we go:
> 
> 30%
> 30% + 0.3(70%)
> ...
> 
> Or anything else.
> 

This should not be a new approach; instead it is an optional enhancement to
the current approach. However, after deeper thinking, the way of throttling
30% of the remaining CPU is unusual and not suitable. We should use another
way to slow down the tail stage.

When the dirty bytes are more than 50% of the approx. bytes xfer, we start
or increase throttling. My idea is that we can calculate the expected
throttle increment. When the dirty rate is about to reach 50% of the
bandwidth, the expected throttle increment will be smaller than
"cpu_throttle_increment" at the tail stage.

Maybe the core code looks like this:

-static void mig_throttle_guest_down(void)
+static void mig_throttle_guest_down(uint64_t bytes_dirty, uint64_t bytes_xfer)
 {
 MigrationState *s = migrate_get_current();
 uint64_t pct_initial = s->parameters.cpu_throttle_initial;
-uint64_t pct_icrement = s->parameters.cpu_throttle_increment;
+uint64_t pct_increment = s->parameters.cpu_throttle_increment;
+bool pct_tailslow = s->parameters.cpu_throttle_tailslow;
 int pct_max = s->parameters.max_cpu_throttle;

+uint64_t cpu_throttle_now = cpu_throttle_get_percentage();
+uint64_t cpu_now, cpu_target, cpu_throttle_expect;
+uint64_t cpu_throttle_inc;
+
 /* We have not started throttling yet. Let's start it. */
 if (!cpu_throttle_active()) {
 cpu_throttle_set(pct_initial);
 } else {
 /* Throttling already on, just increase the rate */
-cpu_throttle_set(MIN(cpu_throttle_get_percentage() + pct_icrement,
+cpu_throttle_inc = pct_increment;
+if (pct_tailslow) {
+cpu_now = 100 - cpu_throttle_now;
+cpu_target = ((bytes_xfer / 2.0) / bytes_dirty) * cpu_now;
+cpu_throttle_expect = cpu_now - cpu_target;
+if (cpu_throttle_expect < pct_increment) {
+cpu_throttle_inc = cpu_throttle_expect;
+}
+}
+cpu_throttle_set(MIN(cpu_throttle_now + cpu_throttle_inc,
  pct_max));
 }
 }
__

-if ((rs->num_dirty_pages_period * TARGET_PAGE_SIZE >
-   (bytes_xfer_now - rs->bytes_xfer_prev) / 2) &&
+bytes_dirty_period = rs->num_dirty_pages_period * TARGET_PAGE_SIZE;
+bytes_xfer_period = bytes_xfer_now - rs->bytes_xfer_prev;
+if ((bytes_dirty_period > bytes_xfer_period / 2) &&
 (++rs->dirty_rate_high_cnt >= 2)) {
 trace_migration_throttle();
 rs->dirty_rate_high_cnt = 0;
-mig_throttle_guest_down();
+mig_throttle_guest_down(bytes_dirty_period,
+bytes_xfer_period);
 }
> My experience is:
> - you really need to go to very high throothle to make migration happens
>   (more than 70% or so usually).
> - The way that we throotle is not completely exact.
> 
>> This doesn't conflict with cpu_throttle_increment.
>>
>> This may make migration time longer, and is disabled
>> by default.
> 
> 
> What do you think?
> I think that it is better to change method and improve documentation
> that yet adding another parameter.
> 
> Later, Juan.
> 
> 
> .
> 
Thanks,
Keqian




Re: [PATCH] migration: Optimization about wait-unplug migration state

2020-02-09 Thread zhukeqian



On 2020/2/4 17:14, Juan Quintela wrote:
> Keqian Zhu  wrote:
>> qemu_savevm_nr_failover_devices() is originally designed to
>> get the number of failover devices, but it actually returns
>> the number of "unplug-pending" failover devices now. Moreover,
>> what drives migration state to wait-unplug should be the number
>> of "unplug-pending" failover devices, not all failover devices.
>>
>> We can also notice that qemu_savevm_state_guest_unplug_pending()
>> and qemu_savevm_nr_failover_devices() is equivalent almost (from
>> the code view). So the latter is incorrect semantically and
>> useless, just delete it.
>>
>> In the qemu_savevm_state_guest_unplug_pending(), once hit a
>> unplug-pending failover device, then it can return true right
>> now to save cpu time.
>>
>> Signed-off-by: Keqian Zhu 
> 
> Reviewed-by: Juan Quintela 
> 
> For my reading you are right:
> - 1st function naming is not right
> - 2nd function achieves exactly the same effect
> 
> I will wait until Jens says anything, but then I will queue it.
> 
> Thanks, Juan.
> 
> 
> .
> 
Thanks,
Keqian.




Re: [PATCH] migration: Optimization about wait-unplug migration state

2020-02-09 Thread zhukeqian



On 2020/2/5 22:40, Jens Freimann wrote:
> On Tue, Feb 04, 2020 at 01:08:41PM +0800, Keqian Zhu wrote:
>> qemu_savevm_nr_failover_devices() is originally designed to
>> get the number of failover devices, but it actually returns
>> the number of "unplug-pending" failover devices now. Moreover,
>> what drives migration state to wait-unplug should be the number
>> of "unplug-pending" failover devices, not all failover devices.
>>
>> We can also notice that qemu_savevm_state_guest_unplug_pending()
>> and qemu_savevm_nr_failover_devices() is equivalent almost (from
>> the code view). So the latter is incorrect semantically and
>> useless, just delete it.
>>
>> In the qemu_savevm_state_guest_unplug_pending(), once hit a
>> unplug-pending failover device, then it can return true right
>> now to save cpu time.
>>
>> Signed-off-by: Keqian Zhu 
>> ---
>> Cc: jfreim...@redhat.com
>> Cc: Juan Quintela 
>> Cc: "Dr. David Alan Gilbert" 
>> ---
>> migration/migration.c |  2 +-
>> migration/savevm.c| 24 +++-
>> migration/savevm.h|  1 -
>> 3 files changed, 4 insertions(+), 23 deletions(-)
>>
>> diff --git a/migration/migration.c b/migration/migration.c
>> index 3a21a4686c..deedc968cf 100644
>> --- a/migration/migration.c
>> +++ b/migration/migration.c
>> @@ -,7 +,7 @@ static void *migration_thread(void *opaque)
>>
>> qemu_savevm_state_setup(s->to_dst_file);
>>
>> -if (qemu_savevm_nr_failover_devices()) {
>> +if (qemu_savevm_state_guest_unplug_pending()) {
>> migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
>>   MIGRATION_STATUS_WAIT_UNPLUG);
>>
>> diff --git a/migration/savevm.c b/migration/savevm.c
>> index f19cb9ec7a..1d4220ece8 100644
>> --- a/migration/savevm.c
>> +++ b/migration/savevm.c
>> @@ -1140,36 +1140,18 @@ void qemu_savevm_state_header(QEMUFile *f)
>> }
>> }
>>
>> -int qemu_savevm_nr_failover_devices(void)
>> +bool qemu_savevm_state_guest_unplug_pending(void)
>> {
>> SaveStateEntry *se;
>> -int n = 0;
>>
>> QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
>> if (se->vmsd && se->vmsd->dev_unplug_pending &&
>> se->vmsd->dev_unplug_pending(se->opaque)) {
>> -n++;
>> -}
>> -}
>> -
>> -return n;
>> -}
>> -
>> -bool qemu_savevm_state_guest_unplug_pending(void)
>> -{
>> -SaveStateEntry *se;
>> -int n = 0;
>> -
>> -QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
>> -if (!se->vmsd || !se->vmsd->dev_unplug_pending) {
>> -continue;
>> -}
>> -if (se->vmsd->dev_unplug_pending(se->opaque)) {
>> -n++;
>> +return true;
>> }
>> }
>>
>> -return n > 0;
>> +return false;
>> }
>>
>> void qemu_savevm_state_setup(QEMUFile *f)
>> diff --git a/migration/savevm.h b/migration/savevm.h
>> index c42b9c80ee..ba64a7e271 100644
>> --- a/migration/savevm.h
>> +++ b/migration/savevm.h
>> @@ -31,7 +31,6 @@
>>
>> bool qemu_savevm_state_blocked(Error **errp);
>> void qemu_savevm_state_setup(QEMUFile *f);
>> -int qemu_savevm_nr_failover_devices(void);
>> bool qemu_savevm_state_guest_unplug_pending(void);
>> int qemu_savevm_state_resume_prepare(MigrationState *s);
>> void qemu_savevm_state_header(QEMUFile *f);
>> -- 
>> 2.19.1
> 
> Looks good to me. I tested it and it still works, so
> Tested-by: Jens Freimann 
> Reviewed-by: Jens Freimann 
> regards
> Jens
> 
> 
> .
> 
Thanks,
Keqian.




Re: [PATCH v2] hw/arm: Adjust some coding styles about memory hotplug

2020-01-19 Thread zhukeqian


On 2020/1/17 19:07, Peter Maydell wrote:
> On Fri, 17 Jan 2020 at 06:41, Keqian Zhu  wrote:
>>
>> From: zhukeqian 
>>
>> There is extra indent in ACPI GED plug cb. And we can use
>> existing helper function to trigger hotplug handler plug.
>>
>> Reviewed-by: Igor Mammedov 
>> Signed-off-by: Keqian Zhu 
>> ---
>>
>> v1->v2:
>>  - Add Igor's R-b
>>
>> Cc: Shameer Kolothum 
>> Cc: "Michael S. Tsirkin" 
>> Cc: Igor Mammedov 
>> Cc: Peter Maydell 
>> ---
>>  hw/acpi/generic_event_device.c | 2 +-
>>  hw/arm/virt.c  | 6 +++---
>>  2 files changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c
>> index 9cee90cc70..55eb29d80a 100644
>> --- a/hw/acpi/generic_event_device.c
>> +++ b/hw/acpi/generic_event_device.c
>> @@ -175,7 +175,7 @@ static void acpi_ged_device_plug_cb(HotplugHandler 
>> *hotplug_dev,
>>  AcpiGedState *s = ACPI_GED(hotplug_dev);
>>
>>  if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
>> -    acpi_memory_plug_cb(hotplug_dev, &s->memhp_state, dev, errp);
>> +acpi_memory_plug_cb(hotplug_dev, &s->memhp_state, dev, errp);
>>  } else {
>>  error_setg(errp, "virt: device plug request for unsupported device"
>> " type: %s", object_get_typename(OBJECT(dev)));
>> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
>> index 39ab5f47e0..656b0081c2 100644
>> --- a/hw/arm/virt.c
>> +++ b/hw/arm/virt.c
>> @@ -1934,7 +1934,6 @@ static void virt_memory_pre_plug(HotplugHandler 
>> *hotplug_dev, DeviceState *dev,
>>  static void virt_memory_plug(HotplugHandler *hotplug_dev,
>>   DeviceState *dev, Error **errp)
>>  {
>> -HotplugHandlerClass *hhc;
>>  VirtMachineState *vms = VIRT_MACHINE(hotplug_dev);
>>  Error *local_err = NULL;
>>
>> @@ -1943,8 +1942,9 @@ static void virt_memory_plug(HotplugHandler 
>> *hotplug_dev,
>>  goto out;
>>  }
>>
>> -hhc = HOTPLUG_HANDLER_GET_CLASS(vms->acpi_dev);
>> -hhc->plug(HOTPLUG_HANDLER(vms->acpi_dev), dev, &error_abort);
>> +hotplug_handler_plug(HOTPLUG_HANDLER(vms->acpi_dev),
>> + dev, &error_abort);
>> +
>>  out:
>>  error_propagate(errp, local_err);
>>  }
> 
> These two changes are unrelated (one is just a whitespace
> fixup, and it's in an entirely different file to the other
> one, which is making an actual code change). I think they
> should not be in the same patch.
> 
> thanks
> -- PMM
> 
OK, I will rework this patch.

Thanks,
Keqian
> .
>