[Qemu-devel] [PATCH 1/1 V4] qemu-kvm: fix improper nmi emulation
On 10/14/2011 01:53 PM, Jan Kiszka wrote:
> On 2011-10-14 02:53, Lai Jiangshan wrote:
>>> As explained in some other mail, we could then emulate the missing kernel feature by reading out the current in-kernel APIC state, testing if LINT1 is unmasked, and then delivering the NMI directly.
>>
>> Only the thread of the VCPU can safely get the in-kernel LAPIC states, so this approach will cause some troubles.
>
> run_on_cpu() can help.
>
> Jan

Ah, I forgot it, Thanks.

From: Lai Jiangshan la...@cn.fujitsu.com

Currently, an NMI is blindly sent to all the vCPUs when an NMI button event happens. This doesn't properly emulate real hardware, on which an NMI button event triggers LINT1. Because of this, the NMI is sent to the processor even when LINT1 is masked in the LVT. For example, this causes the problem that kdump initiated by NMI sometimes doesn't work on KVM, because kdump assumes NMI is masked on CPUs other than CPU0.

With this patch, the inject-nmi request is handled as follows:

- When the in-kernel irqchip is disabled, deliver LINT1 instead of the NMI interrupt.
- When the in-kernel irqchip is enabled, get the in-kernel LAPIC state and test APIC_LVT_MASKED; if LINT1 is unmasked, deliver the NMI directly. (Suggested by Jan Kiszka)

Changed from old version: re-implemented following Jan's suggestion.
Signed-off-by: Lai Jiangshan la...@cn.fujitsu.com
Reported-by: Kenji Kaneshige kaneshige.ke...@jp.fujitsu.com
---
 hw/apic.c |   48
 hw/apic.h |    1 +
 monitor.c |    6 +-
 3 files changed, 54 insertions(+), 1 deletions(-)

diff --git a/hw/apic.c b/hw/apic.c
index 69d6ac5..9a40129 100644
--- a/hw/apic.c
+++ b/hw/apic.c
@@ -205,6 +205,54 @@ void apic_deliver_pic_intr(DeviceState *d, int level)
     }
 }
 
+#ifdef KVM_CAP_IRQCHIP
+static inline uint32_t kapic_reg(struct kvm_lapic_state *kapic, int reg_id);
+
+struct kvm_get_remote_lapic_params {
+    CPUState *env;
+    struct kvm_lapic_state klapic;
+};
+
+static void kvm_get_remote_lapic(void *p)
+{
+    struct kvm_get_remote_lapic_params *params = p;
+
+    kvm_get_lapic(params->env, &params->klapic);
+}
+
+void apic_deliver_nmi(DeviceState *d)
+{
+    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
+
+    if (kvm_irqchip_in_kernel()) {
+        struct kvm_get_remote_lapic_params p = {.env = s->cpu_env,};
+        uint32_t lvt;
+
+        run_on_cpu(s->cpu_env, kvm_get_remote_lapic, &p);
+        lvt = kapic_reg(&p.klapic, 0x32 + APIC_LVT_LINT1);
+
+        if (lvt & APIC_LVT_MASKED) {
+            return;
+        }
+
+        if (((lvt >> 8) & 7) != APIC_DM_NMI) {
+            return;
+        }
+
+        cpu_interrupt(s->cpu_env, CPU_INTERRUPT_NMI);
+    } else {
+        apic_local_deliver(s, APIC_LVT_LINT1);
+    }
+}
+#else
+void apic_deliver_nmi(DeviceState *d)
+{
+    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
+
+    apic_local_deliver(s, APIC_LVT_LINT1);
+}
+#endif
+
 #define foreach_apic(apic, deliver_bitmask, code) \
 {\
     int __i, __j, __mask;\
diff --git a/hw/apic.h b/hw/apic.h
index c857d52..3a4be0a 100644
--- a/hw/apic.h
+++ b/hw/apic.h
@@ -10,6 +10,7 @@ void apic_deliver_irq(uint8_t dest, uint8_t dest_mode,
                       uint8_t trigger_mode);
 int apic_accept_pic_intr(DeviceState *s);
 void apic_deliver_pic_intr(DeviceState *s, int level);
+void apic_deliver_nmi(DeviceState *d);
 int apic_get_interrupt(DeviceState *s);
 void apic_reset_irq_delivered(void);
 int apic_get_irq_delivered(void);
diff --git a/monitor.c b/monitor.c
index cb485bf..0b81f17 100644
--- a/monitor.c
+++ b/monitor.c
@@ -2616,7 +2616,11 @@ static int do_inject_nmi(Monitor *mon, const QDict *qdict, QObject **ret_data)
     CPUState *env;
 
     for (env = first_cpu; env != NULL; env = env->next_cpu) {
-        cpu_interrupt(env, CPU_INTERRUPT_NMI);
+        if (!env->apic_state) {
+            cpu_interrupt(env, CPU_INTERRUPT_NMI);
+        } else {
+            apic_deliver_nmi(env->apic_state);
+        }
     }
 
     return 0;
Re: [Qemu-devel] [Qemu-ppc] [PATCH] ppcr: Avoid decrementer related kvm exits
On Fri, Oct 14, 2011 at 07:30:09AM +0200, Alexander Graf wrote:
> On 14.10.2011, at 07:19, David Gibson wrote:
>> In __cpu_ppc_store_decr(), we set up a regular timer used to trigger decrementer interrupts. This is necessary to implement the decrementer properly under TCG, but is unnecessary under KVM (true for both Book3S-PR and Book3S-HV KVM variants), because the kernel handles generating and delivering decrementer exceptions. Under kvm, in fact, the timer causes expensive and unnecessary exits from kvm to qemu. This patch, therefore, disables setting the timer when kvm is in use.
>>
>> Signed-off-by: Anton Blanchard an...@au1.ibm.com
>> Signed-off-by: David Gibson da...@gibson.dropbear.id.au
>> ---
>>  hw/ppc.c | 25 ++---
>>  1 files changed, 14 insertions(+), 11 deletions(-)
>>
>> diff --git a/hw/ppc.c b/hw/ppc.c
>> index 25b59dd..87aa4e5 100644
>> --- a/hw/ppc.c
>> +++ b/hw/ppc.c
>> @@ -658,21 +658,24 @@ static void __cpu_ppc_store_decr (CPUState *env, uint64_t *nextp,
>
> Do we ever call store_decr in the kvm case? Isn't that only called from emulated mtdec?

Yes, from cpu_ppc_set_tb_clk(). Anton observed the kvm exits in the wild, they're not theoretical.

Agh, which reminds me, I forgot to fixup the git author again. The patch should show authorship by Anton Blanchard an...@au1.ibm.com, as in the s-o-b.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
Re: [Qemu-devel] [Qemu-ppc] [PATCH] ppcr: Avoid decrementer related kvm exits
On 14.10.2011, at 08:36, David Gibson wrote:
> On Fri, Oct 14, 2011 at 07:30:09AM +0200, Alexander Graf wrote:
>> On 14.10.2011, at 07:19, David Gibson wrote:
>>> In __cpu_ppc_store_decr(), we set up a regular timer used to trigger decrementer interrupts. This is necessary to implement the decrementer properly under TCG, but is unnecessary under KVM (true for both Book3S-PR and Book3S-HV KVM variants), because the kernel handles generating and delivering decrementer exceptions. Under kvm, in fact, the timer causes expensive and unnecessary exits from kvm to qemu. This patch, therefore, disables setting the timer when kvm is in use.
>>>
>>> Signed-off-by: Anton Blanchard an...@au1.ibm.com
>>> Signed-off-by: David Gibson da...@gibson.dropbear.id.au
>>> ---
>>>  hw/ppc.c | 25 ++---
>>>  1 files changed, 14 insertions(+), 11 deletions(-)
>>>
>>> diff --git a/hw/ppc.c b/hw/ppc.c
>>> index 25b59dd..87aa4e5 100644
>>> --- a/hw/ppc.c
>>> +++ b/hw/ppc.c
>>> @@ -658,21 +658,24 @@ static void __cpu_ppc_store_decr (CPUState *env, uint64_t *nextp,
>>
>> Do we ever call store_decr in the kvm case? Isn't that only called from emulated mtdec?
>
> Yes, from cpu_ppc_set_tb_clk(). Anton observed the kvm exits in the wild, they're not theoretical.
>
> Agh, which reminds me, I forgot to fixup the git author again. The patch should show authorship by Anton Blanchard an...@au1.ibm.com, as in the s-o-b.

Wouldn't a simple

    if (kvm_enabled()) {
        return;
    }

in the beginning of the function make more sense? There's no code connecting the in-qemu and the in-kvm decrementers atm, so any logic applying to the in-qemu one is moot for kvm.

Alex
Re: [Qemu-devel] [PATCH v2 1/2] Introduce QLIST_INSERT_HEAD_RCU and dummy RCU wrappers.
On 10/13/2011 10:35 PM, Harsh Prateek Bora wrote:
> +#define QLIST_INSERT_HEAD_RCU(head, elm, field) do {                 \
> +    (elm)->field.le_prev = &(head)->lh_first;                        \
> +    smp_wmb();                                                       \
> +    if (((elm)->field.le_next = (head)->lh_first) != NULL)           \
> +        (head)->lh_first->field.le_prev = &(elm)->field.le_next;     \
> +    smp_wmb();                                                       \
> +    (head)->lh_first = (elm);                                        \
> +    smp_wmb();                                                       \
> +} while (/* CONSTCOND*/0)

Actually, looking more at it, it should be more like

    (elm)->field.le_prev = &(head)->lh_first;
    (elm)->field.le_next = (head)->lh_first;
    smp_wmb(); /* fill elm before linking it */
    if ((head)->lh_first != NULL)
        (head)->lh_first->field.le_prev = &(elm)->field.le_next;
    (head)->lh_first = (elm);
    smp_wmb();

... which even saves a memory barrier.

Paolo
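For readers who want to try the ordering Paolo describes outside the macro, here is a minimal sketch as a plain C function. It is not QEMU code: the `struct node` and `struct list_head` types and the `write_barrier()` helper are hypothetical, and a C11 release fence is only an assumed stand-in for `smp_wmb()`. A strictly conforming C11 version would also make the published pointer atomic; the sketch only mirrors the store ordering and assumes writers are serialized by the caller.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical node/head types; field names follow the BSD-style QLIST. */
struct node {
    int value;
    struct node *le_next;   /* next element */
    struct node **le_prev;  /* address of the previous element's next pointer */
};

struct list_head {
    struct node *lh_first;
};

/* Assumed stand-in for smp_wmb(): earlier stores are ordered before later ones. */
static inline void write_barrier(void)
{
    atomic_thread_fence(memory_order_release);
}

/* Insert elm at the head of the list; writers must be serialized by the caller. */
static void list_insert_head_rcu(struct list_head *head, struct node *elm)
{
    assert(head != NULL && elm != NULL);

    elm->le_prev = &head->lh_first;
    elm->le_next = head->lh_first;
    write_barrier();                /* fill elm before linking it */
    if (head->lh_first != NULL) {
        head->lh_first->le_prev = &elm->le_next;
    }
    head->lh_first = elm;           /* publish the new element */
    write_barrier();
}
```

The point of the reordering is that both fields of the new element are written before the single barrier that precedes publication, so a reader that sees the new head never sees a half-initialized element.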
Re: [Qemu-devel] [Qemu-ppc] [PATCH] ppcr: Avoid decrementer related kvm exits
On Fri, Oct 14, 2011 at 08:44:06AM +0200, Alexander Graf wrote:
> On 14.10.2011, at 08:36, David Gibson wrote:
>> On Fri, Oct 14, 2011 at 07:30:09AM +0200, Alexander Graf wrote:
>>> On 14.10.2011, at 07:19, David Gibson wrote:
>>>> In __cpu_ppc_store_decr(), we set up a regular timer used to trigger decrementer interrupts. This is necessary to implement the decrementer properly under TCG, but is unnecessary under KVM (true for both Book3S-PR and Book3S-HV KVM variants), because the kernel handles generating and delivering decrementer exceptions. Under kvm, in fact, the timer causes expensive and unnecessary exits from kvm to qemu. This patch, therefore, disables setting the timer when kvm is in use.
>>>>
>>>> Signed-off-by: Anton Blanchard an...@au1.ibm.com
>>>> Signed-off-by: David Gibson da...@gibson.dropbear.id.au
>>>> ---
>>>>  hw/ppc.c | 25 ++---
>>>>  1 files changed, 14 insertions(+), 11 deletions(-)
>>>>
>>>> diff --git a/hw/ppc.c b/hw/ppc.c
>>>> index 25b59dd..87aa4e5 100644
>>>> --- a/hw/ppc.c
>>>> +++ b/hw/ppc.c
>>>> @@ -658,21 +658,24 @@ static void __cpu_ppc_store_decr (CPUState *env, uint64_t *nextp,
>>>
>>> Do we ever call store_decr in the kvm case? Isn't that only called from emulated mtdec?
>>
>> Yes, from cpu_ppc_set_tb_clk(). Anton observed the kvm exits in the wild, they're not theoretical.
>>
>> Agh, which reminds me, I forgot to fixup the git author again. The patch should show authorship by Anton Blanchard an...@au1.ibm.com, as in the s-o-b.
>
> Wouldn't a simple
>
>     if (kvm_enabled()) {
>         return;
>     }
>
> in the beginning of the function make more sense? There's no code connecting the in-qemu and the in-kvm decrementers atm, so any logic applying to the in-qemu one is moot for kvm.

Uh.. I guess so. I wasn't 100% sure the last bit of code in the function wouldn't have some effect on kvm. But I guess it doesn't; I'll revise.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson
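The early-return shape discussed in this thread can be tried in isolation with a small stand-alone sketch. Nothing below is taken from hw/ppc.c: `kvm_enabled_stub()`, `kvm_in_use`, `timer_arm_count` and `store_decr_sketch()` are hypothetical names, and the TCG timer setup is reduced to a counter so that the effect of the guard is observable.

```c
#include <stdbool.h>
#include <stdint.h>

static bool kvm_in_use;             /* hypothetical stand-in for the state behind kvm_enabled() */
static unsigned timer_arm_count;    /* counts how often the emulated timer would be armed */

static bool kvm_enabled_stub(void)
{
    return kvm_in_use;
}

/* Sketch of a decrementer store with the guard at the top of the function. */
static void store_decr_sketch(uint32_t value)
{
    if (kvm_enabled_stub()) {
        /* Under KVM the kernel generates and delivers decrementer exceptions. */
        return;
    }

    (void)value;        /* the real computation of the expiry time is elided */
    timer_arm_count++;  /* TCG path: arm the QEMU timer */
}
```

Placing the guard first means none of the in-QEMU decrementer bookkeeping runs under KVM, which matches the observation that nothing connects the two decrementers.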
Re: [Qemu-devel] [PATCH 1/1 V4] qemu-kvm: fix improper nmi emulation
On 2011-10-14 08:36, Lai Jiangshan wrote:
> On 10/14/2011 01:53 PM, Jan Kiszka wrote:
>> On 2011-10-14 02:53, Lai Jiangshan wrote:
>>>> As explained in some other mail, we could then emulate the missing kernel feature by reading out the current in-kernel APIC state, testing if LINT1 is unmasked, and then delivering the NMI directly.
>>>
>>> Only the thread of the VCPU can safely get the in-kernel LAPIC states, so this approach will cause some troubles.
>>
>> run_on_cpu() can help.
>>
>> Jan
>
> Ah, I forgot it, Thanks.
>
> From: Lai Jiangshan la...@cn.fujitsu.com
>
> Currently, an NMI is blindly sent to all the vCPUs when an NMI button event happens. This doesn't properly emulate real hardware, on which an NMI button event triggers LINT1. Because of this, the NMI is sent to the processor even when LINT1 is masked in the LVT. For example, this causes the problem that kdump initiated by NMI sometimes doesn't work on KVM, because kdump assumes NMI is masked on CPUs other than CPU0.
>
> With this patch, the inject-nmi request is handled as follows:
>
> - When the in-kernel irqchip is disabled, deliver LINT1 instead of the NMI interrupt.
> - When the in-kernel irqchip is enabled, get the in-kernel LAPIC state and test APIC_LVT_MASKED; if LINT1 is unmasked, deliver the NMI directly. (Suggested by Jan Kiszka)
>
> Changed from old version: re-implemented following Jan's suggestion.
>
> Signed-off-by: Lai Jiangshan la...@cn.fujitsu.com
> Reported-by: Kenji Kaneshige kaneshige.ke...@jp.fujitsu.com
> ---
>  hw/apic.c |   48
>  hw/apic.h |    1 +
>  monitor.c |    6 +-
>  3 files changed, 54 insertions(+), 1 deletions(-)
>
> diff --git a/hw/apic.c b/hw/apic.c
> index 69d6ac5..9a40129 100644
> --- a/hw/apic.c
> +++ b/hw/apic.c
> @@ -205,6 +205,54 @@ void apic_deliver_pic_intr(DeviceState *d, int level)
>      }
>  }
>
> +#ifdef KVM_CAP_IRQCHIP

Again, this is always defined on x86 thus pointless to test.

> +static inline uint32_t kapic_reg(struct kvm_lapic_state *kapic, int reg_id);
> +
> +struct kvm_get_remote_lapic_params {
> +    CPUState *env;
> +    struct kvm_lapic_state klapic;
> +};
> +
> +static void kvm_get_remote_lapic(void *p)
> +{
> +    struct kvm_get_remote_lapic_params *params = p;
> +
> +    kvm_get_lapic(params->env, &params->klapic);

When you already interrupted that vcpu, why not inject from here? Avoids one further ping-pong round.

> +}
> +
> +void apic_deliver_nmi(DeviceState *d)
> +{
> +    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
> +
> +    if (kvm_irqchip_in_kernel()) {
> +        struct kvm_get_remote_lapic_params p = {.env = s->cpu_env,};
> +        uint32_t lvt;
> +
> +        run_on_cpu(s->cpu_env, kvm_get_remote_lapic, &p);
> +        lvt = kapic_reg(&p.klapic, 0x32 + APIC_LVT_LINT1);
> +
> +        if (lvt & APIC_LVT_MASKED) {
> +            return;
> +        }
> +
> +        if (((lvt >> 8) & 7) != APIC_DM_NMI) {
> +            return;
> +        }
> +
> +        cpu_interrupt(s->cpu_env, CPU_INTERRUPT_NMI);

Err, aren't you introducing KVM_CAP_LAPIC_NMI that allows to test if this workaround is needed? Oh, your latest kernel patch is missing this again - requires fixing as well.

Jan
Re: [Qemu-devel] [PATCH v2] runstate: add more valid transitions
On 10/13/2011 10:26 PM, Luiz Capitulino wrote:
> I'm going to take my word back on this one, I've found the real cause of the problem. Will post the patch right now.

Are you keeping the vl.c hunks though?

Paolo
[Qemu-devel] [PATCH] correct spelling
Signed-off-by: Dong Xu Wang wdon...@linux.vnet.ibm.com
---
 block/sheepdog.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/block/sheepdog.c b/block/sheepdog.c
index c1f6e07..ae857e2 100644
--- a/block/sheepdog.c
+++ b/block/sheepdog.c
@@ -66,7 +66,7 @@
  * 20 - 31 (12 bits): reserved data object space
  * 32 - 55 (24 bits): vdi object space
  * 56 - 59 ( 4 bits): reserved vdi object space
- * 60 - 63 ( 4 bits): object type indentifier space
+ * 60 - 63 ( 4 bits): object type identifier space
  */
 
 #define VDI_SPACE_SHIFT 32
-- 
1.7.5.4
Re: [Qemu-devel] [PATCH 1/1 V4] qemu-kvm: fix improper nmi emulation
On 10/14/2011 02:49 PM, Jan Kiszka wrote:
> On 2011-10-14 08:36, Lai Jiangshan wrote:
>> On 10/14/2011 01:53 PM, Jan Kiszka wrote:
>>> On 2011-10-14 02:53, Lai Jiangshan wrote:
>>>>> As explained in some other mail, we could then emulate the missing kernel feature by reading out the current in-kernel APIC state, testing if LINT1 is unmasked, and then delivering the NMI directly.
>>>>
>>>> Only the thread of the VCPU can safely get the in-kernel LAPIC states, so this approach will cause some troubles.
>>>
>>> run_on_cpu() can help.
>>>
>>> Jan
>>
>> Ah, I forgot it, Thanks.
>>
>> From: Lai Jiangshan la...@cn.fujitsu.com
>>
>> Currently, an NMI is blindly sent to all the vCPUs when an NMI button event happens. This doesn't properly emulate real hardware, on which an NMI button event triggers LINT1. Because of this, the NMI is sent to the processor even when LINT1 is masked in the LVT. For example, this causes the problem that kdump initiated by NMI sometimes doesn't work on KVM, because kdump assumes NMI is masked on CPUs other than CPU0.
>>
>> With this patch, the inject-nmi request is handled as follows:
>>
>> - When the in-kernel irqchip is disabled, deliver LINT1 instead of the NMI interrupt.
>> - When the in-kernel irqchip is enabled, get the in-kernel LAPIC state and test APIC_LVT_MASKED; if LINT1 is unmasked, deliver the NMI directly. (Suggested by Jan Kiszka)
>>
>> Changed from old version: re-implemented following Jan's suggestion.
>>
>> Signed-off-by: Lai Jiangshan la...@cn.fujitsu.com
>> Reported-by: Kenji Kaneshige kaneshige.ke...@jp.fujitsu.com
>> ---
>>  hw/apic.c |   48
>>  hw/apic.h |    1 +
>>  monitor.c |    6 +-
>>  3 files changed, 54 insertions(+), 1 deletions(-)
>>
>> diff --git a/hw/apic.c b/hw/apic.c
>> index 69d6ac5..9a40129 100644
>> --- a/hw/apic.c
>> +++ b/hw/apic.c
>> @@ -205,6 +205,54 @@ void apic_deliver_pic_intr(DeviceState *d, int level)
>>      }
>>  }
>>
>> +#ifdef KVM_CAP_IRQCHIP
>
> Again, this is always defined on x86 thus pointless to test.
>
>> +static inline uint32_t kapic_reg(struct kvm_lapic_state *kapic, int reg_id);
>> +
>> +struct kvm_get_remote_lapic_params {
>> +    CPUState *env;
>> +    struct kvm_lapic_state klapic;
>> +};
>> +
>> +static void kvm_get_remote_lapic(void *p)
>> +{
>> +    struct kvm_get_remote_lapic_params *params = p;
>> +
>> +    kvm_get_lapic(params->env, &params->klapic);
>
> When you already interrupted that vcpu, why not inject from here? Avoids one further ping-pong round.

get_remote_lapic and inject nmi are two different things, so I don't inject nmi from here. I didn't notice this ping-pong overhead. Thank you.

>> +}
>> +
>> +void apic_deliver_nmi(DeviceState *d)
>> +{
>> +    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
>> +
>> +    if (kvm_irqchip_in_kernel()) {
>> +        struct kvm_get_remote_lapic_params p = {.env = s->cpu_env,};
>> +        uint32_t lvt;
>> +
>> +        run_on_cpu(s->cpu_env, kvm_get_remote_lapic, &p);
>> +        lvt = kapic_reg(&p.klapic, 0x32 + APIC_LVT_LINT1);
>> +
>> +        if (lvt & APIC_LVT_MASKED) {
>> +            return;
>> +        }
>> +
>> +        if (((lvt >> 8) & 7) != APIC_DM_NMI) {
>> +            return;
>> +        }
>> +
>> +        cpu_interrupt(s->cpu_env, CPU_INTERRUPT_NMI);
>
> Err, aren't you introducing KVM_CAP_LAPIC_NMI that allows to test if this workaround is needed? Oh, your latest kernel patch is missing this again - requires fixing as well.

The kernel-side patch is dropped with this v4 patch. Did you mean you want KVM_CAP_SET_LINT1 + KVM_SET_LINT1 patches? I have made them. Sent soon.

Thanks,
Lai
Re: [Qemu-devel] [PATCH] linux-aio: Allow reads beyond the end of growable images
On Thu, Oct 13, 2011 at 03:49:39PM +0200, Kevin Wolf wrote:
> This is the linux-aio version of commits 22afa7b5 (raw-posix, synchronous) and ba1d1afd (posix-aio-compat). Reads now produce zeros after the end of file instead of failing or resulting in short reads, making linux-aio compatible with the behaviour of synchronous raw-posix requests and posix-aio-compat.
>
> The problem can be reproduced like this:
>
> dd if=/dev/zero of=/tmp/test.raw bs=1 count=1234
> ./qemu-io -k -n -g -c 'read -p 1024 512' /tmp/test.raw

Can you send a patch doing this for qemu-iotests?
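The behaviour described in the quoted commit message (reads past the end of the file return zeros instead of failing or coming back short) can be illustrated with a small synchronous sketch. This is not the linux-aio patch and not QEMU code: `pread_zero_fill()` is a hypothetical helper built only on POSIX `pread()`, and it deliberately ignores alignment and asynchronous completion.

```c
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read len bytes at offset, zero-filling past end of file.
 * Returns len on success, -1 on a read error. */
static ssize_t pread_zero_fill(int fd, void *buf, size_t len, off_t offset)
{
    size_t done = 0;

    while (done < len) {
        ssize_t n = pread(fd, (char *)buf + done, len - done,
                          offset + (off_t)done);
        if (n < 0) {
            if (errno == EINTR) {
                continue;       /* interrupted, retry the same range */
            }
            return -1;
        }
        if (n == 0) {
            /* End of file: the remainder of the request reads as zeros. */
            memset((char *)buf + done, 0, len - done);
            break;
        }
        done += (size_t)n;
    }
    return (ssize_t)len;
}
```

With the reproducer's geometry (a 1234-byte file, a 512-byte read at offset 1024), the first 210 bytes come from the file and the remaining 302 bytes are zero-filled.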
Re: [Qemu-devel] [PATCH 1/1 V4] qemu-kvm: fix improper nmi emulation
On 2011-10-14 09:43, Lai Jiangshan wrote:
> On 10/14/2011 02:49 PM, Jan Kiszka wrote:
>> On 2011-10-14 08:36, Lai Jiangshan wrote:
>>> On 10/14/2011 01:53 PM, Jan Kiszka wrote:
>>>> On 2011-10-14 02:53, Lai Jiangshan wrote:
>>>>>> As explained in some other mail, we could then emulate the missing kernel feature by reading out the current in-kernel APIC state, testing if LINT1 is unmasked, and then delivering the NMI directly.
>>>>>
>>>>> Only the thread of the VCPU can safely get the in-kernel LAPIC states, so this approach will cause some troubles.
>>>>
>>>> run_on_cpu() can help.
>>>>
>>>> Jan
>>>
>>> Ah, I forgot it, Thanks.
>>>
>>> From: Lai Jiangshan la...@cn.fujitsu.com
>>>
>>> Currently, an NMI is blindly sent to all the vCPUs when an NMI button event happens. This doesn't properly emulate real hardware, on which an NMI button event triggers LINT1. Because of this, the NMI is sent to the processor even when LINT1 is masked in the LVT. For example, this causes the problem that kdump initiated by NMI sometimes doesn't work on KVM, because kdump assumes NMI is masked on CPUs other than CPU0.
>>>
>>> With this patch, the inject-nmi request is handled as follows:
>>>
>>> - When the in-kernel irqchip is disabled, deliver LINT1 instead of the NMI interrupt.
>>> - When the in-kernel irqchip is enabled, get the in-kernel LAPIC state and test APIC_LVT_MASKED; if LINT1 is unmasked, deliver the NMI directly. (Suggested by Jan Kiszka)
>>>
>>> Changed from old version: re-implemented following Jan's suggestion.
>>>
>>> Signed-off-by: Lai Jiangshan la...@cn.fujitsu.com
>>> Reported-by: Kenji Kaneshige kaneshige.ke...@jp.fujitsu.com
>>> ---
>>>  hw/apic.c |   48
>>>  hw/apic.h |    1 +
>>>  monitor.c |    6 +-
>>>  3 files changed, 54 insertions(+), 1 deletions(-)
>>>
>>> diff --git a/hw/apic.c b/hw/apic.c
>>> index 69d6ac5..9a40129 100644
>>> --- a/hw/apic.c
>>> +++ b/hw/apic.c
>>> @@ -205,6 +205,54 @@ void apic_deliver_pic_intr(DeviceState *d, int level)
>>>      }
>>>  }
>>>
>>> +#ifdef KVM_CAP_IRQCHIP
>>
>> Again, this is always defined on x86 thus pointless to test.
>>
>>> +static inline uint32_t kapic_reg(struct kvm_lapic_state *kapic, int reg_id);
>>> +
>>> +struct kvm_get_remote_lapic_params {
>>> +    CPUState *env;
>>> +    struct kvm_lapic_state klapic;
>>> +};
>>> +
>>> +static void kvm_get_remote_lapic(void *p)
>>> +{
>>> +    struct kvm_get_remote_lapic_params *params = p;
>>> +
>>> +    kvm_get_lapic(params->env, &params->klapic);
>>
>> When you already interrupted that vcpu, why not inject from here? Avoids one further ping-pong round.
>
> get_remote_lapic and inject nmi are two different things, so I don't inject nmi from here. I didn't notice this ping-pong overhead. Thank you.

Actually, it is not performance-critical. But there is a race between obtaining the APIC state and testing for the NMI injection path. So it's better to define an on-vcpu LINT1 NMI injection service.

>>> +}
>>> +
>>> +void apic_deliver_nmi(DeviceState *d)
>>> +{
>>> +    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
>>> +
>>> +    if (kvm_irqchip_in_kernel()) {
>>> +        struct kvm_get_remote_lapic_params p = {.env = s->cpu_env,};
>>> +        uint32_t lvt;
>>> +
>>> +        run_on_cpu(s->cpu_env, kvm_get_remote_lapic, &p);
>>> +        lvt = kapic_reg(&p.klapic, 0x32 + APIC_LVT_LINT1);
>>> +
>>> +        if (lvt & APIC_LVT_MASKED) {
>>> +            return;
>>> +        }
>>> +
>>> +        if (((lvt >> 8) & 7) != APIC_DM_NMI) {
>>> +            return;
>>> +        }
>>> +
>>> +        cpu_interrupt(s->cpu_env, CPU_INTERRUPT_NMI);
>>
>> Err, aren't you introducing KVM_CAP_LAPIC_NMI that allows to test if this workaround is needed? Oh, your latest kernel patch is missing this again - requires fixing as well.
>
> The kernel-side patch is dropped with this v4 patch. Did you mean you want KVM_CAP_SET_LINT1 + KVM_SET_LINT1 patches? I have made them.

OK, so this is going to be applied on top? Then I take this remark back.

Jan
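The race Jan mentions comes from reading the LVT in one step and injecting in another. A minimal sketch of doing both inside a single callback that runs on the vCPU thread is shown below. It is only an illustration under stated assumptions: `struct vcpu_sketch`, `lint1_delivers_nmi()` and `lint1_nmi_on_vcpu()` are hypothetical names, the LAPIC state is reduced to one register value, and injection is reduced to a counter. The bit positions follow the architectural LVT layout (mask bit 16, delivery mode in bits 8 to 10, NMI encoded as 100b).

```c
#include <stdbool.h>
#include <stdint.h>

#define LVT_MASKED      (1u << 16)  /* LVT mask bit */
#define LVT_DM_SHIFT    8           /* delivery mode field starts at bit 8 */
#define LVT_DM_MASK     0x7u        /* delivery mode is three bits wide */
#define DM_NMI          4u          /* delivery mode 100b = NMI */

/* Hypothetical per-vCPU state, reduced to what the sketch needs. */
struct vcpu_sketch {
    uint32_t lint1_lvt;     /* current LVT LINT1 register value */
    unsigned nmi_pending;   /* number of NMIs injected so far */
};

/* True if asserting LINT1 should reach the CPU as an NMI. */
static bool lint1_delivers_nmi(uint32_t lvt)
{
    if (lvt & LVT_MASKED) {
        return false;
    }
    return ((lvt >> LVT_DM_SHIFT) & LVT_DM_MASK) == DM_NMI;
}

/* Callback in the void (*)(void *) style used with run_on_cpu() in the patch:
 * check and inject in one step, on the vCPU's own thread. */
static void lint1_nmi_on_vcpu(void *opaque)
{
    struct vcpu_sketch *cpu = opaque;

    if (lint1_delivers_nmi(cpu->lint1_lvt)) {
        cpu->nmi_pending++;
    }
}
```

Because the check and the injection happen in the same callback, the guest cannot change the LVT entry between them, and the extra round trip back to the requesting thread disappears.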
[Qemu-devel] [PATCH 0/4] coroutinization of flush and discard (split out of NBD series)
Thanks to Stefan's series from today, I managed to understand what the coroutinization was all about. So I split this part out of the NBD series. Applies on top of block branch + the other five patches.

Paolo Bonzini (3):
  block: rename bdrv_co_rw_bh
  block: unify flush implementations
  block: add bdrv_co_discard and bdrv_aio_discard support

Stefan Hajnoczi (1):
  block: drop redundant bdrv_flush implementation

 block.c           |  236 ++---
 block.h           |    3 +
 block/blkdebug.c  |    6 --
 block/blkverify.c |    9 --
 block/qcow.c      |    6 --
 block/qcow2.c     |   19
 block/qed.c       |    6 --
 block/raw-posix.c |   18
 block/raw.c       |   19 ++---
 block_int.h       |   10 ++-
 trace-events      |    1 +
 11 files changed, 152 insertions(+), 181 deletions(-)

-- 
1.7.6
[Qemu-devel] [PATCH 1/4] block: rename bdrv_co_rw_bh
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 block.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/block.c b/block.c
index 9873b57..7184a0f 100644
--- a/block.c
+++ b/block.c
@@ -2735,7 +2735,7 @@ static AIOPool bdrv_em_co_aio_pool = {
     .cancel = bdrv_aio_co_cancel_em,
 };
 
-static void bdrv_co_rw_bh(void *opaque)
+static void bdrv_co_em_bh(void *opaque)
 {
     BlockDriverAIOCBCoroutine *acb = opaque;
 
@@ -2758,7 +2758,7 @@ static void coroutine_fn bdrv_co_do_rw(void *opaque)
             acb->req.nb_sectors, acb->req.qiov);
     }
 
-    acb->bh = qemu_bh_new(bdrv_co_rw_bh, acb);
+    acb->bh = qemu_bh_new(bdrv_co_em_bh, acb);
     qemu_bh_schedule(acb->bh);
 }
 
-- 
1.7.6
Re: [Qemu-devel] [PATCH] savevm: qemu_savevm_state(): Drop stop VM logic
On 13.10.2011 22:27, Luiz Capitulino wrote:
> qemu_savevm_state() has some logic to stop the VM and to (or not to) resume it. But this seems to be a big noop, as qemu_savevm_state() is only called by do_savevm() when the VM is already stopped. So, let's drop qemu_savevm_state()'s stop VM logic.
>
> Signed-off-by: Luiz Capitulino lcapitul...@redhat.com

Reviewed-by: Kevin Wolf kw...@redhat.com
[Qemu-devel] [PATCH 2/4] block: unify flush implementations
Add coroutine support for flush and apply the same emulation that we already do for read/write. bdrv_aio_flush is simplified to always go through a coroutine. Signed-off-by: Paolo Bonzini pbonz...@redhat.com --- block.c | 160 ++- block_int.h |1 + 2 files changed, 71 insertions(+), 90 deletions(-) diff --git a/block.c b/block.c index 7184a0f..0af9a89 100644 --- a/block.c +++ b/block.c @@ -53,17 +53,12 @@ static BlockDriverAIOCB *bdrv_aio_readv_em(BlockDriverState *bs, static BlockDriverAIOCB *bdrv_aio_writev_em(BlockDriverState *bs, int64_t sector_num, QEMUIOVector *qiov, int nb_sectors, BlockDriverCompletionFunc *cb, void *opaque); -static BlockDriverAIOCB *bdrv_aio_flush_em(BlockDriverState *bs, -BlockDriverCompletionFunc *cb, void *opaque); -static BlockDriverAIOCB *bdrv_aio_noop_em(BlockDriverState *bs, -BlockDriverCompletionFunc *cb, void *opaque); static int coroutine_fn bdrv_co_readv_em(BlockDriverState *bs, int64_t sector_num, int nb_sectors, QEMUIOVector *iov); static int coroutine_fn bdrv_co_writev_em(BlockDriverState *bs, int64_t sector_num, int nb_sectors, QEMUIOVector *iov); -static int coroutine_fn bdrv_co_flush_em(BlockDriverState *bs); static int coroutine_fn bdrv_co_do_readv(BlockDriverState *bs, int64_t sector_num, int nb_sectors, QEMUIOVector *qiov); static int coroutine_fn bdrv_co_do_writev(BlockDriverState *bs, @@ -203,9 +198,6 @@ void bdrv_register(BlockDriver *bdrv) } } -if (!bdrv-bdrv_aio_flush) -bdrv-bdrv_aio_flush = bdrv_aio_flush_em; - QLIST_INSERT_HEAD(bdrv_drivers, bdrv, list); } @@ -1027,11 +1019,6 @@ static int bdrv_check_request(BlockDriverState *bs, int64_t sector_num, nb_sectors * BDRV_SECTOR_SIZE); } -static inline bool bdrv_has_async_flush(BlockDriver *drv) -{ -return drv-bdrv_aio_flush != bdrv_aio_flush_em; -} - typedef struct RwCo { BlockDriverState *bs; int64_t sector_num; @@ -1759,33 +1746,6 @@ const char *bdrv_get_device_name(BlockDriverState *bs) return bs-device_name; } -int bdrv_flush(BlockDriverState *bs) -{ -if 
(bs-open_flags BDRV_O_NO_FLUSH) { -return 0; -} - -if (bs-drv bdrv_has_async_flush(bs-drv) qemu_in_coroutine()) { -return bdrv_co_flush_em(bs); -} - -if (bs-drv bs-drv-bdrv_flush) { -return bs-drv-bdrv_flush(bs); -} - -/* - * Some block drivers always operate in either writethrough or unsafe mode - * and don't support bdrv_flush therefore. Usually qemu doesn't know how - * the server works (because the behaviour is hardcoded or depends on - * server-side configuration), so we can't ensure that everything is safe - * on disk. Returning an error doesn't work because that would break guests - * even if the server operates in writethrough mode. - * - * Let's hope the user knows what he's doing. - */ -return 0; -} - void bdrv_flush_all(void) { BlockDriverState *bs; @@ -2610,22 +2570,6 @@ fail: return -1; } -BlockDriverAIOCB *bdrv_aio_flush(BlockDriverState *bs, -BlockDriverCompletionFunc *cb, void *opaque) -{ -BlockDriver *drv = bs-drv; - -trace_bdrv_aio_flush(bs, opaque); - -if (bs-open_flags BDRV_O_NO_FLUSH) { -return bdrv_aio_noop_em(bs, cb, opaque); -} - -if (!drv) -return NULL; -return drv-bdrv_aio_flush(bs, cb, opaque); -} - void bdrv_aio_cancel(BlockDriverAIOCB *acb) { acb-pool-cancel(acb); @@ -2785,41 +2729,28 @@ static BlockDriverAIOCB *bdrv_co_aio_rw_vector(BlockDriverState *bs, return acb-common; } -static BlockDriverAIOCB *bdrv_aio_flush_em(BlockDriverState *bs, -BlockDriverCompletionFunc *cb, void *opaque) +static void coroutine_fn bdrv_co_flush(void *opaque) { -BlockDriverAIOCBSync *acb; - -acb = qemu_aio_get(bdrv_em_aio_pool, bs, cb, opaque); -acb-is_write = 1; /* don't bounce in the completion hadler */ -acb-qiov = NULL; -acb-bounce = NULL; -acb-ret = 0; - -if (!acb-bh) -acb-bh = qemu_bh_new(bdrv_aio_bh_cb, acb); +BlockDriverAIOCBCoroutine *acb = opaque; +BlockDriverState *bs = acb-common.bs; -bdrv_flush(bs); +acb-req.error = bdrv_flush(bs); +acb-bh = qemu_bh_new(bdrv_co_em_bh, acb); qemu_bh_schedule(acb-bh); -return acb-common; } -static 
BlockDriverAIOCB *bdrv_aio_noop_em(BlockDriverState *bs, +BlockDriverAIOCB *bdrv_aio_flush(BlockDriverState *bs, BlockDriverCompletionFunc *cb, void *opaque) { -BlockDriverAIOCBSync *acb; +trace_bdrv_aio_flush(bs, opaque); -acb = qemu_aio_get(bdrv_em_aio_pool, bs, cb, opaque); -acb-is_write = 1; /* don't bounce in the completion handler */ -acb-qiov =
[Qemu-devel] [PATCH 1/2 V5] qemu-kvm: Synchronize kernel headers
Synchronize newest kernel headers which have KVM_CAP_SET_LINT1 and KVM_SET_LINT1 by ./scripts/update-linux-headers.sh Signed-off-by: Lai Jiangshan la...@cn.fujitsu.com --- linux-headers/asm-powerpc/kvm.h | 19 +-- linux-headers/asm-x86/kvm.h |1 + linux-headers/asm-x86/kvm_para.h | 14 ++ linux-headers/linux/kvm.h| 26 +++--- linux-headers/linux/kvm_para.h |1 + 5 files changed, 52 insertions(+), 9 deletions(-) diff --git a/linux-headers/asm-powerpc/kvm.h b/linux-headers/asm-powerpc/kvm.h index 777d307..a4f6c85 100644 --- a/linux-headers/asm-powerpc/kvm.h +++ b/linux-headers/asm-powerpc/kvm.h @@ -22,6 +22,10 @@ #include linux/types.h +/* Select powerpc specific features in linux/kvm.h */ +#define __KVM_HAVE_SPAPR_TCE +#define __KVM_HAVE_PPC_SMT + struct kvm_regs { __u64 pc; __u64 cr; @@ -166,8 +170,8 @@ struct kvm_sregs { } ppc64; struct { __u32 sr[16]; - __u64 ibat[8]; - __u64 dbat[8]; + __u64 ibat[8]; + __u64 dbat[8]; } ppc32; } s; struct { @@ -272,4 +276,15 @@ struct kvm_guest_debug_arch { #define KVM_INTERRUPT_UNSET-2U #define KVM_INTERRUPT_SET_LEVEL-3U +/* for KVM_CAP_SPAPR_TCE */ +struct kvm_create_spapr_tce { + __u64 liobn; + __u32 window_size; +}; + +/* for KVM_ALLOCATE_RMA */ +struct kvm_allocate_rma { + __u64 rma_size; +}; + #endif /* __LINUX_KVM_POWERPC_H */ diff --git a/linux-headers/asm-x86/kvm.h b/linux-headers/asm-x86/kvm.h index 4d8dcbd..88d0ac3 100644 --- a/linux-headers/asm-x86/kvm.h +++ b/linux-headers/asm-x86/kvm.h @@ -24,6 +24,7 @@ #define __KVM_HAVE_DEBUGREGS #define __KVM_HAVE_XSAVE #define __KVM_HAVE_XCRS +#define __KVM_HAVE_SET_LINT1 /* Architectural interrupt line count. 
*/ #define KVM_NR_INTERRUPTS 256 diff --git a/linux-headers/asm-x86/kvm_para.h b/linux-headers/asm-x86/kvm_para.h index 834d71e..f2ac46a 100644 --- a/linux-headers/asm-x86/kvm_para.h +++ b/linux-headers/asm-x86/kvm_para.h @@ -21,6 +21,7 @@ */ #define KVM_FEATURE_CLOCKSOURCE23 #define KVM_FEATURE_ASYNC_PF 4 +#define KVM_FEATURE_STEAL_TIME 5 /* The last 8 bits are used to indicate how to interpret the flags field * in pvclock structure. If no bits are set, all flags are ignored. @@ -30,10 +31,23 @@ #define MSR_KVM_WALL_CLOCK 0x11 #define MSR_KVM_SYSTEM_TIME 0x12 +#define KVM_MSR_ENABLED 1 /* Custom MSRs falls in the range 0x4b564d00-0x4b564dff */ #define MSR_KVM_WALL_CLOCK_NEW 0x4b564d00 #define MSR_KVM_SYSTEM_TIME_NEW 0x4b564d01 #define MSR_KVM_ASYNC_PF_EN 0x4b564d02 +#define MSR_KVM_STEAL_TIME 0x4b564d03 + +struct kvm_steal_time { + __u64 steal; + __u32 version; + __u32 flags; + __u32 pad[12]; +}; + +#define KVM_STEAL_ALIGNMENT_BITS 5 +#define KVM_STEAL_VALID_BITS ((-1ULL (KVM_STEAL_ALIGNMENT_BITS + 1))) +#define KVM_STEAL_RESERVED_MASK (((1 KVM_STEAL_ALIGNMENT_BITS) - 1 ) 1) #define KVM_MAX_MMU_OP_BATCH 32 diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h index fc63b73..86808b4 100644 --- a/linux-headers/linux/kvm.h +++ b/linux-headers/linux/kvm.h @@ -161,6 +161,7 @@ struct kvm_pit_config { #define KVM_EXIT_NMI 16 #define KVM_EXIT_INTERNAL_ERROR 17 #define KVM_EXIT_OSI 18 +#define KVM_EXIT_PAPR_HCALL 19 /* For KVM_EXIT_INTERNAL_ERROR */ #define KVM_INTERNAL_ERROR_EMULATION 1 @@ -264,6 +265,11 @@ struct kvm_run { struct { __u64 gprs[32]; } osi; + struct { + __u64 nr; + __u64 ret; + __u64 args[9]; + } papr_hcall; /* Fix the size of the union. 
*/ char padding[256]; }; @@ -544,6 +550,13 @@ struct kvm_ppc_pvinfo { #define KVM_CAP_TSC_CONTROL 60 #define KVM_CAP_GET_TSC_KHZ 61 #define KVM_CAP_PPC_BOOKE_SREGS 62 +#define KVM_CAP_SPAPR_TCE 63 +#define KVM_CAP_PPC_SMT 64 +#define KVM_CAP_PPC_RMA65 +#define KVM_CAP_S390_GMAP 71 +#ifdef __KVM_HAVE_SET_LINT1 +#define KVM_CAP_SET_LINT1 72 +#endif #ifdef KVM_CAP_IRQ_ROUTING @@ -746,6 +759,11 @@ struct kvm_clock_data { /* Available with KVM_CAP_XCRS */ #define KVM_GET_XCRS _IOR(KVMIO, 0xa6, struct kvm_xcrs) #define KVM_SET_XCRS _IOW(KVMIO, 0xa7, struct kvm_xcrs) +#define KVM_CREATE_SPAPR_TCE _IOW(KVMIO, 0xa8, struct kvm_create_spapr_tce) +/* Available with KVM_CAP_RMA */ +#define KVM_ALLOCATE_RMA _IOR(KVMIO, 0xa9, struct kvm_allocate_rma) +/* Available with KVM_CAP_SET_LINT1 for x86 */ +#define KVM_SET_LINT1_IO(KVMIO, 0xaa) #define KVM_DEV_ASSIGN_ENABLE_IOMMU(1 0)
[Qemu-devel] [PATCH 1/1 V5] kernel/kvm: introduce KVM_SET_LINT1 and fix improper nmi emulation
Currently, an NMI is blindly sent to all the vCPUs when an NMI button
event happens. This doesn't properly emulate real hardware, on which an
NMI button event triggers LINT1. Because of this, the NMI is sent to the
processor even when LINT1 is masked in the LVT. For example, this causes
the problem that kdump initiated by NMI sometimes doesn't work on KVM,
because kdump assumes NMI is masked on CPUs other than CPU0.

With this patch, we introduce KVM_SET_LINT1, which can be used to
correctly emulate the NMI button without changing the old KVM_NMI
behavior.

Signed-off-by: Lai Jiangshan <la...@cn.fujitsu.com>
Reported-by: Kenji Kaneshige <kaneshige.ke...@jp.fujitsu.com>
---
 arch/x86/include/asm/kvm.h |    1 +
 arch/x86/kvm/irq.h         |    1 +
 arch/x86/kvm/lapic.c       |    7 +++
 arch/x86/kvm/x86.c         |    8
 include/linux/kvm.h        |    5 +
 5 files changed, 22 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/kvm.h b/arch/x86/include/asm/kvm.h
index 4d8dcbd..88d0ac3 100644
--- a/arch/x86/include/asm/kvm.h
+++ b/arch/x86/include/asm/kvm.h
@@ -24,6 +24,7 @@
 #define __KVM_HAVE_DEBUGREGS
 #define __KVM_HAVE_XSAVE
 #define __KVM_HAVE_XCRS
+#define __KVM_HAVE_SET_LINT1
 
 /* Architectural interrupt line count. */
 #define KVM_NR_INTERRUPTS 256
diff --git a/arch/x86/kvm/irq.h b/arch/x86/kvm/irq.h
index 53e2d08..0c96315 100644
--- a/arch/x86/kvm/irq.h
+++ b/arch/x86/kvm/irq.h
@@ -95,6 +95,7 @@ void kvm_pic_reset(struct kvm_kpic_state *s);
 void kvm_inject_pending_timer_irqs(struct kvm_vcpu *vcpu);
 void kvm_inject_apic_timer_irqs(struct kvm_vcpu *vcpu);
 void kvm_apic_nmi_wd_deliver(struct kvm_vcpu *vcpu);
+void kvm_apic_lint1_deliver(struct kvm_vcpu *vcpu);
 void __kvm_migrate_apic_timer(struct kvm_vcpu *vcpu);
 void __kvm_migrate_pit_timer(struct kvm_vcpu *vcpu);
 void __kvm_migrate_timers(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 57dcbd4..87fe36a 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1039,6 +1039,13 @@ void kvm_apic_nmi_wd_deliver(struct kvm_vcpu *vcpu)
 		kvm_apic_local_deliver(apic, APIC_LVT0);
 }
 
+void kvm_apic_lint1_deliver(struct kvm_vcpu *vcpu)
+{
+	struct kvm_lapic *apic = vcpu->arch.apic;
+
+	kvm_apic_local_deliver(apic, APIC_LVT1);
+}
+
 static struct kvm_timer_ops lapic_timer_ops = {
 	.is_periodic = lapic_is_periodic,
 };
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 84a28ea..fccd094 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2077,6 +2077,7 @@ int kvm_dev_ioctl_check_extension(long ext)
 	case KVM_CAP_XSAVE:
 	case KVM_CAP_ASYNC_PF:
 	case KVM_CAP_GET_TSC_KHZ:
+	case KVM_CAP_SET_LINT1:
 		r = 1;
 		break;
 	case KVM_CAP_COALESCED_MMIO:
@@ -3264,6 +3265,13 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 		goto out;
 	}
+	case KVM_SET_LINT1: {
+		r = -EINVAL;
+		if (!irqchip_in_kernel(vcpu->kvm))
+			goto out;
+		r = 0;
+		kvm_apic_lint1_deliver(vcpu);
+	}
 	default:
 		r = -EINVAL;
 	}
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index aace6b8..3a10572 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -554,6 +554,9 @@ struct kvm_ppc_pvinfo {
 #define KVM_CAP_PPC_SMT 64
 #define KVM_CAP_PPC_RMA 65
 #define KVM_CAP_S390_GMAP 71
+#ifdef __KVM_HAVE_SET_LINT1
+#define KVM_CAP_SET_LINT1 72
+#endif
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -759,6 +762,8 @@ struct kvm_clock_data {
 #define KVM_CREATE_SPAPR_TCE _IOW(KVMIO, 0xa8, struct kvm_create_spapr_tce)
 /* Available with KVM_CAP_RMA */
 #define KVM_ALLOCATE_RMA _IOR(KVMIO, 0xa9, struct kvm_allocate_rma)
+/* Available with KVM_CAP_SET_LINT1 for x86 */
+#define KVM_SET_LINT1 _IO(KVMIO, 0xaa)
 
 #define KVM_DEV_ASSIGN_ENABLE_IOMMU (1 << 0)
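
A detail worth checking when reading the x86.c hunk above: as quoted, the
new KVM_SET_LINT1 case block is followed directly by default:, with no
break in between. The standalone sketch below is not kernel code; the
names demo_ioctl_fallthrough, demo_ioctl_with_break, DEMO_SET_LINT1 and
lint1_deliveries are made up for illustration. It only shows how a C
switch of that shape behaves with and without a break.

```c
#include <errno.h>

/* Hypothetical ioctl numbers, for illustration only. */
enum { DEMO_SET_LINT1 = 1, DEMO_OTHER = 2 };

int lint1_deliveries;   /* counts simulated LINT1 deliveries */

/* Same shape as the quoted hunk: no break before default. */
int demo_ioctl_fallthrough(int nr, int irqchip_in_kernel)
{
    int r;

    switch (nr) {
    case DEMO_SET_LINT1: {
        r = -EINVAL;
        if (!irqchip_in_kernel)
            goto out;
        r = 0;
        lint1_deliveries++;
    }
    /* falls through */
    default:
        r = -EINVAL;    /* overwrites the r = 0 set just above */
    }
out:
    return r;
}

/* Variant with an explicit break after the delivery. */
int demo_ioctl_with_break(int nr, int irqchip_in_kernel)
{
    int r;

    switch (nr) {
    case DEMO_SET_LINT1: {
        r = -EINVAL;
        if (!irqchip_in_kernel)
            goto out;
        r = 0;
        lint1_deliveries++;
        break;
    }
    default:
        r = -EINVAL;
    }
out:
    return r;
}
```

In the first variant the delivery still happens, but the caller sees
-EINVAL; in the second the caller sees 0 when the irqchip is in the
kernel.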
[Qemu-devel] [PATCH 2/2 V5] qemu-kvm: fix improper nmi emulation
Currently, an NMI is blindly sent to all the vCPUs when an NMI button
event happens. This doesn't properly emulate real hardware, on which an
NMI button event triggers LINT1. Because of this, the NMI is sent to the
processor even when LINT1 is masked in the LVT. For example, this causes
the problem that kdump initiated by NMI sometimes doesn't work on KVM,
because kdump assumes NMI is masked on CPUs other than CPU0.

With this patch, an inject-nmi request is handled as follows:

- When the in-kernel irqchip is enabled and KVM_SET_LINT1 is available,
  inject LINT1 instead of an NMI.
- Otherwise, when the in-kernel irqchip is enabled, get the in-kernel
  LAPIC state and test APIC_LVT_MASKED; if LINT1 is unmasked, deliver
  the NMI directly.
- Otherwise, the userland LAPIC emulates the NMI button and injects the
  NMI if LINT1 is unmasked.

Signed-off-by: Lai Jiangshan <la...@cn.fujitsu.com>
Reported-by: Kenji Kaneshige <kaneshige.ke...@jp.fujitsu.com>
---
 hw/apic.c |   72 +
 hw/apic.h |    1 +
 monitor.c |    6 -
 3 files changed, 78 insertions(+), 1 deletions(-)

diff --git a/hw/apic.c b/hw/apic.c
index 69d6ac5..91b82d0 100644
--- a/hw/apic.c
+++ b/hw/apic.c
@@ -205,6 +205,78 @@ void apic_deliver_pic_intr(DeviceState *d, int level)
     }
 }
 
+#ifdef KVM_CAP_IRQCHIP
+static inline uint32_t kapic_reg(struct kvm_lapic_state *kapic, int reg_id);
+
+static void kvm_irqchip_deliver_nmi(void *p)
+{
+    APICState *s = p;
+    struct kvm_lapic_state klapic;
+    uint32_t lvt;
+
+    kvm_get_lapic(s->cpu_env, &klapic);
+    lvt = kapic_reg(&klapic, 0x32 + APIC_LVT_LINT1);
+
+    if (lvt & APIC_LVT_MASKED) {
+        return;
+    }
+
+    if (((lvt >> 8) & 7) != APIC_DM_NMI) {
+        return;
+    }
+
+    kvm_vcpu_ioctl(s->cpu_env, KVM_NMI);
+}
+
+static void __apic_deliver_nmi(APICState *s)
+{
+    if (kvm_irqchip_in_kernel()) {
+        run_on_cpu(s->cpu_env, kvm_irqchip_deliver_nmi, s);
+    } else {
+        apic_local_deliver(s, APIC_LVT_LINT1);
+    }
+}
+#else
+static void __apic_deliver_nmi(APICState *s)
+{
+    apic_local_deliver(s, APIC_LVT_LINT1);
+}
+#endif
+
+enum {
+    KVM_SET_LINT1_UNKNOWN,
+    KVM_SET_LINT1_ENABLED,
+    KVM_SET_LINT1_DISABLED,
+};
+
+static void kvm_set_lint1(void *p)
+{
+    CPUState *env = p;
+
+    kvm_vcpu_ioctl(env, KVM_SET_LINT1);
+}
+
+void apic_deliver_nmi(DeviceState *d)
+{
+    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
+    static int kernel_lint1 = KVM_SET_LINT1_UNKNOWN;
+
+    if (kernel_lint1 == KVM_SET_LINT1_UNKNOWN) {
+        if (kvm_enabled() && kvm_irqchip_in_kernel() &&
+            kvm_check_extension(kvm_state, KVM_CAP_SET_LINT1)) {
+            kernel_lint1 = KVM_SET_LINT1_ENABLED;
+        } else {
+            kernel_lint1 = KVM_SET_LINT1_DISABLED;
+        }
+    }
+
+    if (kernel_lint1 == KVM_SET_LINT1_ENABLED) {
+        run_on_cpu(s->cpu_env, kvm_set_lint1, s->cpu_env);
+    } else {
+        __apic_deliver_nmi(s);
+    }
+}
+
 #define foreach_apic(apic, deliver_bitmask, code) \
 {\
     int __i, __j, __mask;\
diff --git a/hw/apic.h b/hw/apic.h
index c857d52..3a4be0a 100644
--- a/hw/apic.h
+++ b/hw/apic.h
@@ -10,6 +10,7 @@ void apic_deliver_irq(uint8_t dest, uint8_t dest_mode,
                              uint8_t trigger_mode);
 int apic_accept_pic_intr(DeviceState *s);
 void apic_deliver_pic_intr(DeviceState *s, int level);
+void apic_deliver_nmi(DeviceState *d);
 int apic_get_interrupt(DeviceState *s);
 void apic_reset_irq_delivered(void);
 int apic_get_irq_delivered(void);
diff --git a/monitor.c b/monitor.c
index cb485bf..0b81f17 100644
--- a/monitor.c
+++ b/monitor.c
@@ -2616,7 +2616,11 @@ static int do_inject_nmi(Monitor *mon, const QDict *qdict, QObject **ret_data)
     CPUState *env;
 
     for (env = first_cpu; env != NULL; env = env->next_cpu) {
-        cpu_interrupt(env, CPU_INTERRUPT_NMI);
+        if (!env->apic_state) {
+            cpu_interrupt(env, CPU_INTERRUPT_NMI);
+        } else {
+            apic_deliver_nmi(env->apic_state);
+        }
     }
 
     return 0;
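
The LVT test done in the in-kernel irqchip fallback above can be tried in
isolation. The sketch below assumes the usual local APIC LVT layout (mask
in bit 16, delivery mode in bits 8..10, NMI encoded as 100b); the names
lint1_would_deliver_nmi, LVT_MASKED and DM_NMI are local to this example
and are not QEMU identifiers.

```c
#include <stdbool.h>
#include <stdint.h>

#define LVT_MASKED    (1u << 16)  /* LVT mask bit */
#define LVT_DM_SHIFT  8           /* delivery mode field starts at bit 8 */
#define LVT_DM_MASK   0x7u        /* delivery mode is three bits wide */
#define DM_NMI        4u          /* delivery mode 100b: NMI */

/* True only if the LINT1 entry is unmasked and programmed for NMI. */
bool lint1_would_deliver_nmi(uint32_t lvt)
{
    if (lvt & LVT_MASKED) {
        return false;
    }
    return ((lvt >> LVT_DM_SHIFT) & LVT_DM_MASK) == DM_NMI;
}
```

For example, an LVT value of 0x00000400 (unmasked, NMI) passes, while
0x00010400 (masked, NMI) and 0x00000700 (unmasked, ExtINT) do not.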
[Qemu-devel] [PATCH 3/4] block: drop redundant bdrv_flush implementation
From: Stefan Hajnoczi <stefa...@linux.vnet.ibm.com>

Block drivers now only need to provide one of .bdrv_co_flush(),
.bdrv_aio_flush() or, for legacy drivers, .bdrv_flush(). Remove the
redundant .bdrv_flush() implementations and, for the raw driver, replace
the asynchronous operation with the coroutine-based one.

Signed-off-by: Stefan Hajnoczi <stefa...@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
---
 block/blkdebug.c  |    6 --
 block/blkverify.c |    9 -
 block/qcow.c      |    6 --
 block/qcow2.c     |   19 ---
 block/qed.c       |    6 --
 block/raw-posix.c |   18 --
 block/raw.c       |   11 ++-
 7 files changed, 2 insertions(+), 73 deletions(-)

diff --git a/block/blkdebug.c b/block/blkdebug.c
index b3c5d42..9b88535 100644
--- a/block/blkdebug.c
+++ b/block/blkdebug.c
@@ -397,11 +397,6 @@ static void blkdebug_close(BlockDriverState *bs)
     }
 }
 
-static int blkdebug_flush(BlockDriverState *bs)
-{
-    return bdrv_flush(bs->file);
-}
-
 static BlockDriverAIOCB *blkdebug_aio_flush(BlockDriverState *bs,
     BlockDriverCompletionFunc *cb, void *opaque)
 {
@@ -454,7 +449,6 @@ static BlockDriver bdrv_blkdebug = {
 
     .bdrv_file_open     = blkdebug_open,
     .bdrv_close         = blkdebug_close,
-    .bdrv_flush         = blkdebug_flush,
 
     .bdrv_aio_readv     = blkdebug_aio_readv,
     .bdrv_aio_writev    = blkdebug_aio_writev,
diff --git a/block/blkverify.c b/block/blkverify.c
index c7522b4..483f3b3 100644
--- a/block/blkverify.c
+++ b/block/blkverify.c
@@ -116,14 +116,6 @@ static void blkverify_close(BlockDriverState *bs)
     s->test_file = NULL;
 }
 
-static int blkverify_flush(BlockDriverState *bs)
-{
-    BDRVBlkverifyState *s = bs->opaque;
-
-    /* Only flush test file, the raw file is not important */
-    return bdrv_flush(s->test_file);
-}
-
 static int64_t blkverify_getlength(BlockDriverState *bs)
 {
     BDRVBlkverifyState *s = bs->opaque;
@@ -368,7 +360,6 @@ static BlockDriver bdrv_blkverify = {
 
     .bdrv_file_open     = blkverify_open,
     .bdrv_close         = blkverify_close,
-    .bdrv_flush         = blkverify_flush,
 
     .bdrv_aio_readv     = blkverify_aio_readv,
     .bdrv_aio_writev    = blkverify_aio_writev,
diff --git a/block/qcow.c b/block/qcow.c
index c8bfecc..9b71116 100644
--- a/block/qcow.c
+++ b/block/qcow.c
@@ -781,11 +781,6 @@ static int qcow_write_compressed(BlockDriverState *bs, int64_t sector_num,
     return 0;
 }
 
-static int qcow_flush(BlockDriverState *bs)
-{
-    return bdrv_flush(bs->file);
-}
-
 static BlockDriverAIOCB *qcow_aio_flush(BlockDriverState *bs,
         BlockDriverCompletionFunc *cb, void *opaque)
 {
@@ -826,7 +821,6 @@ static BlockDriver bdrv_qcow = {
     .bdrv_open          = qcow_open,
     .bdrv_close         = qcow_close,
     .bdrv_create        = qcow_create,
-    .bdrv_flush         = qcow_flush,
     .bdrv_is_allocated  = qcow_is_allocated,
     .bdrv_set_key       = qcow_set_key,
     .bdrv_make_empty    = qcow_make_empty,
diff --git a/block/qcow2.c b/block/qcow2.c
index 510ff68..4dc980c 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1092,24 +1092,6 @@ static int qcow2_write_compressed(BlockDriverState *bs, int64_t sector_num,
     return 0;
 }
 
-static int qcow2_flush(BlockDriverState *bs)
-{
-    BDRVQcowState *s = bs->opaque;
-    int ret;
-
-    ret = qcow2_cache_flush(bs, s->l2_table_cache);
-    if (ret < 0) {
-        return ret;
-    }
-
-    ret = qcow2_cache_flush(bs, s->refcount_block_cache);
-    if (ret < 0) {
-        return ret;
-    }
-
-    return bdrv_flush(bs->file);
-}
-
 static BlockDriverAIOCB *qcow2_aio_flush(BlockDriverState *bs,
                                          BlockDriverCompletionFunc *cb,
                                          void *opaque)
@@ -1242,7 +1224,6 @@ static BlockDriver bdrv_qcow2 = {
     .bdrv_open          = qcow2_open,
     .bdrv_close         = qcow2_close,
     .bdrv_create        = qcow2_create,
-    .bdrv_flush         = qcow2_flush,
     .bdrv_is_allocated  = qcow2_is_allocated,
     .bdrv_set_key       = qcow2_set_key,
     .bdrv_make_empty    = qcow2_make_empty,
diff --git a/block/qed.c b/block/qed.c
index e87dc4d..2e06992 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -533,11 +533,6 @@ static void bdrv_qed_close(BlockDriverState *bs)
     qemu_vfree(s->l1_table);
 }
 
-static int bdrv_qed_flush(BlockDriverState *bs)
-{
-    return bdrv_flush(bs->file);
-}
-
 static int qed_create(const char *filename, uint32_t cluster_size,
                       uint64_t image_size, uint32_t table_size,
                       const char *backing_file, const char *backing_fmt)
@@ -1479,7 +1474,6 @@ static BlockDriver bdrv_qed = {
     .bdrv_open                = bdrv_qed_open,
     .bdrv_close               = bdrv_qed_close,
     .bdrv_create              = bdrv_qed_create,
-    .bdrv_flush               = bdrv_qed_flush,
     .bdrv_is_allocated        = bdrv_qed_is_allocated,
     .bdrv_make_empty          =
[Qemu-devel] [PATCH 4/4] block: add bdrv_co_discard and bdrv_aio_discard support
This similarly adds support for coroutine and asynchronous discard.

Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
---
	I was not sure if qcow2 could be changed to co_discard, though
	I suspected yes.

 block.c      |   72 +-
 block.h      |    3 ++
 block/raw.c  |    8 --
 block_int.h  |    9 +-
 trace-events |    1 +
 5 files changed, 77 insertions(+), 16 deletions(-)

diff --git a/block.c b/block.c
index 0af9a89..7c60361 100644
--- a/block.c
+++ b/block.c
@@ -1768,17 +1768,6 @@ int bdrv_has_zero_init(BlockDriverState *bs)
     return 1;
 }
 
-int bdrv_discard(BlockDriverState *bs, int64_t sector_num, int nb_sectors)
-{
-    if (!bs->drv) {
-        return -ENOMEDIUM;
-    }
-    if (!bs->drv->bdrv_discard) {
-        return 0;
-    }
-    return bs->drv->bdrv_discard(bs, sector_num, nb_sectors);
-}
-
 /*
  * Returns true iff the specified sector is present in the disk image. Drivers
  * not implementing the functionality are assumed to not support backing files,
@@ -2911,6 +2900,66 @@ int bdrv_flush(BlockDriverState *bs)
     return rwco.ret;
 }
 
+static void bdrv_discard_co_entry(void *opaque)
+{
+    RwCo *rwco = opaque;
+    BlockDriverState *bs = rwco->bs;
+    int64_t sector_num = rwco->sector_num;
+    int nb_sectors = rwco->nb_sectors;
+
+    if (!bs->drv) {
+        rwco->ret = -ENOMEDIUM;
+    } else if (bdrv_check_request(bs, sector_num, nb_sectors)) {
+        rwco->ret = -EIO;
+    } else if (bs->read_only) {
+        rwco->ret = -EROFS;
+    } else if (bs->drv->bdrv_co_discard) {
+        rwco->ret = bs->drv->bdrv_co_discard(bs, sector_num, nb_sectors);
+    } else if (bs->drv->bdrv_aio_discard) {
+        BlockDriverAIOCB *acb;
+        CoroutineIOCompletion co = {
+            .coroutine = qemu_coroutine_self(),
+        };
+
+        acb = bs->drv->bdrv_aio_discard(bs, sector_num, nb_sectors,
+                                        bdrv_co_io_em_complete, &co);
+        if (acb == NULL) {
+            rwco->ret = -EIO;
+        } else {
+            qemu_coroutine_yield();
+            rwco->ret = co.ret;
+        }
+    } else if (bs->drv->bdrv_discard) {
+        rwco->ret = bs->drv->bdrv_discard(bs, sector_num, nb_sectors);
+    } else {
+        rwco->ret = 0;
+    }
+}
+
+int bdrv_discard(BlockDriverState *bs, int64_t sector_num, int nb_sectors)
+{
+    Coroutine *co;
+    RwCo rwco = {
+        .bs = bs,
+        .sector_num = sector_num,
+        .nb_sectors = nb_sectors,
+        .ret = NOT_DONE,
+    };
+
+    if (qemu_in_coroutine()) {
+        /* Fast-path if already in coroutine context */
+        bdrv_discard_co_entry(&rwco);
+    } else {
+        co = qemu_coroutine_create(bdrv_discard_co_entry);
+        qemu_coroutine_enter(co, &rwco);
+        while (rwco.ret == NOT_DONE) {
+            qemu_aio_wait();
+        }
+    }
+
+    return rwco.ret;
+}
+
 /**/
 /* removable device support */
 
diff --git a/block.h b/block.h
index e77988e..4f4aa94 100644
--- a/block.h
+++ b/block.h
@@ -166,6 +166,9 @@ BlockDriverAIOCB *bdrv_aio_writev(BlockDriverState *bs, int64_t sector_num,
                                   BlockDriverCompletionFunc *cb, void *opaque);
 BlockDriverAIOCB *bdrv_aio_flush(BlockDriverState *bs,
                                  BlockDriverCompletionFunc *cb, void *opaque);
+BlockDriverAIOCB *bdrv_aio_discard(BlockDriverState *bs,
+                                   int64_t sector_num, int nb_sectors,
+                                   BlockDriverCompletionFunc *cb, void *opaque);
 void bdrv_aio_cancel(BlockDriverAIOCB *acb);
 
 typedef struct BlockRequest {
diff --git a/block/raw.c b/block/raw.c
index 161c9cf..ff68514 100644
--- a/block/raw.c
+++ b/block/raw.c
@@ -45,7 +45,8 @@ static int raw_probe(const uint8_t *buf, int buf_size, const char *filename)
    return 1; /* everything can be opened as raw image */
 }
 
-static int raw_discard(BlockDriverState *bs, int64_t sector_num, int nb_sectors)
+static int coroutine_fn raw_co_discard(BlockDriverState *bs,
+                                       int64_t sector_num, int nb_sectors)
 {
     return bdrv_discard(bs->file, sector_num, nb_sectors);
 }
@@ -109,15 +110,16 @@ static BlockDriver bdrv_raw = {
 
     .bdrv_open          = raw_open,
     .bdrv_close         = raw_close,
+
     .bdrv_co_readv      = raw_co_readv,
     .bdrv_co_writev     = raw_co_writev,
     .bdrv_co_flush      = raw_co_flush,
+    .bdrv_co_discard    = raw_co_discard,
+
     .bdrv_probe         = raw_probe,
     .bdrv_getlength     = raw_getlength,
     .bdrv_truncate      = raw_truncate,
 
-    .bdrv_discard       = raw_discard,
-
     .bdrv_is_inserted   = raw_is_inserted,
     .bdrv_media_changed = raw_media_changed,
     .bdrv_eject         = raw_eject,
diff --git a/block_int.h b/block_int.h
index 9cb536d..384598f 100644
--- a/block_int.h
+++ b/block_int.h
@@ -63,6 +63,8 @@ struct BlockDriver {
 void
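
The fallback order used by bdrv_discard_co_entry() in the block.c hunk
above (coroutine hook first, then the AIO hook, then the legacy
synchronous hook, and success if the driver offers none) can be sketched
without any QEMU machinery. Everything here is a simplified stand-in:
DemoDriver, demo_discard and the three demo_*_hook functions are
hypothetical, the AIO hook is modelled as a plain call, and -1 stands in
for -ENOMEDIUM.

```c
#include <stddef.h>

/* Hypothetical, simplified driver table; the real BlockDriver has more
 * hooks and different signatures. */
typedef struct DemoDriver {
    int (*co_discard)(long sector, int nb);   /* coroutine-style hook */
    int (*aio_discard)(long sector, int nb);  /* stands in for the AIO hook */
    int (*discard)(long sector, int nb);      /* legacy synchronous hook */
} DemoDriver;

/* Stub hooks that only report which one ran. */
int demo_co_hook(long sector, int nb)     { (void)sector; (void)nb; return 1; }
int demo_aio_hook(long sector, int nb)    { (void)sector; (void)nb; return 2; }
int demo_legacy_hook(long sector, int nb) { (void)sector; (void)nb; return 3; }

/* Pick the first hook the driver provides, in the patch's order. */
int demo_discard(const DemoDriver *drv, long sector, int nb)
{
    if (drv == NULL) {
        return -1;                       /* stands in for -ENOMEDIUM */
    } else if (drv->co_discard) {
        return drv->co_discard(sector, nb);
    } else if (drv->aio_discard) {
        return drv->aio_discard(sector, nb);
    } else if (drv->discard) {
        return drv->discard(sector, nb);
    }
    return 0;                            /* discard is optional: succeed */
}
```

A driver that fills in only the legacy hook keeps working unchanged,
which is the point of leaving that branch in the chain.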
Re: [Qemu-devel] [PATCH 1/1 V5] kernel/kvm: introduce KVM_SET_LINT1 and fix improper nmi emulation
On 2011-10-14 11:03, Lai Jiangshan wrote:
> Currently, an NMI is blindly sent to all the vCPUs when an NMI button
> event happens. This doesn't properly emulate real hardware, on which an
> NMI button event triggers LINT1. Because of this, the NMI is sent to the
> processor even when LINT1 is masked in the LVT. For example, this causes
> the problem that kdump initiated by NMI sometimes doesn't work on KVM,
> because kdump assumes NMI is masked on CPUs other than CPU0.
>
> With this patch, we introduce KVM_SET_LINT1, which can be used to
> correctly emulate the NMI button without changing the old KVM_NMI
> behavior.
>
> Signed-off-by: Lai Jiangshan <la...@cn.fujitsu.com>
> Reported-by: Kenji Kaneshige <kaneshige.ke...@jp.fujitsu.com>
> ---
>  arch/x86/include/asm/kvm.h |    1 +
>  arch/x86/kvm/irq.h         |    1 +
>  arch/x86/kvm/lapic.c       |    7 +++
>  arch/x86/kvm/x86.c         |    8
>  include/linux/kvm.h        |    5 +
>  5 files changed, 22 insertions(+), 0 deletions(-)
>
> diff --git a/arch/x86/include/asm/kvm.h b/arch/x86/include/asm/kvm.h
> index 4d8dcbd..88d0ac3 100644
> --- a/arch/x86/include/asm/kvm.h
> +++ b/arch/x86/include/asm/kvm.h
> @@ -24,6 +24,7 @@
>  #define __KVM_HAVE_DEBUGREGS
>  #define __KVM_HAVE_XSAVE
>  #define __KVM_HAVE_XCRS
> +#define __KVM_HAVE_SET_LINT1
>
>  /* Architectural interrupt line count. */
>  #define KVM_NR_INTERRUPTS 256
> diff --git a/arch/x86/kvm/irq.h b/arch/x86/kvm/irq.h
> index 53e2d08..0c96315 100644
> --- a/arch/x86/kvm/irq.h
> +++ b/arch/x86/kvm/irq.h
> @@ -95,6 +95,7 @@ void kvm_pic_reset(struct kvm_kpic_state *s);
>  void kvm_inject_pending_timer_irqs(struct kvm_vcpu *vcpu);
>  void kvm_inject_apic_timer_irqs(struct kvm_vcpu *vcpu);
>  void kvm_apic_nmi_wd_deliver(struct kvm_vcpu *vcpu);
> +void kvm_apic_lint1_deliver(struct kvm_vcpu *vcpu);
>  void __kvm_migrate_apic_timer(struct kvm_vcpu *vcpu);
>  void __kvm_migrate_pit_timer(struct kvm_vcpu *vcpu);
>  void __kvm_migrate_timers(struct kvm_vcpu *vcpu);
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 57dcbd4..87fe36a 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -1039,6 +1039,13 @@ void kvm_apic_nmi_wd_deliver(struct kvm_vcpu *vcpu)
>  		kvm_apic_local_deliver(apic, APIC_LVT0);
>  }
>
> +void kvm_apic_lint1_deliver(struct kvm_vcpu *vcpu)
> +{
> +	struct kvm_lapic *apic = vcpu->arch.apic;
> +
> +	kvm_apic_local_deliver(apic, APIC_LVT1);
> +}
> +
>  static struct kvm_timer_ops lapic_timer_ops = {
>  	.is_periodic = lapic_is_periodic,
>  };
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 84a28ea..fccd094 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -2077,6 +2077,7 @@ int kvm_dev_ioctl_check_extension(long ext)
>  	case KVM_CAP_XSAVE:
>  	case KVM_CAP_ASYNC_PF:
>  	case KVM_CAP_GET_TSC_KHZ:
> +	case KVM_CAP_SET_LINT1:
>  		r = 1;
>  		break;
>  	case KVM_CAP_COALESCED_MMIO:
> @@ -3264,6 +3265,13 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
>  		goto out;
>  	}
> +	case KVM_SET_LINT1: {
> +		r = -EINVAL;
> +		if (!irqchip_in_kernel(vcpu->kvm))
> +			goto out;
> +		r = 0;
> +		kvm_apic_lint1_deliver(vcpu);
> +	}
>  	default:
>  		r = -EINVAL;
>  	}
> diff --git a/include/linux/kvm.h b/include/linux/kvm.h
> index aace6b8..3a10572 100644
> --- a/include/linux/kvm.h
> +++ b/include/linux/kvm.h
> @@ -554,6 +554,9 @@ struct kvm_ppc_pvinfo {
>  #define KVM_CAP_PPC_SMT 64
>  #define KVM_CAP_PPC_RMA 65
>  #define KVM_CAP_S390_GMAP 71
> +#ifdef __KVM_HAVE_SET_LINT1
> +#define KVM_CAP_SET_LINT1 72
> +#endif

Actually, there is no need for __KVM_HAVE_SET_LINT1 and the #ifdef. User
land will just do a runtime check.

Jan
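
A runtime check of the kind mentioned above is normally done with the
KVM_CHECK_EXTENSION ioctl. The sketch below is a minimal Linux-only
illustration, not the QEMU code path; kvm_cap_available is a hypothetical
helper name, and the capability number is passed in by the caller rather
than assumed.

```c
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Returns 1 if the KVM file descriptor reports the capability, and 0
 * otherwise, including when the ioctl itself fails (for example on a
 * bad descriptor). */
int kvm_cap_available(int kvm_fd, long cap)
{
    int r = ioctl(kvm_fd, KVM_CHECK_EXTENSION, cap);

    return r > 0;
}
```

A caller would open /dev/kvm, pass the resulting descriptor together with
the capability constant it cares about, and pick the code path from the
result, so no compile-time #ifdef on the capability is needed.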
Re: [Qemu-devel] [PATCH] compatfd.c: Don't pass NULL pointer to SYS_signalfd
On Thu, Oct 13, 2011 at 06:45:37PM +0100, Peter Maydell wrote:
> Don't pass a NULL pointer in to SYS_signalfd in
> qemu_signalfd_available(): this isn't valid and Valgrind complains
> about it.
>
> Signed-off-by: Peter Maydell <peter.mayd...@linaro.org>
> ---
>  compatfd.c |   12 ++--
>  1 files changed, 10 insertions(+), 2 deletions(-)

Reviewed-by: Stefan Hajnoczi <stefa...@linux.vnet.ibm.com>
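
The patch body is not quoted in this reply, so the following is only a
sketch of the general idea: probe for signalfd support while handing the
kernel a valid signal mask instead of NULL. It uses the glibc signalfd()
wrapper rather than the raw SYS_signalfd syscall named above, and
signalfd_works is a hypothetical name.

```c
#include <signal.h>
#include <sys/signalfd.h>
#include <unistd.h>

/* Returns 1 if a signalfd can be created, 0 otherwise. */
int signalfd_works(void)
{
    sigset_t mask;
    int fd;

    sigemptyset(&mask);             /* a real, initialised mask, not NULL */
    fd = signalfd(-1, &mask, 0);    /* -1 asks for a new descriptor */
    if (fd < 0) {
        return 0;
    }
    close(fd);
    return 1;
}
```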
[Qemu-devel] [0/12] Preliminary work for IOMMU emulation support (v2)
A while back, Eduard - Gabriel Munteanu sent a series of patches
implementing support for emulating the AMD IOMMU in conjunction with
qemu emulated PCI devices. A revised patch series added support for
the Intel IOMMU, and I also sent a revised version of this series
which added support for the hypervisor mediated IOMMU on the pseries
machine.

Richard Henderson also weighed in on the discussion, and there's still
a certain amount to be thrashed out in terms of exactly how to set up
an IOMMU / DMA translation subsystem.

However, really only 2 or 3 patches in any of these series have
contained anything interesting. The rest of the series has been
converting existing PCI emulated devices to use the new DMA interface
which worked through the IOMMU translation, whatever it was. While we
keep working out what we want for the guts of the IOMMU support, these
device conversion patches keep bitrotting against updates to the
various device implementations themselves.

Really, regardless of whether we're actually implementing IOMMU
translation, it makes sense that qemu code should distinguish between
when it is really operating in CPU physical addresses and when it is
operating in bus or DMA addresses which might have some kind of
translation into physical addresses.

This series, therefore, begins the conversion of existing PCI device
emulation code to use new (stub) pci dma access functions. These are,
for now, just defined to be untranslated cpu physical memory accesses,
as before, but this has three advantages:

   * It becomes obvious where the code is working with dma addresses,
     so it's easier to grep for what might be affected by an IOMMU or
     other bus address translation.

   * The new stubs take the PCIDevice *, from which any of the various
     suggested IOMMU interfaces should be able to locate the correct
     IOMMU translation context.

   * The new pci_dma_{read,write}() functions have a return value.
     When we do have IOMMU support, translation failures could lead to
     these functions failing, so we want a way to report it.

This series converts all the easy cases. It doesn't yet handle
devices which have both PCI and non-PCI variants, such as AHCI, OHCI
and ne2k. Unlike the earlier version of this series, functions using
scatter gather _are_ covered, though.
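
To make the shape of such stubs concrete, here is a toy model of the idea
described above: accessors that take a device pointer, treat the bus
address as an untranslated physical address, and return a status. None of
this is the series' actual code; DemoPCIDevice, demo_dma_addr_t,
demo_guest_ram and the demo_pci_dma_* functions are hypothetical stand-ins.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t demo_dma_addr_t;

/* Hypothetical stand-ins for guest RAM and a PCI device. */
#define DEMO_RAM_SIZE 4096
static uint8_t demo_guest_ram[DEMO_RAM_SIZE];

typedef struct DemoPCIDevice {
    const char *name;  /* an IOMMU context could later be found from here */
} DemoPCIDevice;

/* Untranslated access: bus address == physical address.  Returns 0 on
 * success and -1 if the range falls outside the toy RAM. */
int demo_pci_dma_rw(DemoPCIDevice *dev, demo_dma_addr_t addr,
                    void *buf, size_t len, int is_write)
{
    (void)dev;  /* unused until some translation is plugged in */
    if (addr > DEMO_RAM_SIZE || len > DEMO_RAM_SIZE - addr) {
        return -1;
    }
    if (is_write) {
        memcpy(demo_guest_ram + addr, buf, len);
    } else {
        memcpy(buf, demo_guest_ram + addr, len);
    }
    return 0;
}

int demo_pci_dma_read(DemoPCIDevice *dev, demo_dma_addr_t addr,
                      void *buf, size_t len)
{
    return demo_pci_dma_rw(dev, addr, buf, len, 0);
}

int demo_pci_dma_write(DemoPCIDevice *dev, demo_dma_addr_t addr,
                       const void *buf, size_t len)
{
    return demo_pci_dma_rw(dev, addr, (void *)buf, len, 1);
}
```

Because callers already pass the device and check the return value, a
later translation layer could be slotted into demo_pci_dma_rw() without
touching the device models again.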
[Qemu-devel] [PATCH 06/12] e1000: Use PCI DMA stub functions
From: Eduard - Gabriel Munteanu <eduard.munte...@linux360.ro>

This updates the e1000 device emulation to use the explicit PCI DMA
functions, instead of directly calling physical memory access functions.

Signed-off-by: Eduard - Gabriel Munteanu <eduard.munte...@linux360.ro>
Signed-off-by: David Gibson <da...@gibson.dropbear.id.au>
---
 hw/e1000.c |   29 +++--
 1 files changed, 15 insertions(+), 14 deletions(-)

diff --git a/hw/e1000.c b/hw/e1000.c
index ce8fc8b..986ed9c 100644
--- a/hw/e1000.c
+++ b/hw/e1000.c
@@ -31,6 +31,7 @@
 #include "net/checksum.h"
 #include "loader.h"
 #include "sysemu.h"
+#include "dma.h"
 
 #include "e1000_hw.h"
 
@@ -465,7 +466,7 @@ process_tx_desc(E1000State *s, struct e1000_tx_desc *dp)
             bytes = split_size;
             if (tp->size + bytes > msh)
                 bytes = msh - tp->size;
-            cpu_physical_memory_read(addr, tp->data + tp->size, bytes);
+            pci_dma_read(&s->dev, addr, tp->data + tp->size, bytes);
             if ((sz = tp->size + bytes) >= hdr && tp->size < hdr)
                 memmove(tp->header, tp->data, hdr);
             tp->size = sz;
@@ -480,7 +481,7 @@ process_tx_desc(E1000State *s, struct e1000_tx_desc *dp)
         // context descriptor TSE is not set, while data descriptor TSE is set
         DBGOUT(TXERR, "TCP segmentaion Error\n");
     } else {
-        cpu_physical_memory_read(addr, tp->data + tp->size, split_size);
+        pci_dma_read(&s->dev, addr, tp->data + tp->size, split_size);
         tp->size += split_size;
     }
 
@@ -496,7 +497,7 @@ process_tx_desc(E1000State *s, struct e1000_tx_desc *dp)
 }
 
 static uint32_t
-txdesc_writeback(target_phys_addr_t base, struct e1000_tx_desc *dp)
+txdesc_writeback(E1000State *s, dma_addr_t base, struct e1000_tx_desc *dp)
 {
     uint32_t txd_upper, txd_lower = le32_to_cpu(dp->lower.data);
 
@@ -505,8 +506,8 @@ txdesc_writeback(target_phys_addr_t base, struct e1000_tx_desc *dp)
     txd_upper = (le32_to_cpu(dp->upper.data) | E1000_TXD_STAT_DD) &
                 ~(E1000_TXD_STAT_EC | E1000_TXD_STAT_LC | E1000_TXD_STAT_TU);
     dp->upper.data = cpu_to_le32(txd_upper);
-    cpu_physical_memory_write(base + ((char *)&dp->upper - (char *)dp),
-                              (void *)&dp->upper, sizeof(dp->upper));
+    pci_dma_write(&s->dev, base + ((char *)&dp->upper - (char *)dp),
+                  (void *)&dp->upper, sizeof(dp->upper));
     return E1000_ICR_TXDW;
 }
 
@@ -521,7 +522,7 @@ static uint64_t tx_desc_base(E1000State *s)
 static void
 start_xmit(E1000State *s)
 {
-    target_phys_addr_t base;
+    dma_addr_t base;
     struct e1000_tx_desc desc;
     uint32_t tdh_start = s->mac_reg[TDH], cause = E1000_ICS_TXQE;
 
@@ -533,14 +534,14 @@ start_xmit(E1000State *s)
     while (s->mac_reg[TDH] != s->mac_reg[TDT]) {
         base = tx_desc_base(s) +
                sizeof(struct e1000_tx_desc) * s->mac_reg[TDH];
-        cpu_physical_memory_read(base, (void *)&desc, sizeof(desc));
+        pci_dma_read(&s->dev, base, (void *)&desc, sizeof(desc));
 
         DBGOUT(TX, "index %d: %p : %x %x\n", s->mac_reg[TDH],
                (void *)(intptr_t)desc.buffer_addr, desc.lower.data,
                desc.upper.data);
 
         process_tx_desc(s, &desc);
-        cause |= txdesc_writeback(base, &desc);
+        cause |= txdesc_writeback(s, base, &desc);
 
         if (++s->mac_reg[TDH] * sizeof(desc) >= s->mac_reg[TDLEN])
             s->mac_reg[TDH] = 0;
@@ -668,7 +669,7 @@ e1000_receive(VLANClientState *nc, const uint8_t *buf, size_t size)
 {
     E1000State *s = DO_UPCAST(NICState, nc, nc)->opaque;
     struct e1000_rx_desc desc;
-    target_phys_addr_t base;
+    dma_addr_t base;
     unsigned int n, rdt;
     uint32_t rdh_start;
     uint16_t vlan_special = 0;
@@ -713,7 +714,7 @@ e1000_receive(VLANClientState *nc, const uint8_t *buf, size_t size)
             desc_size = s->rxbuf_size;
         }
         base = rx_desc_base(s) + sizeof(desc) * s->mac_reg[RDH];
-        cpu_physical_memory_read(base, (void *)&desc, sizeof(desc));
+        pci_dma_read(&s->dev, base, (void *)&desc, sizeof(desc));
         desc.special = vlan_special;
         desc.status |= (vlan_status | E1000_RXD_STAT_DD);
         if (desc.buffer_addr) {
@@ -722,9 +723,9 @@ e1000_receive(VLANClientState *nc, const uint8_t *buf, size_t size)
                 if (copy_size > s->rxbuf_size) {
                     copy_size = s->rxbuf_size;
                 }
-                cpu_physical_memory_write(le64_to_cpu(desc.buffer_addr),
-                                          (void *)(buf + desc_offset + vlan_offset),
-                                          copy_size);
+                pci_dma_write(&s->dev, le64_to_cpu(desc.buffer_addr),
+                              (void *)(buf + desc_offset + vlan_offset),
+                              copy_size);
             }
             desc_offset += desc_size;
             desc.length = cpu_to_le16(desc_size);
@@ -738,7 +739,7 @@ e1000_receive(VLANClientState *nc, const uint8_t *buf, size_t size)
         }
[Qemu-devel] [PATCH 05/12] es1370: Use PCI DMA stub functions
From: Eduard - Gabriel Munteanu <eduard.munte...@linux360.ro>

This updates the es1370 device emulation to use the explicit PCI DMA
functions, instead of directly calling physical memory access functions.

Signed-off-by: Eduard - Gabriel Munteanu <eduard.munte...@linux360.ro>
Signed-off-by: David Gibson <da...@gibson.dropbear.id.au>
---
 hw/es1370.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/hw/es1370.c b/hw/es1370.c
index 2daadde..c5c16b0 100644
--- a/hw/es1370.c
+++ b/hw/es1370.c
@@ -30,6 +30,7 @@
 #include "audiodev.h"
 #include "audio/audio.h"
 #include "pci.h"
+#include "dma.h"
 
 /* Missing stuff:
    SCTRL_P[12](END|ST)INC
@@ -802,7 +803,7 @@ static void es1370_transfer_audio (ES1370State *s, struct chan *d, int loop_sel,
             if (!acquired)
                 break;
 
-            cpu_physical_memory_write (addr, tmpbuf, acquired);
+            pci_dma_write (&s->dev, addr, tmpbuf, acquired);
 
             temp -= acquired;
             addr += acquired;
@@ -816,7 +817,7 @@ static void es1370_transfer_audio (ES1370State *s, struct chan *d, int loop_sel,
             int copied, to_copy;
 
             to_copy = audio_MIN ((size_t) temp, sizeof (tmpbuf));
-            cpu_physical_memory_read (addr, tmpbuf, to_copy);
+            pci_dma_read (&s->dev, addr, tmpbuf, to_copy);
             copied = AUD_write (voice, tmpbuf, to_copy);
             if (!copied)
                 break;
-- 
1.7.6.3
[Qemu-devel] [PATCH 07/12] lsi53c895a: Use PCI DMA stub functions
From: Eduard - Gabriel Munteanu <eduard.munte...@linux360.ro>

This updates the lsi53c895a device emulation to use the explicit PCI DMA
functions, instead of directly calling physical memory access functions.

Signed-off-by: Eduard - Gabriel Munteanu <eduard.munte...@linux360.ro>
Signed-off-by: David Gibson <da...@gibson.dropbear.id.au>
---
 hw/lsi53c895a.c |   31 +++
 1 files changed, 15 insertions(+), 16 deletions(-)

diff --git a/hw/lsi53c895a.c b/hw/lsi53c895a.c
index e077ec0..e7e817a 100644
--- a/hw/lsi53c895a.c
+++ b/hw/lsi53c895a.c
@@ -15,6 +15,8 @@
 #include "hw.h"
 #include "pci.h"
 #include "scsi.h"
+#include "block_int.h"
+#include "dma.h"
 
 //#define DEBUG_LSI
 //#define DEBUG_LSI_REG
@@ -390,10 +392,7 @@ static inline uint32_t read_dword(LSIState *s, uint32_t addr)
 {
     uint32_t buf;
 
-    /* XXX: an optimization here used to fast-path the read from scripts
-     * memory. But that bypasses any iommu.
-     */
-    cpu_physical_memory_read(addr, (uint8_t *)&buf, 4);
+    pci_dma_read(&s->dev, addr, (uint8_t *)&buf, 4);
     return cpu_to_le32(buf);
 }
 
@@ -532,7 +531,7 @@ static void lsi_bad_selection(LSIState *s, uint32_t id)
 static void lsi_do_dma(LSIState *s, int out)
 {
     uint32_t count, id;
-    target_phys_addr_t addr;
+    dma_addr_t addr;
     SCSIDevice *dev;
 
     assert(s->current);
@@ -571,9 +570,9 @@ static void lsi_do_dma(LSIState *s, int out)
     }
     /* ??? Set SFBR to first data byte. */
     if (out) {
-        cpu_physical_memory_read(addr, s->current->dma_buf, count);
+        pci_dma_read(&s->dev, addr, s->current->dma_buf, count);
     } else {
-        cpu_physical_memory_write(addr, s->current->dma_buf, count);
+        pci_dma_write(&s->dev, addr, s->current->dma_buf, count);
     }
     s->current->dma_len -= count;
     if (s->current->dma_len == 0) {
@@ -766,7 +765,7 @@ static void lsi_do_command(LSIState *s)
     DPRINTF("Send command len=%d\n", s->dbc);
     if (s->dbc > 16)
         s->dbc = 16;
-    cpu_physical_memory_read(s->dnad, buf, s->dbc);
+    pci_dma_read(&s->dev, s->dnad, buf, s->dbc);
     s->sfbr = buf[0];
     s->command_complete = 0;
 
@@ -817,7 +816,7 @@ static void lsi_do_status(LSIState *s)
     s->dbc = 1;
     status = s->status;
     s->sfbr = status;
-    cpu_physical_memory_write(s->dnad, &status, 1);
+    pci_dma_write(&s->dev, s->dnad, &status, 1);
     lsi_set_phase(s, PHASE_MI);
     s->msg_action = 1;
     lsi_add_msg_byte(s, 0); /* COMMAND COMPLETE */
@@ -831,7 +830,7 @@ static void lsi_do_msgin(LSIState *s)
     len = s->msg_len;
     if (len > s->dbc)
         len = s->dbc;
-    cpu_physical_memory_write(s->dnad, s->msg, len);
+    pci_dma_write(&s->dev, s->dnad, s->msg, len);
     /* Linux drivers rely on the last byte being in the SIDL. */
     s->sidl = s->msg[len - 1];
     s->msg_len -= len;
@@ -863,7 +862,7 @@ static void lsi_do_msgin(LSIState *s)
 static uint8_t lsi_get_msgbyte(LSIState *s)
 {
     uint8_t data;
-    cpu_physical_memory_read(s->dnad, &data, 1);
+    pci_dma_read(&s->dev, s->dnad, &data, 1);
     s->dnad++;
     s->dbc--;
     return data;
@@ -1015,8 +1014,8 @@ static void lsi_memcpy(LSIState *s, uint32_t dest, uint32_t src, int count)
     DPRINTF("memcpy dest 0x%08x src 0x%08x count %d\n", dest, src, count);
     while (count) {
         n = (count > LSI_BUF_SIZE) ? LSI_BUF_SIZE : count;
-        cpu_physical_memory_read(src, buf, n);
-        cpu_physical_memory_write(dest, buf, n);
+        pci_dma_read(&s->dev, src, buf, n);
+        pci_dma_write(&s->dev, dest, buf, n);
         src += n;
         dest += n;
         count -= n;
@@ -1084,7 +1083,7 @@ again:
 
             /* 32-bit Table indirect */
             offset = sxt24(addr);
-            cpu_physical_memory_read(s->dsa + offset, (uint8_t *)buf, 8);
+            pci_dma_read(&s->dev, s->dsa + offset, (uint8_t *)buf, 8);
             /* byte count is stored in bits 0:23 only */
             s->dbc = cpu_to_le32(buf[0]) & 0xffffff;
             s->rbc = s->dbc;
@@ -1443,7 +1442,7 @@ again:
             n = (insn & 7);
             reg = (insn >> 16) & 0xff;
             if (insn & (1 << 24)) {
-                cpu_physical_memory_read(addr, data, n);
+                pci_dma_read(&s->dev, addr, data, n);
                 DPRINTF("Load reg 0x%x size %d addr 0x%08x = %08x\n", reg, n,
                         addr, *(int *)data);
                 for (i = 0; i < n; i++) {
@@ -1454,7 +1453,7 @@ again:
                 for (i = 0; i < n; i++) {
                     data[i] = lsi_reg_readb(s, reg + i);
                 }
-                cpu_physical_memory_write(addr, data, n);
+                pci_dma_write(&s->dev, addr, data, n);
             }
         }
     }
-- 
1.7.6.3
Re: [Qemu-devel] [0/12] Preliminary work for IOMMU emulation support (v2)
On Fri, Oct 14, 2011 at 08:20:51PM +1100, David Gibson wrote:

A while back, Eduard-Gabriel Munteanu sent a series of patches
implementing support for emulating the AMD IOMMU in conjunction with
qemu emulated PCI devices. A revised patch series added support for
the Intel IOMMU, and I also sent a revised version of this series
which added support for the hypervisor mediated IOMMU on the pseries
machine.

Richard Henderson also weighed in on the discussion, and there's still
a certain amount to be thrashed out in terms of exactly how to set up
an IOMMU / DMA translation subsystem.

However, really only 2 or 3 patches in any of these series have
contained anything interesting. The rest of the series has been
converting existing PCI emulated devices to use the new DMA interface
which worked through the IOMMU translation, whatever it was. While we
keep working out what we want for the guts of the IOMMU support, these
device conversion patches keep bitrotting against updates to the
various device implementations themselves.

Really, regardless of whether we're actually implementing IOMMU
translation, it makes sense that qemu code should distinguish between
when it is really operating in CPU physical addresses and when it is
operating in bus or DMA addresses which might have some kind of
translation into physical addresses.

This series, therefore, begins the conversion of existing PCI device
emulation code to use new (stub) pci dma access functions. These are,
for now, just defined to be untranslated cpu physical memory accesses,
as before, but this has three advantages:

   * It becomes obvious where the code is working with dma addresses,
     so it's easier to grep for what might be affected by an IOMMU or
     other bus address translation.

   * The new stubs take the PCIDevice *, from which any of the various
     suggested IOMMU interfaces should be able to locate the correct
     IOMMU translation context.

   * The new pci_dma_{read,write}() functions have a return value.
     When we do have IOMMU support, translation failures could lead to
     these functions failing, so we want a way to report it.

This series converts all the easy cases. It doesn't yet handle
devices which have both PCI and non-PCI variants, such as AHCI, OHCI
and ne2k. Unlike the earlier version of this series, functions using
scatter gather _are_ covered, though.

Oh, one more note. There are a few checkpatch failures in the series;
most are false positives (mistaking pointer * for a multiplication,
mostly). There are a few more that are true style errors that I've
left in on the basis that matching the surrounding code is more
important than being checkpatchily correct.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson
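To illustrate the third point above, here is a minimal sketch of what a stub with a return value might look like. Everything in it is an assumption made for illustration: the PCIDevice struct, the guest_ram array and the exact signatures are stand-ins, not the definitions from the series. For now the stub only bounds-checks against a toy memory array; an IOMMU-aware version would report a translation failure through the same return path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the QEMU types; not the real definitions. */
typedef uint64_t dma_addr_t;
typedef struct PCIDevice { const char *name; } PCIDevice;

/* Toy "guest memory" so that the sketch is self-contained. */
static uint8_t guest_ram[4096];

/* Returns 1 if [addr, addr + len) lies inside guest_ram, else 0. */
static inline int dma_range_ok(dma_addr_t addr, size_t len)
{
    return addr <= sizeof(guest_ram) && len <= sizeof(guest_ram) - addr;
}

/*
 * Untranslated read stub: returns 0 on success, -1 on failure.
 * An IOMMU-aware version would use dev to locate the translation
 * context and fail here on a translation fault.
 */
static inline int pci_dma_read(PCIDevice *dev, dma_addr_t addr,
                               void *buf, size_t len)
{
    (void)dev; /* unused while no translation is done */
    if (!dma_range_ok(addr, len)) {
        return -1;
    }
    memcpy(buf, guest_ram + addr, len);
    return 0;
}

/* Untranslated write stub with the same error convention. */
static inline int pci_dma_write(PCIDevice *dev, dma_addr_t addr,
                                const void *buf, size_t len)
{
    (void)dev;
    if (!dma_range_ok(addr, len)) {
        return -1;
    }
    memcpy(guest_ram + addr, buf, len);
    return 0;
}
```

A device model converted to such stubs can start checking the return value today, even though the untranslated version only fails on out-of-range accesses in this toy setup.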
Re: [Qemu-devel] [PATCH] correct spelling
[cc'ing qemu-trivial]

A one-line summary here and a topic like "sheepdog:" in the subject
would've been nice, but given it's just a typo fix... :)

On 14.10.2011 09:41, Dong Xu Wang wrote:
> Signed-off-by: Dong Xu Wang wdon...@linux.vnet.ibm.com
> ---
>  block/sheepdog.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/block/sheepdog.c b/block/sheepdog.c
> index c1f6e07..ae857e2 100644
> --- a/block/sheepdog.c
> +++ b/block/sheepdog.c
> @@ -66,7 +66,7 @@
>   * 20 - 31 (12 bits): reserved data object space
>   * 32 - 55 (24 bits): vdi object space
>   * 56 - 59 ( 4 bits): reserved vdi object space
> - * 60 - 63 ( 4 bits): object type indentifier space
> + * 60 - 63 ( 4 bits): object type identifier space
>   */
>
>  #define VDI_SPACE_SHIFT   32

Reviewed-by: Andreas Färber afaer...@suse.de

Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746, AG Nürnberg
Re: [Qemu-devel] [PATCH 1/1 V5] kernel/kvm: introduce KVM_SET_LINT1 and fix improper nmi emulation
On 10/14/2011 05:07 PM, Jan Kiszka wrote:
> On 2011-10-14 11:03, Lai Jiangshan wrote:
>> Currently, NMI interrupt is blindly sent to all the vCPUs when NMI
>> button event happens. This doesn't properly emulate real hardware on
>> which NMI button event triggers LINT1. Because of this, NMI is sent to
>> the processor even when LINT1 is masked in LVT. For example, this
>> causes the problem that kdump initiated by NMI sometimes doesn't work
>> on KVM, because kdump assumes NMI is masked on CPUs other than CPU0.
>>
>> With this patch, we introduce KVM_SET_LINT1, and we can use
>> KVM_SET_LINT1 to correctly emulate NMI button without changing the
>> old KVM_NMI behavior.
>>
>> Signed-off-by: Lai Jiangshan la...@cn.fujitsu.com
>> Reported-by: Kenji Kaneshige kaneshige.ke...@jp.fujitsu.com
>> ---
>>  arch/x86/include/asm/kvm.h |    1 +
>>  arch/x86/kvm/irq.h         |    1 +
>>  arch/x86/kvm/lapic.c       |    7 +++++++
>>  arch/x86/kvm/x86.c         |    8 ++++++++
>>  include/linux/kvm.h        |    5 +++++
>>  5 files changed, 22 insertions(+), 0 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/kvm.h b/arch/x86/include/asm/kvm.h
>> index 4d8dcbd..88d0ac3 100644
>> --- a/arch/x86/include/asm/kvm.h
>> +++ b/arch/x86/include/asm/kvm.h
>> @@ -24,6 +24,7 @@
>>  #define __KVM_HAVE_DEBUGREGS
>>  #define __KVM_HAVE_XSAVE
>>  #define __KVM_HAVE_XCRS
>> +#define __KVM_HAVE_SET_LINT1
>>
>>  /* Architectural interrupt line count. */
>>  #define KVM_NR_INTERRUPTS 256
>> diff --git a/arch/x86/kvm/irq.h b/arch/x86/kvm/irq.h
>> index 53e2d08..0c96315 100644
>> --- a/arch/x86/kvm/irq.h
>> +++ b/arch/x86/kvm/irq.h
>> @@ -95,6 +95,7 @@ void kvm_pic_reset(struct kvm_kpic_state *s);
>>  void kvm_inject_pending_timer_irqs(struct kvm_vcpu *vcpu);
>>  void kvm_inject_apic_timer_irqs(struct kvm_vcpu *vcpu);
>>  void kvm_apic_nmi_wd_deliver(struct kvm_vcpu *vcpu);
>> +void kvm_apic_lint1_deliver(struct kvm_vcpu *vcpu);
>>  void __kvm_migrate_apic_timer(struct kvm_vcpu *vcpu);
>>  void __kvm_migrate_pit_timer(struct kvm_vcpu *vcpu);
>>  void __kvm_migrate_timers(struct kvm_vcpu *vcpu);
>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>> index 57dcbd4..87fe36a 100644
>> --- a/arch/x86/kvm/lapic.c
>> +++ b/arch/x86/kvm/lapic.c
>> @@ -1039,6 +1039,13 @@ void kvm_apic_nmi_wd_deliver(struct kvm_vcpu *vcpu)
>>  		kvm_apic_local_deliver(apic, APIC_LVT0);
>>  }
>>
>> +void kvm_apic_lint1_deliver(struct kvm_vcpu *vcpu)
>> +{
>> +	struct kvm_lapic *apic = vcpu->arch.apic;
>> +
>> +	kvm_apic_local_deliver(apic, APIC_LVT1);
>> +}
>> +
>>  static struct kvm_timer_ops lapic_timer_ops = {
>>  	.is_periodic = lapic_is_periodic,
>>  };
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 84a28ea..fccd094 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -2077,6 +2077,7 @@ int kvm_dev_ioctl_check_extension(long ext)
>>  	case KVM_CAP_XSAVE:
>>  	case KVM_CAP_ASYNC_PF:
>>  	case KVM_CAP_GET_TSC_KHZ:
>> +	case KVM_CAP_SET_LINT1:
>>  		r = 1;
>>  		break;
>>  	case KVM_CAP_COALESCED_MMIO:
>> @@ -3264,6 +3265,13 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
>>  		goto out;
>>  	}
>> +	case KVM_SET_LINT1: {
>> +		r = -EINVAL;
>> +		if (!irqchip_in_kernel(vcpu->kvm))
>> +			goto out;
>> +		r = 0;
>> +		kvm_apic_lint1_deliver(vcpu);
>> +	}
>>  	default:
>>  		r = -EINVAL;
>>  	}
>> diff --git a/include/linux/kvm.h b/include/linux/kvm.h
>> index aace6b8..3a10572 100644
>> --- a/include/linux/kvm.h
>> +++ b/include/linux/kvm.h
>> @@ -554,6 +554,9 @@ struct kvm_ppc_pvinfo {
>>  #define KVM_CAP_PPC_SMT 64
>>  #define KVM_CAP_PPC_RMA 65
>>  #define KVM_CAP_S390_GMAP 71
>> +#ifdef __KVM_HAVE_SET_LINT1
>> +#define KVM_CAP_SET_LINT1 72
>> +#endif
>
> Actually, there is no need for __KVM_HAVE_SET_LINT1 and #ifdef. User
> land will just do a runtime check.

__KVM_HAVE_SET_LINT1 does no harm, and it helps with compile-time
checks.

Thanks,
Lai
Re: [Qemu-devel] [PATCH 2/2 V5] qemu-kvm: fix improper nmi emulation
On 2011-10-14 11:03, Lai Jiangshan wrote:
> Currently, NMI interrupt is blindly sent to all the vCPUs when NMI
> button event happens. This doesn't properly emulate real hardware on
> which NMI button event triggers LINT1. Because of this, NMI is sent to
> the processor even when LINT1 is masked in LVT. For example, this
> causes the problem that kdump initiated by NMI sometimes doesn't work
> on KVM, because kdump assumes NMI is masked on CPUs other than CPU0.
>
> With this patch, inject-nmi request is handled as follows.
>
> - When in-kernel irqchip is enabled and KVM_SET_LINT1 is enabled,
>   inject LINT1 instead of NMI interrupt.
> - otherwise when in-kernel irqchip is enabled, get the in-kernel
>   LAPIC states and test the APIC_LVT_MASKED; if LINT1 is unmasked,
>   deliver the NMI directly.
> - otherwise, userland lapic emulates NMI button and injects NMI
>   if it is unmasked.
>
> Signed-off-by: Lai Jiangshan la...@cn.fujitsu.com
> Reported-by: Kenji Kaneshige kaneshige.ke...@jp.fujitsu.com
> ---
>  hw/apic.c |   72 +
>  hw/apic.h |    1 +
>  monitor.c |    6 -
>  3 files changed, 78 insertions(+), 1 deletions(-)
>
> diff --git a/hw/apic.c b/hw/apic.c
> index 69d6ac5..91b82d0 100644
> --- a/hw/apic.c
> +++ b/hw/apic.c
> @@ -205,6 +205,78 @@ void apic_deliver_pic_intr(DeviceState *d, int level)
>      }
>  }
>
> +#ifdef KVM_CAP_IRQCHIP

Please read all my comments. That unfortunately also applies to the
rest of the patch.

> +static inline uint32_t kapic_reg(struct kvm_lapic_state *kapic, int reg_id);
> +
> +static void kvm_irqchip_deliver_nmi(void *p)
> +{
> +    APICState *s = p;
> +    struct kvm_lapic_state klapic;
> +    uint32_t lvt;
> +
> +    kvm_get_lapic(s->cpu_env, &klapic);
> +    lvt = kapic_reg(&klapic, 0x32 + APIC_LVT_LINT1);
> +
> +    if (lvt & APIC_LVT_MASKED) {
> +        return;
> +    }
> +
> +    if (((lvt >> 8) & 7) != APIC_DM_NMI) {
> +        return;
> +    }
> +
> +    kvm_vcpu_ioctl(s->cpu_env, KVM_NMI);
> +}
> +
> +static void __apic_deliver_nmi(APICState *s)
> +{
> +    if (kvm_irqchip_in_kernel()) {
> +        run_on_cpu(s->cpu_env, kvm_irqchip_deliver_nmi, s);
> +    } else {
> +        apic_local_deliver(s, APIC_LVT_LINT1);
> +    }
> +}
> +#else
> +static void __apic_deliver_nmi(APICState *s)
> +{
> +    apic_local_deliver(s, APIC_LVT_LINT1);
> +}
> +#endif
> +
> +enum {
> +    KVM_SET_LINT1_UNKNOWN,
> +    KVM_SET_LINT1_ENABLED,
> +    KVM_SET_LINT1_DISABLED,
> +};
> +
> +static void kvm_set_lint1(void *p)
> +{
> +    CPUState *env = p;
> +
> +    kvm_vcpu_ioctl(env, KVM_SET_LINT1);
> +}
> +
> +void apic_deliver_nmi(DeviceState *d)
> +{
> +    APICState *s = DO_UPCAST(APICState, busdev.qdev, d);
> +    static int kernel_lint1 = KVM_SET_LINT1_UNKNOWN;
> +
> +    if (kernel_lint1 == KVM_SET_LINT1_UNKNOWN) {
> +        if (kvm_enabled() && kvm_irqchip_in_kernel() &&
> +            kvm_check_extension(kvm_state, KVM_CAP_SET_LINT1)) {

That CAP test belongs where the injection shall happen. Here you decide
about user space vs. kernel space APIC model.

Let's try it together:

if kvm_enabled && kvm_irqchip_in_kernel
    run_on_cpu(kvm_apic_deliver_nmi)
else
    apic_local_deliver(APIC_LVT_LINT1)

with kvm_apic_deliver_nmi like this:

if !check_extension(CAP_SET_LINT1)
    get_kernel_apic_state
    if !nmi_acceptable
        return
kvm_vcpu_ioctl(KVM_NMI)

Please don't trust me blindly and re-check, but this is what the
scenario looks like to me.

Thanks for your patience,
Jan
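For reference, the nmi_acceptable test in the pseudocode above boils down to decoding the LVT LINT1 register, as the quoted patch does. The following is only a sketch: the helper name lint1_accepts_nmi is made up for illustration, and the two constants are defined locally with the values the APIC architecture assigns (mask in bit 16, delivery mode in bits 8-10, NMI encoded as 4) rather than taken from the QEMU headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the architectural values; assumed, not QEMU's headers. */
#define APIC_LVT_MASKED (1u << 16) /* LVT mask bit */
#define APIC_DM_NMI     4u         /* delivery mode 100b = NMI */

/*
 * Hypothetical helper: returns true if a LINT1 LVT entry with this
 * value would let an NMI through, i.e. the entry is unmasked and its
 * delivery mode field (bits 8-10) selects NMI.
 */
static inline bool lint1_accepts_nmi(uint32_t lvt)
{
    if (lvt & APIC_LVT_MASKED) {
        return false;
    }
    return ((lvt >> 8) & 7) == APIC_DM_NMI;
}
```

With this shape, the capability check and the ioctl stay in the function that runs on the vCPU thread, and the register decoding remains a pure function that is easy to test on its own.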
Re: [Qemu-devel] [PATCH 1/1 V5] kernel/kvm: introduce KVM_SET_LINT1 and fix improper nmi emulation
On 2011-10-14 11:27, Lai Jiangshan wrote:
> On 10/14/2011 05:07 PM, Jan Kiszka wrote:
>> On 2011-10-14 11:03, Lai Jiangshan wrote:
>>> Currently, NMI interrupt is blindly sent to all the vCPUs when NMI
>>> button event happens. This doesn't properly emulate real hardware on
>>> which NMI button event triggers LINT1. Because of this, NMI is sent to
>>> the processor even when LINT1 is masked in LVT. For example, this
>>> causes the problem that kdump initiated by NMI sometimes doesn't work
>>> on KVM, because kdump assumes NMI is masked on CPUs other than CPU0.
>>>
>>> With this patch, we introduce KVM_SET_LINT1, and we can use
>>> KVM_SET_LINT1 to correctly emulate NMI button without changing the
>>> old KVM_NMI behavior.
>>>
>>> Signed-off-by: Lai Jiangshan la...@cn.fujitsu.com
>>> Reported-by: Kenji Kaneshige kaneshige.ke...@jp.fujitsu.com
>>> ---
>>>  arch/x86/include/asm/kvm.h |    1 +
>>>  arch/x86/kvm/irq.h         |    1 +
>>>  arch/x86/kvm/lapic.c       |    7 +++++++
>>>  arch/x86/kvm/x86.c         |    8 ++++++++
>>>  include/linux/kvm.h        |    5 +++++
>>>  5 files changed, 22 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/arch/x86/include/asm/kvm.h b/arch/x86/include/asm/kvm.h
>>> index 4d8dcbd..88d0ac3 100644
>>> --- a/arch/x86/include/asm/kvm.h
>>> +++ b/arch/x86/include/asm/kvm.h
>>> @@ -24,6 +24,7 @@
>>>  #define __KVM_HAVE_DEBUGREGS
>>>  #define __KVM_HAVE_XSAVE
>>>  #define __KVM_HAVE_XCRS
>>> +#define __KVM_HAVE_SET_LINT1
>>>
>>>  /* Architectural interrupt line count. */
>>>  #define KVM_NR_INTERRUPTS 256
>>> diff --git a/arch/x86/kvm/irq.h b/arch/x86/kvm/irq.h
>>> index 53e2d08..0c96315 100644
>>> --- a/arch/x86/kvm/irq.h
>>> +++ b/arch/x86/kvm/irq.h
>>> @@ -95,6 +95,7 @@ void kvm_pic_reset(struct kvm_kpic_state *s);
>>>  void kvm_inject_pending_timer_irqs(struct kvm_vcpu *vcpu);
>>>  void kvm_inject_apic_timer_irqs(struct kvm_vcpu *vcpu);
>>>  void kvm_apic_nmi_wd_deliver(struct kvm_vcpu *vcpu);
>>> +void kvm_apic_lint1_deliver(struct kvm_vcpu *vcpu);
>>>  void __kvm_migrate_apic_timer(struct kvm_vcpu *vcpu);
>>>  void __kvm_migrate_pit_timer(struct kvm_vcpu *vcpu);
>>>  void __kvm_migrate_timers(struct kvm_vcpu *vcpu);
>>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>>> index 57dcbd4..87fe36a 100644
>>> --- a/arch/x86/kvm/lapic.c
>>> +++ b/arch/x86/kvm/lapic.c
>>> @@ -1039,6 +1039,13 @@ void kvm_apic_nmi_wd_deliver(struct kvm_vcpu *vcpu)
>>>  		kvm_apic_local_deliver(apic, APIC_LVT0);
>>>  }
>>>
>>> +void kvm_apic_lint1_deliver(struct kvm_vcpu *vcpu)
>>> +{
>>> +	struct kvm_lapic *apic = vcpu->arch.apic;
>>> +
>>> +	kvm_apic_local_deliver(apic, APIC_LVT1);
>>> +}
>>> +
>>>  static struct kvm_timer_ops lapic_timer_ops = {
>>>  	.is_periodic = lapic_is_periodic,
>>>  };
>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>> index 84a28ea..fccd094 100644
>>> --- a/arch/x86/kvm/x86.c
>>> +++ b/arch/x86/kvm/x86.c
>>> @@ -2077,6 +2077,7 @@ int kvm_dev_ioctl_check_extension(long ext)
>>>  	case KVM_CAP_XSAVE:
>>>  	case KVM_CAP_ASYNC_PF:
>>>  	case KVM_CAP_GET_TSC_KHZ:
>>> +	case KVM_CAP_SET_LINT1:
>>>  		r = 1;
>>>  		break;
>>>  	case KVM_CAP_COALESCED_MMIO:
>>> @@ -3264,6 +3265,13 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
>>>  		goto out;
>>>  	}
>>> +	case KVM_SET_LINT1: {
>>> +		r = -EINVAL;
>>> +		if (!irqchip_in_kernel(vcpu->kvm))
>>> +			goto out;
>>> +		r = 0;
>>> +		kvm_apic_lint1_deliver(vcpu);
>>> +	}
>>>  	default:
>>>  		r = -EINVAL;
>>>  	}
>>> diff --git a/include/linux/kvm.h b/include/linux/kvm.h
>>> index aace6b8..3a10572 100644
>>> --- a/include/linux/kvm.h
>>> +++ b/include/linux/kvm.h
>>> @@ -554,6 +554,9 @@ struct kvm_ppc_pvinfo {
>>>  #define KVM_CAP_PPC_SMT 64
>>>  #define KVM_CAP_PPC_RMA 65
>>>  #define KVM_CAP_S390_GMAP 71
>>> +#ifdef __KVM_HAVE_SET_LINT1
>>> +#define KVM_CAP_SET_LINT1 72
>>> +#endif
>>
>> Actually, there is no need for __KVM_HAVE_SET_LINT1 and #ifdef. User
>> land will just do a runtime check.
>
> __KVM_HAVE_SET_LINT1 does no harm, and it helps with compile-time
> checks.

It's guarding an arch-specific CAP that will only be checked if there is
a need. That's in contrast to generic features that are not supported
for all archs (like __KVM_HAVE_GUEST_DEBUG -> KVM_CAP_SET_GUEST_DEBUG).

Granted, there are quite a few examples for redundant __KVM_HAVE/#ifdef
KVM_CAP in the KVM header, but let's not add more.

Jan
[Qemu-devel] [PATCH] usb-hub: wakeup on attach
When attaching a new device we must send a wakeup request to the root
hub, otherwise the guest will not notice the new device in case the
usb hub is suspended.

Signed-off-by: Gerd Hoffmann kra...@redhat.com
---
 hw/usb-hub.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/hw/usb-hub.c b/hw/usb-hub.c
index 3e923a0..908c622 100644
--- a/hw/usb-hub.c
+++ b/hw/usb-hub.c
@@ -163,6 +163,7 @@ static void usb_hub_attach(USBPort *port1)
     } else {
         port->wPortStatus &= ~PORT_STAT_LOW_SPEED;
     }
+    usb_wakeup(&s->dev);
 }

 static void usb_hub_detach(USBPort *port1)
-- 
1.7.1
[Qemu-devel] [PATCH 11/12] usb-ehci: Use PCI DMA stub functions
This updates the usb-ehci device emulation to use the explicit PCI DMA
wrapper to initialize its scatter/gather structure. This means this
driver should not need further changes when the sglist interface is
extended to support IOMMUs.

Signed-off-by: David Gibson da...@gibson.dropbear.id.au
---
 hw/usb-ehci.c |   44 +++++++++++++++++++++++++-------------------
 1 files changed, 25 insertions(+), 19 deletions(-)

diff --git a/hw/usb-ehci.c b/hw/usb-ehci.c
index 27376a2..9c74a9f 100644
--- a/hw/usb-ehci.c
+++ b/hw/usb-ehci.c
@@ -1101,12 +1101,13 @@ static void ehci_mem_writel(void *ptr, target_phys_addr_t addr, uint32_t val)
 // TODO : Put in common header file, duplication from usb-ohci.c

 /* Get an array of dwords from main memory */
-static inline int get_dwords(uint32_t addr, uint32_t *buf, int num)
+static inline int get_dwords(EHCIState *ehci, uint32_t addr,
+                             uint32_t *buf, int num)
 {
     int i;

     for(i = 0; i < num; i++, buf++, addr += sizeof(*buf)) {
-        cpu_physical_memory_rw(addr,(uint8_t *)buf, sizeof(*buf), 0);
+        pci_dma_read(&ehci->dev, addr, (uint8_t *)buf, sizeof(*buf));
         *buf = le32_to_cpu(*buf);
     }

@@ -1114,13 +1115,14 @@ static inline int get_dwords(uint32_t addr, uint32_t *buf, int num)
 }

 /* Put an array of dwords in to main memory */
-static inline int put_dwords(uint32_t addr, uint32_t *buf, int num)
+static inline int put_dwords(EHCIState *ehci, uint32_t addr,
+                             uint32_t *buf, int num)
 {
     int i;

     for(i = 0; i < num; i++, buf++, addr += sizeof(*buf)) {
         uint32_t tmp = cpu_to_le32(*buf);
-        cpu_physical_memory_rw(addr,(uint8_t *)&tmp, sizeof(tmp), 1);
+        pci_dma_write(&ehci->dev, addr, (uint8_t *)&tmp, sizeof(tmp));
     }

     return 1;
@@ -1169,7 +1171,8 @@ static int ehci_qh_do_overlay(EHCIQueue *q)
     q->qh.bufptr[1] &= ~BUFPTR_CPROGMASK_MASK;
     q->qh.bufptr[2] &= ~BUFPTR_FRAMETAG_MASK;

-    put_dwords(NLPTR_GET(q->qhaddr), (uint32_t *) &q->qh, sizeof(EHCIqh) >> 2);
+    put_dwords(q->ehci, NLPTR_GET(q->qhaddr), (uint32_t *) &q->qh,
+               sizeof(EHCIqh) >> 2);

     return 0;
 }
@@ -1177,12 +1180,12 @@ static int ehci_qh_do_overlay(EHCIQueue *q)
 static int ehci_init_transfer(EHCIQueue *q)
 {
     uint32_t cpage, offset, bytes, plen;
-    target_phys_addr_t page;
+    dma_addr_t page;

     cpage  = get_field(q->qh.token, QTD_TOKEN_CPAGE);
     bytes  = get_field(q->qh.token, QTD_TOKEN_TBYTES);
     offset = q->qh.bufptr[0] & ~QTD_BUFPTR_MASK;
-    qemu_sglist_init(&q->sgl, 5);
+    pci_dma_sglist_init(&q->sgl, &q->ehci->dev, 5);

     while (bytes > 0) {
         if (cpage > 4) {
@@ -1428,7 +1431,7 @@ static int ehci_process_itd(EHCIState *ehci,
                 return USB_RET_PROCERR;
             }

-            qemu_sglist_init(&ehci->isgl, 2);
+            pci_dma_sglist_init(&ehci->isgl, &ehci->dev, 2);
             if (off + len > 4096) {
                 /* transfer crosses page border */
                 uint32_t len2 = off + len - 4096;
@@ -1532,7 +1535,8 @@ static int ehci_state_waitlisthead(EHCIState *ehci, int async)

     /* Find the head of the list (4.9.1.1) */
     for(i = 0; i < MAX_QH; i++) {
-        get_dwords(NLPTR_GET(entry), (uint32_t *) &qh, sizeof(EHCIqh) >> 2);
+        get_dwords(ehci, NLPTR_GET(entry), (uint32_t *) &qh,
+                   sizeof(EHCIqh) >> 2);
         ehci_trace_qh(NULL, NLPTR_GET(entry), &qh);

         if (qh.epchar & QH_EPCHAR_H) {
@@ -1629,7 +1633,8 @@ static EHCIQueue *ehci_state_fetchqh(EHCIState *ehci, int async)
         goto out;
     }

-    get_dwords(NLPTR_GET(q->qhaddr), (uint32_t *) &q->qh, sizeof(EHCIqh) >> 2);
+    get_dwords(ehci, NLPTR_GET(q->qhaddr),
+               (uint32_t *) &q->qh, sizeof(EHCIqh) >> 2);
     ehci_trace_qh(q, NLPTR_GET(q->qhaddr), &q->qh);

     if (q->async == EHCI_ASYNC_INFLIGHT) {
@@ -1698,7 +1703,7 @@ static int ehci_state_fetchitd(EHCIState *ehci, int async)
     assert(!async);
     entry = ehci_get_fetch_addr(ehci, async);

-    get_dwords(NLPTR_GET(entry),(uint32_t *) &itd,
+    get_dwords(ehci, NLPTR_GET(entry), (uint32_t *) &itd,
                sizeof(EHCIitd) >> 2);
     ehci_trace_itd(ehci, entry, &itd);

@@ -1706,8 +1711,8 @@ static int ehci_state_fetchitd(EHCIState *ehci, int async)
         return -1;
     }

-    put_dwords(NLPTR_GET(entry), (uint32_t *) &itd,
-                sizeof(EHCIitd) >> 2);
+    put_dwords(ehci, NLPTR_GET(entry), (uint32_t *) &itd,
+               sizeof(EHCIitd) >> 2);
     ehci_set_fetch_addr(ehci, async, itd.next);
     ehci_set_state(ehci, async, EST_FETCHENTRY);

@@ -1722,7 +1727,7 @@ static int ehci_state_fetchsitd(EHCIState *ehci, int async)
     assert(!async);
     entry = ehci_get_fetch_addr(ehci, async);

-    get_dwords(NLPTR_GET(entry), (uint32_t *)&sitd,
+    get_dwords(ehci, NLPTR_GET(entry), (uint32_t *)&sitd,
                sizeof(EHCIsitd) >> 2);
     ehci_trace_sitd(ehci, entry, &sitd);

@@ -1784,7 +1789,8 @@ static int ehci_state_fetchqtd(EHCIQueue
[Qemu-devel] [PATCH 01/12] Add stub functions for PCI device models to do PCI DMA
From: Alexey Kardashevskiy a...@ozlabs.ru

This patch adds functions to pci.[ch] to perform PCI DMA operations.
At present, these are just stubs which directly perform cpu physical
memory accesses. Stubs are included which are analogous to
cpu_physical_memory_{read,write}(), the stX_phys() and ldX_phys()
functions and cpu_physical_memory_{map,unmap}().

In addition, a wrapper around qemu_sglist_init() is provided, which
also takes a PCIDevice *. It's assumed that _init() is the only
sglist function which will need wrapping, the idea being that once we
have IOMMU support whatever IOMMU context handle the wrapper derives
from the PCI device will be stored within the sglist structure for
later use.

Using these stubs, however, distinguishes PCI device DMA transactions
from other accesses to physical memory, which will allow PCI IOMMU
support to be added in one place, rather than updating every PCI
driver at that time.

That is, it allows us to update individual PCI drivers to support an
IOMMU without having yet determined the details of how the IOMMU
emulation will operate. This will let us remove the most
bitrot-sensitive part of an IOMMU patch in advance.

Signed-off-by: David Gibson da...@gibson.dropbear.id.au
Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
---
 dma.h    |   13 +++++++++----
 hw/pci.c |   53 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 hw/pci.h |   49 +++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 111 insertions(+), 4 deletions(-)

diff --git a/dma.h b/dma.h
index 2bdc236..ce6aab2 100644
--- a/dma.h
+++ b/dma.h
@@ -18,9 +18,15 @@ typedef struct ScatterGatherEntry ScatterGatherEntry;

 #if defined(TARGET_PHYS_ADDR_BITS)
+typedef target_phys_addr_t dma_addr_t;
+typedef enum {
+    DMA_DIRECTION_TO_DEVICE = 0,
+    DMA_DIRECTION_FROM_DEVICE = 1,
+} DMADirection;
+
 struct ScatterGatherEntry {
-    target_phys_addr_t base;
-    target_phys_addr_t len;
+    dma_addr_t base;
+    dma_addr_t len;
 };

 struct QEMUSGList {
@@ -31,8 +37,7 @@ struct QEMUSGList {
 };

 void qemu_sglist_init(QEMUSGList *qsg, int alloc_hint);
-void qemu_sglist_add(QEMUSGList *qsg, target_phys_addr_t base,
-                     target_phys_addr_t len);
+void qemu_sglist_add(QEMUSGList *qsg, dma_addr_t base, dma_addr_t len);
 void qemu_sglist_destroy(QEMUSGList *qsg);
 #endif

diff --git a/hw/pci.c b/hw/pci.c
index 749e8d8..4dbdd81 100644
--- a/hw/pci.c
+++ b/hw/pci.c
@@ -2129,3 +2129,56 @@ MemoryRegion *pci_address_space_io(PCIDevice *dev)
 {
     return dev->bus->address_space_io;
 }
+
+#define PCI_DMA_DEFINE_LDST(_lname, _sname, _end, _bits)                \
+    uint##_bits##_t ld##_lname##_##_end##_pci_dma(PCIDevice *dev,       \
+                                                  dma_addr_t addr)      \
+    {                                                                   \
+        uint##_bits##_t val;                                            \
+        pci_dma_read(dev, addr, &val, sizeof(val));                     \
+        return _end##_bits##_to_cpu(val);                               \
+    }                                                                   \
+    void st##_sname##_##_end##_pci_dma(PCIDevice *dev,                  \
+                                       dma_addr_t addr, uint##_bits##_t val) \
+    {                                                                   \
+        val = cpu_to_##_end##_bits(val);                                \
+        pci_dma_write(dev, addr, &val, sizeof(val));                    \
+    }
+
+uint8_t ldub_pci_dma(PCIDevice *dev, dma_addr_t addr)
+{
+    uint8_t val;
+
+    pci_dma_read(dev, addr, &val, sizeof(val));
+    return val;
+}
+
+void stb_pci_dma(PCIDevice *dev, dma_addr_t addr, uint8_t val)
+{
+    pci_dma_write(dev, addr, &val, sizeof(val));
+}
+
+PCI_DMA_DEFINE_LDST(uw, w, le, 16);
+PCI_DMA_DEFINE_LDST(uw, w, be, 16);
+PCI_DMA_DEFINE_LDST(l, l, le, 32);
+PCI_DMA_DEFINE_LDST(l, l, be, 32);
+PCI_DMA_DEFINE_LDST(q, q, le, 64);
+PCI_DMA_DEFINE_LDST(q, q, be, 64);
+
+void *pci_dma_map(PCIDevice *dev, dma_addr_t addr, dma_addr_t *plen,
+                  DMADirection dir)
+{
+    target_phys_addr_t len = *plen;
+    void *buf;
+
+    buf = cpu_physical_memory_map(addr, &len, dir == DMA_DIRECTION_FROM_DEVICE);
+    *plen = len;
+    return buf;
+}
+
+void pci_dma_unmap(PCIDevice *dev, void *buffer, dma_addr_t len,
+                   DMADirection dir, dma_addr_t access_len)
+{
+    cpu_physical_memory_unmap(buffer, len, dir == DMA_DIRECTION_FROM_DEVICE,
+                              access_len);
+}
diff --git a/hw/pci.h b/hw/pci.h
index 86a81c8..0f89d57 100644
--- a/hw/pci.h
+++ b/hw/pci.h
@@ -6,6 +6,7 @@

 #include "qdev.h"
 #include "memory.h"
+#include "dma.h"

 /* PCI includes legacy ISA access. */
 #include "isa.h"
@@ -487,4 +488,52 @@ static inline uint32_t pci_config_size(const PCIDevice *d)
     return pci_is_express(d) ? PCIE_CONFIG_SPACE_SIZE : PCI_CONFIG_SPACE_SIZE;
 }

+/* DMA access
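The macro-generated ldX/stX helpers in the patch above follow a common pattern: one macro stamps out an endian-aware load/store pair per width on top of the byte-level read/write stubs. The sketch below shows that pattern in a standalone form. It is not the code from the patch: the names (DMA_DEFINE_LDST, ld32_le_dma and so on), the PCIDevice struct and the guest_ram array are assumptions for illustration, and byte order is handled by explicit shifts instead of the le32_to_cpu()-style helpers so that the sketch does not depend on host endianness.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the QEMU types and memory. */
typedef uint64_t dma_addr_t;
typedef struct PCIDevice { int unused; } PCIDevice;
static uint8_t guest_ram[256];

/* Byte-level stubs; untranslated, no bounds checks in this sketch. */
static inline void pci_dma_read(PCIDevice *dev, dma_addr_t addr,
                                void *buf, size_t len)
{
    (void)dev;
    memcpy(buf, guest_ram + addr, len);
}

static inline void pci_dma_write(PCIDevice *dev, dma_addr_t addr,
                                 const void *buf, size_t len)
{
    (void)dev;
    memcpy(guest_ram + addr, buf, len);
}

/* Bit position of byte i within an n-byte value, per byte order. */
#define LE_SHIFT(i, n) (8 * (i))
#define BE_SHIFT(i, n) (8 * ((n) - 1 - (i)))

/* Generates ld<bits>_<end>_dma() and st<bits>_<end>_dma(). */
#define DMA_DEFINE_LDST(end, END, bits)                                   \
static inline uint##bits##_t ld##bits##_##end##_dma(PCIDevice *dev,      \
                                                    dma_addr_t addr)     \
{                                                                         \
    uint8_t raw[bits / 8];                                                \
    uint##bits##_t val = 0;                                               \
    pci_dma_read(dev, addr, raw, sizeof(raw));                            \
    for (size_t i = 0; i < sizeof(raw); i++) {                            \
        val |= (uint##bits##_t)((uint64_t)raw[i]                          \
                                << END##_SHIFT(i, sizeof(raw)));          \
    }                                                                     \
    return val;                                                           \
}                                                                         \
static inline void st##bits##_##end##_dma(PCIDevice *dev,                 \
                                          dma_addr_t addr,                \
                                          uint##bits##_t val)             \
{                                                                         \
    uint8_t raw[bits / 8];                                                \
    for (size_t i = 0; i < sizeof(raw); i++) {                            \
        raw[i] = (uint8_t)((uint64_t)val >> END##_SHIFT(i, sizeof(raw))); \
    }                                                                     \
    pci_dma_write(dev, addr, raw, sizeof(raw));                           \
}

DMA_DEFINE_LDST(le, LE, 16)
DMA_DEFINE_LDST(be, BE, 16)
DMA_DEFINE_LDST(le, LE, 32)
DMA_DEFINE_LDST(be, BE, 32)
```

Because every helper funnels through the two byte-level stubs, adding translation later only means changing those two functions, which is the point the commit message makes about adding IOMMU support in one place.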
Re: [Qemu-devel] [PATCH v3 03/15] add qemu_send_full and qemu_recv_full
On 05.10.2011 09:17, Paolo Bonzini wrote:
> Signed-off-by: Paolo Bonzini pbonz...@redhat.com
> ---
>  osdep.c       |   67 +
>  qemu-common.h |    4 +++
>  2 files changed, 71 insertions(+), 0 deletions(-)
>
> diff --git a/osdep.c b/osdep.c
> index 56e6963..718a25d 100644
> --- a/osdep.c
> +++ b/osdep.c
> @@ -166,3 +166,70 @@ int qemu_accept(int s, struct sockaddr *addr, socklen_t *addrlen)
>
>      return ret;
>  }
> +
> +/*
> + * A variant of send(2) which handles partial write.
> + *
> + * Return the number of bytes transferred, which is only
> + * smaller than `count' if there is an error.
> + *
> + * This function won't work with non-blocking fd's.
> + * Any of the possibilities with non-blocking fd's is bad:
> + *   - return a short write (then name is wrong)
> + *   - busy wait adding (errno == EAGAIN) to the loop
> + */
> +ssize_t qemu_send_full(int fd, const void *buf, size_t count, int flags)
> +{
> +    ssize_t ret = 0;
> +    ssize_t total = 0;
> +
> +    while (count) {
> +        ret = send(fd, buf, count, flags);
> +        if (ret < 0) {
> +            if (errno == EINTR) {
> +                continue;
> +            }
> +            break;
> +        }
> +
> +        count -= ret;
> +        buf += ret;
> +        total += ret;
> +    }
> +
> +    return total;
> +}
> +
> +/*
> + * A variant of recv(2) which handles partial write.
> + *
> + * Return the number of bytes transferred, which is only
> + * smaller than `count' if there is an error.
> + *
> + * This function won't work with non-blocking fd's.
> + * Any of the possibilities with non-blocking fd's is bad:
> + *   - return a short write (then name is wrong)
> + *   - busy wait adding (errno == EAGAIN) to the loop
> + */
> +ssize_t qemu_recv_full(int fd, const void *buf, size_t count, int flags)
> +{
> +    ssize_t ret = 0;
> +    ssize_t total = 0;
> +
> +    while (count) {
> +        ret = recv(fd, buf, count, flags);

osdep.c: In function 'qemu_recv_full':
osdep.c:220: error: passing argument 2 of 'recv' discards qualifiers from pointer target type
/usr/include/bits/socket2.h:35: note: expected 'void *' but argument is of type 'const void *'

Kevin
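The compiler error above comes from recv(2) writing into the buffer, so the parameter cannot be const. A minimal sketch of a receive loop that avoids the problem could look as follows. This is not the version that was eventually committed: the function name recv_full is made up, the char * cursor replaces arithmetic on void * (a GNU extension), and the explicit end-of-stream check is an addition of this sketch, since recv() returning 0 would otherwise keep the loop spinning.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: receive up to count bytes from a blocking
 * socket, retrying on EINTR and on short reads. Returns the number
 * of bytes received, which is smaller than count only on error or
 * when the peer closes the connection.
 */
static inline ssize_t recv_full(int fd, void *buf, size_t count, int flags)
{
    char *p = buf;          /* non-const: recv() writes through it */
    ssize_t total = 0;

    while (count) {
        ssize_t ret = recv(fd, p, count, flags);
        if (ret < 0) {
            if (errno == EINTR) {
                continue;   /* interrupted by a signal, try again */
            }
            break;          /* real error: report what we have */
        }
        if (ret == 0) {
            break;          /* peer closed the connection */
        }
        count -= (size_t)ret;
        p += ret;
        total += ret;
    }

    return total;
}
```

As with the send variant, this only makes sense for blocking descriptors; on a non-blocking socket the loop would stop at the first EAGAIN and return a short count.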
[Qemu-devel] [PATCH 1/1 V5 tuning] kernel/kvm: introduce KVM_SET_LINT1 and fix improper nmi emulation
Currently, NMI interrupt is blindly sent to all the vCPUs when NMI
button event happens. This doesn't properly emulate real hardware on
which NMI button event triggers LINT1. Because of this, NMI is sent to
the processor even when LINT1 is masked in LVT. For example, this
causes the problem that kdump initiated by NMI sometimes doesn't work
on KVM, because kdump assumes NMI is masked on CPUs other than CPU0.

With this patch, we introduce KVM_SET_LINT1, and we can use
KVM_SET_LINT1 to correctly emulate NMI button without changing the
old KVM_NMI behavior.

Signed-off-by: Lai Jiangshan la...@cn.fujitsu.com
Reported-by: Kenji Kaneshige kaneshige.ke...@jp.fujitsu.com
---
 arch/x86/kvm/irq.h   |    1 +
 arch/x86/kvm/lapic.c |    7 +++++++
 arch/x86/kvm/x86.c   |    8 ++++++++
 include/linux/kvm.h  |    3 +++
 4 files changed, 19 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/irq.h b/arch/x86/kvm/irq.h
index 53e2d08..0c96315 100644
--- a/arch/x86/kvm/irq.h
+++ b/arch/x86/kvm/irq.h
@@ -95,6 +95,7 @@ void kvm_pic_reset(struct kvm_kpic_state *s);
 void kvm_inject_pending_timer_irqs(struct kvm_vcpu *vcpu);
 void kvm_inject_apic_timer_irqs(struct kvm_vcpu *vcpu);
 void kvm_apic_nmi_wd_deliver(struct kvm_vcpu *vcpu);
+void kvm_apic_lint1_deliver(struct kvm_vcpu *vcpu);
 void __kvm_migrate_apic_timer(struct kvm_vcpu *vcpu);
 void __kvm_migrate_pit_timer(struct kvm_vcpu *vcpu);
 void __kvm_migrate_timers(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 57dcbd4..87fe36a 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1039,6 +1039,13 @@ void kvm_apic_nmi_wd_deliver(struct kvm_vcpu *vcpu)
 		kvm_apic_local_deliver(apic, APIC_LVT0);
 }

+void kvm_apic_lint1_deliver(struct kvm_vcpu *vcpu)
+{
+	struct kvm_lapic *apic = vcpu->arch.apic;
+
+	kvm_apic_local_deliver(apic, APIC_LVT1);
+}
+
 static struct kvm_timer_ops lapic_timer_ops = {
 	.is_periodic = lapic_is_periodic,
 };
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 84a28ea..fccd094 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2077,6 +2077,7 @@ int kvm_dev_ioctl_check_extension(long ext)
 	case KVM_CAP_XSAVE:
 	case KVM_CAP_ASYNC_PF:
 	case KVM_CAP_GET_TSC_KHZ:
+	case KVM_CAP_SET_LINT1:
 		r = 1;
 		break;
 	case KVM_CAP_COALESCED_MMIO:
@@ -3264,6 +3265,13 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 		goto out;
 	}
+	case KVM_SET_LINT1: {
+		r = -EINVAL;
+		if (!irqchip_in_kernel(vcpu->kvm))
+			goto out;
+		r = 0;
+		kvm_apic_lint1_deliver(vcpu);
+	}
 	default:
 		r = -EINVAL;
 	}
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index aace6b8..11a2c42 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -554,6 +554,7 @@ struct kvm_ppc_pvinfo {
 #define KVM_CAP_PPC_SMT 64
 #define KVM_CAP_PPC_RMA 65
 #define KVM_CAP_S390_GMAP 71
+#define KVM_CAP_SET_LINT1 72

 #ifdef KVM_CAP_IRQ_ROUTING

@@ -759,6 +760,8 @@ struct kvm_clock_data {
 #define KVM_CREATE_SPAPR_TCE	  _IOW(KVMIO, 0xa8, struct kvm_create_spapr_tce)
 /* Available with KVM_CAP_RMA */
 #define KVM_ALLOCATE_RMA	  _IOR(KVMIO, 0xa9, struct kvm_allocate_rma)
+/* Available with KVM_CAP_SET_LINT1 for x86 */
+#define KVM_SET_LINT1		  _IO(KVMIO, 0xaa)

 #define KVM_DEV_ASSIGN_ENABLE_IOMMU	(1 << 0)
[Qemu-devel] [PATCH 1/2 V5 tuning] qemu-kvm: Synchronize kernel headers
Synchronize newest kernel headers which have KVM_CAP_SET_LINT1 and KVM_SET_LINT1 by ./scripts/update-linux-headers.sh Signed-off-by: Lai Jiangshan la...@cn.fujitsu.com --- linux-headers/asm-powerpc/kvm.h | 19 +-- linux-headers/asm-x86/kvm_para.h | 14 ++ linux-headers/linux/kvm.h| 24 +--- linux-headers/linux/kvm_para.h |1 + 4 files changed, 49 insertions(+), 9 deletions(-) diff --git a/linux-headers/asm-powerpc/kvm.h b/linux-headers/asm-powerpc/kvm.h index 777d307..a4f6c85 100644 --- a/linux-headers/asm-powerpc/kvm.h +++ b/linux-headers/asm-powerpc/kvm.h @@ -22,6 +22,10 @@ #include linux/types.h +/* Select powerpc specific features in linux/kvm.h */ +#define __KVM_HAVE_SPAPR_TCE +#define __KVM_HAVE_PPC_SMT + struct kvm_regs { __u64 pc; __u64 cr; @@ -166,8 +170,8 @@ struct kvm_sregs { } ppc64; struct { __u32 sr[16]; - __u64 ibat[8]; - __u64 dbat[8]; + __u64 ibat[8]; + __u64 dbat[8]; } ppc32; } s; struct { @@ -272,4 +276,15 @@ struct kvm_guest_debug_arch { #define KVM_INTERRUPT_UNSET-2U #define KVM_INTERRUPT_SET_LEVEL-3U +/* for KVM_CAP_SPAPR_TCE */ +struct kvm_create_spapr_tce { + __u64 liobn; + __u32 window_size; +}; + +/* for KVM_ALLOCATE_RMA */ +struct kvm_allocate_rma { + __u64 rma_size; +}; + #endif /* __LINUX_KVM_POWERPC_H */ diff --git a/linux-headers/asm-x86/kvm_para.h b/linux-headers/asm-x86/kvm_para.h index 834d71e..f2ac46a 100644 --- a/linux-headers/asm-x86/kvm_para.h +++ b/linux-headers/asm-x86/kvm_para.h @@ -21,6 +21,7 @@ */ #define KVM_FEATURE_CLOCKSOURCE23 #define KVM_FEATURE_ASYNC_PF 4 +#define KVM_FEATURE_STEAL_TIME 5 /* The last 8 bits are used to indicate how to interpret the flags field * in pvclock structure. If no bits are set, all flags are ignored. 
@@ -30,10 +31,23 @@ #define MSR_KVM_WALL_CLOCK 0x11 #define MSR_KVM_SYSTEM_TIME 0x12 +#define KVM_MSR_ENABLED 1 /* Custom MSRs falls in the range 0x4b564d00-0x4b564dff */ #define MSR_KVM_WALL_CLOCK_NEW 0x4b564d00 #define MSR_KVM_SYSTEM_TIME_NEW 0x4b564d01 #define MSR_KVM_ASYNC_PF_EN 0x4b564d02 +#define MSR_KVM_STEAL_TIME 0x4b564d03 + +struct kvm_steal_time { + __u64 steal; + __u32 version; + __u32 flags; + __u32 pad[12]; +}; + +#define KVM_STEAL_ALIGNMENT_BITS 5 +#define KVM_STEAL_VALID_BITS ((-1ULL (KVM_STEAL_ALIGNMENT_BITS + 1))) +#define KVM_STEAL_RESERVED_MASK (((1 KVM_STEAL_ALIGNMENT_BITS) - 1 ) 1) #define KVM_MAX_MMU_OP_BATCH 32 diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h index fc63b73..0fd246f 100644 --- a/linux-headers/linux/kvm.h +++ b/linux-headers/linux/kvm.h @@ -161,6 +161,7 @@ struct kvm_pit_config { #define KVM_EXIT_NMI 16 #define KVM_EXIT_INTERNAL_ERROR 17 #define KVM_EXIT_OSI 18 +#define KVM_EXIT_PAPR_HCALL 19 /* For KVM_EXIT_INTERNAL_ERROR */ #define KVM_INTERNAL_ERROR_EMULATION 1 @@ -264,6 +265,11 @@ struct kvm_run { struct { __u64 gprs[32]; } osi; + struct { + __u64 nr; + __u64 ret; + __u64 args[9]; + } papr_hcall; /* Fix the size of the union. 
*/ char padding[256]; }; @@ -544,6 +550,11 @@ struct kvm_ppc_pvinfo { #define KVM_CAP_TSC_CONTROL 60 #define KVM_CAP_GET_TSC_KHZ 61 #define KVM_CAP_PPC_BOOKE_SREGS 62 +#define KVM_CAP_SPAPR_TCE 63 +#define KVM_CAP_PPC_SMT 64 +#define KVM_CAP_PPC_RMA65 +#define KVM_CAP_S390_GMAP 71 +#define KVM_CAP_SET_LINT1 72 #ifdef KVM_CAP_IRQ_ROUTING @@ -746,6 +757,11 @@ struct kvm_clock_data { /* Available with KVM_CAP_XCRS */ #define KVM_GET_XCRS _IOR(KVMIO, 0xa6, struct kvm_xcrs) #define KVM_SET_XCRS _IOW(KVMIO, 0xa7, struct kvm_xcrs) +#define KVM_CREATE_SPAPR_TCE _IOW(KVMIO, 0xa8, struct kvm_create_spapr_tce) +/* Available with KVM_CAP_RMA */ +#define KVM_ALLOCATE_RMA _IOR(KVMIO, 0xa9, struct kvm_allocate_rma) +/* Available with KVM_CAP_SET_LINT1 for x86 */ +#define KVM_SET_LINT1_IO(KVMIO, 0xaa) #define KVM_DEV_ASSIGN_ENABLE_IOMMU(1 0) @@ -773,20 +789,14 @@ struct kvm_assigned_pci_dev { struct kvm_assigned_irq { __u32 assigned_dev_id; - __u32 host_irq; + __u32 host_irq; /* ignored (legacy field) */ __u32 guest_irq; __u32 flags; union { - struct { - __u32 addr_lo; - __u32 addr_hi; - __u32 data; - } guest_msi; __u32 reserved[12];
Re: [Qemu-devel] balloon driver on winxp guest start failed
On 10/14/2011 04:55 AM, Vadim Rozenfeld wrote:
On Thu, 2011-10-13 at 15:47 +0100, Stefan Hajnoczi wrote:
On Thu, Oct 13, 2011 at 5:00 AM, hkran hk...@vnet.linux.ibm.com wrote:
On 10/12/2011 07:09 PM, hkran wrote:

I used the balloon driver for Windows from virtio-win-0.1-15.iso (from http://alt.fedoraproject.org/pub/alt/virtio-win/latest/images/bin/).

Following the install guide, I installed the balloon driver like this:

devcon.exe install d:\wxp\x86\balloon.inf PCI\VEN_1AF4&DEV_1002&SUBSYS_00051AF4&REV_00

I then rebooted the guest OS, but the status of the installed driver is always incorrect: Device Manager shows that the driver failed to start (code 10).

Seems like a resource allocation problem

I typed the following commands on the monitor command line:

(qemu) device_add virtio-balloon
(qemu) info balloon
balloon: actual=2048
(qemu) balloon 1024
(qemu) info balloon
balloon: actual=2048
(qemu) info balloon
balloon: actual=2048

I also tried passing the -balloon virtio parameter when starting qemu. The result is worse: the winxp guest froze at the boot screen.

Am I using the balloon driver correctly?

For the boot failure case, I looked into it further. I enabled the trace output and saw the following when the boot failed:

Balloon driver, built on Oct 13 2011 10:46:59
-- DriverEntry
file z:\source\kvm-guest-drivers-windows\balloon\sys\driver.c line 151
-- BalloonDeviceAdd
-- BalloonDeviceAdd
-- BalloonEvtDevicePrepareHardware
- Port Resource [C0A0-C0C0]
-- BalloonEvtDevicePrepareHardware
-- BalloonEvtDeviceD0Entry
-- BalloonInit
-- VIRTIO_BALLOON_F_STATS_VQ
-- BalloonInit
-- BalloonInterruptEnable
-- BalloonInterruptEnable

Here the system is blocked. Comparing this with the log file from the normal case, where I hot-plug the balloon device, I find that the system blocks before BalloonInterruptDpc is called.

What about ISR? Can you try changing balloon size and check if balloon ISR was invoked or not?

Does this mean that we enable the balloon device's interrupt too early when booting the system?

I suggest CCing Vadim on virtio Windows driver questions. Not sure if he sees every qemu-devel email.

Stefan

To make the issue clearer, I ran more tests. I now use the package virtio-win-prewhql-0.1-15-sources.zip from http://alt.fedoraproject.org/pub/alt/virtio-win/latest/images/src/

The problem of the incorrect balloon driver status no longer reproduces, but the boot failure is still there. Further tests suggest that the failure occurs only when both virtio-serial and balloon are attached when qemu boots:

(qemu) [huikai@oc0100708617 ~]$ /home/huikai/qemu15/bin/qemu-system-x86_64 --enable-kvm -m 2048 -drive file=/home/huikai/xp_shanghai.img,if=virtio -net user -net nic,model=viga qxl -localtime -chardev stdio,id=muxstdio -mon chardev=muxstdio -usb -usbdevice tablet -device virtio-serial,id=vs0 -chardev socket,path=/tmp/foo,server,nowait,id=foo -device virtserialport,bus=vs0.0,chardev=foo,name=helloworld -serial file:/tmp/xp_1014_6.log -balloon virtio,id=ball1

The trace:

Virtio-Serial driver started...built on Oct 14 2011 15:58:02
-- VIOSerialEvtDeviceAdd
-- VIOSerialInitInterruptHandling
Balloon driver, built on Oct 13 2011 17:34:56
-- DriverEntry
-- BalloonDeviceAdd
-- BalloonDeviceAdd
-- BalloonEvtDevicePrepareHardware
- Port Resource [C0A0-C0C0]
-- BalloonEvtDevicePrepareHardware
-- BalloonEvtDeviceD0Entry
-- BalloonInit
-- VIRTIO_BALLOON_F_STATS_VQ
-- BalloonInit
-- BalloonInterruptEnable
-- BalloonInterruptEnable
-- VIOSerialEvtDevicePrepareHardware
IO Port Info [C080-C0A0]
We have multiport host
VirtIOConsoleConfig->max_nr_ports 31
-- VIOSerialEvtDeviceD0Entry
-- VIOSerialInitAllQueues
-- VIOSerialFillQueue
-- VIOSerialAllocateBuffer
-- VIOSerialAddInBuf buf = 89B13A50
-- VIOSerialAllocateBuffer
-- VIOSerialAddInBuf buf = 89B13638
-- VIOSerialAllocateBuffer
-- VIOSerialAddInBuf buf = 89C07E08
-- VIOSerialAllocateBuffer
-- VIOSerialAddInBuf buf = 89C07C50
-- VIOSerialAllocateBuffer
-- VIOSerialAddInBuf buf = 89C07A98
... ...
-- VIOSerialAllocateBuffer
-- VIOSerialAddInBuf buf = 89BD14B8
-- VIOSerialAllocateBuffer
-- VIOSerialAddInBuf buf = 89B826E8
-- VIOSerialAllocateBuffer
-- VIOSerialAddInBuf buf = 89BE4450
-- VIOSerialAllocateBuffer
-- VIOSerialAddInBuf buf = 89BE2398
-- VIOSerialAllocateBuffer
-- VIOSerialAddInBuf buf = 89C53468
-- VIOSerialAllocateBuffer
-- VIOSerialAddInBuf buf = 89C37E18
-- VIOSerialAllocateBuffer
-- VIOSerialAddInBuf buf = 89C374C0
-- VIOSerialFreeBuffer buf = 89C374C0, buf->va_buf = 89983000
VIOSerialRenewAllPorts
-- VIOSerialFillQueue
-- VIOSerialAllocateBuffer
-- VIOSerialAddInBuf buf = 89C374C0
-- VIOSerialFreeBuffer buf = 89C374C0, buf->va_buf = 89983000
Setting VIRTIO_CONFIG_S_DRIVER_OK flag

There is no further output after this point.
Re: [Qemu-devel] [PATCH 0/5] block: remove unused emulation and synchronous functions
On 13.10.2011 22:09, Stefan Hajnoczi wrote:

Now that the block layer processes requests in coroutine context, some of the emulation wrappers and duplicate code paths can be dropped. Paraphrasing a wise man, Arnold Schwarzenegger, I will go to the block layer and I will clean house :).

The key thing behind this series is that the block layer processes requests in coroutine context and will try to use .bdrv_co_readv()/.bdrv_co_writev() when possible. Should the BlockDriver not implement those interfaces, an emulation function will be used to provide them using aio. If the BlockDriver does not implement aio interfaces, then an emulation function will be used to provide them using synchronous I/O.

This means:

1. A BlockDriver that implements coroutine interfaces does not need to implement aio or synchronous interfaces.
2. A BlockDriver that implements aio interfaces does not need to implement synchronous interfaces.
3. Coroutine interfaces are preferred and do not require any emulation functions.

This patch series propagates these rules across existing BlockDrivers and removes unused emulation functions from block.c.

Stefan Hajnoczi (5):
  block: drop emulation functions that use coroutines
  raw-posix: remove bdrv_read()/bdrv_write()
  block: use coroutine interface for raw format
  block: drop .bdrv_read()/.bdrv_write() emulation
  block: drop bdrv_has_async_rw()

Thanks, applied all to the block branch.

 block.c           | 142 ++-
 block/raw-posix.c | 277 -
 block/raw.c       |  32 ++-
 3 files changed, 18 insertions(+), 433 deletions(-)

Awesome. We need more series like this! :-)

Kevin
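The three rules in the cover letter amount to a fixed preference order. The following is only a rough sketch of that order, not the code in block.c: ToyDriver, toy_pick_read_path() and the TOY_PATH_* names are hypothetical, and the callback signature is simplified for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical read callback type; not QEMU's real BlockDriver signature. */
typedef int (*toy_read_fn)(void *opaque, long sector, void *buf, int nb_sectors);

/* A toy driver may fill in any subset of the three interfaces. */
typedef struct ToyDriver {
    toy_read_fn co_readv;   /* coroutine interface (preferred) */
    toy_read_fn aio_readv;  /* aio interface */
    toy_read_fn read;       /* synchronous interface */
} ToyDriver;

typedef enum ToyReadPath {
    TOY_PATH_NONE,
    TOY_PATH_COROUTINE,       /* used directly, no emulation */
    TOY_PATH_AIO_EMULATION,   /* coroutine interface emulated on top of aio */
    TOY_PATH_SYNC_EMULATION   /* emulated on top of synchronous I/O */
} ToyReadPath;

/* Pick the path in the order the cover letter describes. */
ToyReadPath toy_pick_read_path(const ToyDriver *drv)
{
    assert(drv != NULL);
    if (drv->co_readv) {
        return TOY_PATH_COROUTINE;
    }
    if (drv->aio_readv) {
        return TOY_PATH_AIO_EMULATION;
    }
    if (drv->read) {
        return TOY_PATH_SYNC_EMULATION;
    }
    return TOY_PATH_NONE;
}

/* Placeholder callback so a toy driver can be populated. */
int toy_dummy_read(void *opaque, long sector, void *buf, int nb_sectors)
{
    (void)opaque;
    (void)sector;
    (void)buf;
    return nb_sectors;
}
```

A driver that sets co_readv is routed to the coroutine path even if it also sets read, which is why such a driver can drop its synchronous implementation.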
[Qemu-devel] [PATCH 09/12] intel-hda: Use PCI DMA stub functions
This updates the intel-hda device emulation to use the explicit PCI DMA functions, instead of directly calling physical memory access functions.

Signed-off-by: David Gibson da...@gibson.dropbear.id.au
Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru
---
 hw/intel-hda.c | 14 +++++++-------
 1 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/hw/intel-hda.c b/hw/intel-hda.c
index 4272204..98790e8 100644
--- a/hw/intel-hda.c
+++ b/hw/intel-hda.c
@@ -24,6 +24,7 @@
 #include "audiodev.h"
 #include "intel-hda.h"
 #include "intel-hda-defs.h"
+#include "dma.h"
 
 /* - */
 /* hda bus */
@@ -328,7 +329,7 @@ static void intel_hda_corb_run(IntelHDAState *d)
 
         rp = (d->corb_rp + 1) & 0xff;
         addr = intel_hda_addr(d->corb_lbase, d->corb_ubase);
-        verb = ldl_le_phys(addr + 4*rp);
+        verb = ldl_le_pci_dma(&d->pci, addr + 4*rp);
         d->corb_rp = rp;
 
         dprint(d, 2, "%s: [rp 0x%x] verb 0x%08x\n", __FUNCTION__, rp, verb);
@@ -360,8 +361,8 @@ static void intel_hda_response(HDACodecDevice *dev, bool solicited, uint32_t res
     ex = (solicited ? 0 : (1 << 4)) | dev->cad;
     wp = (d->rirb_wp + 1) & 0xff;
     addr = intel_hda_addr(d->rirb_lbase, d->rirb_ubase);
-    stl_le_phys(addr + 8*wp, response);
-    stl_le_phys(addr + 8*wp + 4, ex);
+    stl_le_pci_dma(&d->pci, addr + 8*wp, response);
+    stl_le_pci_dma(&d->pci, addr + 8*wp + 4, ex);
     d->rirb_wp = wp;
 
     dprint(d, 2, "%s: [wp 0x%x] response 0x%x, extra 0x%x\n",
@@ -425,8 +426,7 @@ static bool intel_hda_xfer(HDACodecDevice *dev, uint32_t stnr, bool output,
         dprint(d, 3, "dma: entry %d, pos %d/%d, copy %d\n",
                st->be, st->bp, st->bpl[st->be].len, copy);
 
-        cpu_physical_memory_rw(st->bpl[st->be].addr + st->bp,
-                               buf, copy, !output);
+        pci_dma_rw(&d->pci, st->bpl[st->be].addr + st->bp, buf, copy, !output);
         st->lpib += copy;
         st->bp += copy;
         buf += copy;
@@ -448,7 +448,7 @@ static bool intel_hda_xfer(HDACodecDevice *dev, uint32_t stnr, bool output,
     }
     if (d->dp_lbase & 0x01) {
         addr = intel_hda_addr(d->dp_lbase & ~0x01, d->dp_ubase);
-        stl_le_phys(addr + 8*s, st->lpib);
+        stl_le_pci_dma(&d->pci, addr + 8*s, st->lpib);
     }
     dprint(d, 3, "dma: --\n");
 
@@ -470,7 +470,7 @@ static void intel_hda_parse_bdl(IntelHDAState *d, IntelHDAStream *st)
     g_free(st->bpl);
     st->bpl = g_malloc(sizeof(bpl) * st->bentries);
     for (i = 0; i < st->bentries; i++, addr += 16) {
-        cpu_physical_memory_read(addr, buf, 16);
+        pci_dma_read(&d->pci, addr, buf, 16);
         st->bpl[i].addr = le64_to_cpu(*(uint64_t *)buf);
         st->bpl[i].len = le32_to_cpu(*(uint32_t *)(buf + 8));
         st->bpl[i].flags = le32_to_cpu(*(uint32_t *)(buf + 12));
-- 
1.7.6.3
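The pattern in this patch is that every memory access takes the issuing device as its first argument, so a later IOMMU layer has a per-device hook. As a minimal sketch of that shape, assuming a toy device that simply owns a byte array as "guest memory": ToyPCIDevice, toy_pci_dma_read() and toy_ldl_le_pci_dma() are hypothetical names and are not the stubs from dma.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical device: owns a flat buffer standing in for guest memory. */
typedef struct ToyPCIDevice {
    uint8_t *guest_mem;
    size_t guest_mem_size;
} ToyPCIDevice;

/* Device-scoped read; returns 0 on success, -1 if out of range. */
int toy_pci_dma_read(ToyPCIDevice *dev, uint64_t addr, void *buf, size_t len)
{
    assert(dev != NULL && buf != NULL);
    if (addr > dev->guest_mem_size || len > dev->guest_mem_size - addr) {
        return -1;
    }
    memcpy(buf, dev->guest_mem + addr, len);
    return 0;
}

/* Load a 32-bit little-endian value through the device-scoped read.
 * Bytes are assembled explicitly, so host endianness does not matter.
 * A failed read leaves the buffer zeroed and yields 0. */
uint32_t toy_ldl_le_pci_dma(ToyPCIDevice *dev, uint64_t addr)
{
    uint8_t b[4] = { 0, 0, 0, 0 };

    toy_pci_dma_read(dev, addr, b, sizeof(b));
    return (uint32_t)b[0] |
           ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) |
           ((uint32_t)b[3] << 24);
}
```

Routing every access through the device pointer is what lets the translation policy change later without touching the call sites again.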
[Qemu-devel] [PATCH 03/12] eepro100: Use PCI DMA stub functions
From: Eduard - Gabriel Munteanu eduard.munte...@linux360.ro This updates the eepro100 device emulation to use the explicit PCI DMA functions, instead of directly calling physical memory access functions. Signed-off-by: Eduard - Gabriel Munteanu eduard.munte...@linux360.ro Signed-off-by: David Gibson d...@au1.ibm.com Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru --- hw/eepro100.c | 121 +++-- 1 files changed, 49 insertions(+), 72 deletions(-) diff --git a/hw/eepro100.c b/hw/eepro100.c index 4e3c52f..7d59e71 100644 --- a/hw/eepro100.c +++ b/hw/eepro100.c @@ -46,6 +46,7 @@ #include net.h #include eeprom93xx.h #include sysemu.h +#include dma.h /* QEMU sends frames smaller than 60 bytes to ethernet nics. * Such frames are rejected by real nics and their emulations. @@ -315,38 +316,6 @@ static const uint16_t eepro100_mdi_mask[] = { 0x, 0x, 0x, 0x, 0x, 0x, 0x, 0x, }; -/* Read a 16 bit little endian value from physical memory. */ -static uint16_t e100_ldw_le_phys(target_phys_addr_t addr) -{ -/* Load 16 bit (little endian) word from emulated hardware. */ -uint16_t val; -cpu_physical_memory_read(addr, val, sizeof(val)); -return le16_to_cpu(val); -} - -/* Read a 32 bit little endian value from physical memory. */ -static uint32_t e100_ldl_le_phys(target_phys_addr_t addr) -{ -/* Load 32 bit (little endian) word from emulated hardware. */ -uint32_t val; -cpu_physical_memory_read(addr, val, sizeof(val)); -return le32_to_cpu(val); -} - -/* Write a 16 bit little endian value to physical memory. */ -static void e100_stw_le_phys(target_phys_addr_t addr, uint16_t val) -{ -val = cpu_to_le16(val); -cpu_physical_memory_write(addr, val, sizeof(val)); -} - -/* Write a 32 bit little endian value to physical memory. 
*/ -static void e100_stl_le_phys(target_phys_addr_t addr, uint32_t val) -{ -val = cpu_to_le32(val); -cpu_physical_memory_write(addr, val, sizeof(val)); -} - #define POLYNOMIAL 0x04c11db6 /* From FreeBSD */ @@ -744,21 +713,26 @@ static void dump_statistics(EEPRO100State * s) * values which really matter. * Number of data should check configuration!!! */ -cpu_physical_memory_write(s-statsaddr, s-statistics, s-stats_size); -e100_stl_le_phys(s-statsaddr + 0, s-statistics.tx_good_frames); -e100_stl_le_phys(s-statsaddr + 36, s-statistics.rx_good_frames); -e100_stl_le_phys(s-statsaddr + 48, s-statistics.rx_resource_errors); -e100_stl_le_phys(s-statsaddr + 60, s-statistics.rx_short_frame_errors); +pci_dma_write(s-dev, s-statsaddr, + (uint8_t *) s-statistics, s-stats_size); +stl_le_pci_dma(s-dev, s-statsaddr + 0, + s-statistics.tx_good_frames); +stl_le_pci_dma(s-dev, s-statsaddr + 36, + s-statistics.rx_good_frames); +stl_le_pci_dma(s-dev, s-statsaddr + 48, + s-statistics.rx_resource_errors); +stl_le_pci_dma(s-dev, s-statsaddr + 60, + s-statistics.rx_short_frame_errors); #if 0 -e100_stw_le_phys(s-statsaddr + 76, s-statistics.xmt_tco_frames); -e100_stw_le_phys(s-statsaddr + 78, s-statistics.rcv_tco_frames); +stw_le_pci_dma(s-dev, s-statsaddr + 76, s-statistics.xmt_tco_frames); +stw_le_pci_dma(s-dev, s-statsaddr + 78, s-statistics.rcv_tco_frames); missing(CU dump statistical counters); #endif } static void read_cb(EEPRO100State *s) { -cpu_physical_memory_read(s-cb_address, s-tx, sizeof(s-tx)); +pci_dma_read(s-dev, s-cb_address, (uint8_t *) s-tx, sizeof(s-tx)); s-tx.status = le16_to_cpu(s-tx.status); s-tx.command = le16_to_cpu(s-tx.command); s-tx.link = le32_to_cpu(s-tx.link); @@ -788,18 +762,17 @@ static void tx_command(EEPRO100State *s) } assert(tcb_bytes = sizeof(buf)); while (size tcb_bytes) { -uint32_t tx_buffer_address = e100_ldl_le_phys(tbd_address); -uint16_t tx_buffer_size = e100_ldw_le_phys(tbd_address + 4); +uint32_t tx_buffer_address = ldl_le_pci_dma(s-dev, 
tbd_address); +uint16_t tx_buffer_size = lduw_le_pci_dma(s-dev, tbd_address + 4); #if 0 -uint16_t tx_buffer_el = e100_ldw_le_phys(tbd_address + 6); +uint16_t tx_buffer_el = lduw_le_pci_dma(s-dev, tbd_address + 6); #endif tbd_address += 8; TRACE(RXTX, logout (TBD (simplified mode): buffer address 0x%08x, size 0x%04x\n, tx_buffer_address, tx_buffer_size)); tx_buffer_size = MIN(tx_buffer_size, sizeof(buf) - size); -cpu_physical_memory_read(tx_buffer_address, buf[size], - tx_buffer_size); +pci_dma_read(s-dev, tx_buffer_address, buf[size], tx_buffer_size); size += tx_buffer_size; } if (tbd_array == 0x) { @@ -810,16 +783,19 @@ static void tx_command(EEPRO100State *s) if (s-has_extended_tcb_support !(s-configuration[6]
[Qemu-devel] [PATCH 04/12] ac97: Use PCI DMA stub functions
From: Eduard - Gabriel Munteanu eduard.munte...@linux360.ro

This updates the ac97 device emulation to use the explicit PCI DMA functions, instead of directly calling physical memory access functions.

Signed-off-by: Eduard - Gabriel Munteanu eduard.munte...@linux360.ro
Signed-off-by: David Gibson da...@gibson.dropbear.id.au
---
 hw/ac97.c | 7 ++++---
 1 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/hw/ac97.c b/hw/ac97.c
index bc69d4e..6800af4 100644
--- a/hw/ac97.c
+++ b/hw/ac97.c
@@ -18,6 +18,7 @@
 #include "audiodev.h"
 #include "audio/audio.h"
 #include "pci.h"
+#include "dma.h"
 
 enum {
     AC97_Reset = 0x00,
@@ -224,7 +225,7 @@ static void fetch_bd (AC97LinkState *s, AC97BusMasterRegs *r)
 {
     uint8_t b[8];
 
-    cpu_physical_memory_read (r->bdbar + r->civ * 8, b, 8);
+    pci_dma_read (&s->dev, r->bdbar + r->civ * 8, b, 8);
     r->bd_valid = 1;
     r->bd.addr = le32_to_cpu (*(uint32_t *) &b[0]) & ~3;
     r->bd.ctl_len = le32_to_cpu (*(uint32_t *) &b[4]);
@@ -973,7 +974,7 @@ static int write_audio (AC97LinkState *s, AC97BusMasterRegs *r,
     while (temp) {
         int copied;
         to_copy = audio_MIN (temp, sizeof (tmpbuf));
-        cpu_physical_memory_read (addr, tmpbuf, to_copy);
+        pci_dma_read (&s->dev, addr, tmpbuf, to_copy);
         copied = AUD_write (s->voice_po, tmpbuf, to_copy);
         dolog ("write_audio max=%x to_copy=%x copied=%x\n",
                max, to_copy, copied);
@@ -1054,7 +1055,7 @@ static int read_audio (AC97LinkState *s, AC97BusMasterRegs *r,
                 *stop = 1;
                 break;
             }
-            cpu_physical_memory_write (addr, tmpbuf, acquired);
+            pci_dma_write (&s->dev, addr, tmpbuf, acquired);
             temp -= acquired;
             addr += acquired;
             nread += acquired;
-- 
1.7.6.3
[Qemu-devel] [PATCH 12/12] usb-uhci: Use PCI DMA stub functions
This updates the usb-uhci device emulation to use the explicit PCI DMA wrapper to initialize its scatter/gather structure. This means this driver should not need further changes when the sglist interface is extended to support IOMMUs.

Signed-off-by: David Gibson da...@gibson.dropbear.id.au
---
 hw/usb-uhci.c | 22 +++++++++++-----------
 1 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/hw/usb-uhci.c b/hw/usb-uhci.c
index 17992cf..6f8ea21 100644
--- a/hw/usb-uhci.c
+++ b/hw/usb-uhci.c
@@ -178,7 +178,7 @@ static UHCIAsync *uhci_async_alloc(UHCIState *s)
     async->done = 0;
     async->isoc = 0;
     usb_packet_init(&async->packet);
-    qemu_sglist_init(&async->sgl, 1);
+    pci_dma_sglist_init(&async->sgl, &s->dev, 1);
 
     return async;
 }
@@ -876,7 +876,7 @@ static void uhci_async_complete(USBPort *port, USBPacket *packet)
         uint32_t link = async->td;
         uint32_t int_mask = 0, val;
 
-        cpu_physical_memory_read(link & ~0xf, (uint8_t *) &td, sizeof(td));
+        pci_dma_read(&s->dev, link & ~0xf, (uint8_t *) &td, sizeof(td));
         le32_to_cpus(&td.link);
         le32_to_cpus(&td.ctrl);
         le32_to_cpus(&td.token);
@@ -888,8 +888,8 @@ static void uhci_async_complete(USBPort *port, USBPacket *packet)
 
         /* update the status bits of the TD */
         val = cpu_to_le32(td.ctrl);
-        cpu_physical_memory_write((link & ~0xf) + 4,
-                                  (const uint8_t *)&val, sizeof(val));
+        pci_dma_write(&s->dev, (link & ~0xf) + 4,
+                      (const uint8_t *)&val, sizeof(val));
         uhci_async_free(s, async);
     } else {
         async->done = 1;
@@ -952,7 +952,7 @@ static void uhci_process_frame(UHCIState *s)
 
     DPRINTF("uhci: processing frame %d addr 0x%x\n" , s->frnum, frame_addr);
 
-    cpu_physical_memory_read(frame_addr, (uint8_t *)&link, 4);
+    pci_dma_read(&s->dev, frame_addr, (uint8_t *)&link, 4);
     le32_to_cpus(&link);
 
     int_mask = 0;
@@ -976,7 +976,7 @@ static void uhci_process_frame(UHCIState *s)
                 break;
             }
 
-            cpu_physical_memory_read(link & ~0xf, (uint8_t *) &qh, sizeof(qh));
+            pci_dma_read(&s->dev, link & ~0xf, (uint8_t *) &qh, sizeof(qh));
             le32_to_cpus(&qh.link);
             le32_to_cpus(&qh.el_link);
 
@@ -996,7 +996,7 @@ static void uhci_process_frame(UHCIState *s)
         }
 
         /* TD */
-        cpu_physical_memory_read(link & ~0xf, (uint8_t *) &td, sizeof(td));
+        pci_dma_read(&s->dev, link & ~0xf, (uint8_t *) &td, sizeof(td));
         le32_to_cpus(&td.link);
         le32_to_cpus(&td.ctrl);
         le32_to_cpus(&td.token);
@@ -1010,8 +1010,8 @@ static void uhci_process_frame(UHCIState *s)
         if (old_td_ctrl != td.ctrl) {
             /* update the status bits of the TD */
             val = cpu_to_le32(td.ctrl);
-            cpu_physical_memory_write((link & ~0xf) + 4,
-                                      (const uint8_t *)&val, sizeof(val));
+            pci_dma_write(&s->dev, (link & ~0xf) + 4,
+                          (const uint8_t *)&val, sizeof(val));
         }
 
         if (ret < 0) {
@@ -1039,8 +1039,8 @@ static void uhci_process_frame(UHCIState *s)
             /* update QH element link */
             qh.el_link = link;
             val = cpu_to_le32(qh.el_link);
-            cpu_physical_memory_write((curr_qh & ~0xf) + 4,
-                                      (const uint8_t *)&val, sizeof(val));
+            pci_dma_write(&s->dev, (curr_qh & ~0xf) + 4,
+                          (const uint8_t *)&val, sizeof(val));
 
             if (!depth_first(link)) {
                 /* done with this QH */
-- 
1.7.6.3
[Qemu-devel] [PATCH 02/12] rtl8139: Use PCI DMA stub functions
From: Eduard - Gabriel Munteanu eduard.munte...@linux360.ro This updates the rtl8139 device emulation to use the explicit PCI DMA functions, instead of directly calling physical memory access functions. Signed-off-by: Eduard - Gabriel Munteanu eduard.munte...@linux360.ro Signed-off-by: David Gibson da...@gibson.dropbear.id.au --- hw/rtl8139.c | 98 + 1 files changed, 50 insertions(+), 48 deletions(-) diff --git a/hw/rtl8139.c b/hw/rtl8139.c index 3753950..d2c4980 100644 --- a/hw/rtl8139.c +++ b/hw/rtl8139.c @@ -53,6 +53,7 @@ #include hw.h #include pci.h +#include dma.h #include qemu-timer.h #include net.h #include loader.h @@ -427,9 +428,6 @@ typedef struct RTL8139TallyCounters /* Clears all tally counters */ static void RTL8139TallyCounters_clear(RTL8139TallyCounters* counters); -/* Writes tally counters to specified physical memory address */ -static void RTL8139TallyCounters_physical_memory_write(target_phys_addr_t tc_addr, RTL8139TallyCounters* counters); - typedef struct RTL8139State { PCIDevice dev; uint8_t phys[8]; /* mac address */ @@ -512,6 +510,9 @@ typedef struct RTL8139State { int rtl8139_mmio_io_addr_dummy; } RTL8139State; +/* Writes tally counters to memory via DMA */ +static void RTL8139TallyCounters_dma_write(RTL8139State *s, dma_addr_t tc_addr); + static void rtl8139_set_next_tctr_time(RTL8139State *s, int64_t current_time); static void prom9346_decode_command(EEprom9346 *eeprom, uint8_t command) @@ -773,15 +774,15 @@ static void rtl8139_write_buffer(RTL8139State *s, const void *buf, int size) if (size wrapped) { -cpu_physical_memory_write( s-RxBuf + s-RxBufAddr, - buf, size-wrapped ); +pci_dma_write(s-dev, s-RxBuf + s-RxBufAddr, + buf, size-wrapped); } /* reset buffer pointer */ s-RxBufAddr = 0; -cpu_physical_memory_write( s-RxBuf + s-RxBufAddr, - buf + (size-wrapped), wrapped ); +pci_dma_write(s-dev, s-RxBuf + s-RxBufAddr, + buf + (size-wrapped), wrapped); s-RxBufAddr = wrapped; @@ -790,13 +791,13 @@ static void rtl8139_write_buffer(RTL8139State 
*s, const void *buf, int size) } /* non-wrapping path or overwrapping enabled */ -cpu_physical_memory_write( s-RxBuf + s-RxBufAddr, buf, size ); +pci_dma_write(s-dev, s-RxBuf + s-RxBufAddr, buf, size); s-RxBufAddr += size; } #define MIN_BUF_SIZE 60 -static inline target_phys_addr_t rtl8139_addr64(uint32_t low, uint32_t high) +static inline dma_addr_t rtl8139_addr64(uint32_t low, uint32_t high) { #if TARGET_PHYS_ADDR_BITS 32 return low | ((target_phys_addr_t)high 32); @@ -979,7 +980,7 @@ static ssize_t rtl8139_do_receive(VLANClientState *nc, const uint8_t *buf, size_ /* w3 high 32bit of Rx buffer ptr */ int descriptor = s-currCPlusRxDesc; -target_phys_addr_t cplus_rx_ring_desc; +dma_addr_t cplus_rx_ring_desc; cplus_rx_ring_desc = rtl8139_addr64(s-RxRingAddrLO, s-RxRingAddrHI); cplus_rx_ring_desc += 16 * descriptor; @@ -990,13 +991,13 @@ static ssize_t rtl8139_do_receive(VLANClientState *nc, const uint8_t *buf, size_ uint32_t val, rxdw0,rxdw1,rxbufLO,rxbufHI; -cpu_physical_memory_read(cplus_rx_ring_desc,(uint8_t *)val, 4); +pci_dma_read(s-dev, cplus_rx_ring_desc, (uint8_t *)val, 4); rxdw0 = le32_to_cpu(val); -cpu_physical_memory_read(cplus_rx_ring_desc+4, (uint8_t *)val, 4); +pci_dma_read(s-dev, cplus_rx_ring_desc+4, (uint8_t *)val, 4); rxdw1 = le32_to_cpu(val); -cpu_physical_memory_read(cplus_rx_ring_desc+8, (uint8_t *)val, 4); +pci_dma_read(s-dev, cplus_rx_ring_desc+8, (uint8_t *)val, 4); rxbufLO = le32_to_cpu(val); -cpu_physical_memory_read(cplus_rx_ring_desc+12, (uint8_t *)val, 4); +pci_dma_read(s-dev, cplus_rx_ring_desc+12, (uint8_t *)val, 4); rxbufHI = le32_to_cpu(val); DPRINTF(+++ C+ mode RX descriptor %d %08x %08x %08x %08x\n, @@ -1060,16 +1061,16 @@ static ssize_t rtl8139_do_receive(VLANClientState *nc, const uint8_t *buf, size_ return size_; } -target_phys_addr_t rx_addr = rtl8139_addr64(rxbufLO, rxbufHI); +dma_addr_t rx_addr = rtl8139_addr64(rxbufLO, rxbufHI); /* receive/copy to target memory */ if (dot1q_buf) { -cpu_physical_memory_write(rx_addr, buf, 2 * 
ETHER_ADDR_LEN); -cpu_physical_memory_write(rx_addr + 2 * ETHER_ADDR_LEN, -buf + 2 * ETHER_ADDR_LEN + VLAN_HLEN, -size - 2 * ETHER_ADDR_LEN); +pci_dma_write(s-dev, rx_addr, buf, 2 * ETHER_ADDR_LEN); +pci_dma_write(s-dev, rx_addr + 2 * ETHER_ADDR_LEN, +
Re: [Qemu-devel] [PATCH v3 03/15] add qemu_send_full and qemu_recv_full
On 10/14/2011 11:52 AM, Kevin Wolf wrote:
On 05.10.2011 09:17, Paolo Bonzini wrote:

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 osdep.c       | 67 +
 qemu-common.h |  4 +++
 2 files changed, 71 insertions(+), 0 deletions(-)

diff --git a/osdep.c b/osdep.c
index 56e6963..718a25d 100644
--- a/osdep.c
+++ b/osdep.c
@@ -166,3 +166,70 @@ int qemu_accept(int s, struct sockaddr *addr, socklen_t *addrlen)
 
     return ret;
 }
+
+/*
+ * A variant of send(2) which handles partial write.
+ *
+ * Return the number of bytes transferred, which is only
+ * smaller than `count' if there is an error.
+ *
+ * This function won't work with non-blocking fd's.
+ * Any of the possibilities with non-bloking fd's is bad:
+ *   - return a short write (then name is wrong)
+ *   - busy wait adding (errno == EAGAIN) to the loop
+ */
+ssize_t qemu_send_full(int fd, const void *buf, size_t count, int flags)
+{
+    ssize_t ret = 0;
+    ssize_t total = 0;
+
+    while (count) {
+        ret = send(fd, buf, count, flags);
+        if (ret < 0) {
+            if (errno == EINTR) {
+                continue;
+            }
+            break;
+        }
+
+        count -= ret;
+        buf += ret;
+        total += ret;
+    }
+
+    return total;
+}
+
+/*
+ * A variant of recv(2) which handles partial write.
+ *
+ * Return the number of bytes transferred, which is only
+ * smaller than `count' if there is an error.
+ *
+ * This function won't work with non-blocking fd's.
+ * Any of the possibilities with non-bloking fd's is bad:
+ *   - return a short write (then name is wrong)
+ *   - busy wait adding (errno == EAGAIN) to the loop
+ */
+ssize_t qemu_recv_full(int fd, const void *buf, size_t count, int flags)
+{
+    ssize_t ret = 0;
+    ssize_t total = 0;
+
+    while (count) {
+        ret = recv(fd, buf, count, flags);

osdep.c: In function 'qemu_recv_full':
osdep.c:220: error: passing argument 2 of 'recv' discards qualifiers from pointer target type
/usr/include/bits/socket2.h:35: note: expected 'void *' but argument is of type 'const void *'

It's fixed in 4/15's osdep.c. I attach the diff, and pushed the fixed version to github nbd-trim. Also, all branches there are now rebased on top of block branch.

Paolo

diff --git a/osdep.c b/osdep.c
index 718a25d..70bad27 100644
--- a/osdep.c
+++ b/osdep.c
@@ -211,13 +211,13 @@ ssize_t qemu_send_full(int fd, const void *buf, size_t count, int flags)
  *   - return a short write (then name is wrong)
  *   - busy wait adding (errno == EAGAIN) to the loop
  */
-ssize_t qemu_recv_full(int fd, const void *buf, size_t count, int flags)
+ssize_t qemu_recv_full(int fd, void *buf, size_t count, int flags)
 {
     ssize_t ret = 0;
     ssize_t total = 0;
 
     while (count) {
-        ret = recv(fd, buf, count, flags);
+        ret = qemu_recv(fd, buf, count, flags);
         if (ret <= 0) {
             if (ret < 0 && errno == EINTR) {
                 continue;
diff --git a/roms/SLOF b/roms/SLOF
index d1d6b53..b94bde0 160000
--- a/roms/SLOF
+++ b/roms/SLOF
@@ -1 +1 @@
-Subproject commit d1d6b53b713a2b7c2c25685268fa932d28a4b4c0
+Subproject commit b94bde008b0d49ec4bfe933e110d0952d032ac28
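For readers who want to try the loop in isolation, here is a standalone sketch of the same idea on plain POSIX sockets. It is an illustration under assumptions, not the QEMU code: toy_send_full() and toy_recv_full() are hypothetical names, they call send(2)/recv(2) directly instead of qemu_recv(), and they walk a char pointer rather than doing arithmetic on void *. Like the patch, they assume a blocking fd.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Keep calling send(2) until everything is written or a real error occurs.
 * Returns the number of bytes sent; short only on error. */
ssize_t toy_send_full(int fd, const void *buf, size_t count, int flags)
{
    const char *p = buf;
    ssize_t total = 0;

    assert(buf != NULL || count == 0);
    while (count > 0) {
        ssize_t ret = send(fd, p, count, flags);
        if (ret < 0) {
            if (errno == EINTR) {
                continue;   /* interrupted by a signal: retry */
            }
            break;          /* real error: report what was sent so far */
        }
        p += ret;
        count -= (size_t)ret;
        total += ret;
    }
    return total;
}

/* Receive counterpart. The buffer is written to, so it is not const;
 * a return of 0 from recv(2) means the peer closed the connection. */
ssize_t toy_recv_full(int fd, void *buf, size_t count, int flags)
{
    char *p = buf;
    ssize_t total = 0;

    assert(buf != NULL || count == 0);
    while (count > 0) {
        ssize_t ret = recv(fd, p, count, flags);
        if (ret <= 0) {
            if (ret < 0 && errno == EINTR) {
                continue;
            }
            break;          /* EOF or real error */
        }
        p += ret;
        count -= (size_t)ret;
        total += ret;
    }
    return total;
}
```

The receive side has to treat a zero return as end of stream; without that check the loop would spin forever once the peer closes the socket.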
[Qemu-devel] [PATCH 08/12] pcnet-pci: Use PCI DMA stub functions
From: Eduard - Gabriel Munteanu eduard.munte...@linux360.ro

This updates the pcnet-pci device emulation to use the explicit PCI DMA functions, instead of directly calling physical memory access functions.

Signed-off-by: Eduard - Gabriel Munteanu eduard.munte...@linux360.ro
Signed-off-by: David Gibson da...@gibson.dropbear.id.au
---
 hw/pcnet-pci.c | 6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/hw/pcnet-pci.c b/hw/pcnet-pci.c
index fb2a00c..41a6e07 100644
--- a/hw/pcnet-pci.c
+++ b/hw/pcnet-pci.c
@@ -31,6 +31,7 @@
 #include "net.h"
 #include "loader.h"
 #include "qemu-timer.h"
+#include "dma.h"
 
 #include "pcnet.h"
 
@@ -230,13 +231,13 @@ static const MemoryRegionOps pcnet_mmio_ops = {
 static void pci_physical_memory_write(void *dma_opaque, target_phys_addr_t addr,
                                       uint8_t *buf, int len, int do_bswap)
 {
-    cpu_physical_memory_write(addr, buf, len);
+    pci_dma_write(dma_opaque, addr, buf, len);
 }
 
 static void pci_physical_memory_read(void *dma_opaque, target_phys_addr_t addr,
                                      uint8_t *buf, int len, int do_bswap)
 {
-    cpu_physical_memory_read(addr, buf, len);
+    pci_dma_read(dma_opaque, addr, buf, len);
 }
 
 static void pci_pcnet_cleanup(VLANClientState *nc)
@@ -302,6 +303,7 @@ static int pci_pcnet_init(PCIDevice *pci_dev)
     s->irq = pci_dev->irq[0];
     s->phys_mem_read = pci_physical_memory_read;
     s->phys_mem_write = pci_physical_memory_write;
+    s->dma_opaque = pci_dev;
 
     if (!pci_dev->qdev.hotplugged) {
         static int loaded = 0;
-- 
1.7.6.3
[Qemu-devel] [PATCH 10/12] PCI IDE: Use PCI DMA stub functions
This updates the PCI IDE device emulation to use the explicit PCI DMA wrapper to initialize its scatter/gather structure. This means this driver should not need further changes when the sglist interface is extended to support IOMMUs.

Signed-off-by: David Gibson da...@gibson.dropbear.id.au
---
 hw/ide/pci.c | 15 ++++++++-------
 1 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/hw/ide/pci.c b/hw/ide/pci.c
index f133c42..0747e09 100644
--- a/hw/ide/pci.c
+++ b/hw/ide/pci.c
@@ -62,7 +62,8 @@ static int bmdma_prepare_buf(IDEDMA *dma, int is_write)
     } prd;
     int l, len;
 
-    qemu_sglist_init(&s->sg, s->nsector / (BMDMA_PAGE_SIZE / 512) + 1);
+    pci_dma_sglist_init(&s->sg, &bm->pci_dev->dev,
+                        s->nsector / (BMDMA_PAGE_SIZE / 512) + 1);
     s->io_buffer_size = 0;
     for(;;) {
         if (bm->cur_prd_len == 0) {
@@ -70,7 +71,7 @@ static int bmdma_prepare_buf(IDEDMA *dma, int is_write)
             if (bm->cur_prd_last ||
                 (bm->cur_addr - bm->addr) >= BMDMA_PAGE_SIZE)
                 return s->io_buffer_size != 0;
-            cpu_physical_memory_read(bm->cur_addr, (uint8_t *)&prd, 8);
+            pci_dma_read(&bm->pci_dev->dev, bm->cur_addr, (uint8_t *)&prd, 8);
             bm->cur_addr += 8;
             prd.addr = le32_to_cpu(prd.addr);
             prd.size = le32_to_cpu(prd.size);
@@ -112,7 +113,7 @@ static int bmdma_rw_buf(IDEDMA *dma, int is_write)
             if (bm->cur_prd_last ||
                 (bm->cur_addr - bm->addr) >= BMDMA_PAGE_SIZE)
                 return 0;
-            cpu_physical_memory_read(bm->cur_addr, (uint8_t *)&prd, 8);
+            pci_dma_read(&bm->pci_dev->dev, bm->cur_addr, (uint8_t *)&prd, 8);
             bm->cur_addr += 8;
             prd.addr = le32_to_cpu(prd.addr);
             prd.size = le32_to_cpu(prd.size);
@@ -127,11 +128,11 @@ static int bmdma_rw_buf(IDEDMA *dma, int is_write)
             l = bm->cur_prd_len;
         if (l > 0) {
             if (is_write) {
-                cpu_physical_memory_write(bm->cur_prd_addr,
-                                          s->io_buffer + s->io_buffer_index, l);
+                pci_dma_write(&bm->pci_dev->dev, bm->cur_prd_addr,
+                              s->io_buffer + s->io_buffer_index, l);
             } else {
-                cpu_physical_memory_read(bm->cur_prd_addr,
-                                          s->io_buffer + s->io_buffer_index, l);
+                pci_dma_read(&bm->pci_dev->dev, bm->cur_prd_addr,
+                             s->io_buffer + s->io_buffer_index, l);
             }
             bm->cur_prd_addr += l;
             bm->cur_prd_len -= l;
-- 
1.7.6.3
[Qemu-devel] [Bug 874038] Re: ARM thumb2 does not propogate carry flag properly
** Patch added: "fix for the problem"
   https://bugs.launchpad.net/bugs/874038/+attachment/2542719/+files/fix_carry_in_thumb2

-- 
You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/874038

Title:
  ARM thumb2 does not propogate carry flag properly

Status in QEMU:
  New

Bug description:
  information on carry flag is lost if gen_set_CF_bit31(t1) is called after logic operation.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/874038/+subscriptions
[Qemu-devel] [Bug 874038] [NEW] ARM thumb2 does not propogate carry flag properly
Public bug reported:

information on carry flag is lost if gen_set_CF_bit31(t1) is called after logic operation.

** Affects: qemu
     Importance: Undecided
         Status: New

** Tags: arm flags

-- 
You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/874038

Title:
  ARM thumb2 does not propogate carry flag properly

Status in QEMU:
  New

Bug description:
  information on carry flag is lost if gen_set_CF_bit31(t1) is called after logic operation.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/874038/+subscriptions
Re: [Qemu-devel] [v7 Patch 1/5]Qemu: Enhance info block to display host cache setting
On 10/12/2011 07:47 PM, Kevin Wolf wrote:
On 11.10.2011 05:10, Supriya Kannery wrote:

Enhance info block to display hostcache setting for each block device.

+    if (qdict_haskey(bs_dict, "open_flags")) {
+        int open_flags = qdict_get_int(bs_dict, "open_flags");
+        if (open_flags & BDRV_O_NOCACHE)
+            monitor_printf(mon, "hostcache=0");
+        else
+            monitor_printf(mon, "hostcache=1");

Coding style requires braces.

ok..will add. checkpatch.pl didn't catch this!

         bs_obj = qobject_from_jsonf("{ 'device': %s, 'type': 'unknown', "
-                                    "'removable': %i, 'locked': %i }",
+                                    "'removable': %i, 'locked': %i, "
+                                    "'hostcache': %i }",
                                     bs->device_name,
                                     bdrv_dev_has_removable_media(bs),
-                                    bdrv_dev_is_medium_locked(bs));
+                                    bdrv_dev_is_medium_locked(bs),
+                                    !(bs->open_flags & BDRV_O_NOCACHE));
         bs_dict = qobject_to_qdict(bs_obj);
+        qdict_put(bs_dict, "open_flags", qint_from_int(bs->open_flags));

No. This adds a open_flags field to the QMP structure that is transferred to clients. This is wrong, open_flags is an internal thing that should never be visible on an interface. In bdrv_print_dict, access the hostcache field that you introduced, it provides the same information.

Will replace open_flags with hostcache field.

thanks,
Supriya
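The review point is that only the derived boolean should cross the interface, not the raw flag word. A small sketch of that separation, under assumptions: TOY_O_NOCACHE and both function names are hypothetical, the flag value is illustrative, and the printed string is simply the "hostcache=N" form from the quoted patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Illustrative flag bit; stands in for an internal open flag. */
#define TOY_O_NOCACHE 0x0020

/* Derive the public boolean once, where the internal flags are known. */
int toy_hostcache_from_flags(int open_flags)
{
    return !(open_flags & TOY_O_NOCACHE);
}

/* The printing side only ever sees the boolean, never the flag word.
 * Returns the snprintf() result. */
int toy_format_hostcache(char *out, size_t out_size, int hostcache)
{
    assert(out != NULL && out_size > 0);
    return snprintf(out, out_size, "hostcache=%d", hostcache ? 1 : 0);
}
```

With this split, a change to the internal flag layout cannot leak into what monitor clients parse.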
[Qemu-devel] [PATCH] rtl8139: check the buffer availiability
Reduce spurious packet drops on RX ring empty when in c+ mode by verifying ahead of time that we have at least 1 buffer.

Signed-off-by: Jason Wang jasow...@redhat.com
---
 hw/rtl8139.c | 43 +++++++++++++++++++++++++++++--------------
 1 files changed, 29 insertions(+), 14 deletions(-)

diff --git a/hw/rtl8139.c b/hw/rtl8139.c
index 3753950..c654d5d 100644
--- a/hw/rtl8139.c
+++ b/hw/rtl8139.c
@@ -84,6 +84,19 @@
 #define VLAN_TCI_LEN 2
 #define VLAN_HLEN (ETHER_TYPE_LEN + VLAN_TCI_LEN)
 
+/* w0 ownership flag */
+#define CP_RX_OWN (1<<31)
+/* w0 end of ring flag */
+#define CP_RX_EOR (1<<30)
+/* w0 bits 0...12 : buffer size */
+#define CP_RX_BUFFER_SIZE_MASK ((1<<13) - 1)
+/* w1 tag available flag */
+#define CP_RX_TAVA (1<<16)
+/* w1 bits 0...15 : VLAN tag */
+#define CP_RX_VLAN_TAG_MASK ((1<<16) - 1)
+/* w2 low 32bit of Rx buffer ptr */
+/* w3 high 32bit of Rx buffer ptr */
+
 #if defined (DEBUG_RTL8139)
 #  define DPRINTF(fmt, ...) \
     do { fprintf(stderr, "RTL8139: " fmt, ## __VA_ARGS__); } while (0)
@@ -805,6 +818,21 @@ static inline target_phys_addr_t rtl8139_addr64(uint32_t low, uint32_t high)
 #endif
 }
 
+/* Verify that we have at least one available rx buffer */
+static int rtl8139_cp_has_rxbuf(RTL8139State *s)
+{
+    uint32_t val, rxdw0;
+    target_phys_addr_t cplus_rx_ring_desc = rtl8139_addr64(s->RxRingAddrLO,
+                                                           s->RxRingAddrHI);
+    cplus_rx_ring_desc += 16 * s->currCPlusRxDesc;
+    cpu_physical_memory_read(cplus_rx_ring_desc, (uint8_t *)&val, 4);
+    rxdw0 = le32_to_cpu(val);
+    if (rxdw0 & CP_RX_OWN)
+        return 1;
+    else
+        return 0;
+}
+
 static int rtl8139_can_receive(VLANClientState *nc)
 {
     RTL8139State *s = DO_UPCAST(NICState, nc, nc)->opaque;
@@ -819,7 +847,7 @@ static int rtl8139_can_receive(VLANClientState *nc)
     if (rtl8139_cp_receiver_enabled(s)) {
         /* ??? Flow control not implemented in c+ mode.
            This is a hack to work around slirp deficiencies anyway.  */
-        return 1;
+        return rtl8139_cp_has_rxbuf(s);
     } else {
         avail = MOD2(s->RxBufferSize + s->RxBufPtr - s->RxBufAddr,
                      s->RxBufferSize);
@@ -965,19 +993,6 @@ static ssize_t rtl8139_do_receive(VLANClientState *nc, const uint8_t *buf, size_
 
         /* begin C+ receiver mode */
 
-/* w0 ownership flag */
-#define CP_RX_OWN (1<<31)
-/* w0 end of ring flag */
-#define CP_RX_EOR (1<<30)
-/* w0 bits 0...12 : buffer size */
-#define CP_RX_BUFFER_SIZE_MASK ((1<<13) - 1)
-/* w1 tag available flag */
-#define CP_RX_TAVA (1<<16)
-/* w1 bits 0...15 : VLAN tag */
-#define CP_RX_VLAN_TAG_MASK ((1<<16) - 1)
-/* w2 low 32bit of Rx buffer ptr */
-/* w3 high 32bit of Rx buffer ptr */
-
         int descriptor = s->currCPlusRxDesc;
         target_phys_addr_t cplus_rx_ring_desc;
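The check added by this patch reads word 0 of the current 16-byte descriptor and tests the ownership bit. Below is a self-contained sketch of just that test, assuming the ring is available as a plain byte buffer: toy_cp_has_rxbuf(), toy_load_le32() and the TOY_* macros are hypothetical names, and the sketch uses an unsigned constant for bit 31 as a matter of taste rather than because the patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_CP_RX_OWN  (UINT32_C(1) << 31)  /* w0 ownership flag */
#define TOY_DESC_SIZE  16                   /* four 32-bit words per descriptor */

/* Decode a little-endian 32-bit word without relying on host byte order. */
static uint32_t toy_load_le32(const uint8_t *p)
{
    return (uint32_t)p[0] |
           ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) |
           ((uint32_t)p[3] << 24);
}

/* Return 1 if the descriptor at index cur_desc has its ownership bit set,
 * i.e. the guest has handed that buffer to the NIC; 0 otherwise
 * (including when the index is outside the toy ring). */
int toy_cp_has_rxbuf(const uint8_t *ring, size_t ring_len, unsigned cur_desc)
{
    size_t off = (size_t)cur_desc * TOY_DESC_SIZE;

    assert(ring != NULL);
    if (off + 4 > ring_len) {
        return 0;
    }
    return (toy_load_le32(ring + off) & TOY_CP_RX_OWN) ? 1 : 0;
}
```

Answering "can receive" from this bit lets the network layer queue the packet instead of handing it to a receive path that would drop it.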
Re: [Qemu-devel] [PATCH 2/4] block: unify flush implementations
Am 14.10.2011 10:41, schrieb Paolo Bonzini: Add coroutine support for flush and apply the same emulation that we already do for read/write. bdrv_aio_flush is simplified to always go through a coroutine. Signed-off-by: Paolo Bonzini pbonz...@redhat.com To make the implementation more consistent with read/write operations, wouldn't it make sense to provide a bdrv_co_flush() globally instead of using the synchronous version as the preferred public interface? This is the semantics that I would expect of a bdrv_co_flush() anyway, your use of it for an AIO emulation functions confused me a bit at first. Kevin
Re: [Qemu-devel] [v7 Patch 5/5]Qemu: New struct 'BDRVReopenState' for image files reopen
On Tue, Oct 11, 2011 at 08:41:59AM +0530, Supriya Kannery wrote:

Index: qemu/block.c
===
--- qemu.orig/block.c
+++ qemu/block.c
@@ -706,6 +706,7 @@ int bdrv_reopen(BlockDriverState *bs, in
 {
     BlockDriver *drv = bs->drv;
     int ret = 0, open_flags;
+    BDRVReopenState *rs;

BDRVReopenState *rs = NULL;

If the abort path is taken we need to make sure rs has a defined value. Note that the abort path currently doesn't handle rs == NULL and will segfault in raw_reopen_abort().
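The two points in this review (give rs a defined value, and make the abort path cope with rs == NULL) can be illustrated with a small standalone sketch. This is not the QEMU code: ReopenState, reopen_abort() and reopen() are hypothetical stand-ins for BDRVReopenState, raw_reopen_abort() and bdrv_reopen(), and the failure is simulated with a flag.

```c
#include <stdlib.h>

/* Hypothetical stand-in for BDRVReopenState; not the QEMU definition. */
typedef struct ReopenState {
    int saved_flags;
} ReopenState;

/* Abort handler that tolerates a state that was never prepared. */
static int reopen_abort(ReopenState *rs)
{
    if (rs == NULL) {
        return 0;               /* nothing was prepared, nothing to undo */
    }
    free(rs);
    return 1;                   /* a prepared state was rolled back */
}

/* prepare_fails simulates an error before the state gets allocated. */
static int reopen(int prepare_fails)
{
    ReopenState *rs = NULL;     /* defined value even on the early abort path */

    if (!prepare_fails) {
        rs = calloc(1, sizeof(*rs));
        if (rs == NULL) {
            return -1;
        }
    }
    /* For illustration, both cases end up on the abort path. */
    return reopen_abort(rs);
}
```

With the pointer left uninitialized, the early-failure case would hand an indeterminate value to the abort handler; with the NULL check missing, it would dereference NULL instead.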
[Qemu-devel] website
http://wiki.qemu.org/Main_Page Someone may want to have a look over the website, the links in the Contribute section in the column on the left. All pages seem to primarily discuss the quality of Pandora jewelry and how beneficial it would be to purchase such.
Re: [Qemu-devel] [v7 Patch 4/5]Qemu: Add commandline -drive option 'hostcache'
On 10/12/2011 08:00 PM, Kevin Wolf wrote: Am 11.10.2011 05:11, schrieb Supriya Kannery: qemu command option 'hostcache' added to -drive for block devices. While starting a VM from qemu commandline, this option can be used for setting host cache usage for block data access. Simultaneous use of 'hostcache' and 'cache' options not allowed. Signed-off-by: Supriya Kannerysupri...@linux.vnet.ibm.com I'm not sure if introducing this alone makes sense. I think I would only do it when we introduce more options that allow replacing all cache=... options by other parameters. Can we do transition to alternatives for 'cache=' in a phased manner? Until all other params are ready, we can allow hostcache (as well as other params as and when they are ready) in cmdline with the condition that 'cache=x', if specified, overrides these params. Once we have all other params ready, 'cache=' can be replaced completely. thanks, Supriya
Re: [Qemu-devel] [v7 Patch 4/5]Qemu: Add commandline -drive option 'hostcache'
Am 14.10.2011 13:19, schrieb Supriya Kannery: On 10/12/2011 08:00 PM, Kevin Wolf wrote: Am 11.10.2011 05:11, schrieb Supriya Kannery: qemu command option 'hostcache' added to -drive for block devices. While starting a VM from qemu commandline, this option can be used for setting host cache usage for block data access. Simultaneous use of 'hostcache' and 'cache' options not allowed. Signed-off-by: Supriya Kannerysupri...@linux.vnet.ibm.com I'm not sure if introducing this alone makes sense. I think I would only do it when we introduce more options that allow replacing all cache=... options by other parameters. Can we do transition to alternatives for 'cache=' in a phased manner? Until all other params are ready, we can allow hostcache (as well as other params as and when they are ready) in cmdline with the condition that 'cache=x', if specified, overrides these params. Once we have all other params ready, 'cache=' can be replaced completely. I guess that would be good enough. There's still not much use in specifying hostcache=... at the same time as cache=... but at least it doesn't take away other options then. Kevin
Re: [Qemu-devel] [PATCH] savevm: qemu_savevm_state(): Drop stop VM logic
Luiz Capitulino lcapitul...@redhat.com wrote: qemu_savevm_state() has some logic to stop the VM and to (or not to) resume it. But this seems to be a big noop, as qemu_savevm_state() is only called by do_savevm() when the VM is already stopped. So, let's drop qemu_savevm_state()'s stop VM logic. Signed-off-by: Luiz Capitulino lcapitul...@redhat.com Reviewed-by: Juan Quintela quint...@redhat.com
Re: [Qemu-devel] [PATCH 2/4] block: unify flush implementations
On 10/14/2011 01:08 PM, Kevin Wolf wrote: Am 14.10.2011 10:41, schrieb Paolo Bonzini: Add coroutine support for flush and apply the same emulation that we already do for read/write. bdrv_aio_flush is simplified to always go through a coroutine. Signed-off-by: Paolo Bonzini pbonz...@redhat.com To make the implementation more consistent with read/write operations, wouldn't it make sense to provide a bdrv_co_flush() globally instead of using the synchronous version as the preferred public interface?

I thought about it, but then it turned out that I would have

    bdrv_flush -> create coroutine or just fast-path to bdrv_flush_co_entry -> bdrv_flush_co_entry -> driver

and

    bdrv_co_flush -> bdrv_flush_co_entry -> driver

In other words, the code would be exactly the same, save for an if (qemu_in_coroutine()). The reason is that, unlike read/write, neither flush nor discard takes a qiov.

I think that with Stefan's cleanup having specialized coroutine versions has in general a much smaller benefit. The code reading benefit of naming routines like bdrv_co_* is already lost, for example, since bdrv_read can yield when called from coroutine context.

Let me show how this might go. Right now you have

    bdrv_read/write -> bdrv_rw_co -> create qiov -> create coroutine or just fast-path to bdrv_rw_co_entry -> bdrv_rw_co_entry -> bdrv_co_do_readv/writev -> driver

    bdrv_co_readv/writev -> bdrv_co_do_readv/writev -> driver

But starting from here, you might just as well reorganize it like this:

    bdrv_read/writev -> bdrv_rw_co -> create qiov -> bdrv_readv/writev

    bdrv_readv/writev -> create coroutine or just fast-path to bdrv_rw_co_entry -> bdrv_rw_co_entry -> bdrv_co_do_readv/writev -> driver

and just drop bdrv_co_readv, since it would just be hard-coding the fast-path of bdrv_readv.

Since some amount of synchronous I/O would likely always be there, for example in qemu-img, I think this unification would make more sense than providing two separate entrypoints for bdrv_co_flush and bdrv_flush.

Paolo
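The "create coroutine or just fast-path" step in the call chains above can be modelled roughly as follows. This is only a toy: there are no real coroutines here, in_coroutine stands in for qemu_in_coroutine(), poll_events() stands in for the event-loop wait, and every name (FlushCo, flush_co_entry, flush_sync, driver_flush) is made up for illustration rather than taken from block.c.

```c
#include <stdbool.h>

#define NOT_DONE 0x7fffffff

/* Marshalling structure passed to the coroutine entry point. */
typedef struct FlushCo {
    int ret;
} FlushCo;

static bool in_coroutine;       /* stand-in for qemu_in_coroutine() */
static int coroutines_created;  /* counts how often the slow path is taken */

/* Stand-in for the driver's flush callback. */
static int driver_flush(void)
{
    return 0;
}

/* Stand-in for waiting on the event loop; never reached in this toy. */
static void poll_events(void)
{
}

/* Entry point shared by the fast path and the coroutine path. */
static void flush_co_entry(void *opaque)
{
    FlushCo *co = opaque;
    co->ret = driver_flush();
}

/* Synchronous wrapper: fast-path when already in coroutine context. */
static int flush_sync(void)
{
    FlushCo co = { .ret = NOT_DONE };

    if (in_coroutine) {
        flush_co_entry(&co);            /* fast path, no new coroutine */
    } else {
        coroutines_created++;           /* real code would create one here */
        flush_co_entry(&co);            /* ...and enter it */
        while (co.ret == NOT_DONE) {
            poll_events();              /* real code would wait for I/O */
        }
    }
    return co.ret;
}
```

The only difference between the two branches is the coroutine setup and the wait loop, which is the point being made about the two entry points sharing one body.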
Re: [Qemu-devel] website
http://wiki.qemu.org/Main_Page Someone may want to have a look over the website, the links in the Contribute section in the column on the left. All pages seem to primarily discuss the quality of Pandora jewelry and how beneficial it would be to purchase such. Jcmvbkbc and I have reverted the spam. Regards, chenwj -- Wei-Ren Chen (陳韋任) Computer Systems Lab, Institute of Information Science, Academia Sinica, Taiwan (R.O.C.) Tel:886-2-2788-3799 #1667
Re: [Qemu-devel] [PATCH 2/2] hda: do not mix output and input stream states, RHBZ #740493
Hi, a) My understanding of this patch is that we move from an array of 16 bools representing anything to one array where the 1st 16 represent if there are input and the 2nd 16's representing if there are output for that channel. Correct. So, what we should do if we migrate from one old version that only has 16 bools? My understanding is that copying directly is not gonna work? Yes. Putting output first increases the chance that it works as sound playback is used much more than sound recording. I think hda-audio doesn't need to save any running state. intel-hda knows which streams are running and it can call intel_hda_notify_codecs() for each stream in intel_hda_post_load(). The only problem is that intel_hda_post_load() might get called before hda-audio state is loaded. Now how to handle compatibility? We can have a running_compat[] array with the current (broken) semantics, so we can write out the state which older qemu versions expect to see. Also on load running_compat[] will be filled, and any state already set by intel_hda_post_load in running_real[] (or however we'll name this) will be kept intact. cheers, Gerd
Re: [Qemu-devel] [PATCH 2/4] block: unify flush implementations
Am 14.10.2011 13:30, schrieb Paolo Bonzini: On 10/14/2011 01:08 PM, Kevin Wolf wrote: Am 14.10.2011 10:41, schrieb Paolo Bonzini: Add coroutine support for flush and apply the same emulation that we already do for read/write. bdrv_aio_flush is simplified to always go through a coroutine. Signed-off-by: Paolo Bonzinipbonz...@redhat.com To make the implementation more consistent with read/write operations, wouldn't it make sense to provide a bdrv_co_flush() globally instead of using the synchronous version as the preferred public interface? I thought about it, but then it turned out that I would have bdrv_flush - create coroutine or just fast-path to bdrv_flush_co_entry - bdrv_flush_co_entry - driver and bdrv_co_flush - bdrv_flush_co_entry - driver In other words, the code would be exactly the same, save for an if (qemu_in_coroutine()). The reason is that, unlike read/write, neither flush nor discard take a qiov. What I was thinking of looks a bit different: bdrv_flush - create coroutine or just fast-path to bdrv_flush_co_entry - bdrv_flush_co_entry - bdrv_co_flush and bdrv_co_flush - driver And the reason for this is that bdrv_co_flush would be a function that does only little more than passing the function to the driver (just like most bdrv_* functions do), with no emulation going on at all. In general, I think that with Stefan's cleanup having specialized coroutine versions has in general a much smaller benefit. The code reading benefit of naming routines like bdrv_co_* is already lost, for example, since bdrv_read can yield when called for coroutine context. Instead of taking a void* and working on a RwCo structure that is really meant for emulation, bdrv_co_flush would take a BlockDriverState and improve readability this way. The more complicated and ugly code would be left separated and only used for emulation. I think that would make it easier to understand the common path without being distracted by emulation code. Let me show how this might go. 
Right now you have bdrv_read/write - bdrv_rw_co - create qiov - create coroutine or just fast-path to bdrv_rw_co_entry - bdrv_rw_co_entry - bdrv_co_do_readv/writev - driver bdrv_co_readv/writev - bdrv_co_do_readv/writev - driver But starting from here, you might just as well reorganize it like this: bdrv_read/writev - bdrv_rw_co - create qiov - bdrv_readv/writev bdrv_readv/writev - create coroutine or just fast-path to bdrv_rw_co_entry - bdrv_rw_co_entry - bdrv_co_do_readv/writev - driver and just drop bdrv_co_readv, since it would just be hard-coding the fast-path of bdrv_readv. I guess it's all a matter of taste. Stefan, what do you think? Since some amount of synchronous I/O would likely always be there, for example in qemu-img, I think this unification would make more sense than providing two separate entrypoints for bdrv_co_flush and bdrv_flush. Actually, I'm not so sure about qemu-img. I think we have thought of scenarios where converting it to a coroutine based version with a main loop would be helpful (can't remember the details, though). Kevin
Re: [Qemu-devel] [PATCH 1/1 V5 tuning] kernel/kvm: introduce KVM_SET_LINT1 and fix improper nmi emulation
On Fri, 2011-10-14 at 17:51 +0800, Lai Jiangshan wrote: Currently, NMI interrupt is blindly sent to all the vCPUs when NMI button event happens. This doesn't properly emulate real hardware on which NMI button event triggers LINT1. Because of this, NMI is sent to the processor even when LINT1 is masked in LVT. For example, this causes the problem that kdump initiated by NMI sometimes doesn't work on KVM, because kdump assumes NMI is masked on CPUs other than CPU0. With this patch, we introduce introduce KVM_SET_LINT1, and we can use KVM_SET_LINT1 to correctly emulate NMI button without change the old KVM_NMI behavior. Signed-off-by: Lai Jiangshan la...@cn.fujitsu.com Reported-by: Kenji Kaneshige kaneshige.ke...@jp.fujitsu.com --- It could use a documentation update as well. arch/x86/kvm/irq.h |1 + arch/x86/kvm/lapic.c |7 +++ arch/x86/kvm/x86.c |8 include/linux/kvm.h |3 +++ 4 files changed, 19 insertions(+), 0 deletions(-) diff --git a/arch/x86/kvm/irq.h b/arch/x86/kvm/irq.h index 53e2d08..0c96315 100644 --- a/arch/x86/kvm/irq.h +++ b/arch/x86/kvm/irq.h @@ -95,6 +95,7 @@ void kvm_pic_reset(struct kvm_kpic_state *s); void kvm_inject_pending_timer_irqs(struct kvm_vcpu *vcpu); void kvm_inject_apic_timer_irqs(struct kvm_vcpu *vcpu); void kvm_apic_nmi_wd_deliver(struct kvm_vcpu *vcpu); +void kvm_apic_lint1_deliver(struct kvm_vcpu *vcpu); void __kvm_migrate_apic_timer(struct kvm_vcpu *vcpu); void __kvm_migrate_pit_timer(struct kvm_vcpu *vcpu); void __kvm_migrate_timers(struct kvm_vcpu *vcpu); diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c index 57dcbd4..87fe36a 100644 --- a/arch/x86/kvm/lapic.c +++ b/arch/x86/kvm/lapic.c @@ -1039,6 +1039,13 @@ void kvm_apic_nmi_wd_deliver(struct kvm_vcpu *vcpu) kvm_apic_local_deliver(apic, APIC_LVT0); } +void kvm_apic_lint1_deliver(struct kvm_vcpu *vcpu) +{ + struct kvm_lapic *apic = vcpu-arch.apic; + + kvm_apic_local_deliver(apic, APIC_LVT1); +} + static struct kvm_timer_ops lapic_timer_ops = { .is_periodic = lapic_is_periodic, 
}; diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 84a28ea..fccd094 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -2077,6 +2077,7 @@ int kvm_dev_ioctl_check_extension(long ext) case KVM_CAP_XSAVE: case KVM_CAP_ASYNC_PF: case KVM_CAP_GET_TSC_KHZ: + case KVM_CAP_SET_LINT1: r = 1; break; case KVM_CAP_COALESCED_MMIO: @@ -3264,6 +3265,13 @@ long kvm_arch_vcpu_ioctl(struct file *filp, goto out; } + case KVM_SET_LINT1: { + r = -EINVAL; + if (!irqchip_in_kernel(vcpu-kvm)) + goto out; + r = 0; + kvm_apic_lint1_deliver(vcpu); We simply ignore the return value of kvm_apic_local_deliver() and assume it always works. why? -- Sasha.
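For readers following the LINT1 discussion, the condition "LINT1 is masked in LVT" can be sketched as a check on the LVT LINT1 register value. This is a standalone illustration, not code from the patch: the function name is made up, and the constants follow the usual local APIC layout (mask bit 16, delivery mode in bits 8..10, NMI encoded as 100b), which is assumed here rather than quoted from the kernel headers.

```c
#include <stdbool.h>
#include <stdint.h>

#define LVT_MASKED   (1u << 16)  /* LVT mask bit */
#define LVT_DM_SHIFT 8           /* delivery mode field starts at bit 8 */
#define LVT_DM_MASK  0x7u        /* delivery mode is 3 bits wide */
#define DM_NMI       0x4u        /* delivery mode 100b = NMI */

/* Returns true if an NMI button event on LINT1 should reach this CPU. */
static bool lint1_delivers_nmi(uint32_t lvt)
{
    if (lvt & LVT_MASKED) {
        return false;            /* masked: the event is dropped */
    }
    return ((lvt >> LVT_DM_SHIFT) & LVT_DM_MASK) == DM_NMI;
}
```

A guest that masks LINT1 on every CPU but CPU0, as kdump is said to assume, would then see the NMI only on CPU0.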
[Qemu-devel] [PATCH] hw/9pfs: Handle Security model parsing
Except local fs driver other fs drivers (handle) don't need security model. Update fsdev parameter parsing accordingly. Signed-off-by: M. Mohan Kumar mo...@in.ibm.com --- fsdev/qemu-fsdev.c | 26 +- qemu-options.hx| 12 vl.c |6 ++ 3 files changed, 27 insertions(+), 17 deletions(-) diff --git a/fsdev/qemu-fsdev.c b/fsdev/qemu-fsdev.c index ce920d6..5977bcc 100644 --- a/fsdev/qemu-fsdev.c +++ b/fsdev/qemu-fsdev.c @@ -58,8 +58,15 @@ int qemu_fsdev_add(QemuOpts *opts) return -1; } -if (!sec_model) { -fprintf(stderr, fsdev: No security_model specified.\n); +if (!strcmp(fsdriver, local) !sec_model) { +fprintf(stderr, security model not specified, +local fs needs security model\nvalid options are: +\tsecurity_model=[passthrough|mapped|none]\n); +return -1; +} + +if (strcmp(fsdriver, local) sec_model) { +fprintf(stderr, only local fs driver needs security model\n); return -1; } @@ -80,6 +87,10 @@ int qemu_fsdev_add(QemuOpts *opts) } } +if (strcmp(fsdriver, local)) { +goto done; +} + if (!strcmp(sec_model, passthrough)) { fsle-fse.export_flags |= V9FS_SM_PASSTHROUGH; } else if (!strcmp(sec_model, mapped)) { @@ -87,14 +98,11 @@ int qemu_fsdev_add(QemuOpts *opts) } else if (!strcmp(sec_model, none)) { fsle-fse.export_flags |= V9FS_SM_NONE; } else { -fprintf(stderr, Default to security_model=none. 
You may want - enable advanced security model using -security option:\n\t security_model=passthrough\n\t -security_model=mapped\n); - -fsle-fse.export_flags |= V9FS_SM_NONE; +fprintf(stderr, Invalid security model %s specified, valid options are +\n\t [passthrough|mapped|none]\n, sec_model); +return -1; } - +done: QTAILQ_INSERT_TAIL(fsdriver_entries, fsle, next); return 0; } diff --git a/qemu-options.hx b/qemu-options.hx index 518a1f1..f05be30 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -527,13 +527,13 @@ DEFHEADING() DEFHEADING(File system options:) DEF(fsdev, HAS_ARG, QEMU_OPTION_fsdev, --fsdev fsdriver,id=id,path=path,security_model=[mapped|passthrough|none]\n +-fsdev fsdriver,id=id,path=path,[security_model={mapped|passthrough|none}]\n [,writeout=immediate]\n, QEMU_ARCH_ALL) STEXI -@item -fsdev @var{fsdriver},id=@var{id},path=@var{path},security_model=@var{security_model}[,writeout=@var{writeout}] +@item -fsdev @var{fsdriver},id=@var{id},path=@var{path},[security_model=@var{security_model}][,writeout=@var{writeout}] @findex -fsdev Define a new file system device. Valid options are: @table @option @@ -555,7 +555,9 @@ attributes like uid, gid, mode bits and link target are stored as file attributes. Directories exported by this security model cannot interact with other unix tools. none security model is same as passthrough except the sever won't report failures if it fails to -set file attributes like ownership. +set file attributes like ownership. Security model is mandatory +only for local fsdriver. Other fsdrivers (like handle) don't take +security model as a parameter. @item writeout=@var{writeout} This is an optional argument. The only supported value is immediate. This means that host page cache will be used to read and write data but @@ -609,7 +611,9 @@ attributes like uid, gid, mode bits and link target are stored as file attributes. Directories exported by this security model cannot interact with other unix tools. 
none security model is same as passthrough except the sever won't report failures if it fails to -set file attributes like ownership. +set file attributes like ownership. Security model is mandatory only +for local fsdriver. Other fsdrivers (like handle) don't take security +model as a parameter. @item writeout=@var{writeout} This is an optional argument. The only supported value is immediate. This means that host page cache will be used to read and write data but diff --git a/vl.c b/vl.c index 3b8199f..d672268 100644 --- a/vl.c +++ b/vl.c @@ -2800,14 +2800,12 @@ int main(int argc, char **argv, char **envp) if (qemu_opt_get(opts, fsdriver) == NULL || qemu_opt_get(opts, mount_tag) == NULL || -qemu_opt_get(opts, path) == NULL || -qemu_opt_get(opts, security_model) == NULL) { +qemu_opt_get(opts, path) == NULL) { fprintf(stderr, Usage: -virtfs fsdriver,path=/share_path/, -security_model=[mapped|passthrough|none], +[security_model={mapped|passthrough|none}], mount_tag=tag.\n); exit(1); } - fsdev =
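The parsing rule this patch describes (security_model is mandatory for the local fsdriver, rejected for other fsdrivers, and limited to passthrough, mapped or none) can be summarised in a standalone sketch. The function name and the SM_* return codes are hypothetical; the real patch prints messages and returns -1 from qemu_fsdev_add() instead.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes, for illustration only. */
enum {
    SM_OK = 0,
    SM_MISSING = -1,     /* local driver without security_model */
    SM_UNEXPECTED = -2,  /* non-local driver with security_model */
    SM_INVALID = -3      /* unknown security_model value */
};

static int check_security_model(const char *fsdriver, const char *sec_model)
{
    int is_local = strcmp(fsdriver, "local") == 0;

    if (is_local && sec_model == NULL) {
        return SM_MISSING;
    }
    if (!is_local && sec_model != NULL) {
        return SM_UNEXPECTED;
    }
    if (!is_local) {
        return SM_OK;    /* e.g. handle: nothing further to validate */
    }
    if (strcmp(sec_model, "passthrough") == 0 ||
        strcmp(sec_model, "mapped") == 0 ||
        strcmp(sec_model, "none") == 0) {
        return SM_OK;
    }
    return SM_INVALID;
}
```

Note the behaviour change visible in the diff: an unrecognised value used to fall back to none with a warning and is now an error.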
Re: [Qemu-devel] [PATCH 1/1 V5 tuning] kernel/kvm: introduce KVM_SET_LINT1 and fix improper nmi emulation
On 2011-10-14 13:59, Sasha Levin wrote: On Fri, 2011-10-14 at 17:51 +0800, Lai Jiangshan wrote: Currently, NMI interrupt is blindly sent to all the vCPUs when NMI button event happens. This doesn't properly emulate real hardware on which NMI button event triggers LINT1. Because of this, NMI is sent to the processor even when LINT1 is masked in LVT. For example, this causes the problem that kdump initiated by NMI sometimes doesn't work on KVM, because kdump assumes NMI is masked on CPUs other than CPU0. With this patch, we introduce introduce KVM_SET_LINT1, and we can use KVM_SET_LINT1 to correctly emulate NMI button without change the old KVM_NMI behavior. Signed-off-by: Lai Jiangshan la...@cn.fujitsu.com Reported-by: Kenji Kaneshige kaneshige.ke...@jp.fujitsu.com --- It could use a documentation update as well. arch/x86/kvm/irq.h |1 + arch/x86/kvm/lapic.c |7 +++ arch/x86/kvm/x86.c |8 include/linux/kvm.h |3 +++ 4 files changed, 19 insertions(+), 0 deletions(-) diff --git a/arch/x86/kvm/irq.h b/arch/x86/kvm/irq.h index 53e2d08..0c96315 100644 --- a/arch/x86/kvm/irq.h +++ b/arch/x86/kvm/irq.h @@ -95,6 +95,7 @@ void kvm_pic_reset(struct kvm_kpic_state *s); void kvm_inject_pending_timer_irqs(struct kvm_vcpu *vcpu); void kvm_inject_apic_timer_irqs(struct kvm_vcpu *vcpu); void kvm_apic_nmi_wd_deliver(struct kvm_vcpu *vcpu); +void kvm_apic_lint1_deliver(struct kvm_vcpu *vcpu); void __kvm_migrate_apic_timer(struct kvm_vcpu *vcpu); void __kvm_migrate_pit_timer(struct kvm_vcpu *vcpu); void __kvm_migrate_timers(struct kvm_vcpu *vcpu); diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c index 57dcbd4..87fe36a 100644 --- a/arch/x86/kvm/lapic.c +++ b/arch/x86/kvm/lapic.c @@ -1039,6 +1039,13 @@ void kvm_apic_nmi_wd_deliver(struct kvm_vcpu *vcpu) kvm_apic_local_deliver(apic, APIC_LVT0); } +void kvm_apic_lint1_deliver(struct kvm_vcpu *vcpu) +{ +struct kvm_lapic *apic = vcpu-arch.apic; + +kvm_apic_local_deliver(apic, APIC_LVT1); +} + static struct kvm_timer_ops lapic_timer_ops 
= { .is_periodic = lapic_is_periodic, }; diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 84a28ea..fccd094 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -2077,6 +2077,7 @@ int kvm_dev_ioctl_check_extension(long ext) case KVM_CAP_XSAVE: case KVM_CAP_ASYNC_PF: case KVM_CAP_GET_TSC_KHZ: +case KVM_CAP_SET_LINT1: r = 1; break; case KVM_CAP_COALESCED_MMIO: @@ -3264,6 +3265,13 @@ long kvm_arch_vcpu_ioctl(struct file *filp, goto out; } +case KVM_SET_LINT1: { +r = -EINVAL; +if (!irqchip_in_kernel(vcpu-kvm)) +goto out; +r = 0; +kvm_apic_lint1_deliver(vcpu); We simply ignore the return value of kvm_apic_local_deliver() and assume it always works. why? Hmm, I suddenly realized that we switched from enhancing the KVM_NMI IOCTL to adding KVM_SET_LINT1 - what motivated this? ( Maybe we should let the kernel part settle first before iterating through user space changes. ) Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
Re: [Qemu-devel] [PATCH 0/6] trace: Add support for trace events grouping
On Thu, Oct 13, 2011 at 04:10:56PM +0800, Mark Wu wrote: On 10/13/2011 01:14 AM, Mark Wu wrote: This series add support for trace events grouping. The state of a given group of trace events can be queried or changed in bulk by the following monitor commands: * info trace-groups View available trace event groups and their state. State 1 means enabled, state 0 means disabled. * trace-group NAME on|off Enable/disable a given trace event group. A group of trace events can also be enabled in early running stage through adding its group name prefixed with group: to trace events list file which is passed to -trace events. Mark Wu (6): trace: Make tracetool generate a group list trace: Add HMP monitor commands for trace events group trace: Add trace events group implementation in the backend simple trace: Add trace events group implementation in the backend stderr trace: Enable -trace events argument to control initial state of groups trace: Update doc for trace events group docs/tracing.txt | 29 ++-- hmp-commands.hx | 14 monitor.c | 22 scripts/tracetool | 94 +++- trace-events | 88 + trace/control.c | 17 + trace/control.h |9 + trace/default.c | 15 trace/simple.c| 30 + trace/simple.h|7 trace/stderr.c| 32 ++ trace/stderr.h|7 12 files changed, 359 insertions(+), 5 deletions(-) Sorry, there're some coding style problems in the patches. I have fixed them and will send out later in order to see if there's any other problem coming up. :) Hi Mark, Please run scripts/checkpatch.pl on the patches if you haven't already. It will point out many of the coding style issues. I think we can get the same convenience by adding wildcard trace event support. For example, virtio-blk trace events could be enabled using: trace-event virtio_blk_* on This doesn't work for the memory allocation functions because their names do not share a common unique prefix, e.g. g_malloc and qemu_memalign. However, most related trace events do share a common unique prefix. 
Wildcards don't require special comments in trace-events so it eliminates the problem of people forgetting to add groups when they add trace events to QEMU. The other advantage is that wildcards can select fine-grained sets of trace events, like ecc_mem_writel_mer, ecc_mem_writel_mdr, etc. ecc_mem_writel_* selects all these related trace events (they are a subset of the hw/eccmemctl.c trace events). This is not possible with the groups patches since each trace-event can only be in one group; either we have a high-level hw/eccmemctl.c group or a lower-level ecc_mem_writel group but both is not possible. I suggest adding wildcard trace event matching instead of adding groups. The code changes for wildcards should be much smaller than for groups. What do you think? Stefan
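A minimal sketch of what the wildcard matching suggested above could look like, assuming only '*' needs to be supported. The function name is made up and this is not taken from the trace code; the event names in the usage note are just sample strings.

```c
#include <stdbool.h>

/*
 * Match a trace event name against a pattern in which '*' matches any
 * run of characters (including none). All other characters must match
 * exactly.
 */
static bool trace_name_matches(const char *pattern, const char *name)
{
    while (*pattern) {
        if (*pattern == '*') {
            while (*pattern == '*') {
                pattern++;              /* collapse consecutive stars */
            }
            if (*pattern == '\0') {
                return true;            /* trailing '*' matches the rest */
            }
            for (; *name; name++) {
                if (trace_name_matches(pattern, name)) {
                    return true;        /* remainder matched at this offset */
                }
            }
            return false;
        }
        if (*name == '\0' || *pattern != *name) {
            return false;
        }
        pattern++;
        name++;
    }
    return *name == '\0';
}
```

With this, a pattern such as virtio_blk_* would select every event sharing that prefix, and ecc_mem_writel_* would select the finer-grained subset mentioned above, without any grouping annotations in trace-events.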
Re: [Qemu-devel] [PATCH] configure: Detect when glibc implements makecontext() to always fail
On 13 October 2011 23:23, Andreas Färber andreas.faer...@web.de wrote: Am 13.10.2011 16:26, schrieb Andreas Färber: Am 12.10.2011 18:21, schrieb Peter Maydell: Improve the configure test for presence of ucontext functions by making linker warnings fatal; this allows us to detect when we are linked with a glibc which implements makecontext() to always return ENOSYS. --- Compiling on an Ubuntu Natty ARM host will hit this. Works on Ubuntu Maverick ARM host as well. Erm... This works great, also for accept4(), on Linux, but it's not portable. Apple ld(1) doesn't seem to have --fatal-warnings. I've also just discovered that it's no use on Oneiric, where the linker warning has gone away but the syscall still always returns ENOSYS. I think we should just always use the gthread implementation rather than preferring a non-portable-and-hard-to-detect set of functions (which increases the set of different configs we need to test with). If there's a performance problem with that we should get it fixed in gthread :-) -- PMM
Re: [Qemu-devel] [PATCH 2/4] block: unify flush implementations
On 10/14/2011 01:54 PM, Kevin Wolf wrote: Am 14.10.2011 13:30, schrieb Paolo Bonzini: On 10/14/2011 01:08 PM, Kevin Wolf wrote: Am 14.10.2011 10:41, schrieb Paolo Bonzini: Add coroutine support for flush and apply the same emulation that we already do for read/write. bdrv_aio_flush is simplified to always go through a coroutine. Signed-off-by: Paolo Bonzini pbonz...@redhat.com To make the implementation more consistent with read/write operations, wouldn't it make sense to provide a bdrv_co_flush() globally instead of using the synchronous version as the preferred public interface? What I was thinking of looks a bit different: bdrv_flush -> create coroutine or just fast-path to bdrv_flush_co_entry -> bdrv_flush_co_entry -> bdrv_co_flush and bdrv_co_flush -> driver And the reason for this is that bdrv_co_flush would be a function that does only little more than passing the function to the driver (just like most bdrv_* functions do), with no emulation going on at all. It would still host the checks on BDRV_O_NO_FLUSH and bs->drv->*_flush. It would be the same as bdrv_flush_co_entry is now, minus the marshalling in/out of the RwCo. Instead of taking a void* and working on a RwCo structure that is really meant for emulation, bdrv_co_flush would take a BlockDriverState and improve readability this way. I see. Yeah, that's doable, but I'd still need two coroutines (one for bdrv_flush, one for bdrv_aio_flush) and the patch would be bigger overall... The more complicated and ugly code would be left separated and only used for emulation. I think that would make it easier to understand the common path without being distracted by emulation code. ... and on the other hand the length of the call chain would increase. It easily gets confusing, it already is for me in the read/write case. Would bdrv_co_flush be static or not? If not, you also get an additional entry point of dubious additional value, i.e. more complexity. Actually, I'm not so sure about qemu-img. 
I think we have thought of scenarios where converting it to a coroutine based version with a main loop would be helpful (can't remember the details, though). qemu-img convert might benefit from multiple in-flight requests if one of the endpoints is remote or perhaps even sparse, I guess. Paolo
Re: [Qemu-devel] [PATCH] configure: Detect when glibc implements makecontext() to always fail
On 10/14/2011 02:30 PM, Peter Maydell wrote: I've also just discovered that it's no use on Oneiric, where the linker warning has gone away but the syscall still always returns ENOSYS. I think we should just always use the gthread implementation rather than preferring a non-portable-and-hard-to-detect set of functions (which increases the set of different configs we need to test with). If there's a performance problem with that we should get it fixed in gthread:-) A user-space longjmp will always be slower than a mutex+condvar+context switch. We're talking _orders of magnitude_ slower. At this point it's better to write assembly, since we already support only a dozen TCG targets. I played with an alternative implementation using a 2-barrier instead of mutex+condvar, but it didn't give any speedup and was still much slower than gthread. Paolo
Re: [Qemu-devel] [PATCH] configure: Detect when glibc implements makecontext() to always fail
On 10/14/2011 02:47 PM, Paolo Bonzini wrote: A user-space longjmp will always be slower than a mutex+condvar+context switch. Gah, I obviously meant faster. :) Paolo
[Qemu-devel] [Bug 874038] Re: ARM thumb2 does not propogate carry flag properly
The existing code looks OK to me -- there's no need to call gen_set_CF_bit31() early because the inputs t0 and t1 to gen_thumb2_data_op() should always be distinct TCG values, and so gen_thumb2_data_op() will never trash t1. (There was a bug in this area involving ORN, but that was fixed in rev 29501f1, and I can see from your patch that you have that fix.) Can you clarify which exact instruction, input data and output data case this patch is intended to fix, please? -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/874038 Title: ARM thumb2 does not propogate carry flag properly Status in QEMU: New Bug description: information on carry flag is lost if gen_set_CF_bit31(t1) is called after logic operation. To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/874038/+subscriptions
Re: [Qemu-devel] [PATCH 0/2] spice migration interface (RHBZ 737921)
Hi, Yonit Halperin (2): spice: turn client_migrate_info to async spice: support the new migration interface (spice 0.8.3) Added to spice patch queue. thanks, Gerd
Re: [Qemu-devel] [PATCH] qxl: create slots on post_load in any state (fix RHBZ 740547)
On 09/22/11 14:33, Alon Levy wrote: If we migrate when the device is not in a native state the guest still believes the slots are created, and will cause operations that reference the slots, causing a panic: virtual address out of range on the first of them. Easy to see by migrating in vga mode (with a driver loaded, for instance windows cmd window in full screen mode) and then exiting vga mode back to native mode will cause said panic. Fixed by doing the slot recreation unconditionally at post_load Added to spice patch queue. thanks, Gerd
Re: [Qemu-devel] [PATCH] configure: Detect when glibc implements makecontext() to always fail
On 14 October 2011 13:55, Paolo Bonzini pbonz...@redhat.com wrote: On 10/14/2011 02:47 PM, Paolo Bonzini wrote: A user-space longjmp will always be slower than a mutex+condvar+context switch. Gah, I obviously meant faster. :) Ah, I hadn't actually looked at the coroutine-gthread.c code, and had assumed it was implementing coroutines via a gthread coroutine abstraction rather than via gthread real actual threads. -- PMM
Re: [Qemu-devel] [PATCH 2/4] block: unify flush implementations
On Fri, Oct 14, 2011 at 01:54:42PM +0200, Kevin Wolf wrote: Am 14.10.2011 13:30, schrieb Paolo Bonzini: On 10/14/2011 01:08 PM, Kevin Wolf wrote: Am 14.10.2011 10:41, schrieb Paolo Bonzini: Let me show how this might go. Right now you have bdrv_read/write - bdrv_rw_co - create qiov - create coroutine or just fast-path to bdrv_rw_co_entry - bdrv_rw_co_entry - bdrv_co_do_readv/writev - driver bdrv_co_readv/writev - bdrv_co_do_readv/writev - driver But starting from here, you might just as well reorganize it like this: bdrv_read/writev - bdrv_rw_co - create qiov - bdrv_readv/writev bdrv_readv/writev - create coroutine or just fast-path to bdrv_rw_co_entry - bdrv_rw_co_entry - bdrv_co_do_readv/writev - driver and just drop bdrv_co_readv, since it would just be hard-coding the fast-path of bdrv_readv. I guess it's all a matter of taste. Stefan, what do you think? Since some amount of synchronous I/O would likely always be there, for example in qemu-img, I think this unification would make more sense than providing two separate entrypoints for bdrv_co_flush and bdrv_flush. Actually, I'm not so sure about qemu-img. I think we have thought of scenarios where converting it to a coroutine based version with a main loop would be helpful (can't remember the details, though). I'd like to completely remove synchronous bdrv_*(), including for qemu-img. It's just too tempting to call these functions in contexts where it is not okay to do so. The bdrv_co_*() functions are all tagged as coroutine_fn and make it clear that they can yield. We already have an event loop in qemu-img except it's the nested event loop in synchronous bdrv_*() emulation functions. The nested event loop is a mini event loop and can't really do things like timers. It would be nicer to remove it in favor of a single top-level event loop with the qemu-img code running in a coroutine. So I'm in favor of keeping bdrv_co_*() explicit for now and refactoring both hw/ and qemu-tool users of synchronous functions. 
Stefan
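
For readers unfamiliar with the "nested event loop" mentioned above, here is a minimal sketch of the idea, not QEMU's actual code. Every name in it (toy_aio_flush, toy_loop_run_once, toy_sync_flush and the Toy* types) is hypothetical. It shows a synchronous wrapper that submits an asynchronous request and then spins a small loop until its own completion arrives, which means any other queued completion is dispatched from inside the "synchronous" call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy event loop: a fixed-size FIFO of pending completion callbacks.
 * All names are hypothetical; none of this is QEMU's real API. */
#define TOY_MAX_PENDING 8

typedef void ToyCallback(void *opaque, int ret);

typedef struct {
    ToyCallback *cb;
    void *opaque;
    int ret;
} ToyPending;

static ToyPending toy_queue[TOY_MAX_PENDING];
static size_t toy_pending;

/* Asynchronous "flush": only queues the completion, does not run it. */
void toy_aio_flush(ToyCallback *cb, void *opaque)
{
    assert(toy_pending < TOY_MAX_PENDING);
    toy_queue[toy_pending++] = (ToyPending){ cb, opaque, 0 };
}

/* One loop iteration: dispatch the oldest pending completion, if any. */
bool toy_loop_run_once(void)
{
    if (toy_pending == 0) {
        return false;
    }
    ToyPending p = toy_queue[0];
    for (size_t i = 1; i < toy_pending; i++) {
        toy_queue[i - 1] = toy_queue[i];
    }
    toy_pending--;
    p.cb(p.opaque, p.ret);
    return true;
}

typedef struct {
    bool done;
    int ret;
} ToySyncState;

/* Completion callback that records the result for a waiting caller. */
void toy_sync_done(void *opaque, int ret)
{
    ToySyncState *st = opaque;
    st->ret = ret;
    st->done = true;
}

/* Synchronous emulation: submit, then run a nested loop until our own
 * completion has been dispatched. Other queued work runs in here too. */
int toy_sync_flush(void)
{
    ToySyncState st = { false, -1 };
    toy_aio_flush(toy_sync_done, &st);
    while (!st.done) {
        bool progressed = toy_loop_run_once();
        assert(progressed); /* our own request is always still queued */
        (void)progressed;
    }
    return st.ret;
}
```

The point of the sketch is only the shape of the problem: the caller of toy_sync_flush() cannot tell which other callbacks ran before it returned, which is one reason a single top-level loop is easier to reason about.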
Re: [Qemu-devel] [PATCH] ui/spice-core: fix segfault in monitor
On 10/04/11 13:25, Alon Levy wrote: Fix segfault if a qxl device is present but no spice command line argument is given. RHBZ 743251. Added to spice patch queue. thanks, Gerd
Re: [Qemu-devel] [PATCH v2] runstate: add more valid transitions
On Fri, 14 Oct 2011 08:56:57 +0200 Paolo Bonzini pbonz...@redhat.com wrote: On 10/13/2011 10:26 PM, Luiz Capitulino wrote: I'm going to take my word back on this one, I've found the real cause of the problem. Will post the patch right now. Are you keeping the vl.c hunks though? I'm not, because I'm assuming that allowing a transition from 'paused' to 'postmigrate' plus this fix: http://lists.gnu.org/archive/html/qemu-devel/2011-10/msg01430.html Will solve the problems we have today. If you don't think so, would you mind to send a patch against the qmp queue? git://repo.or.cz/qemu/qmp-unstable.git queue/qmp Thanks!
[Qemu-devel] [PATCH] ARM GIC and CPU state saving/loading fix
Fixes two trivial indices errors.

Signed-off-by: Dmitry Koshelev karaghio...@gmail.com
---
 hw/arm_gic.c         |   12 ++--
 target-arm/machine.c |    4 ++--
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/hw/arm_gic.c b/hw/arm_gic.c
index 8286a28..ba05131 100644
--- a/hw/arm_gic.c
+++ b/hw/arm_gic.c
@@ -662,9 +662,6 @@ static void gic_save(QEMUFile *f, void *opaque)
     qemu_put_be32(f, s->enabled);
     for (i = 0; i < NUM_CPU(s); i++) {
         qemu_put_be32(f, s->cpu_enabled[i]);
-#ifndef NVIC
-        qemu_put_be32(f, s->irq_target[i]);
-#endif
         for (j = 0; j < 32; j++)
             qemu_put_be32(f, s->priority1[j][i]);
         for (j = 0; j < GIC_NIRQ; j++)
@@ -678,6 +675,9 @@ static void gic_save(QEMUFile *f, void *opaque)
         qemu_put_be32(f, s->priority2[i]);
     }
     for (i = 0; i < GIC_NIRQ; i++) {
+#ifndef NVIC
+        qemu_put_be32(f, s->irq_target[i]);
+#endif
         qemu_put_byte(f, s->irq_state[i].enabled);
         qemu_put_byte(f, s->irq_state[i].pending);
         qemu_put_byte(f, s->irq_state[i].active);
@@ -699,9 +699,6 @@ static int gic_load(QEMUFile *f, void *opaque, int version_id)
     s->enabled = qemu_get_be32(f);
     for (i = 0; i < NUM_CPU(s); i++) {
         s->cpu_enabled[i] = qemu_get_be32(f);
-#ifndef NVIC
-        s->irq_target[i] = qemu_get_be32(f);
-#endif
         for (j = 0; j < 32; j++)
             s->priority1[j][i] = qemu_get_be32(f);
         for (j = 0; j < GIC_NIRQ; j++)
@@ -715,6 +712,9 @@ static int gic_load(QEMUFile *f, void *opaque, int version_id)
         s->priority2[i] = qemu_get_be32(f);
     }
     for (i = 0; i < GIC_NIRQ; i++) {
+#ifndef NVIC
+        s->irq_target[i] = qemu_get_be32(f);
+#endif
         s->irq_state[i].enabled = qemu_get_byte(f);
         s->irq_state[i].pending = qemu_get_byte(f);
         s->irq_state[i].active = qemu_get_byte(f);
diff --git a/target-arm/machine.c b/target-arm/machine.c
index 3925d3a..1b1b3ec 100644
--- a/target-arm/machine.c
+++ b/target-arm/machine.c
@@ -53,7 +53,7 @@ void cpu_save(QEMUFile *f, void *opaque)
     qemu_put_be32(f, env->features);

     if (arm_feature(env, ARM_FEATURE_VFP)) {
-        for (i = 0; i < 16; i++) {
+        for (i = 16; i < 32; i++) {
             CPU_DoubleU u;
             u.d = env->vfp.regs[i];
             qemu_put_be32(f, u.l.upper);
@@ -175,7 +175,7 @@ int cpu_load(QEMUFile *f, void *opaque, int version_id)
         env->vfp.vec_stride = qemu_get_be32(f);

         if (arm_feature(env, ARM_FEATURE_VFP3)) {
-            for (i = 0; i < 16; i++) {
+            for (i = 16; i < 32; i++) {
                 CPU_DoubleU u;
                 u.l.upper = qemu_get_be32(f);
                 u.l.lower = qemu_get_be32(f);
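
The GIC half of the patch moves irq_target[] out of the per-CPU loop and into the per-IRQ loop, since that array is indexed by IRQ number. A minimal sketch of why the loop bound has to match the array's index space follows. It is an illustration only: the sizes, the ToyGicState layout and the ToyStream buffer (a stand-in for QEMUFile) are all hypothetical and much smaller than the real device state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sizes for illustration; not the values in hw/arm_gic.c. */
#define TOY_NCPU 2
#define TOY_NIRQ 8

typedef struct {
    uint32_t cpu_enabled[TOY_NCPU]; /* indexed by CPU */
    uint32_t irq_target[TOY_NIRQ];  /* indexed by IRQ, not by CPU */
} ToyGicState;

/* Toy stand-in for QEMUFile: a flat array of 32-bit words. */
typedef struct {
    uint32_t words[TOY_NCPU + TOY_NIRQ];
    size_t pos;
} ToyStream;

static void toy_put(ToyStream *f, uint32_t v)
{
    assert(f->pos < sizeof(f->words) / sizeof(f->words[0]));
    f->words[f->pos++] = v;
}

static uint32_t toy_get(ToyStream *f)
{
    assert(f->pos < sizeof(f->words) / sizeof(f->words[0]));
    return f->words[f->pos++];
}

/* Each array is written inside the loop whose bound matches its index.
 * Writing irq_target[] in the per-CPU loop would save only the first
 * TOY_NCPU entries and silently drop the rest. */
void toy_gic_save(ToyStream *f, const ToyGicState *s)
{
    for (int i = 0; i < TOY_NCPU; i++) {
        toy_put(f, s->cpu_enabled[i]);
    }
    for (int i = 0; i < TOY_NIRQ; i++) {
        toy_put(f, s->irq_target[i]);
    }
}

/* Load must mirror save exactly: same order, same loop bounds. */
void toy_gic_load(ToyStream *f, ToyGicState *s)
{
    for (int i = 0; i < TOY_NCPU; i++) {
        s->cpu_enabled[i] = toy_get(f);
    }
    for (int i = 0; i < TOY_NIRQ; i++) {
        s->irq_target[i] = toy_get(f);
    }
}
```

Note that moving a field between loops changes the order of words in the stream, so in this toy a stream written by the old layout would not load correctly with the new one.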
Re: [Qemu-devel] [PATCH v2] runstate: add more valid transitions
On 10/14/2011 03:23 PM, Luiz Capitulino wrote:

I'm not, because I'm assuming that allowing a transition from 'paused' to 'postmigrate' plus this fix: http://lists.gnu.org/archive/html/qemu-devel/2011-10/msg01430.html

I think POST_MIGRATE -> FINISH_MIGRATE should be still allowed in case you migrate twice. Management would probably disallow that, but from the monitor you can do funny things: stop the machine, create two images based on the running machine's image, and migrate to both images.

Paolo
Re: [Qemu-devel] [v7 Patch 5/5]Qemu: New struct 'BDRVReopenState' for image files reopen
On 10/12/2011 08:25 PM, Kevin Wolf wrote:
Am 11.10.2011 05:11, schrieb Supriya Kannery:

Struct BDRVReopenState introduced for handling reopen state of images. This can be extended by each of the block drivers to reopen respective image files. Implementation for raw-posix is done here.

Signed-off-by: Supriya Kannery supri...@linux.vnet.ibm.com

Maybe it would make sense to split this into two patches, one for the block.c infrastructure and another one for adding the callbacks in drivers.

ok..will split in v8.

+
+    /* If only O_DIRECT to be toggled, use fcntl */
+    if (!((bs->open_flags & ~BDRV_O_NOCACHE) ^
+            (flags & ~BDRV_O_NOCACHE))) {
+        raw_rs->reopen_fd = dup(s->fd);
+        if (raw_rs->reopen_fd <= 0) {
+            return -1;

This leaks raw_rs.

will handle

+        }
+    }
+
+    /* TBD: Handle O_DSYNC and other flags */
+    *rs = raw_rs;
+    return 0;
+}
+
+static int raw_reopen_commit(BDRVReopenState *rs)

bdrv_reopen_commit must never fail. Any error that can happen must already be handled in bdrv_reopen_prepare. The commit function should really only do s->fd = rs->reopen_fd (besides cleanup), everything else should already be prepared.

will move call to fcntl to bdrv_reopen_prepare.

+{
+    BlockDriverState *bs = rs->bs;
+    BDRVRawState *s = bs->opaque;
+
+    if (!rs->reopen_fd) {
+        return -1;
+    }
+
+    int ret = fcntl_setfl(rs->reopen_fd, rs->reopen_flags);

reopen_flags is BDRV_O_*, not O_*, so it needs to be translated.

ok

+    /* Use driver specific reopen() if available */
+    if (drv->bdrv_reopen_prepare) {
+        ret = drv->bdrv_reopen_prepare(bs, &rs, bdrv_flags);
         if (ret < 0) {
-            /* Reopen failed with orig and modified flags */
-            abort();
+            goto fail;
         }
-    }
+        if (drv->bdrv_reopen_commit) {
+            ret = drv->bdrv_reopen_commit(rs);
+            if (ret < 0) {
+                goto fail;
+            }
+            return 0;
+        }

Pull the return 0; out one level. It would be really strange if we turned a successful prepare into reopen_abort just because the driver doesn't need a commit function.

(The other consistent way would be to require that if a driver implements one reopen function, it has to implement all three of them. I'm fine either way.)

Will give flexibility to drivers, not mandating all the three functions.

+        ret = bdrv_open(bs, bs->filename, open_flags, drv);
+        if (ret < 0) {
+            /*
+             * Reopen with orig and modified flags failed
+             */
+            abort();

Make this bs->drv = NULL, so that trying to access to image will fail, but at least not the whole VM crashes.

ok

 static int raw_read(BlockDriverState *bs, int64_t sector_num,
                     uint8_t *buf, int nb_sectors)
 {
@@ -128,6 +137,8 @@ static BlockDriver bdrv_raw = {
     .instance_size      = 1,

     .bdrv_open          = raw_open,
+    .bdrv_reopen_prepare
+                        = raw_reopen,
     .bdrv_close         = raw_close,
     .bdrv_read          = raw_read,
     .bdrv_write         = raw_write,

I think raw must pass through all three functions. Otherwise it could happen that we need to abort, but the image has already been reopened.

ok..got it..will have three separate functions to avoid unnecessary dependencies.

Got a question.. In raw-posix, the three functions are implemented for file_reopen for now. Should this be extended to hdev, cdrom and floppy?

Kevin
Re: [Qemu-devel] [v7 Patch 5/5]Qemu: New struct 'BDRVReopenState' for image files reopen
On 10/14/2011 04:43 PM, Stefan Hajnoczi wrote:
On Tue, Oct 11, 2011 at 08:41:59AM +0530, Supriya Kannery wrote:

Index: qemu/block.c
===
--- qemu.orig/block.c
+++ qemu/block.c
@@ -706,6 +706,7 @@ int bdrv_reopen(BlockDriverState *bs, in
 {
     BlockDriver *drv = bs->drv;
     int ret = 0, open_flags;
+    BDRVReopenState *rs;

BDRVReopenState *rs = NULL;

If the abort path is taken we need to make sure rs has a defined value. Note that the abort path currently doesn't handle rs == NULL and will segfault in raw_reopen_abort().

sure..will check on this.

thanks,
Supriya
Re: [Qemu-devel] [PATCH 2/4] block: unify flush implementations
On 10/14/2011 03:20 PM, Stefan Hajnoczi wrote: It's just too tempting to call these functions in contexts where it is not okay to do so. The bdrv_co_*() functions are all tagged as coroutine_fn and make it clear that they can yield. Yes, I agree. We already have an event loop in qemu-img except it's the nested event loop in synchronous bdrv_*() emulation functions. The nested event loop is a mini event loop and can't really do things like timers. It would be nicer to remove it in favor of a single top-level event loop with the qemu-img code running in a coroutine. Note that the nested event loop cannot go away because of qemu_aio_flush, though. :( Paolo
Re: [Qemu-devel] [v7 Patch 5/5]Qemu: New struct 'BDRVReopenState' for image files reopen
Am 14.10.2011 15:42, schrieb Supriya Kannery:

+    /* Use driver specific reopen() if available */
+    if (drv->bdrv_reopen_prepare) {
+        ret = drv->bdrv_reopen_prepare(bs, &rs, bdrv_flags);
         if (ret < 0) {
-            /* Reopen failed with orig and modified flags */
-            abort();
+            goto fail;
         }
-    }
+        if (drv->bdrv_reopen_commit) {
+            ret = drv->bdrv_reopen_commit(rs);
+            if (ret < 0) {
+                goto fail;
+            }
+            return 0;
+        }

Pull the return 0; out one level. It would be really strange if we turned a successful prepare into reopen_abort just because the driver doesn't need a commit function.

(The other consistent way would be to require that if a driver implements one reopen function, it has to implement all three of them. I'm fine either way.)

Will give flexibility to drivers, not mandating all the three functions.

Do we have a use case where it is actually possible to implement less functions without introducing bugs? If yes, let's keep it as it is.

Got a question.. In raw-posix, the three functions are implemented for file_reopen for now. Should this be extended to hdev, cdrom and floppy?

Yes, that would be good. And I think the same implementation can be used for all of them.

Kevin
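
The review comments in this thread (everything that can fail goes into prepare, commit must never fail, prepare must not leak its state on error, abort must cope with a NULL state) can be illustrated with a small POSIX sketch. This is not the patch under review: the Toy* types and toy_reopen_* functions are hypothetical, and the sketch deliberately obtains the new descriptor with open() on the path rather than dup() plus fcntl(), because descriptors created by dup() share their file status flags with the original.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical stand-ins for the driver state and the reopen state. */
typedef struct {
    const char *path; /* file backing this toy "image" */
    int fd;           /* descriptor currently in use */
} ToyFile;

typedef struct {
    int reopen_fd;    /* fully prepared replacement descriptor */
} ToyReopenState;

/* Prepare: do everything that can fail. On error nothing is leaked and
 * *out is left untouched, so the caller's NULL stays NULL. */
int toy_reopen_prepare(const ToyFile *file, int new_flags,
                       ToyReopenState **out)
{
    ToyReopenState *rs = calloc(1, sizeof(*rs));
    if (rs == NULL) {
        return -1;
    }
    rs->reopen_fd = open(file->path, new_flags);
    if (rs->reopen_fd < 0) {
        free(rs); /* do not leak the state on the error path */
        return -1;
    }
    *out = rs;
    return 0;
}

/* Commit: cannot fail, it only swaps in what prepare already set up. */
void toy_reopen_commit(ToyFile *file, ToyReopenState *rs)
{
    assert(rs != NULL && rs->reopen_fd >= 0);
    close(file->fd);
    file->fd = rs->reopen_fd;
    free(rs);
}

/* Abort: undo prepare. Safe to call with NULL if prepare never ran
 * or failed before producing a state. */
void toy_reopen_abort(ToyReopenState *rs)
{
    if (rs == NULL) {
        return;
    }
    close(rs->reopen_fd);
    free(rs);
}
```

A caller would initialise its ToyReopenState pointer to NULL, call toy_reopen_prepare(), and then call exactly one of toy_reopen_commit() or toy_reopen_abort(); the old descriptor stays valid until commit.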
Re: [Qemu-devel] [PATCH] ARM GIC and CPU state saving/loading fix
On Fri, Oct 14, 2011 at 05:25:29PM +0400, Dmitry Koshelev wrote: Fixes two trivial indices errors. Signed-off-by: Dmitry Koshelev karaghio...@gmail.com --- hw/arm_gic.c | 12 ++-- target-arm/machine.c |4 ++-- 2 files changed, 8 insertions(+), 8 deletions(-) Not obvious to me what the implications are. CCed Peter Maydell so it can go through his ARM tree. Stefan
Re: [Qemu-devel] [PATCH 2/4] block: unify flush implementations
Am 14.10.2011 14:42, schrieb Paolo Bonzini:
On 10/14/2011 01:54 PM, Kevin Wolf wrote:
Am 14.10.2011 13:30, schrieb Paolo Bonzini:
On 10/14/2011 01:08 PM, Kevin Wolf wrote:
Am 14.10.2011 10:41, schrieb Paolo Bonzini:

Add coroutine support for flush and apply the same emulation that we already do for read/write. bdrv_aio_flush is simplified to always go through a coroutine.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com

To make the implementation more consistent with read/write operations, wouldn't it make sense to provide a bdrv_co_flush() globally instead of using the synchronous version as the preferred public interface?

What I was thinking of looks a bit different:

    bdrv_flush
    -> create coroutine or just fast-path to bdrv_flush_co_entry
    -> bdrv_flush_co_entry
    -> bdrv_co_flush

and

    bdrv_co_flush
    -> driver

And the reason for this is that bdrv_co_flush would be a function that does only little more than passing the function to the driver (just like most bdrv_* functions do), with no emulation going on at all.

It would still host the checks on BDRV_O_NO_FLUSH and bs->drv->*_flush. It would be the same as bdrv_flush_co_entry is now, minus the marshalling in/out of the RwCo.

Right. By the way, I like how you handle all three backends in the same function. I think this is a lot more readable than the solution used by read/write (changing the function pointers on driver registration).

Instead of taking a void* and working on a RwCo structure that is really meant for emulation, bdrv_co_flush would take a BlockDriverState and improve readability this way.

I see. Yeah, that's doable, but I'd still need two coroutines (one for bdrv_flush, one for bdrv_aio_flush) and the patch would be bigger overall...

You already have two of them (bdrv_co_flush for AIO and bdrv_flush_co_entry for synchronous), so I don't think it makes a difference there.

The more complicated and ugly code would be left separated and only used for emulation. I think that would make it easier to understand the common path without being distracted by emulation code.

... and on the other hand the length of the call chain would increase. It easily gets confusing, it already is for me in the read/write case.

Well, depends on what you're looking at. The call chain length would increase for AIO and synchronous bdrv_flush, but it would become shorter for bdrv_co_flush. If we want to declare coroutines as the preferred interface, I think such a change makes sense.

Would bdrv_co_flush be static or not? If not, you also get an additional entry point of dubious additional value, i.e. more complexity.

I think I would make it public. The one that has to go eventually is the synchronous bdrv_flush(). Which is another reason why I wouldn't design everything around it.

Actually, I'm not so sure about qemu-img. I think we have thought of scenarios where converting it to a coroutine based version with a main loop would be helpful (can't remember the details, though).

qemu-img convert might benefit from multiple in-flight requests if one of the endpoints is remote or perhaps even sparse, I guess.

Quite possible, yes.

Kevin
Re: [Qemu-devel] [PATCH 2/4] block: unify flush implementations
On 10/14/2011 04:02 PM, Kevin Wolf wrote:

It would still host the checks on BDRV_O_NO_FLUSH and bs->drv->*_flush. It would be the same as bdrv_flush_co_entry is now, minus the marshalling in/out of the RwCo.

Right. By the way, I like how you handle all three backends in the same function. I think this is a lot more readable than the solution used by read/write (changing the function pointers on driver registration).

Yeah, and it makes sense to handle all of them in the bdrv_co_* version. Will resubmit.

Paolo
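
As a rough illustration of "handling all three backends in the same function", here is a sketch of a single flush entry point that checks a no-flush option and then picks whichever callback the driver provides. It is an assumption-laden toy, not the patch: the ToyDriver and ToyBlockState types, the callback names and the order of preference (coroutine, then AIO, then synchronous) are all hypothetical, and where a real coroutine would yield until the AIO callback fires, the toy AIO backend completes immediately.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct ToyBlockState ToyBlockState;
typedef void ToyFlushDone(void *opaque, int ret);

/* A driver may implement any one of the three flavours (hypothetical). */
typedef struct {
    int (*co_flush)(ToyBlockState *bs);
    void (*aio_flush)(ToyBlockState *bs, ToyFlushDone *cb, void *opaque);
    int (*sync_flush)(ToyBlockState *bs);
} ToyDriver;

struct ToyBlockState {
    const ToyDriver *drv;
    bool no_flush;   /* stands in for a BDRV_O_NO_FLUSH style option */
    int flush_calls; /* bookkeeping for the example backends below */
};

static void toy_flush_done(void *opaque, int ret)
{
    *(int *)opaque = ret;
}

/* Single entry point: option check first, then one dispatch chain. */
int toy_co_flush(ToyBlockState *bs)
{
    assert(bs != NULL);
    if (bs->no_flush || bs->drv == NULL) {
        return 0;
    }
    if (bs->drv->co_flush) {
        return bs->drv->co_flush(bs);
    }
    if (bs->drv->aio_flush) {
        int ret = -1;
        /* A real coroutine would yield here until the callback runs;
         * the toy backend invokes the callback before returning. */
        bs->drv->aio_flush(bs, toy_flush_done, &ret);
        return ret;
    }
    if (bs->drv->sync_flush) {
        return bs->drv->sync_flush(bs);
    }
    return 0; /* driver has nothing to flush */
}

/* Example backends used to exercise the dispatch. */
int toy_backend_sync_flush(ToyBlockState *bs)
{
    bs->flush_calls++;
    return 0;
}

void toy_backend_aio_flush(ToyBlockState *bs, ToyFlushDone *cb, void *opaque)
{
    bs->flush_calls++;
    cb(opaque, 0);
}
```

Compared with swapping function pointers at driver registration, the whole decision is visible in one place, which is the readability point made in the thread.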
[Qemu-devel] [PATCH 6/7] sheepdog: correct spelling
From: Dong Xu Wang wdon...@linux.vnet.ibm.com

Reviewed-by: Andreas Färber afaer...@suse.de
Signed-off-by: Dong Xu Wang wdon...@linux.vnet.ibm.com
Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
---
 block/sheepdog.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/block/sheepdog.c b/block/sheepdog.c
index c1f6e07..ae857e2 100644
--- a/block/sheepdog.c
+++ b/block/sheepdog.c
@@ -66,7 +66,7 @@
  * 20 - 31 (12 bits): reserved data object space
  * 32 - 55 (24 bits): vdi object space
  * 56 - 59 ( 4 bits): reserved vdi object space
- * 60 - 63 ( 4 bits): object type indentifier space
+ * 60 - 63 ( 4 bits): object type identifier space
  */

 #define VDI_SPACE_SHIFT 32
-- 
1.7.6.3
[Qemu-devel] [PATCH 3/7] arm_pic: Fix typo
From: Andreas Färber andreas.faer...@web.de

interrput -> interrupt

Cc: Paul Brook p...@codesourcery.com
Signed-off-by: Andreas Färber andreas.faer...@web.de
Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
---
 hw/arm_pic.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/hw/arm_pic.c b/hw/arm_pic.c
index 985148a..41f8d3e 100644
--- a/hw/arm_pic.c
+++ b/hw/arm_pic.c
@@ -39,7 +39,7 @@ static void arm_pic_cpu_handler(void *opaque, int irq, int level)
             cpu_reset_interrupt(env, CPU_INTERRUPT_FIQ);
         break;
     default:
-        hw_error("arm_pic_cpu_handler: Bad interrput line %d\n", irq);
+        hw_error("arm_pic_cpu_handler: Bad interrupt line %d\n", irq);
     }
 }

-- 
1.7.6.3
[Qemu-devel] [PULL 0/7] Trivial patches for October 6 to 14 2011
The following changes since commit ebffe2afceb1a17b5d134b5debf553955fe5ea1a:

  Merge remote-tracking branch 'qmp/queue/qmp' into staging (2011-10-10 08:21:46 -0500)

are available in the git repository at:

  ssh://repo.or.cz/srv/git/qemu/stefanha.git trivial-patches

Andreas Färber (1):
      arm_pic: Fix typo

Dong Xu Wang (1):
      sheepdog: correct spelling

Paolo Bonzini (1):
      remove hpet.h

Stefan Hajnoczi (1):
      qemu-options: avoid #if in spicevmc texi help

Stefan Weil (3):
      qemu-char: Fix use of free() instead of g_free()
      tcg: Fix spelling in comment (varables -> variables)
      block/qcow: Fix use of free() instead of g_free()

 block/qcow.c     |    2 +-
 block/sheepdog.c |    2 +-
 hpet.h           |   22 --
 hw/arm_pic.c     |    2 +-
 qemu-char.c      |    8
 qemu-options.hx  |    4 ++--
 tcg/tcg.h        |    2 +-
 7 files changed, 10 insertions(+), 32 deletions(-)
 delete mode 100644 hpet.h
-- 
1.7.6.3
[Qemu-devel] [PATCH 5/7] tcg: Fix spelling in comment (varables -> variables)
From: Stefan Weil s...@weilnetz.de

Signed-off-by: Stefan Weil s...@weilnetz.de
Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
---
 tcg/tcg.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/tcg/tcg.h b/tcg/tcg.h
index de8a1d5..015f88a 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -175,7 +175,7 @@ typedef enum TCGType {

 typedef tcg_target_ulong TCGArg;

-/* Define a type and accessor macros for varables. Using a struct is
+/* Define a type and accessor macros for variables. Using a struct is
    nice because it gives some level of type safely. Ideally the compiler
    be able to see through all this. However in practice this is not true,
    expecially on targets with braindamaged ABIs (e.g. i386).
-- 
1.7.6.3
[Qemu-devel] [PATCH 4/7] remove hpet.h
From: Paolo Bonzini pbonz...@redhat.com

It is unused since the HPET and RTC timers were removed (commit 25f3151, 2011-05-31).

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
---
 hpet.h |   22 --
 1 files changed, 0 insertions(+), 22 deletions(-)
 delete mode 100644 hpet.h

diff --git a/hpet.h b/hpet.h
deleted file mode 100644
index 754051a..000
--- a/hpet.h
+++ /dev/null
@@ -1,22 +0,0 @@
-#ifndef __HPET__
-#define __HPET__ 1
-
-
-
-struct hpet_info {
-    unsigned long hi_ireqfreq; /* Hz */
-    unsigned long hi_flags; /* information */
-    unsigned short hi_hpet;
-    unsigned short hi_timer;
-};
-
-#define HPET_INFO_PERIODIC 0x0001 /* timer is periodic */
-
-#define HPET_IE_ON _IO('h', 0x01) /* interrupt on */
-#define HPET_IE_OFF _IO('h', 0x02) /* interrupt off */
-#define HPET_INFO _IOR('h', 0x03, struct hpet_info)
-#define HPET_EPI _IO('h', 0x04) /* enable periodic */
-#define HPET_DPI _IO('h', 0x05) /* disable periodic */
-#define HPET_IRQFREQ _IOW('h', 0x6, unsigned long) /* IRQFREQ usec */
-
-#endif /* !__HPET__ */
-- 
1.7.6.3
[Qemu-devel] [PATCH 7/7] block/qcow: Fix use of free() instead of g_free()
From: Stefan Weil s...@weilnetz.de

cppcheck reported this error:

qemu/block/qcow.c:599: error: Mismatching allocation and deallocation: cluster_data

Signed-off-by: Stefan Weil s...@weilnetz.de
Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
---
 block/qcow.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/block/qcow.c b/block/qcow.c
index c8bfecc..eba5a04 100644
--- a/block/qcow.c
+++ b/block/qcow.c
@@ -596,7 +596,7 @@ static int qcow_co_writev(BlockDriverState *bs, int64_t sector_num,
     if (qiov->niov > 1) {
         qemu_vfree(orig_buf);
     }
-    free(cluster_data);
+    g_free(cluster_data);

     return ret;
 }
-- 
1.7.6.3
[Qemu-devel] [PATCH 1/7] qemu-options: avoid #if in spicevmc texi help
Preprocessor directives cannot be used in STEXI/ETEXI sections since they are not passed through the preprocessor. The spicevmc chardev option help currently uses #if, which is included verbatim in the man page output.

Fix this by simply stating that spicevmc chardevs are available only in builds with spice support.

Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
---
 qemu-options.hx |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/qemu-options.hx b/qemu-options.hx
index dfbabd0..d4fe990 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -1673,15 +1673,15 @@ Connect to a local parallel port.
 @option{path} specifies the path to the parallel port device. @option{path} is
 required.

-#if defined(CONFIG_SPICE)
 @item -chardev spicevmc ,id=@var{id} ,debug=@var{debug}, name=@var{name}

+@option{spicevmc} is only available when spice support is built in.
+
 @option{debug} debug level for spicevmc

 @option{name} name of spice channel to connect to

 Connect to a spice virtual machine channel, such as vdiport.
-#endif

 @end table
 ETEXI
-- 
1.7.6.3
[Qemu-devel] [PATCH 2/7] qemu-char: Fix use of free() instead of g_free()
From: Stefan Weil s...@weilnetz.de

cppcheck reported these errors:

qemu-char.c:1667: error: Mismatching allocation and deallocation: s
qemu-char.c:1668: error: Mismatching allocation and deallocation: chr
qemu-char.c:1769: error: Mismatching allocation and deallocation: s
qemu-char.c:1770: error: Mismatching allocation and deallocation: chr

Tested-by: Dongxu Wang wdon...@linux.vnet.ibm.com
Signed-off-by: Stefan Weil s...@weilnetz.de
Signed-off-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
---
 qemu-char.c |    8
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/qemu-char.c b/qemu-char.c
index 8bdbcfd..fb9e058 100644
--- a/qemu-char.c
+++ b/qemu-char.c
@@ -1664,8 +1664,8 @@ static int qemu_chr_open_win(QemuOpts *opts, CharDriverState **_chr)
     chr->chr_close = win_chr_close;

     if (win_chr_init(chr, filename) < 0) {
-        free(s);
-        free(chr);
+        g_free(s);
+        g_free(chr);
         return -EIO;
     }
     qemu_chr_generic_open(chr);
@@ -1766,8 +1766,8 @@ static int qemu_chr_open_win_pipe(QemuOpts *opts, CharDriverState **_chr)
     chr->chr_close = win_chr_close;

     if (win_chr_pipe_init(chr, filename) < 0) {
-        free(s);
-        free(chr);
+        g_free(s);
+        g_free(chr);
         return -EIO;
     }
     qemu_chr_generic_open(chr);
-- 
1.7.6.3
Re: [Qemu-devel] [PATCH 4/4] block: add bdrv_co_discard and bdrv_aio_discard support
Am 14.10.2011 10:41, schrieb Paolo Bonzini: This similarly adds support for coroutine and asynchronous discard. Signed-off-by: Paolo Bonzini pbonz...@redhat.com Do we really need bdrv_discard and bdrv_aio_discard in the backends? I think it makes sense to have a bdrv_aio_discard() in block.h as AIO generally fits well for device models, but I would just require bdrv_co_discard for any block drivers implementing discard. I was not sure if qcow2 could be changed to co_discard, though I suspected yes. As discussed on IRC: Yes, it just must make sure to take s-lock. Kevin
Re: [Qemu-devel] [PATCH 1/2] spice: turn client_migrate_info to async
On 09/21/11 17:50, Yonit Halperin wrote:

RHBZ 737921

Spice client is required to connect to the migration target before/as migration starts. Since after migration starts, the target qemu is blocked and cannot accept new spice client we trigger the connection to the target upon client_migrate_info command. client_migrate_info completion cb will be called after spice client has been connected to the target (or a timeout). See following patches and spice patches.

Patch fails to build with spice disabled. Please fix.

thanks,
Gerd
Re: [Qemu-devel] [PATCH v2] runstate: add more valid transitions
On Fri, 14 Oct 2011 15:37:29 +0200 Paolo Bonzini pbonz...@redhat.com wrote:
On 10/14/2011 03:23 PM, Luiz Capitulino wrote:

I'm not, because I'm assuming that allowing a transition from 'paused' to 'postmigrate' plus this fix: http://lists.gnu.org/archive/html/qemu-devel/2011-10/msg01430.html

I think POST_MIGRATE -> FINISH_MIGRATE should be still allowed in case you migrate twice. Management would probably disallow that, but from the monitor you can do funny things: stop the machine, create two images based on the running machine's image, and migrate to both images.

Yes, you're right. But there's another problem there: the VM is stopped when the migration process finishes. So, if you migrate again, there won't be a POST_MIGRATE -> FINISH_MIGRATE transition, as vm_stop() will just return.

I don't like the idea of changing current vm_stop() to always change the state, so I'm adding a new function to do that:

diff --git a/cpus.c b/cpus.c
index 8978779..5f5b763 100644
--- a/cpus.c
+++ b/cpus.c
@@ -887,6 +887,17 @@ void vm_stop(RunState state)
     do_vm_stop(state);
 }

+/* does a state transition even if the VM is already stopped,
+   current state is forgotten forever */
+void vm_stop_force_state(RunState state)
+{
+    if (runstate_is_running()) {
+        vm_stop(state);
+    } else {
+        runstate_set(state);
+    }
+}
+
 static int tcg_cpu_exec(CPUState *env)
 {
     int ret;
diff --git a/migration.c b/migration.c
index 77a51ad..62b74a6 100644
--- a/migration.c
+++ b/migration.c
@@ -375,7 +375,7 @@ void migrate_fd_put_ready(void *opaque)
         int old_vm_running = runstate_is_running();

         DPRINTF("done iterating\n");
-        vm_stop(RUN_STATE_FINISH_MIGRATE);
+        vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);

         if ((qemu_savevm_state_complete(s->mon, s->file)) < 0) {
             if (old_vm_running) {
diff --git a/sysemu.h b/sysemu.h
index a889d90..7d288f8 100644
--- a/sysemu.h
+++ b/sysemu.h
@@ -35,6 +35,7 @@ void vm_state_notify(int running, RunState state);

 void vm_start(void);
 void vm_stop(RunState state);
+void vm_stop_force_state(RunState state);

 void qemu_system_reset_request(void);
 void qemu_system_shutdown_request(void);
diff --git a/vl.c b/vl.c
index 2e991fc..613204b 100644
--- a/vl.c
+++ b/vl.c
@@ -346,6 +346,7 @@ static const RunStateTransition runstate_transitions_def[] = {
     { RUN_STATE_PAUSED, RUN_STATE_POSTMIGRATE },

     { RUN_STATE_POSTMIGRATE, RUN_STATE_RUNNING },
+    { RUN_STATE_POSTMIGRATE, RUN_STATE_FINISH_MIGRATE },

     { RUN_STATE_PRELAUNCH, RUN_STATE_RUNNING },
     { RUN_STATE_PRELAUNCH, RUN_STATE_INMIGRATE },
-- 
1.7.7.rc3
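
To make the "migrate twice" case concrete, here is a small table-driven sketch of the behaviour the patch describes. It is a toy under stated assumptions: the state names, the reduced transition table and the toy_* functions are hypothetical, toy_vm_stop() only models the "returns without changing state if already stopped" behaviour mentioned in the thread, and an invalid transition returns false here instead of whatever the real runstate_set() does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy subset of run states; hypothetical stand-ins for RUN_STATE_*. */
typedef enum {
    TOY_RUNNING,
    TOY_PAUSED,
    TOY_FINISH_MIGRATE,
    TOY_POSTMIGRATE,
    TOY_STATE_MAX
} ToyRunState;

typedef struct {
    ToyRunState from;
    ToyRunState to;
} ToyTransition;

static const ToyTransition toy_transitions[] = {
    { TOY_RUNNING,        TOY_PAUSED },
    { TOY_RUNNING,        TOY_FINISH_MIGRATE },
    { TOY_PAUSED,         TOY_POSTMIGRATE },
    { TOY_FINISH_MIGRATE, TOY_POSTMIGRATE },
    { TOY_POSTMIGRATE,    TOY_RUNNING },
    { TOY_POSTMIGRATE,    TOY_FINISH_MIGRATE }, /* needed to migrate twice */
};

static ToyRunState toy_state = TOY_RUNNING;

ToyRunState toy_runstate_get(void)
{
    return toy_state;
}

static bool toy_transition_valid(ToyRunState from, ToyRunState to)
{
    size_t n = sizeof(toy_transitions) / sizeof(toy_transitions[0]);
    for (size_t i = 0; i < n; i++) {
        if (toy_transitions[i].from == from && toy_transitions[i].to == to) {
            return true;
        }
    }
    return false;
}

/* Returns false and leaves the state unchanged on an invalid transition. */
bool toy_runstate_set(ToyRunState next)
{
    assert(next < TOY_STATE_MAX);
    if (!toy_transition_valid(toy_state, next)) {
        return false;
    }
    toy_state = next;
    return true;
}

/* Models the plain stop: a no-op when the VM is already stopped. */
void toy_vm_stop(ToyRunState state)
{
    if (toy_state != TOY_RUNNING) {
        return;
    }
    toy_runstate_set(state);
}

/* Models the new helper: change state even if already stopped. */
bool toy_vm_stop_force_state(ToyRunState state)
{
    if (toy_state == TOY_RUNNING) {
        toy_vm_stop(state);
        return toy_state == state;
    }
    return toy_runstate_set(state);
}
```

In this toy, a second migration started from TOY_POSTMIGRATE stays stuck in that state with toy_vm_stop(), while toy_vm_stop_force_state() moves it to TOY_FINISH_MIGRATE, but only because the table contains the extra entry.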
Re: [Qemu-devel] [PATCH 4/4] block: add bdrv_co_discard and bdrv_aio_discard support
On 10/14/2011 04:23 PM, Kevin Wolf wrote: This similarly adds support for coroutine and asynchronous discard. Signed-off-by: Paolo Bonzinipbonz...@redhat.com Do we really need bdrv_discard and bdrv_aio_discard in the backends? I think it makes sense to have a bdrv_aio_discard() in block.h as AIO generally fits well for device models, but I would just require bdrv_co_discard for any block drivers implementing discard. bdrv_discard is needed for now since I wouldn't like to conflate this patch with the qcow2 patch. I can certainly drop aio_discard from the backends, but I'm not sure how heavy can fallocate be (with FALLOC_FL_PUNCH_HOLE). Probably not much, but I think there's no guarantee of O(1) behavior especially with filesystems like ecryptfs. So you would need to go through the thread pool and aio_discard would come in handy. Paolo
Re: [Qemu-devel] [PATCH v2] runstate: add more valid transitions
On 10/14/2011 04:24 PM, Luiz Capitulino wrote:

Yes, you're right. But there's another problem there: the VM is stopped when the migration process finishes. So, if you migrate again, there won't be a POST_MIGRATE -> FINISH_MIGRATE transition, as vm_stop() will just return.

Acked-by: Paolo Bonzini pbonz...@redhat.com

for now. I really think for 1.1 it's better to separate the paused-until-user-interaction-because-... and temporarily-paused-because-... state, as suggested elsewhere in the thread.

Paolo
Re: [Qemu-devel] [PATCH 4/4] block: add bdrv_co_discard and bdrv_aio_discard support
Am 14.10.2011 16:24, schrieb Paolo Bonzini: On 10/14/2011 04:23 PM, Kevin Wolf wrote: This similarly adds support for coroutine and asynchronous discard. Signed-off-by: Paolo Bonzinipbonz...@redhat.com Do we really need bdrv_discard and bdrv_aio_discard in the backends? I think it makes sense to have a bdrv_aio_discard() in block.h as AIO generally fits well for device models, but I would just require bdrv_co_discard for any block drivers implementing discard. bdrv_discard is needed for now since I wouldn't like to conflate this patch with the qcow2 patch. Okay, that makes sense. Then I'll drop it when I convert qcow2. I can certainly drop aio_discard from the backends, but I'm not sure how heavy can fallocate be (with FALLOC_FL_PUNCH_HOLE). Probably not much, but I think there's no guarantee of O(1) behavior especially with filesystems like ecryptfs. So you would need to go through the thread pool and aio_discard would come in handy. Sure, but the coroutine interface should be just as good for implementing it this way in raw-posix. Kevin
Re: [Qemu-devel] [PATCH 4/4] block: add bdrv_co_discard and bdrv_aio_discard support
On 10/14/2011 04:32 PM, Kevin Wolf wrote: I can certainly drop aio_discard from the backends, but I'm not sure how heavy can fallocate be (with FALLOC_FL_PUNCH_HOLE). Probably not much, but I think there's no guarantee of O(1) behavior especially with filesystems like ecryptfs. So you would need to go through the thread pool and aio_discard would come in handy. Sure, but the coroutine interface should be just as good for implementing it this way in raw-posix. I think until someone rewrites raw_aio_submit/paio_submit in terms of coroutines, it's better to keep aio_discard. Paolo