Re: [PATCH 1/3] kvm tools: Fix powerpc build errors caused by recent changes
On Fri, 2012-10-05 at 09:30 +0300, Pekka Enberg wrote:
> Applied all three patches, thanks Michael!

Thanks Pekka.

cheers

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
[PULL] virtio
The following changes since commit 925a6f0bf8bd122d5d2429af7f0ca0fecf4ae71f:

  Merge tag 'hwspinlock-3.6-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/ohad/hwspinlock (2012-09-18 11:58:54 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux.git virtio-next

for you to fetch changes up to ca16f580a5db7e60bfafe59a50bb133bd3347491:

  lguest: fix occasional crash in example launcher. (2012-10-04 12:12:59 +0930)

New workflow: same git trees pulled by linux-next get sent straight to Linus.

Git is awkward at shuffling patches compared with quilt or mq, but that
doesn't happen often once things get into my -next branch.

Alexey Khoroshilov (1):
      virtio: console: fix error handling in init() function

Asias He (3):
      virtio-blk: Add bio-based IO path for virtio-blk
      virtio-blk: Add REQ_FLUSH and REQ_FUA support to bio path
      virtio-blk: Disable callback in virtblk_done()

Brian Foley (2):
      virtio_mmio: fix off by one error allocating queue
      virtio_mmio: Don't attempt to create empty virtqueues

Dan Carpenter (1):
      virtio-blk: fix NULL checking in virtblk_alloc_req()

Jason Wang (2):
      virtio-ring: move queue_index to vring_virtqueue
      virtio: introduce an API to set affinity for a virtqueue

Masami Hiramatsu (5):
      virtio/console: Add splice_write support
      virtio/console: Add a failback for unstealable pipe buffer
      virtio/console: Wait until the port is ready on splice
      ftrace: Allow stealing pages from pipe buffer
      virtio/console: Allocate scatterlist according to the current pipe size

Michael S. Tsirkin (3):
      virtio-balloon: dependency fix
      virtio: support reserved vqs
      virtio: don't crash when device is buggy

Peter Senna Tschudin (1):
      drivers/virtio/virtio_pci.c: fix error return code

Rusty Russell (4):
      virtio_balloon: not EXPERIMENTAL any more.
      virtio: add help to CONFIG_VIRTIO option.
      virtio: remove CONFIG_VIRTIO_RING
      lguest: fix occasional crash in example launcher.

Yoshihiro YUNOMAE (2):
      tools: Add guest trace agent as a user tool
      tools: Fix pthread flag for Makefile of trace-agent used by virtio-trace

 arch/s390/Kconfig                           |   1 -
 arch/x86/lguest/Kconfig                     |   1 -
 drivers/block/virtio_blk.c                  | 306 +++
 drivers/char/virtio_console.c               | 210 --
 drivers/lguest/lguest_device.c              |   5 +-
 drivers/remoteproc/remoteproc_virtio.c      |   5 +-
 drivers/rpmsg/Kconfig                       |   1 -
 drivers/s390/kvm/kvm_virtio.c               |   5 +-
 drivers/virtio/Kconfig                      |  17 +-
 drivers/virtio/Makefile                     |   3 +-
 drivers/virtio/virtio.c                     |   2 +-
 drivers/virtio/virtio_mmio.c                |  29 ++-
 drivers/virtio/virtio_pci.c                 |  68 +-
 drivers/virtio/virtio_ring.c                |  14 +-
 include/linux/virtio.h                      |   2 +
 include/linux/virtio_config.h               |  23 ++
 include/linux/virtio_ring.h                 |   3 +-
 kernel/trace/trace.c                        |   8 +-
 tools/lguest/lguest.c                       |   1 +
 tools/virtio/virtio-trace/Makefile          |  13 ++
 tools/virtio/virtio-trace/README            | 118 +++
 tools/virtio/virtio-trace/trace-agent-ctl.c | 137
 tools/virtio/virtio-trace/trace-agent-rw.c  | 192 +
 tools/virtio/virtio-trace/trace-agent.c     | 270 +++
 tools/virtio/virtio-trace/trace-agent.h     |  75 +++
 25 files changed, 1402 insertions(+), 107 deletions(-)
 create mode 100644 tools/virtio/virtio-trace/Makefile
 create mode 100644 tools/virtio/virtio-trace/README
 create mode 100644 tools/virtio/virtio-trace/trace-agent-ctl.c
 create mode 100644 tools/virtio/virtio-trace/trace-agent-rw.c
 create mode 100644 tools/virtio/virtio-trace/trace-agent.c
 create mode 100644 tools/virtio/virtio-trace/trace-agent.h
Re: [Qemu-devel] [PATCH] kvm: Set default accelerator to "kvm" if the host supports it
On 05.10.2012, at 04:17, Anthony Liguori wrote:

> Alexander Graf writes:
>
>> On 03.10.2012, at 22:26, Peter Maydell wrote:
>>
>>> On 3 October 2012 21:01, Blue Swirl wrote:
>>>> On Mon, Oct 1, 2012 at 4:20 PM, Anthony Liguori wrote:
>>>>> Jan Kiszka writes:
>>>>>> +    /* The default accelerator depends on the availability of KVM. */
>>>>>> +    p = kvm_configured ? "kvm" : "tcg";
>>>>>>      }
>>>
>>>>> Blue/Aurelien, any objections?
>>>>
>>>> No, maybe a message could be printed that says that the default has
>>>> changed, for a few releases.
>>>
>>> I've lost track of the conversation, are we currently proposing
>>> the accelerator default to be "kvm" (as per the original patch
>>> you quote here) or "kvm:tcg" ?
>>>
>>> I'm not entirely sure which I prefer from an ARM perspective
>>> For some time to come and for a lot of targets (ie any target
>>> CPU except A15), having a default of "kvm" is going to cause
>>> existing working commandlines to stop working. [I expect that
>>> ARM-host qemu binaries will be built with CONFIG_KVM once ARM
>>> KVM support lands, but the same binary will be run on hosts
>>> without virtualization extensions.] On the other hand, perhaps
>>> there just aren't really very many people who run QEMU on
>>> ARM hosts, and so we can ignore them :-)
>>
>> We get similar problems on PPC. Take the following example:
>>
>> $ qemu-system-ppc -M mpc8544ds -kernel uImage -nographic
>
> But do you really expect people to do this? I have to believe that
> people running on PPC hardware and running qemu-system-ppc most likely
> want to do KVM...

Sure. But we wouldn't even be able to tell them what went wrong, as we
don't have a negotiation mechanism right now that could tell user space
"hey, the CPU you selected is unknown to me".

However, if during cpu init we could add such a check and then fall back
to tcg mode with a warning when accel=kvm:tcg, that'd be a nice user
experience. We could do the same for ARM. If you do -M beagle on an A15
KVM-enabled machine, you would still be able to do so, but KVM tells you
it can't emulate an A8 right now. And if in the future KVM learns how to
expose an A8 on A15, we could just not bail out and things would
magically work.

Apart from that, I like the idea of kvm:tcg with a warning as the default
for qemu. We should still have a qemu-kvm binary in the distro that does
accel=kvm so people don't accidentally fall back to tcg mode.

Alex
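The fallback behaviour discussed above can be sketched roughly as follows. This is only an illustration of the "kvm:tcg with a warning" idea, not QEMU code: pick_accel() and kvm_can_run_cpu() are hypothetical names, and the capability check is reduced to a string comparison between host and guest CPU models.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical probe: pretend KVM can only virtualize the host's own CPU. */
static int kvm_can_run_cpu(const char *host_cpu, const char *guest_cpu)
{
    return strcmp(host_cpu, guest_cpu) == 0;
}

/*
 * Walk a colon-separated accelerator list such as "kvm:tcg" and return the
 * first entry able to run the requested guest CPU. A warning is printed
 * whenever KVM has to be skipped. Returns NULL if no entry is usable.
 */
static const char *pick_accel(const char *accel_list, const char *host_cpu,
                              const char *guest_cpu)
{
    const char *p = accel_list;

    while (*p) {
        size_t len = strcspn(p, ":");

        if (len == 3 && strncmp(p, "kvm", 3) == 0) {
            if (kvm_can_run_cpu(host_cpu, guest_cpu))
                return "kvm";
            fprintf(stderr, "warning: KVM cannot virtualize CPU '%s' here\n",
                    guest_cpu);
        } else if (len == 3 && strncmp(p, "tcg", 3) == 0) {
            return "tcg";   /* emulation can run any guest CPU */
        }

        p += len;
        if (*p == ':')
            p++;
    }
    return NULL;
}
```

With this sketch, pick_accel("kvm:tcg", "970", "e500") warns and selects tcg, while pick_accel("kvm", "970", "e500") fails outright, which is the behaviour a strict qemu-kvm binary would want.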
RE: [PATCH v2 06/14] KVM: ARM: Memory virtualization setup
> -----Original Message-----
> From: kvm-ow...@vger.kernel.org [mailto:kvm-ow...@vger.kernel.org] On
> Behalf Of Christoffer Dall
> Sent: Monday, October 01, 2012 6:11 PM
> To: kvm@vger.kernel.org; linux-arm-ker...@lists.infradead.org;
> kvm...@lists.cs.columbia.edu
> Cc: Marc Zyngier
> Subject: [PATCH v2 06/14] KVM: ARM: Memory virtualization setup
>
> +static void stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
> +			   phys_addr_t addr, const pte_t *new_pte)
> +{
> +	pgd_t *pgd;
> +	pud_t *pud;
> +	pmd_t *pmd;
> +	pte_t *pte, old_pte;
> +
> +	/* Create 2nd stage page table mapping - Level 1 */
> +	pgd = kvm->arch.pgd + pgd_index(addr);
> +	pud = pud_offset(pgd, addr);
> +	if (pud_none(*pud)) {
> +		if (!cache)
> +			return; /* ignore calls from kvm_set_spte_hva */
> +		pmd = mmu_memory_cache_alloc(cache);
> +		pud_populate(NULL, pud, pmd);
> +		pmd += pmd_index(addr);
> +		get_page(virt_to_page(pud));
> +	} else
> +		pmd = pmd_offset(pud, addr);
> +
> +	/* Create 2nd stage page table mapping - Level 2 */
> +	if (pmd_none(*pmd)) {
> +		if (!cache)
> +			return; /* ignore calls from kvm_set_spte_hva */
> +		pte = mmu_memory_cache_alloc(cache);
> +		clean_pte_table(pte);
> +		pmd_populate_kernel(NULL, pmd, pte);
> +		pte += pte_index(addr);
> +		get_page(virt_to_page(pmd));
> +	} else
> +		pte = pte_offset_kernel(pmd, addr);
> +
> +	/* Create 2nd stage page table mapping - Level 3 */
> +	old_pte = *pte;
> +	set_pte_ext(pte, *new_pte, 0);
> +	if (pte_present(old_pte))
> +		__kvm_tlb_flush_vmid(kvm);
> +	else
> +		get_page(virt_to_page(pte));
> +}

I'm not sure about the 3-level page table, but isn't it necessary to
clean the page table for the 2nd level as well? There are two
mmu_memory_cache_alloc() calls: one is followed by clean_pte_table()
and the other is not.

And why do you ignore calls from kvm_set_spte_hva? That is supposed to
happen when the host moves the page, right? Do you ignore that case
because it can be handled later, when the fault actually happens? Is
there any other reason that I'm missing?
Re: [Qemu-devel] [PATCH] kvm: Set default accelerator to "kvm" if the host supports it
Alexander Graf writes:

> On 03.10.2012, at 22:26, Peter Maydell wrote:
>
>> On 3 October 2012 21:01, Blue Swirl wrote:
>>> On Mon, Oct 1, 2012 at 4:20 PM, Anthony Liguori
>>> wrote:
>>>> Jan Kiszka writes:
>>>>> +    /* The default accelerator depends on the availability of KVM. */
>>>>> +    p = kvm_configured ? "kvm" : "tcg";
>>>>>      }
>>
>>>> Blue/Aurelien, any objections?
>>>
>>> No, maybe a message could be printed that says that the default has
>>> changed, for a few releases.
>>
>> I've lost track of the conversation, are we currently proposing
>> the accelerator default to be "kvm" (as per the original patch
>> you quote here) or "kvm:tcg" ?
>>
>> I'm not entirely sure which I prefer from an ARM perspective
>> For some time to come and for a lot of targets (ie any target
>> CPU except A15), having a default of "kvm" is going to cause
>> existing working commandlines to stop working. [I expect that
>> ARM-host qemu binaries will be built with CONFIG_KVM once ARM
>> KVM support lands, but the same binary will be run on hosts
>> without virtualization extensions.] On the other hand, perhaps
>> there just aren't really very many people who run QEMU on
>> ARM hosts, and so we can ignore them :-)
>
> We get similar problems on PPC. Take the following example:
>
> $ qemu-system-ppc -M mpc8544ds -kernel uImage -nographic

But do you really expect people to do this? I have to believe that
people running on PPC hardware and running qemu-system-ppc most likely
want to do KVM...

Kernel development seems unlikely... The folks that I know doing PPC
kernel development with QEMU still use qemu-system-ppc on x86.

Regards,

Anthony Liguori

>
> That command line would work just fine if you run it on a G5 today. However,
> if we switch to accel=kvm:tcg, it will stop working, because kvm on the G5
> can not virtualize an e500 CPU.
>
> I don't know any good way around that one either the way the code is layered
> today. We only know the type of CPU we want to create after we made the
> decision whether we do KVM or not.
>
>
> Alex
Re: [Qemu-devel] [PATCH] kvm: Set default accelerator to "kvm" if the host supports it
On 03.10.2012, at 22:26, Peter Maydell wrote:

> On 3 October 2012 21:01, Blue Swirl wrote:
>> On Mon, Oct 1, 2012 at 4:20 PM, Anthony Liguori
>> wrote:
>>> Jan Kiszka writes:
>>>> +    /* The default accelerator depends on the availability of KVM. */
>>>> +    p = kvm_configured ? "kvm" : "tcg";
>>>>      }
>
>>> Blue/Aurelien, any objections?
>>
>> No, maybe a message could be printed that says that the default has
>> changed, for a few releases.
>
> I've lost track of the conversation, are we currently proposing
> the accelerator default to be "kvm" (as per the original patch
> you quote here) or "kvm:tcg" ?
>
> I'm not entirely sure which I prefer from an ARM perspective
> For some time to come and for a lot of targets (ie any target
> CPU except A15), having a default of "kvm" is going to cause
> existing working commandlines to stop working. [I expect that
> ARM-host qemu binaries will be built with CONFIG_KVM once ARM
> KVM support lands, but the same binary will be run on hosts
> without virtualization extensions.] On the other hand, perhaps
> there just aren't really very many people who run QEMU on
> ARM hosts, and so we can ignore them :-)

We get similar problems on PPC. Take the following example:

  $ qemu-system-ppc -M mpc8544ds -kernel uImage -nographic

That command line would work just fine if you run it on a G5 today.
However, if we switch to accel=kvm:tcg, it will stop working, because
kvm on the G5 can not virtualize an e500 CPU.

I don't know any good way around that one either the way the code is
layered today. We only know the type of CPU we want to create after we
made the decision whether we do KVM or not.


Alex
[PATCH 3/3] kvm tools: Do setup_fdt() later, get powerpc to boot again
In commit e3d3ced "kernel load/firmware cleanup", the call to
kvm__arch_setup_firmware() was moved. Previously it ran more or less at
the end of the init sequence, but that commit moved it into kvm__init(),
which is a core_init() call and so runs quite early.

This broke booting powerpc guests, as setup_fdt() needs to be called
later in the setup sequence. In particular it looks at kvm->nrcpus, which
is uninitialised at that point.

In general setup_fdt() needs to run late in the sequence, as it encodes
the setup of the machine into the device tree.

So move setup_fdt() out of kvm__arch_setup_firmware() and make it a
firmware_init() call of its own.

With this patch I am able to boot guests again on HV KVM.

Signed-off-by: Michael Ellerman
---
 tools/kvm/powerpc/kvm.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/tools/kvm/powerpc/kvm.c b/tools/kvm/powerpc/kvm.c
index e4f5315..d675265 100644
--- a/tools/kvm/powerpc/kvm.c
+++ b/tools/kvm/powerpc/kvm.c
@@ -286,7 +286,7 @@ static void generate_segment_page_sizes(struct kvm_ppc_smmu_info *info, struct f
  * and whilst most PPC targets will require CPU/memory nodes, others like RTAS
  * should eventually be added separately.
  */
-static void setup_fdt(struct kvm *kvm)
+static int setup_fdt(struct kvm *kvm)
 {
 	uint64_t	mem_reg_property[] = { 0, cpu_to_be64(kvm->ram_size) };
 	int		smp_cpus = kvm->nrcpus;
@@ -488,7 +488,10 @@ static void setup_fdt(struct kvm *kvm)
 	_FDT(fdt_pack(fdt_dest));

 	free(segment_page_sizes.value);
+
+	return 0;
 }
+firmware_init(setup_fdt);

 /**
  * kvm__arch_setup_firmware
@@ -517,9 +520,6 @@ int kvm__arch_setup_firmware(struct kvm *kvm)

 	/* Load SLOF */

-	/* Init FDT */
-	setup_fdt(kvm);
-
 	return 0;
 }
--
1.7.9.5
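The ordering problem described in this patch can be illustrated with a small, self-contained model of staged init hooks. This is a sketch under assumptions, not the kvm tools implementation: the stage enum, struct machine, run_init() and the two hooks are all hypothetical, and only the idea of "core runs before base, base before firmware" is taken from the patch description.

```c
#include <stddef.h>

/* Hypothetical machine state: the FDT builder records how many CPUs it saw. */
struct machine {
    int nrcpus;
    int fdt_cpus;
};

/* Hypothetical stages, run in this order. */
enum init_stage { STAGE_CORE, STAGE_BASE, STAGE_FIRMWARE, STAGE_COUNT };

typedef int (*init_fn)(struct machine *m);

struct init_entry {
    enum init_stage stage;
    init_fn fn;
};

static int setup_cpus(struct machine *m) { m->nrcpus = 4; return 0; }
static int build_fdt(struct machine *m)  { m->fdt_cpus = m->nrcpus; return 0; }

/* Run all hooks stage by stage, independent of their order in the table. */
static int run_init(struct machine *m, const struct init_entry *tbl, size_t n)
{
    for (int stage = 0; stage < STAGE_COUNT; stage++) {
        for (size_t i = 0; i < n; i++) {
            if ((int)tbl[i].stage == stage) {
                int r = tbl[i].fn(m);
                if (r < 0)
                    return r;
            }
        }
    }
    return 0;
}

/* Return the CPU count the FDT builder observes when hooked at fdt_stage. */
static int fdt_cpus_when_built_at(enum init_stage fdt_stage)
{
    struct machine m = { 0, 0 };
    const struct init_entry table[] = {
        { fdt_stage,  build_fdt  },
        { STAGE_BASE, setup_cpus },
    };

    if (run_init(&m, table, sizeof(table) / sizeof(table[0])) < 0)
        return -1;
    return m.fdt_cpus;
}
```

In this model, hooking the FDT builder at STAGE_CORE makes it see zero CPUs, while hooking it at STAGE_FIRMWARE makes it see the value set during STAGE_BASE.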
[PATCH 2/3] kvm tools: Fix segfault on powerpc in xics_register()
In commit 06e6648 "move kvm_cpus into struct kvm", kvm_cpu__init() became
kvm_cpu__arch_init() called from a new kvm_cpu__init(), and the call was
moved from the end of the init sequence to much earlier, and in particular
prior to irq__init().

This leads to a segfault on powerpc, because kvm_cpu__arch_init() calls
into xics_cpu_register(), which dereferences vcpu->kvm->arch.icp, which is
uninitialised until irq__init().

Later in commit a48488d "use init/exit where possible", irq__init() was
pulled out of the init sequence and made a dev_base_init() routine, on
x86. On powerpc the call to irq__init() was dropped entirely.

Finally, we now have a circular dependency between kvm_cpu__init() (which
needs kvm->arch.icp), and irq__init() (which needs kvm->nrcpus). This is
caused by the combination of commit 89f40a7 "move nrcpus into struct
kvm_config", which moved the global nrcpus into kvm->cfg, and commit
06e6648 "move kvm_cpus into struct kvm", which moved the setup of
kvm->nrcpus from kvm->cfg into kvm_cpu__init().

To fix it we drop irq__init() entirely; if we ever have a non-XICS irq
option we can bring it back. We turn xics_system_init() into xics_init(),
and have it do the allocation and setup of the icp/ics, including the
per-vcpu setup, removing the dependency from kvm_cpu__init() (via
kvm_cpu__arch_init()).

xics_init() is a base_init() routine (it can't be core), which should be
early enough, fingers crossed.

Finally drop irq__exit(), it does nothing and is never called.

Signed-off-by: Michael Ellerman
---
 tools/kvm/powerpc/irq.c     |   19 ---
 tools/kvm/powerpc/kvm-cpu.c |    3 ---
 tools/kvm/powerpc/xics.c    |   38 +++---
 tools/kvm/powerpc/xics.h    |    5 -
 4 files changed, 23 insertions(+), 42 deletions(-)

diff --git a/tools/kvm/powerpc/irq.c b/tools/kvm/powerpc/irq.c
index 6d134c5..e89fa3b 100644
--- a/tools/kvm/powerpc/irq.c
+++ b/tools/kvm/powerpc/irq.c
@@ -26,8 +26,6 @@
 #include "xics.h"
 #include "spapr_pci.h"

-#define XICS_IRQS	1024
-
 /*
  * FIXME: The code in this file assumes an SPAPR guest, using XICS. Make
  * generic & cope with multiple PPC platform types.
@@ -51,23 +49,6 @@ int irq__register_device(u32 dev, u8 *num, u8 *pin, u8 *line)
 	return 0;
 }

-int irq__init(struct kvm *kvm)
-{
-	/*
-	 * kvm->nr_cpus is now valid; for /now/, pass
-	 * this to xics_system_init(), which assumes servers
-	 * are numbered 0..nrcpus. This may not really be true,
-	 * but it is OK currently.
-	 */
-	kvm->arch.icp = xics_system_init(XICS_IRQS, kvm->nrcpus);
-	return 0;
-}
-
-int irq__exit(struct kvm *kvm)
-{
-	return 0;
-}
-
 int irq__add_msix_route(struct kvm *kvm, struct msi_msg *msg)
 {
 	die(__FUNCTION__);
diff --git a/tools/kvm/powerpc/kvm-cpu.c b/tools/kvm/powerpc/kvm-cpu.c
index 6aaf424..8fce121 100644
--- a/tools/kvm/powerpc/kvm-cpu.c
+++ b/tools/kvm/powerpc/kvm-cpu.c
@@ -93,9 +93,6 @@ struct kvm_cpu *kvm_cpu__arch_init(struct kvm *kvm, unsigned long cpu_id)
 	 */
 	vcpu->is_running = true;

-	/* Register with IRQ controller (FIXME, assumes XICS) */
-	xics_cpu_register(vcpu);
-
 	return vcpu;
 }

diff --git a/tools/kvm/powerpc/xics.c b/tools/kvm/powerpc/xics.c
index 1cf9558..d4b5caa 100644
--- a/tools/kvm/powerpc/xics.c
+++ b/tools/kvm/powerpc/xics.c
@@ -18,6 +18,8 @@
 #include
 #include

+#define XICS_NUM_IRQS	1024
+
 /* #define DEBUG_XICS yes */
 #ifdef DEBUG_XICS

@@ -441,26 +443,19 @@ static void rtas_int_on(struct kvm_cpu *vcpu, uint32_t token,
 	rtas_st(vcpu->kvm, rets, 0, 0); /* Success */
 }

-void xics_cpu_register(struct kvm_cpu *vcpu)
-{
-	if (vcpu->cpu_id < vcpu->kvm->arch.icp->nr_servers)
-		vcpu->kvm->arch.icp->ss[vcpu->cpu_id].cpu = vcpu;
-	else
-		die("Setting invalid server for cpuid %ld\n", vcpu->cpu_id);
-}
-
-struct icp_state *xics_system_init(unsigned int nr_irqs, unsigned int nr_cpus)
+static int xics_init(struct kvm *kvm)
 {
 	int max_server_num;
 	unsigned int i;
 	struct icp_state *icp;
 	struct ics_state *ics;
+	int j;

-	max_server_num = nr_cpus;
+	max_server_num = kvm->nrcpus;

 	icp = malloc(sizeof(*icp));
 	icp->nr_servers = max_server_num + 1;
-	icp->ss = malloc(icp->nr_servers*sizeof(struct icp_server_state));
+	icp->ss = malloc(icp->nr_servers * sizeof(struct icp_server_state));

 	for (i = 0; i < icp->nr_servers; i++) {
 		icp->ss[i].xirr = 0;
@@ -475,14 +470,14 @@ struct icp_state *xics_system_init(unsigned int nr_irqs, unsigned int nr_cpus)
 	 */

 	ics = malloc(sizeof(*ics));
-	ics->nr_irqs = nr_irqs;
+	ics->nr_irqs = XICS_NUM_IRQS;
 	ics->offset = XICS_IRQ_OFFSET;
-	ics->irqs = malloc(nr_irqs * sizeof(struct ics_irq_state));
+	ics->irqs = malloc(ics->nr_irqs * sizeof(struct ics_irq_state));

 	icp
[PATCH 1/3] kvm tools: Fix powerpc build errors caused by recent changes
Several were caused by commit 8074303 "remove global kvm object":
ioport__setup_arch(), term_getc_iov() & term_getc() in the spapr_hvcons.c
code, and kvm_cpu__reboot() in rtas_power_off().

Commit 221b584 "move active_console into struct kvm_config" added checks
in h_put_term_char() & h_get_term_char() of kvm->cfg.active_console, but
that needs to be vcpu->kvm->cfg.active_console.

That commit also missed updates to term_putc() & term_getc() in
spapr_rtas.c, and I'm guessing that we need similar checks of
active_console in rtas_put_term_char() & rtas_get_term_char().

Signed-off-by: Michael Ellerman
---
 tools/kvm/powerpc/ioport.c       |    2 +-
 tools/kvm/powerpc/spapr_hvcons.c |    6 +++---
 tools/kvm/powerpc/spapr_rtas.c   |   14 +++++++++-----
 3 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/tools/kvm/powerpc/ioport.c b/tools/kvm/powerpc/ioport.c
index a8e4dc3..264fb7e 100644
--- a/tools/kvm/powerpc/ioport.c
+++ b/tools/kvm/powerpc/ioport.c
@@ -12,7 +12,7 @@

 #include

-void ioport__setup_arch(void)
+void ioport__setup_arch(struct kvm *kvm)
 {
 	/* PPC has no legacy ioports to set up */
 }
diff --git a/tools/kvm/powerpc/spapr_hvcons.c b/tools/kvm/powerpc/spapr_hvcons.c
index 1fe4bdb..0bdf75b 100644
--- a/tools/kvm/powerpc/spapr_hvcons.c
+++ b/tools/kvm/powerpc/spapr_hvcons.c
@@ -50,7 +50,7 @@ static unsigned long h_put_term_char(struct kvm_cpu *vcpu, unsigned long opcode,
 	do {
 		int ret;

-		if (kvm->cfg.active_console == CONSOLE_HV)
+		if (vcpu->kvm->cfg.active_console == CONSOLE_HV)
 			ret = term_putc_iov(&iov, 1, 0);
 		else
 			ret = 0;
@@ -74,14 +74,14 @@ static unsigned long h_get_term_char(struct kvm_cpu *vcpu, unsigned long opcode,
 	union hv_chario data;
 	struct iovec iov;

-	if (kvm->cfg.active_console != CONSOLE_HV)
+	if (vcpu->kvm->cfg.active_console != CONSOLE_HV)
 		return H_SUCCESS;

 	if (term_readable(0)) {
 		iov.iov_base = data.buf;
 		iov.iov_len = 16;

-		*len = term_getc_iov(&iov, 1, 0);
+		*len = term_getc_iov(vcpu->kvm, &iov, 1, 0);
 		*char0_7 = be64_to_cpu(data.a.char0_7);
 		*char8_15 = be64_to_cpu(data.a.char8_15);
 	} else {
diff --git a/tools/kvm/powerpc/spapr_rtas.c b/tools/kvm/powerpc/spapr_rtas.c
index 14a3462..c81d82b 100644
--- a/tools/kvm/powerpc/spapr_rtas.c
+++ b/tools/kvm/powerpc/spapr_rtas.c
@@ -41,7 +41,7 @@ static void rtas_display_character(struct kvm_cpu *vcpu,
 				   uint32_t nret, target_ulong rets)
 {
 	char c = rtas_ld(vcpu->kvm, args, 0);
-	term_putc(CONSOLE_HV, &c, 1, 0);
+	term_putc(&c, 1, 0);
 	rtas_st(vcpu->kvm, rets, 0, 0);
 }

@@ -52,7 +52,10 @@ static void rtas_put_term_char(struct kvm_cpu *vcpu,
 			       uint32_t nret, target_ulong rets)
 {
 	char c = rtas_ld(vcpu->kvm, args, 0);
-	term_putc(CONSOLE_HV, &c, 1, 0);
+
+	if (vcpu->kvm->cfg.active_console == CONSOLE_HV)
+		term_putc(&c, 1, 0);
+
 	rtas_st(vcpu->kvm, rets, 0, 0);
 }

@@ -62,8 +65,9 @@ static void rtas_get_term_char(struct kvm_cpu *vcpu,
 			       uint32_t nret, target_ulong rets)
 {
 	int c;
-	if (term_readable(CONSOLE_HV, 0) &&
-	    (c = term_getc(CONSOLE_HV, 0)) >= 0) {
+
+	if (vcpu->kvm->cfg.active_console == CONSOLE_HV && term_readable(0) &&
+	    (c = term_getc(vcpu->kvm, 0)) >= 0) {
 		rtas_st(vcpu->kvm, rets, 0, 0);
 		rtas_st(vcpu->kvm, rets, 1, c);
 	} else {
@@ -115,7 +119,7 @@ static void rtas_power_off(struct kvm_cpu *vcpu,
 		rtas_st(vcpu->kvm, rets, 0, -3);
 		return;
 	}
-	kvm_cpu__reboot();
+	kvm_cpu__reboot(vcpu->kvm);
 }

 static void rtas_query_cpu_stopped_state(struct kvm_cpu *vcpu,
--
1.7.9.5
[PATCH 10/15] i386: kvm: x2apic is not supported without in-kernel irqchip
This is necessary so that x2apic is not improperly enabled when the
in-kernel irqchip is disabled.

This won't generate a warning with "-cpu ...,check" because the current
check/enforce code is broken (it checks the host CPU data directly,
instead of using kvm_arch_get_supported_cpuid()), but it will eventually
be fixed to properly report the missing x2apic flag.

Signed-off-by: Eduardo Habkost
---
 target-i386/kvm.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 704ad33..590f4d5 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -209,6 +209,13 @@ uint32_t kvm_arch_get_supported_cpuid(KVMState *s, uint32_t function,
             kvm_check_extension(s, KVM_CAP_TSC_DEADLINE_TIMER)) {
             ret |= CPUID_EXT_TSC_DEADLINE_TIMER;
         }
+
+        /* x2apic is reported by GET_SUPPORTED_CPUID, but it can't be enabled
+         * without the in-kernel irqchip
+         */
+        if (!kvm_irqchip_in_kernel()) {
+            ret &= ~CPUID_EXT_X2APIC;
+        }
     } else if (function == 0x80000001 && reg == R_EDX) {
         /* On Intel, kvm returns cpuid according to the Intel spec,
          * so add missing bits according to the AMD spec:
--
1.7.11.4
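As a rough illustration of the CPUID[1].ECX fixups that this patch and the two preceding ones in the series collect in kvm_arch_get_supported_cpuid(), here is a standalone sketch. It is not the QEMU function: fixup_cpuid_1_ecx() and its boolean parameters are hypothetical stand-ins for kvm_irqchip_in_kernel() and the KVM_CAP_TSC_DEADLINE_TIMER check, and the macros are local names for the architectural bit positions.

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID leaf 1, ECX feature bits (architectural bit positions). */
#define EXT_X2APIC             (1u << 21)
#define EXT_TSC_DEADLINE_TIMER (1u << 24)
#define EXT_HYPERVISOR         (1u << 31)

/*
 * Apply the three fixups to the raw value reported for CPUID[1].ECX:
 *  - the hypervisor bit can always be offered to the guest;
 *  - tsc-deadline is offered only with an in-kernel irqchip and the
 *    matching kernel capability;
 *  - x2apic is withdrawn when the irqchip is not in the kernel.
 */
static uint32_t fixup_cpuid_1_ecx(uint32_t raw, bool irqchip_in_kernel,
                                  bool has_tsc_deadline_cap)
{
    uint32_t ret = raw;

    ret |= EXT_HYPERVISOR;

    if (irqchip_in_kernel && has_tsc_deadline_cap)
        ret |= EXT_TSC_DEADLINE_TIMER;

    if (!irqchip_in_kernel)
        ret &= ~EXT_X2APIC;

    return ret;
}
```

Callers can then simply AND their requested feature word with the returned mask, which is why the per-flag special cases in kvm_arch_init_vcpu() become unnecessary.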
[PATCH 12/15] i386: kvm: mask cpuid_ext4_features bits earlier
This way all the filtering by GET_SUPPORTED_CPUID is being done at the
same place in the code.

Signed-off-by: Eduardo Habkost
---
 target-i386/kvm.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 3f4bee5..335b3e7 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -431,6 +431,9 @@ int kvm_arch_init_vcpu(CPUX86State *env)
     env->cpuid_kvm_features &=
         kvm_arch_get_supported_cpuid(s, KVM_CPUID_FEATURES, 0, R_EAX);

+    env->cpuid_ext4_features &= kvm_arch_get_supported_cpuid(s, 0xC0000001,
+                                                             0, R_EDX);
+
     cpuid_i = 0;

     /* Paravirtualization CPUIDs */
@@ -572,8 +575,6 @@ int kvm_arch_init_vcpu(CPUX86State *env)

     /* Call Centaur's CPUID instructions they are supported. */
     if (env->cpuid_xlevel2 > 0) {
-        env->cpuid_ext4_features &=
-            kvm_arch_get_supported_cpuid(s, 0xC0000001, 0, R_EDX);
         cpu_x86_cpuid(env, 0xC0000000, 0, &limit, &unused, &unused, &unused);

         for (i = 0xC0000000; i <= limit; i++) {
--
1.7.11.4
[PATCH 09/15] i386: kvm: set CPUID_EXT_TSC_DEADLINE_TIMER on kvm_arch_get_supported_cpuid()
This moves the CPUID_EXT_TSC_DEADLINE_TIMER CPUID flag hacking from
kvm_arch_init_vcpu() to kvm_arch_get_supported_cpuid().

Full git grep for kvm_arch_get_supported_cpuid:

   kvm.h:uint32_t kvm_arch_get_supported_cpuid(KVMState *env, uint32_t function,
   target-i386/cpu.c:    x86_cpu_def->cpuid_7_0_ebx_features = kvm_arch_get_supported_cpuid(kvm_state, 0x7, 0, R_EBX);
   target-i386/cpu.c:    *eax = kvm_arch_get_supported_cpuid(s, 0xA, count, R_EAX);
   target-i386/cpu.c:    *ebx = kvm_arch_get_supported_cpuid(s, 0xA, count, R_EBX);
   target-i386/cpu.c:    *ecx = kvm_arch_get_supported_cpuid(s, 0xA, count, R_ECX);
   target-i386/cpu.c:    *edx = kvm_arch_get_supported_cpuid(s, 0xA, count, R_EDX);
   target-i386/cpu.c:    *eax = kvm_arch_get_supported_cpuid(s, 0xd, count, R_EAX);
   target-i386/cpu.c:    *ebx = kvm_arch_get_supported_cpuid(s, 0xd, count, R_EBX);
   target-i386/cpu.c:    *ecx = kvm_arch_get_supported_cpuid(s, 0xd, count, R_ECX);
   target-i386/cpu.c:    *edx = kvm_arch_get_supported_cpuid(s, 0xd, count, R_EDX);
   target-i386/kvm.c:uint32_t kvm_arch_get_supported_cpuid(KVMState *s, uint32_t function,
   target-i386/kvm.c:    cpuid_1_edx = kvm_arch_get_supported_cpuid(s, 1, 0, R_EDX);
   target-i386/kvm.c:    env->cpuid_features &= kvm_arch_get_supported_cpuid(s, 1, 0, R_EDX);
 * target-i386/kvm.c:    env->cpuid_ext_features &= kvm_arch_get_supported_cpuid(s, 1, 0, R_ECX);
   target-i386/kvm.c:    env->cpuid_ext2_features &= kvm_arch_get_supported_cpuid(s, 0x80000001,
   target-i386/kvm.c:    env->cpuid_ext3_features &= kvm_arch_get_supported_cpuid(s, 0x80000001,
   target-i386/kvm.c:    env->cpuid_svm_features &= kvm_arch_get_supported_cpuid(s, 0x8000000A,
   target-i386/kvm.c:    kvm_arch_get_supported_cpuid(s, KVM_CPUID_FEATURES, 0, R_EAX);
   target-i386/kvm.c:    kvm_arch_get_supported_cpuid(s, 0xC0000001, 0, R_EDX);

Note that there is only one call for CPUID[1].ECX above (*), and it is
the one that gets hacked to include CPUID_EXT_TSC_DEADLINE_TIMER, so we
can simply make kvm_arch_get_supported_cpuid() set it, to let the rest of
the code know the flag can be safely set by QEMU.

One thing I was worrying about when doing this is that now
kvm_arch_get_supported_cpuid() depends on kvm_irqchip_in_kernel(). But
the 'kvm_kernel_irqchip' global variable is initialized during
kvm_init(), which is called very early, and kvm_init() is already a
requirement to run the GET_SUPPORTED_CPUID ioctl() (as kvm_init() is the
function that initializes the 'kvm_state' global variable).

Signed-off-by: Eduardo Habkost
---
 target-i386/kvm.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 1f451e3..704ad33 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -201,6 +201,14 @@ uint32_t kvm_arch_get_supported_cpuid(KVMState *s, uint32_t function,
          * GET_SUPPORTED_CPUID
          */
         ret |= CPUID_EXT_HYPERVISOR;
+        /* tsc-deadline flag is not returned by GET_SUPPORTED_CPUID, but it
+         * can be enabled if the kernel has KVM_CAP_TSC_DEADLINE_TIMER,
+         * and the irqchip is in the kernel.
+         */
+        if (kvm_irqchip_in_kernel() &&
+            kvm_check_extension(s, KVM_CAP_TSC_DEADLINE_TIMER)) {
+            ret |= CPUID_EXT_TSC_DEADLINE_TIMER;
+        }
     } else if (function == 0x80000001 && reg == R_EDX) {
         /* On Intel, kvm returns cpuid according to the Intel spec,
          * so add missing bits according to the AMD spec:
@@ -404,12 +412,7 @@ int kvm_arch_init_vcpu(CPUX86State *env)

     env->cpuid_features &= kvm_arch_get_supported_cpuid(s, 1, 0, R_EDX);

-    j = env->cpuid_ext_features & CPUID_EXT_TSC_DEADLINE_TIMER;
     env->cpuid_ext_features &= kvm_arch_get_supported_cpuid(s, 1, 0, R_ECX);
-    if (j && kvm_irqchip_in_kernel() &&
-        kvm_check_extension(s, KVM_CAP_TSC_DEADLINE_TIMER)) {
-        env->cpuid_ext_features |= CPUID_EXT_TSC_DEADLINE_TIMER;
-    }

     env->cpuid_ext2_features &= kvm_arch_get_supported_cpuid(s, 0x80000001,
                                                              0, R_EDX);
--
1.7.11.4
[PATCH 11/15] i386: kvm: mask cpuid_kvm_features earlier
Instead of masking the KVM feature bits very late (while building the
KVM_SET_CPUID2 data), mask them out on env->cpuid_kvm_features, at the
same point where the other feature words are masked out.

Signed-off-by: Eduardo Habkost
---
 target-i386/kvm.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 590f4d5..3f4bee5 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -428,6 +428,9 @@ int kvm_arch_init_vcpu(CPUX86State *env)
     env->cpuid_svm_features &= kvm_arch_get_supported_cpuid(s, 0x8000000A,
                                                             0, R_EDX);

+    env->cpuid_kvm_features &=
+        kvm_arch_get_supported_cpuid(s, KVM_CPUID_FEATURES, 0, R_EAX);
+
     cpuid_i = 0;

     /* Paravirtualization CPUIDs */
@@ -448,8 +451,7 @@ int kvm_arch_init_vcpu(CPUX86State *env)
     c = &cpuid_data.entries[cpuid_i++];
     memset(c, 0, sizeof(*c));
     c->function = KVM_CPUID_FEATURES;
-    c->eax = env->cpuid_kvm_features &
-        kvm_arch_get_supported_cpuid(s, KVM_CPUID_FEATURES, 0, R_EAX);
+    c->eax = env->cpuid_kvm_features;

     if (hyperv_enabled()) {
         memcpy(signature, "Hv#1\0\0\0\0\0\0\0\0", 12);
--
1.7.11.4
[PATCH 08/15] i386: kvm: set CPUID_EXT_HYPERVISOR on kvm_arch_get_supported_cpuid()
Full grep for kvm_arch_get_supported_cpuid: kvm.h:uint32_t kvm_arch_get_supported_cpuid(KVMState *env, uint32_t function, target-i386/cpu.c:x86_cpu_def->cpuid_7_0_ebx_features = kvm_arch_get_supported_cpuid(kvm_state, 0x7, 0, R_EBX); target-i386/cpu.c:*eax = kvm_arch_get_supported_cpuid(s, 0xA, count, R_EAX); target-i386/cpu.c:*ebx = kvm_arch_get_supported_cpuid(s, 0xA, count, R_EBX); target-i386/cpu.c:*ecx = kvm_arch_get_supported_cpuid(s, 0xA, count, R_ECX); target-i386/cpu.c:*edx = kvm_arch_get_supported_cpuid(s, 0xA, count, R_EDX); target-i386/cpu.c:*eax = kvm_arch_get_supported_cpuid(s, 0xd, count, R_EAX); target-i386/cpu.c:*ebx = kvm_arch_get_supported_cpuid(s, 0xd, count, R_EBX); target-i386/cpu.c:*ecx = kvm_arch_get_supported_cpuid(s, 0xd, count, R_ECX); target-i386/cpu.c:*edx = kvm_arch_get_supported_cpuid(s, 0xd, count, R_EDX); target-i386/kvm.c:uint32_t kvm_arch_get_supported_cpuid(KVMState *s, uint32_t function, target-i386/kvm.c:cpuid_1_edx = kvm_arch_get_supported_cpuid(s, 1, 0, R_EDX); target-i386/kvm.c:env->cpuid_features &= kvm_arch_get_supported_cpuid(s, 1, 0, R_EDX); * target-i386/kvm.c:env->cpuid_ext_features &= kvm_arch_get_supported_cpuid(s, 1, 0, R_ECX); target-i386/kvm.c:env->cpuid_ext2_features &= kvm_arch_get_supported_cpuid(s, 0x8001, target-i386/kvm.c:env->cpuid_ext3_features &= kvm_arch_get_supported_cpuid(s, 0x8001, target-i386/kvm.c:env->cpuid_svm_features &= kvm_arch_get_supported_cpuid(s, 0x800A, target-i386/kvm.c:kvm_arch_get_supported_cpuid(s, KVM_CPUID_FEATURES, 0, R_EAX); target-i386/kvm.c:kvm_arch_get_supported_cpuid(s, 0xC001, 0, R_EDX); Note that there is only one call for CPUID[1].ECX above (*), and it is the one that gets hacked to include CPUID_EXT_HYPERVISOR, so we can simply make kvm_arch_get_supported_cpuid() set it, to let the rest of the code automatically know that the flag can be safely set by QEMU. 
Signed-off-by: Eduardo Habkost --- target-i386/kvm.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/target-i386/kvm.c b/target-i386/kvm.c index 084b40c..1f451e3 100644 --- a/target-i386/kvm.c +++ b/target-i386/kvm.c @@ -196,6 +196,11 @@ uint32_t kvm_arch_get_supported_cpuid(KVMState *s, uint32_t function, if (function == 1 && reg == R_EDX) { /* KVM before 2.6.30 misreports the following features */ ret |= CPUID_MTRR | CPUID_PAT | CPUID_MCE | CPUID_MCA; +} else if (function == 1 && reg == R_ECX) { +/* We can set the hypervisor flag, even if KVM does not return it on + * GET_SUPPORTED_CPUID + */ +ret |= CPUID_EXT_HYPERVISOR; } else if (function == 0x8001 && reg == R_EDX) { /* On Intel, kvm returns cpuid according to the Intel spec, * so add missing bits according to the AMD spec: @@ -399,10 +404,8 @@ int kvm_arch_init_vcpu(CPUX86State *env) env->cpuid_features &= kvm_arch_get_supported_cpuid(s, 1, 0, R_EDX); -i = env->cpuid_ext_features & CPUID_EXT_HYPERVISOR; j = env->cpuid_ext_features & CPUID_EXT_TSC_DEADLINE_TIMER; env->cpuid_ext_features &= kvm_arch_get_supported_cpuid(s, 1, 0, R_ECX); -env->cpuid_ext_features |= i; if (j && kvm_irqchip_in_kernel() && kvm_check_extension(s, KVM_CAP_TSC_DEADLINE_TIMER)) { env->cpuid_ext_features |= CPUID_EXT_TSC_DEADLINE_TIMER; -- 1.7.11.4 -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
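For readers following along, here is a minimal standalone sketch of the idea in patch 08, not the QEMU code itself. The helper names apply_fixups(), filter_ext_features() and hypervisor_bit_demo() are hypothetical, and the raw "supported" value is passed in as a plain argument instead of coming from the GET_SUPPORTED_CPUID ioctl. It only shows why the caller no longer needs to save and restore the hypervisor bit around the AND.

```c
#include <assert.h>
#include <stdint.h>

/* Register selectors; names mirror QEMU's R_* constants. */
enum cpuid_reg { R_EAX, R_ECX, R_EDX, R_EBX };

/* CPUID[1].EDX feature bits. */
#define CPUID_MCE   (1u << 7)
#define CPUID_MTRR  (1u << 12)
#define CPUID_MCA   (1u << 14)
#define CPUID_PAT   (1u << 16)
/* CPUID[1].ECX bit 31: set when running under a hypervisor. */
#define CPUID_EXT_HYPERVISOR (1u << 31)

/* Hypothetical helper: apply the per-leaf fixups to the raw value that the
 * kernel reported for (function, reg). */
static uint32_t apply_fixups(uint32_t raw, uint32_t function,
                             enum cpuid_reg reg)
{
    uint32_t ret = raw;

    if (function == 1 && reg == R_EDX) {
        /* Bits that old kernels misreport as unsupported. */
        ret |= CPUID_MTRR | CPUID_PAT | CPUID_MCE | CPUID_MCA;
    } else if (function == 1 && reg == R_ECX) {
        /* QEMU can always expose the hypervisor flag on its own. */
        ret |= CPUID_EXT_HYPERVISOR;
    }
    return ret;
}

/* Caller side after the patch: a plain AND is enough, because the
 * hypervisor bit is already part of the "supported" mask. */
static uint32_t filter_ext_features(uint32_t requested, uint32_t raw_supported)
{
    return requested & apply_fixups(raw_supported, 1, R_ECX);
}

/* Small self-check of the intended behaviour. */
static void hypervisor_bit_demo(void)
{
    /* Guest asks for the hypervisor bit plus bit 0; the kernel reports
     * only bit 0 as supported. */
    uint32_t requested = CPUID_EXT_HYPERVISOR | 1u;

    assert(filter_ext_features(requested, 1u) == requested);
}
```

The design point is that every caller of the "supported" query sees the same answer, so no call site needs a special case for the flag.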
[PATCH 07/15] i386: kvm: kvm_arch_get_supported_cpuid: replace if+switch with single 'if'
Additional fixups will be added, and making them a single 'if/else if' chain makes it clearer than two nested switch statements. Signed-off-by: Eduardo Habkost --- target-i386/kvm.c | 23 +-- 1 file changed, 9 insertions(+), 14 deletions(-) diff --git a/target-i386/kvm.c b/target-i386/kvm.c index 7115921..084b40c 100644 --- a/target-i386/kvm.c +++ b/target-i386/kvm.c @@ -193,20 +193,15 @@ uint32_t kvm_arch_get_supported_cpuid(KVMState *s, uint32_t function, /* Fixups for the data returned by KVM, below */ -if (reg == R_EDX) { -switch (function) { -case 1: -/* KVM before 2.6.30 misreports the following features */ -ret |= CPUID_MTRR | CPUID_PAT | CPUID_MCE | CPUID_MCA; -break; -case 0x8001: -/* On Intel, kvm returns cpuid according to the Intel spec, - * so add missing bits according to the AMD spec: - */ -cpuid_1_edx = kvm_arch_get_supported_cpuid(s, 1, 0, R_EDX); -ret |= cpuid_1_edx & CPUID_EXT2_AMD_ALIASES; -break; -} +if (function == 1 && reg == R_EDX) { +/* KVM before 2.6.30 misreports the following features */ +ret |= CPUID_MTRR | CPUID_PAT | CPUID_MCE | CPUID_MCA; +} else if (function == 0x8001 && reg == R_EDX) { +/* On Intel, kvm returns cpuid according to the Intel spec, + * so add missing bits according to the AMD spec: + */ +cpuid_1_edx = kvm_arch_get_supported_cpuid(s, 1, 0, R_EDX); +ret |= cpuid_1_edx & CPUID_EXT2_AMD_ALIASES; } g_free(cpuid); -- 1.7.11.4 -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 14/15] i386: kvm: reformat filter_features_for_kvm() code
Cosmetic, but it will also help to make further patches easier to review.

Signed-off-by: Eduardo Habkost
---
 target-i386/cpu.c | 28 +---
 1 file changed, 13 insertions(+), 15 deletions(-)

diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index a5c1628..7921b1b 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -1366,22 +1366,20 @@ static void filter_features_for_kvm(X86CPU *cpu)
     CPUX86State *env = &cpu->env;
     KVMState *s = kvm_state;
 
-    env->cpuid_features &= kvm_arch_get_supported_cpuid(s, 1, 0, R_EDX);
-
-    env->cpuid_ext_features &= kvm_arch_get_supported_cpuid(s, 1, 0, R_ECX);
-
-    env->cpuid_ext2_features &= kvm_arch_get_supported_cpuid(s, 0x8001,
-                                                             0, R_EDX);
-    env->cpuid_ext3_features &= kvm_arch_get_supported_cpuid(s, 0x8001,
-                                                             0, R_ECX);
-    env->cpuid_svm_features &= kvm_arch_get_supported_cpuid(s, 0x800A,
-                                                            0, R_EDX);
-
+    env->cpuid_features &=
+        kvm_arch_get_supported_cpuid(s, 1, 0, R_EDX);
+    env->cpuid_ext_features &=
+        kvm_arch_get_supported_cpuid(s, 1, 0, R_ECX);
+    env->cpuid_ext2_features &=
+        kvm_arch_get_supported_cpuid(s, 0x8001, 0, R_EDX);
+    env->cpuid_ext3_features &=
+        kvm_arch_get_supported_cpuid(s, 0x8001, 0, R_ECX);
+    env->cpuid_svm_features &=
+        kvm_arch_get_supported_cpuid(s, 0x800A, 0, R_EDX);
     env->cpuid_kvm_features &=
-            kvm_arch_get_supported_cpuid(s, KVM_CPUID_FEATURES, 0, R_EAX);
-
-    env->cpuid_ext4_features &= kvm_arch_get_supported_cpuid(s, 0xC001,
-                                                             0, R_EDX);
+        kvm_arch_get_supported_cpuid(s, KVM_CPUID_FEATURES, 0, R_EAX);
+    env->cpuid_ext4_features &=
+        kvm_arch_get_supported_cpuid(s, 0xC001, 0, R_EDX);
 
 }
 #endif
-- 
1.7.11.4
[PATCH 05/15] i386: kvm: extract CPUID entry lookup to cpuid_find_entry() function
No behavior change, just code movement. Signed-off-by: Eduardo Habkost --- target-i386/kvm.c | 30 ++ 1 file changed, 22 insertions(+), 8 deletions(-) diff --git a/target-i386/kvm.c b/target-i386/kvm.c index ae51573..3519028 100644 --- a/target-i386/kvm.c +++ b/target-i386/kvm.c @@ -145,11 +145,28 @@ static uint32_t cpuid_entry_get_reg(struct kvm_cpuid_entry2 *entry, int reg) return ret; } +/* Find matching entry for function/index on kvm_cpuid2 struct + */ +static struct kvm_cpuid_entry2 *cpuid_find_entry(struct kvm_cpuid2 *cpuid, + uint32_t function, + uint32_t index) +{ +int i; +for (i = 0; i < cpuid->nent; ++i) { +if (cpuid->entries[i].function == function && +cpuid->entries[i].index == index) { +return &cpuid->entries[i]; +} +} +/* not found: */ +return NULL; +} + uint32_t kvm_arch_get_supported_cpuid(KVMState *s, uint32_t function, uint32_t index, int reg) { struct kvm_cpuid2 *cpuid; -int i, max; +int max; uint32_t ret = 0; uint32_t cpuid_1_edx; bool found = false; @@ -159,13 +176,10 @@ uint32_t kvm_arch_get_supported_cpuid(KVMState *s, uint32_t function, max *= 2; } -for (i = 0; i < cpuid->nent; ++i) { -if (cpuid->entries[i].function == function && -cpuid->entries[i].index == index) { -struct kvm_cpuid_entry2 *entry = &cpuid->entries[i]; -found = true; -ret = cpuid_entry_get_reg(entry, reg); -} +struct kvm_cpuid_entry2 *entry = cpuid_find_entry(cpuid, function, index); +if (entry) { +found = true; +ret = cpuid_entry_get_reg(entry, reg); } /* Fixups for the data returned by KVM, below */ -- 1.7.11.4 -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
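As a reading aid for patches 04 and 05, the two extracted helpers can be tried outside QEMU with simplified stand-in types. This is only a sketch: struct cpuid_entry and struct cpuid_table below are hypothetical reductions of struct kvm_cpuid_entry2 and struct kvm_cpuid2 (the real structures in the Linux KVM headers carry more fields), and lookup() is an illustrative wrapper, not a function from the series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the KVM CPUID structures. */
struct cpuid_entry {
    uint32_t function;
    uint32_t index;
    uint32_t eax, ebx, ecx, edx;
};

struct cpuid_table {
    uint32_t nent;                      /* number of valid entries */
    const struct cpuid_entry *entries;
};

enum cpuid_reg { R_EAX, R_ECX, R_EDX, R_EBX };

/* Find the entry matching function/index, or NULL if the leaf is absent. */
static const struct cpuid_entry *find_entry(const struct cpuid_table *t,
                                            uint32_t function, uint32_t index)
{
    assert(t != NULL);
    for (uint32_t i = 0; i < t->nent; i++) {
        if (t->entries[i].function == function &&
            t->entries[i].index == index) {
            return &t->entries[i];
        }
    }
    return NULL;
}

/* Return the value of one register of an entry. */
static uint32_t entry_get_reg(const struct cpuid_entry *e, enum cpuid_reg reg)
{
    switch (reg) {
    case R_EAX: return e->eax;
    case R_EBX: return e->ebx;
    case R_ECX: return e->ecx;
    case R_EDX: return e->edx;
    }
    return 0;
}

/* Combined lookup: 0 when the leaf is missing, like the 'ret = 0' default
 * in kvm_arch_get_supported_cpuid(). */
static uint32_t lookup(const struct cpuid_table *t, uint32_t function,
                       uint32_t index, enum cpuid_reg reg)
{
    const struct cpuid_entry *e = find_entry(t, function, index);

    return e ? entry_get_reg(e, reg) : 0;
}
```

Splitting the search from the register selection keeps each helper trivially testable, which is presumably what makes the later fixup patches easier to review.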
[PATCH 13/15] i386: kvm: filter CPUID feature words earlier, on cpu.c
cpu.c contains the code that will check if all requested CPU features are available, so the filtering of KVM features must be there, so we can implement "check" and "enforce" properly.

The only point where kvm_arch_init_vcpu() is called on i386 is:

- cpu_x86_init()
  - x86_cpu_realize() (after cpu_x86_register() is called)
    - qemu_init_vcpu()
      - qemu_kvm_start_vcpu()
        - qemu_kvm_thread_fn() (on a new thread)
          - kvm_init_vcpu()
            - kvm_arch_init_vcpu()

With this patch, the filtering will be done earlier, at:

- cpu_x86_init()
  - cpu_x86_register() (before x86_cpu_realize() is called)

Also, the KVM CPUID filtering will now be done at the same place where the TCG CPUID feature filtering is done. Later, the code can be changed to use the same filtering code for the "check" and "enforce" modes, as now the cpu.c code knows exactly which CPU features are going to be exposed to the guest (and much earlier).

One thing I was worrying about when doing this is that kvm_arch_get_supported_cpuid() depends on kvm_irqchip_in_kernel(), and maybe the 'kvm_kernel_irqchip' global variable wasn't initialized yet at CPU creation time. But kvm_kernel_irqchip is initialized during kvm_init(), that is called very early (much earlier than the machine init function), and kvm_init() is already a requirement to run the GET_SUPPORTED_CPUID ioctl() (as kvm_init() initializes the kvm_state global variable).

Side note: it would be nice to keep KVM-specific code inside kvm.c. The problem is that properly implementing -cpu check/enforce code (that's inside cpu.c) depends directly on the feature bit filtering done using kvm_arch_get_supported_cpuid(). Currently -cpu check/enforce is broken because it simply uses the host CPU feature bits instead of GET_SUPPORTED_CPUID, and we need to fix that.

Signed-off-by: Eduardo Habkost --- kvm.h | 1 + target-i386/cpu.c | 30 ++ target-i386/kvm.c | 18 -- 3 files changed, 31 insertions(+), 18 deletions(-) diff --git a/kvm.h b/kvm.h index dea2998..150f95d 100644 --- a/kvm.h +++ b/kvm.h @@ -20,6 +20,7 @@ #ifdef CONFIG_KVM #include +#include #endif extern int kvm_allowed; diff --git a/target-i386/cpu.c b/target-i386/cpu.c index bb1e44e..a5c1628 100644 --- a/target-i386/cpu.c +++ b/target-i386/cpu.c @@ -1360,6 +1360,32 @@ CpuDefinitionInfoList *arch_query_cpu_definitions(Error **errp) return cpu_list; } +#ifdef CONFIG_KVM +static void filter_features_for_kvm(X86CPU *cpu) +{ +CPUX86State *env = &cpu->env; +KVMState *s = kvm_state; + +env->cpuid_features &= kvm_arch_get_supported_cpuid(s, 1, 0, R_EDX); + +env->cpuid_ext_features &= kvm_arch_get_supported_cpuid(s, 1, 0, R_ECX); + +env->cpuid_ext2_features &= kvm_arch_get_supported_cpuid(s, 0x8001, + 0, R_EDX); +env->cpuid_ext3_features &= kvm_arch_get_supported_cpuid(s, 0x8001, + 0, R_ECX); +env->cpuid_svm_features &= kvm_arch_get_supported_cpuid(s, 0x800A, + 0, R_EDX); + +env->cpuid_kvm_features &= +kvm_arch_get_supported_cpuid(s, KVM_CPUID_FEATURES, 0, R_EAX); + +env->cpuid_ext4_features &= kvm_arch_get_supported_cpuid(s, 0xC001, + 0, R_EDX); + +} +#endif + int cpu_x86_register(X86CPU *cpu, const char *cpu_model) { CPUX86State *env = &cpu->env; @@ -1417,6 +1443,10 @@ int cpu_x86_register(X86CPU *cpu, const char *cpu_model) ); env->cpuid_ext3_features &= TCG_EXT3_FEATURES; env->cpuid_svm_features &= TCG_SVM_FEATURES; +} else { +#ifdef CONFIG_KVM +filter_features_for_kvm(cpu); +#endif } object_property_set_str(OBJECT(cpu), def->model_id, "model-id", &error); if (error_is_set(&error)) { diff --git a/target-i386/kvm.c b/target-i386/kvm.c index 335b3e7..0257083 100644 --- a/target-i386/kvm.c +++ b/target-i386/kvm.c @@ -410,30 +410,12 @@ int kvm_arch_init_vcpu(CPUX86State *env) struct kvm_cpuid2 cpuid; struct kvm_cpuid_entry2 entries[100]; } QEMU_PACKED cpuid_data; -KVMState *s = 
env->kvm_state; uint32_t limit, i, j, cpuid_i; uint32_t unused; struct kvm_cpuid_entry2 *c; uint32_t signature[3]; int r; -env->cpuid_features &= kvm_arch_get_supported_cpuid(s, 1, 0, R_EDX); - -env->cpuid_ext_features &= kvm_arch_get_supported_cpuid(s, 1, 0, R_ECX); - -env->cpuid_ext2_features &= kvm_arch_get_supported_cpuid(s, 0x8001, - 0, R_EDX); -env->cpuid_ext3_features &= kvm_arch_get_supported_cpuid(s, 0x8001, - 0, R_ECX); -env->cpuid_svm_features &= kvm_arch_get_supported_cpuid(s, 0x800A, -
[PATCH 04/15] i386: kvm: extract register switch to cpuid_entry_get_reg() function
No behavior change: just code movement. Signed-off-by: Eduardo Habkost --- target-i386/kvm.c | 37 +++-- 1 file changed, 23 insertions(+), 14 deletions(-) diff --git a/target-i386/kvm.c b/target-i386/kvm.c index 22e8564..ae51573 100644 --- a/target-i386/kvm.c +++ b/target-i386/kvm.c @@ -123,6 +123,28 @@ static int get_para_features(KVMState *s) } +/* Returns the value for a specific register on the cpuid entry + */ +static uint32_t cpuid_entry_get_reg(struct kvm_cpuid_entry2 *entry, int reg) +{ +uint32_t ret = 0; +switch (reg) { +case R_EAX: +ret = entry->eax; +break; +case R_EBX: +ret = entry->ebx; +break; +case R_ECX: +ret = entry->ecx; +break; +case R_EDX: +ret = entry->edx; +break; +} +return ret; +} + uint32_t kvm_arch_get_supported_cpuid(KVMState *s, uint32_t function, uint32_t index, int reg) { @@ -142,20 +164,7 @@ uint32_t kvm_arch_get_supported_cpuid(KVMState *s, uint32_t function, cpuid->entries[i].index == index) { struct kvm_cpuid_entry2 *entry = &cpuid->entries[i]; found = true; -switch (reg) { -case R_EAX: -ret = entry->eax; -break; -case R_EBX: -ret = entry->ebx; -break; -case R_ECX: -ret = entry->ecx; -break; -case R_EDX: -ret = entry->edx; -break; -} +ret = cpuid_entry_get_reg(entry, reg); } } -- 1.7.11.4 -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 03/15] i386: kvm: kvm_arch_get_supported_cpuid: use 'entry' variable
The reg switch will be moved to a separate function, so store the entry pointer in a variable. No behavior change, just code movement. Signed-off-by: Eduardo Habkost --- target-i386/kvm.c | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/target-i386/kvm.c b/target-i386/kvm.c index 9ebde66..22e8564 100644 --- a/target-i386/kvm.c +++ b/target-i386/kvm.c @@ -140,19 +140,20 @@ uint32_t kvm_arch_get_supported_cpuid(KVMState *s, uint32_t function, for (i = 0; i < cpuid->nent; ++i) { if (cpuid->entries[i].function == function && cpuid->entries[i].index == index) { +struct kvm_cpuid_entry2 *entry = &cpuid->entries[i]; found = true; switch (reg) { case R_EAX: -ret = cpuid->entries[i].eax; +ret = entry->eax; break; case R_EBX: -ret = cpuid->entries[i].ebx; +ret = entry->ebx; break; case R_ECX: -ret = cpuid->entries[i].ecx; +ret = entry->ecx; break; case R_EDX: -ret = cpuid->entries[i].edx; +ret = entry->edx; break; } } -- 1.7.11.4 -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 06/15] i386: kvm: extract try_get_cpuid() loop to get_supported_cpuid() function
No behavior change, just code movement. Signed-off-by: Eduardo Habkost --- target-i386/kvm.c | 19 ++- 1 file changed, 14 insertions(+), 5 deletions(-) diff --git a/target-i386/kvm.c b/target-i386/kvm.c index 3519028..7115921 100644 --- a/target-i386/kvm.c +++ b/target-i386/kvm.c @@ -98,6 +98,19 @@ static struct kvm_cpuid2 *try_get_cpuid(KVMState *s, int max) return cpuid; } +/* Run KVM_GET_SUPPORTED_CPUID ioctl(), allocating a buffer large enough + * for all entries. + */ +static struct kvm_cpuid2 *get_supported_cpuid(KVMState *s) +{ +struct kvm_cpuid2 *cpuid; +int max = 1; +while ((cpuid = try_get_cpuid(s, max)) == NULL) { +max *= 2; +} +return cpuid; +} + struct kvm_para_features { int cap; int feature; @@ -166,15 +179,11 @@ uint32_t kvm_arch_get_supported_cpuid(KVMState *s, uint32_t function, uint32_t index, int reg) { struct kvm_cpuid2 *cpuid; -int max; uint32_t ret = 0; uint32_t cpuid_1_edx; bool found = false; -max = 1; -while ((cpuid = try_get_cpuid(s, max)) == NULL) { -max *= 2; -} +cpuid = get_supported_cpuid(s); struct kvm_cpuid_entry2 *entry = cpuid_find_entry(cpuid, function, index); if (entry) { -- 1.7.11.4 -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
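The retry loop that patch 06 moves into get_supported_cpuid() is a plain "double the buffer until it fits" pattern. A minimal sketch of that pattern follows, with the ioctl replaced by a hypothetical stand-in: try_alloc() plays the role of try_get_cpuid(), and ENTRIES_NEEDED is an invented number of entries the kernel would want to report.

```c
#include <assert.h>
#include <stdlib.h>

/* Invented for the sketch: how many entries the "kernel" wants to return. */
#define ENTRIES_NEEDED 37

/* Hypothetical stand-in for try_get_cpuid(): returns NULL when 'max'
 * entries are not enough, otherwise a buffer the caller must free(). */
static int *try_alloc(int max, int *attempts)
{
    int *buf;

    (*attempts)++;
    if (max < ENTRIES_NEEDED) {
        return NULL;            /* too small, caller retries with more room */
    }
    buf = calloc((size_t)max, sizeof(*buf));
    assert(buf != NULL);        /* treat allocation failure as fatal */
    return buf;
}

/* Same shape as get_supported_cpuid(): start at 1 and double until the
 * request succeeds. */
static int *alloc_large_enough(int *max_out, int *attempts)
{
    int *buf;
    int max = 1;

    while ((buf = try_alloc(max, attempts)) == NULL) {
        max *= 2;
    }
    *max_out = max;
    return buf;
}
```

With 37 entries needed, the loop tries sizes 1, 2, 4, 8, 16, 32 and succeeds at 64, so the number of attempts grows only logarithmically with the table size.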
[PATCH 15/15] i386: kvm: filter CPUID leaf 7 based on GET_SUPPORTED_CPUID, too
Now that CPUID leaf 7 features can be enabled/disabled on the command-line, we need to filter them properly using GET_SUPPORTED_CPUID, at the same place where other features are filtered out.

Signed-off-by: Eduardo Habkost
---
 target-i386/cpu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/target-i386/cpu.c b/target-i386/cpu.c
index 7921b1b..d8cd40a 100644
--- a/target-i386/cpu.c
+++ b/target-i386/cpu.c
@@ -1376,6 +1376,8 @@ static void filter_features_for_kvm(X86CPU *cpu)
         kvm_arch_get_supported_cpuid(s, 0x8001, 0, R_ECX);
     env->cpuid_svm_features &=
         kvm_arch_get_supported_cpuid(s, 0x800A, 0, R_EDX);
+    env->cpuid_7_0_ebx_features &=
+        kvm_arch_get_supported_cpuid(s, 7, 0, R_EBX);
     env->cpuid_kvm_features &=
         kvm_arch_get_supported_cpuid(s, KVM_CPUID_FEATURES, 0, R_EAX);
     env->cpuid_ext4_features &=
-- 
1.7.11.4
[PATCH 02/15] i386: kvm: kvm_arch_get_supported_cpuid: clean up has_kvm_features check
Instead of a function-specific has_kvm_features variable, simply use a "found" variable that will be checked in case we have to use the legacy get_para_features() interface. No behavior change, just code cleanup. Signed-off-by: Eduardo Habkost --- target-i386/kvm.c | 8 +++- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/target-i386/kvm.c b/target-i386/kvm.c index 8b4ab34..9ebde66 100644 --- a/target-i386/kvm.c +++ b/target-i386/kvm.c @@ -130,7 +130,7 @@ uint32_t kvm_arch_get_supported_cpuid(KVMState *s, uint32_t function, int i, max; uint32_t ret = 0; uint32_t cpuid_1_edx; -int has_kvm_features = 0; +bool found = false; max = 1; while ((cpuid = try_get_cpuid(s, max)) == NULL) { @@ -140,9 +140,7 @@ uint32_t kvm_arch_get_supported_cpuid(KVMState *s, uint32_t function, for (i = 0; i < cpuid->nent; ++i) { if (cpuid->entries[i].function == function && cpuid->entries[i].index == index) { -if (cpuid->entries[i].function == KVM_CPUID_FEATURES) { -has_kvm_features = 1; -} +found = true; switch (reg) { case R_EAX: ret = cpuid->entries[i].eax; @@ -181,7 +179,7 @@ uint32_t kvm_arch_get_supported_cpuid(KVMState *s, uint32_t function, g_free(cpuid); /* fallback for older kernels */ -if (!has_kvm_features && (function == KVM_CPUID_FEATURES)) { +if ((function == KVM_CPUID_FEATURES) && !found) { ret = get_para_features(s); } -- 1.7.11.4 -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 01/15] i386: kvm: kvm_arch_get_supported_cpuid: move R_EDX hack outside of for loop
The for loop will become a separate function, so clean it up so it can become independent from the bit hacking for R_EDX. No behavior change[1], just code movement. [1] Well, only if the kernel returned CPUID leafs 1 or 0x8001 as unsupported, but there's no kernel version that does that. Signed-off-by: Eduardo Habkost --- target-i386/kvm.c | 31 ++- 1 file changed, 18 insertions(+), 13 deletions(-) diff --git a/target-i386/kvm.c b/target-i386/kvm.c index 5b18383..8b4ab34 100644 --- a/target-i386/kvm.c +++ b/target-i386/kvm.c @@ -155,24 +155,29 @@ uint32_t kvm_arch_get_supported_cpuid(KVMState *s, uint32_t function, break; case R_EDX: ret = cpuid->entries[i].edx; -switch (function) { -case 1: -/* KVM before 2.6.30 misreports the following features */ -ret |= CPUID_MTRR | CPUID_PAT | CPUID_MCE | CPUID_MCA; -break; -case 0x8001: -/* On Intel, kvm returns cpuid according to the Intel spec, - * so add missing bits according to the AMD spec: - */ -cpuid_1_edx = kvm_arch_get_supported_cpuid(s, 1, 0, R_EDX); -ret |= cpuid_1_edx & CPUID_EXT2_AMD_ALIASES; -break; -} break; } } } +/* Fixups for the data returned by KVM, below */ + +if (reg == R_EDX) { +switch (function) { +case 1: +/* KVM before 2.6.30 misreports the following features */ +ret |= CPUID_MTRR | CPUID_PAT | CPUID_MCE | CPUID_MCA; +break; +case 0x8001: +/* On Intel, kvm returns cpuid according to the Intel spec, + * so add missing bits according to the AMD spec: + */ +cpuid_1_edx = kvm_arch_get_supported_cpuid(s, 1, 0, R_EDX); +ret |= cpuid_1_edx & CPUID_EXT2_AMD_ALIASES; +break; +} +} + g_free(cpuid); /* fallback for older kernels */ -- 1.7.11.4 -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 00/15] QEMU KVM_GET_SUPPORTED_CPUID cleanups and fixes
Most of this series is just cleanups that will help when making -cpu check/enforce work properly, with some fixes.

In addition to code movements, the main changes are:

- x2apic won't be enabled if in-kernel irqchip is disabled (patch 10)
- CPUID feature bit filtering is done much earlier, and inside target-i386/cpu.c (patch 13)
- CPUID leaf 7 feature bits are now filtered based on GET_SUPPORTED_CPUID too (patch 15)

Eduardo Habkost (15):
  i386: kvm: kvm_arch_get_supported_cpuid: move R_EDX hack outside of for loop
  i386: kvm: kvm_arch_get_supported_cpuid: clean up has_kvm_features check
  i386: kvm: kvm_arch_get_supported_cpuid: use 'entry' variable
  i386: kvm: extract register switch to cpuid_entry_get_reg() function
  i386: kvm: extract CPUID entry lookup to cpuid_find_entry() function
  i386: kvm: extract try_get_cpuid() loop to get_supported_cpuid() function
  i386: kvm: kvm_arch_get_supported_cpuid: replace if+switch with single 'if'
  i386: kvm: set CPUID_EXT_HYPERVISOR on kvm_arch_get_supported_cpuid()
  i386: kvm: set CPUID_EXT_TSC_DEADLINE_TIMER on kvm_arch_get_supported_cpuid()
  i386: kvm: x2apic is not supported without in-kernel irqchip
  i386: kvm: mask cpuid_kvm_features earlier
  i386: kvm: mask cpuid_ext4_features bits earlier
  i386: kvm: filter CPUID feature words earlier, on cpu.c
  i386: kvm: reformat filter_features_for_kvm() code
  i386: kvm: filter CPUID leaf 7 based on GET_SUPPORTED_CPUID, too

 kvm.h             | 1 +
 target-i386/cpu.c | 30 +++
 target-i386/kvm.c | 153 --
 3 files changed, 122 insertions(+), 62 deletions(-)

-- 
1.7.11.4
Re: [Qemu-devel] Proposal for virtio standardization.
Rusty Russell writes:

> Hi all,
>
> I've had several requests for a more formal approach to the
> virtio draft spec, and (after some soul-searching) I'd like to try that.
>
> The proposal is to use OASIS as the standards body, as it's
> fairly light-weight as these things go. For me this means paperwork and
> setting up a Working Group and getting the right people involved as
> Voting members starting with the current contributors; for most of you
> it just means a new mailing list, though I'll be cross-posting any
> drafts and major changes here anyway.
>
> I believe that a documented standard (aka virtio 1.0) will
> increase visibility and adoption in areas outside our normal linux/kvm
> universe. There's been some of that already, but this is the clearest
> path to accelerate it. Not the easiest path, but I believe that a solid
> I/O standard is a Good Thing for everyone.
>
> Yet I also want to decouple new and experimental development
> from the standards effort; running code comes first. New feature bits
> and new device numbers should be reservable without requiring a full
> spec change.
>
> So the essence of my proposal is:
> 1) I start a Working Group within OASIS where we can aim for virtio spec
>    1.0.
>
> 2) The current spec is textually reordered so the core is clearly
>    bus-independent, with PCI, mmio, etc appendices.
>
> 3) Various clarifications, formalizations and cleanups to the spec text,
>    and possibly elimination of old deprecated features.
>
> 4) The only significant change to the spec is that we use PCI
>    capabilities, so we can have infinite feature bits.
>    (see
> http://lists.linuxfoundation.org/pipermail/virtualization/2011-December/019198.html)

We discussed this on IRC last night. I don't think PCI capabilities are a good mechanism to use...

PCI capabilities are there to organize how the PCI config space is allocated to allow vendor extensions to co-exist with future PCI extensions.

But we've never used the PCI config space within virtio-pci. We do everything in BAR0. I don't think there's any real advantage of using the config space vs. a BAR for virtio-pci.

The nice thing about using a BAR is that the region is MMIO or PIO. This maps really nicely to non-PCI transports too. But extending the PCI config space (especially dealing with capability allocation) is pretty gnarly and there isn't an obvious equivalent outside of PCI.

There are very few devices that we emulate today that make use of extended PCI device registers outside the platform devices (that have no BARs).

Regards,

Anthony Liguori

>
> 5) Changes to the ring layout and other such things are deferred to a
>    future virtio version; whether this is done within OASIS or
>    externally depends on how well this works for the 1.0 release.
>
> Thoughts?
> Rusty.
Re: [Qemu-devel] [patch 2/6] Use machine options to emulate -no-kvm-irqchip
Marcelo Tosatti writes:

> On Thu, Oct 04, 2012 at 04:30:26PM +0200, Jan Kiszka wrote:
>> On 2012-10-04 16:21, Anthony Liguori wrote:
>> > -no-kvm should be included too.
>>
>> Reminds me that we still need to agree on the final default accel strategy.
>
> Default accel=kvm for x86_64 targets, no fallback.

Ack.

BTW, if you want to override this, you can just do:

[machine]
accel=kvm:tcg

In /etc/qemu/target-i386.conf

So the distros can do whatever they like in their packages.

We should instruct users how to do this in an error message too in case they want to permanently disable kvm because they don't have appropriate hardware support.

Regards,

Anthony Liguori

> Markus's argument is pretty convincing to me:
>
> "Friendly perhaps, generating an infinite series of questions "why is my
> guest slow as molasses?" certainly.
>
> And for each instance of the question, there's an unknown number of
> users who give QEMU a quick try, screw up KVM unknowingly, observe the
> glacial speed, and conclude it's crap."
>
>> > I just ran across a user that was injecting '-no-kvm-irqchip' in their
>> > libvirt XML via a custom attribute. It turned out it was to work around
>> > broken MSI support in their funky guest they were running. It was the
>> > wrong solution to the problem but they were doing it regardless.
>> >
>> > The point is, there are users in the wild using these options. There's
>> > no reason to remove them if they are trivial to maintain (and they are
>> > in their current form).
>>
>> So let's define a consistent policy for them all:
>>  - warn on the command line on use
>>  - avoid adding them to the help or other user documentation
>>  - keep them until we rework the whole command line
>>
>> Jan
>>
>> --
>> Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
>> Corporate Competence Center Embedded Linux
Re: qemu-kvm: remove "boot=on|off" drive parameter compatibility
On 2012-10-04 19:21, Lucas Meneghel Rodrigues wrote: > On 10/04/2012 09:27 AM, Jan Kiszka wrote: >> On 2012-10-04 14:10, Lucas Meneghel Rodrigues wrote: >>> On 10/04/2012 07:48 AM, Jan Kiszka wrote: On 2012-10-03 15:19, Paolo Bonzini wrote: > Il 03/10/2012 12:57, Lucas Meneghel Rodrigues ha scritto: >> Yep, I did send patches with the testdev device present on qemu-kvm.git >> to qemu.git a while ago, but there were many comments on the review, I >> ended up not implementing everything that was asked and the patches were >> archived. >> >> If nobody wants to step up to port it, I'll re-read the original thread >> and will spin up new patches (and try to go through the end with it). >> Executing the KVM unittests is something that we can't afford to lose, >> so I'd say it's important on this last mile effort to get rid of >> qemu-kvm. > > Absolutely, IIRC the problem was that testdev did a little bit of > everything... let's see what's the functionality of testdev: > > - write (port 0xf1), can be replaced in autotest with: > -device isa-debugcon,iobase=0xf1,chardev=... > > - exit code (port 0xf4), see this series: > http://lists.gnu.org/archive/html/qemu-devel/2012-07/msg00818.html > > - ram size (port 0xd1). If we can also patch kvm-unittests, the memory > is available in the CMOS or in fwcfg. Here is the SeaBIOS code: > > u32 rs = ((inb_cmos(0x34) << 16) | (inb_cmos(0x35) << 24)); > if (rs) > rs += 16 * 1024 * 1024; > else > rs = (((inb_cmos(0x30) << 10) | (inb_cmos(0x31) << 18)) > + 1 * 1024 * 1024); > > The rest (ports 0xe0..0xe7, 0x2000..0x2017, MMIO) can be left in testdev. IIRC, one of the biggest problem with testdev was its hack to inject interrupts. >>> >>> Jan, I assume this commit helps to fix this, right? >>> >>> commit b334ec567f1de9a60349991e7b75083d569ddb0a >>> Author: Jan Kiszka >>> Date: Fri Mar 2 10:30:47 2012 +0100 >>> >>> qemu-kvm: Use upstream kvm-i8259 >>> >>> Drop the qemu-kvm version in favor of the equivalent upstream >>> implementation. 
This allows to move the i8259 back into the hwlib. >>> >>> Note that this also drops the testdev hack and restores proper >>> isa_get_irq. If testdev scripts exist that inject > IRQ15, they need >>> fixing. Testing for these interrupts on the PIIX3 makes no practical >>> sense anyway as those lines are unused. >>> >>> Signed-off-by: Jan Kiszka >>> Signed-off-by: Avi Kivity >> >> Yes, this improved it a lot as we no longer depend on additional >> changes. I'm not sure if there was resistance beyond that. >> >> When cleaning up the code: register_ioport_read must be replaced with >> the memory API. > > I did look at the MemoryRegionOps/memory_region_init_io and still did > not figure out how to port things. I'll send a v2 addressing all the > comments made so far but this one, just to see if people are OK with the > direction of the full patch, then if you could give me some pointers of > how to do this conversion, it'd be great. > See e.g. http://thread.gmane.org/gmane.comp.emulators.qemu/171217 Jan -- Siemens AG, Corporate Technology, CT RTC ITP SDP-DE Corporate Competence Center Embedded Linux -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
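As a worked example of the SeaBIOS RAM-size computation quoted in this thread, the same arithmetic can be run against a plain array instead of real CMOS reads. This is a sketch under assumptions: ram_size_from_cmos() is a hypothetical name, and the cmos[] array stands in for inb_cmos(). In the quoted code, registers 0x34/0x35 hold the memory above 16 MiB in 64 KiB units and registers 0x30/0x31 hold the memory above 1 MiB in KiB.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Compute the guest RAM size in bytes from a snapshot of the CMOS
 * registers, following the SeaBIOS logic quoted in the thread. */
static uint32_t ram_size_from_cmos(const uint8_t cmos[128])
{
    uint32_t rs;

    assert(cmos != NULL);

    /* Memory above 16 MiB, stored in 64 KiB units (low byte, high byte). */
    rs = ((uint32_t)cmos[0x34] << 16) | ((uint32_t)cmos[0x35] << 24);
    if (rs) {
        rs += 16u * 1024 * 1024;
    } else {
        /* Small guests: memory above 1 MiB, stored in KiB. */
        rs = (((uint32_t)cmos[0x30] << 10) | ((uint32_t)cmos[0x31] << 18))
             + 1u * 1024 * 1024;
    }
    return rs;
}
```

For a 128 MiB guest, the memory above 16 MiB is 112 MiB, or 0x0700 units of 64 KiB, so registers 0x34 and 0x35 hold 0x00 and 0x07 and the function returns 0x07000000 + 16 MiB = 128 MiB. For an 8 MiB guest, both of those registers are zero, registers 0x30 and 0x31 hold 0x00 and 0x1C (7168 KiB), and the result is 7 MiB + 1 MiB = 8 MiB.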
Re: qemu-kvm: remove "boot=on|off" drive parameter compatibility
On 10/04/2012 09:27 AM, Jan Kiszka wrote: On 2012-10-04 14:10, Lucas Meneghel Rodrigues wrote: On 10/04/2012 07:48 AM, Jan Kiszka wrote: On 2012-10-03 15:19, Paolo Bonzini wrote: Il 03/10/2012 12:57, Lucas Meneghel Rodrigues ha scritto: Yep, I did send patches with the testdev device present on qemu-kvm.git to qemu.git a while ago, but there were many comments on the review, I ended up not implementing everything that was asked and the patches were archived. If nobody wants to step up to port it, I'll re-read the original thread and will spin up new patches (and try to go through the end with it). Executing the KVM unittests is something that we can't afford to lose, so I'd say it's important on this last mile effort to get rid of qemu-kvm. Absolutely, IIRC the problem was that testdev did a little bit of everything... let's see what's the functionality of testdev: - write (port 0xf1), can be replaced in autotest with: -device isa-debugcon,iobase=0xf1,chardev=... - exit code (port 0xf4), see this series: http://lists.gnu.org/archive/html/qemu-devel/2012-07/msg00818.html - ram size (port 0xd1). If we can also patch kvm-unittests, the memory is available in the CMOS or in fwcfg. Here is the SeaBIOS code: u32 rs = ((inb_cmos(0x34) << 16) | (inb_cmos(0x35) << 24)); if (rs) rs += 16 * 1024 * 1024; else rs = (((inb_cmos(0x30) << 10) | (inb_cmos(0x31) << 18)) + 1 * 1024 * 1024); The rest (ports 0xe0..0xe7, 0x2000..0x2017, MMIO) can be left in testdev. IIRC, one of the biggest problem with testdev was its hack to inject interrupts. Jan, I assume this commit helps to fix this, right? commit b334ec567f1de9a60349991e7b75083d569ddb0a Author: Jan Kiszka Date: Fri Mar 2 10:30:47 2012 +0100 qemu-kvm: Use upstream kvm-i8259 Drop the qemu-kvm version in favor of the equivalent upstream implementation. This allows to move the i8259 back into the hwlib. Note that this also drops the testdev hack and restores proper isa_get_irq. 
If testdev scripts exist that inject > IRQ15, they need fixing. Testing for these interrupts on the PIIX3 makes no practical sense anyway as those lines are unused. Signed-off-by: Jan Kiszka Signed-off-by: Avi Kivity

Yes, this improved it a lot as we no longer depend on additional changes. I'm not sure if there was resistance beyond that. When cleaning up the code: register_ioport_read must be replaced with the memory API.

I did look at the MemoryRegionOps/memory_region_init_io and still did not figure out how to port things. I'll send a v2 addressing all the comments made so far but this one, just to see if people are OK with the direction of the full patch, then if you could give me some pointers on how to do this conversion, it'd be great.
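For readers following the RAM-size point, the SeaBIOS snippet quoted earlier in this message can be restated as a pure function. This is only a sketch: `ram_size_from_cmos` and its parameters are hypothetical names, and the four arguments stand in for the values that `inb_cmos(0x30)`, `inb_cmos(0x31)`, `inb_cmos(0x34)` and `inb_cmos(0x35)` would return. How kvm-unittests would actually read the CMOS is not shown here.

```c
#include <stdint.h>

/*
 * Hypothetical restatement of the quoted SeaBIOS calculation.
 * r34/r35: CMOS bytes counting memory above 16 MiB in 64 KiB units.
 * r30/r31: CMOS bytes counting memory above 1 MiB in 1 KiB units.
 */
static uint32_t ram_size_from_cmos(uint8_t r30, uint8_t r31,
                                   uint8_t r34, uint8_t r35)
{
    uint32_t rs = ((uint32_t)r34 << 16) | ((uint32_t)r35 << 24);

    if (rs)
        rs += 16u * 1024 * 1024;   /* add back the first 16 MiB */
    else
        rs = (((uint32_t)r30 << 10) | ((uint32_t)r31 << 18))
             + 1u * 1024 * 1024;   /* add back the first 1 MiB */
    return rs;
}
```

With r34 = 0x00 and r35 = 0x07 (1792 units of 64 KiB above 16 MiB) the function yields 128 MiB; with both zero and r31 = 0x1c it falls back to the second pair of registers and yields 8 MiB.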
Re: [Qemu-devel] [patch 2/6] Use machine options to emulate -no-kvm-irqchip
On Thu, Oct 04, 2012 at 04:30:26PM +0200, Jan Kiszka wrote: > On 2012-10-04 16:21, Anthony Liguori wrote: > > -no-kvm should be included too. > > Reminds me that we still need to agree on the final default accel strategy. Default accel=kvm for x86_64 targets, no fallback. Markus's argument is pretty convincing to me: "Friendly perhaps, generating an infinite series of questions "why is my guest slow as molasses?" certainly. And for each instance of the question, there's an unknown number of users who give QEMU a quick try, screw up KVM unknowingly, observe the glacial speed, and conclude it's crap." > > I just ran across a user that was injecting '-no-kvm-irqchip' in their > > libvirt XML via a custom attribute. It turned out it was to work around > > broken MSI support in their funky guest they were running. It was the > > wrong solution to the problem but they were doing it regardless. > > > > The point is, there are users in the wild using these options. There's > > no reason to remove them if they are trivial to maintain (and they are > > in their current form). > > So let's define a consistent policy for them all: > - warn on the command line on use > - avoid adding them to the help or other user documentation > - keep them until we rework the whole command line > > Jan > > -- > Siemens AG, Corporate Technology, CT RTC ITP SDP-DE > Corporate Competence Center Embedded Linux -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] [patch 2/6] Use machine options to emulate -no-kvm-irqchip
On Thu, Oct 04, 2012 at 05:36:38PM +0200, Andreas Färber wrote: > Am 04.10.2012 16:30, schrieb Jan Kiszka: > > On 2012-10-04 16:21, Anthony Liguori wrote: > >> -no-kvm should be included too. > > > > Reminds me that we still need to agree on the final default accel strategy. > > > >> > >> I just ran across a user that was injecting '-no-kvm-irqchip' in their > >> libvirt XML via a custom attribute. It turned out it was to work around > >> broken MSI support in their funky guest they were running. It was the > >> wrong solution to the problem but they were doing it regardless. > >> > >> The point is, there are users in the wild using these options. There's > >> no reason to remove them if they are trivial to maintain (and they are > >> in their current form). > > > > So let's define a consistent policy for them all: > > - warn on the command line on use > > > - avoid adding them to the help or other user documentation > > That's dangerous - at some point someone will notice and propose a patch > documenting them and the reviewers may have forgotten by then why it was > not documented in the first place. Better clearly document them in help > output as "DEPRECATED, to be removed in future versions" or so. Adding the policy to docs/qemukvm-compat-commands-policy.txt should workaround that problem. And a friendly text on the announce e-mail. > Andreas > > > - keep them until we rework the whole command line > > > > Jan > > -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/2] vfio: Fix PCI INTx disable consistency
The virq_disabled flag tracks the userspace view of INTx masking across interrupt mode changes, but we're not consistently applying this to the interrupt and masking handler notion of the device. Currently if the user sets DisINTx while in MSI or MSIX mode, then returns to INTx mode (ex. rebooting a qemu guest), the hardware has DisINTx+, but the management of INTx thinks it's enabled, making it impossible to actually clear DisINTx. Fix this by updating the handler state when INTx is re-enabled.

Cc: sta...@vger.kernel.org
Signed-off-by: Alex Williamson
---
 drivers/vfio/pci/vfio_pci_intrs.c |   13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
index c8139a5..3639371 100644
--- a/drivers/vfio/pci/vfio_pci_intrs.c
+++ b/drivers/vfio/pci/vfio_pci_intrs.c
@@ -366,6 +366,17 @@ static int vfio_intx_enable(struct vfio_pci_device *vdev)
 		return -ENOMEM;
 
 	vdev->num_ctx = 1;
+
+	/*
+	 * If the virtual interrupt is masked, restore it.  Devices
+	 * supporting DisINTx can be masked at the hardware level
+	 * here, non-PCI-2.3 devices will have to wait until the
+	 * interrupt is enabled.
+	 */
+	vdev->ctx[0].masked = vdev->virq_disabled;
+	if (vdev->pci_2_3)
+		pci_intx(vdev->pdev, !vdev->ctx[0].masked);
+
 	vdev->irq_type = VFIO_PCI_INTX_IRQ_INDEX;
 
 	return 0;
@@ -419,7 +430,7 @@ static int vfio_intx_set_signal(struct vfio_pci_device *vdev, int fd)
 	 * disable_irq won't.
 	 */
 	spin_lock_irqsave(&vdev->irqlock, flags);
-	if (!vdev->pci_2_3 && (vdev->ctx[0].masked || vdev->virq_disabled))
+	if (!vdev->pci_2_3 && vdev->ctx[0].masked)
 		disable_irq_nosync(pdev->irq);
 	spin_unlock_irqrestore(&vdev->irqlock, flags);
 
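The mismatch described in the commit message above can be illustrated with a small user-space model. This is a simplified sketch under stated assumptions, not the driver code: `struct intx_model` and the `model_*` functions are hypothetical, and the "clear" path is reduced to "do nothing unless the handler believes the line is masked", which is how the commit message describes the problem.

```c
#include <stdbool.h>

/* Hypothetical model of the three views of INTx state. */
struct intx_model {
    bool virq_disabled; /* userspace view (DisINTx as set by the user) */
    bool masked;        /* handler view (stands in for ctx[0].masked)  */
    bool hw_disintx;    /* hardware view (DisINTx bit in the device)   */
    bool pci_2_3;       /* device supports DisINTx                     */
};

/* User sets DisINTx, e.g. while the device is in MSI or MSI-X mode. */
static void model_user_set_disintx(struct intx_model *m)
{
    m->virq_disabled = true;
    if (m->pci_2_3)
        m->hw_disintx = true;
}

/* Return to INTx mode; 'patched' selects the behaviour added by the fix. */
static void model_intx_enable(struct intx_model *m, bool patched)
{
    if (patched) {
        m->masked = m->virq_disabled;      /* restore handler state */
        if (m->pci_2_3)
            m->hw_disintx = m->masked;     /* models pci_intx(pdev, !masked) */
    } else {
        m->masked = false;                 /* handler assumes "enabled" */
    }
}

/* User clears DisINTx; returns true if the hardware was actually touched. */
static bool model_user_clear_disintx(struct intx_model *m)
{
    m->virq_disabled = false;
    if (!m->masked)
        return false;   /* handler thinks INTx is already enabled */
    m->masked = false;
    if (m->pci_2_3)
        m->hw_disintx = false;
    return true;
}
```

In the unpatched path the model ends with the hardware bit still set after the user clears DisINTx; in the patched path the handler state follows virq_disabled and the clear goes through.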
[PATCH 1/2] vfio: Move PCI INTx eventfd setting earlier
We need to be ready to receive an interrupt as soon as we call request_irq, so our eventfd context setting needs to be moved earlier. Without this, an interrupt from our device or one sharing the interrupt line can pass a NULL into eventfd_signal and oops.

Cc: sta...@vger.kernel.org
Signed-off-by: Alex Williamson
---
 drivers/vfio/pci/vfio_pci_intrs.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
index d8dedc7..c8139a5 100644
--- a/drivers/vfio/pci/vfio_pci_intrs.c
+++ b/drivers/vfio/pci/vfio_pci_intrs.c
@@ -400,19 +400,20 @@ static int vfio_intx_set_signal(struct vfio_pci_device *vdev, int fd)
 		return PTR_ERR(trigger);
 	}
 
+	vdev->ctx[0].trigger = trigger;
+
 	if (!vdev->pci_2_3)
 		irqflags = 0;
 
 	ret = request_irq(pdev->irq, vfio_intx_handler,
 			  irqflags, vdev->ctx[0].name, vdev);
 	if (ret) {
+		vdev->ctx[0].trigger = NULL;
 		kfree(vdev->ctx[0].name);
 		eventfd_ctx_put(trigger);
 		return ret;
 	}
 
-	vdev->ctx[0].trigger = trigger;
-
 	/*
 	 * INTx disable will stick across the new irq setup,
 	 * disable_irq won't.
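The ordering problem this patch addresses can be reproduced in a user-space sketch. Everything below is hypothetical scaffolding: `fake_request_irq` pretends that a shared interrupt fires the moment the handler is registered, and a plain counter stands in for the eventfd. It only illustrates why the trigger has to be published before registration and reset on failure.

```c
#include <stddef.h>

/* Hypothetical stand-in for the device state touched by the patch. */
struct fake_dev {
    int *trigger;     /* stands in for vdev->ctx[0].trigger      */
    int null_hits;    /* times the handler ran with trigger NULL */
};

/* Stand-in for the interrupt handler: signals the trigger. */
static void fake_intx_handler(struct fake_dev *dev)
{
    if (dev->trigger == NULL) {
        dev->null_hits++;      /* the real handler would oops here */
        return;
    }
    (*dev->trigger)++;         /* models eventfd_signal() */
}

/* Registration that may fail, or deliver an interrupt immediately. */
static int fake_request_irq(struct fake_dev *dev, int fail)
{
    if (fail)
        return -1;
    fake_intx_handler(dev);    /* shared line fires right away */
    return 0;
}

/* 'early' selects the patched ordering (trigger set before registration). */
static int fake_set_signal(struct fake_dev *dev, int *trigger,
                           int early, int fail)
{
    int ret;

    if (early)
        dev->trigger = trigger;

    ret = fake_request_irq(dev, fail);
    if (ret) {
        dev->trigger = NULL;   /* undo on failure, as the patch does */
        return ret;
    }

    if (!early)
        dev->trigger = trigger;
    return 0;
}
```

With the early ordering the immediate interrupt is delivered to the counter; with the late ordering the handler observes a NULL trigger, which is the case the patch closes.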
[PATCH 0/2] VFIO PCI INTx fixes
These patches are now available in my next tree to fix a couple issues with PCI INTx. Thanks,

Alex

---

Alex Williamson (2):
      vfio: Fix PCI INTx disable consistency
      vfio: Move PCI INTx eventfd setting earlier

 drivers/vfio/pci/vfio_pci_intrs.c |   18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)
Re: [Qemu-devel] [patch 2/6] Use machine options to emulate -no-kvm-irqchip
On 2012-10-04 18:13, Andreas Färber wrote: > Am 04.10.2012 18:04, schrieb Jan Kiszka: >> On 2012-10-04 17:36, Andreas Färber wrote: >>> Am 04.10.2012 16:30, schrieb Jan Kiszka: On 2012-10-04 16:21, Anthony Liguori wrote: - avoid adding them to the help or other user documentation >>> >>> That's dangerous - at some point someone will notice and propose a patch >>> documenting them and the reviewers may have forgotten by then why it was >>> not documented in the first place. Better clearly document them in help >>> output as "DEPRECATED, to be removed in future versions" or so. >> >> -M is marked as deprecated in the file that you would have to mess up. > > If it's not within +/- 3 lines and goes through qemu-trivial then that > point is moot. Don't see how deprecation of -no-kvm-irqchip et al. is > related to -M being deprecated either way. I'm only citing an example that these switches should follow: [qemu-options.hx] HXCOMM Deprecated by -machine DEF("M", HAS_ARG, QEMU_OPTION_M, "", QEMU_ARCH_ALL) Jan -- Siemens AG, Corporate Technology, CT RTC ITP SDP-DE Corporate Competence Center Embedded Linux -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: vfio: missing patch in linux 3.6
On Thu, 2012-10-04 at 18:14 +0200, Roedel, Joerg wrote: > On Thu, Oct 04, 2012 at 09:26:39AM -0600, Alex Williamson wrote: > > On Thu, 2012-10-04 at 17:13 +0200, Andreas Hartmann wrote: > > > Hello Alex, > > > > > > I just tested vfio as part of linux 3.6 and detected, that it doesn't > > > work because of the following missing patch: > > > > > > http://news.gmane.org/find-root.php?group=gmane.linux.kernel.pci&article=16422 > > > > > > Could you please get it applied for the next stable release? > > > > Hi Andreas, > > > > This patch needs to go through Bjorn's PCI tree, but as noted in the > > comments I have some outstanding questions that need to be answered, > > probably by Joerg. Thanks, > > What are the open questions? I have confirmation from the hardware > people that the south bridges are peer-2-peer safe. Other questions? Hi Joerg, There are a couple questions in the link above. Since the devices don't expose a PCIe capability, we probably need to add a check to look at the upstream device and verify we're not on a legacy bus where ACS can't be enforced. Then there's the general question of whether the confirmation of no peer-to-peer applies to every case where we might see this device (some of them seem to have history that pre-dates this specific package implementation) or do we need to try to identify specific package properties in addition to just a device ID? Thanks, Alex -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: vfio: missing patch in linux 3.6
On Thu, Oct 04, 2012 at 09:26:39AM -0600, Alex Williamson wrote:
> On Thu, 2012-10-04 at 17:13 +0200, Andreas Hartmann wrote:
> > Hello Alex,
> >
> > I just tested vfio as part of linux 3.6 and detected, that it doesn't
> > work because of the following missing patch:
> >
> > http://news.gmane.org/find-root.php?group=gmane.linux.kernel.pci&article=16422
> >
> > Could you please get it applied for the next stable release?
>
> Hi Andreas,
>
> This patch needs to go through Bjorn's PCI tree, but as noted in the
> comments I have some outstanding questions that need to be answered,
> probably by Joerg. Thanks,

What are the open questions? I have confirmation from the hardware people that the south bridges are peer-2-peer safe. Other questions?

Joerg

-- 
AMD Operating System Research Center
Advanced Micro Devices GmbH
Einsteinring 24
85609 Dornach
General Managers: Alberto Bozzo
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632
Re: [Qemu-devel] [patch 2/6] Use machine options to emulate -no-kvm-irqchip
Am 04.10.2012 18:04, schrieb Jan Kiszka: > On 2012-10-04 17:36, Andreas Färber wrote: >> Am 04.10.2012 16:30, schrieb Jan Kiszka: >>> On 2012-10-04 16:21, Anthony Liguori wrote: >>> - avoid adding them to the help or other user documentation >> >> That's dangerous - at some point someone will notice and propose a patch >> documenting them and the reviewers may have forgotten by then why it was >> not documented in the first place. Better clearly document them in help >> output as "DEPRECATED, to be removed in future versions" or so. > > -M is marked as deprecated in the file that you would have to mess up. If it's not within +/- 3 lines and goes through qemu-trivial then that point is moot. Don't see how deprecation of -no-kvm-irqchip et al. is related to -M being deprecated either way. Andreas -- SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC PATCH] KVM: x86: Skip request checking branches in vcpu_enter_guest() more effectively
On Mon, 24 Sep 2012 09:16:12 +0200 Gleb Natapov wrote:

> Yes, for guests that do not enable steal time KVM_REQ_STEAL_UPDATE
> should be never set, but currently it is. The patch (not tested) should
> fix this.

Thinking a bit more about KVM_REQ_STEAL_UPDATE...

> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 901ad00..01572f5 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -1544,6 +1544,8 @@ static void accumulate_steal_time(struct kvm_vcpu *vcpu)
>  	delta = current->sched_info.run_delay - vcpu->arch.st.last_steal;
>  	vcpu->arch.st.last_steal = current->sched_info.run_delay;
>  	vcpu->arch.st.accum_steal = delta;
> +
> +	kvm_make_request(KVM_REQ_STEAL_UPDATE, vcpu);
>  }

Even without this flag, we know when we should record the steal time: that's when vcpu->arch.st.accum_steal != 0.

>
>  static void record_steal_time(struct kvm_vcpu *vcpu)

So we can live without KVM_REQ_STEAL_UPDATE and make record_steal_time() do the job when vcpu->arch.st.accum_steal != 0 so that it can be called unconditionally in vcpu_enter_guest().

Why didn't we do that? The only reason I can think of is we wanted to eliminate that check for the zero-requests case. But does such a micro-optimization really help us?

Thanks,
	Takuya
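The idea floated above, letting the record step decide for itself based on the accumulated value, might look roughly like the following user-space model. This is a sketch, not the x86.c code: `struct steal_model`, its fields and the `model_*` functions are hypothetical, and the guest-visible structure is reduced to a single counter.

```c
#include <stdint.h>

/* Hypothetical model of the per-vcpu steal-time bookkeeping. */
struct steal_model {
    uint64_t run_delay;    /* stands in for current->sched_info.run_delay */
    uint64_t last_steal;
    uint64_t accum_steal;
    uint64_t guest_steal;  /* value the guest would see */
    int enabled;           /* guest enabled steal time */
};

/* Accumulate the delay seen since the last call; no request bit is set. */
static void model_accumulate(struct steal_model *s)
{
    uint64_t delta = s->run_delay - s->last_steal;

    s->last_steal = s->run_delay;
    s->accum_steal = delta;
}

/*
 * Safe to call on every guest entry: it does nothing unless steal time
 * is enabled and something was accumulated. Returns 1 if it published.
 */
static int model_record(struct steal_model *s)
{
    if (!s->enabled || s->accum_steal == 0)
        return 0;

    s->guest_steal += s->accum_steal;
    s->accum_steal = 0;
    return 1;
}
```

The cost of this shape is one extra load and compare on each entry, which is the micro-optimization question raised in the message.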
Re: [Qemu-devel] [patch 2/6] Use machine options to emulate -no-kvm-irqchip
On 2012-10-04 17:36, Andreas Färber wrote: > Am 04.10.2012 16:30, schrieb Jan Kiszka: >> On 2012-10-04 16:21, Anthony Liguori wrote: >>> -no-kvm should be included too. >> >> Reminds me that we still need to agree on the final default accel strategy. >> >>> >>> I just ran across a user that was injecting '-no-kvm-irqchip' in their >>> libvirt XML via a custom attribute. It turned out it was to work around >>> broken MSI support in their funky guest they were running. It was the >>> wrong solution to the problem but they were doing it regardless. >>> >>> The point is, there are users in the wild using these options. There's >>> no reason to remove them if they are trivial to maintain (and they are >>> in their current form). >> >> So let's define a consistent policy for them all: >> - warn on the command line on use > >> - avoid adding them to the help or other user documentation > > That's dangerous - at some point someone will notice and propose a patch > documenting them and the reviewers may have forgotten by then why it was > not documented in the first place. Better clearly document them in help > output as "DEPRECATED, to be removed in future versions" or so. -M is marked as deprecated in the file that you would have to mess up. Jan -- Siemens AG, Corporate Technology, CT RTC ITP SDP-DE Corporate Competence Center Embedded Linux -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: vfio: missing patch in linux 3.6
Alex Williamson wrote: > On Thu, 2012-10-04 at 17:13 +0200, Andreas Hartmann wrote: >> Hello Alex, >> >> I just tested vfio as part of linux 3.6 and detected, that it >> doesn't work because of the following missing patch: >> >> http://news.gmane.org/find-root.php?group=gmane.linux.kernel.pci&article=16422 >> >> >> Could you please get it applied for the next stable release? > > Hi Andreas, > > This patch needs to go through Bjorn's PCI tree, but as noted in > the comments I have some outstanding questions that need to be > answered, probably by Joerg. Thanks, Sorry Alex, I missed your additional questions because this patch just works pretty fine here with kernel 3.4.x. But you're right, your questions should be resolved before freeing it to the wild. @Joerg: Please, liberate me from compiling and patching each kernel myself to get it working and give all the other users out there a chance to use this pretty code and the IOMMU support of AMD's hardware :-) by answering Alex' questions. Thanks, kind regards, Andreas -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 6/6] KVM: booke/bookehv: Add debug stub support
On 04.10.2012, at 17:19, Bhushan Bharat-R65777 wrote: > > >>> -static int emulation_exit(struct kvm_run *run, struct kvm_vcpu >>> *vcpu) >>> +static int emulation_exit(struct kvm_run *run, struct kvm_vcpu *vcpu, >>> + int exit_nr) >>> { >>> enum emulation_result er; >>> >>> + if (unlikely(vcpu->guest_debug & KVM_GUESTDBG_USE_SW_BP) && >>> +vcpu->arch.last_inst == KVMPPC_INST_GUEST_GDB) { >> >> This belongs into the normal emulation code path, behind the same >> switch() that everything else goes through. > > I am not sure I understood correctly. Below is the reason why I > placed this code here. > Instruction where software breakpoint is to be set is replaced by "ehpriv" instruction. On e500v2, this is not a valid instruction can causes program interrupt. On e500mc, "ehpriv" is a valid instruction. Both the exit path calls emulation_exit(), so we have placed the code in this >> function. > Do you want this code to be moved in program interrupt exit path for > e500v2 and BOOKE_INTERRUPT_HV_PRIV for e500mc? Ok, in this patch you do (basically): int emulation_exit() { if (inst == DEBUG_INST) { debug_stuff(); return; } switch (inst) { case INST_A: foo(); } } >>> >>> Are not we doing something like this: >>> int emulation_exit() >>> { >>>if (inst == DEBUG_INST) { >>>debug_stuff(); >>>return; >>>} >>> >>>status = kvmppc_emulate_instruction() >>>switch (status) { >>>case FAIL: >>>foo(); >>>case DONE: >>> foo1(); >>> >>>} >>> } >>> >>> Do you want something like this: >>> >>> int emulation_exit() >>> { >>> >>>status = kvmppc_emulate_instruction() >>>switch (status) { >>>case FAIL: >>> if (inst == DEBUG_INST) { >>> debug_stuff(); >>> return; >>> } >>>foo(); >>> >>>case DONE: >>> foo1(); >>> >>>} >>> } >> >> No, I want the DEBUG_INST be handled the same as any other instruction we >> emulate. 
> > I would like to understand what you are thinking: > What I derived is , add the instruction in kvmppc_emulate_instruction() (or > its child function) which, > 1) fill the relevant information in run-> , kvmppc_account_exit(vcpu, > DEBUG_EXITS); and returns EMULATION_DONE > And in emulation_exit() > status = kvmppc_emulate_instruction() > switch (status) { > case EMULATION_DONE: > if (inst == DEBUG) > return RESUME_HOST; > } > Or > 2) kvmppc_account_exit(vcpu, DEBUG_EXITS); returns EMULATION_DONE; > And in emulation_exit() > status = kvmppc_emulate_instruction() > switch (status) { > case EMULATION_DONE: > if (inst == DEBUG) { > fill run-> > return RESUME_HOST; > } > } > > Or > 3) kvmppc_account_exit(vcpu, DEBUG_EXITS); returns a new status type > (EMULATION_DEBUG_INST) > And in emulation_exit() > status = kvmppc_emulate_instruction() > switch (status) { > case EMULATION_DEBUG_INST: > fill run-> > return RESUME_HOST; > } This one :). > >> what I want is: int emulation_exit() { switch (inst) { case INST_A: foo(); break; case DEBUG_INST: debug_stuff(); break; } } > > >> >>> + run->exit_reason = KVM_EXIT_DEBUG; >>> + run->debug.arch.pc = vcpu->arch.pc; >>> + run->debug.arch.exception = exit_nr; >>> + run->debug.arch.status = 0; >>> + kvmppc_account_exit(vcpu, DEBUG_EXITS); >>> + return RESUME_HOST; >>> + } >>> + >>> er = kvmppc_emulate_instruction(run, vcpu); >>> switch (er) { >>> case EMULATE_DONE: >>> @@ -697,6 +711,44 @@ static int emulation_exit(struct kvm_run >>> *run, struct >> kvm_vcpu *vcpu) >>> default: >>> BUG(); >>> } >>> + >>> + if (unlikely(vcpu->guest_debug & KVM_GUESTDBG_ENABLE) && >>> + (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)) { >> >> I don't understand how this is supposed to work. When we enable >> singlestep, why would we end up in emulation_exit()? > > When singlestep is enabled then we set DBCR0[ICMP] and the debug > handler should be able to handle this. I think you are right. > >> >>> + run->exit_reason = KVM_EXIT_DEBUG; >>> + return RES
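Option 3 from the exchange above, the one answered with "This one :)", can be sketched as a small stand-alone model. All names here (`MODEL_*`, `model_emulate_instruction`, `model_emulation_exit`, `struct model_run`) are hypothetical and only mirror the shape of the pseudocode in the thread; the opcode values and the handling of the failure case are assumptions.

```c
/* Hypothetical status and resume codes mirroring the thread's pseudocode. */
enum model_emul   { MODEL_EMULATE_DONE, MODEL_EMULATE_FAIL,
                    MODEL_EMULATE_DEBUG_INST };
enum model_resume { MODEL_RESUME_GUEST, MODEL_RESUME_HOST };

/* Made-up opcode values, for illustration only. */
#define MODEL_INST_A      0x1u
#define MODEL_INST_DEBUG  0x2u

/* Reduced stand-in for the debug fields filled in struct kvm_run. */
struct model_run {
    int is_debug_exit;
    unsigned long pc;
    int exception;
};

/* The debug instruction goes through the same switch as any other. */
static enum model_emul model_emulate_instruction(unsigned int inst)
{
    switch (inst) {
    case MODEL_INST_A:
        return MODEL_EMULATE_DONE;
    case MODEL_INST_DEBUG:
        return MODEL_EMULATE_DEBUG_INST;   /* new status type */
    default:
        return MODEL_EMULATE_FAIL;
    }
}

/* The exit path only reacts to the status; it never inspects the opcode. */
static enum model_resume model_emulation_exit(struct model_run *run,
                                              unsigned int inst,
                                              unsigned long pc, int exit_nr)
{
    switch (model_emulate_instruction(inst)) {
    case MODEL_EMULATE_DONE:
        return MODEL_RESUME_GUEST;
    case MODEL_EMULATE_DEBUG_INST:
        run->is_debug_exit = 1;
        run->pc = pc;
        run->exception = exit_nr;
        return MODEL_RESUME_HOST;
    default:
        return MODEL_RESUME_HOST;          /* assumption: failures go to the host */
    }
}
```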
Re: [Qemu-devel] [patch 2/6] Use machine options to emulate -no-kvm-irqchip
Am 04.10.2012 16:30, schrieb Jan Kiszka: > On 2012-10-04 16:21, Anthony Liguori wrote: >> -no-kvm should be included too. > > Reminds me that we still need to agree on the final default accel strategy. > >> >> I just ran across a user that was injecting '-no-kvm-irqchip' in their >> libvirt XML via a custom attribute. It turned out it was to work around >> broken MSI support in their funky guest they were running. It was the >> wrong solution to the problem but they were doing it regardless. >> >> The point is, there are users in the wild using these options. There's >> no reason to remove them if they are trivial to maintain (and they are >> in their current form). > > So let's define a consistent policy for them all: > - warn on the command line on use > - avoid adding them to the help or other user documentation That's dangerous - at some point someone will notice and propose a patch documenting them and the reviewers may have forgotten by then why it was not documented in the first place. Better clearly document them in help output as "DEPRECATED, to be removed in future versions" or so. Andreas > - keep them until we rework the whole command line > > Jan > -- SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: vfio: missing patch in linux 3.6
On Thu, 2012-10-04 at 17:13 +0200, Andreas Hartmann wrote:
> Hello Alex,
>
> I just tested vfio as part of linux 3.6 and detected, that it doesn't
> work because of the following missing patch:
>
> http://news.gmane.org/find-root.php?group=gmane.linux.kernel.pci&article=16422
>
> Could you please get it applied for the next stable release?

Hi Andreas,

This patch needs to go through Bjorn's PCI tree, but as noted in the comments I have some outstanding questions that need to be answered, probably by Joerg. Thanks,

Alex
vfio: missing patch in linux 3.6
Hello Alex, I just tested vfio as part of linux 3.6 and detected, that it doesn't work because of the following missing patch: http://news.gmane.org/find-root.php?group=gmane.linux.kernel.pci&article=16422 Could you please get it applied for the next stable release? Thanks, kind regards, Andreas Hartmann
RE: [PATCH 6/6] KVM: booke/bookehv: Add debug stub support
> > -static int emulation_exit(struct kvm_run *run, struct kvm_vcpu > > *vcpu) > > +static int emulation_exit(struct kvm_run *run, struct kvm_vcpu *vcpu, > > + int exit_nr) > > { > > enum emulation_result er; > > > > + if (unlikely(vcpu->guest_debug & KVM_GUESTDBG_USE_SW_BP) && > > +vcpu->arch.last_inst == KVMPPC_INST_GUEST_GDB) { > > This belongs into the normal emulation code path, behind the same > switch() that everything else goes through. > >>> > >>> I am not sure I understood correctly. Below is the reason why I > >>> placed this > >> code here. > >>> Instruction where software breakpoint is to be set is replaced by "ehpriv" > >> instruction. On e500v2, this is not a valid instruction can causes > >> program interrupt. On e500mc, "ehpriv" is a valid instruction. Both > >> the exit path calls emulation_exit(), so we have placed the code in this > function. > >>> Do you want this code to be moved in program interrupt exit path for > >>> e500v2 > >> and BOOKE_INTERRUPT_HV_PRIV for e500mc? > >> > >> Ok, in this patch you do (basically): > >> > >> int emulation_exit() > >> { > >>if (inst == DEBUG_INST) { > >>debug_stuff(); > >>return; > >>} > >> > >>switch (inst) { > >>case INST_A: > >>foo(); > >> > >>} > >> } > > > > Are not we doing something like this: > > int emulation_exit() > > { > > if (inst == DEBUG_INST) { > > debug_stuff(); > > return; > > } > > > > status = kvmppc_emulate_instruction() > > switch (status) { > > case FAIL: > > foo(); > > case DONE: > > foo1(); > > > > } > > } > > > > Do you want something like this: > > > > int emulation_exit() > > { > > > > status = kvmppc_emulate_instruction() > > switch (status) { > > case FAIL: > > if (inst == DEBUG_INST) { > > debug_stuff(); > > return; > > } > > foo(); > > > > case DONE: > > foo1(); > > > > } > > } > > No, I want the DEBUG_INST be handled the same as any other instruction we > emulate. 
I would like to understand what you are thinking: What I derived is , add the instruction in kvmppc_emulate_instruction() (or its child function) which, 1) fill the relevant information in run-> , kvmppc_account_exit(vcpu, DEBUG_EXITS); and returns EMULATION_DONE And in emulation_exit() status = kvmppc_emulate_instruction() switch (status) { case EMULATION_DONE: if (inst == DEBUG) return RESUME_HOST; } Or 2) kvmppc_account_exit(vcpu, DEBUG_EXITS); returns EMULATION_DONE; And in emulation_exit() status = kvmppc_emulate_instruction() switch (status) { case EMULATION_DONE: if (inst == DEBUG) { fill run-> return RESUME_HOST; } } Or 3) kvmppc_account_exit(vcpu, DEBUG_EXITS); returns a new status type (EMULATION_DEBUG_INST) And in emulation_exit() status = kvmppc_emulate_instruction() switch (status) { case EMULATION_DEBUG_INST: fill run-> return RESUME_HOST; } > > >> > >> what I want is: > >> > >> int emulation_exit() > >> { > >>switch (inst) { > >>case INST_A: > >>foo(); break; > >>case DEBUG_INST: > >>debug_stuff(); break; > >> > >>} > >> } > >> > >>> > >>> > > > + run->exit_reason = KVM_EXIT_DEBUG; > > + run->debug.arch.pc = vcpu->arch.pc; > > + run->debug.arch.exception = exit_nr; > > + run->debug.arch.status = 0; > > + kvmppc_account_exit(vcpu, DEBUG_EXITS); > > + return RESUME_HOST; > > + } > > + > > er = kvmppc_emulate_instruction(run, vcpu); > > switch (er) { > > case EMULATE_DONE: > > @@ -697,6 +711,44 @@ static int emulation_exit(struct kvm_run > > *run, struct > kvm_vcpu *vcpu) > > default: > > BUG(); > > } > > + > > + if (unlikely(vcpu->guest_debug & KVM_GUESTDBG_ENABLE) && > > + (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)) { > > I don't understand how this is supposed to work. When we enable > singlestep, why would we end up in emulation_exit()? > >>> > >>> When singlestep is enabled then we set DBCR0[ICMP] and the debug > >>> handler > >> should be able to handle this. I think you are right. 
> >>> > > > + run->exit_reason = KVM_EXIT_DEBUG; > > + return RESUME_HOST; > > + } > > +} > > + > > +static int kvmppc_handle_debug(st
Re: [PATCH RFC 1/2] kvm: Handle undercommitted guest case in PLE handler
On 10/04/2012 03:07 PM, Peter Zijlstra wrote:
> On Thu, 2012-10-04 at 14:41 +0200, Avi Kivity wrote:
>>
>> Again the numbers are ridiculously high for arch_local_irq_restore.
>> Maybe there's a bad perf/kvm interaction when we're injecting an
>> interrupt, I can't believe we're spending 84% of the time running the
>> popf instruction.
>
> Smells like a software fallback that doesn't do NMI, hrtimer based
> sampling typically hits popf where we re-enable interrupts.

Good nose, that's probably it. Raghavendra, can you ensure that the PMU is properly exposed? 'dmesg' in the guest will tell. If it isn't, -cpu host will expose it (and a good idea anyway to get best performance).

-- 
error compiling committee.c: too many arguments to function
Re: [PATCH RFC 1/2] kvm: Handle undercommitted guest case in PLE handler
On Thu, 2012-10-04 at 14:41 +0200, Avi Kivity wrote: > On 10/04/2012 12:49 PM, Raghavendra K T wrote: > > On 10/03/2012 10:35 PM, Avi Kivity wrote: > >> On 10/03/2012 02:22 PM, Raghavendra K T wrote: > So I think it's worth trying again with ple_window of 2-4. > > >>> > >>> Hi Avi, > >>> > >>> I ran different benchmarks increasing ple_window, and results does not > >>> seem to be encouraging for increasing ple_window. > >> > >> Thanks for testing! Comments below. > >> > >>> Results: > >>> 16 core PLE machine with 16 vcpu guest. > >>> > >>> base kernel = 3.6-rc5 + ple handler optimization patch > >>> base_pleopt_8k = base kernel + ple window = 8k > >>> base_pleopt_16k = base kernel + ple window = 16k > >>> base_pleopt_32k = base kernel + ple window = 32k > >>> > >>> > >>> Percentage improvements of benchmarks w.r.t base_pleopt with > >>> ple_window = 4096 > >>> > >>> base_pleopt_8kbase_pleopt_16kbase_pleopt_32k > >>> - > >>> > >>> > >>> kernbench_1x-5.54915-15.94529-44.31562 > >>> kernbench_2x-7.89399-17.75039-37.73498 > >> > >> So, 44% degradation even with no overcommit? That's surprising. > > > > Yes. Kernbench was run with #threads = #vcpu * 2 as usual. Is it > > spending 8 times the original ple_window cycles for 16 vcpus > > significant? > > A PLE exit when not overcommitted cannot do any good, it is better to > spin in the guest rather that look for candidates on the host. In fact > when we benchmark we often disable PLE completely. Agreed. However, I really do not understand why the kernbench regressed with bigger ple_window. It should stay the same or improve. Raghu, do you have perf data for the kernbench runs? > > > > >> > >>> I also got perf top output to analyse the difference. Difference comes > >>> because of flushtlb (and also spinlock). > >> > >> That's in the guest, yes? > > > > Yes. Perf is in guest. 
> > > >> > >>> > >>> Ebizzy run for 4k ple_window > >>> - 87.20% [kernel] [k] arch_local_irq_restore > >>> - arch_local_irq_restore > >>>- 100.00% _raw_spin_unlock_irqrestore > >>> + 52.89% release_pages > >>> + 47.10% pagevec_lru_move_fn > >>> - 5.71% [kernel] [k] arch_local_irq_restore > >>> - arch_local_irq_restore > >>>+ 86.03% default_send_IPI_mask_allbutself_phys > >>>+ 13.96% default_send_IPI_mask_sequence_phys > >>> - 3.10% [kernel] [k] smp_call_function_many > >>> smp_call_function_many > >>> > >>> > >>> Ebizzy run for 32k ple_window > >>> > >>> - 91.40% [kernel] [k] arch_local_irq_restore > >>> - arch_local_irq_restore > >>>- 100.00% _raw_spin_unlock_irqrestore > >>> + 53.13% release_pages > >>> + 46.86% pagevec_lru_move_fn > >>> - 4.38% [kernel] [k] smp_call_function_many > >>> smp_call_function_many > >>> - 2.51% [kernel] [k] arch_local_irq_restore > >>> - arch_local_irq_restore > >>>+ 90.76% default_send_IPI_mask_allbutself_phys > >>>+ 9.24% default_send_IPI_mask_sequence_phys > >>> > >> > >> Both the 4k and the 32k results are crazy. Why is > >> arch_local_irq_restore() so prominent? Do you have a very high > >> interrupt rate in the guest? > > > > How to measure if I have high interrupt rate in guest? > > From /proc/interrupt numbers I am not able to judge :( > > 'vmstat 1' > > > > > I went back and got the results on a 32 core machine with 32 vcpu guest. > > Strangely, I got result supporting the claim that increasing ple_window > > helps for non-overcommitted scenario. > > > > 32 core 32 vcpu guest 1x scenarios. > > > > ple_gap = 0 > > kernbench: Elapsed Time 38.61 > > ebizzy: 7463 records/s > > > > ple_window = 4k > > kernbench: Elapsed Time 43.5067 > > ebizzy:2528 records/s > > > > ple_window = 32k > > kernebench : Elapsed Time 39.4133 > > ebizzy: 7196 records/s > > So maybe something was wrong with the first measurement. OK, this is more in line with what I expected for kernbench. 
FWIW, in order to show an improvement for a larger ple_window, we really need a workload which we know has a longer lock holding time (without factoring in LHP). We have noticed this on IO based locks mostly. We saw it with a massive disk IO test (qla2xxx lock), and also with a large web serving test (some vfs related lock, but I forget what exactly it was).

> >
> > perf top for ebizzy for above:
> > ple_gap = 0
> > - 84.74% [kernel] [k] arch_local_irq_restore
> >    - arch_local_irq_restore
> >       - 100.00% _raw_spin_unlock_irqrestore
> >          + 50.96% release_pages
> >          + 49.02% pagevec_lru_move_fn
> > - 6.57% [kernel] [k] arch_local_irq_restore
> >    - arch_local_irq_restore
> >       + 92.54% default_send_IPI_mask_allbutself_phys
> >       + 7.46% default_send_IPI_mask_sequence_phys
> > - 1.54% [kernel] [k] smp_call_function_many
> >      smp_call_function_man
Re: [PATCH 6/6] KVM: booke/bookehv: Add debug stub support
On 04.10.2012, at 16:22, Bhushan Bharat-R65777 wrote: > > >> -Original Message- >> From: Alexander Graf [mailto:ag...@suse.de] >> Sent: Thursday, October 04, 2012 4:56 PM >> To: Bhushan Bharat-R65777 >> Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org >> Subject: Re: [PATCH 6/6] KVM: booke/bookehv: Add debug stub support >> >> >> On 04.10.2012, at 13:06, Bhushan Bharat-R65777 wrote: >> >>> >>> -Original Message- From: Alexander Graf [mailto:ag...@suse.de] Sent: Monday, September 24, 2012 9:50 PM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Bhushan Bharat-R65777 Subject: Re: [PATCH 6/6] KVM: booke/bookehv: Add debug stub support On 21.08.2012, at 15:52, Bharat Bhushan wrote: > This patch adds the debug stub support on booke/bookehv. > Now QEMU debug stub can use hw breakpoint, watchpoint and software > breakpoint to debug guest. > > Signed-off-by: Bharat Bhushan > --- > arch/powerpc/include/asm/kvm.h| 29 ++- > arch/powerpc/include/asm/kvm_host.h |5 + > arch/powerpc/kernel/asm-offsets.c | 26 ++ > arch/powerpc/kvm/booke.c | 144 > +-- >> -- > arch/powerpc/kvm/booke_interrupts.S | 110 + > arch/powerpc/kvm/bookehv_interrupts.S | 141 >> +++- > arch/powerpc/kvm/e500mc.c |3 +- > 7 files changed, 435 insertions(+), 23 deletions(-) > > diff --git a/arch/powerpc/include/asm/kvm.h > b/arch/powerpc/include/asm/kvm.h index 61b197e..53479ea 100644 > --- a/arch/powerpc/include/asm/kvm.h > +++ b/arch/powerpc/include/asm/kvm.h > @@ -25,6 +25,7 @@ > /* Select powerpc specific features in */ #define > __KVM_HAVE_SPAPR_TCE #define __KVM_HAVE_PPC_SMT > +#define __KVM_HAVE_GUEST_DEBUG > > struct kvm_regs { > __u64 pc; > @@ -264,7 +265,31 @@ struct kvm_fpu { > __u64 fpr[32]; > }; > > + > +/* > + * Defines for h/w breakpoint, watchpoint (read, write or both) and > + * software breakpoint. > + * These are used as "type" in KVM_SET_GUEST_DEBUG ioctl and "status" > + * for KVM_DEBUG_EXIT. 
> + */ > +#define KVMPPC_DEBUG_NONE0x0 > +#define KVMPPC_DEBUG_BREAKPOINT (1UL << 1) > +#define KVMPPC_DEBUG_WATCH_WRITE (1UL << 2) > +#define KVMPPC_DEBUG_WATCH_READ (1UL << 3) > struct kvm_debug_exit_arch { > + __u64 pc; > + /* > + * exception -> returns the exception number. If the KVM_DEBUG_EXIT > + * exit is not handled (say not h/w breakpoint or software breakpoint > + * set for this address) by qemu then it is supposed to inject this > + * exception to guest. > + */ > + __u32 exception; > + /* > + * exiting to userspace because of h/w breakpoint, watchpoint > + * (read, write or both) and software breakpoint. > + */ > + __u32 status; > }; > > /* for KVM_SET_GUEST_DEBUG */ > @@ -276,10 +301,6 @@ struct kvm_guest_debug_arch { >* Type denotes h/w breakpoint, read watchpoint, write >* watchpoint or watchpoint (both read and write). >*/ > -#define KVMPPC_DEBUG_NOTYPE 0x0 > -#define KVMPPC_DEBUG_BREAKPOINT (1UL << 1) > -#define KVMPPC_DEBUG_WATCH_WRITE (1UL << 2) > -#define KVMPPC_DEBUG_WATCH_READ (1UL << 3) > __u32 type; > __u32 pad1; > __u64 pad2; > diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h > index c7219c1..3ba465a 100644 > --- a/arch/powerpc/include/asm/kvm_host.h > +++ b/arch/powerpc/include/asm/kvm_host.h > @@ -496,7 +496,12 @@ struct kvm_vcpu_arch { > u32 mmucfg; > u32 epr; > u32 crit_save; > + /* guest debug registers*/ > struct kvmppc_booke_debug_reg dbg_reg; > + /* shadow debug registers */ > + struct kvmppc_booke_debug_reg shadow_dbg_reg; > + /* host debug registers*/ > + struct kvmppc_booke_debug_reg host_dbg_reg; > #endif > gpa_t paddr_accessed; > gva_t vaddr_accessed; > diff --git a/arch/powerpc/kernel/asm-offsets.c > b/arch/powerpc/kernel/asm- offsets.c > index 555448e..6987821 100644 > --- a/arch/powerpc/kernel/asm-offsets.c > +++ b/arch/powerpc/kernel/asm-offsets.c > @@ -564,6 +564,32 @@ int main(void) > DEFINE(VCPU_FAULT_DEAR, offsetof(struct kvm_vcpu, arch.fault_dear)); > DEFINE(VCPU_FAULT_ESR, 
offsetof(struct kvm_vcpu, arch.fault_esr)); > DEFINE(VCPU_CRIT_SAVE, offsetof(struct kvm_vcpu, arch.crit_save)); > + DEFINE(VCPU_DBSR, offsetof(struct kvm_vcpu, arch.dbsr)); > + DEFINE(VCPU_SHADOW_DBG, offsetof(struct kvm_vcpu, arch.shadow_dbg_reg)); > + DEFINE(VCPU_HOST_DBG, offs
Re: [Qemu-devel] [patch 2/6] Use machine options to emulate -no-kvm-irqchip
On 2012-10-04 16:21, Anthony Liguori wrote:
> -no-kvm should be included too.

Reminds me that we still need to agree on the final default accel strategy.

> I just ran across a user that was injecting '-no-kvm-irqchip' in their
> libvirt XML via a custom attribute. It turned out it was to work around
> broken MSI support in their funky guest they were running. It was the
> wrong solution to the problem but they were doing it regardless.
>
> The point is, there are users in the wild using these options. There's
> no reason to remove them if they are trivial to maintain (and they are
> in their current form).

So let's define a consistent policy for them all:
- warn on the command line on use
- avoid adding them to the help or other user documentation
- keep them until we rework the whole command line

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux
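The first point of that policy (warn on use, but keep accepting the switch) could be sketched roughly as follows. This is only an illustration, not QEMU's option parser: the table, the helper name legacy_option_lookup() and the suggested replacement strings are assumptions made for the example.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical mapping from a legacy switch to a suggested replacement. */
struct legacy_option {
    const char *name;
    const char *replacement;
};

/* Assumed table contents, for illustration only. */
static const struct legacy_option legacy_options[] = {
    { "-no-kvm-irqchip", "-machine kernel_irqchip=off" },
    { "-no-kvm",         "-machine accel=tcg" },
};

/*
 * If arg is a known legacy switch, print a deprecation warning on
 * stderr and return the suggested replacement. Return NULL otherwise.
 */
const char *legacy_option_lookup(const char *arg)
{
    size_t i;

    for (i = 0; i < sizeof(legacy_options) / sizeof(legacy_options[0]); i++) {
        if (strcmp(arg, legacy_options[i].name) == 0) {
            fprintf(stderr, "warning: %s is deprecated, use %s instead\n",
                    arg, legacy_options[i].replacement);
            return legacy_options[i].replacement;
        }
    }
    return NULL;
}
```

Keeping the legacy names in one table, separate from the documented options, also makes it easy to leave them out of the help text.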
Re: vfio problems
On Thu, 2012-10-04 at 11:40 +0200, Dominic Eschweiler wrote:
> Hi,
>
> I just recently started to play with vfio, since the new Kernel 3.6
> comes directly with an integrated vfio-stack. My problem currently is,
> that I'm not able to bind the vfio-pci to the (unused) smbus controller
> in my laptop.
>
> Here are the steps I did and the related results:
>
> # modprobe vfio-pci
>
> [ 1609.065705] VFIO - User Level meta-driver version: 0.3
>
> # lspci
>
> ...
> 00:1f.3 SMBus: Intel Corporation 6 Series/C200 Series Chipset Family
> SMBus Controller (rev 05)
> ...
>
> # lspci -n -s :00:1f.3
>
> 00:1f.3 0c05: 8086:1c22 (rev 05)
>
> # echo 8086 1c22 > /sys/bus/pci/drivers/vfio-pci/new_id
>
> [ 4485.759148] vfio-pci: probe of :00:1f.3 failed with error -22
>
> I omitted the unbind step, which is described in the documentation,
> since the device is not claimed by any driver at all. Also the "error
> -22" statement isn't really helping in this case.
>
> Any ideas?

What does this report?

readlink /sys/bus/pci/devices/\:00\:1f.3/iommu_group

The likely cause of an EINVAL for an endpoint device is that it's not part of an IOMMU group, which may mean you don't have an IOMMU enabled. You can also look in /sys/kernel/iommu_groups to see the groups.

Thanks,
Alex
[GIT PULL] KVM updates for the 3.7 merge window
Linus, please pull from the repo and tag at:

  git://git.kernel.org/pub/scm/virt/kvm/kvm.git tags/kvm-3.7-1

to receive the KVM updates for the 3.7 merge window. Highlights of the changes for this release include support for vfio level triggered interrupts, improved big real mode support on older Intels, a streamlined guest page table walker, guest APIC speedups, PIO optimizations, better overcommit handling, and read-only memory.

Shortlog and diffstat follow.

KVM updates for the 3.7 merge window

Alex Williamson (1):
      KVM: Add resampling irqfds for level triggered interrupts

Alexander Graf (1):
      KVM: Add ppc hypercall documentation

Avi Kivity (36):
      Merge branch 'queue' into next
      KVM: Don't update PPR on any APIC read
      KVM: Remove internal timer abstraction
      KVM: Simplify kvm_timer
      KVM: Simplify kvm_pit_timer
      KVM: fold kvm_pit_timer into kvm_kpit_state
      Merge remote-tracking branch 'upstream' into next
      KVM: VMX: Advertize RDTSC exiting to nested guests
      KVM: x86 emulator: access GPRs on demand
      KVM: VMX: Separate saving pre-realmode state from setting segments
      KVM: VMX: Fix incorrect lookup of segment S flag in fix_pmode_dataseg()
      KVM: VMX: Use kvm_segment to save protected-mode segments when entering realmode
      KVM: VMX: Retain limit and attributes when entering protected mode
      KVM: VMX: Allow real mode emulation using vm86 with dpl=0
      KVM: VMX: Allow vm86 virtualization of big real mode
      KVM: x86 emulator: Leave segment limit and attributs alone in real mode
      KVM: x86 emulator: Check segment limits in real mode too
      KVM: x86 emulator: Fix #GP error code during linearization
      KVM: VMX: Return real real-mode segment data even if emulate_invalid_guest_state=1
      KVM: VMX: Preserve segment limit and access rights in real mode
      KVM: VMX: Save all segment data in real mode
      KVM: VMX: Ignore segment G and D bits when considering whether we can virtualize
      KVM: VMX: Make lto-friendly
      KVM: VMX: Make use of asm.h
      KVM: SVM: Make use of asm.h
      KVM: MMU: Push clean gpte write protection out of gpte_access()
      KVM: MMU: Optimize gpte_access() slightly
      KVM: MMU: Move gpte_access() out of paging_tmpl.h
      KVM: MMU: Update accessed and dirty bits after guest pagetable walk
      KVM: MMU: Optimize pte permission checks
      KVM: MMU: Simplify walk_addr_generic() loop
      KVM: MMU: Optimize is_last_gpte()
      KVM: MMU: Eliminate eperm temporary
      KVM: MMU: Avoid access/dirty update loop if all is well
      KVM: MMU: Eliminate pointless temporary 'ac'
      Merge branch 'queue' into next

Christian Borntraeger (1):
      KVM: s390: Fix vcpu_load handling in interrupt code

Christoffer Dall (1):
      KVM: Move KVM_IRQ_LINE to arch-generic code

Cornelia Huck (3):
      s390/dis: Instruction decoding interface
      KVM: s390: Add architectural trace events
      KVM: s390: Add implementation-specific trace events

Florian Westphal (1):
      KVM guest: disable stealtime on reboot to avoid mem corruption

Gavin Shan (1):
      KVM: PPC: book3s: fix build error caused by gfn_to_hva_memslot()

Gleb Natapov (18):
      KVM: x86 emulator: drop unneeded call to get_segment()
      KVM: x86: update KVM_SAVE_MSRS_BEGIN to correct value
      KVM: clean up kvm_(set|get)_apic_base
      KVM: use kvm_lapic_set_base() to change apic_base
      KVM: mark apic enabled on start up
      jump_label: Export jump_label_rate_limit()
      KVM: use jump label to optimize checking for HW enabled APIC in APIC_BASE MSR
      KVM: use jump label to optimize checking for SW enabled apic in spurious interrupt register
      KVM: use jump label to optimize checking for in kernel local apic presence
      KVM: inline kvm_apic_present() and kvm_lapic_enabled()
      KVM: correctly detect APIC SW state in kvm_apic_post_state_restore()
      KVM: VMX: restore MSR_IA32_DEBUGCTLMSR after VMEXIT
      KVM: cleanup pic reset
      KVM: Provide userspace IO exit completion callback
      KVM: emulator: make x86 emulation modes enum instead of defines
      KVM: emulator: string_addr_inc() cleanup
      KVM: emulator: optimize "rep ins" handling
      KVM: optimize apic interrupt delivery

Guo Chao (7):
      KVM: VMX: Fix typos
      KVM: SVM: Fix typos
      KVM: x86: Fix typos in x86.c
      KVM: x86: Fix typos in emulate.c
      KVM: x86: Fix typos in cpuid.c
      KVM: x86: Fix typos in lapic.c
      KVM: x86: Fix typos in pmu.c

Jan Kiszka (2):
      KVM: Improve wording of KVM_SET_USER_MEMORY_REGION documentation
      KVM: x86: Fix guest debug across vcpu INIT reset

Liu, Jinsong (1):
      KVM: Depend on HIGH_RES_TIMERS

Marcelo Tosatti (7):
      KVM: x86: fix pvclock guest stopped flag reporting
      x86: KVM guest: merge CONFIG_KVM_CLOCK int
RE: [PATCH 6/6] KVM: booke/bookehv: Add debug stub support
> -Original Message- > From: Alexander Graf [mailto:ag...@suse.de] > Sent: Thursday, October 04, 2012 4:56 PM > To: Bhushan Bharat-R65777 > Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org > Subject: Re: [PATCH 6/6] KVM: booke/bookehv: Add debug stub support > > > On 04.10.2012, at 13:06, Bhushan Bharat-R65777 wrote: > > > > > > >> -Original Message- > >> From: Alexander Graf [mailto:ag...@suse.de] > >> Sent: Monday, September 24, 2012 9:50 PM > >> To: Bhushan Bharat-R65777 > >> Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Bhushan > >> Bharat-R65777 > >> Subject: Re: [PATCH 6/6] KVM: booke/bookehv: Add debug stub support > >> > >> > >> On 21.08.2012, at 15:52, Bharat Bhushan wrote: > >> > >>> This patch adds the debug stub support on booke/bookehv. > >>> Now QEMU debug stub can use hw breakpoint, watchpoint and software > >>> breakpoint to debug guest. > >>> > >>> Signed-off-by: Bharat Bhushan > >>> --- > >>> arch/powerpc/include/asm/kvm.h| 29 ++- > >>> arch/powerpc/include/asm/kvm_host.h |5 + > >>> arch/powerpc/kernel/asm-offsets.c | 26 ++ > >>> arch/powerpc/kvm/booke.c | 144 > >>> +-- > -- > >>> arch/powerpc/kvm/booke_interrupts.S | 110 + > >>> arch/powerpc/kvm/bookehv_interrupts.S | 141 > +++- > >>> arch/powerpc/kvm/e500mc.c |3 +- > >>> 7 files changed, 435 insertions(+), 23 deletions(-) > >>> > >>> diff --git a/arch/powerpc/include/asm/kvm.h > >>> b/arch/powerpc/include/asm/kvm.h index 61b197e..53479ea 100644 > >>> --- a/arch/powerpc/include/asm/kvm.h > >>> +++ b/arch/powerpc/include/asm/kvm.h > >>> @@ -25,6 +25,7 @@ > >>> /* Select powerpc specific features in */ #define > >>> __KVM_HAVE_SPAPR_TCE #define __KVM_HAVE_PPC_SMT > >>> +#define __KVM_HAVE_GUEST_DEBUG > >>> > >>> struct kvm_regs { > >>> __u64 pc; > >>> @@ -264,7 +265,31 @@ struct kvm_fpu { > >>> __u64 fpr[32]; > >>> }; > >>> > >>> + > >>> +/* > >>> + * Defines for h/w breakpoint, watchpoint (read, write or both) and > >>> + * software breakpoint. 
> >>> + * These are used as "type" in KVM_SET_GUEST_DEBUG ioctl and "status" > >>> + * for KVM_DEBUG_EXIT. > >>> + */ > >>> +#define KVMPPC_DEBUG_NONE0x0 > >>> +#define KVMPPC_DEBUG_BREAKPOINT (1UL << 1) > >>> +#define KVMPPC_DEBUG_WATCH_WRITE (1UL << 2) > >>> +#define KVMPPC_DEBUG_WATCH_READ (1UL << 3) > >>> struct kvm_debug_exit_arch { > >>> + __u64 pc; > >>> + /* > >>> + * exception -> returns the exception number. If the KVM_DEBUG_EXIT > >>> + * exit is not handled (say not h/w breakpoint or software breakpoint > >>> + * set for this address) by qemu then it is supposed to inject this > >>> + * exception to guest. > >>> + */ > >>> + __u32 exception; > >>> + /* > >>> + * exiting to userspace because of h/w breakpoint, watchpoint > >>> + * (read, write or both) and software breakpoint. > >>> + */ > >>> + __u32 status; > >>> }; > >>> > >>> /* for KVM_SET_GUEST_DEBUG */ > >>> @@ -276,10 +301,6 @@ struct kvm_guest_debug_arch { > >>>* Type denotes h/w breakpoint, read watchpoint, write > >>>* watchpoint or watchpoint (both read and write). 
> >>>*/ > >>> -#define KVMPPC_DEBUG_NOTYPE 0x0 > >>> -#define KVMPPC_DEBUG_BREAKPOINT (1UL << 1) > >>> -#define KVMPPC_DEBUG_WATCH_WRITE (1UL << 2) > >>> -#define KVMPPC_DEBUG_WATCH_READ (1UL << 3) > >>> __u32 type; > >>> __u32 pad1; > >>> __u64 pad2; > >>> diff --git a/arch/powerpc/include/asm/kvm_host.h > >> b/arch/powerpc/include/asm/kvm_host.h > >>> index c7219c1..3ba465a 100644 > >>> --- a/arch/powerpc/include/asm/kvm_host.h > >>> +++ b/arch/powerpc/include/asm/kvm_host.h > >>> @@ -496,7 +496,12 @@ struct kvm_vcpu_arch { > >>> u32 mmucfg; > >>> u32 epr; > >>> u32 crit_save; > >>> + /* guest debug registers*/ > >>> struct kvmppc_booke_debug_reg dbg_reg; > >>> + /* shadow debug registers */ > >>> + struct kvmppc_booke_debug_reg shadow_dbg_reg; > >>> + /* host debug registers*/ > >>> + struct kvmppc_booke_debug_reg host_dbg_reg; > >>> #endif > >>> gpa_t paddr_accessed; > >>> gva_t vaddr_accessed; > >>> diff --git a/arch/powerpc/kernel/asm-offsets.c > >>> b/arch/powerpc/kernel/asm- > >> offsets.c > >>> index 555448e..6987821 100644 > >>> --- a/arch/powerpc/kernel/asm-offsets.c > >>> +++ b/arch/powerpc/kernel/asm-offsets.c > >>> @@ -564,6 +564,32 @@ int main(void) > >>> DEFINE(VCPU_FAULT_DEAR, offsetof(struct kvm_vcpu, arch.fault_dear)); > >>> DEFINE(VCPU_FAULT_ESR, offsetof(struct kvm_vcpu, arch.fault_esr)); > >>> DEFINE(VCPU_CRIT_SAVE, offsetof(struct kvm_vcpu, arch.crit_save)); > >>> + DEFINE(VCPU_DBSR, offsetof(struct kvm_vcpu, arch.dbsr)); > >>> + DEFINE(VCPU_SHADOW_DBG, offsetof(struct kvm_vcpu, arch.shadow_dbg_reg)); > >>> + DEFINE(VCPU_HOST_DBG, offsetof(struct kvm_vcpu, arch.host_dbg_reg)); > >>> + DEFINE(KVMPPC_DBG_DBCR0, offset
Re: [Qemu-devel] [patch 2/6] Use machine options to emulate -no-kvm-irqchip
Jan Kiszka writes:

> On 2012-10-03 20:26, Marcelo Tosatti wrote:
>> On Wed, Oct 03, 2012 at 07:24:48PM +0200, Jan Kiszka wrote:
>>> On 2012-10-03 19:16, Anthony Liguori wrote:
>>>> Jan Kiszka writes:
>>>>
>>>>> On 2012-10-03 17:03, Marcelo Tosatti wrote:
>>>>>> On Wed, Oct 03, 2012 at 09:40:17AM -0500, Anthony Liguori wrote:
>>>>>>> Marcelo Tosatti writes:
>>>>>>>
>>>>>>>> Commit 3ad763fcba5bd0ec5a79d4a9b6baeef119dd4a3d from qemu-kvm.git.
>>>>>>>>
>>>>>>>> From: Jan Kiszka
>>>>>>>>
>>>>>>>> Upstream is moving towards this mechanism, so start using it in
>>>>>>>> qemu-kvm already to configure the specific defaults: kvm enabled
>>>>>>>> on, just like in-kernel irqchips.
>>>>>>>>
>>>>>>>> Signed-off-by: Marcelo Tosatti
>>>>>>>
>>>>>>> Reviewed-by: Anthony Liguori
>>>>>>>
>>>>>>> Although it's a little odd to have From: Jan without a SoB...
>>>>>>
>>>>>> Agree, Jan can you ACK?
>>>>>
>>>>> I wasn't able to join the call yesterday: Is there a removal schedule
>>>>> associated with those switches? Also, why pushing things upstream, even
>>>>> when only for one release, that have been loudly deprecated for a while
>>>>> in qemu-kvm? Some switches are lacking deprecated warnings on the
>>>>> console, and -no-kvm is missing completely. I tend to focus on patch 1 &
>>>>> 5, dropping the rest - based on relevance for production use.
>>>>
>>>> The distros need to keep these flags to do the switch.
>>>
>>> Why? Should be documented in commit log.
>>>
>>>> I see no point in deprecating them since they're trivially easy to
>>>> maintain.
>>>
>>> Given the level of cr** we already have in the command line, they are
>>> kind of noise, yes. But even then, these patches are not consistent as
>>> pointed out above.
>>>
>>> Also, they should not be documented to avoid being spread. That's what
>>> we did with other deprecated switches in QEMU.
>>>
>>> Jan
>>
>> Jan,
>>
>> Your comments to the patch are:
>>
>> - No documentation.
>
> See e.g. how -M is handled in qemu-options.hx.
>
>> - Expiration date.
>
> Anthony said "forever", but I think we should remove all those that
> issue deprecation warnings after 1-2 years.
>
>> - Changelog explaining what?? (didn't get that). Perhaps better changelog
>> in general?
>
> I'm still failing to understand who could depend on -no-kvm-irqchip or
> -no-kvm-pit. And I don't understand why -no-kvm was not included. So
> the reasons for including -X should be provided. Also check your patch
> subjects again, at least one was wrong.

-no-kvm should be included too.

I just ran across a user that was injecting '-no-kvm-irqchip' in their libvirt XML via a custom attribute. It turned out it was to work around broken MSI support in their funky guest they were running. It was the wrong solution to the problem but they were doing it regardless.

The point is, there are users in the wild using these options. There's no reason to remove them if they are trivial to maintain (and they are in their current form).

Regards,

Anthony Liguori

>
> Jan
Re: [kvmarm] [PATCH 06/15] KVM: ARM: Initial skeleton to compile KVM support
On 09/25/2012 05:20 PM, Will Deacon wrote:
>> +	case KVM_GET_REG_LIST: {
>> +		struct kvm_reg_list __user *user_list = argp;
>> +		struct kvm_reg_list reg_list;
>> +		unsigned n;
>> +
>> +		if (copy_from_user(&reg_list, user_list, sizeof reg_list))
>> +			return -EFAULT;
>> +		n = reg_list.n;
>> +		reg_list.n = kvm_arm_num_regs(vcpu);
>> +		if (copy_to_user(user_list, &reg_list, sizeof reg_list))
>> +			return -EFAULT;
>> +		if (n < reg_list.n)
>> +			return -E2BIG;
>> +		return kvm_arm_copy_reg_indices(vcpu, user_list->reg);
>
> kvm_reg_list sounds like it could be done using a regset instead.

Wouldn't those regsets be userspace oriented? For example, the GPRs returned here include all the shadowed interrupt registers (or however they're called) while most user oriented APIs would only include the user visible registers.

FWIW, we're trying to move to an architecture independent ABI for KVM registers, but that's a lot of work since we need to make sure all the weird x86 registers (and non-register state) fit into that. Maybe that ABI will be regset based, but I don't want to block the ARM port on this.

-- 
error compiling committee.c: too many arguments to function
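For readers unfamiliar with the calling convention of the quoted handler: the caller passes its buffer capacity in n, the handler always writes back the required count, and it fails with -E2BIG when the buffer is too small. The sketch below models that in plain user-space C so it can run without /dev/kvm. The names struct reg_list, model_get_reg_list(), fetch_reg_list() and the register indices are all invented for this illustration; real code would issue the KVM_GET_REG_LIST ioctl on a vcpu file descriptor instead.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified stand-in for struct kvm_reg_list: a count, then indices. */
struct reg_list {
    uint64_t n;
    uint64_t reg[];
};

/* Hypothetical register indices known to the modelled "kernel side". */
static const uint64_t known_regs[] = { 0x10, 0x11, 0x12, 0x13 };
#define NUM_KNOWN_REGS (sizeof(known_regs) / sizeof(known_regs[0]))

/*
 * Model of the quoted handler: always report the required count, fail
 * with -E2BIG if the caller's capacity is too small, otherwise fill in
 * the indices.
 */
int model_get_reg_list(struct reg_list *list)
{
    uint64_t capacity = list->n;
    size_t i;

    list->n = NUM_KNOWN_REGS;
    if (capacity < list->n)
        return -E2BIG;
    for (i = 0; i < NUM_KNOWN_REGS; i++)
        list->reg[i] = known_regs[i];
    return 0;
}

/*
 * Caller side: probe with n = 0 to learn the size, then retry with a
 * buffer that is large enough. Returns NULL on failure; the caller
 * frees the result.
 */
struct reg_list *fetch_reg_list(void)
{
    struct reg_list probe = { .n = 0 };
    struct reg_list *list;

    if (model_get_reg_list(&probe) != -E2BIG)
        return NULL;

    list = malloc(sizeof(*list) + probe.n * sizeof(list->reg[0]));
    if (!list)
        return NULL;

    list->n = probe.n;
    if (model_get_reg_list(list) != 0) {
        free(list);
        return NULL;
    }
    return list;
}
```

Writing the required count back before returning -E2BIG is what makes the two-call pattern work: the first, deliberately undersized call tells the caller how much to allocate.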
Re: [PATCH 06/15] KVM: ARM: Initial skeleton to compile KVM support
On Thu, Oct 4, 2012 at 9:02 AM, Min-gyu Kim wrote: > > >> -Original Message- >> From: kvm-ow...@vger.kernel.org [mailto:kvm-ow...@vger.kernel.org] On >> Behalf Of Christoffer Dall >> Sent: Monday, October 01, 2012 4:22 AM >> To: Will Deacon >> Cc: kvm@vger.kernel.org; linux-arm-ker...@lists.infradead.org; >> kvm...@lists.cs.columbia.edu; rusty.russ...@linaro.org; a...@redhat.com; >> marc.zyng...@arm.com >> Subject: Re: [PATCH 06/15] KVM: ARM: Initial skeleton to compile KVM >> support >> >> On Thu, Sep 27, 2012 at 10:13 AM, Will Deacon wrote: >> > On Wed, Sep 26, 2012 at 02:43:14AM +0100, Christoffer Dall wrote: >> >> On 09/25/2012 11:20 AM, Will Deacon wrote: >> >> >> +/* Multiprocessor Affinity Register */ >> >> >> +#define MPIDR_CPUID(0x3 << 0) >> >> > >> >> > I'm fairly sure we already have code under arch/arm/ for dealing >> >> > with the mpidr. Let's re-use that rather than reinventing it here. >> >> > >> >> >> >> I see some defines in topology.c - do you want some of these factored >> >> out into a header file that we can then also use from kvm? If so, > where? >> > >> > I guess either in topology.h or a new header (topology-bits.h). >> > >> >> >> +#define EXCEPTION_NONE 0 >> >> >> +#define EXCEPTION_RESET 0x80 >> >> >> +#define EXCEPTION_UNDEFINED 0x40 >> >> >> +#define EXCEPTION_SOFTWARE 0x20 >> >> >> +#define EXCEPTION_PREFETCH 0x10 >> >> >> +#define EXCEPTION_DATA 0x08 >> >> >> +#define EXCEPTION_IMPRECISE 0x04 >> >> >> +#define EXCEPTION_IRQ 0x02 >> >> >> +#define EXCEPTION_FIQ 0x01 >> >> > >> >> > Why the noise? >> >> > >> >> >> >> these are simply cruft from a previous life of KVM/ARM. >> > >> > Ok, then please get rid of them. >> > >> >> >> +static inline enum vcpu_mode vcpu_mode(struct kvm_vcpu *vcpu) { >> >> >> + u8 mode = __vcpu_mode(vcpu->arch.regs.cpsr); >> >> >> + BUG_ON(mode == 0xf); >> >> >> + return mode; >> >> >> +} >> >> > >> >> > I noticed that you have a fair few BUG_ONs throughout the series. 
>> >> > Fair enough, but for hyp code is that really the right thing to do? >> >> > Killing the guest could make more sense, perhaps? >> >> >> >> the idea is to have BUG_ONs that are indeed BUG_ONs that we want to >> >> catch explicitly on the host. We have had a pass over the code to >> >> change all the BUG_ONs that can be provoked by the guest and inject >> >> the proper exceptions into the guest in this case. If you find places >> >> where this is not the case, it should be changed, and do let me know. >> > >> > Ok, so are you saying that a BUG_ON due to some detected inconsistency >> > with one guest may not necessarily terminate the other guests? BUG_ONs >> > in the host seem like a bad idea if the host is able to continue with >> > a subset of guests. >> > >> >> No, I'm saying a BUG_ON is an actual BUG, it should not happen and there >> should be nowhere where a guest can cause a BUG_ON to occur in the host, >> because that would be a bug. >> >> We basically never kill a guest unless really extreme things happen (like >> we cannot allocate a pte, in which case we return -ENOMEM). If a guest >> does something really weird, that guest will receive the appropriate >> exception (undefined, prefetch abort, ...) >> > > I agree with Will. It seems to be overreacting to kill the entire system. > > From the code above, BUG_ON case clearly points out that there happened a > serious bug case. However, killing the corresponding VM may not cause any > further problem. > Then leave some logs for debugging and killing the VM seems to be enough. > > Let's assume KVM for ARM is distributed with a critical bug. > If the case is defended by BUG_ON, it will cause host to shutdown. > If the case is defended by killing VM, it will cause VM to shutdown. > In my opinion, latter case seems to be better. > > I looked for a guide on BUG_ON and found this: > http://yarchive.net/comp/linux/BUG.html > > I completely agree with all this, no further argument is needed. 
The point of a BUG_ON is to explicitly state the reason for a bug that will anyhow cause the host kernel to overall malfunction. The above bug_on statement is long gone (see new patches), and if you see other cases like this, the code should have been tested and we can remove the BUG_ON. >> >> > >> >> >> +static inline u32 *vcpu_pc(struct kvm_vcpu *vcpu) { >> >> >> + return vcpu_reg(vcpu, 15); } >> >> > >> >> > If you stick a struct pt_regs into struct kvm_regs, you could reuse >> >> > ARM_pc here etc. >> >> > >> >> >> >> I prefer not to, because we'd have those registers presumably for usr >> >> mode and then we only define the others explicit. I think it's much >> >> clearer to look at kvm_regs today. >> > >> > I disagree and think that you should reuse as much of the arch/arm/ >> > code as possible. Not only does it make it easier to read by people >> > who are familiar with that code (and in turn get you more reviewers) >> > but it also means that we limit the amount of dup
Re: [PATCH 0/3] virtio-net: inline header support
On 04/10/2012 14:51, Rusty Russell wrote:
> Paolo Bonzini writes:
>
>> On 04/10/2012 02:11, Rusty Russell wrote:
>>>>> There's a reason I haven't done this. I really, really dislike "my
>>>>> implementation isn't broken" feature bits. We could have an infinite
>>>>> number of them, for each bug in each device.
>>>>
>>>> However, this bug affects (almost) all implementations and (almost) all
>>>> devices. It even makes sense to reserve a transport feature bit for it
>>>> instead of a device feature bit.
>>>
>>> Perhaps, but we have to fix the bugs first!
>>
>> Yes. :) Isn't that what mst's patch does?
>>
>>> As I said, my torture patch broke qemu immediately. Since no one has
>>> leapt onto fixing that, I'll take a look now...
>>
>> I can look at virtio-scsi.
>
> Actually, you can't, see my reply to Anthony...
>
> Message-ID: <87lifm1y1n@rustcorp.com.au>

struct virtio_scsi_req_cmd {
    // Read-only
    u8 lun[8];
    u64 id;
    u8 task_attr;
    u8 prio;
    u8 crn;
    char cdb[cdb_size];
    char dataout[];
    // Write-only part
    u32 sense_len;
    u32 residual;
    u16 status_qualifier;
    u8 status;
    u8 response;
    u8 sense[sense_size];
    char datain[];
};

where cdb_size and sense_size come from configuration space. The device right now expects everything before dataout/datain to be in a single descriptor, but that's in no way part of the spec.

Am I missing something egregious?

Paolo
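A device that does not assume any particular descriptor layout has to gather the request header from wherever the driver placed it. The sketch below shows one way to do that over a generic scatter list; it uses struct iovec as a stand-in for the descriptor chain, and the helper name sg_gather() is an assumption for this example, not QEMU or kernel code.

```c
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

/*
 * Copy len bytes starting at byte offset `offset` of a scatter list
 * into dst, crossing element boundaries as needed.
 * Returns the number of bytes copied, which is less than len only if
 * the scatter list is too short.
 */
size_t sg_gather(const struct iovec *iov, int iovcnt, size_t offset,
                 void *dst, size_t len)
{
    unsigned char *out = dst;
    size_t done = 0;
    int i;

    for (i = 0; i < iovcnt && done < len; i++) {
        size_t avail = iov[i].iov_len;
        size_t chunk;

        if (offset >= avail) {      /* element lies before the offset */
            offset -= avail;
            continue;
        }
        chunk = avail - offset;
        if (chunk > len - done)
            chunk = len - done;
        memcpy(out + done,
               (const unsigned char *)iov[i].iov_base + offset, chunk);
        done += chunk;
        offset = 0;                 /* later elements are read from 0 */
    }
    return done;
}
```

With a helper like this, the fixed read-only fields of the structure above (lun, id, task_attr, prio, crn: 19 bytes if packed) could be read with one call at offset 0 and the cdb with a second call at offset 19, regardless of how the driver split them across descriptors.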
Re: [PATCH RFC 1/2] kvm: Handle undercommitted guest case in PLE handler
On Thu, 2012-10-04 at 14:41 +0200, Avi Kivity wrote:
>
> Again the numbers are ridiculously high for arch_local_irq_restore.
> Maybe there's a bad perf/kvm interaction when we're injecting an
> interrupt, I can't believe we're spending 84% of the time running the
> popf instruction.

Smells like a software fallback that doesn't do NMI, hrtimer based sampling typically hits popf where we re-enable interrupts.
RE: [PATCH 06/15] KVM: ARM: Initial skeleton to compile KVM support
> -Original Message- > From: kvm-ow...@vger.kernel.org [mailto:kvm-ow...@vger.kernel.org] On > Behalf Of Christoffer Dall > Sent: Monday, October 01, 2012 4:22 AM > To: Will Deacon > Cc: kvm@vger.kernel.org; linux-arm-ker...@lists.infradead.org; > kvm...@lists.cs.columbia.edu; rusty.russ...@linaro.org; a...@redhat.com; > marc.zyng...@arm.com > Subject: Re: [PATCH 06/15] KVM: ARM: Initial skeleton to compile KVM > support > > On Thu, Sep 27, 2012 at 10:13 AM, Will Deacon wrote: > > On Wed, Sep 26, 2012 at 02:43:14AM +0100, Christoffer Dall wrote: > >> On 09/25/2012 11:20 AM, Will Deacon wrote: > >> >> +/* Multiprocessor Affinity Register */ > >> >> +#define MPIDR_CPUID(0x3 << 0) > >> > > >> > I'm fairly sure we already have code under arch/arm/ for dealing > >> > with the mpidr. Let's re-use that rather than reinventing it here. > >> > > >> > >> I see some defines in topology.c - do you want some of these factored > >> out into a header file that we can then also use from kvm? If so, where? > > > > I guess either in topology.h or a new header (topology-bits.h). > > > >> >> +#define EXCEPTION_NONE 0 > >> >> +#define EXCEPTION_RESET 0x80 > >> >> +#define EXCEPTION_UNDEFINED 0x40 > >> >> +#define EXCEPTION_SOFTWARE 0x20 > >> >> +#define EXCEPTION_PREFETCH 0x10 > >> >> +#define EXCEPTION_DATA 0x08 > >> >> +#define EXCEPTION_IMPRECISE 0x04 > >> >> +#define EXCEPTION_IRQ 0x02 > >> >> +#define EXCEPTION_FIQ 0x01 > >> > > >> > Why the noise? > >> > > >> > >> these are simply cruft from a previous life of KVM/ARM. > > > > Ok, then please get rid of them. > > > >> >> +static inline enum vcpu_mode vcpu_mode(struct kvm_vcpu *vcpu) { > >> >> + u8 mode = __vcpu_mode(vcpu->arch.regs.cpsr); > >> >> + BUG_ON(mode == 0xf); > >> >> + return mode; > >> >> +} > >> > > >> > I noticed that you have a fair few BUG_ONs throughout the series. > >> > Fair enough, but for hyp code is that really the right thing to do? > >> > Killing the guest could make more sense, perhaps? 
> >>
> >> the idea is to have BUG_ONs that are indeed BUG_ONs that we want to
> >> catch explicitly on the host. We have had a pass over the code to
> >> change all the BUG_ONs that can be provoked by the guest and inject
> >> the proper exceptions into the guest in this case. If you find places
> >> where this is not the case, it should be changed, and do let me know.
> >
> > Ok, so are you saying that a BUG_ON due to some detected inconsistency
> > with one guest may not necessarily terminate the other guests? BUG_ONs
> > in the host seem like a bad idea if the host is able to continue with
> > a subset of guests.
> >
>
> No, I'm saying a BUG_ON is an actual BUG, it should not happen and there
> should be nowhere where a guest can cause a BUG_ON to occur in the host,
> because that would be a bug.
>
> We basically never kill a guest unless really extreme things happen (like
> we cannot allocate a pte, in which case we return -ENOMEM). If a guest
> does something really weird, that guest will receive the appropriate
> exception (undefined, prefetch abort, ...)
>

I agree with Will. It seems to be overreacting to kill the entire system.

From the code above, BUG_ON case clearly points out that there happened a serious bug case. However, killing the corresponding VM may not cause any further problem. Then leave some logs for debugging and killing the VM seems to be enough.

Let's assume KVM for ARM is distributed with a critical bug. If the case is defended by BUG_ON, it will cause host to shutdown. If the case is defended by killing VM, it will cause VM to shutdown. In my opinion, latter case seems to be better.

I looked for a guide on BUG_ON and found this:
http://yarchive.net/comp/linux/BUG.html

> >> >
> >> >> +static inline u32 *vcpu_pc(struct kvm_vcpu *vcpu) {
> >> >> +	return vcpu_reg(vcpu, 15); }
> >> >
> >> > If you stick a struct pt_regs into struct kvm_regs, you could reuse
> >> > ARM_pc here etc.
> >> > > >> > >> I prefer not to, because we'd have those registers presumably for usr > >> mode and then we only define the others explicit. I think it's much > >> clearer to look at kvm_regs today. > > > > I disagree and think that you should reuse as much of the arch/arm/ > > code as possible. Not only does it make it easier to read by people > > who are familiar with that code (and in turn get you more reviewers) > > but it also means that we limit the amount of duplication that we have. > > Reusing a struct just for the sake of reusing is not necessarily an > improvement. Some times it complicates things, and some times it's > misleading. To me, pt_regs carry the meaning that these are the registers > for a user space process that traps into the kernel - in KVM we emulate a > virtual CPU and that current definition is quite clear. > > The argument that more people will review the code if the struct contains > a pt_regs field rather than a usr_regs field is completely invalid, >
Re: [PATCH 0/3] virtio-net: inline header support
Paolo Bonzini writes:
> On 04/10/2012 02:11, Rusty Russell wrote:
>> > > There's a reason I haven't done this.  I really, really dislike "my
>> > > implementation isn't broken" feature bits.  We could have an infinite
>> > > number of them, for each bug in each device.
>> >
>> > However, this bug affects (almost) all implementations and (almost) all
>> > devices.  It even makes sense to reserve a transport feature bit for it
>> > instead of a device feature bit.
>>
>> Perhaps, but we have to fix the bugs first!
>
> Yes. :)  Isn't that what mst's patch does?
>
>> As I said, my torture patch broke qemu immediately.  Since no one has
>> leapt onto fixing that, I'll take a look now...
>
> I can look at virtio-scsi.

Actually, you can't, see my reply to Anthony...

Message-ID: <87lifm1y1n@rustcorp.com.au>

Cheers,
Rusty.
Re: [PATCH RFC 0/2] kvm: Improving undercommit,overcommit scenarios in PLE handler
On 10/04/2012 12:56 PM, Raghavendra K T wrote: > On 10/03/2012 10:55 PM, Avi Kivity wrote: >> On 10/03/2012 04:29 PM, Raghavendra K T wrote: >>> * Avi Kivity [2012-09-27 14:03:59]: >>> On 09/27/2012 01:23 PM, Raghavendra K T wrote: >> >>> [...] > 2) looking at the result (comparing A & C) , I do feel we have > significant in iterating over vcpus (when compared to even vmexit) > so We still would need undercommit fix sugested by PeterZ > (improving by > 140%). ? Looking only at the current runqueue? My worry is that it misses a lot of cases. Maybe try the current runqueue first and then others. >>> >>> Okay. Do you mean we can have something like >>> >>> + if (rq->nr_running == 1 && p_rq->nr_running == 1) { >>> + yielded = -ESRCH; >>> + goto out_irq; >>> + } >>> >>> in the Peter's patch ? >>> >>> ( I thought lot about && or || . Both seem to have their own cons ). >>> But that should be only when we have short term imbalance, as PeterZ >>> told. >> >> I'm missing the context. What is p_rq? > > p_rq is the run queue of target vcpu. > What I was trying below was to address Rik concern. Suppose > rq of source vcpu has one task, but target probably has two task, > with a eligible vcpu waiting to be scheduled. > >> >> What I mean was: >> >>if can_yield_to_process_in_current_rq >> do that >>else if can_yield_to_process_in_other_rq >> do that >>else >> return -ESRCH > > I think you are saying we have to check the run queue of the > source vcpu, if we have a vcpu belonging to same VM and try yield to > that? ignoring whatever the target vcpu we received for yield_to. > > Or is it that kvm_vcpu_yield_to should now check the vcpus of same vm > belonging to same run queue first. If we don't succeed, go again for > a vcpu in different runqueue. Right. Prioritize vcpus that are cheap to yield to. But may return bad results if all vcpus on the current runqueue are spinners, so probably not a good idea. > Does it add more overhead especially in <= 1x scenario? 
The current runqueue should have just our vcpu in that case, so low overhead. But it's a bad idea due to the above scenario. -- error compiling committee.c: too many arguments to function
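The lookup order discussed in this message (try the current runqueue first, then other runqueues, otherwise return -ESRCH) can be sketched as a small Python model. This is only an illustration under stated assumptions, not the kernel code: `Vcpu`, its `eligible` flag, `pick_yield_target` and `yield_to_result` are hypothetical names invented for the sketch.

```python
import errno
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Vcpu:
    name: str
    runqueue: int    # id of the host runqueue the vcpu thread sits on (assumed)
    eligible: bool   # assumed flag: True if yielding to this vcpu is worthwhile


def pick_yield_target(me: Vcpu, vcpus: List[Vcpu]) -> Optional[Vcpu]:
    """Prefer an eligible vcpu on our own runqueue, then one on any other."""
    candidates = [v for v in vcpus if v is not me and v.eligible]
    for v in candidates:
        if v.runqueue == me.runqueue:
            return v              # cheap case: same runqueue
    if candidates:
        return candidates[0]      # costlier case: a different runqueue
    return None


def yield_to_result(me: Vcpu, vcpus: List[Vcpu]) -> int:
    """Return 0 if a target was found, -ESRCH otherwise."""
    return 0 if pick_yield_target(me, vcpus) is not None else -errno.ESRCH
```

The model assumes `eligible` is known exactly. The concern raised in the message is that the vcpus on the current runqueue may all be spinners, which this sketch sidesteps rather than solves.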
Re: [PATCH RFC 1/2] kvm: Handle undercommitted guest case in PLE handler
On 10/04/2012 12:49 PM, Raghavendra K T wrote: > On 10/03/2012 10:35 PM, Avi Kivity wrote: >> On 10/03/2012 02:22 PM, Raghavendra K T wrote: So I think it's worth trying again with ple_window of 2-4. >>> >>> Hi Avi, >>> >>> I ran different benchmarks increasing ple_window, and results does not >>> seem to be encouraging for increasing ple_window. >> >> Thanks for testing! Comments below. >> >>> Results: >>> 16 core PLE machine with 16 vcpu guest. >>> >>> base kernel = 3.6-rc5 + ple handler optimization patch >>> base_pleopt_8k = base kernel + ple window = 8k >>> base_pleopt_16k = base kernel + ple window = 16k >>> base_pleopt_32k = base kernel + ple window = 32k >>> >>> >>> Percentage improvements of benchmarks w.r.t base_pleopt with >>> ple_window = 4096 >>> >>> base_pleopt_8kbase_pleopt_16kbase_pleopt_32k >>> - >>> >>> kernbench_1x-5.54915-15.94529-44.31562 >>> kernbench_2x-7.89399-17.75039-37.73498 >> >> So, 44% degradation even with no overcommit? That's surprising. > > Yes. Kernbench was run with #threads = #vcpu * 2 as usual. Is it > spending 8 times the original ple_window cycles for 16 vcpus > significant? A PLE exit when not overcommitted cannot do any good, it is better to spin in the guest rather that look for candidates on the host. In fact when we benchmark we often disable PLE completely. > >> >>> I also got perf top output to analyse the difference. Difference comes >>> because of flushtlb (and also spinlock). >> >> That's in the guest, yes? > > Yes. Perf is in guest. 
> >> >>> >>> Ebizzy run for 4k ple_window >>> - 87.20% [kernel] [k] arch_local_irq_restore >>> - arch_local_irq_restore >>>- 100.00% _raw_spin_unlock_irqrestore >>> + 52.89% release_pages >>> + 47.10% pagevec_lru_move_fn >>> - 5.71% [kernel] [k] arch_local_irq_restore >>> - arch_local_irq_restore >>>+ 86.03% default_send_IPI_mask_allbutself_phys >>>+ 13.96% default_send_IPI_mask_sequence_phys >>> - 3.10% [kernel] [k] smp_call_function_many >>> smp_call_function_many >>> >>> >>> Ebizzy run for 32k ple_window >>> >>> - 91.40% [kernel] [k] arch_local_irq_restore >>> - arch_local_irq_restore >>>- 100.00% _raw_spin_unlock_irqrestore >>> + 53.13% release_pages >>> + 46.86% pagevec_lru_move_fn >>> - 4.38% [kernel] [k] smp_call_function_many >>> smp_call_function_many >>> - 2.51% [kernel] [k] arch_local_irq_restore >>> - arch_local_irq_restore >>>+ 90.76% default_send_IPI_mask_allbutself_phys >>>+ 9.24% default_send_IPI_mask_sequence_phys >>> >> >> Both the 4k and the 32k results are crazy. Why is >> arch_local_irq_restore() so prominent? Do you have a very high >> interrupt rate in the guest? > > How to measure if I have high interrupt rate in guest? > From /proc/interrupt numbers I am not able to judge :( 'vmstat 1' > > I went back and got the results on a 32 core machine with 32 vcpu guest. > Strangely, I got result supporting the claim that increasing ple_window > helps for non-overcommitted scenario. > > 32 core 32 vcpu guest 1x scenarios. > > ple_gap = 0 > kernbench: Elapsed Time 38.61 > ebizzy: 7463 records/s > > ple_window = 4k > kernbench: Elapsed Time 43.5067 > ebizzy:2528 records/s > > ple_window = 32k > kernebench : Elapsed Time 39.4133 > ebizzy: 7196 records/s So maybe something was wrong with the first measurement. 
> > > perf top for ebizzy for above: > ple_gap = 0 > - 84.74% [kernel] [k] arch_local_irq_restore >- arch_local_irq_restore > - 100.00% _raw_spin_unlock_irqrestore > + 50.96% release_pages > + 49.02% pagevec_lru_move_fn > - 6.57% [kernel] [k] arch_local_irq_restore >- arch_local_irq_restore > + 92.54% default_send_IPI_mask_allbutself_phys > + 7.46% default_send_IPI_mask_sequence_phys > - 1.54% [kernel] [k] smp_call_function_many > smp_call_function_many Again the numbers are ridiculously high for arch_local_irq_restore. Maybe there's a bad perf/kvm interaction when we're injecting an interrupt, I can't believe we're spending 84% of the time running the popf instruction. > > ple_window = 32k > - 84.47% [kernel] [k] arch_local_irq_restore >+ arch_local_irq_restore > - 6.46% [kernel] [k] arch_local_irq_restore >- arch_local_irq_restore > + 93.51% default_send_IPI_mask_allbutself_phys > + 6.49% default_send_IPI_mask_sequence_phys > - 1.80% [kernel] [k] smp_call_function_many >- smp_call_function_many > + 99.98% native_flush_tlb_others > > > ple_window = 4k > - 91.35% [kernel] [k] arch_local_irq_restore >- arch_local_irq_restore > - 100.00% _raw_spin_unlock_irqrestore > + 53.19% release_pages > + 46.81% pagevec_lru_move_fn > - 3.90% [kernel] [k] smp_call_function_many > smp_call_fu
Re: qemu-kvm: remove "boot=on|off" drive parameter compatibility
On 2012-10-04 14:10, Lucas Meneghel Rodrigues wrote:
> On 10/04/2012 07:48 AM, Jan Kiszka wrote:
>> On 2012-10-03 15:19, Paolo Bonzini wrote:
>>> On 03/10/2012 12:57, Lucas Meneghel Rodrigues wrote:
>>>> Yep, I did send patches with the testdev device present on qemu-kvm.git
>>>> to qemu.git a while ago, but there were many comments on the review, I
>>>> ended up not implementing everything that was asked and the patches were
>>>> archived.
>>>>
>>>> If nobody wants to step up to port it, I'll re-read the original thread
>>>> and will spin up new patches (and try to go through the end with it).
>>>> Executing the KVM unittests is something that we can't afford to lose,
>>>> so I'd say it's important on this last mile effort to get rid of qemu-kvm.
>>>
>>> Absolutely, IIRC the problem was that testdev did a little bit of
>>> everything... let's see what's the functionality of testdev:
>>>
>>> - write (port 0xf1), can be replaced in autotest with:
>>>   -device isa-debugcon,iobase=0xf1,chardev=...
>>>
>>> - exit code (port 0xf4), see this series:
>>>   http://lists.gnu.org/archive/html/qemu-devel/2012-07/msg00818.html
>>>
>>> - ram size (port 0xd1).  If we can also patch kvm-unittests, the memory
>>>   is available in the CMOS or in fwcfg.  Here is the SeaBIOS code:
>>>
>>>     u32 rs = ((inb_cmos(0x34) << 16) | (inb_cmos(0x35) << 24));
>>>     if (rs)
>>>         rs += 16 * 1024 * 1024;
>>>     else
>>>         rs = (((inb_cmos(0x30) << 10) | (inb_cmos(0x31) << 18))
>>>               + 1 * 1024 * 1024);
>>>
>>> The rest (ports 0xe0..0xe7, 0x2000..0x2017, MMIO) can be left in testdev.
>>
>> IIRC, one of the biggest problems with testdev was its hack to inject
>> interrupts.
>
> Jan, I assume this commit helps to fix this, right?
>
> commit b334ec567f1de9a60349991e7b75083d569ddb0a
> Author: Jan Kiszka
> Date:   Fri Mar 2 10:30:47 2012 +0100
>
>     qemu-kvm: Use upstream kvm-i8259
>
>     Drop the qemu-kvm version in favor of the equivalent upstream
>     implementation. This allows to move the i8259 back into the hwlib.
>
>     Note that this also drops the testdev hack and restores proper
>     isa_get_irq. If testdev scripts exist that inject > IRQ15, they need
>     fixing. Testing for these interrupts on the PIIX3 makes no practical
>     sense anyway as those lines are unused.
>
>     Signed-off-by: Jan Kiszka
>     Signed-off-by: Avi Kivity

Yes, this improved it a lot as we no longer depend on additional changes.
I'm not sure if there was resistance beyond that. When cleaning up the
code: register_ioport_read must be replaced with the memory API.

Jan

--
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux
kvm-kmod 3.6 on linux 3.2.0
Hi,

kvm-kmod 3.6 fails to compile against a 3.2.0 kernel with the following error:

/usr/src/kvm-kmod-3.6/x86/x86.c: In function ‘get_msr_mce’:
/usr/src/kvm-kmod-3.6/x86/x86.c:1908:27: error: ‘kvm’ undeclared (first use in this function)
/usr/src/kvm-kmod-3.6/x86/x86.c:1908:27: note: each undeclared identifier is reported only once for each function it appears in

Any ideas?

Peter
Re: qemu-kvm: remove "boot=on|off" drive parameter compatibility
On 10/04/2012 07:48 AM, Jan Kiszka wrote:
> On 2012-10-03 15:19, Paolo Bonzini wrote:
>> On 03/10/2012 12:57, Lucas Meneghel Rodrigues wrote:
>>> Yep, I did send patches with the testdev device present on qemu-kvm.git
>>> to qemu.git a while ago, but there were many comments on the review, I
>>> ended up not implementing everything that was asked and the patches were
>>> archived.
>>>
>>> If nobody wants to step up to port it, I'll re-read the original thread
>>> and will spin up new patches (and try to go through the end with it).
>>> Executing the KVM unittests is something that we can't afford to lose,
>>> so I'd say it's important on this last mile effort to get rid of qemu-kvm.
>>
>> Absolutely, IIRC the problem was that testdev did a little bit of
>> everything... let's see what's the functionality of testdev:
>>
>> - write (port 0xf1), can be replaced in autotest with:
>>   -device isa-debugcon,iobase=0xf1,chardev=...
>>
>> - exit code (port 0xf4), see this series:
>>   http://lists.gnu.org/archive/html/qemu-devel/2012-07/msg00818.html
>>
>> - ram size (port 0xd1).  If we can also patch kvm-unittests, the memory
>>   is available in the CMOS or in fwcfg.  Here is the SeaBIOS code:
>>
>>     u32 rs = ((inb_cmos(0x34) << 16) | (inb_cmos(0x35) << 24));
>>     if (rs)
>>         rs += 16 * 1024 * 1024;
>>     else
>>         rs = (((inb_cmos(0x30) << 10) | (inb_cmos(0x31) << 18))
>>               + 1 * 1024 * 1024);
>>
>> The rest (ports 0xe0..0xe7, 0x2000..0x2017, MMIO) can be left in testdev.
>
> IIRC, one of the biggest problems with testdev was its hack to inject
> interrupts.

Jan, I assume this commit helps to fix this, right?

commit b334ec567f1de9a60349991e7b75083d569ddb0a
Author: Jan Kiszka
Date:   Fri Mar 2 10:30:47 2012 +0100

    qemu-kvm: Use upstream kvm-i8259

    Drop the qemu-kvm version in favor of the equivalent upstream
    implementation. This allows to move the i8259 back into the hwlib.

    Note that this also drops the testdev hack and restores proper
    isa_get_irq. If testdev scripts exist that inject > IRQ15, they need
    fixing. Testing for these interrupts on the PIIX3 makes no practical
    sense anyway as those lines are unused.

    Signed-off-by: Jan Kiszka
    Signed-off-by: Avi Kivity
Re: [PATCH 6/6] KVM: booke/bookehv: Add debug stub support
On 04.10.2012, at 13:06, Bhushan Bharat-R65777 wrote: > > >> -Original Message- >> From: Alexander Graf [mailto:ag...@suse.de] >> Sent: Monday, September 24, 2012 9:50 PM >> To: Bhushan Bharat-R65777 >> Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Bhushan Bharat-R65777 >> Subject: Re: [PATCH 6/6] KVM: booke/bookehv: Add debug stub support >> >> >> On 21.08.2012, at 15:52, Bharat Bhushan wrote: >> >>> This patch adds the debug stub support on booke/bookehv. >>> Now QEMU debug stub can use hw breakpoint, watchpoint and >>> software breakpoint to debug guest. >>> >>> Signed-off-by: Bharat Bhushan >>> --- >>> arch/powerpc/include/asm/kvm.h| 29 ++- >>> arch/powerpc/include/asm/kvm_host.h |5 + >>> arch/powerpc/kernel/asm-offsets.c | 26 ++ >>> arch/powerpc/kvm/booke.c | 144 >>> + >>> arch/powerpc/kvm/booke_interrupts.S | 110 + >>> arch/powerpc/kvm/bookehv_interrupts.S | 141 >>> +++- >>> arch/powerpc/kvm/e500mc.c |3 +- >>> 7 files changed, 435 insertions(+), 23 deletions(-) >>> >>> diff --git a/arch/powerpc/include/asm/kvm.h b/arch/powerpc/include/asm/kvm.h >>> index 61b197e..53479ea 100644 >>> --- a/arch/powerpc/include/asm/kvm.h >>> +++ b/arch/powerpc/include/asm/kvm.h >>> @@ -25,6 +25,7 @@ >>> /* Select powerpc specific features in */ >>> #define __KVM_HAVE_SPAPR_TCE >>> #define __KVM_HAVE_PPC_SMT >>> +#define __KVM_HAVE_GUEST_DEBUG >>> >>> struct kvm_regs { >>> __u64 pc; >>> @@ -264,7 +265,31 @@ struct kvm_fpu { >>> __u64 fpr[32]; >>> }; >>> >>> + >>> +/* >>> + * Defines for h/w breakpoint, watchpoint (read, write or both) and >>> + * software breakpoint. >>> + * These are used as "type" in KVM_SET_GUEST_DEBUG ioctl and "status" >>> + * for KVM_DEBUG_EXIT. >>> + */ >>> +#define KVMPPC_DEBUG_NONE 0x0 >>> +#define KVMPPC_DEBUG_BREAKPOINT(1UL << 1) >>> +#define KVMPPC_DEBUG_WATCH_WRITE (1UL << 2) >>> +#define KVMPPC_DEBUG_WATCH_READ(1UL << 3) >>> struct kvm_debug_exit_arch { >>> + __u64 pc; >>> + /* >>> +* exception -> returns the exception number. 
If the KVM_DEBUG_EXIT >>> +* exit is not handled (say not h/w breakpoint or software breakpoint >>> +* set for this address) by qemu then it is supposed to inject this >>> +* exception to guest. >>> +*/ >>> + __u32 exception; >>> + /* >>> +* exiting to userspace because of h/w breakpoint, watchpoint >>> +* (read, write or both) and software breakpoint. >>> +*/ >>> + __u32 status; >>> }; >>> >>> /* for KVM_SET_GUEST_DEBUG */ >>> @@ -276,10 +301,6 @@ struct kvm_guest_debug_arch { >>> * Type denotes h/w breakpoint, read watchpoint, write >>> * watchpoint or watchpoint (both read and write). >>> */ >>> -#define KVMPPC_DEBUG_NOTYPE0x0 >>> -#define KVMPPC_DEBUG_BREAKPOINT(1UL << 1) >>> -#define KVMPPC_DEBUG_WATCH_WRITE (1UL << 2) >>> -#define KVMPPC_DEBUG_WATCH_READ(1UL << 3) >>> __u32 type; >>> __u32 pad1; >>> __u64 pad2; >>> diff --git a/arch/powerpc/include/asm/kvm_host.h >> b/arch/powerpc/include/asm/kvm_host.h >>> index c7219c1..3ba465a 100644 >>> --- a/arch/powerpc/include/asm/kvm_host.h >>> +++ b/arch/powerpc/include/asm/kvm_host.h >>> @@ -496,7 +496,12 @@ struct kvm_vcpu_arch { >>> u32 mmucfg; >>> u32 epr; >>> u32 crit_save; >>> + /* guest debug registers*/ >>> struct kvmppc_booke_debug_reg dbg_reg; >>> + /* shadow debug registers */ >>> + struct kvmppc_booke_debug_reg shadow_dbg_reg; >>> + /* host debug registers*/ >>> + struct kvmppc_booke_debug_reg host_dbg_reg; >>> #endif >>> gpa_t paddr_accessed; >>> gva_t vaddr_accessed; >>> diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm- >> offsets.c >>> index 555448e..6987821 100644 >>> --- a/arch/powerpc/kernel/asm-offsets.c >>> +++ b/arch/powerpc/kernel/asm-offsets.c >>> @@ -564,6 +564,32 @@ int main(void) >>> DEFINE(VCPU_FAULT_DEAR, offsetof(struct kvm_vcpu, arch.fault_dear)); >>> DEFINE(VCPU_FAULT_ESR, offsetof(struct kvm_vcpu, arch.fault_esr)); >>> DEFINE(VCPU_CRIT_SAVE, offsetof(struct kvm_vcpu, arch.crit_save)); >>> + DEFINE(VCPU_DBSR, offsetof(struct kvm_vcpu, arch.dbsr)); >>> + 
DEFINE(VCPU_SHADOW_DBG, offsetof(struct kvm_vcpu, arch.shadow_dbg_reg)); >>> + DEFINE(VCPU_HOST_DBG, offsetof(struct kvm_vcpu, arch.host_dbg_reg)); >>> + DEFINE(KVMPPC_DBG_DBCR0, offsetof(struct kvmppc_booke_debug_reg, >>> + dbcr0)); >>> + DEFINE(KVMPPC_DBG_DBCR1, offsetof(struct kvmppc_booke_debug_reg, >>> + dbcr1)); >>> + DEFINE(KVMPPC_DBG_DBCR2, offsetof(struct kvmppc_booke_debug_reg, >>> + dbcr2)); >>> +#ifdef CONFIG_KVM_E500MC >>> + DEFINE(KVMPPC_DBG_DBCR4, offsetof(struct kvm
RE: [PATCH 6/6] KVM: booke/bookehv: Add debug stub support
> -Original Message- > From: Alexander Graf [mailto:ag...@suse.de] > Sent: Monday, September 24, 2012 9:50 PM > To: Bhushan Bharat-R65777 > Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Bhushan Bharat-R65777 > Subject: Re: [PATCH 6/6] KVM: booke/bookehv: Add debug stub support > > > On 21.08.2012, at 15:52, Bharat Bhushan wrote: > > > This patch adds the debug stub support on booke/bookehv. > > Now QEMU debug stub can use hw breakpoint, watchpoint and > > software breakpoint to debug guest. > > > > Signed-off-by: Bharat Bhushan > > --- > > arch/powerpc/include/asm/kvm.h| 29 ++- > > arch/powerpc/include/asm/kvm_host.h |5 + > > arch/powerpc/kernel/asm-offsets.c | 26 ++ > > arch/powerpc/kvm/booke.c | 144 > > + > > arch/powerpc/kvm/booke_interrupts.S | 110 + > > arch/powerpc/kvm/bookehv_interrupts.S | 141 > > +++- > > arch/powerpc/kvm/e500mc.c |3 +- > > 7 files changed, 435 insertions(+), 23 deletions(-) > > > > diff --git a/arch/powerpc/include/asm/kvm.h b/arch/powerpc/include/asm/kvm.h > > index 61b197e..53479ea 100644 > > --- a/arch/powerpc/include/asm/kvm.h > > +++ b/arch/powerpc/include/asm/kvm.h > > @@ -25,6 +25,7 @@ > > /* Select powerpc specific features in */ > > #define __KVM_HAVE_SPAPR_TCE > > #define __KVM_HAVE_PPC_SMT > > +#define __KVM_HAVE_GUEST_DEBUG > > > > struct kvm_regs { > > __u64 pc; > > @@ -264,7 +265,31 @@ struct kvm_fpu { > > __u64 fpr[32]; > > }; > > > > + > > +/* > > + * Defines for h/w breakpoint, watchpoint (read, write or both) and > > + * software breakpoint. > > + * These are used as "type" in KVM_SET_GUEST_DEBUG ioctl and "status" > > + * for KVM_DEBUG_EXIT. > > + */ > > +#define KVMPPC_DEBUG_NONE 0x0 > > +#define KVMPPC_DEBUG_BREAKPOINT(1UL << 1) > > +#define KVMPPC_DEBUG_WATCH_WRITE (1UL << 2) > > +#define KVMPPC_DEBUG_WATCH_READ(1UL << 3) > > struct kvm_debug_exit_arch { > > + __u64 pc; > > + /* > > +* exception -> returns the exception number. 
If the KVM_DEBUG_EXIT > > +* exit is not handled (say not h/w breakpoint or software breakpoint > > +* set for this address) by qemu then it is supposed to inject this > > +* exception to guest. > > +*/ > > + __u32 exception; > > + /* > > +* exiting to userspace because of h/w breakpoint, watchpoint > > +* (read, write or both) and software breakpoint. > > +*/ > > + __u32 status; > > }; > > > > /* for KVM_SET_GUEST_DEBUG */ > > @@ -276,10 +301,6 @@ struct kvm_guest_debug_arch { > > * Type denotes h/w breakpoint, read watchpoint, write > > * watchpoint or watchpoint (both read and write). > > */ > > -#define KVMPPC_DEBUG_NOTYPE0x0 > > -#define KVMPPC_DEBUG_BREAKPOINT(1UL << 1) > > -#define KVMPPC_DEBUG_WATCH_WRITE (1UL << 2) > > -#define KVMPPC_DEBUG_WATCH_READ(1UL << 3) > > __u32 type; > > __u32 pad1; > > __u64 pad2; > > diff --git a/arch/powerpc/include/asm/kvm_host.h > b/arch/powerpc/include/asm/kvm_host.h > > index c7219c1..3ba465a 100644 > > --- a/arch/powerpc/include/asm/kvm_host.h > > +++ b/arch/powerpc/include/asm/kvm_host.h > > @@ -496,7 +496,12 @@ struct kvm_vcpu_arch { > > u32 mmucfg; > > u32 epr; > > u32 crit_save; > > + /* guest debug registers*/ > > struct kvmppc_booke_debug_reg dbg_reg; > > + /* shadow debug registers */ > > + struct kvmppc_booke_debug_reg shadow_dbg_reg; > > + /* host debug registers*/ > > + struct kvmppc_booke_debug_reg host_dbg_reg; > > #endif > > gpa_t paddr_accessed; > > gva_t vaddr_accessed; > > diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm- > offsets.c > > index 555448e..6987821 100644 > > --- a/arch/powerpc/kernel/asm-offsets.c > > +++ b/arch/powerpc/kernel/asm-offsets.c > > @@ -564,6 +564,32 @@ int main(void) > > DEFINE(VCPU_FAULT_DEAR, offsetof(struct kvm_vcpu, arch.fault_dear)); > > DEFINE(VCPU_FAULT_ESR, offsetof(struct kvm_vcpu, arch.fault_esr)); > > DEFINE(VCPU_CRIT_SAVE, offsetof(struct kvm_vcpu, arch.crit_save)); > > + DEFINE(VCPU_DBSR, offsetof(struct kvm_vcpu, arch.dbsr)); > > + 
DEFINE(VCPU_SHADOW_DBG, offsetof(struct kvm_vcpu, arch.shadow_dbg_reg)); > > + DEFINE(VCPU_HOST_DBG, offsetof(struct kvm_vcpu, arch.host_dbg_reg)); > > + DEFINE(KVMPPC_DBG_DBCR0, offsetof(struct kvmppc_booke_debug_reg, > > + dbcr0)); > > + DEFINE(KVMPPC_DBG_DBCR1, offsetof(struct kvmppc_booke_debug_reg, > > + dbcr1)); > > + DEFINE(KVMPPC_DBG_DBCR2, offsetof(struct kvmppc_booke_debug_reg, > > + dbcr2)); > > +#ifdef CONFIG_KVM_E500MC > > + DEFINE(KVMPPC_DBG_DBCR4, offsetof(struct kvmppc_booke_debug_reg, > > + dbcr4)); > > +
Re: [PATCH RFC 0/2] kvm: Improving undercommit,overcommit scenarios in PLE handler
On 10/03/2012 10:55 PM, Avi Kivity wrote:
> On 10/03/2012 04:29 PM, Raghavendra K T wrote:
>> * Avi Kivity [2012-09-27 14:03:59]:
>>
>>> On 09/27/2012 01:23 PM, Raghavendra K T wrote:
>> [...]
>>>> 2) looking at the result (comparing A & C), I do feel we have
>>>> significant overhead in iterating over vcpus (when compared to even
>>>> vmexit), so we still would need the undercommit fix suggested by
>>>> PeterZ (improving by 140%)?
>>>
>>> Looking only at the current runqueue?  My worry is that it misses a lot
>>> of cases.  Maybe try the current runqueue first and then others.
>>
>> Okay. Do you mean we can have something like
>>
>> +       if (rq->nr_running == 1 && p_rq->nr_running == 1) {
>> +               yielded = -ESRCH;
>> +               goto out_irq;
>> +       }
>>
>> in Peter's patch?
>>
>> (I thought a lot about && or ||. Both seem to have their own cons.)
>> But that should be only when we have short term imbalance, as PeterZ
>> told.
>
> I'm missing the context.  What is p_rq?

p_rq is the run queue of the target vcpu.
What I was trying with that check was to address Rik's concern. Suppose
the rq of the source vcpu has one task, but the target probably has two
tasks, with an eligible vcpu waiting to be scheduled.

> What I mean was:
>
>    if can_yield_to_process_in_current_rq
>       do that
>    else if can_yield_to_process_in_other_rq
>       do that
>    else
>       return -ESRCH

I think you are saying we have to check the run queue of the
source vcpu, see if we have a vcpu belonging to the same VM and try to
yield to that, ignoring whatever target vcpu we received for yield_to?

Or is it that kvm_vcpu_yield_to should now check the vcpus of the same VM
belonging to the same run queue first, and if we don't succeed, go again
for a vcpu in a different runqueue?

Does it add more overhead, especially in the <= 1x scenario?
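The trade-off between `&&` and `||` in the proposed early exit can be made concrete with a small Python model. This is a sketch under assumptions, not the scheduler code: `undercommit_check` and its `require_both` switch are hypothetical, and the return value 0 simply stands for "go ahead with the yield".

```python
import errno


def undercommit_check(rq_nr_running: int, p_rq_nr_running: int,
                      require_both: bool = True) -> int:
    """Model of the proposed early exit.

    rq_nr_running:   tasks on the source vcpu's runqueue (rq)
    p_rq_nr_running: tasks on the target vcpu's runqueue (p_rq)
    require_both:    True models '&&', False models '||'

    Returns -ESRCH when the yield is skipped, 0 when it may proceed.
    """
    src_alone = rq_nr_running == 1
    dst_alone = p_rq_nr_running == 1
    if require_both:
        skip = src_alone and dst_alone
    else:
        skip = src_alone or dst_alone
    return -errno.ESRCH if skip else 0
```

In the scenario described above (one task on the source runqueue, two on the target), the `&&` form lets the yield proceed while the `||` form would skip it.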
Re: [PATCH RFC 1/2] kvm: Handle undercommitted guest case in PLE handler
On 10/03/2012 10:35 PM, Avi Kivity wrote:
> On 10/03/2012 02:22 PM, Raghavendra K T wrote:
>>> So I think it's worth trying again with ple_window of 2-4.
>>
>> Hi Avi,
>>
>> I ran different benchmarks increasing ple_window, and results do not
>> seem to be encouraging for increasing ple_window.
>
> Thanks for testing!  Comments below.
>
>> Results:
>> 16 core PLE machine with 16 vcpu guest.
>>
>> base kernel = 3.6-rc5 + ple handler optimization patch
>> base_pleopt_8k = base kernel + ple window = 8k
>> base_pleopt_16k = base kernel + ple window = 16k
>> base_pleopt_32k = base kernel + ple window = 32k
>>
>> Percentage improvements of benchmarks w.r.t base_pleopt with
>> ple_window = 4096
>>
>>                base_pleopt_8k   base_pleopt_16k   base_pleopt_32k
>> -------------------------------------------------------------------
>> kernbench_1x        -5.54915         -15.94529         -44.31562
>> kernbench_2x        -7.89399         -17.75039         -37.73498
>
> So, 44% degradation even with no overcommit?  That's surprising.

Yes. Kernbench was run with #threads = #vcpu * 2 as usual. Is it
spending 8 times the original ple_window cycles for 16 vcpus
significant?

>> I also got perf top output to analyse the difference. Difference comes
>> because of flushtlb (and also spinlock).
>
> That's in the guest, yes?

Yes. Perf is in guest.

>> Ebizzy run for 4k ple_window
>> -  87.20%  [kernel]  [k] arch_local_irq_restore
>>    - arch_local_irq_restore
>>       - 100.00% _raw_spin_unlock_irqrestore
>>          + 52.89% release_pages
>>          + 47.10% pagevec_lru_move_fn
>> -   5.71%  [kernel]  [k] arch_local_irq_restore
>>    - arch_local_irq_restore
>>       + 86.03% default_send_IPI_mask_allbutself_phys
>>       + 13.96% default_send_IPI_mask_sequence_phys
>> -   3.10%  [kernel]  [k] smp_call_function_many
>>      smp_call_function_many
>>
>> Ebizzy run for 32k ple_window
>> -  91.40%  [kernel]  [k] arch_local_irq_restore
>>    - arch_local_irq_restore
>>       - 100.00% _raw_spin_unlock_irqrestore
>>          + 53.13% release_pages
>>          + 46.86% pagevec_lru_move_fn
>> -   4.38%  [kernel]  [k] smp_call_function_many
>>      smp_call_function_many
>> -   2.51%  [kernel]  [k] arch_local_irq_restore
>>    - arch_local_irq_restore
>>       + 90.76% default_send_IPI_mask_allbutself_phys
>>       + 9.24% default_send_IPI_mask_sequence_phys
>
> Both the 4k and the 32k results are crazy.  Why is
> arch_local_irq_restore() so prominent?  Do you have a very high
> interrupt rate in the guest?

How to measure if I have high interrupt rate in guest?
I am not able to judge from the /proc/interrupts numbers :(

I went back and got the results on a 32 core machine with 32 vcpu guest.
Strangely, I got results supporting the claim that increasing ple_window
helps for the non-overcommitted scenario.

32 core 32 vcpu guest 1x scenarios.

ple_gap = 0
kernbench: Elapsed Time 38.61
ebizzy: 7463 records/s

ple_window = 4k
kernbench: Elapsed Time 43.5067
ebizzy: 2528 records/s

ple_window = 32k
kernbench: Elapsed Time 39.4133
ebizzy: 7196 records/s

perf top for ebizzy for above:
ple_gap = 0
-  84.74%  [kernel]  [k] arch_local_irq_restore
   - arch_local_irq_restore
      - 100.00% _raw_spin_unlock_irqrestore
         + 50.96% release_pages
         + 49.02% pagevec_lru_move_fn
-   6.57%  [kernel]  [k] arch_local_irq_restore
   - arch_local_irq_restore
      + 92.54% default_send_IPI_mask_allbutself_phys
      + 7.46% default_send_IPI_mask_sequence_phys
-   1.54%  [kernel]  [k] smp_call_function_many
     smp_call_function_many

ple_window = 32k
-  84.47%  [kernel]  [k] arch_local_irq_restore
   + arch_local_irq_restore
-   6.46%  [kernel]  [k] arch_local_irq_restore
   - arch_local_irq_restore
      + 93.51% default_send_IPI_mask_allbutself_phys
      + 6.49% default_send_IPI_mask_sequence_phys
-   1.80%  [kernel]  [k] smp_call_function_many
   - smp_call_function_many
      + 99.98% native_flush_tlb_others

ple_window = 4k
-  91.35%  [kernel]  [k] arch_local_irq_restore
   - arch_local_irq_restore
      - 100.00% _raw_spin_unlock_irqrestore
         + 53.19% release_pages
         + 46.81% pagevec_lru_move_fn
-   3.90%  [kernel]  [k] smp_call_function_many
     smp_call_function_many
-   2.94%  [kernel]  [k] arch_local_irq_restore
   - arch_local_irq_restore
      + 93.12% default_send_IPI_mask_allbutself_phys
      + 6.88% default_send_IPI_mask_sequence_phys

Let me know if I can try something here..
/me confused :(
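To put the 32 core / 32 vcpu numbers above on the same footing as the earlier percentage table, the relative change against the ple_gap = 0 run can be worked out as below. This is a sketch, not the script used for the original table: `improvement_pct` is a hypothetical helper, and the sign convention (positive means better, elapsed time is lower-is-better, records/s is higher-is-better) is an assumption.

```python
def improvement_pct(base: float, new: float, lower_is_better: bool) -> float:
    """Percentage improvement of `new` over `base` (negative means regression)."""
    if lower_is_better:
        return (base - new) / base * 100.0
    return (new - base) / base * 100.0


# Figures copied from the 1x results above; ple_gap = 0 is taken as the baseline.
kernbench = {"ple_gap=0": 38.61, "ple_window=4k": 43.5067, "ple_window=32k": 39.4133}
ebizzy = {"ple_gap=0": 7463, "ple_window=4k": 2528, "ple_window=32k": 7196}

for cfg in ("ple_window=4k", "ple_window=32k"):
    kb = improvement_pct(kernbench["ple_gap=0"], kernbench[cfg], lower_is_better=True)
    eb = improvement_pct(ebizzy["ple_gap=0"], ebizzy[cfg], lower_is_better=False)
    print(f"{cfg}: kernbench {kb:+.2f}%  ebizzy {eb:+.2f}%")
```

With these inputs the 4k window comes out roughly 13% worse on kernbench and roughly 66% worse on ebizzy than ple_gap = 0, while the 32k window stays within a few percent on both.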
Re: qemu-kvm: remove "boot=on|off" drive parameter compatibility
On 2012-10-03 15:19, Paolo Bonzini wrote:
> On 03/10/2012 12:57, Lucas Meneghel Rodrigues wrote:
>> Yep, I did send patches with the testdev device present on qemu-kvm.git
>> to qemu.git a while ago, but there were many comments on the review, I
>> ended up not implementing everything that was asked and the patches were
>> archived.
>>
>> If nobody wants to step up to port it, I'll re-read the original thread
>> and will spin up new patches (and try to go through the end with it).
>> Executing the KVM unittests is something that we can't afford to lose,
>> so I'd say it's important on this last mile effort to get rid of qemu-kvm.
>
> Absolutely, IIRC the problem was that testdev did a little bit of
> everything... let's see what's the functionality of testdev:
>
> - write (port 0xf1), can be replaced in autotest with:
>   -device isa-debugcon,iobase=0xf1,chardev=...
>
> - exit code (port 0xf4), see this series:
>   http://lists.gnu.org/archive/html/qemu-devel/2012-07/msg00818.html
>
> - ram size (port 0xd1).  If we can also patch kvm-unittests, the memory
>   is available in the CMOS or in fwcfg.  Here is the SeaBIOS code:
>
>     u32 rs = ((inb_cmos(0x34) << 16) | (inb_cmos(0x35) << 24));
>     if (rs)
>         rs += 16 * 1024 * 1024;
>     else
>         rs = (((inb_cmos(0x30) << 10) | (inb_cmos(0x31) << 18))
>               + 1 * 1024 * 1024);
>
> The rest (ports 0xe0..0xe7, 0x2000..0x2017, MMIO) can be left in testdev.

IIRC, one of the biggest problems with testdev was its hack to inject
interrupts.

Jan

--
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux
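For readers who want to check the arithmetic in the quoted SeaBIOS snippet, here is a Python rendering of the same computation. It is a model for illustration only: `ram_size_from_cmos` and `fake_cmos` are hypothetical names, and the register contents in the usage note are made-up values, not readings from a real machine.

```python
def ram_size_from_cmos(inb_cmos) -> int:
    """Python rendering of the quoted snippet; returns the RAM size in bytes.

    inb_cmos: callable taking a CMOS register index and returning one byte.
    """
    # Registers 0x34/0x35: memory above 16 MiB, counted in 64 KiB units.
    rs = ((inb_cmos(0x34) << 16) | (inb_cmos(0x35) << 24)) & 0xFFFFFFFF
    if rs:
        rs += 16 * 1024 * 1024
    else:
        # Registers 0x30/0x31: memory above 1 MiB, counted in KiB.
        rs = ((inb_cmos(0x30) << 10) | (inb_cmos(0x31) << 18)) + 1 * 1024 * 1024
    return rs & 0xFFFFFFFF  # mimic the u32 in the C code


def fake_cmos(values):
    """Build a stand-in inb_cmos() backed by a dict; unset registers read as 0."""
    return lambda index: values.get(index, 0) & 0xFF
```

For example, `ram_size_from_cmos(fake_cmos({0x35: 0x07}))` models a guest whose CMOS reports 112 MiB above the 16 MiB mark, and `ram_size_from_cmos(fake_cmos({0x31: 0x1C}))` exercises the fallback branch for a small guest.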
[virt][PATCH 4/4] virt: Adds multihost machine type test variants.
This patch groups multihost test into one group to which is added machine types variant of test. Signed-off-by: Jiří Župka --- kvm/cfg/multi-host-tests.cfg.sample |6 +- shared/cfg/subtests.cfg.sample | 164 ++- 2 files changed, 107 insertions(+), 63 deletions(-) diff --git a/kvm/cfg/multi-host-tests.cfg.sample b/kvm/cfg/multi-host-tests.cfg.sample index bc0ce03..d47e756 100644 --- a/kvm/cfg/multi-host-tests.cfg.sample +++ b/kvm/cfg/multi-host-tests.cfg.sample @@ -21,7 +21,8 @@ variants: only no_pci_assignable only no_9p_export only smallpages -only Fedora.15.64 +only pc +only Fedora.17.64 only migrate_multi_host # Runs qemu, f16 64 bit guest OS, install, boot, shutdown @@ -37,7 +38,8 @@ variants: only no_pci_assignable only no_9p_export only smallpages -only Fedora.15.64 +only pc +only Fedora.17.64 only cpuflags_multi_host only qemu_migrate_multi_host diff --git a/shared/cfg/subtests.cfg.sample b/shared/cfg/subtests.cfg.sample index ab62e12..957bb9c 100644 --- a/shared/cfg/subtests.cfg.sample +++ b/shared/cfg/subtests.cfg.sample @@ -960,52 +960,111 @@ variants: - monotonic_time: test_control_file = monotonic_time.control -- migrate_multi_host: install setup image_copy unattended_install.cdrom -type = migration_multi_host -vms = "vm1" -start_vm = no -migration_test_command = help -migration_bg_command = "cd /tmp; nohup tcpdump -q -i any -t ip host localhost" -migration_bg_check_command = pgrep tcpdump -migration_bg_kill_command = pkill tcpdump -kill_vm_on_error = yes -iterations = 2 -used_mem = 1024 -mig_timeout = 4800 -disk_prepare_timeout = 360 -comm_port = 13234 -regain_ip_cmd = killall dhclient; sleep 10; dhclient; +- @multi_host: variants: -#Migration protocol. 
--tcp: +- migrate_multi_host: install setup image_copy unattended_install.cdrom +type = migration_multi_host +vms = "vm1" +start_vm = no +migration_test_command = help +migration_bg_command = "cd /tmp; nohup tcpdump -q -i any -t ip host localhost" +migration_bg_check_command = pgrep tcpdump +migration_bg_kill_command = pkill tcpdump +kill_vm_on_error = yes +iterations = 2 +used_mem = 1024 +mig_timeout = 4800 +disk_prepare_timeout = 360 +comm_port = 13234 +regain_ip_cmd = killall dhclient; sleep 10; dhclient; variants: -- @default: -type = migration_multi_host -- measure_migration_speed: -only Linux -mig_speed = 1G -type = migration_multi_host_with_speed_measurement --fd: -type = migration_multi_host_fd - -- migration_multi_host_with_file_transfer: install setup image_copy unattended_install.cdrom -type = migration_multi_host_with_file_transfer -vms = "vm1" -start_vm = no -kill_vm_on_error = yes -used_mem = 1024 -mig_timeout = 4800 -disk_prepare_timeout = 360 -comm_port = 13234 -#path where file is stored on guest. -guest_path = "/tmp/file" -#size of generated file. -file_size = 500 -transfer_timeout = 240 -#Transfer speed in Mb -transfer_speed = 100 -#Count of migration during file transfer. -migrate_count = 3 +#Migration protocol. +-tcp: +variants: +- @default: +type = migration_multi_host +- measure_migration_speed: +only Linux +mig_speed = 1G +type = migration_multi_host_with_speed_measurement +-fd: +type = migration_multi_host_fd + +- migration_multi_host_with_file_transfer: install setup image_copy unattended_install.cdrom +type = migration_multi_host_with_file_transfer +vms = "vm1" +start_vm = no +kill_vm_on_error = yes +used_mem = 1024 +mig_timeout = 4800 +disk_prepare_timeout = 360 +comm_port = 13234 +#path where file is stored on guest. +guest_path = "/tmp/file" +#size of generated file. +file_size = 500 +trans
[virt][PATCH 3/4] virt: Adds kvm,libvirt check of machine model
Signed-off-by: Jiří Župka --- virttest/kvm_vm.py | 12 ++-- virttest/libvirt_vm.py | 13 +++-- 2 files changed, 21 insertions(+), 4 deletions(-) diff --git a/virttest/kvm_vm.py b/virttest/kvm_vm.py index 9877d55..c51cd1b 100644 --- a/virttest/kvm_vm.py +++ b/virttest/kvm_vm.py @@ -830,7 +830,6 @@ class VM(virt_vm.BaseVM): cmd = "" return cmd - def add_machine_type(hlp, machine_type): if has_option(hlp, "machine") or has_option(hlp, "M"): return " -M %s" % machine_type @@ -916,6 +915,7 @@ class VM(virt_vm.BaseVM): self.qemu_binary = qemu_binary hlp = commands.getoutput("%s -help" % qemu_binary) support_cpu_model = commands.getoutput("%s -cpu ?list" % qemu_binary) +support_machine_type = commands.getoutput("%s -M ?" % qemu_binary) device_help = "" if has_option(hlp, "device"): @@ -1165,7 +1165,15 @@ class VM(virt_vm.BaseVM): machine_type = params.get("machine_type") if machine_type: -qemu_cmd += add_machine_type(hlp, machine_type) +m_types = [] +for m in support_machine_type.splitlines()[1:]: +m_types.append(m.split()[0]) + +if machine_type in m_types: +qemu_cmd += add_machine_type(hlp, machine_type) +else: +raise error.TestNAError("Not supported machine type %s." % +(machine_type)) for cdrom in params.objects("cdroms"): cd_format = params.get("cd_format", "") diff --git a/virttest/libvirt_vm.py b/virttest/libvirt_vm.py index 4e03835..c46c139 100644 --- a/virttest/libvirt_vm.py +++ b/virttest/libvirt_vm.py @@ -8,7 +8,7 @@ import time, os, logging, fcntl, re, shutil, urlparse, tempfile from autotest.client.shared import error from autotest.client import utils, os_dep from xml.dom import minidom -import utils_misc, virt_vm, storage, aexpect, remote, virsh +import utils_misc, virt_vm, storage, aexpect, remote, virsh, libvirt_xml def libvirtd_restart(): @@ -521,6 +521,11 @@ class VM(virt_vm.BaseVM): help = utils.system_output("%s --help" % virt_install_binary) +# Find all machine type. 
There isn't filtred any machine type even if +# domain doesn't support this machine type. Filtering could be added. +me = libvirt_xml.LibvirtXML().getroot().findall("./guest/arch/machine") +support_machine_type = map(lambda m: m.text, me) + # Start constructing the qemu command virt_install_cmd = "" # Set the X11 display parameter if requested @@ -542,7 +547,11 @@ class VM(virt_vm.BaseVM): machine_type = params.get("machine_type") if machine_type: -virt_install_cmd += add_machine_type(help, machine_type) +if machine_type in support_machine_type: +virt_install_cmd += add_machine_type(help, machine_type) +else: +raise error.TestNAError("Not supported machine type %s." % +(machine_type)) mem = params.get("mem") if mem: -- 1.7.7.6
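The machine-type check added to kvm_vm.py in the patch above can be sketched as a standalone helper. This is a minimal sketch under stated assumptions: the function names are hypothetical, the sample text is made up (real `qemu -M ?` output differs between QEMU versions), and `ValueError` stands in for autotest's `error.TestNAError` so that the example has no external dependencies.

```python
def parse_machine_types(output):
    """Return machine type names from `qemu -M ?`-style text.

    The first line is treated as a heading and skipped, as in the patch.
    Blank lines are ignored so that split()[0] cannot raise IndexError.
    """
    names = []
    for line in output.splitlines()[1:]:
        fields = line.split()
        if fields:
            names.append(fields[0])
    return names


def check_machine_type(machine_type, output):
    """Return machine_type if it is listed, otherwise raise ValueError."""
    if machine_type not in parse_machine_types(output):
        raise ValueError("Not supported machine type %s." % machine_type)
    return machine_type


# Hypothetical sample text, for illustration only.
SAMPLE = """Supported machines are:
pc         Standard PC (default)
isapc      ISA-only PC

xenfv      Xen Fully-virtualized PC
"""

print(parse_machine_types(SAMPLE))
```

Unlike the loop in the patch, this variant tolerates empty lines in the output; whether QEMU ever prints such lines is not something the thread establishes.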
[virt][PATCH 2/4] virt: Cleanup unused import in libvirt_xml.py
Signed-off-by: Jiří Župka --- virttest/libvirt_xml.py |6 ++ 1 files changed, 2 insertions(+), 4 deletions(-) diff --git a/virttest/libvirt_xml.py b/virttest/libvirt_xml.py index bedc496..4376111 100644 --- a/virttest/libvirt_xml.py +++ b/virttest/libvirt_xml.py @@ -8,10 +8,8 @@ xml_utils module documentation for more information on working with XMLTreeFile instances. """ - -import logging, os.path -from autotest.client.shared import error, xml_utils -from virttest import libvirt_vm, virsh +from autotest.client.shared import xml_utils +from virttest import virsh class LibvirtXMLError(Exception): -- 1.7.7.6
Re: [Qemu-devel] [patch 2/6] Use machine options to emulate -no-kvm-irqchip
On 2012-10-03 20:26, Marcelo Tosatti wrote: > On Wed, Oct 03, 2012 at 07:24:48PM +0200, Jan Kiszka wrote: >> On 2012-10-03 19:16, Anthony Liguori wrote: >>> Jan Kiszka writes: >>> On 2012-10-03 17:03, Marcelo Tosatti wrote: > On Wed, Oct 03, 2012 at 09:40:17AM -0500, Anthony Liguori wrote: >> Marcelo Tosatti writes: >> >>> Commit 3ad763fcba5bd0ec5a79d4a9b6baeef119dd4a3d from qemu-kvm.git. >>> >>> From: Jan Kiszka >>> >>> Upstream is moving towards this mechanism, so start using it in qemu-kvm >>> already to configure the specific defaults: kvm enabled on, just like >>> in-kernel irqchips. >>> >>> Signed-off-by: Marcelo Tosatti >> >> >> Reviewed-by: Anthony Liguori >> >> Although it's a little odd to have From: Jan without a SoB... > > Agree, Jan can you ACK? I wasn't able to join the call yesterday: Is there a removal schedule associated with those switches? Also, why pushing things upstream, even when only for one release, that have been loudly deprecated for a while in qemu-kvm? Some switches are lacking deprecated warnings on the console, and -no-kvm is missing completely. I tend to focus on patch 1 & 5, dropping the rest - based on relevance for production use. >>> >>> The distros need to keep these flags to do the switch. >> >> Why? Should be documented in commit log. >> >>> I see no point >>> in deprecating them since they're trivially easy to maintain. >> >> Given the level of cr** we already have in the command line, they are >> kind of noise, yes. But even then, these patches are not consistent as >> pointed out above. >> >> Also, they should not be documented to avoid being spread. That's what >> we did with other deprecated switches in QEMU. >> >> Jan > > Jan, > > You're comments to the patch are: > > - No documentation. See e.g. how -M is handled in qemu-options.hx. > - Expiration date. Anthony said "forever", but I think we should remove all those that issue deprecation warnings after 1-2 years. > - Changelog explaining what?? (didnt get that). 
Perhaps better changelog > in general? I'm still failing to understand who could depend on -no-kvm-irqchip or -no-kvm-pit. And I don't understand why -no-kvm was not included. So the reasons for including -X should be provided. Also check your patch subjects again, at least one was wrong. Jan
Re: [Qemu-devel] [PATCH] hw: Add test device for unittests execution
On 04/10/2012 10:02, Peter Maydell wrote: > On 4 October 2012 04:49, Lucas Meneghel Rodrigues wrote: >> Add a test device which supports the kvmctl ioports, >> so one can run the KVM unittest suite [1]. >> >> Usage: >> >> qemu -device testdev >> >> 1) Removed port 0xf1, since now kvm-unit-tests use >>serial >> >> 2) Removed exit code port 0xf4, since that can be >>replaced by >> >> -device isa-debugexit,iobase=0xf4,access-size=2 >> >> 3) Removed ram size port 0xd1, since guest memory >>size can be retrieved from firmware, there's a >>patch for kvm-unit-tests including an API to >>retrieve that value. >> >> [1] Preliminary versions of this patch were posted >> to the mailing list about a year ago, I re-read the >> comments of the thread, and had guidance from >> Paolo about which ports to remove from the test >> device. > > General remark: there's no documentation anywhere in > this patch. I don't necessarily mean user-facing docs, > but at least a descriptive comment saying what the > heck this device is and what it does would be helpful. > > >> >> CC: Paolo Bonzini >> Signed-off-by: Gerd Hoffmann >> Signed-off-by: Avi Kivity >> Signed-off-by: Marcelo Tosatti >> Signed-off-by: Lucas Meneghel Rodrigues >> --- >> hw/i386/Makefile.objs | 1 + >> hw/testdev.c | 131 >> ++ >> 2 files changed, 132 insertions(+) >> create mode 100644 hw/testdev.c >> >> diff --git a/hw/i386/Makefile.objs b/hw/i386/Makefile.objs >> index 8c764bb..64d2787 100644 >> --- a/hw/i386/Makefile.objs >> +++ b/hw/i386/Makefile.objs >> @@ -11,5 +11,6 @@ obj-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen-host-pci-device.o >> obj-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen_pt.o xen_pt_config_init.o >> xen_pt_msi.o >> obj-y += kvm/ >> obj-$(CONFIG_SPICE) += qxl.o qxl-logger.o qxl-render.o >> +obj-y += testdev.o > > ...the device is useful even in non-KVM configs, then? 
> >> obj-y := $(addprefix ../,$(obj-y)) >> diff --git a/hw/testdev.c b/hw/testdev.c >> new file mode 100644 >> index 000..44070f2 >> --- /dev/null >> +++ b/hw/testdev.c >> @@ -0,0 +1,131 @@ >> +#include > > This file needs a leading comment with the usual copyright/license/ > description of what the file does. > >> +#include "hw.h" >> +#include "qdev.h" >> +#include "isa.h" >> + >> +struct testdev { >> +ISADevice dev; >> +MemoryRegion iomem; >> +CharDriverState *chr; >> +}; >> + >> +#define TYPE_TESTDEV "testdev" >> +#define TESTDEV(obj) \ >> + OBJECT_CHECK(struct testdev, (obj), TYPE_TESTDEV) >> + >> +static void test_device_irq_line(void *opaque, uint32_t addr, uint32_t data) >> +{ >> +struct testdev *dev = opaque; >> + >> +qemu_set_irq(isa_get_irq(&dev->dev, addr - 0x2000), !!data); >> +} >> + >> +static uint32 test_device_ioport_data; >> + >> +static void test_device_ioport_write(void *opaque, uint32_t addr, uint32_t >> data) >> +{ >> +test_device_ioport_data = data; >> +} >> + >> +static uint32_t test_device_ioport_read(void *opaque, uint32_t addr) >> +{ >> +return test_device_ioport_data; >> +} >> + >> +static void test_device_flush_page(void *opaque, uint32_t addr, uint32_t >> data) >> +{ >> +target_phys_addr_t len = 4096; >> +void *a = cpu_physical_memory_map(data & ~0xffful, &len, 0); >> + >> +mprotect(a, 4096, PROT_NONE); >> +mprotect(a, 4096, PROT_READ|PROT_WRITE); >> +cpu_physical_memory_unmap(a, len, 0, 0); >> +} >> + >> +static char *iomem_buf; >> + >> +static uint32_t test_iomem_readb(void *opaque, target_phys_addr_t addr) >> +{ >> +return iomem_buf[addr]; >> +} >> + >> +static uint32_t test_iomem_readw(void *opaque, target_phys_addr_t addr) >> +{ >> +return *(uint16_t*)(iomem_buf + addr); >> +} >> + >> +static uint32_t test_iomem_readl(void *opaque, target_phys_addr_t addr) >> +{ >> +return *(uint32_t*)(iomem_buf + addr); >> +} >> + >> +static void test_iomem_writeb(void *opaque, target_phys_addr_t addr, >> uint32_t val) >> +{ >> +iomem_buf[addr] = 
val; >> +} >> + >> +static void test_iomem_writew(void *opaque, target_phys_addr_t addr, >> uint32_t val) >> +{ >> +*(uint16_t*)(iomem_buf + addr) = val; >> +} >> + >> +static void test_iomem_writel(void *opaque, target_phys_addr_t addr, >> uint32_t val) >> +{ >> +*(uint32_t*)(iomem_buf + addr) = val; >> +} >> + >> +static const MemoryRegionOps test_iomem_ops = { >> +.old_mmio = { >> +.read = { test_iomem_readb, test_iomem_readw, test_iomem_readl, }, >> +.write = { test_iomem_writeb, test_iomem_writew, test_iomem_writel, >> }, >> +}, >> +.endianness = DEVICE_LITTLE_ENDIAN, >> +}; >> + >> +static int init_test_device(ISADevice *isa) >> +{ >> +struct testdev *dev = DO_UPCAST(struct testdev, dev, isa); >> + >> +register_ioport_read(0xe0, 1, 1, test_device_ioport_read, dev); >> +register_ioport_write(0xe0, 1, 1, test_device_ioport_write, dev); >> +register_ioport_read(0xe0, 1, 2, test_device_ioport_read, dev);
Re: [Qemu-devel] [PATCH] hw: Add test device for unittests execution
On 4 October 2012 04:49, Lucas Meneghel Rodrigues wrote: > Add a test device which supports the kvmctl ioports, > so one can run the KVM unittest suite [1]. > > Usage: > > qemu -device testdev > > 1) Removed port 0xf1, since now kvm-unit-tests use >serial > > 2) Removed exit code port 0xf4, since that can be >replaced by > > -device isa-debugexit,iobase=0xf4,access-size=2 > > 3) Removed ram size port 0xd1, since guest memory >size can be retrieved from firmware, there's a >patch for kvm-unit-tests including an API to >retrieve that value. > > [1] Preliminary versions of this patch were posted > to the mailing list about a year ago, I re-read the > comments of the thread, and had guidance from > Paolo about which ports to remove from the test > device. General remark: there's no documentation anywhere in this patch. I don't necessarily mean user-facing docs, but at least a descriptive comment saying what the heck this device is and what it does would be helpful. > > CC: Paolo Bonzini > Signed-off-by: Gerd Hoffmann > Signed-off-by: Avi Kivity > Signed-off-by: Marcelo Tosatti > Signed-off-by: Lucas Meneghel Rodrigues > --- > hw/i386/Makefile.objs | 1 + > hw/testdev.c | 131 > ++ > 2 files changed, 132 insertions(+) > create mode 100644 hw/testdev.c > > diff --git a/hw/i386/Makefile.objs b/hw/i386/Makefile.objs > index 8c764bb..64d2787 100644 > --- a/hw/i386/Makefile.objs > +++ b/hw/i386/Makefile.objs > @@ -11,5 +11,6 @@ obj-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen-host-pci-device.o > obj-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen_pt.o xen_pt_config_init.o > xen_pt_msi.o > obj-y += kvm/ > obj-$(CONFIG_SPICE) += qxl.o qxl-logger.o qxl-render.o > +obj-y += testdev.o ...the device is useful even in non-KVM configs, then? 
> obj-y := $(addprefix ../,$(obj-y)) > diff --git a/hw/testdev.c b/hw/testdev.c > new file mode 100644 > index 000..44070f2 > --- /dev/null > +++ b/hw/testdev.c > @@ -0,0 +1,131 @@ > +#include This file needs a leading comment with the usual copyright/license/ description of what the file does. > +#include "hw.h" > +#include "qdev.h" > +#include "isa.h" > + > +struct testdev { > +ISADevice dev; > +MemoryRegion iomem; > +CharDriverState *chr; > +}; > + > +#define TYPE_TESTDEV "testdev" > +#define TESTDEV(obj) \ > + OBJECT_CHECK(struct testdev, (obj), TYPE_TESTDEV) > + > +static void test_device_irq_line(void *opaque, uint32_t addr, uint32_t data) > +{ > +struct testdev *dev = opaque; > + > +qemu_set_irq(isa_get_irq(&dev->dev, addr - 0x2000), !!data); > +} > + > +static uint32 test_device_ioport_data; > + > +static void test_device_ioport_write(void *opaque, uint32_t addr, uint32_t > data) > +{ > +test_device_ioport_data = data; > +} > + > +static uint32_t test_device_ioport_read(void *opaque, uint32_t addr) > +{ > +return test_device_ioport_data; > +} > + > +static void test_device_flush_page(void *opaque, uint32_t addr, uint32_t > data) > +{ > +target_phys_addr_t len = 4096; > +void *a = cpu_physical_memory_map(data & ~0xffful, &len, 0); > + > +mprotect(a, 4096, PROT_NONE); > +mprotect(a, 4096, PROT_READ|PROT_WRITE); > +cpu_physical_memory_unmap(a, len, 0, 0); > +} > + > +static char *iomem_buf; > + > +static uint32_t test_iomem_readb(void *opaque, target_phys_addr_t addr) > +{ > +return iomem_buf[addr]; > +} > + > +static uint32_t test_iomem_readw(void *opaque, target_phys_addr_t addr) > +{ > +return *(uint16_t*)(iomem_buf + addr); > +} > + > +static uint32_t test_iomem_readl(void *opaque, target_phys_addr_t addr) > +{ > +return *(uint32_t*)(iomem_buf + addr); > +} > + > +static void test_iomem_writeb(void *opaque, target_phys_addr_t addr, > uint32_t val) > +{ > +iomem_buf[addr] = val; > +} > + > +static void test_iomem_writew(void *opaque, target_phys_addr_t 
addr, > uint32_t val) > +{ > +*(uint16_t*)(iomem_buf + addr) = val; > +} > + > +static void test_iomem_writel(void *opaque, target_phys_addr_t addr, > uint32_t val) > +{ > +*(uint32_t*)(iomem_buf + addr) = val; > +} > + > +static const MemoryRegionOps test_iomem_ops = { > +.old_mmio = { > +.read = { test_iomem_readb, test_iomem_readw, test_iomem_readl, }, > +.write = { test_iomem_writeb, test_iomem_writew, test_iomem_writel, > }, > +}, > +.endianness = DEVICE_LITTLE_ENDIAN, > +}; > + > +static int init_test_device(ISADevice *isa) > +{ > +struct testdev *dev = DO_UPCAST(struct testdev, dev, isa); > + > +register_ioport_read(0xe0, 1, 1, test_device_ioport_read, dev); > +register_ioport_write(0xe0, 1, 1, test_device_ioport_write, dev); > +register_ioport_read(0xe0, 1, 2, test_device_ioport_read, dev); > +register_ioport_write(0xe0, 1, 2, test_device_ioport_write, dev); > +register_ioport_read(0xe0, 1, 4, test_device_ioport_read, dev); > +register_ioport_write(0xe0, 1, 4, test_device_ioport_write, dev); > +regi
Re: [PATCH 0/3] virtio-net: inline header support
Anthony Liguori writes: >> lguest fix is pending in my queue. lkvm and qemu are broken; lkvm isn't >> ever going to be merged, so I'm not sure what its status is? But I'm >> determined to fix qemu, and hence my torture patch to make sure this >> doesn't creep in again. > > There are even more implementations out there and I'd wager they all > rely on framing. Worse, both virtio_blk (for scsi commands) and virtio_scsi explicitly and inescapably rely on framing. The spec conflicts clearly with itself. Such layering violations are always a mistake, but I can't blame anyone else for my lack of attention :( Here's the spec change: commit 7e74459bb966ccbaad9e4bf361d1178b7f400b79 Author: Rusty Russell Date: Thu Oct 4 17:11:27 2012 +0930 No longer assume framing is independent of messages. *sniff* Signed-off-by: Rusty Russell --- virtio-spec.txt 2012-10-04 17:13:04.988259234 +0930 +++ virtio-spec.txt.new 2012-10-04 17:12:54.624258969 +0930 @@ -880,19 +880,19 @@ Message Framing -The descriptors used for a buffer should not effect the semantics -of the message, except for the total length of the buffer. For -example, a network buffer consists of a 10 byte header followed -by the network packet. Whether this is presented in the ring -descriptor chain as (say) a 10 byte buffer and a 1514 byte -buffer, or a single 1524 byte buffer, or even three buffers, -should have no effect. +Unless stated otherwise, it is expected that headers within a +message are contained within their own descriptors. For example, +a network buffer consists of a 10 or 12 byte header followed by +the network packet. An implementation should expect that this +header will be within the first descriptor, and that the +remainder of the data will begin on the second descriptor. 
-In particular, no implementation should use the descriptor -boundaries to determine the size of any header in a request.[footnote: -The current qemu device implementations mistakenly insist that -the first descriptor cover the header in these cases exactly, so -a cautious driver should arrange it so. +[footnote: +It was previously asserted that framing should be independent of +message contents, yet invariably drivers layed out messages in +reliable ways and devices assumed it. In addition, the +specifications for virtio_blk and virtio_scsi require intuiting +field lengths from frame boundaries. ] Device Improvements
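The difference between the old and the revised "Message Framing" text in the spec change above can be sketched with two small parsers. This is an illustration only: descriptor chains are modelled as plain lists of byte strings, the function names are hypothetical, and the 10-byte header and 1514-byte packet are the sizes used as an example in the removed spec paragraph.

```python
def split_framing_independent(descriptors, header_len):
    """Old spec text: only the total buffer length matters.

    The chain is treated as one byte stream and the header is cut off
    by its known length, wherever the descriptor boundaries fall.
    """
    data = b"".join(descriptors)
    return data[:header_len], data[header_len:]


def split_by_first_descriptor(descriptors):
    """Revised spec text: the header is expected in its own descriptor."""
    return descriptors[0], b"".join(descriptors[1:])


header = bytes(10)          # 10-byte virtio-net style header
packet = b"\x01" * 1514     # network packet

two_buffers = [header, packet]      # 10 bytes + 1514 bytes
one_buffer = [header + packet]      # a single 1524-byte buffer

# The framing-independent parser handles both layouts identically.
print(split_framing_independent(two_buffers, 10) ==
      split_framing_independent(one_buffer, 10))

# A device that trusts the first descriptor only handles the first layout.
print(split_by_first_descriptor(one_buffer) == (header, packet))
```

The second parser is the behaviour that the thread says existing device implementations already rely on, which is why the spec wording was changed rather than the implementations.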
RE: [PATCH 4/6] KVM: PPC: debug stub interface parameter defined
> -Original Message- > From: Alexander Graf [mailto:ag...@suse.de] > Sent: Monday, September 24, 2012 9:09 PM > To: Bhushan Bharat-R65777 > Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Bhushan Bharat-R65777 > Subject: Re: [PATCH 4/6] KVM: PPC: debug stub interface parameter defined > > > On 21.08.2012, at 15:51, Bharat Bhushan wrote: > > > This patch defines the interface parameter for KVM_SET_GUEST_DEBUG > > ioctl support. Follow up patches will use this for setting up hardware > > breakpoints, watchpoints and software breakpoints. > > > > Signed-off-by: Bharat Bhushan > > --- > > arch/powerpc/include/asm/kvm.h | 33 + > > arch/powerpc/kvm/book3s.c |6 ++ > > arch/powerpc/kvm/booke.c |6 ++ > > arch/powerpc/kvm/powerpc.c |6 -- > > 4 files changed, 45 insertions(+), 6 deletions(-) > > > > diff --git a/arch/powerpc/include/asm/kvm.h > > b/arch/powerpc/include/asm/kvm.h index 3c14202..61b197e 100644 > > --- a/arch/powerpc/include/asm/kvm.h > > +++ b/arch/powerpc/include/asm/kvm.h > > @@ -269,8 +269,41 @@ struct kvm_debug_exit_arch { > > > > /* for KVM_SET_GUEST_DEBUG */ > > struct kvm_guest_debug_arch { > > + struct { > > + /* H/W breakpoint/watchpoint address */ > > + __u64 addr; > > + /* > > +* Type denotes h/w breakpoint, read watchpoint, write > > +* watchpoint or watchpoint (both read and write). > > +*/ > > +#define KVMPPC_DEBUG_NOTYPE0x0 > > +#define KVMPPC_DEBUG_BREAKPOINT(1UL << 1) > > +#define KVMPPC_DEBUG_WATCH_WRITE (1UL << 2) > > +#define KVMPPC_DEBUG_WATCH_READ(1UL << 3) > > + __u32 type; > > + __u32 pad1; > > Why the padding? > > > + __u64 pad2; > > + } bp[16]; > > Why 16? > > > }; > > > > +/* Debug related defines */ > > +/* > > + * kvm_guest_debug->control is a 32 bit field. The lower 16 bits are > > +generic > > + * and upper 16 bits are architecture specific. Architecture specific > > +defines > > + * that ioctl is for setting hardware breakpoint or software breakpoint. 
> > + */ > > +#define KVM_GUESTDBG_USE_SW_BP 0x0001 > > +#define KVM_GUESTDBG_USE_HW_BP 0x0002 > > + > > +/* When setting software breakpoint, Change the software breakpoint > > + * instruction to special trap instruction and set > > +KVM_GUESTDBG_USE_SW_BP > > + * flag in kvm_guest_debug->control. KVM does keep track of software > > + * breakpoints. So when KVM_GUESTDBG_USE_SW_BP flag is set and > > +special trap > > + * instruction is executed by guest then exit to userspace. > > + * NOTE: A Nice interface can be added to get the special trap instruction. > > + */ > > +#define KVMPPC_INST_GUEST_GDB 0x7C00021C /* ehpriv OC=0 > > */ > > This definitely has to be passed to user space (which writes that instruction > into guest phys memory). Other PPC subarchs will use different instructions. > Just model it as a read-only ONE_REG. Ok. Thanks -Bharat -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH RFC 1/2] kvm: Handle undercommitted guest case in PLE handler
On Wed, Oct 03, 2012 at 04:56:57PM +0200, Avi Kivity wrote: > On 10/03/2012 04:17 PM, Raghavendra K T wrote: > > * Avi Kivity [2012-09-30 13:13:09]: > > > >> On 09/30/2012 01:07 PM, Gleb Natapov wrote: > >> > On Sun, Sep 30, 2012 at 10:18:17AM +0200, Avi Kivity wrote: > >> >> On 09/28/2012 08:16 AM, Raghavendra K T wrote: > >> >> > > >> >> >> > >> >> >> +struct pv_sched_info { > >> >> >> + unsigned long sched_bitmap; > >> >> > > >> >> > Thinking, whether we need something similar to cpumask here? > >> >> > Only thing is we are representing guest (v)cpumask. > >> >> > > >> >> > >> >> DECLARE_BITMAP(sched_bitmap, KVM_MAX_VCPUS) > >> >> > >> > vcpu_id can be greater than KVM_MAX_VCPUS. > >> > >> Use the index into the vcpu table as the bitmap index then. In fact > >> it's better because then the lookup to get the vcpu pointer is trivial. > > > > Did you mean, while setting the bitmap, > > > > we should do > > for (i = 1..n) > > if (kvm->vcpus[i] == vcpu) set ith position in bitmap? > > You can store i in the vcpu itself: > > set_bit(vcpu->index, &kvm->preempted); > This will make the fact that vcpus are stored in an array into API instead of implementation detail :( There were patches for vcpu destruction that changed the array to rculist. Well, it will be still possible to make the array rcu protected and copy it every time vcpu is deleted/added I guess. -- Gleb.
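The idea discussed above, using each vcpu's position in the vcpu table as its bit in a "preempted" bitmap, can be modelled in a few lines. This is a toy model, not kernel code: the class, its methods, and the string placeholders for vcpus are all hypothetical, and a Python integer stands in for the kernel bitmap.

```python
class VM:
    """Toy model of a VM whose vcpus are kept in a table (a list)."""

    def __init__(self, nr_vcpus):
        # Each vcpu is identified by its index in this table, not by an
        # externally chosen vcpu_id, so the index is always < len(vcpus).
        self.vcpus = ["vcpu%d" % i for i in range(nr_vcpus)]
        self.preempted = 0  # one bit per table slot

    def set_preempted(self, index):
        self.preempted |= 1 << index

    def clear_preempted(self, index):
        self.preempted &= ~(1 << index)

    def preempted_vcpus(self):
        # Going from a set bit back to the vcpu is a plain table lookup.
        return [v for i, v in enumerate(self.vcpus)
                if (self.preempted >> i) & 1]


vm = VM(4)
vm.set_preempted(0)
vm.set_preempted(2)
print(vm.preempted_vcpus())
```

The sketch also shows the concern raised in the reply: the bitmap is only meaningful while the table layout is stable, so the table's shape stops being a private implementation detail.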
Re: [PATCH] hw: Add test device for unittests execution
On 04/10/2012 05:49, Lucas Meneghel Rodrigues wrote: > Add a test device which supports the kvmctl ioports, > so one can run the KVM unittest suite [1]. > > Usage: > > qemu -device testdev > > 1) Removed port 0xf1, since now kvm-unit-tests use >serial > > 2) Removed exit code port 0xf4, since that can be >replaced by > > -device isa-debugexit,iobase=0xf4,access-size=2 Can you include this in your series? > 3) Removed ram size port 0xd1, since guest memory >size can be retrieved from firmware, there's a >patch for kvm-unit-tests including an API to >retrieve that value. > > [1] Preliminary versions of this patch were posted > to the mailing list about a year ago, I re-read the > comments of the thread, and had guidance from > Paolo about which ports to remove from the test > device. > > CC: Paolo Bonzini > Signed-off-by: Gerd Hoffmann > Signed-off-by: Avi Kivity > Signed-off-by: Marcelo Tosatti > Signed-off-by: Lucas Meneghel Rodrigues > --- > hw/i386/Makefile.objs | 1 + > hw/testdev.c | 131 > ++ > 2 files changed, 132 insertions(+) > create mode 100644 hw/testdev.c > > diff --git a/hw/i386/Makefile.objs b/hw/i386/Makefile.objs > index 8c764bb..64d2787 100644 > --- a/hw/i386/Makefile.objs > +++ b/hw/i386/Makefile.objs > @@ -11,5 +11,6 @@ obj-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen-host-pci-device.o > obj-$(CONFIG_XEN_PCI_PASSTHROUGH) += xen_pt.o xen_pt_config_init.o > xen_pt_msi.o > obj-y += kvm/ > obj-$(CONFIG_SPICE) += qxl.o qxl-logger.o qxl-render.o > +obj-y += testdev.o > > obj-y := $(addprefix ../,$(obj-y)) > diff --git a/hw/testdev.c b/hw/testdev.c > new file mode 100644 > index 000..44070f2 > --- /dev/null > +++ b/hw/testdev.c > @@ -0,0 +1,131 @@ > +#include > +#include "hw.h" > +#include "qdev.h" > +#include "isa.h" > + > +struct testdev { > +ISADevice dev; > +MemoryRegion iomem; > +CharDriverState *chr; This member is not necessary anymore. 
Paolo > +}; > + > +#define TYPE_TESTDEV "testdev" > +#define TESTDEV(obj) \ > + OBJECT_CHECK(struct testdev, (obj), TYPE_TESTDEV) > + > +static void test_device_irq_line(void *opaque, uint32_t addr, uint32_t data) > +{ > +struct testdev *dev = opaque; > + > +qemu_set_irq(isa_get_irq(&dev->dev, addr - 0x2000), !!data); > +} > + > +static uint32 test_device_ioport_data; > + > +static void test_device_ioport_write(void *opaque, uint32_t addr, uint32_t > data) > +{ > +test_device_ioport_data = data; > +} > + > +static uint32_t test_device_ioport_read(void *opaque, uint32_t addr) > +{ > +return test_device_ioport_data; > +} > + > +static void test_device_flush_page(void *opaque, uint32_t addr, uint32_t > data) > +{ > +target_phys_addr_t len = 4096; > +void *a = cpu_physical_memory_map(data & ~0xffful, &len, 0); > + > +mprotect(a, 4096, PROT_NONE); > +mprotect(a, 4096, PROT_READ|PROT_WRITE); > +cpu_physical_memory_unmap(a, len, 0, 0); > +} > + > +static char *iomem_buf; > + > +static uint32_t test_iomem_readb(void *opaque, target_phys_addr_t addr) > +{ > +return iomem_buf[addr]; > +} > + > +static uint32_t test_iomem_readw(void *opaque, target_phys_addr_t addr) > +{ > +return *(uint16_t*)(iomem_buf + addr); > +} > + > +static uint32_t test_iomem_readl(void *opaque, target_phys_addr_t addr) > +{ > +return *(uint32_t*)(iomem_buf + addr); > +} > + > +static void test_iomem_writeb(void *opaque, target_phys_addr_t addr, > uint32_t val) > +{ > +iomem_buf[addr] = val; > +} > + > +static void test_iomem_writew(void *opaque, target_phys_addr_t addr, > uint32_t val) > +{ > +*(uint16_t*)(iomem_buf + addr) = val; > +} > + > +static void test_iomem_writel(void *opaque, target_phys_addr_t addr, > uint32_t val) > +{ > +*(uint32_t*)(iomem_buf + addr) = val; > +} > + > +static const MemoryRegionOps test_iomem_ops = { > +.old_mmio = { > +.read = { test_iomem_readb, test_iomem_readw, test_iomem_readl, }, > +.write = { test_iomem_writeb, test_iomem_writew, test_iomem_writel, > }, > +}, > 
+.endianness = DEVICE_LITTLE_ENDIAN, > +}; > + > +static int init_test_device(ISADevice *isa) > +{ > +struct testdev *dev = DO_UPCAST(struct testdev, dev, isa); > + > +register_ioport_read(0xe0, 1, 1, test_device_ioport_read, dev); > +register_ioport_write(0xe0, 1, 1, test_device_ioport_write, dev); > +register_ioport_read(0xe0, 1, 2, test_device_ioport_read, dev); > +register_ioport_write(0xe0, 1, 2, test_device_ioport_write, dev); > +register_ioport_read(0xe0, 1, 4, test_device_ioport_read, dev); > +register_ioport_write(0xe0, 1, 4, test_device_ioport_write, dev); > +register_ioport_write(0xe4, 1, 4, test_device_flush_page, dev); > +register_ioport_write(0x2000, 24, 1, test_device_irq_line, NULL); > +iomem_buf = g_malloc0(0x1); > +memory_region_init_io(&dev->iomem, &test_iomem_ops, dev, > + "testdev", 0x1
RE: [PATCH 4/6] KVM: PPC: debug stub interface parameter defined
> -Original Message- > From: Alexander Graf [mailto:ag...@suse.de] > Sent: Monday, September 24, 2012 9:09 PM > To: Bhushan Bharat-R65777 > Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Bhushan Bharat-R65777 > Subject: Re: [PATCH 4/6] KVM: PPC: debug stub interface parameter defined > > > On 21.08.2012, at 15:51, Bharat Bhushan wrote: > > > This patch defines the interface parameter for KVM_SET_GUEST_DEBUG > > ioctl support. Follow up patches will use this for setting up hardware > > breakpoints, watchpoints and software breakpoints. > > > > Signed-off-by: Bharat Bhushan > > --- > > arch/powerpc/include/asm/kvm.h | 33 + > > arch/powerpc/kvm/book3s.c |6 ++ > > arch/powerpc/kvm/booke.c |6 ++ > > arch/powerpc/kvm/powerpc.c |6 -- > > 4 files changed, 45 insertions(+), 6 deletions(-) > > > > diff --git a/arch/powerpc/include/asm/kvm.h > > b/arch/powerpc/include/asm/kvm.h index 3c14202..61b197e 100644 > > --- a/arch/powerpc/include/asm/kvm.h > > +++ b/arch/powerpc/include/asm/kvm.h > > @@ -269,8 +269,41 @@ struct kvm_debug_exit_arch { > > > > /* for KVM_SET_GUEST_DEBUG */ > > struct kvm_guest_debug_arch { > > + struct { > > + /* H/W breakpoint/watchpoint address */ > > + __u64 addr; > > + /* > > +* Type denotes h/w breakpoint, read watchpoint, write > > +* watchpoint or watchpoint (both read and write). > > +*/ > > +#define KVMPPC_DEBUG_NOTYPE0x0 > > +#define KVMPPC_DEBUG_BREAKPOINT(1UL << 1) > > +#define KVMPPC_DEBUG_WATCH_WRITE (1UL << 2) > > +#define KVMPPC_DEBUG_WATCH_READ(1UL << 3) > > + __u32 type; > > + __u32 pad1; > > Why the padding? Not sure why, I will remove this. > > > + __u64 pad2; > > + } bp[16]; > > Why 16? I think for now 6 (4 iac + 2 dac) is sufficient for BOOKE. We kept 16 to have some room for future and other platforms. Thanks -Bharat > > > }; > > > > +/* Debug related defines */ > > +/* > > + * kvm_guest_debug->control is a 32 bit field. The lower 16 bits are > > +generic > > + * and upper 16 bits are architecture specific. 
Architecture specific > > +defines > > + * that ioctl is for setting hardware breakpoint or software breakpoint. > > + */ > > +#define KVM_GUESTDBG_USE_SW_BP 0x0001 > > +#define KVM_GUESTDBG_USE_HW_BP 0x0002 > > + > > +/* When setting software breakpoint, Change the software breakpoint > > + * instruction to special trap instruction and set > > +KVM_GUESTDBG_USE_SW_BP > > + * flag in kvm_guest_debug->control. KVM does keep track of software > > + * breakpoints. So when KVM_GUESTDBG_USE_SW_BP flag is set and > > +special trap > > + * instruction is executed by guest then exit to userspace. > > + * NOTE: A Nice interface can be added to get the special trap instruction. > > + */ > > +#define KVMPPC_INST_GUEST_GDB 0x7C00021C /* ehpriv OC=0 > > */ > > This definitely has to be passed to user space (which writes that instruction > into guest phys memory). Other PPC subarchs will use different instructions. > Just model it as a read-only ONE_REG. > > > Alex >
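The layout of the breakpoint array proposed in the patch above can be checked with `ctypes`. This is a sketch of the structure as quoted in this thread, not of any merged kernel ABI; the class names `BpEntry` and `GuestDebugArch` are hypothetical, while the field names and the `KVMPPC_DEBUG_*` values are copied from the quoted diff.

```python
import ctypes

# Type flags as defined in the quoted patch.
KVMPPC_DEBUG_NOTYPE = 0x0
KVMPPC_DEBUG_BREAKPOINT = 1 << 1
KVMPPC_DEBUG_WATCH_WRITE = 1 << 2
KVMPPC_DEBUG_WATCH_READ = 1 << 3


class BpEntry(ctypes.Structure):
    """One hardware breakpoint/watchpoint slot, as in the quoted patch."""
    _fields_ = [
        ("addr", ctypes.c_uint64),   # breakpoint/watchpoint address
        ("type", ctypes.c_uint32),   # KVMPPC_DEBUG_* flags
        ("pad1", ctypes.c_uint32),   # explicit padding questioned in review
        ("pad2", ctypes.c_uint64),   # further reserved space
    ]


class GuestDebugArch(ctypes.Structure):
    """Model of struct kvm_guest_debug_arch with its 16 slots."""
    _fields_ = [("bp", BpEntry * 16)]


dbg = GuestDebugArch()
# A watchpoint on both reads and writes combines the two watch flags.
dbg.bp[0].addr = 0x1000
dbg.bp[0].type = KVMPPC_DEBUG_WATCH_READ | KVMPPC_DEBUG_WATCH_WRITE

print(ctypes.sizeof(BpEntry), ctypes.sizeof(GuestDebugArch))
```

With the explicit `pad1` field every member sits on its natural boundary, so the entry size does not depend on how a given ABI aligns 64-bit integers. That is one possible reason for such padding; the author's reply above says only that the field will be removed.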
Re: [PATCH 0/3] virtio-net: inline header support
On 04/10/2012 02:11, Rusty Russell wrote: > > > There's a reason I haven't done this. I really, really dislike "my > > > implementation isn't broken" feature bits. We could have an infinite > > > number of them, for each bug in each device. > > > > However, this bug affects (almost) all implementations and (almost) all > > devices. It even makes sense to reserve a transport feature bit for it > > instead of a device feature bit. > > Perhaps, but we have to fix the bugs first! Yes. :) Isn't that what mst's patch does? > As I said, my torture patch broke qemu immediately. Since no one has > leapt onto fixing that, I'll take a look now... I can look at virtio-scsi. Paolo