Re: [kvm-devel] question: HPET for multiple VMs
Anthony Liguori wrote: Avi Kivity wrote:

And I would like to ask whether implementing this functionality is the right thing to do, in terms of need and efficiency (scalability and time accuracy).

I think that for newer kernels we already have the desired accuracy. We're not always good at exploiting that accuracy; hence the recent movement of the PIT implementation from userspace to the kernel. But recent discussion leads me to believe it could have been implemented with the userspace PIT as well.

What do you think is needed to get the same accuracy in userspace as in kernelspace?

Some mechanism that allows us to implement kvm_inject_pit_timer_irqs() and kvm_pit_timer_intr_post(). Specifically, information about whether an interrupt was actually processed, and a window for injecting missed ticks.

Better yet, do you think there is a reasonable kvmctl harness we could write to quantify the PIT accuracy?

kvmctl doesn't implement a pit, so no. Of course we can test any infrastructure for counting missed interrupts.

It's easy enough to count timer interrupts and compare that to an external time source to get some notion of accuracy (on varying frequencies of course). I know you mentioned before that guest CPU consumption also comes into play... I'm not quite sure why though so I'm not sure how to simulate that.

It's not so easy, the code is quite tricky since the cpu processes vectors, not interrupt lines. It's also heuristic; if the guest programs some random device to share interrupts with the pit, the heuristic breaks down. This never happens in practice, though.

Problems show up when both the guest and host are loaded, as then the cpu is timesliced instead of being available on demand.

The nice thing about the CAP infrastructure is we can always move the PIT back to userspace. I'll happily invest some cycles here as I'm a big fan of getting rid of unneeded kernel code :-)

Yes.
-- error compiling committee.c: too many arguments to function - This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
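The accuracy check discussed in this thread (count the timer interrupts the guest actually processed and compare against an external time source) can be sketched roughly as follows. This is only an illustration: `struct tick_stats` and the helper names are hypothetical and are not part of KVM, QEMU, or kvmctl.

```c
#include <stdint.h>

/* Hypothetical bookkeeping for one periodic timer; not a KVM or QEMU API. */
struct tick_stats {
    uint64_t period_ns;  /* programmed timer period */
    uint64_t delivered;  /* interrupts the guest actually processed */
};

/* Ticks an ideal timer would have fired after elapsed_ns of reference time. */
static uint64_t expected_ticks(const struct tick_stats *s, uint64_t elapsed_ns)
{
    return s->period_ns ? elapsed_ns / s->period_ns : 0;
}

/* Ticks that were lost and are candidates for re-injection. */
static uint64_t missed_ticks(const struct tick_stats *s, uint64_t elapsed_ns)
{
    uint64_t want = expected_ticks(s, elapsed_ns);

    return want > s->delivered ? want - s->delivered : 0;
}

/* Guest clock drift, in nanoseconds, implied by the missed ticks. */
static uint64_t drift_ns(const struct tick_stats *s, uint64_t elapsed_ns)
{
    return missed_ticks(s, elapsed_ns) * s->period_ns;
}
```

Running the same comparison at several programmed frequencies, and with the host both idle and loaded, would give the "notion of accuracy" mentioned above; how the delivered count is obtained is exactly the missing mechanism the thread is about.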
Re: [kvm-devel] [PATCH] kvm.h: __user requires compiler.h
Anthony Liguori wrote:

This patch breaks the QEMU build when doing a 'make sync'. When you do a top-level ./configure, libkvm is built with kerneldir pointing to kvm-userspace/kernel/include. While linux/kvm.h is present there, there isn't a linux/compiler.h. The host kernelpath isn't normally part of the libkvm or QEMU build. So we have a couple of options.

1) make the host kernelpath (/lib/modules/$(uname -r)/build/include) part of the libkvm/QEMU build.
2) Do something else about __user

Suggestions? #1 might be a pain since there may be include conflicts between the host kernel include and kernel/include.

We could hack 'make sync' to strip out __user (just like we run unifdef). Of course the reasons for including linux/compiler.h are still valid, so it needs to remain.

--
error compiling committee.c: too many arguments to function
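For illustration, another way to "do something else about __user" is a userspace fallback definition. This is a different technique from the 'make sync' stripping proposed above, and it is only a sketch: `struct example_args` and `example_len()` are made up for the example and are not KVM structures. In the kernel, `__user` is an annotation for the sparse checker and expands to nothing in a normal compile, so an empty definition is harmless in userspace.

```c
#include <stddef.h>

/*
 * Fallback for builds where linux/compiler.h is not available.
 * __user only matters to the kernel's sparse checker, so userspace
 * can define it away.
 */
#ifndef __user
#define __user
#endif

/* Hypothetical ioctl argument, for illustration only; not a real KVM struct. */
struct example_args {
    void __user *buf;  /* compiles as a plain pointer in userspace */
    size_t len;
};

/* Trivial accessor so the struct is exercised by the example. */
static size_t example_len(const struct example_args *a)
{
    return a->len;
}
```

The trade-off is that the fallback has to live somewhere that every userspace consumer of linux/kvm.h includes first, which is why stripping the annotation at sync time may be less intrusive.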
Re: [kvm-devel] [PATCH] Move kvm_get_pit to libkvm.c common code
Hollis Blanchard wrote:

Don't compile kvm_*_pit() on architectures whose currently supported platforms do not contain a PIT.

Signed-off-by: Hollis Blanchard [EMAIL PROTECTED]

diff --git a/libkvm/libkvm.h b/libkvm/libkvm.h
--- a/libkvm/libkvm.h
+++ b/libkvm/libkvm.h
@@ -539,6 +539,7 @@ int kvm_pit_in_kernel(kvm_context_t kvm)
 #ifdef KVM_CAP_PIT
+#if defined(__i386__) || defined(__x86_64__) || defined(__ia64__)
 /*!
  * \brief Get in kernel PIT of the virtual domain
  *
@@ -562,6 +563,8 @@ int kvm_set_pit(kvm_context_t kvm, struc
 #endif
+#endif
+
 #ifdef KVM_CAP_VAPIC

ia64 doesn't have an in-kernel pit? (yet?)

--
error compiling committee.c: too many arguments to function
Re: [kvm-devel] [PATCH][QEMU] Use a separate device for in-kernel PIT
Anthony Liguori wrote: Hollis Blanchard wrote:

This patch solves annoying qemu build breakage hitting PowerPC around struct kvm_pit_state, so that's another vote in favor...

I have an updated version of the patch but it's breaking the build b/c something fouled up right now with configure. libkvm pulls in linux/kvm.h which wants to pull in linux/compiler.h. We don't ship a linux/compiler.h though so it's pulling from /usr/include/linux which on my system doesn't have a compiler.h. The lack of this header is causing the configure test to fail. I've attached the patch here for you to use and I'll send it out again once I figure out the fix for this linux/compiler.h.

The patch suffers from the same problem as the apic split; the save/restore code is needlessly duplicated.

--
error compiling committee.c: too many arguments to function
Re: [kvm-devel] [PATCH] 'make clean' is eager to delete config.mak files
Ryota OZAKI wrote:

Hi all,

The current 'make clean' deletes the config.mak files, so we have to run ./configure again afterwards. This differs from the behavior of a standard 'make clean'. This patch introduces 'make distclean' to delete the config.mak files instead of 'make clean', following the standard Makefile convention.

Applied, thanks.

--
error compiling committee.c: too many arguments to function
Re: [kvm-devel] [RFC PATCH 1/5] lguest: mmap backing file
Anthony Liguori wrote:

If we're going to mod the kernel, how about a mmap this part of their address space and having the kernel keep the mappings in sync. But I think that if we want to get speed, we should probably be doing the copy between address spaces in-kernel so we can do lightweight exits.

I don't think lightweight exits help the situation very much. The difference between a lightweight and a heavyweight exit is only 3-4k cycles or so.

On what host cpu? IIRC the difference was bigger on Intel (and in relative terms, set to increase).

in-kernel doesn't make the situation much easier. You have to map pages in from a different task. It's a lot easier if you have both guests mapped in userspace.

The kernel already has everything mapped (kmap_atomic() is an addition on x86_64).

--
error compiling committee.c: too many arguments to function
Re: [kvm-devel] [ANNOUNCE] kvm-guest-drivers-windows-1
Avi Kivity wrote: Daniel P. Berrange wrote: On Tue, Mar 18, 2008 at 05:01:09PM +0200, Avi Kivity wrote:

This is the first release of network drivers for Windows guests running on a kvm host. The drivers are intended for Windows 2000 and Windows XP 32-bit. kvm-61 or later is needed in the host. At the moment only binaries are available.

There's no license file inside the ZIP file - what license are the binaries re-distributed under?

Good question. I'll find out. I imagine they'd be freely redistributable.

The binaries are free for use and redistribution for commercial and non-commercial use. The sources will be released under an open-source license, provided the Windows DDK terms permit.

--
error compiling committee.c: too many arguments to function
Re: [kvm-devel] [RFC/PATCH 01/15] preparation: provide hook to enable pgstes in user pagetable
Heiko Carstens wrote:

What you've done with dup_mm() is probably the brute-force way that I would have done it had I just been trying to make a proof of concept or something. I'm worried that there are a bunch of corner cases that haven't been considered. What if someone else is poking around with ptrace or something similar and they bump the mm_users:

+	if (tsk->mm->context.pgstes)
+		return 0;
+	if (!tsk->mm || atomic_read(&tsk->mm->mm_users) > 1 ||
+	    tsk->mm != tsk->active_mm || tsk->mm->ioctx_list)
+		return -EINVAL;
HERE
+	tsk->mm->context.pgstes = 1;	/* dirty little tricks .. */
+	mm = dup_mm(tsk);

It'll race, possibly fault in some other pages, and those faults will be lost during the dup_mm(). I think you need to be able to lock out all of the users of access_process_vm() before you go and do this. You also need to make sure that anyone who has looked at task->mm doesn't go and get a reference to it and get confused later when it isn't the task->mm any more.

Therefore, we need to reallocate the page table after fork() once we know that task is going to be a hypervisor. That's what this code does: reallocate a bigger page table to accommodate the extra information. The task needs to be single-threaded when calling for extended page tables. Btw: at fork() time, we cannot tell whether or not the user's going to be a hypervisor. Therefore we cannot do this in fork.

Can you convert the page tables at a later time without doing a wholesale replacement of the mm? It should be a bit easier to keep people off the pagetables than keep their grubby mitts off the mm itself.

Yes, as far as I can see you're right. And whatever we do in arch code, after all it's just a workaround to avoid a new clone flag. If something like clone() with CLONE_KVM would be useful for more architectures than just s390 then maybe we should try to get a flag. Oh... there are just two unused clone flag bits left. Looks like the namespace changes ate up a lot of them lately.

Well, we could still play dirty tricks like setting a bit in current via whatever mechanism which indicates child-wants-extended-page-tables and then just fork and be happy.

How about taking mmap_sem for write and converting all page tables in-place? I'd rather avoid the need to fork() when creating a VM.

--
error compiling committee.c: too many arguments to function
Re: [kvm-devel] Qemu-kvm is leaking my memory ???
Zdenek Kabelac wrote: 2008/3/19, Avi Kivity [EMAIL PROTECTED]: Zdenek Kabelac wrote: 2008/3/19, Avi Kivity [EMAIL PROTECTED]: Zdenek Kabelac wrote: 2008/3/16, Avi Kivity [EMAIL PROTECTED]:

The -vnc switch, so there's no local X server. A remote X server should be fine as well. Use runlevel 3, which means network but no local X server.

Ok, I've finally got some time to make comparable measurements of memory. I'm attaching an empty trace log, taken at the point where most of the processes were killed (as you can see in the 'ps' trace). Then there are attachments after using qemu 7 times (a log of free before execution is also attached). Both logs are after 3 > /proc/sys/vm/drop_caches

I see the same issue too now, and am investigating.

--
error compiling committee.c: too many arguments to function
Re: [kvm-devel] [RFC PATCH 0/4] Inter-guest virtio I/O example with lguest
On Friday 21 March 2008 01:11:35 Anthony Liguori wrote: Rusty Russell wrote:

There are three possible solutions:

1) Just offer the lowest common denominator to both sides (ie. no features). This is what I do with lguest in these patches.
2) Offer something and handle the case where one Guest accepts and another doesn't by emulating it. ie. de-TSO the packets manually.
3) Hot unplug the device from the guest which asks for the greater features, then re-add it offering less features. Requires hotplug in the guest OS.

4) Add a feature negotiation feature. The feature that gets set is the feature negotiate feature. If a guest doesn't support feature negotiation, you end up with the least-common denominator (no features). If both guests support feature negotiation, you can then add something new to determine the true common subset.

Hmm, I discarded that out of hand as too icky, but we might end up there.

Analyse features like normal, accept feature negotiation, set DRIVER_OK, wait for config change, if feature negotiation is still set then go around again (presumably some features have been removed).

I'll prototype it and see how we go.

Thanks,
Rusty.
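Option 4 above boils down to intersecting what the two guests acknowledged, with a fallback to no features when either side does not understand negotiation. A minimal sketch, assuming made-up feature bits (`FEAT_NEGOTIATE`, `FEAT_CSUM`, `FEAT_TSO` are illustrative names and values, not virtio's or lguest's):

```c
#include <stdint.h>

/* Hypothetical feature bits; names and values are illustrative only. */
#define FEAT_NEGOTIATE (1u << 0)  /* guest understands re-negotiation */
#define FEAT_CSUM      (1u << 1)
#define FEAT_TSO       (1u << 2)

/* Features that may be offered when two guests are wired together. */
static uint32_t common_features(uint32_t acked_a, uint32_t acked_b)
{
    /* Without negotiation on both sides, fall back to no features. */
    if (!(acked_a & FEAT_NEGOTIATE) || !(acked_b & FEAT_NEGOTIATE))
        return 0;

    return acked_a & acked_b;
}

/*
 * Returns 1 when a guest acknowledged something outside the common
 * subset and therefore has to "go around again" after a config change.
 */
static int needs_renegotiation(uint32_t acked, uint32_t common)
{
    return (acked & ~common) != 0;
}
```

In the loop described above, a guest for which `needs_renegotiation()` returns 1 would receive the config change and repeat feature analysis against the reduced offer.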
Re: [kvm-devel] Qemu-kvm is leaking my memory ???
Avi Kivity wrote:

I see the same issue too now, and am investigating.

The attached patch should fix the issue. It is present in 2.6.25-rc6 only, and not in kvm.git, which is why few people noticed it.

--
error compiling committee.c: too many arguments to function

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 4ba85d9..e55af12 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1412,7 +1412,7 @@ static void mmu_guess_page_from_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
 	up_read(&current->mm->mmap_sem);
 
 	vcpu->arch.update_pte.gfn = gfn;
-	vcpu->arch.update_pte.page = gfn_to_page(vcpu->kvm, gfn);
+	vcpu->arch.update_pte.page = page;
 }
 
 void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
Re: [kvm-devel] [Qemu-devel] Coredump from qemu
Zdenek Kabelac wrote:

Hi

During execution of qemu I've got this crash:

#0 0x00407a29 in qemu_mod_timer (ts=0x2e8cf90, expire_time=130685351465) at /usr/src/debug/kvm-63/qemu/vl.c:1073
#1 0x00425590 in pcnet_ioport_writew (opaque=0x0, addr=1836332585, val=8090216) at /usr/src/debug/kvm-63/qemu/hw/pcnet.c:1617
#2 0x00501cf1 in kvm_outw (opaque=<value optimized out>, addr=13865, data=29288) at /usr/src/debug/kvm-63/qemu/qemu-kvm.c:457
#3 0x0051e2a0 in kvm_run (kvm=0x2dbb030, vcpu=1) at libkvm.c:719
#4 0x00501646 in kvm_cpu_exec (env=<value optimized out>) at /usr/src/debug/kvm-63/qemu/qemu-kvm.c:127
#5 0x005021a5 in kvm_main_loop_cpu (env=0x2e8f010) at /usr/src/debug/kvm-63/qemu/qemu-kvm.c:307
#6 0x00502302 in ap_main_loop (_env=<value optimized out>) at /usr/src/debug/kvm-63/qemu/qemu-kvm.c:338
#7 0x00353420740a in start_thread () from /lib64/libpthread.so.0
#8 0x0035336e5d1d in clone () from /lib64/libc.so.6

(gdb) print alarm_timer
$1 = (struct qemu_alarm_timer *) 0x0

It happened during detach of gdb and quit of qemu itself - I assume not all timers were stopped when quit_timers was executed?

Looks like.

Maybe a check for a non-NULL pointer in qemu_mod_timer is enough?

I think the correct solution is to allow devices to register a shutdown function, which would be called after main_loop() (but before quit_timers), which would execute qemu_del_timer() in the case of pcnet.

It's also necessary for device hotremove.

--
error compiling committee.c: too many arguments to function
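The shutdown-function idea proposed above could look roughly like the sketch below. Every name here (`register_shutdown`, `run_shutdown_hooks`, `struct fake_nic`) is hypothetical and chosen for illustration; none of them is an existing QEMU interface. The point is only the ordering: hooks run after the main loop returns and before the timers are torn down, so a device can delete its own timer first.

```c
#include <stddef.h>

#define MAX_SHUTDOWN_HOOKS 16

typedef void (*shutdown_fn)(void *opaque);

struct shutdown_hook {
    shutdown_fn fn;
    void *opaque;
};

static struct shutdown_hook hooks[MAX_SHUTDOWN_HOOKS];
static size_t nr_hooks;

/* Returns 0 on success, -1 if fn is NULL or the table is full. */
static int register_shutdown(shutdown_fn fn, void *opaque)
{
    if (!fn || nr_hooks == MAX_SHUTDOWN_HOOKS)
        return -1;
    hooks[nr_hooks].fn = fn;
    hooks[nr_hooks].opaque = opaque;
    nr_hooks++;
    return 0;
}

/* Run hooks in reverse registration order and empty the table. */
static void run_shutdown_hooks(void)
{
    while (nr_hooks > 0) {
        struct shutdown_hook *h = &hooks[--nr_hooks];

        h->fn(h->opaque);
    }
}

/* Stand-in for a device that owns a timer (pcnet in the report above). */
struct fake_nic {
    int timer_armed;
};

/* Stand-in for the device calling its timer-delete routine. */
static void fake_nic_shutdown(void *opaque)
{
    struct fake_nic *nic = opaque;

    nic->timer_armed = 0;
}
```

The same hook could plausibly be reused for device hot-remove, where only the hook of the device being removed would be invoked.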
Re: [kvm-devel] [PATCH][QEMU] Use a separate device for in-kernel PIT
Avi Kivity wrote: Anthony Liguori wrote: Hollis Blanchard wrote:

This patch solves annoying qemu build breakage hitting PowerPC around struct kvm_pit_state, so that's another vote in favor...

I have an updated version of the patch but it's breaking the build b/c something fouled up right now with configure. libkvm pulls in linux/kvm.h which wants to pull in linux/compiler.h. We don't ship a linux/compiler.h though so it's pulling from /usr/include/linux which on my system doesn't have a compiler.h. The lack of this header is causing the configure test to fail. I've attached the patch here for you to use and I'll send it out again once I figure out the fix for this linux/compiler.h.

The patch suffers from the same problem as the apic split; the save/restore code is needlessly duplicated.

The updated patch addresses this problem. I have to fix the linux/compiler.h issue first though before it can be applied or it will break the build.

Regards,
Anthony Liguori
[kvm-devel] [PATCH] [RFC] Fix time drift of rtc clock + general support
Qemu device emulation for timers might be inaccurate and cause coalescing of several irqs into one. It happens when the load on the host is high and the guest did not manage to ack the previous irq. By get/set request irq commands the device won't issue another irq before the previous one has been acknowledged.

Each timer (rtc in this case) will request information about acking its irq vector. If a timer pops and there is a pending irq that didn't manage to be injected, it will be queued (pending variable) and a new timer will be fired to try to inject it again soon (==0.1msec).

It fixes the current time drift on windows acpi hal guest. It works well for in-kernel irqchip and also w/o.

Todo:
1. Implement it for the pit and eliminate the in-kernel pit.
2. Support smp (move acked_irq to CPUState)
3. Prepare several cleaner patches

Signed-off-by: Dor Laor [EMAIL PROTECTED]
---
 libkvm/libkvm-x86.c   | 11 +++
 libkvm/libkvm.h       | 30 ++
 qemu/hw/apic.c        | 14 ++
 qemu/hw/irq.c         | 15 +++
 qemu/hw/irq.h         | 42 ++
 qemu/hw/mc146818rtc.c | 45 +++--
 qemu/hw/pc.c          |  8
 qemu/hw/pc.h          |  3 +++
 qemu/qemu-kvm-x86.c   | 13 -
 9 files changed, 178 insertions(+), 3 deletions(-)

diff --git a/libkvm/libkvm-x86.c b/libkvm/libkvm-x86.c
index 6dba91d..2e3b677 100644
--- a/libkvm/libkvm-x86.c
+++ b/libkvm/libkvm-x86.c
@@ -576,6 +576,17 @@ __u64 kvm_get_cr8(kvm_context_t kvm, int vcpu)
 	return kvm->run[vcpu]->cr8;
 }
 
+void kvm_get_marked_irqs(kvm_context_t kvm, int vcpu, __u32* irq_acked)
+{
+	memcpy(irq_acked, kvm->run[vcpu]->irq_acked, sizeof(kvm->run[vcpu]->irq_acked));
+}
+
+void kvm_set_irqs_to_mark(kvm_context_t kvm, int vcpu, __u32* irq_acked)
+{
+	memcpy(kvm->run[vcpu]->irq_acked, irq_acked, sizeof(kvm->run[vcpu]->irq_acked));
+}
+
+
 int kvm_setup_cpuid(kvm_context_t kvm, int vcpu, int nent,
 		    struct kvm_cpuid_entry *entries)
 {
diff --git a/libkvm/libkvm.h b/libkvm/libkvm.h
index 61e7e98..1c027c9 100644
--- a/libkvm/libkvm.h
+++ b/libkvm/libkvm.h
@@ -357,6 +357,36 @@ void kvm_set_cr8(kvm_context_t kvm, int vcpu, uint64_t cr8);
  * \param vcpu Which virtual CPU should get dumped
  */
 __u64 kvm_get_cr8(kvm_context_t kvm, int vcpu);
+
+/*!
+ * \brief Get notification of acked interrupts by in-kernel irq chip
+ *
+ * User space device emulation for timers might be inaccurate and
+ * cause coalescing of several irq into one. It happens when the
+ * load on the host is high and the guest did not manage to ack the
+ * previous irq. By get/set request irq commands the device won't issue
+ * another irq before the previous one has been acknowledged.
+ *
+ * \param kvm Pointer to the current kvm_context
+ * \param vcpu Which virtual CPU should get dumped
+ * \param irq_acked 256 bit array to copy the content
+ */
+void kvm_get_marked_irqs(kvm_context_t kvm, int vcpu, __u32* irq_acked);
+
+/*!
+ * \brief Set request for notification of acked interrupts by in-kernel irq chip
+ *
+ * User space device emulation for timers might be inaccurate and
+ * cause coalescing of several irq into one. It happens when the
+ * load on the host is high and the guest did not manage to ack the
+ * previous irq. By get/set request irq commands the device won't issue
+ * another irq before the previous one has been acknowledged.
+ *
+ * \param kvm Pointer to the current kvm_context
+ * \param vcpu Which virtual CPU should get dumped
+ * \param irq_acked 256 bit array to copy the content from
+ */
+void kvm_set_irqs_to_mark(kvm_context_t kvm, int vcpu, __u32* irq_acked);
 #endif
 
 /*!
diff --git a/qemu/hw/apic.c b/qemu/hw/apic.c
index 92248dd..cdfc8a4 100644
--- a/qemu/hw/apic.c
+++ b/qemu/hw/apic.c
@@ -345,6 +345,10 @@ static void apic_eoi(APICState *s)
     isrv = get_highest_priority_int(s->isr);
     if (isrv < 0)
         return;
+
+    if (qemu_wait_for_irq_acked(isrv))
+        qemu_unset_request_irq_ack(isrv);
+
     reset_bit(s->isr, isrv);
     /* XXX: send the EOI packet to the APIC bus to allow the I/O APIC to
             set the remote IRR bit for level triggered interrupts. */
@@ -1044,6 +1048,16 @@ void ioapic_set_irq(void *opaque, int vector, int level)
     }
 }
 
+int ioapic_get_vector(void *opaque, int irq_line)
+{
+    IOAPICState *s = opaque;
+
+    if (irq_line >= 0 && irq_line < IOAPIC_NUM_PINS)
+        return (s->ioredtbl[irq_line] & 0xff);
+
+    return -1;
+}
+
 static uint32_t ioapic_mem_readl(void *opaque, target_phys_addr_t addr)
 {
     IOAPICState *s = opaque;
diff --git a/qemu/hw/irq.c b/qemu/hw/irq.c
index 7703f62..1788906 100644
--- a/qemu/hw/irq.c
+++ b/qemu/hw/irq.c
@@ -30,6 +30,8 @@ struct IRQState {
     int n;
 };
 
+uint32_t qemu_irq_acked[NR_IRQ_WORDS];
+
 void qemu_set_irq(qemu_irq irq, int level)
 {
     if (!irq)
@@ -38,6 +40,19 @@ void qemu_set_irq(qemu_irq irq, int
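The queue-and-retry behaviour described in the commit message can be modelled in isolation as below. This is a simplified sketch, not code from the patch: `struct timer_irq` and the three functions are hypothetical names, and the real patch tracks acknowledgement through the `irq_acked` bitmap rather than a per-timer flag.

```c
#include <stdint.h>

/* Hypothetical per-timer state; not taken from the patch. */
struct timer_irq {
    int in_flight;      /* 1 while an injected irq has not been acked */
    uint32_t pending;   /* ticks that fired while an irq was in flight */
    uint64_t injected;  /* total irqs raised to the guest */
};

/* Timer expiry: inject if the previous irq was acked, otherwise queue. */
static void timer_tick(struct timer_irq *t)
{
    if (t->in_flight) {
        t->pending++;   /* would have been coalesced; remember it */
        return;
    }
    t->in_flight = 1;
    t->injected++;
}

/* Guest acked the irq; returns 1 if a retry timer should be armed. */
static int timer_acked(struct timer_irq *t)
{
    t->in_flight = 0;
    return t->pending > 0;
}

/* Retry timer fired: re-inject one queued tick if the line is free. */
static void timer_retry(struct timer_irq *t)
{
    if (t->pending && !t->in_flight) {
        t->pending--;
        t->in_flight = 1;
        t->injected++;
    }
}
```

With this bookkeeping, three expiries that arrive before the guest acks anything still end up as three injected interrupts, which is the property that removes the drift.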
Re: [kvm-devel] question: HPET for multiple VMs
Hi Avi,

If you use the dyntick clock option (the default, IIRC), and a newer host kernel, then the kernel provides high-resolution timers, very likely using HPET internally or some other high resolution clock and event source.

I see. The dyntick clock seems to be more scalable than the others. I understood that '-clock hpet' is used for boosting one VM (because hpet gives the best performance with virtio), right? I would like to try dyntick for my multiple-VM environment.

I think that for newer kernels we already have the desired accuracy.

Yes. In recent versions of kvm, I didn't experience any time inaccuracy, although I have only tested with several VMs. I'll try a larger number of VMs, and if time inaccuracy occurs, I would like to report that.

Many thanks,
ozaki-r

2008/3/23, Avi Kivity [EMAIL PROTECTED]: Ryota OZAKI wrote:

Hi all, Current kvm allows only one VM to use HPET. Is there a plan to implement functionality that allows multiple VMs to use HPET? If so, what is the status of that?

If you use the dyntick clock option (the default, IIRC), and a newer host kernel, then the kernel provides high-resolution timers, very likely using HPET internally or some other high resolution clock and event source.

And I would like to ask whether implementing this functionality is the right thing to do, in terms of need and efficiency (scalability and time accuracy).

I think that for newer kernels we already have the desired accuracy. We're not always good at exploiting that accuracy; hence the recent movement of the PIT implementation from userspace to the kernel. But recent discussion leads me to believe it could have been implemented with the userspace PIT as well.

--
Any sufficiently difficult bug is indistinguishable from a feature.
Re: [kvm-devel] [RFC/PATCH 01/15] preparation: provide hook to enable pgstes in user pagetable
On Sun, 2008-03-23 at 12:15 +0200, Avi Kivity wrote:

Can you convert the page tables at a later time without doing a wholesale replacement of the mm? It should be a bit easier to keep people off the pagetables than keep their grubby mitts off the mm itself.

Yes, as far as I can see you're right. And whatever we do in arch code, after all it's just a workaround to avoid a new clone flag. If something like clone() with CLONE_KVM would be useful for more architectures than just s390 then maybe we should try to get a flag. Oh... there are just two unused clone flag bits left. Looks like the namespace changes ate up a lot of them lately.

Well, we could still play dirty tricks like setting a bit in current via whatever mechanism which indicates child-wants-extended-page-tables and then just fork and be happy.

How about taking mmap_sem for write and converting all page tables in-place? I'd rather avoid the need to fork() when creating a VM.

That was my initial approach as well. If all the page table allocations can be fulfilled, the code is not too complicated. Handling allocation failures gets tricky. At this point I realized that dup_mmap already does what we want to do. It walks all the page tables, allocates new page tables and copies the ptes. In principle I would be reinventing the wheel if we cannot use dup_mmap.

--
blue skies,
Martin.

Reality continues to ruin my life. - Calvin.
Re: [kvm-devel] KVM: MMU: add KVM_ZAP_GFN ioctl
On Fri, Mar 21, 2008 at 04:56:50PM +0100, Andrea Arcangeli wrote: On Fri, Mar 21, 2008 at 10:37:00AM -0300, Marcelo Tosatti wrote: This is not the final put_page(). Remote TLB's are flushed here, after rmap_remove: + if (nuked) + kvm_flush_remote_tlbs(kvm); This ioctl is called before zap_page_range() is executed through sys_madvise(MADV_DONTNEED) to remove the page in question. We know that the guest will not attempt to fault in the gfn because the virtio balloon driver is synchronous (it will only attempt to release that page back to the guest OS once rmap_nuke+zap_page_range has finished). Can you be more verbose? Sure. 1) even if you run madvise(MADV_DONTNEED) after KVM_ZAP_GFN, the anon page can be released by the VM at any time without any kvm-aware lock (there will be a swap reference to it, no any more page_count references leading to memory corruption in the host in presence of memory pressure). This is purely theoretical of course, not sure if timings or probabilities allows for reproducing this in real life. If there are any active shadow mappings to a page there is a guarantee that there is a valid linux pte mapping pointing at it. So page_count == 1 + nr_sptes. So the theoretical race you're talking about is: CPU0CPU1 spte = rmap_next(kvm, rmapp, NULL); while (spte) { BUG_ON(!spte); BUG_ON(!(*spte PT_PRESENT_MASK)); rmap_printk(rmap_nuke: spte %p %llx\n, spte, *spte); rmap_remove(kvm, spte); set_shadow_pte(spte, shadow_trap_nonpresent_pte); nuked = 1; spte = rmap_next(kvm, rmapp, spte); } -- try_to_unmap_one() page is now free page allocated for other purposes if (nuked) kvm_flush_remote_tlbs(kvm); And some other VCPU with the TLB cached writes to the now freed (and possibly allocated to another purpose) page. This case is safe because the path that frees a pte and subsequently a page will take care of flushing the TLB of any remote CPU's that possibly have it cached (before freeing the page, of course). ptep_clear_flush-flush_tlb_page. 
Am I missing something?

2) Not sure what you mean with synchronous, do you mean single threaded? I can't see how it can be single threaded (does ballooning stop all other vcpus?).

No, I mean synchronous in the sense that no other vcpu will attempt to fault that _particular gfn_ in between KVM_ZAP_GFN and madvise.

Why are you taking the mmu_lock around rmap_nuke if no other vcpu can take any page fault and call into get_user_pages in between KVM_ZAP_GFN and madvise?

Other vcpu's can take page faults and call into get_user_pages, but not for the gfn KVM_ZAP_GFN is operating on, because it has been allocated by the balloon driver. So we need mmu_lock to protect against concurrent shadow page and rmap operations.

As far as I can tell the only possible safe ordering is madvise; KVM_ZAP_GFN, which incidentally emulates the mmu notifier behavior. Note that the rmap_remove smp race (here smp race means smp-host race; it will trigger even if the guest is UP) might be a generic issue with the rmap_remove logic. I didn't analyze all the possible rmap_remove callers yet (this was on my todo list), I just made sure that my code would be smp safe.

As detailed above, we have a guarantee that there is a live linux pte by the time rmap_remove() nukes a shadow pte.

By the way, I don't see invalidate_begin/invalidate_end hooks in the KVM part of the MMU notifiers V9 patch? (meaning that zap_page_range will not zap the spte's for the pages in question).

range_begin isn't needed. range_begin is needed only by secondary mmu drivers that aren't reference counting the pages. The _end callback is below. It could be improved to skip the whole range in a single browse of the memslots instead of browsing it for each page in the range. The mmu notifiers aren't merged and this code may still require API changes if EMM is merged instead of #v9 (hope not), so I tried to keep it simple.

Oh, I missed that. Nice.
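The counting argument in the message above (page_count == 1 + nr_sptes, and the host unmap path flushing TLBs before the page is freed) can be walked through with a toy model. This is only a sketch under stated assumptions: struct toy_page and every toy_* helper are hypothetical names invented for illustration, not kernel or KVM APIs, and the model encodes the argument as stated in the thread rather than checking it against the real code.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: hypothetical types and helpers, not kernel APIs. */
struct toy_page {
    int  count;      /* stands in for page_count */
    bool linux_pte;  /* a host (linux) pte still maps the page */
    int  nr_sptes;   /* shadow ptes currently mapping the page */
    int  stale_tlb;  /* remote TLB entries that may still be cached */
};

/* Invariant claimed in the thread: while any spte exists, a linux pte
 * exists too, and page_count == 1 + nr_sptes. */
static bool toy_invariant(const struct toy_page *p)
{
    if (p->nr_sptes > 0)
        return p->linux_pte && p->count == 1 + p->nr_sptes;
    return true;
}

/* Installing a shadow pte takes a reference and may populate a TLB. */
static void toy_map_spte(struct toy_page *p)
{
    p->nr_sptes++;
    p->count++;
    p->stale_tlb++;
}

/* rmap_remove-like step: drop one spte and its reference.
 * The TLB entry may stay cached until someone flushes. */
static void toy_rmap_remove(struct toy_page *p)
{
    assert(p->nr_sptes > 0);
    p->nr_sptes--;
    p->count--;
}

static void toy_flush_remote_tlbs(struct toy_page *p)
{
    p->stale_tlb = 0;
}

/* Host unmap path as described in the thread: flush first, then drop
 * the pte reference. Returns true when the page became free. */
static bool toy_unmap_linux_pte(struct toy_page *p)
{
    toy_flush_remote_tlbs(p);
    p->linux_pte = false;
    p->count--;
    return p->count == 0;
}

/* Replays the CPU0/CPU1 interleaving from the diagram. Returns the
 * number of stale TLB entries at the moment the page is freed,
 * or -1 if the page was not freed. */
static int toy_run_race(void)
{
    struct toy_page p = { .count = 1, .linux_pte = true,
                          .nr_sptes = 0, .stale_tlb = 0 };
    toy_map_spte(&p);
    toy_map_spte(&p);
    assert(toy_invariant(&p));

    while (p.nr_sptes > 0)          /* CPU0: rmap_nuke loop */
        toy_rmap_remove(&p);

    bool freed = toy_unmap_linux_pte(&p);  /* CPU1: before CPU0's flush */
    int stale_at_free = p.stale_tlb;

    toy_flush_remote_tlbs(&p);      /* CPU0: late flush */
    return freed ? stale_at_free : -1;
}
```

In this model toy_run_race() reports no stale entries at free time only because toy_unmap_linux_pte() flushes before dropping the last reference; removing that flush makes the window from the diagram visible.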
Re: [kvm-devel] question: HPET for multiple VMs
On Mon, 2008-03-24 at 00:32 +0900, Ryota OZAKI wrote:

Hi Avi,

If you use the dyntick clock option (the default, IIRC), and a newer host kernel, then the kernel provides high-resolution timers, very likely using HPET internally or some other high resolution clock and event source.

I see. The dyntick clock seems to be more scalable than the others. I understood that '-clock hpet' is used for boosting one VM (because hpet gains best performance on virtio), right? I would like to try dyntick for my multiple VMs environment.

I think that for newer kernels we already have the desired accuracy.

Yes. In recent versions of kvm, I didn't experience any time inaccuracy, although I had only tested with several VMs. I'll try a larger number of VMs, and if time inaccuracy occurs, I would like to report that.

The problem is not inaccuracy of the guest clock (which we do suffer from in some guests, and there is work in progress to fix). The problem is that qemu_timer is not accurate, thus the virtio tx timer is too slow, leading to suboptimal performance for virtio-net. Try host kernel >= 2.6.24 with dyntick.

Many thanks,
ozaki-r

2008/3/23, Avi Kivity [EMAIL PROTECTED]:
Ryota OZAKI wrote:

Hi all,

Current kvm allows only one VM to use HPET. Is there a plan to implement a functionality to allow multiple VMs to use HPET? If so, how about the status of that?

If you use the dyntick clock option (the default, IIRC), and a newer host kernel, then the kernel provides high-resolution timers, very likely using HPET internally or some other high resolution clock and event source.

And I would like to ask right and wrong to implement the functionality in terms of need and efficiency (scalability and time accuracy).

I think that for newer kernels we already have the desired accuracy. We're not always good at exploiting that accuracy; hence the recent movement of the PIT implementation from userspace to the kernel. But recent discussion leads me to believe it could have been implemented with the userspace PIT as well.

--
Any sufficiently difficult bug is indistinguishable from a feature.
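Whether a host delivers short timers accurately, which is the point about qemu_timer and the virtio tx timer above, can be probed from userspace by asking for a short sleep and measuring how late the wakeup is. The following is a minimal sketch, assuming a Linux host with POSIX clock_gettime() and clock_nanosleep(); the helper names now_ns() and sleep_overshoot_ns() are made up for this example and are not part of qemu or kvm.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <time.h>

/* Current CLOCK_MONOTONIC time in nanoseconds. */
static int64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Sleep for request_ns nanoseconds and return how many nanoseconds
 * late the wakeup was. A host without high-resolution timers tends
 * to show overshoots in the range of a scheduler tick. */
static int64_t sleep_overshoot_ns(int64_t request_ns)
{
    assert(request_ns >= 0);

    struct timespec req = {
        .tv_sec  = request_ns / 1000000000LL,
        .tv_nsec = request_ns % 1000000000LL,
    };
    int64_t start = now_ns();
    int rc;

    /* Relative sleep; on EINTR the remaining time is written back
     * into req, so the loop resumes where it left off. */
    do {
        rc = clock_nanosleep(CLOCK_MONOTONIC, 0, &req, &req);
    } while (rc == EINTR);

    return now_ns() - start - request_ns;
}
```

Calling sleep_overshoot_ns(100000) in a loop and looking at the spread of the results gives a rough picture of how well a 0.1 ms timer is honoured on a given host; the numbers depend on the kernel, its configuration, and the load.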
Re: [kvm-devel] [Qemu-devel] [PATCH] [RFC] Fix time drift of rtc clock + general support
On Sun, 2008-03-23 at 16:19 +, Paul Brook wrote:
On Sunday 23 March 2008, Dor Laor wrote:

--- a/qemu/hw/irq.c
+++ b/qemu/hw/irq.c
@@ -30,6 +30,8 @@ struct IRQState {
     int n;
 };
+uint32_t qemu_irq_acked[NR_IRQ_WORDS];

This is absolute rubbish. The whole point of the IRQ framework is that it doesn't assume a single flat IRQ controller.

Thanks for the compliments & the review ...

I specifically said that I'll move this variable into a per-cpu var. Moreover, the translation from irq line to vector is handled by 'qemu_get_irq_vector', which calls 'irq_controller_get_vector'; that should take care of the translation. It works for ioapic, I'm not sure if it works for the flat pic case yet.

Anyway, you're welcome to drift without the patch or provide constructive comments.

Paul
Re: [kvm-devel] Qemu-kvm is leaking my memory ???
2008/3/23, Avi Kivity [EMAIL PROTECTED]:
Avi Kivity wrote:

I see the same issue too now, and am investigating.

The attached patch should fix the issue. It is present in 2.6.25-rc6 only, and not in kvm.git, which is why few people noticed it.

Hi

Tested, and actually seeing no difference in my case of the memory leak. It still looks like over 30M is lost per execution of qemu (tested with a fresh 2.6.25-rc6 with your patch).

Also, before the patch I would have said that my dmsetup status loop test case was not causing big problems, and that it was enough to run another dmsetup to unblock the loop. Now it usually leads to some weird end of qemu itself. I will explore more.

So it's probably fixing some bug and exposing another.

As I said before, in my debugger it was looping in the page_fault handler, i.e. memory should be paged in, but as soon as the handler returns to the code to continue the memcopy, a new page_fault is invoked and the pointer counters are not changed.

Zdenek
Re: [kvm-devel] [Qemu-devel] [PATCH] [RFC] Fix time drift of rtc clock + general support
On Sunday 23 March 2008, Dor Laor wrote:
On Sun, 2008-03-23 at 16:19 +, Paul Brook wrote:
On Sunday 23 March 2008, Dor Laor wrote:

--- a/qemu/hw/irq.c
+++ b/qemu/hw/irq.c
@@ -30,6 +30,8 @@ struct IRQState {
     int n;
 };
+uint32_t qemu_irq_acked[NR_IRQ_WORDS];

This is absolute rubbish. The whole point of the IRQ framework is that it doesn't assume a single flat IRQ controller.

Thanks for the compliments & the review ...

I specifically said that I'll move this variable into a per-cpu var.

Per-cpu is no better.

Moreover, the translation from irq line to vector is handled by 'qemu_get_irq_vector', which calls 'irq_controller_get_vector'; that should take care of the translation. It works for ioapic, I'm not sure if it works for the flat pic case yet.

Which shows you've completely missed the point. irq-n is not a globally unique identifier. It's a local per-controller index. qemu has targets with multiple nested interrupt controllers; anything trying to maintain global or per-cpu IRQ lists is fundamentally broken.

Anyway, you're welcome to drift without the patch or provide constructive comments.

Well, the patch doesn't even build on non-x86 targets.

a new timer will be fired to try inject it again soon (==0.1msec)

If the guest is missing interrupts, the chances of a 0.1ms interval working are not great. Most likely it's either going to trigger immediately, or be delayed significantly, and you're going to end up even further behind. If triggering immediately is OK then why not do that all the time? If triggering immediately is not acceptable then you're still going to lose interrupts.

Paul
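The objection that irq-n is a local per-controller index can be made concrete with a small model of two nested controllers. This is a sketch only: struct toy_intc and the toy_* functions are hypothetical and do not correspond to qemu's actual IRQ structures. It shows the same line number meaning different things on different controllers, which is why a single global (or per-cpu) array indexed by n cannot describe the state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy controller; not qemu's real data structure. */
struct toy_intc {
    uint32_t pending;         /* one bit per local input line */
    struct toy_intc *parent;  /* NULL for the root controller */
    int parent_line;          /* parent input driven by our output */
};

/* Raise local line n on controller c and cascade to its parent. */
static void toy_raise(struct toy_intc *c, int n)
{
    assert(n >= 0 && n < 32);
    c->pending |= UINT32_C(1) << n;
    if (c->parent)
        toy_raise(c->parent, c->parent_line);
}

/* Returns 1 if local line n is pending on controller c, else 0. */
static int toy_is_pending(const struct toy_intc *c, int n)
{
    return (int)((c->pending >> n) & 1u);
}

/* Child controller cascades into line 2 of the root. Raising line 5
 * on the child must not mark line 5 on the root. The result packs
 * three observations into bits 2, 1 and 0. */
static int toy_demo(void)
{
    struct toy_intc root  = { 0, NULL, 0 };
    struct toy_intc child = { 0, &root, 2 };

    toy_raise(&child, 5);

    return toy_is_pending(&child, 5) * 4   /* child line 5: pending */
         + toy_is_pending(&root, 2) * 2    /* root line 2: pending  */
         + toy_is_pending(&root, 5);       /* root line 5: idle     */
}
```

Keeping the pending (or acked) state inside each controller, as here, keeps the index local; a flat array would conflate child line 5 with root line 5.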
Re: [kvm-devel] [kvm-ppc-devel] [PATCH] Move kvm_get_pit to libkvm.c common code
Avi Kivity wrote:
Hollis Blanchard wrote:

Don't compile kvm_*_pit() on architectures whose currently supported platforms do not contain a PIT.

Signed-off-by: Hollis Blanchard [EMAIL PROTECTED]

diff --git a/libkvm/libkvm.h b/libkvm/libkvm.h
--- a/libkvm/libkvm.h
+++ b/libkvm/libkvm.h
@@ -539,6 +539,7 @@ int kvm_pit_in_kernel(kvm_context_t kvm)
 #ifdef KVM_CAP_PIT
+#if defined(__i386__) || defined(__x86_64__) || defined(__ia64__)
 /*!
  * \brief Get in kernel PIT of the virtual domain
  *
@@ -562,6 +563,8 @@ int kvm_set_pit(kvm_context_t kvm, struc
 #endif
+#endif
+
 #ifdef KVM_CAP_VAPIC

ia64 doesn't have an in-kernel pit? (yet?)

IA64 doesn't have pit on platform.

Xiantao
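The guard pattern in the patch, compiling PIT accessors only where the platform has a PIT, can be sketched in isolation. Everything below is hypothetical: TOY_ARCH_HAS_PIT, toy_get_pit() and the other toy_* names are invented for this example and are not libkvm symbols, and the architecture list is purely illustrative, since which architectures belong in it is exactly what the thread is discussing.

```c
#include <assert.h>

/* Illustrative architecture list only. */
#if defined(__i386__) || defined(__x86_64__)
#define TOY_ARCH_HAS_PIT 1
#else
#define TOY_ARCH_HAS_PIT 0
#endif

#if TOY_ARCH_HAS_PIT
/* Stand-in for a PIT accessor; only compiled where a PIT exists.
 * Stores a dummy counter value and returns 0 on success. */
static int toy_get_pit(int *count)
{
    *count = 0;
    return 0;
}
#endif

/* Lets callers ask at run time whether the accessor was built. */
static int toy_pit_supported(void)
{
    return TOY_ARCH_HAS_PIT;
}

/* Common code keeps compiling everywhere: the PIT call sits behind
 * the same guard, and other architectures take the fallback. */
static int toy_pit_count_or(int fallback)
{
#if TOY_ARCH_HAS_PIT
    int count;
    if (toy_get_pit(&count) == 0)
        return count;
#endif
    return fallback;
}
```

Putting the guard in one macro rather than repeating the defined(...) list at every call site means a new architecture only has to be added in one place.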