Re: [kvm-devel] question: HPET for multiple VMs

2008-03-23 Thread Avi Kivity
Anthony Liguori wrote:
 Avi Kivity wrote:
 And I would like to ask whether it is right or wrong to
 implement the functionality in terms of need
 and efficiency (scalability and time accuracy).
 

 I think that for newer kernels we already have the desired accuracy.  
 We're not always good at exploiting that accuracy; hence the recent 
 movement of the PIT implementation from userspace to the kernel.  But 
 recent discussion leads me to believe it could have been implemented 
 with the userspace PIT as well.
   

 What do you think is needed to get the same accuracy in userspace as 
 in kernelspace?  

Some mechanism that allows us to implement kvm_inject_pit_timer_irqs() 
and kvm_pit_timer_intr_post().  Specifically, information about whether 
an interrupt was actually processed, and a window for injecting missed 
ticks.
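A minimal sketch of that bookkeeping, for illustration only. The names (struct tick_state, timer_fired, try_inject, irq_acked) are hypothetical and are not the kvm functions mentioned above; the sketch only shows how ticks that could not be delivered might be counted and re-injected once the guest has really processed the previous one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-timer bookkeeping; not kvm's actual data structures. */
struct tick_state {
    unsigned pending;    /* ticks that fired but were not delivered yet */
    bool in_service;     /* an injected tick is still waiting for its ack */
    unsigned delivered;  /* ticks the guest actually processed */
};

/* The timer period elapsed on the host: remember the tick. */
void timer_fired(struct tick_state *t)
{
    assert(t != NULL);
    t->pending++;
}

/* Injection window: inject one tick unless one is still in service.
 * Returns true if a tick was injected. */
bool try_inject(struct tick_state *t)
{
    assert(t != NULL);
    if (t->in_service || t->pending == 0)
        return false;
    t->pending--;
    t->in_service = true;
    return true;
}

/* The guest acknowledged the interrupt, so it was really processed. */
void irq_acked(struct tick_state *t)
{
    assert(t != NULL);
    if (t->in_service) {
        t->in_service = false;
        t->delivered++;
    }
}
```

With this shape, three ticks that fire while the guest is descheduled end up as three separate injections, one per acknowledgement, instead of being coalesced into one.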

 Better yet, do you think there is a reasonable kvmctl harness we could 
 write to quantify the PIT accuracy?

kvmctl doesn't implement a pit, so no.  Of course we can test any 
infrastructure for counting missed interrupts.


 It's easy enough to count timer interrupts and compare that to an 
 external time source to get some notion of accuracy (on varying 
 frequencies of course).  I know you mentioned before that guest CPU 
 consumption also comes into play... I'm not quite sure why though so 
 I'm not sure how to simulate that.
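The comparison described here can be sketched as plain arithmetic. This is an illustration under assumptions: the function names are made up, the interrupt count is assumed to come from the guest, and the elapsed time from an external reference clock in nanoseconds.

```c
#include <assert.h>
#include <stdint.h>

/* Ticks a timer programmed at freq_hz should deliver in elapsed_ns.
 * The product can overflow for very long runs; fine for a short test. */
uint64_t expected_ticks(uint64_t freq_hz, uint64_t elapsed_ns)
{
    return freq_hz * elapsed_ns / 1000000000ull;
}

/* Signed error in ticks; negative means the guest saw too few interrupts. */
int64_t tick_error(uint64_t counted, uint64_t freq_hz, uint64_t elapsed_ns)
{
    return (int64_t)counted - (int64_t)expected_ticks(freq_hz, elapsed_ns);
}

/* The same error in parts per million of the expected tick count. */
int64_t tick_error_ppm(uint64_t counted, uint64_t freq_hz, uint64_t elapsed_ns)
{
    uint64_t expected = expected_ticks(freq_hz, elapsed_ns);

    assert(expected > 0);
    return tick_error(counted, freq_hz, elapsed_ns) * 1000000 / (int64_t)expected;
}
```

Running this at several programmed frequencies, with and without load on host and guest, would give one number per configuration to compare.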

It's not so easy, the code is quite tricky since the cpu processes 
vectors, not interrupt lines.  It's also heuristic; if the guest 
programs some random device to share interrupts with the pit, the 
heuristic breaks down.  This never happens in practice, though.

Problems show up when both the guest and host are loaded, as then the 
cpu is timesliced instead of being available on demand.


 The nice thing about the CAP infrastructure is we can always move the 
 PIT back to userspace.  I'll happily invest some cycles here as I'm a 
 big fan of getting rid of unneeded kernel code :-)

Yes.

-- 
error compiling committee.c: too many arguments to function


-
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2008.
http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [PATCH] kvm.h: __user requires compiler.h

2008-03-23 Thread Avi Kivity
Anthony Liguori wrote:
 This patch breaks QEMU build when doing a 'make sync'.  When you do a 
 top-level ./configure, libkvm is built with kerneldir pointing to 
 kvm-userspace/kernel/include.  While linux/kvm.h is present there, 
 there isn't a linux/compiler.h.

 The host kernelpath isn't normally part of the libkvm or QEMU build.  
 So we have a couple options.

 1) make the host kernelpath (/lib/modules/$(uname -r)/build/include) 
 part of the libkvm/QEMU build.

 2) Do something else about __user

 Suggestions?  #1 might be a pain since there may be include conflicts 
 between the host kernel include and kernel/include.


We could hack 'make sync' to strip out __user (just like we run 
unifdef).  Of course the reasons for including linux/compiler.h are 
still valid, so it needs to remain.
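For comparison, one shape the "do something else about __user" option could take is an empty fallback definition on the userspace side. This is only a sketch of that alternative, not what was proposed or merged; struct demo_region and demo_region_len are hypothetical names used to show that an annotated header still compiles.

```c
#include <assert.h>
#include <stddef.h>

/* In kernel builds, linux/compiler.h defines __user as an annotation for
 * sparse. Userspace has no such header, so fall back to an empty macro. */
#ifndef __user
#define __user
#endif

/* Hypothetical ioctl argument, shaped like structures in kernel headers. */
struct demo_region {
    void __user *addr;   /* compiles as a plain void * in userspace */
    size_t len;
};

size_t demo_region_len(const struct demo_region *r)
{
    assert(r != NULL);
    return r->len;
}
```

Stripping the annotation during 'make sync' reaches the same result without touching the build of libkvm or QEMU.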

-- 
error compiling committee.c: too many arguments to function




Re: [kvm-devel] [PATCH] Move kvm_get_pit to libkvm.c common code

2008-03-23 Thread Avi Kivity
Hollis Blanchard wrote:

 Don't compile kvm_*_pit() on architectures whose currently supported
 platforms do not contain a PIT.

 Signed-off-by: Hollis Blanchard [EMAIL PROTECTED]

 diff --git a/libkvm/libkvm.h b/libkvm/libkvm.h
 --- a/libkvm/libkvm.h
 +++ b/libkvm/libkvm.h
 @@ -539,6 +539,7 @@ int kvm_pit_in_kernel(kvm_context_t kvm)
  
  #ifdef KVM_CAP_PIT
  
 +#if defined(__i386__) || defined(__x86_64__) || defined(__ia64__) 
  /*!
   * \brief Get in kernel PIT of the virtual domain
   *
 @@ -562,6 +563,8 @@ int kvm_set_pit(kvm_context_t kvm, struc
  
  #endif
  
 +#endif
 +
  #ifdef KVM_CAP_VAPIC

ia64 doesn't have an in-kernel pit? (yet?)

-- 
error compiling committee.c: too many arguments to function




Re: [kvm-devel] [PATCH][QEMU] Use a separate device for in-kernel PIT

2008-03-23 Thread Avi Kivity
Anthony Liguori wrote:
 Hollis Blanchard wrote:
 This patch solves annoying qemu build breakage hitting PowerPC around
 struct kvm_pit_state, so that's another vote in favor...
   

 I have an updated version of the patch but it's breaking the build b/c 
 something fouled up right now with configure.  libkvm pulls in 
 linux/kvm.h which wants to pull in linux/compiler.h.  We don't ship a 
 linux/compiler.h though so it's pulling from /usr/include/linux which 
 on my system doesn't have a compiler.h.

 The lack of this header is causing the configure test to fail.  I've 
 attached the patch here for you to use and I'll send it out again once 
 I figure out the fix for this linux/compiler.h.


The patch suffers from the same problem as the apic split; the 
save/restore code is needlessly duplicated.

-- 
error compiling committee.c: too many arguments to function




Re: [kvm-devel] [PATCH] 'make clean' is eager to delete config.mak files

2008-03-23 Thread Avi Kivity
Ryota OZAKI wrote:
 Hi all,

 Current 'make clean' deletes config.mak files so
 that we have to ./configure again after doing that.
 This behavior is different from that of standard
 'make clean'.

 This patch introduces 'make distclean' to delete
 config.mak files instead of 'make clean', following
 a standard manner of Makefile.

   

Applied, thanks.

-- 
error compiling committee.c: too many arguments to function




Re: [kvm-devel] [RFC PATCH 1/5] lguest: mmap backing file

2008-03-23 Thread Avi Kivity
Anthony Liguori wrote:
 
   
 If we're going to mod the kernel, how about a mmap this part of their 
 address 
 space and having the kernel keep the mappings in sync.  But I think that if 
 we want to get speed, we should probably be doing the copy between address 
 spaces in-kernel so we can do lightweight exits.
   
 

 I don't think lightweight exits help the situation very much.  The 
 difference between a light weight and heavy weight exit is only 3-4k 
 cycles or so.
   

On what host cpu?  IIRC the difference was bigger on Intel (and in 
relative terms, set to increase).

 in-kernel doesn't make the situation much easier.  You have to map pages 
 in from a different task.  It's a lot easier if you have both guest 
 mapped in userspace.
   

The kernel already has everything mapped (kmap_atomic() is an addition 
on x86_64).



-- 
error compiling committee.c: too many arguments to function




Re: [kvm-devel] [ANNOUNCE] kvm-guest-drivers-windows-1

2008-03-23 Thread Avi Kivity
Avi Kivity wrote:
 Daniel P. Berrange wrote:
   
 On Tue, Mar 18, 2008 at 05:01:09PM +0200, Avi Kivity wrote:
   
 
 This is the first release of network drivers for Windows guests running 
 on a kvm host.  The drivers are intended for Windows 2000 and Windows XP 
 32-bit.  kvm-61 or later is needed in the host.  At the moment only 
 binaries are available.
 
   
 There's no license file inside the ZIP file - what license are the binaries 
 re-distributed under ?

   
 

 Good question.  I'll find out.  I imagine they'd be freely redistributable.

   

The binaries are free for use and redistribution for commercial and 
non-commercial use.  The sources will be released under an open-source 
license, provided the Windows DDK terms permit.


-- 
error compiling committee.c: too many arguments to function




Re: [kvm-devel] [RFC/PATCH 01/15] preparation: provide hook to enable pgstes in user pagetable

2008-03-23 Thread Avi Kivity
Heiko Carstens wrote:
 What you've done with dup_mm() is probably the brute-force way that I
 would have done it had I just been trying to make a proof of concept or
 something.  I'm worried that there are a bunch of corner cases that
 haven't been considered.

 What if someone else is poking around with ptrace or something similar
 and they bump the mm_users:

 +   if (tsk->mm->context.pgstes)
 +           return 0;
 +   if (!tsk->mm || atomic_read(&tsk->mm->mm_users) > 1 ||
 +       tsk->mm != tsk->active_mm || tsk->mm->ioctx_list)
 +           return -EINVAL;
 HERE
 +   tsk->mm->context.pgstes = 1;    /* dirty little tricks .. */
 +   mm = dup_mm(tsk);

 It'll race, possibly fault in some other pages, and those faults will be
 lost during the dup_mm().  I think you need to be able to lock out all
 of the users of access_process_vm() before you go and do this.  You also
 need to make sure that anyone who has looked at task->mm doesn't go and
 get a reference to it and get confused later when it isn't the task->mm
 any more.

 
 Therefore, we need to reallocate the page table after fork() 
 once we know that task is going to be a hypervisor. That's what this 
 code does: reallocate a bigger page table to accommodate the extra 
 information. The task needs to be single-threaded when calling for 
 extended page tables.

 Btw: at fork() time, we cannot tell whether or not the user's going to 
 be a hypervisor. Therefore we cannot do this in fork.
   
 Can you convert the page tables at a later time without doing a
 wholesale replacement of the mm?  It should be a bit easier to keep
 people off the pagetables than keep their grubby mitts off the mm
 itself.
 

 Yes, as far as I can see you're right. And whatever we do in arch code,
 after all it's just a work around to avoid a new clone flag.
 If something like clone() with CLONE_KVM would be useful for more
 architectures than just s390 then maybe we should try to get a flag.

 Oh... there are just two unused clone flag bits left. Looks like the
 namespace changes ate up a lot of them lately.

 Well, we could still play dirty tricks like setting a bit in current
 via whatever mechanism which indicates child-wants-extended-page-tables
 and then just fork and be happy.
   

How about taking mmap_sem for write and converting all page tables 
in-place?  I'd rather avoid the need to fork() when creating a VM.

-- 
error compiling committee.c: too many arguments to function




Re: [kvm-devel] Qemu-kvm is leaking my memory ???

2008-03-23 Thread Avi Kivity
Zdenek Kabelac wrote:
 2008/3/19, Avi Kivity [EMAIL PROTECTED]:
   
 Zdenek Kabelac wrote:
   2008/3/19, Avi Kivity [EMAIL PROTECTED]:
  
   Zdenek Kabelac wrote:
 2008/3/16, Avi Kivity [EMAIL PROTECTED]:

 The -vnc switch, so there's no local X server.  A remote X server should
  be fine as well.  Use runlevel 3, which means network but no local X server.
 

 Ok I've finally got some time to make comparable measurements about memory -

 I'm attaching empty   trace log which is from the level where most of
 processes were killed (as you can see in the 'ps' trace)

 Then there are attachments after using qemu 7 times (log of free
 before execution is also attached)

 Both logs were taken after echo 3 > /proc/sys/vm/drop_caches

   

I see the same issue too now, and am investigating.

-- 
error compiling committee.c: too many arguments to function




Re: [kvm-devel] [RFC PATCH 0/4] Inter-guest virtio I/O example with lguest

2008-03-23 Thread Rusty Russell
On Friday 21 March 2008 01:11:35 Anthony Liguori wrote:
 Rusty Russell wrote:
 There are three possible solutions:
  1) Just offer the lowest common denominator to both sides (ie. no
  features). This is what I do with lguest in these patches.
  2) Offer something and handle the case where one Guest accepts and
  another doesn't by emulating it.  ie. de-TSO the packets manually.
  3) Hot unplug the device from the guest which asks for the greater
  features, then re-add it offering less features.  Requires hotplug in the
  guest OS.

 4) Add a feature negotiation feature.  The feature that gets set is the
 feature negotiate feature.  If a guest doesn't support feature
 negotiation, you end up with the least-common denominator (no
 features).  If both guests support feature negotiation, you can then add
 something new to determine the true common subset.

Hmm, I discarded that out of hand as too icky, but we might end up there.  
Analyse features like normal, accept feature negotiation, set DRIVER_OK, wait 
for config change, if feature negotiation is still set then go around again 
(presumably some features have been removed).
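A rough model of that loop, seen from the host side, may help when prototyping. It is a sketch under assumptions: F_NEGOTIATE is a hypothetical feature bit, guest_ack stands in for a guest driver acking the features it understands, and each pass of the loop corresponds to one config change.

```c
#include <assert.h>
#include <stdint.h>

#define F_NEGOTIATE (1u << 31)   /* hypothetical "feature negotiation" bit */

/* What a guest driver acks: the offered features it understands. */
uint32_t guest_ack(uint32_t offered, uint32_t supported)
{
    return offered & supported;
}

/* Offer features to both guests and re-offer the intersection until both
 * accept everything offered. Returns the agreed feature set. */
uint32_t negotiate(uint32_t host_features, uint32_t guest_a, uint32_t guest_b)
{
    uint32_t offer = host_features | F_NEGOTIATE;

    for (;;) {
        uint32_t a = guest_ack(offer, guest_a);
        uint32_t b = guest_ack(offer, guest_b);
        uint32_t common = a & b;

        /* A guest without negotiation support gets no features at all. */
        if (!(a & F_NEGOTIATE) || !(b & F_NEGOTIATE))
            return 0;

        /* Stable: nothing was dropped, so negotiation is finished. */
        if (common == offer)
            return common & ~F_NEGOTIATE;

        /* Config change: offer the reduced set and go around again. */
        offer = common;
    }
}
```

The loop terminates because every pass either returns or strictly shrinks the offered set.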

I'll prototype it and see how we go.

Thanks,
Rusty.



Re: [kvm-devel] Qemu-kvm is leaking my memory ???

2008-03-23 Thread Avi Kivity

Avi Kivity wrote:


I see the same issue too now, and am investigating.



The attached patch should fix the issue.  It is present in 2.6.25-rc6 
only, and not in kvm.git, which is why few people noticed it.
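The kind of leak a one-line change like this removes can be shown with a toy model. The assumption here, which is a reading of the patch and not something stated in the message, is that every page lookup takes a reference, so looking the page up a second time leaves one reference that is never dropped. All names below are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct toy_page { int refcount; };

/* Stand-in for a page lookup: every call takes a reference. */
struct toy_page *toy_lookup(struct toy_page *p)
{
    assert(p != NULL);
    p->refcount++;
    return p;
}

void toy_release(struct toy_page *p)
{
    assert(p != NULL && p->refcount > 0);
    p->refcount--;
}

/* Leaky pattern: look the page up, then look it up again when storing it. */
struct toy_page *store_leaky(struct toy_page *p)
{
    struct toy_page *page = toy_lookup(p);

    (void)page;               /* the first reference is forgotten */
    return toy_lookup(p);     /* only this second reference gets released */
}

/* Fixed pattern: store the page that was already looked up. */
struct toy_page *store_fixed(struct toy_page *p)
{
    struct toy_page *page = toy_lookup(p);

    return page;
}
```

After one store and one release, the leaky variant leaves the count at 1 and the fixed variant leaves it at 0, which matches memory that is never returned to the host.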


--
error compiling committee.c: too many arguments to function

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 4ba85d9..e55af12 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1412,7 +1412,7 @@ static void mmu_guess_page_from_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
 	up_read(&current->mm->mmap_sem);
 
 	vcpu->arch.update_pte.gfn = gfn;
-	vcpu->arch.update_pte.page = gfn_to_page(vcpu->kvm, gfn);
+	vcpu->arch.update_pte.page = page;
 }
 
 void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,


Re: [kvm-devel] [Qemu-devel] Coredump from qemu

2008-03-23 Thread Avi Kivity
Zdenek Kabelac wrote:
 Hi

 During execution of qemu I've got this crash:

 #0  0x00407a29 in qemu_mod_timer (ts=0x2e8cf90,
 expire_time=130685351465) at /usr/src/debug/kvm-63/qemu/vl.c:1073
 #1  0x00425590 in pcnet_ioport_writew (opaque=0x0,
 addr=1836332585, val=8090216)
 at /usr/src/debug/kvm-63/qemu/hw/pcnet.c:1617
 #2  0x00501cf1 in kvm_outw (opaque=<value optimized out>,
 addr=13865, data=29288)
 at /usr/src/debug/kvm-63/qemu/qemu-kvm.c:457
 #3  0x0051e2a0 in kvm_run (kvm=0x2dbb030, vcpu=1) at libkvm.c:719
 #4  0x00501646 in kvm_cpu_exec (env=<value optimized out>) at
 /usr/src/debug/kvm-63/qemu/qemu-kvm.c:127
 #5  0x005021a5 in kvm_main_loop_cpu (env=0x2e8f010) at
 /usr/src/debug/kvm-63/qemu/qemu-kvm.c:307
 #6  0x00502302 in ap_main_loop (_env=<value optimized out>) at
 /usr/src/debug/kvm-63/qemu/qemu-kvm.c:338
 #7  0x00353420740a in start_thread () from /lib64/libpthread.so.0
 #8  0x0035336e5d1d in clone () from /lib64/libc.so.6

 (gdb) print alarm_timer
 $1 = (struct qemu_alarm_timer *) 0x0


 It happened during detach of gdb and quit of qemu itself - I assume
 not all timers were stopped when quit_timers was executed?

   

Looks like.

 Maybe a check for a non-NULL pointer in qemu_mod_timer is enough?

   

I think the correct solution is to allow devices to register a shutdown 
function, which would be called after main_loop() (but before 
quit_timers) , which would execute qemu_del_timer() in the case of 
pcnet.  It's also necessary for device hotremove.
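A small sketch of what such a registration mechanism might look like. None of these names exist in qemu as far as this message shows; register_shutdown, run_shutdown_hooks and struct toy_nic are hypothetical, and the device callback only marks where a call such as qemu_del_timer() would go.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SHUTDOWN_HOOKS 16

typedef void (*shutdown_fn)(void *opaque);

static struct {
    shutdown_fn fn;
    void *opaque;
} hooks[MAX_SHUTDOWN_HOOKS];
static size_t nr_hooks;

/* Hypothetical API: a device registers a callback to run at shutdown.
 * Returns 0 on success, -1 if the table is full. */
int register_shutdown(shutdown_fn fn, void *opaque)
{
    assert(fn != NULL);
    if (nr_hooks == MAX_SHUTDOWN_HOOKS)
        return -1;
    hooks[nr_hooks].fn = fn;
    hooks[nr_hooks].opaque = opaque;
    nr_hooks++;
    return 0;
}

/* Meant to run after the main loop returns and before timers are torn
 * down. Hooks run in reverse registration order. */
void run_shutdown_hooks(void)
{
    while (nr_hooks > 0) {
        nr_hooks--;
        hooks[nr_hooks].fn(hooks[nr_hooks].opaque);
    }
}

/* Toy device standing in for pcnet. */
struct toy_nic { int timer_armed; };

void toy_nic_shutdown(void *opaque)
{
    struct toy_nic *nic = opaque;

    nic->timer_armed = 0;   /* a real device would delete its timer here */
}
```

The same callback could be reused for hot-remove, since the device has to stop its timers in both cases.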

-- 
error compiling committee.c: too many arguments to function




Re: [kvm-devel] [PATCH][QEMU] Use a separate device for in-kernel PIT

2008-03-23 Thread Anthony Liguori
Avi Kivity wrote:
 Anthony Liguori wrote:
   
 Hollis Blanchard wrote:
 
 This patch solves annoying qemu build breakage hitting PowerPC around
 struct kvm_pit_state, so that's another vote in favor...
   
   
 I have an updated version of the patch but it's breaking the build b/c 
 something fouled up right now with configure.  libkvm pulls in 
 linux/kvm.h which wants to pull in linux/compiler.h.  We don't ship a 
 linux/compiler.h though so it's pulling from /usr/include/linux which 
 on my system doesn't have a compiler.h.

 The lack of this header is causing the configure test to fail.  I've 
 attached the patch here for you to use and I'll send it out again once 
 I figure out the fix for this linux/compiler.h.

 

 The patch suffers from the same problem as the apic split; the 
 save/restore code is needlessly duplicated.
   

The updated patch addresses this problem.  I have to fix the 
linux/compiler.h issue first though before it can be applied or it will 
break the build.

Regards,

Anthony Liguori





[kvm-devel] [PATCH] [RFC] Fix time drift of rtc clock + general support

2008-03-23 Thread Dor Laor
Qemu device emulation for timers might be inaccurate and
cause coalescing of several irqs into one. This happens when the
load on the host is high and the guest did not manage to ack the
previous irq. With the get/set request irq commands, the device won't issue
another irq before the previous one has been acknowledged.

Each timer (rtc in this case) will request information about
acking its irq vector. If a timer pops and there is pending irq that
didn't manage to be injected, it will be queued (pending variable) and
a new timer will be fired to try inject it again soon (==0.1msec)
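The idea can be sketched with a 256-bit map, one bit per vector. This is a simplified illustration and not the code in the patch: struct ack_map, struct toy_timer and the helper names are hypothetical, and the retry timer is reduced to a counter.

```c
#include <assert.h>
#include <stdint.h>

#define NR_VECTORS 256
#define NR_WORDS   (NR_VECTORS / 32)

/* One bit per vector, set while a device waits for the guest's ack. */
struct ack_map { uint32_t words[NR_WORDS]; };

void ack_request(struct ack_map *m, unsigned vec)
{
    assert(vec < NR_VECTORS);
    m->words[vec / 32] |= 1u << (vec % 32);
}

void ack_clear(struct ack_map *m, unsigned vec)
{
    assert(vec < NR_VECTORS);
    m->words[vec / 32] &= ~(1u << (vec % 32));
}

int ack_waiting(const struct ack_map *m, unsigned vec)
{
    assert(vec < NR_VECTORS);
    return (m->words[vec / 32] >> (vec % 32)) & 1u;
}

/* Toy periodic timer bound to one vector. */
struct toy_timer { unsigned vec; unsigned pending; };

/* Timer callback: returns 1 if the irq was raised now, 0 if it was queued
 * because the previous one has not been acked yet. */
int toy_timer_fire(struct ack_map *m, struct toy_timer *t)
{
    if (ack_waiting(m, t->vec)) {
        t->pending++;          /* retry shortly instead of coalescing */
        return 0;
    }
    ack_request(m, t->vec);    /* raise the irq and ask for ack notification */
    return 1;
}
```

On EOI the bit for the acked vector is cleared, and the short retry timer can then deliver one of the queued ticks.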

It fixes the current time drift on windows acpi hal guest.
It works well for in-kernel irqchip and also w/o.

Todo:
1. Implement it for the pit and eliminate the in-kernel pit.
2. Support smp (move acked_irq to CPUState)
3. Prepare several cleaner patches

Signed-off-by: Dor Laor [EMAIL PROTECTED]
---
 libkvm/libkvm-x86.c   |   11 +++
 libkvm/libkvm.h       |   30 ++
 qemu/hw/apic.c        |   14 ++
 qemu/hw/irq.c         |   15 +++
 qemu/hw/irq.h         |   42 ++
 qemu/hw/mc146818rtc.c |   45 +++--
 qemu/hw/pc.c          |    8 
 qemu/hw/pc.h          |    3 +++
 qemu/qemu-kvm-x86.c   |   13 -
 9 files changed, 178 insertions(+), 3 deletions(-)

diff --git a/libkvm/libkvm-x86.c b/libkvm/libkvm-x86.c
index 6dba91d..2e3b677 100644
--- a/libkvm/libkvm-x86.c
+++ b/libkvm/libkvm-x86.c
@@ -576,6 +576,17 @@ __u64 kvm_get_cr8(kvm_context_t kvm, int vcpu)
 	return kvm->run[vcpu]->cr8;
 }
 
+void kvm_get_marked_irqs(kvm_context_t kvm, int vcpu, __u32* irq_acked)
+{
+	memcpy(irq_acked, kvm->run[vcpu]->irq_acked, sizeof(kvm->run[vcpu]->irq_acked));
+}
+
+void kvm_set_irqs_to_mark(kvm_context_t kvm, int vcpu, __u32* irq_acked)
+{
+	memcpy(kvm->run[vcpu]->irq_acked, irq_acked, sizeof(kvm->run[vcpu]->irq_acked));
+}
+
+
 int kvm_setup_cpuid(kvm_context_t kvm, int vcpu, int nent,
struct kvm_cpuid_entry *entries)
 {
diff --git a/libkvm/libkvm.h b/libkvm/libkvm.h
index 61e7e98..1c027c9 100644
--- a/libkvm/libkvm.h
+++ b/libkvm/libkvm.h
@@ -357,6 +357,36 @@ void kvm_set_cr8(kvm_context_t kvm, int vcpu, uint64_t cr8);
  * \param vcpu Which virtual CPU should get dumped
  */
 __u64 kvm_get_cr8(kvm_context_t kvm, int vcpu);
+
+/*!
+ * \brief Get notification of acked interrupts by in-kernel irq chip
+ *
+ * User space device emulation for timers might be inaccurate and 
+ * cause coalescing of several irq into one. It happens when the
+ * load on the host is high and the guest did not manage to ack the
+ * previous irq. By get/set request irq commands the device won't issue
+ * another irq before the previous one has been acknowledged.
+ *
+ * \param kvm Pointer to the current kvm_context
+ * \param vcpu Which virtual CPU should get dumped
+ * \param irq_acked 256 bit array to copy the content
+ */
+void kvm_get_marked_irqs(kvm_context_t kvm, int vcpu, __u32* irq_acked);
+
+/*!
+ * \brief Set request for notification of acked interrupts by in-kernel irq chip
+ *
+ * User space device emulation for timers might be inaccurate and 
+ * cause coalescing of several irq into one. It happens when the
+ * load on the host is high and the guest did not manage to ack the
+ * previous irq. By get/set request irq commands the device won't issue
+ * another irq before the previous one has been acknowledged.
+ *
+ * \param kvm Pointer to the current kvm_context
+ * \param vcpu Which virtual CPU should get dumped
+ * \param irq_acked 256 bit array to copy the content from
+ */
+void kvm_set_irqs_to_mark(kvm_context_t kvm, int vcpu, __u32* irq_acked);
 #endif
 
 /*!
diff --git a/qemu/hw/apic.c b/qemu/hw/apic.c
index 92248dd..cdfc8a4 100644
--- a/qemu/hw/apic.c
+++ b/qemu/hw/apic.c
@@ -345,6 +345,10 @@ static void apic_eoi(APICState *s)
     isrv = get_highest_priority_int(s->isr);
     if (isrv < 0)
         return;
+
+    if (qemu_wait_for_irq_acked(isrv))
+        qemu_unset_request_irq_ack(isrv);
+
     reset_bit(s->isr, isrv);
     /* XXX: send the EOI packet to the APIC bus to allow the I/O APIC to
             set the remote IRR bit for level triggered interrupts. */
@@ -1044,6 +1048,16 @@ void ioapic_set_irq(void *opaque, int vector, int level)
 }
 }
 
+int ioapic_get_vector(void *opaque, int irq_line)
+{
+    IOAPICState *s = opaque;
+
+    if (irq_line >= 0 && irq_line < IOAPIC_NUM_PINS)
+        return (s->ioredtbl[irq_line] & 0xff);
+
+    return -1;
+}
+
 static uint32_t ioapic_mem_readl(void *opaque, target_phys_addr_t addr)
 {
 IOAPICState *s = opaque;
diff --git a/qemu/hw/irq.c b/qemu/hw/irq.c
index 7703f62..1788906 100644
--- a/qemu/hw/irq.c
+++ b/qemu/hw/irq.c
@@ -30,6 +30,8 @@ struct IRQState {
 int n;
 };
 
+uint32_t qemu_irq_acked[NR_IRQ_WORDS];
+
 void qemu_set_irq(qemu_irq irq, int level)
 {
 if (!irq)
@@ -38,6 +40,19 @@ void qemu_set_irq(qemu_irq irq, int 

Re: [kvm-devel] question: HPET for multiple VMs

2008-03-23 Thread Ryota OZAKI
Hi Avi,

 If you use the dyntick clock option (the default, IIRC), and a newer
  host kernel, then the kernel provides high-resolution timers, very
  likely using HPET internally or some other high resolution clock and
  event source.

I see. The dyntick clock seems to be more scalable than
the others. I understood that '-clock hpet' is used for
boosting one VM (because hpet gains best performance
on virtio), right?

I would like to try dyntick for my multiple VMs environment.

 I think that for newer kernels we already have the desired accuracy.

Yes. In recent versions of kvm, I didn't experience
any time inaccuracy, although I had only tested under
several VMs. I'll try with a larger number of VMs, and
if time inaccuracy occurs, I would like to report
that.

Many thanks,
ozaki-r

2008/3/23, Avi Kivity [EMAIL PROTECTED]:
 Ryota OZAKI wrote:
   Hi all,
  
   Current kvm allows only one VM to use HPET. Is
   there a plan to implement a functionality to
   allow multiple VMs to use HPET? If so, how
   about the status of that?
  
  


 If you use the dyntick clock option (the default, IIRC), and a newer
  host kernel, then the kernel provides high-resolution timers, very
  likely using HPET internally or some other high resolution clock and
  event source.


   And I would like to ask whether it is right or wrong to
   implement the functionality in terms of need
   and efficiency (scalability and time accuracy).


 I think that for newer kernels we already have the desired accuracy.
  We're not always good at exploiting that accuracy; hence the recent
  movement of the PIT implementation from userspace to the kernel.  But
  recent discussion leads me to believe it could have been implemented
  with the userspace PIT as well.


  --
  Any sufficiently difficult bug is indistinguishable from a feature.





Re: [kvm-devel] [RFC/PATCH 01/15] preparation: provide hook to enable pgstes in user pagetable

2008-03-23 Thread Martin Schwidefsky
On Sun, 2008-03-23 at 12:15 +0200, Avi Kivity wrote:
  Can you convert the page tables at a later time without doing a
  wholesale replacement of the mm?  It should be a bit easier to keep
  people off the pagetables than keep their grubby mitts off the mm
  itself.
  
 
  Yes, as far as I can see you're right. And whatever we do in arch code,
  after all it's just a work around to avoid a new clone flag.
  If something like clone() with CLONE_KVM would be useful for more
  architectures than just s390 then maybe we should try to get a flag.
 
  Oh... there are just two unused clone flag bits left. Looks like the
  namespace changes ate up a lot of them lately.
 
  Well, we could still play dirty tricks like setting a bit in current
  via whatever mechanism which indicates child-wants-extended-page-tables
  and then just fork and be happy.

 
 How about taking mmap_sem for write and converting all page tables 
 in-place?  I'd rather avoid the need to fork() when creating a VM.

That was my initial approach as well. If all the page table allocations
can be fulfilled, the code is not too complicated. Handling allocation
failures gets tricky. At this point I realized that dup_mmap already
does what we want to do. It walks all the page tables, allocates new
page tables and copies the ptes. In principle I would be reinventing
the wheel if we cannot use dup_mmap.
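To illustrate why allocation failure is the tricky part, here is a toy userspace model of an all-or-nothing conversion: allocate every new table first, and only then copy the entries and swap. It is not kernel code and says nothing about locking. struct toy_table, toy_alloc, fail_after and convert_all are hypothetical names, and fail_after exists only to simulate an allocation failure.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define ENTRIES 4   /* toy table size */

struct toy_table {
    unsigned long pte[ENTRIES];
    int extended;    /* 1 once the table has the extra space */
};

/* Simulated failure: the allocation after fail_after successes fails.
 * A negative value means allocations never fail artificially. */
static int fail_after = -1;

static struct toy_table *toy_alloc(void)
{
    if (fail_after == 0)
        return NULL;
    if (fail_after > 0)
        fail_after--;
    return calloc(1, sizeof(struct toy_table));
}

/* Convert n tables to the extended format, all or nothing.
 * Returns 0 on success, -1 with the originals untouched on failure. */
int convert_all(struct toy_table **tables, size_t n)
{
    struct toy_table **fresh;
    size_t i;

    assert(tables != NULL && n > 0);
    fresh = calloc(n, sizeof(*fresh));
    if (!fresh)
        return -1;

    /* Phase 1: allocate everything up front, undo on failure. */
    for (i = 0; i < n; i++) {
        fresh[i] = toy_alloc();
        if (!fresh[i]) {
            while (i-- > 0)
                free(fresh[i]);
            free(fresh);
            return -1;
        }
    }

    /* Phase 2: copy and swap; nothing in here can fail. */
    for (i = 0; i < n; i++) {
        memcpy(fresh[i]->pte, tables[i]->pte, sizeof(fresh[i]->pte));
        fresh[i]->extended = 1;
        free(tables[i]);
        tables[i] = fresh[i];
    }
    free(fresh);
    return 0;
}
```

Converting table by table in place would instead leave a half-converted set behind when an allocation fails midway, which is the case that needs extra care.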

-- 
blue skies,
  Martin.

Reality continues to ruin my life. - Calvin.







Re: [kvm-devel] KVM: MMU: add KVM_ZAP_GFN ioctl

2008-03-23 Thread Marcelo Tosatti
On Fri, Mar 21, 2008 at 04:56:50PM +0100, Andrea Arcangeli wrote:
 On Fri, Mar 21, 2008 at 10:37:00AM -0300, Marcelo Tosatti wrote:
  This is not the final put_page().
  
  Remote TLB's are flushed here, after rmap_remove:
  
  +   if (nuked)
  +   kvm_flush_remote_tlbs(kvm);
  
  This ioctl is called before zap_page_range() is executed through
  sys_madvise(MADV_DONTNEED) to remove the page in question.
  
  We know that the guest will not attempt to fault in the gfn because
  the virtio balloon driver is synchronous (it will only attempt to
  release that page back to the guest OS once rmap_nuke+zap_page_range has
  finished).
  
  Can you be more verbose?
 
 Sure.
 
 1) even if you run madvise(MADV_DONTNEED) after KVM_ZAP_GFN, the anon
    page can be released by the VM at any time without any kvm-aware
    lock (there will be a swap reference to it, but no page_count
    references any more, leading to memory corruption in the host in
    the presence of memory pressure). This is purely theoretical of
    course; not sure if timings or probabilities allow reproducing
    this in real life.

If there are any active shadow mappings to a page there is a guarantee
that there is a valid linux pte mapping pointing at it. So page_count ==
1 + nr_sptes.

So the theoretical race you're talking about is:


CPU0                                            CPU1

spte = rmap_next(kvm, rmapp, NULL);
while (spte) {
        BUG_ON(!spte);
        BUG_ON(!(*spte & PT_PRESENT_MASK));
        rmap_printk("rmap_nuke: spte %p %llx\n", spte, *spte);
        rmap_remove(kvm, spte);
        set_shadow_pte(spte, shadow_trap_nonpresent_pte);
        nuked = 1;
        spte = rmap_next(kvm, rmapp, spte);
}
                                                try_to_unmap_one()
                                                page is now free
                                                page allocated for other
                                                purposes
if (nuked)
        kvm_flush_remote_tlbs(kvm);

And some other VCPU that still has the translation cached in its TLB
writes to the now freed (and possibly reallocated) page.

This case is safe because the path that frees a pte, and subsequently
a page, takes care of flushing the TLB of any remote CPUs that
possibly have it cached (before freeing the page, of course):
ptep_clear_flush() -> flush_tlb_page().

Am I missing something?
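
The ordering argument above can be written down as a toy model. This is
only a sketch of the reasoning in this mail, not kernel code: every
toy_* name is made up, and the model simply assumes (as the mail claims)
that the flush done while clearing the Linux pte covers every cached
translation. It does not settle whether that assumption holds on real
hardware.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the argument above; not kernel code. All names are made up. */
struct toy_page {
    int  nr_sptes;    /* shadow ptes that still map the page */
    bool linux_pte;   /* host (Linux) pte still maps the page */
    bool tlb_cached;  /* some CPU may still hold a cached translation */
    bool freed;       /* page went back to the allocator */
};

/* Reference count as described in the mail: 1 for the Linux pte + 1 per spte. */
static int toy_page_count(const struct toy_page *p)
{
    return (p->linux_pte ? 1 : 0) + p->nr_sptes;
}

/* CPU0, first half: drop every spte, but do not flush remote TLBs yet. */
static int toy_rmap_nuke(struct toy_page *p)
{
    int nuked = p->nr_sptes > 0;
    p->nr_sptes = 0;
    return nuked;
}

/* CPU0, second half: the deferred remote flush. */
static void toy_flush_remote_tlbs(struct toy_page *p)
{
    p->tlb_cached = false;
}

/* CPU1: clear the Linux pte, flush, then free if no references remain. */
static void toy_try_to_unmap_one(struct toy_page *p)
{
    p->linux_pte = false;
    p->tlb_cached = false;            /* assumption: models ptep_clear_flush() */
    if (toy_page_count(p) == 0) {
        assert(!p->tlb_cached);       /* never free with a stale translation */
        p->freed = true;
    }
}
```

Running toy_rmap_nuke(), then toy_try_to_unmap_one(), then the deferred
toy_flush_remote_tlbs() replays the interleaving drawn above: the page is
only freed after the flush modeled inside toy_try_to_unmap_one().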

 2) not sure what you mean with synchronous, do you mean single
    threaded? I can't see how it can be single threaded (does
    ballooning stop all other vcpus?).

No, I mean synchronous in the sense that no other vcpu will attempt to
fault in that _particular gfn_ between KVM_ZAP_GFN and madvise.

  Why are you taking the mmu_lock around rmap_nuke if no other vcpu
  can take any page fault and call into get_user_pages in between
  KVM_ZAP_GFN and madvise?

Other vcpu's can take page faults and call into get_user_pages, but not
for the gfn KVM_ZAP_GFN is operating on, because it has been allocated
by the balloon driver.

So we need mmu_lock to protect against concurrent shadow page and rmap
operations.

 As far as I can tell the only possible safe ordering is madvise;
 KVM_ZAP_GFN, which is emulating the mmu notifier behavior incidentally.
 
 Note that the rmap_remove smp race (also note here smp race means
 smp-host race, it will trigger even if guest is UP) might be a generic
 issue with the rmap_remove logic. I didn't analyze all the possible
 rmap_remove callers yet (this was in my todo list), I just made sure
 that my code would be smp safe.

As detailed above, we have a guarantee that there is a live linux pte
by the time rmap_remove() nukes a shadow pte.

  By the way, I don't see invalidate_begin/invalidate_end hooks in the KVM 
  part of MMU notifiers V9 patch? (meaning that zap_page_range will not zap
  the spte's for the pages in question).
 
 range_begin isn't needed. range_begin is needed only by secondary mmu
 drivers that aren't reference counting the pages. The _end callback is
 below. It could be improved to skip the whole range in a single browse
 of the memslots instead of browsing it for each page in the range. The
 mmu notifiers aren't merged and this code may still require changes in
 terms of API if EMM is merged instead of #v9 (hope not), so I tried to
 keep it simple.

Oh, I missed that. Nice.



Re: [kvm-devel] question: HPET for multiple VMs

2008-03-23 Thread Dor Laor

On Mon, 2008-03-24 at 00:32 +0900, Ryota OZAKI wrote:
 Hi Avi,
 
  If you use the dyntick clock option (the default, IIRC), and a newer
   host kernel, then the kernel provides high-resolution timers, very
   likely using HPET internally or some other high resolution clock and
   event source.
 
 I see. The dyntick clock seems to be more scalable than
 the others. I understood that '-clock hpet' is used for
 boosting one VM (because hpet gains the best performance
 on virtio), right?
 
 I would like to try dyntick for my multiple VMs environment.
 
  I think that for newer kernels we already have the desired accuracy.
 
 Yes. In recent versions of kvm, I didn't experience
 any time inaccuracy, although I had only tested with
 several VMs. I'll try a larger number of VMs, and
 if time inaccuracy occurs, I will report it.
 

The problem is not inaccuracy of the guest clock (which we do suffer from
in some guests, and there is work in progress to fix it). The problem is
that qemu_timer is not accurate, so the virtio tx timer fires too slowly,
which leads to suboptimal performance for virtio-net.

Try a host kernel >= 2.6.24 with dyntick.
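
A quick way to see how precise short timers are on a given host is to
measure how late a short sleep wakes up. The sketch below is only an
illustration using POSIX clock_gettime() and nanosleep(); it is not how
qemu_timer is implemented, and the helper names are made up. The request
must be below one second.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdint.h>
#include <time.h>

/* Monotonic clock reading in nanoseconds. */
static int64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Sleep for request_ns (< 1s) and return how many nanoseconds late we woke. */
static int64_t sleep_overshoot_ns(long request_ns)
{
    struct timespec req = { .tv_sec = 0, .tv_nsec = request_ns };
    struct timespec rem;
    int64_t start = now_ns();

    /* Restart the sleep with the remaining time if a signal interrupts it. */
    while (nanosleep(&req, &rem) == -1 && errno == EINTR)
        req = rem;

    return (now_ns() - start) - request_ns;
}

/* Worst overshoot seen across `rounds` sleeps of request_ns each. */
static int64_t worst_overshoot_ns(long request_ns, int rounds)
{
    int64_t worst = 0;
    for (int i = 0; i < rounds; i++) {
        int64_t late = sleep_overshoot_ns(request_ns);
        if (late > worst)
            worst = late;
    }
    return worst;
}
```

For example, worst_overshoot_ns(100000, 1000) asks for 0.1 ms sleeps and
reports the largest delay observed. A host without high-resolution timers
would be expected to show overshoots on the order of a scheduler tick.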

 Many thanks,
 ozaki-r
 
 2008/3/23, Avi Kivity [EMAIL PROTECTED]:
  Ryota OZAKI wrote:
Hi all,
   
Current kvm allows only one VM to use HPET. Is
there a plan to implement a functionality to
allow multiple VMs to use HPET? If so, how
about the status of that?
   
   
 
 
  If you use the dyntick clock option (the default, IIRC), and a newer
   host kernel, then the kernel provides high-resolution timers, very
   likely using HPET internally or some other high resolution clock and
   event source.
 
 
And I would like to ask right and wrong to
implement the functionality in terms of need
and efficiency (scalability and time accuracy).
 
 
  I think that for newer kernels we already have the desired accuracy.
   We're not always good at exploiting that accuracy; hence the recent
   movement of the PIT implementation from userspace to the kernel.  But
   recent discussion leads me to believe it could have been implemented
   with the userspace PIT as well.
 
 
   --
   Any sufficiently difficult bug is indistinguishable from a feature.
 
 
 


Re: [kvm-devel] [Qemu-devel] [PATCH] [RFC] Fix time drift of rtc clock + general support

2008-03-23 Thread Dor Laor

On Sun, 2008-03-23 at 16:19 +, Paul Brook wrote:
 On Sunday 23 March 2008, Dor Laor wrote:
  --- a/qemu/hw/irq.c
  +++ b/qemu/hw/irq.c
  @@ -30,6 +30,8 @@ struct IRQState {
   int n;
   };
   
  +uint32_t qemu_irq_acked[NR_IRQ_WORDS];
 
 This is absolute rubbish. The whole point of the IRQ framework is that it 
 doesn't assume a single flat IRQ controller.
 

Thanks for the compliments & the review ...
I specifically said that I'll move this variable into a per-cpu var.

Moreover, the translation from irq line to vector is handled by
'qemu_get_irq_vector', which calls 'irq_controller_get_vector'.
It works for the ioapic; I'm not sure yet whether it works for the flat
pic case.

Anyway you're welcome to drift without the patch or provide constructive
comments.

 Paul




Re: [kvm-devel] Qemu-kvm is leaking my memory ???

2008-03-23 Thread Zdenek Kabelac
2008/3/23, Avi Kivity [EMAIL PROTECTED]:
 Avi Kivity wrote:
  
   I see the same issue too now, and am investigating.
  


 The attached patch should fix the issue.  It is present in 2.6.25-rc6
  only, and not in kvm.git, which is why few people noticed it.


Hi

Tested - and I actually see no difference in my case of the memory leak.
It still looks like over 30M is lost per execution of qemu.
(tested with a fresh 2.6.25-rc6 with your patch)

Also, I would have said that previously my dmsetup status loop test case
was not causing big problems, and it was enough to run another
dmsetup to unblock the loop. Now it usually leads to some weird end
of qemu itself - I will explore more.

So it's probably fixing some bug - and exposing another.

As I said before - in my debugger it was looping in the page_fault handler -
i.e. memory should be paged in - but as soon as the handler returns to
the code to continue the memcopy, a new page_fault is invoked and the
pointer & counters are not changed.

Zdenek



Re: [kvm-devel] [Qemu-devel] [PATCH] [RFC] Fix time drift of rtc clock + general support

2008-03-23 Thread Paul Brook
On Sunday 23 March 2008, Dor Laor wrote:
 On Sun, 2008-03-23 at 16:19 +, Paul Brook wrote:
  On Sunday 23 March 2008, Dor Laor wrote:
   --- a/qemu/hw/irq.c
   +++ b/qemu/hw/irq.c
   @@ -30,6 +30,8 @@ struct IRQState {
int n;
};
  
   +uint32_t qemu_irq_acked[NR_IRQ_WORDS];
 
  This is absolute rubbish. The whole point of the IRQ framework is that it
  doesn't assume a single flat IRQ controller.

 Thanks for the compliments & the review ...
 I specifically said that I'll move this variable into a per-cpu var.

Per-cpu is no better.

 Moreover, the translation from irq line to vector is handled by
 'qemu_get_irq_vector', which calls 'irq_controller_get_vector'.
 It works for the ioapic; I'm not sure yet whether it works for the flat
 pic case.

Which shows you've completely missed the point.  irq->n is not a globally
unique identifier. It's a local per-controller index. qemu has targets with
multiple nested interrupt controllers; anything trying to maintain global or
per-cpu IRQ lists is fundamentally broken.
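
To make the collision concrete, here is a small sketch. The toy_* types
are hypothetical and are not qemu's definitions; they only show that an
ack bitmap indexed by n alone cannot tell apart line 3 of one controller
from line 3 of another, while state kept per controller can.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical types for illustration; these are not qemu's definitions. */
struct toy_irq_controller {
    const char *name;
    uint32_t acked;   /* one ack bit per local input line (0..31) */
};

/* A source is identified by (controller, local index), not by n alone. */
struct toy_irq {
    struct toy_irq_controller *ctl;
    int n;
};

static void toy_irq_ack(struct toy_irq irq)
{
    irq.ctl->acked |= UINT32_C(1) << irq.n;
}

static bool toy_irq_is_acked(struct toy_irq irq)
{
    return (irq.ctl->acked >> irq.n) & 1u;
}

/* Flat alternative: a single bitmap indexed by n, as a global array would be. */
static uint32_t toy_flat_acked;

static void toy_flat_ack(int n)
{
    toy_flat_acked |= UINT32_C(1) << n;
}

static bool toy_flat_is_acked(int n)
{
    return (toy_flat_acked >> n) & 1u;
}
```

With two controllers that both have a line 3, acking line 3 on the first
one leaves the second one untouched in the per-controller version, but
marks both as acked in the flat version.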

 Anyway you're welcome to drift without the patch or provide constructive
 comments.

Well, the patch doesn't even build on non-x86 targets.

  a new timer will be fired to try inject it again soon (==0.1msec)

If the guest is missing interrupts, the chances of a 0.1ms interval working
are not great.  Most likely it's either going to trigger immediately, or be
delayed significantly, and you're going to end up even further behind.

If triggering immediately is OK then why not do that all the time?
If triggering immediately is not acceptable then you're still going to lose
interrupts.
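
For comparison, one way to avoid picking a retry interval at all is to
count ticks that could not be delivered and hand the next one over as
soon as the guest acknowledges the previous one. The sketch below is an
assumption for illustration only: it is not what the patch does, and the
toy_* names are made up.

```c
#include <stdbool.h>

/* Hypothetical bookkeeping for one periodic timer; not taken from the patch. */
struct toy_tick_state {
    int  pending;     /* ticks raised but not yet delivered */
    bool in_service;  /* a delivered tick is waiting for the guest's ack */
    int  delivered;   /* total ticks the guest has been given */
};

/* Deliver one pending tick if the guest is able to take it. */
static void toy_try_deliver(struct toy_tick_state *s)
{
    if (s->pending > 0 && !s->in_service) {
        s->pending--;
        s->in_service = true;
        s->delivered++;
    }
}

/* Timer period elapsed: remember the tick instead of dropping it. */
static void toy_tick(struct toy_tick_state *s)
{
    s->pending++;
    toy_try_deliver(s);
}

/* Guest acknowledged the interrupt: the next missed tick can go in now. */
static void toy_ack(struct toy_tick_state *s)
{
    s->in_service = false;
    toy_try_deliver(s);
}
```

In this model three ticks with no ack in between leave one tick delivered
and two pending; each later ack delivers one more until the backlog is
gone. Whether a guest tolerates back-to-back delivery like this is
exactly the open question raised above.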

Paul



Re: [kvm-devel] [kvm-ppc-devel] [PATCH] Move kvm_get_pit to libkvm.c common code

2008-03-23 Thread Zhang, Xiantao
Avi Kivity wrote:
 Hollis Blanchard wrote:
 
 Don't compile kvm_*_pit() on architectures whose currently supported
 platforms do not contain a PIT.
 
 Signed-off-by: Hollis Blanchard [EMAIL PROTECTED]
 
 diff --git a/libkvm/libkvm.h b/libkvm/libkvm.h
 --- a/libkvm/libkvm.h
 +++ b/libkvm/libkvm.h
 @@ -539,6 +539,7 @@ int kvm_pit_in_kernel(kvm_context_t kvm)
 
  #ifdef KVM_CAP_PIT
 
 +#if defined(__i386__) || defined(__x86_64__) || defined(__ia64__) 
 /*! 
   * \brief Get in kernel PIT of the virtual domain
   *
 @@ -562,6 +563,8 @@ int kvm_set_pit(kvm_context_t kvm, struc
 
  #endif
 
 +#endif
 +
  #ifdef KVM_CAP_VAPIC
 
 ia64 doesn't have an in-kernel pit? (yet?)

IA64 doesn't have a PIT on the platform.
Xiantao
