Re: [kvm-devel] Widescreen troubles again

2008-02-20 Thread Arne Brutschy

On Tue, 2008-02-19 at 16:21 +0000, Andreas Winkelbauer wrote:
 as far as I have seen the 'special' modes have a preprocessor constant defined
 in vgabios/vbe.h which looks like
 #define VBE_OWN_MODE_1152X864X 0x14c
 
 the numbers (0x14c in this case) correspond to those used in vbetables-gen.c.
 I don't know if and where these definitions are used
Yes, I found that too, but I couldn't find any references to it. I added my
mode anyway, which didn't change anything.

 in fact all the non-standard (widescreen) modes defined in vbetables-gen.c
 work for me _except_ 1680x1050. I've tested this with kvm-60, kvm-61 and the
 latest snapshot (as of writing this). this really looks like a limitation
 somewhere, but at the moment I have no clue who could be responsible for that
 limitation.
This is very strange... For me only some of the additional modes work
and are reported by the driver correctly, but I couldn't find the
criterion that makes the difference.

 well, the amount of memory (in bytes) needed (for the framebuffer) is: width x
 height x (color depth in bits / 8). in your example: 1920 x 1200 x 4 = 9216000
 bytes which is greater than 8MB, so you might want to try using 16MB of video
 memory.
I did already. Increasing the value in vbetables.h actually did
change the amount of VRAM reported by the Windows driver, but apart
from that nothing changed.
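
For reference, the framebuffer arithmetic quoted above can be written down as a small C sketch. The function names are made up for illustration and are not taken from vgabios; the only thing assumed is the formula width x height x (bits per pixel / 8).

```c
#include <stdbool.h>
#include <stddef.h>

/* Bytes needed for one frame: width x height x (bits per pixel / 8). */
size_t framebuffer_bytes(size_t width, size_t height, size_t bits_per_pixel)
{
    return width * height * (bits_per_pixel / 8);
}

/* True if one frame of the given mode fits into vram_mib MiB of video memory. */
bool mode_fits(size_t width, size_t height, size_t bits_per_pixel,
               size_t vram_mib)
{
    return framebuffer_bytes(width, height, bits_per_pixel)
           <= vram_mib * 1024 * 1024;
}
```

With these helpers, 1920x1200 at 32 bits per pixel needs 9216000 bytes, which does not fit in 8MB but does fit in 16MB. By the same formula 1680x1050 at 32 bits per pixel needs 7056000 bytes, which is below 8MB, so a plain size check would not reject that mode.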

Regards,
Arne



-
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2008.
http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


[kvm-devel] [PATCH] qemu: implicit precedence for logical operator in has_work

2008-02-20 Thread Carlo Marcelo Arenas Belon
janitorial fix for :

  qemu/qemu-kvm.c: In function `has_work':
  qemu/qemu-kvm.c:140: warning: suggest parentheses around && within ||

Signed-off-by: Carlo Marcelo Arenas Belon [EMAIL PROTECTED]
---
 qemu/qemu-kvm.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/qemu/qemu-kvm.c b/qemu/qemu-kvm.c
index ffc59d5..4056453 100644
--- a/qemu/qemu-kvm.c
+++ b/qemu/qemu-kvm.c
@@ -137,7 +137,7 @@ extern int vm_running;
 
 static int has_work(CPUState *env)
 {
-    if (!vm_running || env && vcpu_info[env->cpu_index].stopped)
+    if (!vm_running || (env && vcpu_info[env->cpu_index].stopped))
         return 0;
     if (!(env->hflags & HF_HALTED_MASK))
         return 1;
-- 
1.5.3.7
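
As a side note on why this change is cosmetic: in C, && binds tighter than ||, so the unparenthesized condition already groups the way the patch spells it out. The sketch below uses made-up function names, not anything from qemu-kvm.c, and contrasts the two possible groupings.

```c
#include <stdbool.h>

/* How C parses `a || b && c`: && binds tighter than ||. */
bool and_binds_tighter(bool a, bool b, bool c)
{
    return a || (b && c);
}

/* The other conceivable reading, which C does NOT use. */
bool or_binds_tighter(bool a, bool b, bool c)
{
    return (a || b) && c;
}
```

The two groupings disagree, for example, when a is true and b and c are false: the first yields true, the second false. That ambiguity for the human reader is what the compiler warning is about.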




Re: [kvm-devel] [PATCH] qemu: implicit precedence for logical operator in has_work

2008-02-20 Thread Alexander Graf

On Feb 20, 2008, at 9:30 AM, Carlo Marcelo Arenas Belon wrote:

 janitorial fix for :

  qemu/qemu-kvm.c: In function `has_work':
  qemu/qemu-kvm.c:140: warning: suggest parentheses around && within ||

 Signed-off-by: Carlo Marcelo Arenas Belon [EMAIL PROTECTED]
 ---
 qemu/qemu-kvm.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

 diff --git a/qemu/qemu-kvm.c b/qemu/qemu-kvm.c
 index ffc59d5..4056453 100644
 --- a/qemu/qemu-kvm.c
 +++ b/qemu/qemu-kvm.c
 @@ -137,7 +137,7 @@ extern int vm_running;

 static int has_work(CPUState *env)
 {
 -    if (!vm_running || env && vcpu_info[env->cpu_index].stopped)
 +    if (!vm_running || (env && vcpu_info[env->cpu_index].stopped))

What exactly is the env check needed for here?


   return 0;
      if (!(env->hflags & HF_HALTED_MASK))

It won't be done here anyway...


   return 1;
 -- 
 1.5.3.7




[kvm-devel] New and free to use ....

2008-02-20 Thread martin geier
Hello,
A new Internet platform, by users for users, offers everything for free!!!
Sign up and use everything for free at www.prototo.com ...
Have fun, Martin


Re: [kvm-devel] [patch 5/6] mmu_notifier: Support for drivers with revers maps (f.e. for XPmem)

2008-02-20 Thread Robin Holt
On Wed, Feb 20, 2008 at 02:51:45PM +1100, Nick Piggin wrote:
 On Wednesday 20 February 2008 14:12, Robin Holt wrote:
  For XPMEM, we do not currently allow file backed
  mapping pages from being exported so we should never reach this condition.
  It has been an issue since day 1.  We have operated with that assumption
  for 6 years and have not had issues with that assumption.  The user of
  xpmem is MPT and it controls the communication buffers so it is reasonable
  to expect this type of behavior.
 
 OK, that makes things simpler.
 
 So why can't you export a device from your xpmem driver, which
 can be mmap()ed to give out anonymous memory pages to be used
 for these communication buffers?

Because we need to have heap and stack available as well.  MPT does
not control all the communication buffer areas.  I haven't checked, but
this is the same problem that IB will have.  I believe they are actually
allowing any memory region be accessible, but I am not sure of that.

Thanks,
Robin



Re: [kvm-devel] kvm-60: kexec in guest crashes the host

2008-02-20 Thread Avi Kivity
Dan Aloni wrote:
 It happens at 100% of the times I invoke kexec.

   

Can you provide a commandline which triggers this? I'm completely 
ignorant wrt kexec.

-- 
Any sufficiently difficult bug is indistinguishable from a feature.




Re: [kvm-devel] [patch 5/6] mmu_notifier: Support for drivers with revers maps (f.e. for XPmem)

2008-02-20 Thread Robin Holt
On Wed, Feb 20, 2008 at 03:00:36AM -0600, Robin Holt wrote:
 On Wed, Feb 20, 2008 at 02:51:45PM +1100, Nick Piggin wrote:
  On Wednesday 20 February 2008 14:12, Robin Holt wrote:
   For XPMEM, we do not currently allow file backed
   mapping pages from being exported so we should never reach this condition.
   It has been an issue since day 1.  We have operated with that assumption
   for 6 years and have not had issues with that assumption.  The user of
   xpmem is MPT and it controls the communication buffers so it is reasonable
   to expect this type of behavior.
  
  OK, that makes things simpler.
  
  So why can't you export a device from your xpmem driver, which
  can be mmap()ed to give out anonymous memory pages to be used
  for these communication buffers?
 
 Because we need to have heap and stack available as well.  MPT does
 not control all the communication buffer areas.  I haven't checked, but
 this is the same problem that IB will have.  I believe they are actually
 allowing any memory region be accessible, but I am not sure of that.

I should have read my work email first.  I had gotten an email from
one of our MPT developers saying they would love it if they could share
file backed memory areas as well as it would help them with their MPI-IO
functions which currently need to do multiple copy steps.  Not sure how
high of a priority I am going to be able to make that.


Thanks,
Robin



Re: [kvm-devel] KVM Testing Result for KVM-61

2008-02-20 Thread Izik Eidus

On Wed, 2008-02-20 at 15:09 +0800, Zhao, Yunfeng wrote:
 Hi,All
 This is testing result for KVM-61. 
 No new issue has been found in the testing.
 
 Five old issues:
 1. Fails to save/restore guests 
 Save/restore may cause host to hang.
 https://sourceforge.net/tracker/index.php?func=detail&aid=1824525&group_id=180599&atid=893831

savevm/loadvm does not work, but it doesn't crash my host.
What do you see in dmesg? (The bug with the bad page was fixed, no?)

 2. smp windows installer crashes while rebooting
 https://sourceforge.net/tracker/index.php?func=detail&aid=1877875&group_id=180599&atid=893831
 3. Timer of guest is inaccurate
 https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1826080&group_id=180599
 4. Installer of 64bit vista guest will pause for ten minutes after reboot
 https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1836905&group_id=180599
 5. Cannot boot 32bit smp RHEL5.1 guest with nic on 64bit host
 https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1812043&group_id=180599
 
 Test environment
  
 
 Platform     Woodcrest
 CPU          4
 Memory size  8G
  
 
 Details
 
 
 IA32-pae: 
 
 1. boot guest with 256M memory                PASS
 2. boot two windows xp guest                  PASS
 3. boot 4 same guest in parallel              PASS
 4. boot linux and windows guest in parallel   PASS
 5. boot guest with 1500M memory               PASS
 6. boot windows 2003 with ACPI enabled        PASS
 7. boot Windows xp with ACPI enabled          PASS
 8. boot Windows 2000 without ACPI             PASS
 9. kernel build on SMP linux guest            PASS
 10. LTP on SMP linux guest                    PASS
 11. boot base kernel linux                    PASS
 12. save/restore 32-bit HVM guests            PASS
 13. live migration 32-bit HVM guests          PASS
 14. boot SMP Windows xp with ACPI enabled     PASS
 15. boot SMP Windows 2003 with ACPI enabled   PASS
 16. boot SMP Windows 2000 with ACPI enabled   PASS
  
  
 
 IA32e: 
 
 1. boot four 32-bit guest in parallel                       PASS
 2. boot four 64-bit guest in parallel                       PASS
 3. boot 4G 64-bit guest                                     PASS
 4. boot 4G pae guest                                        PASS
 5. boot 32-bit linux and 32 bit windows guest in parallel   PASS
 6. boot 32-bit guest with 1500M memory                      PASS
 7. boot 64-bit guest with 1500M memory                      PASS
 8. boot 32-bit guest with 256M memory                       PASS
 9. boot 64-bit guest with 256M memory                       PASS
 10. boot two 32-bit windows xp in parallel                  PASS
 11. boot four 32-bit different guest in parallel            PASS
 12. save/restore 64-bit linux guests                        FAIL
 13. save/restore 32-bit linux guests                        PASS
 14. boot 32-bit SMP windows 2003 with ACPI enabled          PASS
 15. boot 32-bit SMP Windows 2000 with ACPI enabled          PASS
 16. boot 32-bit SMP Windows xp with ACPI enabled            PASS
 17. boot 32-bit Windows 2000 without ACPI                   PASS
 18. boot 64-bit Windows xp with ACPI enabled                PASS
 19. boot 32-bit Windows xp without ACPI                     PASS
 20. boot 64-bit vista                                       PASS
 21. kernel build in 32-bit linux guest OS                   PASS
 22. kernel build in 64-bit linux guest OS                   PASS
 23. LTP on SMP 32-bit linux guest OS                        PASS
 24. LTP on SMP 64-bit linux guest OS                        PASS
 25. boot 64-bit guests with ACPI enabled                    PASS
 26. boot 32-bit x-server                                    PASS
 27. boot 64-bit SMP windows XP with ACPI enabled            PASS
 28. boot 64-bit SMP windows 2003 with ACPI enabled          PASS
 29. live migration 64bit linux guests                       PASS
 30. live migration 32bit linux guests                       PASS
 
 
 Report Summary on IA32-pae
  
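 Summary Test Report of Last Session
 =====================================================================
                             Total   Pass   Fail   NoResult   Crash
 =====================================================================
 control_panel               6       5      1      0          0
 Restart                     2       2      0      0          0
 gtest                       14      13     1      0          0
 =====================================================================
 control_panel               6       5      1      0          0
  :KVM_LM_PAE_gPAE           1       0      1      0          0
  :KVM_four_sguest_PAE_gPA   1       1      0      0          0
  :KVM_256M_guest_PAE_gPAE   1       1      0      0          0
  :KVM_linux_win_PAE_gPAE    1       1      0      0          0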
  

Re: [kvm-devel] Out-of-box kvm-61 driver crash, first kvm problem ever, boo hoo...

2008-02-20 Thread Avi Kivity

Avi Kivity wrote:
 [EMAIL PROTECTED] wrote:
  EIP: [f8b8dcd2] vmx_set_efer+0xa2/0xb0 [kvm_intel] SS:ESP 0068:f4a63ed4

 Not completely unexpected. You are running a Core (not 2) processor
 which doesn't support nx or x86_64, so it doesn't have the EFER
 register. kvm-61 adds support for the EFER on i386, but apparently
 doesn't handle those old cpus well.

 I'll try to get a patch for you to test.

Attached.

--
Any sufficiently difficult bug is indistinguishable from a feature.

diff --git a/kernel/vmx.c b/kernel/vmx.c
index e75b2f5..a575e54 100644
--- a/kernel/vmx.c
+++ b/kernel/vmx.c
@@ -1342,6 +1342,8 @@ static void vmx_set_efer(struct kvm_vcpu *vcpu, u64 efer)
 	struct kvm_msr_entry *msr = find_msr_entry(vmx, MSR_EFER);
 
 	vcpu->arch.shadow_efer = efer;
+	if (!msr)
+		return;
 	if (efer & EFER_LMA) {
 		vmcs_write32(VM_ENTRY_CONTROLS,
 				     vmcs_read32(VM_ENTRY_CONTROLS) |


Re: [kvm-devel] Out-of-box kvm-61 driver crash, first kvm problem ever, boo hoo...

2008-02-20 Thread Avi Kivity
[EMAIL PROTECTED] wrote:
 EIP: [f8b8dcd2] vmx_set_efer+0xa2/0xb0 [kvm_intel] SS:ESP 0068:f4a63ed4

   

Not completely unexpected. You are running a Core (not 2) processor 
which doesn't support nx or x86_64, so it doesn't have the EFER 
register. kvm-61 adds support for the EFER on i386, but apparently 
doesn't handle those old cpus well.

I'll try to get a patch for you to test.
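
To illustrate the failure mode in isolation, here is a minimal userspace sketch, assuming a per-vcpu table of MSR entries and a lookup that returns NULL when the CPU has no such MSR. All demo_* names are hypothetical and are not the kvm code; only the MSR index 0xc0000080 for EFER is architectural.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_MSR_EFER 0xc0000080u   /* architectural index of the EFER MSR */

struct demo_msr_entry {
    uint32_t index;
    uint64_t data;
};

struct demo_vcpu {
    struct demo_msr_entry msrs[4];
    size_t nmsrs;            /* number of valid entries in msrs[] */
    uint64_t shadow_efer;
};

/* Returns NULL when the (hypothetical) CPU exposes no such MSR. */
struct demo_msr_entry *demo_find_msr(struct demo_vcpu *v, uint32_t index)
{
    for (size_t i = 0; i < v->nmsrs; i++)
        if (v->msrs[i].index == index)
            return &v->msrs[i];
    return NULL;
}

/* Returns 1 if the MSR slot was updated, 0 if only the shadow copy was. */
int demo_set_efer(struct demo_vcpu *v, uint64_t efer)
{
    struct demo_msr_entry *msr = demo_find_msr(v, DEMO_MSR_EFER);

    v->shadow_efer = efer;
    if (!msr)
        return 0;            /* no EFER on this CPU: must not dereference */
    msr->data = efer;
    return 1;
}
```

Without the NULL check, the write through msr would fault on a CPU whose table has no EFER entry, which is the kind of crash a missing guard in a setter like this would produce.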

-- 
Any sufficiently difficult bug is indistinguishable from a feature.




[kvm-devel] [PATCH] mmu notifiers #v6

2008-02-20 Thread Andrea Arcangeli
Given Nick's comments I ported my version of the mmu notifiers to
latest mainline. There are no known bugs AFAIK and it's obviously safe
(nothing is allowed to schedule inside rcu_read_lock taken by
mmu_notifier() with my patch).

XPMEM simply can't use RCU for the registration locking if it wants to
schedule inside the mmu notifier calls. So I guess it's better to add
the XPMEM invalidate_range_end/begin/external-rmap as a whole
different subsystem that will have to use a mutex (not RCU) to
serialize, and at the same time that CONFIG_XPMEM will also have to
switch the i_mmap_lock to a mutex. I doubt xpmem fits inside a
CONFIG_MMU_NOTIFIER anymore, or we'll all run a bit slower because of
it. It's really a call of how much we want to optimize the MMU
notifier, by keeping things like RCU for the registration.

Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED]

diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -46,6 +46,7 @@
__young = ptep_test_and_clear_young(__vma, __address, __ptep);  \
if (__young)\
flush_tlb_page(__vma, __address);   \
+   __young |= mmu_notifier_age_page((__vma)->vm_mm, __address);\
__young;\
 })
 #endif
@@ -86,6 +87,7 @@ do {  \
pte_t __pte;\
__pte = ptep_get_and_clear((__vma)->vm_mm, __address, __ptep);  \
flush_tlb_page(__vma, __address);   \
+   mmu_notifier(invalidate_page, (__vma)->vm_mm, __address);   \
__pte;  \
 })
 #endif
diff --git a/include/asm-s390/pgtable.h b/include/asm-s390/pgtable.h
--- a/include/asm-s390/pgtable.h
+++ b/include/asm-s390/pgtable.h
@@ -735,6 +735,7 @@ static inline pte_t ptep_clear_flush(str
 {
pte_t pte = *ptep;
ptep_invalidate(vma->vm_mm, address, ptep);
+   mmu_notifier(invalidate_page, vma->vm_mm, address);
return pte;
 }
 
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -10,6 +10,7 @@
 #include <linux/rbtree.h>
 #include <linux/rwsem.h>
 #include <linux/completion.h>
+#include <linux/mmu_notifier.h>
 #include <asm/page.h>
 #include <asm/mmu.h>
 
@@ -228,6 +229,8 @@ struct mm_struct {
 #ifdef CONFIG_CGROUP_MEM_CONT
struct mem_cgroup *mem_cgroup;
 #endif
+
+   struct mmu_notifier_head mmu_notifier; /* MMU notifier list */
 };
 
 #endif /* _LINUX_MM_TYPES_H */
diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
new file mode 100644
--- /dev/null
+++ b/include/linux/mmu_notifier.h
@@ -0,0 +1,132 @@
+#ifndef _LINUX_MMU_NOTIFIER_H
+#define _LINUX_MMU_NOTIFIER_H
+
+#include <linux/list.h>
+#include <linux/spinlock.h>
+
+struct mmu_notifier;
+
+struct mmu_notifier_ops {
+   /*
+* Called when nobody can register any more notifier in the mm
+* and after the mn notifier has been disarmed already.
+*/
+   void (*release)(struct mmu_notifier *mn,
+   struct mm_struct *mm);
+
+   /*
+* invalidate_page[s] is called in atomic context
+* after any pte has been updated and before
+* dropping the PT lock required to update any Linux pte.
+* Once the PT lock will be released the pte will have its
+* final value to export through the secondary MMU.
+* Before this is invoked any secondary MMU is still ok
+* to read/write to the page previously pointed by the
+* Linux pte because the old page hasn't been freed yet.
+* If required set_page_dirty has to be called internally
+* to this method.
+*/
+   void (*invalidate_page)(struct mmu_notifier *mn,
+   struct mm_struct *mm,
+   unsigned long address);
+   void (*invalidate_pages)(struct mmu_notifier *mn,
+struct mm_struct *mm,
+unsigned long start, unsigned long end);
+
+   /*
+* Age page is called in atomic context inside the PT lock
+* right after the VM is test-and-clearing the young/accessed
+* bitflag in the pte. This way the VM will provide proper aging
+* to the accesses to the page through the secondary MMUs
+* and not only to the ones through the Linux pte.
+*/
+   int (*age_page)(struct mmu_notifier *mn,
+   struct mm_struct *mm,
+   unsigned long address);
+};
+
+struct mmu_notifier {
+   struct hlist_node hlist;
+   const struct mmu_notifier_ops *ops;
+};
+
+#ifdef CONFIG_MMU_NOTIFIER
+
+struct mmu_notifier_head {
+ 

Re: [kvm-devel] KVM Testing Result for KVM-61

2008-02-20 Thread Izik Eidus

On Wed, 2008-02-20 at 12:29 +0200, Dor Laor wrote:
 On Wed, 2008-02-20 at 11:41 +0200, Izik Eidus wrote:
  On Wed, 2008-02-20 at 15:09 +0800, Zhao, Yunfeng wrote:
   Hi,All
   This is testing result for KVM-61. 
   No new issue has been found in the testing.
   
   Five old issues:
   1. Fails to save/restore guests 
   Save/restore may cause host to hang.
   https://sourceforge.net/tracker/index.php?func=detail&aid=1824525&group_id=180599&atid=893831
  
  savevm/loadvm does not work, but it doesn't crash my host.
  What do you see in dmesg? (The bug with the bad page was fixed, no?)
  
 
 I know it happens for win2k guest (repeatably)

I believe the win2k problem is different from what he is experiencing.

 

Re: [kvm-devel] kvm-60: kexec in guest crashes the host

2008-02-20 Thread Dan Aloni
On Wed, Feb 20, 2008 at 11:09:44AM +0200, Avi Kivity wrote:
 Dan Aloni wrote:
  It happens at 100% of the times I invoke kexec.
 

 
 Can you provide a commandline which triggers this? I'm completely 
 ignorant wrt kexec.

I managed to verify that this problem can be reproduced with the 
2.6.16.60 tree.

Also, it's worth noting that with '-no-kvm' the kexec procedure works
successfully and the second kernel executes.

Please use the .config that I attached to this mail, and also apply
the patch I supplied (it fixes a build problem that 2.6.16 has with the
newer binutils versions and x86_64). I use gcc-4.1.2 to build the kernel.

Once you have the bzImage of that guest kernel, use a root filesystem 
and boot it straight into /bin/bash.

Now, assuming that your guest rootfs has kexec-utils package installed, 
do the following:

  mount -t proc proc /proc
  kexec -l bzImage --command-line='ro root=/dev/hda1 init=/bin/bash'
  kexec -e

BTW, if you use the serial console with the '-nographic' switch, then
you might want to use kexec a little differently:

  kexec -l bzImage --command-line='ro root=/dev/hda1 init=/bin/bash console=ttyS0,115200' --serial=ttyS0 --serial-baud=115200

Good luck,

-- 
Dan Aloni
XIV, an IBM (R) company. http://www.xivstorage.com
da-x (at) monatomic.org, dan (at) xiv.co.il


2.6.16.60-guest-config.gz
Description: Binary data
diff --git a/arch/x86_64/boot/compressed/Makefile b/arch/x86_64/boot/compressed/Makefile
index f89d96f..c9688fb 100644
--- a/arch/x86_64/boot/compressed/Makefile
+++ b/arch/x86_64/boot/compressed/Makefile
@@ -12,6 +12,7 @@ EXTRA_AFLAGS	:= -traditional -m32
 # cannot use EXTRA_CFLAGS because base CFLAGS contains -mkernel which conflicts with
 # -m32
 CFLAGS := -m32 -D__KERNEL__ -Iinclude -O2  -fno-strict-aliasing
+AFLAGS  := $(CFLAGS) -D__ASSEMBLY__
 LDFLAGS := -m elf_i386
 
 LDFLAGS_vmlinux := -Ttext $(IMAGE_OFFSET) -e startup_32 -m elf_i386
diff --git a/arch/x86_64/boot/tools/build.c b/arch/x86_64/boot/tools/build.c
index c44f5e2..bcdbaf1 100644
--- a/arch/x86_64/boot/tools/build.c
+++ b/arch/x86_64/boot/tools/build.c
@@ -149,8 +149,8 @@ int main(int argc, char ** argv)
 	sz = sb.st_size;
 	fprintf (stderr, "System is %d kB\n", sz/1024);
 	sys_size = (sz + 15) / 16;
-	/* 0x40000*16 = 4.0 MB, reasonable estimate for the current maximum */
-	if (sys_size > (is_big_kernel ? 0x40000 : DEF_SYSSIZE))
+	/* 0x60000*16 = 6.0 MB, reasonable estimate for the current maximum */
+	if (sys_size > (is_big_kernel ? 0x60000 : DEF_SYSSIZE))
 		die("System is too big. Try using %smodules.",
 			is_big_kernel ? "" : "bzImage or ");
 	while (sz > 0) {


[kvm-devel] mmdrop external module oops

2008-02-20 Thread Andrea Arcangeli
A 2.6.25-rc based kernel spawned an oops in mmdrop when kvm quit so
that reminded me of this:

Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED]

diff --git a/kernel/external-module-compat.h b/kernel/external-module-compat.h
index 20ef841..fd3cb1d 100644
--- a/kernel/external-module-compat.h
+++ b/kernel/external-module-compat.h
@@ -564,6 +564,11 @@ static inline void blahblah(void)
 #if LINUX_VERSION_CODE < KERNEL_VERSION(2,6,25)
 
 #define mmdrop(x) do { (void)(x); } while (0)
+#define mmget(x) do { (void)(x); } while (0)
+
+#else
+
+#define mmget(x) do { atomic_inc(x); } while (0)
 
 #endif
 
diff --git a/kernel/hack-module.awk b/kernel/hack-module.awk
index ad7a7c5..404944e 100644
--- a/kernel/hack-module.awk
+++ b/kernel/hack-module.awk
@@ -33,7 +33,7 @@
 vmx_load_host_state = 0
 }
 
-/atomic_inc\(&kvm->mm->mm_count\);/ { $0 = "//" $0 }
+/atomic_inc\(&kvm->mm->mm_count\);/ { $0 = "mmget(&kvm->mm->mm_count);" }
 
 /^\t\.fault = / {
 fcn = gensub(/,/, "", "g", $3)




[kvm-devel] [PATCH] KVM swapping (+ seqlock fix) with mmu notifiers #v6

2008-02-20 Thread Andrea Arcangeli
This is the same as before but against the mmu notifier #v6 patch,
running on top of the latest 2.6.25-rc, and in this last update I fixed
the last race condition with a seqlock. I described the exact fix in an
earlier email; in short, the seqlock write is in
invalidate_page/pages, and the reader will re-issue gfn_to_page if it
finds a seqlock read failure (see the change to paging_tmpl.h). With
this on top of mmu notifier #v6 there are no more known practical or
theoretical problems, neither in the kvm swapping nor in the mmu
notifier patch (which also supports all sleeping users, not just KVM,
without requiring a page pin).
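
The retry scheme can be sketched in userspace C11 roughly as follows. This is only an illustration of the seqlock idea under my own assumptions: struct seq_counter and every function name below are invented for the example and are not the kernel's seqlock_t API or the code in the patch.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a seqlock: the counter is odd while a write
 * (here: an invalidate) is in progress. */
struct seq_counter {
    atomic_uint sequence;
};

void seq_write_begin(struct seq_counter *s)
{
    atomic_fetch_add(&s->sequence, 1);   /* becomes odd */
}

void seq_write_end(struct seq_counter *s)
{
    atomic_fetch_add(&s->sequence, 1);   /* becomes even again */
}

/* Wait for any writer to finish, then return the sequence seen. */
unsigned seq_read_begin(struct seq_counter *s)
{
    unsigned v;

    while ((v = atomic_load(&s->sequence)) & 1u)
        ;                                /* writer active: spin */
    return v;
}

/* True if a write happened since seq_read_begin() returned start. */
bool seq_read_retry(struct seq_counter *s, unsigned start)
{
    return atomic_load(&s->sequence) != start;
}

/* Writer side: plays the role of invalidate_page changing a translation. */
void demo_invalidate(struct seq_counter *s, atomic_long *slot, long value)
{
    seq_write_begin(s);
    atomic_store(slot, value);
    seq_write_end(s);
}

/* Reader side: redo the lookup until no invalidate raced with it. */
long demo_lookup(struct seq_counter *s, atomic_long *slot)
{
    unsigned start;
    long value;

    do {
        start = seq_read_begin(s);
        value = atomic_load(slot);
    } while (seq_read_retry(s, start));
    return value;
}
```

The point of the pattern is that the reader never blocks the writer: it simply notices afterwards that an invalidate ran in between and repeats the lookup, which corresponds to re-issuing gfn_to_page in the description above.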

Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED]

diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index 41962e7..e1287ab 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -21,6 +21,7 @@ config KVM
tristate "Kernel-based Virtual Machine (KVM) support"
depends on HAVE_KVM && EXPERIMENTAL
select PREEMPT_NOTIFIERS
+   select MMU_NOTIFIER
select ANON_INODES
---help---
  Support hosting fully virtualized guest machines using hardware
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 6656efa..9151d64 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -533,6 +533,110 @@ static void rmap_write_protect(struct kvm *kvm, u64 gfn)
kvm_flush_remote_tlbs(kvm);
 }
 
+static void kvm_unmap_spte(struct kvm *kvm, u64 *spte)
+{
+   struct page *page = pfn_to_page((*spte & PT64_BASE_ADDR_MASK) >> PAGE_SHIFT);
+   get_page(page);
+   rmap_remove(kvm, spte);
+   set_shadow_pte(spte, shadow_trap_nonpresent_pte);
+   kvm_flush_remote_tlbs(kvm);
+   __free_page(page);
+}
+
+static void kvm_unmap_rmapp(struct kvm *kvm, unsigned long *rmapp)
+{
+   u64 *spte, *curr_spte;
+
+   spte = rmap_next(kvm, rmapp, NULL);
+   while (spte) {
+   BUG_ON(!(*spte & PT_PRESENT_MASK));
+   rmap_printk("kvm_rmap_unmap_hva: spte %p %llx\n", spte, *spte);
+   curr_spte = spte;
+   spte = rmap_next(kvm, rmapp, spte);
+   kvm_unmap_spte(kvm, curr_spte);
+   }
+}
+
+void kvm_unmap_hva(struct kvm *kvm, unsigned long hva)
+{
+   int i;
+
+   /*
+* If mmap_sem isn't taken, we can look the memslots with only
+* the mmu_lock by skipping over the slots with userspace_addr == 0.
+*/
+   spin_lock(&kvm->mmu_lock);
+   for (i = 0; i < kvm->nmemslots; i++) {
+   struct kvm_memory_slot *memslot = &kvm->memslots[i];
+   unsigned long start = memslot->userspace_addr;
+   unsigned long end;
+
+   /* mmu_lock protects userspace_addr */
+   if (!start)
+   continue;
+
+   end = start + (memslot->npages << PAGE_SHIFT);
+   if (hva >= start && hva < end) {
+   gfn_t gfn_offset = (hva - start) >> PAGE_SHIFT;
+   kvm_unmap_rmapp(kvm, &memslot->rmap[gfn_offset]);
+   }
+   }
+   spin_unlock(&kvm->mmu_lock);
+}
+
+static int kvm_age_rmapp(struct kvm *kvm, unsigned long *rmapp)
+{
+   u64 *spte;
+   int young = 0;
+
+   spte = rmap_next(kvm, rmapp, NULL);
+   while (spte) {
+   int _young;
+   u64 _spte = *spte;
+   BUG_ON(!(_spte & PT_PRESENT_MASK));
+   _young = _spte & PT_ACCESSED_MASK;
+   if (_young) {
+   young = !!_young;
+   set_shadow_pte(spte, _spte & ~PT_ACCESSED_MASK);
+   }
+   spte = rmap_next(kvm, rmapp, spte);
+   }
+   return young;
+}
+
+int kvm_age_hva(struct kvm *kvm, unsigned long hva)
+{
+   int i;
+   int young = 0;
+
+   /*
+* If mmap_sem isn't taken, we can look the memslots with only
+* the mmu_lock by skipping over the slots with userspace_addr == 0.
+*/
+   spin_lock(&kvm->mmu_lock);
+   for (i = 0; i < kvm->nmemslots; i++) {
+   struct kvm_memory_slot *memslot = &kvm->memslots[i];
+   unsigned long start = memslot->userspace_addr;
+   unsigned long end;
+
+   /* mmu_lock protects userspace_addr */
+   if (!start)
+   continue;
+
+   end = start + (memslot->npages << PAGE_SHIFT);
+   if (hva >= start && hva < end) {
+   gfn_t gfn_offset = (hva - start) >> PAGE_SHIFT;
+   young |= kvm_age_rmapp(kvm, &memslot->rmap[gfn_offset]);
+   }
+   }
+   spin_unlock(&kvm->mmu_lock);
+
+   if (young)
+   kvm_flush_remote_tlbs(kvm);
+
+   return young;
+}
+
 #ifdef MMU_DEBUG
 static int is_empty_shadow_page(u64 *spt)
 {
diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index cdafce3..6d09d13 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -370,6 +370,7 @@ static int 

Re: [kvm-devel] KVM Testing Result for KVM-61

2008-02-20 Thread Dor Laor

On Wed, 2008-02-20 at 11:41 +0200, Izik Eidus wrote:
 On Wed, 2008-02-20 at 15:09 +0800, Zhao, Yunfeng wrote:
  Hi,All
  This is testing result for KVM-61. 
  No new issue has been found in the testing.
  
  Five old issues:
  1. Fails to save/restore guests 
  Save/restore may cause host to hang.
  https://sourceforge.net/tracker/index.php?func=detail&aid=1824525&group_id=180599&atid=893831
 
 savevm/loadvm does not work, but it doesn't crash my host.
 What do you see in dmesg? (The bug with the bad page was fixed, no?)
 

I know it happens for win2k guest (repeatably)

  2. smp windows installer crashes while rebooting
  https://sourceforge.net/tracker/index.php?func=detail&aid=1877875&group_id=180599&atid=893831
  3. Timer of guest is inaccurate
  https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1826080&group_id=180599
  4. Installer of 64bit vista guest will pause for ten minutes after reboot
  https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1836905&group_id=180599
  5. Cannot boot 32bit smp RHEL5.1 guest with nic on 64bit host
  https://sourceforge.net/tracker/?func=detail&atid=893831&aid=1812043&group_id=180599
  
  Test environment
   
  
  Platform      Woodcrest
  CPU           4
  Memory size   8G
   
  
  Details
  
  
  IA32-pae:

  1. boot guest with 256M memory                              PASS
  2. boot two windows xp guest                                PASS
  3. boot 4 same guest in parallel                            PASS
  4. boot linux and windows guest in parallel                 PASS
  5. boot guest with 1500M memory                             PASS
  6. boot windows 2003 with ACPI enabled                      PASS
  7. boot Windows xp with ACPI enabled                        PASS
  8. boot Windows 2000 without ACPI                           PASS
  9. kernel build on SMP linux guest                          PASS
  10. LTP on SMP linux guest                                  PASS
  11. boot base kernel linux                                  PASS
  12. save/restore 32-bit HVM guests                          PASS
  13. live migration 32-bit HVM guests                        PASS
  14. boot SMP Windows xp with ACPI enabled                   PASS
  15. boot SMP Windows 2003 with ACPI enabled                 PASS
  16. boot SMP Windows 2000 with ACPI enabled                 PASS

  IA32e:

  1. boot four 32-bit guest in parallel                       PASS
  2. boot four 64-bit guest in parallel                       PASS
  3. boot 4G 64-bit guest                                     PASS
  4. boot 4G pae guest                                        PASS
  5. boot 32-bit linux and 32 bit windows guest in parallel   PASS
  6. boot 32-bit guest with 1500M memory                      PASS
  7. boot 64-bit guest with 1500M memory                      PASS
  8. boot 32-bit guest with 256M memory                       PASS
  9. boot 64-bit guest with 256M memory                       PASS
  10. boot two 32-bit windows xp in parallel                  PASS
  11. boot four 32-bit different guest in para                PASS
  12. save/restore 64-bit linux guests                        FAIL
  13. save/restore 32-bit linux guests                        PASS
  14. boot 32-bit SMP windows 2003 with ACPI enabled          PASS
  15. boot 32-bit SMP Windows 2000 with ACPI enabled          PASS
  16. boot 32-bit SMP Windows xp with ACPI enabled            PASS
  17. boot 32-bit Windows 2000 without ACPI                   PASS
  18. boot 64-bit Windows xp with ACPI enabled                PASS
  19. boot 32-bit Windows xp without ACPI                     PASS
  20. boot 64-bit vista                                       PASS
  21. kernel build in 32-bit linux guest OS                   PASS
  22. kernel build in 64-bit linux guest OS                   PASS
  23. LTP on SMP 32-bit linux guest OS                        PASS
  24. LTP on SMP 64-bit linux guest OS                        PASS
  25. boot 64-bit guests with ACPI enabled                    PASS
  26. boot 32-bit x-server                                    PASS
  27. boot 64-bit SMP windows XP with ACPI enabled            PASS
  28. boot 64-bit SMP windows 2003 with ACPI enabled          PASS
  29. live migration 64bit linux guests                       PASS
  30. live migration 32bit linux guests                       PASS
  
  
  Report Summary on IA32-pae
   
  Summary Test Report of Last Session
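  ==========================================================
                  Total   Pass    Fail    NoResult   Crash
  ==========================================================
  control_panel   6       5       1       0          0
  Restart         2       2       0       0          0
  gtest           14      13      1       0          0
  ==========================================================
  control_panel   6       5       1       0          0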
   :KVM_LM_PAE_gPAE   1   0   

Re: [kvm-devel] [PATCH] mmu notifiers #v6

2008-02-20 Thread Robin Holt
On Wed, Feb 20, 2008 at 11:39:42AM +0100, Andrea Arcangeli wrote:
 Given Nick's comments I ported my version of the mmu notifiers to
 latest mainline. There are no known bugs AFAIK and it's obviously safe
 (nothing is allowed to schedule inside rcu_read_lock taken by
 mmu_notifier() with my patch).
 
 XPMEM simply can't use RCU for the registration locking if it wants to
 schedule inside the mmu notifier calls. So I guess it's better to add
 the XPMEM invalidate_range_end/begin/external-rmap as a whole
 different subsystem that will have to use a mutex (not RCU) to
 serialize, and at the same time that CONFIG_XPMEM will also have to
 switch the i_mmap_lock to a mutex. I doubt xpmem fits inside a
 CONFIG_MMU_NOTIFIER anymore, or we'll all run a bit slower because of
 it. It's really a call of how much we want to optimize the MMU
 notifier, by keeping things like RCU for the registration.

But won't that other subsystem cause us to have two separate callouts
that do equivalent things and therefore force a removal of this and go
back to what Christoph has currently proposed?

Robin



Re: [kvm-devel] [PATCH] mmu notifiers #v6

2008-02-20 Thread Robin Holt
On Wed, Feb 20, 2008 at 01:03:24PM +0100, Andrea Arcangeli wrote:
 I'm unconvinced both the main linux VM and the mmu notifier should be
 changed like this just to support xpmem. All non-sleeping users don't
 need that. Nevertheless I'm fully welcome to support xpmem (and it's
 not my call nor my interest to comment if allocating skbs in
 try_to_unmap in order to unpin pages is workable, let's assume it's
 workable for the sake of this discussion) with a new config option
 that will also alter how the core VM works, in order to fully support
 the sleeping users for filebacked mappings.

We do not need to do any allocation in the messaging layer, all
structures used for messaging are allocated at module load time.
The allocation discussions we had early on were about trying to
rearrange your notifiers to allow a separate worker thread to do the
invalidate and then the main thread would spin waiting for the worker to
complete.  That was canned by moving your notifier to before the
lock was grabbed, which led us to the point of needing a _begin and _end.

 This will also create less confusion in the registration. With
 Christoph's one-config-option-fits-all you had to half register into
 the mmu notifier (the sleeping calls, so not invalidate_page) and full
 register in the external rmap notifier, and I had to only half
 register into the mmu notifier (not range_begin) and not register in
 the rmap external notifier.
 
 With two separate config options for sleeping and non sleeping users,
 I'll 100% register in the mmu notifier methods, and the non-sleeping
 users will 100% register the xpmem methods. You won't have to have
 designed the mmu notifier patches to understand how to use it.

So, fundamentally, how would they be different?  Would we be required to
add another notifier list to the mm and have two separate callout
points?  Reduction would end up with the same half-registered
half-not-registered situation you point out above.  Then further
reduction would lead to the elimination of the callouts you have just
proposed and using the _begin/_end callouts and we are back to
Christoph's current patch.

Robin



Re: [kvm-devel] [PATCH] mmu notifiers #v6

2008-02-20 Thread Andrea Arcangeli
On Wed, Feb 20, 2008 at 05:33:13AM -0600, Robin Holt wrote:
 But won't that other subsystem cause us to have two separate callouts
 that do equivalent things and therefore force a removal of this and go
 back to what Christoph has currently proposed?

The point is that a new kind of notifier that only supports sleeping
users will allow us to keep optimizing the mmu notifier patch for the
non-sleeping users. If we keep going Christoph's way of having a
single notifier that fits all, he will have to:

1) drop the entire RCU locking from his patches (making all previous
   RCU discussions and fixes void); those discussions only made sense
   if applied to _my_ patch, not Christoph's patches, as long as you
   expect to sleep in any of his mmu notifier methods like invalidate_range_*.

2) probably modify the linux VM to replace the i_mmap_lock and perhaps
   PT lock with a mutex (see Nick's comments for details)

I'm unconvinced both the main linux VM and the mmu notifier should be
changed like this just to support xpmem. All non-sleeping users don't
need that. Nevertheless I'm fully welcome to support xpmem (and it's
not my call nor my interest to comment if allocating skbs in
try_to_unmap in order to unpin pages is workable, let's assume it's
workable for the sake of this discussion) with a new config option
that will also alter how the core VM works, in order to fully support
the sleeping users for filebacked mappings.

This will also create less confusion in the registration. With
Christoph's one-config-option-fits-all you had to half register into
the mmu notifier (the sleeping calls, so not invalidate_page) and full
register in the external rmap notifier, and I had to only half
register into the mmu notifier (not range_begin) and not register in
the rmap external notifier.

With two separate config options for sleeping and non-sleeping users,
I'll 100% register in the mmu notifier methods, and the sleeping
users will 100% register the xpmem methods. You won't need to have
designed the mmu notifier patches to understand how to use them.

In theory both KVM and GRU are free to use the xpmem methods too (the
invalidate_page will be page_t based instead of [mm,addr] based, but
that's possible to handle with KVM changes if one wants to), but if a
distro only wants to support the non-sleeping users in their binary kernel
images, they won't be forced to alter how the VM works to do
that.

If there's agreement that the VM should alter its locking from
spinlock to mutex for its own good, then Christoph's
one-config-option-fits-all becomes a lot more appealing (replacing RCU
with a mutex in the mmu notifier list registration locking isn't my
main worry and the non-sleeping-users may be ok to live with it).



Re: [kvm-devel] [PATCH] mmu notifiers #v6

2008-02-20 Thread Andrea Arcangeli
On Wed, Feb 20, 2008 at 06:24:24AM -0600, Robin Holt wrote:
 We do not need to do any allocation in the messaging layer, all
 structures used for messaging are allocated at module load time.
 The allocation discussions we had early on were about trying to
 rearrange your notifiers to allow a separate worker thread to do the
 invalidate and then the main thread would spin waiting for the worker to
 complete.  That was canned by moving your notifier to before the
 lock was grabbed which led us to the point of needing a _begin and _end.

I thought you called some net/* function inside the mmu notifier
methods. Those always require several ram allocations internally.

 So, fundamentally, how would they be different?  Would we be required to
 add another notifier list to the mm and have two separate callout
 points?  Reduction would end up with the same half-registered
 half-not-registered situation you point out above.  Then further
 reduction would lead to the elimination of the callouts you have just
 proposed and using the _begin/_end callouts and we are back to
 Christoph's current patch.

Did you miss Nick's argument that we'd need to change some VM locks to
mutexes and solve the locking issues first? Are you implying mutexes are
more efficient for the VM (you may at least find support from the
preempt-rt folks), or are you implying the VM had better run slower with
mutexes in order to have a single config option?



Re: [kvm-devel] KVM Testing Result for KVM-61

2008-02-20 Thread Izik Eidus

On Wed, 2008-02-20 at 20:58 +0800, Zhao, Yunfeng wrote:
  Five old issues:
  1. Fails to save/restore guests
  Save/restore may cause host to hang.
  https://sourceforge.net/tracker/index.php?func=detail&aid=1824525&group_id=180599&atid=893831
 
 savevm/loadvm does not work, but it doesn't crash my host.
 What do you see in dmesg? (The bug with the bad page was fixed, no?)
 Here is the error message in the dmesg and host console:

Does it happen on every guest?
You are using qcow, right?

 Bad page state in process 'qemu-system-x86'
 page:81023f4a19d0 flags:0x0204 mapping:
 mapcount:0 count:-1
 Trying to fix it up, but a reboot is needed
 Backtrace:
 
 Call Trace:
  [8025a722] bad_page+0x63/0x8d
  [8025b42f] get_page_from_freelist+0x2d9/0x47e
  [8055a664] mutex_lock+0xd/0x1e
  [8025b742] __alloc_pages+0x61/0x2b5
  [802637f1] __handle_mm_fault+0x4c1/0x9dc
  [802625fc] follow_page+0x15a/0x24b
  [80263fbb] get_user_pages+0x2af/0x39c
  [88010c8b] :kvm:gfn_to_page+0x67/0xa0
  [880182a5] :kvm:paging64_page_fault+0xd8/0x3a0
  [80502cc2] tcp_v4_do_rcv+0x30/0x34d
  [88017183] :kvm:kvm_mmu_page_fault+0x19/0x80
  [88014ec7] :kvm:kvm_arch_vcpu_ioctl_run+0x3a5/0x4fb
  [880114e6] :kvm:kvm_vcpu_ioctl+0xda/0x2dd
  [802426dc] remove_wait_queue+0x12/0x45
  [803f397d] tun_chr_aio_read+0x2aa/0x2bc
  [802872aa] core_sys_select+0x1f8/0x264
  [80245ac5] getnstimeofday+0x32/0x8d
  [80244c65] ktime_get_ts+0x1a/0x4e
  [8024452a] enqueue_hrtimer+0x64/0x6b
  [80244a78] hrtimer_start+0xf2/0x104
  [80286063] do_ioctl+0x2b/0xb6
  [80286331] vfs_ioctl+0x243/0x25c
  [80286386] sys_ioctl+0x3c/0x5e
  [8020935e] system_call+0x7e/0x83
 
 
 Message from [EMAIL PROTECTED] at Tue Feb 19 09:52:24 2008 ...
 vt-dp8 kernel: Bad page state in process 'qemu-system-x86'
 
 Message from [EMAIL PROTECTED] at Tue Feb 19 09:52:24 2008 ...
 vt-dp8 kernel: page:81023ee1e8f8 flags:0x0200087c
 mapping:810205 dc98a8
 mapcount:0 count:2
 
 Message from [EMAIL PROTECTED] at Tue Feb 19 09:52:24 2008 ...
 vt-dp8 kernel: Trying to fix it up, but a reboot is needed
 
 Message from [EMAIL PROTECTED] at Tue Feb 19 09:52:24 2008 ...
 vt-dp8 kernel: Backtrace:
 
 Message from [EMAIL PROTECTED] at Tue Feb 19 09:52:24 2008 ...
 vt-dp8 kernel: Bad page state in process 'qemu-system-x86'
 
 Message from [EMAIL PROTECTED] at Tue Feb 19 09:52:24 2008 ...
 vt-dp8 kernel: page:81023f3387f8 flags:0x0268
 mapping:810213 3316a8
 mapcount:0 count:1
 
 Message from [EMAIL PROTECTED] at Tue Feb 19 09:52:24 2008 ...
 vt-dp8 kernel: Trying to fix it up, but a reboot is needed
 
 Message from [EMAIL PROTECTED] at Tue Feb 19 09:52:24 2008 ...
 vt-dp8 kernel: Backtrace:
 




Re: [kvm-devel] [PATCH] mmu notifiers #v6

2008-02-20 Thread Robin Holt
On Wed, Feb 20, 2008 at 01:32:36PM +0100, Andrea Arcangeli wrote:
 On Wed, Feb 20, 2008 at 06:24:24AM -0600, Robin Holt wrote:
  We do not need to do any allocation in the messaging layer, all
  structures used for messaging are allocated at module load time.
  The allocation discussions we had early on were about trying to
  rearrange your notifiers to allow a separate worker thread to do the
  invalidate and then the main thread would spin waiting for the worker to
  complete.  That was canned by moving your notifier to before the
  lock was grabbed which led us to the point of needing a _begin and _end.
 
 I thought you called some net/* function inside the mmu notifier
 methods. Those always require several ram allocations internally.

Nope, that was the discussions with the IB folks.  We only use XPC and
both the messages we send and the XPC internals do not need to allocate.

  So, fundamentally, how would they be different?  Would we be required to
  add another notifier list to the mm and have two separate callout
  points?  Reduction would end up with the same half-registered
  half-not-registered situation you point out above.  Then further
  reduction would lead to the elimination of the callouts you have just
  proposed and using the _begin/_end callouts and we are back to
  Christoph's current patch.
 
 Did you miss Nick's argument that we'd need to change some VM lock to
 mutex and solve lock issues first? Are you implying mutex are more
 efficient for the VM? (you may seek support from preempt-rt folks at
 least) or are you implying the VM would better run slower with mutex
 in order to have a single config option?

That would be if we needed to support file backed mappings and hugetlbfs
mappings.  Currently (and for the last 6 years), XPMEM has not supported
either of those.  I don't view either as being a realistic possibility,
but it is certainly something we would need to address before either
could be supported.

Robin



Re: [kvm-devel] KVM Testing Result for KVM-61

2008-02-20 Thread Zhao, Yunfeng
 Five old issues:
 1. Fails to save/restore guests
 Save/restore may cause host to hang.
 https://sourceforge.net/tracker/index.php?func=detail&aid=1824525&group_id=180599&atid=893831

savevm/loadvm does not work, but it doesn't crash my host.
What do you see in dmesg? (The bug with the bad page was fixed, no?)
Here is the error message in the dmesg and host console:
Bad page state in process 'qemu-system-x86'
page:81023f4a19d0 flags:0x0204 mapping:
mapcount:0 count:-1
Trying to fix it up, but a reboot is needed
Backtrace:

Call Trace:
 [8025a722] bad_page+0x63/0x8d
 [8025b42f] get_page_from_freelist+0x2d9/0x47e
 [8055a664] mutex_lock+0xd/0x1e
 [8025b742] __alloc_pages+0x61/0x2b5
 [802637f1] __handle_mm_fault+0x4c1/0x9dc
 [802625fc] follow_page+0x15a/0x24b
 [80263fbb] get_user_pages+0x2af/0x39c
 [88010c8b] :kvm:gfn_to_page+0x67/0xa0
 [880182a5] :kvm:paging64_page_fault+0xd8/0x3a0
 [80502cc2] tcp_v4_do_rcv+0x30/0x34d
 [88017183] :kvm:kvm_mmu_page_fault+0x19/0x80
 [88014ec7] :kvm:kvm_arch_vcpu_ioctl_run+0x3a5/0x4fb
 [880114e6] :kvm:kvm_vcpu_ioctl+0xda/0x2dd
 [802426dc] remove_wait_queue+0x12/0x45
 [803f397d] tun_chr_aio_read+0x2aa/0x2bc
 [802872aa] core_sys_select+0x1f8/0x264
 [80245ac5] getnstimeofday+0x32/0x8d
 [80244c65] ktime_get_ts+0x1a/0x4e
 [8024452a] enqueue_hrtimer+0x64/0x6b
 [80244a78] hrtimer_start+0xf2/0x104
 [80286063] do_ioctl+0x2b/0xb6
 [80286331] vfs_ioctl+0x243/0x25c
 [80286386] sys_ioctl+0x3c/0x5e
 [8020935e] system_call+0x7e/0x83


Message from [EMAIL PROTECTED] at Tue Feb 19 09:52:24 2008 ...
vt-dp8 kernel: Bad page state in process 'qemu-system-x86'

Message from [EMAIL PROTECTED] at Tue Feb 19 09:52:24 2008 ...
vt-dp8 kernel: page:81023ee1e8f8 flags:0x0200087c
mapping:810205 dc98a8
mapcount:0 count:2

Message from [EMAIL PROTECTED] at Tue Feb 19 09:52:24 2008 ...
vt-dp8 kernel: Trying to fix it up, but a reboot is needed

Message from [EMAIL PROTECTED] at Tue Feb 19 09:52:24 2008 ...
vt-dp8 kernel: Backtrace:

Message from [EMAIL PROTECTED] at Tue Feb 19 09:52:24 2008 ...
vt-dp8 kernel: Bad page state in process 'qemu-system-x86'

Message from [EMAIL PROTECTED] at Tue Feb 19 09:52:24 2008 ...
vt-dp8 kernel: page:81023f3387f8 flags:0x0268
mapping:810213 3316a8
mapcount:0 count:1

Message from [EMAIL PROTECTED] at Tue Feb 19 09:52:24 2008 ...
vt-dp8 kernel: Trying to fix it up, but a reboot is needed

Message from [EMAIL PROTECTED] at Tue Feb 19 09:52:24 2008 ...
vt-dp8 kernel: Backtrace:




[kvm-devel] howto set up a virtual firewall?

2008-02-20 Thread Kurt Neufeld

Hey there,

I've searched high and low but can't find an answer to my problem and I
find it hard to believe that I'm the only person that wants to do this.

I would like to setup a virtual machine that is my firewall. So far I've
got Shorewall setup in a virtual machine and the internal nic works 
and I can ping the host and vice versa.

However, I can't figure out the external nic. I've set up bridging on
my host's eth0 (currently my internet-facing nic), but if I understand
"3. public bridge" from http://kvm.qumranet.com/kvmwiki/Networking
correctly, then I have to have an ip address on my host? That would
defeat the purpose of the virtual firewall.

So what I want is to have the virtual machine have complete control of 
the external nic (not configured, no ip addr on host), the internal nic 
can either be on a virtual network or bridged internally, either works 
for me.

Current config at end.

Thanks for any assistance,
Kurt

ps - sorry for the previous incomplete post

<domain type='kvm'>
  <name>fw</name>
  <uuid>76bfb29c-ebd8-7d25-6009-0874d8cca460</uuid>
  <memory>262144</memory>
  <currentMemory>262144</currentMemory>
  <vcpu>1</vcpu>
  <os>
    <type>hvm</type>
    <boot dev='hd'/>
  </os>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <devices>
    <emulator>/usr/bin/qemu-kvm</emulator>
    <disk type='block' device='disk'>
      <source dev='/dev/lvm-1/vm-fw'/>
      <target dev='hda'/>
    </disk>
    <disk type='file' device='cdrom'>
      <source file='/tmp/smoothwall-express-3.0-x86_64.iso'/>
      <target dev='hdc'/>
      <readonly/>
    </disk>
    <interface type='bridge'>
      <mac address='00:50:04:7f:b5:a3'/>
      <source bridge='br0'/>
    </interface>
    <interface type='network'>
      <mac address='00:16:3e:06:8c:10'/>
      <source network='virtnet'/>
    </interface>
    <input type='mouse' bus='ps2'/>
    <graphics type='vnc' port='-1' listen='127.0.0.1'/>
  </devices>
</domain>



Re: [kvm-devel] howto set up a virtual firewall?

2008-02-20 Thread Avi Kivity
Kurt Neufeld wrote:
 Hey there,

 I've searched high and low but can't find an answer to my problem and I
 find it hard to believe that I'm the only person that wants to do this.

 I would like to setup a virtual machine that is my firewall. So far I've
 got Shorewall setup in a virtual machine and the internal nic works 
 and I can ping the host and vice versa.

 However, I can't figure out the external nic. I've setup bridging on 
 my hosts eth0 (currently my internet facing nic) but if I understand

 3. public bridge from http://kvm.qumranet.com/kvmwiki/Networking then I 
 have to have an ip address on my host? That would defeat the purpose of 
 the virtual firewall.

 So what I want is to have the virtual machine have complete control of 
 the external nic (not configured, no ip addr on host), the internal nic 
 can either be on a virtual network or bridged internally, either works 
 for me.
   

Assuming you have eth0 on the host, tap0 on the host visible as eth0 in 
the guest, and tap1 in the host visible as eth1 in the guest, you can 
add a bridge between eth0 and tap0, and use tap1 as the nic in the host 
for IP (e.g. run 'dhclient tap1' to obtain an internal IP address).


-- 
Any sufficiently difficult bug is indistinguishable from a feature.




Re: [kvm-devel] build #365 issue for v2.6.25-rc2-342-g5d9c4a7 in ./arch/x86/kvm/kvm.ko

2008-02-20 Thread Avi Kivity

Avi Kivity wrote:

Toralf Förster wrote:

Hello,

the build with the attached .config failed, make ends with:
...
  HOSTCC  arch/x86/boot/tools/build
  BUILD   arch/x86/boot/bzImage
Root device is (3, 8)
Setup is 12280 bytes (padded to 12288 bytes).
System is 2192 kB
Kernel: arch/x86/boot/bzImage is ready  (#1)
  Building modules, stage 2.
  MODPOST 211 modules
ERROR: smp_ops [arch/x86/kvm/kvm.ko] undefined!
make[1]: *** [__modpost] Error 1
make: *** [modules] Error 2


The build was made with :
$ make mrproper && make randconfig && <tweak config file> && make oldconfig && make


Here's the config:
  


Looks like KVM conflicts with CONFIG_VOYAGER...



Attached patch should fix.

Subject: x86: disable KVM on Voyager

Most classic Pentiums don't have hardware virtualization
extension, and building kvm with voyager generates
spurious failures.

Signed-off-by: Avi Kivity [EMAIL PROTECTED]

--
Any sufficiently difficult bug is indistinguishable from a feature.

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index cc2bc37..e27962c 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -21,7 +21,7 @@ config X86
 	select HAVE_IDE
 	select HAVE_OPROFILE
 	select HAVE_KPROBES
-	select HAVE_KVM
+	select HAVE_KVM if ((X86_32 && !X86_VOYAGER) || X86_64)
 
 
 config GENERIC_LOCKBREAK


Re: [kvm-devel] build #365 issue for v2.6.25-rc2-342-g5d9c4a7 in ./arch/x86/kvm/kvm.ko

2008-02-20 Thread Avi Kivity
Toralf Förster wrote:
 Hello,

 the build with the attached .config failed, make ends with:
 ...
   HOSTCC  arch/x86/boot/tools/build
   BUILD   arch/x86/boot/bzImage
 Root device is (3, 8)
 Setup is 12280 bytes (padded to 12288 bytes).
 System is 2192 kB
 Kernel: arch/x86/boot/bzImage is ready  (#1)
   Building modules, stage 2.
   MODPOST 211 modules
 ERROR: smp_ops [arch/x86/kvm/kvm.ko] undefined!
 make[1]: *** [__modpost] Error 1
 make: *** [modules] Error 2


 The build was made with :
 $ make mrproper && make randconfig && <tweak config file> && make oldconfig && make

 Here's the config:
   

Looks like KVM conflicts with CONFIG_VOYAGER...

-- 
Any sufficiently difficult bug is indistinguishable from a feature.




Re: [kvm-devel] KVM Testing Result for KVM-61

2008-02-20 Thread Izik Eidus

On Wed, 2008-02-20 at 21:12 +0800, Zhao, Yunfeng wrote:
 
 On Wed, 2008-02-20 at 20:58 +0800, Zhao, Yunfeng wrote:
   Five old issues:
   1. Fails to save/restore guests
   Save/restore may cause host to hang.
   https://sourceforge.net/tracker/index.php?func=detail&aid=1824525&group_id=180599&atid=893831
  
  savevm/loadvm does not work, but it doesn't crash my host.
  What do you see in dmesg? (The bug with the bad page was fixed, no?)
  Here is the error message in the dmesg and host console:
 
 Does it happen on every guest?
 You are using qcow, right?
 It doesn't happen on every guest, but the probability of this error is very 
 high.
 Yes, I am using qcow images.

Zhao, it seems it just doesn't want to happen on my host.
Can you please give me as much information as possible about the
guest/host, and also on how you save the VM?




Re: [kvm-devel] mmdrop external module oops

2008-02-20 Thread Avi Kivity
Andrea Arcangeli wrote:
 A 2.6.25-rc based kernel spawned an oops in mmdrop when kvm quit so
 that reminded me of this:
   

Applied, thanks.

-- 
Any sufficiently difficult bug is indistinguishable from a feature.




Re: [kvm-devel] howto set up a virtual firewall?

2008-02-20 Thread Javier Guerra
On 2/20/08, Avi Kivity [EMAIL PROTECTED] wrote:
 Assuming you have eth0 on the host, tap0 on the host visible as eth0 in
  the guest, and tap1 in the host visible as eth1 in the guest, you can
  add a bridge between eth0 and tap0, and use tap1 as the nic in the host
  for IP (e.g. run 'dhclient tap1' to obtain an internal IP address).

note that if you do that, there's no need to set IP address on the host's eth0

check the Xen maillist archive; this kind of setup is common there,
but they use some heavy iface renaming to make it look more 'normal'
(but a lot harder to initially grok)



-- 
Javier



Re: [kvm-devel] [PATCH] mmu notifiers #v6

2008-02-20 Thread Robin Holt
On Wed, Feb 20, 2008 at 11:39:42AM +0100, Andrea Arcangeli wrote:
 XPMEM simply can't use RCU for the registration locking if it wants to
 schedule inside the mmu notifier calls. So I guess it's better to add

Whoa there.  In Christoph's patch, we did not use rcu for the list.  It
was a simple hlist_head.  The list manipulations were done under
down_write(&current->mm->mmap_sem) and would therefore not be racy.  All
the callout locations are already acquiring the mmap_sem at least
readably, so we should be safe.  Maybe I missed a race somewhere.

Thanks,
Robin



Re: [kvm-devel] large page support for kvm

2008-02-20 Thread Avi Kivity
Marcelo Tosatti wrote:
   
 +   /*
 +* Largepage creation is susceptible to a upper-level
 +* table to be shadowed and write-protected in the
 +* area being mapped. If that is the case, invalidate
 +* the entry and let the instruction fault again
 +* and use 4K mappings.
 +*/
 +   if (largepage) {
 +   spte = shadow_trap_nonpresent_pte;
 +   kvm_x86_ops->tlb_flush(vcpu);
 +   goto unshadowed;
 +   }
  
   
 Would it not repeat exactly the same code path?  Or is this just for the 
 case of the pte_update path?
 

 The problem is if the instruction writing to one of the roots can't be
 emulated.

 kvm_mmu_unprotect_page() does not know about largepages, so it will zap
 a gfn inside the large page frame, but not the large translation itself.

 And zapping the gfn brings the shadowed page count in large area to
 zero, allowing has_wrprotected_page() to succeed. Endless unfixable
 write faults.

   

I don't follow. Can you describe the scenario in more detail? The state 
of the guest and shadow page tables, and what actually happens?

Setting spte to a nonpresent pte seems to violate the rmap btw; rmap 
always expects a valid pte pointing at the page.

-- 
Any sufficiently difficult bug is indistinguishable from a feature.




Re: [kvm-devel] [PATCH] mmu notifiers #v6

2008-02-20 Thread Andrea Arcangeli
On Wed, Feb 20, 2008 at 08:41:55AM -0600, Robin Holt wrote:
 On Wed, Feb 20, 2008 at 11:39:42AM +0100, Andrea Arcangeli wrote:
  XPMEM simply can't use RCU for the registration locking if it wants to
  schedule inside the mmu notifier calls. So I guess it's better to add
 
 Whoa there.  In Christoph's patch, we did not use rcu for the list.  It
 was a simple hlist_head.  The list manipulations were done under
 down_write(&current->mm->mmap_sem) and would therefore not be racy.  All
 the callout locations are already acquiring the mmap_sem at least
 readably, so we should be safe.  Maybe I missed a race somewhere.

You missed quite a few, see when atomic=1 and when mmu_rmap_notifier
is invoked for example.



[kvm-devel] [PATCH] Fix host_cpuid() in qemu/qemu-kvm-x86.c

2008-02-20 Thread Bernhard Kaindl
Hi,

I found that on kvm-61 the cpuid in the guest was reported incorrectly
when qemu-kvm was compiled with gcc-4.1 or 4.3.

This resulted in linux-64bit not booting, complaining that it is not
running on a 64-bit machine.

Symptom: Unexpected behaviour after the assembly snippet.

Solution: New assembly which is simpler and leaves optimizations to gcc,
resulting in much shorter and more maintainable code.

Comments are welcome,
Bernhard

PS: Thanks a lot to Alex Graf for the fix.

--- qemu/qemu-kvm-x86.c
+++ qemu/qemu-kvm-x86.c
@@ -428,35 +428,7 @@ static void host_cpuid(uint32_t function
 uint32_t vec[4];
 
 vec[0] = function;
-asm volatile (
-#ifdef __x86_64__
-"sub $128, %%rsp \n\t"  /* skip red zone */
-"push %0;  push %%rsi \n\t"
-"push %%rax; push %%rbx; push %%rcx; push %%rdx \n\t"
-"mov 8*5(%%rsp), %%rsi \n\t"
-"mov (%%rsi), %%eax \n\t"
-"cpuid \n\t"
-"mov %%eax, (%%rsi) \n\t"
-"mov %%ebx, 4(%%rsi) \n\t"
-"mov %%ecx, 8(%%rsi) \n\t"
-"mov %%edx, 12(%%rsi) \n\t"
-"pop %%rdx; pop %%rcx; pop %%rbx; pop %%rax \n\t"
-"pop %%rsi; pop %0 \n\t"
-"add $128, %%rsp"
-#else
-"push %0;  push %%esi \n\t"
-"push %%eax; push %%ebx; push %%ecx; push %%edx \n\t"
-"mov 4*5(%%esp), %%esi \n\t"
-"mov (%%esi), %%eax \n\t"
-"cpuid \n\t"
-"mov %%eax, (%%esi) \n\t"
-"mov %%ebx, 4(%%esi) \n\t"
-"mov %%ecx, 8(%%esi) \n\t"
-"mov %%edx, 12(%%esi) \n\t"
-"pop %%edx; pop %%ecx; pop %%ebx; pop %%eax \n\t"
-"pop %%esi; pop %0 \n\t"
-#endif
-: : "rm"(vec) : "memory");
+asm volatile("cpuid" : "+a" (vec[0]), "=b" (vec[1]),"=c" (vec[2]), "=d" (vec[3]));
 if (eax)
*eax = vec[0];
 if (ebx)



[kvm-devel] [PATCH] build #365 issue for v2.6.25-rc2-342-g5d9c4a7 in ./arch/x86/kvm/kvm.ko

2008-02-20 Thread Randy Dunlap
On Wed, 20 Feb 2008 16:07:03 +0200 Avi Kivity wrote:

  Looks like KVM conflicts with CONFIG_VOYAGER...
 
 
 Attached patch should fix.
 
 Subject: x86: disable KVM on Voyager
 
 Most classic Pentiums don't have hardware virtualization
 extension, and building kvm with voyager generates
 spurious failures.
 
 Signed-off-by: Avi Kivity [EMAIL PROTECTED]

Might as well extend it for VISWS & NUMAQ:

---
From: Avi Kivity [EMAIL PROTECTED]

Most classic Pentiums don't have hardware virtualization extension,
and building kvm with Voyager, Visual Workstation, or NUMAQ
generates spurious failures.

Signed-off-by: Avi Kivity [EMAIL PROTECTED]
Signed-off-by: Randy Dunlap [EMAIL PROTECTED]
---
 arch/x86/Kconfig |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- linux-2.6.25-rc2-git4.orig/arch/x86/Kconfig
+++ linux-2.6.25-rc2-git4/arch/x86/Kconfig
@@ -21,7 +21,7 @@ config X86
select HAVE_IDE
select HAVE_OPROFILE
select HAVE_KPROBES
-   select HAVE_KVM
+   select HAVE_KVM if ((X86_32 && !X86_VOYAGER && !X86_VISWS && !X86_NUMAQ) || X86_64)
 
 
 config GENERIC_LOCKBREAK



[kvm-devel] [patch 5/5] KVM: VMX cr3 cache support (v2)

2008-02-20 Thread Marcelo Tosatti
Add support for the cr3 cache feature on Intel VMX CPUs. This avoids
vmexits on context switch if the cr3 value is cached in one of the 
entries (currently 4 are present).

This is especially important for Xenner, where each guest syscall
involves a cr3 switch.

v1->v2:
- handle the race which happens when the guest has the cache cleared
in the middle of kvm_write_cr3 by injecting a GP and trapping it to
fall back to the hypercall variant (suggested by Avi).
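
The lookup described above can be sketched in plain user-space C. This is only an illustration of the hit/miss decision, not the patch's code: `cr3_cache_lookup` and the struct layout are hypothetical names, and the real guest loads %cr3 directly on a hit and issues a hypercall on a miss.

```c
#include <stddef.h>

#define CR3_CACHE_ENTRIES 4   /* "currently 4 are present" per the description */

/* Hypothetical layout: one guest_cr3 -> host_cr3 pair per slot. */
struct cr3_entry {
    unsigned long guest_cr3;
    unsigned long host_cr3;
};

struct cr3_cache {
    struct cr3_entry entry[CR3_CACHE_ENTRIES];
    size_t max_idx;           /* number of valid slots; 0 means cache cleared */
};

/*
 * Returns 1 and stores the cached host value on a hit.
 * Returns 0 on a miss; the caller would then fall back to the hypercall.
 */
int cr3_cache_lookup(const struct cr3_cache *c, unsigned long guest_cr3,
                     unsigned long *host_cr3)
{
    size_t i;

    for (i = 0; i < c->max_idx && i < CR3_CACHE_ENTRIES; i++) {
        if (c->entry[i].guest_cr3 == guest_cr3) {
            *host_cr3 = c->entry[i].host_cr3;
            return 1;
        }
    }
    return 0;
}
```

If the host clears the cache (sets `max_idx` to 0 in this sketch), every lookup misses and the slow path is taken, which is the safe direction for the race mentioned in the changelog.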

Signed-off-by: Marcelo Tosatti [EMAIL PROTECTED]
Cc: Anthony Liguori [EMAIL PROTECTED]


Index: kvm.paravirt2/arch/x86/kernel/kvm.c
===
--- kvm.paravirt2.orig/arch/x86/kernel/kvm.c
+++ kvm.paravirt2/arch/x86/kernel/kvm.c
@@ -26,14 +26,17 @@
 #include <linux/cpu.h>
 #include <linux/mm.h>
 #include <linux/hardirq.h>
+#include <asm/tlbflush.h>
+#include <asm/asm.h>
 
 #define MAX_MULTICALL_NR (PAGE_SIZE / sizeof(struct kvm_multicall_entry))
 
 struct kvm_para_state {
+   struct kvm_cr3_cache cr3_cache;
struct kvm_multicall_entry queue[MAX_MULTICALL_NR];
int queue_index;
enum paravirt_lazy_mode mode;
-};
+} __attribute__ ((aligned(PAGE_SIZE)));
 
 static DEFINE_PER_CPU(struct kvm_para_state, para_state);
 
@@ -104,6 +107,116 @@ static void kvm_io_delay(void)
 {
 }
 
+static void kvm_new_cr3(unsigned long cr3)
+{
+   kvm_hypercall1(KVM_HYPERCALL_SET_CR3, cr3);
+}
+
+static unsigned long __force_order;
+
+/*
+ * Special, register-to-cr3 instruction based hypercall API
+ * variant to the KVM host. This utilizes the cr3 filter capability
+ * of the hardware - if this works out then no VM exit happens,
+ * if a VM exit happens then KVM will get the virtual address too.
+ */
+static void kvm_write_cr3(unsigned long guest_cr3)
+{
+   struct kvm_para_state *para_state = &get_cpu_var(para_state);
+   struct kvm_cr3_cache *cache = &para_state->cr3_cache;
+   int idx;
+
+   /*
+* Check the cache (maintained by the host) for a matching
+* guest_cr3 => host_cr3 mapping. Use it if found:
+*/
+   for (idx = 0; idx < cache->max_idx; idx++) {
+   if (cache->entry[idx].guest_cr3 == guest_cr3) {
+   unsigned long trap;
+
+   /*
+* Cache-hit: we load the cached host-CR3 value.
+* Fallback to hypercall variant if it raced with
+* the host clearing the cache after guest_cr3
+* comparison.
+*/
+   __asm__ __volatile__ (
+   "    mov %2, %0\n"
+   "0:  mov %3, %%cr3\n"
+   "1:\n"
+   ".section .fixup,\"ax\"\n"
+   "2:  mov %1, %0\n"
+   "    jmp 1b\n"
+   ".previous\n"
+   _ASM_EXTABLE(0b, 2b)
+   : "=&r" (trap)
+   : "n" (1UL), "n" (0UL),
+ "b" (cache->entry[idx].host_cr3),
+ "m" (__force_order));
+   if (!trap)
+   goto out;
+   break;
+   }
+   }
+
+   /*
+* Cache-miss. Tell the host the new cr3 via hypercall (to avoid
+* aliasing problems with a cached host_cr3 == guest_cr3).
+*/
+   kvm_new_cr3(guest_cr3);
+out:
+   put_cpu_var(para_state);
+}
+
+/*
+ * Avoid the VM exit upon cr3 load by using the cached
+ * ->active_mm->pgd value:
+ */
+static void kvm_flush_tlb_user(void)
+{
+   kvm_write_cr3(__pa(current->active_mm->pgd));
+}
+
+/*
+ * Disable global pages, do a flush, then enable global pages:
+ */
+static void kvm_flush_tlb_kernel(void)
+{
+   unsigned long orig_cr4 = read_cr4();
+
+   write_cr4(orig_cr4 & ~X86_CR4_PGE);
+   kvm_flush_tlb_user();
+   write_cr4(orig_cr4);
+}
+
+static void register_cr3_cache(void *cache)
+{
+   struct kvm_para_state *state;
+
+   state = &per_cpu(para_state, raw_smp_processor_id());
+   wrmsrl(KVM_MSR_SET_CR3_CACHE, __pa(&state->cr3_cache));
+}
+
+static unsigned __init kvm_patch(u8 type, u16 clobbers, void *ibuf,
+unsigned long addr, unsigned len)
+{
+   switch (type) {
+   case PARAVIRT_PATCH(pv_mmu_ops.write_cr3):
+   return paravirt_patch_default(type, clobbers, ibuf, addr, len);
+   default:
+   return native_patch(type, clobbers, ibuf, addr, len);
+   }
+}
+
+static void __init setup_guest_cr3_cache(void)
+{
+   on_each_cpu(register_cr3_cache, NULL, 0, 1);
+
+   pv_mmu_ops.write_cr3 = kvm_write_cr3;
+   pv_mmu_ops.flush_tlb_user = kvm_flush_tlb_user;
+   pv_mmu_ops.flush_tlb_kernel = kvm_flush_tlb_kernel;
+}
+
 static void kvm_mmu_write(void *dest, const void *src, size_t size)
 {

[kvm-devel] [patch 0/5] KVM paravirt MMU updates and cr3 caching (v2)

2008-02-20 Thread Marcelo Tosatti
The following patchset, based on earlier work by Anthony and Ingo, adds
paravirt_ops support for KVM guests enabling hypercall based pte updates,
hypercall batching and cr3 caching.

-- 




[kvm-devel] [patch 3/5] KVM: hypercall batching (v2)

2008-02-20 Thread Marcelo Tosatti
Batch pte updates and tlb flushes in lazy MMU mode.

v1->v2:
- report individual hypercall error code, have multicall return number of 
processed entries.
- cover entire multicall duration with slots_lock instead of 
acquiring/reacquiring.
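
The batching idea can be sketched as a small user-space queue. This is a simplified illustration, not the patch itself: `QUEUE_MAX`, `queue_defer` and `queue_flush` are hypothetical names, and the flush only counts what a real multicall hypercall would hand to the host.

```c
#include <stddef.h>

#define QUEUE_MAX 4   /* hypothetical; the patch sizes the queue to one page */

struct call_entry {
    unsigned int nr;
    unsigned long a0, a1;
};

struct call_queue {
    struct call_entry entries[QUEUE_MAX];
    size_t count;            /* entries currently queued */
    unsigned long flushes;   /* batches sent so far */
    unsigned long sent;      /* entries sent so far */
};

/* Stand-in for the multicall hypercall: hand over the whole batch at once. */
void queue_flush(struct call_queue *q)
{
    if (q->count == 0)
        return;
    q->flushes++;
    q->sent += q->count;
    q->count = 0;
}

/* Queue one call, flushing first if the queue is already full. */
void queue_defer(struct call_queue *q, unsigned int nr,
                 unsigned long a0, unsigned long a1)
{
    struct call_entry *e;

    if (q->count == QUEUE_MAX)
        queue_flush(q);

    e = &q->entries[q->count++];
    e->nr = nr;
    e->a0 = a0;
    e->a1 = a1;
}
```

With a queue of four slots, ten deferred calls followed by a final flush result in three batches instead of ten separate exits, which is the saving lazy MMU mode is after.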

Signed-off-by: Marcelo Tosatti [EMAIL PROTECTED]
Cc: Anthony Liguori [EMAIL PROTECTED]

Index: kvm.paravirt2/arch/x86/kernel/kvm.c
===
--- kvm.paravirt2.orig/arch/x86/kernel/kvm.c
+++ kvm.paravirt2/arch/x86/kernel/kvm.c
@@ -25,6 +25,77 @@
 #include <linux/kvm_para.h>
 #include <linux/cpu.h>
 #include <linux/mm.h>
+#include <linux/hardirq.h>
+
+#define MAX_MULTICALL_NR (PAGE_SIZE / sizeof(struct kvm_multicall_entry))
+
+struct kvm_para_state {
+   struct kvm_multicall_entry queue[MAX_MULTICALL_NR];
+   int queue_index;
+   enum paravirt_lazy_mode mode;
+};
+
+static DEFINE_PER_CPU(struct kvm_para_state, para_state);
+
+static int can_defer_hypercall(struct kvm_para_state *state, unsigned int nr)
+{
+   if (state->mode == PARAVIRT_LAZY_MMU) {
+   switch (nr) {
+   case KVM_HYPERCALL_MMU_WRITE:
+   case KVM_HYPERCALL_FLUSH_TLB:
+   return 1;
+   }
+   }
+   return 0;
+}
+
+static void hypercall_queue_flush(struct kvm_para_state *state)
+{
+   long ret;
+
+   if (state->queue_index) {
+   ret = kvm_hypercall2(KVM_HYPERCALL_MULTICALL,
+__pa(state->queue), state->queue_index);
+   WARN_ON (ret != state->queue_index);
+   state->queue_index = 0;
+   }
+}
+
+static void kvm_hypercall_defer(struct kvm_para_state *state,
+   unsigned int nr,
+   unsigned long a0, unsigned long a1,
+   unsigned long a2, unsigned long a3)
+{
+   struct kvm_multicall_entry *entry;
+
+   BUG_ON(preemptible());
+
+   if (state->queue_index == MAX_MULTICALL_NR)
+   hypercall_queue_flush(state);
+
+   entry = &state->queue[state->queue_index++];
+   entry->nr = nr;
+   entry->a0 = a0;
+   entry->a1 = a1;
+   entry->a2 = a2;
+   entry->a3 = a3;
+}
+
+static long kvm_hypercall(unsigned int nr, unsigned long a0,
+ unsigned long a1, unsigned long a2,
+ unsigned long a3)
+{
+   struct kvm_para_state *state = &get_cpu_var(para_state);
+   long ret = 0;
+
+   if (can_defer_hypercall(state, nr))
+   kvm_hypercall_defer(state, nr, a0, a1, a2, a3);
+   else
+   ret = kvm_hypercall4(nr, a0, a1, a2, a3);
+
+   put_cpu_var(para_state);
+   return ret;
+}
 
 /*
  * No need for any IO delay on KVM
@@ -44,8 +115,8 @@ static void kvm_mmu_write(void *dest, co
if (size == 2)
a1 = *(u32 *)&p[4];
 #endif
-   kvm_hypercall3(KVM_HYPERCALL_MMU_WRITE, (unsigned long)__pa(dest), a0,
-   a1);
+   kvm_hypercall(KVM_HYPERCALL_MMU_WRITE, (unsigned long)__pa(dest), a0,
+   a1, 0);
 }
 
 /*
@@ -110,12 +181,31 @@ static void kvm_set_pud(pud_t *pudp, pud
 
 static void kvm_flush_tlb(void)
 {
-   kvm_hypercall0(KVM_HYPERCALL_FLUSH_TLB);
+   kvm_hypercall(KVM_HYPERCALL_FLUSH_TLB, 0, 0, 0, 0);
 }
 
 static void kvm_release_pt(u32 pfn)
 {
-   kvm_hypercall1(KVM_HYPERCALL_RELEASE_PT, pfn << PAGE_SHIFT);
+   kvm_hypercall(KVM_HYPERCALL_RELEASE_PT, pfn << PAGE_SHIFT, 0, 0, 0);
+}
+
+static void kvm_enter_lazy_mmu(void)
+{
+   struct kvm_para_state *state
+   = &per_cpu(para_state, smp_processor_id());
+
+   paravirt_enter_lazy_mmu();
+   state->mode = paravirt_get_lazy_mode();
+}
+
+static void kvm_leave_lazy_mmu(void)
+{
+   struct kvm_para_state *state
+   = &per_cpu(para_state, smp_processor_id());
+
+   hypercall_queue_flush(state);
+   paravirt_leave_lazy(paravirt_get_lazy_mode());
+   state->mode = paravirt_get_lazy_mode();
 }
 
 static void paravirt_ops_setup(void)
@@ -144,6 +234,11 @@ static void paravirt_ops_setup(void)
pv_mmu_ops.release_pt = kvm_release_pt;
pv_mmu_ops.release_pd = kvm_release_pt;
}
+
+   if (kvm_para_has_feature(KVM_FEATURE_MULTICALL)) {
+   pv_mmu_ops.lazy_mode.enter = kvm_enter_lazy_mmu;
+   pv_mmu_ops.lazy_mode.leave = kvm_leave_lazy_mmu;
+   }
 }
 
 void __init kvm_guest_init(void)
Index: kvm.paravirt2/arch/x86/kvm/x86.c
===
--- kvm.paravirt2.orig/arch/x86/kvm/x86.c
+++ kvm.paravirt2/arch/x86/kvm/x86.c
@@ -79,6 +79,8 @@ struct kvm_stats_debugfs_item debugfs_en
{ "fpu_reload", VCPU_STAT(fpu_reload) },
{ "insn_emulation", VCPU_STAT(insn_emulation) },
{ "insn_emulation_fail", VCPU_STAT(insn_emulation_fail) },
+   { "multicall", VCPU_STAT(multicall) },
+   { "multicall_nr", VCPU_STAT(multicall_nr) 

[kvm-devel] [patch 2/5] KVM: hypercall based pte updates and TLB flushes (v2)

2008-02-20 Thread Marcelo Tosatti
Hypercall based pte updates are faster than faults, and also allow use
of the lazy MMU mode to batch operations.

Don't report the feature if two dimensional paging is enabled.

v1->v2:
- guest passes physical destination addr, which is cheaper than doing v->p
translation in the host.
- infer size of pte from guest mode

Signed-off-by: Marcelo Tosatti [EMAIL PROTECTED]
Cc: Anthony Liguori [EMAIL PROTECTED]


Index: kvm.paravirt2/arch/x86/kernel/kvm.c
===
--- kvm.paravirt2.orig/arch/x86/kernel/kvm.c
+++ kvm.paravirt2/arch/x86/kernel/kvm.c
@@ -33,6 +33,91 @@ static void kvm_io_delay(void)
 {
 }
 
+static void kvm_mmu_write(void *dest, const void *src, size_t size)
+{
+   const uint8_t *p = src;
+   unsigned long a0 = *(unsigned long *)p;
+   unsigned long a1 = 0;
+
+#ifdef CONFIG_X86_32
+   size >>= 2;
+   if (size == 2)
+   a1 = *(u32 *)&p[4];
+#endif
+   kvm_hypercall3(KVM_HYPERCALL_MMU_WRITE, (unsigned long)__pa(dest), a0,
+   a1);
+}
+
+/*
+ * We only need to hook operations that are MMU writes.  We hook these so that
+ * we can use lazy MMU mode to batch these operations.  We could probably
+ * improve the performance of the host code if we used some of the information
+ * here to simplify processing of batched writes.
+ */
+static void kvm_set_pte(pte_t *ptep, pte_t pte)
+{
+   kvm_mmu_write(ptep, &pte, sizeof(pte));
+}
+
+static void kvm_set_pte_at(struct mm_struct *mm, unsigned long addr,
+  pte_t *ptep, pte_t pte)
+{
+   kvm_mmu_write(ptep, &pte, sizeof(pte));
+}
+
+static void kvm_set_pmd(pmd_t *pmdp, pmd_t pmd)
+{
+   kvm_mmu_write(pmdp, &pmd, sizeof(pmd));
+}
+
+#if PAGETABLE_LEVELS >= 3
+#ifdef CONFIG_X86_PAE
+static void kvm_set_pte_atomic(pte_t *ptep, pte_t pte)
+{
+   kvm_mmu_write(ptep, &pte, sizeof(pte));
+}
+
+static void kvm_set_pte_present(struct mm_struct *mm, unsigned long addr,
+   pte_t *ptep, pte_t pte)
+{
+   kvm_mmu_write(ptep, &pte, sizeof(pte));
+}
+
+static void kvm_pte_clear(struct mm_struct *mm,
+ unsigned long addr, pte_t *ptep)
+{
+   pte_t pte = __pte(0);
+   kvm_mmu_write(ptep, &pte, sizeof(pte));
+}
+
+static void kvm_pmd_clear(pmd_t *pmdp)
+{
+   pmd_t pmd = __pmd(0);
+   kvm_mmu_write(pmdp, &pmd, sizeof(pmd));
+}
+#endif
+
+static void kvm_set_pgd(pgd_t *pgdp, pgd_t pgd)
+{
+   kvm_mmu_write(pgdp, &pgd, sizeof(pgd));
+}
+
+static void kvm_set_pud(pud_t *pudp, pud_t pud)
+{
+   kvm_mmu_write(pudp, &pud, sizeof(pud));
+}
+#endif /* PAGETABLE_LEVELS >= 3 */
+
+static void kvm_flush_tlb(void)
+{
+   kvm_hypercall0(KVM_HYPERCALL_FLUSH_TLB);
+}
+
+static void kvm_release_pt(u32 pfn)
+{
+   kvm_hypercall1(KVM_HYPERCALL_RELEASE_PT, pfn << PAGE_SHIFT);
+}
+
 static void paravirt_ops_setup(void)
 {
pv_info.name = "KVM";
@@ -41,6 +126,24 @@ static void paravirt_ops_setup(void)
if (kvm_para_has_feature(KVM_FEATURE_NOP_IO_DELAY))
pv_cpu_ops.io_delay = kvm_io_delay;
 
+   if (kvm_para_has_feature(KVM_FEATURE_MMU_WRITE)) {
+   pv_mmu_ops.set_pte = kvm_set_pte;
+   pv_mmu_ops.set_pte_at = kvm_set_pte_at;
+   pv_mmu_ops.set_pmd = kvm_set_pmd;
+#if PAGETABLE_LEVELS >= 3
+#ifdef CONFIG_X86_PAE
+   pv_mmu_ops.set_pte_atomic = kvm_set_pte_atomic;
+   pv_mmu_ops.set_pte_present = kvm_set_pte_present;
+   pv_mmu_ops.pte_clear = kvm_pte_clear;
+   pv_mmu_ops.pmd_clear = kvm_pmd_clear;
+#endif
+   pv_mmu_ops.set_pud = kvm_set_pud;
+   pv_mmu_ops.set_pgd = kvm_set_pgd;
+#endif
+   pv_mmu_ops.flush_tlb_user = kvm_flush_tlb;
+   pv_mmu_ops.release_pt = kvm_release_pt;
+   pv_mmu_ops.release_pd = kvm_release_pt;
+   }
 }
 
 void __init kvm_guest_init(void)
Index: kvm.paravirt2/arch/x86/kvm/mmu.c
===
--- kvm.paravirt2.orig/arch/x86/kvm/mmu.c
+++ kvm.paravirt2/arch/x86/kvm/mmu.c
@@ -39,7 +39,7 @@
  * 2. while doing 1. it walks guest-physical to host-physical
  * If the hardware supports that we don't need to do shadow paging.
  */
-static bool tdp_enabled = false;
+bool tdp_enabled = false;
 
 #undef MMU_DEBUG
 
@@ -288,7 +288,7 @@ static void mmu_free_memory_cache_page(s
free_page((unsigned long)mc->objects[--mc->nobjs]);
 }
 
-static int mmu_topup_memory_caches(struct kvm_vcpu *vcpu)
+int mmu_topup_memory_caches(struct kvm_vcpu *vcpu)
 {
int r;
 
@@ -857,7 +857,7 @@ static int kvm_mmu_unprotect_page(struct
return r;
 }
 
-static void mmu_unshadow(struct kvm *kvm, gfn_t gfn)
+void mmu_unshadow(struct kvm *kvm, gfn_t gfn)
 {
struct kvm_mmu_page *sp;
 
Index: kvm.paravirt2/arch/x86/kvm/mmu.h
===
--- 

[kvm-devel] [patch 1/5] KVM: add basic paravirt support (v2)

2008-02-20 Thread Marcelo Tosatti
Add basic KVM paravirt support. Avoid vm-exits on IO delays.

Add KVM_GET_PARA_FEATURES ioctl so paravirt features can be reported in a
single bitmask. This allows the host to disable features on runtime if 
appropriate, which would require one ioctl per feature otherwise.

The limit of 32 features can be extended to 64 if needed, beyond that a new 
MSR is required.
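
A single bitmask makes both the guest-side test and the host-side disable a one-liner each. The following is a minimal sketch with hypothetical bit numbers and helper names (`para_has_feature`, `para_disable_feature`); the real feature constants and their values are defined by the patch, not here.

```c
#include <stdint.h>

/* Hypothetical feature bit numbers, for illustration only. */
enum {
    FEAT_NOP_IO_DELAY = 0,
    FEAT_MMU_WRITE    = 1,
    FEAT_MULTICALL    = 2
};

/* Guest side: returns 1 if feature bit 'bit' is set in the 32-bit mask. */
int para_has_feature(uint32_t features, unsigned int bit)
{
    return bit < 32 && ((features >> bit) & 1u);
}

/* Host side: clear one feature at run time without touching the others. */
uint32_t para_disable_feature(uint32_t features, unsigned int bit)
{
    return bit < 32 ? (features & ~(UINT32_C(1) << bit)) : features;
}
```

Bits 32 and above are rejected in this sketch, which mirrors the 32-feature limit mentioned above.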

v1->v2:
- replace KVM_CAP_CLOCKSOURCE with KVM_CAP_PARA_FEATURES
- cover FEATURE_CLOCKSOURCE

Signed-off-by: Marcelo Tosatti [EMAIL PROTECTED]
Cc: Anthony Liguori [EMAIL PROTECTED]

Index: kvm.paravirt2/arch/x86/Kconfig
===
--- kvm.paravirt2.orig/arch/x86/Kconfig
+++ kvm.paravirt2/arch/x86/Kconfig
@@ -382,6 +382,14 @@ config KVM_CLOCK
  provides the guest with timing infrastructure such as time of day, and
  system time
 
+config KVM_GUEST
+   bool "KVM Guest support"
+   select PARAVIRT
+   depends on !(X86_VISWS || X86_VOYAGER)
+   help
+This option enables various optimizations for running under the KVM
+hypervisor.
+
 source "arch/x86/lguest/Kconfig"
 
 config PARAVIRT
Index: kvm.paravirt2/arch/x86/kernel/Makefile
===
--- kvm.paravirt2.orig/arch/x86/kernel/Makefile
+++ kvm.paravirt2/arch/x86/kernel/Makefile
@@ -69,6 +69,7 @@ obj-$(CONFIG_DEBUG_RODATA_TEST)   += test_
 obj-$(CONFIG_DEBUG_NX_TEST)+= test_nx.o
 
 obj-$(CONFIG_VMI)  += vmi_32.o vmiclock_32.o
+obj-$(CONFIG_KVM_GUEST)+= kvm.o
 obj-$(CONFIG_KVM_CLOCK)+= kvmclock.o
 obj-$(CONFIG_PARAVIRT) += paravirt.o paravirt_patch_$(BITS).o
 
Index: kvm.paravirt2/arch/x86/kernel/kvm.c
===
--- /dev/null
+++ kvm.paravirt2/arch/x86/kernel/kvm.c
@@ -0,0 +1,52 @@
+/*
+ * KVM paravirt_ops implementation
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ *
+ * Copyright (C) 2007, Red Hat, Inc., Ingo Molnar [EMAIL PROTECTED]
+ * Copyright IBM Corporation, 2007
+ *   Authors: Anthony Liguori [EMAIL PROTECTED]
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/kvm_para.h>
+#include <linux/cpu.h>
+#include <linux/mm.h>
+
+/*
+ * No need for any IO delay on KVM
+ */
+static void kvm_io_delay(void)
+{
+}
+
+static void paravirt_ops_setup(void)
+{
+   pv_info.name = "KVM";
+   pv_info.paravirt_enabled = 1;
+
+   if (kvm_para_has_feature(KVM_FEATURE_NOP_IO_DELAY))
+   pv_cpu_ops.io_delay = kvm_io_delay;
+
+}
+
+void __init kvm_guest_init(void)
+{
+   if (!kvm_para_available())
+   return;
+
+   paravirt_ops_setup();
+}
Index: kvm.paravirt2/arch/x86/kernel/setup_32.c
===
--- kvm.paravirt2.orig/arch/x86/kernel/setup_32.c
+++ kvm.paravirt2/arch/x86/kernel/setup_32.c
@@ -784,6 +784,7 @@ void __init setup_arch(char **cmdline_p)
 */
vmi_init();
 #endif
+   kvm_guest_init();
 
/*
 * NOTE: before this point _nobody_ is allowed to allocate
Index: kvm.paravirt2/arch/x86/kernel/setup_64.c
===
--- kvm.paravirt2.orig/arch/x86/kernel/setup_64.c
+++ kvm.paravirt2/arch/x86/kernel/setup_64.c
@@ -452,6 +452,8 @@ void __init setup_arch(char **cmdline_p)
init_apic_mappings();
ioapic_init_mappings();
 
+   kvm_guest_init();
+
/*
 * We trust e820 completely. No explicit ROM probing in memory.
 */
Index: kvm.paravirt2/arch/x86/kvm/x86.c
===
--- kvm.paravirt2.orig/arch/x86/kvm/x86.c
+++ kvm.paravirt2/arch/x86/kvm/x86.c
@@ -788,7 +788,7 @@ int kvm_dev_ioctl_check_extension(long e
case KVM_CAP_USER_MEMORY:
case KVM_CAP_SET_TSS_ADDR:
case KVM_CAP_EXT_CPUID:
-   case KVM_CAP_CLOCKSOURCE:
+   case KVM_CAP_PARA_FEATURES:
r = 1;
break;
case KVM_CAP_VAPIC:
@@ -854,6 +854,15 @@ long kvm_arch_dev_ioctl(struct file *fil
r = 0;
break;
}
+   case KVM_GET_PARA_FEATURES: {
+   __u32 para_features = KVM_PARA_FEATURES;
+
+   r = 

[kvm-devel] [patch 6/5] KVM: use lockless __emulator_write_phys in kvm_hypercall_mmu_write()

2008-02-20 Thread Marcelo Tosatti

Subject says it all.

Signed-off-by: Marcelo Tosatti [EMAIL PROTECTED]

Index: kvm.paravirt2/arch/x86/kvm/x86.c
===
--- kvm.paravirt2.orig/arch/x86/kvm/x86.c
+++ kvm.paravirt2/arch/x86/kvm/x86.c
@@ -2400,7 +2400,7 @@ static int kvm_hypercall_mmu_write(struc
bytes = 4;
}
 
-   if (!emulator_write_phys(vcpu, addr, &value, bytes))
+   if (!__emulator_write_phys(vcpu, addr, &value, bytes))
return -KVM_EFAULT;
 
return 0;



[kvm-devel] [patch 4/5] KVM: ignore zapped root pagetables (v2)

2008-02-20 Thread Marcelo Tosatti
Mark zapped root pagetables as invalid and ignore such pages during lookup.

This is a problem with the cr3-target feature, where a zapped root table fools
the faulting code into creating a read-only mapping. The result is a lockup
if the instruction can't be emulated.

v1->v2:
- reload mmu of remote CPUs on root invalidation
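
The lookup rule can be illustrated with a small stand-alone sketch. It uses a flat array instead of the hash buckets in mmu.c, and `lookup_page` and `zap_page` are hypothetical helpers that only model the invalid-flag logic.

```c
#include <stddef.h>

struct shadow_page {
    unsigned long gfn;
    unsigned int invalid;      /* set when a still-referenced root is zapped */
    unsigned int root_count;   /* how many vcpus use this page as a root */
};

/* Return the page shadowing 'gfn', ignoring pages marked invalid. */
struct shadow_page *lookup_page(struct shadow_page *pages, size_t n,
                                unsigned long gfn)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (pages[i].gfn == gfn && !pages[i].invalid)
            return &pages[i];
    return NULL;
}

/*
 * Zap a page: returns 1 if it can be freed right away (no root users),
 * otherwise marks it invalid and returns 0 so the free is deferred.
 */
int zap_page(struct shadow_page *sp)
{
    if (sp->root_count == 0)
        return 1;
    sp->invalid = 1;
    return 0;
}
```

A zapped root that is still in use stays allocated but is no longer returned by the lookup, so the fault path cannot mistake it for a live shadow page.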

Signed-off-by: Marcelo Tosatti [EMAIL PROTECTED]
Cc: Anthony Liguori [EMAIL PROTECTED]

Index: kvm.paravirt2/arch/x86/kvm/mmu.c
===
--- kvm.paravirt2.orig/arch/x86/kvm/mmu.c
+++ kvm.paravirt2/arch/x86/kvm/mmu.c
@@ -668,7 +668,8 @@ static struct kvm_mmu_page *kvm_mmu_look
index = kvm_page_table_hashfn(gfn);
bucket = &kvm->arch.mmu_page_hash[index];
hlist_for_each_entry(sp, node, bucket, hash_link)
-   if (sp->gfn == gfn && !sp->role.metaphysical) {
+   if (sp->gfn == gfn && !sp->role.metaphysical
+   && !sp->role.invalid) {
pgprintk("%s: found role %x\n",
 __FUNCTION__, sp->role.word);
return sp;
@@ -796,8 +797,11 @@ static void kvm_mmu_zap_page(struct kvm 
if (!sp->root_count) {
hlist_del(&sp->hash_link);
kvm_mmu_free_page(kvm, sp);
-   } else
+   } else {
list_move(&sp->link, &kvm->arch.active_mmu_pages);
+   sp->role.invalid = 1;
+   kvm_reload_remote_mmus(kvm);
+   }
kvm_mmu_reset_last_pte_updated(kvm);
 }
 
@@ -1067,6 +1071,8 @@ static void mmu_free_roots(struct kvm_vc
 
sp = page_header(root);
--sp->root_count;
+   if (!sp->root_count && sp->role.invalid)
+   kvm_mmu_zap_page(vcpu->kvm, sp);
vcpu->arch.mmu.root_hpa = INVALID_PAGE;
spin_unlock(&vcpu->kvm->mmu_lock);
return;
@@ -1079,6 +1085,8 @@ static void mmu_free_roots(struct kvm_vc
root &= PT64_BASE_ADDR_MASK;
sp = page_header(root);
--sp->root_count;
+   if (!sp->root_count && sp->role.invalid)
+   kvm_mmu_zap_page(vcpu->kvm, sp);
}
vcpu->arch.mmu.pae_root[i] = INVALID_PAGE;
}
Index: kvm.paravirt2/include/asm-x86/kvm_host.h
===
--- kvm.paravirt2.orig/include/asm-x86/kvm_host.h
+++ kvm.paravirt2/include/asm-x86/kvm_host.h
@@ -140,6 +140,7 @@ union kvm_mmu_page_role {
unsigned pad_for_nice_hex_output : 6;
unsigned metaphysical : 1;
unsigned access : 3;
+   unsigned invalid : 1;
};
 };
 
Index: kvm.paravirt2/arch/x86/kvm/x86.c
===
--- kvm.paravirt2.orig/arch/x86/kvm/x86.c
+++ kvm.paravirt2/arch/x86/kvm/x86.c
@@ -2743,6 +2743,10 @@ preempted:
kvm_x86_ops->guest_debug_pre(vcpu);
 
 again:
+   if (vcpu->requests)
+   if (test_and_clear_bit(KVM_REQ_MMU_RELOAD, &vcpu->requests))
+   kvm_mmu_unload(vcpu);
+
r = kvm_mmu_reload(vcpu);
if (unlikely(r))
goto out;
@@ -2774,6 +2778,14 @@ again:
goto out;
}
 
+   if (vcpu->requests)
+   if (test_bit(KVM_REQ_MMU_RELOAD, &vcpu->requests)) {
+   local_irq_enable();
+   preempt_enable();
+   r = 1;
+   goto out;
+   }
+
if (signal_pending(current)) {
local_irq_enable();
preempt_enable();
Index: kvm.paravirt2/include/linux/kvm_host.h
===
--- kvm.paravirt2.orig/include/linux/kvm_host.h
+++ kvm.paravirt2/include/linux/kvm_host.h
@@ -37,6 +37,7 @@
 #define KVM_REQ_TLB_FLUSH  0
 #define KVM_REQ_MIGRATE_TIMER  1
 #define KVM_REQ_REPORT_TPR_ACCESS  2
+#define KVM_REQ_MMU_RELOAD 3
 
 struct kvm_vcpu;
 extern struct kmem_cache *kvm_vcpu_cache;
@@ -190,6 +191,7 @@ void kvm_resched(struct kvm_vcpu *vcpu);
 void kvm_load_guest_fpu(struct kvm_vcpu *vcpu);
 void kvm_put_guest_fpu(struct kvm_vcpu *vcpu);
 void kvm_flush_remote_tlbs(struct kvm *kvm);
+void kvm_reload_remote_mmus(struct kvm *kvm);
 
 long kvm_arch_dev_ioctl(struct file *filp,
unsigned int ioctl, unsigned long arg);
Index: kvm.paravirt2/virt/kvm/kvm_main.c
===
--- kvm.paravirt2.orig/virt/kvm/kvm_main.c
+++ kvm.paravirt2/virt/kvm/kvm_main.c
@@ -119,6 +119,29 @@ void kvm_flush_remote_tlbs(struct kvm *k
smp_call_function_mask(cpus, ack_flush, NULL, 1);
 }
 
+void kvm_reload_remote_mmus(struct kvm *kvm)
+{
+   int i, cpu;
+   cpumask_t cpus;
+   struct kvm_vcpu *vcpu;
+
+   cpus_clear(cpus);

Re: [kvm-devel] [PATCH] mmu notifiers #v6

2008-02-20 Thread Jack Steiner
On Wed, Feb 20, 2008 at 11:39:42AM +0100, Andrea Arcangeli wrote:
 Given Nick's comments I ported my version of the mmu notifiers to
 latest mainline. There are no known bugs AFAIK and it's obviously safe
 (nothing is allowed to schedule inside rcu_read_lock taken by
 mmu_notifier() with my patch).
 

I ported the GRU driver to use the latest #v6 patch and ran a series of
tests on it using our system simulator. The simulator is slow so true
stress or swapping is not possible - at least within a finite amount of
time.

Functionally, the #v6 patch seems to work for the GRU. However, I did
notice two significant differences that make the #v6 performance worse for
the GRU than Christoph's patch.  I think one difference is easily fixable
but the other is more difficult:

- the location of the mmu_notifier_release() callout is at a
  different place in the 2 patches. Christoph has the callout
  BEFORE the call to unmap_vmas() whereas you have it AFTER. The
  net result is that the GRU does a LOT of 1-page TLB flushes
  during process teardown.  These flushes are not done with
  Christoph's patch.

- the range callouts in Christoph's patch benefit the GRU because
  multiple TLB entries can be flushed with a single GRU
  instruction (the GRU hardware supports a range flush using a
  vaddr & length).  The #v6 patch does a TLB flush for each page in
  the range.  Flushing on the GRU is slow so being able to flush
  multiple pages with a single request is a benefit.

Seems like the latter difference could be significant for other users
of mmu notifiers.
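
The cost difference between the two callout styles can be made concrete with a short sketch. It assumes 4 KiB pages and simply counts flush requests; `flushes_per_page` and `flushes_ranged` are hypothetical helpers, not GRU driver or mmu notifier functions.

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL   /* assumed page size */

/* Flush requests needed when every page in [start, start + len) is flushed alone. */
size_t flushes_per_page(unsigned long start, unsigned long len)
{
    unsigned long first, last;

    if (len == 0)
        return 0;
    first = start / SKETCH_PAGE_SIZE;
    last = (start + len - 1) / SKETCH_PAGE_SIZE;
    return (size_t)(last - first + 1);
}

/* With a range-capable flush (vaddr + length), one request covers the range. */
size_t flushes_ranged(unsigned long start, unsigned long len)
{
    (void)start;
    return len ? 1 : 0;
}
```

For a 1 MiB range this is 256 requests against one, which is why the per-page behaviour hurts on hardware where each flush is slow.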


--- jack



Re: [kvm-devel] Widescreen troubles again -- solved

2008-02-20 Thread Haydn Solomon
Great Job,

Any chance of getting it up to a resolution of 1920x1200? Looking forward to
these updates.

On Wed, Feb 20, 2008 at 1:00 PM, Andreas Winkelbauer 
[EMAIL PROTECTED] wrote:

 hi,

 I managed to get kvm-61 working at a resolution of 1680x1050 (using
 -std-vga;
 windowed as well as fullscreen) with windows xp as guest os.

 Basically I took the following steps:
 * increase vga memory from 8MB to 16MB (in vbetables-gen.c)
 * add modes to vbetables-gen.c
 * change qemu/hw/vga_int.h accordingly

 I'll post a patch according to my changes soon.

 cheers,
 Andi




Re: [kvm-devel] Widescreen troubles again -- solved

2008-02-20 Thread Andreas Winkelbauer
hi,

I managed to get kvm-61 working at a resolution of 1680x1050 (using -std-vga;
windowed as well as fullscreen) with windows xp as guest os.

Basically I took the following steps:
* increase vga memory from 8MB to 16MB (in vbetables-gen.c)
* add modes to vbetables-gen.c
* change qemu/hw/vga_int.h accordingly

I'll post a patch according to my changes soon.

cheers,
Andi




[kvm-devel] [PATCH] fix widescreen resolution issues

2008-02-20 Thread Andreas Winkelbauer

hi,

the attached patch fixes the issues with widescreen resolutions for me 
when using -std-vga.


in vgabios/vbetables-gen.c I changed the video memory from 8MB to 16MB 
which is sufficient for resolutions up to 2560x1600. I've also added 
some more video modes (up to 2560x1600) with 16, 24, 32bit color depth 
each and I've removed the duplicate for the 1280x960 mode.
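
The 16MB figure can be checked with the usual framebuffer formula (width x height x bytes per pixel). The sketch below uses hypothetical helper names and counts 24-bit modes at 3 bytes per pixel, which is an assumption; a device may pad them to 4.

```c
#include <stdint.h>

/* Bytes needed for one frame: width x height x bytes per pixel. */
uint64_t fb_bytes(uint32_t width, uint32_t height, uint32_t bpp)
{
    return (uint64_t)width * height * ((bpp + 7) / 8);
}

/* Returns 1 if the mode fits into vram_mb MiB of video memory. */
int mode_fits(uint32_t width, uint32_t height, uint32_t bpp, uint32_t vram_mb)
{
    return fb_bytes(width, height, bpp) <= (uint64_t)vram_mb * 1024 * 1024;
}
```

2560x1600 at 32 bit needs 16,384,000 bytes, just under 16 MiB (16,777,216 bytes), while 1920x1200 at 32 bit needs 9,216,000 bytes and therefore does not fit into 8 MiB.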


in qemu/hw/vga_int.h I've adapted the maximum resolution accordingly.

you can download the updated vgabios from http://www.wina.at/vgabios.bin

I would appreciate any suggestions, comments and of course testing.

@Arne Brutschy: could you please test the 1920x1200 resolution with your 
setup?


cheers,
Andi
--- kvm-61.orig/vgabios/vbetables-gen.c	2008-02-19 15:58:28.0 +0100
+++ kvm-61/vgabios/vbetables-gen.c	2008-02-20 19:22:48.0 +0100
@@ -2,7 +2,7 @@
 #include <stdlib.h>
 #include <stdio.h>
 
-#define VBE_DISPI_TOTAL_VIDEO_MEMORY_MB 8
+#define VBE_DISPI_TOTAL_VIDEO_MEMORY_MB 16
 
 typedef struct {
 int width;
@@ -55,18 +55,27 @@
 { 1152, 864, 16  , 0x14a},
 { 1152, 864, 24  , 0x14b},
 { 1152, 864, 32  , 0x14c},
-{ 1280, 800, 24  , 0x178},
-{ 1280, 800, 32  , 0x179},
-{ 1280, 960, 24  , 0x180},
-{ 1280, 960, 32  , 0x181},
-{ 1280, 960, 24  , 0x182},
-{ 1280, 960, 32  , 0x183},
-{ 1440, 900, 24  , 0x184},
-{ 1440, 900, 32  , 0x185},
-{ 1400, 1050, 24 , 0x186},
-{ 1400, 1050, 32 , 0x187},
-{ 1680, 1050, 24 , 0x188},
-{ 1680, 1050, 32 , 0x189},
+{ 1280, 800, 16  , 0x178},
+{ 1280, 800, 24  , 0x179},
+{ 1280, 800, 32  , 0x17a},
+{ 1280, 960, 16  , 0x17b},
+{ 1280, 960, 24  , 0x17c},
+{ 1280, 960, 32  , 0x17d},
+{ 1440, 900, 16  , 0x17e},
+{ 1440, 900, 24  , 0x17f},
+{ 1440, 900, 32  , 0x180},
+{ 1400, 1050, 16 , 0x181},
+{ 1400, 1050, 24 , 0x182},
+{ 1400, 1050, 32 , 0x183},
+{ 1680, 1050, 16 , 0x184},
+{ 1680, 1050, 24 , 0x185},
+{ 1680, 1050, 32 , 0x186},
+{ 1920, 1200, 16 , 0x187},
+{ 1920, 1200, 24 , 0x188},
+{ 1920, 1200, 32 , 0x189},
+{ 2560, 1600, 16 , 0x18a},
+{ 2560, 1600, 24 , 0x18b},
+{ 2560, 1600, 32 , 0x18c},
 { 0, },
 };

--- kvm-61.orig/qemu/hw/vga_int.h	2008-02-19 15:58:28.0 +0100
+++ kvm-61/qemu/hw/vga_int.h	2008-02-20 19:25:35.0 +0100
@@ -30,8 +30,8 @@
 /* bochs VBE support */
 #define CONFIG_BOCHS_VBE
 
-#define VBE_DISPI_MAX_XRES  1600
-#define VBE_DISPI_MAX_YRES  1200
+#define VBE_DISPI_MAX_XRES  2560
+#define VBE_DISPI_MAX_YRES  1600
 #define VBE_DISPI_MAX_BPP   32
 
 #define VBE_DISPI_INDEX_ID  0x0
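
The consistency the patch relies on (unique mode numbers, no duplicate entries, every mode within the maximum resolution and the video memory) can be sketched as a small check. This is an illustrative Python script, not something shipped with vgabios; `check_modes` is a hypothetical helper, and the per-mode memory estimate (width x height x depth / 8) is an assumption:

```python
MIB = 1024 * 1024

# (width, height, depth, mode number) as listed in the patched vbetables-gen.c
MODES = [
    (1280, 800, 16, 0x178), (1280, 800, 24, 0x179), (1280, 800, 32, 0x17a),
    (1280, 960, 16, 0x17b), (1280, 960, 24, 0x17c), (1280, 960, 32, 0x17d),
    (1440, 900, 16, 0x17e), (1440, 900, 24, 0x17f), (1440, 900, 32, 0x180),
    (1400, 1050, 16, 0x181), (1400, 1050, 24, 0x182), (1400, 1050, 32, 0x183),
    (1680, 1050, 16, 0x184), (1680, 1050, 24, 0x185), (1680, 1050, 32, 0x186),
    (1920, 1200, 16, 0x187), (1920, 1200, 24, 0x188), (1920, 1200, 32, 0x189),
    (2560, 1600, 16, 0x18a), (2560, 1600, 24, 0x18b), (2560, 1600, 32, 0x18c),
]


def check_modes(modes, vram_mb, max_xres, max_yres):
    """Return a list of human-readable problems found in a mode table."""
    problems = []
    seen_numbers = set()
    seen_geometry = set()
    for width, height, depth, number in modes:
        name = f"{width}x{height}x{depth}"
        if number in seen_numbers:
            problems.append(f"{name}: mode number {number:#x} reused")
        seen_numbers.add(number)
        if (width, height, depth) in seen_geometry:
            problems.append(f"{name}: duplicate entry")
        seen_geometry.add((width, height, depth))
        if width > max_xres or height > max_yres:
            problems.append(f"{name}: exceeds maximum {max_xres}x{max_yres}")
        if width * height * ((depth + 7) // 8) > vram_mb * MIB:
            problems.append(f"{name}: does not fit into {vram_mb}MB")
    return problems


if __name__ == "__main__":
    # Patched values: 16MB of video memory, maximum resolution 2560x1600.
    for problem in check_modes(MODES, 16, 2560, 1600):
        print(problem)
```

Running the same check with the old values (8MB, 1600x1200) flags the 1680x1050 and larger modes, and feeding it the old table's repeated 1280x960 lines reports the duplicate entry.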



[kvm-devel] kvm-lite status

2008-02-20 Thread Fabio Checconi
Hi,
what's the status of kvm-lite?  Has anybody worked on it since
last September's posting [1]?

I'd like to play a little bit with the idea and with the code, so I would
be really thankful to you or anybody else for providing any relevant info
on this subject.


[1] http://thread.gmane.org/gmane.comp.emulators.kvm.devel/6585/focus=6599




Re: [kvm-devel] [patch 5/6] mmu_notifier: Support for drivers with revers maps (f.e. for XPmem)

2008-02-20 Thread Nick Piggin
On Wednesday 20 February 2008 20:00, Robin Holt wrote:
 On Wed, Feb 20, 2008 at 02:51:45PM +1100, Nick Piggin wrote:
  On Wednesday 20 February 2008 14:12, Robin Holt wrote:
   For XPMEM, we do not currently allow file backed
   mapping pages from being exported so we should never reach this
   condition. It has been an issue since day 1.  We have operated with
   that assumption for 6 years and have not had issues with that
   assumption.  The user of xpmem is MPT and it controls the communication
   buffers so it is reasonable to expect this type of behavior.
 
  OK, that makes things simpler.
 
  So why can't you export a device from your xpmem driver, which
  can be mmap()ed to give out anonymous memory pages to be used
  for these communication buffers?

 Because we need to have heap and stack available as well.  MPT does
 not control all the communication buffer areas.  I haven't checked, but
 this is the same problem that IB will have.  I believe they are actually
 allowing any memory region be accessible, but I am not sure of that.

Then you should create a driver that the user program can register
and unregister regions of their memory with. The driver can do a
get_user_pages to get the pages, and then you'd just need to set up
some kind of mapping so that userspace can unmap pages / won't leak
memory (and an exit_mm notifier I guess).

Because you don't need to swap, you don't need coherency, and you
are in control of the areas, then this seems like the best choice.
It would allow you to use heap, stack, file-backed, anything.




Re: [kvm-devel] [patch] my mmu notifiers

2008-02-20 Thread Nick Piggin
On Tue, Feb 19, 2008 at 05:40:50PM -0600, Jack Steiner wrote:
 On Wed, Feb 20, 2008 at 12:11:57AM +0100, Nick Piggin wrote:
  On Tue, Feb 19, 2008 at 02:58:51PM +0100, Andrea Arcangeli wrote:
   On Tue, Feb 19, 2008 at 09:43:57AM +0100, Nick Piggin wrote:
anything when changing the pte to be _more_ permissive, and I don't
   
   Note that in my patch the invalidate_pages in mprotect can be
   trivially switched to a mprotect_pages with proper params. This will
   prevent page faults completely in the secondary MMU (there will only
   be tlb misses after the tlb flush just like for the core linux pte),
   and it'll allow all the secondary MMU pte blocks (512/1024 at time
   with my PT lock design) to be updated to have proper permissions
   matching the core linux pte.
  
  Sorry, I realise I still didn't get this through my head yet (and also
  have not seen your patch recently). So I don't know exactly what you
  are doing...
  
  But why does _anybody_ (why does Christoph's patches) need to invalidate
  when they are going to be more permissive? This should be done lazily by
  the driver, I would have thought.
 
 
 Agree. Although for most real applications, the performance difference
 is probably negligible.

But importantly, doing it that way means you share test coverage with
the CPU TLB flushing code, and you don't introduce a new concept to the
VM.

So, it _has_ to be lazy flushing, IMO (as there doesn't seem to be a
good reason otherwise). mprotect shouldn't really be a special case,
because it still has to flush the CPU tlbs as well when restricting
access.



Re: [kvm-devel] [patch] my mmu notifiers

2008-02-20 Thread Nick Piggin
On Wed, Feb 20, 2008 at 02:09:41AM +0100, Andrea Arcangeli wrote:
 On Wed, Feb 20, 2008 at 12:11:57AM +0100, Nick Piggin wrote:
  Sorry, I realise I still didn't get this through my head yet (and also
  have not seen your patch recently). So I don't know exactly what you
  are doing...
 
 The last version was posted here:
 
 http://marc.info/?l=kvm-devel&m=120321732521533&w=2
 
  But why does _anybody_ (why does Christoph's patches) need to invalidate
  when they are going to be more permissive? This should be done lazily by
  the driver, I would have thought.
 
 This can be done lazily by the driver yes. The place where I've an
 invalidate_pages in mprotect however can also become less permissive.

That's OK, because we have to flush tlbs there too.


 It's simpler to invalidate always and it's not guaranteed the
 secondary mmu page fault is capable of refreshing the spte across a
 writeprotect fault.

I think we just have to make sure that it _can_ do writeprotect
faults. AFAIKS, that will be possible if the driver registers a
.page_mkwrite handler (actually not quite -- page_mkwrite is fairly
crap, so I have a patch to merge it together with .fault so we get
address information as well). Anyway, I really think we should do
it that way.

 In the future this can be changed to
 mprotect_pages though, so no page fault will happen in the secondary
 mmu.

Possibly, but hopefully not needed for performance. Let's wait and
see.



Re: [kvm-devel] [PATCH] mmu notifiers #v6

2008-02-20 Thread Nick Piggin
On Wed, Feb 20, 2008 at 11:39:42AM +0100, Andrea Arcangeli wrote:
 Given Nick's comments I ported my version of the mmu notifiers to
 latest mainline. There are no known bugs AFAIK and it's obviously safe
 (nothing is allowed to schedule inside rcu_read_lock taken by
 mmu_notifier() with my patch).

Thanks! Yes the seqlock you are using now ends up looking similar
to what I did and I couldn't find a hole in that either. So I
think this is going to work.

I do prefer some parts of my patch, however for everyone's sanity,
I think you should be the maintainer of the mmu notifiers, and I
will send you incremental changes that can be discussed more easily
that way (nothing major, mainly style and minor things).


 XPMEM simply can't use RCU for the registration locking if it wants to
 schedule inside the mmu notifier calls. So I guess it's better to add
 the XPMEM invalidate_range_end/begin/external-rmap as a whole
 different subsystem that will have to use a mutex (not RCU) to
 serialize, and at the same time that CONFIG_XPMEM will also have to
 switch the i_mmap_lock to a mutex. I doubt xpmem fits inside a
 CONFIG_MMU_NOTIFIER anymore, or we'll all run a bit slower because of
 it. It's really a call of how much we want to optimize the MMU
 notifier, by keeping things like RCU for the registration.

I agree: your coherent, non-sleeping mmu notifiers are pretty simple
and unintrusive. The sleeping version is fundamentally going to either
need to change VM locks, or be non-coherent, so I don't think there is
a question of making one solution fit everybody. So the sleeping /
xrmap patch should be kept either completely independent, or as an
add-on to this one.

I will post some suggestions to you when I get a chance.

 



Re: [kvm-devel] [PATCH] mmu notifiers #v6

2008-02-20 Thread Nick Piggin
On Wed, Feb 20, 2008 at 01:03:24PM +0100, Andrea Arcangeli wrote:
 If there's agreement that the VM should alter its locking from
 spinlock to mutex for its own good, then Christoph's
 one-config-option-fits-all becomes a lot more appealing (replacing RCU
 with a mutex in the mmu notifier list registration locking isn't my
 main worry and the non-sleeping-users may be ok to live with it).

Just from a high level view, in some cases we can just say that no we
aren't going to support this. And this may well be one of those cases.

The more constraints placed on the VM, the harder it becomes to
improve and adapt in future. And this seems like a pretty big restriction.
(especially if we can eg. work around it completely by having a special
purpose driver to get_user_pages on comm buffers as I suggested in the
other mail).

At any rate, I believe Andrea's patch really places minimal or no further
constraints than a regular CPU TLB (or the hash tables that some archs
implement). So we're kind of in 2 different leagues here.



Re: [kvm-devel] KVM Testing Result for KVM-61

2008-02-20 Thread Zhao, Yunfeng
Izik Eidus wrote:
 On Wed, 2008-02-20 at 21:12 +0800, Zhao, Yunfeng wrote:
 
 On Wed, 2008-02-20 at 20:58 +0800, Zhao, Yunfeng wrote:
 Five old issues:
 1. Fails to save/restore guests
 Save/restore may cause host to hang.
 
 https://sourceforge.net/tracker/index.php?func=detail&aid=1824525&group_id=180599&atid=893831
 
 savevm loadvm does not work, but it doesnt crush my host
 what you see in the dmesg? (the bug with the bad page
 was fixed no?)
 Here is the error message in the dmesg and host console:
 
 is it happen on every guest?
 you are using qcow right?
 It doesn't happen on every guest, but the probability of this error
 is very high. Yes, I am using qcow images.
 
 Zhao, it seems like i just dont want to happen on my host
 can you please give me as much information about the guest/host and
 even on how you save the vm?

Here are the commands we are using to do the test:
qemu-img create -b /share/xvs/img/app/ia32p_UP.img -f qcow2 /share/xvs/var/tmp-img-1
qemu-system-x86_64 -m 256 -net nic,macaddr=00:16:3e:25:b8:87,model=rtl8139 -net tap,script=/etc/kvm/qemu-ifup -hda /share/xvs/var/tmp-img-1
migrate "exec:dd of=FILENAME(/share/install/a.img)"
qemu-system-x86_64 -m 256 -net nic,macaddr=00:16:3e:25:b8:87,model=rtl8139 -net tap,script=/etc/kvm/qemu-ifup -hda /share/xvs/var/tmp-img-1 -incoming stdio < FILENAME(/share/install/a.img)

Both Linux and Windows guests have the problem.
The host is a Woodcrest with 8GB of memory. The qcow backing file is on an NFS
server.

thanks
Yunfeng



Re: [kvm-devel] howto set up a virtual firewall?

2008-02-20 Thread Kurt Neufeld

Avi Kivity wrote:
 
 Assuming you have eth0 on the host, tap0 on the host visible as eth0 in 
 the guest, and tap1 in the host visible as eth1 in the guest, you can 
 add a bridge between eth0 and tap0, and use tap1 as the nic in the host 
 for IP (e.g. run 'dhclient tap1' to obtain an internal IP address).

It turns out I did have everything correctly configured but it still 
doesn't work. The problem is that I cannot get a DHCP address on my vm.

I can see the DHCP Request packets going out and can see the Replies 
getting back to my physical card that I'm running tcpdump on. But for 
some reason the vm doesn't get/see them. The host has no iptables rules, 
all policies set to ACCEPT (yikes!).

I even tried 'echo 1 > /proc/sys/net/ipv4/conf/*/bootp_relay' but that
didn't help.

If I configure the vm nic with a static address (the one that my host 
just gave up) then I can surf the net, even forward packets from my host 
machine that no longer has a public ip address. Unfortunately that is 
not an acceptable long term solution.

Some general questions, should br0 be up or down? What should my vm MAC 
be? The same as my physical card (peth) which is also the same as the 
bridge (br0)? The vnet0 does not match. (output later)

Somewhat related, I set up my internal nic as a bridge as well, but I
can't get the vm to get a DHCP address there either. Can one member of a
bridge get a DHCP address from another member of the bridge?

I'm running Fedora 8 with kernel 2.6.23.15-137.fc8 if that makes any
difference.

[EMAIL PROTECTED] ~]
# brctl show
bridge name     bridge id               STP enabled     interfaces
br0             8000.0050047fb5a3       no              peth0
                                                        vnet0
br1             8000.001617d8fc32       no              peth1
                                                        vnet1

br0 is external
br1 is internal

[EMAIL PROTECTED] ~]
# ifconfig |grep HWaddr
br0   Link encap:Ethernet  HWaddr 00:50:04:7F:B5:A3
br1   Link encap:Ethernet  HWaddr 00:16:17:D8:FC:32
peth0 Link encap:Ethernet  HWaddr 00:50:04:7F:B5:A3
peth1 Link encap:Ethernet  HWaddr 00:16:17:D8:FC:32
vnet0 Link encap:Ethernet  HWaddr 00:FF:79:58:28:0F
vnet1 Link encap:Ethernet  HWaddr 00:FF:DB:40:5D:D2


Thanks for the replies, please keep them coming!

Kurt



Re: [kvm-devel] KVM Testing Result for KVM-61

2008-02-20 Thread Zhao, Yunfeng
Zhao, Yunfeng wrote:
 Izik Eidus wrote:
 On Wed, 2008-02-20 at 21:12 +0800, Zhao, Yunfeng wrote:
 
 On Wed, 2008-02-20 at 20:58 +0800, Zhao, Yunfeng wrote:
 Five old issues:
 1. Fails to save/restore guests
 Save/restore may cause host to hang.
 
 https://sourceforge.net/tracker/index.php?func=detail&aid=1824525&group_id=180599&atid=893831
 
 savevm loadvm does not work, but it doesnt crush my host
 what you see in the dmesg? (the bug with the bad page
 was fixed no?)
 Here is the error message in the dmesg and host console:
 
 is it happen on every guest?
 you are using qcow right?
 It doesn't happen on every guest, but the probability of this error
 is very high. Yes, I am using qcow images.
 
 Zhao, it seems like i just dont want to happen on my host
 can you please give me as much information about the guest/host and
 even on how you save the vm?
 
 Here are the commands we are using to do the test:
 qemu-img create -b /share/xvs/img/app/ia32p_UP.img -f qcow2 /share/xvs/var/tmp-img-1
 qemu-system-x86_64 -m 256 -net nic,macaddr=00:16:3e:25:b8:87,model=rtl8139 -net tap,script=/etc/kvm/qemu-ifup -hda /share/xvs/var/tmp-img-1
 migrate "exec:dd of=FILENAME(/share/install/a.img)"
 qemu-system-x86_64 -m 256 -net nic,macaddr=00:16:3e:25:b8:87,model=rtl8139 -net tap,script=/etc/kvm/qemu-ifup -hda /share/xvs/var/tmp-img-1 -incoming stdio < FILENAME(/share/install/a.img)
 
 Both linux guests and windows guests have the problem.
 The host is a woodcrest with 8GB memory. The backend file of qcow is
 on a nfs server. 
 
 thanks
 Yunfeng
BTW: live migration works well, and using a raw image file has the same problem.

