Re: [kvm-devel] [kvm-ppc-devel] [PATCH 1/5] Add some trace markers and expose interfaces in kernel for tracing

2008-04-18 Thread Christian Ehrhardt
Liu, Eric E wrote:
> Hollis Blanchard wrote:
>> On Wednesday 16 April 2008 01:45:34 Liu, Eric E wrote:
[...]
>> Actually... we could have kvmtrace itself insert the metadata, so
>> there would be no chance of it being overwritten in the kernel
>> buffers. The header could be written in tip_open_output(), and update
>> fs_size accordingly.
>
> Yes, let kvmtrace insert the metadata is more reasonable.

I wanted to note that the kvmtrace tool may know, but should not need to know,
everything about the data format.
I am thinking, for example, of kernel implementations that change endianness,
or of flags we don't know about yet but might need in the future.

What about adding another debugfs entry that the kernel can use to expose the
kvmtrace metadata defined by the kernel implementation?
The kvmtrace tool could then build up the record by using one entry for
kernel-defined metadata and another for any metadata defined by the kvmtrace
tool itself.

What about this one:
struct metadata {
	u32 kmagic; /* stores kernel defined metadata read from debugfs entry */
	u32 umagic; /* stores userspace tool defined metadata */
	u32 extra;  /* it is redundant, only use to fit into record. */
};

That should give us the flexibility to keep the format if we get more metadata
requirements in the future.
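
A minimal sketch of how a userspace tool might fill in such a header, assuming
the kernel exposes its magic as a text value in a debugfs entry. The function
name build_metadata, the KVMTRACE_UMAGIC value, and the text format of the
debugfs entry are hypothetical; the thread does not define them.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Layout proposed in the mail; only the field names come from the thread. */
struct metadata {
    uint32_t kmagic; /* kernel-defined metadata, as read from debugfs */
    uint32_t umagic; /* metadata defined by the userspace tool */
    uint32_t extra;  /* padding so the header fits one trace record */
};

/* Hypothetical value: the thread does not specify a userspace magic. */
#define KVMTRACE_UMAGIC 0x4b545243u

/*
 * Parse the text read from the (assumed) debugfs entry, e.g. "0x12345678\n",
 * and fill in the header. Returns 0 on success, -1 on malformed or
 * out-of-range input.
 */
int build_metadata(const char *debugfs_text, struct metadata *md)
{
    char *end;
    unsigned long long v;

    errno = 0;
    v = strtoull(debugfs_text, &end, 0);
    if (errno != 0 || end == debugfs_text || v > UINT32_MAX)
        return -1;

    md->kmagic = (uint32_t)v;      /* opaque to the tool: copied as-is */
    md->umagic = KVMTRACE_UMAGIC;
    md->extra = 0;
    return 0;
}
```

Because kmagic is copied through without interpretation, the tool would not
need to change when the kernel starts encoding new flags in it.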

-- 

Grüsse / regards, 
Christian Ehrhardt
IBM Linux Technology Center, Open Virtualization

-
This SF.net email is sponsored by the 2008 JavaOne(SM) Conference 
Don't miss this year's exciting event. There's still time to save $100. 
Use priority code J8TL2D2. 
http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] kvm-65/66 bug with Solaris 10 U4 ?

2008-04-18 Thread Ian Kirk
Avi Kivity wrote:

> Actually kvm is affected by pae: it enables nx support.  Please try
> (separately)
>
>  1.  Boot with 'noexec=off' on the host kernel command line

2.6.24.4-64.fc8PAE noexec=off:

Using normal F8 modules
qemu-kvm dies in the same way

>  2.  Loading the kernel modules that come with kvm-66

Against 2.6.24.4-64.fc8 it works.

I can't compile them against 2.6.24.4-64.fc8PAE as the module magic name
mismatches, and I don't know how to change kernel-devel to know it's PAE.

Probably won't be able to do any more tests that require a reboot till
Tuesday now, but feel free to leave me some things to try.

Ian



Re: [kvm-devel] kvm-65/66 bug with Solaris 10 U4 ?

2008-04-18 Thread Avi Kivity
Ian Kirk wrote:
> Avi Kivity wrote:
>
>> I do this regularly, basically you need to install kernel-devel and
>> that's it.
>
> Yes, that is very easy isn't it. Oops to my stupidity. I've got it built
> and will give it a go tomorrow and report back on each test case.

Please don't flame on kvm-devel, even if the flames are self-directed.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.




[kvm-devel] (no subject)

2008-04-18 Thread 钟文辉
 


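   Dear executives: Greetings!

   We sincerely wish you countless joys, countless gains, countless banknotes,
   countless blessings, and a thoroughly happy life in 2008! May your whole
   family enjoy happiness, good fortune, and good health!

   I am the person in charge of 深圳市珊湖岛进出口有限公司 (Shenzhen Shanhu
   Island Import & Export Co., Ltd.). We can provide export customs declaration
   forms, verification and write-off forms, and the whole series of related
   paperwork; act as agent for export customs declaration, commodity
   inspection, domestic and overseas transport, and so on; and also arrange EU
   export licences and EU certificates of origin. We also have booths at the
   Guangzhou International Trade Fair (广州国际贸易交易会) available for
   transfer. If interested, please contact us by email or phone.

 Telephone: 0755-81153047.

 Fax: 0755-81172940.

 Mobile: 15817477278.

 Contact: 钟文辉 (Zhong Wenhui).

 With best regards!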
 



Re: [kvm-devel] kvm-65/66 bug with Solaris 10 U4 ?

2008-04-18 Thread Ian Kirk
Avi Kivity wrote:

>> Yes, that is very easy isn't it. Oops to my stupidity. I've got it built
>> and will give it a go tomorrow and report back on each test case.
>
> Please don't flame on kvm-devel, even if the flames are self-directed.

Er, OK...





[kvm-devel] [PATCH 6/6] kvm: qemu: Enable EPT support for real mode

2008-04-18 Thread Yang, Sheng
From 73c33765f3d879001818cd0719038c78a0c65561 Mon Sep 17 00:00:00 2001
From: Sheng Yang [EMAIL PROTECTED]
Date: Fri, 18 Apr 2008 17:15:39 +0800
Subject: [PATCH] kvm: qemu: Enable EPT support for real mode

This patch builds an identity page table on the last page of the VGA BIOS, and
uses it as the guest page table in nonpaging mode for EPT.

Signed-off-by: Sheng Yang [EMAIL PROTECTED]
---
 qemu/hw/pc.c |   14 ++
 1 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/qemu/hw/pc.c b/qemu/hw/pc.c
index ae87ab9..dcb98c6 100644
--- a/qemu/hw/pc.c
+++ b/qemu/hw/pc.c
@@ -780,6 +780,9 @@ static void pc_init1(ram_addr_t ram_size, int vga_ram_size,
     int index;
     BlockDriverState *hd[MAX_IDE_BUS * MAX_IDE_DEVS];
     BlockDriverState *fd[MAX_FD];
+#ifdef USE_KVM
+    uint32_t *table_items;
+#endif
 
     if (ram_size >= 0xe0000000 ) {
         above_4g_mem_size = ram_size - 0xe0000000;
@@ -857,6 +860,17 @@ static void pc_init1(ram_addr_t ram_size, int vga_ram_size,
         exit(1);
     }
 
+#ifdef USE_KVM
+    if (kvm_allowed) {
+        /* set up identity map for EPT at the last page of VGA BIOS region.
+         * 0xe7 = _PAGE_PRESENT | _PAGE_RW | _PAGE_USER | _PAGE_ACCESSED |
+         *        _PAGE_DIRTY   | _PAGE_PSE */
+        table_items = (void *)(phys_ram_base + vga_bios_offset + 0xf000);
+        for (i = 0; i < 1024; i++)
+            table_items[i] = (i << 22) + 0xe7;
+    }
+#endif
+
     /* above 4giga memory allocation */
     if (above_4g_mem_size > 0) {
         ram_addr = qemu_ram_alloc(above_4g_mem_size);
--
1.5.4.5
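
A small standalone sketch of the table this patch fills in, assuming a 32-bit
non-PAE page directory of 1024 entries where each entry maps one 4 MiB page
(PSE). The names build_identity_map and ident_translate are hypothetical
helpers for illustration and do not appear in the patch.

```c
#include <stdint.h>

#define IDENT_ENTRIES 1024u /* one 32-bit page directory covers 4 GiB */
#define IDENT_FLAGS   0xe7u /* present | rw | user | accessed | dirty | PSE */

/* Fill a page directory so that entry i maps the 4 MiB region at i << 22. */
void build_identity_map(uint32_t table[IDENT_ENTRIES])
{
    uint32_t i;

    for (i = 0; i < IDENT_ENTRIES; i++)
        table[i] = (i << 22) + IDENT_FLAGS;
}

/* Translate an address through the directory (4 MiB pages only). */
uint32_t ident_translate(const uint32_t table[IDENT_ENTRIES], uint32_t addr)
{
    uint32_t pde = table[addr >> 22];

    /* The top 10 bits come from the entry, the low 22 bits are the offset. */
    return (pde & 0xffc00000u) | (addr & 0x003fffffu);
}
```

With this layout every address translates to itself, which is what a guest
running without paging expects.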



[kvm-devel] [PATCH 1/6] KVM: VMX: EPT Feature Detection

2008-04-18 Thread Yang, Sheng
From 9e723871299268e844c9e72f3903ba5f4eb71751 Mon Sep 17 00:00:00 2001
From: Sheng Yang [EMAIL PROTECTED]
Date: Fri, 18 Apr 2008 17:02:59 +0800
Subject: [PATCH 1/5] KVM: VMX: EPT Feature Detection


Signed-off-by: Sheng Yang [EMAIL PROTECTED]
---
 arch/x86/kvm/vmx.c |   63 +++
 arch/x86/kvm/vmx.h |   25 
 2 files changed, 83 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 8e5d664..d93250d 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -42,6 +42,9 @@ module_param(enable_vpid, bool, 0);
 static int flexpriority_enabled = 1;
 module_param(flexpriority_enabled, bool, 0);

+static int enable_ept;
+module_param(enable_ept, bool, 0);
+
 struct vmcs {
u32 revision_id;
u32 abort;
@@ -107,6 +110,11 @@ static struct vmcs_config {
u32 vmentry_ctrl;
 } vmcs_config;

+struct vmx_capability {
+   u32 ept;
+   u32 vpid;
+} vmx_capability;
+
 #define VMX_SEGMENT_FIELD(seg) \
[VCPU_SREG_##seg] = {   \
.selector = GUEST_##seg##_SELECTOR, \
@@ -214,6 +222,32 @@ static inline bool cpu_has_vmx_virtualize_apic_accesses(void)
 		SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES);
 }
 
+static inline int cpu_has_vmx_invept_individual_addr(void)
+{
+	return (!!(vmx_capability.ept & VMX_EPT_EXTENT_INDIVIDUAL_BIT));
+}
+
+static inline int cpu_has_vmx_invept_context(void)
+{
+	return (!!(vmx_capability.ept & VMX_EPT_EXTENT_CONTEXT_BIT));
+}
+
+static inline int cpu_has_vmx_invept_global(void)
+{
+	return (!!(vmx_capability.ept & VMX_EPT_EXTENT_GLOBAL_BIT));
+}
+
+static inline int cpu_has_vmx_ept(void)
+{
+	return (vmcs_config.cpu_based_2nd_exec_ctrl &
+		SECONDARY_EXEC_ENABLE_EPT);
+}
+
+static inline int vm_need_ept(void)
+{
+	return (cpu_has_vmx_ept() && enable_ept);
+}
+
 static inline int vm_need_virtualize_apic_accesses(struct kvm *kvm)
 {
 	return ((cpu_has_vmx_virtualize_apic_accesses()) &&
@@ -985,7 +1019,7 @@ static __init int adjust_vmx_controls(u32 ctl_min, u32 ctl_opt,
 static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf)
 {
 	u32 vmx_msr_low, vmx_msr_high;
-	u32 min, opt;
+	u32 min, opt, min2, opt2;
 	u32 _pin_based_exec_control = 0;
 	u32 _cpu_based_exec_control = 0;
 	u32 _cpu_based_2nd_exec_control = 0;
@@ -1003,6 +1037,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf)
 	      CPU_BASED_CR8_LOAD_EXITING |
 	      CPU_BASED_CR8_STORE_EXITING |
 #endif
+	      CPU_BASED_CR3_LOAD_EXITING |
+	      CPU_BASED_CR3_STORE_EXITING |
 	      CPU_BASED_USE_IO_BITMAPS |
 	      CPU_BASED_MOV_DR_EXITING |
 	      CPU_BASED_USE_TSC_OFFSETING;
@@ -1018,11 +1054,13 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf)
 					   ~CPU_BASED_CR8_STORE_EXITING;
 #endif
 	if (_cpu_based_exec_control & CPU_BASED_ACTIVATE_SECONDARY_CONTROLS) {
-		min = 0;
-		opt = SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES |
+		min2 = 0;
+		opt2 = SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES |
 			SECONDARY_EXEC_WBINVD_EXITING |
-			SECONDARY_EXEC_ENABLE_VPID;
-		if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_PROCBASED_CTLS2,
+			SECONDARY_EXEC_ENABLE_VPID |
+			SECONDARY_EXEC_ENABLE_EPT;
+		if (adjust_vmx_controls(min2, opt2,
+					MSR_IA32_VMX_PROCBASED_CTLS2,
 					&_cpu_based_2nd_exec_control) < 0)
 			return -EIO;
 	}
@@ -1031,6 +1069,16 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf)
 				SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES))
 		_cpu_based_exec_control &= ~CPU_BASED_TPR_SHADOW;
 #endif
+	if (_cpu_based_2nd_exec_control & SECONDARY_EXEC_ENABLE_EPT) {
+		/* CR3 accesses don't need to cause VM Exits when EPT enabled */
+		min &= ~(CPU_BASED_CR3_LOAD_EXITING |
+			 CPU_BASED_CR3_STORE_EXITING);
+		if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_PROCBASED_CTLS,
+					&_cpu_based_exec_control) < 0)
+			return -EIO;
+		rdmsr(MSR_IA32_VMX_EPT_VPID_CAP,
+		      vmx_capability.ept, vmx_capability.vpid);
+	}

min = 0;
 #ifdef CONFIG_X86_64
@@ -1638,6 +1686,9 @@ static int vmx_vcpu_setup(struct vcpu_vmx *vmx)
CPU_BASED_CR8_LOAD_EXITING;
 #endif
}
+   if (!vm_need_ept())
+   exec_control |= CPU_BASED_CR3_STORE_EXITING |
+   CPU_BASED_CR3_LOAD_EXITING;
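
A userspace sketch of the min/opt negotiation that setup_vmcs_config() relies
on, assuming the usual convention for VMX capability MSRs: a bit set in the
low word must be 1, and a bit clear in the high word must be 0. The function
adjust_controls and its explicit msr_low/msr_high parameters are hypothetical
stand-ins for adjust_vmx_controls() and rdmsr(), which cannot run outside the
kernel.

```c
#include <stdint.h>

/*
 * msr_low:  bits the CPU requires to be 1
 * msr_high: bits the CPU allows to be 1
 * Returns 0 and stores the usable control word in *result, or -1 if a
 * required (ctl_min) bit is not supported by the CPU.
 */
int adjust_controls(uint32_t ctl_min, uint32_t ctl_opt,
                    uint32_t msr_low, uint32_t msr_high, uint32_t *result)
{
    uint32_t ctl = ctl_min | ctl_opt;

    ctl &= msr_high; /* drop anything the CPU cannot set */
    ctl |= msr_low;  /* add anything the CPU insists on */

    if (ctl_min & ~ctl)
        return -1;   /* a mandatory feature is missing */

    *result = ctl;
    return 0;
}
```

Optional bits that the CPU lacks are dropped silently, which is why the patch
can list SECONDARY_EXEC_ENABLE_EPT as optional and then test the resulting
control word to decide whether the CR3 exiting bits are still mandatory.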

[kvm-devel] [PATCH 0/6] Enable EPT on KVM v3

2008-04-18 Thread Yang, Sheng
Hi

This patchset enables EPT on KVM.

The most obvious improvement is that the separate construction of the EPT
table has been discarded completely. EPT now reuses the ordinary MMU code to
build the EPT table. The code size is greatly reduced, and this also solved
the display problem. But I think it also has an impact on scalability...

However, S/R and live migration still have problems. I am working on them now.

--
Thanks
Yang, Sheng



[kvm-devel] [PATCH 3/6] KVM: MMU: Add EPT support

2008-04-18 Thread Yang, Sheng
From cb851671421832d37c7d90976b603b59a5c75c79 Mon Sep 17 00:00:00 2001
From: Sheng Yang [EMAIL PROTECTED]
Date: Fri, 18 Apr 2008 17:05:06 +0800
Subject: [PATCH 3/5] KVM: MMU: Add EPT support

Enable kvm_set_spte() to generate EPT entries.

Signed-off-by: Sheng Yang [EMAIL PROTECTED]
---
 arch/x86/kvm/mmu.c |   44 ++--
 arch/x86/kvm/x86.c |3 +++
 include/asm-x86/kvm_host.h |3 +++
 3 files changed, 40 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 108886d..1828837 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -161,6 +161,12 @@ static struct kmem_cache *mmu_page_header_cache;

 static u64 __read_mostly shadow_trap_nonpresent_pte;
 static u64 __read_mostly shadow_notrap_nonpresent_pte;
+static u64 __read_mostly shadow_base_present_pte;
+static u64 __read_mostly shadow_nx_mask;
+static u64 __read_mostly shadow_x_mask;	/* mutual exclusive with nx_mask */
+static u64 __read_mostly shadow_user_mask;
+static u64 __read_mostly shadow_accessed_mask;
+static u64 __read_mostly shadow_dirty_mask;

 void kvm_mmu_set_nonpresent_ptes(u64 trap_pte, u64 notrap_pte)
 {
@@ -169,6 +175,23 @@ void kvm_mmu_set_nonpresent_ptes(u64 trap_pte, u64 notrap_pte)
 }
 EXPORT_SYMBOL_GPL(kvm_mmu_set_nonpresent_ptes);

+void kvm_mmu_set_base_ptes(u64 base_pte)
+{
+   shadow_base_present_pte = base_pte;
+}
+EXPORT_SYMBOL_GPL(kvm_mmu_set_base_ptes);
+
+void kvm_mmu_set_mask_ptes(u64 user_mask, u64 accessed_mask,
+   u64 dirty_mask, u64 nx_mask, u64 x_mask)
+{
+   shadow_user_mask = user_mask;
+   shadow_accessed_mask = accessed_mask;
+   shadow_dirty_mask = dirty_mask;
+   shadow_nx_mask = nx_mask;
+   shadow_x_mask = x_mask;
+}
+EXPORT_SYMBOL_GPL(kvm_mmu_set_mask_ptes);
+
 static int is_write_protection(struct kvm_vcpu *vcpu)
 {
 	return vcpu->arch.cr0 & X86_CR0_WP;
@@ -207,7 +230,7 @@ static int is_writeble_pte(unsigned long pte)
 
 static int is_dirty_pte(unsigned long pte)
 {
-	return pte & PT_DIRTY_MASK;
+	return pte & shadow_dirty_mask;
 }
 
 static int is_rmap_pte(u64 pte)
@@ -522,7 +545,7 @@ static void rmap_remove(struct kvm *kvm, u64 *spte)
 		return;
 	sp = page_header(__pa(spte));
 	pfn = spte_to_pfn(*spte);
-	if (*spte & PT_ACCESSED_MASK)
+	if (*spte & shadow_accessed_mask)
 		kvm_set_pfn_accessed(pfn);
 	if (is_writeble_pte(*spte))
 		kvm_release_pfn_dirty(pfn);
@@ -1048,17 +1071,18 @@ static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 *shadow_pte,
 	 * whether the guest actually used the pte (in order to detect
 	 * demand paging).
 	 */
-	spte = PT_PRESENT_MASK | PT_DIRTY_MASK;
+	spte = shadow_base_present_pte | shadow_dirty_mask;
 	if (!speculative)
 		pte_access |= PT_ACCESSED_MASK;
 	if (!dirty)
 		pte_access &= ~ACC_WRITE_MASK;
-	if (!(pte_access & ACC_EXEC_MASK))
-		spte |= PT64_NX_MASK;
-
-	spte |= PT_PRESENT_MASK;
+	if (pte_access & ACC_EXEC_MASK) {
+		if (shadow_x_mask)
+			spte |= shadow_x_mask;
+	} else if (shadow_nx_mask)
+		spte |= shadow_nx_mask;
 	if (pte_access & ACC_USER_MASK)
-		spte |= PT_USER_MASK;
+		spte |= shadow_user_mask;
 	if (largepage)
 		spte |= PT_PAGE_SIZE_MASK;
 
@@ -1164,7 +1188,7 @@ static int __direct_map(struct kvm_vcpu *vcpu, gpa_t v, int write,
 			}
 
 			table[index] = __pa(new_table->spt) | PT_PRESENT_MASK
-				| PT_WRITABLE_MASK | PT_USER_MASK;
+				| PT_WRITABLE_MASK | shadow_user_mask;
 		}
 		table_addr = table[index] & PT64_BASE_ADDR_MASK;
 	}
@@ -1608,7 +1632,7 @@ static bool last_updated_pte_accessed(struct kvm_vcpu *vcpu)
 {
 	u64 *spte = vcpu->arch.last_pte_updated;
 
-	return !!(spte && (*spte & PT_ACCESSED_MASK));
+	return !!(spte && (*spte & shadow_accessed_mask));
 }
 }

 static void mmu_guess_page_from_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 0ce5563..0735efb 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2417,6 +2417,9 @@ int kvm_arch_init(void *opaque)

kvm_x86_ops = ops;
kvm_mmu_set_nonpresent_ptes(0ull, 0ull);
+   kvm_mmu_set_base_ptes(PT_PRESENT_MASK);
+   kvm_mmu_set_mask_ptes(PT_USER_MASK, PT_ACCESSED_MASK,
+   PT_DIRTY_MASK, PT64_NX_MASK, 0);
return 0;

 out:
diff --git a/include/asm-x86/kvm_host.h b/include/asm-x86/kvm_host.h
index 31aa7d6..9f62773 100644
--- a/include/asm-x86/kvm_host.h
+++ b/include/asm-x86/kvm_host.h
@@ -432,6 +432,9 @@ void kvm_mmu_destroy(struct kvm_vcpu *vcpu);
 int kvm_mmu_create(struct kvm_vcpu *vcpu);
 int kvm_mmu_setup(struct kvm_vcpu *vcpu);
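
A standalone sketch of the idea behind the configurable masks: the same
entry-building code produces either a legacy x86 entry or an EPT-style entry
depending on which masks were registered. The struct spte_masks, the helper
names, and the EPT-like mask values (read in bit 0, execute in bit 2) are
assumptions for illustration; this patch only registers the legacy values in
kvm_arch_init().

```c
#include <stdint.h>

#define ACC_EXEC_MASK 1u
#define ACC_USER_MASK 4u

/* Hypothetical container for the per-format bits the patch makes variable. */
struct spte_masks {
    uint64_t base_present, user, accessed, dirty, nx, x;
};

/* Legacy x86 values, matching what the patch passes in kvm_arch_init(). */
struct spte_masks legacy_masks(void)
{
    struct spte_masks m = { 1ULL << 0, 1ULL << 2, 1ULL << 5,
                            1ULL << 6, 1ULL << 63, 0 };
    return m;
}

/* Assumed EPT-like values: no user/accessed/dirty/NX, explicit execute bit. */
struct spte_masks ept_like_masks(void)
{
    struct spte_masks m = { 1ULL << 0, 0, 0, 0, 0, 1ULL << 2 };
    return m;
}

/* Build the permission part of an entry the way mmu_set_spte() now does. */
uint64_t make_spte_flags(const struct spte_masks *m, unsigned access)
{
    uint64_t spte = m->base_present | m->dirty;

    if (access & ACC_EXEC_MASK)
        spte |= m->x;   /* zero when the format has no execute bit */
    else
        spte |= m->nx;  /* zero when the format has no NX bit */
    if (access & ACC_USER_MASK)
        spte |= m->user;
    return spte;
}
```

A zero mask makes the corresponding step a no-op, so x_mask and nx_mask can
stay mutually exclusive without separate code paths for each format.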
 

[kvm-devel] [PATCH 2/6] KVM: MMU: Move some definitions

2008-04-18 Thread Yang, Sheng
From a5ee291f056256f8a892393410bc5923ff575a3b Mon Sep 17 00:00:00 2001
From: Sheng Yang [EMAIL PROTECTED]
Date: Fri, 18 Apr 2008 17:03:53 +0800
Subject: [PATCH 2/5] KVM: MMU: Move some definitions for building common entries

Move some definitions to mmu.h in order to build common table entries.

Signed-off-by: Sheng Yang [EMAIL PROTECTED]
---
 arch/x86/kvm/mmu.c |   25 -
 arch/x86/kvm/mmu.h |   24 
 2 files changed, 24 insertions(+), 25 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 078a7f1..108886d 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -84,31 +84,6 @@ static int dbg = 1;
 #define PT32_PT_BITS 10
 #define PT32_ENT_PER_PAGE (1 << PT32_PT_BITS)
 
-#define PT_WRITABLE_SHIFT 1
-
-#define PT_PRESENT_MASK (1ULL << 0)
-#define PT_WRITABLE_MASK (1ULL << PT_WRITABLE_SHIFT)
-#define PT_USER_MASK (1ULL << 2)
-#define PT_PWT_MASK (1ULL << 3)
-#define PT_PCD_MASK (1ULL << 4)
-#define PT_ACCESSED_MASK (1ULL << 5)
-#define PT_DIRTY_MASK (1ULL << 6)
-#define PT_PAGE_SIZE_MASK (1ULL << 7)
-#define PT_PAT_MASK (1ULL << 7)
-#define PT_GLOBAL_MASK (1ULL << 8)
-#define PT64_NX_SHIFT 63
-#define PT64_NX_MASK (1ULL << PT64_NX_SHIFT)
-
-#define PT_PAT_SHIFT 7
-#define PT_DIR_PAT_SHIFT 12
-#define PT_DIR_PAT_MASK (1ULL << PT_DIR_PAT_SHIFT)
-
-#define PT32_DIR_PSE36_SIZE 4
-#define PT32_DIR_PSE36_SHIFT 13
-#define PT32_DIR_PSE36_MASK \
-	(((1ULL << PT32_DIR_PSE36_SIZE) - 1) << PT32_DIR_PSE36_SHIFT)
-
-
 #define PT_FIRST_AVAIL_BITS_SHIFT 9
 #define PT64_SECOND_AVAIL_BITS_SHIFT 52
 
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index e64e9f5..271c011 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -9,6 +9,30 @@
 #define TDP_ROOT_LEVEL PT32E_ROOT_LEVEL
 #endif
 
+#define PT_WRITABLE_SHIFT 1
+
+#define PT_PRESENT_MASK (1ULL << 0)
+#define PT_WRITABLE_MASK (1ULL << PT_WRITABLE_SHIFT)
+#define PT_USER_MASK (1ULL << 2)
+#define PT_PWT_MASK (1ULL << 3)
+#define PT_PCD_MASK (1ULL << 4)
+#define PT_ACCESSED_MASK (1ULL << 5)
+#define PT_DIRTY_MASK (1ULL << 6)
+#define PT_PAGE_SIZE_MASK (1ULL << 7)
+#define PT_PAT_MASK (1ULL << 7)
+#define PT_GLOBAL_MASK (1ULL << 8)
+#define PT64_NX_SHIFT 63
+#define PT64_NX_MASK (1ULL << PT64_NX_SHIFT)
+
+#define PT_PAT_SHIFT 7
+#define PT_DIR_PAT_SHIFT 12
+#define PT_DIR_PAT_MASK (1ULL << PT_DIR_PAT_SHIFT)
+
+#define PT32_DIR_PSE36_SIZE 4
+#define PT32_DIR_PSE36_SHIFT 13
+#define PT32_DIR_PSE36_MASK \
+	(((1ULL << PT32_DIR_PSE36_SIZE) - 1) << PT32_DIR_PSE36_SHIFT)
+
 static inline void kvm_mmu_free_some_pages(struct kvm_vcpu *vcpu)
 {
 	if (unlikely(vcpu->kvm->arch.n_free_mmu_pages < KVM_MIN_FREE_MMU_PAGES))
--
1.5.4.5
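
A short worked example of how the shift-and-mask definitions moved by this
patch evaluate, using a subset of them in a standalone file. The helper
pse36_bits is hypothetical and only shows how such a mask is applied to a
page-directory entry.

```c
#include <stdint.h>

/* Subset of the definitions moved into mmu.h by the patch. */
#define PT_WRITABLE_SHIFT 1
#define PT_PRESENT_MASK (1ULL << 0)
#define PT_WRITABLE_MASK (1ULL << PT_WRITABLE_SHIFT)
#define PT_USER_MASK (1ULL << 2)
#define PT64_NX_SHIFT 63
#define PT64_NX_MASK (1ULL << PT64_NX_SHIFT)

#define PT32_DIR_PSE36_SIZE 4
#define PT32_DIR_PSE36_SHIFT 13
#define PT32_DIR_PSE36_MASK \
    (((1ULL << PT32_DIR_PSE36_SIZE) - 1) << PT32_DIR_PSE36_SHIFT)

/* Extract the 4-bit PSE-36 field (bits 13..16) from a directory entry. */
unsigned pse36_bits(uint64_t pde)
{
    return (unsigned)((pde & PT32_DIR_PSE36_MASK) >> PT32_DIR_PSE36_SHIFT);
}
```

Here ((1 << 4) - 1) << 13 gives 0xf << 13 = 0x1e000, and the 1ULL operand
keeps the NX shift by 63 well defined in a 64-bit type.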


[kvm-devel] [PATCH 4/6] KVM: Export necessary function for EPT

2008-04-18 Thread Yang, Sheng
From 5d4a79e5edfc09b54bd83a3a289cbb82058e3daa Mon Sep 17 00:00:00 2001
From: Sheng Yang [EMAIL PROTECTED]
Date: Fri, 18 Apr 2008 17:05:20 +0800
Subject: [PATCH 4/5] KVM: Export necessary function for EPT

The function gfn_to_hva is necessary for handling EPT violations.

Signed-off-by: Sheng Yang [EMAIL PROTECTED]
---
 virt/kvm/kvm_main.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 0998455..d028e07 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -522,6 +522,7 @@ unsigned long gfn_to_hva(struct kvm *kvm, gfn_t gfn)
 		return bad_hva();
 	return (slot->userspace_addr + (gfn - slot->base_gfn) * PAGE_SIZE);
 }
+EXPORT_SYMBOL_GPL(gfn_to_hva);
 
 /*
  * Requires current->mm->mmap_sem to be held
--
1.5.4.5
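
A standalone sketch of the address arithmetic in gfn_to_hva(): the guest frame
number is turned into an offset within its memory slot and added to the slot's
userspace base address. The struct memslot, the BAD_HVA sentinel, the 4 KiB
page size, and the explicit bounds check are assumptions for illustration; the
kernel looks the slot up itself and returns bad_hva() on failure.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL
#define BAD_HVA (~0ULL) /* hypothetical sentinel for "no mapping" */

typedef uint64_t gfn_t;

/* Simplified memory slot: npages guest frames starting at base_gfn. */
struct memslot {
    gfn_t base_gfn;
    uint64_t npages;
    uint64_t userspace_addr;
};

/* Map a guest frame number to a host virtual address within one slot. */
uint64_t slot_gfn_to_hva(const struct memslot *slot, gfn_t gfn)
{
    if (gfn < slot->base_gfn || gfn - slot->base_gfn >= slot->npages)
        return BAD_HVA;
    return slot->userspace_addr + (gfn - slot->base_gfn) * SKETCH_PAGE_SIZE;
}
```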




[kvm-devel] [PATCH 5/6] KVM: VMX: Enable EPT feature for KVM

2008-04-18 Thread Yang, Sheng
From 43eb727046349aac3df52317dbbfd3b4b33c084d Mon Sep 17 00:00:00 2001
From: Sheng Yang [EMAIL PROTECTED]
Date: Fri, 18 Apr 2008 17:07:31 +0800
Subject: [PATCH 5/5] KVM: VMX: Enable EPT feature for KVM


Signed-off-by: Sheng Yang [EMAIL PROTECTED]
---
 arch/x86/kvm/mmu.c |   11 ++-
 arch/x86/kvm/vmx.c |  227 ++--
 arch/x86/kvm/vmx.h |   11 ++
 include/asm-x86/kvm_host.h |1 +
 4 files changed, 241 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 1828837..c7b7335 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1187,8 +1187,15 @@ static int __direct_map(struct kvm_vcpu *vcpu, gpa_t v, int write,
 				return -ENOMEM;
 			}
 
-			table[index] = __pa(new_table->spt) | PT_PRESENT_MASK
-				| PT_WRITABLE_MASK | shadow_user_mask;
+			if (shadow_user_mask)
+				table[index] = __pa(new_table->spt)
+					| PT_PRESENT_MASK | PT_WRITABLE_MASK
+					| shadow_user_mask;
+			else
+				table[index] = __pa(new_table->spt)
+					| PT_PRESENT_MASK | PT_WRITABLE_MASK
+					| shadow_x_mask;
+			table[index] = __pa(new_table->spt) | 0x7;
 		}
 		table_addr = table[index] & PT64_BASE_ADDR_MASK;
 	}
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index d93250d..2a85930 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -42,7 +42,7 @@ module_param(enable_vpid, bool, 0);
 static int flexpriority_enabled = 1;
 module_param(flexpriority_enabled, bool, 0);

-static int enable_ept;
+static int enable_ept = 1;
 module_param(enable_ept, bool, 0);

 struct vmcs {
@@ -284,6 +284,18 @@ static inline void __invvpid(int ext, u16 vpid, gva_t gva)
 		  : : "a"(&operand), "c"(ext) : "cc", "memory");
 }
 
+static inline void __invept(int ext, u64 eptp, gpa_t gpa)
+{
+	struct {
+		u64 eptp, gpa;
+	} operand = {eptp, gpa};
+
+	asm volatile (ASM_VMX_INVEPT
+			/* CF==1 or ZF==1 --> rc = -1 */
+			"; ja 1f ; ud2 ; 1:\n"
+			: : "a" (&operand), "c" (ext) : "cc", "memory");
+}
+
 static struct kvm_msr_entry *find_msr_entry(struct vcpu_vmx *vmx, u32 msr)
 {
 	int i;
@@ -335,6 +347,33 @@ static inline void vpid_sync_vcpu_all(struct vcpu_vmx *vmx)
 	__invvpid(VMX_VPID_EXTENT_SINGLE_CONTEXT, vmx->vpid, 0);
 }
 }

+static inline void ept_sync_global(void)
+{
+   if (cpu_has_vmx_invept_global())
+   __invept(VMX_EPT_EXTENT_GLOBAL, 0, 0);
+}
+
+static inline void ept_sync_context(u64 eptp)
+{
+   if (vm_need_ept()) {
+   if (cpu_has_vmx_invept_context())
+   __invept(VMX_EPT_EXTENT_CONTEXT, eptp, 0);
+   else
+   ept_sync_global();
+   }
+}
+
+static inline void ept_sync_individual_addr(u64 eptp, gpa_t gpa)
+{
+   if (vm_need_ept()) {
+   if (cpu_has_vmx_invept_individual_addr())
+   __invept(VMX_EPT_EXTENT_INDIVIDUAL_ADDR,
+   eptp, gpa);
+   else
+   ept_sync_context(eptp);
+   }
+}
+
 static unsigned long vmcs_readl(unsigned long field)
 {
unsigned long value;
@@ -422,6 +461,8 @@ static void update_exception_bitmap(struct kvm_vcpu *vcpu)
 		eb |= 1u << 1;
 	if (vcpu->arch.rmode.active)
 		eb = ~0;
+	if (vm_need_ept())
+		eb &= ~(1u << PF_VECTOR); /* bypass_guest_pf = 0 */
 	vmcs_write32(EXCEPTION_BITMAP, eb);
 }
 
@@ -1352,8 +1393,64 @@ static void vmx_decache_cr4_guest_bits(struct kvm_vcpu *vcpu)
 	vcpu->arch.cr4 |= vmcs_readl(GUEST_CR4) & ~KVM_GUEST_CR4_MASK;
 }
 
+static void ept_load_pdptrs(struct kvm_vcpu *vcpu)
+{
+	if (is_paging(vcpu) && is_pae(vcpu) && !is_long_mode(vcpu)) {
+		if (!load_pdptrs(vcpu, vcpu->arch.cr3)) {
+			printk(KERN_ERR "EPT: Fail to load pdptrs!\n");
+			return;
+		}
+		vmcs_write64(GUEST_PDPTR0, vcpu->arch.pdptrs[0]);
+		vmcs_write64(GUEST_PDPTR1, vcpu->arch.pdptrs[1]);
+		vmcs_write64(GUEST_PDPTR2, vcpu->arch.pdptrs[2]);
+		vmcs_write64(GUEST_PDPTR3, vcpu->arch.pdptrs[3]);
+	}
+}
+
+static void vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4);
+
+static void ept_update_paging_mode_cr0(unsigned long *hw_cr0,
+					unsigned long cr0,
+					struct kvm_vcpu *vcpu)
+{
+	if (!(cr0 & X86_CR0_PG)) {
+   /* From paging/starting to nonpaging */
+   
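
A standalone sketch of the fallback order used by the ept_sync_* helpers in
this patch: try the narrowest INVEPT extent the CPU supports and widen step by
step (individual address, then context, then global). The capability bit
values and the pick_* function names are hypothetical, and the vm_need_ept()
check from the patch is left out for brevity.

```c
#include <stdint.h>

/* Hypothetical capability bits, for illustration only. */
#define CAP_INVEPT_INDIVIDUAL (1u << 0)
#define CAP_INVEPT_CONTEXT    (1u << 1)
#define CAP_INVEPT_GLOBAL     (1u << 2)

enum invept_extent {
    INVEPT_NONE,
    INVEPT_INDIVIDUAL,
    INVEPT_CONTEXT,
    INVEPT_GLOBAL
};

/* Widest extent: flush everything, if the CPU can do even that. */
enum invept_extent pick_global(uint32_t caps)
{
    return (caps & CAP_INVEPT_GLOBAL) ? INVEPT_GLOBAL : INVEPT_NONE;
}

/* Flush one EPT context, or fall back to a global flush. */
enum invept_extent pick_context(uint32_t caps)
{
    if (caps & CAP_INVEPT_CONTEXT)
        return INVEPT_CONTEXT;
    return pick_global(caps);
}

/* Flush one guest-physical address, or fall back to a context flush. */
enum invept_extent pick_individual(uint32_t caps)
{
    if (caps & CAP_INVEPT_INDIVIDUAL)
        return INVEPT_INDIVIDUAL;
    return pick_context(caps);
}
```

Each level only over-invalidates relative to the one requested, so falling
back stays correct at the cost of extra flushing.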

Re: [kvm-devel] [PATCH 1/1] QEMU/KVM: Support for PCI Passthrough

2008-04-18 Thread Samuel Masham
On Fri, Apr 18, 2008 at 2:39 PM, Amit Shah [EMAIL PROTECTED] wrote:
> * On Monday 14 Apr 2008 06:01:07 Samuel Masham wrote:
>
>> Please keep the userspace support alive.
>>
>> I am particularly interested in using the pci-passthrough to qemu
>> running non x86 system emulation
>> (at the moment mips)
>>
>> My hope is that the pci-passthrough could help with developing
>> drivers and testing across architectures...
>
> OK; keeping support around won't be too much of a hassle, though the current
> support for pci-passthrough and the irqhook module are developed with x86 in
> mind (and only tested on x86).
>
> Since it's not tested on any other architecture, I've marked it TARGET_I386
> and TARGET_X86_64 for now. Feel free to extend it to other architectures.

Thanks,

I will let you know if I get anything useful out of it.

Samuel



[kvm-devel] What must NOT be neglected?

2008-04-18 Thread Корпоративный захват
  You are invited!

  Heads and staff of security departments, legal counsel,
  company executives, and financial and commercial directors

  are invited to take part in the event:

 ...E c o n o m i c
  security of the enterprise...

  12 - 16 May 2008, St. Petersburg

  Information department: (812)... 983... 37... 96

  Main programme modules:...

  Fundamentals of enterprise economic security.
  General principles of the theory of economic security.
  Main areas of ensuring enterprise security.
  External and internal threats.

  Measuring the effectiveness of the economic security service.
  Functions, tasks, and areas of activity of the security service.
  Work planning.

  Systems and methods for analysing and managing economic risks.
  Corporate takeovers: tools for detecting and countering
  corporate takeovers. Legal and economic aspects of the
  hostile takeover process. Takeover scenarios.

  The place of business intelligence in ensuring the economic security
  of a business. The concept of business intelligence. Its role in
  management decision-making. Business intelligence and business security.

  Surprise inspections. Inspection procedure, types of inspections,
  and grounds for conducting them.

  Integrated site security systems.

  Fundamentals of enterprise information security.

  Legal foundations of protecting confidential information.

  Measures for protecting confidential information.

  Information security measures for the enterprise
  related to personnel management.

  Technical means of industrial espionage and means
  of detecting them.

  Protection of computer information.

  Practical demonstration of access control and management tools
  and of tools for countering industrial espionage.

  The full programme is sent on request: (812)... 983... 37... 96







Re: [kvm-devel] Extboot Option ROM rewritten in C - v3

2008-04-18 Thread Nguyen Anh Quynh
On Thu, Apr 17, 2008 at 2:58 PM, H. Peter Anvin [EMAIL PROTECTED] wrote:
 Nguyen Anh Quynh wrote:

  This patch replaces the current assembly code of Extboot option rom
  with new C code. Patch is against kvm-66.
 
  This version returns an error code in case int 13 handler cannot
  handle a requested function.
 
  Signed-off-by: Nguyen Anh Quynh [EMAIL PROTECTED]
 
 

  +   /* -fomit-frame-pointer might clobber %ebp */
  +   pushl %ebp
  +   call setup
  +   popl  %ebp


  No, it might not.  %ebx, %ebp, %esi, and %edi are guaranteed preserved;
 %eax, %ecx and %edx are clobbered.

  It's also prudent to call cld before jumping to C code.

OK, I will fix these in the next version.

Thanks,
Q



Re: [kvm-devel] Extboot Option ROM rewritten in C - v3

2008-04-18 Thread Nguyen Anh Quynh
On Thu, Apr 17, 2008 at 3:00 PM, H. Peter Anvin [EMAIL PROTECTED] wrote:
  +   .globl linux_boot
  +linux_boot:
  +   cli
  +   cld
  +   mov $0x9000, %ax
  +   mov %ax, %ds
  +   mov %ax, %es
  +   mov %ax, %fs
  +   mov %ax, %gs
  +   mov %ax, %ss
  +   mov $0x8ffe, %sp
  +   ljmp $0x9000 + 0x20, $0
 

  The hard use of segment 9000 is really highly unfortunate for bzImage,
 since it restricts its heap more than necessary.  I suggest following the
 patterns used by the (new) Qemu loader.

Actually, this code is left from the original code of Anthony, and it
seems he took it from qemu 0.8 version.

Anthony, could you explain why you want to hijack the linux boot process
here? If I understand correctly, we can just let the original int19
execute, and if linux boot is desired, it would work in the normal way. So
why do you want to do this?

Thanks,
Q



Re: [kvm-devel] Extboot Option ROM rewritten in C - v3

2008-04-18 Thread Nguyen Anh Quynh
On Thu, Apr 17, 2008 at 4:36 PM, Carlo Marcelo Arenas Belon
[EMAIL PROTECTED] wrote:
 On Thu, Apr 17, 2008 at 10:30:27AM +0900, Nguyen Anh Quynh wrote:

  +++ b/extboot/farvar.h
  @@ -0,0 +1,113 @@
  +// Code to access multiple segments within gcc.
  +//
  +// Copyright (C) 2008  Kevin O'Connor [EMAIL PROTECTED]
  +//
  +// This file may be distributed under the terms of the GNU GPLv3 license.

  IANAL but wouldn't this make extboot GPLv3 only? how that will interact
  with the GPLv2 extboot qemu?

I am not sure if that is fine, but it might be better to have the same
license for all of the code. I will contact Kevin when the next version is
ready (I am still fixing something).

Thanks,
Q



Re: [kvm-devel] [PATCH] gfxboot VMX workaround v2

2008-04-18 Thread Guillaume Thouvenin
On Tue, 15 Apr 2008 16:06:43 +0300
Avi Kivity [EMAIL PROTECTED] wrote:

  ...
  handle_vmentry_failure: invalid guest state
  handle_vmentry_failure: start emulation
  handle_vmentry_failure: emulation failed

 
 What instruction failed, exactly?
 

I added the code to dump the instruction and it seems that it's the
emulation of 0xe6 (== out imm8, al) that failed. I made modifications
to emulate it (see below) and now I have another problem in kvm
userspace with the following message (and the emulation doesn't work):

enterprise:~ $ kvm_run: Operation not permitted
enterprise:~ $ kvm_run returned -1
 
 You need to load rip as well.

Ooops, yes. So jump far emulation is now like:

+       case 0xea: /* jmp far */ {
+               struct kvm_segment kvm_seg;
+               long int eip;
+               int ret;
+
+               kvm_x86_ops->get_segment(ctxt->vcpu, &kvm_seg, VCPU_SREG_CS);
+
+               ret = load_segment_descriptor(ctxt->vcpu, kvm_seg.selector, 9,
+                                             VCPU_SREG_CS);
+               if (ret < 0) {
+                       printk(KERN_INFO "%s: Failed to load CS descriptor\n",
+                              __FUNCTION__);
+                       goto cannot_emulate;
+               }
+
+               switch (c->op_bytes) {
+               case 2:
+                       eip = insn_fetch(s16, 2, c->eip);
+                       break;
+               case 4:
+                       eip = insn_fetch(s32, 4, c->eip);
+                       break;
+               default:
+                       DPRINTF("jmp far: Invalid op_bytes\n");
+                       goto cannot_emulate;
+               }
+               printk(KERN_INFO "eip == 0x%lx\n", eip);
+               c->eip = eip;
+               break;
+       }
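
For reference, the operand of a direct far jump (opcode 0xea) is the offset (2 or 4 bytes, depending on operand size) followed by a 2-byte selector, both little-endian. The following is a minimal standalone sketch of decoding that operand. It is not the code from the patch above, which reads only the offset and reloads the current CS selector. The names struct far_ptr and decode_jmp_far are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical holder for a decoded far pointer. */
struct far_ptr {
    uint16_t selector;
    uint32_t offset;
};

/*
 * Decode the operand of JMP ptr16:16 / ptr16:32 (opcode 0xea).
 * 'bytes' points just past the opcode byte, 'len' is the number of
 * bytes available, 'op_bytes' is the operand size (2 or 4).
 * Returns 0 on success, -1 on a bad operand size or a short buffer.
 */
static int decode_jmp_far(const uint8_t *bytes, size_t len, int op_bytes,
                          struct far_ptr *out)
{
    uint32_t off = 0;
    int i;

    assert(bytes != NULL && out != NULL);
    if (op_bytes != 2 && op_bytes != 4)
        return -1;
    if (len < (size_t)op_bytes + 2)
        return -1;

    /* The offset comes first, least significant byte first. */
    for (i = 0; i < op_bytes; i++)
        off |= (uint32_t)bytes[i] << (8 * i);

    out->offset = off;
    /* The 16-bit selector follows the offset. */
    out->selector = (uint16_t)(bytes[op_bytes] | (bytes[op_bytes + 1] << 8));
    return 0;
}
```

With op_bytes == 2 and the bytes 18 6e 08 00, this yields offset 0x6e18 and selector 0x0008. The selector value here is made up for illustration; only the offset matches the log in this thread.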

It seems that the jump to cs:eip works and now I have the following error:

[18535.446917] handle_vmentry_failure: invalid guest state
[18535.449519] handle_vmentry_failure: start emulation
[18535.457519] eip == 0x6e18
[18535.467685] handle_vmentry_failure: emulation of 0xe6 failed

For the emulation of 0xe6 I used the following one that I found in
nitin's tree:

+       case 0xe6: /* out imm8, al */
+       case 0xe7: /* out imm8, ax/eax */ {
+               struct kvm_io_device *pio_dev;
+
+               pio_dev = vcpu_find_pio_dev(ctxt->vcpu, c->src.val);
+               kvm_iodevice_write(pio_dev, c->src.val,
+                                  (c->d & ByteOp) ? 1 : c->op_bytes,
+                                  &c->regs[VCPU_REGS_RAX]);
+       }
+       break;

I will look more closely at where the problem is and, as you suggested, I
will display the instruction to be emulated and the register state before
and after, and compare with the expected state.


Thanks for your help,
Regards,
Guillaume



Re: [kvm-devel] kvm-65/66 bug with Solaris 10 U4 ?

2008-04-18 Thread Chris Lalancette
Ian Kirk wrote:
 
 I can't compile them against 2.6.24.4-64.fc8PAE as the module magic name
 mismatches, and I don't know how to change kernel-devel to know it's PAE.

You just need to install the kernel-PAE-devel package, and then build against that.

Chris Lalancette



Re: [kvm-devel] Extboot Option ROM rewritten in C - v3

2008-04-18 Thread H. Peter Anvin
Nguyen Anh Quynh wrote:
 
 Actually, this code is left from the original code of Anthony, and it
 seems he took it from qemu 0.8 version.
 
 Anthony, could you explain why you want to hijack the linux boot process
 here? If I understand correctly, we can just let the original int19
 execute, and if linux boot is desired, it would work in the normal way. So
 why do you want to do this?
 

I'm having exactly the *opposite* question... why does extboot have code 
to hook int 13h?

-hpa



Re: [kvm-devel] [Qemu-devel] Re: [PATCH 1/3] Refactor AIO interface to allow other AIO implementations

2008-04-18 Thread Jamie Lokier
Daniel P. Berrange wrote:
  Those cases aren't always discoverable.  Linux-aio just falls back to 
  using synchronous IO.  It's pretty terrible.  We need a new AIO 
  interface for Linux (and yes, we're working on this).  Once we have 
  something better, we'll change that to be the default and things will 
  Just Work for most users.
 
 If QEMU can't discover cases where it won't work, what criteria should
 the end user use to decide between the impls, or for that matter, what
 criteria should a management api/app like libvirt use? If the only decision
 logic is 'try it & benchmark your VM' then it's not a particularly useful
 option.

Good use of Linux-AIO requires that you basically know which cases
it handles well, and which ones it doesn't.  Falling back to
synchronous I/O with no indication (except speed) is a pretty
atrocious API imho.  But that's what the Linux folks decided to do.

I suspect what you have to do is:

1. Try opening the file with O_DIRECT.
2. Use fstat to check the filesystem type and block device type.
3. If it's on a whitelist of filesystem types,
4. and a whitelist of block device types,
5. and the kernel version is later than an fs+bd-dependent value,
6. then select an alignment size (kernel version dependent)
   and use Linux-AIO with it.

Otherwise don't use Linux-AIO.  You may then decide to use Glibc's
POSIX-AIO (which uses threads), or use threads for I/O yourself.
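
The recipe above can be sketched as a pure decision function in C. This is only an illustration under stated assumptions: the probing itself (open with O_DIRECT, fstat, the whitelists) is left to the caller and passed in as a hypothetical struct aio_probe, and the alignment values in pick_alignment() are placeholders, not figures taken from this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Pack a kernel version so that versions compare as plain integers. */
#define KERNEL_VER(a, b, c) (((a) << 16) | ((b) << 8) | (c))

enum io_backend { IO_THREADS, IO_LINUX_AIO };

/* Results of probing the image file; every field is a hypothetical input. */
struct aio_probe {
    bool o_direct_ok;       /* step 1: open() with O_DIRECT succeeded */
    bool fs_whitelisted;    /* steps 2-3: filesystem type is known-good */
    bool bdev_whitelisted;  /* steps 2, 4: block device type is known-good */
    unsigned kernel;        /* running kernel, as KERNEL_VER() */
    unsigned min_kernel;    /* step 5: minimum for this fs + block device */
};

/* Step 6: placeholder alignment values, chosen only for illustration. */
static size_t pick_alignment(unsigned kernel)
{
    return kernel >= KERNEL_VER(2, 6, 0) ? 512 : 4096;
}

/* Use Linux-AIO only when every check passes; otherwise use threads. */
static enum io_backend choose_backend(const struct aio_probe *p, size_t *align)
{
    assert(p != NULL && align != NULL);
    if (p->o_direct_ok && p->fs_whitelisted && p->bdev_whitelisted &&
        p->kernel >= p->min_kernel) {
        *align = pick_alignment(p->kernel);
        return IO_LINUX_AIO;
    }
    *align = 0; /* the thread-based fallback has no O_DIRECT alignment */
    return IO_THREADS;
}
```

Keeping the decision separate from the probing makes the policy easy to test and to update when the whitelists change.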

In future, the above recipe will be more complicated, in that you have
to use the same decision tree to decide between:

- Synchronous IO.
- Your own thread based IO.
- Glibc POSIX-AIO using threads.
- Linux-AIO.
- Virtio thing or whatever is based around vringfd.
- Syslets if they gain traction and perform well.

 I've basically got a choice of making libvirt always add '-aio linux'
 or never add it at all. My inclination is to the latter since it is
 compatible with existing QEMU which has no -aio option. Presumably
 '-aio linux' is intended to provide some performance benefit so it'd
 be nice to use it. If we can't express some criteria under which it
 should be turned on, I can't enable it; whereas if you can express
 some criteria, then QEMU should apply them automatically.

I'm of the view that '-aio auto' would be a really good option - and
when it's proven itself, it should be the default.  It could work on
all QEMU hosts: it would pick synchronous IO when there is nothing else.

The criteria for selecting a good AIO strategy on Linux are quite
complex, and might be worth hard coding.  In that case, putting that
into QEMU itself would be much better than every program which
launches QEMU having its own implementation of the criteria.

 Pushing this choice of AIO impls to the app or user invoking QEMU just
 does not seem like a win here.

I think having the choice is very good, because whatever the hard
coded selection criteria, there will be times when it's wrong (ideally
in conservative ways - it should always be functional, just suboptimal).

So I do support this patch to add the switch.

But _forcing_ the user to decide is not good, since the criteria are
rather obscure and change with things like filesystem.  At least, a
set of command line options to QEMU ought to work when you copy a VM
to another machine!

So I think '-aio auto', which invokes the selection criteria of the
day and is guaranteed to work (conservatively picking a slower method
if it cannot be sure a faster one will work) would be the most useful
option of all.
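
As a rough sketch of how an '-aio auto' option could behave, the C function below honours an explicit choice and otherwise walks a preference list, falling back to synchronous I/O. The enum values, the preference order, and resolve_aio() itself are all assumptions made for illustration; nothing here is an existing QEMU interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical backend identifiers; AIO_AUTO must stay last. */
enum aio_mode { AIO_SYNC, AIO_THREADS, AIO_POSIX, AIO_LINUX, AIO_AUTO };

/*
 * usable[m] says whether backend m was judged safe on this host.
 * Synchronous I/O is always treated as available.
 */
static enum aio_mode resolve_aio(enum aio_mode requested,
                                 const bool usable[AIO_AUTO])
{
    /* Preference order for "auto": assumed fastest candidate first. */
    static const enum aio_mode order[] = { AIO_LINUX, AIO_POSIX, AIO_THREADS };
    size_t i;

    assert(usable != NULL);
    if (requested != AIO_AUTO)
        return requested;       /* an explicit user choice is honoured */

    for (i = 0; i < sizeof(order) / sizeof(order[0]); i++)
        if (usable[order[i]])
            return order[i];

    return AIO_SYNC;            /* conservative fallback, always works */
}
```

Because the fallback never fails, the same command line keeps working when a VM is copied to a host where the faster methods are unavailable.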

-- Jamie




Re: [kvm-devel] [PATCH] gfxboot VMX workaround v2

2008-04-18 Thread Guillaume Thouvenin
On Fri, 18 Apr 2008 14:18:16 +0200
Guillaume Thouvenin [EMAIL PROTECTED] wrote:

 I added the code to dump the instruction and it seems that it's the
 emulation of 0xe6 (== out imm8, al) that failed. I made modifications
 to emulate it (see below) and now I have another problem in kvm
 userspace with the following message (and the emulation doesn't work):
 
 enterprise:~ $ kvm_run: Operation not permitted
 enterprise:~ $ kvm_run returned -1

Ok for this one it seems to be a wrong value in the opcode_table[]. Now
it generates an oops. I'm investigating... 

Regards,
Guillaume

---

Apr 18 14:48:53 enterprise kernel: [22321.010006] handle_vmentry_failure: 
invalid guest state
Apr 18 14:48:53 enterprise kernel: [22321.011953] handle_vmentry_failure: start 
emulation
Apr 18 14:48:53 enterprise kernel: [22321.015875] c-op_bytes == 2
Apr 18 14:48:53 enterprise kernel: [22321.019862] eip == 0x6e18

Message from [EMAIL PROTECTED] at Fri Apr 18 14:48:54 2008 ...
enterprise kernel: [22321.027850] Oops:  [2] SMP

Message from [EMAIL PROTECTED] at Fri Apr 18 14:48:54 2008 ...
enterprise kernel: [22321.027850] Code: 75 58 48 8b 7d 00 e8 64 4f ff ff f6 85 
98 00 00 00 01 ba 01 00 00 00 75 04 0f b6 55 4c 48 8b 75 58 48 8d 8d a0 00 00 
00 48 89 c7 ff 50 08 e9 f1 07 00 00 8a 45 4c 3c 02 74 0a 3c 04 0f 85 73 13

Message from [EMAIL PROTECTED] at Fri Apr 18 14:48:54 2008 ...
enterprise kernel: [22321.027850] CR2: 0008
Apr 18 14:48:54 enterprise kernel: [22321.027850] PGD 36f1a8067 PUD 327c17067 
PMD 0
Apr 18 14:48:54 enterprise kernel: [22321.027850] CPU 1
Apr 18 14:48:54 enterprise kernel: [22321.027850] Modules linked in: kvm_intel 
kvm aic94xx libsas scsi_transport_sas [last unloaded: kvm]
Apr 18 14:48:54 enterprise kernel: [22321.027850] Pid: 7814, comm: 
qemu-system-x86 Tainted: G  D  2.6.25 #207
Apr 18 14:48:54 enterprise kernel: [22321.027850] RIP: 
0010:[88043933]  [88043933] 
:kvm:x86_emulate_insn+0x2d97/0x414c
Apr 18 14:48:54 enterprise kernel: [22321.027850] RSP: 0018:81033005fb68  
EFLAGS: 00010202
Apr 18 14:48:54 enterprise kernel: [22321.027850] RAX:  RBX: 
810344cf9440 RCX: 810344cf9498
Apr 18 14:48:54 enterprise kernel: [22321.027850] RDX: 0001 RSI: 
007a RDI: 
Apr 18 14:48:54 enterprise kernel: [22321.027850] RBP: 810344cf93f8 R08: 
 R09: 
Apr 18 14:48:54 enterprise kernel: [22321.027850] R10:  R11: 
 R12: 
Apr 18 14:48:54 enterprise kernel: [22321.027850] R13: 88051e50 R14: 
810344cf9498 R15: 7ad6
Apr 18 14:48:54 enterprise kernel: [22321.027850] FS:  4108b950() 
GS:810397c250c0() knlGS:
Apr 18 14:48:54 enterprise kernel: [22321.027850] CS:  0010 DS: 002b ES: 002b 
CR0: 80050033
Apr 18 14:48:54 enterprise kernel: [22321.027850] CR2: 0008 CR3: 
0003301b2000 CR4: 26e0
Apr 18 14:48:54 enterprise kernel: [22321.027850] DR0:  DR1: 
 DR2: 
Apr 18 14:48:54 enterprise kernel: [22321.027850] DR3:  DR6: 
0ff0 DR7: 0400
Apr 18 14:48:54 enterprise kernel: [22321.027850] Process qemu-system-x86 (pid: 
7814, threadinfo 81033005e000, task 810396023080)
Apr 18 14:48:54 enterprise kernel: [22321.027850] Stack:  81033005fb04 
0088 810344cf9438 810344cf9440
Apr 18 14:48:54 enterprise kernel: [22321.027850]  00040040 
00055e1c 00055e1c 810344cf9498
Apr 18 14:48:54 enterprise kernel: [22321.027850]  0089 
8805087a  810344cf80c0
Apr 18 14:48:54 enterprise kernel: [22321.027850] Call Trace:
Apr 18 14:48:54 enterprise kernel: [22321.027850]  [88038d91] ? 
:kvm:emulate_instruction+0x1e5/0x2b9
Apr 18 14:48:54 enterprise kernel: [22321.027850]  [88057cd1] ? 
:kvm_intel:kvm_handle_exit+0xea/0x1e8
Apr 18 14:48:54 enterprise kernel: [22321.027850]  [88057a96] ? 
:kvm_intel:vmx_intr_assist+0x68/0x1b9
Apr 18 14:48:54 enterprise kernel: [22321.027850]  [80563398] ? 
__down_read+0x12/0xa1
Apr 18 14:48:54 enterprise kernel: [22321.027850]  [8803b940] ? 
:kvm:kvm_arch_vcpu_ioctl_run+0x4ae/0x631
Apr 18 14:48:54 enterprise kernel: [22321.027850]  [80291ec9] ? 
touch_atime+0xae/0xed
Apr 18 14:48:54 enterprise kernel: [22321.027850]  [8803672e] ? 
:kvm:kvm_vcpu_ioctl+0xf3/0x3a1
Apr 18 14:48:54 enterprise kernel: [22321.027850]  [802802c0] ? 
do_sync_read+0xd1/0x118
Apr 18 14:48:54 enterprise kernel: [22321.027850]  [880363b1] ? 
:kvm:kvm_vm_ioctl+0x1ab/0x1c3
Apr 18 14:48:54 enterprise kernel: [22321.027850]  [8028ae49] ? 
vfs_ioctl+0x21/0x6b
Apr 18 14:48:54 enterprise kernel: [22321.027850]  [8028b0e6] ? 
do_vfs_ioctl+0x253/0x264
Apr 18 14:48:54 enterprise kernel: [22321.027850]  

Re: [kvm-devel] Extboot Option ROM rewritten in C - v3

2008-04-18 Thread Anthony Liguori
Nguyen Anh Quynh wrote:
 On Thu, Apr 17, 2008 at 3:00 PM, H. Peter Anvin [EMAIL PROTECTED] wrote:
   
 +   .globl linux_boot
 +linux_boot:
 +   cli
 +   cld
 +   mov $0x9000, %ax
 +   mov %ax, %ds
 +   mov %ax, %es
 +   mov %ax, %fs
 +   mov %ax, %gs
 +   mov %ax, %ss
 +   mov $0x8ffe, %sp
 +   ljmp $0x9000 + 0x20, $0

   
  The hard use of segment 9000 is really highly unfortunate for bzImage,
 since it restricts its heap more than necessary.  I suggest following the
 patterns used by the (new) Qemu loader.
 

 Actually, this code is left from the original code of Anthony, and it
 seems he took it from qemu 0.8 version.

 Anthony, could you explain why you want to hijack the linux boot process
 here? If I understand correctly, we can just let the original int19
 execute, and if linux boot is desired, it would work in the normal way. So
 why do you want to do this?
   

The thinking is to eliminate the need to hijack the boot sector when 
using the -kernel option.  However, the linux boot stuff in extboot has 
been broken since hpa rewrote the boot code.  It can be removed for now 
and I'll eventually revisit it.

Regards,

Anthony Liguori

 Thanks,
 Q
   




Re: [kvm-devel] Extboot Option ROM rewritten in C - v3

2008-04-18 Thread Anthony Liguori
H. Peter Anvin wrote:
 Nguyen Anh Quynh wrote:

 Actually, this code is left from the original code of Anthony, and it
 seems he took it from qemu 0.8 version.

 Anthony, could you explain why you want to hijack the linux boot process
 here? If I understand correctly, we can just let the original int19
 execute, and if linux boot is desired, it would work in the normal way. So
 why do you want to do this?


 I'm having exactly the *opposite* question... why does extboot have 
 code to hook int 13h?

extboot is primarily intended to allow scsi boot or boot from a pv 
disk.  It hooks int13h to fake out disk access.

Regards,

Anthony Liguori

 -hpa




Re: [kvm-devel] [PATCH] gfxboot VMX workaround v2

2008-04-18 Thread Anthony Liguori
Guillaume Thouvenin wrote:
 On Tue, 15 Apr 2008 16:06:43 +0300
 Avi Kivity [EMAIL PROTECTED] wrote:

   
 ...
 handle_vmentry_failure: invalid guest state
 handle_vmentry_failure: start emulation
 handle_vmentry_failure: emulation failed
   
   
 What instruction failed, exactly?

 

 I added the code to dump the instruction and it seems that it's the
 emulation of 0xe6 (== out imm8, al) that failed. I made modifications
 to emulate it (see below) and now I have another problem in kvm
 userspace with the following message (and the emulation doesn't work):

 enterprise:~ $ kvm_run: Operation not permitted
 enterprise:~ $ kvm_run returned -1
  
   
 You need to load rip as well.
 

 Ooops, yes. So jump far emulation is now like:

 +       case 0xea: /* jmp far */ {
 +               struct kvm_segment kvm_seg;
 +               long int eip;
 +               int ret;
 +
 +               kvm_x86_ops->get_segment(ctxt->vcpu, &kvm_seg, VCPU_SREG_CS);
 +
 +               ret = load_segment_descriptor(ctxt->vcpu, kvm_seg.selector,
 +                                             9, VCPU_SREG_CS);
 +               if (ret < 0) {
 +                       printk(KERN_INFO "%s: Failed to load CS descriptor\n",
 +                              __FUNCTION__);
 +                       goto cannot_emulate;
 +               }
 +
 +               switch (c->op_bytes) {
 +               case 2:
 +                       eip = insn_fetch(s16, 2, c->eip);
 +                       break;
 +               case 4:
 +                       eip = insn_fetch(s32, 4, c->eip);
 +                       break;
 +               default:
 +                       DPRINTF("jmp far: Invalid op_bytes\n");
 +                       goto cannot_emulate;
 +               }
 +               printk(KERN_INFO "eip == 0x%lx\n", eip);
 +               c->eip = eip;
 +               break;
 +       }

 It seems that the jump to cs:eip works and now I have the following error:

 [18535.446917] handle_vmentry_failure: invalid guest state
 [18535.449519] handle_vmentry_failure: start emulation
 [18535.457519] eip == 0x6e18
 [18535.467685] handle_vmentry_failure: emulation of 0xe6 failed

 For the emulation of 0xe6 I used the following one that I found in
 nitin's tree:
   

This doesn't seem right.  You should have been able to break out of the 
emulator long before encountering an out instruction.  The next 
instruction you encounter should be a mov instruction.  Are you sure 
you're updating eip correctly?

Regards,

Anthony Liguori

 +       case 0xe6: /* out imm8, al */
 +       case 0xe7: /* out imm8, ax/eax */ {
 +               struct kvm_io_device *pio_dev;
 +
 +               pio_dev = vcpu_find_pio_dev(ctxt->vcpu, c->src.val);
 +               kvm_iodevice_write(pio_dev, c->src.val,
 +                                  (c->d & ByteOp) ? 1 : c->op_bytes,
 +                                  &c->regs[VCPU_REGS_RAX]);
 +       }
 +       break;

 I will look more closely at where the problem is and, as you suggested, I
 will display the instruction to be emulated and the register state before
 and after, and compare with the expected state.


 Thanks for your help,
 Regards,
 Guillaume

   




Re: [kvm-devel] [PATCH 3/6] KVM: MMU: Add EPT support

2008-04-18 Thread Anthony Liguori
Yang, Sheng wrote:
 @@ -1048,17 +1071,18 @@ static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 *shadow_pte,
          * whether the guest actually used the pte (in order to detect
          * demand paging).
          */
 -       spte = PT_PRESENT_MASK | PT_DIRTY_MASK;
 +       spte = shadow_base_present_pte | shadow_dirty_mask;
         if (!speculative)
                 pte_access |= PT_ACCESSED_MASK;
         if (!dirty)
                 pte_access &= ~ACC_WRITE_MASK;
 -       if (!(pte_access & ACC_EXEC_MASK))
 -               spte |= PT64_NX_MASK;
 -
 -       spte |= PT_PRESENT_MASK;
 +       if (pte_access & ACC_EXEC_MASK) {
 +               if (shadow_x_mask)
 +                       spte |= shadow_x_mask;
 +       } else if (shadow_nx_mask)
 +               spte |= shadow_nx_mask;

This looks like it may be a bug.  The old behavior sets NX if 
(pte_access & ACC_EXEC_MASK).  The new behavior unconditionally sets NX 
and never sets PRESENT.  Also, the if (shadow_x_mask) checks are 
unnecessary.  spte |= 0 is a nop.

         if (pte_access & ACC_USER_MASK)
 -               spte |= PT_USER_MASK;
 +               spte |= shadow_user_mask;
         if (largepage)
                 spte |= PT_PAGE_SIZE_MASK;
   

Regards,

Anthony Liguori



Re: [kvm-devel] Extboot Option ROM rewritten in C - v3

2008-04-18 Thread Nguyen Anh Quynh
On 4/18/08, Anthony Liguori [EMAIL PROTECTED] wrote:
 Nguyen Anh Quynh wrote:

  On Thu, Apr 17, 2008 at 3:00 PM, H. Peter Anvin [EMAIL PROTECTED] wrote:
 
 
  
+   .globl linux_boot
+linux_boot:
+   cli
+   cld
+   mov $0x9000, %ax
+   mov %ax, %ds
+   mov %ax, %es
+   mov %ax, %fs
+   mov %ax, %gs
+   mov %ax, %ss
+   mov $0x8ffe, %sp
+   ljmp $0x9000 + 0x20, $0
   
   
   
   The hard use of segment 9000 is really highly unfortunate for bzImage,
   since it restricts its heap more than necessary.  I suggest following the
   patterns used by the (new) Qemu loader.
  
  
 
  Actually, this code is left from the original code of Anthony, and it
  seems he took it from qemu 0.8 version.
 
  Anthony, could you explain why you want to hijack the linux boot process
  here? If I understand correctly, we can just let the original int19
  execute, and if linux boot is desired, it would work in the normal way. So
  why do you want to do this?
 
 

  The thinking is to eliminate the need to hijack the boot sector when using
 the -kernel option.

I see, but does that offer any advantage over the current approach?

Thanks,
Q



Re: [kvm-devel] Extboot Option ROM rewritten in C - v3

2008-04-18 Thread H. Peter Anvin
Anthony Liguori wrote:
 
 The thinking is to eliminate the need to hijack the boot sector when 
 using the -kernel option.  However, the linux boot stuff in extboot has 
 been broken since hpa rewrote the boot code.  It can be removed for now 
 and I'll eventually revisit it.
 

It probably makes more sense to have a different boot ROM for that.  I 
thought this was extboot, but apparently not.  I probably can throw 
something together.

-hpa




Re: [kvm-devel] [PATCH] gfxboot VMX workaround v2

2008-04-18 Thread Guillaume Thouvenin
On Fri, 18 Apr 2008 08:23:07 -0500
Anthony Liguori [EMAIL PROTECTED] wrote:

 
 This doesn't seem right.  You should have been able to break out of the 
 emulator long before encountering an out instruction.  The next 
 instruction you encounter should be a mov instruction.  Are you sure 
 you're updating eip correctly?

I think that eip is updated correctly but you're right, I think that
the condition to stop emulation is not well implemented. I emulate a
lot of mov instructions and I remain blocked in the emulation loop
until I reach the out instruction. The loop is the following:

  [...]
  cs_rpl = vmcs_read16(GUEST_CS_SELECTOR) & SELECTOR_RPL_MASK;
  ss_rpl = vmcs_read16(GUEST_SS_SELECTOR) & SELECTOR_RPL_MASK;

  while (cs_rpl != ss_rpl) {
          if (emulate_instruction(vcpu, NULL, 0, 0, 0) == EMULATE_FAIL) {
                  printk(KERN_INFO "%s: emulation of 0x%x failed\n",
                         __FUNCTION__,
                         vcpu->arch.emulate_ctxt.decode.b);
                  return -1;
          }
          cs_rpl = vmcs_read16(GUEST_CS_SELECTOR) & SELECTOR_RPL_MASK;
          ss_rpl = vmcs_read16(GUEST_SS_SELECTOR) & SELECTOR_RPL_MASK;
  }
  printk(KERN_INFO "%s: VMX friendly state recovered\n", __FUNCTION__);
  // I never reach this point

Maybe the CS and SS selectors are not updated correctly. I will add traces
to see their values before and after the emulation.


Regards,
Guillaume



Re: [kvm-devel] pv clock: kvm is incompatible with xen :-(

2008-04-18 Thread Gerd Hoffmann
Jeremy Fitzhardinge wrote:
 Gerd Hoffmann wrote:
 Wall clock is off a few hours though.  Oops.

 I think the way wall clock and system clock work together in xen (Jeremy
 correct me if I'm wrong) is that the wall clock specifies the point in
 time where the system clock started going.  As kvm fills in host system
 time into the guest system time fields the guest wall clock fields
 should be filled with the host boot time timestamp I'd say.
   
 
 Yes.  The wallclock field in the shared info structure is the wallclock
 time at boot; you compute the current time by adding the system
 timestamp to it.   System time changes are effected by retroactively
 changing the boot time of the machine, though that can also change
 because of suspend/resume/migrate.
 
 In general the kernel only reads the wallclock time at boot, and then
 maintains it for itself from then on.  I think.

Thanks.

I'm looking at the guest side of the issue right now, trying to identify
common code, and while doing so noticed that xen does the
version-check-loop in both get_time_values_from_xen(void) and
xen_clocksource_read(void), and I can't see any obvious reason for that.
 The loop in xen_clocksource_read(void) is not needed IMHO.  Can I drop it?

cheers,
  Gerd

-- 
http://kraxel.fedorapeople.org/xenner/



Re: [kvm-devel] Extboot Option ROM rewritten in C - v3

2008-04-18 Thread H. Peter Anvin
Anthony Liguori wrote:
 Nguyen Anh Quynh wrote:

  The thinking is to eliminate the need to hijack the boot sector when 
 using
 the -kernel option.
 

 I see, but does that offer any advantage over the current approach?
   
 
 You no longer have to specify a -hda option when using -kernel.
 

Plus, you don't have funny side effects if you do -- and I suspect there 
is ad hoc code in the disk driver which can be removed.

-hpa




Re: [kvm-devel] Extboot Option ROM rewritten in C - v3

2008-04-18 Thread Nguyen Anh Quynh
On 4/18/08, Anthony Liguori [EMAIL PROTECTED] wrote:
 Nguyen Anh Quynh wrote:

 
  
The thinking is to eliminate the need to hijack the boot sector when
 using
   the -kernel option.
  
  
 
  I see, but does that offer any advantage over the current approach?
 
 

  You no longer have to specify a -hda option when using -kernel.

Without -hda, how can we load a disk image? Or do you mean you only want to
test the kernel?

Thanks,
Q



Re: [kvm-devel] [PATCH 3/3] Implement linux-aio backend

2008-04-18 Thread Marcelo Tosatti
On Thu, Apr 17, 2008 at 02:26:52PM -0500, Anthony Liguori wrote:
 This patch introduces a Linux-aio backend that is disabled by default.  To
 use this backend effectively, the user should disable caching and select
 it with the appropriate -aio option.  For instance:
 
 qemu-system-x86_64 -drive foo.img,cache=off -aio linux
 
 There's no universal way to wait asynchronously with linux-aio.  At some point,
 signals were added to signal completion.  More recently, an eventfd interface
 was added.  This patch relies on the latter.
 
 We try hard to detect whether the right support is available in configure to
 avoid compile failures.

 +do {
 + err = io_submit(aio_ctxt_id, 1, iocbs);
 +} while (err == -1 && errno == EINTR);
 +
 +if (err != 1) {
 + fprintf(stderr, "failed to submit aio request: %m\n");
 + exit(1);
 +}
 +
 +outstanding_requests++;
 +
 +return &aiocb->common;
 +}
 +
 +static void la_wait(void)
 +{
 +main_loop_wait(10);
 +}

Sleeping in the context of vcpu's is extremely bad (eg virtio-block
blocks in write() throttling which kills performance). It should wait
on IO completions instead (qemu-kvm.c creates a pthread waitqueue to
resolve that issue).
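
A minimal sketch of that idea, assuming completions are signalled through an eventfd: rather than sleeping for a fixed 10 ms, block in poll() on the descriptor. wait_for_completions() is a hypothetical helper for illustration, not part of the patch or of qemu-kvm.c.

```c
#include <poll.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/*
 * Block until the eventfd becomes readable or timeout_ms expires.
 * Returns the counter value read (number of signalled events), 0 on
 * timeout, or -1 on error.  Hypothetical helper, not from the patch.
 */
static int64_t wait_for_completions(int efd, int timeout_ms)
{
    struct pollfd pfd = { .fd = efd, .events = POLLIN };
    uint64_t count;
    int rc;

    rc = poll(&pfd, 1, timeout_ms);
    if (rc <= 0)
        return rc;      /* 0: timed out, -1: poll failed */

    if (read(efd, &count, sizeof(count)) != (ssize_t)sizeof(count))
        return -1;

    return (int64_t)count;
}
```

The caller wakes as soon as a completion is posted, so the vcpu thread is not held up for the whole timeout.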

Other than that looks fine to me, will give it a try.



Re: [kvm-devel] [PATCH 3/6] KVM: MMU: Add EPT support

2008-04-18 Thread Yang, Sheng
On Friday 18 April 2008 21:30:14 Anthony Liguori wrote:
 Yang, Sheng wrote:
  @@ -1048,17 +1071,18 @@ static void mmu_set_spte(struct kvm_vcpu *vcpu,
  u64 *shadow_pte,
   * whether the guest actually used the pte (in order to detect
   * demand paging).
   */
  -   spte = PT_PRESENT_MASK | PT_DIRTY_MASK;
  +   spte = shadow_base_present_pte | shadow_dirty_mask;
  if (!speculative)
  pte_access |= PT_ACCESSED_MASK;
  if (!dirty)
  pte_access &= ~ACC_WRITE_MASK;
  -   if (!(pte_access & ACC_EXEC_MASK))
  -   spte |= PT64_NX_MASK;
  -
  -   spte |= PT_PRESENT_MASK;
  +   if (pte_access & ACC_EXEC_MASK) {
  +   if (shadow_x_mask)
  +   spte |= shadow_x_mask;
  +   } else if (shadow_nx_mask)
  +   spte |= shadow_nx_mask;

 This looks like it may be a bug.  The old behavior sets NX if
 (pte_access & ACC_EXEC_MASK).  The new behavior unconditionally sets NX
 and never sets PRESENT.  Also, the if (shadow_x_mask) checks are
 unnecessary.  spte |= 0 is a nop.

Thanks for the comment! I realize the two checks of shadow_x_mask and
shadow_nx_mask are unnecessary. In fact, the correct behavior is to set either
shadow_x_mask or shadow_nx_mask; maybe there is a better approach for this.
Logic enforced by the program itself is always safer, but I will remove the
redundant code first.
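
A minimal sketch of the simplified form, for illustration only. apply_exec_bits() is a hypothetical stand-alone helper and the ACC_EXEC_MASK value is made up; in the patch the masks are set up by the MMU code.

```c
#include <stdint.h>

/* Illustrative value only; the real mask comes from the MMU code. */
#define ACC_EXEC_MASK 1u

static uint64_t apply_exec_bits(uint64_t spte, unsigned pte_access,
                                uint64_t shadow_x_mask,
                                uint64_t shadow_nx_mask)
{
    /* OR-ing in a zero mask is a no-op, so no extra if (mask) test is needed. */
    if (pte_access & ACC_EXEC_MASK)
        spte |= shadow_x_mask;
    else
        spte |= shadow_nx_mask;
    return spte;
}
```

With shadow_x_mask == 0 this sets NX only for non-executable pages, as the old code did; with shadow_nx_mask == 0 it sets the execute bit only for executable pages.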

But I don't think it's a bug. The old behavior set NX if (!(pte_access &
ACC_EXEC_MASK)), the same as the new one. I am also curious about the
PRESENT bit. You see, the PRESENT bit was set at the beginning of the code,
and I really don't know why the duplicate one exists there...


  if (pte_access & ACC_USER_MASK)
  -   spte |= PT_USER_MASK;
  +   spte |= shadow_user_mask;
  if (largepage)
  spte |= PT_PAGE_SIZE_MASK;

-- 
Thanks
Yang, Sheng


 Regards,

 Anthony Liguori





Re: [kvm-devel] [PATCH 3/3] Implement linux-aio backend

2008-04-18 Thread Anthony Liguori
Marcelo Tosatti wrote:
 On Thu, Apr 17, 2008 at 02:26:52PM -0500, Anthony Liguori wrote:
   
 This patch introduces a Linux-aio backend that is disabled by default.  To
 use this backend effectively, the user should disable caching and select
 it with the appropriate -aio option.  For instance:

 qemu-system-x86_64 -drive foo.img,cache=off -aio linux

 There's no universal way to wait asynchronously with linux-aio.  At some point,
 signals were added to signal completion.  More recently, an eventfd interface
 was added.  This patch relies on the latter.

 We try hard to detect whether the right support is available in configure to
 avoid compile failures.
 

   
 +do {
 +err = io_submit(aio_ctxt_id, 1, iocbs);
 +} while (err == -1 && errno == EINTR);
 +
 +if (err != 1) {
 +fprintf(stderr, "failed to submit aio request: %m\n");
 +exit(1);
 +}
 +
 +outstanding_requests++;
 +
 +return &aiocb->common;
 +}
 +
 +static void la_wait(void)
 +{
 +main_loop_wait(10);
 +}
 

 Sleeping in the context of vcpu's is extremely bad (eg virtio-block
 blocks in write() throttling which kills performance). It should wait
 on IO completions instead (qemu-kvm.c creates a pthread waitqueue to
 resolve that issue).

 Other than that looks fine to me, will give it a try.
   

FWIW, I'm not getting wonderful results in KVM.  It's hard to tell 
though because time seems wildly inaccurate (even with kvm clock in the 
guest).  The time issue appears unrelated to this set of patches.

Regards,

Anthony Liguori



Re: [kvm-devel] Extboot Option ROM rewritten in C - v3

2008-04-18 Thread Anthony Liguori
Nguyen Anh Quynh wrote:
  You no longer have to specify a -hda option when using -kernel.
 

 Without -hda, how can we load disk image? Or you mean you only want to
 test the kernel?
   

Right.  You may be booting from NFS, iSCSI, or something like that.

Regards,

Anthony Liguori

 Thanks,
 Q
   




Re: [kvm-devel] [Qemu-devel] Re: [PATCH 1/3] Refactor AIO interface to allow other AIO implementations

2008-04-18 Thread Anthony Liguori
Jamie Lokier wrote:
 I've basically got a choice of making libvirt always add '-aio linux'
 or never add it at all. My inclination is to the latter since it is
 compatible with existing QEMU which has no -aio option. Presumably
 '-aio linux' is intended to provide some performance benefit so it'd
 be nice to use it. If we can't express some criteria under which it
 should be turned on, I can't enable it; whereas if you can express
 some criteria, then QEMU should apply them automatically.
 

 I'm of the view that '-aio auto' would be a really good option - and
 when it's proven itself, it should be the default.  It could work on
 all QEMU hosts: it would pick synchronous IO when there is nothing else.
   

Right now, not specifying the -aio option is equivalent to your proposed 
-aio auto.

I guess I should include an info aio to let the user know what type of 
aio they are using.  We can add selection criteria later but 
semantically, not specifying an explicit -aio option allows QEMU to 
choose whichever one it thinks is best.

Regards,

Anthony Liguori




Re: [kvm-devel] [PATCH] gfxboot VMX workaround v2

2008-04-18 Thread Anthony Liguori
Guillaume Thouvenin wrote:
 On Fri, 18 Apr 2008 08:23:07 -0500
 Anthony Liguori [EMAIL PROTECTED] wrote:

  
   
 This doesn't seem right.  You should have been able to break out of the 
 emulator long before encountering an out instruction.  The next 
 instruction you encounter should be a mov instruction.  Are you sure 
 you're updating eip correctly?
 

 I think that eip is updated correctly but you're right, I think that
 the condition to stop emulation is not well implemented. I emulate a
 lot of mov instructions and I remain blocked in the emulation loop
 until I reach the out instruction. The loop is the following:

   [...]
   cs_rpl = vmcs_read16(GUEST_CS_SELECTOR) & SELECTOR_RPL_MASK;
   ss_rpl = vmcs_read16(GUEST_SS_SELECTOR) & SELECTOR_RPL_MASK;

   while (cs_rpl != ss_rpl) {
           if (emulate_instruction(vcpu, NULL, 0, 0, 0) == EMULATE_FAIL) {
                   printk(KERN_INFO "%s: emulation of 0x%x failed\n",
                          __FUNCTION__,
                          vcpu->arch.emulate_ctxt.decode.b);
                   return -1;
           }
           cs_rpl = vmcs_read16(GUEST_CS_SELECTOR) & SELECTOR_RPL_MASK;
           ss_rpl = vmcs_read16(GUEST_SS_SELECTOR) & SELECTOR_RPL_MASK;
   }
   printk(KERN_INFO "%s: VMX friendly state recovered\n", __FUNCTION__);
   // I never reach this point

 Maybe the CS and SS selectors are not updated correctly. I will add tracing
 to see their values before and after the emulation.
   

I'd prefer you not do an emulate_instruction loop at all.  Just emulate 
one instruction on vmentry failure and let VT tell you what instructions 
you need to emulate.

It's only four instructions so I don't think the performance is going to 
matter.  Take a look at the patch I posted previously.

Regards,

Anthony Liguori

 Regards,
 Guillaume
   




Re: [kvm-devel] [PATCH 1/1] Enble a guest to access a device's memory mapped I/O regions directly.

2008-04-18 Thread Avi Kivity
[EMAIL PROTECTED] wrote:
 From: Ben-Ami Yassour [EMAIL PROTECTED]

 Signed-off-by: Ben-Ami Yassour [EMAIL PROTECTED]
 Signed-off-by: Muli Ben-Yehuda [EMAIL PROTECTED]
 ---
  arch/x86/kvm/mmu.c |   59 +--
  arch/x86/kvm/paging_tmpl.h |   19 +
  include/linux/kvm_host.h   |2 +-
  virt/kvm/kvm_main.c|   17 +++-
  4 files changed, 69 insertions(+), 28 deletions(-)

 diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
 index 078a7f1..c89029d 100644
 --- a/arch/x86/kvm/mmu.c
 +++ b/arch/x86/kvm/mmu.c
 @@ -112,6 +112,8 @@ static int dbg = 1;
  #define PT_FIRST_AVAIL_BITS_SHIFT 9
  #define PT64_SECOND_AVAIL_BITS_SHIFT 52
  
 +#define PT_SHADOW_IO_MARK (1ULL << PT_FIRST_AVAIL_BITS_SHIFT)
 +
   

Please rename this PT_SHADOW_MMIO_MASK.


  #define VALID_PAGE(x) ((x) != INVALID_PAGE)
  
  #define PT64_LEVEL_BITS 9
 @@ -237,6 +239,9 @@ static int is_dirty_pte(unsigned long pte)
  
  static int is_rmap_pte(u64 pte)
  {
 + if (pte & PT_SHADOW_IO_MARK)
 + return false;
 +
   return is_shadow_present_pte(pte);
  }
   

Why avoid rmap on mmio pages?  Sure it's unnecessary work, but having
fewer cases improves overall reliability.

You can use pfn_valid() in gfn_to_pfn() and kvm_release_pfn_*() to 
conditionally update the page refcounts.

-- 
Any sufficiently difficult bug is indistinguishable from a feature.




Re: [kvm-devel] [PATCH 3/6] KVM: MMU: Add EPT support

2008-04-18 Thread Anthony Liguori
Yang, Sheng wrote:
 On Friday 18 April 2008 21:30:14 Anthony Liguori wrote:
   
 Yang, Sheng wrote:
 
 @@ -1048,17 +1071,18 @@ static void mmu_set_spte(struct kvm_vcpu *vcpu,
 u64 *shadow_pte,
  * whether the guest actually used the pte (in order to detect
  * demand paging).
  */
 -   spte = PT_PRESENT_MASK | PT_DIRTY_MASK;
 +   spte = shadow_base_present_pte | shadow_dirty_mask;
 if (!speculative)
 pte_access |= PT_ACCESSED_MASK;
 if (!dirty)
 pte_access &= ~ACC_WRITE_MASK;
 -   if (!(pte_access & ACC_EXEC_MASK))
 -   spte |= PT64_NX_MASK;
 -
 -   spte |= PT_PRESENT_MASK;
 +   if (pte_access & ACC_EXEC_MASK) {
 +   if (shadow_x_mask)
 +   spte |= shadow_x_mask;
 +   } else if (shadow_nx_mask)
 +   spte |= shadow_nx_mask;
   
 This looks like it may be a bug.  The old behavior sets NX if
 (pte_access & ACC_EXEC_MASK).  The new behavior unconditionally sets NX
 and never sets PRESENT.  Also, the if (shadow_x_mask) checks are
 unnecessary.  spte |= 0 is a nop.
 

 Thanks for the comment! I realize the two checks of shadow_x_mask and
 shadow_nx_mask are unnecessary. In fact, the correct behavior is to set either
 shadow_x_mask or shadow_nx_mask; maybe there is a better approach for this.
 Logic enforced by the program itself is always safer, but I will remove the
 redundant code first.

 But I don't think it's a bug. The old behavior set NX if (!(pte_access &
 ACC_EXEC_MASK)), the same as the new one.

The new behavior sets NX regardless of whether (pte_access &
ACC_EXEC_MASK).  Is the desired change to unconditionally set NX?

 I am also curious about the
 PRESENT bit. You see, the PRESENT bit was set at the beginning of the code,
 and I really don't know why the duplicate one exists there...
   

Looking at the code, you appear to be right.  In the future, I think you 
should separate any cleanups (like removing the redundant setting of 
PRESENT) into a separate patch and stick to just programmatic changes of 
PT_USER_MASK => shadow_user_mask, etc. in this patch.  That makes it a 
lot easier to review correctness.

Regards,

Anthony Liguori

 if (pte_access & ACC_USER_MASK)
 -   spte |= PT_USER_MASK;
 +   spte |= shadow_user_mask;
 if (largepage)
 spte |= PT_PAGE_SIZE_MASK;
   

   




Re: [kvm-devel] [PATCH 1/1] Enble a guest to access a device's memory mapped I/O regions directly.

2008-04-18 Thread Avi Kivity
[EMAIL PROTECTED] wrote:
 From: Ben-Ami Yassour [EMAIL PROTECTED]

 Signed-off-by: Ben-Ami Yassour [EMAIL PROTECTED]
 Signed-off-by: Muli Ben-Yehuda [EMAIL PROTECTED]
 ---
  libkvm/libkvm.c   |   24 
  qemu/hw/pci-passthrough.c |   89 
 +++--
  qemu/hw/pci-passthrough.h |2 +
  3 files changed, 40 insertions(+), 75 deletions(-)

 diff --git a/libkvm/libkvm.c b/libkvm/libkvm.c
 index de91328..8c02af9 100644
 --- a/libkvm/libkvm.c
 +++ b/libkvm/libkvm.c
 @@ -400,7 +400,7 @@ void *kvm_create_userspace_phys_mem(kvm_context_t kvm, 
 unsigned long phys_start,
  {
   int r;
   int prot = PROT_READ;
 - void *ptr;
 + void *ptr = NULL;
   struct kvm_userspace_memory_region memory = {
   .memory_size = len,
   .guest_phys_addr = phys_start,
 @@ -410,16 +410,24 @@ void *kvm_create_userspace_phys_mem(kvm_context_t kvm, 
 unsigned long phys_start,
   if (writable)
   prot |= PROT_WRITE;
  
 - ptr = mmap(NULL, len, prot, MAP_ANONYMOUS | MAP_SHARED, -1, 0);
 - if (ptr == MAP_FAILED) {
 - fprintf(stderr, "create_userspace_phys_mem: %s", 
 strerror(errno));
 - return 0;
 - }
 + if (len > 0) {
 + ptr = mmap(NULL, len, prot, MAP_ANONYMOUS | MAP_SHARED, -1, 0);
 + if (ptr == MAP_FAILED) {
 + fprintf(stderr, "create_userspace_phys_mem: %s",
 + strerror(errno));
 + return 0;
 + }
  
 - memset(ptr, 0, len);
 + memset(ptr, 0, len);
 + }
  
   memory.userspace_addr = (unsigned long)ptr;
 - memory.slot = get_free_slot(kvm);
 +
 + if (len > 0)
 + memory.slot = get_free_slot(kvm);
 + else
 + memory.slot = get_slot(phys_start);
 +
   r = ioctl(kvm->vm_fd, KVM_SET_USER_MEMORY_REGION, &memory);
   if (r == -1) {
   fprintf(stderr, "create_userspace_phys_mem: %s", 
 strerror(errno));
   

This looks like support for zero-length memory slots? Why is it needed?

It needs to be in a separate patch.

 diff --git a/qemu/hw/pci-passthrough.c b/qemu/hw/pci-passthrough.c
 index 7ffcc7b..a5894d9 100644
 --- a/qemu/hw/pci-passthrough.c
 +++ b/qemu/hw/pci-passthrough.c
 @@ -25,18 +25,6 @@ typedef __u64 resource_size_t;
  extern kvm_context_t kvm_context;
  extern FILE *logfile;
  
 -CPUReadMemoryFunc *pt_mmio_read_cb[3] = {
 - pt_mmio_readb,
 - pt_mmio_readw,
 - pt_mmio_readl
 -};
 -
 -CPUWriteMemoryFunc *pt_mmio_write_cb[3] = {
 - pt_mmio_writeb,
 - pt_mmio_writew,
 - pt_mmio_writel
 -};
 -
   

There's at least one use case for keeping mmio in userspace: 
reverse-engineering a device driver. So if it doesn't cause too much 
trouble, please keep this an option.


-- 
Any sufficiently difficult bug is indistinguishable from a feature.




[kvm-devel] [PATCH] [QEMU POWERPC] FPRs no longer live in kvm_vcpu

2008-04-18 Thread Hollis Blanchard
Signed-off-by: Hollis Blanchard [EMAIL PROTECTED]

diff --git a/qemu/qemu-kvm-powerpc.c b/qemu/qemu-kvm-powerpc.c
--- a/qemu/qemu-kvm-powerpc.c
+++ b/qemu/qemu-kvm-powerpc.c
@@ -72,7 +72,6 @@
 
 for (i = 0;i < 32; i++){
 regs.gpr[i] = env->gpr[i];
-regs.fpr[i] = env->fpr[i];
 }
 
 rc = kvm_set_regs(kvm_context, env->cpu_index, &regs);
@@ -113,7 +112,6 @@
 
 for (i = 0;i < 32; i++){
 env->gpr[i] = regs.gpr[i];
-env->fpr[i] = regs.fpr[i];
 }
 
 }



Re: [kvm-devel] direct mmio for passthrough - kernel part

2008-04-18 Thread Avi Kivity
[EMAIL PROTECTED] wrote:
 This patch for PCI passthrough devices enables a guest to access a device's
 memory mapped I/O regions directly, without requiring the host to trap and
 emulate every MMIO access. 

 Updated from last version: we create a memory slot for each MMIO region of the
 guest's devices, and then use the /sys/bus/pci/.../resource# mapping to find 
 the
 hfn for that MMIO region. The kernel part and the userspace part of this
 patchset apply to Amit's pv-dma tree.  Tested on a Lenovo M57p with an e1000 
 NIC
 assigned directly to an FC8 guest.

 Comments are appreciated. 
   

I see no support for cache attributes in the page attribute table (PAT) or
the MTRRs.  I guess this will work for most devices (as they will be set as
uncacheable by the MTRRs), but for display cards we'd need to set the
VRAM as write-combining to get reasonable performance.  This requires
MTRR and PAT emulation in kvm so we can detect the guest's intentions.

-- 
Any sufficiently difficult bug is indistinguishable from a feature.




Re: [kvm-devel] [PATCH 0/5] SVM CR8 optimization patches

2008-04-18 Thread Avi Kivity
Joerg Roedel wrote:
 This patch series implements optimizations to the CR8 intercept handling in
 SVM. With these patches applied CR8 reads are not intercepted anymore. The
 writes to CR8 are only intercepted if the TPR masks interrupts. This
 significantly reduces the number of total CR8 intercepts when running Windows
 64 bit versions. Some quick numbers:

 Boot and shutdown of Vista 64:

 Without these patches: ~38.000.000 CR8 writes intercepted
 With these patches:    ~38.000 CR8 writes intercepted

   

Applied all, thanks.  Good patchset.


-- 
Any sufficiently difficult bug is indistinguishable from a feature.




Re: [kvm-devel] [PATCH] SVM: remove selective CR0 comment

2008-04-18 Thread Avi Kivity
Joerg Roedel wrote:
 There is no selective CR0 intercept bug. The code in the comment sets the
 CR0.PG bit. But KVM always sets the CR0.PG bit for SVM to implement paged
 real mode. So the 'mov %eax,%cr0' instruction does not change the CR0.PG bit.
 Selective CR0 intercepts only occur when a bit is actually changed. So it's the
 right behavior that there is no intercept on this instruction.

   
Applied, thanks.


-- 
Any sufficiently difficult bug is indistinguishable from a feature.




Re: [kvm-devel] Second KVM process hangs eating 80-100% CPU on host during startup

2008-04-18 Thread Avi Kivity
Alex Davis wrote:
 Host software:
 Linux 2.6.24.4
 KVM 65 (I am using the kernel modules from this release).
 X11 7.2 from Xorg
 SDL 1.2.13
 GCC 4.1.1
 Glibc 2.4

 Host hardware:
 Asus P5B Deluxe (P965 chipset based) motherboard
 4 GB RAM
 Intel E6700 CPU

 Guest software:
 Slackware 12.0 installed from CD-ROM.

 Command used to start first KVM instance:
 /usr/local/bin/qemu-system-x86_64 -hda /spare/vdisk1.img -cdrom /dev/cdrom 
 -boot c -m 384 -net
 nic,macaddr=DE:AD:BE:EF:11:29 -net tap,ifname=tap0,script=no 

 Command used to start second KVM instance:
 /usr/local/bin/qemu-system-x86_64 -hda /spare/vdisk2.img -cdrom /dev/cdrom 
 -boot c -m 384 -net
 nic,macaddr=DE:AD:BE:EF:11:30 -net tap,ifname=tap1,script=no 

 tap0 and tap1 are bridged on the host. The guest OS was installed on 
 /spare/vdisk1.img, 
 which was initially created by /usr/local/bin/qemu-img create -f qcow 
 /spare/vdisk.img 10G
 After the guest installation completed, vdisk1 was copied to vdisk2.

 The second instance always stops after printing
 Checking if the processor honours the WP bit even in supervisor mode... Ok.
 It stays hung until I press the return key in the first instance; sometimes 
 clicking in another X
 window will wake it up as well. 

 This is a test machine so I can test patches (almost) at will.

   

Strange.  Does pinning each guest to a different cpu help (use 'taskset
1 qemu ... vdisk1.img' and 'taskset 2 qemu ... vdisk2.img')?


-- 
Any sufficiently difficult bug is indistinguishable from a feature.




Re: [kvm-devel] [Qemu-devel] Re: [PATCH 1/3] Refactor AIO interface to allow other AIO implementations

2008-04-18 Thread Jamie Lokier
Anthony Liguori wrote:
 I'm of the view that '-aio auto' would be a really good option - and
 when it's proven itself, it should be the default.  It could work on
 all QEMU hosts: it would pick synchronous IO when there is nothing else.
 
 Right now, not specifying the -aio option is equivalent to your proposed 
 -aio auto.
 
 I guess I should include an info aio to let the user know what type of 
 aio they are using.  We can add selection criteria later but 
 semantically, not specifying an explicit -aio option allows QEMU to 
 choose whichever one it thinks is best.

Great.  I guess the next step is to add selection criteria, otherwise
a million Wikis will tell everyone to use '-aio linux' :-)

Do you know what the selection criteria should be - or is there a
document/paper somewhere which says (ideally from benchmarks)?  I'm
interested for an unrelated project using AIO - so I'm willing to help
get this right to some extent.

-- Jamie



Re: [kvm-devel] [PATCH] pass virtio disk geometry via config space

2008-04-18 Thread Avi Kivity
Ryan Harper wrote:
 From: Ryan Harper [EMAIL PROTECTED]

 Rather than faking up some geometry, allow the backend to push the disk
 geometry via virtio pci config option.  Keep the old geo code around for
 compatibility.

   

Applied, thanks.

  struct virtio_blk_config
  {
  uint64_t capacity;
  uint32_t size_max;
  uint32_t seg_max;
 +uint16_t cylinders;
 +uint8_t heads;
 +uint8_t sectors;
  };
  
   

I packed the structure here to avoid gcc surprises on odd architectures.
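
How the structure was packed is not shown in this mail; one way, assuming GCC's packed attribute, might look like the sketch below. The type name virtio_blk_config_packed is only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Same fields as in the quoted patch, with the GCC packed attribute. */
struct virtio_blk_config_packed {
    uint64_t capacity;
    uint32_t size_max;
    uint32_t seg_max;
    uint16_t cylinders;
    uint8_t heads;
    uint8_t sectors;
} __attribute__((packed));

/* Compile-time checks: 8 + 4 + 4 + 2 + 1 + 1 bytes, no padding. */
_Static_assert(sizeof(struct virtio_blk_config_packed) == 20,
               "unexpected size");
_Static_assert(offsetof(struct virtio_blk_config_packed, cylinders) == 16,
               "unexpected offset");
```

Without the attribute, a compiler may round the structure size up to the alignment of its widest member, so the layout seen by the guest could differ between architectures.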

-- 
Any sufficiently difficult bug is indistinguishable from a feature.




Re: [kvm-devel] [PATCH 0/3] Qemu crashes with pci passthrough

2008-04-18 Thread Avi Kivity
Glauber de Oliveira Costa wrote:
 Hi, 

 I've got some qemu crashes while trying to passthrough an ide device
 to a kvm guest. After some investigation, it turned out that 
 register_ioport_{read/write} will abort on errors instead of returning
 a meaningful error.

 However, even if we do return an error, the asynchronous nature of pci
 config space mapping updates makes it a little bit hard to treat.

 This series of patches basically handles errors in the mapping functions in
 the pci layer. If anything goes wrong, we unregister the pci device, unmapping
 any mappings that happened to be successful already.

 After these patches are applied, a lot of warnings appear. And, you know,
 every time there is a warning, god kills a kitten. But I'm not planning on
 touching the other pieces of qemu code for this until we settle (or not) on
 this solution.

 Comments are very welcome, especially from qemu folks (since it is a bit
 invasive).

   

Have you considered, instead of rolling back the changes you already
made before the failure, having a function which checks whether an ioport
registration will be successful?  This may simplify the code.
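
A rough sketch of the check-first approach, using a made-up ownership table rather than QEMU's real ioport tables. ioport_in_use, ioport_range_is_free() and ioport_claim() are hypothetical names for illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_IOPORTS 65536

/* Hypothetical ownership table: true means the port is already claimed. */
static bool ioport_in_use[MAX_IOPORTS];

/* Report whether [start, start + length) lies in range and is unclaimed. */
static bool ioport_range_is_free(uint32_t start, uint32_t length)
{
    if (length == 0 || start >= MAX_IOPORTS || length > MAX_IOPORTS - start)
        return false;
    for (uint32_t port = start; port < start + length; port++) {
        if (ioport_in_use[port])
            return false;
    }
    return true;
}

/* Claim the range only if the check passes; nothing to undo on failure. */
static int ioport_claim(uint32_t start, uint32_t length)
{
    if (!ioport_range_is_free(start, length))
        return -1;
    for (uint32_t port = start; port < start + length; port++)
        ioport_in_use[port] = true;
    return 0;
}
```

Because the check runs before any state is modified, a failed registration leaves the table untouched and needs no rollback path.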

-- 
Any sufficiently difficult bug is indistinguishable from a feature.




Re: [kvm-devel] [Qemu-devel] Re: [PATCH 1/3] Refactor AIO interface to allow other AIO implementations

2008-04-18 Thread Avi Kivity
Anthony Liguori wrote:
 Right now, not specifying the -aio option is equivalent to your proposed 
 -aio auto.

 I guess I should include an info aio to let the user know what type of 
 aio they are using.  We can add selection criteria later but 
 semantically, not specifying an explicit -aio option allows QEMU to 
 choose whichever one it thinks is best.

   
For the majority of deployments posix aio should be sufficient.  The few 
that need something else can use Linux aio.

Of course, a managed environment can use Linux aio unconditionally if
it knows the kernel has all the needed goodies.

-- 
Any sufficiently difficult bug is indistinguishable from a feature.




Re: [kvm-devel] [PATCH 3/3] Implement linux-aio backend

2008-04-18 Thread Marcelo Tosatti
On Fri, Apr 18, 2008 at 10:18:33AM -0500, Anthony Liguori wrote:
 Sleeping in the context of vcpu's is extremely bad (eg virtio-block
 blocks in write() throttling which kills performance). It should wait
 on IO completions instead (qemu-kvm.c creates a pthread waitqueue to
 resolve that issue).
 
 Other than that looks fine to me, will give it a try.
   
 
 FWIW, I'm not getting wonderful results in KVM.  It's hard to tell 
 though because time seems wildly inaccurate (even with kvm clock in the 
 guest).  The time issue appears unrelated to this set of patches.

Oh, you won't get completion signals on the aio eventfd. You might want
to try the select-with-timeout() stuff.

Will submit that with proper signalfd emulation shortly.



Re: [kvm-devel] VM Snapshots ?

2008-04-18 Thread Protti, Duilio J
Hi Uri,

The method you propose does not in fact work (tested with KVM 65), at
least with Windows XP as the guest.

After performing steps 1 to 7 with no errors:

 - In step 8, the VM in question is already loaded and its user
interface is shown in the X window (as mentioned, Windows XP in my
tests).

 - After step 9, the VM seems to be unstopped (no more '[Stopped]' title
in the X window caption) but in fact it doesn't run.

The X window appears to respond to mouse events, i.e. the "Press
Ctrl-Alt to exit grab" message appears on mouse click, but the Windows
XP interface does not respond. Also, the top command shows near 0% CPU usage
for the qemu process, so it seems that Windows XP is not resumed
after 'cont' in the qemu monitor.

Is this supposed to work? Or does this type of guest not
support these operations?

Thanks,

Duilio Protti

Intel Corporation



Re: [kvm-devel] disappointing speed with virtio_blk

2008-04-18 Thread Gerd von Egidy
Hi Marcelo,

   
http://www.mail-archive.com/kvm-devel@lists.sourceforge.net/msg14732.html
 
  I tried it this evening with kvm 66 - which should include your patch,
  right?

 No its not included. The issue is being worked on.

my bad, sorry.

Now I know I really have that patch: qemu-kvm hangs :(

I was trying kvm 66 with only the patch listed above applied on an otherwise 
perfectly working vm with virtio_blk root partition:

Last line of the booting kernel in my vnc window:

Serial: 8250/16550 driver $Revision 1.90... (you know the rest)

an strace of the qemu-kvm gave the following in rapid succession:

clock_gettime(CLOCK_MONOTONIC, {2565, 306799672}) = 0
clock_gettime(CLOCK_MONOTONIC, {2565, 307065342}) = 0
clock_gettime(CLOCK_MONOTONIC, {2565, 307354930}) = 0
clock_gettime(CLOCK_MONOTONIC, {2565, 307618803}) = 0
clock_gettime(CLOCK_MONOTONIC, {2565, 307886312}) = 0
timer_gettime(0, {it_interval={0, 0}, it_value={0, 0}}) = 0
timer_settime(0, 0, {it_interval={0, 0}, it_value={0, 3300}}, NULL) = 0
rt_sigtimedwait([USR1 USR2 ALRM IO], {si_signo=SIGALRM, si_code=SI_TIMER, 
si_pid=0, si_uid=0, si_value={int=0, ptr=0}}, 0xbfe5af88, 8) = 14
rt_sigaction(SIGALRM, NULL, {0x804d8f8, ~[KILL STOP RTMIN RT_1], 0}, 8) = 0
select(12, [6 11], [], [], {0, 0})  = 0 (Timeout)
select(0, [], NULL, NULL, {0, 0})   = 0 (Timeout)
clock_gettime(CLOCK_MONOTONIC, {2565, 342895116}) = 0
clock_gettime(CLOCK_MONOTONIC, {2565, 343164113}) = 0
clock_gettime(CLOCK_MONOTONIC, {2565, 343454002}) = 0
clock_gettime(CLOCK_MONOTONIC, {2565, 343716804}) = 0
clock_gettime(CLOCK_MONOTONIC, {2565, 343980012}) = 0
timer_gettime(0, {it_interval={0, 0}, it_value={0, 0}}) = 0
timer_settime(0, 0, {it_interval={0, 0}, it_value={0, 3300}}, NULL) = 0
rt_sigtimedwait([USR1 USR2 ALRM IO], {si_signo=SIGALRM, si_code=SI_TIMER, 
si_pid=0, si_uid=0, si_value={int=0, ptr=0}}, 0xbfe5af88, 8) = 14
rt_sigaction(SIGALRM, NULL, {0x804d8f8, ~[KILL STOP RTMIN RT_1], 0}, 8) = 0
select(12, [6 11], [], [], {0, 0})  = 0 (Timeout)
select(0, [], NULL, NULL, {0, 0})   = 0 (Timeout)
clock_gettime(CLOCK_MONOTONIC, {2565, 379035364}) = 0
clock_gettime(CLOCK_MONOTONIC, {2565, 379307884}) = 0
clock_gettime(CLOCK_MONOTONIC, {2565, 379589434}) = 0
clock_gettime(CLOCK_MONOTONIC, {2565, 379919100}) = 0
clock_gettime(CLOCK_MONOTONIC, {2565, 380183834}) = 0
timer_gettime(0, {it_interval={0, 0}, it_value={0, 0}}) = 0
timer_settime(0, 0, {it_interval={0, 0}, it_value={0, 3300}}, NULL) = 0
rt_sigtimedwait([USR1 USR2 ALRM IO], {si_signo=SIGALRM, si_code=SI_TIMER, 
si_pid=0, si_uid=0, si_value={int=0, ptr=0}}, 0xbfe5af88, 8) = 14
rt_sigaction(SIGALRM, NULL, {0x804d8f8, ~[KILL STOP RTMIN RT_1], 0}, 8) = 0
select(12, [6 11], [], [], {0, 0})  = 0 (Timeout)
select(0, [], NULL, NULL, {0, 0})   = 0 (Timeout)
...

Hope that helps.

Kind regards,

Gerd


-- 
Address (better: trap) for people I really don't want to get mail from:
[EMAIL PROTECTED]



Re: [kvm-devel] disappointing speed with virtio_blk

2008-04-18 Thread Marcelo Tosatti

Hi Gerd,

On Fri, Apr 18, 2008 at 11:27:58PM +0200, Gerd von Egidy wrote:
 Hi Marcelo,
 

 http://www.mail-archive.com/kvm-devel@lists.sourceforge.net/msg14732.html
  
   I tried it this evening with kvm 66 - which should include your patch,
   right?
 
  No its not included. The issue is being worked on.
 
 my bad, sorry.
 
 Now I know I really have that patch: qemu-kvm hangs :(
 
 I was trying kvm 66 with only the patch listed above applied on an otherwise 
 perfectly working vm with virtio_blk root partition:
 
 Last line of the booting kernel in my vnc window:
 
 Serial: 8250/16550 driver $Revision 1.90... (you know the rest)

When the hang happens, can you run kvm_stat --once (the script can be found
in the kvm-66 directory) and paste the result?

Can you confirm that reverting the patch fixes it?

 an strace of the qemu-kvm gave the following in rapid succession:
 
 clock_gettime(CLOCK_MONOTONIC, {2565, 306799672}) = 0
 clock_gettime(CLOCK_MONOTONIC, {2565, 307065342}) = 0
 clock_gettime(CLOCK_MONOTONIC, {2565, 307354930}) = 0
 clock_gettime(CLOCK_MONOTONIC, {2565, 307618803}) = 0
 clock_gettime(CLOCK_MONOTONIC, {2565, 307886312}) = 0
 timer_gettime(0, {it_interval={0, 0}, it_value={0, 0}}) = 0
 timer_settime(0, 0, {it_interval={0, 0}, it_value={0, 3300}}, NULL) = 0
 rt_sigtimedwait([USR1 USR2 ALRM IO], {si_signo=SIGALRM, si_code=SI_TIMER, 
 si_pid=0, si_uid=0, si_value={int=0, ptr=0}}, 0xbfe5af88, 8) = 14
 rt_sigaction(SIGALRM, NULL, {0x804d8f8, ~[KILL STOP RTMIN RT_1], 0}, 8) = 0

This won't help much.




Re: [kvm-devel] pv clock: kvm is incompatible with xen :-(

2008-04-18 Thread Jeremy Fitzhardinge
Gerd Hoffmann wrote:
 I'm looking at the guest side of the issue right now, trying to identify
 common code, and while doing so noticed that xen does the
 version-check-loop in both get_time_values_from_xen(void) and
 xen_clocksource_read(void), and I can't see any obvious reason for that.
  The loop in xen_clocksource_read(void) is not needed IMHO.  Can I drop it?
   

No.  The get_nsec_offset() needs to be atomic with respect to the 
get_time_values() parameters.  There could be a loopless 
__get_time_values() for use in this case, but given that it almost never 
loops, I don't think it's worthwhile.

J



[kvm-devel] [patch 0/2] virtio-blk async IO

2008-04-18 Thread Marcelo Tosatti
Use the asynchronous version of block IO functions, otherwise guests can block
for long periods of time waiting for the operations to complete.

-- 




[kvm-devel] [patch 1/2] QEMU/KVM: provide a reset method for virtio

2008-04-18 Thread Marcelo Tosatti
So drivers can do whatever is necessary on reset.
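
A minimal, self-contained sketch of the optional-callback pattern the diff below adds. The Device type, blk_reset() and the reset_calls counter are hypothetical stand-ins for illustration, not QEMU's VirtIODevice.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, stripped-down stand-in for a virtio device. */
typedef struct Device Device;
struct Device {
    unsigned features;
    unsigned status;
    int reset_calls;               /* illustration only: counts hook calls */
    void (*reset)(Device *dev);    /* optional driver hook, may be NULL */
};

/* Example driver hook; a real driver would drop its pending work here. */
static void blk_reset(Device *dev)
{
    dev->reset_calls++;
}

/* Generic reset: give the driver a chance first, then clear common state. */
static void device_reset(void *opaque)
{
    Device *dev = opaque;

    assert(dev != NULL);
    if (dev->reset)
        dev->reset(dev);

    dev->features = 0;
    dev->status = 0;
}
```

Drivers that do not set the hook are unaffected, since the NULL check skips the call.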

Signed-off-by: Marcelo Tosatti [EMAIL PROTECTED]

Index: kvm-userspace.aio/qemu/hw/virtio.c
===
--- kvm-userspace.aio.orig/qemu/hw/virtio.c
+++ kvm-userspace.aio/qemu/hw/virtio.c
@@ -166,6 +166,9 @@ void virtio_reset(void *opaque)
     VirtIODevice *vdev = opaque;
     int i;
 
+    if (vdev->reset)
+        vdev->reset(vdev);
+
     vdev->features = 0;
     vdev->queue_sel = 0;
     vdev->status = 0;
Index: kvm-userspace.aio/qemu/hw/virtio.h
===
--- kvm-userspace.aio.orig/qemu/hw/virtio.h
+++ kvm-userspace.aio/qemu/hw/virtio.h
@@ -119,6 +119,7 @@ struct VirtIODevice
     uint32_t (*get_features)(VirtIODevice *vdev);
     void (*set_features)(VirtIODevice *vdev, uint32_t val);
     void (*update_config)(VirtIODevice *vdev, uint8_t *config);
+    void (*reset)(VirtIODevice *vdev);
     VirtQueue vq[VIRTIO_PCI_QUEUE_MAX];
 };
 

-- 




[kvm-devel] [patch 2/2] QEMU/KVM: virtio-blk async IO

2008-04-18 Thread Marcelo Tosatti
virtio-blk should not use synchronous requests, as that can block vcpus 
outside of guest mode for long periods of time for no reason.

The generic block layer could complete AIOs before re-entering guest mode,
so that cached reads and writes can be reported ASAP; that is a job for the
block layer.
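
The completion scheme used in the diff below (one request fanned out into several asynchronous sub-requests, finished by whichever callback runs last) can be sketched in isolation. Req, req_start() and the two output pointers are hypothetical names for illustration; the real patch pushes to the virtqueue and notifies the guest instead.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-request bookkeeping, modelled on the patch's idea. */
typedef struct Req {
    unsigned pending;    /* sub-requests still in flight */
    int status;          /* OR of all sub-request results */
    int *done_status;    /* receives 0 (ok) or 1 (error) on completion */
    int *completions;    /* incremented once when the request finishes */
} Req;

/* Allocate a request that will be completed by nsegs callbacks. */
static Req *req_start(unsigned nsegs, int *done_status, int *completions)
{
    Req *req = calloc(1, sizeof(*req));

    if (!req)
        return NULL;
    req->pending = nsegs;            /* set before any callback can run */
    req->done_status = done_status;
    req->completions = completions;
    return req;
}

/* Called once per finished sub-request; only the last call completes. */
static void rw_complete(void *opaque, int ret)
{
    Req *req = opaque;

    assert(req->pending > 0);
    req->status |= ret;
    if (--req->pending > 0)
        return;

    *req->done_status = req->status ? 1 : 0;
    ++*req->completions;
    free(req);
}
```

One property worth noting in this sketch: a request started with zero segments would never see a callback and so would never complete or be freed, so callers of such a scheme need to handle that case separately.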

Signed-off-by: Marcelo Tosatti [EMAIL PROTECTED]

Index: kvm-userspace.aio/qemu/hw/virtio-blk.c
===
--- kvm-userspace.aio.orig/qemu/hw/virtio-blk.c
+++ kvm-userspace.aio/qemu/hw/virtio-blk.c
@@ -77,54 +77,117 @@ static VirtIOBlock *to_virtio_blk(VirtIO
     return (VirtIOBlock *)vdev;
 }
 
+typedef struct VirtIOBlockReq
+{
+    VirtIODevice *vdev;
+    VirtQueue *vq;
+    struct iovec in_sg_status;
+    unsigned int pending;
+    unsigned int len;
+    unsigned int elem_idx;
+    int status;
+} VirtIOBlockReq;
+
+static void virtio_blk_rw_complete(void *opaque, int ret)
+{
+    VirtIOBlockReq *req = opaque;
+    struct virtio_blk_inhdr *in;
+    VirtQueueElement elem;
+
+    req->status |= ret;
+    if (--req->pending > 0)
+        return;
+
+    elem.index = req->elem_idx;
+    in = (void *)req->in_sg_status.iov_base;
+
+    in->status = req->status ? VIRTIO_BLK_S_IOERR : VIRTIO_BLK_S_OK;
+    virtqueue_push(req->vq, &elem, req->len);
+    virtio_notify(req->vdev, req->vq);
+    qemu_free(req);
+}
+
 static void virtio_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq)
 {
     VirtIOBlock *s = to_virtio_blk(vdev);
     VirtQueueElement elem;
+    VirtIOBlockReq *req;
     unsigned int count;
 
     while ((count = virtqueue_pop(vq, &elem)) != 0) {
         struct virtio_blk_inhdr *in;
         struct virtio_blk_outhdr *out;
-        unsigned int wlen;
         off_t off;
         int i;
 
+        /*
+         * FIXME: limit the number of in-flight requests
+         */
+        req = qemu_malloc(sizeof(VirtIOBlockReq));
+        if (!req)
+            return;
+        memset(req, 0, sizeof(*req));
+        memcpy(&req->in_sg_status, &elem.in_sg[elem.in_num - 1],
+               sizeof(req->in_sg_status));
+        req->vdev = vdev;
+        req->vq = vq;
+        req->elem_idx = elem.index;
+
         out = (void *)elem.out_sg[0].iov_base;
         in = (void *)elem.in_sg[elem.in_num - 1].iov_base;
         off = out->sector;
 
         if (out->type & VIRTIO_BLK_T_SCSI_CMD) {
-            wlen = sizeof(*in);
+            unsigned int len = sizeof(*in);
+
             in->status = VIRTIO_BLK_S_UNSUPP;
+            virtqueue_push(vq, &elem, len);
+            virtio_notify(vdev, vq);
+            qemu_free(req);
+
         } else if (out->type & VIRTIO_BLK_T_OUT) {
-            wlen = sizeof(*in);
+            req->pending = elem.out_num - 1;
 
             for (i = 1; i < elem.out_num; i++) {
-                bdrv_write(s->bs, off,
+                bdrv_aio_write(s->bs, off,
                            elem.out_sg[i].iov_base,
-                           elem.out_sg[i].iov_len / 512);
+                           elem.out_sg[i].iov_len / 512,
+                           virtio_blk_rw_complete,
+                           req);
                 off += elem.out_sg[i].iov_len / 512;
+                req->len += elem.out_sg[i].iov_len;
             }
 
-            in->status = VIRTIO_BLK_S_OK;
         } else {
-            wlen = sizeof(*in);
+            req->pending = elem.in_num - 1;
 
             for (i = 0; i < elem.in_num - 1; i++) {
-                bdrv_read(s->bs, off,
+                bdrv_aio_read(s->bs, off,
                           elem.in_sg[i].iov_base,
-                          elem.in_sg[i].iov_len / 512);
+                          elem.in_sg[i].iov_len / 512,
+                          virtio_blk_rw_complete,
+                          req);
                 off += elem.in_sg[i].iov_len / 512;
-                wlen += elem.in_sg[i].iov_len;
+                req->len += elem.in_sg[i].iov_len;
             }
-
-            in->status = VIRTIO_BLK_S_OK;
         }
-
-        virtqueue_push(vq, &elem, wlen);
-        virtio_notify(vdev, vq);
     }
+    /*
+     * FIXME: Want to check for completions before returning to guest mode,
+     * so cached reads and writes are reported as quickly as possible. But
+     * that should be done in the generic block layer.
+     */
+}
+
+static void virtio_blk_reset(VirtIODevice *vdev)
+{
+    VirtIOBlock *s = to_virtio_blk(vdev);
+
+    /*
+     * This should cancel pending requests, but can't do nicely until there
+     * are per-device request lists.
+     */
+    qemu_aio_flush();
 }
 
 static void virtio_blk_update_config(VirtIODevice *vdev, uint8_t *config)
@@ -156,6 +219,7 @@ void *virtio_blk_init(PCIBus *bus, uint1
 
     s->vdev.update_config = virtio_blk_update_config;
     s->vdev.get_features = virtio_blk_get_features;
+    s->vdev.reset = virtio_blk_reset;
     s->bs = bs;
     bs->devfn = s->vdev.pci_dev.devfn;
 

-- 



[kvm-devel] kvm-trace help

2008-04-18 Thread David S. Ahern
I am trying to add a trace marker and the data is coming out all 0's. e.g.,

0 (+   0)  PTE_WRITE vcpu = 0x0001  pid = 0x240d [ gpa =
0x  gpte = 0x  ]

Patch is attached. I know the data is non-zero as I added an if check before
calling the trace to only do the trace if the data is non-zero. Anyone have
suggestions on what I am missing?

thanks,

david


diff -rb -U 10 kvm-66.orig/kernel/include/asm/kvm.h kvm-66/kernel/include/asm/kvm.h
--- kvm-66.orig/kernel/include/asm/kvm.h	2008-04-16 08:29:14.0 -0600
+++ kvm-66/kernel/include/asm/kvm.h	2008-04-18 12:41:07.0 -0600
@@ -221,12 +221,14 @@
 #define KVM_TRC_MSR_READ         (KVM_TRC_HANDLER + 0x0B)
 #define KVM_TRC_MSR_WRITE        (KVM_TRC_HANDLER + 0x0C)
 #define KVM_TRC_CPUID            (KVM_TRC_HANDLER + 0x0D)
 #define KVM_TRC_INTR             (KVM_TRC_HANDLER + 0x0E)
 #define KVM_TRC_NMI              (KVM_TRC_HANDLER + 0x0F)
 #define KVM_TRC_VMMCALL          (KVM_TRC_HANDLER + 0x10)
 #define KVM_TRC_HLT              (KVM_TRC_HANDLER + 0x11)
 #define KVM_TRC_CLTS             (KVM_TRC_HANDLER + 0x12)
 #define KVM_TRC_LMSW             (KVM_TRC_HANDLER + 0x13)
 #define KVM_TRC_APIC_ACCESS      (KVM_TRC_HANDLER + 0x14)
+#define KVM_TRC_PTE_WRITE        (KVM_TRC_HANDLER + 0x15)
+#define KVM_TRC_PTE_FLOODED      (KVM_TRC_HANDLER + 0x16)
 
 #endif
diff -rb -U 10 kvm-66.orig/kernel/include/asm-x86/kvm.h kvm-66/kernel/include/asm-x86/kvm.h
--- kvm-66.orig/kernel/include/asm-x86/kvm.h	2008-04-16 08:29:14.0 -0600
+++ kvm-66/kernel/include/asm-x86/kvm.h	2008-04-18 12:41:07.0 -0600
@@ -221,12 +221,14 @@
 #define KVM_TRC_MSR_READ         (KVM_TRC_HANDLER + 0x0B)
 #define KVM_TRC_MSR_WRITE        (KVM_TRC_HANDLER + 0x0C)
 #define KVM_TRC_CPUID            (KVM_TRC_HANDLER + 0x0D)
 #define KVM_TRC_INTR             (KVM_TRC_HANDLER + 0x0E)
 #define KVM_TRC_NMI              (KVM_TRC_HANDLER + 0x0F)
 #define KVM_TRC_VMMCALL          (KVM_TRC_HANDLER + 0x10)
 #define KVM_TRC_HLT              (KVM_TRC_HANDLER + 0x11)
 #define KVM_TRC_CLTS             (KVM_TRC_HANDLER + 0x12)
 #define KVM_TRC_LMSW             (KVM_TRC_HANDLER + 0x13)
 #define KVM_TRC_APIC_ACCESS      (KVM_TRC_HANDLER + 0x14)
+#define KVM_TRC_PTE_WRITE        (KVM_TRC_HANDLER + 0x15)
+#define KVM_TRC_PTE_FLOODED      (KVM_TRC_HANDLER + 0x16)
 
 #endif
diff -rb -U 10 kvm-66.orig/kernel/mmu.c kvm-66/kernel/mmu.c
--- kvm-66.orig/kernel/mmu.c	2008-04-16 08:29:14.0 -0600
+++ kvm-66/kernel/mmu.c	2008-04-18 11:50:16.0 -0600
@@ -1662,20 +1662,22 @@
 			if (r)
 				return;
 			memcpy((void *)&gpte + (gpa % 8), new, 4);
 		} else if ((bytes == 8) && (gpa % 8 == 0)) {
 			memcpy((void *)&gpte, new, 8);
 		}
 	} else {
 		if ((bytes == 4) && (gpa % 4 == 0))
 			memcpy((void *)&gpte, new, 4);
 	}
+	KVMTRACE_4D(PTE_WRITE, vcpu, (u32) gpa, (u32)(gpa >> 32),
+	(u32) gpte, (u32)(gpte >> 32), handler);
 	if (!is_present_pte(gpte))
 		return;
 	gfn = (gpte & PT64_BASE_ADDR_MASK) >> PAGE_SHIFT;
 
 	down_read(&current->mm->mmap_sem);
 	if (is_large_pte(gpte) && is_largepage_backed(vcpu, gfn)) {
 		gfn &= ~(KVM_PAGES_PER_HPAGE-1);
 		vcpu->arch.update_pte.largepage = 1;
 	}
 	pfn = gfn_to_pfn(vcpu->kvm, gfn);
@@ -1711,21 +1713,22 @@
 
 	pgprintk("%s: gpa %llx bytes %d\n", __func__, gpa, bytes);
 	mmu_guess_page_from_pte_write(vcpu, gpa, new, bytes);
 	spin_lock(&vcpu->kvm->mmu_lock);
 	kvm_mmu_free_some_pages(vcpu);
 	++vcpu->kvm->stat.mmu_pte_write;
 	kvm_mmu_audit(vcpu, "pre pte write");
 	if (gfn == vcpu->arch.last_pt_write_gfn
 	    && !last_updated_pte_accessed(vcpu)) {
 		++vcpu->arch.last_pt_write_count;
-		if (vcpu->arch.last_pt_write_count >= 3)
+		if (vcpu->arch.last_pt_write_count >= 4)
+			KVMTRACE_0D(PTE_FLOODED, vcpu, handler);
 			flooded = 1;
 	} else {
 		vcpu->arch.last_pt_write_gfn = gfn;
 		vcpu->arch.last_pt_write_count = 1;
 		vcpu->arch.last_pte_updated = NULL;
 	}
 	index = kvm_page_table_hashfn(gfn);
 	bucket = &vcpu->kvm->arch.mmu_page_hash[index];
 	hlist_for_each_entry_safe(sp, node, n, bucket, hash_link) {
 		if (sp->gfn != gfn || sp->role.metaphysical)
diff -rb -U 10 kvm-66.orig/user/formats kvm-66/user/formats
--- kvm-66.orig/user/formats	2008-04-15 07:35:58.0 -0600
+++ kvm-66/user/formats	2008-04-18 12:46:36.0 -0600
@@ -15,10 +15,12 @@
 0x0002000B  %(tsc)d (+%(reltsc)8d)  MSR_READ  vcpu = 0x%(vcpu)08x  pid = 0x%(pid)08x [ MSR# = 0x%(1)08x, data = 0x%(3)08x %(2)08x ]
 0x0002000C  %(tsc)d (+%(reltsc)8d)  MSR_WRITE vcpu = 0x%(vcpu)08x  pid = 0x%(pid)08x [ MSR# = 0x%(1)08x, data = 0x%(3)08x %(2)08x ]
 0x0002000D  %(tsc)d (+%(reltsc)8d)  CPUID vcpu = 0x%(vcpu)08x  pid = 0x%(pid)08x [ func = 0x%(1)08x, eax = 0x%(2)08x, ebx = 0x%(3)08x, ecx = 0x%(4)08x edx = 0x%(5)08x]
 0x0002000E  %(tsc)d (+%(reltsc)8d)  INTR  vcpu = 0x%(vcpu)08x  pid = 0x%(pid)08x [ vector = 0x%(1)02x ]
 0x0002000F  %(tsc)d (+%(reltsc)8d)  NMI   vcpu = 0x%(vcpu)08x  pid = 0x%(pid)08x
 0x00020010  %(tsc)d (+%(reltsc)8d)  VMMCALL   

Re: [kvm-devel] Second KVM process hangs eating 80-100% CPU on host during startup

2008-04-18 Thread Alex Davis
--- On Fri, 4/18/08, Avi Kivity [EMAIL PROTECTED] wrote:

 From: Avi Kivity [EMAIL PROTECTED]
 Subject: Re: [kvm-devel] Second KVM process hangs eating 80-100% CPU on host 
 during startup
 To: Alex Davis [EMAIL PROTECTED]
 Cc: kvm-devel@lists.sourceforge.net
 Date: Friday, April 18, 2008, 12:12 PM
 Alex Davis wrote:
  Host software:
  Linux 2.6.24.4
  KVM 65 (I am using the kernel modules from this
 release).
  X11 7.2 from Xorg
  SDL 1.2.13
  GCC 4.1.1
  Glibc 2.4
 
  Host hardware:
  Asus P5B Deluxe (P965 chipset based) motherboard
  4 GB RAM
  Intel E6700 CPU
 
  Guest software:
  Slackware 12.0 installed from CD-ROM.
 
  Command used to first KVM instance:
  /usr/local/bin/qemu-system-x86_64 -hda
 /spare/vdisk1.img -cdrom /dev/cdrom -boot c -m 384 -net
  nic,macaddr=DE:AD:BE:EF:11:29 -net
 tap,ifname=tap0,script=no 
 
  Command used to start second KVM instance:
  /usr/local/bin/qemu-system-x86_64 -hda
 /spare/vdisk2.img -cdrom /dev/cdrom -boot c -m 384 -net
  nic,macaddr=DE:AD:BE:EF:11:30 -net
 tap,ifname=tap1,script=no 
 
  tap0 and tap1 are bridged on the host. The guest OS
 was installed on /spare/vdisk1.img, 
  which was initially created by /usr/local/bin/qemu-img
 create -f qcow /spare/vdisk.img 10G
  After the guest installation completed, vdisk1 was
 copied to vdisk2.
 
  The second instance always stops after printing
  Checking if the processor honours the WP bit even in
 supervisor mode... Ok.
  It stays hung until I press the return key in the
 first instance; sometimes clicking in another X
  window will wake it up as well. 
 
  This is a test machine so I can test patches (almost)
 at will.
 

 
 Strange.  Does pinning each guest to a different cpu help
 (use 'taskset 
 1 qemu ... vdisk1.img   ', taskset 2 qemu ...
 vdisk2.img)
 
 

taskset made no difference. Upgrading to kvm-66 didn't help either.
 
 Any sufficiently difficult bug is indistinguishable from a
 feature.


  



Re: [kvm-devel] [patch 0/2] virtio-blk async IO

2008-04-18 Thread Gerd von Egidy
Hi Marcelo,

 Use the asynchronous version of block IO functions, otherwise guests can
 block for long periods of time waiting for the operations to complete.

just tried these patches. Results are similar to the last ones: the guest 
comes up fine but after running 2 or 3 minutes of bonnie++ the guest-vm 
hangs. This time I used screen on the guest console to try switching to 
another process - hanging too.

Here is the kvm_stat --once output:

efer_reload0 0
exits3325114   196
fpu_reload185671 0
halt_exits 1869229
halt_wakeup24807 0
host_state_reload138730859
insn_emulation   1924291   130
insn_emulation_fail0 0
invlpg 0 0
io_exits  35002030
irq_exits 225446 3
irq_window 0 0
mmio_exits917561 0
mmu_cache_miss 55436 0
mmu_flooded64416 0
mmu_pde_zapped 46914 0
mmu_pte_updated   565547 0
mmu_pte_write 650181 0
mmu_recycled   0 0
mmu_shadow_zapped  64416 0
pf_fixed 1229672 0
pf_guest   94338 0
remote_tlb_flush   0 0
request_irq0 0
signal_exits   1 0
tlb_flush 602678 4

Kind regards,

Gerd

-- 
Address (better: trap) for people I really don't want to get mail from:
james(at)cactusamerica.com



Re: [kvm-devel] kvm-trace help

2008-04-18 Thread Liu, Eric E
David S. Ahern wrote:
 I am trying to add a trace marker and the data is coming out all 0's.
 e.g., 
 
 0 (+   0)  PTE_WRITE vcpu = 0x0001  pid = 0x240d [
 gpa = 0x  gpte = 0x  ]
 
 Patch is attached. I know the data is non-zero as I added an if check
 before calling the trace to only do the trace if the data is
 non-zero. Anyone have suggestions on what I am missing?
 
 thanks,
 
 david
Hi, david
I read your patch and found this:
+#define KVM_TRC_PTE_WRITE        (KVM_TRC_HANDLER + 0x15)
+#define KVM_TRC_PTE_FLOODED      (KVM_TRC_HANDLER + 0x16)
but in your formats file 
+0x00020015  %(tsc)d (+%(reltsc)8d)  PTE_FLOODED   vcpu = 0x%(vcpu)08x  pid = 0x%(pid)08x
+0x00020016  %(tsc)d (+%(reltsc)8d)  PTE_WRITE     vcpu = 0x%(vcpu)08x  pid = 0x%(pid)08x [ gpa = 0x%(2)08x %(1)08x gpte = 0x%(4)08x %(3)08x ]
You have the two values swapped, right?



Re: [kvm-devel] kvm-trace help

2008-04-18 Thread David S. Ahern
inline.

Liu, Eric E wrote:
 David S. Ahern wrote:
 I am trying to add a trace marker and the data is coming out all 0's.
 e.g., 

 0 (+   0)  PTE_WRITE vcpu = 0x0001  pid = 0x240d [
 gpa = 0x  gpte = 0x  ]

 Patch is attached. I know the data is non-zero as I added an if check
 before calling the trace to only do the trace if the data is
 non-zero. Anyone have suggestions on what I am missing?

 thanks,

 david
 Hi, david
   I read your patch and find this:
   +#define KVM_TRC_PTE_WRITE(KVM_TRC_HANDLER +
 0x15)
   +#define KVM_TRC_PTE_FLOODED  (KVM_TRC_HANDLER +
 0x16)
but in your formats file 
   +0x00020015  %(tsc)d (+%(reltsc)8d)  PTE_FLOODED   vcpu
 = 0x%(vcpu)08x  pid = 0x%(pid)08x
   +0x00020016  %(tsc)d (+%(reltsc)8d)  PTE_WRITE vcpu
 = 0x%(vcpu)08x  pid = 0x%(pid)08x [ gpa = 0x%(2)08x %(1)08x gpte =
 0x%(4)08x %(3)08x ]
   You mistake the value, right?
 

Which value? Do you mean the 0x00020015 and 0x00020016?

kvm.h shows KVM_TRC_APIC_ACCESS as KVM_TRC_HANDLER + 0x14. I added the PTE_WRITE
and PTE_FLOODED after that in kvm.h with the values 0x15 and 0x16. Then in the
formats file it shows APIC_ACCESS as 0x00020014, and I added the new PTE entries
after that as 20015 and 20016. The kvmtrace_format tool does show those lines in
its output which makes me believe these values are ok.

What has me puzzled is the 0 values for gpa and gpte. I believe they are not 0
because I added if (gpa || gpte) before the KVMTRACE_4D(PTE_WRITE, ...) line
and the lines still show up in the trace output.

david



Re: [kvm-devel] kvm-trace help

2008-04-18 Thread Liu, Eric E
David S. Ahern wrote:
 inline.
 
 Liu, Eric E wrote:
 David S. Ahern wrote:
 I am trying to add a trace marker and the data is coming out all
 0's. e.g., 
 
 0 (+   0)  PTE_WRITE vcpu = 0x0001  pid = 0x240d [
 gpa = 0x  gpte = 0x  ]
 
 Patch is attached. I know the data is non-zero as I added an if
 check before calling the trace to only do the trace if the data is
 non-zero. Anyone have suggestions on what I am missing?
 
 thanks,
 
 david
 Hi, david
  I read your patch and find this:
  +#define KVM_TRC_PTE_WRITE(KVM_TRC_HANDLER +
0x15)
  +#define KVM_TRC_PTE_FLOODED  (KVM_TRC_HANDLER +
0x16)
but in your formats file
  +0x00020015  %(tsc)d (+%(reltsc)8d)  PTE_FLOODED   vcpu
 = 0x%(vcpu)08x  pid = 0x%(pid)08x
  +0x00020016  %(tsc)d (+%(reltsc)8d)  PTE_WRITE vcpu
 = 0x%(vcpu)08x  pid = 0x%(pid)08x [ gpa = 0x%(2)08x %(1)08x gpte =
  0x%(4)08x %(3)08x ] You mistake the value, right?
 
 
 Which value? Do you mean the 0x00020015 and0x00020016?
 
 kvm.h shows KVM_TRC_APIC_ACCESS as KVM_TRC_HANDLER + 0x14. I added
 the PTE_WRITE and PTE_FLOODED after that in kvm.h with the values
 0x15 and 0x16. Then in the formats file it shows APIC_ACCESS as
 0x00020014, and I added the new PTE entries after that as 20015 and
 20016. The kvmtrace_format tool does show those lines in its output
 which makes me believe these values are ok. 

 
I mean the value for PTE_WRITE in the formats file (0x00020016) should be
the same as the KVM_TRC_PTE_WRITE you define in kvm.h, which is 0x00020015.
If the two differ, what you get in the text file will be mixed up.
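
The mismatch can be reproduced in a few lines. This is a sketch, not kvmtrace_format itself: KVM_TRC_HANDLER = 0x00020000 is inferred from the existing formats IDs (e.g. 0x0002000B for MSR_READ), and struct fmt_entry and format_name() are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Inferred from formats entries such as 0x0002000B == HANDLER + 0x0B. */
#define KVM_TRC_HANDLER      0x00020000u
#define KVM_TRC_PTE_WRITE    (KVM_TRC_HANDLER + 0x15)
#define KVM_TRC_PTE_FLOODED  (KVM_TRC_HANDLER + 0x16)

/* Hypothetical model of one line of the formats file: ID and label. */
struct fmt_entry {
    uint32_t id;
    const char *name;
};

/* The two new entries exactly as they appear in the posted formats file. */
static const struct fmt_entry formats_as_posted[] = {
    { 0x00020015u, "PTE_FLOODED" },
    { 0x00020016u, "PTE_WRITE" },
};

/* Return the label the formatter would print for a given event ID. */
static const char *format_name(const struct fmt_entry *tab, size_t n,
                               uint32_t id)
{
    assert(tab != NULL);
    for (size_t i = 0; i < n; i++)
        if (tab[i].id == id)
            return tab[i].name;
    return NULL;
}
```

With the table as posted, an event logged as KVM_TRC_PTE_WRITE is labelled PTE_FLOODED and an event logged as KVM_TRC_PTE_FLOODED is labelled PTE_WRITE. That would be consistent with PTE_WRITE lines showing all-zero data, since PTE_FLOODED is emitted with KVMTRACE_0D and carries no payload.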
 
 What has me puzzled is the 0 values for gpa and gpte. I believe they
 are not 0 because I added if (gpa || gpte) before the
 KVMTRACE_4D(PTE_WRITE, ...) line and the lines still show up in the
 trace output. 
 
 david




Re: [kvm-devel] Second KVM process hangs eating 80-100% CPU on host during startup

2008-04-18 Thread Alex Davis
--- On Fri, 4/18/08, Avi Kivity [EMAIL PROTECTED] wrote:

 From: Avi Kivity [EMAIL PROTECTED]
 Subject: Re: [kvm-devel] Second KVM process hangs eating 80-100% CPU on host 
 during startup
 To: Alex Davis [EMAIL PROTECTED]
 Cc: kvm-devel@lists.sourceforge.net
 Date: Friday, April 18, 2008, 12:12 PM
 Alex Davis wrote:
  Host software:
  Linux 2.6.24.4
  KVM 65 (I am using the kernel modules from this
 release).
  X11 7.2 from Xorg
  SDL 1.2.13
  GCC 4.1.1
  Glibc 2.4
 
  Host hardware:
  Asus P5B Deluxe (P965 chipset based) motherboard
  4 GB RAM
  Intel E6700 CPU
 
  Guest software:
  Slackware 12.0 installed from CD-ROM.
 
  Command used to first KVM instance:
  /usr/local/bin/qemu-system-x86_64 -hda
 /spare/vdisk1.img -cdrom /dev/cdrom -boot c -m 384 -net
  nic,macaddr=DE:AD:BE:EF:11:29 -net
 tap,ifname=tap0,script=no 
 
  Command used to start second KVM instance:
  /usr/local/bin/qemu-system-x86_64 -hda
 /spare/vdisk2.img -cdrom /dev/cdrom -boot c -m 384 -net
  nic,macaddr=DE:AD:BE:EF:11:30 -net
 tap,ifname=tap1,script=no 
 
  tap0 and tap1 are bridged on the host. The guest OS
 was installed on /spare/vdisk1.img, 
  which was initially created by /usr/local/bin/qemu-img
 create -f qcow /spare/vdisk.img 10G
  After the guest installation completed, vdisk1 was
 copied to vdisk2.
 
  The second instance always stops after printing
  Checking if the processor honours the WP bit even in
 supervisor mode... Ok.
  It stays hung until I press the return key in the
 first instance; sometimes clicking in another X
  window will wake it up as well. 
 
  This is a test machine so I can test patches (almost)
 at will.
 

 
 Strange.  Does pinning each guest to a different cpu help
 (use 'taskset 
 1 qemu ... vdisk1.img   ', taskset 2 qemu ...
 vdisk2.img)

Some additional information:

I upgraded the guest to 2.6.25, and added some printk's to init_32.c and
init/calibrate.c in the kernel source tree. Here's the output from dmesg
for the guest boot:

[0.004000] Checking if this processor honours the WP bit even in supervisor mode...Ok.
[0.004000] Before cpa_init.
[0.004000] CPA: page pool initialized 1 of 1 pages preallocated
[0.004000] After cpa_init.
[0.004000] After pagealloc
[0.004000] After cpu_hotplug_init
[0.004000] After kmem_cache_init
[0.004000] After setup_percpu_pageset
[0.004000] After numa_policy_init
[0.004005] After late_time_init
[0.004622] Before read_current_timer(pre_start)
[0.005314] After read_current_timer()
[0.006493] Before read_current_timer(start)
[   16.065027] Before read_current_timer(post_start)  
[   16.065753] Before read_current_timer(post_end)
[   16.066437] Before read_current_timer(start)
[   16.073007] Before read_current_timer(post_start)
[   16.081007] Before read_current_timer(post_end)
[   16.081703] Before read_current_timer(start)
[   16.089008] Before read_current_timer(post_start)
[   16.097008] Before read_current_timer(post_end)
[   16.097695] Before read_current_timer(start)
[   16.105010] Before read_current_timer(post_start)
[   16.113009] Before read_current_timer(post_end)
[   16.113697] Before read_current_timer(start)
[   16.121010] Before read_current_timer(post_start)
[   16.129010] Before read_current_timer(post_end)
[   16.129697] calibrate_delay_direct() failed to get a good estimate for loops_per_jiffy.
[   16.129698] Probably due to long platform interrupts. Consider using lpj= boot option.
[   16.132180] Calibrating delay loop... 5308.41 BogoMIPS (lpj=10616832)
[   16.237019] After calibrate_delay



Notice how the time jumped from about 0 seconds to 16 seconds. That's where I 
woke it up by typing in another window. The code seems to be hanging in the 
call to read_current_timer(start) in function calibrate_delay_direct in 
init/calibrate.c. Also notice that 
calibrate_delay_direct() failed.
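
As a side note on reading the fallback line in the log above, here is a small sketch of how the BogoMIPS figure relates to lpj. The arithmetic follows what the kernel prints in init/calibrate.c as far as I recall; HZ=250 is an assumption for this guest (it fits the 4 ms steps in the early timestamps), and bogomips() is a hypothetical helper name.

```c
#include <assert.h>

/*
 * Split loops_per_jiffy into the whole and two-digit fractional parts
 * of the printed BogoMIPS value. hz is the kernel tick rate (assumed).
 */
static void bogomips(unsigned long lpj, unsigned int hz,
                     unsigned long *whole, unsigned long *frac)
{
    /* Keep both divisors non-zero. */
    assert(hz > 0 && hz <= 5000);
    assert(whole != 0 && frac != 0);

    *whole = lpj / (500000 / hz);
    *frac = (lpj / (5000 / hz)) % 100;
}
```

For lpj=10616832 and hz=250 this gives 5308 and 41, matching the "5308.41 BogoMIPS" in the log, so the fallback delay-loop calibration itself produced a plausible value once the guest was woken up.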




  
