[PATCH] Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6

2008-06-24 Thread Avi Kivity
From: Avi Kivity [EMAIL PROTECTED]

--
To unsubscribe from this list: send the line unsubscribe kvm-commits in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] kvm: external module: include linux/time.h

2008-06-24 Thread Avi Kivity
From: Avi Kivity [EMAIL PROTECTED]

for ns_to_timespec().

Signed-off-by: Avi Kivity [EMAIL PROTECTED]

diff --git a/kernel/external-module-compat.h b/kernel/external-module-compat.h
index 1e55153..2a51eeb 100644
--- a/kernel/external-module-compat.h
+++ b/kernel/external-module-compat.h
@@ -13,6 +13,7 @@
 #include <linux/kvm.h>
 #include <linux/kvm_para.h>
 #include <linux/cpu.h>
+#include <linux/time.h>
 #include <asm/processor.h>
 #include <linux/hrtimer.h>
 #include <asm/bitops.h>
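For readers unfamiliar with what ns_to_timespec() provides, here is a
minimal userspace sketch of the same kind of conversion. It is not the
kernel's implementation; the name ns_to_timespec_sketch and the constant
NSEC_PER_SEC_SKETCH are made up for illustration, and the handling of
negative values is an assumption about the desired behaviour (tv_nsec kept
in the range 0..999999999).

```c
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC_SKETCH 1000000000LL

/* Hypothetical userspace helper; NOT the kernel's ns_to_timespec(). */
static struct timespec ns_to_timespec_sketch(int64_t nsec)
{
    struct timespec ts;
    int64_t sec = nsec / NSEC_PER_SEC_SKETCH;
    int64_t rem = nsec % NSEC_PER_SEC_SKETCH;

    /* C division truncates toward zero, so normalise negative remainders
     * to keep tv_nsec within [0, 1e9). */
    if (rem < 0) {
        sec -= 1;
        rem += NSEC_PER_SEC_SKETCH;
    }

    ts.tv_sec = (time_t)sec;
    ts.tv_nsec = (long)rem;
    return ts;
}
```

With this sketch, 1500000000 ns becomes 1 s plus 500000000 ns, and -1 ns
becomes -1 s plus 999999999 ns.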


[PATCH] KVM: Remove now unused structs from kvm_para.h

2008-06-24 Thread Avi Kivity
From: Gerd Hoffmann [EMAIL PROTECTED]

The kvm_* structs are obsoleted by the pvclock_* ones.
Now all users have been switched over and the old structs
can be dropped.

Signed-off-by: Gerd Hoffmann [EMAIL PROTECTED]
Signed-off-by: Avi Kivity [EMAIL PROTECTED]

diff --git a/include/asm-x86/kvm_para.h b/include/asm-x86/kvm_para.h
index 5098459..bfd9900 100644
--- a/include/asm-x86/kvm_para.h
+++ b/include/asm-x86/kvm_para.h
@@ -48,24 +48,6 @@ struct kvm_mmu_op_release_pt {
 #ifdef __KERNEL__
 #include <asm/processor.h>
 
-/* xen binary-compatible interface. See xen headers for details */
-struct kvm_vcpu_time_info {
-   uint32_t version;
-   uint32_t pad0;
-   uint64_t tsc_timestamp;
-   uint64_t system_time;
-   uint32_t tsc_to_system_mul;
-   int8_t   tsc_shift;
-   int8_t   pad[3];
-} __attribute__((__packed__)); /* 32 bytes */
-
-struct kvm_wall_clock {
-   uint32_t wc_version;
-   uint32_t wc_sec;
-   uint32_t wc_nsec;
-} __attribute__((__packed__));
-
-
 extern void kvmclock_init(void);
 
 
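As background on the layout being removed, the sketch below shows how a
guest could turn a TSC reading into nanoseconds using the fields of the
Xen-compatible time info structure. This is only an illustration of my
understanding of the scaling convention (shift the TSC delta, multiply by
a 32.32 fixed-point factor); the names time_info_sketch, scale_delta_sketch
and guest_ns_sketch are hypothetical, and the version-field retry loop a
real reader needs is deliberately left out.

```c
#include <stdint.h>

/* Same field layout as the struct dropped by the patch (32 bytes). */
struct time_info_sketch {
    uint32_t version;
    uint32_t pad0;
    uint64_t tsc_timestamp;
    uint64_t system_time;
    uint32_t tsc_to_system_mul;
    int8_t   tsc_shift;
    int8_t   pad[3];
} __attribute__((__packed__));

_Static_assert(sizeof(struct time_info_sketch) == 32,
               "layout must stay 32 bytes");

/* Compute (delta << shift) * mul >> 32 without a 128-bit type.
 * May overflow for very large deltas; acceptable for a sketch. */
static uint64_t scale_delta_sketch(uint64_t delta, uint32_t mul, int8_t shift)
{
    uint64_t hi, lo;

    if (shift < 0)
        delta >>= -shift;
    else
        delta <<= shift;

    hi = (delta >> 32) * mul;          /* upper half contributes whole units */
    lo = (delta & 0xffffffffu) * mul;  /* lower half contributes the fraction */
    return hi + (lo >> 32);
}

/* Nanoseconds of system time corresponding to the TSC value 'tsc'. */
static uint64_t guest_ns_sketch(const struct time_info_sketch *ti, uint64_t tsc)
{
    uint64_t delta = tsc - ti->tsc_timestamp;

    return ti->system_time +
           scale_delta_sketch(delta, ti->tsc_to_system_mul, ti->tsc_shift);
}
```

For example, with tsc_timestamp = 1000, system_time = 5000, a multiplier of
0x80000000 (0.5 in 32.32 fixed point) and tsc_shift = 1, a TSC reading of
2000 maps to 6000 ns.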


[ kvm-Bugs-2001452 ] Restarted Windows 2003 Server guests have disk corruption

2008-06-24 Thread SourceForge.net
Bugs item #2001452, was opened at 2008-06-24 07:27
Message generated for change (Comment added) made by gerdwachs
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=2001452&group_id=180599

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: intel
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: gwachs (gerdwachs)
Assigned to: Nobody/Anonymous (nobody)
Summary: Restarted Windows 2003 Server guests have disk corruption

Initial Comment:
I have a number of Windows 2003 32Bit guests.

I use them to perform installation and configuration
tests of a large software product.

During these tests, the guests are restarted.

Randomly, the guests produce disk corruption messages
after a restart.

The following are two examples :

---
Windows  Registry Hive Recovered

Registry hive (file): SOFTWARE was corrupted and it has
been recovered. Some data might have been lost.
---
The system cannot log on due to the following error:
Unable to complete the requested operation because of
either a catastrophic media failure or a data structure
corruption on the disk.
---

OS : Ubuntu 8.04 x86_64

Kernel : 2.6.24-18-server #1 SMP x86_64 GNU/Linux

KVM: kvm-70

CPU: Intel(R) Core(TM)2 Quad CPU @ 2.40GHz

flags  : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat 
pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm 
constant_tsc arch_perfmon pebs bts rep_good pni monitor ds_cpl vmx est tm2 
ssse3 cx16 xt

Start Command : sudo /usr/local/kvm/bin/qemu-system-x86_64 -hda asit51ascs.img \
   -m 1024 -std-vga -boot c -k sv -usb -usbdevice tablet -snapshot -vnc :51 \
   -net nic,vlan=0,macaddr=00:16:3e:00:51:00 -net tap,vlan=0,script=/etc/qemu-ifup-br0 \
   -net nic,vlan=1,macaddr=00:16:3e:00:51:01 -net tap,vlan=1,script=/etc/qemu-ifup-br1

-no-kvm : Not feasible because of the loss of performance.
 The tests already take 7 hours to execute with kvm.




--

Comment By: gwachs (gerdwachs)
Date: 2008-06-24 09:32

Message:
Logged In: YES 
user_id=2122332
Originator: YES

Noted that I get the following in the linux console :

apic write: bad size=1 fee00030


--

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=2001452&group_id=180599
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[ kvm-Bugs-2001452 ] Restarted Windows 2003 Server guests have disk corruption

2008-06-24 Thread SourceForge.net
Bugs item #2001452, was opened at 2008-06-24 07:27
Message generated for change (Comment added) made by gerdwachs

Comment By: gwachs (gerdwachs)
Date: 2008-06-24 10:15

Message:
Logged In: YES 
user_id=2122332
Originator: YES

The message "apic write: bad size=1 fee00030" only occurs when the guest
is started using kvm, i.e. it does not occur with the -no-kvm option.

When using the -no-acpi option, the guest does not start at all, with or
without kvm.



[PATCH] KVM: VMX: Some defined name fix

2008-06-24 Thread Yang, Sheng
From 0dae764c94f48bd05f796947df1c85028ade59fa Mon Sep 17 00:00:00 2001
From: Sheng Yang [EMAIL PROTECTED]
Date: Tue, 24 Jun 2008 17:02:38 +0800
Subject: [PATCH] KVM: VMX: Some defined name fix

MSR_IA32_FEATURE_CONTROL_LOCKED is in fact just a bit, so it shouldn't be
prefixed with MSR_. The same goes for MSR_IA32_FEATURE_CONTROL_VMXON_ENABLED.

Signed-off-by: Sheng Yang [EMAIL PROTECTED]
---
 arch/x86/kvm/vmx.c |   18 +-
 arch/x86/kvm/vmx.h |4 ++--
 2 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 6e4278d..cc3f52b 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1015,9 +1015,9 @@ static __init int vmx_disabled_by_bios(void)
 	u64 msr;
 
 	rdmsrl(MSR_IA32_FEATURE_CONTROL, msr);
-	return (msr & (MSR_IA32_FEATURE_CONTROL_LOCKED |
-		       MSR_IA32_FEATURE_CONTROL_VMXON_ENABLED))
-	    == MSR_IA32_FEATURE_CONTROL_LOCKED;
+	return (msr & (IA32_FEATURE_CONTROL_LOCKED_BIT |
+		       IA32_FEATURE_CONTROL_VMXON_ENABLED_BIT))
+	    == IA32_FEATURE_CONTROL_LOCKED_BIT;
 	/* locked but not enabled */
 }
 
@@ -1029,14 +1029,14 @@ static void hardware_enable(void *garbage)
 
 	INIT_LIST_HEAD(&per_cpu(vcpus_on_cpu, cpu));
 	rdmsrl(MSR_IA32_FEATURE_CONTROL, old);
-	if ((old & (MSR_IA32_FEATURE_CONTROL_LOCKED |
-		    MSR_IA32_FEATURE_CONTROL_VMXON_ENABLED))
-	    != (MSR_IA32_FEATURE_CONTROL_LOCKED |
-		MSR_IA32_FEATURE_CONTROL_VMXON_ENABLED))
+	if ((old & (IA32_FEATURE_CONTROL_LOCKED_BIT |
+		    IA32_FEATURE_CONTROL_VMXON_ENABLED_BIT))
+	    != (IA32_FEATURE_CONTROL_LOCKED_BIT |
+		IA32_FEATURE_CONTROL_VMXON_ENABLED_BIT))
 		/* enable and lock */
 		wrmsrl(MSR_IA32_FEATURE_CONTROL, old |
-		       MSR_IA32_FEATURE_CONTROL_LOCKED |
-		       MSR_IA32_FEATURE_CONTROL_VMXON_ENABLED);
+		       IA32_FEATURE_CONTROL_LOCKED_BIT |
+		       IA32_FEATURE_CONTROL_VMXON_ENABLED_BIT);
 	write_cr4(read_cr4() | X86_CR4_VMXE); /* FIXME: not cpu hotplug safe */
 	asm volatile (ASM_VMX_VMXON_RAX
 		      : : "a"(&phys_addr), "m"(phys_addr)
diff --git a/arch/x86/kvm/vmx.h b/arch/x86/kvm/vmx.h
index 425a134..0c22e5f 100644
--- a/arch/x86/kvm/vmx.h
+++ b/arch/x86/kvm/vmx.h
@@ -346,8 +346,8 @@ enum vmcs_field {
 #define MSR_IA32_VMX_EPT_VPID_CAP               0x48c
 
 #define MSR_IA32_FEATURE_CONTROL                0x3a
-#define MSR_IA32_FEATURE_CONTROL_LOCKED         0x1
-#define MSR_IA32_FEATURE_CONTROL_VMXON_ENABLED  0x4
+#define IA32_FEATURE_CONTROL_LOCKED_BIT		0x1
+#define IA32_FEATURE_CONTROL_VMXON_ENABLED_BIT	0x4
 
 #define APIC_ACCESS_PAGE_PRIVATE_MEMSLOT	9
 #define IDENTITY_PAGETABLE_PRIVATE_MEMSLOT	10
--
1.5.5
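
The two bit tests touched by this rename can be tried out on plain integer
values. The following is a standalone sketch, not the kernel code: it takes
the MSR value as a parameter instead of calling rdmsrl()/wrmsrl(), and the
names ending in _SKETCH / _sketch are invented for illustration. The bit
values 0x1 and 0x4 are the ones from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define FC_LOCKED_BIT_SKETCH        0x1u  /* lock bit, as in the patch */
#define FC_VMXON_ENABLED_BIT_SKETCH 0x4u  /* VMXON enable bit, as in the patch */

/* True when the register is locked but VMXON was not enabled, i.e. the
 * firmware locked VMX off. Mirrors the test in vmx_disabled_by_bios(). */
static bool vmx_disabled_by_bios_sketch(uint64_t msr)
{
    uint64_t mask = FC_LOCKED_BIT_SKETCH | FC_VMXON_ENABLED_BIT_SKETCH;

    return (msr & mask) == FC_LOCKED_BIT_SKETCH;
}

/* Value hardware_enable() would end up with: set both bits unless both
 * are already set. Only computes the value; nothing is written. */
static uint64_t feature_control_enable_sketch(uint64_t old)
{
    uint64_t want = FC_LOCKED_BIT_SKETCH | FC_VMXON_ENABLED_BIT_SKETCH;

    if ((old & want) != want)
        return old | want;
    return old;
}
```

A value of 0x1 (locked only) reports "disabled by BIOS", while 0x5 (locked
and enabled) and 0x0 (unlocked) do not.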


[PATCH 1/2] x86/KVM: Move VMX MSR definition to msr-index.h

2008-06-24 Thread Yang, Sheng
From 90cc7b5303fab30d53d42b3fb7281d756b3d7134 Mon Sep 17 00:00:00 2001
From: Sheng Yang [EMAIL PROTECTED]
Date: Tue, 24 Jun 2008 17:02:41 +0800
Subject: [PATCH] x86/KVM: Move VMX MSR definition to msr-index.h


Signed-off-by: Sheng Yang [EMAIL PROTECTED]
---
 arch/x86/kvm/vmx.h  |   15 ---
 include/asm-x86/msr-index.h |   16 
 2 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kvm/vmx.h b/arch/x86/kvm/vmx.h
index 0c22e5f..da06a4a 100644
--- a/arch/x86/kvm/vmx.h
+++ b/arch/x86/kvm/vmx.h
@@ -331,21 +331,6 @@ enum vmcs_field {

 #define AR_RESERVD_MASK 0xfffe0f00

-#define MSR_IA32_VMX_BASIC                      0x480
-#define MSR_IA32_VMX_PINBASED_CTLS              0x481
-#define MSR_IA32_VMX_PROCBASED_CTLS             0x482
-#define MSR_IA32_VMX_EXIT_CTLS                  0x483
-#define MSR_IA32_VMX_ENTRY_CTLS                 0x484
-#define MSR_IA32_VMX_MISC                       0x485
-#define MSR_IA32_VMX_CR0_FIXED0                 0x486
-#define MSR_IA32_VMX_CR0_FIXED1                 0x487
-#define MSR_IA32_VMX_CR4_FIXED0                 0x488
-#define MSR_IA32_VMX_CR4_FIXED1                 0x489
-#define MSR_IA32_VMX_VMCS_ENUM                  0x48a
-#define MSR_IA32_VMX_PROCBASED_CTLS2            0x48b
-#define MSR_IA32_VMX_EPT_VPID_CAP               0x48c
-
-#define MSR_IA32_FEATURE_CONTROL                0x3a
 #define IA32_FEATURE_CONTROL_LOCKED_BIT		0x1
 #define IA32_FEATURE_CONTROL_VMXON_ENABLED_BIT	0x4

diff --git a/include/asm-x86/msr-index.h b/include/asm-x86/msr-index.h
index 09413ad..59ffc93 100644
--- a/include/asm-x86/msr-index.h
+++ b/include/asm-x86/msr-index.h
@@ -174,6 +174,7 @@
 #define MSR_IA32_TSC			0x0010
 #define MSR_IA32_PLATFORM_ID		0x0017
 #define MSR_IA32_EBL_CR_POWERON		0x002a
+#define MSR_IA32_FEATURE_CONTROL	0x003a
 
 #define MSR_IA32_APICBASE		0x001b
 #define MSR_IA32_APICBASE_BSP		(1<<8)
@@ -194,6 +195,21 @@
 #define MSR_IA32_THERM_STATUS		0x019c
 #define MSR_IA32_MISC_ENABLE		0x01a0
 
+/* Intel VT related */
+#define MSR_IA32_VMX_BASIC		0x0480
+#define MSR_IA32_VMX_PINBASED_CTLS	0x0481
+#define MSR_IA32_VMX_PROCBASED_CTLS	0x0482
+#define MSR_IA32_VMX_EXIT_CTLS		0x0483
+#define MSR_IA32_VMX_ENTRY_CTLS		0x0484
+#define MSR_IA32_VMX_MISC		0x0485
+#define MSR_IA32_VMX_CR0_FIXED0		0x0486
+#define MSR_IA32_VMX_CR0_FIXED1		0x0487
+#define MSR_IA32_VMX_CR4_FIXED0		0x0488
+#define MSR_IA32_VMX_CR4_FIXED1		0x0489
+#define MSR_IA32_VMX_VMCS_ENUM		0x048a
+#define MSR_IA32_VMX_PROCBASED_CTLS2	0x048b
+#define MSR_IA32_VMX_EPT_VPID_CAP	0x048c
+
 /* Intel Model 6 */
 #define MSR_P6_EVNTSEL0			0x0186
 #define MSR_P6_EVNTSEL1			0x0187
--
1.5.5


Re: PCI PT: irq issue

2008-06-24 Thread Amit Shah
On Monday 23 June 2008 20:46:18 Han, Weidong wrote:
> Amit Shah wrote:
>> On Saturday 21 June 2008 09:41:18 Han, Weidong wrote:
>>> Amit Shah wrote:
>>>> A couple of notes for the VT-d patch:
>>>> - The pci_dev struct is now available in the pci_pt kernel
>>>> structure, so just use that information each time you want to add a
>>>> device instead of searching for it each time.
>>>> - The kernel with KVM VT-d patches doesn't build on the
>>>> kvm-userspace.git tree. Please fix that.
>>>
>>> I pulled the latest VT-d branch, and it works fine for me.
>>
>> I mean the 'vtd' branch in the kernel tree with 'master' branch of the
>> userspace tree aren't compatible.
>
> I tried it. It can build. Can you show me the errors?

If you compile kvm as external modules, the userspace tree doesn't contain the 
vtd.o file while linking, but kvm.ko refers to objects within it:


  Building modules, stage 2.
  MODPOST 3 modules
WARNING: "kvm_intel_iommu_found" [/home/amit/src/kvm-userspace/kernel/kvm.ko] undefined!
WARNING: "kvm_iommu_unmap_guest" [/home/amit/src/kvm-userspace/kernel/kvm.ko] undefined!
WARNING: "kvm_iommu_map_guest" [/home/amit/src/kvm-userspace/kernel/kvm.ko] undefined!
WARNING: "kvm_iommu_map_pages" [/home/amit/src/kvm-userspace/kernel/kvm.ko] undefined!



Re: [PATCH] Ignore DEBUGCTL MSRs

2008-06-24 Thread Joerg Roedel
Hi Alex,

On Tue, Jun 24, 2008 at 07:04:45AM +0200, Alexander Graf wrote:
> Netware writes and reads to the DEBUGCTL and LAST*IP MSRs without
> further checks and is really confused to receive a #GP during that. To
> make it happy we should just make them stubs, which is exactly what
> SVM already does.
>
> To support VMX too, I put these in the generic code. Maybe the SVM
> code could be cleaned up to use generic code too.

I would prefer that you put this into the VMX-specific code. We can't move
the SVM parts of it into generic code because Barcelona has hardware
support to virtualize these registers. Therefore SVM doesn't need these
stubs in generic code.

> Signed-off-by: Alexander Graf [EMAIL PROTECTED]
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index fc0721e..02f8490 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -609,6 +609,11 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 data)
>  		pr_unimpl(vcpu, "%s: MSR_IA32_MCG_CTL 0x%llx, nop\n",
>  			__func__, data);
>  		break;
> +	case MSR_IA32_DEBUGCTLMSR:
> +	case MSR_IA32_LASTBRANCHFROMIP:
> +	case MSR_IA32_LASTBRANCHTOIP:
> +	case MSR_IA32_LASTINTFROMIP:
> +	case MSR_IA32_LASTINTTOIP:
>  	case MSR_IA32_UCODE_REV:
>  	case MSR_IA32_UCODE_WRITE:
>  		break;
> @@ -705,6 +710,11 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata)
>  	case MSR_IA32_MC0_MISC+16:
>  	case MSR_IA32_UCODE_REV:
>  	case MSR_IA32_EBL_CR_POWERON:
> +	case MSR_IA32_DEBUGCTLMSR:
> +	case MSR_IA32_LASTBRANCHFROMIP:
> +	case MSR_IA32_LASTBRANCHTOIP:
> +	case MSR_IA32_LASTINTFROMIP:
> +	case MSR_IA32_LASTINTTOIP:
>  		data = 0;
>  		break;
>  	case MSR_MTRRcap:


-- 
           |           AMD Saxony Limited Liability Company & Co. KG
 Operating |         Wilschdorfer Landstr. 101, 01109 Dresden, Germany
 System    |                  Register Court Dresden: HRA 4896
 Research  |              General Partner authorized to represent:
 Center    |             AMD Saxony LLC (Wilmington, Delaware, US)
           | General Manager of AMD Saxony LLC: Dr. Hans-R. Deppe, Thomas McCoy

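To make the "stub" idea in this thread concrete, here is a small standalone
sketch of MSR handlers that accept the debug MSRs instead of faulting:
reads return 0 and writes are silently dropped. It is not the KVM code.
The function names and the _SKETCH constants are hypothetical, and the MSR
numbers are the values as I recall them from msr-index.h, so treat them as
assumptions and check them against the header.

```c
#include <stdint.h>

/* MSR indices as recalled from msr-index.h (assumption, please verify). */
#define MSR_DEBUGCTL_SKETCH         0x1d9
#define MSR_LASTBRANCHFROMIP_SKETCH 0x1db
#define MSR_LASTBRANCHTOIP_SKETCH   0x1dc
#define MSR_LASTINTFROMIP_SKETCH    0x1dd
#define MSR_LASTINTTOIP_SKETCH      0x1de

/* Returns 0 if the read is handled, 1 if the caller should inject #GP. */
static int get_msr_sketch(uint32_t msr, uint64_t *pdata)
{
    switch (msr) {
    case MSR_DEBUGCTL_SKETCH:
    case MSR_LASTBRANCHFROMIP_SKETCH:
    case MSR_LASTBRANCHTOIP_SKETCH:
    case MSR_LASTINTFROMIP_SKETCH:
    case MSR_LASTINTTOIP_SKETCH:
        *pdata = 0;     /* stub: always reads as zero */
        return 0;
    default:
        return 1;       /* unknown MSR: let the caller raise #GP */
    }
}

/* Returns 0 if the write is accepted (and ignored), 1 for #GP. */
static int set_msr_sketch(uint32_t msr, uint64_t data)
{
    (void)data;         /* stub: the written value is discarded */

    switch (msr) {
    case MSR_DEBUGCTL_SKETCH:
    case MSR_LASTBRANCHFROMIP_SKETCH:
    case MSR_LASTBRANCHTOIP_SKETCH:
    case MSR_LASTINTFROMIP_SKETCH:
    case MSR_LASTINTTOIP_SKETCH:
        return 0;
    default:
        return 1;
    }
}
```

Whether such stubs live in generic or VMX-specific code is exactly the
question under discussion above; the sketch takes no position on that.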


[PATCH] Avoid fragment virtio-blk transfers by copying

2008-06-24 Thread Anthony Liguori
A major source of performance loss for virtio-blk has been the fact that we
split transfers into multiple requests.  This is particularly harmful if you
have striped storage beneath your virtual machine.

This patch copies the request data into a single contiguous buffer to ensure
that we don't split requests.  This improves performance from about 80 MB/sec
to about 155 MB/sec with my fibre channel link.  185 MB/sec is what we get on
native so this gets us pretty darn close.
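
The copying approach described above can be illustrated outside QEMU with
plain POSIX iovecs: gather all segments into one contiguous buffer before
a write, and scatter a contiguous buffer back into the segments after a
read, clamping each copy so a bogus segment length can't run past the
buffer. This is only a sketch of the technique; gather_sketch and
scatter_sketch are hypothetical names and none of this is the virtio-blk
code itself.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/uio.h>

/* Copy every segment into one freshly allocated contiguous buffer.
 * Returns NULL on allocation failure; *size_out receives the total size. */
static unsigned char *gather_sketch(const struct iovec *sg, int num,
                                    size_t *size_out)
{
    size_t size = 0, offset = 0;
    unsigned char *buf;
    int i;

    for (i = 0; i < num; i++)
        size += sg[i].iov_len;

    buf = malloc(size ? size : 1);
    if (buf == NULL)
        return NULL;

    for (i = 0; i < num; i++) {
        memcpy(buf + offset, sg[i].iov_base, sg[i].iov_len);
        offset += sg[i].iov_len;
    }

    *size_out = size;
    return buf;
}

/* Copy a contiguous buffer back out into the segments. Each copy is
 * clamped to the bytes remaining in buf. Returns the bytes copied. */
static size_t scatter_sketch(const struct iovec *sg, int num,
                             const unsigned char *buf, size_t size)
{
    size_t offset = 0;
    int i;

    for (i = 0; i < num; i++) {
        size_t len = sg[i].iov_len;

        if (len > size - offset)
            len = size - offset;

        memcpy(sg[i].iov_base, buf + offset, len);
        offset += len;
    }
    return offset;
}
```

The trade-off is one extra memory copy per request in exchange for a
single large I/O submission.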

Signed-off-by: Anthony Liguori [EMAIL PROTECTED]

diff --git a/qemu/hw/virtio-blk.c b/qemu/hw/virtio-blk.c
index 233e6e7..2ea5669 100644
--- a/qemu/hw/virtio-blk.c
+++ b/qemu/hw/virtio-blk.c
@@ -72,6 +72,7 @@ typedef struct VirtIOBlock
 {
     VirtIODevice vdev;
     BlockDriverState *bs;
+    VirtQueue *vq;
 } VirtIOBlock;
 
 static VirtIOBlock *to_virtio_blk(VirtIODevice *vdev)
@@ -81,106 +82,138 @@ static VirtIOBlock *to_virtio_blk(VirtIODevice *vdev)
 
 typedef struct VirtIOBlockReq
 {
-    VirtIODevice *vdev;
-    VirtQueue *vq;
-    struct iovec in_sg_status;
-    unsigned int pending;
-    unsigned int len;
-    unsigned int elem_idx;
-    int status;
+    VirtIOBlock *dev;
+    VirtQueueElement elem;
+    struct virtio_blk_inhdr *in;
+    struct virtio_blk_outhdr *out;
+    size_t size;
+    uint8_t *buffer;
 } VirtIOBlockReq;
 
 static void virtio_blk_rw_complete(void *opaque, int ret)
 {
     VirtIOBlockReq *req = opaque;
-    struct virtio_blk_inhdr *in;
-    VirtQueueElement elem;
+    VirtIOBlock *s = req->dev;
+
+    /* Copy read data to the guest */
+    if (!ret && !(req->out->type & VIRTIO_BLK_T_OUT)) {
+        size_t offset = 0;
+        int i;
 
-    req->status |= ret;
-    if (--req->pending > 0)
-        return;
+        for (i = 0; i < req->elem.in_num - 1; i++) {
+            size_t len;
 
-    elem.index = req->elem_idx;
-    in = (void *)req->in_sg_status.iov_base;
+            /* Be pretty defensive wrt malicious guests */
+            len = MIN(req->elem.in_sg[i].iov_len,
+                      req->size - offset);
 
-    in->status = req->status ? VIRTIO_BLK_S_IOERR : VIRTIO_BLK_S_OK;
-    virtqueue_push(req->vq, &elem, req->len);
-    virtio_notify(req->vdev, req->vq);
+            memcpy(req->elem.in_sg[i].iov_base,
+                   req->buffer + offset,
+                   len);
+            offset += len;
+        }
+    }
+
+    req->in->status = ret ? VIRTIO_BLK_S_IOERR : VIRTIO_BLK_S_OK;
+    virtqueue_push(s->vq, &req->elem, req->size + sizeof(*req->in));
+    virtio_notify(&s->vdev, s->vq);
+
+    qemu_free(req->buffer);
     qemu_free(req);
 }
 
+static VirtIOBlockReq *virtio_blk_get_request(VirtIOBlock *s)
+{
+    VirtIOBlockReq *req;
+
+    req = qemu_mallocz(sizeof(*req));
+    if (req == NULL)
+        return NULL;
+
+    req->dev = s;
+    if (!virtqueue_pop(s->vq, &req->elem)) {
+        qemu_free(req);
+        return NULL;
+    }
+
+    return req;
+}
+
 static void virtio_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq)
 {
     VirtIOBlock *s = to_virtio_blk(vdev);
-    VirtQueueElement elem;
     VirtIOBlockReq *req;
-    unsigned int count;
 
-    while ((count = virtqueue_pop(vq, &elem)) != 0) {
-        struct virtio_blk_inhdr *in;
-        struct virtio_blk_outhdr *out;
-        off_t off;
+    while ((req = virtio_blk_get_request(s))) {
         int i;
 
-        if (elem.out_num < 1 || elem.in_num < 1) {
+        if (req->elem.out_num < 1 || req->elem.in_num < 1) {
             fprintf(stderr, "virtio-blk missing headers\n");
             exit(1);
         }
 
-        if (elem.out_sg[0].iov_len != sizeof(*out) ||
-            elem.in_sg[elem.in_num - 1].iov_len != sizeof(*in)) {
+        if (req->elem.out_sg[0].iov_len < sizeof(*req->out) ||
+            req->elem.in_sg[req->elem.in_num - 1].iov_len < sizeof(*req->in)) {
             fprintf(stderr, "virtio-blk header not in correct element\n");
             exit(1);
         }
 
-        /*
-         * FIXME: limit the number of in-flight requests
-         */
-        req = qemu_malloc(sizeof(VirtIOBlockReq));
-        if (!req)
-            return;
-        memset(req, 0, sizeof(*req));
-        memcpy(&req->in_sg_status, &elem.in_sg[elem.in_num - 1],
-               sizeof(req->in_sg_status));
-        req->vdev = vdev;
-        req->vq = vq;
-        req->elem_idx = elem.index;
-
-        out = (void *)elem.out_sg[0].iov_base;
-        in = (void *)elem.in_sg[elem.in_num - 1].iov_base;
-        off = out->sector;
-
-        if (out->type & VIRTIO_BLK_T_SCSI_CMD) {
-            unsigned int len = sizeof(*in);
-
-            in->status = VIRTIO_BLK_S_UNSUPP;
-            virtqueue_push(vq, &elem, len);
+        req->out = (void *)req->elem.out_sg[0].iov_base;
+        req->in = (void *)req->elem.in_sg[req->elem.in_num - 1].iov_base;
+
+        if (req->out->type & VIRTIO_BLK_T_SCSI_CMD) {
+            unsigned int len = sizeof(*req->in);
+
+            req->in->status = VIRTIO_BLK_S_UNSUPP;
+            virtqueue_push(vq, &req->elem, len);
             virtio_notify(vdev, vq);
             qemu_free(req);
-        } else if (out->type & VIRTIO_BLK_T_OUT) {
- 

[PATCH] Use qemu_memalign instead of qemu_malloc

2008-06-24 Thread Anthony Liguori
I guess the main block code is not as defensive as I thought it was.  This patch
uses qemu_memalign to allocate the buffers for IO so that you don't get errors
when using O_DIRECT.

It applies on top of my previous patch to introduce copies in virtio-blk.

Signed-off-by: Anthony Liguori [EMAIL PROTECTED]

diff --git a/qemu/hw/virtio-blk.c b/qemu/hw/virtio-blk.c
index 2ea5669..669e55f 100644
--- a/qemu/hw/virtio-blk.c
+++ b/qemu/hw/virtio-blk.c
@@ -174,7 +174,7 @@ static void virtio_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq)
             for (i = 1; i < req->elem.out_num; i++)
                 req->size += req->elem.out_sg[i].iov_len;
 
-            req->buffer = qemu_malloc(req->size);
+            req->buffer = qemu_memalign(512, req->size);
             if (req->buffer == NULL) {
                 qemu_free(req);
                 break;
@@ -203,7 +203,7 @@ static void virtio_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq)
             for (i = 0; i < req->elem.in_num - 1; i++)
                 req->size += req->elem.in_sg[i].iov_len;
 
-            req->buffer = qemu_malloc(req->size);
+            req->buffer = qemu_memalign(512, req->size);
             if (req->buffer == NULL) {
                 qemu_free(req);
                 break;
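For anyone wanting to reproduce the alignment requirement outside QEMU,
the standard POSIX call posix_memalign() gives the same kind of buffer.
The sketch below is not qemu_memalign(); alloc_io_buffer_sketch is a
hypothetical helper, and the 512-byte alignment simply copies the value
used in the patch. The alignment O_DIRECT actually needs depends on the
file system and device, so 512 is an assumption here.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define IO_ALIGN_SKETCH 512  /* same value as in the patch; an assumption */

/* Allocate a buffer whose address is a multiple of IO_ALIGN_SKETCH.
 * Returns NULL on failure. Release it with free(). */
static void *alloc_io_buffer_sketch(size_t size)
{
    void *buf = NULL;

    /* Avoid zero-sized requests, whose result is implementation-defined. */
    if (size == 0)
        size = IO_ALIGN_SKETCH;

    if (posix_memalign(&buf, IO_ALIGN_SKETCH, size) != 0)
        return NULL;

    return buf;
}
```

A buffer obtained this way can be passed to read()/write() on a descriptor
opened with O_DIRECT, provided the length and file offset also meet the
device's alignment rules.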


[PATCH] [v2] Remove use of bit fields in kvm trace structure

2008-06-24 Thread Jerone Young
2 files changed, 30 insertions(+), 11 deletions(-)
include/linux/kvm.h  |   17 ++---
virt/kvm/kvm_trace.c |   24 


 *Updates:
Create global definitions for setting trace records as opposed to
explicitly setting them inside of a function.

This patch fixes kvmtrace use on big endian systems. When bit fields are
used, the compiler lays the data out in an order different from the one
expected when the record is written to a file. This fixes it by using a
single variable instead of bit fields.

Signed-off-by: Jerone Young [EMAIL PROTECTED]

diff --git a/include/linux/kvm.h b/include/linux/kvm.h
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -311,9 +311,13 @@ struct kvm_s390_interrupt {
 
 /* This structure represents a single trace buffer record. */
 struct kvm_trace_rec {
-	__u32 event:28;
-	__u32 extra_u32:3;
-	__u32 cycle_in:1;
+	/* variable rec_val
+	 * is split into:
+	 * bits 0 - 27  - event id
+	 * bits 28 -30  - number of extra data args of size u32
+	 * bits 31      - binary indicator for if tsc is in record
+	 */
+	__u32 rec_val;
 	__u32 pid;
 	__u32 vcpu_id;
 	union {
@@ -326,6 +330,13 @@ struct kvm_trace_rec {
 		} nocycle;
 	} u;
 } __attribute__((packed));
+
+#define TRACE_REC_EVENT_ID (val) \
+		(0x0fffffff & (val))
+#define TRACE_REC_NUM_DATA_ARGS (val) \
+		(0x70000000 & (val << 28))
+#define TRACE_REC_TCS (val) \
+		(0x80000000 & (val << 31))
 
 #define KVMIO 0xAE
 
diff --git a/virt/kvm/kvm_trace.c b/virt/kvm/kvm_trace.c
--- a/virt/kvm/kvm_trace.c
+++ b/virt/kvm/kvm_trace.c
@@ -54,12 +54,17 @@ static void kvm_add_trace(void *probe_pr
 	struct kvm_trace *kt = kvm_trace;
 	struct kvm_trace_rec rec;
 	struct kvm_vcpu *vcpu;
-	int    i, extra, size;
+	int    i, size;
+	u32    extra;
 
 	if (unlikely(kt->trace_state != KVM_TRACE_STATE_RUNNING))
 		return;
 
-	rec.event	= va_arg(*args, u32);
+	rec.rec_val	= 0;
+
+	/* set event id */
+	rec.rec_val	|= TRACE_REC_EVENT_ID(va_arg(*args, u32));
+
 	vcpu		= va_arg(*args, struct kvm_vcpu *);
 	rec.pid		= current->tgid;
 	rec.vcpu_id	= vcpu->vcpu_id;
@@ -67,21 +72,24 @@ static void kvm_add_trace(void *probe_pr
 	extra		= va_arg(*args, u32);
 	WARN_ON(!(extra <= KVM_TRC_EXTRA_MAX));
 	extra		= min_t(u32, extra, KVM_TRC_EXTRA_MAX);
-	rec.extra_u32   = extra;
 
-	rec.cycle_in    = p->cycle_in;
+	/* set inidicator for tcs record */
+	rec.rec_val	|= TRACE_REC_TCS(p->cycle_in);
 
-	if (rec.cycle_in) {
+	/* set extra data num */
+	rec.rec_val	|= TRACE_REC_NUM_DATA_ARGS(extra);
+
+	if (p->cycle_in) {
 		rec.u.cycle.cycle_u64 = get_cycles();
 
-		for (i = 0; i < rec.extra_u32; i++)
+		for (i = 0; i < extra; i++)
 			rec.u.cycle.extra_u32[i] = va_arg(*args, u32);
 	} else {
-		for (i = 0; i < rec.extra_u32; i++)
+		for (i = 0; i < extra; i++)
 			rec.u.nocycle.extra_u32[i] = va_arg(*args, u32);
 	}
 
-	size = calc_rec_size(p->cycle_in, extra * sizeof(u32));
+	size = calc_rec_size(p->cycle_in, extra * sizeof(u32));
 	relay_write(kt->rchan, &rec, size);
 }
 
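The packing scheme in this patch (event id in bits 0-27, extra-argument
count in bits 28-30, TSC flag in bit 31) can be exercised in isolation.
The following is a standalone sketch under that stated layout, not the
patch's code: all names ending in _SKETCH / _sketch are invented, and the
macros are written as function-like macros, with no space between the
macro name and its parameter list.

```c
#include <stdint.h>

/* Field layout taken from the comment in the patch:
 *   bits 0-27  event id
 *   bits 28-30 number of extra u32 arguments
 *   bit  31    set when the record carries a TSC value */
#define REC_EVENT_ID_SKETCH(val)      (0x0fffffffu & (uint32_t)(val))
#define REC_NUM_DATA_ARGS_SKETCH(val) (0x70000000u & ((uint32_t)(val) << 28))
#define REC_TSC_SKETCH(val)           (0x80000000u & ((uint32_t)(val) << 31))

/* Build the 32-bit record header using shifts and masks only, so the
 * value itself does not depend on the compiler's bit-field ordering. */
static uint32_t pack_rec_val_sketch(uint32_t event, uint32_t extra,
                                    uint32_t tsc_in)
{
    return REC_EVENT_ID_SKETCH(event) |
           REC_NUM_DATA_ARGS_SKETCH(extra) |
           REC_TSC_SKETCH(tsc_in);
}

/* Matching decoders, as a trace reader might use them. */
static uint32_t rec_event_sketch(uint32_t v)   { return v & 0x0fffffffu; }
static uint32_t rec_extra_sketch(uint32_t v)   { return (v >> 28) & 0x7u; }
static uint32_t rec_has_tsc_sketch(uint32_t v) { return v >> 31; }
```

Packing event 0x20001 with three extra arguments and the TSC flag set
yields 0xB0020001. A reader on a host of different endianness still has to
byte-swap the 32-bit word as a whole, but no longer has to guess how the
compiler ordered the bit fields.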


Re: RFC: cache_regs in kvm_emulate_pio

2008-06-24 Thread Marcelo Tosatti
On Sun, Jun 22, 2008 at 08:16:19AM +0300, Avi Kivity wrote:
> Marcelo Tosatti wrote:
>> On Sat, Jun 21, 2008 at 10:04:18AM +0300, Avi Kivity wrote:
>>>> /*
>>>>  * Sync the rsp and rip registers into the vcpu structure.  This allows
>>>>  * registers to be accessed by indexing vcpu->arch.regs.
>>>>  */
>>>>
>>>> But I think it just refers to the interface in general, so that nobody
>>>> would try to access RSP or RIP (and RAX in AMD's case) before calling
>>>> ->cache_regs().
>>>
>>> It refers to the fact that sometimes you don't know which registers
>>> you refer to, e.g. in the emulator.
>>
>> How's this?
>
> Looks good, but we can aim higher.  The cache_regs() API was always
> confusing (I usually swap the two parts).  If we replace all ->regs
> access with accessors, we can make it completely transparent.
>
> It will be tricky in the emulator, but worthwhile, no?

OK, in the emulator an interface on top of guest_register_write() is
needed to save registers so that the original contents can be restored
on failure. Some brave soul can do it later, so I added a TODO in x86.c.

Smells better now?
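
The accessor idea can be tried out in a toy model before reading the patch.
The sketch below is not the KVM code: vcpu_sketch, the hw[] array (standing
in for the VMCB/VMCS), and all function names are hypothetical. A read
fetches a register from "hardware" only the first time, a write marks it
dirty, and a flush copies dirty registers back. One deliberate difference
from the posted patch: the sketch also marks a written register as
available, so a later read does not refetch the stale hardware value; that
is my own choice, not something the patch does.

```c
#include <stdint.h>

enum reg_sketch {
    REG_RAX_SKETCH,
    REG_RSP_SKETCH,
    REG_RIP_SKETCH,
    NR_REGS_SKETCH
};

struct vcpu_sketch {
    unsigned long regs[NR_REGS_SKETCH]; /* software cache */
    unsigned long hw[NR_REGS_SKETCH];   /* stands in for VMCB/VMCS state */
    unsigned long avail;                /* bit set: regs[] entry is valid */
    unsigned long dirty;                /* bit set: regs[] newer than hw[] */
    int hw_reads;                       /* counts fetches, for demonstration */
};

/* Read a register, fetching it from "hardware" only on first use. */
static unsigned long reg_read_sketch(struct vcpu_sketch *v, enum reg_sketch r)
{
    if (!(v->avail & (1UL << r))) {
        v->regs[r] = v->hw[r];
        v->hw_reads++;
        v->avail |= 1UL << r;
    }
    return v->regs[r];
}

/* Write a register into the cache and remember that it must be flushed. */
static void reg_write_sketch(struct vcpu_sketch *v, enum reg_sketch r,
                             unsigned long val)
{
    v->regs[r] = val;
    v->avail |= 1UL << r;
    v->dirty |= 1UL << r;
}

/* Before re-entering the guest: push dirty registers back, drop the cache. */
static void flush_dirty_sketch(struct vcpu_sketch *v)
{
    int r;

    for (r = 0; r < NR_REGS_SKETCH; r++)
        if (v->dirty & (1UL << r))
            v->hw[r] = v->regs[r];

    v->dirty = 0;
    v->avail = 0;
}
```

Callers never need to remember to call a cache or decache hook themselves,
which is the transparency being asked for in the quoted review.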


--- /dev/null	2008-06-24 14:36:42.383774904 -0300
+++ b/arch/x86/kvm/kvm_cache_regs.h	2008-06-24 15:26:02.0 -0300
@@ -0,0 +1,21 @@
+#ifndef ASM_KVM_CACHE_REGS_H
+#define ASM_KVM_CACHE_REGS_H
+
+static inline unsigned long guest_register_read(struct kvm_vcpu *vcpu,
+						enum kvm_reg reg)
+{
+	if (!__test_and_set_bit(reg, &vcpu->arch.regs_available))
+		kvm_x86_ops->cache_regs(vcpu, reg);
+
+	return vcpu->arch.regs[reg];
+}
+
+static inline void guest_register_write(struct kvm_vcpu *vcpu,
+					enum kvm_reg reg,
+					unsigned long val)
+{
+	vcpu->arch.regs[reg] = val;
+	__set_bit(reg, &vcpu->arch.regs_dirty);
+}
+
+#endif
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 73f43de..97919b6 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -32,6 +32,7 @@
 #include <asm/current.h>
 #include <asm/apicdef.h>
 #include <asm/atomic.h>
+#include "kvm_cache_regs.h"
 #include "irq.h"
 
 #define PRId64 "d"
@@ -558,8 +559,7 @@ static void __report_tpr_access(struct kvm_lapic *apic, bool write)
 	struct kvm_run *run = vcpu->run;
 
 	set_bit(KVM_REQ_REPORT_TPR_ACCESS, &vcpu->requests);
-	kvm_x86_ops->cache_regs(vcpu);
-	run->tpr_access.rip = vcpu->arch.rip;
+	run->tpr_access.rip = guest_register_read(vcpu, VCPU_REGS_RIP);
 	run->tpr_access.is_write = write;
 }
 
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 238e8f3..acd96f6 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -18,6 +18,7 @@
 #include "kvm_svm.h"
 #include "irq.h"
 #include "mmu.h"
+#include "kvm_cache_regs.h"
 
 #include <linux/module.h>
 #include <linux/kernel.h>
@@ -241,7 +242,8 @@ static void skip_emulated_instruction(struct kvm_vcpu *vcpu)
 		       svm->vmcb->save.rip,
 		       svm->next_rip);
 
-	vcpu->arch.rip = svm->vmcb->save.rip = svm->next_rip;
+	svm->vmcb->save.rip = svm->next_rip;
+	guest_register_write(vcpu, VCPU_REGS_RIP, svm->vmcb->save.rip);
 	svm->vmcb->control.int_state &= ~SVM_INTERRUPT_SHADOW_MASK;
 
 	vcpu->arch.interrupt_window_open = 1;
@@ -709,21 +711,42 @@ static void svm_vcpu_put(struct kvm_vcpu *vcpu)
 	rdtscll(vcpu->arch.host_tsc);
 }
 
-static void svm_cache_regs(struct kvm_vcpu *vcpu)
+static void svm_cache_regs(struct kvm_vcpu *vcpu, enum kvm_reg reg)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
 
-	vcpu->arch.regs[VCPU_REGS_RAX] = svm->vmcb->save.rax;
-	vcpu->arch.regs[VCPU_REGS_RSP] = svm->vmcb->save.rsp;
-	vcpu->arch.rip = svm->vmcb->save.rip;
+	switch (reg) {
+	case VCPU_REGS_RAX:
+		vcpu->arch.regs[VCPU_REGS_RAX] = svm->vmcb->save.rax;
+		break;
+	case VCPU_REGS_RSP:
+		vcpu->arch.regs[VCPU_REGS_RSP] = svm->vmcb->save.rsp;
+		break;
+	case VCPU_REGS_RIP:
+		vcpu->arch.regs[VCPU_REGS_RIP] = svm->vmcb->save.rip;
+		break;
+	default:
+		break;
+	}
 }
 
-static void svm_decache_regs(struct kvm_vcpu *vcpu)
+static void svm_decache_regs(struct kvm_vcpu *vcpu, enum kvm_reg reg)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
-	svm->vmcb->save.rax = vcpu->arch.regs[VCPU_REGS_RAX];
-	svm->vmcb->save.rsp = vcpu->arch.regs[VCPU_REGS_RSP];
-	svm->vmcb->save.rip = vcpu->arch.rip;
+
+	switch (reg) {
+	case VCPU_REGS_RAX:
+		svm->vmcb->save.rax = vcpu->arch.regs[VCPU_REGS_RAX];
+		break;
+	case VCPU_REGS_RSP:
+		svm->vmcb->save.rsp = vcpu->arch.regs[VCPU_REGS_RSP];
+		break;
+	case VCPU_REGS_RIP:
+		svm->vmcb->save.rip = vcpu->arch.regs[VCPU_REGS_RIP];
+		break;
+	default:
+		break;
+	}
 }
 
 static unsigned long svm_get_rflags(struct kvm_vcpu *vcpu)
diff --git 

[GIT PULL] KVM fixes for 2.6.26-rc7

2008-06-24 Thread Avi Kivity

Linus, please pull from the repo and branch at:

 git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm.git kvm-updates-2.6.26


to receive kvm updates for 2.6.26-rc7.  The patches fix host oopses,
guest interrupt loss, and total kvm clock borkage.

Since one of the goals of kvm clock was to be binary compatible with the Xen
clock, the patchset moves the Xen time code to common code and makes kvm 
reuse it.

This fixes both the ABI and correctness issues.  Jeremy has acked the Xen
changes.
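
As background on what "binary compatible" means here, the following is a rough userspace sketch of how a guest could turn a TSC reading into nanoseconds from such a shared structure. The field layout follows the Xen-compatible time info struct (packed, 32 bytes); the names struct time_info, scale_delta and guest_time_ns are made up for the illustration, and the version-field retry loop a real reader needs is deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

/* Same field layout as the Xen-compatible per-vcpu time info. */
struct time_info {
	uint32_t version;
	uint32_t pad0;
	uint64_t tsc_timestamp;		/* TSC value at last update */
	uint64_t system_time;		/* ns at last update */
	uint32_t tsc_to_system_mul;	/* 32.32 fixed-point multiplier */
	int8_t   tsc_shift;		/* pre-shift applied to the delta */
	int8_t   pad[3];
} __attribute__((__packed__));

_Static_assert(sizeof(struct time_info) == 32, "ABI layout is 32 bytes");

/* ns = ((delta << shift) * mul) >> 32, computed without 128-bit math. */
static uint64_t scale_delta(uint64_t delta, uint32_t mul, int8_t shift)
{
	if (shift < 0)
		delta >>= -shift;
	else
		delta <<= shift;
	return (delta >> 32) * mul + (((delta & 0xffffffffULL) * mul) >> 32);
}

/* Assumes tsc_now was sampled after the structure was last updated. */
static uint64_t guest_time_ns(const struct time_info *ti, uint64_t tsc_now)
{
	assert(tsc_now >= ti->tsc_timestamp);
	return ti->system_time +
	       scale_delta(tsc_now - ti->tsc_timestamp,
			   ti->tsc_to_system_mul, ti->tsc_shift);
}
```

Because both hypervisors publish the same layout and the same scaling rule, one reader implementation can serve both, which is the point of moving the code to a common place.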

Avi Kivity (3):
 KVM: MMU: Fix oops on guest userspace access to guest pagetable
 KVM: ioapic: fix lost interrupt when changing a device's irq
 KVM: VMX: Fix host msr corruption with preemption enabled

Gerd Hoffmann (5):
 x86: Add structs and functions for paravirt clocksource
 x86: Make xen use the paravirt clocksource structs and functions
 KVM: Make kvm host use the paravirt clocksource structs
 x86: KVM guest: Use the paravirt clocksource structs and functions
 KVM: Remove now unused structs from kvm_para.h

Marcelo Tosatti (4):
 KVM: Fix race between timer migration and vcpu migration
 KVM: close timer injection race window in __vcpu_run
 KVM: MMU: Fix rmap_write_protect() hugepage iteration bug
 KVM: MMU: large page update_pte issue with non-PAE 32-bit guests (resend)


 arch/x86/Kconfig              |    5 ++
 arch/x86/kernel/Makefile      |    1 +
 arch/x86/kernel/kvmclock.c    |   89 ++
 arch/x86/kernel/pvclock.c     |  141 +
 arch/x86/kvm/i8254.c          |    9 ++-
 arch/x86/kvm/lapic.c          |    1 +
 arch/x86/kvm/mmu.c            |   19 +++---
 arch/x86/kvm/vmx.c            |   19 +++---
 arch/x86/kvm/x86.c            |   91 +++---
 arch/x86/xen/Kconfig          |    1 +
 arch/x86/xen/time.c           |  132 ---
 include/asm-x86/kvm_host.h    |    4 +-
 include/asm-x86/kvm_para.h    |   18 -
 include/asm-x86/pvclock-abi.h |   42
 include/asm-x86/pvclock.h     |   13
 include/linux/kvm_host.h      |    1 +
 include/xen/interface/xen.h   |    7 +-
 virt/kvm/ioapic.c             |   31 +++--
 18 files changed, 358 insertions(+), 266 deletions(-)
 create mode 100644 arch/x86/kernel/pvclock.c
 create mode 100644 include/asm-x86/pvclock-abi.h
 create mode 100644 include/asm-x86/pvclock.h

--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


qemu: remove overzealous virtio-net printf

2008-06-24 Thread Marcelo Tosatti

When two virtio devices share an interrupt, virtio-net floods the console
with the "this should not happen" message.

As Anthony points out, this is not a fatal condition: it's possible that the
guest consumed all ring elements between the can_receive check and the
actual net_receive call.

Signed-off-by: Marcelo Tosatti [EMAIL PROTECTED]

diff --git a/qemu/hw/virtio-net.c b/qemu/hw/virtio-net.c
index a61fdb1..2e57e5a 100644
--- a/qemu/hw/virtio-net.c
+++ b/qemu/hw/virtio-net.c
@@ -119,10 +119,8 @@ static void virtio_net_receive(void *opaque, const uint8_t *buf, int size)
     struct virtio_net_hdr *hdr;
     int offset, i;
 
-    if (virtqueue_pop(n->rx_vq, &elem) == 0) {
-	fprintf(stderr, "virtio_net: this should not happen\n");
+    if (virtqueue_pop(n->rx_vq, &elem) == 0)
 	return;
-    }
 
     if (elem.in_num < 1 || elem.in_sg[0].iov_len != sizeof(*hdr)) {
 	fprintf(stderr, "virtio-net header not in first element\n");


Re: qemu: remove overzealous virtio-net printf

2008-06-24 Thread Anthony Liguori

Marcelo Tosatti wrote:

> When two virtio devices share an interrupt, virtio-net floods the console
> with the "this should not happen" message.
>
> As Anthony points out, this is not a fatal condition: it's possible that the
> guest consumed all ring elements between the can_receive check and the
> actual net_receive call.
  


Thanks for catching this!



> Signed-off-by: Marcelo Tosatti [EMAIL PROTECTED]
  


Acked-by: Anthony Liguori [EMAIL PROTECTED]

Regards,

Anthony Liguori


> diff --git a/qemu/hw/virtio-net.c b/qemu/hw/virtio-net.c
> index a61fdb1..2e57e5a 100644
> --- a/qemu/hw/virtio-net.c
> +++ b/qemu/hw/virtio-net.c
> @@ -119,10 +119,8 @@ static void virtio_net_receive(void *opaque, const uint8_t *buf, int size)
>      struct virtio_net_hdr *hdr;
>      int offset, i;
>
> -    if (virtqueue_pop(n->rx_vq, &elem) == 0) {
> -	fprintf(stderr, "virtio_net: this should not happen\n");
> +    if (virtqueue_pop(n->rx_vq, &elem) == 0)
>  	return;
> -    }
>
>      if (elem.in_num < 1 || elem.in_sg[0].iov_len != sizeof(*hdr)) {
>  	fprintf(stderr, "virtio-net header not in first element\n");




Re: [GIT PULL] KVM fixes for 2.6.26-rc7

2008-06-24 Thread Linus Torvalds


On Wed, 25 Jun 2008, Avi Kivity wrote:

> Linus, please pull from the repo and branch at:
>
>  git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm.git kvm-updates-2.6.26
>
> to receive kvm updates for 2.6.26-rc7.  The patches fix host oopses,
> guest interrupt loss, and total kvm clock borkage.

Avi, you _really_ need to start respecting the merge window.

If you can't learn, I will have to just stop pulling from you. This is 
simply too big for this late in the game.

I pulled, but I'm simply not going to continue doing this dance. I don't 
care much for virtualization, so I've let it slide, but you need to learn 
that 

>  18 files changed, 358 insertions(+), 266 deletions(-)

is simply not acceptable this late. 

Linus


[PATCH] reserved-ram for pci-passthrough without VT-d capable hardware

2008-06-24 Thread Andrea Arcangeli
From: Andrea Arcangeli [EMAIL PROTECTED]

This has to be applied to the host kernel. For example, specifying a
relocation address of 0x20000000 allows starting kvm guests capable of
pci-passthrough with up to -m 512 by passing the -reserved-ram
parameter on the command line. There's no risk of user error because
the reserved ranges are provided to the virtualization software
through /proc/iomem. However, you shouldn't run more than one
-reserved-ram kvm guest per system at once.

This works by reserving the RAM early in the e820 map, so the initial
pagetables are allocated above the kernel .text relocation. I then
make the sparse code think the reserved RAM is actually available (so
struct pages are allocated), and finally I reserve those pages in the
bootmem allocator immediately after it has been initialized. They
remain PageReserved and unused by Linux, but with 'struct page'
backing, so they can still be exported to qemu via the device driver
vma->fault handler (they can still be the target of emulated DMA, as
not all devices will be passed through).

The virtualization software must create for the guest an e820 map that
only includes the "reserved RAM" regions, but if the guest touches
memory with a guest physical address in the "reserved RAM failed"
ranges, it should provide that as RAM and map it with a non-linear
mapping (in practice the only problem is the first page at physical
address 0, which usually belongs to the BIOS, and no sane OS does DMA
to it).

vmx ~ # cat /proc/iomem |head -n 20
00000000-00000fff : reserved RAM failed
00001000-0008ffff : reserved RAM
00090000-00091fff : reserved RAM failed
00092000-0009cfff : reserved RAM
0009d000-0009ffff : reserved
000a0000-000ec16f : reserved RAM failed
000ec170-000fffff : reserved
00100000-1fffffff : reserved RAM
20000000-bff9ffff : System RAM
  20000000-20315f65 : Kernel code
  20315f66-204c3767 : Kernel data
  20557000-205c9eff : Kernel bss
bffa0000-bffaffff : ACPI Tables
bffb0000-bffdffff : ACPI Non-volatile Storage
bffe0000-bffedfff : reserved
bfff0000-bfffffff : reserved
d0000000-dfffffff : PCI Bus 0000:02
  d0000000-dfffffff : 0000:02:00.0
e0000000-efffffff : PCI MMCONFIG 0
  e0000000-efffffff : pnp 00:0c
Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED]
---

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1198,8 +1198,36 @@ config CRASH_DUMP
 	  (CONFIG_RELOCATABLE=y).
 	  For more details see Documentation/kdump/kdump.txt
 
+config RESERVE_PHYSICAL_START
+	bool "Reserve all RAM below PHYSICAL_START (EXPERIMENTAL)"
+	depends on !RELOCATABLE && X86_64
+	help
+	  This makes the kernel use only RAM above __PHYSICAL_START.
+	  All memory below __PHYSICAL_START will be left unused and
+	  marked as reserved RAM in /proc/iomem. The few special
+	  pages that can't be relocated at addresses above
+	  __PHYSICAL_START and that can't be guaranteed to be unused
+	  by the running kernel will be marked reserved RAM failed
+	  in /proc/iomem. Those may or may be not used by the kernel
+	  (for example SMP trampoline pages would only be used if
+	  CPU hotplug is enabled).
+
+	  The reserved RAM can be mapped by virtualization software
+	  with /dev/mem to create a 1:1 mapping between guest physical
+	  (bus) address and host physical (bus) address. This will
+	  allow PCI passthrough with DMA for the guest using the RAM
+	  with the 1:1 mapping. The only detail to take care of is the
+	  RAM marked reserved RAM failed. The virtualization
+	  software must create for the guest an e820 map that only
+	  includes the reserved RAM regions but if the guest touches
+	  memory with guest physical address in the reserved RAM
+	  failed ranges (Linux guest will do that even if the RAM
+	  isn't present in the e820 map), it should provide that as
+	  RAM and map it with a non-linear mapping. This should allow
+	  any Linux kernel to run fine and hopefully any other OS too.
+
 config PHYSICAL_START
-	hex "Physical address where the kernel is loaded" if (EMBEDDED || CRASH_DUMP)
+	hex "Physical address where the kernel is loaded" if (EMBEDDED || CRASH_DUMP || RESERVE_PHYSICAL_START)
 	default "0x1000000" if X86_NUMAQ
 	default "0x200000" if X86_64
 	default "0x100000"
diff --git a/arch/x86/kernel/e820_64.c b/arch/x86/kernel/e820_64.c
--- a/arch/x86/kernel/e820_64.c
+++ b/arch/x86/kernel/e820_64.c
@@ -119,7 +119,31 @@ void __init early_res_to_bootmem(unsigne
 		printk(KERN_INFO "  early res: %d [%lx-%lx] %s\n", i,
 			final_start, final_end - 1, r->name);
 		reserve_bootmem_generic(final_start, final_end - final_start);
+#ifdef CONFIG_RESERVE_PHYSICAL_START
+		if (r->start < __PHYSICAL_START)
+			add_memory_region(r->start, r->end - r->start,
+  

[PATCH] reserved-ram kvm-userland patch

2008-06-24 Thread Andrea Arcangeli
This is the kvm-userland patch to use after applying the reserved-ram
patch to the host kernel. The BIOS must be rebuilt after applying the
patch; to do that, just run 'make bios'.

Then it's enough to pass '-reserved-ram' on the command line.

 4997 ?        Sl     2:56   3515  1544 4677235 1697028 47.9 /home/andrea/bin/x86_64/kvm/bin/qemu-system-x86_64 -hda tmp/virt
 5002 ?        Sl     3:23   4728  1544 4677235 1600980 45.2 /home/andrea/bin/x86_64/kvm/bin/qemu-system-x86_64 -hda tmp/virt
 5008 ?        Sl     2:39    239  1544  892127   15496  0.4 /home/andrea/bin/x86_64/kvm/bin/qemu-system-x86_64 -hda tmp/virtual

             total       used       free     shared    buffers     cached
Mem:       3540492    3525108      15384          0       1892      51896
-/+ buffers/cache:    3471320      69172
Swap:      5863684    3014072    2849612


eth0: no IPv6 routers present
loaded kvm module (kvm-70-399-g275f337)
apic write: bad size=1 fee00030
Ignoring de-assert INIT to vcpu 0
apic write: bad size=1 fee00030
Ignoring de-assert INIT to vcpu 0
Ignoring de-assert INIT to vcpu 0
Ignoring de-assert INIT to vcpu 0
kvm: emulating exchange as write
apic write: bad size=1 fee00030
Ignoring de-assert INIT to vcpu 0
Ignoring de-assert INIT to vcpu 0

You can see above 3 KVM guests, the last one with -reserved-ram -m 512 and
the first two with -m 3000. The host kernel has both mmu-notifier v18 and
the -reserved-ram patch applied. The KVM kernel has the pfn-mmio patch
applied plus my fix to export the reserved RAM through vma->fault, and the
kvm mmu notifier support for reliable and efficient swapping. All 3 guests
seem to work great together while the system is 3G into swap. The
reserved-ram guest is, of course, almost as responsive as if there were no
swap (only the userland bits need to be paged in; all the virtual RAM
remains in RAM).

You can also see the RSS of the -reserved-ram task is only 15M which
is about the footprint of kvm userland (part of which are shared libs,
so it's actually much less).
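
The patch below passes the end of low memory from qemu to the BIOS through CMOS registers 0x15 (low byte) and 0x16 (high byte), which hold the base memory size in KB. A minimal sketch of that round trip, using a mock CMOS array and hypothetical helper names (cmos_write, cmos_read, store_base_mem_kb, load_below_640_end), might look like this:

```c
#include <assert.h>
#include <stdint.h>

static uint8_t cmos[128];	/* mock CMOS/RTC NVRAM, for illustration only */

static void cmos_write(uint8_t idx, uint32_t val)
{
	assert(idx < sizeof(cmos));
	cmos[idx] = (uint8_t)val;	/* only the low 8 bits are stored */
}

static uint8_t cmos_read(uint8_t idx)
{
	assert(idx < sizeof(cmos));
	return cmos[idx];
}

/* Emulator side: store the base memory size in KB as two bytes. */
static void store_base_mem_kb(uint32_t kb)
{
	assert(kb <= 0xffff);
	cmos_write(0x15, kb & 0xff);
	cmos_write(0x16, kb >> 8);
}

/* BIOS side: rebuild the byte address where low RAM ends. */
static uint32_t load_below_640_end(void)
{
	uint32_t end = cmos_read(0x16);

	end <<= 8;
	end |= cmos_read(0x15);
	end *= 1024;
	return end;
}
```

Storing 639 yields 0x9fc00, the boundary that used to be hardcoded in the e820 code; with -reserved-ram the stored value instead comes from the reserved range, so the first e820 entry ends where the usable low RAM ends.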

Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED]

diff --git a/bios/rombios.c b/bios/rombios.c
index 318de57..f93a6c6 100644
--- a/bios/rombios.c
+++ b/bios/rombios.c
@@ -4251,6 +4251,7 @@ int15_function32(regs, ES, DS, FLAGS)
   Bit32u  extra_lowbits_memory_size=0;
   Bit16u  CX,DX;
   Bit8u   extra_highbits_memory_size=0;
+  Bit32u  below_640_end;
 
 BX_DEBUG_INT15("int15 AX=%04x\n",regs.u.r16.ax);
 
@@ -4305,6 +4306,11 @@ ASM_END
  case 0x20: // coded by osmaker aka K.J.
 if(regs.u.r32.edx == 0x534D4150)
 {
+below_640_end = inb_cmos(0x16);
+below_640_end <<= 8;
+below_640_end |= inb_cmos(0x15);
+below_640_end *= 1024;
+
 extended_memory_size = inb_cmos(0x35);
 extended_memory_size <<= 8;
 extended_memory_size |= inb_cmos(0x34);
@@ -4334,7 +4340,7 @@ ASM_END
 {
 case 0:
 set_e820_range(ES, regs.u.r16.di,
-   0x0000000L, 0x0009fc00L, 0, 0, 1);
+   0x0000000L, below_640_end, 0, 0, 1);
 regs.u.r32.ebx = 1;
 regs.u.r32.eax = 0x534D4150;
 regs.u.r32.ecx = 0x14;
@@ -4343,7 +4349,7 @@ ASM_END
 break;
 case 1:
 set_e820_range(ES, regs.u.r16.di,
-   0x0009fc00L, 0x000a0000L, 0, 0, 2);
+   below_640_end, 0x000a0000L, 0, 0, 2);
 regs.u.r32.ebx = 2;
 regs.u.r32.eax = 0x534D4150;
 regs.u.r32.ecx = 0x14;
diff --git a/qemu/hw/pc.c b/qemu/hw/pc.c
index 42c2687..c6a21d5 100644
--- a/qemu/hw/pc.c
+++ b/qemu/hw/pc.c
@@ -235,6 +235,8 @@ static void cmos_init(ram_addr_t ram_size, ram_addr_t above_4g_mem_size,
 
     /* memory size */
     val = 640; /* base memory in K */
+    if (reserved_ram)
+	val = reserved[1] / 1024;
     rtc_set_memory(s, 0x15, val);
     rtc_set_memory(s, 0x16, val >> 8);
 
diff --git a/qemu/pc-bios/bios.bin b/qemu/pc-bios/bios.bin
index 3e5d96a..c9c94e6 100644
Binary files a/qemu/pc-bios/bios.bin and b/qemu/pc-bios/bios.bin differ
diff --git a/qemu/sysemu.h b/qemu/sysemu.h
index 97d73e9..964fee4 100644
--- a/qemu/sysemu.h
+++ b/qemu/sysemu.h
@@ -102,6 +102,8 @@ extern int autostart;
 extern int old_param;
 extern int hpagesize;
 extern const char *bootp_filename;
+extern int reserved_ram;
+extern int64_t reserved[4];
 
 
 #ifdef USE_KQEMU
diff --git a/qemu/vl.c b/qemu/vl.c
index f573dce..3ce2f2a 100644
--- a/qemu/vl.c
+++ b/qemu/vl.c
@@ -235,6 +235,8 @@ int time_drift_fix = 0;
 unsigned int kvm_shadow_memory = 0;
 const char *mem_path = NULL;
 int hpagesize = 0;
+int reserved_ram = 0;
+int64_t reserved[4];
 const char *cpu_vendor_string;
 #ifdef TARGET_ARM
 int old_param = 0;
@@ 

KVM: move slots_lock acquisition down to vapic_exit (resend)

2008-06-24 Thread Marcelo Tosatti

There is no need to grab slots_lock if the vapic_page will not
be touched.
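
The pattern is simply "check cheaply first, lock only if there is work". A userspace sketch with a mock lock that counts acquisitions shows the effect; struct mock_rwsem, struct mock_apic and vapic_exit_sketch are illustrative names, not kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Mock read-lock that records how often it is taken. */
struct mock_rwsem {
	int readers;
	int acquisitions;
};

static void mock_down_read(struct mock_rwsem *s)
{
	s->readers++;
	s->acquisitions++;
}

static void mock_up_read(struct mock_rwsem *s)
{
	assert(s->readers > 0);
	s->readers--;
}

struct mock_apic {
	unsigned long vapic_addr;	/* 0 means no vapic page in use */
	int page_dirty;
};

/* Early-return path never touches the lock; only real work takes it. */
static void vapic_exit_sketch(struct mock_rwsem *slots_lock,
			      struct mock_apic *apic)
{
	if (!apic || !apic->vapic_addr)
		return;

	mock_down_read(slots_lock);
	apic->page_dirty = 1;		/* stands in for the page bookkeeping */
	mock_up_read(slots_lock);
}
```

With the lock taken by the caller instead, every exit would pay for the acquisition even when the early return fires.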

Signed-off-by: Marcelo Tosatti [EMAIL PROTECTED]

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 26b051b..29e8983 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2759,8 +2759,10 @@ static void vapic_exit(struct kvm_vcpu *vcpu)
 	if (!apic || !apic->vapic_addr)
 		return;
 
+	down_read(&vcpu->kvm->slots_lock);
 	kvm_release_page_dirty(apic->vapic_page);
 	mark_page_dirty(vcpu->kvm, apic->vapic_addr >> PAGE_SHIFT);
+	up_read(&vcpu->kvm->slots_lock);
 }
 
 static int __vcpu_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
@@ -2916,9 +2918,7 @@ out:
 
 	post_kvm_run_save(vcpu, kvm_run);
 
-	down_read(&vcpu->kvm->slots_lock);
 	vapic_exit(vcpu);
-	up_read(&vcpu->kvm->slots_lock);
 
 	return r;
 }