How to use code to create a new GuestVM

2012-01-11 Thread 吴锐
Hi everyone,
  I am new to KVM.
  I want to write a module that creates a guest VM on demand.
Which functions should I look into?
  And which structs correspond to a guest VM and to its shadow page table?
  Thanks for your help.
R
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/3] net: use this_cpu_xxx replace percpu_xxx funcs

2012-01-11 Thread Alex,Shi
  percpu_xxx funcs are duplicated with this_cpu_xxx funcs, so replace them
  for further code clean up.
 
  And in preempt safe scenario, __this_cpu_xxx funcs has a bit better
  performance since __this_cpu_xxx has no redundant preempt_disable()
 
  Signed-off-by: Alex Shi alex@intel.com
  ---
   net/netfilter/xt_TEE.c |   12 ++--
   net/socket.c   |4 ++--
   2 files changed, 8 insertions(+), 8 deletions(-)
 
  Acked-by: Eric Dumazet eric.duma...@gmail.com
 
  Thanks !
 
  Anyone like to pick up this patch? or more comments for this? 
  
  Kaber, David: 
  I appreciate for your any comments on this. Could you like do me a
  favor? 
 
 No objections from me.

Resending this patch for the 3.2.0 kernel with Eric's Ack.

David, do you have any concerns about this patch? I would really appreciate
it if it could make the 3.3 merge window.

-
From 037bd159fdf52b915e452fac8db2252b1c60297e Mon Sep 17 00:00:00 2001
From: Alex Shi alex@intel.com
Date: Thu, 20 Oct 2011 14:52:17 +0800
Subject: [PATCH 1/3] net: use this_cpu_xxx replace percpu_xxx funcs

The percpu_xxx funcs duplicate the this_cpu_xxx funcs, so replace them
in preparation for further code cleanup.

In preempt-safe scenarios, the __this_cpu_xxx funcs perform slightly better,
since __this_cpu_xxx has no redundant preempt_disable().

Signed-off-by: Alex Shi alex@intel.com
Acked-by: Eric Dumazet eric.duma...@gmail.com
---
 net/netfilter/xt_TEE.c |   12 ++--
 net/socket.c   |4 ++--
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/net/netfilter/xt_TEE.c b/net/netfilter/xt_TEE.c
index 5f054a0..678084c 100644
--- a/net/netfilter/xt_TEE.c
+++ b/net/netfilter/xt_TEE.c
@@ -90,7 +90,7 @@ tee_tg4(struct sk_buff *skb, const struct xt_action_param *par)
const struct xt_tee_tginfo *info = par->targinfo;
struct iphdr *iph;
 
-   if (percpu_read(tee_active))
+   if (__this_cpu_read(tee_active))
return XT_CONTINUE;
/*
 * Copy the skb, and route the copy. Will later return %XT_CONTINUE for
@@ -127,9 +127,9 @@ tee_tg4(struct sk_buff *skb, const struct xt_action_param *par)
ip_send_check(iph);
 
if (tee_tg_route4(skb, info)) {
-   percpu_write(tee_active, true);
+   __this_cpu_write(tee_active, true);
ip_local_out(skb);
-   percpu_write(tee_active, false);
+   __this_cpu_write(tee_active, false);
} else {
kfree_skb(skb);
}
@@ -170,7 +170,7 @@ tee_tg6(struct sk_buff *skb, const struct xt_action_param *par)
 {
const struct xt_tee_tginfo *info = par->targinfo;
 
-   if (percpu_read(tee_active))
+   if (__this_cpu_read(tee_active))
return XT_CONTINUE;
skb = pskb_copy(skb, GFP_ATOMIC);
if (skb == NULL)
@@ -188,9 +188,9 @@ tee_tg6(struct sk_buff *skb, const struct xt_action_param *par)
--iph->hop_limit;
}
if (tee_tg_route6(skb, info)) {
-   percpu_write(tee_active, true);
+   __this_cpu_write(tee_active, true);
ip6_local_out(skb);
-   percpu_write(tee_active, false);
+   __this_cpu_write(tee_active, false);
} else {
kfree_skb(skb);
}
diff --git a/net/socket.c b/net/socket.c
index ffe92ca..4b62ca9 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -479,7 +479,7 @@ static struct socket *sock_alloc(void)
inode->i_uid = current_fsuid();
inode->i_gid = current_fsgid();
 
-   percpu_add(sockets_in_use, 1);
+   this_cpu_add(sockets_in_use, 1);
return sock;
 }
 
@@ -522,7 +522,7 @@ void sock_release(struct socket *sock)
if (rcu_dereference_protected(sock->wq, 1)->fasync_list)
printk(KERN_ERR "sock_release: fasync list not empty!\n");
 
-   percpu_sub(sockets_in_use, 1);
+   this_cpu_sub(sockets_in_use, 1);
if (!sock->file) {
iput(SOCK_INODE(sock));
return;
-- 
1.6.3.3

 



Re: [PATCH 2/3] kvm: use this_cpu_xxx replace percpu_xxx funcs

2012-01-11 Thread Alex,Shi
   
   Acked-by: Avi Kivity a...@redhat.com
   
 
  And this one, picking up or comments are all appreciated. :) 
 
 Just to be clear, you want this applied in kvm.git?
 

Thanks Avi!
I saw that it is in your 3.3 submit list.



Re: [PATCH 1/3] net: use this_cpu_xxx replace percpu_xxx funcs

2012-01-11 Thread David Miller
From: Alex,Shi alex@intel.com
Date: Wed, 11 Jan 2012 16:45:33 +0800

  percpu_xxx funcs are duplicated with this_cpu_xxx funcs, so replace them
  for further code clean up.
 
  And in preempt safe scenario, __this_cpu_xxx funcs has a bit better
  performance since __this_cpu_xxx has no redundant preempt_disable()
 
  Signed-off-by: Alex Shi alex@intel.com
  ---
   net/netfilter/xt_TEE.c |   12 ++--
   net/socket.c   |4 ++--
   2 files changed, 8 insertions(+), 8 deletions(-)
 
  Acked-by: Eric Dumazet eric.duma...@gmail.com
 
  Thanks !
 
  Anyone like to pick up this patch? or more comments for this? 
  
  Kaber, David: 
  I appreciate for your any comments on this. Could you like do me a
  favor? 
 
 No objections from me.
 
 Resending this patch for the 3.2.0 kernel with Eric's Ack.
 
 David, do you have any concerns about this patch? I would really appreciate
 it if it could make the 3.3 merge window.

Please just submit it directly with the other this_cpu() patches:

Acked-by: David S. Miller da...@davemloft.net


Re: [PATCH 3/3] Code clean up for percpu_xxx() functions

2012-01-11 Thread Alex,Shi
On Mon, 2011-11-21 at 17:06 -0700, t...@kernel.org wrote:
 (cc'ing hpa and quoting whole body)
  
  Signed-off-by: Alex Shi alex@intel.com
  Acked-by: Christoph Lameter c...@gentwo.org
 
  Acked-by: Tejun Heo t...@kernel.org
 
 hpa, I suppose this should go through x86?  The original patch can be
 accessed at
 
   http://article.gmane.org/gmane.linux.kernel/1218055/raw

Resending for the 3.2 kernel; no changes are needed to apply it on Linus'
latest tree. :)

This cleanup has no negative performance or security impact on the
kernel. On the contrary, removing some potentially redundant
preempt_disable() calls brings a slight performance benefit.

This 3rd patch depends on the previous 2 patches. The 2nd one (the kvm code
cleanup) was submitted for the 3.3 kernel, but the 1st one (the net code
cleanup) is waiting for David's comments.


--
From 0dce61dc88b8ed2687b4d5c0633aa54d1f66fdc0 Mon Sep 17 00:00:00 2001
From: Alex Shi alex@intel.com
Date: Tue, 22 Nov 2011 00:05:37 +0800
Subject: [PATCH 3/3] Code clean up for percpu_xxx() functions

Since the percpu_xxx() family of functions duplicates this_cpu_xxx(),
remove the percpu_xxx() definitions and replace their uses with
this_cpu_xxx().

Furthermore, as Christoph Lameter requested, I use __this_cpu_xxx instead
of this_cpu_xxx wherever the context is preempt safe.
The preempt-safe scenarios include:
1, in an irq/softirq/nmi handler
2, protected by preempt_disable
3, protected by spin_lock
4, the code context implies that it is preempt safe, e.g. the code
follows or is followed by preempt-safe code.

I left the xen code unchanged, since I am not familiar with it.

BTW, on x86 this_cpu_xxx is in fact the same as __this_cpu_xxx, since all
of these funcs are implemented as a single instruction. They may differ
on other platforms, so making this distinction helps performance on
those platforms.

Signed-off-by: Alex Shi alex@intel.com
Acked-by: Christoph Lameter c...@gentwo.org
Acked-by: Tejun Heo t...@kernel.org
---
 arch/x86/include/asm/current.h|2 +-
 arch/x86/include/asm/hardirq.h|9 +++--
 arch/x86/include/asm/irq_regs.h   |4 +-
 arch/x86/include/asm/mmu_context.h|   12 
 arch/x86/include/asm/percpu.h |   24 ++-
 arch/x86/include/asm/smp.h|4 +-
 arch/x86/include/asm/stackprotector.h |4 +-
 arch/x86/include/asm/thread_info.h|2 +-
 arch/x86/include/asm/tlbflush.h   |4 +-
 arch/x86/kernel/cpu/common.c  |2 +-
 arch/x86/kernel/cpu/mcheck/mce.c  |4 +-
 arch/x86/kernel/paravirt.c|   12 
 arch/x86/kernel/process_32.c  |2 +-
 arch/x86/kernel/process_64.c  |   12 
 arch/x86/mm/tlb.c |   10 +++---
 arch/x86/xen/enlighten.c  |6 ++--
 arch/x86/xen/irq.c|8 ++--
 arch/x86/xen/mmu.c|   20 ++--
 arch/x86/xen/multicalls.h |2 +-
 arch/x86/xen/smp.c|2 +-
 include/linux/percpu.h|   53 -
 include/linux/topology.h  |4 +-
 22 files changed, 73 insertions(+), 129 deletions(-)

diff --git a/arch/x86/include/asm/current.h b/arch/x86/include/asm/current.h
index 4d447b7..9476c04 100644
--- a/arch/x86/include/asm/current.h
+++ b/arch/x86/include/asm/current.h
@@ -11,7 +11,7 @@ DECLARE_PER_CPU(struct task_struct *, current_task);
 
 static __always_inline struct task_struct *get_current(void)
 {
-   return percpu_read_stable(current_task);
+   return this_cpu_read_stable(current_task);
 }
 
 #define current get_current()
diff --git a/arch/x86/include/asm/hardirq.h b/arch/x86/include/asm/hardirq.h
index 55e4de6..2890444 100644
--- a/arch/x86/include/asm/hardirq.h
+++ b/arch/x86/include/asm/hardirq.h
@@ -35,14 +35,15 @@ DECLARE_PER_CPU_SHARED_ALIGNED(irq_cpustat_t, irq_stat);
 
 #define __ARCH_IRQ_STAT
 
-#define inc_irq_stat(member)   percpu_inc(irq_stat.member)
+#define inc_irq_stat(member)   __this_cpu_inc(irq_stat.member)
 
-#define local_softirq_pending() percpu_read(irq_stat.__softirq_pending)
+#define local_softirq_pending() __this_cpu_read(irq_stat.__softirq_pending)
 
 #define __ARCH_SET_SOFTIRQ_PENDING
 
-#define set_softirq_pending(x) percpu_write(irq_stat.__softirq_pending, (x))
-#define or_softirq_pending(x)  percpu_or(irq_stat.__softirq_pending, (x))
+#define set_softirq_pending(x) \
+   __this_cpu_write(irq_stat.__softirq_pending, (x))
+#define or_softirq_pending(x)  __this_cpu_or(irq_stat.__softirq_pending, (x))
 
 extern void ack_bad_irq(unsigned int irq);
 
diff --git a/arch/x86/include/asm/irq_regs.h b/arch/x86/include/asm/irq_regs.h
index 7784322..15639ed 100644
--- a/arch/x86/include/asm/irq_regs.h
+++ b/arch/x86/include/asm/irq_regs.h
@@ -15,7 +15,7 @@ DECLARE_PER_CPU(struct pt_regs *, irq_regs);
 
 static inline struct pt_regs *get_irq_regs(void)
 {
-   return 

Re: [PATCH 1/3] net: use this_cpu_xxx replace percpu_xxx funcs

2012-01-11 Thread Alex,Shi
On Wed, 2012-01-11 at 01:03 -0800, David Miller wrote:
 From: Alex,Shi alex@intel.com
 Date: Wed, 11 Jan 2012 16:45:33 +0800
 
   percpu_xxx funcs are duplicated with this_cpu_xxx funcs, so replace 
   them
   for further code clean up.
  
   And in preempt safe scenario, __this_cpu_xxx funcs has a bit better
   performance since __this_cpu_xxx has no redundant preempt_disable()
  
   Signed-off-by: Alex Shi alex@intel.com
   ---
net/netfilter/xt_TEE.c |   12 ++--
net/socket.c   |4 ++--
2 files changed, 8 insertions(+), 8 deletions(-)
  
   Acked-by: Eric Dumazet eric.duma...@gmail.com
  
   Thanks !
  
   Anyone like to pick up this patch? or more comments for this? 
   
   Kaber, David: 
   I appreciate for your any comments on this. Could you like do me a
   favor? 
  
  No objections from me.
  
  Resending this patch for the 3.2.0 kernel with Eric's Ack.
  
  David, do you have any concerns about this patch? I would really appreciate
  it if it could make the 3.3 merge window.
 
 Please just submit it directly with the other this_cpu() patches:
 
 Acked-by: David S. Miller da...@davemloft.net

Thanks a lot! :) 




Re: [PATCH] KVM: Allow host IRQ sharing for assigned PCI 2.3 devices

2012-01-11 Thread Michael S. Tsirkin
On Tue, Jan 10, 2012 at 04:41:50PM -0700, Alex Williamson wrote:
  The guest driver will never see such an interrupt as we will notice on
  its arrival that there is some mask pending.
 
 Right, I was thinking more about the effect at the hardware level.

In theory a broken device might assume that intx disable
bit is correlated with internal device registers somehow.
However, the current sharing approach won't work for such
a device anyway as host controls the status bit while
guest controls the rest of the device. So I think we don't care.

-- 
MST


kvm-s390: prep cleanup for sync registers patch series

2012-01-11 Thread Christian Borntraeger
Avi, Marcelo,

here is a patch that reworks the setting of the prefix register.
It is a prereq for the prefix patch in the following patch series
about the sync registers in kvm_run.



[PATCH] kvm-s390: rework code that sets the prefix

2012-01-11 Thread Christian Borntraeger
From: Christian Borntraeger borntrae...@de.ibm.com

There are several places in the kvm module that set the prefix register.
Since we need to flush the cpu, let's combine this operation into a helper
function. This helper will also explicitly mask out the unused bits.

Signed-off-by: Christian Borntraeger borntrae...@de.ibm.com
---
 arch/s390/kvm/interrupt.c |3 +--
 arch/s390/kvm/kvm-s390.c  |3 +--
 arch/s390/kvm/kvm-s390.h  |7 +++
 arch/s390/kvm/priv.c  |3 +--
 4 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
index 278ee00..c6366cf 100644
--- a/arch/s390/kvm/interrupt.c
+++ b/arch/s390/kvm/interrupt.c
@@ -236,8 +236,7 @@ static void __do_deliver_interrupt(struct kvm_vcpu *vcpu,
VCPU_EVENT(vcpu, 4, "interrupt: set prefix to %x",
   inti->prefix.address);
vcpu->stat.deliver_prefix_signal++;
-   vcpu->arch.sie_block->prefix = inti->prefix.address;
-   vcpu->arch.sie_block->ihcpu = 0xffff;
+   kvm_s390_set_prefix(vcpu, inti->prefix.address);
break;
 
case KVM_S390_RESTART:
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index a33b444..1868b89 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -322,8 +322,7 @@ static void kvm_s390_vcpu_initial_reset(struct kvm_vcpu *vcpu)
/* this equals initial cpu reset in pop, but we don't switch to ESA */
vcpu->arch.sie_block->gpsw.mask = 0UL;
vcpu->arch.sie_block->gpsw.addr = 0UL;
-   vcpu->arch.sie_block->prefix = 0UL;
-   vcpu->arch.sie_block->ihcpu = 0xffff;
+   kvm_s390_set_prefix(vcpu, 0);
vcpu->arch.sie_block->cputm = 0UL;
vcpu->arch.sie_block->ckc   = 0UL;
vcpu->arch.sie_block->todpr = 0;
diff --git a/arch/s390/kvm/kvm-s390.h b/arch/s390/kvm/kvm-s390.h
index 62aa5f1..ff28f9d 100644
--- a/arch/s390/kvm/kvm-s390.h
+++ b/arch/s390/kvm/kvm-s390.h
@@ -58,6 +58,13 @@ static inline int kvm_is_ucontrol(struct kvm *kvm)
return 0;
 #endif
 }
+
+static inline void kvm_s390_set_prefix(struct kvm_vcpu *vcpu, u32 prefix)
+{
+   vcpu->arch.sie_block->prefix = prefix & 0x7fffe000u;
+   vcpu->arch.sie_block->ihcpu  = 0xffff;
+}
+
 int kvm_s390_handle_wait(struct kvm_vcpu *vcpu);
 enum hrtimer_restart kvm_s390_idle_wakeup(struct hrtimer *timer);
 void kvm_s390_tasklet(unsigned long parm);
diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c
index d026389..9c83b8a 100644
--- a/arch/s390/kvm/priv.c
+++ b/arch/s390/kvm/priv.c
@@ -56,8 +56,7 @@ static int handle_set_prefix(struct kvm_vcpu *vcpu)
goto out;
}
 
-   vcpu->arch.sie_block->prefix = address;
-   vcpu->arch.sie_block->ihcpu = 0xffff;
+   kvm_s390_set_prefix(vcpu, address);
 
VCPU_EVENT(vcpu, 5, "setting prefix to %x", address);
 out:
-- 
1.7.8.2
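
The masking done by the new helper can be tried out in userspace. The
following is only a minimal sketch, not the kernel code: struct
sie_block_model and model_set_prefix() are hypothetical stand-ins for the
SIE control block and kvm_s390_set_prefix(), the field widths are
assumptions, and the 0xffff value written to ihcpu is assumed as well.
Only the 0x7fffe000u mask is taken from the patch.

```c
#include <stdint.h>

/* Mask from the patch: keeps bits 13..30 of the prefix value. */
#define PREFIX_MASK 0x7fffe000u

/* Hypothetical, reduced model of the two SIE block fields involved. */
struct sie_block_model {
	uint32_t prefix;  /* guest prefix register */
	uint16_t ihcpu;   /* assumed 16-bit field, reset to force a flush */
};

/* Set the prefix, masking out the unused bits, and reset ihcpu. */
static void model_set_prefix(struct sie_block_model *sb, uint32_t prefix)
{
	sb->prefix = prefix & PREFIX_MASK;
	sb->ihcpu = 0xffff;  /* assumed value, see the note above */
}
```

With this model, passing 0xffffffff stores 0x7fffe000, and any value
below 0x2000 stores 0, because only bits 13 to 30 survive the mask.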



[PATCH 2/4] kvm-s390: provide the prefix register via kvm_run

2012-01-11 Thread Christian Borntraeger
Add the prefix register to the synced register field in kvm_run.
While we need the prefix register most of the time read-only, this
patch also adds handling for guest dirtying of the prefix register.

Signed-off-by: Christian Borntraeger borntrae...@de.ibm.com
---
 arch/s390/include/asm/kvm.h |2 ++
 arch/s390/kvm/kvm-s390.c|7 +++
 2 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/arch/s390/include/asm/kvm.h b/arch/s390/include/asm/kvm.h
index 325560a..9fc328c 100644
--- a/arch/s390/include/asm/kvm.h
+++ b/arch/s390/include/asm/kvm.h
@@ -41,7 +41,9 @@ struct kvm_debug_exit_arch {
 struct kvm_guest_debug_arch {
 };
 
+#define KVM_SYNC_PREFIX (1UL << 0)
 /* definition of registers in kvm_run */
 struct kvm_sync_regs {
+   __u64 prefix;   /* prefix register */
 };
 #endif
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 1868b89..6962c1b 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -132,6 +132,7 @@ int kvm_dev_ioctl_check_extension(long ext)
 #ifdef CONFIG_KVM_S390_UCONTROL
case KVM_CAP_S390_UCONTROL:
 #endif
+   case KVM_CAP_SYNC_REGS:
r = 1;
break;
default:
@@ -288,6 +289,7 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
}
 
vcpu->arch.gmap = vcpu->kvm->arch.gmap;
+   vcpu->run->kvm_valid_regs = KVM_SYNC_PREFIX;
return 0;
 }
 
@@ -572,6 +574,10 @@ rerun_vcpu:
 
vcpu->arch.sie_block->gpsw.mask = kvm_run->psw_mask;
vcpu->arch.sie_block->gpsw.addr = kvm_run->psw_addr;
+   if (kvm_run->kvm_dirty_regs & KVM_SYNC_PREFIX) {
+   kvm_run->kvm_dirty_regs &= ~KVM_SYNC_PREFIX;
+   kvm_s390_set_prefix(vcpu, kvm_run->s.regs.prefix);
+   }
 
might_fault();
 
@@ -620,6 +626,7 @@ rerun_vcpu:
 
kvm_run->psw_mask = vcpu->arch.sie_block->gpsw.mask;
kvm_run->psw_addr = vcpu->arch.sie_block->gpsw.addr;
+   kvm_run->s.regs.prefix = vcpu->arch.sie_block->prefix;
 
if (vcpu->sigset_active)
sigprocmask(SIG_SETMASK, &sigsaved, NULL);
-- 
1.7.8.2
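
The valid/dirty handshake described in the changelog can be modelled in
userspace. This is only a sketch under stated assumptions: run_model,
vcpu_model, sync_in() and sync_out() are hypothetical names, and the
structures are reduced to the fields that matter here. The flag bit and
the order of operations (clear the dirty bit, apply the masked value
before guest entry, copy the value back after guest exit) follow the
patch.

```c
#include <stdint.h>

/* Mirrors KVM_SYNC_PREFIX; UINT64_C keeps the mask 64 bits wide. */
#define SYNC_PREFIX (UINT64_C(1) << 0)

/* Hypothetical, reduced model of the kvm_run fields used by the patch. */
struct run_model {
	uint64_t valid_regs;  /* registers the kernel keeps in sync */
	uint64_t dirty_regs;  /* registers modified by userspace */
	uint64_t prefix;      /* synced copy of the prefix register */
};

/* Hypothetical stand-in for the vcpu; hw_prefix models sie_block->prefix. */
struct vcpu_model {
	uint32_t hw_prefix;
};

/* Before guest entry: consume a userspace change if it is flagged dirty. */
static void sync_in(struct vcpu_model *vcpu, struct run_model *run)
{
	if (run->dirty_regs & SYNC_PREFIX) {
		run->dirty_regs &= ~SYNC_PREFIX;
		vcpu->hw_prefix = (uint32_t)run->prefix & 0x7fffe000u;
	}
}

/* After guest exit: publish the current value to userspace. */
static void sync_out(const struct vcpu_model *vcpu, struct run_model *run)
{
	run->prefix = vcpu->hw_prefix;
}
```

In this model a value written to run->prefix without setting the dirty
bit is ignored on entry and overwritten on the next exit, which is why
the read-only case needs no extra work from userspace.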



[PATCH 4/4] kvm-s390: provide access guest registers via kvm_run

2012-01-11 Thread Christian Borntraeger
This patch adds the access registers to the kvm_run structure.

Signed-off-by: Christian Borntraeger borntrae...@de.ibm.com
---
 arch/s390/include/asm/kvm.h  |2 ++
 arch/s390/include/asm/kvm_host.h |1 -
 arch/s390/kvm/kvm-s390.c |   16 +---
 3 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/arch/s390/include/asm/kvm.h b/arch/s390/include/asm/kvm.h
index 420dbb7..9acbde4 100644
--- a/arch/s390/include/asm/kvm.h
+++ b/arch/s390/include/asm/kvm.h
@@ -43,9 +43,11 @@ struct kvm_guest_debug_arch {
 
 #define KVM_SYNC_PREFIX (1UL << 0)
 #define KVM_SYNC_GPRS   (1UL << 1)
+#define KVM_SYNC_ACRS   (1UL << 2)
 /* definition of registers in kvm_run */
 struct kvm_sync_regs {
__u64 prefix;   /* prefix register */
__u64 gprs[16]; /* general purpose registers */
+   __u32 acrs[16]; /* access registers */
 };
 #endif
diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index ed843ca..e630426 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -231,7 +231,6 @@ struct kvm_vcpu_arch {
s390_fp_regs  host_fpregs;
unsigned int  host_acrs[NUM_ACRS];
s390_fp_regs  guest_fpregs;
-   unsigned int  guest_acrs[NUM_ACRS];
struct kvm_s390_local_interrupt local_int;
struct hrtimerckc_timer;
struct tasklet_struct tasklet;
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 80b12ba..0b91679 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -289,7 +289,9 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
}
 
vcpu->arch.gmap = vcpu->kvm->arch.gmap;
-   vcpu->run->kvm_valid_regs = KVM_SYNC_PREFIX | KVM_SYNC_GPRS;
+   vcpu->run->kvm_valid_regs = KVM_SYNC_PREFIX |
+   KVM_SYNC_GPRS |
+   KVM_SYNC_ACRS;
return 0;
 }
 
@@ -304,7 +306,7 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
save_access_regs(vcpu->arch.host_acrs);
vcpu->arch.guest_fpregs.fpc &= FPC_VALID_MASK;
restore_fp_regs(&vcpu->arch.guest_fpregs);
-   restore_access_regs(vcpu->arch.guest_acrs);
+   restore_access_regs(vcpu->run->s.regs.acrs);
gmap_enable(vcpu->arch.gmap);
atomic_set_mask(CPUSTAT_RUNNING, &vcpu->arch.sie_block->cpuflags);
 }
@@ -314,7 +316,7 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
atomic_clear_mask(CPUSTAT_RUNNING, &vcpu->arch.sie_block->cpuflags);
gmap_disable(vcpu->arch.gmap);
save_fp_regs(&vcpu->arch.guest_fpregs);
-   save_access_regs(vcpu->arch.guest_acrs);
+   save_access_regs(vcpu->run->s.regs.acrs);
restore_fp_regs(&vcpu->arch.host_fpregs);
restore_access_regs(vcpu->arch.host_acrs);
 }
@@ -441,16 +443,16 @@ int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
 int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
  struct kvm_sregs *sregs)
 {
-   memcpy(&vcpu->arch.guest_acrs, &sregs->acrs, sizeof(sregs->acrs));
+   memcpy(&vcpu->run->s.regs.acrs, &sregs->acrs, sizeof(sregs->acrs));
memcpy(&vcpu->arch.sie_block->gcr, &sregs->crs, sizeof(sregs->crs));
-   restore_access_regs(vcpu->arch.guest_acrs);
+   restore_access_regs(vcpu->run->s.regs.acrs);
return 0;
 }
 
 int kvm_arch_vcpu_ioctl_get_sregs(struct kvm_vcpu *vcpu,
  struct kvm_sregs *sregs)
 {
-   memcpy(&sregs->acrs, &vcpu->arch.guest_acrs, sizeof(sregs->acrs));
+   memcpy(&sregs->acrs, &vcpu->run->s.regs.acrs, sizeof(sregs->acrs));
memcpy(&sregs->crs, &vcpu->arch.sie_block->gcr, sizeof(sregs->crs));
return 0;
 }
@@ -702,7 +704,7 @@ int kvm_s390_vcpu_store_status(struct kvm_vcpu *vcpu, unsigned long addr)
return -EFAULT;
 
if (__guestcopy(vcpu, addr + offsetof(struct save_area, acc_regs),
-   &vcpu->arch.guest_acrs, 64, prefix))
+   &vcpu->run->s.regs.acrs, 64, prefix))
return -EFAULT;
 
if (__guestcopy(vcpu,
-- 
1.7.8.2



kvm: provide synchronous registers in kvm_run

2012-01-11 Thread Christian Borntraeger
Avi, Marcelo,

here is the next version of the sync register patch series.



[PATCH 3/4] kvm-s390: provide general purpose guest registers via kvm_run

2012-01-11 Thread Christian Borntraeger
This patch adds the general purpose registers to the kvm_run structure.

Signed-off-by: Christian Borntraeger borntrae...@de.ibm.com
---
 arch/s390/include/asm/kvm.h  |2 ++
 arch/s390/include/asm/kvm_host.h |3 +--
 arch/s390/kvm/diag.c |6 +++---
 arch/s390/kvm/intercept.c|4 ++--
 arch/s390/kvm/kvm-s390.c |   14 +++---
 arch/s390/kvm/priv.c |   24 
 arch/s390/kvm/sigp.c |   20 ++--
 7 files changed, 37 insertions(+), 36 deletions(-)

diff --git a/arch/s390/include/asm/kvm.h b/arch/s390/include/asm/kvm.h
index 9fc328c..420dbb7 100644
--- a/arch/s390/include/asm/kvm.h
+++ b/arch/s390/include/asm/kvm.h
@@ -42,8 +42,10 @@ struct kvm_guest_debug_arch {
 };
 
 #define KVM_SYNC_PREFIX (1UL << 0)
+#define KVM_SYNC_GPRS   (1UL << 1)
 /* definition of registers in kvm_run */
 struct kvm_sync_regs {
__u64 prefix;   /* prefix register */
+   __u64 gprs[16]; /* general purpose registers */
 };
 #endif
diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index e34fb2b..ed843ca 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -228,7 +228,6 @@ struct kvm_s390_float_interrupt {
 
 struct kvm_vcpu_arch {
struct kvm_s390_sie_block *sie_block;
-   unsigned long guest_gprs[16];
s390_fp_regs  host_fpregs;
unsigned int  host_acrs[NUM_ACRS];
s390_fp_regs  guest_fpregs;
@@ -254,5 +253,5 @@ struct kvm_arch{
struct gmap *gmap;
 };
 
-extern int sie64a(struct kvm_s390_sie_block *, unsigned long *);
+extern int sie64a(struct kvm_s390_sie_block *, u64 *);
 #endif
diff --git a/arch/s390/kvm/diag.c b/arch/s390/kvm/diag.c
index 8943e82..a353f0e 100644
--- a/arch/s390/kvm/diag.c
+++ b/arch/s390/kvm/diag.c
@@ -20,8 +20,8 @@ static int diag_release_pages(struct kvm_vcpu *vcpu)
unsigned long start, end;
unsigned long prefix  = vcpu->arch.sie_block->prefix;
 
-   start = vcpu->arch.guest_gprs[(vcpu->arch.sie_block->ipa & 0xf0) >> 4];
-   end = vcpu->arch.guest_gprs[vcpu->arch.sie_block->ipa & 0xf] + 4096;
+   start = vcpu->run->s.regs.gprs[(vcpu->arch.sie_block->ipa & 0xf0) >> 4];
+   end = vcpu->run->s.regs.gprs[vcpu->arch.sie_block->ipa & 0xf] + 4096;
 
if (start & ~PAGE_MASK || end & ~PAGE_MASK || start > end
|| start < 2 * PAGE_SIZE)
@@ -56,7 +56,7 @@ static int __diag_time_slice_end(struct kvm_vcpu *vcpu)
 static int __diag_ipl_functions(struct kvm_vcpu *vcpu)
 {
unsigned int reg = vcpu->arch.sie_block->ipa & 0xf;
-   unsigned long subcode = vcpu->arch.guest_gprs[reg] & 0xffff;
+   unsigned long subcode = vcpu->run->s.regs.gprs[reg] & 0xffff;
 
VCPU_EVENT(vcpu, 5, "diag ipl functions, subcode %lx", subcode);
switch (subcode) {
diff --git a/arch/s390/kvm/intercept.c b/arch/s390/kvm/intercept.c
index 0243454..776ef83 100644
--- a/arch/s390/kvm/intercept.c
+++ b/arch/s390/kvm/intercept.c
@@ -36,7 +36,7 @@ static int handle_lctlg(struct kvm_vcpu *vcpu)
 
useraddr = disp2;
if (base2)
-   useraddr += vcpu->arch.guest_gprs[base2];
+   useraddr += vcpu->run->s.regs.gprs[base2];
 
if (useraddr & 7)
return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
@@ -75,7 +75,7 @@ static int handle_lctl(struct kvm_vcpu *vcpu)
 
useraddr = disp2;
if (base2)
-   useraddr += vcpu->arch.guest_gprs[base2];
+   useraddr += vcpu->run->s.regs.gprs[base2];
 
if (useraddr & 3)
return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 6962c1b..80b12ba 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -289,7 +289,7 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
}
 
vcpu->arch.gmap = vcpu->kvm->arch.gmap;
-   vcpu->run->kvm_valid_regs = KVM_SYNC_PREFIX;
+   vcpu->run->kvm_valid_regs = KVM_SYNC_PREFIX | KVM_SYNC_GPRS;
return 0;
 }
 
@@ -428,13 +428,13 @@ static int kvm_arch_vcpu_ioctl_initial_reset(struct kvm_vcpu *vcpu)
 
 int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
 {
-   memcpy(&vcpu->arch.guest_gprs, &regs->gprs, sizeof(regs->gprs));
+   memcpy(&vcpu->run->s.regs.gprs, &regs->gprs, sizeof(regs->gprs));
return 0;
 }
 
 int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
 {
-   memcpy(&regs->gprs, &vcpu->arch.guest_gprs, sizeof(regs->gprs));
+   memcpy(&regs->gprs, &vcpu->run->s.regs.gprs, sizeof(regs->gprs));
return 0;
 }
 
@@ -511,7 +511,7 @@ static int __vcpu_run(struct kvm_vcpu *vcpu)
 {
int rc;
 
-   memcpy(&vcpu->arch.sie_block->gg14, &vcpu->arch.guest_gprs[14], 16);
+   memcpy(&vcpu->arch.sie_block->gg14, &vcpu->run->s.regs.gprs[14], 16);
 
if (need_resched())
schedule();
@@ -528,7 +528,7 @@ static int __vcpu_run(struct 

Re: [PATCH 3/3] stop the periodic RTC update timer

2012-01-11 Thread Marcelo Tosatti
On Fri, Jan 06, 2012 at 07:37:31AM +, Zhang, Yang Z wrote:
 Change the RTC update logic to use the host time with an offset to
 calculate the RTC clock.
   There is no need to use two periodic timers to maintain an internal
 timer for the RTC clock update and the alarm check. Instead, we calculate
 the real RTC time from the host time plus an offset. For the alarm and
 update-ended interrupts, we set up a timer if the guest enabled the
 interrupt, and stop it otherwise.
 
 Signed-off-by: Yang Zhang yang.z.zh...@intel.com
 
 diff --git a/hw/mc146818rtc.c b/hw/mc146818rtc.c
 index 9cbd052..ac1854e 100644
 --- a/hw/mc146818rtc.c
 +++ b/hw/mc146818rtc.c
 @@ -84,7 +84,7 @@ typedef struct RTCState {
  MemoryRegion io;
  uint8_t cmos_data[128];
  uint8_t cmos_index;
 -struct tm current_tm;
 +int64_t offset;
  int32_t base_year;
  qemu_irq irq;
  qemu_irq sqw_irq;
 @@ -93,19 +93,18 @@ typedef struct RTCState {
  QEMUTimer *periodic_timer;
  int64_t next_periodic_time;
  /* second update */
 -int64_t next_second_time;
 +QEMUTimer *update_timer;
 +int64_t next_update_time;
 +/* alarm  */
 +QEMUTimer *alarm_timer;
 +int64_t next_alarm_time;
  uint16_t irq_reinject_on_ack_count;
  uint32_t irq_coalesced;
  uint32_t period;
  QEMUTimer *coalesced_timer;
 -QEMUTimer *second_timer;
 -QEMUTimer *second_timer2;
  Notifier clock_reset_notifier;
  } RTCState;
 
 -static void rtc_set_time(RTCState *s);
 -static void rtc_copy_date(RTCState *s);
 -
  #ifdef TARGET_I386
  static void rtc_coalesced_timer_update(RTCState *s)
  {
 @@ -140,6 +139,72 @@ static void rtc_coalesced_timer(void *opaque)
  }
  #endif
 
 +static inline int rtc_to_bcd(RTCState *s, int a)
 +{
 +if (s->cmos_data[RTC_REG_B] & REG_B_DM) {
 +return a;
 +} else {
 +return ((a / 10) << 4) | (a % 10);
 +}
 +}
 +
 +static inline int rtc_from_bcd(RTCState *s, int a)
 +{
 +if (s->cmos_data[RTC_REG_B] & REG_B_DM) {
 +return a;
 +} else {
 +return ((a >> 4) * 10) + (a & 0x0f);
 +}
 +}
 +
 +static void rtc_set_time(RTCState *s)
 +{
 +struct tm tm ;
 +
 +tm.tm_sec = rtc_from_bcd(s, s->cmos_data[RTC_SECONDS]);
 +tm.tm_min = rtc_from_bcd(s, s->cmos_data[RTC_MINUTES]);
 +tm.tm_hour = rtc_from_bcd(s, s->cmos_data[RTC_HOURS] & 0x7f);
 +if (!(s->cmos_data[RTC_REG_B] & REG_B_24H) &&
 +(s->cmos_data[RTC_HOURS] & 0x80)) {
 +tm.tm_hour += 12;
 +}
 +tm.tm_wday = rtc_from_bcd(s, s->cmos_data[RTC_DAY_OF_WEEK]) - 1;
 +tm.tm_mday = rtc_from_bcd(s, s->cmos_data[RTC_DAY_OF_MONTH]);
 +tm.tm_mon = rtc_from_bcd(s, s->cmos_data[RTC_MONTH]) - 1;
 +tm.tm_year = rtc_from_bcd(s, s->cmos_data[RTC_YEAR]) + s->base_year - 1900;
 +
 +s->offset = qemu_timedate_diff(&tm);
 +
 +rtc_change_mon_event(&tm);
 +}
 +
 +static void rtc_update_time(RTCState *s)
 +{
 +struct tm tm;
 +int year;
 +
 +qemu_get_timedate(&tm, s->offset);
 +
 +s->cmos_data[RTC_SECONDS] = rtc_to_bcd(s, tm.tm_sec);
 +s->cmos_data[RTC_MINUTES] = rtc_to_bcd(s, tm.tm_min);
 +if (s->cmos_data[RTC_REG_B] & REG_B_24H) {
 +/* 24 hour format */
 +s->cmos_data[RTC_HOURS] = rtc_to_bcd(s, tm.tm_hour);
 +} else {
 +/* 12 hour format */
 +s->cmos_data[RTC_HOURS] = rtc_to_bcd(s, tm.tm_hour % 12);
 +if (tm.tm_hour >= 12)
 +s->cmos_data[RTC_HOURS] |= 0x80;
 +}
 +s->cmos_data[RTC_DAY_OF_WEEK] = rtc_to_bcd(s, tm.tm_wday + 1);
 +s->cmos_data[RTC_DAY_OF_MONTH] = rtc_to_bcd(s, tm.tm_mday);
 +s->cmos_data[RTC_MONTH] = rtc_to_bcd(s, tm.tm_mon + 1);
 +year = (tm.tm_year - s->base_year) % 100;
 +if (year < 0)
 +year += 100;
 +s->cmos_data[RTC_YEAR] = rtc_to_bcd(s, year);
 +}
 +

Please put this code move in a separate, earlier patch.
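
For readers following the quoted helpers, the BCD packing they rely on
can be sketched in plain userspace C. This is an illustration only, not
the QEMU code: to_bcd(), from_bcd() and hour_to_12h_bcd() are
hypothetical names, and the sketch always uses BCD, leaving out the
REG_B_DM check that selects binary mode in the quoted code. The 12-hour
encoding mirrors the quoted rtc_update_time(): hour % 12 in BCD, with
bit 7 set for PM.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a value 0..99 into BCD: tens in the high nibble, ones in the low. */
static uint8_t to_bcd(int value)
{
	assert(value >= 0 && value <= 99);
	return (uint8_t)(((value / 10) << 4) | (value % 10));
}

/* Unpack a BCD byte back into its decimal value. */
static int from_bcd(uint8_t bcd)
{
	return ((bcd >> 4) * 10) + (bcd & 0x0f);
}

/* Encode a 24-hour value for 12-hour mode: BCD of hour % 12, bit 7 = PM. */
static uint8_t hour_to_12h_bcd(int hour24)
{
	uint8_t reg = to_bcd(hour24 % 12);

	if (hour24 >= 12)
		reg |= 0x80;
	return reg;
}
```

For example, 59 packs to 0x59, and 13:00 becomes 0x81 in 12-hour mode
(1 o'clock with the PM bit set).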

  static void rtc_timer_update(RTCState *s, int64_t current_time)
  {
  int period_code, period;
 @@ -174,7 +239,7 @@ static void rtc_timer_update(RTCState *s, int64_t current_time)
  }
  }
 
 -static void rtc_periodic_timer(void *opaque)
 +static void rtc_periodic_interrupt(void *opaque)
  {
  RTCState *s = opaque;
 
 @@ -204,6 +269,92 @@ static void rtc_periodic_timer(void *opaque)
  }
  }
 
 +static void rtc_enable_update_interrupt(void *opaque)
 +{
 +RTCState *s = opaque;
 +
 +s->next_update_time = qemu_get_clock_ns(rtc_clock) + get_ticks_per_sec();
 +qemu_mod_timer(s->update_timer, s->next_update_time);
 +}
 +
 +static void rtc_disable_update_interrupt(void *opaque)
 +{
 +RTCState *s = opaque;
 +
 +qemu_del_timer(s->update_timer);
 +}
 +
 +static void rtc_update_interrupt(void *opaque)
 +{
 +RTCState *s = opaque;
 +
 +/* update ended interrupt */
 +s->cmos_data[RTC_REG_C] |= REG_C_UF;
 +if (s->cmos_data[RTC_REG_B] & REG_B_UIE) {
 +s->cmos_data[RTC_REG_C] |= REG_C_IRQF;
 +qemu_irq_raise(s->irq);
 +
 +s->next_update_time += get_ticks_per_sec();
 +   qemu_mod_timer(s->update_timer, 

[regression] virtio net locks up

2012-01-11 Thread Bernd Schubert
No idea what is going on, but recent kernels lock up here after 
transferring some amount of data. So far I only know that 2.6.32 is the 
last working kernel I have tested and 3.0 is the first non-working 
version I tested.


How to reproduce:

vm1: iperf  -c vm2
vm2: iperf -s vm1

After some time, one of the two VMs can no longer be pinged, either
from the host or from the other (still working) VM. Console access to
the VM that lost networking still works fine.



It also does not matter whether I run with vhost on or off; it fails in both modes.

qemu-kvm version is 1.0.

Here's my qemu-kvm start-up script:


#! /bin/bash

source  ~/bin/kvm-config.sh

iface=`sudo tunctl -b -u $USER`
FILE=${IMAGE_DIR}/squeeze1.img
#NICMODEL=e1000
NICMODEL=virtio


DISKIF=virtio
#DISKIF=ide
#DISKIF=scsi

${kvm}  \
-m 4096 \
-net nic,macaddr=52:54:00:12:34:11,model=${NICMODEL} \
-net tap,id=foo,script=${HOME}/bin/kvm-ifup,downscript=${HOME}/bin/kvm-ifdown,ifname=$iface,vhost=on \
-boot c \
-drive file=${FILE},if=${DISKIF},boot=on,cache=writeback \
${common_opts}  \
"$@"

sudo /usr/sbin/tunctl -d $iface




Any idea what is going on or how to debug it?


Thanks,
Bernd

--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [regression] virtio net locks up

2012-01-11 Thread Bernd Schubert

On 01/11/2012 04:24 PM, Bernd Schubert wrote:

 No idea what is going on, but recent kernels lock up here after
 transferring some amount of data. So far I only know that 2.6.32 is the
 last working kernel I have tested and 3.0 is the first non-working
 version I tested.


Sorry, I forgot to mention the host-side kernel version: it was not
updated and is always 2.6.32-131.6.1.el6.x86_64 (so RHEL6).


Cheers,
Bernd



Re: [regression] virtio net locks up

2012-01-11 Thread Stefan Hajnoczi
On Wed, Jan 11, 2012 at 3:24 PM, Bernd Schubert
bernd.schub...@itwm.fraunhofer.de wrote:
 Any idea what is going on or how to debug it?

Here are a couple of ideas that would yield more information:

Since the console still works I suggest checking dmesg output inside
the guest.  Are there any error messages at the bottom?

Try pinging the host's IP address from inside the guest.  Run tcpdump
on the guest's tap interface from the host and observe whether or not
you see any packets being sent from the guest.

rmmod virtio_net inside the guest and then modprobe virtio_net again.
See if network connectivity is restored (remember to rerun DHCP or
whatever, if necessary).

Stefan


Re: [PATCH RFC 2/2] kvm: set affinity hint for assigned device msi

2012-01-11 Thread Marcelo Tosatti
On Mon, Oct 17, 2011 at 07:04:40PM +0200, Michael S. Tsirkin wrote:
 On Mon, Oct 17, 2011 at 02:07:41PM -0200, Marcelo Tosatti wrote:
Configurations to consider, all common ones used for assigned devices?
   
   I mean, besides round robin, any other modes that
   have an issue? Interrupts can also be multicast,
   I think, but we probably don't care what happens
   to affinity then, as msi interrupts are probably never
   broadcast ...
  
  There is also lowest priority, which can be used with MSI.
 
 
 So the following will probably address that comment?

Yes, it does. Patch looks fine.



Re: [regression] virtio net locks up

2012-01-11 Thread Bernd Schubert

Hello Stefan,

thanks for your help!

On 01/11/2012 05:04 PM, Stefan Hajnoczi wrote:

 On Wed, Jan 11, 2012 at 3:24 PM, Bernd Schubert
 bernd.schub...@itwm.fraunhofer.de  wrote:

  Any idea what is going on or how to debug it?

 Here are a couple of ideas that would yield more information:

 Since the console still works I suggest checking dmesg output inside
 the guest.  Are there any error messages at the bottom?

No, absolutely nothing in dmesg.

 Try pinging the host's IP address from inside the guest.  Run tcpdump
 on the guest's tap interface from the host and observe whether or not
 you see any packets being sent from the guest.

It seems ARP requests are still going out, but the replies do not seem
to come in:

17:16:21.202547 ARP, Reply 192.168.123.1 is-at 00:25:90:38:09:cd (oui Unknown), length 28
17:16:21.538724 ARP, Request who-has squeeze1 tell squeeze3, length 28
17:16:21.539026 ARP, Reply squeeze1 is-at 52:54:00:12:34:11 (oui Unknown), length 28
17:16:22.200912 ARP, Request who-has 192.168.123.1 tell squeeze3, length 28

 rmmod virtio_net inside the guest and then modprobe virtio_net again.
 See if network connectivity is restored (remember to rerun DHCP or
 whatever, if necessary).

Yep, that makes it work again. But it is probably not the real solution ;)


Thanks,
Bernd



[PATCH v2] KVM: fix mov immediate emulation for 64-bit operands

2012-01-11 Thread Nadav Amit
The MOV immediate instruction (opcodes 0xB8-0xBF) may take a 64-bit operand.
The previous emulation implementation assumes the operand is no longer than
32 bits. Add OpImm64 to handle this case.

Signed-off-by: Nadav Amit nadav.a...@gmail.com
---
 arch/x86/kvm/emulate.c |   12 ++++++++++--
 1 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 05a562b..9ad5c0b 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -43,7 +43,7 @@
 #define OpCL               9ull  /* CL register (for shifts) */
 #define OpImmByte         10ull  /* 8-bit sign extended immediate */
 #define OpOne             11ull  /* Implied 1 */
-#define OpImm             12ull  /* Sign extended immediate */
+#define OpImm             12ull  /* Sign extended up to 32-bit immediate */
 #define OpMem16           13ull  /* Memory operand (16-bit). */
 #define OpMem32           14ull  /* Memory operand (32-bit). */
 #define OpImmU            15ull  /* Immediate operand, zero extended */
@@ -57,6 +57,7 @@
 #define OpDS              23ull  /* DS */
 #define OpFS              24ull  /* FS */
 #define OpGS              25ull  /* GS */
+#define OpImm64           26ull  /* Sign extended 16/32/64-bit immediate */
 
 #define OpBits             5  /* Width of operand field */
 #define OpMask             ((1ull << OpBits) - 1)
@@ -100,6 +101,7 @@
 #define SrcMemFAddr (OpMemFAddr << SrcShift)
 #define SrcAcc      (OpAcc << SrcShift)
 #define SrcImmU16   (OpImmU16 << SrcShift)
+#define SrcImm64    (OpImm64 << SrcShift)
 #define SrcDX       (OpDX << SrcShift)
 #define SrcMask     (OpMask << SrcShift)
 #define BitOp       (1<<11)
@@ -3365,7 +3367,7 @@ static struct opcode opcode_table[256] = {
 	/* 0xB0 - 0xB7 */
 	X8(I(ByteOp | DstReg | SrcImm | Mov, em_mov)),
 	/* 0xB8 - 0xBF */
-	X8(I(DstReg | SrcImm | Mov, em_mov)),
+	X8(I(DstReg | SrcImm64 | Mov, em_mov)),
 	/* 0xC0 - 0xC7 */
 	D2bv(DstMem | SrcImmByte | ModRM),
 	I(ImplicitOps | Stack | SrcImmU16, em_ret_near_imm),
@@ -3526,6 +3528,9 @@ static int decode_imm(struct x86_emulate_ctxt *ctxt, struct operand *op,
 	case 4:
 		op->val = insn_fetch(s32, ctxt);
 		break;
+	case 8:
+		op->val = insn_fetch(s64, ctxt);
+		break;
 	}
 	if (!sign_extension) {
 		switch (op->bytes) {
@@ -3605,6 +3610,9 @@ static int decode_operand(struct x86_emulate_ctxt *ctxt, struct operand *op,
 	case OpImm:
 		rc = decode_imm(ctxt, op, imm_size(ctxt), true);
 		break;
+	case OpImm64:
+		rc = decode_imm(ctxt, op, ctxt->op_bytes, true);
+		break;
 	case OpMem16:
 		ctxt->memop.bytes = 2;
 		goto mem_common;
-- 
1.7.4.1
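
As a rough illustration of why opcodes 0xB8-0xBF need their own operand type, here is a minimal userspace sketch. It is not the KVM emulator code: `fetch_imm` and its byte-buffer interface are invented for this example. It reads a little-endian immediate of 1, 2, 4 or 8 bytes and sign-extends it to 64 bits, so the effect of capping the fetch at 4 bytes is easy to see.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only: read a little-endian
 * immediate of `size` bytes from `insn` and sign-extend it to 64 bits.
 */
static int64_t fetch_imm(const uint8_t *insn, int size)
{
    uint64_t raw = 0;

    assert(size == 1 || size == 2 || size == 4 || size == 8);

    /* Assemble the bytes, least significant first. */
    for (int i = 0; i < size; i++)
        raw |= (uint64_t)insn[i] << (8 * i);

    /* Sign-extend from size*8 bits; nothing to do for a full 8 bytes. */
    if (size < 8) {
        uint64_t sign = 1ull << (8 * size - 1);
        raw = (raw ^ sign) - sign;
    }
    return (int64_t)raw;
}
```

For the immediate bytes 88 77 66 55 44 33 22 11, an 8-byte fetch returns 0x1122334455667788, while a 4-byte fetch returns only 0x55667788, which is the truncation the patch description refers to.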



Re: [PATCH v5 00/13] KVM/ARM Implementation

2012-01-11 Thread Peter Maydell
On 11 December 2011 19:23, Christoffer Dall
c.d...@virtualopensystems.com wrote:
 On Sun, Dec 11, 2011 at 6:32 AM, Peter Maydell peter.mayd...@linaro.org 
 wrote:
 On 11 December 2011 10:24, Christoffer Dall
 c.d...@virtualopensystems.com wrote:
 Still on the to-do list:
  - Reuse VMIDs
  - Fix SMP host support
  - Fix SMP guest support
  - Support guest Thumb mode for MMIO emulation
  - Further testing
  - Performance improvements

 Other items for this list:
  - Support Neon/VFP in guests (the fpu regs struct is empty ATM)
  - Support guest debugging

 ok, thanks, will add these to the list. I have a feeling it will keep
 growing for a while :)

Do you have a kernel-side TODO list somewhere public (wiki page?)

(It would be quite useful to be able to boot a reasonably modern
[read, ARMv7, Thumb2, VFPv3] guest userspace; does anybody plan
to work on this part soon?)

thanks
-- PMM


[PATCH 1/2] KVM: Exception during emulation decode should propagate

2012-01-11 Thread Nadav Amit
An exception might occur during decode (e.g., #PF during fetch).
Currently, the exception is ignored and emulation is performed.
Instead, emulation should be skipped and the fault should be injected.
Skipping the instruction should report a failure in this case.

Signed-off-by: Nadav Amit nadav.a...@gmail.com
---
 arch/x86/kvm/emulate.c |    3 +++
 arch/x86/kvm/x86.c     |    8 ++++++++
 2 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 05a562b..e06dc98 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -3869,6 +3869,9 @@ done:
 	if (ctxt->memopp && ctxt->memopp->type == OP_MEM && ctxt->rip_relative)
 		ctxt->memopp->addr.mem.ea += ctxt->_eip;
 
+	if (rc == X86EMUL_PROPAGATE_FAULT)
+		ctxt->have_exception = true;
+
 	return (rc != X86EMUL_CONTINUE) ? EMULATION_FAILED : EMULATION_OK;
 }
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 1171def..05fd3d7 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4443,10 +4443,17 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu,
 	}
 
 	if (emulation_type & EMULTYPE_SKIP) {
+		if (ctxt->have_exception)
+			return EMULATE_FAIL;
 		kvm_rip_write(vcpu, ctxt->_eip);
 		return EMULATE_DONE;
 	}
 
+	if (ctxt->have_exception) {
+		writeback = false;
+		goto post;
+	}
+
 	if (retry_instruction(ctxt, cr2, emulation_type))
 		return EMULATE_DONE;
 
@@ -4470,6 +4477,7 @@ restart:
 		return handle_emulation_failure(vcpu);
 	}
 
+post:
 	if (ctxt->have_exception) {
 		inject_emulated_exception(vcpu);
 		r = EMULATE_DONE;
-- 
1.7.4.1



[PATCH 2/2] KVM: Fix writeback on page boundary that propagate changes in spite of #PF

2012-01-11 Thread Nadav Amit
Consider the case in which an instruction emulation writeback is performed on a
page boundary. In such a case, if a #PF occurs on the second page, the write to
the first page has already occurred and cannot be retracted. Therefore,
validation of the second page access must be performed prior to writeback.

Signed-off-by: Nadav Amit nadav.a...@gmail.com
---
 arch/x86/kvm/x86.c |   13 +++++++++++++
 1 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 05fd3d7..7af3d67 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3626,6 +3626,8 @@ struct read_write_emulator_ops {
 			       int bytes, void *val);
 	int (*read_write_exit_mmio)(struct kvm_vcpu *vcpu, gpa_t gpa,
 				    void *val, int bytes);
+	gpa_t (*read_write_validate)(struct kvm_vcpu *vcpu, gva_t gva,
+				     struct x86_exception *exception);
 	bool write;
 };
 
@@ -3686,6 +3688,7 @@ static struct read_write_emulator_ops write_emultor = {
 	.read_write_emulate = write_emulate,
 	.read_write_mmio = write_mmio,
 	.read_write_exit_mmio = write_exit_mmio,
+	.read_write_validate = kvm_mmu_gva_to_gpa_write,
 	.write = true,
 };
 
@@ -3750,6 +3753,16 @@ int emulator_read_write(struct x86_emulate_ctxt *ctxt, unsigned long addr,
 		int rc, now;
 
 		now = -addr & ~PAGE_MASK;
+
+		/* First check there is no page-fault on the next page */
+		if (ops->read_write_validate &&
+		    ops->read_write_validate(vcpu, addr+now, exception) ==
+		    UNMAPPED_GVA) {
+			/* #PF on the first page should be reported first */
+			ops->read_write_validate(vcpu, addr, exception);
+			return X86EMUL_PROPAGATE_FAULT;
+		}
+
 		rc = emulator_read_write_onepage(addr, val, now, exception,
 						 vcpu, ops);
 
-- 
1.7.4.1
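
The page split that this hunk relies on, `now = -addr & ~PAGE_MASK`, can be tried out in isolation. The following is a userspace sketch, not kernel code: it assumes 4 KiB pages, and the `SKETCH_*` macros and both function names are invented for the example.

```c
#include <assert.h>

#define SKETCH_PAGE_SIZE 4096ul                      /* assumed 4 KiB pages */
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))   /* clears the in-page offset */

/* Bytes from addr up to the next page boundary (0 if addr is page-aligned). */
static unsigned long bytes_to_page_end(unsigned long addr)
{
    return -addr & ~SKETCH_PAGE_MASK;
}

/* Returns 1 if an access of `bytes` bytes starting at `addr` touches two pages. */
static int crosses_page(unsigned long addr, unsigned long bytes)
{
    assert(bytes > 0 && bytes <= SKETCH_PAGE_SIZE);
    /* First and last byte differ in their page-number bits. */
    return (((addr + bytes - 1) ^ addr) & SKETCH_PAGE_MASK) != 0;
}
```

A 4-byte write at 0x1ffe crosses into the next page, and `bytes_to_page_end(0x1ffe)` gives 2: two bytes land on the first page and the remaining two on the second, which is the part the patch wants validated before anything is written.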



Re: [PATCH] vhost-net: add module alias

2012-01-11 Thread Stephen Hemminger
On Wed, 11 Jan 2012 15:43:42 +0800
Amos Kong kongjian...@gmail.com wrote:

 On Wed, Jan 11, 2012 at 12:54 PM, Stephen Hemminger
 shemmin...@vyatta.com wrote:
 
  By adding the a module alias, programs (or users) won't have to explicitly
  call modprobe. Vhost-net will always be available if built into the kernel.
  It does require assigning a permanent minor number for depmod to work.
  Choose one next to TUN since this driver is related to it.
 
  Also, use C99 style initialization.
 
  Signed-off-by: Stephen Hemminger shemmin...@vyatta.com
 
  ---
   drivers/vhost/net.c|8 +---
   include/linux/miscdevice.h |1 +
   2 files changed, 6 insertions(+), 3 deletions(-)
 
:
 /*
  *  These allocations are managed by dev...@lanana.org. If you use an
  *  entry that is not in assigned your entry may well be moved and
  *  reassigned, or set dynamic if a fixed value is not justified.
  */

I didn't think that mailing address was used any more. Like many places
in the kernel, the comment looks like a historical leftover.




Re: [PATCH] vhost-net: add module alias

2012-01-11 Thread Stephen Hemminger
On Wed, 11 Jan 2012 11:07:47 +0400
Michael Tokarev m...@tls.msk.ru wrote:

 On 11.01.2012 08:54, Stephen Hemminger wrote:
  By adding the a module alias, programs (or users) won't have to explicitly
  call modprobe. Vhost-net will always be available if built into the kernel.
  It does require assigning a permanent minor number for depmod to work.
  Choose one next to TUN since this driver is related to it.
 
 Why do you think a statically-allocated device number will do any good
 at all?  Static /dev is gone almost completely, at least on the systems
 where whole virt stuff makes any sense, so you don't have pre-created
 vhost-net device anymore, and hence this allocation makes no sense.
 Just IMHO anyway.

The statically allocated device number is required for the udev/module
autoloading to work. Probably the udev infrastructure needs a consistent
number to hang off of.

It looks like:
  * driver adds MODULE_ALIAS() for devname and character device
  * depmod scans modules and creates modules.devname (in /lib/modules)
  * udev uses modules.devname to autoload the module

$ /sbin/modinfo vhost_net
filename:   /lib/modules/3.2.0-net+/kernel/drivers/vhost/vhost_net.ko
alias:  devname:vhost-net
alias:  char-major-10-201
description:Host kernel accelerator for virtio net
...

See also: https://lkml.org/lkml/2010/5/21/134





Re: [PATCH] vhost-net: add module alias

2012-01-11 Thread Michael Tokarev
On 11.01.2012 20:58, Stephen Hemminger wrote:
 On Wed, 11 Jan 2012 11:07:47 +0400
 Michael Tokarev m...@tls.msk.ru wrote:
 
 On 11.01.2012 08:54, Stephen Hemminger wrote:
 By adding the a module alias, programs (or users) won't have to explicitly
 call modprobe. Vhost-net will always be available if built into the kernel.
 It does require assigning a permanent minor number for depmod to work.
 Choose one next to TUN since this driver is related to it.

 Why do you think a statically-allocated device number will do any good
 at all?  Static /dev is gone almost completely, at least on the systems
 where whole virt stuff makes any sense, so you don't have pre-created
 vhost-net device anymore, and hence this allocation makes no sense.
 Just IMHO anyway.
[]
 See also: https://lkml.org/lkml/2010/5/21/134

Aha.  So udev pre-creates statically-allocated devnodes nowadays:

 Udev will pick up the depmod created file on startup and create all the
 static device nodes which the kernel modules specify, so that these modules
 get automatically loaded when the device node is accessed...

This was the part I missed.  Now it all looks logical.

Thanks,

/mjt


Re: [PATCH] vhost-net: add module alias

2012-01-11 Thread Kay Sievers
On Wed, Jan 11, 2012 at 17:58, Stephen Hemminger shemmin...@vyatta.com wrote:
 On Wed, 11 Jan 2012 11:07:47 +0400
 Michael Tokarev m...@tls.msk.ru wrote:

 On 11.01.2012 08:54, Stephen Hemminger wrote:
  By adding the a module alias, programs (or users) won't have to explicitly
  call modprobe. Vhost-net will always be available if built into the kernel.
  It does require assigning a permanent minor number for depmod to work.
  Choose one next to TUN since this driver is related to it.

 Why do you think a statically-allocated device number will do any good
 at all?

It's totally fine to use them for single-instance devices. You are
right, enumerated devices must _never_ use any facility like that.
That would just be broken.

 Static /dev is gone almost completely, at least on the systems
 where whole virt stuff makes any sense, so you don't have pre-created
 vhost-net device anymore, and hence this allocation makes no sense.
 Just IMHO anyway.

It makes a lot of sense in this case. The kernel module files
advertise the dev_t, it's not stored anywhere else. UDev finds these
static numbers and does an implicit mkdev() for them.

 The statically allocated device number is required for the udev/module
 autoloading to work. Probably the udev infrastructure needs a consistent
 number to hang off of.

It does that properly.

Just check:
  $ cat /lib/modules/$(uname -r)/modules.devname
  # Device nodes to trigger on-demand module loading.
  fuse fuse c10:229
  btrfs btrfs-control c10:234
  ppp_generic ppp c108:0
  tun net/tun c10:200
  uinput uinput c10:223
  ...
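
For anyone who wants to inspect these entries programmatically, here is a small sketch that parses one line in the layout shown above (module name, device node, then type and major:minor). The struct and function names are made up for illustration, and the layout is inferred only from this listing, so treat it as an assumption rather than a format specification.

```c
#include <assert.h>
#include <stdio.h>

/* One parsed entry; field sizes are arbitrary limits chosen for the sketch. */
struct devname_entry {
    char module[64];    /* kernel module name, e.g. "tun" */
    char node[64];      /* device node below /dev, e.g. "net/tun" */
    char type;          /* 'c' in all entries listed above */
    unsigned major;
    unsigned minor;
};

/* Returns 1 on success, 0 for comment, blank or malformed lines. */
static int parse_devname_line(const char *line, struct devname_entry *e)
{
    assert(line != NULL && e != NULL);

    if (line[0] == '#')
        return 0;

    /* "%63s" skips leading whitespace and bounds each string field. */
    return sscanf(line, "%63s %63s %c%u:%u",
                  e->module, e->node, &e->type, &e->major, &e->minor) == 5;
}
```

Feeding it the line `tun net/tun c10:200` fills in module "tun", node "net/tun", type 'c', major 10 and minor 200; the comment line at the top of the file is rejected.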

Kay


Re: [regression] virtio net locks up

2012-01-11 Thread Stefan Hajnoczi
On Wed, Jan 11, 2012 at 4:18 PM, Bernd Schubert
bernd.schub...@itwm.fraunhofer.de wrote:
 On 01/11/2012 05:04 PM, Stefan Hajnoczi wrote:
 Try pinging the host's IP address from inside the guest.  Run tcpdump
 on the guest's tap interface from the host and observe whether or not
 you see any packets being sent from the guest.


 Seems arp requests are still going out, but then don't go in:

 17:16:21.202547 ARP, Reply 192.168.123.1 is-at 00:25:90:38:09:cd (oui
 Unknown), length 28
 17:16:21.538724 ARP, Request who-has squeeze1 tell squeeze3, length 28
 17:16:21.539026 ARP, Reply squeeze1 is-at 52:54:00:12:34:11 (oui Unknown),
 length 28
 17:16:22.200912 ARP, Request who-has 192.168.123.1 tell squeeze3, length 28

Okay, so it seems networking from the tap device and beyond is fine.

 rmmod virtio_net inside the guest and then modprobe virtio_net again.
 See if network connectivity is restored (remember to rerun DHCP or
 whatever, if necessary).


 Yep, that makes it work again. But probably is not the real solution ;)

It's just another piece of information which helps debug this :).  At
least nothing has wedged itself into an unrecoverable state.

When you said the problem happens without vhost, did you explicitly
run vhost=off?  Or did you just omit vhost=on?

This sounds like a guest kernel/driver issue.  I recommend testing
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git in
the guest to see if this has already been fixed.

If you have the -dbg RPMs installed it may be possible to insert a
probe into the virtio_net kernel module and observe receive
interrupts.  This does require the right kernel CONFIG_ but you might
already have it enabled:

$ sudo perf probe --add skb_recv_done
$ sudo perf record -e probe:skb_recv_done -a
...send some packets to the guest...
^C
$ sudo perf script

If you see no skb_recv_done events then the guest driver is not
receiving a notification when packets are received.

You can find more about how to use perf-probe(1) at
http://blog.vmsplice.net/2011/03/how-to-use-perf-probe.html.

Stefan


[PATCH] vhost-net: add module alias (v2)

2012-01-11 Thread Stephen Hemminger
By adding the correct module alias, programs won't have to explicitly
call modprobe. Vhost-net will always be available if built into the kernel.
It does require assigning a permanent minor number for depmod to work.
Choose one next to TUN since this driver is related to it.

Also, use C99 style initialization.

Signed-off-by: Stephen Hemminger shemmin...@vyatta.com

---
v2 - document minor number and make sure to not overlap

 Documentation/devices.txt  |    2 ++
 drivers/vhost/net.c        |    8 +++++---
 include/linux/miscdevice.h |    1 +
 3 files changed, 8 insertions(+), 3 deletions(-)

--- a/drivers/vhost/net.c   2012-01-10 10:56:58.883179194 -0800
+++ b/drivers/vhost/net.c   2012-01-10 19:48:23.650225892 -0800
@@ -856,9 +856,9 @@ static const struct file_operations vhos
 };
 
 static struct miscdevice vhost_net_misc = {
-	MISC_DYNAMIC_MINOR,
-	"vhost-net",
-	&vhost_net_fops,
+	.minor = VHOST_NET_MINOR,
+	.name = "vhost-net",
+	.fops = &vhost_net_fops,
 };
 
 static int vhost_net_init(void)
@@ -879,3 +879,5 @@ MODULE_VERSION("0.0.1");
 MODULE_LICENSE("GPL v2");
 MODULE_AUTHOR("Michael S. Tsirkin");
 MODULE_DESCRIPTION("Host kernel accelerator for virtio net");
+MODULE_ALIAS_MISCDEV(VHOST_NET_MINOR);
+MODULE_ALIAS("devname:vhost-net");
--- a/include/linux/miscdevice.h2012-01-10 10:56:59.779189436 -0800
+++ b/include/linux/miscdevice.h2012-01-11 09:13:20.803694316 -0800
@@ -42,6 +42,7 @@
 #define AUTOFS_MINOR		235
 #define MAPPER_CTRL_MINOR	236
 #define LOOP_CTRL_MINOR		237
+#define VHOST_NET_MINOR		238
 #define MISC_DYNAMIC_MINOR	255
 
 struct device;
--- a/Documentation/devices.txt 2012-01-10 10:56:53.399116518 -0800
+++ b/Documentation/devices.txt 2012-01-11 09:12:49.251197653 -0800
@@ -447,6 +447,8 @@ Your cooperation is appreciated.
 		234 = /dev/btrfs-control	Btrfs control device
 		235 = /dev/autofs	Autofs control device
 		236 = /dev/mapper/control	Device-Mapper control device
+		238 = /dev/vhost-net	Host kernel accelerator for virtio net
+
 		240-254			Reserved for local use
 		255			Reserved for MISC_DYNAMIC_MINOR
 




Re: [PATCH 3/3] Code clean up for percpu_xxx() functions

2012-01-11 Thread t...@kernel.org
On Wed, Jan 11, 2012 at 05:08:41PM +0800, Alex,Shi wrote:
 On Mon, 2011-11-21 at 17:06 -0700, t...@kernel.org wrote:
  (cc'ing hpa and quoting whole body)
   
   Signed-off-by: Alex Shi alex@intel.com
   Acked-by: Christoph Lameter c...@gentwo.org
  
   Acked-by: Tejun Heo t...@kernel.org
  
  hpa, I suppose this should go through x86?  The original patch can be
  accessed at
  
http://article.gmane.org/gmane.linux.kernel/1218055/raw
 
 Rend for 3.2 kernel, no any change needed to apply on latest Linus'
 tree. :) 
 
 Actually, this clean up has no performance or security impact for
 kernel. On the contrary, removing some potential redundant preempt
 disable will bring a slight performance benefit to kernel. 
 
 This 3rd patch depends on previous 2 patches, the 2nd one kvm code clean
 up was submitted for 3.3 kernel. but the 2st one net code clean up is
 waiting for David's comments.

Alex, can you please collect all patches into a single patchset?
Please split it so that usage changes are per-subsystem and can be
routed through the respective subsystems (x86 or net), with the updates
to percpu proper kept separate so they can be applied after the other
changes have gone in.  It would probably be best to route these patches
separately rather than all through percpu, as the series touches a lot
of different places and is likely to cause conflicts.  I *think* the
best way would be:
* Submit per-subsystem patches and get them merged to subsystem trees.

* (Optional) Apply a patch to mark unused interface deprecated in
  percpu tree, so that new usages in linux-next can be detected.

* Towards the end of the next merge window, merge a patch to actually
  kill the old interface.

Thanks.

-- 
tejun


[PATCH 2/4 V8] Add functions to check if the host has stopped the vm

2012-01-11 Thread Eric B Munson
When a host stops or suspends a VM it will set a flag to show this.  The
watchdog will use these functions to determine if a softlockup is real, or the
result of a suspended VM.

Signed-off-by: Eric B Munson emun...@mgebm.net
asm-generic changes Acked-by: Arnd Bergmann a...@arndb.de
Cc: mi...@redhat.com
Cc: h...@zytor.com
Cc: ry...@linux.vnet.ibm.com
Cc: aligu...@us.ibm.com
Cc: mtosa...@redhat.com
Cc: jeremy.fitzhardi...@citrix.com
Cc: kvm@vger.kernel.org
Cc: linux-a...@vger.kernel.org
Cc: x...@kernel.org
Cc: linux-ker...@vger.kernel.org
---
Changes from V6:
 Use __this_cpu_and when clearing the PVCLOCK_GUEST_STOPPED flag

Changes from V5:
 Collapse generic stubs into this patch
 check_and_clear_guest_stopped() takes no args and uses __get_cpu_var()
 Include individual definitions in ia64, s390, and powerpc

 arch/ia64/include/asm/kvm_para.h    |    5 +++++
 arch/powerpc/include/asm/kvm_para.h |    5 +++++
 arch/s390/include/asm/kvm_para.h    |    5 +++++
 arch/x86/include/asm/kvm_para.h     |    8 ++++++++
 arch/x86/kernel/kvmclock.c          |   21 +++++++++++++++++++++
 include/asm-generic/kvm_para.h      |   14 ++++++++++++++
 6 files changed, 58 insertions(+), 0 deletions(-)
 create mode 100644 include/asm-generic/kvm_para.h

diff --git a/arch/ia64/include/asm/kvm_para.h b/arch/ia64/include/asm/kvm_para.h
index 1588aee..2019cb9 100644
--- a/arch/ia64/include/asm/kvm_para.h
+++ b/arch/ia64/include/asm/kvm_para.h
@@ -26,6 +26,11 @@ static inline unsigned int kvm_arch_para_features(void)
 	return 0;
 }
 
+static inline bool kvm_check_and_clear_guest_paused(void)
+{
+	return false;
+}
+
 #endif
 
 #endif
diff --git a/arch/powerpc/include/asm/kvm_para.h b/arch/powerpc/include/asm/kvm_para.h
index 50533f9..1f80293 100644
--- a/arch/powerpc/include/asm/kvm_para.h
+++ b/arch/powerpc/include/asm/kvm_para.h
@@ -169,6 +169,11 @@ static inline unsigned int kvm_arch_para_features(void)
 	return r;
 }
 
+static inline bool kvm_check_and_clear_guest_paused(void)
+{
+	return false;
+}
+
 #endif /* __KERNEL__ */
 
 #endif /* __POWERPC_KVM_PARA_H__ */
diff --git a/arch/s390/include/asm/kvm_para.h b/arch/s390/include/asm/kvm_para.h
index 6964db2..a988329 100644
--- a/arch/s390/include/asm/kvm_para.h
+++ b/arch/s390/include/asm/kvm_para.h
@@ -149,6 +149,11 @@ static inline unsigned int kvm_arch_para_features(void)
 	return 0;
 }
 
+static inline bool kvm_check_and_clear_guest_paused(void)
+{
+	return false;
+}
+
 #endif
 
 #endif /* __S390_KVM_PARA_H */
diff --git a/arch/x86/include/asm/kvm_para.h b/arch/x86/include/asm/kvm_para.h
index 734c376..99c4bbe 100644
--- a/arch/x86/include/asm/kvm_para.h
+++ b/arch/x86/include/asm/kvm_para.h
@@ -95,6 +95,14 @@ struct kvm_vcpu_pv_apf_data {
 extern void kvmclock_init(void);
 extern int kvm_register_clock(char *txt);
 
+#ifdef CONFIG_KVM_CLOCK
+bool kvm_check_and_clear_guest_paused(void);
+#else
+static inline bool kvm_check_and_clear_guest_paused(void)
+{
+	return false;
+}
+#endif /* CONFIG_KVM_CLOCK */
 
 /* This instruction is vmcall.  On non-VT architectures, it will generate a
  * trap that we will then rewrite to the appropriate instruction.
diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index 44842d7..bdf6423 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -22,6 +22,7 @@
 #include <asm/msr.h>
 #include <asm/apic.h>
 #include <linux/percpu.h>
+#include <linux/hardirq.h>
 
 #include <asm/x86_init.h>
 #include <asm/reboot.h>
@@ -114,6 +115,26 @@ static void kvm_get_preset_lpj(void)
 	preset_lpj = lpj;
 }
 
+bool kvm_check_and_clear_guest_paused(void)
+{
+	bool ret = false;
+	struct pvclock_vcpu_time_info *src;
+
+	/*
+	 * per_cpu() is safe here because this function is only called from
+	 * timer functions where preemption is already disabled.
+	 */
+	WARN_ON(!in_atomic());
+	src = &__get_cpu_var(hv_clock);
+	if ((src->flags & PVCLOCK_GUEST_STOPPED) != 0) {
+		__this_cpu_and(hv_clock.flags, ~PVCLOCK_GUEST_STOPPED);
+		ret = true;
+	}
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(kvm_check_and_clear_guest_paused);
+
 static struct clocksource kvm_clock = {
 	.name = "kvm-clock",
 	.read = kvm_clock_get_cycles,
diff --git a/include/asm-generic/kvm_para.h b/include/asm-generic/kvm_para.h
new file mode 100644
index 0000000..05ef7e7
--- /dev/null
+++ b/include/asm-generic/kvm_para.h
@@ -0,0 +1,14 @@
+#ifndef _ASM_GENERIC_KVM_PARA_H
+#define _ASM_GENERIC_KVM_PARA_H
+
+
+/*
+ * This function is used by architectures that support kvm to avoid issuing
+ * false soft lockup messages.
+ */
+static inline bool kvm_check_and_clear_guest_paused(void)
+{
+	return false;
+}
+
+#endif
-- 
1.7.5.4
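
The check-and-clear behaviour of kvm_check_and_clear_guest_paused() can be modelled outside the kernel. The sketch below is single-threaded userspace C with a plain struct standing in for the per-CPU hv_clock; `SKETCH_GUEST_STOPPED`, its bit position, `struct sketch_clock` and the function name are assumptions made for the example, not the definitions from this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_GUEST_STOPPED (1u << 1)  /* assumed bit position, illustration only */

/* Stand-in for one CPU's clock structure; only the flags byte matters here. */
struct sketch_clock {
    uint8_t flags;
};

/* Report whether the stopped flag was set, clearing it as a side effect. */
static bool sketch_check_and_clear_paused(struct sketch_clock *clk)
{
    assert(clk != NULL);

    if ((clk->flags & SKETCH_GUEST_STOPPED) == 0)
        return false;

    /* Clear only the stopped bit; other flag bits are left untouched. */
    clk->flags &= (uint8_t)~SKETCH_GUEST_STOPPED;
    return true;
}
```

The point of the pattern is that the flag is consumed exactly once: the first call after the host marks the guest as stopped returns true, and later calls return false until the host sets the flag again. The kernel version additionally depends on running with preemption disabled, which this sketch does not model.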



[PATCH 0/4 V8] Avoid soft lockup message when KVM is stopped by host

2012-01-11 Thread Eric B Munson
Changes from V7:
Define KVM_CAP_GUEST_PAUSED and support check
Call mark_page_dirty () after setting PVCLOCK_GUEST_STOPPED

Changes from V6:
Use __this_cpu_and when clearing the PVCLOCK_GUEST_STOPPED flag

Changes from V5:
Collapse generic check_and_clear_guest_stopped into patch 2
Add check_and_clear_guest_stopped definition to ia64, s390, and powerpc
Change check_and_clear_guest_stopped to use __get_cpu_var instead of taking the
 cpuid arg.
Protect check_and_clear_guest_stopped declaration with CONFIG_KVM_CLOCK check

Changes from V4:
Rename KVM_GUEST_PAUSED to KVMCLOCK_GUEST_PAUSED
Add description of KVMCLOCK_GUEST_PAUSED ioctl to api.txt

Changes from V3:
Include CC's on patch 3
Drop clear flag ioctl and have the watchdog clear the flag when it is reset

Changes from V2:
New kvm functions are defined in kvm_para.h; the only change to pvclock is the
initial flag definition

Changes from V1:
(Thanks Marcelo)
Host code has all been moved to arch/x86/kvm/x86.c
KVM_PAUSE_GUEST was renamed to KVM_GUEST_PAUSED

When a guest kernel is stopped by the host hypervisor it can look like a soft
lockup to the guest kernel.  This false warning can mask later soft lockup
warnings which may be real.  This patch series adds a method for a host
hypervisor to communicate to a guest kernel that it is being stopped.  The
final patch in the series has the watchdog check this flag when it goes to
issue a soft lockup warning and skip the warning if the guest knows it was
stopped.
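
For readers skimming the series, the handshake described above can be modelled
in a few lines of plain C.  This is only a sketch under stated assumptions:
struct demo_time_info and the demo_* helpers are hypothetical stand-ins for the
shared pvclock page and the real kernel functions; only the two flag values are
taken from patch 1/4.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Flag bits as defined in patch 1/4 (pvclock-abi.h). */
#define PVCLOCK_TSC_STABLE_BIT (1 << 0)
#define PVCLOCK_GUEST_STOPPED  (1 << 1)

/* Hypothetical stand-in for the flags byte of the shared pvclock page. */
struct demo_time_info {
    uint8_t flags;
};

/* Host side: mark the guest as stopped. */
static void demo_host_set_stopped(struct demo_time_info *ti)
{
    assert(ti != NULL);
    ti->flags |= PVCLOCK_GUEST_STOPPED;
}

/* Guest side: report whether the flag was set, clearing it if so. */
static bool demo_check_and_clear_guest_paused(struct demo_time_info *ti)
{
    assert(ti != NULL);
    if (!(ti->flags & PVCLOCK_GUEST_STOPPED))
        return false;
    ti->flags &= (uint8_t)~PVCLOCK_GUEST_STOPPED;
    return true;
}
```

The point of the check-and-clear shape is that one pause suppresses at most one
warning; unrelated bits such as PVCLOCK_TSC_STABLE_BIT are left untouched.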

An earlier attempt to solve this in Qemu saved and restored the clock and tsc
for each vcpu, but as a side effect the wall clock of the guest fell behind by
the length of the pause.  That forces a guest to have ntp running in order to
keep the wall clock accurate.

Cc: mi...@redhat.com
Cc: h...@zytor.com
Cc: ry...@linux.vnet.ibm.com
Cc: aligu...@us.ibm.com
Cc: mtosa...@redhat.com
Cc: jeremy.fitzhardi...@citrix.com
Cc: levinsasha...@gmail.com
Cc: Jan Kiszka jan.kis...@siemens.com
Cc: kvm@vger.kernel.org
Cc: linux-a...@vger.kernel.org
Cc: x...@kernel.org
Cc: linux-ker...@vger.kernel.org

Eric B Munson (4):
  Add flag to indicate that a vm was stopped by the host
  Add functions to check if the host has stopped the vm
  Add ioctl for KVMCLOCK_GUEST_STOPPED
  Add check for suspended vm in softlockup detector

 Documentation/virtual/kvm/api.txt   |   13 +
 arch/ia64/include/asm/kvm_para.h|5 +
 arch/powerpc/include/asm/kvm_para.h |5 +
 arch/s390/include/asm/kvm_para.h|5 +
 arch/x86/include/asm/kvm_para.h |8 
 arch/x86/include/asm/pvclock-abi.h  |1 +
 arch/x86/kernel/kvmclock.c  |   21 +
 arch/x86/kvm/x86.c  |   21 +
 include/asm-generic/kvm_para.h  |   14 ++
 include/linux/kvm.h |3 +++
 kernel/watchdog.c   |   12 
 11 files changed, 108 insertions(+), 0 deletions(-)
 create mode 100644 include/asm-generic/kvm_para.h

-- 
1.7.5.4



[PATCH 1/4 V8] Add flag to indicate that a vm was stopped by the host

2012-01-11 Thread Eric B Munson
This flag will be used to check if the vm was stopped by the host when a soft
lockup was detected.  The host will set the flag when it stops the guest.  On
resume, the guest will check this flag if a soft lockup is detected and skip
issuing the warning.

Signed-off-by: Eric B Munson emun...@mgebm.net
Cc: mi...@redhat.com
Cc: h...@zytor.com
Cc: ry...@linux.vnet.ibm.com
Cc: aligu...@us.ibm.com
Cc: mtosa...@redhat.com
Cc: jeremy.fitzhardi...@citrix.com
Cc: kvm@vger.kernel.org
Cc: linux-a...@vger.kernel.org
Cc: x...@kernel.org
Cc: linux-ker...@vger.kernel.org
---
 arch/x86/include/asm/pvclock-abi.h |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/pvclock-abi.h b/arch/x86/include/asm/pvclock-abi.h
index 35f2d19..6167fd7 100644
--- a/arch/x86/include/asm/pvclock-abi.h
+++ b/arch/x86/include/asm/pvclock-abi.h
@@ -40,5 +40,6 @@ struct pvclock_wall_clock {
 } __attribute__((__packed__));
 
 #define PVCLOCK_TSC_STABLE_BIT (1 << 0)
+#define PVCLOCK_GUEST_STOPPED  (1 << 1)
 #endif /* __ASSEMBLY__ */
 #endif /* _ASM_X86_PVCLOCK_ABI_H */
-- 
1.7.5.4



[PATCH V5] Guest stop notification

2012-01-11 Thread Eric B Munson
Often when a guest is stopped from the qemu console, it will report spurious
soft lockup warnings on resume.  There are kernel patches being discussed that
will give the host the ability to tell the guest that it is being stopped and
should ignore the soft lockup warning that generates.  This patch uses the qemu
Notifier system to tell the guest it is about to be stopped.
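
The notification loop this patch adds can be sketched outside of qemu as
follows.  It is an illustration only: demo_vcpu_ioctl_fn is a hypothetical
callback standing in for the per-vcpu KVMCLOCK_GUEST_PAUSED ioctl, and the two
mock callbacks exist purely to exercise the error handling (stop at the first
failure, stay quiet on -EINVAL).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical per-vcpu ioctl wrapper: returns 0 or a negative errno. */
typedef int (*demo_vcpu_ioctl_fn)(int vcpu_index);

/* Notify every vcpu; returns how many vcpus were notified successfully. */
static int demo_notify_guest_paused(int cap_guest_paused, int nr_vcpus,
                                    demo_vcpu_ioctl_fn do_ioctl)
{
    int notified = 0;

    assert(do_ioctl != NULL);

    /* Without the capability the host kernel does not know the ioctl. */
    if (!cap_guest_paused)
        return 0;

    for (int i = 0; i < nr_vcpus; i++) {
        int ret = do_ioctl(i);
        if (ret) {
            /* -EINVAL means "no pvclock for this guest": not worth a message. */
            if (ret != -EINVAL)
                fprintf(stderr, "demo_notify_guest_paused: %s\n",
                        strerror(-ret));
            return notified;
        }
        notified++;
    }
    return notified;
}

/* Mock callbacks, used only to exercise the sketch. */
static int demo_ioctl_ok(int vcpu_index)
{
    (void)vcpu_index;
    return 0;
}

static int demo_ioctl_no_pvclock(int vcpu_index)
{
    return vcpu_index == 2 ? -EINVAL : 0;
}
```

With four vcpus, demo_ioctl_ok notifies all four, while demo_ioctl_no_pvclock
stops silently after the first two.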

Signed-off-by: Eric B Munson emun...@mgebm.net

Cc: Avi Kivity a...@redhat.com
Cc: Marcelo Tosatti mtosa...@redhat.com
Cc: Jan Kiszka jan.kis...@siemens.com
Cc: ry...@linux.vnet.ibm.com
Cc: aligu...@us.ibm.com
Cc: kvm@vger.kernel.org
---
Changes from V4:
 Test if the guest paused capability is available before use

Changes from V3:
 Collapse new state change notification function into existing function.
 Correct whitespace issues
 Change ioctl name to KVMCLOCK_GUEST_PAUSED
 Use for loop to iterate vcpus

Changes from V2:
 Move ioctl into hw/kvmclock.c so as other arches can use it as it is
implemented

Changes from V1:
 Remove unnecessary encapsulating function

 hw/kvmclock.c |   20 
 1 files changed, 20 insertions(+), 0 deletions(-)

diff --git a/hw/kvmclock.c b/hw/kvmclock.c
index 5388bc4..d071d61 100644
--- a/hw/kvmclock.c
+++ b/hw/kvmclock.c
@@ -16,6 +16,7 @@
 #include "sysbus.h"
 #include "kvm.h"
 #include "kvmclock.h"
+#include "cpu-all.h"
 
 #include <linux/kvm.h>
 #include <linux/kvm_para.h>
@@ -62,10 +63,29 @@ static int kvmclock_post_load(void *opaque, int version_id)
 static void kvmclock_vm_state_change(void *opaque, int running,
                                      RunState state)
 {
+    int ret;
+    CPUState *penv = first_cpu;
     KVMClockState *s = opaque;
+    int cap_guest_paused = kvm_check_extension(kvm_state, KVM_CAP_GUEST_PAUSED);
 
     if (running) {
         s->clock_valid = false;
+
+        if (!cap_guest_paused) {
+            return;
+        }
+
+        for (penv = first_cpu; penv != NULL; penv = penv->next_cpu) {
+            ret = kvm_vcpu_ioctl(penv, KVMCLOCK_GUEST_PAUSED, 0);
+            if (ret) {
+                if (ret != -EINVAL) {
+                    fprintf(stderr,
+                            "kvmclock_vm_state_change: %s\n",
+                            strerror(-ret));
+                }
+                return;
+            }
+        }
     }
 }
 
 
-- 
1.7.5.4



[PATCH 4/4 V8] Add check for suspended vm in softlockup detector

2012-01-11 Thread Eric B Munson
A suspended VM can cause spurious soft lockup warnings.  To avoid these, the
watchdog now checks if the kernel knows it was stopped by the host and skips
the warning if so.  When the watchdog is reset successfully, clear the guest
paused flag.
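
As a rough model of the control flow this adds to watchdog_timer_fn(), the
sketch below reduces one timer tick to a pure function.  Everything named
demo_* is hypothetical: guest_paused stands in for the pvclock flag,
already_warned for the per-cpu soft_watchdog_warn variable, and the
"no stall resets the warned state" branch is my reading of the surrounding
watchdog code rather than something this patch changes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum demo_action { DEMO_TOUCH, DEMO_SKIP_WARNING, DEMO_WARN, DEMO_QUIET };

struct demo_watchdog {
    bool guest_paused;   /* stands in for PVCLOCK_GUEST_STOPPED */
    bool already_warned; /* stands in for soft_watchdog_warn */
};

static bool demo_check_and_clear(struct demo_watchdog *wd)
{
    bool was_paused = wd->guest_paused;
    wd->guest_paused = false;
    return was_paused;
}

/*
 * One timer tick.  touch_requested models touch_ts == 0 (the watchdog was
 * asked to reset); duration is nonzero when a stall was measured.
 */
static enum demo_action demo_timer_tick(struct demo_watchdog *wd,
                                        bool touch_requested,
                                        unsigned int duration)
{
    assert(wd != NULL);

    if (touch_requested) {
        /* Clear the guest paused flag on watchdog reset. */
        demo_check_and_clear(wd);
        return DEMO_TOUCH;
    }

    if (duration) {
        /* A host-side pause looks like a stall: skip the warning once. */
        if (demo_check_and_clear(wd))
            return DEMO_SKIP_WARNING;
        /* Only warn once per stall. */
        if (wd->already_warned)
            return DEMO_QUIET;
        wd->already_warned = true;
        return DEMO_WARN;
    }

    wd->already_warned = false;
    return DEMO_QUIET;
}
```

Note that a second stall right after a skipped one still warns, because the
flag has been consumed by the first check.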

Signed-off-by: Eric B Munson emun...@mgebm.net
Cc: mi...@redhat.com
Cc: h...@zytor.com
Cc: ry...@linux.vnet.ibm.com
Cc: aligu...@us.ibm.com
Cc: mtosa...@redhat.com
Cc: jeremy.fitzhardi...@citrix.com
Cc: kvm@vger.kernel.org
Cc: linux-a...@vger.kernel.org
Cc: x...@kernel.org
Cc: linux-ker...@vger.kernel.org
---
Changes from V3:
 Clear the PAUSED flag when the watchdog is reset

 kernel/watchdog.c |   12 
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 1d7bca7..91485e5 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -25,6 +25,7 @@
 #include <linux/sysctl.h>
 
 #include <asm/irq_regs.h>
+#include <linux/kvm_para.h>
 #include <linux/perf_event.h>
 
 int watchdog_enabled = 1;
@@ -280,6 +281,9 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
__this_cpu_write(softlockup_touch_sync, false);
sched_clock_tick();
}
+
+   /* Clear the guest paused flag on watchdog reset */
+   kvm_check_and_clear_guest_paused();
__touch_watchdog();
return HRTIMER_RESTART;
}
@@ -292,6 +296,14 @@ static enum hrtimer_restart watchdog_timer_fn(struct 
hrtimer *hrtimer)
 */
duration = is_softlockup(touch_ts);
if (unlikely(duration)) {
+   /*
+* If a virtual machine is stopped by the host it can look to
+* the watchdog like a soft lockup, check to see if the host
+* stopped the vm before we issue the warning
+*/
+   if (kvm_check_and_clear_guest_paused())
+   return HRTIMER_RESTART;
+
/* only warn once */
if (__this_cpu_read(soft_watchdog_warn) == true)
return HRTIMER_RESTART;
-- 
1.7.5.4



[PATCH 3/4 V8] Add ioctl for KVMCLOCK_GUEST_STOPPED

2012-01-11 Thread Eric B Munson
Now that we have a flag that will tell the guest it was suspended, create an
interface for that communication using a KVM ioctl.

Signed-off-by: Eric B Munson emun...@mgebm.net

Cc: mi...@redhat.com
Cc: h...@zytor.com
Cc: ry...@linux.vnet.ibm.com
Cc: aligu...@us.ibm.com
Cc: mtosa...@redhat.com
Cc: jeremy.fitzhardi...@citrix.com
Cc: kvm@vger.kernel.org
Cc: linux-a...@vger.kernel.org
Cc: x...@kernel.org
Cc: linux-ker...@vger.kernel.org
---
Changes from V7:
 Define KVM_CAP_GUEST_PAUSED and support check
 Call mark_page_dirty () after setting PVCLOCK_GUEST_STOPPED

Changes from V4:
 Rename KVM_GUEST_PAUSED to KVMCLOCK_GUEST_PAUSED
 Add new ioctl description to api.txt

 Documentation/virtual/kvm/api.txt |   13 +
 arch/x86/kvm/x86.c|   21 +
 include/linux/kvm.h   |3 +++
 3 files changed, 37 insertions(+), 0 deletions(-)

diff --git a/Documentation/virtual/kvm/api.txt 
b/Documentation/virtual/kvm/api.txt
index e1d94bf..1931e5c 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -1491,6 +1491,19 @@ following algorithm:
 Some guests configure the LINT1 NMI input to cause a panic, aiding in
 debugging.
 
+4.65 KVMCLOCK_GUEST_PAUSED
+
+Capability: KVM_CAP_GUEST_PAUSED
+Architectures: Any that implement pvclocks (currently x86 only)
+Type: vcpu ioctl
+Parameters: None
+Returns: 0 on success, -1 on error
+
+This signals to the host kernel that the specified guest is being paused by
+userspace.  The host will set a flag in the pvclock structure that is checked
+from the soft lockup watchdog.  This ioctl can be called during pause or
+unpause.
+
 5. The kvm_run structure
 
 Application code obtains a pointer to the kvm_run structure by
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 1171def..b0b51cb 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2056,6 +2056,7 @@ int kvm_dev_ioctl_check_extension(long ext)
case KVM_CAP_X86_ROBUST_SINGLESTEP:
case KVM_CAP_XSAVE:
case KVM_CAP_ASYNC_PF:
+   case KVM_CAP_GUEST_PAUSED:
case KVM_CAP_GET_TSC_KHZ:
r = 1;
break;
@@ -2503,6 +2504,22 @@ static int kvm_vcpu_ioctl_x86_set_xcrs(struct kvm_vcpu 
*vcpu,
return r;
 }
 
+/*
+ * kvm_set_guest_paused() indicates to the guest kernel that it has been
+ * stopped by the hypervisor.  This function will be called from the host only.
+ * EINVAL is returned when the host attempts to set the flag for a guest that
+ * does not support pv clocks.
+ */
+static int kvm_set_guest_paused(struct kvm_vcpu *vcpu)
+{
+   struct pvclock_vcpu_time_info *src = &vcpu->arch.hv_clock;
+   if (!vcpu->arch.time_page)
+       return -EINVAL;
+   src->flags |= PVCLOCK_GUEST_STOPPED;
+   mark_page_dirty(vcpu->kvm, vcpu->arch.time >> PAGE_SHIFT);
+   return 0;
+}
+
 long kvm_arch_vcpu_ioctl(struct file *filp,
 unsigned int ioctl, unsigned long arg)
 {
@@ -2784,6 +2801,10 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 
goto out;
}
+   case KVMCLOCK_GUEST_PAUSED: {
+   r = kvm_set_guest_paused(vcpu);
+   break;
+   }
default:
r = -EINVAL;
}
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index 68e67e5..4ffe0df 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -558,6 +558,7 @@ struct kvm_ppc_pvinfo {
 #define KVM_CAP_PPC_PAPR 68
 #define KVM_CAP_S390_GMAP 71
 #define KVM_CAP_TSC_DEADLINE_TIMER 72
+#define KVM_CAP_GUEST_PAUSED 73
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -763,6 +764,8 @@ struct kvm_clock_data {
 #define KVM_CREATE_SPAPR_TCE _IOW(KVMIO,  0xa8, struct kvm_create_spapr_tce)
 /* Available with KVM_CAP_RMA */
 #define KVM_ALLOCATE_RMA _IOR(KVMIO,  0xa9, struct kvm_allocate_rma)
+/* VM is being stopped by host */
+#define KVMCLOCK_GUEST_PAUSED _IO(KVMIO,   0xaa)
 
 #define KVM_DEV_ASSIGN_ENABLE_IOMMU (1 << 0)
 
-- 
1.7.5.4



Re: [PATCH V5] Guest stop notification

2012-01-11 Thread Jan Kiszka
On 2012-01-11 19:17, Eric B Munson wrote:
 Often when a guest is stopped from the qemu console, it will report spurious
 soft lockup warnings on resume.  There are kernel patches being discussed that
 will give the host the ability to tell the guest that it is being stopped and
 should ignore the soft lockup warning that generates.  This patch uses the 
 qemu
 Notifier system to tell the guest it is about to be stopped.
 
 Signed-off-by: Eric B Munson emun...@mgebm.net
 
 Cc: Avi Kivity a...@redhat.com
 Cc: Marcelo Tosatti mtosa...@redhat.com
 Cc: Jan Kiszka jan.kis...@siemens.com
 Cc: ry...@linux.vnet.ibm.com
 Cc: aligu...@us.ibm.com
 Cc: kvm@vger.kernel.org
 ---
 Changes from V4:
  Test if the guest paused capability is available before use
 
 Changes from V3:
  Collapse new state change notification function into existing function.
  Correct whitespace issues
  Change ioctl name to KVMCLOCK_GUEST_PAUSED
  Use for loop to iterate vcpus
 
 Changes from V2:
  Move ioctl into hw/kvmclock.c so as other arches can use it as it is
 implemented
 
 Changes from V1:
  Remove unnecessary encapsulating function
 
  hw/kvmclock.c |   20 
  1 files changed, 20 insertions(+), 0 deletions(-)
 
 diff --git a/hw/kvmclock.c b/hw/kvmclock.c
 index 5388bc4..d071d61 100644
 --- a/hw/kvmclock.c
 +++ b/hw/kvmclock.c
 @@ -16,6 +16,7 @@
  #include "sysbus.h"
  #include "kvm.h"
  #include "kvmclock.h"
 +#include "cpu-all.h"
  
  #include <linux/kvm.h>
  #include <linux/kvm_para.h>
 @@ -62,10 +63,29 @@ static int kvmclock_post_load(void *opaque, int 
 version_id)
  static void kvmclock_vm_state_change(void *opaque, int running,
   RunState state)
  {
 +int ret;
 +CPUState *penv = first_cpu;
  KVMClockState *s = opaque;
 +int cap_guest_paused = kvm_check_extension(kvm_state, 
 KVM_CAP_GUEST_PAUSED);
  
  if (running) {
  s->clock_valid = false;
 +
 +if (!cap_guest_paused) {
 +return;
 +}

Why? You already ignore -EINVAL.

 +
 +for (penv = first_cpu; penv != NULL; penv = penv->next_cpu) {
 +ret = kvm_vcpu_ioctl(penv, KVMCLOCK_GUEST_PAUSED, 0);

This indicates that the interface could still be improved:
GUEST_PAUSED implies to me a VM state, but the IOCTL has to be applied
per VCPU. This is inconsistent.

Why not define a per-VM IOCTL? Would make user space's life a little bit
easier as well.

Or is there a valid use case of selectively paused VCPUs? Then call it
KVMCLOCK_VCPU_PAUSED.

 +if (ret) {
 +if (ret != -EINVAL) {

What is special about -EINVAL (as long as the cap is checked)?

 +fprintf(stderr,
 +"kvmclock_vm_state_change: %s\n",
 +strerror(-ret));
 +}
 +return;
 +}
 +}
  }
  }
  

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux


State of KVM bits in linux-headers

2012-01-11 Thread Jan Kiszka
Hi,

I'm a bit unhappy about the current state of our supposed to be
automatically sync'ed linux-headers directory in qemu. It has been
updated several times against undefined kernel trees, means against
neither a released version nor kvm.git. Now, if I run an update against
kvm.git + some local change, I get a churn of removals. Same will happen
when that local change ever goes upstream before the other stuff got
finally committed.

Alex, it looks to me like this is mostly PPC stuff. Can you comment on
the origin and workflow? E.g. KVM_CAP_SW_TLB: This has been added half a
year ago but is not in any Linux release around. Fishy...

I would like to see us avoiding this in the future. Headers update
patches should mention the source and should not be merged until the ABI
changes actually made it at least into kvm.git. Same applies, of course,
to the functional changes related to that ABI. Otherwise we risk quite
some mess on everyone's side.

Another thing: KVM_CAP_PPC_HIOR has been removed again from the kernel
and also the header. Is there real free space now or will the cap
reappear? If there should better be a placeholder, let's add it (to the
kernel).

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux


Re: __direct_map() questions

2012-01-11 Thread Marcelo Tosatti
Hi,

See Documentation/virtual/kvm/mmu.txt in the kernel source tree.

On Tue, Jan 10, 2012 at 11:41:41AM -0800, Nick H wrote:
 Hello All,
 
 I am preparing for a presentation for my community college, newbie to
 the kvm world. I am trying to understand kvm implementation. I am
 interested in doing a small presentation  on kvm and its internals at
 my school. I am looking at __direct_map() . I see
 for_each_shadow_entry()-shadow_walk_xxx()  (called in context of
 handle_ept_violation() ) functions using the gfn to find the
 iterator.sptep. It passes this iterator.sptep to the mmu_set_spte().


Re: [PATCH 2/2] KVM: fix missing illegal instruction-trap in protected modes

2012-01-11 Thread Marcelo Tosatti
On Tue, Jan 10, 2012 at 03:26:49PM +0100, Stephan Bärwolf wrote:
 From 2168285ffb30716f30e129c3ce98ce42d19c4d4e Mon Sep 17 00:00:00 2001
 From: Stephan Baerwolf stephan.baerw...@tu-ilmenau.de
 Date: Sun, 8 Jan 2012 02:03:47 +
 Subject: [PATCH 2/2] KVM: fix missing illegal instruction-trap in
 protected modes
 
 On hosts without this patch, 32bit guests will crash
 (and 64bit guests may behave in a wrong way) for
 example by simply executing following nasm-demo-application:
 
 [bits 32]
 global _start
 SECTION .text
 _start: syscall
 
 (I tested it with winxp and linux - both always crashed)
 
 Disassembly of section .text:
 
  _start:
0:   0f 05   syscall
 
 The reason seems a missing invalid opcode-trap (int6) for the
 syscall opcode 0f05, which is not available on Intel CPUs
 within non-longmodes, as also on some AMD CPUs within legacy-mode.
 (depending on CPU vendor, MSR_EFER and cpuid)
 
 Because previous mentioned OSs may not engage corresponding
 syscall target-registers (STAR, LSTAR, CSTAR), they remain
 NULL and (non trapping) syscalls are leading to multiple
 faults and finally crashs.
 
 Depending on the architecture (AMD or Intel) pretended by
 guests, various checks according to vendor's documentation
 are implemented to overcome the current issue and behave
 like the CPUs physical counterparts.
 
 (Therefore using Intel's Intel 64 and IA-32 Architecture Software
 Developers Manual http://www.intel.com/content/dam/doc/manual/
 64-ia-32-architectures-software-developer-manual-325462.pdf
 and AMD's AMD64 Architecture Programmer's Manual Volume 3:
 General-Purpose and System Instructions
 http://support.amd.com/us/Processor_TechDocs/APM_V3_24594.pdf )
 
 Screenshots of an i686 testing VM (CORE i5 host) before
 and after applying this patch are available under:
 
 http://matrixstorm.com/software/linux/kvm/20111229/before.jpg
 http://matrixstorm.com/software/linux/kvm/20111229/after.jpg
 
 Signed-off-by: Stephan Baerwolf stephan.baerw...@tu-ilmenau.de
 ---
  arch/x86/include/asm/kvm_emulate.h |   15 ++
  arch/x86/kvm/emulate.c |   92
 ++-
  2 files changed, 104 insertions(+), 3 deletions(-)
 
 diff --git a/arch/x86/include/asm/kvm_emulate.h
 b/arch/x86/include/asm/kvm_emulate.h
 index b172bf4..5b68c23 100644
 --- a/arch/x86/include/asm/kvm_emulate.h
 +++ b/arch/x86/include/asm/kvm_emulate.h
 @@ -301,6 +301,21 @@ struct x86_emulate_ctxt {
  #define X86EMUL_MODE_PROT (X86EMUL_MODE_PROT16|X86EMUL_MODE_PROT32| \
 X86EMUL_MODE_PROT64)
  
 +/* CPUID vendors */
 +#define X86EMUL_CPUID_VENDOR_AuthenticAMD_ebx 0x68747541
 +#define X86EMUL_CPUID_VENDOR_AuthenticAMD_ecx 0x444d4163
 +#define X86EMUL_CPUID_VENDOR_AuthenticAMD_edx 0x69746e65
 +
 +#define X86EMUL_CPUID_VENDOR_AMDisbetter_ebx 0x69444d41
 +#define X86EMUL_CPUID_VENDOR_AMDisbetter_ecx 0x21726574
 +#define X86EMUL_CPUID_VENDOR_AMDisbetter_edx 0x74656273
 +
 +#define X86EMUL_CPUID_VENDOR_GenuineIntel_ebx 0x756e6547
 +#define X86EMUL_CPUID_VENDOR_GenuineIntel_ecx 0x6c65746e
 +#define X86EMUL_CPUID_VENDOR_GenuineIntel_edx 0x49656e69
 +
 +
 +
  enum x86_intercept_stage {
  X86_ICTP_NONE = 0,   /* Allow zero-init to not match anything */
  X86_ICPT_PRE_EXCEPT,
 diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
 index f1e3be1..3357411 100644
 --- a/arch/x86/kvm/emulate.c
 +++ b/arch/x86/kvm/emulate.c
 @@ -1877,6 +1877,94 @@ setup_syscalls_segments(struct x86_emulate_ctxt
 *ctxt,
  ss-p = 1;
  }
  
 +static bool em_syscall_isenabled(struct x86_emulate_ctxt *ctxt)
 +{
 +    struct x86_emulate_ops *ops = ctxt->ops;
 +    u64 efer = 0;
 +
 +    /* syscall is not available in real mode */
 +    if ((ctxt->mode == X86EMUL_MODE_REAL) ||
 +        (ctxt->mode == X86EMUL_MODE_VM86))
 +        return false;
 +
 +    ops->get_msr(ctxt, MSR_EFER, &efer);
 +    /* check - if guestOS is aware of syscall (0x0f05)  */
 +    if ((efer & EFER_SCE) == 0) {
 +        return false;
 +    } else {
 +      /* ok, at this point it becomes vendor-specific   */
 +      /* so first get us an cpuid   */
 +      bool vendor;
 +      u32 eax, ebx, ecx, edx;
 +
 +      /* getting the cpu-vendor */
 +      eax = 0x00000000;
 +      ecx = 0x00000000;
 +      if (likely(ops->get_cpuid))
 +          vendor = ops->get_cpuid(ctxt, &eax, &ebx, &ecx, &edx);
 +      else vendor = false;
 +
 +      if (likely(vendor)) {
 +
 +        /* AMD AuthenticAMD / AMDisbetter!  */
 +        if (((ebx==X86EMUL_CPUID_VENDOR_AuthenticAMD_ebx) &&
 +             (ecx==X86EMUL_CPUID_VENDOR_AuthenticAMD_ecx) &&
 +             (edx==X86EMUL_CPUID_VENDOR_AuthenticAMD_edx)) ||
 +            ((ebx==X86EMUL_CPUID_VENDOR_AMDisbetter_ebx) &&
 +             (ecx==X86EMUL_CPUID_VENDOR_AMDisbetter_ecx) &&
 + 
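
The X86EMUL_CPUID_VENDOR_* constants in the quoted hunk are simply the CPUID
leaf 0 vendor string packed four ASCII characters per register, in EBX, EDX,
ECX order.  The sketch below recomputes them from the strings so the magic
numbers can be checked by hand; the demo_* names are hypothetical and not part
of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Pack four ASCII characters, first character in the low byte. */
static uint32_t demo_pack4(const char *s)
{
    return (uint32_t)(unsigned char)s[0]
         | (uint32_t)(unsigned char)s[1] << 8
         | (uint32_t)(unsigned char)s[2] << 16
         | (uint32_t)(unsigned char)s[3] << 24;
}

struct demo_vendor_regs {
    uint32_t ebx, ecx, edx;
};

/* CPUID leaf 0 reports the 12-character vendor string as EBX, EDX, ECX. */
static struct demo_vendor_regs demo_vendor_to_regs(const char *vendor)
{
    struct demo_vendor_regs r;

    assert(vendor != NULL && strlen(vendor) == 12);
    r.ebx = demo_pack4(vendor);
    r.edx = demo_pack4(vendor + 4);
    r.ecx = demo_pack4(vendor + 8);
    return r;
}
```

For example, "AuthenticAMD" splits into "Auth", "enti", "cAMD", which gives
ebx 0x68747541, edx 0x69746e65 and ecx 0x444d4163, matching the defines.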

[RFC][PATCH] Update linux headers against kvm.git

2012-01-11 Thread Jan Kiszka
On 2012-01-11 20:16, Jan Kiszka wrote:
 Hi,
 
 I'm a bit unhappy about the current state of our supposed to be
 automatically sync'ed linux-headers directory in qemu. It has been
 updated several times against undefined kernel trees, means against
 neither a released version nor kvm.git. Now, if I run an update against
 kvm.git + some local change, I get a churn of removals. Same will happen
 when that local change ever goes upstream before the other stuff got
 finally committed.
 
 Alex, it looks to me like this is mostly PPC stuff. Can you comment on
 the origin and workflow? E.g. KVM_CAP_SW_TLB: This has been added half a
 year ago but is not in any Linux release around. Fishy...
 
 I would like to see us avoiding this in the future. Headers update
 patches should mention the source and should not be merged until the ABI
 changes actually made it at least into kvm.git. Same applies, of course,
 to the functional changes related to that ABI. Otherwise we risk quite
 some mess on everyone's side.
 
 Another thing: KVM_CAP_PPC_HIOR has been removed again from the kernel
 and also the header. Is there real free space now or will the cap
 reappear? If there should better be a placeholder, let's add it (to the
 kernel).
 
 Jan
 

Just to underline this, not for merge (yet).

Is it clear that those PPC features will be merged upstream as-is now?

Jan

---8---

This synchronizes our headers with kvm.git ff92e9b557 - and breaks PPC
build. Fairly telling...

---
 linux-headers/asm-powerpc/kvm.h   |   37 -
 linux-headers/asm-x86/hyperv.h|1 +
 linux-headers/linux/kvm.h |   54 ++--
 linux-headers/linux/kvm_para.h|1 -
 linux-headers/linux/virtio_ring.h |6 ++--
 5 files changed, 7 insertions(+), 92 deletions(-)

diff --git a/linux-headers/asm-powerpc/kvm.h b/linux-headers/asm-powerpc/kvm.h
index fb3fddc..f7727d9 100644
--- a/linux-headers/asm-powerpc/kvm.h
+++ b/linux-headers/asm-powerpc/kvm.h
@@ -292,41 +292,4 @@ struct kvm_allocate_rma {
__u64 rma_size;
 };
 
-struct kvm_book3e_206_tlb_entry {
-   __u32 mas8;
-   __u32 mas1;
-   __u64 mas2;
-   __u64 mas7_3;
-};
-
-struct kvm_book3e_206_tlb_params {
-   /*
-* For mmu types KVM_MMU_FSL_BOOKE_NOHV and KVM_MMU_FSL_BOOKE_HV:
-*
-* - The number of ways of TLB0 must be a power of two between 2 and
-*   16.
-* - TLB1 must be fully associative.
-* - The size of TLB0 must be a multiple of the number of ways, and
-*   the number of sets must be a power of two.
-* - The size of TLB1 may not exceed 64 entries.
-* - TLB0 supports 4 KiB pages.
-* - The page sizes supported by TLB1 are as indicated by
-*   TLB1CFG (if MMUCFG[MAVN] = 0) or TLB1PS (if MMUCFG[MAVN] = 1)
-*   as returned by KVM_GET_SREGS.
-* - TLB2 and TLB3 are reserved, and their entries in tlb_sizes[]
-*   and tlb_ways[] must be zero.
-*
-* tlb_ways[n] = tlb_sizes[n] means the array is fully associative.
-*
-* KVM will adjust TLBnCFG based on the sizes configured here,
-* though arrays greater than 2048 entries will have TLBnCFG[NENTRY]
-* set to zero.
-*/
-   __u32 tlb_sizes[4];
-   __u32 tlb_ways[4];
-   __u32 reserved[8];
-};
-
-#define KVM_ONE_REG_PPC_HIOR   KVM_ONE_REG_PPC | 0x100
-
 #endif /* __LINUX_KVM_POWERPC_H */
diff --git a/linux-headers/asm-x86/hyperv.h b/linux-headers/asm-x86/hyperv.h
index 5df477a..b80420b 100644
--- a/linux-headers/asm-x86/hyperv.h
+++ b/linux-headers/asm-x86/hyperv.h
@@ -189,5 +189,6 @@
 #define HV_STATUS_INVALID_HYPERCALL_CODE   2
 #define HV_STATUS_INVALID_HYPERCALL_INPUT  3
 #define HV_STATUS_INVALID_ALIGNMENT4
+#define HV_STATUS_INSUFFICIENT_BUFFERS 19
 
 #endif
diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index a8761d3..e36ad9a 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -371,6 +371,7 @@ struct kvm_s390_psw {
 #define KVM_S390_INT_VIRTIO0x2603u
 #define KVM_S390_INT_SERVICE   0x2401u
 #define KVM_S390_INT_EMERGENCY 0x1201u
+#define KVM_S390_INT_EXTERNAL_CALL 0x1202u
 
 struct kvm_s390_interrupt {
__u32 type;
@@ -554,10 +555,9 @@ struct kvm_ppc_pvinfo {
 #define KVM_CAP_PPC_SMT 64
 #define KVM_CAP_PPC_RMA65
 #define KVM_CAP_MAX_VCPUS 66   /* returns max vcpus per vm */
-#define KVM_CAP_PPC_HIOR 67
 #define KVM_CAP_PPC_PAPR 68
-#define KVM_CAP_SW_TLB 69
-#define KVM_CAP_ONE_REG 70
+#define KVM_CAP_S390_GMAP 71
+#define KVM_CAP_TSC_DEADLINE_TIMER 72
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -637,49 +637,6 @@ struct kvm_clock_data {
__u32 pad[9];
 };
 
-#define KVM_MMU_FSL_BOOKE_NOHV 0
-#define KVM_MMU_FSL_BOOKE_HV   1
-
-struct kvm_config_tlb {
-   __u64 params;
-   __u64 array;
-   __u32 mmu_type;
-   __u32 

Re: State of KVM bits in linux-headers

2012-01-11 Thread Alexander Graf

On 11.01.2012, at 20:16, Jan Kiszka wrote:

 Hi,
 
 I'm a bit unhappy about the current state of our supposed to be
 automatically sync'ed linux-headers directory in qemu. It has been
 updated several times against undefined kernel trees, means against
 neither a released version nor kvm.git. Now, if I run an update against
 kvm.git + some local change, I get a churn of removals. Same will happen
 when that local change ever goes upstream before the other stuff got
 finally committed.

Yes, call me even more unhappy about it :(.

 Alex, it looks to me like this is mostly PPC stuff. Can you comment on
 the origin and workflow? E.g. KVM_CAP_SW_TLB: This has been added half a
 year ago but is not in any Linux release around. Fishy...

Ok, here's my workflow:

  * KVM: receive patches on the ML
  * KVM: wait for reviews, review myself
  * KVM: send out a pull request
  -- this is the point in time where I assume the ABI can be considered stable 
--
  * QEMU: run update on the headers, because in a perfect world things should 
hit kvm.git any day
  * KVM: pull request gets reviews causing not-pulls or abi changes and lots of 
churn because i need forever to pullreq again ;)

I guess you see the problem. Hence I haven't pushed any kernel header updates 
since I realized how badly broken that process was. However even the stuff 
that's in qemu.git now hasn't managed to get upstream yet.

 I would like to see us avoiding this in the future. Headers update
 patches should mention the source and should not be merged until the ABI
 changes actually made it at least into kvm.git. Same applies, of course,
 to the functional changes related to that ABI. Otherwise we risk quite
 some mess on everyone's side.

I agree.

 Another thing: KVM_CAP_PPC_HIOR has been removed again from the kernel
 and also the header. Is there real free space now or will the cap
 reappear? If there should better be a placeholder, let's add it (to the
 kernel).

It will reappear with ONE_REG semantics.


Alex



Re: [PATCH] vhost-net: add module alias (v2)

2012-01-11 Thread Michael S. Tsirkin
On Wed, Jan 11, 2012 at 09:16:53AM -0800, Stephen Hemminger wrote:
 By adding the correct module alias, programs won't have to explicitly
 call modprobe. Vhost-net will always be available if built into the kernel.
 It does require assigning a permanent minor number for depmod to work.
 Choose one next to TUN since this driver is related to it.
 
 Also, use C99 style initialization.
 
 Signed-off-by: Stephen Hemminger shemmin...@vyatta.com

I don't mind but this needs an Ack from Alan Cox
who made it dynamic in the first place,
see 79907d89c397b8bc2e05b347ec94e928ea919d33.
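
As an aside on the "C99 style initialization" mentioned in the quoted commit
message, here is a minimal sketch of the difference.  struct demo_miscdevice is
a hypothetical, cut-down stand-in for the kernel's struct miscdevice, and
demo_fops is a placeholder object; only the minor value and device name come
from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_VHOST_NET_MINOR 238 /* value proposed in the quoted patch */

/* Hypothetical, cut-down stand-in for struct miscdevice. */
struct demo_miscdevice {
    int minor;
    const char *name;
    const void *fops;
};

static const int demo_fops = 0; /* placeholder object to point at */

/* Positional form: correctness depends on the member order of the struct. */
static const struct demo_miscdevice demo_positional = {
    DEMO_VHOST_NET_MINOR, "vhost-net", &demo_fops,
};

/* Designated form: each member is named, so order no longer matters. */
static const struct demo_miscdevice demo_designated = {
    .fops = &demo_fops,
    .name = "vhost-net",
    .minor = DEMO_VHOST_NET_MINOR,
};

static int demo_same_device(const struct demo_miscdevice *a,
                            const struct demo_miscdevice *b)
{
    assert(a != NULL && b != NULL);
    return a->minor == b->minor &&
           strcmp(a->name, b->name) == 0 &&
           a->fops == b->fops;
}
```

Both objects end up identical; the designated form just survives a reordering
of the struct members.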

 ---
 v2 - document minor number and make sure to not overlap
 
  Documentation/devices.txt  |2 ++
  drivers/vhost/net.c|8 +---
  include/linux/miscdevice.h |1 +
  3 files changed, 8 insertions(+), 3 deletions(-)
 
 --- a/drivers/vhost/net.c 2012-01-10 10:56:58.883179194 -0800
 +++ b/drivers/vhost/net.c 2012-01-10 19:48:23.650225892 -0800
 @@ -856,9 +856,9 @@ static const struct file_operations vhos
  };
  
  static struct miscdevice vhost_net_misc = {
 - MISC_DYNAMIC_MINOR,
 - "vhost-net",
 - &vhost_net_fops,
 + .minor = VHOST_NET_MINOR,
 + .name = "vhost-net",
 + .fops = &vhost_net_fops,
  };
  
  static int vhost_net_init(void)
 @@ -879,3 +879,5 @@ MODULE_VERSION("0.0.1");
  MODULE_LICENSE("GPL v2");
  MODULE_AUTHOR("Michael S. Tsirkin");
  MODULE_DESCRIPTION("Host kernel accelerator for virtio net");
 +MODULE_ALIAS_MISCDEV(VHOST_NET_MINOR);
 +MODULE_ALIAS("devname:vhost-net");
 --- a/include/linux/miscdevice.h  2012-01-10 10:56:59.779189436 -0800
 +++ b/include/linux/miscdevice.h  2012-01-11 09:13:20.803694316 -0800
 @@ -42,6 +42,7 @@
  #define AUTOFS_MINOR 235
  #define MAPPER_CTRL_MINOR236
  #define LOOP_CTRL_MINOR  237
 +#define VHOST_NET_MINOR  238
  #define MISC_DYNAMIC_MINOR   255
  
  struct device;
 --- a/Documentation/devices.txt   2012-01-10 10:56:53.399116518 -0800
 +++ b/Documentation/devices.txt   2012-01-11 09:12:49.251197653 -0800
 @@ -447,6 +447,8 @@ Your cooperation is appreciated.
   234 = /dev/btrfs-controlBtrfs control device
   235 = /dev/autofs   Autofs control device
   236 = /dev/mapper/control   Device-Mapper control device
 + 238 = /dev/vhost-net    Host kernel accelerator for virtio net
 +
   240-254 Reserved for local use
   255 Reserved for MISC_DYNAMIC_MINOR
  
 


Re: [Qemu-devel] State of KVM bits in linux-headers

2012-01-11 Thread Anthony Liguori

On 01/11/2012 01:32 PM, Alexander Graf wrote:


On 11.01.2012, at 20:16, Jan Kiszka wrote:


Hi,

I'm a bit unhappy about the current state of our supposed to be
automatically sync'ed linux-headers directory in qemu. It has been
updated several times against undefined kernel trees, means against
neither a released version nor kvm.git. Now, if I run an update against
kvm.git + some local change, I get a churn of removals. Same will happen
when that local change ever goes upstream before the other stuff got
finally committed.


Yes, call me even more unhappy about it :(.


May I suggest the following:

1) Have the header syncing script take a commit hash that's stored in git.  Make 
the script ensure that this hash is in Linus' tree.


2) Maintain a patch on top of Linus' tree in qemu.git that the script would 
apply before actually syncing header files.


That lets us track how we're differing from upstream in a more reliable 
fashion.


Alex, it looks to me like this is mostly PPC stuff. Can you comment on
the origin and workflow? E.g. KVM_CAP_SW_TLB: This has been added half a
year ago but is not in any Linux release around. Fishy...


Ok, here's my workflow:

   * KVM: receive patches on the ML
   * KVM: wait for reviews, review myself
   * KVM: send out a pull request
   -- this is the point in time where I assume the ABI can be considered stable 
--
   * QEMU: run update on the headers, because in a perfect world things should 
hit kvm.git any day
   * KVM: pull request gets reviews causing not-pulls or abi changes and lots 
of churn because i need forever to pullreq again ;)

I guess you see the problem. Hence I haven't pushed any kernel header updates 
since I realized how badly broken that process was. However even the stuff 
that's in qemu.git now hasn't managed to get upstream yet.


I don't think it's a broken process.  I think you made a reasonable set of 
assumptions.  I think it was just an exceptional circumstance.


Regards,

Anthony Liguori


Re: State of KVM bits in linux-headers

2012-01-11 Thread Jan Kiszka
On 2012-01-11 20:32, Alexander Graf wrote:
 
 On 11.01.2012, at 20:16, Jan Kiszka wrote:
 
 Hi,

 I'm a bit unhappy about the current state of our supposed to be
 automatically sync'ed linux-headers directory in qemu. It has been
 updated several times against undefined kernel trees, means against
 neither a released version nor kvm.git. Now, if I run an update against
 kvm.git + some local change, I get a churn of removals. Same will happen
 when that local change ever goes upstream before the other stuff got
 finally committed.
 
 Yes, call me even more unhappy about it :(.
 
 Alex, it looks to me like this is mostly PPC stuff. Can you comment on
 the origin and workflow? E.g. KVM_CAP_SW_TLB: This has been added half a
 year ago but is not in any Linux release around. Fishy...
 
 Ok, here's my workflow:
 
   * KVM: receive patches on the ML
   * KVM: wait for reviews, review myself
   * KVM: send out a pull request
   -- this is the point in time where I assume the ABI can be considered 
 stable --
   * QEMU: run update on the headers, because in a perfect world things should 
 hit kvm.git any day
   * KVM: pull request gets reviews causing not-pulls or abi changes and lots 
 of churn because i need forever to pullreq again ;)

Likely, the last item has to be moved up by two steps...

 
 I guess you see the problem. Hence I haven't pushed any kernel header updates 
 since I realized how badly broken that process was. However even the stuff 
 that's in qemu.git now hasn't managed to get upstream yet.

On the other hand, if I recall correctly, there was a complaint on
the list recently about a header update patch against a Linux -rc version.
Because it removed the limbo land stuff in the same run, of course.
That's very bad. I see the problem: ppc targets will no longer build, at
least with KVM enabled, right? But this needs to be resolved now.

 
 I would like to see us avoiding this in the future. Headers update
 patches should mention the source and should not be merged until the ABI
 changes actually made it at least into kvm.git. Same applies, of course,
 to the functional changes related to that ABI. Otherwise we risk quite
 some mess on everyone's side.
 
 I agree.
 
 Another thing: KVM_CAP_PPC_HIOR has been removed again from the kernel
 and also the header. Is there real free space now or will the cap
 reappear? If there should better be a placeholder, let's add it (to the
 kernel).
 
 It will reappear with ONE_REG semantics.
 

OK.

Then please clean up now so that update-linux-headers.sh can be used
again by normal developers. :)

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux


Re: [Qemu-devel] State of KVM bits in linux-headers

2012-01-11 Thread Anthony Liguori

On 01/11/2012 01:38 PM, Jan Kiszka wrote:



I would like to see us avoiding this in the future. Headers update
patches should mention the source and should not be merged until the ABI
changes actually made it at least into kvm.git. Same applies, of course,
to the functional changes related to that ABI. Otherwise we risk quite
some mess on everyone's side.


I agree.


Another thing: KVM_CAP_PPC_HIOR has been removed again from the kernel
and also the header. Is there real free space now or will the cap
reappear? If there should better be a placeholder, let's add it (to the
kernel).


It will reappear with ONE_REG semantics.



OK.

Then please clean up now so that update-linux-headers.sh can be used
again by normal developers. :)


Before we did submodules and had a responsive BIOS maintainer, we maintained 
patches within qemu.git for our external dependencies.  I think that's a good 
strategy here too.  It's a little painful, but not entirely awful.


At least it makes it possible for you to (hopefully) trivially rebase a patch if 
something is still in limbo.


Regards,

Anthony Liguori



Jan





Re: [Qemu-devel] State of KVM bits in linux-headers

2012-01-11 Thread Alexander Graf

On 11.01.2012, at 20:38, Anthony Liguori wrote:

 On 01/11/2012 01:32 PM, Alexander Graf wrote:
 
 On 11.01.2012, at 20:16, Jan Kiszka wrote:
 
 Hi,
 
 I'm a bit unhappy about the current state of our supposed to be
 automatically sync'ed linux-headers directory in qemu. It has been
 updated several times against undefined kernel trees, means against
 neither a released version nor kvm.git. Now, if I run an update against
 kvm.git + some local change, I get a churn of removals. Same will happen
 when that local change ever goes upstream before the other stuff got
 finally committed.
 
 Yes, call me even more unhappy about it :(.
 
 May I suggest the following:
 
 1) Have the header syncing script take a commit hash that's stored in git.
 Make the script ensure that this hash is in Linus' tree.
 
 2) Maintain a patch on top of Linus' tree in qemu.git that the script would 
 apply before actually syncing header files.
 
 That lets us track how we're differing from upstream in a more reliable 
 fashion.

Yeah, I guess the ultimate question it boils down to is: when is something 
upstream? The average time it takes for patches to trickle through to Linus 
right now is on the order of half a year to a year.

 
 Alex, it looks to me like this is mostly PPC stuff. Can you comment on
 the origin and workflow? E.g. KVM_CAP_SW_TLB: This has been added half a
 year ago but is not in any Linux release around. Fishy...
 
 Ok, here's my workflow:
 
   * KVM: receive patches on the ML
   * KVM: wait for reviews, review myself
   * KVM: send out a pull request
   -- this is the point in time where I assume the ABI can be considered 
 stable --
   * QEMU: run update on the headers, because in a perfect world things 
 should hit kvm.git any day
   * KVM: pull request gets reviews causing not-pulls or abi changes and lots 
 of churn because i need forever to pullreq again ;)
 
 I guess you see the problem. Hence I haven't pushed any kernel header 
 updates since I realized how badly broken that process was. However even the 
 stuff that's in qemu.git now hasn't managed to get upstream yet.
 
 I don't think it's a broken process.  I think you made a reasonable set of 
 assumptions.  I think it was just an exceptional circumstance.

Several times in a row? No, the assumptions were just wrong. In the kvm world, 
pull requests don't mean upstream, they mean the same as a patch set.


Alex



Re: [Qemu-devel] State of KVM bits in linux-headers

2012-01-11 Thread Alexander Graf

On 11.01.2012, at 20:41, Anthony Liguori wrote:

 On 01/11/2012 01:38 PM, Jan Kiszka wrote:
 
 I would like to see us avoiding this in the future. Headers update
 patches should mention the source and should not be merged until the ABI
 changes actually made it at least into kvm.git. Same applies, of course,
 to the functional changes related to that ABI. Otherwise we risk quite
 some mess on everyone's side.
 
 I agree.
 
 Another thing: KVM_CAP_PPC_HIOR has been removed again from the kernel
 and also the header. Is there real free space now or will the cap
 reappear? If there should better be a placeholder, let's add it (to the
 kernel).
 
 It will reappear with ONE_REG semantics.
 
 
 OK.
 
 Then please clean up now so that update-linux-headers.sh can be used
 again by normal developers. :)
 
 Before we did submodules and had a responsive BIOS maintainer, we maintained 
 patches within qemu.git for our external dependencies.  I think that's a good 
 strategy here too.  It's a little painful, but not entirely awful.
 
 At least it makes it possible for you to (hopefully) trivially rebase a patch 
 if something is still in limbo.

Yeah, that works. I can easily script that part. It doesn't solve the actual 
underlying problem though that we don't know when the abi is actually stable. 
I'm slowly starting to understand Pekka ;).


Alex



Re: [Qemu-devel] State of KVM bits in linux-headers

2012-01-11 Thread Jan Kiszka
On 2012-01-11 20:38, Anthony Liguori wrote:
 On 01/11/2012 01:32 PM, Alexander Graf wrote:

 On 11.01.2012, at 20:16, Jan Kiszka wrote:

 Hi,

 I'm a bit unhappy about the current state of our supposed to be
 automatically sync'ed linux-headers directory in qemu. It has been
 updated several times against undefined kernel trees, means against
 neither a released version nor kvm.git. Now, if I run an update against
 kvm.git + some local change, I get a churn of removals. Same will happen
 when that local change ever goes upstream before the other stuff got
 finally committed.

 Yes, call me even more unhappy about it :(.
 
 May I suggest the following:
 
 1) Have the header syncing script take a commit hash that's stored in git.
 Make the script ensure that this hash is in Linus' tree.
 
 2) Maintain a patch on top of Linus' tree in qemu.git that the script would 
 apply before actually syncing header files.
 
 That lets us track how we're differing from upstream in a more reliable 
 fashion.

That sounds fairly complicated for a simple problem: Do not merge ABI
changes that aren't at least in kvm.git. There are also other reasons
for this, besides making the sync harder.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux


Re: [Qemu-devel] State of KVM bits in linux-headers

2012-01-11 Thread Alexander Graf

On 11.01.2012, at 20:46, Jan Kiszka wrote:

 On 2012-01-11 20:38, Anthony Liguori wrote:
 On 01/11/2012 01:32 PM, Alexander Graf wrote:
 
 On 11.01.2012, at 20:16, Jan Kiszka wrote:
 
 Hi,
 
 I'm a bit unhappy about the current state of our supposed to be
 automatically sync'ed linux-headers directory in qemu. It has been
 updated several times against undefined kernel trees, means against
 neither a released version nor kvm.git. Now, if I run an update against
 kvm.git + some local change, I get a churn of removals. Same will happen
 when that local change ever goes upstream before the other stuff got
 finally committed.
 
 Yes, call me even more unhappy about it :(.
 
 May I suggest the following:
 
 1) Have the header syncing script take a commit hash that's stored in git.
 Make the script ensure that this hash is in Linus' tree.
 
 2) Maintain a patch on top of Linus' tree in qemu.git that the script would 
 apply before actually syncing header files.
 
 That lets us track how we're differing from upstream in a more reliable 
 fashion.
 
 That sounds fairly complicated for a simple problem: Do not merge ABI
 changes that aren't at least in kvm.git. There are also other reasons
 for this, besides making the sync harder.

Let's just try to get my patch queue into kvm.git asap and then never again push 
linux-header updates before they hit kvm.git. That's easier than setting 
up any complicated processes or scripts.


Alex



Re: [Qemu-devel] State of KVM bits in linux-headers

2012-01-11 Thread Jan Kiszka
On 2012-01-11 20:46, Alexander Graf wrote:
 
 On 11.01.2012, at 20:41, Anthony Liguori wrote:
 
 On 01/11/2012 01:38 PM, Jan Kiszka wrote:

 I would like to see us avoiding this in the future. Headers update
 patches should mention the source and should not be merged until the ABI
 changes actually made it at least into kvm.git. Same applies, of course,
 to the functional changes related to that ABI. Otherwise we risk quite
 some mess on everyone's side.

 I agree.

 Another thing: KVM_CAP_PPC_HIOR has been removed again from the kernel
 and also the header. Is there real free space now or will the cap
 reappear? If there should better be a placeholder, let's add it (to the
 kernel).

 It will reappear with ONE_REG semantics.


 OK.

 Then please clean up now so that update-linux-headers.sh can be used
 again by normal developers. :)

 Before we did submodules and had a responsive BIOS maintainer, we maintained 
 patches within qemu.git for our external dependencies.  I think that's a 
 good strategy here too.  It's a little painful, but not entirely awful.

 At least it makes it possible for you to (hopefully) trivially rebase a patch 
 if something is still in limbo.
 
 Yeah, that works. I can easily script that part. It doesn't solve the actual 
 underlying problem though that we don't know when the abi is actually stable. 
 I'm slowly starting to understand Pekka ;).

IIRC, we never had this problem with qemu-kvm - as the merges were
coordinated with the kernel (subsystem) tree.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux


Re: [PATCH] vhost-net: add module alias

2012-01-11 Thread Michael S. Tsirkin
On Wed, Jan 11, 2012 at 08:54:26AM -0800, Stephen Hemminger wrote:
 On Wed, 11 Jan 2012 15:43:42 +0800
 Amos Kong kongjian...@gmail.com wrote:
 
  On Wed, Jan 11, 2012 at 12:54 PM, Stephen Hemminger
  shemmin...@vyatta.comwrote:
  
   By adding a module alias, programs (or users) won't have to explicitly
   call modprobe. Vhost-net will always be available if built into the 
   kernel.
   It does require assigning a permanent minor number for depmod to work.
   Choose one next to TUN since this driver is related to it.
  
   Also, use C99 style initialization.
  
   Signed-off-by: Stephen Hemminger shemmin...@vyatta.com
  
   ---
   drivers/vhost/net.c        |    8 +++++---
   include/linux/miscdevice.h |    1 +
   2 files changed, 6 insertions(+), 3 deletions(-)
  
 :
  /*
   *  These allocations are managed by dev...@lanana.org. If you use an
   *  entry that is not in assigned your entry may well be moved and
   *  reassigned, or set dynamic if a fixed value is not justified.
   */
 
 I didn't think that mailing address was used any more. Like many places
 in the kernel, the comment looked like a historical leftover.

This was only added in 2010, see
79907d89c397b8bc2e05b347ec94e928ea919d33.
That said at least lanana.org web site seems to be down.

Alan, any idea?

-- 
MST


Re: [Qemu-devel] State of KVM bits in linux-headers

2012-01-11 Thread Anthony Liguori

On 01/11/2012 01:48 PM, Jan Kiszka wrote:

On 2012-01-11 20:46, Alexander Graf wrote:


On 11.01.2012, at 20:41, Anthony Liguori wrote:


On 01/11/2012 01:38 PM, Jan Kiszka wrote:



I would like to see us avoiding this in the future. Headers update
patches should mention the source and should not be merged until the ABI
changes actually made it at least into kvm.git. Same applies, of course,
to the functional changes related to that ABI. Otherwise we risk quite
some mess on everyone's side.


I agree.


Another thing: KVM_CAP_PPC_HIOR has been removed again from the kernel
and also the header. Is there real free space now or will the cap
reappear? If there should better be a placeholder, let's add it (to the
kernel).


It will reappear with ONE_REG semantics.



OK.

Then please clean up now so that update-linux-headers.sh can be used
again by normal developers. :)


Before we did submodules and had a responsive BIOS maintainer, we maintained 
patches within qemu.git for our external dependencies.  I think that's a good 
strategy here too.  It's a little painful, but not entirely awful.

At least it makes it possible for you to (hopefully) trivially rebase a patch if 
something is still in limbo.


Yeah, that works. I can easily script that part. It doesn't solve the actual 
underlying problem though that we don't know when the abi is actually stable. 
I'm slowly starting to understand Pekka ;).


IIRC, we never had this problem with qemu-kvm - as the merges were
coordinated with the kernel (subsystem) tree.


Are you suggesting that kvm header updates go through uq/master?  That seems 
reasonable to me and is certainly the least amount of change.


Regards,

Anthony Liguori



Jan





Re: [Qemu-devel] State of KVM bits in linux-headers

2012-01-11 Thread Alexander Graf

On 11.01.2012, at 20:52, Anthony Liguori wrote:

 On 01/11/2012 01:48 PM, Jan Kiszka wrote:
 On 2012-01-11 20:46, Alexander Graf wrote:
 
 On 11.01.2012, at 20:41, Anthony Liguori wrote:
 
 On 01/11/2012 01:38 PM, Jan Kiszka wrote:
 
 I would like to see us avoiding this in the future. Headers update
 patches should mention the source and should not be merged until the ABI
 changes actually made it at least into kvm.git. Same applies, of course,
 to the functional changes related to that ABI. Otherwise we risk quite
 some mess on everyone's side.
 
 I agree.
 
 Another thing: KVM_CAP_PPC_HIOR has been removed again from the kernel
 and also the header. Is there real free space now or will the cap
 reappear? If there should better be a placeholder, let's add it (to the
 kernel).
 
 It will reappear with ONE_REG semantics.
 
 
 OK.
 
 Then please clean up now so that update-linux-headers.sh can be used
 again by normal developers. :)
 
 Before we did submodules and had a responsive BIOS maintainer, we 
 maintained patches within qemu.git for our external dependencies.  I think 
 that's a good strategy here too.  It's a little painful, but not entirely 
 awful.
 
 At least it makes it possible for you to (hopefully) trivially rebase a 
 patch if something is still in limbo.
 
 Yeah, that works. I can easily script that part. It doesn't solve the 
 actual underlying problem though that we don't know when the abi is 
 actually stable. I'm slowly starting to understand Pekka ;).
 
 IIRC, we never had this problem with qemu-kvm - as the merges were
 coordinated with the kernel (subsystem) tree.
 
 Are you suggesting that kvm header updates go through uq/master?  That seems 
 reasonable to me and is certainly the least amount of change.

So how about code that actually leverages the new headers?


Alex



Re: [Qemu-devel] State of KVM bits in linux-headers

2012-01-11 Thread Jan Kiszka
On 2012-01-11 20:52, Anthony Liguori wrote:
 On 01/11/2012 01:48 PM, Jan Kiszka wrote:
 On 2012-01-11 20:46, Alexander Graf wrote:

 On 11.01.2012, at 20:41, Anthony Liguori wrote:

 On 01/11/2012 01:38 PM, Jan Kiszka wrote:

 I would like to see us avoiding this in the future. Headers update
 patches should mention the source and should not be merged until the ABI
 changes actually made it at least into kvm.git. Same applies, of course,
 to the functional changes related to that ABI. Otherwise we risk quite
 some mess on everyone's side.

 I agree.

 Another thing: KVM_CAP_PPC_HIOR has been removed again from the kernel
 and also the header. Is there real free space now or will the cap
 reappear? If there should better be a placeholder, let's add it (to the
 kernel).

 It will reappear with ONE_REG semantics.


 OK.

 Then please clean up now so that update-linux-headers.sh can be used
 again by normal developers. :)

 Before we did submodules and had a responsive BIOS maintainer, we 
 maintained patches within qemu.git for our external dependencies.  I think 
 that's a good strategy here too.  It's a little painful, but not entirely 
 awful.

 At least it makes it possible for you to (hopefully) trivially rebase a 
 patch if something is still in limbo.

 Yeah, that works. I can easily script that part. It doesn't solve the 
 actual underlying problem though that we don't know when the abi is 
 actually stable. I'm slowly starting to understand Pekka ;).

 IIRC, we never had this problem with qemu-kvm - as the merges were
 coordinated with the kernel (subsystem) tree.
 
 Are you suggesting that kvm header updates go through uq/master?  That seems 
 reasonable to me and is certainly the least amount of change.

Would be possible at least for changes that affect KVM bits. But we also
use those headers for virtio and vhost. VFIO will surely join that group.
So there is still coordination necessary.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux


Re: [Qemu-devel] State of KVM bits in linux-headers

2012-01-11 Thread Anthony Liguori

On 01/11/2012 01:53 PM, Alexander Graf wrote:


On 11.01.2012, at 20:52, Anthony Liguori wrote:


IIRC, we never had this problem with qemu-kvm - as the merges were
coordinated with the kernel (subsystem) tree.


Are you suggesting that kvm header updates go through uq/master?  That seems 
reasonable to me and is certainly the least amount of change.


So how about code that actually leverages the new headers?


Shared KVM infrastructure should go through uq/master.  So changes to kvm-all.c, 
linux-headers/* should go through uq/master.


Target specific kvm changes should go through the appropriate submaintainer's 
tree.

Regards,

Anthony Liguori




Alex






Re: [PATCH 2/2] KVM: fix missing illegal instruction-trap in protected modes

2012-01-11 Thread Stephan Bärwolf
On 01/11/12 20:09, Marcelo Tosatti wrote:
 On Tue, Jan 10, 2012 at 03:26:49PM +0100, Stephan Bärwolf wrote:
 From 2168285ffb30716f30e129c3ce98ce42d19c4d4e Mon Sep 17 00:00:00 2001
 From: Stephan Baerwolf stephan.baerw...@tu-ilmenau.de
 Date: Sun, 8 Jan 2012 02:03:47 +
 Subject: [PATCH 2/2] KVM: fix missing illegal instruction-trap in
 protected modes

 On hosts without this patch, 32bit guests will crash
 (and 64bit guests may behave in a wrong way) for
 example by simply executing following nasm-demo-application:

 [bits 32]
 global _start
 SECTION .text
 _start: syscall

 (I tested it with winxp and linux - both always crashed)

 Disassembly of section .text:

  _start:
0:   0f 05   syscall

 The reason seems a missing invalid opcode-trap (int6) for the
 syscall opcode 0f05, which is not available on Intel CPUs
 within non-longmodes, as also on some AMD CPUs within legacy-mode.
 (depending on CPU vendor, MSR_EFER and cpuid)

 Because previous mentioned OSs may not engage corresponding
 syscall target-registers (STAR, LSTAR, CSTAR), they remain
 NULL and (non trapping) syscalls are leading to multiple
 faults and finally crashs.

 Depending on the architecture (AMD or Intel) pretended by
 guests, various checks according to vendor's documentation
 are implemented to overcome the current issue and behave
 like the CPUs physical counterparts.

 (Therefore using Intel's Intel 64 and IA-32 Architecture Software
 Developers Manual http://www.intel.com/content/dam/doc/manual/
 64-ia-32-architectures-software-developer-manual-325462.pdf
 and AMD's AMD64 Architecture Programmer's Manual Volume 3:
 General-Purpose and System Instructions
 http://support.amd.com/us/Processor_TechDocs/APM_V3_24594.pdf )

 Screenshots of an i686 testing VM (CORE i5 host) before
 and after applying this patch are available under:

 http://matrixstorm.com/software/linux/kvm/20111229/before.jpg
 http://matrixstorm.com/software/linux/kvm/20111229/after.jpg

 Signed-off-by: Stephan Baerwolf stephan.baerw...@tu-ilmenau.de
 ---
  arch/x86/include/asm/kvm_emulate.h |   15 ++
  arch/x86/kvm/emulate.c |   92
 ++-
  2 files changed, 104 insertions(+), 3 deletions(-)

 diff --git a/arch/x86/include/asm/kvm_emulate.h
 b/arch/x86/include/asm/kvm_emulate.h
 index b172bf4..5b68c23 100644
 --- a/arch/x86/include/asm/kvm_emulate.h
 +++ b/arch/x86/include/asm/kvm_emulate.h
 @@ -301,6 +301,21 @@ struct x86_emulate_ctxt {
  #define X86EMUL_MODE_PROT (X86EMUL_MODE_PROT16|X86EMUL_MODE_PROT32| \
 X86EMUL_MODE_PROT64)
  
 +/* CPUID vendors */
 +#define X86EMUL_CPUID_VENDOR_AuthenticAMD_ebx 0x68747541
 +#define X86EMUL_CPUID_VENDOR_AuthenticAMD_ecx 0x444d4163
 +#define X86EMUL_CPUID_VENDOR_AuthenticAMD_edx 0x69746e65
 +
 +#define X86EMUL_CPUID_VENDOR_AMDisbetter_ebx 0x69444d41
 +#define X86EMUL_CPUID_VENDOR_AMDisbetter_ecx 0x21726574
 +#define X86EMUL_CPUID_VENDOR_AMDisbetter_edx 0x74656273
 +
 +#define X86EMUL_CPUID_VENDOR_GenuineIntel_ebx 0x756e6547
 +#define X86EMUL_CPUID_VENDOR_GenuineIntel_ecx 0x6c65746e
 +#define X86EMUL_CPUID_VENDOR_GenuineIntel_edx 0x49656e69
 +
 +
 +
  enum x86_intercept_stage {
  X86_ICTP_NONE = 0,   /* Allow zero-init to not match anything */
  X86_ICPT_PRE_EXCEPT,
 diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
 index f1e3be1..3357411 100644
 --- a/arch/x86/kvm/emulate.c
 +++ b/arch/x86/kvm/emulate.c
 @@ -1877,6 +1877,94 @@ setup_syscalls_segments(struct x86_emulate_ctxt
 *ctxt,
  ss->p = 1;
  }
  
 +static bool em_syscall_isenabled(struct x86_emulate_ctxt *ctxt)
 +{
 +    struct x86_emulate_ops *ops = ctxt->ops;
 +    u64 efer = 0;
 +
 +    /* syscall is not available in real mode */
 +    if ((ctxt->mode == X86EMUL_MODE_REAL) ||
 +        (ctxt->mode == X86EMUL_MODE_VM86))
 +        return false;
 +
 +    ops->get_msr(ctxt, MSR_EFER, &efer);
 +    /* check - if guestOS is aware of syscall (0x0f05) */
 +    if ((efer & EFER_SCE) == 0) {
 +        return false;
 +    } else {
 +      /* ok, at this point it becomes vendor-specific */
 +      /* so first get us an cpuid */
 +      bool vendor;
 +      u32 eax, ebx, ecx, edx;
 +
 +      /* getting the cpu-vendor */
 +      eax = 0x00000000;
 +      ecx = 0x00000000;
 +      if (likely(ops->get_cpuid))
 +          vendor = ops->get_cpuid(ctxt, &eax, &ebx, &ecx, &edx);
 +      else vendor = false;
 +
 +      if (likely(vendor)) {
 +
 +        /* AMD AuthenticAMD / AMDisbetter! */
 +        if (((ebx==X86EMUL_CPUID_VENDOR_AuthenticAMD_ebx) &&
 +             (ecx==X86EMUL_CPUID_VENDOR_AuthenticAMD_ecx) &&
 +             (edx==X86EMUL_CPUID_VENDOR_AuthenticAMD_edx)) ||
 +            ((ebx==X86EMUL_CPUID_VENDOR_AMDisbetter_ebx) &&
 + 
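
[The quoted patch is cut off at this point in the archive.]

A note on the X86EMUL_CPUID_VENDOR_* constants in the patch above: CPUID
leaf 0 returns the 12-byte vendor string four bytes at a time in EBX, EDX
and ECX (in that order), with each register holding its four characters
little-endian. The sketch below is a standalone userspace illustration and
not part of the patch; pack4(), struct cpuid_vendor and vendor_regs() are
hypothetical helper names chosen here. It shows how the constants can be
derived from the vendor strings.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Pack four ASCII characters into a 32-bit word, first character in the
 * lowest byte, which is how CPUID leaf 0 lays out the vendor string. */
static uint32_t pack4(const char *s)
{
    return (uint32_t)(unsigned char)s[0]
         | (uint32_t)(unsigned char)s[1] << 8
         | (uint32_t)(unsigned char)s[2] << 16
         | (uint32_t)(unsigned char)s[3] << 24;
}

/* Hypothetical container for the three vendor registers. */
struct cpuid_vendor {
    uint32_t ebx, ecx, edx;
};

/* Split a 12-character vendor string into register values.
 * Note the register order: characters 0-3 go to EBX, 4-7 to EDX,
 * and 8-11 to ECX. */
static struct cpuid_vendor vendor_regs(const char *vendor)
{
    struct cpuid_vendor v;

    assert(strlen(vendor) == 12);
    v.ebx = pack4(vendor);
    v.edx = pack4(vendor + 4);
    v.ecx = pack4(vendor + 8);
    return v;
}
```

For example, vendor_regs("GenuineIntel") gives ebx = 0x756e6547,
edx = 0x49656e69 and ecx = 0x6c65746e, which matches the GenuineIntel
defines quoted in the patch.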

Re: [Qemu-devel] State of KVM bits in linux-headers

2012-01-11 Thread Alexander Graf

On 11.01.2012, at 20:59, Anthony Liguori wrote:

 On 01/11/2012 01:53 PM, Alexander Graf wrote:
 
 On 11.01.2012, at 20:52, Anthony Liguori wrote:
 
 IIRC, we never had this problem with qemu-kvm - as the merges were
 coordinated with the kernel (subsystem) tree.
 
 Are you suggesting that kvm header updates go through uq/master?  That 
 seems reasonable to me and is certainly the least amount of change.
 
 So how about code that actually leverages the new headers?
 
 Shared KVM infrastructure should go through uq/master.  So changes to 
 kvm-all.c, linux-headers/* should go through uq/master.
 
 Target specific kvm changes should go through the appropriate submaintainers 
 tree.

So then if I add some target specific stuff to KVM, I have to

  * send pullreq to KVM
  * wait for that to be applied
  * post a patch to uq/master to update headers
  * wait for that to merge back to qemu.git
  * send a pull request to qemu.git

right? And then after about 3 months we'll have the feature available ;).


Alex



Re: [Qemu-devel] State of KVM bits in linux-headers

2012-01-11 Thread Anthony Liguori

On 01/11/2012 02:05 PM, Alexander Graf wrote:


On 11.01.2012, at 20:59, Anthony Liguori wrote:


On 01/11/2012 01:53 PM, Alexander Graf wrote:


On 11.01.2012, at 20:52, Anthony Liguori wrote:


IIRC, we never had this problem with qemu-kvm - as the merges were
coordinated with the kernel (subsystem) tree.


Are you suggesting that kvm header updates go through uq/master?  That seems 
reasonable to me and is certainly the least amount of change.


So how about code that actually leverages the new headers?


Shared KVM infrastructure should go through uq/master.  So changes to 
kvm-all.c, linux-headers/* should go through uq/master.

Target specific kvm changes should go through the appropriate submaintainers 
tree.


So then if I add some target specific stuff to KVM,


That requires a header update?


I have to

   * send pullreq to KVM
   * wait for that to be applied
   * post a patch to uq/master to update headers


Strictly from a QEMU perspective, we can't depend on APIs that aren't committed 
upstream yet.



   * wait for that to merge back to qemu.git
   * send a pull request to qemu.git


Maybe we need to bring a stripped down version of Linux into qemu.git to make it 
easier to simultaneously update both trees... ;-)




right? And then after about 3 months we'll have the feature available ;).


You can always just get Acked-by's from the appropriate maintainers.  That's 
just as good as going through the tree.


Regards,

Anthony Liguori



Alex






use of PMU in guest generates messages in host

2012-01-11 Thread David Ahern
Using latest kernel tree (e343a895a9f342f239c5e3c5ffc6c0b1707e6244)
which has KVM bits for using PMU in the guest. Host and guest are both
running Fedora 16, 64-bit, with this kernel.

Running this command in the guest:
   perf stat -ddd -- openssl speed aes

Generates this in the host:
[74728.221863] kvm_set_msr_common: 2760 callbacks suppressed
[74728.221950] kvm: 28217: cpu2 unhandled wrmsr: 0x1a6 data f701
[74728.222115] kvm: 28217: cpu2 unhandled wrmsr: 0x1a6 data f701
[74728.222858] kvm: 28217: cpu2 unhandled wrmsr: 0x1a6 data f701
[74728.223018] kvm: 28217: cpu2 unhandled wrmsr: 0x1a6 data f701
[74728.223851] kvm: 28217: cpu2 unhandled wrmsr: 0x1a6 data f701
[74728.224009] kvm: 28217: cpu2 unhandled wrmsr: 0x1a6 data f701
[74728.224843] kvm: 28217: cpu2 unhandled wrmsr: 0x1a6 data f701
[74728.224997] kvm: 28217: cpu2 unhandled wrmsr: 0x1a6 data f701
[74728.225842] kvm: 28217: cpu2 unhandled wrmsr: 0x1a6 data f001
[74728.226010] kvm: 28217: cpu2 unhandled wrmsr: 0x1a6 data f001

David
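
When triaging a flood like the one above, it can help to pull the fields out of each line mechanically. The following is a minimal sketch in C that assumes the exact line format shown in this report; the struct and function names are made up for illustration and are not part of KVM. On Intel hardware, MSR 0x1a6 is listed in the kernel's msr-index.h as MSR_OFFCORE_RSP_0, which would fit perf programming offcore-response events, but that reading is an inference and not something the log itself states.

```c
#include <assert.h>
#include <stdio.h>

/* Fields of one "unhandled wrmsr" line as printed by the host kernel. */
struct wrmsr_event {
    unsigned pid;             /* first number after "kvm:" (28217 above) */
    unsigned vcpu;            /* index taken from the "cpuN" field */
    unsigned msr;             /* MSR index, e.g. 0x1a6 */
    unsigned long long data;  /* value the guest tried to write */
};

/* Returns 1 and fills *ev if the line has the expected shape, 0 otherwise. */
static int parse_unhandled_wrmsr(const char *line, struct wrmsr_event *ev)
{
    assert(line != NULL && ev != NULL);
    /* "%*[^]]" skips the "[74728.221950" timestamp up to the closing ']' */
    return sscanf(line,
                  "%*[^]]] kvm: %u: cpu%u unhandled wrmsr: 0x%x data %llx",
                  &ev->pid, &ev->vcpu, &ev->msr, &ev->data) == 4;
}
```

Lines that do not match, such as the "callbacks suppressed" notice, simply return 0 and can be skipped by the caller.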


Re: [PATCH] vhost-net: add module alias (v2)

2012-01-11 Thread Ben Hutchings
On Wed, 2012-01-11 at 09:16 -0800, Stephen Hemminger wrote:
 By adding the correct module alias, programs won't have to explicitly
 call modprobe. Vhost-net will always be available if built into the kernel.
 It does require assigning a permanent minor number for depmod to work.
 Choose one next to TUN since this driver is related to it.
 
 Also, use C99 style initialization.
 
 Signed-off-by: Stephen Hemminger shemmin...@vyatta.com
 
 ---
 v2 - document minor number and make sure to not overlap
[...]
 --- a/include/linux/miscdevice.h  2012-01-10 10:56:59.779189436 -0800
 +++ b/include/linux/miscdevice.h  2012-01-11 09:13:20.803694316 -0800
 @@ -42,6 +42,7 @@
  #define AUTOFS_MINOR		235
  #define MAPPER_CTRL_MINOR	236
  #define LOOP_CTRL_MINOR	237
 +#define VHOST_NET_MINOR	238
  #define MISC_DYNAMIC_MINOR	255
  
  struct device;
 --- a/Documentation/devices.txt   2012-01-10 10:56:53.399116518 -0800
 +++ b/Documentation/devices.txt   2012-01-11 09:12:49.251197653 -0800
 @@ -447,6 +447,8 @@ Your cooperation is appreciated.
   234 = /dev/btrfs-control	Btrfs control device
   235 = /dev/autofs	Autofs control device
   236 = /dev/mapper/control	Device-Mapper control device
 + 237 = /dev/vhost-net	Host kernel accelerator for virtio net
[...]

238 != 237.  It looks like someone forgot to add loopctrl here.

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.



Re: [PATCH 2/2] KVM: fix missing illegal instruction-trap in protected modes

2012-01-11 Thread Marcelo Tosatti
On Wed, Jan 11, 2012 at 09:01:10PM +0100, Stephan Bärwolf wrote:
 On 01/11/12 20:09, Marcelo Tosatti wrote:
  On Tue, Jan 10, 2012 at 03:26:49PM +0100, Stephan Bärwolf wrote:
  From 2168285ffb30716f30e129c3ce98ce42d19c4d4e Mon Sep 17 00:00:00 2001
  From: Stephan Baerwolf stephan.baerw...@tu-ilmenau.de
  Date: Sun, 8 Jan 2012 02:03:47 +
  Subject: [PATCH 2/2] KVM: fix missing illegal instruction-trap in
  protected modes
 
  On hosts without this patch, 32bit guests will crash
  (and 64bit guests may behave in a wrong way) for
  example by simply executing following nasm-demo-application:
 
  [bits 32]
  global _start
  SECTION .text
  _start: syscall
 
  (I tested it with winxp and linux - both always crashed)
 
  Disassembly of section .text:
 
   _start:
 0:   0f 05   syscall
 
  The reason seems a missing invalid opcode-trap (int6) for the
  syscall opcode 0f05, which is not available on Intel CPUs
  within non-longmodes, as also on some AMD CPUs within legacy-mode.
  (depending on CPU vendor, MSR_EFER and cpuid)
 
  Because previous mentioned OSs may not engage corresponding
  syscall target-registers (STAR, LSTAR, CSTAR), they remain
  NULL and (non trapping) syscalls are leading to multiple
  faults and finally crashs.
 
  Depending on the architecture (AMD or Intel) pretended by
  guests, various checks according to vendor's documentation
  are implemented to overcome the current issue and behave
  like the CPUs physical counterparts.
 
  (Therefore using Intel's Intel 64 and IA-32 Architecture Software
  Developers Manual http://www.intel.com/content/dam/doc/manual/
  64-ia-32-architectures-software-developer-manual-325462.pdf
  and AMD's AMD64 Architecture Programmer's Manual Volume 3:
  General-Purpose and System Instructions
  http://support.amd.com/us/Processor_TechDocs/APM_V3_24594.pdf )
 
  Screenshots of an i686 testing VM (CORE i5 host) before
  and after applying this patch are available under:
 
  http://matrixstorm.com/software/linux/kvm/20111229/before.jpg
  http://matrixstorm.com/software/linux/kvm/20111229/after.jpg
 
  Signed-off-by: Stephan Baerwolf stephan.baerw...@tu-ilmenau.de
  ---
   arch/x86/include/asm/kvm_emulate.h |   15 ++
   arch/x86/kvm/emulate.c |   92
  ++-
   2 files changed, 104 insertions(+), 3 deletions(-)
 
  diff --git a/arch/x86/include/asm/kvm_emulate.h
  b/arch/x86/include/asm/kvm_emulate.h
  index b172bf4..5b68c23 100644
  --- a/arch/x86/include/asm/kvm_emulate.h
  +++ b/arch/x86/include/asm/kvm_emulate.h
  @@ -301,6 +301,21 @@ struct x86_emulate_ctxt {
   #define X86EMUL_MODE_PROT (X86EMUL_MODE_PROT16|X86EMUL_MODE_PROT32| \
  X86EMUL_MODE_PROT64)
   
  +/* CPUID vendors */
  +#define X86EMUL_CPUID_VENDOR_AuthenticAMD_ebx 0x68747541
  +#define X86EMUL_CPUID_VENDOR_AuthenticAMD_ecx 0x444d4163
  +#define X86EMUL_CPUID_VENDOR_AuthenticAMD_edx 0x69746e65
  +
  +#define X86EMUL_CPUID_VENDOR_AMDisbetter_ebx 0x69444d41
  +#define X86EMUL_CPUID_VENDOR_AMDisbetter_ecx 0x21726574
  +#define X86EMUL_CPUID_VENDOR_AMDisbetter_edx 0x74656273
  +
  +#define X86EMUL_CPUID_VENDOR_GenuineIntel_ebx 0x756e6547
  +#define X86EMUL_CPUID_VENDOR_GenuineIntel_ecx 0x6c65746e
  +#define X86EMUL_CPUID_VENDOR_GenuineIntel_edx 0x49656e69
  +
  +
  +
   enum x86_intercept_stage {
   X86_ICTP_NONE = 0,   /* Allow zero-init to not match anything */
   X86_ICPT_PRE_EXCEPT,
  diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
  index f1e3be1..3357411 100644
  --- a/arch/x86/kvm/emulate.c
  +++ b/arch/x86/kvm/emulate.c
  @@ -1877,6 +1877,94 @@ setup_syscalls_segments(struct x86_emulate_ctxt
  *ctxt,
   	ss->p = 1;
   }
   
  +static bool em_syscall_isenabled(struct x86_emulate_ctxt *ctxt)
  +{
  +	struct x86_emulate_ops *ops = ctxt->ops;
  +	u64 efer = 0;
  +
  +	/* syscall is not available in real mode*/
  +	if ((ctxt->mode == X86EMUL_MODE_REAL) ||
  +		(ctxt->mode == X86EMUL_MODE_VM86))
  +		return false;
  +
  +	ops->get_msr(ctxt, MSR_EFER, &efer);
  +	/* check - if guestOS is aware of syscall (0x0f05)  */
  +	if ((efer & EFER_SCE) == 0) {
  +		return false;
  +	} else {
  +	  /* ok, at this point it becomes vendor-specific   */
  +	  /* so first get us an cpuid   */
  +	  bool vendor;
  +	  u32 eax, ebx, ecx, edx;
  +
  +	  /* getting the cpu-vendor */
  +	  eax = 0x;
  +	  ecx = 0x;
  +	  if (likely(ops->get_cpuid))
  +		  vendor = ops->get_cpuid(ctxt, &eax, &ebx, &ecx, &edx);
  +	  else	vendor = false;
  +
  +	  if (likely(vendor)) {
  +
  +		/* AMD AuthenticAMD / AMDisbetter!  */
  +		if (((ebx==X86EMUL_CPUID_VENDOR_AuthenticAMD_ebx) &&
  + 
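
The X86EMUL_CPUID_VENDOR_* constants in the quoted patch are the 12-character CPUID vendor strings packed four ASCII bytes per register, in EBX, EDX, ECX order. A minimal sketch that reproduces the values follows; the helper and struct names are hypothetical and only serve to show where the numbers come from.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct cpuid_vendor { uint32_t ebx, ecx, edx; };

/* Pack 4 ASCII bytes into one register value, first byte in the low bits. */
static uint32_t pack4(const char *s)
{
    return (uint32_t)(unsigned char)s[0] |
           (uint32_t)(unsigned char)s[1] << 8 |
           (uint32_t)(unsigned char)s[2] << 16 |
           (uint32_t)(unsigned char)s[3] << 24;
}

/* CPUID leaf 0 returns the vendor string in EBX, then EDX, then ECX. */
static struct cpuid_vendor vendor_to_regs(const char *vendor)
{
    struct cpuid_vendor v;

    assert(strlen(vendor) == 12);
    v.ebx = pack4(vendor);      /* characters 0..3  */
    v.edx = pack4(vendor + 4);  /* characters 4..7  */
    v.ecx = pack4(vendor + 8);  /* characters 8..11 */
    return v;
}
```

For "AuthenticAMD" this yields ebx 0x68747541, ecx 0x444d4163 and edx 0x69746e65, matching the defines in the patch; "GenuineIntel" and "AMDisbetter!" work out the same way.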

Re: [Qemu-devel] State of KVM bits in linux-headers

2012-01-11 Thread Alexander Graf

On 11.01.2012, at 21:16, Anthony Liguori wrote:

 On 01/11/2012 02:05 PM, Alexander Graf wrote:
 
 On 11.01.2012, at 20:59, Anthony Liguori wrote:
 
 On 01/11/2012 01:53 PM, Alexander Graf wrote:
 
 On 11.01.2012, at 20:52, Anthony Liguori wrote:
 
 IIRC, we never had this problem with qemu-kvm - as the merges were
 coordinated with the kernel (subsystem) tree.
 
 Are you suggesting that kvm header updates go through uq/master?  That 
 seems reasonable to me and is certainly the least amount of change.
 
 So how about code that actually leverages the new headers?
 
 Shared KVM infrastructure should go through uq/master.  So changes to 
 kvm-all.c, linux-headers/* should go through uq/master.
 
 Target specific kvm changes should go through the appropriate 
 submaintainers tree.
 
 So then if I add some target specific stuff to KVM,
 
 That requires a header update?

Almost all of the time, yes. The target is still rather incomplete. And even in 
places where it is, hardware evolves and we just get new information we need to 
pass back and forth.

 
 I have to
 
   * send pullreq to KVM
   * wait for that to be applied
   * post a patch to uq/master to update headers
 
 Strictly from a QEMU perspective, we can't depend on APIs that aren't 
 committed upstream yet.

The question again is: When do we consider something upstream?

 
   * wait for that to merge back to qemu.git
   * send a pull request to qemu.git
 
 Maybe we need to bring a stripped down version of Linux into qemu.git to make 
 it easier to simultaneously update both trees... ;-)

Nice one ;)

 
 
 right? And then after about 3 months we'll have the feature available ;).
 
 You can always just get Acked-by's from the appropriate maintainers.  That's 
 just as good as going through the tree.

So every time we change headers, I just require Avi's ack and then he can't 
complain on those patches later? Good idea! :)


Alex



Re: [PATCH 1/2] KVM: Exception during emulation decode should propagate

2012-01-11 Thread Takuya Yoshikawa
On Wed, 11 Jan 2012 18:53:30 +0200
Nadav Amit na...@cs.technion.ac.il wrote:

 An exception might occur during decode (e.g., #PF during fetch).
 Currently, the exception is ignored and emulation is performed.

When I cleaned up insn_fetch(), I thought that fetching the instruction
which is being executed by the guest cannot cause #PF.

The possibility that a meaningless userspace might simultaneously unmap
the page, noted by Avi IIRC, was ignored intentionally, so we just fail
in such a case.

Did you see any real problem?

Takuya


 Instead, emulation should be skipped and the fault should be injected.
 Skipping instruction should report a failure in this case.


Re: [PATCH 2/2] KVM: fix missing illegal instruction-trap in protected modes

2012-01-11 Thread Stephan Bärwolf
On 01/11/12 22:21, Marcelo Tosatti wrote:
 On Wed, Jan 11, 2012 at 09:01:10PM +0100, Stephan Bärwolf wrote:
 On 01/11/12 20:09, Marcelo Tosatti wrote:
 On Tue, Jan 10, 2012 at 03:26:49PM +0100, Stephan Bärwolf wrote:
 From 2168285ffb30716f30e129c3ce98ce42d19c4d4e Mon Sep 17 00:00:00 2001
 From: Stephan Baerwolf stephan.baerw...@tu-ilmenau.de
 Date: Sun, 8 Jan 2012 02:03:47 +
 Subject: [PATCH 2/2] KVM: fix missing illegal instruction-trap in
 protected modes

 On hosts without this patch, 32bit guests will crash
 (and 64bit guests may behave in a wrong way) for
 example by simply executing following nasm-demo-application:

 [bits 32]
 global _start
 SECTION .text
 _start: syscall

 (I tested it with winxp and linux - both always crashed)

 Disassembly of section .text:

  _start:
0:   0f 05   syscall

 The reason seems a missing invalid opcode-trap (int6) for the
 syscall opcode 0f05, which is not available on Intel CPUs
 within non-longmodes, as also on some AMD CPUs within legacy-mode.
 (depending on CPU vendor, MSR_EFER and cpuid)

 Because previous mentioned OSs may not engage corresponding
 syscall target-registers (STAR, LSTAR, CSTAR), they remain
 NULL and (non trapping) syscalls are leading to multiple
 faults and finally crashs.

 Depending on the architecture (AMD or Intel) pretended by
 guests, various checks according to vendor's documentation
 are implemented to overcome the current issue and behave
 like the CPUs physical counterparts.

 (Therefore using Intel's Intel 64 and IA-32 Architecture Software
 Developers Manual http://www.intel.com/content/dam/doc/manual/
 64-ia-32-architectures-software-developer-manual-325462.pdf
 and AMD's AMD64 Architecture Programmer's Manual Volume 3:
 General-Purpose and System Instructions
 http://support.amd.com/us/Processor_TechDocs/APM_V3_24594.pdf )

 Screenshots of an i686 testing VM (CORE i5 host) before
 and after applying this patch are available under:

 http://matrixstorm.com/software/linux/kvm/20111229/before.jpg
 http://matrixstorm.com/software/linux/kvm/20111229/after.jpg

 Signed-off-by: Stephan Baerwolf stephan.baerw...@tu-ilmenau.de
 ---
  arch/x86/include/asm/kvm_emulate.h |   15 ++
  arch/x86/kvm/emulate.c |   92
 ++-
  2 files changed, 104 insertions(+), 3 deletions(-)

 diff --git a/arch/x86/include/asm/kvm_emulate.h
 b/arch/x86/include/asm/kvm_emulate.h
 index b172bf4..5b68c23 100644
 --- a/arch/x86/include/asm/kvm_emulate.h
 +++ b/arch/x86/include/asm/kvm_emulate.h
 @@ -301,6 +301,21 @@ struct x86_emulate_ctxt {
  #define X86EMUL_MODE_PROT (X86EMUL_MODE_PROT16|X86EMUL_MODE_PROT32| \
 X86EMUL_MODE_PROT64)
  
 +/* CPUID vendors */
 +#define X86EMUL_CPUID_VENDOR_AuthenticAMD_ebx 0x68747541
 +#define X86EMUL_CPUID_VENDOR_AuthenticAMD_ecx 0x444d4163
 +#define X86EMUL_CPUID_VENDOR_AuthenticAMD_edx 0x69746e65
 +
 +#define X86EMUL_CPUID_VENDOR_AMDisbetter_ebx 0x69444d41
 +#define X86EMUL_CPUID_VENDOR_AMDisbetter_ecx 0x21726574
 +#define X86EMUL_CPUID_VENDOR_AMDisbetter_edx 0x74656273
 +
 +#define X86EMUL_CPUID_VENDOR_GenuineIntel_ebx 0x756e6547
 +#define X86EMUL_CPUID_VENDOR_GenuineIntel_ecx 0x6c65746e
 +#define X86EMUL_CPUID_VENDOR_GenuineIntel_edx 0x49656e69
 +
 +
 +
  enum x86_intercept_stage {
  X86_ICTP_NONE = 0,   /* Allow zero-init to not match anything */
  X86_ICPT_PRE_EXCEPT,
 diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
 index f1e3be1..3357411 100644
 --- a/arch/x86/kvm/emulate.c
 +++ b/arch/x86/kvm/emulate.c
 @@ -1877,6 +1877,94 @@ setup_syscalls_segments(struct x86_emulate_ctxt
 *ctxt,
  	ss->p = 1;
  }
  
 +static bool em_syscall_isenabled(struct x86_emulate_ctxt *ctxt)
 +{
 +	struct x86_emulate_ops *ops = ctxt->ops;
 +	u64 efer = 0;
 +
 +	/* syscall is not available in real mode*/
 +	if ((ctxt->mode == X86EMUL_MODE_REAL) ||
 +		(ctxt->mode == X86EMUL_MODE_VM86))
 +		return false;
 +
 +	ops->get_msr(ctxt, MSR_EFER, &efer);
 +	/* check - if guestOS is aware of syscall (0x0f05)  */
 +	if ((efer & EFER_SCE) == 0) {
 +		return false;
 +	} else {
 +	  /* ok, at this point it becomes vendor-specific   */
 +	  /* so first get us an cpuid   */
 +	  bool vendor;
 +	  u32 eax, ebx, ecx, edx;
 +
 +	  /* getting the cpu-vendor */
 +	  eax = 0x;
 +	  ecx = 0x;
 +	  if (likely(ops->get_cpuid))
 +		  vendor = ops->get_cpuid(ctxt, &eax, &ebx, &ecx, &edx);
 +	  else	vendor = false;
 +
 +	  if (likely(vendor)) {
 +
 +		/* AMD AuthenticAMD / AMDisbetter!  */
 +		if (((ebx==X86EMUL_CPUID_VENDOR_AuthenticAMD_ebx) &&
 +		     (ecx==X86EMUL_CPUID_VENDOR_AuthenticAMD_ecx) &&
 +		     (edx==X86EMUL_CPUID_VENDOR_AuthenticAMD_edx)) ||
 +  
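
The quoted patch is cut off before the vendor-specific part, so the following is only a sketch of the decision the commit message describes, not the patch's actual code. It assumes the simplest reading: SYSCALL is rejected in real and VM86 mode, rejected when EFER.SCE (bit 0) is clear, and rejected on an Intel-branded guest CPU outside 64-bit mode. AMD and unknown vendors are treated permissively here, which glosses over the "some AMD CPUs within legacy-mode" case. All names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

enum cpu_mode { MODE_REAL, MODE_VM86, MODE_PROT16, MODE_PROT32, MODE_PROT64 };
enum cpu_vendor { VENDOR_INTEL, VENDOR_AMD, VENDOR_OTHER };

#define EFER_SCE_BIT (1ULL << 0)  /* EFER.SCE: system call extensions enabled */

/*
 * Returns true if SYSCALL may be emulated; on false, the caller would
 * inject an invalid-opcode exception (#UD) instead of emulating.
 */
static bool syscall_is_enabled(enum cpu_mode mode, uint64_t efer,
                               enum cpu_vendor vendor)
{
    if (mode == MODE_REAL || mode == MODE_VM86)
        return false;                   /* never available in these modes */
    if ((efer & EFER_SCE_BIT) == 0)
        return false;                   /* guest OS has not enabled it */
    if (vendor == VENDOR_INTEL && mode != MODE_PROT64)
        return false;                   /* Intel: 64-bit mode only */
    return true;                        /* assumption: permissive otherwise */
}
```

With this model, the 32-bit nasm demo from the commit message takes the #UD path on an Intel guest CPU even if EFER.SCE is set, which is the behaviour the patch is after.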

[PATCH] KVM: PPC: refer to paravirt docs in header file

2012-01-11 Thread Scott Wood
Instead of keeping separate copies of struct kvm_vcpu_arch_shared (one in
the code, one in the docs) that inevitably fail to be kept in sync
(already sr[] is missing from the doc version), just point to the header
file as the source of documentation on the contents of the magic page.

Signed-off-by: Scott Wood scottw...@freescale.com
---
 Documentation/virtual/kvm/ppc-pv.txt |   24 ++--
 arch/powerpc/include/asm/kvm_para.h  |   10 ++
 2 files changed, 12 insertions(+), 22 deletions(-)

diff --git a/Documentation/virtual/kvm/ppc-pv.txt 
b/Documentation/virtual/kvm/ppc-pv.txt
index 2b7ce19..6e7c370 100644
--- a/Documentation/virtual/kvm/ppc-pv.txt
+++ b/Documentation/virtual/kvm/ppc-pv.txt
@@ -81,28 +81,8 @@ additional registers to the magic page. If you add fields to 
the magic page,
 also define a new hypercall feature to indicate that the host can give you more
 registers. Only if the host supports the additional features, make use of them.
 
-The magic page has the following layout as described in
-arch/powerpc/include/asm/kvm_para.h:
-
-struct kvm_vcpu_arch_shared {
-   __u64 scratch1;
-   __u64 scratch2;
-   __u64 scratch3;
-   __u64 critical; /* Guest may not get interrupts if == r1 */
-   __u64 sprg0;
-   __u64 sprg1;
-   __u64 sprg2;
-   __u64 sprg3;
-   __u64 srr0;
-   __u64 srr1;
-   __u64 dar;
-   __u64 msr;
-   __u32 dsisr;
-   __u32 int_pending;  /* Tells the guest if we have an interrupt */
-};
-
-Additions to the page must only occur at the end. Struct fields are always 32
-or 64 bit aligned, depending on them being 32 or 64 bit wide respectively.
+The magic page layout is described by struct kvm_vcpu_arch_shared
+in arch/powerpc/include/asm/kvm_para.h.
 
 Magic page features
 ===
diff --git a/arch/powerpc/include/asm/kvm_para.h 
b/arch/powerpc/include/asm/kvm_para.h
index ece70fb..7b754e7 100644
--- a/arch/powerpc/include/asm/kvm_para.h
+++ b/arch/powerpc/include/asm/kvm_para.h
@@ -22,6 +22,16 @@
 
 #include <linux/types.h>
 
+/*
+ * Additions to this struct must only occur at the end, and should be
+ * accompanied by a KVM_MAGIC_FEAT flag to advertise that they are present
+ * (albeit not necessarily relevant to the current target hardware platform).
+ *
+ * Struct fields are always 32 or 64 bit aligned, depending on them being 32
+ * or 64 bit wide respectively.
+ *
+ * See Documentation/virtual/kvm/ppc-pv.txt
+ */
 struct kvm_vcpu_arch_shared {
__u64 scratch1;
__u64 scratch2;
-- 
1.7.7.rc3.4.g8d714
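
The alignment rule moved into the header comment (32-bit fields 32-bit aligned, 64-bit fields 64-bit aligned) can be checked at compile time. A minimal sketch follows, using the field list from the documentation copy that this patch removes, so sr[] and anything else present only in the real header is deliberately absent; the struct name and macro are hypothetical and C11 is assumed.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/* Layout as listed in ppc-pv.txt before the patch; not the full header. */
struct magic_page_doc {
    uint64_t scratch1;
    uint64_t scratch2;
    uint64_t scratch3;
    uint64_t critical;     /* Guest may not get interrupts if == r1 */
    uint64_t sprg0;
    uint64_t sprg1;
    uint64_t sprg2;
    uint64_t sprg3;
    uint64_t srr0;
    uint64_t srr1;
    uint64_t dar;
    uint64_t msr;
    uint32_t dsisr;
    uint32_t int_pending;  /* Tells the guest if we have an interrupt */
};

/* True if the field's offset is a multiple of the field's own size. */
#define NATURALLY_ALIGNED(type, field) \
    (offsetof(type, field) % sizeof(((type *)0)->field) == 0)

static_assert(NATURALLY_ALIGNED(struct magic_page_doc, msr),
              "64-bit field must be 64-bit aligned");
static_assert(NATURALLY_ALIGNED(struct magic_page_doc, dsisr),
              "32-bit field must be 32-bit aligned");
static_assert(NATURALLY_ALIGNED(struct magic_page_doc, int_pending),
              "32-bit field must be 32-bit aligned");
```

A field appended at the end would get one more static_assert line, which keeps the "additions only at the end, naturally aligned" rule visible next to the struct.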



Re: [PATCH] KVM: PPC: refer to paravirt docs in header file

2012-01-11 Thread Alexander Graf

On 12.01.2012, at 00:37, Scott Wood wrote:

 Instead of keeping separate copies of struct kvm_vcpu_arch_shared (one in
 the code, one in the docs) that inevitably fail to be kept in sync
 (already sr[] is missing from the doc version), just point to the header
 file as the source of documentation on the contents of the magic page.
 
 Signed-off-by: Scott Wood scottw...@freescale.com

Avi, please ack.


Alex



RE: [PATCH 3/3] stop the periodic RTC update timer

2012-01-11 Thread Zhang, Yang Z
 -Original Message-
 From: Marcelo Tosatti [mailto:mtosa...@redhat.com]
 
 Regarding the UIP bit, a guest could read it in a loop and wait for the value
 to change. But you can emulate it in cmos_ioport_read by reading the host time,
 that is, return 1 during 244us, 0 for remaining of the second, and have that
 in sync with update-cycle-ended interrupt if its enabled.
Yes, a guest may read the RTC in a loop, but the point is that the guest is
waiting for UIP to change to 0. If this bit is always 0, the guest never enters
the loop. For a real RTC this would be wrong, because the RTC cannot give you a
valid value during the update cycle. But the virtual RTC doesn't need this
logic: whenever you read it, it returns the right value.

best regards
yang
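
For reference, the alternative Marcelo describes (derive UIP from the host clock instead of running a timer) could look roughly like the sketch below. It assumes the 244us window sits at the end of each host second, just before the seconds value ticks over; the function and macro names are hypothetical and this is not QEMU code.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC   1000000000LL
#define UIP_WINDOW_NS  244000LL      /* the 244us mentioned in the quote */

/* Returns the emulated UIP bit (0 or 1) for a host time in nanoseconds. */
static int rtc_uip_from_host_ns(int64_t host_ns)
{
    assert(host_ns >= 0);
    /* UIP reads as 1 only during the last 244us of every second. */
    return (host_ns % NSEC_PER_SEC) >= NSEC_PER_SEC - UIP_WINDOW_NS;
}
```

A register A read handler would OR this bit into the value it returns, so no periodic timer is needed just to toggle UIP.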


Re: [PATCH 1/2] KVM: Exception during emulation decode should propagate

2012-01-11 Thread Takuya Yoshikawa

(2012/01/12 7:11), Takuya Yoshikawa wrote:

On Wed, 11 Jan 2012 18:53:30 +0200
Nadav Amit na...@cs.technion.ac.il wrote:


An exception might occur during decode (e.g., #PF during fetch).
Currently, the exception is ignored and emulation is performed.


Note that the decode/emulation will not be continued in such a case.

insn_fetch() is a slightly tricky macro: it contains a "goto done" that jumps
out of the macro. So if an error happens while fetching the instruction,
x86_decode_insn() handles the X86EMUL_* fault value and returns FAIL immediately.

Takuya
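
The control flow described above can be illustrated with a small stand-alone sketch. Everything here (FETCH, fetch_byte, the EMUL_* and DECODE_* values) is a simplified stand-in invented for the example; the real insn_fetch() and x86_decode_insn() differ in detail.

```c
#include <assert.h>
#include <stddef.h>

enum { EMUL_CONTINUE = 0, EMUL_PROPAGATE_FAULT = 1 };
enum { DECODE_OK = 0, DECODE_FAIL = -1 };

struct fetch_ctx { const unsigned char *buf; size_t len, pos; };

/* Fetch one byte, or report a fault if the buffer is exhausted. */
static int fetch_byte(struct fetch_ctx *c, unsigned char *out)
{
    if (c->pos >= c->len)
        return EMUL_PROPAGATE_FAULT;
    *out = c->buf[c->pos++];
    return EMUL_CONTINUE;
}

/* On failure, set the caller's rc and jump to the caller's "done" label. */
#define FETCH(ctx, dst)                      \
    do {                                     \
        rc = fetch_byte((ctx), &(dst));      \
        if (rc != EMUL_CONTINUE)             \
            goto done;                       \
    } while (0)

static int decode_two_bytes(struct fetch_ctx *c,
                            unsigned char *op, unsigned char *modrm)
{
    int rc = EMUL_CONTINUE;

    assert(c && op && modrm);
    FETCH(c, *op);
    FETCH(c, *modrm);   /* never reached if the first fetch faulted */
done:
    return rc == EMUL_CONTINUE ? DECODE_OK : DECODE_FAIL;
}
```

The hidden jump is why a fetch fault never falls through into further decoding: the function returns a failure as soon as any FETCH reports one.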



When I cleaned up insn_fetch(), I thought that fetching the instruction
which is being executed by the guest cannot cause #PF.

The possibility that a meaningless userspace might simultaneously unmap
the page, noted by Avi IIRC, was ignored intentionally, so we just fail
in such a case.

Did you see any real problem?

Takuya



Instead, emulation should be skipped and the fault should be injected.
Skipping instruction should report a failure in this case.



Re: [PATCH 3/3] Code clean up for percpu_xxx() functions

2012-01-11 Thread H. Peter Anvin

On 01/11/2012 09:19 AM, t...@kernel.org wrote:


Alex, can you please collect all patches into a single patchset?
Please split it such that, usage changes are per-system so that they
can be routed through respective subsystems (x86 or net) and updates
to percpu proper which can be applied after other changes have been
applied.  It would probably be best to route these patches separately
rather than all through percpu as it touches a lot of different places
and is likely to cause conflicts.  I *think* the best way would be,

* Submit per-subsystem patches and get them merged to subsystem trees.

* (Optional) Apply a patch to mark unused interface deprecated in
   percpu tree, so that new usages in linux-next can be detected.

* Towards the end of the next merge window, merge a patch to actually
   kill the old interface.



That sounds like a good idea.

-hpa


RE: [PATCH 3/3] stop the periodic RTC update timer

2012-01-11 Thread Zhang, Yang Z

 -Original Message-
 From: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] On Behalf Of Paolo
 Bonzini
 Because it's not in the spec because some engineer thought it was cool.
It is not cool. We need to do some optimizations to get better performance.

 It's in the spec because it gives you a way to do atomic reads.
 
 QEMU not being a simulator means that we always assume that the RTC is
 programmed for a 32768 Hz clock, for example, because any other setting would
 not make sense on a PC.  We can use a 1-second (or higher, as in your patches)
 timer, rather than a 32768 Hz timer which anyway would not work well.
 
 So we're taking shortcuts, but each of them must be evaluated separately, and
 _this_ shortcut is not acceptable.
 
  Also, is there an actual case that break with my patch?
 
 Any decent unit test for the RTC would break.
Any decent unit test would break the following logic too. The spec provides
three ways to program this, so why do we only handle 0x20? Because this is
emulation, not hardware simulation, and no real OS sets it to another value.
static void rtc_update_second(void *opaque)
{
    RTCState *s = opaque;
    int64_t delay;

    /* if the oscillator is not in normal operation, we do not update */
    if ((s->cmos_data[RTC_REG_A] & 0x70) != 0x20) {
    .
    }


  It means that the (not externally visible) millisecond value is set
  to 500 when you modify the current time of the RTC.  The next update
  of the clock will happen exactly 500 ms after you reset bit
  7 of register B.
 
  Same question, any reason need to complicate the current logic? Or any
  actual usage model need to add this?
 
 Is it really so difficult to implement?
I think the question is whether we really need it, not how difficult it is to
add.

 Note that this case is mentioned in drivers/rtc/rtc-cmos.c in the Linux source
 code, even though it is not used.
Yes, it just mentions that the next update will happen 500 ms later. What's
wrong with this? The highest resolution of the RTC is 1 second; if any software
intends to use the RTC for checks within 1 second, it is wrong.

Anyway, I agree with your point. If we really need those features, I will add
them in the next version. Before that, we need to figure out whether they are
necessary.

Best regards
yang
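
The 0x70/0x20 test in the snippet above looks at the divider-select bits of register A. A minimal sketch of that check as a stand-alone predicate follows, assuming the MC146818 convention that DV2..DV0 = 010 selects the 32768 Hz time base; the macro and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define REG_A_DV_MASK   0x70  /* divider-chain select bits DV2..DV0 */
#define REG_A_DV_32KHZ  0x20  /* DV = 010: normal 32768 Hz operation */

/* True if register A selects the normal 32768 Hz time base. */
static bool rtc_oscillator_running(uint8_t reg_a)
{
    /* UIP (bit 7) and the rate-select bits (3..0) are ignored. */
    return (reg_a & REG_A_DV_MASK) == REG_A_DV_32KHZ;
}
```

Any other divider setting makes the predicate false, which is the case where the quoted rtc_update_second() skips the update.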


[PATCH 0/2] virtio-serial: set up vqs on demand

2012-01-11 Thread zanghongyong
From: Hongyong Zang zanghongy...@huawei.com

Virtio-serial sets up (max_ports+1)*2 vqs when the device is probed, but not
all io_ports may be used.
These patches create the vqs of port0 and the control port when probing the
device, then create the io-vqs when add_port() is called.

Hongyong Zang (2):
  virtio-pci: add setup_vqs flag in vp_try_to_find_vqs
  virtio-serial: setup_port_vq when adding port

 drivers/char/virtio_console.c |   65 ++--
 drivers/virtio/virtio_pci.c   |   17 --
 2 files changed, 74 insertions(+), 8 deletions(-)



[PATCH 1/2] virtio-pci: add setup_vqs flag in vp_try_to_find_vqs

2012-01-11 Thread zanghongyong
From: Hongyong Zang zanghongy...@huawei.com

Changes in vp_try_to_find_vqs:
virtio-serial's probe() calls it to request irqs and set up the vqs of port0
and the control port; add_port() calls it to set up the vqs of an io_port.
It does not create a virtqueue if the name is NULL.

Signed-off-by: Hongyong Zang zanghongy...@huawei.com
---
 drivers/virtio/virtio_pci.c |   17 +
 1 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/virtio/virtio_pci.c b/drivers/virtio/virtio_pci.c
index baabb79..1f98c36 100644
--- a/drivers/virtio/virtio_pci.c
+++ b/drivers/virtio/virtio_pci.c
@@ -492,9 +492,11 @@ static void vp_del_vqs(struct virtio_device *vdev)
 	list_for_each_entry_safe(vq, n, &vdev->vqs, list) {
 		info = vq->priv;
 		if (vp_dev->per_vq_vectors &&
-			info->msix_vector != VIRTIO_MSI_NO_VECTOR)
+			info->msix_vector != VIRTIO_MSI_NO_VECTOR) {
 			free_irq(vp_dev->msix_entries[info->msix_vector].vector,
 				 vq);
+			vp_dev->msix_used_vectors--;
+		}
 		vp_del_vq(vq);
 	}
 	vp_dev->per_vq_vectors = false;
@@ -511,7 +513,10 @@ static int vp_try_to_find_vqs(struct virtio_device *vdev, unsigned nvqs,
 {
 	struct virtio_pci_device *vp_dev = to_vp_device(vdev);
 	u16 msix_vec;
-	int i, err, nvectors, allocated_vectors;
+	int i, err, nvectors;
+
+	if (vp_dev->msix_used_vectors)
+		goto setup_vqs;
 
 	if (!use_msix) {
 		/* Old style: one normal interrupt for change and all vqs. */
@@ -536,12 +541,16 @@ static int vp_try_to_find_vqs(struct virtio_device *vdev, unsigned nvqs,
 	}
 
 	vp_dev->per_vq_vectors = per_vq_vectors;
-	allocated_vectors = vp_dev->msix_used_vectors;
+
+setup_vqs:
 	for (i = 0; i < nvqs; ++i) {
+		if (names[i] == NULL)
+			continue;
+
 		if (!callbacks[i] || !vp_dev->msix_enabled)
 			msix_vec = VIRTIO_MSI_NO_VECTOR;
 		else if (vp_dev->per_vq_vectors)
-			msix_vec = allocated_vectors++;
+			msix_vec = vp_dev->msix_used_vectors++;
 		else
 			msix_vec = VP_MSIX_VQ_VECTOR;
 		vqs[i] = setup_vq(vdev, i, callbacks[i], names[i], msix_vec);
-- 
1.7.1
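
The effect of the NULL-name check together with the persistent msix_used_vectors counter can be modelled outside the kernel. The sketch below covers only the per-vq-vector case with MSI-X enabled; the function, the VQ_SKIPPED marker and the NO_VECTOR stand-in for VIRTIO_MSI_NO_VECTOR are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

#define NO_VECTOR   0xffff  /* stands in for VIRTIO_MSI_NO_VECTOR */
#define VQ_SKIPPED  (-1)    /* slot not set up by this call */

/*
 * For each queue slot: skip it if it has no name, give it no vector if it
 * has no callback, otherwise hand out the next unused vector. The counter
 * is owned by the caller so that a later call continues where this one
 * stopped.
 */
static void assign_vectors(const char *const names[], const int has_cb[],
                           unsigned nvqs, unsigned *used_vectors,
                           int vec_out[])
{
    unsigned i;

    assert(names && has_cb && used_vectors && vec_out);
    for (i = 0; i < nvqs; ++i) {
        if (names[i] == NULL) {
            vec_out[i] = VQ_SKIPPED;
            continue;
        }
        vec_out[i] = has_cb[i] ? (int)(*used_vectors)++ : NO_VECTOR;
    }
}
```

Calling it a second time with different non-NULL names mirrors what add_port() relies on: earlier slots are left alone and new queues receive fresh vector numbers.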



[PATCH 2/2] virtio-serial: setup_port_vq when adding port

2012-01-11 Thread zanghongyong
From: Hongyong Zang zanghongy...@huawei.com

Add setup_port_vq(). Create the io ports' vqs in add_port().

Signed-off-by: Hongyong Zang zanghongy...@huawei.com
---
 drivers/char/virtio_console.c |   65 ++--
 1 files changed, 61 insertions(+), 4 deletions(-)

diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
index 8e3c46d..2e5187e 100644
--- a/drivers/char/virtio_console.c
+++ b/drivers/char/virtio_console.c
@@ -1132,6 +1132,55 @@ static void send_sigio_to_port(struct port *port)
 		kill_fasync(&port->async_queue, SIGIO, POLL_OUT);
 }
 
+static void in_intr(struct virtqueue *vq);
+static void out_intr(struct virtqueue *vq);
+
+static int setup_port_vq(struct ports_device *portdev,  u32 id)
+{
+	int err, vq_num;
+	vq_callback_t **io_callbacks;
+	char **io_names;
+	struct virtqueue **vqs;
+	u32 i,j,nr_ports,nr_queues;
+
+	err = 0;
+	vq_num = (id + 1) * 2;
+	nr_ports = portdev->config.max_nr_ports;
+	nr_queues = use_multiport(portdev) ? (nr_ports + 1) * 2 : 2;
+
+	vqs = kmalloc(nr_queues * sizeof(struct virtqueue *), GFP_KERNEL);
+	io_callbacks = kmalloc(nr_queues * sizeof(vq_callback_t *), GFP_KERNEL);
+	io_names = kmalloc(nr_queues * sizeof(char *), GFP_KERNEL);
+	if (!vqs || !io_callbacks || !io_names) {
+		err = -ENOMEM;
+		goto free;
+	}
+
+	for (i = 0, j = 0; i <= nr_ports; i++) {
+		io_callbacks[j] = in_intr;
+		io_callbacks[j + 1] = out_intr;
+		io_names[j] = NULL;
+		io_names[j + 1] = NULL;
+		j += 2;
+	}
+	io_names[vq_num] = "serial-input";
+	io_names[vq_num + 1] = "serial-output";
+	err = portdev->vdev->config->find_vqs(portdev->vdev, nr_queues, vqs,
+				io_callbacks,
+				(const char **)io_names);
+	if (err)
+		goto free;
+	portdev->in_vqs[id] = vqs[vq_num];
+	portdev->out_vqs[id] = vqs[vq_num + 1];
+
+free:
+	kfree(io_names);
+	kfree(io_callbacks);
+	kfree(vqs);
+
+	return err;
+}
+
 static int add_port(struct ports_device *portdev, u32 id)
 {
 	char debugfs_name[16];
@@ -1163,6 +1212,14 @@ static int add_port(struct ports_device *portdev, u32 id)
 
 	port->outvq_full = false;
 
+	if (!portdev->in_vqs[port->id] && !portdev->out_vqs[port->id]) {
+		spin_lock(&portdev->ports_lock);
+		err = setup_port_vq(portdev, port->id);
+		spin_unlock(&portdev->ports_lock);
+		if (err)
+			goto free_port;
+	}
+
 	port->in_vq = portdev->in_vqs[port->id];
 	port->out_vq = portdev->out_vqs[port->id];
 
@@ -1614,8 +1671,8 @@ static int init_vqs(struct ports_device *portdev)
 			j += 2;
 			io_callbacks[j] = in_intr;
 			io_callbacks[j + 1] = out_intr;
-			io_names[j] = "input";
-			io_names[j + 1] = "output";
+			io_names[j] = NULL;
+			io_names[j + 1] = NULL;
 		}
 	}
 	/* Find the queues. */
@@ -1635,8 +1692,8 @@ static int init_vqs(struct ports_device *portdev)
 
 		for (i = 1; i < nr_ports; i++) {
 			j += 2;
-			portdev->in_vqs[i] = vqs[j];
-			portdev->out_vqs[i] = vqs[j + 1];
+			portdev->in_vqs[i] = NULL;
+			portdev->out_vqs[i] = NULL;
 		}
 	}
 	kfree(io_names);
-- 
1.7.1
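
The index arithmetic used by setup_port_vq() can be spelled out in a few lines. This sketch assumes the queue layout the patch relies on: port 0 owns queues 0 and 1, the control port owns queues 2 and 3, and every further port id >= 1 owns the pair starting at (id + 1) * 2. The helper names are hypothetical.

```c
#include <assert.h>

/* Total queues the device exposes, as computed for nr_queues in the patch. */
static unsigned total_queues(unsigned max_nr_ports, int multiport)
{
    return multiport ? (max_nr_ports + 1) * 2 : 2;
}

/* Input queue index of an io port; the formula only holds for id >= 1. */
static unsigned port_in_vq_index(unsigned id)
{
    assert(id >= 1);   /* port 0 uses queue 0; queue 2 is the control queue */
    return (id + 1) * 2;
}

/* The output queue always directly follows the input queue. */
static unsigned port_out_vq_index(unsigned id)
{
    return port_in_vq_index(id) + 1;
}
```

For example, port 1 maps to queues 4 and 5, and a device with 31 ports in multiport mode exposes 64 queues, which is the up-front allocation the series avoids until a port is actually added.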



Re: [PATCH 3/3] Code clean up for percpu_xxx() functions

2012-01-11 Thread Alex,Shi
On Wed, 2012-01-11 at 16:44 -0800, H. Peter Anvin wrote:
 On 01/11/2012 09:19 AM, t...@kernel.org wrote:
 
  Alex, can you please collect all patches into a single patchset?
  Please split it such that, usage changes are per-system so that they
  can be routed through respective subsystems (x86 or net) and updates
  to percpu proper which can be applied after other changes have been
  applied.  It would probably be best to route these patches separately
  rather than all through percpu as it touches a lot of different places
  and is likely to cause conflicts.  I *think* the best way would be,
 
  * Submit per-subsystem patches and get them merged to subsystem trees.
 
  * (Optional) Apply a patch to mark unused interface deprecated in
 percpu tree, so that new usages in linux-next can be detected.
 
  * Towards the end of the next merge window, merge a patch to actually
 kill the old interface.
 
 
 That sounds like a good idea.

I will try to do so. Many thanks for the advices! 
 
   -hpa




Re: [PATCH] vhost-net: add module alias (v2)

2012-01-11 Thread Zhi Yong Wu
On Thu, Jan 12, 2012 at 1:16 AM, Stephen Hemminger
shemmin...@vyatta.com wrote:
 By adding the correct module alias, programs won't have to explicitly
 call modprobe. Vhost-net will always be available if built into the kernel.
 It does require assigning a permanent minor number for depmod to work.
 Choose one next to TUN since this driver is related to it.

 Also, use C99 style initialization.

 Signed-off-by: Stephen Hemminger shemmin...@vyatta.com

 ---
 v2 - document minor number and make sure to not overlap

  Documentation/devices.txt  |    2 ++
  drivers/vhost/net.c        |    8 +++++---
  include/linux/miscdevice.h |    1 +
  3 files changed, 8 insertions(+), 3 deletions(-)

 --- a/drivers/vhost/net.c       2012-01-10 10:56:58.883179194 -0800
 +++ b/drivers/vhost/net.c       2012-01-10 19:48:23.650225892 -0800
 @@ -856,9 +856,9 @@ static const struct file_operations vhos
  };

  static struct miscdevice vhost_net_misc = {
 -       MISC_DYNAMIC_MINOR,
 -       "vhost-net",
 -       &vhost_net_fops,
 +       .minor = VHOST_NET_MINOR,
 +       .name = "vhost-net",
 +       .fops = &vhost_net_fops,
  };

  static int vhost_net_init(void)
 @@ -879,3 +879,5 @@ MODULE_VERSION("0.0.1");
  MODULE_LICENSE("GPL v2");
  MODULE_AUTHOR("Michael S. Tsirkin");
  MODULE_DESCRIPTION("Host kernel accelerator for virtio net");
 +MODULE_ALIAS_MISCDEV(VHOST_NET_MINOR);
 +MODULE_ALIAS("devname:vhost-net");
 --- a/include/linux/miscdevice.h        2012-01-10 10:56:59.779189436 -0800
 +++ b/include/linux/miscdevice.h        2012-01-11 09:13:20.803694316 -0800
 @@ -42,6 +42,7 @@
  #define AUTOFS_MINOR           235
  #define MAPPER_CTRL_MINOR      236
  #define LOOP_CTRL_MINOR                237
 +#define VHOST_NET_MINOR                238
  #define MISC_DYNAMIC_MINOR     255

  struct device;
 --- a/Documentation/devices.txt 2012-01-10 10:56:53.399116518 -0800
 +++ b/Documentation/devices.txt 2012-01-11 09:12:49.251197653 -0800
 @@ -447,6 +447,8 @@ Your cooperation is appreciated.
                234 = /dev/btrfs-control        Btrfs control device
                235 = /dev/autofs       Autofs control device
                236 = /dev/mapper/control       Device-Mapper control device
 +               237 = /dev/vhost-net    Host kernel accelerator for virtio net
 +
Shouldn't this be 238? The entry for LOOP_CTRL (237) also seems to be missing.

                240-254                 Reserved for local use
                255                     Reserved for MISC_DYNAMIC_MINOR



 ___
 Virtualization mailing list
 virtualizat...@lists.linux-foundation.org
 https://lists.linuxfoundation.org/mailman/listinfo/virtualization



-- 
Regards,

Zhi Yong Wu


Re: [PATCH v5 00/13] KVM/ARM Implementation

2012-01-11 Thread Christoffer Dall

On Jan 11, 2012, at 8:48 AM, Peter Maydell wrote:

 On 11 December 2011 19:23, Christoffer Dall
 c.d...@virtualopensystems.com wrote:
 On Sun, Dec 11, 2011 at 6:32 AM, Peter Maydell peter.mayd...@linaro.org 
 wrote:
 On 11 December 2011 10:24, Christoffer Dall
 c.d...@virtualopensystems.com wrote:
 Still on the to-do list:
  - Reuse VMIDs
  - Fix SMP host support
  - Fix SMP guest support
  - Support guest Thumb mode for MMIO emulation
  - Further testing
  - Performance improvements
 
 Other items for this list:
  - Support Neon/VFP in guests (the fpu regs struct is empty ATM)
  - Support guest debugging
 
 ok, thanks, will add these to the list. I have a feeling it will keep
 growing for a while :)
 
 Do you have a kernel-side TODO list somewhere public (wiki page?)
 

I wanted to create this as issues on the github repos...

 (It would be quite useful to be able to boot a reasonably modern
 [read, ARMv7, Thumb2, VFPv3] guest userspace; does anybody plan
 to work on this part soon?)

We have booted the linaro init environment and recent Angstrom distributions. 
Android is being actively tested. What specifically did you have in mind?

-Christoffer


Re: [RFC PATCH 04/16] KVM: PPC: factor out lpid allocator from book3s_64_mmu_hv

2012-01-11 Thread Paul Mackerras
On Mon, Jan 09, 2012 at 04:35:52PM +0100, Alexander Graf wrote:

 Paul, does this work for you? IIRC you need this code to be
 available from real mode, which powerpc.c isn't in, right?

We don't need to allocate LPIDs from real mode, so it should be OK.
book3s_64_mmu_hv.c is not real mode code, and it gets compiled into
the KVM module.
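
For readers who have not seen the patch under discussion, the general
idea of a small ID allocator can be sketched in userspace C as below. This
is only an illustration of a bitmap allocator under assumed names
(NR_IDS, id_alloc, id_free are hypothetical); it is not the code from the
patch, and a kernel version would need atomic bit operations or a lock.

```c
#include <assert.h>
#include <limits.h>

#define NR_IDS        64   /* hypothetical pool size */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define NR_WORDS      ((NR_IDS + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* One bit per ID; a set bit means the ID is in use. */
static unsigned long id_map[NR_WORDS];

/* Return the lowest free ID and mark it used, or -1 if none is left. */
static int id_alloc(void)
{
    unsigned int id;

    for (id = 0; id < NR_IDS; id++) {
        unsigned long mask = 1UL << (id % BITS_PER_WORD);
        unsigned long *word = &id_map[id / BITS_PER_WORD];

        if (!(*word & mask)) {
            *word |= mask;
            return (int)id;
        }
    }
    return -1;
}

/* Release a previously allocated ID so it can be handed out again. */
static void id_free(int id)
{
    assert(id >= 0 && id < NR_IDS);
    id_map[id / BITS_PER_WORD] &= ~(1UL << (id % BITS_PER_WORD));
}
```

Since nothing here has to run with translation off, such an allocator can
live in ordinary module code, which is the point made above.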

Paul.


Could anybody give some description of the implementation of hypercalls in KVM?

2012-01-11 Thread Liu ping fan
Hi,

Could anybody describe how hypercalls are implemented in
KVM, or point me to some links about it?

Thanks,
ping fan


[PATCH] vhost-net: add module alias (v2.1)

2012-01-11 Thread Stephen Hemminger
Subject: vhost-net: add module alias (v2.1)

By adding some module aliases, programs (or users) won't have to explicitly
call modprobe. Vhost-net will always be available if built into the kernel.
It does require assigning a permanent minor number for depmod to work.

Also:
  - use C99 style initialization.
  - add missing entry in documentation for loop-control

Signed-off-by: Stephen Hemminger shemmin...@vyatta.com

---
2.1 - add missing documentation for loop control as well

 Documentation/devices.txt  |    3 +++
 drivers/vhost/net.c        |    8 +++++---
 include/linux/miscdevice.h |    1 +
 3 files changed, 9 insertions(+), 3 deletions(-)

--- a/drivers/vhost/net.c       2012-01-10 10:56:58.883179194 -0800
+++ b/drivers/vhost/net.c       2012-01-10 19:48:23.650225892 -0800
@@ -856,9 +856,9 @@ static const struct file_operations vhos
 };
 
 static struct miscdevice vhost_net_misc = {
-       MISC_DYNAMIC_MINOR,
-       "vhost-net",
-       &vhost_net_fops,
+       .minor = VHOST_NET_MINOR,
+       .name = "vhost-net",
+       .fops = &vhost_net_fops,
 };
 
 static int vhost_net_init(void)
@@ -879,3 +879,5 @@ MODULE_VERSION("0.0.1");
 MODULE_LICENSE("GPL v2");
 MODULE_AUTHOR("Michael S. Tsirkin");
 MODULE_DESCRIPTION("Host kernel accelerator for virtio net");
+MODULE_ALIAS_MISCDEV(VHOST_NET_MINOR);
+MODULE_ALIAS("devname:vhost-net");
--- a/include/linux/miscdevice.h        2012-01-10 10:56:59.779189436 -0800
+++ b/include/linux/miscdevice.h        2012-01-11 09:13:20.803694316 -0800
@@ -42,6 +42,7 @@
 #define AUTOFS_MINOR           235
 #define MAPPER_CTRL_MINOR      236
 #define LOOP_CTRL_MINOR        237
+#define VHOST_NET_MINOR        238
 #define MISC_DYNAMIC_MINOR     255
 
 struct device;
--- a/Documentation/devices.txt 2012-01-10 10:56:53.399116518 -0800
+++ b/Documentation/devices.txt 2012-01-11 13:17:07.882113340 -0800
@@ -447,6 +447,9 @@ Your cooperation is appreciated.
                234 = /dev/btrfs-control        Btrfs control device
                235 = /dev/autofs       Autofs control device
                236 = /dev/mapper/control       Device-Mapper control device
+               237 = /dev/loop-control Loopback control device
+               238 = /dev/vhost-net    Host kernel accelerator for virtio net
+
                240-254                 Reserved for local use
                255                     Reserved for MISC_DYNAMIC_MINOR
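
As an illustration of why a permanent minor is needed, the following
userspace sketch imitates the alias string that MODULE_ALIAS_MISCDEV is
expected to produce for this minor. The STRINGIFY and MISCDEV_ALIAS macros
are stand-ins written for this example (assumptions, not kernel headers);
MISC_MAJOR is taken to be 10, the misc character device major.

```c
#include <assert.h>
#include <string.h>

/* Two-level stringification so that macro arguments are expanded first. */
#define STRINGIFY_1(x) #x
#define STRINGIFY(x)   STRINGIFY_1(x)

#define MISC_MAJOR      10   /* major number of misc character devices */
#define VHOST_NET_MINOR 238  /* minor assigned by the patch above */

/* Userspace imitation of the alias: "char-major-<major>-<minor>". */
#define MISCDEV_ALIAS(minor) \
    "char-major-" STRINGIFY(MISC_MAJOR) "-" STRINGIFY(minor)

/* Return the alias string a module loader would look up for vhost-net. */
static const char *vhost_net_alias(void)
{
    return MISCDEV_ALIAS(VHOST_NET_MINOR);
}
```

With a dynamic minor the number is not known at build time, so no such
string could be embedded in the module.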
 




Re: use of PMU in guest generates messages in host

2012-01-11 Thread Gleb Natapov
On Wed, Jan 11, 2012 at 01:47:55PM -0700, David Ahern wrote:
 Using latest kernel tree (e343a895a9f342f239c5e3c5ffc6c0b1707e6244)
 which has KVM bits for using PMU in the guest. Host and guest are both
 running Fedora 16, 64-bit, with this kernel.
 
 Running this command in the guest:
perf stat -ddd -- openssl speed aes
 
 Generates this in the host:
 [74728.221863] kvm_set_msr_common: 2760 callbacks suppressed
 [74728.221950] kvm: 28217: cpu2 unhandled wrmsr: 0x1a6 data f701
 [74728.222115] kvm: 28217: cpu2 unhandled wrmsr: 0x1a6 data f701
 [74728.222858] kvm: 28217: cpu2 unhandled wrmsr: 0x1a6 data f701
 [74728.223018] kvm: 28217: cpu2 unhandled wrmsr: 0x1a6 data f701
 [74728.223851] kvm: 28217: cpu2 unhandled wrmsr: 0x1a6 data f701
 [74728.224009] kvm: 28217: cpu2 unhandled wrmsr: 0x1a6 data f701
 [74728.224843] kvm: 28217: cpu2 unhandled wrmsr: 0x1a6 data f701
 [74728.224997] kvm: 28217: cpu2 unhandled wrmsr: 0x1a6 data f701
 [74728.225842] kvm: 28217: cpu2 unhandled wrmsr: 0x1a6 data f001
 [74728.226010] kvm: 28217: cpu2 unhandled wrmsr: 0x1a6 data f001
 
This is the MSR_OFFCORE_RSP_0 MSR, which is not (yet?) supported. What are
your host CPU and qemu command line?
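
For anyone trying to decode such messages, a small userspace sketch
follows. It parses one "unhandled wrmsr" line (without the dmesg timestamp)
and maps the MSR index to a name. The function names are hypothetical, and
only the two off-core response MSRs are listed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MSR_OFFCORE_RSP_0 0x1a6
#define MSR_OFFCORE_RSP_1 0x1a7

/* Map a few MSR indices to names; everything else is "unknown". */
static const char *msr_name(unsigned int msr)
{
    switch (msr) {
    case MSR_OFFCORE_RSP_0: return "MSR_OFFCORE_RSP_0";
    case MSR_OFFCORE_RSP_1: return "MSR_OFFCORE_RSP_1";
    default:                return "unknown";
    }
}

/* Parse "kvm: <pid>: cpu<N> unhandled wrmsr: 0x<msr> data <hex>".
 * Returns 1 if all fields were read, 0 otherwise. */
static int parse_unhandled_wrmsr(const char *line, unsigned int *cpu,
                                 unsigned int *msr, unsigned long long *data)
{
    int pid;

    return sscanf(line, "kvm: %d: cpu%u unhandled wrmsr: 0x%x data %llx",
                  &pid, cpu, msr, data) == 4;
}
```

Feeding it one of the lines quoted above yields cpu 2, MSR 0x1a6 and data
0xf701.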

--
Gleb.


Re: [Qemu-devel] State of KVM bits in linux-headers

2012-01-11 Thread Gleb Natapov
On Wed, Jan 11, 2012 at 08:46:38PM +0100, Alexander Graf wrote:
 
 On 11.01.2012, at 20:41, Anthony Liguori wrote:
 
  On 01/11/2012 01:38 PM, Jan Kiszka wrote:
  
  I would like to see us avoiding this in the future. Headers update
  patches should mention the source and should not be merged until the ABI
  changes actually made it at least into kvm.git. Same applies, of course,
  to the functional changes related to that ABI. Otherwise we risk quite
  some mess on everyone's side.
  
  I agree.
  
  Another thing: KVM_CAP_PPC_HIOR has been removed again from the kernel
  and also the header. Is there real free space now or will the cap
  reappear? If there should better be a placeholder, let's add it (to the
  kernel).
  
  It will reappear with ONE_REG semantics.
  
  
  OK.
  
  Then please clean up now so that update-linux-headers.sh can be used
  again by normal developers. :)
  
  Before we did submodules and had a responsive BIOS maintainer, we 
  maintained patches within qemu.git for our external dependencies.  I think 
  that's a good strategy here too.  It's a little painful, but not entirely 
  awful.
  
  At least it makes it possible for you to (hopefully) trivially rebase a patch 
  if something is still in limbo.
 
 Yeah, that works. I can easily script that part. It doesn't solve the actual 
 underlying problem though that we don't know when the abi is actually stable. 
 I'm slowly starting to understand Pekka ;).
 
 
In my recent experience with submitting Joerg's patch series, which
touches both the kernel and tools/perf, I didn't see any advantage in
having them in the same repository. Yes, the repository is the same,
but the maintainers are different and have their own timelines and
priorities. Long story short, the userspace part was applied almost three
months after the kernel part.

--
Gleb.


Re: [RFC PATCH 14/16] KVM: PPC: booke: category E.HV (GS-mode) support

2012-01-11 Thread Benjamin Herrenschmidt
On Tue, 2012-01-10 at 04:11 +0100, Alexander Graf wrote:
 This is what book3s does:
 
         case EMULATE_FAIL:
                 printk(KERN_CRIT "%s: emulation at %lx failed (%08x)\n",
                        __func__, kvmppc_get_pc(vcpu),
                        kvmppc_get_last_inst(vcpu));
                 kvmppc_core_queue_program(vcpu, flags);
                 r = RESUME_GUEST;
 
 which also doesn't throttle the printk, but I think injecting a
 program fault into the guest is the most sensible thing to do if we
 don't know what the instruction is supposed to do. Best case we get an
 oops inside the guest telling us what broke :).

You can also fall back to a slow path that reads the guest TLB,
translates, and then reads the instruction. Of course you have to be careful,
as such a manual translate + read + execute needs to be somewhat
synchronized with a possible TLB invalidation :-)
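
One common way to get that kind of synchronization is a generation
counter that is bumped on every invalidation and rechecked after the read.
The sketch below is a single-threaded toy in userspace C with hypothetical
names (struct guest_mmu, read_guest_inst); real code would also need memory
barriers or atomics, and this is not how any existing KVM path is written.

```c
#include <assert.h>
#include <stdint.h>

struct guest_mmu {
    unsigned int tlb_gen;      /* bumped on every invalidation */
    unsigned int map[16];      /* toy "TLB": virtual index -> physical index */
    uint32_t memory[16];       /* toy guest memory, one word per index */
    void (*before_read)(struct guest_mmu *); /* hook simulating a race */
};

/* Change a mapping and advertise the change through the generation. */
static void tlb_invalidate(struct guest_mmu *mmu, unsigned int vidx,
                           unsigned int new_pidx)
{
    mmu->map[vidx] = new_pidx;
    mmu->tlb_gen++;
}

/* Hook for the example: remap entry 0 once, in the middle of a read. */
static void racing_invalidate_once(struct guest_mmu *mmu)
{
    mmu->before_read = 0;
    tlb_invalidate(mmu, 0, 1);
}

/* Translate, read, then retry if an invalidation happened in between.
 * Returns 0 on success, -1 if the mapping kept changing. */
static int read_guest_inst(struct guest_mmu *mmu, unsigned int vidx,
                           uint32_t *inst)
{
    int tries;

    assert(vidx < 16);
    for (tries = 0; tries < 3; tries++) {
        unsigned int gen = mmu->tlb_gen;
        unsigned int pidx = mmu->map[vidx];
        uint32_t val;

        if (mmu->before_read)
            mmu->before_read(mmu);
        val = mmu->memory[pidx];
        if (gen == mmu->tlb_gen) {   /* no invalidation raced with us */
            *inst = val;
            return 0;
        }
    }
    return -1;
}
```

In the example run, the first attempt reads through a stale translation,
notices the generation change, and the retry returns the word behind the
new mapping.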

(MMIO emulation is broken in this regard too btw)

Cheers,
Ben.




Re: [RFC PATCH 14/16] KVM: PPC: booke: category E.HV (GS-mode) support

2012-01-11 Thread Alexander Graf


On 12.01.2012, at 07:44, Benjamin Herrenschmidt b...@kernel.crashing.org 
wrote:

 On Tue, 2012-01-10 at 04:11 +0100, Alexander Graf wrote:
 This is what book3s does:
 
         case EMULATE_FAIL:
                 printk(KERN_CRIT "%s: emulation at %lx failed (%08x)\n",
                        __func__, kvmppc_get_pc(vcpu),
                        kvmppc_get_last_inst(vcpu));
                 kvmppc_core_queue_program(vcpu, flags);
                 r = RESUME_GUEST;
 
 which also doesn't throttle the printk, but I think injecting a
 program fault into the guest is the most sensible thing to do if we
 don't know what the instruction is supposed to do. Best case we get an
 oops inside the guest telling us what broke :).
 
 You can also fall back to a slow path that reads the guest TLB,
 translates, and then reads the instruction. Of course you have to be careful,
 as such a manual translate + read + execute needs to be somewhat
 synchronized with a possible TLB invalidation :-)

Well, we do want to be fast on the default path though. So yes, what you're 
saying is what book3s does, but as a fallback in case the fast path didn't work.

The problem here, however, is that we can't tell that the fast path failed; we just oops.
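
To make the distinction concrete, here is a toy userspace sketch of a
fetch whose fast path reports a miss instead of faulting, so the caller can
fall back to a slow walk. All names (struct toy_vcpu, fetch_fast,
fetch_slow, fetch_inst) are hypothetical and only illustrate the control
flow, not the booke code.

```c
#include <assert.h>
#include <stdint.h>

enum fetch_status { FETCH_OK = 0, FETCH_MISS = -1 };

struct toy_vcpu {
    int cache_valid;            /* does the one-entry cache hold anything? */
    uint32_t cached_pc;         /* pc whose instruction is cached */
    uint32_t cached_inst;
    uint32_t memory[8];         /* toy guest memory, indexed by pc / 4 */
    unsigned int slow_fetches;  /* how often the slow path was taken */
};

/* Fast path: succeeds only on a cache hit and never touches memory. */
static enum fetch_status fetch_fast(const struct toy_vcpu *v, uint32_t pc,
                                    uint32_t *inst)
{
    if (!v->cache_valid || v->cached_pc != pc)
        return FETCH_MISS;
    *inst = v->cached_inst;
    return FETCH_OK;
}

/* Slow path: bounds-checked read from the toy guest memory. */
static enum fetch_status fetch_slow(struct toy_vcpu *v, uint32_t pc,
                                    uint32_t *inst)
{
    assert(pc % 4 == 0);
    if (pc / 4 >= 8)
        return FETCH_MISS;
    v->slow_fetches++;
    *inst = v->memory[pc / 4];
    return FETCH_OK;
}

/* Try the fast path first; fall back only when it reports a miss. */
static enum fetch_status fetch_inst(struct toy_vcpu *v, uint32_t pc,
                                    uint32_t *inst)
{
    if (fetch_fast(v, pc, inst) == FETCH_OK)
        return FETCH_OK;
    return fetch_slow(v, pc, inst);
}
```

Only when both paths fail would the caller have to give up, for example by
injecting a program fault as described above.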


 
 (MMIO emulation is broken in this regard too btw)

Huh?

Alex

 
 Cheers,
 Ben.
 
 


Re: [PATCH] KVM: PPC: refer to paravirt docs in header file

2012-01-11 Thread Alexander Graf

On 12.01.2012, at 00:37, Scott Wood wrote:

 Instead of keeping separate copies of struct kvm_vcpu_arch_shared (one in
 the code, one in the docs) that inevitably fail to be kept in sync
 (already sr[] is missing from the doc version), just point to the header
 file as the source of documentation on the contents of the magic page.
 
 Signed-off-by: Scott Wood scottw...@freescale.com

Avi, please ack.


Alex

--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

