How to use code to create a new GuestVM
Hi everyone,

I am new to KVM and want to write a module that creates a guest VM on demand. Which functions should I look into? And which structs correspond to a guest VM and to its shadow page table?

Thanks for your help.
R

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/3] net: use this_cpu_xxx replace percpu_xxx funcs
> percpu_xxx funcs duplicate the this_cpu_xxx funcs, so replace them as a
> step toward further code cleanup.
>
> In preempt-safe scenarios, the __this_cpu_xxx funcs also perform slightly
> better, since they carry no redundant preempt_disable().
>
> Signed-off-by: Alex Shi alex@intel.com
> ---
>  net/netfilter/xt_TEE.c | 12 ++--
>  net/socket.c           |  4 ++--
>  2 files changed, 8 insertions(+), 8 deletions(-)
>
> Acked-by: Eric Dumazet eric.duma...@gmail.com
>
> Thanks! Would anyone like to pick up this patch, or are there more
> comments on it?
>
> Kaber, David: I would appreciate any comments on this. Could you do me
> a favor?
>
> No objections from me.

Resending this patch for the 3.2.0 kernel with Eric's Ack.

David, do you have any concerns about this patch? I would greatly appreciate it if it could make the 3.3 merge window.

From 037bd159fdf52b915e452fac8db2252b1c60297e Mon Sep 17 00:00:00 2001
From: Alex Shi alex@intel.com
Date: Thu, 20 Oct 2011 14:52:17 +0800
Subject: [PATCH 1/3] net: use this_cpu_xxx replace percpu_xxx funcs

percpu_xxx funcs duplicate the this_cpu_xxx funcs, so replace them as a
step toward further code cleanup.

In preempt-safe scenarios, the __this_cpu_xxx funcs also perform slightly
better, since they carry no redundant preempt_disable().

Signed-off-by: Alex Shi alex@intel.com
Acked-by: Eric Dumazet eric.duma...@gmail.com
---
 net/netfilter/xt_TEE.c | 12 ++--
 net/socket.c           |  4 ++--
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/net/netfilter/xt_TEE.c b/net/netfilter/xt_TEE.c
index 5f054a0..678084c 100644
--- a/net/netfilter/xt_TEE.c
+++ b/net/netfilter/xt_TEE.c
@@ -90,7 +90,7 @@ tee_tg4(struct sk_buff *skb, const struct xt_action_param *par)
 	const struct xt_tee_tginfo *info = par->targinfo;
 	struct iphdr *iph;
 
-	if (percpu_read(tee_active))
+	if (__this_cpu_read(tee_active))
 		return XT_CONTINUE;
 	/*
 	 * Copy the skb, and route the copy. Will later return %XT_CONTINUE for
@@ -127,9 +127,9 @@ tee_tg4(struct sk_buff *skb, const struct xt_action_param *par)
 	ip_send_check(iph);
 
 	if (tee_tg_route4(skb, info)) {
-		percpu_write(tee_active, true);
+		__this_cpu_write(tee_active, true);
 		ip_local_out(skb);
-		percpu_write(tee_active, false);
+		__this_cpu_write(tee_active, false);
 	} else {
 		kfree_skb(skb);
 	}
@@ -170,7 +170,7 @@ tee_tg6(struct sk_buff *skb, const struct xt_action_param *par)
 {
 	const struct xt_tee_tginfo *info = par->targinfo;
 
-	if (percpu_read(tee_active))
+	if (__this_cpu_read(tee_active))
 		return XT_CONTINUE;
 	skb = pskb_copy(skb, GFP_ATOMIC);
 	if (skb == NULL)
@@ -188,9 +188,9 @@ tee_tg6(struct sk_buff *skb, const struct xt_action_param *par)
 		--iph->hop_limit;
 	}
 	if (tee_tg_route6(skb, info)) {
-		percpu_write(tee_active, true);
+		__this_cpu_write(tee_active, true);
 		ip6_local_out(skb);
-		percpu_write(tee_active, false);
+		__this_cpu_write(tee_active, false);
 	} else {
 		kfree_skb(skb);
 	}
diff --git a/net/socket.c b/net/socket.c
index ffe92ca..4b62ca9 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -479,7 +479,7 @@ static struct socket *sock_alloc(void)
 	inode->i_uid = current_fsuid();
 	inode->i_gid = current_fsgid();
 
-	percpu_add(sockets_in_use, 1);
+	this_cpu_add(sockets_in_use, 1);
 	return sock;
 }
 
@@ -522,7 +522,7 @@ void sock_release(struct socket *sock)
 	if (rcu_dereference_protected(sock->wq, 1)->fasync_list)
 		printk(KERN_ERR "sock_release: fasync list not empty!\n");
 
-	percpu_sub(sockets_in_use, 1);
+	this_cpu_sub(sockets_in_use, 1);
 	if (!sock->file) {
 		iput(SOCK_INODE(sock));
 		return;
-- 
1.6.3.3
Re: [PATCH 2/3] kvm: use this_cpu_xxx replace percpu_xxx funcs
Acked-by: Avi Kivity a...@redhat.com

And for this one too: picking it up, or any comments, would be appreciated. :)

Just to be clear, you want this applied in kvm.git?

Thanks Avi! I saw it is in your 3.3 submit list.
Re: [PATCH 1/3] net: use this_cpu_xxx replace percpu_xxx funcs
From: Alex,Shi alex@intel.com
Date: Wed, 11 Jan 2012 16:45:33 +0800

> percpu_xxx funcs duplicate the this_cpu_xxx funcs, so replace them as a
> step toward further code cleanup.
>
> In preempt-safe scenarios, the __this_cpu_xxx funcs also perform slightly
> better, since they carry no redundant preempt_disable().
>
> Signed-off-by: Alex Shi alex@intel.com
> ---
>  net/netfilter/xt_TEE.c | 12 ++--
>  net/socket.c           |  4 ++--
>  2 files changed, 8 insertions(+), 8 deletions(-)
>
> Acked-by: Eric Dumazet eric.duma...@gmail.com
>
> Thanks! Would anyone like to pick up this patch, or are there more
> comments on it?
>
> Kaber, David: I would appreciate any comments on this. Could you do me
> a favor?
>
> No objections from me.
>
> Resending this patch for the 3.2.0 kernel with Eric's Ack.
>
> David, do you have any concerns about this patch? I would greatly
> appreciate it if it could make the 3.3 merge window.

Please just submit it directly with the other this_cpu() patches:

Acked-by: David S. Miller da...@davemloft.net
Re: [PATCH 3/3] Code clean up for percpu_xxx() functions
On Mon, 2011-11-21 at 17:06 -0700, t...@kernel.org wrote:
> (cc'ing hpa and quoting whole body)
>
> Signed-off-by: Alex Shi alex@intel.com
> Acked-by: Christoph Lameter c...@gentwo.org
> Acked-by: Tejun Heo t...@kernel.org
>
> hpa, I suppose this should go through x86? The original patch can be
> accessed at
>
>   http://article.gmane.org/gmane.linux.kernel/1218055/raw

Resending for the 3.2 kernel; no changes are needed to apply it on Linus' latest tree. :)

Actually, this cleanup has no negative performance or security impact on the kernel. On the contrary, removing some potentially redundant preempt disabling brings a slight performance benefit.

This 3rd patch depends on the previous 2 patches. The 2nd one, the kvm code cleanup, was submitted for the 3.3 kernel, but the 1st one, the net code cleanup, is still waiting for David's comments.

--
From 0dce61dc88b8ed2687b4d5c0633aa54d1f66fdc0 Mon Sep 17 00:00:00 2001
From: Alex Shi alex@intel.com
Date: Tue, 22 Nov 2011 00:05:37 +0800
Subject: [PATCH 3/3] Code clean up for percpu_xxx() functions

Since the percpu_xxx() series of functions duplicates this_cpu_xxx(),
remove the percpu_xxx() definitions and replace their uses with
this_cpu_xxx().

Furthermore, as Christoph Lameter requested, I try to use __this_cpu_xxx
instead of this_cpu_xxx wherever the code is in a preempt-safe scenario.
The preempt-safe scenarios include:
1. in an irq/softirq/nmi handler
2. protected by preempt_disable
3. protected by spin_lock
4. the code context implies that it is preempt safe, e.g. the code
   follows or is followed by preempt-safe code.

I left the xen code unchanged, since I have no knowledge of it.

BTW, this_cpu_xxx is in fact the same as __this_cpu_xxx on x86, since all
the funcs are implemented as a single instruction there. But it may differ
on other platforms, so making this distinction helps performance on other
platforms.

Signed-off-by: Alex Shi alex@intel.com
Acked-by: Christoph Lameter c...@gentwo.org
Acked-by: Tejun Heo t...@kernel.org
---
 arch/x86/include/asm/current.h        |  2 +-
 arch/x86/include/asm/hardirq.h        |  9 +++--
 arch/x86/include/asm/irq_regs.h       |  4 +-
 arch/x86/include/asm/mmu_context.h    | 12
 arch/x86/include/asm/percpu.h         | 24 ++-
 arch/x86/include/asm/smp.h            |  4 +-
 arch/x86/include/asm/stackprotector.h |  4 +-
 arch/x86/include/asm/thread_info.h    |  2 +-
 arch/x86/include/asm/tlbflush.h       |  4 +-
 arch/x86/kernel/cpu/common.c          |  2 +-
 arch/x86/kernel/cpu/mcheck/mce.c      |  4 +-
 arch/x86/kernel/paravirt.c            | 12
 arch/x86/kernel/process_32.c          |  2 +-
 arch/x86/kernel/process_64.c          | 12
 arch/x86/mm/tlb.c                     | 10 +++---
 arch/x86/xen/enlighten.c              |  6 ++--
 arch/x86/xen/irq.c                    |  8 ++--
 arch/x86/xen/mmu.c                    | 20 ++--
 arch/x86/xen/multicalls.h             |  2 +-
 arch/x86/xen/smp.c                    |  2 +-
 include/linux/percpu.h                | 53 -
 include/linux/topology.h              |  4 +-
 22 files changed, 73 insertions(+), 129 deletions(-)

diff --git a/arch/x86/include/asm/current.h b/arch/x86/include/asm/current.h
index 4d447b7..9476c04 100644
--- a/arch/x86/include/asm/current.h
+++ b/arch/x86/include/asm/current.h
@@ -11,7 +11,7 @@ DECLARE_PER_CPU(struct task_struct *, current_task);
 
 static __always_inline struct task_struct *get_current(void)
 {
-	return percpu_read_stable(current_task);
+	return this_cpu_read_stable(current_task);
 }
 
 #define current get_current()
diff --git a/arch/x86/include/asm/hardirq.h b/arch/x86/include/asm/hardirq.h
index 55e4de6..2890444 100644
--- a/arch/x86/include/asm/hardirq.h
+++ b/arch/x86/include/asm/hardirq.h
@@ -35,14 +35,15 @@ DECLARE_PER_CPU_SHARED_ALIGNED(irq_cpustat_t, irq_stat);
 
 #define __ARCH_IRQ_STAT
 
-#define inc_irq_stat(member)	percpu_inc(irq_stat.member)
+#define inc_irq_stat(member)	__this_cpu_inc(irq_stat.member)
 
-#define local_softirq_pending()	percpu_read(irq_stat.__softirq_pending)
+#define local_softirq_pending()	__this_cpu_read(irq_stat.__softirq_pending)
 
 #define __ARCH_SET_SOFTIRQ_PENDING
 
-#define set_softirq_pending(x)	percpu_write(irq_stat.__softirq_pending, (x))
-#define or_softirq_pending(x)	percpu_or(irq_stat.__softirq_pending, (x))
+#define set_softirq_pending(x)	\
+		__this_cpu_write(irq_stat.__softirq_pending, (x))
+#define or_softirq_pending(x)	__this_cpu_or(irq_stat.__softirq_pending, (x))
 
 extern void ack_bad_irq(unsigned int irq);
 
diff --git a/arch/x86/include/asm/irq_regs.h b/arch/x86/include/asm/irq_regs.h
index 7784322..15639ed 100644
--- a/arch/x86/include/asm/irq_regs.h
+++ b/arch/x86/include/asm/irq_regs.h
@@ -15,7 +15,7 @@ DECLARE_PER_CPU(struct pt_regs *, irq_regs);
 
 static inline struct pt_regs *get_irq_regs(void)
 {
-	return
Re: [PATCH 1/3] net: use this_cpu_xxx replace percpu_xxx funcs
On Wed, 2012-01-11 at 01:03 -0800, David Miller wrote:
> From: Alex,Shi alex@intel.com
> Date: Wed, 11 Jan 2012 16:45:33 +0800
>
> percpu_xxx funcs duplicate the this_cpu_xxx funcs, so replace them as a
> step toward further code cleanup.
>
> In preempt-safe scenarios, the __this_cpu_xxx funcs also perform slightly
> better, since they carry no redundant preempt_disable().
>
> Signed-off-by: Alex Shi alex@intel.com
> ---
>  net/netfilter/xt_TEE.c | 12 ++--
>  net/socket.c           |  4 ++--
>  2 files changed, 8 insertions(+), 8 deletions(-)
>
> Acked-by: Eric Dumazet eric.duma...@gmail.com
>
> Thanks! Would anyone like to pick up this patch, or are there more
> comments on it?
>
> Kaber, David: I would appreciate any comments on this. Could you do me
> a favor?
>
> No objections from me.
>
> Resending this patch for the 3.2.0 kernel with Eric's Ack.
>
> David, do you have any concerns about this patch? I would greatly
> appreciate it if it could make the 3.3 merge window.
>
> Please just submit it directly with the other this_cpu() patches:
>
> Acked-by: David S. Miller da...@davemloft.net

Thanks a lot! :)
Re: [PATCH] KVM: Allow host IRQ sharing for assigned PCI 2.3 devices
On Tue, Jan 10, 2012 at 04:41:50PM -0700, Alex Williamson wrote:

The guest driver will never see such an interrupt, as we will notice on its arrival that there is some mask pending.

Right, I was thinking more about the effect at the hardware level.

In theory a broken device might assume that the INTx disable bit is correlated with internal device registers somehow. However, the current sharing approach won't work for such a device anyway, as the host controls the status bit while the guest controls the rest of the device. So I think we don't care.

-- 
MST
kvm-s390: prep cleanup for sync registers patch series
Avi, Marcelo, here is a patch that reworks the setting of the prefix register. It is a prereq for the prefix patch in the following patch series about the sync registers in kvm_run.
[PATCH] kvm-s390: rework code that sets the prefix
From: Christian Borntraeger borntrae...@de.ibm.com

There are several places in the kvm module that set the prefix register.
Since we need to flush the cpu, let's combine this operation into a helper
function. This helper will also explicitly mask out the unused bits.

Signed-off-by: Christian Borntraeger borntrae...@de.ibm.com
---
 arch/s390/kvm/interrupt.c | 3 +--
 arch/s390/kvm/kvm-s390.c  | 3 +--
 arch/s390/kvm/kvm-s390.h  | 7 +++
 arch/s390/kvm/priv.c      | 3 +--
 4 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
index 278ee00..c6366cf 100644
--- a/arch/s390/kvm/interrupt.c
+++ b/arch/s390/kvm/interrupt.c
@@ -236,8 +236,7 @@ static void __do_deliver_interrupt(struct kvm_vcpu *vcpu,
 		VCPU_EVENT(vcpu, 4, "interrupt: set prefix to %x",
 			   inti->prefix.address);
 		vcpu->stat.deliver_prefix_signal++;
-		vcpu->arch.sie_block->prefix = inti->prefix.address;
-		vcpu->arch.sie_block->ihcpu = 0xffff;
+		kvm_s390_set_prefix(vcpu, inti->prefix.address);
 		break;
 
 	case KVM_S390_RESTART:
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index a33b444..1868b89 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -322,8 +322,7 @@ static void kvm_s390_vcpu_initial_reset(struct kvm_vcpu *vcpu)
 	/* this equals initial cpu reset in pop, but we don't switch to ESA */
 	vcpu->arch.sie_block->gpsw.mask = 0UL;
 	vcpu->arch.sie_block->gpsw.addr = 0UL;
-	vcpu->arch.sie_block->prefix    = 0UL;
-	vcpu->arch.sie_block->ihcpu     = 0xffff;
+	kvm_s390_set_prefix(vcpu, 0);
 	vcpu->arch.sie_block->cputm     = 0UL;
 	vcpu->arch.sie_block->ckc       = 0UL;
 	vcpu->arch.sie_block->todpr     = 0;
diff --git a/arch/s390/kvm/kvm-s390.h b/arch/s390/kvm/kvm-s390.h
index 62aa5f1..ff28f9d 100644
--- a/arch/s390/kvm/kvm-s390.h
+++ b/arch/s390/kvm/kvm-s390.h
@@ -58,6 +58,13 @@ static inline int kvm_is_ucontrol(struct kvm *kvm)
 	return 0;
 #endif
 }
+
+static inline void kvm_s390_set_prefix(struct kvm_vcpu *vcpu, u32 prefix)
+{
+	vcpu->arch.sie_block->prefix = prefix & 0x7fffe000u;
+	vcpu->arch.sie_block->ihcpu  = 0xffff;
+}
+
 int kvm_s390_handle_wait(struct kvm_vcpu *vcpu);
 enum hrtimer_restart kvm_s390_idle_wakeup(struct hrtimer *timer);
 void kvm_s390_tasklet(unsigned long parm);
diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c
index d026389..9c83b8a 100644
--- a/arch/s390/kvm/priv.c
+++ b/arch/s390/kvm/priv.c
@@ -56,8 +56,7 @@ static int handle_set_prefix(struct kvm_vcpu *vcpu)
 		goto out;
 	}
 
-	vcpu->arch.sie_block->prefix = address;
-	vcpu->arch.sie_block->ihcpu = 0xffff;
+	kvm_s390_set_prefix(vcpu, address);
 
 	VCPU_EVENT(vcpu, 5, "setting prefix to %x", address);
 out:
-- 
1.7.8.2
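As a rough illustration of what the helper's mask does, here is a standalone userspace sketch. It relies only on the constant 0x7fffe000u shown in the patch; the name mask_prefix is hypothetical and not part of the kernel, and the ihcpu reset done by the real helper is left out.

```c
#include <stdint.h>

/* Mask taken from the patch: keeps bits 13..30, clears bit 31 and bits 0..12. */
#define PREFIX_MASK 0x7fffe000u

/*
 * Hypothetical standalone helper (not a kernel function): returns the
 * value that would be stored in the prefix field for a requested prefix.
 */
static uint32_t mask_prefix(uint32_t prefix)
{
	return prefix & PREFIX_MASK;
}
```

Because the low 13 bits are always cleared, whatever address a caller passes in, the stored prefix ends up aligned to 8 KiB.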
[PATCH 2/4] kvm-s390: provide the prefix register via kvm_run
Add the prefix register to the synced register field in kvm_run. While we
mostly need the prefix register read-only, this patch also adds handling
for guest dirtying of the prefix register.

Signed-off-by: Christian Borntraeger borntrae...@de.ibm.com
---
 arch/s390/include/asm/kvm.h | 2 ++
 arch/s390/kvm/kvm-s390.c    | 7 +++
 2 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/arch/s390/include/asm/kvm.h b/arch/s390/include/asm/kvm.h
index 325560a..9fc328c 100644
--- a/arch/s390/include/asm/kvm.h
+++ b/arch/s390/include/asm/kvm.h
@@ -41,7 +41,9 @@ struct kvm_debug_exit_arch {
 struct kvm_guest_debug_arch {
 };
 
+#define KVM_SYNC_PREFIX (1UL << 0)
 /* definition of registers in kvm_run */
 struct kvm_sync_regs {
+	__u64 prefix;	/* prefix register */
 };
 #endif
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 1868b89..6962c1b 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -132,6 +132,7 @@ int kvm_dev_ioctl_check_extension(long ext)
 #ifdef CONFIG_KVM_S390_UCONTROL
 	case KVM_CAP_S390_UCONTROL:
 #endif
+	case KVM_CAP_SYNC_REGS:
 		r = 1;
 		break;
 	default:
@@ -288,6 +289,7 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
 	}
 
 	vcpu->arch.gmap = vcpu->kvm->arch.gmap;
+	vcpu->run->kvm_valid_regs = KVM_SYNC_PREFIX;
 	return 0;
 }
 
@@ -572,6 +574,10 @@ rerun_vcpu:
 
 	vcpu->arch.sie_block->gpsw.mask = kvm_run->psw_mask;
 	vcpu->arch.sie_block->gpsw.addr = kvm_run->psw_addr;
+	if (kvm_run->kvm_dirty_regs & KVM_SYNC_PREFIX) {
+		kvm_run->kvm_dirty_regs &= ~KVM_SYNC_PREFIX;
+		kvm_s390_set_prefix(vcpu, kvm_run->s.regs.prefix);
+	}
 
 	might_fault();
 
@@ -620,6 +626,7 @@ rerun_vcpu:
 
 	kvm_run->psw_mask = vcpu->arch.sie_block->gpsw.mask;
 	kvm_run->psw_addr = vcpu->arch.sie_block->gpsw.addr;
+	kvm_run->s.regs.prefix = vcpu->arch.sie_block->prefix;
 
 	if (vcpu->sigset_active)
 		sigprocmask(SIG_SETMASK, &sigsaved, NULL);
-- 
1.7.8.2
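The dirty-bit handshake added at rerun_vcpu can be modelled outside the kernel. The sketch below is a userspace toy, not the KVM ABI: struct run_model, struct vcpu_model and the three function names are hypothetical, and only the KVM_SYNC_PREFIX bit position and the 0x7fffe000u mask are taken from the patches in this series.

```c
#include <stdint.h>

/* Bit 0, as in the patch; written with a 64-bit constant for this model. */
#define KVM_SYNC_PREFIX (UINT64_C(1) << 0)

/* Hypothetical stand-in for the shared kvm_run area. */
struct run_model {
	uint64_t kvm_valid_regs;  /* which synced fields carry valid data */
	uint64_t kvm_dirty_regs;  /* which fields userspace has modified */
	uint64_t prefix;          /* shared copy of the prefix register */
};

/* Hypothetical stand-in for the in-kernel vcpu state. */
struct vcpu_model {
	uint32_t prefix;
};

/* At vcpu init, advertise which fields are kept in sync. */
static void model_vcpu_init(struct run_model *run)
{
	run->kvm_valid_regs = KVM_SYNC_PREFIX;
	run->kvm_dirty_regs = 0;
	run->prefix = 0;
}

/* Before entering the guest: consume a dirty prefix, then clear the flag. */
static void model_sync_on_entry(struct vcpu_model *vcpu, struct run_model *run)
{
	if (run->kvm_dirty_regs & KVM_SYNC_PREFIX) {
		run->kvm_dirty_regs &= ~KVM_SYNC_PREFIX;
		vcpu->prefix = (uint32_t)run->prefix & 0x7fffe000u;
	}
}

/* After leaving the guest: publish the current value to the shared area. */
static void model_sync_on_exit(const struct vcpu_model *vcpu,
			       struct run_model *run)
{
	run->prefix = vcpu->prefix;
}
```

In this model, a write to the shared prefix field has no effect unless the dirty bit is also set, which matches the read-mostly use described in the commit message.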
[PATCH 4/4] kvm-s390: provide access guest registers via kvm_run
This patch adds the access registers to the kvm_run structure.

Signed-off-by: Christian Borntraeger borntrae...@de.ibm.com
---
 arch/s390/include/asm/kvm.h      |  2 ++
 arch/s390/include/asm/kvm_host.h |  1 -
 arch/s390/kvm/kvm-s390.c         | 16 +---
 3 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/arch/s390/include/asm/kvm.h b/arch/s390/include/asm/kvm.h
index 420dbb7..9acbde4 100644
--- a/arch/s390/include/asm/kvm.h
+++ b/arch/s390/include/asm/kvm.h
@@ -43,9 +43,11 @@ struct kvm_guest_debug_arch {
 
 #define KVM_SYNC_PREFIX (1UL << 0)
 #define KVM_SYNC_GPRS   (1UL << 1)
+#define KVM_SYNC_ACRS   (1UL << 2)
 /* definition of registers in kvm_run */
 struct kvm_sync_regs {
 	__u64 prefix;	/* prefix register */
 	__u64 gprs[16];	/* general purpose registers */
+	__u32 acrs[16];	/* access registers */
 };
 #endif
diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index ed843ca..e630426 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -231,7 +231,6 @@ struct kvm_vcpu_arch {
 	s390_fp_regs      host_fpregs;
 	unsigned int      host_acrs[NUM_ACRS];
 	s390_fp_regs      guest_fpregs;
-	unsigned int      guest_acrs[NUM_ACRS];
 	struct kvm_s390_local_interrupt local_int;
 	struct hrtimer    ckc_timer;
 	struct tasklet_struct tasklet;
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 80b12ba..0b91679 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -289,7 +289,9 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
 	}
 
 	vcpu->arch.gmap = vcpu->kvm->arch.gmap;
-	vcpu->run->kvm_valid_regs = KVM_SYNC_PREFIX | KVM_SYNC_GPRS;
+	vcpu->run->kvm_valid_regs = KVM_SYNC_PREFIX |
+				    KVM_SYNC_GPRS |
+				    KVM_SYNC_ACRS;
 	return 0;
 }
 
@@ -304,7 +306,7 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 	save_access_regs(vcpu->arch.host_acrs);
 	vcpu->arch.guest_fpregs.fpc &= FPC_VALID_MASK;
 	restore_fp_regs(&vcpu->arch.guest_fpregs);
-	restore_access_regs(vcpu->arch.guest_acrs);
+	restore_access_regs(vcpu->run->s.regs.acrs);
 	gmap_enable(vcpu->arch.gmap);
 	atomic_set_mask(CPUSTAT_RUNNING, &vcpu->arch.sie_block->cpuflags);
 }
@@ -314,7 +316,7 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
 	atomic_clear_mask(CPUSTAT_RUNNING, &vcpu->arch.sie_block->cpuflags);
 	gmap_disable(vcpu->arch.gmap);
 	save_fp_regs(&vcpu->arch.guest_fpregs);
-	save_access_regs(vcpu->arch.guest_acrs);
+	save_access_regs(vcpu->run->s.regs.acrs);
 	restore_fp_regs(&vcpu->arch.host_fpregs);
 	restore_access_regs(vcpu->arch.host_acrs);
 }
@@ -441,16 +443,16 @@ int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
 int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu,
 				  struct kvm_sregs *sregs)
 {
-	memcpy(&vcpu->arch.guest_acrs, &sregs->acrs, sizeof(sregs->acrs));
+	memcpy(&vcpu->run->s.regs.acrs, &sregs->acrs, sizeof(sregs->acrs));
 	memcpy(&vcpu->arch.sie_block->gcr, &sregs->crs, sizeof(sregs->crs));
-	restore_access_regs(vcpu->arch.guest_acrs);
+	restore_access_regs(vcpu->run->s.regs.acrs);
 	return 0;
 }
 
 int kvm_arch_vcpu_ioctl_get_sregs(struct kvm_vcpu *vcpu,
 				  struct kvm_sregs *sregs)
 {
-	memcpy(&sregs->acrs, &vcpu->arch.guest_acrs, sizeof(sregs->acrs));
+	memcpy(&sregs->acrs, &vcpu->run->s.regs.acrs, sizeof(sregs->acrs));
 	memcpy(&sregs->crs, &vcpu->arch.sie_block->gcr, sizeof(sregs->crs));
 	return 0;
 }
@@ -702,7 +704,7 @@ int kvm_s390_vcpu_store_status(struct kvm_vcpu *vcpu, unsigned long addr)
 		return -EFAULT;
 
 	if (__guestcopy(vcpu, addr + offsetof(struct save_area, acc_regs),
-			&vcpu->arch.guest_acrs, 64, prefix))
+			&vcpu->run->s.regs.acrs, 64, prefix))
 		return -EFAULT;
 
 	if (__guestcopy(vcpu,
-- 
1.7.8.2
kvm: provide synchronous registers in kvm_run
Avi, Marcelo, here is the next version of the sync register patch series.
[PATCH 3/4] kvm-s390: provide general purpose guest registers via kvm_run
This patch adds the general purpose registers to the kvm_run structure.

Signed-off-by: Christian Borntraeger borntrae...@de.ibm.com
---
 arch/s390/include/asm/kvm.h      |  2 ++
 arch/s390/include/asm/kvm_host.h |  3 +--
 arch/s390/kvm/diag.c             |  6 +++---
 arch/s390/kvm/intercept.c        |  4 ++--
 arch/s390/kvm/kvm-s390.c         | 14 +++---
 arch/s390/kvm/priv.c             | 24
 arch/s390/kvm/sigp.c             | 20 ++--
 7 files changed, 37 insertions(+), 36 deletions(-)

diff --git a/arch/s390/include/asm/kvm.h b/arch/s390/include/asm/kvm.h
index 9fc328c..420dbb7 100644
--- a/arch/s390/include/asm/kvm.h
+++ b/arch/s390/include/asm/kvm.h
@@ -42,8 +42,10 @@ struct kvm_guest_debug_arch {
 };
 
 #define KVM_SYNC_PREFIX (1UL << 0)
+#define KVM_SYNC_GPRS   (1UL << 1)
 /* definition of registers in kvm_run */
 struct kvm_sync_regs {
 	__u64 prefix;	/* prefix register */
+	__u64 gprs[16];	/* general purpose registers */
 };
 #endif
diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index e34fb2b..ed843ca 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -228,7 +228,6 @@ struct kvm_s390_float_interrupt {
 
 struct kvm_vcpu_arch {
 	struct kvm_s390_sie_block *sie_block;
-	unsigned long     guest_gprs[16];
 	s390_fp_regs      host_fpregs;
 	unsigned int      host_acrs[NUM_ACRS];
 	s390_fp_regs      guest_fpregs;
@@ -254,5 +253,5 @@ struct kvm_arch{
 	struct gmap *gmap;
 };
 
-extern int sie64a(struct kvm_s390_sie_block *, unsigned long *);
+extern int sie64a(struct kvm_s390_sie_block *, u64 *);
 #endif
diff --git a/arch/s390/kvm/diag.c b/arch/s390/kvm/diag.c
index 8943e82..a353f0e 100644
--- a/arch/s390/kvm/diag.c
+++ b/arch/s390/kvm/diag.c
@@ -20,8 +20,8 @@ static int diag_release_pages(struct kvm_vcpu *vcpu)
 	unsigned long start, end;
 	unsigned long prefix  = vcpu->arch.sie_block->prefix;
 
-	start = vcpu->arch.guest_gprs[(vcpu->arch.sie_block->ipa & 0xf0) >> 4];
-	end = vcpu->arch.guest_gprs[vcpu->arch.sie_block->ipa & 0xf] + 4096;
+	start = vcpu->run->s.regs.gprs[(vcpu->arch.sie_block->ipa & 0xf0) >> 4];
+	end = vcpu->run->s.regs.gprs[vcpu->arch.sie_block->ipa & 0xf] + 4096;
 
 	if (start & ~PAGE_MASK || end & ~PAGE_MASK || start > end
 	    || start < 2 * PAGE_SIZE)
@@ -56,7 +56,7 @@ static int __diag_time_slice_end(struct kvm_vcpu *vcpu)
 static int __diag_ipl_functions(struct kvm_vcpu *vcpu)
 {
 	unsigned int reg = vcpu->arch.sie_block->ipa & 0xf;
-	unsigned long subcode = vcpu->arch.guest_gprs[reg] & 0xffff;
+	unsigned long subcode = vcpu->run->s.regs.gprs[reg] & 0xffff;
 
 	VCPU_EVENT(vcpu, 5, "diag ipl functions, subcode %lx", subcode);
 	switch (subcode) {
diff --git a/arch/s390/kvm/intercept.c b/arch/s390/kvm/intercept.c
index 0243454..776ef83 100644
--- a/arch/s390/kvm/intercept.c
+++ b/arch/s390/kvm/intercept.c
@@ -36,7 +36,7 @@ static int handle_lctlg(struct kvm_vcpu *vcpu)
 
 	useraddr = disp2;
 	if (base2)
-		useraddr += vcpu->arch.guest_gprs[base2];
+		useraddr += vcpu->run->s.regs.gprs[base2];
 
 	if (useraddr & 7)
 		return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
@@ -75,7 +75,7 @@ static int handle_lctl(struct kvm_vcpu *vcpu)
 
 	useraddr = disp2;
 	if (base2)
-		useraddr += vcpu->arch.guest_gprs[base2];
+		useraddr += vcpu->run->s.regs.gprs[base2];
 
 	if (useraddr & 3)
 		return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 6962c1b..80b12ba 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -289,7 +289,7 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
 	}
 
 	vcpu->arch.gmap = vcpu->kvm->arch.gmap;
-	vcpu->run->kvm_valid_regs = KVM_SYNC_PREFIX;
+	vcpu->run->kvm_valid_regs = KVM_SYNC_PREFIX | KVM_SYNC_GPRS;
 	return 0;
 }
 
@@ -428,13 +428,13 @@ static int kvm_arch_vcpu_ioctl_initial_reset(struct kvm_vcpu *vcpu)
 
 int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
 {
-	memcpy(&vcpu->arch.guest_gprs, &regs->gprs, sizeof(regs->gprs));
+	memcpy(&vcpu->run->s.regs.gprs, &regs->gprs, sizeof(regs->gprs));
 	return 0;
 }
 
 int kvm_arch_vcpu_ioctl_get_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
 {
-	memcpy(&regs->gprs, &vcpu->arch.guest_gprs, sizeof(regs->gprs));
+	memcpy(&regs->gprs, &vcpu->run->s.regs.gprs, sizeof(regs->gprs));
 	return 0;
 }
 
@@ -511,7 +511,7 @@ static int __vcpu_run(struct kvm_vcpu *vcpu)
 {
 	int rc;
 
-	memcpy(&vcpu->arch.sie_block->gg14, &vcpu->arch.guest_gprs[14], 16);
+	memcpy(&vcpu->arch.sie_block->gg14, &vcpu->run->s.regs.gprs[14], 16);
 
 	if (need_resched())
 		schedule();
@@ -528,7 +528,7 @@ static int __vcpu_run(struct
Re: [PATCH 3/3] stop the periodic RTC update timer
On Fri, Jan 06, 2012 at 07:37:31AM +0000, Zhang, Yang Z wrote:
> Change the RTC update logic to use host time plus an offset to calculate
> the RTC clock. There is no need to use two periodic timers to maintain an
> internal timer for RTC clock update and alarm check. Instead, we calculate
> the real RTC time from the host time and an offset. For the alarm and
> update-ended interrupts, we set up a timer if the guest has enabled them,
> and stop it otherwise.
>
> Signed-off-by: Yang Zhang yang.z.zh...@intel.com
>
> diff --git a/hw/mc146818rtc.c b/hw/mc146818rtc.c
> index 9cbd052..ac1854e 100644
> --- a/hw/mc146818rtc.c
> +++ b/hw/mc146818rtc.c
> @@ -84,7 +84,7 @@ typedef struct RTCState {
>      MemoryRegion io;
>      uint8_t cmos_data[128];
>      uint8_t cmos_index;
> -    struct tm current_tm;
> +    int64_t offset;
>      int32_t base_year;
>      qemu_irq irq;
>      qemu_irq sqw_irq;
> @@ -93,19 +93,18 @@ typedef struct RTCState {
>      QEMUTimer *periodic_timer;
>      int64_t next_periodic_time;
>      /* second update */
> -    int64_t next_second_time;
> +    QEMUTimer *update_timer;
> +    int64_t next_update_time;
> +    /* alarm */
> +    QEMUTimer *alarm_timer;
> +    int64_t next_alarm_time;
>      uint16_t irq_reinject_on_ack_count;
>      uint32_t irq_coalesced;
>      uint32_t period;
>      QEMUTimer *coalesced_timer;
> -    QEMUTimer *second_timer;
> -    QEMUTimer *second_timer2;
>      Notifier clock_reset_notifier;
>  } RTCState;
>
> -static void rtc_set_time(RTCState *s);
> -static void rtc_copy_date(RTCState *s);
> -
>  #ifdef TARGET_I386
>  static void rtc_coalesced_timer_update(RTCState *s)
>  {
> @@ -140,6 +139,72 @@ static void rtc_coalesced_timer(void *opaque)
>  }
>  #endif
>
> +static inline int rtc_to_bcd(RTCState *s, int a)
> +{
> +    if (s->cmos_data[RTC_REG_B] & REG_B_DM) {
> +        return a;
> +    } else {
> +        return ((a / 10) << 4) | (a % 10);
> +    }
> +}
> +
> +static inline int rtc_from_bcd(RTCState *s, int a)
> +{
> +    if (s->cmos_data[RTC_REG_B] & REG_B_DM) {
> +        return a;
> +    } else {
> +        return ((a >> 4) * 10) + (a & 0x0f);
> +    }
> +}
> +
> +static void rtc_set_time(RTCState *s)
> +{
> +    struct tm tm ;
> +
> +    tm.tm_sec = rtc_from_bcd(s, s->cmos_data[RTC_SECONDS]);
> +    tm.tm_min = rtc_from_bcd(s, s->cmos_data[RTC_MINUTES]);
> +    tm.tm_hour = rtc_from_bcd(s, s->cmos_data[RTC_HOURS] & 0x7f);
> +    if (!(s->cmos_data[RTC_REG_B] & REG_B_24H) &&
> +        (s->cmos_data[RTC_HOURS] & 0x80)) {
> +        tm.tm_hour += 12;
> +    }
> +    tm.tm_wday = rtc_from_bcd(s, s->cmos_data[RTC_DAY_OF_WEEK]) - 1;
> +    tm.tm_mday = rtc_from_bcd(s, s->cmos_data[RTC_DAY_OF_MONTH]);
> +    tm.tm_mon = rtc_from_bcd(s, s->cmos_data[RTC_MONTH]) - 1;
> +    tm.tm_year = rtc_from_bcd(s, s->cmos_data[RTC_YEAR]) + s->base_year - 1900;
> +
> +    s->offset = qemu_timedate_diff(&tm);
> +
> +    rtc_change_mon_event(&tm);
> +}
> +
> +static void rtc_update_time(RTCState *s)
> +{
> +    struct tm tm;
> +    int year;
> +
> +    qemu_get_timedate(&tm, s->offset);
> +
> +    s->cmos_data[RTC_SECONDS] = rtc_to_bcd(s, tm.tm_sec);
> +    s->cmos_data[RTC_MINUTES] = rtc_to_bcd(s, tm.tm_min);
> +    if (s->cmos_data[RTC_REG_B] & REG_B_24H) {
> +        /* 24 hour format */
> +        s->cmos_data[RTC_HOURS] = rtc_to_bcd(s, tm.tm_hour);
> +    } else {
> +        /* 12 hour format */
> +        s->cmos_data[RTC_HOURS] = rtc_to_bcd(s, tm.tm_hour % 12);
> +        if (tm.tm_hour >= 12)
> +            s->cmos_data[RTC_HOURS] |= 0x80;
> +    }
> +    s->cmos_data[RTC_DAY_OF_WEEK] = rtc_to_bcd(s, tm.tm_wday + 1);
> +    s->cmos_data[RTC_DAY_OF_MONTH] = rtc_to_bcd(s, tm.tm_mday);
> +    s->cmos_data[RTC_MONTH] = rtc_to_bcd(s, tm.tm_mon + 1);
> +    year = (tm.tm_year - s->base_year) % 100;
> +    if (year < 0)
> +        year += 100;
> +    s->cmos_data[RTC_YEAR] = rtc_to_bcd(s, year);
> +}
> +

Please move this code into a separate, earlier patch.

>  static void rtc_timer_update(RTCState *s, int64_t current_time)
>  {
>      int period_code, period;
> @@ -174,7 +239,7 @@ static void rtc_timer_update(RTCState *s, int64_t current_time)
>      }
>  }
>
> -static void rtc_periodic_timer(void *opaque)
> +static void rtc_periodic_interrupt(void *opaque)
>  {
>      RTCState *s = opaque;
>
> @@ -204,6 +269,92 @@ static void rtc_periodic_timer(void *opaque)
>      }
>  }
>
> +static void rtc_enable_update_interrupt(void *opaque)
> +{
> +    RTCState *s = opaque;
> +
> +    s->next_update_time = qemu_get_clock_ns(rtc_clock) + get_ticks_per_sec();
> +    qemu_mod_timer(s->update_timer, s->next_update_time);
> +}
> +
> +static void rtc_disable_update_interrupt(void *opaque)
> +{
> +    RTCState *s = opaque;
> +
> +    qemu_del_timer(s->update_timer);
> +}
> +
> +static void rtc_update_interrupt(void *opaque)
> +{
> +    RTCState *s = opaque;
> +
> +    /* update ended interrupt */
> +    s->cmos_data[RTC_REG_C] |= REG_C_UF;
> +    if (s->cmos_data[RTC_REG_B] & REG_B_UIE) {
> +        s->cmos_data[RTC_REG_C] |= REG_C_IRQF;
> +        qemu_irq_raise(s->irq);
> +
> +        s->next_update_time += get_ticks_per_sec();
> +        qemu_mod_timer(s->update_timer,
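For readers following the BCD helpers in the quoted patch, here is a standalone sketch that can be compiled outside QEMU. It assumes the usual packed-BCD shifts and masks (one decimal digit per nibble). The names to_bcd, from_bcd and encode_hours are hypothetical, and the binary_mode and is_24h parameters stand in for the REG_B_DM and REG_B_24H tests that the patch performs on RTCState.

```c
/*
 * Packed BCD: tens digit in the high nibble, ones digit in the low nibble.
 * binary_mode != 0 models a guest that selected binary (non-BCD) format.
 */
static int to_bcd(int a, int binary_mode)
{
	if (binary_mode)
		return a;
	return ((a / 10) << 4) | (a % 10);
}

static int from_bcd(int a, int binary_mode)
{
	if (binary_mode)
		return a;
	return ((a >> 4) * 10) + (a & 0x0f);
}

/*
 * Mirrors the hour handling quoted from rtc_update_time(): in 12-hour
 * mode the hour is reduced modulo 12 and bit 7 flags the afternoon.
 */
static int encode_hours(int hour, int is_24h, int binary_mode)
{
	int v;

	if (is_24h)
		return to_bcd(hour, binary_mode);

	v = to_bcd(hour % 12, binary_mode);
	if (hour >= 12)
		v |= 0x80;
	return v;
}
```

For example, 59 seconds is stored as 0x59 in BCD mode, and 13:00 in 12-hour BCD mode becomes 0x81 under the quoted logic.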
[regression] virtio net locks up
No idea what is going on, but recent kernels lock up here after transferring some amount of data. So far I only know that 2.6.32 is the last working kernel I have tested and 3.0 is the first non-working version I have tested.

How to reproduce:

vm1: iperf -c vm2
vm2: iperf -s vm1

After some time, one of the two VMs can no longer be pinged, neither from the host nor from the other (still working) VM. Direct access via console to the VM that has lost networking still works fine. It also does not matter whether I run with vhost on or off; it fails in both modes. The qemu-kvm version is 1.0.

Here's my qemu-kvm start-up script:

#! /bin/bash

source ~/bin/kvm-config.sh

iface=`sudo tunctl -b -u $USER`
FILE=${IMAGE_DIR}/squeeze1.img
#NICMODEL=e1000
NICMODEL=virtio
DISKIF=virtio
#DISKIF=ide
#DISKIF=scsi

${kvm} \
	-m 4096 \
	-net nic,macaddr=52:54:00:12:34:11,model=${NICMODEL} \
	-net tap,id=foo,script=${HOME}/bin/kvm-ifup,downscript=${HOME}/bin/kvm-ifdown,ifname=$iface,vhost=on \
	-boot c \
	-drive file=${FILE},if=${DISKIF},boot=on,cache=writeback \
	${common_opts} \
	$@

sudo /usr/sbin/tunctl -d $iface

Any idea what is going on or how to debug it?

Thanks,
Bernd
Re: [regression] virtio net locks up
On 01/11/2012 04:24 PM, Bernd Schubert wrote:
> No idea what is going on, but recent kernels lock up here after
> transferring some amount of data. So far I only know that 2.6.32 is the
> last working kernel I have tested and 3.0 is the first non-working
> version I tested.

Sorry, I forgot to mention the host-side kernel version: it was not updated and is always 2.6.32-131.6.1.el6.x86_64 (so RHEL6).

Cheers,
Bernd
Re: [regression] virtio net locks up
On Wed, Jan 11, 2012 at 3:24 PM, Bernd Schubert <bernd.schub...@itwm.fraunhofer.de> wrote:
> Any idea what is going on or how to debug it?

Here are a couple of ideas that would yield more information:

Since the console still works I suggest checking dmesg output inside the guest. Are there any error messages at the bottom?

Try pinging the host's IP address from inside the guest. Run tcpdump on the guest's tap interface from the host and observe whether or not you see any packets being sent from the guest.

rmmod virtio_net inside the guest and then modprobe virtio_net again. See if network connectivity is restored (remember to rerun DHCP or whatever, if necessary).

Stefan
Re: [PATCH RFC 2/2] kvm: set affinity hint for assigned device msi
On Mon, Oct 17, 2011 at 07:04:40PM +0200, Michael S. Tsirkin wrote:
On Mon, Oct 17, 2011 at 02:07:41PM -0200, Marcelo Tosatti wrote:

Configurations to consider, all common ones used for assigned devices?

I mean, besides round robin, any other modes that have an issue? Interrupts can also be multicast, I think, but we probably don't care what happens to affinity then, as msi interrupts are probably never broadcast ...

There is also lowest priority, which can be used with MSI.

So the following will probably address that comment?

Yes, it does. Patch looks fine.
Re: [regression] virtio net locks up
Hello Stefan,

thanks for your help!

On 01/11/2012 05:04 PM, Stefan Hajnoczi wrote:
> On Wed, Jan 11, 2012 at 3:24 PM, Bernd Schubert <bernd.schub...@itwm.fraunhofer.de> wrote:
>> Any idea what is going on or how to debug it?
>
> Here are a couple of ideas that would yield more information:
>
> Since the console still works I suggest checking dmesg output inside
> the guest. Are there any error messages at the bottom?

No, absolutely nothing in dmesg.

> Try pinging the host's IP address from inside the guest. Run tcpdump
> on the guest's tap interface from the host and observe whether or not
> you see any packets being sent from the guest.

Seems ARP requests are still going out, but then don't go in:

17:16:21.202547 ARP, Reply 192.168.123.1 is-at 00:25:90:38:09:cd (oui Unknown), length 28
17:16:21.538724 ARP, Request who-has squeeze1 tell squeeze3, length 28
17:16:21.539026 ARP, Reply squeeze1 is-at 52:54:00:12:34:11 (oui Unknown), length 28
17:16:22.200912 ARP, Request who-has 192.168.123.1 tell squeeze3, length 28

> rmmod virtio_net inside the guest and then modprobe virtio_net again.
> See if network connectivity is restored (remember to rerun DHCP or
> whatever, if necessary).

Yep, that makes it work again. But it is probably not the real solution ;)

Thanks,
Bernd
[PATCH v2] KVM: fix mov immediate emulation for 64-bit operands
The MOV immediate instruction (opcodes 0xB8-0xBF) may take a 64-bit operand. The previous emulation implementation assumes the operand is no longer than 32 bits. Add OpImm64 to handle this case.

Signed-off-by: Nadav Amit <nadav.a...@gmail.com>
---
 arch/x86/kvm/emulate.c |   12 ++++++++++--
 1 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 05a562b..9ad5c0b 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -43,7 +43,7 @@
 #define OpCL               9ull  /* CL register (for shifts) */
 #define OpImmByte         10ull  /* 8-bit sign extended immediate */
 #define OpOne             11ull  /* Implied 1 */
-#define OpImm             12ull  /* Sign extended immediate */
+#define OpImm             12ull  /* Sign extended up to 32-bit immediate */
 #define OpMem16           13ull  /* Memory operand (16-bit). */
 #define OpMem32           14ull  /* Memory operand (32-bit). */
 #define OpImmU            15ull  /* Immediate operand, zero extended */
@@ -57,6 +57,7 @@
 #define OpDS              23ull  /* DS */
 #define OpFS              24ull  /* FS */
 #define OpGS              25ull  /* GS */
+#define OpImm64           26ull  /* Sign extended 16/32/64-bit immediate */
 
 #define OpBits             5  /* Width of operand field */
 #define OpMask             ((1ull << OpBits) - 1)
@@ -100,6 +101,7 @@
 #define SrcMemFAddr (OpMemFAddr << SrcShift)
 #define SrcAcc      (OpAcc << SrcShift)
 #define SrcImmU16   (OpImmU16 << SrcShift)
+#define SrcImm64    (OpImm64 << SrcShift)
 #define SrcDX       (OpDX << SrcShift)
 #define SrcMask     (OpMask << SrcShift)
 #define BitOp       (1<<11)
@@ -3365,7 +3367,7 @@ static struct opcode opcode_table[256] = {
 	/* 0xB0 - 0xB7 */
 	X8(I(ByteOp | DstReg | SrcImm | Mov, em_mov)),
 	/* 0xB8 - 0xBF */
-	X8(I(DstReg | SrcImm | Mov, em_mov)),
+	X8(I(DstReg | SrcImm64 | Mov, em_mov)),
 	/* 0xC0 - 0xC7 */
 	D2bv(DstMem | SrcImmByte | ModRM),
 	I(ImplicitOps | Stack | SrcImmU16, em_ret_near_imm),
@@ -3526,6 +3528,9 @@ static int decode_imm(struct x86_emulate_ctxt *ctxt, struct operand *op,
 	case 4:
 		op->val = insn_fetch(s32, ctxt);
 		break;
+	case 8:
+		op->val = insn_fetch(s64, ctxt);
+		break;
 	}
 	if (!sign_extension) {
 		switch (op->bytes) {
@@ -3605,6 +3610,9 @@ static int decode_operand(struct x86_emulate_ctxt *ctxt, struct operand *op,
 	case OpImm:
 		rc = decode_imm(ctxt, op, imm_size(ctxt), true);
 		break;
+	case OpImm64:
+		rc = decode_imm(ctxt, op, ctxt->op_bytes, true);
+		break;
 	case OpMem16:
 		ctxt->memop.bytes = 2;
 		goto mem_common;
--
1.7.4.1
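To make the effect of the patch above concrete, here is a minimal user-space sketch of fetching a sign-extended immediate of 1, 2, 4 or 8 bytes. It is not the kernel's decode_imm(); the helper name fetch_imm_sext and the two sample byte arrays are made up for illustration, and the code assumes a little-endian host, as on x86.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not the kernel's decode_imm(): read an immediate
 * of `size` bytes (1, 2, 4 or 8) from the instruction stream and
 * sign-extend it to 64 bits. Assumes a little-endian host.
 * Returns 0 for unsupported sizes.
 */
static int64_t fetch_imm_sext(const uint8_t *insn, unsigned int size)
{
	switch (size) {
	case 1: { int8_t v;  memcpy(&v, insn, sizeof(v)); return v; }
	case 2: { int16_t v; memcpy(&v, insn, sizeof(v)); return v; }
	case 4: { int32_t v; memcpy(&v, insn, sizeof(v)); return v; }
	case 8: { int64_t v; memcpy(&v, insn, sizeof(v)); return v; }
	default: return 0;
	}
}

/* Immediate bytes of "mov rax, 0x1122334455667788" (after 48 B8). */
static const uint8_t sample_imm64[8] = {
	0x88, 0x77, 0x66, 0x55, 0x44, 0x33, 0x22, 0x11
};

/* Immediate bytes of "mov rax, 0x00000000ffffffff" (after 48 B8). */
static const uint8_t sample_low_ones[8] = {
	0xff, 0xff, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00
};
```

Reading sample_low_ones with size 4 yields -1 after sign extension, while size 8 yields 0xffffffff. A decoder capped at 32 bits would therefore load the wrong value for a 64-bit MOV immediate, which is the case the new 8-byte fetch covers.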
Re: [PATCH v5 00/13] KVM/ARM Implementation
On 11 December 2011 19:23, Christoffer Dall <c.d...@virtualopensystems.com> wrote:
> On Sun, Dec 11, 2011 at 6:32 AM, Peter Maydell <peter.mayd...@linaro.org> wrote:
>> On 11 December 2011 10:24, Christoffer Dall <c.d...@virtualopensystems.com> wrote:
>>> Still on the to-do list:
>>>  - Reuse VMIDs
>>>  - Fix SMP host support
>>>  - Fix SMP guest support
>>>  - Support guest Thumb mode for MMIO emulation
>>>  - Further testing
>>>  - Performance improvements
>>
>> Other items for this list:
>>  - Support Neon/VFP in guests (the fpu regs struct is empty ATM)
>>  - Support guest debugging
>
> ok, thanks, will add these to the list. I have a feeling it will keep
> growing for a while :)

Do you have a kernel-side TODO list somewhere public (wiki page?)

(It would be quite useful to be able to boot a reasonably modern [read, ARMv7, Thumb2, VFPv3] guest userspace; does anybody plan to work on this part soon?)

thanks
-- PMM
[PATCH 1/2] KVM: Exception during emulation decode should propagate
An exception might occur during decode (e.g., #PF during fetch). Currently, the exception is ignored and emulation is performed. Instead, emulation should be skipped and the fault should be injected. Skipping instruction should report a failure in this case.

Signed-off-by: Nadav Amit <nadav.a...@gmail.com>
---
 arch/x86/kvm/emulate.c |    3 +++
 arch/x86/kvm/x86.c     |    8 ++++++++
 2 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 05a562b..e06dc98 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -3869,6 +3869,9 @@ done:
 	if (ctxt->memopp && ctxt->memopp->type == OP_MEM && ctxt->rip_relative)
 		ctxt->memopp->addr.mem.ea += ctxt->_eip;
 
+	if (rc == X86EMUL_PROPAGATE_FAULT)
+		ctxt->have_exception = true;
+
 	return (rc != X86EMUL_CONTINUE) ? EMULATION_FAILED : EMULATION_OK;
 }
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 1171def..05fd3d7 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4443,10 +4443,17 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu,
 	}
 
 	if (emulation_type & EMULTYPE_SKIP) {
+		if (ctxt->have_exception)
+			return EMULATE_FAIL;
 		kvm_rip_write(vcpu, ctxt->_eip);
 		return EMULATE_DONE;
 	}
 
+	if (ctxt->have_exception) {
+		writeback = false;
+		goto post;
+	}
+
 	if (retry_instruction(ctxt, cr2, emulation_type))
 		return EMULATE_DONE;
 
@@ -4470,6 +4477,7 @@ restart:
 		return handle_emulation_failure(vcpu);
 	}
 
+post:
 	if (ctxt->have_exception) {
 		inject_emulated_exception(vcpu);
 		r = EMULATE_DONE;
--
1.7.4.1
[PATCH 2/2] KVM: Fix writeback on page boundary that propagate changes in spite of #PF
Consider the case in which an instruction emulation writeback is performed on a page boundary. In such a case, if a #PF occurs on the second page, the write to the first page has already occurred and cannot be retracted. Therefore, validation of the second page access must be performed prior to writeback.

Signed-off-by: Nadav Amit <nadav.a...@gmail.com>
---
 arch/x86/kvm/x86.c |   13 +++++++++++++
 1 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 05fd3d7..7af3d67 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3626,6 +3626,8 @@ struct read_write_emulator_ops {
 				  int bytes, void *val);
 	int (*read_write_exit_mmio)(struct kvm_vcpu *vcpu, gpa_t gpa,
 				    void *val, int bytes);
+	gpa_t (*read_write_validate)(struct kvm_vcpu *vcpu, gva_t gva,
+				     struct x86_exception *exception);
 	bool write;
 };
 
@@ -3686,6 +3688,7 @@ static struct read_write_emulator_ops write_emultor = {
 	.read_write_emulate = write_emulate,
 	.read_write_mmio = write_mmio,
 	.read_write_exit_mmio = write_exit_mmio,
+	.read_write_validate = kvm_mmu_gva_to_gpa_write,
 	.write = true,
 };
 
@@ -3750,6 +3753,16 @@ int emulator_read_write(struct x86_emulate_ctxt *ctxt, unsigned long addr,
 		int rc, now;
 
 		now = -addr & ~PAGE_MASK;
+
+		/* First check there is no page-fault on the next page */
+		if (ops->read_write_validate &&
+		    ops->read_write_validate(vcpu, addr+now, exception) ==
+		    UNMAPPED_GVA) {
+			/* #PF on the first page should be reported first */
+			ops->read_write_validate(vcpu, addr, exception);
+			return X86EMUL_PROPAGATE_FAULT;
+		}
+
 		rc = emulator_read_write_onepage(addr, val, now, exception,
 						 vcpu, ops);
 
--
1.7.4.1
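The page-boundary arithmetic the patch relies on can be tried out in user space. The sketch below assumes 4 KiB pages; the SKETCH_ macros and the three helper names are invented for this illustration and are not kernel symbols. The expression in bytes_before_boundary() mirrors the "now = -addr & ~PAGE_MASK" line in the hunk above.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL                    /* assumed 4 KiB pages */
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))  /* clears the in-page offset */

/* True when [addr, addr + bytes) touches more than one page; bytes must be > 0. */
static bool crosses_page(uint64_t addr, unsigned int bytes)
{
	return (((addr + bytes - 1) ^ addr) & SKETCH_PAGE_MASK) != 0;
}

/* Bytes from addr up to the end of its page (0 when addr is page-aligned). */
static unsigned int bytes_before_boundary(uint64_t addr)
{
	return (unsigned int)(-addr & ~SKETCH_PAGE_MASK);
}

/*
 * First address that lands on the second page. Only meaningful when
 * crosses_page() returned true for the access starting at addr.
 */
static uint64_t second_page_start(uint64_t addr)
{
	return addr + bytes_before_boundary(addr);
}
```

For a 4-byte write at 0x1ffe, two bytes fall on the first page and the second page starts at 0x2000. That second address is the one a validation step would have to translate before any byte is written.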
Re: [PATCH] vhost-net: add module alias
On Wed, 11 Jan 2012 15:43:42 +0800 Amos Kong <kongjian...@gmail.com> wrote:
> On Wed, Jan 11, 2012 at 12:54 PM, Stephen Hemminger <shemmin...@vyatta.com> wrote:
>> By adding a module alias, programs (or users) won't have to explicitly
>> call modprobe. Vhost-net will always be available if built into the kernel.
>> It does require assigning a permanent minor number for depmod to work.
>> Choose one next to TUN since this driver is related to it.
>>
>> Also, use C99 style initialization.
>>
>> Signed-off-by: Stephen Hemminger <shemmin...@vyatta.com>
>> ---
>>  drivers/vhost/net.c        |    8 +++++---
>>  include/linux/miscdevice.h |    1 +
>>  2 files changed, 6 insertions(+), 3 deletions(-)
>
> :
> /*
>  * These allocations are managed by dev...@lanana.org. If you use an
>  * entry that is not in assigned your entry may well be moved and
>  * reassigned, or set dynamic if a fixed value is not justified.
>  */

I don't think that mailing address is used any more. Like many places in the kernel, the comment looks like a historical leftover.
Re: [PATCH] vhost-net: add module alias
On Wed, 11 Jan 2012 11:07:47 +0400 Michael Tokarev <m...@tls.msk.ru> wrote:
> On 11.01.2012 08:54, Stephen Hemminger wrote:
>> By adding a module alias, programs (or users) won't have to explicitly
>> call modprobe. Vhost-net will always be available if built into the kernel.
>> It does require assigning a permanent minor number for depmod to work.
>> Choose one next to TUN since this driver is related to it.
>
> Why do you think a statically-allocated device number will do any good
> at all? Static /dev is gone almost completely, at least on the systems
> where whole virt stuff makes any sense, so you don't have pre-created
> vhost-net device anymore, and hence this allocation makes no sense.
>
> Just IMHO anyway.

The statically allocated device number is required for the udev/module autoloading to work. Probably the udev infrastructure needs a consistent number to hang off of.

It looks like:
 * driver adds MODULE_ALIAS() for devname and character device
 * depmod scans modules and creates modules.devname (in /lib/modules)
 * udev uses modules.devname to autoload the module

$ /sbin/modinfo vhost_net
filename:       /lib/modules/3.2.0-net+/kernel/drivers/vhost/vhost_net.ko
alias:          devname:vhost-net
alias:          char-major-10-201
description:    Host kernel accelerator for virtio net
...

See also: https://lkml.org/lkml/2010/5/21/134
Re: [PATCH] vhost-net: add module alias
On 11.01.2012 20:58, Stephen Hemminger wrote:
> On Wed, 11 Jan 2012 11:07:47 +0400 Michael Tokarev <m...@tls.msk.ru> wrote:
>> On 11.01.2012 08:54, Stephen Hemminger wrote:
>>> By adding a module alias, programs (or users) won't have to explicitly
>>> call modprobe. Vhost-net will always be available if built into the kernel.
>>> It does require assigning a permanent minor number for depmod to work.
>>> Choose one next to TUN since this driver is related to it.
>>
>> Why do you think a statically-allocated device number will do any good
>> at all? Static /dev is gone almost completely, at least on the systems
>> where whole virt stuff makes any sense, so you don't have pre-created
>> vhost-net device anymore, and hence this allocation makes no sense.
>>
>> Just IMHO anyway.
> []
> See also: https://lkml.org/lkml/2010/5/21/134

Aha. So udev pre-creates statically-allocated devnodes nowadays:

  Udev will pick up the depmod created file on startup and create all the
  static device nodes which the kernel modules specify, so that these
  modules get automatically loaded when the device node is accessed...

This was the part I missed. Now it all looks logical.

Thanks,

/mjt
Re: [PATCH] vhost-net: add module alias
On Wed, Jan 11, 2012 at 17:58, Stephen Hemminger <shemmin...@vyatta.com> wrote:
> On Wed, 11 Jan 2012 11:07:47 +0400 Michael Tokarev <m...@tls.msk.ru> wrote:
>> On 11.01.2012 08:54, Stephen Hemminger wrote:
>>> By adding a module alias, programs (or users) won't have to explicitly
>>> call modprobe. Vhost-net will always be available if built into the kernel.
>>> It does require assigning a permanent minor number for depmod to work.
>>> Choose one next to TUN since this driver is related to it.
>>
>> Why do you think a statically-allocated device number will do any good
>> at all?

It's totally fine to use them for single-instance devices. You are right, enumerated devices must _never_ use any facility like that. That would just be broken.

>> Static /dev is gone almost completely, at least on the systems
>> where whole virt stuff makes any sense, so you don't have pre-created
>> vhost-net device anymore, and hence this allocation makes no sense.
>>
>> Just IMHO anyway.

It makes a lot of sense in this case. The kernel module files advertise the dev_t, it's not stored anywhere else. UDev finds these static numbers and does implicit mkdev() for them.

> The statically allocated device number is required for the udev/module
> autoloading to work. Probably the udev infrastructure needs a consistent
> number to hang off of.

It does that properly. Just check:

$ cat /lib/modules/$(uname -r)/modules.devname
# Device nodes to trigger on-demand module loading.
fuse fuse c10:229
btrfs btrfs-control c10:234
ppp_generic ppp c108:0
tun net/tun c10:200
uinput uinput c10:223
...

Kay
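As an illustration of the modules.devname entries shown above, the following sketch splits one entry into module name, device node name, node type and major:minor. The field layout is inferred only from the sample lines in this thread; struct devname_entry and parse_devname_line are hypothetical names, and this is not how udev or depmod actually implement the parsing.

```c
#include <stdio.h>

/* One parsed modules.devname entry (hypothetical type for this sketch). */
struct devname_entry {
	char module[64];     /* kernel module to load, e.g. "tun" */
	char devname[64];    /* node name below /dev, e.g. "net/tun" */
	char type;           /* 'c' for character, 'b' for block */
	unsigned int major;
	unsigned int minor;
};

/* Returns 0 on success, -1 for comments, blank lines and malformed input. */
static int parse_devname_line(const char *line, struct devname_entry *e)
{
	if (line[0] == '#' || line[0] == '\n' || line[0] == '\0')
		return -1;
	if (sscanf(line, "%63s %63s %c%u:%u", e->module, e->devname,
		   &e->type, &e->major, &e->minor) != 5)
		return -1;
	return 0;
}
```

Feeding it the line "tun net/tun c10:200" gives a character device with major 10 and minor 200, which is the static number a device node for that module would carry.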
Re: [regression] virtio net locks up
On Wed, Jan 11, 2012 at 4:18 PM, Bernd Schubert <bernd.schub...@itwm.fraunhofer.de> wrote:
> On 01/11/2012 05:04 PM, Stefan Hajnoczi wrote:
>> Try pinging the host's IP address from inside the guest. Run tcpdump
>> on the guest's tap interface from the host and observe whether or not
>> you see any packets being sent from the guest.
>
> Seems arp requests are still going out, but then don't go in:
>
> 17:16:21.202547 ARP, Reply 192.168.123.1 is-at 00:25:90:38:09:cd (oui Unknown), length 28
> 17:16:21.538724 ARP, Request who-has squeeze1 tell squeeze3, length 28
> 17:16:21.539026 ARP, Reply squeeze1 is-at 52:54:00:12:34:11 (oui Unknown), length 28
> 17:16:22.200912 ARP, Request who-has 192.168.123.1 tell squeeze3, length 28

Okay, so it seems networking from the tap device and beyond is fine.

>> rmmod virtio_net inside the guest and then modprobe virtio_net again.
>> See if network connectivity is restored (remember to rerun DHCP or
>> whatever, if necessary).
>
> Yep, that makes it work again. But probably is not the real solution ;)

It's just another piece of information which helps debug this :). At least nothing has wedged itself into an unrecoverable state.

When you said the problem happens without vhost, did you explicitly run vhost=off? Or did you just omit vhost=on?

This sounds like a guest kernel/driver issue. I recommend testing git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git in the guest to see if this has already been fixed.

If you have the -dbg RPMs installed it may be possible to insert a probe into the virtio_net kernel module and observe receive interrupts. This does require the right kernel CONFIG_ but you might already have it enabled:

$ sudo perf probe --add skb_recv_done
$ sudo perf record -e probe:skb_recv_done -a
...send some packets to the guest...
^C
$ sudo perf script

If you see no skb_recv_done events then the guest driver is not receiving a notification when packets are received.

You can find more about how to use perf-probe(1) at http://blog.vmsplice.net/2011/03/how-to-use-perf-probe.html.

Stefan
[PATCH] vhost-net: add module alias (v2)
By adding the correct module alias, programs won't have to explicitly call modprobe. Vhost-net will always be available if built into the kernel. It does require assigning a permanent minor number for depmod to work. Choose one next to TUN since this driver is related to it. Also, use C99 style initialization. Signed-off-by: Stephen Hemminger shemmin...@vyatta.com --- v2 - document minor number and make sure to not overlap Documentation/devices.txt |2 ++ drivers/vhost/net.c|8 +--- include/linux/miscdevice.h |1 + 3 files changed, 8 insertions(+), 3 deletions(-) --- a/drivers/vhost/net.c 2012-01-10 10:56:58.883179194 -0800 +++ b/drivers/vhost/net.c 2012-01-10 19:48:23.650225892 -0800 @@ -856,9 +856,9 @@ static const struct file_operations vhos }; static struct miscdevice vhost_net_misc = { - MISC_DYNAMIC_MINOR, - vhost-net, - vhost_net_fops, + .minor = VHOST_NET_MINOR, + .name = vhost-net, + .fops = vhost_net_fops, }; static int vhost_net_init(void) @@ -879,3 +879,5 @@ MODULE_VERSION(0.0.1); MODULE_LICENSE(GPL v2); MODULE_AUTHOR(Michael S. Tsirkin); MODULE_DESCRIPTION(Host kernel accelerator for virtio net); +MODULE_ALIAS_MISCDEV(VHOST_NET_MINOR); +MODULE_ALIAS(devname:vhost-net); --- a/include/linux/miscdevice.h2012-01-10 10:56:59.779189436 -0800 +++ b/include/linux/miscdevice.h2012-01-11 09:13:20.803694316 -0800 @@ -42,6 +42,7 @@ #define AUTOFS_MINOR 235 #define MAPPER_CTRL_MINOR 236 #define LOOP_CTRL_MINOR237 +#define VHOST_NET_MINOR238 #define MISC_DYNAMIC_MINOR 255 struct device; --- a/Documentation/devices.txt 2012-01-10 10:56:53.399116518 -0800 +++ b/Documentation/devices.txt 2012-01-11 09:12:49.251197653 -0800 @@ -447,6 +447,8 @@ Your cooperation is appreciated. 
234 = /dev/btrfs-controlBtrfs control device 235 = /dev/autofs Autofs control device 236 = /dev/mapper/control Device-Mapper control device + 237 = /dev/vhost-netHost kernel accelerator for virtio net + 240-254 Reserved for local use 255 Reserved for MISC_DYNAMIC_MINOR -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/3] Code clean up for percpu_xxx() functions
On Wed, Jan 11, 2012 at 05:08:41PM +0800, Alex,Shi wrote:
> On Mon, 2011-11-21 at 17:06 -0700, t...@kernel.org wrote:
>> (cc'ing hpa and quoting whole body)
>>
>> Signed-off-by: Alex Shi <alex@intel.com>
>> Acked-by: Christoph Lameter <c...@gentwo.org>
>> Acked-by: Tejun Heo <t...@kernel.org>
>>
>> hpa, I suppose this should go through x86?
>>
>> The original patch can be accessed at
>> http://article.gmane.org/gmane.linux.kernel/1218055/raw
>
> Resent for the 3.2 kernel; no change is needed to apply on latest Linus' tree. :)
>
> Actually, this clean up has no performance or security impact for the
> kernel. On the contrary, removing some potentially redundant preempt
> disable will bring a slight performance benefit to the kernel.
>
> This 3rd patch depends on the previous 2 patches. The 2nd one, the kvm
> code clean up, was submitted for the 3.3 kernel, but the 1st one, the net
> code clean up, is waiting for David's comments.

Alex, can you please collect all patches into a single patchset? Please split it such that usage changes are per-subsystem, so that they can be routed through the respective subsystems (x86 or net), and updates to percpu proper can be applied after the other changes have been applied.

It would probably be best to route these patches separately rather than all through percpu, as it touches a lot of different places and is likely to cause conflicts. I *think* the best way would be:

 * Submit per-subsystem patches and get them merged to subsystem trees.
 * (Optional) Apply a patch to mark the unused interface deprecated in the percpu tree, so that new usages in linux-next can be detected.
 * Towards the end of the next merge window, merge a patch to actually kill the old interface.

Thanks.

--
tejun
[PATCH 2/4 V8] Add functions to check if the host has stopped the vm
When a host stops or suspends a VM it will set a flag to show this. The watchdog will use these functions to determine if a softlockup is real, or the result of a suspended VM. Signed-off-by: Eric B Munson emun...@mgebm.net asm-generic changes Acked-by: Arnd Bergmann a...@arndb.de Cc: mi...@redhat.com Cc: h...@zytor.com Cc: ry...@linux.vnet.ibm.com Cc: aligu...@us.ibm.com Cc: mtosa...@redhat.com Cc: jeremy.fitzhardi...@citrix.com Cc: kvm@vger.kernel.org Cc: linux-a...@vger.kernel.org Cc: x...@kernel.org Cc: linux-ker...@vger.kernel.org --- Changes from V6: Use __this_cpu_and when clearing the PVCLOCK_GUEST_STOPPED flag Changes from V5: Collapse generic stubs into this patch check_and_clear_guest_stopped() takes no args and uses __get_cpu_var() Include individual definitions in ia64, s390, and powerpc arch/ia64/include/asm/kvm_para.h|5 + arch/powerpc/include/asm/kvm_para.h |5 + arch/s390/include/asm/kvm_para.h|5 + arch/x86/include/asm/kvm_para.h |8 arch/x86/kernel/kvmclock.c | 21 + include/asm-generic/kvm_para.h | 14 ++ 6 files changed, 58 insertions(+), 0 deletions(-) create mode 100644 include/asm-generic/kvm_para.h diff --git a/arch/ia64/include/asm/kvm_para.h b/arch/ia64/include/asm/kvm_para.h index 1588aee..2019cb9 100644 --- a/arch/ia64/include/asm/kvm_para.h +++ b/arch/ia64/include/asm/kvm_para.h @@ -26,6 +26,11 @@ static inline unsigned int kvm_arch_para_features(void) return 0; } +static inline bool kvm_check_and_clear_guest_paused(void) +{ + return false; +} + #endif #endif diff --git a/arch/powerpc/include/asm/kvm_para.h b/arch/powerpc/include/asm/kvm_para.h index 50533f9..1f80293 100644 --- a/arch/powerpc/include/asm/kvm_para.h +++ b/arch/powerpc/include/asm/kvm_para.h @@ -169,6 +169,11 @@ static inline unsigned int kvm_arch_para_features(void) return r; } +static inline bool kvm_check_and_clear_guest_paused(void) +{ + return false; +} + #endif /* __KERNEL__ */ #endif /* __POWERPC_KVM_PARA_H__ */ diff --git a/arch/s390/include/asm/kvm_para.h 
b/arch/s390/include/asm/kvm_para.h index 6964db2..a988329 100644 --- a/arch/s390/include/asm/kvm_para.h +++ b/arch/s390/include/asm/kvm_para.h @@ -149,6 +149,11 @@ static inline unsigned int kvm_arch_para_features(void) return 0; } +static inline bool kvm_check_and_clear_guest_paused(void) +{ + return false; +} + #endif #endif /* __S390_KVM_PARA_H */ diff --git a/arch/x86/include/asm/kvm_para.h b/arch/x86/include/asm/kvm_para.h index 734c376..99c4bbe 100644 --- a/arch/x86/include/asm/kvm_para.h +++ b/arch/x86/include/asm/kvm_para.h @@ -95,6 +95,14 @@ struct kvm_vcpu_pv_apf_data { extern void kvmclock_init(void); extern int kvm_register_clock(char *txt); +#ifdef CONFIG_KVM_CLOCK +bool kvm_check_and_clear_guest_paused(void); +#else +static inline bool kvm_check_and_clear_guest_paused(void) +{ + return false; +} +#endif /* CONFIG_KVMCLOCK */ /* This instruction is vmcall. On non-VT architectures, it will generate a * trap that we will then rewrite to the appropriate instruction. diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c index 44842d7..bdf6423 100644 --- a/arch/x86/kernel/kvmclock.c +++ b/arch/x86/kernel/kvmclock.c @@ -22,6 +22,7 @@ #include asm/msr.h #include asm/apic.h #include linux/percpu.h +#include linux/hardirq.h #include asm/x86_init.h #include asm/reboot.h @@ -114,6 +115,26 @@ static void kvm_get_preset_lpj(void) preset_lpj = lpj; } +bool kvm_check_and_clear_guest_paused(void) +{ + bool ret = false; + struct pvclock_vcpu_time_info *src; + + /* +* per_cpu() is safe here because this function is only called from +* timer functions where preemption is already disabled. 
+*/ + WARN_ON(!in_atomic()); + src = __get_cpu_var(hv_clock); + if ((src-flags PVCLOCK_GUEST_STOPPED) != 0) { + __this_cpu_and(hv_clock.flags, ~PVCLOCK_GUEST_STOPPED); + ret = true; + } + + return ret; +} +EXPORT_SYMBOL_GPL(kvm_check_and_clear_guest_paused); + static struct clocksource kvm_clock = { .name = kvm-clock, .read = kvm_clock_get_cycles, diff --git a/include/asm-generic/kvm_para.h b/include/asm-generic/kvm_para.h new file mode 100644 index 000..05ef7e7 --- /dev/null +++ b/include/asm-generic/kvm_para.h @@ -0,0 +1,14 @@ +#ifndef _ASM_GENERIC_KVM_PARA_H +#define _ASM_GENERIC_KVM_PARA_H + + +/* + * This function is used by architectures that support kvm to avoid issuing + * false soft lockup messages. + */ +static inline bool kvm_check_and_clear_guest_paused(void) +{ + return false; +} + +#endif -- 1.7.5.4 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 0/4 V8] Avoid soft lockup message when KVM is stopped by host
Changes from V7:
Define KVM_CAP_GUEST_PAUSED and support check
Call mark_page_dirty () after setting PVCLOCK_GUEST_STOPPED

Changes from V6:
Use __this_cpu_and when clearing the PVCLOCK_GUEST_STOPPED flag

Changes from V5:
Collapse generic check_and_clear_guest_stopped into patch 2
Include check_and_clear_guest_stopped definition to ia64, s390, and powerpc
Change check_and_clear_guest_stopped to use __get_cpu_var instead of taking the cpuid arg.
Protect check_and_clear_guest_stopped declaration with CONFIG_KVM_CLOCK check

Changes from V4:
Rename KVM_GUEST_PAUSED to KVMCLOCK_GUEST_PAUSED
Add description of KVMCLOCK_GUEST_PAUSED ioctl to api.txt

Changes from V3:
Include CC's on patch 3
Drop clear flag ioctl and have the watchdog clear the flag when it is reset

Changes from V2:
New kvm functions are defined in kvm_para.h; the only change to pvclock is the initial flag definition

Changes from V1:
(Thanks Marcelo)
Host code has all been moved to arch/x86/kvm/x86.c
KVM_PAUSE_GUEST was renamed to KVM_GUEST_PAUSED

When a guest kernel is stopped by the host hypervisor it can look like a soft lockup to the guest kernel. This false warning can mask later soft lockup warnings which may be real. This patch series adds a method for a host hypervisor to communicate to a guest kernel that it is being stopped. The final patch in the series has the watchdog check this flag when it goes to issue a soft lockup warning and skip the warning if the guest knows it was stopped.

An attempt was made to solve this in QEMU, but the side effects of saving and restoring the clock and tsc for each vcpu put the wall clock of the guest behind by the amount of time of the pause. This forces a guest to have ntp running in order to keep the wall clock accurate.

Cc: mi...@redhat.com
Cc: h...@zytor.com
Cc: ry...@linux.vnet.ibm.com
Cc: aligu...@us.ibm.com
Cc: mtosa...@redhat.com
Cc: jeremy.fitzhardi...@citrix.com
Cc: levinsasha...@gmail.com
Cc: Jan Kiszka <jan.kis...@siemens.com>
Cc: kvm@vger.kernel.org
Cc: linux-a...@vger.kernel.org
Cc: x...@kernel.org
Cc: linux-ker...@vger.kernel.org

Eric B Munson (4):
  Add flag to indicate that a vm was stopped by the host
  Add functions to check if the host has stopped the vm
  Add ioctl for KVMCLOCK_GUEST_STOPPED
  Add check for suspended vm in softlockup detector

 Documentation/virtual/kvm/api.txt   |   13 +++++++++++++
 arch/ia64/include/asm/kvm_para.h    |    5 +++++
 arch/powerpc/include/asm/kvm_para.h |    5 +++++
 arch/s390/include/asm/kvm_para.h    |    5 +++++
 arch/x86/include/asm/kvm_para.h     |    8 ++++++++
 arch/x86/include/asm/pvclock-abi.h  |    1 +
 arch/x86/kernel/kvmclock.c          |   21 +++++++++++++++++++++
 arch/x86/kvm/x86.c                  |   21 +++++++++++++++++++++
 include/asm-generic/kvm_para.h      |   14 ++++++++++++++
 include/linux/kvm.h                 |    3 +++
 kernel/watchdog.c                   |   12 ++++++++++++
 11 files changed, 108 insertions(+), 0 deletions(-)
 create mode 100644 include/asm-generic/kvm_para.h

--
1.7.5.4
[PATCH 1/4 V8] Add flag to indicate that a vm was stopped by the host
This flag will be used to check if the vm was stopped by the host when a soft lockup was detected. The host will set the flag when it stops the guest. On resume, the guest will check this flag if a soft lockup is detected and skip issuing the warning.

Signed-off-by: Eric B Munson <emun...@mgebm.net>

Cc: mi...@redhat.com
Cc: h...@zytor.com
Cc: ry...@linux.vnet.ibm.com
Cc: aligu...@us.ibm.com
Cc: mtosa...@redhat.com
Cc: jeremy.fitzhardi...@citrix.com
Cc: kvm@vger.kernel.org
Cc: linux-a...@vger.kernel.org
Cc: x...@kernel.org
Cc: linux-ker...@vger.kernel.org
---
 arch/x86/include/asm/pvclock-abi.h |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/pvclock-abi.h b/arch/x86/include/asm/pvclock-abi.h
index 35f2d19..6167fd7 100644
--- a/arch/x86/include/asm/pvclock-abi.h
+++ b/arch/x86/include/asm/pvclock-abi.h
@@ -40,5 +40,6 @@ struct pvclock_wall_clock {
 } __attribute__((__packed__));
 
 #define PVCLOCK_TSC_STABLE_BIT	(1 << 0)
+#define PVCLOCK_GUEST_STOPPED	(1 << 1)
 #endif /* __ASSEMBLY__ */
 #endif /* _ASM_X86_PVCLOCK_ABI_H */
--
1.7.5.4
[PATCH V5] Guest stop notification
Often when a guest is stopped from the qemu console, it will report spurious
soft lockup warnings on resume.  There are kernel patches being discussed that
will give the host the ability to tell the guest that it is being stopped and
should ignore the soft lockup warning that generates.  This patch uses the qemu
Notifier system to tell the guest it is about to be stopped.

Signed-off-by: Eric B Munson emun...@mgebm.net

Cc: Avi Kivity a...@redhat.com
Cc: Marcelo Tosatti mtosa...@redhat.com
Cc: Jan Kiszka jan.kis...@siemens.com
Cc: ry...@linux.vnet.ibm.com
Cc: aligu...@us.ibm.com
Cc: kvm@vger.kernel.org
---
Changes from V4:
 Test if the guest paused capability is available before use

Changes from V3:
 Collapse new state change notification function into existsing function.
 Correct whitespace issues
 Change ioctl name to KVMCLOCK_GUEST_PAUSED
 Use for loop to iterate vpcu's

Changes from V2:
 Move ioctl into hw/kvmclock.c so as other arches can use it as it is
 implemented

Changes from V1:
 Remove unnecessary encapsulating function

 hw/kvmclock.c |   20 ++++++++++++++++++++
 1 files changed, 20 insertions(+), 0 deletions(-)

diff --git a/hw/kvmclock.c b/hw/kvmclock.c
index 5388bc4..d071d61 100644
--- a/hw/kvmclock.c
+++ b/hw/kvmclock.c
@@ -16,6 +16,7 @@
 #include "sysbus.h"
 #include "kvm.h"
 #include "kvmclock.h"
+#include "cpu-all.h"
 
 #include <linux/kvm.h>
 #include <linux/kvm_para.h>
@@ -62,10 +63,29 @@ static int kvmclock_post_load(void *opaque, int version_id)
 static void kvmclock_vm_state_change(void *opaque, int running,
                                      RunState state)
 {
+    int ret;
+    CPUState *penv = first_cpu;
     KVMClockState *s = opaque;
+    int cap_guest_paused = kvm_check_extension(kvm_state, KVM_CAP_GUEST_PAUSED);
 
     if (running) {
         s->clock_valid = false;
+
+        if (!cap_guest_paused) {
+            return;
+        }
+
+        for (penv = first_cpu; penv != NULL; penv = penv->next_cpu) {
+            ret = kvm_vcpu_ioctl(penv, KVMCLOCK_GUEST_PAUSED, 0);
+            if (ret) {
+                if (ret != -EINVAL) {
+                    fprintf(stderr,
+                            "kvmclock_vm_state_change: %s\n",
+                            strerror(-ret));
+                }
+                return;
+            }
+        }
     }
 }
 
-- 
1.7.5.4
[PATCH 4/4 V8] Add check for suspended vm in softlockup detector
A suspended VM can cause spurious soft lockup warnings.  To avoid these, the
watchdog now checks if the kernel knows it was stopped by the host and skips
the warning if so.  When the watchdog is reset successfully, clear the guest
paused flag.

Signed-off-by: Eric B Munson emun...@mgebm.net

Cc: mi...@redhat.com
Cc: h...@zytor.com
Cc: ry...@linux.vnet.ibm.com
Cc: aligu...@us.ibm.com
Cc: mtosa...@redhat.com
Cc: jeremy.fitzhardi...@citrix.com
Cc: kvm@vger.kernel.org
Cc: linux-a...@vger.kernel.org
Cc: x...@kernel.org
Cc: linux-ker...@vger.kernel.org
---
Changes from V3:
 Clear the PAUSED flag when the watchdog is reset

 kernel/watchdog.c |   12 ++++++++++++
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 1d7bca7..91485e5 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -25,6 +25,7 @@
 #include <linux/sysctl.h>
 
 #include <asm/irq_regs.h>
+#include <linux/kvm_para.h>
 #include <linux/perf_event.h>
 
 int watchdog_enabled = 1;
@@ -280,6 +281,9 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
                         __this_cpu_write(softlockup_touch_sync, false);
                         sched_clock_tick();
                 }
+
+                /* Clear the guest paused flag on watchdog reset */
+                kvm_check_and_clear_guest_paused();
                 __touch_watchdog();
                 return HRTIMER_RESTART;
         }
@@ -292,6 +296,14 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
          */
         duration = is_softlockup(touch_ts);
         if (unlikely(duration)) {
+                /*
+                 * If a virtual machine is stopped by the host it can look to
+                 * the watchdog like a soft lockup, check to see if the host
+                 * stopped the vm before we issue the warning
+                 */
+                if (kvm_check_and_clear_guest_paused())
+                        return HRTIMER_RESTART;
+
                 /* only warn once */
                 if (__this_cpu_read(soft_watchdog_warn) == true)
                         return HRTIMER_RESTART;
-- 
1.7.5.4
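The check-and-clear behaviour that the watchdog change relies on can be
modelled in user space. This is only a sketch, not the kernel code:
demo_time_info, demo_check_and_clear_guest_paused(), demo_should_warn() and
the DEMO_GUEST_STOPPED bit value are hypothetical stand-ins for the pvclock
structure, kvm_check_and_clear_guest_paused() and PVCLOCK_GUEST_STOPPED.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit value; stands in for PVCLOCK_GUEST_STOPPED. */
#define DEMO_GUEST_STOPPED (1u << 1)

/* Hypothetical stand-in for the per-vcpu pvclock time info. */
struct demo_time_info {
    uint8_t flags;
};

/*
 * Return true if the host marked the guest as stopped, and clear the
 * mark so that a single pause excuses a single watchdog check.
 */
static bool demo_check_and_clear_guest_paused(struct demo_time_info *ti)
{
    if (!(ti->flags & DEMO_GUEST_STOPPED))
        return false;
    ti->flags &= (uint8_t)~DEMO_GUEST_STOPPED;
    return true;
}

/* Model of the watchdog decision: warn only for a lockup not caused by a pause. */
static bool demo_should_warn(struct demo_time_info *ti, bool lockup_detected)
{
    if (!lockup_detected)
        return false;
    if (demo_check_and_clear_guest_paused(ti))
        return false;   /* the host paused us, so skip the warning */
    return true;
}
```

With the flag set, the first call to demo_should_warn() with an apparent
lockup is suppressed and clears the flag; a second apparent lockup without a
new pause would be reported.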
[PATCH 3/4 V8] Add ioctl for KVMCLOCK_GUEST_STOPPED
Now that we have a flag that will tell the guest it was suspended, create an
interface for that communication using a KVM ioctl.

Signed-off-by: Eric B Munson emun...@mgebm.net

Cc: mi...@redhat.com
Cc: h...@zytor.com
Cc: ry...@linux.vnet.ibm.com
Cc: aligu...@us.ibm.com
Cc: mtosa...@redhat.com
Cc: jeremy.fitzhardi...@citrix.com
Cc: kvm@vger.kernel.org
Cc: linux-a...@vger.kernel.org
Cc: x...@kernel.org
Cc: linux-ker...@vger.kernel.org
---
Changes from V7:
 Define KVM_CAP_GUEST_PAUSED and support check
 Call mark_page_dirty () after setting PVCLOCK_GUEST_STOPPED

Changes from V4:
 Rename KVM_GUEST_PAUSED to KVMCLOCK_GUEST_PAUSED
 Add new ioctl description to api.txt

 Documentation/virtual/kvm/api.txt |   13 +++++++++++++
 arch/x86/kvm/x86.c                |   21 +++++++++++++++++++++
 include/linux/kvm.h               |    3 +++
 3 files changed, 37 insertions(+), 0 deletions(-)

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index e1d94bf..1931e5c 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -1491,6 +1491,19 @@ following algorithm:
 Some guests configure the LINT1 NMI input to cause a panic, aiding in
 debugging.
 
+4.65 KVMCLOCK_GUEST_PAUSED
+
+Capability: KVM_CAP_GUEST_PAUSED
+Architechtures: Any that implement pvclocks (currently x86 only)
+Type: vcpu ioctl
+Parameters: None
+Returns: 0 on success, -1 on error
+
+This signals to the host kernel that the specified guest is being paused by
+userspace.  The host will set a flag in the pvclock structure that is checked
+from the soft lockup watchdog.  This ioctl can be called during pause or
+unpause.
+
 5. The kvm_run structure
 
 Application code obtains a pointer to the kvm_run structure by
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 1171def..b0b51cb 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2056,6 +2056,7 @@ int kvm_dev_ioctl_check_extension(long ext)
         case KVM_CAP_X86_ROBUST_SINGLESTEP:
         case KVM_CAP_XSAVE:
         case KVM_CAP_ASYNC_PF:
+        case KVM_CAP_GUEST_PAUSED:
         case KVM_CAP_GET_TSC_KHZ:
                 r = 1;
                 break;
@@ -2503,6 +2504,22 @@ static int kvm_vcpu_ioctl_x86_set_xcrs(struct kvm_vcpu *vcpu,
         return r;
 }
 
+/*
+ * kvm_set_guest_paused() indicates to the guest kernel that it has been
+ * stopped by the hypervisor.  This function will be called from the host only.
+ * EINVAL is returned when the host attempts to set the flag for a guest that
+ * does not support pv clocks.
+ */
+static int kvm_set_guest_paused(struct kvm_vcpu *vcpu)
+{
+        struct pvclock_vcpu_time_info *src = &vcpu->arch.hv_clock;
+        if (!vcpu->arch.time_page)
+                return -EINVAL;
+        src->flags |= PVCLOCK_GUEST_STOPPED;
+        mark_page_dirty(vcpu->kvm, vcpu->arch.time >> PAGE_SHIFT);
+        return 0;
+}
+
 long kvm_arch_vcpu_ioctl(struct file *filp,
                          unsigned int ioctl, unsigned long arg)
 {
@@ -2784,6 +2801,10 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 
                 goto out;
         }
+        case KVMCLOCK_GUEST_PAUSED: {
+                r = kvm_set_guest_paused(vcpu);
+                break;
+        }
         default:
                 r = -EINVAL;
         }
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index 68e67e5..4ffe0df 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -558,6 +558,7 @@ struct kvm_ppc_pvinfo {
 #define KVM_CAP_PPC_PAPR 68
 #define KVM_CAP_S390_GMAP 71
 #define KVM_CAP_TSC_DEADLINE_TIMER 72
+#define KVM_CAP_GUEST_PAUSED 73
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -763,6 +764,8 @@ struct kvm_clock_data {
 #define KVM_CREATE_SPAPR_TCE      _IOW(KVMIO,  0xa8, struct kvm_create_spapr_tce)
 /* Available with KVM_CAP_RMA */
 #define KVM_ALLOCATE_RMA          _IOR(KVMIO,  0xa9, struct kvm_allocate_rma)
+/* VM is being stopped by host */
+#define KVMCLOCK_GUEST_PAUSED     _IO(KVMIO,   0xaa)
 
 #define KVM_DEV_ASSIGN_ENABLE_IOMMU     (1 << 0)
 
-- 
1.7.5.4
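As a rough illustration of what the _IO(KVMIO, 0xaa) definition in this patch
encodes, the sketch below rebuilds the request number by hand. It assumes the
generic Linux ioctl field layout (asm-generic/ioctl.h) and KVMIO being 0xAE;
some architectures use a different layout, and all demo_ and DEMO_ names are
made up for the example.

```c
#include <stdint.h>

/* Field positions of the generic Linux ioctl encoding (assumed layout). */
#define DEMO_IOC_NRSHIFT   0
#define DEMO_IOC_TYPESHIFT 8
#define DEMO_IOC_SIZESHIFT 16
#define DEMO_IOC_DIRSHIFT  30
#define DEMO_IOC_NONE      0u   /* no data transfer in either direction */

#define DEMO_KVMIO 0xAEu        /* value of KVMIO in linux/kvm.h */

/* Combine direction, type, number and argument size into one request. */
static uint32_t demo_ioc(uint32_t dir, uint32_t type, uint32_t nr,
                         uint32_t size)
{
    return (dir  << DEMO_IOC_DIRSHIFT)  |
           (type << DEMO_IOC_TYPESHIFT) |
           (nr   << DEMO_IOC_NRSHIFT)   |
           (size << DEMO_IOC_SIZESHIFT);
}

/* Equivalent of _IO(type, nr): an ioctl that carries no argument. */
static uint32_t demo_io(uint32_t type, uint32_t nr)
{
    return demo_ioc(DEMO_IOC_NONE, type, nr, 0);
}
```

Under these assumptions demo_io(DEMO_KVMIO, 0xaa) yields 0xAEAA, with the
direction and size fields left at zero, which matches the "Parameters: None"
line in the api.txt hunk.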
Re: [PATCH V5] Guest stop notification
On 2012-01-11 19:17, Eric B Munson wrote:
> Often when a guest is stopped from the qemu console, it will report spurious
> soft lockup warnings on resume.  There are kernel patches being discussed that
> will give the host the ability to tell the guest that it is being stopped and
> should ignore the soft lockup warning that generates.  This patch uses the qemu
> Notifier system to tell the guest it is about to be stopped.
>
> Signed-off-by: Eric B Munson emun...@mgebm.net
>
> Cc: Avi Kivity a...@redhat.com
> Cc: Marcelo Tosatti mtosa...@redhat.com
> Cc: Jan Kiszka jan.kis...@siemens.com
> Cc: ry...@linux.vnet.ibm.com
> Cc: aligu...@us.ibm.com
> Cc: kvm@vger.kernel.org
> ---
> Changes from V4:
>  Test if the guest paused capability is available before use
>
> Changes from V3:
>  Collapse new state change notification function into existsing function.
>  Correct whitespace issues
>  Change ioctl name to KVMCLOCK_GUEST_PAUSED
>  Use for loop to iterate vpcu's
>
> Changes from V2:
>  Move ioctl into hw/kvmclock.c so as other arches can use it as it is
>  implemented
>
> Changes from V1:
>  Remove unnecessary encapsulating function
>
>  hw/kvmclock.c |   20 ++++++++++++++++++++
>  1 files changed, 20 insertions(+), 0 deletions(-)
>
> diff --git a/hw/kvmclock.c b/hw/kvmclock.c
> index 5388bc4..d071d61 100644
> --- a/hw/kvmclock.c
> +++ b/hw/kvmclock.c
> @@ -16,6 +16,7 @@
>  #include "sysbus.h"
>  #include "kvm.h"
>  #include "kvmclock.h"
> +#include "cpu-all.h"
>  
>  #include <linux/kvm.h>
>  #include <linux/kvm_para.h>
> @@ -62,10 +63,29 @@ static int kvmclock_post_load(void *opaque, int version_id)
>  static void kvmclock_vm_state_change(void *opaque, int running,
>                                       RunState state)
>  {
> +    int ret;
> +    CPUState *penv = first_cpu;
>      KVMClockState *s = opaque;
> +    int cap_guest_paused = kvm_check_extension(kvm_state, KVM_CAP_GUEST_PAUSED);
>  
>      if (running) {
>          s->clock_valid = false;
> +
> +        if (!cap_guest_paused) {
> +            return;
> +        }

Why? You already ignore -EINVAL.

> +
> +        for (penv = first_cpu; penv != NULL; penv = penv->next_cpu) {
> +            ret = kvm_vcpu_ioctl(penv, KVMCLOCK_GUEST_PAUSED, 0);

This indicates that the interface could still be improved: GUEST_PAUSED
implies to me a VM state, but the IOCTL has to be applied per VCPU. This
is inconsistent. Why not define a per-VM IOCTL? Would make user space's
life a little bit easier as well.

Or is there a valid use case of selectively paused VCPUs? Then call it
KVMCLOCK_VCPU_PAUSED.

> +            if (ret) {
> +                if (ret != -EINVAL) {

What is special about -EINVAL (as long as the cap is checked)?

> +                    fprintf(stderr,
> +                            "kvmclock_vm_state_change: %s\n",
> +                            strerror(-ret));
> +                }
> +                return;
> +            }
> +        }
>      }
>  }

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
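The control flow of the reviewed loop can be tried outside QEMU with a mocked
ioctl. The sketch below is only a model under stated assumptions: demo_vcpu,
demo_vcpu_ioctl(), demo_notify_paused() and demo_should_report() are
hypothetical names, and the mock simply returns a preset result in place of
kvm_vcpu_ioctl(penv, KVMCLOCK_GUEST_PAUSED, 0).

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical vcpu list node; ioctl_result is what the mock will return. */
struct demo_vcpu {
    int id;
    int ioctl_result;
    struct demo_vcpu *next;
};

/* Mock for the per-vcpu ioctl: 0 on success, negative errno on failure. */
static int demo_vcpu_ioctl(const struct demo_vcpu *v)
{
    return v->ioctl_result;
}

/*
 * Walk the vcpu list and stop at the first failure, as the patch does.
 * Returns 0 or the first negative errno; *notified counts the vcpus
 * that were successfully told about the pause.
 */
static int demo_notify_paused(const struct demo_vcpu *first, int *notified)
{
    const struct demo_vcpu *v;

    *notified = 0;
    for (v = first; v != NULL; v = v->next) {
        int ret = demo_vcpu_ioctl(v);
        if (ret)
            return ret;
        (*notified)++;
    }
    return 0;
}

/* The patch prints an error for every failure except -EINVAL. */
static int demo_should_report(int ret)
{
    return ret != 0 && ret != -EINVAL;
}
```

In this model a failure on one vcpu leaves all later vcpus without the
notification, which is one practical consequence of a per-VCPU interface
that a per-VM ioctl would not have.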
State of KVM bits in linux-headers
Hi,

I'm a bit unhappy about the current state of our supposed to be
automatically sync'ed linux-headers directory in qemu. It has been
updated several times against undefined kernel trees, means against
neither a released version nor kvm.git. Now, if I run an update against
kvm.git + some local change, I get a churn of removals. Same will happen
when that local change ever goes upstream before the other stuff got
finally committed.

Alex, it looks to me like this is mostly PPC stuff. Can you comment on
the origin and workflow? E.g. KVM_CAP_SW_TLB: This has been added half a
year ago but is not in any Linux release around. Fishy...

I would like to see us avoiding this in the future. Headers update
patches should mention the source and should not be merged until the ABI
changes actually made it at least into kvm.git. Same applies, of course,
to the functional changes related to that ABI. Otherwise we risk quite
some mess on everyone's side.

Another thing: KVM_CAP_PPC_HIOR has been removed again from the kernel
and also the header. Is there real free space now or will the cap
reappear? If there should better be a placeholder, let's add it (to the
kernel).

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
Re: __direct_map() questions
Hi,

See Documentation/virtual/kvm/mmu.txt in the kernel source tree.

On Tue, Jan 10, 2012 at 11:41:41AM -0800, Nick H wrote:
> Hello All,
>
> I am preparing for a presentation for my community college, newbie to
> the kvm world. I am trying to understand kvm implementation. I am
> interested in doing a small presentation on kvm and its internals at
> my school.
>
> I am looking at __direct_map() . I see
> for_each_shadow_entry()->shadow_walk_xxx() (called in context of
> handle_ept_violation() ) functions using the gfn to find the
> iterator.sptep. It passes this iterator.sptep to the mmu_set_spte().
Re: [PATCH 2/2] KVM: fix missing illegal instruction-trap in protected modes
On Tue, Jan 10, 2012 at 03:26:49PM +0100, Stephan Bärwolf wrote:
> From 2168285ffb30716f30e129c3ce98ce42d19c4d4e Mon Sep 17 00:00:00 2001
> From: Stephan Baerwolf stephan.baerw...@tu-ilmenau.de
> Date: Sun, 8 Jan 2012 02:03:47 +0000
> Subject: [PATCH 2/2] KVM: fix missing illegal instruction-trap in protected modes
>
> On hosts without this patch, 32bit guests will crash
> (and 64bit guests may behave in a wrong way) for
> example by simply executing following nasm-demo-application:
>
>     [bits 32]
>     global _start
>     SECTION .text
>     _start: syscall
>
> (I tested it with winxp and linux - both always crashed)
>
>     Disassembly of section .text:
>
>     _start:
>        0:   0f 05                   syscall
>
> The reason seems a missing invalid opcode-trap (int6) for the
> syscall opcode 0f05, which is not available on Intel CPUs
> within non-longmodes, as also on some AMD CPUs within legacy-mode.
> (depending on CPU vendor, MSR_EFER and cpuid)
>
> Because previous mentioned OSs may not engage corresponding
> syscall target-registers (STAR, LSTAR, CSTAR), they remain
> NULL and (non trapping) syscalls are leading to multiple
> faults and finally crashs.
>
> Depending on the architecture (AMD or Intel) pretended by
> guests, various checks according to vendor's documentation
> are implemented to overcome the current issue and behave
> like the CPUs physical counterparts.
>
> (Therefore using Intel's Intel 64 and IA-32 Architecture Software
> Developers Manual http://www.intel.com/content/dam/doc/manual/
> 64-ia-32-architectures-software-developer-manual-325462.pdf
> and AMD's AMD64 Architecture Programmer's Manual Volume 3:
> General-Purpose and System Instructions
> http://support.amd.com/us/Processor_TechDocs/APM_V3_24594.pdf )
>
> Screenshots of an i686 testing VM (CORE i5 host) before
> and after applying this patch are available under:
>
> http://matrixstorm.com/software/linux/kvm/20111229/before.jpg
> http://matrixstorm.com/software/linux/kvm/20111229/after.jpg
>
> Signed-off-by: Stephan Baerwolf stephan.baerw...@tu-ilmenau.de
> ---
>  arch/x86/include/asm/kvm_emulate.h |   15 ++
>  arch/x86/kvm/emulate.c             |   92 ++-
>  2 files changed, 104 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h
> index b172bf4..5b68c23 100644
> --- a/arch/x86/include/asm/kvm_emulate.h
> +++ b/arch/x86/include/asm/kvm_emulate.h
> @@ -301,6 +301,21 @@ struct x86_emulate_ctxt {
>  #define X86EMUL_MODE_PROT     (X86EMUL_MODE_PROT16|X86EMUL_MODE_PROT32| \
>                                 X86EMUL_MODE_PROT64)
>  
> +/* CPUID vendors */
> +#define X86EMUL_CPUID_VENDOR_AuthenticAMD_ebx 0x68747541
> +#define X86EMUL_CPUID_VENDOR_AuthenticAMD_ecx 0x444d4163
> +#define X86EMUL_CPUID_VENDOR_AuthenticAMD_edx 0x69746e65
> +
> +#define X86EMUL_CPUID_VENDOR_AMDisbetter_ebx 0x69444d41
> +#define X86EMUL_CPUID_VENDOR_AMDisbetter_ecx 0x21726574
> +#define X86EMUL_CPUID_VENDOR_AMDisbetter_edx 0x74656273
> +
> +#define X86EMUL_CPUID_VENDOR_GenuineIntel_ebx 0x756e6547
> +#define X86EMUL_CPUID_VENDOR_GenuineIntel_ecx 0x6c65746e
> +#define X86EMUL_CPUID_VENDOR_GenuineIntel_edx 0x49656e69
> +
> +
> +
>  enum x86_intercept_stage {
>          X86_ICTP_NONE = 0,  /* Allow zero-init to not match anything */
>          X86_ICPT_PRE_EXCEPT,
> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
> index f1e3be1..3357411 100644
> --- a/arch/x86/kvm/emulate.c
> +++ b/arch/x86/kvm/emulate.c
> @@ -1877,6 +1877,94 @@ setup_syscalls_segments(struct x86_emulate_ctxt *ctxt,
>          ss->p = 1;
>  }
>  
> +static bool em_syscall_isenabled(struct x86_emulate_ctxt *ctxt)
> +{
> +        struct x86_emulate_ops *ops = ctxt->ops;
> +        u64 efer = 0;
> +
> +        /* syscall is not available in real mode*/
> +        if ((ctxt->mode == X86EMUL_MODE_REAL) ||
> +            (ctxt->mode == X86EMUL_MODE_VM86))
> +                return false;
> +
> +        ops->get_msr(ctxt, MSR_EFER, &efer);
> +        /* check - if guestOS is aware of syscall (0x0f05) */
> +        if ((efer & EFER_SCE) == 0) {
> +                return false;
> +        } else {
> +          /* ok, at this point it becomes vendor-specific */
> +          /* so first get us an cpuid */
> +          bool vendor;
> +          u32 eax, ebx, ecx, edx;
> +
> +          /* getting the cpu-vendor */
> +          eax = 0x00000000;
> +          ecx = 0x00000000;
> +          if (likely(ops->get_cpuid))
> +                vendor = ops->get_cpuid(ctxt, &eax, &ebx, &ecx, &edx);
> +          else  vendor = false;
> +
> +          if (likely(vendor)) {
> +
> +                /* AMD AuthenticAMD / AMDisbetter! */
> +                if (((ebx==X86EMUL_CPUID_VENDOR_AuthenticAMD_ebx) &&
> +                     (ecx==X86EMUL_CPUID_VENDOR_AuthenticAMD_ecx) &&
> +                     (edx==X86EMUL_CPUID_VENDOR_AuthenticAMD_edx)) ||
> +                    ((ebx==X86EMUL_CPUID_VENDOR_AMDisbetter_ebx) &&
> +                     (ecx==X86EMUL_CPUID_VENDOR_AMDisbetter_ecx) &&
> +
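The X86EMUL_CPUID_VENDOR_* constants in the quoted patch are the 12-byte CPUID
vendor string split across three registers. A small sketch, using made-up
helper names (demo_pack4, demo_vendor_regs, demo_cpuid_vendor), shows how a
vendor string maps to those values, assuming the usual CPUID leaf 0 register
order of EBX, EDX, ECX with the first character in the lowest byte.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Register values CPUID leaf 0 would report for a vendor string. */
struct demo_cpuid_vendor {
    uint32_t ebx;
    uint32_t ecx;
    uint32_t edx;
};

/* Pack four characters into a 32-bit value, first character lowest. */
static uint32_t demo_pack4(const char *s)
{
    return  (uint32_t)(unsigned char)s[0]        |
           ((uint32_t)(unsigned char)s[1] << 8)  |
           ((uint32_t)(unsigned char)s[2] << 16) |
           ((uint32_t)(unsigned char)s[3] << 24);
}

/*
 * Split a 12-character vendor string into EBX, EDX, ECX (in that order).
 * Returns false if the string does not have exactly 12 characters.
 */
static bool demo_vendor_regs(const char *vendor, struct demo_cpuid_vendor *out)
{
    if (strlen(vendor) != 12)
        return false;
    out->ebx = demo_pack4(vendor);
    out->edx = demo_pack4(vendor + 4);
    out->ecx = demo_pack4(vendor + 8);
    return true;
}
```

For "AuthenticAMD" this gives ebx 0x68747541, edx 0x69746e65 and
ecx 0x444d4163, the same values the patch defines; "GenuineIntel" and
"AMDisbetter!" work out the same way.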
[RFC][PATCH] Update linux headers against kvm.git
On 2012-01-11 20:16, Jan Kiszka wrote:
> Hi,
>
> I'm a bit unhappy about the current state of our supposed to be
> automatically sync'ed linux-headers directory in qemu. It has been
> updated several times against undefined kernel trees, means against
> neither a released version nor kvm.git. Now, if I run an update against
> kvm.git + some local change, I get a churn of removals. Same will happen
> when that local change ever goes upstream before the other stuff got
> finally committed.
>
> Alex, it looks to me like this is mostly PPC stuff. Can you comment on
> the origin and workflow? E.g. KVM_CAP_SW_TLB: This has been added half a
> year ago but is not in any Linux release around. Fishy...
>
> I would like to see us avoiding this in the future. Headers update
> patches should mention the source and should not be merged until the ABI
> changes actually made it at least into kvm.git. Same applies, of course,
> to the functional changes related to that ABI. Otherwise we risk quite
> some mess on everyone's side.
>
> Another thing: KVM_CAP_PPC_HIOR has been removed again from the kernel
> and also the header. Is there real free space now or will the cap
> reappear? If there should better be a placeholder, let's add it (to the
> kernel).
>
> Jan

Just to underline this, not for merge (yet). Is it clear that those PPC
features will be merged upstream as-is now?

Jan

---8<---

This synchronizes our headers with kvm.git ff92e9b557 - and breaks PPC
build. Fairly telling...
---
 linux-headers/asm-powerpc/kvm.h   |   37 -
 linux-headers/asm-x86/hyperv.h    |    1 +
 linux-headers/linux/kvm.h         |   54 ++--
 linux-headers/linux/kvm_para.h    |    1 -
 linux-headers/linux/virtio_ring.h |    6 ++--
 5 files changed, 7 insertions(+), 92 deletions(-)

diff --git a/linux-headers/asm-powerpc/kvm.h b/linux-headers/asm-powerpc/kvm.h
index fb3fddc..f7727d9 100644
--- a/linux-headers/asm-powerpc/kvm.h
+++ b/linux-headers/asm-powerpc/kvm.h
@@ -292,41 +292,4 @@ struct kvm_allocate_rma {
         __u64 rma_size;
 };
 
-struct kvm_book3e_206_tlb_entry {
-        __u32 mas8;
-        __u32 mas1;
-        __u64 mas2;
-        __u64 mas7_3;
-};
-
-struct kvm_book3e_206_tlb_params {
-        /*
-         * For mmu types KVM_MMU_FSL_BOOKE_NOHV and KVM_MMU_FSL_BOOKE_HV:
-         *
-         * - The number of ways of TLB0 must be a power of two between 2 and
-         *   16.
-         * - TLB1 must be fully associative.
-         * - The size of TLB0 must be a multiple of the number of ways, and
-         *   the number of sets must be a power of two.
-         * - The size of TLB1 may not exceed 64 entries.
-         * - TLB0 supports 4 KiB pages.
-         * - The page sizes supported by TLB1 are as indicated by
-         *   TLB1CFG (if MMUCFG[MAVN] = 0) or TLB1PS (if MMUCFG[MAVN] = 1)
-         *   as returned by KVM_GET_SREGS.
-         * - TLB2 and TLB3 are reserved, and their entries in tlb_sizes[]
-         *   and tlb_ways[] must be zero.
-         *
-         * tlb_ways[n] = tlb_sizes[n] means the array is fully associative.
-         *
-         * KVM will adjust TLBnCFG based on the sizes configured here,
-         * though arrays greater than 2048 entries will have TLBnCFG[NENTRY]
-         * set to zero.
-         */
-        __u32 tlb_sizes[4];
-        __u32 tlb_ways[4];
-        __u32 reserved[8];
-};
-
-#define KVM_ONE_REG_PPC_HIOR KVM_ONE_REG_PPC | 0x100
-
 #endif /* __LINUX_KVM_POWERPC_H */
diff --git a/linux-headers/asm-x86/hyperv.h b/linux-headers/asm-x86/hyperv.h
index 5df477a..b80420b 100644
--- a/linux-headers/asm-x86/hyperv.h
+++ b/linux-headers/asm-x86/hyperv.h
@@ -189,5 +189,6 @@
 #define HV_STATUS_INVALID_HYPERCALL_CODE        2
 #define HV_STATUS_INVALID_HYPERCALL_INPUT       3
 #define HV_STATUS_INVALID_ALIGNMENT             4
+#define HV_STATUS_INSUFFICIENT_BUFFERS          19
 
 #endif
diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index a8761d3..e36ad9a 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -371,6 +371,7 @@ struct kvm_s390_psw {
 #define KVM_S390_INT_VIRTIO             0xffff2603u
 #define KVM_S390_INT_SERVICE            0xffff2401u
 #define KVM_S390_INT_EMERGENCY          0xffff1201u
+#define KVM_S390_INT_EXTERNAL_CALL      0xffff1202u
 
 struct kvm_s390_interrupt {
         __u32 type;
@@ -554,10 +555,9 @@ struct kvm_ppc_pvinfo {
 #define KVM_CAP_PPC_SMT 64
 #define KVM_CAP_PPC_RMA 65
 #define KVM_CAP_MAX_VCPUS 66       /* returns max vcpus per vm */
-#define KVM_CAP_PPC_HIOR 67
 #define KVM_CAP_PPC_PAPR 68
-#define KVM_CAP_SW_TLB 69
-#define KVM_CAP_ONE_REG 70
+#define KVM_CAP_S390_GMAP 71
+#define KVM_CAP_TSC_DEADLINE_TIMER 72
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -637,49 +637,6 @@ struct kvm_clock_data {
         __u32 pad[9];
 };
 
-#define KVM_MMU_FSL_BOOKE_NOHV 0
-#define KVM_MMU_FSL_BOOKE_HV 1
-
-struct kvm_config_tlb {
-        __u64 params;
-        __u64 array;
-        __u32 mmu_type;
-        __u32
Re: State of KVM bits in linux-headers
On 11.01.2012, at 20:16, Jan Kiszka wrote:

> Hi,
>
> I'm a bit unhappy about the current state of our supposed to be
> automatically sync'ed linux-headers directory in qemu. It has been
> updated several times against undefined kernel trees, means against
> neither a released version nor kvm.git. Now, if I run an update against
> kvm.git + some local change, I get a churn of removals. Same will happen
> when that local change ever goes upstream before the other stuff got
> finally committed.

Yes, call me even more unhappy about it :(.

> Alex, it looks to me like this is mostly PPC stuff. Can you comment on
> the origin and workflow? E.g. KVM_CAP_SW_TLB: This has been added half a
> year ago but is not in any Linux release around. Fishy...

Ok, here's my workflow:

  * KVM: receive patches on the ML
  * KVM: wait for reviews, review myself
  * KVM: send out a pull request
  -- this is the point in time where I assume the ABI can be considered stable --
  * QEMU: run update on the headers, because in a perfect world things should hit kvm.git any day
  * KVM: pull request gets reviews causing not-pulls or abi changes and lots of churn because i need forever to pullreq again ;)

I guess you see the problem. Hence I haven't pushed any kernel header updates since I realized how badly broken that process was. However even the stuff that's in qemu.git now hasn't managed to get upstream yet.

> I would like to see us avoiding this in the future. Headers update
> patches should mention the source and should not be merged until the ABI
> changes actually made it at least into kvm.git. Same applies, of course,
> to the functional changes related to that ABI. Otherwise we risk quite
> some mess on everyone's side.

I agree.

> Another thing: KVM_CAP_PPC_HIOR has been removed again from the kernel
> and also the header. Is there real free space now or will the cap
> reappear? If there should better be a placeholder, let's add it (to the
> kernel).

I will reappear with ONE_REG semantics.

Alex
Re: [PATCH] vhost-net: add module alias (v2)
On Wed, Jan 11, 2012 at 09:16:53AM -0800, Stephen Hemminger wrote:
> By adding the correct module alias, programs won't have to explicitly
> call modprobe. Vhost-net will always be available if built into the kernel.
> It does require assigning a permanent minor number for depmod to work.
> Choose one next to TUN since this driver is related to it.
>
> Also, use C99 style initialization.
>
> Signed-off-by: Stephen Hemminger shemmin...@vyatta.com

I don't mind but this needs an Ack from Alan Cox who made it dynamic in
the first place, see 79907d89c397b8bc2e05b347ec94e928ea919d33.

> ---
> v2 - document minor number and make sure to not overlap
>
>  Documentation/devices.txt  |    2 ++
>  drivers/vhost/net.c        |    8 +---
>  include/linux/miscdevice.h |    1 +
>  3 files changed, 8 insertions(+), 3 deletions(-)
>
> --- a/drivers/vhost/net.c     2012-01-10 10:56:58.883179194 -0800
> +++ b/drivers/vhost/net.c     2012-01-10 19:48:23.650225892 -0800
> @@ -856,9 +856,9 @@ static const struct file_operations vhos
>  };
>  
>  static struct miscdevice vhost_net_misc = {
> -        MISC_DYNAMIC_MINOR,
> -        "vhost-net",
> -        &vhost_net_fops,
> +        .minor = VHOST_NET_MINOR,
> +        .name = "vhost-net",
> +        .fops = &vhost_net_fops,
>  };
>  
>  static int vhost_net_init(void)
> @@ -879,3 +879,5 @@ MODULE_VERSION("0.0.1");
>  MODULE_LICENSE("GPL v2");
>  MODULE_AUTHOR("Michael S. Tsirkin");
>  MODULE_DESCRIPTION("Host kernel accelerator for virtio net");
> +MODULE_ALIAS_MISCDEV(VHOST_NET_MINOR);
> +MODULE_ALIAS("devname:vhost-net");
> --- a/include/linux/miscdevice.h      2012-01-10 10:56:59.779189436 -0800
> +++ b/include/linux/miscdevice.h      2012-01-11 09:13:20.803694316 -0800
> @@ -42,6 +42,7 @@
>  #define AUTOFS_MINOR            235
>  #define MAPPER_CTRL_MINOR       236
>  #define LOOP_CTRL_MINOR         237
> +#define VHOST_NET_MINOR         238
>  #define MISC_DYNAMIC_MINOR      255
>  
>  struct device;
> --- a/Documentation/devices.txt       2012-01-10 10:56:53.399116518 -0800
> +++ b/Documentation/devices.txt       2012-01-11 09:12:49.251197653 -0800
> @@ -447,6 +447,8 @@ Your cooperation is appreciated.
>                  234 = /dev/btrfs-control        Btrfs control device
>                  235 = /dev/autofs       Autofs control device
>                  236 = /dev/mapper/control       Device-Mapper control device
> +                237 = /dev/vhost-net    Host kernel accelerator for virtio net
> +
>                  240-254                 Reserved for local use
>                  255                     Reserved for MISC_DYNAMIC_MINOR
Re: [Qemu-devel] State of KVM bits in linux-headers
On 01/11/2012 01:32 PM, Alexander Graf wrote:
>
> On 11.01.2012, at 20:16, Jan Kiszka wrote:
>
>> Hi,
>>
>> I'm a bit unhappy about the current state of our supposed to be
>> automatically sync'ed linux-headers directory in qemu. It has been
>> updated several times against undefined kernel trees, means against
>> neither a released version nor kvm.git. Now, if I run an update against
>> kvm.git + some local change, I get a churn of removals. Same will happen
>> when that local change ever goes upstream before the other stuff got
>> finally committed.
>
> Yes, call me even more unhappy about it :(.

May I suggest the following:

1) Have the header syncing script take a commit hash that's stored in git.
   Make script ensure that this has is in Linus' tree.

2) Maintain a patch on top of Linus' tree in qemu.git that the script would
   apply before actually syncing header files.

That let's us track how we're differing from upstream in a more reliable
fashion.

>> Alex, it looks to me like this is mostly PPC stuff. Can you comment on
>> the origin and workflow? E.g. KVM_CAP_SW_TLB: This has been added half a
>> year ago but is not in any Linux release around. Fishy...
>
> Ok, here's my workflow:
>
>    * KVM: receive patches on the ML
>    * KVM: wait for reviews, review myself
>    * KVM: send out a pull request
>    -- this is the point in time where I assume the ABI can be considered stable --
>    * QEMU: run update on the headers, because in a perfect world things should hit kvm.git any day
>    * KVM: pull request gets reviews causing not-pulls or abi changes and lots of churn because i need forever to pullreq again ;)
>
> I guess you see the problem. Hence I haven't pushed any kernel header
> updates since I realized how badly broken that process was. However even
> the stuff that's in qemu.git now hasn't managed to get upstream yet.

I don't think it's a broken process. I think you made a reasonable set of
assumptions. I think it was just an exceptional circumstance.

Regards,

Anthony Liguori
Re: State of KVM bits in linux-headers
On 2012-01-11 20:32, Alexander Graf wrote:
>
> On 11.01.2012, at 20:16, Jan Kiszka wrote:
>
>> Hi,
>>
>> I'm a bit unhappy about the current state of our supposed to be
>> automatically sync'ed linux-headers directory in qemu. It has been
>> updated several times against undefined kernel trees, means against
>> neither a released version nor kvm.git. Now, if I run an update against
>> kvm.git + some local change, I get a churn of removals. Same will happen
>> when that local change ever goes upstream before the other stuff got
>> finally committed.
>
> Yes, call me even more unhappy about it :(.
>
>> Alex, it looks to me like this is mostly PPC stuff. Can you comment on
>> the origin and workflow? E.g. KVM_CAP_SW_TLB: This has been added half a
>> year ago but is not in any Linux release around. Fishy...
>
> Ok, here's my workflow:
>
>   * KVM: receive patches on the ML
>   * KVM: wait for reviews, review myself
>   * KVM: send out a pull request
>   -- this is the point in time where I assume the ABI can be considered stable --
>   * QEMU: run update on the headers, because in a perfect world things should hit kvm.git any day
>   * KVM: pull request gets reviews causing not-pulls or abi changes and lots of churn because i need forever to pullreq again ;)

Likely, the last item has to be moved up by two steps...

> I guess you see the problem. Hence I haven't pushed any kernel header
> updates since I realized how badly broken that process was. However even
> the stuff that's in qemu.git now hasn't managed to get upstream yet.

On the other hand, if I recall correctly, there were some complaint on
the list recently about a header update patch again a Linux -rc version.
Because it removed the limbo land stuff in the same run, of course.
That's very bad.

I see the problem: ppc targets will no longer build, at least with KVM
enabled, right? But this needs to be resolved now.

>> I would like to see us avoiding this in the future. Headers update
>> patches should mention the source and should not be merged until the ABI
>> changes actually made it at least into kvm.git. Same applies, of course,
>> to the functional changes related to that ABI. Otherwise we risk quite
>> some mess on everyone's side.
>
> I agree.
>
>> Another thing: KVM_CAP_PPC_HIOR has been removed again from the kernel
>> and also the header. Is there real free space now or will the cap
>> reappear? If there should better be a placeholder, let's add it (to the
>> kernel).
>
> I will reappear with ONE_REG semantics.

OK. Then please clean up now so that update-linux-headers.sh can be used
again by normal developers. :)

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
Re: [Qemu-devel] State of KVM bits in linux-headers
On 01/11/2012 01:38 PM, Jan Kiszka wrote:
>>> I would like to see us avoiding this in the future. Headers update
>>> patches should mention the source and should not be merged until the ABI
>>> changes actually made it at least into kvm.git. Same applies, of course,
>>> to the functional changes related to that ABI. Otherwise we risk quite
>>> some mess on everyone's side.
>>
>> I agree.
>>
>>> Another thing: KVM_CAP_PPC_HIOR has been removed again from the kernel
>>> and also the header. Is there real free space now or will the cap
>>> reappear? If there should better be a placeholder, let's add it (to the
>>> kernel).
>>
>> I will reappear with ONE_REG semantics.
>
> OK. Then please clean up now so that update-linux-headers.sh can be used
> again by normal developers. :)

Before we did submodules and had a responsive BIOS maintainer, we maintained
patches within qemu.git for our external dependencies.

I think that's a good strategy here too. It's a little painful, but not
entirely awful.

At least it makes it possible for you to (hopefully) trivial rebase a patch if
something is still in limbo.

Regards,

Anthony Liguori

> Jan
Re: [Qemu-devel] State of KVM bits in linux-headers
On 11.01.2012, at 20:38, Anthony Liguori wrote: On 01/11/2012 01:32 PM, Alexander Graf wrote: On 11.01.2012, at 20:16, Jan Kiszka wrote: Hi, I'm a bit unhappy about the current state of our supposed to be automatically sync'ed linux-headers directory in qemu. It has been updated several times against undefined kernel trees, means against neither a released version nor kvm.git. Now, if I run an update against kvm.git + some local change, I get a churn of removals. Same will happen when that local change ever goes upstream before the other stuff got finally committed. Yes, call me even more unhappy about it :(. May I suggest the following: 1) Have the header syncing script take a commit hash that's stored in git. Make script ensure that this has is in Linus' tree. 2) Maintain a patch on top of Linus' tree in qemu.git that the script would apply before actually syncing header files. That let's us track how we're differing from upstream in a more reliable fashion. Yeah, I guess the ultimate question it boils down to is: when is something upstream? The average time it takes for patches to trickle through to Linus right now is in the magnitude of half a year to a year. Alex, it looks to me like this is mostly PPC stuff. Can you comment on the origin and workflow? E.g. KVM_CAP_SW_TLB: This has been added half a year ago but is not in any Linux release around. Fishy... Ok, here's my workflow: * KVM: receive patches on the ML * KVM: wait for reviews, review myself * KVM: send out a pull request -- this is the point in time where I assume the ABI can be considered stable -- * QEMU: run update on the headers, because in a perfect world things should hit kvm.git any day * KVM: pull request gets reviews causing not-pulls or abi changes and lots of churn because i need forever to pullreq again ;) I guess you see the problem. Hence I haven't pushed any kernel header updates since I realized how badly broken that process was. 
However even the stuff that's in qemu.git now hasn't managed to get upstream yet. I don't think it's a broken process. I think you made a reasonable set of assumptions. I think it was just an exceptional circumstance. Several times in a row? No, the assumptions were just wrong. In the kvm world, pull requests don't mean upstream, they mean the same as a patch set. Alex
Re: [Qemu-devel] State of KVM bits in linux-headers
On 11.01.2012, at 20:41, Anthony Liguori wrote: On 01/11/2012 01:38 PM, Jan Kiszka wrote: I would like to see us avoiding this in the future. Headers update patches should mention the source and should not be merged until the ABI changes actually made it at least into kvm.git. Same applies, of course, to the functional changes related to that ABI. Otherwise we risk quite some mess on everyone's side. I agree. Another thing: KVM_CAP_PPC_HIOR has been removed again from the kernel and also the header. Is there real free space now or will the cap reappear? If there should better be a placeholder, let's add it (to the kernel). I will reappear with ONE_REG semantics. OK. Then please clean up now so that update-linux-headers.sh can be used again by normal developers. :) Before we did submodules and had a responsive BIOS maintainer, we maintained patches within qemu.git for our external dependencies. I think that's a good strategy here too. It's a little painful, but not entirely awful. At least it makes it possible for you to (hopefully) trivial rebase a patch if something is still in limbo. Yeah, that works. I can easily script that part. It doesn't solve the actual underlying problem though that we don't know when the abi is actually stable. I'm slowly starting to understand Pekka ;). Alex -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] State of KVM bits in linux-headers
On 2012-01-11 20:38, Anthony Liguori wrote: On 01/11/2012 01:32 PM, Alexander Graf wrote: On 11.01.2012, at 20:16, Jan Kiszka wrote: Hi, I'm a bit unhappy about the current state of our supposedly automatically synced linux-headers directory in qemu. It has been updated several times against undefined kernel trees, meaning against neither a released version nor kvm.git. Now, if I run an update against kvm.git + some local change, I get a churn of removals. The same will happen when that local change ever goes upstream before the other stuff finally gets committed. Yes, call me even more unhappy about it :(. May I suggest the following:
1) Have the header syncing script take a commit hash that's stored in git. Make the script ensure that this hash is in Linus' tree.
2) Maintain a patch on top of Linus' tree in qemu.git that the script would apply before actually syncing header files.
That lets us track how we're differing from upstream in a more reliable fashion. That sounds fairly complicated for a simple problem: Do not merge ABI changes that aren't at least in kvm.git. There are also other reasons for this, besides making the sync harder. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
Re: [Qemu-devel] State of KVM bits in linux-headers
On 11.01.2012, at 20:46, Jan Kiszka wrote: On 2012-01-11 20:38, Anthony Liguori wrote: On 01/11/2012 01:32 PM, Alexander Graf wrote: On 11.01.2012, at 20:16, Jan Kiszka wrote: Hi, I'm a bit unhappy about the current state of our supposedly automatically synced linux-headers directory in qemu. It has been updated several times against undefined kernel trees, meaning against neither a released version nor kvm.git. Now, if I run an update against kvm.git + some local change, I get a churn of removals. The same will happen when that local change ever goes upstream before the other stuff finally gets committed. Yes, call me even more unhappy about it :(. May I suggest the following:
1) Have the header syncing script take a commit hash that's stored in git. Make the script ensure that this hash is in Linus' tree.
2) Maintain a patch on top of Linus' tree in qemu.git that the script would apply before actually syncing header files.
That lets us track how we're differing from upstream in a more reliable fashion. That sounds fairly complicated for a simple problem: Do not merge ABI changes that aren't at least in kvm.git. There are also other reasons for this, besides making the sync harder. Let's just try to get my patch queue into kvm.git asap and then never again push linux-header updates before they hit kvm.git. That's easier than setting up any complicated processes or scripts. Alex
Re: [Qemu-devel] State of KVM bits in linux-headers
On 2012-01-11 20:46, Alexander Graf wrote: On 11.01.2012, at 20:41, Anthony Liguori wrote: On 01/11/2012 01:38 PM, Jan Kiszka wrote: I would like to see us avoiding this in the future. Headers update patches should mention the source and should not be merged until the ABI changes actually made it at least into kvm.git. Same applies, of course, to the functional changes related to that ABI. Otherwise we risk quite some mess on everyone's side. I agree. Another thing: KVM_CAP_PPC_HIOR has been removed again from the kernel and also the header. Is there real free space now or will the cap reappear? If there should better be a placeholder, let's add it (to the kernel). I will reappear with ONE_REG semantics. OK. Then please clean up now so that update-linux-headers.sh can be used again by normal developers. :) Before we did submodules and had a responsive BIOS maintainer, we maintained patches within qemu.git for our external dependencies. I think that's a good strategy here too. It's a little painful, but not entirely awful. At least it makes it possible for you to (hopefully) trivial rebase a patch if something is still in limbo. Yeah, that works. I can easily script that part. It doesn't solve the actual underlying problem though that we don't know when the abi is actually stable. I'm slowly starting to understand Pekka ;). IIRC, we never had this problem with qemu-kvm - as the merges were coordinated with the kernel (subsystem) tree. Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] vhost-net: add module alias
On Wed, Jan 11, 2012 at 08:54:26AM -0800, Stephen Hemminger wrote: On Wed, 11 Jan 2012 15:43:42 +0800 Amos Kong kongjian...@gmail.com wrote: On Wed, Jan 11, 2012 at 12:54 PM, Stephen Hemminger shemmin...@vyatta.com wrote: By adding a module alias, programs (or users) won't have to explicitly call modprobe. Vhost-net will always be available if built into the kernel. It does require assigning a permanent minor number for depmod to work. Choose one next to TUN since this driver is related to it. Also, use C99 style initialization. Signed-off-by: Stephen Hemminger shemmin...@vyatta.com --- drivers/vhost/net.c|8 +--- include/linux/miscdevice.h |1 + 2 files changed, 6 insertions(+), 3 deletions(-) : /* * These allocations are managed by dev...@lanana.org. If you use an * entry that is not in assigned your entry may well be moved and * reassigned, or set dynamic if a fixed value is not justified. */ I didn't think that mailing address was used any more. Like many places in the kernel, the comment looked like a historical leftover. This was only added in 2010, see 79907d89c397b8bc2e05b347ec94e928ea919d33. That said, at least the lanana.org web site seems to be down. Alan, any idea? -- MST
Re: [Qemu-devel] State of KVM bits in linux-headers
On 01/11/2012 01:48 PM, Jan Kiszka wrote: On 2012-01-11 20:46, Alexander Graf wrote: On 11.01.2012, at 20:41, Anthony Liguori wrote: On 01/11/2012 01:38 PM, Jan Kiszka wrote: I would like to see us avoiding this in the future. Headers update patches should mention the source and should not be merged until the ABI changes actually made it at least into kvm.git. Same applies, of course, to the functional changes related to that ABI. Otherwise we risk quite some mess on everyone's side. I agree. Another thing: KVM_CAP_PPC_HIOR has been removed again from the kernel and also the header. Is there real free space now or will the cap reappear? If there should better be a placeholder, let's add it (to the kernel). I will reappear with ONE_REG semantics. OK. Then please clean up now so that update-linux-headers.sh can be used again by normal developers. :) Before we did submodules and had a responsive BIOS maintainer, we maintained patches within qemu.git for our external dependencies. I think that's a good strategy here too. It's a little painful, but not entirely awful. At least it makes it possible for you to (hopefully) trivial rebase a patch if something is still in limbo. Yeah, that works. I can easily script that part. It doesn't solve the actual underlying problem though that we don't know when the abi is actually stable. I'm slowly starting to understand Pekka ;). IIRC, we never had this problem with qemu-kvm - as the merges were coordinated with the kernel (subsystem) tree. Are you suggesting that kvm header updates go through uq/master? That seems reasonable to me and is certainly the least amount of change. Regards, Anthony Liguori Jan -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] State of KVM bits in linux-headers
On 11.01.2012, at 20:52, Anthony Liguori wrote: On 01/11/2012 01:48 PM, Jan Kiszka wrote: On 2012-01-11 20:46, Alexander Graf wrote: On 11.01.2012, at 20:41, Anthony Liguori wrote: On 01/11/2012 01:38 PM, Jan Kiszka wrote: I would like to see us avoiding this in the future. Headers update patches should mention the source and should not be merged until the ABI changes actually made it at least into kvm.git. Same applies, of course, to the functional changes related to that ABI. Otherwise we risk quite some mess on everyone's side. I agree. Another thing: KVM_CAP_PPC_HIOR has been removed again from the kernel and also the header. Is there real free space now or will the cap reappear? If there should better be a placeholder, let's add it (to the kernel). I will reappear with ONE_REG semantics. OK. Then please clean up now so that update-linux-headers.sh can be used again by normal developers. :) Before we did submodules and had a responsive BIOS maintainer, we maintained patches within qemu.git for our external dependencies. I think that's a good strategy here too. It's a little painful, but not entirely awful. At least it makes it possible for you to (hopefully) trivial rebase a patch if something is still in limbo. Yeah, that works. I can easily script that part. It doesn't solve the actual underlying problem though that we don't know when the abi is actually stable. I'm slowly starting to understand Pekka ;). IIRC, we never had this problem with qemu-kvm - as the merges were coordinated with the kernel (subsystem) tree. Are you suggesting that kvm header updates go through uq/master? That seems reasonable to me and is certainly the least amount of change. So how about code that actually leverages the new headers? Alex -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] State of KVM bits in linux-headers
On 2012-01-11 20:52, Anthony Liguori wrote: On 01/11/2012 01:48 PM, Jan Kiszka wrote: On 2012-01-11 20:46, Alexander Graf wrote: On 11.01.2012, at 20:41, Anthony Liguori wrote: On 01/11/2012 01:38 PM, Jan Kiszka wrote: I would like to see us avoiding this in the future. Headers update patches should mention the source and should not be merged until the ABI changes actually made it at least into kvm.git. Same applies, of course, to the functional changes related to that ABI. Otherwise we risk quite some mess on everyone's side. I agree. Another thing: KVM_CAP_PPC_HIOR has been removed again from the kernel and also the header. Is there real free space now or will the cap reappear? If there should better be a placeholder, let's add it (to the kernel). I will reappear with ONE_REG semantics. OK. Then please clean up now so that update-linux-headers.sh can be used again by normal developers. :) Before we did submodules and had a responsive BIOS maintainer, we maintained patches within qemu.git for our external dependencies. I think that's a good strategy here too. It's a little painful, but not entirely awful. At least it makes it possible for you to (hopefully) trivial rebase a patch if something is still in limbo. Yeah, that works. I can easily script that part. It doesn't solve the actual underlying problem though that we don't know when the abi is actually stable. I'm slowly starting to understand Pekka ;). IIRC, we never had this problem with qemu-kvm - as the merges were coordinated with the kernel (subsystem) tree. Are you suggesting that kvm header updates go through uq/master? That seems reasonable to me and is certainly the least amount of change. Would be possible at least for changes that affect KVM bits. But we also use that headers for virtio and vhost. VFIO will surely join that group. So there is still coordination necessary. 
Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux
Re: [Qemu-devel] State of KVM bits in linux-headers
On 01/11/2012 01:53 PM, Alexander Graf wrote: On 11.01.2012, at 20:52, Anthony Liguori wrote: IIRC, we never had this problem with qemu-kvm - as the merges were coordinated with the kernel (subsystem) tree. Are you suggesting that kvm header updates go through uq/master? That seems reasonable to me and is certainly the least amount of change. So how about code that actually leverages the new headers? Shared KVM infrastructure should go through uq/master. So changes to kvm-all.c, linux-headers/* should go through uq/master. Target specific kvm changes should go through the appropriate submaintainer's tree. Regards, Anthony Liguori Alex
Re: [PATCH 2/2] KVM: fix missing illegal instruction-trap in protected modes
On 01/11/12 20:09, Marcelo Tosatti wrote: On Tue, Jan 10, 2012 at 03:26:49PM +0100, Stephan Bärwolf wrote: From 2168285ffb30716f30e129c3ce98ce42d19c4d4e Mon Sep 17 00:00:00 2001 From: Stephan Baerwolf stephan.baerw...@tu-ilmenau.de Date: Sun, 8 Jan 2012 02:03:47 + Subject: [PATCH 2/2] KVM: fix missing illegal instruction-trap in protected modes On hosts without this patch, 32bit guests will crash (and 64bit guests may behave in a wrong way) for example by simply executing following nasm-demo-application: [bits 32] global _start SECTION .text _start: syscall (I tested it with winxp and linux - both always crashed) Disassembly of section .text: _start: 0: 0f 05 syscall The reason seems a missing invalid opcode-trap (int6) for the syscall opcode 0f05, which is not available on Intel CPUs within non-longmodes, as also on some AMD CPUs within legacy-mode. (depending on CPU vendor, MSR_EFER and cpuid) Because previous mentioned OSs may not engage corresponding syscall target-registers (STAR, LSTAR, CSTAR), they remain NULL and (non trapping) syscalls are leading to multiple faults and finally crashs. Depending on the architecture (AMD or Intel) pretended by guests, various checks according to vendor's documentation are implemented to overcome the current issue and behave like the CPUs physical counterparts. 
(Therefore using Intel's Intel 64 and IA-32 Architecture Software Developers Manual http://www.intel.com/content/dam/doc/manual/ 64-ia-32-architectures-software-developer-manual-325462.pdf and AMD's AMD64 Architecture Programmer's Manual Volume 3: General-Purpose and System Instructions http://support.amd.com/us/Processor_TechDocs/APM_V3_24594.pdf ) Screenshots of an i686 testing VM (CORE i5 host) before and after applying this patch are available under: http://matrixstorm.com/software/linux/kvm/20111229/before.jpg http://matrixstorm.com/software/linux/kvm/20111229/after.jpg Signed-off-by: Stephan Baerwolf stephan.baerw...@tu-ilmenau.de --- arch/x86/include/asm/kvm_emulate.h | 15 ++ arch/x86/kvm/emulate.c | 92 ++- 2 files changed, 104 insertions(+), 3 deletions(-) diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h index b172bf4..5b68c23 100644 --- a/arch/x86/include/asm/kvm_emulate.h +++ b/arch/x86/include/asm/kvm_emulate.h @@ -301,6 +301,21 @@ struct x86_emulate_ctxt { #define X86EMUL_MODE_PROT (X86EMUL_MODE_PROT16|X86EMUL_MODE_PROT32| \ X86EMUL_MODE_PROT64) +/* CPUID vendors */ +#define X86EMUL_CPUID_VENDOR_AuthenticAMD_ebx 0x68747541 +#define X86EMUL_CPUID_VENDOR_AuthenticAMD_ecx 0x444d4163 +#define X86EMUL_CPUID_VENDOR_AuthenticAMD_edx 0x69746e65 + +#define X86EMUL_CPUID_VENDOR_AMDisbetter_ebx 0x69444d41 +#define X86EMUL_CPUID_VENDOR_AMDisbetter_ecx 0x21726574 +#define X86EMUL_CPUID_VENDOR_AMDisbetter_edx 0x74656273 + +#define X86EMUL_CPUID_VENDOR_GenuineIntel_ebx 0x756e6547 +#define X86EMUL_CPUID_VENDOR_GenuineIntel_ecx 0x6c65746e +#define X86EMUL_CPUID_VENDOR_GenuineIntel_edx 0x49656e69 + + + enum x86_intercept_stage { X86_ICTP_NONE = 0, /* Allow zero-init to not match anything */ X86_ICPT_PRE_EXCEPT, diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c index f1e3be1..3357411 100644 --- a/arch/x86/kvm/emulate.c +++ b/arch/x86/kvm/emulate.c @@ -1877,6 +1877,94 @@ setup_syscalls_segments(struct x86_emulate_ctxt *ctxt, 
ss->p = 1; } +static bool em_syscall_isenabled(struct x86_emulate_ctxt *ctxt) +{ +struct x86_emulate_ops *ops = ctxt->ops; +u64 efer = 0; + +/* syscall is not available in real mode */ +if ((ctxt->mode == X86EMUL_MODE_REAL) || +(ctxt->mode == X86EMUL_MODE_VM86)) +return false; + +ops->get_msr(ctxt, MSR_EFER, &efer); +/* check - if guestOS is aware of syscall (0x0f05) */ +if ((efer & EFER_SCE) == 0) { +return false; +} else { + /* ok, at this point it becomes vendor-specific */ + /* so first get us an cpuid */ + bool vendor; + u32 eax, ebx, ecx, edx; + + /* getting the cpu-vendor */ + eax = 0x; + ecx = 0x; + if (likely(ops->get_cpuid)) + vendor = ops->get_cpuid(ctxt, &eax, &ebx, &ecx, &edx); + else vendor = false; + + if (likely(vendor)) { + +/* AMD AuthenticAMD / AMDisbetter! */ +if (((ebx==X86EMUL_CPUID_VENDOR_AuthenticAMD_ebx) && + (ecx==X86EMUL_CPUID_VENDOR_AuthenticAMD_ecx) && + (edx==X86EMUL_CPUID_VENDOR_AuthenticAMD_edx)) || +((ebx==X86EMUL_CPUID_VENDOR_AMDisbetter_ebx) && +
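For readers wondering where the X86EMUL_CPUID_VENDOR_* values in the quoted patch come from: they appear to be the CPUID leaf 0 register contents, i.e. the 12-character vendor string stored little-endian across EBX, EDX and ECX (in that order). The following is only a userspace sketch to make that visible; the helper name cpuid_vendor_string is made up for illustration and is not part of the patch or of KVM.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not from the patch): rebuild the 12-character
 * CPUID vendor string from the leaf-0 register values. The string is
 * laid out little-endian in EBX, then EDX, then ECX.
 */
static void cpuid_vendor_string(uint32_t ebx, uint32_t ecx, uint32_t edx,
                                char out[13])
{
    const uint32_t regs[3] = { ebx, edx, ecx };

    for (int i = 0; i < 3; i++)
        for (int b = 0; b < 4; b++)
            out[i * 4 + b] = (char)((regs[i] >> (8 * b)) & 0xff);
    out[12] = '\0';
}
```

Feeding it the three AuthenticAMD constants from the patch (ebx 0x68747541, ecx 0x444d4163, edx 0x69746e65) should give back "AuthenticAMD", which is a quick way to sanity-check the hex values against the vendor names in the macro names.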
Re: [Qemu-devel] State of KVM bits in linux-headers
On 11.01.2012, at 20:59, Anthony Liguori wrote: On 01/11/2012 01:53 PM, Alexander Graf wrote: On 11.01.2012, at 20:52, Anthony Liguori wrote: IIRC, we never had this problem with qemu-kvm - as the merges were coordinated with the kernel (subsystem) tree. Are you suggesting that kvm header updates go through uq/master? That seems reasonable to me and is certainly the least amount of change. So how about code that actually leverages the new headers? Shared KVM infrastructure should go through uq/master. So changes to kvm-all.c, linux-headers/* should go through uq/master. Target specific kvm changes should go through the appropriate submaintainer's tree. So then if I add some target specific stuff to KVM, I have to
* send pullreq to KVM
* wait for that to be applied
* post a patch to uq/master to update headers
* wait for that to merge back to qemu.git
* send a pull request to qemu.git
right? And then after about 3 months we'll have the feature available ;). Alex
Re: [Qemu-devel] State of KVM bits in linux-headers
On 01/11/2012 02:05 PM, Alexander Graf wrote: On 11.01.2012, at 20:59, Anthony Liguori wrote: On 01/11/2012 01:53 PM, Alexander Graf wrote: On 11.01.2012, at 20:52, Anthony Liguori wrote: IIRC, we never had this problem with qemu-kvm - as the merges were coordinated with the kernel (subsystem) tree. Are you suggesting that kvm header updates go through uq/master? That seems reasonable to me and is certainly the least amount of change. So how about code that actually leverages the new headers? Shared KVM infrastructure should go through uq/master. So changes to kvm-all.c, linux-headers/* should go through uq/master. Target specific kvm changes should go through the appropriate submaintainer's tree. So then if I add some target specific stuff to KVM,
That requires a header update?
I have to
* send pullreq to KVM
* wait for that to be applied
* post a patch to uq/master to update headers
Strictly from a QEMU perspective, we can't depend on APIs that aren't committed upstream yet.
* wait for that to merge back to qemu.git
* send a pull request to qemu.git
Maybe we need to bring a stripped down version of Linux into qemu.git to make it easier to simultaneously update both trees... ;-)
right? And then after about 3 months we'll have the feature available ;).
You can always just get Acked-by's from the appropriate maintainers. That's just as good as going through the tree. Regards, Anthony Liguori Alex
use of PMU in guest generates messages in host
Using the latest kernel tree (e343a895a9f342f239c5e3c5ffc6c0b1707e6244), which has the KVM bits for using the PMU in the guest. Host and guest are both running Fedora 16, 64-bit, with this kernel. Running this command in the guest:
perf stat -ddd -- openssl speed aes
generates this in the host:
[74728.221863] kvm_set_msr_common: 2760 callbacks suppressed
[74728.221950] kvm: 28217: cpu2 unhandled wrmsr: 0x1a6 data f701
[74728.222115] kvm: 28217: cpu2 unhandled wrmsr: 0x1a6 data f701
[74728.222858] kvm: 28217: cpu2 unhandled wrmsr: 0x1a6 data f701
[74728.223018] kvm: 28217: cpu2 unhandled wrmsr: 0x1a6 data f701
[74728.223851] kvm: 28217: cpu2 unhandled wrmsr: 0x1a6 data f701
[74728.224009] kvm: 28217: cpu2 unhandled wrmsr: 0x1a6 data f701
[74728.224843] kvm: 28217: cpu2 unhandled wrmsr: 0x1a6 data f701
[74728.224997] kvm: 28217: cpu2 unhandled wrmsr: 0x1a6 data f701
[74728.225842] kvm: 28217: cpu2 unhandled wrmsr: 0x1a6 data f001
[74728.226010] kvm: 28217: cpu2 unhandled wrmsr: 0x1a6 data f001
David
Re: [PATCH] vhost-net: add module alias (v2)
On Wed, 2012-01-11 at 09:16 -0800, Stephen Hemminger wrote: By adding the correct module alias, programs won't have to explicitly call modprobe. Vhost-net will always be available if built into the kernel. It does require assigning a permanent minor number for depmod to work. Choose one next to TUN since this driver is related to it. Also, use C99 style initialization. Signed-off-by: Stephen Hemminger shemmin...@vyatta.com --- v2 - document minor number and make sure to not overlap [...]
--- a/include/linux/miscdevice.h 2012-01-10 10:56:59.779189436 -0800
+++ b/include/linux/miscdevice.h 2012-01-11 09:13:20.803694316 -0800
@@ -42,6 +42,7 @@
 #define AUTOFS_MINOR 235
 #define MAPPER_CTRL_MINOR 236
 #define LOOP_CTRL_MINOR 237
+#define VHOST_NET_MINOR 238
 #define MISC_DYNAMIC_MINOR 255
 struct device;
--- a/Documentation/devices.txt 2012-01-10 10:56:53.399116518 -0800
+++ b/Documentation/devices.txt 2012-01-11 09:12:49.251197653 -0800
@@ -447,6 +447,8 @@ Your cooperation is appreciated.
 234 = /dev/btrfs-control Btrfs control device
 235 = /dev/autofs Autofs control device
 236 = /dev/mapper/control Device-Mapper control device
+ 237 = /dev/vhost-net Host kernel accelerator for virtio net
[...]
238 != 237. It looks like someone forgot to add loopctrl here. Ben. -- Ben Hutchings, Staff Engineer, Solarflare Not speaking for my employer; that's the marketing department's job. They asked us to note that Solarflare product names are trademarked.
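To make Ben's "238 != 237" point concrete, here is a small userspace sketch, assuming only the minor numbers visible in the quoted miscdevice.h hunk. It is not kernel code: struct sketch_miscdev and sketch_minor_in_use are made-up stand-ins, and the device names are only labels. It also shows the C99 designated-initializer style the patch description mentions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the kernel's struct miscdevice; only two fields are modelled. */
struct sketch_miscdev {
    int minor;
    const char *name;
};

/* Minor numbers as they appear in the quoted miscdevice.h hunk. */
#define AUTOFS_MINOR      235
#define MAPPER_CTRL_MINOR 236
#define LOOP_CTRL_MINOR   237
#define VHOST_NET_MINOR   238

/* C99 designated initializers: fields are set by name, not by position. */
static const struct sketch_miscdev sketch_devs[] = {
    { .minor = AUTOFS_MINOR,      .name = "autofs" },
    { .minor = MAPPER_CTRL_MINOR, .name = "mapper/control" },
    { .minor = LOOP_CTRL_MINOR,   .name = "loop-control" },
    { .minor = VHOST_NET_MINOR,   .name = "vhost-net" },
};

/* Hypothetical check: is this minor already claimed in the table above? */
static bool sketch_minor_in_use(int minor)
{
    size_t n = sizeof(sketch_devs) / sizeof(sketch_devs[0]);

    for (size_t i = 0; i < n; i++)
        if (sketch_devs[i].minor == minor)
            return true;
    return false;
}
```

With this table, 237 is already taken by LOOP_CTRL_MINOR, so documenting vhost-net as 237 in devices.txt would contradict the header, which assigns 238.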
Re: [PATCH 2/2] KVM: fix missing illegal instruction-trap in protected modes
On Wed, Jan 11, 2012 at 09:01:10PM +0100, Stephan Bärwolf wrote: On 01/11/12 20:09, Marcelo Tosatti wrote: On Tue, Jan 10, 2012 at 03:26:49PM +0100, Stephan Bärwolf wrote: From 2168285ffb30716f30e129c3ce98ce42d19c4d4e Mon Sep 17 00:00:00 2001 From: Stephan Baerwolf stephan.baerw...@tu-ilmenau.de Date: Sun, 8 Jan 2012 02:03:47 + Subject: [PATCH 2/2] KVM: fix missing illegal instruction-trap in protected modes On hosts without this patch, 32bit guests will crash (and 64bit guests may behave in a wrong way) for example by simply executing following nasm-demo-application: [bits 32] global _start SECTION .text _start: syscall (I tested it with winxp and linux - both always crashed) Disassembly of section .text: _start: 0: 0f 05 syscall The reason seems a missing invalid opcode-trap (int6) for the syscall opcode 0f05, which is not available on Intel CPUs within non-longmodes, as also on some AMD CPUs within legacy-mode. (depending on CPU vendor, MSR_EFER and cpuid) Because previous mentioned OSs may not engage corresponding syscall target-registers (STAR, LSTAR, CSTAR), they remain NULL and (non trapping) syscalls are leading to multiple faults and finally crashs. Depending on the architecture (AMD or Intel) pretended by guests, various checks according to vendor's documentation are implemented to overcome the current issue and behave like the CPUs physical counterparts. 
(Therefore using Intel's Intel 64 and IA-32 Architecture Software Developers Manual http://www.intel.com/content/dam/doc/manual/ 64-ia-32-architectures-software-developer-manual-325462.pdf and AMD's AMD64 Architecture Programmer's Manual Volume 3: General-Purpose and System Instructions http://support.amd.com/us/Processor_TechDocs/APM_V3_24594.pdf ) Screenshots of an i686 testing VM (CORE i5 host) before and after applying this patch are available under: http://matrixstorm.com/software/linux/kvm/20111229/before.jpg http://matrixstorm.com/software/linux/kvm/20111229/after.jpg Signed-off-by: Stephan Baerwolf stephan.baerw...@tu-ilmenau.de --- arch/x86/include/asm/kvm_emulate.h | 15 ++ arch/x86/kvm/emulate.c | 92 ++- 2 files changed, 104 insertions(+), 3 deletions(-) diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h index b172bf4..5b68c23 100644 --- a/arch/x86/include/asm/kvm_emulate.h +++ b/arch/x86/include/asm/kvm_emulate.h @@ -301,6 +301,21 @@ struct x86_emulate_ctxt { #define X86EMUL_MODE_PROT (X86EMUL_MODE_PROT16|X86EMUL_MODE_PROT32| \ X86EMUL_MODE_PROT64) +/* CPUID vendors */ +#define X86EMUL_CPUID_VENDOR_AuthenticAMD_ebx 0x68747541 +#define X86EMUL_CPUID_VENDOR_AuthenticAMD_ecx 0x444d4163 +#define X86EMUL_CPUID_VENDOR_AuthenticAMD_edx 0x69746e65 + +#define X86EMUL_CPUID_VENDOR_AMDisbetter_ebx 0x69444d41 +#define X86EMUL_CPUID_VENDOR_AMDisbetter_ecx 0x21726574 +#define X86EMUL_CPUID_VENDOR_AMDisbetter_edx 0x74656273 + +#define X86EMUL_CPUID_VENDOR_GenuineIntel_ebx 0x756e6547 +#define X86EMUL_CPUID_VENDOR_GenuineIntel_ecx 0x6c65746e +#define X86EMUL_CPUID_VENDOR_GenuineIntel_edx 0x49656e69 + + + enum x86_intercept_stage { X86_ICTP_NONE = 0, /* Allow zero-init to not match anything */ X86_ICPT_PRE_EXCEPT, diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c index f1e3be1..3357411 100644 --- a/arch/x86/kvm/emulate.c +++ b/arch/x86/kvm/emulate.c @@ -1877,6 +1877,94 @@ setup_syscalls_segments(struct x86_emulate_ctxt *ctxt, 
ss->p = 1; } +static bool em_syscall_isenabled(struct x86_emulate_ctxt *ctxt) +{ +struct x86_emulate_ops *ops = ctxt->ops; +u64 efer = 0; + +/* syscall is not available in real mode */ +if ((ctxt->mode == X86EMUL_MODE_REAL) || +(ctxt->mode == X86EMUL_MODE_VM86)) +return false; + +ops->get_msr(ctxt, MSR_EFER, &efer); +/* check - if guestOS is aware of syscall (0x0f05) */ +if ((efer & EFER_SCE) == 0) { +return false; +} else { + /* ok, at this point it becomes vendor-specific */ + /* so first get us an cpuid */ + bool vendor; + u32 eax, ebx, ecx, edx; + + /* getting the cpu-vendor */ + eax = 0x; + ecx = 0x; + if (likely(ops->get_cpuid)) + vendor = ops->get_cpuid(ctxt, &eax, &ebx, &ecx, &edx); + else vendor = false; + + if (likely(vendor)) { + +/* AMD AuthenticAMD / AMDisbetter! */ +if (((ebx==X86EMUL_CPUID_VENDOR_AuthenticAMD_ebx) +
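The quoted hunk is cut off before the vendor-specific part, so the following is only a simplified model of the decision the patch description talks about, not the patch's actual code. It assumes three rules taken from the commit message and the visible code: syscall is treated as unavailable in real and VM86 mode, it needs EFER.SCE set, and on Intel it is only valid in 64-bit mode. All names (sketch_mode, sketch_vendor, sketch_syscall_raises_ud) are hypothetical, and the "some AMD CPUs" special cases the message mentions are deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_EFER_SCE (1ULL << 0) /* EFER.SCE, the SYSCALL enable bit (bit 0) */

/* Hypothetical stand-ins for the emulator's mode and the guest's CPUID vendor. */
enum sketch_mode { MODE_REAL, MODE_VM86, MODE_PROT16, MODE_PROT32, MODE_PROT64 };
enum sketch_vendor { VENDOR_INTEL, VENDOR_AMD };

/* Returns true when the model says the guest should get #UD for opcode 0f 05. */
static bool sketch_syscall_raises_ud(enum sketch_mode mode, uint64_t efer,
                                     enum sketch_vendor vendor)
{
    /* The patch treats syscall as unavailable in real and VM86 mode. */
    if (mode == MODE_REAL || mode == MODE_VM86)
        return true;

    /* Guest has not enabled syscall via EFER.SCE. */
    if ((efer & SKETCH_EFER_SCE) == 0)
        return true;

    /* Intel: syscall only exists in 64-bit mode. */
    if (vendor == VENDOR_INTEL && mode != MODE_PROT64)
        return true;

    return false;
}
```

Under this model the nasm demo from the commit message (a bare syscall in a 32-bit guest that reports an Intel vendor string) ends up in the #UD branch, which is the behaviour the patch is trying to restore.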
Re: [Qemu-devel] State of KVM bits in linux-headers
On 11.01.2012, at 21:16, Anthony Liguori wrote: On 01/11/2012 02:05 PM, Alexander Graf wrote: On 11.01.2012, at 20:59, Anthony Liguori wrote: On 01/11/2012 01:53 PM, Alexander Graf wrote: On 11.01.2012, at 20:52, Anthony Liguori wrote: IIRC, we never had this problem with qemu-kvm - as the merges were coordinated with the kernel (subsystem) tree. Are you suggesting that kvm header updates go through uq/master? That seems reasonable to me and is certainly the least amount of change. So how about code that actually leverages the new headers? Shared KVM infrastructure should go through uq/master. So changes to kvm-all.c, linux-headers/* should go through uq/master. Target specific kvm changes should go through the appropriate submaintainers tree. So then if I add some target specific stuff to KVM,
That requires a header update?
Almost all of the time, yes. The target is still rather incomplete. And even in places where it is, hardware evolves and we just get new information we need to pass back and forth.
I have to
* send pullreq to KVM
* wait for that to be applied
* post a patch to uq/master to update headers
Strictly from a QEMU perspective, we can't depend on APIs that aren't committed upstream yet.
The question again is: When do we consider something upstream?
* wait for that to merge back to qemu.git
* send a pull request to qemu.git
Maybe we need to bring a stripped down version of Linux into qemu.git to make it easier to simultaneously update both trees... ;-)
Nice one ;)
right? And then after about 3 months we'll have the feature available ;).
You can always just get Acked-by's from the appropriate maintainers. That's just as good as going through the tree.
So every time we change headers, I just require Avi's ack and then he can't complain on those patches later? Good idea!
:) Alex
Re: [PATCH 1/2] KVM: Exception during emulation decode should propagate
On Wed, 11 Jan 2012 18:53:30 +0200 Nadav Amit na...@cs.technion.ac.il wrote: An exception might occur during decode (e.g., #PF during fetch). Currently, the exception is ignored and emulation is performed. When I cleaned up insn_fetch(), I thought that fetching the instruction which is being executed by the guest cannot cause #PF. The possibility that a meaningless userspace might simultaneously unmap the page, noted by Avi IIRC, was ignored intentionally, so we just fail in such a case. Did you see any real problem? Takuya Instead, emulation should be skipped and the fault should be injected. Skipping instruction should report a failure in this case.
Re: [PATCH 2/2] KVM: fix missing illegal instruction-trap in protected modes
On 01/11/12 22:21, Marcelo Tosatti wrote:
On Wed, Jan 11, 2012 at 09:01:10PM +0100, Stephan Bärwolf wrote:
On 01/11/12 20:09, Marcelo Tosatti wrote:
On Tue, Jan 10, 2012 at 03:26:49PM +0100, Stephan Bärwolf wrote:

From 2168285ffb30716f30e129c3ce98ce42d19c4d4e Mon Sep 17 00:00:00 2001
From: Stephan Baerwolf stephan.baerw...@tu-ilmenau.de
Date: Sun, 8 Jan 2012 02:03:47 +0000
Subject: [PATCH 2/2] KVM: fix missing illegal instruction-trap in protected modes

On hosts without this patch, 32bit guests will crash (and 64bit guests may behave in a wrong way) for example by simply executing following nasm-demo-application:

    [bits 32]
    global _start
    SECTION .text
    _start: syscall

(I tested it with winxp and linux - both always crashed)

    Disassembly of section .text:

    _start:
       0:   0f 05   syscall

The reason seems a missing invalid opcode-trap (int6) for the syscall opcode 0f05, which is not available on Intel CPUs within non-longmodes, as also on some AMD CPUs within legacy-mode. (depending on CPU vendor, MSR_EFER and cpuid)

Because previous mentioned OSs may not engage corresponding syscall target-registers (STAR, LSTAR, CSTAR), they remain NULL and (non trapping) syscalls are leading to multiple faults and finally crashs.

Depending on the architecture (AMD or Intel) pretended by guests, various checks according to vendor's documentation are implemented to overcome the current issue and behave like the CPUs physical counterparts.

(Therefore using Intel's "Intel 64 and IA-32 Architecture Software Developers Manual"
http://www.intel.com/content/dam/doc/manual/64-ia-32-architectures-software-developer-manual-325462.pdf
and AMD's "AMD64 Architecture Programmer's Manual Volume 3: General-Purpose and System Instructions"
http://support.amd.com/us/Processor_TechDocs/APM_V3_24594.pdf )

Screenshots of an i686 testing VM (CORE i5 host) before and after applying this patch are available under:

http://matrixstorm.com/software/linux/kvm/20111229/before.jpg
http://matrixstorm.com/software/linux/kvm/20111229/after.jpg

Signed-off-by: Stephan Baerwolf stephan.baerw...@tu-ilmenau.de
---
 arch/x86/include/asm/kvm_emulate.h |   15 ++
 arch/x86/kvm/emulate.c             |   92 ++-
 2 files changed, 104 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h
index b172bf4..5b68c23 100644
--- a/arch/x86/include/asm/kvm_emulate.h
+++ b/arch/x86/include/asm/kvm_emulate.h
@@ -301,6 +301,21 @@ struct x86_emulate_ctxt {
 #define X86EMUL_MODE_PROT (X86EMUL_MODE_PROT16|X86EMUL_MODE_PROT32| \
 			   X86EMUL_MODE_PROT64)
 
+/* CPUID vendors */
+#define X86EMUL_CPUID_VENDOR_AuthenticAMD_ebx 0x68747541
+#define X86EMUL_CPUID_VENDOR_AuthenticAMD_ecx 0x444d4163
+#define X86EMUL_CPUID_VENDOR_AuthenticAMD_edx 0x69746e65
+
+#define X86EMUL_CPUID_VENDOR_AMDisbetter_ebx 0x69444d41
+#define X86EMUL_CPUID_VENDOR_AMDisbetter_ecx 0x21726574
+#define X86EMUL_CPUID_VENDOR_AMDisbetter_edx 0x74656273
+
+#define X86EMUL_CPUID_VENDOR_GenuineIntel_ebx 0x756e6547
+#define X86EMUL_CPUID_VENDOR_GenuineIntel_ecx 0x6c65746e
+#define X86EMUL_CPUID_VENDOR_GenuineIntel_edx 0x49656e69
+
+
+
 enum x86_intercept_stage {
 	X86_ICTP_NONE = 0,  /* Allow zero-init to not match anything */
 	X86_ICPT_PRE_EXCEPT,
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index f1e3be1..3357411 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -1877,6 +1877,94 @@ setup_syscalls_segments(struct x86_emulate_ctxt *ctxt,
 	ss->p = 1;
 }
 
+static bool em_syscall_isenabled(struct x86_emulate_ctxt *ctxt)
+{
+	struct x86_emulate_ops *ops = ctxt->ops;
+	u64 efer = 0;
+
+	/* syscall is not available in real mode*/
+	if ((ctxt->mode == X86EMUL_MODE_REAL) ||
+	    (ctxt->mode == X86EMUL_MODE_VM86))
+		return false;
+
+	ops->get_msr(ctxt, MSR_EFER, &efer);
+	/* check - if guestOS is aware of syscall (0x0f05) */
+	if ((efer & EFER_SCE) == 0) {
+		return false;
+	} else {
+		/* ok, at this point it becomes vendor-specific */
+		/* so first get us an cpuid */
+		bool vendor;
+		u32 eax, ebx, ecx, edx;
+
+		/* getting the cpu-vendor */
+		eax = 0x00000000;
+		ecx = 0x00000000;
+		if (likely(ops->get_cpuid))
+			vendor = ops->get_cpuid(ctxt, &eax, &ebx, &ecx, &edx);
+		else	vendor = false;
+
+		if (likely(vendor)) {
+
+			/* AMD AuthenticAMD / AMDisbetter! */
+			if (((ebx==X86EMUL_CPUID_VENDOR_AuthenticAMD_ebx) &&
+			     (ecx==X86EMUL_CPUID_VENDOR_AuthenticAMD_ecx) &&
+			     (edx==X86EMUL_CPUID_VENDOR_AuthenticAMD_edx)) ||
+
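The vendor constants in the patch above are the 12-character CPUID leaf 0 vendor string split into three little-endian 32-bit words, returned in EBX, EDX, ECX order. A minimal sketch of that mapping follows; `pack4` and `vendor_to_regs` are hypothetical helper names used only for illustration and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the patch): pack four ASCII characters
 * into a 32-bit value, first character in the lowest byte. */
static uint32_t pack4(const char *s)
{
    return (uint32_t)(unsigned char)s[0] |
           ((uint32_t)(unsigned char)s[1] << 8) |
           ((uint32_t)(unsigned char)s[2] << 16) |
           ((uint32_t)(unsigned char)s[3] << 24);
}

/* CPUID leaf 0 returns the vendor string in EBX, EDX, ECX order:
 * characters 0-3 land in EBX, 4-7 in EDX and 8-11 in ECX. */
static void vendor_to_regs(const char *vendor,
                           uint32_t *ebx, uint32_t *ecx, uint32_t *edx)
{
    assert(strlen(vendor) == 12);
    *ebx = pack4(vendor);
    *edx = pack4(vendor + 4);
    *ecx = pack4(vendor + 8);
}
```

Feeding "AuthenticAMD", "AMDisbetter!" and "GenuineIntel" through `vendor_to_regs` reproduces the nine `X86EMUL_CPUID_VENDOR_*` values defined in kvm_emulate.h by the patch.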
[PATCH] KVM: PPC: refer to paravirt docs in header file
Instead of keeping separate copies of struct kvm_vcpu_arch_shared (one in the code, one in the docs) that inevitably fail to be kept in sync (already sr[] is missing from the doc version), just point to the header file as the source of documentation on the contents of the magic page.

Signed-off-by: Scott Wood scottw...@freescale.com
---
 Documentation/virtual/kvm/ppc-pv.txt |   24 ++--
 arch/powerpc/include/asm/kvm_para.h  |   10 ++
 2 files changed, 12 insertions(+), 22 deletions(-)

diff --git a/Documentation/virtual/kvm/ppc-pv.txt b/Documentation/virtual/kvm/ppc-pv.txt
index 2b7ce19..6e7c370 100644
--- a/Documentation/virtual/kvm/ppc-pv.txt
+++ b/Documentation/virtual/kvm/ppc-pv.txt
@@ -81,28 +81,8 @@ additional registers to the magic page. If you add fields to the magic page,
 also define a new hypercall feature to indicate that the host can give you more
 registers. Only if the host supports the additional features, make use of them.
 
-The magic page has the following layout as described in
-arch/powerpc/include/asm/kvm_para.h:
-
-struct kvm_vcpu_arch_shared {
-	__u64 scratch1;
-	__u64 scratch2;
-	__u64 scratch3;
-	__u64 critical;		/* Guest may not get interrupts if == r1 */
-	__u64 sprg0;
-	__u64 sprg1;
-	__u64 sprg2;
-	__u64 sprg3;
-	__u64 srr0;
-	__u64 srr1;
-	__u64 dar;
-	__u64 msr;
-	__u32 dsisr;
-	__u32 int_pending;	/* Tells the guest if we have an interrupt */
-};
-
-Additions to the page must only occur at the end. Struct fields are always 32
-or 64 bit aligned, depending on them being 32 or 64 bit wide respectively.
+The magic page layout is described by struct kvm_vcpu_arch_shared
+in arch/powerpc/include/asm/kvm_para.h.
 
 Magic page features
 ===
diff --git a/arch/powerpc/include/asm/kvm_para.h b/arch/powerpc/include/asm/kvm_para.h
index ece70fb..7b754e7 100644
--- a/arch/powerpc/include/asm/kvm_para.h
+++ b/arch/powerpc/include/asm/kvm_para.h
@@ -22,6 +22,16 @@
 
 #include <linux/types.h>
 
+/*
+ * Additions to this struct must only occur at the end, and should be
+ * accompanied by a KVM_MAGIC_FEAT flag to advertise that they are present
+ * (albeit not necessarily relevant to the current target hardware platform).
+ *
+ * Struct fields are always 32 or 64 bit aligned, depending on them being 32
+ * or 64 bit wide respectively.
+ *
+ * See Documentation/virtual/kvm/ppc-pv.txt
+ */
 struct kvm_vcpu_arch_shared {
 	__u64 scratch1;
 	__u64 scratch2;
-- 
1.7.7.rc3.4.g8d714
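The alignment rule stated in the patch (32-bit fields 32-bit aligned, 64-bit fields 64-bit aligned) can be checked mechanically. The sketch below is only an illustration: `struct magic_page_doc` mirrors the field list that the patch removes from ppc-pv.txt, which the commit message says is already out of date (sr[] is missing), and the macro and function names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: field list copied from the documentation text removed by
 * the patch, not from the current kernel header. */
struct magic_page_doc {
    uint64_t scratch1;
    uint64_t scratch2;
    uint64_t scratch3;
    uint64_t critical;
    uint64_t sprg0;
    uint64_t sprg1;
    uint64_t sprg2;
    uint64_t sprg3;
    uint64_t srr0;
    uint64_t srr1;
    uint64_t dar;
    uint64_t msr;
    uint32_t dsisr;
    uint32_t int_pending;
};

/* True when a field's offset is a multiple of the field's own size. */
#define FIELD_NATURALLY_ALIGNED(type, field) \
    (offsetof(type, field) % sizeof(((type *)0)->field) == 0)

/* Checks a few representative fields, including the first 32-bit ones. */
static int magic_page_layout_ok(void)
{
    return FIELD_NATURALLY_ALIGNED(struct magic_page_doc, scratch1) &&
           FIELD_NATURALLY_ALIGNED(struct magic_page_doc, critical) &&
           FIELD_NATURALLY_ALIGNED(struct magic_page_doc, msr) &&
           FIELD_NATURALLY_ALIGNED(struct magic_page_doc, dsisr) &&
           FIELD_NATURALLY_ALIGNED(struct magic_page_doc, int_pending);
}
```

Appending fields only at the end keeps every existing offset stable, which is why the comment in the header pairs that rule with a feature flag for each addition.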
Re: [PATCH] KVM: PPC: refer to paravirt docs in header file
On 12.01.2012, at 00:37, Scott Wood wrote:

Instead of keeping separate copies of struct kvm_vcpu_arch_shared (one in the code, one in the docs) that inevitably fail to be kept in sync (already sr[] is missing from the doc version), just point to the header file as the source of documentation on the contents of the magic page.

Signed-off-by: Scott Wood scottw...@freescale.com

Avi, please ack.

Alex
RE: [PATCH 3/3] stop the periodic RTC update timer
-Original Message-
From: Marcelo Tosatti [mailto:mtosa...@redhat.com]

Regarding the UIP bit, a guest could read it in a loop and wait for the value to change. But you can emulate it in cmos_ioport_read by reading the host time, that is, return 1 during 244us, 0 for the remainder of the second, and have that in sync with the update-cycle-ended interrupt if it's enabled.

Yes. The guest may use a loop to read the RTC, but the point is that the guest is waiting for UIP to change to 0. If this bit is always equal to 0, the guest will never go into the loop. For a real RTC this may be wrong, because the RTC cannot give you a valid value during the update cycle. But the virtual RTC doesn't need this logic: whenever you read it, it will always return the right value to you.

best regards
yang
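A rough sketch of the emulation Marcelo describes, deriving UIP from host time instead of running a timer. The function name is hypothetical, and placing the 244us window at the end of each second (just before the update) is an assumption; the message only says the bit should read 1 for 244us per second.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC  1000000000ULL
#define UIP_WINDOW_NS 244000ULL          /* 244 us, as quoted above */

static_assert(UIP_WINDOW_NS < NSEC_PER_SEC, "window must fit in one second");

/* Hypothetical helper: returns 1 while the emulated update cycle is
 * "in progress", 0 otherwise. Assumption: the window is the last
 * 244 us before each second boundary of the host clock. */
static int rtc_uip_from_host_ns(uint64_t host_ns)
{
    uint64_t into_second = host_ns % NSEC_PER_SEC;

    return into_second >= NSEC_PER_SEC - UIP_WINDOW_NS;
}
```

A register A read handler could OR this value into bit 7 on each read, so a guest polling for UIP to clear always sees it clear within 244us without any periodic host timer.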
Re: [PATCH 1/2] KVM: Exception during emulation decode should propagate
(2012/01/12 7:11), Takuya Yoshikawa wrote:
On Wed, 11 Jan 2012 18:53:30 +0200 Nadav Amit na...@cs.technion.ac.il wrote:

An exception might occur during decode (e.g., #PF during fetch). Currently, the exception is ignored and emulation is performed.

Note that the decode/emulation will not be continued in such a case. insn_fetch() is a somewhat tricky macro: it contains a "goto done" that jumps out of the macro. So if an error happens while fetching the instruction, x86_decode_insn() will handle the X86EMUL_* fault value and return FAIL immediately.

Takuya

When I cleaned up insn_fetch(), I thought that fetching the instruction which is being executed by the guest cannot cause #PF. The possibility that a meaningless userspace might simultaneously unmap the page, noted by Avi IIRC, was ignored intentionally, so we just fail in such a case.

Did you see any real problem?

Takuya

Instead, emulation should be skipped and the fault should be injected. Skipping instruction should report a failure in this case.
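The control flow Takuya describes, a fetch macro that jumps to a label in the enclosing function, can be sketched in isolation. This is not the kernel's insn_fetch(); every name below (`fetch_ctx`, `INSN_FETCH_U8`, `decode_insn`, the result codes) is a hypothetical stand-in chosen to show why decoding stops at the first failed fetch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the X86EMUL_* and decode result codes. */
enum { FETCH_OK = 0, FETCH_FAULT = 1 };
enum { DECODE_OK = 0, DECODE_FAIL = -1 };

struct fetch_ctx {
    const uint8_t *bytes;   /* instruction bytes that can be read */
    size_t len;             /* number of readable bytes */
    size_t pos;             /* next byte to fetch */
};

/* Fetch one byte; reading past the readable range models a fault. */
static int fetch_u8(struct fetch_ctx *c, uint8_t *out)
{
    if (c->pos >= c->len)
        return FETCH_FAULT;
    *out = c->bytes[c->pos++];
    return FETCH_OK;
}

/* The "tricky" part: on failure the macro leaves the caller via goto. */
#define INSN_FETCH_U8(dst, c, rc)            \
    do {                                     \
        (rc) = fetch_u8((c), &(dst));        \
        if ((rc) != FETCH_OK)                \
            goto done;                       \
    } while (0)

static int decode_insn(struct fetch_ctx *c)
{
    int rc = FETCH_OK;
    uint8_t opcode = 0, second = 0;

    assert(c != NULL);
    INSN_FETCH_U8(opcode, c, rc);
    if (opcode == 0x0f)                      /* two-byte opcode escape */
        INSN_FETCH_U8(second, c, rc);
    (void)second;
done:
    return rc == FETCH_OK ? DECODE_OK : DECODE_FAIL;
}
```

With the bytes 0f 05 fully readable, `decode_insn()` returns `DECODE_OK`; if only the first byte is readable, the second fetch jumps to `done` and the function reports `DECODE_FAIL` without decoding further.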
Re: [PATCH 3/3] Code clean up for percpu_xxx() functions
On 01/11/2012 09:19 AM, t...@kernel.org wrote:

Alex, can you please collect all patches into a single patchset? Please split it such that, usage changes are per-system so that they can be routed through respective subsystems (x86 or net) and updates to percpu proper which can be applied after other changes have been applied.

It would probably be best to route these patches separately rather than all through percpu as it touches a lot of different places and is likely to cause conflicts.

I *think* the best way would be,

* Submit per-subsystem patches and get them merged to subsystem trees.
* (Optional) Apply a patch to mark unused interface deprecated in percpu tree, so that new usages in linux-next can be detected.
* Towards the end of the next merge window, merge a patch to actually kill the old interface.

That sounds like a good idea.

	-hpa
RE: [PATCH 3/3] stop the periodic RTC update timer
-Original Message-
From: Paolo Bonzini [mailto:paolo.bonz...@gmail.com] On Behalf Of Paolo Bonzini

Because it's not in the spec because some engineer thought it was cool.

It's not cool. We need to do some optimizations to get better performance.

It's in the spec because it gives you a way to do atomic reads. QEMU not being a simulator means that we always assume that the RTC is programmed for a 32768 Hz clock, for example, because any other setting would not make sense on a PC. We can use a 1-second (or higher, as in your patches) timer, rather than a 32768 Hz timer which anyway would not work well. So we're taking shortcuts, but each of them must be evaluated separately, and _this_ shortcut is not acceptable.

Also, is there an actual case that break with my patch?

Any decent unit test for the RTC would break.

Any decent unit test would break the following logic too. The spec provides three ways for you to program it, so why do we only focus on 0x20? Because this is emulation, not hardware simulation. Because no real OS sets it to any other value.

static void rtc_update_second(void *opaque)
{
	RTCState *s = opaque;
	int64_t delay;

	/* if the oscillator is not in normal operation, we do not update */
	if ((s->cmos_data[RTC_REG_A] & 0x70) != 0x20) {
	.
}

It means that the (not externally visible) millisecond value is set to 500 when you modify the current time of the RTC. The next update of the clock will happen exactly 500 ms after you reset bit 7 of register B.

Same question: is there any reason to complicate the current logic? Or does any actual usage model need this?

Is it really so difficult to implement?

I think what we are discussing is whether we really need it, not how difficult it is to add.

Note that this case is mentioned in drivers/rtc/rtc-cmos.c in the Linux source code, even though it is not used.

Yes, it just mentions that the next update will happen 500ms later. What's wrong with this? The highest resolution of the RTC is 1 second; if any software intends to use the RTC to do some check within 1 second, it is wrong.

Anyway, I agree with your point. If we really need to add those features, I will add them in the next version. Before that, we need to figure out whether they are necessary.

Best regards
yang
[PATCH 0/2] virtio-serial: set up vqs on demand
From: Hongyong Zang zanghongy...@huawei.com

Virtio-serial sets up (max_ports+1)*2 vqs when the device probes, but not all io_ports may be used. These patches create the vqs of port0 and the control port when probing the device, then create the io-vqs when add_port() is called.

Hongyong Zang (2):
  virtio-pci: add setup_vqs flag in vp_try_to_find_vqs
  virtio-serial: setup_port_vq when adding port

 drivers/char/virtio_console.c |   65 ++--
 drivers/virtio/virtio_pci.c   |   17 --
 2 files changed, 74 insertions(+), 8 deletions(-)
[PATCH 1/2] virtio-pci: add setup_vqs flag in vp_try_to_find_vqs
From: Hongyong Zang zanghongy...@huawei.com

Changes in vp_try_to_find_vqs: virtio-serial's probe() calls it to request irqs and set up the vqs of port0 and the control port; add_port() calls it to set up the vqs of an io_port. It will not create a virtqueue if the name is NULL.

Signed-off-by: Hongyong Zang zanghongy...@huawei.com
---
 drivers/virtio/virtio_pci.c |   17 +
 1 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/virtio/virtio_pci.c b/drivers/virtio/virtio_pci.c
index baabb79..1f98c36 100644
--- a/drivers/virtio/virtio_pci.c
+++ b/drivers/virtio/virtio_pci.c
@@ -492,9 +492,11 @@ static void vp_del_vqs(struct virtio_device *vdev)
 	list_for_each_entry_safe(vq, n, &vdev->vqs, list) {
 		info = vq->priv;
 		if (vp_dev->per_vq_vectors &&
-			info->msix_vector != VIRTIO_MSI_NO_VECTOR)
+			info->msix_vector != VIRTIO_MSI_NO_VECTOR) {
 			free_irq(vp_dev->msix_entries[info->msix_vector].vector,
 				 vq);
+			vp_dev->msix_used_vectors--;
+		}
 		vp_del_vq(vq);
 	}
 	vp_dev->per_vq_vectors = false;
@@ -511,7 +513,10 @@ static int vp_try_to_find_vqs(struct virtio_device *vdev, unsigned nvqs,
 {
 	struct virtio_pci_device *vp_dev = to_vp_device(vdev);
 	u16 msix_vec;
-	int i, err, nvectors, allocated_vectors;
+	int i, err, nvectors;
+
+	if (vp_dev->msix_used_vectors)
+		goto setup_vqs;
 
 	if (!use_msix) {
 		/* Old style: one normal interrupt for change and all vqs. */
@@ -536,12 +541,16 @@ static int vp_try_to_find_vqs(struct virtio_device *vdev, unsigned nvqs,
 	}
 
 	vp_dev->per_vq_vectors = per_vq_vectors;
-	allocated_vectors = vp_dev->msix_used_vectors;
+
+setup_vqs:
 	for (i = 0; i < nvqs; ++i) {
+		if (names[i] == NULL)
+			continue;
+
 		if (!callbacks[i] || !vp_dev->msix_enabled)
 			msix_vec = VIRTIO_MSI_NO_VECTOR;
 		else if (vp_dev->per_vq_vectors)
-			msix_vec = allocated_vectors++;
+			msix_vec = vp_dev->msix_used_vectors++;
 		else
 			msix_vec = VP_MSIX_VQ_VECTOR;
 		vqs[i] = setup_vq(vdev, i, callbacks[i], names[i], msix_vec);
-- 
1.7.1
[PATCH 2/2] virtio-serial: setup_port_vq when adding port
From: Hongyong Zang zanghongy...@huawei.com

Add setup_port_vq(). Create the io ports' vqs in add_port().

Signed-off-by: Hongyong Zang zanghongy...@huawei.com
---
 drivers/char/virtio_console.c |   65 ++--
 1 files changed, 61 insertions(+), 4 deletions(-)

diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
index 8e3c46d..2e5187e 100644
--- a/drivers/char/virtio_console.c
+++ b/drivers/char/virtio_console.c
@@ -1132,6 +1132,55 @@ static void send_sigio_to_port(struct port *port)
 		kill_fasync(&port->async_queue, SIGIO, POLL_OUT);
 }
 
+static void in_intr(struct virtqueue *vq);
+static void out_intr(struct virtqueue *vq);
+
+static int setup_port_vq(struct ports_device *portdev, u32 id)
+{
+	int err, vq_num;
+	vq_callback_t **io_callbacks;
+	char **io_names;
+	struct virtqueue **vqs;
+	u32 i,j,nr_ports,nr_queues;
+
+	err = 0;
+	vq_num = (id + 1) * 2;
+	nr_ports = portdev->config.max_nr_ports;
+	nr_queues = use_multiport(portdev) ? (nr_ports + 1) * 2 : 2;
+
+	vqs = kmalloc(nr_queues * sizeof(struct virtqueue *), GFP_KERNEL);
+	io_callbacks = kmalloc(nr_queues * sizeof(vq_callback_t *), GFP_KERNEL);
+	io_names = kmalloc(nr_queues * sizeof(char *), GFP_KERNEL);
+	if (!vqs || !io_callbacks || !io_names) {
+		err = -ENOMEM;
+		goto free;
+	}
+
+	for (i = 0, j = 0; i <= nr_ports; i++) {
+		io_callbacks[j] = in_intr;
+		io_callbacks[j + 1] = out_intr;
+		io_names[j] = NULL;
+		io_names[j + 1] = NULL;
+		j += 2;
+	}
+	io_names[vq_num] = "serial-input";
+	io_names[vq_num + 1] = "serial-output";
+	err = portdev->vdev->config->find_vqs(portdev->vdev, nr_queues, vqs,
+					      io_callbacks,
+					      (const char **)io_names);
+	if (err)
+		goto free;
+	portdev->in_vqs[id] = vqs[vq_num];
+	portdev->out_vqs[id] = vqs[vq_num + 1];
+
+free:
+	kfree(io_names);
+	kfree(io_callbacks);
+	kfree(vqs);
+
+	return err;
+}
+
 static int add_port(struct ports_device *portdev, u32 id)
 {
 	char debugfs_name[16];
@@ -1163,6 +1212,14 @@ static int add_port(struct ports_device *portdev, u32 id)
 
 	port->outvq_full = false;
 
+	if (!portdev->in_vqs[port->id] && !portdev->out_vqs[port->id]) {
+		spin_lock(&portdev->ports_lock);
+		err = setup_port_vq(portdev, port->id);
+		spin_unlock(&portdev->ports_lock);
+		if (err)
+			goto free_port;
+	}
+
 	port->in_vq = portdev->in_vqs[port->id];
 	port->out_vq = portdev->out_vqs[port->id];
 
@@ -1614,8 +1671,8 @@ static int init_vqs(struct ports_device *portdev)
 			j += 2;
 			io_callbacks[j] = in_intr;
 			io_callbacks[j + 1] = out_intr;
-			io_names[j] = "input";
-			io_names[j + 1] = "output";
+			io_names[j] = NULL;
+			io_names[j + 1] = NULL;
 		}
 	}
 	/* Find the queues. */
@@ -1635,8 +1692,8 @@ static int init_vqs(struct ports_device *portdev)
 
 		for (i = 1; i < nr_ports; i++) {
 			j += 2;
-			portdev->in_vqs[i] = vqs[j];
-			portdev->out_vqs[i] = vqs[j + 1];
+			portdev->in_vqs[i] = NULL;
+			portdev->out_vqs[i] = NULL;
 		}
 	}
 	kfree(io_names);
-- 
1.7.1
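The queue numbering that setup_port_vq() relies on can be written out as plain arithmetic. The sketch below only restates the expressions from the patch, `(id + 1) * 2` for a port's input queue and `(nr_ports + 1) * 2` for the total; the function names are hypothetical, and the restriction to `id >= 1` is an assumption based on the cover letter saying that port0 and the control queues are created at probe time.

```c
#include <assert.h>
#include <stdint.h>

/* Total virtqueues in multiport mode: an in/out pair per port plus one
 * extra pair, matching (nr_ports + 1) * 2 in the patch. */
static uint32_t total_vqs(uint32_t max_nr_ports)
{
    return (max_nr_ports + 1) * 2;
}

/* Index of the input queue for an io port created on demand.
 * Assumption: only ports with id >= 1 take this path. */
static uint32_t port_in_vq_index(uint32_t id)
{
    assert(id >= 1);
    return (id + 1) * 2;
}

/* The output queue always directly follows the input queue. */
static uint32_t port_out_vq_index(uint32_t id)
{
    return port_in_vq_index(id) + 1;
}
```

For example, port 1 maps to queues 4 and 5, and with 31 ports the driver addresses 64 queue slots even though only the named ones are actually created by vp_try_to_find_vqs().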
Re: [PATCH 3/3] Code clean up for percpu_xxx() functions
On Wed, 2012-01-11 at 16:44 -0800, H. Peter Anvin wrote:
On 01/11/2012 09:19 AM, t...@kernel.org wrote:

Alex, can you please collect all patches into a single patchset? Please split it such that, usage changes are per-system so that they can be routed through respective subsystems (x86 or net) and updates to percpu proper which can be applied after other changes have been applied.

It would probably be best to route these patches separately rather than all through percpu as it touches a lot of different places and is likely to cause conflicts.

I *think* the best way would be,

* Submit per-subsystem patches and get them merged to subsystem trees.
* (Optional) Apply a patch to mark unused interface deprecated in percpu tree, so that new usages in linux-next can be detected.
* Towards the end of the next merge window, merge a patch to actually kill the old interface.

That sounds like a good idea.

I will try to do so. Many thanks for the advice!

	-hpa
Re: [PATCH] vhost-net: add module alias (v2)
On Thu, Jan 12, 2012 at 1:16 AM, Stephen Hemminger shemmin...@vyatta.com wrote:

By adding the correct module alias, programs won't have to explicitly call modprobe. Vhost-net will always be available if built into the kernel. It does require assigning a permanent minor number for depmod to work. Choose one next to TUN since this driver is related to it.

Also, use C99 style initialization.

Signed-off-by: Stephen Hemminger shemmin...@vyatta.com
---
v2 - document minor number and make sure to not overlap

 Documentation/devices.txt  |    2 ++
 drivers/vhost/net.c        |    8 +---
 include/linux/miscdevice.h |    1 +
 3 files changed, 8 insertions(+), 3 deletions(-)

--- a/drivers/vhost/net.c	2012-01-10 10:56:58.883179194 -0800
+++ b/drivers/vhost/net.c	2012-01-10 19:48:23.650225892 -0800
@@ -856,9 +856,9 @@ static const struct file_operations vhos
 };
 
 static struct miscdevice vhost_net_misc = {
-	MISC_DYNAMIC_MINOR,
-	"vhost-net",
-	&vhost_net_fops,
+	.minor = VHOST_NET_MINOR,
+	.name = "vhost-net",
+	.fops = &vhost_net_fops,
 };
 
 static int vhost_net_init(void)
@@ -879,3 +879,5 @@ MODULE_VERSION("0.0.1");
 MODULE_LICENSE("GPL v2");
 MODULE_AUTHOR("Michael S. Tsirkin");
 MODULE_DESCRIPTION("Host kernel accelerator for virtio net");
+MODULE_ALIAS_MISCDEV(VHOST_NET_MINOR);
+MODULE_ALIAS("devname:vhost-net");
--- a/include/linux/miscdevice.h	2012-01-10 10:56:59.779189436 -0800
+++ b/include/linux/miscdevice.h	2012-01-11 09:13:20.803694316 -0800
@@ -42,6 +42,7 @@
 #define AUTOFS_MINOR		235
 #define MAPPER_CTRL_MINOR	236
 #define LOOP_CTRL_MINOR		237
+#define VHOST_NET_MINOR		238
 #define MISC_DYNAMIC_MINOR	255
 
 struct device;
--- a/Documentation/devices.txt	2012-01-10 10:56:53.399116518 -0800
+++ b/Documentation/devices.txt	2012-01-11 09:12:49.251197653 -0800
@@ -447,6 +447,8 @@ Your cooperation is appreciated.
 		234 = /dev/btrfs-control	Btrfs control device
 		235 = /dev/autofs	Autofs control device
 		236 = /dev/mapper/control	Device-Mapper control device
+		237 = /dev/vhost-net	Host kernel accelerator for virtio net
+

238? The stuff for LOOP_CTRL seems to be missing?

 		240-254			Reserved for local use
 		255			Reserved for MISC_DYNAMIC_MINOR

-- 
Regards,

Zhi Yong Wu
Re: [PATCH v5 00/13] KVM/ARM Implementation
On Jan 11, 2012, at 8:48 AM, Peter Maydell wrote:
On 11 December 2011 19:23, Christoffer Dall c.d...@virtualopensystems.com wrote:
On Sun, Dec 11, 2011 at 6:32 AM, Peter Maydell peter.mayd...@linaro.org wrote:
On 11 December 2011 10:24, Christoffer Dall c.d...@virtualopensystems.com wrote:

Still on the to-do list:
 - Reuse VMIDs
 - Fix SMP host support
 - Fix SMP guest support
 - Support guest Thumb mode for MMIO emulation
 - Further testing
 - Performance improvements

Other items for this list:
 - Support Neon/VFP in guests (the fpu regs struct is empty ATM)
 - Support guest debugging

ok, thanks, will add these to the list. I have a feeling it will keep growing for a while :)

Do you have a kernel-side TODO list somewhere public (wiki page?)

I wanted to create this as issues on the github repos...

(It would be quite useful to be able to boot a reasonably modern [read, ARMv7, Thumb2, VFPv3] guest userspace; does anybody plan to work on this part soon?)

We have booted the linaro init environment and recent Angstrom distributions. Android is being actively tested. What specifically did you have in mind?

-Christoffer
Re: [RFC PATCH 04/16] KVM: PPC: factor out lpid allocator from book3s_64_mmu_hv
On Mon, Jan 09, 2012 at 04:35:52PM +0100, Alexander Graf wrote:

Paul, does this work for you? IIRC you need this code to be available from real mode, which powerpc.c isn't in, right?

We don't need to allocate LPIDs from real mode, so it should be OK. book3s_64_mmu_hv.c is not real mode code, and it gets compiled into the KVM module.

Paul.
Could anybody give some description about the implementation of hypercall in kvm?
Hi,

Could anybody give some description of how hypercalls are implemented in kvm? Or give some links about it?

Thanks,
ping fan
[PATCH] vhost-net: add module alias (v2.1)
Subject: vhost-net: add module alias (v2.1)

By adding some module aliases, programs (or users) won't have to explicitly call modprobe. Vhost-net will always be available if built into the kernel. It does require assigning a permanent minor number for depmod to work.

Also:
 - use C99 style initialization.
 - add missing entry in documentation for loop-control

Signed-off-by: Stephen Hemminger shemmin...@vyatta.com
---
2.1 - add missing documentation for loop control as well

 Documentation/devices.txt  |    3 +++
 drivers/vhost/net.c        |    8 +---
 include/linux/miscdevice.h |    1 +
 3 files changed, 9 insertions(+), 3 deletions(-)

--- a/drivers/vhost/net.c	2012-01-10 10:56:58.883179194 -0800
+++ b/drivers/vhost/net.c	2012-01-10 19:48:23.650225892 -0800
@@ -856,9 +856,9 @@ static const struct file_operations vhos
 };
 
 static struct miscdevice vhost_net_misc = {
-	MISC_DYNAMIC_MINOR,
-	"vhost-net",
-	&vhost_net_fops,
+	.minor = VHOST_NET_MINOR,
+	.name = "vhost-net",
+	.fops = &vhost_net_fops,
 };
 
 static int vhost_net_init(void)
@@ -879,3 +879,5 @@ MODULE_VERSION("0.0.1");
 MODULE_LICENSE("GPL v2");
 MODULE_AUTHOR("Michael S. Tsirkin");
 MODULE_DESCRIPTION("Host kernel accelerator for virtio net");
+MODULE_ALIAS_MISCDEV(VHOST_NET_MINOR);
+MODULE_ALIAS("devname:vhost-net");
--- a/include/linux/miscdevice.h	2012-01-10 10:56:59.779189436 -0800
+++ b/include/linux/miscdevice.h	2012-01-11 09:13:20.803694316 -0800
@@ -42,6 +42,7 @@
 #define AUTOFS_MINOR		235
 #define MAPPER_CTRL_MINOR	236
 #define LOOP_CTRL_MINOR		237
+#define VHOST_NET_MINOR		238
 #define MISC_DYNAMIC_MINOR	255
 
 struct device;
--- a/Documentation/devices.txt	2012-01-10 10:56:53.399116518 -0800
+++ b/Documentation/devices.txt	2012-01-11 13:17:07.882113340 -0800
@@ -447,6 +447,9 @@ Your cooperation is appreciated.
 		234 = /dev/btrfs-control	Btrfs control device
 		235 = /dev/autofs	Autofs control device
 		236 = /dev/mapper/control	Device-Mapper control device
+		237 = /dev/loop-control	Loopback control device
+		238 = /dev/vhost-net	Host kernel accelerator for virtio net
+
 		240-254			Reserved for local use
 		255			Reserved for MISC_DYNAMIC_MINOR
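The "C99 style initialization" change in the patch swaps positional initializers for designated ones. The sketch below shows the equivalence on a cut-down stand-in; `struct demo_miscdevice`, `demo_fops` and `DEMO_VHOST_NET_MINOR` are hypothetical names, and the real struct miscdevice has more members than the three used here.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, reduced stand-in for the kernel's struct miscdevice. */
struct demo_miscdevice {
    int minor;
    const char *name;
    const void *fops;
};

#define DEMO_VHOST_NET_MINOR 238        /* value proposed in the patch */

static const int demo_fops = 0;         /* placeholder for a fops table */

/* Old style: meaning depends entirely on member order. */
static const struct demo_miscdevice positional = {
    DEMO_VHOST_NET_MINOR,
    "vhost-net",
    &demo_fops,
};

/* C99 designated initializers: order no longer matters. */
static const struct demo_miscdevice designated = {
    .fops  = &demo_fops,
    .name  = "vhost-net",
    .minor = DEMO_VHOST_NET_MINOR,
};

/* Compares the three members; returns non-zero when they all match. */
static int same_device(const struct demo_miscdevice *a,
                       const struct demo_miscdevice *b)
{
    assert(a != NULL && b != NULL);
    return a->minor == b->minor &&
           strcmp(a->name, b->name) == 0 &&
           a->fops == b->fops;
}
```

Both objects end up identical, but the designated form keeps working if members are reordered or new ones are inserted into the struct, which is the usual reason kernel code prefers it.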
Re: use of PMU in guest generates messages in host
On Wed, Jan 11, 2012 at 01:47:55PM -0700, David Ahern wrote:

Using latest kernel tree (e343a895a9f342f239c5e3c5ffc6c0b1707e6244) which has KVM bits for using PMU in the guest. Host and guest are both running Fedora 16, 64-bit, with this kernel.

Running this command in the guest:

    perf stat -ddd -- openssl speed aes

Generates this in the host:

[74728.221863] kvm_set_msr_common: 2760 callbacks suppressed
[74728.221950] kvm: 28217: cpu2 unhandled wrmsr: 0x1a6 data f701
[74728.222115] kvm: 28217: cpu2 unhandled wrmsr: 0x1a6 data f701
[74728.222858] kvm: 28217: cpu2 unhandled wrmsr: 0x1a6 data f701
[74728.223018] kvm: 28217: cpu2 unhandled wrmsr: 0x1a6 data f701
[74728.223851] kvm: 28217: cpu2 unhandled wrmsr: 0x1a6 data f701
[74728.224009] kvm: 28217: cpu2 unhandled wrmsr: 0x1a6 data f701
[74728.224843] kvm: 28217: cpu2 unhandled wrmsr: 0x1a6 data f701
[74728.224997] kvm: 28217: cpu2 unhandled wrmsr: 0x1a6 data f701
[74728.225842] kvm: 28217: cpu2 unhandled wrmsr: 0x1a6 data f001
[74728.226010] kvm: 28217: cpu2 unhandled wrmsr: 0x1a6 data f001

This is the MSR_OFFCORE_RSP_0 MSR, which is not (yet?) supported. What is your host cpu and qemu command line?

--
			Gleb.
Re: [Qemu-devel] State of KVM bits in linux-headers
On Wed, Jan 11, 2012 at 08:46:38PM +0100, Alexander Graf wrote:
On 11.01.2012, at 20:41, Anthony Liguori wrote:
On 01/11/2012 01:38 PM, Jan Kiszka wrote:

I would like to see us avoiding this in the future. Headers update patches should mention the source and should not be merged until the ABI changes actually made it at least into kvm.git. Same applies, of course, to the functional changes related to that ABI. Otherwise we risk quite some mess on everyone's side.

I agree.

Another thing: KVM_CAP_PPC_HIOR has been removed again from the kernel and also the header. Is there real free space now or will the cap reappear? If there should better be a placeholder, let's add it (to the kernel).

I will reappear with ONE_REG semantics.

OK. Then please clean up now so that update-linux-headers.sh can be used again by normal developers. :)

Before we did submodules and had a responsive BIOS maintainer, we maintained patches within qemu.git for our external dependencies. I think that's a good strategy here too. It's a little painful, but not entirely awful. At least it makes it possible for you to (hopefully) trivial rebase a patch if something is still in limbo.

Yeah, that works. I can easily script that part. It doesn't solve the actual underlying problem though that we don't know when the abi is actually stable. I'm slowly starting to understand Pekka ;).

In my recent experience with submitting Joerg's patch series that touches both the kernel and tools/perf, I didn't see any advantage in having them in the same repository. Yes, the repository is the same, but the maintainers are different and have their own timelines and priorities. Long story short, the userspace part was applied almost three months after the kernel part.

--
			Gleb.
Re: [RFC PATCH 14/16] KVM: PPC: booke: category E.HV (GS-mode) support
On Tue, 2012-01-10 at 04:11 +0100, Alexander Graf wrote:

This is what book3s does:

	case EMULATE_FAIL:
		printk(KERN_CRIT "%s: emulation at %lx failed (%08x)\n",
		       __func__, kvmppc_get_pc(vcpu), kvmppc_get_last_inst(vcpu));
		kvmppc_core_queue_program(vcpu, flags);
		r = RESUME_GUEST;

which also doesn't throttle the printk, but I think injecting a program fault into the guest is the most sensible thing to do if we don't know what the instruction is supposed to do. Best case we get an oops inside the guest telling us what broke :).

You can also fall back to a slow path that reads the guest TLB, translates, then reads the instruction. Of course you have to be careful, as such a manual translate + read + execute needs to be somewhat synchronized with a possible TLB invalidation :-)

(MMIO emulation is broken in this regard too btw)

Cheers,
Ben.
Re: [RFC PATCH 14/16] KVM: PPC: booke: category E.HV (GS-mode) support
On 12.01.2012, at 07:44, Benjamin Herrenschmidt b...@kernel.crashing.org wrote: On Tue, 2012-01-10 at 04:11 +0100, Alexander Graf wrote: This is what book3s does: case EMULATE_FAIL: printk(KERN_CRIT "%s: emulation at %lx failed (%08x)\n", __func__, kvmppc_get_pc(vcpu), kvmppc_get_last_inst(vcpu)); kvmppc_core_queue_program(vcpu, flags); r = RESUME_GUEST; which also doesn't throttle the printk, but I think injecting a program fault into the guest is the most sensible thing to do if we don't know what the instruction is supposed to do. Best case we get an oops inside the guest telling us what broke :). You can also fall back to a slow path that reads the guest TLB, translates, then reads the instruction. Of course you have to be careful, as such a manual translate + read + execute needs to be somewhat synchronized with a possible TLB invalidation :-) Well, we do want to be fast on the default path though. So yes, what you're saying is what book3s does, but as a fallback in case the fast path didn't work. The problem here, however, is that we don't know if the fast path failed; we oops. (MMIO emulation is broken in this regard too, btw.) Huh? Alex Cheers, Ben.
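Both messages above note that the EMULATE_FAIL path does not throttle its printk. As a rough illustration of what throttling would mean there, here is a minimal userspace sketch in C: allow at most a fixed burst of messages per time window and drop the rest. It is not kernel code and not taken from the thread. The names struct ratelimit, ratelimit_allow() and report_emulation_failure() are hypothetical, and time is passed in as a plain tick counter to keep the example deterministic. The kernel has its own facility for this (printk_ratelimited()), which this sketch does not use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical model of a rate limit: at most `burst` messages
 * per window of `interval` ticks. Not the kernel's implementation. */
struct ratelimit {
    long interval;  /* window length in ticks */
    int  burst;     /* messages allowed per window */
    long begin;     /* tick at which the current window started */
    int  printed;   /* messages emitted in the current window */
};

/* Return true if a message may be emitted at tick `now`. */
bool ratelimit_allow(struct ratelimit *rl, long now)
{
    assert(rl != NULL);

    /* Open a fresh window once the current one has expired. */
    if (now - rl->begin >= rl->interval) {
        rl->begin = now;
        rl->printed = 0;
    }
    if (rl->printed >= rl->burst)
        return false;   /* budget used up, drop the message */
    rl->printed++;
    return true;
}

/* Log an emulation failure unless the limit says to stay quiet.
 * Returns true if the message was actually written. */
bool report_emulation_failure(struct ratelimit *rl, long now,
                              unsigned long pc, unsigned int inst)
{
    if (!ratelimit_allow(rl, now))
        return false;
    fprintf(stderr, "emulation at %lx failed (%08x)\n", pc, inst);
    return true;
}
```

With interval = 10 and burst = 2, calls at ticks 0 and 1 are logged, a call at tick 2 is dropped, and a call at tick 11 is logged again because a new window has started. Only the log message is throttled here; injecting the program fault into the guest, as discussed above, would still have to happen on every failure.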
Re: [PATCH] KVM: PPC: refer to paravirt docs in header file
On 12.01.2012, at 00:37, Scott Wood wrote: Instead of keeping separate copies of struct kvm_vcpu_arch_shared (one in the code, one in the docs) that inevitably fail to be kept in sync (already sr[] is missing from the doc version), just point to the header file as the source of documentation on the contents of the magic page. Signed-off-by: Scott Wood scottw...@freescale.com Avi, please ack. Alex
Re: [RFC PATCH 04/16] KVM: PPC: factor out lpid allocator from book3s_64_mmu_hv
On Mon, Jan 09, 2012 at 04:35:52PM +0100, Alexander Graf wrote: Paul, does this work for you? IIRC you need this code to be available from real mode, which powerpc.c isn't in, right? We don't need to allocate LPIDs from real mode, so it should be OK. book3s_64_mmu_hv.c is not real mode code, and it gets compiled into the KVM module. Paul.
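For readers unfamiliar with what an LPID allocator of the kind being factored out has to do, the following is a minimal userspace sketch in C of a bitmap-based ID allocator. It is not the code from the patch. NR_LPIDS, lpid_claim(), lpid_alloc() and lpid_free() are hypothetical names chosen for illustration, and the pool size is arbitrary.

```c
#include <assert.h>
#include <limits.h>

#define NR_LPIDS      64   /* hypothetical pool size */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define NR_WORDS      ((NR_LPIDS + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* One bit per ID: set means in use. */
static unsigned long lpid_inuse[NR_WORDS];

/* Mark a specific ID as taken without searching,
 * e.g. an ID reserved for the host. */
void lpid_claim(unsigned int lpid)
{
    assert(lpid < NR_LPIDS);
    lpid_inuse[lpid / BITS_PER_WORD] |= 1UL << (lpid % BITS_PER_WORD);
}

/* Return the lowest free ID and mark it used, or -1 if none is left. */
int lpid_alloc(void)
{
    for (unsigned int lpid = 0; lpid < NR_LPIDS; lpid++) {
        unsigned long mask = 1UL << (lpid % BITS_PER_WORD);

        if (!(lpid_inuse[lpid / BITS_PER_WORD] & mask)) {
            lpid_inuse[lpid / BITS_PER_WORD] |= mask;
            return (int)lpid;
        }
    }
    return -1;
}

/* Give an ID back to the pool. */
void lpid_free(unsigned int lpid)
{
    assert(lpid < NR_LPIDS);
    lpid_inuse[lpid / BITS_PER_WORD] &= ~(1UL << (lpid % BITS_PER_WORD));
}
```

This sketch is single-threaded. A kernel version would need to make the test-and-set step atomic, since several VMs can be created concurrently.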