Re: [kvm-devel] [RFC] VMX CR3 cache

2008-01-30 Thread Gerd Hoffmann
Marcelo Tosatti wrote: And this is against a changed x86.git -mm tree (with pvops64 patches). I'll send the PTE-write-via-hypercall patches soon and will rebase on top of that (the CR3 cache needs more testing/tuning apparently). Oops for sale ;) Triggered by guests wrmsr, looks like some

Re: [kvm-devel] [RFC] VMX CR3 cache

2008-01-30 Thread Gerd Hoffmann
Gerd Hoffmann wrote: I've passed in a physical address. The vmx_cr3_cache_msr() function has a gva_to_page() call which makes me suspect it expects a virtual address. Confirmed. When passing in a virtual address it works. And it gives me a nice speedup for kernel builds: rhel5-64 kraxel ~#

Re: [kvm-devel] swapping with MMU Notifiers V2

2008-01-30 Thread Carsten Otte
Joerg Roedel wrote: Since NPT uses the host page table format it is in theory possible to add the pagetable to the Linux MM rmap. In this case it would not be necessary to use MMU notifiers. But I think this would complicate the NPT support code significantly. I was hoping for a nearest

Re: [kvm-devel] [PATCH]: Fix memory corruption in-kernel IOAPIC emulation

2008-01-30 Thread Avi Kivity
Chris Lalancette wrote: All, Attached is a patch that fixes the first (of at least a couple) migration problem that I am running into. Basically, using the setup I described in my last post, I was always getting Disabling IRQ #11 once the guest reached the destination side, and then no

Re: [kvm-devel] [RFC] VMX CR3 cache

2008-01-30 Thread Avi Kivity
Gerd Hoffmann wrote: Gerd Hoffmann wrote: I've passed in a physical address. The vmx_cr3_cache_msr() function has a gva_to_page() call which makes me suspect it expects a virtual address. Confirmed. When passing in a virtual address it works. And it gives me a nice speedup for

Re: [kvm-devel] [PATCH] Clean up KVM/QEMU interaction

2008-01-30 Thread Avi Kivity
Avi Kivity wrote: Anthony Liguori wrote: This patch attempts to clean up the interactions between KVM and QEMU. Sorry for such a big patch, but I don't think there's a better way to approach this such that it's still bisect friendly. I think this is most of what's needed to get basic

[kvm-devel] [PATCH] bios: fix for parallel build (make -j2)

2008-01-30 Thread Carlo Marcelo Arenas Belon
prevents reusing tmp.bin for both BIOS-bochs-legacy and BIOS-bochs-latest targets. committed upstream in revision 1.27 of Makefile.in to fix bug 1799877. patch applied as well to generated Makefile. Signed-off-by: Carlo Marcelo Arenas Belon [EMAIL PROTECTED] --- bios/Makefile|4 ++--

Re: [kvm-devel] [PATCH] bios: fix for parallel build (make -j2)

2008-01-30 Thread Avi Kivity
Carlo Marcelo Arenas Belon wrote: prevents reusing tmp.bin for both BIOS-bochs-legacy and BIOS-bochs-latest targets. committed upstream in revision 1.27 of Makefile.in to fix bug 1799877. patch applied as well to generated Makefile. Applied, thanks. -- error compiling committee.c: too

Re: [kvm-devel] [RFC] nmi watchdog in kvm

2008-01-30 Thread Avi Kivity
Balaji Rao wrote: Hello, I was trying to enable the use of nmi watchdog within a linux guest running in kvm. I have done it by allowing direct access to perfmon msrs using the MSR_BITMAP field in vmcs region. Most of the time the NMI Watchdog Test in the guest fails, but with a finite

Re: [kvm-devel] swapping with MMU Notifiers V2

2008-01-30 Thread Andrea Arcangeli
On Wed, Jan 30, 2008 at 12:26:39PM +0100, Carsten Otte wrote: Andrea Arcangeli wrote: By your conclusion I suppose you thought NPT maps guest physical to host virtual. If it was the case the cpu would have to walk three layers of pagetables (each layer is an arrow): guest virtual -> guest physical ->

Re: [kvm-devel] [RFC] VMX CR3 cache

2008-01-30 Thread Gerd Hoffmann
Avi Kivity wrote: [fairly amazing results. how do they compare to xen?] Didn't benchmark it side-by-side yet. Most likely xenner is still noticeably slower on 64bit (32bit should be roughly comparable). I also wouldn't be surprised if you see different results on different workloads. xen

[kvm-devel] [PATCH 1/2] VMX: unifdef the EFER specific code

2008-01-30 Thread Joerg Roedel
To allow access to the EFER register in 32bit KVM the EFER specific code has to be exported to the x86 generic code. This patch does this in a backwards compatible manner. Signed-off-by: Joerg Roedel [EMAIL PROTECTED] --- arch/x86/kvm/vmx.c | 10 ++ 1 files changed, 6 insertions(+), 4

[kvm-devel] Fix PAE guests on KVM 32 bit host

2008-01-30 Thread Roedel joerg.roedel
This small series of patches fixes a boot problem with PAE guests on a 32 bit KVM host. These guests try to access the EFER register when running on AMD, get a GP and crash very soon in the boot process. These patches fix that. They were tested with 32 bit legacy and PAE Linux and Vista 32

[kvm-devel] [PATCH 1/2] VMX: unifdef the EFER specific code

2008-01-30 Thread Roedel joerg.roedel
From: Joerg Roedel [EMAIL PROTECTED] To allow access to the EFER register in 32bit KVM the EFER specific code has to be exported to the x86 generic code. This patch does this in a backwards compatible manner. Signed-off-by: Joerg Roedel [EMAIL PROTECTED] --- arch/x86/kvm/vmx.c | 10 ++

[kvm-devel] [PATCH 2/2] X86: allow access to EFER in 32bit KVM

2008-01-30 Thread Roedel joerg.roedel
From: Joerg Roedel [EMAIL PROTECTED] This patch makes the EFER register accessible on a 32bit KVM host. This is necessary to boot 32 bit PAE guests under SVM. Signed-off-by: Joerg Roedel [EMAIL PROTECTED] --- arch/x86/kvm/x86.c | 19 --- 1 files changed, 8 insertions(+), 11

[kvm-devel] Fix PAE guests on KVM 32 bit host

2008-01-30 Thread Joerg Roedel
[resend due to PBKAC using git-send-email] This small series of patches fixes a boot problem with PAE guests on a 32 bit KVM host. These guests try to access the EFER register when running on AMD, get a GP and crash very soon in the boot process. These patches fix that. They were tested with

[kvm-devel] [PATCH 2/2] X86: allow access to EFER in 32bit KVM

2008-01-30 Thread Joerg Roedel
This patch makes the EFER register accessible on a 32bit KVM host. This is necessary to boot 32 bit PAE guests under SVM. Signed-off-by: Joerg Roedel [EMAIL PROTECTED] --- arch/x86/kvm/x86.c | 19 --- 1 files changed, 8 insertions(+), 11 deletions(-) diff --git

Re: [kvm-devel] [PATCH 1/2] VMX: unifdef the EFER specific code

2008-01-30 Thread Avi Kivity
Joerg Roedel wrote: To allow access to the EFER register in 32bit KVM the EFER specific code has to be exported to the x86 generic code. This patch does this in a backwards compatible manner. Signed-off-by: Joerg Roedel [EMAIL PROTECTED] --- arch/x86/kvm/vmx.c | 10 ++ 1 files

Re: [kvm-devel] [PATCH 2/2] X86: allow access to EFER in 32bit KVM

2008-01-30 Thread Avi Kivity
Joerg Roedel wrote: This patch makes the EFER register accessible on a 32bit KVM host. This is necessary to boot 32 bit PAE guests under SVM. static void set_efer(struct kvm_vcpu *vcpu, u64 efer) { if (efer & EFER_RESERVED_BITS) { @@ -432,12 +430,19 @@ static void

Re: [kvm-devel] [RFC] VMX CR3 cache

2008-01-30 Thread Marcelo Tosatti
On Wed, Jan 30, 2008 at 09:26:53AM +0100, Gerd Hoffmann wrote: Marcelo Tosatti wrote: And this is against a changed x86.git -mm tree (with pvops64 patches). I'll send the PTE-write-via-hypercall patches soon and will rebase on top of that (the CR3 cache needs more testing/tuning

Re: [kvm-devel] [PATCH]: Fix memory corruption in-kernel IOAPIC emulation

2008-01-30 Thread Chris Lalancette
Avi Kivity wrote: Excellent catch, but the fix is wrong. Instead of partially restoring the ioapic state in the kernel, you should fully save it in qemu. This is a trap that many fall into: considering kvm and qemu as one entity and making sure they work well together. We need to make

Re: [kvm-devel] [PATCH] Use CONFIG_PREEMPT_NOTIFIERS around struct preempt_notifier

2008-01-30 Thread Chris Lalancette
Avi Kivity wrote: Hm, this causes my build to fail on x86_64: I pushed a fix for this. Yep, that does it. Thanks! Chris Lalancette

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Andrea Arcangeli
On Tue, Jan 29, 2008 at 06:28:05PM -0600, Jack Steiner wrote: On Tue, Jan 29, 2008 at 04:20:50PM -0800, Christoph Lameter wrote: On Wed, 30 Jan 2008, Andrea Arcangeli wrote: invalidate_range after populate allows access to memory for which ptes were zapped and the refcount was

Re: [kvm-devel] [RFC] nmi watchdog in kvm

2008-01-30 Thread Balaji Rao
On Wednesday 30 January 2008 04:13:58 pm Avi Kivity wrote: @@ -790,6 +795,18 @@ static int apic_mmio_range(struct kvm_io_device *this, gpa_t addr) return ret; } +static int nmi_notify(struct notifier_block *self,unsigned long val, void *data) { + +struct kvm *kvm; +

Re: [kvm-devel] [PATCH] Clean up KVM/QEMU interaction

2008-01-30 Thread Anthony Liguori
Avi Kivity wrote: Anthony Liguori wrote: This patch attempts to clean up the interactions between KVM and QEMU. Sorry for such a big patch, but I don't think there's a better way to approach this such that it's still bisect friendly. I think this is most of what's needed to get basic

Re: [kvm-devel] [PATCH] Cleanup extern declerations for now removed vcpu_env in Qemu

2008-01-30 Thread Avi Kivity
Jerone Young wrote: # HG changeset patch # User Jerone Young [EMAIL PROTECTED] # Date 1201568508 21600 # Node ID a568d031723942e1baf77077031d2b77795cbd8a # Parent 5ce532cf9a1f711d1fecb42814d301abd37aa378 Cleanup extern declerations for now removed vcpu_env in Qemu This patch removes

Re: [kvm-devel] [RFC] nmi watchdog in kvm

2008-01-30 Thread Balaji Rao
On Wednesday 30 January 2008 08:09:32 pm Avi Kivity wrote: I intended to do this here. Looks like it's not the right way to check for presence in vcpu context. How do I do it? Please explain. +static void vmx_inject_nmi(struct kvm_vcpu *vcpu) { + + struct vcpu_vmx * vmx =

Re: [kvm-devel] [PATCH] Remove unnecessary linux/kvm.h include (v2)

2008-01-30 Thread Avi Kivity
Anthony Liguori wrote: This removes an unnecessary include of linux/kvm.h which happens to silence a warning introduced by my previous patch :-) We have to move the ABI check too until we've included libkvm.h. Doesn't apply, please check. -- error compiling committee.c: too many

Re: [kvm-devel] swapping with MMU Notifiers V2

2008-01-30 Thread Joerg Roedel
On Wed, Jan 30, 2008 at 10:49:10AM +0100, Carsten Otte wrote: Joerg Roedel wrote: Since NPT uses the host page table format it is in theory possible to add the pagetable to the Linux MM rmap. In this case it would not be necessary to use MMU notifiers. But I think this would complicate the

Re: [kvm-devel] [RFC] nmi watchdog in kvm

2008-01-30 Thread Avi Kivity
Balaji Rao wrote: On Wednesday 30 January 2008 04:13:58 pm Avi Kivity wrote: @@ -790,6 +795,18 @@ static int apic_mmio_range(struct kvm_io_device *this, gpa_t addr) return ret; } +static int nmi_notify(struct notifier_block *self,unsigned long val, void *data) { + +struct kvm

Re: [kvm-devel] swapping with MMU Notifiers V2

2008-01-30 Thread Carsten Otte
Andrea Arcangeli wrote: Oh I see! So when linux pte is cleared, the NPT equivalent is implicitly and atomically cleared too. That really requires _identical_ semantics and formats for both pagetables. Bingo. We have that on s390, and it seems workable on npt too. That problem is quite easily

Re: [kvm-devel] swapping with MMU Notifiers V2

2008-01-30 Thread Avi Kivity
Carsten Otte wrote: We have restrictions similar to the ones you're naming here. Our guest may start at a (userspace-) page boundary, and has a fixed 1:1 mapping to userspace for a given length. We do that by just having one memory slot which has to start at virtual address zero in kvm. I thought

Re: [kvm-devel] [RFC] nmi watchdog in kvm

2008-01-30 Thread Avi Kivity
Balaji Rao wrote: On Wednesday 30 January 2008 08:09:32 pm Avi Kivity wrote: I intended to do this here. Looks like it's not the right way to check for presence in vcpu context. How do I do it? Please explain. +static void vmx_inject_nmi(struct kvm_vcpu *vcpu) { + + struct

Re: [kvm-devel] swapping with MMU Notifiers V2

2008-01-30 Thread Carsten Otte
Avi Kivity wrote: Carsten Otte wrote: We have restrictions similar to the ones you're naming here. Our guest may start at a (userspace-) page boundary, and has a fixed 1:1 mapping to userspace for a given length. We do that by just having one memory slot which has to start at virtual address zero

Re: [kvm-devel] [patch 1/6] mmu_notifier: Core code

2008-01-30 Thread Jack Steiner
On Wed, Jan 30, 2008 at 04:37:49PM +0100, Andrea Arcangeli wrote: On Tue, Jan 29, 2008 at 06:29:10PM -0800, Christoph Lameter wrote: +void mmu_notifier_release(struct mm_struct *mm) +{ + struct mmu_notifier *mn; + struct hlist_node *n, *t; + + if

Re: [kvm-devel] [patch 1/6] mmu_notifier: Core code

2008-01-30 Thread Andrea Arcangeli
On Tue, Jan 29, 2008 at 06:29:10PM -0800, Christoph Lameter wrote: +void mmu_notifier_release(struct mm_struct *mm) +{ + struct mmu_notifier *mn; + struct hlist_node *n, *t; + + if (unlikely(!hlist_empty(&mm->mmu_notifier.head))) { + rcu_read_lock(); +

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Robin Holt
Robin, if you don't mind, could you please post or upload somewhere your GPLv2 code that registers itself in Christoph's V2 notifiers? Or is it top secret? I wouldn't mind to have a look so I can better understand what's the exact reason you're sleeping besides attempting GFP_KERNEL

Re: [kvm-devel] [patch 1/6] mmu_notifier: Core code

2008-01-30 Thread Andrea Arcangeli
On Wed, Jan 30, 2008 at 09:53:06AM -0600, Jack Steiner wrote: That will also resolve the problem we discussed yesterday. I want to unregister my mmu_notifier when a GRU segment is unmapped. This would not necessarily be at task termination. My proof that there is something wrong in the smp

Re: [kvm-devel] [PATCH]: Fix memory corruption in-kernel IOAPIC emulation

2008-01-30 Thread Chris Lalancette
Avi Kivity wrote: Excellent catch, but the fix is wrong. Instead of partially restoring the ioapic state in the kernel, you should fully save it in qemu. This is a trap that many fall into: considering kvm and qemu as one entity and making sure they work well together. We need to make

Re: [kvm-devel] [PATCH] Making SLIRP code more 64-bit clean

2008-01-30 Thread Scott Pakin
Zhang, Xiantao wrote: Scott Pakin wrote: The attached patch corrects a bug in qemu/slirp/tcp_var.h that defines the seg_next field in struct tcpcb to be 32 bits wide regardless of 32/64-bitness. seg_next is assigned a pointer value in qemu/slirp/tcp_subr.c, then cast back to a pointer in
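
The class of bug described in this message (a pointer stored in a field that is always 32 bits wide) can be illustrated with a small standalone sketch. This is not the SLIRP code: `narrow_link`, `wide_link` and both helper functions are hypothetical names made up for the illustration, and the address used below is an arbitrary 64-bit value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a struct whose link field is fixed at 32 bits. */
struct narrow_link {
    uint32_t next;              /* too small to hold a 64-bit pointer */
};

/* Pointer-sized alternative: uintptr_t can hold any object pointer. */
struct wide_link {
    uintptr_t next;
};

/* Store a 64-bit address in the 32-bit field and read it back. */
static uint64_t roundtrip_narrow(uint64_t addr)
{
    struct narrow_link l;

    l.next = (uint32_t)addr;    /* the upper 32 bits are discarded here */
    return (uint64_t)l.next;
}

/* Store a real pointer in the pointer-sized field and read it back. */
static void *roundtrip_wide(void *p)
{
    struct wide_link l;

    l.next = (uintptr_t)p;      /* no bits are lost on 32- or 64-bit hosts */
    return (void *)l.next;
}
```

Passing an address such as `0x00007f0012345678` through `roundtrip_narrow()` loses its upper half, whereas `roundtrip_wide()` returns the pointer it was given. Whether the actual patch uses `uintptr_t` or another pointer-sized type is not visible in the excerpt above.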

Re: [kvm-devel] [PATCH]: Fix memory corruption in-kernel IOAPIC emulation

2008-01-30 Thread Avi Kivity
Chris Lalancette wrote: Avi Kivity wrote: Excellent catch, but the fix is wrong. Instead of partially restoring the ioapic state in the kernel, you should fully save it in qemu. This is a trap that many fall into: considering kvm and qemu as one entity and making sure they work well

[kvm-devel] [PATCH] Remove -DCONFIG_X86 from qemu_cflags (v2)

2008-01-30 Thread Anthony Liguori
This is not really going to work out if we want to merge with QEMU. We can't have magic in QEMU that relies on some external define being set. Since the define is needed by linux/kvm.h the solution is to define it as needed before including linux/kvm.h. This probably depends on my previous

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Andrea Arcangeli
On Wed, Jan 30, 2008 at 10:11:24AM -0600, Robin Holt wrote: Robin, if you don't mind, could you please post or upload somewhere your GPLv2 code that registers itself in Christoph's V2 notifiers? Or is it top secret? I wouldn't mind to have a look so I can better understand what's the exact

Re: [kvm-devel] [Qemu-devel] Re: [PATCH] Making SLIRP code more 64-bit clean

2008-01-30 Thread Blue Swirl
On 1/30/08, Scott Pakin [EMAIL PROTECTED] wrote: Zhang, Xiantao wrote: Scott Pakin wrote: The attached patch corrects a bug in qemu/slirp/tcp_var.h that defines the seg_next field in struct tcpcb to be 32 bits wide regardless of 32/64-bitness. seg_next is assigned a pointer value in

[kvm-devel] Performance monitoring units and KVM

2008-01-30 Thread Markus Armbruster
Before I talk about performance monitoring units (PMUs) and KVM, let me sketch PMUs and the software we have to put them to use. You may wish to skip to the next occurrence of KVM. Modern processors sport PMUs in various forms and shapes. The simplest form is a couple of performance counters,

[kvm-devel] [GIT PULL] KVM updates for the 2.6.25 merge window

2008-01-30 Thread Avi Kivity
Linus, Please pull the kvm updates for 2.6.25 from the repo and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm.git for-linus Changes include performance and scalability improvements, completion of the portability work (though no new architectures are supported with this

Re: [kvm-devel] Performance monitoring units and KVM

2008-01-30 Thread Avi Kivity
Markus Armbruster wrote: System-wide profiling of the *virtual* machine is related to profiling just a process. That's hard. I guess building on Perfmon2 would make sense there, but as long as it's out of tree... Can we wait for it? If not, what then? Give the guest access to the

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Robin Holt
On Wed, Jan 30, 2008 at 06:04:52PM +0100, Andrea Arcangeli wrote: On Wed, Jan 30, 2008 at 10:11:24AM -0600, Robin Holt wrote: ... The three issues we need to simultaneously solve are revoking the remote page table/tlb information while still in a sleepable context and not having the remote

Re: [kvm-devel] How-to use paravirt layer for network and block devices

2008-01-30 Thread Cam Macdonell
Dor Laor wrote: On Tue, 2008-01-29 at 10:50 -0700, Cameron Macdonell wrote: Hi, What are the command-line options necessary to get the guest devices to use the paravirt layer? For network you use '-net nic,model=virtio', I hope to write a wiki page for it tomorrow. Great, thanks.

Re: [kvm-devel] [patch 1/6] mmu_notifier: Core code

2008-01-30 Thread Robin Holt
Back to one of Andrea's points from a couple days ago, I think we still have a problem with the PageExternalRmap page flag. If I had two drivers with external rmap implementations, there is no way I can think of for a simple flag to coordinate a single page being exported and maintained by the

Re: [kvm-devel] [patch 3/6] mmu_notifier: invalidate_page callbacks for subsystems with rmap

2008-01-30 Thread Robin Holt
This is the second part of a patch posted to patch 1/6. Index: git-linus/mm/rmap.c === --- git-linus.orig/mm/rmap.c2008-01-30 11:55:56.0 -0600 +++ git-linus/mm/rmap.c 2008-01-30 12:01:28.0 -0600 @@ -476,8 +476,10

Re: [kvm-devel] Performance monitoring units and KVM

2008-01-30 Thread Andi Kleen
Is there really a requirement to profile several userspace programs, on several guests, simultaneously? Since guests affect each others performance (e.g. one guest can push the data of another guest out of cache) profiling over guests makes a lot of sense. Otherwise you cannot easily

Re: [kvm-devel] Performance monitoring units and KVM

2008-01-30 Thread Balaji Rao
On Wednesday 30 January 2008 11:11:51 pm Avi Kivity wrote: Markus Armbruster wrote: System-wide profiling of the *virtual* machine is related to profiling just a process. That's hard. I guess building on Perfmon2 would make sense there, but as long as it's out of tree... Can we wait for

Re: [kvm-devel] Performance monitoring units and KVM

2008-01-30 Thread Avi Kivity
Balaji Rao wrote: On Wednesday 30 January 2008 11:11:51 pm Avi Kivity wrote: Markus Armbruster wrote: System-wide profiling of the *virtual* machine is related to profiling just a process. That's hard. I guess building on Perfmon2 would make sense there, but as long as it's out of

Re: [kvm-devel] [PATCH 2/2] X86: allow access to EFER in 32bit KVM

2008-01-30 Thread Joerg Roedel
On Wed, Jan 30, 2008 at 03:11:36PM +0200, Avi Kivity wrote: Joerg Roedel wrote: This patch makes the EFER register accessible on a 32bit KVM host. This is necessary to boot 32 bit PAE guests under SVM. static void set_efer(struct kvm_vcpu *vcpu, u64 efer) { if (efer

Re: [kvm-devel] large page support for kvm

2008-01-30 Thread Joerg Roedel
On Tue, Jan 29, 2008 at 07:20:12PM +0200, Avi Kivity wrote: Here's a rough sketch of my proposal: - For every memory slot, allocate an array containing one int for every potential large page included within that memory slot. Each entry in the array contains the number of write-protected
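
The per-slot counter array from the proposal quoted above could look roughly like the following userspace sketch. The excerpt is cut off, so several things here are assumptions rather than the proposal itself: that each counter tracks write-protected small pages inside its large page, that a large page may be mapped only while its counter is zero, and the sizes (4 KiB pages, 2 MiB large pages). All names (`slot_sketch`, `slot_write_protect`, and so on) are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumption: 4 KiB small pages and 2 MiB large pages, i.e. 512 per large page. */
#define PAGES_PER_LARGE_PAGE 512UL

/* Hypothetical memory slot: one int per potential large page it touches. */
struct slot_sketch {
    unsigned long base_gfn;     /* first guest frame number of the slot */
    unsigned long npages;       /* slot length in small pages, must be > 0 */
    int *wp_count;              /* write-protected pages per large page */
};

/* Index of the large page containing gfn, relative to the slot start. */
static unsigned long large_index(const struct slot_sketch *s, unsigned long gfn)
{
    return gfn / PAGES_PER_LARGE_PAGE - s->base_gfn / PAGES_PER_LARGE_PAGE;
}

/* Allocate zeroed counters; returns 0 on success, -1 on allocation failure. */
static int slot_init(struct slot_sketch *s, unsigned long base_gfn,
                     unsigned long npages)
{
    unsigned long last = base_gfn + npages - 1;
    unsigned long n = last / PAGES_PER_LARGE_PAGE
                    - base_gfn / PAGES_PER_LARGE_PAGE + 1;

    s->base_gfn = base_gfn;
    s->npages = npages;
    s->wp_count = calloc(n, sizeof(int));
    return s->wp_count ? 0 : -1;
}

/* A small page inside the large page became write-protected. */
static void slot_write_protect(struct slot_sketch *s, unsigned long gfn)
{
    s->wp_count[large_index(s, gfn)]++;
}

/* The write protection on that small page was dropped again. */
static void slot_unprotect(struct slot_sketch *s, unsigned long gfn)
{
    s->wp_count[large_index(s, gfn)]--;
}

/* Assumed rule: a large mapping is allowed only if nothing inside is protected. */
static int slot_can_map_large(const struct slot_sketch *s, unsigned long gfn)
{
    return s->wp_count[large_index(s, gfn)] == 0;
}

static void slot_free(struct slot_sketch *s)
{
    free(s->wp_count);
    s->wp_count = NULL;
}
```

With a slot covering frames 0 to 1023, write-protecting frame 5 blocks a large mapping for any frame in 0 to 511 but leaves frames 512 to 1023 eligible. The index is computed from large-page numbers rather than from the slot offset so that slots not aligned to a large-page boundary still get one counter per large page they overlap.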

Re: [kvm-devel] swapping with MMU Notifiers V2

2008-01-30 Thread Andrea Arcangeli
Ok, I think I found one first deadlock source during swapping with the mmu notifiers and it's a KVM bug. I got a deadlock inversion between PT lock and mmu_lock because of this bug. With PREEMPT=n it's not enough to spin_lock(mmu_lock) to disable preempt and in turn the page fault will go through

Re: [kvm-devel] [patch 1/6] mmu_notifier: Core code

2008-01-30 Thread Christoph Lameter
Ok. So I added the following patch: --- include/linux/mmu_notifier.h |1 + mm/mmu_notifier.c| 12 2 files changed, 13 insertions(+) Index: linux-2.6/include/linux/mmu_notifier.h === ---

Re: [kvm-devel] [patch 1/6] mmu_notifier: Core code

2008-01-30 Thread Christoph Lameter
On Wed, 30 Jan 2008, Jack Steiner wrote: Moving to a different lock solves the problem. Well it gets us back to the issue why we removed the lock. As Robin said before: If it's global then we can have a huge number of tasks contending for the lock on startup of a process with a large number of

Re: [kvm-devel] [patch 1/6] mmu_notifier: Core code

2008-01-30 Thread Christoph Lameter
How about just taking the mmap_sem writelock in release? We have only a single caller of mmu_notifier_release() in mm/mmap.c and we know that we are not holding mmap_sem at that point. So just acquire it when needed? Index: linux-2.6/mm/mmu_notifier.c

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Christoph Lameter
On Wed, 30 Jan 2008, Robin Holt wrote: I think I need to straighten this discussion out in my head a little bit. Am I correct in assuming Andrea's original patch set did not have any SMP race conditions for KVM? If so, then we need to start looking at how to implement Christoph's and my

Re: [kvm-devel] Performance monitoring units and KVM

2008-01-30 Thread Markus Armbruster
Avi Kivity [EMAIL PROTECTED] writes: Markus Armbruster wrote: System-wide profiling of the *virtual* machine is related to profiling just a process. That's hard. I guess building on Perfmon2 would make sense there, but as long as it's out of tree... Can we wait for it? If not, what then?

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Jack Steiner
On Wed, Jan 30, 2008 at 11:41:29AM -0800, Christoph Lameter wrote: On Wed, 30 Jan 2008, Jack Steiner wrote: I see what you mean. I need to review the mail to see why this changed but in the original discussions with Christoph, the invalidate_range callouts were supposed to be made BEFORE

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Christoph Lameter
On Wed, 30 Jan 2008, Jack Steiner wrote: Seems that we cannot rely on the invalidate_ranges for correctness at all? We need to have invalidate_page() always. invalidate_range() is only an optimization. I don't understand your point an optimization. How would invalidate_range as

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Robin Holt
On Wed, Jan 30, 2008 at 11:50:26AM -0800, Christoph Lameter wrote: On Wed, 30 Jan 2008, Andrea Arcangeli wrote: XPMEM requires with invalidate_range (sleepy) + before_invalidate_range (sleepy). invalidate_all should also be called before_release (both sleepy). It sounds we need full

[kvm-devel] [patch 1/4] KVM: basic paravirt support

2008-01-30 Thread Marcelo Tosatti
Add basic KVM paravirt support. Avoid vm-exits on IO delays. Add KVM_GET_PARA_FEATURES ioctl so paravirt features can be reported via cpuid. Signed-off-by: Marcelo Tosatti [EMAIL PROTECTED] Index: linux-2.6-x86-kvm/arch/x86/Kconfig

[kvm-devel] [patch 2/4] KVM: hypercall based pte updates and TLB flushes

2008-01-30 Thread Marcelo Tosatti
Hypercall based pte updates are faster than faults, and also allow use of the lazy MMU mode to batch operations. Signed-off-by: Marcelo Tosatti [EMAIL PROTECTED] Index: linux-2.6-x86-kvm/arch/x86/kernel/kvm.c === ---

[kvm-devel] [patch 4/4] KVM: hypercall batching

2008-01-30 Thread Marcelo Tosatti
Batch pte updates and tlb flushes in lazy MMU mode. Signed-off-by: Marcelo Tosatti [EMAIL PROTECTED] Index: linux-2.6-x86-kvm/arch/x86/kernel/kvm.c === --- linux-2.6-x86-kvm.orig/arch/x86/kernel/kvm.c +++

[kvm-devel] [patch 3/4] paravirt: set_access_flags/set_wrprotect should use paravirt interface

2008-01-30 Thread Marcelo Tosatti
ptep_set_access_flags and ptep_set_wrprotect are doing direct pte updates ignoring the paravirt interface. The wrprotect change is especially important since it allows full batching of fork() on COW mappings. There are still a few PTE update interfaces bypassing paravirt, such as

[kvm-devel] QEMU/KVM: report paravirt features on cpuid

2008-01-30 Thread Marcelo Tosatti
And the corresponding QEMU support to report parafeatures on cpuid. Signed-off-by: Marcelo Tosatti [EMAIL PROTECTED] Index: kvm-userspace/libkvm/libkvm.c === --- kvm-userspace.orig/libkvm/libkvm.c +++ kvm-userspace/libkvm/libkvm.c

Re: [kvm-devel] [EMAIL PROTECTED]: [patch 3/4] paravirt: set_access_flags/set_wrprotect should use paravirt interface]

2008-01-30 Thread Jeremy Fitzhardinge
Marcelo Tosatti wrote: Forgot to copy you... Ideally all pte updates should be done via the paravirt interface. Hm, are you sure? +static inline void pte_clear_bit(unsigned int bit, pte_t *ptep) +{ + pte_t pte = *ptep; + clear_bit(bit, (unsigned long *)&pte.pte); +

[kvm-devel] Missing part from qemu merge

2008-01-30 Thread Anders Melchiorsen
Hi Avi, is it on purpose that this part of my qemu rearm rework was left out when the rest was merged into KVM a few days ago? If there is still a problem with it for KVM, I would like to know. Cheers, Anders --- a/qemu/vl.c +++ b/qemu/vl.c @@ -1106,7 +1106,6 @@ static void

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Andrea Arcangeli
On Wed, Jan 30, 2008 at 11:50:26AM -0800, Christoph Lameter wrote: Then we have invalidate_range_start(mm) and invalidate_range_finish(mm, start, end) in addition to the invalidate rmap_notifier? --- include/linux/mmu_notifier.h |7 +-- 1 file changed, 5 insertions(+),

Re: [kvm-devel] [patch 1/6] mmu_notifier: Core code

2008-01-30 Thread Christoph Lameter
On Thu, 31 Jan 2008, Andrea Arcangeli wrote: I think Andrea's original concept of the lock in the mmu_notifier_head structure was the best. I agree with him that it should be a spinlock instead of the rw_lock. BTW, I don't see the scalability concern with huge number of tasks: the lock

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Christoph Lameter
On Thu, 31 Jan 2008, Andrea Arcangeli wrote: - void (*invalidate_range)(struct mmu_notifier *mn, + void (*invalidate_range_begin)(struct mmu_notifier *mn, struct mm_struct *mm, -unsigned long start, unsigned long end,

Re: [kvm-devel] [patch 1/6] mmu_notifier: Core code

2008-01-30 Thread Christoph Lameter
On Thu, 31 Jan 2008, Andrea Arcangeli wrote: H.. exit_mmap is only called when the last reference is removed against the mm right? So no tasks are running anymore. No pages are left. Do we need to serialize at all for mmu_notifier_release? KVM sure doesn't need any locking there.

[kvm-devel] [PATCH] Converting mmio to port io in userspace for IA64

2008-01-30 Thread Zhang, Xiantao
From: Zhang Xiantao [EMAIL PROTECTED] Date: Thu, 31 Jan 2008 09:06:21 +0800 Subject: [PATCH] kvm: qemu: Convert the mmio address space to port io in userspace. IA64 also has no port io, but the chipset is responsible for converting some mmio to port io to keep compatibility with legacy devices.

Re: [kvm-devel] [EMAIL PROTECTED]: [patch 3/4] paravirt: set_access_flags/set_wrprotect should use paravirt interface]

2008-01-30 Thread Jeremy Fitzhardinge
Marcelo Tosatti wrote: On Wed, Jan 30, 2008 at 03:00:49PM -0800, Jeremy Fitzhardinge wrote: Marcelo Tosatti wrote: Forgot to copy you... Ideally all pte updates should be done via the paravirt interface. Hm, are you sure? +static inline void pte_clear_bit(unsigned

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Christoph Lameter
On Thu, 31 Jan 2008, Andrea Arcangeli wrote: On Wed, Jan 30, 2008 at 04:01:31PM -0800, Christoph Lameter wrote: How we offload that? Before the scan of the rmaps we do not have the mmstruct. So we'd need another notifier_rmap_callback. My assumption is that that int lock exists just

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Christoph Lameter
Patch to 1. Remove sync on notifier_release. Must be called when only a single process remains. 2. Add invalidate_range_start/end. This should allow safe removal of ranges of external ptes without having to resort to a callback for every individual page. This must be able to nest so

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Christoph Lameter
On Wed, 30 Jan 2008, Robin Holt wrote: Well the GRU uses follow_page() instead of get_user_pages. Performance is a major issue for the GRU. Worse, the GRU takes its TLB faults from within an interrupt so we use follow_page to prevent going to sleep. That said, I think we could

Re: [kvm-devel] Performance monitoring units and KVM

2008-01-30 Thread Andi Kleen
On Thu, Jan 31, 2008 at 12:44:10AM +0530, Balaji Rao wrote: On Wednesday 30 January 2008 11:56:25 pm Andi Kleen wrote: There is not really an architectural PMU if you consider boxes beyond relatively new Intel CPUs (which got one) But since kvm runs only on such CPUs, it should not really

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Andrea Arcangeli
On Wed, Jan 30, 2008 at 06:08:14PM -0800, Christoph Lameter wrote: hlist_for_each_entry_safe_rcu(mn, n, t, &mm->mmu_notifier.head, hlist) { hlist_del_rcu(&mn->hlist);

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Christoph Lameter
On Thu, 31 Jan 2008, Andrea Arcangeli wrote: On Wed, Jan 30, 2008 at 06:08:14PM -0800, Christoph Lameter wrote: hlist_for_each_entry_safe_rcu(mn, n, t, &mm->mmu_notifier.head, hlist) {

Re: [kvm-devel] mmu_notifier: invalidate_range_start with lock=1

2008-01-30 Thread Christoph Lameter
One possible way that XPmem could deal with a call of invalidate_range_start with the lock flag set: Scan through the rmaps you have for ptes. If you find one then elevate the refcount of the corresponding page and mark in the maps that you have done so. Also make them readonly. The increased

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Robin Holt
Well the GRU uses follow_page() instead of get_user_pages. Performance is a major issue for the GRU. Worse, the GRU takes its TLB faults from within an interrupt so we use follow_page to prevent going to sleep. That said, I think we could probably use follow_page() with FOLL_GET set to

Re: [kvm-devel] Performance monitoring units and KVM II

2008-01-30 Thread Andi Kleen
Sure it could, but that would be a new interface. If you were free to define a new interface you could also just go completely hypercall based. Actually thinking about it more it would be probably possible for KVM to emulate ArchPerfMon on AMD and Family 15 Intel based on the local PMU.

[kvm-devel] [patch 2/3] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Christoph Lameter
The invalidation of address ranges in a mm_struct needs to be performed when pages are removed or permissions etc change. invalidate_range_begin/end() is frequently called with only mmap_sem held. If invalidate_range_begin() is called with locks held then we pass a flag into invalidate_range() to

[kvm-devel] [patch 1/3] mmu_notifier: Core code

2008-01-30 Thread Christoph Lameter
Notifier functions for hardware and software that establishes external references to pages of a Linux system. The notifier calls ensure that external mappings are removed when the Linux VM removes memory ranges or individual pages from a process. These fall into two classes: 1. mmu_notifier

[kvm-devel] [patch 0/3] [RFC] MMU Notifiers V4

2008-01-30 Thread Christoph Lameter
I hope this is finally a release that covers all the requirements. Locking description is at the top of the core patch. This is a patchset implementing MMU notifier callbacks based on Andrea's earlier work. These are needed if Linux pages are referenced from something else than tracked by the

[kvm-devel] [patch 3/3] mmu_notifier: invalidate_page callbacks

2008-01-30 Thread Christoph Lameter
Callbacks to remove individual pages as done in rmap code 3 types of callbacks are used: 1. invalidate_page mmu_notifier Called from the inner loop of rmap walks to invalidate pages. 2. invalidate_page mmu_rmap_notifier Called after the Linux rmap loop under PageLock to

Re: [kvm-devel] large page support for kvm

2008-01-30 Thread Avi Kivity
Joerg Roedel wrote: On Tue, Jan 29, 2008 at 07:20:12PM +0200, Avi Kivity wrote: Here's a rough sketch of my proposal: - For every memory slot, allocate an array containing one int for every potential large page included within that memory slot. Each entry in the array contains the
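The per-slot array described in this proposal can be sketched roughly as follows. This is only an illustration under stated assumptions, not the KVM implementation: `struct slot_sketch`, `slot_init()`, `large_index()` and the 512-pages-per-large-page constant are hypothetical names and values chosen for the example, and since the preview is cut off before saying what each entry holds, the sketch treats each entry as an opaque int.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumption: 2 MiB large pages built from 4 KiB base pages. */
#define PAGES_PER_LARGE_PAGE 512UL

/* Hypothetical stand-in for a memory slot; not KVM's real structure. */
struct slot_sketch {
    unsigned long base_gfn;  /* first guest frame number in the slot */
    unsigned long npages;    /* number of base pages in the slot */
    unsigned long nr_large;  /* number of potential large pages covered */
    int *largepage_info;     /* one int per potential large page */
};

/* Allocate one zeroed int for every large-page frame the slot touches. */
static int slot_init(struct slot_sketch *s, unsigned long base_gfn,
                     unsigned long npages)
{
    unsigned long first = base_gfn / PAGES_PER_LARGE_PAGE;
    unsigned long last = (base_gfn + npages - 1) / PAGES_PER_LARGE_PAGE;

    s->base_gfn = base_gfn;
    s->npages = npages;
    s->nr_large = last - first + 1;
    s->largepage_info = calloc(s->nr_large, sizeof(*s->largepage_info));
    return s->largepage_info ? 0 : -1;
}

/* Map a guest frame number inside the slot to its array index. */
static unsigned long large_index(const struct slot_sketch *s,
                                 unsigned long gfn)
{
    assert(gfn >= s->base_gfn && gfn < s->base_gfn + s->npages);
    return gfn / PAGES_PER_LARGE_PAGE - s->base_gfn / PAGES_PER_LARGE_PAGE;
}
```

A caller would look up `s.largepage_info[large_index(&s, gfn)]` before deciding whether a given guest frame may be backed by a large page; what value that entry carries is left open here because the quoted text does not say.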

Re: [kvm-devel] Missing part from qemu merge

2008-01-30 Thread Avi Kivity
Anders Melchiorsen wrote: Hi Avi, is it on purpose that this part of my qemu rearm rework was left out when the rest was merged into KVM a few days ago? If there is still a problem with it for KVM, I would like to know. It was unintentional; a by-product of the automatic merge. Thanks

Re: [kvm-devel] swapping with MMU Notifiers V2

2008-01-30 Thread Avi Kivity
Andrea Arcangeli wrote: Ok, I think I found one first deadlock source during swapping with the mmu notifiers and it's a KVM bug. I got a deadlock inversion between PT lock and mmu_lock because of this bug. With PREEMPT=n it's not enough to spin_lock(mmu_lock) to disable preempt and in turn the

Re: [kvm-devel] Performance monitoring units and KVM

2008-01-30 Thread Balaji Rao
On Thursday 31 January 2008 08:42:32 am Andi Kleen wrote: On Thu, Jan 31, 2008 at 12:44:10AM +0530, Balaji Rao wrote: On Wednesday 30 January 2008 11:56:25 pm Andi Kleen wrote: There is not really an architectural PMU if you consider boxes beyond relatively new Intel CPUs (which got one)

Re: [kvm-devel] Performance monitoring units and KVM

2008-01-30 Thread Avi Kivity
Markus Armbruster wrote: Avi Kivity [EMAIL PROTECTED] writes: Markus Armbruster wrote: System-wide profiling of the *virtual* machine is related to profiling just a process. That's hard. I guess building on Perfmon2 would make sense there, but as long as it's out of tree... Can

Re: [kvm-devel] [PATCH] Converting mmio to port io in userspace for IA64

2008-01-30 Thread Avi Kivity
Zhang, Xiantao wrote: From: Zhang Xiantao [EMAIL PROTECTED] Date: Thu, 31 Jan 2008 09:06:21 +0800 Subject: [PATCH] kvm: qemu: Convert the mmio address space to port io in userspace. IA64 also has no port io, but chipset is responsible for converting some mmio to port io for keeping

Re: [kvm-devel] [EMAIL PROTECTED]: [patch 3/4] paravirt: set_access_flags/set_wrprotect should use paravirt interface]

2008-01-30 Thread Avi Kivity
Jeremy Fitzhardinge wrote: Marcelo Tosatti wrote: Forgot to copy you... Ideally all pte updates should be done via the paravirt interface. Hm, are you sure? It has the advantage of not falsely triggering any unshadowing heuristics, and of avoiding the lovely x86 emulator.

Re: [kvm-devel] [PATCH]: Fix memory corruption in-kernel IOAPIC emulation

2008-01-30 Thread Avi Kivity
Chris Lalancette wrote: Another version of the patch, done by changing the on-the-wire protocol as Avi suggested. I've tested this with: old -> old - Migration works, but runs into the bug I'm trying to fix old -> new - Migration works, but runs into the bug I'm trying to fix new -> old -
