Re: [kvm-devel] [PATCH 01 of 11] mmu-notifier-core
On Wed, 07 May 2008 16:35:51 +0200 Andrea Arcangeli [EMAIL PROTECTED] wrote:

> # HG changeset patch
> # User Andrea Arcangeli [EMAIL PROTECTED]
> # Date 1210096013 -7200
> # Node ID e20917dcc8284b6a07cfcced13dda4cbca850a9c
> # Parent 5026689a3bc323a26d33ad882c34c4c9c9a3ecd8
> mmu-notifier-core
>
> ...
>
> --- a/include/linux/list.h
> +++ b/include/linux/list.h
> @@ -747,7 +747,7 @@ static inline void hlist_del(struct hlis
>   * or hlist_del_rcu(), running on this same list.
>   * However, it is perfectly legal to run concurrently with
>   * the _rcu list-traversal primitives, such as
> - * hlist_for_each_entry().
> + * hlist_for_each_entry_rcu().
>   */
>  static inline void hlist_del_rcu(struct hlist_node *n)
>  {
> @@ -760,6 +760,34 @@ static inline void hlist_del_init(struct
>  	if (!hlist_unhashed(n)) {
>  		__hlist_del(n);
>  		INIT_HLIST_NODE(n);
> +	}
> +}
> +
> +/**
> + * hlist_del_init_rcu - deletes entry from hash list with re-initialization
> + * @n: the element to delete from the hash list.
> + *
> + * Note: list_unhashed() on entry does return true after this. It is

Should that be "does" or "does not"? "does", I suppose.

It should refer to hlist_unhashed().

The term "on entry" is a bit ambiguous - we normally use that as shorthand
to mean "on entry to the function".

So I'll change this to

+ * Note: hlist_unhashed() on the node returns true after this. It is

OK?

(oh, that was copied-and-pasted from similarly errant comments in that file)

> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -10,6 +10,7 @@
>  #include <linux/rbtree.h>
>  #include <linux/rwsem.h>
>  #include <linux/completion.h>
> +#include <linux/cpumask.h>

OK, unrelated bugfix ;)

> --- a/include/linux/srcu.h
> +++ b/include/linux/srcu.h
> @@ -27,6 +27,8 @@
>  #ifndef _LINUX_SRCU_H
>  #define _LINUX_SRCU_H
>
> +#include <linux/mutex.h>

And another. Fair enough.
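The corrected comment says that hlist_unhashed() on the node returns true after a delete-with-reinit. The following is a minimal userspace sketch of that property, not the kernel code: the demo_ names are hypothetical, the layout only mirrors the next/pprev shape of the kernel's hlist, and the RCU aspects of hlist_del_init_rcu() are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model of a hash-list node and head. */
struct demo_hlist_node {
	struct demo_hlist_node *next;
	struct demo_hlist_node **pprev;	/* address of the pointer that points at us */
};

struct demo_hlist_head {
	struct demo_hlist_node *first;
};

static void demo_init_node(struct demo_hlist_node *n)
{
	n->next = NULL;
	n->pprev = NULL;
}

/* A node is "unhashed" when nothing points at it. */
static int demo_unhashed(const struct demo_hlist_node *n)
{
	return n->pprev == NULL;
}

static void demo_add_head(struct demo_hlist_node *n, struct demo_hlist_head *h)
{
	struct demo_hlist_node *first = h->first;

	n->next = first;
	if (first)
		first->pprev = &n->next;
	h->first = n;
	n->pprev = &h->first;
}

/* Unlink the node, then re-initialize it so demo_unhashed() is true again. */
static void demo_del_init(struct demo_hlist_node *n)
{
	if (!demo_unhashed(n)) {
		*n->pprev = n->next;
		if (n->next)
			n->next->pprev = n->pprev;
		demo_init_node(n);
	}
}

/* Returns 1 if the node reads as unhashed after delete-with-reinit. */
int demo_unhashed_after_del_init(void)
{
	struct demo_hlist_head head = { NULL };
	struct demo_hlist_node node;

	demo_init_node(&node);
	demo_add_head(&node, &head);
	assert(!demo_unhashed(&node));	/* hashed while on the list */
	demo_del_init(&node);
	return demo_unhashed(&node) && head.first == NULL;
}
```

Calling demo_del_init() a second time on the same node is harmless in this model, because the unhashed check guards the unlink.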
Re: [kvm-devel] [PATCH 01 of 11] mmu-notifier-core
On Wed, 07 May 2008 16:35:51 +0200 Andrea Arcangeli [EMAIL PROTECTED] wrote:

> # HG changeset patch
> # User Andrea Arcangeli [EMAIL PROTECTED]
> # Date 1210096013 -7200
> # Node ID e20917dcc8284b6a07cfcced13dda4cbca850a9c
> # Parent 5026689a3bc323a26d33ad882c34c4c9c9a3ecd8
> mmu-notifier-core

The patch looks OK to me.

The proposal is that we sneak this into 2.6.26. Are there any
sufficiently-serious objections to this?

The patch will be a no-op for 2.6.26.

This is all rather unusual. For the record, could we please review the
reasons for wanting to do this?

Thanks.
Re: [kvm-devel] [PATCH 08 of 11] anon-vma-rwsem
On Thu, 8 May 2008 00:22:05 +0200 Andrea Arcangeli [EMAIL PROTECTED] wrote:

> > No, the simple solution is to just make up a whole new upper-level lock,
> > and get that lock *first*. You can then take all the multiple locks at a
> > lower level in any order you damn well please.
>
> Unfortunately the lock you're talking about would be:
>
> static spinlock_t global_lock = ...
>
> There's no way to make it more granular.
>
> So every time before taking any ->i_mmap_lock _and_ any anon_vma->lock
> we'd need to take that extremely wide spinlock first (and even worse,
> later it would become a rwsem when XPMEM is selected making the VM
> even slower than it already becomes when XPMEM support is selected at
> compile time).

Nope. We only need to take the global lock before taking *two or more* of
the per-vma locks.

I really wish I'd thought of that.
Re: [kvm-devel] [PATCH 08 of 11] anon-vma-rwsem
On Thu, 8 May 2008 00:44:06 +0200 Andrea Arcangeli [EMAIL PROTECTED] wrote:

> On Wed, May 07, 2008 at 03:31:03PM -0700, Andrew Morton wrote:
> > Nope. We only need to take the global lock before taking *two or more* of
> > the per-vma locks.
> >
> > I really wish I'd thought of that.
>
> I don't see how you can avoid taking the system-wide-global lock
> before every single anon_vma->lock/i_mmap_lock out there without
> mm_lock.
>
> Please note, we can't allow a thread to be in the middle of
> zap_page_range while mmu_notifier_register runs.
>
> vmtruncate takes 1 single lock, the i_mmap_lock of the inode. Not more
> than one lock and we've to still take the global-system-wide lock
> _before_ this single i_mmap_lock and no other lock at all.
>
> Please elaborate, thanks!

umm...

	CPU0:			CPU1:

	spin_lock(a->lock);	spin_lock(b->lock);
	spin_lock(b->lock);	spin_lock(a->lock);

bad.

	CPU0:			CPU1:

	spin_lock(global_lock)	spin_lock(global_lock);
	spin_lock(a->lock);	spin_lock(b->lock);
	spin_lock(b->lock);	spin_lock(a->lock);

Is OK.

	CPU0:			CPU1:

	spin_lock(global_lock)
	spin_lock(a->lock);	spin_lock(b->lock);
	spin_lock(b->lock);	spin_unlock(b->lock);
				spin_lock(a->lock);
				spin_unlock(a->lock);

also OK.

As long as all code paths which can take two-or-more locks are all covered
by the global lock there is no deadlock scenario. If a thread takes just a
single instance of one of these locks without taking the global_lock then
there is also no deadlock.

Now, if we need to take both anon_vma->lock AND i_mmap_lock in the newly
added mm_lock() thing and we also take both those locks at the same time in
regular code, we're probably screwed.
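The rule in the message above can be tried out in userspace. What follows is a rough sketch using POSIX mutexes rather than kernel spinlocks; global_lock, lock_a, lock_b and every function name are illustrative stand-ins, not anything from the patch. Two threads take both locks in opposite orders but only under the upper-level lock, and a third takes a single lock without it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-ins for the upper-level lock and two per-object locks. */
static pthread_mutex_t global_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t lock_a = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t lock_b = PTHREAD_MUTEX_INITIALIZER;
static long counter;	/* only modified while lock_a is held */

/* Takes two locks in the order a, b: must hold global_lock first. */
static void *both_a_then_b(void *arg)
{
	for (intptr_t i = (intptr_t)arg; i > 0; i--) {
		pthread_mutex_lock(&global_lock);
		pthread_mutex_lock(&lock_a);
		pthread_mutex_lock(&lock_b);
		counter++;
		pthread_mutex_unlock(&lock_b);
		pthread_mutex_unlock(&lock_a);
		pthread_mutex_unlock(&global_lock);
	}
	return NULL;
}

/* Takes the same two locks in the opposite order, again under global_lock. */
static void *both_b_then_a(void *arg)
{
	for (intptr_t i = (intptr_t)arg; i > 0; i--) {
		pthread_mutex_lock(&global_lock);
		pthread_mutex_lock(&lock_b);
		pthread_mutex_lock(&lock_a);
		counter++;
		pthread_mutex_unlock(&lock_a);
		pthread_mutex_unlock(&lock_b);
		pthread_mutex_unlock(&global_lock);
	}
	return NULL;
}

/* Takes a single lock and therefore skips global_lock entirely. */
static void *only_a(void *arg)
{
	for (intptr_t i = (intptr_t)arg; i > 0; i--) {
		pthread_mutex_lock(&lock_a);
		counter++;
		pthread_mutex_unlock(&lock_a);
	}
	return NULL;
}

/* Runs the three paths concurrently and returns the final counter value. */
long run_lock_demo(long iterations)
{
	void *(*fn[3])(void *) = { both_a_then_b, both_b_then_a, only_a };
	pthread_t threads[3];
	int rc;

	counter = 0;
	for (int i = 0; i < 3; i++) {
		rc = pthread_create(&threads[i], NULL, fn[i],
				    (void *)(intptr_t)iterations);
		assert(rc == 0);
	}
	for (int i = 0; i < 3; i++) {
		rc = pthread_join(threads[i], NULL);
		assert(rc == 0);
	}
	return counter;
}
```

The single-lock thread never waits for a second lock while holding the first, so it cannot be part of a cycle; the two double-lock threads are serialized by global_lock. Dropping global_lock from either double-lock path reintroduces the "bad" interleaving from the first diagram.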
Re: [kvm-devel] + kvm-provide-kvmh-for-all-architecture-fixes-headers_install.patch added to -mm tree
On Tue, 25 Mar 2008 16:31:46 +0100 Christian Borntraeger [EMAIL PROTECTED] wrote:

> On Wednesday, 12 March 2008, you wrote:
> > The patch titled
> >      kvm: provide kvm.h for all architecture: fixes headers_install
> > has been added to the -mm tree. Its filename is
> >      kvm-provide-kvmh-for-all-architecture-fixes-headers_install.patch
>
> Hello Andrew,
>
> is there a chance to submit this patch before 2.6.25? headers_install of
> kvm.h worked with 2.6.24 but is still broken with 2.6.25-rc.

Sure, I'll merge it.

From: Christian Borntraeger [EMAIL PROTECTED]

Currently include/linux/kvm.h is not considered by make headers_install,
because Kbuild cannot handle "unifdef-$(CONFIG_FOO) += foo.h". This problem
was introduced by 040922c04cf2c8ac70be2e88a8a9614ecdb41d2e, which makes this
a 2.6.25 regression.

One way of solving the issue is to enhance Kbuild, but Avi and David
convinced me that changing headers_install is not the way to go. This patch
changes the definition for linux/kvm.h to unifdef-y.

If unifdef-y is used for linux/kvm.h, "make headers_check" will fail on all
architectures without asm/kvm.h. Therefore, this patch also provides
asm/kvm.h on all architectures.
Signed-off-by: Christian Borntraeger [EMAIL PROTECTED] Acked-by: Avi Kivity [EMAIL PROTECTED] Cc: Sam Ravnborg [EMAIL PROTECTED] Cc: David Woodhouse [EMAIL PROTECTED] Cc: [EMAIL PROTECTED] Signed-off-by: Andrew Morton [EMAIL PROTECTED] --- include/asm-alpha/kvm.h|6 ++ include/asm-arm/kvm.h |6 ++ include/asm-avr32/kvm.h|6 ++ include/asm-blackfin/kvm.h |6 ++ include/asm-cris/kvm.h |6 ++ include/asm-frv/kvm.h |6 ++ include/asm-generic/Kbuild.asm |2 ++ include/asm-h8300/kvm.h|6 ++ include/asm-ia64/kvm.h |6 ++ include/asm-m32r/kvm.h |6 ++ include/asm-m68k/kvm.h |6 ++ include/asm-m68knommu/kvm.h|6 ++ include/asm-mips/kvm.h |6 ++ include/asm-mn10300/kvm.h |6 ++ include/asm-parisc/kvm.h |6 ++ include/asm-powerpc/kvm.h |6 ++ include/asm-s390/kvm.h |6 ++ include/asm-sh/kvm.h |6 ++ include/asm-sparc/kvm.h|6 ++ include/asm-sparc64/kvm.h |6 ++ include/asm-um/kvm.h |6 ++ include/asm-v850/kvm.h |6 ++ include/asm-xtensa/kvm.h |6 ++ include/linux/Kbuild |2 +- 24 files changed, 135 insertions(+), 1 deletion(-) diff -puN /dev/null include/asm-alpha/kvm.h --- /dev/null +++ a/include/asm-alpha/kvm.h @@ -0,0 +1,6 @@ +#ifndef __LINUX_KVM_ALPHA_H +#define __LINUX_KVM_ALPHA_H + +/* alpha does not support KVM */ + +#endif diff -puN /dev/null include/asm-arm/kvm.h --- /dev/null +++ a/include/asm-arm/kvm.h @@ -0,0 +1,6 @@ +#ifndef __LINUX_KVM_ARM_H +#define __LINUX_KVM_ARM_H + +/* arm does not support KVM */ + +#endif diff -puN /dev/null include/asm-avr32/kvm.h --- /dev/null +++ a/include/asm-avr32/kvm.h @@ -0,0 +1,6 @@ +#ifndef __LINUX_KVM_AVR32_H +#define __LINUX_KVM_AVR32_H + +/* avr32 does not support KVM */ + +#endif diff -puN /dev/null include/asm-blackfin/kvm.h --- /dev/null +++ a/include/asm-blackfin/kvm.h @@ -0,0 +1,6 @@ +#ifndef __LINUX_KVM_BLACKFIN_H +#define __LINUX_KVM_BLACKFIN_H + +/* blackfin does not support KVM */ + +#endif diff -puN /dev/null include/asm-cris/kvm.h --- /dev/null +++ a/include/asm-cris/kvm.h @@ -0,0 +1,6 @@ +#ifndef __LINUX_KVM_CRIS_H +#define 
__LINUX_KVM_CRIS_H + +/* cris does not support KVM */ + +#endif diff -puN /dev/null include/asm-frv/kvm.h --- /dev/null +++ a/include/asm-frv/kvm.h @@ -0,0 +1,6 @@ +#ifndef __LINUX_KVM_FRV_H +#define __LINUX_KVM_FRV_H + +/* frv does not support KVM */ + +#endif diff -puN include/asm-generic/Kbuild.asm~kvm-provide-kvmh-for-all-architecture-fixes-headers_install include/asm-generic/Kbuild.asm --- a/include/asm-generic/Kbuild.asm~kvm-provide-kvmh-for-all-architecture-fixes-headers_install +++ a/include/asm-generic/Kbuild.asm @@ -1,3 +1,5 @@ +header-y += kvm.h + ifeq ($(wildcard include/asm-$(SRCARCH)/a.out.h),include/asm-$(SRCARCH)/a.out.h) unifdef-y += a.out.h endif diff -puN /dev/null include/asm-h8300/kvm.h --- /dev/null +++ a/include/asm-h8300/kvm.h @@ -0,0 +1,6 @@ +#ifndef __LINUX_KVM_H8300_H +#define __LINUX_KVM_H8300_H + +/* h8300 does not support KVM */ + +#endif diff -puN /dev/null include/asm-ia64/kvm.h --- /dev/null +++ a/include/asm-ia64/kvm.h @@ -0,0 +1,6 @@ +#ifndef __LINUX_KVM_IA64_H +#define __LINUX_KVM_IA64_H + +/* ia64 does not support KVM */ + +#endif diff -puN /dev/null include/asm-m32r/kvm.h --- /dev/null +++ a/include/asm-m32r/kvm.h @@ -0,0 +1,6 @@ +#ifndef __LINUX_KVM_M32R_H +#define __LINUX_KVM_M32R_H + +/* m32r does not support KVM */ + +#endif diff -puN /dev/null include/asm-m68k/kvm.h --- /dev/null +++ a/include/asm-m68k/kvm.h @@ -0,0 +1,6 @@ +#ifndef __LINUX_KVM_M68K_H +#define
Re: [kvm-devel] [RFC/PATCH 01/15 v2] preparation: provide hook to enable pgstes in user pagetable
On Sat, 22 Mar 2008 18:02:37 +0100 Carsten Otte [EMAIL PROTECTED] wrote:

> From: Martin Schwidefsky [EMAIL PROTECTED]
>
> The SIE instruction on s390 uses the 2nd half of the page table page to
> virtualize the storage keys of a guest. This patch offers the s390_enable_sie
> function, which reorganizes the page tables of a single-threaded process to
> reserve space in the page table:
> s390_enable_sie makes sure that the process is single threaded and then uses
> dup_mm to create a new mm with reorganized page tables. The old mm is freed
> and the process has now a page status extended field after every page table.
>
> Code that wants to exploit pgstes should SELECT CONFIG_PGSTE.
>
> This patch has a small common code hit, namely making dup_mm non-static.
>
> Edit (Carsten): I've modified Martin's patch, following Jeremy Fitzhardinge's
> review feedback. Now we do have the prototype for dup_mm in
> include/linux/sched.h.
>
> ...
>
> --- linux-host.orig/kernel/fork.c
> +++ linux-host/kernel/fork.c
> @@ -498,7 +498,7 @@ void mm_release(struct task_struct *tsk,
>   * Allocate a new mm structure and copy contents from the
>   * mm structure of the passed in task structure.
>   */
> -static struct mm_struct *dup_mm(struct task_struct *tsk)
> +struct mm_struct *dup_mm(struct task_struct *tsk)
>  {
>  	struct mm_struct *mm, *oldmm = current->mm;
>  	int err;

ack

> --- linux-host.orig/include/linux/sched.h
> +++ linux-host/include/linux/sched.h
> @@ -1758,6 +1758,8 @@ extern void mmput(struct mm_struct *);
>  extern struct mm_struct *get_task_mm(struct task_struct *task);
>  /* Remove the current tasks stale references to the old mm_struct */
>  extern void mm_release(struct task_struct *, struct mm_struct *);
> +/* Allocate a new mm structure and copy contents from tsk->mm */
> +extern struct mm_struct *dup_mm(struct task_struct *tsk);
>
>  extern int copy_thread(int, unsigned long, unsigned long, unsigned long, struct task_struct *, struct pt_regs *);
>  extern void flush_thread(void);

hm, why did we put these in sched.h?

oh well - acked-by-me.
Re: [kvm-devel] [RFC/PATCH 02/15 v2] preparation: host memory management changes for s390 kvm
On Sat, 22 Mar 2008 18:02:39 +0100 Carsten Otte [EMAIL PROTECTED] wrote:

> From: Heiko Carstens [EMAIL PROTECTED]
> From: Christian Borntraeger [EMAIL PROTECTED]
>
> This patch changes the s390 memory management definitions to use the pgste
> field for dirty and reference bit tracking of host and guest code. Usually on
> s390, dirty and referenced are tracked in storage keys, which belong to the
> physical page. This changes with virtualization: The guest and host
> dirty/reference bits are defined to be the logical OR of the values for the
> mapping and the physical page. This patch implements the necessary changes in
> pgtable.h for s390.
>
> There is a common code change in mm/rmap.c, the call to
> page_test_and_clear_young must be moved. This is a no-op for all
> architectures but s390. page_referenced checks the referenced bits for the
> physical page and for all mappings:
> o The physical page is checked with page_test_and_clear_young.
> o The mappings are checked with ptep_test_and_clear_young and friends.
>
> Without pgstes (the current implementation on Linux s390) the physical page
> check is implemented but the mapping callbacks are no-ops because dirty
> and referenced are not tracked in the s390 page tables. The pgstes introduce
> guest and host dirty and reference bits for s390 in the host mapping. These
> mappings must be checked before page_test_and_clear_young resets the
> reference bit.
>
> ...
>
> --- linux-host.orig/mm/rmap.c
> +++ linux-host/mm/rmap.c
> @@ -413,9 +413,6 @@ int page_referenced(struct page *page, i
>  {
>  	int referenced = 0;
>
> -	if (page_test_and_clear_young(page))
> -		referenced++;
> -
>  	if (TestClearPageReferenced(page))
>  		referenced++;
>
> @@ -433,6 +430,10 @@ int page_referenced(struct page *page, i
>  			unlock_page(page);
>  		}
>  	}
> +
> +	if (page_test_and_clear_young(page))
> +		referenced++;
> +
>  	return referenced;
>  }

ack.
Re: [kvm-devel] [Bugme-new] [Bug 10246] New: in after successful ioperm() results in SEGV after kvm use
On Fri, 14 Mar 2008 18:48:15 -0700 (PDT) [EMAIL PROTECTED] wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=10246
>
>            Summary: in after successful ioperm() results in SEGV after kvm use
>            Product: Memory Management
>            Version: 2.5
>      KernelVersion: 2.6.25-rc5
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: Other
>         AssignedTo: [EMAIL PROTECTED]
>         ReportedBy: [EMAIL PROTECTED]
>
> Latest working kernel version: N/A
> Earliest failing kernel version: 2.6.24
> Distribution: Ubuntu, but tested with mainline
> Hardware Environment: intel mobo, Intel(R) Core(TM)2 Quad CPU [EMAIL PROTECTED]
> Software Environment: kvm 62 (x86_64)
> Problem Description:
> After a successful ioperm() call, otherwise valid "in" instructions will segv
> if a kvm VM has started.
>
> Steps to reproduce:
> 1) run attached reproducer prior to starting a kvm VM, results are:
>     # ./ioperm
>     getting 0x3b4-0x3df permission...
>     fetching 0x3cc...
>     ok: 1
> 2) start a kvm VM (bug exists only after actually starting a guest VM)
> 3) run reproducer, which now fails:
>     # ./ioperm
>     getting 0x3b4-0x3df permission...
>     fetching 0x3cc...
>     Segmentation fault (core dumped)
>
> Note that it does not always fail. Running within gdb seems to reduce the
> chances that it will fail. But when it does, it is clearly the "in" that is
> failing:
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x004006e4 in inb ()
> (gdb) x/1i $pc
> 0x4006e4 <inb+12>:	in     (%dx),%al
> (gdb) info reg rdx
> rdx            0x3cc    972
>
> I have had the sense that running the CPUs at full load (niced) increases the
> chance for failure.

There is a testcase in the bugzilla report.
Re: [kvm-devel] [PATCH] kvm: provide kvm.h for all architecture: fixes headers_install
On Mon, 10 Mar 2008 14:11:04 +0100 Christian Borntraeger [EMAIL PROTECTED] wrote:

> [PATCH v2] kvm: provide kvm.h for all architecture: fixes headers_install
>
> Currently include/linux/kvm.h is not considered by make headers_install,
> because Kbuild cannot handle "unifdef-$(CONFIG_FOO) += foo.h". This problem
> was introduced by 040922c04cf2c8ac70be2e88a8a9614ecdb41d2e, which makes this
> a 2.6.25 regression.
>
> One way of solving the issue is to enhance Kbuild, but Avi Kivity and David
> Woodhouse convinced me that changing headers_install is not the way to go.
> This patch changes the definition for linux/kvm.h to unifdef-y.
>
> If unifdef-y is used for linux/kvm.h, "make headers_check" will fail on all
> architectures without asm/kvm.h. Therefore, this patch also provides
> asm/kvm.h on all architectures.
>
> Changes since v1:
> o use asm-generic/Kbuild.asm (Arnd Bergmann)
> o fix comment in asm-frv (David Howells)

err, this doesn't work. alpha and m68k (at least) fail `make headers_check':

/usr/src/devel/usr/include/linux/kvm.h requires asm/kvm.h, which does not exist in exported headers
Re: [kvm-devel] [PATCH] mmu notifiers #v7
On Fri, 29 Feb 2008 01:40:01 +0100 Andrea Arcangeli [EMAIL PROTECTED] wrote:

> > > +#define mmu_notifier(function, mm, args...)				\
> > > +	do {								\
> > > +		struct mmu_notifier *__mn;				\
> > > +		struct hlist_node *__n;					\
> > > +									\
> > > +		if (unlikely(!hlist_empty(&(mm)->mmu_notifier.head))) {	\
> > > +			rcu_read_lock();				\
> > > +			hlist_for_each_entry_rcu(__mn, __n,		\
> > > +						 &(mm)->mmu_notifier.head, \
> > > +						 hlist)			\
> > > +				if (__mn->ops->function)		\
> > > +					__mn->ops->function(__mn,	\
> > > +							    mm,		\
> > > +							    args);	\
> > > +			rcu_read_unlock();				\
> > > +		}							\
> > > +	} while (0)
> >
> > Andrew recommended local variables for parameters used multiple times. This
> > means the mm parameter here.
>
> I don't exactly see what "buggy macro" meant?

multiple references to the argument, so

	mmu_notifier(foo, bar(), zot);

will call bar() either once or twice.

Unlikely in this case, but bad practice. Easily fixable by using another
temporary.
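The macro pitfall described above is easy to reproduce outside the kernel. This is a small sketch with made-up names (struct demo_ctx, NOTIFY_UNSAFE, NOTIFY_SAFE, get_ctx); it is not the mmu_notifier macro itself, only the same shape: an argument that is referenced once in a test and once in the action.

```c
#include <assert.h>

/* Hypothetical context object standing in for the mm argument. */
struct demo_ctx {
	int nonempty;
	int notified;
};

/* References its argument twice: once in the test, once in the action. */
#define NOTIFY_UNSAFE(c)			\
	do {					\
		if ((c)->nonempty)		\
			(c)->notified++;	\
	} while (0)

/* Evaluates its argument exactly once by copying it into a temporary. */
#define NOTIFY_SAFE(c)				\
	do {					\
		struct demo_ctx *c_ = (c);	\
		if (c_->nonempty)		\
			c_->notified++;		\
	} while (0)

static int lookups;
static struct demo_ctx the_ctx = { 1, 0 };

/* An argument expression with a visible side effect. */
static struct demo_ctx *get_ctx(void)
{
	lookups++;
	return &the_ctx;
}

/* Number of times get_ctx() runs when passed to the unsafe macro. */
int lookups_unsafe(void)
{
	lookups = 0;
	NOTIFY_UNSAFE(get_ctx());
	return lookups;
}

/* Number of times get_ctx() runs when passed to the safe macro. */
int lookups_safe(void)
{
	lookups = 0;
	NOTIFY_SAFE(get_ctx());
	return lookups;
}

/* Both macros must at least agree on the visible effect. */
void macro_self_check(void)
{
	int before = the_ctx.notified;

	(void)lookups_safe();
	assert(the_ctx.notified == before + 1);
}
```

With nonempty set, the unsafe form calls get_ctx() twice and the safe form calls it once. With nonempty clear, the unsafe form would call it once, which is the "either once or twice" behaviour mentioned in the reply.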
Re: [kvm-devel] [patch 1/6] mmu_notifier: Core code
On Sat, 16 Feb 2008 10:45:50 +0200 Avi Kivity [EMAIL PROTECTED] wrote:

> Andrew Morton wrote:
> > How important is this feature to KVM?
>
> Very. kvm pins pages that are referenced by the guest;

hm. Why does it do that?

> a 64-bit guest
> will easily pin its entire memory with the kernel map.

> So this is
> critical for guest swapping to actually work.

Curious. If KVM can release guest pages at the request of this notifier so
that they can be swapped out, why can't it release them by default, and
allow swapping to proceed?

> Other nice features like page migration are also enabled by this patch.

We already have page migration. Do you mean page-migration-when-using-kvm?
Re: [kvm-devel] [PATCH] KVM swapping with MMU Notifiers V7
On Sat, 16 Feb 2008 11:48:27 +0100 Andrea Arcangeli [EMAIL PROTECTED] wrote:

> +void kvm_mmu_notifier_invalidate_range_end(struct mmu_notifier *mn,
> +					   struct mm_struct *mm,
> +					   unsigned long start, unsigned long end,
> +					   int lock)
> +{
> +	for (; start < end; start += PAGE_SIZE)
> +		kvm_mmu_notifier_invalidate_page(mn, mm, start);
> +}
> +
> +static const struct mmu_notifier_ops kvm_mmu_notifier_ops = {
> +	.invalidate_page	= kvm_mmu_notifier_invalidate_page,
> +	.age_page		= kvm_mmu_notifier_age_page,
> +	.invalidate_range_end	= kvm_mmu_notifier_invalidate_range_end,
> +};

So this doesn't implement ->invalidate_range_start().

By what means does it prevent new mappings from being established in the
range after core mm has tried to call ->invalidate_range_start()?
mmap_sem, I assume?

> +	/* set userspace_addr atomically for kvm_hva_to_rmapp */
> +	spin_lock(&kvm->mmu_lock);
> +	memslot->userspace_addr = userspace_addr;
> +	spin_unlock(&kvm->mmu_lock);

are you sure? kvm_unmap_hva() and kvm_age_hva() read ->userspace_addr a
single time and it doesn't immediately look like there's a need to take the
lock here?
Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges
On Thu, 14 Feb 2008 22:49:01 -0800 Christoph Lameter [EMAIL PROTECTED] wrote:

> The invalidation of address ranges in a mm_struct needs to be
> performed when pages are removed or permissions etc change.

hm. Do they? Why? If I'm in the process of zero-copy writing a hunk of
memory out to hardware then do I care if someone write-protects the ptes?

Spose so, but some fleshing-out of the various scenarios here would clarify
things.

> If invalidate_range_begin() is called with locks held then we
> pass a flag into invalidate_range() to indicate that no sleeping is
> possible. Locks are only held for truncate and huge pages.

This is so bad.

I suppose in the restricted couple of cases which you're focussed on it
works OK. But is it generally suitable? What if IO is in progress? What
if other cluster nodes need to be talked to? Does it suit RDMA?

> In two cases we use invalidate_range_begin/end to invalidate
> single pages because the pair allows holding off new references
> (idea by Robin Holt).

Assuming that there is a missing "within the range" in this description, I
assume that all clients will just throw up their hands in horror and will
disallow all references to all parts of the mm.

Of course, to do that they will need to take a sleeping lock to prevent
other threads from establishing new references. whoops.

> do_wp_page(): We hold off new references while we update the pte.
>
> xip_unmap: We are not taking the PageLock so we cannot
> use the invalidate_page mmu_rmap_notifier. invalidate_range_begin/end
> stands in.

What does "stands in" mean?

> Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED]
> Signed-off-by: Robin Holt [EMAIL PROTECTED]
> Signed-off-by: Christoph Lameter [EMAIL PROTECTED]
>
> ---
>  mm/filemap_xip.c |5 +
>  mm/fremap.c |3 +++
>  mm/hugetlb.c |3 +++
>  mm/memory.c | 35 +--
>  mm/mmap.c|2 ++
>  mm/mprotect.c|3 +++
>  mm/mremap.c |7 ++-
>  7 files changed, 51 insertions(+), 7 deletions(-)
>
> Index: linux-2.6/mm/fremap.c
> ===
> --- linux-2.6.orig/mm/fremap.c	2008-02-14 18:43:31.0 -0800
> +++ linux-2.6/mm/fremap.c	2008-02-14 18:45:07.0 -0800
> @@ -15,6 +15,7 @@
>  #include <linux/rmap.h>
>  #include <linux/module.h>
>  #include <linux/syscalls.h>
> +#include <linux/mmu_notifier.h>
>
>  #include <asm/mmu_context.h>
>  #include <asm/cacheflush.h>
> @@ -214,7 +215,9 @@ asmlinkage long sys_remap_file_pages(uns
>  		spin_unlock(&mapping->i_mmap_lock);
>  	}
>
> +	mmu_notifier(invalidate_range_begin, mm, start, start + size, 0);
>  	err = populate_range(mm, vma, start, size, pgoff);
> +	mmu_notifier(invalidate_range_end, mm, start, start + size, 0);

To avoid off-by-one confusion the changelogs, documentation and comments
should be very careful to tell the reader whether the range includes the
byte at start+size. I don't think that was done?
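The off-by-one question above comes down to whether the byte at start+size belongs to the range. The thread does not settle this, so the sketch below simply assumes the half-open convention [start, end), where end = start + size is the first byte outside the range. DEMO_PAGE_SIZE and the function names are illustrative and not part of the patch.

```c
#include <assert.h>

/* Illustrative page size; real kernels take PAGE_SIZE from the architecture. */
#define DEMO_PAGE_SIZE 4096UL

/*
 * Assumed convention: the range is half-open, [start, end).
 * The byte at end (that is, start + size) is NOT part of the range.
 */
int demo_range_contains(unsigned long start, unsigned long end,
			unsigned long addr)
{
	return addr >= start && addr < end;
}

/* Counts the pages a page-stepping loop visits under that convention. */
unsigned long demo_pages_in_range(unsigned long start, unsigned long end)
{
	unsigned long pages = 0;

	for (; start < end; start += DEMO_PAGE_SIZE)
		pages++;
	return pages;
}

/* Quick consistency check between the two helpers. */
void demo_range_self_check(void)
{
	unsigned long start = 0x1000, size = 2 * DEMO_PAGE_SIZE;

	assert(demo_pages_in_range(start, start + size) == 2);
	assert(!demo_range_contains(start, start + size, start + size));
}
```

Under an inclusive convention the same call with end = start + size would cover one extra byte and, for page-aligned arguments, one extra page, which is why the reply asks for the convention to be written down.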
Re: [kvm-devel] [patch 1/6] mmu_notifier: Core code
On Thu, 14 Feb 2008 22:49:00 -0800 Christoph Lameter [EMAIL PROTECTED] wrote:

> MMU notifiers are used for hardware and software that establishes
> external references to pages managed by the Linux kernel. These are
> page table entries or tlb entries or something else that allows
> hardware (such as DMA engines, scatter gather devices, networking,
> sharing of address spaces across operating system boundaries) and
> software (Virtualization solutions such as KVM, Xen etc) to
> access memory managed by the Linux kernel.
>
> The MMU notifier will notify the device driver that subscribes to such
> a notifier that the VM is going to do something with the memory
> mapped by that device. The device must then drop references for the
> indicated memory area. The references may be reestablished later.
>
> The notification scheme is much better than the current schemes of
> avoiding the danger of the VM removing pages that are externally
> mapped. We currently either mlock pages used for RDMA, XPmem etc
> in memory or increase the refcount to pin the pages. Increasing
> the refcount makes it impossible for the VM to reclaim the page.
>
> Mlock causes problems with reclaim and may lead to OOM if too many
> pages are pinned in memory. It is also incorrect in terms of what
> POSIX specifies for the role mlock should play. Mlock does *not*
> pin pages in memory. Mlock just means do not allow the page to be
> moved to swap.
>
> Linux can move pages in memory (for example through the page migration
> mechanism). These pages can be moved even if they are mlocked().
> The current approach of page pinning in use by RDMA etc is conceptually
> broken but there are currently no other easy solutions.
>
> The alternate of increasing the page count to pin pages is also not
> that enticing since there will be continual attempts to reclaim
> or migrate these pages.
>
> The solution here allows us to finally fix this issue by requiring
> such devices to subscribe to a notification chain that will allow
> them to work without pinning. The VM gains control of its memory again
> and the memory that has external references can be managed like regular
> memory.
>
> This patch: Core portion

What is the status of getting infiniband to use this facility?

How important is this feature to KVM?

To xpmem?

Which other potential clients have been identified and how important is it
to those?

> Index: linux-2.6/Documentation/mmu_notifier/README
> ===
> --- /dev/null	1970-01-01 00:00:00.0 +
> +++ linux-2.6/Documentation/mmu_notifier/README	2008-02-14 22:27:19.0 -0800
> @@ -0,0 +1,105 @@
> +Linux MMU Notifiers
> +---
> +
> +MMU notifiers are used for hardware and software that establishes
> +external references to pages managed by the Linux kernel. These are
> +page table entries or tlb entries or something else that allows
> +hardware (such as DMA engines, scatter gather devices, networking,
> +sharing of address spaces across operating system boundaries) and
> +software (Virtualization solutions such as KVM, Xen etc) to
> +access memory managed by the Linux kernel.
> +
> +The MMU notifier will notify the device driver that subscribes to such
> +a notifier that the VM is going to do something with the memory
> +mapped by that device. The device must then drop references for the
> +indicated memory area. The references may be reestablished later.
> +
> +The notification scheme is much better than the current schemes of
> +dealing with the danger of the VM removing pages.
> +We currently mlock pages used for RDMA, XPmem etc in memory or
> +increase the refcount of the pages.
> +
> +Both cause problems with reclaim and may lead to OOM if too many
> +pages are pinned in memory. Mlock is also incorrect in terms of the POSIX
> +specification of the role of mlock. Mlock does *not* pin pages in
> +memory. It just does not allow the page to be moved to swap.
> +The page refcount is used to track current users of a page struct.
> +Artificially inflating the refcount means that the VM cannot track
> +down all references to a page. It will not be able to reclaim or
> +move a page. However, the core code will try again and again because
> +the assumption is that an elevated refcount is a temporary situation.
> +
> +Linux can move pages in memory (for example through the page migration
> +mechanism). These pages can be moved even if they are mlocked().
> +So the current approach in use by RDMA etc etc is conceptually broken
> +but there are currently no other easy solutions.
> +
> +The solution here allows us to finally fix this issue by requiring
> +such devices to subscribe to a notification chain that will allow
> +them to work without pinning.
> +
> +The notifier chains provide two callback mechanisms. The
> +first one is required for any device that establishes external mappings.
> +The second (rmap) mechanism is required if a device needs to be
> +able to sleep when invalidating references. Sleeping may be
Re: [kvm-devel] [patch 3/6] mmu_notifier: invalidate_page callbacks
On Thu, 14 Feb 2008 22:49:02 -0800 Christoph Lameter [EMAIL PROTECTED] wrote:

> Two callbacks to remove individual pages as done in rmap code
>
> 	invalidate_page()
>
> Called from the inner loop of rmap walks to invalidate pages.
>
> 	age_page()
>
> Called for the determination of the page referenced status.
>
> If we do not care about page referenced status then an age_page callback
> may be omitted. PageLock and pte lock are held when either of the
> functions is called.

The age_page mystery shallows.

It would be useful to have some rationale somewhere in the patchset for the
existence of this callback.

>  #include <asm/tlbflush.h>
>
> @@ -287,7 +288,8 @@ static int page_referenced_one(struct pa
>  	if (vma->vm_flags & VM_LOCKED) {
>  		referenced++;
>  		*mapcount = 1;	/* break early from loop */
> -	} else if (ptep_clear_flush_young(vma, address, pte))
> +	} else if (ptep_clear_flush_young(vma, address, pte) |
> +		   mmu_notifier_age_page(mm, address))
>  		referenced++;

The "|" is obviously deliberate. But no explanation is provided telling us
why we still call the callback if ptep_clear_flush_young() said the page
was recently referenced. People who read your code will want to understand
this.

>  	/* Pretend the page is referenced if the task has the
> @@ -455,6 +457,7 @@ static int page_mkclean_one(struct page
>
>  	flush_cache_page(vma, address, pte_pfn(*pte));
>  	entry = ptep_clear_flush(vma, address, pte);
> +	mmu_notifier(invalidate_page, mm, address);

I just don't see how this can be done if the callee has another thread in the
middle of establishing IO against this region of memory.
->invalidate_page() _has_ to be able to block. Confused.
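The remark about "|" turns on a C language rule: the bitwise OR evaluates both operands, while the logical "||" stops as soon as the left side is true. The sketch below shows only that rule; primary_young() and secondary_young() are made-up stand-ins, not the kernel functions, and the patch author's actual reason for wanting both calls is not stated in the thread.

```c
#include <assert.h>

static int secondary_calls;

/* Stand-in for the first "was it referenced?" check. */
static int primary_young(int value)
{
	return value;
}

/* Stand-in for the second check; counts how often it actually runs. */
static int secondary_young(int value)
{
	secondary_calls++;
	return value;
}

/* Bitwise OR: both checks always run, whatever the first one returns. */
int referenced_bitwise(int primary, int secondary)
{
	return (primary_young(primary) | secondary_young(secondary)) != 0;
}

/* Logical OR: the second check is skipped when the first is true. */
int referenced_logical(int primary, int secondary)
{
	return primary_young(primary) || secondary_young(secondary);
}

/* Both forms agree on the result; they differ only in side effects. */
void or_self_check(void)
{
	for (int p = 0; p <= 1; p++)
		for (int s = 0; s <= 1; s++)
			assert(referenced_bitwise(p, s) ==
			       referenced_logical(p, s));
}
```

If the second check has a side effect that must happen every time, such as clearing a secondary accessed bit, then the bitwise form is the one that guarantees it, which is presumably why the reply asks for a comment at that spot.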
Re: [kvm-devel] [patch 5/6] mmu_notifier: Support for drivers with revers maps (f.e. for XPmem)
On Thu, 14 Feb 2008 22:49:04 -0800 Christoph Lameter [EMAIL PROTECTED] wrote:

> These special additional callbacks are required because XPmem (and likely
> other mechanisms) do use their own rmap (multiple processes on a series
> of remote Linux instances may be accessing the memory of a process).
> F.e. XPmem may have to send out notifications to remote Linux instances
> and receive confirmation before a page can be freed.
>
> So we handle this like an additional Linux reverse map that is walked after
> the existing rmaps have been walked. We leave the walking to the driver that
> is then able to use something else than a spinlock to walk its reverse
> maps. So we can actually call the driver without holding spinlocks while
> we hold the Pagelock.
>
> However, we cannot determine the mm_struct that a page belongs to at
> that point. The mm_struct can only be determined from the rmaps by the
> device driver.
>
> We add another pageflag (PageExternalRmap) that is set if a page has
> been remotely mapped (f.e. by a process from another Linux instance).
> We can then only perform the callbacks for pages that are actually in
> remote use.
>
> Rmap notifiers need an extra page bit and are only available
> on 64 bit platforms. This functionality is not available on 32 bit!
>
> A notifier that uses the reverse maps callbacks does not need to provide
> the invalidate_page() method that is called when locks are held.

hrm.

> +#define mmu_rmap_notifier(function, args...) \
> +	do { \
> +		struct mmu_rmap_notifier *__mrn; \
> +		struct hlist_node *__n; \
> +		\
> +		rcu_read_lock(); \
> +		hlist_for_each_entry_rcu(__mrn, __n, \
> +				&mmu_rmap_notifier_list, hlist) \
> +			if (__mrn->ops->function) \
> +				__mrn->ops->function(__mrn, args); \
> +		rcu_read_unlock(); \
> +	} while (0);
> +

buggy macro: use locals.

> +#define mmu_rmap_notifier(function, args...) \
> +	do { \
> +		if (0) { \
> +			struct mmu_rmap_notifier *__mrn; \
> +			\
> +			__mrn = (struct mmu_rmap_notifier *)(0x00ff); \
> +			__mrn->ops->function(__mrn, args); \
> +		} \
> +	} while (0);
> +

Same observation as in the other patch.

> ===
> --- linux-2.6.orig/mm/mmu_notifier.c	2008-02-14 21:17:51.0 -0800
> +++ linux-2.6/mm/mmu_notifier.c	2008-02-14 21:21:04.0 -0800
> @@ -74,3 +74,37 @@ void mmu_notifier_unregister(struct mmu_
>  }
>  EXPORT_SYMBOL_GPL(mmu_notifier_unregister);
>
> +#ifdef CONFIG_64BIT
> +static DEFINE_SPINLOCK(mmu_notifier_list_lock);
> +HLIST_HEAD(mmu_rmap_notifier_list);
> +
> +void mmu_rmap_notifier_register(struct mmu_rmap_notifier *mrn)
> +{
> +	spin_lock(&mmu_notifier_list_lock);
> +	hlist_add_head_rcu(&mrn->hlist, &mmu_rmap_notifier_list);
> +	spin_unlock(&mmu_notifier_list_lock);
> +}
> +EXPORT_SYMBOL(mmu_rmap_notifier_register);
> +
> +void mmu_rmap_notifier_unregister(struct mmu_rmap_notifier *mrn)
> +{
> +	spin_lock(&mmu_notifier_list_lock);
> +	hlist_del_rcu(&mrn->hlist);
> +	spin_unlock(&mmu_notifier_list_lock);
> +}
> +EXPORT_SYMBOL(mmu_rmap_notifier_unregister);
> +
> +/*
> + * Export a page.
> + *
> + * Pagelock must be held.
> + * Must be called before a page is put on an external rmap.
> + */
> +void mmu_rmap_export_page(struct page *page)
> +{
> +	BUG_ON(!PageLocked(page));
> +	SetPageExternalRmap(page);
> +}
> +EXPORT_SYMBOL(mmu_rmap_export_page);

The other patch used EXPORT_SYMBOL_GPL.
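One reading of the "use locals" remark on the macro earlier in this message is the usual hazard that a macro argument is re-evaluated wherever it is expanded, here once per list entry. The sketch below illustrates that reading in user space. `NOTIFY_RAW`, `NOTIFY_LOCAL`, `next_id()` and the two-entry callback table are hypothetical names made up for the illustration, not part of the patch.

```c
#include <assert.h>

/* Hypothetical two-entry callback table; not the kernel's hlist. */
typedef void (*cb_fn)(int value);

static int seen[2];
static void cb0(int v) { seen[0] = v; }
static void cb1(int v) { seen[1] = v; }
static cb_fn callbacks[] = { cb0, cb1 };
#define NR_CALLBACKS 2

/* Hazard: 'arg' is expanded inside the loop, so it runs per callback. */
#define NOTIFY_RAW(arg)						\
	do {							\
		for (int i_ = 0; i_ < NR_CALLBACKS; i_++)	\
			callbacks[i_](arg);			\
	} while (0)

/* Fix: evaluate the argument once into a local and reuse the local. */
#define NOTIFY_LOCAL(arg)					\
	do {							\
		int arg_ = (arg);				\
		for (int i_ = 0; i_ < NR_CALLBACKS; i_++)	\
			callbacks[i_](arg_);			\
	} while (0)

static int counter;
static int next_id(void) { return ++counter; }

void notify_demo(void)
{
	counter = 0;
	NOTIFY_RAW(next_id());		/* next_id() runs twice */
	assert(seen[0] == 1 && seen[1] == 2);

	counter = 0;
	NOTIFY_LOCAL(next_id());	/* next_id() runs once */
	assert(seen[0] == 1 && seen[1] == 1);
}
```

The sketch also leaves off the semicolon after `while (0)`. With the trailing semicolon, as in the quoted macros, `if (x) MACRO(...); else ...` no longer compiles because the expansion ends in an extra empty statement.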
Re: [kvm-devel] [patch 0/6] MMU Notifiers V6
On Fri, 08 Feb 2008 14:06:16 -0800 Christoph Lameter [EMAIL PROTECTED] wrote:

> This is a patchset implementing MMU notifier callbacks based on Andrea's
> earlier work. These are needed if Linux pages are referenced from something
> else than tracked by the rmaps of the kernel (an external MMU). MMU
> notifiers allow us to get rid of the page pinning for RDMA and various
> other purposes. It gets rid of the broken use of mlock for page pinning.
> (mlock really does *not* pin pages)
>
> More information on the rationale and the technical details can be found in
> the first patch and the README provided by that patch in
> Documentation/mmu_notifiers.
>
> The known immediate users are
>
> KVM
> - Establishes a refcount to the page via get_user_pages().
> - External references are called spte.
> - Has page tables to track pages whose refcount was elevated but
>   no reverse maps.
>
> GRU
> - Simple additional hardware TLB (possibly covering multiple instances of
>   Linux)
> - Needs TLB shootdown when the VM unmaps pages.
> - Determines page address via follow_page (from interrupt context) but can
>   fall back to get_user_pages().
> - No page reference possible since no page status is kept.
>
> XPmem
> - Allows use of a process's memory by remote instances of Linux.
> - Provides its own reverse mappings to track remote pte.
> - Established refcounts on the exported pages.
> - Must sleep in order to wait for remote acks of ptes that are being
>   cleared.

What about ib_umem_get()?
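The idea in the cover letter, subscribers that drop their external references when the VM unmaps a page instead of pinning it, can be sketched as a plain callback chain. This is a simplified user-space model under stated assumptions: the struct names, `notifier_register()` and `notify_invalidate_page()` are hypothetical and are not the patchset's API, and there is no locking or RCU here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified model of a notifier chain. */
struct notifier;

struct notifier_ops {
	void (*invalidate_page)(struct notifier *n, unsigned long address);
};

struct notifier {
	const struct notifier_ops *ops;
	struct notifier *next;
	unsigned long last_invalidated;	/* stands in for "drop external mapping" */
};

static struct notifier *chain;

static void notifier_register(struct notifier *n)
{
	n->next = chain;
	chain = n;
}

/* Called by the "VM" when it unmaps a page: every subscriber is told
 * to drop its external reference to that address. */
static void notify_invalidate_page(unsigned long address)
{
	for (struct notifier *n = chain; n != NULL; n = n->next)
		if (n->ops->invalidate_page)
			n->ops->invalidate_page(n, address);
}

static void record_invalidate(struct notifier *n, unsigned long address)
{
	n->last_invalidated = address;
}

static const struct notifier_ops demo_ops = {
	.invalidate_page = record_invalidate,
};

void chain_demo(void)
{
	struct notifier kvm_like = { .ops = &demo_ops };
	struct notifier gru_like = { .ops = &demo_ops };

	chain = NULL;
	notifier_register(&kvm_like);
	notifier_register(&gru_like);
	notify_invalidate_page(0x1000UL);

	assert(kvm_like.last_invalidated == 0x1000UL);
	assert(gru_like.last_invalidated == 0x1000UL);
	chain = NULL;	/* the notifiers above live on this stack frame */
}
```

A subscriber that has to sleep while invalidating, as described for XPmem, would not fit this model as written, since the walk here assumes callbacks return promptly.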
Re: [kvm-devel] [patch 0/6] MMU Notifiers V6
On Fri, 8 Feb 2008 17:43:02 -0600 Robin Holt [EMAIL PROTECTED] wrote:

> On Fri, Feb 08, 2008 at 03:41:24PM -0800, Christoph Lameter wrote:
> > On Fri, 8 Feb 2008, Robin Holt wrote:
> >
> > > > What about ib_umem_get()?
> > >
> > > Correct.
> >
> > You missed the turn of the conversation to how ib_umem_get() works.
> > Currently it seems to pin the same way that the SLES10 XPmem works.
>
> Ah. I took Andrew's question as more of a probe about whether we had worked
> with the IB folks to ensure this fits the ib_umem_get needs as well.

You took it correctly, and I didn't understand the answer ;)
Re: [kvm-devel] [patch 0/6] MMU Notifiers V6
On Fri, 8 Feb 2008 16:05:00 -0800 (PST) Christoph Lameter [EMAIL PROTECTED] wrote:

> On Fri, 8 Feb 2008, Andrew Morton wrote:
>
> > You took it correctly, and I didn't understand the answer ;)
>
> We have done several rounds of discussion on linux-kernel about this so far
> and the IB folks have not shown up to join in. I have tried to make this as
> general as possible.

infiniband would appear to be the major present in-kernel client of this new
interface. So as a part of proving its usefulness, correctness, etc we
should surely work on converting infiniband to use it, and prove its
goodness.

Quite possibly none of the infiniband developers even know about it.
Re: [kvm-devel] 2.6.24-rc8-mm1 (KVM build issues)
On Fri, 18 Jan 2008 22:56:32 +0530 Balbir Singh [EMAIL PROTECTED] wrote:

> * Andrew Morton [EMAIL PROTECTED] [2008-01-17 02:35:14]:
>
> > - kvm probably doesn't work properly because I couldn't be bothered fixing
> >   the conflicts between git-kvm and the driver tree
>
> Hi, Andrew,
>
> The following changes got KVM up and running for me
>
> This patch fixes the kvm build on 2.6.24-rc8-mm1. First of all, it enables
> the KVM build, the second fix moves set_kset_name to the .name member.
>
> Signed-off-by: Balbir Singh [EMAIL PROTECTED]
> ---
>
>  arch/x86/Makefile   |    2 +-
>  virt/kvm/kvm_main.c |    2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff -puN arch/x86/Makefile~fix-kvm-build arch/x86/Makefile
> --- linux-2.6.24-rc8/arch/x86/Makefile~fix-kvm-build	2008-01-18 22:42:41.0 +0530
> +++ linux-2.6.24-rc8-balbir/arch/x86/Makefile	2008-01-18 22:42:47.0 +0530
> @@ -185,7 +185,7 @@ core-y += arch/x86/vdso/
>  core-$(CONFIG_IA32_EMULATION) += arch/x86/ia32/
>
>  # kvm host support - uncomment when merging
> -# core-$(CONFIG_KVM) += arch/x86/kvm/
> +core-$(CONFIG_KVM) += arch/x86/kvm/
>
>  # drivers-y are linked after core-y
>  drivers-$(CONFIG_MATH_EMULATION) += arch/x86/math-emu/
> diff -puN virt/kvm/kvm_main.c~fix-kvm-build virt/kvm/kvm_main.c
> --- linux-2.6.24-rc8/virt/kvm/kvm_main.c~fix-kvm-build	2008-01-18 22:42:41.0 +0530
> +++ linux-2.6.24-rc8-balbir/virt/kvm/kvm_main.c	2008-01-18 22:42:47.0 +0530
> @@ -1260,7 +1260,7 @@ static int kvm_resume(struct sys_device
>  }
>
>  static struct sysdev_class kvm_sysdev_class = {
> -	set_kset_name("kvm"),
> +	.name = "kvm",
>  	.suspend = kvm_suspend,
>  	.resume = kvm_resume,
>  };

This patch straddles such a pickle of other patches (driver tree, kvm,
git-x86) that there doesn't seem much point in me untangling it.

Presumably people will fix things up as various trees merge into 2.6.25-rc1.
As long as Greg remembers to try to build kvm ;)
Re: [kvm-devel] [patch 3/5] KVM: add kvm_follow_page()
On Sun, 23 Dec 2007 12:35:30 +0200 Avi Kivity [EMAIL PROTECTED] wrote:

> Andrew Morton wrote:
> > On Sun, 23 Dec 2007 10:59:22 +0200 Avi Kivity [EMAIL PROTECTED] wrote:
> >
> > > Avi Kivity wrote:
> > > > Avi Kivity wrote:
> > > > > Exactly. But it is better to be explicit about it and pass the page
> > > > > directly like you did before. I hate to make you go back-and-forth,
> > > > > but I did not understand the issue completely before.
> > > >
> > > > btw, the call to gfn_to_page() can happen in page_fault() instead of
> > > > walk_addr(); that will reduce the amount of error handling, and will
> > > > simplify the callers to walk_addr() that don't need the page.
> > >
> > > Note further that all this doesn't obviate the need for follow_page()
> > > (or get_user_pages_inatomic()); we still need something in update_pte()
> > > for the demand paging case.
> >
> > Please review -mm's mm/pagewalk.c for suitability.
> >
> > If it is unsuitable but repairable then please cc Matt Mackall
> > [EMAIL PROTECTED] on the review.
>
> The "no locks are taken" comment is very worrying. We need accurate
> results.

take down_read(&mm->mmap_sem) before calling it..

You have to do that anyway for its results to be meaningful in the caller.
Ditto get_user_pages().

> Getting pte_t's in the callbacks is a little too low level for kvm's use
> (which wants struct page pointers) but of course that's easily handled in a
> kvm wrapper.
>
> I'd prefer an atomic version of get_user_pages(), but if pagewalk is fixed
> to take the necessary locks, it will do.

It isn't exported to modules at present, although I see no problem in
changing that.
Re: [kvm-devel] [patch 3/5] KVM: add kvm_follow_page()
On Sun, 23 Dec 2007 15:15:25 -0500 Marcelo Tosatti [EMAIL PROTECTED] wrote:

> Are you guys OK with this ? Modular KVM needs walk_page_range(), and also
> vm_normal_page() to be used on pagewalk callback.

I am.
Re: [kvm-devel] 2.6.23-rc8-mm1: drivers/kvm/ioapic.o build failure
On Wed, 26 Sep 2007 11:00:09 +0200 Avi Kivity [EMAIL PROTECTED] wrote:

> Mariusz Kozlowski wrote:
> > Hello,
> >
> > Similar (the same?) as in 2.6.23-rc6-mm1?
> > http://www.mail-archive.com/linux-kernel%40vger.kernel.org/msg208812.html
> >
> >   CC [M]  drivers/kvm/ioapic.o
> > drivers/kvm/ioapic.c: In function 'ioapic_deliver':
> > drivers/kvm/ioapic.c:208: error: 'dest_LowestPrio' undeclared (first use in this function)
> > drivers/kvm/ioapic.c:208: error: (Each undeclared identifier is reported only once
> > drivers/kvm/ioapic.c:208: error: for each function it appears in.)
> > drivers/kvm/ioapic.c:219: error: 'dest_Fixed' undeclared (first use in this function)
> > make[2]: *** [drivers/kvm/ioapic.o] Error 1
> > make[1]: *** [drivers/kvm] Error 2
> > make: *** [drivers] Error 2
>
> We now include <asm/io_apic.h> like we should. Has that file changed in
> -mm?

CONFIG_X86_IO_APIC isn't set.
[kvm-devel] kvm warning
ia64 allmodconfig says

drivers/kvm/Kconfig:14:warning: 'select' used by config symbol 'KVM' refers to undefined symbol 'PREEMPT_NOTIFIERS'

Because of

commit 8928fb48c7a7f9053a55f1d0023cbc533f2b3663
Author: Avi Kivity [EMAIL PROTECTED]
Date:   Wed Jul 11 18:17:21 2007 +0300

    KVM: Use the scheduler preemption notifiers to make kvm preemptible

    Current kvm disables preemption while the new virtualization registers are
    in use. This of course is not very good for latency sensitive workloads (one
    use of virtualization is to offload user interface and other latency
    insensitive stuff to a container, so that it is easier to analyze the
    remaining workload). This patch re-enables preemption for kvm; preemption
    is now only disabled when switching the registers in and out, and during
    the switch to guest mode and back.

    Contains fixes from Shaohua Li [EMAIL PROTECTED].

    Signed-off-by: Avi Kivity [EMAIL PROTECTED]

--- a/drivers/kvm/Kconfig
+++ b/drivers/kvm/Kconfig
@@ -11,6 +11,7 @@ if VIRTUALIZATION
 config KVM
 	tristate "Kernel-based Virtual Machine (KVM) support"
 	depends on X86 && EXPERIMENTAL
+	select PREEMPT_NOTIFIERS
 	select ANON_INODES
 	---help---
 	  Support hosting fully virtualized guest machines using hardware
...

a) is kvm supported on ia64 at all??

b) `select' is evil. Just Don't Do It.

c) `select' is especially evil when it's done on some kernel-internal
   secret symbol like PREEMPT_NOTIFIERS.

d) I can't see anything else in the kernel which sets or clears
   PREEMPT_NOTIFIERS so I'm rather wondering why the config option exists
   at all.

e) sched developers may not like KVM reaching over and twiddling their
   knobs for them.

It all needs more thought, I think...
Re: [kvm-devel] kvm warning
On Thu, 09 Aug 2007 01:48:07 +0300 Avi Kivity [EMAIL PROTECTED] wrote:

> Ingo Molnar wrote:
> > * Andrew Morton [EMAIL PROTECTED] wrote:
> >
> > > ia64 allmodconfig says
> > >
> > > drivers/kvm/Kconfig:14:warning: 'select' used by config symbol 'KVM' refers to undefined symbol 'PREEMPT_NOTIFIERS'
> >
> > hm, why doesn't ia64 pick up kernel/Kconfig.preempt, like all the other
> > arches? Due to that ia64 also misses out on voluntary preempt and on
> > preempt-bkl.
>
> Even more hm, how does ia64 manage to enable kvm? It 'depends on X86' at
> this moment.

beats me. CONFIG_KVM doesn't get set. But it seems that kconfig wants to do
error-checking on that item anyway.

btw, testing of Kconfig can be done for any architecture without
installation of a toolchain for that architecture. Set $ARCH and run
mrproper then use menuconfig/oldconfig/allmodconfig/allconfig as usual.

Judging by the number of Kconfig problems I see, this is a big secret ;)
Re: [kvm-devel] 2.6.22-rc4-mm2: kvm compile breakage with X86_CMPXCHG64=n
On Mon, 11 Jun 2007 23:22:24 -0400 Dave Jones [EMAIL PROTECTED] wrote:

> Add -Werror-implicit-function-declaration
> This makes builds fail sooner if something is implicitly defined instead
> of having to wait half an hour for it to fail at the linking stage.
>
> Signed-off-by: Dave Jones [EMAIL PROTECTED]
>
> --- linux-2.6/Makefile~	2007-06-04 16:46:24.0 -0400
> +++ linux-2.6/Makefile	2007-06-04 16:46:53.0 -0400
> @@ -313,7 +313,8 @@ LINUXINCLUDE:= -Iinclude \
>  CPPFLAGS:= -D__KERNEL__ $(LINUXINCLUDE)
>
>  CFLAGS := -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs \
> -	-fno-strict-aliasing -fno-common
> +	-fno-strict-aliasing -fno-common \
> +	-Werror-implicit-function-declaration
>  AFLAGS := -D__ASSEMBLY__
>
>  # Read KERNELRELEASE from include/config/kernel.release (if it exists)

This causes the i386 allmodconfig build to fail:

include/linux/uaccess.h: In function 'pagefault_disable':
include/linux/uaccess.h:23: error: implicit declaration of function '__memory_barrier'

I didn't look to see why...
Re: [kvm-devel] 2.6.22-rc4-mm2: kvm compile breakage with X86_CMPXCHG64=n
On Tue, 12 Jun 2007 18:16:29 -0400 Dave Jones [EMAIL PROTECTED] wrote:

> > >  # Read KERNELRELEASE from include/config/kernel.release (if it exists)
> >
> > This causes the i386 allmodconfig build to fail:
>
> Seems to be doing its job rather effectively.

err, hang on. I had a different patch in there which hilariously broke the
build all over the place, and dropping that has made your patch come good.

I'll put it back.
Re: [kvm-devel] [PATCH 0/6] KVM userspace interface updates for 2.6.21
On Sun, 25 Feb 2007 11:58:23 +0200 Avi Kivity [EMAIL PROTECTED] wrote:

> Avi Kivity wrote:
> > The patchset, along with the previous fixset, is available as a git
> > tree from git://kvm.qumranet.com/home/avi/kvm/linux-2.6. You may wish to
> > plant it in your little git forest.
>
> This is now git://kvm.qumranet.com/home/avi/kvm.git, as a bare 'git
> pull' will pull the current branch instead of master, giving you
> whatever I was working on at the moment. The kvm.git repo will always
> have 'master' as the current branch.

OK.

 drivers/kvm/kvm.h         |   13 +
 drivers/kvm/kvm_main.c    |  774 -
 drivers/kvm/kvm_svm.h     |    3
 drivers/kvm/mmu.c         |   36 +-
 drivers/kvm/paging_tmpl.h |   18 +
 drivers/kvm/svm.c         |   42 ++
 drivers/kvm/vmx.c         |   33 ++
 include/linux/kvm.h       |   50 ++-
 include/linux/kvm_para.h  |   73

However things might get messy later on if that tree starts introducing
changes outside drivers/kvm. There's a ton of activity in x86-world. But if
there are problems, you'll hear about it ;)
Re: [kvm-devel] [PATCH 4/5] KVM: cpu hotplug support
On Tue, 30 Jan 2007 14:56:16 - Avi Kivity [EMAIL PROTECTED] wrote:

> +static void decache_vcpus_on_cpu(int cpu)
> +{
> +	struct kvm *vm;
> +	struct kvm_vcpu *vcpu;
> +	int i;
> +
> +	spin_lock(&kvm_lock);
> +	list_for_each_entry(vm, &vm_list, vm_list)
> +		for (i = 0; i < KVM_MAX_VCPUS; ++i) {
> +			vcpu = &vm->vcpus[i];
> +			/*
> +			 * If the vcpu is locked, then it is running on some
> +			 * other cpu and therefore it is not cached on the
> +			 * cpu in question.
> +			 *
> +			 * If it's not locked, check the last cpu it executed
> +			 * on.
> +			 */
> +			if (mutex_trylock(&vcpu->mutex)) {
> +				if (vcpu->cpu == cpu) {
> +					kvm_arch_ops->vcpu_decache(vcpu);
> +					vcpu->cpu = -1;
> +				}
> +				mutex_unlock(&vcpu->mutex);
> +			}
> +		}
> +	spin_unlock(&kvm_lock);
> +}

The trylock is unpleasing. Perhaps kvm_lock should be a mutex or something?
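The skip-if-locked pattern in the quoted function can be tried in user space. The following is a minimal sketch, assuming a C11 `atomic_flag` as a stand-in for vcpu->mutex so that no threading library is needed. `struct vcpu`, `vcpu_trylock()` and `decache_on_cpu()` are hypothetical names, and the outer kvm_lock is left out entirely.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_VCPUS 4	/* illustrative; not KVM_MAX_VCPUS */

/* Hypothetical stand-in for struct kvm_vcpu; 'busy' plays the role of
 * vcpu->mutex and is set while the vcpu runs on some CPU. */
struct vcpu {
	atomic_flag busy;
	int cpu;	/* last CPU this vcpu ran on, -1 if none */
};

static bool vcpu_trylock(struct vcpu *v)
{
	/* test_and_set returns the previous value: false means we got it. */
	return !atomic_flag_test_and_set(&v->busy);
}

static void vcpu_unlock(struct vcpu *v)
{
	atomic_flag_clear(&v->busy);
}

/* Mirrors the quoted loop: skip vcpus that are locked (running
 * elsewhere), otherwise forget cached state for 'cpu'.
 * Returns how many vcpus were decached. */
int decache_on_cpu(struct vcpu *vcpus, int n, int cpu)
{
	int decached = 0;

	for (int i = 0; i < n; i++) {
		struct vcpu *v = &vcpus[i];

		if (!vcpu_trylock(v))
			continue;
		if (v->cpu == cpu) {
			v->cpu = -1;
			decached++;
		}
		vcpu_unlock(v);
	}
	return decached;
}

void trylock_demo(void)
{
	/* vcpus 1 and 3 last ran on CPU 1 */
	struct vcpu vcpus[NR_VCPUS] = {
		{ ATOMIC_FLAG_INIT, 0 }, { ATOMIC_FLAG_INIT, 1 },
		{ ATOMIC_FLAG_INIT, 0 }, { ATOMIC_FLAG_INIT, 1 },
	};

	/* Hold vcpu 3 to model "currently running on another CPU". */
	assert(vcpu_trylock(&vcpus[3]));

	assert(decache_on_cpu(vcpus, NR_VCPUS, 1) == 1);
	assert(vcpus[1].cpu == -1);
	assert(vcpus[3].cpu == 1);	/* skipped: it was locked */

	vcpu_unlock(&vcpus[3]);
}
```

The sketch shows why the result depends on who holds which lock at that instant, which may be part of what makes the trylock "unpleasing". If the outer lock were one that permits sleeping, a blocking lock on each vcpu would be possible instead.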
Re: [kvm-devel] [PATCH 0/33] KVM: MMU: Cache shadow page tables
On Thu, 04 Jan 2007 17:48:45 +0200 Avi Kivity [EMAIL PROTECTED] wrote:

> The current kvm shadow page table implementation does not cache shadow
> page tables (except for global translations, used for kernel addresses)
> across context switches. This means that after a context switch, every
> memory access will trap into the host. After a while, the shadow page
> tables will be rebuilt, and the guest can proceed at native speed until the
> next context switch.
>
> The natural solution, then, is to cache shadow page tables across context
> switches. Unfortunately, this introduces a bucketload of problems:
>
> - the guest does not notify the processor (and hence kvm) that it modifies
>   a page table entry if it has reason to believe that the modification
>   will be followed by a tlb flush. It becomes necessary to write-protect
>   guest page tables so that we can use the page fault when the access
>   occurs as a notification.
> - write protecting the guest page tables means we need to keep track of
>   which ptes map those guest page tables. We need to add reverse mapping
>   for all mapped writable guest pages.
> - when the guest does access the write-protected page, we need to allow it
>   to perform the write in some way. We do that either by emulating the
>   write, or removing all shadow page tables for that page and allowing the
>   write to proceed, depending on circumstances.
>
> This patchset implements the ideas above. While a lot of tuning remains to
> be done (for example, a sane page replacement algorithm), a guest running
> with this patchset applied is much faster and more responsive than with
> 2.6.20-rc3. Some preliminary benchmarks are available in
> http://article.gmane.org/gmane.comp.emulators.kvm.devel/661.
>
> The patchset is bisectable compile-wise.

Is this intended for 2.6.20, or would you prefer that we release what we
have now and hold this off for 2.6.21?
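The second bullet in the quoted description, a reverse map from guest pages to the shadow ptes that map them writable, can be pictured with a toy model. The sketch below is an illustration under stated assumptions only: the fixed-size arrays, `map_writable()` and `write_protect_gfn()` are hypothetical and do not reflect the data structures the patchset actually uses.

```c
#include <assert.h>
#include <string.h>

#define NR_GFNS   8	/* illustrative sizes */
#define NR_SPTES  16
#define MAX_RMAP  4

#define SPTE_PRESENT  0x1u
#define SPTE_WRITABLE 0x2u

/* Hypothetical toy model, not KVM's data structures. */
static unsigned int sptes[NR_SPTES];	/* shadow pte flags */
static int rmap[NR_GFNS][MAX_RMAP];	/* gfn -> sptes mapping it writable */
static int rmap_count[NR_GFNS];

/* Install a writable shadow mapping and record it in the reverse map. */
int map_writable(int spte_idx, int gfn)
{
	if (rmap_count[gfn] == MAX_RMAP)
		return -1;
	sptes[spte_idx] = SPTE_PRESENT | SPTE_WRITABLE;
	rmap[gfn][rmap_count[gfn]++] = spte_idx;
	return 0;
}

/* The guest started using 'gfn' as a page table: remove write access
 * from every shadow pte that maps it, so the next guest write faults
 * and can serve as the notification. Returns the number of sptes changed. */
int write_protect_gfn(int gfn)
{
	int changed = rmap_count[gfn];

	for (int i = 0; i < rmap_count[gfn]; i++)
		sptes[rmap[gfn][i]] &= ~SPTE_WRITABLE;
	rmap_count[gfn] = 0;	/* no writable mappings remain */
	return changed;
}

void rmap_demo(void)
{
	memset(sptes, 0, sizeof(sptes));
	memset(rmap_count, 0, sizeof(rmap_count));

	assert(map_writable(3, 5) == 0);
	assert(map_writable(9, 5) == 0);
	assert(map_writable(4, 2) == 0);

	assert(write_protect_gfn(5) == 2);
	assert(sptes[3] == SPTE_PRESENT);
	assert(sptes[9] == SPTE_PRESENT);
	assert(sptes[4] == (SPTE_PRESENT | SPTE_WRITABLE));	/* other gfn untouched */
}
```

Without the reverse map, write-protecting a guest page would mean scanning every shadow pte, which is the cost the bullet is pointing at.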
Re: [kvm-devel] [PATCH 0/14] KVM: Kernel-based Virtual Machine (v4)
On Sun, 05 Nov 2006 22:27:45 +0200 Avi Kivity [EMAIL PROTECTED] wrote:

> The following patchset adds a driver for Intel's hardware virtualization
> extensions to the x86 architecture.

kapow.

{standard input}: Assembler messages:
{standard input}:157: Error: no such instruction: `vmxon -20(%ebp)'
{standard input}:176: Error: no such instruction: `vmxoff'
{standard input}:191: Error: no such instruction: `vmread %eax,%eax'
{standard input}:403: Error: no such instruction: `vmwrite %edx,%eax'
{standard input}:409: Error: no such instruction: `vmread %eax,12(%esp)'
{standard input}:568: Error: no such instruction: `vmread %edx,%edx'
{standard input}:596: Error: no such instruction: `vmclear -12(%ebp)'
{standard input}:1885: Error: no such instruction: `vmread %eax,4(%esp)'
{standard input}:1908: Error: no such instruction: `vmread %edx,%edx'
{standard input}:1912: Error: no such instruction: `vmread %eax,%eax'
{standard input}:1919: Error: no such instruction: `vmread %eax,%edx'
{standard input}:1948: Error: no such instruction: `vmread %eax,%eax'
{standard input}:2148: Error: no such instruction: `vmread %eax,%eax'
{standard input}:2230: Error: no such instruction: `vmread %eax,%eax'
{standard input}:2249: Error: no such instruction: `vmread %edx,%edx'
{standard input}:2253: Error: no such instruction: `vmread %eax,%eax'
{standard input}:2259: Error: no such instruction: `vmread %edx,%edx'
{standard input}:2263: Error: no such instruction: `vmread %eax,%eax'
{standard input}:2334: Error: no such instruction: `vmread %eax,%eax'
{standard input}:2358: Error: no such instruction: `vmread %edx,%edx'
{standard input}:2362: Error: no such instruction: `vmread %eax,%eax'
{standard input}:2368: Error: no such instruction: `vmread %edx,%edx'
{standard input}:2372: Error: no such instruction: `vmread %eax,%eax'
{standard input}:2425: Error: no such instruction: `vmread %edx,%edx'

etcetera. That's gas 2.16.1. I assume it needs some super-new binutils. I'm
not sure what to do about this. What's the minimum version?
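One common workaround when an assembler predates an instruction is to emit the opcode bytes directly with `.byte` rather than raise the minimum binutils version. Whether this patchset goes that way is not stated in this message. The sketch below shows the technique for `vmxoff` only, which encodes as 0F 01 C4. The macro name `ASM_VMXOFF` and the helper names are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical macro name. VMXOFF encodes as 0F 01 C4; spelling it as
 * raw bytes lets an assembler that lacks the mnemonic still emit it. */
#define ASM_VMXOFF ".byte 0x0f, 0x01, 0xc4"

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
/* Only meaningful in ring 0 with VMX enabled. Executed from user space
 * it faults, so this sketch never calls it. */
static inline void vmxoff_raw(void)
{
	__asm__ __volatile__(ASM_VMXOFF : : : "cc", "memory");
}
#endif

/* The same three bytes as data, so the encoding can be checked. */
static const unsigned char vmxoff_bytes[] = { 0x0f, 0x01, 0xc4 };

void encoding_demo(void)
{
	assert(sizeof(vmxoff_bytes) == 3);
	assert(vmxoff_bytes[0] == 0x0f);
	assert(vmxoff_bytes[1] == 0x01);
	assert(vmxoff_bytes[2] == 0xc4);
	assert(strcmp(ASM_VMXOFF, ".byte 0x0f, 0x01, 0xc4") == 0);
}
```

Instructions that take operands, such as `vmread %eax,%eax`, also need their ModRM byte spelled out, which ties the inline asm to fixed registers. That is the main cost of this approach.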