Re: [kvm-devel] [PATCH 01 of 11] mmu-notifier-core

2008-05-07 Thread Andrew Morton
On Wed, 07 May 2008 16:35:51 +0200
Andrea Arcangeli [EMAIL PROTECTED] wrote:

 # HG changeset patch
 # User Andrea Arcangeli [EMAIL PROTECTED]
 # Date 1210096013 -7200
 # Node ID e20917dcc8284b6a07cfcced13dda4cbca850a9c
 # Parent  5026689a3bc323a26d33ad882c34c4c9c9a3ecd8
 mmu-notifier-core
 
 ...

 --- a/include/linux/list.h
 +++ b/include/linux/list.h
 @@ -747,7 +747,7 @@ static inline void hlist_del(struct hlis
   * or hlist_del_rcu(), running on this same list.
   * However, it is perfectly legal to run concurrently with
   * the _rcu list-traversal primitives, such as
 - * hlist_for_each_entry().
 + * hlist_for_each_entry_rcu().
   */
  static inline void hlist_del_rcu(struct hlist_node *n)
  {
 @@ -760,6 +760,34 @@ static inline void hlist_del_init(struct
   if (!hlist_unhashed(n)) {
   __hlist_del(n);
   INIT_HLIST_NODE(n);
 + }
 +}
 +
 +/**
 + * hlist_del_init_rcu - deletes entry from hash list with re-initialization
 + * @n: the element to delete from the hash list.
 + *
 + * Note: list_unhashed() on entry does return true after this. It is

Should that be "does" or "does not"?  "does", I suppose.

It should refer to hlist_unhashed().

The term "on entry" is a bit ambiguous - we normally use that as shorthand
to mean "on entry to the function".

So I'll change this to

 + * Note: hlist_unhashed() on the node returns true after this. It is

OK?

oh, that was copied-and-pasted from similarly errant comments in that file
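
To make the corrected comment concrete, here is a minimal userspace sketch of why hlist_unhashed() returns true after a del_init-style removal. It is modelled on the non-RCU helpers in include/linux/list.h, cut down to what the example needs. It is not the patch's hlist_del_init_rcu(): there is no RCU or memory-barrier handling, and unhashed_before_and_after() is a hypothetical helper added only for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct hlist_node {
	struct hlist_node *next, **pprev;
};

struct hlist_head {
	struct hlist_node *first;
};

/* A node is "unhashed" when it is not linked into any list. */
static int hlist_unhashed(const struct hlist_node *n)
{
	return !n->pprev;
}

/* Link the node at the front of the list. */
static void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
{
	struct hlist_node *first = h->first;

	n->next = first;
	if (first)
		first->pprev = &n->next;
	h->first = n;
	n->pprev = &h->first;
}

/* Unlink the node, then reset it so hlist_unhashed() reports true. */
static void hlist_del_init(struct hlist_node *n)
{
	if (hlist_unhashed(n))
		return;
	*n->pprev = n->next;
	if (n->next)
		n->next->pprev = n->pprev;
	n->next = NULL;
	n->pprev = NULL;
}

/*
 * Hypothetical helper: samples hlist_unhashed() before removal (bit 0)
 * and after removal (bit 1).
 */
static int unhashed_before_and_after(void)
{
	struct hlist_head head = { NULL };
	struct hlist_node node = { NULL, NULL };
	int before, after;

	hlist_add_head(&node, &head);
	before = hlist_unhashed(&node);
	hlist_del_init(&node);
	after = hlist_unhashed(&node);
	return before | (after << 1);
}
```

The helper reports "linked" before the removal and "unhashed" after it, which is the behaviour the reworded comment describes.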

 --- a/include/linux/mm_types.h
 +++ b/include/linux/mm_types.h
 @@ -10,6 +10,7 @@
  #include <linux/rbtree.h>
  #include <linux/rwsem.h>
  #include <linux/completion.h>
 +#include <linux/cpumask.h>

OK, unrelated bugfix ;)

 --- a/include/linux/srcu.h
 +++ b/include/linux/srcu.h
 @@ -27,6 +27,8 @@
  #ifndef _LINUX_SRCU_H
  #define _LINUX_SRCU_H
  
 +#include <linux/mutex.h>

And another.  Fair enough.


-
This SF.net email is sponsored by the 2008 JavaOne(SM) Conference 
Don't miss this year's exciting event. There's still time to save $100. 
Use priority code J8TL2D2. 
http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [PATCH 01 of 11] mmu-notifier-core

2008-05-07 Thread Andrew Morton
On Wed, 07 May 2008 16:35:51 +0200
Andrea Arcangeli [EMAIL PROTECTED] wrote:

 # HG changeset patch
 # User Andrea Arcangeli [EMAIL PROTECTED]
 # Date 1210096013 -7200
 # Node ID e20917dcc8284b6a07cfcced13dda4cbca850a9c
 # Parent  5026689a3bc323a26d33ad882c34c4c9c9a3ecd8
 mmu-notifier-core

The patch looks OK to me.

The proposal is that we sneak this into 2.6.26.  Are there any
sufficiently-serious objections to this?

The patch will be a no-op for 2.6.26.

This is all rather unusual.  For the record, could we please review the
reasons for wanting to do this?

Thanks.



Re: [kvm-devel] [PATCH 08 of 11] anon-vma-rwsem

2008-05-07 Thread Andrew Morton
On Thu, 8 May 2008 00:22:05 +0200
Andrea Arcangeli [EMAIL PROTECTED] wrote:

  No, the simple solution is to just make up a whole new upper-level lock, 
  and get that lock *first*. You can then take all the multiple locks at a 
  lower level in any order you damn well please. 
 
 Unfortunately the lock you're talking about would be:
 
 static spinlock_t global_lock = ...
 
 There's no way to make it more granular.
 
 So every time before taking any ->i_mmap_lock _and_ any anon_vma->lock
 we'd need to take that extremely wide spinlock first (and even worse,
 later it would become a rwsem when XPMEM is selected making the VM
 even slower than it already becomes when XPMEM support is selected at
 compile time).

Nope.  We only need to take the global lock before taking *two or more* of
the per-vma locks.

I really wish I'd thought of that.



Re: [kvm-devel] [PATCH 08 of 11] anon-vma-rwsem

2008-05-07 Thread Andrew Morton
On Thu, 8 May 2008 00:44:06 +0200
Andrea Arcangeli [EMAIL PROTECTED] wrote:

 On Wed, May 07, 2008 at 03:31:03PM -0700, Andrew Morton wrote:
  Nope.  We only need to take the global lock before taking *two or more* of
  the per-vma locks.
  
  I really wish I'd thought of that.
 
 I don't see how you can avoid taking the system-wide-global lock
 before every single anon_vma->lock/i_mmap_lock out there without
 mm_lock.
 
 Please note, we can't allow a thread to be in the middle of
 zap_page_range while mmu_notifier_register runs.
 
 vmtruncate takes 1 single lock, the i_mmap_lock of the inode. Not more
 than one lock and we've to still take the global-system-wide lock
 _before_ this single i_mmap_lock and no other lock at all.
 
 Please elaborate, thanks!


umm...


    CPU0:                   CPU1:

    spin_lock(a->lock);     spin_lock(b->lock);
    spin_lock(b->lock);     spin_lock(a->lock);

bad.

    CPU0:                   CPU1:

    spin_lock(global_lock)  spin_lock(global_lock);
    spin_lock(a->lock);     spin_lock(b->lock);
    spin_lock(b->lock);     spin_lock(a->lock);

Is OK.


    CPU0:                   CPU1:

    spin_lock(global_lock)
    spin_lock(a->lock);     spin_lock(b->lock);
    spin_lock(b->lock);     spin_unlock(b->lock);
                            spin_lock(a->lock);
                            spin_unlock(a->lock);

also OK.

As long as all code paths which can take two-or-more locks are all covered
by the global lock there is no deadlock scenario.  If a thread takes just a
single instance of one of these locks without taking the global_lock then
there is also no deadlock.
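
The rule in the paragraph above can be written down as a toy checker. This is only a sketch under stated assumptions: a code path is modelled as the list of locks it holds at the same time, in acquisition order, and only the two-path ABBA case is detected (longer cycles are ignored). The lock ids, struct lock_path and may_deadlock() are hypothetical names made up for the illustration; none of this is kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { LOCK_GLOBAL, LOCK_A, LOCK_B };	/* hypothetical lock ids */

/* Locks one code path holds simultaneously, in acquisition order. */
struct lock_path {
	const int *locks;
	size_t n;
};

/* Index of a lock within a path, or -1 if the path never takes it. */
static int lock_pos(const struct lock_path *p, int lock)
{
	for (size_t i = 0; i < p->n; i++)
		if (p->locks[i] == lock)
			return (int)i;
	return -1;
}

/* True if some common lock is taken before p_before in p and q_before in q. */
static bool gated(const struct lock_path *p, size_t p_before,
		  const struct lock_path *q, size_t q_before)
{
	for (size_t i = 0; i < p_before; i++) {
		int j = lock_pos(q, p->locks[i]);

		if (j >= 0 && (size_t)j < q_before)
			return true;
	}
	return false;
}

/* ABBA check: p takes x then y, q takes y then x, with no common outer lock. */
static bool may_deadlock(const struct lock_path *p, const struct lock_path *q)
{
	for (size_t i = 0; i < p->n; i++) {
		for (size_t k = i + 1; k < p->n; k++) {
			int qx = lock_pos(q, p->locks[i]);
			int qy = lock_pos(q, p->locks[k]);

			if (qx < 0 || qy < 0 || qy > qx)
				continue;	/* q lacks one lock, or same order */
			if (!gated(p, i, q, (size_t)qy))
				return true;	/* inverted order, nothing guards it */
		}
	}
	return false;
}

/* The scenarios from the message above. */
static const int ab[] = { LOCK_A, LOCK_B };
static const int ba[] = { LOCK_B, LOCK_A };
static const int g_ab[] = { LOCK_GLOBAL, LOCK_A, LOCK_B };
static const int g_ba[] = { LOCK_GLOBAL, LOCK_B, LOCK_A };
static const int only_b[] = { LOCK_B };

static const struct lock_path path_ab = { ab, 2 };
static const struct lock_path path_ba = { ba, 2 };
static const struct lock_path path_g_ab = { g_ab, 3 };
static const struct lock_path path_g_ba = { g_ba, 3 };
static const struct lock_path path_b = { only_b, 1 };
```

In this model may_deadlock(&path_ab, &path_ba) is true, while the pairs (path_g_ab, path_g_ba) and (path_g_ab, path_b) are reported safe. A path that takes both locks but skips the global lock, such as path_ba against path_g_ab, is flagged again, which is why every two-or-more-lock path has to be covered.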


Now, if we need to take both anon_vma->lock AND i_mmap_lock in the newly
added mm_lock() thing and we also take both those locks at the same time in
regular code, we're probably screwed.




Re: [kvm-devel] + kvm-provide-kvmh-for-all-architecture-fixes-headers_install.patch added to -mm tree

2008-03-25 Thread Andrew Morton
On Tue, 25 Mar 2008 16:31:46 +0100
Christian Borntraeger [EMAIL PROTECTED] wrote:

 On Wednesday, 12 March 2008, you wrote:
  
  The patch titled
   kvm: provide kvm.h for all architecture: fixes headers_install
  has been added to the -mm tree.  Its filename is
   kvm-provide-kvmh-for-all-architecture-fixes-headers_install.patch
  
 
 Hello Andrew,
 
 is there a chance to submit this patch before 2.6.25? headers_install of 
 kvm.h worked with 2.6.24 but is still broken with 2.6.25-rc.

Sure, I'll merge it.

From: Christian Borntraeger [EMAIL PROTECTED]

Currently include/linux/kvm.h is not considered by make headers_install,
because Kbuild cannot handle "unifdef-$(CONFIG_FOO) += foo.h".  This problem
was introduced by 040922c04cf2c8ac70be2e88a8a9614ecdb41d2e, which makes this
a 2.6.25 regression.

One way of solving the issue is to enhance Kbuild, but Avi and David convinced
me that changing headers_install is not the way to go.  This patch changes
the definition for linux/kvm.h to unifdef-y.

If unifdef-y is used for linux/kvm.h, make headers_check will fail on all
architectures without asm/kvm.h.  Therefore, this patch also provides
asm/kvm.h on all architectures.

Signed-off-by: Christian Borntraeger [EMAIL PROTECTED]
Acked-by: Avi Kivity [EMAIL PROTECTED]
Cc: Sam Ravnborg [EMAIL PROTECTED]
Cc: David Woodhouse [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Signed-off-by: Andrew Morton [EMAIL PROTECTED]
---

 include/asm-alpha/kvm.h|6 ++
 include/asm-arm/kvm.h  |6 ++
 include/asm-avr32/kvm.h|6 ++
 include/asm-blackfin/kvm.h |6 ++
 include/asm-cris/kvm.h |6 ++
 include/asm-frv/kvm.h  |6 ++
 include/asm-generic/Kbuild.asm |2 ++
 include/asm-h8300/kvm.h|6 ++
 include/asm-ia64/kvm.h |6 ++
 include/asm-m32r/kvm.h |6 ++
 include/asm-m68k/kvm.h |6 ++
 include/asm-m68knommu/kvm.h|6 ++
 include/asm-mips/kvm.h |6 ++
 include/asm-mn10300/kvm.h  |6 ++
 include/asm-parisc/kvm.h   |6 ++
 include/asm-powerpc/kvm.h  |6 ++
 include/asm-s390/kvm.h |6 ++
 include/asm-sh/kvm.h   |6 ++
 include/asm-sparc/kvm.h|6 ++
 include/asm-sparc64/kvm.h  |6 ++
 include/asm-um/kvm.h   |6 ++
 include/asm-v850/kvm.h |6 ++
 include/asm-xtensa/kvm.h   |6 ++
 include/linux/Kbuild   |2 +-
 24 files changed, 135 insertions(+), 1 deletion(-)

diff -puN /dev/null include/asm-alpha/kvm.h
--- /dev/null
+++ a/include/asm-alpha/kvm.h
@@ -0,0 +1,6 @@
+#ifndef __LINUX_KVM_ALPHA_H
+#define __LINUX_KVM_ALPHA_H
+
+/* alpha does not support KVM */
+
+#endif
diff -puN /dev/null include/asm-arm/kvm.h
--- /dev/null
+++ a/include/asm-arm/kvm.h
@@ -0,0 +1,6 @@
+#ifndef __LINUX_KVM_ARM_H
+#define __LINUX_KVM_ARM_H
+
+/* arm does not support KVM */
+
+#endif
diff -puN /dev/null include/asm-avr32/kvm.h
--- /dev/null
+++ a/include/asm-avr32/kvm.h
@@ -0,0 +1,6 @@
+#ifndef __LINUX_KVM_AVR32_H
+#define __LINUX_KVM_AVR32_H
+
+/* avr32 does not support KVM */
+
+#endif
diff -puN /dev/null include/asm-blackfin/kvm.h
--- /dev/null
+++ a/include/asm-blackfin/kvm.h
@@ -0,0 +1,6 @@
+#ifndef __LINUX_KVM_BLACKFIN_H
+#define __LINUX_KVM_BLACKFIN_H
+
+/* blackfin does not support KVM */
+
+#endif
diff -puN /dev/null include/asm-cris/kvm.h
--- /dev/null
+++ a/include/asm-cris/kvm.h
@@ -0,0 +1,6 @@
+#ifndef __LINUX_KVM_CRIS_H
+#define __LINUX_KVM_CRIS_H
+
+/* cris does not support KVM */
+
+#endif
diff -puN /dev/null include/asm-frv/kvm.h
--- /dev/null
+++ a/include/asm-frv/kvm.h
@@ -0,0 +1,6 @@
+#ifndef __LINUX_KVM_FRV_H
+#define __LINUX_KVM_FRV_H
+
+/* frv does not support KVM */
+
+#endif
diff -puN include/asm-generic/Kbuild.asm~kvm-provide-kvmh-for-all-architecture-fixes-headers_install include/asm-generic/Kbuild.asm
--- a/include/asm-generic/Kbuild.asm~kvm-provide-kvmh-for-all-architecture-fixes-headers_install
+++ a/include/asm-generic/Kbuild.asm
@@ -1,3 +1,5 @@
+header-y  += kvm.h
+
 ifeq ($(wildcard include/asm-$(SRCARCH)/a.out.h),include/asm-$(SRCARCH)/a.out.h)
 unifdef-y += a.out.h
 endif
diff -puN /dev/null include/asm-h8300/kvm.h
--- /dev/null
+++ a/include/asm-h8300/kvm.h
@@ -0,0 +1,6 @@
+#ifndef __LINUX_KVM_H8300_H
+#define __LINUX_KVM_H8300_H
+
+/* h8300 does not support KVM */
+
+#endif
diff -puN /dev/null include/asm-ia64/kvm.h
--- /dev/null
+++ a/include/asm-ia64/kvm.h
@@ -0,0 +1,6 @@
+#ifndef __LINUX_KVM_IA64_H
+#define __LINUX_KVM_IA64_H
+
+/* ia64 does not support KVM */
+
+#endif
diff -puN /dev/null include/asm-m32r/kvm.h
--- /dev/null
+++ a/include/asm-m32r/kvm.h
@@ -0,0 +1,6 @@
+#ifndef __LINUX_KVM_M32R_H
+#define __LINUX_KVM_M32R_H
+
+/* m32r does not support KVM */
+
+#endif
diff -puN /dev/null include/asm-m68k/kvm.h
--- /dev/null
+++ a/include/asm-m68k/kvm.h
@@ -0,0 +1,6 @@
+#ifndef __LINUX_KVM_M68K_H
+#define

Re: [kvm-devel] [RFC/PATCH 01/15 v2] preparation: provide hook to enable pgstes in user pagetable

2008-03-24 Thread Andrew Morton
On Sat, 22 Mar 2008 18:02:37 +0100
Carsten Otte [EMAIL PROTECTED] wrote:

 From: Martin Schwidefsky [EMAIL PROTECTED]
 
 The SIE instruction on s390 uses the 2nd half of the page table page to
 virtualize the storage keys of a guest. This patch offers the s390_enable_sie
 function, which reorganizes the page tables of a single-threaded process to
 reserve space in the page table:
 s390_enable_sie makes sure that the process is single threaded and then uses
 dup_mm to create a new mm with reorganized page tables. The old mm is freed 
 and the process has now a page status extended field after every page table.
 
 Code that wants to exploit pgstes should SELECT CONFIG_PGSTE.
 
 This patch has a small common code hit, namely making dup_mm non-static.
 
 Edit (Carsten): I've modified Martin's patch, following Jeremy Fitzhardinge's
 review feedback. Now we do have the prototype for dup_mm in
 include/linux/sched.h.
 
 ...

 --- linux-host.orig/kernel/fork.c
 +++ linux-host/kernel/fork.c
 @@ -498,7 +498,7 @@ void mm_release(struct task_struct *tsk,
   * Allocate a new mm structure and copy contents from the
   * mm structure of the passed in task structure.
   */
 -static struct mm_struct *dup_mm(struct task_struct *tsk)
 +struct mm_struct *dup_mm(struct task_struct *tsk)
  {
   struct mm_struct *mm, *oldmm = current->mm;
   int err;

ack

 --- linux-host.orig/include/linux/sched.h
 +++ linux-host/include/linux/sched.h
 @@ -1758,6 +1758,8 @@ extern void mmput(struct mm_struct *);
  extern struct mm_struct *get_task_mm(struct task_struct *task);
  /* Remove the current tasks stale references to the old mm_struct */
  extern void mm_release(struct task_struct *, struct mm_struct *);
 +/* Allocate a new mm structure and copy contents from tsk->mm */
 +extern struct mm_struct *dup_mm(struct task_struct *tsk);
  
  extern int  copy_thread(int, unsigned long, unsigned long, unsigned long, struct task_struct *, struct pt_regs *);
  extern void flush_thread(void);
 

hm, why did we put these in sched.h?

oh well - acked-by-me.



Re: [kvm-devel] [RFC/PATCH 02/15 v2] preparation: host memory management changes for s390 kvm

2008-03-24 Thread Andrew Morton
On Sat, 22 Mar 2008 18:02:39 +0100
Carsten Otte [EMAIL PROTECTED] wrote:

 From: Heiko Carstens [EMAIL PROTECTED]
 From: Christian Borntraeger [EMAIL PROTECTED]
 
 This patch changes the s390 memory management definitions to use the pgste
 field for dirty and reference bit tracking of host and guest code. Usually on
 s390, dirty and referenced are tracked in storage keys, which belong to the
 physical page. This changes with virtualization: the guest and host
 dirty/reference bits are defined to be the logical OR of the values for the
 mapping and the physical page. This patch implements the necessary changes in
 pgtable.h for s390.
 
 There is a common code change in mm/rmap.c: the call to
 page_test_and_clear_young must be moved. This is a no-op for all architectures
 but s390. page_referenced checks the referenced bits for the physical page and
 for all mappings:
 o The physical page is checked with page_test_and_clear_young.
 o The mappings are checked with ptep_test_and_clear_young and friends.
 
 Without pgstes (the current implementation on Linux s390) the physical page
 check is implemented but the mapping callbacks are no-ops because dirty
 and referenced are not tracked in the s390 page tables. The pgstes introduce
 guest and host dirty and reference bits for s390 in the host mapping. These
 mappings must be checked before page_test_and_clear_young resets the reference
 bit.

 ...

 --- linux-host.orig/mm/rmap.c
 +++ linux-host/mm/rmap.c
 @@ -413,9 +413,6 @@ int page_referenced(struct page *page, i
  {
   int referenced = 0;
  
 - if (page_test_and_clear_young(page))
 - referenced++;
 -
   if (TestClearPageReferenced(page))
   referenced++;
  
 @@ -433,6 +430,10 @@ int page_referenced(struct page *page, i
   unlock_page(page);
   }
   }
 +
 + if (page_test_and_clear_young(page))
 + referenced++;
 +
   return referenced;
  }

ack.



Re: [kvm-devel] [Bugme-new] [Bug 10246] New: "in" after successful ioperm() results in SEGV after kvm use

2008-03-14 Thread Andrew Morton
On Fri, 14 Mar 2008 18:48:15 -0700 (PDT) [EMAIL PROTECTED] wrote:

 http://bugzilla.kernel.org/show_bug.cgi?id=10246
 
Summary: "in" after successful ioperm() results in SEGV after kvm
 use
Product: Memory Management
Version: 2.5
  KernelVersion: 2.6.25-rc5
   Platform: All
 OS/Version: Linux
   Tree: Mainline
 Status: NEW
   Severity: normal
   Priority: P1
  Component: Other
 AssignedTo: [EMAIL PROTECTED]
 ReportedBy: [EMAIL PROTECTED]
 
 
 Latest working kernel version: N/A
 Earliest failing kernel version: 2.6.24
 Distribution: Ubuntu, but tested with mainline
 Hardware Environment: intel mobo, Intel(R) Core(TM)2 Quad CPU [EMAIL PROTECTED]
 Software Environment: kvm 62 (x86_64)
 Problem Description:
 
 After a successful ioperm() call, otherwise valid "in" instructions will segv
 if a kvm VM has started.
 
 Steps to reproduce:
 
 1) run attached reproducer prior to starting a kvm VM, results are:
 # ./ioperm
 getting 0x3b4-0x3df permission...
 fetching 0x3cc...
 ok: 1
 
 2) start a kvm VM (bug exists only after actually starting a guest VM)
 
 3) run reproducer, which now fails:
 # ./ioperm
 getting 0x3b4-0x3df permission...
 fetching 0x3cc...
 Segmentation fault (core dumped)
 
 Note that it does not always fail.  Running within gdb seems to reduce the
 chances that it will fail.  But when it does, it is clearly the "in" that is
 failing:
 
 Program received signal SIGSEGV, Segmentation fault.
 0x004006e4 in inb ()
 (gdb) x/1i $pc
 0x4006e4 <inb+12>:  in     (%dx),%al
 (gdb) info reg rdx
 rdx            0x3cc    972
 
 I have had the sense that running the CPUs at full load (niced) increases the
 chance for failure.
 

There is a testcase in the bugzilla report.



Re: [kvm-devel] [PATCH] kvm: provide kvm.h for all architecture: fixes headers_install

2008-03-11 Thread Andrew Morton
On Mon, 10 Mar 2008 14:11:04 +0100 Christian Borntraeger [EMAIL PROTECTED] wrote:

 [PATCH v2] kvm: provide kvm.h for all architecture: fixes headers_install
 
 Currently include/linux/kvm.h is not considered by make headers_install,
 because Kbuild cannot handle "unifdef-$(CONFIG_FOO) += foo.h".
 This problem was introduced by 040922c04cf2c8ac70be2e88a8a9614ecdb41d2e,
 which makes this a 2.6.25 regression.
 
 One way of solving the issue is to enhance Kbuild, but Avi Kivity and David
 Woodhouse convinced me that changing headers_install is not the way to go.
 This patch changes the definition for linux/kvm.h to unifdef-y.
 
 If unifdef-y is used for linux/kvm.h, make headers_check will fail on all
 architectures without asm/kvm.h. Therefore, this patch also provides
 asm/kvm.h on all architectures.
 
 Changes since v1:
 o use asm-generic/Kbuild.asm (Arnd Bergmann)
 o fix comment in asm-frv (David Howells)

err, this doesn't work.

alpha and m68k (at least) fail make headers_check

/usr/src/devel/usr/include/linux/kvm.h requires asm/kvm.h, which does not exist in exported headers



Re: [kvm-devel] [PATCH] mmu notifiers #v7

2008-02-28 Thread Andrew Morton
On Fri, 29 Feb 2008 01:40:01 +0100 Andrea Arcangeli [EMAIL PROTECTED] wrote:

   +#define mmu_notifier(function, mm, args...)                          \
   +    do {                                                             \
   +        struct mmu_notifier *__mn;                                   \
   +        struct hlist_node *__n;                                      \
   +                                                                     \
   +        if (unlikely(!hlist_empty(&(mm)->mmu_notifier.head))) {      \
   +            rcu_read_lock();                                         \
   +            hlist_for_each_entry_rcu(__mn, __n,                      \
   +                                     &(mm)->mmu_notifier.head,       \
   +                                     hlist)                          \
   +                if (__mn->ops->function)                             \
   +                    __mn->ops->function(__mn,                        \
   +                                        mm,                          \
   +                                        args);                       \
   +            rcu_read_unlock();                                       \
   +        }                                                            \
   +    } while (0)
  
  Andrew recommended local variables for parameters used multiple times. This
  means the mm parameter here.
 
 I don't exactly see what "buggy macro" meant?

multiple references to the argument, so

mmu_notifier(foo, bar(), zot);

will call bar() either once or twice.

Unlikely in this case, but bad practice.  Easily fixable by using another
temporary.
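
A small userspace sketch of the problem and the fix. All names here (struct target, find_target(), NOTIFY_BAD, NOTIFY_GOOD and the two counting helpers) are hypothetical and exist only for the illustration; the macros are much simpler than mmu_notifier() but reference their argument in the same two places, the test and the call.

```c
#include <assert.h>

struct target {
	int enabled;
	int hits;
};

static struct target the_target = { 1, 0 };
static int lookups;		/* counts calls to find_target() */

/* Stand-in for an argument expression with a side effect, like bar(). */
static struct target *find_target(void)
{
	lookups++;
	return &the_target;
}

/* Expands (t) twice: once in the test and once in the update. */
#define NOTIFY_BAD(t, n)				\
	do {						\
		if ((t)->enabled)			\
			(t)->hits += (n);		\
	} while (0)

/* Copies (t) into a temporary, so the argument is evaluated exactly once. */
#define NOTIFY_GOOD(t, n)				\
	do {						\
		struct target *_t = (t);		\
		if (_t->enabled)			\
			_t->hits += (n);		\
	} while (0)

/* Number of find_target() calls made by one use of each macro. */
static int lookups_for_bad(void)
{
	lookups = 0;
	NOTIFY_BAD(find_target(), 1);
	return lookups;
}

static int lookups_for_good(void)
{
	lookups = 0;
	NOTIFY_GOOD(find_target(), 1);
	return lookups;
}
```

With the target enabled, NOTIFY_BAD calls find_target() twice; with it disabled, once. That is the "either once or twice" behaviour described above. NOTIFY_GOOD calls it once in both cases.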




Re: [kvm-devel] [patch 1/6] mmu_notifier: Core code

2008-02-16 Thread Andrew Morton
On Sat, 16 Feb 2008 10:45:50 +0200 Avi Kivity [EMAIL PROTECTED] wrote:

 Andrew Morton wrote:
  How important is this feature to KVM?

 
 Very.  kvm pins pages that are referenced by the guest;

hm.  Why does it do that?

 a 64-bit guest 
 will easily pin its entire memory with the kernel map.

  So this is 
 critical for guest swapping to actually work.

Curious.  If KVM can release guest pages at the request of this notifier so
that they can be swapped out, why can't it release them by default, and
allow swapping to proceed?

 
 Other nice features like page migration are also enabled by this patch.
 

We already have page migration.  Do you mean page-migration-when-using-kvm?



Re: [kvm-devel] [PATCH] KVM swapping with MMU Notifiers V7

2008-02-16 Thread Andrew Morton
On Sat, 16 Feb 2008 11:48:27 +0100 Andrea Arcangeli [EMAIL PROTECTED] wrote:

 +void kvm_mmu_notifier_invalidate_range_end(struct mmu_notifier *mn,
 +					    struct mm_struct *mm,
 +					    unsigned long start, unsigned long end,
 +					    int lock)
 +{
 +	for (; start < end; start += PAGE_SIZE)
 +		kvm_mmu_notifier_invalidate_page(mn, mm, start);
 +}
 +
 +static const struct mmu_notifier_ops kvm_mmu_notifier_ops = {
 + .invalidate_page= kvm_mmu_notifier_invalidate_page,
 + .age_page   = kvm_mmu_notifier_age_page,
 + .invalidate_range_end   = kvm_mmu_notifier_invalidate_range_end,
 +};

So this doesn't implement ->invalidate_range_start().

By what means does it prevent new mappings from being established in the
range after core mm has tried to call ->invalidate_range_start()?
mmap_sem, I assume?


 + /* set userspace_addr atomically for kvm_hva_to_rmapp */
 + spin_lock(&kvm->mmu_lock);
 + memslot->userspace_addr = userspace_addr;
 + spin_unlock(&kvm->mmu_lock);

are you sure?  kvm_unmap_hva() and kvm_age_hva() read ->userspace_addr a
single time and it doesn't immediately look like there's a need to take the
lock here?





Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-15 Thread Andrew Morton
On Thu, 14 Feb 2008 22:49:01 -0800 Christoph Lameter [EMAIL PROTECTED] wrote:

 The invalidation of address ranges in a mm_struct needs to be
 performed when pages are removed or permissions etc change.

hm.  Do they?  Why?  If I'm in the process of zero-copy writing a hunk of
memory out to hardware then do I care if someone write-protects the ptes?

Spose so, but some fleshing-out of the various scenarios here would clarify
things.

 If invalidate_range_begin() is called with locks held then we
 pass a flag into invalidate_range() to indicate that no sleeping is
 possible. Locks are only held for truncate and huge pages.

This is so bad.

I suppose in the restricted couple of cases which you're focussed on it
works OK.  But is it generally suitable?  What if IO is in progress?  What
if other cluster nodes need to be talked to?  Does it suit RDMA?

 In two cases we use invalidate_range_begin/end to invalidate
 single pages because the pair allows holding off new references
 (idea by Robin Holt).

Assuming that there is a missing "within the range" in this description, I
assume that all clients will just throw up their hands in horror and will
disallow all references to all parts of the mm.

Of course, to do that they will need to take a sleeping lock to prevent
other threads from establishing new references.  whoops.

 do_wp_page(): We hold off new references while we update the pte.
 
 xip_unmap: We are not taking the PageLock so we cannot
 use the invalidate_page mmu_rmap_notifier. invalidate_range_begin/end
 stands in.

What does "stands in" mean?

 Signed-off-by: Andrea Arcangeli [EMAIL PROTECTED]
 Signed-off-by: Robin Holt [EMAIL PROTECTED]
 Signed-off-by: Christoph Lameter [EMAIL PROTECTED]
 
 ---
  mm/filemap_xip.c |5 +
  mm/fremap.c  |3 +++
  mm/hugetlb.c |3 +++
  mm/memory.c  |   35 +--
  mm/mmap.c|2 ++
  mm/mprotect.c|3 +++
  mm/mremap.c  |7 ++-
  7 files changed, 51 insertions(+), 7 deletions(-)
 
 Index: linux-2.6/mm/fremap.c
 ===
 --- linux-2.6.orig/mm/fremap.c2008-02-14 18:43:31.0 -0800
 +++ linux-2.6/mm/fremap.c 2008-02-14 18:45:07.0 -0800
 @@ -15,6 +15,7 @@
  #include <linux/rmap.h>
  #include <linux/module.h>
  #include <linux/syscalls.h>
 +#include <linux/mmu_notifier.h>
  
  #include <asm/mmu_context.h>
  #include <asm/cacheflush.h>
 @@ -214,7 +215,9 @@ asmlinkage long sys_remap_file_pages(uns
   spin_unlock(&mapping->i_mmap_lock);
   }
  
 + mmu_notifier(invalidate_range_begin, mm, start, start + size, 0);
   err = populate_range(mm, vma, start, size, pgoff);
 + mmu_notifier(invalidate_range_end, mm, start, start + size, 0);

To avoid off-by-one confusion the changelogs, documentation and comments
should be very careful to tell the reader whether the range includes the
byte at start+size.  I don't think that was done?
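
To show why the convention matters, here is a small sketch that counts the pages a callback would visit under each reading of the range. The page size and both function names are assumptions made for the example; the patch itself does not say which convention it uses, which is the point of the question above.

```c
#include <assert.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumed page size for this sketch */

/* Pages visited if the range is half-open, [start, end): end is excluded. */
static unsigned long pages_half_open(unsigned long start, unsigned long end)
{
	unsigned long n = 0;

	for (; start < end; start += SKETCH_PAGE_SIZE)
		n++;
	return n;
}

/* Pages visited if the range is closed, [start, end]: end is included. */
static unsigned long pages_closed(unsigned long start, unsigned long end)
{
	unsigned long n = 0;

	for (; start <= end; start += SKETCH_PAGE_SIZE)
		n++;
	return n;
}
```

For start = 0x1000 and end = start + 2 * SKETCH_PAGE_SIZE, the half-open reading visits two pages and the closed reading visits three, so a caller and a callback that disagree are off by a whole page.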





Re: [kvm-devel] [patch 1/6] mmu_notifier: Core code

2008-02-15 Thread Andrew Morton
On Thu, 14 Feb 2008 22:49:00 -0800 Christoph Lameter [EMAIL PROTECTED] wrote:

 MMU notifiers are used for hardware and software that establishes
 external references to pages managed by the Linux kernel. These are
 page table entries or tlb entries or something else that allows
 hardware (such as DMA engines, scatter gather devices, networking,
 sharing of address spaces across operating system boundaries) and
 software (Virtualization solutions such as KVM, Xen etc) to
 access memory managed by the Linux kernel.
 
 The MMU notifier will notify the device driver that subscribes to such
 a notifier that the VM is going to do something with the memory
 mapped by that device. The device must then drop references for the
 indicated memory area. The references may be reestablished later.
 
 The notification scheme is much better than the current schemes of
 avoiding the danger of the VM removing pages that are externally
 mapped. We currently either mlock pages used for RDMA, XPmem etc
 in memory or increase the refcount to pin the pages. Increasing
 the refcount makes it impossible for the VM to reclaim the page.
 
 Mlock causes problems with reclaim and may lead to OOM if too many
 pages are pinned in memory. It is also incorrect in terms of what POSIX
 specifies for the role mlock should play. Mlock does *not* pin pages in
 memory. Mlock just means do not allow the page to be moved to swap.
 
 Linux can move pages in memory (for example through the page migration
 mechanism). These pages can be moved even if they are mlocked().
 The current approach of page pinning in use by RDMA etc is conceptually
 broken but there are currently no other easy solutions.
 
 The alternative of increasing the page count to pin pages is also not
 that enticing since there will be continual attempts to reclaim
 or migrate these pages.
 
 The solution here allows us to finally fix this issue by requiring
 such devices to subscribe to a notification chain that will allow
 them to work without pinning. The VM gains control of its memory again
 and the memory that has external references can be managed like regular
 memory.
 
 This patch: Core portion
 

What is the status of getting infiniband to use this facility?

How important is this feature to KVM?

To xpmem?

Which other potential clients have been identified and how important is it
to those?


 Index: linux-2.6/Documentation/mmu_notifier/README
 ===
 --- /dev/null 1970-01-01 00:00:00.0 +
 +++ linux-2.6/Documentation/mmu_notifier/README   2008-02-14 22:27:19.0 -0800
 @@ -0,0 +1,105 @@
 +Linux MMU Notifiers
 +---
 +
 +MMU notifiers are used for hardware and software that establishes
 +external references to pages managed by the Linux kernel. These are
 +page table entries or tlb entries or something else that allows
 +hardware (such as DMA engines, scatter gather devices, networking,
 +sharing of address spaces across operating system boundaries) and
 +software (Virtualization solutions such as KVM, Xen etc) to
 +access memory managed by the Linux kernel.
 +
 +The MMU notifier will notify the device driver that subscribes to such
 +a notifier that the VM is going to do something with the memory
 +mapped by that device. The device must then drop references for the
 +indicated memory area. The references may be reestablished later.
 +
 +The notification scheme is much better than the current schemes of
 +dealing with the danger of the VM removing pages.
 +We currently mlock pages used for RDMA, XPmem etc in memory or
 +increase the refcount of the pages.
 +
 +Both cause problems with reclaim and may lead to OOM if too many
 +pages are pinned in memory. Mlock is also incorrect in terms of the POSIX
 +specification of the role of mlock. Mlock does *not* pin pages in
 +memory. It just does not allow the page to be moved to swap.
 +The page refcount is used to track current users of a page struct.
 +Artificially inflating the refcount means that the VM cannot track
 +down all references to a page. It will not be able to reclaim or
 +move a page. However, the core code will try again and again because
 +the assumption is that an elevated refcount is a temporary situation.
 +
 +Linux can move pages in memory (for example through the page migration
 +mechanism). These pages can be moved even if they are mlocked().
 +So the current approach in use by RDMA etc etc is conceptually broken
 +but there are currently no other easy solutions.
 +
 +The solution here allows us to finally fix this issue by requiring
 +such devices to subscribe to a notification chain that will allow
 +them to work without pinning.
 +
 +The notifier chains provide two callback mechanisms. The
 +first one is required for any device that establishes external mappings.
 +The second (rmap) mechanism is required if a device needs to be
 +able to sleep when invalidating references. Sleeping may be 

Re: [kvm-devel] [patch 3/6] mmu_notifier: invalidate_page callbacks

2008-02-15 Thread Andrew Morton
On Thu, 14 Feb 2008 22:49:02 -0800 Christoph Lameter [EMAIL PROTECTED] wrote:

 Two callbacks to remove individual pages as done in rmap code
 
   invalidate_page()
 
 Called from the inner loop of rmap walks to invalidate pages.
 
   age_page()
 
 Called for the determination of the page referenced status.
 
 If we do not care about page referenced status then an age_page callback
 may be omitted. PageLock and pte lock are held when either of the
 functions is called.

The age_page mystery shallows.

It would be useful to have some rationale somewhere in the patchset for the
existence of this callback.

  #include <asm/tlbflush.h>
  
 @@ -287,7 +288,8 @@ static int page_referenced_one(struct pa
   if (vma->vm_flags & VM_LOCKED) {
   referenced++;
   *mapcount = 1;  /* break early from loop */
 - } else if (ptep_clear_flush_young(vma, address, pte))
 + } else if (ptep_clear_flush_young(vma, address, pte) |
 +mmu_notifier_age_page(mm, address))
   referenced++;

The | is obviously deliberate.  But no explanation is provided telling us
why we still call the callback if ptep_clear_flush_young() said the page
was recently referenced.  People who read your code will want to understand
this.
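The difference between the two operators can be sketched in plain userspace C. This is only an illustration: primary_young() and secondary_young() are hypothetical stand-ins for ptep_clear_flush_young() and mmu_notifier_age_page(), not the kernel functions themselves.

```c
#include <assert.h>

static int notifier_calls; /* times the secondary check ran */

/* Hypothetical stand-in for ptep_clear_flush_young(). */
static int primary_young(int young)
{
	return young;
}

/* Hypothetical stand-in for mmu_notifier_age_page(). */
static int secondary_young(void)
{
	notifier_calls++;
	return 0;
}

/* Bitwise OR: both operands are always evaluated. */
static int referenced_bitwise(int young)
{
	return (primary_young(young) | secondary_young()) != 0;
}

/* Logical OR: the right operand is skipped when the left is nonzero. */
static int referenced_logical(int young)
{
	return primary_young(young) || secondary_young();
}

/* Self-check: count secondary calls for a page the primary reports young. */
static void age_page_self_check(void)
{
	notifier_calls = 0;
	assert(referenced_bitwise(1) == 1);
	assert(notifier_calls == 1);	/* callback still ran */

	notifier_calls = 0;
	assert(referenced_logical(1) == 1);
	assert(notifier_calls == 0);	/* callback was skipped */
}
```

With `|` the second check runs even when the first already reported the page as referenced, which presumably matters if the callback has to clear state on the external side as well. That is the kind of reasoning a code comment should spell out.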

   /* Pretend the page is referenced if the task has the
 @@ -455,6 +457,7 @@ static int page_mkclean_one(struct page 
  
   flush_cache_page(vma, address, pte_pfn(*pte));
   entry = ptep_clear_flush(vma, address, pte);
 + mmu_notifier(invalidate_page, mm, address);

I just don't see how this can be done if the callee has another thread in
the middle of establishing IO against this region of memory.
->invalidate_page() _has_ to be able to block.  Confused.



___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [patch 5/6] mmu_notifier: Support for drivers with revers maps (f.e. for XPmem)

2008-02-15 Thread Andrew Morton
On Thu, 14 Feb 2008 22:49:04 -0800 Christoph Lameter [EMAIL PROTECTED] wrote:

 These special additional callbacks are required because XPmem (and likely
 other mechanisms) do use their own rmap (multiple processes on a series
 of remote Linux instances may be accessing the memory of a process).
 F.e. XPmem may have to send out notifications to remote Linux instances
 and receive confirmation before a page can be freed.
 
 So we handle this like an additional Linux reverse map that is walked after
 the existing rmaps have been walked. We leave the walking to the driver that
 is then able to use something else than a spinlock to walk its reverse
 maps. So we can actually call the driver without holding spinlocks while
 we hold the Pagelock.
 
 However, we cannot determine the mm_struct that a page belongs to at
 that point. The mm_struct can only be determined from the rmaps by the
 device driver.
 
 We add another pageflag (PageExternalRmap) that is set if a page has
 been remotely mapped (f.e. by a process from another Linux instance).
 We can then only perform the callbacks for pages that are actually in
 remote use.
 
 Rmap notifiers need an extra page bit and are only available
 on 64 bit platforms. This functionality is not available on 32 bit!
 
 A notifier that uses the reverse maps callbacks does not need to provide
 the invalidate_page() method that is called when locks are held.
 

hrm.

 +#define mmu_rmap_notifier(function, args...) \
 +	do { \
 +		struct mmu_rmap_notifier *__mrn; \
 +		struct hlist_node *__n; \
 +		\
 +		rcu_read_lock(); \
 +		hlist_for_each_entry_rcu(__mrn, __n, \
 +				&mmu_rmap_notifier_list, hlist) \
 +			if (__mrn->ops->function) \
 +				__mrn->ops->function(__mrn, args); \
 +		rcu_read_unlock(); \
 +	} while (0);
 +

buggy macro: use locals.
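One reading of "use locals" is that the macro expands its arguments inside the list walk, so an argument with side effects is evaluated once per registered notifier. A minimal userspace sketch of that hazard follows. All names are hypothetical, and the fixed listener count stands in for the notifier list.

```c
#include <assert.h>

#define NR_LISTENERS 3

static int evaluations;	/* how often the argument expression ran */
static int sum;		/* sum of the values the listeners received */

static void listener(int value)
{
	sum += value;
}

/* An argument expression with a side effect. */
static int next_value(void)
{
	return ++evaluations;
}

/* Argument is expanded inside the loop: evaluated once per listener. */
#define NOTIFY_ALL_UNSAFE(arg) \
	do { \
		int i_; \
		for (i_ = 0; i_ < NR_LISTENERS; i_++) \
			listener(arg); \
	} while (0)

/* Argument is copied to a local first: evaluated exactly once. */
#define NOTIFY_ALL_SAFE(arg) \
	do { \
		int i_; \
		int arg_ = (arg); \
		for (i_ = 0; i_ < NR_LISTENERS; i_++) \
			listener(arg_); \
	} while (0)

static int evals_unsafe(void)
{
	evaluations = 0;
	sum = 0;
	NOTIFY_ALL_UNSAFE(next_value());
	return evaluations;
}

static int evals_safe(void)
{
	evaluations = 0;
	sum = 0;
	NOTIFY_ALL_SAFE(next_value());
	return evaluations;
}

/* Self-check: the unsafe form re-runs the argument for every listener. */
static void macro_self_check(void)
{
	assert(evals_unsafe() == NR_LISTENERS);
	assert(sum == 1 + 2 + 3);	/* each listener saw a different value */
	assert(evals_safe() == 1);
	assert(sum == 3 * 1);		/* every listener saw the same value */
}
```

A static inline function avoids the problem entirely, since function arguments are evaluated once at the call site.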

 +#define mmu_rmap_notifier(function, args...) \
 +	do { \
 +		if (0) { \
 +			struct mmu_rmap_notifier *__mrn; \
 +			\
 +			__mrn = (struct mmu_rmap_notifier *)(0x00ff); \
 +			__mrn->ops->function(__mrn, args); \
 +		} \
 +	} while (0);
 +

Same observation as in the other patch.

 ===
 --- linux-2.6.orig/mm/mmu_notifier.c  2008-02-14 21:17:51.0 -0800
 +++ linux-2.6/mm/mmu_notifier.c   2008-02-14 21:21:04.0 -0800
 @@ -74,3 +74,37 @@ void mmu_notifier_unregister(struct mmu_
  }
  EXPORT_SYMBOL_GPL(mmu_notifier_unregister);
  
 +#ifdef CONFIG_64BIT
 +static DEFINE_SPINLOCK(mmu_notifier_list_lock);
 +HLIST_HEAD(mmu_rmap_notifier_list);
 +
 +void mmu_rmap_notifier_register(struct mmu_rmap_notifier *mrn)
 +{
 +	spin_lock(&mmu_notifier_list_lock);
 +	hlist_add_head_rcu(&mrn->hlist, &mmu_rmap_notifier_list);
 +	spin_unlock(&mmu_notifier_list_lock);
 +}
 +EXPORT_SYMBOL(mmu_rmap_notifier_register);
 +
 +void mmu_rmap_notifier_unregister(struct mmu_rmap_notifier *mrn)
 +{
 +	spin_lock(&mmu_notifier_list_lock);
 +	hlist_del_rcu(&mrn->hlist);
 +	spin_unlock(&mmu_notifier_list_lock);
 +}
 +EXPORT_SYMBOL(mmu_rmap_notifier_unregister);

 +/*
 + * Export a page.
 + *
 + * Pagelock must be held.
 + * Must be called before a page is put on an external rmap.
 + */
 +void mmu_rmap_export_page(struct page *page)
 +{
 + BUG_ON(!PageLocked(page));
 + SetPageExternalRmap(page);
 +}
 +EXPORT_SYMBOL(mmu_rmap_export_page);

The other patch used EXPORT_SYMBOL_GPL.





Re: [kvm-devel] [patch 0/6] MMU Notifiers V6

2008-02-08 Thread Andrew Morton
On Fri, 08 Feb 2008 14:06:16 -0800
Christoph Lameter [EMAIL PROTECTED] wrote:

 This is a patchset implementing MMU notifier callbacks based on Andrea's
 earlier work. These are needed if Linux pages are referenced from something
 else than tracked by the rmaps of the kernel (an external MMU). MMU
 notifiers allow us to get rid of the page pinning for RDMA and various
 other purposes. It gets rid of the broken use of mlock for page pinning.
 (mlock really does *not* pin pages)
 
 More information on the rationale and the technical details can be found in
 the first patch and the README provided by that patch in
 Documentation/mmu_notifiers.
 
 The known immediate users are
 
 KVM
 - Establishes a refcount to the page via get_user_pages().
 - External references are called spte.
 - Has page tables to track pages whose refcount was elevated but
   no reverse maps.
 
 GRU
 - Simple additional hardware TLB (possibly covering multiple instances of
   Linux)
 - Needs TLB shootdown when the VM unmaps pages.
 - Determines page address via follow_page (from interrupt context) but can
   fall back to get_user_pages().
 - No page reference possible since no page status is kept.
 
 XPmem
 - Allows use of a process's memory by remote instances of Linux.
 - Provides its own reverse mappings to track remote pte.
 - Established refcounts on the exported pages.
 - Must sleep in order to wait for remote acks of ptes that are being
   cleared.
 

What about ib_umem_get()?



Re: [kvm-devel] [patch 0/6] MMU Notifiers V6

2008-02-08 Thread Andrew Morton
On Fri, 8 Feb 2008 17:43:02 -0600 Robin Holt [EMAIL PROTECTED] wrote:

 On Fri, Feb 08, 2008 at 03:41:24PM -0800, Christoph Lameter wrote:
  On Fri, 8 Feb 2008, Robin Holt wrote:
  
 What about ib_umem_get()?
  
  Correct.
  
  You missed the turn of the conversation to how ib_umem_get() works. 
  Currently it seems to pin the same way that the SLES10 XPmem works.
 
 Ah.  I took Andrew's question as more of a probe about whether we had
 worked with the IB folks to ensure this fits the ib_umem_get needs
 as well.
 

You took it correctly, and I didn't understand the answer ;)



Re: [kvm-devel] [patch 0/6] MMU Notifiers V6

2008-02-08 Thread Andrew Morton
On Fri, 8 Feb 2008 16:05:00 -0800 (PST) Christoph Lameter [EMAIL PROTECTED] wrote:

 On Fri, 8 Feb 2008, Andrew Morton wrote:
 
  You took it correctly, and I didn't understand the answer ;)
 
 We have done several rounds of discussion on linux-kernel about this so 
 far and the IB folks have not shown up to join in. I have tried to make 
 this as general as possible.

infiniband would appear to be the major present in-kernel client of this new
interface.  So as a part of proving its usefulness, correctness, etc we
should surely work on converting infiniband to use it, and prove its
goodness.

Quite possibly none of the infiniband developers even know about it..



Re: [kvm-devel] 2.6.24-rc8-mm1 (KVM build issues)

2008-01-22 Thread Andrew Morton
 On Fri, 18 Jan 2008 22:56:32 +0530 Balbir Singh [EMAIL PROTECTED] wrote:
 * Andrew Morton [EMAIL PROTECTED] [2008-01-17 02:35:14]:
 
  - kvm probably doesn't work properly because I couldn't be bothered fixing
the conflicts between git-kvm and the driver tree
  
 
 Hi, Andrew,
 
 The following changes got KVM up and running for me
 
 
 This patch fixes the kvm build on 2.6.24-rc8-mm1. First of all, it enables
 the KVM build, the second fix moves kset_set_name to the .name member.
 
 Signed-off-by: Balbir Singh [EMAIL PROTECTED]
 ---
 
  arch/x86/Makefile   |2 +-
  virt/kvm/kvm_main.c |2 +-
  2 files changed, 2 insertions(+), 2 deletions(-)
 
 diff -puN arch/x86/Makefile~fix-kvm-build arch/x86/Makefile
 --- linux-2.6.24-rc8/arch/x86/Makefile~fix-kvm-build  2008-01-18 22:42:41.0 +0530
 +++ linux-2.6.24-rc8-balbir/arch/x86/Makefile 2008-01-18 22:42:47.0 +0530
 @@ -185,7 +185,7 @@ core-y += arch/x86/vdso/
  core-$(CONFIG_IA32_EMULATION) += arch/x86/ia32/
  
  # kvm host support - uncomment when merging
 -# core-$(CONFIG_KVM) += arch/x86/kvm/
 +core-$(CONFIG_KVM) += arch/x86/kvm/
  
  # drivers-y are linked after core-y
  drivers-$(CONFIG_MATH_EMULATION) += arch/x86/math-emu/
 diff -puN virt/kvm/kvm_main.c~fix-kvm-build virt/kvm/kvm_main.c
 --- linux-2.6.24-rc8/virt/kvm/kvm_main.c~fix-kvm-build 2008-01-18 22:42:41.0 +0530
 +++ linux-2.6.24-rc8-balbir/virt/kvm/kvm_main.c   2008-01-18 22:42:47.0 +0530
 @@ -1260,7 +1260,7 @@ static int kvm_resume(struct sys_device 
  }
  
  static struct sysdev_class kvm_sysdev_class = {
 - set_kset_name("kvm"),
 + .name = "kvm",
   .suspend = kvm_suspend,
   .resume = kvm_resume,
  };

This patch straddles such a pickle of other patches (driver tree, kvm,
git-x86) that there doesn't seem much point in me untangling it.  Presumably
people will fix things up as various trees merge into 2.6.25-rc1.

As long as Greg remembers to try to build kvm ;)



Re: [kvm-devel] [patch 3/5] KVM: add kvm_follow_page()

2007-12-23 Thread Andrew Morton
On Sun, 23 Dec 2007 12:35:30 +0200 Avi Kivity [EMAIL PROTECTED] wrote:

 Andrew Morton wrote:
  On Sun, 23 Dec 2007 10:59:22 +0200 Avi Kivity [EMAIL PROTECTED] wrote:
 

  Avi Kivity wrote:
  
  Avi Kivity wrote:


  Exactly.  But it is better to be explicit about it and pass the page
  directly like you did before.  I hate to make you go back-and-forth,
  but I did not understand the issue completely before.
 
  
  
  btw, the call to gfn_to_page() can happen in page_fault() instead of
  walk_addr(); that will reduce the amount of error handling, and will
  simplify the callers to walk_addr() that don't need the page.
 


  Note further that all this doesn't obviate the need for follow_page()
  (or get_user_pages_inatomic()); we still need something in update_pte()
  for the demand paging case.
  
 
  Please review -mm's mm/pagewalk.c for suitability.
 
  If it is unsuitable but repairable then please cc Matt Mackall
  [EMAIL PROTECTED] on the review.
 

 
 The "no locks are taken" comment is very worrying.  We need accurate 
 results.

take down_read(&mm->mmap_sem) before calling it..

You have to do that anyway for its results to be meaningful in the caller. 
Ditto get_user_pages().

 Getting pte_t's in the callbacks is a little too low level for kvm's use 
 (which wants struct page pointers) but of course that is easily handled in 
 a kvm wrapper.
 
 I'd prefer an atomic version of get_user_pages(), but if pagewalk is 
 fixed to take the necessary locks, it will do.

It isn't exported to modules at present, although I see no problem in
changing that.




Re: [kvm-devel] [patch 3/5] KVM: add kvm_follow_page()

2007-12-23 Thread Andrew Morton
On Sun, 23 Dec 2007 15:15:25 -0500 Marcelo Tosatti [EMAIL PROTECTED] wrote:

 Are you guys OK with this ?
 
 
 Modular KVM needs walk_page_range(), and also vm_normal_page() to be 
 used on pagewalk callback.

I am.



Re: [kvm-devel] 2.6.23-rc8-mm1: drivers/kvm/ioapic.o build failure

2007-09-26 Thread Andrew Morton
On Wed, 26 Sep 2007 11:00:09 +0200 Avi Kivity [EMAIL PROTECTED] wrote:

 Mariusz Kozlowski wrote:
  Hello,
 
  Similar (the same?) as in 2.6.23-rc6-mm1?
 
  http://www.mail-archive.com/linux-kernel%40vger.kernel.org/msg208812.html
  
CC [M]  drivers/kvm/ioapic.o
  drivers/kvm/ioapic.c: In function 'ioapic_deliver':
  drivers/kvm/ioapic.c:208: error: 'dest_LowestPrio' undeclared (first use in this function)
  drivers/kvm/ioapic.c:208: error: (Each undeclared identifier is reported only once
  drivers/kvm/ioapic.c:208: error: for each function it appears in.)
  drivers/kvm/ioapic.c:219: error: 'dest_Fixed' undeclared (first use in this function)
  make[2]: *** [drivers/kvm/ioapic.o] Error 1
  make[1]: *** [drivers/kvm] Error 2
  make: *** [drivers] Error 2
 

 
 We now include <asm/io_apic.h> like we should.  Has that file changed in -mm?
 

CONFIG_X86_IO_APIC isn't set.



[kvm-devel] kvm warning

2007-08-08 Thread Andrew Morton

ia64 allmodconfig says

drivers/kvm/Kconfig:14:warning: 'select' used by config symbol 'KVM' refers to undefined symbol 'PREEMPT_NOTIFIERS'

Because of

commit 8928fb48c7a7f9053a55f1d0023cbc533f2b3663
Author: Avi Kivity [EMAIL PROTECTED]
Date:   Wed Jul 11 18:17:21 2007 +0300

KVM: Use the scheduler preemption notifiers to make kvm preemptible

Current kvm disables preemption while the new virtualization registers are
in use.  This of course is not very good for latency sensitive workloads (one
use of virtualization is to offload user interface and other latency
insensitive stuff to a container, so that it is easier to analyze the
remaining workload).  This patch re-enables preemption for kvm; preemption
is now only disabled when switching the registers in and out, and during
the switch to guest mode and back.

Contains fixes from Shaohua Li [EMAIL PROTECTED].

Signed-off-by: Avi Kivity [EMAIL PROTECTED]

--- a/drivers/kvm/Kconfig
+++ b/drivers/kvm/Kconfig
@@ -11,6 +11,7 @@ if VIRTUALIZATION
 config KVM
tristate "Kernel-based Virtual Machine (KVM) support"
depends on X86 && EXPERIMENTAL
+   select PREEMPT_NOTIFIERS
select ANON_INODES
---help---
  Support hosting fully virtualized guest machines using hardware
...


a) is kvm supported on ia64 at all??

b) `select' is evil.  Just Don't Do It.

c) `select' is especially evil when it's done on some kernel-internal
   secret symbol like PREEMPT_NOTIFIERS.

d) I can't see anything else in the kernel which sets or clears
   PREEMPT_NOTIFIERS so I'm rather wondering why the config option exists
   at all.

e) sched developers may not like KVM reaching over and twiddling their
   knobs for them.


It all needs more thought, I think...




Re: [kvm-devel] kvm warning

2007-08-08 Thread Andrew Morton
On Thu, 09 Aug 2007 01:48:07 +0300
Avi Kivity [EMAIL PROTECTED] wrote:

 Ingo Molnar wrote:
  * Andrew Morton [EMAIL PROTECTED] wrote:
 

  ia64 allmodconfig says
 
  drivers/kvm/Kconfig:14:warning: 'select' used by config symbol 'KVM' 
  refers to undefined symbol 'PREEMPT_NOTIFIERS'
  
 
  hm, why doesnt ia64 pick up kernel/Kconfig.preempt, like all the other 
  arches? Due to that ia64 also misses out on voluntary preempt and on 
  preempt-bkl.
 

 
 Even more hm, how does ia64 manage to enable kvm?  It 'depends on X86' 
 at this moment.
 

beats me.  CONFIG_KVM doesn't get set.  But it seems that kconfig wants
to do error-checking on that item anyway.



btw, testing of Kconfig can be done for any architecture without
installation of a toolchain for that architecture.  Set $ARCH and run
mrproper then use menuconfig/oldconfig/allmodconfig/allconfig as usual.

Judging by the number of Kconfig problems I see, this is a big secret ;)



Re: [kvm-devel] 2.6.22-rc4-mm2: kvm compile breakage with X86_CMPXCHG64=n

2007-06-12 Thread Andrew Morton
On Mon, 11 Jun 2007 23:22:24 -0400
Dave Jones [EMAIL PROTECTED] wrote:

 Add -Werror-implicit-function-declaration
 This makes builds fail sooner if something is implicitly defined instead
 of having to wait half an hour for it to fail at the linking stage.
 
 Signed-off-by: Dave Jones [EMAIL PROTECTED]
 
 --- linux-2.6/Makefile~   2007-06-04 16:46:24.0 -0400
 +++ linux-2.6/Makefile2007-06-04 16:46:53.0 -0400
 @@ -313,7 +313,8 @@ LINUXINCLUDE:= -Iinclude \
  CPPFLAGS:= -D__KERNEL__ $(LINUXINCLUDE)
  
  CFLAGS  := -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs \
 -   -fno-strict-aliasing -fno-common
 +-fno-strict-aliasing -fno-common \
 +-Werror-implicit-function-declaration
  AFLAGS  := -D__ASSEMBLY__
  
  # Read KERNELRELEASE from include/config/kernel.release (if it exists)

This causes the i386 allmodconfig build to fail:

include/linux/uaccess.h: In function 'pagefault_disable':
include/linux/uaccess.h:23: error: implicit declaration of function '__memory_barrier'

I didn't look to see why...



Re: [kvm-devel] 2.6.22-rc4-mm2: kvm compile breakage with X86_CMPXCHG64=n

2007-06-12 Thread Andrew Morton
On Tue, 12 Jun 2007 18:16:29 -0400
Dave Jones [EMAIL PROTECTED] wrote:

 # Read KERNELRELEASE from include/config/kernel.release (if it exists)
   
   This causes the i386 allmodconfig build to fail:
 
 Seems to be doing its job rather effectively.

err, hang on.  I had a different patch in there which hilariously broke
the build all over the place, and dropping that has made your patch
come good.  I'll put it back.



Re: [kvm-devel] [PATCH 0/6] KVM userspace interface updates for 2.6.21

2007-02-25 Thread Andrew Morton
 On Sun, 25 Feb 2007 11:58:23 +0200 Avi Kivity [EMAIL PROTECTED] wrote:
 Avi Kivity wrote:
 
  The patchset, along with the previous fixset, is available as a git 
  tree from git://kvm.qumranet.com/home/avi/kvm/linux-2.6.  You may wish 
  to plant it in your little git forest.
 
 
 This is now git://kvm.qumranet.com/home/avi/kvm.git, as a bare 'git 
 pull' will pull the current branch instead of master, giving you 
 whatever I was working on at the moment.  The kvm.git repo will always 
 have 'master' as the current branch.

OK.

 drivers/kvm/kvm.h |   13 +
 drivers/kvm/kvm_main.c|  774 -
 drivers/kvm/kvm_svm.h |3 
 drivers/kvm/mmu.c |   36 +-
 drivers/kvm/paging_tmpl.h |   18 +
 drivers/kvm/svm.c |   42 ++
 drivers/kvm/vmx.c |   33 ++
 include/linux/kvm.h   |   50 ++-
 include/linux/kvm_para.h  |   73 

However things might get messy later on if that tree starts introducing changes
outside drivers/kvm.  There's a ton of activity in x86-world.

But if there are problems, you'll hear about it ;)



Re: [kvm-devel] [PATCH 4/5] KVM: cpu hotplug support

2007-01-30 Thread Andrew Morton
On Tue, 30 Jan 2007 14:56:16 -
Avi Kivity [EMAIL PROTECTED] wrote:

 +static void decache_vcpus_on_cpu(int cpu)
 +{
 +	struct kvm *vm;
 +	struct kvm_vcpu *vcpu;
 +	int i;
 +
 +	spin_lock(&kvm_lock);
 +	list_for_each_entry(vm, &vm_list, vm_list)
 +		for (i = 0; i < KVM_MAX_VCPUS; ++i) {
 +			vcpu = &vm->vcpus[i];
 +			/*
 +			 * If the vcpu is locked, then it is running on some
 +			 * other cpu and therefore it is not cached on the
 +			 * cpu in question.
 +			 *
 +			 * If it's not locked, check the last cpu it executed
 +			 * on.
 +			 */
 +			if (mutex_trylock(&vcpu->mutex)) {
 +				if (vcpu->cpu == cpu) {
 +					kvm_arch_ops->vcpu_decache(vcpu);
 +					vcpu->cpu = -1;
 +				}
 +				mutex_unlock(&vcpu->mutex);
 +			}
 +		}
 +	spin_unlock(&kvm_lock);
 +}


The trylock is unpleasing.  Perhaps kvm_lock should be a mutex or something?
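For reference, the decision logic in the quoted function can be modelled in single-threaded userspace C. This is a rough sketch only: struct toy_vcpu, toy_trylock() and the `locked` flag are hypothetical stand-ins for struct kvm_vcpu and its mutex, and nothing here is real locking.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_vcpu {
	bool locked;	/* stands in for the per-vcpu mutex being held */
	int cpu;	/* last physical cpu the vcpu ran on, -1 if none */
	int decached;	/* how many times its cached state was dropped */
};

/* Model of mutex_trylock(): succeeds only if nobody holds the lock. */
static bool toy_trylock(struct toy_vcpu *v)
{
	if (v->locked)
		return false;
	v->locked = true;
	return true;
}

static void toy_unlock(struct toy_vcpu *v)
{
	v->locked = false;
}

/* Drop cached state of every idle vcpu that last ran on @cpu. */
static int decache_on_cpu(struct toy_vcpu *vcpus, int n, int cpu)
{
	int i, flushed = 0;

	for (i = 0; i < n; i++) {
		struct toy_vcpu *v = &vcpus[i];

		if (!toy_trylock(v))
			continue;	/* held: running on another cpu */
		if (v->cpu == cpu) {
			v->decached++;
			v->cpu = -1;
			flushed++;
		}
		toy_unlock(v);
	}
	return flushed;
}

/* Self-check: only unlocked vcpus that last ran on cpu 2 are flushed. */
static void decache_self_check(void)
{
	struct toy_vcpu v[4] = {
		{ false, 2, 0 },	/* idle, last ran on cpu 2 */
		{ true,  0, 0 },	/* currently running on cpu 0 */
		{ false, 1, 0 },	/* idle, last ran on cpu 1 */
		{ false, 2, 0 },	/* idle, last ran on cpu 2 */
	};

	assert(decache_on_cpu(v, 4, 2) == 2);
	assert(v[0].cpu == -1 && v[3].cpu == -1);
	assert(v[1].cpu == 0 && v[1].locked);
	assert(v[2].cpu == 1 && v[2].decached == 0);
}
```

The model makes the dependency visible: skipping a held vcpu is only correct if "held" really implies "running on a different cpu", which is the assumption the trylock encodes.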




Re: [kvm-devel] [PATCH 0/33] KVM: MMU: Cache shadow page tables

2007-01-04 Thread Andrew Morton
On Thu, 04 Jan 2007 17:48:45 +0200
Avi Kivity [EMAIL PROTECTED] wrote:

 The current kvm shadow page table implementation does not cache shadow 
 page tables (except for global translations, used for kernel addresses) 
 across context switches.  This means that after a context switch, every 
 memory access will trap into the host.  After a while, the shadow page 
 tables will be rebuilt, and the guest can proceed at native speed until 
 the next context switch.
 
 The natural solution, then, is to cache shadow page tables across 
 context switches.  Unfortunately, this introduces a bucketload of problems:
 
 - the guest does not notify the processor (and hence kvm) that it 
 modifies a page table entry if it has reason to believe that the 
 modification will be followed by a tlb flush.  It becomes necessary to 
 write-protect guest page tables so that we can use the page fault when 
 the access occurs as a notification.
 - write protecting the guest page tables means we need to keep track of 
 which ptes map those guest page tables. We need to add reverse mapping 
 for all mapped writable guest pages.
 - when the guest does access the write-protected page, we need to allow 
 it to perform the write in some way.  We do that either by emulating the 
 write, or removing all shadow page tables for that page and allowing the 
 write to proceed, depending on circumstances.
 
 This patchset implements the ideas above.  While a lot of tuning remains 
 to be done (for example, a sane page replacement algorithm), a guest 
 running with this patchset applied is much faster and more responsive 
 than with 2.6.20-rc3.  Some preliminary benchmarks are available in 
 http://article.gmane.org/gmane.comp.emulators.kvm.devel/661.
 
 The patchset is bisectable compile-wise.

Is this intended for 2.6.20, or would you prefer that we release what we
have now and hold this off for 2.6.21?
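The reverse-mapping step in the quoted cover letter can be sketched as a toy model: each guest frame keeps a chain of the shadow ptes that map it, and write-protecting the frame means walking that chain. Everything below (struct spte, map_gfn(), write_protect_gfn(), the fixed-size pools) is a hypothetical illustration, not the data structures used by the patchset.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_GFN   8	/* toy guest: 8 guest frames */
#define MAX_SPTES 16	/* toy pool of shadow ptes */

#define SPTE_PRESENT  0x1u
#define SPTE_WRITABLE 0x2u

struct spte {
	unsigned int flags;
	struct spte *rmap_next;	/* next shadow pte mapping the same gfn */
};

static struct spte spte_pool[MAX_SPTES];
static size_t spte_used;
static struct spte *rmap_head[MAX_GFN];	/* gfn -> chain of shadow ptes */

/* Map @gfn through a new shadow pte and record it in the reverse map. */
static struct spte *map_gfn(unsigned int gfn, int writable)
{
	struct spte *s;

	if (gfn >= MAX_GFN || spte_used >= MAX_SPTES)
		return NULL;
	s = &spte_pool[spte_used++];
	s->flags = SPTE_PRESENT | (writable ? SPTE_WRITABLE : 0);
	s->rmap_next = rmap_head[gfn];
	rmap_head[gfn] = s;
	return s;
}

/* Walk the reverse map of @gfn and clear the writable bit everywhere. */
static int write_protect_gfn(unsigned int gfn)
{
	struct spte *s;
	int changed = 0;

	if (gfn >= MAX_GFN)
		return 0;
	for (s = rmap_head[gfn]; s; s = s->rmap_next) {
		if (s->flags & SPTE_WRITABLE) {
			s->flags &= ~SPTE_WRITABLE;
			changed++;
		}
	}
	return changed;
}

/* Self-check: protecting gfn 3 touches only its own writable mappings. */
static void rmap_self_check(void)
{
	struct spte *a = map_gfn(3, 1);
	struct spte *b = map_gfn(3, 1);
	struct spte *c = map_gfn(3, 0);
	struct spte *d = map_gfn(5, 1);

	assert(a && b && c && d);
	assert(write_protect_gfn(3) == 2);
	assert(!(a->flags & SPTE_WRITABLE) && !(b->flags & SPTE_WRITABLE));
	assert(d->flags & SPTE_WRITABLE);	/* other frames untouched */
	assert(write_protect_gfn(3) == 0);	/* already protected */
}
```

Once a guest page table frame is protected this way, a guest write to it faults and the fault serves as the notification the cover letter describes.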



Re: [kvm-devel] [PATCH 0/14] KVM: Kernel-based Virtual Machine (v4)

2006-11-07 Thread Andrew Morton
On Sun, 05 Nov 2006 22:27:45 +0200
Avi Kivity [EMAIL PROTECTED] wrote:

 The following patchset adds a driver for Intel's hardware
 virtualization extensions to the x86 architecture. 

kapow.

{standard input}: Assembler messages:
{standard input}:157: Error: no such instruction: `vmxon -20(%ebp)'
{standard input}:176: Error: no such instruction: `vmxoff'
{standard input}:191: Error: no such instruction: `vmread %eax,%eax'
{standard input}:403: Error: no such instruction: `vmwrite %edx,%eax'
{standard input}:409: Error: no such instruction: `vmread %eax,12(%esp)'
{standard input}:568: Error: no such instruction: `vmread %edx,%edx'
{standard input}:596: Error: no such instruction: `vmclear -12(%ebp)'
{standard input}:1885: Error: no such instruction: `vmread %eax,4(%esp)'
{standard input}:1908: Error: no such instruction: `vmread %edx,%edx'
{standard input}:1912: Error: no such instruction: `vmread %eax,%eax'
{standard input}:1919: Error: no such instruction: `vmread %eax,%edx'
{standard input}:1948: Error: no such instruction: `vmread %eax,%eax'
{standard input}:2148: Error: no such instruction: `vmread %eax,%eax'
{standard input}:2230: Error: no such instruction: `vmread %eax,%eax'
{standard input}:2249: Error: no such instruction: `vmread %edx,%edx'
{standard input}:2253: Error: no such instruction: `vmread %eax,%eax'
{standard input}:2259: Error: no such instruction: `vmread %edx,%edx'
{standard input}:2263: Error: no such instruction: `vmread %eax,%eax'
{standard input}:2334: Error: no such instruction: `vmread %eax,%eax'
{standard input}:2358: Error: no such instruction: `vmread %edx,%edx'
{standard input}:2362: Error: no such instruction: `vmread %eax,%eax'
{standard input}:2368: Error: no such instruction: `vmread %edx,%edx'
{standard input}:2372: Error: no such instruction: `vmread %eax,%eax'
{standard input}:2425: Error: no such instruction: `vmread %edx,%edx'
etcetera.


That's gas 2.16.1.  I assume it needs some super-new binutils.

I'm not sure what to do about this.  What's the minimum version?
