Re: [kvm-devel] [PATCH 1 of 9] Lock the entire mm to prevent any mmu related operation to happen

2008-04-25 Thread Andrea Arcangeli
I somehow missed this email in my inbox and only found it now because it
was strangely still unread... Sorry for the late reply!

On Tue, Apr 22, 2008 at 03:06:24PM +1000, Rusty Russell wrote:
 On Wednesday 09 April 2008 01:44:04 Andrea Arcangeli wrote:
  --- a/include/linux/mm.h
  +++ b/include/linux/mm.h
  @@ -1050,6 +1050,15 @@
 unsigned long addr, unsigned long len,
 unsigned long flags, struct page **pages);
 
  +struct mm_lock_data {
  +   spinlock_t **i_mmap_locks;
  +   spinlock_t **anon_vma_locks;
  +   unsigned long nr_i_mmap_locks;
  +   unsigned long nr_anon_vma_locks;
  +};
  +extern struct mm_lock_data *mm_lock(struct mm_struct * mm);
  +extern void mm_unlock(struct mm_struct *mm, struct mm_lock_data *data);
 
 As far as I can tell you don't actually need to expose this struct at all?

Yes, it should be possible to only expose 'struct mm_lock_data;'.
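
For what it's worth, that would boil down to something like this in
mm.h (a sketch only; the prototypes are the ones quoted above, with the
struct definition moved into the .c file):

        /* opaque to mm.h users; defined only in the .c file */
        struct mm_lock_data;

        extern struct mm_lock_data *mm_lock(struct mm_struct *mm);
        extern void mm_unlock(struct mm_struct *mm, struct mm_lock_data *data);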

  +   data->i_mmap_locks = vmalloc(nr_i_mmap_locks *
  +                                sizeof(spinlock_t));
 
 This is why non-typesafe allocators suck.  You want 'sizeof(spinlock_t *)' 
 here.
 
  +   data->anon_vma_locks = vmalloc(nr_anon_vma_locks *
  +                                  sizeof(spinlock_t));
 
 and here.

Great catch! (it was temporarily wasting some ram which isn't nice at all)
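
For reference, the corrected call sites look roughly like this (sketch
only, keeping the field and variable names of the quoted patch):

        /* arrays of pointers to spinlocks, hence sizeof(spinlock_t *) */
        data->i_mmap_locks = vmalloc(nr_i_mmap_locks *
                                     sizeof(spinlock_t *));
        data->anon_vma_locks = vmalloc(nr_anon_vma_locks *
                                       sizeof(spinlock_t *));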

  +   err = -EINTR;
  +   i_mmap_lock_last = NULL;
  +   nr_i_mmap_locks = 0;
  +   for (;;) {
  +           spinlock_t *i_mmap_lock = (spinlock_t *) -1UL;
  +           for (vma = mm->mmap; vma; vma = vma->vm_next) {
 ...
  +           data->i_mmap_locks[nr_i_mmap_locks++] = i_mmap_lock;
  +   }
  +   data->nr_i_mmap_locks = nr_i_mmap_locks;
 
 How about you track your running counter in data->nr_i_mmap_locks, leave
 nr_i_mmap_locks alone, and BUG_ON(data->nr_i_mmap_locks != nr_i_mmap_locks)?
 
 Even nicer would be to wrap this in a get_sorted_mmap_locks() function.

I'll try to clean this up further and I'll make a further update for review.
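
Something along those lines could look like this (a rough sketch only:
the helper name follows Rusty's suggestion, and the i_mmap lock source
(vma->vm_file->f_mapping->i_mmap_lock) and the selection loop are
assumptions based on the quoted fragment, not the final code):

static void get_sorted_i_mmap_locks(struct mm_struct *mm,
                                    struct mm_lock_data *data)
{
        struct vm_area_struct *vma;
        spinlock_t *last = NULL;

        data->nr_i_mmap_locks = 0;
        for (;;) {
                /* smallest i_mmap lock address greater than 'last' */
                spinlock_t *lock = (spinlock_t *) -1UL;

                for (vma = mm->mmap; vma; vma = vma->vm_next) {
                        spinlock_t *l;

                        if (!vma->vm_file || !vma->vm_file->f_mapping)
                                continue;
                        l = &vma->vm_file->f_mapping->i_mmap_lock;
                        if (l > last && l < lock)
                                lock = l;
                }
                if (lock == (spinlock_t *) -1UL)
                        break;
                data->i_mmap_locks[data->nr_i_mmap_locks++] = lock;
                last = lock;
        }
}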

 Unfortunately, I just don't think we can fail locking like this.  In your next
 patch unregistering a notifier can fail because of it: that's not usable.

Fortunately I figured out we don't really need mm_lock in unregister
because it's ok to unregister in the middle of the range_begin/end
critical section (that's definitely not ok for register, which is why
register needs mm_lock). And it's perfectly ok to fail in register().

Also it wasn't ok to unpin the module count in ->release, as ->release
needs to 'ret' to get back to the mmu notifier code. And without any
unregister at all, the module couldn't be unloaded at all, which
is quite unacceptable...

The logic is to prevent mmu_notifier_register from racing with
mmu_notifier_release: register takes the mm_users pin (implicit or
explicit), and mmput runs just after mmu_notifier_register
returns. Then _register serializes against all the mmu notifier
methods (except ->release) with srcu (->release can't run thanks to
the mm_users pin). The mmu_notifier_mm->lock then serializes the
modifications of the list (register vs unregister) and it ensures that
one and only one of _unregister and _release calls ->release before
_unregister returns. All other methods run freely under srcu. Having
the guarantee that ->release is called just before all pages are freed
or inside _unregister allows the module to zap and freeze its
secondary mmu inside ->release, with the race of exit() against
mmu_notifier_unregister handled internally by the mmu notifier code
and without depending on the exit_files/exit_mm ordering, regardless
of whether the fd of the driver sits in the file tables or only in a
vma. The mmu_notifier_mm can be reset to 0 only after the last mmdrop.
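
In driver terms, the registration side of this would look roughly like
the following (hypothetical driver code, only meant to illustrate the
mm_users pin held across mmu_notifier_register; my_ctx and
my_mmu_notifier_ops are made up names):

static int my_attach(struct my_ctx *ctx, struct task_struct *task)
{
        struct mm_struct *mm;
        int err;

        mm = get_task_mm(task);         /* explicit mm_users pin */
        if (!mm)
                return -ESRCH;

        ctx->mn.ops = &my_mmu_notifier_ops;
        err = mmu_notifier_register(&ctx->mn, mm);

        mmput(mm);      /* drop the pin right after _register returns */
        return err;
}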

About the mm_count refcounting for _release and _unregister: no mmu
notifier method, and not even mmu_notifier_unregister and _release
themselves, can cope with the mmu_notifier_mm list and srcu structures
going away out of order. exit_mmap is safe as it holds an mm_count pin
implicitly, because mmdrop is run after exit_mmap returns.
mmu_notifier_unregister is safe too as _register takes an mm_count
pin. We can't prevent mmu_notifier_mm from going away by holding
mm_users, as that would break the vma file descriptor closure that only
happens inside exit_mmap (a pinned mm_users prevents exit_mmap from
running, so it can only be taken temporarily, until _register returns).
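
A sketch of the refcount pairing described above (simplified; in 2.6.25
terms, where pinning mm_count is a plain atomic_inc):

        /* _register: pin mm_count so the mmu_notifier_mm and srcu
         * structures outlive the last mmput/exit_mmap */
        atomic_inc(&mm->mm_count);

        /* _unregister: drop the pin once ->release has been dealt
         * with; only after the last mmdrop can mmu_notifier_mm go */
        mmdrop(mm);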



Re: [kvm-devel] [PATCH 1 of 9] Lock the entire mm to prevent any mmu related operation to happen

2008-04-25 Thread Andrea Arcangeli
On Fri, Apr 25, 2008 at 06:56:39PM +0200, Andrea Arcangeli wrote:
   + data->i_mmap_locks = vmalloc(nr_i_mmap_locks *
   +                              sizeof(spinlock_t));
  
  This is why non-typesafe allocators suck.  You want 'sizeof(spinlock_t *)' 
  here.
  
   + data->anon_vma_locks = vmalloc(nr_anon_vma_locks *
   +                                sizeof(spinlock_t));
  
  and here.
 
 Great catch! (it was temporarily wasting some ram which isn't nice at all)

As I went into the editor I just found the above already fixed in
#v14-pre3. And I can't move the structure into the .c file anymore
without kmallocing it. Exposing that structure avoids the
ERR_PTR/PTR_ERR on the retvals and one kmalloc, so I think it makes the
code simpler in the end to keep it as it is now. I'd rather avoid
further changes to the 1/N patch as long as they don't make any
difference at runtime and as long as they involve more than
cut-and-pasting a structure from a .h to a .c file.
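
The trade-off, seen from the caller, is roughly the following
(illustrative signatures only, not necessarily the exact ones in
#v14-pre3):

/* (a) struct mm_lock_data exposed in mm.h: the caller owns the storage
 * and gets a plain error code back, no kmalloc of the struct itself */
static int lock_with_exposed_struct(struct mm_struct *mm)
{
        struct mm_lock_data data;
        int err;

        err = mm_lock(mm, &data);       /* illustrative signature */
        if (err)
                return err;
        /* ... */
        mm_unlock(mm, &data);
        return 0;
}

/* (b) struct hidden in the .c file: mm_lock() has to kmalloc it and
 * report failure through the returned pointer with ERR_PTR/PTR_ERR */
static int lock_with_hidden_struct(struct mm_struct *mm)
{
        struct mm_lock_data *data;

        data = mm_lock(mm);
        if (IS_ERR(data))
                return PTR_ERR(data);
        /* ... */
        mm_unlock(mm, data);
        return 0;
}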



Re: [kvm-devel] [PATCH 1 of 9] Lock the entire mm to prevent any mmu related operation to happen

2008-04-25 Thread Robin Holt
On Fri, Apr 25, 2008 at 06:56:40PM +0200, Andrea Arcangeli wrote:
 Fortunately I figured out we don't really need mm_lock in unregister
 because it's ok to unregister in the middle of the range_begin/end
 critical section (that's definitely not ok for register that's why
 register needs mm_lock). And it's perfectly ok to fail in register().

I think you still need mm_lock (unless I am missing something).  What happens
when one callout is scanning mmu_notifier_invalidate_range_start() and
you unlink?  The list next pointer gets set to LIST_POISON1, which is a really
bad address for the processor to track.

Maybe I misunderstood your description.

Thanks,
Robin



Re: [kvm-devel] [PATCH 1 of 9] Lock the entire mm to prevent any mmu related operation to happen

2008-04-25 Thread Andrea Arcangeli
On Fri, Apr 25, 2008 at 02:25:32PM -0500, Robin Holt wrote:
 I think you still need mm_lock (unless I am missing something).  What happens
 when one callout is scanning mmu_notifier_invalidate_range_start() and
 you unlink?  The list next pointer gets set to LIST_POISON1, which is a really
 bad address for the processor to track.

Ok, _release's list_del_init can't race with that because it happens in
exit_mmap when no other mmu notifier can trigger anymore.

_unregister can run concurrently but it does list_del_rcu, which only
overwrites the pprev pointer with LIST_POISON2. So
mmu_notifier_invalidate_range_start won't crash on LIST_POISON1, thanks
to srcu.
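
The distinction relied on here is how the deletion primitives poison
the entry (paraphrasing include/linux/list.h of that era):

        /* list_del(): both links poisoned; a reader that follows
         * ->next afterwards faults on LIST_POISON1 (0x00100100) */
        entry->next = LIST_POISON1;
        entry->prev = LIST_POISON2;

        /* list_del_rcu(): only ->prev is poisoned; ->next stays valid,
         * so readers inside the srcu read-side section keep walking
         * safely until the grace period ends */
        entry->prev = LIST_POISON2;

        /* list_del_init(): no poisoning at all, the entry is simply
         * reinitialized to an empty list */
        INIT_LIST_HEAD(entry);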

Actually I did more changes than necessary; for example I noticed that
mmu_notifier_register can use a plain hlist_add_head instead of
hlist_add_head_rcu. _register can't race against _release thanks to the
mm_users temporary or implicit pin. _register can't race against
_unregister thanks to the mmu_notifier_mm->lock. And register can't
race against all the other mmu notifier methods thanks to mm_lock.

At this time I've no other pending patches on top of v14-pre3 other
than the below micro-optimizing cleanup. It'd be great to have
confirmation that v14-pre3 passes the GRU/XPMEM regression tests as
well, as my KVM testing already passed successfully on it. I'll forward
v14-pre3 mmu-notifier-core plus the below to Andrew tomorrow, I'm
trying to be optimistic here! ;)

diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
--- a/mm/mmu_notifier.c
+++ b/mm/mmu_notifier.c
@@ -187,7 +187,7 @@ int mmu_notifier_register(struct mmu_not
         * current->mm or explicitly with get_task_mm() or similar).
         */
        spin_lock(&mm->mmu_notifier_mm->lock);
-       hlist_add_head_rcu(&mn->hlist, &mm->mmu_notifier_mm->list);
+       hlist_add_head(&mn->hlist, &mm->mmu_notifier_mm->list);
        spin_unlock(&mm->mmu_notifier_mm->lock);
 out_unlock:
        mm_unlock(mm, data);



Re: [kvm-devel] [PATCH 1 of 9] Lock the entire mm to prevent any mmu related operation to happen

2008-04-17 Thread Robin Holt
On Wed, Apr 16, 2008 at 12:15:08PM -0700, Christoph Lameter wrote:
 On Wed, 16 Apr 2008, Robin Holt wrote:
 
  On Wed, Apr 16, 2008 at 11:35:38AM -0700, Christoph Lameter wrote:
   On Wed, 16 Apr 2008, Robin Holt wrote:
   
I don't think this lock mechanism is completely working.  I have
gotten a few failures trying to dereference 0x100100 which appears to
be LIST_POISON1.
   
   How does xpmem unregistering of notifiers work?
  
  For the tests I have been running, we are waiting for the release
  callout as part of exit.
 
 Some more details on the failure may be useful. AFAICT list_del[_rcu] is 
 the culprit here and that is only used on release or unregister.

I think I have this understood now.  It happens quite quickly (within
10 minutes) on a 128-rank job with a small data set run in a loop.

In these failing jobs, all the ranks are nearly symmetric.  There is
a certain part of each rank's address space that has access granted.
All the ranks have included all the other ranks, including themselves,
in exactly the same layout at exactly the same virtual address.

Rank 3 has hit _release and is beginning to clean up, but has not deleted
the notifier from its list.

Rank 9 calls the xpmem_invalidate_page() callout.  That page was attached
by rank 3 so we call zap_page_range on rank 3 which then calls back into
xpmem's invalidate_range_start callout.

The rank 3 _release callout begins and deletes its notifier from the list.

Rank 9's call to rank 3's zap_page_range notifier returns and dereferences
LIST_POISON1.

I often confuse myself while trying to explain these so please kick me
where the holes in the flow appear.  The console output from the simple
debugging stuff I put in is a bit overwhelming.


I am trying to figure out now which locks we hold as part of the zap
callout that should have prevented the _release callout.

Thanks,
Robin



Re: [kvm-devel] [PATCH 1 of 9] Lock the entire mm to prevent any mmu related operation to happen

2008-04-17 Thread Andrea Arcangeli
On Wed, Apr 16, 2008 at 11:35:38AM -0700, Christoph Lameter wrote:
 On Wed, 16 Apr 2008, Robin Holt wrote:
 
  I don't think this lock mechanism is completely working.  I have
  gotten a few failures trying to dereference 0x100100 which appears to
  be LIST_POISON1.
 
 How does xpmem unregistering of notifiers work?

Especially are you using mmu_notifier_unregister?



Re: [kvm-devel] [PATCH 1 of 9] Lock the entire mm to prevent any mmu related operation to happen

2008-04-17 Thread Robin Holt
On Thu, Apr 17, 2008 at 05:51:57PM +0200, Andrea Arcangeli wrote:
 On Wed, Apr 16, 2008 at 11:35:38AM -0700, Christoph Lameter wrote:
  On Wed, 16 Apr 2008, Robin Holt wrote:
  
   I don't think this lock mechanism is completely working.  I have
   gotten a few failures trying to dereference 0x100100 which appears to
   be LIST_POISON1.
  
  How does xpmem unregistering of notifiers work?
 
 Especially are you using mmu_notifier_unregister?

In this case, we are not making the call to unregister, we are waiting
for the _release callout which has already removed it from the list.

In the event that the user has removed all the grants, we use unregister.
That typically does not occur.  We merely wait for exit processing to
clean up the structures.
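
As a sketch, this pattern (tear everything down from the release
callout at exit) takes roughly the following shape in a driver; the
names are hypothetical, this is not actual xpmem code:

static void my_mmu_notifier_release(struct mmu_notifier *mn,
                                    struct mm_struct *mm)
{
        struct my_tg *tg = container_of(mn, struct my_tg, mn);

        /*
         * exit_mmap() is tearing the mm down: zap the granted
         * segments now and stop touching this mm afterwards.
         */
        my_remove_all_segs(tg);
}

static const struct mmu_notifier_ops my_notifier_ops = {
        .release = my_mmu_notifier_release,
        /* .invalidate_range_start, .invalidate_range_end, ... */
};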

Thanks,
Robin



Re: [kvm-devel] [PATCH 1 of 9] Lock the entire mm to prevent any mmu related operation to happen

2008-04-17 Thread Andrea Arcangeli
On Thu, Apr 17, 2008 at 11:36:42AM -0500, Robin Holt wrote:
 In this case, we are not making the call to unregister, we are waiting
 for the _release callout which has already removed it from the list.
 
 In the event that the user has removed all the grants, we use unregister.
 That typically does not occur.  We merely wait for exit processing to
 clean up the structures.

Then it's very strange. LIST_POISON1 is set in n->next. If it was a
second hlist_del triggering the bug, in theory LIST_POISON2 should
trigger first, so perhaps it's really a notifier running despite
mm_lock being taken? Could you post a full stack trace so I can see
who's running into LIST_POISON1? If it's really a notifier running
outside of some mm_lock, that will be _immediately_ visible from the
stack trace that triggered the LIST_POISON1!

Also note, EMM isn't using the clean hlist_del, it's implementing the
list by hand (with zero runtime gain), so all the list debugging may
not exist in EMM; if it's really a mm_lock race and it only triggers
with mmu notifiers and not with EMM, it doesn't necessarily mean EMM
is bug free. If you have a full stack trace it would greatly help to
verify what is mangling the list when the oops triggers.

Thanks!
Andrea



Re: [kvm-devel] [PATCH 1 of 9] Lock the entire mm to prevent any mmu related operation to happen

2008-04-17 Thread Robin Holt
On Thu, Apr 17, 2008 at 07:14:43PM +0200, Andrea Arcangeli wrote:
 On Thu, Apr 17, 2008 at 11:36:42AM -0500, Robin Holt wrote:
  In this case, we are not making the call to unregister, we are waiting
  for the _release callout which has already removed it from the list.
  
  In the event that the user has removed all the grants, we use unregister.
  That typically does not occur.  We merely wait for exit processing to
  clean up the structures.
 
 Then it's very strange. LIST_POISON1 is set in n->next. If it was a
 second hlist_del triggering the bug, in theory LIST_POISON2 should
 trigger first, so perhaps it's really a notifier running despite
 mm_lock being taken? Could you post a full stack trace so I can see
 who's running into LIST_POISON1? If it's really a notifier running
 outside of some mm_lock, that will be _immediately_ visible from the
 stack trace that triggered the LIST_POISON1!
 
 Also note, EMM isn't using the clean hlist_del, it's implementing the
 list by hand (with zero runtime gain), so all the list debugging may
 not exist in EMM; if it's really a mm_lock race and it only triggers
 with mmu notifiers and not with EMM, it doesn't necessarily mean EMM
 is bug free. If you have a full stack trace it would greatly help to
 verify what is mangling the list when the oops triggers.

The stack trace is below.  I did not do this level of testing on emm so
I cannot compare the two in this area.

This is for a different, but equivalent, failure.  I just reproduced the
LIST_POISON1 failure without trying to reproduce the exact same failure
as I had documented earlier (lost that stack trace, sorry).

Thanks,
Robin


1Unable to handle kernel paging request at virtual address 00100100
4mpi006.f.x[23403]: Oops 11012296146944 [1]
4Modules linked in: nfs lockd sunrpc binfmt_misc thermal processor fan button 
loop md_mod dm_mod xpmem xp mspec sg
4
4Pid: 23403, CPU 114, comm:   mpi006.f.x
4psr : 121008526010 ifs : 838b ip  : [a0010015d6a1]
Not tainted (2.6.25-rc8)
4ip is at __mmu_notifier_invalidate_range_start+0x81/0x120
4unat:  pfs : 038b rsc : 0003
4rnat: a00100149a00 bsps: a0010740 pr  : 66555666a9599aa9
4ldrs:  ccv :  fpsr: 0009804c0270033f
4csd :  ssd : 
4b0  : a0010015d670 b6  : a002101ddb40 b7  : a001eb50
4f6  : 1003e f7  : 0
4f8  : 0 f9  : 0
4f10 : 0 f11 : 0
4r1  : a00100ef1190 r2  : ee6080cc1940 r3  : a002101edd10
4r8  : ee6080cc1970 r9  :  r10 : ee6080cc19c8
4r11 : 2003a648 r12 : ec60d31efb90 r13 : ec60d31e
4r14 : 004d r15 : ee6080cc1914 r16 : ee6080cc1970
4r17 : 2003a648 r18 : 2007bf90 r19 : 0004
4r20 : ec60d31e r21 : 0010 r22 : ee6080cc19a8
4r23 : ec60c55f1120 r24 : ec60d31efda0 r25 : ec60d31efd98
4r26 : ee60812166d0 r27 : ec60d31efdc0 r28 : ec60d31efdb8
4r29 : ec60d31e0b60 r30 :  r31 : 0081
4
4Call Trace:
4 [a00100014a20] show_stack+0x40/0xa0
4sp=ec60d31ef760 bsp=ec60d31e11f0
4 [a00100015330] show_regs+0x850/0x8a0
4sp=ec60d31ef930 bsp=ec60d31e1198
4 [a00100035ed0] die+0x1b0/0x2e0
4sp=ec60d31ef930 bsp=ec60d31e1150
4 [a00100060e90] ia64_do_page_fault+0x8d0/0xa40
4sp=ec60d31ef930 bsp=ec60d31e1100
4 [a001ab00] ia64_leave_kernel+0x0/0x270
4sp=ec60d31ef9c0 bsp=ec60d31e1100
4 [a0010015d6a0] __mmu_notifier_invalidate_range_start+0x80/0x120
4sp=ec60d31efb90 bsp=ec60d31e10a8
4 [a0010011b1d0] unmap_vmas+0x70/0x14c0
4sp=ec60d31efb90 bsp=ec60d31e0fa8
4 [a0010011c660] zap_page_range+0x40/0x60
4sp=ec60d31efda0 bsp=ec60d31e0f70
4 [a002101d62d0] xpmem_clear_PTEs+0x350/0x560 [xpmem]
4sp=ec60d31efdb0 bsp=ec60d31e0ef0
4 [a002101d1e30] xpmem_remove_seg+0x3f0/0x700 [xpmem]
4sp=ec60d31efde0 bsp=ec60d31e0ea8
4 [a002101d2500] xpmem_remove_segs_of_tg+0x80/0x140 [xpmem]
4sp=ec60d31efe10 bsp=ec60d31e0e78
4 [a002101dda40] xpmem_mmu_notifier_release+0x40/0x80 [xpmem]
4sp=ec60d31efe10 bsp=ec60d31e0e58
4 [a0010015d7f0] __mmu_notifier_release+0xb0/0x100
4sp=ec60d31efe10 bsp=ec60d31e0e38
4 [a00100124430] exit_mmap+0x50/0x180
4sp=ec60d31efe10 

Re: [kvm-devel] [PATCH 1 of 9] Lock the entire mm to prevent any mmu related operation to happen

2008-04-17 Thread Christoph Lameter
On Thu, 17 Apr 2008, Andrea Arcangeli wrote:

 Also note, EMM isn't using the clean hlist_del, it's implementing the
 list by hand (with zero runtime gain), so all the list debugging may
 not exist in EMM; if it's really a mm_lock race and it only triggers
 with mmu notifiers and not with EMM, it doesn't necessarily mean EMM
 is bug free. If you have a full stack trace it would greatly help to
 verify what is mangling the list when the oops triggers.

EMM was/is using a singly linked list which allows atomic updates. Looked 
cleaner to me since a doubly linked list must update two pointers.

I have not seen docs on the locking, so I'm not sure why you use rcu 
operations here. Isn't the requirement to have either the rmap locks or 
mmap_sem held enough to guarantee the consistency of the doubly linked list?





Re: [kvm-devel] [PATCH 1 of 9] Lock the entire mm to prevent any mmu related operation to happen

2008-04-17 Thread Andrea Arcangeli
On Thu, Apr 17, 2008 at 12:10:52PM -0700, Christoph Lameter wrote:
 EMM was/is using a singly linked list which allows atomic updates. Looked 
 cleaner to me since a doubly linked list must update two pointers.

It would be cleaner if it provided an abstraction in list.h. What
matters is the memory taken by the list head for this usage.

 I have not seen docs on the locking, so I'm not sure why you use rcu 
 operations here. Isn't the requirement to have either the rmap locks or 
 mmap_sem held enough to guarantee the consistency of the doubly linked list?

Yes, exactly, I'm not using rcu anymore.



Re: [kvm-devel] [PATCH 1 of 9] Lock the entire mm to prevent any mmu related operation to happen

2008-04-16 Thread Robin Holt

I don't think this lock mechanism is completely working.  I have
gotten a few failures trying to dereference 0x100100 which appears to
be LIST_POISON1.

Thanks,
Robin



Re: [kvm-devel] [PATCH 1 of 9] Lock the entire mm to prevent any mmu related operation to happen

2008-04-16 Thread Robin Holt
On Wed, Apr 16, 2008 at 11:35:38AM -0700, Christoph Lameter wrote:
 On Wed, 16 Apr 2008, Robin Holt wrote:
 
  I don't think this lock mechanism is completely working.  I have
  gotten a few failures trying to dereference 0x100100 which appears to
  be LIST_POISON1.
 
 How does xpmem unregistering of notifiers work?

For the tests I have been running, we are waiting for the release
callout as part of exit.

Thanks,
Robin



Re: [kvm-devel] [PATCH 1 of 9] Lock the entire mm to prevent any mmu related operation to happen

2008-04-16 Thread Christoph Lameter
On Wed, 16 Apr 2008, Robin Holt wrote:

 I don't think this lock mechanism is completely working.  I have
 gotten a few failures trying to dereference 0x100100 which appears to
 be LIST_POISON1.

How does xpmem unregistering of notifiers work?



Re: [kvm-devel] [PATCH 1 of 9] Lock the entire mm to prevent any mmu related operation to happen

2008-04-16 Thread Christoph Lameter
On Wed, 16 Apr 2008, Robin Holt wrote:

 On Wed, Apr 16, 2008 at 11:35:38AM -0700, Christoph Lameter wrote:
  On Wed, 16 Apr 2008, Robin Holt wrote:
  
   I don't think this lock mechanism is completely working.  I have
   gotten a few failures trying to dereference 0x100100 which appears to
   be LIST_POISON1.
  
  How does xpmem unregistering of notifiers work?
 
 For the tests I have been running, we are waiting for the release
 callout as part of exit.

Some more details on the failure may be useful. AFAICT list_del[_rcu] is 
the culprit here and that is only used on release or unregister.


