Re: [kvm-devel] [PATCH 1 of 9] Lock the entire mm to prevent any mmu related operation to happen
I somehow missed this email in my inbox and found it only now because it
was, strangely, still unread... Sorry for the late reply!

On Tue, Apr 22, 2008 at 03:06:24PM +1000, Rusty Russell wrote:
> On Wednesday 09 April 2008 01:44:04 Andrea Arcangeli wrote:
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -1050,6 +1050,15 @@
> >  		unsigned long addr, unsigned long len,
> >  		unsigned long flags, struct page **pages);
> >
> > +struct mm_lock_data {
> > +	spinlock_t **i_mmap_locks;
> > +	spinlock_t **anon_vma_locks;
> > +	unsigned long nr_i_mmap_locks;
> > +	unsigned long nr_anon_vma_locks;
> > +};
> > +extern struct mm_lock_data *mm_lock(struct mm_struct * mm);
> > +extern void mm_unlock(struct mm_struct *mm, struct mm_lock_data *data);
>
> As far as I can tell you don't actually need to expose this struct at all?

Yes, it should be possible to only expose 'struct mm_lock_data;'.

> > +	data->i_mmap_locks = vmalloc(nr_i_mmap_locks *
> > +				     sizeof(spinlock_t));
>
> This is why non-typesafe allocators suck. You want 'sizeof(spinlock_t *)'
> here.
>
> > +	data->anon_vma_locks = vmalloc(nr_anon_vma_locks *
> > +				       sizeof(spinlock_t));
>
> and here.

Great catch! (it was temporarily wasting some ram which isn't nice at all)

> > +	err = -EINTR;
> > +	i_mmap_lock_last = NULL;
> > +	nr_i_mmap_locks = 0;
> > +	for (;;) {
> > +		spinlock_t *i_mmap_lock = (spinlock_t *) -1UL;
> > +		for (vma = mm->mmap; vma; vma = vma->vm_next) {
> ...
> > +		data->i_mmap_locks[nr_i_mmap_locks++] = i_mmap_lock;
> > +	}
> > +	data->nr_i_mmap_locks = nr_i_mmap_locks;
>
> How about you track your running counter in data->nr_i_mmap_locks, leave
> nr_i_mmap_locks alone, and BUG_ON(data->nr_i_mmap_locks != nr_i_mmap_locks)?
>
> Even nicer would be to wrap this in a get_sorted_mmap_locks() function.

I'll try to clean this up further and I'll make a further update for review.

> Unfortunately, I just don't think we can fail locking like this. In your
> next patch unregistering a notifier can fail because of it: that's not
> usable.

Fortunately I figured out we don't really need mm_lock in unregister,
because it's ok to unregister in the middle of the range_begin/end
critical section (that's definitely not ok for register, which is why
register needs mm_lock). And it's perfectly ok to fail in register().

Also it wasn't ok to unpin the module count in ->release, as ->release
needs to 'ret' to get back to the mmu notifier code. And without any
unregister at all, the module can't be unloaded at all, which is quite
unacceptable...

The logic is that mmu_notifier_register can't race with
mmu_notifier_release because it takes the mm_users pin (implicit or
explicit, with mmput run just after mmu_notifier_register returns). Then
_register serializes against all the mmu notifier methods (except
->release) with srcu (->release can't run thanks to the mm_users pin).
The mmu_notifier_mm->lock then serializes the modification of the list
(register vs unregister) and it ensures that one and only one of
_unregister and _release calls ->release before _unregister returns. All
other methods run freely with srcu.

Having the guarantee that ->release is called just before all pages are
freed, or inside _unregister, allows the module to zap and freeze its
secondary mmu inside ->release, with the race condition of exit() against
mmu_notifier_unregister handled internally by the mmu notifier code, and
without any dependency on the exit_files/exit_mm ordering, which varies
depending on whether the fd of the driver is open in the file tables or
in the vma only.

The mmu_notifier_mm can be reset to 0 only after the last mmdrop.

About the mm_count refcounting for _release and _unregister: no mmu
notifier, and not even mmu_notifier_unregister and _release, can cope
with the mmu_notifier_mm list and srcu structures going away out of
order. exit_mmap is safe as it holds an mm_count implicitly, because
mmdrop is run after exit_mmap returns. mmu_notifier_unregister is safe
too, as _register takes the mm_count pin.

We can't prevent mmu_notifier_mm from going away with mm_users, as that
would screw up the vma file descriptor closure that only happens inside
exit_mmap (a pinned mm_users prevents exit_mmap from running, so it can
only be taken temporarily until _register returns).

-
This SF.net email is sponsored by the 2008 JavaOne(SM) Conference
Don't miss this year's exciting event. There's still time to save $100.
Use priority code J8TL2D2.
http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH 1 of 9] Lock the entire mm to prevent any mmu related operation to happen
On Fri, Apr 25, 2008 at 06:56:39PM +0200, Andrea Arcangeli wrote:
> > > +	data->i_mmap_locks = vmalloc(nr_i_mmap_locks *
> > > +				     sizeof(spinlock_t));
> >
> > This is why non-typesafe allocators suck. You want
> > 'sizeof(spinlock_t *)' here.
> >
> > > +	data->anon_vma_locks = vmalloc(nr_anon_vma_locks *
> > > +				       sizeof(spinlock_t));
> >
> > and here.
>
> Great catch! (it was temporarily wasting some ram which isn't nice at all)

As I went into the editor I found the above was already fixed in
#v14-pre3. And I can't move the structure into the .c file anymore
without kmallocing it. Exposing that structure avoids the
ERR_PTR/PTR_ERR on the retvals and one kmalloc, so I think it makes the
code simpler in the end to keep it as it is now. I'd rather avoid further
changes to the 1/N patch, as long as they don't make any difference at
runtime and as long as they involve more than cut-and-pasting a structure
from the .h to the .c file.
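
The sizing slip discussed in this message is easy to reproduce in
userland. The following is a minimal sketch, not the patch's code:
`fake_spinlock_t`, `lock_array_bytes()` and `alloc_lock_ptrs()` are
hypothetical names, and `calloc` stands in for `vmalloc`. It sizes the
array from the pointed-to element, `sizeof(*locks)`, so the element type
cannot be mistyped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for the kernel's spinlock_t; deliberately
 * larger than a pointer so that the sizing mistake would be visible.
 */
typedef struct {
	long pad[8];
} fake_spinlock_t;

/* Bytes needed for an array of n lock pointers (not n locks). */
static size_t lock_array_bytes(size_t n)
{
	fake_spinlock_t **locks = NULL;

	/* sizeof does not evaluate its operand, so *locks is safe here. */
	return n * sizeof(*locks);
}

/* Allocate a zeroed array of n lock pointers; returns NULL on failure. */
static fake_spinlock_t **alloc_lock_ptrs(size_t n)
{
	fake_spinlock_t **locks;

	assert(n > 0);
	locks = calloc(n, sizeof(*locks));
	return locks;
}
```

With `sizeof(fake_spinlock_t)` in place of `sizeof(*locks)` the
allocation would still work, but it would reserve more memory than the
pointer array needs, which is the waste referred to in the reply.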
Re: [kvm-devel] [PATCH 1 of 9] Lock the entire mm to prevent any mmu related operation to happen
On Fri, Apr 25, 2008 at 06:56:40PM +0200, Andrea Arcangeli wrote:
> Fortunately I figured out we don't really need mm_lock in unregister,
> because it's ok to unregister in the middle of the range_begin/end
> critical section (that's definitely not ok for register, which is why
> register needs mm_lock). And it's perfectly ok to fail in register().

I think you still need mm_lock (unless I miss something). What happens
when one callout is scanning mmu_notifier_invalidate_range_start() and
you unlink? The unlinked entry's next pointer is then LIST_POISON1, which
is a really bad address for the processor to follow.

Maybe I misunderstood your description.

Thanks,
Robin
Re: [kvm-devel] [PATCH 1 of 9] Lock the entire mm to prevent any mmu related operation to happen
On Fri, Apr 25, 2008 at 02:25:32PM -0500, Robin Holt wrote:
> I think you still need mm_lock (unless I miss something). What happens
> when one callout is scanning mmu_notifier_invalidate_range_start() and
> you unlink? The unlinked entry's next pointer is then LIST_POISON1, which
> is a really bad address for the processor to follow.

Ok, the _release list_del_init can't race with that because it happens in
exit_mmap, when no other mmu notifier can trigger anymore.

_unregister can run concurrently, but it does list_del_rcu, which only
overwrites the pprev pointer with LIST_POISON2. The
mmu_notifier_invalidate_range_start won't crash on LIST_POISON1 thanks to
srcu.

Actually I made more changes than necessary; for example, I noticed that
mmu_notifier_register can use hlist_add_head instead of
hlist_add_head_rcu. _register can't race against _release thanks to the
mm_users temporary or implicit pin. _register can't race against
_unregister thanks to the mmu_notifier_mm->lock. And register can't race
against all other mmu notifiers thanks to the mm_lock.

At this time I have no other pending patches on top of v14-pre3 other
than the micro-optimizing cleanup below. It'd be great to have
confirmation that v14-pre3 passes the GRU/XPMEM regression tests, as my
KVM testing already passed successfully on it. I'll forward v14-pre3
mmu-notifier-core plus the below to Andrew tomorrow, I'm trying to be
optimistic here! ;)

diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
--- a/mm/mmu_notifier.c
+++ b/mm/mmu_notifier.c
@@ -187,7 +187,7 @@ int mmu_notifier_register(struct mmu_not
 	 * current->mm or explicitly with get_task_mm() or similar).
 	 */
 	spin_lock(&mm->mmu_notifier_mm->lock);
-	hlist_add_head_rcu(&mn->hlist, &mm->mmu_notifier_mm->list);
+	hlist_add_head(&mn->hlist, &mm->mmu_notifier_mm->list);
 	spin_unlock(&mm->mmu_notifier_mm->lock);
 out_unlock:
 	mm_unlock(mm, data);
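
The difference between the two delete variants mentioned in this message
can be seen in a small userland sketch. It re-implements a minimal hlist
for illustration only; `hlist_unlink()` and `walker_next()` are
hypothetical helper names, and the poison constants are the values
list.h used at the time (0x100100 matches the oops address reported in
this thread). A plain delete poisons both pointers of the removed node,
while the RCU-style delete leaves `next` intact, so a walker that is
already standing on the removed node can still move on. In the kernel
that is only safe because the node is not freed until readers are done,
which is the role srcu plays here.

```c
#include <assert.h>
#include <stddef.h>

#define LIST_POISON1 ((void *)0x00100100)
#define LIST_POISON2 ((void *)0x00200200)

struct hlist_node {
	struct hlist_node *next, **pprev;
};

struct hlist_head {
	struct hlist_node *first;
};

static void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
{
	n->next = h->first;
	if (h->first)
		h->first->pprev = &n->next;
	h->first = n;
	n->pprev = &h->first;
}

/* Unlink n from its neighbours without touching n's own pointers. */
static void hlist_unlink(struct hlist_node *n)
{
	*n->pprev = n->next;
	if (n->next)
		n->next->pprev = n->pprev;
}

/* Plain delete: poisons both pointers of the removed node. */
static void hlist_del(struct hlist_node *n)
{
	hlist_unlink(n);
	n->next = LIST_POISON1;
	n->pprev = LIST_POISON2;
}

/* RCU-style delete: n->next stays valid for walkers already on n. */
static void hlist_del_rcu(struct hlist_node *n)
{
	hlist_unlink(n);
	n->pprev = LIST_POISON2;
}

/* What a walker that currently sits on pos would load next. */
static struct hlist_node *walker_next(const struct hlist_node *pos)
{
	assert(pos != NULL);
	return pos->next;
}
```

After `hlist_del_rcu()` on a middle node, `walker_next()` on that node
still yields its old successor; after `hlist_del()`, it yields
LIST_POISON1, which is the address a concurrent walker would fault on.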
Re: [kvm-devel] [PATCH 1 of 9] Lock the entire mm to prevent any mmu related operation to happen
On Wed, Apr 16, 2008 at 12:15:08PM -0700, Christoph Lameter wrote:
> On Wed, 16 Apr 2008, Robin Holt wrote:
> > On Wed, Apr 16, 2008 at 11:35:38AM -0700, Christoph Lameter wrote:
> > > On Wed, 16 Apr 2008, Robin Holt wrote:
> > > > I don't think this lock mechanism is completely working. I have
> > > > gotten a few failures trying to dereference 0x100100 which appears
> > > > to be LIST_POISON1.
> > >
> > > How does xpmem unregistering of notifiers work?
> >
> > For the tests I have been running, we are waiting for the release
> > callout as part of exit.
>
> Some more details on the failure may be useful. AFAICT list_del[_rcu] is
> the culprit here and that is only used on release or unregister.

I think I have this understood now. It happens quite quickly (within 10
minutes) on a 128 rank job with a small data set, run in a loop.

In these failing jobs, all the ranks are nearly symmetric. There is a
certain part of each rank's address space that has access granted. All
the ranks have included all the other ranks, including themselves, in
exactly the same layout at exactly the same virtual address.

Rank 3 has hit _release and is beginning to clean up, but has not deleted
the notifier from its list.

Rank 9 calls the xpmem_invalidate_page() callout. That page was attached
by rank 3, so we call zap_page_range on rank 3, which then calls back
into xpmem's invalidate_range_start callout.

The rank 3 _release callout begins and deletes its notifier from the
list.

Rank 9's call to rank 3's zap_page_range notifier returns and
dereferences LIST_POISON1.

I often confuse myself while trying to explain these, so please kick me
where the holes in the flow appear. The console output from the simple
debugging stuff I put in is a bit overwhelming.

I am trying to figure out now which locks we hold as part of the zap
callout that should have prevented the _release callout.

Thanks,
Robin
Re: [kvm-devel] [PATCH 1 of 9] Lock the entire mm to prevent any mmu related operation to happen
On Wed, Apr 16, 2008 at 11:35:38AM -0700, Christoph Lameter wrote:
> On Wed, 16 Apr 2008, Robin Holt wrote:
> > I don't think this lock mechanism is completely working. I have
> > gotten a few failures trying to dereference 0x100100 which appears to
> > be LIST_POISON1.
>
> How does xpmem unregistering of notifiers work?

Especially, are you using mmu_notifier_unregister?
Re: [kvm-devel] [PATCH 1 of 9] Lock the entire mm to prevent any mmu related operation to happen
On Thu, Apr 17, 2008 at 05:51:57PM +0200, Andrea Arcangeli wrote:
> On Wed, Apr 16, 2008 at 11:35:38AM -0700, Christoph Lameter wrote:
> > On Wed, 16 Apr 2008, Robin Holt wrote:
> > > I don't think this lock mechanism is completely working. I have
> > > gotten a few failures trying to dereference 0x100100 which appears
> > > to be LIST_POISON1.
> >
> > How does xpmem unregistering of notifiers work?
>
> Especially, are you using mmu_notifier_unregister?

In this case, we are not making the call to unregister, we are waiting
for the _release callout which has already removed it from the list.

In the event that the user has removed all the grants, we use unregister.
That typically does not occur. We merely wait for exit processing to
clean up the structures.

Thanks,
Robin
Re: [kvm-devel] [PATCH 1 of 9] Lock the entire mm to prevent any mmu related operation to happen
On Thu, Apr 17, 2008 at 11:36:42AM -0500, Robin Holt wrote:
> In this case, we are not making the call to unregister, we are waiting
> for the _release callout which has already removed it from the list.
>
> In the event that the user has removed all the grants, we use unregister.
> That typically does not occur. We merely wait for exit processing to
> clean up the structures.

Then it's very strange. LIST_POISON1 is set in n->next. If it was a
second hlist_del triggering the bug, in theory LIST_POISON2 should
trigger first, so perhaps it's really a notifier running despite mm_lock
being taken? Could you post a full stack trace so I can see who's running
into LIST_POISON1? If it's really a notifier running outside of some
mm_lock, that will be _immediately_ visible from the stack trace that
triggered the LIST_POISON1!

Also note, EMM isn't using the clean hlist_del, it's implementing the
list by hand (with zero runtime gain), so all the debugging may not exist
in EMM. So if it's really an mm_lock race, and it only triggers with mmu
notifiers and not with EMM, it doesn't necessarily mean EMM is bug free.
If you have a full stack trace it would greatly help to verify what is
mangling the list when the oops triggers.

Thanks!
Andrea
Re: [kvm-devel] [PATCH 1 of 9] Lock the entire mm to prevent any mmu related operation to happen
On Thu, Apr 17, 2008 at 07:14:43PM +0200, Andrea Arcangeli wrote:
> On Thu, Apr 17, 2008 at 11:36:42AM -0500, Robin Holt wrote:
> > In this case, we are not making the call to unregister, we are waiting
> > for the _release callout which has already removed it from the list.
> >
> > In the event that the user has removed all the grants, we use
> > unregister. That typically does not occur. We merely wait for exit
> > processing to clean up the structures.
>
> Then it's very strange. LIST_POISON1 is set in n->next. If it was a
> second hlist_del triggering the bug, in theory LIST_POISON2 should
> trigger first, so perhaps it's really a notifier running despite mm_lock
> being taken? Could you post a full stack trace so I can see who's running
> into LIST_POISON1? If it's really a notifier running outside of some
> mm_lock, that will be _immediately_ visible from the stack trace that
> triggered the LIST_POISON1!
>
> Also note, EMM isn't using the clean hlist_del, it's implementing the
> list by hand (with zero runtime gain), so all the debugging may not exist
> in EMM. So if it's really an mm_lock race, and it only triggers with mmu
> notifiers and not with EMM, it doesn't necessarily mean EMM is bug free.
> If you have a full stack trace it would greatly help to verify what is
> mangling the list when the oops triggers.

The stack trace is below. I did not do this level of testing on emm so I
can not compare the two in this area.

This is for a different, but equivalent failure. I just reproduced the
LIST_POISON1 failure without trying to reproduce the exact same failure
as I had documented earlier (lost that stack trace, sorry).

Thanks,
Robin

<1>Unable to handle kernel paging request at virtual address 00100100
<4>mpi006.f.x[23403]: Oops 11012296146944 [1]
<4>Modules linked in: nfs lockd sunrpc binfmt_misc thermal processor fan button loop md_mod dm_mod xpmem xp mspec sg
<4>
<4>Pid: 23403, CPU 114, comm: mpi006.f.x
<4>psr : 121008526010 ifs : 838b ip : [a0010015d6a1] Not tainted (2.6.25-rc8)
<4>ip is at __mmu_notifier_invalidate_range_start+0x81/0x120
<4>unat: pfs : 038b rsc : 0003
<4>rnat: a00100149a00 bsps: a0010740 pr : 66555666a9599aa9
<4>ldrs: ccv : fpsr: 0009804c0270033f
<4>csd : ssd :
<4>b0 : a0010015d670 b6 : a002101ddb40 b7 : a001eb50
<4>f6 : 1003e f7 : 0
<4>f8 : 0 f9 : 0
<4>f10 : 0 f11 : 0
<4>r1 : a00100ef1190 r2 : ee6080cc1940 r3 : a002101edd10
<4>r8 : ee6080cc1970 r9 : r10 : ee6080cc19c8
<4>r11 : 2003a648 r12 : ec60d31efb90 r13 : ec60d31e
<4>r14 : 004d r15 : ee6080cc1914 r16 : ee6080cc1970
<4>r17 : 2003a648 r18 : 2007bf90 r19 : 0004
<4>r20 : ec60d31e r21 : 0010 r22 : ee6080cc19a8
<4>r23 : ec60c55f1120 r24 : ec60d31efda0 r25 : ec60d31efd98
<4>r26 : ee60812166d0 r27 : ec60d31efdc0 r28 : ec60d31efdb8
<4>r29 : ec60d31e0b60 r30 : r31 : 0081
<4>
<4>Call Trace:
<4> [a00100014a20] show_stack+0x40/0xa0
<4>sp=ec60d31ef760 bsp=ec60d31e11f0
<4> [a00100015330] show_regs+0x850/0x8a0
<4>sp=ec60d31ef930 bsp=ec60d31e1198
<4> [a00100035ed0] die+0x1b0/0x2e0
<4>sp=ec60d31ef930 bsp=ec60d31e1150
<4> [a00100060e90] ia64_do_page_fault+0x8d0/0xa40
<4>sp=ec60d31ef930 bsp=ec60d31e1100
<4> [a001ab00] ia64_leave_kernel+0x0/0x270
<4>sp=ec60d31ef9c0 bsp=ec60d31e1100
<4> [a0010015d6a0] __mmu_notifier_invalidate_range_start+0x80/0x120
<4>sp=ec60d31efb90 bsp=ec60d31e10a8
<4> [a0010011b1d0] unmap_vmas+0x70/0x14c0
<4>sp=ec60d31efb90 bsp=ec60d31e0fa8
<4> [a0010011c660] zap_page_range+0x40/0x60
<4>sp=ec60d31efda0 bsp=ec60d31e0f70
<4> [a002101d62d0] xpmem_clear_PTEs+0x350/0x560 [xpmem]
<4>sp=ec60d31efdb0 bsp=ec60d31e0ef0
<4> [a002101d1e30] xpmem_remove_seg+0x3f0/0x700 [xpmem]
<4>sp=ec60d31efde0 bsp=ec60d31e0ea8
<4> [a002101d2500] xpmem_remove_segs_of_tg+0x80/0x140 [xpmem]
<4>sp=ec60d31efe10 bsp=ec60d31e0e78
<4> [a002101dda40] xpmem_mmu_notifier_release+0x40/0x80 [xpmem]
<4>sp=ec60d31efe10 bsp=ec60d31e0e58
<4> [a0010015d7f0] __mmu_notifier_release+0xb0/0x100
<4>sp=ec60d31efe10 bsp=ec60d31e0e38
<4> [a00100124430] exit_mmap+0x50/0x180
<4>sp=ec60d31efe10
Re: [kvm-devel] [PATCH 1 of 9] Lock the entire mm to prevent any mmu related operation to happen
On Thu, 17 Apr 2008, Andrea Arcangeli wrote:
> Also note, EMM isn't using the clean hlist_del, it's implementing the
> list by hand (with zero runtime gain), so all the debugging may not exist
> in EMM. So if it's really an mm_lock race, and it only triggers with mmu
> notifiers and not with EMM, it doesn't necessarily mean EMM is bug free.
> If you have a full stack trace it would greatly help to verify what is
> mangling the list when the oops triggers.

EMM was/is using a singly linked list, which allows atomic updates. That
looked cleaner to me, since a doubly linked list must update two
pointers.

I have not seen docs on the locking, so I am not sure why you use rcu
operations here. Isn't the requirement to have either the rmap locks or
mmap_sem held enough to guarantee the consistency of the doubly linked
list?
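
The point about singly linked lists can be sketched in userland C11. This
is not EMM's code, which does not appear in this thread: `struct snode`,
`struct slist`, `slist_push()` and `slist_count()` are hypothetical
names, and the compare-and-swap loop is an assumption chosen so that
concurrent pushers would also be handled. The new node is fully prepared
first, and is then published by updating a single pointer, the list head.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct snode {
	int id;
	struct snode *next;
};

struct slist {
	_Atomic(struct snode *) head;
};

static void slist_init(struct slist *l)
{
	atomic_init(&l->head, NULL);
}

/*
 * Publish n at the head of the list. Only one shared pointer (head) is
 * written, so a walker sees either the old list or the new one, never a
 * half-linked state.
 */
static void slist_push(struct slist *l, struct snode *n)
{
	struct snode *old;

	assert(n != NULL);
	old = atomic_load(&l->head);
	do {
		n->next = old;	/* private until the CAS succeeds */
	} while (!atomic_compare_exchange_weak(&l->head, &old, n));
}

/* Walk the list and count its nodes. */
static size_t slist_count(struct slist *l)
{
	size_t cnt = 0;

	for (struct snode *p = atomic_load(&l->head); p; p = p->next)
		cnt++;
	return cnt;
}
```

Removal is the harder half: unlinking also takes one pointer write, but
walkers may still hold the removed node, so its memory has to stay valid
until they are done, whichever list type is used.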
Re: [kvm-devel] [PATCH 1 of 9] Lock the entire mm to prevent any mmu related operation to happen
On Thu, Apr 17, 2008 at 12:10:52PM -0700, Christoph Lameter wrote:
> EMM was/is using a singly linked list, which allows atomic updates. That
> looked cleaner to me, since a doubly linked list must update two
> pointers.

It would be cleaner if it provided an abstraction in list.h. The
important thing is the memory taken by the head for this usage.

> I have not seen docs on the locking, so I am not sure why you use rcu
> operations here. Isn't the requirement to have either the rmap locks or
> mmap_sem held enough to guarantee the consistency of the doubly linked
> list?

Yes, exactly, I'm not using rcu anymore.
Re: [kvm-devel] [PATCH 1 of 9] Lock the entire mm to prevent any mmu related operation to happen
I don't think this lock mechanism is completely working. I have gotten a
few failures trying to dereference 0x100100 which appears to be
LIST_POISON1.

Thanks,
Robin
Re: [kvm-devel] [PATCH 1 of 9] Lock the entire mm to prevent any mmu related operation to happen
On Wed, Apr 16, 2008 at 11:35:38AM -0700, Christoph Lameter wrote:
> On Wed, 16 Apr 2008, Robin Holt wrote:
> > I don't think this lock mechanism is completely working. I have
> > gotten a few failures trying to dereference 0x100100 which appears to
> > be LIST_POISON1.
>
> How does xpmem unregistering of notifiers work?

For the tests I have been running, we are waiting for the release callout
as part of exit.

Thanks,
Robin
Re: [kvm-devel] [PATCH 1 of 9] Lock the entire mm to prevent any mmu related operation to happen
On Wed, 16 Apr 2008, Robin Holt wrote:
> I don't think this lock mechanism is completely working. I have gotten a
> few failures trying to dereference 0x100100 which appears to be
> LIST_POISON1.

How does xpmem unregistering of notifiers work?
Re: [kvm-devel] [PATCH 1 of 9] Lock the entire mm to prevent any mmu related operation to happen
On Wed, 16 Apr 2008, Robin Holt wrote:
> On Wed, Apr 16, 2008 at 11:35:38AM -0700, Christoph Lameter wrote:
> > On Wed, 16 Apr 2008, Robin Holt wrote:
> > > I don't think this lock mechanism is completely working. I have
> > > gotten a few failures trying to dereference 0x100100 which appears
> > > to be LIST_POISON1.
> >
> > How does xpmem unregistering of notifiers work?
>
> For the tests I have been running, we are waiting for the release
> callout as part of exit.

Some more details on the failure may be useful. AFAICT list_del[_rcu] is
the culprit here and that is only used on release or unregister.