Re: [kvm-devel] [PATCH] mmu notifiers #v7
On Thu, Feb 28, 2008 at 05:03:01PM -0800, Christoph Lameter wrote: I thought you wanted to get rid of the sync via pte lock? Sure. _notify is happening inside the pt lock by coincidence, to reduce the changes to mm/* as long as the mmu notifiers aren't sleep capable. What changes to do_wp_page do you envision? Converting it to invalidate_range_begin/end. What is the trouble with the current do_wp_page modifications? There is no need for invalidate_page() there so far. invalidate_range() does the trick there. No trouble, it's just that I didn't want to mangle over the logic of do_wp_page unless it was strictly required, the patch has to be obviously safe. You need to keep that bit of your patch to make the mmu notifiers sleepable. - This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH] mmu notifiers #v7
On Fri, 29 Feb 2008, Andrea Arcangeli wrote: On Thu, Feb 28, 2008 at 05:03:01PM -0800, Christoph Lameter wrote: I thought you wanted to get rid of the sync via pte lock? Sure. _notify is happening inside the pt lock by coincidence, to reduce the changes to mm/* as long as the mmu notifiers aren't sleep capable. Ok if this is a coincidence then it would be better to separate the notifier callouts from the pte macro calls. - This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH] mmu notifiers #v7
On Wed, 27 Feb 2008, Andrea Arcangeli wrote: What Christoph need to do when he's back from vacations to support sleepable mmu notifiers is to add a CONFIG_XPMEM config option that will switch the i_mmap_lock from a semaphore to a mutex (any other change to this patch will be minor compared to that) so XPMEM hardware will have kernels compiled that way. I don't see other sane ways to remove the atomic parameter from the API (apparently required by Andrew for merging something not restricted to the xpmem current usage with only anonymous memory) and I don't want to have such a locking-change intrusive dependency for all other non-blocking users that are fine without having to alter how the VM works (for example KVM and GRU). Very minor changes will be required to this patch to make it work after the VM locking will be altered (for example the CONFIG_XPMEM should also switch the mmu_register/unregister locking from RCU to mutex as well). XPMEM then will only compile if CONFIG_XPMEM=y and in turn the invalidate_range_* will support scheduling inside. This is not going to work even if the mutex would work as easily as you think since the patch here still does an rcu_lock/unlock around a callback. I don't think pretending to merge all in one block (I mean including xpmem support that requires blocking methods) is good idea anymore as long as we agree the atomic parameter shouldn't be merged. But we can quite easily agree on the below to be optimal for GRU/KVM and trivially extendible once a CONFIG_XPMEM will be added. So this first part can go in now I think. Changing the locking for the callouts for users of the mmu notivier that f.e. require a response via the network (RDMA, XPMEM etc) is not trivial at all. RCU lock cannot be used. So we are looking at totally disjunct methods for those users who have to sleep. +struct mmu_notifier_ops { + /* + * Called when nobody can register any more notifier in the mm + * and after the mn notifier has been disarmed already. + */ + void (*release)(struct mmu_notifier *mn, + struct mm_struct *mm); Who disarms the notifier? Why is the method not called to disarm the notifier on exit? +obj-$(CONFIG_MMU_NOTIFIER) += mmu_notifier.o diff --git a/mm/filemap_xip.c b/mm/filemap_xip.c --- a/mm/filemap_xip.c +++ b/mm/filemap_xip.c @@ -194,7 +194,7 @@ __xip_unmap (struct address_space * mapp if (pte) { /* Nuke the page table entry. */ flush_cache_page(vma, address, pte_pfn(*pte)); - pteval = ptep_clear_flush(vma, address, pte); + pteval = ptep_clear_flush_notify(vma, address, pte); page_remove_rmap(page, vma); dec_mm_counter(mm, file_rss); BUG_ON(pte_dirty(pteval)); Well a bit better but now we have to modify both the macro and the code in teh VM. It would be easier to put the notify call in here. @@ -2048,6 +2050,7 @@ void exit_mmap(struct mm_struct *mm) vm_unacct_memory(nr_accounted); free_pgtables(tlb, vma, FIRST_USER_ADDRESS, 0); tlb_finish_mmu(tlb, 0, end); + mmu_notifier_release(mm); The release should be called much earlier to allow the driver to release all resources in one go. This way each vma must be processed individually. For our gobs of memory this method may create a scaling problem on exit(). - This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH] mmu notifiers #v7
On Thu, Feb 28, 2008 at 11:48:10AM -0800, Christoph Lameter wrote: make it work after the VM locking will be altered (for example the ^^^ CONFIG_XPMEM should also switch the mmu_register/unregister locking ^^^ from RCU to mutex as well). XPMEM then will only compile if ^ CONFIG_XPMEM=y and in turn the invalidate_range_* will support scheduling inside. This is not going to work even if the mutex would work as easily as you think since the patch here still does an rcu_lock/unlock around a callback. See underlined. +struct mmu_notifier_ops { + /* +* Called when nobody can register any more notifier in the mm +* and after the mn notifier has been disarmed already. +*/ + void (*release)(struct mmu_notifier *mn, + struct mm_struct *mm); Who disarms the notifier? Why is the method not called to disarm the notifier on exit? The notifier is auto-disarmed by mmu_notifier_release, your patch works the same way. -release is further called just in case anybody wants to know the notifier was disarmed. @@ -2048,6 +2050,7 @@ void exit_mmap(struct mm_struct *mm) vm_unacct_memory(nr_accounted); free_pgtables(tlb, vma, FIRST_USER_ADDRESS, 0); tlb_finish_mmu(tlb, 0, end); + mmu_notifier_release(mm); The release should be called much earlier to allow the driver to release all resources in one go. This way each vma must be processed individually. For our gobs of memory this method may create a scaling problem on exit(). Good point, it has to be called earlier for GRU, but it's not a performance issue. GRU doesn't pin the pages so it should make the global invalidate in -release _before_ unmap_vmas. Linux can't fault in the ptes anymore because mm_users is zero so there's no need of a -release_begin/end, the _begin is enough. In #v6 I was invalidating inside unmap_vmas so it was ok. The performance issues you're talking about refers to #v6 I guess, for #v7 there's a single call. Thanks! diff --git a/mm/mmap.c b/mm/mmap.c --- a/mm/mmap.c +++ b/mm/mmap.c @@ -2039,6 +2039,7 @@ void exit_mmap(struct mm_struct *mm) unsigned long end; /* mm's last user has gone, and its about to be pulled down */ + mmu_notifier_release(mm); arch_exit_mmap(mm); lru_add_drain(); @@ -2050,7 +2051,6 @@ void exit_mmap(struct mm_struct *mm) vm_unacct_memory(nr_accounted); free_pgtables(tlb, vma, FIRST_USER_ADDRESS, 0); tlb_finish_mmu(tlb, 0, end); - mmu_notifier_release(mm); /* * Walk the list again, actually closing and freeing it, - This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH] mmu notifiers #v7
On Thu, 28 Feb 2008, Andrea Arcangeli wrote: This is not going to work even if the mutex would work as easily as you think since the patch here still does an rcu_lock/unlock around a callback. See underlined. Mutex is not acceptable for performance reasons. I think we can just drop the RCU lock if we simply unregister the mmu notifier in release and forbid the drivers from removing themselves from the notification chain. They can simply do nothing until release. At that time there is no concurrency and thus its safe to remove even without rcu locking. Good point, it has to be called earlier for GRU, but it's not a performance issue. GRU doesn't pin the pages so it should make the global invalidate in -release _before_ unmap_vmas. Linux can't fault in the ptes anymore because mm_users is zero so there's no need of a -release_begin/end, the _begin is enough. I do not follow you about the _begin without end but the following fix seems okay. - This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH] mmu notifiers #v7
On Wed, 27 Feb 2008, Andrea Arcangeli wrote: +struct mmu_notifier_head { + struct hlist_head head; + spinlock_t lock; +}; Still think that the lock here is not of too much use and can be easily replaced by mmap_sem. +#define mmu_notifier(function, mm, args...) \ + do {\ + struct mmu_notifier *__mn; \ + struct hlist_node *__n; \ + \ + if (unlikely(!hlist_empty((mm)-mmu_notifier.head))) { \ + rcu_read_lock();\ + hlist_for_each_entry_rcu(__mn, __n, \ + (mm)-mmu_notifier.head, \ + hlist) \ + if (__mn-ops-function)\ + __mn-ops-function(__mn, \ + mm, \ + args); \ + rcu_read_unlock(); \ + } \ + } while (0) Andrew recomended local variables for parameters used multile times. This means the mm parameter here. +/* + * Notifiers that use the parameters that they were passed so that the + * compiler does not complain about unused variables but does proper + * parameter checks even if !CONFIG_MMU_NOTIFIER. + * Macros generate no code. + */ +#define mmu_notifier(function, mm, args...) \ + do { \ + if (0) { \ + struct mmu_notifier *__mn; \ +\ + __mn = (struct mmu_notifier *)(0x00ff);\ + __mn-ops-function(__mn, mm, args); \ + }; \ + } while (0) Note also Andrew's comments on the use of 0x00ff... +/* + * No synchronization. This function can only be called when only a single + * process remains that performs teardown. + */ +void mmu_notifier_release(struct mm_struct *mm) +{ + struct mmu_notifier *mn; + struct hlist_node *n, *tmp; + + if (unlikely(!hlist_empty(mm-mmu_notifier.head))) { + hlist_for_each_entry_safe(mn, n, tmp, + mm-mmu_notifier.head, hlist) { + hlist_del(mn-hlist); + if (mn-ops-release) + mn-ops-release(mn, mm); + } + } +} One could avoid a hlist_for_each_entry_safe here by simply always deleting the first object. Also re the _notify variants: The binding to pte_clear_flush_young etc will become a problem for notifiers that want to sleep because pte_clear_flush is usually called with the pte lock held. See f.e. try_to_unmap_one, page_mkclean_one etc. It would be better if the notifier calls could be moved outside of the pte lock. - This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH] mmu notifiers #v7
The release should be called much earlier to allow the driver to release all resources in one go. This way each vma must be processed individually. For our gobs of memory this method may create a scaling problem on exit(). Good point, it has to be called earlier for GRU, but it's not a performance issue. GRU doesn't pin the pages so it should make the global invalidate in -release _before_ unmap_vmas. Linux can't fault in the ptes anymore because mm_users is zero so there's no need of a -release_begin/end, the _begin is enough. I disagree. The location of the callout IS a performance issue. In simple comparisons of the 2 patches (Christoph's vs. Andrea's), Andrea's has a 7X increase in the number of TLB purges being issued to the GRU. TLB flushing is slow and can impact the performance of of tasks using the GRU. --- jack - This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH] mmu notifiers #v7
On Thu, Feb 28, 2008 at 05:17:33PM -0600, Jack Steiner wrote: I disagree. The location of the callout IS a performance issue. In simple comparisons of the 2 patches (Christoph's vs. Andrea's), Andrea's has a 7X increase in the number of TLB purges being issued to the GRU. TLB flushing Are you sure that you're referring to #v7? - This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH] mmu notifiers #v7
On Thu, Feb 28, 2008 at 03:05:30PM -0800, Christoph Lameter wrote: Still think that the lock here is not of too much use and can be easily replaced by mmap_sem. I can use the mmap_sem. +#define mmu_notifier(function, mm, args...) \ + do {\ + struct mmu_notifier *__mn; \ + struct hlist_node *__n; \ + \ + if (unlikely(!hlist_empty((mm)-mmu_notifier.head))) { \ + rcu_read_lock();\ + hlist_for_each_entry_rcu(__mn, __n, \ +(mm)-mmu_notifier.head, \ +hlist) \ + if (__mn-ops-function)\ + __mn-ops-function(__mn, \ + mm, \ + args); \ + rcu_read_unlock(); \ + } \ + } while (0) Andrew recomended local variables for parameters used multile times. This means the mm parameter here. I don't exactly see what buggy macro meant? I already use parenthesis as needed to avoid the need of local variables to be safe. Not really sure what's buggy, sorry! Note also Andrew's comments on the use of 0x00ff... I thought I tried the (void) but it didn't work and your solution worked, but perhaps I did something wrong, I'll try again with (void) nevertheless. +/* + * No synchronization. This function can only be called when only a single + * process remains that performs teardown. + */ +void mmu_notifier_release(struct mm_struct *mm) +{ + struct mmu_notifier *mn; + struct hlist_node *n, *tmp; + + if (unlikely(!hlist_empty(mm-mmu_notifier.head))) { + hlist_for_each_entry_safe(mn, n, tmp, + mm-mmu_notifier.head, hlist) { + hlist_del(mn-hlist); + if (mn-ops-release) + mn-ops-release(mn, mm); + } + } +} One could avoid a hlist_for_each_entry_safe here by simply always deleting the first object. Agreed, the current construct come from the fact we previously didn't assume nobody could ever call mmu_notifier_unregister by the time mm_users is 0. Also re the _notify variants: The binding to pte_clear_flush_young etc will become a problem for notifiers that want to sleep because pte_clear_flush is usually called with the pte lock held. See f.e. try_to_unmap_one, page_mkclean_one etc. Calling __free_page out of the PT lock is much bigger change. do_wp_page will require changes anyway when the sleepable notifiers are merged. It would be better if the notifier calls could be moved outside of the pte lock. The point is that it can't make a difference right now, and my objective was to avoid unnecessary source code duplication (later it will be necessary, right now it isn't). By the time you rework do_wp_page, removing _notify will be a very minor detail compared to the rest of the changes to do_wp_page IMHO. Expanding it now won't provide a real advantage later. - This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH] mmu notifiers #v7
On Fri, 29 Feb 2008 01:40:01 +0100 Andrea Arcangeli [EMAIL PROTECTED] wrote: +#define mmu_notifier(function, mm, args...) \ + do {\ + struct mmu_notifier *__mn; \ + struct hlist_node *__n; \ + \ + if (unlikely(!hlist_empty((mm)-mmu_notifier.head))) { \ + rcu_read_lock();\ + hlist_for_each_entry_rcu(__mn, __n, \ + (mm)-mmu_notifier.head, \ + hlist) \ + if (__mn-ops-function)\ + __mn-ops-function(__mn, \ + mm, \ + args); \ + rcu_read_unlock(); \ + } \ + } while (0) Andrew recomended local variables for parameters used multile times. This means the mm parameter here. I don't exactly see what buggy macro meant? multiple refernces to the argument, so mmu_notifier(foo, bar(), zot); will call bar() either once or twice. Unlikely in this case, but bad practice. Easily fixable by using another temporary. - This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH] mmu notifiers #v7
On Fri, 29 Feb 2008, Andrea Arcangeli wrote: Also re the _notify variants: The binding to pte_clear_flush_young etc will become a problem for notifiers that want to sleep because pte_clear_flush is usually called with the pte lock held. See f.e. try_to_unmap_one, page_mkclean_one etc. Calling __free_page out of the PT lock is much bigger change. do_wp_page will require changes anyway when the sleepable notifiers are merged. I thought you wanted to get rid of the sync via pte lock? What changes to do_wp_page do you envision? It would be better if the notifier calls could be moved outside of the pte lock. The point is that it can't make a difference right now, and my objective was to avoid unnecessary source code duplication (later it will be necessary, right now it isn't). By the time you rework do_wp_page, removing _notify will be a very minor detail compared to the rest of the changes to do_wp_page IMHO. Expanding it now won't provide a real advantage later. What is the trouble with the current do_wp_page modifications? There is no need for invalidate_page() there so far. invalidate_range() does the trick there. - This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH] mmu notifiers #v7
On Fri, 29 Feb 2008, Andrea Arcangeli wrote: On Thu, Feb 28, 2008 at 05:17:33PM -0600, Jack Steiner wrote: I disagree. The location of the callout IS a performance issue. In simple comparisons of the 2 patches (Christoph's vs. Andrea's), Andrea's has a 7X increase in the number of TLB purges being issued to the GRU. TLB flushing Are you sure that you're referring to #v7? Jack: AFAICT Andrea moved the release callout and things will be fine in the next release. - This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH] mmu notifiers #v7
Hi Izik kvm-devel, Just wanted to remind that if we'll converge on #v7, the ksm code in replace_page will have to call ptep_clear_flush_notify too (just like do_wp_page). - This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH] mmu notifiers #v7
On Wed, 27 Feb 2008, Andrea Arcangeli wrote: I hope this will can be considered final for .25 and be merged. Risk is zero, the only discussion here is to make an API that will last forever, functionality-wise all these patches provides zero risk and zero overhead when MMU_NOTIFIER=n. This last patch covers KVM and GRU and hopefully all other non-blocking users optimally, and the below Ok so it somehow works slowly with GRU and you are happy with it. What about the RDMA folks etc etc? API will hopefully last forever (but even if it lasts just for .25 and .26 is changed that's fine with us, it's a kernel _internal_ API anyway, there's absolutely nothing visible to userland). Would it not be better to have a solution that fits all instead of hacking something in now and then having to modify it later? What Christoph need to do when he's back from vacations to support sleepable mmu notifiers is to add a CONFIG_XPMEM config option that will switch the i_mmap_lock from a semaphore to a mutex (any other change to this patch will be minor compared to that) so XPMEM hardware will have kernels compiled that way. I don't see other sane ways to remove the atomic parameter from the API (apparently required by Andrew for merging something not restricted to the xpmem current usage with only anonymous memory) and I don't want to have such a locking-change intrusive dependency for all other non-blocking users that are fine without having to alter how the VM works (for example KVM and GRU). Very minor changes will be required to this patch to make it work after the VM locking will be altered (for example the CONFIG_XPMEM should also switch the mmu_register/unregister locking from RCU to mutex as well). XPMEM then will only compile if CONFIG_XPMEM=y and in turn the invalidate_range_* will support scheduling inside. Hmmm.. There were earlier discussions of changing the anon vma lock to a rw lock because of contention issues in large systems. Maybe we can just generally switch the locks taken while walking rmaps to semaphores? That would still require to put the invalidate outside of the pte lock. - This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH] mmu notifiers #v7
On Wed, Feb 27, 2008 at 03:06:10PM -0800, Christoph Lameter wrote: Ok so it somehow works slowly with GRU and you are happy with it. What As far as GRU is concerned, performance is the same as with your patch (Jack can confirm). about the RDMA folks etc etc? If RDMA/IB folks needed to block in invalidate_range, I guess they need to do so on top of tmpfs too, and that never worked with your patch anyway. Would it not be better to have a solution that fits all instead of hacking something in now and then having to modify it later? The whole point is that your solution fits only GRU and KVM too. XPMEM in your patch works in a hacked mode limited to anonymous memory only, Robin already received incoming mail asking to allow xpmem to work on more than anonymous memory, so your solution-that-fits-all doesn't actually fit some of Robin's customer needs. So if it doesn't even entirely satisfy xpmem users, imagine the other potential blocking-users of this code. Hmmm.. There were earlier discussions of changing the anon vma lock to a rw lock because of contention issues in large systems. Maybe we can just generally switch the locks taken while walking rmaps to semaphores? That would still require to put the invalidate outside of the pte lock. anon_vma lock can remain a spinlock unless you also want to schedule inside try_to_unmap. If converting the i_mmap_lock to a mutex is a big trouble, another way that might work to allow invalidate_range to block, would be to try to boost the mm_users to prevent the mmu_notifier_release to run in another cpu the moment after i_mmap_lock spinlock is unlocked. But even if that works, it'll run slower and the mmu notifiers RCU locking should be switched to a mutex, so it'd be nice to have it as a separate option. - This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH] mmu notifiers #v7
On Thu, 28 Feb 2008, Andrea Arcangeli wrote: If RDMA/IB folks needed to block in invalidate_range, I guess they need to do so on top of tmpfs too, and that never worked with your patch anyway. How about blocking in invalidate_page()? It can be made to work... Would it not be better to have a solution that fits all instead of hacking something in now and then having to modify it later? The whole point is that your solution fits only GRU and KVM too. Well so we do not address the issues? XPMEM in your patch works in a hacked mode limited to anonymous memory only, Robin already received incoming mail asking to allow xpmem to work on more than anonymous memory, so your solution-that-fits-all doesn't actually fit some of Robin's customer needs. So if it doesn't even entirely satisfy xpmem users, imagine the other potential blocking-users of this code. The solutions have been mentioned... anon_vma lock can remain a spinlock unless you also want to schedule inside try_to_unmap. Either that or a separate rmap as also mentioned before. - This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH] mmu notifiers #v7
On Wed, Feb 27, 2008 at 04:08:07PM -0800, Christoph Lameter wrote: On Thu, 28 Feb 2008, Andrea Arcangeli wrote: If RDMA/IB folks needed to block in invalidate_range, I guess they need to do so on top of tmpfs too, and that never worked with your patch anyway. How about blocking in invalidate_page()? It can be made to work... Yes, it can be made to work with even more extended VM changes than to only allow invalidate_range to schedule. Those core VM changes should only be done by default (w/o CONFIG_XPMEM=y), if they're doing good to the VM regardless of xpmem requirements. And I'm not really sure of that. I think they don't do any good or they would be a mutex already... Well so we do not address the issues? I'm not suggesting not to address the issues, just that those issues requires VM core changes, and likely those changes should be switchable under a CONFIG_XPMEM, so I see no reason to delay the mmu notifier until those changes are done and merged too. It's kind of a separate problem. Either that or a separate rmap as also mentioned before. DRI also wants invalidate_page by (mm,addr). - This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel
Re: [kvm-devel] [PATCH] mmu notifiers #v7
On Thu, 28 Feb 2008, Andrea Arcangeli wrote: I'm not suggesting not to address the issues, just that those issues requires VM core changes, and likely those changes should be switchable under a CONFIG_XPMEM, so I see no reason to delay the mmu notifier until those changes are done and merged too. It's kind of a separate problem. No its the core problem of the mmu notifier. It needs to be usable for a lot of scenarios. - This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2008. http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/ ___ kvm-devel mailing list kvm-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/kvm-devel