Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-03-04 Thread Christoph Lameter
On Tue, 4 Mar 2008, Nick Piggin wrote: Then put it into the arch code for TLB invalidation. Paravirt ops gives good examples on how to do that. Put what into arch code? The mmu notifier code. What about a completely different approach... XPmem runs over NUMAlink, right? Why not

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-03-04 Thread Nick Piggin
On Wednesday 05 March 2008 05:58, Christoph Lameter wrote: On Tue, 4 Mar 2008, Nick Piggin wrote: Then put it into the arch code for TLB invalidation. Paravirt ops gives good examples on how to do that. Put what into arch code? The mmu notifier code. It isn't arch specific.

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-03-03 Thread Christoph Lameter
On Mon, 3 Mar 2008, Nick Piggin wrote: Your skeleton is just registering notifiers and saying /* you fill the hard part in */ If somebody needs a skeleton in order just to register the notifiers, then almost by definition they are unqualified to write the hard part ;) Its also providing

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-03-02 Thread Nick Piggin
On Thursday 28 February 2008 09:35, Christoph Lameter wrote: On Wed, 20 Feb 2008, Nick Piggin wrote: On Friday 15 February 2008 17:49, Christoph Lameter wrote: Also, what we are going to need here are not skeleton drivers that just do all the *easy* bits (of registering their callbacks),

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-29 Thread Andrea Arcangeli
On Thu, Feb 28, 2008 at 04:59:59PM -0800, Christoph Lameter wrote: And thus the device driver may stop receiving data on a UP system? It will never get the ack. Not sure to follow, sorry. My idea was: post the invalidate in the mmio region of the device smp_call_function() while

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-29 Thread Christoph Lameter
On Fri, 29 Feb 2008, Andrea Arcangeli wrote: On Thu, Feb 28, 2008 at 04:59:59PM -0800, Christoph Lameter wrote: And thus the device driver may stop receiving data on a UP system? It will never get the ack. Not sure to follow, sorry. My idea was: post the invalidate in the mmio

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-29 Thread Andrea Arcangeli
On Fri, Feb 29, 2008 at 11:55:17AM -0800, Christoph Lameter wrote: post the invalidate in the mmio region of the device smp_call_function() while (mmio device wait-bitflag is on); So the device driver on UP can only operate through interrupts? If you are hogging the only cpu

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-29 Thread Christoph Lameter
On Fri, 29 Feb 2008, Andrea Arcangeli wrote: Agreed. I just thought xpmem needed an invalidate-by-page, but I'm glad if xpmem can go in sync with the KVM/GRU/DRI model in this regard. That means we need both the anon_vma locks and the i_mmap_lock to become semaphores. I think semaphores are

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-29 Thread Christoph Lameter
On Fri, 29 Feb 2008, Andrea Arcangeli wrote: On Fri, Feb 29, 2008 at 01:03:16PM -0800, Christoph Lameter wrote: That means we need both the anon_vma locks and the i_mmap_lock to become semaphores. I think semaphores are better than mutexes. Rik and Lee saw some performance improvements

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-29 Thread Andrea Arcangeli
On Fri, Feb 29, 2008 at 01:34:34PM -0800, Christoph Lameter wrote: On Fri, 29 Feb 2008, Andrea Arcangeli wrote: On Fri, Feb 29, 2008 at 01:03:16PM -0800, Christoph Lameter wrote: That means we need both the anon_vma locks and the i_mmap_lock to become semaphores. I think semaphores are

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-29 Thread Andrea Arcangeli
On Fri, Feb 29, 2008 at 01:03:16PM -0800, Christoph Lameter wrote: That means we need both the anon_vma locks and the i_mmap_lock to become semaphores. I think semaphores are better than mutexes. Rik and Lee saw some performance improvements because list can be traversed in parallel when

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-29 Thread Andrea Arcangeli
On Fri, Feb 29, 2008 at 02:12:57PM -0800, Christoph Lameter wrote: On Fri, 29 Feb 2008, Andrea Arcangeli wrote: AFAICT The rw semaphore fastpath is similar in performance to a rw spinlock. read side is taken in the slow path. Slowpath meaning VM slowpath or lock slow path? Its

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-28 Thread Robin Holt
On Thu, Feb 28, 2008 at 01:52:50AM +0100, Andrea Arcangeli wrote: On Wed, Feb 27, 2008 at 04:14:08PM -0800, Christoph Lameter wrote: Erm. This would also be needed by RDMA etc. The only RDMA I know is Quadrics, and Quadrics apparently doesn't need to schedule inside the invalidate methods

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-28 Thread Christoph Lameter
On Thu, 28 Feb 2008, Andrea Arcangeli wrote: On Wed, Feb 27, 2008 at 05:03:21PM -0800, Christoph Lameter wrote: RDMA works across a network and I would assume that it needs confirmation that a connection has been torn down before pages can be unmapped. Depends on the latency of the

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-28 Thread Andrea Arcangeli
On Thu, Feb 28, 2008 at 10:43:54AM -0800, Christoph Lameter wrote: What about invalidate_page()? That would just spin waiting an ack (just like the smp-tlb-flushing invalidates in numa already does). Thinking more about this, we could also parallelize it with an invalidate_page_before/end. If

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-28 Thread Christoph Lameter
On Fri, 29 Feb 2008, Andrea Arcangeli wrote: On Thu, Feb 28, 2008 at 10:43:54AM -0800, Christoph Lameter wrote: What about invalidate_page()? That would just spin waiting an ack (just like the smp-tlb-flushing invalidates in numa already does). And thus the device driver may stop

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-27 Thread Christoph Lameter
On Tue, 19 Feb 2008, Andrea Arcangeli wrote: Yes, that's why I kept maintaining my patch and I posted the last revision to Andrew. I use pte/tlb locking of the core VM, it's unintrusive and obviously safe. Furthermore it can be extended with Christoph's stuff in a 100% backwards compatible

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-27 Thread Christoph Lameter
On Wed, 20 Feb 2008, Nick Piggin wrote: On Friday 15 February 2008 17:49, Christoph Lameter wrote: The invalidation of address ranges in a mm_struct needs to be performed when pages are removed or permissions etc change. If invalidate_range_begin() is called with locks held then we

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-27 Thread Christoph Lameter
On Wed, 20 Feb 2008, Andrea Arcangeli wrote: Well, xpmem requirements are complex. As as side effect of the simplicity of my approach, my patch is 100% safe since #v1. Now it also works for GRU and it cluster invalidates. The patch has to satisfy RDMA, XPMEM, GRU and KVM. I keep hearing that

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-27 Thread Jack Steiner
Also, what we are going to need here are not skeleton drivers that just do all the *easy* bits (of registering their callbacks), but actual fully working examples that do everything that any real driver will need to do. If not for the sanity of the driver writer, then for the sanity

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-27 Thread Andrea Arcangeli
On Wed, Feb 27, 2008 at 02:23:29PM -0800, Christoph Lameter wrote: How would that work? You rely on the pte locking. Thus calls are all in an I don't rely on the pte locking in #v7, exactly to satisfy GRU (so far purely theoretical) performance complains. atomic context. I think we need a

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-27 Thread Christoph Lameter
On Wed, 27 Feb 2008, Christoph Lameter wrote: Could you be specific? This refers to page migration? Hmmm... Guess we would need to inc the refcount there instead? Argh. No its the callback list scanning. Yuck. No one noticed.

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-27 Thread Andrea Arcangeli
On Wed, Feb 27, 2008 at 02:35:59PM -0800, Christoph Lameter wrote: Could you be specific? This refers to page migration? Hmmm... Guess we If the reader schedule, the synchronize_rcu will return in the other cpu and the objects in the list will be freed and overwritten, and when the task is

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-27 Thread Christoph Lameter
On Thu, 28 Feb 2008, Andrea Arcangeli wrote: 3. Keep the refcount elevated until pages are freed in another execution context. Page refcount is not enough (the mmu_notifier_release will run in another cpu the moment after i_mmap_lock is unlocked) but mm_users may prevent us to change

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-27 Thread Andrea Arcangeli
On Wed, Feb 27, 2008 at 02:39:46PM -0800, Christoph Lameter wrote: On Wed, 20 Feb 2008, Andrea Arcangeli wrote: Well, xpmem requirements are complex. As as side effect of the simplicity of my approach, my patch is 100% safe since #v1. Now it also works for GRU and it cluster invalidates.

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-27 Thread Andrea Arcangeli
On Wed, Feb 27, 2008 at 04:14:08PM -0800, Christoph Lameter wrote: Erm. This would also be needed by RDMA etc. The only RDMA I know is Quadrics, and Quadrics apparently doesn't need to schedule inside the invalidate methods AFIK, so I doubt the above is true. It'd be interesting to know if IB is

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-27 Thread Christoph Lameter
On Thu, 28 Feb 2008, Andrea Arcangeli wrote: On Wed, Feb 27, 2008 at 04:14:08PM -0800, Christoph Lameter wrote: Erm. This would also be needed by RDMA etc. The only RDMA I know is Quadrics, and Quadrics apparently doesn't need to schedule inside the invalidate methods AFIK, so I doubt the

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-27 Thread Andrea Arcangeli
On Wed, Feb 27, 2008 at 05:03:21PM -0800, Christoph Lameter wrote: RDMA works across a network and I would assume that it needs confirmation that a connection has been torn down before pages can be unmapped. Depends on the latency of the network, for example with page pinning it can even try

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-19 Thread Nick Piggin
On Friday 15 February 2008 17:49, Christoph Lameter wrote: The invalidation of address ranges in a mm_struct needs to be performed when pages are removed or permissions etc change. If invalidate_range_begin() is called with locks held then we pass a flag into invalidate_range() to indicate

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-19 Thread Andrea Arcangeli
On Tue, Feb 19, 2008 at 07:54:14PM +1100, Nick Piggin wrote: As far as sleeping inside callbacks goes... I think there are big problems with the patch (the sleeping patch and the external rmap patch). I don't think it is workable in its current state. Either we have to make some big changes to

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-19 Thread Nick Piggin
On Friday 15 February 2008 17:49, Christoph Lameter wrote: The invalidation of address ranges in a mm_struct needs to be performed when pages are removed or permissions etc change. If invalidate_range_begin() is called with locks held then we pass a flag into invalidate_range() to indicate

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-19 Thread Andrea Arcangeli
On Wed, Feb 20, 2008 at 10:08:49AM +1100, Nick Piggin wrote: You can't sleep inside rcu_read_lock()! I must say that for a patch that is up to v8 or whatever and is posted twice a week to such a big cc list, it is kind of slack to not even test it and expect other people to review it. Well,

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-19 Thread Robin Holt
On Wed, Feb 20, 2008 at 02:11:41PM +1100, Nick Piggin wrote: On Wednesday 20 February 2008 14:00, Robin Holt wrote: On Wed, Feb 20, 2008 at 02:00:38AM +0100, Andrea Arcangeli wrote: On Wed, Feb 20, 2008 at 10:08:49AM +1100, Nick Piggin wrote: Also, how to you resolve the case where you

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-16 Thread Christoph Lameter
On Fri, 15 Feb 2008, Andrew Morton wrote: On Thu, 14 Feb 2008 22:49:01 -0800 Christoph Lameter [EMAIL PROTECTED] wrote: The invalidation of address ranges in a mm_struct needs to be performed when pages are removed or permissions etc change. hm. Do they? Why? If I'm in the process

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-15 Thread Andrew Morton
On Thu, 14 Feb 2008 22:49:01 -0800 Christoph Lameter [EMAIL PROTECTED] wrote: The invalidation of address ranges in a mm_struct needs to be performed when pages are removed or permissions etc change. hm. Do they? Why? If I'm in the process of zero-copy writing a hunk of memory out to

[kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-14 Thread Christoph Lameter
The invalidation of address ranges in a mm_struct needs to be performed when pages are removed or permissions etc change. If invalidate_range_begin() is called with locks held then we pass a flag into invalidate_range() to indicate that no sleeping is possible. Locks are only held for truncate

[kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-02-08 Thread Christoph Lameter
The invalidation of address ranges in a mm_struct needs to be performed when pages are removed or permissions etc change. If invalidate_range_begin() is called with locks held then we pass a flag into invalidate_range() to indicate that no sleeping is possible. Locks are only held for truncate

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-31 Thread Andrea Arcangeli
On Wed, Jan 30, 2008 at 05:46:21PM -0800, Christoph Lameter wrote: Well the GRU uses follow_page() instead of get_user_pages. Performance is a major issue for the GRU. GRU is a external TLB, we have to allocate RAM instead but we do it through the regular userland paging mechanism.

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Andrea Arcangeli
On Tue, Jan 29, 2008 at 06:28:05PM -0600, Jack Steiner wrote: On Tue, Jan 29, 2008 at 04:20:50PM -0800, Christoph Lameter wrote: On Wed, 30 Jan 2008, Andrea Arcangeli wrote: invalidate_range after populate allows access to memory for which ptes were zapped and the refcount was

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Robin Holt
Robin, if you don't mind, could you please post or upload somewhere your GPLv2 code that registers itself in Christoph's V2 notifiers? Or is it top secret? I wouldn't mind to have a look so I can better understand what's the exact reason you're sleeping besides attempting GFP_KERNEL

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Andrea Arcangeli
On Wed, Jan 30, 2008 at 10:11:24AM -0600, Robin Holt wrote: Robin, if you don't mind, could you please post or upload somewhere your GPLv2 code that registers itself in Christoph's V2 notifiers? Or is it top secret? I wouldn't mind to have a look so I can better understand what's the exact

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Robin Holt
On Wed, Jan 30, 2008 at 06:04:52PM +0100, Andrea Arcangeli wrote: On Wed, Jan 30, 2008 at 10:11:24AM -0600, Robin Holt wrote: ... The three issues we need to simultaneously solve is revoking the remote page table/tlb information while still in a sleepable context and not having the remote

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Christoph Lameter
On Wed, 30 Jan 2008, Robin Holt wrote: I think I need to straighten this discussion out in my head a little bit. Am I correct in assuming Andrea's original patch set did not have any SMP race conditions for KVM? If so, then we need to start looking at how to implement Christoph's and my

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Jack Steiner
On Wed, Jan 30, 2008 at 11:41:29AM -0800, Christoph Lameter wrote: On Wed, 30 Jan 2008, Jack Steiner wrote: I see what you mean. I need to review to mail to see why this changed but in the original discussions with Christoph, the invalidate_range callouts were suppose to be made BEFORE

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Christoph Lameter
On Wed, 30 Jan 2008, Jack Steiner wrote: Seems that we cannot rely on the invalidate_ranges for correctness at all? We need to have invalidate_page() always. invalidate_range() is only an optimization. I don't understand your point an optimization. How would invalidate_range as

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Robin Holt
On Wed, Jan 30, 2008 at 11:50:26AM -0800, Christoph Lameter wrote: On Wed, 30 Jan 2008, Andrea Arcangeli wrote: XPMEM requires with invalidate_range (sleepy) + before_invalidate_range (sleepy). invalidate_all should also be called before_release (both sleepy). It sounds we need full

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Andrea Arcangeli
On Wed, Jan 30, 2008 at 11:50:26AM -0800, Christoph Lameter wrote: Then we have invalidate_range_start(mm) and invalidate_range_finish(mm, start, end) in addition to the invalidate rmap_notifier? --- include/linux/mmu_notifier.h |7 +-- 1 file changed, 5 insertions(+),

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Christoph Lameter
On Thu, 31 Jan 2008, Andrea Arcangeli wrote: - void (*invalidate_range)(struct mmu_notifier *mn, + void (*invalidate_range_begin)(struct mmu_notifier *mn, struct mm_struct *mm, -unsigned long start, unsigned long end,

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Christoph Lameter
On Thu, 31 Jan 2008, Andrea Arcangeli wrote: On Wed, Jan 30, 2008 at 04:01:31PM -0800, Christoph Lameter wrote: How we offload that? Before the scan of the rmaps we do not have the mmstruct. So we'd need another notifier_rmap_callback. My assumption is that that int lock exists just

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Christoph Lameter
Patch to 1. Remove sync on notifier_release. Must be called when only a single process remain. 2. Add invalidate_range_start/end. This should allow safe removal of ranges of external ptes without having to resort to a callback for every individual page. This must be able to nest so

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Christoph Lameter
On Wed, 30 Jan 2008, Robin Holt wrote: Well the GRU uses follow_page() instead of get_user_pages. Performance is a major issue for the GRU. Worse, the GRU takes its TLB faults from within an interrupt so we use follow_page to prevent going to sleep. That said, I think we could

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Andrea Arcangeli
On Wed, Jan 30, 2008 at 06:08:14PM -0800, Christoph Lameter wrote: hlist_for_each_entry_safe_rcu(mn, n, t, mm-mmu_notifier.head, hlist) { hlist_del_rcu(mn-hlist);

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Christoph Lameter
On Thu, 31 Jan 2008, Andrea Arcangeli wrote: On Wed, Jan 30, 2008 at 06:08:14PM -0800, Christoph Lameter wrote: hlist_for_each_entry_safe_rcu(mn, n, t, mm-mmu_notifier.head, hlist) {

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-30 Thread Robin Holt
Well the GRU uses follow_page() instead of get_user_pages. Performance is a major issue for the GRU. Worse, the GRU takes its TLB faults from within an interrupt so we use follow_page to prevent going to sleep. That said, I think we could probably use follow_page() with FOLL_GET set to

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Andrea Arcangeli
On Mon, Jan 28, 2008 at 12:28:42PM -0800, Christoph Lameter wrote: Index: linux-2.6/mm/fremap.c === --- linux-2.6.orig/mm/fremap.c2008-01-25 19:31:05.0 -0800 +++ linux-2.6/mm/fremap.c 2008-01-25

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Andrea Arcangeli
Christoph, the below patch should fix the current leak of the pinned pages. I hope the page-pin that should be dropped by the invalidate_range op, is enough to prevent the physical page mapped on that mm+address to change before invalidate_range returns. If that would ever happen, there would be a

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Andrea Arcangeli
On Tue, Jan 29, 2008 at 11:55:10AM -0800, Christoph Lameter wrote: I am not sure. AFAICT you wrote that code. Actually I didn't need to change a single line in do_wp_page because ptep_clear_flush was already doing everything transparently for me. This was the memory.c part of my last patch I

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Andrea Arcangeli
On Tue, Jan 29, 2008 at 12:30:06PM -0800, Christoph Lameter wrote: On Tue, 29 Jan 2008, Andrea Arcangeli wrote: diff --git a/mm/fremap.c b/mm/fremap.c --- a/mm/fremap.c +++ b/mm/fremap.c @@ -212,8 +212,8 @@ asmlinkage long sys_remap_file_pages(uns

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Christoph Lameter
On Tue, 29 Jan 2008, Andrea Arcangeli wrote: diff --git a/mm/fremap.c b/mm/fremap.c --- a/mm/fremap.c +++ b/mm/fremap.c @@ -212,8 +212,8 @@ asmlinkage long sys_remap_file_pages(uns spin_unlock(mapping-i_mmap_lock); } + err = populate_range(mm, vma, start, size,

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Christoph Lameter
On Tue, 29 Jan 2008, Andrea Arcangeli wrote: + mmu_notifier(invalidate_range, mm, address, + address + PAGE_SIZE - 1, 0); page_table = pte_offset_map_lock(mm, pmd, address, ptl); if (likely(pte_same(*page_table, orig_pte))) { if (old_page)

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Christoph Lameter
On Tue, 29 Jan 2008, Andrea Arcangeli wrote: It seems to be okay to invalidate range if you hold mmap_sem writably. In that case no additional faults can happen that would create new ptes. In that place the mmap_sem is taken but in readonly mode. I never rely on the mmap_sem in the mmu

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Andrea Arcangeli
On Tue, Jan 29, 2008 at 01:53:05PM -0800, Christoph Lameter wrote: On Tue, 29 Jan 2008, Andrea Arcangeli wrote: We invalidate the range *after* populating it? Isnt it okay to establish references while populate_range() runs? It's not ok because that function can very well overwrite

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Christoph Lameter
On Tue, 29 Jan 2008, Andrea Arcangeli wrote: We invalidate the range *after* populating it? Isnt it okay to establish references while populate_range() runs? It's not ok because that function can very well overwrite existing and present ptes (it's actually the nonlinear common case fast

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Christoph Lameter
n Tue, 29 Jan 2008, Andrea Arcangeli wrote: hmm, there where? When I said it was taken in readonly mode I meant for the quoted code (it would be at the top if it wasn't cut), so I quote below again: + mmu_notifier(invalidate_range, mm, address, + address +

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Christoph Lameter
On Tue, 29 Jan 2008, Andrea Arcangeli wrote: But now I think there may be an issue with a third thread that may show unsafe the removal of invalidate_page from ptep_clear_flush. A third thread writing to a page through the linux-pte and the guest VM writing to the same page through the

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Andrea Arcangeli
On Tue, Jan 29, 2008 at 02:55:56PM -0800, Christoph Lameter wrote: On Tue, 29 Jan 2008, Andrea Arcangeli wrote: But now I think there may be an issue with a third thread that may show unsafe the removal of invalidate_page from ptep_clear_flush. A third thread writing to a page through

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Andrea Arcangeli
On Tue, Jan 29, 2008 at 02:39:00PM -0800, Christoph Lameter wrote: If it does not run in write mode then concurrent faults are permissible while we remap pages. Weird. Maybe we better handle this like individual page operations? Put the invalidate_page back into zap_pte. But then there would

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Christoph Lameter
On Wed, 30 Jan 2008, Andrea Arcangeli wrote: On Wed, Jan 30, 2008 at 01:00:39AM +0100, Andrea Arcangeli wrote: get_user_pages, regular linux writes don't fault unless it's explicitly writeprotect, which is mandatory in a few archs, x86 not). actually get_user_pages doesn't fault either

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Jack Steiner
On Tue, Jan 29, 2008 at 04:20:50PM -0800, Christoph Lameter wrote: On Wed, 30 Jan 2008, Andrea Arcangeli wrote: invalidate_range after populate allows access to memory for which ptes were zapped and the refcount was released. The last refcount is released by the invalidate_range

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Andrea Arcangeli
On Wed, Jan 30, 2008 at 01:00:39AM +0100, Andrea Arcangeli wrote: get_user_pages, regular linux writes don't fault unless it's explicitly writeprotect, which is mandatory in a few archs, x86 not). actually get_user_pages doesn't fault either but it calls into set_page_dirty, however

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Christoph Lameter
On Wed, 30 Jan 2008, Andrea Arcangeli wrote: A user space spinlock plays into this??? That is irrelevant to the kernel. And we are discussing your placement of the invalidate_range not mine. With my code, invalidate_range wasn't placed there at all, my modification to ptep_clear_flush

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Christoph Lameter
On Tue, 29 Jan 2008, Jack Steiner wrote: That is true for your implementation and to address Robin's issues. Jack: Is that true for the GRU? I'm not sure I understand the question. The GRU never (currently) takes a reference on a page. It has no mechanism for tracking pages that were

Re: [kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Christoph Lameter
On Wed, 30 Jan 2008, Andrea Arcangeli wrote: invalidate_range after populate allows access to memory for which ptes were zapped and the refcount was released. The last refcount is released by the invalidate_range itself. That is true for your implementation and to address Robin's issues.

[kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-29 Thread Christoph Lameter
The invalidation of address ranges in a mm_struct needs to be performed when pages are removed or permissions etc change. Most of the VM address space changes can use the range invalidate callback. invalidate_range() is generally called with mmap_sem held but no spinlocks are active. If

[kvm-devel] [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

2008-01-28 Thread Christoph Lameter
The invalidation of address ranges in a mm_struct needs to be performed when pages are removed or permissions etc change. Most of the VM address space changes can use the range invalidate callback. invalidate_range() is generally called with mmap_sem held but no spinlocks are active. If