On Wednesday 05 March 2008 05:58, Christoph Lameter wrote:
> On Tue, 4 Mar 2008, Nick Piggin wrote:
> > > Then put it into the arch code for TLB invalidation. Paravirt ops gives
> > > good examples on how to do that.
> >
> > Put what into arch code?
>
> The mmu notifier code.
It isn't arch specifi
On Tue, 4 Mar 2008, Nick Piggin wrote:
> > Then put it into the arch code for TLB invalidation. Paravirt ops gives
> > good examples on how to do that.
>
> Put what into arch code?
The mmu notifier code.
> > > What about a completely different approach... XPmem runs over NUMAlink,
> > > right?
On Tuesday 04 March 2008 06:28, Christoph Lameter wrote:
> On Mon, 3 Mar 2008, Nick Piggin wrote:
> > Your skeleton is just registering notifiers and saying
> >
> > /* you fill the hard part in */
> >
> > If somebody needs a skeleton in order just to register the notifiers,
> > then almost by defin
On Mon, 3 Mar 2008, Nick Piggin wrote:
> Your skeleton is just registering notifiers and saying
>
> /* you fill the hard part in */
>
> If somebody needs a skeleton in order just to register the notifiers,
> then almost by definition they are unqualified to write the hard
> part ;)
Its also pro
On Thursday 28 February 2008 09:35, Christoph Lameter wrote:
> On Wed, 20 Feb 2008, Nick Piggin wrote:
> > On Friday 15 February 2008 17:49, Christoph Lameter wrote:
> > Also, what we are going to need here are not skeleton drivers
> > that just do all the *easy* bits (of registering their callbac
On Fri, Feb 29, 2008 at 02:12:57PM -0800, Christoph Lameter wrote:
> On Fri, 29 Feb 2008, Andrea Arcangeli wrote:
>
> > > AFAICT The rw semaphore fastpath is similar in performance to a rw
> > > spinlock.
> >
> > read side is taken in the slow path.
>
> Slowpath meaning VM slowpath or lock slo
On Fri, 29 Feb 2008, Andrea Arcangeli wrote:
> > AFAICT The rw semaphore fastpath is similar in performance to a rw
> > spinlock.
>
> read side is taken in the slow path.
Slowpath meaning VM slowpath or lock slow path? It seems that the rwsem
read side path is pretty efficient:
static inlin
On Fri, Feb 29, 2008 at 01:34:34PM -0800, Christoph Lameter wrote:
> On Fri, 29 Feb 2008, Andrea Arcangeli wrote:
>
> > On Fri, Feb 29, 2008 at 01:03:16PM -0800, Christoph Lameter wrote:
> > > That means we need both the anon_vma locks and the i_mmap_lock to become
> > > semaphores. I think semap
On Fri, 29 Feb 2008, Andrea Arcangeli wrote:
> On Fri, Feb 29, 2008 at 01:03:16PM -0800, Christoph Lameter wrote:
> > That means we need both the anon_vma locks and the i_mmap_lock to become
> > semaphores. I think semaphores are better than mutexes. Rik and Lee saw
> > some performance improvem
On Fri, 29 Feb 2008, Andrea Arcangeli wrote:
> I don't have a strong opinion if it should become a semaphore
> unconditionally or only with a CONFIG_XPMEM=y. But keep in mind
> preempt-rt runs quite a bit slower, or we could rip spinlocks out of
> the kernel in the first place ;)
D you just skip
On Fri, Feb 29, 2008 at 01:03:16PM -0800, Christoph Lameter wrote:
> That means we need both the anon_vma locks and the i_mmap_lock to become
> semaphores. I think semaphores are better than mutexes. Rik and Lee saw
> some performance improvements because list can be traversed in parallel
> when
On Fri, 29 Feb 2008, Andrea Arcangeli wrote:
> Agreed. I just thought xpmem needed an invalidate-by-page, but
> I'm glad if xpmem can go in sync with the KVM/GRU/DRI model in this
> regard.
That means we need both the anon_vma locks and the i_mmap_lock to become
semaphores. I think semaphores ar
On Fri, Feb 29, 2008 at 11:55:17AM -0800, Christoph Lameter wrote:
> >post the invalidate in the mmio region of the device
> >smp_call_function()
> >while (mmio device wait-bitflag is on);
>
> So the device driver on UP can only operate through interrupts? If you are
> hogging the onl
On Fri, 29 Feb 2008, Andrea Arcangeli wrote:
> On Thu, Feb 28, 2008 at 04:59:59PM -0800, Christoph Lameter wrote:
> > And thus the device driver may stop receiving data on a UP system? It will
> > never get the ack.
>
> Not sure to follow, sorry.
>
> My idea was:
>
>post the invalidate in
On Thu, Feb 28, 2008 at 04:59:59PM -0800, Christoph Lameter wrote:
> And thus the device driver may stop receiving data on a UP system? It will
> never get the ack.
Not sure to follow, sorry.
My idea was:
post the invalidate in the mmio region of the device
smp_call_function()
while (m
On Fri, 29 Feb 2008, Andrea Arcangeli wrote:
> On Thu, Feb 28, 2008 at 10:43:54AM -0800, Christoph Lameter wrote:
> > What about invalidate_page()?
>
> That would just spin waiting an ack (just like the smp-tlb-flushing
> invalidates in numa already does).
And thus the device driver may stop rec
On Thu, Feb 28, 2008 at 10:43:54AM -0800, Christoph Lameter wrote:
> What about invalidate_page()?
That would just spin waiting an ack (just like the smp-tlb-flushing
invalidates in numa already does).
Thinking more about this, we could also parallelize it with an
invalidate_page_before/end. If i
On Thu, 28 Feb 2008, Andrea Arcangeli wrote:
> On Wed, Feb 27, 2008 at 05:03:21PM -0800, Christoph Lameter wrote:
> > RDMA works across a network and I would assume that it needs confirmation
> > that a connection has been torn down before pages can be unmapped.
>
> Depends on the latency of the
On Thu, Feb 28, 2008 at 01:52:50AM +0100, Andrea Arcangeli wrote:
> On Wed, Feb 27, 2008 at 04:14:08PM -0800, Christoph Lameter wrote:
> > Erm. This would also be needed by RDMA etc.
>
> The only RDMA I know is Quadrics, and Quadrics apparently doesn't need
> to schedule inside the invalidate meth
On Wed, Feb 27, 2008 at 05:03:21PM -0800, Christoph Lameter wrote:
> RDMA works across a network and I would assume that it needs confirmation
> that a connection has been torn down before pages can be unmapped.
Depends on the latency of the network, for example with page pinning
it can even try
On Thu, 28 Feb 2008, Andrea Arcangeli wrote:
> On Wed, Feb 27, 2008 at 04:14:08PM -0800, Christoph Lameter wrote:
> > Erm. This would also be needed by RDMA etc.
>
> The only RDMA I know is Quadrics, and Quadrics apparently doesn't need
> to schedule inside the invalidate methods AFAIK, so I doubt
On Wed, Feb 27, 2008 at 04:14:08PM -0800, Christoph Lameter wrote:
> Erm. This would also be needed by RDMA etc.
The only RDMA I know is Quadrics, and Quadrics apparently doesn't need
to schedule inside the invalidate methods AFAIK, so I doubt the above
is true. It'd be interesting to know if IB is
On Wed, Feb 27, 2008 at 02:39:46PM -0800, Christoph Lameter wrote:
> On Wed, 20 Feb 2008, Andrea Arcangeli wrote:
>
> > Well, xpmem requirements are complex. As a side effect of the
> > simplicity of my approach, my patch is 100% safe since #v1. Now it
> > also works for GRU and it cluster invali
On Thu, 28 Feb 2008, Andrea Arcangeli wrote:
> > 3. Keep the refcount elevated until pages are freed in another execution
> > context.
>
> Page refcount is not enough (the mmu_notifier_release will run in
> another cpu the moment after i_mmap_lock is unlocked) but mm_users may
> prevent us to ch
On Wed, Feb 27, 2008 at 02:35:59PM -0800, Christoph Lameter wrote:
> Could you be specific? This refers to page migration? Hmmm... Guess we
If the reader schedules, the synchronize_rcu will return in the other
cpu and the objects in the list will be freed and overwritten, and
when the task is sche
On Wed, 27 Feb 2008, Christoph Lameter wrote:
> Could you be specific? This refers to page migration? Hmmm... Guess we
> would need to inc the refcount there instead?
Argh. No, it's the callback list scanning. Yuck. No one noticed.
On Wed, Feb 27, 2008 at 02:23:29PM -0800, Christoph Lameter wrote:
> How would that work? You rely on the pte locking. Thus calls are all in an
I don't rely on the pte locking in #v7, exactly to satisfy GRU
(so far purely theoretical) performance complaints.
> atomic context. I think we need a ge
>
> > Also, what we are going to need here are not skeleton drivers
> > that just do all the *easy* bits (of registering their callbacks),
> > but actual fully working examples that do everything that any
> > real driver will need to do. If not for the sanity of the driver
> > writer, then for th
On Wed, 20 Feb 2008, Andrea Arcangeli wrote:
> Well, xpmem requirements are complex. As a side effect of the
> simplicity of my approach, my patch is 100% safe since #v1. Now it
> also works for GRU and it cluster invalidates.
The patch has to satisfy RDMA, XPMEM, GRU and KVM. I keep hearing tha
On Wed, 20 Feb 2008, Nick Piggin wrote:
> On Friday 15 February 2008 17:49, Christoph Lameter wrote:
> > The invalidation of address ranges in a mm_struct needs to be
> > performed when pages are removed or permissions etc change.
> >
> > If invalidate_range_begin() is called with locks held then
On Tue, 19 Feb 2008, Andrea Arcangeli wrote:
> Yes, that's why I kept maintaining my patch and I posted the last
> revision to Andrew. I use pte/tlb locking of the core VM, it's
> unintrusive and obviously safe. Furthermore it can be extended with
> Christoph's stuff in a 100% backwards compatible
On Wed, Feb 20, 2008 at 02:11:41PM +1100, Nick Piggin wrote:
> On Wednesday 20 February 2008 14:00, Robin Holt wrote:
> > On Wed, Feb 20, 2008 at 02:00:38AM +0100, Andrea Arcangeli wrote:
> > > On Wed, Feb 20, 2008 at 10:08:49AM +1100, Nick Piggin wrote:
>
> > > > Also, how do you resolve the case
On Wednesday 20 February 2008 14:00, Robin Holt wrote:
> On Wed, Feb 20, 2008 at 02:00:38AM +0100, Andrea Arcangeli wrote:
> > On Wed, Feb 20, 2008 at 10:08:49AM +1100, Nick Piggin wrote:
> > > Also, how do you resolve the case where you are not allowed to sleep?
> > > I would have thought either
On Wed, Feb 20, 2008 at 02:00:38AM +0100, Andrea Arcangeli wrote:
> On Wed, Feb 20, 2008 at 10:08:49AM +1100, Nick Piggin wrote:
> > You can't sleep inside rcu_read_lock()!
> >
> > I must say that for a patch that is up to v8 or whatever and is
> > posted twice a week to such a big cc list, it is
On Wed, Feb 20, 2008 at 10:08:49AM +1100, Nick Piggin wrote:
> You can't sleep inside rcu_read_lock()!
>
> I must say that for a patch that is up to v8 or whatever and is
> posted twice a week to such a big cc list, it is kind of slack to
> not even test it and expect other people to review it.
W
On Friday 15 February 2008 17:49, Christoph Lameter wrote:
> The invalidation of address ranges in a mm_struct needs to be
> performed when pages are removed or permissions etc change.
>
> If invalidate_range_begin() is called with locks held then we
> pass a flag into invalidate_range() to indicat
On Tue, Feb 19, 2008 at 07:54:14PM +1100, Nick Piggin wrote:
> As far as sleeping inside callbacks goes... I think there are big
> problems with the patch (the sleeping patch and the external rmap
> patch). I don't think it is workable in its current state. Either
> we have to make some big changes
On Friday 15 February 2008 17:49, Christoph Lameter wrote:
> The invalidation of address ranges in a mm_struct needs to be
> performed when pages are removed or permissions etc change.
>
> If invalidate_range_begin() is called with locks held then we
> pass a flag into invalidate_range() to indicat
On Fri, 15 Feb 2008, Andrew Morton wrote:
> On Thu, 14 Feb 2008 22:49:01 -0800 Christoph Lameter <[EMAIL PROTECTED]>
> wrote:
>
> > The invalidation of address ranges in a mm_struct needs to be
> > performed when pages are removed or permissions etc change.
>
> hm. Do they? Why? If I'm in th
On Thu, 14 Feb 2008 22:49:01 -0800 Christoph Lameter <[EMAIL PROTECTED]> wrote:
> The invalidation of address ranges in a mm_struct needs to be
> performed when pages are removed or permissions etc change.
hm. Do they? Why? If I'm in the process of zero-copy writing a hunk of
memory out to har
The invalidation of address ranges in a mm_struct needs to be
performed when pages are removed or permissions etc change.
If invalidate_range_begin() is called with locks held then we
pass a flag into invalidate_range() to indicate that no sleeping is
possible. Locks are only held for truncate and
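
The "no sleeping" flag described above can be sketched in plain userspace C. This is only an illustration, not the patch's code: the struct, the function name sketch_invalidate_range and the two counters are hypothetical, and the actual waiting is replaced by counting which path was chosen. The assumption is that a driver still completes the invalidation before returning in both cases, and only the way it waits differs.

```c
#include <assert.h>

/* Hypothetical driver state, for illustration only. */
struct sketch_driver {
    int sleep_waits; /* invalidations that were allowed to sleep */
    int spin_waits;  /* invalidations that had to busy-wait */
};

/*
 * Sketch of a range-invalidate callback that honours a "no sleeping"
 * flag. 'atomic' is nonzero when the caller holds locks, so the
 * driver must pick a path that never schedules.
 */
static void sketch_invalidate_range(struct sketch_driver *d,
                                    unsigned long start,
                                    unsigned long end,
                                    int atomic)
{
    assert(start <= end); /* a range must not be inverted */

    if (atomic)
        d->spin_waits++;  /* stand-in for polling the device for an ack */
    else
        d->sleep_waits++; /* stand-in for sleeping until the ack arrives */
}
```

A caller that holds locks (the truncate case mentioned above) would pass a nonzero flag, and every other caller would pass 0.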
On Wed, Jan 30, 2008 at 06:51:26PM -0800, Christoph Lameter wrote:
> True. hlist_del_init ok? That would allow the driver to check that the
> mmu_notifier is already linked in using !hlist_unhashed(). Driver then
> needs to properly initialize the mmu_notifier list with INIT_HLIST_NODE().
A driv
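
For readers unfamiliar with the hlist idiom discussed here, a minimal userspace sketch follows. It mirrors the semantics of the kernel's hlist helpers as I understand them (an initialised, unlinked node has a NULL pprev, and hlist_del_init re-initialises the node after unlinking), but it is a standalone copy for illustration. init_hlist_node stands in for the kernel's INIT_HLIST_NODE(), and mmu_notifier_sketch and sketch_register are hypothetical names, not part of any posted patch.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace copy of the hlist node layout, for illustration only. */
struct hlist_node {
    struct hlist_node *next;
    struct hlist_node **pprev; /* address of the pointer that points at us */
};

struct hlist_head {
    struct hlist_node *first;
};

/* Stand-in for INIT_HLIST_NODE(): mark the node as not on any list. */
static void init_hlist_node(struct hlist_node *n)
{
    n->next = NULL;
    n->pprev = NULL;
}

/* A node is "unhashed" (unlinked) exactly when pprev is NULL. */
static int hlist_unhashed(const struct hlist_node *n)
{
    return n->pprev == NULL;
}

static void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
{
    struct hlist_node *first = h->first;

    assert(hlist_unhashed(n)); /* linking a node twice would corrupt the list */
    n->next = first;
    if (first)
        first->pprev = &n->next;
    h->first = n;
    n->pprev = &h->first;
}

/* Unlink and re-initialise, so hlist_unhashed() is true afterwards. */
static void hlist_del_init(struct hlist_node *n)
{
    if (hlist_unhashed(n))
        return;
    *n->pprev = n->next;
    if (n->next)
        n->next->pprev = n->pprev;
    init_hlist_node(n);
}

/* Hypothetical notifier object embedding the list node. */
struct mmu_notifier_sketch {
    struct hlist_node hlist;
};

/* Returns 1 if the notifier was linked now, 0 if it was already linked. */
static int sketch_register(struct mmu_notifier_sketch *mn,
                           struct hlist_head *head)
{
    if (!hlist_unhashed(&mn->hlist))
        return 0;
    hlist_add_head(&mn->hlist, head);
    return 1;
}
```

The check only works if the node was initialised before first use, which is why the driver has to call the init helper on its notifier up front.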
On Wed, Jan 30, 2008 at 05:46:21PM -0800, Christoph Lameter wrote:
> Well the GRU uses follow_page() instead of get_user_pages. Performance is
> a major issue for the GRU.
GRU is a external TLB, we have to allocate RAM instead but we do it
through the regular userland paging mechanism. Performan
On Thu, 31 Jan 2008, Andrea Arcangeli wrote:
> On Wed, Jan 30, 2008 at 06:08:14PM -0800, Christoph Lameter wrote:
> > hlist_for_each_entry_safe_rcu(mn, n, t,
> > &mm->mmu_notifier.head, hlist) {
On Wed, Jan 30, 2008 at 06:08:14PM -0800, Christoph Lameter wrote:
> hlist_for_each_entry_safe_rcu(mn, n, t,
> &mm->mmu_notifier.head, hlist) {
> hlist_del_rcu(&mn->hlist);
On Wed, 30 Jan 2008, Robin Holt wrote:
> > Well the GRU uses follow_page() instead of get_user_pages. Performance is
> > a major issue for the GRU.
>
> Worse, the GRU takes its TLB faults from within an interrupt so we
> use follow_page to prevent going to sleep. That said, I think we
> could
> Well the GRU uses follow_page() instead of get_user_pages. Performance is
> a major issue for the GRU.
Worse, the GRU takes its TLB faults from within an interrupt so we
use follow_page to prevent going to sleep. That said, I think we
could probably use follow_page() with FOLL_GET set to acco
Patch to
1. Remove sync on notifier_release. Must be called when only a
single process remains.
2. Add invalidate_range_start/end. This should allow safe removal
of ranges of external ptes without having to resort to a callback
for every individual page.
This must be able to nest so t
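
One way to picture the nesting requirement for invalidate_range_start/end is a per-mm counter on the driver side. The sketch below is a userspace illustration under my own assumptions, not code from the patch: ext_mmu_state and the sketch_* functions are hypothetical, and "drop all external ptes" is a simplification of tearing down only the affected range. The assumed rule is that the driver refuses to establish new external mappings while any start/end pair is still open.

```c
#include <assert.h>

/* Hypothetical per-mm driver state; not a kernel structure. */
struct ext_mmu_state {
    int range_active; /* start calls that have not yet seen their end */
    int mapped_pages; /* external ptes currently established */
};

/* Called before the VM removes ptes in a range. */
static void sketch_range_start(struct ext_mmu_state *s)
{
    s->range_active++;
    s->mapped_pages = 0; /* simplified: drop every external pte at once */
}

/* Called after the VM has finished with the range. */
static void sketch_range_end(struct ext_mmu_state *s)
{
    assert(s->range_active > 0); /* an end without a start is a bug */
    s->range_active--;
}

/* Returns 1 if an external mapping was established, 0 if refused. */
static int sketch_map_page(struct ext_mmu_state *s)
{
    if (s->range_active > 0)
        return 0; /* an invalidation is in flight, the fault must retry */
    s->mapped_pages++;
    return 1;
}
```

With a counter rather than a boolean, two overlapping start/end pairs keep mappings blocked until the last end has run.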
On Thu, 31 Jan 2008, Andrea Arcangeli wrote:
> On Wed, Jan 30, 2008 at 04:01:31PM -0800, Christoph Lameter wrote:
> > How we offload that? Before the scan of the rmaps we do not have the
> > mmstruct. So we'd need another notifier_rmap_callback.
>
> My assumption is that that "int lock" exists j
On Wed, Jan 30, 2008 at 04:01:31PM -0800, Christoph Lameter wrote:
> How we offload that? Before the scan of the rmaps we do not have the
> mmstruct. So we'd need another notifier_rmap_callback.
My assumption is that that "int lock" exists just because
unmap_mapping_range_vma exists. If I'm right
On Thu, 31 Jan 2008, Andrea Arcangeli wrote:
> > - void (*invalidate_range)(struct mmu_notifier *mn,
> > + void (*invalidate_range_begin)(struct mmu_notifier *mn,
> > struct mm_struct *mm,
> > -unsigned long start, unsigned long end,
> >
On Wed, Jan 30, 2008 at 11:50:26AM -0800, Christoph Lameter wrote:
> Then we have
>
> invalidate_range_start(mm)
>
> and
>
> invalidate_range_finish(mm, start, end)
>
> in addition to the invalidate rmap_notifier?
>
> ---
> include/linux/mmu_notifier.h |7 +--
> 1 file changed, 5 ins
On Wed, Jan 30, 2008 at 11:50:26AM -0800, Christoph Lameter wrote:
> On Wed, 30 Jan 2008, Andrea Arcangeli wrote:
>
> > XPMEM requires with invalidate_range (sleepy) +
> > before_invalidate_range (sleepy). invalidate_all should also be called
> > before_release (both sleepy).
> >
> > It sounds we
On Wed, 30 Jan 2008, Jack Steiner wrote:
> > Seems that we cannot rely on the invalidate_ranges for correctness at all?
> > We need to have invalidate_page() always. invalidate_range() is only an
> > optimization.
> >
>
> I don't understand your point "an optimization". How would invalidate_ran
On Wed, Jan 30, 2008 at 11:41:29AM -0800, Christoph Lameter wrote:
> On Wed, 30 Jan 2008, Jack Steiner wrote:
>
> > I see what you mean. I need to review the mail to see why this changed
> > but in the original discussions with Christoph, the invalidate_range
> > callouts were supposed to be made BE
On Wed, 30 Jan 2008, Andrea Arcangeli wrote:
> XPMEM requires with invalidate_range (sleepy) +
> before_invalidate_range (sleepy). invalidate_all should also be called
> before_release (both sleepy).
>
> It sounds we need full overlap of information provided by
> invalidate_page and invalidate_ra
On Wed, 30 Jan 2008, Jack Steiner wrote:
> I see what you mean. I need to review the mail to see why this changed
> but in the original discussions with Christoph, the invalidate_range
> callouts were supposed to be made BEFORE the pages were put on the freelist.
Seems that we cannot rely on the in
On Wed, 30 Jan 2008, Robin Holt wrote:
> I think I need to straighten this discussion out in my head a little bit.
> Am I correct in assuming Andrea's original patch set did not have any SMP
> race conditions for KVM? If so, then we need to start looking at how to
> implement Christoph's and my c
On Wed, Jan 30, 2008 at 11:30:09AM -0600, Robin Holt wrote:
> I don't think I saw the answer to my original question. I assume your
> original patch, extended in a way similar to what Christoph has done,
> can be made to work to cover both the KVM and GRU (Jack's) case.
Yes, I think so.
> XPMEM,
On Wed, Jan 30, 2008 at 06:04:52PM +0100, Andrea Arcangeli wrote:
> On Wed, Jan 30, 2008 at 10:11:24AM -0600, Robin Holt wrote:
...
> > The three issues we need to simultaneously solve is revoking the remote
> > page table/tlb information while still in a sleepable context and not
> > having the re
On Wed, Jan 30, 2008 at 10:11:24AM -0600, Robin Holt wrote:
> > Robin, if you don't mind, could you please post or upload somewhere
> > your GPLv2 code that registers itself in Christoph's V2 notifiers? Or
> > is it top secret? I wouldn't mind to have a look so I can better
> > understand what's th
> Robin, if you don't mind, could you please post or upload somewhere
> your GPLv2 code that registers itself in Christoph's V2 notifiers? Or
> is it top secret? I wouldn't mind to have a look so I can better
> understand what's the exact reason you're sleeping besides attempting
> GFP_KERNEL alloc
On Wed, Jan 30, 2008 at 02:37:20PM +0100, Andrea Arcangeli wrote:
> On Tue, Jan 29, 2008 at 06:28:05PM -0600, Jack Steiner wrote:
> > On Tue, Jan 29, 2008 at 04:20:50PM -0800, Christoph Lameter wrote:
> > > On Wed, 30 Jan 2008, Andrea Arcangeli wrote:
> > >
> > > > > invalidate_range after populat
On Tue, Jan 29, 2008 at 06:28:05PM -0600, Jack Steiner wrote:
> On Tue, Jan 29, 2008 at 04:20:50PM -0800, Christoph Lameter wrote:
> > On Wed, 30 Jan 2008, Andrea Arcangeli wrote:
> >
> > > > invalidate_range after populate allows access to memory for which ptes
> > > > were zapped and the refcou
The invalidation of address ranges in a mm_struct needs to be
performed when pages are removed or permissions etc change.
Most of the VM address space changes can use the range invalidate
callback.
invalidate_range() is generally called with mmap_sem held but
no spinlocks are active. If invalidate
On Tue, Jan 29, 2008 at 04:22:46PM -0800, Christoph Lameter wrote:
> That is only partially true. pte are created wronly in order to track
> dirty state these days. The first write will lead to a fault that switches
> the pte to writable. When the page undergoes writeback the page again
> become
On Tue, 29 Jan 2008, Jack Steiner wrote:
> > That is true for your implementation and to address Robin's issues. Jack:
> > Is that true for the GRU?
>
> I'm not sure I understand the question. The GRU never (currently) takes
> a reference on a page. It has no mechanism for tracking pages that
>
On Wed, 30 Jan 2008, Andrea Arcangeli wrote:
> > A user space spinlock plays into this??? That is irrelevant to the kernel.
> > And we are discussing "your" placement of the invalidate_range not mine.
>
> With "my" code, invalidate_range wasn't placed there at all, my
> modification to ptep_clea
On Tue, Jan 29, 2008 at 04:20:50PM -0800, Christoph Lameter wrote:
> On Wed, 30 Jan 2008, Andrea Arcangeli wrote:
>
> > > invalidate_range after populate allows access to memory for which ptes
> > > were zapped and the refcount was released.
> >
> > The last refcount is released by the invalidat
On Wed, 30 Jan 2008, Andrea Arcangeli wrote:
> On Wed, Jan 30, 2008 at 01:00:39AM +0100, Andrea Arcangeli wrote:
> > get_user_pages, regular linux writes don't fault unless it's
> > explicitly writeprotect, which is mandatory in a few archs, x86 not).
>
> actually get_user_pages doesn't fault eit
On Wed, 30 Jan 2008, Andrea Arcangeli wrote:
> > invalidate_range after populate allows access to memory for which ptes
> > were zapped and the refcount was released.
>
> The last refcount is released by the invalidate_range itself.
That is true for your implementation and to address Robin's is
On Wed, Jan 30, 2008 at 01:00:39AM +0100, Andrea Arcangeli wrote:
> get_user_pages, regular linux writes don't fault unless it's
> explicitly writeprotect, which is mandatory in a few archs, x86 not).
actually get_user_pages doesn't fault either but it calls into
set_page_dirty, however get_user_p
On Tue, Jan 29, 2008 at 02:39:00PM -0800, Christoph Lameter wrote:
> If it does not run in write mode then concurrent faults are permissible
> while we remap pages. Weird. Maybe we better handle this like individual
> page operations? Put the invalidate_page back into zap_pte. But then there
> wo
On Tue, Jan 29, 2008 at 02:55:56PM -0800, Christoph Lameter wrote:
> On Tue, 29 Jan 2008, Andrea Arcangeli wrote:
>
> > But now I think there may be an issue with a third thread that may
> > show unsafe the removal of invalidate_page from ptep_clear_flush.
> >
> > A third thread writing to a page
On Tue, 29 Jan 2008, Andrea Arcangeli wrote:
> But now I think there may be an issue with a third thread that may
> show unsafe the removal of invalidate_page from ptep_clear_flush.
>
> A third thread writing to a page through the linux-pte and the guest
> VM writing to the same page through the
On Tue, 29 Jan 2008, Andrea Arcangeli wrote:
> hmm, "there" where? When I said it was taken in readonly mode I meant
> for the quoted code (it would be at the top if it wasn't cut), so I
> quote below again:
>
> > > + mmu_notifier(invalidate_range, mm, address,
> > > +
On Tue, Jan 29, 2008 at 01:53:05PM -0800, Christoph Lameter wrote:
> On Tue, 29 Jan 2008, Andrea Arcangeli wrote:
>
> > > We invalidate the range *after* populating it? Isn't it okay to establish
> > > references while populate_range() runs?
> >
> > It's not ok because that function can very well
On Tue, 29 Jan 2008, Andrea Arcangeli wrote:
> > We invalidate the range *after* populating it? Isn't it okay to establish
> > references while populate_range() runs?
>
> It's not ok because that function can very well overwrite existing and
> present ptes (it's actually the nonlinear common case
On Tue, Jan 29, 2008 at 01:35:58PM -0800, Christoph Lameter wrote:
> On Tue, 29 Jan 2008, Andrea Arcangeli wrote:
>
> > > It seems to be okay to invalidate range if you hold mmap_sem writably. In
> > > that case no additional faults can happen that would create new ptes.
> >
> > In that place th
On Tue, Jan 29, 2008 at 12:30:06PM -0800, Christoph Lameter wrote:
> On Tue, 29 Jan 2008, Andrea Arcangeli wrote:
>
> > diff --git a/mm/fremap.c b/mm/fremap.c
> > --- a/mm/fremap.c
> > +++ b/mm/fremap.c
> > @@ -212,8 +212,8 @@ asmlinkage long sys_remap_file_pages(uns
> > spin_unlock(&m
On Tue, 29 Jan 2008, Andrea Arcangeli wrote:
> > It seems to be okay to invalidate range if you hold mmap_sem writably. In
> > that case no additional faults can happen that would create new ptes.
>
> In that place the mmap_sem is taken but in readonly mode. I never rely
> on the mmap_sem in the
On Tue, Jan 29, 2008 at 11:55:10AM -0800, Christoph Lameter wrote:
> I am not sure. AFAICT you wrote that code.
Actually I didn't need to change a single line in do_wp_page because
ptep_clear_flush was already doing everything transparently for
me. This was the memory.c part of my last patch I pos
On Tue, 29 Jan 2008, Andrea Arcangeli wrote:
> diff --git a/mm/fremap.c b/mm/fremap.c
> --- a/mm/fremap.c
> +++ b/mm/fremap.c
> @@ -212,8 +212,8 @@ asmlinkage long sys_remap_file_pages(uns
> spin_unlock(&mapping->i_mmap_lock);
> }
>
> + err = populate_range(mm, vma, start
On Tue, 29 Jan 2008, Andrea Arcangeli wrote:
> > + mmu_notifier(invalidate_range, mm, address,
> > + address + PAGE_SIZE - 1, 0);
> > page_table = pte_offset_map_lock(mm, pmd, address, &ptl);
> > if (likely(pte_same(*page_table, orig_pte))) {
> > if
Christoph, the below patch should fix the current leak of the pinned
pages. I hope the page-pin that should be dropped by the
invalidate_range op, is enough to prevent the "physical page" mapped
on that "mm+address" to change before invalidate_range returns. If
that would ever happen, there would b
On Mon, Jan 28, 2008 at 12:28:42PM -0800, Christoph Lameter wrote:
> Index: linux-2.6/mm/fremap.c
> ===
> --- linux-2.6.orig/mm/fremap.c2008-01-25 19:31:05.0 -0800
> +++ linux-2.6/mm/fremap.c 2008-01-25 19:32:49.000
The invalidation of address ranges in a mm_struct needs to be
performed when pages are removed or permissions etc change.
Most of the VM address space changes can use the range invalidate
callback.
invalidate_range() is generally called with mmap_sem held but
no spinlocks are active. If invalidate