On Wed, Feb 20, 2008 at 02:11:41PM +1100, Nick Piggin wrote:
> On Wednesday 20 February 2008 14:00, Robin Holt wrote:
> > On Wed, Feb 20, 2008 at 02:00:38AM +0100, Andrea Arcangeli wrote:
> > > On Wed, Feb 20, 2008 at 10:08:49AM +1100, Nick Piggin wrote:
>
> > > > Also, how to you resolve the case
On Wednesday 20 February 2008 14:00, Robin Holt wrote:
> On Wed, Feb 20, 2008 at 02:00:38AM +0100, Andrea Arcangeli wrote:
> > On Wed, Feb 20, 2008 at 10:08:49AM +1100, Nick Piggin wrote:
> > > Also, how to you resolve the case where you are not allowed to sleep?
> > > I would have thought either
On Wed, Feb 20, 2008 at 02:00:38AM +0100, Andrea Arcangeli wrote:
> On Wed, Feb 20, 2008 at 10:08:49AM +1100, Nick Piggin wrote:
> > You can't sleep inside rcu_read_lock()!
> >
> > I must say that for a patch that is up to v8 or whatever and is
> > posted twice a week to such a big cc list, it is
On Wed, Feb 20, 2008 at 10:08:49AM +1100, Nick Piggin wrote:
> You can't sleep inside rcu_read_lock()!
>
> I must say that for a patch that is up to v8 or whatever and is
> posted twice a week to such a big cc list, it is kind of slack to
> not even test it and expect other people to review it.
W
On Friday 15 February 2008 17:49, Christoph Lameter wrote:
> The invalidation of address ranges in a mm_struct needs to be
> performed when pages are removed or permissions etc change.
>
> If invalidate_range_begin() is called with locks held then we
> pass a flag into invalidate_range() to indicat
On Tue, Feb 19, 2008 at 07:54:14PM +1100, Nick Piggin wrote:
> As far as sleeping inside callbacks goes... I think there are big
> problems with the patch (the sleeping patch and the external rmap
> patch). I don't think it is workable in its current state. Either
> we have to make some big changes
On Friday 15 February 2008 17:49, Christoph Lameter wrote:
> The invalidation of address ranges in a mm_struct needs to be
> performed when pages are removed or permissions etc change.
>
> If invalidate_range_begin() is called with locks held then we
> pass a flag into invalidate_range() to indicat
On Fri, 15 Feb 2008, Andrew Morton wrote:
> On Thu, 14 Feb 2008 22:49:01 -0800 Christoph Lameter <[EMAIL PROTECTED]>
> wrote:
>
> > The invalidation of address ranges in a mm_struct needs to be
> > performed when pages are removed or permissions etc change.
>
> hm. Do they? Why? If I'm in th
On Thu, 14 Feb 2008 22:49:01 -0800 Christoph Lameter <[EMAIL PROTECTED]> wrote:
> The invalidation of address ranges in a mm_struct needs to be
> performed when pages are removed or permissions etc change.
hm. Do they? Why? If I'm in the process of zero-copy writing a hunk of
memory out to har
The invalidation of address ranges in a mm_struct needs to be
performed when pages are removed or permissions etc change.
If invalidate_range_begin() is called with locks held then we
pass a flag into invalidate_range() to indicate that no sleeping is
possible. Locks are only held for truncate and
The invalidation of address ranges in a mm_struct needs to be
performed when pages are removed or permissions etc change.
If invalidate_range_begin() is called with locks held then we
pass a flag into invalidate_range() to indicate that no sleeping is
possible. Locks are only held for truncate and
On Wed, Jan 30, 2008 at 06:51:26PM -0800, Christoph Lameter wrote:
> True. hlist_del_init ok? That would allow to check the driver that the
> mmu_notifier is already linked in using !hlist_unhashed(). Driver then
> needs to properly initialize the mmu_notifier list with INIT_HLIST_NODE().
A driv
On Wed, Jan 30, 2008 at 05:46:21PM -0800, Christoph Lameter wrote:
> Well the GRU uses follow_page() instead of get_user_pages. Performance is
> a major issue for the GRU.
GRU is a external TLB, we have to allocate RAM instead but we do it
through the regular userland paging mechanism. Performan
On Thu, 31 Jan 2008, Andrea Arcangeli wrote:
> On Wed, Jan 30, 2008 at 06:08:14PM -0800, Christoph Lameter wrote:
> > hlist_for_each_entry_safe_rcu(mn, n, t,
>
>
> > &mm->mmu_notifier.head, hlist) {
> >
On Wed, Jan 30, 2008 at 06:08:14PM -0800, Christoph Lameter wrote:
> hlist_for_each_entry_safe_rcu(mn, n, t,
> &mm->mmu_notifier.head, hlist) {
> hlist_del_rcu(&mn->hlist);
On Wed, 30 Jan 2008, Robin Holt wrote:
> > Well the GRU uses follow_page() instead of get_user_pages. Performance is
> > a major issue for the GRU.
>
> Worse, the GRU takes its TLB faults from within an interrupt so we
> use follow_page to prevent going to sleep. That said, I think we
> could
> Well the GRU uses follow_page() instead of get_user_pages. Performance is
> a major issue for the GRU.
Worse, the GRU takes its TLB faults from within an interrupt so we
use follow_page to prevent going to sleep. That said, I think we
could probably use follow_page() with FOLL_GET set to acco
Patch to
1. Remove sync on notifier_release. Must be called when only a
single process remain.
2. Add invalidate_range_start/end. This should allow safe removal
of ranges of external ptes without having to resort to a callback
for every individual page.
This must be able to nest so t
On Thu, 31 Jan 2008, Andrea Arcangeli wrote:
> On Wed, Jan 30, 2008 at 04:01:31PM -0800, Christoph Lameter wrote:
> > How we offload that? Before the scan of the rmaps we do not have the
> > mmstruct. So we'd need another notifier_rmap_callback.
>
> My assumption is that that "int lock" exists j
On Wed, Jan 30, 2008 at 04:01:31PM -0800, Christoph Lameter wrote:
> How we offload that? Before the scan of the rmaps we do not have the
> mmstruct. So we'd need another notifier_rmap_callback.
My assumption is that that "int lock" exists just because
unmap_mapping_range_vma exists. If I'm right
On Thu, 31 Jan 2008, Andrea Arcangeli wrote:
> > - void (*invalidate_range)(struct mmu_notifier *mn,
> > + void (*invalidate_range_begin)(struct mmu_notifier *mn,
> > struct mm_struct *mm,
> > -unsigned long start, unsigned long end,
> >
On Wed, Jan 30, 2008 at 11:50:26AM -0800, Christoph Lameter wrote:
> Then we have
>
> invalidate_range_start(mm)
>
> and
>
> invalidate_range_finish(mm, start, end)
>
> in addition to the invalidate rmap_notifier?
>
> ---
> include/linux/mmu_notifier.h |7 +--
> 1 file changed, 5 ins
On Wed, Jan 30, 2008 at 11:50:26AM -0800, Christoph Lameter wrote:
> On Wed, 30 Jan 2008, Andrea Arcangeli wrote:
>
> > XPMEM requires with invalidate_range (sleepy) +
> > before_invalidate_range (sleepy). invalidate_all should also be called
> > before_release (both sleepy).
> >
> > It sounds we
On Wed, 30 Jan 2008, Jack Steiner wrote:
> > Seems that we cannot rely on the invalidate_ranges for correctness at all?
> > We need to have invalidate_page() always. invalidate_range() is only an
> > optimization.
> >
>
> I don't understand your point "an optimization". How would invalidate_ran
On Wed, Jan 30, 2008 at 11:41:29AM -0800, Christoph Lameter wrote:
> On Wed, 30 Jan 2008, Jack Steiner wrote:
>
> > I see what you mean. I need to review to mail to see why this changed
> > but in the original discussions with Christoph, the invalidate_range
> > callouts were suppose to be made BE
On Wed, 30 Jan 2008, Andrea Arcangeli wrote:
> XPMEM requires with invalidate_range (sleepy) +
> before_invalidate_range (sleepy). invalidate_all should also be called
> before_release (both sleepy).
>
> It sounds we need full overlap of information provided by
> invalidate_page and invalidate_ra
On Wed, 30 Jan 2008, Jack Steiner wrote:
> I see what you mean. I need to review to mail to see why this changed
> but in the original discussions with Christoph, the invalidate_range
> callouts were suppose to be made BEFORE the pages were put on the freelist.
Seems that we cannot rely on the in
On Wed, 30 Jan 2008, Robin Holt wrote:
> I think I need to straighten this discussion out in my head a little bit.
> Am I correct in assuming Andrea's original patch set did not have any SMP
> race conditions for KVM? If so, then we need to start looking at how to
> implement Christoph's and my c
On Wed, Jan 30, 2008 at 11:30:09AM -0600, Robin Holt wrote:
> I don't think I saw the answer to my original question. I assume your
> original patch, extended in a way similar to what Christoph has done,
> can be made to work to cover both the KVM and GRU (Jack's) case.
Yes, I think so.
> XPMEM,
On Wed, Jan 30, 2008 at 06:04:52PM +0100, Andrea Arcangeli wrote:
> On Wed, Jan 30, 2008 at 10:11:24AM -0600, Robin Holt wrote:
...
> > The three issues we need to simultaneously solve is revoking the remote
> > page table/tlb information while still in a sleepable context and not
> > having the re
On Wed, Jan 30, 2008 at 10:11:24AM -0600, Robin Holt wrote:
> > Robin, if you don't mind, could you please post or upload somewhere
> > your GPLv2 code that registers itself in Christoph's V2 notifiers? Or
> > is it top secret? I wouldn't mind to have a look so I can better
> > understand what's th
> Robin, if you don't mind, could you please post or upload somewhere
> your GPLv2 code that registers itself in Christoph's V2 notifiers? Or
> is it top secret? I wouldn't mind to have a look so I can better
> understand what's the exact reason you're sleeping besides attempting
> GFP_KERNEL alloc
On Wed, Jan 30, 2008 at 02:37:20PM +0100, Andrea Arcangeli wrote:
> On Tue, Jan 29, 2008 at 06:28:05PM -0600, Jack Steiner wrote:
> > On Tue, Jan 29, 2008 at 04:20:50PM -0800, Christoph Lameter wrote:
> > > On Wed, 30 Jan 2008, Andrea Arcangeli wrote:
> > >
> > > > > invalidate_range after populat
On Tue, Jan 29, 2008 at 06:28:05PM -0600, Jack Steiner wrote:
> On Tue, Jan 29, 2008 at 04:20:50PM -0800, Christoph Lameter wrote:
> > On Wed, 30 Jan 2008, Andrea Arcangeli wrote:
> >
> > > > invalidate_range after populate allows access to memory for which ptes
> > > > were zapped and the refcou
On Wed, 2008-01-30 at 01:59 +0100, Andrea Arcangeli wrote:
> On Tue, Jan 29, 2008 at 04:22:46PM -0800, Christoph Lameter wrote:
> > That is only partially true. pte are created wronly in order to track
> > dirty state these days. The first write will lead to a fault that switches
> > the pte to
The invalidation of address ranges in a mm_struct needs to be
performed when pages are removed or permissions etc change.
Most of the VM address space changes can use the range invalidate
callback.
invalidate_range() is generally called with mmap_sem held but
no spinlocks are active. If invalidate
On Tue, Jan 29, 2008 at 04:22:46PM -0800, Christoph Lameter wrote:
> That is only partially true. pte are created wronly in order to track
> dirty state these days. The first write will lead to a fault that switches
> the pte to writable. When the page undergoes writeback the page again
> become
On Tue, 29 Jan 2008, Jack Steiner wrote:
> > That is true for your implementation and to address Robin's issues. Jack:
> > Is that true for the GRU?
>
> I'm not sure I understand the question. The GRU never (currently) takes
> a reference on a page. It has no mechanism for tracking pages that
>
On Wed, 30 Jan 2008, Andrea Arcangeli wrote:
> > A user space spinlock plays into this??? That is irrelevant to the kernel.
> > And we are discussing "your" placement of the invalidate_range not mine.
>
> With "my" code, invalidate_range wasn't placed there at all, my
> modification to ptep_clea
On Tue, Jan 29, 2008 at 04:20:50PM -0800, Christoph Lameter wrote:
> On Wed, 30 Jan 2008, Andrea Arcangeli wrote:
>
> > > invalidate_range after populate allows access to memory for which ptes
> > > were zapped and the refcount was released.
> >
> > The last refcount is released by the invalidat
On Wed, 30 Jan 2008, Andrea Arcangeli wrote:
> On Wed, Jan 30, 2008 at 01:00:39AM +0100, Andrea Arcangeli wrote:
> > get_user_pages, regular linux writes don't fault unless it's
> > explicitly writeprotect, which is mandatory in a few archs, x86 not).
>
> actually get_user_pages doesn't fault eit
On Wed, 30 Jan 2008, Andrea Arcangeli wrote:
> > invalidate_range after populate allows access to memory for which ptes
> > were zapped and the refcount was released.
>
> The last refcount is released by the invalidate_range itself.
That is true for your implementation and to address Robin's is
On Wed, Jan 30, 2008 at 01:00:39AM +0100, Andrea Arcangeli wrote:
> get_user_pages, regular linux writes don't fault unless it's
> explicitly writeprotect, which is mandatory in a few archs, x86 not).
actually get_user_pages doesn't fault either but it calls into
set_page_dirty, however get_user_p
On Tue, Jan 29, 2008 at 02:39:00PM -0800, Christoph Lameter wrote:
> If it does not run in write mode then concurrent faults are permissible
> while we remap pages. Weird. Maybe we better handle this like individual
> page operations? Put the invalidate_page back into zap_pte. But then there
> wo
On Tue, Jan 29, 2008 at 02:55:56PM -0800, Christoph Lameter wrote:
> On Tue, 29 Jan 2008, Andrea Arcangeli wrote:
>
> > But now I think there may be an issue with a third thread that may
> > show unsafe the removal of invalidate_page from ptep_clear_flush.
> >
> > A third thread writing to a page
On Tue, 29 Jan 2008, Andrea Arcangeli wrote:
> But now I think there may be an issue with a third thread that may
> show unsafe the removal of invalidate_page from ptep_clear_flush.
>
> A third thread writing to a page through the linux-pte and the guest
> VM writing to the same page through the
n Tue, 29 Jan 2008, Andrea Arcangeli wrote:
> hmm, "there" where? When I said it was taken in readonly mode I meant
> for the quoted code (it would be at the top if it wasn't cut), so I
> quote below again:
>
> > > + mmu_notifier(invalidate_range, mm, address,
> > > +
On Tue, Jan 29, 2008 at 01:53:05PM -0800, Christoph Lameter wrote:
> On Tue, 29 Jan 2008, Andrea Arcangeli wrote:
>
> > > We invalidate the range *after* populating it? Isnt it okay to establish
> > > references while populate_range() runs?
> >
> > It's not ok because that function can very well
On Tue, Jan 29, 2008 at 01:35:58PM -0800, Christoph Lameter wrote:
> On Tue, 29 Jan 2008, Andrea Arcangeli wrote:
>
> > > It seems to be okay to invalidate range if you hold mmap_sem writably. In
> > > that case no additional faults can happen that would create new ptes.
> >
> > In that place th
On Tue, 29 Jan 2008, Andrea Arcangeli wrote:
> > We invalidate the range *after* populating it? Isnt it okay to establish
> > references while populate_range() runs?
>
> It's not ok because that function can very well overwrite existing and
> present ptes (it's actually the nonlinear common case
On Tue, Jan 29, 2008 at 12:30:06PM -0800, Christoph Lameter wrote:
> On Tue, 29 Jan 2008, Andrea Arcangeli wrote:
>
> > diff --git a/mm/fremap.c b/mm/fremap.c
> > --- a/mm/fremap.c
> > +++ b/mm/fremap.c
> > @@ -212,8 +212,8 @@ asmlinkage long sys_remap_file_pages(uns
> > spin_unlock(&m
On Tue, 29 Jan 2008, Andrea Arcangeli wrote:
> > It seems to be okay to invalidate range if you hold mmap_sem writably. In
> > that case no additional faults can happen that would create new ptes.
>
> In that place the mmap_sem is taken but in readonly mode. I never rely
> on the mmap_sem in the
On Tue, Jan 29, 2008 at 11:55:10AM -0800, Christoph Lameter wrote:
> I am not sure. AFAICT you wrote that code.
Actually I didn't need to change a single line in do_wp_page because
ptep_clear_flush was already doing everything transparently for
me. This was the memory.c part of my last patch I pos
On Tue, 29 Jan 2008, Andrea Arcangeli wrote:
> diff --git a/mm/fremap.c b/mm/fremap.c
> --- a/mm/fremap.c
> +++ b/mm/fremap.c
> @@ -212,8 +212,8 @@ asmlinkage long sys_remap_file_pages(uns
> spin_unlock(&mapping->i_mmap_lock);
> }
>
> + err = populate_range(mm, vma, start
On Tue, 29 Jan 2008, Andrea Arcangeli wrote:
> > + mmu_notifier(invalidate_range, mm, address,
> > + address + PAGE_SIZE - 1, 0);
> > page_table = pte_offset_map_lock(mm, pmd, address, &ptl);
> > if (likely(pte_same(*page_table, orig_pte))) {
> > if
Christoph, the below patch should fix the current leak of the pinned
pages. I hope the page-pin that should be dropped by the
invalidate_range op, is enough to prevent the "physical page" mapped
on that "mm+address" to change before invalidate_range returns. If
that would ever happen, there would b
On Mon, Jan 28, 2008 at 12:28:42PM -0800, Christoph Lameter wrote:
> Index: linux-2.6/mm/fremap.c
> ===
> --- linux-2.6.orig/mm/fremap.c2008-01-25 19:31:05.0 -0800
> +++ linux-2.6/mm/fremap.c 2008-01-25 19:32:49.000
The invalidation of address ranges in a mm_struct needs to be
performed when pages are removed or permissions etc change.
Most of the VM address space changes can use the range invalidate
callback.
invalidate_range() is generally called with mmap_sem held but
no spinlocks are active. If invalidate
58 matches
Mail list logo