Re: [PATCH 02/36] mmu_notifier: keep track of active invalidation ranges v3

2015-06-04 Thread John Hubbard
On Wed, 3 Jun 2015, Jerome Glisse wrote:

> On Tue, Jun 02, 2015 at 02:32:01AM -0700, John Hubbard wrote:
> > On Thu, 21 May 2015, j.gli...@gmail.com wrote:
> > 
> > > From: Jérôme Glisse 
> > > 
> > > The mmu_notifier_invalidate_range_start() and
> > > mmu_notifier_invalidate_range_end() calls can be considered as
> > > forming an "atomic" section from the CPU page table update point
> > > of view. Between these two functions the CPU page table content is
> > > unreliable for the address range being invalidated.
> > > 
> > > Current users such as KVM need to know when they can trust the
> > > content of the CPU page table. This becomes even more important
> > > for new users of the mmu_notifier API (such as HMM or ODP).
> > > 
> > > This patch uses a structure defined at each call site of
> > > invalidate_range_start() that is added to a list for the duration
> > > of the invalidation. It adds two new helpers to allow querying
> > > whether a range is being invalidated, or waiting for a range to
> > > become valid.
> > > 
> > > For proper synchronization, users must block new range
> > > invalidations from inside their invalidate_range_start() callback
> > > before calling the helper functions. Otherwise there is no
> > > guarantee that a new range invalidation will not be added after
> > > the call to the helper function that queries for existing ranges.
> > 
> > Hi Jerome,
> > 
> > Most of this information will make nice block comments for the new helper 
> > routines. I can help tighten up the writing slightly, but first:
> > 
> > Question: in hmm.c's hmm_notifier_invalidate function (looking at the 
> > entire patchset, for a moment), I don't see any blocking of new range 
> > invalidations, even though you point out, above, that this is required. Am 
> > I missing it, and if so, where should I be looking instead?
> 
> This is a 2-sided synchronization:
> 
> - hmm_device_fault_start() will wait for active invalidations that
>   conflict to be done
> - hmm_wait_device_fault() will block new invalidations until active
>   faults that conflict back off.
>


OK. I'll wait until those patches to talk about those, then.
 
> 
> > [...]
> > 
> > > -enum mmu_event event)
> > > +struct mmu_notifier_range *range)
> > >  
> > >  {
> > >   struct mmu_notifier *mn;
> > >   int id;
> > >  
> > > + spin_lock(&mm->mmu_notifier_mm->lock);
> > > + list_add_tail(&range->list, &mm->mmu_notifier_mm->ranges);
> > > + mm->mmu_notifier_mm->nranges++;
> > 
> > 
> > Is this missing a call to wake_up(&mm->mmu_notifier_mm->wait_queue)? If 
> > not, then it would be helpful to explain why that's only required for 
> > nranges--, and not for the nranges++ case. The helper routine is merely 
> > waiting for nranges to *change*, not looking for greater than or less 
> > than.
> 
> This is on purpose: the waiting side only waits for active invalidations
> to finish, i.e. for mm->mmu_notifier_mm->nranges to be decremented, so
> there is no reason to wake up when a new invalidation is starting. Also,
> the test needs to be a not-equal test, because other non-conflicting
> ranges might be added or removed, meaning the wait might finish even if
> mm->mmu_notifier_mm->nranges > saved_nranges.
> 


OK, I convinced myself that this works as intended. So I don't see 
anything wrong with this approach.

thanks,
john h

> 
> [...]
> > > +static bool mmu_notifier_range_is_valid_locked(struct mm_struct *mm,
> > > +unsigned long start,
> > > +unsigned long end)
> > 
> > 
> > This routine is named "_range_is_valid_", but it takes in an implicit 
> > range (start, end), and also a list of ranges (buried in mm), and so it's 
> > a little confusing. I'd like to consider *maybe* changing either the name, 
> > or the args (range* instead of start, end?), or something.
> > 
> > Could you please say a few words about the intent of this routine, to get 
> > us started there?
> 
> It is just the same as mmu_notifier_range_is_valid(), but it expects the
> lock to already be held. This is for the benefit of
> mmu_notifier_range_wait_valid(), which needs to test whether a range is
> valid (i.e. has no conflicting invalidation) or not. I added a comment
> explaining these 3 functions and how the 2 public helpers need to be
> used.
> 
> Cheers,
> Jérôme
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majord...@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: em...@kvack.org
> 



Re: [PATCH 02/36] mmu_notifier: keep track of active invalidation ranges v3

2015-06-03 Thread Jerome Glisse
On Tue, Jun 02, 2015 at 02:32:01AM -0700, John Hubbard wrote:
> On Thu, 21 May 2015, j.gli...@gmail.com wrote:
> 
> > From: Jérôme Glisse 
> > 
> > The mmu_notifier_invalidate_range_start() and
> > mmu_notifier_invalidate_range_end() calls can be considered as
> > forming an "atomic" section from the CPU page table update point of
> > view. Between these two functions the CPU page table content is
> > unreliable for the address range being invalidated.
> > 
> > Current users such as KVM need to know when they can trust the
> > content of the CPU page table. This becomes even more important for
> > new users of the mmu_notifier API (such as HMM or ODP).
> > 
> > This patch uses a structure defined at each call site of
> > invalidate_range_start() that is added to a list for the duration of
> > the invalidation. It adds two new helpers to allow querying whether
> > a range is being invalidated, or waiting for a range to become
> > valid.
> > 
> > For proper synchronization, users must block new range invalidations
> > from inside their invalidate_range_start() callback before calling
> > the helper functions. Otherwise there is no guarantee that a new
> > range invalidation will not be added after the call to the helper
> > function that queries for existing ranges.
> 
> Hi Jerome,
> 
> Most of this information will make nice block comments for the new helper 
> routines. I can help tighten up the writing slightly, but first:
> 
> Question: in hmm.c's hmm_notifier_invalidate function (looking at the 
> entire patchset, for a moment), I don't see any blocking of new range 
> invalidations, even though you point out, above, that this is required. Am 
> I missing it, and if so, where should I be looking instead?

This is a 2-sided synchronization:

- hmm_device_fault_start() will wait for active invalidations that
  conflict to be done
- hmm_wait_device_fault() will block new invalidations until active
  faults that conflict back off.


> [...]
> 
> > -  enum mmu_event event)
> > +  struct mmu_notifier_range *range)
> >  
> >  {
> > struct mmu_notifier *mn;
> > int id;
> >  
> > +   spin_lock(&mm->mmu_notifier_mm->lock);
> > +   list_add_tail(&range->list, &mm->mmu_notifier_mm->ranges);
> > +   mm->mmu_notifier_mm->nranges++;
> 
> 
> Is this missing a call to wake_up(&mm->mmu_notifier_mm->wait_queue)? If 
> not, then it would be helpful to explain why that's only required for 
> nranges--, and not for the nranges++ case. The helper routine is merely 
> waiting for nranges to *change*, not looking for greater than or less 
> than.

This is on purpose: the waiting side only waits for active invalidations
to finish, i.e. for mm->mmu_notifier_mm->nranges to be decremented, so
there is no reason to wake up when a new invalidation is starting. Also,
the test needs to be a not-equal test, because other non-conflicting
ranges might be added or removed, meaning the wait might finish even if
mm->mmu_notifier_mm->nranges > saved_nranges.


[...]
> > +static bool mmu_notifier_range_is_valid_locked(struct mm_struct *mm,
> > +  unsigned long start,
> > +  unsigned long end)
> 
> 
> This routine is named "_range_is_valid_", but it takes in an implicit 
> range (start, end), and also a list of ranges (buried in mm), and so it's 
> a little confusing. I'd like to consider *maybe* changing either the name, 
> or the args (range* instead of start, end?), or something.
> 
> Could you please say a few words about the intent of this routine, to get 
> us started there?

It is just the same as mmu_notifier_range_is_valid(), but it expects the
lock to already be held. This is for the benefit of
mmu_notifier_range_wait_valid(), which needs to test whether a range is
valid (i.e. has no conflicting invalidation) or not. I added a comment
explaining these 3 functions and how the 2 public helpers need to be
used.

Cheers,
Jérôme
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 02/36] mmu_notifier: keep track of active invalidation ranges v3

2015-06-02 Thread John Hubbard
On Thu, 21 May 2015, j.gli...@gmail.com wrote:

> From: Jérôme Glisse 
> 
> The mmu_notifier_invalidate_range_start() and
> mmu_notifier_invalidate_range_end() calls can be considered as forming
> an "atomic" section from the CPU page table update point of view.
> Between these two functions the CPU page table content is unreliable
> for the address range being invalidated.
> 
> Current users such as KVM need to know when they can trust the content
> of the CPU page table. This becomes even more important for new users
> of the mmu_notifier API (such as HMM or ODP).
> 
> This patch uses a structure defined at each call site of
> invalidate_range_start() that is added to a list for the duration of
> the invalidation. It adds two new helpers to allow querying whether a
> range is being invalidated, or waiting for a range to become valid.
> 
> For proper synchronization, users must block new range invalidations
> from inside their invalidate_range_start() callback before calling the
> helper functions. Otherwise there is no guarantee that a new range
> invalidation will not be added after the call to the helper function
> that queries for existing ranges.

Hi Jerome,

Most of this information will make nice block comments for the new helper 
routines. I can help tighten up the writing slightly, but first:

Question: in hmm.c's hmm_notifier_invalidate function (looking at the 
entire patchset, for a moment), I don't see any blocking of new range 
invalidations, even though you point out, above, that this is required. Am 
I missing it, and if so, where should I be looking instead?

> 
> Changed since v1:
>   - Fix a possible deadlock in mmu_notifier_range_wait_valid()
> 
> Changed since v2:
>   - Add the range to invalid range list before calling ->range_start().
>   - Del the range from invalid range list after calling ->range_end().
>   - Remove useless list initialization.
> 
> Signed-off-by: Jérôme Glisse 
> Reviewed-by: Rik van Riel 
> Reviewed-by: Haggai Eran 
> ---
>  drivers/gpu/drm/i915/i915_gem_userptr.c |  9 ++--
>  drivers/gpu/drm/radeon/radeon_mn.c  | 14 +++---
>  drivers/infiniband/core/umem_odp.c  | 16 +++
>  drivers/misc/sgi-gru/grutlbpurge.c  | 15 +++
>  drivers/xen/gntdev.c| 15 ---
>  fs/proc/task_mmu.c  | 11 +++--
>  include/linux/mmu_notifier.h| 55 ---
>  kernel/events/uprobes.c | 13 +++---
>  mm/huge_memory.c| 78 ++--
>  mm/hugetlb.c| 55 ---
>  mm/ksm.c| 28 +---
>  mm/madvise.c| 20 -
>  mm/memory.c | 72 +-
>  mm/migrate.c| 36 +++
>  mm/mmu_notifier.c   | 79 -
>  mm/mprotect.c   | 18 
>  mm/mremap.c | 14 +++---
>  virt/kvm/kvm_main.c | 10 ++---
>  18 files changed, 302 insertions(+), 256 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
> index 452e9b1..80fe72a 100644
> --- a/drivers/gpu/drm/i915/i915_gem_userptr.c
> +++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
> @@ -131,16 +131,15 @@ restart:
>  
>  static void i915_gem_userptr_mn_invalidate_range_start(struct mmu_notifier *_mn,
> 						       struct mm_struct *mm,
> -						       unsigned long start,
> -						       unsigned long end,
> -						       enum mmu_event event)
> +						       const struct mmu_notifier_range *range)
>  {
> 	struct i915_mmu_notifier *mn = container_of(_mn, struct i915_mmu_notifier, mn);
> 	struct interval_tree_node *it = NULL;
> -	unsigned long next = start;
> +	unsigned long next = range->start;
> 	unsigned long serial = 0;
> +	/* interval ranges are inclusive, but invalidate range is exclusive */
> +	unsigned long end = range->end - 1, start = range->start;


A *very* minor point, but doing it that way messes up the scope of the 
comment. Something more like this might be cleaner:

unsigned long start = range->start;
unsigned long next = start;
unsigned long serial = 0;
/* interval ranges are inclusive, but invalidate range is exclusive */
unsigned long end = range->end - 1;


[...]

> -enum mmu_event event)
> +struct mmu_notifier_range *range)
>  
>  {
>   struct mmu_notifier *mn;
>   int id;
>  
> + spin_lock(&mm->mmu_notifier_mm->lock);
> + list_add_tail(&range->list, &mm->mmu_notifier_mm->ranges);
> > + mm->mmu_notifier_mm->nranges++;

Re: [PATCH 02/36] mmu_notifier: keep track of active invalidation ranges v3

2015-05-27 Thread Jerome Glisse
On Wed, May 27, 2015 at 10:39:23AM +0530, Aneesh Kumar K.V wrote:
> j.gli...@gmail.com writes:
> 
> > From: Jérôme Glisse 
> >
> > The mmu_notifier_invalidate_range_start() and
> > mmu_notifier_invalidate_range_end() calls can be considered as
> > forming an "atomic" section from the CPU page table update point of
> > view. Between these two functions the CPU page table content is
> > unreliable for the address range being invalidated.
> >
> > Current users such as KVM need to know when they can trust the
> > content of the CPU page table. This becomes even more important for
> > new users of the mmu_notifier API (such as HMM or ODP).
> 
> I don't see KVM using the new APIs in this patch. Also, what does HMM
> use this for, to protect walking of the mirror page table? I am sure
> you are covering that in later patches, but maybe you want to mention
> the details here too.

The KVM side is not done; I looked at the KVM code a long time ago and
thought it could take advantage of this, but now I do not remember
exactly. I would need to check back.

For HMM this is simple: no device fault can populate or walk the mirror
page table on a range that is being invalidated, but concurrent
faults/walks can happen outside the invalidated range. This is all
handled in hmm_device_fault_start().

Cheers,
Jérôme


Re: [PATCH 02/36] mmu_notifier: keep track of active invalidation ranges v3

2015-05-26 Thread Aneesh Kumar K.V
j.gli...@gmail.com writes:

> From: Jérôme Glisse 
>
> The mmu_notifier_invalidate_range_start() and
> mmu_notifier_invalidate_range_end() calls can be considered as forming
> an "atomic" section from the CPU page table update point of view.
> Between these two functions the CPU page table content is unreliable
> for the address range being invalidated.
>
> Current users such as KVM need to know when they can trust the content
> of the CPU page table. This becomes even more important for new users
> of the mmu_notifier API (such as HMM or ODP).

I don't see KVM using the new APIs in this patch. Also, what does HMM
use this for, to protect walking of the mirror page table? I am sure you
are covering that in later patches, but maybe you want to mention the
details here too.

>
> This patch uses a structure defined at each call site of
> invalidate_range_start() that is added to a list for the duration of
> the invalidation. It adds two new helpers to allow querying whether a
> range is being invalidated, or waiting for a range to become valid.
>
> For proper synchronization, users must block new range invalidations
> from inside their invalidate_range_start() callback before calling the
> helper functions. Otherwise there is no guarantee that a new range
> invalidation will not be added after the call to the helper function
> that queries for existing ranges.
>
> Changed since v1:
>   - Fix a possible deadlock in mmu_notifier_range_wait_valid()
>
> Changed since v2:
>   - Add the range to invalid range list before calling ->range_start().
>   - Del the range from invalid range list after calling ->range_end().
>   - Remove useless list initialization.
>

-aneesh



[PATCH 02/36] mmu_notifier: keep track of active invalidation ranges v3

2015-05-21 Thread j . glisse
From: Jérôme Glisse 

The mmu_notifier_invalidate_range_start() and
mmu_notifier_invalidate_range_end() calls can be considered as forming
an "atomic" section from the CPU page table update point of view.
Between these two functions the CPU page table content is unreliable for
the address range being invalidated.

Current users such as KVM need to know when they can trust the content
of the CPU page table. This becomes even more important for new users of
the mmu_notifier API (such as HMM or ODP).

This patch uses a structure defined at each call site of
invalidate_range_start() that is added to a list for the duration of the
invalidation. It adds two new helpers to allow querying whether a range
is being invalidated, or waiting for a range to become valid.

For proper synchronization, users must block new range invalidations
from inside their invalidate_range_start() callback before calling the
helper functions. Otherwise there is no guarantee that a new range
invalidation will not be added after the call to the helper function
that queries for existing ranges.

Changed since v1:
  - Fix a possible deadlock in mmu_notifier_range_wait_valid()

Changed since v2:
  - Add the range to invalid range list before calling ->range_start().
  - Del the range from invalid range list after calling ->range_end().
  - Remove useless list initialization.

Signed-off-by: Jérôme Glisse 
Reviewed-by: Rik van Riel 
Reviewed-by: Haggai Eran 
---
 drivers/gpu/drm/i915/i915_gem_userptr.c |  9 ++--
 drivers/gpu/drm/radeon/radeon_mn.c  | 14 +++---
 drivers/infiniband/core/umem_odp.c  | 16 +++
 drivers/misc/sgi-gru/grutlbpurge.c  | 15 +++
 drivers/xen/gntdev.c| 15 ---
 fs/proc/task_mmu.c  | 11 +++--
 include/linux/mmu_notifier.h| 55 ---
 kernel/events/uprobes.c | 13 +++---
 mm/huge_memory.c| 78 ++--
 mm/hugetlb.c| 55 ---
 mm/ksm.c| 28 +---
 mm/madvise.c| 20 -
 mm/memory.c | 72 +-
 mm/migrate.c| 36 +++
 mm/mmu_notifier.c   | 79 -
 mm/mprotect.c   | 18 
 mm/mremap.c | 14 +++---
 virt/kvm/kvm_main.c | 10 ++---
 18 files changed, 302 insertions(+), 256 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
index 452e9b1..80fe72a 100644
--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -131,16 +131,15 @@ restart:
 
 static void i915_gem_userptr_mn_invalidate_range_start(struct mmu_notifier *_mn,
						       struct mm_struct *mm,
-						       unsigned long start,
-						       unsigned long end,
-						       enum mmu_event event)
+						       const struct mmu_notifier_range *range)
 {
	struct i915_mmu_notifier *mn = container_of(_mn, struct i915_mmu_notifier, mn);
	struct interval_tree_node *it = NULL;
-	unsigned long next = start;
+	unsigned long next = range->start;
	unsigned long serial = 0;
+	/* interval ranges are inclusive, but invalidate range is exclusive */
+	unsigned long end = range->end - 1, start = range->start;
 
-	end--; /* interval ranges are inclusive, but invalidate range is exclusive */
	while (next < end) {
		struct drm_i915_gem_object *obj = NULL;
 
diff --git a/drivers/gpu/drm/radeon/radeon_mn.c b/drivers/gpu/drm/radeon/radeon_mn.c
index 3a9615b..24898bf 100644
--- a/drivers/gpu/drm/radeon/radeon_mn.c
+++ b/drivers/gpu/drm/radeon/radeon_mn.c
@@ -112,34 +112,30 @@ static void radeon_mn_release(struct mmu_notifier *mn,
  *
  * @mn: our notifier
  * @mn: the mm this callback is about
- * @start: start of updated range
- * @end: end of updated range
+ * @range: Address range information.
  *
  * We block for all BOs between start and end to be idle and
  * unmap them by move them into system domain again.
  */
 static void radeon_mn_invalidate_range_start(struct mmu_notifier *mn,
					     struct mm_struct *mm,
-					     unsigned long start,
-					     unsigned long end,
-					     enum mmu_event event)
+					     const struct mmu_notifier_range *range)
 {
	struct radeon_mn *rmn = container_of(mn, struct radeon_mn, mn);
	struct interval_tree_node *it;
-
/* notificati