Re: [PATCH v3 3/3] mm/mmu_notifier: contextual information for event triggering invalidation v2

2018-12-13 Thread Dan Williams
On Thu, Dec 13, 2018 at 9:14 AM  wrote:
>
> From: Jérôme Glisse 
>
> CPU page table updates can happen for many reasons, not only as a result
> of a syscall (munmap(), mprotect(), mremap(), madvise(), ...) but also
> as a result of kernel activities (memory compression, reclaim, migration,
> ...).
>
> Users of the mmu notifier API track changes to the CPU page table and take
> specific actions for them. However, the current API only provides the range
> of virtual addresses affected by the change, not why the change is happening.
>
> This patchset adds event information so that users of mmu notifier can
> differentiate among broad categories:
> - UNMAP: munmap() or mremap()
> - CLEAR: page table is cleared (migration, compaction, reclaim, ...)
> - PROTECTION_VMA: change in access protections for the range
> - PROTECTION_PAGE: change in access protections for pages in the range
> - SOFT_DIRTY: soft dirtiness tracking
>
> Being able to distinguish munmap() and mremap() from the other reasons why
> the page table is cleared is important, as it allows users of mmu notifier
> to update their own internal tracking structures accordingly (on munmap or
> mremap it is no longer necessary to track the range of virtual addresses,
> as it becomes invalid).

Who consumes these new enum values? The consumer of the new
infrastructure should be included in the patchset that adds the new
functionality. So a NAK from me until the consumer is clarified /
included.


[PATCH v3 3/3] mm/mmu_notifier: contextual information for event triggering invalidation v2

2018-12-13 Thread jglisse
From: Jérôme Glisse 

CPU page table updates can happen for many reasons, not only as a result
of a syscall (munmap(), mprotect(), mremap(), madvise(), ...) but also
as a result of kernel activities (memory compression, reclaim, migration,
...).

Users of the mmu notifier API track changes to the CPU page table and take
specific actions for them. However, the current API only provides the range
of virtual addresses affected by the change, not why the change is happening.

This patchset adds event information so that users of mmu notifier can
differentiate among broad categories:
- UNMAP: munmap() or mremap()
- CLEAR: page table is cleared (migration, compaction, reclaim, ...)
- PROTECTION_VMA: change in access protections for the range
- PROTECTION_PAGE: change in access protections for pages in the range
- SOFT_DIRTY: soft dirtiness tracking

Being able to distinguish munmap() and mremap() from the other reasons why
the page table is cleared is important, as it allows users of mmu notifier
to update their own internal tracking structures accordingly (on munmap or
mremap it is no longer necessary to track the range of virtual addresses,
as it becomes invalid).
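
For illustration only (this is not part of the patch), the distinction
described above can be sketched in plain userspace C. The enum mirrors the
categories listed in this message; struct demo_range, struct demo_mirror and
demo_invalidate() are hypothetical names invented for the sketch and are not
the kernel API. The sketch also assumes the simplest possible policy: any
overlapping event invalidates the whole mirrored range.

```c
#include <stdbool.h>

/* Event categories as listed in this commit message (order is illustrative). */
enum mmu_notifier_event {
	MMU_NOTIFY_UNMAP,
	MMU_NOTIFY_CLEAR,
	MMU_NOTIFY_PROTECTION_VMA,
	MMU_NOTIFY_PROTECTION_PAGE,
	MMU_NOTIFY_SOFT_DIRTY,
};

/* Hypothetical stand-in for the range passed to a notifier callback. */
struct demo_range {
	unsigned long start;	/* inclusive */
	unsigned long end;	/* exclusive */
	enum mmu_notifier_event event;
};

/* Hypothetical per-range state kept by a notifier user (e.g. a driver). */
struct demo_mirror {
	unsigned long start;	/* inclusive */
	unsigned long end;	/* exclusive */
	bool tracked;		/* still worth watching this virtual range? */
	bool mapped;		/* secondary (device) mapping currently valid? */
};

static bool demo_overlaps(const struct demo_mirror *m,
			  const struct demo_range *r)
{
	return m->tracked && r->start < m->end && m->start < r->end;
}

/* Called when the CPU page table changes for the given range. */
static void demo_invalidate(struct demo_mirror *m, const struct demo_range *r)
{
	if (!demo_overlaps(m, r))
		return;

	/* Conservatively drop the secondary mapping for every event type. */
	m->mapped = false;

	/*
	 * Only an unmap makes the virtual range itself invalid, so only then
	 * is the tracking structure released. For the other events the range
	 * stays tracked and can be mapped again later.
	 */
	if (r->event == MMU_NOTIFY_UNMAP)
		m->tracked = false;
}
```

Without the event field, such a user could not tell the two cases apart and
would have to keep tracking the range after munmap() or mremap().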

Changes since v1:
- use mmu_notifier_range_init() helper to optimize out the case
  when mmu notifier is not enabled
- use kernel doc format for describing the enum values

Signed-off-by: Jérôme Glisse 
Acked-by: Christian König 
Acked-by: Jan Kara 
Acked-by: Felix Kuehling 
Acked-by: Jason Gunthorpe 
Cc: Andrew Morton 
Cc: Matthew Wilcox 
Cc: Ross Zwisler 
Cc: Dan Williams 
Cc: Paolo Bonzini 
Cc: Radim Krčmář 
Cc: Michal Hocko 
Cc: Christian Koenig 
Cc: Ralph Campbell 
Cc: John Hubbard 
Cc: k...@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: linux-r...@vger.kernel.org
Cc: linux-fsde...@vger.kernel.org
Cc: Arnd Bergmann 
---
 fs/dax.c |  7 +++
 fs/proc/task_mmu.c   |  3 ++-
 include/linux/mmu_notifier.h | 35 +--
 kernel/events/uprobes.c  |  3 ++-
 mm/huge_memory.c | 12 
 mm/hugetlb.c | 10 ++
 mm/khugepaged.c  |  3 ++-
 mm/ksm.c |  6 --
 mm/madvise.c |  3 ++-
 mm/memory.c  | 18 --
 mm/migrate.c |  5 +++--
 mm/mprotect.c|  3 ++-
 mm/mremap.c  |  3 ++-
 mm/oom_kill.c|  2 +-
 mm/rmap.c|  6 --
 15 files changed, 90 insertions(+), 29 deletions(-)

diff --git a/fs/dax.c b/fs/dax.c
index 874085bacaf5..6056b03a1626 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -768,6 +768,13 @@ static void dax_entry_mkclean(struct address_space *mapping, pgoff_t index,
 
address = pgoff_address(index, vma);
 
+   /*
+* All the fields are populated by follow_pte_pmd() except
+* the event field.
+*/
+   mmu_notifier_range_init(&range, NULL, 0, -1UL,
+   MMU_NOTIFY_PROTECTION_PAGE);
+
/*
 * Note because we provide start/end to follow_pte_pmd it will
 * call mmu_notifier_invalidate_range_start() on our behalf
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index b3ddceb003bc..f68a9ebb0218 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -1141,7 +1141,8 @@ static ssize_t clear_refs_write(struct file *file, const char __user *buf,
break;
}
 
-   mmu_notifier_range_init(&range, mm, 0, -1UL);
+   mmu_notifier_range_init(&range, mm, 0, -1UL,
+   MMU_NOTIFY_SOFT_DIRTY);
mmu_notifier_invalidate_range_start(&range);
}
walk_page_range(0, mm->highest_vm_end, &clear_refs_walk);
diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
index 39b06772427f..d249e24acea5 100644
--- a/include/linux/mmu_notifier.h
+++ b/include/linux/mmu_notifier.h
@@ -25,10 +25,39 @@ struct mmu_notifier_mm {
spinlock_t lock;
 };
 
+/**
+ * enum mmu_notifier_event - reason for the mmu notifier callback
+ * @MMU_NOTIFY_UNMAP: either a munmap() that unmaps the range or a mremap()
+ * that moves the range
+ *
+ * @MMU_NOTIFY_CLEAR: clear page table entry (many reasons for this like
+ * madvise() or replacing a page by another one, ...).
+ *
+ * @MMU_NOTIFY_PROTECTION_VMA: update is due to a protection change for the
+ * range, i.e. using the vma access permission (vm_page_prot) to update the
+ * whole range is enough; there is no need to inspect changes to the CPU page
+ * table (mprotect() syscall).
+ *
+ * @MMU_NOTIFY_PROTECTION_PAGE: update is due to a change in the read/write
+ * flag for pages in the range, so to mirror those changes the user must
+ * inspect the CPU page table (from the end callback).
+ *
+ * @MMU_NOTIFY_SOFT_DIRTY: soft dirty accounting (sti