Reading tlb_flush_pending while the page-table lock is taken does not
require a barrier, since the lock/unlock already acts as a barrier.
Removing the barrier in mm_tlb_flush_pending() to address this issue.

However, migrate_misplaced_transhuge_page() calls mm_tlb_flush_pending()
while the page-table lock is already released, which may present a
problem on architectures with weak memory model (PPC). To deal with this
case, a new parameter is added to mm_tlb_flush_pending() to indicate
if it is read without the page-table lock taken, and calling
smp_mb__after_unlock_lock() in this case.

Cc: Minchan Kim <minc...@kernel.org>
Cc: Sergey Senozhatsky <sergey.senozhat...@gmail.com>
Cc: Andy Lutomirski <l...@kernel.org>
Cc: Mel Gorman <mgor...@suse.de>

Signed-off-by: Nadav Amit <na...@vmware.com>
Acked-by: Rik van Riel <r...@redhat.com>
---
 include/linux/mm_types.h | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index f5263dd0f1bc..2956513619a7 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -522,12 +522,12 @@ static inline cpumask_t *mm_cpumask(struct mm_struct *mm)
 /*
  * Memory barriers to keep this state in sync are graciously provided by
  * the page table locks, outside of which no page table modifications happen.
- * The barriers below prevent the compiler from re-ordering the instructions
- * around the memory barriers that are already present in the code.
+ * The barriers are used to ensure the order between tlb_flush_pending updates,
+ * which happen while the lock is not taken, and the PTE updates, which happen
+ * while the lock is taken, are serialized.
  */
 static inline bool mm_tlb_flush_pending(struct mm_struct *mm)
 {
-       barrier();
        return atomic_read(&mm->tlb_flush_pending) > 0;
 }
 
@@ -550,7 +550,13 @@ static inline void inc_tlb_flush_pending(struct mm_struct 
*mm)
 /* Clearing is done after a TLB flush, which also provides a barrier. */
 static inline void dec_tlb_flush_pending(struct mm_struct *mm)
 {
-       barrier();
+       /*
+        * Guarantee that the tlb_flush_pending does not not leak into the
+        * critical section, since we must order the PTE change and changes to
+        * the pending TLB flush indication. We could have relied on TLB flush
+        * as a memory barrier, but this behavior is not clearly documented.
+        */
+       smp_mb__before_atomic();
        atomic_dec(&mm->tlb_flush_pending);
 }
 #else
-- 
2.11.0

Reply via email to