Re: [PATCH v2] powerpc/32s: Cleanup the mess in __set_pte_at()

2023-08-23 Thread Michael Ellerman
On Tue, 15 Aug 2023 19:42:40 +0200, Christophe Leroy wrote:
> __set_pte_at() handles 3 main cases with #ifdefs, plus the 'percpu'
> subcase, which leads to code duplication.
> 
> Rewrite the function using IS_ENABLED() to minimise the total number
> of cases and remove duplicated code.
> 
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/32s: Cleanup the mess in __set_pte_at()
  https://git.kernel.org/powerpc/c/7cb0094be4a5dfb3c91d285977f489d334455e19

cheers


[PATCH v2] powerpc/32s: Cleanup the mess in __set_pte_at()

2023-08-15 Thread Christophe Leroy
__set_pte_at() handles 3 main cases with #ifdefs, plus the 'percpu'
subcase, which leads to code duplication.

Rewrite the function using IS_ENABLED() to minimise the total number
of cases and remove duplicated code.
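
A note for readers unfamiliar with the idiom: IS_ENABLED() turns Kconfig
symbols into compile-time constant conditions, so plain C if/else can
replace #ifdef blocks; the compiler constant-folds the conditions and
discards the dead branches while still parsing and type-checking every
arm. A minimal stand-alone sketch of the idea (illustration only: the
IS_ENABLED() stand-in below is hand-rolled, the real macro lives in
<linux/kconfig.h>, and the OPT_* / set_pte_demo() names are invented):

#include <stdio.h>

#define OPT_SMP       1		/* pretend CONFIG_SMP=y */
#define OPT_PTE_64BIT 0		/* pretend CONFIG_PTE_64BIT=n */
#define IS_ENABLED(opt) (opt)	/* stand-in for <linux/kconfig.h> */

static void set_pte_demo(int percpu)
{
	/* Every arm is compiled and type-checked, but the constant
	 * conditions let the compiler delete the unreachable ones. */
	if ((!IS_ENABLED(OPT_SMP) && !IS_ENABLED(OPT_PTE_64BIT)) || percpu)
		printf("simple store preserving _PAGE_HASHPTE\n");
	else if (IS_ENABLED(OPT_PTE_64BIT))
		printf("two 32-bit stores with a barrier in between\n");
	else
		printf("atomic pte_update()\n");
}

int main(void)
{
	set_pte_demo(0);	/* SMP + 32-bit PTEs -> atomic pte_update() */
	set_pte_demo(1);	/* percpu mapping -> simple preserving store */
	return 0;
}

Unlike #ifdef, this keeps all three cases visible to the compiler in
every configuration, so a config-specific branch cannot silently bit-rot.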

Signed-off-by: Christophe Leroy 
---
v2: Reorganise comments: first case becomes third and third becomes first.
---
 arch/powerpc/include/asm/book3s/32/pgtable.h | 77 
 1 file changed, 31 insertions(+), 46 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h
index 7bf1fe7297c6..d49c2a9d4ffe 100644
--- a/arch/powerpc/include/asm/book3s/32/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
@@ -541,58 +541,43 @@ static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
 
 
 /* This low level function performs the actual PTE insertion
- * Setting the PTE depends on the MMU type and other factors. It's
- * an horrible mess that I'm not going to try to clean up now but
- * I'm keeping it in one place rather than spread around
+ * Setting the PTE depends on the MMU type and other factors.
+ *
+ * First case is 32-bit in UP mode with 32-bit PTEs, we need to preserve
+ * the _PAGE_HASHPTE bit since we may not have invalidated the previous
+ * translation in the hash yet (done in a subsequent flush_tlb_xxx())
+ * and so we need to keep track that this PTE needs invalidating.
+ *
+ * Second case is 32-bit with 64-bit PTE.  In this case, we
+ * can just store as long as we do the two halves in the right order
+ * with a barrier in between. This is possible because we take care,
+ * in the hash code, to pre-invalidate if the PTE was already hashed,
+ * which synchronizes us with any concurrent invalidation.
+ * In the percpu case, we fallback to the simple update preserving
+ * the hash bits (ie, same as the non-SMP case).
+ *
+ * Third case is 32-bit in SMP mode with 32-bit PTEs. We use the
+ * helper pte_update() which does an atomic update. We need to do that
+ * because a concurrent invalidation can clear _PAGE_HASHPTE. If it's a
+ * per-CPU PTE such as a kmap_atomic, we also do a simple update preserving
+ * the hash bits instead.
  */
 static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr,
				pte_t *ptep, pte_t pte, int percpu)
 {
-#if defined(CONFIG_SMP) && !defined(CONFIG_PTE_64BIT)
-	/* First case is 32-bit Hash MMU in SMP mode with 32-bit PTEs. We use the
-	 * helper pte_update() which does an atomic update. We need to do that
-	 * because a concurrent invalidation can clear _PAGE_HASHPTE. If it's a
-	 * per-CPU PTE such as a kmap_atomic, we do a simple update preserving
-	 * the hash bits instead (ie, same as the non-SMP case)
-	 */
-	if (percpu)
-		*ptep = __pte((pte_val(*ptep) & _PAGE_HASHPTE)
-			      | (pte_val(pte) & ~_PAGE_HASHPTE));
-	else
-		pte_update(mm, addr, ptep, ~_PAGE_HASHPTE, pte_val(pte), 0);
+	if ((!IS_ENABLED(CONFIG_SMP) && !IS_ENABLED(CONFIG_PTE_64BIT)) || percpu) {
+		*ptep = __pte((pte_val(*ptep) & _PAGE_HASHPTE) |
+			      (pte_val(pte) & ~_PAGE_HASHPTE));
+	} else if (IS_ENABLED(CONFIG_PTE_64BIT)) {
+		if (pte_val(*ptep) & _PAGE_HASHPTE)
+			flush_hash_entry(mm, ptep, addr);
+
-#elif defined(CONFIG_PTE_64BIT)
-	/* Second case is 32-bit with 64-bit PTE.  In this case, we
-	 * can just store as long as we do the two halves in the right order
-	 * with a barrier in between. This is possible because we take care,
-	 * in the hash code, to pre-invalidate if the PTE was already hashed,
-	 * which synchronizes us with any concurrent invalidation.
-	 * In the percpu case, we also fallback to the simple update preserving
-	 * the hash bits
-	 */
-	if (percpu) {
-		*ptep = __pte((pte_val(*ptep) & _PAGE_HASHPTE)
-			      | (pte_val(pte) & ~_PAGE_HASHPTE));
-		return;
+		asm volatile("stw%X0 %2,%0; eieio; stw%X1 %L2,%1" :
+			     "=m" (*ptep), "=m" (*((unsigned char *)ptep+4)) :
+			     "r" (pte) : "memory");
+	} else {
+		pte_update(mm, addr, ptep, ~_PAGE_HASHPTE, pte_val(pte), 0);
}
-	if (pte_val(*ptep) & _PAGE_HASHPTE)
-		flush_hash_entry(mm, ptep, addr);
-	__asm__ __volatile__("\
-		stw%X0 %2,%0\n\
-		eieio\n\
-		stw%X1 %L2,%1"
-	: "=m" (*ptep), "=m" (*((unsigned char *)ptep+4))
-	: "r" (pte) : "memory");
-
-#else
-	/* Third case is 32-bit hash table in UP mode, we need to preserve
-	 * the _PAGE_HASHPTE bit since we may not have invalidated the previous
-	 * translation in the hash yet (done in a subsequent flush_tlb_xxx())
-	 * and see we need to keep track that this PTE needs invalidating
-	 */
-	*ptep = __pte((pte_val(*ptep) & _PAGE_HASHPTE)
-		      | (pte_val(pte) & ~_PAGE_HASHPTE));
-#endif
 }
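
As background to the "second case" above: the point of the stw/eieio/stw
sequence is that a concurrent hash-table walker must never observe the
freshly written flags word paired with the stale other half of the
64-bit PTE. A rough user-space analogue of that ordering requirement
(a sketch only: the demo_* names are invented, and a C11 release fence
stands in for PowerPC's eieio store barrier; this is not kernel code):

#include <stdatomic.h>
#include <stdint.h>

/* Invented layout: a 64-bit "PTE" stored as two 32-bit words, where
 * only the second word makes the entry look valid to a reader. */
struct demo_pte64 {
	uint32_t hi;	/* extended physical-address bits */
	uint32_t lo;	/* flags, including the "valid" bit */
};

static void demo_pte_store(volatile struct demo_pte64 *p, uint64_t val)
{
	/* Store the half a reader does not key validity on first... */
	p->hi = (uint32_t)(val >> 32);
	/* ...fence so the two stores cannot be reordered past each
	 * other (the job eieio does on PowerPC)... */
	atomic_thread_fence(memory_order_release);
	/* ...then publish the flags half. A racing reader can see the
	 * old entry or the new one, but never new flags combined with
	 * the old upper half. */
	p->lo = (uint32_t)val;
}

int main(void)
{
	static struct demo_pte64 pte;

	demo_pte_store(&pte, 0x0000000100000001ULL);
	return 0;
}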