This patch series adds a split pmd page table lock for book3s64. nohash64 should also be able to switch to this; I still need to work out the code dependency. This series might also have broken the build on platforms other than book3s64. I am sending this early to get feedback on whether we should continue with the approach.
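As a rough illustration of the idea only (a userspace model, not code from this series): one backing page is carved into fixed-size fragments, and the lock for every fragment lives in the metadata of that backing page rather than in one global lock. All names here (model_page, model_frag_alloc, model_lockptr, the sizes) are hypothetical, and a pthread mutex stands in for the kernel spinlock. Fragment allocation itself is not serialized in this sketch.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical sizes, chosen only for the illustration. */
#define MODEL_PAGE_SIZE 4096u
#define MODEL_FRAG_SIZE 1024u
#define MODEL_NR_FRAGS  (MODEL_PAGE_SIZE / MODEL_FRAG_SIZE)

/* Stand-in for struct page: metadata describing one backing page. */
struct model_page {
	pthread_mutex_t ptl;     /* per-page lock, models the split pmd lock */
	unsigned int next_frag;  /* index of the next unused fragment */
	unsigned char *mem;      /* the backing memory that gets carved up */
};

/* Allocate one backing page and initialise its lock. */
struct model_page *model_page_new(void)
{
	struct model_page *pg = malloc(sizeof(*pg));

	if (!pg)
		return NULL;
	pg->mem = malloc(MODEL_PAGE_SIZE);
	if (!pg->mem) {
		free(pg);
		return NULL;
	}
	pthread_mutex_init(&pg->ptl, NULL);
	pg->next_frag = 0;
	return pg;
}

/* Hand out the next fragment, or NULL once the page is used up. */
void *model_frag_alloc(struct model_page *pg)
{
	if (pg->next_frag >= MODEL_NR_FRAGS)
		return NULL;
	return pg->mem + (size_t)pg->next_frag++ * MODEL_FRAG_SIZE;
}

/*
 * Return the lock covering a fragment. Every fragment of the same
 * backing page maps to the same per-page lock.
 */
pthread_mutex_t *model_lockptr(struct model_page *pg, void *frag)
{
	unsigned char *p = frag;

	assert(p >= pg->mem && p < pg->mem + MODEL_PAGE_SIZE);
	return &pg->ptl;
}

/* Release the lock and the backing memory. */
void model_page_free(struct model_page *pg)
{
	pthread_mutex_destroy(&pg->ptl);
	free(pg->mem);
	free(pg);
}
```

In this model, threads working on tables backed by different pages take different locks, which is the property that removes the single point of contention.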
We switch the pmd allocator to use something similar to what we already use for level 4 page table allocation. We get an order 0 page, divide it into fragments, and hand out a fragment for each pmd page table request. The pmd lock is now stashed in the struct page backing the allocated page.

The series helps in reducing lock contention on mm->page_table_lock.

without patch

    32.72%  mmap_bench  [kernel.vmlinux]  [k] do_raw_spin_lock
            |
            ---do_raw_spin_lock
               |
                --32.68%--0
                          |
                          |--15.82%--pte_fragment_alloc
                          |          |
                          |           --15.79%--do_huge_pmd_anonymous_page
                          |                     __handle_mm_fault
                          |                     handle_mm_fault
                          |                     __do_page_fault
                          |                     handle_page_fault
                          |                     test_mmap
                          |                     test_mmap
                          |                     start_thread
                          |                     __clone
                          |
                          |--14.95%--do_huge_pmd_anonymous_page
                          |          __handle_mm_fault
                          |          handle_mm_fault
                          |          __do_page_fault
                          |          handle_page_fault
                          |          test_mmap
                          |          test_mmap
                          |          start_thread
                          |          __clone
                          |

with patch

    12.89%  mmap_bench  [kernel.vmlinux]  [k] do_raw_spin_lock
            |
            ---do_raw_spin_lock
               |
                --12.83%--0
                          |
                          |--3.21%--pagevec_lru_move_fn
                          |          __lru_cache_add
                          |          |
                          |           --2.74%--do_huge_pmd_anonymous_page
                          |                     __handle_mm_fault
                          |                     handle_mm_fault
                          |                     __do_page_fault
                          |                     handle_page_fault
                          |                     test_mmap
                          |                     test_mmap
                          |                     start_thread
                          |                     __clone
                          |
                          |--3.11%--do_huge_pmd_anonymous_page
                          |          __handle_mm_fault
                          |          handle_mm_fault
                          |          __do_page_fault
                          |          handle_page_fault
                          |          test_mmap
                          |          test_mmap
                          |          start_thread
                          |          __clone
    .....
                          |
                           --0.55%--pte_fragment_alloc
                                     |
                                      --0.55%--do_huge_pmd_anonymous_page
                                                __handle_mm_fault
                                                handle_mm_fault
                                                __do_page_fault
                                                handle_page_fault
                                                test_mmap
                                                test_mmap
                                                start_thread
                                                __clone

Aneesh Kumar K.V (11):
  powerpc/mm/book3s64: Move book3s64 code to pgtable-book3s64
  powerpc/kvm: Switch kvm pmd allocator to custom allocator
  powerpc/mm: Use pmd_lockptr instead of opencoding it
  powerpc/mm: Rename pte fragment functions
  powerpc/mm/book3e/64: Remove unsupported 64Kpage size from 64bit booke
  powerpc/mm/nohash: Remove pte fragment dependency from nohash
  powerpc/mm/book3s64/4k: Switch 4k pagesize config to use pagetable fragment
  powerpc/book3s64/mm: Simplify the rcu callback for page table free
  powerpc/mm: Implement helpers for pagetable fragment support at PMD level
  powerpc/mm: Use page fragments for allocation page table at PMD level
  powerpc/book3s64: Enable split pmd ptlock.

 arch/powerpc/include/asm/book3s/64/hash-4k.h     |   8 +-
 arch/powerpc/include/asm/book3s/64/hash-64k.h    |   7 +
 arch/powerpc/include/asm/book3s/64/hash.h        |  10 -
 arch/powerpc/include/asm/book3s/64/mmu.h         |   7 +-
 arch/powerpc/include/asm/book3s/64/pgalloc.h     |  46 +---
 arch/powerpc/include/asm/book3s/64/pgtable.h     |  20 +-
 arch/powerpc/include/asm/book3s/64/radix-4k.h    |   3 +
 arch/powerpc/include/asm/book3s/64/radix-64k.h   |   4 +
 arch/powerpc/include/asm/mmu-book3e.h            |   6 -
 arch/powerpc/include/asm/nohash/64/pgalloc.h     |  95 +++-----
 arch/powerpc/include/asm/nohash/64/pgtable-64k.h |  57 -----
 arch/powerpc/include/asm/nohash/64/pgtable.h     |   8 +-
 arch/powerpc/kvm/book3s_64_mmu_radix.c           |  36 ++-
 arch/powerpc/mm/hash_utils_64.c                  |   3 +-
 arch/powerpc/mm/mmu_context_book3s64.c           |  39 +++-
 arch/powerpc/mm/pgtable-book3s64.c               | 267 ++++++++++++++++++++++-
 arch/powerpc/mm/pgtable-hash64.c                 |   8 +-
 arch/powerpc/mm/pgtable-radix.c                  |   5 +-
 arch/powerpc/mm/pgtable_64.c                     | 171 ---------------
 arch/powerpc/platforms/Kconfig.cputype           |   4 +
 20 files changed, 427 insertions(+), 377 deletions(-)
 delete mode 100644 arch/powerpc/include/asm/nohash/64/pgtable-64k.h

--
2.14.3