Kfence on book3s64 Hash is broken. Kfence depends upon the debug_pagealloc
infrastructure on Hash. debug_pagealloc allocates a linear map based on the size
of the DRAM, i.e. 1 byte for every 64k page. That means for 16TB of DRAM, it
needs 256MB of memory for the linear map. On pseries, memory for the linear map
comes from the RMA region, which has a size limitation. On P8 the RMA is 512MB,
in which we also have to fit the crash kernel at 256MB, paca allocations and
emergency stacks.
That means there is not enough memory in the RMA region for a linear map
based on DRAM size (as required by debug_pagealloc).
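
The sizing above can be checked with a small standalone sketch. The helper
name linear_map_bytes() is made up for illustration and is not a kernel
function; it only assumes the "1 byte per page" rule stated above.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel API): bytes needed for a linear map
 * that keeps one byte of state per page of DRAM.
 */
static uint64_t linear_map_bytes(uint64_t dram_bytes, unsigned int page_shift)
{
	assert(page_shift < 64);
	/* Number of pages == number of one-byte entries. */
	return dram_bytes >> page_shift;
}
```

With 16TB of DRAM (16ULL << 40) and 64k pages (page_shift = 16) this gives
256ULL << 20, i.e. 256MB, which is half of a 512MB RMA before anything else
is placed there.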

Now kfence only requires memory for its kfence objects. By default kfence
requires only (255 + 1) * 2 pages, i.e. 32 MB with a 64k page size.
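
As a quick check of that figure, here is a minimal sketch of the arithmetic.
The helper name kfence_pool_bytes() is an assumption for illustration; it
just encodes the (objects + 1) * 2 pages formula quoted above.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel API): size of a kfence pool that
 * uses (num_objects + 1) * 2 pages, as described in the text above.
 */
static uint64_t kfence_pool_bytes(uint64_t num_objects, unsigned int page_shift)
{
	assert(page_shift < 64);
	uint64_t pages = (num_objects + 1) * 2;
	return pages << page_shift;
}
```

For the default of 255 objects and 64k pages (page_shift = 16) this is 512
pages, or 32MB. A one-byte-per-page map for that pool then needs only 512
bytes.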

This patch series removes the direct dependency of kfence on debug_pagealloc
infrastructure. We rework the Hash kernel linear map functions to take the
linear map array as a parameter so that they can support debug_pagealloc and
kfence individually. That means we no longer need to keep a linear map region
of size DRAM_SIZE >> PAGE_SHIFT for kfence.
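
A rough userspace sketch of the idea of passing the slot array as a
parameter follows. This is not the code from the patches: struct linear_map,
SLOT_MAPPED and both functions are hypothetical names chosen only to show
how one helper can serve a DRAM-sized array and a pool-sized array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SLOT_MAPPED 0x80 /* hypothetical "page is mapped" flag */

/*
 * Hypothetical descriptor: one byte of slot state per page for the
 * range starting at base and covering npages pages.
 */
struct linear_map {
	uint64_t base;
	size_t npages;
	unsigned int page_shift;
	uint8_t *slots;
};

/* Translate an address into an index relative to this map's base. */
static size_t linear_map_index(const struct linear_map *map, uint64_t addr)
{
	assert(addr >= map->base);
	size_t idx = (size_t)((addr - map->base) >> map->page_shift);
	assert(idx < map->npages);
	return idx;
}

/* The same helper works for any caller-supplied slot array. */
static void linear_map_set_mapped(struct linear_map *map, uint64_t addr,
				  int mapped)
{
	size_t idx = linear_map_index(map, addr);

	if (mapped)
		map->slots[idx] |= SLOT_MAPPED;
	else
		map->slots[idx] &= (uint8_t)~SLOT_MAPPED;
}
```

In this sketch a debug_pagealloc-style user would pass an array with one
entry per page of DRAM, while a kfence-style user would pass an array sized
only for its pool, with base set to the start of the pool.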

Hence, the current patch series solves the boot failure problem when kfence is
enabled by optimizing the memory it requires for linear map within RMA region.

On radix we don't have this problem because there is no SLB and no RMA region
size limitation.

Testing:
========
The patch series is still undergoing some testing. However, given that it's in
good shape, I wanted to send it out for review.
Note: It passes kfence kunit tests.
  <dmesg results>
  [   48.715649][    T1] # kfence: pass:23 fail:0 skip:2 total:25
  [   48.716697][    T1] # Totals: pass:23 fail:0 skip:2 total:25
  [   48.717842][    T1] ok 1 kfence


TODOs: (for future patches)
===========================
However, there is still another problem which IMO makes kfence not suitable to
be enabled by default on production kernels with Hash MMU, i.e. when kfence is
enabled the kernel linear map uses PAGE_SIZE mappings rather than 16MB mappings
as in the original case. Correct me if I am wrong, but theoretically at least
this could cause TLB pressure in certain cases, which makes it not really
suitable to be enabled by default on production kernels on Hash.

This is because on P8 book3s64, we don't support mapping multiple pagesizes
(MPSS) within the kernel linear map segment. Is this understanding correct?
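
To put a rough number on the mapping-size concern above, here is a small
sketch that counts how many translations cover a region at a given page
size. The helper name translations_needed() is hypothetical, and the count
is only an upper bound on distinct entries, not a measurement of TLB
behaviour.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel API): number of page-sized
 * translations needed to cover region_bytes, rounding up.
 */
static uint64_t translations_needed(uint64_t region_bytes,
				    unsigned int page_shift)
{
	assert(page_shift < 64);
	uint64_t page_bytes = 1ULL << page_shift;
	return (region_bytes + page_bytes - 1) >> page_shift;
}
```

Covering 16MB takes one translation with a 16MB mapping (page_shift = 24)
and 256 translations with 64k mappings (page_shift = 16).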


Ritesh Harjani (IBM) (10):
  book3s64/hash: Remove kfence support temporarily
  book3s64/hash: Refactor kernel linear map related calls
  book3s64/hash: Add hash_debug_pagealloc_add_slot() function
  book3s64/hash: Add hash_debug_pagealloc_alloc_slots() function
  book3s64/hash: Refactor hash__kernel_map_pages() function
  book3s64/hash: Make kernel_map_linear_page() generic
  book3s64/hash: Disable debug_pagealloc if it requires more memory
  book3s64/hash: Add kfence functionality
  book3s64/radix: Refactoring common kfence related functions
  book3s64/hash: Disable kfence if not early init

 arch/powerpc/include/asm/kfence.h        |   2 +
 arch/powerpc/mm/book3s64/hash_utils.c    | 364 +++++++++++++++++------
 arch/powerpc/mm/book3s64/radix_pgtable.c |  12 -
 arch/powerpc/mm/init-common.c            |  12 +
 4 files changed, 286 insertions(+), 104 deletions(-)

--
2.45.2
