On 12/21/25 06:59, Ritesh Harjani (IBM) wrote:
Hi Sourabh,
Sourabh Jain <[email protected]> writes:
Skip processing hugepage kernel arguments (hugepagesz, hugepages, and
default_hugepagesz) when hugepages are not supported by the
architecture.
Some architectures may need to disable hugepages based on conditions
discovered during kernel boot. The hugepages_supported() helper allows
architecture code to advertise whether hugepages are supported.
Currently, normal hugepage allocation is guarded by
hugepages_supported(), but gigantic hugepages are allocated regardless
of this check. This causes problems on powerpc for fadump (firmware-
assisted dump).
In the fadump (firmware-assisted dump) scenario, a production kernel
crash causes the system to boot into a special kernel whose sole
purpose is to collect the memory dump and reboot. Features such as
hugepages are not required in this environment and should be
disabled.
For example, when the fadump kernel boots with the following kernel
arguments:
default_hugepagesz=1GB hugepagesz=1GB hugepages=200
Before this patch, the kernel prints the following logs:
HugeTLB: allocating 200 of page size 1.00 GiB failed. Only allocated 58
hugepages.
HugeTLB support is disabled!
HugeTLB: huge pages not supported, ignoring associated command-line parameters
hugetlbfs: disabling because there are no supported hugepage sizes
Even though the logs state that HugeTLB support is disabled, gigantic
hugepages are still allocated. This causes the fadump kernel to run out
of memory during boot.
After this patch is applied, the kernel prints the following logs for
the same command line:
HugeTLB: hugepages unsupported, ignoring default_hugepagesz=1GB cmdline
HugeTLB: hugepages unsupported, ignoring hugepagesz=1GB cmdline
HugeTLB: hugepages unsupported, ignoring hugepages=200 cmdline
HugeTLB support is disabled!
hugetlbfs: disabling because there are no supported hugepage sizes
To fix the issue, gigantic hugepage allocation should be guarded by
hugepages_supported().
Previously, two approaches were proposed to bring gigantic hugepage
allocation under hugepages_supported():
[1] Check hugepages_supported() in the generic code before allocating
gigantic hugepages
[2] Make arch_hugetlb_valid_size() return false for all hugetlb sizes
Approach [2] has two minor issues:
1. It prints misleading logs about invalid hugepage sizes
2. The kernel still processes hugepage kernel arguments unnecessarily
To control gigantic hugepage allocation, skip processing hugepage kernel
arguments (default_hugepagesz, hugepagesz and hugepages) when
hugepages_supported() returns false.
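To make the idea concrete, here is a minimal userspace sketch of the guard, not the actual patch. hugepages_supported() is mocked with a flag (arch_hugepages_ok), and hugepages_setup() and requested_pages are hypothetical stand-ins for the real command-line handler and allocator state; only the log text is taken from the output quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for the arch-controlled state behind
 * hugepages_supported(); a fadump kernel would clear it during boot. */
static bool arch_hugepages_ok = true;

static bool hugepages_supported(void)
{
	return arch_hugepages_ok;
}

/* Number of gigantic pages the (mock) allocator has been asked for. */
static unsigned long requested_pages;

/*
 * Mock handler for "hugepages=<n>". Returns true if the argument was
 * processed, false if it was skipped because hugepages are unsupported.
 */
static bool hugepages_setup(const char *val)
{
	assert(val != NULL);

	if (!hugepages_supported()) {
		printf("HugeTLB: hugepages unsupported, ignoring hugepages=%s cmdline\n", val);
		return false;
	}

	requested_pages = strtoul(val, NULL, 10);
	return true;
}
```

With the flag cleared, hugepages_setup("200") only logs and leaves requested_pages untouched, so nothing is queued for allocation later; the same check would sit at the top of the hugepagesz and default_hugepagesz handlers.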
Link: https://lore.kernel.org/all/[email protected]/ [1]
Link: https://lore.kernel.org/all/[email protected]/ [2]
Fixes: c2833a5bf75b ("hugetlbfs: fix changes to command line processing")
I appreciate our proactiveness in responding quickly on the mailing list, but I
suggest we give folks enough time before sending the next version
please ;).
Your email from last night [1] says that we will use this fixes tag but
you haven't even given us 24hrs to respond to that email thread :). Now
we've sent this v6, with David's Acked-by and my Reviewed-by,
which makes it seem like everything was agreed upon, but that isn't
actually the case.
Agreed.
My main concern was -
A fixes tag means it might get auto backported to stable kernels too,
Not in the MM world -- IIRC. I think there is an agreement that we
decide what should go into stable and what not.
Andrew can correct me if my memory is wrong.
But we can always jump in and say that something should not go to stable
trees.
which means if the fixes tag is incorrect it could even break stable
kernels then.
[1]:
https://lore.kernel.org/linuxppc-dev/[email protected]/T/#m6e16738c03b2b2a8d09717f6291e46207033507a
Anyways,
Coming back to the fixes tag. I did mention a bit of a history [2] of
whatever I could find while reviewing this patch. I am not sure whether
you have looked into the links shared in that email or not. Here [2]:
[2]: https://lore.kernel.org/linuxppc-dev/[email protected]/
Where I am coming from is.. The current patch is actually a partial
revert of the patch mentioned in the fixes tag. That means if this patch
gets applied to the older stable kernels, it would end up bringing the
same problem back, which the "Fixes" tagged patch is fixing in the 1st
place, isn't it? See this discussion [3]...
[3]:
https://lore.kernel.org/all/[email protected]/T/#m0eee87b458d93559426b8b0e78dc6ebcd26ad3ae
... So, IMO, the right fixes tag, if we have to add one, should be the
patch which moved the hpage_shift initialization to happen early, i.e. in
mmu_early_init_devtree. That would be this patch [4]:
[4]:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2354ad252b66695be02f4acd18e37bf6264f0464
Now, it's not really that the patch [4] had any issue as such. But it
seems that the current fix can only be applied after patch [4] is
taken.
Do we agree?
I think we should document all that in the cover letter, and describe
that this partial revert is only possible after [4], and that this must
be considered when attempting any kind of stable backports.
Thanks for pointing all that out.
--
Cheers
David