On Mon, Mar 25, 2024 at 03:55:54PM +0100, Christophe Leroy wrote:
> Unlike many architectures, powerpc 8xx hardware tablewalk requires
> a two level process for all page sizes, allthough second level only
> has one entry when pagesize is 8M.
>
> To fit with Linux page table topology and without requiring special
> page directory layout like hugepd, the page entry will be replicated
> 1024 times in the standard page table. However for large pages it is
> necessary to set bits in the level-1 (PMD) entry. At the time being,
> for 512k pages the flag is kept in the PTE and inserted in the PMD
> entry at TLB miss exception, that is necessary because we can have
> pages of different sizes in a page table. However the 12 PTE bits are
> fully used and there is no room for an additional bit for page size.
>
> For 8M pages, there will be only one page per PMD entry, it is
> therefore possible to flag the pagesize in the PMD entry, with the
> advantage that the information will already be at the right place for
> the hardware.
>
> To do so, add a new helper called pmd_populate_size() which takes the
> page size as an additional argument, and modify __pte_alloc() to also
> take that argument. pte_alloc() is left unmodified in order to
> reduce churn on callers, and a pte_alloc_size() is added for use by
> pte_alloc_huge().
>
> When an architecture doesn't provide pmd_populate_size(),
> pmd_populate() is used as a fallback.
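
For readers following along, the fallback described in the last quoted
paragraph could be sketched roughly like this. It is a user-space mock,
not the code from the patch: the simplified pmd_t, the empty mm_struct,
the MOCK_* names and the use of an #ifndef guard are all assumptions
made for illustration, and the real helpers take different argument
types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
typedef struct { unsigned long val; } pmd_t;
struct mm_struct { int unused; };

#define MOCK_SZ_8M        (8UL << 20)  /* 8M page size in bytes */
#define MOCK_PMD_8M_FLAG  0x1UL        /* made-up "8M page" bit in the PMD */

/* Baseline helper: make the PMD entry point at a page table. */
static void pmd_populate(struct mm_struct *mm, pmd_t *pmd,
			 unsigned long pte_table)
{
	(void)mm;
	assert(pmd != NULL);
	pmd->val = pte_table;
}

#ifdef MOCK_ARCH_HAS_PMD_POPULATE_SIZE
/* An architecture that cares about sz records it in the PMD entry. */
static void pmd_populate_size(struct mm_struct *mm, pmd_t *pmd,
			      unsigned long pte_table, unsigned long sz)
{
	pmd_populate(mm, pmd, pte_table);
	if (sz == MOCK_SZ_8M)
		pmd->val |= MOCK_PMD_8M_FLAG;
}
#define pmd_populate_size pmd_populate_size
#endif

#ifndef pmd_populate_size
/* Generic fallback: the size argument is simply dropped. */
#define pmd_populate_size(mm, pmd, pte_table, sz) \
	pmd_populate(mm, pmd, pte_table)
#endif
```

With MOCK_ARCH_HAS_PMD_POPULATE_SIZE left undefined, a call such as
pmd_populate_size(&mm, &pmd, table, MOCK_SZ_8M) collapses to plain
pmd_populate(), which is the behaviour the commit message describes for
architectures that do not provide the new helper.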

I think it would be a good idea to document what the semantics of sz
are supposed to be.

Just a general remark, probably nothing for this series, but with these
new arguments the historical naming seems pretty tortured for
pte_alloc_size(). Something like pmd_populate_leaf(size) as a naming
scheme would make this more intuitive. I.e. pmd_populate_leaf() gives
you a PMD entry where the entry points to a leaf page table able to
store folios of at least size.

Anyhow, I thought the edits to the mm helpers were fine, certainly
much nicer than hugepd. Do you see a path to remove hugepd entirely
from here?

Thanks,
Jason