Re: [PATCH 1/6] powerpc: Free up four 64K PTE bits in 4K backed HPTE pages
On Thu, Jul 27, 2017 at 07:29:32AM +0530, Aneesh Kumar K.V wrote:
> On 07/26/2017 09:36 PM, Ram Pai wrote:
> >On Wed, Jul 26, 2017 at 04:05:48PM +0530, Aneesh Kumar K.V wrote:
> >>Ram Pai writes:
> >>
> >>>diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h
> >>>b/arch/powerpc/include/asm/book3s/64/hash-64k.h
> >>>index 9732837..62e580c 100644
> >>>--- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
> >>>+++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
> >>>@@ -12,18 +12,14 @@
> >>>  */
> >>> #define H_PAGE_COMBO	_RPAGE_RPN0 /* this is a combo 4k page */
> >>> #define H_PAGE_4K_PFN	_RPAGE_RPN1 /* PFN is for a single 4k page */
> >>>+#define H_PAGE_BUSY	_RPAGE_RPN42 /* software: PTE & hash are busy */
> >>
> >>Why are we moving H_PAGE_BUSY? Right now 4k and 64k linux page table
> >>format looks similar.
> >
> >The goal is to clear off all the _RPAGE_RSV* bits so that they can be
> >used for protection keys. The aim is to keep the protection-bits in the
> >_RPAGE_RSV* bits, so that they will work as-is whenever radix MMU enables
> >protection keys.
> >
> >Yes this makes the PTE format differ from 4k PTE. Hopefully it is a
> >small inconvenience. The PTE format for 4K is anyway not exactly the
> >same compared to 64K PTE format. For example, higher RPN bits are
> >used on 4K but not on 64k. Lower RPN bits are used on 64k but not
> >on 4k.
>
> I was wondering why in this patch? You do it in the next patch

True. Because in this patch, we have not yet freed up bit _RPAGE_RPN44.
The _RPAGE_RPN44 bit is still used by H_PAGE_F_GIX for 64K backed HPTEs.
Hence I have temporarily parked H_PAGE_BUSY at _RPAGE_RPN42.

I could leave H_PAGE_BUSY at bit _RPAGE_RSV1 and move it to _RPAGE_RPN44
in the next patch. But by doing so, I would not have truly released bit
_RPAGE_RSV1 for 4K backed HPTEs, as claimed in the title of this patch.

>
> --- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
> +++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
> @@ -12,7 +12,7 @@
>   */
>  #define H_PAGE_COMBO	_RPAGE_RPN0 /* this is a combo 4k page */
>  #define H_PAGE_4K_PFN	_RPAGE_RPN1 /* PFN is for a single 4k page */
> -#define H_PAGE_BUSY	_RPAGE_RPN42 /* software: PTE & hash are busy */
> +#define H_PAGE_BUSY	_RPAGE_RPN44 /* software: PTE & hash are busy */
> ...

-- 
Ram Pai
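For readers following the bit numbers in this exchange, here is a minimal sketch of how the big-endian bit numbers used in the patch description translate into masks. It assumes a 64-bit PTE and the IBM convention in which bit 0 is the most significant bit; be_bit() is a hypothetical helper written for illustration, not a kernel function, and the kernel's own _RPAGE_* definitions remain the authoritative values.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the kernel): turn an IBM-style
 * big-endian bit number, where bit 0 is the most significant bit of a
 * 64-bit word, into a single-bit mask.
 */
static uint64_t be_bit(unsigned int n)
{
	assert(n < 64);
	return UINT64_C(1) << (63 - n);
}

/* Bit 3: where the patch description says H_PAGE_BUSY lived before. */
static uint64_t busy_mask_before(void)
{
	return be_bit(3);
}

/* Bit 9: where the patch description says H_PAGE_BUSY moves to. */
static uint64_t busy_mask_after(void)
{
	return be_bit(9);
}
```

Under these assumptions bit 3 corresponds to the mask 0x1000000000000000 and bit 9 to 0x0040000000000000, which is why moving H_PAGE_BUSY out of bit 3 leaves that position free for other uses.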
Re: [PATCH 1/6] powerpc: Free up four 64K PTE bits in 4K backed HPTE pages
On 07/26/2017 09:36 PM, Ram Pai wrote:
> On Wed, Jul 26, 2017 at 04:05:48PM +0530, Aneesh Kumar K.V wrote:
>> Ram Pai writes:
>>
>>> diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
>>> index 9732837..62e580c 100644
>>> --- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
>>> +++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
>>> @@ -12,18 +12,14 @@
>>>   */
>>>  #define H_PAGE_COMBO	_RPAGE_RPN0 /* this is a combo 4k page */
>>>  #define H_PAGE_4K_PFN	_RPAGE_RPN1 /* PFN is for a single 4k page */
>>> +#define H_PAGE_BUSY	_RPAGE_RPN42 /* software: PTE & hash are busy */
>>
>> Why are we moving H_PAGE_BUSY? Right now 4k and 64k linux page table
>> format looks similar.
>
> The goal is to clear off all the _RPAGE_RSV* bits so that they can be
> used for protection keys. The aim is to keep the protection-bits in the
> _RPAGE_RSV* bits, so that they will work as-is whenever radix MMU enables
> protection keys.
>
> Yes this makes the PTE format differ from 4k PTE. Hopefully it is a
> small inconvenience. The PTE format for 4K is anyway not exactly the
> same compared to 64K PTE format. For example, higher RPN bits are
> used on 4K but not on 64k. Lower RPN bits are used on 64k but not
> on 4k.

I was wondering why in this patch? You do it in the next patch:

--- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
@@ -12,7 +12,7 @@
  */
 #define H_PAGE_COMBO	_RPAGE_RPN0 /* this is a combo 4k page */
 #define H_PAGE_4K_PFN	_RPAGE_RPN1 /* PFN is for a single 4k page */
-#define H_PAGE_BUSY	_RPAGE_RPN42 /* software: PTE & hash are busy */
+#define H_PAGE_BUSY	_RPAGE_RPN44 /* software: PTE & hash are busy */

-aneesh
Re: [PATCH 1/6] powerpc: Free up four 64K PTE bits in 4K backed HPTE pages
On Wed, Jul 26, 2017 at 04:05:48PM +0530, Aneesh Kumar K.V wrote:
> Ram Pai writes:
>
> > Rearrange 64K PTE bits to free up bits 3, 4, 5 and 6,
> > in the 4K backed HPTE pages. These bits continue to be used
> > for 64K backed HPTE pages in this patch, but will be freed
> > up in the next patch. The bit numbers are big-endian as
> > defined in the ISA3.0
> >
> > The patch does the following change to the 4k hpte backed
> > 64K PTE's format.
> >
> > H_PAGE_BUSY moves from bit 3 to bit 9 (B bit in the figure
> > 		below)
> > V0 which occupied bit 4 is not used anymore.
> > V1 which occupied bit 5 is not used anymore.
> > V2 which occupied bit 6 is not used anymore.
> > V3 which occupied bit 7 is not used anymore.
> >
> > Before the patch, the 4k backed 64k PTE format was as follows
> >
> >  0 1 2 3 4  5  6  7  8 9 10..............63
> >  : : : : :  :  :  :  : : :                :
> >  v v v v v  v  v  v  v v v                v
> >
> > ,-,-,-,-,--,--,--,--,-,-,-,-,-,----,-,-,-,-,
> > |x|x|x|B|V0|V1|V2|V3|x| | |x|x|....|x|x|x|x| <- primary pte
> > '_'_'_'_'__'__'__'__'_'_'_'_'_'____'_'_'_'_'
> > |S|G|I|X|S |G |I |X |S|G|I|X|......|S|G|I|X| <- secondary pte
> > '_'_'_'_'__'__'__'__'_'_'_'_'______'_'_'_'_'
> >
> > After the patch, the 4k backed 64k PTE format is as follows
> >
> >  0 1 2 3 4  5  6  7  8 9 10..............63
> >  : : : : :  :  :  :  : : :                :
> >  v v v v v  v  v  v  v v v                v
> >
> > ,-,-,-,-,--,--,--,--,-,-,-,-,-,----,-,-,-,-,
> > |x|x|x| |  |  |  |  |x|B| |x|x|....|.|.|.|.| <- primary pte
> > '_'_'_'_'__'__'__'__'_'_'_'_'_'____'_'_'_'_'
> > |S|G|I|X|S |G |I |X |S|G|I|X|......|S|G|I|X| <- secondary pte
> > '_'_'_'_'__'__'__'__'_'_'_'_'______'_'_'_'_'
> >
> > The four bits S,G,I,X (one quadruplet per 4k HPTE) that
> > cache the hash-bucket slot value are initialized to
> > 1,1,1,1 indicating an invalid slot. If a HPTE gets
> > cached in a <1,1,1,1> slot (i.e. 7th slot of secondary hash
> > bucket), it is released immediately. In other words,
> > even though <1,1,1,1> is a valid slot value in the hash
> > bucket, we consider it invalid and release the slot and
> > the HPTE. This gives us the opportunity to determine
> > the validity of S,G,I,X bits based on its contents and
> > not on any of the bits V0,V1,V2 or V3 in the primary PTE
> >
> > When we release a HPTE cached in the <1,1,1,1> slot
> > we also release a legitimate slot in the primary
> > hash bucket and unmap its corresponding HPTE. This
> > is to ensure that we do get a HPTE cached in a slot
> > of the primary hash bucket, the next time we retry.
> >
> > Though treating the <1,1,1,1> slot as invalid reduces the
> > number of available slots in the hash bucket and may
> > have an effect on the performance, the probability of
> > hitting a <1,1,1,1> slot is extremely low.
> >
> > Compared to the current scheme, the above described
> > scheme reduces the number of false hash table updates
> > significantly and has the added advantage of
> > releasing four valuable PTE bits for other purpose.
> >
> > NOTE: even though bits 3, 4, 5, 6, 7 are not used when
> > the 64K PTE is backed by 4k HPTE, they continue to be
> > used if the PTE gets backed by 64k HPTE. The next
> > patch will decouple that as well, and truly release the
> > bits.
> >
> > This idea was jointly developed by Paul Mackerras,
> > Aneesh, Michael Ellerman and myself.
> >
> > 4K PTE format remains unchanged currently.
> >
> > The patch does the following code changes
> > a) PTE flags are split between 64k and 4k header files.
> > b) __hash_page_4K() is reimplemented to reflect the
> >    above logic.
> >
> > Reviewed-by: Aneesh Kumar K.V
> > Signed-off-by: Ram Pai
> > ---
> >  arch/powerpc/include/asm/book3s/64/hash-4k.h  |  2 +
> >  arch/powerpc/include/asm/book3s/64/hash-64k.h |  8 +--
> >  arch/powerpc/include/asm/book3s/64/hash.h     |  1 -
> >  arch/powerpc/mm/hash64_64k.c                  | 74
> >  arch/powerpc/mm/hash_utils_64.c               |  4 +-
> >  5 files changed, 55 insertions(+), 34 deletions(-)
> >
> > diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h
> > b/arch/powerpc/include/asm/book3s/64/hash-4k.h
> > index 0c4e470..f959c00 100644
> > --- a/arch/powerpc/include/asm/book3s/64/hash-4k.h
> > +++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h
> > @@ -16,6 +16,8 @@
> >  #define H_PUD_TABLE_SIZE	(sizeof(pud_t) << H_PUD_INDEX_SIZE)
> >  #define H_PGD_TABLE_SIZE	(sizeof(pgd_t) << H_PGD_INDEX_SIZE)
> >
> > +#define H_PAGE_BUSY	_RPAGE_RSV1 /* software: PTE & hash are busy */
> > +
> >  /* PTE flags
Re: [PATCH 1/6] powerpc: Free up four 64K PTE bits in 4K backed HPTE pages
Ram Pai writes:

> Rearrange 64K PTE bits to free up bits 3, 4, 5 and 6,
> in the 4K backed HPTE pages. These bits continue to be used
> for 64K backed HPTE pages in this patch, but will be freed
> up in the next patch. The bit numbers are big-endian as
> defined in the ISA3.0
>
> The patch does the following change to the 4k hpte backed
> 64K PTE's format.
>
> H_PAGE_BUSY moves from bit 3 to bit 9 (B bit in the figure
> 		below)
> V0 which occupied bit 4 is not used anymore.
> V1 which occupied bit 5 is not used anymore.
> V2 which occupied bit 6 is not used anymore.
> V3 which occupied bit 7 is not used anymore.
>
> Before the patch, the 4k backed 64k PTE format was as follows
>
>  0 1 2 3 4  5  6  7  8 9 10..............63
>  : : : : :  :  :  :  : : :                :
>  v v v v v  v  v  v  v v v                v
>
> ,-,-,-,-,--,--,--,--,-,-,-,-,-,----,-,-,-,-,
> |x|x|x|B|V0|V1|V2|V3|x| | |x|x|....|x|x|x|x| <- primary pte
> '_'_'_'_'__'__'__'__'_'_'_'_'_'____'_'_'_'_'
> |S|G|I|X|S |G |I |X |S|G|I|X|......|S|G|I|X| <- secondary pte
> '_'_'_'_'__'__'__'__'_'_'_'_'______'_'_'_'_'
>
> After the patch, the 4k backed 64k PTE format is as follows
>
>  0 1 2 3 4  5  6  7  8 9 10..............63
>  : : : : :  :  :  :  : : :                :
>  v v v v v  v  v  v  v v v                v
>
> ,-,-,-,-,--,--,--,--,-,-,-,-,-,----,-,-,-,-,
> |x|x|x| |  |  |  |  |x|B| |x|x|....|.|.|.|.| <- primary pte
> '_'_'_'_'__'__'__'__'_'_'_'_'_'____'_'_'_'_'
> |S|G|I|X|S |G |I |X |S|G|I|X|......|S|G|I|X| <- secondary pte
> '_'_'_'_'__'__'__'__'_'_'_'_'______'_'_'_'_'
>
> The four bits S,G,I,X (one quadruplet per 4k HPTE) that
> cache the hash-bucket slot value are initialized to
> 1,1,1,1 indicating an invalid slot. If a HPTE gets
> cached in a <1,1,1,1> slot (i.e. 7th slot of secondary hash
> bucket), it is released immediately. In other words,
> even though <1,1,1,1> is a valid slot value in the hash
> bucket, we consider it invalid and release the slot and
> the HPTE. This gives us the opportunity to determine
> the validity of S,G,I,X bits based on its contents and
> not on any of the bits V0,V1,V2 or V3 in the primary PTE
>
> When we release a HPTE cached in the <1,1,1,1> slot
> we also release a legitimate slot in the primary
> hash bucket and unmap its corresponding HPTE. This
> is to ensure that we do get a HPTE cached in a slot
> of the primary hash bucket, the next time we retry.
>
> Though treating the <1,1,1,1> slot as invalid reduces the
> number of available slots in the hash bucket and may
> have an effect on the performance, the probability of
> hitting a <1,1,1,1> slot is extremely low.
>
> Compared to the current scheme, the above described
> scheme reduces the number of false hash table updates
> significantly and has the added advantage of
> releasing four valuable PTE bits for other purpose.
>
> NOTE: even though bits 3, 4, 5, 6, 7 are not used when
> the 64K PTE is backed by 4k HPTE, they continue to be
> used if the PTE gets backed by 64k HPTE. The next
> patch will decouple that as well, and truly release the
> bits.
>
> This idea was jointly developed by Paul Mackerras,
> Aneesh, Michael Ellerman and myself.
>
> 4K PTE format remains unchanged currently.
>
> The patch does the following code changes
> a) PTE flags are split between 64k and 4k header files.
> b) __hash_page_4K() is reimplemented to reflect the
>    above logic.
>
> Reviewed-by: Aneesh Kumar K.V
> Signed-off-by: Ram Pai
> ---
>  arch/powerpc/include/asm/book3s/64/hash-4k.h  |  2 +
>  arch/powerpc/include/asm/book3s/64/hash-64k.h |  8 +--
>  arch/powerpc/include/asm/book3s/64/hash.h     |  1 -
>  arch/powerpc/mm/hash64_64k.c                  | 74
>  arch/powerpc/mm/hash_utils_64.c               |  4 +-
>  5 files changed, 55 insertions(+), 34 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h
> b/arch/powerpc/include/asm/book3s/64/hash-4k.h
> index 0c4e470..f959c00 100644
> --- a/arch/powerpc/include/asm/book3s/64/hash-4k.h
> +++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h
> @@ -16,6 +16,8 @@
>  #define H_PUD_TABLE_SIZE	(sizeof(pud_t) << H_PUD_INDEX_SIZE)
>  #define H_PGD_TABLE_SIZE	(sizeof(pgd_t) << H_PGD_INDEX_SIZE)
>
> +#define H_PAGE_BUSY	_RPAGE_RSV1 /* software: PTE & hash are busy */
> +
>  /* PTE flags to conserve for HPTE identification */
>  #define _PAGE_HPTEFLAGS (H_PAGE_BUSY | H_PAGE_HASHPTE | \
>  			 H_PAGE_F_SECOND | H_PAGE_F_GIX)
> diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h
> b/arch/powerpc/include/asm/book3s/64/hash-64k.h
> index
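The slot-caching idea in the commit message quoted above can be illustrated with a small standalone sketch. This is not the code from the patch: every name below (hidx_init, hidx_get, hidx_set, hidx_valid, slot_must_be_released) is hypothetical, and placing sub-page 0 in the lowest nibble of the secondary word is an assumption made only for illustration. What the sketch takes from the description is that each 4k HPTE gets one 4-bit S,G,I,X value, that all values start as 1,1,1,1, and that this value doubles as the "invalid" marker, so no separate valid bits (V0..V3) are needed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SUBPAGES_PER_64K	16	/* a 64K page is backed by sixteen 4K HPTEs */
#define HIDX_BITS		4	/* one S,G,I,X quadruplet per 4K HPTE */
#define HIDX_MASK		0xfU
#define HIDX_INVALID		0xfU	/* 1,1,1,1: treated as "no slot cached" */

/* All sixteen quadruplets start out as 1,1,1,1, i.e. invalid. */
static uint64_t hidx_init(void)
{
	return ~UINT64_C(0);
}

/* Read the cached slot value for one 4K sub-page (assumed nibble order). */
static unsigned int hidx_get(uint64_t second, unsigned int subpage)
{
	assert(subpage < SUBPAGES_PER_64K);
	return (unsigned int)(second >> (subpage * HIDX_BITS)) & HIDX_MASK;
}

/* Store a slot value for one 4K sub-page and return the updated word. */
static uint64_t hidx_set(uint64_t second, unsigned int subpage,
			 unsigned int slot)
{
	unsigned int shift = subpage * HIDX_BITS;

	assert(subpage < SUBPAGES_PER_64K);
	assert(slot <= HIDX_MASK);
	second &= ~((uint64_t)HIDX_MASK << shift);
	return second | ((uint64_t)slot << shift);
}

/* Validity is derived from the contents alone. */
static bool hidx_valid(uint64_t second, unsigned int subpage)
{
	return hidx_get(second, subpage) != HIDX_INVALID;
}

/*
 * If an insertion happens to land in the slot whose encoding is 1,1,1,1,
 * the description says the HPTE is released and the insertion retried.
 */
static bool slot_must_be_released(unsigned int slot)
{
	return slot == HIDX_INVALID;
}
```

The trade-off named in the commit message is visible here: one of the sixteen possible slot encodings is sacrificed as a sentinel, in exchange for not spending primary PTE bits on validity.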