Re: [Qemu-devel] [RFC 2/6] cputlb: do not evict invalid entries to the vtlb
On Mon, Oct 08, 2018 at 12:46:26 -0700, Richard Henderson wrote: > On 10/8/18 7:42 AM, Emilio G. Cota wrote: > > On Sun, Oct 07, 2018 at 19:09:01 -0700, Richard Henderson wrote: > >> On 10/6/18 2:45 PM, Emilio G. Cota wrote: > >>> Currently we evict an entry to the victim TLB when it doesn't match > >>> the current address. But it could be that there's no match because > >>> the current entry is invalid. Do not evict the entry to the vtlb > >>> in that case. > >>> > >>> This change will help us keep track of the TLB's use rate. > >>> > >>> Signed-off-by: Emilio G. Cota > >>> --- > >>> include/exec/cpu-all.h | 14 ++ > >>> accel/tcg/cputlb.c | 2 +- > >>> 2 files changed, 15 insertions(+), 1 deletion(-) > >>> > >>> diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h > >>> index 117d2fbbca..d938dedafc 100644 > >>> --- a/include/exec/cpu-all.h > >>> +++ b/include/exec/cpu-all.h > >>> @@ -362,6 +362,20 @@ static inline bool tlb_hit(target_ulong tlb_addr, > >>> target_ulong addr) > >>> return tlb_hit_page(tlb_addr, addr & TARGET_PAGE_MASK); > >>> } > >>> > >>> +/** > >>> + * tlb_is_valid - return true if at least one of the addresses is valid > >>> + * @te: pointer to CPUTLBEntry > >>> + * > >>> + * This is useful when we don't have a particular address to compare > >>> against, > >>> + * and we just want to know whether any entry holds valid data. > >>> + */ > >>> +static inline bool tlb_is_valid(const CPUTLBEntry *te) > >>> +{ > >>> +return !(te->addr_read & TLB_INVALID_MASK) || > >>> + !(te->addr_write & TLB_INVALID_MASK) || > >>> + !(te->addr_code & TLB_INVALID_MASK); > >>> +} > >> > >> No, I think you misunderstand. > >> > >> First, TLB_INVALID_MASK is only ever set for addr_write, in response to > >> PAGE_WRITE_INV. Second, an entry that is invalid for write is still valid > >> for > >> read+exec. So there is benefit to swapping it out to the victim cache. > >> > >> This is used by the s390x target to make the "lowpage" read-only, which is > >> a > >> special architected 512 byte range within pages 0 and 1. This is done by > >> forcing writes, but not reads, back through tlb_fill. > > > > Aah I see. The point is to avoid pushing to the victim cache > > an entry that is all invalid, not just partially invalid. > > I've slightly misspoken here. It is (ab)used for the s390 thing, but that bit > is also set by memset -1. Perhaps you might check > > static inline bool tlb_is_invalid(const CPUTLBEntry *te) > { > return te->addr_read & te->addr_write & te->addr_code & TLB_INVALID_MASK; > } > > That is, all forms of access have the INVALID bit set. I see. I just ran a few tests and the below gives the best performance, so I'll go with it: static inline bool tlb_is_empty(const CPUTLBEntry *te) { return te->addr_read == -1 && te->addr_write == -1 && te->addr_code == -1; } I renamed it from "invalid" to "empty" to avoid even thinking about the invalid flag. Thanks, Emilio
Re: [Qemu-devel] [RFC 2/6] cputlb: do not evict invalid entries to the vtlb
On 10/8/18 7:42 AM, Emilio G. Cota wrote: > On Sun, Oct 07, 2018 at 19:09:01 -0700, Richard Henderson wrote: >> On 10/6/18 2:45 PM, Emilio G. Cota wrote: >>> Currently we evict an entry to the victim TLB when it doesn't match >>> the current address. But it could be that there's no match because >>> the current entry is invalid. Do not evict the entry to the vtlb >>> in that case. >>> >>> This change will help us keep track of the TLB's use rate. >>> >>> Signed-off-by: Emilio G. Cota >>> --- >>> include/exec/cpu-all.h | 14 ++ >>> accel/tcg/cputlb.c | 2 +- >>> 2 files changed, 15 insertions(+), 1 deletion(-) >>> >>> diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h >>> index 117d2fbbca..d938dedafc 100644 >>> --- a/include/exec/cpu-all.h >>> +++ b/include/exec/cpu-all.h >>> @@ -362,6 +362,20 @@ static inline bool tlb_hit(target_ulong tlb_addr, >>> target_ulong addr) >>> return tlb_hit_page(tlb_addr, addr & TARGET_PAGE_MASK); >>> } >>> >>> +/** >>> + * tlb_is_valid - return true if at least one of the addresses is valid >>> + * @te: pointer to CPUTLBEntry >>> + * >>> + * This is useful when we don't have a particular address to compare >>> against, >>> + * and we just want to know whether any entry holds valid data. >>> + */ >>> +static inline bool tlb_is_valid(const CPUTLBEntry *te) >>> +{ >>> +return !(te->addr_read & TLB_INVALID_MASK) || >>> + !(te->addr_write & TLB_INVALID_MASK) || >>> + !(te->addr_code & TLB_INVALID_MASK); >>> +} >> >> No, I think you misunderstand. >> >> First, TLB_INVALID_MASK is only ever set for addr_write, in response to >> PAGE_WRITE_INV. Second, an entry that is invalid for write is still valid >> for >> read+exec. So there is benefit to swapping it out to the victim cache. >> >> This is used by the s390x target to make the "lowpage" read-only, which is a >> special architected 512 byte range within pages 0 and 1. This is done by >> forcing writes, but not reads, back through tlb_fill. > > Aah I see. The point is to avoid pushing to the victim cache > an entry that is all invalid, not just partially invalid. I've slightly misspoken here. It is (ab)used for the s390 thing, but that bit is also set by memset -1. Perhaps you might check static inline bool tlb_is_invalid(const CPUTLBEntry *te) { return te->addr_read & te->addr_write & te->addr_code & TLB_INVALID_MASK; } That is, all forms of access have the INVALID bit set. r~
Re: [Qemu-devel] [RFC 2/6] cputlb: do not evict invalid entries to the vtlb
On Sun, Oct 07, 2018 at 19:09:01 -0700, Richard Henderson wrote: > On 10/6/18 2:45 PM, Emilio G. Cota wrote: > > Currently we evict an entry to the victim TLB when it doesn't match > > the current address. But it could be that there's no match because > > the current entry is invalid. Do not evict the entry to the vtlb > > in that case. > > > > This change will help us keep track of the TLB's use rate. > > > > Signed-off-by: Emilio G. Cota > > --- > > include/exec/cpu-all.h | 14 ++ > > accel/tcg/cputlb.c | 2 +- > > 2 files changed, 15 insertions(+), 1 deletion(-) > > > > diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h > > index 117d2fbbca..d938dedafc 100644 > > --- a/include/exec/cpu-all.h > > +++ b/include/exec/cpu-all.h > > @@ -362,6 +362,20 @@ static inline bool tlb_hit(target_ulong tlb_addr, > > target_ulong addr) > > return tlb_hit_page(tlb_addr, addr & TARGET_PAGE_MASK); > > } > > > > +/** > > + * tlb_is_valid - return true if at least one of the addresses is valid > > + * @te: pointer to CPUTLBEntry > > + * > > + * This is useful when we don't have a particular address to compare > > against, > > + * and we just want to know whether any entry holds valid data. > > + */ > > +static inline bool tlb_is_valid(const CPUTLBEntry *te) > > +{ > > +return !(te->addr_read & TLB_INVALID_MASK) || > > + !(te->addr_write & TLB_INVALID_MASK) || > > + !(te->addr_code & TLB_INVALID_MASK); > > +} > > No, I think you misunderstand. > > First, TLB_INVALID_MASK is only ever set for addr_write, in response to > PAGE_WRITE_INV. Second, an entry that is invalid for write is still valid for > read+exec. So there is benefit to swapping it out to the victim cache. > > This is used by the s390x target to make the "lowpage" read-only, which is a > special architected 512 byte range within pages 0 and 1. This is done by > forcing writes, but not reads, back through tlb_fill. Aah I see. The point is to avoid pushing to the victim cache an entry that is all invalid, not just partially invalid. Thanks for the clarification! Emilio
Re: [Qemu-devel] [RFC 2/6] cputlb: do not evict invalid entries to the vtlb
On 10/6/18 2:45 PM, Emilio G. Cota wrote: > Currently we evict an entry to the victim TLB when it doesn't match > the current address. But it could be that there's no match because > the current entry is invalid. Do not evict the entry to the vtlb > in that case. > > This change will help us keep track of the TLB's use rate. > > Signed-off-by: Emilio G. Cota > --- > include/exec/cpu-all.h | 14 ++ > accel/tcg/cputlb.c | 2 +- > 2 files changed, 15 insertions(+), 1 deletion(-) > > diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h > index 117d2fbbca..d938dedafc 100644 > --- a/include/exec/cpu-all.h > +++ b/include/exec/cpu-all.h > @@ -362,6 +362,20 @@ static inline bool tlb_hit(target_ulong tlb_addr, > target_ulong addr) > return tlb_hit_page(tlb_addr, addr & TARGET_PAGE_MASK); > } > > +/** > + * tlb_is_valid - return true if at least one of the addresses is valid > + * @te: pointer to CPUTLBEntry > + * > + * This is useful when we don't have a particular address to compare against, > + * and we just want to know whether any entry holds valid data. > + */ > +static inline bool tlb_is_valid(const CPUTLBEntry *te) > +{ > +return !(te->addr_read & TLB_INVALID_MASK) || > + !(te->addr_write & TLB_INVALID_MASK) || > + !(te->addr_code & TLB_INVALID_MASK); > +} No, I think you misunderstand. First, TLB_INVALID_MASK is only ever set for addr_write, in response to PAGE_WRITE_INV. Second, an entry that is invalid for write is still valid for read+exec. So there is benefit to swapping it out to the victim cache. This is used by the s390x target to make the "lowpage" read-only, which is a special architected 512 byte range within pages 0 and 1. This is done by forcing writes, but not reads, back through tlb_fill. r~