Re: [Qemu-devel] [RFC 2/6] cputlb: do not evict invalid entries to the vtlb

2018-10-08 Thread Emilio G. Cota
On Mon, Oct 08, 2018 at 12:46:26 -0700, Richard Henderson wrote:
> On 10/8/18 7:42 AM, Emilio G. Cota wrote:
> > On Sun, Oct 07, 2018 at 19:09:01 -0700, Richard Henderson wrote:
> >> On 10/6/18 2:45 PM, Emilio G. Cota wrote:
> >>> Currently we evict an entry to the victim TLB when it doesn't match
> >>> the current address. But it could be that there's no match because
> >>> the current entry is invalid. Do not evict the entry to the vtlb
> >>> in that case.
> >>>
> >>> This change will help us keep track of the TLB's use rate.
> >>>
> >>> Signed-off-by: Emilio G. Cota 
> >>> ---
> >>>  include/exec/cpu-all.h | 14 ++++++++++++++
> >>>  accel/tcg/cputlb.c     |  2 +-
> >>>  2 files changed, 15 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
> >>> index 117d2fbbca..d938dedafc 100644
> >>> --- a/include/exec/cpu-all.h
> >>> +++ b/include/exec/cpu-all.h
> >>> @@ -362,6 +362,20 @@ static inline bool tlb_hit(target_ulong tlb_addr, target_ulong addr)
> >>>      return tlb_hit_page(tlb_addr, addr & TARGET_PAGE_MASK);
> >>>  }
> >>>  
> >>> +/**
> >>> + * tlb_is_valid - return true if at least one of the addresses is valid
> >>> + * @te: pointer to CPUTLBEntry
> >>> + *
> >>> + * This is useful when we don't have a particular address to compare against,
> >>> + * and we just want to know whether any entry holds valid data.
> >>> + */
> >>> +static inline bool tlb_is_valid(const CPUTLBEntry *te)
> >>> +{
> >>> +    return !(te->addr_read & TLB_INVALID_MASK) ||
> >>> +           !(te->addr_write & TLB_INVALID_MASK) ||
> >>> +           !(te->addr_code & TLB_INVALID_MASK);
> >>> +}
> >>
> >> No, I think you misunderstand.
> >>
> >> First, TLB_INVALID_MASK is only ever set for addr_write, in response to
> >> PAGE_WRITE_INV.  Second, an entry that is invalid for write is still valid for
> >> read+exec.  So there is benefit to swapping it out to the victim cache.
> >>
> >> This is used by the s390x target to make the "lowpage" read-only, which is a
> >> special architected 512 byte range within pages 0 and 1.  This is done by
> >> forcing writes, but not reads, back through tlb_fill.
> > 
> > Aah I see. The point is to avoid pushing to the victim cache
> > an entry that is all invalid, not just partially invalid.
> 
> I've slightly misspoken here.  It is (ab)used for the s390 thing, but that bit
> is also set by memset -1.  Perhaps you might check
> 
> static inline bool tlb_is_invalid(const CPUTLBEntry *te)
> {
>     return te->addr_read & te->addr_write & te->addr_code & TLB_INVALID_MASK;
> }
> 
> That is, all forms of access have the INVALID bit set.

I see. I just ran a few tests and the below gives the best
performance, so I'll go with it:

static inline bool tlb_is_empty(const CPUTLBEntry *te)
{
    return te->addr_read == -1 && te->addr_write == -1 && te->addr_code == -1;
}

I renamed it from "invalid" to "empty" to avoid even thinking about
the invalid flag.
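
So for v2 the hunk in tlb_set_page_with_attrs becomes (sketch, just the
condition change on top of this RFC):

-    if (!tlb_hit_page_anyprot(te, vaddr_page)) {
+    if (!tlb_hit_page_anyprot(te, vaddr_page) && !tlb_is_empty(te)) {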

Thanks,

Emilio



Re: [Qemu-devel] [RFC 2/6] cputlb: do not evict invalid entries to the vtlb

2018-10-08 Thread Richard Henderson
On 10/8/18 7:42 AM, Emilio G. Cota wrote:
> On Sun, Oct 07, 2018 at 19:09:01 -0700, Richard Henderson wrote:
>> On 10/6/18 2:45 PM, Emilio G. Cota wrote:
>>> Currently we evict an entry to the victim TLB when it doesn't match
>>> the current address. But it could be that there's no match because
>>> the current entry is invalid. Do not evict the entry to the vtlb
>>> in that case.
>>>
>>> This change will help us keep track of the TLB's use rate.
>>>
>>> Signed-off-by: Emilio G. Cota 
>>> ---
>>>  include/exec/cpu-all.h | 14 ++++++++++++++
>>>  accel/tcg/cputlb.c     |  2 +-
>>>  2 files changed, 15 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
>>> index 117d2fbbca..d938dedafc 100644
>>> --- a/include/exec/cpu-all.h
>>> +++ b/include/exec/cpu-all.h
>>> @@ -362,6 +362,20 @@ static inline bool tlb_hit(target_ulong tlb_addr, target_ulong addr)
>>>      return tlb_hit_page(tlb_addr, addr & TARGET_PAGE_MASK);
>>>  }
>>>  
>>> +/**
>>> + * tlb_is_valid - return true if at least one of the addresses is valid
>>> + * @te: pointer to CPUTLBEntry
>>> + *
>>> + * This is useful when we don't have a particular address to compare against,
>>> + * and we just want to know whether any entry holds valid data.
>>> + */
>>> +static inline bool tlb_is_valid(const CPUTLBEntry *te)
>>> +{
>>> +    return !(te->addr_read & TLB_INVALID_MASK) ||
>>> +           !(te->addr_write & TLB_INVALID_MASK) ||
>>> +           !(te->addr_code & TLB_INVALID_MASK);
>>> +}
>>
>> No, I think you misunderstand.
>>
>> First, TLB_INVALID_MASK is only ever set for addr_write, in response to
>> PAGE_WRITE_INV.  Second, an entry that is invalid for write is still valid for
>> read+exec.  So there is benefit to swapping it out to the victim cache.
>>
>> This is used by the s390x target to make the "lowpage" read-only, which is a
>> special architected 512 byte range within pages 0 and 1.  This is done by
>> forcing writes, but not reads, back through tlb_fill.
> 
> Aah I see. The point is to avoid pushing to the victim cache
> an entry that is all invalid, not just partially invalid.

I've slightly misspoken here.  It is (ab)used for the s390 thing, but that bit
is also set by memset -1.  Perhaps you might check

static inline bool tlb_is_invalid(const CPUTLBEntry *te)
{
    return te->addr_read & te->addr_write & te->addr_code & TLB_INVALID_MASK;
}

That is, all forms of access have the INVALID bit set.
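
By "memset -1" I mean the flush path, which from memory is roughly the
following (modulo the exact function it lives in):

    /* Wipe the whole TLB: every field becomes -1, so all three address
     * fields have TLB_INVALID_MASK set and the check above fires.  */
    memset(env->tlb_table, -1, sizeof(env->tlb_table));
    memset(env->tlb_v_table, -1, sizeof(env->tlb_v_table));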


r~



Re: [Qemu-devel] [RFC 2/6] cputlb: do not evict invalid entries to the vtlb

2018-10-08 Thread Emilio G. Cota
On Sun, Oct 07, 2018 at 19:09:01 -0700, Richard Henderson wrote:
> On 10/6/18 2:45 PM, Emilio G. Cota wrote:
> > Currently we evict an entry to the victim TLB when it doesn't match
> > the current address. But it could be that there's no match because
> > the current entry is invalid. Do not evict the entry to the vtlb
> > in that case.
> > 
> > This change will help us keep track of the TLB's use rate.
> > 
> > Signed-off-by: Emilio G. Cota 
> > ---
> >  include/exec/cpu-all.h | 14 ++++++++++++++
> >  accel/tcg/cputlb.c     |  2 +-
> >  2 files changed, 15 insertions(+), 1 deletion(-)
> > 
> > diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
> > index 117d2fbbca..d938dedafc 100644
> > --- a/include/exec/cpu-all.h
> > +++ b/include/exec/cpu-all.h
> > @@ -362,6 +362,20 @@ static inline bool tlb_hit(target_ulong tlb_addr, target_ulong addr)
> >      return tlb_hit_page(tlb_addr, addr & TARGET_PAGE_MASK);
> >  }
> >  
> > +/**
> > + * tlb_is_valid - return true if at least one of the addresses is valid
> > + * @te: pointer to CPUTLBEntry
> > + *
> > + * This is useful when we don't have a particular address to compare against,
> > + * and we just want to know whether any entry holds valid data.
> > + */
> > +static inline bool tlb_is_valid(const CPUTLBEntry *te)
> > +{
> > +    return !(te->addr_read & TLB_INVALID_MASK) ||
> > +           !(te->addr_write & TLB_INVALID_MASK) ||
> > +           !(te->addr_code & TLB_INVALID_MASK);
> > +}
> 
> No, I think you misunderstand.
> 
> First, TLB_INVALID_MASK is only ever set for addr_write, in response to
> PAGE_WRITE_INV.  Second, an entry that is invalid for write is still valid for
> read+exec.  So there is benefit to swapping it out to the victim cache.
> 
> This is used by the s390x target to make the "lowpage" read-only, which is a
> special architected 512 byte range within pages 0 and 1.  This is done by
> forcing writes, but not reads, back through tlb_fill.

Aah I see. The point is to avoid pushing to the victim cache
an entry that is all invalid, not just partially invalid.
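
IOW, keep evicting entries like the first one below, and only skip the
second kind (hypothetical field values, just to check my understanding):

    /* Partially invalid: the s390x lowpage case.  Reads and exec still
     * hit, so the entry is worth keeping in the victim cache.  */
    te->addr_read  = vaddr_page;
    te->addr_code  = vaddr_page;
    te->addr_write = vaddr_page | TLB_INVALID_MASK;

    /* Fully invalid, e.g. right after a flush: nothing can hit it, so
     * evicting it to the vtlb would be pointless.  */
    te->addr_read = te->addr_write = te->addr_code = -1;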

Thanks for the clarification!

Emilio



Re: [Qemu-devel] [RFC 2/6] cputlb: do not evict invalid entries to the vtlb

2018-10-07 Thread Richard Henderson
On 10/6/18 2:45 PM, Emilio G. Cota wrote:
> Currently we evict an entry to the victim TLB when it doesn't match
> the current address. But it could be that there's no match because
> the current entry is invalid. Do not evict the entry to the vtlb
> in that case.
> 
> This change will help us keep track of the TLB's use rate.
> 
> Signed-off-by: Emilio G. Cota 
> ---
>  include/exec/cpu-all.h | 14 ++++++++++++++
>  accel/tcg/cputlb.c     |  2 +-
>  2 files changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
> index 117d2fbbca..d938dedafc 100644
> --- a/include/exec/cpu-all.h
> +++ b/include/exec/cpu-all.h
> @@ -362,6 +362,20 @@ static inline bool tlb_hit(target_ulong tlb_addr, target_ulong addr)
>      return tlb_hit_page(tlb_addr, addr & TARGET_PAGE_MASK);
>  }
>  
> +/**
> + * tlb_is_valid - return true if at least one of the addresses is valid
> + * @te: pointer to CPUTLBEntry
> + *
> + * This is useful when we don't have a particular address to compare against,
> + * and we just want to know whether any entry holds valid data.
> + */
> +static inline bool tlb_is_valid(const CPUTLBEntry *te)
> +{
> +    return !(te->addr_read & TLB_INVALID_MASK) ||
> +           !(te->addr_write & TLB_INVALID_MASK) ||
> +           !(te->addr_code & TLB_INVALID_MASK);
> +}

No, I think you misunderstand.

First, TLB_INVALID_MASK is only ever set for addr_write, in response to
PAGE_WRITE_INV.  Second, an entry that is invalid for write is still valid for
read+exec.  So there is benefit to swapping it out to the victim cache.

This is used by the s390x target to make the "lowpage" read-only, which is a
special architected 512 byte range within pages 0 and 1.  This is done by
forcing writes, but not reads, back through tlb_fill.
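
Concretely, tlb_set_page_with_attrs does something like the following
for such pages (from memory; the exact variable names may differ):

    /* The write address still compares normally, but the INVALID bit
     * forces every store back through tlb_fill.  */
    if (prot & PAGE_WRITE_INV) {
        tn.addr_write |= TLB_INVALID_MASK;
    }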


r~



[Qemu-devel] [RFC 2/6] cputlb: do not evict invalid entries to the vtlb

2018-10-06 Thread Emilio G. Cota
Currently we evict an entry to the victim TLB when it doesn't match
the current address. But it could be that there's no match because
the current entry is invalid. Do not evict the entry to the vtlb
in that case.

This change will help us keep track of the TLB's use rate.

Signed-off-by: Emilio G. Cota 
---
 include/exec/cpu-all.h | 14 ++++++++++++++
 accel/tcg/cputlb.c     |  2 +-
 2 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
index 117d2fbbca..d938dedafc 100644
--- a/include/exec/cpu-all.h
+++ b/include/exec/cpu-all.h
@@ -362,6 +362,20 @@ static inline bool tlb_hit(target_ulong tlb_addr, target_ulong addr)
     return tlb_hit_page(tlb_addr, addr & TARGET_PAGE_MASK);
 }
 
+/**
+ * tlb_is_valid - return true if at least one of the addresses is valid
+ * @te: pointer to CPUTLBEntry
+ *
+ * This is useful when we don't have a particular address to compare against,
+ * and we just want to know whether any entry holds valid data.
+ */
+static inline bool tlb_is_valid(const CPUTLBEntry *te)
+{
+    return !(te->addr_read & TLB_INVALID_MASK) ||
+           !(te->addr_write & TLB_INVALID_MASK) ||
+           !(te->addr_code & TLB_INVALID_MASK);
+}
+
 void dump_exec_info(FILE *f, fprintf_function cpu_fprintf);
 void dump_opcount_info(FILE *f, fprintf_function cpu_fprintf);
 #endif /* !CONFIG_USER_ONLY */
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 0b51efc374..0e2c149d6b 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -695,7 +695,7 @@ void tlb_set_page_with_attrs(CPUState *cpu, target_ulong vaddr,
      * Only evict the old entry to the victim tlb if it's for a
      * different page; otherwise just overwrite the stale data.
      */
-    if (!tlb_hit_page_anyprot(te, vaddr_page)) {
+    if (!tlb_hit_page_anyprot(te, vaddr_page) && tlb_is_valid(te)) {
         unsigned vidx = env->vtlb_index++ % CPU_VTLB_SIZE;
         CPUTLBEntry *tv = &env->tlb_v_table[mmu_idx][vidx];
 
-- 
2.17.1