Re: [Qemu-devel] [RFC 2/6] cputlb: do not evict invalid entries to the vtlb

2018-10-08 Thread Emilio G. Cota
On Mon, Oct 08, 2018 at 12:46:26 -0700, Richard Henderson wrote:
> On 10/8/18 7:42 AM, Emilio G. Cota wrote:
> > On Sun, Oct 07, 2018 at 19:09:01 -0700, Richard Henderson wrote:
> >> On 10/6/18 2:45 PM, Emilio G. Cota wrote:
> >>> Currently we evict an entry to the victim TLB when it doesn't match
> >>> the current address. But it could be that there's no match because
> >>> the current entry is invalid. Do not evict the entry to the vtlb
> >>> in that case.
> >>>
> >>> This change will help us keep track of the TLB's use rate.
> >>>
> >>> Signed-off-by: Emilio G. Cota 
> >>> ---
> >>>  include/exec/cpu-all.h | 14 ++++++++++++++
> >>>  accel/tcg/cputlb.c     |  2 +-
> >>>  2 files changed, 15 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
> >>> index 117d2fbbca..d938dedafc 100644
> >>> --- a/include/exec/cpu-all.h
> >>> +++ b/include/exec/cpu-all.h
> >>> @@ -362,6 +362,20 @@ static inline bool tlb_hit(target_ulong tlb_addr, target_ulong addr)
> >>>      return tlb_hit_page(tlb_addr, addr & TARGET_PAGE_MASK);
> >>>  }
> >>>  
> >>> +/**
> >>> + * tlb_is_valid - return true if at least one of the addresses is valid
> >>> + * @te: pointer to CPUTLBEntry
> >>> + *
> >>> + * This is useful when we don't have a particular address to compare against,
> >>> + * and we just want to know whether any entry holds valid data.
> >>> + */
> >>> +static inline bool tlb_is_valid(const CPUTLBEntry *te)
> >>> +{
> >>> +    return !(te->addr_read & TLB_INVALID_MASK) ||
> >>> +           !(te->addr_write & TLB_INVALID_MASK) ||
> >>> +           !(te->addr_code & TLB_INVALID_MASK);
> >>> +}
> >>
> >> No, I think you misunderstand.
> >>
> >> First, TLB_INVALID_MASK is only ever set for addr_write, in response to
> >> PAGE_WRITE_INV.  Second, an entry that is invalid for write is still valid for
> >> read+exec.  So there is benefit to swapping it out to the victim cache.
> >>
> >> This is used by the s390x target to make the "lowpage" read-only, which is a
> >> special architected 512 byte range within pages 0 and 1.  This is done by
> >> forcing writes, but not reads, back through tlb_fill.
> > 
> > Aah I see. The point is to avoid pushing to the victim cache
> > an entry that is all invalid, not just partially invalid.
> 
> I've slightly misspoken here.  It is (ab)used for the s390 thing, but that
> bit is also set by the memset to -1 that flushes the whole TLB.  Perhaps you
> might check
> 
> static inline bool tlb_is_invalid(const CPUTLBEntry *te)
> {
>     return te->addr_read & te->addr_write & te->addr_code & TLB_INVALID_MASK;
> }
> 
> That is, all forms of access have the INVALID bit set.

I see. I just ran a few tests and the below gives the best
performance, so I'll go with it:

static inline bool tlb_is_empty(const CPUTLBEntry *te)
{
    return te->addr_read == -1 && te->addr_write == -1 && te->addr_code == -1;
}

I renamed it from "invalid" to "empty" to avoid even thinking about
the invalid flag.
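
For reference, a standalone sketch of how the check would guard the eviction
to the victim TLB. The stand-in types and the maybe_evict() helper are mine,
for illustration only; the actual change in cputlb.c copies the old entry
into tlb_v_table[]:

/* Standalone sketch with stand-in types; maybe_evict() is a made-up
 * helper showing where tlb_is_empty() would sit, not the real code. */
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t target_ulong;

typedef struct CPUTLBEntry {
    target_ulong addr_read, addr_write, addr_code;
} CPUTLBEntry;

static inline bool tlb_is_empty(const CPUTLBEntry *te)
{
    return te->addr_read == -1 && te->addr_write == -1 &&
           te->addr_code == -1;
}

/* Only save the old entry to the victim TLB if it still holds data;
 * a flushed (all-ones) entry carries nothing worth keeping. */
static void maybe_evict(CPUTLBEntry *victim_slot, const CPUTLBEntry *te)
{
    if (!tlb_is_empty(te)) {
        *victim_slot = *te;
    }
}

int main(void)
{
    CPUTLBEntry victim = { -1, -1, -1 };
    CPUTLBEntry flushed = { -1, -1, -1 };

    maybe_evict(&victim, &flushed);  /* no-op: the old entry is empty */
    return 0;
}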

Thanks,

Emilio



Re: [Qemu-devel] [RFC 2/6] cputlb: do not evict invalid entries to the vtlb

2018-10-08 Thread Richard Henderson
On 10/8/18 7:42 AM, Emilio G. Cota wrote:
> On Sun, Oct 07, 2018 at 19:09:01 -0700, Richard Henderson wrote:
>> On 10/6/18 2:45 PM, Emilio G. Cota wrote:
>>> Currently we evict an entry to the victim TLB when it doesn't match
>>> the current address. But it could be that there's no match because
>>> the current entry is invalid. Do not evict the entry to the vtlb
>>> in that case.
>>>
>>> This change will help us keep track of the TLB's use rate.
>>>
>>> Signed-off-by: Emilio G. Cota 
>>> ---
>>>  include/exec/cpu-all.h | 14 ++++++++++++++
>>>  accel/tcg/cputlb.c     |  2 +-
>>>  2 files changed, 15 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
>>> index 117d2fbbca..d938dedafc 100644
>>> --- a/include/exec/cpu-all.h
>>> +++ b/include/exec/cpu-all.h
>>> @@ -362,6 +362,20 @@ static inline bool tlb_hit(target_ulong tlb_addr, target_ulong addr)
>>>      return tlb_hit_page(tlb_addr, addr & TARGET_PAGE_MASK);
>>>  }
>>>  
>>> +/**
>>> + * tlb_is_valid - return true if at least one of the addresses is valid
>>> + * @te: pointer to CPUTLBEntry
>>> + *
>>> + * This is useful when we don't have a particular address to compare against,
>>> + * and we just want to know whether any entry holds valid data.
>>> + */
>>> +static inline bool tlb_is_valid(const CPUTLBEntry *te)
>>> +{
>>> +    return !(te->addr_read & TLB_INVALID_MASK) ||
>>> +           !(te->addr_write & TLB_INVALID_MASK) ||
>>> +           !(te->addr_code & TLB_INVALID_MASK);
>>> +}
>>
>> No, I think you misunderstand.
>>
>> First, TLB_INVALID_MASK is only ever set for addr_write, in response to
>> PAGE_WRITE_INV.  Second, an entry that is invalid for write is still valid for
>> read+exec.  So there is benefit to swapping it out to the victim cache.
>>
>> This is used by the s390x target to make the "lowpage" read-only, which is a
>> special architected 512 byte range within pages 0 and 1.  This is done by
>> forcing writes, but not reads, back through tlb_fill.
> 
> Aah I see. The point is to avoid pushing to the victim cache
> an entry that is all invalid, not just partially invalid.

I've slightly misspoken here.  It is (ab)used for the s390 thing, but that
bit is also set by the memset to -1 that flushes the whole TLB.  Perhaps you
might check

static inline bool tlb_is_invalid(const CPUTLBEntry *te)
{
    return te->addr_read & te->addr_write & te->addr_code & TLB_INVALID_MASK;
}

That is, all forms of access have the INVALID bit set.
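
As a quick illustration, a standalone sketch (stand-in types and a stand-in
TLB_INVALID_MASK value, not the real definitions): after a flush's memset to
-1 all three comparators carry the bit, while a write-only-invalid entry
does not:

/* Standalone sketch contrasting a flushed entry with an s390-style
 * write-only-invalid entry; types and mask value are stand-ins. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

typedef uint64_t target_ulong;
#define TLB_INVALID_MASK (1 << 3)   /* stand-in bit for the sketch */

typedef struct CPUTLBEntry {
    target_ulong addr_read, addr_write, addr_code;
} CPUTLBEntry;

static inline bool tlb_is_invalid(const CPUTLBEntry *te)
{
    return te->addr_read & te->addr_write & te->addr_code & TLB_INVALID_MASK;
}

int main(void)
{
    CPUTLBEntry flushed, lowpage;

    /* flushed entry: memset to -1 sets the bit in all three fields */
    memset(&flushed, -1, sizeof(flushed));

    /* s390-style entry: only writes are forced back through tlb_fill */
    lowpage.addr_read = 0x1000;
    lowpage.addr_code = 0x1000;
    lowpage.addr_write = 0x1000 | TLB_INVALID_MASK;

    printf("flushed: %d, write-only-invalid: %d\n",
           tlb_is_invalid(&flushed), tlb_is_invalid(&lowpage));
    /* prints: flushed: 1, write-only-invalid: 0 */
    return 0;
}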


r~



Re: [Qemu-devel] [RFC 2/6] cputlb: do not evict invalid entries to the vtlb

2018-10-08 Thread Emilio G. Cota
On Sun, Oct 07, 2018 at 19:09:01 -0700, Richard Henderson wrote:
> On 10/6/18 2:45 PM, Emilio G. Cota wrote:
> > Currently we evict an entry to the victim TLB when it doesn't match
> > the current address. But it could be that there's no match because
> > the current entry is invalid. Do not evict the entry to the vtlb
> > in that case.
> > 
> > This change will help us keep track of the TLB's use rate.
> > 
> > Signed-off-by: Emilio G. Cota 
> > ---
> >  include/exec/cpu-all.h | 14 ++++++++++++++
> >  accel/tcg/cputlb.c     |  2 +-
> >  2 files changed, 15 insertions(+), 1 deletion(-)
> > 
> > diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
> > index 117d2fbbca..d938dedafc 100644
> > --- a/include/exec/cpu-all.h
> > +++ b/include/exec/cpu-all.h
> > @@ -362,6 +362,20 @@ static inline bool tlb_hit(target_ulong tlb_addr, target_ulong addr)
> >      return tlb_hit_page(tlb_addr, addr & TARGET_PAGE_MASK);
> >  }
> >  
> > +/**
> > + * tlb_is_valid - return true if at least one of the addresses is valid
> > + * @te: pointer to CPUTLBEntry
> > + *
> > + * This is useful when we don't have a particular address to compare against,
> > + * and we just want to know whether any entry holds valid data.
> > + */
> > +static inline bool tlb_is_valid(const CPUTLBEntry *te)
> > +{
> > +    return !(te->addr_read & TLB_INVALID_MASK) ||
> > +           !(te->addr_write & TLB_INVALID_MASK) ||
> > +           !(te->addr_code & TLB_INVALID_MASK);
> > +}
> 
> No, I think you misunderstand.
> 
> First, TLB_INVALID_MASK is only ever set for addr_write, in response to
> PAGE_WRITE_INV.  Second, an entry that is invalid for write is still valid for
> read+exec.  So there is benefit to swapping it out to the victim cache.
> 
> This is used by the s390x target to make the "lowpage" read-only, which is a
> special architected 512 byte range within pages 0 and 1.  This is done by
> forcing writes, but not reads, back through tlb_fill.

Aah I see. The point is to avoid pushing to the victim cache
an entry that is all invalid, not just partially invalid.

Thanks for the clarification!

Emilio



Re: [Qemu-devel] [RFC 2/6] cputlb: do not evict invalid entries to the vtlb

2018-10-07 Thread Richard Henderson
On 10/6/18 2:45 PM, Emilio G. Cota wrote:
> Currently we evict an entry to the victim TLB when it doesn't match
> the current address. But it could be that there's no match because
> the current entry is invalid. Do not evict the entry to the vtlb
> in that case.
> 
> This change will help us keep track of the TLB's use rate.
> 
> Signed-off-by: Emilio G. Cota 
> ---
>  include/exec/cpu-all.h | 14 ++++++++++++++
>  accel/tcg/cputlb.c     |  2 +-
>  2 files changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
> index 117d2fbbca..d938dedafc 100644
> --- a/include/exec/cpu-all.h
> +++ b/include/exec/cpu-all.h
> @@ -362,6 +362,20 @@ static inline bool tlb_hit(target_ulong tlb_addr, target_ulong addr)
>      return tlb_hit_page(tlb_addr, addr & TARGET_PAGE_MASK);
>  }
>  
> +/**
> + * tlb_is_valid - return true if at least one of the addresses is valid
> + * @te: pointer to CPUTLBEntry
> + *
> + * This is useful when we don't have a particular address to compare against,
> + * and we just want to know whether any entry holds valid data.
> + */
> +static inline bool tlb_is_valid(const CPUTLBEntry *te)
> +{
> +    return !(te->addr_read & TLB_INVALID_MASK) ||
> +           !(te->addr_write & TLB_INVALID_MASK) ||
> +           !(te->addr_code & TLB_INVALID_MASK);
> +}

No, I think you misunderstand.

First, TLB_INVALID_MASK is only ever set for addr_write, in response to
PAGE_WRITE_INV.  Second, an entry that is invalid for write is still valid for
read+exec.  So there is benefit to swapping it out to the victim cache.

This is used by the s390x target to make the "lowpage" read-only, which is a
special architected 512 byte range within pages 0 and 1.  This is done by
forcing writes, but not reads, back through tlb_fill.
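
To sketch the mechanism (stand-in types and flag values, not the actual
cputlb.c code): the target passes PAGE_WRITE_INV when filling the entry, and
only addr_write keeps TLB_INVALID_MASK set, so stores always miss and
re-enter tlb_fill while reads and fetches keep hitting:

/* Standalone sketch of the PAGE_WRITE_INV mechanism; the flag values
 * and types are stand-ins, not the real QEMU definitions. */
#include <stdint.h>
#include <stdio.h>

typedef uint64_t target_ulong;
#define TLB_INVALID_MASK (1 << 3)   /* stand-in bit */
#define PAGE_READ        0x01
#define PAGE_WRITE       0x02
#define PAGE_WRITE_INV   0x40       /* stand-in bit */

typedef struct CPUTLBEntry {
    target_ulong addr_read, addr_write, addr_code;
} CPUTLBEntry;

/* Fill one entry: with PAGE_WRITE_INV, addr_write stays marked invalid,
 * so every store misses and goes back through tlb_fill, while reads
 * and instruction fetches keep hitting. */
static void set_entry(CPUTLBEntry *te, target_ulong address, int prot)
{
    te->addr_read = address;
    te->addr_code = address;
    te->addr_write = -1;
    if (prot & PAGE_WRITE) {
        te->addr_write = address;
        if (prot & PAGE_WRITE_INV) {
            te->addr_write |= TLB_INVALID_MASK;
        }
    }
}

int main(void)
{
    CPUTLBEntry te;

    set_entry(&te, 0x1000, PAGE_READ | PAGE_WRITE | PAGE_WRITE_INV);
    printf("reads hit: %d, stores forced to tlb_fill: %d\n",
           !(te.addr_read & TLB_INVALID_MASK),
           !!(te.addr_write & TLB_INVALID_MASK));
    return 0;
}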


r~