On 27.11.2012, at 00:16, Paul Mackerras wrote:

> On Mon, Nov 26, 2012 at 11:03:19PM +0100, Alexander Graf wrote:
>> 
>> On 26.11.2012, at 22:48, Paul Mackerras wrote:
>> 
>>> On Mon, Nov 26, 2012 at 02:10:33PM +0100, Alexander Graf wrote:
>>>> 
>>>> On 23.11.2012, at 23:07, Paul Mackerras wrote:
>>>> 
>>>>> On Fri, Nov 23, 2012 at 04:43:03PM +0100, Alexander Graf wrote:
>>>>>> 
>>>>>> On 22.11.2012, at 10:28, Paul Mackerras wrote:
>>>>>> 
>>>>>>> - With the possibility of the host paging out guest pages, the use of
>>>>>>> H_LOCAL by an SMP guest is dangerous since the guest could possibly
>>>>>>> retain and use a stale TLB entry pointing to a page that had been
>>>>>>> removed from the guest.
>>>>>> 
>>>>>> I don't understand this part. Don't we flush the TLB when the page gets 
>>>>>> evicted from the shadow HTAB?
>>>>> 
>>>>> The H_LOCAL flag is something that we invented to allow the guest to
>>>>> tell the host "I only ever used this translation (HPTE) on the current
>>>>> vcpu" when it's removing or modifying an HPTE.  The idea is that that
>>>>> would then let the host use the tlbiel instruction (local TLB
>>>>> invalidate) rather than the usual global tlbie instruction.  Tlbiel is
>>>>> faster because it doesn't need to go out on the fabric and get
>>>>> processed by all cpus.  In fact our guests don't use it at present,
>>>>> but we put it in because we thought we should be able to get a
>>>>> performance improvement, particularly on large machines.
>>>>> 
>>>>> However, the catch is that the guest's setting of H_LOCAL might be
>>>>> incorrect, in which case we could have a stale TLB entry on another
>>>>> physical cpu.  While the physical page that it refers to is still
>>>>> owned by the guest, that stale entry doesn't matter from the host's
>>>>> point of view.  But if the host wants to take that page away from the
>>>>> guest, the stale entry becomes a problem.
>>>> 
>>>> That's exactly where my question lies. Does that mean we don't flush the 
>>>> TLB entry regardless when we take the page away from the guest?
>>> 
>>> The question is how to find the TLB entry if the HPTE it came from is
>>> no longer present.  Flushing a TLB entry requires a virtual address.
>>> When we're taking a page away from the guest we have the real address
>>> of the page, not the virtual address.  We can use the reverse-mapping
>>> chains to loop through all the HPTEs that map the page, and from each
>>> HPTE we can (and do) calculate a virtual address and do a TLBIE on
>>> that virtual address (each HPTE could be at a different virtual
>>> address).
>>> 
>>> The difficulty comes when we no longer have the HPTE but we
>>> potentially have a stale TLB entry, due to having used tlbiel when we
>>> removed the HPTE.  Without the HPTE the only way to get rid of the
>>> stale TLB entry would be to completely flush all the TLB entries for
>>> the guest's LPID on every physical CPU it had ever run on.  Since I
>>> don't want to go to that much effort, what I am proposing, and what
>>> this patch implements, is to not ever use tlbiel when removing HPTEs
>>> in SMP guests on POWER7.
>>> 
>>> In other words, what this patch is about is making sure we don't get
>>> these troublesome stale TLB entries.
>> 
>> I see. You could keep a list of to-be-flushed VAs around that you could skim 
>> through when taking a page away from the guest. That way you make the fast 
>> case fast (add/remove of page from the guest) and the slow path slow 
>> (paging).
> 
> Yes, I thought about that, but the problem is that the list of VAs
> could get arbitrarily long and take up a lot of host memory.

You can always cap it at an arbitrary number, similar to how the TLB itself
is limited in size.
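
A minimal sketch of such a capped list, to make the idea concrete. None of
these names exist in the kernel: PENDING_MAX, struct pending_flush, and the
callbacks are hypothetical. The overflow behaviour is an assumption as well:
once the cap is hit, the list is dropped and the next drain falls back to the
full per-LPID flush described above. The callbacks stand in for a global
tlbie on one VA and for the full flush.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PENDING_MAX 64  /* arbitrary cap (hypothetical value) */

/* VAs whose HPTEs were removed using only a local invalidate. */
struct pending_flush {
    uint64_t va[PENDING_MAX];
    size_t   count;
    bool     overflowed;  /* cap was hit: individual VAs were lost */
};

/* Stand-ins for "global tlbie on this VA" and "flush the whole LPID". */
typedef void (*flush_va_fn)(uint64_t va, void *ctx);
typedef void (*flush_all_fn)(void *ctx);

static void pending_init(struct pending_flush *q)
{
    assert(q != NULL);
    q->count = 0;
    q->overflowed = false;
}

/* Fast path: remember a VA that was only invalidated locally. */
static void pending_add(struct pending_flush *q, uint64_t va)
{
    assert(q != NULL);
    if (q->overflowed)
        return;                 /* already committed to a full flush */
    if (q->count == PENDING_MAX) {
        q->overflowed = true;   /* assumption: give up tracking VAs */
        q->count = 0;
        return;
    }
    q->va[q->count++] = va;
}

/*
 * Slow path: call before taking a page away from the guest.
 * Returns true if the full flush was needed.
 */
static bool pending_drain(struct pending_flush *q, flush_va_fn flush_va,
                          flush_all_fn flush_all, void *ctx)
{
    bool full = q->overflowed;

    if (full) {
        flush_all(ctx);
    } else {
        for (size_t i = 0; i < q->count; i++)
            flush_va(q->va[i], ctx);
    }
    q->count = 0;
    q->overflowed = false;
    return full;
}

/* Demo callbacks that only count how often each kind of flush ran. */
struct flush_stats {
    size_t per_va;
    size_t full;
};

static void count_va(uint64_t va, void *ctx)
{
    (void)va;
    ((struct flush_stats *)ctx)->per_va++;
}

static void count_all(void *ctx)
{
    ((struct flush_stats *)ctx)->full++;
}
```

With this shape the host memory cost is fixed at PENDING_MAX entries per
guest, and the expensive full flush only happens if the guest removed more
than PENDING_MAX locally-invalidated HPTEs between two drains.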


Alex
