On Wed, Mar 26, 2008 at 07:12:25PM +0100, Andrea Arcangeli wrote:
> On Wed, Mar 26, 2008 at 02:51:28PM -0300, Marcelo Tosatti wrote:
> > Nope. If a physical CPU has page translations cached it _must_ be
> > running in the context of a qemu thread (does not matter if its in
> > userspace or executing guest code). The bits corresponding to such CPUs
> > will be set in mm->cpu_vm_mask and flush_tlb_page() will take care of
> > flushing the TLBs appropriately.
>
> That would require the tlb to lookup any SVM/VMX virtualized address
> that could point to the same physical page that is pointed to by the
> invlpg linux virtual address. So it might be feasible if the hardware
> is using a physical tag on each spte translation, but it'd require the
> tlb to do a lot more work than we need it to do, so I hope you're
> wrong on the hardware side of things. That would be a hardware
> complexity or slowdown that would provide no benefit to software.
>
> But regardless, even if this was the case and you're right that
> invalidating a linux userland address invalidates atomically all other
> virtualized addresses in the svm/vmx tlbs (all asn included, not just
> one), the spte is still instantiated when flush_tlb_page runs on all
> cpus.

You're right, there is no guarantee that a full TLB flush will happen,
and an invlpg of the qemu task's virtual address is not enough.

> So just after the global tlb flush, a guest spte tlb-miss (no page
> fault required, no get_user_pages required) can happen that will
> re-instantiate the spte contents inside the tlb before flush_tlb_page
> returns.
>
>     CPU0                                   CPU 1
>
>     pte_clear() inside ptep_clear_flush
>     flush_tlb_page inside ptep_clear_flush inside rmap
>     page_count = 1
>                                            guest tlb miss
>                                            tlb entry is back!
>                                            ioctl()
>                                            mark spte nonpresent
>                                            rmap_remove -> put_page
>                                            tlb flush
>
> Any tlb flush happening before clearing the shadow-pte entry is
> totally useless.
>
> Avi's patch is a great fix, and it will need further changes to properly
> fix this race, because many set_shadow_pte/pt[i] = nonpresent operations
> are executed _after_ rmap_remove (they must be executed first, in case
> the array is full and we have to flush the tlb and free the page).
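
To make the required ordering explicit, here is a rough sketch (illustrative
only, not the actual kvm code; zap_spte() and flush_guest_tlbs() are
placeholder names for whatever clears the spte and IPIs the vcpus):

	/* Sketch of the ordering Andrea describes. */
	static void zap_spte(struct kvm *kvm, u64 *spte)
	{
		/* 1) make the spte nonpresent first ... */
		set_shadow_pte(spte, shadow_trap_nonpresent_pte);
		/* 2) ... then flush the guest TLBs on every vcpu ... */
		flush_guest_tlbs(kvm);
		/* 3) ... and only then drop the page reference. */
		rmap_remove(kvm, spte);	/* this ends up calling put_page() */
	}
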
One comment on the patch:
+static void kvm_release_page_clean_deferred(struct kvm *kvm,
+					    struct page *page)
+{
+	kvm_release_page_deferred(kvm, page, true);
+}

Shouldn't this be passing "false" here, given that this is the _clean_
variant?
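
I.e. presumably the intent was something like this (sketch, assuming the
third argument of kvm_release_page_deferred() is a "dirty" flag, as the
_clean suffix suggests):

	static void kvm_release_page_clean_deferred(struct kvm *kvm,
						    struct page *page)
	{
		/* clean release: do not mark the page dirty */
		kvm_release_page_deferred(kvm, page, false);
	}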