On Wed, Oct 08, 2014 at 02:15:34PM -0300, Marcelo Tosatti wrote:
> On Wed, Oct 08, 2014 at 09:56:36AM +0300, Gleb Natapov wrote:
> > On Mon, Oct 06, 2014 at 02:19:32PM -0300, Marcelo Tosatti wrote:
> > > On Sat, Oct 04, 2014 at 10:23:32AM +0300, Gleb Natapov wrote:
> > > > On Tue, Sep 09, 2014 at 12:28:11PM -0300, Marcelo Tosatti wrote:
> > > > > On Mon, Jul 21, 2014 at 04:14:24PM +0300, Gleb Natapov wrote:
> > > > > > On Wed, Jul 09, 2014 at 04:12:53PM -0300, mtosa...@redhat.com wrote:
> > > > > > > Reload remote vcpus MMU from GET_DIRTY_LOG codepath, before
> > > > > > > deleting a pinned spte.
> > > > > > > 
> > > > > > > Signed-off-by: Marcelo Tosatti <mtosa...@redhat.com>
> > > > > > > 
> > > > > > > ---
> > > > > > >  arch/x86/kvm/mmu.c |   29 +++++++++++++++++++++++------
> > > > > > >  1 file changed, 23 insertions(+), 6 deletions(-)
> > > > > > > 
> > > > > > > Index: kvm.pinned-sptes/arch/x86/kvm/mmu.c
> > > > > > > ===================================================================
> > > > > > > --- kvm.pinned-sptes.orig/arch/x86/kvm/mmu.c      2014-07-09 
> > > > > > > 11:23:59.290744490 -0300
> > > > > > > +++ kvm.pinned-sptes/arch/x86/kvm/mmu.c   2014-07-09 
> > > > > > > 11:24:58.449632435 -0300
> > > > > > > @@ -1208,7 +1208,8 @@
> > > > > > >   *
> > > > > > >   * Return true if tlb need be flushed.
> > > > > > >   */
> > > > > > > -static bool spte_write_protect(struct kvm *kvm, u64 *sptep, bool 
> > > > > > > pt_protect)
> > > > > > > +static bool spte_write_protect(struct kvm *kvm, u64 *sptep, bool 
> > > > > > > pt_protect,
> > > > > > > +                        bool skip_pinned)
> > > > > > >  {
> > > > > > >   u64 spte = *sptep;
> > > > > > >  
> > > > > > > @@ -1218,6 +1219,22 @@
> > > > > > >  
> > > > > > >   rmap_printk("rmap_write_protect: spte %p %llx\n", sptep, 
> > > > > > > *sptep);
> > > > > > >  
> > > > > > > + if (is_pinned_spte(spte)) {
> > > > > > > +         /* keep pinned spte intact, mark page dirty again */
> > > > > > > +         if (skip_pinned) {
> > > > > > > +                 struct kvm_mmu_page *sp;
> > > > > > > +                 gfn_t gfn;
> > > > > > > +
> > > > > > > +                 sp = page_header(__pa(sptep));
> > > > > > > +                 gfn = kvm_mmu_page_get_gfn(sp, sptep - sp->spt);
> > > > > > > +
> > > > > > > +                 mark_page_dirty(kvm, gfn);
> > > > > > > +                 return false;
> > > > > > Why not mark all pinned gfns as dirty in 
> > > > > > kvm_vm_ioctl_get_dirty_log() while
> > > > > > populating dirty_bitmap_buffer?
> > > > > 
> > > > > The pinned gfns are per-vcpu. Requires looping all vcpus (not
> > > > > scalable).
> > > > > 
> > > > OK, but do they really have to be per-cpu? What's the advantage?
> > > 
> > > The guest configures pinning on a per-cpu basis (that is, enabling PEBS 
> > > is done on a per-cpu basis).
> > Is it a problem to maintain global pinned pages list for each memslot too?
> > 
> > > 
> > > > > 
> > > > > > > +         } else
> > > > > > > +                 mmu_reload_pinned_vcpus(kvm);
> > > > > > Can you explain why do you need this?
> > > > > 
> > > > > Because if skip_pinned = false, we want vcpus to exit (called
> > > > > from enable dirty logging codepath).
> > > > > 
> > > > I guess what I wanted to ask is why do we need skip_pinned? As far as I 
> > > > see it
> > > > is set to false in two cases:
> > > > 1: page is write protected for shadow MMU needs, should not happen 
> > > > since the feature
> > > 
> > > Correct.
> > > 
> > > >    is not supported with shadow mmu (can happen with nested EPT, but 
> > > > page will be marked
> > > >    is accessed during next vcpu entry anyway, so how will it work)?
> > > 
> > > PEBS is not supported on nested EPT.
> > > 
> > OK, so for this case we do not need skip_pinned. Assert if it happens.
> > 
> > > > 2: when slot is marked as read only: such slot cannot have PEBS pages 
> > > > and if it will guest will die
> > > >    anyway during next guest entry, so why not maintain list of pinned 
> > > > pages per slot and kill aguest
> > > >    if slot with pinned pages is marked read only.
> > > 
> > > 2: when slots pages have dirty logging enabled. In that case, the page
> > > is marked dirty immediately. This is the call from
> > > kvm_mmu_slot_remove_write_access.
> > > 
> > Right, my 2 is incorrect.
> > 
> > > So when enabling dirty logging, for pinned sptes:
> > > 
> > >     - maintain pinned spte intact.
> > >     - mark gfn for which pinned spte represents as dirty in the dirty
> > >       log.
> > > 
> > But you set skip_pinned to false in kvm_mmu_slot_remove_write_access(), so 
> > this is not
> > what is happening. Did you mean to set it to true there?
> 
> Argh, lets try again:
> 
> skip_pinned = true
> ------------------
> 
> mark page dirty, keep spte intact
> 
> called from get dirty log path.
> 
> skip_pinned = false
> -------------------
> reload remote mmu
> destroy pinned spte.
> 
> called from: dirty log enablement, rmap write protect (unused for pinned
> sptes)
> 
> 
> Note this behaviour is your suggestion:

Yes, I remember that and I thought we will not need this skip_pinned
at all.  For rmap write protect case there shouldn't be any pinned pages,
but why dirty log enablement sets skip_pinned to false? Why not mark
pinned pages as dirty just like you do in get dirty log path?

> 
> > > > > Why write protect it at all? mmu_reload_pinned_vcpus() will
> > > > > unprotected
> > > > > it anyway
> > > > > on the next vmentry. Isn't it better to just report all pinned
> > > > > pages as
> > > > > dirty alway.
> 

--
                        Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to