On Mon, Mar 10, 2014 at 10:01:49PM +0800, Xiao Guangrong wrote:
> Now we can flush all the TLBs out of the mmu lock without TLB corruption when
> write-protecting the sptes, because:
> - we have marked large sptes readonly instead of dropping them, which means we
>   only change the spte from writable to readonly, so we only need to care
>   about the case of changing a spte from present to present (changing a spte
>   from present to nonpresent flushes all the TLBs immediately); in other
>   words, the only case we need to care about is mmu_spte_update()
> 
> - in mmu_spte_update(), we check
>   SPTE_HOST_WRITEABLE | SPTE_MMU_WRITEABLE instead of PT_WRITABLE_MASK, which
>   means it does not depend on PT_WRITABLE_MASK anymore (see the sketch just
>   below)
> 
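A minimal sketch of the check just described, for illustration only:
wrprot_needs_remote_flush() is a hypothetical name, while
is_shadow_present_pte(), is_writable_pte() and
spte_is_locklessly_modifiable() are the existing helpers in
arch/x86/kvm/mmu.c.

/*
 * Sketch: decide whether write protecting a spte needs a remote TLB
 * flush.  The decision is based on the software-writable bits (via
 * spte_is_locklessly_modifiable()), not on PT_WRITABLE_MASK.
 */
static bool wrprot_needs_remote_flush(u64 old_spte, u64 new_spte)
{
        /* present -> nonpresent already flushes all the TLBs immediately */
        if (!is_shadow_present_pte(old_spte))
                return false;

        /*
         * A locklessly modifiable spte may still be writable in a remote
         * TLB even though the new spte reads as readonly.
         */
        return spte_is_locklessly_modifiable(old_spte) &&
               !is_writable_pte(new_spte);
}
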
> Signed-off-by: Xiao Guangrong <[email protected]>
> ---
>  arch/x86/kvm/mmu.c | 25 +++++++++++++++++++++----
>  arch/x86/kvm/mmu.h | 14 ++++++++++++++
>  arch/x86/kvm/x86.c | 12 ++++++++++--
>  3 files changed, 45 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 17bb136..01a8c35 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -4281,15 +4281,32 @@ void kvm_mmu_slot_remove_write_access(struct kvm *kvm, int slot)
>                       if (*rmapp)
>                               __rmap_write_protect(kvm, rmapp, false);
>  
> -                     if (need_resched() || spin_needbreak(&kvm->mmu_lock)) {
> -                             kvm_flush_remote_tlbs(kvm);
> +                     if (need_resched() || spin_needbreak(&kvm->mmu_lock))
>                               cond_resched_lock(&kvm->mmu_lock);
> -                     }
>               }
>       }
>  
> -     kvm_flush_remote_tlbs(kvm);
>       spin_unlock(&kvm->mmu_lock);
> +
> +     /*
> +      * kvm_mmu_slot_remove_write_access() and kvm_vm_ioctl_get_dirty_log(),
> +      * which do the tlb flush out of mmu-lock, should be serialized by
> +      * kvm->slots_lock; otherwise a tlb flush could be missed.
> +      */
> +     lockdep_assert_held(&kvm->slots_lock);
> +
> +     /*
> +      * We can flush all the TLBs out of the mmu lock without TLB
> +      * corruption since we only change the spte from writable to
> +      * readonly, so we only need to care about the case of changing
> +      * a spte from present to present (changing a spte from present
> +      * to nonpresent flushes all the TLBs immediately).  In other
> +      * words, the only case we care about is mmu_spte_update(), which
> +      * checks SPTE_HOST_WRITEABLE | SPTE_MMU_WRITEABLE instead of
> +      * PT_WRITABLE_MASK, and therefore does not depend on
> +      * PT_WRITABLE_MASK anymore.
> +      */
> +     kvm_flush_remote_tlbs(kvm);
>  }
>  
>  #define BATCH_ZAP_PAGES      10
> diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
> index 2926152..585d6b1 100644
> --- a/arch/x86/kvm/mmu.h
> +++ b/arch/x86/kvm/mmu.h
> @@ -96,6 +96,20 @@ static inline int is_present_gpte(unsigned long pte)
>       return pte & PT_PRESENT_MASK;
>  }
>  
> +/*
> + * Please note PT_WRITABLE_MASK is not stable since
> + * 1) fast_page_fault() sets the spte from readonly to writable out of mmu-lock, or
> + * 2) kvm_mmu_slot_remove_write_access() and kvm_vm_ioctl_get_dirty_log()
> + *    can write protect sptes but flush the tlb out of mmu-lock, which means
> + *    we may still use stale tlb entries that depend on this bit.
> + *
> + * Neither case modifies the result of spte_is_locklessly_modifiable(), so
> + * if you want to check whether the spte is writable on the MMU, check
> + * SPTE_MMU_WRITEABLE instead.  If you want to update a spte without losing
> + * the A/D bits or a needed tlb flush, check spte_is_locklessly_modifiable()
> + * instead.  See the comments in spte_has_volatile_bits() and
> + * mmu_spte_update().
> + */
>  static inline int is_writable_pte(unsigned long pte)
>  {

Xiao,

Can't get the SPTE_MMU_WRITEABLE part.

So assume you are writing code to perform some action after guest memory
has been write protected. You would do:

spin_lock(mmu_lock);

if (writeable spte bit is set)
    remove writeable spte bit from spte
remote TLB flush            (*)
action

spin_unlock(mmu_lock);

(*) is necessary because reading the writeable spte bit as zero
does not guarantee remote TLBs have been flushed.
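For concreteness, that recipe as a rough sketch (sptep stands for the spte
being write protected and the final action is whatever the caller needs;
mmu_spte_update() and kvm_flush_remote_tlbs() are the existing KVM
functions):

spin_lock(&kvm->mmu_lock);

/* write protect: only PT_WRITABLE_MASK is cleared from the spte */
if (is_writable_pte(*sptep))
        mmu_spte_update(sptep, *sptep & ~PT_WRITABLE_MASK);

/*
 * (*) Still required: the writeable bit reading as zero does not
 * guarantee that remote TLBs have already been flushed.
 */
kvm_flush_remote_tlbs(kvm);

/* action that depends on the memory being write protected */

spin_unlock(&kvm->mmu_lock);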

Now what does SPTE_MMU_WRITEABLE have to do with it?

Perhaps a recipe like that (or just the rules) would be useful.

The remaining patches look good.
