On 03/08/2016 07:45 PM, Paolo Bonzini wrote:
> For the next patch, we will want to filter PFERR_FETCH_MASK away early, and
> not pass it to permission_fault if neither NX nor SMEP is enabled.  Prepare
> for the change.
Why is this needed? It would be much easier to drop PFEC.F in update_permission_bitmask().
> Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
> ---
>  arch/x86/kvm/mmu.c         |  2 +-
>  arch/x86/kvm/paging_tmpl.h | 26 +++++++++++++++-----------
>  2 files changed, 16 insertions(+), 12 deletions(-)
> 
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 2463de0b935c..e57f7be061e3 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -3883,7 +3883,7 @@ static void update_permission_bitmask(struct kvm_vcpu *vcpu,
>  			u = bit & ACC_USER_MASK;
>  
>  			if (!ept) {
> -				/* Not really needed: !nx will cause pte.nx to fault */
> +				/* Not really needed: if !nx, ff will always be zero */
>  				x |= !mmu->nx;
>  				/* Allow supervisor writes if !cr0.wp */
>  				w |= !is_write_protection(vcpu) && !uf;
> diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
> index 6013f3685ef4..285858d3223b 100644
> --- a/arch/x86/kvm/paging_tmpl.h
> +++ b/arch/x86/kvm/paging_tmpl.h
> @@ -272,13 +272,24 @@ static int FNAME(walk_addr_generic)(struct guest_walker *walker,
>  	gpa_t pte_gpa;
>  	int offset;
>  	const int write_fault = access & PFERR_WRITE_MASK;
> -	const int user_fault = access & PFERR_USER_MASK;
> -	const int fetch_fault = access & PFERR_FETCH_MASK;
> -	u16 errcode = 0;
> +	u16 errcode;
>  	gpa_t real_gpa;
>  	gfn_t gfn;
>  
>  	trace_kvm_mmu_pagetable_walk(addr, access);
> +
> +	/*
> +	 * Do not modify PFERR_FETCH_MASK in access.  It is used later in the
> +	 * call to mmu->translate_gpa and, when nested virtualization is in
> +	 * use, the X or NX bit of nested page tables always applies---even
> +	 * if NX and SMEP are disabled in the guest.
> +	 *
> +	 * TODO: cache the result of the NX and SMEP test in struct kvm_mmu?
> +	 */
> +	errcode = access;
> +	if (!(mmu->nx || kvm_read_cr4_bits(vcpu, X86_CR4_SMEP)))
> +		errcode &= ~PFERR_FETCH_MASK;
> +
PFEC.P might now be set in the error code, since it is copied wholesale from @access. Also, this moves work from a rare path into the common path...
> retry_walk:
>  	walker->level = mmu->root_level;
>  	pte           = mmu->get_cr3(vcpu);
> @@ -389,9 +400,7 @@ retry_walk:
>  	if (unlikely(!accessed_dirty)) {
>  		ret = FNAME(update_accessed_dirty_bits)(vcpu, mmu, walker, write_fault);
> -		if (unlikely(ret < 0))
> -			goto error;
> -		else if (ret)
> +		if (ret > 0)
>  			goto retry_walk;
So the walk now returns normally even if update_accessed_dirty_bits() fails.