On Thu, Jul 07, 2011 at 03:59:12AM +0800, Xiao Guangrong wrote:
> On 07/07/2011 02:52 AM, Marcelo Tosatti wrote:
> 
> >> +/*
> >> + * If it is a real mmio page fault, return 1 and emulate the instruction
> >> + * directly, return 0 to let the CPU fault again on the address, -1 is
> >> + * returned if a bug is detected.
> >> + */
> >> +int handle_mmio_page_fault_common(struct kvm_vcpu *vcpu, u64 addr, bool direct)
> >> +{
> >> +	u64 spte;
> >> +
> >> +	if (quickly_check_mmio_pf(vcpu, addr, direct))
> >> +		return 1;
> >> +
> >> +	spte = walk_shadow_page_get_mmio_spte(vcpu, addr);
> >> +
> >> +	if (is_mmio_spte(spte)) {
> >> +		gfn_t gfn = get_mmio_spte_gfn(spte);
> >> +		unsigned access = get_mmio_spte_access(spte);
> >> +
> >> +		if (direct)
> >> +			addr = 0;
> >> +		vcpu_cache_mmio_info(vcpu, addr, gfn, access);
> >> +		return 1;
> >> +	}
> >> +
> >> +	/*
> >> +	 * It's ok if the gva is remapped by other cpus on a shadow-paging
> >> +	 * guest, it's a BUG if the gfn is not an mmio page.
> >> +	 */
> >> +	if (direct && is_shadow_present_pte(spte))
> >> +		return -1;
> 
> Marcelo,
> 
> > This is only going to generate an spte dump, for a genuine EPT
> > misconfig, if the present bit is set.
> > 
> > Should be:
> > 
> > 	/*
> > 	 * Page table zapped by other cpus, let CPU fault again on
> > 	 * the address.
> > 	 */
> > 	if (*spte == 0ull)
> > 		return 0;
> 
> We cannot use "*spte == 0ull" here; it should be !is_shadow_present_pte(spte)
> instead, since on an x86-32 host we can see the high 32 bits set while the
> present bit is cleared.

OK, then check for 0ull on x86-64, and for the low 32 bits zeroed but the
high 32 bits carrying the mmio pattern on x86-32. It is not acceptable to
disable the spte dump for sptes without the present bit set.
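Something along these lines would do (untested sketch; the helper name is
made up, and it assumes the mmio mask bits live in the high word on PAE):

static bool spte_refault_ok(u64 spte)
{
#ifdef CONFIG_X86_64
	/* A zapped spte is cleared with a single 64-bit store. */
	return spte == 0ull;
#else
	/*
	 * PAE clears the two halves separately, low word first, so a
	 * concurrently zapped mmio spte can be observed with the low
	 * word zero while the high word still holds the mmio pattern.
	 */
	return (u32)spte == 0 &&
	       (spte == 0ull || is_mmio_spte(spte));
#endif
}

handle_mmio_page_fault_common() would then return 0 (refault) when the
helper is true for a direct spte, and keep returning -1, with the spte
dump, otherwise.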
> > 	/* BUG if gfn is not an mmio page */
> > 	return -1;
> 
> We cannot detect the bug for the soft mmu, since the shadow page can be
> changed at any time, for example:
> 
> VCPU 0                             VCPU 1
> mmio page is intercepted
>                                    change the guest page table and map
>                                    the virtual address to the RAM region
> walk shadow page table, and
> detect the gfn is RAM, spurious
> BUG is reported
> 
> In theory, this can happen.

Yes, with TDP only assigning a memory slot can change an spte from mmio
to valid, which can only happen under SRCU, so this path is protected.

> >> +static bool sync_mmio_spte(u64 *sptep, gfn_t gfn, unsigned access,
> >> +			   int *nr_present)
> >> +{
> >> +	if (unlikely(is_mmio_spte(*sptep))) {
> >> +		if (gfn != get_mmio_spte_gfn(*sptep)) {
> >> +			mmu_spte_clear_no_track(sptep);
> >> +			return true;
> >> +		}
> >> +
> >> +		(*nr_present)++;
> > 
> > Can increase nr_present in the caller.
> 
> Yes, we should increase it to avoid the unsync shadow page being freed.

> >> @@ -6481,6 +6506,13 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
> >>  	if (!kvm->arch.n_requested_mmu_pages)
> >>  		nr_mmu_pages = kvm_mmu_calculate_mmu_pages(kvm);
> >> 
> >> +	/*
> >> +	 * If a new memory slot is created, we need to clear all
> >> +	 * mmio sptes.
> >> +	 */
> >> +	if (npages && old.base_gfn != mem->guest_phys_addr >> PAGE_SHIFT)
> >> +		kvm_arch_flush_shadow(kvm);
> >> +
> > 
> > This should be in __kvm_set_memory_region.
> 
> Um, __kvm_set_memory_region is a common function, and only x86 supports
> mmio sptes; it seems there is no need to do this for all architectures?

Please do it in __kvm_set_memory_region, after
kvm_arch_prepare_memory_region (other arches won't mind the flush).
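I.e. something like the following in __kvm_set_memory_region() (untested
sketch; surrounding code abbreviated, names taken from your hunk, and the
prepare call's signature is from memory and may not match your tree):

	r = kvm_arch_prepare_memory_region(kvm, &new, old, mem, user_alloc);
	if (r)
		goto out_free;

	/*
	 * A new slot was created: flush the shadow pages (and with them
	 * all mmio sptes).  Other architectures simply take a harmless
	 * extra flush.
	 */
	if (npages && old.base_gfn != mem->guest_phys_addr >> PAGE_SHIFT)
		kvm_arch_flush_shadow(kvm);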