Re: Nested EPT Write Protection
On 22/06/2015 15:28, Hu Yaohui wrote:
> 2504	pseudo_gfn = base_addr >> PAGE_SHIFT;
> 2505	sp = kvm_mmu_get_page(vcpu, pseudo_gfn, iterator.addr,
> 2506			      iterator.level - 1,
> 2507			      1, ACC_ALL, iterator.sptep);
> 2508	if (!sp) {
> 2509		pgprintk("nonpaging_map: ENOMEM\n");
> 2510		kvm_release_pfn_clean(pfn);
> 2511		return -ENOMEM;
> 2512	}
>
> It will get a pseudo_gfn to allocate a kvm_mmu_page. What if a
> pseudo_gfn itself causes a tdp_page_fault? Will the corresponding EPT
> page table entry also be marked read-only?

If tdp_page_fault is used (meaning non-nested KVM; nested KVM uses
ept_page_fault instead), sp->unsync is always true:

	/* in kvm_mmu_get_page - __direct_map passes direct == true */
	if (!direct) {
		if (rmap_write_protect(vcpu, gfn))
			kvm_flush_remote_tlbs(vcpu->kvm);
		if (level > PT_PAGE_TABLE_LEVEL && need_sync)
			kvm_sync_pages(vcpu, gfn);
		account_shadowed(vcpu->kvm, sp);
	}

so mmu_need_write_protect always returns false. Note that higher up in
kvm_mmu_get_page there is another conditional:

	if (!need_sync && sp->unsync)
		need_sync = true;

but it only applies to the !direct case.

Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Nested EPT Write Protection
On 19/06/2015 20:57, Hu Yaohui wrote:
> One more thing: for a standard guest VM which uses EPT, what is the
> usage of the "gfn" field in "struct kvm_mmu_page"? Since it uses EPT,
> a single shadow page should have no relation to any guest physical
> page, right? According to the source code, each allocated shadow page
> "struct kvm_mmu_page" gets its gfn field filled.

The gfn is the same value that you can find in bits 12 to MAXPHYADDR-1
of the EPT page table entry.

Paolo
Re: Nested EPT Write Protection
Thanks a lot! It's much more straightforward to me now.

One more thing: for a standard guest VM which uses EPT, what is the
usage of the "gfn" field in "struct kvm_mmu_page"? Since it uses EPT, a
single shadow page should have no relation to any guest physical page,
right? According to the source code, each allocated shadow page "struct
kvm_mmu_page" gets its gfn field filled.

Thanks,
Yaohui

On Fri, Jun 19, 2015 at 11:23 AM, Paolo Bonzini wrote:
> On 19/06/2015 14:44, Hu Yaohui wrote:
>> Hi Paolo,
>> Thanks a lot!
>>
>> On Fri, Jun 19, 2015 at 2:27 AM, Paolo Bonzini wrote:
>>> On 19/06/2015 03:52, Hu Yaohui wrote:
>>>> Hi All,
>>>> In kernel 3.14.2, KVM uses a shadow EPT (EPT02) to implement
>>>> nested EPT. The shadow EPT (EPT02) is a shadow of the guest EPT
>>>> (EPT12). If the L1 guest writes to the guest EPT (EPT12), how is
>>>> the shadow EPT (EPT02) modified accordingly?
>>>
>>> Because the EPT02 is write protected, writes to the EPT12 will trap
>>> to the hypervisor. The hypervisor will execute the write instruction
>>> before reentering the guest and invalidate the modified parts of the
>>> EPT02. When the invalidated part of the EPT02 is accessed, the
>>> hypervisor will rebuild it according to the EPT12 and the KVM
>>> memslots.
>>>
>> Do you mean EPT12 is write protected instead of EPT02?
>
> Yes, sorry.
>
>> According to my understanding, EPT12 will be write protected by
>> marking the page table entry of EPT01 as read-only or marking the
>> host page table entry as read-only.
>> Could you please be more specific about the code path which makes the
>> corresponding page table entry write protected?
>
> Look at set_spte's call to mmu_need_write_protect.
>
> Paolo
Re: Nested EPT Write Protection
On 19/06/2015 14:44, Hu Yaohui wrote:
> Hi Paolo,
> Thanks a lot!
>
> On Fri, Jun 19, 2015 at 2:27 AM, Paolo Bonzini wrote:
>> On 19/06/2015 03:52, Hu Yaohui wrote:
>>> Hi All,
>>> In kernel 3.14.2, KVM uses a shadow EPT (EPT02) to implement nested
>>> EPT. The shadow EPT (EPT02) is a shadow of the guest EPT (EPT12).
>>> If the L1 guest writes to the guest EPT (EPT12), how is the shadow
>>> EPT (EPT02) modified accordingly?
>>
>> Because the EPT02 is write protected, writes to the EPT12 will trap
>> to the hypervisor. The hypervisor will execute the write instruction
>> before reentering the guest and invalidate the modified parts of the
>> EPT02. When the invalidated part of the EPT02 is accessed, the
>> hypervisor will rebuild it according to the EPT12 and the KVM
>> memslots.
>>
> Do you mean EPT12 is write protected instead of EPT02?

Yes, sorry.

> According to my understanding, EPT12 will be write protected by
> marking the page table entry of EPT01 as read-only or marking the
> host page table entry as read-only.
> Could you please be more specific about the code path which makes the
> corresponding page table entry write protected?

Look at set_spte's call to mmu_need_write_protect.

Paolo
Re: Nested EPT Write Protection
Hi Paolo,
Thanks a lot!

On Fri, Jun 19, 2015 at 2:27 AM, Paolo Bonzini wrote:
> On 19/06/2015 03:52, Hu Yaohui wrote:
>> Hi All,
>> In kernel 3.14.2, KVM uses a shadow EPT (EPT02) to implement nested
>> EPT. The shadow EPT (EPT02) is a shadow of the guest EPT (EPT12). If
>> the L1 guest writes to the guest EPT (EPT12), how is the shadow EPT
>> (EPT02) modified accordingly?
>
> Because the EPT02 is write protected, writes to the EPT12 will trap to
> the hypervisor. The hypervisor will execute the write instruction
> before reentering the guest and invalidate the modified parts of the
> EPT02. When the invalidated part of the EPT02 is accessed, the
> hypervisor will rebuild it according to the EPT12 and the KVM
> memslots.
>
Do you mean EPT12 is write protected instead of EPT02?

According to my understanding, EPT12 will be write protected by marking
the page table entry of EPT01 as read-only or marking the host page
table entry as read-only. Could you please be more specific about the
code path which makes the corresponding page table entry write
protected?

Thanks,
Yaohui

> Paolo
Re: Nested EPT Write Protection
On 19/06/2015 03:52, Hu Yaohui wrote:
> Hi All,
> In kernel 3.14.2, KVM uses a shadow EPT (EPT02) to implement nested
> EPT. The shadow EPT (EPT02) is a shadow of the guest EPT (EPT12). If
> the L1 guest writes to the guest EPT (EPT12), how is the shadow EPT
> (EPT02) modified accordingly?

Because the EPT02 is write protected, writes to the EPT12 will trap to
the hypervisor. The hypervisor will execute the write instruction before
reentering the guest and invalidate the modified parts of the EPT02.
When the invalidated part of the EPT02 is accessed, the hypervisor will
rebuild it according to the EPT12 and the KVM memslots.

Paolo
Nested EPT Write Protection
Hi All,

In kernel 3.14.2, KVM uses a shadow EPT (EPT02) to implement nested EPT.
The shadow EPT (EPT02) is a shadow of the guest EPT (EPT12). If the L1
guest writes to the guest EPT (EPT12), how is the shadow EPT (EPT02)
modified accordingly?

Thanks,
Yaohui
[Bug 53611] nVMX: Add nested EPT
https://bugzilla.kernel.org/show_bug.cgi?id=53611

Paolo Bonzini changed:

           What           |Removed |Added
           -------------------------------------------
           Status         |NEW     |RESOLVED
           CC             |        |bonz...@gnu.org
           Kernel Version |        |3.19
           Resolution     |---     |CODE_FIX

--- Comment #2 from Paolo Bonzini ---
Fixed by commit afa61f752ba6 ("Advertise the support of EPT to the L1
guest, through the appropriate MSR.", 2013-08-07)

--
You are receiving this mail because:
You are watching the assignee of the bug.
[Bug 53611] nVMX: Add nested EPT
https://bugzilla.kernel.org/show_bug.cgi?id=53611

Bandan Das changed:

           What           |Removed |Added
           -------------------------------------------
           Blocks         |        |94971
Nested EPT page fault
Hi,

I have one question related to nested EPT page faults. At the very
start, the L0 hypervisor launches L2 with an empty EPT02 table and
builds the table on the fly. When an L2 physical page is accessed,
ept_page_fault (paging_tmpl.h) is called in L0 to handle the fault; it
first calls ept_walk_addr to get the guest EPT entry from EPT12. If
there is no such entry, a guest page fault is injected into L1 to handle
the fault. The next time the same L2 physical page is accessed,
ept_page_fault is triggered again in L0; it again calls ept_walk_addr,
now finds the previously filled entry in EPT12, and then calls
try_async_pf to translate the L1 physical page to an L0 physical page.
Finally, an entry is created in EPT02 to resolve the page fault. Please
correct me if I am wrong.

My question is: when is EPT01 accessed while the EPT02 entry is being
created? According to the Turtles paper, both EPT01 and EPT12 are
consulted to populate an entry in EPT02.

Thanks for your time!

Best Wishes,
Yaohui
Re: nested EPT
Thank you very much. I can launch L2 from L1 by directly running
qemu-system-x86_64 name_of_image. L2 still hangs if I launch it using
the 'virsh' command; libvirt shows this log:

warning : virAuditSend:135 : Failed to send audit message virt=kvm
resrc=net reason=start vm="L2" uuid=e9549443-e63f-31b5-0692-1396736d06b4
old-net=? new-net=52:54:00:75:c1:5b: Operation not permitted

I am using libvirt 1.1.1. Is it the above warning that causes the
problem?

Best,
Hai

On Fri, Jan 17, 2014 at 6:40 AM, Jan Kiszka wrote:
> On 2014-01-17 12:29, Kashyap Chamarthy wrote:
>> On Fri, Jan 17, 2014 at 2:51 AM, duy hai nguyen wrote:
>>> Now I can run an L2 guest (nested guest) using the kvm kernel
>>> module of kernel 3.12.
>>>
>>> However, I am facing a new problem when trying to build and use the
>>> kvm kernel module from git://git.kiszka.org/kvm-kmod.git: L1 (the
>>> nested hypervisor) cannot boot L2 and the graphic console of
>>> virt-manager hangs displaying 'Booting from Hard Disk...'. L1 still
>>> runs fine.
>>>
>>> Loading kvm_intel with 'emulate_invalid_guest_state=0' in L0 does
>>> not solve the problem. I have also tried different kernel versions:
>>> 3.12.0, 3.12.8 and 3.13.0, without success.
>>>
>>> Can you give me some suggestions?
>>
>> Maybe you can try without graphical managers and enable the serial
>> console ('console=ttyS0') on the kernel command line of the L2
>> guest, so you can see where it's stuck.
>
> Tracing can also be helpful, both in L1 and L0:
>
> http://www.linux-kvm.org/page/Tracing
>
> Jan
>
> --
> Siemens AG, Corporate Technology, CT RTC ITP SES-DE
> Corporate Competence Center Embedded Linux
Re: nested EPT
On 2014-01-17 12:29, Kashyap Chamarthy wrote:
> On Fri, Jan 17, 2014 at 2:51 AM, duy hai nguyen wrote:
>> Now I can run an L2 guest (nested guest) using the kvm kernel module
>> of kernel 3.12.
>>
>> However, I am facing a new problem when trying to build and use the
>> kvm kernel module from git://git.kiszka.org/kvm-kmod.git: L1 (the
>> nested hypervisor) cannot boot L2 and the graphic console of
>> virt-manager hangs displaying 'Booting from Hard Disk...'. L1 still
>> runs fine.
>>
>> Loading kvm_intel with 'emulate_invalid_guest_state=0' in L0 does
>> not solve the problem. I have also tried different kernel versions:
>> 3.12.0, 3.12.8 and 3.13.0, without success.
>>
>> Can you give me some suggestions?
>
> Maybe you can try without graphical managers and enable the serial
> console ('console=ttyS0') on the kernel command line of the L2 guest,
> so you can see where it's stuck.

Tracing can also be helpful, both in L1 and L0:

http://www.linux-kvm.org/page/Tracing

Jan

--
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux
Re: nested EPT
On Fri, Jan 17, 2014 at 2:51 AM, duy hai nguyen wrote:
> Now I can run an L2 guest (nested guest) using the kvm kernel module
> of kernel 3.12.
>
> However, I am facing a new problem when trying to build and use the
> kvm kernel module from git://git.kiszka.org/kvm-kmod.git: L1 (the
> nested hypervisor) cannot boot L2 and the graphic console of
> virt-manager hangs displaying 'Booting from Hard Disk...'. L1 still
> runs fine.
>
> Loading kvm_intel with 'emulate_invalid_guest_state=0' in L0 does not
> solve the problem. I have also tried different kernel versions:
> 3.12.0, 3.12.8 and 3.13.0, without success.
>
> Can you give me some suggestions?

Maybe you can try without graphical managers and enable the serial
console ('console=ttyS0') on the kernel command line of the L2 guest, so
you can see where it's stuck.

/kashyap
Re: nested EPT
Now I can run an L2 guest (nested guest) using the kvm kernel module of
kernel 3.12.

However, I am facing a new problem when trying to build and use the kvm
kernel module from git://git.kiszka.org/kvm-kmod.git: L1 (the nested
hypervisor) cannot boot L2 and the graphic console of virt-manager hangs
displaying 'Booting from Hard Disk...'. L1 still runs fine.

Loading kvm_intel with 'emulate_invalid_guest_state=0' in L0 does not
solve the problem. I have also tried different kernel versions: 3.12.0,
3.12.8 and 3.13.0, without success.

Can you give me some suggestions?

Thank you very much.

Best,
Hai

On Thu, Jan 16, 2014 at 1:17 PM, duy hai nguyen wrote:
> Thanks, Jan and Paolo!
>
> Great! It helps solve the problem.
>
> Sincerely,
> Hai
>
> On Thu, Jan 16, 2014 at 12:09 PM, Paolo Bonzini wrote:
>> Il 16/01/2014 17:10, duy hai nguyen ha scritto:
>>> Dear All,
>>>
>>> I am having a problem with using nested EPT in my system: the L0
>>> hypervisor's CPUs support vmx and ept; however, the L1 hypervisor's
>>> CPUs do not have the ept capability. The 'ept' flag appears in
>>> /proc/cpuinfo of L0 but does not show up in L1.
>>>
>>> - 'nested' and 'ept' are enabled in L0:
>>>
>>> $ cat /sys/module/kvm_intel/parameters/nested
>>> Y
>>>
>>> $ cat /sys/module/kvm_intel/parameters/ept
>>> Y
>>>
>>> - The libvirt xml file used in L0 has this cpu configuration:
>>>
>>> - The kernel version I am using for both L0 and L1 is 3.9.11
>>
>> Nested EPT was added in 3.12. You need that version in L0.
>>
>> Paolo
Re: nested EPT
Thanks, Jan and Paolo!

Great! It helps solve the problem.

Sincerely,
Hai

On Thu, Jan 16, 2014 at 12:09 PM, Paolo Bonzini wrote:
> Il 16/01/2014 17:10, duy hai nguyen ha scritto:
>> Dear All,
>>
>> I am having a problem with using nested EPT in my system: the L0
>> hypervisor's CPUs support vmx and ept; however, the L1 hypervisor's
>> CPUs do not have the ept capability. The 'ept' flag appears in
>> /proc/cpuinfo of L0 but does not show up in L1.
>>
>> - 'nested' and 'ept' are enabled in L0:
>>
>> $ cat /sys/module/kvm_intel/parameters/nested
>> Y
>>
>> $ cat /sys/module/kvm_intel/parameters/ept
>> Y
>>
>> - The libvirt xml file used in L0 has this cpu configuration:
>>
>> - The kernel version I am using for both L0 and L1 is 3.9.11
>
> Nested EPT was added in 3.12. You need that version in L0.
>
> Paolo
Re: nested EPT
Il 16/01/2014 17:10, duy hai nguyen ha scritto:
> Dear All,
>
> I am having a problem with using nested EPT in my system: the L0
> hypervisor's CPUs support vmx and ept; however, the L1 hypervisor's
> CPUs do not have the ept capability. The 'ept' flag appears in
> /proc/cpuinfo of L0 but does not show up in L1.
>
> - 'nested' and 'ept' are enabled in L0:
>
> $ cat /sys/module/kvm_intel/parameters/nested
> Y
>
> $ cat /sys/module/kvm_intel/parameters/ept
> Y
>
> - The libvirt xml file used in L0 has this cpu configuration:
>
> - The kernel version I am using for both L0 and L1 is 3.9.11

Nested EPT was added in 3.12. You need that version in L0.

Paolo
Re: nested EPT
On 2014-01-16 17:10, duy hai nguyen wrote:
> Dear All,
>
> I am having a problem with using nested EPT in my system: the L0
> hypervisor's CPUs support vmx and ept; however, the L1 hypervisor's
> CPUs do not have the ept capability. The 'ept' flag appears in
> /proc/cpuinfo of L0 but does not show up in L1.
>
> - 'nested' and 'ept' are enabled in L0:
>
> $ cat /sys/module/kvm_intel/parameters/nested
> Y
>
> $ cat /sys/module/kvm_intel/parameters/ept
> Y
>
> - The libvirt xml file used in L0 has this cpu configuration:
>
> - The kernel version I am using for both L0 and L1 is 3.9.11

Update your host kernel (L0); nEPT was merged in 3.12.

Jan

--
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux
nested EPT
Dear All,

I am having a problem with using nested EPT in my system: the L0
hypervisor's CPUs support vmx and ept; however, the L1 hypervisor's CPUs
do not have the ept capability. The 'ept' flag appears in /proc/cpuinfo
of L0 but does not show up in L1.

- 'nested' and 'ept' are enabled in L0:

$ cat /sys/module/kvm_intel/parameters/nested
Y

$ cat /sys/module/kvm_intel/parameters/ept
Y

- The libvirt xml file used in L0 has this cpu configuration:

- The kernel version I am using for both L0 and L1 is 3.9.11

Thank you very much.

Best,
Hai
Re: [PATCH] kvm-unit-tests: VMX: Fix some nested EPT related bugs
Il 09/09/2013 17:55, Arthur Chunqi Li ha scritto:
> This patch fixes 3 bugs in the VMX framework and the EPT framework:
> 1. Fix a bug in setting the default value of CPU_SECONDARY
> 2. Fix a bug in reading MSR_IA32_VMX_PROCBASED_CTLS2 and
>    MSR_IA32_VMX_EPT_VPID_CAP
> 3. For EPT-violation and EPT-misconfiguration vmexits, the vmcs field
>    "VM-exit instruction length" is not defined and returns an
>    unexpected value when read.
>
> Signed-off-by: Arthur Chunqi Li
> ---
>  x86/vmx.c       | 13 ++++++++++---
>  x86/vmx_tests.c |  2 --
>  2 files changed, 10 insertions(+), 5 deletions(-)
>
> diff --git a/x86/vmx.c b/x86/vmx.c
> index 87d1d55..9db4ef4 100644
> --- a/x86/vmx.c
> +++ b/x86/vmx.c
> @@ -304,7 +304,8 @@ static void init_vmcs_ctrl(void)
>  	/* Disable VMEXIT of IO instruction */
>  	vmcs_write(CPU_EXEC_CTRL0, ctrl_cpu[0]);
>  	if (ctrl_cpu_rev[0].set & CPU_SECONDARY) {
> -		ctrl_cpu[1] |= ctrl_cpu_rev[1].set & ctrl_cpu_rev[1].clr;
> +		ctrl_cpu[1] = (ctrl_cpu[1] | ctrl_cpu_rev[1].set) &
> +			ctrl_cpu_rev[1].clr;
>  		vmcs_write(CPU_EXEC_CTRL1, ctrl_cpu[1]);
>  	}
>  	vmcs_write(CR3_TARGET_COUNT, 0);
> @@ -489,8 +490,14 @@ static void init_vmx(void)
>  			: MSR_IA32_VMX_ENTRY_CTLS);
>  	ctrl_cpu_rev[0].val = rdmsr(basic.ctrl ? MSR_IA32_VMX_TRUE_PROC
>  			: MSR_IA32_VMX_PROCBASED_CTLS);
> -	ctrl_cpu_rev[1].val = rdmsr(MSR_IA32_VMX_PROCBASED_CTLS2);
> -	ept_vpid.val = rdmsr(MSR_IA32_VMX_EPT_VPID_CAP);
> +	if ((ctrl_cpu_rev[0].clr & CPU_SECONDARY) != 0)
> +		ctrl_cpu_rev[1].val = rdmsr(MSR_IA32_VMX_PROCBASED_CTLS2);
> +	else
> +		ctrl_cpu_rev[1].val = 0;
> +	if ((ctrl_cpu_rev[1].clr & (CPU_EPT | CPU_VPID)) != 0)
> +		ept_vpid.val = rdmsr(MSR_IA32_VMX_EPT_VPID_CAP);
> +	else
> +		ept_vpid.val = 0;
>
>  	write_cr0((read_cr0() & fix_cr0_clr) | fix_cr0_set);
>  	write_cr4((read_cr4() & fix_cr4_clr) | fix_cr4_set | X86_CR4_VMXE);
> diff --git a/x86/vmx_tests.c b/x86/vmx_tests.c
> index 6d972c0..e891a9f 100644
> --- a/x86/vmx_tests.c
> +++ b/x86/vmx_tests.c
> @@ -1075,7 +1075,6 @@ static int ept_exit_handler()
>  			print_vmexit_info();
>  			return VMX_TEST_VMEXIT;
>  		}
> -		vmcs_write(GUEST_RIP, guest_rip + insn_len);
>  		return VMX_TEST_RESUME;
>  	case VMX_EPT_VIOLATION:
>  		switch(get_stage()) {
> @@ -1100,7 +1099,6 @@ static int ept_exit_handler()
>  			print_vmexit_info();
>  			return VMX_TEST_VMEXIT;
>  		}
> -		vmcs_write(GUEST_RIP, guest_rip + insn_len);
>  		return VMX_TEST_RESUME;
>  	default:
>  		printf("Unknown exit reason, %d\n", reason);

Looks good, thanks!

Paolo
Re: [PATCH 2/2] kvm-unit-tests: VMX: Test cases for nested EPT
Il 09/09/2013 17:29, Arthur Chunqi Li ha scritto:
> Hi Paolo,
> I noticed another possible bug in this patch. Stage 4 of this patch
> tests the scenario where the page of a paging structure is not
> present; this causes an EPT violation vmexit with bit 8 of exit_qual
> unset. My question is: will the instruction length be correctly set in
> this scenario? I got a wrong insn_len in "case 4" of
> VMX_EPT_VIOLATION, which may cause a triple fault vmexit.

It's plausible that the instruction length is wrong, since the processor
might be fetching the instruction itself and doesn't know the length.

Paolo
[PATCH] kvm-unit-tests: VMX: Fix some nested EPT related bugs
This patch fixes 3 bugs in the VMX framework and the EPT framework:
1. Fix a bug in setting the default value of CPU_SECONDARY
2. Fix a bug in reading MSR_IA32_VMX_PROCBASED_CTLS2 and
   MSR_IA32_VMX_EPT_VPID_CAP
3. For EPT-violation and EPT-misconfiguration vmexits, the vmcs field
   "VM-exit instruction length" is not defined and returns an
   unexpected value when read.

Signed-off-by: Arthur Chunqi Li
---
 x86/vmx.c       | 13 ++++++++++---
 x86/vmx_tests.c |  2 --
 2 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/x86/vmx.c b/x86/vmx.c
index 87d1d55..9db4ef4 100644
--- a/x86/vmx.c
+++ b/x86/vmx.c
@@ -304,7 +304,8 @@ static void init_vmcs_ctrl(void)
 	/* Disable VMEXIT of IO instruction */
 	vmcs_write(CPU_EXEC_CTRL0, ctrl_cpu[0]);
 	if (ctrl_cpu_rev[0].set & CPU_SECONDARY) {
-		ctrl_cpu[1] |= ctrl_cpu_rev[1].set & ctrl_cpu_rev[1].clr;
+		ctrl_cpu[1] = (ctrl_cpu[1] | ctrl_cpu_rev[1].set) &
+			ctrl_cpu_rev[1].clr;
 		vmcs_write(CPU_EXEC_CTRL1, ctrl_cpu[1]);
 	}
 	vmcs_write(CR3_TARGET_COUNT, 0);
@@ -489,8 +490,14 @@ static void init_vmx(void)
 			: MSR_IA32_VMX_ENTRY_CTLS);
 	ctrl_cpu_rev[0].val = rdmsr(basic.ctrl ? MSR_IA32_VMX_TRUE_PROC
 			: MSR_IA32_VMX_PROCBASED_CTLS);
-	ctrl_cpu_rev[1].val = rdmsr(MSR_IA32_VMX_PROCBASED_CTLS2);
-	ept_vpid.val = rdmsr(MSR_IA32_VMX_EPT_VPID_CAP);
+	if ((ctrl_cpu_rev[0].clr & CPU_SECONDARY) != 0)
+		ctrl_cpu_rev[1].val = rdmsr(MSR_IA32_VMX_PROCBASED_CTLS2);
+	else
+		ctrl_cpu_rev[1].val = 0;
+	if ((ctrl_cpu_rev[1].clr & (CPU_EPT | CPU_VPID)) != 0)
+		ept_vpid.val = rdmsr(MSR_IA32_VMX_EPT_VPID_CAP);
+	else
+		ept_vpid.val = 0;

 	write_cr0((read_cr0() & fix_cr0_clr) | fix_cr0_set);
 	write_cr4((read_cr4() & fix_cr4_clr) | fix_cr4_set | X86_CR4_VMXE);
diff --git a/x86/vmx_tests.c b/x86/vmx_tests.c
index 6d972c0..e891a9f 100644
--- a/x86/vmx_tests.c
+++ b/x86/vmx_tests.c
@@ -1075,7 +1075,6 @@ static int ept_exit_handler()
 			print_vmexit_info();
 			return VMX_TEST_VMEXIT;
 		}
-		vmcs_write(GUEST_RIP, guest_rip + insn_len);
 		return VMX_TEST_RESUME;
 	case VMX_EPT_VIOLATION:
 		switch(get_stage()) {
@@ -1100,7 +1099,6 @@ static int ept_exit_handler()
 			print_vmexit_info();
 			return VMX_TEST_VMEXIT;
 		}
-		vmcs_write(GUEST_RIP, guest_rip + insn_len);
 		return VMX_TEST_RESUME;
 	default:
 		printf("Unknown exit reason, %d\n", reason);
--
1.7.9.5
Re: [PATCH 2/2] kvm-unit-tests: VMX: Test cases for nested EPT
On Mon, Sep 9, 2013 at 12:57 PM, Arthur Chunqi Li wrote:
> Some test cases for nested EPT features, including:
> 1. EPT basic framework tests: read, write and remap.
> 2. EPT misconfiguration test cases: page permission misconfiguration
>    and memory type misconfiguration
> 3. EPT violation test cases: page permission violation and paging
>    structure violation
>
> Signed-off-by: Arthur Chunqi Li
> ---
>  x86/vmx_tests.c | 266 ++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 266 insertions(+)
>
> diff --git a/x86/vmx_tests.c b/x86/vmx_tests.c
> index c1b39f4..a0b9824 100644
> --- a/x86/vmx_tests.c
> +++ b/x86/vmx_tests.c
> @@ -1,4 +1,36 @@
>  #include "vmx.h"
> +#include "processor.h"
> +#include "vm.h"
> +#include "msr.h"
> +#include "fwcfg.h"
> +
> +volatile u32 stage;
> +volatile bool init_fail;
> +unsigned long *pml4;
> +u64 eptp;
> +void *data_page1, *data_page2;
> +
> +static inline void set_stage(u32 s)
> +{
> +	barrier();
> +	stage = s;
> +	barrier();
> +}
> +
> +static inline u32 get_stage()
> +{
> +	u32 s;
> +
> +	barrier();
> +	s = stage;
> +	barrier();
> +	return s;
> +}
> +
> +static inline void vmcall()
> +{
> +	asm volatile ("vmcall");
> +}
>
>  void basic_init()
>  {
> @@ -76,6 +108,238 @@ int vmenter_exit_handler()
>  	return VMX_TEST_VMEXIT;
>  }
>
> +static int setup_ept()
> +{
> +	int support_2m;
> +	unsigned long end_of_memory;
> +
> +	if (!(ept_vpid.val & EPT_CAP_UC) &&
> +			!(ept_vpid.val & EPT_CAP_WB)) {
> +		printf("\tEPT paging-structure memory type "
> +				"UC&WB are not supported\n");
> +		return 1;
> +	}
> +	if (ept_vpid.val & EPT_CAP_UC)
> +		eptp = EPT_MEM_TYPE_UC;
> +	else
> +		eptp = EPT_MEM_TYPE_WB;
> +	if (!(ept_vpid.val & EPT_CAP_PWL4)) {
> +		printf("\tPWL4 is not supported\n");
> +		return 1;
> +	}
> +	eptp |= (3 << EPTP_PG_WALK_LEN_SHIFT);
> +	pml4 = alloc_page();
> +	memset(pml4, 0, PAGE_SIZE);
> +	eptp |= virt_to_phys(pml4);
> +	vmcs_write(EPTP, eptp);
> +	support_2m = !!(ept_vpid.val & EPT_CAP_2M_PAGE);
> +	end_of_memory = fwcfg_get_u64(FW_CFG_RAM_SIZE);
> +	if (end_of_memory < (1ul << 32))
> +		end_of_memory = (1ul << 32);
> +	if (setup_ept_range(pml4, 0, end_of_memory,
> +			0, support_2m, EPT_WA | EPT_RA | EPT_EA)) {
> +		printf("\tSet ept tables failed.\n");
> +		return 1;
> +	}
> +	return 0;
> +}
> +
> +static void ept_init()
> +{
> +	u32 ctrl_cpu[2];
> +
> +	init_fail = false;
> +	ctrl_cpu[0] = vmcs_read(CPU_EXEC_CTRL0);
> +	ctrl_cpu[1] = vmcs_read(CPU_EXEC_CTRL1);
> +	ctrl_cpu[0] = (ctrl_cpu[0] | CPU_SECONDARY)
> +		& ctrl_cpu_rev[0].clr;
> +	ctrl_cpu[1] = (ctrl_cpu[1] | CPU_EPT)
> +		& ctrl_cpu_rev[1].clr;
> +	vmcs_write(CPU_EXEC_CTRL0, ctrl_cpu[0]);
> +	vmcs_write(CPU_EXEC_CTRL1, ctrl_cpu[1] | CPU_EPT);
> +	if (setup_ept())
> +		init_fail = true;
> +	data_page1 = alloc_page();
> +	data_page2 = alloc_page();
> +	memset(data_page1, 0x0, PAGE_SIZE);
> +	memset(data_page2, 0x0, PAGE_SIZE);
> +	*((u32 *)data_page1) = MAGIC_VAL_1;
> +	*((u32 *)data_page2) = MAGIC_VAL_2;
> +	install_ept(pml4, (unsigned long)data_page1,
> +			(unsigned long)data_page2,
> +			EPT_RA | EPT_WA | EPT_EA);
> +}
> +
> +static void ept_main()
> +{
> +	if (init_fail)
> +		return;
> +	if (!(ctrl_cpu_rev[0].clr & CPU_SECONDARY)
> +			&& !(ctrl_cpu_rev[1].clr & CPU_EPT)) {
> +		printf("\tEPT is not supported");
> +		return;
> +	}
> +	set_stage(0);
> +	if (*((u32 *)data_page2) != MAGIC_VAL_1 &&
> +			*((u32 *)data_page1) != MAGIC_VAL_1)
> +		report("EPT basic framework - read", 0);
> +	else {
> +		*((u32 *)data_page2) = MAGIC_VAL_3;
> +		vmcall();
> +		if (get_stage() == 1) {
> +			if (*((u32 *)data_page1) == MAGIC_VAL_3 &&
Re: [PATCH 2/2] kvm-unit-tests: VMX: Test cases for nested EPT
Il 09/09/2013 16:11, Arthur Chunqi Li ha scritto:
>>>> +volatile u32 stage;
>>>> +volatile bool init_fail;
>>>
>>> Why volatile?
>
> Because init_fail is only set but not used later in ept_init(), and if
> I don't add volatile, the compiler may optimize out the assignment to
> init_fail.
>
> This first came up when I wrote set_stage/get_stage. If a variable is
> set in a function but not used later, the compiler usually treats the
> assignment as a dead store and removes it.

No, the two are different. "stage" is written several times in the same
function, with no code in the middle:

	stage++;
	*p = 1;
	stage++;

To the compiler, the first store is dead. The compiler doesn't know that
"*p = 1" traps to the hypervisor. But this is not the case for
"init_fail".

Paolo
Re: [PATCH 2/2] kvm-unit-tests: VMX: Test cases for nested EPT
On Mon, Sep 9, 2013 at 9:56 PM, Paolo Bonzini wrote:
> Il 09/09/2013 06:57, Arthur Chunqi Li ha scritto:
>> Some test cases for nested EPT features, including:
>> 1. EPT basic framework tests: read, write and remap.
>> 2. EPT misconfiguration test cases: page permission misconfiguration
>>    and memory type misconfiguration
>> 3. EPT violation test cases: page permission violation and paging
>>    structure violation
>>
>> Signed-off-by: Arthur Chunqi Li
>> ---
>>  x86/vmx_tests.c | 266 +++++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 266 insertions(+)
>>
>> diff --git a/x86/vmx_tests.c b/x86/vmx_tests.c
>> index c1b39f4..a0b9824 100644
>> --- a/x86/vmx_tests.c
>> +++ b/x86/vmx_tests.c
>> @@ -1,4 +1,36 @@
>>  #include "vmx.h"
>> +#include "processor.h"
>> +#include "vm.h"
>> +#include "msr.h"
>> +#include "fwcfg.h"
>> +
>> +volatile u32 stage;
>> +volatile bool init_fail;
>
> Why volatile?

Because init_fail is only set but not used later in ept_init(), and if I
don't add volatile, the compiler may optimize out the assignment to
init_fail.

This first came up when I wrote set_stage/get_stage. If a variable is
set in a function but not used later, the compiler usually treats the
assignment as a dead store and removes it.

Arthur

> The patch looks good.
>
>> +unsigned long *pml4;
>> +u64 eptp;
>> +void *data_page1, *data_page2;
>> +
>> +static inline void set_stage(u32 s)
>> +{
>> +	barrier();
>> +	stage = s;
>> +	barrier();
>> +}
>> +
>> +static inline u32 get_stage()
>> +{
>> +	u32 s;
>> +
>> +	barrier();
>> +	s = stage;
>> +	barrier();
>> +	return s;
>> +}
>> +
>> +static inline void vmcall()
>> +{
>> +	asm volatile ("vmcall");
>> +}
>>
>>  void basic_init()
>>  {
>> @@ -76,6 +108,238 @@ int vmenter_exit_handler()
>>  	return VMX_TEST_VMEXIT;
>>  }
>>
>> +static int setup_ept()
>> +{
>> +	int support_2m;
>> +	unsigned long end_of_memory;
>> +
>> +	if (!(ept_vpid.val & EPT_CAP_UC) &&
>> +			!(ept_vpid.val & EPT_CAP_WB)) {
>> +		printf("\tEPT paging-structure memory type "
>> +				"UC&WB are not supported\n");
>> +		return 1;
>> +	}
>> +	if (ept_vpid.val & EPT_CAP_UC)
>> +		eptp = EPT_MEM_TYPE_UC;
>> +	else
>> +		eptp = EPT_MEM_TYPE_WB;
>> +	if (!(ept_vpid.val & EPT_CAP_PWL4)) {
>> +		printf("\tPWL4 is not supported\n");
>> +		return 1;
>> +	}
>> +	eptp |= (3 << EPTP_PG_WALK_LEN_SHIFT);
>> +	pml4 = alloc_page();
>> +	memset(pml4, 0, PAGE_SIZE);
>> +	eptp |= virt_to_phys(pml4);
>> +	vmcs_write(EPTP, eptp);
>> +	support_2m = !!(ept_vpid.val & EPT_CAP_2M_PAGE);
>> +	end_of_memory = fwcfg_get_u64(FW_CFG_RAM_SIZE);
>> +	if (end_of_memory < (1ul << 32))
>> +		end_of_memory = (1ul << 32);
>> +	if (setup_ept_range(pml4, 0, end_of_memory,
>> +			0, support_2m, EPT_WA | EPT_RA | EPT_EA)) {
>> +		printf("\tSet ept tables failed.\n");
>> +		return 1;
>> +	}
>> +	return 0;
>> +}
>> +
>> +static void ept_init()
>> +{
>> +	u32 ctrl_cpu[2];
>> +
>> +	init_fail = false;
>> +	ctrl_cpu[0] = vmcs_read(CPU_EXEC_CTRL0);
>> +	ctrl_cpu[1] = vmcs_read(CPU_EXEC_CTRL1);
>> +	ctrl_cpu[0] = (ctrl_cpu[0] | CPU_SECONDARY)
>> +		& ctrl_cpu_rev[0].clr;
>> +	ctrl_cpu[1] = (ctrl_cpu[1] | CPU_EPT)
>> +		& ctrl_cpu_rev[1].clr;
>> +	vmcs_write(CPU_EXEC_CTRL0, ctrl_cpu[0]);
>> +	vmcs_write(CPU_EXEC_CTRL1, ctrl_cpu[1] | CPU_EPT);
>> +	if (setup_ept())
>> +		init_fail = true;
>> +	data_page1 = alloc_page();
>> +	data_page2 = alloc_page();
>> +	memset(data_page1, 0x0, PAGE_SIZE);
>> +	memset(data_page2, 0x0, PAGE_SIZE);
>> +	*((u32 *)data_page1) = MAGIC_VAL_1;
>> +	*((u32 *)data_page2) = MAGIC_VAL_2;
>> +	install_ept(pml4, (unsigned long)data_page1,
>> +			(unsigned long)data_page2,
>> +			EPT_RA | EPT_WA | EPT_EA);
>> +}
>> +
>> +static void
Re: [PATCH 2/2] kvm-unit-tests: VMX: Test cases for nested EPT
On 09/09/2013 06:57, Arthur Chunqi Li wrote: > Some test cases for nested EPT features, including: > 1. EPT basic framework tests: read, write and remap. > 2. EPT misconfigurations test cases: page permission misconfiguration > and memory type misconfiguration > 3. EPT violations test cases: page permission violation and paging > structure violation > > Signed-off-by: Arthur Chunqi Li > --- > x86/vmx_tests.c | 266 > +++ > 1 file changed, 266 insertions(+) > > diff --git a/x86/vmx_tests.c b/x86/vmx_tests.c > index c1b39f4..a0b9824 100644 > --- a/x86/vmx_tests.c > +++ b/x86/vmx_tests.c > @@ -1,4 +1,36 @@ > #include "vmx.h" > +#include "processor.h" > +#include "vm.h" > +#include "msr.h" > +#include "fwcfg.h" > + > +volatile u32 stage; > +volatile bool init_fail; Why volatile? The patch looks good. > +unsigned long *pml4; > +u64 eptp; > +void *data_page1, *data_page2; > + > +static inline void set_stage(u32 s) > +{ > + barrier(); > + stage = s; > + barrier(); > +} > + > +static inline u32 get_stage() > +{ > + u32 s; > + > + barrier(); > + s = stage; > + barrier(); > + return s; > +} > + > +static inline void vmcall() > +{ > + asm volatile ("vmcall"); > +} > > void basic_init() > { > @@ -76,6 +108,238 @@ int vmenter_exit_handler() > return VMX_TEST_VMEXIT; > } > > +static int setup_ept() > +{ > + int support_2m; > + unsigned long end_of_memory; > + > + if (!(ept_vpid.val & EPT_CAP_UC) && > + !(ept_vpid.val & EPT_CAP_WB)) { > + printf("\tEPT paging-structure memory type " > + "UC&WB are not supported\n"); > + return 1; > + } > + if (ept_vpid.val & EPT_CAP_UC) > + eptp = EPT_MEM_TYPE_UC; > + else > + eptp = EPT_MEM_TYPE_WB; > + if (!(ept_vpid.val & EPT_CAP_PWL4)) { > + printf("\tPWL4 is not supported\n"); > + return 1; > + } > + eptp |= (3 << EPTP_PG_WALK_LEN_SHIFT); > + pml4 = alloc_page(); > + memset(pml4, 0, PAGE_SIZE); > + eptp |= virt_to_phys(pml4); > + vmcs_write(EPTP, eptp); > + support_2m = !!(ept_vpid.val & EPT_CAP_2M_PAGE); > + end_of_memory = 
fwcfg_get_u64(FW_CFG_RAM_SIZE); > + if (end_of_memory < (1ul << 32)) > + end_of_memory = (1ul << 32); > + if (setup_ept_range(pml4, 0, end_of_memory, > + 0, support_2m, EPT_WA | EPT_RA | EPT_EA)) { > + printf("\tSet ept tables failed.\n"); > + return 1; > + } > + return 0; > +} > + > +static void ept_init() > +{ > + u32 ctrl_cpu[2]; > + > + init_fail = false; > + ctrl_cpu[0] = vmcs_read(CPU_EXEC_CTRL0); > + ctrl_cpu[1] = vmcs_read(CPU_EXEC_CTRL1); > + ctrl_cpu[0] = (ctrl_cpu[0] | CPU_SECONDARY) > + & ctrl_cpu_rev[0].clr; > + ctrl_cpu[1] = (ctrl_cpu[1] | CPU_EPT) > + & ctrl_cpu_rev[1].clr; > + vmcs_write(CPU_EXEC_CTRL0, ctrl_cpu[0]); > + vmcs_write(CPU_EXEC_CTRL1, ctrl_cpu[1] | CPU_EPT); > + if (setup_ept()) > + init_fail = true; > + data_page1 = alloc_page(); > + data_page2 = alloc_page(); > + memset(data_page1, 0x0, PAGE_SIZE); > + memset(data_page2, 0x0, PAGE_SIZE); > + *((u32 *)data_page1) = MAGIC_VAL_1; > + *((u32 *)data_page2) = MAGIC_VAL_2; > + install_ept(pml4, (unsigned long)data_page1, (unsigned long)data_page2, > + EPT_RA | EPT_WA | EPT_EA); > +} > + > +static void ept_main() > +{ > + if (init_fail) > + return; > + if (!(ctrl_cpu_rev[0].clr & CPU_SECONDARY) > + && !(ctrl_cpu_rev[1].clr & CPU_EPT)) { > + printf("\tEPT is not supported"); > + return; > + } > + set_stage(0); > + if (*((u32 *)data_page2) != MAGIC_VAL_1 && > + *((u32 *)data_page1) != MAGIC_VAL_1) > + report("EPT basic framework - read", 0); > + else { > + *((u32 *)data_page2) = MAGIC_VAL_3; > + vmcall(); > + if (get_stage() == 1) { > + if (*((u32 *)data_page1) == MAGIC_VAL_3 && > + *((u32 *)data_page2) == MAGIC_VAL_2) > + report("EPT basic framework", 1); >
Re: [PATCH 0/2] kvm-unit-tests: VMX: Test nested EPT features
On Mon, Sep 9, 2013 at 3:17 PM, Jan Kiszka wrote: > On 2013-09-09 06:57, Arthur Chunqi Li wrote: >> This series of patches provide the framework of nested EPT and some test >> cases for nested EPT features. >> >> Arthur Chunqi Li (2): >> kvm-unit-tests: VMX: The framework of EPT for nested VMX testing >> kvm-unit-tests: VMX: Test cases for nested EPT >> >> x86/vmx.c | 159 - >> x86/vmx.h | 76 >> x86/vmx_tests.c | 266 >> +++ >> 3 files changed, 497 insertions(+), 4 deletions(-) >> > > I suppose this is v2 of the previous patch? What is the delta? A meta > changelog could go here. Yes, v1 just provided the framework of EPT (similar to the first patch of this series), and some more tests for nested EPT are added in this series (the second patch). Arthur > > Jan > -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/2] kvm-unit-tests: VMX: Test nested EPT features
On 2013-09-09 06:57, Arthur Chunqi Li wrote: > This series of patches provide the framework of nested EPT and some test > cases for nested EPT features. > > Arthur Chunqi Li (2): > kvm-unit-tests: VMX: The framework of EPT for nested VMX testing > kvm-unit-tests: VMX: Test cases for nested EPT > > x86/vmx.c | 159 - > x86/vmx.h | 76 > x86/vmx_tests.c | 266 > +++ > 3 files changed, 497 insertions(+), 4 deletions(-) > I suppose this is v2 of the previous patch? What is the delta? A meta changelog could go here. Jan
[PATCH 0/2] kvm-unit-tests: VMX: Test nested EPT features
This series of patches provides the framework of nested EPT and some test cases for nested EPT features. Arthur Chunqi Li (2): kvm-unit-tests: VMX: The framework of EPT for nested VMX testing kvm-unit-tests: VMX: Test cases for nested EPT x86/vmx.c | 159 - x86/vmx.h | 76 x86/vmx_tests.c | 266 +++ 3 files changed, 497 insertions(+), 4 deletions(-) -- 1.7.9.5
[PATCH 2/2] kvm-unit-tests: VMX: Test cases for nested EPT
Some test cases for nested EPT features, including: 1. EPT basic framework tests: read, write and remap. 2. EPT misconfigurations test cases: page permission misconfiguration and memory type misconfiguration 3. EPT violations test cases: page permission violation and paging structure violation Signed-off-by: Arthur Chunqi Li --- x86/vmx_tests.c | 266 +++ 1 file changed, 266 insertions(+) diff --git a/x86/vmx_tests.c b/x86/vmx_tests.c index c1b39f4..a0b9824 100644 --- a/x86/vmx_tests.c +++ b/x86/vmx_tests.c @@ -1,4 +1,36 @@ #include "vmx.h" +#include "processor.h" +#include "vm.h" +#include "msr.h" +#include "fwcfg.h" + +volatile u32 stage; +volatile bool init_fail; +unsigned long *pml4; +u64 eptp; +void *data_page1, *data_page2; + +static inline void set_stage(u32 s) +{ + barrier(); + stage = s; + barrier(); +} + +static inline u32 get_stage() +{ + u32 s; + + barrier(); + s = stage; + barrier(); + return s; +} + +static inline void vmcall() +{ + asm volatile ("vmcall"); +} void basic_init() { @@ -76,6 +108,238 @@ int vmenter_exit_handler() return VMX_TEST_VMEXIT; } +static int setup_ept() +{ + int support_2m; + unsigned long end_of_memory; + + if (!(ept_vpid.val & EPT_CAP_UC) && + !(ept_vpid.val & EPT_CAP_WB)) { + printf("\tEPT paging-structure memory type " + "UC&WB are not supported\n"); + return 1; + } + if (ept_vpid.val & EPT_CAP_UC) + eptp = EPT_MEM_TYPE_UC; + else + eptp = EPT_MEM_TYPE_WB; + if (!(ept_vpid.val & EPT_CAP_PWL4)) { + printf("\tPWL4 is not supported\n"); + return 1; + } + eptp |= (3 << EPTP_PG_WALK_LEN_SHIFT); + pml4 = alloc_page(); + memset(pml4, 0, PAGE_SIZE); + eptp |= virt_to_phys(pml4); + vmcs_write(EPTP, eptp); + support_2m = !!(ept_vpid.val & EPT_CAP_2M_PAGE); + end_of_memory = fwcfg_get_u64(FW_CFG_RAM_SIZE); + if (end_of_memory < (1ul << 32)) + end_of_memory = (1ul << 32); + if (setup_ept_range(pml4, 0, end_of_memory, + 0, support_2m, EPT_WA | EPT_RA | EPT_EA)) { + printf("\tSet ept tables failed.\n"); + return 1; + } + return 0; +} + 
+static void ept_init() +{ + u32 ctrl_cpu[2]; + + init_fail = false; + ctrl_cpu[0] = vmcs_read(CPU_EXEC_CTRL0); + ctrl_cpu[1] = vmcs_read(CPU_EXEC_CTRL1); + ctrl_cpu[0] = (ctrl_cpu[0] | CPU_SECONDARY) + & ctrl_cpu_rev[0].clr; + ctrl_cpu[1] = (ctrl_cpu[1] | CPU_EPT) + & ctrl_cpu_rev[1].clr; + vmcs_write(CPU_EXEC_CTRL0, ctrl_cpu[0]); + vmcs_write(CPU_EXEC_CTRL1, ctrl_cpu[1] | CPU_EPT); + if (setup_ept()) + init_fail = true; + data_page1 = alloc_page(); + data_page2 = alloc_page(); + memset(data_page1, 0x0, PAGE_SIZE); + memset(data_page2, 0x0, PAGE_SIZE); + *((u32 *)data_page1) = MAGIC_VAL_1; + *((u32 *)data_page2) = MAGIC_VAL_2; + install_ept(pml4, (unsigned long)data_page1, (unsigned long)data_page2, + EPT_RA | EPT_WA | EPT_EA); +} + +static void ept_main() +{ + if (init_fail) + return; + if (!(ctrl_cpu_rev[0].clr & CPU_SECONDARY) + && !(ctrl_cpu_rev[1].clr & CPU_EPT)) { + printf("\tEPT is not supported"); + return; + } + set_stage(0); + if (*((u32 *)data_page2) != MAGIC_VAL_1 && + *((u32 *)data_page1) != MAGIC_VAL_1) + report("EPT basic framework - read", 0); + else { + *((u32 *)data_page2) = MAGIC_VAL_3; + vmcall(); + if (get_stage() == 1) { + if (*((u32 *)data_page1) == MAGIC_VAL_3 && + *((u32 *)data_page2) == MAGIC_VAL_2) + report("EPT basic framework", 1); + else + report("EPT basic framework - remap", 1); + } + } + // Test EPT Misconfigurations + set_stage(1); + vmcall(); + *((u32 *)data_page1) = MAGIC_VAL_1; + if (get_stage() != 2) { + report("EPT misconfigurations", 0); + goto t1; + } + set_stage(2); + vmcall(); + *((u32 *)data_page1) = MAGIC_VAL_1; + if (get_stage() != 3) { + report("EPT misconfigurations", 0); + goto t1; + } + report
Some questions about nested EPT
Hi there, When I tested nested EPT (enabling EPT for the L2->L1 address translation), some questions came up when querying IA32_VMX_EPT_VPID_CAP. 1. It shows that bits 16 and 17 (support for 2M and 1G pages) are disabled in the nested IA32_VMX_EPT_VPID_CAP. Why does nested EPT fail to support these? Are there any difficulties? 2. Can bit 6 (support for a page-walk length of 4) of IA32_VMX_EPT_VPID_CAP be 0? That is to say, can I design a paging structure with more or fewer than 4 levels? Since I don't know who the original author of nested EPT is, I am sending this mail to the whole list. If anyone knows, please tell me and CC the authors for more detailed discussion. Thanks, Arthur -- Arthur Chunqi Li Department of Computer Science School of EECS Peking University Beijing, China
Re: [PATCH v7 00/15] Nested EPT
On 08/05/2013 10:07 AM, Gleb Natapov wrote: Xiao's comment about checking the ept pointer before flushing individual ept contexts is addressed here. Gleb Natapov (3): nEPT: make guest's A/D bits depends on guest's paging mode nEPT: Support shadow paging for guest paging without A/D bits nEPT: correctly check if remote tlb flush is needed for shadowed EPT tables Nadav Har'El (10): nEPT: Support LOAD_IA32_EFER entry/exit controls for L1 nEPT: Fix cr3 handling in nested exit and entry nEPT: Fix wrong test in kvm_set_cr3 nEPT: Move common code to paging_tmpl.h nEPT: Add EPT tables support to paging_tmpl.h nEPT: MMU context for nested EPT nEPT: Nested INVEPT nEPT: Advertise EPT to L1 nEPT: Some additional comments nEPT: Miscelleneous cleanups Yang Zhang (2): nEPT: Redefine EPT-specific link_shadow_page() nEPT: Add nEPT violation/misconfigration support arch/x86/include/asm/kvm_host.h |4 + arch/x86/include/asm/vmx.h |2 + arch/x86/include/uapi/asm/vmx.h |1 + arch/x86/kvm/mmu.c | 170 +- arch/x86/kvm/mmu.h |2 + arch/x86/kvm/paging_tmpl.h | 176 +++ arch/x86/kvm/vmx.c | 220 --- arch/x86/kvm/x86.c | 11 -- 8 files changed, 467 insertions(+), 119 deletions(-) Applied, thanks (rebased on top of Xiao's walk_addr_generic fix). Paolo
[PATCH v7 00/15] Nested EPT
Xiao's comment about checking the ept pointer before flushing individual ept contexts is addressed here. Gleb Natapov (3): nEPT: make guest's A/D bits depends on guest's paging mode nEPT: Support shadow paging for guest paging without A/D bits nEPT: correctly check if remote tlb flush is needed for shadowed EPT tables Nadav Har'El (10): nEPT: Support LOAD_IA32_EFER entry/exit controls for L1 nEPT: Fix cr3 handling in nested exit and entry nEPT: Fix wrong test in kvm_set_cr3 nEPT: Move common code to paging_tmpl.h nEPT: Add EPT tables support to paging_tmpl.h nEPT: MMU context for nested EPT nEPT: Nested INVEPT nEPT: Advertise EPT to L1 nEPT: Some additional comments nEPT: Miscelleneous cleanups Yang Zhang (2): nEPT: Redefine EPT-specific link_shadow_page() nEPT: Add nEPT violation/misconfigration support arch/x86/include/asm/kvm_host.h |4 + arch/x86/include/asm/vmx.h |2 + arch/x86/include/uapi/asm/vmx.h |1 + arch/x86/kvm/mmu.c | 170 +- arch/x86/kvm/mmu.h |2 + arch/x86/kvm/paging_tmpl.h | 176 +++ arch/x86/kvm/vmx.c | 220 --- arch/x86/kvm/x86.c | 11 -- 8 files changed, 467 insertions(+), 119 deletions(-) -- 1.7.10.4
[PATCH v7 11/15] nEPT: MMU context for nested EPT
From: Nadav Har'El KVM's existing shadow MMU code already supports nested TDP. To use it, we need to set up a new "MMU context" for nested EPT, and create a few callbacks for it (nested_ept_*()). This context should also use the EPT versions of the page table access functions (defined in the previous patch). Then, we need to switch back and forth between this nested context and the regular MMU context when switching between L1 and L2 (when L1 runs this L2 with EPT). Reviewed-by: Xiao Guangrong Signed-off-by: Nadav Har'El Signed-off-by: Jun Nakajima Signed-off-by: Xinhao Xu Signed-off-by: Yang Zhang Signed-off-by: Gleb Natapov --- arch/x86/kvm/mmu.c | 27 +++ arch/x86/kvm/mmu.h |2 ++ arch/x86/kvm/vmx.c | 41 - 3 files changed, 69 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index f2d982d..e3bfdde 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -3795,6 +3795,33 @@ int kvm_init_shadow_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context) } EXPORT_SYMBOL_GPL(kvm_init_shadow_mmu); +int kvm_init_shadow_ept_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context, + bool execonly) +{ + ASSERT(vcpu); + ASSERT(!VALID_PAGE(vcpu->arch.mmu.root_hpa)); + + context->shadow_root_level = kvm_x86_ops->get_tdp_level(); + + context->nx = true; + context->new_cr3 = paging_new_cr3; + context->page_fault = ept_page_fault; + context->gva_to_gpa = ept_gva_to_gpa; + context->sync_page = ept_sync_page; + context->invlpg = ept_invlpg; + context->update_pte = ept_update_pte; + context->free = paging_free; + context->root_level = context->shadow_root_level; + context->root_hpa = INVALID_PAGE; + context->direct_map = false; + + update_permission_bitmask(vcpu, context, true); + reset_rsvds_bits_mask_ept(vcpu, context, execonly); + + return 0; +} +EXPORT_SYMBOL_GPL(kvm_init_shadow_ept_mmu); + static int init_kvm_softmmu(struct kvm_vcpu *vcpu) { int r = kvm_init_shadow_mmu(vcpu, vcpu->arch.walk_mmu); diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h index 
5b59c57..77e044a 100644 --- a/arch/x86/kvm/mmu.h +++ b/arch/x86/kvm/mmu.h @@ -71,6 +71,8 @@ enum { int handle_mmio_page_fault_common(struct kvm_vcpu *vcpu, u64 addr, bool direct); int kvm_init_shadow_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context); +int kvm_init_shadow_ept_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context, + bool execonly); static inline unsigned int kvm_mmu_available_pages(struct kvm *kvm) { diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 984f8d7..fbfabbe 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -1046,6 +1046,11 @@ static inline bool nested_cpu_has_virtual_nmis(struct vmcs12 *vmcs12, return vmcs12->pin_based_vm_exec_control & PIN_BASED_VIRTUAL_NMIS; } +static inline int nested_cpu_has_ept(struct vmcs12 *vmcs12) +{ + return nested_cpu_has2(vmcs12, SECONDARY_EXEC_ENABLE_EPT); +} + static inline bool is_exception(u32 intr_info) { return (intr_info & (INTR_INFO_INTR_TYPE_MASK | INTR_INFO_VALID_MASK)) @@ -7367,6 +7372,33 @@ static void nested_ept_inject_page_fault(struct kvm_vcpu *vcpu, vmcs12->guest_physical_address = fault->address; } +/* Callbacks for nested_ept_init_mmu_context: */ + +static unsigned long nested_ept_get_cr3(struct kvm_vcpu *vcpu) +{ + /* return the page table to be shadowed - in our case, EPT12 */ + return get_vmcs12(vcpu)->ept_pointer; +} + +static int nested_ept_init_mmu_context(struct kvm_vcpu *vcpu) +{ + int r = kvm_init_shadow_ept_mmu(vcpu, &vcpu->arch.mmu, + nested_vmx_ept_caps & VMX_EPT_EXECUTE_ONLY_BIT); + + vcpu->arch.mmu.set_cr3 = vmx_set_cr3; + vcpu->arch.mmu.get_cr3 = nested_ept_get_cr3; + vcpu->arch.mmu.inject_page_fault = nested_ept_inject_page_fault; + + vcpu->arch.walk_mmu = &vcpu->arch.nested_mmu; + + return r; +} + +static void nested_ept_uninit_mmu_context(struct kvm_vcpu *vcpu) +{ + vcpu->arch.walk_mmu = &vcpu->arch.mmu; +} + /* * prepare_vmcs02 is called when the L1 guest hypervisor runs its nested * L2 guest. 
L1 has a vmcs for L2 (vmcs12), and this function "merges" it @@ -7587,6 +7619,11 @@ static void prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12) vmx_flush_tlb(vcpu); } + if (nested_cpu_has_ept(vmcs12)) { + kvm_mmu_unload(vcpu); + nested_ept_init_mmu_context(vcpu); + } + if (vmcs12->vm_entry_controls & VM_ENTRY_LOAD_IA32_EFER) vcpu->arch.efer = vmcs12->guest_ia32_efer; else if (vmcs12->vm_entry_contro
Re: [PATCH v6 00/15] Nested EPT
On Mon, Aug 05, 2013 at 01:19:26AM +0800, Xiao Guangrong wrote: > > On Aug 5, 2013, at 12:58 AM, Gleb Natapov wrote: > > > On Sun, Aug 04, 2013 at 06:42:09PM +0200, Jan Kiszka wrote: > >> On 2013-08-04 18:15, Xiao Guangrong wrote: > >>> > >>> On Aug 4, 2013, at 11:14 PM, Jan Kiszka wrote: > >>> > >>>> On 2013-08-04 15:44, Gleb Natapov wrote: > >>>>> On Sun, Aug 04, 2013 at 12:53:56PM +0300, Gleb Natapov wrote: > >>>>>> On Sun, Aug 04, 2013 at 12:32:06PM +0300, Gleb Natapov wrote: > >>>>>>> On Sun, Aug 04, 2013 at 11:24:41AM +0200, Jan Kiszka wrote: > >>>>>>>> On 2013-08-01 16:08, Gleb Natapov wrote: > >>>>>>>>> Another day -- another version of the nested EPT patches. In this > >>>>>>>>> version > >>>>>>>>> included fix for need_remote_flush() with shadowed ept, set bits 6:8 > >>>>>>>>> of exit_qualification during ept_violation, > >>>>>>>>> update_permission_bitmask() > >>>>>>>>> made to work with shadowed ept pages and other small adjustment > >>>>>>>>> according > >>>>>>>>> to review comments. > >>>>>>>> > >>>>>>>> Was just testing it here and ran into a bug: I've L2 accessing the > >>>>>>>> HPET > >>>>>>>> MMIO region that my L1 passed through from L0 (where it is supposed > >>>>>>>> to > >>>>>>>> be emulated in this setup). This used to work with an older posting > >>>>>>>> of > >>>>>>> Not sure I understand your setup. L0 emulates HPET, L1 passes it > >>>>>>> through > >>>>>>> to L2 (mmaps it and creates kvm slot that points to it) and when L2 > >>>>>>> accessed it it locks up? > >>>>>>> > >>>>>>>> Jun, but now it locks up (infinite loop over L2's MMIO access, no > >>>>>>>> L2->L1 > >>>>>>>> transition). Any ideas where to look for debugging this? > >>>>>>>> > >>>>>>> Can you do an ftrace -e kvm -e kvmmmu? Unit test will also be helpful > >>>>>>> :) > >>>>>>> > >>>>>> I did an MMIO access from nested guest in the vmx unit test (which is > >>>>>> naturally passed through to L0 since L1 is so simple) and I can see > >>>>>> that > >>>>>> the access hits L0. 
> >>>>>> > >>>>> But then unit test not yet uses nested EPT :) > >>>> > >>>> Indeed, that's what I was about to notice as well. EPT test cases are on > >>>> Arthur's list, but I suggested to start easier with some MSR switches > >>>> (just to let him run into KVM's PAT bugs ;) ). > >>>> > >>>> Anyway, here are the traces: > >>>> > >>>> qemu-system-x86-11521 [000] 4724.170191: kvm_entry:vcpu 0 > >>>> qemu-system-x86-11521 [000] 4724.170192: kvm_exit: reason > >>>> EPT_VIOLATION rip 0x8102ab70 info 181 0 > >>>> qemu-system-x86-11521 [000] 4724.170192: kvm_page_fault: address > >>>> 1901978 error_code 181 > >>>> qemu-system-x86-11521 [000] 4724.170193: kvm_mmu_pagetable_walk: addr > >>>> 1901978 pferr 0 > >>>> qemu-system-x86-11521 [000] 4724.170193: kvm_mmu_paging_element: pte > >>>> 3c04c007 level 4 > >>>> qemu-system-x86-11521 [000] 4724.170193: kvm_mmu_paging_element: pte > >>>> 3c04d007 level 3 > >>>> qemu-system-x86-11521 [000] 4724.170193: kvm_mmu_paging_element: pte > >>>> 3c05a007 level 2 > >>>> qemu-system-x86-11521 [000] 4724.170193: kvm_mmu_paging_element: pte > >>>> 1901037 level 1 > >>>> qemu-system-x86-11521 [000] 4724.170197: kvm_entry:vcpu 0 > >>>> qemu-system-x86-11521 [000] 4724.170198: kvm_exit: reason > >>>> EPT_VIOLATION rip 0x8102ab77 info 81 0 > >>>> qemu-system-x86-11521 [000] 4724.170199: kvm_page_fault: address > >>>> 3a029000 error_code 81 > >>>> qemu-system-x86-1
Re: [PATCH v6 00/15] Nested EPT
On Aug 5, 2013, at 12:58 AM, Gleb Natapov wrote: > On Sun, Aug 04, 2013 at 06:42:09PM +0200, Jan Kiszka wrote: >> On 2013-08-04 18:15, Xiao Guangrong wrote: >>> >>> On Aug 4, 2013, at 11:14 PM, Jan Kiszka wrote: >>> >>>> On 2013-08-04 15:44, Gleb Natapov wrote: >>>>> On Sun, Aug 04, 2013 at 12:53:56PM +0300, Gleb Natapov wrote: >>>>>> On Sun, Aug 04, 2013 at 12:32:06PM +0300, Gleb Natapov wrote: >>>>>>> On Sun, Aug 04, 2013 at 11:24:41AM +0200, Jan Kiszka wrote: >>>>>>>> On 2013-08-01 16:08, Gleb Natapov wrote: >>>>>>>>> Another day -- another version of the nested EPT patches. In this >>>>>>>>> version >>>>>>>>> included fix for need_remote_flush() with shadowed ept, set bits 6:8 >>>>>>>>> of exit_qualification during ept_violation, >>>>>>>>> update_permission_bitmask() >>>>>>>>> made to work with shadowed ept pages and other small adjustment >>>>>>>>> according >>>>>>>>> to review comments. >>>>>>>> >>>>>>>> Was just testing it here and ran into a bug: I've L2 accessing the HPET >>>>>>>> MMIO region that my L1 passed through from L0 (where it is supposed to >>>>>>>> be emulated in this setup). This used to work with an older posting of >>>>>>> Not sure I understand your setup. L0 emulates HPET, L1 passes it through >>>>>>> to L2 (mmaps it and creates kvm slot that points to it) and when L2 >>>>>>> accessed it it locks up? >>>>>>> >>>>>>>> Jun, but now it locks up (infinite loop over L2's MMIO access, no >>>>>>>> L2->L1 >>>>>>>> transition). Any ideas where to look for debugging this? >>>>>>>> >>>>>>> Can you do an ftrace -e kvm -e kvmmmu? Unit test will also be helpful :) >>>>>>> >>>>>> I did an MMIO access from nested guest in the vmx unit test (which is >>>>>> naturally passed through to L0 since L1 is so simple) and I can see that >>>>>> the access hits L0. >>>>>> >>>>> But then unit test not yet uses nested EPT :) >>>> >>>> Indeed, that's what I was about to notice as well. 
EPT test cases are on >>>> Arthur's list, but I suggested to start easier with some MSR switches >>>> (just to let him run into KVM's PAT bugs ;) ). >>>> >>>> Anyway, here are the traces: >>>> >>>> qemu-system-x86-11521 [000] 4724.170191: kvm_entry:vcpu 0 >>>> qemu-system-x86-11521 [000] 4724.170192: kvm_exit: reason >>>> EPT_VIOLATION rip 0x8102ab70 info 181 0 >>>> qemu-system-x86-11521 [000] 4724.170192: kvm_page_fault: address >>>> 1901978 error_code 181 >>>> qemu-system-x86-11521 [000] 4724.170193: kvm_mmu_pagetable_walk: addr >>>> 1901978 pferr 0 >>>> qemu-system-x86-11521 [000] 4724.170193: kvm_mmu_paging_element: pte >>>> 3c04c007 level 4 >>>> qemu-system-x86-11521 [000] 4724.170193: kvm_mmu_paging_element: pte >>>> 3c04d007 level 3 >>>> qemu-system-x86-11521 [000] 4724.170193: kvm_mmu_paging_element: pte >>>> 3c05a007 level 2 >>>> qemu-system-x86-11521 [000] 4724.170193: kvm_mmu_paging_element: pte >>>> 1901037 level 1 >>>> qemu-system-x86-11521 [000] 4724.170197: kvm_entry:vcpu 0 >>>> qemu-system-x86-11521 [000] 4724.170198: kvm_exit: reason >>>> EPT_VIOLATION rip 0x8102ab77 info 81 0 >>>> qemu-system-x86-11521 [000] 4724.170199: kvm_page_fault: address >>>> 3a029000 error_code 81 >>>> qemu-system-x86-11521 [000] 4724.170199: kvm_mmu_pagetable_walk: addr >>>> 3a029000 pferr 0 >>>> qemu-system-x86-11521 [000] 4724.170199: kvm_mmu_paging_element: pte >>>> 3c04c007 level 4 >>>> qemu-system-x86-11521 [000] 4724.170199: kvm_mmu_paging_element: pte >>>> 3c04d007 level 3 >>>> qemu-system-x86-11521 [000] 4724.170199: kvm_mmu_paging_element: pte >>>> 3c21e007 level 2 >>>> qemu-system-x86-11521 [000] 4724.170200: kvm_mmu_paging_element: pte >>>> 3a029037 level 1 >>>> qemu-system-x86-11521 [000] 472
Re: [PATCH v6 00/15] Nested EPT
On Sun, Aug 04, 2013 at 06:42:09PM +0200, Jan Kiszka wrote: > On 2013-08-04 18:15, Xiao Guangrong wrote: > > > > On Aug 4, 2013, at 11:14 PM, Jan Kiszka wrote: > > > >> On 2013-08-04 15:44, Gleb Natapov wrote: > >>> On Sun, Aug 04, 2013 at 12:53:56PM +0300, Gleb Natapov wrote: > >>>> On Sun, Aug 04, 2013 at 12:32:06PM +0300, Gleb Natapov wrote: > >>>>> On Sun, Aug 04, 2013 at 11:24:41AM +0200, Jan Kiszka wrote: > >>>>>> On 2013-08-01 16:08, Gleb Natapov wrote: > >>>>>>> Another day -- another version of the nested EPT patches. In this > >>>>>>> version > >>>>>>> included fix for need_remote_flush() with shadowed ept, set bits 6:8 > >>>>>>> of exit_qualification during ept_violation, > >>>>>>> update_permission_bitmask() > >>>>>>> made to work with shadowed ept pages and other small adjustment > >>>>>>> according > >>>>>>> to review comments. > >>>>>> > >>>>>> Was just testing it here and ran into a bug: I've L2 accessing the HPET > >>>>>> MMIO region that my L1 passed through from L0 (where it is supposed to > >>>>>> be emulated in this setup). This used to work with an older posting of > >>>>> Not sure I understand your setup. L0 emulates HPET, L1 passes it through > >>>>> to L2 (mmaps it and creates kvm slot that points to it) and when L2 > >>>>> accessed it it locks up? > >>>>> > >>>>>> Jun, but now it locks up (infinite loop over L2's MMIO access, no > >>>>>> L2->L1 > >>>>>> transition). Any ideas where to look for debugging this? > >>>>>> > >>>>> Can you do an ftrace -e kvm -e kvmmmu? Unit test will also be helpful :) > >>>>> > >>>> I did an MMIO access from nested guest in the vmx unit test (which is > >>>> naturally passed through to L0 since L1 is so simple) and I can see that > >>>> the access hits L0. > >>>> > >>> But then unit test not yet uses nested EPT :) > >> > >> Indeed, that's what I was about to notice as well. 
EPT test cases are on > >> Arthur's list, but I suggested to start easier with some MSR switches > >> (just to let him run into KVM's PAT bugs ;) ). > >> > >> Anyway, here are the traces: > >> > >> qemu-system-x86-11521 [000] 4724.170191: kvm_entry:vcpu 0 > >> qemu-system-x86-11521 [000] 4724.170192: kvm_exit: reason > >> EPT_VIOLATION rip 0x8102ab70 info 181 0 > >> qemu-system-x86-11521 [000] 4724.170192: kvm_page_fault: address > >> 1901978 error_code 181 > >> qemu-system-x86-11521 [000] 4724.170193: kvm_mmu_pagetable_walk: addr > >> 1901978 pferr 0 > >> qemu-system-x86-11521 [000] 4724.170193: kvm_mmu_paging_element: pte > >> 3c04c007 level 4 > >> qemu-system-x86-11521 [000] 4724.170193: kvm_mmu_paging_element: pte > >> 3c04d007 level 3 > >> qemu-system-x86-11521 [000] 4724.170193: kvm_mmu_paging_element: pte > >> 3c05a007 level 2 > >> qemu-system-x86-11521 [000] 4724.170193: kvm_mmu_paging_element: pte > >> 1901037 level 1 > >> qemu-system-x86-11521 [000] 4724.170197: kvm_entry:vcpu 0 > >> qemu-system-x86-11521 [000] 4724.170198: kvm_exit: reason > >> EPT_VIOLATION rip 0x8102ab77 info 81 0 > >> qemu-system-x86-11521 [000] 4724.170199: kvm_page_fault: address > >> 3a029000 error_code 81 > >> qemu-system-x86-11521 [000] 4724.170199: kvm_mmu_pagetable_walk: addr > >> 3a029000 pferr 0 > >> qemu-system-x86-11521 [000] 4724.170199: kvm_mmu_paging_element: pte > >> 3c04c007 level 4 > >> qemu-system-x86-11521 [000] 4724.170199: kvm_mmu_paging_element: pte > >> 3c04d007 level 3 > >> qemu-system-x86-11521 [000] 4724.170199: kvm_mmu_paging_element: pte > >> 3c21e007 level 2 > >> qemu-system-x86-11521 [000] 4724.170200: kvm_mmu_paging_element: pte > >> 3a029037 level 1 > >> qemu-system-x86-11521 [000] 4724.170203: kvm_entry:vcpu 0 > >> qemu-system-x86-11521 [000] 4724.170204: kvm_exit: reason > >> EPT_VIOLATION rip 0x8102ab77 info 181 0 > >> qemu-system-x86-11521 [000] 4724.170204: kvm_page_fault: address > >>
Re: [PATCH v6 00/15] Nested EPT
Re: [PATCH v6 00/15] Nested EPT
On 2013-08-04 15:44, Gleb Natapov wrote:
> On Sun, Aug 04, 2013 at 12:53:56PM +0300, Gleb Natapov wrote:
>> On Sun, Aug 04, 2013 at 12:32:06PM +0300, Gleb Natapov wrote:
>>> On Sun, Aug 04, 2013 at 11:24:41AM +0200, Jan Kiszka wrote:
>>>> On 2013-08-01 16:08, Gleb Natapov wrote:
>>>>> Another day -- another version of the nested EPT patches. In this version
>>>>> included fix for need_remote_flush() with shadowed ept, set bits 6:8
>>>>> of exit_qualification during ept_violation, update_permission_bitmask()
>>>>> made to work with shadowed ept pages and other small adjustment according
>>>>> to review comments.
>>>>
>>>> Was just testing it here and ran into a bug: I've L2 accessing the HPET
>>>> MMIO region that my L1 passed through from L0 (where it is supposed to
>>>> be emulated in this setup). This used to work with an older posting of
>>> Not sure I understand your setup. L0 emulates HPET, L1 passes it through
>>> to L2 (mmaps it and creates kvm slot that points to it) and when L2
>>> accessed it it locks up?
>>>
>>>> Jun, but now it locks up (infinite loop over L2's MMIO access, no L2->L1
>>>> transition). Any ideas where to look for debugging this?
>>>>
>>> Can you do an ftrace -e kvm -e kvmmmu? Unit test will also be helpful :)
>>>
>> I did an MMIO access from nested guest in the vmx unit test (which is
>> naturally passed through to L0 since L1 is so simple) and I can see that
>> the access hits L0.
>>
> But then unit test not yet uses nested EPT :)

Indeed, that's what I was about to notice as well. EPT test cases are on
Arthur's list, but I suggested to start easier with some MSR switches
(just to let him run into KVM's PAT bugs ;) ).

Anyway, here are the traces:

qemu-system-x86-11521 [000]  4724.170191: kvm_entry: vcpu 0
qemu-system-x86-11521 [000]  4724.170192: kvm_exit: reason EPT_VIOLATION rip 0x8102ab70 info 181 0
qemu-system-x86-11521 [000]  4724.170192: kvm_page_fault: address 1901978 error_code 181
qemu-system-x86-11521 [000]  4724.170193: kvm_mmu_pagetable_walk: addr 1901978 pferr 0
qemu-system-x86-11521 [000]  4724.170193: kvm_mmu_paging_element: pte 3c04c007 level 4
qemu-system-x86-11521 [000]  4724.170193: kvm_mmu_paging_element: pte 3c04d007 level 3
qemu-system-x86-11521 [000]  4724.170193: kvm_mmu_paging_element: pte 3c05a007 level 2
qemu-system-x86-11521 [000]  4724.170193: kvm_mmu_paging_element: pte 1901037 level 1
qemu-system-x86-11521 [000]  4724.170197: kvm_entry: vcpu 0
qemu-system-x86-11521 [000]  4724.170198: kvm_exit: reason EPT_VIOLATION rip 0x8102ab77 info 81 0
qemu-system-x86-11521 [000]  4724.170199: kvm_page_fault: address 3a029000 error_code 81
qemu-system-x86-11521 [000]  4724.170199: kvm_mmu_pagetable_walk: addr 3a029000 pferr 0
qemu-system-x86-11521 [000]  4724.170199: kvm_mmu_paging_element: pte 3c04c007 level 4
qemu-system-x86-11521 [000]  4724.170199: kvm_mmu_paging_element: pte 3c04d007 level 3
qemu-system-x86-11521 [000]  4724.170199: kvm_mmu_paging_element: pte 3c21e007 level 2
qemu-system-x86-11521 [000]  4724.170200: kvm_mmu_paging_element: pte 3a029037 level 1
qemu-system-x86-11521 [000]  4724.170203: kvm_entry: vcpu 0
qemu-system-x86-11521 [000]  4724.170204: kvm_exit: reason EPT_VIOLATION rip 0x8102ab77 info 181 0
qemu-system-x86-11521 [000]  4724.170204: kvm_page_fault: address fed000f0 error_code 181
qemu-system-x86-11521 [000]  4724.170205: kvm_mmu_pagetable_walk: addr fed000f0 pferr 0
qemu-system-x86-11521 [000]  4724.170205: kvm_mmu_paging_element: pte 3c04c007 level 4
qemu-system-x86-11521 [000]  4724.170205: kvm_mmu_paging_element: pte 3c42f003 level 3
qemu-system-x86-11521 [000]  4724.170205: kvm_mmu_paging_element: pte 3c626003 level 2
qemu-system-x86-11521 [000]  4724.170206: kvm_mmu_paging_element: pte fed00033 level 1
qemu-system-x86-11521 [000]  4724.170213: mark_mmio_spte: sptep:0x88014e8ad800 gfn fed00 access 6 gen b7f
qemu-system-x86-11521 [000]  4724.170214: kvm_mmu_pagetable_walk: addr 8102ab77 pferr 10 F
qemu-system-x86-11521 [000]  4724.170215: kvm_mmu_pagetable_walk: addr 171 pferr 6 W|U
qemu-system-x86-11521 [000]  4724.170215: kvm_mmu_paging_element: pte 3c04c007 level 4
qemu-system-x86-11521 [000]  4724.170215: kvm_mmu_paging_element: pte 3c04d007 level 3
qemu-system-x86-11521 [000]  4724.170216: kvm_mmu_paging_element: pte 3c059007 level 2
qemu-system-x86-11521 [000]  4724.170216: kvm_mmu_paging_element: pte 1710037 level 1
qemu-system-x86-11521 [000]  4724.170216: kvm_mmu_paging_element: pte 1711067 level 4
qemu-system-x86-11521 [000]
Re: [PATCH v6 00/15] Nested EPT
On Sun, Aug 04, 2013 at 12:53:56PM +0300, Gleb Natapov wrote:
> On Sun, Aug 04, 2013 at 12:32:06PM +0300, Gleb Natapov wrote:
> > On Sun, Aug 04, 2013 at 11:24:41AM +0200, Jan Kiszka wrote:
> > > On 2013-08-01 16:08, Gleb Natapov wrote:
> > > > Another day -- another version of the nested EPT patches. In this version
> > > > included fix for need_remote_flush() with shadowed ept, set bits 6:8
> > > > of exit_qualification during ept_violation, update_permission_bitmask()
> > > > made to work with shadowed ept pages and other small adjustment according
> > > > to review comments.
> > >
> > > Was just testing it here and ran into a bug: I've L2 accessing the HPET
> > > MMIO region that my L1 passed through from L0 (where it is supposed to
> > > be emulated in this setup). This used to work with an older posting of
> > Not sure I understand your setup. L0 emulates HPET, L1 passes it through
> > to L2 (mmaps it and creates kvm slot that points to it) and when L2
> > accessed it it locks up?
> >
> > > Jun, but now it locks up (infinite loop over L2's MMIO access, no L2->L1
> > > transition). Any ideas where to look for debugging this?
> > >
> > Can you do an ftrace -e kvm -e kvmmmu? Unit test will also be helpful :)
> >
> I did an MMIO access from nested guest in the vmx unit test (which is
> naturally passed through to L0 since L1 is so simple) and I can see that
> the access hits L0.
>
But then unit test not yet uses nested EPT :)

--
Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v6 00/15] Nested EPT
On Sun, Aug 04, 2013 at 12:32:06PM +0300, Gleb Natapov wrote:
> On Sun, Aug 04, 2013 at 11:24:41AM +0200, Jan Kiszka wrote:
> > On 2013-08-01 16:08, Gleb Natapov wrote:
> > > Another day -- another version of the nested EPT patches. In this version
> > > included fix for need_remote_flush() with shadowed ept, set bits 6:8
> > > of exit_qualification during ept_violation, update_permission_bitmask()
> > > made to work with shadowed ept pages and other small adjustment according
> > > to review comments.
> >
> > Was just testing it here and ran into a bug: I've L2 accessing the HPET
> > MMIO region that my L1 passed through from L0 (where it is supposed to
> > be emulated in this setup). This used to work with an older posting of
> Not sure I understand your setup. L0 emulates HPET, L1 passes it through
> to L2 (mmaps it and creates kvm slot that points to it) and when L2
> accessed it it locks up?
>
> > Jun, but now it locks up (infinite loop over L2's MMIO access, no L2->L1
> > transition). Any ideas where to look for debugging this?
> >
> Can you do an ftrace -e kvm -e kvmmmu? Unit test will also be helpful :)
>
I did an MMIO access from nested guest in the vmx unit test (which is
naturally passed through to L0 since L1 is so simple) and I can see that
the access hits L0.

--
Gleb.
Re: [PATCH v6 00/15] Nested EPT
On Sun, Aug 04, 2013 at 11:24:41AM +0200, Jan Kiszka wrote:
> On 2013-08-01 16:08, Gleb Natapov wrote:
> > Another day -- another version of the nested EPT patches. In this version
> > included fix for need_remote_flush() with shadowed ept, set bits 6:8
> > of exit_qualification during ept_violation, update_permission_bitmask()
> > made to work with shadowed ept pages and other small adjustment according
> > to review comments.
>
> Was just testing it here and ran into a bug: I've L2 accessing the HPET
> MMIO region that my L1 passed through from L0 (where it is supposed to
> be emulated in this setup). This used to work with an older posting of
Not sure I understand your setup. L0 emulates HPET, L1 passes it through
to L2 (mmaps it and creates kvm slot that points to it) and when L2
accessed it it locks up?
>
> Jun, but now it locks up (infinite loop over L2's MMIO access, no L2->L1
> transition). Any ideas where to look for debugging this?
>
Can you do an ftrace -e kvm -e kvmmmu? Unit test will also be helpful :)

--
Gleb.
Re: [PATCH v6 00/15] Nested EPT
On 2013-08-01 16:08, Gleb Natapov wrote:
> Another day -- another version of the nested EPT patches. In this version
> included fix for need_remote_flush() with shadowed ept, set bits 6:8
> of exit_qualification during ept_violation, update_permission_bitmask()
> made to work with shadowed ept pages and other small adjustment according
> to review comments.

Was just testing it here and ran into a bug: I've L2 accessing the HPET
MMIO region that my L1 passed through from L0 (where it is supposed to
be emulated in this setup). This used to work with an older posting of
Jun, but now it locks up (infinite loop over L2's MMIO access, no L2->L1
transition). Any ideas where to look for debugging this?

Jan
Re: [PATCH v6 12/15] nEPT: MMU context for nested EPT
On 08/01/2013 10:08 PM, Gleb Natapov wrote:
> From: Nadav Har'El
>
> KVM's existing shadow MMU code already supports nested TDP. To use it, we
> need to set up a new "MMU context" for nested EPT, and create a few callbacks
> for it (nested_ept_*()). This context should also use the EPT versions of
> the page table access functions (defined in the previous patch).
> Then, we need to switch back and forth between this nested context and the
> regular MMU context when switching between L1 and L2 (when L1 runs this L2
> with EPT).

Reviewed-by: Xiao Guangrong
[PATCH v6 00/15] Nested EPT
Another day -- another version of the nested EPT patches. In this version
included fix for need_remote_flush() with shadowed ept, set bits 6:8
of exit_qualification during ept_violation, update_permission_bitmask()
made to work with shadowed ept pages and other small adjustment according
to review comments.

Gleb Natapov (3):
  nEPT: make guest's A/D bits depends on guest's paging mode
  nEPT: Support shadow paging for guest paging without A/D bits
  nEPT: correctly check if remote tlb flush is needed for shadowed EPT tables

Nadav Har'El (10):
  nEPT: Support LOAD_IA32_EFER entry/exit controls for L1
  nEPT: Fix cr3 handling in nested exit and entry
  nEPT: Fix wrong test in kvm_set_cr3
  nEPT: Move common code to paging_tmpl.h
  nEPT: Add EPT tables support to paging_tmpl.h
  nEPT: Nested INVEPT
  nEPT: MMU context for nested EPT
  nEPT: Advertise EPT to L1
  nEPT: Some additional comments
  nEPT: Miscelleneous cleanups

Yang Zhang (2):
  nEPT: Redefine EPT-specific link_shadow_page()
  nEPT: Add nEPT violation/misconfigration support

 arch/x86/include/asm/kvm_host.h |   4 +
 arch/x86/include/asm/vmx.h      |   2 +
 arch/x86/include/uapi/asm/vmx.h |   1 +
 arch/x86/kvm/mmu.c              | 170 ++-
 arch/x86/kvm/mmu.h              |   2 +
 arch/x86/kvm/paging_tmpl.h      | 176 
 arch/x86/kvm/vmx.c              | 215 ---
 arch/x86/kvm/x86.c              |  11 --
 8 files changed, 462 insertions(+), 119 deletions(-)

--
1.7.10.4
[PATCH v6 12/15] nEPT: MMU context for nested EPT
From: Nadav Har'El KVM's existing shadow MMU code already supports nested TDP. To use it, we need to set up a new "MMU context" for nested EPT, and create a few callbacks for it (nested_ept_*()). This context should also use the EPT versions of the page table access functions (defined in the previous patch). Then, we need to switch back and forth between this nested context and the regular MMU context when switching between L1 and L2 (when L1 runs this L2 with EPT). Signed-off-by: Nadav Har'El Signed-off-by: Jun Nakajima Signed-off-by: Xinhao Xu Signed-off-by: Yang Zhang Signed-off-by: Gleb Natapov --- arch/x86/kvm/mmu.c | 27 +++ arch/x86/kvm/mmu.h |2 ++ arch/x86/kvm/vmx.c | 41 - 3 files changed, 69 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index 81b73bc..c0b4e0f 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -3797,6 +3797,33 @@ int kvm_init_shadow_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context) } EXPORT_SYMBOL_GPL(kvm_init_shadow_mmu); +int kvm_init_shadow_ept_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context, + bool execonly) +{ + ASSERT(vcpu); + ASSERT(!VALID_PAGE(vcpu->arch.mmu.root_hpa)); + + context->shadow_root_level = kvm_x86_ops->get_tdp_level(); + + context->nx = true; + context->new_cr3 = paging_new_cr3; + context->page_fault = ept_page_fault; + context->gva_to_gpa = ept_gva_to_gpa; + context->sync_page = ept_sync_page; + context->invlpg = ept_invlpg; + context->update_pte = ept_update_pte; + context->free = paging_free; + context->root_level = context->shadow_root_level; + context->root_hpa = INVALID_PAGE; + context->direct_map = false; + + update_permission_bitmask(vcpu, context, true); + reset_rsvds_bits_mask_ept(vcpu, context, execonly); + + return 0; +} +EXPORT_SYMBOL_GPL(kvm_init_shadow_ept_mmu); + static int init_kvm_softmmu(struct kvm_vcpu *vcpu) { int r = kvm_init_shadow_mmu(vcpu, vcpu->arch.walk_mmu); diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h index 5b59c57..77e044a 100644 --- 
a/arch/x86/kvm/mmu.h +++ b/arch/x86/kvm/mmu.h @@ -71,6 +71,8 @@ enum { int handle_mmio_page_fault_common(struct kvm_vcpu *vcpu, u64 addr, bool direct); int kvm_init_shadow_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context); +int kvm_init_shadow_ept_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context, + bool execonly); static inline unsigned int kvm_mmu_available_pages(struct kvm *kvm) { diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 2d84875..627b504 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -1046,6 +1046,11 @@ static inline bool nested_cpu_has_virtual_nmis(struct vmcs12 *vmcs12, return vmcs12->pin_based_vm_exec_control & PIN_BASED_VIRTUAL_NMIS; } +static inline int nested_cpu_has_ept(struct vmcs12 *vmcs12) +{ + return nested_cpu_has2(vmcs12, SECONDARY_EXEC_ENABLE_EPT); +} + static inline bool is_exception(u32 intr_info) { return (intr_info & (INTR_INFO_INTR_TYPE_MASK | INTR_INFO_VALID_MASK)) @@ -7434,6 +7439,33 @@ static void nested_ept_inject_page_fault(struct kvm_vcpu *vcpu, vmcs12->guest_physical_address = fault->address; } +/* Callbacks for nested_ept_init_mmu_context: */ + +static unsigned long nested_ept_get_cr3(struct kvm_vcpu *vcpu) +{ + /* return the page table to be shadowed - in our case, EPT12 */ + return get_vmcs12(vcpu)->ept_pointer; +} + +static int nested_ept_init_mmu_context(struct kvm_vcpu *vcpu) +{ + int r = kvm_init_shadow_ept_mmu(vcpu, &vcpu->arch.mmu, + nested_vmx_ept_caps & VMX_EPT_EXECUTE_ONLY_BIT); + + vcpu->arch.mmu.set_cr3 = vmx_set_cr3; + vcpu->arch.mmu.get_cr3 = nested_ept_get_cr3; + vcpu->arch.mmu.inject_page_fault = nested_ept_inject_page_fault; + + vcpu->arch.walk_mmu = &vcpu->arch.nested_mmu; + + return r; +} + +static void nested_ept_uninit_mmu_context(struct kvm_vcpu *vcpu) +{ + vcpu->arch.walk_mmu = &vcpu->arch.mmu; +} + /* * prepare_vmcs02 is called when the L1 guest hypervisor runs its nested * L2 guest. 
L1 has a vmcs for L2 (vmcs12), and this function "merges" it @@ -7654,6 +7686,11 @@ static void prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12) vmx_flush_tlb(vcpu); } + if (nested_cpu_has_ept(vmcs12)) { + kvm_mmu_unload(vcpu); + nested_ept_init_mmu_context(vcpu); + } + if (vmcs12->vm_entry_controls & VM_ENTRY_LOAD_IA32_EFER) vcpu->arch.efer = vmcs12->guest_ia32_efer; else if (vmcs12->vm_entry_controls & VM_ENTRY_IA
Re: [PATCH v5 11/14] nEPT: MMU context for nested EPT
On 08/01/2013 05:16 PM, Xiao Guangrong wrote:
> On 07/31/2013 10:48 PM, Gleb Natapov wrote:
>> From: Nadav Har'El
>>
>> KVM's existing shadow MMU code already supports nested TDP. To use it, we
>> need to set up a new "MMU context" for nested EPT, and create a few callbacks
>> for it (nested_ept_*()). This context should also use the EPT versions of
>> the page table access functions (defined in the previous patch).
>> Then, we need to switch back and forth between this nested context and the
>> regular MMU context when switching between L1 and L2 (when L1 runs this L2
>> with EPT).
>
> This patch looks good to me.
>
> Reviewed-by: Xiao Guangrong
>
> But i am confused that update_permission_bitmask() is not adjusted in this
> series. That function depends on kvm_read_cr4_bits(X86_CR4_SMEP) and
> is_write_protection(), these two functions should read the registers from
> L2 guest, using the L2 status to check L1's page table seems strange.
> The same issue is in nested npt. Anything i missed?

After checking the code, I found that vcpu->arch.mmu is not updated when
switching to the nested mmu; that means "using the L2 status to check L1's
page table seems strange" is wrong. That is fine for nested NPT, but nested
EPT should adjust the logic anyway.
Re: [PATCH v5 11/14] nEPT: MMU context for nested EPT
On Thu, Aug 01, 2013 at 05:16:07PM +0800, Xiao Guangrong wrote:
> On 07/31/2013 10:48 PM, Gleb Natapov wrote:
> > From: Nadav Har'El
> >
> > KVM's existing shadow MMU code already supports nested TDP. To use it, we
> > need to set up a new "MMU context" for nested EPT, and create a few
> > callbacks
> > for it (nested_ept_*()). This context should also use the EPT versions of
> > the page table access functions (defined in the previous patch).
> > Then, we need to switch back and forth between this nested context and the
> > regular MMU context when switching between L1 and L2 (when L1 runs this L2
> > with EPT).
>
> This patch looks good to me.
>
> Reviewed-by: Xiao Guangrong
>
> But i am confused that update_permission_bitmask() is not adjusted in this
> series. That function depends on kvm_read_cr4_bits(X86_CR4_SMEP) and
> is_write_protection(), these two functions should read the registers from
> L2 guest, using the L2 status to check L1's page table seems strange.
> The same issue is in nested npt. Anything i missed?

Good catch again. Looks like we need update_permission_bitmask_ept() that
uses different logic to calculate permissions.

--
Gleb.
Re: [PATCH v5 11/14] nEPT: MMU context for nested EPT
On 07/31/2013 10:48 PM, Gleb Natapov wrote:
> From: Nadav Har'El
>
> KVM's existing shadow MMU code already supports nested TDP. To use it, we
> need to set up a new "MMU context" for nested EPT, and create a few callbacks
> for it (nested_ept_*()). This context should also use the EPT versions of
> the page table access functions (defined in the previous patch).
> Then, we need to switch back and forth between this nested context and the
> regular MMU context when switching between L1 and L2 (when L1 runs this L2
> with EPT).

This patch looks good to me.

Reviewed-by: Xiao Guangrong

But I am confused that update_permission_bitmask() is not adjusted in this
series. That function depends on kvm_read_cr4_bits(X86_CR4_SMEP) and
is_write_protection(); these two functions read state from the L2 guest, so
using the L2 status to check L1's page table seems strange. The same issue
exists in nested NPT. Anything I missed?
[PATCH v5 00/14] Nested EPT
Here is another version of nested EPT patch series. All comments given to
v4 are, hopefully, addressed.

Gleb Natapov (2):
  nEPT: make guest's A/D bits depends on guest's paging mode
  nEPT: Support shadow paging for guest paging without A/D bits

Nadav Har'El (10):
  nEPT: Support LOAD_IA32_EFER entry/exit controls for L1
  nEPT: Fix cr3 handling in nested exit and entry
  nEPT: Fix wrong test in kvm_set_cr3
  nEPT: Move common code to paging_tmpl.h
  nEPT: Add EPT tables support to paging_tmpl.h
  nEPT: Nested INVEPT
  nEPT: MMU context for nested EPT
  nEPT: Advertise EPT to L1
  nEPT: Some additional comments
  nEPT: Miscelleneous cleanups

Yang Zhang (2):
  nEPT: Redefine EPT-specific link_shadow_page()
  nEPT: Add nEPT violation/misconfigration support

 arch/x86/include/asm/kvm_host.h |   4 +
 arch/x86/include/asm/vmx.h      |   2 +
 arch/x86/include/uapi/asm/vmx.h |   1 +
 arch/x86/kvm/mmu.c              | 134 +---
 arch/x86/kvm/mmu.h              |   2 +
 arch/x86/kvm/paging_tmpl.h      | 177 
 arch/x86/kvm/vmx.c              | 213 ---
 arch/x86/kvm/x86.c              |  11 --
 8 files changed, 440 insertions(+), 104 deletions(-)

--
1.7.10.4
[PATCH v5 11/14] nEPT: MMU context for nested EPT
From: Nadav Har'El KVM's existing shadow MMU code already supports nested TDP. To use it, we need to set up a new "MMU context" for nested EPT, and create a few callbacks for it (nested_ept_*()). This context should also use the EPT versions of the page table access functions (defined in the previous patch). Then, we need to switch back and forth between this nested context and the regular MMU context when switching between L1 and L2 (when L1 runs this L2 with EPT). Signed-off-by: Nadav Har'El Signed-off-by: Jun Nakajima Signed-off-by: Xinhao Xu Signed-off-by: Yang Zhang Signed-off-by: Gleb Natapov --- arch/x86/kvm/mmu.c | 26 ++ arch/x86/kvm/mmu.h |2 ++ arch/x86/kvm/vmx.c | 41 - 3 files changed, 68 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index 58ae9db..37fff14 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -3792,6 +3792,32 @@ int kvm_init_shadow_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context) } EXPORT_SYMBOL_GPL(kvm_init_shadow_mmu); +int kvm_init_shadow_ept_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context, + bool execonly) +{ + ASSERT(vcpu); + ASSERT(!VALID_PAGE(vcpu->arch.mmu.root_hpa)); + + context->shadow_root_level = kvm_x86_ops->get_tdp_level(); + + context->nx = true; + context->new_cr3 = paging_new_cr3; + context->page_fault = ept_page_fault; + context->gva_to_gpa = ept_gva_to_gpa; + context->sync_page = ept_sync_page; + context->invlpg = ept_invlpg; + context->update_pte = ept_update_pte; + context->free = paging_free; + context->root_level = context->shadow_root_level; + context->root_hpa = INVALID_PAGE; + context->direct_map = false; + + reset_rsvds_bits_mask_ept(vcpu, context, execonly); + + return 0; +} +EXPORT_SYMBOL_GPL(kvm_init_shadow_ept_mmu); + static int init_kvm_softmmu(struct kvm_vcpu *vcpu) { int r = kvm_init_shadow_mmu(vcpu, vcpu->arch.walk_mmu); diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h index 5b59c57..77e044a 100644 --- a/arch/x86/kvm/mmu.h +++ b/arch/x86/kvm/mmu.h @@ 
-71,6 +71,8 @@ enum { int handle_mmio_page_fault_common(struct kvm_vcpu *vcpu, u64 addr, bool direct); int kvm_init_shadow_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context); +int kvm_init_shadow_ept_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context, + bool execonly); static inline unsigned int kvm_mmu_available_pages(struct kvm *kvm) { diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index f3514d7..f41751a 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -1046,6 +1046,11 @@ static inline bool nested_cpu_has_virtual_nmis(struct vmcs12 *vmcs12, return vmcs12->pin_based_vm_exec_control & PIN_BASED_VIRTUAL_NMIS; } +static inline int nested_cpu_has_ept(struct vmcs12 *vmcs12) +{ + return nested_cpu_has2(vmcs12, SECONDARY_EXEC_ENABLE_EPT); +} + static inline bool is_exception(u32 intr_info) { return (intr_info & (INTR_INFO_INTR_TYPE_MASK | INTR_INFO_VALID_MASK)) @@ -7432,6 +7437,33 @@ static void nested_ept_inject_page_fault(struct kvm_vcpu *vcpu, vmcs12->guest_physical_address = fault->address; } +/* Callbacks for nested_ept_init_mmu_context: */ + +static unsigned long nested_ept_get_cr3(struct kvm_vcpu *vcpu) +{ + /* return the page table to be shadowed - in our case, EPT12 */ + return get_vmcs12(vcpu)->ept_pointer; +} + +static int nested_ept_init_mmu_context(struct kvm_vcpu *vcpu) +{ + int r = kvm_init_shadow_ept_mmu(vcpu, &vcpu->arch.mmu, + nested_vmx_ept_caps & VMX_EPT_EXECUTE_ONLY_BIT); + + vcpu->arch.mmu.set_cr3 = vmx_set_cr3; + vcpu->arch.mmu.get_cr3 = nested_ept_get_cr3; + vcpu->arch.mmu.inject_page_fault = nested_ept_inject_page_fault; + + vcpu->arch.walk_mmu = &vcpu->arch.nested_mmu; + + return r; +} + +static void nested_ept_uninit_mmu_context(struct kvm_vcpu *vcpu) +{ + vcpu->arch.walk_mmu = &vcpu->arch.mmu; +} + /* * prepare_vmcs02 is called when the L1 guest hypervisor runs its nested * L2 guest. 
L1 has a vmcs for L2 (vmcs12), and this function "merges" it @@ -7652,6 +7684,11 @@ static void prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12) vmx_flush_tlb(vcpu); } + if (nested_cpu_has_ept(vmcs12)) { + kvm_mmu_unload(vcpu); + nested_ept_init_mmu_context(vcpu); + } + if (vmcs12->vm_entry_controls & VM_ENTRY_LOAD_IA32_EFER) vcpu->arch.efer = vmcs12->guest_ia32_efer; else if (vmcs12->vm_entry_controls & VM_ENTRY_IA32E_MODE) @@ -8124,7 +8161,9 @@ static void l
[PATCH v4 10/13] nEPT: MMU context for nested EPT
From: Nadav Har'El KVM's existing shadow MMU code already supports nested TDP. To use it, we need to set up a new "MMU context" for nested EPT, and create a few callbacks for it (nested_ept_*()). This context should also use the EPT versions of the page table access functions (defined in the previous patch). Then, we need to switch back and forth between this nested context and the regular MMU context when switching between L1 and L2 (when L1 runs this L2 with EPT). Signed-off-by: Nadav Har'El Signed-off-by: Jun Nakajima Signed-off-by: Xinhao Xu Signed-off-by: Yang Zhang Signed-off-by: Gleb Natapov --- arch/x86/kvm/mmu.c | 26 ++ arch/x86/kvm/mmu.h |2 ++ arch/x86/kvm/vmx.c | 41 - 3 files changed, 68 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index 58ae9db..37fff14 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -3792,6 +3792,32 @@ int kvm_init_shadow_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context) } EXPORT_SYMBOL_GPL(kvm_init_shadow_mmu); +int kvm_init_shadow_ept_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context, + bool execonly) +{ + ASSERT(vcpu); + ASSERT(!VALID_PAGE(vcpu->arch.mmu.root_hpa)); + + context->shadow_root_level = kvm_x86_ops->get_tdp_level(); + + context->nx = true; + context->new_cr3 = paging_new_cr3; + context->page_fault = ept_page_fault; + context->gva_to_gpa = ept_gva_to_gpa; + context->sync_page = ept_sync_page; + context->invlpg = ept_invlpg; + context->update_pte = ept_update_pte; + context->free = paging_free; + context->root_level = context->shadow_root_level; + context->root_hpa = INVALID_PAGE; + context->direct_map = false; + + reset_rsvds_bits_mask_ept(vcpu, context, execonly); + + return 0; +} +EXPORT_SYMBOL_GPL(kvm_init_shadow_ept_mmu); + static int init_kvm_softmmu(struct kvm_vcpu *vcpu) { int r = kvm_init_shadow_mmu(vcpu, vcpu->arch.walk_mmu); diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h index 5b59c57..77e044a 100644 --- a/arch/x86/kvm/mmu.h +++ b/arch/x86/kvm/mmu.h @@ 
-71,6 +71,8 @@ enum { int handle_mmio_page_fault_common(struct kvm_vcpu *vcpu, u64 addr, bool direct); int kvm_init_shadow_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context); +int kvm_init_shadow_ept_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context, + bool execonly); static inline unsigned int kvm_mmu_available_pages(struct kvm *kvm) { diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index bbfff8d..6b79db7 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -1046,6 +1046,11 @@ static inline bool nested_cpu_has_virtual_nmis(struct vmcs12 *vmcs12, return vmcs12->pin_based_vm_exec_control & PIN_BASED_VIRTUAL_NMIS; } +static inline int nested_cpu_has_ept(struct vmcs12 *vmcs12) +{ + return nested_cpu_has2(vmcs12, SECONDARY_EXEC_ENABLE_EPT); +} + static inline bool is_exception(u32 intr_info) { return (intr_info & (INTR_INFO_INTR_TYPE_MASK | INTR_INFO_VALID_MASK)) @@ -7433,6 +7438,33 @@ static void nested_ept_inject_page_fault(struct kvm_vcpu *vcpu, vmcs12->guest_physical_address = fault->address; } +/* Callbacks for nested_ept_init_mmu_context: */ + +static unsigned long nested_ept_get_cr3(struct kvm_vcpu *vcpu) +{ + /* return the page table to be shadowed - in our case, EPT12 */ + return get_vmcs12(vcpu)->ept_pointer; +} + +static int nested_ept_init_mmu_context(struct kvm_vcpu *vcpu) +{ + int r = kvm_init_shadow_ept_mmu(vcpu, &vcpu->arch.mmu, + nested_vmx_ept_caps & VMX_EPT_EXECUTE_ONLY_BIT); + + vcpu->arch.mmu.set_cr3 = vmx_set_cr3; + vcpu->arch.mmu.get_cr3 = nested_ept_get_cr3; + vcpu->arch.mmu.inject_page_fault = nested_ept_inject_page_fault; + + vcpu->arch.walk_mmu = &vcpu->arch.nested_mmu; + + return r; +} + +static void nested_ept_uninit_mmu_context(struct kvm_vcpu *vcpu) +{ + vcpu->arch.walk_mmu = &vcpu->arch.mmu; +} + /* * prepare_vmcs02 is called when the L1 guest hypervisor runs its nested * L2 guest. 
L1 has a vmcs for L2 (vmcs12), and this function "merges" it @@ -7653,6 +7685,11 @@ static void prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12) vmx_flush_tlb(vcpu); } + if (nested_cpu_has_ept(vmcs12)) { + kvm_mmu_unload(vcpu); + nested_ept_init_mmu_context(vcpu); + } + if (vmcs12->vm_entry_controls & VM_ENTRY_LOAD_IA32_EFER) vcpu->arch.efer = vmcs12->guest_ia32_efer; else if (vmcs12->vm_entry_controls & VM_ENTRY_IA32E_MODE) @@ -8125,7 +8162,9 @@ static void l
[PATCH v4 00/13] Nested EPT
After changing hands several times, I am proud to present a new version of the Nested EPT patches. Nothing groundbreaking here compared to v3: all review comments are addressed, some by Yang Zhang and some by Yours Truly.

Gleb Natapov (1):
  nEPT: make guest's A/D bits depends on guest's paging mode

Nadav Har'El (10):
  nEPT: Support LOAD_IA32_EFER entry/exit controls for L1
  nEPT: Fix cr3 handling in nested exit and entry
  nEPT: Fix wrong test in kvm_set_cr3
  nEPT: Move common code to paging_tmpl.h
  nEPT: Add EPT tables support to paging_tmpl.h
  nEPT: Nested INVEPT
  nEPT: MMU context for nested EPT
  nEPT: Advertise EPT to L1
  nEPT: Some additional comments
  nEPT: Miscelleneous cleanups

Yang Zhang (2):
  nEPT: Redefine EPT-specific link_shadow_page()
  nEPT: Add nEPT violation/misconfigration support

 arch/x86/include/asm/kvm_host.h |   4 +
 arch/x86/include/asm/vmx.h      |   3 +
 arch/x86/include/uapi/asm/vmx.h |   1 +
 arch/x86/kvm/mmu.c              | 134 ++---
 arch/x86/kvm/mmu.h              |   2 +
 arch/x86/kvm/paging_tmpl.h      | 175
 arch/x86/kvm/vmx.c              | 210 ---
 arch/x86/kvm/x86.c              |  11 --
 8 files changed, 436 insertions(+), 104 deletions(-)

--
1.7.10.4
--
To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 05/13] nEPT: MMU context for nested EPT
On Tue, May 21, 2013 at 1:50 AM, Xiao Guangrong wrote: > On 05/19/2013 12:52 PM, Jun Nakajima wrote: >> From: Nadav Har'El >> >> KVM's existing shadow MMU code already supports nested TDP. To use it, we >> need to set up a new "MMU context" for nested EPT, and create a few callbacks >> for it (nested_ept_*()). This context should also use the EPT versions of >> the page table access functions (defined in the previous patch). >> Then, we need to switch back and forth between this nested context and the >> regular MMU context when switching between L1 and L2 (when L1 runs this L2 >> with EPT). >> >> Signed-off-by: Nadav Har'El >> Signed-off-by: Jun Nakajima >> Signed-off-by: Xinhao Xu >> --- >> arch/x86/kvm/mmu.c | 38 ++ >> arch/x86/kvm/mmu.h | 1 + >> arch/x86/kvm/vmx.c | 54 >> +- >> 3 files changed, 92 insertions(+), 1 deletion(-) >> >> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c >> index 6c1670f..37f8d7f 100644 >> --- a/arch/x86/kvm/mmu.c >> +++ b/arch/x86/kvm/mmu.c >> @@ -3653,6 +3653,44 @@ int kvm_init_shadow_mmu(struct kvm_vcpu *vcpu, struct >> kvm_mmu *context) >> } >> EXPORT_SYMBOL_GPL(kvm_init_shadow_mmu); >> >> +int kvm_init_shadow_EPT_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context) >> +{ >> + ASSERT(vcpu); >> + ASSERT(!VALID_PAGE(vcpu->arch.mmu.root_hpa)); >> + >> + context->shadow_root_level = kvm_x86_ops->get_tdp_level(); > > That means L1 guest always uses page-walk length == 4? But in your previous > patch, > it can be 2. We want to support "page-walk length == 4" only. > >> + >> + context->nx = is_nx(vcpu); /* TODO: ? */ > > Hmm? EPT always support NX. 
> >> + context->new_cr3 = paging_new_cr3; >> + context->page_fault = EPT_page_fault; >> + context->gva_to_gpa = EPT_gva_to_gpa; >> + context->sync_page = EPT_sync_page; >> + context->invlpg = EPT_invlpg; >> + context->update_pte = EPT_update_pte; >> + context->free = paging_free; >> + context->root_level = context->shadow_root_level; >> + context->root_hpa = INVALID_PAGE; >> + context->direct_map = false; >> + >> + /* TODO: reset_rsvds_bits_mask() is not built for EPT, we need >> +something different. >> + */ > > Exactly. :) > >> + reset_rsvds_bits_mask(vcpu, context); >> + >> + >> + /* TODO: I copied these from kvm_init_shadow_mmu, I don't know why >> +they are done, or why they write to vcpu->arch.mmu and not context >> + */ >> + vcpu->arch.mmu.base_role.cr4_pae = !!is_pae(vcpu); >> + vcpu->arch.mmu.base_role.cr0_wp = is_write_protection(vcpu); >> + vcpu->arch.mmu.base_role.smep_andnot_wp = >> + kvm_read_cr4_bits(vcpu, X86_CR4_SMEP) && >> + !is_write_protection(vcpu); > > I guess we need not care these since the permission of EPT page does not > depend > on these. Right. I'll clean up this. 
> >> + >> + return 0; >> +} >> +EXPORT_SYMBOL_GPL(kvm_init_shadow_EPT_mmu); >> + >> static int init_kvm_softmmu(struct kvm_vcpu *vcpu) >> { >> int r = kvm_init_shadow_mmu(vcpu, vcpu->arch.walk_mmu); >> diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h >> index 2adcbc2..8fc94dd 100644 >> --- a/arch/x86/kvm/mmu.h >> +++ b/arch/x86/kvm/mmu.h >> @@ -54,6 +54,7 @@ int kvm_mmu_get_spte_hierarchy(struct kvm_vcpu *vcpu, u64 >> addr, u64 sptes[4]); >> void kvm_mmu_set_mmio_spte_mask(u64 mmio_mask); >> int handle_mmio_page_fault_common(struct kvm_vcpu *vcpu, u64 addr, bool >> direct); >> int kvm_init_shadow_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context); >> +int kvm_init_shadow_EPT_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context); >> >> static inline unsigned int kvm_mmu_available_pages(struct kvm *kvm) >> { >> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c >> index fb9cae5..a88432f 100644 >> --- a/arch/x86/kvm/vmx.c >> +++ b/arch/x86/kvm/vmx.c >> @@ -1045,6 +1045,11 @@ static inline bool nested_cpu_has_virtual_nmis(struct >> vmcs12 *vmcs12, >> return vmcs12->pin_based_vm_exec_control & PIN_BASED_VIRTUAL_NMIS; >> } >> >> +static inline int nested_cpu_has_ept(struct vmcs12 *vmcs12) >> +{ >> + return nested_cpu_has2(vmcs12, SEC
Re: [PATCH v3 05/13] nEPT: MMU context for nested EPT
On 05/19/2013 12:52 PM, Jun Nakajima wrote: > From: Nadav Har'El > > KVM's existing shadow MMU code already supports nested TDP. To use it, we > need to set up a new "MMU context" for nested EPT, and create a few callbacks > for it (nested_ept_*()). This context should also use the EPT versions of > the page table access functions (defined in the previous patch). > Then, we need to switch back and forth between this nested context and the > regular MMU context when switching between L1 and L2 (when L1 runs this L2 > with EPT). > > Signed-off-by: Nadav Har'El > Signed-off-by: Jun Nakajima > Signed-off-by: Xinhao Xu > --- > arch/x86/kvm/mmu.c | 38 ++ > arch/x86/kvm/mmu.h | 1 + > arch/x86/kvm/vmx.c | 54 > +- > 3 files changed, 92 insertions(+), 1 deletion(-) > > diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c > index 6c1670f..37f8d7f 100644 > --- a/arch/x86/kvm/mmu.c > +++ b/arch/x86/kvm/mmu.c > @@ -3653,6 +3653,44 @@ int kvm_init_shadow_mmu(struct kvm_vcpu *vcpu, struct > kvm_mmu *context) > } > EXPORT_SYMBOL_GPL(kvm_init_shadow_mmu); > > +int kvm_init_shadow_EPT_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context) > +{ > + ASSERT(vcpu); > + ASSERT(!VALID_PAGE(vcpu->arch.mmu.root_hpa)); > + > + context->shadow_root_level = kvm_x86_ops->get_tdp_level(); That means L1 guest always uses page-walk length == 4? But in your previous patch, it can be 2. > + > + context->nx = is_nx(vcpu); /* TODO: ? */ Hmm? EPT always support NX. > + context->new_cr3 = paging_new_cr3; > + context->page_fault = EPT_page_fault; > + context->gva_to_gpa = EPT_gva_to_gpa; > + context->sync_page = EPT_sync_page; > + context->invlpg = EPT_invlpg; > + context->update_pte = EPT_update_pte; > + context->free = paging_free; > + context->root_level = context->shadow_root_level; > + context->root_hpa = INVALID_PAGE; > + context->direct_map = false; > + > + /* TODO: reset_rsvds_bits_mask() is not built for EPT, we need > +something different. > + */ Exactly. 
:) > + reset_rsvds_bits_mask(vcpu, context); > + > + > + /* TODO: I copied these from kvm_init_shadow_mmu, I don't know why > +they are done, or why they write to vcpu->arch.mmu and not context > + */ > + vcpu->arch.mmu.base_role.cr4_pae = !!is_pae(vcpu); > + vcpu->arch.mmu.base_role.cr0_wp = is_write_protection(vcpu); > + vcpu->arch.mmu.base_role.smep_andnot_wp = > + kvm_read_cr4_bits(vcpu, X86_CR4_SMEP) && > + !is_write_protection(vcpu); I guess we need not care these since the permission of EPT page does not depend on these. > + > + return 0; > +} > +EXPORT_SYMBOL_GPL(kvm_init_shadow_EPT_mmu); > + > static int init_kvm_softmmu(struct kvm_vcpu *vcpu) > { > int r = kvm_init_shadow_mmu(vcpu, vcpu->arch.walk_mmu); > diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h > index 2adcbc2..8fc94dd 100644 > --- a/arch/x86/kvm/mmu.h > +++ b/arch/x86/kvm/mmu.h > @@ -54,6 +54,7 @@ int kvm_mmu_get_spte_hierarchy(struct kvm_vcpu *vcpu, u64 > addr, u64 sptes[4]); > void kvm_mmu_set_mmio_spte_mask(u64 mmio_mask); > int handle_mmio_page_fault_common(struct kvm_vcpu *vcpu, u64 addr, bool > direct); > int kvm_init_shadow_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context); > +int kvm_init_shadow_EPT_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context); > > static inline unsigned int kvm_mmu_available_pages(struct kvm *kvm) > { > diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c > index fb9cae5..a88432f 100644 > --- a/arch/x86/kvm/vmx.c > +++ b/arch/x86/kvm/vmx.c > @@ -1045,6 +1045,11 @@ static inline bool nested_cpu_has_virtual_nmis(struct > vmcs12 *vmcs12, > return vmcs12->pin_based_vm_exec_control & PIN_BASED_VIRTUAL_NMIS; > } > > +static inline int nested_cpu_has_ept(struct vmcs12 *vmcs12) > +{ > + return nested_cpu_has2(vmcs12, SECONDARY_EXEC_ENABLE_EPT); > +} > + > static inline bool is_exception(u32 intr_info) > { > return (intr_info & (INTR_INFO_INTR_TYPE_MASK | INTR_INFO_VALID_MASK)) > @@ -7311,6 +7316,46 @@ static void vmx_set_supported_cpuid(u32 func, struct > 
kvm_cpuid_entry2 *entry) > entry->ecx |= bit(X86_FEATURE_VMX); > } > > +/* Callbacks for nested_ept_init_mmu_context: */ > + > +static unsigned long nested_ept_get_cr3(struct kvm_vcpu *vcpu) > +{ > + /* return the page table to be shadowed - in our case, EPT12 */ > + return get_vmcs12(vcpu)-
[PATCH v3 05/13] nEPT: MMU context for nested EPT
From: Nadav Har'El KVM's existing shadow MMU code already supports nested TDP. To use it, we need to set up a new "MMU context" for nested EPT, and create a few callbacks for it (nested_ept_*()). This context should also use the EPT versions of the page table access functions (defined in the previous patch). Then, we need to switch back and forth between this nested context and the regular MMU context when switching between L1 and L2 (when L1 runs this L2 with EPT). Signed-off-by: Nadav Har'El Signed-off-by: Jun Nakajima Signed-off-by: Xinhao Xu --- arch/x86/kvm/mmu.c | 38 ++ arch/x86/kvm/mmu.h | 1 + arch/x86/kvm/vmx.c | 54 +- 3 files changed, 92 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index 6c1670f..37f8d7f 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -3653,6 +3653,44 @@ int kvm_init_shadow_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context) } EXPORT_SYMBOL_GPL(kvm_init_shadow_mmu); +int kvm_init_shadow_EPT_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context) +{ + ASSERT(vcpu); + ASSERT(!VALID_PAGE(vcpu->arch.mmu.root_hpa)); + + context->shadow_root_level = kvm_x86_ops->get_tdp_level(); + + context->nx = is_nx(vcpu); /* TODO: ? */ + context->new_cr3 = paging_new_cr3; + context->page_fault = EPT_page_fault; + context->gva_to_gpa = EPT_gva_to_gpa; + context->sync_page = EPT_sync_page; + context->invlpg = EPT_invlpg; + context->update_pte = EPT_update_pte; + context->free = paging_free; + context->root_level = context->shadow_root_level; + context->root_hpa = INVALID_PAGE; + context->direct_map = false; + + /* TODO: reset_rsvds_bits_mask() is not built for EPT, we need + something different. 
+*/ + reset_rsvds_bits_mask(vcpu, context); + + + /* TODO: I copied these from kvm_init_shadow_mmu, I don't know why + they are done, or why they write to vcpu->arch.mmu and not context +*/ + vcpu->arch.mmu.base_role.cr4_pae = !!is_pae(vcpu); + vcpu->arch.mmu.base_role.cr0_wp = is_write_protection(vcpu); + vcpu->arch.mmu.base_role.smep_andnot_wp = + kvm_read_cr4_bits(vcpu, X86_CR4_SMEP) && + !is_write_protection(vcpu); + + return 0; +} +EXPORT_SYMBOL_GPL(kvm_init_shadow_EPT_mmu); + static int init_kvm_softmmu(struct kvm_vcpu *vcpu) { int r = kvm_init_shadow_mmu(vcpu, vcpu->arch.walk_mmu); diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h index 2adcbc2..8fc94dd 100644 --- a/arch/x86/kvm/mmu.h +++ b/arch/x86/kvm/mmu.h @@ -54,6 +54,7 @@ int kvm_mmu_get_spte_hierarchy(struct kvm_vcpu *vcpu, u64 addr, u64 sptes[4]); void kvm_mmu_set_mmio_spte_mask(u64 mmio_mask); int handle_mmio_page_fault_common(struct kvm_vcpu *vcpu, u64 addr, bool direct); int kvm_init_shadow_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context); +int kvm_init_shadow_EPT_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context); static inline unsigned int kvm_mmu_available_pages(struct kvm *kvm) { diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index fb9cae5..a88432f 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -1045,6 +1045,11 @@ static inline bool nested_cpu_has_virtual_nmis(struct vmcs12 *vmcs12, return vmcs12->pin_based_vm_exec_control & PIN_BASED_VIRTUAL_NMIS; } +static inline int nested_cpu_has_ept(struct vmcs12 *vmcs12) +{ + return nested_cpu_has2(vmcs12, SECONDARY_EXEC_ENABLE_EPT); +} + static inline bool is_exception(u32 intr_info) { return (intr_info & (INTR_INFO_INTR_TYPE_MASK | INTR_INFO_VALID_MASK)) @@ -7311,6 +7316,46 @@ static void vmx_set_supported_cpuid(u32 func, struct kvm_cpuid_entry2 *entry) entry->ecx |= bit(X86_FEATURE_VMX); } +/* Callbacks for nested_ept_init_mmu_context: */ + +static unsigned long nested_ept_get_cr3(struct kvm_vcpu *vcpu) +{ + /* return the page 
table to be shadowed - in our case, EPT12 */ + return get_vmcs12(vcpu)->ept_pointer; +} + +static void nested_ept_inject_page_fault(struct kvm_vcpu *vcpu, + struct x86_exception *fault) +{ + struct vmcs12 *vmcs12; + nested_vmx_vmexit(vcpu); + vmcs12 = get_vmcs12(vcpu); + /* +* Note no need to set vmcs12->vm_exit_reason as it is already copied +* from vmcs02 in nested_vmx_vmexit() above, i.e., EPT_VIOLATION. +*/ + vmcs12->exit_qualification = fault->error_code; + vmcs12->guest_physical_address = fault->address; +} + +static int nested_ept_init_mmu_context(struct kvm_vcpu *vcpu) +{ + int r = kvm_init_shadow_EPT_mmu(vcpu, &vcpu->arch.mmu); + + vcpu->arch.mmu.set_cr3 = vmx_set_cr3; + vcpu->arch.mmu
[PATCH v3 05/13] nEPT: MMU context for nested EPT
KVM's existing shadow MMU code already supports nested TDP. To use it, we need to set up a new "MMU context" for nested EPT, and create a few callbacks for it (nested_ept_*()). This context should also use the EPT versions of the page table access functions (defined in the previous patch). Then, we need to switch back and forth between this nested context and the regular MMU context when switching between L1 and L2 (when L1 runs this L2 with EPT). Signed-off-by: Nadav Har'El Signed-off-by: Jun Nakajima Signed-off-by: Xinhao Xu --- arch/x86/kvm/mmu.c | 38 ++ arch/x86/kvm/mmu.h | 1 + arch/x86/kvm/vmx.c | 54 +- 3 files changed, 92 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index 6c1670f..37f8d7f 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -3653,6 +3653,44 @@ int kvm_init_shadow_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context) } EXPORT_SYMBOL_GPL(kvm_init_shadow_mmu); +int kvm_init_shadow_EPT_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context) +{ + ASSERT(vcpu); + ASSERT(!VALID_PAGE(vcpu->arch.mmu.root_hpa)); + + context->shadow_root_level = kvm_x86_ops->get_tdp_level(); + + context->nx = is_nx(vcpu); /* TODO: ? */ + context->new_cr3 = paging_new_cr3; + context->page_fault = EPT_page_fault; + context->gva_to_gpa = EPT_gva_to_gpa; + context->sync_page = EPT_sync_page; + context->invlpg = EPT_invlpg; + context->update_pte = EPT_update_pte; + context->free = paging_free; + context->root_level = context->shadow_root_level; + context->root_hpa = INVALID_PAGE; + context->direct_map = false; + + /* TODO: reset_rsvds_bits_mask() is not built for EPT, we need + something different. 
+*/ + reset_rsvds_bits_mask(vcpu, context); + + + /* TODO: I copied these from kvm_init_shadow_mmu, I don't know why + they are done, or why they write to vcpu->arch.mmu and not context +*/ + vcpu->arch.mmu.base_role.cr4_pae = !!is_pae(vcpu); + vcpu->arch.mmu.base_role.cr0_wp = is_write_protection(vcpu); + vcpu->arch.mmu.base_role.smep_andnot_wp = + kvm_read_cr4_bits(vcpu, X86_CR4_SMEP) && + !is_write_protection(vcpu); + + return 0; +} +EXPORT_SYMBOL_GPL(kvm_init_shadow_EPT_mmu); + static int init_kvm_softmmu(struct kvm_vcpu *vcpu) { int r = kvm_init_shadow_mmu(vcpu, vcpu->arch.walk_mmu); diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h index 2adcbc2..8fc94dd 100644 --- a/arch/x86/kvm/mmu.h +++ b/arch/x86/kvm/mmu.h @@ -54,6 +54,7 @@ int kvm_mmu_get_spte_hierarchy(struct kvm_vcpu *vcpu, u64 addr, u64 sptes[4]); void kvm_mmu_set_mmio_spte_mask(u64 mmio_mask); int handle_mmio_page_fault_common(struct kvm_vcpu *vcpu, u64 addr, bool direct); int kvm_init_shadow_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context); +int kvm_init_shadow_EPT_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context); static inline unsigned int kvm_mmu_available_pages(struct kvm *kvm) { diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 51b8b4f0..80ab5b1 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -1045,6 +1045,11 @@ static inline bool nested_cpu_has_virtual_nmis(struct vmcs12 *vmcs12, return vmcs12->pin_based_vm_exec_control & PIN_BASED_VIRTUAL_NMIS; } +static inline int nested_cpu_has_ept(struct vmcs12 *vmcs12) +{ + return nested_cpu_has2(vmcs12, SECONDARY_EXEC_ENABLE_EPT); +} + static inline bool is_exception(u32 intr_info) { return (intr_info & (INTR_INFO_INTR_TYPE_MASK | INTR_INFO_VALID_MASK)) @@ -7305,6 +7310,46 @@ static void vmx_set_supported_cpuid(u32 func, struct kvm_cpuid_entry2 *entry) entry->ecx |= bit(X86_FEATURE_VMX); } +/* Callbacks for nested_ept_init_mmu_context: */ + +static unsigned long nested_ept_get_cr3(struct kvm_vcpu *vcpu) +{ + /* return the 
page table to be shadowed - in our case, EPT12 */ + return get_vmcs12(vcpu)->ept_pointer; +} + +static void nested_ept_inject_page_fault(struct kvm_vcpu *vcpu, + struct x86_exception *fault) +{ + struct vmcs12 *vmcs12; + nested_vmx_vmexit(vcpu); + vmcs12 = get_vmcs12(vcpu); + /* +* Note no need to set vmcs12->vm_exit_reason as it is already copied +* from vmcs02 in nested_vmx_vmexit() above, i.e., EPT_VIOLATION. +*/ + vmcs12->exit_qualification = fault->error_code; + vmcs12->guest_physical_address = fault->address; +} + +static int nested_ept_init_mmu_context(struct kvm_vcpu *vcpu) +{ + int r = kvm_init_shadow_EPT_mmu(vcpu, &vcpu->arch.mmu); + + vcpu->arch.mmu.set_cr3 = vmx_set_cr3; + vcpu->arch.mmu.get_cr3 = neste
[PATCH v2 05/13] nEPT: MMU context for nested EPT
KVM's existing shadow MMU code already supports nested TDP. To use it, we need to set up a new "MMU context" for nested EPT, and create a few callbacks for it (nested_ept_*()). This context should also use the EPT versions of the page table access functions (defined in the previous patch). Then, we need to switch back and forth between this nested context and the regular MMU context when switching between L1 and L2 (when L1 runs this L2 with EPT). Signed-off-by: Nadav Har'El Signed-off-by: Jun Nakajima Signed-off-by: Xinhao Xu --- arch/x86/kvm/mmu.c | 38 ++ arch/x86/kvm/mmu.h | 1 + arch/x86/kvm/vmx.c | 53 - 3 files changed, 91 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index cb9c6fd..99bfc5e 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -3644,6 +3644,44 @@ int kvm_init_shadow_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context) } EXPORT_SYMBOL_GPL(kvm_init_shadow_mmu); +int kvm_init_shadow_EPT_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context) +{ + ASSERT(vcpu); + ASSERT(!VALID_PAGE(vcpu->arch.mmu.root_hpa)); + + context->shadow_root_level = kvm_x86_ops->get_tdp_level(); + + context->nx = is_nx(vcpu); /* TODO: ? */ + context->new_cr3 = paging_new_cr3; + context->page_fault = EPT_page_fault; + context->gva_to_gpa = EPT_gva_to_gpa; + context->sync_page = EPT_sync_page; + context->invlpg = EPT_invlpg; + context->update_pte = EPT_update_pte; + context->free = paging_free; + context->root_level = context->shadow_root_level; + context->root_hpa = INVALID_PAGE; + context->direct_map = false; + + /* TODO: reset_rsvds_bits_mask() is not built for EPT, we need + something different. 
+*/ + reset_rsvds_bits_mask(vcpu, context); + + + /* TODO: I copied these from kvm_init_shadow_mmu, I don't know why + they are done, or why they write to vcpu->arch.mmu and not context +*/ + vcpu->arch.mmu.base_role.cr4_pae = !!is_pae(vcpu); + vcpu->arch.mmu.base_role.cr0_wp = is_write_protection(vcpu); + vcpu->arch.mmu.base_role.smep_andnot_wp = + kvm_read_cr4_bits(vcpu, X86_CR4_SMEP) && + !is_write_protection(vcpu); + + return 0; +} +EXPORT_SYMBOL_GPL(kvm_init_shadow_EPT_mmu); + static int init_kvm_softmmu(struct kvm_vcpu *vcpu) { int r = kvm_init_shadow_mmu(vcpu, vcpu->arch.walk_mmu); diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h index 6987108..19dd5ab 100644 --- a/arch/x86/kvm/mmu.h +++ b/arch/x86/kvm/mmu.h @@ -54,6 +54,7 @@ int kvm_mmu_get_spte_hierarchy(struct kvm_vcpu *vcpu, u64 addr, u64 sptes[4]); void kvm_mmu_set_mmio_spte_mask(u64 mmio_mask); int handle_mmio_page_fault_common(struct kvm_vcpu *vcpu, u64 addr, bool direct); int kvm_init_shadow_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context); +int kvm_init_shadow_EPT_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context); static inline unsigned int kvm_mmu_available_pages(struct kvm *kvm) { diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 485ded6..8fdcacf 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -918,6 +918,11 @@ static inline bool nested_cpu_has_virtual_nmis(struct vmcs12 *vmcs12, return vmcs12->pin_based_vm_exec_control & PIN_BASED_VIRTUAL_NMIS; } +static inline int nested_cpu_has_ept(struct vmcs12 *vmcs12) +{ + return nested_cpu_has2(vmcs12, SECONDARY_EXEC_ENABLE_EPT); +} + static inline bool is_exception(u32 intr_info) { return (intr_info & (INTR_INFO_INTR_TYPE_MASK | INTR_INFO_VALID_MASK)) @@ -6873,6 +6878,46 @@ static void vmx_set_supported_cpuid(u32 func, struct kvm_cpuid_entry2 *entry) entry->ecx |= bit(X86_FEATURE_VMX); } +/* Callbacks for nested_ept_init_mmu_context: */ + +static unsigned long nested_ept_get_cr3(struct kvm_vcpu *vcpu) +{ + /* return the page 
table to be shadowed - in our case, EPT12 */ + return get_vmcs12(vcpu)->ept_pointer; +} + +static void nested_ept_inject_page_fault(struct kvm_vcpu *vcpu, + struct x86_exception *fault) +{ + struct vmcs12 *vmcs12; + nested_vmx_vmexit(vcpu); + vmcs12 = get_vmcs12(vcpu); + /* +* Note no need to set vmcs12->vm_exit_reason as it is already copied +* from vmcs02 in nested_vmx_vmexit() above, i.e., EPT_VIOLATION. +*/ + vmcs12->exit_qualification = fault->error_code; + vmcs12->guest_physical_address = fault->address; +} + +static int nested_ept_init_mmu_context(struct kvm_vcpu *vcpu) +{ + int r = kvm_init_shadow_EPT_mmu(vcpu, &vcpu->arch.mmu); + + vcpu->arch.mmu.set_cr3 = vmx_set_cr3; + vcpu->arch.mmu.get_cr3 = neste
Re: [Bug 53611] New: nVMX: Add nested EPT
On 2013-04-26 18:07, Nakajima, Jun wrote: > On Thu, Apr 25, 2013 at 11:26 PM, Jan Kiszka wrote: > >> That's great but - as Gleb already said - unfortunately not yet usable. >> I'd like to rebase my fixes and enhancements (unrestricted guest mode >> specifically) on top these days, and also run some tests with a non-KVM >> guest. So, if git send-email is not yet working there, I would also be >> happy about a public git repository. >> > > I re-submitted the patches last night using git send-email this time. > We had some email problems at that time, and I needed to use a > workaround (imap-send) at that time (and it didn't work well). I've picked them up (except for Xinhao's follow-up patch) and rebased them over next + my pending patches: git://git.kiszka.org/linux-kvm.git queues/nept Some patches required a bit of massaging to apply, and the last one had a trivial style issue. Feel free to integrate the changes. I didn't look into functional details yet. Instead, I've rebased my unrestricted guest mode patch plus a bunch of fixes around nEPT and that feature. See the branch above. I'm currently testing them, and it looks very good so far. Unrestricted guest mode speeds up L2's BIOS and boot loader phase noticeably. Jan
Re: [Bug 53611] New: nVMX: Add nested EPT
On Thu, Apr 25, 2013 at 11:26 PM, Jan Kiszka wrote: > That's great but - as Gleb already said - unfortunately not yet usable. > I'd like to rebase my fixes and enhancements (unrestricted guest mode > specifically) on top these days, and also run some tests with a non-KVM > guest. So, if git send-email is not yet working there, I would also be > happy about a public git repository. > I re-submitted the patches last night using git send-email this time. We had some email problems at that time and needed to use a workaround (imap-send), which didn't work well. -- Jun Intel Open Source Technology Center
[PATCH 03/11] nEPT: MMU context for nested EPT
KVM's existing shadow MMU code already supports nested TDP. To use it, we need to set up a new "MMU context" for nested EPT, and create a few callbacks for it (nested_ept_*()). This context should also use the EPT versions of the page table access functions (defined in the previous patch). Then, we need to switch back and forth between this nested context and the regular MMU context when switching between L1 and L2 (when L1 runs this L2 with EPT). Signed-off-by: Nadav Har'El Signed-off-by: Jun Nakajima Signed-off-by: Xinhao Xu --- arch/x86/kvm/mmu.c | 38 ++ arch/x86/kvm/mmu.h | 1 + arch/x86/kvm/vmx.c | 53 - 3 files changed, 91 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index cb9c6fd..99bfc5e 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -3644,6 +3644,44 @@ int kvm_init_shadow_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context) } EXPORT_SYMBOL_GPL(kvm_init_shadow_mmu); +int kvm_init_shadow_EPT_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context) +{ + ASSERT(vcpu); + ASSERT(!VALID_PAGE(vcpu->arch.mmu.root_hpa)); + + context->shadow_root_level = kvm_x86_ops->get_tdp_level(); + + context->nx = is_nx(vcpu); /* TODO: ? */ + context->new_cr3 = paging_new_cr3; + context->page_fault = EPT_page_fault; + context->gva_to_gpa = EPT_gva_to_gpa; + context->sync_page = EPT_sync_page; + context->invlpg = EPT_invlpg; + context->update_pte = EPT_update_pte; + context->free = paging_free; + context->root_level = context->shadow_root_level; + context->root_hpa = INVALID_PAGE; + context->direct_map = false; + + /* TODO: reset_rsvds_bits_mask() is not built for EPT, we need + something different. 
+*/ + reset_rsvds_bits_mask(vcpu, context); + + + /* TODO: I copied these from kvm_init_shadow_mmu, I don't know why + they are done, or why they write to vcpu->arch.mmu and not context +*/ + vcpu->arch.mmu.base_role.cr4_pae = !!is_pae(vcpu); + vcpu->arch.mmu.base_role.cr0_wp = is_write_protection(vcpu); + vcpu->arch.mmu.base_role.smep_andnot_wp = + kvm_read_cr4_bits(vcpu, X86_CR4_SMEP) && + !is_write_protection(vcpu); + + return 0; +} +EXPORT_SYMBOL_GPL(kvm_init_shadow_EPT_mmu); + static int init_kvm_softmmu(struct kvm_vcpu *vcpu) { int r = kvm_init_shadow_mmu(vcpu, vcpu->arch.walk_mmu); diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h index 6987108..19dd5ab 100644 --- a/arch/x86/kvm/mmu.h +++ b/arch/x86/kvm/mmu.h @@ -54,6 +54,7 @@ int kvm_mmu_get_spte_hierarchy(struct kvm_vcpu *vcpu, u64 addr, u64 sptes[4]); void kvm_mmu_set_mmio_spte_mask(u64 mmio_mask); int handle_mmio_page_fault_common(struct kvm_vcpu *vcpu, u64 addr, bool direct); int kvm_init_shadow_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context); +int kvm_init_shadow_EPT_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context); static inline unsigned int kvm_mmu_available_pages(struct kvm *kvm) { diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 9e0ec9d..6ab53ca 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -918,6 +918,11 @@ static inline bool nested_cpu_has_virtual_nmis(struct vmcs12 *vmcs12, return vmcs12->pin_based_vm_exec_control & PIN_BASED_VIRTUAL_NMIS; } +static inline int nested_cpu_has_ept(struct vmcs12 *vmcs12) +{ + return nested_cpu_has2(vmcs12, SECONDARY_EXEC_ENABLE_EPT); +} + static inline bool is_exception(u32 intr_info) { return (intr_info & (INTR_INFO_INTR_TYPE_MASK | INTR_INFO_VALID_MASK)) @@ -6873,6 +6878,46 @@ static void vmx_set_supported_cpuid(u32 func, struct kvm_cpuid_entry2 *entry) entry->ecx |= bit(X86_FEATURE_VMX); } +/* Callbacks for nested_ept_init_mmu_context: */ + +static unsigned long nested_ept_get_cr3(struct kvm_vcpu *vcpu) +{ + /* return the page 
table to be shadowed - in our case, EPT12 */ + return get_vmcs12(vcpu)->ept_pointer; +} + +static void nested_ept_inject_page_fault(struct kvm_vcpu *vcpu, + struct x86_exception *fault) +{ + struct vmcs12 *vmcs12; + nested_vmx_vmexit(vcpu); + vmcs12 = get_vmcs12(vcpu); + /* +* Note no need to set vmcs12->vm_exit_reason as it is already copied +* from vmcs02 in nested_vmx_vmexit() above, i.e., EPT_VIOLATION. +*/ + vmcs12->exit_qualification = fault->error_code; + vmcs12->guest_physical_address = fault->address; +} + +static int nested_ept_init_mmu_context(struct kvm_vcpu *vcpu) +{ + int r = kvm_init_shadow_EPT_mmu(vcpu, &vcpu->arch.mmu); + + vcpu->arch.mmu.set_cr3 = vmx_set_cr3; + vcpu->arch.mmu.get_cr3 = neste
Re: [Bug 53611] New: nVMX: Add nested EPT
On 2013-04-25 10:00, Nakajima, Jun wrote: > On Wed, Apr 24, 2013 at 8:55 AM, Nakajima, Jun wrote: >> Sorry about the slow progress. We've been distracted by some priority >> things. The patches are ready (i.e. working), but we are cleaning them >> up. I'll send what we have today. > > So, I have sent them, and frankly we are still cleaning up. Please > bear with us. > We are also sending one more patchset to deal with EPT > misconfiguration, but Linux should run in L2 on top of L1 KVM. That's great but - as Gleb already said - unfortunately not yet usable. I'd like to rebase my fixes and enhancements (unrestricted guest mode specifically) on top these days, and also run some tests with a non-KVM guest. So, if git send-email is not yet working there, I would also be happy about a public git repository. Thanks, Jan
Re: [Bug 53611] New: nVMX: Add nested EPT
On Thu, Apr 25, 2013 at 01:00:42AM -0700, Nakajima, Jun wrote:
> On Wed, Apr 24, 2013 at 8:55 AM, Nakajima, Jun wrote:
> > Sorry about the slow progress. We've been distracted by some priority
> > things. The patches are ready (i.e. working), but we are cleaning them
> > up. I'll send what we have today.
>
> So, I have sent them, and frankly we are still cleaning up. Please
> bear with us.
> We are also sending one more patchset to deal with EPT
> misconfiguration, but Linux should run in L2 on top of L1 KVM.
>
The patches are mangled and unreadable. Please resend using
"git send-email".

--
			Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Re: [Bug 53611] New: nVMX: Add nested EPT
On Wed, Apr 24, 2013 at 8:55 AM, Nakajima, Jun wrote:
> Sorry about the slow progress. We've been distracted by some priority
> things. The patches are ready (i.e. working), but we are cleaning them
> up. I'll send what we have today.

So, I have sent them, and frankly we are still cleaning up. Please
bear with us.
We are also sending one more patchset to deal with EPT
misconfiguration, but Linux should run in L2 on top of L1 KVM.

--
Jun
Intel Open Source Technology Center
[PATCH 03/12] Subject: [PATCH 03/10] nEPT: MMU context for nested EPT
KVM's existing shadow MMU code already supports nested TDP. To use it, we
need to set up a new "MMU context" for nested EPT, and create a few
callbacks for it (nested_ept_*()). This context should also use the EPT
versions of the page table access functions (defined in the previous
patch). Then, we need to switch back and forth between this nested
context and the regular MMU context when switching between L1 and L2
(when L1 runs this L2 with EPT).

Signed-off-by: Nadav Har'El
Signed-off-by: Jun Nakajima

 modified:   arch/x86/kvm/mmu.c
 modified:   arch/x86/kvm/mmu.h
 modified:   arch/x86/kvm/vmx.c
---
 arch/x86/kvm/mmu.c | 38
 arch/x86/kvm/mmu.h |  1 +
 arch/x86/kvm/vmx.c | 56 +++---
 3 files changed, 92 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 91cac19..34e406e2 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -3674,6 +3674,44 @@ int kvm_init_shadow_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context)
 }
 EXPORT_SYMBOL_GPL(kvm_init_shadow_mmu);
 
+int kvm_init_shadow_EPT_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context)
+{
+	ASSERT(vcpu);
+	ASSERT(!VALID_PAGE(vcpu->arch.mmu.root_hpa));
+
+	context->shadow_root_level = kvm_x86_ops->get_tdp_level();
+
+	context->nx = is_nx(vcpu); /* TODO: ? */
+	context->new_cr3 = paging_new_cr3;
+	context->page_fault = EPT_page_fault;
+	context->gva_to_gpa = EPT_gva_to_gpa;
+	context->sync_page = EPT_sync_page;
+	context->invlpg = EPT_invlpg;
+	context->update_pte = EPT_update_pte;
+	context->free = paging_free;
+	context->root_level = context->shadow_root_level;
+	context->root_hpa = INVALID_PAGE;
+	context->direct_map = false;
+
+	/* TODO: reset_rsvds_bits_mask() is not built for EPT, we need
+	   something different.
+	 */
+	reset_rsvds_bits_mask(vcpu, context);
+
+
+	/* TODO: I copied these from kvm_init_shadow_mmu, I don't know why
+	   they are done, or why they write to vcpu->arch.mmu and not context
+	 */
+	vcpu->arch.mmu.base_role.cr4_pae = !!is_pae(vcpu);
+	vcpu->arch.mmu.base_role.cr0_wp = is_write_protection(vcpu);
+	vcpu->arch.mmu.base_role.smep_andnot_wp =
+		kvm_read_cr4_bits(vcpu, X86_CR4_SMEP) &&
+		!is_write_protection(vcpu);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(kvm_init_shadow_EPT_mmu);
+
 static int init_kvm_softmmu(struct kvm_vcpu *vcpu)
 {
 	int r = kvm_init_shadow_mmu(vcpu, vcpu->arch.walk_mmu);
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 6987108..19dd5ab 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -54,6 +54,7 @@ int kvm_mmu_get_spte_hierarchy(struct kvm_vcpu *vcpu, u64 addr, u64 sptes[4]);
 void kvm_mmu_set_mmio_spte_mask(u64 mmio_mask);
 int handle_mmio_page_fault_common(struct kvm_vcpu *vcpu, u64 addr, bool direct);
 int kvm_init_shadow_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context);
+int kvm_init_shadow_EPT_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context);
 
 static inline unsigned int kvm_mmu_available_pages(struct kvm *kvm)
 {
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 9e0ec9d..f2fd79d 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -912,12 +912,16 @@ static inline bool nested_cpu_has2(struct vmcs12 *vmcs12, u32 bit)
 	(vmcs12->secondary_vm_exec_control & bit);
 }
 
-static inline bool nested_cpu_has_virtual_nmis(struct vmcs12 *vmcs12,
-	struct kvm_vcpu *vcpu)
+static inline bool nested_cpu_has_virtual_nmis(struct vmcs12 *vmcs12)
 {
 	return vmcs12->pin_based_vm_exec_control & PIN_BASED_VIRTUAL_NMIS;
 }
 
+static inline int nested_cpu_has_ept(struct vmcs12 *vmcs12)
+{
+	return nested_cpu_has2(vmcs12, SECONDARY_EXEC_ENABLE_EPT);
+}
+
 static inline bool is_exception(u32 intr_info)
 {
 	return (intr_info & (INTR_INFO_INTR_TYPE_MASK | INTR_INFO_VALID_MASK))
@@ -6873,6 +6877,46 @@ static void vmx_set_supported_cpuid(u32 func, struct kvm_cpuid_entry2 *entry)
 	entry->ecx |= bit(X86_FEATURE_VMX);
 }
 
+/* Callbacks for nested_ept_init_mmu_context: */
+
+static unsigned long nested_ept_get_cr3(struct kvm_vcpu *vcpu)
+{
+	/* return the page table to be shadowed - in our case, EPT12 */
+	return get_vmcs12(vcpu)->ept_pointer;
+}
+
+static void nested_ept_inject_page_fault(struct kvm_vcpu *vcpu,
+					 struct x86_exception *fault)
+{
+	struct vmcs12 *vmcs12;
+	nested_vmx_vmexit(vcpu);
+	vmcs12 = get_vmcs12(vcpu);
+	/*
+	 * Note no need to set vmcs12->vm_exit_reason as it is already copied
+	 * from vmcs02 in nested_vmx_vmexit() above, i.e., EPT_VIOLATION.
+	 */
+	vmcs12->exit_qualification = fault->error_code;
+	vmcs12->guest_physical_address = fault->address;
+}
+
+static int nested_ept_init_mmu_context(struct kvm_vcpu *vcpu)
+{
+	int r = kvm_init_shadow_EPT_mmu(vcpu, &vcpu->arch.mmu);
+
+	vcpu->arch.mmu.set_cr3 = vmx_set_cr3;
+	vcpu->arch.mmu.get_cr3 = nested_ept_get_cr3;
+	vcpu->arch.
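The pattern the patch relies on — a "MMU context" is just a struct of function pointers, and entering or leaving L2 swaps which callbacks are installed — can be modeled in a few lines of standalone C. This is a toy illustration with made-up names and values, not kernel code:

```c
#include <stdint.h>

/* Toy model of KVM's struct kvm_mmu: behavior is selected by which
 * callbacks a context-init function installs. */
struct toy_mmu {
	uint64_t (*get_cr3)(void);
	int direct_map;
};

/* Stand-ins for the two possible page-table roots. */
static uint64_t toy_host_root(void)  { return 0x1000; } /* host-managed EPT */
static uint64_t toy_ept12_root(void) { return 0x2000; } /* vmcs12->ept_pointer stand-in */

/* Regular (non-nested) TDP: the MMU maps guest-physical directly. */
static void toy_init_tdp(struct toy_mmu *mmu)
{
	mmu->get_cr3 = toy_host_root;
	mmu->direct_map = 1;
}

/* Nested EPT: the MMU now shadows the table L1 provided (EPT12),
 * so it is no longer a direct map. */
static void toy_init_nested_ept(struct toy_mmu *mmu)
{
	mmu->get_cr3 = toy_ept12_root;
	mmu->direct_map = 0;
}
```

Switching between L1 and L2 is then just a matter of calling the other init function on the same struct, which is what the patch's `nested_ept_init_mmu_context()` does against `vcpu->arch.mmu`.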
Re: [Bug 53611] New: nVMX: Add nested EPT
On 2013-04-24 17:55, Nakajima, Jun wrote: > On Wed, Apr 24, 2013 at 12:25 AM, Jan Kiszka wrote: >>> >>> I don't have a full picture (already asked you to post / git-push your >>> intermediate state), but nested related states typically go to >>> nested_vmx, thus vcpu_vmx. >> >> Ping regarding publication. I'm about to redo your porting work as we >> are making no progress. >> > > Sorry about the slow progress. We've been distracted by some priority > things. The patches are ready (i.e. working), but we are cleaning them > up. I'll send what we have today. Great news, thanks a lot! Jan signature.asc Description: OpenPGP digital signature
Re: [Bug 53611] New: nVMX: Add nested EPT
On Wed, Apr 24, 2013 at 12:25 AM, Jan Kiszka wrote:
>>
>> I don't have a full picture (already asked you to post / git-push your
>> intermediate state), but nested related states typically go to
>> nested_vmx, thus vcpu_vmx.
>
> Ping regarding publication. I'm about to redo your porting work as we
> are making no progress.
>
Sorry about the slow progress. We've been distracted by some priority
things. The patches are ready (i.e. working), but we are cleaning them
up. I'll send what we have today.

--
Jun
Intel Open Source Technology Center
Re: [Bug 53611] New: nVMX: Add nested EPT
On 2013-03-22 17:45, Jan Kiszka wrote:
> On 2013-03-22 07:23, Nakajima, Jun wrote:
>> On Mon, Mar 4, 2013 at 8:45 PM, Nakajima, Jun wrote:
>>> I have some updates on this. We rebased the patches to the latest KVM
>>> (L0). It turned out that the version of L1 KVM/Linux matters. At that
>>> time, actually I used v3.7 kernel for L1, and the L2 didn't work as I
>>> described above. If I use v3.5 or older for L1, L2 works with the EPT
>>> patches. So, I guess some changes made to v3.6 might have exposed a
>>> bug with the nested EPT patches or somewhere. We are looking at the
>>> changes to root-cause it.
>>
>> Finally I've had more time to work on this, and I think I've fixed
>> this. The problem was that the exit qualification for EPT violation
>> (to L1) was not accurate (enough). And I needed to save the exit
>> qualification upon EPT violation somewhere. Today, that information is
>> converted to error_code (see below), and we lose the information. We
>> need to use at least the lower 3 bits when injecting EPT violation to
>> the L1 VMM. I tried to use the upper bytes of error_code to pass part
>> of the exit qualification, but it didn't work well. Any suggestion for
>> the place to store the value? kvm_vcpu?
>>
>> ...
>> /* It is a write fault? */
>> error_code = exit_qualification & (1U << 1);
>> /* ept page table is present? */
>> error_code |= (exit_qualification >> 3) & 0x1;
>>
>> return kvm_mmu_page_fault(vcpu, gpa, error_code, NULL, 0);
>
> I don't have a full picture (already asked you to post / git-push your
> intermediate state), but nested related states typically go to
> nested_vmx, thus vcpu_vmx.

Ping regarding publication. I'm about to redo your porting work as we
are making no progress.

Jan
Re: [Bug 53611] New: nVMX: Add nested EPT
On 2013-03-22 07:23, Nakajima, Jun wrote:
> On Mon, Mar 4, 2013 at 8:45 PM, Nakajima, Jun wrote:
>> I have some updates on this. We rebased the patches to the latest KVM
>> (L0). It turned out that the version of L1 KVM/Linux matters. At that
>> time, actually I used v3.7 kernel for L1, and the L2 didn't work as I
>> described above. If I use v3.5 or older for L1, L2 works with the EPT
>> patches. So, I guess some changes made to v3.6 might have exposed a
>> bug with the nested EPT patches or somewhere. We are looking at the
>> changes to root-cause it.
>
> Finally I've had more time to work on this, and I think I've fixed
> this. The problem was that the exit qualification for EPT violation
> (to L1) was not accurate (enough). And I needed to save the exit
> qualification upon EPT violation somewhere. Today, that information is
> converted to error_code (see below), and we lose the information. We
> need to use at least the lower 3 bits when injecting EPT violation to
> the L1 VMM. I tried to use the upper bytes of error_code to pass part
> of the exit qualification, but it didn't work well. Any suggestion for
> the place to store the value? kvm_vcpu?
>
> ...
> /* It is a write fault? */
> error_code = exit_qualification & (1U << 1);
> /* ept page table is present? */
> error_code |= (exit_qualification >> 3) & 0x1;
>
> return kvm_mmu_page_fault(vcpu, gpa, error_code, NULL, 0);

I don't have a full picture (already asked you to post / git-push your
intermediate state), but nested related states typically go to
nested_vmx, thus vcpu_vmx.

Jan

--
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux
Re: [Bug 53611] New: nVMX: Add nested EPT
On Mon, Mar 4, 2013 at 8:45 PM, Nakajima, Jun wrote:
> I have some updates on this. We rebased the patches to the latest KVM
> (L0). It turned out that the version of L1 KVM/Linux matters. At that
> time, actually I used v3.7 kernel for L1, and the L2 didn't work as I
> described above. If I use v3.5 or older for L1, L2 works with the EPT
> patches. So, I guess some changes made to v3.6 might have exposed a
> bug with the nested EPT patches or somewhere. We are looking at the
> changes to root-cause it.

Finally I've had more time to work on this, and I think I've fixed
this. The problem was that the exit qualification for EPT violation
(to L1) was not accurate (enough). And I needed to save the exit
qualification upon EPT violation somewhere. Today, that information is
converted to error_code (see below), and we lose the information. We
need to use at least the lower 3 bits when injecting EPT violation to
the L1 VMM. I tried to use the upper bytes of error_code to pass part
of the exit qualification, but it didn't work well. Any suggestion for
the place to store the value? kvm_vcpu?

...
	/* It is a write fault? */
	error_code = exit_qualification & (1U << 1);
	/* ept page table is present? */
	error_code |= (exit_qualification >> 3) & 0x1;

	return kvm_mmu_page_fault(vcpu, gpa, error_code, NULL, 0);

--
Jun
Intel Open Source Technology Center
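Jun's snippet, pulled out as a self-contained helper, makes the information loss he describes easy to see. The standalone function and its name are illustrative (KVM does this inline in the EPT-violation handler): only bit 1 (write fault) and bit 3 (EPT entry readable/present) of the exit qualification survive, so for example a data read and an instruction fetch become indistinguishable afterwards.

```c
#include <stdint.h>

/* Illustrative standalone version of the conversion quoted above:
 * squash the EPT-violation exit qualification into a 2-bit error_code,
 * discarding everything but qualification bits 1 and 3. */
static uint32_t ept_violation_to_error_code(uint64_t exit_qualification)
{
	uint32_t error_code;

	/* It is a write fault? (exit qualification bit 1) */
	error_code = exit_qualification & (1U << 1);
	/* ept page table is present? (exit qualification bit 3) */
	error_code |= (exit_qualification >> 3) & 0x1;

	return error_code;
}
```

Both a data-read violation (qualification bit 0) and an instruction-fetch violation (bit 2) map to the same error_code, which is exactly the low-bits access-type information Jun wants to preserve for re-injection into the L1 VMM.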
Re: [Bug 53611] New: nVMX: Add nested EPT
On 2013-03-05 05:45, Nakajima, Jun wrote:
> On Tue, Feb 26, 2013 at 11:43 AM, Jan Kiszka wrote:
>> On 2013-02-26 15:11, Nadav Har'El wrote:
>>> On Thu, Feb 14, 2013, Nakajima, Jun wrote about "Re: [Bug 53611] New: nVMX: Add nested EPT":
>>>> We have started looking at the patches first. But I couldn't
>>>> reproduce the results by simply applying the original patches to v3.6:
>>>> - L2 Ubuntu 12.04 (64-bit) (smp 2)
>>>> - L1 Ubuntu 12.04 (64-bit) KVM (smp 2)
>>>> - L0 Ubuntu 12.04 (64-bit)-based. kernel/KVM is v3.6 + patches (the
>>>>   ones in nept-v2.tgz).
>>>>   https://bugzilla.kernel.org/attachment.cgi?id=93101
>>>>
>>>> Without the patches, the L2 guest works. With it, it hangs at boot
>>>> time (just black screen):
>>>> - EPT was detected by L1 KVM.
>>>> - UP L2 didn't help.
>>>> - Looks like it's looping at EPT_walk_add_generic at the same address in L0.
>>>>
>>>> Will take a closer look. It would be helpful if the test configuration
>>>> (e.g kernel/commit id used, L1/L2 guests) was documented as well.
>>>
>>> I sent the patches in August 1st, and they applied to commit
>>> ade38c311a0ad8c32e902fe1d0ae74d0d44bc71e from a week earlier.
>>>
>>> In most of my tests, L1 and L2 were old images - L1 had Linux 2.6.33,
>>> while L2 had Linux 2.6.28. In most of my tests both L1 and L2 were UP.
>>>
>>> I've heard another report of my patch not working with newer L1/L2 -
>>> the report said that L2 failed to boot (like you reported), and also
>>> that L1 became unstable (running anything in it gave a memory fault).
>>> So it is very likely that this code still has bugs - but since I already
>>> know of errors and holes that need to be plugged (see the announcement file
>>> together with the patches), it's not very surprising :( These patches
>>> definitely need some lovin', but it's easier than starting from scratch.
>>
>> FWIW, I'm playing with them on top of kvm-3.6-2 (second pull request for
>> 3.6) for a while. They work OK for my use case (static mapping) but
>> apparently lock up L2 when starting KVM on KVM, just as reported. I
>> didn't look into any details there, still busy with fixing other issues
>> like CR0/CR4 handling (which I came across while adding unrestricted
>> guest support on top of EPT).
>
> I have some updates on this. We rebased the patches to the latest KVM
> (L0). It turned out that the version of L1 KVM/Linux matters. At that
> time, actually I used v3.7 kernel for L1, and the L2 didn't work as I
> described above. If I use v3.5 or older for L1, L2 works with the EPT
> patches. So, I guess some changes made to v3.6 might have exposed a
> bug with the nested EPT patches or somewhere. We are looking at the
> changes to root-cause it.

Great to hear! Would you mind sharing your work early, even when it's
not yet stable? At least regarding lockups or misbehaviors of L1 and L2,
some of the patches I posted recently may help. Did you try to merge
them as well?

Thanks,
Jan
Re: [Bug 53611] New: nVMX: Add nested EPT
On Tue, Feb 26, 2013 at 11:43 AM, Jan Kiszka wrote:
> On 2013-02-26 15:11, Nadav Har'El wrote:
>> On Thu, Feb 14, 2013, Nakajima, Jun wrote about "Re: [Bug 53611] New: nVMX: Add nested EPT":
>>> We have started looking at the patches first. But I couldn't
>>> reproduce the results by simply applying the original patches to v3.6:
>>> - L2 Ubuntu 12.04 (64-bit) (smp 2)
>>> - L1 Ubuntu 12.04 (64-bit) KVM (smp 2)
>>> - L0 Ubuntu 12.04 (64-bit)-based. kernel/KVM is v3.6 + patches (the
>>>   ones in nept-v2.tgz).
>>>   https://bugzilla.kernel.org/attachment.cgi?id=93101
>>>
>>> Without the patches, the L2 guest works. With it, it hangs at boot
>>> time (just black screen):
>>> - EPT was detected by L1 KVM.
>>> - UP L2 didn't help.
>>> - Looks like it's looping at EPT_walk_add_generic at the same address in L0.
>>>
>>> Will take a closer look. It would be helpful if the test configuration
>>> (e.g kernel/commit id used, L1/L2 guests) was documented as well.
>>
>> I sent the patches in August 1st, and they applied to commit
>> ade38c311a0ad8c32e902fe1d0ae74d0d44bc71e from a week earlier.
>>
>> In most of my tests, L1 and L2 were old images - L1 had Linux 2.6.33,
>> while L2 had Linux 2.6.28. In most of my tests both L1 and L2 were UP.
>>
>> I've heard another report of my patch not working with newer L1/L2 -
>> the report said that L2 failed to boot (like you reported), and also
>> that L1 became unstable (running anything in it gave a memory fault).
>> So it is very likely that this code still has bugs - but since I already
>> know of errors and holes that need to be plugged (see the announcement file
>> together with the patches), it's not very surprising :( These patches
>> definitely need some lovin', but it's easier than starting from scratch.
>
> FWIW, I'm playing with them on top of kvm-3.6-2 (second pull request for
> 3.6) for a while. They work OK for my use case (static mapping) but
> apparently lock up L2 when starting KVM on KVM, just as reported. I
> didn't look into any details there, still busy with fixing other issues
> like CR0/CR4 handling (which I came across while adding unrestricted
> guest support on top of EPT).

I have some updates on this. We rebased the patches to the latest KVM
(L0). It turned out that the version of L1 KVM/Linux matters. At that
time, actually I used v3.7 kernel for L1, and the L2 didn't work as I
described above. If I use v3.5 or older for L1, L2 works with the EPT
patches. So, I guess some changes made to v3.6 might have exposed a
bug with the nested EPT patches or somewhere. We are looking at the
changes to root-cause it.

> Given that I'm porting now patches between that branch and "next" back
> and forth (I depend on EPT), it would be really great if someone
> familiar with the KVM MMU (or enough time) could port the series to the
> current git head. That would not solve remaining bugs but could trigger
> more development, maybe also help me jumping into this.
>
> Thanks,
> Jan

--
Jun
Intel Open Source Technology Center
[Bug 53611] nVMX: Add nested EPT
https://bugzilla.kernel.org/show_bug.cgi?id=53611

--- Comment #1 from Nadav Har'El 2013-02-27 08:14:13 ---

In addition to the known issues list in the "announce" file attached
above, I thought of several more issues that should be considered:

1. When switching back and forth between L1 and L2 it will be a waste to
throw away the EPT table already built. So I hope (need to check...)
that the EPT table is cached. But what is the cache key - the cr3? But
cr3 has a different meaning in L2 and L1, so it might not be correct to
use that as the key.

2. When L0 swaps out pages, it needs to remove these entries in all EPT
tables, including the cached EPT02 even if not currently used. Does this
happen correctly?

3. If L1 uses EPT ("nested EPT") and gives us a malformed EPT12 table,
we may need to inject an EPT_MISCONFIGURATION exit when building the
merged EPT02 entry. Typically, we do this building (see "fetch" in
paging_tmpl.h) when handling an EPT violation exit from L2, so if we
encounter this problem instead of reentering L2 immediately, we should
exit to L1 with an EPT misconfiguration. I'm not sure exactly how to
notice this problem. Perhaps the pagetable walking code, which in our
case walks EPT12, already notices a problem and does something (#GP
perhaps?) and we need to have it do the EPT misconfig instead. But it is
possible we need to add additional tests that are not done for normal
page tables - particularly regarding reserved bits, and especially bit 5
(in EPT it is reserved, in normal page tables it is the accessed bit).
This issue is low priority, as it only deals with the error path; a
well-written L1 will not cause EPT misconfigurations anyway.

--
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are watching the assignee of the bug.
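For issue 3 above, one concrete (and deliberately incomplete) sketch of what "noticing" a malformed EPT12 entry could look like. This is illustrative C, not KVM code: it checks only the best-known misconfiguration case from the SDM — write access granted without read access, i.e. permission bits 2:0 equal to 010b or 110b — and a real check would also have to cover reserved bits and, for leaf entries, the memory-type field:

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative only: decide whether a single EPT12 entry handed to us
 * by L1 is misconfigured.  Bits 0/1/2 of an EPT entry are the
 * read/write/execute permissions; an entry that is writable but not
 * readable is never a valid configuration, so building an EPT02 entry
 * from it should instead trigger an EPT misconfiguration exit to L1. */
static bool ept12_entry_misconfigured(uint64_t entry)
{
	bool readable = entry & (1ULL << 0);
	bool writable = entry & (1ULL << 1);

	/* write-only (010b) and write/execute-without-read (110b) */
	return !readable && writable;
}
```

An all-zero permission field (000b) is deliberately treated as "not present" rather than misconfigured — that case surfaces as an ordinary EPT violation, not a misconfiguration.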
Re: [Bug 53611] New: nVMX: Add nested EPT
On Tue, Feb 26, 2013 at 08:43:13PM +0100, Jan Kiszka wrote:
> On 2013-02-26 15:11, Nadav Har'El wrote:
> > On Thu, Feb 14, 2013, Nakajima, Jun wrote about "Re: [Bug 53611] New: nVMX: Add nested EPT":
> >> We have started looking at the patches first. But I couldn't
> >> reproduce the results by simply applying the original patches to v3.6:
> >> - L2 Ubuntu 12.04 (64-bit) (smp 2)
> >> - L1 Ubuntu 12.04 (64-bit) KVM (smp 2)
> >> - L0 Ubuntu 12.04 (64-bit)-based. kernel/KVM is v3.6 + patches (the
> >>   ones in nept-v2.tgz).
> >>   https://bugzilla.kernel.org/attachment.cgi?id=93101
> >>
> >> Without the patches, the L2 guest works. With it, it hangs at boot
> >> time (just black screen):
> >> - EPT was detected by L1 KVM.
> >> - UP L2 didn't help.
> >> - Looks like it's looping at EPT_walk_add_generic at the same address in L0.
> >>
> >> Will take a closer look. It would be helpful if the test configuration
> >> (e.g kernel/commit id used, L1/L2 guests) was documented as well.
> >
> > I sent the patches in August 1st, and they applied to commit
> > ade38c311a0ad8c32e902fe1d0ae74d0d44bc71e from a week earlier.
> >
> > In most of my tests, L1 and L2 were old images - L1 had Linux 2.6.33,
> > while L2 had Linux 2.6.28. In most of my tests both L1 and L2 were UP.
> >
> > I've heard another report of my patch not working with newer L1/L2 -
> > the report said that L2 failed to boot (like you reported), and also
> > that L1 became unstable (running anything in it gave a memory fault).
> > So it is very likely that this code still has bugs - but since I already
> > know of errors and holes that need to be plugged (see the announcement file
> > together with the patches), it's not very surprising :( These patches
> > definitely need some lovin', but it's easier than starting from scratch.
>
> FWIW, I'm playing with them on top of kvm-3.6-2 (second pull request for
> 3.6) for a while. They work OK for my use case (static mapping) but
> apparently lock up L2 when starting KVM on KVM, just as reported. I
> didn't look into any details there, still busy with fixing other issues
> like CR0/CR4 handling (which I came across while adding unrestricted
> guest support on top of EPT).
>
> Given that I'm porting now patches between that branch and "next" back
> and forth (I depend on EPT), it would be really great if someone
> familiar with the KVM MMU (or enough time) could port the series to the
> current git head. That would not solve remaining bugs but could trigger
> more development, maybe also help me jumping into this.
>
I'd like to do that. See if I'll have time...

--
			Gleb.
Re: [Bug 53611] New: nVMX: Add nested EPT
On 2013-02-26 15:11, Nadav Har'El wrote:
> On Thu, Feb 14, 2013, Nakajima, Jun wrote about "Re: [Bug 53611] New: nVMX: Add nested EPT":
>> We have started looking at the patches first. But I couldn't
>> reproduce the results by simply applying the original patches to v3.6:
>> - L2 Ubuntu 12.04 (64-bit) (smp 2)
>> - L1 Ubuntu 12.04 (64-bit) KVM (smp 2)
>> - L0 Ubuntu 12.04 (64-bit)-based. kernel/KVM is v3.6 + patches (the
>>   ones in nept-v2.tgz).
>>   https://bugzilla.kernel.org/attachment.cgi?id=93101
>>
>> Without the patches, the L2 guest works. With it, it hangs at boot
>> time (just black screen):
>> - EPT was detected by L1 KVM.
>> - UP L2 didn't help.
>> - Looks like it's looping at EPT_walk_add_generic at the same address in L0.
>>
>> Will take a closer look. It would be helpful if the test configuration
>> (e.g kernel/commit id used, L1/L2 guests) was documented as well.
>
> I sent the patches in August 1st, and they applied to commit
> ade38c311a0ad8c32e902fe1d0ae74d0d44bc71e from a week earlier.
>
> In most of my tests, L1 and L2 were old images - L1 had Linux 2.6.33,
> while L2 had Linux 2.6.28. In most of my tests both L1 and L2 were UP.
>
> I've heard another report of my patch not working with newer L1/L2 -
> the report said that L2 failed to boot (like you reported), and also
> that L1 became unstable (running anything in it gave a memory fault).
> So it is very likely that this code still has bugs - but since I already
> know of errors and holes that need to be plugged (see the announcement file
> together with the patches), it's not very surprising :( These patches
> definitely need some lovin', but it's easier than starting from scratch.

FWIW, I'm playing with them on top of kvm-3.6-2 (second pull request for
3.6) for a while. They work OK for my use case (static mapping) but
apparently lock up L2 when starting KVM on KVM, just as reported. I
didn't look into any details there, still busy with fixing other issues
like CR0/CR4 handling (which I came across while adding unrestricted
guest support on top of EPT).

Given that I'm porting now patches between that branch and "next" back
and forth (I depend on EPT), it would be really great if someone
familiar with the KVM MMU (or enough time) could port the series to the
current git head. That would not solve remaining bugs but could trigger
more development, maybe also help me jumping into this.

Thanks,
Jan
Re: [Bug 53611] New: nVMX: Add nested EPT
On Thu, Feb 14, 2013, Nakajima, Jun wrote about "Re: [Bug 53611] New: nVMX: Add nested EPT":
> We have started looking at the patches first. But I couldn't
> reproduce the results by simply applying the original patches to v3.6:
> - L2 Ubuntu 12.04 (64-bit) (smp 2)
> - L1 Ubuntu 12.04 (64-bit) KVM (smp 2)
> - L0 Ubuntu 12.04 (64-bit)-based. kernel/KVM is v3.6 + patches (the
>   ones in nept-v2.tgz).
>   https://bugzilla.kernel.org/attachment.cgi?id=93101
>
> Without the patches, the L2 guest works. With it, it hangs at boot
> time (just black screen):
> - EPT was detected by L1 KVM.
> - UP L2 didn't help.
> - Looks like it's looping at EPT_walk_add_generic at the same address in L0.
>
> Will take a closer look. It would be helpful if the test configuration
> (e.g kernel/commit id used, L1/L2 guests) was documented as well.

I sent the patches in August 1st, and they applied to commit
ade38c311a0ad8c32e902fe1d0ae74d0d44bc71e from a week earlier.

In most of my tests, L1 and L2 were old images - L1 had Linux 2.6.33,
while L2 had Linux 2.6.28. In most of my tests both L1 and L2 were UP.

I've heard another report of my patch not working with newer L1/L2 -
the report said that L2 failed to boot (like you reported), and also
that L1 became unstable (running anything in it gave a memory fault).
So it is very likely that this code still has bugs - but since I already
know of errors and holes that need to be plugged (see the announcement
file together with the patches), it's not very surprising :( These
patches definitely need some lovin', but it's easier than starting from
scratch.

Nadav.

--
Nadav Har'El                  |  Tuesday, Feb 26 2013, 16 Adar 5773
n...@math.technion.ac.il      |- Phone +972-523-790466, ICQ 13349191
http://nadav.harel.org.il     |I think therefore I am. My computer
                              |thinks for me, therefore I am not.
Re: [Bug 53611] New: nVMX: Add nested EPT
On Tue, Feb 12, 2013 at 11:43 PM, Jan Kiszka wrote:
>
> On 2013-02-12 20:13, Nakajima, Jun wrote:
> > I looked at your (old) patches, and they seem to be very useful
> > although some of them require rebasing or rewriting. We are interested
> > in completing the nested-VMX features.
>
> That's great news. Can you estimate when you will be able to work on it?
>

We have started looking at the patches first. But I couldn't
reproduce the results by simply applying the original patches to v3.6:
- L2 Ubuntu 12.04 (64-bit) (smp 2)
- L1 Ubuntu 12.04 (64-bit) KVM (smp 2)
- L0 Ubuntu 12.04 (64-bit)-based. kernel/KVM is v3.6 + patches (the
  ones in nept-v2.tgz).
  https://bugzilla.kernel.org/attachment.cgi?id=93101

Without the patches, the L2 guest works. With it, it hangs at boot
time (just black screen):
- EPT was detected by L1 KVM.
- UP L2 didn't help.
- Looks like it's looping at EPT_walk_add_generic at the same address in L0.

Will take a closer look. It would be helpful if the test configuration
(e.g kernel/commit id used, L1/L2 guests) was documented as well.

> I will have a use case for nEPT soon - testing purposes. But working
> into the KVM MMU and doing the port myself may unfortunately consume too
> much time here.
>
> Jan
>

--
Jun
Intel Open Source Technology Center
Re: [Bug 53611] New: nVMX: Add nested EPT
On 2013-02-12 20:13, Nakajima, Jun wrote:
> On Mon, Feb 11, 2013 at 5:27 AM, Nadav Har'El wrote:
>> Hi,
>>
>> On Mon, Feb 11, 2013, Jan Kiszka wrote about "Re: [Bug 53611] New: nVMX: Add nested EPT":
>>> On 2013-02-11 13:49, bugzilla-dae...@bugzilla.kernel.org wrote:
>>>> https://bugzilla.kernel.org/show_bug.cgi?id=53611
>>>>    Summary: nVMX: Add nested EPT
>>
>> Yikes, I didn't realize that these bugzilla edits all get spammed to the
>> entire mailing list :( Sorry about those...
>>
>>> I suppose they do not apply anymore as well. Do you have a recent tree
>>> around somewhere or plan to resume work on it?
>>
>> Unfortunately, no - I did not have time to work on these patches since
>> August.
>>
>> The reason I'm now stuffing these things into the bug tracker is that
>> at the end of this month I am leaving IBM to a new job, so I'm pretty
>> sure I won't have time myself to continue any work on nested VMX, and
>> would like for the missing nested-VMX features to be documented in case
>> someone else comes along and wants to improve it. So unfortunately, you
>> should expect more of this bugzilla spam on the mailing list...
>
> I looked at your (old) patches, and they seem to be very useful
> although some of them require rebasing or rewriting. We are interested
> in completing the nested-VMX features.

That's great news. Can you estimate when you will be able to work on it?

I will have a use case for nEPT soon - testing purposes. But working
into the KVM MMU and doing the port myself may unfortunately consume too
much time here.

Jan

--
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux
Re: [Bug 53611] New: nVMX: Add nested EPT
On Mon, Feb 11, 2013 at 5:27 AM, Nadav Har'El wrote: > Hi, > > On Mon, Feb 11, 2013, Jan Kiszka wrote about "Re: [Bug 53611] New: nVMX: Add > nested EPT": >> On 2013-02-11 13:49, bugzilla-dae...@bugzilla.kernel.org wrote: >> > https://bugzilla.kernel.org/show_bug.cgi?id=53611 >> >Summary: nVMX: Add nested EPT > > Yikes, I didn't realize that these bugzilla edits all get spammed to the > entire mailing list :( Sorry about those... > >> I suppose they do not apply anymore as well. Do you have a recent tree >> around somewhere or plan to resume work on it? > > Unfortunately, no - I did not have time to work on these patches since > August. > > The reason I'm now stuffing these things into the bug tracker is that > at the end of this month I am leaving IBM to a new job, so I'm pretty > sure I won't have time myself to continue any work on nested VMX, and > would like for the missing nested-VMX features to be documented in case > someone else comes along and wants to improve it. So unfortunately, you > should expect more of this bugzilla spam on the mailing list... > I looked at your (old) patches, and they seem to be very useful although some of them require rebasing or rewriting. We are interested in completing the nested-VMX features. -- Jun Intel Open Source Technology Center -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Bug 53611] New: nVMX: Add nested EPT
On 2013-02-11 14:27, Nadav Har'El wrote: > Hi, > > On Mon, Feb 11, 2013, Jan Kiszka wrote about "Re: [Bug 53611] New: nVMX: Add > nested EPT": >> On 2013-02-11 13:49, bugzilla-dae...@bugzilla.kernel.org wrote: >>> https://bugzilla.kernel.org/show_bug.cgi?id=53611 >>>Summary: nVMX: Add nested EPT > > Yikes, I didn't realize that these bugzilla edits all get spammed to the > entire mailing list :( Sorry about those... > >> I suppose they do not apply anymore as well. Do you have a recent tree >> around somewhere or plan to resume work on it? > > Unfortunately, no - I did not have time to work on these patches since > August. > > The reason I'm now stuffing these things into the bug tracker is that > at the end of this month I am leaving IBM to a new job, so I'm pretty > sure I won't have time myself to continue any work on nested VMX, and > would like for the missing nested-VMX features to be documented in case > someone else comes along and wants to improve it. So unfortunately, you > should expect more of this bugzilla spam on the mailing list... A pity that you cannot finish this great work. But documenting the open issues is definitely helpful and welcome. Best wishes, Jan -- Siemens AG, Corporate Technology, CT RTC ITP SDP-DE Corporate Competence Center Embedded Linux -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Bug 53611] New: nVMX: Add nested EPT
Hi, On Mon, Feb 11, 2013, Jan Kiszka wrote about "Re: [Bug 53611] New: nVMX: Add nested EPT": > On 2013-02-11 13:49, bugzilla-dae...@bugzilla.kernel.org wrote: > > https://bugzilla.kernel.org/show_bug.cgi?id=53611 > > Summary: nVMX: Add nested EPT Yikes, I didn't realize that these bugzilla edits all get spammed to the entire mailing list :( Sorry about those... > I suppose they do not apply anymore as well. Do you have a recent tree > around somewhere or plan to resume work on it? Unfortunately, no - I did not have time to work on these patches since August. The reason I'm now stuffing these things into the bug tracker is that at the end of this month I am leaving IBM to a new job, so I'm pretty sure I won't have time myself to continue any work on nested VMX, and would like for the missing nested-VMX features to be documented in case someone else comes along and wants to improve it. So unfortunately, you should expect more of this bugzilla spam on the mailing list... Nadav. -- Nadav Har'El| Monday, Feb 11 2013, 1 Adar 5773 n...@math.technion.ac.il |- Phone +972-523-790466, ICQ 13349191 |The message above is just this http://nadav.harel.org.il |signature's way of propagating itself. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Bug 53611] New: nVMX: Add nested EPT
On 2013-02-11 13:49, bugzilla-dae...@bugzilla.kernel.org wrote: > https://bugzilla.kernel.org/show_bug.cgi?id=53611 > >Summary: nVMX: Add nested EPT >Product: Virtualization >Version: unspecified > Platform: All > OS/Version: Linux > Tree: Mainline > Status: NEW > Severity: normal > Priority: P1 > Component: kvm > AssignedTo: virtualization_...@kernel-bugs.osdl.org > ReportedBy: n...@math.technion.ac.il > Regression: No > > > Created an attachment (id=93101) > --> (https://bugzilla.kernel.org/attachment.cgi?id=93101) > Nested EPT patches, v2 > > Nested EPT means emulating EPT for an L1 guest, allowing it to use EPT when > running a nested guest L2. When L1 uses EPT, it allows the L2 guest to set > its own cr3 and take its own page faults without either of L0 or L1 getting > involved. In many workloads this significanlty improves L2's performance over > the previous two alternatives (shadow page tables over ept, and shadow page > tables over shadow page tables). As an example, I measured a single-threaded > "make", which has a lot of context switches and page faults, on the three > options: > > shadow over shadow: 105 seconds > shadow over EPT: 87 seconds (this is the default currently) > EPT over EPT: 29 seconds > > single-level virtualization (with EPT): 25 seconds > > So clearly nested EPT would be a big win for such workloads. > > I attach a patch set which I worked on and allowed me to measure the above > results. This is the same patch set I sent to KVM mailing list on August 1st, > 2012, titled "nEPT v2: Nested EPT support for Nested VMX". > > This patch set still needs some work: it is known to only work in some setups > but not others, and the file "announce" in the attached tar lists 5 things > which definitely need to be done. There were a few additional comments in the > mailing list - see > http://comments.gmane.org/gmane.comp.emulators.kvm.devel/95395 > I suppose they do not apply anymore as well. 
Do you have a recent tree around somewhere or plan to resume work on it?

Jan

--
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux
[Bug 53611] nVMX: Add nested EPT
https://bugzilla.kernel.org/show_bug.cgi?id=53611

Nadav Har'El changed:

           What|Removed |Added
         Blocks||53601

--
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are watching the assignee of the bug.
[Bug 53611] New: nVMX: Add nested EPT
https://bugzilla.kernel.org/show_bug.cgi?id=53611

           Summary: nVMX: Add nested EPT
           Product: Virtualization
           Version: unspecified
          Platform: All
        OS/Version: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: kvm
        AssignedTo: virtualization_...@kernel-bugs.osdl.org
        ReportedBy: n...@math.technion.ac.il
        Regression: No

Created an attachment (id=93101)
 --> (https://bugzilla.kernel.org/attachment.cgi?id=93101)
Nested EPT patches, v2

Nested EPT means emulating EPT for an L1 guest, allowing it to use EPT when running a nested guest L2. When L1 uses EPT, it allows the L2 guest to set its own cr3 and take its own page faults without either of L0 or L1 getting involved. In many workloads this significantly improves L2's performance over the previous two alternatives (shadow page tables over EPT, and shadow page tables over shadow page tables). As an example, I measured a single-threaded "make", which has a lot of context switches and page faults, on the three options:

shadow over shadow: 105 seconds
shadow over EPT: 87 seconds (this is the default currently)
EPT over EPT: 29 seconds

single-level virtualization (with EPT): 25 seconds

So clearly nested EPT would be a big win for such workloads.

I attach a patch set which I worked on and which allowed me to measure the above results. This is the same patch set I sent to the KVM mailing list on August 1st, 2012, titled "nEPT v2: Nested EPT support for Nested VMX".

This patch set still needs some work: it is known to only work in some setups but not others, and the file "announce" in the attached tar lists 5 things which definitely need to be done. There were a few additional comments on the mailing list - see http://comments.gmane.org/gmane.comp.emulators.kvm.devel/95395
Re: [PATCH 0/10] nEPT v2: Nested EPT support for Nested VMX
On 08/01/2012 05:36 PM, Nadav Har'El wrote: > The following patches add nested EPT support to Nested VMX. > > This is the second version of this patch set. Most of the issues from the > previous reviews were handled, and in particular there is now a new variant > of paging_tmpl for EPT page tables. Thanks for this repost. > However, while this version does work in my tests, there are still some known > problems/bugs with this version and unhandled issues from the previous review: > > 1. 32-bit *PAE* L2s currently don't work. non-PAE 32-bit L2s do work > (and so do, of course, 64-bit L2s). > I'm guessing that this has to do with loading the PDPTEs; probably we're loading them from L1 instead of L2 during mode transitions. -- error compiling committee.c: too many arguments to function -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 03/10] nEPT: MMU context for nested EPT
KVM's existing shadow MMU code already supports nested TDP. To use it, we need to set up a new "MMU context" for nested EPT, and create a few callbacks for it (nested_ept_*()). This context should also use the EPT versions of the page table access functions (defined in the previous patch).

Then, we need to switch back and forth between this nested context and the regular MMU context when switching between L1 and L2 (when L1 runs this L2 with EPT).

Signed-off-by: Nadav Har'El
---
 arch/x86/kvm/mmu.c |   38 +++
 arch/x86/kvm/mmu.h |    1
 arch/x86/kvm/vmx.c |   52 +++
 3 files changed, 91 insertions(+)

--- .before/arch/x86/kvm/mmu.h	2012-08-01 17:22:46.0 +0300
+++ .after/arch/x86/kvm/mmu.h	2012-08-01 17:22:46.0 +0300
@@ -52,6 +52,7 @@ int kvm_mmu_get_spte_hierarchy(struct kv
 void kvm_mmu_set_mmio_spte_mask(u64 mmio_mask);
 int handle_mmio_page_fault_common(struct kvm_vcpu *vcpu, u64 addr, bool direct);
 int kvm_init_shadow_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context);
+int kvm_init_shadow_EPT_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context);
 
 static inline unsigned int kvm_mmu_available_pages(struct kvm *kvm)
 {
--- .before/arch/x86/kvm/mmu.c	2012-08-01 17:22:46.0 +0300
+++ .after/arch/x86/kvm/mmu.c	2012-08-01 17:22:46.0 +0300
@@ -3616,6 +3616,44 @@ int kvm_init_shadow_mmu(struct kvm_vcpu
 }
 EXPORT_SYMBOL_GPL(kvm_init_shadow_mmu);
 
+int kvm_init_shadow_EPT_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context)
+{
+	ASSERT(vcpu);
+	ASSERT(!VALID_PAGE(vcpu->arch.mmu.root_hpa));
+
+	context->shadow_root_level = kvm_x86_ops->get_tdp_level();
+
+	context->nx = is_nx(vcpu); /* TODO: ? */
+	context->new_cr3 = paging_new_cr3;
+	context->page_fault = EPT_page_fault;
+	context->gva_to_gpa = EPT_gva_to_gpa;
+	context->sync_page = EPT_sync_page;
+	context->invlpg = EPT_invlpg;
+	context->update_pte = EPT_update_pte;
+	context->free = paging_free;
+	context->root_level = context->shadow_root_level;
+	context->root_hpa = INVALID_PAGE;
+	context->direct_map = false;
+
+	/* TODO: reset_rsvds_bits_mask() is not built for EPT, we need
+	   something different. */
+	reset_rsvds_bits_mask(vcpu, context);
+
+	/* TODO: I copied these from kvm_init_shadow_mmu, I don't know why
+	   they are done, or why they write to vcpu->arch.mmu and not context */
+	vcpu->arch.mmu.base_role.cr4_pae = !!is_pae(vcpu);
+	vcpu->arch.mmu.base_role.cr0_wp = is_write_protection(vcpu);
+	vcpu->arch.mmu.base_role.smep_andnot_wp =
+		kvm_read_cr4_bits(vcpu, X86_CR4_SMEP) &&
+		!is_write_protection(vcpu);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(kvm_init_shadow_EPT_mmu);
+
 static int init_kvm_softmmu(struct kvm_vcpu *vcpu)
 {
 	int r = kvm_init_shadow_mmu(vcpu, vcpu->arch.walk_mmu);
--- .before/arch/x86/kvm/vmx.c	2012-08-01 17:22:46.0 +0300
+++ .after/arch/x86/kvm/vmx.c	2012-08-01 17:22:46.0 +0300
@@ -901,6 +901,11 @@ static inline bool nested_cpu_has_virtua
 	return vmcs12->pin_based_vm_exec_control & PIN_BASED_VIRTUAL_NMIS;
 }
 
+static inline int nested_cpu_has_ept(struct vmcs12 *vmcs12)
+{
+	return nested_cpu_has2(vmcs12, SECONDARY_EXEC_ENABLE_EPT);
+}
+
 static inline bool is_exception(u32 intr_info)
 {
 	return (intr_info & (INTR_INFO_INTR_TYPE_MASK | INTR_INFO_VALID_MASK))
@@ -6591,6 +6596,46 @@ static void vmx_set_supported_cpuid(u32
 	entry->ecx |= bit(X86_FEATURE_VMX);
 }
 
+/* Callbacks for nested_ept_init_mmu_context: */
+
+static unsigned long nested_ept_get_cr3(struct kvm_vcpu *vcpu)
+{
+	/* return the page table to be shadowed - in our case, EPT12 */
+	return get_vmcs12(vcpu)->ept_pointer;
+}
+
+static void nested_ept_inject_page_fault(struct kvm_vcpu *vcpu,
+		struct x86_exception *fault)
+{
+	struct vmcs12 *vmcs12;
+	nested_vmx_vmexit(vcpu);
+	vmcs12 = get_vmcs12(vcpu);
+	/*
+	 * Note no need to set vmcs12->vm_exit_reason as it is already copied
+	 * from vmcs02 in nested_vmx_vmexit() above, i.e., EPT_VIOLATION.
+	 */
+	vmcs12->exit_qualification = fault->error_code;
+	vmcs12->guest_physical_address = fault->address;
+}
+
+static int nested_ept_init_mmu_context(struct kvm_vcpu *vcpu)
+{
+	int r = kvm_init_shadow_EPT_mmu(vcpu, &vcpu->arch.mmu);
+
+	vcpu->arch.mmu.set_cr3 = vmx_set_cr3;
+	vcpu->arch.mmu.get_cr3 = nested_ept_get_cr3;
+	vcpu->arch.mmu.inject_page_fault = nested_ept_inject_page_fault;
+
+	vcpu->arch.walk_mmu = &vcpu->arch.nested_mmu;
+
+	return r;
+}
+
+static void nested_ept_uninit_
[PATCH 0/10] nEPT v2: Nested EPT support for Nested VMX
The following patches add nested EPT support to Nested VMX.

This is the second version of this patch set. Most of the issues from the previous reviews were handled, and in particular there is now a new variant of paging_tmpl for EPT page tables.

However, while this version does work in my tests, there are still some known problems/bugs with this version and unhandled issues from the previous review:

1. 32-bit *PAE* L2s currently don't work. non-PAE 32-bit L2s do work (and so do, of course, 64-bit L2s).

2. nested_ept_inject_page_fault() assumes vm_exit_reason is already set to EPT_VIOLATION. However, it is conceivable that L0 emulates some L2 instruction, and during this emulation we read some L2 memory, causing a need to exit (from L2 to L1) with an EPT violation.

3. Moreover, right now nested_ept_inject_page_fault() always causes an EPT_VIOLATION, with vmcs12->exit_qualification = fault->error_code. This is wrong: first, fault->error_code is not in exit qualification format but in PFERR_* format. Moreover, PFERR_RSVD_MASK would mean we need to cause an EPT_MISCONFIG, NOT EPT_VIOLATION. Instead of trying to fix this by translating PFERR to exit_qualification, we should calculate and remember in walk_addr() the exit qualification (and an additional bit: whether it's an EPT VIOLATION or MISCONFIGURATION). This will be remembered in new fields in x86_exception. Avi suggested: "[add to x86_exception] another bool, to distinguish between EPT VIOLATION and EPT MISCONFIGURATION. The error_code field should be extended to 64 bits for EXIT_QUALIFICATION (though only bits 0-12 are defined). You need another field for the guest linear address. EXIT_QUALIFICATION has to be calculated, it cannot be derived from the original exit. Look at kvm_propagate_fault()." He also added: "If we're injecting an EPT VIOLATION to L1 (because we weren't able to resolve it; say L1 write-protected the page), then we need to compute EXIT_QUALIFICATION.
Bits 3-5 of EXIT_QUALIFICATION are computed from EPT12 paging structure entries (easy to derive them from pt_access/pte_access)."

4. Also, nested_ept_inject_page_fault() doesn't set the guest linear address.

5. There are several "TODO"s left in the code.

If there's any volunteer willing to help me with some of these issues, it would be great :-)

About nested EPT:

Nested EPT means emulating EPT for an L1 guest, allowing it to use EPT when running a nested guest L2. When L1 uses EPT, it allows the L2 guest to set its own cr3 and take its own page faults without either of L0 or L1 getting involved. In many workloads this significantly improves L2's performance over the previous two alternatives (shadow page tables over EPT, and shadow page tables over shadow page tables). Our paper [1] described these three options, and the advantages of nested EPT ("multidimensional paging" in the paper).

Nested EPT is enabled by default (if the hardware supports EPT), so users do not have to do anything special to enjoy the performance improvement that this patch gives to L2 guests. L1 may of course choose not to use nested EPT, by simply not using EPT (e.g., a KVM in L1 may use the "ept=0" option).

Just as a non-scientific, non-representative indication of the kind of dramatic performance improvement you may see in workloads that have a lot of context switches and page faults, here is a measurement of the time an example single-threaded "make" took in L2 (kvm over kvm):

shadow over shadow: 105 seconds ("ept=0" in L0 forces this)
shadow over EPT: 87 seconds (the previous default; can be forced with "ept=0" in L1)
EPT over EPT: 29 seconds (the default after this patch)

Note that the same test on L1 (with EPT) took 25 seconds, so for this example workload, performance of nested virtualization is now very close to that of single-level virtualization.
[1] "The Turtles Project: Design and Implementation of Nested Virtualization", http://www.usenix.org/events/osdi10/tech/full_papers/Ben-Yehuda.pdf

Patch statistics:

 Documentation/virtual/kvm/nested-vmx.txt |    4
 arch/x86/include/asm/vmx.h |    2
 arch/x86/kvm/mmu.c |   52 +++-
 arch/x86/kvm/mmu.h |    1
 arch/x86/kvm/paging_tmpl.h |   98 -
 arch/x86/kvm/vmx.c |  227 +++--
 arch/x86/kvm/x86.c |   11 -
 7 files changed, 354 insertions(+), 41 deletions(-)

--
Nadav Har'El
IBM Haifa Research Lab
Re: [PATCH 0/10] nEPT: Nested EPT support for Nested VMX
On 12/12/2011 01:37 PM, Nadav Har'El wrote: > On Sun, Nov 13, 2011, Avi Kivity wrote about "Re: [PATCH 0/10] nEPT: Nested > EPT support for Nested VMX": > > > I also believed that the fault injection part was also correct: I > > > thought that the code already knows when to handle the fault in L2 (when > > > the address is missing in cr3), in L1 (when the translation is missing > > > in EPT12) or else, in L0. > > > > It does, but it needs to propagate the fault code correctly. The exit > > reason (ept violation vs ept misconfiguration) is meaningless, since we > > don't encode anything about it from ept12 into ept02. In particular an > > ept violation could lead to > > > > - no fault, ept02 updated, instruction retried > > - no fault, instruction emulated > > - L2 fault > > - ept violation, need to compute ept12 permissions for exit qualification > > - ept misconfiguration > > > > (the second and third cases occur when it is impossible to create an > > ept02 mapping - when L0 emulates a gpa that L1 assigns to L2 via ept12). > > I'm now trying to figure out this part, and I think I am beginning to > understand the mess you are referring to: > > In nested_ept_inject_page_fault I now assume the exit reason is always EPT > VIOLATION and have > > vmcs12->exit_qualification = fault->error_code; > > But fault->error_code is not in the exit qualification format but in > the PFERR_* format, which has different meanings for the bits... > Moreover, PFERR_RSVD_MASK should cause an EPT MISCONFIG, not EPT > VIOLATION. Is this what you meant above? In spirit yes. In practice rather than translating from PFERR format to EPT VIOLATION EXIT_QUALIFICATION format, walk_addr() should directly compute the exit qualification (and an additional bit: whether it's an EPT VIOLATION or EPT MISCONFIGURATION. > I didn't quite understand what you meant in the 4th case about needing > to compute ept12 permissions. 
> I'm assuming that if the EPT violation
> was caused because L0 decreased permissions from what L1 thought, then L0
> will solve the problem itself and not inject it to L1. So if we are injecting
> the fault to L1, don't we already know the correct fault reason and don't
> need to compute it?

If we're injecting an EPT VIOLATION to L1 (because we weren't able to resolve it; say L1 write-protected the page), then we need to compute EXIT_QUALIFICATION. Bits 3-5 of EXIT_QUALIFICATION are computed from EPT12 paging structure entries (easy to derive them from pt_access/pte_access).

> There's another complication: when the fault comes from an EPT violation
> in L2, handle_ept_violation() calls mmu.page_fault() with an error_code of
> exit_qualification & 0x3. This means that the error_code in this case is
> *not* in the expected PFERR_* format, and we need to know that in
> nested_ept_inject_page_fault. Moreover, in the original EPT violation's
> exit qualification, there were various other bits which we lose (and don't
> have a direct parallel in PFERR_* anyway), so when we reinject the fault,
> L1 doesn't get them.

struct x86_exception already has 'bool nested', which indicates whether it's an L1 or L2 fault. You need to extend that, perhaps by adding another bool, to distinguish between EPT VIOLATION and EPT MISCONFIGURATION. The error_code field should be extended to 64 bits for EXIT_QUALIFICATION (though only bits 0-12 are defined). You need another field for the guest linear address. EXIT_QUALIFICATION has to be calculated, it cannot be derived from the original exit. Look at kvm_propagate_fault().

> What a mess :(

If you have a splitting headache, you're on the right track.

--
error compiling committee.c: too many arguments to function
Re: [PATCH 0/10] nEPT: Nested EPT support for Nested VMX
On Sun, Nov 13, 2011, Avi Kivity wrote about "Re: [PATCH 0/10] nEPT: Nested EPT support for Nested VMX": > > I also believed that the fault injection part was also correct: I > > thought that the code already knows when to handle the fault in L2 (when > > the address is missing in cr3), in L1 (when the translation is missing > > in EPT12) or else, in L0. > > It does, but it needs to propagate the fault code correctly. The exit > reason (ept violation vs ept misconfiguration) is meaningless, since we > don't encode anything about it from ept12 into ept02. In particular an > ept violation could lead to > > - no fault, ept02 updated, instruction retried > - no fault, instruction emulated > - L2 fault > - ept violation, need to compute ept12 permissions for exit qualification > - ept misconfiguration > > (the second and third cases occur when it is impossible to create an > ept02 mapping - when L0 emulates a gpa that L1 assigns to L2 via ept12). I'm now trying to figure out this part, and I think I am beginning to understand the mess you are referring to: In nested_ept_inject_page_fault I now assume the exit reason is always EPT VIOLATION and have vmcs12->exit_qualification = fault->error_code; But fault->error_code is not in the exit qualification format but in the PFERR_* format, which has different meanings for the bits... Moreover, PFERR_RSVD_MASK should cause an EPT MISCONFIG, not EPT VIOLATION. Is this what you meant above? I didn't quite understand what you meant in the 4th case about needing to compute ept12 permissions. I'm assuming that if the EPT violation was caused because L0 decreased permissions from what L1 thought, then L0 will solve the problem itself and not inject it to L1. So if we are injecting the fault to L1, don't we already know the correct fault reason and don't need to compute it? 
There's another complication: when the fault comes from an EPT violation in L2, handle_ept_violation() calls mmu.page_fault() with an error_code of exit_qualification & 0x3. This means that the error_code in this case is *not* in the expected PFERR_* format, and we need to know that in nested_ept_inject_page_fault. Moreover, in the original EPT violation's exit qualification, there were various other bits which we lose (and don't have a direct parallel in PFERR_* anyway), so when we reinject the fault, L1 doesn't get them.

What a mess :(

--
Nadav Har'El             | Monday, Dec 12 2011,
n...@math.technion.ac.il | Phone +972-523-790466, ICQ 13349191
http://nadav.harel.org.il | Hardware, n.: The parts of a computer system that can be kicked.
Re: [PATCH 02/10] nEPT: MMU context for nested EPT
On Mon, Nov 14, 2011, Avi Kivity wrote about "Re: [PATCH 02/10] nEPT: MMU context for nested EPT":
> > >> +#if PTTYPE == EPT
> > >> +        real_gfn = mmu->translate_gpa(vcpu, gfn_to_gpa(table_gfn),
> > >> +                                      EPT_WRITABLE_MASK);
> > >> +#else
> > >>         real_gfn = mmu->translate_gpa(vcpu, gfn_to_gpa(table_gfn),
> > >>                                       PFERR_USER_MASK|PFERR_WRITE_MASK);
> > >> +#endif
> > >> +
> > >
> > > Unneeded, I think.
> >
> > Is it because translate_nested_gpa always set USER_MASK ?
>
> Yes... maybe that function needs to do something like
>
>    access |= mmu->default_access;

Unless I'm misunderstanding something, translate_nested_gpa, and gva_to_gpa, take as their "access" parameter a bitmask of PFERR_*, so it's fine for PFERR_USER_MASK to be enabled in translate_nested_gpa; it just shouldn't cause PT_USER_MASK to be used.

The only additional problem I can find is in walk_addr_generic, which does

	if (!check_write_user_access(vcpu, write_fault, user_fault, pte))
		eperm = true;

and that checks pte & PT_USER_MASK, which it shouldn't if PTTYPE == PTTYPE_EPT.

It's really confusing that we now have in mmu.c no less than 4 (!) access bit schemes, similar in many ways but different in many others:
1. page fault error codes (PFERR_*_MASK)
2. x86 page table access bits (PT_*_MASK)
3. KVM private access bits (ACC_*_MASK)
4. EPT access bits (VMX_EPT_*_MASK).
I just have to try hard not to confuse them.

--
Nadav Har'El             | Thursday, Dec 8 2011,
n...@math.technion.ac.il | Phone +972-523-790466, ICQ 13349191
http://nadav.harel.org.il | Sorry, but my karma just ran over your dogma.
Re: [PATCH 02/10] nEPT: MMU context for nested EPT
On 12/07/2011 11:06 AM, Nadav Har'El wrote: > On Sun, Nov 13, 2011, Orit Wasserman wrote about "Re: [PATCH 02/10] nEPT: MMU > context for nested EPT": > > +++ b/arch/x86/kvm/mmu.h > > @@ -48,6 +48,11 @@ > > #define PFERR_RSVD_MASK (1U << 3) > > #define PFERR_FETCH_MASK (1U << 4) > > > > +#define EPT_WRITABLE_MASK 2 > > +#define EPT_EXEC_MASK 4 > > This is another example of the "unclean" movement of VMX-specific things into > x86 :( We already have VMX_EPT_WRITABLE_MASK and friends in vmx.h. I'll > need to think what is less ugly: to move them to mmu.h, or to include vmx.h > in mmu.c, or perhaps even create a new include file, ept.h. Avi, do you have > a preference? Include vmx.h in mmu.c. vmx.h is neutral wrt guestiness/hostiness, so it can be included from mmu.c and vmx.c without issues. > The last thing I want to do is to repeat the same definitions in two places. Right. -- error compiling committee.c: too many arguments to function -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 02/10] nEPT: MMU context for nested EPT
On Sun, Nov 13, 2011, Orit Wasserman wrote about "Re: [PATCH 02/10] nEPT: MMU context for nested EPT": > +++ b/arch/x86/kvm/mmu.h > @@ -48,6 +48,11 @@ > #define PFERR_RSVD_MASK (1U << 3) > #define PFERR_FETCH_MASK (1U << 4) > > +#define EPT_WRITABLE_MASK 2 > +#define EPT_EXEC_MASK 4 This is another example of the "unclean" movement of VMX-specific things into x86 :( We already have VMX_EPT_WRITABLE_MASK and friends in vmx.h. I'll need to think what is less ugly: to move them to mmu.h, or to include vmx.h in mmu.c, or perhaps even create a new include file, ept.h. Avi, do you have a preference? The last thing I want to do is to repeat the same definitions in two places. -- Nadav Har'El| Wednesday, Dec 7 2011, n...@math.technion.ac.il |- Phone +972-523-790466, ICQ 13349191 |"A witty saying proves nothing." -- http://nadav.harel.org.il |Voltaire -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 02/10] nEPT: MMU context for nested EPT
On 12/06/2011 02:40 PM, Nadav Har'El wrote: > On Sun, Nov 13, 2011, Avi Kivity wrote about "Re: [PATCH 02/10] nEPT: MMU > context for nested EPT": > > On 11/13/2011 01:30 PM, Orit Wasserman wrote: > > > Maybe this patch can help, this is roughly what Avi wants (I hope) done > > > very quickly. > > > I'm sorry I don't have setup to run nested VMX at the moment so i can't > > > test it. > >... > > > +#define PTTYPE EPT > > > +#include "paging_tmpl.h" > > > +#undef PTTYPE > > > > Yes, that's the key. > > I'm now preparing a patch based on such ideas. > > One downside of this approach is that mmu.c (and therefore the x86 > module) will now include EPT-specific functions that are of no use or > relevance to the SVM code. It's not a terrible disaster, but it's > "unclean". I'll try to think if there's a cleaner way. I'm perfectly willing to live with this. In general vmx.c and svm.c only deal with host-side differences between Intel and AMD. EPT support in paging.h is guest-side, so it doesn't belong there. -- error compiling committee.c: too many arguments to function -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 02/10] nEPT: MMU context for nested EPT
On Sun, Nov 13, 2011, Avi Kivity wrote about "Re: [PATCH 02/10] nEPT: MMU context for nested EPT": > On 11/13/2011 01:30 PM, Orit Wasserman wrote: > > Maybe this patch can help, this is roughly what Avi wants (I hope) done > > very quickly. > > I'm sorry I don't have setup to run nested VMX at the moment so i can't > > test it. >... > > +#define PTTYPE EPT > > +#include "paging_tmpl.h" > > +#undef PTTYPE > > Yes, that's the key. I'm now preparing a patch based on such ideas. One downside of this approach is that mmu.c (and therefore the x86 module) will now include EPT-specific functions that are of no use or relevance to the SVM code. It's not a terrible disaster, but it's "unclean". I'll try to think if there's a cleaner way. Nadav. -- Nadav Har'El|Tuesday, Dec 6 2011, n...@math.technion.ac.il |- Phone +972-523-790466, ICQ 13349191 |Writing software is like sex: One mistake http://nadav.harel.org.il |and you have to support it forever. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 02/10] nEPT: MMU context for nested EPT
On 11/23/2011 05:44 PM, Nadav Har'El wrote:
> On Wed, Nov 23, 2011, Nadav Har'El wrote about "Re: [PATCH 02/10] nEPT: MMU context for nested EPT":
>>> +static int nested_ept_init_mmu_context(struct kvm_vcpu *vcpu)
>>> +{
>>> +	int r = kvm_init_shadow_mmu(vcpu, &vcpu->arch.mmu);
>>> +
>>> +	vcpu->arch.nested_mmu.gva_to_gpa = EPT_gva_to_gpa_nested;
>>> +
>>> +	return r;
>>> +}
>> ...
>> I didn't see you actually call this function anywhere - how is it
>> supposed to work?
>> ...
>> It seems we need a fifth case in that function.
>> ...
>
> On second thought, why is this modifying nested_mmu.gva_to_gpa, and not
> mmu.gva_to_gpa? Isn't the nested_mmu the L2 CR3, which is *not* in EPT
> format, and what we really want to change is the outer mmu, which is
> EPT12 and is indeed in EPT format?
> Or am I missing something?

I think you're right. The key is to look at what ->walk_mmu points at.

--
error compiling committee.c: too many arguments to function
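The distinction Nadav and Avi are settling can be illustrated with a toy sketch (hypothetical names and dummy translations, not the real kvm structs): the vcpu carries two MMU contexts, and ->walk_mmu selects which one is consulted for address walks, while the main mmu context is the one that must understand the EPT12 format when nested EPT is active.

```c
#include <stdint.h>
#include <assert.h>

typedef uint64_t gpa_t;

/* Illustrative stand-in for struct kvm_mmu: one translation hook. */
struct mmu_ctx {
	gpa_t (*gva_to_gpa)(gpa_t addr);
};

/* Dummy translations so the dispatch is observable. */
static gpa_t ept_gva_to_gpa(gpa_t addr)  { return addr | 0x1000; }
static gpa_t ia32_gva_to_gpa(gpa_t addr) { return addr | 0x2000; }

struct vcpu_ctx {
	struct mmu_ctx mmu;        /* drives the shadow (EPT02) tables; walks EPT12 */
	struct mmu_ctx nested_mmu; /* walks the L2 guest's own, non-EPT tables */
	struct mmu_ctx *walk_mmu;  /* whichever walker is active right now */
};

/* Nadav's point: for nested EPT it is the outer mmu, not nested_mmu,
 * that must use the EPT-format walker, because EPT12 is in EPT format
 * while the L2 CR3 tables are not. */
static void nested_ept_setup(struct vcpu_ctx *v)
{
	v->mmu.gva_to_gpa = ept_gva_to_gpa;
	v->nested_mmu.gva_to_gpa = ia32_gva_to_gpa;
	v->walk_mmu = &v->nested_mmu; /* L2 linear addresses use L2's own tables */
}
```

Avi's "look at what ->walk_mmu points at" is the dispatch in the last line: in guest mode the walker pointer is aimed at the nested context, so changing nested_mmu.gva_to_gpa would alter the wrong translation.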
Re: [PATCH 02/10] nEPT: MMU context for nested EPT
On Wed, Nov 23, 2011, Nadav Har'El wrote about "Re: [PATCH 02/10] nEPT: MMU context for nested EPT":
>> +static int nested_ept_init_mmu_context(struct kvm_vcpu *vcpu)
>> +{
>> +	int r = kvm_init_shadow_mmu(vcpu, &vcpu->arch.mmu);
>> +
>> +	vcpu->arch.nested_mmu.gva_to_gpa = EPT_gva_to_gpa_nested;
>> +
>> +	return r;
>> +}
> ...
> I didn't see you actually call this function anywhere - how is it
> supposed to work?
> ...
> It seems we need a fifth case in that function.
> ...

On second thought, why is this modifying nested_mmu.gva_to_gpa, and not
mmu.gva_to_gpa? Isn't the nested_mmu the L2 CR3, which is *not* in EPT
format, and what we really want to change is the outer mmu, which is
EPT12 and is indeed in EPT format?

Or am I missing something?

Thanks,
Nadav.

--
Nadav Har'El                  |                 Wednesday, Nov 23 2011,
n...@math.technion.ac.il      |  Phone +972-523-790466, ICQ 13349191
                              | My password is my dog's name. His name is
http://nadav.harel.org.il     | a#j!4@h, but I change it every month.
Re: [PATCH 02/10] nEPT: MMU context for nested EPT
On Sun, Nov 13, 2011, Orit Wasserman wrote about "Re: [PATCH 02/10] nEPT: MMU context for nested EPT":
> Maybe this patch can help, this is roughly what Avi wants (I hope) done very quickly.
> I'm sorry I don't have a setup to run nested VMX at the moment so I can't test it.

Hi Orit, thanks for the code - I'm now working on incorporating something
based on this into my patch. However, I do have a question:

> +static int nested_ept_init_mmu_context(struct kvm_vcpu *vcpu)
> +{
> +	int r = kvm_init_shadow_mmu(vcpu, &vcpu->arch.mmu);
> +
> +	vcpu->arch.nested_mmu.gva_to_gpa = EPT_gva_to_gpa_nested;
> +
> +	return r;
> +}

I didn't see you actually call this function anywhere - how is it
supposed to work? The way I understand the current code,
kvm_mmu_reset_context() calls init_kvm_mmu() which (in our case) calls
init_kvm_nested_mmu(). I think the above gva_to_gpa setting should be
there - right? It seems we need a fifth case in that function. But at
that point in mmu.c, how will I be able to check if this is the nested
EPT case? Do you have any suggestion?

Thanks,
Nadav.

--
Nadav Har'El                  |                 Wednesday, Nov 23 2011,
n...@math.technion.ac.il      |  Phone +972-523-790466, ICQ 13349191
                              | This message contains 100% recycled
http://nadav.harel.org.il     | characters.
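Nadav's question is about where the dispatch lives. A hypothetical sketch of the kind of "fifth case" he describes: init_kvm_mmu picks a setup routine from vcpu state, and the nested-EPT predicate (which only vmx.c can really answer) selects the extra branch. All names here are illustrative, not the kernel's:

```c
#include <assert.h>

/* Illustrative MMU setup variants; the real init_kvm_mmu dispatches to
 * init_kvm_tdp_mmu / init_kvm_softmmu / init_kvm_nested_mmu. */
enum mmu_kind { MMU_TDP, MMU_SOFT, MMU_NESTED, MMU_NESTED_EPT };

/* Hypothetical condensed vcpu state relevant to the dispatch. */
struct vcpu_flags {
	int tdp_enabled; /* hardware two-dimensional paging available */
	int guest_mode;  /* running an L2 under a nested hypervisor */
	int nested_ept;  /* the L1 hypervisor enabled EPT for this L2 */
};

static enum mmu_kind pick_mmu(const struct vcpu_flags *f)
{
	if (f->guest_mode && f->nested_ept)
		return MMU_NESTED_EPT; /* the "fifth case": EPT-format walker */
	if (f->guest_mode)
		return MMU_NESTED;
	return f->tdp_enabled ? MMU_TDP : MMU_SOFT;
}
```

The open question in the thread maps to the nested_ept flag: generic mmu.c has no VMX-specific state, so vendor code has to either export such a predicate or call the EPT setup routine itself.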
Re: [PATCH 02/10] nEPT: MMU context for nested EPT
On 11/13/2011 08:26 PM, Orit Wasserman wrote:
>>> int kvm_mmu_get_spte_hierarchy(struct kvm_vcpu *vcpu, u64 addr, u64 sptes[4]);
>>> void kvm_mmu_set_mmio_spte_mask(u64 mmio_mask);
>>> int handle_mmio_page_fault_common(struct kvm_vcpu *vcpu, u64 addr, bool direct);
>>> diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
>>> index 507e2b8..70d4cfd 100644
>>> --- a/arch/x86/kvm/paging_tmpl.h
>>> +++ b/arch/x86/kvm/paging_tmpl.h
>>> @@ -39,6 +39,21 @@
>>>  	#define CMPXCHG cmpxchg64
>>>  	#define PT_MAX_FULL_LEVELS 2
>>>  	#endif
>>> +#elif PTTYPE == EPT
>>> +	#define pt_element_t u64
>>> +	#define FNAME(name) EPT_##name
>>> +	#define PT_BASE_ADDR_MASK PT64_BASE_ADDR_MASK
>>> +	#define PT_LVL_ADDR_MASK(lvl) PT64_LVL_ADDR_MASK(lvl)
>>> +	#define PT_LVL_OFFSET_MASK(lvl) PT64_LVL_OFFSET_MASK(lvl)
>>> +	#define PT_INDEX(addr, level) PT64_INDEX(addr, level)
>>> +	#define PT_LEVEL_BITS PT64_LEVEL_BITS
>>> +	#ifdef CONFIG_X86_64
>>> +	#define PT_MAX_FULL_LEVELS 4
>>> +	#define CMPXCHG cmpxchg
>>> +	#else
>>> +	#define CMPXCHG cmpxchg64
>>> +	#define PT_MAX_FULL_LEVELS 2
>>> +	#endif
>>
>> The various masks should be defined here, to avoid lots of #ifdefs later.
>
> That's what I did first, but then I was afraid that the MASK would be
> changed for mmu.c too, so I decided on ifdefs.
> The more I think about it, I think we need a wrapper function for mask
> checking (at least for this file).
> What do you think ?

Either should work, as long as the main logic is clean.

>>>  	for (;;) {
>>>  		gfn_t real_gfn;
>>> @@ -186,9 +215,14 @@ retry_walk:
>>>  		pte_gpa = gfn_to_gpa(table_gfn) + offset;
>>>  		walker->table_gfn[walker->level - 1] = table_gfn;
>>>  		walker->pte_gpa[walker->level - 1] = pte_gpa;
>>> -
>>> +#if PTTYPE == EPT
>>> +		real_gfn = mmu->translate_gpa(vcpu, gfn_to_gpa(table_gfn),
>>> +					      EPT_WRITABLE_MASK);
>>> +#else
>>>  		real_gfn = mmu->translate_gpa(vcpu, gfn_to_gpa(table_gfn),
>>>  					      PFERR_USER_MASK|PFERR_WRITE_MASK);
>>> +#endif
>>> +
>>
>> Unneeded, I think.
>
> Is it because translate_nested_gpa always sets USER_MASK ?

Yes... maybe that function needs to do something like

	access |= mmu->default_access;

--
error compiling committee.c: too many arguments to function
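Avi's `access |= mmu->default_access;` suggestion can be sketched in a few lines (illustrative C, not the kernel's actual signatures): the nested translation helper folds a per-mmu default into whatever access the caller passes, so an EPT-format walk doesn't have to fabricate PFERR_USER_MASK itself and the #ifdef around translate_gpa goes away.

```c
#include <stdint.h>
#include <assert.h>

/* Bit values chosen to match the x86 page-fault error code layout. */
#define PFERR_WRITE_MASK 0x2u
#define PFERR_USER_MASK  0x4u

/* Hypothetical slice of struct kvm_mmu: the default access bits that
 * every nested translation through this mmu should carry. */
struct nmmu {
	uint32_t default_access;
};

/* Sketch of the suggestion: translate_nested_gpa would OR in the
 * default before doing the permission walk, instead of every caller
 * passing PFERR_USER_MASK explicitly. */
static uint32_t nested_access(const struct nmmu *mmu, uint32_t access)
{
	return access | mmu->default_access;
}
```

With default_access set to PFERR_USER_MASK for the ia32 case (and to nothing for EPT, which has no user/supervisor distinction), both walker instantiations can call translate_gpa identically.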
Re: [PATCH 02/10] nEPT: MMU context for nested EPT
On 11/13/2011 04:32 PM, Avi Kivity wrote:
> On 11/13/2011 01:30 PM, Orit Wasserman wrote:
>> Maybe this patch can help, this is roughly what Avi wants (I hope) done very quickly.
>> I'm sorry I don't have a setup to run nested VMX at the moment so I can't test it.
>>
>> Orit
>>
>> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
>> index 9335e1b..bbe212f 100644
>> --- a/arch/x86/kvm/mmu.c
>> +++ b/arch/x86/kvm/mmu.c
>> @@ -3180,6 +3180,10 @@ static bool sync_mmio_spte(u64 *sptep, gfn_t gfn, unsigned access,
>>  #include "paging_tmpl.h"
>>  #undef PTTYPE
>>
>> +#define PTTYPE EPT
>> +#include "paging_tmpl.h"
>> +#undef PTTYPE
>> +
>
> Yes, that's the key.
>
>>  int kvm_mmu_get_spte_hierarchy(struct kvm_vcpu *vcpu, u64 addr, u64 sptes[4]);
>>  void kvm_mmu_set_mmio_spte_mask(u64 mmio_mask);
>>  int handle_mmio_page_fault_common(struct kvm_vcpu *vcpu, u64 addr, bool direct);
>> diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
>> index 507e2b8..70d4cfd 100644
>> --- a/arch/x86/kvm/paging_tmpl.h
>> +++ b/arch/x86/kvm/paging_tmpl.h
>> @@ -39,6 +39,21 @@
>>  	#define CMPXCHG cmpxchg64
>>  	#define PT_MAX_FULL_LEVELS 2
>>  	#endif
>> +#elif PTTYPE == EPT
>> +	#define pt_element_t u64
>> +	#define FNAME(name) EPT_##name
>> +	#define PT_BASE_ADDR_MASK PT64_BASE_ADDR_MASK
>> +	#define PT_LVL_ADDR_MASK(lvl) PT64_LVL_ADDR_MASK(lvl)
>> +	#define PT_LVL_OFFSET_MASK(lvl) PT64_LVL_OFFSET_MASK(lvl)
>> +	#define PT_INDEX(addr, level) PT64_INDEX(addr, level)
>> +	#define PT_LEVEL_BITS PT64_LEVEL_BITS
>> +	#ifdef CONFIG_X86_64
>> +	#define PT_MAX_FULL_LEVELS 4
>> +	#define CMPXCHG cmpxchg
>> +	#else
>> +	#define CMPXCHG cmpxchg64
>> +	#define PT_MAX_FULL_LEVELS 2
>> +	#endif
>
> The various masks should be defined here, to avoid lots of #ifdefs later.

That's what I did first, but then I was afraid that the MASK would be
changed for mmu.c too, so I decided on ifdefs.
The more I think about it, I think we need a wrapper function for mask
checking (at least for this file).
What do you think ?

>>  #elif PTTYPE == 32
>>  #define pt_element_t u32
>>  #define guest_walker guest_walker32
>> @@ -106,14 +121,19 @@ static unsigned FNAME(gpte_access)(struct kvm_vcpu *vcpu, pt_element_t gpte,
>>  {
>>  	unsigned access;
>>
>> +#if PTTYPE == EPT
>>  	access = (gpte & (PT_WRITABLE_MASK | PT_USER_MASK)) | ACC_EXEC_MASK;
>> +#else
>> +	access = (gpte & EPT_WRITABLE_MASK) | EPT_EXEC_MASK;
>>  	if (last && !is_dirty_gpte(gpte))
>>  		access &= ~ACC_WRITE_MASK;
>> +#endif
>
> Like here, you could make is_dirty_gpte() local to paging_tmpl.h,
> returning true for EPT and the dirty bit otherwise.
>
>>
>>  #if PTTYPE == 64
>>  	if (vcpu->arch.mmu.nx)
>>  		access &= ~(gpte >> PT64_NX_SHIFT);
>
> The ept X bit is lost.
>
> Could do something like
>
> 	access &= (gpte >> PT_X_NX_SHIFT) ^ PT_X_NX_SENSE;
>
>> +#if PTTYPE == EPT
>> +	const int write_fault = access & EPT_WRITABLE_MASK;
>> +	const int user_fault = 0;
>> +	const int fetch_fault = 0;
>> +#else
>
> EPT has fetch permissions (but not user permissions); anyway
> translate_nested_gpa() already does this.
>
>>  	const int write_fault = access & PFERR_WRITE_MASK;
>>  	const int user_fault = access & PFERR_USER_MASK;
>>  	const int fetch_fault = access & PFERR_FETCH_MASK;
>> +#endif
>>  	u16 errcode = 0;
>>
>>  	trace_kvm_mmu_pagetable_walk(addr, write_fault, user_fault,
>> @@ -174,6 +200,9 @@ retry_walk:
>>  	       (mmu->get_cr3(vcpu) & CR3_NONPAE_RESERVED_BITS) == 0);
>>
>>  	pt_access = ACC_ALL;
>> +#if PTTYPE == EPT
>> +	pt_access = PT_PRESENT_MASK | EPT_WRITABLE_MASK | EPT_EXEC_MASK;
>> +#endif
>
> pt_access is not in EPT or ia32 format - it's our own format (xwu). So
> this doesn't need changing. Updating gpte_access() is sufficient.
>
>>  	for (;;) {
>>  		gfn_t real_gfn;
>> @@ -186,9 +215,14 @@ retry_walk:
>>  		pte_gpa = gfn_to_gpa(table_gfn) + offset;
>>  		walker->table_gfn[walker->level - 1] = table_gfn;
>>  		walker->pte_gpa[walker->level - 1] = pte_gpa;
>> -
>> +#if PTTYPE == EPT
>> +		real_gfn = mmu->translate_gpa(vcpu, gfn_to_gpa(table_gfn),
>> +					      EPT_WRITABLE_MASK);
>> +#else
>>  		real_gfn = mmu->translate_gpa(vcpu, gfn_to_gpa(table_gfn),
>>  					      PFERR_USER_MASK|PFERR_WRITE_MASK);
>> +#endif
>> +
>
> Unneeded, I think.

Is it because translate_nested_gpa always sets USER_MASK ?

>>  		if (unlikely(real_gfn == UNMAPPED_GVA))
>>  			goto error;
>>  		real_gfn = gpa_to_gfn(real_gfn);
>> @@ -221,6 +255,7 @@ retry_walk:
>>  			eperm = true;
>>  #endif
>>
>> +#if PTTYPE != EPT
>>  		if (!eperm && unlikely(!(pte
Re: [PATCH 02/10] nEPT: MMU context for nested EPT
On 11/13/2011 01:30 PM, Orit Wasserman wrote:
> Maybe this patch can help, this is roughly what Avi wants (I hope) done very quickly.
> I'm sorry I don't have a setup to run nested VMX at the moment so I can't test it.
>
> Orit
>
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 9335e1b..bbe212f 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -3180,6 +3180,10 @@ static bool sync_mmio_spte(u64 *sptep, gfn_t gfn, unsigned access,
>  #include "paging_tmpl.h"
>  #undef PTTYPE
>
> +#define PTTYPE EPT
> +#include "paging_tmpl.h"
> +#undef PTTYPE
> +

Yes, that's the key.

>  int kvm_mmu_get_spte_hierarchy(struct kvm_vcpu *vcpu, u64 addr, u64 sptes[4]);
>  void kvm_mmu_set_mmio_spte_mask(u64 mmio_mask);
>  int handle_mmio_page_fault_common(struct kvm_vcpu *vcpu, u64 addr, bool direct);
> diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
> index 507e2b8..70d4cfd 100644
> --- a/arch/x86/kvm/paging_tmpl.h
> +++ b/arch/x86/kvm/paging_tmpl.h
> @@ -39,6 +39,21 @@
>  	#define CMPXCHG cmpxchg64
>  	#define PT_MAX_FULL_LEVELS 2
>  	#endif
> +#elif PTTYPE == EPT
> +	#define pt_element_t u64
> +	#define FNAME(name) EPT_##name
> +	#define PT_BASE_ADDR_MASK PT64_BASE_ADDR_MASK
> +	#define PT_LVL_ADDR_MASK(lvl) PT64_LVL_ADDR_MASK(lvl)
> +	#define PT_LVL_OFFSET_MASK(lvl) PT64_LVL_OFFSET_MASK(lvl)
> +	#define PT_INDEX(addr, level) PT64_INDEX(addr, level)
> +	#define PT_LEVEL_BITS PT64_LEVEL_BITS
> +	#ifdef CONFIG_X86_64
> +	#define PT_MAX_FULL_LEVELS 4
> +	#define CMPXCHG cmpxchg
> +	#else
> +	#define CMPXCHG cmpxchg64
> +	#define PT_MAX_FULL_LEVELS 2
> +	#endif

The various masks should be defined here, to avoid lots of #ifdefs later.

>  #elif PTTYPE == 32
>  #define pt_element_t u32
>  #define guest_walker guest_walker32
> @@ -106,14 +121,19 @@ static unsigned FNAME(gpte_access)(struct kvm_vcpu *vcpu, pt_element_t gpte,
>  {
>  	unsigned access;
>
> +#if PTTYPE == EPT
>  	access = (gpte & (PT_WRITABLE_MASK | PT_USER_MASK)) | ACC_EXEC_MASK;
> +#else
> +	access = (gpte & EPT_WRITABLE_MASK) | EPT_EXEC_MASK;
>  	if (last && !is_dirty_gpte(gpte))
>  		access &= ~ACC_WRITE_MASK;
> +#endif

Like here, you could make is_dirty_gpte() local to paging_tmpl.h,
returning true for EPT and the dirty bit otherwise.

>
>  #if PTTYPE == 64
>  	if (vcpu->arch.mmu.nx)
>  		access &= ~(gpte >> PT64_NX_SHIFT);

The ept X bit is lost.

Could do something like

	access &= (gpte >> PT_X_NX_SHIFT) ^ PT_X_NX_SENSE;

> +#if PTTYPE == EPT
> +	const int write_fault = access & EPT_WRITABLE_MASK;
> +	const int user_fault = 0;
> +	const int fetch_fault = 0;
> +#else

EPT has fetch permissions (but not user permissions); anyway
translate_nested_gpa() already does this.

>  	const int write_fault = access & PFERR_WRITE_MASK;
>  	const int user_fault = access & PFERR_USER_MASK;
>  	const int fetch_fault = access & PFERR_FETCH_MASK;
> +#endif
>  	u16 errcode = 0;
>
>  	trace_kvm_mmu_pagetable_walk(addr, write_fault, user_fault,
> @@ -174,6 +200,9 @@ retry_walk:
>  	       (mmu->get_cr3(vcpu) & CR3_NONPAE_RESERVED_BITS) == 0);
>
>  	pt_access = ACC_ALL;
> +#if PTTYPE == EPT
> +	pt_access = PT_PRESENT_MASK | EPT_WRITABLE_MASK | EPT_EXEC_MASK;
> +#endif

pt_access is not in EPT or ia32 format - it's our own format (xwu). So
this doesn't need changing. Updating gpte_access() is sufficient.

>
>  	for (;;) {
>  		gfn_t real_gfn;
> @@ -186,9 +215,14 @@ retry_walk:
>  		pte_gpa = gfn_to_gpa(table_gfn) + offset;
>  		walker->table_gfn[walker->level - 1] = table_gfn;
>  		walker->pte_gpa[walker->level - 1] = pte_gpa;
> -
> +#if PTTYPE == EPT
> +		real_gfn = mmu->translate_gpa(vcpu, gfn_to_gpa(table_gfn),
> +					      EPT_WRITABLE_MASK);
> +#else
>  		real_gfn = mmu->translate_gpa(vcpu, gfn_to_gpa(table_gfn),
>  					      PFERR_USER_MASK|PFERR_WRITE_MASK);
> +#endif
> +

Unneeded, I think.

>  		if (unlikely(real_gfn == UNMAPPED_GVA))
>  			goto error;
>  		real_gfn = gpa_to_gfn(real_gfn);
> @@ -221,6 +255,7 @@ retry_walk:
>  			eperm = true;
>  #endif
>
> +#if PTTYPE != EPT
>  		if (!eperm && unlikely(!(pte & PT_ACCESSED_MASK))) {
>  			int ret;
>  			trace_kvm_mmu_set_accessed_bit(table_gfn, index,
> @@ -235,7 +270,7 @@ retry_walk:
>  			mark_page_dirty(vcpu->kvm, table_gfn);
>  			pte |= PT_ACCESSED_MASK;
>  		}
> -
> +#endif

If PT_ACCESSED_MASK is 0 for EPT, this goes away without #ifdef.

> +#if PTTYPE != EPT
>  	/* check if the kernel is fetching from user page */
>  	if (unlikely(pte
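Avi's one-liner `access &= (gpte >> PT_X_NX_SHIFT) ^ PT_X_NX_SENSE;` is schematic. Spelled out, the idea is to normalize the execute bit so that 1 always means "executable", whether the format has a positive-sense X bit (EPT, bit 2) or a negative-sense NX bit (ia32e, bit 63). A hedged sketch of that normalization; the ACC_EXEC_MASK value and helper name are illustrative, not KVM's:

```c
#include <stdint.h>
#include <assert.h>

/* Illustrative internal access bit (KVM's own "xwu" format). */
#define ACC_EXEC_MASK 0x1u

/* Bring the format's X/NX bit down to bit 0, XOR with a per-format
 * "sense" so that 1 always means executable, then keep or drop the
 * exec permission accordingly.
 *   ia32e: NX is bit 63, set means NOT executable -> sense = 1
 *   EPT:   X  is bit 2,  set means executable     -> sense = 0 */
static uint32_t apply_exec_permission(uint32_t access, uint64_t gpte,
				      int x_nx_shift, uint64_t x_nx_sense)
{
	uint64_t executable = ((gpte >> x_nx_shift) & 1) ^ x_nx_sense;

	if (!executable)
		access &= ~ACC_EXEC_MASK;
	return access;
}
```

Both walker instantiations can then share one line of gpte_access(), with the shift and sense supplied by the per-PTTYPE defines, which is exactly the "define the masks once at the top" style Avi asks for earlier in the review.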