Re: KVM internal error. Suberror: 1, emulation failure
Il 17/07/2013 18:16, Dave Hansen ha scritto: I'm causing qemu to spew these emulation failure messages until I kill it. The guest kernel being run has been hacked up pretty heavily and is probably either accessing bad physical addresses (above the address ranges in the e820 table) or trying to DMA to bad addresses. What I'd really like qemu to be doing is trapping back in to the guest kernel to have it handle this issue. Then I'd have a better chance of dumping out some debugging information to see where I went wrong. This is happening because the kernel is executing a PCMPEQB instruction on an invalid memory address. This instruction is not yet emulated by KVM. If you want QEMU to trap back to the guest kernel, you can add emulation of the instruction to arch/x86/kvm/emulate.c. If you do not really care about the guest doing something sane, you can use a stub emulation function that is just return emulate_ud(ctxt). That alone could be a good starting point to attach a kernel debugger to the guest. Paolo host kernel: 3.10 guest kernel: Linus commit d2b4a64 + patches qemu: v1.4.0-2835-g6453a3a KVM internal error. Suberror: 1 emulation failure RAX= RBX=013c0410 RCX=0010 RDX=0010 RSI=000a RDI=7f6d256a73c0 RBP= RSP=7fffe2720ce8 R8 = R9 = R10=0022 R11=0246 R12=7fffe2720d58 R13=0400 R14=7f6d256a7000 R15= RIP=7f6d24c5a50e RFL=00010202 [---] CPL=3 II=0 A20=1 SMM=0 HLT=0 ES = 00c0 CS =0033 00a0fb00 DPL=3 CS64 [-RA] SS =002b 00c0f300 DPL=3 DS [-WA] DS = 00c0 FS = 7f6d2569d740 00c0 GS = 00c0 LDT= 00c0 TR =0040 88007b190480 2087 8b00 DPL=0 TSS64-busy GDT= 88007b184000 007f IDT= ff57a000 0fff CR0=80050033 CR2=7f6d256a7000 CR3=6f13b000 CR4=06e0 DR0= DR1= DR2= DR3= DR6=0ff0 DR7=0400 EFER=0d01 Code=d7 c3 85 c0 0f 85 bc 00 00 00 48 83 ea 10 0f 8e d2 00 00 00 66 0f 74 4f 30 66 0f d7 c1 85 c0 0f 85 b1 00 00 00 48 31 c0 c3 66 66 66 66 2e 0f 1f 84 00 KVM internal error. Suberror: 1 emulation failure RAX= RBX=013c0410 RCX=0010 RDX=0010 RSI=000a RDI=7f6d256a73c0 RBP= RSP=7fffe2720ce8 R8 = R9 = R10=0022 R11=0246 R12=7fffe2720d58 R13=0400 R14=7f6d256a7000 R15= RIP=7f6d24c5a50e RFL=00010202 [---] CPL=3 II=0 A20=1 SMM=0 HLT=0 ES = 00c0 CS =0033 00a0fb00 DPL=3 CS64 [-RA] SS =002b 00c0f300 DPL=3 DS [-WA] DS = 00c0 FS = 7f6d2569d740 00c0 GS = 00c0 LDT= 00c0 TR =0040 88007b190480 2087 8b00 DPL=0 TSS64-busy GDT= 88007b184000 007f IDT= ff57a000 0fff CR0=80050033 CR2=7f6d256a7000 CR3=6f13b000 CR4=06e0 DR0= DR1= DR2= DR3= DR6=0ff0 DR7=0400 EFER=0d01 Code=d7 c3 85 c0 0f 85 bc 00 00 00 48 83 ea 10 0f 8e d2 00 00 00 66 0f 74 4f 30 66 0f d7 c1 85 c0 0f 85 b1 00 00 00 48 31 c0 c3 66 66 66 66 2e 0f 1f 84 00 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] KVM: MMU: avoid fast page fault fixing mmio page fault
On 07/18/2013 01:31 PM, Gleb Natapov wrote: On Thu, Jul 18, 2013 at 12:52:37PM +0800, Xiao Guangrong wrote: Currently, fast page fault tries to fix mmio page fault when the generation number is invalid (spte.gen != kvm.gen) and returns to guest to retry the fault since it sees the last spte is nonpresent which causes infinity loop It can be triggered only on AMD host since the mmio page fault is recognized as ept-misconfig We still call into regular page fault handler from ept-misconfig handler, but fake zero error_code we provide makes page_fault_can_be_fast() return false. Yes. Shouldn't shadow paging trigger this too? I haven't encountered this on Intel without ept. Since currently fast page fault only works for direct mmu. :) -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] KVM: MMU: avoid fast page fault fixing mmio page fault
On Thu, Jul 18, 2013 at 02:01:47PM +0800, Xiao Guangrong wrote: On 07/18/2013 01:31 PM, Gleb Natapov wrote: On Thu, Jul 18, 2013 at 12:52:37PM +0800, Xiao Guangrong wrote: Currently, fast page fault tries to fix mmio page fault when the generation number is invalid (spte.gen != kvm.gen) and returns to guest to retry the fault since it sees the last spte is nonpresent which causes infinity loop It can be triggered only on AMD host since the mmio page fault is recognized as ept-misconfig We still call into regular page fault handler from ept-misconfig handler, but fake zero error_code we provide makes page_fault_can_be_fast() return false. Yes. Shouldn't shadow paging trigger this too? I haven't encountered this on Intel without ept. Since currently fast page fault only works for direct mmu. :) Ah, yes. So with shadow page and paging disabled in a guest is can happen eventually, but we do not trigger it for some reason? -- Gleb. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/2] kvm: powerpc: Do not ignore E attribute in mas2
Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/kvm/e500.h |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/arch/powerpc/kvm/e500.h b/arch/powerpc/kvm/e500.h index c2e5e98..277cb18 100644 --- a/arch/powerpc/kvm/e500.h +++ b/arch/powerpc/kvm/e500.h @@ -117,7 +117,7 @@ static inline struct kvmppc_vcpu_e500 *to_e500(struct kvm_vcpu *vcpu) #define E500_TLB_USER_PERM_MASK (MAS3_UX|MAS3_UR|MAS3_UW) #define E500_TLB_SUPER_PERM_MASK (MAS3_SX|MAS3_SR|MAS3_SW) #define MAS2_ATTRIB_MASK \ - (MAS2_X0 | MAS2_X1) + (MAS2_X0 | MAS2_X1 | MAS2_E) #define MAS3_ATTRIB_MASK \ (MAS3_U0 | MAS3_U1 | MAS3_U2 | MAS3_U3 \ | E500_TLB_USER_PERM_MASK | E500_TLB_SUPER_PERM_MASK) -- 1.7.0.4 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
If there is a struct page for the requested mapping then it's normal DDR and the mapping sets M bit (coherent, cacheable) else this is treated as I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/kvm/e500_mmu_host.c | 17 - 1 files changed, 12 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..089c227 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { + u32 mas2_attr; + + mas2_attr = mas2 MAS2_ATTRIB_MASK; + + if (!pfn_valid(pfn)) { + mas2_attr |= MAS2_I | MAS2_G; + } else { #ifdef CONFIG_SMP - return (mas2 MAS2_ATTRIB_MASK) | MAS2_M; -#else - return mas2 MAS2_ATTRIB_MASK; + mas2_attr |= MAS2_M; #endif + } + return mas2_attr; } /* @@ -313,7 +320,7 @@ static void kvmppc_e500_setup_stlbe( /* Force IPROT=0 for all guest mappings. */ stlbe-mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID; stlbe-mas2 = (gvaddr MAS2_EPN) | - e500_shadow_mas2_attrib(gtlbe-mas2, pr); + e500_shadow_mas2_attrib(gtlbe-mas2, pfn); stlbe-mas7_3 = ((u64)pfn PAGE_SHIFT) | e500_shadow_mas3_attrib(gtlbe-mas7_3, pr); -- 1.7.0.4 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
On 07/18/2013 02:04 PM, Bharat Bhushan wrote: If there is a struct page for the requested mapping then it's normal DDR and the mapping sets M bit (coherent, cacheable) else this is treated as I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/kvm/e500_mmu_host.c | 17 - 1 files changed, 12 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..089c227 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { + u32 mas2_attr; + + mas2_attr = mas2 MAS2_ATTRIB_MASK; + + if (!pfn_valid(pfn)) { Why not directly use kvm_is_mmio_pfn()? Tiejun + mas2_attr |= MAS2_I | MAS2_G; + } else { #ifdef CONFIG_SMP - return (mas2 MAS2_ATTRIB_MASK) | MAS2_M; -#else - return mas2 MAS2_ATTRIB_MASK; + mas2_attr |= MAS2_M; #endif + } + return mas2_attr; } /* @@ -313,7 +320,7 @@ static void kvmppc_e500_setup_stlbe( /* Force IPROT=0 for all guest mappings. */ stlbe-mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID; stlbe-mas2 = (gvaddr MAS2_EPN) | - e500_shadow_mas2_attrib(gtlbe-mas2, pr); + e500_shadow_mas2_attrib(gtlbe-mas2, pfn); stlbe-mas7_3 = ((u64)pfn PAGE_SHIFT) | e500_shadow_mas3_attrib(gtlbe-mas7_3, pr); -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] KVM: MMU: avoid fast page fault fixing mmio page fault
On 07/18/2013 02:06 PM, Gleb Natapov wrote: On Thu, Jul 18, 2013 at 02:01:47PM +0800, Xiao Guangrong wrote: On 07/18/2013 01:31 PM, Gleb Natapov wrote: On Thu, Jul 18, 2013 at 12:52:37PM +0800, Xiao Guangrong wrote: Currently, fast page fault tries to fix mmio page fault when the generation number is invalid (spte.gen != kvm.gen) and returns to guest to retry the fault since it sees the last spte is nonpresent which causes infinity loop It can be triggered only on AMD host since the mmio page fault is recognized as ept-misconfig We still call into regular page fault handler from ept-misconfig handler, but fake zero error_code we provide makes page_fault_can_be_fast() return false. Yes. Shouldn't shadow paging trigger this too? I haven't encountered this on Intel without ept. Since currently fast page fault only works for direct mmu. :) Ah, yes. So with shadow page and paging disabled in a guest is can happen eventually, but we do not trigger it for some reason? Yes. I guess so, paging disable is short-lived and the sptes will be invalid after memslot changed for 150 times, so it is hard to be triggered. I should update this to the changelog, thanks for your reminder, Gleb. == [PATCH] KVM: MMU: avoid fast page fault fixing mmio page fault Currently, fast page fault tries to fix mmio page fault when the generation number is invalid (spte.gen != kvm.gen) and returns to guest to retry the fault since it sees the last spte is nonpresent. It causes infinity loop Since fast page fault only works for direct mmu, the issue exists when 1) tdp is enabled. It is only triggered only on AMD host since on Intel host the mmio page fault is recognized as ept-misconfig whose handler call fault-page path with error_code = 0 2) guest paging is disabled. Under this case, the issue is hardly discovered since paging disable is short-lived and the sptes will be invalid after memslot changed for 150 times Fix it by filtering the mmio page fault out in page_fault_can_be_fast Reported-by: Markus Trippelsdorf mar...@trippelsdorf.de Tested-by: Markus Trippelsdorf mar...@trippelsdorf.de Signed-off-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com --- arch/x86/kvm/mmu.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index bf7af1e..3a9493a 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -2811,6 +2811,13 @@ exit: static bool page_fault_can_be_fast(struct kvm_vcpu *vcpu, u32 error_code) { /* +* Do not fix the mmio spte with invalid generation number which +* need to be updated by slow page fault path. +*/ + if (unlikely(error_code PFERR_RSVD_MASK)) + return false; + + /* * #PF can be fast only if the shadow page table is present and it * is caused by write-protect, that means we just need change the * W bit of the spte which can be done out of mmu-lock. -- 1.8.1.4 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] KVM: MMU: avoid fast page fault fixing mmio page fault
On Thu, Jul 18, 2013 at 02:25:19PM +0800, Xiao Guangrong wrote: On 07/18/2013 02:06 PM, Gleb Natapov wrote: On Thu, Jul 18, 2013 at 02:01:47PM +0800, Xiao Guangrong wrote: On 07/18/2013 01:31 PM, Gleb Natapov wrote: On Thu, Jul 18, 2013 at 12:52:37PM +0800, Xiao Guangrong wrote: Currently, fast page fault tries to fix mmio page fault when the generation number is invalid (spte.gen != kvm.gen) and returns to guest to retry the fault since it sees the last spte is nonpresent which causes infinity loop It can be triggered only on AMD host since the mmio page fault is recognized as ept-misconfig We still call into regular page fault handler from ept-misconfig handler, but fake zero error_code we provide makes page_fault_can_be_fast() return false. Yes. Shouldn't shadow paging trigger this too? I haven't encountered this on Intel without ept. Since currently fast page fault only works for direct mmu. :) Ah, yes. So with shadow page and paging disabled in a guest is can happen eventually, but we do not trigger it for some reason? Yes. I guess so, paging disable is short-lived and the sptes will be invalid after memslot changed for 150 times, so it is hard to be triggered. I should update this to the changelog, thanks for your reminder, Gleb. == [PATCH] KVM: MMU: avoid fast page fault fixing mmio page fault Currently, fast page fault tries to fix mmio page fault when the generation number is invalid (spte.gen != kvm.gen) and returns to guest to retry the fault since it sees the last spte is nonpresent. It causes infinity loop Since fast page fault only works for direct mmu, the issue exists when 1) tdp is enabled. It is only triggered only on AMD host since on Intel host the mmio page fault is recognized as ept-misconfig whose handler call fault-page path with error_code = 0 2) guest paging is disabled. Under this case, the issue is hardly discovered since paging disable is short-lived and the sptes will be invalid after memslot changed for 150 times Fix it by filtering the mmio page fault out in page_fault_can_be_fast Reported-by: Markus Trippelsdorf mar...@trippelsdorf.de Tested-by: Markus Trippelsdorf mar...@trippelsdorf.de Signed-off-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com Reviewed-by: Gleb Natapov g...@redhat.com --- arch/x86/kvm/mmu.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index bf7af1e..3a9493a 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -2811,6 +2811,13 @@ exit: static bool page_fault_can_be_fast(struct kvm_vcpu *vcpu, u32 error_code) { /* + * Do not fix the mmio spte with invalid generation number which + * need to be updated by slow page fault path. + */ + if (unlikely(error_code PFERR_RSVD_MASK)) + return false; + + /* * #PF can be fast only if the shadow page table is present and it * is caused by write-protect, that means we just need change the * W bit of the spte which can be done out of mmu-lock. -- 1.8.1.4 -- Gleb. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: KVM internal error. Suberror: 1, emulation failure
On Thu, Jul 18, 2013 at 07:58:31AM +0200, Paolo Bonzini wrote: Il 17/07/2013 18:16, Dave Hansen ha scritto: I'm causing qemu to spew these emulation failure messages until I kill it. The guest kernel being run has been hacked up pretty heavily and is probably either accessing bad physical addresses (above the address ranges in the e820 table) or trying to DMA to bad addresses. What I'd really like qemu to be doing is trapping back in to the guest kernel to have it handle this issue. Then I'd have a better chance of dumping out some debugging information to see where I went wrong. This is happening because the kernel is executing a PCMPEQB instruction on an invalid memory address. This instruction is not yet emulated by KVM. If you want QEMU to trap back to the guest kernel, you can add emulation of the instruction to arch/x86/kvm/emulate.c. If you do not really care about the guest doing something sane, you can use a stub emulation function that is just return emulate_ud(ctxt). That alone could be a good starting point to attach a kernel debugger to the guest. This is the behaviour that he currently gets (assuming there is no bug somewhere, run ftrace to check), see my other reply. Not sure what he does in his #UD handler that emulation error reappear. Restart offending process? -- Gleb. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: windows 2008 falling asleep under KVM
Hi, just to let You know, with realtek instead of virtio-net, it seems to be reachable for almost three days.. I'd give it another day, then we can be pretty sure it's virtio related... BR nik -- - Ing. Nikola CIPRICH LinuxBox.cz, s.r.o. 28.rijna 168, 709 00 Ostrava tel.: +420 591 166 214 fax:+420 596 621 273 mobil: +420 777 093 799 www.linuxbox.cz mobil servis: +420 737 238 656 email servis: ser...@linuxbox.cz - pgpRgZflQSLxE.pgp Description: PGP signature
Re: windows 2008 falling asleep under KVM
On Thu, Jul 18, 2013 at 08:55:25AM +0200, Nikola Ciprich wrote: Hi, just to let You know, with realtek instead of virtio-net, it seems to be reachable for almost three days.. I'd give it another day, then we can be pretty sure it's virtio related... Thanks. Interesting. Copying virtio-net expert. -- Gleb. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
-Original Message- From: “tiejun.chen” [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 11:56 AM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421; Bhushan Bharat-R65777 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 02:04 PM, Bharat Bhushan wrote: If there is a struct page for the requested mapping then it's normal DDR and the mapping sets M bit (coherent, cacheable) else this is treated as I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/kvm/e500_mmu_host.c | 17 - 1 files changed, 12 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..089c227 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { + u32 mas2_attr; + + mas2_attr = mas2 MAS2_ATTRIB_MASK; + + if (!pfn_valid(pfn)) { Why not directly use kvm_is_mmio_pfn()? What I understand from this function (someone can correct me) is that it returns false when the page is managed by kernel and is not marked as RESERVED (for some reason). For us it does not matter whether the page is reserved or not, if it is kernel visible page then it is DDR. -Bharat Tiejun + mas2_attr |= MAS2_I | MAS2_G; + } else { #ifdef CONFIG_SMP - return (mas2 MAS2_ATTRIB_MASK) | MAS2_M; -#else - return mas2 MAS2_ATTRIB_MASK; + mas2_attr |= MAS2_M; #endif + } + return mas2_attr; } /* @@ -313,7 +320,7 @@ static void kvmppc_e500_setup_stlbe( /* Force IPROT=0 for all guest mappings. */ stlbe-mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID; stlbe-mas2 = (gvaddr MAS2_EPN) | - e500_shadow_mas2_attrib(gtlbe-mas2, pr); + e500_shadow_mas2_attrib(gtlbe-mas2, pfn); stlbe-mas7_3 = ((u64)pfn PAGE_SHIFT) | e500_shadow_mas3_attrib(gtlbe-mas7_3, pr); N�r��yb�X��ǧv�^�){.n�+h����ܨ}���Ơz�j:+v���zZ+��+zf���h���~i���z��w���?��)ߢf
Re: [RFC PATCH] kvm-unit-tests : Basic architecture of VMX nested test case
On Thu, Jul 18, 2013 at 07:52:21AM +0200, Paolo Bonzini wrote: Il 17/07/2013 20:54, Arthur Chunqi Li ha scritto: + .globl entry_sysenter\n\t + entry_sysenter:\n\t + SAVE_GPR + and $0xf, %rax\n\t + push%rax\n\t push should be wrong here, the first argument is in %rdi. + callsyscall_handler\n\t + LOAD_GPR + vmresume\n\t +); + +int exit_handler() This (and syscall_handler too) needs __attribute__((__used__)) because it's only used from assembly. That was my question actually, why is it used from assembly? Calling it from see should not be a problem. Please add static to all functions except main, it triggers better compiler optimization and warnings and it will avoid nasty surprises in the future if the compiler becomes smarter. +{ + int ret; + + current-exits++; + current-guest_regs = regs; + if (is_hypercall()) + ret = handle_hypercall(); + else + ret = current-exit_handler(); + regs = current-guest_regs; + switch (ret) { + case VMX_TEST_VMEXIT: + case VMX_TEST_RESUME: + return ret; + case VMX_TEST_EXIT: + break; + default: + printf(ERROR : Invalid exit_handler return val %d.\n + , ret); + } Here have to propagate the exit codes through multiple levels, because we're not using setjmp/longjmp. Not a big deal. My worries with this version are below. + print_vmexit_info(); + exit(-1); + return 0; +} + +int vmx_run() +{ + bool ret; + u32 eax; + u64 rsp; + + asm volatile(mov %%rsp, %0\n\t : =r(rsp)); + vmcs_write(HOST_RSP, rsp); + asm volatile(vmlaunch;seta %0\n\t : =m(ret)); setbe (this path is probably untested because it never happens in practice). At least one of the tests should set up wrong vmcs state to verify that nested vmx handles it correctly. In fact one of the patches that Arthur have sent to nested vmx fixes exactly that code path. Also, missing memory clobber. + if (ret) { + printf(%s : vmlaunch failed.\n, __func__); + return 1; + } + while (1) { + asm volatile( + LOAD_GPR_C + vmresume;seta %0\n\t + : =m(ret)); setbe and missing memory clobber. + if (ret) { + printf(%s : vmresume failed.\n, __func__); + return 1; + } + asm volatile( + .global vmx_return\n\t .global should not be needed. + vmx_return:\n\t + SAVE_GPR_C + call exit_handler\n\t + mov %%eax, %0\n\t + : =r(eax) + ); I had written a long explanation here about why I don't trust the compiler to do the right thing, and ideas about how to fix that. But in the end the only workable solution is a single assembly blob like vmx.c in KVM to do vmlaunch/vmresume, so I'll get right to the point: * the similarity with KVM code and as little assembly as * * possible goals are mutually exclusive * I never said that code should be similar to KVM code or have as little assembly as possible :) Reread the thread. The only thing I asked for is to make code flow linear, if it makes code looks more like KVM or reduce amount of assembly this is just a bonus. and for a testsuite I'd prefer the latter---which means I'd still favor setjmp/longjmp. Now, here is the long explanation. I must admit that the code looks nice. There are some nits I'd like to see done differently (such as putting vmx_return at the beginning of the while (1), and the vmresume asm at the end), but it is indeed nice. Why do you prefer setjmp/longjmp then? However, it is also very deceiving, because the processor is not doing what the code says. If I see: vmlaunch(); exit(-1); it is clear that magic happens in vmlaunch(). If I see asm volatile(vmlaunch;setbe %0\n\t : =m(ret)); if (ret) { ... } asm(vmx_return:) it is absolutely not clear that the setbe and if are skipped on a successful vmlaunch. So, at the very least you need a comment before those if (ret) to document the control flow, or you can use a jmp like this: asm volatile goto (vmlaunch;jmp %0\n\t : : : memory : vmlaunch_error); if (0) { vmlaunch_error: ... } The unconditional jump and asm goto make it clear that magic happens. Agree, I dislike this magic too, but this is fixed by you suggestion above about putting vmx_return at the beginning of while(). So the logic will looks like that: asm volatile(vmlaunch;setbe %0\n\t : =m(ret)); while(ret) { vmx_return: SAVE_GPR_C eax = exit_handler(); switch(eax) { }
Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: “tiejun.chen” [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 11:56 AM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421; Bhushan Bharat-R65777 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 02:04 PM, Bharat Bhushan wrote: If there is a struct page for the requested mapping then it's normal DDR and the mapping sets M bit (coherent, cacheable) else this is treated as I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/kvm/e500_mmu_host.c | 17 - 1 files changed, 12 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..089c227 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { + u32 mas2_attr; + + mas2_attr = mas2 MAS2_ATTRIB_MASK; + + if (!pfn_valid(pfn)) { Why not directly use kvm_is_mmio_pfn()? What I understand from this function (someone can correct me) is that it returns false when the page is managed by kernel and is not marked as RESERVED (for some reason). For us it does not matter whether the page is reserved or not, if it is kernel visible page then it is DDR. I think you are setting I|G by addressing all mmio pages, right? If so, KVM: direct mmio pfn check Userspace may specify memory slots that are backed by mmio pages rather than normal RAM. In some cases it is not enough to identify these mmio pages by pfn_valid(). This patch adds checking the PageReserved as well. Tiejun -Bharat Tiejun + mas2_attr |= MAS2_I | MAS2_G; + } else { #ifdef CONFIG_SMP - return (mas2 MAS2_ATTRIB_MASK) | MAS2_M; -#else - return mas2 MAS2_ATTRIB_MASK; + mas2_attr |= MAS2_M; #endif + } + return mas2_attr; } /* @@ -313,7 +320,7 @@ static void kvmppc_e500_setup_stlbe( /* Force IPROT=0 for all guest mappings. */ stlbe-mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID; stlbe-mas2 = (gvaddr MAS2_EPN) | - e500_shadow_mas2_attrib(gtlbe-mas2, pr); + e500_shadow_mas2_attrib(gtlbe-mas2, pfn); stlbe-mas7_3 = ((u64)pfn PAGE_SHIFT) | e500_shadow_mas3_attrib(gtlbe-mas7_3, pr); -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
-Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of “tiejun.chen” Sent: Thursday, July 18, 2013 1:01 PM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: “tiejun.chen” [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 11:56 AM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421; Bhushan Bharat-R65777 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 02:04 PM, Bharat Bhushan wrote: If there is a struct page for the requested mapping then it's normal DDR and the mapping sets M bit (coherent, cacheable) else this is treated as I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/kvm/e500_mmu_host.c | 17 - 1 files changed, 12 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..089c227 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { + u32 mas2_attr; + + mas2_attr = mas2 MAS2_ATTRIB_MASK; + + if (!pfn_valid(pfn)) { Why not directly use kvm_is_mmio_pfn()? What I understand from this function (someone can correct me) is that it returns false when the page is managed by kernel and is not marked as RESERVED (for some reason). For us it does not matter whether the page is reserved or not, if it is kernel visible page then it is DDR. I think you are setting I|G by addressing all mmio pages, right? If so, KVM: direct mmio pfn check Userspace may specify memory slots that are backed by mmio pages rather than normal RAM. In some cases it is not enough to identify these mmio pages by pfn_valid(). This patch adds checking the PageReserved as well. Do you know what are those some cases and how checking PageReserved helps in those cases? -Bharat Tiejun -Bharat Tiejun + mas2_attr |= MAS2_I | MAS2_G; + } else { #ifdef CONFIG_SMP - return (mas2 MAS2_ATTRIB_MASK) | MAS2_M; -#else - return mas2 MAS2_ATTRIB_MASK; + mas2_attr |= MAS2_M; #endif + } + return mas2_attr; } /* @@ -313,7 +320,7 @@ static void kvmppc_e500_setup_stlbe( /* Force IPROT=0 for all guest mappings. */ stlbe-mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID; stlbe-mas2 = (gvaddr MAS2_EPN) | - e500_shadow_mas2_attrib(gtlbe-mas2, pr); + e500_shadow_mas2_attrib(gtlbe-mas2, pfn); stlbe-mas7_3 = ((u64)pfn PAGE_SHIFT) | e500_shadow_mas3_attrib(gtlbe-mas7_3, pr); -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html N�r��yb�X��ǧv�^�){.n�+h����ܨ}���Ơz�j:+v���zZ+��+zf���h���~i���z��w���?��)ߢf
Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of “tiejun.chen” Sent: Thursday, July 18, 2013 1:01 PM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: “tiejun.chen” [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 11:56 AM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421; Bhushan Bharat-R65777 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 02:04 PM, Bharat Bhushan wrote: If there is a struct page for the requested mapping then it's normal DDR and the mapping sets M bit (coherent, cacheable) else this is treated as I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/kvm/e500_mmu_host.c | 17 - 1 files changed, 12 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..089c227 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { + u32 mas2_attr; + + mas2_attr = mas2 MAS2_ATTRIB_MASK; + + if (!pfn_valid(pfn)) { Why not directly use kvm_is_mmio_pfn()? What I understand from this function (someone can correct me) is that it returns false when the page is managed by kernel and is not marked as RESERVED (for some reason). For us it does not matter whether the page is reserved or not, if it is kernel visible page then it is DDR. I think you are setting I|G by addressing all mmio pages, right? If so, KVM: direct mmio pfn check Userspace may specify memory slots that are backed by mmio pages rather than normal RAM. In some cases it is not enough to identify these mmio pages by pfn_valid(). This patch adds checking the PageReserved as well. Do you know what are those some cases and how checking PageReserved helps in those cases? No, myself didn't see these actual cases in qemu,too. But this should be chronically persistent as I understand ;-) Tiejun -Bharat Tiejun -Bharat Tiejun + mas2_attr |= MAS2_I | MAS2_G; + } else { #ifdef CONFIG_SMP - return (mas2 MAS2_ATTRIB_MASK) | MAS2_M; -#else - return mas2 MAS2_ATTRIB_MASK; + mas2_attr |= MAS2_M; #endif + } + return mas2_attr; } /* @@ -313,7 +320,7 @@ static void kvmppc_e500_setup_stlbe( /* Force IPROT=0 for all guest mappings. */ stlbe-mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID; stlbe-mas2 = (gvaddr MAS2_EPN) | - e500_shadow_mas2_attrib(gtlbe-mas2, pr); + e500_shadow_mas2_attrib(gtlbe-mas2, pfn); stlbe-mas7_3 = ((u64)pfn PAGE_SHIFT) | e500_shadow_mas3_attrib(gtlbe-mas7_3, pr); -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
-Original Message- From: “tiejun.chen” [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 1:52 PM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of “tiejun.chen” Sent: Thursday, July 18, 2013 1:01 PM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: “tiejun.chen” [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 11:56 AM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421; Bhushan Bharat-R65777 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 02:04 PM, Bharat Bhushan wrote: If there is a struct page for the requested mapping then it's normal DDR and the mapping sets M bit (coherent, cacheable) else this is treated as I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/kvm/e500_mmu_host.c | 17 - 1 files changed, 12 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..089c227 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { + u32 mas2_attr; + + mas2_attr = mas2 MAS2_ATTRIB_MASK; + + if (!pfn_valid(pfn)) { Why not directly use kvm_is_mmio_pfn()? What I understand from this function (someone can correct me) is that it returns false when the page is managed by kernel and is not marked as RESERVED (for some reason). For us it does not matter whether the page is reserved or not, if it is kernel visible page then it is DDR. I think you are setting I|G by addressing all mmio pages, right? If so, KVM: direct mmio pfn check Userspace may specify memory slots that are backed by mmio pages rather than normal RAM. In some cases it is not enough to identify these mmio pages by pfn_valid(). This patch adds checking the PageReserved as well. Do you know what are those some cases and how checking PageReserved helps in those cases? No, myself didn't see these actual cases in qemu,too. But this should be chronically persistent as I understand ;-) Then I will wait till someone educate me :) -Bharat Tiejun -Bharat Tiejun -Bharat Tiejun + mas2_attr |= MAS2_I | MAS2_G; + } else { #ifdef CONFIG_SMP - return (mas2 MAS2_ATTRIB_MASK) | MAS2_M; -#else - return mas2 MAS2_ATTRIB_MASK; + mas2_attr |= MAS2_M; #endif + } + return mas2_attr; } /* @@ -313,7 +320,7 @@ static void kvmppc_e500_setup_stlbe( /* Force IPROT=0 for all guest mappings. */ stlbe-mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID; stlbe-mas2 = (gvaddr MAS2_EPN) | - e500_shadow_mas2_attrib(gtlbe-mas2, pr); + e500_shadow_mas2_attrib(gtlbe-mas2, pfn); stlbe-mas7_3 = ((u64)pfn PAGE_SHIFT) | e500_shadow_mas3_attrib(gtlbe-mas7_3, pr); -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
On 07/18/2013 02:04 PM, Bharat Bhushan wrote: If there is a struct page for the requested mapping then it's normal DDR and the mapping sets M bit (coherent, cacheable) else this is treated as I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/kvm/e500_mmu_host.c | 17 - 1 files changed, 12 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..089c227 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { + u32 mas2_attr; + + mas2_attr = mas2 MAS2_ATTRIB_MASK; + + if (!pfn_valid(pfn)) { + mas2_attr |= MAS2_I | MAS2_G; + } else { #ifdef CONFIG_SMP - return (mas2 MAS2_ATTRIB_MASK) | MAS2_M; -#else - return mas2 MAS2_ATTRIB_MASK; + mas2_attr |= MAS2_M; #endif + } Additionally, in UP case this little chunk of code is equivalent to if (1) { mas2_attr |= MAS2_I | MAS2_G; } else { } So you'd better wrapper MAS2_m in advance like, #ifdef CONFIG_SMP #define M_IF_SMPMAS2_M #else #define M_IF_SMP0 #endif Then if (1) mas2_attr |= MAS2_I | MAS2_G; else mas2_attr |= M_IF_SMP; Tiejun + return mas2_attr; } /* @@ -313,7 +320,7 @@ static void kvmppc_e500_setup_stlbe( /* Force IPROT=0 for all guest mappings. */ stlbe-mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID; stlbe-mas2 = (gvaddr MAS2_EPN) | - e500_shadow_mas2_attrib(gtlbe-mas2, pr); + e500_shadow_mas2_attrib(gtlbe-mas2, pfn); stlbe-mas7_3 = ((u64)pfn PAGE_SHIFT) | e500_shadow_mas3_attrib(gtlbe-mas7_3, pr); -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
-Original Message- From: Bhushan Bharat-R65777 Sent: Thursday, July 18, 2013 1:53 PM To: '“tiejun.chen”' Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages -Original Message- From: “tiejun.chen” [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 1:52 PM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of “tiejun.chen” Sent: Thursday, July 18, 2013 1:01 PM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: “tiejun.chen” [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 11:56 AM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421; Bhushan Bharat-R65777 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 02:04 PM, Bharat Bhushan wrote: If there is a struct page for the requested mapping then it's normal DDR and the mapping sets M bit (coherent, cacheable) else this is treated as I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/kvm/e500_mmu_host.c | 17 - 1 files changed, 12 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..089c227 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { + u32 mas2_attr; + + mas2_attr = mas2 MAS2_ATTRIB_MASK; + + if (!pfn_valid(pfn)) { Why not directly use kvm_is_mmio_pfn()? What I understand from this function (someone can correct me) is that it returns false when the page is managed by kernel and is not marked as RESERVED (for some reason). For us it does not matter whether the page is reserved or not, if it is kernel visible page then it is DDR. I think you are setting I|G by addressing all mmio pages, right? If so, KVM: direct mmio pfn check Userspace may specify memory slots that are backed by mmio pages rather than normal RAM. In some cases it is not enough to identify these mmio pages by pfn_valid(). This patch adds checking the PageReserved as well. Do you know what are those some cases and how checking PageReserved helps in those cases? No, myself didn't see these actual cases in qemu,too. But this should be chronically persistent as I understand ;-) Then I will wait till someone educate me :) The reason is , kvm_is_mmio_pfn() function looks pretty heavy and I do not want to call this for all tlbwe operation unless it is necessary. -Bharat + mas2_attr |= MAS2_I | MAS2_G; + } else { #ifdef CONFIG_SMP - return (mas2 MAS2_ATTRIB_MASK) | MAS2_M; -#else - return mas2 MAS2_ATTRIB_MASK; + mas2_attr |= MAS2_M; #endif + } + return mas2_attr; } /* @@ -313,7 +320,7 @@ static void kvmppc_e500_setup_stlbe( /* Force IPROT=0 for all guest mappings. */ stlbe-mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID; stlbe-mas2 = (gvaddr MAS2_EPN) | - e500_shadow_mas2_attrib(gtlbe-mas2, pr); + e500_shadow_mas2_attrib(gtlbe-mas2, pfn); stlbe-mas7_3 = ((u64)pfn PAGE_SHIFT) | e500_shadow_mas3_attrib(gtlbe-mas7_3, pr); -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html N�r��yb�X��ǧv�^�){.n�+h����ܨ}���Ơz�j:+v���zZ+��+zf���h���~i���z��w���?��)ߢf
Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
On 07/18/2013 04:25 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: Bhushan Bharat-R65777 Sent: Thursday, July 18, 2013 1:53 PM To: '“tiejun.chen”' Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages -Original Message- From: “tiejun.chen” [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 1:52 PM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of “tiejun.chen” Sent: Thursday, July 18, 2013 1:01 PM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: “tiejun.chen” [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 11:56 AM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421; Bhushan Bharat-R65777 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 02:04 PM, Bharat Bhushan wrote: If there is a struct page for the requested mapping then it's normal DDR and the mapping sets M bit (coherent, cacheable) else this is treated as I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/kvm/e500_mmu_host.c | 17 - 1 files changed, 12 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..089c227 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { + u32 mas2_attr; + + mas2_attr = mas2 MAS2_ATTRIB_MASK; + + if (!pfn_valid(pfn)) { Why not directly use kvm_is_mmio_pfn()? What I understand from this function (someone can correct me) is that it returns false when the page is managed by kernel and is not marked as RESERVED (for some reason). For us it does not matter whether the page is reserved or not, if it is kernel visible page then it is DDR. I think you are setting I|G by addressing all mmio pages, right? If so, KVM: direct mmio pfn check Userspace may specify memory slots that are backed by mmio pages rather than normal RAM. In some cases it is not enough to identify these mmio pages by pfn_valid(). This patch adds checking the PageReserved as well. Do you know what are those some cases and how checking PageReserved helps in those cases? No, myself didn't see these actual cases in qemu,too. But this should be chronically persistent as I understand ;-) Then I will wait till someone educate me :) The reason is , kvm_is_mmio_pfn() function looks pretty heavy and I do not want to call this for all tlbwe operation unless it is necessary. Furthermore, how to distinguish we're creating TLB entry for the device assigned directly to the GS? I think its unnecessary to always check if that is mmio's pfn since we have more non direct assigned devices. So maybe we can introduce another helper to fixup that TLB entry in instead of this path. Tiejun -Bharat + mas2_attr |= MAS2_I | MAS2_G; + } else { #ifdef CONFIG_SMP - return (mas2 MAS2_ATTRIB_MASK) | MAS2_M; -#else - return mas2 MAS2_ATTRIB_MASK; + mas2_attr |= MAS2_M; #endif + } + return mas2_attr; } /* @@ -313,7 +320,7 @@ static void kvmppc_e500_setup_stlbe( /* Force IPROT=0 for all guest mappings. */ stlbe-mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID; stlbe-mas2 = (gvaddr MAS2_EPN) | - e500_shadow_mas2_attrib(gtlbe-mas2, pr); + e500_shadow_mas2_attrib(gtlbe-mas2, pfn); stlbe-mas7_3 = ((u64)pfn PAGE_SHIFT) | e500_shadow_mas3_attrib(gtlbe-mas7_3, pr); -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to
Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
On 18.07.2013, at 10:55, “tiejun.chen” wrote: On 07/18/2013 04:25 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: Bhushan Bharat-R65777 Sent: Thursday, July 18, 2013 1:53 PM To: '“tiejun.chen”' Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages -Original Message- From: “tiejun.chen” [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 1:52 PM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of “tiejun.chen” Sent: Thursday, July 18, 2013 1:01 PM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: “tiejun.chen” [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 11:56 AM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421; Bhushan Bharat-R65777 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 02:04 PM, Bharat Bhushan wrote: If there is a struct page for the requested mapping then it's normal DDR and the mapping sets M bit (coherent, cacheable) else this is treated as I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/kvm/e500_mmu_host.c | 17 - 1 files changed, 12 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..089c227 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { + u32 mas2_attr; + + mas2_attr = mas2 MAS2_ATTRIB_MASK; + + if (!pfn_valid(pfn)) { Why not directly use kvm_is_mmio_pfn()? What I understand from this function (someone can correct me) is that it returns false when the page is managed by kernel and is not marked as RESERVED (for some reason). For us it does not matter whether the page is reserved or not, if it is kernel visible page then it is DDR. I think you are setting I|G by addressing all mmio pages, right? If so, KVM: direct mmio pfn check Userspace may specify memory slots that are backed by mmio pages rather than normal RAM. In some cases it is not enough to identify these mmio pages by pfn_valid(). This patch adds checking the PageReserved as well. Do you know what are those some cases and how checking PageReserved helps in those cases? No, myself didn't see these actual cases in qemu,too. But this should be chronically persistent as I understand ;-) Then I will wait till someone educate me :) The reason is , kvm_is_mmio_pfn() function looks pretty heavy and I do not want to call this for all tlbwe operation unless it is necessary. Furthermore, how to distinguish we're creating TLB entry for the device assigned directly to the GS? Because other devices wouldn't be available to the guest through memory slots. I think its unnecessary to always check if that is mmio's pfn since we have more non direct assigned devices. I'm not sure I understand. The shadow TLB code only knows here is a host virtual address. It needs to figure out whether the host physical address behind that is RAM (can access with cache enabled) or not (has to disable cache) So maybe we can introduce another helper to fixup that TLB entry in instead of this path. This path does fix up the shadow (host) TLB entry :). Alex -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
On 18.07.2013, at 10:25, Bhushan Bharat-R65777 wrote: -Original Message- From: Bhushan Bharat-R65777 Sent: Thursday, July 18, 2013 1:53 PM To: '“tiejun.chen”' Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages -Original Message- From: “tiejun.chen” [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 1:52 PM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of “tiejun.chen” Sent: Thursday, July 18, 2013 1:01 PM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: “tiejun.chen” [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 11:56 AM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421; Bhushan Bharat-R65777 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 02:04 PM, Bharat Bhushan wrote: If there is a struct page for the requested mapping then it's normal DDR and the mapping sets M bit (coherent, cacheable) else this is treated as I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/kvm/e500_mmu_host.c | 17 - 1 files changed, 12 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..089c227 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { + u32 mas2_attr; + + mas2_attr = mas2 MAS2_ATTRIB_MASK; + + if (!pfn_valid(pfn)) { Why not directly use kvm_is_mmio_pfn()? What I understand from this function (someone can correct me) is that it returns false when the page is managed by kernel and is not marked as RESERVED (for some reason). For us it does not matter whether the page is reserved or not, if it is kernel visible page then it is DDR. I think you are setting I|G by addressing all mmio pages, right? If so, KVM: direct mmio pfn check Userspace may specify memory slots that are backed by mmio pages rather than normal RAM. In some cases it is not enough to identify these mmio pages by pfn_valid(). This patch adds checking the PageReserved as well. Do you know what are those some cases and how checking PageReserved helps in those cases? No, myself didn't see these actual cases in qemu,too. But this should be chronically persistent as I understand ;-) Then I will wait till someone educate me :) The reason is , kvm_is_mmio_pfn() function looks pretty heavy and I do not want to call this for all tlbwe operation unless it is necessary. It certainly does more than we need and potentially slows down the fast path (RAM mapping). The only thing it does on top of if (pfn_valid()) is to check for pages that are declared reserved on the host. This happens in 2 cases: 1) Non cache coherent DMA 2) Memory hot remove The non coherent DMA case would be interesting, as with the mechanism as it is in place in Linux today, we could potentially break normal guest operation if we don't take it into account. However, it's Kconfig guarded by: depends on 4xx || 8xx || E200 || PPC_MPC512x || GAMECUBE_COMMON default n if PPC_47x default y so we never hit it with any core we care about ;). Memory hot remove does not exist on e500 FWIW, so we don't have to worry about that one either. Which means I think it's fine to slim this whole thing down to only check for pfn_valid(), as the code does in this patch. It would however be very useful to have a comment there that explains why it's safe to do so. Alex -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
-Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Alexander Graf Sent: Thursday, July 18, 2013 3:19 PM To: Bhushan Bharat-R65777 Cc: “tiejun.chen”; kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 18.07.2013, at 10:25, Bhushan Bharat-R65777 wrote: -Original Message- From: Bhushan Bharat-R65777 Sent: Thursday, July 18, 2013 1:53 PM To: '“tiejun.chen”' Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages -Original Message- From: “tiejun.chen” [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 1:52 PM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of “tiejun.chen” Sent: Thursday, July 18, 2013 1:01 PM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: “tiejun.chen” [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 11:56 AM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421; Bhushan Bharat-R65777 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 02:04 PM, Bharat Bhushan wrote: If there is a struct page for the requested mapping then it's normal DDR and the mapping sets M bit (coherent, cacheable) else this is treated as I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/kvm/e500_mmu_host.c | 17 - 1 files changed, 12 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..089c227 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { +u32 mas2_attr; + +mas2_attr = mas2 MAS2_ATTRIB_MASK; + +if (!pfn_valid(pfn)) { Why not directly use kvm_is_mmio_pfn()? What I understand from this function (someone can correct me) is that it returns false when the page is managed by kernel and is not marked as RESERVED (for some reason). For us it does not matter whether the page is reserved or not, if it is kernel visible page then it is DDR. I think you are setting I|G by addressing all mmio pages, right? If so, KVM: direct mmio pfn check Userspace may specify memory slots that are backed by mmio pages rather than normal RAM. In some cases it is not enough to identify these mmio pages by pfn_valid(). This patch adds checking the PageReserved as well. Do you know what are those some cases and how checking PageReserved helps in those cases? No, myself didn't see these actual cases in qemu,too. But this should be chronically persistent as I understand ;-) Then I will wait till someone educate me :) The reason is , kvm_is_mmio_pfn() function looks pretty heavy and I do not want to call this for all tlbwe operation unless it is necessary. It certainly does more than we need and potentially slows down the fast path (RAM mapping). The only thing it does on top of if (pfn_valid()) is to check for pages that are declared reserved on the host. This happens in 2 cases: 1) Non cache coherent DMA 2) Memory hot remove The non coherent DMA case would be interesting, as with the mechanism as it is in place in Linux today, we could potentially break normal guest operation if we don't take it into account. However, it's Kconfig guarded by: depends on 4xx || 8xx || E200 || PPC_MPC512x || GAMECUBE_COMMON default n if PPC_47x default y so we never hit it with any core we care about ;). Memory hot remove does not exist on e500 FWIW, so we don't have to worry about that one either. Which means I think it's fine to slim
Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
On 07/18/2013 05:44 PM, Alexander Graf wrote: On 18.07.2013, at 10:55, �tiejun.chen� wrote: On 07/18/2013 04:25 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: Bhushan Bharat-R65777 Sent: Thursday, July 18, 2013 1:53 PM To: '�tiejun.chen�' Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages -Original Message- From: �tiejun.chen� [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 1:52 PM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of �tiejun.chen� Sent: Thursday, July 18, 2013 1:01 PM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: �tiejun.chen� [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 11:56 AM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421; Bhushan Bharat-R65777 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 02:04 PM, Bharat Bhushan wrote: If there is a struct page for the requested mapping then it's normal DDR and the mapping sets M bit (coherent, cacheable) else this is treated as I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/kvm/e500_mmu_host.c | 17 - 1 files changed, 12 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..089c227 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { + u32 mas2_attr; + + mas2_attr = mas2 MAS2_ATTRIB_MASK; + + if (!pfn_valid(pfn)) { Why not directly use kvm_is_mmio_pfn()? What I understand from this function (someone can correct me) is that it returns false when the page is managed by kernel and is not marked as RESERVED (for some reason). For us it does not matter whether the page is reserved or not, if it is kernel visible page then it is DDR. I think you are setting I|G by addressing all mmio pages, right? If so, KVM: direct mmio pfn check Userspace may specify memory slots that are backed by mmio pages rather than normal RAM. In some cases it is not enough to identify these mmio pages by pfn_valid(). This patch adds checking the PageReserved as well. Do you know what are those some cases and how checking PageReserved helps in those cases? No, myself didn't see these actual cases in qemu,too. But this should be chronically persistent as I understand ;-) Then I will wait till someone educate me :) The reason is , kvm_is_mmio_pfn() function looks pretty heavy and I do not want to call this for all tlbwe operation unless it is necessary. Furthermore, how to distinguish we're creating TLB entry for the device assigned directly to the GS? Because other devices wouldn't be available to the guest through memory slots. Yes. I think its unnecessary to always check if that is mmio's pfn since we have more non direct assigned devices. I'm not sure I understand. The shadow TLB code only knows here is a host virtual address. It needs to figure out whether the host physical address behind that is RAM (can access with cache enabled) or not (has to disable cache) Sorry, looks I'm misleading you :-P So maybe we can introduce another helper to fixup that TLB entry in instead of this path. This path does fix up the shadow (host) TLB entry :). I just mean whether we can have a separate path dedicated to those direct assigned devices, not go this common path :) Tiejun -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
On 18.07.2013, at 11:56, “tiejun.chen” wrote: On 07/18/2013 05:44 PM, Alexander Graf wrote: On 18.07.2013, at 10:55, �tiejun.chen� wrote: On 07/18/2013 04:25 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: Bhushan Bharat-R65777 Sent: Thursday, July 18, 2013 1:53 PM To: '�tiejun.chen�' Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages -Original Message- From: �tiejun.chen� [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 1:52 PM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of �tiejun.chen� Sent: Thursday, July 18, 2013 1:01 PM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: �tiejun.chen� [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 11:56 AM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421; Bhushan Bharat-R65777 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 02:04 PM, Bharat Bhushan wrote: If there is a struct page for the requested mapping then it's normal DDR and the mapping sets M bit (coherent, cacheable) else this is treated as I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/kvm/e500_mmu_host.c | 17 - 1 files changed, 12 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..089c227 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { + u32 mas2_attr; + + mas2_attr = mas2 MAS2_ATTRIB_MASK; + + if (!pfn_valid(pfn)) { Why not directly use kvm_is_mmio_pfn()? What I understand from this function (someone can correct me) is that it returns false when the page is managed by kernel and is not marked as RESERVED (for some reason). For us it does not matter whether the page is reserved or not, if it is kernel visible page then it is DDR. I think you are setting I|G by addressing all mmio pages, right? If so, KVM: direct mmio pfn check Userspace may specify memory slots that are backed by mmio pages rather than normal RAM. In some cases it is not enough to identify these mmio pages by pfn_valid(). This patch adds checking the PageReserved as well. Do you know what are those some cases and how checking PageReserved helps in those cases? No, myself didn't see these actual cases in qemu,too. But this should be chronically persistent as I understand ;-) Then I will wait till someone educate me :) The reason is , kvm_is_mmio_pfn() function looks pretty heavy and I do not want to call this for all tlbwe operation unless it is necessary. Furthermore, how to distinguish we're creating TLB entry for the device assigned directly to the GS? Because other devices wouldn't be available to the guest through memory slots. Yes. I think its unnecessary to always check if that is mmio's pfn since we have more non direct assigned devices. I'm not sure I understand. The shadow TLB code only knows here is a host virtual address. It needs to figure out whether the host physical address behind that is RAM (can access with cache enabled) or not (has to disable cache) Sorry, looks I'm misleading you :-P So maybe we can introduce another helper to fixup that TLB entry in instead of this path. This path does fix up the shadow (host) TLB entry :). I just mean whether we can have a separate path dedicated to those direct assigned devices, not go this common path :) I don't think it's possible to have a separate path without a certain level of trust. In the current flow we don't trust anyone. We just check every translated page whether we should enable caching or not. We could take that information from 2 other side though: 1) Memory Slot
Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
On 07/18/2013 05:48 PM, Alexander Graf wrote: On 18.07.2013, at 10:25, Bhushan Bharat-R65777 wrote: -Original Message- From: Bhushan Bharat-R65777 Sent: Thursday, July 18, 2013 1:53 PM To: '�tiejun.chen�' Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages -Original Message- From: �tiejun.chen� [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 1:52 PM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of �tiejun.chen� Sent: Thursday, July 18, 2013 1:01 PM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: �tiejun.chen� [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 11:56 AM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421; Bhushan Bharat-R65777 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 02:04 PM, Bharat Bhushan wrote: If there is a struct page for the requested mapping then it's normal DDR and the mapping sets M bit (coherent, cacheable) else this is treated as I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/kvm/e500_mmu_host.c | 17 - 1 files changed, 12 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..089c227 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { + u32 mas2_attr; + + mas2_attr = mas2 MAS2_ATTRIB_MASK; + + if (!pfn_valid(pfn)) { Why not directly use kvm_is_mmio_pfn()? What I understand from this function (someone can correct me) is that it returns false when the page is managed by kernel and is not marked as RESERVED (for some reason). For us it does not matter whether the page is reserved or not, if it is kernel visible page then it is DDR. I think you are setting I|G by addressing all mmio pages, right? If so, KVM: direct mmio pfn check Userspace may specify memory slots that are backed by mmio pages rather than normal RAM. In some cases it is not enough to identify these mmio pages by pfn_valid(). This patch adds checking the PageReserved as well. Do you know what are those some cases and how checking PageReserved helps in those cases? No, myself didn't see these actual cases in qemu,too. But this should be chronically persistent as I understand ;-) Then I will wait till someone educate me :) The reason is , kvm_is_mmio_pfn() function looks pretty heavy and I do not want to call this for all tlbwe operation unless it is necessary. It certainly does more than we need and potentially slows down the fast path (RAM mapping). The only thing it does on top of if (pfn_valid()) is to check for pages that are declared reserved on the host. This happens in 2 cases: 1) Non cache coherent DMA 2) Memory hot remove The non coherent DMA case would be interesting, as with the mechanism as it is in place in Linux today, we could potentially break normal guest operation if we don't take it into account. However, it's Kconfig guarded by: depends on 4xx || 8xx || E200 || PPC_MPC512x || GAMECUBE_COMMON default n if PPC_47x default y so we never hit it with any core we care about ;). Memory hot remove does not exist on e500 FWIW, so we don't have to worry about that one either. Thanks for this good information :) So why not limit those codes with CONFIG_MEMORY_HOTPLUG inside kvm_is_mmio_pfn() to make sure that check is only valid when that is really needed? This can decrease those unnecessary performance loss. If I'm wrong please correct me :) Tiejun -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
On 18.07.2013, at 12:08, “tiejun.chen” wrote: On 07/18/2013 05:48 PM, Alexander Graf wrote: On 18.07.2013, at 10:25, Bhushan Bharat-R65777 wrote: -Original Message- From: Bhushan Bharat-R65777 Sent: Thursday, July 18, 2013 1:53 PM To: '�tiejun.chen�' Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages -Original Message- From: �tiejun.chen� [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 1:52 PM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of �tiejun.chen� Sent: Thursday, July 18, 2013 1:01 PM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: �tiejun.chen� [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 11:56 AM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421; Bhushan Bharat-R65777 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 02:04 PM, Bharat Bhushan wrote: If there is a struct page for the requested mapping then it's normal DDR and the mapping sets M bit (coherent, cacheable) else this is treated as I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/kvm/e500_mmu_host.c | 17 - 1 files changed, 12 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..089c227 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { +u32 mas2_attr; + +mas2_attr = mas2 MAS2_ATTRIB_MASK; + +if (!pfn_valid(pfn)) { Why not directly use kvm_is_mmio_pfn()? What I understand from this function (someone can correct me) is that it returns false when the page is managed by kernel and is not marked as RESERVED (for some reason). For us it does not matter whether the page is reserved or not, if it is kernel visible page then it is DDR. I think you are setting I|G by addressing all mmio pages, right? If so, KVM: direct mmio pfn check Userspace may specify memory slots that are backed by mmio pages rather than normal RAM. In some cases it is not enough to identify these mmio pages by pfn_valid(). This patch adds checking the PageReserved as well. Do you know what are those some cases and how checking PageReserved helps in those cases? No, myself didn't see these actual cases in qemu,too. But this should be chronically persistent as I understand ;-) Then I will wait till someone educate me :) The reason is , kvm_is_mmio_pfn() function looks pretty heavy and I do not want to call this for all tlbwe operation unless it is necessary. It certainly does more than we need and potentially slows down the fast path (RAM mapping). The only thing it does on top of if (pfn_valid()) is to check for pages that are declared reserved on the host. This happens in 2 cases: 1) Non cache coherent DMA 2) Memory hot remove The non coherent DMA case would be interesting, as with the mechanism as it is in place in Linux today, we could potentially break normal guest operation if we don't take it into account. However, it's Kconfig guarded by: depends on 4xx || 8xx || E200 || PPC_MPC512x || GAMECUBE_COMMON default n if PPC_47x default y so we never hit it with any core we care about ;). Memory hot remove does not exist on e500 FWIW, so we don't have to worry about that one either. Thanks for this good information :) So why not limit those codes with CONFIG_MEMORY_HOTPLUG inside kvm_is_mmio_pfn() to make sure that check is only valid when that is really needed? This can decrease those unnecessary performance loss. If I'm wrong please correct me :) You're perfectly right, but this is generic KVM code. So it gets run across all architectures. What if someone has the great idea
Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
On 07/18/2013 06:00 PM, Alexander Graf wrote: On 18.07.2013, at 11:56, “tiejun.chen” wrote: On 07/18/2013 05:44 PM, Alexander Graf wrote: On 18.07.2013, at 10:55, �tiejun.chen� wrote: On 07/18/2013 04:25 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: Bhushan Bharat-R65777 Sent: Thursday, July 18, 2013 1:53 PM To: '�tiejun.chen�' Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages -Original Message- From: �tiejun.chen� [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 1:52 PM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of �tiejun.chen� Sent: Thursday, July 18, 2013 1:01 PM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: �tiejun.chen� [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 11:56 AM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421; Bhushan Bharat-R65777 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 02:04 PM, Bharat Bhushan wrote: If there is a struct page for the requested mapping then it's normal DDR and the mapping sets M bit (coherent, cacheable) else this is treated as I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/kvm/e500_mmu_host.c | 17 - 1 files changed, 12 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..089c227 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { + u32 mas2_attr; + + mas2_attr = mas2 MAS2_ATTRIB_MASK; + + if (!pfn_valid(pfn)) { Why not directly use kvm_is_mmio_pfn()? What I understand from this function (someone can correct me) is that it returns false when the page is managed by kernel and is not marked as RESERVED (for some reason). For us it does not matter whether the page is reserved or not, if it is kernel visible page then it is DDR. I think you are setting I|G by addressing all mmio pages, right? If so, KVM: direct mmio pfn check Userspace may specify memory slots that are backed by mmio pages rather than normal RAM. In some cases it is not enough to identify these mmio pages by pfn_valid(). This patch adds checking the PageReserved as well. Do you know what are those some cases and how checking PageReserved helps in those cases? No, myself didn't see these actual cases in qemu,too. But this should be chronically persistent as I understand ;-) Then I will wait till someone educate me :) The reason is , kvm_is_mmio_pfn() function looks pretty heavy and I do not want to call this for all tlbwe operation unless it is necessary. Furthermore, how to distinguish we're creating TLB entry for the device assigned directly to the GS? Because other devices wouldn't be available to the guest through memory slots. Yes. I think its unnecessary to always check if that is mmio's pfn since we have more non direct assigned devices. I'm not sure I understand. The shadow TLB code only knows here is a host virtual address. It needs to figure out whether the host physical address behind that is RAM (can access with cache enabled) or not (has to disable cache) Sorry, looks I'm misleading you :-P So maybe we can introduce another helper to fixup that TLB entry in instead of this path. This path does fix up the shadow (host) TLB entry :). I just mean whether we can have a separate path dedicated to those direct assigned devices, not go this common path :) I don't think it's possible to have a separate path without a certain level of trust. In the current flow we don't trust anyone. We just check every translated page whether we should enable caching or not. We could take that information from 2 other side though: 1) Memory Slot 2) Guest TLB Flags If we take it from the memory slot we
Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
On 07/18/2013 06:12 PM, Alexander Graf wrote: On 18.07.2013, at 12:08, “tiejun.chen” wrote: On 07/18/2013 05:48 PM, Alexander Graf wrote: On 18.07.2013, at 10:25, Bhushan Bharat-R65777 wrote: -Original Message- From: Bhushan Bharat-R65777 Sent: Thursday, July 18, 2013 1:53 PM To: '�tiejun.chen�' Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages -Original Message- From: �tiejun.chen� [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 1:52 PM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of �tiejun.chen� Sent: Thursday, July 18, 2013 1:01 PM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: �tiejun.chen� [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 11:56 AM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421; Bhushan Bharat-R65777 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 02:04 PM, Bharat Bhushan wrote: If there is a struct page for the requested mapping then it's normal DDR and the mapping sets M bit (coherent, cacheable) else this is treated as I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/kvm/e500_mmu_host.c | 17 - 1 files changed, 12 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..089c227 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { + u32 mas2_attr; + + mas2_attr = mas2 MAS2_ATTRIB_MASK; + + if (!pfn_valid(pfn)) { Why not directly use kvm_is_mmio_pfn()? What I understand from this function (someone can correct me) is that it returns false when the page is managed by kernel and is not marked as RESERVED (for some reason). For us it does not matter whether the page is reserved or not, if it is kernel visible page then it is DDR. I think you are setting I|G by addressing all mmio pages, right? If so, KVM: direct mmio pfn check Userspace may specify memory slots that are backed by mmio pages rather than normal RAM. In some cases it is not enough to identify these mmio pages by pfn_valid(). This patch adds checking the PageReserved as well. Do you know what are those some cases and how checking PageReserved helps in those cases? No, myself didn't see these actual cases in qemu,too. But this should be chronically persistent as I understand ;-) Then I will wait till someone educate me :) The reason is , kvm_is_mmio_pfn() function looks pretty heavy and I do not want to call this for all tlbwe operation unless it is necessary. It certainly does more than we need and potentially slows down the fast path (RAM mapping). The only thing it does on top of if (pfn_valid()) is to check for pages that are declared reserved on the host. This happens in 2 cases: 1) Non cache coherent DMA 2) Memory hot remove The non coherent DMA case would be interesting, as with the mechanism as it is in place in Linux today, we could potentially break normal guest operation if we don't take it into account. However, it's Kconfig guarded by: depends on 4xx || 8xx || E200 || PPC_MPC512x || GAMECUBE_COMMON default n if PPC_47x default y so we never hit it with any core we care about ;). Memory hot remove does not exist on e500 FWIW, so we don't have to worry about that one either. Thanks for this good information :) So why not limit those codes with CONFIG_MEMORY_HOTPLUG inside kvm_is_mmio_pfn() to make sure that check is only valid when that is really needed? This can decrease those unnecessary performance loss. If I'm wrong please correct me :) You're perfectly right, but this is generic KVM code. So it gets run across all architectures. What if someone has the great idea to add a new case here for x86, but doesn't tell us? In
Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
On 18.07.2013, at 12:19, “tiejun.chen” wrote: On 07/18/2013 06:12 PM, Alexander Graf wrote: On 18.07.2013, at 12:08, “tiejun.chen” wrote: On 07/18/2013 05:48 PM, Alexander Graf wrote: On 18.07.2013, at 10:25, Bhushan Bharat-R65777 wrote: -Original Message- From: Bhushan Bharat-R65777 Sent: Thursday, July 18, 2013 1:53 PM To: '�tiejun.chen�' Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages -Original Message- From: �tiejun.chen� [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 1:52 PM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of �tiejun.chen� Sent: Thursday, July 18, 2013 1:01 PM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: �tiejun.chen� [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 11:56 AM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; ag...@suse.de; Wood Scott- B07421; Bhushan Bharat-R65777 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 02:04 PM, Bharat Bhushan wrote: If there is a struct page for the requested mapping then it's normal DDR and the mapping sets M bit (coherent, cacheable) else this is treated as I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/kvm/e500_mmu_host.c | 17 - 1 files changed, 12 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..089c227 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { + u32 mas2_attr; + + mas2_attr = mas2 MAS2_ATTRIB_MASK; + + if (!pfn_valid(pfn)) { Why not directly use kvm_is_mmio_pfn()? What I understand from this function (someone can correct me) is that it returns false when the page is managed by kernel and is not marked as RESERVED (for some reason). For us it does not matter whether the page is reserved or not, if it is kernel visible page then it is DDR. I think you are setting I|G by addressing all mmio pages, right? If so, KVM: direct mmio pfn check Userspace may specify memory slots that are backed by mmio pages rather than normal RAM. In some cases it is not enough to identify these mmio pages by pfn_valid(). This patch adds checking the PageReserved as well. Do you know what are those some cases and how checking PageReserved helps in those cases? No, myself didn't see these actual cases in qemu,too. But this should be chronically persistent as I understand ;-) Then I will wait till someone educate me :) The reason is , kvm_is_mmio_pfn() function looks pretty heavy and I do not want to call this for all tlbwe operation unless it is necessary. It certainly does more than we need and potentially slows down the fast path (RAM mapping). The only thing it does on top of if (pfn_valid()) is to check for pages that are declared reserved on the host. This happens in 2 cases: 1) Non cache coherent DMA 2) Memory hot remove The non coherent DMA case would be interesting, as with the mechanism as it is in place in Linux today, we could potentially break normal guest operation if we don't take it into account. However, it's Kconfig guarded by: depends on 4xx || 8xx || E200 || PPC_MPC512x || GAMECUBE_COMMON default n if PPC_47x default y so we never hit it with any core we care about ;). Memory hot remove does not exist on e500 FWIW, so we don't have to worry about that one either. Thanks for this good information :) So why not limit those codes with CONFIG_MEMORY_HOTPLUG inside kvm_is_mmio_pfn() to make sure that check is only valid when that is really needed? This can decrease those unnecessary performance loss. If I'm wrong please correct me :) You're perfectly right,
Re: [RFC PATCH] kvm-unit-tests : Basic architecture of VMX nested test case
Il 18/07/2013 09:26, Gleb Natapov ha scritto: I had written a long explanation here about why I don't trust the compiler to do the right thing, and ideas about how to fix that. But in the end the only workable solution is a single assembly blob like vmx.c in KVM to do vmlaunch/vmresume, so I'll get right to the point: * the similarity with KVM code and as little assembly as * * possible goals are mutually exclusive * I never said that code should be similar to KVM code or have as little assembly as possible :) Reread the thread. The only thing I asked for is to make code flow linear, if it makes code looks more like KVM or reduce amount of assembly this is just a bonus. Point taken. and for a testsuite I'd prefer the latter---which means I'd still favor setjmp/longjmp. Now, here is the long explanation. I must admit that the code looks nice. There are some nits I'd like to see done differently (such as putting vmx_return at the beginning of the while (1), and the vmresume asm at the end), but it is indeed nice. Why do you prefer setjmp/longjmp then? Because it is still deceiving, and I dislike the deceit more than I like the linear code flow. Agree, I dislike this magic too, but this is fixed by you suggestion above about putting vmx_return at the beginning of while(). So the logic will looks like that: asm volatile(vmlaunch;setbe %0\n\t : =m(ret)); while(ret) { while(!ret) if you use setbe, of course. vmx_return: SAVE_GPR_C eax = exit_handler(); switch(eax) { } LOAD_GPR_C asm volatile(vmresume;seta %0\n\t : =m(ret)); } It is still somewhat magic: the while (ret) is only there to please the compiler, and you rely on the compiler not changing %rsp between the vmlaunch and the vmx_return label. Minor nit, you cannot anymore print different error messages for vmlaunch vs. vmresume failure. In the end the choice is between all in asm and small asm trampoline (which also happens to use setjmp/longjmp, but perhaps Arthur can propagate return codes without using setjmp/longjmp, too). Rewriting the whole guest entry exit path in asm like you suggest bellow will eliminate the magic too. I much prefer one big asm statement than many small asm statement scattered among two or three C lines. It's not many asm statements, it's just a dozen lines: align 4, 0x90 entry_vmx: SAVE_GPR call default_exit_handler /* Should not reach here*/ mov $1, %eax call exit .align 4, 0x90 entry_sysenter: SAVE_GPR and $0xf, %eax mov %eax, %edi calldefault_syscall_handler /* Arthur, is something missing here? :)) */ Paolo -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC PATCH] kvm-unit-tests : Basic architecture of VMX nested test case
On Thu, Jul 18, 2013 at 12:47:46PM +0200, Paolo Bonzini wrote: and for a testsuite I'd prefer the latter---which means I'd still favor setjmp/longjmp. Now, here is the long explanation. I must admit that the code looks nice. There are some nits I'd like to see done differently (such as putting vmx_return at the beginning of the while (1), and the vmresume asm at the end), but it is indeed nice. Why do you prefer setjmp/longjmp then? Because it is still deceiving, and I dislike the deceit more than I like the linear code flow. What is deceiving about it? Of course for someone who has no idea how vmx works the code will not be obvious, but why should we care. For someone who knows what is deceiving about returning into the same function guest was launched from by using VMX mechanism instead of longjmp()? Agree, I dislike this magic too, but this is fixed by you suggestion above about putting vmx_return at the beginning of while(). So the logic will looks like that: asm volatile(vmlaunch;setbe %0\n\t : =m(ret)); while(ret) { while(!ret) if you use setbe, of course. Yeah, this not meant to be compilable code :) vmx_return: SAVE_GPR_C eax = exit_handler(); switch(eax) { } LOAD_GPR_C asm volatile(vmresume;seta %0\n\t : =m(ret)); } It is still somewhat magic: the while (ret) is only there to please No, it it there to catch vmlaunch/vmresume errors. the compiler, and you rely on the compiler not changing %rsp between the vmlaunch and the vmx_return label. Minor nit, you cannot anymore print HOST_RSP should be loaded on each guest entry. different error messages for vmlaunch vs. vmresume failure. Just because the same variable is used to store error values :) Make vmlaunch use err1 and vmresume err2 and do while (!err1 !err2) In the end the choice is between all in asm and small asm trampoline (which also happens to use setjmp/longjmp, but perhaps Arthur can propagate return codes without using setjmp/longjmp, too). Rewriting the whole guest entry exit path in asm like you suggest bellow will eliminate the magic too. I much prefer one big asm statement than many small asm statement scattered among two or three C lines. It's not many asm statements, it's just a dozen lines: I am not talking about overall file, but the how vmx_run() is written: asm() C code asm() C code ... I much prefer: C code big asm() blob C code. -- Gleb. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC PATCH] kvm-unit-tests : Basic architecture of VMX nested test case
Il 18/07/2013 13:06, Gleb Natapov ha scritto: On Thu, Jul 18, 2013 at 12:47:46PM +0200, Paolo Bonzini wrote: and for a testsuite I'd prefer the latter---which means I'd still favor setjmp/longjmp. Now, here is the long explanation. I must admit that the code looks nice. There are some nits I'd like to see done differently (such as putting vmx_return at the beginning of the while (1), and the vmresume asm at the end), but it is indeed nice. Why do you prefer setjmp/longjmp then? Because it is still deceiving, and I dislike the deceit more than I like the linear code flow. What is deceiving about it? Of course for someone who has no idea how vmx works the code will not be obvious, but why should we care. For someone who knows what is deceiving about returning into the same function guest was launched from by using VMX mechanism The way the code is written is deceiving. If I see asm(vmlaunch; seta %0) while (ret) I expect HOST_RIP to point at the seta or somewhere near, not at a completely different label somewhere else. instead of longjmp()? Look again at the setjmp/longjmp version. longjmp is not used to handle vmexit. It is used to jump back from the vmexit handler to main, which is exactly what setjmp/longjmp is meant for. It is still somewhat magic: the while (ret) is only there to please the compiler No, it it there to catch vmlaunch/vmresume errors. You could change it to while (0) and it would still work. That's what I mean by only there to please the compiler. the compiler, and you rely on the compiler not changing %rsp between the vmlaunch and the vmx_return label. Minor nit, you cannot anymore print HOST_RSP should be loaded on each guest entry. Right, but my point is: without a big asm blob, you don't know the right value to load. It depends on the generated code. And the big asm blob limits a lot the code looks nice value of this approach. different error messages for vmlaunch vs. vmresume failure. Just because the same variable is used to store error values :) Make vmlaunch use err1 and vmresume err2 and do while (!err1 !err2) Yeah. :) In the end the choice is between all in asm and small asm trampoline (which also happens to use setjmp/longjmp, but perhaps Arthur can propagate return codes without using setjmp/longjmp, too). Rewriting the whole guest entry exit path in asm like you suggest bellow will eliminate the magic too. I much prefer one big asm statement than many small asm statement scattered among two or three C lines. It's not many asm statements, it's just a dozen lines: I am not talking about overall file, but the how vmx_run() is written: asm() C code asm() C code ... I much prefer: C code big asm() blob C code. Me too. But the trampoline version is neither, it is static inline void vmlaunch() { asm(vmlaunch) } static inline void vmresume() { asm(vmresume) } small asm() blob C code Paolo -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] [ [PATCH]] nVMX: Initialize IA32_FEATURE_CONTROL MSR in reset and migration
On Tue, Jul 16, 2013 at 03:01:58PM +0300, Gleb Natapov wrote: On Tue, Jul 16, 2013 at 07:56:25PM +0800, Arthur Chunqi Li wrote: On Tue, Jul 16, 2013 at 7:42 PM, Gleb Natapov g...@redhat.com wrote: On Sun, Jul 07, 2013 at 11:13:37PM +0800, Arthur Chunqi Li wrote: The recent KVM patch adds IA32_FEATURE_CONTROL support. QEMU needs to clear this MSR when reset vCPU and keep the value of it when migration. This patch add this feature. So what happens if we migrate from qemu that does not have this patch to qemu that does? Since msr_ia32_feature_control will not be migrated it will not be set on the destination so destination will not be able to use nested vmx. Since nested vmx is experimental it may be to early for us to care about it though, and nested vmx does not work with migration anyway. In my test, if migration doesn't care about msr_ia32_feature_control, the value will be set to 0 in the destination VM and this may cause some logical confusions, but the VMX running on it may not aware of this (if migration nested vmx is supported in the future) because once VMX initialized, it will not check this msr any more in normal cases. With vmm_exclusive=0 kvm does vmxon/vmxoff while running. But lest not worry about nested kvm migration for now. There are much harder problems to overcome before it will work. This is also a complex problem since we don't know how many states like this msr need to be migrated related to nested virt. If there're a lot of states need migrating, it is better to reconstruct the relevant codes. But now this patch is enough. Besides, though migration is not supported in nested vmx, we should keep the machine state consistent during migration. So this patch is also meaningful. I'm assuming that even qemu-1.6 -machine pc-1.5 is not expected to allow migration to a qemu-1.5 binary. Is that OK for everybody, or should we support backwards migration? Other than that, the patch looks good to me. If migrating from a version that doesn't have the patch, we are just going to get the same behavior we had before. Arthur Signed-off-by: Arthur Chunqi Li yzt...@gmail.com --- target-i386/cpu.h |2 ++ target-i386/kvm.c |4 target-i386/machine.c | 22 ++ 3 files changed, 28 insertions(+) diff --git a/target-i386/cpu.h b/target-i386/cpu.h index 62e3547..a418e17 100644 --- a/target-i386/cpu.h +++ b/target-i386/cpu.h @@ -301,6 +301,7 @@ #define MSR_IA32_APICBASE_BSP (18) #define MSR_IA32_APICBASE_ENABLE(111) #define MSR_IA32_APICBASE_BASE (0xf12) +#define MSR_IA32_FEATURE_CONTROL0x003a #define MSR_TSC_ADJUST 0x003b #define MSR_IA32_TSCDEADLINE0x6e0 @@ -813,6 +814,7 @@ typedef struct CPUX86State { uint64_t mcg_status; uint64_t msr_ia32_misc_enable; +uint64_t msr_ia32_feature_control; /* exception/interrupt handling */ int error_code; diff --git a/target-i386/kvm.c b/target-i386/kvm.c index 39f4fbb..3cb2161 100644 --- a/target-i386/kvm.c +++ b/target-i386/kvm.c @@ -1122,6 +1122,7 @@ static int kvm_put_msrs(X86CPU *cpu, int level) if (hyperv_vapic_recommended()) { kvm_msr_entry_set(msrs[n++], HV_X64_MSR_APIC_ASSIST_PAGE, 0); } +kvm_msr_entry_set(msrs[n++], MSR_IA32_FEATURE_CONTROL, env-msr_ia32_feature_control); } if (env-mcg_cap) { int i; @@ -1346,6 +1347,7 @@ static int kvm_get_msrs(X86CPU *cpu) if (has_msr_misc_enable) { msrs[n++].index = MSR_IA32_MISC_ENABLE; } +msrs[n++].index = MSR_IA32_FEATURE_CONTROL; if (!env-tsc_valid) { msrs[n++].index = MSR_IA32_TSC; @@ -1444,6 +1446,8 @@ static int kvm_get_msrs(X86CPU *cpu) case MSR_IA32_MISC_ENABLE: env-msr_ia32_misc_enable = msrs[i].data; break; +case MSR_IA32_FEATURE_CONTROL: +env-msr_ia32_feature_control = msrs[i].data; default: if (msrs[i].index = MSR_MC0_CTL msrs[i].index MSR_MC0_CTL + (env-mcg_cap 0xff) * 4) { diff --git a/target-i386/machine.c b/target-i386/machine.c index 3659db9..94ca914 100644 --- a/target-i386/machine.c +++ b/target-i386/machine.c @@ -399,6 +399,14 @@ static bool misc_enable_needed(void *opaque) return env-msr_ia32_misc_enable != MSR_IA32_MISC_ENABLE_DEFAULT; } +static bool feature_control_needed(void *opaque) +{ +X86CPU *cpu = opaque; +CPUX86State *env = cpu-env; + +return env-msr_ia32_feature_control != 0; +} + static const VMStateDescription vmstate_msr_ia32_misc_enable = { .name = cpu/msr_ia32_misc_enable, .version_id = 1, @@ -410,6 +418,17 @@ static
[PATCH 1/2 v2] kvm: powerpc: Do not ignore E attribute in mas2
Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- v2: - No change arch/powerpc/kvm/e500.h |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/arch/powerpc/kvm/e500.h b/arch/powerpc/kvm/e500.h index c2e5e98..277cb18 100644 --- a/arch/powerpc/kvm/e500.h +++ b/arch/powerpc/kvm/e500.h @@ -117,7 +117,7 @@ static inline struct kvmppc_vcpu_e500 *to_e500(struct kvm_vcpu *vcpu) #define E500_TLB_USER_PERM_MASK (MAS3_UX|MAS3_UR|MAS3_UW) #define E500_TLB_SUPER_PERM_MASK (MAS3_SX|MAS3_SR|MAS3_SW) #define MAS2_ATTRIB_MASK \ - (MAS2_X0 | MAS2_X1) + (MAS2_X0 | MAS2_X1 | MAS2_E) #define MAS3_ATTRIB_MASK \ (MAS3_U0 | MAS3_U1 | MAS3_U2 | MAS3_U3 \ | E500_TLB_USER_PERM_MASK | E500_TLB_SUPER_PERM_MASK) -- 1.7.0.4 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages
If there is a struct page for the requested mapping then it's normal RAM and the mapping is set to M bit (coherent, cacheable) otherwise this is treated as I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- v2: some cleanup and added comment - arch/powerpc/kvm/e500_mmu_host.c | 23 ++- 1 files changed, 18 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..02eb973 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,26 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { + u32 mas2_attr; + + mas2_attr = mas2 MAS2_ATTRIB_MASK; + + /* +* RAM is always mappable on e500 systems, so this is identical +* to kvm_is_mmio_pfn(), just without its overhead. +*/ + if (!pfn_valid(pfn)) { + /* Pages not managed by Kernel are treated as I/O, set I + G */ + mas2_attr |= MAS2_I | MAS2_G; #ifdef CONFIG_SMP - return (mas2 MAS2_ATTRIB_MASK) | MAS2_M; -#else - return mas2 MAS2_ATTRIB_MASK; + } else { + /* Kernel managed pages are actually RAM so set M */ + mas2_attr |= MAS2_M; #endif + } + return mas2_attr; } /* @@ -313,7 +326,7 @@ static void kvmppc_e500_setup_stlbe( /* Force IPROT=0 for all guest mappings. */ stlbe-mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID; stlbe-mas2 = (gvaddr MAS2_EPN) | - e500_shadow_mas2_attrib(gtlbe-mas2, pr); + e500_shadow_mas2_attrib(gtlbe-mas2, pfn); stlbe-mas7_3 = ((u64)pfn PAGE_SHIFT) | e500_shadow_mas3_attrib(gtlbe-mas7_3, pr); -- 1.7.0.4 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 4/4] PF: Async page fault support on s390
Il 11/07/2013 12:41, Christian Borntraeger ha scritto: On 11/07/13 11:04, Gleb Natapov wrote: On Wed, Jul 10, 2013 at 02:59:55PM +0200, Dominik Dingel wrote: This patch enables async page faults for s390 kvm guests. It provides the userspace API to enable, disable or get the status of this feature. Also it includes the diagnose code, called by the guest to enable async page faults. The async page faults will use an already existing guest interface for this purpose, as described in CP Programming Services (SC24-6084). Signed-off-by: Dominik Dingel din...@linux.vnet.ibm.com Christian, looks good now? Looks good, but I just had a discussion with Dominik about several other cases (guest driven reboot, qemu driven reboot, life migration). This patch should allow all these cases (independent from this patch we need an ioctl to flush the list of pending interrupts to do so, but reboot is currently broken in that regard anyway - patch is currently being looked at) We are currently discussion if we should get rid of the APF_STATUS and let the kernel wait for outstanding page faults before returning from KVM_RUN or if we go with this patch and let userspace wait for completion. Will discuss this with Dominik, Conny and Alex. So lets defer that till next week, ok? Let us know if we should wait for a v5. :) Paolo -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC PATCH] kvm-unit-tests : Basic architecture of VMX nested test case
On Thu, Jul 18, 2013 at 8:08 PM, Paolo Bonzini pbonz...@redhat.com wrote: Il 18/07/2013 13:06, Gleb Natapov ha scritto: On Thu, Jul 18, 2013 at 12:47:46PM +0200, Paolo Bonzini wrote: and for a testsuite I'd prefer the latter---which means I'd still favor setjmp/longjmp. Now, here is the long explanation. I must admit that the code looks nice. There are some nits I'd like to see done differently (such as putting vmx_return at the beginning of the while (1), and the vmresume asm at the end), but it is indeed nice. Why do you prefer setjmp/longjmp then? Because it is still deceiving, and I dislike the deceit more than I like the linear code flow. What is deceiving about it? Of course for someone who has no idea how vmx works the code will not be obvious, but why should we care. For someone who knows what is deceiving about returning into the same function guest was launched from by using VMX mechanism The way the code is written is deceiving. If I see asm(vmlaunch; seta %0) while (ret) I expect HOST_RIP to point at the seta or somewhere near, not at a completely different label somewhere else. Here for me, I prefer a separate asm blob of entry_vmx instead of one hidden in a C function, which may make it much harder to trace for someone not familiar with vmx in KVM. instead of longjmp()? Look again at the setjmp/longjmp version. longjmp is not used to handle vmexit. It is used to jump back from the vmexit handler to main, which is exactly what setjmp/longjmp is meant for. It is still somewhat magic: the while (ret) is only there to please the compiler No, it it there to catch vmlaunch/vmresume errors. You could change it to while (0) and it would still work. That's what I mean by only there to please the compiler. the compiler, and you rely on the compiler not changing %rsp between the vmlaunch and the vmx_return label. Minor nit, you cannot anymore print HOST_RSP should be loaded on each guest entry. Right, but my point is: without a big asm blob, you don't know the right value to load. It depends on the generated code. And the big asm blob limits a lot the code looks nice value of this approach. different error messages for vmlaunch vs. vmresume failure. Just because the same variable is used to store error values :) Make vmlaunch use err1 and vmresume err2 and do while (!err1 !err2) Yeah. :) In the end the choice is between all in asm and small asm trampoline (which also happens to use setjmp/longjmp, but perhaps Arthur can propagate return codes without using setjmp/longjmp, too). Rewriting the whole guest entry exit path in asm like you suggest bellow will eliminate the magic too. I much prefer one big asm statement than many small asm statement scattered among two or three C lines. It's not many asm statements, it's just a dozen lines: I am not talking about overall file, but the how vmx_run() is written: asm() C code asm() C code ... I much prefer: C code big asm() blob C code. Me too. But the trampoline version is neither, it is static inline void vmlaunch() { asm(vmlaunch) } static inline void vmresume() { asm(vmresume) } small asm() blob C code So this is a style problem on the basis of right code generation indeed. When I write codes of this version, it occurs that the compiler optimized some of my codes and something goes wrong. If we use C style, we need setjmp/longjmp, otherwise we need big asm blob to forbid compiler optimization. I prefer to Paolo indeed as to my writing the two versions. Writing asm in C is sometimes uncomfortable (e.g. %rax and %%rax, and considering the C codes before and after the asm blob). Actually, we can hide setjmp in vmx_on and longjmp in the asm codes after executing exit_handler, thus we just need to call vmx_on to enter VM and return a designed value (e.g. -1) in exit_handler to exit VM. Then any following codes don't need to care about setjmp/longjmp. Arthur Paolo -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 4/4] PF: Async page fault support on s390
On 18/07/13 15:57, Paolo Bonzini wrote: Il 11/07/2013 12:41, Christian Borntraeger ha scritto: On 11/07/13 11:04, Gleb Natapov wrote: On Wed, Jul 10, 2013 at 02:59:55PM +0200, Dominik Dingel wrote: This patch enables async page faults for s390 kvm guests. It provides the userspace API to enable, disable or get the status of this feature. Also it includes the diagnose code, called by the guest to enable async page faults. The async page faults will use an already existing guest interface for this purpose, as described in CP Programming Services (SC24-6084). Signed-off-by: Dominik Dingel din...@linux.vnet.ibm.com Christian, looks good now? Looks good, but I just had a discussion with Dominik about several other cases (guest driven reboot, qemu driven reboot, life migration). This patch should allow all these cases (independent from this patch we need an ioctl to flush the list of pending interrupts to do so, but reboot is currently broken in that regard anyway - patch is currently being looked at) We are currently discussion if we should get rid of the APF_STATUS and let the kernel wait for outstanding page faults before returning from KVM_RUN or if we go with this patch and let userspace wait for completion. Will discuss this with Dominik, Conny and Alex. So lets defer that till next week, ok? Let us know if we should wait for a v5. :) Yes, there will be a v5 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2 v2] kvm: powerpc: Do not ignore E attribute in mas2
This needs a description. Why shouldn't we ignore E? Alex On 18.07.2013, at 15:19, Bharat Bhushan wrote: Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- v2: - No change arch/powerpc/kvm/e500.h |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/arch/powerpc/kvm/e500.h b/arch/powerpc/kvm/e500.h index c2e5e98..277cb18 100644 --- a/arch/powerpc/kvm/e500.h +++ b/arch/powerpc/kvm/e500.h @@ -117,7 +117,7 @@ static inline struct kvmppc_vcpu_e500 *to_e500(struct kvm_vcpu *vcpu) #define E500_TLB_USER_PERM_MASK (MAS3_UX|MAS3_UR|MAS3_UW) #define E500_TLB_SUPER_PERM_MASK (MAS3_SX|MAS3_SR|MAS3_SW) #define MAS2_ATTRIB_MASK \ - (MAS2_X0 | MAS2_X1) + (MAS2_X0 | MAS2_X1 | MAS2_E) #define MAS3_ATTRIB_MASK \ (MAS3_U0 | MAS3_U1 | MAS3_U2 | MAS3_U3 \ | E500_TLB_USER_PERM_MASK | E500_TLB_SUPER_PERM_MASK) -- 1.7.0.4 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages
On 18.07.2013, at 15:19, Bharat Bhushan wrote: If there is a struct page for the requested mapping then it's normal RAM and the mapping is set to M bit (coherent, cacheable) otherwise this is treated as I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- v2: some cleanup and added comment - arch/powerpc/kvm/e500_mmu_host.c | 23 ++- 1 files changed, 18 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..02eb973 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,26 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { + u32 mas2_attr; + + mas2_attr = mas2 MAS2_ATTRIB_MASK; + + /* + * RAM is always mappable on e500 systems, so this is identical + * to kvm_is_mmio_pfn(), just without its overhead. + */ + if (!pfn_valid(pfn)) { + /* Pages not managed by Kernel are treated as I/O, set I + G */ Please also document the intermediate thought that I/O should be mapped non-cached. + mas2_attr |= MAS2_I | MAS2_G; #ifdef CONFIG_SMP Please separate the SMP case out of the branch. - return (mas2 MAS2_ATTRIB_MASK) | MAS2_M; -#else - return mas2 MAS2_ATTRIB_MASK; + } else { + /* Kernel managed pages are actually RAM so set M */ This comment doesn't tell me why M can be set ;). Alex + mas2_attr |= MAS2_M; #endif + } + return mas2_attr; } /* @@ -313,7 +326,7 @@ static void kvmppc_e500_setup_stlbe( /* Force IPROT=0 for all guest mappings. */ stlbe-mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID; stlbe-mas2 = (gvaddr MAS2_EPN) | - e500_shadow_mas2_attrib(gtlbe-mas2, pr); + e500_shadow_mas2_attrib(gtlbe-mas2, pfn); stlbe-mas7_3 = ((u64)pfn PAGE_SHIFT) | e500_shadow_mas3_attrib(gtlbe-mas7_3, pr); -- 1.7.0.4 -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH 1/2 v2] kvm: powerpc: Do not ignore E attribute in mas2
-Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Alexander Graf Sent: Thursday, July 18, 2013 8:18 PM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-B07421; Bhushan Bharat-R65777 Subject: Re: [PATCH 1/2 v2] kvm: powerpc: Do not ignore E attribute in mas2 This needs a description. Why shouldn't we ignore E? What I understood is that there is no reason to stop guest setting E, so allow him. -Bharat Alex On 18.07.2013, at 15:19, Bharat Bhushan wrote: Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- v2: - No change arch/powerpc/kvm/e500.h |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/arch/powerpc/kvm/e500.h b/arch/powerpc/kvm/e500.h index c2e5e98..277cb18 100644 --- a/arch/powerpc/kvm/e500.h +++ b/arch/powerpc/kvm/e500.h @@ -117,7 +117,7 @@ static inline struct kvmppc_vcpu_e500 *to_e500(struct kvm_vcpu *vcpu) #define E500_TLB_USER_PERM_MASK (MAS3_UX|MAS3_UR|MAS3_UW) #define E500_TLB_SUPER_PERM_MASK (MAS3_SX|MAS3_SR|MAS3_SW) #define MAS2_ATTRIB_MASK \ - (MAS2_X0 | MAS2_X1) + (MAS2_X0 | MAS2_X1 | MAS2_E) #define MAS3_ATTRIB_MASK \ (MAS3_U0 | MAS3_U1 | MAS3_U2 | MAS3_U3 \ | E500_TLB_USER_PERM_MASK | E500_TLB_SUPER_PERM_MASK) -- 1.7.0.4 -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages
-Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Alexander Graf Sent: Thursday, July 18, 2013 8:23 PM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-B07421; Bhushan Bharat-R65777 Subject: Re: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages On 18.07.2013, at 15:19, Bharat Bhushan wrote: If there is a struct page for the requested mapping then it's normal RAM and the mapping is set to M bit (coherent, cacheable) otherwise this is treated as I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- v2: some cleanup and added comment - arch/powerpc/kvm/e500_mmu_host.c | 23 ++- 1 files changed, 18 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..02eb973 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,26 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { + u32 mas2_attr; + + mas2_attr = mas2 MAS2_ATTRIB_MASK; + + /* +* RAM is always mappable on e500 systems, so this is identical +* to kvm_is_mmio_pfn(), just without its overhead. +*/ + if (!pfn_valid(pfn)) { + /* Pages not managed by Kernel are treated as I/O, set I + G */ Please also document the intermediate thought that I/O should be mapped non- cached. I did not get what you mean to document? + mas2_attr |= MAS2_I | MAS2_G; #ifdef CONFIG_SMP Please separate the SMP case out of the branch. Really :) this was looking simple to me. - return (mas2 MAS2_ATTRIB_MASK) | MAS2_M; -#else - return mas2 MAS2_ATTRIB_MASK; + } else { + /* Kernel managed pages are actually RAM so set M */ This comment doesn't tell me why M can be set ;). RAM in SMP, so setting coherent, is not that obvious? -Bharat Alex + mas2_attr |= MAS2_M; #endif + } + return mas2_attr; } /* @@ -313,7 +326,7 @@ static void kvmppc_e500_setup_stlbe( /* Force IPROT=0 for all guest mappings. */ stlbe-mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID; stlbe-mas2 = (gvaddr MAS2_EPN) | - e500_shadow_mas2_attrib(gtlbe-mas2, pr); + e500_shadow_mas2_attrib(gtlbe-mas2, pfn); stlbe-mas7_3 = ((u64)pfn PAGE_SHIFT) | e500_shadow_mas3_attrib(gtlbe-mas7_3, pr); -- 1.7.0.4 -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages
On 18.07.2013, at 17:15, Bhushan Bharat-R65777 wrote: -Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Alexander Graf Sent: Thursday, July 18, 2013 8:23 PM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-B07421; Bhushan Bharat-R65777 Subject: Re: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages On 18.07.2013, at 15:19, Bharat Bhushan wrote: If there is a struct page for the requested mapping then it's normal RAM and the mapping is set to M bit (coherent, cacheable) otherwise this is treated as I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- v2: some cleanup and added comment - arch/powerpc/kvm/e500_mmu_host.c | 23 ++- 1 files changed, 18 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..02eb973 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,26 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { + u32 mas2_attr; + + mas2_attr = mas2 MAS2_ATTRIB_MASK; + + /* +* RAM is always mappable on e500 systems, so this is identical +* to kvm_is_mmio_pfn(), just without its overhead. +*/ + if (!pfn_valid(pfn)) { + /* Pages not managed by Kernel are treated as I/O, set I + G */ Please also document the intermediate thought that I/O should be mapped non- cached. I did not get what you mean to document? Page snot managed by the kernel are treated as I/O. Map it with disabled cache. + mas2_attr |= MAS2_I | MAS2_G; #ifdef CONFIG_SMP Please separate the SMP case out of the branch. Really :) this was looking simple to me. Two branches intertwined never look simple :). - return (mas2 MAS2_ATTRIB_MASK) | MAS2_M; -#else - return mas2 MAS2_ATTRIB_MASK; + } else { + /* Kernel managed pages are actually RAM so set M */ This comment doesn't tell me why M can be set ;). RAM in SMP, so setting coherent, is not that obvious? No. Alex -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2 v2] kvm: powerpc: Do not ignore E attribute in mas2
On 18.07.2013, at 17:12, Bhushan Bharat-R65777 wrote: -Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Alexander Graf Sent: Thursday, July 18, 2013 8:18 PM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-B07421; Bhushan Bharat-R65777 Subject: Re: [PATCH 1/2 v2] kvm: powerpc: Do not ignore E attribute in mas2 This needs a description. Why shouldn't we ignore E? What I understood is that there is no reason to stop guest setting E, so allow him. Please add that to the patch description. Also explain what the bit means. Alex -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH 1/2 v2] kvm: powerpc: Do not ignore E attribute in mas2
-Original Message- From: Alexander Graf [mailto:ag...@suse.de] Sent: Thursday, July 18, 2013 8:50 PM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-B07421 Subject: Re: [PATCH 1/2 v2] kvm: powerpc: Do not ignore E attribute in mas2 On 18.07.2013, at 17:12, Bhushan Bharat-R65777 wrote: -Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Alexander Graf Sent: Thursday, July 18, 2013 8:18 PM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-B07421; Bhushan Bharat-R65777 Subject: Re: [PATCH 1/2 v2] kvm: powerpc: Do not ignore E attribute in mas2 This needs a description. Why shouldn't we ignore E? What I understood is that there is no reason to stop guest setting E, so allow him. Please add that to the patch description. Also explain what the bit means. Ok :) -Bharat Alex -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
On 07/18/2013 05:00:42 AM, Alexander Graf wrote: Now why is setting invalid flags a problem? If I understand Scott correctly, it can break the host if you access certain host devices with caching enabled. But to be sure I'd say we ask him directly :). The architecture makes it illegal to mix cacheable and cache-inhibited mappings to the same physical page. Mixing W or M bits is generally bad as well. I've seen it cause machine checks, error interrupts, etc. -- not just corrupting the page in question. -Scott -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2 v2] kvm: powerpc: Do not ignore E attribute in mas2
On 07/18/2013 10:12:30 AM, Bhushan Bharat-R65777 wrote: -Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Alexander Graf Sent: Thursday, July 18, 2013 8:18 PM To: Bhushan Bharat-R65777 Cc: kvm-...@vger.kernel.org; kvm@vger.kernel.org; Wood Scott-B07421; Bhushan Bharat-R65777 Subject: Re: [PATCH 1/2 v2] kvm: powerpc: Do not ignore E attribute in mas2 This needs a description. Why shouldn't we ignore E? What I understood is that there is no reason to stop guest setting E, so allow him. FWIW, it'd probably be safe to let the guest control the G bit as well. -Scott -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages
On 07/18/2013 08:19:03 AM, Bharat Bhushan wrote: If there is a struct page for the requested mapping then it's normal RAM and the mapping is set to M bit (coherent, cacheable) otherwise this is treated as I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- v2: some cleanup and added comment - arch/powerpc/kvm/e500_mmu_host.c | 23 ++- 1 files changed, 18 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..02eb973 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,26 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { + u32 mas2_attr; + + mas2_attr = mas2 MAS2_ATTRIB_MASK; + + /* +* RAM is always mappable on e500 systems, so this is identical +* to kvm_is_mmio_pfn(), just without its overhead. +*/ + if (!pfn_valid(pfn)) { Please use page_is_ram(), which is what gets used when setting the WIMG for the host userspace mapping. We want to make sure the two are consistent. + /* Pages not managed by Kernel are treated as I/O, set I + G */ + mas2_attr |= MAS2_I | MAS2_G; #ifdef CONFIG_SMP - return (mas2 MAS2_ATTRIB_MASK) | MAS2_M; -#else - return mas2 MAS2_ATTRIB_MASK; + } else { + /* Kernel managed pages are actually RAM so set M */ + mas2_attr |= MAS2_M; #endif Likewise, we want to make sure this matches the host entry. Unfortunately, this is a bit of a mess already. 64-bit booke appears to always set MAS2_M for TLB0 mappings. The initial KERNELBASE mapping on boot uses M_IF_SMP, and the settlbcam() that (IIRC) replaces it uses _PAGE_COHERENT. 32-bit always uses _PAGE_COHERENT, except that initial KERNELBASE mapping. _PAGE_COHERENT appears to be set based on CONFIG_SMP || CONFIG_PPC_STD_MMU (the latter config clears _PAGE_COHERENT in the non-CPU_FTR_NEED_COHERENT case). As for what we actually want to happen, there are cases when we want M to be set for non-SMP. One such case is AMP, where CPUs may be sharing memory even if the Linux instance only runs on one CPU (this is not hypothetical, BTW). It's also possible that we encounter a hardware bug that requires MAS2_M, similar to what some of our non-booke chips require. -Scott -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages
On 18.07.2013, at 19:17, Scott Wood wrote: On 07/18/2013 08:19:03 AM, Bharat Bhushan wrote: If there is a struct page for the requested mapping then it's normal RAM and the mapping is set to M bit (coherent, cacheable) otherwise this is treated as I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- v2: some cleanup and added comment - arch/powerpc/kvm/e500_mmu_host.c | 23 ++- 1 files changed, 18 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..02eb973 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,26 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { +u32 mas2_attr; + +mas2_attr = mas2 MAS2_ATTRIB_MASK; + +/* + * RAM is always mappable on e500 systems, so this is identical + * to kvm_is_mmio_pfn(), just without its overhead. + */ +if (!pfn_valid(pfn)) { Please use page_is_ram(), which is what gets used when setting the WIMG for the host userspace mapping. We want to make sure the two are consistent. +/* Pages not managed by Kernel are treated as I/O, set I + G */ +mas2_attr |= MAS2_I | MAS2_G; #ifdef CONFIG_SMP -return (mas2 MAS2_ATTRIB_MASK) | MAS2_M; -#else -return mas2 MAS2_ATTRIB_MASK; +} else { +/* Kernel managed pages are actually RAM so set M */ +mas2_attr |= MAS2_M; #endif Likewise, we want to make sure this matches the host entry. Unfortunately, this is a bit of a mess already. 64-bit booke appears to always set MAS2_M for TLB0 mappings. The initial KERNELBASE mapping on boot uses M_IF_SMP, and the settlbcam() that (IIRC) replaces it uses _PAGE_COHERENT. 32-bit always uses _PAGE_COHERENT, except that initial KERNELBASE mapping. _PAGE_COHERENT appears to be set based on CONFIG_SMP || CONFIG_PPC_STD_MMU (the latter config clears _PAGE_COHERENT in the non-CPU_FTR_NEED_COHERENT case). As for what we actually want to happen, there are cases when we want M to be set for non-SMP. One such case is AMP, where CPUs may be sharing memory even if the Linux instance only runs on one CPU (this is not hypothetical, BTW). It's also possible that we encounter a hardware bug that requires MAS2_M, similar to what some of our non-booke chips require. How about we always set M then for RAM? Alex -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages
On 07/18/2013 12:32:18 PM, Alexander Graf wrote: On 18.07.2013, at 19:17, Scott Wood wrote: On 07/18/2013 08:19:03 AM, Bharat Bhushan wrote: Likewise, we want to make sure this matches the host entry. Unfortunately, this is a bit of a mess already. 64-bit booke appears to always set MAS2_M for TLB0 mappings. The initial KERNELBASE mapping on boot uses M_IF_SMP, and the settlbcam() that (IIRC) replaces it uses _PAGE_COHERENT. 32-bit always uses _PAGE_COHERENT, except that initial KERNELBASE mapping. _PAGE_COHERENT appears to be set based on CONFIG_SMP || CONFIG_PPC_STD_MMU (the latter config clears _PAGE_COHERENT in the non-CPU_FTR_NEED_COHERENT case). As for what we actually want to happen, there are cases when we want M to be set for non-SMP. One such case is AMP, where CPUs may be sharing memory even if the Linux instance only runs on one CPU (this is not hypothetical, BTW). It's also possible that we encounter a hardware bug that requires MAS2_M, similar to what some of our non-booke chips require. How about we always set M then for RAM? M is like I in that bad things happen if you mix them. So we really want to match exactly what the rest of the kernel is doing. Plus, the performance penalty on some single-core chips can be pretty bad. -Scott -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC PATCH] kvm-unit-tests : Basic architecture of VMX nested test case
On Thu, Jul 18, 2013 at 02:08:51PM +0200, Paolo Bonzini wrote: Il 18/07/2013 13:06, Gleb Natapov ha scritto: On Thu, Jul 18, 2013 at 12:47:46PM +0200, Paolo Bonzini wrote: and for a testsuite I'd prefer the latter---which means I'd still favor setjmp/longjmp. Now, here is the long explanation. I must admit that the code looks nice. There are some nits I'd like to see done differently (such as putting vmx_return at the beginning of the while (1), and the vmresume asm at the end), but it is indeed nice. Why do you prefer setjmp/longjmp then? Because it is still deceiving, and I dislike the deceit more than I like the linear code flow. What is deceiving about it? Of course for someone who has no idea how vmx works the code will not be obvious, but why should we care. For someone who knows what is deceiving about returning into the same function guest was launched from by using VMX mechanism The way the code is written is deceiving. If I see asm(vmlaunch; seta %0) while (ret) I expect HOST_RIP to point at the seta or somewhere near, not at a completely different label somewhere else. Why would you expect that assuming you know what vmlaunch is? So it is OK when HOST_RIP point to somewhere outside a function, but deceiving if it point 3 lines down in the same function? I have a hard time following this logic. instead of longjmp()? Look again at the setjmp/longjmp version. longjmp is not used to handle vmexit. It is used to jump back from the vmexit handler to main, which is exactly what setjmp/longjmp is meant for. That's because simple return will not work in that version, this is artifact of how vmexit was done. It is still somewhat magic: the while (ret) is only there to please the compiler No, it it there to catch vmlaunch/vmresume errors. You could change it to while (0) and it would still work. That's what I mean by only there to please the compiler. But while (1) will not, so the code is executed, it is not just for compiler there, but it is executed only if vmlaunch/vmresume fails. the compiler, and you rely on the compiler not changing %rsp between the vmlaunch and the vmx_return label. Minor nit, you cannot anymore print HOST_RSP should be loaded on each guest entry. Right, but my point is: without a big asm blob, you don't know the right value to load. It depends on the generated code. And the big asm blob limits a lot the code looks nice value of this approach. I said it number of times already, this is not about code looking nice, code looks like KVM or use less assembler as possible, this is about linear data flow. It is not fun chasing code execution path. Yes, you can argue that vmlaunch/vmresume inherently non linear, but there is a difference between skipping one instruction and remain in the same function and on the same stack, or jump completely to a different context. Speaking about KVM. Guest enter/exit assembly blob is around ~50 lines if assembly code and more then half of that is saving restoring context. And the code plays some tricks there for optimization that we do not need to do here, so I expect the code to be even smaller, not much more then 10 lines of assembly excluding state save/restore. different error messages for vmlaunch vs. vmresume failure. Just because the same variable is used to store error values :) Make vmlaunch use err1 and vmresume err2 and do while (!err1 !err2) Yeah. :) In the end the choice is between all in asm and small asm trampoline (which also happens to use setjmp/longjmp, but perhaps Arthur can propagate return codes without using setjmp/longjmp, too). Rewriting the whole guest entry exit path in asm like you suggest bellow will eliminate the magic too. I much prefer one big asm statement than many small asm statement scattered among two or three C lines. It's not many asm statements, it's just a dozen lines: I am not talking about overall file, but the how vmx_run() is written: asm() C code asm() C code ... I much prefer: C code big asm() blob C code. Me too. But the trampoline version is neither, it is static inline void vmlaunch() { asm(vmlaunch) } static inline void vmresume() { asm(vmresume) } small asm() blob I is small only because context save restore is hidden behind macro and 4 asm instructions (vmlaunch/seta/vmresume/seta) a hidden somewhere else. The actually differences in asm instruction between both version will not be bigger then a couple of lines (if we will not take setjmp/longjmp implementation into account :)), but I do not even see why we discuss this argument since minimizing asm instructions here is not an point. We should not use more then needed to achieve the goal of course, but design should not be around number of assembly lines. C code Paolo -- Gleb. -- To unsubscribe from this list: send the line unsubscribe kvm in
[PATCH] perf, kvm: Support the in_tx/in_tx_cp modifiers in KVM arch perfmon emulation v5
From: Andi Kleen a...@linux.intel.com [KVM maintainers: The underlying support for this is in perf/core now. So please merge this patch into the KVM tree.] This is not arch perfmon, but older CPUs will just ignore it. This makes it possible to do at least some TSX measurements from a KVM guest v2: Various fixes to address review feedback v3: Ignore the bits when no CPUID. No #GP. Force raw events with TSX bits. v4: Use reserved bits for #GP v5: Remove obsolete argument Acked-by: Gleb Natapov g...@redhat.com Signed-off-by: Andi Kleen a...@linux.intel.com --- arch/x86/include/asm/kvm_host.h | 1 + arch/x86/kvm/pmu.c | 25 - 2 files changed, 21 insertions(+), 5 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index af9c552..01493a1 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -315,6 +315,7 @@ struct kvm_pmu { u64 global_ovf_ctrl; u64 counter_bitmask[2]; u64 global_ctrl_mask; + u64 reserved_bits; u8 version; struct kvm_pmc gp_counters[INTEL_PMC_MAX_GENERIC]; struct kvm_pmc fixed_counters[INTEL_PMC_MAX_FIXED]; diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c index c53e797..5c4f631 100644 --- a/arch/x86/kvm/pmu.c +++ b/arch/x86/kvm/pmu.c @@ -160,7 +160,7 @@ static void stop_counter(struct kvm_pmc *pmc) static void reprogram_counter(struct kvm_pmc *pmc, u32 type, unsigned config, bool exclude_user, bool exclude_kernel, - bool intr) + bool intr, bool in_tx, bool in_tx_cp) { struct perf_event *event; struct perf_event_attr attr = { @@ -173,6 +173,10 @@ static void reprogram_counter(struct kvm_pmc *pmc, u32 type, .exclude_kernel = exclude_kernel, .config = config, }; + if (in_tx) + attr.config |= HSW_IN_TX; + if (in_tx_cp) + attr.config |= HSW_IN_TX_CHECKPOINTED; attr.sample_period = (-pmc-counter) pmc_bitmask(pmc); @@ -226,7 +230,9 @@ static void reprogram_gp_counter(struct kvm_pmc *pmc, u64 eventsel) if (!(eventsel (ARCH_PERFMON_EVENTSEL_EDGE | ARCH_PERFMON_EVENTSEL_INV | - ARCH_PERFMON_EVENTSEL_CMASK))) { + ARCH_PERFMON_EVENTSEL_CMASK | + HSW_IN_TX | + HSW_IN_TX_CHECKPOINTED))) { config = find_arch_event(pmc-vcpu-arch.pmu, event_select, unit_mask); if (config != PERF_COUNT_HW_MAX) @@ -239,7 +245,9 @@ static void reprogram_gp_counter(struct kvm_pmc *pmc, u64 eventsel) reprogram_counter(pmc, type, config, !(eventsel ARCH_PERFMON_EVENTSEL_USR), !(eventsel ARCH_PERFMON_EVENTSEL_OS), - eventsel ARCH_PERFMON_EVENTSEL_INT); + eventsel ARCH_PERFMON_EVENTSEL_INT, + (eventsel HSW_IN_TX), + (eventsel HSW_IN_TX_CHECKPOINTED)); } static void reprogram_fixed_counter(struct kvm_pmc *pmc, u8 en_pmi, int idx) @@ -256,7 +264,7 @@ static void reprogram_fixed_counter(struct kvm_pmc *pmc, u8 en_pmi, int idx) arch_events[fixed_pmc_events[idx]].event_type, !(en 0x2), /* exclude user */ !(en 0x1), /* exclude kernel */ - pmi); + pmi, false, false); } static inline u8 fixed_en_pmi(u64 ctrl, int idx) @@ -408,7 +416,7 @@ int kvm_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) } else if ((pmc = get_gp_pmc(pmu, index, MSR_P6_EVNTSEL0))) { if (data == pmc-eventsel) return 0; - if (!(data 0x0020ull)) { + if (!(data pmu-reserved_bits)) { reprogram_gp_counter(pmc, data); return 0; } @@ -450,6 +458,7 @@ void kvm_pmu_cpuid_update(struct kvm_vcpu *vcpu) pmu-counter_bitmask[KVM_PMC_GP] = 0; pmu-counter_bitmask[KVM_PMC_FIXED] = 0; pmu-version = 0; + pmu-reserved_bits = 0x0020ull; entry = kvm_find_cpuid_entry(vcpu, 0xa, 0); if (!entry) @@ -478,6 +487,12 @@ void kvm_pmu_cpuid_update(struct kvm_vcpu *vcpu) pmu-global_ctrl = ((1 pmu-nr_arch_gp_counters) - 1) | (((1ull pmu-nr_arch_fixed_counters) - 1) INTEL_PMC_IDX_FIXED); pmu-global_ctrl_mask = ~pmu-global_ctrl; + + entry = kvm_find_cpuid_entry(vcpu, 7, 0); + if (entry + (boot_cpu_has(X86_FEATURE_HLE) || boot_cpu_has(X86_FEATURE_RTM)) + (entry-ebx (X86_FEATURE_HLE|X86_FEATURE_RTM))) +
Re: [PATCH 00/10 v6] KVM: PPC: IOMMU in-kernel handling
On 07/16/2013 10:53 AM, Alexey Kardashevskiy wrote: The changes are: 1. rebased on v3.11-rc1 so the capability numbers changed again 2. fixed multiple comments from maintainers 3. KVM: PPC: Add support for IOMMU in-kernel handling is split into 2 patches, the new one is powerpc/iommu: rework to support realmode. 4. IOMMU_API is now always enabled for KVM_BOOK3S_64. MOre details in the individual patch comments. Depends on hashtable: add hash_for_each_possible_rcu_notrace(), posted a while ago. Alexey Kardashevskiy (10): KVM: PPC: reserve a capability number for multitce support KVM: PPC: reserve a capability and ioctl numbers for realmode VFIO Alex, could you please pull these 2 patches or tell what is wrong with them? Having them sooner in the kernel would let me ask for a headers update for QEMU and then I would try pushing miltiple TCE and VFIO support in QEMU. Thanks. -- Alexey -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/2] kvm: powerpc: Do not ignore E attribute in mas2
Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/kvm/e500.h |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/arch/powerpc/kvm/e500.h b/arch/powerpc/kvm/e500.h index c2e5e98..277cb18 100644 --- a/arch/powerpc/kvm/e500.h +++ b/arch/powerpc/kvm/e500.h @@ -117,7 +117,7 @@ static inline struct kvmppc_vcpu_e500 *to_e500(struct kvm_vcpu *vcpu) #define E500_TLB_USER_PERM_MASK (MAS3_UX|MAS3_UR|MAS3_UW) #define E500_TLB_SUPER_PERM_MASK (MAS3_SX|MAS3_SR|MAS3_SW) #define MAS2_ATTRIB_MASK \ - (MAS2_X0 | MAS2_X1) + (MAS2_X0 | MAS2_X1 | MAS2_E) #define MAS3_ATTRIB_MASK \ (MAS3_U0 | MAS3_U1 | MAS3_U2 | MAS3_U3 \ | E500_TLB_USER_PERM_MASK | E500_TLB_SUPER_PERM_MASK) -- 1.7.0.4 -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
If there is a struct page for the requested mapping then it's normal DDR and the mapping sets M bit (coherent, cacheable) else this is treated as I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/kvm/e500_mmu_host.c | 17 - 1 files changed, 12 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..089c227 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { + u32 mas2_attr; + + mas2_attr = mas2 MAS2_ATTRIB_MASK; + + if (!pfn_valid(pfn)) { + mas2_attr |= MAS2_I | MAS2_G; + } else { #ifdef CONFIG_SMP - return (mas2 MAS2_ATTRIB_MASK) | MAS2_M; -#else - return mas2 MAS2_ATTRIB_MASK; + mas2_attr |= MAS2_M; #endif + } + return mas2_attr; } /* @@ -313,7 +320,7 @@ static void kvmppc_e500_setup_stlbe( /* Force IPROT=0 for all guest mappings. */ stlbe-mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID; stlbe-mas2 = (gvaddr MAS2_EPN) | - e500_shadow_mas2_attrib(gtlbe-mas2, pr); + e500_shadow_mas2_attrib(gtlbe-mas2, pfn); stlbe-mas7_3 = ((u64)pfn PAGE_SHIFT) | e500_shadow_mas3_attrib(gtlbe-mas7_3, pr); -- 1.7.0.4 -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
On 07/18/2013 02:04 PM, Bharat Bhushan wrote: If there is a struct page for the requested mapping then it's normal DDR and the mapping sets M bit (coherent, cacheable) else this is treated as I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/kvm/e500_mmu_host.c | 17 - 1 files changed, 12 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..089c227 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { + u32 mas2_attr; + + mas2_attr = mas2 MAS2_ATTRIB_MASK; + + if (!pfn_valid(pfn)) { Why not directly use kvm_is_mmio_pfn()? Tiejun + mas2_attr |= MAS2_I | MAS2_G; + } else { #ifdef CONFIG_SMP - return (mas2 MAS2_ATTRIB_MASK) | MAS2_M; -#else - return mas2 MAS2_ATTRIB_MASK; + mas2_attr |= MAS2_M; #endif + } + return mas2_attr; } /* @@ -313,7 +320,7 @@ static void kvmppc_e500_setup_stlbe( /* Force IPROT=0 for all guest mappings. */ stlbe-mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID; stlbe-mas2 = (gvaddr MAS2_EPN) | - e500_shadow_mas2_attrib(gtlbe-mas2, pr); + e500_shadow_mas2_attrib(gtlbe-mas2, pfn); stlbe-mas7_3 = ((u64)pfn PAGE_SHIFT) | e500_shadow_mas3_attrib(gtlbe-mas7_3, pr); -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
-Original Message- From: “tiejun.chen” [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 11:56 AM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421; Bhushan Bharat-R65777 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 02:04 PM, Bharat Bhushan wrote: If there is a struct page for the requested mapping then it's normal DDR and the mapping sets M bit (coherent, cacheable) else this is treated as I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/kvm/e500_mmu_host.c | 17 - 1 files changed, 12 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..089c227 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { + u32 mas2_attr; + + mas2_attr = mas2 MAS2_ATTRIB_MASK; + + if (!pfn_valid(pfn)) { Why not directly use kvm_is_mmio_pfn()? What I understand from this function (someone can correct me) is that it returns false when the page is managed by kernel and is not marked as RESERVED (for some reason). For us it does not matter whether the page is reserved or not, if it is kernel visible page then it is DDR. -Bharat Tiejun + mas2_attr |= MAS2_I | MAS2_G; + } else { #ifdef CONFIG_SMP - return (mas2 MAS2_ATTRIB_MASK) | MAS2_M; -#else - return mas2 MAS2_ATTRIB_MASK; + mas2_attr |= MAS2_M; #endif + } + return mas2_attr; } /* @@ -313,7 +320,7 @@ static void kvmppc_e500_setup_stlbe( /* Force IPROT=0 for all guest mappings. */ stlbe-mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID; stlbe-mas2 = (gvaddr MAS2_EPN) | - e500_shadow_mas2_attrib(gtlbe-mas2, pr); + e500_shadow_mas2_attrib(gtlbe-mas2, pfn); stlbe-mas7_3 = ((u64)pfn PAGE_SHIFT) | e500_shadow_mas3_attrib(gtlbe-mas7_3, pr); N�r��yb�X��ǧv�^�){.n�+jir)w*jg����ݢj/���z�ޖ��2�ޙ�)ߡ�a�����G���h��j:+v���w��٥
Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: “tiejun.chen” [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 11:56 AM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421; Bhushan Bharat-R65777 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 02:04 PM, Bharat Bhushan wrote: If there is a struct page for the requested mapping then it's normal DDR and the mapping sets M bit (coherent, cacheable) else this is treated as I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/kvm/e500_mmu_host.c | 17 - 1 files changed, 12 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..089c227 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { + u32 mas2_attr; + + mas2_attr = mas2 MAS2_ATTRIB_MASK; + + if (!pfn_valid(pfn)) { Why not directly use kvm_is_mmio_pfn()? What I understand from this function (someone can correct me) is that it returns false when the page is managed by kernel and is not marked as RESERVED (for some reason). For us it does not matter whether the page is reserved or not, if it is kernel visible page then it is DDR. I think you are setting I|G by addressing all mmio pages, right? If so, KVM: direct mmio pfn check Userspace may specify memory slots that are backed by mmio pages rather than normal RAM. In some cases it is not enough to identify these mmio pages by pfn_valid(). This patch adds checking the PageReserved as well. Tiejun -Bharat Tiejun + mas2_attr |= MAS2_I | MAS2_G; + } else { #ifdef CONFIG_SMP - return (mas2 MAS2_ATTRIB_MASK) | MAS2_M; -#else - return mas2 MAS2_ATTRIB_MASK; + mas2_attr |= MAS2_M; #endif + } + return mas2_attr; } /* @@ -313,7 +320,7 @@ static void kvmppc_e500_setup_stlbe( /* Force IPROT=0 for all guest mappings. */ stlbe-mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID; stlbe-mas2 = (gvaddr MAS2_EPN) | - e500_shadow_mas2_attrib(gtlbe-mas2, pr); + e500_shadow_mas2_attrib(gtlbe-mas2, pfn); stlbe-mas7_3 = ((u64)pfn PAGE_SHIFT) | e500_shadow_mas3_attrib(gtlbe-mas7_3, pr); -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
-Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of “tiejun.chen” Sent: Thursday, July 18, 2013 1:01 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: “tiejun.chen” [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 11:56 AM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421; Bhushan Bharat-R65777 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 02:04 PM, Bharat Bhushan wrote: If there is a struct page for the requested mapping then it's normal DDR and the mapping sets M bit (coherent, cacheable) else this is treated as I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/kvm/e500_mmu_host.c | 17 - 1 files changed, 12 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..089c227 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { + u32 mas2_attr; + + mas2_attr = mas2 MAS2_ATTRIB_MASK; + + if (!pfn_valid(pfn)) { Why not directly use kvm_is_mmio_pfn()? What I understand from this function (someone can correct me) is that it returns false when the page is managed by kernel and is not marked as RESERVED (for some reason). For us it does not matter whether the page is reserved or not, if it is kernel visible page then it is DDR. I think you are setting I|G by addressing all mmio pages, right? If so, KVM: direct mmio pfn check Userspace may specify memory slots that are backed by mmio pages rather than normal RAM. In some cases it is not enough to identify these mmio pages by pfn_valid(). This patch adds checking the PageReserved as well. Do you know what are those some cases and how checking PageReserved helps in those cases? -Bharat Tiejun -Bharat Tiejun + mas2_attr |= MAS2_I | MAS2_G; + } else { #ifdef CONFIG_SMP - return (mas2 MAS2_ATTRIB_MASK) | MAS2_M; -#else - return mas2 MAS2_ATTRIB_MASK; + mas2_attr |= MAS2_M; #endif + } + return mas2_attr; } /* @@ -313,7 +320,7 @@ static void kvmppc_e500_setup_stlbe( /* Force IPROT=0 for all guest mappings. */ stlbe-mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID; stlbe-mas2 = (gvaddr MAS2_EPN) | - e500_shadow_mas2_attrib(gtlbe-mas2, pr); + e500_shadow_mas2_attrib(gtlbe-mas2, pfn); stlbe-mas7_3 = ((u64)pfn PAGE_SHIFT) | e500_shadow_mas3_attrib(gtlbe-mas7_3, pr); -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html N�r��yb�X��ǧv�^�){.n�+jir)w*jg����ݢj/���z�ޖ��2�ޙ�)ߡ�a�����G���h��j:+v���w��٥
Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of “tiejun.chen” Sent: Thursday, July 18, 2013 1:01 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: “tiejun.chen” [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 11:56 AM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421; Bhushan Bharat-R65777 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 02:04 PM, Bharat Bhushan wrote: If there is a struct page for the requested mapping then it's normal DDR and the mapping sets M bit (coherent, cacheable) else this is treated as I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/kvm/e500_mmu_host.c | 17 - 1 files changed, 12 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..089c227 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { + u32 mas2_attr; + + mas2_attr = mas2 MAS2_ATTRIB_MASK; + + if (!pfn_valid(pfn)) { Why not directly use kvm_is_mmio_pfn()? What I understand from this function (someone can correct me) is that it returns false when the page is managed by kernel and is not marked as RESERVED (for some reason). For us it does not matter whether the page is reserved or not, if it is kernel visible page then it is DDR. I think you are setting I|G by addressing all mmio pages, right? If so, KVM: direct mmio pfn check Userspace may specify memory slots that are backed by mmio pages rather than normal RAM. In some cases it is not enough to identify these mmio pages by pfn_valid(). This patch adds checking the PageReserved as well. Do you know what are those some cases and how checking PageReserved helps in those cases? No, myself didn't see these actual cases in qemu,too. But this should be chronically persistent as I understand ;-) Tiejun -Bharat Tiejun -Bharat Tiejun + mas2_attr |= MAS2_I | MAS2_G; + } else { #ifdef CONFIG_SMP - return (mas2 MAS2_ATTRIB_MASK) | MAS2_M; -#else - return mas2 MAS2_ATTRIB_MASK; + mas2_attr |= MAS2_M; #endif + } + return mas2_attr; } /* @@ -313,7 +320,7 @@ static void kvmppc_e500_setup_stlbe( /* Force IPROT=0 for all guest mappings. */ stlbe-mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID; stlbe-mas2 = (gvaddr MAS2_EPN) | - e500_shadow_mas2_attrib(gtlbe-mas2, pr); + e500_shadow_mas2_attrib(gtlbe-mas2, pfn); stlbe-mas7_3 = ((u64)pfn PAGE_SHIFT) | e500_shadow_mas3_attrib(gtlbe-mas7_3, pr); -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
-Original Message- From: “tiejun.chen” [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 1:52 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of “tiejun.chen” Sent: Thursday, July 18, 2013 1:01 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: “tiejun.chen” [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 11:56 AM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421; Bhushan Bharat-R65777 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 02:04 PM, Bharat Bhushan wrote: If there is a struct page for the requested mapping then it's normal DDR and the mapping sets M bit (coherent, cacheable) else this is treated as I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/kvm/e500_mmu_host.c | 17 - 1 files changed, 12 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..089c227 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { + u32 mas2_attr; + + mas2_attr = mas2 MAS2_ATTRIB_MASK; + + if (!pfn_valid(pfn)) { Why not directly use kvm_is_mmio_pfn()? What I understand from this function (someone can correct me) is that it returns false when the page is managed by kernel and is not marked as RESERVED (for some reason). For us it does not matter whether the page is reserved or not, if it is kernel visible page then it is DDR. I think you are setting I|G by addressing all mmio pages, right? If so, KVM: direct mmio pfn check Userspace may specify memory slots that are backed by mmio pages rather than normal RAM. In some cases it is not enough to identify these mmio pages by pfn_valid(). This patch adds checking the PageReserved as well. Do you know what are those some cases and how checking PageReserved helps in those cases? No, myself didn't see these actual cases in qemu,too. But this should be chronically persistent as I understand ;-) Then I will wait till someone educate me :) -Bharat Tiejun -Bharat Tiejun -Bharat Tiejun + mas2_attr |= MAS2_I | MAS2_G; + } else { #ifdef CONFIG_SMP - return (mas2 MAS2_ATTRIB_MASK) | MAS2_M; -#else - return mas2 MAS2_ATTRIB_MASK; + mas2_attr |= MAS2_M; #endif + } + return mas2_attr; } /* @@ -313,7 +320,7 @@ static void kvmppc_e500_setup_stlbe( /* Force IPROT=0 for all guest mappings. */ stlbe-mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID; stlbe-mas2 = (gvaddr MAS2_EPN) | - e500_shadow_mas2_attrib(gtlbe-mas2, pr); + e500_shadow_mas2_attrib(gtlbe-mas2, pfn); stlbe-mas7_3 = ((u64)pfn PAGE_SHIFT) | e500_shadow_mas3_attrib(gtlbe-mas7_3, pr); -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
On 07/18/2013 02:04 PM, Bharat Bhushan wrote: If there is a struct page for the requested mapping then it's normal DDR and the mapping sets M bit (coherent, cacheable) else this is treated as I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/kvm/e500_mmu_host.c | 17 - 1 files changed, 12 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..089c227 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { + u32 mas2_attr; + + mas2_attr = mas2 MAS2_ATTRIB_MASK; + + if (!pfn_valid(pfn)) { + mas2_attr |= MAS2_I | MAS2_G; + } else { #ifdef CONFIG_SMP - return (mas2 MAS2_ATTRIB_MASK) | MAS2_M; -#else - return mas2 MAS2_ATTRIB_MASK; + mas2_attr |= MAS2_M; #endif + } Additionally, in UP case this little chunk of code is equivalent to if (1) { mas2_attr |= MAS2_I | MAS2_G; } else { } So you'd better wrapper MAS2_m in advance like, #ifdef CONFIG_SMP #define M_IF_SMPMAS2_M #else #define M_IF_SMP0 #endif Then if (1) mas2_attr |= MAS2_I | MAS2_G; else mas2_attr |= M_IF_SMP; Tiejun + return mas2_attr; } /* @@ -313,7 +320,7 @@ static void kvmppc_e500_setup_stlbe( /* Force IPROT=0 for all guest mappings. */ stlbe-mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID; stlbe-mas2 = (gvaddr MAS2_EPN) | - e500_shadow_mas2_attrib(gtlbe-mas2, pr); + e500_shadow_mas2_attrib(gtlbe-mas2, pfn); stlbe-mas7_3 = ((u64)pfn PAGE_SHIFT) | e500_shadow_mas3_attrib(gtlbe-mas7_3, pr); -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
-Original Message- From: Bhushan Bharat-R65777 Sent: Thursday, July 18, 2013 1:53 PM To: '“tiejun.chen”' Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages -Original Message- From: “tiejun.chen” [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 1:52 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of “tiejun.chen” Sent: Thursday, July 18, 2013 1:01 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: “tiejun.chen” [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 11:56 AM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421; Bhushan Bharat-R65777 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 02:04 PM, Bharat Bhushan wrote: If there is a struct page for the requested mapping then it's normal DDR and the mapping sets M bit (coherent, cacheable) else this is treated as I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/kvm/e500_mmu_host.c | 17 - 1 files changed, 12 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..089c227 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { + u32 mas2_attr; + + mas2_attr = mas2 MAS2_ATTRIB_MASK; + + if (!pfn_valid(pfn)) { Why not directly use kvm_is_mmio_pfn()? What I understand from this function (someone can correct me) is that it returns false when the page is managed by kernel and is not marked as RESERVED (for some reason). For us it does not matter whether the page is reserved or not, if it is kernel visible page then it is DDR. I think you are setting I|G by addressing all mmio pages, right? If so, KVM: direct mmio pfn check Userspace may specify memory slots that are backed by mmio pages rather than normal RAM. In some cases it is not enough to identify these mmio pages by pfn_valid(). This patch adds checking the PageReserved as well. Do you know what are those some cases and how checking PageReserved helps in those cases? No, myself didn't see these actual cases in qemu,too. But this should be chronically persistent as I understand ;-) Then I will wait till someone educate me :) The reason is , kvm_is_mmio_pfn() function looks pretty heavy and I do not want to call this for all tlbwe operation unless it is necessary. -Bharat + mas2_attr |= MAS2_I | MAS2_G; + } else { #ifdef CONFIG_SMP - return (mas2 MAS2_ATTRIB_MASK) | MAS2_M; -#else - return mas2 MAS2_ATTRIB_MASK; + mas2_attr |= MAS2_M; #endif + } + return mas2_attr; } /* @@ -313,7 +320,7 @@ static void kvmppc_e500_setup_stlbe( /* Force IPROT=0 for all guest mappings. */ stlbe-mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID; stlbe-mas2 = (gvaddr MAS2_EPN) | - e500_shadow_mas2_attrib(gtlbe-mas2, pr); + e500_shadow_mas2_attrib(gtlbe-mas2, pfn); stlbe-mas7_3 = ((u64)pfn PAGE_SHIFT) | e500_shadow_mas3_attrib(gtlbe-mas7_3, pr); -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html N�r��yb�X��ǧv�^�){.n�+jir)w*jg����ݢj/���z�ޖ��2�ޙ�)ߡ�a�����G���h��j:+v���w��٥
Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
On 07/18/2013 04:25 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: Bhushan Bharat-R65777 Sent: Thursday, July 18, 2013 1:53 PM To: '“tiejun.chen”' Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages -Original Message- From: “tiejun.chen” [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 1:52 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of “tiejun.chen” Sent: Thursday, July 18, 2013 1:01 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: “tiejun.chen” [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 11:56 AM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421; Bhushan Bharat-R65777 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 02:04 PM, Bharat Bhushan wrote: If there is a struct page for the requested mapping then it's normal DDR and the mapping sets M bit (coherent, cacheable) else this is treated as I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/kvm/e500_mmu_host.c | 17 - 1 files changed, 12 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..089c227 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { + u32 mas2_attr; + + mas2_attr = mas2 MAS2_ATTRIB_MASK; + + if (!pfn_valid(pfn)) { Why not directly use kvm_is_mmio_pfn()? What I understand from this function (someone can correct me) is that it returns false when the page is managed by kernel and is not marked as RESERVED (for some reason). For us it does not matter whether the page is reserved or not, if it is kernel visible page then it is DDR. I think you are setting I|G by addressing all mmio pages, right? If so, KVM: direct mmio pfn check Userspace may specify memory slots that are backed by mmio pages rather than normal RAM. In some cases it is not enough to identify these mmio pages by pfn_valid(). This patch adds checking the PageReserved as well. Do you know what are those some cases and how checking PageReserved helps in those cases? No, myself didn't see these actual cases in qemu,too. But this should be chronically persistent as I understand ;-) Then I will wait till someone educate me :) The reason is , kvm_is_mmio_pfn() function looks pretty heavy and I do not want to call this for all tlbwe operation unless it is necessary. Furthermore, how to distinguish we're creating TLB entry for the device assigned directly to the GS? I think its unnecessary to always check if that is mmio's pfn since we have more non direct assigned devices. So maybe we can introduce another helper to fixup that TLB entry in instead of this path. Tiejun -Bharat + mas2_attr |= MAS2_I | MAS2_G; + } else { #ifdef CONFIG_SMP - return (mas2 MAS2_ATTRIB_MASK) | MAS2_M; -#else - return mas2 MAS2_ATTRIB_MASK; + mas2_attr |= MAS2_M; #endif + } + return mas2_attr; } /* @@ -313,7 +320,7 @@ static void kvmppc_e500_setup_stlbe( /* Force IPROT=0 for all guest mappings. */ stlbe-mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID; stlbe-mas2 = (gvaddr MAS2_EPN) | - e500_shadow_mas2_attrib(gtlbe-mas2, pr); + e500_shadow_mas2_attrib(gtlbe-mas2, pfn); stlbe-mas7_3 = ((u64)pfn PAGE_SHIFT) | e500_shadow_mas3_attrib(gtlbe-mas7_3, pr); -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to
Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
On 18.07.2013, at 10:55, “tiejun.chen” wrote: On 07/18/2013 04:25 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: Bhushan Bharat-R65777 Sent: Thursday, July 18, 2013 1:53 PM To: '“tiejun.chen”' Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages -Original Message- From: “tiejun.chen” [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 1:52 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of “tiejun.chen” Sent: Thursday, July 18, 2013 1:01 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: “tiejun.chen” [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 11:56 AM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421; Bhushan Bharat-R65777 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 02:04 PM, Bharat Bhushan wrote: If there is a struct page for the requested mapping then it's normal DDR and the mapping sets M bit (coherent, cacheable) else this is treated as I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/kvm/e500_mmu_host.c | 17 - 1 files changed, 12 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..089c227 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { + u32 mas2_attr; + + mas2_attr = mas2 MAS2_ATTRIB_MASK; + + if (!pfn_valid(pfn)) { Why not directly use kvm_is_mmio_pfn()? What I understand from this function (someone can correct me) is that it returns false when the page is managed by kernel and is not marked as RESERVED (for some reason). For us it does not matter whether the page is reserved or not, if it is kernel visible page then it is DDR. I think you are setting I|G by addressing all mmio pages, right? If so, KVM: direct mmio pfn check Userspace may specify memory slots that are backed by mmio pages rather than normal RAM. In some cases it is not enough to identify these mmio pages by pfn_valid(). This patch adds checking the PageReserved as well. Do you know what are those some cases and how checking PageReserved helps in those cases? No, myself didn't see these actual cases in qemu,too. But this should be chronically persistent as I understand ;-) Then I will wait till someone educate me :) The reason is , kvm_is_mmio_pfn() function looks pretty heavy and I do not want to call this for all tlbwe operation unless it is necessary. Furthermore, how to distinguish we're creating TLB entry for the device assigned directly to the GS? Because other devices wouldn't be available to the guest through memory slots. I think its unnecessary to always check if that is mmio's pfn since we have more non direct assigned devices. I'm not sure I understand. The shadow TLB code only knows here is a host virtual address. It needs to figure out whether the host physical address behind that is RAM (can access with cache enabled) or not (has to disable cache) So maybe we can introduce another helper to fixup that TLB entry in instead of this path. This path does fix up the shadow (host) TLB entry :). Alex -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
On 18.07.2013, at 10:25, Bhushan Bharat-R65777 wrote: -Original Message- From: Bhushan Bharat-R65777 Sent: Thursday, July 18, 2013 1:53 PM To: '“tiejun.chen”' Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages -Original Message- From: “tiejun.chen” [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 1:52 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of “tiejun.chen” Sent: Thursday, July 18, 2013 1:01 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: “tiejun.chen” [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 11:56 AM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421; Bhushan Bharat-R65777 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 02:04 PM, Bharat Bhushan wrote: If there is a struct page for the requested mapping then it's normal DDR and the mapping sets M bit (coherent, cacheable) else this is treated as I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/kvm/e500_mmu_host.c | 17 - 1 files changed, 12 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..089c227 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { + u32 mas2_attr; + + mas2_attr = mas2 MAS2_ATTRIB_MASK; + + if (!pfn_valid(pfn)) { Why not directly use kvm_is_mmio_pfn()? What I understand from this function (someone can correct me) is that it returns false when the page is managed by kernel and is not marked as RESERVED (for some reason). For us it does not matter whether the page is reserved or not, if it is kernel visible page then it is DDR. I think you are setting I|G by addressing all mmio pages, right? If so, KVM: direct mmio pfn check Userspace may specify memory slots that are backed by mmio pages rather than normal RAM. In some cases it is not enough to identify these mmio pages by pfn_valid(). This patch adds checking the PageReserved as well. Do you know what are those some cases and how checking PageReserved helps in those cases? No, myself didn't see these actual cases in qemu,too. But this should be chronically persistent as I understand ;-) Then I will wait till someone educate me :) The reason is , kvm_is_mmio_pfn() function looks pretty heavy and I do not want to call this for all tlbwe operation unless it is necessary. It certainly does more than we need and potentially slows down the fast path (RAM mapping). The only thing it does on top of if (pfn_valid()) is to check for pages that are declared reserved on the host. This happens in 2 cases: 1) Non cache coherent DMA 2) Memory hot remove The non coherent DMA case would be interesting, as with the mechanism as it is in place in Linux today, we could potentially break normal guest operation if we don't take it into account. However, it's Kconfig guarded by: depends on 4xx || 8xx || E200 || PPC_MPC512x || GAMECUBE_COMMON default n if PPC_47x default y so we never hit it with any core we care about ;). Memory hot remove does not exist on e500 FWIW, so we don't have to worry about that one either. Which means I think it's fine to slim this whole thing down to only check for pfn_valid(), as the code does in this patch. It would however be very useful to have a comment there that explains why it's safe to do so. Alex -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
-Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Alexander Graf Sent: Thursday, July 18, 2013 3:19 PM To: Bhushan Bharat-R65777 Cc: “tiejun.chen”; kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 18.07.2013, at 10:25, Bhushan Bharat-R65777 wrote: -Original Message- From: Bhushan Bharat-R65777 Sent: Thursday, July 18, 2013 1:53 PM To: '“tiejun.chen”' Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages -Original Message- From: “tiejun.chen” [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 1:52 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of “tiejun.chen” Sent: Thursday, July 18, 2013 1:01 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: “tiejun.chen” [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 11:56 AM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421; Bhushan Bharat-R65777 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 02:04 PM, Bharat Bhushan wrote: If there is a struct page for the requested mapping then it's normal DDR and the mapping sets M bit (coherent, cacheable) else this is treated as I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/kvm/e500_mmu_host.c | 17 - 1 files changed, 12 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..089c227 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { +u32 mas2_attr; + +mas2_attr = mas2 MAS2_ATTRIB_MASK; + +if (!pfn_valid(pfn)) { Why not directly use kvm_is_mmio_pfn()? What I understand from this function (someone can correct me) is that it returns false when the page is managed by kernel and is not marked as RESERVED (for some reason). For us it does not matter whether the page is reserved or not, if it is kernel visible page then it is DDR. I think you are setting I|G by addressing all mmio pages, right? If so, KVM: direct mmio pfn check Userspace may specify memory slots that are backed by mmio pages rather than normal RAM. In some cases it is not enough to identify these mmio pages by pfn_valid(). This patch adds checking the PageReserved as well. Do you know what are those some cases and how checking PageReserved helps in those cases? No, myself didn't see these actual cases in qemu,too. But this should be chronically persistent as I understand ;-) Then I will wait till someone educate me :) The reason is , kvm_is_mmio_pfn() function looks pretty heavy and I do not want to call this for all tlbwe operation unless it is necessary. It certainly does more than we need and potentially slows down the fast path (RAM mapping). The only thing it does on top of if (pfn_valid()) is to check for pages that are declared reserved on the host. This happens in 2 cases: 1) Non cache coherent DMA 2) Memory hot remove The non coherent DMA case would be interesting, as with the mechanism as it is in place in Linux today, we could potentially break normal guest operation if we don't take it into account. However, it's Kconfig guarded by: depends on 4xx || 8xx || E200 || PPC_MPC512x || GAMECUBE_COMMON default n if PPC_47x default y so we never hit it with any core we care about ;). Memory hot remove does not exist on e500 FWIW, so we don't have to worry about that one either. Which means I think it's fine
Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
On 07/18/2013 05:44 PM, Alexander Graf wrote: On 18.07.2013, at 10:55, �tiejun.chen� wrote: On 07/18/2013 04:25 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: Bhushan Bharat-R65777 Sent: Thursday, July 18, 2013 1:53 PM To: '�tiejun.chen�' Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages -Original Message- From: �tiejun.chen� [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 1:52 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of �tiejun.chen� Sent: Thursday, July 18, 2013 1:01 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: �tiejun.chen� [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 11:56 AM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421; Bhushan Bharat-R65777 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 02:04 PM, Bharat Bhushan wrote: If there is a struct page for the requested mapping then it's normal DDR and the mapping sets M bit (coherent, cacheable) else this is treated as I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/kvm/e500_mmu_host.c | 17 - 1 files changed, 12 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..089c227 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { + u32 mas2_attr; + + mas2_attr = mas2 MAS2_ATTRIB_MASK; + + if (!pfn_valid(pfn)) { Why not directly use kvm_is_mmio_pfn()? What I understand from this function (someone can correct me) is that it returns false when the page is managed by kernel and is not marked as RESERVED (for some reason). For us it does not matter whether the page is reserved or not, if it is kernel visible page then it is DDR. I think you are setting I|G by addressing all mmio pages, right? If so, KVM: direct mmio pfn check Userspace may specify memory slots that are backed by mmio pages rather than normal RAM. In some cases it is not enough to identify these mmio pages by pfn_valid(). This patch adds checking the PageReserved as well. Do you know what are those some cases and how checking PageReserved helps in those cases? No, myself didn't see these actual cases in qemu,too. But this should be chronically persistent as I understand ;-) Then I will wait till someone educate me :) The reason is , kvm_is_mmio_pfn() function looks pretty heavy and I do not want to call this for all tlbwe operation unless it is necessary. Furthermore, how to distinguish we're creating TLB entry for the device assigned directly to the GS? Because other devices wouldn't be available to the guest through memory slots. Yes. I think its unnecessary to always check if that is mmio's pfn since we have more non direct assigned devices. I'm not sure I understand. The shadow TLB code only knows here is a host virtual address. It needs to figure out whether the host physical address behind that is RAM (can access with cache enabled) or not (has to disable cache) Sorry, looks I'm misleading you :-P So maybe we can introduce another helper to fixup that TLB entry in instead of this path. This path does fix up the shadow (host) TLB entry :). I just mean whether we can have a separate path dedicated to those direct assigned devices, not go this common path :) Tiejun -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
On 18.07.2013, at 11:56, “tiejun.chen” wrote: On 07/18/2013 05:44 PM, Alexander Graf wrote: On 18.07.2013, at 10:55, �tiejun.chen� wrote: On 07/18/2013 04:25 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: Bhushan Bharat-R65777 Sent: Thursday, July 18, 2013 1:53 PM To: '�tiejun.chen�' Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages -Original Message- From: �tiejun.chen� [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 1:52 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of �tiejun.chen� Sent: Thursday, July 18, 2013 1:01 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: �tiejun.chen� [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 11:56 AM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421; Bhushan Bharat-R65777 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 02:04 PM, Bharat Bhushan wrote: If there is a struct page for the requested mapping then it's normal DDR and the mapping sets M bit (coherent, cacheable) else this is treated as I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/kvm/e500_mmu_host.c | 17 - 1 files changed, 12 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..089c227 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { + u32 mas2_attr; + + mas2_attr = mas2 MAS2_ATTRIB_MASK; + + if (!pfn_valid(pfn)) { Why not directly use kvm_is_mmio_pfn()? What I understand from this function (someone can correct me) is that it returns false when the page is managed by kernel and is not marked as RESERVED (for some reason). For us it does not matter whether the page is reserved or not, if it is kernel visible page then it is DDR. I think you are setting I|G by addressing all mmio pages, right? If so, KVM: direct mmio pfn check Userspace may specify memory slots that are backed by mmio pages rather than normal RAM. In some cases it is not enough to identify these mmio pages by pfn_valid(). This patch adds checking the PageReserved as well. Do you know what are those some cases and how checking PageReserved helps in those cases? No, myself didn't see these actual cases in qemu,too. But this should be chronically persistent as I understand ;-) Then I will wait till someone educate me :) The reason is , kvm_is_mmio_pfn() function looks pretty heavy and I do not want to call this for all tlbwe operation unless it is necessary. Furthermore, how to distinguish we're creating TLB entry for the device assigned directly to the GS? Because other devices wouldn't be available to the guest through memory slots. Yes. I think its unnecessary to always check if that is mmio's pfn since we have more non direct assigned devices. I'm not sure I understand. The shadow TLB code only knows here is a host virtual address. It needs to figure out whether the host physical address behind that is RAM (can access with cache enabled) or not (has to disable cache) Sorry, looks I'm misleading you :-P So maybe we can introduce another helper to fixup that TLB entry in instead of this path. This path does fix up the shadow (host) TLB entry :). I just mean whether we can have a separate path dedicated to those direct assigned devices, not go this common path :) I don't think it's possible to have a separate path without a certain level of trust. In the current flow we don't trust anyone. We just check every translated page whether we should enable caching or not. We could take that information from 2 other side though: 1) Memory
Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
On 18.07.2013, at 12:08, “tiejun.chen” wrote: On 07/18/2013 05:48 PM, Alexander Graf wrote: On 18.07.2013, at 10:25, Bhushan Bharat-R65777 wrote: -Original Message- From: Bhushan Bharat-R65777 Sent: Thursday, July 18, 2013 1:53 PM To: '�tiejun.chen�' Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages -Original Message- From: �tiejun.chen� [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 1:52 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of �tiejun.chen� Sent: Thursday, July 18, 2013 1:01 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: �tiejun.chen� [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 11:56 AM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421; Bhushan Bharat-R65777 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 02:04 PM, Bharat Bhushan wrote: If there is a struct page for the requested mapping then it's normal DDR and the mapping sets M bit (coherent, cacheable) else this is treated as I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/kvm/e500_mmu_host.c | 17 - 1 files changed, 12 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..089c227 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { +u32 mas2_attr; + +mas2_attr = mas2 MAS2_ATTRIB_MASK; + +if (!pfn_valid(pfn)) { Why not directly use kvm_is_mmio_pfn()? What I understand from this function (someone can correct me) is that it returns false when the page is managed by kernel and is not marked as RESERVED (for some reason). For us it does not matter whether the page is reserved or not, if it is kernel visible page then it is DDR. I think you are setting I|G by addressing all mmio pages, right? If so, KVM: direct mmio pfn check Userspace may specify memory slots that are backed by mmio pages rather than normal RAM. In some cases it is not enough to identify these mmio pages by pfn_valid(). This patch adds checking the PageReserved as well. Do you know what are those some cases and how checking PageReserved helps in those cases? No, myself didn't see these actual cases in qemu,too. But this should be chronically persistent as I understand ;-) Then I will wait till someone educate me :) The reason is , kvm_is_mmio_pfn() function looks pretty heavy and I do not want to call this for all tlbwe operation unless it is necessary. It certainly does more than we need and potentially slows down the fast path (RAM mapping). The only thing it does on top of if (pfn_valid()) is to check for pages that are declared reserved on the host. This happens in 2 cases: 1) Non cache coherent DMA 2) Memory hot remove The non coherent DMA case would be interesting, as with the mechanism as it is in place in Linux today, we could potentially break normal guest operation if we don't take it into account. However, it's Kconfig guarded by: depends on 4xx || 8xx || E200 || PPC_MPC512x || GAMECUBE_COMMON default n if PPC_47x default y so we never hit it with any core we care about ;). Memory hot remove does not exist on e500 FWIW, so we don't have to worry about that one either. Thanks for this good information :) So why not limit those codes with CONFIG_MEMORY_HOTPLUG inside kvm_is_mmio_pfn() to make sure that check is only valid when that is really needed? This can decrease those unnecessary performance loss. If I'm wrong please correct me :) You're perfectly right, but this is generic KVM code. So it gets run across all architectures. What if someone has the great
Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
On 07/18/2013 06:00 PM, Alexander Graf wrote: On 18.07.2013, at 11:56, “tiejun.chen” wrote: On 07/18/2013 05:44 PM, Alexander Graf wrote: On 18.07.2013, at 10:55, �tiejun.chen� wrote: On 07/18/2013 04:25 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: Bhushan Bharat-R65777 Sent: Thursday, July 18, 2013 1:53 PM To: '�tiejun.chen�' Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages -Original Message- From: �tiejun.chen� [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 1:52 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of �tiejun.chen� Sent: Thursday, July 18, 2013 1:01 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: �tiejun.chen� [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 11:56 AM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421; Bhushan Bharat-R65777 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 02:04 PM, Bharat Bhushan wrote: If there is a struct page for the requested mapping then it's normal DDR and the mapping sets M bit (coherent, cacheable) else this is treated as I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/kvm/e500_mmu_host.c | 17 - 1 files changed, 12 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..089c227 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { + u32 mas2_attr; + + mas2_attr = mas2 MAS2_ATTRIB_MASK; + + if (!pfn_valid(pfn)) { Why not directly use kvm_is_mmio_pfn()? What I understand from this function (someone can correct me) is that it returns false when the page is managed by kernel and is not marked as RESERVED (for some reason). For us it does not matter whether the page is reserved or not, if it is kernel visible page then it is DDR. I think you are setting I|G by addressing all mmio pages, right? If so, KVM: direct mmio pfn check Userspace may specify memory slots that are backed by mmio pages rather than normal RAM. In some cases it is not enough to identify these mmio pages by pfn_valid(). This patch adds checking the PageReserved as well. Do you know what are those some cases and how checking PageReserved helps in those cases? No, myself didn't see these actual cases in qemu,too. But this should be chronically persistent as I understand ;-) Then I will wait till someone educate me :) The reason is , kvm_is_mmio_pfn() function looks pretty heavy and I do not want to call this for all tlbwe operation unless it is necessary. Furthermore, how to distinguish we're creating TLB entry for the device assigned directly to the GS? Because other devices wouldn't be available to the guest through memory slots. Yes. I think its unnecessary to always check if that is mmio's pfn since we have more non direct assigned devices. I'm not sure I understand. The shadow TLB code only knows here is a host virtual address. It needs to figure out whether the host physical address behind that is RAM (can access with cache enabled) or not (has to disable cache) Sorry, looks I'm misleading you :-P So maybe we can introduce another helper to fixup that TLB entry in instead of this path. This path does fix up the shadow (host) TLB entry :). I just mean whether we can have a separate path dedicated to those direct assigned devices, not go this common path :) I don't think it's possible to have a separate path without a certain level of trust. In the current flow we don't trust anyone. We just check every translated page whether we should enable caching or not. We could take that information from 2 other side though: 1) Memory Slot 2) Guest TLB Flags If we take it from the memory
Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
On 07/18/2013 06:12 PM, Alexander Graf wrote: On 18.07.2013, at 12:08, “tiejun.chen” wrote: On 07/18/2013 05:48 PM, Alexander Graf wrote: On 18.07.2013, at 10:25, Bhushan Bharat-R65777 wrote: -Original Message- From: Bhushan Bharat-R65777 Sent: Thursday, July 18, 2013 1:53 PM To: '�tiejun.chen�' Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: RE: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages -Original Message- From: �tiejun.chen� [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 1:52 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 04:08 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of �tiejun.chen� Sent: Thursday, July 18, 2013 1:01 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 03:12 PM, Bhushan Bharat-R65777 wrote: -Original Message- From: �tiejun.chen� [mailto:tiejun.c...@windriver.com] Sent: Thursday, July 18, 2013 11:56 AM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; ag...@suse.de; Wood Scott- B07421; Bhushan Bharat-R65777 Subject: Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages On 07/18/2013 02:04 PM, Bharat Bhushan wrote: If there is a struct page for the requested mapping then it's normal DDR and the mapping sets M bit (coherent, cacheable) else this is treated as I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- arch/powerpc/kvm/e500_mmu_host.c | 17 - 1 files changed, 12 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..089c227 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,20 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { + u32 mas2_attr; + + mas2_attr = mas2 MAS2_ATTRIB_MASK; + + if (!pfn_valid(pfn)) { Why not directly use kvm_is_mmio_pfn()? What I understand from this function (someone can correct me) is that it returns false when the page is managed by kernel and is not marked as RESERVED (for some reason). For us it does not matter whether the page is reserved or not, if it is kernel visible page then it is DDR. I think you are setting I|G by addressing all mmio pages, right? If so, KVM: direct mmio pfn check Userspace may specify memory slots that are backed by mmio pages rather than normal RAM. In some cases it is not enough to identify these mmio pages by pfn_valid(). This patch adds checking the PageReserved as well. Do you know what are those some cases and how checking PageReserved helps in those cases? No, myself didn't see these actual cases in qemu,too. But this should be chronically persistent as I understand ;-) Then I will wait till someone educate me :) The reason is , kvm_is_mmio_pfn() function looks pretty heavy and I do not want to call this for all tlbwe operation unless it is necessary. It certainly does more than we need and potentially slows down the fast path (RAM mapping). The only thing it does on top of if (pfn_valid()) is to check for pages that are declared reserved on the host. This happens in 2 cases: 1) Non cache coherent DMA 2) Memory hot remove The non coherent DMA case would be interesting, as with the mechanism as it is in place in Linux today, we could potentially break normal guest operation if we don't take it into account. However, it's Kconfig guarded by: depends on 4xx || 8xx || E200 || PPC_MPC512x || GAMECUBE_COMMON default n if PPC_47x default y so we never hit it with any core we care about ;). Memory hot remove does not exist on e500 FWIW, so we don't have to worry about that one either. Thanks for this good information :) So why not limit those codes with CONFIG_MEMORY_HOTPLUG inside kvm_is_mmio_pfn() to make sure that check is only valid when that is really needed? This can decrease those unnecessary performance loss. If I'm wrong please correct me :) You're perfectly right, but this is generic KVM code. So it gets run across all architectures. What if someone has the great idea to add a new case here for x86, but doesn't tell
[PATCH 1/2 v2] kvm: powerpc: Do not ignore E attribute in mas2
Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- v2: - No change arch/powerpc/kvm/e500.h |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/arch/powerpc/kvm/e500.h b/arch/powerpc/kvm/e500.h index c2e5e98..277cb18 100644 --- a/arch/powerpc/kvm/e500.h +++ b/arch/powerpc/kvm/e500.h @@ -117,7 +117,7 @@ static inline struct kvmppc_vcpu_e500 *to_e500(struct kvm_vcpu *vcpu) #define E500_TLB_USER_PERM_MASK (MAS3_UX|MAS3_UR|MAS3_UW) #define E500_TLB_SUPER_PERM_MASK (MAS3_SX|MAS3_SR|MAS3_SW) #define MAS2_ATTRIB_MASK \ - (MAS2_X0 | MAS2_X1) + (MAS2_X0 | MAS2_X1 | MAS2_E) #define MAS3_ATTRIB_MASK \ (MAS3_U0 | MAS3_U1 | MAS3_U2 | MAS3_U3 \ | E500_TLB_USER_PERM_MASK | E500_TLB_SUPER_PERM_MASK) -- 1.7.0.4 -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages
If there is a struct page for the requested mapping then it's normal RAM and the mapping is set to M bit (coherent, cacheable) otherwise this is treated as I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- v2: some cleanup and added comment - arch/powerpc/kvm/e500_mmu_host.c | 23 ++- 1 files changed, 18 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..02eb973 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,26 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { + u32 mas2_attr; + + mas2_attr = mas2 MAS2_ATTRIB_MASK; + + /* +* RAM is always mappable on e500 systems, so this is identical +* to kvm_is_mmio_pfn(), just without its overhead. +*/ + if (!pfn_valid(pfn)) { + /* Pages not managed by Kernel are treated as I/O, set I + G */ + mas2_attr |= MAS2_I | MAS2_G; #ifdef CONFIG_SMP - return (mas2 MAS2_ATTRIB_MASK) | MAS2_M; -#else - return mas2 MAS2_ATTRIB_MASK; + } else { + /* Kernel managed pages are actually RAM so set M */ + mas2_attr |= MAS2_M; #endif + } + return mas2_attr; } /* @@ -313,7 +326,7 @@ static void kvmppc_e500_setup_stlbe( /* Force IPROT=0 for all guest mappings. */ stlbe-mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID; stlbe-mas2 = (gvaddr MAS2_EPN) | - e500_shadow_mas2_attrib(gtlbe-mas2, pr); + e500_shadow_mas2_attrib(gtlbe-mas2, pfn); stlbe-mas7_3 = ((u64)pfn PAGE_SHIFT) | e500_shadow_mas3_attrib(gtlbe-mas7_3, pr); -- 1.7.0.4 -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2 v2] kvm: powerpc: Do not ignore E attribute in mas2
This needs a description. Why shouldn't we ignore E? Alex On 18.07.2013, at 15:19, Bharat Bhushan wrote: Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- v2: - No change arch/powerpc/kvm/e500.h |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/arch/powerpc/kvm/e500.h b/arch/powerpc/kvm/e500.h index c2e5e98..277cb18 100644 --- a/arch/powerpc/kvm/e500.h +++ b/arch/powerpc/kvm/e500.h @@ -117,7 +117,7 @@ static inline struct kvmppc_vcpu_e500 *to_e500(struct kvm_vcpu *vcpu) #define E500_TLB_USER_PERM_MASK (MAS3_UX|MAS3_UR|MAS3_UW) #define E500_TLB_SUPER_PERM_MASK (MAS3_SX|MAS3_SR|MAS3_SW) #define MAS2_ATTRIB_MASK \ - (MAS2_X0 | MAS2_X1) + (MAS2_X0 | MAS2_X1 | MAS2_E) #define MAS3_ATTRIB_MASK \ (MAS3_U0 | MAS3_U1 | MAS3_U2 | MAS3_U3 \ | E500_TLB_USER_PERM_MASK | E500_TLB_SUPER_PERM_MASK) -- 1.7.0.4 -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages
On 18.07.2013, at 15:19, Bharat Bhushan wrote: If there is a struct page for the requested mapping then it's normal RAM and the mapping is set to M bit (coherent, cacheable) otherwise this is treated as I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- v2: some cleanup and added comment - arch/powerpc/kvm/e500_mmu_host.c | 23 ++- 1 files changed, 18 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..02eb973 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,26 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { + u32 mas2_attr; + + mas2_attr = mas2 MAS2_ATTRIB_MASK; + + /* + * RAM is always mappable on e500 systems, so this is identical + * to kvm_is_mmio_pfn(), just without its overhead. + */ + if (!pfn_valid(pfn)) { + /* Pages not managed by Kernel are treated as I/O, set I + G */ Please also document the intermediate thought that I/O should be mapped non-cached. + mas2_attr |= MAS2_I | MAS2_G; #ifdef CONFIG_SMP Please separate the SMP case out of the branch. - return (mas2 MAS2_ATTRIB_MASK) | MAS2_M; -#else - return mas2 MAS2_ATTRIB_MASK; + } else { + /* Kernel managed pages are actually RAM so set M */ This comment doesn't tell me why M can be set ;). Alex + mas2_attr |= MAS2_M; #endif + } + return mas2_attr; } /* @@ -313,7 +326,7 @@ static void kvmppc_e500_setup_stlbe( /* Force IPROT=0 for all guest mappings. */ stlbe-mas1 = MAS1_TSIZE(tsize) | get_tlb_sts(gtlbe) | MAS1_VALID; stlbe-mas2 = (gvaddr MAS2_EPN) | - e500_shadow_mas2_attrib(gtlbe-mas2, pr); + e500_shadow_mas2_attrib(gtlbe-mas2, pfn); stlbe-mas7_3 = ((u64)pfn PAGE_SHIFT) | e500_shadow_mas3_attrib(gtlbe-mas7_3, pr); -- 1.7.0.4 -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages
On 18.07.2013, at 17:15, Bhushan Bharat-R65777 wrote: -Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Alexander Graf Sent: Thursday, July 18, 2013 8:23 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood Scott-B07421; Bhushan Bharat-R65777 Subject: Re: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages On 18.07.2013, at 15:19, Bharat Bhushan wrote: If there is a struct page for the requested mapping then it's normal RAM and the mapping is set to M bit (coherent, cacheable) otherwise this is treated as I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- v2: some cleanup and added comment - arch/powerpc/kvm/e500_mmu_host.c | 23 ++- 1 files changed, 18 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..02eb973 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,26 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { + u32 mas2_attr; + + mas2_attr = mas2 MAS2_ATTRIB_MASK; + + /* +* RAM is always mappable on e500 systems, so this is identical +* to kvm_is_mmio_pfn(), just without its overhead. +*/ + if (!pfn_valid(pfn)) { + /* Pages not managed by Kernel are treated as I/O, set I + G */ Please also document the intermediate thought that I/O should be mapped non- cached. I did not get what you mean to document? Page snot managed by the kernel are treated as I/O. Map it with disabled cache. + mas2_attr |= MAS2_I | MAS2_G; #ifdef CONFIG_SMP Please separate the SMP case out of the branch. Really :) this was looking simple to me. Two branches intertwined never look simple :). - return (mas2 MAS2_ATTRIB_MASK) | MAS2_M; -#else - return mas2 MAS2_ATTRIB_MASK; + } else { + /* Kernel managed pages are actually RAM so set M */ This comment doesn't tell me why M can be set ;). RAM in SMP, so setting coherent, is not that obvious? No. Alex -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2 v2] kvm: powerpc: Do not ignore E attribute in mas2
On 18.07.2013, at 17:12, Bhushan Bharat-R65777 wrote: -Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Alexander Graf Sent: Thursday, July 18, 2013 8:18 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood Scott-B07421; Bhushan Bharat-R65777 Subject: Re: [PATCH 1/2 v2] kvm: powerpc: Do not ignore E attribute in mas2 This needs a description. Why shouldn't we ignore E? What I understood is that there is no reason to stop guest setting E, so allow him. Please add that to the patch description. Also explain what the bit means. Alex -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: [PATCH 1/2 v2] kvm: powerpc: Do not ignore E attribute in mas2
-Original Message- From: Alexander Graf [mailto:ag...@suse.de] Sent: Thursday, July 18, 2013 8:50 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood Scott-B07421 Subject: Re: [PATCH 1/2 v2] kvm: powerpc: Do not ignore E attribute in mas2 On 18.07.2013, at 17:12, Bhushan Bharat-R65777 wrote: -Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Alexander Graf Sent: Thursday, July 18, 2013 8:18 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood Scott-B07421; Bhushan Bharat-R65777 Subject: Re: [PATCH 1/2 v2] kvm: powerpc: Do not ignore E attribute in mas2 This needs a description. Why shouldn't we ignore E? What I understood is that there is no reason to stop guest setting E, so allow him. Please add that to the patch description. Also explain what the bit means. Ok :) -Bharat Alex -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] kvm: powerpc: set cache coherency only for kernel managed pages
On 07/18/2013 05:00:42 AM, Alexander Graf wrote: Now why is setting invalid flags a problem? If I understand Scott correctly, it can break the host if you access certain host devices with caching enabled. But to be sure I'd say we ask him directly :). The architecture makes it illegal to mix cacheable and cache-inhibited mappings to the same physical page. Mixing W or M bits is generally bad as well. I've seen it cause machine checks, error interrupts, etc. -- not just corrupting the page in question. -Scott -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2 v2] kvm: powerpc: Do not ignore E attribute in mas2
On 07/18/2013 10:12:30 AM, Bhushan Bharat-R65777 wrote: -Original Message- From: kvm-ppc-ow...@vger.kernel.org [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Alexander Graf Sent: Thursday, July 18, 2013 8:18 PM To: Bhushan Bharat-R65777 Cc: kvm-ppc@vger.kernel.org; k...@vger.kernel.org; Wood Scott-B07421; Bhushan Bharat-R65777 Subject: Re: [PATCH 1/2 v2] kvm: powerpc: Do not ignore E attribute in mas2 This needs a description. Why shouldn't we ignore E? What I understood is that there is no reason to stop guest setting E, so allow him. FWIW, it'd probably be safe to let the guest control the G bit as well. -Scott -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2 v2] kvm: powerpc: set cache coherency only for kernel managed pages
On 18.07.2013, at 19:17, Scott Wood wrote: On 07/18/2013 08:19:03 AM, Bharat Bhushan wrote: If there is a struct page for the requested mapping then it's normal RAM and the mapping is set to M bit (coherent, cacheable) otherwise this is treated as I/O and we set I + G (cache inhibited, guarded) This helps setting proper TLB mapping for direct assigned device Signed-off-by: Bharat Bhushan bharat.bhus...@freescale.com --- v2: some cleanup and added comment - arch/powerpc/kvm/e500_mmu_host.c | 23 ++- 1 files changed, 18 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/kvm/e500_mmu_host.c b/arch/powerpc/kvm/e500_mmu_host.c index 1c6a9d7..02eb973 100644 --- a/arch/powerpc/kvm/e500_mmu_host.c +++ b/arch/powerpc/kvm/e500_mmu_host.c @@ -64,13 +64,26 @@ static inline u32 e500_shadow_mas3_attrib(u32 mas3, int usermode) return mas3; } -static inline u32 e500_shadow_mas2_attrib(u32 mas2, int usermode) +static inline u32 e500_shadow_mas2_attrib(u32 mas2, pfn_t pfn) { +u32 mas2_attr; + +mas2_attr = mas2 MAS2_ATTRIB_MASK; + +/* + * RAM is always mappable on e500 systems, so this is identical + * to kvm_is_mmio_pfn(), just without its overhead. + */ +if (!pfn_valid(pfn)) { Please use page_is_ram(), which is what gets used when setting the WIMG for the host userspace mapping. We want to make sure the two are consistent. +/* Pages not managed by Kernel are treated as I/O, set I + G */ +mas2_attr |= MAS2_I | MAS2_G; #ifdef CONFIG_SMP -return (mas2 MAS2_ATTRIB_MASK) | MAS2_M; -#else -return mas2 MAS2_ATTRIB_MASK; +} else { +/* Kernel managed pages are actually RAM so set M */ +mas2_attr |= MAS2_M; #endif Likewise, we want to make sure this matches the host entry. Unfortunately, this is a bit of a mess already. 64-bit booke appears to always set MAS2_M for TLB0 mappings. The initial KERNELBASE mapping on boot uses M_IF_SMP, and the settlbcam() that (IIRC) replaces it uses _PAGE_COHERENT. 32-bit always uses _PAGE_COHERENT, except that initial KERNELBASE mapping. _PAGE_COHERENT appears to be set based on CONFIG_SMP || CONFIG_PPC_STD_MMU (the latter config clears _PAGE_COHERENT in the non-CPU_FTR_NEED_COHERENT case). As for what we actually want to happen, there are cases when we want M to be set for non-SMP. One such case is AMP, where CPUs may be sharing memory even if the Linux instance only runs on one CPU (this is not hypothetical, BTW). It's also possible that we encounter a hardware bug that requires MAS2_M, similar to what some of our non-booke chips require. How about we always set M then for RAM? Alex -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 00/10 v6] KVM: PPC: IOMMU in-kernel handling
On 07/16/2013 10:53 AM, Alexey Kardashevskiy wrote: The changes are: 1. rebased on v3.11-rc1 so the capability numbers changed again 2. fixed multiple comments from maintainers 3. KVM: PPC: Add support for IOMMU in-kernel handling is split into 2 patches, the new one is powerpc/iommu: rework to support realmode. 4. IOMMU_API is now always enabled for KVM_BOOK3S_64. MOre details in the individual patch comments. Depends on hashtable: add hash_for_each_possible_rcu_notrace(), posted a while ago. Alexey Kardashevskiy (10): KVM: PPC: reserve a capability number for multitce support KVM: PPC: reserve a capability and ioctl numbers for realmode VFIO Alex, could you please pull these 2 patches or tell what is wrong with them? Having them sooner in the kernel would let me ask for a headers update for QEMU and then I would try pushing miltiple TCE and VFIO support in QEMU. Thanks. -- Alexey -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html