Re: [PATCH] KVM: x86: fix bogus warning about reserved bits
On 09/25/2015 03:49 PM, Paolo Bonzini wrote: On 24/09/2015 12:12, Borislav Petkov wrote: On Thu, Sep 24, 2015 at 11:23:08AM +0800, Xiao Guangrong wrote: +static inline bool +boot_cpu_is_amd(void) +{ + WARN_ON_ONCE(!tdp_enabled); + return shadow_x_mask != 0; shadow_x_mask != 0 is Intel's CPU. Borislav, could you please check shadow_x_mask == 0 instead and test it again? That did the trick: [ 62.640392] kvm: zapping shadow pages for mmio generation wraparound [ 63.100301] cpuid(0).ebx = 68747541 [ 63.193426] kvm [3748]: vcpu0 unhandled rdmsr: 0xc0010112 [ 64.538294] kvm [3748]: vcpu0 unhandled rdmsr: 0xc0011021 [ 64.866263] kvm [3748]: vcpu1 unhandled rdmsr: 0xc0011021 [ 64.971972] kvm [3748]: vcpu2 unhandled rdmsr: 0xc0011021 [ 65.070376] kvm [3748]: vcpu3 unhandled rdmsr: 0xc0011021 [ 65.170625] kvm [3748]: vcpu4 unhandled rdmsr: 0xc0011021 [ 65.272838] kvm [3748]: vcpu5 unhandled rdmsr: 0xc0011021 [ 65.374288] kvm [3748]: vcpu6 unhandled rdmsr: 0xc0011021 [ 65.474825] kvm [3748]: vcpu7 unhandled rdmsr: 0xc0011021 That's all I got in dmesg from booting the guest - no more mmio PF warnings. Great, though this doesn't yet explain why guest_cpuid_is_amd failed. I'll look into it next week when Amazon delivers my shiny new AMD mini-PC :) and send the patch to Linus meanwhile. I guess QEMU passed AMD cpu mode to KVM... but not sure. :( -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] KVM: x86: fix bogus warning about reserved bits
On 09/25/2015 03:49 PM, Paolo Bonzini wrote: On 24/09/2015 12:12, Borislav Petkov wrote: On Thu, Sep 24, 2015 at 11:23:08AM +0800, Xiao Guangrong wrote: +static inline bool +boot_cpu_is_amd(void) +{ + WARN_ON_ONCE(!tdp_enabled); + return shadow_x_mask != 0; shadow_x_mask != 0 is Intel's CPU. Borislav, could you please check shadow_x_mask == 0 instead and test it again? That did the trick: [ 62.640392] kvm: zapping shadow pages for mmio generation wraparound [ 63.100301] cpuid(0).ebx = 68747541 [ 63.193426] kvm [3748]: vcpu0 unhandled rdmsr: 0xc0010112 [ 64.538294] kvm [3748]: vcpu0 unhandled rdmsr: 0xc0011021 [ 64.866263] kvm [3748]: vcpu1 unhandled rdmsr: 0xc0011021 [ 64.971972] kvm [3748]: vcpu2 unhandled rdmsr: 0xc0011021 [ 65.070376] kvm [3748]: vcpu3 unhandled rdmsr: 0xc0011021 [ 65.170625] kvm [3748]: vcpu4 unhandled rdmsr: 0xc0011021 [ 65.272838] kvm [3748]: vcpu5 unhandled rdmsr: 0xc0011021 [ 65.374288] kvm [3748]: vcpu6 unhandled rdmsr: 0xc0011021 [ 65.474825] kvm [3748]: vcpu7 unhandled rdmsr: 0xc0011021 That's all I got in dmesg from booting the guest - no more mmio PF warnings. Great, though this doesn't yet explain why guest_cpuid_is_amd failed. I'll look into it next week when Amazon delivers my shiny new AMD mini-PC :) and send the patch to Linus meanwhile. I guess QEMU passed AMD cpu mode to KVM... but not sure. :( -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] KVM: x86: fix bogus warning about reserved bits
On 24/09/2015 12:12, Borislav Petkov wrote: > On Thu, Sep 24, 2015 at 11:23:08AM +0800, Xiao Guangrong wrote: >>> +static inline bool >>> +boot_cpu_is_amd(void) >>> +{ >>> + WARN_ON_ONCE(!tdp_enabled); >>> + return shadow_x_mask != 0; >> >> shadow_x_mask != 0 is Intel's CPU. >> >> Borislav, could you please check shadow_x_mask == 0 instead and test it >> again? > > That did the trick: > > [ 62.640392] kvm: zapping shadow pages for mmio generation wraparound > [ 63.100301] cpuid(0).ebx = 68747541 > [ 63.193426] kvm [3748]: vcpu0 unhandled rdmsr: 0xc0010112 > [ 64.538294] kvm [3748]: vcpu0 unhandled rdmsr: 0xc0011021 > [ 64.866263] kvm [3748]: vcpu1 unhandled rdmsr: 0xc0011021 > [ 64.971972] kvm [3748]: vcpu2 unhandled rdmsr: 0xc0011021 > [ 65.070376] kvm [3748]: vcpu3 unhandled rdmsr: 0xc0011021 > [ 65.170625] kvm [3748]: vcpu4 unhandled rdmsr: 0xc0011021 > [ 65.272838] kvm [3748]: vcpu5 unhandled rdmsr: 0xc0011021 > [ 65.374288] kvm [3748]: vcpu6 unhandled rdmsr: 0xc0011021 > [ 65.474825] kvm [3748]: vcpu7 unhandled rdmsr: 0xc0011021 > > That's all I got in dmesg from booting the guest - no more mmio PF > warnings. Great, though this doesn't yet explain why guest_cpuid_is_amd failed. I'll look into it next week when Amazon delivers my shiny new AMD mini-PC :) and send the patch to Linus meanwhile. Paolo -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] KVM: x86: fix bogus warning about reserved bits
On 24/09/2015 12:12, Borislav Petkov wrote: > On Thu, Sep 24, 2015 at 11:23:08AM +0800, Xiao Guangrong wrote: >>> +static inline bool >>> +boot_cpu_is_amd(void) >>> +{ >>> + WARN_ON_ONCE(!tdp_enabled); >>> + return shadow_x_mask != 0; >> >> shadow_x_mask != 0 is Intel's CPU. >> >> Borislav, could you please check shadow_x_mask == 0 instead and test it >> again? > > That did the trick: > > [ 62.640392] kvm: zapping shadow pages for mmio generation wraparound > [ 63.100301] cpuid(0).ebx = 68747541 > [ 63.193426] kvm [3748]: vcpu0 unhandled rdmsr: 0xc0010112 > [ 64.538294] kvm [3748]: vcpu0 unhandled rdmsr: 0xc0011021 > [ 64.866263] kvm [3748]: vcpu1 unhandled rdmsr: 0xc0011021 > [ 64.971972] kvm [3748]: vcpu2 unhandled rdmsr: 0xc0011021 > [ 65.070376] kvm [3748]: vcpu3 unhandled rdmsr: 0xc0011021 > [ 65.170625] kvm [3748]: vcpu4 unhandled rdmsr: 0xc0011021 > [ 65.272838] kvm [3748]: vcpu5 unhandled rdmsr: 0xc0011021 > [ 65.374288] kvm [3748]: vcpu6 unhandled rdmsr: 0xc0011021 > [ 65.474825] kvm [3748]: vcpu7 unhandled rdmsr: 0xc0011021 > > That's all I got in dmesg from booting the guest - no more mmio PF > warnings. Great, though this doesn't yet explain why guest_cpuid_is_amd failed. I'll look into it next week when Amazon delivers my shiny new AMD mini-PC :) and send the patch to Linus meanwhile. Paolo -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] KVM: x86: fix bogus warning about reserved bits
On Thu, Sep 24, 2015 at 11:23:08AM +0800, Xiao Guangrong wrote: > >+static inline bool > >+boot_cpu_is_amd(void) > >+{ > >+WARN_ON_ONCE(!tdp_enabled); > >+return shadow_x_mask != 0; > > shadow_x_mask != 0 is Intel's CPU. > > Borislav, could you please check shadow_x_mask == 0 instead and test it again? That did the trick: [ 62.640392] kvm: zapping shadow pages for mmio generation wraparound [ 63.100301] cpuid(0).ebx = 68747541 [ 63.193426] kvm [3748]: vcpu0 unhandled rdmsr: 0xc0010112 [ 64.538294] kvm [3748]: vcpu0 unhandled rdmsr: 0xc0011021 [ 64.866263] kvm [3748]: vcpu1 unhandled rdmsr: 0xc0011021 [ 64.971972] kvm [3748]: vcpu2 unhandled rdmsr: 0xc0011021 [ 65.070376] kvm [3748]: vcpu3 unhandled rdmsr: 0xc0011021 [ 65.170625] kvm [3748]: vcpu4 unhandled rdmsr: 0xc0011021 [ 65.272838] kvm [3748]: vcpu5 unhandled rdmsr: 0xc0011021 [ 65.374288] kvm [3748]: vcpu6 unhandled rdmsr: 0xc0011021 [ 65.474825] kvm [3748]: vcpu7 unhandled rdmsr: 0xc0011021 That's all I got in dmesg from booting the guest - no more mmio PF warnings. Thanks. -- Regards/Gruss, Boris. ECO tip #101: Trim your mails when you reply. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] KVM: x86: fix bogus warning about reserved bits
On Thu, Sep 24, 2015 at 11:23:08AM +0800, Xiao Guangrong wrote: > >+static inline bool > >+boot_cpu_is_amd(void) > >+{ > >+WARN_ON_ONCE(!tdp_enabled); > >+return shadow_x_mask != 0; > > shadow_x_mask != 0 is Intel's CPU. > > Borislav, could you please check shadow_x_mask == 0 instead and test it again? That did the trick: [ 62.640392] kvm: zapping shadow pages for mmio generation wraparound [ 63.100301] cpuid(0).ebx = 68747541 [ 63.193426] kvm [3748]: vcpu0 unhandled rdmsr: 0xc0010112 [ 64.538294] kvm [3748]: vcpu0 unhandled rdmsr: 0xc0011021 [ 64.866263] kvm [3748]: vcpu1 unhandled rdmsr: 0xc0011021 [ 64.971972] kvm [3748]: vcpu2 unhandled rdmsr: 0xc0011021 [ 65.070376] kvm [3748]: vcpu3 unhandled rdmsr: 0xc0011021 [ 65.170625] kvm [3748]: vcpu4 unhandled rdmsr: 0xc0011021 [ 65.272838] kvm [3748]: vcpu5 unhandled rdmsr: 0xc0011021 [ 65.374288] kvm [3748]: vcpu6 unhandled rdmsr: 0xc0011021 [ 65.474825] kvm [3748]: vcpu7 unhandled rdmsr: 0xc0011021 That's all I got in dmesg from booting the guest - no more mmio PF warnings. Thanks. -- Regards/Gruss, Boris. ECO tip #101: Trim your mails when you reply. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] KVM: x86: fix bogus warning about reserved bits
On 09/23/2015 05:36 PM, Paolo Bonzini wrote: On 23/09/2015 09:56, Borislav Petkov wrote: On Tue, Sep 22, 2015 at 11:04:38PM +0200, Paolo Bonzini wrote: Let's add more debugging output: Here you go: [ 50.474002] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, addr 0xb8000 (level 4, 0xf00f8) [ 50.484249] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, addr 0xb8000 (level 3, 0xf0078) [ 50.494492] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, addr 0xb8000 (level 2, 0xf0078) And another patch, which both cranks up the debugging a bit and tries another fix: diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h index dd05b9cef6ae..b2f49bb15ba1 100644 --- a/arch/x86/kvm/cpuid.h +++ b/arch/x86/kvm/cpuid.h @@ -105,8 +105,15 @@ static inline bool guest_cpuid_has_x2apic(struct kvm_vcpu *vcpu) static inline bool guest_cpuid_is_amd(struct kvm_vcpu *vcpu) { struct kvm_cpuid_entry2 *best; + static bool first; best = kvm_find_cpuid_entry(vcpu, 0, 0); + if (first && best) { + printk("cpuid(0).ebx = %x\n", best->ebx); + first = false; + } else if (first) + printk_ratelimited("cpuid(0) not initialized yet\n"); + return best && best->ebx == X86EMUL_CPUID_VENDOR_AuthenticAMD_ebx; } diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index bf1122e9c7bf..f50b280ffee1 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -3625,7 +3625,7 @@ static void __reset_rsvds_bits_mask(struct kvm_vcpu *vcpu, struct rsvd_bits_validate *rsvd_check, int maxphyaddr, int level, bool nx, bool gbpages, - bool pse) + bool pse, bool amd) { u64 exb_bit_rsvd = 0; u64 gbpages_bit_rsvd = 0; @@ -3642,7 +3642,7 @@ __reset_rsvds_bits_mask(struct kvm_vcpu *vcpu, * Non-leaf PML4Es and PDPEs reserve bit 8 (which would be the G bit for * leaf entries) on AMD CPUs only. */ - if (guest_cpuid_is_amd(vcpu)) + if (amd) nonleaf_bit8_rsvd = rsvd_bits(8, 8); switch (level) { @@ -3710,7 +3710,7 @@ static void reset_rsvds_bits_mask(struct kvm_vcpu *vcpu, __reset_rsvds_bits_mask(vcpu, >guest_rsvd_check, cpuid_maxphyaddr(vcpu), context->root_level, context->nx, guest_cpuid_has_gbpages(vcpu), - is_pse(vcpu)); + is_pse(vcpu), guest_cpuid_is_amd(vcpu)); } static void @@ -3760,13 +3760,25 @@ static void reset_rsvds_bits_mask_ept(struct kvm_vcpu *vcpu, void reset_shadow_zero_bits_mask(struct kvm_vcpu *vcpu, struct kvm_mmu *context) { + /* +* Passing "true" to the last argument is okay; it adds a check +* on bit 8 of the SPTEs which KVM doesn't use anyway. +*/ __reset_rsvds_bits_mask(vcpu, >shadow_zero_check, boot_cpu_data.x86_phys_bits, context->shadow_root_level, context->nx, - guest_cpuid_has_gbpages(vcpu), is_pse(vcpu)); + guest_cpuid_has_gbpages(vcpu), is_pse(vcpu), + true); } EXPORT_SYMBOL_GPL(reset_shadow_zero_bits_mask); +static inline bool +boot_cpu_is_amd(void) +{ + WARN_ON_ONCE(!tdp_enabled); + return shadow_x_mask != 0; shadow_x_mask != 0 is Intel's CPU. Borislav, could you please check shadow_x_mask == 0 instead and test it again? Further more, use guest_cpuid_is_amd() to detect hardware CPU vendor is wrong as usespace can fool KVM. Should test host CPUID or introduce intel/amd callback instead. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] KVM: x86: fix bogus warning about reserved bits
On 23/09/2015 13:07, Borislav Petkov wrote: >> > + static bool first; >> > >> >best = kvm_find_cpuid_entry(vcpu, 0, 0); >> > + if (first && best) { >> > + printk("cpuid(0).ebx = %x\n", best->ebx); >> > + first = false; >> > + } else if (first) >> > + printk_ratelimited("cpuid(0) not initialized yet\n"); >> > + >> >return best && best->ebx == X86EMUL_CPUID_VENDOR_AuthenticAMD_ebx; > Do I see it correctly that that "first" thing is never true? > > In any case, I changed it to initialize to true but still no output from > that function. > > [ 102.448438] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, > addr 0xb8000 (level 4, 0xf00f8) > [ 102.458706] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, > addr 0xb8000 (level 3, 0xf0078) > [ 102.468955] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, > addr 0xb8000 (level 2, 0xf0078) Yeah, but I didn't totally need the output so I didn't bother answering with yet another v2. I'm a bit stymied and will try to find an AMD machine inside RH (it's always a pain to install latest-and-greatest kernel on unknown machines). It's probably also time to buy one too... Paolo -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] KVM: x86: fix bogus warning about reserved bits
On Wed, Sep 23, 2015 at 11:36:47AM +0200, Paolo Bonzini wrote: > And another patch, which both cranks up the debugging a bit and > tries another fix: > > diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h > index dd05b9cef6ae..b2f49bb15ba1 100644 > --- a/arch/x86/kvm/cpuid.h > +++ b/arch/x86/kvm/cpuid.h > @@ -105,8 +105,15 @@ static inline bool guest_cpuid_has_x2apic(struct > kvm_vcpu *vcpu) > static inline bool guest_cpuid_is_amd(struct kvm_vcpu *vcpu) > { > struct kvm_cpuid_entry2 *best; > + static bool first; > > best = kvm_find_cpuid_entry(vcpu, 0, 0); > + if (first && best) { > + printk("cpuid(0).ebx = %x\n", best->ebx); > + first = false; > + } else if (first) > + printk_ratelimited("cpuid(0) not initialized yet\n"); > + > return best && best->ebx == X86EMUL_CPUID_VENDOR_AuthenticAMD_ebx; Do I see it correctly that that "first" thing is never true? In any case, I changed it to initialize to true but still no output from that function. [ 102.448438] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, addr 0xb8000 (level 4, 0xf00f8) [ 102.458706] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, addr 0xb8000 (level 3, 0xf0078) [ 102.468955] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, addr 0xb8000 (level 2, 0xf0078) [ 102.479337] dump hierarchy: [ 102.482152] -- spte 0x416edb027 level 4. [ 102.482154] -- spte 0x416eda027 level 3. [ 102.482155] -- spte 0x416ed5027 level 2. [ 102.482157] -- spte 0x000b8f67 level 1. [ 102.482158] [ cut here ] [ 102.482196] WARNING: CPU: 6 PID: 3550 at arch/x86/kvm/mmu.c:3396 handle_mmio_page_fault.part.57+0x1a/0x20 [kvm]() [ 102.482236] Modules linked in: tun sha256_ssse3 sha256_generic drbg binfmt_misc ipv6 vfat fat fuse dm_crypt dm_mod kvm_amd kvm crc32_pclmul aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd amd64_edac_mod fam15h_power k10temp edac_core amdkfd amd_iommu_v2 radeon acpi_cpufreq [ 102.482240] CPU: 6 PID: 3550 Comm: qemu-system-x86 Not tainted 4.3.0-rc2+ #1 [ 102.482242] Hardware name: To be filled by O.E.M. To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013 [ 102.482249] a030c992 880424b8fb78 812c758a [ 102.482253] 880424b8fbb0 810534c1 8804160e 000f [ 102.482257] 000b8000 880424b8fbc0 [ 102.482259] Call Trace: [ 102.482268] [] dump_stack+0x4e/0x84 [ 102.482273] [] warn_slowpath_common+0x91/0xd0 [ 102.482276] [] warn_slowpath_null+0x1a/0x20 [ 102.482306] [] handle_mmio_page_fault.part.57+0x1a/0x20 [kvm] [ 102.482334] [] tdp_page_fault+0x2a0/0x2b0 [kvm] [ 102.482340] [] ? __lock_acquire+0x57d/0x17a0 [ 102.482369] [] kvm_mmu_page_fault+0x35/0x240 [kvm] [ 102.482376] [] pf_interception+0x108/0x1d0 [kvm_amd] [ 102.482381] [] handle_exit+0x150/0xa40 [kvm_amd] [ 102.482408] [] ? kvm_arch_vcpu_ioctl_run+0x4c8/0x16f0 [kvm] [ 102.482435] [] kvm_arch_vcpu_ioctl_run+0x533/0x16f0 [kvm] [ 102.482461] [] ? kvm_arch_vcpu_ioctl_run+0x4c8/0x16f0 [kvm] [ 102.482466] [] ? mutex_lock_killable_nested+0x312/0x480 [ 102.482485] [] ? kvm_vcpu_ioctl+0x79/0x6f0 [kvm] [ 102.482490] [] ? preempt_count_sub+0xb3/0x110 [ 102.482509] [] kvm_vcpu_ioctl+0x33f/0x6f0 [kvm] [ 102.482515] [] do_vfs_ioctl+0x2d7/0x530 [ 102.482519] [] ? __fget_light+0x29/0x90 [ 102.482523] [] SyS_ioctl+0x4c/0x90 [ 102.482527] [] entry_SYSCALL_64_fastpath+0x16/0x73 [ 102.482531] ---[ end trace b8899512fc52cf2e ]--- Thanks. -- Regards/Gruss, Boris. ECO tip #101: Trim your mails when you reply. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] KVM: x86: fix bogus warning about reserved bits
On 23/09/2015 09:56, Borislav Petkov wrote: > On Tue, Sep 22, 2015 at 11:04:38PM +0200, Paolo Bonzini wrote: >> Let's add more debugging output: > > Here you go: > > [ 50.474002] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, > addr 0xb8000 (level 4, 0xf00f8) > [ 50.484249] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, > addr 0xb8000 (level 3, 0xf0078) > [ 50.494492] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, > addr 0xb8000 (level 2, 0xf0078) And another patch, which both cranks up the debugging a bit and tries another fix: diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h index dd05b9cef6ae..b2f49bb15ba1 100644 --- a/arch/x86/kvm/cpuid.h +++ b/arch/x86/kvm/cpuid.h @@ -105,8 +105,15 @@ static inline bool guest_cpuid_has_x2apic(struct kvm_vcpu *vcpu) static inline bool guest_cpuid_is_amd(struct kvm_vcpu *vcpu) { struct kvm_cpuid_entry2 *best; + static bool first; best = kvm_find_cpuid_entry(vcpu, 0, 0); + if (first && best) { + printk("cpuid(0).ebx = %x\n", best->ebx); + first = false; + } else if (first) + printk_ratelimited("cpuid(0) not initialized yet\n"); + return best && best->ebx == X86EMUL_CPUID_VENDOR_AuthenticAMD_ebx; } diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index bf1122e9c7bf..f50b280ffee1 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -3625,7 +3625,7 @@ static void __reset_rsvds_bits_mask(struct kvm_vcpu *vcpu, struct rsvd_bits_validate *rsvd_check, int maxphyaddr, int level, bool nx, bool gbpages, - bool pse) + bool pse, bool amd) { u64 exb_bit_rsvd = 0; u64 gbpages_bit_rsvd = 0; @@ -3642,7 +3642,7 @@ __reset_rsvds_bits_mask(struct kvm_vcpu *vcpu, * Non-leaf PML4Es and PDPEs reserve bit 8 (which would be the G bit for * leaf entries) on AMD CPUs only. */ - if (guest_cpuid_is_amd(vcpu)) + if (amd) nonleaf_bit8_rsvd = rsvd_bits(8, 8); switch (level) { @@ -3710,7 +3710,7 @@ static void reset_rsvds_bits_mask(struct kvm_vcpu *vcpu, __reset_rsvds_bits_mask(vcpu, >guest_rsvd_check, cpuid_maxphyaddr(vcpu), context->root_level, context->nx, guest_cpuid_has_gbpages(vcpu), - is_pse(vcpu)); + is_pse(vcpu), guest_cpuid_is_amd(vcpu)); } static void @@ -3760,13 +3760,25 @@ static void reset_rsvds_bits_mask_ept(struct kvm_vcpu *vcpu, void reset_shadow_zero_bits_mask(struct kvm_vcpu *vcpu, struct kvm_mmu *context) { + /* +* Passing "true" to the last argument is okay; it adds a check +* on bit 8 of the SPTEs which KVM doesn't use anyway. +*/ __reset_rsvds_bits_mask(vcpu, >shadow_zero_check, boot_cpu_data.x86_phys_bits, context->shadow_root_level, context->nx, - guest_cpuid_has_gbpages(vcpu), is_pse(vcpu)); + guest_cpuid_has_gbpages(vcpu), is_pse(vcpu), + true); } EXPORT_SYMBOL_GPL(reset_shadow_zero_bits_mask); +static inline bool +boot_cpu_is_amd(void) +{ + WARN_ON_ONCE(!tdp_enabled); + return shadow_x_mask != 0; +} + /* * the direct page table on host, use as much mmu features as * possible, however, kvm currently does not do execution-protection. @@ -3775,11 +3787,11 @@ static void reset_tdp_shadow_zero_bits_mask(struct kvm_vcpu *vcpu, struct kvm_mmu *context) { - if (guest_cpuid_is_amd(vcpu)) + if (boot_cpu_is_amd()) __reset_rsvds_bits_mask(vcpu, >shadow_zero_check, boot_cpu_data.x86_phys_bits, context->shadow_root_level, false, - cpu_has_gbpages, true); + cpu_has_gbpages, true, true); else __reset_rsvds_bits_mask_ept(>shadow_zero_check, boot_cpu_data.x86_phys_bits, Applies on top of everything else you've got already. Paolo -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] KVM: x86: fix bogus warning about reserved bits
On 23/09/2015 09:56, Borislav Petkov wrote: > [ 50.474002] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, > addr 0xb8000 (level 4, 0xf00f8) > [ 50.484249] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, > addr 0xb8000 (level 3, 0xf0078) > [ 50.494492] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, > addr 0xb8000 (level 2, 0xf0078) It's checking against EPT format, no surprise that it complains... Paolo -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] KVM: x86: fix bogus warning about reserved bits
On Tue, Sep 22, 2015 at 11:04:38PM +0200, Paolo Bonzini wrote: > Let's add more debugging output: Here you go: [ 50.474002] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, addr 0xb8000 (level 4, 0xf00f8) [ 50.484249] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, addr 0xb8000 (level 3, 0xf0078) [ 50.494492] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, addr 0xb8000 (level 2, 0xf0078) [ 50.504767] dump hierarchy: [ 50.507595] -- spte 0x416533027 level 4. [ 50.507595] -- spte 0x416534027 level 3. [ 50.507596] -- spte 0x416535027 level 2. [ 50.507596] -- spte 0x000b8f67 level 1. [ 50.507597] [ cut here ] [ 50.507616] WARNING: CPU: 4 PID: 3539 at arch/x86/kvm/mmu.c:3396 handle_mmio_page_fault.part.57+0x1a/0x20 [kvm]() [ 50.507630] Modules linked in: tun sha256_ssse3 sha256_generic drbg binfmt_misc ipv6 vfat fat fuse dm_crypt dm_mod kvm_amd kvm crc32_pclmul aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd amd64_edac_mod k10temp edac_core fam15h_power amdkfd amd_iommu_v2 radeon acpi_cpufreq [ 50.507632] CPU: 4 PID: 3539 Comm: qemu-system-x86 Not tainted 4.3.0-rc2+ #2 [ 50.507633] Hardware name: To be filled by O.E.M. To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013 [ 50.507635] a0433932 880416973b78 812c758a [ 50.507637] 880416973bb0 810534c1 8804231c 000f [ 50.507638] 000b8000 880416973bc0 [ 50.507639] Call Trace: [ 50.507643] [] dump_stack+0x4e/0x84 [ 50.507646] [] warn_slowpath_common+0x91/0xd0 [ 50.507647] [] warn_slowpath_null+0x1a/0x20 [ 50.507657] [] handle_mmio_page_fault.part.57+0x1a/0x20 [kvm] [ 50.507667] [] tdp_page_fault+0x2a0/0x2b0 [kvm] [ 50.507673] [] ? __lock_acquire+0x57d/0x17a0 [ 50.507682] [] kvm_mmu_page_fault+0x35/0x240 [kvm] [ 50.507685] [] pf_interception+0x108/0x1d0 [kvm_amd] [ 50.507688] [] handle_exit+0x150/0xa40 [kvm_amd] [ 50.507697] [] ? kvm_arch_vcpu_ioctl_run+0x4c8/0x16f0 [kvm] [ 50.507706] [] kvm_arch_vcpu_ioctl_run+0x533/0x16f0 [kvm] [ 50.507715] [] ? kvm_arch_vcpu_ioctl_run+0x4c8/0x16f0 [kvm] [ 50.507717] [] ? mutex_lock_killable_nested+0x312/0x480 [ 50.507724] [] ? kvm_vcpu_ioctl+0x79/0x6f0 [kvm] [ 50.507726] [] ? preempt_count_sub+0xb3/0x110 [ 50.507733] [] kvm_vcpu_ioctl+0x33f/0x6f0 [kvm] [ 50.507735] [] do_vfs_ioctl+0x2d7/0x530 [ 50.507737] [] ? __fget_light+0x29/0x90 [ 50.507738] [] SyS_ioctl+0x4c/0x90 [ 50.507740] [] entry_SYSCALL_64_fastpath+0x16/0x73 [ 50.507741] ---[ end trace ff23795fcc279cbd ]--- > Thus same as before. > > Just to be safe, can you try using "-cpu host" on the QEMU command > line and see if it changes anything? This would catch things such > as an Intel CPUID on an AMD host. Here's my full qemu command: qemu-system-x86_64 -enable-kvm -gdb tcp::1234 -cpu host -m 2048 -hda /home/boris/kvm/debian/sid-x86_64.img -hdb /home/boris/kvm/swap.img -boot menu=off,order=c -localtime -net nic,model=rtl8139 -net user,hostfwd=tcp::1235-:22 -usbdevice tablet -kernel /home/boris/kernel/linux-2.6/arch/x86/boot/bzImage -append "root=/dev/sda1 resume=/dev/sdb1 debug ignore_loglevel log_buf_len=16M earlyprintk=ttyS0,115200 console=ttyS0,115200 console=tty0 " -monitor pty -virtfs local,path=/tmp,mount_tag=tmp,security_model=none -serial file:/home/boris/kvm/test-x86_64-1235.log -snapshot -name "Debian x86_64:1235" -smp 8 and that splats too: [ 146.891735] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, addr 0xb8000 (level 4, 0xf00f8) [ 146.901981] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, addr 0xb8000 (level 3, 0xf0078) [ 146.912224] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, addr 0xb8000 (level 2, 0xf0078) [ 146.922496] dump hierarchy: [ 146.925331] -- spte 0x37d47027 level 4. [ 146.925332] -- spte 0x37d46027 level 3. [ 146.925332] -- spte 0xb9faa027 level 2. [ 146.925333] -- spte 0x000b8f67 level 1. [ 146.925333] [ cut here ] [ 146.925351] WARNING: CPU: 6 PID: 3753 at arch/x86/kvm/mmu.c:3396 handle_mmio_page_fault.part.57+0x1a/0x20 [kvm]() [ 146.925371] Modules linked in: tun sha256_ssse3 sha256_generic drbg binfmt_misc ipv6 vfat fat fuse dm_crypt dm_mod kvm_amd kvm crc32_pclmul aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd amd64_edac_mod k10temp edac_core fam15h_power amdkfd amd_iommu_v2 radeon acpi_cpufreq [ 146.925373] CPU: 6 PID: 3753 Comm: qemu-system-x86 Tainted: GW 4.3.0-rc2+ #2 [ 146.925374] Hardware name: To be filled by O.E.M. To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013 [ 146.925376] a0433932 880423377b78 812c758a [ 146.925378]
Re: [PATCH] KVM: x86: fix bogus warning about reserved bits
On Wed, Sep 23, 2015 at 11:36:47AM +0200, Paolo Bonzini wrote: > And another patch, which both cranks up the debugging a bit and > tries another fix: > > diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h > index dd05b9cef6ae..b2f49bb15ba1 100644 > --- a/arch/x86/kvm/cpuid.h > +++ b/arch/x86/kvm/cpuid.h > @@ -105,8 +105,15 @@ static inline bool guest_cpuid_has_x2apic(struct > kvm_vcpu *vcpu) > static inline bool guest_cpuid_is_amd(struct kvm_vcpu *vcpu) > { > struct kvm_cpuid_entry2 *best; > + static bool first; > > best = kvm_find_cpuid_entry(vcpu, 0, 0); > + if (first && best) { > + printk("cpuid(0).ebx = %x\n", best->ebx); > + first = false; > + } else if (first) > + printk_ratelimited("cpuid(0) not initialized yet\n"); > + > return best && best->ebx == X86EMUL_CPUID_VENDOR_AuthenticAMD_ebx; Do I see it correctly that that "first" thing is never true? In any case, I changed it to initialize to true but still no output from that function. [ 102.448438] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, addr 0xb8000 (level 4, 0xf00f8) [ 102.458706] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, addr 0xb8000 (level 3, 0xf0078) [ 102.468955] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, addr 0xb8000 (level 2, 0xf0078) [ 102.479337] dump hierarchy: [ 102.482152] -- spte 0x416edb027 level 4. [ 102.482154] -- spte 0x416eda027 level 3. [ 102.482155] -- spte 0x416ed5027 level 2. [ 102.482157] -- spte 0x000b8f67 level 1. [ 102.482158] [ cut here ] [ 102.482196] WARNING: CPU: 6 PID: 3550 at arch/x86/kvm/mmu.c:3396 handle_mmio_page_fault.part.57+0x1a/0x20 [kvm]() [ 102.482236] Modules linked in: tun sha256_ssse3 sha256_generic drbg binfmt_misc ipv6 vfat fat fuse dm_crypt dm_mod kvm_amd kvm crc32_pclmul aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd amd64_edac_mod fam15h_power k10temp edac_core amdkfd amd_iommu_v2 radeon acpi_cpufreq [ 102.482240] CPU: 6 PID: 3550 Comm: qemu-system-x86 Not tainted 4.3.0-rc2+ #1 [ 102.482242] Hardware name: To be filled by O.E.M. To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013 [ 102.482249] a030c992 880424b8fb78 812c758a [ 102.482253] 880424b8fbb0 810534c1 8804160e 000f [ 102.482257] 000b8000 880424b8fbc0 [ 102.482259] Call Trace: [ 102.482268] [] dump_stack+0x4e/0x84 [ 102.482273] [] warn_slowpath_common+0x91/0xd0 [ 102.482276] [] warn_slowpath_null+0x1a/0x20 [ 102.482306] [] handle_mmio_page_fault.part.57+0x1a/0x20 [kvm] [ 102.482334] [] tdp_page_fault+0x2a0/0x2b0 [kvm] [ 102.482340] [] ? __lock_acquire+0x57d/0x17a0 [ 102.482369] [] kvm_mmu_page_fault+0x35/0x240 [kvm] [ 102.482376] [] pf_interception+0x108/0x1d0 [kvm_amd] [ 102.482381] [] handle_exit+0x150/0xa40 [kvm_amd] [ 102.482408] [] ? kvm_arch_vcpu_ioctl_run+0x4c8/0x16f0 [kvm] [ 102.482435] [] kvm_arch_vcpu_ioctl_run+0x533/0x16f0 [kvm] [ 102.482461] [] ? kvm_arch_vcpu_ioctl_run+0x4c8/0x16f0 [kvm] [ 102.482466] [] ? mutex_lock_killable_nested+0x312/0x480 [ 102.482485] [] ? kvm_vcpu_ioctl+0x79/0x6f0 [kvm] [ 102.482490] [] ? preempt_count_sub+0xb3/0x110 [ 102.482509] [] kvm_vcpu_ioctl+0x33f/0x6f0 [kvm] [ 102.482515] [] do_vfs_ioctl+0x2d7/0x530 [ 102.482519] [] ? __fget_light+0x29/0x90 [ 102.482523] [] SyS_ioctl+0x4c/0x90 [ 102.482527] [] entry_SYSCALL_64_fastpath+0x16/0x73 [ 102.482531] ---[ end trace b8899512fc52cf2e ]--- Thanks. -- Regards/Gruss, Boris. ECO tip #101: Trim your mails when you reply. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] KVM: x86: fix bogus warning about reserved bits
On 23/09/2015 13:07, Borislav Petkov wrote: >> > + static bool first; >> > >> >best = kvm_find_cpuid_entry(vcpu, 0, 0); >> > + if (first && best) { >> > + printk("cpuid(0).ebx = %x\n", best->ebx); >> > + first = false; >> > + } else if (first) >> > + printk_ratelimited("cpuid(0) not initialized yet\n"); >> > + >> >return best && best->ebx == X86EMUL_CPUID_VENDOR_AuthenticAMD_ebx; > Do I see it correctly that that "first" thing is never true? > > In any case, I changed it to initialize to true but still no output from > that function. > > [ 102.448438] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, > addr 0xb8000 (level 4, 0xf00f8) > [ 102.458706] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, > addr 0xb8000 (level 3, 0xf0078) > [ 102.468955] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, > addr 0xb8000 (level 2, 0xf0078) Yeah, but I didn't totally need the output so I didn't bother answering with yet another v2. I'm a bit stymied and will try to find an AMD machine inside RH (it's always a pain to install latest-and-greatest kernel on unknown machines). It's probably also time to buy one too... Paolo -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] KVM: x86: fix bogus warning about reserved bits
On 23/09/2015 09:56, Borislav Petkov wrote: > [ 50.474002] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, > addr 0xb8000 (level 4, 0xf00f8) > [ 50.484249] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, > addr 0xb8000 (level 3, 0xf0078) > [ 50.494492] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, > addr 0xb8000 (level 2, 0xf0078) It's checking against EPT format, no surprise that it complains... Paolo -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] KVM: x86: fix bogus warning about reserved bits
On Tue, Sep 22, 2015 at 11:04:38PM +0200, Paolo Bonzini wrote: > Let's add more debugging output: Here you go: [ 50.474002] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, addr 0xb8000 (level 4, 0xf00f8) [ 50.484249] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, addr 0xb8000 (level 3, 0xf0078) [ 50.494492] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, addr 0xb8000 (level 2, 0xf0078) [ 50.504767] dump hierarchy: [ 50.507595] -- spte 0x416533027 level 4. [ 50.507595] -- spte 0x416534027 level 3. [ 50.507596] -- spte 0x416535027 level 2. [ 50.507596] -- spte 0x000b8f67 level 1. [ 50.507597] [ cut here ] [ 50.507616] WARNING: CPU: 4 PID: 3539 at arch/x86/kvm/mmu.c:3396 handle_mmio_page_fault.part.57+0x1a/0x20 [kvm]() [ 50.507630] Modules linked in: tun sha256_ssse3 sha256_generic drbg binfmt_misc ipv6 vfat fat fuse dm_crypt dm_mod kvm_amd kvm crc32_pclmul aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd amd64_edac_mod k10temp edac_core fam15h_power amdkfd amd_iommu_v2 radeon acpi_cpufreq [ 50.507632] CPU: 4 PID: 3539 Comm: qemu-system-x86 Not tainted 4.3.0-rc2+ #2 [ 50.507633] Hardware name: To be filled by O.E.M. To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013 [ 50.507635] a0433932 880416973b78 812c758a [ 50.507637] 880416973bb0 810534c1 8804231c 000f [ 50.507638] 000b8000 880416973bc0 [ 50.507639] Call Trace: [ 50.507643] [] dump_stack+0x4e/0x84 [ 50.507646] [] warn_slowpath_common+0x91/0xd0 [ 50.507647] [] warn_slowpath_null+0x1a/0x20 [ 50.507657] [] handle_mmio_page_fault.part.57+0x1a/0x20 [kvm] [ 50.507667] [] tdp_page_fault+0x2a0/0x2b0 [kvm] [ 50.507673] [] ? __lock_acquire+0x57d/0x17a0 [ 50.507682] [] kvm_mmu_page_fault+0x35/0x240 [kvm] [ 50.507685] [] pf_interception+0x108/0x1d0 [kvm_amd] [ 50.507688] [] handle_exit+0x150/0xa40 [kvm_amd] [ 50.507697] [] ? kvm_arch_vcpu_ioctl_run+0x4c8/0x16f0 [kvm] [ 50.507706] [] kvm_arch_vcpu_ioctl_run+0x533/0x16f0 [kvm] [ 50.507715] [] ? kvm_arch_vcpu_ioctl_run+0x4c8/0x16f0 [kvm] [ 50.507717] [] ? mutex_lock_killable_nested+0x312/0x480 [ 50.507724] [] ? kvm_vcpu_ioctl+0x79/0x6f0 [kvm] [ 50.507726] [] ? preempt_count_sub+0xb3/0x110 [ 50.507733] [] kvm_vcpu_ioctl+0x33f/0x6f0 [kvm] [ 50.507735] [] do_vfs_ioctl+0x2d7/0x530 [ 50.507737] [] ? __fget_light+0x29/0x90 [ 50.507738] [] SyS_ioctl+0x4c/0x90 [ 50.507740] [] entry_SYSCALL_64_fastpath+0x16/0x73 [ 50.507741] ---[ end trace ff23795fcc279cbd ]--- > Thus same as before. > > Just to be safe, can you try using "-cpu host" on the QEMU command > line and see if it changes anything? This would catch things such > as an Intel CPUID on an AMD host. Here's my full qemu command: qemu-system-x86_64 -enable-kvm -gdb tcp::1234 -cpu host -m 2048 -hda /home/boris/kvm/debian/sid-x86_64.img -hdb /home/boris/kvm/swap.img -boot menu=off,order=c -localtime -net nic,model=rtl8139 -net user,hostfwd=tcp::1235-:22 -usbdevice tablet -kernel /home/boris/kernel/linux-2.6/arch/x86/boot/bzImage -append "root=/dev/sda1 resume=/dev/sdb1 debug ignore_loglevel log_buf_len=16M earlyprintk=ttyS0,115200 console=ttyS0,115200 console=tty0 " -monitor pty -virtfs local,path=/tmp,mount_tag=tmp,security_model=none -serial file:/home/boris/kvm/test-x86_64-1235.log -snapshot -name "Debian x86_64:1235" -smp 8 and that splats too: [ 146.891735] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, addr 0xb8000 (level 4, 0xf00f8) [ 146.901981] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, addr 0xb8000 (level 3, 0xf0078) [ 146.912224] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, addr 0xb8000 (level 2, 0xf0078) [ 146.922496] dump hierarchy: [ 146.925331] -- spte 0x37d47027 level 4. [ 146.925332] -- spte 0x37d46027 level 3. [ 146.925332] -- spte 0xb9faa027 level 2. [ 146.925333] -- spte 0x000b8f67 level 1. [ 146.925333] [ cut here ] [ 146.925351] WARNING: CPU: 6 PID: 3753 at arch/x86/kvm/mmu.c:3396 handle_mmio_page_fault.part.57+0x1a/0x20 [kvm]() [ 146.925371] Modules linked in: tun sha256_ssse3 sha256_generic drbg binfmt_misc ipv6 vfat fat fuse dm_crypt dm_mod kvm_amd kvm crc32_pclmul aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd amd64_edac_mod k10temp edac_core fam15h_power amdkfd amd_iommu_v2 radeon acpi_cpufreq [ 146.925373] CPU: 6 PID: 3753 Comm: qemu-system-x86 Tainted: GW 4.3.0-rc2+ #2 [ 146.925374] Hardware name: To be filled by O.E.M. To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013 [ 146.925376] a0433932 880423377b78 812c758a [ 146.925378]
Re: [PATCH] KVM: x86: fix bogus warning about reserved bits
On 09/23/2015 05:36 PM, Paolo Bonzini wrote: On 23/09/2015 09:56, Borislav Petkov wrote: On Tue, Sep 22, 2015 at 11:04:38PM +0200, Paolo Bonzini wrote: Let's add more debugging output: Here you go: [ 50.474002] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, addr 0xb8000 (level 4, 0xf00f8) [ 50.484249] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, addr 0xb8000 (level 3, 0xf0078) [ 50.494492] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, addr 0xb8000 (level 2, 0xf0078) And another patch, which both cranks up the debugging a bit and tries another fix: diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h index dd05b9cef6ae..b2f49bb15ba1 100644 --- a/arch/x86/kvm/cpuid.h +++ b/arch/x86/kvm/cpuid.h @@ -105,8 +105,15 @@ static inline bool guest_cpuid_has_x2apic(struct kvm_vcpu *vcpu) static inline bool guest_cpuid_is_amd(struct kvm_vcpu *vcpu) { struct kvm_cpuid_entry2 *best; + static bool first; best = kvm_find_cpuid_entry(vcpu, 0, 0); + if (first && best) { + printk("cpuid(0).ebx = %x\n", best->ebx); + first = false; + } else if (first) + printk_ratelimited("cpuid(0) not initialized yet\n"); + return best && best->ebx == X86EMUL_CPUID_VENDOR_AuthenticAMD_ebx; } diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index bf1122e9c7bf..f50b280ffee1 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -3625,7 +3625,7 @@ static void __reset_rsvds_bits_mask(struct kvm_vcpu *vcpu, struct rsvd_bits_validate *rsvd_check, int maxphyaddr, int level, bool nx, bool gbpages, - bool pse) + bool pse, bool amd) { u64 exb_bit_rsvd = 0; u64 gbpages_bit_rsvd = 0; @@ -3642,7 +3642,7 @@ __reset_rsvds_bits_mask(struct kvm_vcpu *vcpu, * Non-leaf PML4Es and PDPEs reserve bit 8 (which would be the G bit for * leaf entries) on AMD CPUs only. */ - if (guest_cpuid_is_amd(vcpu)) + if (amd) nonleaf_bit8_rsvd = rsvd_bits(8, 8); switch (level) { @@ -3710,7 +3710,7 @@ static void reset_rsvds_bits_mask(struct kvm_vcpu *vcpu, __reset_rsvds_bits_mask(vcpu, >guest_rsvd_check, cpuid_maxphyaddr(vcpu), context->root_level, context->nx, guest_cpuid_has_gbpages(vcpu), - is_pse(vcpu)); + is_pse(vcpu), guest_cpuid_is_amd(vcpu)); } static void @@ -3760,13 +3760,25 @@ static void reset_rsvds_bits_mask_ept(struct kvm_vcpu *vcpu, void reset_shadow_zero_bits_mask(struct kvm_vcpu *vcpu, struct kvm_mmu *context) { + /* +* Passing "true" to the last argument is okay; it adds a check +* on bit 8 of the SPTEs which KVM doesn't use anyway. +*/ __reset_rsvds_bits_mask(vcpu, >shadow_zero_check, boot_cpu_data.x86_phys_bits, context->shadow_root_level, context->nx, - guest_cpuid_has_gbpages(vcpu), is_pse(vcpu)); + guest_cpuid_has_gbpages(vcpu), is_pse(vcpu), + true); } EXPORT_SYMBOL_GPL(reset_shadow_zero_bits_mask); +static inline bool +boot_cpu_is_amd(void) +{ + WARN_ON_ONCE(!tdp_enabled); + return shadow_x_mask != 0; shadow_x_mask != 0 is Intel's CPU. Borislav, could you please check shadow_x_mask == 0 instead and test it again? Further more, use guest_cpuid_is_amd() to detect hardware CPU vendor is wrong as usespace can fool KVM. Should test host CPUID or introduce intel/amd callback instead. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] KVM: x86: fix bogus warning about reserved bits
On 23/09/2015 09:56, Borislav Petkov wrote: > On Tue, Sep 22, 2015 at 11:04:38PM +0200, Paolo Bonzini wrote: >> Let's add more debugging output: > > Here you go: > > [ 50.474002] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, > addr 0xb8000 (level 4, 0xf00f8) > [ 50.484249] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, > addr 0xb8000 (level 3, 0xf0078) > [ 50.494492] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, > addr 0xb8000 (level 2, 0xf0078) And another patch, which both cranks up the debugging a bit and tries another fix: diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h index dd05b9cef6ae..b2f49bb15ba1 100644 --- a/arch/x86/kvm/cpuid.h +++ b/arch/x86/kvm/cpuid.h @@ -105,8 +105,15 @@ static inline bool guest_cpuid_has_x2apic(struct kvm_vcpu *vcpu) static inline bool guest_cpuid_is_amd(struct kvm_vcpu *vcpu) { struct kvm_cpuid_entry2 *best; + static bool first; best = kvm_find_cpuid_entry(vcpu, 0, 0); + if (first && best) { + printk("cpuid(0).ebx = %x\n", best->ebx); + first = false; + } else if (first) + printk_ratelimited("cpuid(0) not initialized yet\n"); + return best && best->ebx == X86EMUL_CPUID_VENDOR_AuthenticAMD_ebx; } diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index bf1122e9c7bf..f50b280ffee1 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -3625,7 +3625,7 @@ static void __reset_rsvds_bits_mask(struct kvm_vcpu *vcpu, struct rsvd_bits_validate *rsvd_check, int maxphyaddr, int level, bool nx, bool gbpages, - bool pse) + bool pse, bool amd) { u64 exb_bit_rsvd = 0; u64 gbpages_bit_rsvd = 0; @@ -3642,7 +3642,7 @@ __reset_rsvds_bits_mask(struct kvm_vcpu *vcpu, * Non-leaf PML4Es and PDPEs reserve bit 8 (which would be the G bit for * leaf entries) on AMD CPUs only. */ - if (guest_cpuid_is_amd(vcpu)) + if (amd) nonleaf_bit8_rsvd = rsvd_bits(8, 8); switch (level) { @@ -3710,7 +3710,7 @@ static void reset_rsvds_bits_mask(struct kvm_vcpu *vcpu, __reset_rsvds_bits_mask(vcpu, >guest_rsvd_check, cpuid_maxphyaddr(vcpu), context->root_level, context->nx, guest_cpuid_has_gbpages(vcpu), - is_pse(vcpu)); + is_pse(vcpu), guest_cpuid_is_amd(vcpu)); } static void @@ -3760,13 +3760,25 @@ static void reset_rsvds_bits_mask_ept(struct kvm_vcpu *vcpu, void reset_shadow_zero_bits_mask(struct kvm_vcpu *vcpu, struct kvm_mmu *context) { + /* +* Passing "true" to the last argument is okay; it adds a check +* on bit 8 of the SPTEs which KVM doesn't use anyway. +*/ __reset_rsvds_bits_mask(vcpu, >shadow_zero_check, boot_cpu_data.x86_phys_bits, context->shadow_root_level, context->nx, - guest_cpuid_has_gbpages(vcpu), is_pse(vcpu)); + guest_cpuid_has_gbpages(vcpu), is_pse(vcpu), + true); } EXPORT_SYMBOL_GPL(reset_shadow_zero_bits_mask); +static inline bool +boot_cpu_is_amd(void) +{ + WARN_ON_ONCE(!tdp_enabled); + return shadow_x_mask != 0; +} + /* * the direct page table on host, use as much mmu features as * possible, however, kvm currently does not do execution-protection. @@ -3775,11 +3787,11 @@ static void reset_tdp_shadow_zero_bits_mask(struct kvm_vcpu *vcpu, struct kvm_mmu *context) { - if (guest_cpuid_is_amd(vcpu)) + if (boot_cpu_is_amd()) __reset_rsvds_bits_mask(vcpu, >shadow_zero_check, boot_cpu_data.x86_phys_bits, context->shadow_root_level, false, - cpu_has_gbpages, true); + cpu_has_gbpages, true, true); else __reset_rsvds_bits_mask_ept(>shadow_zero_check, boot_cpu_data.x86_phys_bits, Applies on top of everything else you've got already. Paolo -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] KVM: x86: fix bogus warning about reserved bits
On 22/09/2015 19:56, Borislav Petkov wrote: > diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c > index 69088a1ba509..3ce2b74c75dc 100644 > --- a/arch/x86/kvm/mmu.c > +++ b/arch/x86/kvm/mmu.c > @@ -3322,7 +3322,7 @@ walk_shadow_page_get_mmio_spte(struct kvm_vcpu *vcpu, > u64 addr, u64 *sptep) > break; > > reserved |= is_shadow_zero_bits_set(>arch.mmu, spte, > - leaf); > + iterator.level); > } > > walk_shadow_page_lockless_end(vcpu); > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index c0b9ff3e1aec..a44f8fed9be1 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -7063,13 +7063,16 @@ struct kvm_vcpu *kvm_arch_vcpu_create(struct kvm *kvm, > unsigned int id) > { > struct kvm_vcpu *vcpu; > + int idx; > > if (check_tsc_unstable() && atomic_read(>online_vcpus) != 0) > printk_once(KERN_WARNING > "kvm: SMP vm created on host with unstable TSC; " > "guest TSC will not be reliable\n"); > > + idx = srcu_read_lock(>srcu); > vcpu = kvm_x86_ops->vcpu_create(kvm, id); > + srcu_read_unlock(>srcu, idx); > > return vcpu; > } Yup, looks good. Let's add more debugging output: diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index 3ce2b74c75dc..bf1122e9c7bf 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -3268,23 +3268,28 @@ static gpa_t nonpaging_gva_to_gpa_nested(struct kvm_vcpu *vcpu, gva_t vaddr, return vcpu->arch.nested_mmu.translate_gpa(vcpu, vaddr, access, exception); } -static bool -__is_rsvd_bits_set(struct rsvd_bits_validate *rsvd_check, u64 pte, int level) +static u64 +rsvd_bits_set(struct rsvd_bits_validate *rsvd_check, u64 pte, int level) { int bit7 = (pte >> 7) & 1, low6 = pte & 0x3f; + u64 mask = rsvd_check->rsvd_bits_mask[bit7][level-1]; + + if (unlikely(pte & mask)) + return mask; + if (unlikely(rsvd_check->bad_mt_xwr & (1ull << low6))) + return rsvd_check->bad_mt_xwr; - return (pte & rsvd_check->rsvd_bits_mask[bit7][level-1]) | - ((rsvd_check->bad_mt_xwr & (1ull << low6)) != 0); + return 0; } static bool is_rsvd_bits_set(struct kvm_mmu *mmu, u64 gpte, int level) { - return __is_rsvd_bits_set(>guest_rsvd_check, gpte, level); + return rsvd_bits_set(>guest_rsvd_check, gpte, level) != 0; } -static bool is_shadow_zero_bits_set(struct kvm_mmu *mmu, u64 spte, int level) +static u64 shadow_rsvd_bits_set(struct kvm_mmu *mmu, u64 spte, int level) { - return __is_rsvd_bits_set(>shadow_zero_check, spte, level); + return rsvd_bits_set(>shadow_zero_check, spte, level); } static bool quickly_check_mmio_pf(struct kvm_vcpu *vcpu, u64 addr, bool direct) @@ -3302,6 +3307,7 @@ walk_shadow_page_get_mmio_spte(struct kvm_vcpu *vcpu, u64 addr, u64 *sptep) struct kvm_shadow_walk_iterator iterator; u64 sptes[PT64_ROOT_LEVEL], spte = 0ull; int root, leaf; + u64 result; bool reserved = false; if (!VALID_PAGE(vcpu->arch.mmu.root_hpa)) @@ -3321,15 +3327,20 @@ walk_shadow_page_get_mmio_spte(struct kvm_vcpu *vcpu, u64 addr, u64 *sptep) if (!is_shadow_present_pte(spte)) break; - reserved |= is_shadow_zero_bits_set(>arch.mmu, spte, - iterator.level); + result = shadow_rsvd_bits_set(>arch.mmu, spte, + iterator.level); + if (unlikely(result)) { + pr_err("%s: detect reserved bits on spte, addr 0x%llx " + "(level %d, 0x%llx)\n", + __func__, addr, iterator.level, result); + reserved = true; + } } walk_shadow_page_lockless_end(vcpu); if (reserved) { - pr_err("%s: detect reserved bits on spte, addr 0x%llx, dump hierarchy:\n", - __func__, addr); + pr_err("dump hierarchy:\n"); while (root > leaf) { pr_err("-- spte 0x%llx level %d.\n", sptes[root - 1], root); > [ 49.456533] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, > addr 0xb8000, dump hierarchy: > [ 49.465945] -- spte 0x416ed9027 level 4. > [ 49.470221] -- spte 0x416888027 level 3. > [ 49.474494] -- spte 0x41694f027 level 2. > [ 49.474495] -- spte 0x000b8f67 level 1. Thus same as before. Just to be safe, can you try using "-cpu host" on the QEMU command line and see if it changes anything? This would catch things such as an Intel CPUID on an AMD host. Paolo > [ 49.474496] [ cut here
Re: [PATCH] KVM: x86: fix bogus warning about reserved bits
On Tue, Sep 22, 2015 at 10:25:29AM +0200, Paolo Bonzini wrote: > 29ecd6601904 ("KVM: x86: avoid uninitialized variable warning", > 2015-09-06) introduced a not-so-subtle problem, which probably > escaped review because it was not part of the patch context. ... > diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c > index 69088a1ba509..3ce2b74c75dc 100644 > --- a/arch/x86/kvm/mmu.c > +++ b/arch/x86/kvm/mmu.c > @@ -3322,7 +3322,7 @@ walk_shadow_page_get_mmio_spte(struct kvm_vcpu *vcpu, > u64 addr, u64 *sptep) > break; > > reserved |= is_shadow_zero_bits_set(>arch.mmu, spte, > - leaf); > + iterator.level); > } > > walk_shadow_page_lockless_end(vcpu); > -- No joy, I still see the splat at the end of this mail when starting a kvm guest. Btw, this is what I have ontop of rc2+tip: --- diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index 69088a1ba509..3ce2b74c75dc 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -3322,7 +3322,7 @@ walk_shadow_page_get_mmio_spte(struct kvm_vcpu *vcpu, u64 addr, u64 *sptep) break; reserved |= is_shadow_zero_bits_set(>arch.mmu, spte, - leaf); + iterator.level); } walk_shadow_page_lockless_end(vcpu); diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index c0b9ff3e1aec..a44f8fed9be1 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -7063,13 +7063,16 @@ struct kvm_vcpu *kvm_arch_vcpu_create(struct kvm *kvm, unsigned int id) { struct kvm_vcpu *vcpu; + int idx; if (check_tsc_unstable() && atomic_read(>online_vcpus) != 0) printk_once(KERN_WARNING "kvm: SMP vm created on host with unstable TSC; " "guest TSC will not be reliable\n"); + idx = srcu_read_lock(>srcu); vcpu = kvm_x86_ops->vcpu_create(kvm, id); + srcu_read_unlock(>srcu, idx); return vcpu; } --- --- [ 49.456533] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, addr 0xb8000, dump hierarchy: [ 49.465945] -- spte 0x416ed9027 level 4. [ 49.470221] -- spte 0x416888027 level 3. [ 49.474494] -- spte 0x41694f027 level 2. [ 49.474495] -- spte 0x000b8f67 level 1. [ 49.474496] [ cut here ] [ 49.474515] WARNING: CPU: 4 PID: 3540 at arch/x86/kvm/mmu.c:3385 handle_mmio_page_fault.part.57+0x1a/0x20 [kvm]() [ 49.474555] Modules linked in: tun sha256_ssse3 sha256_generic drbg binfmt_misc ipv6 vfat fat fuse dm_crypt dm_mod kvm_amd kvm crc32_pclmul aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd amd64_edac_mod fam15h_power k10temp edac_core amdkfd amd_iommu_v2 radeon acpi_cpufreq [ 49.474560] CPU: 4 PID: 3540 Comm: qemu-system-x86 Not tainted 4.3.0-rc2+ #2 [ 49.474562] Hardware name: To be filled by O.E.M. To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013 [ 49.474569] a032f8b2 880416a73b78 812c758a [ 49.474574] 880416a73bb0 810534c1 8804171b 000f [ 49.474578] 000b8000 880416a73bc0 [ 49.474579] Call Trace: [ 49.474586] [] dump_stack+0x4e/0x84 [ 49.474589] [] warn_slowpath_common+0x91/0xd0 [ 49.474592] [] warn_slowpath_null+0x1a/0x20 [ 49.474603] [] handle_mmio_page_fault.part.57+0x1a/0x20 [kvm] [ 49.474615] [] tdp_page_fault+0x2a0/0x2b0 [kvm] [ 49.474620] [] ? __lock_acquire+0x57d/0x17a0 [ 49.474633] [] kvm_mmu_page_fault+0x35/0x240 [kvm] [ 49.474637] [] pf_interception+0x108/0x1d0 [kvm_amd] [ 49.474642] [] handle_exit+0x150/0xa40 [kvm_amd] [ 49.474662] [] ? kvm_arch_vcpu_ioctl_run+0x4c8/0x16f0 [kvm] [ 49.474674] [] kvm_arch_vcpu_ioctl_run+0x533/0x16f0 [kvm] [ 49.474686] [] ? kvm_arch_vcpu_ioctl_run+0x4c8/0x16f0 [kvm] [ 49.474690] [] ? mutex_lock_killable_nested+0x312/0x480 [ 49.474700] [] ? kvm_vcpu_ioctl+0x79/0x6f0 [kvm] [ 49.474705] [] ? preempt_count_sub+0xb3/0x110 [ 49.474715] [] kvm_vcpu_ioctl+0x33f/0x6f0 [kvm] [ 49.474719] [] do_vfs_ioctl+0x2d7/0x530 [ 49.474722] [] ? __fget_light+0x29/0x90 [ 49.474724] [] SyS_ioctl+0x4c/0x90 [ 49.474729] [] entry_SYSCALL_64_fastpath+0x16/0x73 [ 49.474732] ---[ end trace 0e0be3552b84977c ]--- Thanks. -- Regards/Gruss, Boris. ECO tip #101: Trim your mails when you reply. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH] KVM: x86: fix bogus warning about reserved bits
29ecd6601904 ("KVM: x86: avoid uninitialized variable warning", 2015-09-06) introduced a not-so-subtle problem, which probably escaped review because it was not part of the patch context. Before the patch, leaf was always equal to iterator.level. After, it is equal to iterator.level - 1 in the call to is_shadow_zero_bits_set, and when is_shadow_zero_bits_set does another "-1" the check on reserved bits becomes incorrect. Using "iterator.level" in the call fixes this call trace: WARNING: CPU: 2 PID: 17000 at arch/x86/kvm/mmu.c:3385 handle_mmio_page_fault.part.93+0x1a/0x20 [kvm]() Modules linked in: tun sha256_ssse3 sha256_generic drbg binfmt_misc ipv6 vfat fat fuse dm_crypt dm_mod kvm_amd kvm crc32_pclmul aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd fam15h_power amd64_edac_mod k10temp edac_core amdkfd amd_iommu_v2 radeon acpi_cpufreq [...] Call Trace: dump_stack+0x4e/0x84 warn_slowpath_common+0x95/0xe0 warn_slowpath_null+0x1a/0x20 handle_mmio_page_fault.part.93+0x1a/0x20 [kvm] tdp_page_fault+0x231/0x290 [kvm] ? emulator_pio_in_out+0x6e/0xf0 [kvm] kvm_mmu_page_fault+0x36/0x240 [kvm] ? svm_set_cr0+0x95/0xc0 [kvm_amd] pf_interception+0xde/0x1d0 [kvm_amd] handle_exit+0x181/0xa70 [kvm_amd] ? kvm_arch_vcpu_ioctl_run+0x68b/0x1730 [kvm] kvm_arch_vcpu_ioctl_run+0x6f6/0x1730 [kvm] ? kvm_arch_vcpu_ioctl_run+0x68b/0x1730 [kvm] ? preempt_count_sub+0x9b/0xf0 ? mutex_lock_killable_nested+0x26f/0x490 ? preempt_count_sub+0x9b/0xf0 kvm_vcpu_ioctl+0x358/0x710 [kvm] ? __fget+0x5/0x210 ? __fget+0x101/0x210 do_vfs_ioctl+0x2f4/0x560 ? __fget_light+0x29/0x90 SyS_ioctl+0x4c/0x90 entry_SYSCALL_64_fastpath+0x16/0x73 ---[ end trace 37901c8686d84de6 ]--- Reported-by: Borislav Petkov Signed-off-by: Paolo Bonzini --- arch/x86/kvm/mmu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index 69088a1ba509..3ce2b74c75dc 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -3322,7 +3322,7 @@ walk_shadow_page_get_mmio_spte(struct kvm_vcpu *vcpu, u64 addr, u64 *sptep) break; reserved |= is_shadow_zero_bits_set(>arch.mmu, spte, - leaf); + iterator.level); } walk_shadow_page_lockless_end(vcpu); -- 1.8.3.1 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] KVM: x86: fix bogus warning about reserved bits
On Tue, Sep 22, 2015 at 10:25:29AM +0200, Paolo Bonzini wrote: > 29ecd6601904 ("KVM: x86: avoid uninitialized variable warning", > 2015-09-06) introduced a not-so-subtle problem, which probably > escaped review because it was not part of the patch context. ... > diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c > index 69088a1ba509..3ce2b74c75dc 100644 > --- a/arch/x86/kvm/mmu.c > +++ b/arch/x86/kvm/mmu.c > @@ -3322,7 +3322,7 @@ walk_shadow_page_get_mmio_spte(struct kvm_vcpu *vcpu, > u64 addr, u64 *sptep) > break; > > reserved |= is_shadow_zero_bits_set(>arch.mmu, spte, > - leaf); > + iterator.level); > } > > walk_shadow_page_lockless_end(vcpu); > -- No joy, I still see the splat at the end of this mail when starting a kvm guest. Btw, this is what I have ontop of rc2+tip: --- diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index 69088a1ba509..3ce2b74c75dc 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -3322,7 +3322,7 @@ walk_shadow_page_get_mmio_spte(struct kvm_vcpu *vcpu, u64 addr, u64 *sptep) break; reserved |= is_shadow_zero_bits_set(>arch.mmu, spte, - leaf); + iterator.level); } walk_shadow_page_lockless_end(vcpu); diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index c0b9ff3e1aec..a44f8fed9be1 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -7063,13 +7063,16 @@ struct kvm_vcpu *kvm_arch_vcpu_create(struct kvm *kvm, unsigned int id) { struct kvm_vcpu *vcpu; + int idx; if (check_tsc_unstable() && atomic_read(>online_vcpus) != 0) printk_once(KERN_WARNING "kvm: SMP vm created on host with unstable TSC; " "guest TSC will not be reliable\n"); + idx = srcu_read_lock(>srcu); vcpu = kvm_x86_ops->vcpu_create(kvm, id); + srcu_read_unlock(>srcu, idx); return vcpu; } --- --- [ 49.456533] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, addr 0xb8000, dump hierarchy: [ 49.465945] -- spte 0x416ed9027 level 4. [ 49.470221] -- spte 0x416888027 level 3. [ 49.474494] -- spte 0x41694f027 level 2. [ 49.474495] -- spte 0x000b8f67 level 1. [ 49.474496] [ cut here ] [ 49.474515] WARNING: CPU: 4 PID: 3540 at arch/x86/kvm/mmu.c:3385 handle_mmio_page_fault.part.57+0x1a/0x20 [kvm]() [ 49.474555] Modules linked in: tun sha256_ssse3 sha256_generic drbg binfmt_misc ipv6 vfat fat fuse dm_crypt dm_mod kvm_amd kvm crc32_pclmul aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd amd64_edac_mod fam15h_power k10temp edac_core amdkfd amd_iommu_v2 radeon acpi_cpufreq [ 49.474560] CPU: 4 PID: 3540 Comm: qemu-system-x86 Not tainted 4.3.0-rc2+ #2 [ 49.474562] Hardware name: To be filled by O.E.M. To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013 [ 49.474569] a032f8b2 880416a73b78 812c758a [ 49.474574] 880416a73bb0 810534c1 8804171b 000f [ 49.474578] 000b8000 880416a73bc0 [ 49.474579] Call Trace: [ 49.474586] [] dump_stack+0x4e/0x84 [ 49.474589] [] warn_slowpath_common+0x91/0xd0 [ 49.474592] [] warn_slowpath_null+0x1a/0x20 [ 49.474603] [] handle_mmio_page_fault.part.57+0x1a/0x20 [kvm] [ 49.474615] [] tdp_page_fault+0x2a0/0x2b0 [kvm] [ 49.474620] [] ? __lock_acquire+0x57d/0x17a0 [ 49.474633] [] kvm_mmu_page_fault+0x35/0x240 [kvm] [ 49.474637] [] pf_interception+0x108/0x1d0 [kvm_amd] [ 49.474642] [] handle_exit+0x150/0xa40 [kvm_amd] [ 49.474662] [] ? kvm_arch_vcpu_ioctl_run+0x4c8/0x16f0 [kvm] [ 49.474674] [] kvm_arch_vcpu_ioctl_run+0x533/0x16f0 [kvm] [ 49.474686] [] ? kvm_arch_vcpu_ioctl_run+0x4c8/0x16f0 [kvm] [ 49.474690] [] ? mutex_lock_killable_nested+0x312/0x480 [ 49.474700] [] ? kvm_vcpu_ioctl+0x79/0x6f0 [kvm] [ 49.474705] [] ? preempt_count_sub+0xb3/0x110 [ 49.474715] [] kvm_vcpu_ioctl+0x33f/0x6f0 [kvm] [ 49.474719] [] do_vfs_ioctl+0x2d7/0x530 [ 49.474722] [] ? __fget_light+0x29/0x90 [ 49.474724] [] SyS_ioctl+0x4c/0x90 [ 49.474729] [] entry_SYSCALL_64_fastpath+0x16/0x73 [ 49.474732] ---[ end trace 0e0be3552b84977c ]--- Thanks. -- Regards/Gruss, Boris. ECO tip #101: Trim your mails when you reply. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] KVM: x86: fix bogus warning about reserved bits
On 22/09/2015 19:56, Borislav Petkov wrote: > diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c > index 69088a1ba509..3ce2b74c75dc 100644 > --- a/arch/x86/kvm/mmu.c > +++ b/arch/x86/kvm/mmu.c > @@ -3322,7 +3322,7 @@ walk_shadow_page_get_mmio_spte(struct kvm_vcpu *vcpu, > u64 addr, u64 *sptep) > break; > > reserved |= is_shadow_zero_bits_set(>arch.mmu, spte, > - leaf); > + iterator.level); > } > > walk_shadow_page_lockless_end(vcpu); > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index c0b9ff3e1aec..a44f8fed9be1 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -7063,13 +7063,16 @@ struct kvm_vcpu *kvm_arch_vcpu_create(struct kvm *kvm, > unsigned int id) > { > struct kvm_vcpu *vcpu; > + int idx; > > if (check_tsc_unstable() && atomic_read(>online_vcpus) != 0) > printk_once(KERN_WARNING > "kvm: SMP vm created on host with unstable TSC; " > "guest TSC will not be reliable\n"); > > + idx = srcu_read_lock(>srcu); > vcpu = kvm_x86_ops->vcpu_create(kvm, id); > + srcu_read_unlock(>srcu, idx); > > return vcpu; > } Yup, looks good. Let's add more debugging output: diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index 3ce2b74c75dc..bf1122e9c7bf 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -3268,23 +3268,28 @@ static gpa_t nonpaging_gva_to_gpa_nested(struct kvm_vcpu *vcpu, gva_t vaddr, return vcpu->arch.nested_mmu.translate_gpa(vcpu, vaddr, access, exception); } -static bool -__is_rsvd_bits_set(struct rsvd_bits_validate *rsvd_check, u64 pte, int level) +static u64 +rsvd_bits_set(struct rsvd_bits_validate *rsvd_check, u64 pte, int level) { int bit7 = (pte >> 7) & 1, low6 = pte & 0x3f; + u64 mask = rsvd_check->rsvd_bits_mask[bit7][level-1]; + + if (unlikely(pte & mask)) + return mask; + if (unlikely(rsvd_check->bad_mt_xwr & (1ull << low6))) + return rsvd_check->bad_mt_xwr; - return (pte & rsvd_check->rsvd_bits_mask[bit7][level-1]) | - ((rsvd_check->bad_mt_xwr & (1ull << low6)) != 0); + return 0; } static bool is_rsvd_bits_set(struct kvm_mmu *mmu, u64 gpte, int level) { - return __is_rsvd_bits_set(>guest_rsvd_check, gpte, level); + return rsvd_bits_set(>guest_rsvd_check, gpte, level) != 0; } -static bool is_shadow_zero_bits_set(struct kvm_mmu *mmu, u64 spte, int level) +static u64 shadow_rsvd_bits_set(struct kvm_mmu *mmu, u64 spte, int level) { - return __is_rsvd_bits_set(>shadow_zero_check, spte, level); + return rsvd_bits_set(>shadow_zero_check, spte, level); } static bool quickly_check_mmio_pf(struct kvm_vcpu *vcpu, u64 addr, bool direct) @@ -3302,6 +3307,7 @@ walk_shadow_page_get_mmio_spte(struct kvm_vcpu *vcpu, u64 addr, u64 *sptep) struct kvm_shadow_walk_iterator iterator; u64 sptes[PT64_ROOT_LEVEL], spte = 0ull; int root, leaf; + u64 result; bool reserved = false; if (!VALID_PAGE(vcpu->arch.mmu.root_hpa)) @@ -3321,15 +3327,20 @@ walk_shadow_page_get_mmio_spte(struct kvm_vcpu *vcpu, u64 addr, u64 *sptep) if (!is_shadow_present_pte(spte)) break; - reserved |= is_shadow_zero_bits_set(>arch.mmu, spte, - iterator.level); + result = shadow_rsvd_bits_set(>arch.mmu, spte, + iterator.level); + if (unlikely(result)) { + pr_err("%s: detect reserved bits on spte, addr 0x%llx " + "(level %d, 0x%llx)\n", + __func__, addr, iterator.level, result); + reserved = true; + } } walk_shadow_page_lockless_end(vcpu); if (reserved) { - pr_err("%s: detect reserved bits on spte, addr 0x%llx, dump hierarchy:\n", - __func__, addr); + pr_err("dump hierarchy:\n"); while (root > leaf) { pr_err("-- spte 0x%llx level %d.\n", sptes[root - 1], root); > [ 49.456533] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, > addr 0xb8000, dump hierarchy: > [ 49.465945] -- spte 0x416ed9027 level 4. > [ 49.470221] -- spte 0x416888027 level 3. > [ 49.474494] -- spte 0x41694f027 level 2. > [ 49.474495] -- spte 0x000b8f67 level 1. Thus same as before. Just to be safe, can you try using "-cpu host" on the QEMU command line and see if it changes anything? This would catch things such as an Intel CPUID on an AMD host. Paolo > [ 49.474496] [ cut here
[PATCH] KVM: x86: fix bogus warning about reserved bits
29ecd6601904 ("KVM: x86: avoid uninitialized variable warning", 2015-09-06) introduced a not-so-subtle problem, which probably escaped review because it was not part of the patch context. Before the patch, leaf was always equal to iterator.level. After, it is equal to iterator.level - 1 in the call to is_shadow_zero_bits_set, and when is_shadow_zero_bits_set does another "-1" the check on reserved bits becomes incorrect. Using "iterator.level" in the call fixes this call trace: WARNING: CPU: 2 PID: 17000 at arch/x86/kvm/mmu.c:3385 handle_mmio_page_fault.part.93+0x1a/0x20 [kvm]() Modules linked in: tun sha256_ssse3 sha256_generic drbg binfmt_misc ipv6 vfat fat fuse dm_crypt dm_mod kvm_amd kvm crc32_pclmul aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd fam15h_power amd64_edac_mod k10temp edac_core amdkfd amd_iommu_v2 radeon acpi_cpufreq [...] Call Trace: dump_stack+0x4e/0x84 warn_slowpath_common+0x95/0xe0 warn_slowpath_null+0x1a/0x20 handle_mmio_page_fault.part.93+0x1a/0x20 [kvm] tdp_page_fault+0x231/0x290 [kvm] ? emulator_pio_in_out+0x6e/0xf0 [kvm] kvm_mmu_page_fault+0x36/0x240 [kvm] ? svm_set_cr0+0x95/0xc0 [kvm_amd] pf_interception+0xde/0x1d0 [kvm_amd] handle_exit+0x181/0xa70 [kvm_amd] ? kvm_arch_vcpu_ioctl_run+0x68b/0x1730 [kvm] kvm_arch_vcpu_ioctl_run+0x6f6/0x1730 [kvm] ? kvm_arch_vcpu_ioctl_run+0x68b/0x1730 [kvm] ? preempt_count_sub+0x9b/0xf0 ? mutex_lock_killable_nested+0x26f/0x490 ? preempt_count_sub+0x9b/0xf0 kvm_vcpu_ioctl+0x358/0x710 [kvm] ? __fget+0x5/0x210 ? __fget+0x101/0x210 do_vfs_ioctl+0x2f4/0x560 ? __fget_light+0x29/0x90 SyS_ioctl+0x4c/0x90 entry_SYSCALL_64_fastpath+0x16/0x73 ---[ end trace 37901c8686d84de6 ]--- Reported-by: Borislav PetkovSigned-off-by: Paolo Bonzini --- arch/x86/kvm/mmu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c index 69088a1ba509..3ce2b74c75dc 100644 --- a/arch/x86/kvm/mmu.c +++ b/arch/x86/kvm/mmu.c @@ -3322,7 +3322,7 @@ walk_shadow_page_get_mmio_spte(struct kvm_vcpu *vcpu, u64 addr, u64 *sptep) break; reserved |= is_shadow_zero_bits_set(>arch.mmu, spte, - leaf); + iterator.level); } walk_shadow_page_lockless_end(vcpu); -- 1.8.3.1 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/