Re: [PATCH] KVM: x86: fix bogus warning about reserved bits

2015-09-27 Thread Xiao Guangrong



On 09/25/2015 03:49 PM, Paolo Bonzini wrote:



On 24/09/2015 12:12, Borislav Petkov wrote:

On Thu, Sep 24, 2015 at 11:23:08AM +0800, Xiao Guangrong wrote:

+static inline bool
+boot_cpu_is_amd(void)
+{
+   WARN_ON_ONCE(!tdp_enabled);
+   return shadow_x_mask != 0;


shadow_x_mask != 0 is Intel's CPU.

Borislav, could you please check shadow_x_mask == 0 instead and test it again?


That did the trick:

[   62.640392] kvm: zapping shadow pages for mmio generation wraparound
[   63.100301] cpuid(0).ebx = 68747541
[   63.193426] kvm [3748]: vcpu0 unhandled rdmsr: 0xc0010112
[   64.538294] kvm [3748]: vcpu0 unhandled rdmsr: 0xc0011021
[   64.866263] kvm [3748]: vcpu1 unhandled rdmsr: 0xc0011021
[   64.971972] kvm [3748]: vcpu2 unhandled rdmsr: 0xc0011021
[   65.070376] kvm [3748]: vcpu3 unhandled rdmsr: 0xc0011021
[   65.170625] kvm [3748]: vcpu4 unhandled rdmsr: 0xc0011021
[   65.272838] kvm [3748]: vcpu5 unhandled rdmsr: 0xc0011021
[   65.374288] kvm [3748]: vcpu6 unhandled rdmsr: 0xc0011021
[   65.474825] kvm [3748]: vcpu7 unhandled rdmsr: 0xc0011021

That's all I got in dmesg from booting the guest - no more mmio PF
warnings.


Great, though this doesn't yet explain why guest_cpuid_is_amd failed.
I'll look into it next week when Amazon delivers my shiny new AMD
mini-PC :) and send the patch to Linus meanwhile.


I guess QEMU passed AMD cpu mode to KVM... but not sure. :(
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] KVM: x86: fix bogus warning about reserved bits

2015-09-27 Thread Xiao Guangrong



On 09/25/2015 03:49 PM, Paolo Bonzini wrote:



On 24/09/2015 12:12, Borislav Petkov wrote:

On Thu, Sep 24, 2015 at 11:23:08AM +0800, Xiao Guangrong wrote:

+static inline bool
+boot_cpu_is_amd(void)
+{
+   WARN_ON_ONCE(!tdp_enabled);
+   return shadow_x_mask != 0;


shadow_x_mask != 0 is Intel's CPU.

Borislav, could you please check shadow_x_mask == 0 instead and test it again?


That did the trick:

[   62.640392] kvm: zapping shadow pages for mmio generation wraparound
[   63.100301] cpuid(0).ebx = 68747541
[   63.193426] kvm [3748]: vcpu0 unhandled rdmsr: 0xc0010112
[   64.538294] kvm [3748]: vcpu0 unhandled rdmsr: 0xc0011021
[   64.866263] kvm [3748]: vcpu1 unhandled rdmsr: 0xc0011021
[   64.971972] kvm [3748]: vcpu2 unhandled rdmsr: 0xc0011021
[   65.070376] kvm [3748]: vcpu3 unhandled rdmsr: 0xc0011021
[   65.170625] kvm [3748]: vcpu4 unhandled rdmsr: 0xc0011021
[   65.272838] kvm [3748]: vcpu5 unhandled rdmsr: 0xc0011021
[   65.374288] kvm [3748]: vcpu6 unhandled rdmsr: 0xc0011021
[   65.474825] kvm [3748]: vcpu7 unhandled rdmsr: 0xc0011021

That's all I got in dmesg from booting the guest - no more mmio PF
warnings.


Great, though this doesn't yet explain why guest_cpuid_is_amd failed.
I'll look into it next week when Amazon delivers my shiny new AMD
mini-PC :) and send the patch to Linus meanwhile.


I guess QEMU passed AMD cpu mode to KVM... but not sure. :(
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] KVM: x86: fix bogus warning about reserved bits

2015-09-25 Thread Paolo Bonzini


On 24/09/2015 12:12, Borislav Petkov wrote:
> On Thu, Sep 24, 2015 at 11:23:08AM +0800, Xiao Guangrong wrote:
>>> +static inline bool
>>> +boot_cpu_is_amd(void)
>>> +{
>>> +   WARN_ON_ONCE(!tdp_enabled);
>>> +   return shadow_x_mask != 0;
>>
>> shadow_x_mask != 0 is Intel's CPU.
>>
>> Borislav, could you please check shadow_x_mask == 0 instead and test it 
>> again?
> 
> That did the trick:
> 
> [   62.640392] kvm: zapping shadow pages for mmio generation wraparound
> [   63.100301] cpuid(0).ebx = 68747541
> [   63.193426] kvm [3748]: vcpu0 unhandled rdmsr: 0xc0010112
> [   64.538294] kvm [3748]: vcpu0 unhandled rdmsr: 0xc0011021
> [   64.866263] kvm [3748]: vcpu1 unhandled rdmsr: 0xc0011021
> [   64.971972] kvm [3748]: vcpu2 unhandled rdmsr: 0xc0011021
> [   65.070376] kvm [3748]: vcpu3 unhandled rdmsr: 0xc0011021
> [   65.170625] kvm [3748]: vcpu4 unhandled rdmsr: 0xc0011021
> [   65.272838] kvm [3748]: vcpu5 unhandled rdmsr: 0xc0011021
> [   65.374288] kvm [3748]: vcpu6 unhandled rdmsr: 0xc0011021
> [   65.474825] kvm [3748]: vcpu7 unhandled rdmsr: 0xc0011021
> 
> That's all I got in dmesg from booting the guest - no more mmio PF
> warnings.

Great, though this doesn't yet explain why guest_cpuid_is_amd failed.
I'll look into it next week when Amazon delivers my shiny new AMD
mini-PC :) and send the patch to Linus meanwhile.

Paolo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] KVM: x86: fix bogus warning about reserved bits

2015-09-25 Thread Paolo Bonzini


On 24/09/2015 12:12, Borislav Petkov wrote:
> On Thu, Sep 24, 2015 at 11:23:08AM +0800, Xiao Guangrong wrote:
>>> +static inline bool
>>> +boot_cpu_is_amd(void)
>>> +{
>>> +   WARN_ON_ONCE(!tdp_enabled);
>>> +   return shadow_x_mask != 0;
>>
>> shadow_x_mask != 0 is Intel's CPU.
>>
>> Borislav, could you please check shadow_x_mask == 0 instead and test it 
>> again?
> 
> That did the trick:
> 
> [   62.640392] kvm: zapping shadow pages for mmio generation wraparound
> [   63.100301] cpuid(0).ebx = 68747541
> [   63.193426] kvm [3748]: vcpu0 unhandled rdmsr: 0xc0010112
> [   64.538294] kvm [3748]: vcpu0 unhandled rdmsr: 0xc0011021
> [   64.866263] kvm [3748]: vcpu1 unhandled rdmsr: 0xc0011021
> [   64.971972] kvm [3748]: vcpu2 unhandled rdmsr: 0xc0011021
> [   65.070376] kvm [3748]: vcpu3 unhandled rdmsr: 0xc0011021
> [   65.170625] kvm [3748]: vcpu4 unhandled rdmsr: 0xc0011021
> [   65.272838] kvm [3748]: vcpu5 unhandled rdmsr: 0xc0011021
> [   65.374288] kvm [3748]: vcpu6 unhandled rdmsr: 0xc0011021
> [   65.474825] kvm [3748]: vcpu7 unhandled rdmsr: 0xc0011021
> 
> That's all I got in dmesg from booting the guest - no more mmio PF
> warnings.

Great, though this doesn't yet explain why guest_cpuid_is_amd failed.
I'll look into it next week when Amazon delivers my shiny new AMD
mini-PC :) and send the patch to Linus meanwhile.

Paolo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] KVM: x86: fix bogus warning about reserved bits

2015-09-24 Thread Borislav Petkov
On Thu, Sep 24, 2015 at 11:23:08AM +0800, Xiao Guangrong wrote:
> >+static inline bool
> >+boot_cpu_is_amd(void)
> >+{
> >+WARN_ON_ONCE(!tdp_enabled);
> >+return shadow_x_mask != 0;
> 
> shadow_x_mask != 0 is Intel's CPU.
> 
> Borislav, could you please check shadow_x_mask == 0 instead and test it again?

That did the trick:

[   62.640392] kvm: zapping shadow pages for mmio generation wraparound
[   63.100301] cpuid(0).ebx = 68747541
[   63.193426] kvm [3748]: vcpu0 unhandled rdmsr: 0xc0010112
[   64.538294] kvm [3748]: vcpu0 unhandled rdmsr: 0xc0011021
[   64.866263] kvm [3748]: vcpu1 unhandled rdmsr: 0xc0011021
[   64.971972] kvm [3748]: vcpu2 unhandled rdmsr: 0xc0011021
[   65.070376] kvm [3748]: vcpu3 unhandled rdmsr: 0xc0011021
[   65.170625] kvm [3748]: vcpu4 unhandled rdmsr: 0xc0011021
[   65.272838] kvm [3748]: vcpu5 unhandled rdmsr: 0xc0011021
[   65.374288] kvm [3748]: vcpu6 unhandled rdmsr: 0xc0011021
[   65.474825] kvm [3748]: vcpu7 unhandled rdmsr: 0xc0011021

That's all I got in dmesg from booting the guest - no more mmio PF
warnings.

Thanks.

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] KVM: x86: fix bogus warning about reserved bits

2015-09-24 Thread Borislav Petkov
On Thu, Sep 24, 2015 at 11:23:08AM +0800, Xiao Guangrong wrote:
> >+static inline bool
> >+boot_cpu_is_amd(void)
> >+{
> >+WARN_ON_ONCE(!tdp_enabled);
> >+return shadow_x_mask != 0;
> 
> shadow_x_mask != 0 is Intel's CPU.
> 
> Borislav, could you please check shadow_x_mask == 0 instead and test it again?

That did the trick:

[   62.640392] kvm: zapping shadow pages for mmio generation wraparound
[   63.100301] cpuid(0).ebx = 68747541
[   63.193426] kvm [3748]: vcpu0 unhandled rdmsr: 0xc0010112
[   64.538294] kvm [3748]: vcpu0 unhandled rdmsr: 0xc0011021
[   64.866263] kvm [3748]: vcpu1 unhandled rdmsr: 0xc0011021
[   64.971972] kvm [3748]: vcpu2 unhandled rdmsr: 0xc0011021
[   65.070376] kvm [3748]: vcpu3 unhandled rdmsr: 0xc0011021
[   65.170625] kvm [3748]: vcpu4 unhandled rdmsr: 0xc0011021
[   65.272838] kvm [3748]: vcpu5 unhandled rdmsr: 0xc0011021
[   65.374288] kvm [3748]: vcpu6 unhandled rdmsr: 0xc0011021
[   65.474825] kvm [3748]: vcpu7 unhandled rdmsr: 0xc0011021

That's all I got in dmesg from booting the guest - no more mmio PF
warnings.

Thanks.

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] KVM: x86: fix bogus warning about reserved bits

2015-09-23 Thread Xiao Guangrong



On 09/23/2015 05:36 PM, Paolo Bonzini wrote:



On 23/09/2015 09:56, Borislav Petkov wrote:

On Tue, Sep 22, 2015 at 11:04:38PM +0200, Paolo Bonzini wrote:

Let's add more debugging output:


Here you go:

[   50.474002] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, 
addr 0xb8000 (level 4, 0xf00f8)
[   50.484249] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, 
addr 0xb8000 (level 3, 0xf0078)
[   50.494492] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, 
addr 0xb8000 (level 2, 0xf0078)


And another patch, which both cranks up the debugging a bit and
tries another fix:

diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
index dd05b9cef6ae..b2f49bb15ba1 100644
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -105,8 +105,15 @@ static inline bool guest_cpuid_has_x2apic(struct kvm_vcpu 
*vcpu)
  static inline bool guest_cpuid_is_amd(struct kvm_vcpu *vcpu)
  {
struct kvm_cpuid_entry2 *best;
+   static bool first;

best = kvm_find_cpuid_entry(vcpu, 0, 0);
+   if (first && best) {
+   printk("cpuid(0).ebx = %x\n", best->ebx);
+   first = false;
+   } else if (first)
+   printk_ratelimited("cpuid(0) not initialized yet\n");
+
return best && best->ebx == X86EMUL_CPUID_VENDOR_AuthenticAMD_ebx;
  }

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index bf1122e9c7bf..f50b280ffee1 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -3625,7 +3625,7 @@ static void
  __reset_rsvds_bits_mask(struct kvm_vcpu *vcpu,
struct rsvd_bits_validate *rsvd_check,
int maxphyaddr, int level, bool nx, bool gbpages,
-   bool pse)
+   bool pse, bool amd)
  {
u64 exb_bit_rsvd = 0;
u64 gbpages_bit_rsvd = 0;
@@ -3642,7 +3642,7 @@ __reset_rsvds_bits_mask(struct kvm_vcpu *vcpu,
 * Non-leaf PML4Es and PDPEs reserve bit 8 (which would be the G bit for
 * leaf entries) on AMD CPUs only.
 */
-   if (guest_cpuid_is_amd(vcpu))
+   if (amd)
nonleaf_bit8_rsvd = rsvd_bits(8, 8);

switch (level) {
@@ -3710,7 +3710,7 @@ static void reset_rsvds_bits_mask(struct kvm_vcpu *vcpu,
__reset_rsvds_bits_mask(vcpu, >guest_rsvd_check,
cpuid_maxphyaddr(vcpu), context->root_level,
context->nx, guest_cpuid_has_gbpages(vcpu),
-   is_pse(vcpu));
+   is_pse(vcpu), guest_cpuid_is_amd(vcpu));
  }

  static void
@@ -3760,13 +3760,25 @@ static void reset_rsvds_bits_mask_ept(struct kvm_vcpu 
*vcpu,
  void
  reset_shadow_zero_bits_mask(struct kvm_vcpu *vcpu, struct kvm_mmu *context)
  {
+   /*
+* Passing "true" to the last argument is okay; it adds a check
+* on bit 8 of the SPTEs which KVM doesn't use anyway.
+*/
__reset_rsvds_bits_mask(vcpu, >shadow_zero_check,
boot_cpu_data.x86_phys_bits,
context->shadow_root_level, context->nx,
-   guest_cpuid_has_gbpages(vcpu), is_pse(vcpu));
+   guest_cpuid_has_gbpages(vcpu), is_pse(vcpu),
+   true);
  }
  EXPORT_SYMBOL_GPL(reset_shadow_zero_bits_mask);

+static inline bool
+boot_cpu_is_amd(void)
+{
+   WARN_ON_ONCE(!tdp_enabled);
+   return shadow_x_mask != 0;


shadow_x_mask != 0 is Intel's CPU.

Borislav, could you please check shadow_x_mask == 0 instead and test it again?

Further more, use guest_cpuid_is_amd() to detect hardware CPU vendor is wrong
as usespace can fool KVM. Should test host CPUID or introduce intel/amd callback
instead.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] KVM: x86: fix bogus warning about reserved bits

2015-09-23 Thread Paolo Bonzini


On 23/09/2015 13:07, Borislav Petkov wrote:
>> > +  static bool first;
>> >  
>> >best = kvm_find_cpuid_entry(vcpu, 0, 0);
>> > +  if (first && best) {
>> > +  printk("cpuid(0).ebx = %x\n", best->ebx);
>> > +  first = false;
>> > +  } else if (first)
>> > +  printk_ratelimited("cpuid(0) not initialized yet\n");
>> > +
>> >return best && best->ebx == X86EMUL_CPUID_VENDOR_AuthenticAMD_ebx;
> Do I see it correctly that that "first" thing is never true?
> 
> In any case, I changed it to initialize to true but still no output from
> that function.
> 
> [  102.448438] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, 
> addr 0xb8000 (level 4, 0xf00f8)
> [  102.458706] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, 
> addr 0xb8000 (level 3, 0xf0078)
> [  102.468955] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, 
> addr 0xb8000 (level 2, 0xf0078)

Yeah, but I didn't totally need the output so I didn't bother answering
with yet another v2.  I'm a bit stymied and will try to find an AMD
machine inside RH (it's always a pain to install latest-and-greatest
kernel on unknown machines).  It's probably also time to buy one too...

Paolo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] KVM: x86: fix bogus warning about reserved bits

2015-09-23 Thread Borislav Petkov
On Wed, Sep 23, 2015 at 11:36:47AM +0200, Paolo Bonzini wrote:
> And another patch, which both cranks up the debugging a bit and
> tries another fix:
> 
> diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
> index dd05b9cef6ae..b2f49bb15ba1 100644
> --- a/arch/x86/kvm/cpuid.h
> +++ b/arch/x86/kvm/cpuid.h
> @@ -105,8 +105,15 @@ static inline bool guest_cpuid_has_x2apic(struct 
> kvm_vcpu *vcpu)
>  static inline bool guest_cpuid_is_amd(struct kvm_vcpu *vcpu)
>  {
>   struct kvm_cpuid_entry2 *best;
> + static bool first;
>  
>   best = kvm_find_cpuid_entry(vcpu, 0, 0);
> + if (first && best) {
> + printk("cpuid(0).ebx = %x\n", best->ebx);
> + first = false;
> + } else if (first)
> + printk_ratelimited("cpuid(0) not initialized yet\n");
> +
>   return best && best->ebx == X86EMUL_CPUID_VENDOR_AuthenticAMD_ebx;

Do I see it correctly that that "first" thing is never true?

In any case, I changed it to initialize to true but still no output from
that function.

[  102.448438] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, 
addr 0xb8000 (level 4, 0xf00f8)
[  102.458706] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, 
addr 0xb8000 (level 3, 0xf0078)
[  102.468955] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, 
addr 0xb8000 (level 2, 0xf0078)
[  102.479337] dump hierarchy:
[  102.482152] -- spte 0x416edb027 level 4.
[  102.482154] -- spte 0x416eda027 level 3.
[  102.482155] -- spte 0x416ed5027 level 2.
[  102.482157] -- spte 0x000b8f67 level 1.
[  102.482158] [ cut here ]
[  102.482196] WARNING: CPU: 6 PID: 3550 at arch/x86/kvm/mmu.c:3396 
handle_mmio_page_fault.part.57+0x1a/0x20 [kvm]()
[  102.482236] Modules linked in: tun sha256_ssse3 sha256_generic drbg 
binfmt_misc ipv6 vfat fat fuse dm_crypt dm_mod kvm_amd kvm crc32_pclmul 
aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd 
amd64_edac_mod fam15h_power k10temp edac_core amdkfd amd_iommu_v2 radeon 
acpi_cpufreq
[  102.482240] CPU: 6 PID: 3550 Comm: qemu-system-x86 Not tainted 4.3.0-rc2+ #1
[  102.482242] Hardware name: To be filled by O.E.M. To be filled by 
O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013
[  102.482249]  a030c992 880424b8fb78 812c758a 

[  102.482253]  880424b8fbb0 810534c1 8804160e 
000f
[  102.482257]  000b8000   
880424b8fbc0
[  102.482259] Call Trace:
[  102.482268]  [] dump_stack+0x4e/0x84
[  102.482273]  [] warn_slowpath_common+0x91/0xd0
[  102.482276]  [] warn_slowpath_null+0x1a/0x20
[  102.482306]  [] handle_mmio_page_fault.part.57+0x1a/0x20 
[kvm]
[  102.482334]  [] tdp_page_fault+0x2a0/0x2b0 [kvm]
[  102.482340]  [] ? __lock_acquire+0x57d/0x17a0
[  102.482369]  [] kvm_mmu_page_fault+0x35/0x240 [kvm]
[  102.482376]  [] pf_interception+0x108/0x1d0 [kvm_amd]
[  102.482381]  [] handle_exit+0x150/0xa40 [kvm_amd]
[  102.482408]  [] ? kvm_arch_vcpu_ioctl_run+0x4c8/0x16f0 
[kvm]
[  102.482435]  [] kvm_arch_vcpu_ioctl_run+0x533/0x16f0 [kvm]
[  102.482461]  [] ? kvm_arch_vcpu_ioctl_run+0x4c8/0x16f0 
[kvm]
[  102.482466]  [] ? mutex_lock_killable_nested+0x312/0x480
[  102.482485]  [] ? kvm_vcpu_ioctl+0x79/0x6f0 [kvm]
[  102.482490]  [] ? preempt_count_sub+0xb3/0x110
[  102.482509]  [] kvm_vcpu_ioctl+0x33f/0x6f0 [kvm]
[  102.482515]  [] do_vfs_ioctl+0x2d7/0x530
[  102.482519]  [] ? __fget_light+0x29/0x90
[  102.482523]  [] SyS_ioctl+0x4c/0x90
[  102.482527]  [] entry_SYSCALL_64_fastpath+0x16/0x73
[  102.482531] ---[ end trace b8899512fc52cf2e ]---

Thanks.

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] KVM: x86: fix bogus warning about reserved bits

2015-09-23 Thread Paolo Bonzini


On 23/09/2015 09:56, Borislav Petkov wrote:
> On Tue, Sep 22, 2015 at 11:04:38PM +0200, Paolo Bonzini wrote:
>> Let's add more debugging output:
> 
> Here you go:
> 
> [   50.474002] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, 
> addr 0xb8000 (level 4, 0xf00f8)
> [   50.484249] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, 
> addr 0xb8000 (level 3, 0xf0078)
> [   50.494492] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, 
> addr 0xb8000 (level 2, 0xf0078)

And another patch, which both cranks up the debugging a bit and
tries another fix:

diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
index dd05b9cef6ae..b2f49bb15ba1 100644
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -105,8 +105,15 @@ static inline bool guest_cpuid_has_x2apic(struct kvm_vcpu 
*vcpu)
 static inline bool guest_cpuid_is_amd(struct kvm_vcpu *vcpu)
 {
struct kvm_cpuid_entry2 *best;
+   static bool first;
 
best = kvm_find_cpuid_entry(vcpu, 0, 0);
+   if (first && best) {
+   printk("cpuid(0).ebx = %x\n", best->ebx);
+   first = false;
+   } else if (first)
+   printk_ratelimited("cpuid(0) not initialized yet\n");
+
return best && best->ebx == X86EMUL_CPUID_VENDOR_AuthenticAMD_ebx;
 }
 
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index bf1122e9c7bf..f50b280ffee1 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -3625,7 +3625,7 @@ static void
 __reset_rsvds_bits_mask(struct kvm_vcpu *vcpu,
struct rsvd_bits_validate *rsvd_check,
int maxphyaddr, int level, bool nx, bool gbpages,
-   bool pse)
+   bool pse, bool amd)
 {
u64 exb_bit_rsvd = 0;
u64 gbpages_bit_rsvd = 0;
@@ -3642,7 +3642,7 @@ __reset_rsvds_bits_mask(struct kvm_vcpu *vcpu,
 * Non-leaf PML4Es and PDPEs reserve bit 8 (which would be the G bit for
 * leaf entries) on AMD CPUs only.
 */
-   if (guest_cpuid_is_amd(vcpu))
+   if (amd)
nonleaf_bit8_rsvd = rsvd_bits(8, 8);
 
switch (level) {
@@ -3710,7 +3710,7 @@ static void reset_rsvds_bits_mask(struct kvm_vcpu *vcpu,
__reset_rsvds_bits_mask(vcpu, >guest_rsvd_check,
cpuid_maxphyaddr(vcpu), context->root_level,
context->nx, guest_cpuid_has_gbpages(vcpu),
-   is_pse(vcpu));
+   is_pse(vcpu), guest_cpuid_is_amd(vcpu));
 }
 
 static void
@@ -3760,13 +3760,25 @@ static void reset_rsvds_bits_mask_ept(struct kvm_vcpu 
*vcpu,
 void
 reset_shadow_zero_bits_mask(struct kvm_vcpu *vcpu, struct kvm_mmu *context)
 {
+   /*
+* Passing "true" to the last argument is okay; it adds a check
+* on bit 8 of the SPTEs which KVM doesn't use anyway.
+*/
__reset_rsvds_bits_mask(vcpu, >shadow_zero_check,
boot_cpu_data.x86_phys_bits,
context->shadow_root_level, context->nx,
-   guest_cpuid_has_gbpages(vcpu), is_pse(vcpu));
+   guest_cpuid_has_gbpages(vcpu), is_pse(vcpu),
+   true);
 }
 EXPORT_SYMBOL_GPL(reset_shadow_zero_bits_mask);
 
+static inline bool
+boot_cpu_is_amd(void)
+{
+   WARN_ON_ONCE(!tdp_enabled);
+   return shadow_x_mask != 0;
+}
+
 /*
  * the direct page table on host, use as much mmu features as
  * possible, however, kvm currently does not do execution-protection.
@@ -3775,11 +3787,11 @@ static void
 reset_tdp_shadow_zero_bits_mask(struct kvm_vcpu *vcpu,
struct kvm_mmu *context)
 {
-   if (guest_cpuid_is_amd(vcpu))
+   if (boot_cpu_is_amd())
__reset_rsvds_bits_mask(vcpu, >shadow_zero_check,
boot_cpu_data.x86_phys_bits,
context->shadow_root_level, false,
-   cpu_has_gbpages, true);
+   cpu_has_gbpages, true, true);
else
__reset_rsvds_bits_mask_ept(>shadow_zero_check,
boot_cpu_data.x86_phys_bits,



Applies on top of everything else you've got already.

Paolo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] KVM: x86: fix bogus warning about reserved bits

2015-09-23 Thread Paolo Bonzini


On 23/09/2015 09:56, Borislav Petkov wrote:
> [   50.474002] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, 
> addr 0xb8000 (level 4, 0xf00f8)
> [   50.484249] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, 
> addr 0xb8000 (level 3, 0xf0078)
> [   50.494492] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, 
> addr 0xb8000 (level 2, 0xf0078)

It's checking against EPT format, no surprise that it complains...

Paolo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] KVM: x86: fix bogus warning about reserved bits

2015-09-23 Thread Borislav Petkov
On Tue, Sep 22, 2015 at 11:04:38PM +0200, Paolo Bonzini wrote:
> Let's add more debugging output:

Here you go:

[   50.474002] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, 
addr 0xb8000 (level 4, 0xf00f8)
[   50.484249] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, 
addr 0xb8000 (level 3, 0xf0078)
[   50.494492] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, 
addr 0xb8000 (level 2, 0xf0078)
[   50.504767] dump hierarchy:
[   50.507595] -- spte 0x416533027 level 4.
[   50.507595] -- spte 0x416534027 level 3.
[   50.507596] -- spte 0x416535027 level 2.
[   50.507596] -- spte 0x000b8f67 level 1.
[   50.507597] [ cut here ]
[   50.507616] WARNING: CPU: 4 PID: 3539 at arch/x86/kvm/mmu.c:3396 
handle_mmio_page_fault.part.57+0x1a/0x20 [kvm]()
[   50.507630] Modules linked in: tun sha256_ssse3 sha256_generic drbg 
binfmt_misc ipv6 vfat fat fuse dm_crypt dm_mod kvm_amd kvm crc32_pclmul 
aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd 
amd64_edac_mod k10temp edac_core fam15h_power amdkfd amd_iommu_v2 radeon 
acpi_cpufreq
[   50.507632] CPU: 4 PID: 3539 Comm: qemu-system-x86 Not tainted 4.3.0-rc2+ #2
[   50.507633] Hardware name: To be filled by O.E.M. To be filled by 
O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013
[   50.507635]  a0433932 880416973b78 812c758a 

[   50.507637]  880416973bb0 810534c1 8804231c 
000f
[   50.507638]  000b8000   
880416973bc0
[   50.507639] Call Trace:
[   50.507643]  [] dump_stack+0x4e/0x84
[   50.507646]  [] warn_slowpath_common+0x91/0xd0
[   50.507647]  [] warn_slowpath_null+0x1a/0x20
[   50.507657]  [] handle_mmio_page_fault.part.57+0x1a/0x20 
[kvm]
[   50.507667]  [] tdp_page_fault+0x2a0/0x2b0 [kvm]
[   50.507673]  [] ? __lock_acquire+0x57d/0x17a0
[   50.507682]  [] kvm_mmu_page_fault+0x35/0x240 [kvm]
[   50.507685]  [] pf_interception+0x108/0x1d0 [kvm_amd]
[   50.507688]  [] handle_exit+0x150/0xa40 [kvm_amd]
[   50.507697]  [] ? kvm_arch_vcpu_ioctl_run+0x4c8/0x16f0 
[kvm]
[   50.507706]  [] kvm_arch_vcpu_ioctl_run+0x533/0x16f0 [kvm]
[   50.507715]  [] ? kvm_arch_vcpu_ioctl_run+0x4c8/0x16f0 
[kvm]
[   50.507717]  [] ? mutex_lock_killable_nested+0x312/0x480
[   50.507724]  [] ? kvm_vcpu_ioctl+0x79/0x6f0 [kvm]
[   50.507726]  [] ? preempt_count_sub+0xb3/0x110
[   50.507733]  [] kvm_vcpu_ioctl+0x33f/0x6f0 [kvm]
[   50.507735]  [] do_vfs_ioctl+0x2d7/0x530
[   50.507737]  [] ? __fget_light+0x29/0x90
[   50.507738]  [] SyS_ioctl+0x4c/0x90
[   50.507740]  [] entry_SYSCALL_64_fastpath+0x16/0x73
[   50.507741] ---[ end trace ff23795fcc279cbd ]---

> Thus same as before.
> 
> Just to be safe, can you try using "-cpu host" on the QEMU command
> line and see if it changes anything?  This would catch things such
> as an Intel CPUID on an AMD host.

Here's my full qemu command:

qemu-system-x86_64 -enable-kvm -gdb tcp::1234 -cpu host -m 2048 -hda 
/home/boris/kvm/debian/sid-x86_64.img -hdb /home/boris/kvm/swap.img -boot 
menu=off,order=c -localtime -net nic,model=rtl8139 -net 
user,hostfwd=tcp::1235-:22 -usbdevice tablet -kernel 
/home/boris/kernel/linux-2.6/arch/x86/boot/bzImage -append "root=/dev/sda1 
resume=/dev/sdb1 debug ignore_loglevel log_buf_len=16M earlyprintk=ttyS0,115200 
console=ttyS0,115200 console=tty0 " -monitor pty -virtfs 
local,path=/tmp,mount_tag=tmp,security_model=none -serial 
file:/home/boris/kvm/test-x86_64-1235.log -snapshot -name "Debian x86_64:1235" 
-smp 8

and that splats too:

[  146.891735] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, 
addr 0xb8000 (level 4, 0xf00f8)
[  146.901981] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, 
addr 0xb8000 (level 3, 0xf0078)
[  146.912224] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, 
addr 0xb8000 (level 2, 0xf0078)
[  146.922496] dump hierarchy:
[  146.925331] -- spte 0x37d47027 level 4.
[  146.925332] -- spte 0x37d46027 level 3.
[  146.925332] -- spte 0xb9faa027 level 2.
[  146.925333] -- spte 0x000b8f67 level 1.
[  146.925333] [ cut here ]
[  146.925351] WARNING: CPU: 6 PID: 3753 at arch/x86/kvm/mmu.c:3396 
handle_mmio_page_fault.part.57+0x1a/0x20 [kvm]()
[  146.925371] Modules linked in: tun sha256_ssse3 sha256_generic drbg 
binfmt_misc ipv6 vfat fat fuse dm_crypt dm_mod kvm_amd kvm crc32_pclmul 
aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd 
amd64_edac_mod k10temp edac_core fam15h_power amdkfd amd_iommu_v2 radeon 
acpi_cpufreq
[  146.925373] CPU: 6 PID: 3753 Comm: qemu-system-x86 Tainted: GW   
4.3.0-rc2+ #2
[  146.925374] Hardware name: To be filled by O.E.M. To be filled by 
O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013
[  146.925376]  a0433932 880423377b78 812c758a 

[  146.925378]  

Re: [PATCH] KVM: x86: fix bogus warning about reserved bits

2015-09-23 Thread Borislav Petkov
On Wed, Sep 23, 2015 at 11:36:47AM +0200, Paolo Bonzini wrote:
> And another patch, which both cranks up the debugging a bit and
> tries another fix:
> 
> diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
> index dd05b9cef6ae..b2f49bb15ba1 100644
> --- a/arch/x86/kvm/cpuid.h
> +++ b/arch/x86/kvm/cpuid.h
> @@ -105,8 +105,15 @@ static inline bool guest_cpuid_has_x2apic(struct 
> kvm_vcpu *vcpu)
>  static inline bool guest_cpuid_is_amd(struct kvm_vcpu *vcpu)
>  {
>   struct kvm_cpuid_entry2 *best;
> + static bool first;
>  
>   best = kvm_find_cpuid_entry(vcpu, 0, 0);
> + if (first && best) {
> + printk("cpuid(0).ebx = %x\n", best->ebx);
> + first = false;
> + } else if (first)
> + printk_ratelimited("cpuid(0) not initialized yet\n");
> +
>   return best && best->ebx == X86EMUL_CPUID_VENDOR_AuthenticAMD_ebx;

Do I see it correctly that that "first" thing is never true?

In any case, I changed it to initialize to true but still no output from
that function.

[  102.448438] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, 
addr 0xb8000 (level 4, 0xf00f8)
[  102.458706] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, 
addr 0xb8000 (level 3, 0xf0078)
[  102.468955] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, 
addr 0xb8000 (level 2, 0xf0078)
[  102.479337] dump hierarchy:
[  102.482152] -- spte 0x416edb027 level 4.
[  102.482154] -- spte 0x416eda027 level 3.
[  102.482155] -- spte 0x416ed5027 level 2.
[  102.482157] -- spte 0x000b8f67 level 1.
[  102.482158] [ cut here ]
[  102.482196] WARNING: CPU: 6 PID: 3550 at arch/x86/kvm/mmu.c:3396 
handle_mmio_page_fault.part.57+0x1a/0x20 [kvm]()
[  102.482236] Modules linked in: tun sha256_ssse3 sha256_generic drbg 
binfmt_misc ipv6 vfat fat fuse dm_crypt dm_mod kvm_amd kvm crc32_pclmul 
aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd 
amd64_edac_mod fam15h_power k10temp edac_core amdkfd amd_iommu_v2 radeon 
acpi_cpufreq
[  102.482240] CPU: 6 PID: 3550 Comm: qemu-system-x86 Not tainted 4.3.0-rc2+ #1
[  102.482242] Hardware name: To be filled by O.E.M. To be filled by 
O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013
[  102.482249]  a030c992 880424b8fb78 812c758a 

[  102.482253]  880424b8fbb0 810534c1 8804160e 
000f
[  102.482257]  000b8000   
880424b8fbc0
[  102.482259] Call Trace:
[  102.482268]  [] dump_stack+0x4e/0x84
[  102.482273]  [] warn_slowpath_common+0x91/0xd0
[  102.482276]  [] warn_slowpath_null+0x1a/0x20
[  102.482306]  [] handle_mmio_page_fault.part.57+0x1a/0x20 
[kvm]
[  102.482334]  [] tdp_page_fault+0x2a0/0x2b0 [kvm]
[  102.482340]  [] ? __lock_acquire+0x57d/0x17a0
[  102.482369]  [] kvm_mmu_page_fault+0x35/0x240 [kvm]
[  102.482376]  [] pf_interception+0x108/0x1d0 [kvm_amd]
[  102.482381]  [] handle_exit+0x150/0xa40 [kvm_amd]
[  102.482408]  [] ? kvm_arch_vcpu_ioctl_run+0x4c8/0x16f0 
[kvm]
[  102.482435]  [] kvm_arch_vcpu_ioctl_run+0x533/0x16f0 [kvm]
[  102.482461]  [] ? kvm_arch_vcpu_ioctl_run+0x4c8/0x16f0 
[kvm]
[  102.482466]  [] ? mutex_lock_killable_nested+0x312/0x480
[  102.482485]  [] ? kvm_vcpu_ioctl+0x79/0x6f0 [kvm]
[  102.482490]  [] ? preempt_count_sub+0xb3/0x110
[  102.482509]  [] kvm_vcpu_ioctl+0x33f/0x6f0 [kvm]
[  102.482515]  [] do_vfs_ioctl+0x2d7/0x530
[  102.482519]  [] ? __fget_light+0x29/0x90
[  102.482523]  [] SyS_ioctl+0x4c/0x90
[  102.482527]  [] entry_SYSCALL_64_fastpath+0x16/0x73
[  102.482531] ---[ end trace b8899512fc52cf2e ]---

Thanks.

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] KVM: x86: fix bogus warning about reserved bits

2015-09-23 Thread Paolo Bonzini


On 23/09/2015 13:07, Borislav Petkov wrote:
>> > +  static bool first;
>> >  
>> >best = kvm_find_cpuid_entry(vcpu, 0, 0);
>> > +  if (first && best) {
>> > +  printk("cpuid(0).ebx = %x\n", best->ebx);
>> > +  first = false;
>> > +  } else if (first)
>> > +  printk_ratelimited("cpuid(0) not initialized yet\n");
>> > +
>> >return best && best->ebx == X86EMUL_CPUID_VENDOR_AuthenticAMD_ebx;
> Do I see it correctly that that "first" thing is never true?
> 
> In any case, I changed it to initialize to true but still no output from
> that function.
> 
> [  102.448438] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, 
> addr 0xb8000 (level 4, 0xf00f8)
> [  102.458706] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, 
> addr 0xb8000 (level 3, 0xf0078)
> [  102.468955] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, 
> addr 0xb8000 (level 2, 0xf0078)

Yeah, but I didn't totally need the output so I didn't bother answering
with yet another v2.  I'm a bit stymied and will try to find an AMD
machine inside RH (it's always a pain to install latest-and-greatest
kernel on unknown machines).  It's probably also time to buy one too...

Paolo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] KVM: x86: fix bogus warning about reserved bits

2015-09-23 Thread Paolo Bonzini


On 23/09/2015 09:56, Borislav Petkov wrote:
> [   50.474002] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, 
> addr 0xb8000 (level 4, 0xf00f8)
> [   50.484249] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, 
> addr 0xb8000 (level 3, 0xf0078)
> [   50.494492] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, 
> addr 0xb8000 (level 2, 0xf0078)

It's checking against EPT format, no surprise that it complains...

Paolo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] KVM: x86: fix bogus warning about reserved bits

2015-09-23 Thread Borislav Petkov
On Tue, Sep 22, 2015 at 11:04:38PM +0200, Paolo Bonzini wrote:
> Let's add more debugging output:

Here you go:

[   50.474002] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, 
addr 0xb8000 (level 4, 0xf00f8)
[   50.484249] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, 
addr 0xb8000 (level 3, 0xf0078)
[   50.494492] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, 
addr 0xb8000 (level 2, 0xf0078)
[   50.504767] dump hierarchy:
[   50.507595] -- spte 0x416533027 level 4.
[   50.507595] -- spte 0x416534027 level 3.
[   50.507596] -- spte 0x416535027 level 2.
[   50.507596] -- spte 0x000b8f67 level 1.
[   50.507597] [ cut here ]
[   50.507616] WARNING: CPU: 4 PID: 3539 at arch/x86/kvm/mmu.c:3396 
handle_mmio_page_fault.part.57+0x1a/0x20 [kvm]()
[   50.507630] Modules linked in: tun sha256_ssse3 sha256_generic drbg 
binfmt_misc ipv6 vfat fat fuse dm_crypt dm_mod kvm_amd kvm crc32_pclmul 
aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd 
amd64_edac_mod k10temp edac_core fam15h_power amdkfd amd_iommu_v2 radeon 
acpi_cpufreq
[   50.507632] CPU: 4 PID: 3539 Comm: qemu-system-x86 Not tainted 4.3.0-rc2+ #2
[   50.507633] Hardware name: To be filled by O.E.M. To be filled by 
O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013
[   50.507635]  a0433932 880416973b78 812c758a 

[   50.507637]  880416973bb0 810534c1 8804231c 
000f
[   50.507638]  000b8000   
880416973bc0
[   50.507639] Call Trace:
[   50.507643]  [] dump_stack+0x4e/0x84
[   50.507646]  [] warn_slowpath_common+0x91/0xd0
[   50.507647]  [] warn_slowpath_null+0x1a/0x20
[   50.507657]  [] handle_mmio_page_fault.part.57+0x1a/0x20 
[kvm]
[   50.507667]  [] tdp_page_fault+0x2a0/0x2b0 [kvm]
[   50.507673]  [] ? __lock_acquire+0x57d/0x17a0
[   50.507682]  [] kvm_mmu_page_fault+0x35/0x240 [kvm]
[   50.507685]  [] pf_interception+0x108/0x1d0 [kvm_amd]
[   50.507688]  [] handle_exit+0x150/0xa40 [kvm_amd]
[   50.507697]  [] ? kvm_arch_vcpu_ioctl_run+0x4c8/0x16f0 
[kvm]
[   50.507706]  [] kvm_arch_vcpu_ioctl_run+0x533/0x16f0 [kvm]
[   50.507715]  [] ? kvm_arch_vcpu_ioctl_run+0x4c8/0x16f0 
[kvm]
[   50.507717]  [] ? mutex_lock_killable_nested+0x312/0x480
[   50.507724]  [] ? kvm_vcpu_ioctl+0x79/0x6f0 [kvm]
[   50.507726]  [] ? preempt_count_sub+0xb3/0x110
[   50.507733]  [] kvm_vcpu_ioctl+0x33f/0x6f0 [kvm]
[   50.507735]  [] do_vfs_ioctl+0x2d7/0x530
[   50.507737]  [] ? __fget_light+0x29/0x90
[   50.507738]  [] SyS_ioctl+0x4c/0x90
[   50.507740]  [] entry_SYSCALL_64_fastpath+0x16/0x73
[   50.507741] ---[ end trace ff23795fcc279cbd ]---

> Thus same as before.
> 
> Just to be safe, can you try using "-cpu host" on the QEMU command
> line and see if it changes anything?  This would catch things such
> as an Intel CPUID on an AMD host.

Here's my full qemu command:

qemu-system-x86_64 -enable-kvm -gdb tcp::1234 -cpu host -m 2048 -hda 
/home/boris/kvm/debian/sid-x86_64.img -hdb /home/boris/kvm/swap.img -boot 
menu=off,order=c -localtime -net nic,model=rtl8139 -net 
user,hostfwd=tcp::1235-:22 -usbdevice tablet -kernel 
/home/boris/kernel/linux-2.6/arch/x86/boot/bzImage -append "root=/dev/sda1 
resume=/dev/sdb1 debug ignore_loglevel log_buf_len=16M earlyprintk=ttyS0,115200 
console=ttyS0,115200 console=tty0 " -monitor pty -virtfs 
local,path=/tmp,mount_tag=tmp,security_model=none -serial 
file:/home/boris/kvm/test-x86_64-1235.log -snapshot -name "Debian x86_64:1235" 
-smp 8

and that splats too:

[  146.891735] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, 
addr 0xb8000 (level 4, 0xf00f8)
[  146.901981] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, 
addr 0xb8000 (level 3, 0xf0078)
[  146.912224] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, 
addr 0xb8000 (level 2, 0xf0078)
[  146.922496] dump hierarchy:
[  146.925331] -- spte 0x37d47027 level 4.
[  146.925332] -- spte 0x37d46027 level 3.
[  146.925332] -- spte 0xb9faa027 level 2.
[  146.925333] -- spte 0x000b8f67 level 1.
[  146.925333] [ cut here ]
[  146.925351] WARNING: CPU: 6 PID: 3753 at arch/x86/kvm/mmu.c:3396 
handle_mmio_page_fault.part.57+0x1a/0x20 [kvm]()
[  146.925371] Modules linked in: tun sha256_ssse3 sha256_generic drbg 
binfmt_misc ipv6 vfat fat fuse dm_crypt dm_mod kvm_amd kvm crc32_pclmul 
aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd 
amd64_edac_mod k10temp edac_core fam15h_power amdkfd amd_iommu_v2 radeon 
acpi_cpufreq
[  146.925373] CPU: 6 PID: 3753 Comm: qemu-system-x86 Tainted: GW   
4.3.0-rc2+ #2
[  146.925374] Hardware name: To be filled by O.E.M. To be filled by 
O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013
[  146.925376]  a0433932 880423377b78 812c758a 

[  146.925378]  

Re: [PATCH] KVM: x86: fix bogus warning about reserved bits

2015-09-23 Thread Xiao Guangrong



On 09/23/2015 05:36 PM, Paolo Bonzini wrote:



On 23/09/2015 09:56, Borislav Petkov wrote:

On Tue, Sep 22, 2015 at 11:04:38PM +0200, Paolo Bonzini wrote:

Let's add more debugging output:


Here you go:

[   50.474002] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, 
addr 0xb8000 (level 4, 0xf00f8)
[   50.484249] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, 
addr 0xb8000 (level 3, 0xf0078)
[   50.494492] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, 
addr 0xb8000 (level 2, 0xf0078)


And another patch, which both cranks up the debugging a bit and
tries another fix:

diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
index dd05b9cef6ae..b2f49bb15ba1 100644
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -105,8 +105,15 @@ static inline bool guest_cpuid_has_x2apic(struct kvm_vcpu 
*vcpu)
  static inline bool guest_cpuid_is_amd(struct kvm_vcpu *vcpu)
  {
struct kvm_cpuid_entry2 *best;
+   static bool first;

best = kvm_find_cpuid_entry(vcpu, 0, 0);
+   if (first && best) {
+   printk("cpuid(0).ebx = %x\n", best->ebx);
+   first = false;
+   } else if (first)
+   printk_ratelimited("cpuid(0) not initialized yet\n");
+
return best && best->ebx == X86EMUL_CPUID_VENDOR_AuthenticAMD_ebx;
  }

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index bf1122e9c7bf..f50b280ffee1 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -3625,7 +3625,7 @@ static void
  __reset_rsvds_bits_mask(struct kvm_vcpu *vcpu,
struct rsvd_bits_validate *rsvd_check,
int maxphyaddr, int level, bool nx, bool gbpages,
-   bool pse)
+   bool pse, bool amd)
  {
u64 exb_bit_rsvd = 0;
u64 gbpages_bit_rsvd = 0;
@@ -3642,7 +3642,7 @@ __reset_rsvds_bits_mask(struct kvm_vcpu *vcpu,
 * Non-leaf PML4Es and PDPEs reserve bit 8 (which would be the G bit for
 * leaf entries) on AMD CPUs only.
 */
-   if (guest_cpuid_is_amd(vcpu))
+   if (amd)
nonleaf_bit8_rsvd = rsvd_bits(8, 8);

switch (level) {
@@ -3710,7 +3710,7 @@ static void reset_rsvds_bits_mask(struct kvm_vcpu *vcpu,
__reset_rsvds_bits_mask(vcpu, >guest_rsvd_check,
cpuid_maxphyaddr(vcpu), context->root_level,
context->nx, guest_cpuid_has_gbpages(vcpu),
-   is_pse(vcpu));
+   is_pse(vcpu), guest_cpuid_is_amd(vcpu));
  }

  static void
@@ -3760,13 +3760,25 @@ static void reset_rsvds_bits_mask_ept(struct kvm_vcpu 
*vcpu,
  void
  reset_shadow_zero_bits_mask(struct kvm_vcpu *vcpu, struct kvm_mmu *context)
  {
+   /*
+* Passing "true" to the last argument is okay; it adds a check
+* on bit 8 of the SPTEs which KVM doesn't use anyway.
+*/
__reset_rsvds_bits_mask(vcpu, >shadow_zero_check,
boot_cpu_data.x86_phys_bits,
context->shadow_root_level, context->nx,
-   guest_cpuid_has_gbpages(vcpu), is_pse(vcpu));
+   guest_cpuid_has_gbpages(vcpu), is_pse(vcpu),
+   true);
  }
  EXPORT_SYMBOL_GPL(reset_shadow_zero_bits_mask);

+static inline bool
+boot_cpu_is_amd(void)
+{
+   WARN_ON_ONCE(!tdp_enabled);
+   return shadow_x_mask != 0;


shadow_x_mask != 0 is Intel's CPU.

Borislav, could you please check shadow_x_mask == 0 instead and test it again?

Further more, use guest_cpuid_is_amd() to detect hardware CPU vendor is wrong
as usespace can fool KVM. Should test host CPUID or introduce intel/amd callback
instead.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] KVM: x86: fix bogus warning about reserved bits

2015-09-23 Thread Paolo Bonzini


On 23/09/2015 09:56, Borislav Petkov wrote:
> On Tue, Sep 22, 2015 at 11:04:38PM +0200, Paolo Bonzini wrote:
>> Let's add more debugging output:
> 
> Here you go:
> 
> [   50.474002] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, 
> addr 0xb8000 (level 4, 0xf00f8)
> [   50.484249] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, 
> addr 0xb8000 (level 3, 0xf0078)
> [   50.494492] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, 
> addr 0xb8000 (level 2, 0xf0078)

And another patch, which both cranks up the debugging a bit and
tries another fix:

diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
index dd05b9cef6ae..b2f49bb15ba1 100644
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -105,8 +105,15 @@ static inline bool guest_cpuid_has_x2apic(struct kvm_vcpu 
*vcpu)
 static inline bool guest_cpuid_is_amd(struct kvm_vcpu *vcpu)
 {
struct kvm_cpuid_entry2 *best;
+   static bool first;
 
best = kvm_find_cpuid_entry(vcpu, 0, 0);
+   if (first && best) {
+   printk("cpuid(0).ebx = %x\n", best->ebx);
+   first = false;
+   } else if (first)
+   printk_ratelimited("cpuid(0) not initialized yet\n");
+
return best && best->ebx == X86EMUL_CPUID_VENDOR_AuthenticAMD_ebx;
 }
 
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index bf1122e9c7bf..f50b280ffee1 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -3625,7 +3625,7 @@ static void
 __reset_rsvds_bits_mask(struct kvm_vcpu *vcpu,
struct rsvd_bits_validate *rsvd_check,
int maxphyaddr, int level, bool nx, bool gbpages,
-   bool pse)
+   bool pse, bool amd)
 {
u64 exb_bit_rsvd = 0;
u64 gbpages_bit_rsvd = 0;
@@ -3642,7 +3642,7 @@ __reset_rsvds_bits_mask(struct kvm_vcpu *vcpu,
 * Non-leaf PML4Es and PDPEs reserve bit 8 (which would be the G bit for
 * leaf entries) on AMD CPUs only.
 */
-   if (guest_cpuid_is_amd(vcpu))
+   if (amd)
nonleaf_bit8_rsvd = rsvd_bits(8, 8);
 
switch (level) {
@@ -3710,7 +3710,7 @@ static void reset_rsvds_bits_mask(struct kvm_vcpu *vcpu,
__reset_rsvds_bits_mask(vcpu, >guest_rsvd_check,
cpuid_maxphyaddr(vcpu), context->root_level,
context->nx, guest_cpuid_has_gbpages(vcpu),
-   is_pse(vcpu));
+   is_pse(vcpu), guest_cpuid_is_amd(vcpu));
 }
 
 static void
@@ -3760,13 +3760,25 @@ static void reset_rsvds_bits_mask_ept(struct kvm_vcpu 
*vcpu,
 void
 reset_shadow_zero_bits_mask(struct kvm_vcpu *vcpu, struct kvm_mmu *context)
 {
+   /*
+* Passing "true" to the last argument is okay; it adds a check
+* on bit 8 of the SPTEs which KVM doesn't use anyway.
+*/
__reset_rsvds_bits_mask(vcpu, >shadow_zero_check,
boot_cpu_data.x86_phys_bits,
context->shadow_root_level, context->nx,
-   guest_cpuid_has_gbpages(vcpu), is_pse(vcpu));
+   guest_cpuid_has_gbpages(vcpu), is_pse(vcpu),
+   true);
 }
 EXPORT_SYMBOL_GPL(reset_shadow_zero_bits_mask);
 
+static inline bool
+boot_cpu_is_amd(void)
+{
+   WARN_ON_ONCE(!tdp_enabled);
+   return shadow_x_mask != 0;
+}
+
 /*
  * the direct page table on host, use as much mmu features as
  * possible, however, kvm currently does not do execution-protection.
@@ -3775,11 +3787,11 @@ static void
 reset_tdp_shadow_zero_bits_mask(struct kvm_vcpu *vcpu,
struct kvm_mmu *context)
 {
-   if (guest_cpuid_is_amd(vcpu))
+   if (boot_cpu_is_amd())
__reset_rsvds_bits_mask(vcpu, >shadow_zero_check,
boot_cpu_data.x86_phys_bits,
context->shadow_root_level, false,
-   cpu_has_gbpages, true);
+   cpu_has_gbpages, true, true);
else
__reset_rsvds_bits_mask_ept(>shadow_zero_check,
boot_cpu_data.x86_phys_bits,



Applies on top of everything else you've got already.

Paolo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] KVM: x86: fix bogus warning about reserved bits

2015-09-22 Thread Paolo Bonzini


On 22/09/2015 19:56, Borislav Petkov wrote:
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 69088a1ba509..3ce2b74c75dc 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -3322,7 +3322,7 @@ walk_shadow_page_get_mmio_spte(struct kvm_vcpu *vcpu, 
> u64 addr, u64 *sptep)
>   break;
>  
>   reserved |= is_shadow_zero_bits_set(>arch.mmu, spte,
> - leaf);
> + iterator.level);
>   }
>  
>   walk_shadow_page_lockless_end(vcpu);
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index c0b9ff3e1aec..a44f8fed9be1 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -7063,13 +7063,16 @@ struct kvm_vcpu *kvm_arch_vcpu_create(struct kvm *kvm,
>   unsigned int id)
>  {
>   struct kvm_vcpu *vcpu;
> + int idx;
>  
>   if (check_tsc_unstable() && atomic_read(>online_vcpus) != 0)
>   printk_once(KERN_WARNING
>   "kvm: SMP vm created on host with unstable TSC; "
>   "guest TSC will not be reliable\n");
>  
> + idx = srcu_read_lock(>srcu);
>   vcpu = kvm_x86_ops->vcpu_create(kvm, id);
> + srcu_read_unlock(>srcu, idx);
>  
>   return vcpu;
>  }

Yup, looks good.

Let's add more debugging output:

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 3ce2b74c75dc..bf1122e9c7bf 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -3268,23 +3268,28 @@ static gpa_t nonpaging_gva_to_gpa_nested(struct 
kvm_vcpu *vcpu, gva_t vaddr,
return vcpu->arch.nested_mmu.translate_gpa(vcpu, vaddr, access, 
exception);
 }
 
-static bool
-__is_rsvd_bits_set(struct rsvd_bits_validate *rsvd_check, u64 pte, int level)
+static u64
+rsvd_bits_set(struct rsvd_bits_validate *rsvd_check, u64 pte, int level)
 {
int bit7 = (pte >> 7) & 1, low6 = pte & 0x3f;
+   u64 mask = rsvd_check->rsvd_bits_mask[bit7][level-1];
+
+   if (unlikely(pte & mask))
+   return mask;
+   if (unlikely(rsvd_check->bad_mt_xwr & (1ull << low6)))
+   return rsvd_check->bad_mt_xwr;
 
-   return (pte & rsvd_check->rsvd_bits_mask[bit7][level-1]) |
-   ((rsvd_check->bad_mt_xwr & (1ull << low6)) != 0);
+   return 0;
 }
 
 static bool is_rsvd_bits_set(struct kvm_mmu *mmu, u64 gpte, int level)
 {
-   return __is_rsvd_bits_set(>guest_rsvd_check, gpte, level);
+   return rsvd_bits_set(>guest_rsvd_check, gpte, level) != 0;
 }
 
-static bool is_shadow_zero_bits_set(struct kvm_mmu *mmu, u64 spte, int level)
+static u64 shadow_rsvd_bits_set(struct kvm_mmu *mmu, u64 spte, int level)
 {
-   return __is_rsvd_bits_set(>shadow_zero_check, spte, level);
+   return rsvd_bits_set(>shadow_zero_check, spte, level);
 }
 
 static bool quickly_check_mmio_pf(struct kvm_vcpu *vcpu, u64 addr, bool direct)
@@ -3302,6 +3307,7 @@ walk_shadow_page_get_mmio_spte(struct kvm_vcpu *vcpu, u64 
addr, u64 *sptep)
struct kvm_shadow_walk_iterator iterator;
u64 sptes[PT64_ROOT_LEVEL], spte = 0ull;
int root, leaf;
+   u64 result;
bool reserved = false;
 
if (!VALID_PAGE(vcpu->arch.mmu.root_hpa))
@@ -3321,15 +3327,20 @@ walk_shadow_page_get_mmio_spte(struct kvm_vcpu *vcpu, 
u64 addr, u64 *sptep)
if (!is_shadow_present_pte(spte))
break;
 
-   reserved |= is_shadow_zero_bits_set(>arch.mmu, spte,
-   iterator.level);
+   result = shadow_rsvd_bits_set(>arch.mmu, spte,
+ iterator.level);
+   if (unlikely(result)) {
+   pr_err("%s: detect reserved bits on spte, addr 0x%llx "
+  "(level %d, 0x%llx)\n",
+  __func__, addr, iterator.level, result);
+   reserved = true;
+   }
}
 
walk_shadow_page_lockless_end(vcpu);
 
if (reserved) {
-   pr_err("%s: detect reserved bits on spte, addr 0x%llx, dump 
hierarchy:\n",
-  __func__, addr);
+   pr_err("dump hierarchy:\n");
while (root > leaf) {
pr_err("-- spte 0x%llx level %d.\n",
   sptes[root - 1], root);

> [   49.456533] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, 
> addr 0xb8000, dump hierarchy:
> [   49.465945] -- spte 0x416ed9027 level 4.
> [   49.470221] -- spte 0x416888027 level 3.
> [   49.474494] -- spte 0x41694f027 level 2.
> [   49.474495] -- spte 0x000b8f67 level 1.

Thus same as before.

Just to be safe, can you try using "-cpu host" on the QEMU command
line and see if it changes anything?  This would catch things such
as an Intel CPUID on an AMD host.

Paolo

> [   49.474496] [ cut here 

Re: [PATCH] KVM: x86: fix bogus warning about reserved bits

2015-09-22 Thread Borislav Petkov
On Tue, Sep 22, 2015 at 10:25:29AM +0200, Paolo Bonzini wrote:
> 29ecd6601904 ("KVM: x86: avoid uninitialized variable warning",
> 2015-09-06) introduced a not-so-subtle problem, which probably
> escaped review because it was not part of the patch context.

...

> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 69088a1ba509..3ce2b74c75dc 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -3322,7 +3322,7 @@ walk_shadow_page_get_mmio_spte(struct kvm_vcpu *vcpu, 
> u64 addr, u64 *sptep)
>   break;
>  
>   reserved |= is_shadow_zero_bits_set(>arch.mmu, spte,
> - leaf);
> + iterator.level);
>   }
>  
>   walk_shadow_page_lockless_end(vcpu);
> -- 

No joy, I still see the splat at the end of this mail when starting a
kvm guest. Btw, this is what I have ontop of rc2+tip:

---
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 69088a1ba509..3ce2b74c75dc 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -3322,7 +3322,7 @@ walk_shadow_page_get_mmio_spte(struct kvm_vcpu *vcpu, u64 
addr, u64 *sptep)
break;
 
reserved |= is_shadow_zero_bits_set(>arch.mmu, spte,
-   leaf);
+   iterator.level);
}
 
walk_shadow_page_lockless_end(vcpu);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c0b9ff3e1aec..a44f8fed9be1 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7063,13 +7063,16 @@ struct kvm_vcpu *kvm_arch_vcpu_create(struct kvm *kvm,
unsigned int id)
 {
struct kvm_vcpu *vcpu;
+   int idx;
 
if (check_tsc_unstable() && atomic_read(>online_vcpus) != 0)
printk_once(KERN_WARNING
"kvm: SMP vm created on host with unstable TSC; "
"guest TSC will not be reliable\n");
 
+   idx = srcu_read_lock(>srcu);
vcpu = kvm_x86_ops->vcpu_create(kvm, id);
+   srcu_read_unlock(>srcu, idx);
 
return vcpu;
 }
---

---
[   49.456533] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, 
addr 0xb8000, dump hierarchy:
[   49.465945] -- spte 0x416ed9027 level 4.
[   49.470221] -- spte 0x416888027 level 3.
[   49.474494] -- spte 0x41694f027 level 2.
[   49.474495] -- spte 0x000b8f67 level 1.
[   49.474496] [ cut here ]
[   49.474515] WARNING: CPU: 4 PID: 3540 at arch/x86/kvm/mmu.c:3385 
handle_mmio_page_fault.part.57+0x1a/0x20 [kvm]()
[   49.474555] Modules linked in: tun sha256_ssse3 sha256_generic drbg 
binfmt_misc ipv6 vfat fat fuse dm_crypt dm_mod kvm_amd kvm crc32_pclmul 
aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd 
amd64_edac_mod fam15h_power k10temp edac_core amdkfd amd_iommu_v2 radeon 
acpi_cpufreq
[   49.474560] CPU: 4 PID: 3540 Comm: qemu-system-x86 Not tainted 4.3.0-rc2+ #2
[   49.474562] Hardware name: To be filled by O.E.M. To be filled by 
O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013
[   49.474569]  a032f8b2 880416a73b78 812c758a 

[   49.474574]  880416a73bb0 810534c1 8804171b 
000f
[   49.474578]  000b8000   
880416a73bc0
[   49.474579] Call Trace:
[   49.474586]  [] dump_stack+0x4e/0x84
[   49.474589]  [] warn_slowpath_common+0x91/0xd0
[   49.474592]  [] warn_slowpath_null+0x1a/0x20
[   49.474603]  [] handle_mmio_page_fault.part.57+0x1a/0x20 
[kvm]
[   49.474615]  [] tdp_page_fault+0x2a0/0x2b0 [kvm]
[   49.474620]  [] ? __lock_acquire+0x57d/0x17a0
[   49.474633]  [] kvm_mmu_page_fault+0x35/0x240 [kvm]
[   49.474637]  [] pf_interception+0x108/0x1d0 [kvm_amd]
[   49.474642]  [] handle_exit+0x150/0xa40 [kvm_amd]
[   49.474662]  [] ? kvm_arch_vcpu_ioctl_run+0x4c8/0x16f0 
[kvm]
[   49.474674]  [] kvm_arch_vcpu_ioctl_run+0x533/0x16f0 [kvm]
[   49.474686]  [] ? kvm_arch_vcpu_ioctl_run+0x4c8/0x16f0 
[kvm]
[   49.474690]  [] ? mutex_lock_killable_nested+0x312/0x480
[   49.474700]  [] ? kvm_vcpu_ioctl+0x79/0x6f0 [kvm]
[   49.474705]  [] ? preempt_count_sub+0xb3/0x110
[   49.474715]  [] kvm_vcpu_ioctl+0x33f/0x6f0 [kvm]
[   49.474719]  [] do_vfs_ioctl+0x2d7/0x530
[   49.474722]  [] ? __fget_light+0x29/0x90
[   49.474724]  [] SyS_ioctl+0x4c/0x90
[   49.474729]  [] entry_SYSCALL_64_fastpath+0x16/0x73
[   49.474732] ---[ end trace 0e0be3552b84977c ]---


Thanks.

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] KVM: x86: fix bogus warning about reserved bits

2015-09-22 Thread Paolo Bonzini
29ecd6601904 ("KVM: x86: avoid uninitialized variable warning",
2015-09-06) introduced a not-so-subtle problem, which probably
escaped review because it was not part of the patch context.

Before the patch, leaf was always equal to iterator.level.  After,
it is equal to iterator.level - 1 in the call to is_shadow_zero_bits_set,
and when is_shadow_zero_bits_set does another "-1" the check on
reserved bits becomes incorrect.  Using "iterator.level" in the call
fixes this call trace:

WARNING: CPU: 2 PID: 17000 at arch/x86/kvm/mmu.c:3385 
handle_mmio_page_fault.part.93+0x1a/0x20 [kvm]()
Modules linked in: tun sha256_ssse3 sha256_generic drbg binfmt_misc ipv6 vfat 
fat fuse dm_crypt dm_mod kvm_amd kvm crc32_pclmul aesni_intel aes_x86_64 lrw 
gf128mul glue_helper ablk_helper cryptd fam15h_power amd64_edac_mod k10temp 
edac_core amdkfd amd_iommu_v2 radeon acpi_cpufreq
[...]
Call Trace:
  dump_stack+0x4e/0x84
  warn_slowpath_common+0x95/0xe0
  warn_slowpath_null+0x1a/0x20
  handle_mmio_page_fault.part.93+0x1a/0x20 [kvm]
  tdp_page_fault+0x231/0x290 [kvm]
  ? emulator_pio_in_out+0x6e/0xf0 [kvm]
  kvm_mmu_page_fault+0x36/0x240 [kvm]
  ? svm_set_cr0+0x95/0xc0 [kvm_amd]
  pf_interception+0xde/0x1d0 [kvm_amd]
  handle_exit+0x181/0xa70 [kvm_amd]
  ? kvm_arch_vcpu_ioctl_run+0x68b/0x1730 [kvm]
  kvm_arch_vcpu_ioctl_run+0x6f6/0x1730 [kvm]
  ? kvm_arch_vcpu_ioctl_run+0x68b/0x1730 [kvm]
  ? preempt_count_sub+0x9b/0xf0
  ? mutex_lock_killable_nested+0x26f/0x490
  ? preempt_count_sub+0x9b/0xf0
  kvm_vcpu_ioctl+0x358/0x710 [kvm]
  ? __fget+0x5/0x210
  ? __fget+0x101/0x210
  do_vfs_ioctl+0x2f4/0x560
  ? __fget_light+0x29/0x90
  SyS_ioctl+0x4c/0x90
  entry_SYSCALL_64_fastpath+0x16/0x73
---[ end trace 37901c8686d84de6 ]---

Reported-by: Borislav Petkov 
Signed-off-by: Paolo Bonzini 
---
 arch/x86/kvm/mmu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 69088a1ba509..3ce2b74c75dc 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -3322,7 +3322,7 @@ walk_shadow_page_get_mmio_spte(struct kvm_vcpu *vcpu, u64 
addr, u64 *sptep)
break;
 
reserved |= is_shadow_zero_bits_set(>arch.mmu, spte,
-   leaf);
+   iterator.level);
}
 
walk_shadow_page_lockless_end(vcpu);
-- 
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] KVM: x86: fix bogus warning about reserved bits

2015-09-22 Thread Borislav Petkov
On Tue, Sep 22, 2015 at 10:25:29AM +0200, Paolo Bonzini wrote:
> 29ecd6601904 ("KVM: x86: avoid uninitialized variable warning",
> 2015-09-06) introduced a not-so-subtle problem, which probably
> escaped review because it was not part of the patch context.

...

> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 69088a1ba509..3ce2b74c75dc 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -3322,7 +3322,7 @@ walk_shadow_page_get_mmio_spte(struct kvm_vcpu *vcpu, 
> u64 addr, u64 *sptep)
>   break;
>  
>   reserved |= is_shadow_zero_bits_set(>arch.mmu, spte,
> - leaf);
> + iterator.level);
>   }
>  
>   walk_shadow_page_lockless_end(vcpu);
> -- 

No joy, I still see the splat at the end of this mail when starting a
kvm guest. Btw, this is what I have ontop of rc2+tip:

---
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 69088a1ba509..3ce2b74c75dc 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -3322,7 +3322,7 @@ walk_shadow_page_get_mmio_spte(struct kvm_vcpu *vcpu, u64 
addr, u64 *sptep)
break;
 
reserved |= is_shadow_zero_bits_set(>arch.mmu, spte,
-   leaf);
+   iterator.level);
}
 
walk_shadow_page_lockless_end(vcpu);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c0b9ff3e1aec..a44f8fed9be1 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7063,13 +7063,16 @@ struct kvm_vcpu *kvm_arch_vcpu_create(struct kvm *kvm,
unsigned int id)
 {
struct kvm_vcpu *vcpu;
+   int idx;
 
if (check_tsc_unstable() && atomic_read(>online_vcpus) != 0)
printk_once(KERN_WARNING
"kvm: SMP vm created on host with unstable TSC; "
"guest TSC will not be reliable\n");
 
+   idx = srcu_read_lock(>srcu);
vcpu = kvm_x86_ops->vcpu_create(kvm, id);
+   srcu_read_unlock(>srcu, idx);
 
return vcpu;
 }
---

---
[   49.456533] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, 
addr 0xb8000, dump hierarchy:
[   49.465945] -- spte 0x416ed9027 level 4.
[   49.470221] -- spte 0x416888027 level 3.
[   49.474494] -- spte 0x41694f027 level 2.
[   49.474495] -- spte 0x000b8f67 level 1.
[   49.474496] [ cut here ]
[   49.474515] WARNING: CPU: 4 PID: 3540 at arch/x86/kvm/mmu.c:3385 
handle_mmio_page_fault.part.57+0x1a/0x20 [kvm]()
[   49.474555] Modules linked in: tun sha256_ssse3 sha256_generic drbg 
binfmt_misc ipv6 vfat fat fuse dm_crypt dm_mod kvm_amd kvm crc32_pclmul 
aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd 
amd64_edac_mod fam15h_power k10temp edac_core amdkfd amd_iommu_v2 radeon 
acpi_cpufreq
[   49.474560] CPU: 4 PID: 3540 Comm: qemu-system-x86 Not tainted 4.3.0-rc2+ #2
[   49.474562] Hardware name: To be filled by O.E.M. To be filled by 
O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013
[   49.474569]  a032f8b2 880416a73b78 812c758a 

[   49.474574]  880416a73bb0 810534c1 8804171b 
000f
[   49.474578]  000b8000   
880416a73bc0
[   49.474579] Call Trace:
[   49.474586]  [] dump_stack+0x4e/0x84
[   49.474589]  [] warn_slowpath_common+0x91/0xd0
[   49.474592]  [] warn_slowpath_null+0x1a/0x20
[   49.474603]  [] handle_mmio_page_fault.part.57+0x1a/0x20 
[kvm]
[   49.474615]  [] tdp_page_fault+0x2a0/0x2b0 [kvm]
[   49.474620]  [] ? __lock_acquire+0x57d/0x17a0
[   49.474633]  [] kvm_mmu_page_fault+0x35/0x240 [kvm]
[   49.474637]  [] pf_interception+0x108/0x1d0 [kvm_amd]
[   49.474642]  [] handle_exit+0x150/0xa40 [kvm_amd]
[   49.474662]  [] ? kvm_arch_vcpu_ioctl_run+0x4c8/0x16f0 
[kvm]
[   49.474674]  [] kvm_arch_vcpu_ioctl_run+0x533/0x16f0 [kvm]
[   49.474686]  [] ? kvm_arch_vcpu_ioctl_run+0x4c8/0x16f0 
[kvm]
[   49.474690]  [] ? mutex_lock_killable_nested+0x312/0x480
[   49.474700]  [] ? kvm_vcpu_ioctl+0x79/0x6f0 [kvm]
[   49.474705]  [] ? preempt_count_sub+0xb3/0x110
[   49.474715]  [] kvm_vcpu_ioctl+0x33f/0x6f0 [kvm]
[   49.474719]  [] do_vfs_ioctl+0x2d7/0x530
[   49.474722]  [] ? __fget_light+0x29/0x90
[   49.474724]  [] SyS_ioctl+0x4c/0x90
[   49.474729]  [] entry_SYSCALL_64_fastpath+0x16/0x73
[   49.474732] ---[ end trace 0e0be3552b84977c ]---


Thanks.

-- 
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] KVM: x86: fix bogus warning about reserved bits

2015-09-22 Thread Paolo Bonzini


On 22/09/2015 19:56, Borislav Petkov wrote:
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 69088a1ba509..3ce2b74c75dc 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -3322,7 +3322,7 @@ walk_shadow_page_get_mmio_spte(struct kvm_vcpu *vcpu, 
> u64 addr, u64 *sptep)
>   break;
>  
>   reserved |= is_shadow_zero_bits_set(>arch.mmu, spte,
> - leaf);
> + iterator.level);
>   }
>  
>   walk_shadow_page_lockless_end(vcpu);
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index c0b9ff3e1aec..a44f8fed9be1 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -7063,13 +7063,16 @@ struct kvm_vcpu *kvm_arch_vcpu_create(struct kvm *kvm,
>   unsigned int id)
>  {
>   struct kvm_vcpu *vcpu;
> + int idx;
>  
>   if (check_tsc_unstable() && atomic_read(>online_vcpus) != 0)
>   printk_once(KERN_WARNING
>   "kvm: SMP vm created on host with unstable TSC; "
>   "guest TSC will not be reliable\n");
>  
> + idx = srcu_read_lock(>srcu);
>   vcpu = kvm_x86_ops->vcpu_create(kvm, id);
> + srcu_read_unlock(>srcu, idx);
>  
>   return vcpu;
>  }

Yup, looks good.

Let's add more debugging output:

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 3ce2b74c75dc..bf1122e9c7bf 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -3268,23 +3268,28 @@ static gpa_t nonpaging_gva_to_gpa_nested(struct 
kvm_vcpu *vcpu, gva_t vaddr,
return vcpu->arch.nested_mmu.translate_gpa(vcpu, vaddr, access, 
exception);
 }
 
-static bool
-__is_rsvd_bits_set(struct rsvd_bits_validate *rsvd_check, u64 pte, int level)
+static u64
+rsvd_bits_set(struct rsvd_bits_validate *rsvd_check, u64 pte, int level)
 {
int bit7 = (pte >> 7) & 1, low6 = pte & 0x3f;
+   u64 mask = rsvd_check->rsvd_bits_mask[bit7][level-1];
+
+   if (unlikely(pte & mask))
+   return mask;
+   if (unlikely(rsvd_check->bad_mt_xwr & (1ull << low6)))
+   return rsvd_check->bad_mt_xwr;
 
-   return (pte & rsvd_check->rsvd_bits_mask[bit7][level-1]) |
-   ((rsvd_check->bad_mt_xwr & (1ull << low6)) != 0);
+   return 0;
 }
 
 static bool is_rsvd_bits_set(struct kvm_mmu *mmu, u64 gpte, int level)
 {
-   return __is_rsvd_bits_set(>guest_rsvd_check, gpte, level);
+   return rsvd_bits_set(>guest_rsvd_check, gpte, level) != 0;
 }
 
-static bool is_shadow_zero_bits_set(struct kvm_mmu *mmu, u64 spte, int level)
+static u64 shadow_rsvd_bits_set(struct kvm_mmu *mmu, u64 spte, int level)
 {
-   return __is_rsvd_bits_set(>shadow_zero_check, spte, level);
+   return rsvd_bits_set(>shadow_zero_check, spte, level);
 }
 
 static bool quickly_check_mmio_pf(struct kvm_vcpu *vcpu, u64 addr, bool direct)
@@ -3302,6 +3307,7 @@ walk_shadow_page_get_mmio_spte(struct kvm_vcpu *vcpu, u64 
addr, u64 *sptep)
struct kvm_shadow_walk_iterator iterator;
u64 sptes[PT64_ROOT_LEVEL], spte = 0ull;
int root, leaf;
+   u64 result;
bool reserved = false;
 
if (!VALID_PAGE(vcpu->arch.mmu.root_hpa))
@@ -3321,15 +3327,20 @@ walk_shadow_page_get_mmio_spte(struct kvm_vcpu *vcpu, 
u64 addr, u64 *sptep)
if (!is_shadow_present_pte(spte))
break;
 
-   reserved |= is_shadow_zero_bits_set(>arch.mmu, spte,
-   iterator.level);
+   result = shadow_rsvd_bits_set(>arch.mmu, spte,
+ iterator.level);
+   if (unlikely(result)) {
+   pr_err("%s: detect reserved bits on spte, addr 0x%llx "
+  "(level %d, 0x%llx)\n",
+  __func__, addr, iterator.level, result);
+   reserved = true;
+   }
}
 
walk_shadow_page_lockless_end(vcpu);
 
if (reserved) {
-   pr_err("%s: detect reserved bits on spte, addr 0x%llx, dump 
hierarchy:\n",
-  __func__, addr);
+   pr_err("dump hierarchy:\n");
while (root > leaf) {
pr_err("-- spte 0x%llx level %d.\n",
   sptes[root - 1], root);

> [   49.456533] walk_shadow_page_get_mmio_spte: detect reserved bits on spte, 
> addr 0xb8000, dump hierarchy:
> [   49.465945] -- spte 0x416ed9027 level 4.
> [   49.470221] -- spte 0x416888027 level 3.
> [   49.474494] -- spte 0x41694f027 level 2.
> [   49.474495] -- spte 0x000b8f67 level 1.

Thus same as before.

Just to be safe, can you try using "-cpu host" on the QEMU command
line and see if it changes anything?  This would catch things such
as an Intel CPUID on an AMD host.

Paolo

> [   49.474496] [ cut here 

[PATCH] KVM: x86: fix bogus warning about reserved bits

2015-09-22 Thread Paolo Bonzini
29ecd6601904 ("KVM: x86: avoid uninitialized variable warning",
2015-09-06) introduced a not-so-subtle problem, which probably
escaped review because it was not part of the patch context.

Before the patch, leaf was always equal to iterator.level.  After,
it is equal to iterator.level - 1 in the call to is_shadow_zero_bits_set,
and when is_shadow_zero_bits_set does another "-1" the check on
reserved bits becomes incorrect.  Using "iterator.level" in the call
fixes this call trace:

WARNING: CPU: 2 PID: 17000 at arch/x86/kvm/mmu.c:3385 
handle_mmio_page_fault.part.93+0x1a/0x20 [kvm]()
Modules linked in: tun sha256_ssse3 sha256_generic drbg binfmt_misc ipv6 vfat 
fat fuse dm_crypt dm_mod kvm_amd kvm crc32_pclmul aesni_intel aes_x86_64 lrw 
gf128mul glue_helper ablk_helper cryptd fam15h_power amd64_edac_mod k10temp 
edac_core amdkfd amd_iommu_v2 radeon acpi_cpufreq
[...]
Call Trace:
  dump_stack+0x4e/0x84
  warn_slowpath_common+0x95/0xe0
  warn_slowpath_null+0x1a/0x20
  handle_mmio_page_fault.part.93+0x1a/0x20 [kvm]
  tdp_page_fault+0x231/0x290 [kvm]
  ? emulator_pio_in_out+0x6e/0xf0 [kvm]
  kvm_mmu_page_fault+0x36/0x240 [kvm]
  ? svm_set_cr0+0x95/0xc0 [kvm_amd]
  pf_interception+0xde/0x1d0 [kvm_amd]
  handle_exit+0x181/0xa70 [kvm_amd]
  ? kvm_arch_vcpu_ioctl_run+0x68b/0x1730 [kvm]
  kvm_arch_vcpu_ioctl_run+0x6f6/0x1730 [kvm]
  ? kvm_arch_vcpu_ioctl_run+0x68b/0x1730 [kvm]
  ? preempt_count_sub+0x9b/0xf0
  ? mutex_lock_killable_nested+0x26f/0x490
  ? preempt_count_sub+0x9b/0xf0
  kvm_vcpu_ioctl+0x358/0x710 [kvm]
  ? __fget+0x5/0x210
  ? __fget+0x101/0x210
  do_vfs_ioctl+0x2f4/0x560
  ? __fget_light+0x29/0x90
  SyS_ioctl+0x4c/0x90
  entry_SYSCALL_64_fastpath+0x16/0x73
---[ end trace 37901c8686d84de6 ]---

Reported-by: Borislav Petkov 
Signed-off-by: Paolo Bonzini 
---
 arch/x86/kvm/mmu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 69088a1ba509..3ce2b74c75dc 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -3322,7 +3322,7 @@ walk_shadow_page_get_mmio_spte(struct kvm_vcpu *vcpu, u64 
addr, u64 *sptep)
break;
 
reserved |= is_shadow_zero_bits_set(>arch.mmu, spte,
-   leaf);
+   iterator.level);
}
 
walk_shadow_page_lockless_end(vcpu);
-- 
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/