Re: [PATCH 4/9] x86/mm/tlb: Flush remote and local TLBs concurrently

2019-06-25 Thread Andy Lutomirski
On Tue, Jun 25, 2019 at 8:48 PM Nadav Amit  wrote:
>
> > On Jun 25, 2019, at 8:36 PM, Andy Lutomirski  wrote:
> >
> > On Wed, Jun 12, 2019 at 11:49 PM Nadav Amit  wrote:
> >> To improve TLB shootdown performance, flush the remote and local TLBs
> >> concurrently. Introduce flush_tlb_multi() that does so. The current
> >> flush_tlb_others() interface is kept, since paravirtual interfaces need
> >> to be adapted before it can be removed; this is left for future work.
> >> In such PV environments, TLB flushes are not yet performed concurrently.
> >
> > Would it be straightforward to have a default PV flush_tlb_multi()
> > that uses flush_tlb_others() under the hood?
>
> I prefer not to have a default PV implementation that should anyhow go away.
>
> I can create unoptimized untested versions for Xen and Hyper-V, if you want.
>

I think I prefer that approach.  We should be able to get the
maintainers to test it.  I don't love having legacy paths in there,
ahem, UV.
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 4/9] x86/mm/tlb: Flush remote and local TLBs concurrently

2019-06-25 Thread Nadav Amit via Virtualization
> On Jun 25, 2019, at 8:36 PM, Andy Lutomirski  wrote:
> 
> On Wed, Jun 12, 2019 at 11:49 PM Nadav Amit  wrote:
>> To improve TLB shootdown performance, flush the remote and local TLBs
>> concurrently. Introduce flush_tlb_multi() that does so. The current
>> flush_tlb_others() interface is kept, since paravirtual interfaces need
>> to be adapted before it can be removed; this is left for future work.
>> In such PV environments, TLB flushes are not yet performed concurrently.
> 
> Would it be straightforward to have a default PV flush_tlb_multi()
> that uses flush_tlb_others() under the hood?

I prefer not to have a default PV implementation that should anyhow go away.

I can create unoptimized untested versions for Xen and Hyper-V, if you want.



Re: [PATCH 4/9] x86/mm/tlb: Flush remote and local TLBs concurrently

2019-06-25 Thread Andy Lutomirski
On Wed, Jun 12, 2019 at 11:49 PM Nadav Amit  wrote:
>
> To improve TLB shootdown performance, flush the remote and local TLBs
> concurrently. Introduce flush_tlb_multi() that does so. The current
> flush_tlb_others() interface is kept, since paravirtual interfaces need
> to be adapted before it can be removed; this is left for future work.
> In such PV environments, TLB flushes are not yet performed concurrently.

Would it be straightforward to have a default PV flush_tlb_multi()
that uses flush_tlb_others() under the hood?


Re: [PATCH 4/9] x86/mm/tlb: Flush remote and local TLBs concurrently

2019-06-25 Thread Nadav Amit via Virtualization
> On Jun 25, 2019, at 8:00 PM, Dave Hansen  wrote:
> 
> On 6/25/19 7:35 PM, Nadav Amit wrote:
>>>> const struct flush_tlb_info *f = info;
>>>> +  enum tlb_flush_reason reason;
>>>> +
>>>> +  reason = (f->mm == NULL) ? TLB_LOCAL_SHOOTDOWN : TLB_LOCAL_MM_SHOOTDOWN;
>>> 
>>> Should we just add the "reason" to flush_tlb_info?  It's OK-ish to imply
>>> it like this, but seems like it would be nicer and easier to track down
>>> the origins of these things if we did this at the caller.
>> 
>> I prefer not to. I want later to inline flush_tlb_info into the same
>> cacheline that holds call_function_data. Increasing the size of
>> flush_tlb_info for no good reason will not help…
> 
> Well, flush_tlb_info is at 6/8ths of a cacheline at the moment.
> call_function_data is 3/8ths.  To me, that means we have some slack in
> the size.

I do not understand your math.. :(

6 + 3 > 8, so putting both flush_tlb_info and call_function_data together
does not leave us any slack (we can save one qword, so we can actually put
them on the same cacheline).

You can see my current implementation here:

https://lore.kernel.org/lkml/20190531063645.4697-4-na...@vmware.com/T/#m0ab5fe0799ba9ff0d41197f1095679fe26aebd57
https://lore.kernel.org/lkml/20190531063645.4697-4-na...@vmware.com/T/#m7b35a93dffd23fbb7ca813c795a0777d4cdcb51b


Re: [PATCH 4/9] x86/mm/tlb: Flush remote and local TLBs concurrently

2019-06-25 Thread Dave Hansen
On 6/25/19 7:35 PM, Nadav Amit wrote:
>>> const struct flush_tlb_info *f = info;
>>> +   enum tlb_flush_reason reason;
>>> +
>>> +   reason = (f->mm == NULL) ? TLB_LOCAL_SHOOTDOWN : TLB_LOCAL_MM_SHOOTDOWN;
>>
>> Should we just add the "reason" to flush_tlb_info?  It's OK-ish to imply
>> it like this, but seems like it would be nicer and easier to track down
>> the origins of these things if we did this at the caller.
> 
> I prefer not to. I want later to inline flush_tlb_info into the same
> cacheline that holds call_function_data. Increasing the size of
> flush_tlb_info for no good reason will not help…

Well, flush_tlb_info is at 6/8ths of a cacheline at the moment.
call_function_data is 3/8ths.  To me, that means we have some slack in
the size.


Re: [PATCH 4/9] x86/mm/tlb: Flush remote and local TLBs concurrently

2019-06-25 Thread Nadav Amit via Virtualization
> On Jun 25, 2019, at 2:29 PM, Dave Hansen  wrote:
> 
> On 6/12/19 11:48 PM, Nadav Amit wrote:
>> To improve TLB shootdown performance, flush the remote and local TLBs
>> concurrently. Introduce flush_tlb_multi() that does so. The current
>> flush_tlb_others() interface is kept, since paravirtual interfaces need
>> to be adapted before it can be removed; this is left for future work.
>> In such PV environments, TLB flushes are not yet performed concurrently.
>> 
>> Add a static key to tell whether this new interface is supported.
>> 
>> Cc: "K. Y. Srinivasan" 
>> Cc: Haiyang Zhang 
>> Cc: Stephen Hemminger 
>> Cc: Sasha Levin 
>> Cc: Thomas Gleixner 
>> Cc: Ingo Molnar 
>> Cc: Borislav Petkov 
>> Cc: x...@kernel.org
>> Cc: Juergen Gross 
>> Cc: Paolo Bonzini 
>> Cc: Dave Hansen 
>> Cc: Andy Lutomirski 
>> Cc: Peter Zijlstra 
>> Cc: Boris Ostrovsky 
>> Cc: linux-hyp...@vger.kernel.org
>> Cc: linux-ker...@vger.kernel.org
>> Cc: virtualization@lists.linux-foundation.org
>> Cc: k...@vger.kernel.org
>> Cc: xen-de...@lists.xenproject.org
>> Signed-off-by: Nadav Amit 
>> ---
>> arch/x86/hyperv/mmu.c |  2 +
>> arch/x86/include/asm/paravirt.h   |  8 +++
>> arch/x86/include/asm/paravirt_types.h |  6 +++
>> arch/x86/include/asm/tlbflush.h   |  6 +++
>> arch/x86/kernel/kvm.c |  1 +
>> arch/x86/kernel/paravirt.c|  3 ++
>> arch/x86/mm/tlb.c | 71 ++-
>> arch/x86/xen/mmu_pv.c |  2 +
>> 8 files changed, 87 insertions(+), 12 deletions(-)
>> 
>> diff --git a/arch/x86/hyperv/mmu.c b/arch/x86/hyperv/mmu.c
>> index e65d7fe6489f..ca28b400c87c 100644
>> --- a/arch/x86/hyperv/mmu.c
>> +++ b/arch/x86/hyperv/mmu.c
>> @@ -233,4 +233,6 @@ void hyperv_setup_mmu_ops(void)
>>  pr_info("Using hypercall for remote TLB flush\n");
>>  pv_ops.mmu.flush_tlb_others = hyperv_flush_tlb_others;
>>  pv_ops.mmu.tlb_remove_table = tlb_remove_table;
>> +
>> +static_key_disable(&flush_tlb_multi_enabled.key);
>> }
>> diff --git a/arch/x86/include/asm/paravirt.h 
>> b/arch/x86/include/asm/paravirt.h
>> index c25c38a05c1c..192be7254457 100644
>> --- a/arch/x86/include/asm/paravirt.h
>> +++ b/arch/x86/include/asm/paravirt.h
>> @@ -47,6 +47,8 @@ static inline void slow_down_io(void)
>> #endif
>> }
>> 
>> +DECLARE_STATIC_KEY_TRUE(flush_tlb_multi_enabled);
>> +
>> static inline void __flush_tlb(void)
>> {
>>  PVOP_VCALL0(mmu.flush_tlb_user);
>> @@ -62,6 +64,12 @@ static inline void __flush_tlb_one_user(unsigned long 
>> addr)
>>  PVOP_VCALL1(mmu.flush_tlb_one_user, addr);
>> }
>> 
>> +static inline void flush_tlb_multi(const struct cpumask *cpumask,
>> +   const struct flush_tlb_info *info)
>> +{
>> +PVOP_VCALL2(mmu.flush_tlb_multi, cpumask, info);
>> +}
>> +
>> static inline void flush_tlb_others(const struct cpumask *cpumask,
>>  const struct flush_tlb_info *info)
>> {
>> diff --git a/arch/x86/include/asm/paravirt_types.h 
>> b/arch/x86/include/asm/paravirt_types.h
>> index 946f8f1f1efc..b93b3d90729a 100644
>> --- a/arch/x86/include/asm/paravirt_types.h
>> +++ b/arch/x86/include/asm/paravirt_types.h
>> @@ -211,6 +211,12 @@ struct pv_mmu_ops {
>>  void (*flush_tlb_user)(void);
>>  void (*flush_tlb_kernel)(void);
>>  void (*flush_tlb_one_user)(unsigned long addr);
>> +/*
>> + * flush_tlb_multi() is the preferred interface, which is capable of
>> + * flushing both local and remote CPUs.
>> + */
>> +void (*flush_tlb_multi)(const struct cpumask *cpus,
>> +const struct flush_tlb_info *info);
>>  void (*flush_tlb_others)(const struct cpumask *cpus,
>>   const struct flush_tlb_info *info);
>> 
>> diff --git a/arch/x86/include/asm/tlbflush.h 
>> b/arch/x86/include/asm/tlbflush.h
>> index dee375831962..79272938cf79 100644
>> --- a/arch/x86/include/asm/tlbflush.h
>> +++ b/arch/x86/include/asm/tlbflush.h
>> @@ -569,6 +569,9 @@ static inline void flush_tlb_page(struct vm_area_struct 
>> *vma, unsigned long a)
>>  flush_tlb_mm_range(vma->vm_mm, a, a + PAGE_SIZE, PAGE_SHIFT, false);
>> }
>> 
>> +void native_flush_tlb_multi(const struct cpumask *cpumask,
>> + const struct flush_tlb_info *info);
>> +
>> void native_flush_tlb_others(const struct cpumask *cpumask,
>>   const struct flush_tlb_info *info);
>> 
>> @@ -593,6 +596,9 @@ static inline void arch_tlbbatch_add_mm(struct 
>> arch_tlbflush_unmap_batch *batch,
>> extern void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch);
>> 
>> #ifndef CONFIG_PARAVIRT
>> +#define flush_tlb_multi(mask, info) \
>> +native_flush_tlb_multi(mask, info)
>> +
>> #define flush_tlb_others(mask, info) \
>>  native_flush_tlb_others(mask, info)
>> 
>> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
>> index 5169b8cc35bb..00d81e898717 100644
>> --- 

Re: [PATCH 4/9] x86/mm/tlb: Flush remote and local TLBs concurrently

2019-06-25 Thread Dave Hansen
On 6/12/19 11:48 PM, Nadav Amit wrote:
> To improve TLB shootdown performance, flush the remote and local TLBs
> concurrently. Introduce flush_tlb_multi() that does so. The current
> flush_tlb_others() interface is kept, since paravirtual interfaces need
> to be adapted before it can be removed; this is left for future work.
> In such PV environments, TLB flushes are not yet performed concurrently.
> 
> Add a static key to tell whether this new interface is supported.
> 
> Cc: "K. Y. Srinivasan" 
> Cc: Haiyang Zhang 
> Cc: Stephen Hemminger 
> Cc: Sasha Levin 
> Cc: Thomas Gleixner 
> Cc: Ingo Molnar 
> Cc: Borislav Petkov 
> Cc: x...@kernel.org
> Cc: Juergen Gross 
> Cc: Paolo Bonzini 
> Cc: Dave Hansen 
> Cc: Andy Lutomirski 
> Cc: Peter Zijlstra 
> Cc: Boris Ostrovsky 
> Cc: linux-hyp...@vger.kernel.org
> Cc: linux-ker...@vger.kernel.org
> Cc: virtualization@lists.linux-foundation.org
> Cc: k...@vger.kernel.org
> Cc: xen-de...@lists.xenproject.org
> Signed-off-by: Nadav Amit 
> ---
>  arch/x86/hyperv/mmu.c |  2 +
>  arch/x86/include/asm/paravirt.h   |  8 +++
>  arch/x86/include/asm/paravirt_types.h |  6 +++
>  arch/x86/include/asm/tlbflush.h   |  6 +++
>  arch/x86/kernel/kvm.c |  1 +
>  arch/x86/kernel/paravirt.c|  3 ++
>  arch/x86/mm/tlb.c | 71 ++-
>  arch/x86/xen/mmu_pv.c |  2 +
>  8 files changed, 87 insertions(+), 12 deletions(-)
> 
> diff --git a/arch/x86/hyperv/mmu.c b/arch/x86/hyperv/mmu.c
> index e65d7fe6489f..ca28b400c87c 100644
> --- a/arch/x86/hyperv/mmu.c
> +++ b/arch/x86/hyperv/mmu.c
> @@ -233,4 +233,6 @@ void hyperv_setup_mmu_ops(void)
>   pr_info("Using hypercall for remote TLB flush\n");
>   pv_ops.mmu.flush_tlb_others = hyperv_flush_tlb_others;
>   pv_ops.mmu.tlb_remove_table = tlb_remove_table;
> +
> +	static_key_disable(&flush_tlb_multi_enabled.key);
>  }
> diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
> index c25c38a05c1c..192be7254457 100644
> --- a/arch/x86/include/asm/paravirt.h
> +++ b/arch/x86/include/asm/paravirt.h
> @@ -47,6 +47,8 @@ static inline void slow_down_io(void)
>  #endif
>  }
>  
> +DECLARE_STATIC_KEY_TRUE(flush_tlb_multi_enabled);
> +
>  static inline void __flush_tlb(void)
>  {
>   PVOP_VCALL0(mmu.flush_tlb_user);
> @@ -62,6 +64,12 @@ static inline void __flush_tlb_one_user(unsigned long addr)
>   PVOP_VCALL1(mmu.flush_tlb_one_user, addr);
>  }
>  
> +static inline void flush_tlb_multi(const struct cpumask *cpumask,
> +const struct flush_tlb_info *info)
> +{
> + PVOP_VCALL2(mmu.flush_tlb_multi, cpumask, info);
> +}
> +
>  static inline void flush_tlb_others(const struct cpumask *cpumask,
>   const struct flush_tlb_info *info)
>  {
> diff --git a/arch/x86/include/asm/paravirt_types.h 
> b/arch/x86/include/asm/paravirt_types.h
> index 946f8f1f1efc..b93b3d90729a 100644
> --- a/arch/x86/include/asm/paravirt_types.h
> +++ b/arch/x86/include/asm/paravirt_types.h
> @@ -211,6 +211,12 @@ struct pv_mmu_ops {
>   void (*flush_tlb_user)(void);
>   void (*flush_tlb_kernel)(void);
>   void (*flush_tlb_one_user)(unsigned long addr);
> + /*
> +  * flush_tlb_multi() is the preferred interface, which is capable of
> +  * flushing both local and remote CPUs.
> +  */
> + void (*flush_tlb_multi)(const struct cpumask *cpus,
> + const struct flush_tlb_info *info);
>   void (*flush_tlb_others)(const struct cpumask *cpus,
>const struct flush_tlb_info *info);
>  
> diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
> index dee375831962..79272938cf79 100644
> --- a/arch/x86/include/asm/tlbflush.h
> +++ b/arch/x86/include/asm/tlbflush.h
> @@ -569,6 +569,9 @@ static inline void flush_tlb_page(struct vm_area_struct 
> *vma, unsigned long a)
>   flush_tlb_mm_range(vma->vm_mm, a, a + PAGE_SIZE, PAGE_SHIFT, false);
>  }
>  
> +void native_flush_tlb_multi(const struct cpumask *cpumask,
> +  const struct flush_tlb_info *info);
> +
>  void native_flush_tlb_others(const struct cpumask *cpumask,
>const struct flush_tlb_info *info);
>  
> @@ -593,6 +596,9 @@ static inline void arch_tlbbatch_add_mm(struct 
> arch_tlbflush_unmap_batch *batch,
>  extern void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch);
>  
>  #ifndef CONFIG_PARAVIRT
> +#define flush_tlb_multi(mask, info)  \
> + native_flush_tlb_multi(mask, info)
> +
>  #define flush_tlb_others(mask, info) \
>   native_flush_tlb_others(mask, info)
>  
> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> index 5169b8cc35bb..00d81e898717 100644
> --- a/arch/x86/kernel/kvm.c
> +++ b/arch/x86/kernel/kvm.c
> @@ -630,6 +630,7 @@ static void __init kvm_guest_init(void)
>   

[PATCH 4/9] x86/mm/tlb: Flush remote and local TLBs concurrently

2019-06-13 Thread Nadav Amit via Virtualization
To improve TLB shootdown performance, flush the remote and local TLBs
concurrently. Introduce flush_tlb_multi() that does so. The current
flush_tlb_others() interface is kept, since paravirtual interfaces need
to be adapted before it can be removed; this is left for future work.
In such PV environments, TLB flushes are not yet performed concurrently.

Add a static key to tell whether this new interface is supported.

Cc: "K. Y. Srinivasan" 
Cc: Haiyang Zhang 
Cc: Stephen Hemminger 
Cc: Sasha Levin 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: x...@kernel.org
Cc: Juergen Gross 
Cc: Paolo Bonzini 
Cc: Dave Hansen 
Cc: Andy Lutomirski 
Cc: Peter Zijlstra 
Cc: Boris Ostrovsky 
Cc: linux-hyp...@vger.kernel.org
Cc: linux-ker...@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org
Cc: k...@vger.kernel.org
Cc: xen-de...@lists.xenproject.org
Signed-off-by: Nadav Amit 
---
 arch/x86/hyperv/mmu.c |  2 +
 arch/x86/include/asm/paravirt.h   |  8 +++
 arch/x86/include/asm/paravirt_types.h |  6 +++
 arch/x86/include/asm/tlbflush.h   |  6 +++
 arch/x86/kernel/kvm.c |  1 +
 arch/x86/kernel/paravirt.c|  3 ++
 arch/x86/mm/tlb.c | 71 ++-
 arch/x86/xen/mmu_pv.c |  2 +
 8 files changed, 87 insertions(+), 12 deletions(-)

diff --git a/arch/x86/hyperv/mmu.c b/arch/x86/hyperv/mmu.c
index e65d7fe6489f..ca28b400c87c 100644
--- a/arch/x86/hyperv/mmu.c
+++ b/arch/x86/hyperv/mmu.c
@@ -233,4 +233,6 @@ void hyperv_setup_mmu_ops(void)
pr_info("Using hypercall for remote TLB flush\n");
pv_ops.mmu.flush_tlb_others = hyperv_flush_tlb_others;
pv_ops.mmu.tlb_remove_table = tlb_remove_table;
+
+	static_key_disable(&flush_tlb_multi_enabled.key);
 }
diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
index c25c38a05c1c..192be7254457 100644
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -47,6 +47,8 @@ static inline void slow_down_io(void)
 #endif
 }
 
+DECLARE_STATIC_KEY_TRUE(flush_tlb_multi_enabled);
+
 static inline void __flush_tlb(void)
 {
PVOP_VCALL0(mmu.flush_tlb_user);
@@ -62,6 +64,12 @@ static inline void __flush_tlb_one_user(unsigned long addr)
PVOP_VCALL1(mmu.flush_tlb_one_user, addr);
 }
 
+static inline void flush_tlb_multi(const struct cpumask *cpumask,
+  const struct flush_tlb_info *info)
+{
+   PVOP_VCALL2(mmu.flush_tlb_multi, cpumask, info);
+}
+
 static inline void flush_tlb_others(const struct cpumask *cpumask,
const struct flush_tlb_info *info)
 {
diff --git a/arch/x86/include/asm/paravirt_types.h 
b/arch/x86/include/asm/paravirt_types.h
index 946f8f1f1efc..b93b3d90729a 100644
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -211,6 +211,12 @@ struct pv_mmu_ops {
void (*flush_tlb_user)(void);
void (*flush_tlb_kernel)(void);
void (*flush_tlb_one_user)(unsigned long addr);
+   /*
+* flush_tlb_multi() is the preferred interface, which is capable of
+* flushing both local and remote CPUs.
+*/
+   void (*flush_tlb_multi)(const struct cpumask *cpus,
+   const struct flush_tlb_info *info);
void (*flush_tlb_others)(const struct cpumask *cpus,
 const struct flush_tlb_info *info);
 
diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index dee375831962..79272938cf79 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -569,6 +569,9 @@ static inline void flush_tlb_page(struct vm_area_struct 
*vma, unsigned long a)
flush_tlb_mm_range(vma->vm_mm, a, a + PAGE_SIZE, PAGE_SHIFT, false);
 }
 
+void native_flush_tlb_multi(const struct cpumask *cpumask,
+const struct flush_tlb_info *info);
+
 void native_flush_tlb_others(const struct cpumask *cpumask,
 const struct flush_tlb_info *info);
 
@@ -593,6 +596,9 @@ static inline void arch_tlbbatch_add_mm(struct 
arch_tlbflush_unmap_batch *batch,
 extern void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch);
 
 #ifndef CONFIG_PARAVIRT
+#define flush_tlb_multi(mask, info)\
+   native_flush_tlb_multi(mask, info)
+
 #define flush_tlb_others(mask, info)   \
native_flush_tlb_others(mask, info)
 
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 5169b8cc35bb..00d81e898717 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -630,6 +630,7 @@ static void __init kvm_guest_init(void)
kvm_para_has_feature(KVM_FEATURE_STEAL_TIME)) {
pv_ops.mmu.flush_tlb_others = kvm_flush_tlb_others;
pv_ops.mmu.tlb_remove_table = tlb_remove_table;
+		static_key_disable(&flush_tlb_multi_enabled.key);
}
 
if