Re: [Devel] [PATCH vz8 1/3] arch/x86: introduce cpuid override

2020-11-03 Thread Kirill Tkhai
On 30.10.2020 14:45, Andrey Ryabinin wrote:
> From: Vladimir Davydov 
> 
> Port diff-arch-x86-introduce-cpuid-override
> 
> Recent Intel CPUs dropped CPUID masking, which is required for flex
> migration, in favor of CPUID faulting. So we need to support it in the
> kernel.
> 
> This patch adds user writable file /proc/vz/cpuid_override, which
> contains CPUID override table. Each table entry must have the following
> format:
> 
>   op[ count]: eax ebx ecx edx
> 
> where @op and optional @count define a CPUID function, whose output one
> would like to override (@op and @count are loaded to EAX and ECX
> registers respectively before calling CPUID); @eax, @ebx, @ecx, @edx -
> the desired CPUID output for the specified function. All values must be
> in hex; the 0x prefix is optional.
> 
> Notes:
> 
>  - the file is only present on hosts that support CPUID faulting;
>  - CPUID faulting is always enabled if it is supported;
>  - CPUID output is overridden on all present CPUs;
>  - the maximal number of entries one can override equals 16;
>  - each write(2) to the file removes all existing entries before adding
>    new ones, so the whole table must be written in one write(2); in
>    particular writing an empty line to the file removes all existing
>    rules.
> 
> Example:
> 
> Suppose we want to mask out SSE2 (CPUID.01H:EDX:26) and RDTSCP
> (CPUID.80000001H:EDX:27). Then we should execute the following sequence:
> 
>  - get the current cpuid value:
> 
>    # cpuid -r | grep -e '^\s*0x00000001' -e '^\s*0x80000001' | head -n 2
>   0x00000001 0x00: eax=0x000306e4 ebx=0x00200800 ecx=0x7fbee3ff edx=0xbfebfbff
>   0x80000001 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000001 edx=0x2c100800
> 
>  - clear the feature bits we want to mask out and write the result to
>    /proc/vz/cpuid_override:
> 
>    # cat >/proc/vz/cpuid_override <<EOF
>    0x00000001: 0x000306e4 0x00200800 0x7fbee3ff 0xbbebfbff
>    0x80000001: 0x00000000 0x00000000 0x00000001 0x24100800
>    EOF
> 
>  - check that cpuid output was overridden:
> 
>    # cpuid -r | grep -e '^\s*0x00000001' -e '^\s*0x80000001' | head -n 2
>   0x00000001 0x00: eax=0x000306e4 ebx=0x00200800 ecx=0x7fbee3ff edx=0xbbebfbff
>   0x80000001 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000001 edx=0x24100800
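
The new EDX values in the example follow from clearing a single bit in each register. A minimal sketch of that arithmetic in C, for illustration only: the helper and macro names below are hypothetical and are not part of the patch.

```c
#include <stdint.h>

/* Bit positions taken from the example above (hypothetical macro names). */
#define SSE2_EDX_BIT	26	/* CPUID.01H:EDX bit 26 */
#define RDTSCP_EDX_BIT	27	/* CPUID.80000001H:EDX bit 27 */

/* Return @reg with feature bit @bit cleared; all other bits are kept. */
static uint32_t clear_feature_bit(uint32_t reg, unsigned int bit)
{
	return reg & ~(UINT32_C(1) << bit);
}
```

With the values from the example, clear_feature_bit(0xbfebfbff, SSE2_EDX_BIT) gives 0xbbebfbff and clear_feature_bit(0x2c100800, RDTSCP_EDX_BIT) gives 0x24100800, which are the EDX values written to the override file.
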
> 
> https://jira.sw.ru/browse/PSBM-28682
> 
> Signed-off-by: Vladimir Davydov 
> 
> Acked-by: Cyrill Gorcunov 
> =
> 
> https://jira.sw.ru/browse/PSBM-33638
> 
> Signed-off-by: Vladimir Davydov 
> Rebase:
> Signed-off-by: Kirill Tkhai 
> 
> https://jira.sw.ru/browse/PSBM-121823
> [aryabinin: vz8 rebase]
> Signed-off-by: Andrey Ryabinin 

For the series:

Reviewed-by: Kirill Tkhai 

> ---
>  arch/x86/include/asm/msr-index.h |   1 +
>  arch/x86/include/asm/traps.h |   2 +
>  arch/x86/kernel/Makefile |   1 +
>  arch/x86/kernel/cpu/proc.c   |   4 +
>  arch/x86/kernel/cpuid_fault.c| 258 +++
>  arch/x86/kernel/traps.c  |  24 +++
>  6 files changed, 290 insertions(+)
>  create mode 100644 arch/x86/kernel/cpuid_fault.c
> 
> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
> index 6a21c227775c..9668ec6a064d 100644
> --- a/arch/x86/include/asm/msr-index.h
> +++ b/arch/x86/include/asm/msr-index.h
> @@ -114,6 +114,7 @@
>  
>  #define MSR_IA32_BBL_CR_CTL		0x00000119
>  #define MSR_IA32_BBL_CR_CTL3		0x0000011e
> +#define MSR_MISC_FEATURES_ENABLES	0x00000140
>  
>  #define MSR_IA32_TSX_CTRL		0x00000122
>  #define TSX_CTRL_RTM_DISABLE BIT(0)  /* Disable RTM feature */
> diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h
> index 0ae298ea01a1..0282c81719e7 100644
> --- a/arch/x86/include/asm/traps.h
> +++ b/arch/x86/include/asm/traps.h
> @@ -124,6 +124,8 @@ void __noreturn handle_stack_overflow(const char *message,
> unsigned long fault_address);
>  #endif
>  
> +void do_cpuid_fault(struct pt_regs *);
> +
>  /* Interrupts/Exceptions */
>  enum {
>   X86_TRAP_DE = 0,/*  0, Divide-by-zero */
> diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
> index 431d8c6e641d..b9451b653b04 100644
> --- a/arch/x86/kernel/Makefile
> +++ b/arch/x86/kernel/Makefile
> @@ -63,6 +63,7 @@ obj-y   += pci-iommu_table.o
>  obj-y			+= resource.o
>  obj-y			+= irqflags.o
>  obj-y			+= spec_ctrl.o
> +obj-y			+= cpuid_fault.o
>  
>  obj-y			+= process.o
>  obj-y			+= fpu/
> diff --git a/arch/x86/kernel/cpu/proc.c b/arch/x86/kernel/cpu/proc.c
> index 2c8522a39ed5..d6b17a60acf6 100644
> --- a/arch/x86/kernel/cpu/proc.c
> +++ b/arch/x86/kernel/cpu/proc.c
> @@ -54,6 +54,10 @@ static void show_cpuinfo_misc(struct seq_file *m, struct cpuinfo_x86 *c)
>  }
>  

[Devel] [PATCH vz8 1/3] arch/x86: introduce cpuid override

2020-10-30 Thread Andrey Ryabinin
From: Vladimir Davydov 

Port diff-arch-x86-introduce-cpuid-override

Recent Intel CPUs dropped CPUID masking, which is required for flex
migration, in favor of CPUID faulting. So we need to support it in the
kernel.

This patch adds user writable file /proc/vz/cpuid_override, which
contains CPUID override table. Each table entry must have the following
format:

  op[ count]: eax ebx ecx edx

where @op and optional @count define a CPUID function, whose output one
would like to override (@op and @count are loaded to EAX and ECX
registers respectively before calling CPUID); @eax, @ebx, @ecx, @edx -
the desired CPUID output for the specified function. All values must be
in hex; the 0x prefix is optional.
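
To make the entry format concrete, here is a minimal userspace sketch of parsing one line with sscanf(3). It is not the parser from this patch: the struct and function names are hypothetical, and it only relies on the fact that the %x conversion accepts hex with or without a 0x prefix.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical record type; the structure used by the patch may differ. */
struct cpuid_override_entry {
	uint32_t op;		/* loaded into EAX before CPUID */
	uint32_t count;		/* loaded into ECX; valid only if has_count */
	bool has_count;
	uint32_t eax, ebx, ecx, edx;	/* desired CPUID output */
};

/*
 * Parse one "op[ count]: eax ebx ecx edx" line.
 * Returns 0 on success and -1 on malformed input.
 */
static int parse_override_line(const char *line, struct cpuid_override_entry *e)
{
	unsigned int op, count = 0, a, b, c, d;
	int end = 0;

	/* First try the form with an explicit count. */
	if (sscanf(line, "%x %x : %x %x %x %x %n",
		   &op, &count, &a, &b, &c, &d, &end) == 6 &&
	    end > 0 && line[end] == '\0') {
		e->has_count = true;
	} else {
		/* Fall back to the form without a count. */
		end = 0;
		if (sscanf(line, "%x : %x %x %x %x %n",
			   &op, &a, &b, &c, &d, &end) != 5 ||
		    end == 0 || line[end] != '\0')
			return -1;
		count = 0;
		e->has_count = false;
	}

	e->op = op;
	e->count = count;
	e->eax = a;
	e->ebx = b;
	e->ecx = c;
	e->edx = d;
	return 0;
}
```

The %n conversion records how much of the line was consumed, so trailing garbage after the fourth register value is rejected rather than silently ignored.
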

Notes:

 - the file is only present on hosts that support CPUID faulting;
 - CPUID faulting is always enabled if it is supported;
 - CPUID output is overridden on all present CPUs;
 - the maximal number of entries one can override equals 16;
 - each write(2) to the file removes all existing entries before adding
   new ones, so the whole table must be written in one write(2); in
   particular writing an empty line to the file removes all existing
   rules.
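
The table semantics in the notes (at most 16 entries, each write replaces the whole table, an empty write clears it) can be sketched as follows. This is an illustration under stated assumptions, not the code in cpuid_fault.c: all names are hypothetical, rejecting an oversized table while keeping the old one is an assumption, and so is treating an entry without @count as matching any ECX value.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_OVERRIDE_ENTRIES 16	/* limit stated in the notes */

/* Hypothetical types; the patch's own structures may differ. */
struct override_entry {
	uint32_t op, count;
	bool has_count;
	uint32_t eax, ebx, ecx, edx;
};

struct override_table {
	size_t size;
	struct override_entry entries[MAX_OVERRIDE_ENTRIES];
};

/*
 * Replace the whole table, mirroring "each write(2) removes all existing
 * entries". n == 0 clears the table. Returns -1 and keeps the old table
 * if n exceeds the limit (an assumption about the error behavior).
 */
static int table_replace(struct override_table *t,
			 const struct override_entry *new_entries, size_t n)
{
	if (n > MAX_OVERRIDE_ENTRIES)
		return -1;
	if (n)
		memcpy(t->entries, new_entries, n * sizeof(*new_entries));
	t->size = n;
	return 0;
}

/* Find the override for (op, count), or NULL if CPUID is not overridden. */
static const struct override_entry *
table_lookup(const struct override_table *t, uint32_t op, uint32_t count)
{
	for (size_t i = 0; i < t->size; i++) {
		const struct override_entry *e = &t->entries[i];

		/* Assumption: an entry without a count matches any ECX. */
		if (e->op == op && (!e->has_count || e->count == count))
			return e;
	}
	return NULL;
}
```

A fault handler built on such a table would call table_lookup() with the faulting task's EAX and ECX and fall back to the real CPUID output when it returns NULL.
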

Example:

Suppose we want to mask out SSE2 (CPUID.01H:EDX:26) and RDTSCP
(CPUID.80000001H:EDX:27). Then we should execute the following sequence:

 - get the current cpuid value:

   # cpuid -r | grep -e '^\s*0x00000001' -e '^\s*0x80000001' | head -n 2
  0x00000001 0x00: eax=0x000306e4 ebx=0x00200800 ecx=0x7fbee3ff edx=0xbfebfbff
  0x80000001 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000001 edx=0x2c100800

 - clear the feature bits we want to mask out and write the result to
   /proc/vz/cpuid_override:

   # cat >/proc/vz/cpuid_override