Re: [PATCH] sched: pull tasks when CPU is about to run SCHED_IDLE tasks
On Wed, Dec 23, 2020 at 12:30:26PM +0100, Vincent Guittot wrote:
> On Wed, 23 Dec 2020 at 09:32, wrote:
> >
> > From: Chen Xiaoguang
> >
> > Before a CPU switches from running SCHED_NORMAL task to
> > SCHED_IDLE task, trying to pull SCHED_NORMAL tasks from other
>
> Could you explain more in detail why you only care about this use case
> in particular and not the general case?
>

We want to run online tasks under the SCHED_NORMAL policy and offline
tasks under the SCHED_IDLE policy. The online and offline tasks share
the same machine so that the machine is used efficiently.
The online tasks sleep most of the time but must respond quickly once
they wake up. The offline tasks have low priority and should run only
when no online task is runnable.
The online tasks are more important than the offline tasks and are
latency sensitive, so we should make sure the online tasks preempt the
offline tasks as soon as possible whenever an online task is waiting
to run.
So in our situation we want a SCHED_NORMAL task to run whenever there
is one.

Let's assume we have 2 CPUs.
On CPU1 we have 2 SCHED_NORMAL tasks.
On CPU2 we have 1 SCHED_NORMAL task and 2 SCHED_IDLE tasks.

         CPU1                        CPU2
     curr      rq1              curr         rq2
   +------+ | +------+        +------+ | +----+ +----+
t0 |NORMAL| | |NORMAL|        |NORMAL| | |IDLE| |IDLE|
   +------+ | +------+        +------+ | +----+ +----+

                              NORMAL exits or blocked
   +------+ | +------+                 | +----+ +----+
t1 |NORMAL| | |NORMAL|                 | |IDLE| |IDLE|
   +------+ | +------+                 | +----+ +----+

                              pick_next_task_fair
   +------+ | +------+          +----+ | +----+
t2 |NORMAL| | |NORMAL|          |IDLE| | |IDLE|
   +------+ | +------+          +----+ | +----+

                              SCHED_IDLE running
   +------+ | +------+          +----+ | +----+
t3 |NORMAL| | |NORMAL|          |IDLE| | |IDLE|
   +------+ | +------+          +----+ | +----+

                              run_rebalance_domains
   +------+ |                 +------+ | +----+ +----+
t4 |NORMAL| |                 |NORMAL| | |IDLE| |IDLE|
   +------+ |                 +------+ | +----+ +----+

As we can see:
t1: the NORMAL task on CPU2 exits or blocks.
t2: pick_next_task_fair on CPU2 picks a SCHED_IDLE task to run while
    another SCHED_NORMAL task in rq1 is waiting.
t3: the SCHED_IDLE task runs on CPU2 while a SCHED_NORMAL task waits
    on CPU1.
t4: after a short time, the periodic load_balance is triggered and
    pulls the SCHED_NORMAL task from rq1 to rq2, and SCHED_NORMAL
    likely preempts SCHED_IDLE.

In this scenario, SCHED_IDLE is running while SCHED_NORMAL is waiting
to run. The latency of this SCHED_NORMAL task will be high, which is
not acceptable.

Doing a load_balance before running the SCHED_IDLE task may fix this
problem.

This patch works as below:

         CPU1                        CPU2
     curr      rq1              curr         rq2
   +------+ | +------+        +------+ | +----+ +----+
t0 |NORMAL| | |NORMAL|        |NORMAL| | |IDLE| |IDLE|
   +------+ | +------+        +------+ | +----+ +----+

                              NORMAL exits or blocked
   +------+ | +------+                 | +----+ +----+
t1 |NORMAL| | |NORMAL|                 | |IDLE| |IDLE|
   +------+ | +------+                 | +----+ +----+

t2                            pick_next_task_fair (all se are SCHED_IDLE)

                              newidle_balance
   +------+ |                 +------+ | +----+ +----+
t3 |NORMAL| |                 |NORMAL| | |IDLE| |IDLE|
   +------+ |                 +------+ | +----+ +----+

t1: the NORMAL task on CPU2 exits or blocks.
t2: pick_next_task_fair checks that all se in the rbtree are
    SCHED_IDLE and calls newidle_balance, which tries to pull a
    SCHED_NORMAL task (if there is one).
t3: pick_next_task_fair would then (likely) pick a SCHED_NORMAL task
    to run instead of a SCHED_IDLE one.

> > CPU by doing load_balance first.
> >
> > Signed-off-by: Chen Xiaoguang
> > Signed-off-by: Chen He
> > ---
> >  kernel/sched/fair.c | 5 +++++
> >  1 file changed, 5 insertions(+)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index ae7ceba..0a26132 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -7004,6 +7004,11 @@ struct task_struct *
> >         struct task_struct *p;
> >         int new_tasks;
> >
> > +       if (prev &&
> > +           fair_policy(prev->policy) &&
>
> Why do you need a prev and fair task ? You seem to target the special
> case of pick_next_task but in this case why not only testing rf!=null
> to make sure to not return immediately after jumping to the idle
> label?
>

We want to do the load_balance only when the CPU switches from
SCHED_NORMAL to SCHED_IDLE.
If we did not check prev, then when all the running tasks are
SCHED_IDLE we would call newidle_balance every time in
pick_next_task_fair, which makes no sense and is wasteful.

> Also why not doing
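
The ordering argued for in this reply can be modelled in user space. The sketch below is not kernel code: `struct rq`, `pull_normal_from()` and `pick_next()` are hypothetical names made up for illustration, and a run queue is reduced to two counters of waiting tasks. It only shows the intended behaviour, assuming two CPUs: try to pull a SCHED_NORMAL task before falling back to a SCHED_IDLE one, and only when the previous task was SCHED_NORMAL.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy policies; stand-ins for the kernel's SCHED_NORMAL / SCHED_IDLE. */
enum policy { POLICY_NONE, POLICY_NORMAL, POLICY_IDLE };

/* Hypothetical, heavily simplified run queue: counts of waiting tasks. */
struct rq {
	int nr_normal;	/* SCHED_NORMAL tasks waiting on this CPU */
	int nr_idle;	/* SCHED_IDLE tasks waiting on this CPU */
};

/* Move one waiting NORMAL task from src to dst; true if one was moved. */
static bool pull_normal_from(struct rq *dst, struct rq *src)
{
	assert(dst != src);

	if (src->nr_normal == 0)
		return false;
	src->nr_normal--;
	dst->nr_normal++;
	return true;
}

/*
 * Decide what this CPU runs next.  If only IDLE tasks are queued here and
 * the previous task was NORMAL, first try to pull a NORMAL task from the
 * other queue (a toy version of the step the patch adds).
 */
static enum policy pick_next(struct rq *this_rq, struct rq *other,
			     enum policy prev)
{
	if (this_rq->nr_normal == 0 && this_rq->nr_idle > 0 &&
	    prev == POLICY_NORMAL)
		pull_normal_from(this_rq, other);

	if (this_rq->nr_normal > 0) {
		this_rq->nr_normal--;
		return POLICY_NORMAL;
	}
	if (this_rq->nr_idle > 0) {
		this_rq->nr_idle--;
		return POLICY_IDLE;
	}
	return POLICY_NONE;
}
```

With rq1 holding one waiting NORMAL task and rq2 holding two IDLE tasks, `pick_next(&rq2, &rq1, POLICY_NORMAL)` returns `POLICY_NORMAL` and leaves both IDLE tasks queued, which corresponds to t2/t3 in the second diagram. With `prev == POLICY_IDLE` no pull is attempted, which is the cost the `prev` check is meant to avoid.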
[tip:x86/cpufeature] x86/cpuid: Provide get_scattered_cpuid_leaf()
Commit-ID:  47bdf3378d62a627cfb8a54e1180c08d67078b61
Gitweb:     http://git.kernel.org/tip/47bdf3378d62a627cfb8a54e1180c08d67078b61
Author:     He Chen <he.c...@linux.intel.com>
AuthorDate: Fri, 11 Nov 2016 17:25:35 +0800
Committer:  Thomas Gleixner <t...@linutronix.de>
CommitDate: Wed, 16 Nov 2016 11:13:09 +0100

x86/cpuid: Provide get_scattered_cpuid_leaf()

Sparse populated CPUID leafs are collected in a software provided leaf to
avoid bloat of the x86_capability array, but there is no way to rebuild the
real leafs (e.g. for KVM CPUID enumeration) other than rereading the CPUID
leaf from the CPU. While this is possible it is problematic as it does not
take software disabled features into account. If a feature is disabled on
the host it should not be exposed to a guest either.

Add get_scattered_cpuid_leaf() which rebuilds the leaf from the scattered
cpuid table information and the active CPU features.

[ tglx: Rewrote changelog ]

Signed-off-by: He Chen <he.c...@linux.intel.com>
Reviewed-by: Borislav Petkov <b...@suse.de>
Cc: Luwei Kang <luwei.k...@intel.com>
Cc: k...@vger.kernel.org
Cc: Radim Krčmář <rkrc...@redhat.com>
Cc: Piotr Luc <piotr@intel.com>
Cc: Borislav Petkov <b...@alien8.de>
Cc: Paolo Bonzini <pbonz...@redhat.com>
Link: http://lkml.kernel.org/r/1478856336-9388-3-git-send-email-he.c...@linux.intel.com
Signed-off-by: Thomas Gleixner <t...@linutronix.de>

---
 arch/x86/include/asm/processor.h |  3 +++
 arch/x86/kernel/cpu/scattered.c  | 49 ++--
 2 files changed, 40 insertions(+), 12 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 8f6ac5b..e7f8c62 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -189,6 +189,9 @@ extern void identify_secondary_cpu(struct cpuinfo_x86 *);
 extern void print_cpu_info(struct cpuinfo_x86 *);
 void print_cpu_msr(struct cpuinfo_x86 *);
 extern void init_scattered_cpuid_features(struct cpuinfo_x86 *c);
+extern u32 get_scattered_cpuid_leaf(unsigned int level,
+				    unsigned int sub_leaf,
+				    enum cpuid_regs_idx reg);
 extern unsigned int init_intel_cacheinfo(struct cpuinfo_x86 *c);
 extern void init_amd_cacheinfo(struct cpuinfo_x86 *c);

diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
index dbb470e..d1316f9 100644
--- a/arch/x86/kernel/cpu/scattered.c
+++ b/arch/x86/kernel/cpu/scattered.c
@@ -17,24 +17,25 @@ struct cpuid_bit {
 	u32 sub_leaf;
 };

+/* Please keep the leaf sorted by cpuid_bit.level for faster search. */
+static const struct cpuid_bit cpuid_bits[] = {
+	{ X86_FEATURE_APERFMPERF,    CPUID_ECX,  0, 0x00000006, 0 },
+	{ X86_FEATURE_EPB,           CPUID_ECX,  3, 0x00000006, 0 },
+	{ X86_FEATURE_INTEL_PT,      CPUID_EBX, 25, 0x00000007, 0 },
+	{ X86_FEATURE_AVX512_4VNNIW, CPUID_EDX,  2, 0x00000007, 0 },
+	{ X86_FEATURE_AVX512_4FMAPS, CPUID_EDX,  3, 0x00000007, 0 },
+	{ X86_FEATURE_HW_PSTATE,     CPUID_EDX,  7, 0x80000007, 0 },
+	{ X86_FEATURE_CPB,           CPUID_EDX,  9, 0x80000007, 0 },
+	{ X86_FEATURE_PROC_FEEDBACK, CPUID_EDX, 11, 0x80000007, 0 },
+	{ 0, 0, 0, 0, 0 }
+};
+
 void init_scattered_cpuid_features(struct cpuinfo_x86 *c)
 {
 	u32 max_level;
 	u32 regs[4];
 	const struct cpuid_bit *cb;

-	static const struct cpuid_bit cpuid_bits[] = {
-		{ X86_FEATURE_INTEL_PT,      CPUID_EBX, 25, 0x00000007, 0 },
-		{ X86_FEATURE_AVX512_4VNNIW, CPUID_EDX,  2, 0x00000007, 0 },
-		{ X86_FEATURE_AVX512_4FMAPS, CPUID_EDX,  3, 0x00000007, 0 },
-		{ X86_FEATURE_APERFMPERF,    CPUID_ECX,  0, 0x00000006, 0 },
-		{ X86_FEATURE_EPB,           CPUID_ECX,  3, 0x00000006, 0 },
-		{ X86_FEATURE_HW_PSTATE,     CPUID_EDX,  7, 0x80000007, 0 },
-		{ X86_FEATURE_CPB,           CPUID_EDX,  9, 0x80000007, 0 },
-		{ X86_FEATURE_PROC_FEEDBACK, CPUID_EDX, 11, 0x80000007, 0 },
-		{ 0, 0, 0, 0, 0 }
-	};
-
 	for (cb = cpuid_bits; cb->feature; cb++) {

 		/* Verify that the level is valid */
@@ -51,3 +52,27 @@ void init_scattered_cpuid_features(struct cpuinfo_x86 *c)
 			set_cpu_cap(c, cb->feature);
 	}
 }
+
+u32 get_scattered_cpuid_leaf(unsigned int level, unsigned int sub_leaf,
+			     enum cpuid_regs_idx reg)
+{
+	const struct cpuid_bit *cb;
+	u32 cpuid_val = 0;
+
+	for (cb = cpuid_bits; cb->feature; cb++) {
+
+		if (level > cb->level)
+			continue;
+
+		if (level < cb->level)
+			break;
+
+		if (reg == cb->reg && sub_leaf == cb->sub_leaf) {
+			if (cpu_has(&boot_cpu_data, cb->feature))
+				cpuid_val |= BIT(cb->bit);
+		}
+	}
+
+	return cpuid_val;
+}
+EXPORT_SYMBOL_GPL(get_scattered_cpuid_leaf);
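
To make the lookup concrete, here is a small user-space sketch of the same idea, assuming a table that is kept sorted by `level`. The names `scattered_bit`, `feature_enabled()` and `rebuild_leaf()`, the `FEAT_*` IDs and the `enabled_mask` parameter are hypothetical stand-ins, not kernel interfaces; in the commit the "is it enabled" check is `cpu_has()` on `boot_cpu_data`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum reg_idx { REG_EAX = 0, REG_EBX, REG_ECX, REG_EDX };

/* Made-up feature IDs; 0 is reserved as the table terminator. */
enum { FEAT_A = 1, FEAT_B, FEAT_C, FEAT_D };

struct scattered_bit {
	uint16_t feature;	/* software feature ID */
	uint8_t  reg;		/* output register that holds the bit */
	uint8_t  bit;		/* bit position within that register */
	uint32_t level;		/* CPUID leaf */
	uint32_t sub_leaf;	/* CPUID sub-leaf */
};

/* Must stay sorted by level so the search loop can stop early. */
static const struct scattered_bit bits[] = {
	{ FEAT_A, REG_ECX, 0, 0x00000006, 0 },
	{ FEAT_B, REG_EDX, 2, 0x00000007, 0 },
	{ FEAT_C, REG_EDX, 3, 0x00000007, 0 },
	{ FEAT_D, REG_EDX, 7, 0x80000007, 0 },
	{ 0, 0, 0, 0, 0 }
};

/* Stand-in for the kernel's feature check: a bitmask indexed by feature ID. */
static bool feature_enabled(uint32_t enabled_mask, uint16_t feature)
{
	return (enabled_mask >> feature) & 1u;
}

/* Rebuild one register of one leaf from the table and the enabled features. */
static uint32_t rebuild_leaf(uint32_t enabled_mask, uint32_t level,
			     uint32_t sub_leaf, enum reg_idx reg)
{
	const struct scattered_bit *sb;
	uint32_t val = 0;

	assert(reg <= REG_EDX);

	for (sb = bits; sb->feature; sb++) {
		if (level > sb->level)
			continue;	/* entries for lower leaves: skip */
		if (level < sb->level)
			break;		/* sorted table: nothing further matches */
		if (reg == sb->reg && sub_leaf == sb->sub_leaf &&
		    feature_enabled(enabled_mask, sb->feature))
			val |= UINT32_C(1) << sb->bit;
	}
	return val;
}
```

The early `break` is only correct because the table is sorted, which is why the commit both reorders the entries and adds the "keep the leaf sorted" comment.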
[tip:x86/cpufeature] x86/cpuid: Cleanup cpuid_regs definitions
Commit-ID:  47f10a36003eaf493125a5e6687dd1ff775bfd8c
Gitweb:     http://git.kernel.org/tip/47f10a36003eaf493125a5e6687dd1ff775bfd8c
Author:     He Chen <he.c...@linux.intel.com>
AuthorDate: Fri, 11 Nov 2016 17:25:34 +0800
Committer:  Thomas Gleixner <t...@linutronix.de>
CommitDate: Wed, 16 Nov 2016 11:13:09 +0100

x86/cpuid: Cleanup cpuid_regs definitions

cpuid_regs is defined multiple times as structure and enum. Rename the enum
and move all of it to processor.h so we don't end up with more instances.

Rename the misnomed register enumeration from CR_* to the obvious CPUID_*.

[ tglx: Rewrote changelog ]

Signed-off-by: He Chen <he.c...@linux.intel.com>
Reviewed-by: Borislav Petkov <b...@alien8.de>
Cc: Luwei Kang <luwei.k...@intel.com>
Cc: k...@vger.kernel.org
Cc: Radim Krčmář <rkrc...@redhat.com>
Cc: Piotr Luc <piotr@intel.com>
Cc: Paolo Bonzini <pbonz...@redhat.com>
Link: http://lkml.kernel.org/r/1478856336-9388-2-git-send-email-he.c...@linux.intel.com
Signed-off-by: Thomas Gleixner <t...@linutronix.de>

---
 arch/x86/events/intel/pt.c       | 45 +---
 arch/x86/include/asm/processor.h | 11 ++
 arch/x86/kernel/cpu/scattered.c  | 28 ++---
 arch/x86/kernel/cpuid.c          |  4
 4 files changed, 41 insertions(+), 47 deletions(-)

diff --git a/arch/x86/events/intel/pt.c b/arch/x86/events/intel/pt.c
index c5047b8..1c1b9fe 100644
--- a/arch/x86/events/intel/pt.c
+++ b/arch/x86/events/intel/pt.c
@@ -36,13 +36,6 @@ static DEFINE_PER_CPU(struct pt, pt_ctx);

 static struct pt_pmu pt_pmu;

-enum cpuid_regs {
-	CR_EAX = 0,
-	CR_ECX,
-	CR_EDX,
-	CR_EBX
-};
-
 /*
  * Capabilities of Intel PT hardware, such as number of address bits or
  * supported output schemes, are cached and exported to userspace as "caps"
@@ -64,21 +57,21 @@ static struct pt_cap_desc {
 	u8		reg;
 	u32		mask;
 } pt_caps[] = {
-	PT_CAP(max_subleaf,		0, CR_EAX, 0xffffffff),
-	PT_CAP(cr3_filtering,		0, CR_EBX, BIT(0)),
-	PT_CAP(psb_cyc,			0, CR_EBX, BIT(1)),
-	PT_CAP(ip_filtering,		0, CR_EBX, BIT(2)),
-	PT_CAP(mtc,			0, CR_EBX, BIT(3)),
-	PT_CAP(ptwrite,			0, CR_EBX, BIT(4)),
-	PT_CAP(power_event_trace,	0, CR_EBX, BIT(5)),
-	PT_CAP(topa_output,		0, CR_ECX, BIT(0)),
-	PT_CAP(topa_multiple_entries,	0, CR_ECX, BIT(1)),
-	PT_CAP(single_range_output,	0, CR_ECX, BIT(2)),
-	PT_CAP(payloads_lip,		0, CR_ECX, BIT(31)),
-	PT_CAP(num_address_ranges,	1, CR_EAX, 0x3),
-	PT_CAP(mtc_periods,		1, CR_EAX, 0xffff0000),
-	PT_CAP(cycle_thresholds,	1, CR_EBX, 0xffff),
-	PT_CAP(psb_periods,		1, CR_EBX, 0xffff0000),
+	PT_CAP(max_subleaf,		0, CPUID_EAX, 0xffffffff),
+	PT_CAP(cr3_filtering,		0, CPUID_EBX, BIT(0)),
+	PT_CAP(psb_cyc,			0, CPUID_EBX, BIT(1)),
+	PT_CAP(ip_filtering,		0, CPUID_EBX, BIT(2)),
+	PT_CAP(mtc,			0, CPUID_EBX, BIT(3)),
+	PT_CAP(ptwrite,			0, CPUID_EBX, BIT(4)),
+	PT_CAP(power_event_trace,	0, CPUID_EBX, BIT(5)),
+	PT_CAP(topa_output,		0, CPUID_ECX, BIT(0)),
+	PT_CAP(topa_multiple_entries,	0, CPUID_ECX, BIT(1)),
+	PT_CAP(single_range_output,	0, CPUID_ECX, BIT(2)),
+	PT_CAP(payloads_lip,		0, CPUID_ECX, BIT(31)),
+	PT_CAP(num_address_ranges,	1, CPUID_EAX, 0x3),
+	PT_CAP(mtc_periods,		1, CPUID_EAX, 0xffff0000),
+	PT_CAP(cycle_thresholds,	1, CPUID_EBX, 0xffff),
+	PT_CAP(psb_periods,		1, CPUID_EBX, 0xffff0000),
 };

 static u32 pt_cap_get(enum pt_capabilities cap)
@@ -213,10 +206,10 @@ static int __init pt_pmu_hw_init(void)

 	for (i = 0; i < PT_CPUID_LEAVES; i++) {
 		cpuid_count(20, i,
-			    &pt_pmu.caps[CR_EAX + i*PT_CPUID_REGS_NUM],
-			    &pt_pmu.caps[CR_EBX + i*PT_CPUID_REGS_NUM],
-			    &pt_pmu.caps[CR_ECX + i*PT_CPUID_REGS_NUM],
-			    &pt_pmu.caps[CR_EDX + i*PT_CPUID_REGS_NUM]);
+			    &pt_pmu.caps[CPUID_EAX + i*PT_CPUID_REGS_NUM],
+			    &pt_pmu.caps[CPUID_EBX + i*PT_CPUID_REGS_NUM],
+			    &pt_pmu.caps[CPUID_ECX + i*PT_CPUID_REGS_NUM],
+			    &pt_pmu.caps[CPUID_EDX + i*PT_CPUID_REGS_NUM]);
 	}

 	ret = -ENOMEM;
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 984a7bf..8f6ac5b 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -137,6 +137,17 @@ struct cpuinfo_x86 {
 	u32			microcode;
 };

+struct cpuid_regs {
+	u32 eax, ebx, ecx, edx;
+};
+
+enum cpuid_regs_idx {
+	CPUID_EAX = 0,
+	CPUID_EBX,
+	CPUID_ECX,
+	CPUID_EDX,
+};
+
 #define X86_VENDOR_INTEL	0
 #define X86_VENDOR_CYRIX	1
 #define X86_VENDOR_AMD
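
One visible effect of this cleanup is that the shared enumeration runs EAX, EBX, ECX, EDX, the same order as the fields of `struct cpuid_regs` and as the output arguments of `cpuid_count()` in the pt.c hunk, whereas the removed local enums used EAX, ECX, EDX, EBX. The user-space sketch below illustrates why a matching order is convenient. The two type definitions mirror the patch (with `uint32_t` in place of `u32`); `regs_get()` and `enum_matches_layout()` are hypothetical helpers added only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the definitions the patch adds to processor.h. */
struct cpuid_regs {
	uint32_t eax, ebx, ecx, edx;
};

enum cpuid_regs_idx {
	CPUID_EAX = 0,
	CPUID_EBX,
	CPUID_ECX,
	CPUID_EDX,
};

/* Hypothetical helper: read one register from the struct by enum index. */
static uint32_t regs_get(const struct cpuid_regs *r, enum cpuid_regs_idx idx)
{
	assert(idx <= CPUID_EDX);

	switch (idx) {
	case CPUID_EAX: return r->eax;
	case CPUID_EBX: return r->ebx;
	case CPUID_ECX: return r->ecx;
	default:        return r->edx;
	}
}

/* True when each enum value times 4 bytes equals the field's offset. */
static int enum_matches_layout(void)
{
	return offsetof(struct cpuid_regs, eax) == CPUID_EAX * sizeof(uint32_t) &&
	       offsetof(struct cpuid_regs, ebx) == CPUID_EBX * sizeof(uint32_t) &&
	       offsetof(struct cpuid_regs, ecx) == CPUID_ECX * sizeof(uint32_t) &&
	       offsetof(struct cpuid_regs, edx) == CPUID_EDX * sizeof(uint32_t);
}
```

With the old EAX, ECX, EDX, EBX ordering the same layout check would fail for three of the four fields, so code indexing a `u32 regs[4]` array and code using the struct could silently disagree.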
Re: [Patch v6.1] x86/kvm: Add AVX512_4VNNIW and AVX512_4FMAPS support
On Tue, Nov 15, 2016 at 04:24:39AM +0800, kbuild test robot wrote:
> Hi He,
>
> [auto build test ERROR on kvm/linux-next]
> [also build test ERROR on v4.9-rc5]
> [cannot apply to next-20161114]
> [if your patch is applied to the wrong git tree, please drop us a note to
> help improve the system]
>
> url:    https://github.com/0day-ci/linux/commits/He-Chen/x86-kvm-Add-AVX512_4VNNIW-and-AVX512_4FMAPS-support/20161114-170941
> base:   https://git.kernel.org/pub/scm/virt/kvm/kvm.git linux-next
> config: x86_64-kexec (attached as .config)
> compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
> reproduce:
>         # save the attached .config to linux build tree
>         make ARCH=x86_64
>
> All errors (new ones prefixed by >>):
>
>    arch/x86/kvm/cpuid.c: In function '__do_cpuid_ent':
> >> arch/x86/kvm/cpuid.c:472:18: error: implicit declaration of function 'get_scattered_cpuid_leaf' [-Werror=implicit-function-declaration]
>       entry->edx &= get_scattered_cpuid_leaf(7, 0, CPUID_EDX);
>                     ^~~~~~~~~~~~~~~~~~~~~~~~
> >> arch/x86/kvm/cpuid.c:472:49: error: 'CPUID_EDX' undeclared (first use in this function)
>       entry->edx &= get_scattered_cpuid_leaf(7, 0, CPUID_EDX);
>                                                    ^
>    arch/x86/kvm/cpuid.c:472:49: note: each undeclared identifier is reported only once for each function it appears in
>    cc1: some warnings being treated as errors
>

I downloaded the attached .config.gz and used the .config in it to build
the kernel on my local branch again, and I don't see any warning or error
message.

I wonder whether the previous 0001 and 0002 patches were applied before
this test was run? Or is there something wrong with my compiler or patches?

Thanks,
-He
Re: [PATCH v6 3/3] x86/kvm: Add AVX512_4VNNIW and AVX512_4FMAPS support
On Mon, Nov 14, 2016 at 06:58:22AM +0100, Borislav Petkov wrote:
> On Mon, Nov 14, 2016 at 09:41:04AM +0800, He Chen wrote:
> > Yep, Luwei wrote it and I send it on behalf of him.
>
> Then it needs to have the following format so that tools can pick up the
> proper author:
>
> "From: Luwei ...
>
>
>
> Signed-off-by: He Chen...
> Signed-off-by: Luwei...
> ...
> "
>
> git format-patch gives that formatting.
>
> If you want to change the ownership, do the following on the local
> commit:
>
> $ git commit --amend --author="Luwei Kang <luwei.k...@intel.com>"
>
> in case it lists you locally as author.
>
> HTH.
>

I am not sure whether it is OK to reply with the amended patch in this
thread, or should I send another [Patch v6.1] patchset?

Thanks,
-He
[Patch v6.1] x86/kvm: Add AVX512_4VNNIW and AVX512_4FMAPS support
From 2daa60b3c6ab5aa6414ebb33119a34403dad2048 Mon Sep 17 00:00:00 2001
From: Luwei Kang <luwei.k...@intel.com>
Date: Mon, 7 Nov 2016 14:03:20 +0800
Subject: [Patch v6.1] x86/kvm: Add AVX512_4VNNIW and AVX512_4FMAPS support

Add two new AVX512 subfeatures support for KVM guest.

AVX512_4VNNIW:
Vector instructions for deep learning enhanced word variable precision.

AVX512_4FMAPS:
Vector instructions for deep learning floating-point single precision.

Reviewed-by: Borislav Petkov <b...@suse.de>
Signed-off-by: He Chen <he.c...@linux.intel.com>
Signed-off-by: Luwei Kang <luwei.k...@intel.com>
---
 arch/x86/kvm/cpuid.c | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index afa7bbb..ddcdf7c 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -16,6 +16,7 @@
 #include
 #include
 #include
+#include
 #include /* For use_eager_fpu.  Ugh! */
 #include
 #include
@@ -65,6 +66,11 @@ u64 kvm_supported_xcr0(void)

 #define F(x) bit(X86_FEATURE_##x)

+/* These are scattered features in cpufeatures.h. */
+#define KVM_CPUID_BIT_AVX512_4VNNIW     2
+#define KVM_CPUID_BIT_AVX512_4FMAPS     3
+#define KF(x) bit(KVM_CPUID_BIT_##x)
+
 int kvm_update_cpuid(struct kvm_vcpu *vcpu)
 {
 	struct kvm_cpuid_entry2 *best;
@@ -376,6 +382,10 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 	/* cpuid 7.0.ecx*/
 	const u32 kvm_cpuid_7_0_ecx_x86_features = F(PKU) | 0 /*OSPKE*/;

+	/* cpuid 7.0.edx*/
+	const u32 kvm_cpuid_7_0_edx_x86_features =
+		KF(AVX512_4VNNIW) | KF(AVX512_4FMAPS);
+
 	/* all calls to cpuid_count() should be made on the same cpu */
 	get_cpu();

@@ -458,12 +468,14 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 			/* PKU is not yet implemented for shadow paging. */
 			if (!tdp_enabled)
 				entry->ecx &= ~F(PKU);
+			entry->edx &= kvm_cpuid_7_0_edx_x86_features;
+			entry->edx &= get_scattered_cpuid_leaf(7, 0, CPUID_EDX);
 		} else {
 			entry->ebx = 0;
 			entry->ecx = 0;
+			entry->edx = 0;
 		}
 		entry->eax = 0;
-		entry->edx = 0;
 		break;
 	}
 	case 9:
--
2.7.4
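
The last hunk applies two masks in sequence to the EDX value of leaf 7, sub-leaf 0. A minimal user-space sketch of that filtering follows. The bit positions 2 and 3 come from the patch; `guest_leaf7_edx()`, its parameters and `SUPPORTED_7_0_EDX` are hypothetical names for this illustration, and `host_edx` stands in for the value that `get_scattered_cpuid_leaf(7, 0, CPUID_EDX)` supplies in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions in CPUID.(EAX=7,ECX=0):EDX, as defined in the patch. */
#define BIT_AVX512_4VNNIW	2
#define BIT_AVX512_4FMAPS	3

/* Features this toy hypervisor is willing to expose in that register. */
#define SUPPORTED_7_0_EDX \
	((UINT32_C(1) << BIT_AVX512_4VNNIW) | (UINT32_C(1) << BIT_AVX512_4FMAPS))

/*
 * Hypothetical helper.  hw_edx is the raw value read from the CPU,
 * host_edx is the leaf rebuilt from the features enabled on the host,
 * leaf_valid says whether sub-leaf 0 of leaf 7 is being queried.
 */
static uint32_t guest_leaf7_edx(uint32_t hw_edx, uint32_t host_edx,
				int leaf_valid)
{
	assert(leaf_valid == 0 || leaf_valid == 1);

	if (!leaf_valid)
		return 0;		/* mirrors the "entry->edx = 0" branch */

	hw_edx &= SUPPORTED_7_0_EDX;	/* drop bits the hypervisor does not handle */
	hw_edx &= host_edx;		/* drop bits disabled on the host */
	return hw_edx;
}
```

A bit therefore reaches the guest only if the hardware reports it, the hypervisor supports it, and the host has not disabled it, which matches the rationale given in the get_scattered_cpuid_leaf() changelog.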
Re: [PATCH v6 3/3] x86/kvm: Add AVX512_4VNNIW and AVX512_4FMAPS support
On Sat, Nov 12, 2016 at 01:53:29PM +0100, Borislav Petkov wrote:
> On Fri, Nov 11, 2016 at 05:25:36PM +0800, He Chen wrote:
> > Add two new AVX512 subfeatures support for KVM guest.
> >
> > AVX512_4VNNIW:
> > Vector instructions for deep learning enhanced word variable precision.
> >
> > AVX512_4FMAPS:
> > Vector instructions for deep learning floating-point single precision.
> >
> > Reviewed-by: Borislav Petkov <b...@suse.de>
> > Signed-off-by: Luwei Kang <luwei.k...@intel.com>
> > Signed-off-by: He Chen <he.c...@linux.intel.com>
> > ---
>
> Whoops, I said it looked ok but missed that SOB chain above.
>
> What does it mean? Did Luwei wrote the patch and you're sending it or
> ...?
>

Yep, Luwei wrote it and I am sending it on his behalf.

Thanks,
-He
[PATCH v6 1/3] x86/cpuid: Cleanup cpuid_regs definitions
Make the cpuid_regs definitions clearer and avoid a potential name clash.

Signed-off-by: He Chen <he.c...@linux.intel.com>
---
 arch/x86/events/intel/pt.c       | 45 +---
 arch/x86/include/asm/processor.h | 11 ++
 arch/x86/kernel/cpu/scattered.c  | 28 ++---
 arch/x86/kernel/cpuid.c          |  4
 4 files changed, 41 insertions(+), 47 deletions(-)

diff --git a/arch/x86/events/intel/pt.c b/arch/x86/events/intel/pt.c
index c5047b8..1c1b9fe 100644
--- a/arch/x86/events/intel/pt.c
+++ b/arch/x86/events/intel/pt.c
@@ -36,13 +36,6 @@ static DEFINE_PER_CPU(struct pt, pt_ctx);

 static struct pt_pmu pt_pmu;

-enum cpuid_regs {
-	CR_EAX = 0,
-	CR_ECX,
-	CR_EDX,
-	CR_EBX
-};
-
 /*
  * Capabilities of Intel PT hardware, such as number of address bits or
  * supported output schemes, are cached and exported to userspace as "caps"
@@ -64,21 +57,21 @@ static struct pt_cap_desc {
 	u8		reg;
 	u32		mask;
 } pt_caps[] = {
-	PT_CAP(max_subleaf,		0, CR_EAX, 0xffffffff),
-	PT_CAP(cr3_filtering,		0, CR_EBX, BIT(0)),
-	PT_CAP(psb_cyc,			0, CR_EBX, BIT(1)),
-	PT_CAP(ip_filtering,		0, CR_EBX, BIT(2)),
-	PT_CAP(mtc,			0, CR_EBX, BIT(3)),
-	PT_CAP(ptwrite,			0, CR_EBX, BIT(4)),
-	PT_CAP(power_event_trace,	0, CR_EBX, BIT(5)),
-	PT_CAP(topa_output,		0, CR_ECX, BIT(0)),
-	PT_CAP(topa_multiple_entries,	0, CR_ECX, BIT(1)),
-	PT_CAP(single_range_output,	0, CR_ECX, BIT(2)),
-	PT_CAP(payloads_lip,		0, CR_ECX, BIT(31)),
-	PT_CAP(num_address_ranges,	1, CR_EAX, 0x3),
-	PT_CAP(mtc_periods,		1, CR_EAX, 0xffff0000),
-	PT_CAP(cycle_thresholds,	1, CR_EBX, 0xffff),
-	PT_CAP(psb_periods,		1, CR_EBX, 0xffff0000),
+	PT_CAP(max_subleaf,		0, CPUID_EAX, 0xffffffff),
+	PT_CAP(cr3_filtering,		0, CPUID_EBX, BIT(0)),
+	PT_CAP(psb_cyc,			0, CPUID_EBX, BIT(1)),
+	PT_CAP(ip_filtering,		0, CPUID_EBX, BIT(2)),
+	PT_CAP(mtc,			0, CPUID_EBX, BIT(3)),
+	PT_CAP(ptwrite,			0, CPUID_EBX, BIT(4)),
+	PT_CAP(power_event_trace,	0, CPUID_EBX, BIT(5)),
+	PT_CAP(topa_output,		0, CPUID_ECX, BIT(0)),
+	PT_CAP(topa_multiple_entries,	0, CPUID_ECX, BIT(1)),
+	PT_CAP(single_range_output,	0, CPUID_ECX, BIT(2)),
+	PT_CAP(payloads_lip,		0, CPUID_ECX, BIT(31)),
+	PT_CAP(num_address_ranges,	1, CPUID_EAX, 0x3),
+	PT_CAP(mtc_periods,		1, CPUID_EAX, 0xffff0000),
+	PT_CAP(cycle_thresholds,	1, CPUID_EBX, 0xffff),
+	PT_CAP(psb_periods,		1, CPUID_EBX, 0xffff0000),
 };

 static u32 pt_cap_get(enum pt_capabilities cap)
@@ -213,10 +206,10 @@ static int __init pt_pmu_hw_init(void)

 	for (i = 0; i < PT_CPUID_LEAVES; i++) {
 		cpuid_count(20, i,
-			    &pt_pmu.caps[CR_EAX + i*PT_CPUID_REGS_NUM],
-			    &pt_pmu.caps[CR_EBX + i*PT_CPUID_REGS_NUM],
-			    &pt_pmu.caps[CR_ECX + i*PT_CPUID_REGS_NUM],
-			    &pt_pmu.caps[CR_EDX + i*PT_CPUID_REGS_NUM]);
+			    &pt_pmu.caps[CPUID_EAX + i*PT_CPUID_REGS_NUM],
+			    &pt_pmu.caps[CPUID_EBX + i*PT_CPUID_REGS_NUM],
+			    &pt_pmu.caps[CPUID_ECX + i*PT_CPUID_REGS_NUM],
+			    &pt_pmu.caps[CPUID_EDX + i*PT_CPUID_REGS_NUM]);
 	}

 	ret = -ENOMEM;
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 984a7bf..8f6ac5b 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -137,6 +137,17 @@ struct cpuinfo_x86 {
 	u32			microcode;
 };

+struct cpuid_regs {
+	u32 eax, ebx, ecx, edx;
+};
+
+enum cpuid_regs_idx {
+	CPUID_EAX = 0,
+	CPUID_EBX,
+	CPUID_ECX,
+	CPUID_EDX,
+};
+
 #define X86_VENDOR_INTEL	0
 #define X86_VENDOR_CYRIX	1
 #define X86_VENDOR_AMD		2
diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
index 1db8dc4..dbb470e 100644
--- a/arch/x86/kernel/cpu/scattered.c
+++ b/arch/x86/kernel/cpu/scattered.c
@@ -17,13 +17,6 @@ struct cpuid_bit {
 	u32 sub_leaf;
 };

-enum cpuid_regs {
-	CR_EAX = 0,
-	CR_ECX,
-	CR_EDX,
-	CR_EBX
-};
-
 void init_scattered_cpuid_features(struct cpuinfo_x86 *c)
 {
 	u32 max_level;
@@ -31,14 +24,14 @@ void init_scattered_cpuid_features(struct cpuinfo_x86 *c)
 	const struct cpuid_bit *cb;

 	static const struct cpuid_bit cpuid_bits[] = {
-		{ X86_FEATURE_INTEL_PT,		CR_EBX,25, 0x00000007, 0 },
-		{ X86_FEATURE_AVX512_4VNNIW,	CR_EDX, 2,
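The relationship between the two definitions this patch adds to processor.h can be sketched in plain userspace C. This is only an illustration under assumptions: the helper names `cpuid_regs_to_array()` and `cpuid_reg_bit_set()` are hypothetical and do not appear in the patch, and `uint32_t` stands in for the kernel's `u32`.

```c
#include <stdint.h>

/* Mirrors the index enum added to processor.h by the patch. */
enum cpuid_regs_idx {
	CPUID_EAX = 0,
	CPUID_EBX,
	CPUID_ECX,
	CPUID_EDX,
};

/* Mirrors struct cpuid_regs from the patch (named-field view). */
struct cpuid_regs {
	uint32_t eax, ebx, ecx, edx;
};

/* Hypothetical helper: copy the named fields into an array indexed by the enum. */
static void cpuid_regs_to_array(const struct cpuid_regs *r, uint32_t out[4])
{
	out[CPUID_EAX] = r->eax;
	out[CPUID_EBX] = r->ebx;
	out[CPUID_ECX] = r->ecx;
	out[CPUID_EDX] = r->edx;
}

/* Hypothetical helper: test one bit (0..31) of the register selected by index. */
static int cpuid_reg_bit_set(const uint32_t regs[4], enum cpuid_regs_idx reg,
			     unsigned int bit)
{
	return (regs[reg] >> bit) & 1u;
}
```

Because the enum now follows the natural EAX, EBX, ECX, EDX order, an array filled in that order and the struct fields line up one to one, which is what a shared definition needs.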
[PATCH v6 3/3] x86/kvm: Add AVX512_4VNNIW and AVX512_4FMAPS support
Add support for two new AVX512 subfeatures to KVM guests.

AVX512_4VNNIW: Vector instructions for deep learning enhanced word variable precision.
AVX512_4FMAPS: Vector instructions for deep learning floating-point single precision.

Reviewed-by: Borislav Petkov <b...@suse.de>
Signed-off-by: Luwei Kang <luwei.k...@intel.com>
Signed-off-by: He Chen <he.c...@linux.intel.com>
---
 arch/x86/kvm/cpuid.c | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index afa7bbb..ddcdf7c 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -16,6 +16,7 @@
 #include
 #include
 #include
+#include
 #include /* For use_eager_fpu. Ugh! */
 #include
 #include
@@ -65,6 +66,11 @@ u64 kvm_supported_xcr0(void)

 #define F(x) bit(X86_FEATURE_##x)

+/* These are scattered features in cpufeatures.h. */
+#define KVM_CPUID_BIT_AVX512_4VNNIW	2
+#define KVM_CPUID_BIT_AVX512_4FMAPS	3
+#define KF(x) bit(KVM_CPUID_BIT_##x)
+
 int kvm_update_cpuid(struct kvm_vcpu *vcpu)
 {
 	struct kvm_cpuid_entry2 *best;
@@ -376,6 +382,10 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 	/* cpuid 7.0.ecx*/
 	const u32 kvm_cpuid_7_0_ecx_x86_features = F(PKU) | 0 /*OSPKE*/;

+	/* cpuid 7.0.edx*/
+	const u32 kvm_cpuid_7_0_edx_x86_features =
+		KF(AVX512_4VNNIW) | KF(AVX512_4FMAPS);
+
 	/* all calls to cpuid_count() should be made on the same cpu */
 	get_cpu();

@@ -458,12 +468,14 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 			/* PKU is not yet implemented for shadow paging. */
 			if (!tdp_enabled)
 				entry->ecx &= ~F(PKU);
+			entry->edx &= kvm_cpuid_7_0_edx_x86_features;
+			entry->edx &= get_scattered_cpuid_leaf(7, 0, CPUID_EDX);
 		} else {
 			entry->ebx = 0;
 			entry->ecx = 0;
+			entry->edx = 0;
 		}
 		entry->eax = 0;
-		entry->edx = 0;
 		break;
 	}
 	case 9:
--
2.7.4
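The two-step masking of CPUID.7.0:EDX in this patch can be illustrated with a small standalone sketch. It is not the KVM code: `guest_cpuid_7_0_edx()` and `BIT_U32()` are hypothetical names, and the `host_scattered_edx` argument stands in for the value the kernel would get from `get_scattered_cpuid_leaf(7, 0, CPUID_EDX)`.

```c
#include <stdint.h>

#define BIT_U32(n) (UINT32_C(1) << (n))

/* Bit positions in CPUID.7.0:EDX, as defined by the patch. */
#define KVM_CPUID_BIT_AVX512_4VNNIW	2
#define KVM_CPUID_BIT_AVX512_4FMAPS	3

/*
 * Hypothetical helper: compute the EDX value exposed to the guest.
 * raw_edx            - EDX as read for leaf 7, sub-leaf 0
 * host_scattered_edx - bits the host kernel actually detected
 */
static uint32_t guest_cpuid_7_0_edx(uint32_t raw_edx, uint32_t host_scattered_edx)
{
	const uint32_t kvm_supported =
		BIT_U32(KVM_CPUID_BIT_AVX512_4VNNIW) |
		BIT_U32(KVM_CPUID_BIT_AVX512_4FMAPS);

	raw_edx &= kvm_supported;	/* keep only bits KVM knows how to expose */
	raw_edx &= host_scattered_edx;	/* drop bits the host did not detect */
	return raw_edx;
}
```

A bit reaches the guest only if it survives both masks, so a feature that KVM lists but the host lacks (or has disabled) is never advertised.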
[PATCH v6 2/3] x86/cpuid: Add a helper in scattered.c to return cpuid
In the current code, some sparse CPUID leaves are gathered into a fake leaf to reduce the size of the x86_capability array, but the kernel or modules (e.g. KVM CPUID enumeration) sometimes need the actual hardware leaf information.

This patch adds a helper, get_scattered_cpuid_leaf(), that rebuilds the actual CPUID leaf and can be called from modules.

Reviewed-by: Borislav Petkov <b...@suse.de>
Signed-off-by: He Chen <he.c...@linux.intel.com>
---
 arch/x86/include/asm/processor.h |  3 +++
 arch/x86/kernel/cpu/scattered.c  | 49 ++--
 2 files changed, 40 insertions(+), 12 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 8f6ac5b..e7f8c62 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -189,6 +189,9 @@ extern void identify_secondary_cpu(struct cpuinfo_x86 *);
 extern void print_cpu_info(struct cpuinfo_x86 *);
 void print_cpu_msr(struct cpuinfo_x86 *);
 extern void init_scattered_cpuid_features(struct cpuinfo_x86 *c);
+extern u32 get_scattered_cpuid_leaf(unsigned int level,
+				    unsigned int sub_leaf,
+				    enum cpuid_regs_idx reg);
 extern unsigned int init_intel_cacheinfo(struct cpuinfo_x86 *c);
 extern void init_amd_cacheinfo(struct cpuinfo_x86 *c);

diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
index dbb470e..d1316f9 100644
--- a/arch/x86/kernel/cpu/scattered.c
+++ b/arch/x86/kernel/cpu/scattered.c
@@ -17,24 +17,25 @@ struct cpuid_bit {
 	u32 sub_leaf;
 };

+/* Please keep the leaf sorted by cpuid_bit.level for faster search. */
+static const struct cpuid_bit cpuid_bits[] = {
+	{ X86_FEATURE_APERFMPERF,	CPUID_ECX,  0, 0x00000006, 0 },
+	{ X86_FEATURE_EPB,		CPUID_ECX,  3, 0x00000006, 0 },
+	{ X86_FEATURE_INTEL_PT,		CPUID_EBX, 25, 0x00000007, 0 },
+	{ X86_FEATURE_AVX512_4VNNIW,	CPUID_EDX,  2, 0x00000007, 0 },
+	{ X86_FEATURE_AVX512_4FMAPS,	CPUID_EDX,  3, 0x00000007, 0 },
+	{ X86_FEATURE_HW_PSTATE,	CPUID_EDX,  7, 0x80000007, 0 },
+	{ X86_FEATURE_CPB,		CPUID_EDX,  9, 0x80000007, 0 },
+	{ X86_FEATURE_PROC_FEEDBACK,	CPUID_EDX, 11, 0x80000007, 0 },
+	{ 0, 0, 0, 0, 0 }
+};
+
 void init_scattered_cpuid_features(struct cpuinfo_x86 *c)
 {
 	u32 max_level;
 	u32 regs[4];
 	const struct cpuid_bit *cb;

-	static const struct cpuid_bit cpuid_bits[] = {
-		{ X86_FEATURE_INTEL_PT,		CPUID_EBX, 25, 0x00000007, 0 },
-		{ X86_FEATURE_AVX512_4VNNIW,	CPUID_EDX,  2, 0x00000007, 0 },
-		{ X86_FEATURE_AVX512_4FMAPS,	CPUID_EDX,  3, 0x00000007, 0 },
-		{ X86_FEATURE_APERFMPERF,	CPUID_ECX,  0, 0x00000006, 0 },
-		{ X86_FEATURE_EPB,		CPUID_ECX,  3, 0x00000006, 0 },
-		{ X86_FEATURE_HW_PSTATE,	CPUID_EDX,  7, 0x80000007, 0 },
-		{ X86_FEATURE_CPB,		CPUID_EDX,  9, 0x80000007, 0 },
-		{ X86_FEATURE_PROC_FEEDBACK,	CPUID_EDX, 11, 0x80000007, 0 },
-		{ 0, 0, 0, 0, 0 }
-	};
-
 	for (cb = cpuid_bits; cb->feature; cb++) {

 		/* Verify that the level is valid */
@@ -51,3 +52,27 @@ void init_scattered_cpuid_features(struct cpuinfo_x86 *c)
 			set_cpu_cap(c, cb->feature);
 	}
 }
+
+u32 get_scattered_cpuid_leaf(unsigned int level, unsigned int sub_leaf,
+			     enum cpuid_regs_idx reg)
+{
+	const struct cpuid_bit *cb;
+	u32 cpuid_val = 0;
+
+	for (cb = cpuid_bits; cb->feature; cb++) {
+
+		if (level > cb->level)
+			continue;
+
+		if (level < cb->level)
+			break;
+
+		if (reg == cb->reg && sub_leaf == cb->sub_leaf) {
+			if (cpu_has(&boot_cpu_data, cb->feature))
+				cpuid_val |= BIT(cb->bit);
+		}
+	}
+
+	return cpuid_val;
+}
+EXPORT_SYMBOL_GPL(get_scattered_cpuid_leaf);
--
2.7.4
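The early-exit search over a table sorted by level, as used by get_scattered_cpuid_leaf(), can be tried outside the kernel with the sketch below. It is a userspace approximation under assumptions: the `FEAT_*` ids, `feature_present()` and `scattered_leaf()` are hypothetical stand-ins for `X86_FEATURE_*`, `cpu_has()` and the real helper, and the detected features are passed in as a bitmask instead of being read from `boot_cpu_data`.

```c
#include <stdint.h>

enum cpuid_regs_idx { CPUID_EAX = 0, CPUID_EBX, CPUID_ECX, CPUID_EDX };

/* Hypothetical feature ids standing in for X86_FEATURE_*; 0 is the terminator. */
enum {
	FEAT_APERFMPERF = 1, FEAT_EPB, FEAT_INTEL_PT,
	FEAT_AVX512_4VNNIW, FEAT_AVX512_4FMAPS, FEAT_HW_PSTATE,
};

struct cpuid_bit {
	uint16_t feature;
	uint8_t reg;
	uint8_t bit;
	uint32_t level;
	uint32_t sub_leaf;
};

/* Must stay sorted by level: the lookup stops at the first larger level. */
static const struct cpuid_bit cpuid_bits[] = {
	{ FEAT_APERFMPERF,	CPUID_ECX,  0, 0x00000006, 0 },
	{ FEAT_EPB,		CPUID_ECX,  3, 0x00000006, 0 },
	{ FEAT_INTEL_PT,	CPUID_EBX, 25, 0x00000007, 0 },
	{ FEAT_AVX512_4VNNIW,	CPUID_EDX,  2, 0x00000007, 0 },
	{ FEAT_AVX512_4FMAPS,	CPUID_EDX,  3, 0x00000007, 0 },
	{ FEAT_HW_PSTATE,	CPUID_EDX,  7, 0x80000007, 0 },
	{ 0, 0, 0, 0, 0 }
};

/* Stand-in for cpu_has(): 'detected' has bit N set if feature id N is present. */
static int feature_present(uint64_t detected, unsigned int feature)
{
	return (int)((detected >> feature) & 1u);
}

/* Rebuild one register of one (level, sub_leaf) from the detected features. */
static uint32_t scattered_leaf(uint64_t detected, uint32_t level,
			       uint32_t sub_leaf, enum cpuid_regs_idx reg)
{
	const struct cpuid_bit *cb;
	uint32_t val = 0;

	for (cb = cpuid_bits; cb->feature; cb++) {
		if (level > cb->level)
			continue;	/* not reached the requested level yet */
		if (level < cb->level)
			break;		/* sorted table: nothing further can match */
		if (reg == cb->reg && sub_leaf == cb->sub_leaf &&
		    feature_present(detected, cb->feature))
			val |= UINT32_C(1) << cb->bit;
	}
	return val;
}
```

The `break` is only correct while the table stays sorted by level, which is why the patch reorders the entries and adds the comment asking to keep them sorted.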
[PATCH v6 0/3] x86/kvm: Support AVX512_4VNNIW and AVX512_4FMAPS for KVM guest
This patch series adds two new AVX512 features to KVM guests. Since these two features are defined as scattered features in the kernel, some extra kernel modifications are included.

BTW, sorry for sending patches so frequently; I really appreciate your kind review.

---
Changes in v6:
* refine commit messages.

Changes in v5:
* divide the whole patchset into 3 parts.
* refine commit messages.

Changes in v4:
* divide the patch into 2 parts: the modification in scattered.c and support for the new AVX512 instructions in KVM.
* coding style.
* refine commit message.

Changes in v3:
* add a helper in scattered.c to get scattered leaf.

Changes in v2:
* add new macros for new AVX512 scattered features.
* add a cpuid_count_edx function to processor.h

He Chen (3):
  x86/cpuid: Cleanup cpuid_regs definitions
  x86/cpuid: Add a helper in scattered.c to return cpuid
  x86/kvm: Add AVX512_4VNNIW and AVX512_4FMAPS subfeatures support

 arch/x86/events/intel/pt.c       | 45 ++-
 arch/x86/include/asm/processor.h | 14 ++
 arch/x86/kernel/cpu/scattered.c  | 57 ++--
 arch/x86/kernel/cpuid.c          |  4 ---
 arch/x86/kvm/cpuid.c             | 14 +-
 5 files changed, 84 insertions(+), 50 deletions(-)

--
2.7.4
[PATCH v5 0/3] cpuid: Support AVX512_4VNNIW and AVX512_4FMAPS for KVM guest
This patch series adds two new AVX512 features to KVM guests. Since these two features are defined as scattered features in the kernel, some extra kernel modifications are included.

---
Changes in v5:
* divide the whole patchset into 3 parts.
* refine commit messages.

Changes in v4:
* divide the patch into 2 parts: the modification in scattered.c and support for the new AVX512 instructions in KVM.
* coding style.
* refine commit message.

Changes in v3:
* add a helper in scattered.c to get scattered leaf.

Changes in v2:
* add new macros for new AVX512 scattered features.
* add a cpuid_count_edx function to processor.h

He Chen (3):
  cpuid: cleanup cpuid_regs definitions
  cpuid: Add a helper in scattered.c to return cpuid
  cpuid: add AVX512_4VNNIW and AVX512_4FMAPS instructions support

 arch/x86/events/intel/pt.c       | 45 ++-
 arch/x86/include/asm/processor.h | 14 ++
 arch/x86/kernel/cpu/scattered.c  | 57 ++--
 arch/x86/kernel/cpuid.c          |  4 ---
 arch/x86/kvm/cpuid.c             | 14 +-
 5 files changed, 84 insertions(+), 50 deletions(-)

--
2.7.4
[PATCH v5 1/3] cpuid: cleanup cpuid_regs definitions
Make the cpuid_regs definitions clearer and avoid a potential name clash.

Signed-off-by: He Chen <he.c...@linux.intel.com>
---
 arch/x86/events/intel/pt.c       | 45 +---
 arch/x86/include/asm/processor.h | 11 ++
 arch/x86/kernel/cpu/scattered.c  | 28 ++---
 arch/x86/kernel/cpuid.c          |  4
 4 files changed, 41 insertions(+), 47 deletions(-)

diff --git a/arch/x86/events/intel/pt.c b/arch/x86/events/intel/pt.c
index c5047b8..1c1b9fe 100644
--- a/arch/x86/events/intel/pt.c
+++ b/arch/x86/events/intel/pt.c
@@ -36,13 +36,6 @@ static DEFINE_PER_CPU(struct pt, pt_ctx);

 static struct pt_pmu pt_pmu;

-enum cpuid_regs {
-	CR_EAX = 0,
-	CR_ECX,
-	CR_EDX,
-	CR_EBX
-};
-
 /*
  * Capabilities of Intel PT hardware, such as number of address bits or
  * supported output schemes, are cached and exported to userspace as "caps"
@@ -64,21 +57,21 @@ static struct pt_cap_desc {
 	u8		reg;
 	u32		mask;
 } pt_caps[] = {
-	PT_CAP(max_subleaf,		0, CR_EAX, 0xffffffff),
-	PT_CAP(cr3_filtering,		0, CR_EBX, BIT(0)),
-	PT_CAP(psb_cyc,			0, CR_EBX, BIT(1)),
-	PT_CAP(ip_filtering,		0, CR_EBX, BIT(2)),
-	PT_CAP(mtc,			0, CR_EBX, BIT(3)),
-	PT_CAP(ptwrite,			0, CR_EBX, BIT(4)),
-	PT_CAP(power_event_trace,	0, CR_EBX, BIT(5)),
-	PT_CAP(topa_output,		0, CR_ECX, BIT(0)),
-	PT_CAP(topa_multiple_entries,	0, CR_ECX, BIT(1)),
-	PT_CAP(single_range_output,	0, CR_ECX, BIT(2)),
-	PT_CAP(payloads_lip,		0, CR_ECX, BIT(31)),
-	PT_CAP(num_address_ranges,	1, CR_EAX, 0x3),
-	PT_CAP(mtc_periods,		1, CR_EAX, 0xffff0000),
-	PT_CAP(cycle_thresholds,	1, CR_EBX, 0xffff),
-	PT_CAP(psb_periods,		1, CR_EBX, 0xffff0000),
+	PT_CAP(max_subleaf,		0, CPUID_EAX, 0xffffffff),
+	PT_CAP(cr3_filtering,		0, CPUID_EBX, BIT(0)),
+	PT_CAP(psb_cyc,			0, CPUID_EBX, BIT(1)),
+	PT_CAP(ip_filtering,		0, CPUID_EBX, BIT(2)),
+	PT_CAP(mtc,			0, CPUID_EBX, BIT(3)),
+	PT_CAP(ptwrite,			0, CPUID_EBX, BIT(4)),
+	PT_CAP(power_event_trace,	0, CPUID_EBX, BIT(5)),
+	PT_CAP(topa_output,		0, CPUID_ECX, BIT(0)),
+	PT_CAP(topa_multiple_entries,	0, CPUID_ECX, BIT(1)),
+	PT_CAP(single_range_output,	0, CPUID_ECX, BIT(2)),
+	PT_CAP(payloads_lip,		0, CPUID_ECX, BIT(31)),
+	PT_CAP(num_address_ranges,	1, CPUID_EAX, 0x3),
+	PT_CAP(mtc_periods,		1, CPUID_EAX, 0xffff0000),
+	PT_CAP(cycle_thresholds,	1, CPUID_EBX, 0xffff),
+	PT_CAP(psb_periods,		1, CPUID_EBX, 0xffff0000),
 };

 static u32 pt_cap_get(enum pt_capabilities cap)
@@ -213,10 +206,10 @@ static int __init pt_pmu_hw_init(void)

 	for (i = 0; i < PT_CPUID_LEAVES; i++) {
 		cpuid_count(20, i,
-			    &pt_pmu.caps[CR_EAX + i*PT_CPUID_REGS_NUM],
-			    &pt_pmu.caps[CR_EBX + i*PT_CPUID_REGS_NUM],
-			    &pt_pmu.caps[CR_ECX + i*PT_CPUID_REGS_NUM],
-			    &pt_pmu.caps[CR_EDX + i*PT_CPUID_REGS_NUM]);
+			    &pt_pmu.caps[CPUID_EAX + i*PT_CPUID_REGS_NUM],
+			    &pt_pmu.caps[CPUID_EBX + i*PT_CPUID_REGS_NUM],
+			    &pt_pmu.caps[CPUID_ECX + i*PT_CPUID_REGS_NUM],
+			    &pt_pmu.caps[CPUID_EDX + i*PT_CPUID_REGS_NUM]);
 	}

 	ret = -ENOMEM;
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 984a7bf..8f6ac5b 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -137,6 +137,17 @@ struct cpuinfo_x86 {
 	u32			microcode;
 };

+struct cpuid_regs {
+	u32 eax, ebx, ecx, edx;
+};
+
+enum cpuid_regs_idx {
+	CPUID_EAX = 0,
+	CPUID_EBX,
+	CPUID_ECX,
+	CPUID_EDX,
+};
+
 #define X86_VENDOR_INTEL	0
 #define X86_VENDOR_CYRIX	1
 #define X86_VENDOR_AMD		2
diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
index 1db8dc4..5dbdd0b 100644
--- a/arch/x86/kernel/cpu/scattered.c
+++ b/arch/x86/kernel/cpu/scattered.c
@@ -17,13 +17,6 @@ struct cpuid_bit {
 	u32 sub_leaf;
 };

-enum cpuid_regs {
-	CR_EAX = 0,
-	CR_ECX,
-	CR_EDX,
-	CR_EBX
-};
-
 void init_scattered_cpuid_features(struct cpuinfo_x86 *c)
 {
 	u32 max_level;
@@ -31,14 +24,14 @@ void init_scattered_cpuid_features(struct cpuinfo_x86 *c)
 	const struct cpuid_bit *cb;

 	static const struct cpuid_bit cpuid_bits[] = {
-		{ X86_FEATURE_INTEL_PT,		CR_EBX,25, 0x00000007, 0 },
-		{ X86_FEATURE_AVX512_4VNNIW,	CR_EDX, 2,
[PATCH v5 2/3] cpuid: Add a helper in scattered.c to return cpuid
In the current code, some sparse CPUID leaves are gathered into a fake leaf to reduce the size of the x86_capability array, but the kernel or modules (e.g. KVM cpuid enumeration) sometimes need the actual hardware leaf information.

This patch adds a helper, get_scattered_cpuid_leaf(), that rebuilds the actual CPUID leaf and can be called from modules.

Signed-off-by: He Chen <he.c...@linux.intel.com>
---
 arch/x86/include/asm/processor.h |  3 +++
 arch/x86/kernel/cpu/scattered.c  | 49 ++--
 2 files changed, 40 insertions(+), 12 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 8f6ac5b..e7f8c62 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -189,6 +189,9 @@ extern void identify_secondary_cpu(struct cpuinfo_x86 *);
 extern void print_cpu_info(struct cpuinfo_x86 *);
 void print_cpu_msr(struct cpuinfo_x86 *);
 extern void init_scattered_cpuid_features(struct cpuinfo_x86 *c);
+extern u32 get_scattered_cpuid_leaf(unsigned int level,
+				    unsigned int sub_leaf,
+				    enum cpuid_regs_idx reg);
 extern unsigned int init_intel_cacheinfo(struct cpuinfo_x86 *c);
 extern void init_amd_cacheinfo(struct cpuinfo_x86 *c);

diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
index 5dbdd0b..d1316f9 100644
--- a/arch/x86/kernel/cpu/scattered.c
+++ b/arch/x86/kernel/cpu/scattered.c
@@ -17,24 +17,25 @@ struct cpuid_bit {
 	u32 sub_leaf;
 };

+/* Please keep the leaf sorted by cpuid_bit.level for faster search. */
+static const struct cpuid_bit cpuid_bits[] = {
+	{ X86_FEATURE_APERFMPERF,	CPUID_ECX,  0, 0x00000006, 0 },
+	{ X86_FEATURE_EPB,		CPUID_ECX,  3, 0x00000006, 0 },
+	{ X86_FEATURE_INTEL_PT,		CPUID_EBX, 25, 0x00000007, 0 },
+	{ X86_FEATURE_AVX512_4VNNIW,	CPUID_EDX,  2, 0x00000007, 0 },
+	{ X86_FEATURE_AVX512_4FMAPS,	CPUID_EDX,  3, 0x00000007, 0 },
+	{ X86_FEATURE_HW_PSTATE,	CPUID_EDX,  7, 0x80000007, 0 },
+	{ X86_FEATURE_CPB,		CPUID_EDX,  9, 0x80000007, 0 },
+	{ X86_FEATURE_PROC_FEEDBACK,	CPUID_EDX, 11, 0x80000007, 0 },
+	{ 0, 0, 0, 0, 0 }
+};
+
 void init_scattered_cpuid_features(struct cpuinfo_x86 *c)
 {
 	u32 max_level;
 	u32 regs[4];
 	const struct cpuid_bit *cb;

-	static const struct cpuid_bit cpuid_bits[] = {
-		{ X86_FEATURE_INTEL_PT,		CPUID_EBX,25, 0x00000007, 0 },
-		{ X86_FEATURE_AVX512_4VNNIW,	CPUID_EDX, 2, 0x00000007, 0 },
-		{ X86_FEATURE_AVX512_4FMAPS,	CPUID_EDX, 3, 0x00000007, 0 },
-		{ X86_FEATURE_APERFMPERF,	CPUID_ECX, 0, 0x00000006, 0 },
-		{ X86_FEATURE_EPB,		CPUID_ECX, 3, 0x00000006, 0 },
-		{ X86_FEATURE_HW_PSTATE,	CPUID_EDX, 7, 0x80000007, 0 },
-		{ X86_FEATURE_CPB,		CPUID_EDX, 9, 0x80000007, 0 },
-		{ X86_FEATURE_PROC_FEEDBACK,	CPUID_EDX,11, 0x80000007, 0 },
-		{ 0, 0, 0, 0, 0 }
-	};
-
 	for (cb = cpuid_bits; cb->feature; cb++) {

 		/* Verify that the level is valid */
@@ -51,3 +52,27 @@ void init_scattered_cpuid_features(struct cpuinfo_x86 *c)
 			set_cpu_cap(c, cb->feature);
 	}
 }
+
+u32 get_scattered_cpuid_leaf(unsigned int level, unsigned int sub_leaf,
+			     enum cpuid_regs_idx reg)
+{
+	const struct cpuid_bit *cb;
+	u32 cpuid_val = 0;
+
+	for (cb = cpuid_bits; cb->feature; cb++) {
+
+		if (level > cb->level)
+			continue;
+
+		if (level < cb->level)
+			break;
+
+		if (reg == cb->reg && sub_leaf == cb->sub_leaf) {
+			if (cpu_has(&boot_cpu_data, cb->feature))
+				cpuid_val |= BIT(cb->bit);
+		}
+	}
+
+	return cpuid_val;
+}
+EXPORT_SYMBOL_GPL(get_scattered_cpuid_leaf);
--
2.7.4
[PATCH v5 3/3] cpuid: add AVX512_4VNNIW and AVX512_4FMAPS instructions support
Add support for two new AVX512 instruction groups to KVM guests.

AVX512_4VNNIW:
Vector instructions for deep learning enhanced word variable precision.

AVX512_4FMAPS:
Vector instructions for deep learning floating-point single precision.

Signed-off-by: Luwei Kang <luwei.k...@intel.com>
Signed-off-by: He Chen <he.c...@linux.intel.com>
---
 arch/x86/kvm/cpuid.c | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index afa7bbb..ddcdf7c 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -16,6 +16,7 @@
 #include
 #include
 #include
+#include
 #include /* For use_eager_fpu. Ugh! */
 #include
 #include
@@ -65,6 +66,11 @@ u64 kvm_supported_xcr0(void)

 #define F(x) bit(X86_FEATURE_##x)

+/* These are scattered features in cpufeatures.h. */
+#define KVM_CPUID_BIT_AVX512_4VNNIW 2
+#define KVM_CPUID_BIT_AVX512_4FMAPS 3
+#define KF(x) bit(KVM_CPUID_BIT_##x)
+
 int kvm_update_cpuid(struct kvm_vcpu *vcpu)
 {
 	struct kvm_cpuid_entry2 *best;
@@ -376,6 +382,10 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 	/* cpuid 7.0.ecx*/
 	const u32 kvm_cpuid_7_0_ecx_x86_features = F(PKU) | 0 /*OSPKE*/;

+	/* cpuid 7.0.edx*/
+	const u32 kvm_cpuid_7_0_edx_x86_features =
+		KF(AVX512_4VNNIW) | KF(AVX512_4FMAPS);
+
 	/* all calls to cpuid_count() should be made on the same cpu */
 	get_cpu();

@@ -458,12 +468,14 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 			/* PKU is not yet implemented for shadow paging. */
 			if (!tdp_enabled)
 				entry->ecx &= ~F(PKU);
+			entry->edx &= kvm_cpuid_7_0_edx_x86_features;
+			entry->edx &= get_scattered_cpuid_leaf(7, 0, CPUID_EDX);
 		} else {
 			entry->ebx = 0;
 			entry->ecx = 0;
+			entry->edx = 0;
 		}
 		entry->eax = 0;
-		entry->edx = 0;
 		break;
 	}
 	case 9:
--
2.7.4
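The two-step masking in the last hunk can be modelled outside the kernel. The following is only a user-space sketch, not KVM code: `mask_guest_edx` and `BIT32` are hypothetical names introduced for illustration, and the `host_edx` argument stands in for the value that `get_scattered_cpuid_leaf(7, 0, CPUID_EDX)` would return on the host.

```c
#include <stdint.h>

/* Hypothetical helper macro for this sketch: a 32-bit mask with bit n set. */
#define BIT32(n) ((uint32_t)1 << (n))

/* Bit positions in CPUID.(EAX=7,ECX=0):EDX, as defined in the patch. */
#define KVM_CPUID_BIT_AVX512_4VNNIW 2
#define KVM_CPUID_BIT_AVX512_4FMAPS 3
#define KF(x) BIT32(KVM_CPUID_BIT_##x)

/*
 * Hypothetical model of the masking step: a bit reaches the guest only
 * if KVM knows about it AND the host reports it.
 */
static uint32_t mask_guest_edx(uint32_t raw_edx, uint32_t host_edx)
{
	const uint32_t kvm_supported = KF(AVX512_4VNNIW) | KF(AVX512_4FMAPS);
	uint32_t edx = raw_edx;

	edx &= kvm_supported;	/* drop bits KVM does not expose */
	edx &= host_edx;	/* drop bits the host does not have */
	return edx;
}
```

For example, a raw value with bits 2, 3 and 10 set, on a host that only reports bit 2, leaves only bit 2 visible to the guest.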
[PATCH v4 1/2] cpuid: Add a helper in scattered.c to return cpuid leaf info
Some sparse cpuid leafs are gathered in a fake leaf to save size of x86_capability array in current code, but sometimes, kernel or other modules (e.g. KVM cpuid enumeration) may need actual hardware leaf information. This patch adds a helper get_scattered_cpuid_leaf to rebuild actual cpuid leaf, and it can be called outside by modules. Also, export enum cpuid_regs in pt.c and scattered.c to enum cpuid_regs_idx in processor.h. --- arch/x86/events/intel/pt.c | 45 ++-- arch/x86/include/asm/processor.h | 14 ++ arch/x86/kernel/cpu/scattered.c | 56 ++-- arch/x86/kernel/cpuid.c | 4 --- 4 files changed, 70 insertions(+), 49 deletions(-) diff --git a/arch/x86/events/intel/pt.c b/arch/x86/events/intel/pt.c index c5047b8..1c1b9fe 100644 --- a/arch/x86/events/intel/pt.c +++ b/arch/x86/events/intel/pt.c @@ -36,13 +36,6 @@ static DEFINE_PER_CPU(struct pt, pt_ctx); static struct pt_pmu pt_pmu; -enum cpuid_regs { - CR_EAX = 0, - CR_ECX, - CR_EDX, - CR_EBX -}; - /* * Capabilities of Intel PT hardware, such as number of address bits or * supported output schemes, are cached and exported to userspace as "caps" @@ -64,21 +57,21 @@ static struct pt_cap_desc { u8 reg; u32 mask; } pt_caps[] = { - PT_CAP(max_subleaf, 0, CR_EAX, 0x), - PT_CAP(cr3_filtering, 0, CR_EBX, BIT(0)), - PT_CAP(psb_cyc, 0, CR_EBX, BIT(1)), - PT_CAP(ip_filtering,0, CR_EBX, BIT(2)), - PT_CAP(mtc, 0, CR_EBX, BIT(3)), - PT_CAP(ptwrite, 0, CR_EBX, BIT(4)), - PT_CAP(power_event_trace, 0, CR_EBX, BIT(5)), - PT_CAP(topa_output, 0, CR_ECX, BIT(0)), - PT_CAP(topa_multiple_entries, 0, CR_ECX, BIT(1)), - PT_CAP(single_range_output, 0, CR_ECX, BIT(2)), - PT_CAP(payloads_lip,0, CR_ECX, BIT(31)), - PT_CAP(num_address_ranges, 1, CR_EAX, 0x3), - PT_CAP(mtc_periods, 1, CR_EAX, 0x), - PT_CAP(cycle_thresholds,1, CR_EBX, 0x), - PT_CAP(psb_periods, 1, CR_EBX, 0x), + PT_CAP(max_subleaf, 0, CPUID_EAX, 0x), + PT_CAP(cr3_filtering, 0, CPUID_EBX, BIT(0)), + PT_CAP(psb_cyc, 0, CPUID_EBX, BIT(1)), + PT_CAP(ip_filtering,0, CPUID_EBX, BIT(2)), + 
PT_CAP(mtc, 0, CPUID_EBX, BIT(3)), + PT_CAP(ptwrite, 0, CPUID_EBX, BIT(4)), + PT_CAP(power_event_trace, 0, CPUID_EBX, BIT(5)), + PT_CAP(topa_output, 0, CPUID_ECX, BIT(0)), + PT_CAP(topa_multiple_entries, 0, CPUID_ECX, BIT(1)), + PT_CAP(single_range_output, 0, CPUID_ECX, BIT(2)), + PT_CAP(payloads_lip,0, CPUID_ECX, BIT(31)), + PT_CAP(num_address_ranges, 1, CPUID_EAX, 0x3), + PT_CAP(mtc_periods, 1, CPUID_EAX, 0x), + PT_CAP(cycle_thresholds,1, CPUID_EBX, 0x), + PT_CAP(psb_periods, 1, CPUID_EBX, 0x), }; static u32 pt_cap_get(enum pt_capabilities cap) @@ -213,10 +206,10 @@ static int __init pt_pmu_hw_init(void) for (i = 0; i < PT_CPUID_LEAVES; i++) { cpuid_count(20, i, - _pmu.caps[CR_EAX + i*PT_CPUID_REGS_NUM], - _pmu.caps[CR_EBX + i*PT_CPUID_REGS_NUM], - _pmu.caps[CR_ECX + i*PT_CPUID_REGS_NUM], - _pmu.caps[CR_EDX + i*PT_CPUID_REGS_NUM]); + _pmu.caps[CPUID_EAX + i*PT_CPUID_REGS_NUM], + _pmu.caps[CPUID_EBX + i*PT_CPUID_REGS_NUM], + _pmu.caps[CPUID_ECX + i*PT_CPUID_REGS_NUM], + _pmu.caps[CPUID_EDX + i*PT_CPUID_REGS_NUM]); } ret = -ENOMEM; diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index 984a7bf..e7f8c62 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -137,6 +137,17 @@ struct cpuinfo_x86 { u32 microcode; }; +struct cpuid_regs { + u32 eax, ebx, ecx, edx; +}; + +enum cpuid_regs_idx { + CPUID_EAX = 0, + CPUID_EBX, + CPUID_ECX, + CPUID_EDX, +}; + #define X86_VENDOR_INTEL 0 #define X86_VENDOR_CYRIX 1 #define X86_VENDOR_AMD 2 @@ -178,6 +189,9 @@ extern void identify_secondary_cpu(struct cpuinfo_x86 *); extern void print_cpu_info(struct cpuinfo_x86 *); void print_cpu_msr(struct cpuinfo_x86 *); extern void init_scattered_cpuid_features(struct cpuinfo_x86 *c); +extern u32 get_scattered_cpuid_leaf(unsigned int level, + unsigned int sub_leaf, + enum cpuid_regs_idx reg);
[PATCH v4 2/2] cpuid: add AVX512_4VNNIW and AVX512_4FMAPS instructions support
Add support for two new AVX512 instruction groups to KVM guests.

AVX512_4VNNIW:
Vector instructions for deep learning enhanced word variable precision.

AVX512_4FMAPS:
Vector instructions for deep learning floating-point single precision.
---
 arch/x86/kvm/cpuid.c | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index afa7bbb..ddcdf7c 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -16,6 +16,7 @@
 #include
 #include
 #include
+#include
 #include /* For use_eager_fpu. Ugh! */
 #include
 #include
@@ -65,6 +66,11 @@ u64 kvm_supported_xcr0(void)

 #define F(x) bit(X86_FEATURE_##x)

+/* These are scattered features in cpufeatures.h. */
+#define KVM_CPUID_BIT_AVX512_4VNNIW 2
+#define KVM_CPUID_BIT_AVX512_4FMAPS 3
+#define KF(x) bit(KVM_CPUID_BIT_##x)
+
 int kvm_update_cpuid(struct kvm_vcpu *vcpu)
 {
 	struct kvm_cpuid_entry2 *best;
@@ -376,6 +382,10 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 	/* cpuid 7.0.ecx*/
 	const u32 kvm_cpuid_7_0_ecx_x86_features = F(PKU) | 0 /*OSPKE*/;

+	/* cpuid 7.0.edx*/
+	const u32 kvm_cpuid_7_0_edx_x86_features =
+		KF(AVX512_4VNNIW) | KF(AVX512_4FMAPS);
+
 	/* all calls to cpuid_count() should be made on the same cpu */
 	get_cpu();

@@ -458,12 +468,14 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 			/* PKU is not yet implemented for shadow paging. */
 			if (!tdp_enabled)
 				entry->ecx &= ~F(PKU);
+			entry->edx &= kvm_cpuid_7_0_edx_x86_features;
+			entry->edx &= get_scattered_cpuid_leaf(7, 0, CPUID_EDX);
 		} else {
 			entry->ebx = 0;
 			entry->ecx = 0;
+			entry->edx = 0;
 		}
 		entry->eax = 0;
-		entry->edx = 0;
 		break;
 	}
 	case 9:
--
2.7.4
[PATCH v4 0/2] cpuid: Support AVX512_4VNNIW and AVX512_4FMAPS for KVM guest
This patch series adds two new AVX512 features to KVM guests. Because the kernel defines both as scattered features, the series also includes some supporting changes to the core x86 code.
---
Changes in v4:
* split the patch into 2 parts: the changes to scattered.c, and the KVM support for the new AVX512 instructions.
* coding style fixes.
* refine commit message.

Changes in v3:
* add a helper in scattered.c to get scattered leaf.

Changes in v2:
* add new macros for new AVX512 scattered features.
* add a cpuid_count_edx function to processor.h

He Chen (2):
  cpuid: Add a helper in scattered.c to return cpuid leaf info
  cpuid: add AVX512_4VNNIW and AVX512_4FMAPS instructions support

 arch/x86/events/intel/pt.c | 45 ++--
 arch/x86/include/asm/processor.h | 14 ++
 arch/x86/kernel/cpu/scattered.c | 56 ++--
 arch/x86/kernel/cpuid.c | 4 ---
 arch/x86/kvm/cpuid.c | 14 +-
 5 files changed, 83 insertions(+), 50 deletions(-)

--
2.7.4
Re: [PATCH v3] x86/cpuid: expose AVX512_4VNNIW and AVX512_4FMAPS features to kvm guest
On Fri, Nov 04, 2016 at 11:52:35AM +0100, Borislav Petkov wrote:
> Please CC me on your future submissions, thanks.
>

Sure.

> On Fri, Nov 04, 2016 at 03:07:19PM +0800, He Chen wrote:
> > The spec can be found in Intel Software Developer Manual or in
> > Instruction Set Extensions Programming Reference.
>
> This commit message is completely useless. Write commit messages in
> the way as if you're explaining to another person *why* this change is
> needed and that other person doesn't have an idea what you're doing.
>

That was careless of me; I will improve it in the next patch. Thanks for the kind advice.

> > Changes in v3:
> > * add a helper in scattered.c to get scattered leaf.
>
> The modification to scattered et al without the kvm use should be a
> separate patch.
>

Agreed.

> > * Capabilities of Intel PT hardware, such as number of address bits or
> > * supported output schemes, are cached and exported to userspace as "caps"
> > diff --git a/arch/x86/include/asm/processor.h
> > b/arch/x86/include/asm/processor.h
> > index 984a7bf..47978b7 100644
> > --- a/arch/x86/include/asm/processor.h
> > +++ b/arch/x86/include/asm/processor.h
> > @@ -137,6 +137,13 @@ struct cpuinfo_x86 {
> > u32 microcode;
> > };
> >
> > +enum cpuid_regs_idx {
>
> cpuid_regs was just fine.
>

It should be, but I found that it conflicts with `struct cpuid_regs` in `arch/x86/kernel/cpuid.c` now that the enum is exported.

Thanks,
-He
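The conflict mentioned at the end of this reply follows from C's rule that struct, union and enum tags share one name space, so `enum cpuid_regs` and `struct cpuid_regs` cannot both be visible in the same scope. The sketch below shows the two declarations coexisting once the enum gets its own tag, using the names from the later revisions of the patch; the accessor `cpuid_reg_by_idx` is a hypothetical helper added only for illustration and is not part of the series.

```c
#include <stdint.h>

/* The four registers returned by one CPUID invocation. */
struct cpuid_regs {
	uint32_t eax, ebx, ecx, edx;
};

/*
 * Register index. Declaring this as "enum cpuid_regs" in the same scope
 * would be rejected by the compiler, because the tag is already taken
 * by the struct above.
 */
enum cpuid_regs_idx {
	CPUID_EAX = 0,
	CPUID_EBX,
	CPUID_ECX,
	CPUID_EDX,
};

/* Hypothetical accessor: select one register of a result by index. */
static uint32_t cpuid_reg_by_idx(const struct cpuid_regs *r,
				 enum cpuid_regs_idx idx)
{
	switch (idx) {
	case CPUID_EAX: return r->eax;
	case CPUID_EBX: return r->ebx;
	case CPUID_ECX: return r->ecx;
	case CPUID_EDX: return r->edx;
	}
	return 0;	/* not reached for valid indices */
}
```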
[PATCH v3] x86/cpuid: expose AVX512_4VNNIW and AVX512_4FMAPS features to kvm guest
The spec can be found in Intel Software Developer Manual or in Instruction Set Extensions Programming Reference. Signed-off-by: Luwei Kang Signed-off-by: He Chen --- Changes in v3: * add a helper in scattered.c to get scattered leaf. Changes in v2: * add new macros for new AVX512 scattered features. * add a cpuid_count_edx function to processor.h --- arch/x86/events/intel/pt.c | 7 -- arch/x86/include/asm/processor.h | 9 +++ arch/x86/kernel/cpu/scattered.c | 52 +++- arch/x86/kvm/cpuid.c | 14 ++- 4 files changed, 57 insertions(+), 25 deletions(-) diff --git a/arch/x86/events/intel/pt.c b/arch/x86/events/intel/pt.c index c5047b8..5b4b972 100644 --- a/arch/x86/events/intel/pt.c +++ b/arch/x86/events/intel/pt.c @@ -36,13 +36,6 @@ static DEFINE_PER_CPU(struct pt, pt_ctx); static struct pt_pmu pt_pmu; -enum cpuid_regs { - CR_EAX = 0, - CR_ECX, - CR_EDX, - CR_EBX -}; - /* * Capabilities of Intel PT hardware, such as number of address bits or * supported output schemes, are cached and exported to userspace as "caps" diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index 984a7bf..47978b7 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -137,6 +137,13 @@ struct cpuinfo_x86 { u32 microcode; }; +enum cpuid_regs_idx { + CR_EAX = 0, + CR_ECX, + CR_EDX, + CR_EBX +}; + #define X86_VENDOR_INTEL 0 #define X86_VENDOR_CYRIX 1 #define X86_VENDOR_AMD 2 @@ -178,6 +185,8 @@ extern void identify_secondary_cpu(struct cpuinfo_x86 *); extern void print_cpu_info(struct cpuinfo_x86 *); void print_cpu_msr(struct cpuinfo_x86 *); extern void init_scattered_cpuid_features(struct cpuinfo_x86 *c); +extern u32 get_scattered_cpuid_leaf(unsigned int level, + unsigned int sub_leaf, enum cpuid_regs_idx reg); extern unsigned int init_intel_cacheinfo(struct cpuinfo_x86 *c); extern void init_amd_cacheinfo(struct cpuinfo_x86 *c); diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c index 1db8dc4..ca3c605 100644 --- 
a/arch/x86/kernel/cpu/scattered.c +++ b/arch/x86/kernel/cpu/scattered.c @@ -17,11 +17,17 @@ struct cpuid_bit { u32 sub_leaf; }; -enum cpuid_regs { - CR_EAX = 0, - CR_ECX, - CR_EDX, - CR_EBX +/* Please keep the leaf sorted. */ +static const struct cpuid_bit cpuid_bits[] = { + { X86_FEATURE_APERFMPERF, CR_ECX, 0, 0x0006, 0 }, + { X86_FEATURE_EPB, CR_ECX, 3, 0x0006, 0 }, + { X86_FEATURE_INTEL_PT, CR_EBX, 25, 0x0007, 0 }, + { X86_FEATURE_AVX512_4VNNIW,CR_EDX, 2, 0x0007, 0 }, + { X86_FEATURE_AVX512_4FMAPS,CR_EDX, 3, 0x0007, 0 }, + { X86_FEATURE_HW_PSTATE,CR_EDX, 7, 0x8007, 0 }, + { X86_FEATURE_CPB, CR_EDX, 9, 0x8007, 0 }, + { X86_FEATURE_PROC_FEEDBACK,CR_EDX, 11, 0x8007, 0 }, + { 0, 0, 0, 0, 0 } }; void init_scattered_cpuid_features(struct cpuinfo_x86 *c) @@ -30,18 +36,6 @@ void init_scattered_cpuid_features(struct cpuinfo_x86 *c) u32 regs[4]; const struct cpuid_bit *cb; - static const struct cpuid_bit cpuid_bits[] = { - { X86_FEATURE_INTEL_PT, CR_EBX,25, 0x0007, 0 }, - { X86_FEATURE_AVX512_4VNNIW,CR_EDX, 2, 0x0007, 0 }, - { X86_FEATURE_AVX512_4FMAPS,CR_EDX, 3, 0x0007, 0 }, - { X86_FEATURE_APERFMPERF, CR_ECX, 0, 0x0006, 0 }, - { X86_FEATURE_EPB, CR_ECX, 3, 0x0006, 0 }, - { X86_FEATURE_HW_PSTATE,CR_EDX, 7, 0x8007, 0 }, - { X86_FEATURE_CPB, CR_EDX, 9, 0x8007, 0 }, - { X86_FEATURE_PROC_FEEDBACK,CR_EDX,11, 0x8007, 0 }, - { 0, 0, 0, 0, 0 } - }; - for (cb = cpuid_bits; cb->feature; cb++) { /* Verify that the level is valid */ @@ -57,3 +51,27 @@ void init_scattered_cpuid_features(struct cpuinfo_x86 *c) set_cpu_cap(c, cb->feature); } } + +u32 get_scattered_cpuid_leaf(unsigned int level, unsigned int sub_leaf, +enum cpuid_regs_idx reg) +{ + u32 cpuid_val = 0; + const struct cpuid_bit *cb; + + for (cb = cpuid_bits; cb->feature; cb++) { + + if (level > cb->level) + continue; + + if (level < cb->level) + break; + + if (reg == cb->reg && sub_leaf == cb->sub_leaf) { + if (cpu_has(_cpu_data, cb->feature)) + cpuid_val |= BIT(cb->bit); + } + } + + return cpuid_val; +} 
+EXPORT_SYMBOL_GPL(get_scattered_cpuid_leaf
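The lookup performed by `get_scattered_cpuid_leaf` can be tried in user space with a small model. This is a sketch under stated assumptions, not the kernel code: the `FEAT_*` ids, the `present` bitmap (standing in for `cpu_has()` on the boot CPU) and the name `rebuild_leaf` are all hypothetical, and the table holds only a few sample entries. As in the patch, the table must stay sorted by leaf so that the loop can stop early.

```c
#include <stdint.h>

enum cpuid_regs_idx { CPUID_EAX = 0, CPUID_EBX, CPUID_ECX, CPUID_EDX };

/* One scattered feature: where its bit lives in the real CPUID output. */
struct cpuid_bit {
	uint16_t feature;	/* synthetic feature id; 0 terminates the table */
	uint8_t reg;		/* enum cpuid_regs_idx */
	uint8_t bit;		/* bit position within the register */
	uint32_t level;		/* CPUID leaf */
	uint32_t sub_leaf;
};

/* Hypothetical feature ids used only by this sketch. */
enum { FEAT_EPB = 1, FEAT_INTEL_PT, FEAT_4VNNIW, FEAT_4FMAPS };

/* Sample table, kept sorted by leaf. */
static const struct cpuid_bit cpuid_bits[] = {
	{ FEAT_EPB,      CPUID_ECX,  3, 0x00000006, 0 },
	{ FEAT_INTEL_PT, CPUID_EBX, 25, 0x00000007, 0 },
	{ FEAT_4VNNIW,   CPUID_EDX,  2, 0x00000007, 0 },
	{ FEAT_4FMAPS,   CPUID_EDX,  3, 0x00000007, 0 },
	{ 0, 0, 0, 0, 0 }
};

/*
 * Rebuild one register of a real CPUID leaf from the scattered table.
 * "present" has bit N set when feature id N is available.
 */
static uint32_t rebuild_leaf(uint64_t present, uint32_t level,
			     uint32_t sub_leaf, enum cpuid_regs_idx reg)
{
	uint32_t val = 0;
	const struct cpuid_bit *cb;

	for (cb = cpuid_bits; cb->feature; cb++) {
		if (level > cb->level)
			continue;	/* requested leaf not reached yet */
		if (level < cb->level)
			break;		/* sorted table: no later match possible */

		if (reg == cb->reg && sub_leaf == cb->sub_leaf &&
		    (present & ((uint64_t)1 << cb->feature)))
			val |= (uint32_t)1 << cb->bit;
	}

	return val;
}
```

With only `FEAT_INTEL_PT` and `FEAT_4VNNIW` marked present, asking for leaf 7, sub-leaf 0, EDX gives a value with just bit 2 set, and asking for EBX gives bit 25.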
Re: [PATCH] x86/cpuid: expose AVX512_4VNNIW and AVX512_4FMAPS features to kvm guest
On Mon, Oct 31, 2016 at 12:41:32PM +0100, Paolo Bonzini wrote:
>
>
> On 31/10/2016 12:05, Borislav Petkov wrote:
> > On Mon, Oct 31, 2016 at 11:47:48AM +0100, Paolo Bonzini wrote:
> >> The information is all in arch/x86/kernel/cpu/scattered.c's cpuid_bits
> >> array. Borislav, would it be okay to export the cpuid_regs enum?
> >
> > Yeah, and kill the duplicated one in arch/x86/events/intel/pt.c too
> > please, while at it.
> >
> > I'd still put it all in arch/x86/kernel/cpu/scattered.c so that it is
> > close-by and call it from outside.
>
> Good. Chen, are you going to do this?
>

Sure.

Before sending a patch, let me check that my understanding is right.

I will add a helper in scattered.c like:

	unsigned int get_scattered_cpuid_features(unsigned int level,
			unsigned int sub_leaf, enum cpuid_regs reg)
	{
		u32 val = 0;
		const struct cpuid_bit *cb;

		for (cb = cpuid_bits; cb->feature; cb++) {
			if (reg == cb->reg &&
			    level == cb->level &&
			    sub_leaf == cb->sub_leaf &&
			    boot_cpu_has(cb->feature))
				/* cb->bit is a bit position, so turn it into a mask */
				val |= BIT(cb->bit);
		}

		return val;
	}

And when KVM wants to mask out features, it can call the helper from outside like:

	entry->edx &= kvm_cpuid_7_0_edx_x86_features;
	entry->edx &= get_scattered_cpuid_features(7, 0, CR_EDX);

Thanks,
-He
[PATCH v2] x86/cpuid: expose AVX512_4VNNIW and AVX512_4FMAPS features to kvm guest
The spec can be found in Intel Software Developer Manual or in
Instruction Set Extensions Programming Reference.

Signed-off-by: Luwei Kang <luwei.k...@intel.com>
Signed-off-by: He Chen <he.c...@linux.intel.com>
---
Changes in v2:
* add new macros for new AVX512 scattered features
* add a cpuid_count_edx function to processor.h
---
 arch/x86/include/asm/processor.h |  9 +++++++++
 arch/x86/kvm/cpuid.c             | 13 ++++++++++++-
 2 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 984a7bf..e5ad7a74 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -577,6 +577,15 @@ static inline unsigned int cpuid_edx(unsigned int op)
 	return edx;
 }

+static inline unsigned int cpuid_count_edx(unsigned op, unsigned count)
+{
+	unsigned int eax, ebx, ecx, edx;
+
+	cpuid_count(op, count, &eax, &ebx, &ecx, &edx);
+
+	return edx;
+}
+
 /* REP NOP (PAUSE) is a good thing to insert into busy-wait loops. */
 static __always_inline void rep_nop(void)
 {
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index afa7bbb..9990e7a 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -65,6 +65,11 @@ u64 kvm_supported_xcr0(void)

 #define F(x) bit(X86_FEATURE_##x)

+/* These are scattered features in cpufeatures.h. */
+#define KVM_CPUID_BIT_AVX512_4VNNIW     2
+#define KVM_CPUID_BIT_AVX512_4FMAPS     3
+#define KF(x) bit(KVM_CPUID_BIT_##x)
+
 int kvm_update_cpuid(struct kvm_vcpu *vcpu)
 {
 	struct kvm_cpuid_entry2 *best;
@@ -376,6 +381,10 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 	/* cpuid 7.0.ecx*/
 	const u32 kvm_cpuid_7_0_ecx_x86_features = F(PKU) | 0 /*OSPKE*/;

+	/* cpuid 7.0.edx*/
+	const u32 kvm_cpuid_7_0_edx_x86_features =
+		KF(AVX512_4VNNIW) | KF(AVX512_4FMAPS);
+
 	/* all calls to cpuid_count() should be made on the same cpu */
 	get_cpu();

@@ -458,12 +467,14 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 			/* PKU is not yet implemented for shadow paging. */
 			if (!tdp_enabled)
 				entry->ecx &= ~F(PKU);
+			entry->edx &= kvm_cpuid_7_0_edx_x86_features;
+			entry->edx &= cpuid_count_edx(7, 0);
 		} else {
 			entry->ebx = 0;
 			entry->ecx = 0;
+			entry->edx = 0;
 		}
 		entry->eax = 0;
-		entry->edx = 0;
 		break;
 	}
 	case 9:
--
2.7.4
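The effect of the last hunk can be sketched as a pure function: the guest-visible EDX of leaf 7 keeps only bits that are both on KVM's allow-list and set in the host's own CPUID output, and is zeroed in the other branch. This is a minimal user-space model, not the kernel code; `filter_leaf7_edx()` and its `leaf_reported` flag are hypothetical names, the flag standing in for whichever condition guards the branch (the hunk does not show it), and `host_edx` stands in for `cpuid_count_edx(7, 0)`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions taken from the patch above. */
#define KVM_CPUID_BIT_AVX512_4VNNIW 2
#define KVM_CPUID_BIT_AVX512_4FMAPS 3
#define KF(x) (UINT32_C(1) << KVM_CPUID_BIT_##x)

/*
 * Hypothetical model of the EDX handling for CPUID leaf 7.
 * entry_edx:     EDX value before filtering
 * host_edx:      what the host CPU reports for the same leaf
 * leaf_reported: true for the branch that exposes features (assumed)
 */
static uint32_t filter_leaf7_edx(uint32_t entry_edx, uint32_t host_edx,
				 bool leaf_reported)
{
	const uint32_t kvm_supported = KF(AVX512_4VNNIW) | KF(AVX512_4FMAPS);

	if (!leaf_reported)
		return 0;               /* mirrors "entry->edx = 0" in the else branch */

	entry_edx &= kvm_supported;     /* drop bits KVM does not expose */
	entry_edx &= host_edx;          /* drop bits the host lacks */
	return entry_edx;
}
```

For example, a host that reports only bit 2 in EDX yields `0x4` for the guest no matter how many bits the entry started with.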
Re: [PATCH] x86/cpuid: expose AVX512_4VNNIW and AVX512_4FMAPS features to kvm guest
On Fri, Oct 28, 2016 at 11:54:13AM +0200, Paolo Bonzini wrote:
>
>
> On 28/10/2016 11:46, He Chen wrote:
> > On Fri, Oct 28, 2016 at 11:31:05AM +0200, Paolo Bonzini wrote:
> >>
> >>
> >> On 28/10/2016 11:12, He Chen wrote:
> >>> The spec can be found in Intel Software Developer Manual or in
> >>> Instruction Set Extensions Programming Reference.
> >>>
> >>> Signed-off-by: Luwei Kang <luwei.k...@intel.com>
> >>> Signed-off-by: He Chen <he.c...@linux.intel.com>
> >>> ---
> >>>  arch/x86/kvm/cpuid.c | 7 ++++++-
> >>>  1 file changed, 6 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> >>> index afa7bbb..328b169 100644
> >>> --- a/arch/x86/kvm/cpuid.c
> >>> +++ b/arch/x86/kvm/cpuid.c
> >>> @@ -376,6 +376,10 @@ static inline int __do_cpuid_ent(struct
> >>> kvm_cpuid_entry2 *entry, u32 function,
> >>>  	/* cpuid 7.0.ecx*/
> >>>  	const u32 kvm_cpuid_7_0_ecx_x86_features = F(PKU) | 0 /*OSPKE*/;
> >>>
> >>> +	/* cpuid 7.0.edx*/
> >>> +	const u32 kvm_cpuid_7_0_edx_x86_features =
> >>> +		0x4 /* AVX512-4VNNIW */ | 0x8 /* AVX512-4FMAPS */;
> >>
> >> Please define the new features in cpufeature.h first.
> >>
> > These 2 new features defined as scattered features in kernel.
> > In cpufeature.h, there are:
> > #define X86_FEATURE_AVX512_4VNNIW (7*32+16)
> > #define X86_FEATURE_AVX512_4FMAPS (7*32+17)
> >
> > Please see discussion here:
> > https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1250183.html
>
> Uff, that sucks. :( I'd agree with hpa's position in that thread.
>
> Please do something like
>
> /* These are scattered features in cpufeature.h. */
> #define KVM_CPUID_BIT_AVX512_4VNNIW 2
> #define KVM_CPUID_BIT_AVX512_4FMAPS 3
> #define KF(x) bit(KVM_CPUID_BIT_##x)
>
> and then
>
> const u32 kvm_cpuid_7_0_edx_x86_features =
> 	KF(AVX512_4VNNIW) | KF(AVX512_4FMAPS)
>
> I'll think of a trick to avoid using F for scattered features...
>

Appreciate it :-)
Re: [PATCH] x86/cpuid: expose AVX512_4VNNIW and AVX512_4FMAPS features to kvm guest
On Fri, Oct 28, 2016 at 11:31:05AM +0200, Paolo Bonzini wrote:
>
>
> On 28/10/2016 11:12, He Chen wrote:
> > The spec can be found in Intel Software Developer Manual or in
> > Instruction Set Extensions Programming Reference.
> >
> > Signed-off-by: Luwei Kang <luwei.k...@intel.com>
> > Signed-off-by: He Chen <he.c...@linux.intel.com>
> > ---
> >  arch/x86/kvm/cpuid.c | 7 ++++++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> > index afa7bbb..328b169 100644
> > --- a/arch/x86/kvm/cpuid.c
> > +++ b/arch/x86/kvm/cpuid.c
> > @@ -376,6 +376,10 @@ static inline int __do_cpuid_ent(struct
> > kvm_cpuid_entry2 *entry, u32 function,
> >  	/* cpuid 7.0.ecx*/
> >  	const u32 kvm_cpuid_7_0_ecx_x86_features = F(PKU) | 0 /*OSPKE*/;
> >
> > +	/* cpuid 7.0.edx*/
> > +	const u32 kvm_cpuid_7_0_edx_x86_features =
> > +		0x4 /* AVX512-4VNNIW */ | 0x8 /* AVX512-4FMAPS */;
>
> Please define the new features in cpufeature.h first.
>

These 2 new features are defined as scattered features in the kernel.
In cpufeature.h, there are:

#define X86_FEATURE_AVX512_4VNNIW (7*32+16)
#define X86_FEATURE_AVX512_4FMAPS (7*32+17)

Please see the discussion here:
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1250183.html

Thanks,
-He
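A small sketch may help show why the existing `F()` macro cannot simply be reused for these two features. It assumes that `bit()` reduces a feature number to its position within a 32-bit word (`bitno & 31`); that behavior and the helper `feature_word()` are assumptions made for illustration. The hardware masks `0x4` and `0x8` are the ones used in the original patch.

```c
#include <stdint.h>

/* Feature numbers as quoted from cpufeature.h in the mail above. */
#define X86_FEATURE_AVX512_4VNNIW (7 * 32 + 16)
#define X86_FEATURE_AVX512_4FMAPS (7 * 32 + 17)

/* ASSUMED behavior of bit(): keep only the position inside a 32-bit word. */
static uint32_t bit(int bitno)
{
	return UINT32_C(1) << (bitno & 31);
}

#define F(x) bit(X86_FEATURE_##x)

/* Hypothetical helper: which 32-bit feature word a feature number falls in. */
static uint32_t feature_word(int feature)
{
	return (uint32_t)feature / 32;
}

/* Masks the patch uses for CPUID.(EAX=7,ECX=0):EDX. */
#define HW_EDX_AVX512_4VNNIW UINT32_C(0x4)
#define HW_EDX_AVX512_4FMAPS UINT32_C(0x8)
```

Under these assumptions `F(AVX512_4VNNIW)` evaluates to `0x10000` (bit 16 of the kernel's synthetic word 7), while the hardware reports the feature in EDX bit 2 (`0x4`). For a scattered feature, the position in the kernel's feature word says nothing about the position in the CPUID register, so a separate mapping is needed.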
[PATCH] x86/cpuid: expose AVX512_4VNNIW and AVX512_4FMAPS features to kvm guest
The spec can be found in Intel Software Developer Manual or in
Instruction Set Extensions Programming Reference.

Signed-off-by: Luwei Kang <luwei.k...@intel.com>
Signed-off-by: He Chen <he.c...@linux.intel.com>
---
 arch/x86/kvm/cpuid.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index afa7bbb..328b169 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -376,6 +376,10 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 	/* cpuid 7.0.ecx*/
 	const u32 kvm_cpuid_7_0_ecx_x86_features = F(PKU) | 0 /*OSPKE*/;

+	/* cpuid 7.0.edx*/
+	const u32 kvm_cpuid_7_0_edx_x86_features =
+		0x4 /* AVX512-4VNNIW */ | 0x8 /* AVX512-4FMAPS */;
+
 	/* all calls to cpuid_count() should be made on the same cpu */
 	get_cpu();

@@ -458,12 +462,13 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
 			/* PKU is not yet implemented for shadow paging. */
 			if (!tdp_enabled)
 				entry->ecx &= ~F(PKU);
+			entry->edx &= kvm_cpuid_7_0_edx_x86_features;
 		} else {
 			entry->ebx = 0;
 			entry->ecx = 0;
+			entry->edx = 0;
 		}
 		entry->eax = 0;
-		entry->edx = 0;
 		break;
 	}
 	case 9:
--
2.7.4