[tip:x86/urgent] x86/resctrl: Prevent NULL pointer dereference when local MBM is disabled
Commit-ID:  c7563e62a6d720aa3b068e26ddffab5f0df29263
Gitweb:     https://git.kernel.org/tip/c7563e62a6d720aa3b068e26ddffab5f0df29263
Author:     Prarit Bhargava
AuthorDate: Mon, 10 Jun 2019 13:15:44 -0400
Committer:  Thomas Gleixner
CommitDate: Wed, 12 Jun 2019 10:31:50 +0200

x86/resctrl: Prevent NULL pointer dereference when local MBM is disabled

Booting with kernel parameter "rdt=cmt,mbmtotal,memlocal,l3cat,mba" and
executing "mount -t resctrl resctrl -o mba_MBps /sys/fs/resctrl" results in
a NULL pointer dereference on systems which do not have local MBM support
enabled.

  BUG: kernel NULL pointer dereference, address: 0020
  PGD 0 P4D 0
  Oops: [#1] SMP PTI
  CPU: 0 PID: 722 Comm: kworker/0:3 Not tainted 5.2.0-0.rc3.git0.1.el7_UNSUPPORTED.x86_64 #2
  Workqueue: events mbm_handle_overflow
  RIP: 0010:mbm_handle_overflow+0x150/0x2b0

Only enter the bandwidth update loop if the system has local MBM enabled.

Fixes: de73f38f7680 ("x86/intel_rdt/mba_sc: Feedback loop to dynamically update mem bandwidth")
Signed-off-by: Prarit Bhargava
Signed-off-by: Thomas Gleixner
Cc: Fenghua Yu
Cc: Reinette Chatre
Cc: Borislav Petkov
Cc: "H. Peter Anvin"
Cc: sta...@vger.kernel.org
Link: https://lkml.kernel.org/r/20190610171544.13474-1-pra...@redhat.com
---
 arch/x86/kernel/cpu/resctrl/monitor.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 1573a0a6b525..ff6e8e561405 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -368,6 +368,9 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_domain *dom_mbm)
 	struct list_head *head;
 	struct rdtgroup *entry;

+	if (!is_mbm_local_enabled())
+		return;
+
 	r_mba = &rdt_resources_all[RDT_RESOURCE_MBA];
 	closid = rgrp->closid;
 	rmid = rgrp->mon.rmid;
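
The guard added above can be illustrated outside the kernel. The following is a minimal user-space sketch, not the resctrl code: struct mbm_state, struct domain, local_mbm_enabled and update_bw() are hypothetical names, and the premise that the per-domain local MBM state is simply absent (NULL) when the feature is off is an assumption made for the illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for per-domain MBM state; not the kernel's struct. */
struct mbm_state {
	unsigned long prev_bw;
};

struct domain {
	struct mbm_state *mbm_local;	/* assumed NULL when local MBM is off */
};

/* Hypothetical flag playing the role of is_mbm_local_enabled(). */
bool local_mbm_enabled;

/* Returns the previous bandwidth, or -1 if the update was skipped. */
long update_bw(const struct domain *d)
{
	/* Bail out before d->mbm_local is ever dereferenced. */
	if (!local_mbm_enabled)
		return -1;

	return (long)d->mbm_local->prev_bw;
}
```

Without the early return, calling update_bw() on a domain whose mbm_local is NULL would fault in the same way the oops above describes.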
[tip:x86/urgent] x86/microcode: Make sure boot_cpu_data.microcode is up-to-date
Commit-ID:  370a132bb2227ff76278f98370e0e701d86ff752
Gitweb:     https://git.kernel.org/tip/370a132bb2227ff76278f98370e0e701d86ff752
Author:     Prarit Bhargava
AuthorDate: Tue, 31 Jul 2018 07:27:39 -0400
Committer:  Thomas Gleixner
CommitDate: Sun, 2 Sep 2018 14:10:54 +0200

x86/microcode: Make sure boot_cpu_data.microcode is up-to-date

When preparing an MCE record for logging, boot_cpu_data.microcode is used
to read out the microcode revision on the box.

However, on systems where late microcode update has happened, the microcode
revision output in a MCE log record is wrong because
boot_cpu_data.microcode is not updated when the microcode gets updated.

But, the microcode revision saved in boot_cpu_data's microcode member
should be kept up-to-date, regardless, for consistency.

Make it so.

Fixes: fa94d0c6e0f3 ("x86/MCE: Save microcode revision in machine check records")
Signed-off-by: Prarit Bhargava
Signed-off-by: Borislav Petkov
Signed-off-by: Thomas Gleixner
Cc: Tony Luck
Cc: sir...@amazon.de
Cc: sta...@vger.kernel.org
Link: http://lkml.kernel.org/r/20180731112739.32338-1-pra...@redhat.com
---
 arch/x86/kernel/cpu/microcode/amd.c   | 4 ++++
 arch/x86/kernel/cpu/microcode/intel.c | 4 ++++
 2 files changed, 8 insertions(+)

diff --git a/arch/x86/kernel/cpu/microcode/amd.c b/arch/x86/kernel/cpu/microcode/amd.c
index 0624957aa068..602f17134103 100644
--- a/arch/x86/kernel/cpu/microcode/amd.c
+++ b/arch/x86/kernel/cpu/microcode/amd.c
@@ -537,6 +537,10 @@ static enum ucode_state apply_microcode_amd(int cpu)
 	uci->cpu_sig.rev = mc_amd->hdr.patch_id;
 	c->microcode = mc_amd->hdr.patch_id;

+	/* Update boot_cpu_data's revision too, if we're on the BSP: */
+	if (c->cpu_index == boot_cpu_data.cpu_index)
+		boot_cpu_data.microcode = mc_amd->hdr.patch_id;
+
 	return UCODE_UPDATED;
 }

diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
index 97ccf4c3b45b..256d336cbc04 100644
--- a/arch/x86/kernel/cpu/microcode/intel.c
+++ b/arch/x86/kernel/cpu/microcode/intel.c
@@ -851,6 +851,10 @@ static enum ucode_state apply_microcode_intel(int cpu)
 	uci->cpu_sig.rev = rev;
 	c->microcode = rev;

+	/* Update boot_cpu_data's revision too, if we're on the BSP: */
+	if (c->cpu_index == boot_cpu_data.cpu_index)
+		boot_cpu_data.microcode = rev;
+
 	return UCODE_UPDATED;
 }
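
The effect of the added check can be shown with a small user-space model. This is only a sketch under stated assumptions: struct cpuinfo, boot_cpu and apply_revision() are hypothetical stand-ins for cpuinfo_x86, boot_cpu_data and the apply_microcode_*() paths, and the revision values are made up.

```c
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for struct cpuinfo_x86. */
struct cpuinfo {
	int cpu_index;
	uint32_t microcode;
};

/* Plays the role of boot_cpu_data: a separate copy of the BSP's data. */
struct cpuinfo boot_cpu = { .cpu_index = 0, .microcode = 0x10 };

/* Record a new revision for one CPU, mirroring it if that CPU is the BSP. */
void apply_revision(struct cpuinfo *c, uint32_t rev)
{
	c->microcode = rev;

	/* Keep the boot CPU copy in sync so later readers see the new rev. */
	if (c->cpu_index == boot_cpu.cpu_index)
		boot_cpu.microcode = rev;
}
```

Updating a non-boot CPU leaves boot_cpu.microcode untouched; only an update on the CPU whose index matches the boot CPU refreshes the copy that the MCE logging path reads.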
[tip:x86/urgent] x86/smpboot: Do not use smp_num_siblings in __max_logical_packages calculation
Commit-ID:  947134d9b00f342415af7eddd42a5fce7262a1b9
Gitweb:     https://git.kernel.org/tip/947134d9b00f342415af7eddd42a5fce7262a1b9
Author:     Prarit Bhargava
AuthorDate: Mon, 4 Dec 2017 11:45:21 -0500
Committer:  Thomas Gleixner
CommitDate: Thu, 7 Dec 2017 10:28:22 +0100

x86/smpboot: Do not use smp_num_siblings in __max_logical_packages calculation

Documentation/x86/topology.txt defines smp_num_siblings as "The number of
threads in a core". Since commit bbb65d2d365e ("x86: use cpuid vector 0xb
when available for detecting cpu topology") smp_num_siblings is the
maximum number of threads in a core. If Simultaneous MultiThreading
(SMT) is disabled on a system, smp_num_siblings is 2 and not 1 as
expected.

Use topology_max_smt_threads(), which contains the active number of threads,
in the __max_logical_packages calculation.

On a single socket, single core, single thread system __max_smt_threads has
not been updated when the __max_logical_packages calculation happens, so it
is zero, which makes the package estimate fail. Initialize it to one, which
is the minimum number of threads on a core.

[ tglx: Folded the __max_smt_threads fix in ]

Fixes: b4c0a7326f5d ("x86/smpboot: Fix __max_logical_packages estimate")
Reported-by: Jakub Kicinski
Signed-off-by: Prarit Bhargava
Tested-by: Jakub Kicinski
Cc: net...@vger.kernel.org
Cc: "net...@vger.kernel.org"
Cc: Clark Williams
Link: https://lkml.kernel.org/r/20171204164521.17870-1-pra...@redhat.com
---
 arch/x86/kernel/smpboot.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 05a97d5..35cb20994 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -106,7 +106,7 @@ EXPORT_SYMBOL(__max_logical_packages);
 static unsigned int logical_packages __read_mostly;

 /* Maximum number of SMT threads on any online core */
-int __max_smt_threads __read_mostly;
+int __read_mostly __max_smt_threads = 1;

 /* Flag to indicate if a complete sched domain rebuild is required */
 bool x86_topology_update;
@@ -1304,7 +1304,7 @@ void __init native_smp_cpus_done(unsigned int max_cpus)
 	 * Today neither Intel nor AMD support heterogenous systems so
 	 * extrapolate the boot cpu's data to all packages.
 	 */
-	ncpus = cpu_data(0).booted_cores * smp_num_siblings;
+	ncpus = cpu_data(0).booted_cores * topology_max_smt_threads();
 	__max_logical_packages = DIV_ROUND_UP(nr_cpu_ids, ncpus);
 	pr_info("Max logical packages: %u\n", __max_logical_packages);

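
To see why the thread count matters, the estimate can be replayed in user space. The sketch below is illustrative only: estimate_packages() is a hypothetical helper, and the machine (two packages, four booted cores each, SMT disabled, so eight possible CPUs) is an assumed example, not one taken from the report.

```c
#include <assert.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Hypothetical helper: estimate the package count from the number of
 * possible CPUs, the boot CPU's booted cores, and the threads per core.
 */
unsigned int estimate_packages(unsigned int nr_cpu_ids,
			       unsigned int booted_cores,
			       unsigned int threads_per_core)
{
	unsigned int ncpus = booted_cores * threads_per_core;

	/* A thread count of zero would divide by zero below. */
	assert(ncpus > 0);
	return DIV_ROUND_UP(nr_cpu_ids, ncpus);
}
```

With the maximum thread count (2) the assumed machine yields estimate_packages(8, 4, 2) == 1, one package too few; with the active thread count (1) it yields estimate_packages(8, 4, 1) == 2. The assert marks the case the "= 1" initialization avoids: a thread count that is still zero when the estimate is computed.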
[tip:x86/urgent] x86/smpboot: Do not use smp_num_siblings in __max_logical_packages calculation
Commit-ID:  b1cbacc8663a4dce62e4ae501e859c82f4aeb1ca
Gitweb:     https://git.kernel.org/tip/b1cbacc8663a4dce62e4ae501e859c82f4aeb1ca
Author:     Prarit Bhargava
AuthorDate: Mon, 4 Dec 2017 11:45:21 -0500
Committer:  Thomas Gleixner
CommitDate: Mon, 4 Dec 2017 23:03:48 +0100

x86/smpboot: Do not use smp_num_siblings in __max_logical_packages calculation

Documentation/x86/topology.txt defines smp_num_siblings as "The number of
threads in a core". Since commit bbb65d2d365e ("x86: use cpuid vector 0xb
when available for detecting cpu topology") smp_num_siblings is the
maximum number of threads in a core. If Simultaneous MultiThreading
(SMT) is disabled on a system, smp_num_siblings is 2 and not 1 as
expected.

Use topology_max_smt_threads(), which contains the active number of threads,
in the __max_logical_packages calculation.

Fixes: b4c0a7326f5d ("x86/smpboot: Fix __max_logical_packages estimate")
Reported-by: Jakub Kicinski
Signed-off-by: Prarit Bhargava
Tested-by: Jakub Kicinski
Cc: net...@vger.kernel.org
Cc: "net...@vger.kernel.org"
Cc: Clark Williams
Link: https://lkml.kernel.org/r/20171204164521.17870-1-pra...@redhat.com
---
 arch/x86/kernel/smpboot.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 05a97d5..7de0aa2 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1304,7 +1304,7 @@ void __init native_smp_cpus_done(unsigned int max_cpus)
 	 * Today neither Intel nor AMD support heterogenous systems so
 	 * extrapolate the boot cpu's data to all packages.
 	 */
-	ncpus = cpu_data(0).booted_cores * smp_num_siblings;
+	ncpus = cpu_data(0).booted_cores * topology_max_smt_threads();
 	__max_logical_packages = DIV_ROUND_UP(nr_cpu_ids, ncpus);
 	pr_info("Max logical packages: %u\n", __max_logical_packages);

[tip:x86/urgent] x86/smpboot: Fix __max_logical_packages estimate
Commit-ID:  b4c0a7326f5dc0ef7a64128b0ae7d081f4b2cbd1
Gitweb:     https://git.kernel.org/tip/b4c0a7326f5dc0ef7a64128b0ae7d081f4b2cbd1
Author:     Prarit Bhargava
AuthorDate: Tue, 14 Nov 2017 07:42:57 -0500
Committer:  Thomas Gleixner
CommitDate: Fri, 17 Nov 2017 16:22:31 +0100

x86/smpboot: Fix __max_logical_packages estimate

A system booted with a small number of cores enabled per package
panics because the estimate of __max_logical_packages is too low.

This occurs when the total number of active cores across all packages
is less than the maximum core count for a single package, e.g.:

On a 4 package system with 20 cores/package where only 4 cores are
enabled on each package, the value of __max_logical_packages is
calculated as DIV_ROUND_UP(16, 20) = 1 and not 4.

Calculate __max_logical_packages after the cpu enumeration has completed.
Use the boot cpu's data to extrapolate the number of packages.

Signed-off-by: Prarit Bhargava
Signed-off-by: Thomas Gleixner
Cc: Tom Lendacky
Cc: Andi Kleen
Cc: Christian Borntraeger
Cc: Peter Zijlstra
Cc: Kan Liang
Cc: He Chen
Cc: Stephane Eranian
Cc: Dave Hansen
Cc: Piotr Luc
Cc: Andy Lutomirski
Cc: Arvind Yadav
Cc: Vitaly Kuznetsov
Cc: Borislav Petkov
Cc: Tim Chen
Cc: Mathias Krause
Cc: "Kirill A. Shutemov"
Link: https://lkml.kernel.org/r/20171114124257.22013-4-pra...@redhat.com
---
 arch/x86/kernel/smpboot.c | 55 +--
 1 file changed, 10 insertions(+), 45 deletions(-)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index da5e162..3d01df7 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -310,12 +310,6 @@ int topology_update_package_map(unsigned int pkg, unsigned int cpu)
 	if (new >= 0)
 		goto found;

-	if (logical_packages >= __max_logical_packages) {
-		pr_warn("Package %u of CPU %u exceeds BIOS package data %u.\n",
-			logical_packages, cpu, __max_logical_packages);
-		return -ENOSPC;
-	}
-
 	new = logical_packages++;
 	if (new != pkg) {
 		pr_info("CPU %u Converting physical %u to logical package %u\n",
@@ -326,44 +320,6 @@ found:
 	return 0;
 }

-static void __init smp_init_package_map(struct cpuinfo_x86 *c, unsigned int cpu)
-{
-	unsigned int ncpus;
-
-	/*
-	 * Today neither Intel nor AMD support heterogenous systems. That
-	 * might change in the future
-	 *
-	 * While ideally we'd want '* smp_num_siblings' in the below @ncpus
-	 * computation, this won't actually work since some Intel BIOSes
-	 * report inconsistent HT data when they disable HT.
-	 *
-	 * In particular, they reduce the APIC-IDs to only include the cores,
-	 * but leave the CPUID topology to say there are (2) siblings.
-	 * This means we don't know how many threads there will be until
-	 * after the APIC enumeration.
-	 *
-	 * By not including this we'll sometimes over-estimate the number of
-	 * logical packages by the amount of !present siblings, but this is
-	 * still better than MAX_LOCAL_APIC.
-	 *
-	 * We use total_cpus not nr_cpu_ids because nr_cpu_ids can be limited
-	 * on the command line leading to a similar issue as the HT disable
-	 * problem because the hyperthreads are usually enumerated after the
-	 * primary cores.
-	 */
-	ncpus = boot_cpu_data.x86_max_cores;
-	if (!ncpus) {
-		pr_warn("x86_max_cores == zero !?!?");
-		ncpus = 1;
-	}
-
-	__max_logical_packages = DIV_ROUND_UP(total_cpus, ncpus);
-	pr_info("Max logical packages: %u\n", __max_logical_packages);
-
-	topology_update_package_map(c->phys_proc_id, cpu);
-}
-
 void __init smp_store_boot_cpu_info(void)
 {
 	int id = 0; /* CPU 0 */
@@ -371,7 +327,7 @@ void __init smp_store_boot_cpu_info(void)

 	*c = boot_cpu_data;
 	c->cpu_index = id;
-	smp_init_package_map(c, id);
+	topology_update_package_map(c->phys_proc_id, id);
 	c->initialized = true;
 }

@@ -1341,7 +1297,16 @@ void __init native_smp_prepare_boot_cpu(void)

 void __init native_smp_cpus_done(unsigned int max_cpus)
 {
+	int ncpus;
+
 	pr_debug("Boot done\n");
+	/*
+	 * Today neither Intel nor AMD support heterogenous systems so
+	 * extrapolate the boot cpu's data to all packages.
+	 */
+	ncpus = cpu_data(0).booted_cores * smp_num_siblings;
+	__max_logical_packages = DIV_ROUND_UP(nr_cpu_ids, ncpus);
+	pr_info("Max logical packages: %u\n", __max_logical_packages);

 	if (x86_has_numa_in_package)
 		set_sched_topology(x86_numa_in_package_topology);
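
The arithmetic from the changelog can be replayed in user space. This is a sketch, not the kernel code: old_estimate() and new_estimate() are hypothetical helpers that only mirror the two formulas in the diff, and the inputs for the new formula (16 possible CPUs, one thread per core) are assumptions chosen to match the 4 package, 4 cores per package example.

```c
#include <assert.h>

/* Round-up integer division, equivalent in effect to DIV_ROUND_UP(). */
static unsigned int div_round_up(unsigned int n, unsigned int d)
{
	assert(d > 0);
	return (n + d - 1) / d;
}

/* Removed formula: enumerated CPUs over the maximum cores of one package. */
unsigned int old_estimate(unsigned int total_cpus, unsigned int x86_max_cores)
{
	return div_round_up(total_cpus, x86_max_cores);
}

/* Added formula: possible CPUs over the CPUs actually booted per package. */
unsigned int new_estimate(unsigned int nr_cpu_ids, unsigned int booted_cores,
			  unsigned int smp_num_siblings)
{
	return div_round_up(nr_cpu_ids, booted_cores * smp_num_siblings);
}
```

For the changelog's machine, old_estimate(16, 20) gives 1, while new_estimate(16, 4, 1) gives 4, which matches the real package count under the stated assumptions.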
[tip:core/printk] printk: Add monotonic, boottime, and realtime timestamps
Commit-ID:  a4c1a0002f4518363da9d9ecd7b805af152dcdf1
Gitweb:     http://git.kernel.org/tip/a4c1a0002f4518363da9d9ecd7b805af152dcdf1
Author:     Prarit Bhargava
AuthorDate: Mon, 28 Aug 2017 08:21:54 -0400
Committer:  Thomas Gleixner
CommitDate: Mon, 25 Sep 2017 21:12:06 +0200

printk: Add monotonic, boottime, and realtime timestamps

printk.time=1/CONFIG_PRINTK_TIME=1 adds an unmodified local hardware clock
timestamp to printk messages. The local hardware clock loses time each
day making it difficult to determine exactly when an issue has occurred in
the kernel log, and making it difficult to determine how kernel and
hardware issues relate to each other.

Make printk output different timestamps by adding options for no
timestamp, the local hardware clock, the monotonic clock, the boottime
clock, and the clock realtime.

The default clock can be selected via:

 - Kconfig
 - Kernel command line parameter
 - Sysfs file

Note that existing user space tools might be confused by selecting clock
realtime, so handle with care.

[ jstultz: Reworked Kconfig settings to avoid defconfig noise ]

Signed-off-by: Prarit Bhargava
Signed-off-by: Thomas Gleixner
Cc: John Stultz
Cc: Joel Fernandes
Cc: Geert Uytterhoeven
Cc: linux-...@vger.kernel.org
Cc: Peter Zijlstra
Cc: Deepa Dinamani
Cc: Christoffer Dall
Cc: "Jason A. Donenfeld"
Cc: Jonathan Corbet
Cc: "Paul E. McKenney"
Cc: Petr Mladek
Cc: Kees Cook
Cc: Steven Rostedt
Cc: Nicholas Piggin
Cc: Josh Poimboeuf
Cc: Greg Kroah-Hartman
Cc: Stephen Boyd
Cc: Mark Salyzyn
Cc: Sergey Senozhatsky
Cc: "Luis R. Rodriguez"
Cc: Olof Johansson
Cc: Andrew Morton
Link: http://lkml.kernel.org/r/1503922914-10660-3-git-send-email-pra...@redhat.com
---
 Documentation/admin-guide/kernel-parameters.txt | 6 +-
 kernel/printk/printk.c | 116 +++-
 lib/Kconfig.debug | 48 +-
 3 files changed, 162 insertions(+), 8 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 0549662..9a84483 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3211,8 +3211,10 @@
 			ratelimit - ratelimit the logging
 			Default: ratelimit

-	printk.time=	Show timing data prefixed to each printk message line
-			Format: (1/Y/y=enable, 0/N/n=disable)
+	printk.time=	Show timestamp prefixed to each printk message line
+			Format:
+				(0/N/n/disable, 1/Y/y/local,
+				b/boot, m/monotonic, r/realtime (in UTC))

 	processor.max_cstate=	[HW,ACPI]
 			Limit processor to maximum C-state
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 512f7c2..4b824dd 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -576,6 +576,9 @@ static u32 truncate_msg(u16 *text_len, u16 *trunc_msg_len,
 	return msg_used_size(*text_len + *trunc_msg_len, 0, pad_len);
 }

+static u64 printk_get_first_ts(void);
+static u64 (*printk_get_ts)(void) = printk_get_first_ts;
+
 /* insert record into the buffer, discard old ones, update heads */
 static int log_store(int facility, int level,
 		     enum log_flags flags, u64 ts_nsec,
@@ -624,7 +627,7 @@ static int log_store(int facility, int level,
 	if (ts_nsec > 0)
 		msg->ts_nsec = ts_nsec;
 	else
-		msg->ts_nsec = local_clock();
+		msg->ts_nsec = printk_get_ts();
 	memset(log_dict(msg) + dict_len, 0, pad_len);
 	msg->len = size;

@@ -1201,14 +1204,116 @@ static inline void boot_delay_msec(int level)
 }
 #endif

-static bool printk_time = IS_ENABLED(CONFIG_PRINTK_TIME);
-module_param_named(time, printk_time, bool, S_IRUGO | S_IWUSR);
+/**
+ * enum timestamp_sources - Timestamp sources for printk() messages.
+ * @PRINTK_TIME_DISABLED: No time stamp.
+ * @PRINTK_TIME_LOCAL: Local hardware clock timestamp.
+ * @PRINTK_TIME_BOOT: Boottime clock timestamp.
+ * @PRINTK_TIME_MONO: Monotonic clock timestamp.
+ * @PRINTK_TIME_REAL: Realtime clock timestamp.  On 32-bit
+ * systems selecting the real clock printk timestamp may lead to unlikely
+ * situations where a timestamp is wrong because the real time offset is read
+ * without the protection of a sequence lock.
+ */
+enum timestamp_sources {
+	PRINTK_TIME_DISABLED = 0,
+	PRINTK_TIME_LOCAL = 1,
+	PRINTK_TIME_BOOT = 2,
+	PRINTK_TIME_MONO = 3,
+	PRINTK_TIME_REAL = 4,
+};
+
+static const char * const timestamp_sources_str[5] = {
+	"disabled",
+	"local",
+	"boottime",
+	"monotonic",
+	"realtime",
+};
+
+static int printk_time = CONFIG_PRINTK_TIME_TYPE;
+
+static void printk_set_ts_func(void)
+{
+	switch (printk_time) {
+	case PRINTK_TIME_LOCAL:
+	case
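
The value list documented for printk.time= can be mapped onto the enum with a small lookup table. The following user-space sketch is an assumption about one way to do that mapping, not the parser in the patch (which is cut off above): parse_printk_time() is a hypothetical helper, and it accepts exactly the tokens listed in the kernel-parameters.txt hunk and nothing else.

```c
#include <stddef.h>
#include <string.h>

/* Same values as the enum added by the patch. */
enum timestamp_sources {
	PRINTK_TIME_DISABLED = 0,
	PRINTK_TIME_LOCAL = 1,
	PRINTK_TIME_BOOT = 2,
	PRINTK_TIME_MONO = 3,
	PRINTK_TIME_REAL = 4,
};

/* Hypothetical parser: returns the source, or -1 for an unknown value. */
int parse_printk_time(const char *val)
{
	static const struct {
		const char *token;
		int source;
	} map[] = {
		{ "0", PRINTK_TIME_DISABLED }, { "N", PRINTK_TIME_DISABLED },
		{ "n", PRINTK_TIME_DISABLED }, { "disable", PRINTK_TIME_DISABLED },
		{ "1", PRINTK_TIME_LOCAL },    { "Y", PRINTK_TIME_LOCAL },
		{ "y", PRINTK_TIME_LOCAL },    { "local", PRINTK_TIME_LOCAL },
		{ "b", PRINTK_TIME_BOOT },     { "boot", PRINTK_TIME_BOOT },
		{ "m", PRINTK_TIME_MONO },     { "monotonic", PRINTK_TIME_MONO },
		{ "r", PRINTK_TIME_REAL },     { "realtime", PRINTK_TIME_REAL },
	};
	size_t i;

	for (i = 0; i < sizeof(map) / sizeof(map[0]); i++) {
		if (strcmp(val, map[i].token) == 0)
			return map[i].source;
	}
	return -1;
}
```

Under this sketch, booting with printk.time=b would select PRINTK_TIME_BOOT and printk.time=realtime would select PRINTK_TIME_REAL; the real implementation may accept a different set of spellings.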
[tip:core/printk] timekeeping: Make fast accessors return 0 before timekeeping is initialized
Commit-ID: 5df32107f609c1f621bcdac0a685c23677ef671e Gitweb: http://git.kernel.org/tip/5df32107f609c1f621bcdac0a685c23677ef671e Author: Prarit Bhargava AuthorDate: Mon, 28 Aug 2017 08:21:53 -0400 Committer: Thomas Gleixner CommitDate: Mon, 25 Sep 2017 21:05:59 +0200 timekeeping: Make fast accessors return 0 before timekeeping is initialized printk timestamps will be extended to include mono and boot time by using the fast timekeeping accessors ktime_get_mono|boot_fast_ns(). The functions can return garbage before timekeeping is initialized resulting in garbage timestamps. Initialize the fast timekeepers with dummy clocks which guarantee a 0 readout up to timekeeping_init(). Suggested-by: Peter Zijlstra Signed-off-by: Prarit Bhargava Signed-off-by: Thomas Gleixner Cc: Stephen Boyd Cc: John Stultz Link: http://lkml.kernel.org/r/1503922914-10660-2-git-send-email-pra...@redhat.com --- kernel/time/timekeeping.c | 35 +-- 1 file changed, 21 insertions(+), 14 deletions(-) diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c index 2cafb49..6a92794 100644 --- a/kernel/time/timekeeping.c +++ b/kernel/time/timekeeping.c @@ -60,8 +60,27 @@ struct tk_fast { struct tk_read_base base[2]; }; -static struct tk_fast tk_fast_mono cacheline_aligned; -static struct tk_fast tk_fast_raw cacheline_aligned; +/* Suspend-time cycles value for halted fast timekeeper. 
*/ +static u64 cycles_at_suspend; + +static u64 dummy_clock_read(struct clocksource *cs) +{ + return cycles_at_suspend; +} + +static struct clocksource dummy_clock = { + .read = dummy_clock_read, +}; + +static struct tk_fast tk_fast_mono cacheline_aligned = { + .base[0] = { .clock = &dummy_clock, }, + .base[1] = { .clock = &dummy_clock, }, +}; + +static struct tk_fast tk_fast_raw cacheline_aligned = { + .base[0] = { .clock = &dummy_clock, }, + .base[1] = { .clock = &dummy_clock, }, +}; /* flag for if timekeeping is suspended */ int __read_mostly timekeeping_suspended; @@ -477,18 +496,6 @@ u64 notrace ktime_get_boot_fast_ns(void) } EXPORT_SYMBOL_GPL(ktime_get_boot_fast_ns); -/* Suspend-time cycles value for halted fast timekeeper. */ -static u64 cycles_at_suspend; - -static u64 dummy_clock_read(struct clocksource *cs) -{ - return cycles_at_suspend; -} - -static struct clocksource dummy_clock = { - .read = dummy_clock_read, -}; - /** * halt_fast_timekeeper - Prevent fast timekeeper from accessing clocksource. * @tk: Timekeeper to snapshot.
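The idea in the timekeeping patch above can be illustrated outside the kernel. What follows is a minimal userspace sketch, not the kernel code: struct clock_src, hw_read(), fast_ns() and timekeeping_init_sketch() are hypothetical names, and the real accessors also apply a mult/shift conversion that is left out here. It only shows why pointing the accessor at a dummy source from the start gives a well-defined 0 readout instead of garbage before initialization.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a clocksource: just a read callback. */
struct clock_src {
	uint64_t (*read)(const struct clock_src *cs);
};

/* Zero-initialized, so the dummy source reads 0 until something sets it. */
static uint64_t cycles_at_suspend;

static uint64_t dummy_read(const struct clock_src *cs)
{
	(void)cs;
	return cycles_at_suspend;
}

static const struct clock_src dummy_src = { .read = dummy_read };

/* Stand-in for a real hardware counter (assumed value for the demo). */
static uint64_t fake_hw_counter = 1000;

static uint64_t hw_read(const struct clock_src *cs)
{
	(void)cs;
	return fake_hw_counter;
}

static const struct clock_src hw_src = { .read = hw_read };

/* The fast accessor starts on the dummy source, never on a NULL pointer. */
static const struct clock_src *fast_src = &dummy_src;

static uint64_t fast_ns(void)
{
	assert(fast_src != NULL && fast_src->read != NULL);
	return fast_src->read(fast_src);
}

/* Switch to the real source once "timekeeping" is ready. */
static void timekeeping_init_sketch(void)
{
	fast_src = &hw_src;
}
```

A caller such as an early printk() can then call fast_ns() at any point during boot; the readout stays at 0 until timekeeping_init_sketch() has run.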
[tip:sched/core] sched/x86: Update reschedule warning text
Commit-ID: 21173d0b4d2a0b9e9e5f3155cf2cfc5781a6f4b1 Gitweb: http://git.kernel.org/tip/21173d0b4d2a0b9e9e5f3155cf2cfc5781a6f4b1 Author: Prarit Bhargava AuthorDate: Tue, 18 Apr 2017 08:25:05 -0400 Committer: Ingo Molnar CommitDate: Thu, 20 Apr 2017 10:14:30 +0200 sched/x86: Update reschedule warning text Modify the reschedule warning to output the offline CPU number and use a better debug message. Signed-off-by: Prarit Bhargava Cc: Andrew Morton Cc: Daniel Bristot de Oliveira Cc: Hidehiro Kawai Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Steven Rostedt (VMware) Cc: Thomas Gleixner Cc: Wanpeng Li Link: http://lkml.kernel.org/r/1492518305-3808-1-git-send-email-pra...@redhat.com [ Tweaked the warning message. ] Signed-off-by: Ingo Molnar --- arch/x86/kernel/smp.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c index d3c66a1..3cab841 100644 --- a/arch/x86/kernel/smp.c +++ b/arch/x86/kernel/smp.c @@ -124,7 +124,7 @@ static bool smp_no_nmi_ipi = false; static void native_smp_send_reschedule(int cpu) { if (unlikely(cpu_is_offline(cpu))) { - WARN_ON(1); + WARN(1, "sched: Unexpected reschedule of offline CPU#%d!\n", cpu); return; } apic->send_IPI(cpu, RESCHEDULE_VECTOR);
[tip:perf/urgent] perf/x86/intel/uncore: Fix hardcoded socket 0 assumption in the Haswell init code
Commit-ID: 6d6daa20945f3f598e56e18d1f926c08754f5801 Gitweb: http://git.kernel.org/tip/6d6daa20945f3f598e56e18d1f926c08754f5801 Author: Prarit Bhargava AuthorDate: Thu, 5 Jan 2017 10:09:25 -0500 Committer: Thomas Gleixner CommitDate: Wed, 11 Jan 2017 12:13:21 +0100 perf/x86/intel/uncore: Fix hardcoded socket 0 assumption in the Haswell init code hswep_uncore_cpu_init() uses a hardcoded physical package id 0 for the boot cpu. This works as long as the boot CPU is actually on the physical package 0, which is normally the case after power on / reboot. But it fails with a NULL pointer dereference when a kdump kernel is started on a secondary socket which has a different physical package id because the logical package translation for physical package 0 does not exist. Use the logical package id of the boot cpu instead of hard coded 0. [ tglx: Rewrote changelog once more ] Fixes: cf6d445f6897 ("perf/x86/uncore: Track packages, not per CPU data") Signed-off-by: Prarit Bhargava Cc: Alexander Shishkin Cc: Arnaldo Carvalho de Melo Cc: Borislav Petkov Cc: H.
Peter Anvin Cc: Harish Chegondi Cc: Jiri Olsa Cc: Kan Liang Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Stephane Eranian Cc: Thomas Gleixner Cc: Vince Weaver Cc: sta...@vger.kernel.org Link: http://lkml.kernel.org/r/1483628965-2890-1-git-send-email-pra...@redhat.com Signed-off-by: Ingo Molnar Signed-off-by: Thomas Gleixner --- arch/x86/events/intel/uncore_snbep.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c index e6832be..dae2fed 100644 --- a/arch/x86/events/intel/uncore_snbep.c +++ b/arch/x86/events/intel/uncore_snbep.c @@ -2686,7 +2686,7 @@ static struct intel_uncore_type *hswep_msr_uncores[] = { void hswep_uncore_cpu_init(void) { - int pkg = topology_phys_to_logical_pkg(0); + int pkg = boot_cpu_data.logical_proc_id; if (hswep_uncore_cbox.num_boxes > boot_cpu_data.x86_max_cores) hswep_uncore_cbox.num_boxes = boot_cpu_data.x86_max_cores;
[tip:x86/urgent] perf/x86/intel/uncore: Do not use hard coded physical package id 0
Commit-ID: 42433049c51e326baa1f45c834af9572fdb65b35 Gitweb: http://git.kernel.org/tip/42433049c51e326baa1f45c834af9572fdb65b35 Author: Prarit Bhargava AuthorDate: Tue, 3 Jan 2017 14:24:31 -0500 Committer: Thomas Gleixner CommitDate: Wed, 11 Jan 2017 11:29:37 +0100 perf/x86/intel/uncore: Do not use hard coded physical package id 0 hswep_uncore_cpu_init() uses a hardcoded physical package id 0 for the boot cpu. This works as long as the boot CPU is actually on the physical package 0, which is normally the case after power on / reboot. But it fails with a NULL pointer dereference when a kdump kernel is started on a secondary socket which has a different physical package id because the logical package translation for physical package 0 does not exist. Use the physical package id of the boot cpu instead of hard coded 0. [ tglx: Rewrote changelog once more ] commit cf6d445f6897 ("perf/x86/uncore: Track packages, not per CPU data") Signed-off-by: Prarit Bhargava Cc: Peter Zijlstra Cc: Kan Liang Cc: Harish Chegondi Cc: Borislav Petkov Cc: sta...@vger.kernel.org Link: http://lkml.kernel.org/r/1483471471-14450-1-git-send-email-pra...@redhat.com Signed-off-by: Thomas Gleixner --- arch/x86/events/intel/uncore_snbep.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c index e6832be..b5fbb59 100644 --- a/arch/x86/events/intel/uncore_snbep.c +++ b/arch/x86/events/intel/uncore_snbep.c @@ -2686,7 +2686,7 @@ static struct intel_uncore_type *hswep_msr_uncores[] = { void hswep_uncore_cpu_init(void) { - int pkg = topology_phys_to_logical_pkg(0); + int pkg = topology_phys_to_logical_pkg(boot_cpu_data.phys_proc_id); if (hswep_uncore_cbox.num_boxes > boot_cpu_data.x86_max_cores) hswep_uncore_cbox.num_boxes = boot_cpu_data.x86_max_cores;
[tip:perf/urgent] perf/x86/intel/uncore: Fix hardcoded socket 0 assumption in the Haswell init code
Commit-ID: fa37361e291bfe528872b9aef5c8644a3fc7ff20 Gitweb: http://git.kernel.org/tip/fa37361e291bfe528872b9aef5c8644a3fc7ff20 Author: Prarit Bhargava AuthorDate: Thu, 5 Jan 2017 10:09:25 -0500 Committer: Ingo Molnar CommitDate: Sat, 7 Jan 2017 08:54:38 +0100 perf/x86/intel/uncore: Fix hardcoded socket 0 assumption in the Haswell init code On multi-socket Intel v3 processor systems (aka Haswell), kdump can crash in hswep_uncore_cpu_init(): BUG: unable to handle kernel paging request at 006563a1 IP: [] hswep_uncore_cpu_init+0x52/0xa0 The crash was introduced by the following commit: 9d85eb9119f4 ("x86/smpboot: Make logical package management more robust") ... which patch corrected the physical ID to logical ID mapping of the threads if the kdumped panic occurs on any socket other than socket 0. But hswep_uncore_cpu_init() is hard coded for physical socket 0 and if the system is kdump'ing on any other socket the logical package value will be incorrect - crashing the kdump kernel. The code should not use 0 as the physical ID, and should use the boot CPU's logical package ID in this calculation. Signed-off-by: Prarit Bhargava Cc: Alexander Shishkin Cc: Arnaldo Carvalho de Melo Cc: Borislav Petkov Cc: H. 
Peter Anvin Cc: Harish Chegondi Cc: Jiri Olsa Cc: Kan Liang Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Stephane Eranian Cc: Thomas Gleixner Cc: Vince Weaver Link: http://lkml.kernel.org/r/1483628965-2890-1-git-send-email-pra...@redhat.com Signed-off-by: Ingo Molnar --- arch/x86/events/intel/uncore_snbep.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/events/intel/uncore_snbep.c b/arch/x86/events/intel/uncore_snbep.c index e6832be..dae2fed 100644 --- a/arch/x86/events/intel/uncore_snbep.c +++ b/arch/x86/events/intel/uncore_snbep.c @@ -2686,7 +2686,7 @@ static struct intel_uncore_type *hswep_msr_uncores[] = { void hswep_uncore_cpu_init(void) { - int pkg = topology_phys_to_logical_pkg(0); + int pkg = boot_cpu_data.logical_proc_id; if (hswep_uncore_cbox.num_boxes > boot_cpu_data.x86_max_cores) hswep_uncore_cbox.num_boxes = boot_cpu_data.x86_max_cores;
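The failure mode behind the uncore fixes can be modelled in a few lines. This is a rough userspace sketch under stated assumptions: the table, register_package() and lookup_logical() are hypothetical and only mimic the idea that logical package ids are handed out in discovery order. It is not the kernel's topology code.

```c
#include <assert.h>

#define MAX_PKGS 4
#define NO_PKG   (-1)

/* Physical id -> logical id, filled in the order packages are discovered. */
static int phys_to_logical[MAX_PKGS] = { NO_PKG, NO_PKG, NO_PKG, NO_PKG };
static int next_logical;

/* Assign the next free logical id to a physical package on first sight. */
static int register_package(int phys_id)
{
	assert(phys_id >= 0 && phys_id < MAX_PKGS);
	if (phys_to_logical[phys_id] == NO_PKG)
		phys_to_logical[phys_id] = next_logical++;
	return phys_to_logical[phys_id];
}

/* Translate a physical id; NO_PKG means no translation exists. */
static int lookup_logical(int phys_id)
{
	if (phys_id < 0 || phys_id >= MAX_PKGS)
		return NO_PKG;
	return phys_to_logical[phys_id];
}
```

If the only registered package is physical package 1, as in a kdump kernel started on a secondary socket, lookup_logical(0) has no translation to return, while the value returned by register_package(1) for the boot CPU is always usable. That is the reason for taking the id from the boot CPU rather than from a constant.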
[tip:x86/urgent] arch/x86: Handle non enumerated CPU after physical hotplug
Commit-ID: 2a51fe083eba7f99cbda72f5ef90cdf2f4df882c Gitweb: http://git.kernel.org/tip/2a51fe083eba7f99cbda72f5ef90cdf2f4df882c Author: Prarit Bhargava AuthorDate: Mon, 3 Oct 2016 13:07:12 -0400 Committer: Thomas Gleixner CommitDate: Fri, 7 Oct 2016 15:22:15 +0200 arch/x86: Handle non enumerated CPU after physical hotplug When a CPU is physically added to a system then the MADT table is not updated. If subsequently a kdump kernel is started on that physically added CPU then the ACPI enumeration fails to provide the information for this CPU which is now the boot CPU of the kdump kernel. As a consequence, generic_processor_info() is not invoked for that CPU so the number of enumerated processors is 0 and none of the initializations, including the logical package id management, are performed. We have code which relies on the correctness of the logical package map and other information which is initialized via generic_processor_info(). Executing such code will result in undefined behaviour or kernel crashes. This problem applies only to the kdump kernel because a normal kexec will switch to the original boot CPU, which is enumerated in MADT, before jumping into the kexec kernel. The boot code already has a check for num_processors equal 0 in prefill_possible_map(). We can use that check as an indicator that the enumeration of the boot CPU did not happen and invoke generic_processor_info() for it. That initializes the relevant data for the boot CPU and therefore prevents subsequent failure. 
[ tglx: Refined the code and rewrote the changelog ] Signed-off-by: Prarit Bhargava Fixes: 1f12e32f4cd5 ("x86/topology: Create logical package id") Cc: Peter Zijlstra Cc: Len Brown Cc: Borislav Petkov Cc: Andi Kleen Cc: Jiri Olsa Cc: Juergen Gross Cc: dyo...@redhat.com Cc: Eric Biederman Cc: ke...@lists.infradead.org Link: http://lkml.kernel.org/r/1475514432-27682-1-git-send-email-pra...@redhat.com Signed-off-by: Thomas Gleixner --- arch/x86/kernel/smpboot.c | 18 +++--- 1 file changed, 15 insertions(+), 3 deletions(-) diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c index 42a9362..951f093 100644 --- a/arch/x86/kernel/smpboot.c +++ b/arch/x86/kernel/smpboot.c @@ -1407,9 +1407,21 @@ __init void prefill_possible_map(void) { int i, possible; - /* no processor from mptable or madt */ - if (!num_processors) - num_processors = 1; + /* No boot processor was found in mptable or ACPI MADT */ + if (!num_processors) { + int apicid = boot_cpu_physical_apicid; + int cpu = hard_smp_processor_id(); + + pr_warn("Boot CPU (id %d) not listed by BIOS\n", cpu); + + /* Make sure boot cpu is enumerated */ + if (apic->cpu_present_to_apicid(0) == BAD_APICID && + apic->apic_id_valid(apicid)) + generic_processor_info(apicid, boot_cpu_apic_version); + + if (!num_processors) + num_processors = 1; + } i = setup_max_cpus ?: 1; if (setup_possible_cpus == -1) {
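The control flow of the new prefill_possible_map() hunk can be sketched in plain C. The sketch assumes a simplified model: struct enum_state, register_cpu() and the APIC id range check are hypothetical stand-ins for the kernel's enumeration state, generic_processor_info() and apic->apic_id_valid().

```c
#include <assert.h>
#include <stdbool.h>

#define BAD_ID (-1)

/* Simplified enumeration state (hypothetical, not the kernel's data). */
struct enum_state {
	int num_processors;	/* CPUs registered so far */
	int cpu0_apicid;	/* APIC id recorded for logical CPU 0, or BAD_ID */
};

/* Assumed validity rule for the demo: ids 0..254 are acceptable. */
static bool apic_id_valid(int apicid)
{
	return apicid >= 0 && apicid < 255;
}

/* Stand-in for generic_processor_info(): record the CPU and count it. */
static void register_cpu(struct enum_state *s, int apicid)
{
	assert(apic_id_valid(apicid));
	if (s->num_processors == 0)
		s->cpu0_apicid = apicid;
	s->num_processors++;
}

/* Mirror of the patched check: enumerate the boot CPU if firmware did not. */
static void ensure_boot_cpu(struct enum_state *s, int boot_apicid)
{
	if (s->num_processors)
		return;		/* firmware tables listed at least one CPU */

	if (s->cpu0_apicid == BAD_ID && apic_id_valid(boot_apicid))
		register_cpu(s, boot_apicid);

	/* Last resort, same as the old code: pretend there is one CPU. */
	if (!s->num_processors)
		s->num_processors = 1;
}
```

The second check of num_processors keeps the old behaviour as a fallback when the boot CPU's APIC id is itself unusable.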
[tip:x86/timers] x86/tsc: Add additional Intel CPU models to the crystal quirk list
Commit-ID: 6baf3d61821f5b38f27b4e9f044ad4d1e8f3d14f Gitweb: http://git.kernel.org/tip/6baf3d61821f5b38f27b4e9f044ad4d1e8f3d14f Author: Prarit Bhargava AuthorDate: Mon, 19 Sep 2016 08:51:41 -0400 Committer: Thomas Gleixner CommitDate: Tue, 20 Sep 2016 01:00:32 +0200 x86/tsc: Add additional Intel CPU models to the crystal quirk list commit aa297292d708 ("x86/tsc: Enumerate SKL cpu_khz and tsc_khz via CPUID") added code to retrieve the crystal and TSC frequency from CPUID leaves. If the crystal frequency is enumerated as 0, the resulting TSC frequency is 0 as well. For CPUs with a known fixed crystal frequency a quirk list is available to set the frequency. Kabylake and SkylakeX CPUs are missing in the list of CPUs which need this quirk. Add them so the TSC frequency can be calculated correctly. [ tglx: Removed the silly default case as the switch() is only invoked when cpu_khz is 0. Massaged changelog. ] Signed-off-by: Prarit Bhargava Cc: Len Brown Cc: Rafael Aquini Cc: "Peter Zijlstra (Intel)" Cc: Andy Lutomirski Link: http://lkml.kernel.org/r/1474289501-31717-3-git-send-email-pra...@redhat.com Signed-off-by: Thomas Gleixner --- arch/x86/kernel/tsc.c | 5 + 1 file changed, 5 insertions(+) diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c index 2344758..46b2f41 100644 --- a/arch/x86/kernel/tsc.c +++ b/arch/x86/kernel/tsc.c @@ -689,8 +689,13 @@ unsigned long native_calibrate_tsc(void) switch (boot_cpu_data.x86_model) { case INTEL_FAM6_SKYLAKE_MOBILE: case INTEL_FAM6_SKYLAKE_DESKTOP: + case INTEL_FAM6_KABYLAKE_MOBILE: + case INTEL_FAM6_KABYLAKE_DESKTOP: crystal_khz = 24000;/* 24.0 MHz */ break; + case INTEL_FAM6_SKYLAKE_X: + crystal_khz = 25000;/* 25.0 MHz */ + break; case INTEL_FAM6_ATOM_GOLDMONT: crystal_khz = 19200;/* 19.2 MHz */ break;
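To make the arithmetic behind the quirk list concrete, here is a small sketch. It assumes that the TSC frequency is the crystal frequency scaled by a numerator/denominator ratio reported by CPUID (the commit text only says the values come from CPUID leaves). The function names are hypothetical. The model numbers 0x4E, 0x5E and 0x5C appear in the related tsc.c patch; 0x8E, 0x9E and 0x55 are assumed values for Kabylake and SkylakeX and should be checked against asm/intel-family.h.

```c
#include <assert.h>
#include <stdint.h>

/* Family 6 model numbers; the KBL and SKX values are assumptions. */
enum {
	SKL_MOBILE    = 0x4E,
	SKL_DESKTOP   = 0x5E,
	KBL_MOBILE    = 0x8E,
	KBL_DESKTOP   = 0x9E,
	SKL_X         = 0x55,
	ATOM_GOLDMONT = 0x5C,
};

/* Known fixed crystal frequencies in kHz; 0 means no quirk is known. */
static uint32_t quirk_crystal_khz(unsigned int model)
{
	switch (model) {
	case SKL_MOBILE:
	case SKL_DESKTOP:
	case KBL_MOBILE:
	case KBL_DESKTOP:
		return 24000;	/* 24.0 MHz */
	case SKL_X:
		return 25000;	/* 25.0 MHz */
	case ATOM_GOLDMONT:
		return 19200;	/* 19.2 MHz */
	default:
		return 0;
	}
}

/* TSC kHz = crystal kHz * numerator / denominator; 0 if it cannot be derived. */
static uint64_t tsc_khz_from_ratio(uint32_t denominator, uint32_t numerator,
				   uint32_t crystal_khz, unsigned int model)
{
	if (denominator == 0 || numerator == 0)
		return 0;
	if (crystal_khz == 0)
		crystal_khz = quirk_crystal_khz(model);
	return (uint64_t)crystal_khz * numerator / denominator;
}
```

With a ratio of 176/2 and an enumerated crystal frequency of 0, a model on the 24 MHz list gives 24000 * 176 / 2 = 2112000 kHz, whereas a model missing from the list gives 0. That second case is what Kabylake and SkylakeX hit before this patch.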
[tip:x86/timers] x86/tsc: Use cpu id defines instead of hex constants
Commit-ID: 655e52d2b62458032fc67ff7daaa664af6f36fb5 Gitweb: http://git.kernel.org/tip/655e52d2b62458032fc67ff7daaa664af6f36fb5 Author: Prarit Bhargava AuthorDate: Mon, 19 Sep 2016 08:51:40 -0400 Committer: Thomas Gleixner CommitDate: Tue, 20 Sep 2016 01:00:32 +0200 x86/tsc: Use cpu id defines instead of hex constants asm/intel-family.h contains defines for cpu ids which should be used instead of hex constants. Convert the switch case in native_calibrate_tsc() to use the defines before adding more cpu models. [ tglx: Massaged changelog ] Signed-off-by: Prarit Bhargava Cc: Len Brown Cc: Rafael Aquini Cc: "Peter Zijlstra (Intel)" Cc: Andy Lutomirski Link: http://lkml.kernel.org/r/1474289501-31717-2-git-send-email-pra...@redhat.com Signed-off-by: Thomas Gleixner --- arch/x86/kernel/tsc.c | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c index 78b9cb5..2344758 100644 --- a/arch/x86/kernel/tsc.c +++ b/arch/x86/kernel/tsc.c @@ -23,6 +23,7 @@ #include #include #include +#include unsigned int __read_mostly cpu_khz;/* TSC clocks / usec, not used here */ EXPORT_SYMBOL(cpu_khz); @@ -686,11 +687,11 @@ unsigned long native_calibrate_tsc(void) if (crystal_khz == 0) { switch (boot_cpu_data.x86_model) { - case 0x4E: /* SKL */ - case 0x5E: /* SKL */ + case INTEL_FAM6_SKYLAKE_MOBILE: + case INTEL_FAM6_SKYLAKE_DESKTOP: crystal_khz = 24000;/* 24.0 MHz */ break; - case 0x5C: /* BXT */ + case INTEL_FAM6_ATOM_GOLDMONT: crystal_khz = 19200;/* 19.2 MHz */ break; }
[tip:x86/urgent] x86/msr: Remove unused native_read_tscp()
Commit-ID:  9da77666d6975219281fd400eb9608a047337414
Gitweb:     http://git.kernel.org/tip/9da77666d6975219281fd400eb9608a047337414
Author:     Prarit Bhargava
AuthorDate: Tue, 22 Mar 2016 19:06:08 -0400
Committer:  Thomas Gleixner
CommitDate: Wed, 23 Mar 2016 12:34:17 +0100

x86/msr: Remove unused native_read_tscp()

After e76b027 ("x86,vdso: Use LSL unconditionally for vgetcpu")
native_read_tscp() is unused in the kernel. The function can be removed
like native_read_tsc() was.

Signed-off-by: Prarit Bhargava
Acked-by: Andy Lutomirski
Cc: Borislav Petkov
Link: http://lkml.kernel.org/r/1458687968-9106-1-git-send-email-pra...@redhat.com
Signed-off-by: Thomas Gleixner
---
 arch/x86/include/asm/msr.h | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h
index 93fb7c1..7a79ee2 100644
--- a/arch/x86/include/asm/msr.h
+++ b/arch/x86/include/asm/msr.h
@@ -42,14 +42,6 @@ struct saved_msrs {
 	struct saved_msr *array;
 };
 
-static inline unsigned long long native_read_tscp(unsigned int *aux)
-{
-	unsigned long low, high;
-	asm volatile(".byte 0x0f,0x01,0xf9"
-		     : "=a" (low), "=d" (high), "=c" (*aux));
-	return low | ((u64)high << 32);
-}
-
 /*
  * both i386 and x86_64 returns 64-bit value in edx:eax, but gcc's "A"
  * constraint has different meanings. For i386, "A" means exactly
[tip:sched/core] sched/isolcpus: Output warning when the 'isolcpus=' kernel parameter is invalid
Commit-ID:  a6e4491c682a7b28574a62e6f311a0acec50b318
Gitweb:     http://git.kernel.org/tip/a6e4491c682a7b28574a62e6f311a0acec50b318
Author:     Prarit Bhargava
AuthorDate: Thu, 4 Feb 2016 09:38:00 -0500
Committer:  Ingo Molnar
CommitDate: Fri, 5 Feb 2016 08:46:38 +0100

sched/isolcpus: Output warning when the 'isolcpus=' kernel parameter is invalid

The isolcpus= kernel boot parameter restricts userspace from scheduling on
the specified CPUs. If a CPU is specified that is outside the range of 0 to
nr_cpu_ids, cpulist_parse() will return -ERANGE, return an empty cpulist,
and fail silently.

This patch adds an error message to isolated_cpu_setup() to indicate to the
user that something has gone awry, and returns 0 on error.

Signed-off-by: Prarit Bhargava
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/1454596680-10367-1-git-send-email-pra...@redhat.com
[ Twiddled some details. ]
Signed-off-by: Ingo Molnar
---
 kernel/sched/core.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 9503d59..24fcdbf 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6173,11 +6173,16 @@ cpu_attach_domain(struct sched_domain *sd, struct root_domain *rd, int cpu)
 /* Setup the mask of cpus configured for isolated domains */
 static int __init isolated_cpu_setup(char *str)
 {
+	int ret;
+
 	alloc_bootmem_cpumask_var(&cpu_isolated_map);
-	cpulist_parse(str, cpu_isolated_map);
+	ret = cpulist_parse(str, cpu_isolated_map);
+	if (ret) {
+		pr_err("sched: Error, all isolcpus= values must be between 0 and %d\n", nr_cpu_ids);
+		return 0;
+	}
 	return 1;
 }
-
 __setup("isolcpus=", isolated_cpu_setup);
 
 struct s_data {
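The behaviour described in the changelog can be reproduced in user space with a minimal sketch. parse_cpulist() below is a simplified, hypothetical stand-in for cpulist_parse() (it is not the kernel implementation), NR_CPU_IDS is an assumed CPU count, and isolated_cpu_setup_sketch() mirrors the patched handler: report the failure, then return 0.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NR_CPU_IDS 8	/* assumed CPU count for this sketch */

/*
 * Simplified stand-in for cpulist_parse(): accepts lists such as "1,3-5"
 * and sets mask[cpu] = 1 for each listed CPU. Returns 0 on success,
 * -ERANGE for a CPU outside 0..NR_CPU_IDS-1 and -EINVAL for malformed
 * input. On any error the mask is left empty.
 */
static int parse_cpulist(const char *str, unsigned char mask[NR_CPU_IDS])
{
	const char *p = str;

	memset(mask, 0, NR_CPU_IDS);
	while (*p) {
		char *end;
		long first, last;

		first = last = strtol(p, &end, 10);
		if (end == p)
			goto invalid;
		p = end;
		if (*p == '-') {	/* range "first-last" */
			last = strtol(p + 1, &end, 10);
			if (end == p + 1)
				goto invalid;
			p = end;
		}
		if (first < 0 || last < first)
			goto invalid;
		if (last >= NR_CPU_IDS) {
			memset(mask, 0, NR_CPU_IDS);
			return -ERANGE;
		}
		for (long cpu = first; cpu <= last; cpu++)
			mask[cpu] = 1;
		if (*p == ',')
			p++;
		else if (*p)
			goto invalid;
	}
	return 0;

invalid:
	memset(mask, 0, NR_CPU_IDS);
	return -EINVAL;
}

/* Mirrors the patched handler: print an error and return 0 on failure. */
static int isolated_cpu_setup_sketch(const char *str,
				     unsigned char mask[NR_CPU_IDS])
{
	int ret = parse_cpulist(str, mask);

	if (ret) {
		fprintf(stderr,
			"sched: Error, all isolcpus= values must be between 0 and %d\n",
			NR_CPU_IDS);
		return 0;
	}
	return 1;
}
```

Before the patch the return value of the parser was discarded, so an out-of-range list produced an empty mask and no message at all.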
[tip:x86/cpu] x86/cpu: Strip any /proc/cpuinfo model name field whitespace
Commit-ID:  adafb98da6a7af5e45362933a7dae6ab0e5076bf
Gitweb:     http://git.kernel.org/tip/adafb98da6a7af5e45362933a7dae6ab0e5076bf
Author:     Prarit Bhargava
AuthorDate: Tue, 26 May 2015 10:28:17 +0200
Committer:  Ingo Molnar
CommitDate: Wed, 27 May 2015 14:38:24 +0200

x86/cpu: Strip any /proc/cpuinfo model name field whitespace

When comparing the 'model name' field of each core in /proc/cpuinfo it was
noticed that there is a whitespace difference between the cores' model
names.

After some quick investigation it was noticed that the model name fields
were actually different -- processor 0's model name field had trailing
whitespace removed, while the other processors did not.

Another way of seeing this behaviour is to convert spaces into underscores
in the output of /proc/cpuinfo,

[thetango@prarit ~]# grep "^model name" /proc/cpuinfo | uniq -c | sed 's/\ /_/g'
__1_model_name :_AMD_Opteron(TM)_Processor_6272
_63_model_name :_AMD_Opteron(TM)_Processor_6272_

which shows the discrepancy.

This occurs because the kernel calls strim() on cpu 0's x86_model_id field
to output a pretty message to the console in print_cpu_info(), and as a
result strips the whitespace at the end of the ->x86_model_id field.

But, the ->x86_model_id field should be the same for all identical CPUs in
the box. Thus, we need to remove both leading and trailing whitespace.

As a result, the print_cpu_info() output looks like

smpboot: CPU0: AMD Opteron(TM) Processor 6272 (fam: 15, model: 01, stepping: 02)

and the x86_model_id field is correct on all processors on AMD platforms:

_64_model_name :_AMD_Opteron(TM)_Processor_6272

Output is still correct on an Intel box:

144_model_name :_Intel(R)_Xeon(R)_CPU_E7-8890_v3_@_2.50GHz

Signed-off-by: Prarit Bhargava
Signed-off-by: Borislav Petkov
Cc: Andy Lutomirski
Cc: Borislav Petkov
Cc: Brian Gerst
Cc: Dave Hansen
Cc: Denys Vlasenko
Cc: Fenghua Yu
Cc: H. Peter Anvin
Cc: Igor Mammedov
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/1432050210-32036-1-git-send-email-pra...@redhat.com
Link: http://lkml.kernel.org/r/1432628901-18044-15-git-send-email...@alien8.de
Signed-off-by: Ingo Molnar
---
 arch/x86/kernel/cpu/common.c | 17 ++++-------------
 1 file changed, 4 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index a62cf04..41a8e9c 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -419,7 +419,6 @@ static const struct cpu_dev *cpu_devs[X86_VENDOR_NUM] = {};
 static void get_model_name(struct cpuinfo_x86 *c)
 {
 	unsigned int *v;
-	char *p, *q;
 
 	if (c->extended_cpuid_level < 0x80000004)
 		return;
@@ -431,18 +430,10 @@ static void get_model_name(struct cpuinfo_x86 *c)
 	c->x86_model_id[48] = 0;
 
 	/*
-	 * Intel chips right-justify this string for some dumb reason;
-	 * undo that brain damage:
+	 * Remove leading whitespace on Intel processors and trailing
+	 * whitespace on AMD processors.
 	 */
-	p = q = &c->x86_model_id[0];
-	while (*p == ' ')
-		p++;
-	if (p != q) {
-		while (*p)
-			*q++ = *p++;
-		while (q <= &c->x86_model_id[48])
-			*q++ = '\0';	/* Zero-pad the rest */
-	}
+	memmove(c->x86_model_id, strim(c->x86_model_id), 48);
 }
 
 void cpu_detect_cache_sizes(struct cpuinfo_x86 *c)
@@ -1122,7 +1113,7 @@ void print_cpu_info(struct cpuinfo_x86 *c)
 		printk(KERN_CONT "%s ", vendor);
 
 	if (c->x86_model_id[0])
-		printk(KERN_CONT "%s", strim(c->x86_model_id));
+		printk(KERN_CONT "%s", c->x86_model_id);
 	else
 		printk(KERN_CONT "%d86", c->x86);
 
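The strim() plus memmove() idiom from the patch can be exercised in user space with a rough sketch. strim_sketch() is a hypothetical reimplementation of the kernel helper, MODEL_ID_LEN is an assumed buffer size, and, unlike the kernel line, the sketch copies only the trimmed string and its terminator rather than a fixed 48 bytes.

```c
#include <ctype.h>
#include <string.h>

#define MODEL_ID_LEN 64	/* assumed size of the model-name buffer */

/*
 * Hypothetical user-space stand-in for the kernel's strim(): cuts trailing
 * whitespace in place and returns a pointer to the first non-whitespace
 * character.
 */
static char *strim_sketch(char *s)
{
	size_t len = strlen(s);

	while (len > 0 && isspace((unsigned char)s[len - 1]))
		s[--len] = '\0';
	while (isspace((unsigned char)*s))
		s++;
	return s;
}

/*
 * Normalise the buffer once, when it is filled in, so that every later
 * reader sees the same string on every CPU.
 */
static void normalise_model_id(char id[MODEL_ID_LEN])
{
	char *start = strim_sketch(id);

	/* memmove(), not memcpy(): source and destination overlap. */
	memmove(id, start, strlen(start) + 1);
}
```

Trimming at the point where the field is populated, instead of at print time on CPU 0 only, is what makes the field identical across processors.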
[tip:x86/irq] x86/irq: Fix fixup_irqs() error handling
Commit-ID:  fb24da805729ee4a83efa34015948f7d64da4b28
Gitweb:     http://git.kernel.org/tip/fb24da805729ee4a83efa34015948f7d64da4b28
Author:     Prarit Bhargava
AuthorDate: Wed, 2 Apr 2014 08:11:13 -0400
Committer:  Ingo Molnar
CommitDate: Wed, 16 Apr 2014 13:30:49 +0200

x86/irq: Fix fixup_irqs() error handling

Several patches to fix cpu hotplug and the down'd cpu's irq relocations
have been submitted in the past month or so. The patches should resolve
the problems with cpu hotplug and irq relocation, however, there is always
a possibility that a bug still exists. The big problem with debugging
these irq reassignments is that the cpu down completes and then we get
random stack traces from drivers for which irqs have not been properly
assigned to a new cpu. The stack traces are a mix of storage, network, and
other kernel subsystem (I once saw the serial port stop working ...)
warnings and failures.

The problem with these failures is that they are difficult to diagnose.
There is no warning in the cpu hotplug down path to indicate that an IRQ
has failed to be assigned to a new cpu, and all we are left with is a
stack trace from a driver, or a non-functional device. If we had some
information on the console debugging these situations would be much
easier; after all we can map an IRQ to a device by simply using lspci or
/proc/interrupts.

The current code, fixup_irqs(), which migrates IRQs from the down'd cpu
and is called close to the end of the cpu down path, calls
chip->set_irq_affinity which eventually calls __assign_irq_vector().
Errors are not propagated back from this function call and this results in
silent irq relocation failures.

This patch fixes this issue by returning the error codes up the call stack
and prints out a warning if there is a relocation failure.

Signed-off-by: Prarit Bhargava
Acked-by: Thomas Gleixner
Cc: Rui Wang
Cc: Liu Ping Fan
Cc: Bjorn Helgaas
Cc: Yoshihiro YUNOMAE
Cc: Lv Zheng
Cc: Seiji Aguchi
Cc: Yang Zhang
Cc: Andi Kleen
Cc: Steven Rostedt (Red Hat)
Cc: Li Fei
Cc: gong.c...@linux.intel.com
Link: http://lkml.kernel.org/r/1396440673-18286-1-git-send-email-pra...@redhat.com
[ Made small cleanliness tweaks. ]
Signed-off-by: Ingo Molnar
---
 arch/x86/kernel/apic/io_apic.c | 28 ++++++++++++++++++----------
 arch/x86/kernel/irq.c          | 13 +++++++++----
 2 files changed, 27 insertions(+), 14 deletions(-)

diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 6ad4658..b4b21db 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -2312,7 +2312,7 @@ int __ioapic_set_affinity(struct irq_data *data, const struct cpumask *mask,
 	int err;
 
 	if (!config_enabled(CONFIG_SMP))
-		return -1;
+		return -EPERM;
 
 	if (!cpumask_intersects(mask, cpu_online_mask))
 		return -EINVAL;
@@ -2343,7 +2343,7 @@ int native_ioapic_set_affinity(struct irq_data *data,
 	int ret;
 
 	if (!config_enabled(CONFIG_SMP))
-		return -1;
+		return -EPERM;
 
 	raw_spin_lock_irqsave(&ioapic_lock, flags);
 	ret = __ioapic_set_affinity(data, mask, &dest);
@@ -3075,9 +3075,11 @@ msi_set_affinity(struct irq_data *data, const struct cpumask *mask, bool force)
 	struct irq_cfg *cfg = data->chip_data;
 	struct msi_msg msg;
 	unsigned int dest;
+	int ret;
 
-	if (__ioapic_set_affinity(data, mask, &dest))
-		return -1;
+	ret = __ioapic_set_affinity(data, mask, &dest);
+	if (ret)
+		return ret;
 
 	__get_cached_msi_msg(data->msi_desc, &msg);
 
@@ -3177,9 +3179,11 @@ dmar_msi_set_affinity(struct irq_data *data, const struct cpumask *mask,
 	struct irq_cfg *cfg = data->chip_data;
 	unsigned int dest, irq = data->irq;
 	struct msi_msg msg;
+	int ret;
 
-	if (__ioapic_set_affinity(data, mask, &dest))
-		return -1;
+	ret = __ioapic_set_affinity(data, mask, &dest);
+	if (ret)
+		return ret;
 
 	dmar_msi_read(irq, &msg);
 
@@ -3226,9 +3230,11 @@ static int hpet_msi_set_affinity(struct irq_data *data,
 	struct irq_cfg *cfg = data->chip_data;
 	struct msi_msg msg;
 	unsigned int dest;
+	int ret;
 
-	if (__ioapic_set_affinity(data, mask, &dest))
-		return -1;
+	ret = __ioapic_set_affinity(data, mask, &dest);
+	if (ret)
+		return ret;
 
 	hpet_msi_read(data->handler_data, &msg);
 
@@ -3295,9 +3301,11 @@ ht_set_affinity(struct irq_data *data, const struct cpumask *mask, bool force)
 {
 	struct irq_cfg *cfg = data->chip_data;
 	unsigned int dest;
+	int ret;
 
-	if (__ioapic_set_affinity(data, mask, &dest))
-		return -1;
+	ret = __ioapic_set_affinity(data, mask, &dest);
+	if (ret)
+		return ret;
 
 	target_ht_irq(data->irq, dest, cfg->vector);
 	return IRQ_SET_MASK_OK_NOCOPY;
diff --git a/arch/x86/kernel/irq.c
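The error-propagation pattern in the io_apic.c hunks can be sketched in isolation. All three functions below are hypothetical; the irq.c part of the patch is cut off in this message, so the wording of the warning and the choice of -ENOSPC are assumptions made purely for illustration.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical lowest layer: fails with a specific errno-style code. */
static int assign_vector_sketch(int irq, int free_vectors)
{
	(void)irq;
	return free_vectors > 0 ? 0 : -ENOSPC;
}

/* Middle layer: hand the callee's code up instead of collapsing it to -1. */
static int set_affinity_sketch(int irq, int free_vectors)
{
	int ret = assign_vector_sketch(irq, free_vectors);

	if (ret)
		return ret;
	return 0;
}

/* Top layer: the first place where a failed relocation becomes visible. */
static int fixup_irq_sketch(int irq, int free_vectors)
{
	int ret = set_affinity_sketch(irq, free_vectors);

	if (ret)
		fprintf(stderr, "IRQ%d: cannot set affinity (error %d)\n",
			irq, ret);
	return ret;
}
```

Keeping the original code intact on the way up lets the top-level message say why the relocation failed, which a bare -1 cannot.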
[tip:x86/apic] x86/irq: Clean up VECTOR_UNDEFINED and VECTOR_RETRIGGERED definition
Commit-ID:  79a51b25badae79d2da6f7b54530adf56697f669
Gitweb:     http://git.kernel.org/tip/79a51b25badae79d2da6f7b54530adf56697f669
Author:     Prarit Bhargava
AuthorDate: Wed, 2 Apr 2014 08:13:47 -0400
Committer:  Ingo Molnar
CommitDate: Mon, 14 Apr 2014 13:42:05 +0200

x86/irq: Clean up VECTOR_UNDEFINED and VECTOR_RETRIGGERED definition

During another patch review, David Rientjes noted that VECTOR_UNDEFINED
and VECTOR_RETRIGGERED should be defined with ()s so that they are not
erroneously used in an arithmetic operation.

Suggested-by: David Rientjes
Signed-off-by: Prarit Bhargava
Cc: Seiji Aguchi
Cc: Yang Zhang
Link: http://lkml.kernel.org/r/1396440827-18352-1-git-send-email-pra...@redhat.com
Signed-off-by: Ingo Molnar
---
 arch/x86/include/asm/hw_irq.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/hw_irq.h b/arch/x86/include/asm/hw_irq.h
index a307b75..4615906 100644
--- a/arch/x86/include/asm/hw_irq.h
+++ b/arch/x86/include/asm/hw_irq.h
@@ -190,8 +190,8 @@ extern void (*__initconst interrupt[NR_VECTORS-FIRST_EXTERNAL_VECTOR])(void);
 #define trace_interrupt interrupt
 #endif
 
-#define VECTOR_UNDEFINED -1
-#define VECTOR_RETRIGGERED -2
+#define VECTOR_UNDEFINED (-1)
+#define VECTOR_RETRIGGERED (-2)
 typedef int vector_irq_t[NR_VECTORS];
 DECLARE_PER_CPU(vector_irq_t, vector_irq);
 
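One way an unparenthesized negative constant can slip into an arithmetic operation is a forgotten operator, as in the sketch below. The macro names are renamed copies for illustration, and whether this is the exact case the reviewer had in mind is an assumption.

```c
/* Unparenthesized, as before the patch (renamed for this sketch). */
#define VECTOR_UNDEFINED_OLD	-1
/* Parenthesized, as after the patch (renamed for this sketch). */
#define VECTOR_UNDEFINED_NEW	(-1)

static int missing_operator_old(int count)
{
	/*
	 * A comparison operator was forgotten here, yet the line still
	 * compiles: it expands to "count -1", a silent subtraction.
	 * With VECTOR_UNDEFINED_NEW the same line would expand to
	 * "count (-1)" and the compiler would reject it.
	 */
	return count VECTOR_UNDEFINED_OLD;
}

static int is_undefined_new(int vector)
{
	/* Intended use is unaffected by the parentheses. */
	return vector == VECTOR_UNDEFINED_NEW;
}
```

The parentheses cost nothing at the intended use sites and turn this class of typo into a build failure.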
[tip:x86/urgent] x86, cpu hotplug: Fix stack frame warning in check_irq_vectors_for_cpu_disable()
Commit-ID:  39424e89d64661faa0a2e00c5ad1e6dbeebfa972
Gitweb:     http://git.kernel.org/tip/39424e89d64661faa0a2e00c5ad1e6dbeebfa972
Author:     Prarit Bhargava
AuthorDate: Tue, 28 Jan 2014 08:22:11 -0500
Committer:  H. Peter Anvin
CommitDate: Thu, 30 Jan 2014 16:40:13 -0800

x86, cpu hotplug: Fix stack frame warning in check_irq_vectors_for_cpu_disable()

Further discussion here: http://marc.info/?l=linux-kernel&m=139073901101034&w=2

kbuild, 0day kernel build service, outputs the warning:

arch/x86/kernel/irq.c:333:1: warning: the frame size of 2056 bytes is larger than 2048 bytes [-Wframe-larger-than=]

because check_irq_vectors_for_cpu_disable() allocates two cpumasks on the
stack. Fix this by moving the two cpumasks to a global file context.

Reported-by: Fengguang Wu
Tested-by: David Rientjes
Signed-off-by: Prarit Bhargava
Link: http://lkml.kernel.org/r/1390915331-27375-1-git-send-email-pra...@redhat.com
Cc: Andi Kleen
Cc: Michel Lespinasse
Cc: Seiji Aguchi
Cc: Yang Zhang
Cc: Paul Gortmaker
Cc: Janet Morgan
Cc: Tony Luck
Cc: Ruiv Wang
Cc: Gong Chen
Cc: Yinghai Lu
Signed-off-by: H. Peter Anvin
---
 arch/x86/kernel/irq.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
index dbb6087..d99f31d 100644
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -266,6 +266,14 @@ __visible void smp_trace_x86_platform_ipi(struct pt_regs *regs)
 EXPORT_SYMBOL_GPL(vector_used_by_percpu_irq);
 
 #ifdef CONFIG_HOTPLUG_CPU
+
+/* These two declarations are only used in check_irq_vectors_for_cpu_disable()
+ * below, which is protected by stop_machine(). Putting them on the stack
+ * results in a stack frame overflow. Dynamically allocating could result in a
+ * failure so declare these two cpumasks as global.
+ */
+static struct cpumask affinity_new, online_new;
+
 /*
  * This cpu is going to be removed and its vectors migrated to the remaining
  * online cpus. Check to see if there are enough vectors in the remaining cpus.
@@ -277,7 +285,6 @@ int check_irq_vectors_for_cpu_disable(void)
 	unsigned int this_cpu, vector, this_count, count;
 	struct irq_desc *desc;
 	struct irq_data *data;
-	struct cpumask affinity_new, online_new;
 
 	this_cpu = smp_processor_id();
 	cpumask_copy(&online_new, cpu_online_mask);
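The arithmetic behind the warning can be checked with a small sketch. It assumes a configuration with 8192 possible CPUs (an assumption, not stated in the message), in which one bitmap occupies 1024 bytes and two of them alone reach the 2048-byte limit; all names carry a _sketch suffix to mark them as stand-ins rather than kernel definitions.

```c
#include <limits.h>
#include <string.h>

#define NR_CPUS_SKETCH		8192	/* assumed CPU limit */
#define BITS_PER_LONG_SKETCH	(sizeof(unsigned long) * CHAR_BIT)

/* One bit per possible CPU, stored in an array of longs. */
struct cpumask_sketch {
	unsigned long bits[NR_CPUS_SKETCH / BITS_PER_LONG_SKETCH];
};

/*
 * File scope, as in the patch: no stack cost and no allocation that could
 * fail. This is only safe because the single caller is serialised.
 */
static struct cpumask_sketch affinity_new, online_new;

/* Stack bytes the two masks would need as local variables. */
static size_t masks_stack_cost(void)
{
	return 2 * sizeof(struct cpumask_sketch);
}

/* Uses the file-scope masks; its own frame stays small. */
static void prepare_masks(const struct cpumask_sketch *online)
{
	memcpy(&online_new, online, sizeof(online_new));
	memset(&affinity_new, 0, sizeof(affinity_new));
}
```

The trade-off is reentrancy: a static buffer works here because the function runs under stop_machine(), as the comment added by the patch points out.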
[tip:x86/urgent] x86: Add check for number of available vectors before CPU down
Commit-ID: da6139e49c7cb0f4251265cb5243b8d220adb48d Gitweb: http://git.kernel.org/tip/da6139e49c7cb0f4251265cb5243b8d220adb48d Author: Prarit Bhargava AuthorDate: Mon, 13 Jan 2014 06:51:01 -0500 Committer: H. Peter Anvin CommitDate: Wed, 15 Jan 2014 22:24:02 -0800 x86: Add check for number of available vectors before CPU down Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=64791 When a cpu is downed on a system, the irqs on the cpu are assigned to other cpus. It is possible, however, that when a cpu is downed there aren't enough free vectors on the remaining cpus to account for the vectors from the cpu that is being downed. This results in an interesting "overflow" condition where irqs are "assigned" to a CPU but are not handled. For example, when downing cpus on a 1-64 logical processor system: [ 232.021745] smpboot: CPU 61 is now offline [ 238.480275] smpboot: CPU 62 is now offline [ 245.991080] [ cut here ] [ 245.996270] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:264 dev_watchdog+0x246/0x250() [ 246.005688] NETDEV WATCHDOG: p786p1 (ixgbe): transmit queue 0 timed out [ 246.013070] Modules linked in: lockd sunrpc iTCO_wdt iTCO_vendor_support sb_edac ixgbe microcode e1000e pcspkr joydev edac_core lpc_ich ioatdma ptp mdio mfd_core i2c_i801 dca pps_core i2c_core wmi acpi_cpufreq isci libsas scsi_transport_sas [ 246.037633] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.12.0+ #14 [ 246.044451] Hardware name: Intel Corporation S4600LH ../SVRBD-ROW_T, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013 [ 246.057371] 0009 88081fa03d40 8164fbf6 88081fa0ee48 [ 246.065728] 88081fa03d90 88081fa03d80 81054ecc 88081fa13040 [ 246.074073] 88200cce 0040 [ 246.082430] Call Trace: [ 246.085174][] dump_stack+0x46/0x58 [ 246.091633] [] warn_slowpath_common+0x8c/0xc0 [ 246.098352] [] warn_slowpath_fmt+0x46/0x50 [ 246.104786] [] dev_watchdog+0x246/0x250 [ 246.110923] [] ? dev_deactivate_queue.constprop.31+0x80/0x80 [ 246.119097] [] call_timer_fn+0x3a/0x110 [ 246.125224] [] ? 
update_process_times+0x6f/0x80 [ 246.132137] [] ? dev_deactivate_queue.constprop.31+0x80/0x80 [ 246.140308] [] run_timer_softirq+0x1f0/0x2a0 [ 246.146933] [] __do_softirq+0xe0/0x220 [ 246.152976] [] call_softirq+0x1c/0x30 [ 246.158920] [] do_softirq+0x55/0x90 [ 246.164670] [] irq_exit+0xa5/0xb0 [ 246.170227] [] smp_apic_timer_interrupt+0x4a/0x60 [ 246.177324] [] apic_timer_interrupt+0x6a/0x70 [ 246.184041][] ? cpuidle_enter_state+0x5b/0xe0 [ 246.191559] [] ? cpuidle_enter_state+0x57/0xe0 [ 246.198374] [] cpuidle_idle_call+0xbd/0x200 [ 246.204900] [] arch_cpu_idle+0xe/0x30 [ 246.210846] [] cpu_startup_entry+0xd0/0x250 [ 246.217371] [] rest_init+0x77/0x80 [ 246.223028] [] start_kernel+0x3ee/0x3fb [ 246.229165] [] ? repair_env_string+0x5e/0x5e [ 246.235787] [] x86_64_start_reservations+0x2a/0x2c [ 246.242990] [] x86_64_start_kernel+0xf8/0xfc [ 246.249610] ---[ end trace fb74fdef54d79039 ]--- [ 246.254807] ixgbe :c2:00.0 p786p1: initiating reset due to tx timeout [ 246.262489] ixgbe :c2:00.0 p786p1: Reset adapter Last login: Mon Nov 11 08:35:14 from 10.18.17.119 [root@(none) ~]# [ 246.792676] ixgbe :c2:00.0 p786p1: detected SFP+: 5 [ 249.231598] ixgbe :c2:00.0 p786p1: NIC Link is Up 10 Gbps, Flow Control: RX/TX [ 246.792676] ixgbe :c2:00.0 p786p1: detected SFP+: 5 [ 249.231598] ixgbe :c2:00.0 p786p1: NIC Link is Up 10 Gbps, Flow Control: RX/TX (last lines keep repeating. ixgbe driver is dead until module reload.) If the downed cpu has more vectors than are free on the remaining cpus on the system, it is possible that some vectors are "orphaned" even though they are assigned to a cpu. In this case, since the ixgbe driver had a watchdog, the watchdog fired and notified that something was wrong. This patch adds a function, check_vectors(), that compares the number of vectors on the CPU going down with the number of vectors available on the system. If there aren't enough vectors for the CPU to go down, an error is returned and propagated back to userspace. 
v2: Do not need to look at percpu irqs v3: Need to check affinity to prevent counting of MSIs in IOAPIC Lowest Priority Mode v4: Additional changes suggested by Gong Chen. v5/v6/v7/v8: Updated comment text Signed-off-by: Prarit Bhargava Link: http://lkml.kernel.org/r/1389613861-3853-1-git-send-email-pra...@redhat.com Reviewed-by: Gong Chen Cc: Andi Kleen Cc: Michel Lespinasse Cc: Seiji Aguchi Cc: Yang Zhang Cc: Paul Gortmaker Cc: Janet Morgan Cc: Tony Luck Cc: Ruiv Wang Cc: Gong Chen Signed-off-by: H. Peter Anvin Cc: --- arch/x86/include/asm/irq.h | 1 + arch/x86/kernel/irq.c | 70 ++ arch/x86/kernel/smpboot.c | 6 3 files changed, 77
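The accounting described in the commit message can be sketched as a small user-space model. This is not the kernel implementation: `MODEL_CPUS`, `MODEL_VECTORS`, `model_reset()`, `model_assign()` and `check_vectors_model()` are hypothetical names, the return value of -1 is a placeholder for whatever error the kernel reports, and the model deliberately ignores the per-cpu irq and affinity refinements mentioned in the v2 and v3 notes.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_CPUS    4    /* hypothetical system size */
#define MODEL_VECTORS 8    /* hypothetical number of vectors per CPU */
#define VECTOR_FREE   (-1)

/* vector_irq[cpu][vector] is the irq using that vector, or VECTOR_FREE. */
static int  vector_irq[MODEL_CPUS][MODEL_VECTORS];
static bool cpu_online[MODEL_CPUS];

/* Bring every CPU online with all of its vectors free. */
static void model_reset(void)
{
    for (int cpu = 0; cpu < MODEL_CPUS; cpu++) {
        cpu_online[cpu] = true;
        for (int v = 0; v < MODEL_VECTORS; v++)
            vector_irq[cpu][v] = VECTOR_FREE;
    }
}

static void model_assign(int cpu, int vector, int irq)
{
    assert(cpu >= 0 && cpu < MODEL_CPUS);
    assert(vector >= 0 && vector < MODEL_VECTORS);
    vector_irq[cpu][vector] = irq;
}

/*
 * Return 0 if the vectors in use on down_cpu fit into the free vectors
 * of the remaining online CPUs, -1 otherwise.
 */
static int check_vectors_model(int down_cpu)
{
    unsigned int needed = 0, available = 0;

    for (int cpu = 0; cpu < MODEL_CPUS; cpu++) {
        for (int v = 0; v < MODEL_VECTORS; v++) {
            bool in_use = vector_irq[cpu][v] != VECTOR_FREE;

            if (cpu == down_cpu)
                needed += in_use;
            else if (cpu_online[cpu])
                available += !in_use;
        }
    }
    return needed <= available ? 0 : -1;
}
```

Refusing the offline request up front is the point of the patch: without the check, the irqs would be "assigned" to a CPU that has no vector left to deliver them on.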
[tip:x86/urgent] x86: Add check for number of available vectors before CPU down
Commit-ID: da6139e49c7cb0f4251265cb5243b8d220adb48d Gitweb: http://git.kernel.org/tip/da6139e49c7cb0f4251265cb5243b8d220adb48d Author: Prarit Bhargava pra...@redhat.com AuthorDate: Mon, 13 Jan 2014 06:51:01 -0500 Committer: H. Peter Anvin h...@linux.intel.com CommitDate: Wed, 15 Jan 2014 22:24:02 -0800 x86: Add check for number of available vectors before CPU down Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=64791 When a cpu is downed on a system, the irqs on the cpu are assigned to other cpus. It is possible, however, that when a cpu is downed there aren't enough free vectors on the remaining cpus to account for the vectors from the cpu that is being downed. This results in an interesting overflow condition where irqs are assigned to a CPU but are not handled. For example, when downing cpus on a 1-64 logical processor system: snip [ 232.021745] smpboot: CPU 61 is now offline [ 238.480275] smpboot: CPU 62 is now offline [ 245.991080] [ cut here ] [ 245.996270] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:264 dev_watchdog+0x246/0x250() [ 246.005688] NETDEV WATCHDOG: p786p1 (ixgbe): transmit queue 0 timed out [ 246.013070] Modules linked in: lockd sunrpc iTCO_wdt iTCO_vendor_support sb_edac ixgbe microcode e1000e pcspkr joydev edac_core lpc_ich ioatdma ptp mdio mfd_core i2c_i801 dca pps_core i2c_core wmi acpi_cpufreq isci libsas scsi_transport_sas [ 246.037633] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.12.0+ #14 [ 246.044451] Hardware name: Intel Corporation S4600LH ../SVRBD-ROW_T, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013 [ 246.057371] 0009 88081fa03d40 8164fbf6 88081fa0ee48 [ 246.065728] 88081fa03d90 88081fa03d80 81054ecc 88081fa13040 [ 246.074073] 88200cce 0040 [ 246.082430] Call Trace: [ 246.085174] IRQ [8164fbf6] dump_stack+0x46/0x58 [ 246.091633] [81054ecc] warn_slowpath_common+0x8c/0xc0 [ 246.098352] [81054fb6] warn_slowpath_fmt+0x46/0x50 [ 246.104786] [815710d6] dev_watchdog+0x246/0x250 [ 246.110923] [81570e90] ? 
dev_deactivate_queue.constprop.31+0x80/0x80 [ 246.119097] [8106092a] call_timer_fn+0x3a/0x110 [ 246.125224] [8106280f] ? update_process_times+0x6f/0x80 [ 246.132137] [81570e90] ? dev_deactivate_queue.constprop.31+0x80/0x80 [ 246.140308] [81061db0] run_timer_softirq+0x1f0/0x2a0 [ 246.146933] [81059a80] __do_softirq+0xe0/0x220 [ 246.152976] [8165fedc] call_softirq+0x1c/0x30 [ 246.158920] [810045f5] do_softirq+0x55/0x90 [ 246.164670] [81059d35] irq_exit+0xa5/0xb0 [ 246.170227] [8166062a] smp_apic_timer_interrupt+0x4a/0x60 [ 246.177324] [8165f40a] apic_timer_interrupt+0x6a/0x70 [ 246.184041] EOI [81505a1b] ? cpuidle_enter_state+0x5b/0xe0 [ 246.191559] [81505a17] ? cpuidle_enter_state+0x57/0xe0 [ 246.198374] [81505b5d] cpuidle_idle_call+0xbd/0x200 [ 246.204900] [8100b7ae] arch_cpu_idle+0xe/0x30 [ 246.210846] [810a47b0] cpu_startup_entry+0xd0/0x250 [ 246.217371] [81646b47] rest_init+0x77/0x80 [ 246.223028] [81d09e8e] start_kernel+0x3ee/0x3fb [ 246.229165] [81d0989f] ? repair_env_string+0x5e/0x5e [ 246.235787] [81d095a5] x86_64_start_reservations+0x2a/0x2c [ 246.242990] [81d0969f] x86_64_start_kernel+0xf8/0xfc [ 246.249610] ---[ end trace fb74fdef54d79039 ]--- [ 246.254807] ixgbe :c2:00.0 p786p1: initiating reset due to tx timeout [ 246.262489] ixgbe :c2:00.0 p786p1: Reset adapter Last login: Mon Nov 11 08:35:14 from 10.18.17.119 [root@(none) ~]# [ 246.792676] ixgbe :c2:00.0 p786p1: detected SFP+: 5 [ 249.231598] ixgbe :c2:00.0 p786p1: NIC Link is Up 10 Gbps, Flow Control: RX/TX [ 246.792676] ixgbe :c2:00.0 p786p1: detected SFP+: 5 [ 249.231598] ixgbe :c2:00.0 p786p1: NIC Link is Up 10 Gbps, Flow Control: RX/TX (last lines keep repeating. ixgbe driver is dead until module reload.) If the downed cpu has more vectors than are free on the remaining cpus on the system, it is possible that some vectors are orphaned even though they are assigned to a cpu. In this case, since the ixgbe driver had a watchdog, the watchdog fired and notified that something was wrong. 
This patch adds a function, check_vectors(), that compares the number of vectors on the CPU going down with the number of vectors available on the system. If there aren't enough vectors for the CPU to go down, an error is returned and propagated back to userspace. v2: Do not need to look at percpu irqs v3: Need to check affinity to prevent counting of MSIs in IOAPIC Lowest Priority Mode v4: Additional changes suggested by Gong Chen. v5/v6/v7/v8: Updated comment text Signed-off-by: Prarit Bhargava pra...@redhat.com Link:
[tip:irq/core] x86/irq: Fix kbuild warning in smp_irq_move_cleanup_interrupt()
Commit-ID: c7a730fa4624092e2d1c0cb7b750816e87c32364 Gitweb: http://git.kernel.org/tip/c7a730fa4624092e2d1c0cb7b750816e87c32364 Author: Prarit Bhargava AuthorDate: Mon, 13 Jan 2014 08:40:20 -0500 Committer: Ingo Molnar CommitDate: Mon, 13 Jan 2014 15:08:37 +0100 x86/irq: Fix kbuild warning in smp_irq_move_cleanup_interrupt() Fengguang Wu's 0day kernel build service reported the following build warning: arch/x86/kernel/apic/io_apic.c:2211 smp_irq_move_cleanup_interrupt() warn: always true condition '(irq <= -1) => (0-u32max <= (-1))' because irq is defined as an unsigned int instead of an int. Fix this trivial error by redefining irq as a signed int. The remaining consumers of the int are okay. Signed-off-by: Prarit Bhargava Cc: Konrad Rzeszutek Wilk Cc: Sebastian Andrzej Siewior Cc: Joerg Roedel Cc: Fengguang Wu Link: http://lkml.kernel.org/r/1389620420-7110-1-git-send-email-pra...@redhat.com Signed-off-by: Ingo Molnar --- arch/x86/kernel/apic/io_apic.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c index 6df0b66..a43f068 100644 --- a/arch/x86/kernel/apic/io_apic.c +++ b/arch/x86/kernel/apic/io_apic.c @@ -2202,7 +2202,7 @@ asmlinkage void smp_irq_move_cleanup_interrupt(void) me = smp_processor_id(); for (vector = FIRST_EXTERNAL_VECTOR; vector < NR_VECTORS; vector++) { - unsigned int irq; + int irq; unsigned int irr; struct irq_desc *desc; struct irq_cfg *cfg;
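The "always true condition" in the warning follows from C's usual arithmetic conversions: when an unsigned int is compared with the int constant -1, the constant is converted to UINT_MAX. A minimal stand-alone sketch, with hypothetical function names that do not appear in the kernel, and which some compilers may themselves warn about for the same reason:

```c
#include <assert.h>
#include <stdbool.h>

/* Buggy form: -1 is converted to UINT_MAX, so every value compares true. */
static bool unsigned_is_unassigned(unsigned int irq)
{
    return irq <= -1;
}

/* Fixed form: with a signed irq the comparison means what it says. */
static bool signed_is_unassigned(int irq)
{
    return irq <= -1;
}

/* Self-check: aborts if the behaviour differs from the description. */
static void check_irq_comparison(void)
{
    assert(unsigned_is_unassigned(0));
    assert(unsigned_is_unassigned(5));   /* a valid irq, yet "unassigned" */
    assert(!signed_is_unassigned(5));
    assert(signed_is_unassigned(-1));
}
```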
[tip:irq/core] x86/irq: Fix kbuild warning in smp_irq_move_cleanup_interrupt()
Commit-ID: c7a730fa4624092e2d1c0cb7b750816e87c32364 Gitweb: http://git.kernel.org/tip/c7a730fa4624092e2d1c0cb7b750816e87c32364 Author: Prarit Bhargava pra...@redhat.com AuthorDate: Mon, 13 Jan 2014 08:40:20 -0500 Committer: Ingo Molnar mi...@kernel.org CommitDate: Mon, 13 Jan 2014 15:08:37 +0100 x86/irq: Fix kbuild warning in smp_irq_move_cleanup_interrupt() Fengguang Wu's 0day kernel build service reported the following build warning: arch/x86/kernel/apic/io_apic.c:2211 smp_irq_move_cleanup_interrupt() warn: always true condition '(irq <= -1) => (0-u32max <= (-1))' because irq is defined as an unsigned int instead of an int. Fix this trivial error by redefining irq as a signed int. The remaining consumers of the int are okay. Signed-off-by: Prarit Bhargava pra...@redhat.com Cc: Konrad Rzeszutek Wilk konrad.w...@oracle.com Cc: Sebastian Andrzej Siewior sebast...@breakpoint.cc Cc: Joerg Roedel j...@8bytes.org Cc: Fengguang Wu fengguang...@intel.com Link: http://lkml.kernel.org/r/1389620420-7110-1-git-send-email-pra...@redhat.com Signed-off-by: Ingo Molnar mi...@kernel.org --- arch/x86/kernel/apic/io_apic.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c index 6df0b66..a43f068 100644 --- a/arch/x86/kernel/apic/io_apic.c +++ b/arch/x86/kernel/apic/io_apic.c @@ -2202,7 +2202,7 @@ asmlinkage void smp_irq_move_cleanup_interrupt(void) me = smp_processor_id(); for (vector = FIRST_EXTERNAL_VECTOR; vector < NR_VECTORS; vector++) { - unsigned int irq; + int irq; unsigned int irr; struct irq_desc *desc; struct irq_cfg *cfg;
[tip:irq/core] x86/irq: Fix do_IRQ() interrupt warning for cpu hotplug retriggered irqs
Commit-ID: 9345005f4eed805308193658d12e4e7e9c261e74 Gitweb: http://git.kernel.org/tip/9345005f4eed805308193658d12e4e7e9c261e74 Author: Prarit Bhargava AuthorDate: Sun, 5 Jan 2014 11:10:52 -0500 Committer: Ingo Molnar CommitDate: Sun, 12 Jan 2014 13:13:02 +0100 x86/irq: Fix do_IRQ() interrupt warning for cpu hotplug retriggered irqs During heavy CPU-hotplug operations the following spurious kernel warnings can trigger: do_IRQ: No ... irq handler for vector (irq -1) [ See: https://bugzilla.kernel.org/show_bug.cgi?id=64831 ] When downing a cpu it is possible that there are unhandled irqs left in the APIC IRR register. The following code path shows how the problem can occur: 1. CPU 5 is to go down. 2. cpu_disable() on CPU 5 executes with interrupt flag cleared by local_irq_save() via stop_machine(). 3. IRQ 12 asserts on CPU 5, setting IRR but not ISR because interrupt flag is cleared (CPU unable to handle the irq) 4. IRQs are migrated off of CPU 5, and the vectors' irqs are set to -1. 5. stop_machine() finishes cpu_disable() 6. cpu_die() for CPU 5 executes in normal context. 7. CPU 5 attempts to handle IRQ 12 because the IRR is set for IRQ 12. The code attempts to find the vector's IRQ and cannot because it has been set to -1. 8. do_IRQ() displays a warning about CPU 5 IRQ 12. I added a debug printk to output which CPU & vector was retriggered and discovered that we are getting bogus events. I see a 100% correlation between this debug printk in fixup_irqs() and the do_IRQ() warning. This patchset resolves this by adding definitions for VECTOR_UNDEFINED(-1) and VECTOR_RETRIGGERED(-2) and modifying the code to use them. 
Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=64831 Signed-off-by: Prarit Bhargava Reviewed-by: Rui Wang Cc: Michel Lespinasse Cc: Seiji Aguchi Cc: Yang Zhang Cc: Paul Gortmaker Cc: janet.mor...@intel.com Cc: tony.l...@intel.com Cc: ruiv.w...@gmail.com Link: http://lkml.kernel.org/r/1388938252-16627-1-git-send-email-pra...@redhat.com [ Cleaned up the code a bit. ] Signed-off-by: Ingo Molnar --- arch/x86/include/asm/hw_irq.h | 3 +++ arch/x86/kernel/apic/io_apic.c | 18 +- arch/x86/kernel/irq.c | 19 +-- arch/x86/kernel/irqinit.c | 4 ++-- 4 files changed, 27 insertions(+), 17 deletions(-) diff --git a/arch/x86/include/asm/hw_irq.h b/arch/x86/include/asm/hw_irq.h index cba45d9..67d69b8 100644 --- a/arch/x86/include/asm/hw_irq.h +++ b/arch/x86/include/asm/hw_irq.h @@ -191,6 +191,9 @@ extern void (*__initconst interrupt[NR_VECTORS-FIRST_EXTERNAL_VECTOR])(void); #define trace_interrupt interrupt #endif +#define VECTOR_UNDEFINED -1 +#define VECTOR_RETRIGGERED -2 + typedef int vector_irq_t[NR_VECTORS]; DECLARE_PER_CPU(vector_irq_t, vector_irq); extern void setup_vector_irq(int cpu); diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c index e63a5bd..6df0b66 100644 --- a/arch/x86/kernel/apic/io_apic.c +++ b/arch/x86/kernel/apic/io_apic.c @@ -1142,9 +1142,10 @@ next: if (test_bit(vector, used_vectors)) goto next; - for_each_cpu_and(new_cpu, tmp_mask, cpu_online_mask) - if (per_cpu(vector_irq, new_cpu)[vector] != -1) + for_each_cpu_and(new_cpu, tmp_mask, cpu_online_mask) { + if (per_cpu(vector_irq, new_cpu)[vector] > VECTOR_UNDEFINED) goto next; + } /* Found one! 
*/ current_vector = vector; current_offset = offset; @@ -1183,7 +1184,7 @@ static void __clear_irq_vector(int irq, struct irq_cfg *cfg) vector = cfg->vector; for_each_cpu_and(cpu, cfg->domain, cpu_online_mask) - per_cpu(vector_irq, cpu)[vector] = -1; + per_cpu(vector_irq, cpu)[vector] = VECTOR_UNDEFINED; cfg->vector = 0; cpumask_clear(cfg->domain); @@ -1191,11 +1192,10 @@ static void __clear_irq_vector(int irq, struct irq_cfg *cfg) if (likely(!cfg->move_in_progress)) return; for_each_cpu_and(cpu, cfg->old_domain, cpu_online_mask) { - for (vector = FIRST_EXTERNAL_VECTOR; vector < NR_VECTORS; - vector++) { + for (vector = FIRST_EXTERNAL_VECTOR; vector < NR_VECTORS; vector++) { if (per_cpu(vector_irq, cpu)[vector] != irq) continue; - per_cpu(vector_irq, cpu)[vector] = -1; + per_cpu(vector_irq, cpu)[vector] = VECTOR_UNDEFINED; break; } } @@ -1228,12 +1228,12 @@ void __setup_vector_irq(int cpu) /* Mark the free vectors */ for (vector = 0; vector < NR_VECTORS; ++vector) { irq = per_cpu(vector_irq,
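To illustrate why a second sentinel helps, here is a small user-space model of the scenario in the numbered steps of the commit message. It is a sketch under stated assumptions, not the kernel code: only the two macro values come from the patch, while `MODEL_VECTORS`, `release_vector()`, `handle_vector()` and the `warnings` counter are hypothetical, and how the kernel actually marks and consumes a retriggered vector is not shown in the excerpt above.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_VECTORS      16    /* hypothetical table size */
#define VECTOR_UNDEFINED   (-1)  /* value from the patch */
#define VECTOR_RETRIGGERED (-2)  /* value from the patch */

static int vector_irq[MODEL_VECTORS];
static unsigned int warnings;    /* stands in for the do_IRQ() message */

/*
 * CPU going down: release a vector. If an interrupt was still pending
 * on it, record that one stray delivery is to be expected.
 */
static void release_vector(int vector, bool pending)
{
    assert(vector >= 0 && vector < MODEL_VECTORS);
    vector_irq[vector] = pending ? VECTOR_RETRIGGERED : VECTOR_UNDEFINED;
}

/* Returns true if the interrupt was dispatched to a handler. */
static bool handle_vector(int vector)
{
    int irq;

    assert(vector >= 0 && vector < MODEL_VECTORS);
    irq = vector_irq[vector];

    if (irq >= 0)
        return true;                            /* a handler would run here */

    if (irq == VECTOR_RETRIGGERED)
        vector_irq[vector] = VECTOR_UNDEFINED;  /* expected once: stay quiet */
    else
        warnings++;                             /* genuinely unexpected */

    return false;
}
```

With a single -1 value the two cases cannot be told apart, so the expected stray interrupt after migration produces the same warning as a real bug.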
[tip:irq/core] x86/irq: Fix do_IRQ() interrupt warning for cpu hotplug retriggered irqs
Commit-ID: 9345005f4eed805308193658d12e4e7e9c261e74 Gitweb: http://git.kernel.org/tip/9345005f4eed805308193658d12e4e7e9c261e74 Author: Prarit Bhargava pra...@redhat.com AuthorDate: Sun, 5 Jan 2014 11:10:52 -0500 Committer: Ingo Molnar mi...@kernel.org CommitDate: Sun, 12 Jan 2014 13:13:02 +0100 x86/irq: Fix do_IRQ() interrupt warning for cpu hotplug retriggered irqs During heavy CPU-hotplug operations the following spurious kernel warnings can trigger: do_IRQ: No ... irq handler for vector (irq -1) [ See: https://bugzilla.kernel.org/show_bug.cgi?id=64831 ] When downing a cpu it is possible that there are unhandled irqs left in the APIC IRR register. The following code path shows how the problem can occur: 1. CPU 5 is to go down. 2. cpu_disable() on CPU 5 executes with interrupt flag cleared by local_irq_save() via stop_machine(). 3. IRQ 12 asserts on CPU 5, setting IRR but not ISR because interrupt flag is cleared (CPU unable to handle the irq) 4. IRQs are migrated off of CPU 5, and the vectors' irqs are set to -1. 5. stop_machine() finishes cpu_disable() 6. cpu_die() for CPU 5 executes in normal context. 7. CPU 5 attempts to handle IRQ 12 because the IRR is set for IRQ 12. The code attempts to find the vector's IRQ and cannot because it has been set to -1. 8. do_IRQ() displays a warning about CPU 5 IRQ 12. I added a debug printk to output which CPU & vector was retriggered and discovered that we are getting bogus events. I see a 100% correlation between this debug printk in fixup_irqs() and the do_IRQ() warning. This patchset resolves this by adding definitions for VECTOR_UNDEFINED(-1) and VECTOR_RETRIGGERED(-2) and modifying the code to use them. 
Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=64831 Signed-off-by: Prarit Bhargava pra...@redhat.com Reviewed-by: Rui Wang rui.y.w...@intel.com Cc: Michel Lespinasse wal...@google.com Cc: Seiji Aguchi seiji.agu...@hds.com Cc: Yang Zhang yang.z.zh...@intel.com Cc: Paul Gortmaker paul.gortma...@windriver.com Cc: janet.mor...@intel.com Cc: tony.l...@intel.com Cc: ruiv.w...@gmail.com Link: http://lkml.kernel.org/r/1388938252-16627-1-git-send-email-pra...@redhat.com [ Cleaned up the code a bit. ] Signed-off-by: Ingo Molnar mi...@kernel.org --- arch/x86/include/asm/hw_irq.h | 3 +++ arch/x86/kernel/apic/io_apic.c | 18 +- arch/x86/kernel/irq.c | 19 +-- arch/x86/kernel/irqinit.c | 4 ++-- 4 files changed, 27 insertions(+), 17 deletions(-) diff --git a/arch/x86/include/asm/hw_irq.h b/arch/x86/include/asm/hw_irq.h index cba45d9..67d69b8 100644 --- a/arch/x86/include/asm/hw_irq.h +++ b/arch/x86/include/asm/hw_irq.h @@ -191,6 +191,9 @@ extern void (*__initconst interrupt[NR_VECTORS-FIRST_EXTERNAL_VECTOR])(void); #define trace_interrupt interrupt #endif +#define VECTOR_UNDEFINED -1 +#define VECTOR_RETRIGGERED -2 + typedef int vector_irq_t[NR_VECTORS]; DECLARE_PER_CPU(vector_irq_t, vector_irq); extern void setup_vector_irq(int cpu); diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c index e63a5bd..6df0b66 100644 --- a/arch/x86/kernel/apic/io_apic.c +++ b/arch/x86/kernel/apic/io_apic.c @@ -1142,9 +1142,10 @@ next: if (test_bit(vector, used_vectors)) goto next; - for_each_cpu_and(new_cpu, tmp_mask, cpu_online_mask) - if (per_cpu(vector_irq, new_cpu)[vector] != -1) + for_each_cpu_and(new_cpu, tmp_mask, cpu_online_mask) { + if (per_cpu(vector_irq, new_cpu)[vector] > VECTOR_UNDEFINED) goto next; + } /* Found one! 
*/ current_vector = vector; current_offset = offset; @@ -1183,7 +1184,7 @@ static void __clear_irq_vector(int irq, struct irq_cfg *cfg) vector = cfg->vector; for_each_cpu_and(cpu, cfg->domain, cpu_online_mask) - per_cpu(vector_irq, cpu)[vector] = -1; + per_cpu(vector_irq, cpu)[vector] = VECTOR_UNDEFINED; cfg->vector = 0; cpumask_clear(cfg->domain); @@ -1191,11 +1192,10 @@ static void __clear_irq_vector(int irq, struct irq_cfg *cfg) if (likely(!cfg->move_in_progress)) return; for_each_cpu_and(cpu, cfg->old_domain, cpu_online_mask) { - for (vector = FIRST_EXTERNAL_VECTOR; vector < NR_VECTORS; - vector++) { + for (vector = FIRST_EXTERNAL_VECTOR; vector < NR_VECTORS; vector++) { if (per_cpu(vector_irq, cpu)[vector] != irq) continue; - per_cpu(vector_irq, cpu)[vector] = -1; + per_cpu(vector_irq, cpu)[vector] = VECTOR_UNDEFINED; break; } } @@ -1228,12 +1228,12 @@ void
[tip:x86/urgent] x86: Add check for number of available vectors before CPU down
Commit-ID: 5fd782a0553cf9572bd38cb877ee6fbf070ef651 Gitweb: http://git.kernel.org/tip/5fd782a0553cf9572bd38cb877ee6fbf070ef651 Author: Prarit Bhargava AuthorDate: Fri, 20 Dec 2013 10:50:09 -0500 Committer: H. Peter Anvin CommitDate: Fri, 20 Dec 2013 15:24:04 -0800 x86: Add check for number of available vectors before CPU down Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=64791 When a cpu is downed on a system, the irqs on the cpu are assigned to other cpus. It is possible, however, that when a cpu is downed there aren't enough free vectors on the remaining cpus to account for the vectors from the cpu that is being downed. This results in an interesting "overflow" condition where irqs are "assigned" to a CPU but are not handled. For example, when downing cpus on a 1-64 logical processor system: [ 232.021745] smpboot: CPU 61 is now offline [ 238.480275] smpboot: CPU 62 is now offline [ 245.991080] [ cut here ] [ 245.996270] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:264 dev_watchdog+0x246/0x250() [ 246.005688] NETDEV WATCHDOG: p786p1 (ixgbe): transmit queue 0 timed out [ 246.013070] Modules linked in: lockd sunrpc iTCO_wdt iTCO_vendor_support sb_edac ixgbe microcode e1000e pcspkr joydev edac_core lpc_ich ioatdma ptp mdio mfd_core i2c_i801 dca pps_core i2c_core wmi acpi_cpufreq isci libsas scsi_transport_sas [ 246.037633] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.12.0+ #14 [ 246.044451] Hardware name: Intel Corporation S4600LH ../SVRBD-ROW_T, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013 [ 246.057371] 0009 88081fa03d40 8164fbf6 88081fa0ee48 [ 246.065728] 88081fa03d90 88081fa03d80 81054ecc 88081fa13040 [ 246.074073] 88200cce 0040 [ 246.082430] Call Trace: [ 246.085174][] dump_stack+0x46/0x58 [ 246.091633] [] warn_slowpath_common+0x8c/0xc0 [ 246.098352] [] warn_slowpath_fmt+0x46/0x50 [ 246.104786] [] dev_watchdog+0x246/0x250 [ 246.110923] [] ? dev_deactivate_queue.constprop.31+0x80/0x80 [ 246.119097] [] call_timer_fn+0x3a/0x110 [ 246.125224] [] ? 
update_process_times+0x6f/0x80 [ 246.132137] [] ? dev_deactivate_queue.constprop.31+0x80/0x80 [ 246.140308] [] run_timer_softirq+0x1f0/0x2a0 [ 246.146933] [] __do_softirq+0xe0/0x220 [ 246.152976] [] call_softirq+0x1c/0x30 [ 246.158920] [] do_softirq+0x55/0x90 [ 246.164670] [] irq_exit+0xa5/0xb0 [ 246.170227] [] smp_apic_timer_interrupt+0x4a/0x60 [ 246.177324] [] apic_timer_interrupt+0x6a/0x70 [ 246.184041][] ? cpuidle_enter_state+0x5b/0xe0 [ 246.191559] [] ? cpuidle_enter_state+0x57/0xe0 [ 246.198374] [] cpuidle_idle_call+0xbd/0x200 [ 246.204900] [] arch_cpu_idle+0xe/0x30 [ 246.210846] [] cpu_startup_entry+0xd0/0x250 [ 246.217371] [] rest_init+0x77/0x80 [ 246.223028] [] start_kernel+0x3ee/0x3fb [ 246.229165] [] ? repair_env_string+0x5e/0x5e [ 246.235787] [] x86_64_start_reservations+0x2a/0x2c [ 246.242990] [] x86_64_start_kernel+0xf8/0xfc [ 246.249610] ---[ end trace fb74fdef54d79039 ]--- [ 246.254807] ixgbe :c2:00.0 p786p1: initiating reset due to tx timeout [ 246.262489] ixgbe :c2:00.0 p786p1: Reset adapter Last login: Mon Nov 11 08:35:14 from 10.18.17.119 [root@(none) ~]# [ 246.792676] ixgbe :c2:00.0 p786p1: detected SFP+: 5 [ 249.231598] ixgbe :c2:00.0 p786p1: NIC Link is Up 10 Gbps, Flow Control: RX/TX [ 246.792676] ixgbe :c2:00.0 p786p1: detected SFP+: 5 [ 249.231598] ixgbe :c2:00.0 p786p1: NIC Link is Up 10 Gbps, Flow Control: RX/TX (last lines keep repeating. ixgbe driver is dead until module reload.) If the downed cpu has more vectors than are free on the remaining cpus on the system, it is possible that some vectors are "orphaned" even though they are assigned to a cpu. In this case, since the ixgbe driver had a watchdog, the watchdog fired and notified that something was wrong. This patch adds a function, check_vectors(), that compares the number of vectors on the CPU going down with the number of vectors available on the system. If there aren't enough vectors for the CPU to go down, an error is returned and propagated back to userspace. 
v2: Do not need to look at percpu irqs v3: Need to check affinity to prevent counting of MSIs in IOAPIC Lowest Priority Mode Signed-off-by: Prarit Bhargava Link: http://lkml.kernel.org/r/1387554609-9823-1-git-send-email-pra...@redhat.com Reviewed-by: Andi Kleen Cc: Michel Lespinasse Cc: Seiji Aguchi Cc: Yang Zhang Cc: Paul Gortmaker Cc: Janet Morgan Cc: Tony Luck Cc: Ruiv Wang Cc: Gong Chen Signed-off-by: H. Peter Anvin Cc: # see note below [ hpa: I'm tagging this for -stable, as it is a serious failure on these large systems, but this is definitely a policy call for the -stable maintainers. ] --- arch/x86/include/asm/irq.h | 1 + arch/x86/kernel/irq.c | 44
[tip:x86/urgent] x86: Add check for number of available vectors before CPU down
Commit-ID: 5fd782a0553cf9572bd38cb877ee6fbf070ef651 Gitweb: http://git.kernel.org/tip/5fd782a0553cf9572bd38cb877ee6fbf070ef651 Author: Prarit Bhargava pra...@redhat.com AuthorDate: Fri, 20 Dec 2013 10:50:09 -0500 Committer: H. Peter Anvin h...@linux.intel.com CommitDate: Fri, 20 Dec 2013 15:24:04 -0800 x86: Add check for number of available vectors before CPU down Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=64791 When a cpu is downed on a system, the irqs on the cpu are assigned to other cpus. It is possible, however, that when a cpu is downed there aren't enough free vectors on the remaining cpus to account for the vectors from the cpu that is being downed. This results in an interesting overflow condition where irqs are assigned to a CPU but are not handled. For example, when downing cpus on a 1-64 logical processor system: snip [ 232.021745] smpboot: CPU 61 is now offline [ 238.480275] smpboot: CPU 62 is now offline [ 245.991080] [ cut here ] [ 245.996270] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:264 dev_watchdog+0x246/0x250() [ 246.005688] NETDEV WATCHDOG: p786p1 (ixgbe): transmit queue 0 timed out [ 246.013070] Modules linked in: lockd sunrpc iTCO_wdt iTCO_vendor_support sb_edac ixgbe microcode e1000e pcspkr joydev edac_core lpc_ich ioatdma ptp mdio mfd_core i2c_i801 dca pps_core i2c_core wmi acpi_cpufreq isci libsas scsi_transport_sas [ 246.037633] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.12.0+ #14 [ 246.044451] Hardware name: Intel Corporation S4600LH ../SVRBD-ROW_T, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013 [ 246.057371] 0009 88081fa03d40 8164fbf6 88081fa0ee48 [ 246.065728] 88081fa03d90 88081fa03d80 81054ecc 88081fa13040 [ 246.074073] 88200cce 0040 [ 246.082430] Call Trace: [ 246.085174] IRQ [8164fbf6] dump_stack+0x46/0x58 [ 246.091633] [81054ecc] warn_slowpath_common+0x8c/0xc0 [ 246.098352] [81054fb6] warn_slowpath_fmt+0x46/0x50 [ 246.104786] [815710d6] dev_watchdog+0x246/0x250 [ 246.110923] [81570e90] ? 
dev_deactivate_queue.constprop.31+0x80/0x80 [ 246.119097] [8106092a] call_timer_fn+0x3a/0x110 [ 246.125224] [8106280f] ? update_process_times+0x6f/0x80 [ 246.132137] [81570e90] ? dev_deactivate_queue.constprop.31+0x80/0x80 [ 246.140308] [81061db0] run_timer_softirq+0x1f0/0x2a0 [ 246.146933] [81059a80] __do_softirq+0xe0/0x220 [ 246.152976] [8165fedc] call_softirq+0x1c/0x30 [ 246.158920] [810045f5] do_softirq+0x55/0x90 [ 246.164670] [81059d35] irq_exit+0xa5/0xb0 [ 246.170227] [8166062a] smp_apic_timer_interrupt+0x4a/0x60 [ 246.177324] [8165f40a] apic_timer_interrupt+0x6a/0x70 [ 246.184041] EOI [81505a1b] ? cpuidle_enter_state+0x5b/0xe0 [ 246.191559] [81505a17] ? cpuidle_enter_state+0x57/0xe0 [ 246.198374] [81505b5d] cpuidle_idle_call+0xbd/0x200 [ 246.204900] [8100b7ae] arch_cpu_idle+0xe/0x30 [ 246.210846] [810a47b0] cpu_startup_entry+0xd0/0x250 [ 246.217371] [81646b47] rest_init+0x77/0x80 [ 246.223028] [81d09e8e] start_kernel+0x3ee/0x3fb [ 246.229165] [81d0989f] ? repair_env_string+0x5e/0x5e [ 246.235787] [81d095a5] x86_64_start_reservations+0x2a/0x2c [ 246.242990] [81d0969f] x86_64_start_kernel+0xf8/0xfc [ 246.249610] ---[ end trace fb74fdef54d79039 ]--- [ 246.254807] ixgbe :c2:00.0 p786p1: initiating reset due to tx timeout [ 246.262489] ixgbe :c2:00.0 p786p1: Reset adapter Last login: Mon Nov 11 08:35:14 from 10.18.17.119 [root@(none) ~]# [ 246.792676] ixgbe :c2:00.0 p786p1: detected SFP+: 5 [ 249.231598] ixgbe :c2:00.0 p786p1: NIC Link is Up 10 Gbps, Flow Control: RX/TX [ 246.792676] ixgbe :c2:00.0 p786p1: detected SFP+: 5 [ 249.231598] ixgbe :c2:00.0 p786p1: NIC Link is Up 10 Gbps, Flow Control: RX/TX (last lines keep repeating. ixgbe driver is dead until module reload.) If the downed cpu has more vectors than are free on the remaining cpus on the system, it is possible that some vectors are orphaned even though they are assigned to a cpu. In this case, since the ixgbe driver had a watchdog, the watchdog fired and notified that something was wrong. 
This patch adds a function, check_vectors(), that compares the number of vectors on the CPU going down with the number of vectors available on the system. If there aren't enough vectors for the CPU to go down, an error is returned and propagated back to userspace. v2: Do not need to look at percpu irqs v3: Need to check affinity to prevent counting of MSIs in IOAPIC Lowest Priority Mode Signed-off-by: Prarit Bhargava pra...@redhat.com Link: http://lkml.kernel.org/r/1387554609-9823-1-git-send-email-pra...@redhat.com Reviewed-by: Andi Kleen