[tip:perf/urgent] perf/x86/intel: Add support/quirk for the MISPREDICT bit on Knights Landing CPUs
Commit-ID:  16160c1946b702dcfa95ef63389a56deb2f1c7cb
Gitweb:     https://git.kernel.org/tip/16160c1946b702dcfa95ef63389a56deb2f1c7cb
Author:     Jacek Tomaka
AuthorDate: Thu, 2 Aug 2018 09:38:30 +0800
Committer:  Ingo Molnar
CommitDate: Mon, 10 Sep 2018 10:03:01 +0200

perf/x86/intel: Add support/quirk for the MISPREDICT bit on Knights Landing CPUs

Problem: perf did not show branch predicted/mispredicted bit in brstack.

Output of perf -F brstack for profile collected

Before:

 0x4fdbcd/0x4fdc03/-/-/-/0
 0x45f4c1/0x4fdba0/-/-/-/0
 0x45f544/0x45f4bb/-/-/-/0
 0x45f555/0x45f53c/-/-/-/0
 0x7f66901cc24b/0x45f555/-/-/-/0
 0x7f66901cc22e/0x7f66901cc23d/-/-/-/0
 0x7f66901cc1ff/0x7f66901cc20f/-/-/-/0
 0x7f66901cc1e8/0x7f66901cc1fc/-/-/-/0

After:

 0x4fdbcd/0x4fdc03/P/-/-/0
 0x45f4c1/0x4fdba0/P/-/-/0
 0x45f544/0x45f4bb/P/-/-/0
 0x45f555/0x45f53c/P/-/-/0
 0x7f66901cc24b/0x45f555/P/-/-/0
 0x7f66901cc22e/0x7f66901cc23d/P/-/-/0
 0x7f66901cc1ff/0x7f66901cc20f/P/-/-/0
 0x7f66901cc1e8/0x7f66901cc1fc/P/-/-/0

Cause:

As mentioned in Software Development Manual vol 3, 17.4.8.1,
IA32_PERF_CAPABILITIES[5:0] indicates the format of the address that is
stored in the LBR stack. Knights Landing reports 1 (LBR_FORMAT_LIP) as
its format. Despite that, registers containing FROM address of the branch,
do have MISPREDICT bit but because of the format indicated in
IA32_PERF_CAPABILITIES[5:0], LBR did not read MISPREDICT bit.

Solution:

Teach LBR about above Knights Landing quirk and make it read MISPREDICT bit.

Signed-off-by: Jacek Tomaka
Signed-off-by: Peter Zijlstra (Intel)
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/20180802013830.10600-1-jac...@dugeo.com
Signed-off-by: Ingo Molnar
---
 arch/x86/events/intel/lbr.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
index f3e006bed9a7..c88ed39582a1 100644
--- a/arch/x86/events/intel/lbr.c
+++ b/arch/x86/events/intel/lbr.c
@@ -1272,4 +1272,8 @@ void intel_pmu_lbr_init_knl(void)

 	x86_pmu.lbr_sel_mask = LBR_SEL_MASK;
 	x86_pmu.lbr_sel_map  = snb_lbr_sel_map;
+
+	/* Knights Landing does have MISPREDICT bit */
+	if (x86_pmu.intel_cap.lbr_format == LBR_FORMAT_LIP)
+		x86_pmu.intel_cap.lbr_format = LBR_FORMAT_EIP_FLAGS;
 }
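To illustrate what "reading the MISPREDICT bit" means for a single LBR FROM value, here is a minimal user-space sketch. It is not the kernel code. It assumes that in the EIP_FLAGS layout the flag sits in bit 63 of the FROM value and that the address is recovered by sign-extending from bit 62. The names `struct lbr_entry`, `decode_from` and `brstack_flag` are hypothetical and exist only for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the MISPRED flag is bit 63 of the FROM value. */
#define LBR_FROM_FLAG_MISPRED (1ULL << 63)

/* Hypothetical decoded view of one FROM value. */
struct lbr_entry {
	uint64_t from;      /* branch source address */
	bool     has_flags; /* did the format tell us to look at the flag? */
	bool     mispred;   /* only meaningful when has_flags is true */
};

/* Decode one raw FROM value; format_has_flags models LIP vs EIP_FLAGS. */
static struct lbr_entry decode_from(uint64_t raw, bool format_has_flags)
{
	struct lbr_entry e = { .from = raw, .has_flags = format_has_flags,
			       .mispred = false };

	if (format_has_flags) {
		e.mispred = (raw & LBR_FROM_FLAG_MISPRED) != 0;
		/* Drop the flag bit, then sign-extend the address from bit 62. */
		e.from = raw & ~LBR_FROM_FLAG_MISPRED;
		if (e.from & (1ULL << 62))
			e.from |= LBR_FROM_FLAG_MISPRED;
	}
	return e;
}

/* Character shown in the third brstack field: M, P, or - when unknown. */
static char brstack_flag(const struct lbr_entry *e)
{
	if (!e->has_flags)
		return '-';
	return e->mispred ? 'M' : 'P';
}
```

With `format_has_flags` false, which models the format Knights Landing reports, the third brstack field stays `-`, as in the "Before" output. With it true, the same raw value yields `P` or `M`.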
Hugepages mixed with stacks in process address space
Hello,

I was trying to track down the performance differences of one of my applications between running it on the kernel used in CentOS 7.4 and the latest 4.x version. On 4.x kernels its performance varied from run to run, and the variability was more than 30%.

Bisecting showed that my issue was introduced by:

fd8526ad14c182605e42b64646344b95befd9f94: x86/mm: Implement ASLR for hugetlb mappings

However, it was not the ASLR aspect of that commit that created the issue, but the change from bottom-up to top-down unmapped area lookup when allocating huge pages. After that change, the huge page allocations could become intertwined with stacks. Before, the stacks and huge pages were on opposite sides of the process address space.

The machine I am seeing it on is a Knights Landing 7250, with 68 cores x 4 hyper-threads. My application spawns 272 threads, and each thread allocates its own memory (a couple of 2MB huge pages) and does some computation dominated by memory accesses.

My theory is that because KNL has an 8-way 2MB TLB, when the huge pages are exactly 8 pages apart they collide. And this is where the variability comes from: if the stacks come in between, they increase the chances of the huge pages colliding.

I do realise that the application is (I am) doing a few things dubiously: it allocates memory on each thread and each huge page separately. But I thought you might want to know about this behaviour change.

When I allocate all my memory before I start threads, the problem goes away.

/proc/PID/maps:

After change:

7f5e06a00000-7f5e06c00000 rw-p 00000000 00:0f 31809 /anon_hugepage (deleted)
7f5e06c00000-7f5e06e00000 rw-p 00000000 00:0f 29767 /anon_hugepage (deleted)
7f5e06e00000-7f5e07000000 rw-p 00000000 00:0f 30787 /anon_hugepage (deleted)
7f5e07000000-7f5e07200000 rw-p 00000000 00:0f 30786 /anon_hugepage (deleted)
7f5e07200000-7f5e07400000 rw-p 00000000 00:0f 28744 /anon_hugepage (deleted)
7f5e075ff000-7f5e07600000 ---p 00000000 00:00 0
7f5e07600000-7f5e07e00000 rw-p 00000000 00:00 0
7f5e07e00000-7f5e08000000 rw-p 00000000 00:0f 30785 /anon_hugepage (deleted)
7f5e08000000-7f5e08021000 rw-p 00000000 00:00 0
7f5e08021000-7f5e0c000000 ---p 00000000 00:00 0
7f5e0c000000-7f5e0c021000 rw-p 00000000 00:00 0
7f5e0c021000-7f5e10000000 ---p 00000000 00:00 0
7f5e10000000-7f5e10021000 rw-p 00000000 00:00 0
7f5e10021000-7f5e14000000 ---p 00000000 00:00 0
7f5e14200000-7f5e14400000 rw-p 00000000 00:0f 29765 /anon_hugepage (deleted)
7f5e14400000-7f5e14600000 rw-p 00000000 00:0f 28743 /anon_hugepage (deleted)
7f5e14600000-7f5e14800000 rw-p 00000000 00:0f 29764 /anon_hugepage (deleted)
(...)

Before change:

2aaaaac00000-2aaaaae00000 rw-p 00000000 00:0f 25582 /anon_hugepage (deleted)
2aaaaae00000-2aaaab000000 rw-p 00000000 00:0f 25583 /anon_hugepage (deleted)
2aaaab000000-2aaaab200000 rw-p 00000000 00:0f 25584 /anon_hugepage (deleted)
2aaaab200000-2aaaab400000 rw-p 00000000 00:0f 25585 /anon_hugepage (deleted)
2aaaab400000-2aaaab600000 rw-p 00000000 00:0f 25601 /anon_hugepage (deleted)
2aaaab600000-2aaaab800000 rw-p 00000000 00:0f 25599 /anon_hugepage (deleted)
2aaaab800000-2aaaaba00000 rw-p 00000000 00:0f 25602 /anon_hugepage (deleted)
2aaaaba00000-2aaaabc00000 rw-p 00000000 00:0f 26652 /anon_hugepage (deleted)
(...)
7fc4f0021000-7fc4f4000000 ---p 00000000 00:00 0
7fc4f4000000-7fc4f4021000 rw-p 00000000 00:00 0
7fc4f4021000-7fc4f8000000 ---p 00000000 00:00 0
7fc4f8000000-7fc4f8021000 rw-p 00000000 00:00 0
7fc4f8021000-7fc4fc000000 ---p 00000000 00:00 0
7fc4fc000000-7fc4fc021000 rw-p 00000000 00:00 0
7fc4fc021000-7fc500000000 ---p 00000000 00:00 0
7fc500000000-7fc500021000 rw-p 00000000 00:00 0
7fc500021000-7fc504000000 ---p 00000000 00:00 0
7fc504000000-7fc504021000 rw-p 00000000 00:00 0
7fc504021000-7fc508000000 ---p 00000000 00:00 0
7fc508000000-7fc508021000 rw-p 00000000 00:00 0
7fc508021000-7fc50c000000 ---p 00000000 00:00 0
(...)

I was wondering whether this intertwining of stacks and huge pages is an expected feature of ASLR. If not, maybe mmap's MAP_STACK flag could finally start to be used by the kernel to keep all the stacks together in the process address space? Or should users just not allocate huge pages on separate threads?

MAP_STACK could also be used to mark a VMA as a stack mapping (if there are flags left), which would allow re-implementing what was removed by

65376df582174ffcec9e6471bf5b0dd79ba05e4a proc: revert /proc/<pid>/maps [stack:TID] annotation

correctly, as having these pieces of information in place would greatly simplify my investigation.

Regards.
Jacek Tomaka
[tip:x86/microcode] x86/microcode: Make revision and processor flags world-readable
Commit-ID:  f4661d293eb2d01dfc742982761a36fafe456d46
Gitweb:     https://git.kernel.org/tip/f4661d293eb2d01dfc742982761a36fafe456d46
Author:     Jacek Tomaka
AuthorDate: Sat, 25 Aug 2018 11:50:39 +0800
Committer:  Thomas Gleixner
CommitDate: Sun, 2 Sep 2018 14:09:13 +0200

x86/microcode: Make revision and processor flags world-readable

The microcode revision is already readable for non-root users via
/proc/cpuinfo. Thus, there's no reason to keep the same information
readable by root only in /sys/devices/system/cpu/cpuX/microcode/.

Make .../processor_flags world-readable too, while at it.

Reported-by: Tim Burgess
Signed-off-by: Jacek Tomaka
Signed-off-by: Borislav Petkov
Signed-off-by: Thomas Gleixner
Link: http://lkml.kernel.org/r/20180825035039.14409-1-jac...@dugeo.com
---
 arch/x86/kernel/cpu/microcode/core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/microcode/core.c b/arch/x86/kernel/cpu/microcode/core.c
index b9bc8a1a584e..2637ff09d6a0 100644
--- a/arch/x86/kernel/cpu/microcode/core.c
+++ b/arch/x86/kernel/cpu/microcode/core.c
@@ -666,8 +666,8 @@ static ssize_t pf_show(struct device *dev,
 }

 static DEVICE_ATTR_WO(reload);
-static DEVICE_ATTR(version, 0400, version_show, NULL);
-static DEVICE_ATTR(processor_flags, 0400, pf_show, NULL);
+static DEVICE_ATTR(version, 0444, version_show, NULL);
+static DEVICE_ATTR(processor_flags, 0444, pf_show, NULL);

 static struct attribute *mc_default_attrs[] = {
 	&dev_attr_version.attr,
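From user space, the effect of the 0400 to 0444 change is that an unprivileged process can open the sysfs attribute directly. A minimal sketch of such a reader follows. The helper names `read_first_line` and `read_microcode_version` are hypothetical, and the sketch assumes the attribute holds a single short text line.

```c
#include <stdio.h>
#include <string.h>

/*
 * Read the first line of a small text file into buf and strip the
 * trailing newline. Returns 0 on success, -1 if the file cannot be
 * opened (for example EACCES with mode 0400 and a non-root caller)
 * or is empty.
 */
static int read_first_line(const char *path, char *buf, size_t len)
{
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}

/* Read the microcode version attribute of one CPU from sysfs. */
static int read_microcode_version(unsigned int cpu, char *buf, size_t len)
{
	char path[128];

	snprintf(path, sizeof(path),
		 "/sys/devices/system/cpu/cpu%u/microcode/version", cpu);
	return read_first_line(path, buf, len);
}
```

On a kernel without the patch, `read_microcode_version` is expected to return -1 for a non-root caller because the open fails. With the patch it should fill `buf` with the version string.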
Re: [PATCH v4] perf/x86/intel: Add support for MISPREDICT bit on Knights Landing cpus
> On 2 Aug 2018, at 6:07 pm, Thomas Gleixner wrote:
>
> The actual purpose of sending V4 which is identical to V3 is?
>
>>
>> Signed-off-by: Jacek Tomaka
>> ---

Yes, thanks. I missed it initially, sorry.

> It's good practice to add a
>
>   V3 -> V4: changed foo
>   V2 -> V3: fixed bla
>   ...
>
> section to patches which have more than one version.

Sure. Would you like me to send it for this patch as well?

Regards.
Jacek Tomaka
Re: [PATCH] x86/microcode: allow non-root reading of microcode version and processor flags
On Mon, Aug 27, 2018 at 3:52 PM, Borislav Petkov wrote:

> Your From: is Jacek Tomaka and your SOB is different.
> Which one should I use?

Please use my SOB: Jacek Tomaka

> (Having a single email address for both is easier...)

Sorry about the trouble.

Regards.
Jacek Tomaka
Re: [PATCH] x86/microcode: allow non-root reading of microcode version and processor flags
> On 26 Aug 2018, at 7:52 pm, Boris Petkov wrote:
>
>> On August 25, 2018 6:50:39 AM GMT+03:00, Jacek Tomaka wrote:
>> /sys/devices/system/cpu/cpuX/microcode
>>
>> Before:
>> -r-------- processor_flags
>> -r-------- version
>>
>> After:
>> -r--r--r-- processor_flags
>> -r--r--r-- version
>>
>> Microcode version has been already readable for non root users via
>> /proc/cpuinfo. However it is easier to access it from
>> /sys/devices/system/cpu/cpuX/microcode/version
>
> Easier than /proc/cpuinfo?! Sorry, not really.

Why not?

> You'd need to elaborate in greater detail what exactly you're trying to
> achieve.

I am trying to get the microcode version from user space. Reading it from /proc/cpuinfo requires grepping/awking to extract the bits of information that are readily available in microcode/version.

Is there any reason why the same piece of information has different access permissions, depending on the way it is accessed?

Regards.
Jacek Tomaka
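To make the "grepping/awking" point concrete, here is a rough sketch of the extraction a program has to do when it starts from /proc/cpuinfo-style text. The function name `cpuinfo_microcode` is hypothetical, the sample value in the usage note is made up, and the sketch assumes the field appears as a line beginning with `microcode` followed by a colon and the value.

```c
#include <stddef.h>
#include <string.h>

/*
 * Find the first line starting with "microcode" in cpuinfo-style text
 * and copy the value after the colon into out.
 * Returns 0 on success, -1 if no such line is found.
 */
static int cpuinfo_microcode(const char *text, char *out, size_t len)
{
	const char *line = text;

	if (len == 0)
		return -1;

	while (line && *line) {
		if (strncmp(line, "microcode", 9) == 0) {
			const char *colon = strchr(line, ':');
			const char *eol = strchr(line, '\n');

			if (colon && (!eol || colon < eol)) {
				const char *v = colon + 1;
				size_t n;

				/* Skip the padding between ':' and the value. */
				while (*v == ' ' || *v == '\t')
					v++;
				n = eol ? (size_t)(eol - v) : strlen(v);
				if (n >= len)
					n = len - 1;
				memcpy(out, v, n);
				out[n] = '\0';
				return 0;
			}
		}
		/* Advance to the start of the next line, if any. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}
```

Given text containing a line such as `microcode	: 0x1b6`, the function copies `0x1b6` into `out`. Reading `microcode/version` from sysfs needs none of this scanning, which is the convenience argued for in the reply.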
[PATCH] x86/microcode: allow non-root reading of microcode version and processor flags
/sys/devices/system/cpu/cpuX/microcode

Before:
-r-------- processor_flags
-r-------- version

After:
-r--r--r-- processor_flags
-r--r--r-- version

Microcode version has been already readable for non root users via
/proc/cpuinfo. However it is easier to access it from
/sys/devices/system/cpu/cpuX/microcode/version

Reported-by: Tim Burgess
Signed-off-by: Jacek Tomaka
---
 arch/x86/kernel/cpu/microcode/core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/microcode/core.c b/arch/x86/kernel/cpu/microcode/core.c
index b9bc8a1a58..2637ff09d6 100644
--- a/arch/x86/kernel/cpu/microcode/core.c
+++ b/arch/x86/kernel/cpu/microcode/core.c
@@ -666,8 +666,8 @@ static ssize_t pf_show(struct device *dev,
 }

 static DEVICE_ATTR_WO(reload);
-static DEVICE_ATTR(version, 0400, version_show, NULL);
-static DEVICE_ATTR(processor_flags, 0400, pf_show, NULL);
+static DEVICE_ATTR(version, 0444, version_show, NULL);
+static DEVICE_ATTR(processor_flags, 0444, pf_show, NULL);

 static struct attribute *mc_default_attrs[] = {
 	&dev_attr_version.attr,
--
2.17.0
[PATCH v4] perf/x86/intel: Add support for MISPREDICT bit on Knights Landing cpus
From: Jacek Tomaka

Problem: perf did not show branch predicted/mispredicted bit in brstack.

Output of perf -F brstack for profile collected

Before:

 0x4fdbcd/0x4fdc03/-/-/-/0
 0x45f4c1/0x4fdba0/-/-/-/0
 0x45f544/0x45f4bb/-/-/-/0
 0x45f555/0x45f53c/-/-/-/0
 0x7f66901cc24b/0x45f555/-/-/-/0
 0x7f66901cc22e/0x7f66901cc23d/-/-/-/0
 0x7f66901cc1ff/0x7f66901cc20f/-/-/-/0
 0x7f66901cc1e8/0x7f66901cc1fc/-/-/-/0

After:

 0x4fdbcd/0x4fdc03/P/-/-/0
 0x45f4c1/0x4fdba0/P/-/-/0
 0x45f544/0x45f4bb/P/-/-/0
 0x45f555/0x45f53c/P/-/-/0
 0x7f66901cc24b/0x45f555/P/-/-/0
 0x7f66901cc22e/0x7f66901cc23d/P/-/-/0
 0x7f66901cc1ff/0x7f66901cc20f/P/-/-/0
 0x7f66901cc1e8/0x7f66901cc1fc/P/-/-/0

Cause:

As mentioned in Software Development Manual vol 3, 17.4.8.1,
IA32_PERF_CAPABILITIES[5:0] indicates the format of the address that is
stored in the LBR stack. Knights Landing reports 1 (LBR_FORMAT_LIP) as
its format. Despite that, registers containing FROM address of the branch,
do have MISPREDICT bit but because of the format indicated in
IA32_PERF_CAPABILITIES[5:0], LBR did not read MISPREDICT bit.

Solution:

Teach LBR about above Knights Landing quirk and make it read MISPREDICT bit.

Signed-off-by: Jacek Tomaka
---
 arch/x86/events/intel/lbr.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
index cf372b9055..81fe5047c6 100644
--- a/arch/x86/events/intel/lbr.c
+++ b/arch/x86/events/intel/lbr.c
@@ -1230,4 +1230,8 @@ void intel_pmu_lbr_init_knl(void)

 	x86_pmu.lbr_sel_mask = LBR_SEL_MASK;
 	x86_pmu.lbr_sel_map  = snb_lbr_sel_map;
+
+	/* Knights Landing does have MISPREDICT bit */
+	if (x86_pmu.intel_cap.lbr_format == LBR_FORMAT_LIP)
+		x86_pmu.intel_cap.lbr_format = LBR_FORMAT_EIP_FLAGS;
 }
--
2.17.0
[PATCH v3] perf/x86/intel: Add support for MISPREDICT bit on Knights Landing cpus
From: Jacek Tomaka

Problem: perf did not show branch predicted/mispredicted bit in brstack.

Output of perf -F brstack for profile collected

Before:

 0x4fdbcd/0x4fdc03/-/-/-/0
 0x45f4c1/0x4fdba0/-/-/-/0
 0x45f544/0x45f4bb/-/-/-/0
 0x45f555/0x45f53c/-/-/-/0
 0x7f66901cc24b/0x45f555/-/-/-/0
 0x7f66901cc22e/0x7f66901cc23d/-/-/-/0
 0x7f66901cc1ff/0x7f66901cc20f/-/-/-/0
 0x7f66901cc1e8/0x7f66901cc1fc/-/-/-/0

After:

 0x4fdbcd/0x4fdc03/P/-/-/0
 0x45f4c1/0x4fdba0/P/-/-/0
 0x45f544/0x45f4bb/P/-/-/0
 0x45f555/0x45f53c/P/-/-/0
 0x7f66901cc24b/0x45f555/P/-/-/0
 0x7f66901cc22e/0x7f66901cc23d/P/-/-/0
 0x7f66901cc1ff/0x7f66901cc20f/P/-/-/0
 0x7f66901cc1e8/0x7f66901cc1fc/P/-/-/0

Cause:

As mentioned in Software Development Manual vol 3, 17.4.8.1,
IA32_PERF_CAPABILITIES[5:0] indicates the format of the address that is
stored in the LBR stack. Knights Landing reports 1 (LBR_FORMAT_LIP) as
its format. Despite that, registers containing FROM address of the branch,
do have MISPREDICT bit but because of the format indicated in
IA32_PERF_CAPABILITIES[5:0], LBR did not read MISPREDICT bit.

Solution:

Teach LBR about above Knights Landing quirk and make it read MISPREDICT bit.
---
 arch/x86/events/intel/lbr.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
index cf372b9055..81fe5047c6 100644
--- a/arch/x86/events/intel/lbr.c
+++ b/arch/x86/events/intel/lbr.c
@@ -1230,4 +1230,8 @@ void intel_pmu_lbr_init_knl(void)

 	x86_pmu.lbr_sel_mask = LBR_SEL_MASK;
 	x86_pmu.lbr_sel_map  = snb_lbr_sel_map;
+
+	/* Knights Landing does have MISPREDICT bit */
+	if (x86_pmu.intel_cap.lbr_format == LBR_FORMAT_LIP)
+		x86_pmu.intel_cap.lbr_format = LBR_FORMAT_EIP_FLAGS;
 }
--
2.17.0
Re: [PATCH v2] perf/x86/intel: Add support for MISPREDICT bit on Knights Landing cpus
Ah, right:

/*
 * Due to lack of segmentation in Linux the effective address (offset)
 * is the same as the linear address, allowing us to merge the LIP and EIP
 * LBR formats.
 */

Yeah, LBR_FORMAT_EIP_FLAGS is ok as well. Would it be preferred?

On Tue, Jul 31, 2018 at 12:29 AM, Jacek Tomaka wrote:

> I do not understand the difference between linear address vs effective
> address but LBR_FORMAT_EIP_FLAGS implies effective address, no?
>
> On Tue, Jul 31, 2018 at 12:17 AM, Peter Zijlstra wrote:
>
>> On Mon, Jul 30, 2018 at 10:28:13PM +0800, Jacek Tomaka wrote:
>> > From: Jacek Tomaka
>> >
>> > Problem: perf did not show branch predicted/mispredicted bit in brstack.
>> >
>> > Output of perf -F brstack for profile collected
>> >
>> > Before:
>> > 0x4fdbcd/0x4fdc03/-/-/-/0
>> > 0x45f4c1/0x4fdba0/-/-/-/0
>> > 0x45f544/0x45f4bb/-/-/-/0
>> > 0x45f555/0x45f53c/-/-/-/0
>> > 0x7f66901cc24b/0x45f555/-/-/-/0
>> > 0x7f66901cc22e/0x7f66901cc23d/-/-/-/0
>> > 0x7f66901cc1ff/0x7f66901cc20f/-/-/-/0
>> > 0x7f66901cc1e8/0x7f66901cc1fc/-/-/-/0
>> >
>> > After:
>> > 0x4fdbcd/0x4fdc03/P/-/-/0
>> > 0x45f4c1/0x4fdba0/P/-/-/0
>> > 0x45f544/0x45f4bb/P/-/-/0
>> > 0x45f555/0x45f53c/P/-/-/0
>> > 0x7f66901cc24b/0x45f555/P/-/-/0
>> > 0x7f66901cc22e/0x7f66901cc23d/P/-/-/0
>> > 0x7f66901cc1ff/0x7f66901cc20f/P/-/-/0
>> > 0x7f66901cc1e8/0x7f66901cc1fc/P/-/-/0
>> >
>> > Cause:
>> > As mentioned in Software Development Manual vol 3, 17.4.8.1,
>> > IA32_PERF_CAPABILITIES[5:0] indicates the format of the address that is
>> > stored in the LBR stack. Knights Landing reports 1 (LBR_FORMAT_LIP) as
>> > its format. Despite that, registers containing FROM address of the branch,
>> > do have MISPREDICT bit but because of the format indicated in
>> > IA32_PERF_CAPABILITIES[5:0], LBR did not read MISPREDICT bit.
>> >
>> > Solution:
>> > Teach LBR about above Knights Landing quirk and make it read MISPREDICT bit.
>> > ---
>> >  arch/x86/events/intel/lbr.c | 6 +++++-
>> >  1 file changed, 5 insertions(+), 1 deletion(-)
>> >
>> > diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
>> > index cf372b9055..043aa09f3a 100644
>> > --- a/arch/x86/events/intel/lbr.c
>> > +++ b/arch/x86/events/intel/lbr.c
>> > @@ -19,7 +19,7 @@ enum {
>> >  	LBR_FORMAT_MAX_KNOWN	= LBR_FORMAT_TIME,
>> >  };
>> >
>> > -static const enum {
>> > +static enum {
>> >  	LBR_EIP_FLAGS		= 1,
>> >  	LBR_TSX			= 2,
>> >  } lbr_desc[LBR_FORMAT_MAX_KNOWN + 1] = {
>> > @@ -1230,4 +1230,8 @@ void intel_pmu_lbr_init_knl(void)
>> >
>> >  	x86_pmu.lbr_sel_mask = LBR_SEL_MASK;
>> >  	x86_pmu.lbr_sel_map  = snb_lbr_sel_map;
>> > +
>> > +	/* Knights Landing does have MISPREDICT bit */
>> > +	if (x86_pmu.intel_cap.lbr_format == LBR_FORMAT_LIP)
>> > +		lbr_desc[LBR_FORMAT_LIP] |= LBR_EIP_FLAGS;
>> >  }
>>
>> So why not set lbr_format to LBR_FORMAT_EIP_FLAGS ?
>
> --
> Jacek Tomaka
> Geophysical Software Developer
>
> DownUnder GeoSolutions
> 76 Kings Park Road
> West Perth 6005 WA, Australia
> tel +61 8 9287 4143
> jac...@dug.com
> www.dug.com

--
Jacek Tomaka
Geophysical Software Developer

DownUnder GeoSolutions
76 Kings Park Road
West Perth 6005 WA, Australia
tel +61 8 9287 4143
jac...@dug.com
www.dug.com
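The two approaches discussed in this thread can be compared with a small stand-alone model. This is a simplified sketch, not the kernel source: only LBR_FORMAT_LIP = 1, LBR_EIP_FLAGS = 1, LBR_TSX = 2 and LBR_FORMAT_MAX_KNOWN = LBR_FORMAT_TIME come from the mails above. The remaining enum values and the contents of the `lbr_desc` table are assumptions, and `knl_fixup_format` and `format_reads_mispredict` are hypothetical helpers.

```c
#include <stdbool.h>

/* LBR formats; only LIP = 1 is stated in the thread, the rest is assumed. */
enum lbr_format {
	LBR_FORMAT_32         = 0,
	LBR_FORMAT_LIP        = 1,
	LBR_FORMAT_EIP        = 2,
	LBR_FORMAT_EIP_FLAGS  = 3,
	LBR_FORMAT_EIP_FLAGS2 = 4,
	LBR_FORMAT_INFO       = 5,
	LBR_FORMAT_TIME       = 6,
	LBR_FORMAT_MAX_KNOWN  = LBR_FORMAT_TIME,
};

enum {
	LBR_EIP_FLAGS = 1,
	LBR_TSX       = 2,
};

/* Per-format description; stays const, unlike in the v2 patch. */
static const int lbr_desc[LBR_FORMAT_MAX_KNOWN + 1] = {
	[LBR_FORMAT_EIP_FLAGS]  = LBR_EIP_FLAGS,
	[LBR_FORMAT_EIP_FLAGS2] = LBR_EIP_FLAGS | LBR_TSX,
};

/*
 * The suggested approach: leave the table alone and rewrite the format
 * that the CPU reported when it is known to be misleading.
 */
static enum lbr_format knl_fixup_format(enum lbr_format reported)
{
	return reported == LBR_FORMAT_LIP ? LBR_FORMAT_EIP_FLAGS : reported;
}

/* Would the LBR read path look at the MISPREDICT flag for this format? */
static bool format_reads_mispredict(enum lbr_format fmt)
{
	return (lbr_desc[fmt] & LBR_EIP_FLAGS) != 0;
}
```

In this model the v2 patch corresponds to dropping `const` and OR-ing LBR_EIP_FLAGS into `lbr_desc[LBR_FORMAT_LIP]`, which changes the shared table. Remapping the reported format keeps the table read-only and confines the quirk to the Knights Landing init path.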
Re: [PATCH v2] perf/x86/intel: Add support for MISPREDICT bit on Knights Landing cpus
I do not understand the difference between linear address vs effective
address but LBR_FORMAT_EIP_FLAGS implies effective address, no?

On Tue, Jul 31, 2018 at 12:17 AM, Peter Zijlstra wrote:
> On Mon, Jul 30, 2018 at 10:28:13PM +0800, Jacek Tomaka wrote:
> > From: Jacek Tomaka
> >
> > Problem: perf did not show branch predicted/mispredicted bit in brstack.
> >
> > Output of perf -F brstack for profile collected
> >
> > Before:
> > 0x4fdbcd/0x4fdc03/-/-/-/0
> > 0x45f4c1/0x4fdba0/-/-/-/0
> > 0x45f544/0x45f4bb/-/-/-/0
> > 0x45f555/0x45f53c/-/-/-/0
> > 0x7f66901cc24b/0x45f555/-/-/-/0
> > 0x7f66901cc22e/0x7f66901cc23d/-/-/-/0
> > 0x7f66901cc1ff/0x7f66901cc20f/-/-/-/0
> > 0x7f66901cc1e8/0x7f66901cc1fc/-/-/-/0
> >
> > After:
> > 0x4fdbcd/0x4fdc03/P/-/-/0
> > 0x45f4c1/0x4fdba0/P/-/-/0
> > 0x45f544/0x45f4bb/P/-/-/0
> > 0x45f555/0x45f53c/P/-/-/0
> > 0x7f66901cc24b/0x45f555/P/-/-/0
> > 0x7f66901cc22e/0x7f66901cc23d/P/-/-/0
> > 0x7f66901cc1ff/0x7f66901cc20f/P/-/-/0
> > 0x7f66901cc1e8/0x7f66901cc1fc/P/-/-/0
> >
> > Cause:
> > As mentioned in Software Development Manual vol 3, 17.4.8.1,
> > IA32_PERF_CAPABILITIES[5:0] indicates the format of the address that is
> > stored in the LBR stack. Knights Landing reports 1 (LBR_FORMAT_LIP) as
> > its format. Despite that, registers containing FROM address of the branch,
> > do have MISPREDICT bit but because of the format indicated in
> > IA32_PERF_CAPABILITIES[5:0], LBR did not read MISPREDICT bit.
> >
> > Solution:
> > Teach LBR about above Knights Landing quirk and make it read MISPREDICT bit.
> > ---
> > arch/x86/events/intel/lbr.c | 6 +-
> > 1 file changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
> > index cf372b9055..043aa09f3a 100644
> > --- a/arch/x86/events/intel/lbr.c
> > +++ b/arch/x86/events/intel/lbr.c
> > @@ -19,7 +19,7 @@ enum {
> >  	LBR_FORMAT_MAX_KNOWN = LBR_FORMAT_TIME,
> >  };
> >
> > -static const enum {
> > +static enum {
> >  	LBR_EIP_FLAGS = 1,
> >  	LBR_TSX = 2,
> >  } lbr_desc[LBR_FORMAT_MAX_KNOWN + 1] = {
> > @@ -1230,4 +1230,8 @@ void intel_pmu_lbr_init_knl(void)
> >
> >  	x86_pmu.lbr_sel_mask = LBR_SEL_MASK;
> >  	x86_pmu.lbr_sel_map = snb_lbr_sel_map;
> > +
> > +	/* Knights Landing does have MISPREDICT bit */
> > +	if (x86_pmu.intel_cap.lbr_format == LBR_FORMAT_LIP)
> > +		lbr_desc[LBR_FORMAT_LIP] |= LBR_EIP_FLAGS;
> >  }
>
> So why not set lbr_format to LBR_FORMAT_EIP_FLAGS ?

--
Jacek Tomaka
Geophysical Software Developer
DownUnder GeoSolutions
76 Kings Park Road
West Perth 6005 WA, Australia
tel +61 8 9287 4143
jac...@dug.com
www.dug.com
[PATCH v2] perf/x86/intel: Add support for MISPREDICT bit on Knights Landing cpus
From: Jacek Tomaka

Problem: perf did not show branch predicted/mispredicted bit in brstack.

Output of perf -F brstack for profile collected

Before:
0x4fdbcd/0x4fdc03/-/-/-/0
0x45f4c1/0x4fdba0/-/-/-/0
0x45f544/0x45f4bb/-/-/-/0
0x45f555/0x45f53c/-/-/-/0
0x7f66901cc24b/0x45f555/-/-/-/0
0x7f66901cc22e/0x7f66901cc23d/-/-/-/0
0x7f66901cc1ff/0x7f66901cc20f/-/-/-/0
0x7f66901cc1e8/0x7f66901cc1fc/-/-/-/0

After:
0x4fdbcd/0x4fdc03/P/-/-/0
0x45f4c1/0x4fdba0/P/-/-/0
0x45f544/0x45f4bb/P/-/-/0
0x45f555/0x45f53c/P/-/-/0
0x7f66901cc24b/0x45f555/P/-/-/0
0x7f66901cc22e/0x7f66901cc23d/P/-/-/0
0x7f66901cc1ff/0x7f66901cc20f/P/-/-/0
0x7f66901cc1e8/0x7f66901cc1fc/P/-/-/0

Cause:
As mentioned in Software Development Manual vol 3, 17.4.8.1,
IA32_PERF_CAPABILITIES[5:0] indicates the format of the address that is
stored in the LBR stack. Knights Landing reports 1 (LBR_FORMAT_LIP) as
its format. Despite that, registers containing FROM address of the branch,
do have MISPREDICT bit but because of the format indicated in
IA32_PERF_CAPABILITIES[5:0], LBR did not read MISPREDICT bit.

Solution:
Teach LBR about above Knights Landing quirk and make it read MISPREDICT bit.
---
 arch/x86/events/intel/lbr.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
index cf372b9055..043aa09f3a 100644
--- a/arch/x86/events/intel/lbr.c
+++ b/arch/x86/events/intel/lbr.c
@@ -19,7 +19,7 @@ enum {
 	LBR_FORMAT_MAX_KNOWN = LBR_FORMAT_TIME,
 };
 
-static const enum {
+static enum {
 	LBR_EIP_FLAGS = 1,
 	LBR_TSX = 2,
 } lbr_desc[LBR_FORMAT_MAX_KNOWN + 1] = {
@@ -1230,4 +1230,8 @@ void intel_pmu_lbr_init_knl(void)
 
 	x86_pmu.lbr_sel_mask = LBR_SEL_MASK;
 	x86_pmu.lbr_sel_map = snb_lbr_sel_map;
+
+	/* Knights Landing does have MISPREDICT bit */
+	if (x86_pmu.intel_cap.lbr_format == LBR_FORMAT_LIP)
+		lbr_desc[LBR_FORMAT_LIP] |= LBR_EIP_FLAGS;
 }
--
2.17.0
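For readers who want to check the Before/After output mechanically, here is a minimal sketch of a parser for one brstack entry. It assumes the field order FROM/TO/prediction/in-transaction/abort/cycles, which is my reading of the perf-script documentation; only the third field (M, P or -) matters for this patch. `struct brstack_entry`, `parse_brstack_entry()` and `has_prediction_info()` are hypothetical names, not part of perf.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* One brstack entry as printed in the commit message, e.g.
 * "0x4fdbcd/0x4fdc03/P/-/-/0". Field meanings are an assumption. */
struct brstack_entry {
	uint64_t from;
	uint64_t to;
	char pred;	/* 'M' mispredicted, 'P' predicted, '-' unknown */
	char in_tx;	/* 'X' or '-' */
	char abort;	/* 'A' or '-' */
	unsigned int cycles;
};

/* Returns 1 when all six fields were parsed, 0 otherwise. */
static int parse_brstack_entry(const char *s, struct brstack_entry *e)
{
	int n = sscanf(s, "%" SCNx64 "/%" SCNx64 "/%c/%c/%c/%u",
		       &e->from, &e->to, &e->pred, &e->in_tx,
		       &e->abort, &e->cycles);
	return n == 6;
}

/* True when the entry carries a predicted/mispredicted verdict. */
static int has_prediction_info(const struct brstack_entry *e)
{
	return e->pred == 'M' || e->pred == 'P';
}

/* Usage: one line from "Before" and the matching line from "After". */
void brstack_selfcheck(void)
{
	struct brstack_entry before, after;

	assert(parse_brstack_entry("0x4fdbcd/0x4fdc03/-/-/-/0", &before));
	assert(parse_brstack_entry("0x4fdbcd/0x4fdc03/P/-/-/0", &after));
	assert(!has_prediction_info(&before));
	assert(has_prediction_info(&after));
}
```

Running such a check over a whole profile would show at a glance whether the kernel is reporting prediction bits at all.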
[PATCH] perf/x86/intel: Add support for MISPREDICT bit on Knights Landing cpus
From: Jacek Tomaka

Knights Landing supports half baked LBR_FORMAT_TIME format. The addresses
are linear but it does have MISPREDICT bit but nothing else. Unfortunately
IA32_PERF_CAPABILITIES[5:0] will report LBR_FORMAT_LIP. This change teaches
LBR about this Knights Landing quirk.
---
 arch/x86/events/intel/lbr.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
index cf372b9055..0f73e60315 100644
--- a/arch/x86/events/intel/lbr.c
+++ b/arch/x86/events/intel/lbr.c
@@ -1230,4 +1230,10 @@ void intel_pmu_lbr_init_knl(void)
 
 	x86_pmu.lbr_sel_mask = LBR_SEL_MASK;
 	x86_pmu.lbr_sel_map = snb_lbr_sel_map;
+
+	/* Knights Landing supports half baked LBR format. The addresses are linear but it does have MISPREDICT bit.
+	 * Unfortunately IA32_PERF_CAPABILITIES[5:0] will report LBR_FORMAT_LIP.
+	 */
+	if (x86_pmu.intel_cap.lbr_format == LBR_FORMAT_LIP)
+		lbr_desc[LBR_FORMAT_LIP] |= LBR_EIP_FLAGS;
 }
--
2.17.0
[PATCH] x86/cpuid: Add missing TLB cpuid values
From: jacek.tom...@poczta.fm

Make kernel print correct message upon boot on Intel Xeon Phi 7210 (and
others).

Before:
[ 0.320005] Last level dTLB entries: 4KB 0, 2MB 0, 4MB 0, 1GB 0

After:
[ 0.320005] Last level dTLB entries: 4KB 256, 2MB 128, 4MB 128, 1GB 16

The entries do exist in the official Intel's documentation but the type
column there is incorrect (states "Cache" where it should read "TLB")
https://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-software-developer-vol-2a-manual.html

Signed-off-by: Jacek Tomaka <jacek.tom...@poczta.fm>
---
 arch/x86/kernel/cpu/intel.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index b9693b80fc21..21e6cc52d56f 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -835,6 +835,9 @@ static const struct _tlb_table intel_tlb_table[] = {
 	{ 0x5d, TLB_DATA_4K_4M,	256,	" TLB_DATA 4 KByte and 4 MByte pages" },
 	{ 0x61, TLB_INST_4K,	48,	" TLB_INST 4 KByte pages, full associative" },
 	{ 0x63, TLB_DATA_1G,	4,	" TLB_DATA 1 GByte pages, 4-way set associative" },
+	{ 0x6b, TLB_DATA_4K,	256,	" TLB_DATA 4 KByte pages, 8-way associative" },
+	{ 0x6c, TLB_DATA_2M_4M,	128,	" TLB_DATA 2 MByte or 4 MByte pages, 8-way associative" },
+	{ 0x6d, TLB_DATA_1G,	16,	" TLB_DATA 1 GByte pages, fully associative" },
 	{ 0x76, TLB_INST_2M_4M,	8,	" TLB_INST 2-MByte or 4-MByte pages, fully associative" },
 	{ 0xb0, TLB_INST_4K,	128,	" TLB_INST 4 KByte pages, 4-way set associative" },
 	{ 0xb1, TLB_INST_2M_4M,	4,	" TLB_INST 2M pages, 4-way, 8 entries or 4M pages, 4-way entries" },
--
2.17.0
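To see why three missing table rows turn into a boot line full of zeros, the lookup can be modelled in user space. This is a simplified sketch rather than the kernel routine: `struct tlb_desc`, `struct dtlb_counts` and `account_descriptor()` are hypothetical names, the table holds only the three rows added by this patch, and keeping the largest value per page size is an assumption about how the counts are accumulated.

```c
#include <assert.h>
#include <stddef.h>

/* Page-size classes used by the three new rows (sketch-local enum). */
enum tlb_type { TLB_DATA_4K, TLB_DATA_2M_4M, TLB_DATA_1G };

struct tlb_desc {
	unsigned char descriptor;	/* CPUID leaf 2 descriptor byte */
	enum tlb_type type;
	unsigned int entries;
};

/* Only the rows added by the patch are modelled here. */
static const struct tlb_desc tlb_table[] = {
	{ 0x6b, TLB_DATA_4K,    256 },
	{ 0x6c, TLB_DATA_2M_4M, 128 },
	{ 0x6d, TLB_DATA_1G,     16 },
};

/* Last-level dTLB entry counts, as printed in the boot message. */
struct dtlb_counts {
	unsigned int k4, m2, m4, g1;
};

static void keep_max(unsigned int *slot, unsigned int entries)
{
	if (*slot < entries)
		*slot = entries;
}

/* Account one descriptor byte. An unknown descriptor changes nothing,
 * which is how a missing row shows up as "4KB 0, 2MB 0, ...". */
static void account_descriptor(unsigned char desc, struct dtlb_counts *c)
{
	size_t i;

	for (i = 0; i < sizeof(tlb_table) / sizeof(tlb_table[0]); i++) {
		if (tlb_table[i].descriptor != desc)
			continue;
		switch (tlb_table[i].type) {
		case TLB_DATA_4K:
			keep_max(&c->k4, tlb_table[i].entries);
			break;
		case TLB_DATA_2M_4M:
			keep_max(&c->m2, tlb_table[i].entries);
			keep_max(&c->m4, tlb_table[i].entries);
			break;
		case TLB_DATA_1G:
			keep_max(&c->g1, tlb_table[i].entries);
			break;
		}
		return;
	}
}

/* Usage: feeding 0x6b, 0x6c and 0x6d gives the "After" numbers. */
void tlb_selfcheck(void)
{
	struct dtlb_counts c = { 0, 0, 0, 0 };

	account_descriptor(0x6b, &c);
	account_descriptor(0x6c, &c);
	account_descriptor(0x6d, &c);
	assert(c.k4 == 256 && c.m2 == 128 && c.m4 == 128 && c.g1 == 16);
}
```

The single 2M/4M row feeds both the 2MB and the 4MB counters, which matches the "2MB 128, 4MB 128" pair in the After line.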
Re: NO_HZ_FULL and tick running within a reasonable amount of time
> > 48 c7 c7 e8 e1 fd a4 c6 05 6d
> > ---[ end trace f0c6a1afa55d130d ]---
> > clock_task: 6304138787 exec_start: 221487873
> >
> > That's a 6 second delay, it's huge!
>
> Could it be because you use Qemu and the virtualized CPUs got interrupted for a long while?

Hello,

I am seeing the following message as well, and I am running on real
hardware (Intel Xeon Phi 7250).

My kernel parameters include: nohz_full=1-271 noretpoline isolcpus=nohz

RCU threads are pinned to CPU zero:

for i in `pgrep rcu[^c]` ; do
	taskset -pc 0 $i
done

One thing I noticed, though, is that my date is not set properly:

-bash-4.2# date
Thu Jan 1 10:56:21 UTC 1970

Probably because of this?

[0.00] NO_HZ: Full dynticks CPUs: 1-271.
[0.00] Offload RCU callbacks from CPUs: 1-271.
[0.00] WARNING: Persistent clock returned invalid value!
[0.00] Check your CMOS/BIOS settings.

Here is the message I am seeing (the kernel is tainted because I stripped
debug symbols without re-signing):

[ 133.471425] WARNING: CPU: 0 PID: 2185 at kernel/sched/core.c:3124 sched_tick_remote+0xf5/0x100
[ 133.576313] Modules linked in: coretemp(E) nfsv4(E) dns_resolver(E) nfs(E) lockd(E) grace(E) sunrpc(E) fscache(E) bridge(E) tg3(E) ipmi_ssif(E) ipmi_devintf(E) ipmi_si(E) ipmi_msghandler(E) e1000(E) igb(E) e1000e(E) ixgbe(E) i40e(E) mlx5_core(E) mlx4_en(E) vxlan(E) ip6_udp_tunnel(E) udp_tunnel(E) ip_tunnel(E) mlx4_core(E) mlxfw(E) devlink(E) mdio(E) i2c_algo_bit(E) i2c_piix4(E) i2c_core(E) 8021q(E) mrp(E) garp(E) stp(E) llc(E) dca(E)
[ 134.044162] CPU: 0 PID: 2185 Comm: kworker/u544:2 Tainted: G E 4.16.0+ #1
[ 134.142670] Hardware name: Intel Corporation S7200AP/S7200AP, BIOS S72C610.86B.01.02.0072.041620172102 04/16/2017
[ 134.267638] Workqueue: events_unbound sched_tick_remote
[ 134.331240] RIP: 0010:sched_tick_remote+0xf5/0x100
[ 134.389540] RSP: 0018:b009c6653e60 EFLAGS: 00010006
[ 134.453132] RAX: 001e8285ce48 RBX: 91c8ebe62500 RCX: 91c8ebe4
[ 134.540003] RDX: b2d05e00 RSI: 91c5c3d5 RDI: 0031
[ 134.626875] RBP: 91c8ebe65e28 R08: 91c8ebca5e30 R09: 91c8ebca5e30
[ 134.713742] R10: R11: 0018 R12: 91c5c7d18000
[ 134.800609] R13: 91c5c7d2 R14: R15: 91c8ebe65e30
[ 134.887477] FS: () GS:91c8eb20() knlGS:
[ 134.985988] CS: 0010 DS: ES: CR0: 80050033
[ 135.055930] CR2: 7f8cd5352f70 CR3: 00036f40a000 CR4: 001406f0
[ 135.142801] Call Trace:
[ 135.172563] process_one_work+0x152/0x350
[ 135.221345] worker_thread+0x47/0x3e0
[ 135.265903] kthread+0xf5/0x130
[ 135.304109] ? max_active_store+0x80/0x80
[ 135.352893] ? kthread_bind+0x10/0x10
[ 135.397458] ret_from_fork+0x35/0x40
[ 135.440953] Code: 83 40 0b 00 00 48 85 c0 0f 85 5f ff ff ff e9 63 ff ff ff 80 3d dd ef 12 01 00 75 a0 48 89 34 24 e8 5e 55 00 00 48 8b 34 24 eb 91 <0f> 0b eb a5 0f 1f 80 00 00 00 00 0f 1f 44 00 00 41 56 41 55 41
[ 135.670765] ---[ end trace 7031db6a8ce43506 ]---

Regards.
--
Jacek Tomaka
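The "6 second delay" quoted at the top of this message follows directly from the two numbers printed there. The sketch below redoes that arithmetic. It assumes both values are nanoseconds and that the warning fires when the gap exceeds 3 seconds; the threshold is my recollection of the check in sched_tick_remote() and is not stated in this thread. `tick_delay_ns()`, `tick_is_late()` and `MAX_TICK_DELAY_NS` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Assumption: the remote tick is considered late after 3 seconds. */
#define MAX_TICK_DELAY_NS (3 * NSEC_PER_SEC)

/* Time the current task has run since its accounting was last updated. */
static uint64_t tick_delay_ns(uint64_t clock_task, uint64_t exec_start)
{
	return clock_task - exec_start;
}

/* Nonzero when the delay is beyond the assumed threshold. */
static int tick_is_late(uint64_t delta_ns)
{
	return delta_ns > MAX_TICK_DELAY_NS;
}

/* Usage: the values quoted in the thread. */
void tick_selfcheck(void)
{
	uint64_t delta = tick_delay_ns(6304138787ULL, 221487873ULL);

	assert(delta == 6082650914ULL);		/* about 6.08 seconds */
	assert(tick_is_late(delta));
}
```

Under these assumptions, any delay of roughly 6 seconds would trip the check regardless of whether the machine is virtualized.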