Re: [PATCH v3 26/27] powerpc/powernv/pmem: Expose the firmware version in sysfs

2020-03-01 Thread Andrew Donnellan

On 21/2/20 2:27 pm, Alastair D'Silva wrote:

From: Alastair D'Silva 

This information will be used by ndctl in userspace to help users identify
the device.


You should include the information from the subject line in the body of 
the commit message too.


I think this patch could probably be squashed in with the last one.

--
Andrew Donnellan  OzLabs, ADL Canberra
a...@linux.ibm.com IBM Australia Limited



Reg: KASLR backporting to 4.9 kernels for ppc platform.

2020-03-01 Thread Bala murugan
Hi,

Are there any plans to backport KASLR to the 4.9 kernel? How feasible is it
to backport it to the 4.9 kernel for PPC platforms?

Regards
S Balamurugan.


Re: [PATCH v3 0/6] implement KASLR for powerpc/fsl_booke/64

2020-03-01 Thread Jason Yan




On 2020/3/2 11:24, Scott Wood wrote:

On Mon, 2020-03-02 at 10:17 +0800, Jason Yan wrote:


On 2020/3/1 6:54, Scott Wood wrote:

On Sat, 2020-02-29 at 15:27 +0800, Jason Yan wrote:


Turning to %p may not be a good idea in this situation. So
for the REG logs printed when dumping the stack, we can disable them when
KASLR is enabled. For the REG logs in other places like show_regs(), only
privileged users can trigger them, and they are not combined with a symbol,
so I think it's OK to keep them.

diff --git a/arch/powerpc/kernel/process.c
b/arch/powerpc/kernel/process.c
index fad50db9dcf2..659c51f0739a 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -2068,7 +2068,10 @@ void show_stack(struct task_struct *tsk, unsigned
long *stack)
   newsp = stack[0];
   ip = stack[STACK_FRAME_LR_SAVE];
   if (!firstframe || ip != lr) {
-   printk("["REG"] ["REG"] %pS", sp, ip, (void
*)ip);
+   if (IS_ENABLED(CONFIG_RANDOMIZE_BASE))
+   printk("%pS", (void *)ip);
+   else
+   printk("["REG"] ["REG"] %pS", sp, ip,
(void *)ip);


This doesn't deal with "nokaslr" on the kernel command line.  It also doesn't
seem like something that every callsite should have to opencode, versus having
an appropriate format specifier that behaves as I described above (and I still
don't see why that format specifier should not be "%p").



Actually I still do not understand why we should print the raw value
here. When KALLSYMS is enabled we have symbol name  and  offset like
put_cred_rcu+0x108/0x110, and when KALLSYMS is disabled we have the raw
address.


I'm more concerned about the stack address for wading through a raw stack dump
(to find function call arguments, etc).  The return address does help confirm
that I'm on the right stack frame though, and also makes looking up a line
number slightly easier than having to look up a symbol address and then add
the offset (at least for non-module addresses).

As a random aside, the mismatch between Linux printing a hex offset and GDB
using decimal in disassembly is annoying...



OK, I will send an RFC patch to add a new format specifier such as "%pk",
or change the existing "%pK", to print the raw value of addresses when KASLR
is disabled and the hashed value of addresses when KASLR is enabled.
Let's see what the printk guys would say :)
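
For illustration, the behaviour proposed above can be sketched in userspace
C. This is only a rough sketch under stated assumptions: format_addr() and
toy_hash() are hypothetical names, the mixing function is a stand-in and not
the keyed hash the kernel uses for %p, and the kaslr_enabled argument stands
for a run-time state (so that "nokaslr" on the command line is honoured)
rather than a compile-time IS_ENABLED() check.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in mixing function; NOT the hash the kernel uses for %p. */
static uint64_t toy_hash(uint64_t addr, uint64_t key)
{
	uint64_t x = addr ^ key;

	x ^= x >> 33;
	x *= 0xff51afd7ed558ccdULL;
	x ^= x >> 33;
	return x;
}

/*
 * Hypothetical formatter: emit the raw address when KASLR is off at run
 * time, and a hashed value when it is on. Returns what snprintf() returns.
 */
static int format_addr(char *buf, size_t len, uint64_t addr,
		       bool kaslr_enabled, uint64_t key)
{
	uint64_t shown = kaslr_enabled ? toy_hash(addr, key) : addr;

	return snprintf(buf, len, "%016llx", (unsigned long long)shown);
}
```

With a single specifier behaving like this, callers such as show_stack()
would not need to open-code the KASLR check at every callsite.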




-Scott








[RFC 2/3] mm/vma: Introduce VM_ACCESS_FLAGS

2020-03-01 Thread Anshuman Khandual
There are many places where all basic VMA access flags (read, write, exec)
are initialized or checked against as a group. One such example is during
page fault. The existing vma_is_accessible() wrapper already establishes the
notion of VMA accessibility as a group of access permissions. Hence let's
create VM_ACCESS_FLAGS (VM_READ|VM_WRITE|VM_EXEC), which will not only reduce
code duplication but also extend the VMA accessibility concept in general.
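
As a rough userspace illustration of the grouping (not the kernel code
itself), the sketch below uses stand-in bit values and two hypothetical
helpers, flags_accessible() and flags_exec_only(); only VM_ACCESS_FLAGS and
the three flag names come from this patch.

```c
#include <stdbool.h>

/* Illustrative stand-ins for the VMA permission bits. */
#define VM_READ		0x1UL
#define VM_WRITE	0x2UL
#define VM_EXEC		0x4UL

/* The grouped mask this patch introduces. */
#define VM_ACCESS_FLAGS	(VM_READ | VM_WRITE | VM_EXEC)

/* Hypothetical helper: true if at least one access permission is set. */
static bool flags_accessible(unsigned long vm_flags)
{
	return (vm_flags & VM_ACCESS_FLAGS) != 0;
}

/*
 * Hypothetical helper in the style of the pkeys check in the diff:
 * true if exec is the only access permission set.
 */
static bool flags_exec_only(unsigned long vm_flags)
{
	return (vm_flags & VM_ACCESS_FLAGS) == VM_EXEC;
}
```

Bits outside the mask (for example a special purpose flag) do not affect
either check, which is the point of masking with the group first.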

Cc: Russell King 
CC: Catalin Marinas 
CC: Mark Salter 
Cc: Nick Hu 
CC: Ley Foon Tan 
Cc: Michael Ellerman 
Cc: Heiko Carstens 
Cc: Yoshinori Sato 
Cc: Guan Xuetao 
Cc: Dave Hansen 
Cc: Thomas Gleixner 
Cc: Rob Springer 
Cc: Greg Kroah-Hartman 
Cc: Andrew Morton 
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-c6x-...@linux-c6x.org
Cc: nios2-...@lists.rocketboards.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-s...@vger.kernel.org
Cc: linux...@vger.kernel.org
Cc: de...@driverdev.osuosl.org
Cc: linux...@kvack.org
Cc: linux-ker...@vger.kernel.org
Signed-off-by: Anshuman Khandual 
---
 arch/arm/mm/fault.c  | 2 +-
 arch/arm64/mm/fault.c| 2 +-
 arch/c6x/include/asm/processor.h | 2 +-
 arch/nds32/mm/fault.c| 2 +-
 arch/nios2/include/asm/processor.h   | 2 +-
 arch/powerpc/mm/book3s64/pkeys.c | 2 +-
 arch/s390/mm/fault.c | 2 +-
 arch/sh/include/asm/processor_64.h   | 2 +-
 arch/unicore32/mm/fault.c| 2 +-
 arch/x86/mm/pkeys.c  | 2 +-
 drivers/staging/gasket/gasket_core.c | 2 +-
 include/linux/mm.h   | 4 +++-
 mm/mmap.c| 4 ++--
 mm/mprotect.c| 7 +++
 14 files changed, 19 insertions(+), 18 deletions(-)

diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c
index bd0f4821f7e1..2c71028d9d6b 100644
--- a/arch/arm/mm/fault.c
+++ b/arch/arm/mm/fault.c
@@ -189,7 +189,7 @@ void do_bad_area(unsigned long addr, unsigned int fsr, 
struct pt_regs *regs)
  */
 static inline bool access_error(unsigned int fsr, struct vm_area_struct *vma)
 {
-   unsigned int mask = VM_READ | VM_WRITE | VM_EXEC;
+   unsigned int mask = VM_ACCESS_FLAGS;
 
if ((fsr & FSR_WRITE) && !(fsr & FSR_CM))
mask = VM_WRITE;
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 85566d32958f..63f31206a12e 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -445,7 +445,7 @@ static int __kprobes do_page_fault(unsigned long addr, 
unsigned int esr,
const struct fault_info *inf;
struct mm_struct *mm = current->mm;
vm_fault_t fault, major = 0;
-   unsigned long vm_flags = VM_READ | VM_WRITE | VM_EXEC;
+   unsigned long vm_flags = VM_ACCESS_FLAGS;
unsigned int mm_flags = FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE;
 
if (kprobe_page_fault(regs, esr))
diff --git a/arch/c6x/include/asm/processor.h b/arch/c6x/include/asm/processor.h
index 1456f5e11de3..77372b8c28d7 100644
--- a/arch/c6x/include/asm/processor.h
+++ b/arch/c6x/include/asm/processor.h
@@ -57,7 +57,7 @@ struct thread_struct {
 }
 
 #define INIT_MMAP { \
-   _mm, 0, 0, NULL, PAGE_SHARED, VM_READ | VM_WRITE | VM_EXEC, 1, \
+   _mm, 0, 0, NULL, PAGE_SHARED, VM_ACCESS_FLAGS, 1, \
NULL, NULL }
 
 #define task_pt_regs(task) \
diff --git a/arch/nds32/mm/fault.c b/arch/nds32/mm/fault.c
index 906dfb25353c..55387a31bf42 100644
--- a/arch/nds32/mm/fault.c
+++ b/arch/nds32/mm/fault.c
@@ -79,7 +79,7 @@ void do_page_fault(unsigned long entry, unsigned long addr,
struct vm_area_struct *vma;
int si_code;
vm_fault_t fault;
-   unsigned int mask = VM_READ | VM_WRITE | VM_EXEC;
+   unsigned int mask = VM_ACCESS_FLAGS;
unsigned int flags = FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE;
 
error_code = error_code & (ITYPE_mskINST | ITYPE_mskETYPE);
diff --git a/arch/nios2/include/asm/processor.h 
b/arch/nios2/include/asm/processor.h
index 94bcb86f679f..fbfb3ab14cfc 100644
--- a/arch/nios2/include/asm/processor.h
+++ b/arch/nios2/include/asm/processor.h
@@ -51,7 +51,7 @@ struct thread_struct {
 };
 
 #define INIT_MMAP \
-   { _mm, (0), (0), __pgprot(0x0), VM_READ | VM_WRITE | VM_EXEC }
+   { _mm, (0), (0), __pgprot(0x0), VM_ACCESS_FLAGS }
 
 # define INIT_THREAD { \
.kregs  = NULL, \
diff --git a/arch/powerpc/mm/book3s64/pkeys.c b/arch/powerpc/mm/book3s64/pkeys.c
index 59e0ebbd8036..11fd52b24f68 100644
--- a/arch/powerpc/mm/book3s64/pkeys.c
+++ b/arch/powerpc/mm/book3s64/pkeys.c
@@ -315,7 +315,7 @@ int __execute_only_pkey(struct mm_struct *mm)
 static inline bool vma_is_pkey_exec_only(struct vm_area_struct *vma)
 {
/* Do this check first since the vm_flags should be hot */
-   if ((vma->vm_flags & (VM_READ | VM_WRITE | VM_EXEC)) != VM_EXEC)
+   if ((vma->vm_flags & VM_ACCESS_FLAGS) != VM_EXEC)
return false;
 
return (vma_pkey(vma) == 

[RFC 1/3] mm/vma: Define a default value for VM_DATA_DEFAULT_FLAGS

2020-03-01 Thread Anshuman Khandual
There are many platforms with the exact same value for VM_DATA_DEFAULT_FLAGS.
This creates a default value for VM_DATA_DEFAULT_FLAGS in line with the
existing VM_STACK_DEFAULT_FLAGS. While here, also define some more macros
with standard VMA access flag combinations that are used frequently across
many platforms. Apart from simplification, this reduces code duplication
as well.
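
The fallback idea can be sketched as below. This is a minimal illustration
with stand-in bit values; it assumes the generic default equals the
definition being removed from alpha, and default_data_flags() is a
hypothetical helper added only to make the result observable.

```c
/* Illustrative stand-ins for the VMA flag bits. */
#define VM_READ		0x01UL
#define VM_WRITE	0x02UL
#define VM_EXEC		0x04UL
#define VM_MAYREAD	0x10UL
#define VM_MAYWRITE	0x20UL
#define VM_MAYEXEC	0x40UL

/* Common combination, as used by the arc hunk in this patch. */
#define VM_DATA_FLAGS_NON_EXEC	(VM_READ | VM_WRITE | VM_MAYREAD | \
				 VM_MAYWRITE | VM_MAYEXEC)

/*
 * A platform header may define VM_DATA_DEFAULT_FLAGS before this point.
 * If it does not, fall back to a generic default (assumed here to match
 * the value removed from alpha).
 */
#ifndef VM_DATA_DEFAULT_FLAGS
#define VM_DATA_DEFAULT_FLAGS	(VM_READ | VM_WRITE | VM_EXEC | \
				 VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
#endif

/* Hypothetical helper: returns whichever definition is in effect. */
static unsigned long default_data_flags(void)
{
	return VM_DATA_DEFAULT_FLAGS;
}
```

A platform that wants non-executable data by default would define
VM_DATA_DEFAULT_FLAGS as VM_DATA_FLAGS_NON_EXEC ahead of the #ifndef.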

Cc: Richard Henderson 
Cc: Vineet Gupta 
Cc: Russell King 
Cc: Catalin Marinas 
Cc: Mark Salter 
Cc: Guo Ren 
Cc: Yoshinori Sato 
Cc: Brian Cain 
Cc: Tony Luck 
Cc: Geert Uytterhoeven 
Cc: Michal Simek 
Cc: Ralf Baechle 
Cc: Paul Burton 
Cc: Nick Hu 
Cc: Ley Foon Tan 
Cc: Jonas Bonn 
Cc: "James E.J. Bottomley" 
Cc: Michael Ellerman 
Cc: Paul Walmsley 
Cc: Heiko Carstens 
Cc: Rich Felker 
Cc: "David S. Miller" 
Cc: Guan Xuetao 
Cc: Thomas Gleixner 
Cc: Jeff Dike 
Cc: Chris Zankel 
Cc: Andrew Morton 
Cc: linux-al...@vger.kernel.org
Cc: linux-ker...@vger.kernel.org
Cc: linux-snps-...@lists.infradead.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-c6x-...@linux-c6x.org
Cc: uclinux-h8-de...@lists.sourceforge.jp
Cc: linux-hexa...@vger.kernel.org
Cc: linux-i...@vger.kernel.org
Cc: linux-m...@lists.linux-m68k.org
Cc: linux-m...@vger.kernel.org
Cc: nios2-...@lists.rocketboards.org
Cc: openr...@lists.librecores.org
Cc: linux-par...@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-ri...@lists.infradead.org
Cc: linux-s...@vger.kernel.org
Cc: linux...@vger.kernel.org
Cc: sparcli...@vger.kernel.org
Cc: linux...@lists.infradead.org
Cc: linux-xte...@linux-xtensa.org
Cc: linux...@kvack.org
Signed-off-by: Anshuman Khandual 
---
 arch/alpha/include/asm/page.h  |  3 ---
 arch/arc/include/asm/page.h|  2 +-
 arch/arm/include/asm/page.h|  4 +---
 arch/arm64/include/asm/page.h  |  4 +---
 arch/c6x/include/asm/page.h|  5 +
 arch/csky/include/asm/page.h   |  3 ---
 arch/h8300/include/asm/page.h  |  2 --
 arch/hexagon/include/asm/page.h|  3 +--
 arch/ia64/include/asm/page.h   |  5 +
 arch/m68k/include/asm/page.h   |  3 ---
 arch/microblaze/include/asm/page.h |  2 --
 arch/mips/include/asm/page.h   |  5 +
 arch/nds32/include/asm/page.h  |  3 ---
 arch/nios2/include/asm/page.h  |  3 +--
 arch/openrisc/include/asm/page.h   |  5 -
 arch/parisc/include/asm/page.h |  3 ---
 arch/powerpc/include/asm/page.h|  9 ++---
 arch/powerpc/include/asm/page_64.h |  7 ++-
 arch/riscv/include/asm/page.h  |  3 +--
 arch/s390/include/asm/page.h   |  3 +--
 arch/sh/include/asm/page.h |  3 ---
 arch/sparc/include/asm/page_32.h   |  3 ---
 arch/sparc/include/asm/page_64.h   |  3 ---
 arch/unicore32/include/asm/page.h  |  3 ---
 arch/x86/include/asm/page_types.h  |  4 +---
 arch/x86/um/asm/vm-flags.h | 10 ++
 arch/xtensa/include/asm/page.h |  3 ---
 include/linux/mm.h | 15 +++
 28 files changed, 32 insertions(+), 89 deletions(-)

diff --git a/arch/alpha/include/asm/page.h b/arch/alpha/include/asm/page.h
index f3fb2848470a..e241bd0f 100644
--- a/arch/alpha/include/asm/page.h
+++ b/arch/alpha/include/asm/page.h
@@ -90,9 +90,6 @@ typedef struct page *pgtable_t;
 #define virt_addr_valid(kaddr) pfn_valid(__pa(kaddr) >> PAGE_SHIFT)
 #endif /* CONFIG_DISCONTIGMEM */
 
-#define VM_DATA_DEFAULT_FLAGS  (VM_READ | VM_WRITE | VM_EXEC | \
-VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
-
 #include 
 #include 
 
diff --git a/arch/arc/include/asm/page.h b/arch/arc/include/asm/page.h
index 0a32e8cfd074..b0dfed0f12be 100644
--- a/arch/arc/include/asm/page.h
+++ b/arch/arc/include/asm/page.h
@@ -102,7 +102,7 @@ typedef pte_t * pgtable_t;
 #define virt_addr_valid(kaddr)  pfn_valid(virt_to_pfn(kaddr))
 
 /* Default Permissions for stack/heaps pages (Non Executable) */
-#define VM_DATA_DEFAULT_FLAGS   (VM_READ | VM_WRITE | VM_MAYREAD | VM_MAYWRITE 
| VM_MAYEXEC)
+#define VM_DATA_DEFAULT_FLAGS  VM_DATA_FLAGS_NON_EXEC
 
 #define WANT_PAGE_VIRTUAL   1
 
diff --git a/arch/arm/include/asm/page.h b/arch/arm/include/asm/page.h
index c2b75cba26df..11b058a72a5b 100644
--- a/arch/arm/include/asm/page.h
+++ b/arch/arm/include/asm/page.h
@@ -161,9 +161,7 @@ extern int pfn_valid(unsigned long);
 
 #endif /* !__ASSEMBLY__ */
 
-#define VM_DATA_DEFAULT_FLAGS \
-   (((current->personality & READ_IMPLIES_EXEC) ? VM_EXEC : 0) | \
-VM_READ | VM_WRITE | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
+#define VM_DATA_DEFAULT_FLAGS  VM_DATA_FLAGS_TSK_EXEC
 
 #include 
 
diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
index d39ddb258a04..cb4e1e6ca385 100644
--- a/arch/arm64/include/asm/page.h
+++ b/arch/arm64/include/asm/page.h
@@ -32,9 +32,7 @@ extern int pfn_valid(unsigned long);
 
 #endif /* !__ASSEMBLY__ */
 
-#define VM_DATA_DEFAULT_FLAGS \
-   (((current->personality & READ_IMPLIES_EXEC) ? VM_EXEC : 0) | \
-VM_READ | VM_WRITE | VM_MAYREAD | VM_MAYWRITE | 

[RFC 0/3] mm/vma: some new flags and helpers

2020-03-01 Thread Anshuman Khandual
The motivation here is to consolidate VMA flag combinations commonly used
across platforms and reduce code duplication while making it uncluttered
in general.

The first patch introduces a default VM_DATA_DEFAULT_FLAGS which platforms
can easily fall back on without having to define any similar data flag
combinations as they currently do. It also adds some more common data
flag combinations which are generally used when platforms decide to
override the default.

The second patch consolidates VM_READ, VM_WRITE and VM_EXEC as VM_ACCESS_FLAGS,
extending the existing VMA accessibility concept provided by vma_is_accessible().
VM_ACCESS_FLAGS replaces many other instances which used to check all three
VMA access flags simultaneously.

While here, this also adds some more special VMA flag based helpers which
wrap around similar checks at various places, thus improving readability.
This series intentionally limits these new helpers to special purpose VM
flags, rather than the more common ones like VM_READ, VM_WRITE, VM_EXEC,
VM_SHARED etc., just to limit code churn. But if there is common agreement
that every flag should have its own wrapper here, we can do that as well.
Otherwise, if this patch seems really unnecessary with too much code churn,
I will be happy to drop it.

Reviews, comments, suggestions and concerns welcome. Thank you.

This series is based on v5.6-rc4 after applying these patches.

1. https://patchwork.kernel.org/cover/11399319/
2. https://patchwork.kernel.org/patch/11399379/

This series is build tested across multiple architectures but boot tested
only on arm64 and x86 platforms.

Cc: linux-al...@vger.kernel.org
Cc: linux-ker...@vger.kernel.org
Cc: linux-snps-...@lists.infradead.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-c6x-...@linux-c6x.org
Cc: uclinux-h8-de...@lists.sourceforge.jp
Cc: linux-hexa...@vger.kernel.org
Cc: linux-i...@vger.kernel.org
Cc: linux-m...@lists.linux-m68k.org
Cc: linux-m...@vger.kernel.org
Cc: nios2-...@lists.rocketboards.org
Cc: openr...@lists.librecores.org
Cc: linux-par...@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-ri...@lists.infradead.org
Cc: linux-s...@vger.kernel.org
Cc: linux...@vger.kernel.org
Cc: sparcli...@vger.kernel.org
Cc: linux...@lists.infradead.org
Cc: linux-xte...@linux-xtensa.org
Cc: linux...@kvack.org

Anshuman Khandual (3):
  mm/vma: Define a default value for VM_DATA_DEFAULT_FLAGS
  mm/vma: Introduce VM_ACCESS_FLAGS
  mm/vma: Introduce some more VMA flag wrappers

 arch/alpha/include/asm/page.h|  3 --
 arch/arc/include/asm/page.h  |  2 +-
 arch/arm/include/asm/page.h  |  4 +-
 arch/arm/mm/fault.c  |  2 +-
 arch/arm64/include/asm/page.h|  4 +-
 arch/arm64/mm/fault.c|  2 +-
 arch/c6x/include/asm/page.h  |  5 +--
 arch/c6x/include/asm/processor.h |  2 +-
 arch/csky/include/asm/page.h |  3 --
 arch/h8300/include/asm/page.h|  2 -
 arch/hexagon/include/asm/page.h  |  3 +-
 arch/ia64/include/asm/page.h |  5 +--
 arch/m68k/include/asm/page.h |  3 --
 arch/microblaze/include/asm/page.h   |  2 -
 arch/mips/include/asm/page.h |  5 +--
 arch/nds32/include/asm/page.h|  3 --
 arch/nds32/mm/fault.c|  2 +-
 arch/nios2/include/asm/page.h|  3 +-
 arch/nios2/include/asm/processor.h   |  2 +-
 arch/openrisc/include/asm/page.h |  5 ---
 arch/parisc/include/asm/page.h   |  3 --
 arch/powerpc/include/asm/page.h  |  9 +
 arch/powerpc/include/asm/page_64.h   |  7 +---
 arch/powerpc/mm/book3s64/pkeys.c |  2 +-
 arch/riscv/include/asm/page.h|  3 +-
 arch/s390/include/asm/page.h |  3 +-
 arch/s390/mm/fault.c |  2 +-
 arch/sh/include/asm/page.h   |  3 --
 arch/sh/include/asm/processor_64.h   |  2 +-
 arch/sparc/include/asm/mman.h|  2 +-
 arch/sparc/include/asm/page_32.h |  3 --
 arch/sparc/include/asm/page_64.h |  3 --
 arch/unicore32/include/asm/page.h|  3 --
 arch/unicore32/mm/fault.c|  2 +-
 arch/x86/include/asm/page_types.h|  4 +-
 arch/x86/mm/pkeys.c  |  2 +-
 arch/x86/um/asm/vm-flags.h   | 10 +
 arch/xtensa/include/asm/page.h   |  3 --
 drivers/staging/gasket/gasket_core.c |  2 +-
 fs/binfmt_elf.c  |  2 +-
 fs/proc/task_mmu.c   | 14 +++
 include/linux/huge_mm.h  |  4 +-
 include/linux/mm.h   | 58 +++-
 kernel/events/core.c |  2 +-
 kernel/events/uprobes.c  |  2 +-
 mm/gup.c |  2 +-
 mm/huge_memory.c |  6 +--
 mm/hugetlb.c |  4 +-
 mm/ksm.c |  8 ++--
 mm/madvise.c |  4 +-
 mm/memory.c  |  4 +-
 mm/migrate.c |  4 +-
 mm/mlock.c   |  4 +-
 mm/mmap.c 

Re: [PATCH] powerpc/sysdev: fix compile errors

2020-03-01 Thread Christophe Leroy




On 02/03/2020 at 06:37, WANG Wenhu wrote:

Include linux/io.h into fsl_85xx_cache_sram.c to fix the
implicit-declaration compile errors when building Cache-Sram.

arch/powerpc/sysdev/fsl_85xx_cache_sram.c: In function ‘instantiate_cache_sram’:
arch/powerpc/sysdev/fsl_85xx_cache_sram.c:97:26: error: implicit declaration of 
function ‘ioremap_coherent’; did you mean ‘bitmap_complement’? 
[-Werror=implicit-function-declaration]
   cache_sram->base_virt = ioremap_coherent(cache_sram->base_phys,
   ^~~~
   bitmap_complement
arch/powerpc/sysdev/fsl_85xx_cache_sram.c:97:24: error: assignment makes 
pointer from integer without a cast [-Werror=int-conversion]
   cache_sram->base_virt = ioremap_coherent(cache_sram->base_phys,
 ^
arch/powerpc/sysdev/fsl_85xx_cache_sram.c:123:2: error: implicit declaration of 
function ‘iounmap’; did you mean ‘roundup’? 
[-Werror=implicit-function-declaration]
   iounmap(cache_sram->base_virt);
   ^~~
   roundup
cc1: all warnings being treated as errors

Fixed: commit 6db92cc9d07d ("powerpc/85xx: add cache-sram support")
Signed-off-by: WANG Wenhu 


Reviewed-by: Christophe Leroy 


---
  arch/powerpc/sysdev/fsl_85xx_cache_sram.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/sysdev/fsl_85xx_cache_sram.c 
b/arch/powerpc/sysdev/fsl_85xx_cache_sram.c
index f6c665dac725..be3aef4229d7 100644
--- a/arch/powerpc/sysdev/fsl_85xx_cache_sram.c
+++ b/arch/powerpc/sysdev/fsl_85xx_cache_sram.c
@@ -17,6 +17,7 @@
  #include 
  #include 
  #include 
+#include <linux/io.h>
  
  #include "fsl_85xx_cache_ctlr.h"
  



Re: [PATCH v3 16/27] powerpc/powernv/pmem: Register a character device for userspace to interact with

2020-03-01 Thread Alastair D'Silva
On Mon, 2020-03-02 at 16:34 +1100, Andrew Donnellan wrote:
> On 21/2/20 2:27 pm, Alastair D'Silva wrote:
> > From: Alastair D'Silva 
> > 
> > This patch introduces a character device (/dev/ocxl-scmX) which
> > further
> > patches will use to interact with userspace.
> 
> As with the comments on other patches in this series, this commit 
> message is lacking in explanation. What's the purpose of this device?
> 

I'll reword this for v4.

> > Signed-off-by: Alastair D'Silva 
> > ---
> >   arch/powerpc/platforms/powernv/pmem/ocxl.c| 116
> > +-
> >   .../platforms/powernv/pmem/ocxl_internal.h|   2 +
> >   2 files changed, 116 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > index b8bd7e703b19..63109a870d2c 100644
> > --- a/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > +++ b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > @@ -10,6 +10,7 @@
> >   #include 
> >   #include 
> >   #include 
> > +#include 
> >   #include 
> >   #include 
> >   #include "ocxl_internal.h"
> > @@ -339,6 +340,9 @@ static void free_ocxlpmem(struct ocxlpmem
> > *ocxlpmem)
> >   
> > free_minor(ocxlpmem);
> >   
> > +	if (ocxlpmem->cdev.owner)
> > +		cdev_del(&ocxlpmem->cdev);
> > +
> >  	if (ocxlpmem->metadata_addr)
> >  		devm_memunmap(&ocxlpmem->dev, ocxlpmem->metadata_addr);
> >   
> > @@ -396,6 +400,70 @@ static int ocxlpmem_register(struct ocxlpmem
> > *ocxlpmem)
> > return device_register(>dev);
> >   }
> >   
> > +static void ocxlpmem_put(struct ocxlpmem *ocxlpmem)
> > +{
> > +	put_device(&ocxlpmem->dev);
> > +}
> > +
> > +static struct ocxlpmem *ocxlpmem_get(struct ocxlpmem *ocxlpmem)
> > +{
> > +	return (get_device(&ocxlpmem->dev) == NULL) ? NULL : ocxlpmem;
> > +}
> > +
> > +static struct ocxlpmem *find_and_get_ocxlpmem(dev_t devno)
> > +{
> > +   struct ocxlpmem *ocxlpmem;
> > +   int minor = MINOR(devno);
> > +	/*
> > +	 * We don't declare an RCU critical section here, as our AFU
> > +	 * is protected by a reference counter on the device. By the time the
> > +	 * minor number of a device is removed from the idr, the ref count of
> > +	 * the device is already at 0, so no user API will access that AFU and
> > +	 * this function can't return it.
> > +	 */
> > +   ocxlpmem = idr_find(_idr, minor);
> > +   if (ocxlpmem)
> > +   ocxlpmem_get(ocxlpmem);
> > +   return ocxlpmem;
> > +}
> > +
> > +static int file_open(struct inode *inode, struct file *file)
> > +{
> > +   struct ocxlpmem *ocxlpmem;
> > +
> > +   ocxlpmem = find_and_get_ocxlpmem(inode->i_rdev);
> > +   if (!ocxlpmem)
> > +   return -ENODEV;
> > +
> > +   file->private_data = ocxlpmem;
> > +   return 0;
> > +}
> > +
> > +static int file_release(struct inode *inode, struct file *file)
> > +{
> > +   struct ocxlpmem *ocxlpmem = file->private_data;
> > +
> > +   ocxlpmem_put(ocxlpmem);
> > +   return 0;
> > +}
> > +
> > +static const struct file_operations fops = {
> > +   .owner  = THIS_MODULE,
> > +   .open   = file_open,
> > +   .release= file_release,
> > +};
> > +
> > +/**
> > + * create_cdev() - Create the chardev in /dev for the device
> > + * @ocxlpmem: the SCM metadata
> > + * Return: 0 on success, negative on failure
> > + */
> > +static int create_cdev(struct ocxlpmem *ocxlpmem)
> > +{
> > +	cdev_init(&ocxlpmem->cdev, &fops);
> > +	return cdev_add(&ocxlpmem->cdev, ocxlpmem->dev.devt, 1);
> > +}
> > +
> >   /**
> >* ocxlpmem_remove() - Free an OpenCAPI persistent memory device
> >* @pdev: the PCI device information struct
> > @@ -572,6 +640,11 @@ static int probe(struct pci_dev *pdev, const
> > struct pci_device_id *ent)
> > goto err;
> > }
> >   
> > +   if (create_cdev(ocxlpmem)) {
> > +   dev_err(>dev, "Could not create character
> > device\n");
> > +   goto err;
> > +   }
> > +
> > elapsed = 0;
> > timeout = ocxlpmem->readiness_timeout + ocxlpmem-
> > >memory_available_timeout;
> > while (!is_usable(ocxlpmem, false)) {
> > @@ -613,20 +686,59 @@ static struct pci_driver pci_driver = {
> > .shutdown = ocxlpmem_remove,
> >   };
> >   
> > +static int file_init(void)
> > +{
> > +   int rc;
> > +
> > +   mutex_init(_idr_lock);
> > +   idr_init(_idr);
> > +
> > +   rc = alloc_chrdev_region(_dev, 0, NUM_MINORS, "ocxl-
> > pmem");
> 
> If the driver is going to be called "ocxlpmem" can we standardise on 
> that without the extra hyphen?

Ok

> > +   if (rc) {
> > +   idr_destroy(_idr);
> > +   pr_err("Unable to allocate OpenCAPI persistent memory
> > major number: %d\n", rc);
> > +   return rc;
> > +   }
> > +
> > +   ocxlpmem_class = class_create(THIS_MODULE, "ocxl-pmem");
> > +   if (IS_ERR(ocxlpmem_class)) {
> > +   idr_destroy(_idr);
> > +   pr_err("Unable to create ocxl-pmem class\n");
> > +   unregister_chrdev_region(ocxlpmem_dev, NUM_MINORS);
> > +   return PTR_ERR(ocxlpmem_class);
> > +   

[PATCH] powerpc/sysdev: fix compile errors

2020-03-01 Thread WANG Wenhu
Include linux/io.h into fsl_85xx_cache_sram.c to fix the
implicit-declaration compile errors when building Cache-Sram.

arch/powerpc/sysdev/fsl_85xx_cache_sram.c: In function ‘instantiate_cache_sram’:
arch/powerpc/sysdev/fsl_85xx_cache_sram.c:97:26: error: implicit declaration of 
function ‘ioremap_coherent’; did you mean ‘bitmap_complement’? 
[-Werror=implicit-function-declaration]
  cache_sram->base_virt = ioremap_coherent(cache_sram->base_phys,
  ^~~~
  bitmap_complement
arch/powerpc/sysdev/fsl_85xx_cache_sram.c:97:24: error: assignment makes 
pointer from integer without a cast [-Werror=int-conversion]
  cache_sram->base_virt = ioremap_coherent(cache_sram->base_phys,
^
arch/powerpc/sysdev/fsl_85xx_cache_sram.c:123:2: error: implicit declaration of 
function ‘iounmap’; did you mean ‘roundup’? 
[-Werror=implicit-function-declaration]
  iounmap(cache_sram->base_virt);
  ^~~
  roundup
cc1: all warnings being treated as errors

Fixed: commit 6db92cc9d07d ("powerpc/85xx: add cache-sram support")
Signed-off-by: WANG Wenhu 
---
 arch/powerpc/sysdev/fsl_85xx_cache_sram.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/sysdev/fsl_85xx_cache_sram.c 
b/arch/powerpc/sysdev/fsl_85xx_cache_sram.c
index f6c665dac725..be3aef4229d7 100644
--- a/arch/powerpc/sysdev/fsl_85xx_cache_sram.c
+++ b/arch/powerpc/sysdev/fsl_85xx_cache_sram.c
@@ -17,6 +17,7 @@
 #include 
 #include 
 #include 
+#include <linux/io.h>
 
 #include "fsl_85xx_cache_ctlr.h"
 
-- 
2.17.1



Re: [PATCH v3 21/27] powerpc/powernv/pmem: Add an IOCTL to request controller health & perf data

2020-03-01 Thread Alastair D'Silva
On Fri, 2020-02-28 at 17:12 +1100, Andrew Donnellan wrote:
> On 21/2/20 2:27 pm, Alastair D'Silva wrote:
> > From: Alastair D'Silva 
> > 
> > When health & performance data is requested from the controller,
> > it responds with an error log containing the requested information.
> > 
> > This patch allows the request to be issued via an IOCTL.
> 
> A better explanation would be good - this IOCTL triggers a request
> to 
> the controller to collect controller health/perf data, and the 
> controller will later respond with an error log that can be picked
> up 
> via the error log IOCTL that you've defined earlier.
> 
> 

Ok

-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819



RE: [PATCH v3 25/27] powerpc/powernv/pmem: Expose the serial number in sysfs

2020-03-01 Thread Alastair D'Silva
On Mon, 2020-03-02 at 10:42 +1100, Alastair D'Silva wrote:
> On Fri, 2020-02-28 at 08:15 +0100, Greg Kroah-Hartman wrote:
> > On Fri, Feb 28, 2020 at 05:25:31PM +1100, Andrew Donnellan wrote:
> > > On 21/2/20 2:27 pm, Alastair D'Silva wrote:
> > > > +int ocxlpmem_sysfs_add(struct ocxlpmem *ocxlpmem)
> > > > +{
> > > > +   int i, rc;
> > > > +
> > > > +	for (i = 0; i < ARRAY_SIZE(attrs); i++) {
> > > > +		rc = device_create_file(&ocxlpmem->dev, &attrs[i]);
> > > > +		if (rc) {
> > > > +			for (; --i >= 0;)
> > > > +				device_remove_file(&ocxlpmem->dev, &attrs[i]);
> > > 
> > > I'd rather avoid weird for loop constructs if possible.
> > > 
> > > Is it actually dangerous to call device_remove_file() on an attr
> > > that hasn't
> > > been added? If not then I'd rather define an err: label and loop
> > > over the
> > > whole array there.
> > 
> > None of this should be used at all, just use attribute groups
> > properly
> > and the driver core will handle this all for you.
> > 
> > device_create/remove_file should never be called by anyone anymore
> > if
> > at all
> > possible.
> > 
> > thanks,
> > 
> > greg k-h
> 
> Thanks, I'll rework it to use the .groups member of struct
> pci_driver.
> 

I ended up making these available as DIMM attributes instead.
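
For reference, the label-based unwind suggested earlier in this thread can
be sketched in plain C as below. This is only an illustration of the idiom
with mock functions (mock_create(), mock_remove(), add_all() and fail_at are
all hypothetical names); as noted above, attribute groups make this kind of
manual rollback unnecessary in the driver.

```c
#define NUM_ATTRS 4

static int created[NUM_ATTRS];	/* mock state: 1 if "file" i exists */
static int fail_at = -1;	/* index at which creation fails, -1 = never */

/* Mock of a create call that can fail. */
static int mock_create(int i)
{
	if (i == fail_at)
		return -1;
	created[i] = 1;
	return 0;
}

/* Mock of the matching remove call. */
static void mock_remove(int i)
{
	created[i] = 0;
}

/* Create all entries; on failure, undo only what was created. */
static int add_all(void)
{
	int i, rc;

	for (i = 0; i < NUM_ATTRS; i++) {
		rc = mock_create(i);
		if (rc)
			goto err;
	}
	return 0;

err:
	/* i is the index that failed; walk back over the earlier ones. */
	while (i-- > 0)
		mock_remove(i);
	return rc;
}
```

Keeping the unwind under a label separates the error path from the main
loop and avoids the condition-only for loop.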

-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819



Re: [PATCH v3 16/27] powerpc/powernv/pmem: Register a character device for userspace to interact with

2020-03-01 Thread Andrew Donnellan

On 21/2/20 2:27 pm, Alastair D'Silva wrote:

From: Alastair D'Silva 

This patch introduces a character device (/dev/ocxl-scmX) which further
patches will use to interact with userspace.


As with the comments on other patches in this series, this commit 
message is lacking in explanation. What's the purpose of this device?




Signed-off-by: Alastair D'Silva 
---
  arch/powerpc/platforms/powernv/pmem/ocxl.c| 116 +-
  .../platforms/powernv/pmem/ocxl_internal.h|   2 +
  2 files changed, 116 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl.c 
b/arch/powerpc/platforms/powernv/pmem/ocxl.c
index b8bd7e703b19..63109a870d2c 100644
--- a/arch/powerpc/platforms/powernv/pmem/ocxl.c
+++ b/arch/powerpc/platforms/powernv/pmem/ocxl.c
@@ -10,6 +10,7 @@
  #include 
  #include 
  #include 
+#include 
  #include 
  #include 
  #include "ocxl_internal.h"
@@ -339,6 +340,9 @@ static void free_ocxlpmem(struct ocxlpmem *ocxlpmem)
  
  	free_minor(ocxlpmem);
  
+	if (ocxlpmem->cdev.owner)
+		cdev_del(&ocxlpmem->cdev);
+
 	if (ocxlpmem->metadata_addr)
 		devm_memunmap(&ocxlpmem->dev, ocxlpmem->metadata_addr);
  
@@ -396,6 +400,70 @@ static int ocxlpmem_register(struct ocxlpmem *ocxlpmem)

return device_register(>dev);
  }
  
+static void ocxlpmem_put(struct ocxlpmem *ocxlpmem)
+{
+	put_device(&ocxlpmem->dev);
+}
+
+static struct ocxlpmem *ocxlpmem_get(struct ocxlpmem *ocxlpmem)
+{
+	return (get_device(&ocxlpmem->dev) == NULL) ? NULL : ocxlpmem;
+}
+
+static struct ocxlpmem *find_and_get_ocxlpmem(dev_t devno)
+{
+   struct ocxlpmem *ocxlpmem;
+   int minor = MINOR(devno);
+	/*
+	 * We don't declare an RCU critical section here, as our AFU
+	 * is protected by a reference counter on the device. By the time the
+	 * minor number of a device is removed from the idr, the ref count of
+	 * the device is already at 0, so no user API will access that AFU and
+	 * this function can't return it.
+	 */
+   ocxlpmem = idr_find(_idr, minor);
+   if (ocxlpmem)
+   ocxlpmem_get(ocxlpmem);
+   return ocxlpmem;
+}
+
+static int file_open(struct inode *inode, struct file *file)
+{
+   struct ocxlpmem *ocxlpmem;
+
+   ocxlpmem = find_and_get_ocxlpmem(inode->i_rdev);
+   if (!ocxlpmem)
+   return -ENODEV;
+
+   file->private_data = ocxlpmem;
+   return 0;
+}
+
+static int file_release(struct inode *inode, struct file *file)
+{
+   struct ocxlpmem *ocxlpmem = file->private_data;
+
+   ocxlpmem_put(ocxlpmem);
+   return 0;
+}
+
+static const struct file_operations fops = {
+   .owner  = THIS_MODULE,
+   .open   = file_open,
+   .release= file_release,
+};
+
+/**
+ * create_cdev() - Create the chardev in /dev for the device
+ * @ocxlpmem: the SCM metadata
+ * Return: 0 on success, negative on failure
+ */
+static int create_cdev(struct ocxlpmem *ocxlpmem)
+{
+	cdev_init(&ocxlpmem->cdev, &fops);
+	return cdev_add(&ocxlpmem->cdev, ocxlpmem->dev.devt, 1);
+}
+
  /**
   * ocxlpmem_remove() - Free an OpenCAPI persistent memory device
   * @pdev: the PCI device information struct
@@ -572,6 +640,11 @@ static int probe(struct pci_dev *pdev, const struct 
pci_device_id *ent)
goto err;
}
  
+	if (create_cdev(ocxlpmem)) {

+   dev_err(>dev, "Could not create character device\n");
+   goto err;
+   }
+
elapsed = 0;
timeout = ocxlpmem->readiness_timeout + 
ocxlpmem->memory_available_timeout;
while (!is_usable(ocxlpmem, false)) {
@@ -613,20 +686,59 @@ static struct pci_driver pci_driver = {
.shutdown = ocxlpmem_remove,
  };
  
+static int file_init(void)

+{
+   int rc;
+
+   mutex_init(_idr_lock);
+   idr_init(_idr);
+
+   rc = alloc_chrdev_region(_dev, 0, NUM_MINORS, "ocxl-pmem");


If the driver is going to be called "ocxlpmem" can we standardise on 
that without the extra hyphen?



+   if (rc) {
+   idr_destroy(_idr);
+   pr_err("Unable to allocate OpenCAPI persistent memory major number: 
%d\n", rc);
+   return rc;
+   }
+
+   ocxlpmem_class = class_create(THIS_MODULE, "ocxl-pmem");
+   if (IS_ERR(ocxlpmem_class)) {
+   idr_destroy(_idr);
+   pr_err("Unable to create ocxl-pmem class\n");
+   unregister_chrdev_region(ocxlpmem_dev, NUM_MINORS);
+   return PTR_ERR(ocxlpmem_class);
+   }
+
+   return 0;
+}
+
+static void file_exit(void)
+{
+   class_destroy(ocxlpmem_class);
+   unregister_chrdev_region(ocxlpmem_dev, NUM_MINORS);
+   idr_destroy(_idr);
+}
+
  static int __init ocxlpmem_init(void)
  {
-   int rc = 0;
+   int rc;
  
-	rc = pci_register_driver(_driver);

+   rc = file_init();
if (rc)
return rc;
  
+	rc = pci_register_driver(_driver);

+   if (rc) {
+   file_exit();
+  

[RFC 05/11] perf tools: Enable record and script to record and show hazard data

2020-03-01 Thread Ravi Bangoria
From: Madhavan Srinivasan 

Introduce a new perf record option, "--hazard", to capture CPU pipeline
hazard data. Also enable perf script -D to dump its raw values.
Sample output:

  $ ./perf record -e r4010e --hazard -- ls
  $ ./perf script -D
  ... PERF_RECORD_SAMPLE(IP, 0x2): ...
  hazard information:
  Inst Type 0x1
  Inst Cache 0x1
  Hazard Stage 0x4
  Hazard Reason 0x3
  Stall Stage 0x4
  Stall Reason 0x2
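
For readers following the evsel.c hunk in this patch: the hazard record is a fixed-size blob that is bounds-checked and then consumed from the sample payload. A minimal standalone sketch of that pattern follows. The struct layout and the demo_* names are assumptions for illustration only (the real struct perf_pipeline_haz_data is defined elsewhere in the series), and this is not the perf code itself.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Assumed layout: the field names follow their use elsewhere in this
 * series, but the u8 widths and the padding are guesses.
 */
struct demo_haz_data {
	uint8_t itype;
	uint8_t icache;
	uint8_t hazard_stage;
	uint8_t hazard_reason;
	uint8_t stall_stage;
	uint8_t stall_reason;
	uint8_t pad[2];
};

/*
 * Return a pointer to the record at *cursor and advance the cursor past
 * it, or return NULL (leaving the cursor untouched) if fewer than
 * sizeof(record) bytes remain before 'end'.
 */
const struct demo_haz_data *demo_take_haz(const uint8_t **cursor,
					  const uint8_t *end)
{
	const struct demo_haz_data *rec;

	if ((size_t)(end - *cursor) < sizeof(*rec))
		return NULL;

	rec = (const struct demo_haz_data *)*cursor;
	*cursor += sizeof(*rec);
	return rec;
}
```

The point of checking before casting is that a truncated or corrupt sample fails cleanly instead of reading past the end of the event.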

Signed-off-by: Madhavan Srinivasan 
Signed-off-by: Ravi Bangoria 
---
 tools/perf/Documentation/perf-record.txt  |  3 +++
 tools/perf/builtin-record.c   |  1 +
 tools/perf/util/event.h   |  1 +
 tools/perf/util/evsel.c   | 10 ++
 tools/perf/util/perf_event_attr_fprintf.c |  1 +
 tools/perf/util/record.h  |  1 +
 tools/perf/util/session.c | 16 
 7 files changed, 33 insertions(+)

diff --git a/tools/perf/Documentation/perf-record.txt 
b/tools/perf/Documentation/perf-record.txt
index b23a4012a606..e7bd1b6938ce 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -283,6 +283,9 @@ OPTIONS
 --phys-data::
Record the sample physical addresses.
 
+--hazard::
+   Record processor pipeline hazard and stall information.
+
 -T::
 --timestamp::
Record the sample timestamps. Use it with 'perf report -D' to see the
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 4c301466101b..6bd32d7bc4e9 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -2301,6 +2301,7 @@ static struct option __record_options[] = {
OPT_BOOLEAN('s', "stat", _stat,
"per thread counts"),
OPT_BOOLEAN('d', "data", _address, "Record the 
sample addresses"),
+   OPT_BOOLEAN(0, "hazard", , "Record processor 
pipeline hazard and stall information"),
OPT_BOOLEAN(0, "phys-data", _phys_addr,
"Record the sample physical addresses"),
OPT_BOOLEAN(0, "sample-cpu", _cpu, "Record the 
sample cpu"),
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index 85223159737c..ff0f03253a95 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -148,6 +148,7 @@ struct perf_sample {
struct stack_dump user_stack;
struct sample_read read;
struct aux_sample aux_sample;
+   struct perf_pipeline_haz_data *pipeline_haz;
 };
 
 #define PERF_MEM_DATA_SRC_NONE \
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index c8dc4450884c..e37ed7929c2c 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1080,6 +1080,9 @@ void perf_evsel__config(struct evsel *evsel, struct 
record_opts *opts,
if (opts->sample_phys_addr)
perf_evsel__set_sample_bit(evsel, PHYS_ADDR);
 
+   if (opts->hazard)
+   perf_evsel__set_sample_bit(evsel, PIPELINE_HAZ);
+
if (opts->no_buffering) {
attr->watermark = 0;
attr->wakeup_events = 1;
@@ -2265,6 +2268,13 @@ int perf_evsel__parse_sample(struct evsel *evsel, union 
perf_event *event,
array = (void *)array + sz;
}
 
+   if (type & PERF_SAMPLE_PIPELINE_HAZ) {
+   sz = sizeof(struct perf_pipeline_haz_data);
+   OVERFLOW_CHECK(array, sz, max_size);
+   data->pipeline_haz = (struct perf_pipeline_haz_data *)array;
+   array = (void *)array + sz;
+   }
+
return 0;
 }
 
diff --git a/tools/perf/util/perf_event_attr_fprintf.c 
b/tools/perf/util/perf_event_attr_fprintf.c
index 651203126c71..d97e755c886b 100644
--- a/tools/perf/util/perf_event_attr_fprintf.c
+++ b/tools/perf/util/perf_event_attr_fprintf.c
@@ -35,6 +35,7 @@ static void __p_sample_type(char *buf, size_t size, u64 value)
bit_name(BRANCH_STACK), bit_name(REGS_USER), 
bit_name(STACK_USER),
bit_name(IDENTIFIER), bit_name(REGS_INTR), bit_name(DATA_SRC),
bit_name(WEIGHT), bit_name(PHYS_ADDR), bit_name(AUX),
+   bit_name(PIPELINE_HAZ),
{ .name = NULL, }
};
 #undef bit_name
diff --git a/tools/perf/util/record.h b/tools/perf/util/record.h
index 5421fd2ad383..f1678a0bc8ce 100644
--- a/tools/perf/util/record.h
+++ b/tools/perf/util/record.h
@@ -67,6 +67,7 @@ struct record_opts {
int   affinity;
int   mmap_flush;
unsigned int  comp_level;
+   bool  hazard;
 };
 
 extern const char * const *record_usage;
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index d0d7d25b23e3..834ca7df2349 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1153,6 +1153,19 @@ static void stack_user__printf(struct stack_dump *dump)
   dump->size, dump->offset);
 }
 
+static void pipeline_hazard__printf(struct perf_sample *sample)
+{
+   struct perf_pipeline_haz_data *haz = sample->pipeline_haz;
+
+   printf("... 

[RFC 11/11] perf annotate: Show hazard data in tui mode

2020-03-01 Thread Ravi Bangoria
Enable the perf report->annotate TUI mode to show hazard information. It
is hidden by default, but the user can reveal it by pressing the hot key
'S'. Sample output:

         │Disassembly of section .text:
         │
         │10001cf8 :
         │compare():
         │return NULL;
         │}
         │
         │static int
         │compare(const void *p1, const void *p2)
         │{
   33.23 │  std    r31,-8(r1)
         │   {haz_stage: LSU, haz_reason: ERAT Miss, stall_stage: LSU, stall_reason: Store, icache: L1 hit}
         │   {haz_stage: LSU, haz_reason: ERAT Miss, stall_stage: LSU, stall_reason: Store, icache: L1 hit}
         │   {haz_stage: LSU, haz_reason: Load Hit Store, stall_stage: LSU, stall_reason: -, icache: L3 hit}
         │   {haz_stage: LSU, haz_reason: ERAT Miss, stall_stage: -, stall_reason: -, icache: L1 hit}
         │   {haz_stage: LSU, haz_reason: ERAT Miss, stall_stage: LSU, stall_reason: Store, icache: L1 hit}
         │   {haz_stage: LSU, haz_reason: ERAT Miss, stall_stage: LSU, stall_reason: Store, icache: L1 hit}
    0.84 │  stdu   r1,-64(r1)
         │   {haz_stage: LSU, haz_reason: ERAT Miss, stall_stage: -, stall_reason: -, icache: L1 hit}
    0.24 │  mr     r31,r1
         │   {haz_stage: -, haz_reason: -, stall_stage: -, stall_reason: -, icache: L1 hit}
   21.18 │  std    r3,32(r31)
         │   {haz_stage: LSU, haz_reason: ERAT Miss, stall_stage: LSU, stall_reason: Store, icache: L1 hit}
         │   {haz_stage: LSU, haz_reason: ERAT Miss, stall_stage: LSU, stall_reason: Store, icache: L1 hit}
         │   {haz_stage: LSU, haz_reason: ERAT Miss, stall_stage: LSU, stall_reason: Store, icache: L1 hit}

Signed-off-by: Ravi Bangoria 
---
 tools/perf/builtin-annotate.c |   5 ++
 tools/perf/ui/browsers/annotate.c | 124 ++
 tools/perf/util/annotate.c|  51 +++-
 tools/perf/util/annotate.h|  18 -
 4 files changed, 178 insertions(+), 20 deletions(-)

diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index 78552a9428a6..a51313a6b019 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -472,6 +472,7 @@ static const char * const annotate_usage[] = {
 
 int cmd_annotate(int argc, const char **argv)
 {
+   bool annotate_haz = false;
struct perf_annotate annotate = {
.tool = {
.sample = process_sample_event,
@@ -531,6 +532,8 @@ int cmd_annotate(int argc, const char **argv)
 symbol__config_symfs),
OPT_BOOLEAN(0, "source", _src,
"Interleave source code with assembly code (default)"),
+   OPT_BOOLEAN(0, "hazard", &annotate_haz,
+   "Interleave CPU pipeline hazard/stall data with assembly code"),
OPT_BOOLEAN(0, "asm-raw", _asm_raw,
"Display raw encoding of assembly instructions (default)"),
OPT_STRING('M', "disassembler-style", 
_style, "disassembler style",
@@ -583,6 +586,8 @@ int cmd_annotate(int argc, const char **argv)
if (annotate_check_args() < 0)
return -EINVAL;
 
+   annotate.opts.hide_haz_data = !annotate_haz;
+
if (symbol_conf.show_nr_samples && annotate.use_gtk) {
pr_err("--show-nr-samples is not available in --gtk mode at 
this time\n");
return ret;
diff --git a/tools/perf/ui/browsers/annotate.c 
b/tools/perf/ui/browsers/annotate.c
index 2e4db8216b3b..b04d825cee50 100644
--- a/tools/perf/ui/browsers/annotate.c
+++ b/tools/perf/ui/browsers/annotate.c
@@ -190,9 +190,15 @@ static void annotate_browser__draw_current_jump(struct 
ui_browser *browser)
return;
}
 
-   if (notes->options->hide_src_code) {
+   if (notes->options->hide_src_code && notes->options->hide_haz_data) {
from = cursor->al.idx_asm;
to = target->idx_asm;
+   } else if (!notes->options->hide_src_code && 
notes->options->hide_haz_data) {
+   from = cursor->al.idx_asm + cursor->al.idx_src + 1;
+   to = target->idx_asm + target->idx_src + 1;
+   } else if (notes->options->hide_src_code && 
!notes->options->hide_haz_data) {
+   from = cursor->al.idx_asm + cursor->al.idx_haz + 1;
+   to = target->idx_asm + target->idx_haz + 1;
} else {
from = (u64)cursor->al.idx;
to = (u64)target->idx;
@@ -293,8 +299,13 @@ static void annotate_browser__set_rb_top(struct 
annotate_browser *browser,
struct annotation_line * pos = rb_entry(nd, struct annotation_line, 
rb_node);
u32 idx = pos->idx;
 
-   if (notes->options->hide_src_code)
+   if (notes->options->hide_src_code && notes->options->hide_haz_data)
idx = pos->idx_asm;
+   else if (!notes->options->hide_src_code && 
notes->options->hide_haz_data)
+

[RFC 09/11] perf annotate: Introduce type for annotation_line

2020-03-01 Thread Ravi Bangoria
struct annotation_line can contain either an assembly instruction or
a source code line. To distinguish between them we currently use the
offset: if the offset is -1 it's a source line, otherwise it's assembly.
This is a bit cryptic when you first read the code. Introduce a new
field, 'type', that denotes the kind of data the annotation_line
object contains.
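
As a rough standalone illustration of the idea (not the perf code itself), an explicit tag replaces the sentinel comparison. AL_TYPE_ASM and AL_TYPE_SRC are the names used in this patch; the enum name, the cut-down struct and the demo_* helpers are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Tag names as used in this patch; the enum name is an assumption. */
enum demo_al_type {
	AL_TYPE_ASM,
	AL_TYPE_SRC,
};

/* Cut-down, hypothetical stand-in for struct annotation_line. */
struct demo_line {
	int64_t offset;	/* -1 for source lines, >= 0 for assembly */
	uint8_t type;	/* AL_TYPE_ASM or AL_TYPE_SRC */
};

/* Derive the tag once, at the point where the objdump line is parsed. */
static inline uint8_t demo_type_from_offset(int64_t offset)
{
	return (offset != -1) ? AL_TYPE_ASM : AL_TYPE_SRC;
}

/* Old style: the reader has to know that -1 means "source". */
static inline bool demo_is_asm_by_offset(const struct demo_line *l)
{
	return l->offset != -1;
}

/* New style: the intent is spelled out at every call site. */
static inline bool demo_is_asm_by_type(const struct demo_line *l)
{
	return l->type == AL_TYPE_ASM;
}
```

A tag also leaves room for a third kind of line later without overloading another magic offset value.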

Signed-off-by: Ravi Bangoria 
---
 tools/perf/ui/browsers/annotate.c |  4 ++--
 tools/perf/ui/gtk/annotate.c  |  6 +++---
 tools/perf/util/annotate.c| 27 ---
 tools/perf/util/annotate.h|  8 +++-
 4 files changed, 28 insertions(+), 17 deletions(-)

diff --git a/tools/perf/ui/browsers/annotate.c 
b/tools/perf/ui/browsers/annotate.c
index 9023267e5643..2e4db8216b3b 100644
--- a/tools/perf/ui/browsers/annotate.c
+++ b/tools/perf/ui/browsers/annotate.c
@@ -317,7 +317,7 @@ static void annotate_browser__calc_percent(struct 
annotate_browser *browser,
double max_percent = 0.0;
int i;
 
-   if (pos->al.offset == -1) {
+   if (pos->al.type != AL_TYPE_ASM) {
RB_CLEAR_NODE(&pos->al.rb_node);
continue;
}
@@ -816,7 +816,7 @@ static int annotate_browser__run(struct annotate_browser 
*browser,
 
if (browser->selection == NULL)
ui_helpline__puts("Huh? No selection. Report to 
linux-ker...@vger.kernel.org");
-   else if (browser->selection->offset == -1)
+   else if (browser->selection->type != AL_TYPE_ASM)
ui_helpline__puts("Actions are only available 
for assembly lines.");
else if (!dl->ins.ops)
goto show_sup_ins;
diff --git a/tools/perf/ui/gtk/annotate.c b/tools/perf/ui/gtk/annotate.c
index 35f9641bf670..71c792a0b17d 100644
--- a/tools/perf/ui/gtk/annotate.c
+++ b/tools/perf/ui/gtk/annotate.c
@@ -35,7 +35,7 @@ static int perf_gtk__get_percent(char *buf, size_t size, 
struct symbol *sym,
 
strcpy(buf, "");
 
-   if (dl->al.offset == (s64) -1)
+   if (dl->al.type != AL_TYPE_ASM)
return 0;
 
symhist = annotation__histogram(symbol__annotation(sym), evidx);
@@ -61,7 +61,7 @@ static int perf_gtk__get_offset(char *buf, size_t size, 
struct map_symbol *ms,
 
strcpy(buf, "");
 
-   if (dl->al.offset == (s64) -1)
+   if (dl->al.type != AL_TYPE_ASM)
return 0;
 
return scnprintf(buf, size, "%"PRIx64, start + dl->al.offset);
@@ -78,7 +78,7 @@ static int perf_gtk__get_line(char *buf, size_t size, struct 
disasm_line *dl)
if (!line)
return 0;
 
-   if (dl->al.offset != (s64) -1)
+   if (dl->al.type == AL_TYPE_ASM)
markup = NULL;
 
if (markup)
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 4e2706274d85..8aef60a6ffea 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -1148,6 +1148,7 @@ struct annotate_args {
struct evsel  *evsel;
struct annotation_options *options;
s64   offset;
+   u8type;
char  *line;
int   line_nr;
 };
@@ -1156,6 +1157,7 @@ static void annotation_line__init(struct annotation_line 
*al,
  struct annotate_args *args,
  int nr)
 {
+   al->type = args->type;
al->offset = args->offset;
al->line = strdup(args->line);
al->line_nr = args->line_nr;
@@ -1202,7 +1204,7 @@ static struct disasm_line *disasm_line__new(struct 
annotate_args *args)
if (dl->al.line == NULL)
goto out_delete;
 
-   if (args->offset != -1) {
+   if (dl->al.type == AL_TYPE_ASM) {
if (disasm_line__parse(dl->al.line, &dl->ins.name, 
&dl->ops.raw) < 0)
goto out_free_line;
 
@@ -1246,7 +1248,7 @@ struct annotation_line *
 annotation_line__next(struct annotation_line *pos, struct list_head *head)
 {
list_for_each_entry_continue(pos, head, node)
-   if (pos->offset >= 0)
+   if (pos->type == AL_TYPE_ASM)
return pos;
 
return NULL;
@@ -1357,7 +1359,7 @@ annotation_line__print(struct annotation_line *al, struct 
symbol *sym, u64 start
static const char *prev_line;
static const char *prev_color;
 
-   if (al->offset != -1) {
+   if (al->type == AL_TYPE_ASM) {
double max_percent = 0.0;
int i, nr_percent = 1;
const char *color;
@@ -1500,6 +1502,7 @@ static int symbol__parse_objdump_line(struct symbol *sym,
}
 
args->offset  = offset;
+   args->type= (offset != -1) ? AL_TYPE_ASM : AL_TYPE_SRC;
args->line= parsed_line;
args->line_nr = 

[RFC 10/11] perf annotate: Preparation for hazard

2020-03-01 Thread Ravi Bangoria
Introduce 'struct hazard_hist', which holds hazard-specific information
for annotate. Add an array of lists of 'struct hazard_hist' to
'struct annotated_source': the array length equals the symbol size, and
each list member holds the hazard info from one associated perf sample.
This information is collected while parsing samples in perf report.
This is only a preparation step for annotate; the follow-up patch makes
the actual annotate UI changes.
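
The data structure described above can be pictured with a small standalone sketch: one list head per byte offset within the symbol, and one list node per sample. Everything named demo_* is hypothetical, and a plain singly linked list stands in for the kernel-style list_head that the patch uses.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for struct hazard_hist. */
struct demo_haz {
	const char *haz_reason;
	struct demo_haz *next;
};

/* One list head per byte offset inside the symbol. */
struct demo_haz_table {
	size_t size;			/* symbol size in bytes */
	struct demo_haz **heads;	/* 'size' entries, NULL when empty */
};

int demo_table_init(struct demo_haz_table *t, size_t sym_size)
{
	t->heads = calloc(sym_size, sizeof(*t->heads));
	if (!t->heads)
		return -1;
	t->size = sym_size;
	return 0;
}

/* Record one sample at (addr - sym_start); reject out-of-range addresses. */
int demo_table_add(struct demo_haz_table *t, unsigned long sym_start,
		   unsigned long addr, const char *reason)
{
	struct demo_haz *h;

	if (addr < sym_start || addr - sym_start >= t->size)
		return -1;

	h = calloc(1, sizeof(*h));
	if (!h)
		return -1;

	h->haz_reason = reason;
	h->next = t->heads[addr - sym_start];
	t->heads[addr - sym_start] = h;
	return 0;
}

/* Number of samples recorded at a given offset. */
size_t demo_table_count(const struct demo_haz_table *t, size_t offset)
{
	const struct demo_haz *h;
	size_t n = 0;

	for (h = t->heads[offset]; h; h = h->next)
		n++;
	return n;
}

void demo_table_free(struct demo_haz_table *t)
{
	size_t i;

	for (i = 0; i < t->size; i++) {
		while (t->heads[i]) {
			struct demo_haz *h = t->heads[i];

			t->heads[i] = h->next;
			free(h);
		}
	}
	free(t->heads);
	t->heads = NULL;
	t->size = 0;
}
```

Indexing by offset makes the later annotate lookup a direct array access, at the cost of one list head per byte of the symbol.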

Signed-off-by: Ravi Bangoria 
---
 tools/perf/builtin-report.c |  1 +
 tools/perf/util/annotate.c  | 75 +
 tools/perf/util/annotate.h  | 14 +++
 tools/perf/util/hist.c  | 13 +++
 tools/perf/util/hist.h  |  4 ++
 tools/perf/util/machine.c   |  6 +++
 tools/perf/util/machine.h   |  3 ++
 7 files changed, 116 insertions(+)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index a47542a12da1..ff950ff8dd51 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -301,6 +301,7 @@ static int process_sample_event(struct perf_tool *tool,
hist__account_cycles(sample->branch_stack, , sample,
 rep->nonany_branch_mode,
 >total_cycles);
+   hist__capture_haz_info(, sample, evsel);
}
 
ret = hist_entry_iter__add(, , rep->max_stack, rep);
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 8aef60a6ffea..766934b0f36d 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -36,6 +36,7 @@
 #include "string2.h"
 #include "util/event.h"
 #include "arch/common.h"
+#include "hazard.h"
 #include 
 #include 
 #include 
@@ -800,6 +801,21 @@ static int symbol__alloc_hist_cycles(struct symbol *sym)
return 0;
 }
 
+static int symbol__alloc_hist_hazard(struct symbol *sym)
+{
+   struct annotation *notes = symbol__annotation(sym);
+   const size_t size = symbol__size(sym);
+   size_t i;
+
+   notes->src->haz_hist = calloc(size, sizeof(struct hazard_hist));
+   if (notes->src->haz_hist == NULL)
+   return -1;
+
+   for (i = 0; i < size; i++)
+   INIT_LIST_HEAD(&notes->src->haz_hist[i].list);
+   return 0;
+}
+
 void symbol__annotate_zero_histograms(struct symbol *sym)
 {
struct annotation *notes = symbol__annotation(sym);
@@ -920,6 +936,25 @@ static struct cyc_hist *symbol__cycles_hist(struct symbol 
*sym)
return notes->src->cycles_hist;
 }
 
+static struct hazard_hist *symbol__hazard_hist(struct symbol *sym)
+{
+   struct annotation *notes = symbol__annotation(sym);
+
+   if (notes->src == NULL) {
+   notes->src = annotated_source__new();
+   if (notes->src == NULL)
+   return NULL;
+   goto alloc_haz_hist;
+   }
+
+   if (!notes->src->haz_hist) {
+alloc_haz_hist:
+   symbol__alloc_hist_hazard(sym);
+   }
+
+   return notes->src->haz_hist;
+}
+
 struct annotated_source *symbol__hists(struct symbol *sym, int nr_hists)
 {
struct annotation *notes = symbol__annotation(sym);
@@ -1014,6 +1049,46 @@ int addr_map_symbol__account_cycles(struct 
addr_map_symbol *ams,
return err;
 }
 
+int symbol__capture_haz_info(struct addr_map_symbol *ams,
+struct perf_sample *sample,
+struct evsel *evsel)
+{
+   struct hazard_hist *hh, *tmp;
+   u64 offset;
+   const char *arch = perf_env__arch(perf_evsel__env(evsel));
+
+   if (ams->ms.sym == NULL)
+   return 0;
+
+   hh = symbol__hazard_hist(ams->ms.sym);
+   if (!hh)
+   return -ENOMEM;
+
+   if (ams->al_addr < ams->ms.sym->start || ams->al_addr >= 
ams->ms.sym->end)
+   return -ERANGE;
+
+   offset = ams->al_addr - ams->ms.sym->start;
+
+   tmp = zalloc(sizeof(*tmp));
+   if (!tmp)
+   return -ENOMEM;
+
+   tmp->icache = perf_haz__icache_str(sample->pipeline_haz->icache, arch);
+   tmp->haz_stage = 
perf_haz__hstage_str(sample->pipeline_haz->hazard_stage,
+ arch);
+   tmp->haz_reason = 
perf_haz__hreason_str(sample->pipeline_haz->hazard_stage,
+   
sample->pipeline_haz->hazard_reason,
+   arch);
+   tmp->stall_stage = 
perf_haz__sstage_str(sample->pipeline_haz->stall_stage,
+   arch);
+   tmp->stall_reason = 
perf_haz__sreason_str(sample->pipeline_haz->stall_stage,
+ 
sample->pipeline_haz->stall_reason,
+ arch);
+
+   list_add(&tmp->list, &hh[offset].list);
+   return 0;
+}
+
 static unsigned annotation__count_insn(struct annotation *notes, u64 start, 
u64 end)
 {
unsigned n_insn = 0;
diff --git 

[RFC 08/11] perf report: Enable hazard mode

2020-03-01 Thread Ravi Bangoria
From: Madhavan Srinivasan 

Introduce a --hazard option for perf report to display hazard data.
The hazard mode columns are Instruction Type, Hazard Stage, Hazard
Reason, Stall Stage, Stall Reason and Icache access. The default sort
order is sym, dso, inst type, hazard stage, hazard reason, stall
stage, stall reason, inst cache.

Sample output on an IBM PowerPC machine:

  Overhead  Symbol          Shared  Instruction Type  Hazard Stage  Hazard Reason         Stall Stage  Stall Reason  ICache access
    36.58%  [.] thread_run  ebizzy  Load              LSU           Mispredict            LSU          Load fin      L1 hit
     9.46%  [.] thread_run  ebizzy  Load              LSU           Mispredict            LSU          Dcache_miss   L1 hit
     1.76%  [.] thread_run  ebizzy  Fixed point       -             -                     -            -             L1 hit
     1.31%  [.] thread_run  ebizzy  Load              LSU           ERAT Miss             LSU          Load fin      L1 hit
     1.27%  [.] thread_run  ebizzy  Load              LSU           Mispredict            -            -             L1 hit
     1.16%  [.] thread_run  ebizzy  Fixed point       -             -                     FXU          Fixed cycle   L1 hit
     0.50%  [.] thread_run  ebizzy  Fixed point       ISU           Source Unavailable    FXU          Fixed cycle   L1 hit
     0.30%  [.] thread_run  ebizzy  Load              LSU           LMQ Full, DERAT Miss  LSU          Load fin      L1 hit
     0.24%  [.] thread_run  ebizzy  Load              LSU           ERAT Miss             -            -             L1 hit
     0.08%  [.] thread_run  ebizzy  -                 -             -                     BRU          Fixed cycle   L1 hit
     0.05%  [.] thread_run  ebizzy  Branch            -             -                     BRU          Fixed cycle   L1 hit
     0.04%  [.] thread_run  ebizzy  Fixed point       ISU           Source Unavailable    -            -             L1 hit

Signed-off-by: Madhavan Srinivasan 
Signed-off-by: Ravi Bangoria 
---
 tools/perf/builtin-report.c |  28 +
 tools/perf/util/hist.c  |  77 
 tools/perf/util/hist.h  |   7 ++
 tools/perf/util/sort.c  | 230 
 tools/perf/util/sort.h  |  22 
 5 files changed, 364 insertions(+)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 72a12b69f120..a47542a12da1 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -77,6 +77,7 @@ struct report {
boolshow_threads;
boolinverted_callchain;
boolmem_mode;
+   boolhazard;
boolstats_mode;
booltasks_mode;
boolmmaps_mode;
@@ -285,6 +286,8 @@ static int process_sample_event(struct perf_tool *tool,
iter.ops = _iter_branch;
} else if (rep->mem_mode) {
iter.ops = _iter_mem;
+   } else if (rep->hazard) {
+   iter.ops = _iter_haz;
} else if (symbol_conf.cumulate_callchain) {
iter.ops = _iter_cumulative;
} else {
@@ -396,6 +399,14 @@ static int report__setup_sample_type(struct report *rep)
}
}
 
+   if (sort__mode == SORT_MODE__HAZARD) {
+   if (!is_pipe && !(sample_type & PERF_SAMPLE_PIPELINE_HAZ)) {
+   ui__error("Selected --hazard but no hazard data. "
+ "Did you call perf record without 
--hazard?\n");
+   return -1;
+   }
+   }
+
if (symbol_conf.use_callchain || symbol_conf.cumulate_callchain) {
if ((sample_type & PERF_SAMPLE_REGS_USER) &&
(sample_type & PERF_SAMPLE_STACK_USER)) {
@@ -484,6 +495,9 @@ static size_t hists__fprintf_nr_sample_events(struct hists 
*hists, struct report
if (rep->mem_mode) {
ret += fprintf(fp, "\n# Total weight : %" PRIu64, nr_events);
ret += fprintf(fp, "\n# Sort order   : %s", sort_order ? : 
default_mem_sort_order);
+   } else if (rep->hazard) {
+   ret += fprintf(fp, "\n# Event count (approx.): %" PRIu64, 
nr_events);
+   ret += fprintf(fp, "\n# Sort order: %s", sort_order ? : 
default_haz_sort_order);
} else
ret += fprintf(fp, "\n# Event count (approx.): %" PRIu64, 
nr_events);
 
@@ -1228,6 +1242,7 @@ int cmd_report(int argc, const char **argv)
OPT_BOOLEAN(0, "demangle-kernel", _conf.demangle_kernel,
"Enable kernel symbol demangling"),
OPT_BOOLEAN(0, "mem-mode", _mode, "mem access profile"),
+   OPT_BOOLEAN(0, "hazard", , "Processor pipeline hazard and 
stalls"),
OPT_INTEGER(0, "samples", _conf.res_sample,
"Number of samples to save per 

[RFC 07/11] perf hazard: Functions to convert generic hazard data to arch specific string

2020-03-01 Thread Ravi Bangoria
From: Madhavan Srinivasan 

The kernel provides pipeline hazard data in struct perf_pipeline_haz_data
format. Add code to convert this data into meaningful strings that can
be shown in perf report (follow-up patch).

Introduce the tools/perf/util/hazard directory, which will contain
arch-specific directories. Under each arch-specific directory, add the
arch-specific logic that will be called by generic code. This directory
structure is introduced to enable cross-arch reporting.
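
The generic-to-arch dispatch can be sketched in a few standalone lines. This is only an illustration under assumptions: the demo_* names are hypothetical, the numeric codes follow enum perf_inst_cache from this series, "L1 hit" and "L3 hit" match the sample output posted in the series, and the other two labels are guesses.

```c
#include <string.h>

/* Hypothetical per-arch decoder: maps an icache code to a label. */
static const char *demo_powerpc_icache_str(unsigned char icache)
{
	switch (icache) {
	case 1: return "L1 hit";	/* PERF_HAZ__ICACHE_L1_HIT */
	case 2: return "L2 hit";	/* label is a guess */
	case 3: return "L3 hit";	/* PERF_HAZ__ICACHE_L3_HIT */
	case 4: return "L3 miss";	/* label is a guess */
	default: return "-";
	}
}

/*
 * Generic entry point: choose the decoder from the arch string recorded
 * with the data, so a file recorded on powerpc can be decoded on any host.
 * Unknown architectures fall back to "-".
 */
const char *demo_icache_str(unsigned char icache, const char *arch)
{
	if (arch && !strncmp(arch, "powerpc", strlen("powerpc")))
		return demo_powerpc_icache_str(icache);

	return "-";
}
```

Because the selection is made at run time from the recorded arch string rather than with #ifdef on the build host, every arch decoder has to be compiled into every perf binary, which is what the hazard/ directory layout provides for.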

Signed-off-by: Madhavan Srinivasan 
Signed-off-by: Ravi Bangoria 
---
 tools/perf/util/Build |   2 +
 tools/perf/util/hazard.c  |  51 +++
 tools/perf/util/hazard.h  |  14 ++
 tools/perf/util/hazard/Build  |   1 +
 .../util/hazard/powerpc/perf_pipeline_haz.h   |  80 ++
 .../perf/util/hazard/powerpc/powerpc_hazard.c | 142 ++
 .../perf/util/hazard/powerpc/powerpc_hazard.h |  14 ++
 7 files changed, 304 insertions(+)
 create mode 100644 tools/perf/util/hazard.c
 create mode 100644 tools/perf/util/hazard.h
 create mode 100644 tools/perf/util/hazard/Build
 create mode 100644 tools/perf/util/hazard/powerpc/perf_pipeline_haz.h
 create mode 100644 tools/perf/util/hazard/powerpc/powerpc_hazard.c
 create mode 100644 tools/perf/util/hazard/powerpc/powerpc_hazard.h

diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index 07da6c790b63..f5e1b7d79b6d 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -118,6 +118,7 @@ perf-y += parse-regs-options.o
 perf-y += term.o
 perf-y += help-unknown-cmd.o
 perf-y += mem-events.o
+perf-y += hazard.o
 perf-y += vsprintf.o
 perf-y += units.o
 perf-y += time-utils.o
@@ -153,6 +154,7 @@ perf-$(CONFIG_LIBUNWIND_AARCH64)  += libunwind/arm64.o
 perf-$(CONFIG_LIBBABELTRACE) += data-convert-bt.o
 
 perf-y += scripting-engines/
+perf-y += hazard/
 
 perf-$(CONFIG_ZLIB) += zlib.o
 perf-$(CONFIG_LZMA) += lzma.o
diff --git a/tools/perf/util/hazard.c b/tools/perf/util/hazard.c
new file mode 100644
index ..db235b26b266
--- /dev/null
+++ b/tools/perf/util/hazard.c
@@ -0,0 +1,51 @@
+// SPDX-License-Identifier: GPL-2.0
+#include 
+#include "hazard/powerpc/powerpc_hazard.h"
+
+const char *perf_haz__itype_str(u8 itype, const char *arch)
+{
+   if (!strncmp(arch, "powerpc", strlen("powerpc")))
+   return powerpc__haz__itype_str(itype);
+
+   return "-";
+}
+
+const char *perf_haz__icache_str(u8 icache, const char *arch)
+{
+   if (!strncmp(arch, "powerpc", strlen("powerpc")))
+   return powerpc__haz__icache_str(icache);
+
+   return "-";
+}
+
+const char *perf_haz__hstage_str(u8 hstage, const char *arch)
+{
+   if (!strncmp(arch, "powerpc", strlen("powerpc")))
+   return powerpc__haz__hstage_str(hstage);
+
+   return "-";
+}
+
+const char *perf_haz__hreason_str(u8 hstage, u8 hreason, const char *arch)
+{
+   if (!strncmp(arch, "powerpc", strlen("powerpc")))
+   return powerpc__haz__hreason_str(hstage, hreason);
+
+   return "-";
+}
+
+const char *perf_haz__sstage_str(u8 sstage, const char *arch)
+{
+   if (!strncmp(arch, "powerpc", strlen("powerpc")))
+   return powerpc__haz__sstage_str(sstage);
+
+   return "-";
+}
+
+const char *perf_haz__sreason_str(u8 sstage, u8 sreason, const char *arch)
+{
+   if (!strncmp(arch, "powerpc", strlen("powerpc")))
+   return powerpc__haz__sreason_str(sstage, sreason);
+
+   return "-";
+}
diff --git a/tools/perf/util/hazard.h b/tools/perf/util/hazard.h
new file mode 100644
index ..eab4190e056a
--- /dev/null
+++ b/tools/perf/util/hazard.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __PERF_HAZARD_H
+#define __PERF_HAZARD_H
+
+#include "sort.h"
+
+const char *perf_haz__itype_str(u8 itype, const char *arch);
+const char *perf_haz__icache_str(u8 icache, const char *arch);
+const char *perf_haz__hstage_str(u8 hstage, const char *arch);
+const char *perf_haz__hreason_str(u8 hstage, u8 hreason, const char *arch);
+const char *perf_haz__sstage_str(u8 sstage, const char *arch);
+const char *perf_haz__sreason_str(u8 sstage, u8 sreason, const char *arch);
+
+#endif /* __PERF_HAZARD_H */
diff --git a/tools/perf/util/hazard/Build b/tools/perf/util/hazard/Build
new file mode 100644
index ..314c5e316383
--- /dev/null
+++ b/tools/perf/util/hazard/Build
@@ -0,0 +1 @@
+perf-y += powerpc/powerpc_hazard.o
diff --git a/tools/perf/util/hazard/powerpc/perf_pipeline_haz.h 
b/tools/perf/util/hazard/powerpc/perf_pipeline_haz.h
new file mode 100644
index ..de8857ec31dd
--- /dev/null
+++ b/tools/perf/util/hazard/powerpc/perf_pipeline_haz.h
@@ -0,0 +1,80 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _UAPI_ASM_POWERPC_PERF_PIPELINE_HAZ_H
+#define _UAPI_ASM_POWERPC_PERF_PIPELINE_HAZ_H
+
+enum perf_inst_type {
+   PERF_HAZ__ITYPE_LOAD = 1,
+   PERF_HAZ__ITYPE_STORE,
+   

[RFC 06/11] perf hists: Make a room for hazard info in struct hist_entry

2020-03-01 Thread Ravi Bangoria
From: Madhavan Srinivasan 

To enable hazard mode with perf report (follow-up patch) we need to
have CPU pipeline hazard data available in hist_entry. Add hazard
info to struct hist_entry. Also add hazard_info as a parameter to
hists__add_entry().

Signed-off-by: Madhavan Srinivasan 
Signed-off-by: Ravi Bangoria 
---
 tools/perf/builtin-annotate.c |  2 +-
 tools/perf/builtin-c2c.c  |  4 ++--
 tools/perf/builtin-diff.c |  6 +++---
 tools/perf/tests/hists_link.c |  4 ++--
 tools/perf/util/hist.c| 22 +++---
 tools/perf/util/hist.h|  2 ++
 tools/perf/util/sort.h|  1 +
 7 files changed, 26 insertions(+), 15 deletions(-)

diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index 6c0a0412502e..78552a9428a6 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -249,7 +249,7 @@ static int perf_evsel__add_sample(struct evsel *evsel,
if (ann->has_br_stack && has_annotation(ann))
return process_branch_callback(evsel, sample, al, ann, machine);
 
-   he = hists__add_entry(hists, al, NULL, NULL, NULL, sample, true);
+   he = hists__add_entry(hists, al, NULL, NULL, NULL, NULL, sample, true);
if (he == NULL)
return -ENOMEM;
 
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 246ac0b4d54f..2a1cb5cda6d9 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -292,7 +292,7 @@ static int process_sample_event(struct perf_tool *tool 
__maybe_unused,
c2c_decode_stats(, mi);
 
he = hists__add_entry_ops(_hists->hists, _entry_ops,
- , NULL, NULL, mi,
+ , NULL, NULL, mi, NULL,
  sample, true);
if (he == NULL)
goto free_mi;
@@ -326,7 +326,7 @@ static int process_sample_event(struct perf_tool *tool 
__maybe_unused,
goto free_mi;
 
he = hists__add_entry_ops(_hists->hists, _entry_ops,
- , NULL, NULL, mi,
+ , NULL, NULL, mi, NULL,
  sample, true);
if (he == NULL)
goto free_mi;
diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index f8b6ae557d8b..e32e91f89a18 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -412,15 +412,15 @@ static int diff__process_sample_event(struct perf_tool 
*tool,
}
 
if (compute != COMPUTE_CYCLES) {
-   if (!hists__add_entry(hists, , NULL, NULL, NULL, sample,
- true)) {
+   if (!hists__add_entry(hists, , NULL, NULL, NULL, NULL,
+ sample, true)) {
pr_warning("problem incrementing symbol period, "
   "skipping event\n");
goto out_put;
}
} else {
if (!hists__add_entry_ops(hists, _hist_ops, , NULL,
- NULL, NULL, sample, true)) {
+ NULL, NULL, NULL, sample, true)) {
pr_warning("problem incrementing symbol period, "
   "skipping event\n");
goto out_put;
diff --git a/tools/perf/tests/hists_link.c b/tools/perf/tests/hists_link.c
index a024d3f3a412..112a90818d2e 100644
--- a/tools/perf/tests/hists_link.c
+++ b/tools/perf/tests/hists_link.c
@@ -86,7 +86,7 @@ static int add_hist_entries(struct evlist *evlist, struct 
machine *machine)
if (machine__resolve(machine, , ) < 0)
goto out;
 
-   he = hists__add_entry(hists, , NULL,
+   he = hists__add_entry(hists, , NULL, NULL,
NULL, NULL, , true);
if (he == NULL) {
addr_location__put();
@@ -105,7 +105,7 @@ static int add_hist_entries(struct evlist *evlist, struct 
machine *machine)
if (machine__resolve(machine, , ) < 0)
goto out;
 
-   he = hists__add_entry(hists, , NULL,
+   he = hists__add_entry(hists, , NULL, NULL,
NULL, NULL, , true);
if (he == NULL) {
addr_location__put();
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index ca5a8f4d007e..6d23efaa52c8 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -604,6 +604,7 @@ static struct hist_entry *hists__findnew_entry(struct hists 
*hists,
 * and will not be used anymore.
 */

[RFC 03/11] powerpc/perf: Arch specific definitions for pipeline

2020-03-01 Thread Ravi Bangoria
From: Madhavan Srinivasan 

Create powerpc-specific definitions for pipeline hazards and stalls.
This information is available in the SIER register on powerpc. The
current definitions are based on the IBM PowerPC SIER specification
available in the ISA[1] and the Performance Monitor Unit User’s Guide[2].

[1]: Book III, Section 9.4.10:
 https://openpowerfoundation.org/?resource_lib=power-isa-version-3-0
[2]: 
https://wiki.raptorcs.com/w/images/6/6b/POWER9_PMU_UG_v12_28NOV2018_pub.pdf#G9.1106986
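
One consequence of these definitions is that a reason code is only meaningful together with its stage: code 1 is an ERAT miss for the LSU but a direction mispredict for the BRU. A minimal standalone sketch of such a lookup follows. The numeric values are copied from the enums in this patch; the demo_* names and the label strings are illustrative assumptions, not the tool's actual tables.

```c
#include <stddef.h>

/* Stage values copied from enum perf_pipeline_stage in this patch. */
enum demo_stage {
	DEMO_STAGE_LSU = 4,	/* PERF_HAZ__PIPE_STAGE_LSU */
	DEMO_STAGE_BRU = 5,	/* PERF_HAZ__PIPE_STAGE_BRU */
};

/* Index 0 stays NULL because the reason codes start at 1. */
static const char * const demo_lsu_haz[] = {
	[1] = "ERAT Miss",		/* PERF_HAZ__HAZ_LSU_ERAT_MISS */
	[2] = "LMQ Full",		/* PERF_HAZ__HAZ_LSU_LMQ */
	[3] = "Load Hit Store",		/* PERF_HAZ__HAZ_LSU_LHS */
	[4] = "Mispredict",		/* PERF_HAZ__HAZ_LSU_MPRED */
};

static const char * const demo_bru_haz[] = {
	[1] = "Direction mispredict",	/* PERF_HAZ__HAZ_BRU_MPRED_DIR */
	[2] = "Target mispredict",	/* PERF_HAZ__HAZ_BRU_MPRED_TA */
};

#define DEMO_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Pick the table from the stage, then index it with the reason code. */
const char *demo_haz_reason_str(unsigned int stage, unsigned int reason)
{
	const char * const *tbl;
	size_t n;

	switch (stage) {
	case DEMO_STAGE_LSU:
		tbl = demo_lsu_haz;
		n = DEMO_ARRAY_SIZE(demo_lsu_haz);
		break;
	case DEMO_STAGE_BRU:
		tbl = demo_bru_haz;
		n = DEMO_ARRAY_SIZE(demo_bru_haz);
		break;
	default:
		return "-";
	}

	if (reason < n && tbl[reason])
		return tbl[reason];

	return "-";
}
```

With these values, a raw dump showing Hazard Stage 0x4 and Hazard Reason 0x3 decodes to the LSU stage with a load-hit-store hazard.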

Signed-off-by: Madhavan Srinivasan 
Signed-off-by: Ravi Bangoria 
---
 .../include/uapi/asm/perf_pipeline_haz.h  | 80 +++
 1 file changed, 80 insertions(+)
 create mode 100644 arch/powerpc/include/uapi/asm/perf_pipeline_haz.h

diff --git a/arch/powerpc/include/uapi/asm/perf_pipeline_haz.h 
b/arch/powerpc/include/uapi/asm/perf_pipeline_haz.h
new file mode 100644
index ..de8857ec31dd
--- /dev/null
+++ b/arch/powerpc/include/uapi/asm/perf_pipeline_haz.h
@@ -0,0 +1,80 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _UAPI_ASM_POWERPC_PERF_PIPELINE_HAZ_H
+#define _UAPI_ASM_POWERPC_PERF_PIPELINE_HAZ_H
+
+enum perf_inst_type {
+   PERF_HAZ__ITYPE_LOAD = 1,
+   PERF_HAZ__ITYPE_STORE,
+   PERF_HAZ__ITYPE_BRANCH,
+   PERF_HAZ__ITYPE_FP,
+   PERF_HAZ__ITYPE_FX,
+   PERF_HAZ__ITYPE_CR_OR_SC,
+};
+
+enum perf_inst_cache {
+   PERF_HAZ__ICACHE_L1_HIT = 1,
+   PERF_HAZ__ICACHE_L2_HIT,
+   PERF_HAZ__ICACHE_L3_HIT,
+   PERF_HAZ__ICACHE_L3_MISS,
+};
+
+enum perf_pipeline_stage {
+   PERF_HAZ__PIPE_STAGE_IFU = 1,
+   PERF_HAZ__PIPE_STAGE_IDU,
+   PERF_HAZ__PIPE_STAGE_ISU,
+   PERF_HAZ__PIPE_STAGE_LSU,
+   PERF_HAZ__PIPE_STAGE_BRU,
+   PERF_HAZ__PIPE_STAGE_FXU,
+   PERF_HAZ__PIPE_STAGE_FPU,
+   PERF_HAZ__PIPE_STAGE_VSU,
+   PERF_HAZ__PIPE_STAGE_OTHER,
+};
+
+enum perf_haz_bru_reason {
+   PERF_HAZ__HAZ_BRU_MPRED_DIR = 1,
+   PERF_HAZ__HAZ_BRU_MPRED_TA,
+};
+
+enum perf_haz_isu_reason {
+   PERF_HAZ__HAZ_ISU_SRC = 1,
+   PERF_HAZ__HAZ_ISU_COL,
+};
+
+enum perf_haz_lsu_reason {
+   PERF_HAZ__HAZ_LSU_ERAT_MISS = 1,
+   PERF_HAZ__HAZ_LSU_LMQ,
+   PERF_HAZ__HAZ_LSU_LHS,
+   PERF_HAZ__HAZ_LSU_MPRED,
+   PERF_HAZ__HAZ_DERAT_MISS,
+   PERF_HAZ__HAZ_LSU_LMQ_DERAT_MISS,
+   PERF_HAZ__HAZ_LSU_LHS_DERAT_MISS,
+   PERF_HAZ__HAZ_LSU_MPRED_DERAT_MISS,
+};
+
+enum perf_stall_lsu_reason {
+   PERF_HAZ__STALL_LSU_DCACHE_MISS = 1,
+   PERF_HAZ__STALL_LSU_LD_FIN,
+   PERF_HAZ__STALL_LSU_ST_FWD,
+   PERF_HAZ__STALL_LSU_ST,
+};
+
+enum perf_stall_fxu_reason {
+   PERF_HAZ__STALL_FXU_MC = 1,
+   PERF_HAZ__STALL_FXU_FC,
+};
+
+enum perf_stall_bru_reason {
+   PERF_HAZ__STALL_BRU_FIN_MPRED = 1,
+   PERF_HAZ__STALL_BRU_FC,
+};
+
+enum perf_stall_vsu_reason {
+   PERF_HAZ__STALL_VSU_MC = 1,
+   PERF_HAZ__STALL_VSU_FC,
+};
+
+enum perf_stall_other_reason {
+   PERF_HAZ__STALL_NTC = 1,
+};
+
+#endif /* _UAPI_ASM_POWERPC_PERF_PIPELINE_HAZ_H */
-- 
2.21.1



[RFC 04/11] powerpc/perf: Arch support to expose Hazard data

2020-03-01 Thread Ravi Bangoria
From: Madhavan Srinivasan 

The SIER register of the PowerPC hardware PMU provides CPU pipeline hazard
information. Add logic to convert this arch-specific data into the
perf_pipeline_haz_data structure.
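
As a rough illustration of the conversion step, here is a minimal userspace
sketch that maps the SIER TYPE field to the generic instruction type. It is
not the code from this series: sier_to_itype() is a hypothetical helper, the
enum folds the RFC's PERF_HAZ__ITYPE_NA define and its enum perf_inst_type
into one declaration, and a lookup table replaces the switch statement.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Generic instruction types with the values the RFC headers assign.
 * Assumption: NA (0x0) and the enum are folded together for brevity.
 */
enum {
	PERF_HAZ__ITYPE_NA = 0,
	PERF_HAZ__ITYPE_LOAD,
	PERF_HAZ__ITYPE_STORE,
	PERF_HAZ__ITYPE_BRANCH,
	PERF_HAZ__ITYPE_FP,
	PERF_HAZ__ITYPE_FX,
	PERF_HAZ__ITYPE_CR_OR_SC,
};

/* Same extraction macro as introduced by patch 01/11 of the series. */
#define SIER_TYPE(sier)	(((sier) >> (63 - 48)) & 0x7ull)

/* The table relies on NA being zero so that unlisted slots decode as NA. */
static_assert(PERF_HAZ__ITYPE_NA == 0, "NA must be the zero value");

/*
 * Hypothetical helper, not part of the series: table-driven version of
 * the switch in get_inst_type(). TYPE values 0 and 7 have no entry and
 * therefore fall back to PERF_HAZ__ITYPE_NA.
 */
static inline uint8_t sier_to_itype(uint64_t sier)
{
	static const uint8_t map[8] = {
		[1] = PERF_HAZ__ITYPE_LOAD,
		[2] = PERF_HAZ__ITYPE_STORE,
		[3] = PERF_HAZ__ITYPE_BRANCH,
		[4] = PERF_HAZ__ITYPE_FP,
		[5] = PERF_HAZ__ITYPE_FX,
		[6] = PERF_HAZ__ITYPE_CR_OR_SC,
	};

	return map[SIER_TYPE(sier)];
}
```

The TYPE field is masked to three bits, so the index is always within the
eight-entry table; whether a table or a switch reads better is a matter of
taste.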

Signed-off-by: Madhavan Srinivasan 
Signed-off-by: Ravi Bangoria 
---
 arch/powerpc/include/asm/perf_event_server.h |   2 +
 arch/powerpc/perf/core-book3s.c  |   4 +
 arch/powerpc/perf/isa207-common.c| 157 +++
 arch/powerpc/perf/isa207-common.h|  12 ++
 arch/powerpc/perf/power8-pmu.c   |   1 +
 arch/powerpc/perf/power9-pmu.c   |   1 +
 6 files changed, 177 insertions(+)

diff --git a/arch/powerpc/include/asm/perf_event_server.h 
b/arch/powerpc/include/asm/perf_event_server.h
index 3e9703f44c7c..9b8f90439ff2 100644
--- a/arch/powerpc/include/asm/perf_event_server.h
+++ b/arch/powerpc/include/asm/perf_event_server.h
@@ -37,6 +37,8 @@ struct power_pmu {
void(*get_mem_data_src)(union perf_mem_data_src *dsrc,
u32 flags, struct pt_regs *regs);
void(*get_mem_weight)(u64 *weight);
+   void(*get_phazard_data)(struct perf_pipeline_haz_data *phaz,
+   u32 flags, struct pt_regs *regs);
unsigned long   group_constraint_mask;
unsigned long   group_constraint_val;
u64 (*bhrb_filter_map)(u64 branch_sample_type);
diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 3086055bf681..fcbb4acc3a03 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -2096,6 +2096,10 @@ static void record_and_restart(struct perf_event *event, 
unsigned long val,
ppmu->get_mem_weight)
ppmu->get_mem_weight(&data.weight);
 
+   if (event->attr.sample_type & PERF_SAMPLE_PIPELINE_HAZ &&
+   ppmu->get_phazard_data)
+   ppmu->get_phazard_data(&data.pipeline_haz, ppmu->flags, regs);
+
if (perf_event_overflow(event, &data, regs))
power_pmu_stop(event, 0);
}
diff --git a/arch/powerpc/perf/isa207-common.c 
b/arch/powerpc/perf/isa207-common.c
index 07026bbd292b..03dafde7cace 100644
--- a/arch/powerpc/perf/isa207-common.c
+++ b/arch/powerpc/perf/isa207-common.c
@@ -239,6 +239,163 @@ void isa207_get_mem_weight(u64 *weight)
*weight = mantissa << (2 * exp);
 }
 
+static __u8 get_inst_type(u64 sier)
+{
+   switch (SIER_TYPE(sier)) {
+   case 1:
+   return PERF_HAZ__ITYPE_LOAD;
+   case 2:
+   return PERF_HAZ__ITYPE_STORE;
+   case 3:
+   return PERF_HAZ__ITYPE_BRANCH;
+   case 4:
+   return PERF_HAZ__ITYPE_FP;
+   case 5:
+   return PERF_HAZ__ITYPE_FX;
+   case 6:
+   return PERF_HAZ__ITYPE_CR_OR_SC;
+   }
+   return PERF_HAZ__ITYPE_NA;
+}
+
+static __u8 get_inst_cache(u64 sier)
+{
+   switch (SIER_ICACHE(sier)) {
+   case 1:
+   return PERF_HAZ__ICACHE_L1_HIT;
+   case 2:
+   return PERF_HAZ__ICACHE_L2_HIT;
+   case 3:
+   return PERF_HAZ__ICACHE_L3_HIT;
+   case 4:
+   return PERF_HAZ__ICACHE_L3_MISS;
+   }
+   return PERF_HAZ__ICACHE_NA;
+}
+
+static void get_hazard_data(u64 sier, struct perf_pipeline_haz_data *haz)
+{
+   if (SIER_MPRED(sier)) {
+   haz->hazard_stage = PERF_HAZ__PIPE_STAGE_BRU;
+
+   switch (SIER_MPRED_TYPE(sier)) {
+   case 1:
+   haz->hazard_reason = PERF_HAZ__HAZ_BRU_MPRED_DIR;
+   return;
+   case 2:
+   haz->hazard_reason = PERF_HAZ__HAZ_BRU_MPRED_TA;
+   return;
+   }
+   }
+
+   if (cpu_has_feature(CPU_FTR_ARCH_300) &&
+   (SIER_TYPE(sier) == 1 || SIER_TYPE(sier) == 2)) {
+   haz->hazard_stage = PERF_HAZ__PIPE_STAGE_LSU;
+   haz->hazard_reason = PERF_HAZ__HAZ_DERAT_MISS;
+   return;
+   }
+
+   if (cpu_has_feature(CPU_FTR_ARCH_207S) &&
+   (SIER_TYPE(sier) == 1 || SIER_TYPE(sier) == 2)) {
+   int derat_miss = SIER_DERAT_MISS(sier);
+
+   haz->hazard_stage = PERF_HAZ__PIPE_STAGE_LSU;
+
+   switch (p8_SIER_REJ_LSU_REASON(sier)) {
+   case 0:
+   haz->hazard_reason = PERF_HAZ__HAZ_LSU_ERAT_MISS;
+   return;
+   case 1:
+   haz->hazard_reason = (derat_miss) ?
+PERF_HAZ__HAZ_LSU_LMQ_DERAT_MISS :
+PERF_HAZ__HAZ_LSU_LMQ;
+   return;
+   case 2:
+   haz->hazard_reason = (derat_miss) ?
+

[RFC 02/11] perf/core: Data structure to present hazard data

2020-03-01 Thread Ravi Bangoria
From: Madhavan Srinivasan 

Introduce a new perf sample_type, PERF_SAMPLE_PIPELINE_HAZ, to request that
the kernel provide CPU pipeline hazard data. Also introduce an
arch-independent structure, 'perf_pipeline_haz_data', to pass hazard data to
userspace. This is a generic structure; arch-specific data needs to be
converted to this format.
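
To make the size accounting concrete, here is a small userspace sketch of the
proposed record. The struct mirrors the one in this patch using <stdint.h>
types; the sample_type bit value is the one this RFC proposes and is not
guaranteed to match any released uapi header. haz_sample_bytes() is a
hypothetical helper that only illustrates what __perf_event_header_size()
adds when the bit is set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors struct perf_pipeline_haz_data from this RFC. */
struct perf_pipeline_haz_data {
	uint8_t  itype;
	uint8_t  icache;
	uint8_t  hazard_stage;
	uint8_t  hazard_reason;
	uint8_t  stall_stage;
	uint8_t  stall_reason;
	uint16_t pad;
};

/* Bit value as proposed by the RFC; released kernels may differ. */
#define PERF_SAMPLE_PIPELINE_HAZ	(1U << 21)

/* Six u8 fields plus a u16 pad keep every sample a multiple of 8 bytes. */
static_assert(sizeof(struct perf_pipeline_haz_data) == 8,
	      "hazard record must be exactly 8 bytes");
static_assert(offsetof(struct perf_pipeline_haz_data, pad) == 6,
	      "pad must follow the six u8 fields");

/*
 * Hypothetical helper: number of extra bytes a sample carries when the
 * hazard bit is requested in sample_type.
 */
static inline size_t haz_sample_bytes(uint64_t sample_type)
{
	return (sample_type & PERF_SAMPLE_PIPELINE_HAZ) ?
		sizeof(struct perf_pipeline_haz_data) : 0;
}
```

The explicit pad is what lets perf_output_put() copy the structure as a
single 8-byte unit, in line with the other u64-sized sample fields.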

Signed-off-by: Madhavan Srinivasan 
Signed-off-by: Ravi Bangoria 
---
 include/linux/perf_event.h|  7 ++
 include/uapi/linux/perf_event.h   | 32 ++-
 kernel/events/core.c  |  6 +
 tools/include/uapi/linux/perf_event.h | 32 ++-
 4 files changed, 75 insertions(+), 2 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 547773f5894e..d5b606e3c57d 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1001,6 +1001,7 @@ struct perf_sample_data {
u64 stack_user_size;
 
u64 phys_addr;
+   struct perf_pipeline_haz_data   pipeline_haz;
 } cacheline_aligned;
 
 /* default value for data source */
@@ -1021,6 +1022,12 @@ static inline void perf_sample_data_init(struct 
perf_sample_data *data,
data->weight = 0;
data->data_src.val = PERF_MEM_NA;
data->txn = 0;
+   data->pipeline_haz.itype = PERF_HAZ__ITYPE_NA;
+   data->pipeline_haz.icache = PERF_HAZ__ICACHE_NA;
+   data->pipeline_haz.hazard_stage = PERF_HAZ__PIPE_STAGE_NA;
+   data->pipeline_haz.hazard_reason = PERF_HAZ__HREASON_NA;
+   data->pipeline_haz.stall_stage = PERF_HAZ__PIPE_STAGE_NA;
+   data->pipeline_haz.stall_reason = PERF_HAZ__SREASON_NA;
 }
 
 extern void perf_output_sample(struct perf_output_handle *handle,
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 377d794d3105..ff252618ca93 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -142,8 +142,9 @@ enum perf_event_sample_format {
PERF_SAMPLE_REGS_INTR   = 1U << 18,
PERF_SAMPLE_PHYS_ADDR   = 1U << 19,
PERF_SAMPLE_AUX = 1U << 20,
+   PERF_SAMPLE_PIPELINE_HAZ= 1U << 21,
 
-   PERF_SAMPLE_MAX = 1U << 21, /* non-ABI */
+   PERF_SAMPLE_MAX = 1U << 22, /* non-ABI */
 
__PERF_SAMPLE_CALLCHAIN_EARLY   = 1ULL << 63, /* non-ABI; 
internal use */
 };
@@ -870,6 +871,13 @@ enum perf_event_type {
 *  { u64   phys_addr;} && PERF_SAMPLE_PHYS_ADDR
 *  { u64   size;
 *char  data[size]; } && PERF_SAMPLE_AUX
+*  { u8    itype;
+*    u8    icache;
+*    u8    hazard_stage;
+*    u8    hazard_reason;
+*    u8    stall_stage;
+*    u8    stall_reason;
+*    u16   pad; } && PERF_SAMPLE_PIPELINE_HAZ
 * };
 */
PERF_RECORD_SAMPLE  = 9,
@@ -1185,4 +1193,26 @@ struct perf_branch_entry {
reserved:40;
 };
 
+struct perf_pipeline_haz_data {
+   /* Instruction/Opcode type: Load, Store, Branch  */
+   __u8    itype;
+   /* Instruction Cache source */
+   __u8    icache;
+   /* Instruction suffered hazard in pipeline stage */
+   __u8    hazard_stage;
+   /* Hazard reason */
+   __u8    hazard_reason;
+   /* Instruction suffered stall in pipeline stage */
+   __u8    stall_stage;
+   /* Stall reason */
+   __u8    stall_reason;
+   __u16   pad;
+};
+
+#define PERF_HAZ__ITYPE_NA         0x0
+#define PERF_HAZ__ICACHE_NA        0x0
+#define PERF_HAZ__PIPE_STAGE_NA    0x0
+#define PERF_HAZ__HREASON_NA       0x0
+#define PERF_HAZ__SREASON_NA       0x0
+
 #endif /* _UAPI_LINUX_PERF_EVENT_H */
diff --git a/kernel/events/core.c b/kernel/events/core.c
index e453589da97c..d00037c77ccf 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -1754,6 +1754,9 @@ static void __perf_event_header_size(struct perf_event 
*event, u64 sample_type)
if (sample_type & PERF_SAMPLE_PHYS_ADDR)
size += sizeof(data->phys_addr);
 
+   if (sample_type & PERF_SAMPLE_PIPELINE_HAZ)
+   size += sizeof(data->pipeline_haz);
+
event->header_size = size;
 }
 
@@ -6712,6 +6715,9 @@ void perf_output_sample(struct perf_output_handle *handle,
perf_aux_sample_output(event, handle, data);
}
 
+   if (sample_type & PERF_SAMPLE_PIPELINE_HAZ)
+   perf_output_put(handle, data->pipeline_haz);
+
if (!event->attr.watermark) {
int wakeup_events = event->attr.wakeup_events;
 
diff --git a/tools/include/uapi/linux/perf_event.h 
b/tools/include/uapi/linux/perf_event.h
index 

[RFC 01/11] powerpc/perf: Simplify ISA207_SIER macros

2020-03-01 Thread Ravi Bangoria
Instead of having separate MASK and SHIFT macros and using them to
derive the bits, use one simple macro per field. Also remove the
ISA207_ prefix, because some of the SIER bits extracted with these
macros are not defined in the ISA, for example the DATA_SRC bits.
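
For reference, a standalone sketch that checks the old and new forms against
each other. The macros are copied from this patch; sier_type_old() and
sier_type_new() are hypothetical wrappers added only for the comparison.

```c
#include <assert.h>
#include <stdint.h>

/* Old style, as removed by this patch: one SHIFT and one MASK per field. */
#define ISA207_SIER_TYPE_SHIFT	15
#define ISA207_SIER_TYPE_MASK	(0x7ull << ISA207_SIER_TYPE_SHIFT)

/*
 * New style, as added by this patch. Fields are named by their last bit
 * in IBM bit numbering (bit 0 is the most significant bit of the 64-bit
 * register), so "63 - n" turns that bit number into a right shift.
 */
#define SIER_DATA_SRC(sier)	(((sier) >> (63 - 10)) & 0x7ull)
#define SIER_TYPE(sier)		(((sier) >> (63 - 48)) & 0x7ull)
#define SIER_LDST(sier)		(((sier) >> (63 - 62)) & 0x7ull)

/* The new shift amounts must equal the old SHIFT constants (53, 15, 1). */
static_assert(63 - 10 == 53, "DATA_SRC shift changed");
static_assert(63 - 48 == 15, "TYPE shift changed");
static_assert(63 - 62 == 1, "LDST shift changed");

/* Hypothetical wrapper: extract TYPE the old way. */
static inline uint64_t sier_type_old(uint64_t sier)
{
	return (sier & ISA207_SIER_TYPE_MASK) >> ISA207_SIER_TYPE_SHIFT;
}

/* Hypothetical wrapper: extract TYPE the new way. */
static inline uint64_t sier_type_new(uint64_t sier)
{
	return SIER_TYPE(sier);
}
```

Both wrappers return the same value for any input, since shifting then
masking with 0x7 is equivalent to masking with 0x7 << shift and then
shifting.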

Signed-off-by: Ravi Bangoria 
---
 arch/powerpc/perf/isa207-common.c |  8 
 arch/powerpc/perf/isa207-common.h | 11 +++
 2 files changed, 7 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/perf/isa207-common.c 
b/arch/powerpc/perf/isa207-common.c
index 4c86da5eb28a..07026bbd292b 100644
--- a/arch/powerpc/perf/isa207-common.c
+++ b/arch/powerpc/perf/isa207-common.c
@@ -215,10 +215,10 @@ void isa207_get_mem_data_src(union perf_mem_data_src 
*dsrc, u32 flags,
}
 
sier = mfspr(SPRN_SIER);
-   val = (sier & ISA207_SIER_TYPE_MASK) >> ISA207_SIER_TYPE_SHIFT;
+   val = SIER_TYPE(sier);
if (val == 1 || val == 2) {
-   idx = (sier & ISA207_SIER_LDST_MASK) >> ISA207_SIER_LDST_SHIFT;
-   sub_idx = (sier & ISA207_SIER_DATA_SRC_MASK) >> 
ISA207_SIER_DATA_SRC_SHIFT;
+   idx = SIER_LDST(sier);
+   sub_idx = SIER_DATA_SRC(sier);
 
dsrc->val = isa207_find_source(idx, sub_idx);
dsrc->val |= (val == 1) ? P(OP, LOAD) : P(OP, STORE);
@@ -231,7 +231,7 @@ void isa207_get_mem_weight(u64 *weight)
u64 exp = MMCRA_THR_CTR_EXP(mmcra);
u64 mantissa = MMCRA_THR_CTR_MANT(mmcra);
u64 sier = mfspr(SPRN_SIER);
-   u64 val = (sier & ISA207_SIER_TYPE_MASK) >> ISA207_SIER_TYPE_SHIFT;
+   u64 val = SIER_TYPE(sier);
 
if (val == 0 || val == 7)
*weight = 0;
diff --git a/arch/powerpc/perf/isa207-common.h 
b/arch/powerpc/perf/isa207-common.h
index 63fd4f3f6013..7027eb9f3e40 100644
--- a/arch/powerpc/perf/isa207-common.h
+++ b/arch/powerpc/perf/isa207-common.h
@@ -202,14 +202,9 @@
 #define MAX_ALT            2
 #define MAX_PMU_COUNTERS   6
 
-#define ISA207_SIER_TYPE_SHIFT 15
-#define ISA207_SIER_TYPE_MASK  (0x7ull << ISA207_SIER_TYPE_SHIFT)
-
-#define ISA207_SIER_LDST_SHIFT 1
-#define ISA207_SIER_LDST_MASK  (0x7ull << ISA207_SIER_LDST_SHIFT)
-
-#define ISA207_SIER_DATA_SRC_SHIFT 53
-#define ISA207_SIER_DATA_SRC_MASK  (0x7ull << ISA207_SIER_DATA_SRC_SHIFT)
+#define SIER_DATA_SRC(sier)    (((sier) >> (63 - 10)) & 0x7ull)
+#define SIER_TYPE(sier)        (((sier) >> (63 - 48)) & 0x7ull)
+#define SIER_LDST(sier)        (((sier) >> (63 - 62)) & 0x7ull)
 
 #define P(a, b)                PERF_MEM_S(a, b)
 #define PH(a, b)               (P(LVL, HIT) | P(a, b))
-- 
2.21.1



[RFC 00/11] perf: Enhancing perf to export processor hazard information

2020-03-01 Thread Ravi Bangoria
Most modern microprocessors employ complex instruction execution
pipelines such that many instructions can be 'in flight' at any
given point in time. Various factors affect this pipeline, and
hazards are primary among them. There are different types of
hazards: data hazards, structural hazards and control hazards.
A data hazard occurs when data dependencies exist between
instructions in different stages of the pipeline. A structural
hazard occurs when the same processor hardware is needed by more
than one in-flight instruction at the same time. Control hazards
are typically caused by branch mispredictions.

Information about these hazards is critical for analyzing
performance issues and for tuning software to overcome them.
Modern processors export such hazard data in Performance
Monitoring Unit (PMU) registers. For example, the 'Sampled
Instruction Event Register' on IBM PowerPC[1][2] and
'Instruction-Based Sampling' on AMD[3] provide similar information.

Implementation detail:

A new sample_type called PERF_SAMPLE_PIPELINE_HAZ is introduced.
If it's set, kernel converts arch specific hazard information
into generic format:

  struct perf_pipeline_haz_data {
         /* Instruction/Opcode type: Load, Store, Branch  */
         __u8    itype;
         /* Instruction Cache source */
         __u8    icache;
         /* Instruction suffered hazard in pipeline stage */
         __u8    hazard_stage;
         /* Hazard reason */
         __u8    hazard_reason;
         /* Instruction suffered stall in pipeline stage */
         __u8    stall_stage;
         /* Stall reason */
         __u8    stall_reason;
         __u16   pad;
  };

... which can be read by user from mmap() ring buffer. With this
approach, sample perf report in hazard mode looks like (On IBM
PowerPC):

  # ./perf record --hazard ./ebizzy
  # ./perf report --hazard
  Overhead  Symbol          Shared  Instruction Type  Hazard Stage  Hazard Reason         Stall Stage  Stall Reason  ICache access
    36.58%  [.] thread_run  ebizzy  Load              LSU           Mispredict            LSU          Load fin      L1 hit
     9.46%  [.] thread_run  ebizzy  Load              LSU           Mispredict            LSU          Dcache_miss   L1 hit
     1.76%  [.] thread_run  ebizzy  Fixed point       -             -                     -            -             L1 hit
     1.31%  [.] thread_run  ebizzy  Load              LSU           ERAT Miss             LSU          Load fin      L1 hit
     1.27%  [.] thread_run  ebizzy  Load              LSU           Mispredict            -            -             L1 hit
     1.16%  [.] thread_run  ebizzy  Fixed point       -             -                     FXU          Fixed cycle   L1 hit
     0.50%  [.] thread_run  ebizzy  Fixed point       ISU           Source Unavailable    FXU          Fixed cycle   L1 hit
     0.30%  [.] thread_run  ebizzy  Load              LSU           LMQ Full, DERAT Miss  LSU          Load fin      L1 hit
     0.24%  [.] thread_run  ebizzy  Load              LSU           ERAT Miss             -            -             L1 hit
     0.08%  [.] thread_run  ebizzy  -                 -             -                     BRU          Fixed cycle   L1 hit
     0.05%  [.] thread_run  ebizzy  Branch            -             -                     BRU          Fixed cycle   L1 hit
     0.04%  [.] thread_run  ebizzy  Fixed point       ISU           Source Unavailable    -            -             L1 hit
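
The stage columns in the report above come straight from the 8-byte hazard
record. A minimal decoding sketch, assuming the field order of struct
perf_pipeline_haz_data and the stage numbering of the powerpc enum
perf_pipeline_stage in this series: all helper names are hypothetical, and
the strings for stages that do not appear in the sample output (IFU, IDU,
FPU, VSU, Other) are guesses rather than what the perf tool prints.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Byte offsets inside the 8-byte record, following the RFC's field order. */
enum {
	HAZ_OFF_ITYPE		= 0,
	HAZ_OFF_ICACHE		= 1,
	HAZ_OFF_HAZARD_STAGE	= 2,
	HAZ_OFF_HAZARD_REASON	= 3,
	HAZ_OFF_STALL_STAGE	= 4,
	HAZ_OFF_STALL_REASON	= 5,
	HAZ_RECORD_SIZE		= 8,	/* six u8 fields plus a u16 pad */
};

/*
 * Hypothetical helper: map a pipeline stage value to a display string.
 * Index 0 is PERF_HAZ__PIPE_STAGE_NA, printed as "-" in the report.
 */
static inline const char *haz_stage_name(uint8_t stage)
{
	static const char * const names[] = {
		"-", "IFU", "IDU", "ISU", "LSU", "BRU",
		"FXU", "FPU", "VSU", "Other",
	};

	if (stage >= sizeof(names) / sizeof(names[0]))
		return "?";	/* value unknown to this sketch */
	return names[stage];
}

/* Hypothetical accessors for the two stage fields of a raw record. */
static inline uint8_t haz_hazard_stage(const uint8_t rec[HAZ_RECORD_SIZE])
{
	return rec[HAZ_OFF_HAZARD_STAGE];
}

static inline uint8_t haz_stall_stage(const uint8_t rec[HAZ_RECORD_SIZE])
{
	return rec[HAZ_OFF_STALL_STAGE];
}

/* Hypothetical filter: does this stage value carry the given name? */
static inline int haz_stage_is(uint8_t stage, const char *name)
{
	return strcmp(haz_stage_name(stage), name) == 0;
}
```

Because every meaningful field is a single byte, the record can be read by
offset without any byte-order handling; only the pad is wider than a byte,
and it carries no data.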

Also perf annotate with hazard data:

         │    Disassembly of section .text:
         │
         │    10001cf8 <compare>:
         │    compare():
         │    return NULL;
         │    }
         │
         │    static int
         │    compare(const void *p1, const void *p2)
         │    {
   33.23 │      std    r31,-8(r1)
         │       {haz_stage: LSU, haz_reason: ERAT Miss, stall_stage: LSU, stall_reason: Store, icache: L1 hit}
         │       {haz_stage: LSU, haz_reason: ERAT Miss, stall_stage: LSU, stall_reason: Store, icache: L1 hit}
         │       {haz_stage: LSU, haz_reason: Load Hit Store, stall_stage: LSU, stall_reason: -, icache: L3 hit}
         │       {haz_stage: LSU, haz_reason: ERAT Miss, stall_stage: -, stall_reason: -, icache: L1 hit}
         │       {haz_stage: LSU, haz_reason: ERAT Miss, stall_stage: LSU, stall_reason: Store, icache: L1 hit}
         │       {haz_stage: LSU, haz_reason: ERAT Miss, stall_stage: LSU, stall_reason: Store, icache: L1 hit}
    0.84 │      stdu   r1,-64(r1)
         │       {haz_stage: LSU, haz_reason: ERAT Miss, stall_stage: -, stall_reason: -, icache: L1 hit}
    0.24 │      mr     r31,r1
         │       {haz_stage: -, haz_reason: -, stall_stage: -, stall_reason: -, icache: L1 hit}
   21.18 │      std    r3,32(r31)
         │       {haz_stage: LSU, haz_reason: ERAT Miss, stall_stage: LSU, stall_reason: Store, icache: L1 hit}
         │       {haz_stage: LSU, haz_reason: ERAT Miss, 

[RFC PATCH] powerpc/64s: CONFIG_PPC_HASH_MMU

2020-03-01 Thread Nicholas Piggin
This allows the 64s hash MMU code to be compiled out if radix is
selected. This saves about 128kB kernel image size (90kB text) on
powernv_defconfig minus KVM, 40kB on a tiny config.
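
As a toy model of the two mechanisms the patch relies on (dropping
hash-only struct members, and turning a runtime variable into a
compile-time constant), here is a userspace sketch. Every name ending in
_sketch or starting with SKETCH_ is hypothetical, the page-size index value
is illustrative, and CONFIG_PPC_HASH_MMU is an ordinary preprocessor symbol
here rather than a Kconfig option.

```c
#include <assert.h>
#include <stddef.h>

/* Leave undefined to model a radix-only build; define to model hash. */
/* #define CONFIG_PPC_HASH_MMU 1 */

#define SKETCH_PAGE_4K	0	/* illustrative page-size index */

struct hash_mm_context;		/* opaque in this sketch */

/* Hypothetical cut-down context: the hash pointer exists only if built in. */
struct mm_context_sketch {
	unsigned long id;
#ifdef CONFIG_PPC_HASH_MMU
	struct hash_mm_context *hash_context;
#endif
	unsigned long vdso_base;
};

#ifdef CONFIG_PPC_HASH_MMU
/* With hash built in, the page size is chosen at boot time. */
static int mmu_virtual_psize_sketch = SKETCH_PAGE_4K;
#else
/* Radix only: the value is fixed, so the compiler can fold it away. */
#define mmu_virtual_psize_sketch SKETCH_PAGE_4K
static_assert(sizeof(struct mm_context_sketch) == 2 * sizeof(unsigned long),
	      "hash-only member should be compiled out");
#endif

/* Hypothetical helper: reports which variant was compiled. */
static inline int hash_mmu_built_in(void)
{
#ifdef CONFIG_PPC_HASH_MMU
	return 1;
#else
	return 0;
#endif
}

/* Reads the page size through whichever form is active. */
static inline int current_virtual_psize(void)
{
	return mmu_virtual_psize_sketch;
}
```

Callers of current_virtual_psize() do not change between the two builds,
which is the property that keeps the #ifdefs confined to the definitions.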

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/Kconfig  |  1 +
 arch/powerpc/include/asm/book3s/64/mmu.h  | 20 -
 arch/powerpc/include/asm/book3s/64/pgtable.h  |  6 +
 .../include/asm/book3s/64/tlbflush-hash.h | 10 +++--
 arch/powerpc/include/asm/book3s/64/tlbflush.h |  4 
 arch/powerpc/include/asm/book3s/pgtable.h |  4 
 arch/powerpc/include/asm/mmu.h| 14 ++--
 arch/powerpc/include/asm/mmu_context.h|  2 ++
 arch/powerpc/include/asm/paca.h   |  9 
 arch/powerpc/include/asm/sparsemem.h  |  2 +-
 arch/powerpc/kernel/asm-offsets.c |  4 
 arch/powerpc/kernel/dt_cpu_ftrs.c | 10 -
 arch/powerpc/kernel/entry_64.S|  4 ++--
 arch/powerpc/kernel/exceptions-64s.S  | 22 ---
 arch/powerpc/kernel/mce.c |  2 +-
 arch/powerpc/kernel/mce_power.c   | 10 ++---
 arch/powerpc/kernel/paca.c| 18 ++-
 arch/powerpc/kernel/process.c | 13 ++-
 arch/powerpc/kernel/prom.c|  2 ++
 arch/powerpc/kernel/setup_64.c|  4 
 arch/powerpc/kexec/core_64.c  |  4 ++--
 arch/powerpc/kvm/Kconfig  |  1 +
 arch/powerpc/mm/book3s64/Makefile | 17 --
 arch/powerpc/mm/book3s64/hash_pgtable.c   |  1 -
 arch/powerpc/mm/book3s64/hash_utils.c |  9 
 .../{hash_hugetlbpage.c => hugetlbpage.c} |  2 ++
 arch/powerpc/mm/book3s64/mmu_context.c| 18 +--
 arch/powerpc/mm/book3s64/pgtable.c| 22 +--
 arch/powerpc/mm/book3s64/radix_pgtable.c  |  5 +
 arch/powerpc/mm/book3s64/slb.c| 14 
 arch/powerpc/mm/copro_fault.c |  2 ++
 arch/powerpc/mm/fault.c   | 20 +
 arch/powerpc/platforms/Kconfig.cputype| 20 -
 arch/powerpc/platforms/powernv/idle.c |  2 ++
 arch/powerpc/platforms/powernv/setup.c|  2 ++
 arch/powerpc/xmon/xmon.c  |  8 +--
 36 files changed, 231 insertions(+), 77 deletions(-)
 rename arch/powerpc/mm/book3s64/{hash_hugetlbpage.c => hugetlbpage.c} (99%)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 497b7d0b2d7e..50c361b8b7fd 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -963,6 +963,7 @@ config PPC_MEM_KEYS
prompt "PowerPC Memory Protection Keys"
def_bool y
depends on PPC_BOOK3S_64
+   depends on PPC_HASH_MMU
select ARCH_USES_HIGH_VMA_FLAGS
select ARCH_HAS_PKEYS
help
diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h 
b/arch/powerpc/include/asm/book3s/64/mmu.h
index bb3deb76c951..fca69dd23e25 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu.h
@@ -107,7 +107,9 @@ typedef struct {
 * from EA and new context ids to build the new VAs.
 */
mm_context_id_t id;
+#ifdef CONFIG_PPC_HASH_MMU
mm_context_id_t extended_id[TASK_SIZE_USER64/TASK_CONTEXT_SIZE];
+#endif
};
 
/* Number of bits in the mm_cpumask */
@@ -116,7 +118,9 @@ typedef struct {
/* Number of users of the external (Nest) MMU */
atomic_t copros;
 
+#ifdef CONFIG_PPC_HASH_MMU
struct hash_mm_context *hash_context;
+#endif
 
unsigned long vdso_base;
/*
@@ -139,6 +143,7 @@ typedef struct {
 #endif
 } mm_context_t;
 
+#ifdef CONFIG_PPC_HASH_MMU
 static inline u16 mm_ctx_user_psize(mm_context_t *ctx)
 {
return ctx->hash_context->user_psize;
@@ -193,11 +198,22 @@ static inline struct subpage_prot_table 
*mm_ctx_subpage_prot(mm_context_t *ctx)
 }
 #endif
 
+#endif
+
 /*
  * The current system page and segment sizes
  */
-extern int mmu_linear_psize;
+#if defined(CONFIG_PPC_RADIX_MMU) && !defined(CONFIG_PPC_HASH_MMU)
+#ifdef CONFIG_PPC_64K_PAGES
+#define mmu_virtual_psize MMU_PAGE_64K
+#else
+#define mmu_virtual_psize MMU_PAGE_4K
+#endif
+#else
 extern int mmu_virtual_psize;
+#endif
+
+extern int mmu_linear_psize;
 extern int mmu_vmalloc_psize;
 extern int mmu_vmemmap_psize;
 extern int mmu_io_psize;
@@ -243,6 +259,7 @@ extern void radix_init_pseries(void);
 static inline void radix_init_pseries(void) { };
 #endif
 
+#ifdef CONFIG_PPC_HASH_MMU
 static inline int get_user_context(mm_context_t *ctx, unsigned long ea)
 {
int index = ea >> MAX_EA_BITS_PER_CONTEXT;
@@ -262,6 +279,7 @@ static inline unsigned long get_user_vsid(mm_context_t *ctx,
 
return get_vsid(context, ea, ssize);
 }
+#endif
 
 #endif /* __ASSEMBLY__ */
 #endif /* 

Re: [PATCH] powerpc/Kconfig: Make FSL_85XX_CACHE_SRAM configurable

2020-03-01 Thread 王文虎
From: Scott Wood 
Date: 2020-03-01 07:12:58
To: "王文虎" 
Cc: wangwenhu, Kumar Gala, Benjamin Herrenschmidt, Paul Mackerras,
 Michael Ellerman, linuxppc-dev@lists.ozlabs.org,
 linux-ker...@vger.kernel.org, triv...@kernel.org, Rai Harninder
Subject: Re: Re: [PATCH] powerpc/Kconfig: Make FSL_85XX_CACHE_SRAM configurable

>On Tue, 2020-01-21 at 14:38 +0800, 王文虎 wrote:
>> From: Scott Wood 
>> Date: 2020-01-21 13:49:59
>> To: "王文虎" 
>> Cc: wangwenhu, Kumar Gala, Benjamin Herrenschmidt, Paul Mackerras <
>> pau...@samba.org>, Michael Ellerman,
>> linuxppc-dev@lists.ozlabs.org, linux-ker...@vger.kernel.org,
>> triv...@kernel.org, Rai Harninder
>> Subject: Re: [PATCH] powerpc/Kconfig: Make FSL_85XX_CACHE_SRAM configurable
>> >On Tue, 2020-01-21 at 13:20 +0800, 王文虎 wrote:
>> > > From: Scott Wood 
>> > > Date: 2020-01-21 11:25:25
>> > > To:  wangwenhu ,Kumar Gala <
>> > > ga...@kernel.crashing.org>,
>> > > Benjamin Herrenschmidt ,Paul Mackerras <
>> > > pau...@samba.org>,Michael Ellerman ,
>> > > linuxppc-dev@lists.ozlabs.org,linux-ker...@vger.kernel.org
>> > > Cc:  triv...@kernel.org,wenhu.w...@vivo.com,Rai Harninder <
>> > > harninder@nxp.com>
>> > > Subject: Re: [PATCH] powerpc/Kconfig: Make FSL_85XX_CACHE_SRAM
>> > > configurable
>> > > >On Mon, 2020-01-20 at 06:43 -0800, wangwenhu wrote:
>> > > > > From: wangwenhu 
>> > > > > 
>> > > > > When generating .config file with menuconfig on Freescale BOOKE
>> > > > > SOC, FSL_85XX_CACHE_SRAM is not configurable for the lack of
>> > > > > description in the Kconfig field, which makes it impossible
>> > > > > to support L2Cache-Sram driver. Add a description to make it
>> > > > > configurable.
>> > > > > 
>> > > > > Signed-off-by: wangwenhu 
>> > > > 
>> > > > The intent was that drivers using the SRAM API would select the
>> > > > symbol.  What
>> > > > is the use case for selecting it manually?
>> > > > 
>> > > 
>> > > With a repository of multiple products(meaning different defconfigs) and
>> > > multiple
>> > > developers, the Kconfigs of the Kernel Source Tree change frequently. So
>> > > the
>> > > "make menuconfig"
>> > > process is needed for defconfigs' re-generating or updating for the
>> > > complexity of dependencies
>> > > between different features defined in the Kconfigs.
>> > 
>> > That doesn't answer my question of how the SRAM code would be useful other
>> > than to some other driver that uses the API (which would use
>> > "select").  There
>> > is no userspace API.  You could use the kernel command line to configure
>> > the
>> > SRAM but you need to get the address of it for it to be useful.
>> > 
>> 
>> Like you've asked below, via /dev/mem or by calling directly within the kernel.
>> They are not submitted yet; they are still under development.
>
>If they are calling within the kernel, then whatever driver that is should
>select FSL_85XX_CACHE_SRAM.  Directly accessing /dev/mem without any way for
>the kernel to advertise where it is or which parts of SRAM are available for
>use sounds like a bad idea.

Yes, definitely. But when we enable the module that should select
FSL_85XX_CACHE_SRAM and build vmlinux, FSL_85XX_CACHE_SRAM cannot be
selected because of the Kconfig definition problem that I am trying to
fix now. So would you please merge the patch, for the convenience of
later work that depends on the driver?

Wenhu

>-Scott
>
>




Re: [PATCH v3 0/6] implement KASLR for powerpc/fsl_booke/64

2020-03-01 Thread Scott Wood
On Mon, 2020-03-02 at 10:17 +0800, Jason Yan wrote:
> 
> On 2020/3/1 6:54, Scott Wood wrote:
> > On Sat, 2020-02-29 at 15:27 +0800, Jason Yan wrote:
> > > 
> > > Turning to %p may not be a good idea in this situation. So
> > > for the REG logs printed when dumping stack, we can disable it when
> > > KASLR is enabled. For the REG logs in other places like show_regs(), only
> > > privileged users can trigger it, and they are not combined with a symbol, so
> > > I think it's ok to keep them.
> > > 
> > > diff --git a/arch/powerpc/kernel/process.c
> > > b/arch/powerpc/kernel/process.c
> > > index fad50db9dcf2..659c51f0739a 100644
> > > --- a/arch/powerpc/kernel/process.c
> > > +++ b/arch/powerpc/kernel/process.c
> > > @@ -2068,7 +2068,10 @@ void show_stack(struct task_struct *tsk, unsigned
> > > long *stack)
> > >   newsp = stack[0];
> > >   ip = stack[STACK_FRAME_LR_SAVE];
> > >   if (!firstframe || ip != lr) {
> > > -   printk("["REG"] ["REG"] %pS", sp, ip, (void
> > > *)ip);
> > > +   if (IS_ENABLED(CONFIG_RANDOMIZE_BASE))
> > > +   printk("%pS", (void *)ip);
> > > +   else
> > > +   printk("["REG"] ["REG"] %pS", sp, ip,
> > > (void *)ip);
> > 
> > This doesn't deal with "nokaslr" on the kernel command line.  It also
> > doesn't
> > seem like something that every callsite should have to opencode, versus
> > having
> > an appropriate format specifier that behaves as I described above (and I still
> > don't see why that format specifier should not be "%p").
> > 
> 
> Actually I still do not understand why we should print the raw value 
> here. When KALLSYMS is enabled we have symbol name  and  offset like 
> put_cred_rcu+0x108/0x110, and when KALLSYMS is disabled we have the raw 
> address.

I'm more concerned about the stack address for wading through a raw stack dump
(to find function call arguments, etc).  The return address does help confirm
that I'm on the right stack frame though, and also makes looking up a line
number slightly easier than having to look up a symbol address and then add
the offset (at least for non-module addresses).

As a random aside, the mismatch between Linux printing a hex offset and GDB
using decimal in disassembly is annoying...
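
For what it's worth, the hex-to-decimal step is easy to script. A minimal
sketch, assuming the usual "symbol+0xOFF/0xSIZE" shape of %pS output;
parse_sym_offset() is a hypothetical helper, not anything that exists in the
kernel or in GDB:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: parse the "+0xOFF/0xSIZE" suffix of a string such
 * as "put_cred_rcu+0x108/0x110". On success returns 0 and stores the two
 * values; returns -1 if the suffix is missing or malformed.
 */
static int parse_sym_offset(const char *s, unsigned long *off,
			    unsigned long *size)
{
	const char *plus;
	const char *p;
	char *end;

	assert(s && off && size);

	plus = strrchr(s, '+');
	if (!plus)
		return -1;

	/* Base 16 makes strtoul() accept the optional "0x" prefix. */
	*off = strtoul(plus + 1, &end, 16);
	if (end == plus + 1 || *end != '/')
		return -1;

	p = end + 1;
	*size = strtoul(p, &end, 16);
	if (end == p)
		return -1;

	return 0;
}
```

Printing the parsed offset with "%lu" then gives the decimal form that GDB
shows in its disassembly listing.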

-Scott




Re: [PATCH] mm/debug: Add tests validating arch page table helpers for core features

2020-03-01 Thread Anshuman Khandual
On 02/27/2020 04:59 PM, Christophe Leroy wrote:
> 
> 
> On 27/02/2020 at 11:33, Anshuman Khandual wrote:
>> This adds new tests validating arch page table helpers for these following
>> core memory features. These tests create and test specific mapping types at
>> various page table levels.
>>
>> * SPECIAL mapping
>> * PROTNONE mapping
>> * DEVMAP mapping
>> * SOFTDIRTY mapping
>> * SWAP mapping
>> * MIGRATION mapping
>> * HUGETLB mapping
>> * THP mapping
>>
>> Cc: Andrew Morton 
>> Cc: Mike Rapoport 
>> Cc: Vineet Gupta 
>> Cc: Catalin Marinas 
>> Cc: Will Deacon 
>> Cc: Benjamin Herrenschmidt 
>> Cc: Paul Mackerras 
>> Cc: Michael Ellerman 
>> Cc: Heiko Carstens 
>> Cc: Vasily Gorbik 
>> Cc: Christian Borntraeger 
>> Cc: Thomas Gleixner 
>> Cc: Ingo Molnar 
>> Cc: Borislav Petkov 
>> Cc: "H. Peter Anvin" 
>> Cc: Kirill A. Shutemov 
>> Cc: Paul Walmsley 
>> Cc: Palmer Dabbelt 
>> Cc: linux-snps-...@lists.infradead.org
>> Cc: linux-arm-ker...@lists.infradead.org
>> Cc: linuxppc-dev@lists.ozlabs.org
>> Cc: linux-s...@vger.kernel.org
>> Cc: linux-ri...@lists.infradead.org
>> Cc: x...@kernel.org
>> Cc: linux-a...@vger.kernel.org
>> Cc: linux-ker...@vger.kernel.org
>> Suggested-by: Catalin Marinas 
>> Signed-off-by: Anshuman Khandual 
>> ---
>> Tested on arm64 and x86 platforms without any test failures. But this has
>> only been build tested on several other platforms. Individual tests need
>> to be verified on all platforms currently enabling the test, i.e. s390,
>> ppc32, arc, etc.
>>
>> This patch must be applied on v5.6-rc3 after these patches
>>
>> 1. https://patchwork.kernel.org/patch/11385057/
>> 2. https://patchwork.kernel.org/patch/11407715/
>>
>> OR
>>
>> This patch must be applied on linux-next (next-20200227) after this patch
>>
>> 2. https://patchwork.kernel.org/patch/11407715/
>>
>>   mm/debug_vm_pgtable.c | 310 +-
>>   1 file changed, 309 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
>> index 96dd7d574cef..3fb90d5b604e 100644
>> --- a/mm/debug_vm_pgtable.c
>> +++ b/mm/debug_vm_pgtable.c
>> @@ -41,6 +41,44 @@
>>    * wrprotect(entry)    = A write protected and not a write entry
>>    * pxx_bad(entry)    = A mapped and non-table entry
>>    * pxx_same(entry1, entry2)    = Both entries hold the exact same value
>> + *
>> + * Specific feature operations
>> + *
>> + * pte_mkspecial(entry)    = Creates a special entry at PTE level
>> + * pte_special(entry)    = Tests a special entry at PTE level
>> + *
>> + * pte_protnone(entry)    = Tests a no access entry at PTE level
>> + * pmd_protnone(entry)    = Tests a no access entry at PMD level
>> + *
>> + * pte_mkdevmap(entry)    = Creates a device entry at PTE level
>> + * pmd_mkdevmap(entry)    = Creates a device entry at PMD level
>> + * pud_mkdevmap(entry)    = Creates a device entry at PUD level
>> + * pte_devmap(entry)    = Tests a device entry at PTE level
>> + * pmd_devmap(entry)    = Tests a device entry at PMD level
>> + * pud_devmap(entry)    = Tests a device entry at PUD level
>> + *
>> + * pte_mksoft_dirty(entry)    = Creates a soft dirty entry at PTE level
>> + * pmd_mksoft_dirty(entry)    = Creates a soft dirty entry at PMD level
>> + * pte_swp_mksoft_dirty(entry)    = Creates a soft dirty swap entry at PTE level
>> + * pmd_swp_mksoft_dirty(entry)    = Creates a soft dirty swap entry at PMD level
>> + * pte_soft_dirty(entry)    = Tests a soft dirty entry at PTE level
>> + * pmd_soft_dirty(entry)    = Tests a soft dirty entry at PMD level
>> + * pte_swp_soft_dirty(entry)    = Tests a soft dirty swap entry at PTE level
>> + * pmd_swp_soft_dirty(entry)    = Tests a soft dirty swap entry at PMD level
>> + * pte_clear_soft_dirty(entry)   = Clears a soft dirty entry at PTE level
>> + * pmd_clear_soft_dirty(entry)   = Clears a soft dirty entry at PMD level
>> + * pte_swp_clear_soft_dirty(entry) = Clears a soft dirty swap entry at PTE level
>> + * pmd_swp_clear_soft_dirty(entry) = Clears a soft dirty swap entry at PMD level
>> + *
>> + * pte_mkhuge(entry)    = Creates a HugeTLB entry at given level
>> + * pte_huge(entry)    = Tests a HugeTLB entry at given level
>> + *
>> + * pmd_trans_huge(entry)    = Tests a trans huge page at PMD level
>> + * pud_trans_huge(entry)    = Tests a trans huge page at PUD level
>> + * pmd_present(entry)    = Tests an entry points to memory at PMD level
>> + * pud_present(entry)    = Tests an entry points to memory at PUD level
>> + * pmd_mknotpresent(entry)    = Invalidates a PMD entry for MMU
>> + * pud_mknotpresent(entry)    = Invalidates a PUD entry for MMU
>>    */
>>   #define VMFLAGS    (VM_READ|VM_WRITE|VM_EXEC)
>>   @@ -287,6 +325,233 @@ static void __init pmd_populate_tests(struct 
>> mm_struct *mm, pmd_t *pmdp,
>>   WARN_ON(pmd_bad(pmd));
>>   }
>>   +#ifdef CONFIG_ARCH_HAS_PTE_SPECIAL
> 
> Can we avoid ifdefs unless 

Re: [GIT PULL] Second batch of KVM changes for Linux 5.6-rc4 (or rc5)

2020-03-01 Thread Paolo Bonzini
On 01/03/20 22:33, Linus Torvalds wrote:
> On Sun, Mar 1, 2020 at 1:03 PM Paolo Bonzini  wrote:
>>
>> Paolo Bonzini (4):
>>   KVM: allow disabling -Werror
> 
> Honestly, this is just badly done.
> 
> You've basically made it enable -Werror only for very random
> configurations - and apparently the one you test.
> Doing things like COMPILE_TEST disables it, but so does not having
> EXPERT enabled.

Yes, I took this from the i915 Kconfig.  It's temporary, in 5.7 I am
planning to get it to just !KASAN, but for 5.6 I wanted to avoid more
breakage so I added the other restrictions.  The difference between
x86-64 and i386 is really just the frame size warnings, which Christoph
triggered because of a higher CONFIG_NR_CPUS.

(BTW, perhaps it makes sense for Sparse to have something like __nostack
for structs that contain potentially large arrays).

> I've merged this, but I wonder why you couldn't just do what I
> suggested originally?  Seriously, if you script your build tests,
> and don't even look at the results, then you might as well use
> 
>make KCFLAGS=-Werror

I did that and I'm also adding W=1; and I threw in a smaller than
default frame size warning option too because I don't want cpumasks on
the stack anyway.  However, that wouldn't help contributors.  I'm okay
if I get W=1 or frame size warnings from patches from other
contributors, but I think it's a disservice to them that they have to
set KCFLAGS in order to avoid warnings.

> the "now it causes problems for
> random compiler versions" is a real issue again - but at least it
> wouldn't be a random kernel subsystem that happens to trigger it, it
> would be a _generic_ issue, and we'd have everybody involved when a
> compiler change introduces a new warning.

Yes, and GCC prereleases are tested with Linux, for example by doing
full Rawhide rebuilds.  If we started using -Werror by default
(including allyesconfig), they would probably report warnings early.
Same for clang.

I hope that Linux can have -Werror everywhere, or at least a
CONFIG_WERROR option that does it even if it defaults to n for a release
or more.  But I don't think we can get there without first seeing what
issues pop up in a few subsystems or arches---even before considering
new compilers---so I decided I would just try.

Paolo

> Adding the powerpc people, since they have more history with their
> somewhat less hacky one. Except that one automatically gets disabled
> by "make allmodconfig" and friends, which is also kind of pointless.

> Michael, what tends to be the triggers for people using
> PPC_DISABLE_WERROR? Do you have reports for it? Could we have a
> _generic_ option that just gets enabled by default, except it gets
> disabled by _known_ issues (like KASAN).
> 
> Being disabled for "make allmodconfig" is kind of against one of the
> _points_ of "the build should be warning-free".



Re: [PATCH net-next 00/23] Clean driver, module and FW versions

2020-03-01 Thread David Miller
From: Leon Romanovsky 
Date: Sun,  1 Mar 2020 16:44:33 +0200

> This is the second batch of the series, which removes various static versions
> in favour of the globally defined Linux kernel version.

This generally looks fine to me but I'll let it sit for a few days so that
others can review.



Re: [PATCH v3 0/6] implement KASLR for powerpc/fsl_booke/64

2020-03-01 Thread Jason Yan




On 2020/3/1 6:54, Scott Wood wrote:

On Sat, 2020-02-29 at 15:27 +0800, Jason Yan wrote:


On 2020/2/29 12:28, Scott Wood wrote:

On Fri, 2020-02-28 at 14:47 +0800, Jason Yan wrote:


On 2020/2/28 13:53, Scott Wood wrote:


I don't see any debug setting for %pK (or %p) to always print the
actual
address (closest is kptr_restrict=1 but that only works in certain
contexts)... from looking at the code it seems it hashes even if kaslr
is
entirely disabled?  Or am I missing something?



Yes, %pK (or %p) always hashes whether kaslr is disabled or not. So if
we want the real value of the address, we cannot use it. But if you only
want to distinguish if two pointers are the same, it's ok.
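
A small userspace sketch of that property, assuming nothing about the real implementation beyond "a keyed hash of the pointer value". hash_ptr_demo() and demo_key are invented names, and the mixer is a generic 64-bit multiply/xor-shift routine used as a stand-in, not the hash the kernel uses.

```c
#include <stdint.h>

/* Stand-in for a per-boot random secret; fixed so the demo is repeatable. */
static const uint64_t demo_key = 0x9e3779b97f4a7c15ULL;

/*
 * Map a pointer to an opaque 32-bit ID with a keyed mixer.
 * Equal pointers always give equal IDs, and the printed ID is not the
 * address itself. A real implementation would need a cryptographic
 * keyed hash; this mixer is for illustration only.
 */
uint32_t hash_ptr_demo(const void *p)
{
    uint64_t x = (uint64_t)(uintptr_t)p ^ demo_key;

    x ^= x >> 33;
    x *= 0xff51afd7ed558ccdULL;
    x ^= x >> 33;
    x *= 0xc4ceb9fe1a85ec53ULL;
    x ^= x >> 33;
    return (uint32_t)x;
}
```

Two log lines that print the same ID refer to the same object, but the ID cannot be used as an address in a debugger, which is the limitation discussed in this thread.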


Am I the only one that finds this a bit crazy?  If you want to lock a
system
down then fine, but why wage war on debugging even when there's no
randomization going on?  Comparing two pointers for equality is not always
adequate.



AFAIK, %p hashing exists only because of the many legacy address printouts,
and to force those who really want the raw values to switch to %px or even %lx.
It is not opposed to debugging. Raw address printing is not
forbidden; people only need to weigh the risk of address leaks.


Yes, but I don't see any format specifier to switch to that will hash in a
randomized production environment, but not in a debug or other non-randomized
environment which seems like the ideal default for most debug output.



Sorry, I have no idea why no format specifier was considered that
switches between randomized and non-randomized environments. Maybe they think
that raw addresses should not leak in a non-randomized environment either.
Maybe Kees or Tobin can answer this question.


Kees? Tobin?



Switching to %p may not be a good idea in this situation. So
for the REG logs printed when dumping the stack, we can disable them when
KASLR is enabled. For the REG logs in other places like show_regs(), only
privileged users can trigger them, and they are not combined with a symbol, so
I think it's ok to keep them.

diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index fad50db9dcf2..659c51f0739a 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -2068,7 +2068,10 @@ void show_stack(struct task_struct *tsk, unsigned
long *stack)
  newsp = stack[0];
  ip = stack[STACK_FRAME_LR_SAVE];
  if (!firstframe || ip != lr) {
-   printk("["REG"] ["REG"] %pS", sp, ip, (void *)ip);
+   if (IS_ENABLED(CONFIG_RANDOMIZE_BASE))
+   printk("%pS", (void *)ip);
+   else
+   printk("["REG"] ["REG"] %pS", sp, ip,
(void *)ip);


This doesn't deal with "nokaslr" on the kernel command line.  It also doesn't
seem like something that every callsite should have to opencode, versus having
an appropriate format specifier that behaves as I described above (and I still
don't see why that format specifier should not be "%p").
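
As a rough userspace sketch of the difference between a compile-time check and the run-time behaviour asked for here: the formatter below takes a flag that models the actual boot state (KASLR built in and not switched off with "nokaslr"). The function name, the flag and the output layout are hypothetical and only mirror the printk() line in the diff above.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Format one stack-frame line. 'randomized' models the run-time state,
 * as opposed to a compile-time IS_ENABLED() test. Returns the number of
 * characters snprintf() would have written.
 */
int format_frame(char *buf, size_t len, bool randomized,
                 unsigned long sp, unsigned long ip, const char *sym)
{
    if (randomized)
        return snprintf(buf, len, "%s", sym);

    return snprintf(buf, len, "[%016lx] [%016lx] %s", sp, ip, sym);
}
```

With the decision made in one helper (or inside the format specifier itself), call sites such as show_stack() would not need to open-code the check.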



Actually I still do not understand why we should print the raw value 
here. When KALLSYMS is enabled we have symbol name  and  offset like 
put_cred_rcu+0x108/0x110, and when KALLSYMS is disabled we have the raw 
address.



-Scott



.





[PATCH] powerpc/64s/radix: Fix !SMP build

2020-03-01 Thread Nicholas Piggin
Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/mm/book3s64/radix_pgtable.c | 1 +
 arch/powerpc/mm/book3s64/radix_tlb.c | 7 ++-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c 
b/arch/powerpc/mm/book3s64/radix_pgtable.c
index dd1bea45325c..2a9a0cd79490 100644
--- a/arch/powerpc/mm/book3s64/radix_pgtable.c
+++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
@@ -26,6 +26,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
diff --git a/arch/powerpc/mm/book3s64/radix_tlb.c 
b/arch/powerpc/mm/book3s64/radix_tlb.c
index 03f43c924e00..758ade2c2b6e 100644
--- a/arch/powerpc/mm/book3s64/radix_tlb.c
+++ b/arch/powerpc/mm/book3s64/radix_tlb.c
@@ -587,6 +587,11 @@ void radix__local_flush_all_mm(struct mm_struct *mm)
preempt_enable();
 }
 EXPORT_SYMBOL(radix__local_flush_all_mm);
+
+static void __flush_all_mm(struct mm_struct *mm, bool fullmm)
+{
+   radix__local_flush_all_mm(mm);
+}
 #endif /* CONFIG_SMP */
 
 void radix__local_flush_tlb_page_psize(struct mm_struct *mm, unsigned long 
vmaddr,
@@ -777,7 +782,7 @@ void radix__flush_tlb_page(struct vm_area_struct *vma, 
unsigned long vmaddr)
 EXPORT_SYMBOL(radix__flush_tlb_page);
 
 #else /* CONFIG_SMP */
-#define radix__flush_all_mm radix__local_flush_all_mm
+static inline void exit_flush_lazy_tlbs(struct mm_struct *mm) { }
 #endif /* CONFIG_SMP */
 
 static void do_tlbiel_kernel(void *info)
-- 
2.23.0



Re: [Intel-gfx] [PATCH v7 00/12] Introduce CAP_PERFMON to secure system performance monitoring and observability

2020-03-01 Thread Serge Hallyn
Thanks, this looks good to me, in keeping with the CAP_SYSLOG break.

Acked-by: Serge E. Hallyn 

for the set.

James/Ingo/Peter, if no one has remaining objections, whose branch
should these go in through?

thanks,
-serge
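
For readers new to the thread, the access rule described in the quoted cover letter (CAP_PERFMON is sufficient, CAP_SYS_ADMIN stays valid for backward compatibility) can be modelled in userspace roughly as follows. The demo_* names are invented, the effective set is reduced to a plain bitmask, and the numeric value used for CAP_PERFMON is an assumption made for the demo.

```c
#include <stdbool.h>
#include <stdint.h>

/* Capability numbers; the CAP_PERFMON value is assumed for this demo. */
#define DEMO_CAP_SYS_ADMIN 21
#define DEMO_CAP_PERFMON   38

/* True if capability number 'cap' is set in the effective-set bitmask. */
bool demo_has_cap(uint64_t effective, int cap)
{
    return (effective >> cap) & 1u;
}

/*
 * Access check as the cover letter describes it: the narrow capability
 * is enough, and CAP_SYS_ADMIN keeps working for existing users.
 */
bool demo_perfmon_allowed(uint64_t effective)
{
    return demo_has_cap(effective, DEMO_CAP_PERFMON) ||
           demo_has_cap(effective, DEMO_CAP_SYS_ADMIN);
}
```

A process holding only the narrow capability passes the check without gaining the rest of what CAP_SYS_ADMIN permits, which is the least-privilege argument made in the cover letter.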

On Tue, Feb 25, 2020 at 12:55:54PM +0300, Alexey Budankov wrote:
> 
> Hi,
> 
> Is there anything else I could do in order to move the changes forward
> or is something still missing from this patch set?
> Could you please share you mind?
> 
> Thanks,
> Alexey
> 
> On 17.02.2020 11:02, Alexey Budankov wrote:
> > 
> > Currently access to perf_events, i915_perf and other performance
> > monitoring and observability subsystems of the kernel is open only for
> > a privileged process [1] with CAP_SYS_ADMIN capability enabled in the
> > process effective set [2].
> > 
> > This patch set introduces CAP_PERFMON capability designed to secure
> > system performance monitoring and observability operations so that
> > CAP_PERFMON would assist CAP_SYS_ADMIN capability in its governing role
> > for performance monitoring and observability subsystems of the kernel.
> > 
> > CAP_PERFMON intends to harden system security and integrity during
> > performance monitoring and observability operations by decreasing attack
> > surface that is available to a CAP_SYS_ADMIN privileged process [2].
> > Providing the access to performance monitoring and observability
> > operations under CAP_PERFMON capability singly, without the rest of
> > CAP_SYS_ADMIN credentials, excludes chances to misuse the credentials
> > and makes the operation more secure. Thus, CAP_PERFMON implements the
> > > principle of least privilege for performance monitoring and
> > observability operations (POSIX IEEE 1003.1e: 2.2.2.39 principle of
> > least privilege: A security design principle that states that a process
> > or program be granted only those privileges (e.g., capabilities)
> > necessary to accomplish its legitimate function, and only for the time
> > that such privileges are actually required)
> > 
> > CAP_PERFMON intends to meet the demand to secure system performance
> > monitoring and observability operations for adoption in security
> > sensitive, restricted, multiuser production environments (e.g. HPC
> > clusters, cloud and virtual compute environments), where root or
> > CAP_SYS_ADMIN credentials are not available to mass users of a system,
> > and securely unblock accessibility of system performance monitoring and
> > observability operations beyond root and CAP_SYS_ADMIN use cases.
> > 
> > CAP_PERFMON intends to take over CAP_SYS_ADMIN credentials related to
> > system performance monitoring and observability operations and balance
> > amount of CAP_SYS_ADMIN credentials following the recommendations in
> > the capabilities man page [2] for CAP_SYS_ADMIN: "Note: this capability
> > is overloaded; see Notes to kernel developers, below." For backward
> > compatibility reasons access to system performance monitoring and
> > observability subsystems of the kernel remains open for CAP_SYS_ADMIN
> > privileged processes but CAP_SYS_ADMIN capability usage for secure
> > system performance monitoring and observability operations is
> > discouraged with respect to the designed CAP_PERFMON capability.
> > 
> > Possible alternative solution to this system security hardening,
> > capabilities balancing task of making performance monitoring and
> > observability operations more secure and accessible could be to use
> > the existing CAP_SYS_PTRACE capability to govern system performance
> > monitoring and observability subsystems. However CAP_SYS_PTRACE
> > capability still provides users with more credentials than are
> > required for secure performance monitoring and observability
> > operations and this excess is avoided by the designed CAP_PERFMON.
> > 
> > Although software running under CAP_PERFMON can not ensure avoidance of
> > related hardware issues, the software can still mitigate those issues
> > following the official hardware issues mitigation procedure [3]. The
> > bugs in the software itself can be fixed following the standard kernel
> > development process [4] to maintain and harden security of system
> > performance monitoring and observability operations. Finally, the patch
> > set is shaped in the way that simplifies backtracking procedure of
> > possible induced issues [5] as much as possible.
> > 
> > The patch set is for tip perf/core repository:
> > git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip perf/core
> > sha1: fdb64822443ec9fb8c3a74b598a74790ae8d2e22
> > 
> > ---
> > Changes in v7:
> > - updated and extended kernel.rst and perf-security.rst documentation 
> >   files with the information about CAP_PERFMON capability and its use cases
> > - documented the case of double audit logging of CAP_PERFMON and 
> > CAP_SYS_ADMIN
> >   capabilities on a SELinux enabled system
> > Changes in v6:
> > - avoided noaudit checks in perfmon_capable() to explicitly advertise
> >   CAP_PERFMON usage thru audit logs 

RE: [PATCH v3 25/27] powerpc/powernv/pmem: Expose the serial number in sysfs

2020-03-01 Thread Alastair D'Silva
On Fri, 2020-02-28 at 08:15 +0100, Greg Kroah-Hartman wrote:
> On Fri, Feb 28, 2020 at 05:25:31PM +1100, Andrew Donnellan wrote:
> > On 21/2/20 2:27 pm, Alastair D'Silva wrote:
> > > +int ocxlpmem_sysfs_add(struct ocxlpmem *ocxlpmem)
> > > +{
> > > + int i, rc;
> > > +
> > > + for (i = 0; i < ARRAY_SIZE(attrs); i++) {
> > > > + rc = device_create_file(&ocxlpmem->dev, &attrs[i]);
> > > > + if (rc) {
> > > > + for (; --i >= 0;)
> > > > + device_remove_file(&ocxlpmem->dev,
> > > > &attrs[i]);
> > 
> > I'd rather avoid weird for loop constructs if possible.
> > 
> > Is it actually dangerous to call device_remove_file() on an attr
> > that hasn't
> > been added? If not then I'd rather define an err: label and loop
> > over the
> > whole array there.
> 
> None of this should be used at all, just use attribute groups
> properly
> and the driver core will handle this all for you.
> 
> device_create/remove_file should never be called by anyone anymore if
> at all
> possible.
> 
> thanks,
> 
> greg k-h


Thanks, I'll rework it to use the .groups member of struct pci_driver.
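
As a userspace model of why that is simpler (this is not the sysfs API; every demo_* name is hypothetical): with a group, the driver only declares a NULL-terminated table, and one shared routine creates the entries and rolls back on failure, so no driver needs its own create/remove loop.

```c
#include <stddef.h>

/* Hypothetical userspace stand-in; not the kernel's attribute type. */
struct demo_attr {
    const char *name;
    int created;          /* set while the attribute "exists" */
};

/* Pretend creation step; fails for a NULL or empty name. */
static int demo_create(struct demo_attr *a)
{
    if (!a->name || !a->name[0])
        return -1;
    a->created = 1;
    return 0;
}

/*
 * What a core "group" helper does on the driver's behalf: walk a
 * NULL-terminated table, create each entry, and undo everything already
 * created if one step fails.
 */
int demo_add_group(struct demo_attr **attrs)
{
    size_t i;

    for (i = 0; attrs[i]; i++) {
        if (demo_create(attrs[i])) {
            while (i > 0)
                attrs[--i]->created = 0;
            return -1;
        }
    }
    return 0;
}
```

In the kernel the table would be an attribute_group (for example one built with the ATTRIBUTE_GROUPS() macro) referenced from the driver structure, and the driver core would do the equivalent of demo_add_group().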

-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819



Re: [GIT PULL] Second batch of KVM changes for Linux 5.6-rc4 (or rc5)

2020-03-01 Thread Linus Torvalds
On Sun, Mar 1, 2020 at 1:03 PM Paolo Bonzini  wrote:
>
> Paolo Bonzini (4):
>   KVM: allow disabling -Werror

Honestly, this is just badly done.

You've basically made it enable -Werror only for very random
configurations - and apparently the one you test.

Doing things like COMPILE_TEST disables it, but so does not having
EXPERT enabled.

So it looks entirely ad-hoc and makes very little sense. At least the
"with KASAN, disable this" part makes sense, since that's a known
source or warnings. But everything else looks very random.

I've merged this, but I wonder why you couldn't just do what I
suggested originally?

Seriously, if you script your build tests, and don't even look at the
results, then you might as well use

   make KCFLAGS=-Werror

instead of having this kind of completely random option that has
almost no logic to it at all.

And if you depend entirely on random build infrastructure like the
0day bot etc, this likely _is_ going to break when it starts using a
new gcc version, or when it starts testing using clang, or whatever.
So then we end up with another odd random situation where now kvm (and
only kvm) will fail those builds just because they are automated.

Yes, as I said in that original thread, I'd love to do -Werror in
general, at which point it wouldn't be some random ad-hoc kvm special
case for some random option. But the "now it causes problems for
random compiler versions" is a real issue again - but at least it
wouldn't be a random kernel subsystem that happens to trigger it, it
would be a _generic_ issue, and we'd have everybody involved when a
compiler change introduces a new warning.

I've pulled this for now, but I really think it's a horrible hack, and
it's just done entirely wrong.

Adding the powerpc people, since they have more history with their
somewhat less hacky one. Except that one automatically gets disabled
by "make allmodconfig" and friends, which is also kind of pointless.

Michael, what tends to be the triggers for people using
PPC_DISABLE_WERROR? Do you have reports for it? Could we have a
_generic_ option that just gets enabled by default, except it gets
disabled by _known_ issues (like KASAN).

Being disabled for "make allmodconfig" is kind of against one of the
_points_ of "the build should be warning-free".

   Linus


[Bug 199471] [Bisected][Regression] windfarm_pm* no longer gets automatically loaded when CONFIG_I2C_POWERMAC=y is set

2020-03-01 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=199471

Erhard F. (erhar...@mailbox.org) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #275509|0                           |1
        is obsolete|                            |

--- Comment #14 from Erhard F. (erhar...@mailbox.org) ---
Created attachment 287749
  --> https://bugzilla.kernel.org/attachment.cgi?id=287749&action=edit
dmesg (kernel 4.17, PowerMac G5 11,2)

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 199471] [Bisected][Regression] windfarm_pm* no longer gets automatically loaded when CONFIG_I2C_POWERMAC=y is set

2020-03-01 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=199471

--- Comment #13 from Erhard F. (erhar...@mailbox.org) ---
Created attachment 287747
  --> https://bugzilla.kernel.org/attachment.cgi?id=287747&action=edit
kernel .config (kernel 4.17, PowerMac G5 11,2)


[Bug 199471] [Bisected][Regression] windfarm_pm* no longer gets automatically loaded when CONFIG_I2C_POWERMAC=y is set

2020-03-01 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=199471

Erhard F. (erhar...@mailbox.org) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #275507|0                           |1
        is obsolete|                            |

--- Comment #12 from Erhard F. (erhar...@mailbox.org) ---
Created attachment 287745
  --> https://bugzilla.kernel.org/attachment.cgi?id=287745&action=edit
dmesg (kernel 4.16.18, PowerMac G5 11,2)


[Bug 199471] [Bisected][Regression] windfarm_pm* no longer gets automatically loaded when CONFIG_I2C_POWERMAC=y is set

2020-03-01 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=199471

--- Comment #11 from Erhard F. (erhar...@mailbox.org) ---
(In reply to Wolfram Sang from comment #8)
> "This has been quite nice since 4.?.x up to 4.16.x as you only need
> CONFIG_I2C_POWERMAC=y which selects the proper windfarm_pmXX at boot time."
> 
> I can't find that in the code. Are you sure i2c-powermac requested that
> module?
I guess so 'cause if I build i2c_powermac as a module and manually modprobe it,
all the relevant windfarm modules get pulled in. But not before.

 # modprobe -v i2c_powermac
insmod
/lib/modules/4.16.18-PowerMacG5+/kernel/drivers/i2c/busses/i2c-powermac.ko 
 # dmesg | tail
[  150.181478]  11
[  150.182851]  0
[  150.184220]  0

[  150.626685] windfarm: Backside control loop started.
[  150.690132] windfarm: Slots control loop started.
[  150.794843] i2c i2c-0: master_xfer[0] W, addr=0x50, len=1
[  150.796467] i2c i2c-0: master_xfer[1] R, addr=0x50, len=8
[  150.801851] i2c i2c-0: NAK from device addr 0x50 msg #0
[  150.807758] windfarm: Drive bay control loop started.


[Bug 199471] windfarm_pm72 no longer gets automatically loaded when CONFIG_I2C_POWERMAC=y is set (regression)

2020-03-01 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=199471

--- Comment #10 from Erhard F. (erhar...@mailbox.org) ---
Created attachment 287743
  --> https://bugzilla.kernel.org/attachment.cgi?id=287743&action=edit
bisect.log

Finally checked on that bug again and bisected it. The offending commit is:

# git bisect bad | tee -a ~/bisect02.log 
af503716ac1444db61d80cb6d17cfe62929c21df is the first bad commit
commit af503716ac1444db61d80cb6d17cfe62929c21df
Author: Javier Martinez Canillas 
Date:   Sun Dec 3 22:40:50 2017 +0100

i2c: core: report OF style module alias for devices registered via OF

The buses should honor the firmware interface used to register the device,
but the I2C core reports a MODALIAS of the form i2c: even for I2C
devices registered via OF.

This means that user-space will never get an OF stype uevent MODALIAS even
when the drivers modules contain aliases exported from both the I2C and OF
device ID tables. For example, an Atmel maXTouch Touchscreen registered by
a DT node with compatible "atmel,maxtouch" has the following module alias:

$ cat /sys/class/i2c-adapter/i2c-8/8-004b/modalias
i2c:maxtouch

So udev won't be able to auto-load a module for an OF-only device driver.
Many OF-only drivers duplicate the OF device ID table entries in an I2C ID
table only has a workaround for how the I2C core reports the module alias.

This patch changes the I2C core to report an OF related MODALIAS uevent if
the device was registered via OF. So for the previous example, after this
patch, the reported MODALIAS for the Atmel maXTouch will be the following:

$ cat /sys/class/i2c-adapter/i2c-8/8-004b/modalias
of:NtrackpadTCatmel,maxtouch

NOTE: This patch may break out-of-tree drivers that were relying on this
  behavior, and only had an I2C device ID table even when the device
  was registered via OF. There are no remaining drivers in mainline
  that do this, but out-of-tree drivers have to be fixed and define
  a proper OF device ID table to have module auto-loading working.

Signed-off-by: Javier Martinez Canillas 
Tested-by: Dmitry Mastykin 
Signed-off-by: Wolfram Sang 

 drivers/i2c/i2c-core-base.c | 8 
 1 file changed, 8 insertions(+)
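
To make the two alias forms concrete, here is a minimal userspace sketch that builds both strings. The demo_* helpers are hypothetical, the "of:" layout is inferred from the example output quoted above, and the type field is left empty only to reproduce that example.

```c
#include <stdio.h>

/* Legacy I2C-style alias, e.g. "i2c:maxtouch". */
int demo_i2c_modalias(char *buf, size_t len, const char *id)
{
    return snprintf(buf, len, "i2c:%s", id);
}

/*
 * OF-style alias in the N<name>T<type>C<compatible> layout seen in the
 * example output. Only a single compatible string is handled here.
 */
int demo_of_modalias(char *buf, size_t len, const char *name,
                     const char *type, const char *compatible)
{
    return snprintf(buf, len, "of:N%sT%sC%s", name, type, compatible);
}
```

A module that only exports "i2c:" aliases has nothing that matches an "of:" uevent string, which would explain why windfarm_pm* stopped being auto-loaded after this commit.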


[Bug 199471] windfarm_pm72 no longer gets automatically loaded when CONFIG_I2C_POWERMAC=y is set (regression)

2020-03-01 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=199471

Erhard F. (erhar...@mailbox.org) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #275503|0                           |1
        is obsolete|                            |
 Attachment #275505|0                           |1
        is obsolete|                            |

--- Comment #9 from Erhard F. (erhar...@mailbox.org) ---
Created attachment 287741
  --> https://bugzilla.kernel.org/attachment.cgi?id=287741&action=edit
kernel .config (kernel 4.16, PowerMac G5 11,2)

With the attached kernel .config the G5 7,3 and the G5 11,2 automatically load
the suitable windfarm module on kernels <4.17. Starting with kernel 4.17,
windfarm core needs to be CONFIG_WINDFARM=y to automatically load the suitable
windfarm module; CONFIG_WINDFARM=m is no longer sufficient.

Needed for 4.16.x to automatically load the suitable windfarm module:
# grep -i wind .config
CONFIG_WINDFARM=m
CONFIG_WINDFARM_PM81=m
CONFIG_WINDFARM_PM72=m
CONFIG_WINDFARM_RM31=m
CONFIG_WINDFARM_PM91=m
CONFIG_WINDFARM_PM112=m
CONFIG_WINDFARM_PM121=m

Needed for >=4.17.x to automatically load the suitable windfarm module:
# grep -i wind .config
CONFIG_WINDFARM=y
CONFIG_WINDFARM_PM81=m
CONFIG_WINDFARM_PM72=m
CONFIG_WINDFARM_RM31=m
CONFIG_WINDFARM_PM91=m
CONFIG_WINDFARM_PM112=m
CONFIG_WINDFARM_PM121=m


[PATCH] tty: hvc: Use the correct style for SPDX License Identifier

2020-03-01 Thread Nishad Kamdar
This patch corrects the SPDX License Identifier style in the
header file related to the HVC driver.
For C header files, Documentation/process/license-rules.rst
mandates C-like comments (as opposed to C source files, where
C++ style should be used).

Changes made by using a script provided by Joe Perches here:
https://lkml.org/lkml/2019/2/7/46.

Suggested-by: Joe Perches 
Signed-off-by: Nishad Kamdar 
---
 drivers/tty/hvc/hvc_console.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/tty/hvc/hvc_console.h b/drivers/tty/hvc/hvc_console.h
index e9319954c832..18d005814e4b 100644
--- a/drivers/tty/hvc/hvc_console.h
+++ b/drivers/tty/hvc/hvc_console.h
@@ -1,4 +1,4 @@
-// SPDX-License-Identifier: GPL-2.0+
+/* SPDX-License-Identifier: GPL-2.0+ */
 /*
  * hvc_console.h
  * Copyright (C) 2005 IBM Corporation
-- 
2.17.1
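
The rule applied by this patch can be stated as a small userspace helper. This is only a sketch of the convention as the commit message describes it; the demo_* functions are hypothetical and do not reproduce the script that was actually used.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if 'path' ends in ".h", else 0. */
int demo_is_header(const char *path)
{
    size_t n = strlen(path);

    return n >= 2 && strcmp(path + n - 2, ".h") == 0;
}

/*
 * Write the SPDX line in the comment style the commit message describes:
 * C-style comments for headers, C++-style comments for C source files.
 */
int demo_spdx_line(char *buf, size_t len, const char *path, const char *id)
{
    if (demo_is_header(path))
        return snprintf(buf, len, "/* SPDX-License-Identifier: %s */", id);

    return snprintf(buf, len, "// SPDX-License-Identifier: %s", id);
}
```

For "hvc_console.h" with the identifier "GPL-2.0+" this yields the line the diff adds; for a ".c" file it yields the C++-style form.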






[PATCH net-next 22/23] net/freescale: Don't set zero if FW not-available in ucc_geth

2020-03-01 Thread Leon Romanovsky
From: Leon Romanovsky 

Rely on ethtool to properly present the fact that FW is not
available for the ucc_geth driver.

Signed-off-by: Leon Romanovsky 
---
 drivers/net/ethernet/freescale/ucc_geth_ethtool.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/net/ethernet/freescale/ucc_geth_ethtool.c 
b/drivers/net/ethernet/freescale/ucc_geth_ethtool.c
index bc7ba70d176c..14c08a868190 100644
--- a/drivers/net/ethernet/freescale/ucc_geth_ethtool.c
+++ b/drivers/net/ethernet/freescale/ucc_geth_ethtool.c
@@ -334,7 +334,6 @@ uec_get_drvinfo(struct net_device *netdev,
struct ethtool_drvinfo *drvinfo)
 {
strlcpy(drvinfo->driver, DRV_NAME, sizeof(drvinfo->driver));
-   strlcpy(drvinfo->fw_version, "N/A", sizeof(drvinfo->fw_version));
strlcpy(drvinfo->bus_info, "QUICC ENGINE", sizeof(drvinfo->bus_info));
 }
 
-- 
2.24.1



[PATCH net-next 20/23] net/freescale: Clean drivers from static versions

2020-03-01 Thread Leon Romanovsky
From: Leon Romanovsky 

There is no need to set static versions because the Linux kernel is
released as a whole, with the same version applicable to the entire
code base.

Signed-off-by: Leon Romanovsky 
---
 drivers/net/ethernet/freescale/dpaa/dpaa_ethtool.c  |  2 --
 drivers/net/ethernet/freescale/enetc/enetc_pf.c | 13 -
 drivers/net/ethernet/freescale/enetc/enetc_vf.c | 12 
 drivers/net/ethernet/freescale/fec_main.c   |  1 -
 .../net/ethernet/freescale/fs_enet/fs_enet-main.c   |  2 --
 drivers/net/ethernet/freescale/fs_enet/fs_enet.h|  2 --
 drivers/net/ethernet/freescale/gianfar.c|  2 --
 drivers/net/ethernet/freescale/gianfar.h|  1 -
 drivers/net/ethernet/freescale/gianfar_ethtool.c|  2 --
 drivers/net/ethernet/freescale/ucc_geth.c   |  1 -
 drivers/net/ethernet/freescale/ucc_geth.h   |  1 -
 drivers/net/ethernet/freescale/ucc_geth_ethtool.c   |  1 -
 12 files changed, 40 deletions(-)

diff --git a/drivers/net/ethernet/freescale/dpaa/dpaa_ethtool.c 
b/drivers/net/ethernet/freescale/dpaa/dpaa_ethtool.c
index 66d150872d48..13ab669ca8b3 100644
--- a/drivers/net/ethernet/freescale/dpaa/dpaa_ethtool.c
+++ b/drivers/net/ethernet/freescale/dpaa/dpaa_ethtool.c
@@ -110,8 +110,6 @@ static void dpaa_get_drvinfo(struct net_device *net_dev,
 
strlcpy(drvinfo->driver, KBUILD_MODNAME,
sizeof(drvinfo->driver));
-   len = snprintf(drvinfo->version, sizeof(drvinfo->version),
-  "%X", 0);
len = snprintf(drvinfo->fw_version, sizeof(drvinfo->fw_version),
   "%X", 0);
 
diff --git a/drivers/net/ethernet/freescale/enetc/enetc_pf.c 
b/drivers/net/ethernet/freescale/enetc/enetc_pf.c
index fc0d7d99e9a1..545a344bce00 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc_pf.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc_pf.c
@@ -7,12 +7,6 @@
 #include 
 #include "enetc_pf.h"
 
-#define ENETC_DRV_VER_MAJ 1
-#define ENETC_DRV_VER_MIN 0
-
-#define ENETC_DRV_VER_STR __stringify(ENETC_DRV_VER_MAJ) "." \
- __stringify(ENETC_DRV_VER_MIN)
-static const char enetc_drv_ver[] = ENETC_DRV_VER_STR;
 #define ENETC_DRV_NAME_STR "ENETC PF driver"
 static const char enetc_drv_name[] = ENETC_DRV_NAME_STR;
 
@@ -929,9 +923,6 @@ static int enetc_pf_probe(struct pci_dev *pdev,
 
netif_carrier_off(ndev);
 
-   netif_info(priv, probe, ndev, "%s v%s\n",
-  enetc_drv_name, enetc_drv_ver);
-
return 0;
 
 err_reg_netdev:
@@ -959,9 +950,6 @@ static void enetc_pf_remove(struct pci_dev *pdev)
enetc_sriov_configure(pdev, 0);
 
priv = netdev_priv(si->ndev);
-   netif_info(priv, drv, si->ndev, "%s v%s remove\n",
-  enetc_drv_name, enetc_drv_ver);
-
unregister_netdev(si->ndev);
 
enetc_mdio_remove(pf);
@@ -995,4 +983,3 @@ module_pci_driver(enetc_pf_driver);
 
 MODULE_DESCRIPTION(ENETC_DRV_NAME_STR);
 MODULE_LICENSE("Dual BSD/GPL");
-MODULE_VERSION(ENETC_DRV_VER_STR);
diff --git a/drivers/net/ethernet/freescale/enetc/enetc_vf.c 
b/drivers/net/ethernet/freescale/enetc/enetc_vf.c
index ebd21bf4cfa1..28a786b2f3e7 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc_vf.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc_vf.c
@@ -4,12 +4,6 @@
 #include 
 #include "enetc.h"
 
-#define ENETC_DRV_VER_MAJ 1
-#define ENETC_DRV_VER_MIN 0
-
-#define ENETC_DRV_VER_STR __stringify(ENETC_DRV_VER_MAJ) "." \
- __stringify(ENETC_DRV_VER_MIN)
-static const char enetc_drv_ver[] = ENETC_DRV_VER_STR;
 #define ENETC_DRV_NAME_STR "ENETC VF driver"
 static const char enetc_drv_name[] = ENETC_DRV_NAME_STR;
 
@@ -201,9 +195,6 @@ static int enetc_vf_probe(struct pci_dev *pdev,
 
netif_carrier_off(ndev);
 
-   netif_info(priv, probe, ndev, "%s v%s\n",
-  enetc_drv_name, enetc_drv_ver);
-
return 0;
 
 err_reg_netdev:
@@ -225,8 +216,6 @@ static void enetc_vf_remove(struct pci_dev *pdev)
struct enetc_ndev_priv *priv;
 
priv = netdev_priv(si->ndev);
-   netif_info(priv, drv, si->ndev, "%s v%s remove\n",
-  enetc_drv_name, enetc_drv_ver);
unregister_netdev(si->ndev);
 
enetc_free_msix(priv);
@@ -254,4 +243,3 @@ module_pci_driver(enetc_vf_driver);
 
 MODULE_DESCRIPTION(ENETC_DRV_NAME_STR);
 MODULE_LICENSE("Dual BSD/GPL");
-MODULE_VERSION(ENETC_DRV_VER_STR);
diff --git a/drivers/net/ethernet/freescale/fec_main.c 
b/drivers/net/ethernet/freescale/fec_main.c
index 12edd4e358f8..af7653e341f2 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -2128,7 +2128,6 @@ static void fec_enet_get_drvinfo(struct net_device *ndev,
 
strlcpy(info->driver, fep->pdev->dev.driver->name,
sizeof(info->driver));
-   strlcpy(info->version, "Revision: 1.0", sizeof(info->version));
   strlcpy(info->bus_info, dev_name(&ndev->dev), sizeof(info->bus_info));
 

[PATCH net-next 00/23] Clean driver, module and FW versions

2020-03-01 Thread Leon Romanovsky
From: Leon Romanovsky 

Hi,

This is the second batch of the series, which removes various static versions
in favour of the globally defined Linux kernel version.

The first part, with a better cover letter, can be found here:
https://lore.kernel.org/lkml/20200224085311.460338-1-l...@kernel.org

The code is based on
68e2c37690b0 ("Merge branch 'hsr-several-code-cleanup-for-hsr-module'")

and WIP branch is
https://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma.git/log/?h=ethtool

Thanks
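
From the userspace side, the practical consequence is that the one version worth recording for an in-tree driver is the kernel release itself. A minimal sketch of reading it with the standard uname() call follows; demo_kernel_release() is a hypothetical helper and not part of this series.

```c
#include <stdio.h>
#include <sys/utsname.h>

/*
 * Copy the running kernel's release string (the value "uname -r" prints)
 * into buf. Returns 0 on success, -1 on failure.
 */
int demo_kernel_release(char *buf, size_t len)
{
    struct utsname u;

    if (len == 0 || uname(&u) < 0)
        return -1;

    snprintf(buf, len, "%s", u.release);
    return 0;
}
```

A bug report that carries this string identifies the exact code base for every in-tree driver, which is the argument for dropping the per-driver version constants.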

Leon Romanovsky (23):
  net/broadcom: Clean broadcom code from driver versions
  net/broadcom: Don't set N/A FW if it is not available
  net/brocade: Delete driver version
  net/liquidio: Delete driver version assignment
  net/liquidio: Delete non-working LIQUIDIO_PACKAGE check
  net/cavium: Clean driver versions
  net/cavium: Delete N/A assignments for ethtool
  net/chelsio: Delete drive and  module versions
  net/chelsio: Don't set N/A for not available FW
  net/cirrus: Delete driver version
  net/cisco: Delete driver and module versions
  net/cortina: Delete driver version from ethtool output
  net/davicom: Delete ethtool version assignment
  net/dec: Delete driver versions
  net/dlink: Remove driver version and release date
  net/dnet: Delete static version from the driver
  net/emulex: Delete driver version
  net/faraday: Delete driver version from the drivers
  net/fealnx: Delete driver version
  net/freescale: Clean drivers from static versions
  net/freescale: Don't set zero if FW not-available in dpaa
  net/freescale: Don't set zero if FW not-available in ucc_geth
  net/freescale: Don't set zero if FW iand bus not-available in gianfar

 drivers/net/ethernet/broadcom/b44.c   |  5 
 drivers/net/ethernet/broadcom/bcm63xx_enet.c  | 10 ++-
 drivers/net/ethernet/broadcom/bcmsysport.c|  1 -
 drivers/net/ethernet/broadcom/bnx2.c  | 11 
 drivers/net/ethernet/broadcom/bnx2x/bnx2x.h   |  8 +-
 .../ethernet/broadcom/bnx2x/bnx2x_ethtool.c   |  7 -
 .../net/ethernet/broadcom/bnx2x/bnx2x_main.c  |  7 -
 drivers/net/ethernet/broadcom/bnxt/bnxt.c |  8 --
 drivers/net/ethernet/broadcom/bnxt/bnxt.h |  4 ++-
 .../net/ethernet/broadcom/bnxt/bnxt_ethtool.c |  1 -
 drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c |  1 -
 .../net/ethernet/broadcom/genet/bcmgenet.c|  1 -
 drivers/net/ethernet/broadcom/tg3.c   | 11 +---
 drivers/net/ethernet/brocade/bna/bnad.c   |  4 ---
 drivers/net/ethernet/brocade/bna/bnad.h   |  2 --
 .../net/ethernet/brocade/bna/bnad_ethtool.c   |  1 -
 .../ethernet/cavium/liquidio/lio_ethtool.c|  2 --
 .../net/ethernet/cavium/liquidio/lio_main.c   |  8 --
 .../ethernet/cavium/liquidio/lio_vf_main.c|  5 ++--
 .../cavium/liquidio/liquidio_common.h |  6 -
 .../ethernet/cavium/liquidio/octeon_console.c | 10 ++-
 .../net/ethernet/cavium/octeon/octeon_mgmt.c  |  6 -
 .../ethernet/cavium/thunder/nicvf_ethtool.c   |  2 --
 drivers/net/ethernet/chelsio/cxgb/common.h|  1 -
 drivers/net/ethernet/chelsio/cxgb/cxgb2.c |  3 ---
 .../net/ethernet/chelsio/cxgb3/cxgb3_main.c   |  4 ---
 drivers/net/ethernet/chelsio/cxgb3/version.h  |  2 --
 drivers/net/ethernet/chelsio/cxgb4/cxgb4.h|  3 +--
 .../ethernet/chelsio/cxgb4/cxgb4_ethtool.c|  6 +
 .../net/ethernet/chelsio/cxgb4/cxgb4_main.c   | 10 ---
 .../ethernet/chelsio/cxgb4vf/cxgb4vf_main.c   |  9 ---
 .../ethernet/chelsio/libcxgb/libcxgb_ppm.c|  2 --
 drivers/net/ethernet/cirrus/ep93xx_eth.c  |  2 --
 drivers/net/ethernet/cisco/enic/enic.h|  2 --
 .../net/ethernet/cisco/enic/enic_ethtool.c|  1 -
 drivers/net/ethernet/cisco/enic/enic_main.c   |  3 ---
 drivers/net/ethernet/cortina/gemini.c |  2 --
 drivers/net/ethernet/davicom/dm9000.c |  2 --
 drivers/net/ethernet/dec/tulip/de2104x.c  | 15 ---
 drivers/net/ethernet/dec/tulip/dmfe.c | 14 --
 drivers/net/ethernet/dec/tulip/tulip_core.c   | 26 ++-
 drivers/net/ethernet/dec/tulip/uli526x.c  | 13 --
 drivers/net/ethernet/dec/tulip/winbond-840.c  | 12 -
 drivers/net/ethernet/dlink/dl2k.c |  9 ---
 drivers/net/ethernet/dlink/sundance.c | 20 --
 drivers/net/ethernet/dnet.c   |  1 -
 drivers/net/ethernet/dnet.h   |  1 -
 drivers/net/ethernet/emulex/benet/be.h|  1 -
 .../net/ethernet/emulex/benet/be_ethtool.c|  1 -
 drivers/net/ethernet/emulex/benet/be_main.c   |  5 +---
 drivers/net/ethernet/faraday/ftgmac100.c  |  2 --
 drivers/net/ethernet/faraday/ftmac100.c   |  3 ---
 drivers/net/ethernet/fealnx.c | 20 --
 .../ethernet/freescale/dpaa/dpaa_ethtool.c| 11 
 .../net/ethernet/freescale/enetc/enetc_pf.c   | 13 --
 .../net/ethernet/freescale/enetc/enetc_vf.c   | 12 -
 drivers/net/ethernet/freescale/fec_main.c |  1 -
 .../ethernet/freescale/fs_enet/fs_enet-main.c |  2 --
 

Re: [PATCH v3 32/32] powerpc/64s: system call support for scv/rfscv instructions

2020-03-01 Thread kbuild test robot
Hi Nicholas,

I love your patch! Yet something to improve:

[auto build test ERROR on powerpc/next]
[also build test ERROR on v5.6-rc3 next-20200228]
[cannot apply to kvm-ppc/kvm-ppc-next scottwood/next]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:
https://github.com/0day-ci/linux/commits/Nicholas-Piggin/powerpc-64-interrupts-and-syscalls-series/20200226-043224
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc-ppc64e_defconfig (attached as .config)
compiler: powerpc64-linux-gcc (GCC) 7.5.0
reproduce:
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
GCC_VERSION=7.5.0 make.cross ARCH=powerpc 

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot 

All errors (new ones prefixed by >>):

   arch/powerpc/kernel/entry_64.S: Assembler messages:
>> arch/powerpc/kernel/entry_64.S:67: Error: unrecognized opcode: `interrupt_to_kernel'
>> arch/powerpc/kernel/entry_64.S:164: Error: unrecognized opcode: `rfscv_to_user'

vim +67 arch/powerpc/kernel/entry_64.S

47  
48  /*
49   * System calls.
50   */
51  .section ".toc","aw"
52  SYS_CALL_TABLE:
53  .tc sys_call_table[TC],sys_call_table
54  
55  COMPAT_SYS_CALL_TABLE:
56  .tc compat_sys_call_table[TC],compat_sys_call_table
57  
58  /* This value is used to mark exception frames on the stack. */
59  exception_marker:
60  .tc ID_EXC_MARKER[TC],STACK_FRAME_REGS_MARKER
61  
62  .section ".text"
63  .align 7
64  
65  .globl system_call_vectored_common
66  system_call_vectored_common:
  > 67  INTERRUPT_TO_KERNEL
68  mr  r10,r1
69  ld  r1,PACAKSAVE(r13)
70  std r10,0(r1)
71  std r11,_NIP(r1)
72  std r12,_MSR(r1)
73  std r0,GPR0(r1)
74  std r10,GPR1(r1)
75  std r2,GPR2(r1)
76  ld  r2,PACATOC(r13)
77  mfcr    r12
78  li  r11,0
79  /* Can we avoid saving r3-r8 in common case? */
80  std r3,GPR3(r1)
81  std r4,GPR4(r1)
82  std r5,GPR5(r1)
83  std r6,GPR6(r1)
84  std r7,GPR7(r1)
85  std r8,GPR8(r1)
86  /* Zero r9-r12, this should only be required when restoring all GPRs */
87  std r11,GPR9(r1)
88  std r11,GPR10(r1)
89  std r11,GPR11(r1)
90  std r11,GPR12(r1)
91  std r9,GPR13(r1)
92  SAVE_NVGPRS(r1)
93  std r11,_XER(r1)
94  std r11,_LINK(r1)
95  std r11,_CTR(r1)
96  
97  li  r11,0xc00
98  std r11,_TRAP(r1)
99  std r12,_CCR(r1)
   100  std r3,ORIG_GPR3(r1)
   101  addi    r10,r1,STACK_FRAME_OVERHEAD
   102  ld  r11,exception_marker@toc(r2)
   103  std r11,-16(r10)        /* "regshere" marker */
   104  
   105  /*
   106   * RECONCILE_IRQ_STATE without calling trace_hardirqs_off(), which
   107   * would clobber syscall parameters. Also we always enter with IRQs
   108   * enabled and nothing pending. system_call_exception() will call
   109   * trace_hardirqs_off().
   110   *
   111   * scv enters with MSR[EE]=1, so don't set PACA_IRQ_HARD_DIS.
   112   */
   113  li  r9,IRQS_ALL_DISABLED
   114  stb r9,PACAIRQSOFTMASK(r13)
   115  
   116  /* Calling convention has r9 = orig r0, r10 = regs */
   117  mr  r9,r0
   118  bl  system_call_exception
   119  
   120  .Lsyscall_vectored_exit:
   121  addi    r4,r1,STACK_FRAME_OVERHEAD
   122  li  r5,1 /* scv */
   123  bl  syscall_exit_prepare
   124  
   125  ld  r2,_CCR(r1)
   126  ld  r4,_NIP(r1)
   127  ld  r5,_MSR(r1)
   128  
   129  BEGIN_FTR_SECTION
   130  stdcx.  r0,0,r1 /* to clear the reservation */
   131  END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
   132  
   133  mtlr    r4
   134  mtctr   r5
   135  
   136  cmpdi   r3,0
   137  bne syscall_vectored_restore_regs
   138  li  r0,0
   139  li  r4,0
   140  li  r5,0
   141  li  r6,0
   142  li  r7,0
   143  li  r8,0
   144  li  

Re: [PATCH 1/2] powerpc: fix hardware PMU exception bug on PowerVM compatibility mode systems

2020-03-01 Thread Michael Ellerman
Michael Ellerman  writes:
> From: "Desnes A. Nunes do Rosario" 
>
> PowerVM systems running compatibility mode on a few Power8 revisions are
> still vulnerable to the hardware defect that loses PMU exceptions arriving
> prior to a context switch.
>
> The software fix for this issue is enabled through the CPU_FTR_PMAO_BUG
> cpu_feature bit, nevertheless this bit also needs to be set for PowerVM
> compatibility mode systems.
>
> Fixes: 68f2f0d431d9ea4 ("powerpc: Add a cpu feature CPU_FTR_PMAO_BUG")
> Signed-off-by: Desnes A. Nunes do Rosario 
> Reviewed-by: Leonardo Bras 
> Signed-off-by: Michael Ellerman 
> Link: https://lore.kernel.org/r/20200227134715.9715-1-desn...@linux.ibm.com
> ---
>  arch/powerpc/kernel/cputable.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)

Ignore, PEBKAC.

Don't try to operate git-send-email after 10pm.

cheers


> diff --git a/arch/powerpc/kernel/cputable.c b/arch/powerpc/kernel/cputable.c
> index e745abc5457a..245be4fafe13 100644
> --- a/arch/powerpc/kernel/cputable.c
> +++ b/arch/powerpc/kernel/cputable.c
> @@ -2193,11 +2193,13 @@ static struct cpu_spec * __init 
> setup_cpu_spec(unsigned long offset,
>* oprofile_cpu_type already has a value, then we are
>* possibly overriding a real PVR with a logical one,
>* and, in that case, keep the current value for
> -  * oprofile_cpu_type.
> +  * oprofile_cpu_type. Futhermore, let's ensure that the
> +  * fix for the PMAO bug is enabled on compatibility mode.
>*/
>   if (old.oprofile_cpu_type != NULL) {
>   t->oprofile_cpu_type = old.oprofile_cpu_type;
>   t->oprofile_type = old.oprofile_type;
> + t->cpu_features |= old.cpu_features & CPU_FTR_PMAO_BUG;
>   }
>   }
>  
> -- 
> 2.21.1


[PATCH] powerpc/kuap: PPC_KUAP_DEBUG should depend on PPC_KUAP

2020-03-01 Thread Michael Ellerman
Currently you can enable PPC_KUAP_DEBUG when PPC_KUAP is disabled,
even though the former has no effect without the latter.

Fix it so that PPC_KUAP_DEBUG can only be enabled when PPC_KUAP is
enabled, not when the platform could support KUAP (PPC_HAVE_KUAP).

Fixes: 890274c2dc4c ("powerpc/64s: Implement KUAP for Radix MMU")
Signed-off-by: Michael Ellerman 
---
 arch/powerpc/platforms/Kconfig.cputype | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/Kconfig.cputype b/arch/powerpc/platforms/Kconfig.cputype
index 6caedc88474f..6cd4e3240ec6 100644
--- a/arch/powerpc/platforms/Kconfig.cputype
+++ b/arch/powerpc/platforms/Kconfig.cputype
@@ -397,7 +397,7 @@ config PPC_KUAP
 
 config PPC_KUAP_DEBUG
bool "Extra debugging for Kernel Userspace Access Protection"
-   depends on PPC_HAVE_KUAP && (PPC_RADIX_MMU || PPC_32)
+   depends on PPC_KUAP && (PPC_RADIX_MMU || PPC_32)
help
  Add extra debugging for Kernel Userspace Access Protection (KUAP)
  If you're unsure, say N.
-- 
2.21.1
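
For readers who want to see which configurations the change affects, the
two "depends on" expressions can be modelled as plain boolean functions.
This is only a sketch: the struct and function names below are made up for
illustration and are not kernel code, and Kconfig itself evaluates these
expressions, not C.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the four Kconfig symbols named in the patch. */
struct kconfig_sketch {
	bool ppc_have_kuap;	/* platform could support KUAP */
	bool ppc_kuap;		/* KUAP is actually enabled */
	bool ppc_radix_mmu;
	bool ppc_32;
};

/* Old rule: depends on PPC_HAVE_KUAP && (PPC_RADIX_MMU || PPC_32) */
static bool kuap_debug_selectable_old(const struct kconfig_sketch *c)
{
	assert(c != NULL);
	return c->ppc_have_kuap && (c->ppc_radix_mmu || c->ppc_32);
}

/* New rule: depends on PPC_KUAP && (PPC_RADIX_MMU || PPC_32) */
static bool kuap_debug_selectable_new(const struct kconfig_sketch *c)
{
	assert(c != NULL);
	return c->ppc_kuap && (c->ppc_radix_mmu || c->ppc_32);
}
```

The two functions differ only for a configuration where the platform could
support KUAP but KUAP itself is switched off: the old rule still offers
the debug option there, the new rule does not.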



[PATCH 1/2] powerpc: fix hardware PMU exception bug on PowerVM compatibility mode systems

2020-03-01 Thread Michael Ellerman
From: "Desnes A. Nunes do Rosario" 

PowerVM systems running compatibility mode on a few Power8 revisions are
still vulnerable to the hardware defect that loses PMU exceptions arriving
prior to a context switch.

The software fix for this issue is enabled through the CPU_FTR_PMAO_BUG
cpu_feature bit, nevertheless this bit also needs to be set for PowerVM
compatibility mode systems.

Fixes: 68f2f0d431d9ea4 ("powerpc: Add a cpu feature CPU_FTR_PMAO_BUG")
Signed-off-by: Desnes A. Nunes do Rosario 
Reviewed-by: Leonardo Bras 
Signed-off-by: Michael Ellerman 
Link: https://lore.kernel.org/r/20200227134715.9715-1-desn...@linux.ibm.com
---
 arch/powerpc/kernel/cputable.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/cputable.c b/arch/powerpc/kernel/cputable.c
index e745abc5457a..245be4fafe13 100644
--- a/arch/powerpc/kernel/cputable.c
+++ b/arch/powerpc/kernel/cputable.c
@@ -2193,11 +2193,13 @@ static struct cpu_spec * __init setup_cpu_spec(unsigned long offset,
 * oprofile_cpu_type already has a value, then we are
 * possibly overriding a real PVR with a logical one,
 * and, in that case, keep the current value for
-* oprofile_cpu_type.
+* oprofile_cpu_type. Futhermore, let's ensure that the
+* fix for the PMAO bug is enabled on compatibility mode.
 */
if (old.oprofile_cpu_type != NULL) {
t->oprofile_cpu_type = old.oprofile_cpu_type;
t->oprofile_type = old.oprofile_type;
+   t->cpu_features |= old.cpu_features & CPU_FTR_PMAO_BUG;
}
}
 
-- 
2.21.1
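
As a standalone illustration of the one-line change above, the masking can
be tried outside the kernel. This is a minimal sketch under assumptions:
the struct, the helper name and the bit values are hypothetical stand-ins,
and the real CPU_FTR_* definitions are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit positions, chosen only for this example. */
#define FTR_PMAO_BUG	(UINT64_C(1) << 0)
#define FTR_OTHER	(UINT64_C(1) << 1)

/* Stand-in for the one struct cpu_spec field the patch touches. */
struct spec_sketch {
	uint64_t cpu_features;
};

/*
 * Copy only the PMAO_BUG bit from the old spec into the new one.
 * Bits already set in t are kept, and no other bit of old leaks in.
 */
static void inherit_pmao_bug(struct spec_sketch *t,
			     const struct spec_sketch *old)
{
	assert(t != NULL && old != NULL);
	t->cpu_features |= old->cpu_features & FTR_PMAO_BUG;
}
```

Masking with the single feature bit before OR-ing means the new spec picks
up the workaround flag if the old spec had it, without inheriting any of
the old spec's other features.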