[PATCH] powerpc/ptdump: don't entirely rebuild kernel when selecting CONFIG_PPC_DEBUG_WX

2020-01-10 Thread Christophe Leroy
Selecting CONFIG_PPC_DEBUG_WX only impacts ptdump and pgtable_32/64
init calls. Declaring related functions in asm/pgtable.h implies
rebuilding almost everything.

Move the ptdump_check_wx() declaration into a new dedicated header file.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/pgtable.h |  6 ------
 arch/powerpc/include/asm/ptdump.h  | 15 +++++++++++++++
 arch/powerpc/mm/pgtable_32.c       |  1 +
 arch/powerpc/mm/pgtable_64.c       |  1 +
 arch/powerpc/mm/ptdump/ptdump.c    |  1 +
 5 files changed, 18 insertions(+), 6 deletions(-)
 create mode 100644 arch/powerpc/include/asm/ptdump.h

diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h
index 0e4ec8cc37b7..8cc543ed114c 100644
--- a/arch/powerpc/include/asm/pgtable.h
+++ b/arch/powerpc/include/asm/pgtable.h
@@ -94,12 +94,6 @@ void mark_initmem_nx(void);
 static inline void mark_initmem_nx(void) { }
 #endif
 
-#ifdef CONFIG_PPC_DEBUG_WX
-void ptdump_check_wx(void);
-#else
-static inline void ptdump_check_wx(void) { }
-#endif
-
 /*
  * When used, PTE_FRAG_NR is defined in subarch pgtable.h
  * so we are sure it is included when arriving here.
diff --git a/arch/powerpc/include/asm/ptdump.h b/arch/powerpc/include/asm/ptdump.h
new file mode 100644
index 000000000000..246b92c21729
--- /dev/null
+++ b/arch/powerpc/include/asm/ptdump.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_POWERPC_PTDUMP_H
+#define _ASM_POWERPC_PTDUMP_H
+
+#ifndef __ASSEMBLY__
+
+#ifdef CONFIG_PPC_DEBUG_WX
+void ptdump_check_wx(void);
+#else
+static inline void ptdump_check_wx(void) { }
+#endif
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* _ASM_POWERPC_PTDUMP_H */
diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c
index 73b84166d06a..6c866f1b1eeb 100644
--- a/arch/powerpc/mm/pgtable_32.c
+++ b/arch/powerpc/mm/pgtable_32.c
@@ -29,6 +29,7 @@
 #include 
 #include 
 #include 
+#include <asm/ptdump.h>
 
 #include 
 
diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c
index e78832dce7bb..3686cd887c2f 100644
--- a/arch/powerpc/mm/pgtable_64.c
+++ b/arch/powerpc/mm/pgtable_64.c
@@ -45,6 +45,7 @@
 #include 
 #include 
 #include 
+#include <asm/ptdump.h>
 
 #include 
 
diff --git a/arch/powerpc/mm/ptdump/ptdump.c b/arch/powerpc/mm/ptdump/ptdump.c
index 2f9ddc29c535..d7b02bcd0691 100644
--- a/arch/powerpc/mm/ptdump/ptdump.c
+++ b/arch/powerpc/mm/ptdump/ptdump.c
@@ -23,6 +23,7 @@
 #include 
 #include 
 #include 
+#include <asm/ptdump.h>
 
 #include "ptdump.h"
 
-- 
2.13.3
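
For context, a sketch of how the moved declaration is consumed (not part of
the diff; mark_rodata_ro() shown schematically, per the 5.5-era tree):

/*
 * Sketch only: mark_rodata_ro() is the kind of caller that needs
 * ptdump_check_wx() in scope, which is why pgtable_32.c and pgtable_64.c
 * now include asm/ptdump.h instead of inheriting it via asm/pgtable.h.
 */
void mark_rodata_ro(void)
{
	/* ... make kernel text and rodata read-only ... */

	/* resolves to the empty stub unless CONFIG_PPC_DEBUG_WX is set */
	ptdump_check_wx();
}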



Re: [PATCH 2/3] powerpc sstep: add support for divde[.] and divdeu[.] instructions

2020-01-10 Thread Balamuruhan S
On Thu, Jan 09, 2020 at 09:23:14AM +1100, Paul Mackerras wrote:
> On Tue, Dec 10, 2019 at 12:49:03PM +0530, Balamuruhan S wrote:
> > This patch adds emulation support for divde, divdeu instructions,
> > * Divide Doubleword Extended (divde[.])
> > * Divide Doubleword Extended Unsigned (divdeu[.])
> > 
> > Signed-off-by: Balamuruhan S 
> > ---
> >  arch/powerpc/lib/sstep.c | 27 ++++++++++++++++++++++++++-
> >  1 file changed, 26 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
> > index c077acb983a1..4b4119729e59 100644
> > --- a/arch/powerpc/lib/sstep.c
> > +++ b/arch/powerpc/lib/sstep.c
> > @@ -1736,7 +1736,32 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
> > op->val = (int) regs->gpr[ra] /
> > (int) regs->gpr[rb];
> > goto arith_done;
> > -
> > +#ifdef __powerpc64__
> > +   case 425:   /* divde[.] */
> > +   if (instr & 1) {
> > +   asm volatile(PPC_DIVDE_DOT(%0, %1, %2) :
> > +   "=r" (op->val) : "r" (regs->gpr[ra]),
> > +   "r" (regs->gpr[rb]));
> > +   set_cr0(regs, op);
> 
> This seems unnecessarily complicated.  You take the trouble to do a
> "divde." instruction rather than a "divde" instruction but then don't
> use the CR0 setting that the instruction did, but instead go and work
> out what happens to CR0 manually in set_cr0().  Also you don't tell
> the compiler that CR0 has been modified, which could lead to problems.
> 
> This case could be done much more simply like this:
> 
> 
> 
>   case 425:   /* divde[.] */
>   asm volatile(PPC_DIVDE(%0, %1, %2) :
>   "=r" (op->val) : "r" (regs->gpr[ra]),
>   "r" (regs->gpr[rb]));
>   goto arith_done;
> 
> (note, goto arith_done rather than compute_done) and similarly for the
> divdeu case.

Thanks Paul for the review, I will fix it as suggested and post the v2 version.

-- Bala
> 
> Paul.
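
For reference, a sketch of the full pair of cases under this suggestion
(assuming PPC_DIVDEU is defined alongside PPC_DIVDE in ppc-opcode.h; 425 and
393 are the ISA extended opcodes, and arith_done applies set_cr0() only when
the Rc bit is set, which is why the plain non-dot mnemonic suffices here):

#ifdef __powerpc64__
	case 425:	/* divde[.] */
		asm volatile(PPC_DIVDE(%0, %1, %2) :
				"=r" (op->val) : "r" (regs->gpr[ra]),
				"r" (regs->gpr[rb]));
		goto arith_done;
	case 393:	/* divdeu[.] */
		asm volatile(PPC_DIVDEU(%0, %1, %2) :
				"=r" (op->val) : "r" (regs->gpr[ra]),
				"r" (regs->gpr[rb]));
		goto arith_done;
#endif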



[Bug 205201] Booting halts if Dawicontrol DC-2976 UW SCSI board installed, unless RAM size limited to 3500M

2020-01-10 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=205201

--- Comment #17 from Roland (rj.ron...@gmail.com) ---
Kernel 5.5 alpha 1 fixed the issue with the Dawicontrol DC-2976 UW SCSI board.
An RTL 8169 ethernet card, which had a similar type of problem with earlier
kernels, now also works with the full 8 GB of RAM.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 205201] Booting halts if Dawicontrol DC-2976 UW SCSI board installed, unless RAM size limited to 3500M

2020-01-10 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=205201

Roland (rj.ron...@gmail.com) changed:

            What    |Removed                     |Added
----------------------------------------------------------------------------
          Status    |NEW                         |RESOLVED
      Resolution    |---                         |PATCH_ALREADY_AVAILABLE

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

Re: [PATCH 06/18] powerpc sstep: Add support for prefixed integer load/stores

2020-01-10 Thread Balamuruhan S
On Tue, Nov 26, 2019 at 04:21:29PM +1100, Jordan Niethe wrote:
> This adds emulation support for the following prefixed integer
> load/stores:
>   * Prefixed Load Byte and Zero (plbz)
>   * Prefixed Load Halfword and Zero (plhz)
>   * Prefixed Load Halfword Algebraic (plha)
>   * Prefixed Load Word and Zero (plwz)
>   * Prefixed Load Word Algebraic (plwa)
>   * Prefixed Load Doubleword (pld)
>   * Prefixed Store Byte (pstb)
>   * Prefixed Store Halfword (psth)
>   * Prefixed Store Word (pstw)
>   * Prefixed Store Doubleword (pstd)
>   * Prefixed Load Quadword (plq)
>   * Prefixed Store Quadword (pstq)
> 
> Signed-off-by: Jordan Niethe 
> ---
> >  arch/powerpc/lib/sstep.c | 110 ++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 110 insertions(+)
> 
> diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
> index ade3f5eba2e5..4f5ad1f602d8 100644
> --- a/arch/powerpc/lib/sstep.c
> +++ b/arch/powerpc/lib/sstep.c
> @@ -187,6 +187,43 @@ static nokprobe_inline unsigned long xform_ea(unsigned 
> int instr,
>   return ea;
>  }
>  
> +/*
> > + * Calculate effective address for a MLS:D-form / 8LS:D-form prefixed instruction
> + */
> +static nokprobe_inline unsigned long mlsd_8lsd_ea(unsigned int instr,
> +   unsigned int sufx,
> +   const struct pt_regs *regs)
> +{
> + int ra, prefix_r;
> + unsigned int  dd;
> + unsigned long ea, d0, d1, d;
> +
> + prefix_r = instr & (1ul << 20);
> + ra = (sufx >> 16) & 0x1f;
> +
> > + d0 = instr & 0x3ffff;
> > + d1 = sufx & 0xffff;
> + d = (d0 << 16) | d1;
> +
> + /*
> +  * sign extend a 34 bit number
> +  */
> + dd = (unsigned int) (d >> 2);
> + ea = (signed int) dd;
> + ea = (ea << 2) | (d & 0x3);
> +
> + if (!prefix_r && ra)
> + ea += regs->gpr[ra];
> + else if (!prefix_r && !ra)
> + ; /* Leave ea as is */
> + else if (prefix_r && !ra)
> + ea += regs->nip;
> + else if (prefix_r && ra)
> + ; /* Invalid form. Should already be checked for by caller! */
> +
> + return ea;
> +}
> +
>  /*
>   * Return the largest power of 2, not greater than sizeof(unsigned long),
>   * such that x is a multiple of it.
> @@ -1166,6 +1203,7 @@ int analyse_instr(struct instruction_op *op, const 
> struct pt_regs *regs,
> unsigned int instr, unsigned int sufx)
>  {
>   unsigned int opcode, ra, rb, rc, rd, spr, u;
> + unsigned int sufxopcode, prefixtype, prefix_r;
>   unsigned long int imm;
>   unsigned long int val, val2;
>   unsigned int mb, me, sh;
> @@ -2652,6 +2690,78 @@ int analyse_instr(struct instruction_op *op, const 
> struct pt_regs *regs,
>  
>   }
>  
> +/*
> + * Prefixed instructions
> + */
> + switch (opcode) {
> + case 1:
> + prefix_r = instr & (1ul << 20);
> + ra = (sufx >> 16) & 0x1f;
> + op->update_reg = ra;
> + rd = (sufx >> 21) & 0x1f;
> + op->reg = rd;
> + op->val = regs->gpr[rd];
> +
> + sufxopcode = sufx >> 26;
> + prefixtype = (instr >> 24) & 0x3;
> + switch (prefixtype) {
> + case 0: /* Type 00  Eight-Byte Load/Store */
> + if (prefix_r && ra)
> + break;
> + op->ea = mlsd_8lsd_ea(instr, sufx, regs);
> + switch (sufxopcode) {
> + case 41:/* plwa */
> + op->type = MKOP(LOAD, PREFIXED | SIGNEXT, 4);
> + break;
> + case 56:/* plq */
> + op->type = MKOP(LOAD, PREFIXED, 16);
> + break;
> + case 57:/* pld */
> + op->type = MKOP(LOAD, PREFIXED | SIGNEXT, 8);
> + break;
> + case 60:/* stq */
> + op->type = MKOP(STORE, PREFIXED, 16);
> + break;
> + case 61:/* pstd */
> + op->type = MKOP(STORE, PREFIXED | SIGNEXT, 8);

For 8 byte and 1 byte (mentioned below for Type 10 instructions), we do not
have their respective definitions in `do_signext()`; I am not sure whether it
is a typo or a miss.

> + break;
> + }
> + break;
> + case 1: /* Type 01 Modified Register-to-Register */

Type 01 would be Eight-Byte Register-to-Register.

-- Bala
> + break;
> + case 2: /* Type 10 Modified Load/Store */
> + if (prefix_r && ra)
> + break;
> + op->ea = mlsd_8lsd_ea(instr, sufx, regs);
> + switch (sufxopcode) {
> > + case 32:/* plwz */
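
As an aside, the two-step sign extension in mlsd_8lsd_ea() above can be
checked in isolation with a small user-space program (standalone sketch,
assuming an LP64 target like the kernel code it mirrors):

#include <stdio.h>

/* Sign-extend a 34-bit value: sign-extend the top 32 of the 34 bits via a
 * signed 32-bit cast, then reattach the low 2 bits.
 */
static long sext34(unsigned long d)
{
	unsigned int dd = (unsigned int)(d >> 2);	/* top 32 of 34 bits */
	long ea = (signed int)dd;			/* sign-extend to 64 bits */

	return (ea << 2) | (d & 0x3);			/* restore low 2 bits */
}

int main(void)
{
	printf("%ld\n", sext34(0x3ffffffffUL));	/* all 34 bits set: -1 */
	printf("%ld\n", sext34(0x1ffffffffUL));	/* bit 33 set: 8589934591 */
	return 0;
}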

Re: [PATCH v2 2/9] net: wireless: rtl818x: Constify ioreadX() iomem argument (as in generic implementation)

2020-01-10 Thread Kalle Valo
Krzysztof Kozlowski  writes:

> The ioreadX() helpers have inconsistent interface.  On some architectures
> void *__iomem address argument is a pointer to const, on some not.
>
> Implementations of ioreadX() do not modify the memory under the address
> so they can be converted to a "const" version for const-safety and
> consistency among architectures.
>
> Signed-off-by: Krzysztof Kozlowski 
> Reviewed-by: Geert Uytterhoeven 
>
> ---
>
> Changes since v1:
> 1. Add Geert's review.
> ---
>  drivers/net/wireless/realtek/rtl818x/rtl8180/rtl8180.h | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)

No need to have "net: wireless: " in the title, this is enough.

rtl818x: Constify ioreadX() iomem argument (as in generic implementation)

I assume someone else will take this so here's my ack:

Acked-by: Kalle Valo 

-- 
https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches


Re: [PATCH v2 9/9] net: wireless: ath5k: Constify ioreadX() iomem argument (as in generic implementation)

2020-01-10 Thread Kalle Valo
Krzysztof Kozlowski  writes:

> The ioreadX() helpers have inconsistent interface.  On some architectures
> void *__iomem address argument is a pointer to const, on some not.
>
> Implementations of ioreadX() do not modify the memory under the address
> so they can be converted to a "const" version for const-safety and
> consistency among architectures.
>
> Signed-off-by: Krzysztof Kozlowski 
> ---
>  drivers/net/wireless/ath/ath5k/ahb.c | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)

No need to have "net: wireless: " in the title, otherwise looks good.

Acked-by: Kalle Valo 

-- 
https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches


Re: [PATCH v4 2/9] perf/core: open access for CAP_SYS_PERFMON privileged process

2020-01-10 Thread Peter Zijlstra
On Thu, Jan 09, 2020 at 02:36:50PM +0300, Alexey Budankov wrote:
> On 08.01.2020 19:07, Peter Zijlstra wrote:
> > On Wed, Dec 18, 2019 at 12:25:35PM +0300, Alexey Budankov wrote:

> >> diff --git a/kernel/events/core.c b/kernel/events/core.c
> >> index 059ee7116008..d9db414f2197 100644
> >> --- a/kernel/events/core.c
> >> +++ b/kernel/events/core.c
> >> @@ -9056,7 +9056,7 @@ static int perf_kprobe_event_init(struct perf_event *event)
> >>if (event->attr.type != perf_kprobe.type)
> >>return -ENOENT;
> >>  
> >> -  if (!capable(CAP_SYS_ADMIN))
> >> +  if (!perfmon_capable())
> >>return -EACCES;
> >>  
> >>/*
> > 
> > This one only allows attaching to already extant kprobes, right? It does
> > not allow creation of kprobes.
> 
> This unblocks creation of local trace kprobes and uprobes by CAP_SYS_PERFMON
> privileged process, exactly the same as for CAP_SYS_ADMIN privileged process.

I've no idea what you just said; it's just words.

Again, this only allows attaching to previously created kprobes, it does
not allow creating kprobes, right?

That is; I don't think CAP_SYS_PERFMON should be allowed to create
kprobes.

As might be clear; I don't actually know what the user-ABI is for
creating kprobes.


[PATCH v2 00/10] Improvements for random.h/archrandom.h

2020-01-10 Thread Mark Brown
This is a resend of a series from Richard Henderson last posted back in
November:

   
https://lore.kernel.org/linux-arm-kernel/20191106141308.30535-1-...@twiddle.net/

Back then Borislav said they looked good and asked if he should take
them through the tip tree but things seem to have got lost since then.

Original cover letter:

During patch review for an addition of archrandom.h for arm64, it was
suggested that the arch_get_random_* functions should be marked __must_check.
Which does sound like a good idea, since the by-reference integer output
may be uninitialized when the boolean result is false.
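
A minimal sketch of the intended call pattern (mix_in() and fallback_seed()
are placeholder names):

	unsigned long seed;

	if (arch_get_random_seed_long(&seed))
		mix_in(seed);		/* seed is only valid here */
	else
		fallback_seed();	/* the output was never written */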

In addition, it turns out that arch_has_random() and arch_has_random_seed()
are not used, and not easy to support for arm64.  Rather than cobble
something together that would not be testable, remove the interfaces to
guard against some future accidental use.

In addition, I noticed a few other minor inconsistencies between the
different architectures, e.g. powerpc isn't using bool.

Changes since v1:
   * Remove arch_has_random, arch_has_random_seed.

Richard Henderson (10):
  x86: Remove arch_has_random, arch_has_random_seed
  powerpc: Remove arch_has_random, arch_has_random_seed
  s390: Remove arch_has_random, arch_has_random_seed
  linux/random.h: Remove arch_has_random, arch_has_random_seed
  linux/random.h: Use false with bool
  linux/random.h: Mark CONFIG_ARCH_RANDOM functions __must_check
  x86: Mark archrandom.h functions __must_check
  powerpc: Use bool in archrandom.h
  powerpc: Mark archrandom.h functions __must_check
  s390x: Mark archrandom.h functions __must_check

 arch/powerpc/include/asm/archrandom.h | 27 +-
 arch/s390/include/asm/archrandom.h| 20 ---
 arch/x86/include/asm/archrandom.h | 28 ---
 include/linux/random.h| 24 ---
 4 files changed, 33 insertions(+), 66 deletions(-)

-- 
2.20.1



[PATCH v2 02/10] powerpc: Remove arch_has_random, arch_has_random_seed

2020-01-10 Thread Mark Brown
From: Richard Henderson 

These symbols are currently part of the generic archrandom.h
interface, but are currently unused and can be removed.

Signed-off-by: Richard Henderson 
Signed-off-by: Mark Brown 
---
 arch/powerpc/include/asm/archrandom.h | 10 ----------
 1 file changed, 10 deletions(-)

diff --git a/arch/powerpc/include/asm/archrandom.h b/arch/powerpc/include/asm/archrandom.h
index a09595f00cab..2fa7cdfbba24 100644
--- a/arch/powerpc/include/asm/archrandom.h
+++ b/arch/powerpc/include/asm/archrandom.h
@@ -34,16 +34,6 @@ static inline int arch_get_random_seed_int(unsigned int *v)
 
return rc;
 }
-
-static inline int arch_has_random(void)
-{
-   return 0;
-}
-
-static inline int arch_has_random_seed(void)
-{
-   return !!ppc_md.get_random_seed;
-}
 #endif /* CONFIG_ARCH_RANDOM */
 
 #ifdef CONFIG_PPC_POWERNV
-- 
2.20.1



[PATCH v2 01/10] x86: Remove arch_has_random, arch_has_random_seed

2020-01-10 Thread Mark Brown
From: Richard Henderson 

Use the expansion of these macros directly in arch_get_random_*.

These symbols are currently part of the generic archrandom.h
interface, but are currently unused and can be removed.

Signed-off-by: Richard Henderson 
Signed-off-by: Mark Brown 
---
 arch/x86/include/asm/archrandom.h | 12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/archrandom.h b/arch/x86/include/asm/archrandom.h
index af45e1452f09..feb59461046c 100644
--- a/arch/x86/include/asm/archrandom.h
+++ b/arch/x86/include/asm/archrandom.h
@@ -73,10 +73,6 @@ static inline bool rdseed_int(unsigned int *v)
return ok;
 }
 
-/* Conditional execution based on CPU type */
-#define arch_has_random()  static_cpu_has(X86_FEATURE_RDRAND)
-#define arch_has_random_seed() static_cpu_has(X86_FEATURE_RDSEED)
-
 /*
  * These are the generic interfaces; they must not be declared if the
 * stubs in <linux/random.h> are to be invoked,
@@ -86,22 +82,22 @@ static inline bool rdseed_int(unsigned int *v)
 
 static inline bool arch_get_random_long(unsigned long *v)
 {
-   return arch_has_random() ? rdrand_long(v) : false;
+   return static_cpu_has(X86_FEATURE_RDRAND) ? rdrand_long(v) : false;
 }
 
 static inline bool arch_get_random_int(unsigned int *v)
 {
-   return arch_has_random() ? rdrand_int(v) : false;
+   return static_cpu_has(X86_FEATURE_RDRAND) ? rdrand_int(v) : false;
 }
 
 static inline bool arch_get_random_seed_long(unsigned long *v)
 {
-   return arch_has_random_seed() ? rdseed_long(v) : false;
+   return static_cpu_has(X86_FEATURE_RDSEED) ? rdseed_long(v) : false;
 }
 
 static inline bool arch_get_random_seed_int(unsigned int *v)
 {
-   return arch_has_random_seed() ? rdseed_int(v) : false;
+   return static_cpu_has(X86_FEATURE_RDSEED) ? rdseed_int(v) : false;
 }
 
 extern void x86_init_rdrand(struct cpuinfo_x86 *c);
-- 
2.20.1



[PATCH v2 03/10] s390: Remove arch_has_random, arch_has_random_seed

2020-01-10 Thread Mark Brown
From: Richard Henderson 

These symbols are currently part of the generic archrandom.h
interface, but are currently unused and can be removed.

Signed-off-by: Richard Henderson 
Signed-off-by: Mark Brown 
---
 arch/s390/include/asm/archrandom.h | 12 ------------
 1 file changed, 12 deletions(-)

diff --git a/arch/s390/include/asm/archrandom.h b/arch/s390/include/asm/archrandom.h
index c67b82dfa558..9a6835137a16 100644
--- a/arch/s390/include/asm/archrandom.h
+++ b/arch/s390/include/asm/archrandom.h
@@ -21,18 +21,6 @@ extern atomic64_t s390_arch_random_counter;
 
 bool s390_arch_random_generate(u8 *buf, unsigned int nbytes);
 
-static inline bool arch_has_random(void)
-{
-   return false;
-}
-
-static inline bool arch_has_random_seed(void)
-{
-   if (static_branch_likely(&s390_arch_random_available))
-   return true;
-   return false;
-}
-
 static inline bool arch_get_random_long(unsigned long *v)
 {
return false;
-- 
2.20.1



[PATCH v2 04/10] linux/random.h: Remove arch_has_random, arch_has_random_seed

2020-01-10 Thread Mark Brown
From: Richard Henderson 

The arm64 version of archrandom.h will need to be able to test for
support and read the random number without preemption, so a separate
query predicate is not practical.

Since this part of the generic interface is unused, remove it.

Signed-off-by: Richard Henderson 
Signed-off-by: Mark Brown 
---
 include/linux/random.h | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/include/linux/random.h b/include/linux/random.h
index f189c927fdea..7fd0360908d2 100644
--- a/include/linux/random.h
+++ b/include/linux/random.h
@@ -175,10 +175,6 @@ static inline bool arch_get_random_int(unsigned int *v)
 {
return 0;
 }
-static inline bool arch_has_random(void)
-{
-   return 0;
-}
 static inline bool arch_get_random_seed_long(unsigned long *v)
 {
return 0;
@@ -187,10 +183,6 @@ static inline bool arch_get_random_seed_int(unsigned int *v)
 {
return 0;
 }
-static inline bool arch_has_random_seed(void)
-{
-   return 0;
-}
 #endif
 
 /* Pseudo random number generator from numerical recipes. */
-- 
2.20.1



[PATCH v2 05/10] linux/random.h: Use false with bool

2020-01-10 Thread Mark Brown
From: Richard Henderson 

Keep the generic fallback versions in sync with the other architecture
specific implementations and use the proper name for false.

Suggested-by: Ard Biesheuvel 
Signed-off-by: Richard Henderson 
Signed-off-by: Mark Brown 
---
 include/linux/random.h | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/linux/random.h b/include/linux/random.h
index 7fd0360908d2..ea0e2f5f1ec5 100644
--- a/include/linux/random.h
+++ b/include/linux/random.h
@@ -169,19 +169,19 @@ static inline void prandom_seed_state(struct rnd_state *state, u64 seed)
 #else
 static inline bool arch_get_random_long(unsigned long *v)
 {
-   return 0;
+   return false;
 }
 static inline bool arch_get_random_int(unsigned int *v)
 {
-   return 0;
+   return false;
 }
 static inline bool arch_get_random_seed_long(unsigned long *v)
 {
-   return 0;
+   return false;
 }
 static inline bool arch_get_random_seed_int(unsigned int *v)
 {
-   return 0;
+   return false;
 }
 #endif
 
-- 
2.20.1



[PATCH v2 06/10] linux/random.h: Mark CONFIG_ARCH_RANDOM functions __must_check

2020-01-10 Thread Mark Brown
From: Richard Henderson 

We must not use the pointer output without validating the
success of the random read.

Reviewed-by: Ard Biesheuvel 
Signed-off-by: Richard Henderson 
Signed-off-by: Mark Brown 
---
 include/linux/random.h | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/linux/random.h b/include/linux/random.h
index ea0e2f5f1ec5..d319f9a1e429 100644
--- a/include/linux/random.h
+++ b/include/linux/random.h
@@ -167,19 +167,19 @@ static inline void prandom_seed_state(struct rnd_state *state, u64 seed)
 #ifdef CONFIG_ARCH_RANDOM
 # include <asm/archrandom.h>
 #else
-static inline bool arch_get_random_long(unsigned long *v)
+static inline bool __must_check arch_get_random_long(unsigned long *v)
 {
return false;
 }
-static inline bool arch_get_random_int(unsigned int *v)
+static inline bool __must_check arch_get_random_int(unsigned int *v)
 {
return false;
 }
-static inline bool arch_get_random_seed_long(unsigned long *v)
+static inline bool __must_check arch_get_random_seed_long(unsigned long *v)
 {
return false;
 }
-static inline bool arch_get_random_seed_int(unsigned int *v)
+static inline bool __must_check arch_get_random_seed_int(unsigned int *v)
 {
return false;
 }
-- 
2.20.1



[PATCH v2 07/10] x86: Mark archrandom.h functions __must_check

2020-01-10 Thread Mark Brown
From: Richard Henderson 

We must not use the pointer output without validating the
success of the random read.

Reviewed-by: Ard Biesheuvel 
Signed-off-by: Richard Henderson 
Signed-off-by: Mark Brown 
---
 arch/x86/include/asm/archrandom.h | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/archrandom.h b/arch/x86/include/asm/archrandom.h
index feb59461046c..7a4bb1bd4bdb 100644
--- a/arch/x86/include/asm/archrandom.h
+++ b/arch/x86/include/asm/archrandom.h
@@ -27,7 +27,7 @@
 
 /* Unconditional execution of RDRAND and RDSEED */
 
-static inline bool rdrand_long(unsigned long *v)
+static inline bool __must_check rdrand_long(unsigned long *v)
 {
bool ok;
unsigned int retry = RDRAND_RETRY_LOOPS;
@@ -41,7 +41,7 @@ static inline bool rdrand_long(unsigned long *v)
return false;
 }
 
-static inline bool rdrand_int(unsigned int *v)
+static inline bool __must_check rdrand_int(unsigned int *v)
 {
bool ok;
unsigned int retry = RDRAND_RETRY_LOOPS;
@@ -55,7 +55,7 @@ static inline bool rdrand_int(unsigned int *v)
return false;
 }
 
-static inline bool rdseed_long(unsigned long *v)
+static inline bool __must_check rdseed_long(unsigned long *v)
 {
bool ok;
asm volatile(RDSEED_LONG
@@ -64,7 +64,7 @@ static inline bool rdseed_long(unsigned long *v)
return ok;
 }
 
-static inline bool rdseed_int(unsigned int *v)
+static inline bool __must_check rdseed_int(unsigned int *v)
 {
bool ok;
asm volatile(RDSEED_INT
@@ -80,22 +80,22 @@ static inline bool rdseed_int(unsigned int *v)
  */
 #ifdef CONFIG_ARCH_RANDOM
 
-static inline bool arch_get_random_long(unsigned long *v)
+static inline bool __must_check arch_get_random_long(unsigned long *v)
 {
return static_cpu_has(X86_FEATURE_RDRAND) ? rdrand_long(v) : false;
 }
 
-static inline bool arch_get_random_int(unsigned int *v)
+static inline bool __must_check arch_get_random_int(unsigned int *v)
 {
return static_cpu_has(X86_FEATURE_RDRAND) ? rdrand_int(v) : false;
 }
 
-static inline bool arch_get_random_seed_long(unsigned long *v)
+static inline bool __must_check arch_get_random_seed_long(unsigned long *v)
 {
return static_cpu_has(X86_FEATURE_RDSEED) ? rdseed_long(v) : false;
 }
 
-static inline bool arch_get_random_seed_int(unsigned int *v)
+static inline bool __must_check arch_get_random_seed_int(unsigned int *v)
 {
return static_cpu_has(X86_FEATURE_RDSEED) ? rdseed_int(v) : false;
 }
-- 
2.20.1



[PATCH v2 08/10] powerpc: Use bool in archrandom.h

2020-01-10 Thread Mark Brown
From: Richard Henderson 

The generic interface uses bool not int; match that.

Reviewed-by: Ard Biesheuvel 
Signed-off-by: Richard Henderson 
Signed-off-by: Mark Brown 
---
 arch/powerpc/include/asm/archrandom.h | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/include/asm/archrandom.h b/arch/powerpc/include/asm/archrandom.h
index 2fa7cdfbba24..f0f16b4fc5ea 100644
--- a/arch/powerpc/include/asm/archrandom.h
+++ b/arch/powerpc/include/asm/archrandom.h
@@ -6,27 +6,28 @@
 
 #include <asm/machdep.h>
 
-static inline int arch_get_random_long(unsigned long *v)
+static inline bool arch_get_random_long(unsigned long *v)
 {
-   return 0;
+   return false;
 }
 
-static inline int arch_get_random_int(unsigned int *v)
+static inline bool arch_get_random_int(unsigned int *v)
 {
-   return 0;
+   return false;
 }
 
-static inline int arch_get_random_seed_long(unsigned long *v)
+static inline bool arch_get_random_seed_long(unsigned long *v)
 {
if (ppc_md.get_random_seed)
return ppc_md.get_random_seed(v);
 
-   return 0;
+   return false;
 }
-static inline int arch_get_random_seed_int(unsigned int *v)
+
+static inline bool arch_get_random_seed_int(unsigned int *v)
 {
unsigned long val;
-   int rc;
+   bool rc;
 
rc = arch_get_random_seed_long(&val);
if (rc)
-- 
2.20.1



[PATCH v2 09/10] powerpc: Mark archrandom.h functions __must_check

2020-01-10 Thread Mark Brown
From: Richard Henderson 

We must not use the pointer output without validating the
success of the random read.

Acked-by: Michael Ellerman 
Reviewed-by: Ard Biesheuvel 
Signed-off-by: Richard Henderson 
Signed-off-by: Mark Brown 
---
 arch/powerpc/include/asm/archrandom.h | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/archrandom.h b/arch/powerpc/include/asm/archrandom.h
index f0f16b4fc5ea..9a53e29680f4 100644
--- a/arch/powerpc/include/asm/archrandom.h
+++ b/arch/powerpc/include/asm/archrandom.h
@@ -6,17 +6,17 @@
 
 #include <asm/machdep.h>
 
-static inline bool arch_get_random_long(unsigned long *v)
+static inline bool __must_check arch_get_random_long(unsigned long *v)
 {
return false;
 }
 
-static inline bool arch_get_random_int(unsigned int *v)
+static inline bool __must_check arch_get_random_int(unsigned int *v)
 {
return false;
 }
 
-static inline bool arch_get_random_seed_long(unsigned long *v)
+static inline bool __must_check arch_get_random_seed_long(unsigned long *v)
 {
if (ppc_md.get_random_seed)
return ppc_md.get_random_seed(v);
@@ -24,7 +24,7 @@ static inline bool arch_get_random_seed_long(unsigned long *v)
return false;
 }
 
-static inline bool arch_get_random_seed_int(unsigned int *v)
+static inline bool __must_check arch_get_random_seed_int(unsigned int *v)
 {
unsigned long val;
bool rc;
-- 
2.20.1



[PATCH v2 10/10] s390x: Mark archrandom.h functions __must_check

2020-01-10 Thread Mark Brown
From: Richard Henderson 

We must not use the pointer output without validating the
success of the random read.

Reviewed-by: Harald Freudenberger 
Reviewed-by: Ard Biesheuvel 
Signed-off-by: Richard Henderson 
Signed-off-by: Mark Brown 
---
 arch/s390/include/asm/archrandom.h | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/s390/include/asm/archrandom.h b/arch/s390/include/asm/archrandom.h
index 9a6835137a16..de61ce562052 100644
--- a/arch/s390/include/asm/archrandom.h
+++ b/arch/s390/include/asm/archrandom.h
@@ -21,17 +21,17 @@ extern atomic64_t s390_arch_random_counter;
 
 bool s390_arch_random_generate(u8 *buf, unsigned int nbytes);
 
-static inline bool arch_get_random_long(unsigned long *v)
+static inline bool __must_check arch_get_random_long(unsigned long *v)
 {
return false;
 }
 
-static inline bool arch_get_random_int(unsigned int *v)
+static inline bool __must_check arch_get_random_int(unsigned int *v)
 {
return false;
 }
 
-static inline bool arch_get_random_seed_long(unsigned long *v)
+static inline bool __must_check arch_get_random_seed_long(unsigned long *v)
 {
if (static_branch_likely(&s390_arch_random_available)) {
return s390_arch_random_generate((u8 *)v, sizeof(*v));
@@ -39,7 +39,7 @@ static inline bool arch_get_random_seed_long(unsigned long *v)
return false;
 }
 
-static inline bool arch_get_random_seed_int(unsigned int *v)
+static inline bool __must_check arch_get_random_seed_int(unsigned int *v)
 {
if (static_branch_likely(&s390_arch_random_available)) {
return s390_arch_random_generate((u8 *)v, sizeof(*v));
-- 
2.20.1



Re: [PATCH 06/18] powerpc sstep: Add support for prefixed integer load/stores

2020-01-10 Thread Balamuruhan S
On Tue, Nov 26, 2019 at 04:21:29PM +1100, Jordan Niethe wrote:
> This adds emulation support for the following prefixed integer
> load/stores:
>   * Prefixed Load Byte and Zero (plbz)
>   * Prefixed Load Halfword and Zero (plhz)
>   * Prefixed Load Halfword Algebraic (plha)
>   * Prefixed Load Word and Zero (plwz)
>   * Prefixed Load Word Algebraic (plwa)
>   * Prefixed Load Doubleword (pld)
>   * Prefixed Store Byte (pstb)
>   * Prefixed Store Halfword (psth)
>   * Prefixed Store Word (pstw)
>   * Prefixed Store Doubleword (pstd)
>   * Prefixed Load Quadword (plq)
>   * Prefixed Store Quadword (pstq)
> 
> Signed-off-by: Jordan Niethe 
> ---
>  arch/powerpc/lib/sstep.c | 110 ++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 110 insertions(+)
> 
> diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
> index ade3f5eba2e5..4f5ad1f602d8 100644
> --- a/arch/powerpc/lib/sstep.c
> +++ b/arch/powerpc/lib/sstep.c
> @@ -187,6 +187,43 @@ static nokprobe_inline unsigned long xform_ea(unsigned int instr,
>   return ea;
>  }
>  
> +/*
> + * Calculate effective address for a MLS:D-form / 8LS:D-form prefixed instruction
> + */
> +static nokprobe_inline unsigned long mlsd_8lsd_ea(unsigned int instr,
> +   unsigned int sufx,
> +   const struct pt_regs *regs)
> +{
> + int ra, prefix_r;
> + unsigned int  dd;
> + unsigned long ea, d0, d1, d;
> +
> + prefix_r = instr & (1ul << 20);
> + ra = (sufx >> 16) & 0x1f;
> +
> + d0 = instr & 0x3ffff;
> + d1 = sufx & 0xffff;
> + d = (d0 << 16) | d1;
> +
> + /*
> +  * sign extend a 34 bit number
> +  */
> + dd = (unsigned int) (d >> 2);
> + ea = (signed int) dd;
> + ea = (ea << 2) | (d & 0x3);
> +
> + if (!prefix_r && ra)
> + ea += regs->gpr[ra];
> + else if (!prefix_r && !ra)
> + ; /* Leave ea as is */
> + else if (prefix_r && !ra)
> + ea += regs->nip;
> + else if (prefix_r && ra)
> + ; /* Invalid form. Should already be checked for by caller! */
> +
> + return ea;
> +}
> +
>  /*
>   * Return the largest power of 2, not greater than sizeof(unsigned long),
>   * such that x is a multiple of it.
> @@ -1166,6 +1203,7 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
> unsigned int instr, unsigned int sufx)
>  {
>   unsigned int opcode, ra, rb, rc, rd, spr, u;
> + unsigned int sufxopcode, prefixtype, prefix_r;
>   unsigned long int imm;
>   unsigned long int val, val2;
>   unsigned int mb, me, sh;
> @@ -2652,6 +2690,78 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
>  
>   }
>  
> +/*
> + * Prefixed instructions
> + */
> + switch (opcode) {
> + case 1:
> + prefix_r = instr & (1ul << 20);
> + ra = (sufx >> 16) & 0x1f;
> + op->update_reg = ra;
> + rd = (sufx >> 21) & 0x1f;
> + op->reg = rd;
> + op->val = regs->gpr[rd];
> +
> + sufxopcode = sufx >> 26;
> + prefixtype = (instr >> 24) & 0x3;
> + switch (prefixtype) {
> + case 0: /* Type 00  Eight-Byte Load/Store */
> + if (prefix_r && ra)
> + break;
> + op->ea = mlsd_8lsd_ea(instr, sufx, regs);
> + switch (sufxopcode) {
> + case 41:/* plwa */
> + op->type = MKOP(LOAD, PREFIXED | SIGNEXT, 4);
> + break;
> + case 56:/* plq */
> + op->type = MKOP(LOAD, PREFIXED, 16);
> + break;
> + case 57:/* pld */
> + op->type = MKOP(LOAD, PREFIXED | SIGNEXT, 8);
> + break;
> + case 60:/* stq */
> + op->type = MKOP(STORE, PREFIXED, 16);
> + break;
> + case 61:/* pstd */
> + op->type = MKOP(STORE, PREFIXED | SIGNEXT, 8);

Sorry, we don't do SIGNEXT for 1 byte below in Type 10, so is SIGNEXT for
8 byte used consciously without a definition in `do_signext()`, because we
don't really need to do anything?

-- Bala

> + break;
> + }
> + break;
> + case 1: /* Type 01 Modified Register-to-Register */
> + break;
> + case 2: /* Type 10 Modified Load/Store */
> + if (prefix_r && ra)
> + break;
> + op->ea = mlsd_8lsd_ea(instr, sufx, regs);
> + switch (sufxopcode) {
> + case 32:/* plwz */
> + op->type = MKOP(

Re: [PATCH v2 00/10] Improvements for random.h/archrandom.h

2020-01-10 Thread Borislav Petkov
On Fri, Jan 10, 2020 at 02:54:12PM +, Mark Brown wrote:
> This is a resend of a series from Richard Henderson last posted back in
> November:
> 
>
> https://lore.kernel.org/linux-arm-kernel/20191106141308.30535-1-...@twiddle.net/
> 
> Back then Borislav said they looked good and asked if he should take
> them through the tip tree but things seem to have got lost since then.

Or, alternatively, akpm could take them. In any case, if someone else
ends up doing that, for the x86 bits:

Reviewed-by: Borislav Petkov 

Thx.

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette


Re: [PATCH v4 2/9] perf/core: open access for CAP_SYS_PERFMON privileged process

2020-01-10 Thread Masami Hiramatsu
On Fri, 10 Jan 2020 15:02:34 +0100
Peter Zijlstra  wrote:

> On Thu, Jan 09, 2020 at 02:36:50PM +0300, Alexey Budankov wrote:
> > On 08.01.2020 19:07, Peter Zijlstra wrote:
> > > On Wed, Dec 18, 2019 at 12:25:35PM +0300, Alexey Budankov wrote:
> 
> > >> diff --git a/kernel/events/core.c b/kernel/events/core.c
> > >> index 059ee7116008..d9db414f2197 100644
> > >> --- a/kernel/events/core.c
> > >> +++ b/kernel/events/core.c
> > >> @@ -9056,7 +9056,7 @@ static int perf_kprobe_event_init(struct perf_event *event)
> > >>  if (event->attr.type != perf_kprobe.type)
> > >>  return -ENOENT;
> > >>  
> > >> -if (!capable(CAP_SYS_ADMIN))
> > >> +if (!perfmon_capable())
> > >>  return -EACCES;
> > >>  
> > >>  /*
> > > 
> > > This one only allows attaching to already extant kprobes, right? It does
> > > not allow creation of kprobes.
> > 
> > This unblocks creation of local trace kprobes and uprobes by 
> > CAP_SYS_PERFMON 
> > privileged process, exactly the same as for CAP_SYS_ADMIN privileged 
> > process.
> 
> I've no idea what you just said; it's just words.
> 
> Again, this only allows attaching to previously created kprobes, it does
> not allow creating kprobes, right?
> 
> That is; I don't think CAP_SYS_PERFMON should be allowed to create
> kprobes.
> 
> As might be clear; I don't actually know what the user-ABI is for
> creating kprobes.

There are 2 ABIs nowadays, ftrace and ebpf. perf-probe uses the ftrace
interface to define new kprobe events, and those events are treated exactly
the same as tracepoint events. On the other hand, ebpf defines a new probe
event via the perf_event interface. The hunk above is that interface. IOW,
it creates a new kprobe.

Thank you,


-- 
Masami Hiramatsu 
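
For the ftrace side Masami describes, a standalone sketch of how
perf-probe-style tools create a named, globally visible kprobe event (the
tracefs path and probed symbol are illustrative; needs root):

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
	/* tracefs may also be mounted at /sys/kernel/tracing */
	const char *path = "/sys/kernel/debug/tracing/kprobe_events";
	const char *cmd = "p:myprobe do_sys_open\n";
	int fd = open(path, O_WRONLY | O_APPEND);

	if (fd < 0 || write(fd, cmd, strlen(cmd)) < 0) {
		perror("kprobe_events");
		return 1;
	}
	puts("kprobe 'myprobe' created; it stays until explicitly removed");
	close(fd);
	return 0;
}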


Re: [PATCH v4 2/9] perf/core: open access for CAP_SYS_PERFMON privileged process

2020-01-10 Thread Alexey Budankov


On 10.01.2020 17:02, Peter Zijlstra wrote:
> On Thu, Jan 09, 2020 at 02:36:50PM +0300, Alexey Budankov wrote:
>> On 08.01.2020 19:07, Peter Zijlstra wrote:
>>> On Wed, Dec 18, 2019 at 12:25:35PM +0300, Alexey Budankov wrote:
> 
 diff --git a/kernel/events/core.c b/kernel/events/core.c
 index 059ee7116008..d9db414f2197 100644
 --- a/kernel/events/core.c
 +++ b/kernel/events/core.c
 @@ -9056,7 +9056,7 @@ static int perf_kprobe_event_init(struct perf_event *event)
if (event->attr.type != perf_kprobe.type)
return -ENOENT;
  
 -  if (!capable(CAP_SYS_ADMIN))
 +  if (!perfmon_capable())
return -EACCES;
  
/*
>>>
>>> This one only allows attaching to already extant kprobes, right? It does
>>> not allow creation of kprobes.
>>
>> This unblocks creation of local trace kprobes and uprobes by CAP_SYS_PERFMON 
>> privileged process, exactly the same as for CAP_SYS_ADMIN privileged process.
> 
> I've no idea what you just said; it's just words.
> 
> Again, this only allows attaching to previously created kprobes, it does
> not allow creating kprobes, right?

Not really; this allows creating a kprobe using the perf_event_open syscall,
which associates a file descriptor with the kprobe [1].

The lifetime of that kprobe equals the lifetime of the file descriptor, and
the kprobe is not visible in tracefs: /sys/kernel/debug/tracing/kprobe_events

> 
> That is; I don't think CAP_SYS_PERFMON should be allowed to create
> kprobes.
> 
> As might be clear; I don't actually know what the user-ABI is for
> creating kprobes.

~Alexey

---

[1] https://lore.kernel.org/lkml/20171206224518.3598254-1-songliubrav...@fb.com/
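
For illustration, a standalone sketch of that perf_event interface (the
kprobe PMU type must really be read from
/sys/bus/event_source/devices/kprobe/type at runtime; the hardcoded 6 and
the probed symbol are placeholders):

#include <linux/perf_event.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

int main(void)
{
	struct perf_event_attr attr;
	const char *func = "do_sys_open";	/* illustrative symbol */
	int fd;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = 6;	/* the dynamic kprobe PMU type, normally from sysfs */
	attr.config1 = (uint64_t)(uintptr_t)func;	/* kprobe_func */
	attr.config2 = 0;				/* probe offset */

	fd = syscall(__NR_perf_event_open, &attr, 0 /* pid */, -1 /* cpu */,
		     -1 /* group_fd */, 0 /* flags */);
	if (fd < 0) {
		perror("perf_event_open");
		return 1;
	}
	printf("kprobe lives as long as fd %d; not visible in tracefs\n", fd);
	return 0;
}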


Re: [PATCH v4 2/9] perf/core: open access for CAP_SYS_PERFMON privileged process

2020-01-10 Thread Arnaldo Carvalho de Melo
On Sat, Jan 11, 2020 at 12:52:13AM +0900, Masami Hiramatsu wrote:
> On Fri, 10 Jan 2020 15:02:34 +0100 Peter Zijlstra  wrote:
> > Again, this only allows attaching to previously created kprobes, it does
> > not allow creating kprobes, right?

> > That is; I don't think CAP_SYS_PERFMON should be allowed to create
> > kprobes.

> > As might be clear; I don't actually know what the user-ABI is for
> > creating kprobes.

> There are 2 ABIs nowadays, ftrace and ebpf. perf-probe uses the ftrace
> interface to define new kprobe events, and those events are treated exactly
> the same as tracepoint events. On the other hand, ebpf defines a new probe
> event via the perf_event interface. The hunk above is that interface. IOW,
> it creates a new kprobe.

Masami, any plans to make 'perf probe' use the perf_event_open()
interface for creating kprobes/uprobes?

- Arnaldo


Re: [PATCH v2 00/10] Improvements for random.h/archrandom.h

2020-01-10 Thread Theodore Y. Ts'o
On Fri, Jan 10, 2020 at 04:51:53PM +0100, Borislav Petkov wrote:
> On Fri, Jan 10, 2020 at 02:54:12PM +, Mark Brown wrote:
> > This is a resend of a series from Richard Henderson last posted back in
> > November:
> > 
> >
> > https://lore.kernel.org/linux-arm-kernel/20191106141308.30535-1-...@twiddle.net/
> > 
> > Back then Borislav said they looked good and asked if he should take
> > them through the tip tree but things seem to have got lost since then.
> 
> Or, alternatively, akpm could take them. In any case, if someone else
> ends up doing that, for the x86 bits:
> 
> Reviewed-by: Borislav Petkov 

Or I can take them through the random.git tree, since we have a lot of
changes this cycle going to Linus anyway.  Any objections?

   - Ted


Re: [PATCH v15 00/24] selftests, powerpc, x86: Memory Protection Keys

2020-01-10 Thread Aneesh Kumar K.V
Sandipan Das  writes:

> Memory protection keys enable an application to protect its address
> space from inadvertent access by its own code.
>
> This feature is now enabled on powerpc and has been available since
> 4.16-rc1. The patches move the selftests to arch neutral directory
> and enhance their test coverage.
>
> Testing
> ---
> Verified for correctness on powerpc. Need help with x86 testing as I
> do not have access to a Skylake server. Client platforms like Coffee
> Lake do not have the required feature bits set in CPUID.
>
> Changelog
> -
> Link to previous version (v14):
> https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=55981&state=*
>
> v15:
>   (1) Rebased on top of latest master.
>   (2) Addressed review comments from Dave Hansen.
>   (3) Moved code for getting or setting pkey bits to new
>   helpers. These changes replace patch 7 of v14.
>   (4) Added a fix which ensures that the correct count of
>   reserved keys is used across different platforms.
>   (5) Added a fix which ensures that the correct page size
>   is used as powerpc supports both 4K and 64K pages.
>

Any update on merging this series? Can Intel help with testing this
series on Skylake server? Possibly merging to -next will result in
automated 01.org tests?


> v14:
>   (1) Incorporated another round of comments from Dave Hansen.
>
> v13:
>   (1) Incorporated comments for Dave Hansen.
>   (2) Added one more test for correct pkey-0 behavior.
>
> v12:
>   (1) Fixed the offset of pkey field in the siginfo structure for
>   x86_64 and powerpc. And tries to use the actual field
>   if the headers have it defined.
>
> v11:
>   (1) Fixed a deadlock in the ptrace testcase.
>
> v10 and prior:
>   (1) Moved the testcase to arch neutral directory.
>   (2) Split the changes into incremental patches.
>
> Desnes A. Nunes do Rosario (1):
>   selftests/vm/pkeys: Fix number of reserved powerpc pkeys
>
> Ram Pai (17):
>   selftests/x86/pkeys: Move selftests to arch-neutral directory
>   selftests/vm/pkeys: Rename all references to pkru to a generic name
>   selftests/vm/pkeys: Move generic definitions to header file
>   selftests/vm/pkeys: Typecast the pkey register
>   selftests/vm/pkeys: Fix pkey_disable_clear()
>   selftests/vm/pkeys: Fix assertion in pkey_disable_set/clear()
>   selftests/vm/pkeys: Fix alloc_random_pkey() to make it really random
>   selftests/vm/pkeys: Introduce generic pkey abstractions
>   selftests/vm/pkeys: Introduce powerpc support
>   selftests/vm/pkeys: Fix assertion in test_pkey_alloc_exhaust()
>   selftests/vm/pkeys: Improve checks to determine pkey support
>   selftests/vm/pkeys: Associate key on a mapped page and detect access
> violation
>   selftests/vm/pkeys: Associate key on a mapped page and detect write
> violation
>   selftests/vm/pkeys: Detect write violation on a mapped
> access-denied-key page
>   selftests/vm/pkeys: Introduce a sub-page allocator
>   selftests/vm/pkeys: Test correct behaviour of pkey-0
>   selftests/vm/pkeys: Override access right definitions on powerpc
>
> Sandipan Das (3):
>   selftests: vm: pkeys: Add helpers for pkey bits
>   selftests: vm: pkeys: Use the correct huge page size
>   selftests: vm: pkeys: Use the correct page size on powerpc
>
> Thiago Jung Bauermann (2):
>   selftests/vm/pkeys: Move some definitions to arch-specific header
>   selftests/vm/pkeys: Make gcc check arguments of sigsafe_printf()
>
>  tools/testing/selftests/vm/.gitignore |   1 +
>  tools/testing/selftests/vm/Makefile   |   5 +
>  tools/testing/selftests/vm/pkey-helpers.h | 226 ++
>  tools/testing/selftests/vm/pkey-powerpc.h | 138 
>  tools/testing/selftests/vm/pkey-x86.h | 183 +
>  .../selftests/{x86 => vm}/protection_keys.c   | 688 ++
>  tools/testing/selftests/x86/.gitignore|   1 -
>  tools/testing/selftests/x86/pkey-helpers.h| 219 --
>  8 files changed, 931 insertions(+), 530 deletions(-)
>  create mode 100644 tools/testing/selftests/vm/pkey-helpers.h
>  create mode 100644 tools/testing/selftests/vm/pkey-powerpc.h
>  create mode 100644 tools/testing/selftests/vm/pkey-x86.h
>  rename tools/testing/selftests/{x86 => vm}/protection_keys.c (74%)
>  delete mode 100644 tools/testing/selftests/x86/pkey-helpers.h
>
> -- 
> 2.17.1

-aneesh
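
For readers new to the feature, a minimal sketch of the user-space API the
cover letter describes and these selftests exercise (glibc >= 2.27 wrappers;
error handling omitted):

#define _GNU_SOURCE
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
	long psize = sysconf(_SC_PAGESIZE);
	char *p = mmap(NULL, psize, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	int pkey = pkey_alloc(0, 0);

	pkey_mprotect(p, psize, PROT_READ | PROT_WRITE, pkey);
	pkey_set(pkey, PKEY_DISABLE_ACCESS);	/* any access now faults */
	pkey_set(pkey, 0);			/* access restored */
	*p = 1;
	printf("pkey %d exercised\n", pkey);
	pkey_free(pkey);
	return 0;
}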


Re: [PATCH v15 00/24] selftests, powerpc, x86: Memory Protection Keys

2020-01-10 Thread Dave Hansen
On 1/10/20 9:38 AM, Aneesh Kumar K.V wrote:
>> v15:
>>  (1) Rebased on top of latest master.
>>  (2) Addressed review comments from Dave Hansen.
>>  (3) Moved code for getting or setting pkey bits to new
>>  helpers. These changes replace patch 7 of v14.
>>  (4) Added a fix which ensures that the correct count of
>>  reserved keys is used across different platforms.
>>  (5) Added a fix which ensures that the correct page size
>>  is used as powerpc supports both 4K and 64K pages.
>>
> Any update on merging this series? Can Intel help with testing this
> series on Skylake server? Possibly merging to -next will result in
> automated 01.org tests?

Could you dump these in a git tree, please?  It will make it a wee bit
easier for me to ship the resulting tree around to a couple different
systems.


Re: [RESEND PATCH v5 2/5] arm64/crash_core: Export TCR_EL1.T1SZ in vmcoreinfo

2020-01-10 Thread James Morse
Hi Bhupesh,

On 25/12/2019 19:01, Bhupesh Sharma wrote:
> On 12/12/2019 04:02 PM, James Morse wrote:
>> On 29/11/2019 19:59, Bhupesh Sharma wrote:
>>> vabits_actual variable on arm64 indicates the actual VA space size,
>>> and allows a single binary to support both 48-bit and 52-bit VA
>>> spaces.
>>>
>>> If the ARMv8.2-LVA optional feature is present, and we are running
>>> with a 64KB page size; then it is possible to use 52-bits of address
>>> space for both userspace and kernel addresses. However, any kernel
>>> binary that supports 52-bit must also be able to fall back to 48-bit
>>> at early boot time if the hardware feature is not present.
>>>
>>> Since TCR_EL1.T1SZ indicates the size offset of the memory region
>>> addressed by TTBR1_EL1 (and hence can be used for determining the
>>> vabits_actual value) it makes more sense to export the same in
>>> vmcoreinfo rather than vabits_actual variable, as the name of the
>>> variable can change in future kernel versions, but the architectural
>>> constructs like TCR_EL1.T1SZ can be used better to indicate intended
>>> specific fields to user-space.
>>>
>>> User-space utilities like makedumpfile and crash-utility, need to
>>> read/write this value from/to vmcoreinfo
>>
>> (write?)
> 
> Yes, also write, so that the vmcoreinfo from a (crashing) arm64 system can be
> used for analysis of the root cause of the panic/crash on, say, an x86_64 host
> using utilities like crash-utility/gdb.

I read this as "User-space [...] needs to write to vmcoreinfo".


>>> for determining if a virtual address lies in the linear map range.
>>
>> I think this is a fragile example. The debugger shouldn't need to know this.
> 
> Well, that's the current user-space utility design, so I am not sure we can
> tweak that too much.
> 
>>> The user-space computation for determining whether an address lies in
>>> the linear map range is the same as we have in kernel-space:
>>>
>>>    #define __is_lm_address(addr)    (!(((u64)addr) & BIT(vabits_actual - 1)))
>>
>> This was changed with 14c127c957c1 ("arm64: mm: Flip kernel VA space"). If
>> user-space tools rely on 'knowing' the kernel memory layout, they have to
>> constantly be fixed and updated. This is a poor argument for adding this to
>> something that ends up as ABI.
> 
> See above. User-space has to rely on some ABI/guaranteed hardware-symbols
> which can be used for 'determining' the kernel memory layout.

I disagree. Everything and anything in the kernel will change. The ABI rules
apply to stuff exposed via syscalls and kernel filesystems. They do not apply
to kernel internals, like the memory layout we used yesterday. 14c127c957c1 is
a case in point.

A debugger trying to rely on this sort of thing would have to play catchup
whenever it changes.

I'm looking for a justification that isn't paper-thin. Putting 'for guessing
the memory map' in the commit message gives the impression we can support that.


>> I think a better argument is walking the kernel page tables from the core
>> dump. Core code's vmcoreinfo exports the location of the kernel page tables,
>> but in the example above you can't walk them without knowing how T1SZ was
>> configured.
> 
> Sure, both makedumpfile and crash-utility (which walks the kernel page tables 
> from the
> core dump) use this (and similar) information currently in the user-space.

[...]

>> (From memory: one of vmcore/kcore is virtually addressed, the other
>> physically. Does this fix your problem in both cases?)
>>
>>
>>> diff --git a/arch/arm64/kernel/crash_core.c b/arch/arm64/kernel/crash_core.c
>>> index ca4c3e12d8c5..f78310ba65ea 100644
>>> --- a/arch/arm64/kernel/crash_core.c
>>> +++ b/arch/arm64/kernel/crash_core.c
>>> @@ -7,6 +7,13 @@
>>>   #include 
>>>   #include 
>>
>> You need to include asm/sysreg.h for read_sysreg(), and asm/pgtable-hwdef.h
>> for the macros you added.
> 
> Ok. Will check, as I did not get any compilation errors without them, and the
> build-bot also did not raise a flag for the missing include files.

Don't trust the header jungle!


>>> +static inline u64 get_tcr_el1_t1sz(void);
> 
>> Why do you need to do this?
> 
> Without this I was getting a missing-declaration error while compiling the code.

Missing declaration?

>>> +static inline u64 get_tcr_el1_t1sz(void)
>>> +{
>>> +    return (read_sysreg(tcr_el1) & TCR_T1SZ_MASK) >> TCR_T1SZ_OFFSET;
>>> +}

Here it is! (I must be going mad...)


Thanks,

James
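
For reference, what user space would compute from the exported field
(sketch only; the TTBR1 region spans 2^(64 - T1SZ) bytes, and the
linear-map test is the pre-14c127c957c1 one quoted above, i.e. exactly the
kind of layout detail James warns may change):

	/* from the TCR_EL1.T1SZ value exported in vmcoreinfo */
	vabits_actual = 64 - tcr_el1_t1sz;	/* e.g. 64 - 16 = 48 */

	/* layout-dependent, not ABI */
	#define __is_lm_address(addr)	(!(((u64)(addr)) & BIT(vabits_actual - 1)))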


Re: [PATCH kernel] vfio/spapr/nvlink2: Skip unpinning pages on error exit

2020-01-10 Thread Alex Williamson
On Mon, 23 Dec 2019 12:09:27 +1100
Alexey Kardashevskiy  wrote:

> The nvlink2 subdriver for IBM Witherspoon machines preregisters
> GPU memory in the IOMMU API so KVM TCE code can map this memory
> for DMA as well. This is done by mm_iommu_newdev() called from
> vfio_pci_nvgpu_regops::mmap.
> 
> In an unlikely event of failure the data->mem remains NULL and
> since mm_iommu_put() (which unregisters the region and unpins memory
> if that was regular memory) does not expect mem==NULL, it should not be
> called.
> 
> This adds a check to only call mm_iommu_put() for a valid data->mem.
> 
> Fixes: 7f92891778df ("vfio_pci: Add NVIDIA GV100GL [Tesla V100 SXM2] subdriver")
> Signed-off-by: Alexey Kardashevskiy 
> ---
>  drivers/vfio/pci/vfio_pci_nvlink2.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/vfio/pci/vfio_pci_nvlink2.c b/drivers/vfio/pci/vfio_pci_nvlink2.c
> index f2983f0f84be..3f5f8198a6bb 100644
> --- a/drivers/vfio/pci/vfio_pci_nvlink2.c
> +++ b/drivers/vfio/pci/vfio_pci_nvlink2.c
> @@ -97,8 +97,10 @@ static void vfio_pci_nvgpu_release(struct vfio_pci_device *vdev,
>  
>   /* If there were any mappings at all... */
>   if (data->mm) {
> - ret = mm_iommu_put(data->mm, data->mem);
> - WARN_ON(ret);
> + if (data->mem) {
> + ret = mm_iommu_put(data->mm, data->mem);
> + WARN_ON(ret);
> + }
>  
>   mmdrop(data->mm);
>   }

Applied to vfio next branch for v5.6.  Thanks,

Alex



Re: [RFC PATCH v2 01/10] lib: vdso: ensure all arches have 32bit fallback

2020-01-10 Thread Thomas Gleixner
Andy Lutomirski  writes:

> On Mon, Dec 23, 2019 at 6:31 AM Christophe Leroy
>  wrote:
>>
>> In order to simplify next step which moves fallback call at arch
>> level, ensure all arches have a 32bit fallback instead of handling
>> the lack of 32bit fallback in the common code based
>> on VDSO_HAS_32BIT_FALLBACK
>
> I don't like this.  You've implemented what appear to be nonsensical
> fallbacks (the 32-bit fallback for a 64-bit vDSO build?  There's no
> such thing).
>
> How exactly does this simplify patch 2?

There is a patchset from Vincenzo which fell through the cracks and which
addresses the VDSO_HAS_32BIT_FALLBACK issue properly. I'm about to pick
it up. See:

 https://lore.kernel.org/lkml/20190830135902.20861-1-vincenzo.frasc...@arm.com/

Thanks,

tglx


Re: [RFC PATCH v2 01/10] lib: vdso: ensure all arches have 32bit fallback

2020-01-10 Thread Andy Lutomirski



> On Jan 10, 2020, at 10:56 AM, Thomas Gleixner  wrote:
> 
> Andy Lutomirski  writes:
> 
>>> On Mon, Dec 23, 2019 at 6:31 AM Christophe Leroy
>>>  wrote:
>>> 
>>> In order to simplify next step which moves fallback call at arch
>>> level, ensure all arches have a 32bit fallback instead of handling
>>> the lack of 32bit fallback in the common code based
>>> on VDSO_HAS_32BIT_FALLBACK
>> 
>> I don't like this.  You've implemented what appear to be nonsensical
>> fallbacks (the 32-bit fallback for a 64-bit vDSO build?  There's no
>> such thing).
>> 
>> How exactly does this simplify patch 2?
> 
> There is a patchset from Vincenzo which fell through the cracks and which
> addresses the VDSO_HAS_32BIT_FALLBACK issue properly. I'm about to pick
> it up. See:
> 
> https://lore.kernel.org/lkml/20190830135902.20861-1-vincenzo.frasc...@arm.com/
> 

Thanks.  I had been wondering why the conditionals were still there, since I 
remember seeing these patches.

> Thanks,
> 
>tglx


Re: [RFC PATCH v2 05/10] lib: vdso: inline do_hres()

2020-01-10 Thread Thomas Gleixner
Arnd Bergmann  writes:
> On Mon, Dec 23, 2019 at 3:31 PM Christophe Leroy
>  wrote:
>>
>> do_hres() is called from several places, so GCC doesn't inline
>> it at first.
>>
>> do_hres() takes a struct __kernel_timespec * parameter for
>> passing the result. In the 32 bits case, this parameter corresponds
>> to a local var in the caller. In order to provide a pointer
>> to this structure, the caller has to put it in its stack and
>> do_hres() has to write the result in the stack. This is suboptimal,
>> especially on RISC processors like powerpc.
>>
>> By making GCC inline the function, the struct __kernel_timespec
>> remains a local var using registers, avoiding the need to write and
>> read stack.
>>
>> The improvement is significant on powerpc.
>>
>> Signed-off-by: Christophe Leroy 
>
> Good idea, I can see how this ends up being an improvement
> for most of the callers.
>
> Acked-by: Arnd Bergmann 

  https://lore.kernel.org/r/20191112012724.250792-3-d...@arista.com

On the way to be applied.

Thanks,

tglx
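
(The patch body isn't quoted above; the mechanism is presumably just forcing
the inline, along the lines of this sketch rather than the actual hunk:)

-static int do_hres(const struct vdso_data *vd, clockid_t clk,
-		   struct __kernel_timespec *ts)
+static __always_inline int do_hres(const struct vdso_data *vd, clockid_t clk,
+				   struct __kernel_timespec *ts)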


Re: [RFC PATCH v2 07/10] lib: vdso: don't use READ_ONCE() in __c_kernel_time()

2020-01-10 Thread Thomas Gleixner
Christophe Leroy  writes:
>
> diff --git a/lib/vdso/gettimeofday.c b/lib/vdso/gettimeofday.c
> index 17b4cff6e5f0..5a17a9d2e6cd 100644
> --- a/lib/vdso/gettimeofday.c
> +++ b/lib/vdso/gettimeofday.c
> @@ -144,7 +144,7 @@ __cvdso_gettimeofday(const struct vdso_data *vd, struct __kernel_old_timeval *tv
>  static __maybe_unused __kernel_old_time_t
>  __cvdso_time(const struct vdso_data *vd, __kernel_old_time_t *time)
>  {
> - __kernel_old_time_t t = READ_ONCE(vd[CS_HRES_COARSE].basetime[CLOCK_REALTIME].sec);
> + __kernel_old_time_t t = vd[CS_HRES_COARSE].basetime[CLOCK_REALTIME].sec;
>  
>   if (time)
>   *time = t;

Allows the compiler to load twice, i.e. the returned value might be different
from the stored value. So no.

Thanks,

tglx
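
In other words, without READ_ONCE() the compiler may rematerialize the load,
effectively turning the function into this (sketch of the hazard, not real
code):

	if (time)
		*time = vd[CS_HRES_COARSE].basetime[CLOCK_REALTIME].sec; /* load 1 */

	return vd[CS_HRES_COARSE].basetime[CLOCK_REALTIME].sec;	 /* load 2 */

If the timekeeper updates .sec between the two loads, the stored *time and
the returned value disagree.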


Re: [RFC PATCH] powerpc/32: Switch VDSO to C implementation.

2020-01-10 Thread Thomas Gleixner
Christophe,

Christophe Leroy  writes:
> On 01/09/2020 02:05 PM, Thomas Gleixner wrote:
>> The reason why this is implemented in this way is that
>> __arch_get_hw_counter() needs a way to express that the clocksource of
>> the moment is not suitable for VDSO so that the syscall fallback gets
>> invoked.
>> 
>> Sure we could have used a pointer for the value and a return value
>> indicating the validity, but given the required uptime the resulting
>> code overhead seemed to be not worth it. At least not for me as I'm not
>> planning to be around 58 years from now :)
>> 
>
> I managed to get better code and better performance by splitting out the 
> validity check as follows. Would it be suitable for all arches?

A quick test on x86 shows only a minimal impact, but it's in the
noise. So from my side that's fine, but I can't speak for ARM[64]/MIPS.

> diff --git a/arch/powerpc/include/asm/vdso/gettimeofday.h 
> b/arch/powerpc/include/asm/vdso/gettimeofday.h
> index 689f51b0d8c9..11cdd6faa4ad 100644
> --- a/arch/powerpc/include/asm/vdso/gettimeofday.h
> +++ b/arch/powerpc/include/asm/vdso/gettimeofday.h
> @@ -114,15 +114,17 @@ int clock_getres32_fallback(clockid_t _clkid, 
> struct old_timespec32 *_ts)
>   return ret;
>   }
>
> -static __always_inline u64 __arch_get_hw_counter(s32 clock_mode)
> +static __always_inline bool __arch_is_hw_counter_valid(s32 clock_mode)
>   {
>   /*
>* clock_mode == 0 implies that vDSO are enabled otherwise
>* fallback on syscall.
>*/
> - if (clock_mode)
> - return ULLONG_MAX;
> + return clock_mode ? false : true;

I don't think we need an arch-specific function here. I'd rather convert
the boolean of ARM[64] to the x86/MIPS way of VCLOCK_* modes and let the
generic code check for clock_mode == VCLOCK_NONE.

There is some magic in ARM and MIPS which wants to be able to disable
the whole thing at compile time, but that can be handled in the generic
code with just an extra config switch.

I'll have a stab at that tomorrow.

Thanks,

tglx
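
A sketch of the shape tglx describes, assuming every arch reports a
VCLOCK_*-style mode and the generic code checks it once. The names here are
modeled on x86's vclock modes, not taken from an actual patch:

#include <stdbool.h>
#include <stdint.h>

#define VCLOCK_NONE	0	/* no vDSO-usable clocksource */
#define VCLOCK_TB	1	/* e.g. the powerpc timebase */

struct vdso_data_like { int32_t clock_mode; };

static uint64_t arch_get_hw_counter(int32_t mode)
{
	(void)mode;
	return 42;	/* stand-in for a timebase/TSC read */
}

/* The generic code decides validity from the mode, instead of each arch
 * signalling "invalid" in-band with a ULLONG_MAX cycle value. */
static bool get_cycles_checked(const struct vdso_data_like *vd,
			       uint64_t *cycles)
{
	if (vd->clock_mode == VCLOCK_NONE)
		return false;	/* caller takes the syscall fallback */

	*cycles = arch_get_hw_counter(vd->clock_mode);
	return true;
}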


Re: Re: linux-next: build warning after merge of the bpf-next tree

2020-01-10 Thread Alexandre Ghiti

Hi guys,

On 10/27/19 8:02 PM, Stephen Rothwell wrote:

Hi all,

On Fri, 18 Oct 2019 10:56:57 +1100 Stephen Rothwell  
wrote:

Hi all,

After merging the bpf-next tree, today's linux-next build (powerpc
ppc64_defconfig) produced this warning:

WARNING: 2 bad relocations
c1998a48 R_PPC64_ADDR64 _binary__btf_vmlinux_bin_start
c1998a50 R_PPC64_ADDR64 _binary__btf_vmlinux_bin_end

Introduced by commit

   8580ac9404f6 ("bpf: Process in-kernel BTF")

This warning now appears in the net-next tree build.



I'm bumping this thread because Zong also noticed that 2 new relocations for
those symbols appeared in my riscv relocatable kernel branch following
that commit.


I also noticed 2 new relocations R_AARCH64_ABS64 appearing in arm64 kernel.

Those 2 weak undefined symbols have existed since commit
341dfcf8d78e ("btf: expose BTF info through sysfs"), but it is the fact of
declaring those symbols in btf.c that produced those relocations.

I'm not sure what this all means, but this is not something I expected for
riscv for a kernel linked with -shared/-fpie. Maybe we should just leave them
at zero?


I think that deserves a deeper look if someone understands all this 
better than I do.


Alex



Re: [RESEND PATCH v5 2/5] arm64/crash_core: Export TCR_EL1.T1SZ in vmcoreinfo

2020-01-10 Thread Dave Anderson



- Original Message -
> Hi Bhupesh,
> 
> On 25/12/2019 19:01, Bhupesh Sharma wrote:
> > On 12/12/2019 04:02 PM, James Morse wrote:
> >> On 29/11/2019 19:59, Bhupesh Sharma wrote:
> >>> vabits_actual variable on arm64 indicates the actual VA space size,
> >>> and allows a single binary to support both 48-bit and 52-bit VA
> >>> spaces.
> >>>
> >>> If the ARMv8.2-LVA optional feature is present, and we are running
> >>> with a 64KB page size; then it is possible to use 52-bits of address
> >>> space for both userspace and kernel addresses. However, any kernel
> >>> binary that supports 52-bit must also be able to fall back to 48-bit
> >>> at early boot time if the hardware feature is not present.
> >>>
> >>> Since TCR_EL1.T1SZ indicates the size offset of the memory region
> >>> addressed by TTBR1_EL1 (and hence can be used for determining the
> >>> vabits_actual value) it makes more sense to export the same in
> >>> vmcoreinfo rather than vabits_actual variable, as the name of the
> >>> variable can change in future kernel versions, but the architectural
> >>> constructs like TCR_EL1.T1SZ can be used better to indicate intended
> >>> specific fields to user-space.
> >>>
> >>> User-space utilities like makedumpfile and crash-utility need to
> >>> read/write this value from/to vmcoreinfo
> >>
> >> (write?)
> > 
> > Yes, also write, so that the vmcoreinfo from a (crashing) arm64 system can
> > be used for analysis of the root cause of a panic/crash on, say, an x86_64
> > host using utilities like crash-utility/gdb.
> 
> I read this as "User-space [...] needs to write to vmcoreinfo".
> 
> 
> >>> for determining if a virtual address lies in the linear map range.
> >>
> >> I think this is a fragile example. The debugger shouldn't need to know
> >> this.
> > 
> > Well, that's the current user-space utility design, so I am not sure we can
> > tweak that too much.
> > 
> >>> The user-space computation for determining whether an address lies in
> >>> the linear map range is the same as we have in kernel-space:
> >>>
> >>>    #define __is_lm_address(addr)    (!(((u64)addr) & BIT(vabits_actual - 1)))
> >>
> >> This was changed with 14c127c957c1 ("arm64: mm: Flip kernel VA space"). If
> >> user-space tools rely on 'knowing' the kernel memory layout, they will
> >> constantly have to be fixed and updated. This is a poor argument for adding
> >> this to something that ends up as ABI.
> > 
> > See above. User-space has to rely on some ABI-guaranteed hardware symbols
> > which can be used for 'determining' the kernel memory layout.
> 
> I disagree. Everything and anything in the kernel will change. The ABI rules
> apply to stuff exposed via syscalls and kernel filesystems. They do not apply
> to kernel internals, like the memory layout we used yesterday. 14c127c957c1
> is a case in point.
> 
> A debugger trying to rely on this sort of thing would have to play catchup
> whenever it changes.

Exactly.  That's the whole point.

The crash utility and makedumpfile are not in the same league as other
user-space tools. They have always had to "play catchup" precisely because
they depend upon kernel internals, which constantly change.

Dave 
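
For reference, the computation the thread keeps circling, written out as a
sketch. It assumes T1SZ has already been parsed out of vmcoreinfo and uses
the arm64 rule that the TTBR1 region spans 64 - T1SZ bits:

#include <stdbool.h>
#include <stdint.h>

/* User-space reconstruction of the __is_lm_address() check quoted above:
 * an address is in the linear map iff the top bit of the actual VA range
 * is clear. vabits_actual is derived from TCR_EL1.T1SZ. */
static bool is_lm_address(uint64_t addr, uint64_t t1sz)
{
	uint64_t vabits_actual = 64 - t1sz;

	return !(addr & (1ULL << (vabits_actual - 1)));
}

As James notes, any such reconstruction has to track kernel layout changes
like 14c127c957c1, which is exactly the catchup problem Dave describes.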



Re: Re: linux-next: build warning after merge of the bpf-next tree

2020-01-10 Thread Alexei Starovoitov
On Fri, Jan 10, 2020 at 2:28 PM Alexandre Ghiti  wrote:
>
> Hi guys,
>
> On 10/27/19 8:02 PM, Stephen Rothwell wrote:
> > Hi all,
> >
> > On Fri, 18 Oct 2019 10:56:57 +1100 Stephen Rothwell  
> > wrote:
> >> Hi all,
> >>
> >> After merging the bpf-next tree, today's linux-next build (powerpc
> >> ppc64_defconfig) produced this warning:
> >>
> >> WARNING: 2 bad relocations
> >> c1998a48 R_PPC64_ADDR64 _binary__btf_vmlinux_bin_start
> >> c1998a50 R_PPC64_ADDR64 _binary__btf_vmlinux_bin_end
> >>
> >> Introduced by commit
> >>
> >>8580ac9404f6 ("bpf: Process in-kernel BTF")
> > This warning now appears in the net-next tree build.
> >
> >
> I'm bumping this thread because Zong also noticed that 2 new relocations for
> those symbols appeared in my riscv relocatable kernel branch following
> that commit.
>
> I also noticed 2 new relocations R_AARCH64_ABS64 appearing in arm64 kernel.
>
> Those 2 weak undefined symbols have existed since commit
> 341dfcf8d78e ("btf: expose BTF info through sysfs"), but it is the fact of
> declaring those symbols in btf.c that produced those relocations.
>
> I'm not sure what this all means, but this is not something I expected for
> riscv for a kernel linked with -shared/-fpie. Maybe we should just leave
> them at zero?
>
> I think that deserves a deeper look if someone understands all this
> better than I do.

Are you saying there is a warning for arm64 as well?
Can ppc folks explain the above warning?
What does "2 bad relocations" mean?
The code is doing:
extern char __weak _binary__btf_vmlinux_bin_start[];
extern char __weak _binary__btf_vmlinux_bin_end[];
Since they are weak they should be zero when not defined.
What's the issue?
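
For anyone unfamiliar with the pattern, a self-contained sketch of how weak
undefined array symbols behave; the btf.c usage is paraphrased, not copied:

#include <stddef.h>

/* If no object file defines these, the linker resolves them to address 0.
 * Under -fpie/-shared that absolute address still needs a relocation,
 * which appears to be what the ppc64 "2 bad relocations" check trips on. */
extern char __attribute__((weak)) _binary__btf_vmlinux_bin_start[];
extern char __attribute__((weak)) _binary__btf_vmlinux_bin_end[];

size_t btf_vmlinux_size(void)
{
	if (!_binary__btf_vmlinux_bin_start)	/* weak + undefined => NULL */
		return 0;

	return (size_t)(_binary__btf_vmlinux_bin_end -
			_binary__btf_vmlinux_bin_start);
}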


Re: [PATCH v4 2/9] perf/core: open access for CAP_SYS_PERFMON privileged process

2020-01-10 Thread Masami Hiramatsu
On Fri, 10 Jan 2020 13:45:31 -0300
Arnaldo Carvalho de Melo  wrote:

> Em Sat, Jan 11, 2020 at 12:52:13AM +0900, Masami Hiramatsu escreveu:
> > On Fri, 10 Jan 2020 15:02:34 +0100 Peter Zijlstra  
> > wrote:
> > > Again, this only allows attaching to previously created kprobes, it does
> > > not allow creating kprobes, right?
> 
> > > That is; I don't think CAP_SYS_PERFMON should be allowed to create
> > > kprobes.
> 
> > > As might be clear; I don't actually know what the user-ABI is for
> > > creating kprobes.
> 
> > There are 2 ABIs nowadays, ftrace and ebpf. perf-probe uses the ftrace
> > interface to define new kprobe events, and those events are treated exactly
> > the same as tracepoint events. On the other hand, ebpf tries to define a new
> > probe event via the perf_event interface. The above one is that interface.
> > IOW, it creates a new kprobe.
> 
> Masami, any plans to make 'perf probe' use the perf_event_open()
> interface for creating kprobes/uprobes?

Do you mean that perf probe should switch to perf_event_open()?
No, perf probe is for setting up ftrace probe events. I think we can add
an option to use perf_event_open(), but current kprobe creation via
perf_event_open() is separated from ftrace by design.

I think the reason why ebpf uses the perf_event_open() interface is to avoid
conflicts with ftrace users. Those probes are used temporarily by ebpf, but
if they appeared in ftrace, they could easily be used by ftrace users too. In
that case, they could not be removed when the ebpf program exits.

Thank you,

-- 
Masami Hiramatsu 
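
A sketch contrasting the two ABIs Masami describes. The tracefs path and the
kprobe PMU attribute layout (function name in config1, offset in config2) are
assumptions based on the dynamic-PMU perf_event_open() interface, so verify
them against the kernel documentation before relying on them:

#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/perf_event.h>

/* 1) ftrace ABI: the probe shows up in tracefs and persists until it is
 *    explicitly removed. */
void create_via_ftrace(void)
{
	int fd = open("/sys/kernel/debug/tracing/kprobe_events",
		      O_WRONLY | O_APPEND);
	const char *cmd = "p:myprobe do_sys_open\n";

	if (fd >= 0) {
		(void)write(fd, cmd, strlen(cmd));
		close(fd);
	}
}

/* 2) perf_event_open() ABI: the probe lives and dies with the fd, which
 *    is why it cannot linger in ftrace after the user exits. */
int create_via_perf(uint32_t kprobe_pmu_type)
{
	/* kprobe_pmu_type is read at runtime from
	 * /sys/bus/event_source/devices/kprobe/type (assumed path). */
	struct perf_event_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = kprobe_pmu_type;
	attr.config1 = (uint64_t)(uintptr_t)"do_sys_open";	/* kprobe_func */
	attr.config2 = 0;					/* probe_offset */

	return syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0);
}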


Re: Re: linux-next: build warning after merge of the bpf-next tree

2020-01-10 Thread Palmer Dabbelt

On Fri, 10 Jan 2020 14:28:17 PST (-0800), alexan...@ghiti.fr wrote:

Hi guys,

On 10/27/19 8:02 PM, Stephen Rothwell wrote:

Hi all,

On Fri, 18 Oct 2019 10:56:57 +1100 Stephen Rothwell  
wrote:

Hi all,

After merging the bpf-next tree, today's linux-next build (powerpc
ppc64_defconfig) produced this warning:

WARNING: 2 bad relocations
c1998a48 R_PPC64_ADDR64 _binary__btf_vmlinux_bin_start
c1998a50 R_PPC64_ADDR64 _binary__btf_vmlinux_bin_end

Introduced by commit

   8580ac9404f6 ("bpf: Process in-kernel BTF")

This warning now appears in the net-next tree build.



I'm bumping this thread because Zong also noticed that 2 new relocations for
those symbols appeared in my riscv relocatable kernel branch following
that commit.

I also noticed 2 new relocations R_AARCH64_ABS64 appearing in arm64 kernel.

Those 2 weak undefined symbols have existed since commit
341dfcf8d78e ("btf: expose BTF info through sysfs"), but it is the fact of
declaring those symbols in btf.c that produced those relocations.

I'm not sure what this all means, but this is not something I expected for
riscv for a kernel linked with -shared/-fpie. Maybe we should just leave them
at zero?

I think that deserves a deeper look if someone understands all this
better than I do.


Can you give me a pointer to your tree and how to build a relocatable kernel?
Weak undefined symbols have the absolute value 0, but the kernel is linked at
an address such that 0 can't be reached by normal means.  When I added support
to binutils for this I did it in a way that required almost no code --
essetially I just stopped dissallowing x0 as a possible base register for PCREL
relocations, which results in 0 always being accessible.  I just wanted to get
the kernel to build again, so I didn't worry about chasing around all the
addressing modes.  The PIC/PIE support generates different relocations and I
wouldn't be surprised if I just missed one (or more likely all) of them.

It's probably a simple fix, though I feel like every time I say that about the
linker I end up spending a month in there...


Re: [PATCH v4 2/9] perf/core: open access for CAP_SYS_PERFMON privileged process

2020-01-10 Thread Song Liu



> On Jan 10, 2020, at 3:47 PM, Masami Hiramatsu  wrote:
> 
> On Fri, 10 Jan 2020 13:45:31 -0300
> Arnaldo Carvalho de Melo  wrote:
> 
>> Em Sat, Jan 11, 2020 at 12:52:13AM +0900, Masami Hiramatsu escreveu:
>>> On Fri, 10 Jan 2020 15:02:34 +0100 Peter Zijlstra wrote:
>>>> Again, this only allows attaching to previously created kprobes, it does
>>>> not allow creating kprobes, right?
>> 
>>>> That is; I don't think CAP_SYS_PERFMON should be allowed to create
>>>> kprobes.
>> 
>>>> As might be clear; I don't actually know what the user-ABI is for
>>>> creating kprobes.
>> 
>>> There are 2 ABIs nowadays, ftrace and ebpf. perf-probe uses the ftrace
>>> interface to define new kprobe events, and those events are treated exactly
>>> the same as tracepoint events. On the other hand, ebpf tries to define a new
>>> probe event via the perf_event interface. The above one is that interface.
>>> IOW, it creates a new kprobe.
>> 
>> Masami, any plans to make 'perf probe' use the perf_event_open()
>> interface for creating kprobes/uprobes?
> 
> Do you mean that perf probe should switch to perf_event_open()?
> No, perf probe is for setting up ftrace probe events. I think we can add
> an option to use perf_event_open(), but current kprobe creation via
> perf_event_open() is separated from ftrace by design.

I guess we can extend the event parser to understand kprobe directly. Instead of

perf probe kernel_func
perf stat/record -e probe:kernel_func ...

We can just do 

perf stat/record -e kprobe:kernel_func ...

Thanks,
Song

Re: [PATCH v4 2/9] perf/core: open access for CAP_SYS_PERFMON privileged process

2020-01-10 Thread Arnaldo Carvalho de Melo

On January 10, 2020 9:23:27 PM GMT-03:00, Song Liu  
wrote:
>
>
>> On Jan 10, 2020, at 3:47 PM, Masami Hiramatsu wrote:
>> 
>> On Fri, 10 Jan 2020 13:45:31 -0300
>> Arnaldo Carvalho de Melo  wrote:
>> 
>>> Em Sat, Jan 11, 2020 at 12:52:13AM +0900, Masami Hiramatsu escreveu:
>>>> On Fri, 10 Jan 2020 15:02:34 +0100 Peter Zijlstra wrote:
>>>>> Again, this only allows attaching to previously created kprobes, it does
>>>>> not allow creating kprobes, right?
>>> 
>>>>> That is; I don't think CAP_SYS_PERFMON should be allowed to create
>>>>> kprobes.
>>> 
>>>>> As might be clear; I don't actually know what the user-ABI is for
>>>>> creating kprobes.
>>> 
>>>> There are 2 ABIs nowadays, ftrace and ebpf. perf-probe uses the ftrace
>>>> interface to define new kprobe events, and those events are treated exactly
>>>> the same as tracepoint events. On the other hand, ebpf tries to define a new
>>>> probe event via the perf_event interface. The above one is that interface.
>>>> IOW, it creates a new kprobe.
>>> 
>>> Masami, any plans to make 'perf probe' use the perf_event_open()
>>> interface for creating kprobes/uprobes?
>> 
>> Do you mean that perf probe should switch to perf_event_open()?
>> No, perf probe is for setting up ftrace probe events. I think we can add
>> an option to use perf_event_open(), but current kprobe creation via
>> perf_event_open() is separated from ftrace by design.
> 
> I guess we can extend the event parser to understand kprobe directly.
> Instead of
> 
> 	perf probe kernel_func
> 	perf stat/record -e probe:kernel_func ...
> 
> We can just do
> 
> 	perf stat/record -e kprobe:kernel_func ...


You took the words right out of my mouth. Exactly: that is a perfect use
case, an alternative to the 'perf probe' approach of making a disabled event
that then gets activated via record/stat/trace. In many cases it's better, as
it removes the explicit probe setup step.

Regards, 

- Arnaldo

>
>Thanks,
>Song