date:20181031

With preempt enabled we see warnings in do_slb_fault():

  BUG: using smp_processor_id() in preemptible [] code: kworker/u33:0/98
  futex hash table entries: 4096 (order: 3, 524288 bytes)
  caller is do_slb_fault+0x204/0x230
  CPU: 5 PID: 98 Comm: kworker/u33:0 Not tainted 
4.19.0-rc3-gcc-7.3.1-00022-g1936f094e164 #138
  Call Trace:
dump_stack+0xb4/0x104 (unreliable)
check_preemption_disabled+0x148/0x150
do_slb_fault+0x204/0x230
data_access_slb_common+0x138/0x180

This is caused by the get_paca() in slb_allocate_kernel(), which
includes a call to debug_smp_processor_id().

slb_allocate_kernel() can only be called from do_slb_fault(), and in
that path interrupts are hard disabled and so we can't be preempted,
but we can't update the preempt flags (in thread_info) because that
could cause an SLB fault.

So just use local_paca which is safe and doesn't cause the warning.

Fixes: 48e7b7695745 ("powerpc/64s/hash: Convert SLB miss handlers to C")
Signed-off-by: Michael Ellerman 
---
 arch/powerpc/mm/slb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/mm/slb.c b/arch/powerpc/mm/slb.c
index c3fdf2969d9f..2f5c0a10fac1 100644
--- a/arch/powerpc/mm/slb.c
+++ b/arch/powerpc/mm/slb.c
@@ -715,7 +715,7 @@ static long slb_allocate_kernel(unsigned long ea, unsigned 
long id)
return -EFAULT;
 
if (ea < H_VMALLOC_END)
-   flags = get_paca()->vmalloc_sllp;
+   flags = local_paca->vmalloc_sllp;
else
flags = SLB_VSID_KERNEL | 
mmu_psize_defs[mmu_io_psize].sllp;
} else {
-- 
2.17.2

Re: [PATCH] powerpc/powernv: remove dead npu-dma code

2018-10-31 Thread Alexey Kardashevskiy




On 31/10/2018 00:31, Christoph Hellwig wrote:
> This code has never been unused in the kernel since it was merged, and
> there has been no attempt that I could find to even submit users for
> it.  Besides the general policy of not keep 1000+ lines of dead
> code, it helps cleaning up the DMA code and making powerpc user common
> infrastructure.
> 
> This effectively reverts commit 5d2aa710 ("powerpc/powernv: Add support
> for Nvlink NPUs").
> 
> Signed-off-by: Christoph Hellwig 
> ---
> 
> [this is a follow up to the brief maintainer summit discussion]
> 
>  arch/powerpc/include/asm/pci.h|   3 -
>  arch/powerpc/include/asm/powernv.h|  23 -
>  arch/powerpc/platforms/powernv/Makefile   |   2 +-
>  arch/powerpc/platforms/powernv/npu-dma.c  | 989 --
>  arch/powerpc/platforms/powernv/pci-ioda.c | 243 --
>  arch/powerpc/platforms/powernv/pci.h  |  11 -
>  6 files changed, 1 insertion(+), 1270 deletions(-)
>  delete mode 100644 arch/powerpc/platforms/powernv/npu-dma.c
> 
> diff --git a/arch/powerpc/include/asm/pci.h b/arch/powerpc/include/asm/pci.h
> index 2af9ded80540..a01d2e3d6ff9 100644
> --- a/arch/powerpc/include/asm/pci.h
> +++ b/arch/powerpc/include/asm/pci.h
> @@ -127,7 +127,4 @@ extern void pcibios_scan_phb(struct pci_controller *hose);
>  
>  #endif   /* __KERNEL__ */
>  
> -extern struct pci_dev *pnv_pci_get_gpu_dev(struct pci_dev *npdev);
> -extern struct pci_dev *pnv_pci_get_npu_dev(struct pci_dev *gpdev, int index);
> -
>  #endif /* __ASM_POWERPC_PCI_H */
> diff --git a/arch/powerpc/include/asm/powernv.h 
> b/arch/powerpc/include/asm/powernv.h
> index 2f3ff7a27881..4848a6b3c6b2 100644
> --- a/arch/powerpc/include/asm/powernv.h
> +++ b/arch/powerpc/include/asm/powernv.h
> @@ -11,33 +11,10 @@
>  #define _ASM_POWERNV_H
>  
>  #ifdef CONFIG_PPC_POWERNV
> -#define NPU2_WRITE 1
>  extern void powernv_set_nmmu_ptcr(unsigned long ptcr);
> -extern struct npu_context *pnv_npu2_init_context(struct pci_dev *gpdev,
> - unsigned long flags,
> - void (*cb)(struct npu_context *, void *),
> - void *priv);
> -extern void pnv_npu2_destroy_context(struct npu_context *context,
> - struct pci_dev *gpdev);
> -extern int pnv_npu2_handle_fault(struct npu_context *context, uintptr_t *ea,
> - unsigned long *flags, unsigned long *status,
> - int count);
> -
>  void pnv_tm_init(void);
>  #else
>  static inline void powernv_set_nmmu_ptcr(unsigned long ptcr) { }
> -static inline struct npu_context *pnv_npu2_init_context(struct pci_dev 
> *gpdev,
> - unsigned long flags,
> - struct npu_context *(*cb)(struct npu_context *, void *),
> - void *priv) { return ERR_PTR(-ENODEV); }
> -static inline void pnv_npu2_destroy_context(struct npu_context *context,
> - struct pci_dev *gpdev) { }
> -
> -static inline int pnv_npu2_handle_fault(struct npu_context *context,
> - uintptr_t *ea, unsigned long *flags,
> - unsigned long *status, int count) {
> - return -ENODEV;
> -}
>  
>  static inline void pnv_tm_init(void) { }
>  static inline void pnv_power9_force_smt4(void) { }
> diff --git a/arch/powerpc/platforms/powernv/Makefile 
> b/arch/powerpc/platforms/powernv/Makefile
> index b540ce8eec55..2b13e9dd137c 100644
> --- a/arch/powerpc/platforms/powernv/Makefile
> +++ b/arch/powerpc/platforms/powernv/Makefile
> @@ -6,7 +6,7 @@ obj-y += opal-msglog.o opal-hmi.o 
> opal-power.o opal-irqchip.o
>  obj-y+= opal-kmsg.o opal-powercap.o opal-psr.o 
> opal-sensor-groups.o
>  
>  obj-$(CONFIG_SMP)+= smp.o subcore.o subcore-asm.o
> -obj-$(CONFIG_PCI)+= pci.o pci-ioda.o npu-dma.o pci-ioda-tce.o
> +obj-$(CONFIG_PCI)+= pci.o pci-ioda.o pci-ioda-tce.o
>  obj-$(CONFIG_CXL_BASE)   += pci-cxl.o
>  obj-$(CONFIG_EEH)+= eeh-powernv.o
>  obj-$(CONFIG_PPC_SCOM)   += opal-xscom.o
> diff --git a/arch/powerpc/platforms/powernv/npu-dma.c 
> b/arch/powerpc/platforms/powernv/npu-dma.c
> deleted file mode 100644
> index 6f60e0931922..
> --- a/arch/powerpc/platforms/powernv/npu-dma.c
> +++ /dev/null
> @@ -1,989 +0,0 @@
> -/*
> - * This file implements the DMA operations for NVLink devices. The NPU
> - * devices all point to the same iommu table as the parent PCI device.
> - *
> - * Copyright Alistair Popple, IBM Corporation 2015.
> - *
> - * This program is free software; you can redistribute it and/or
> - * modify it under the terms of version 2 of the GNU General Public
> - * License as published by the Free Software Foundation.
> - */
> -
> -#include 
> -#include 
> -#include 
> -#include 
> -#include 
> -#include 
> -#include 
> -#include 
> -#include 
> -
> -#include 
> -#include 
> -#include 
> -#include 
> -#include 
> -#include 
>

Re: [PATCH] raid6/ppc: Fix build for clang

Nick Desaulniers  writes:
> On Tue, Oct 30, 2018 at 8:28 PM Joel Stanley  wrote:
>>
>> We cannot build these files with clang as it does not allow altivec
>> instructions in assembly when -msoft-float is passed.
>>
>> Jinsong Ji  wrote:
>> > We currently disable Altivec/VSX support when enabling soft-float.  So
>> > any usage of vector builtins will break.
>> >
>> > Enable Altivec/VSX with soft-float may need quite some clean up work, so
>> > I guess this is currently a limitation.
>> >
>> > Removing -msoft-float will make it work (and we are lucky that no
>> > floating point instructions will be generated as well).
>>
>> This is a workaround until the issue is resolved in clang.
>>
>> Link: https://bugs.llvm.org/show_bug.cgi?id=31177
>> Link: https://github.com/ClangBuiltLinux/linux/issues/239
>> Signed-off-by: Joel Stanley 
>
> Hopefully we can have this fixed up soon.  I'll try to see who works
> on Clang+PPC here and see if they can help move that bug along. Thanks
> for the patch!
>
>> ---
>>  lib/raid6/Makefile | 15 +++
>>  1 file changed, 15 insertions(+)
>>
>> diff --git a/lib/raid6/Makefile b/lib/raid6/Makefile
>> index 2f8b61dfd9b0..3a844e6fd01c 100644
>> --- a/lib/raid6/Makefile
>> +++ b/lib/raid6/Makefile
>> @@ -18,6 +18,21 @@ quiet_cmd_unroll = UNROLL  $@
>>
>>  ifeq ($(CONFIG_ALTIVEC),y)
>>  altivec_flags := -maltivec $(call cc-option,-mabi=altivec)
>> +
>> +ifdef CONFIG_CC_IS_CLANG
>
> Note that the top level makefile detects clang via:
>
> ifeq ($(cc-name),clang)
>
> Does that not work here? I'd prefer to keep compiler detection in
> Makefiles consistent, if it works?

cc-name is being removed in favor of CONFIG_CC_IS_CLANG:

https://lore.kernel.org/lkml/1540905994-6073-1-git-send-email-yamada.masah...@socionext.com/

cheers

Re: PIE binaries are no longer mapped below 4 GiB on ppc64le

Hi Florian,

Florian Weimer  writes:
> We tried to use Go to build PIE binaries, and while the Go toolchain is
> definitely not ready (it produces text relocations and problematic
> relocations in general), it exposed what could be an accidental
> userspace ABI change.
>
> With our 4.10-derived kernel, PIE binaries are mapped below 4 GiB, so
> relocations like R_PPC64_ADDR16_HA work:
>
> 21f0-220d r-xp  fd:00 36593493   
> /root/extld
> 220d-220e r--p 001c fd:00 36593493   
> /root/extld
> 220e-2210 rw-p 001d fd:00 36593493   
> /root/extld
...
>
> With a 4.18-derived kernel (with the hashed mm), we get this instead:
>
> 120e6-12103 rw-p  fd:00 102447141
> /root/extld
> 12103-12106 rw-p 001c fd:00 102447141
> /root/extld
> 12106-12108 rw-p  00:00 0 

I assume that's caused by:

  47ebb09d5485 ("powerpc: move ELF_ET_DYN_BASE to 4GB / 4MB")

Which did roughly:

  -#define ELF_ET_DYN_BASE  0x2000
  +#define ELF_ET_DYN_BASE  (is_32bit_task() ? 0x00040UL : \
  +0x1UL)

And went into 4.13.

> ...
> I'm not entirely sure what to make of this, but I'm worried that this
> could be a regression that matters to userspace.

It was a deliberate change, and it seemed to not break anything so we
merged it. But obviously we didn't test widely enough.

So I guess it clearly can matter to userspace, and it used to work, so
therefore it is a regression.

But at the same time we haven't had any other reports of breakage, so is
this somehow specific to something Go is doing? Or did we just get lucky
up until now? Or is no one actually testing on Power? ;)

cheers

Re: [PATCH 0/5] Guarded Userspace Access Prevention on Radix

2018-10-31 Thread Russell Currey

On Wed, 2018-10-31 at 17:58 +0100, LEROY Christophe wrote:
> Russell Currey  a écrit :
> 
> > On Fri, 2018-10-26 at 18:29 +0200, LEROY Christophe wrote:
> > > Russell Currey  a écrit :
> > > 
> > > > Guarded Userspace Access Prevention is a security mechanism
> > > > that
> > > > prevents
> > > > the kernel from being able to read and write userspace
> > > > addresses
> > > > outside of
> > > > the allowed paths, most commonly copy_{to/from}_user().
> > > > 
> > > > At present, the only CPU that supports this is POWER9, and only
> > > > while using
> > > > the Radix MMU.  Privileged reads and writes cannot access user
> > > > data
> > > > when
> > > > key 0 of the AMR is set.  This is described in the "Radix Tree
> > > > Translation
> > > > Storage Protection" section of the POWER ISA as of version 3.0.
> > > 
> > > It is not right that only power9 can support that.
> > 
> > It's true that not only P9 can support it, but there are more
> > considerations under hash than radix, implementing this for radix
> > is a
> > first step.
> 
> I don't know much about hash, but I was talking about the 8xx which
> is  
> a nohash ppc32. I'll see next week if I can do something with it on  
> top of your serie.

My small brain saw the number 8 and assumed you were talking about
POWER8, I didn't know what 8xx was until now.

Working on a refactor to make things a bit more generic, and removing
the radix name and dependency from the config option.

> 
> > > The 8xx has mmu access protection registers which can serve the
> > > same
> > > purpose. Today on the 8xx kernel space is group 0 and user space
> > > is
> > > group 1. Group 0 is set to "page defined access permission" in
> > > MD_AP
> > > and MI_AP registers, and group 1 is set to "all accesses done
> > > with
> > > supervisor rights". By setting group 1 to "user and supervisor
> > > interpretation swapped" we can forbid kernel access to user space
> > > while still allowing user access to it. Then by simply changing
> > > group
> > > 1 mode at dedicated places we can lock/unlock kernel access to
> > > user
> > > space.
> > > 
> > > Could you implement something as generic as possible having that
> > > in
> > > mind for a future patch ?
> > 
> > I don't think anything in this series is particularly problematic
> > in
> > relation to future work for hash.  I am interested in doing a hash
> > implementation in future too.
> 
> I think we have to look at that carrefuly to avoid uggly ifdefs
> 
> Christophe
> 
> > - Russell
> > 
> > > Christophe
> > > 
> > > > GUAP code sets key 0 of the AMR (thus disabling accesses of
> > > > user
> > > > data)
> > > > early during boot, and only ever "unlocks" access prior to
> > > > certain
> > > > operations, like copy_{to/from}_user(), futex ops,
> > > > etc.  Setting
> > > > this does
> > > > not prevent unprivileged access, so userspace can operate fine
> > > > while access
> > > > is locked.
> > > > 
> > > > There is a performance impact, although I don't consider it
> > > > heavy.  Running
> > > > a worst-case benchmark of a 1GB copy 1 byte at a time (and thus
> > > > constant
> > > > read(1) write(1) syscalls), I found enabling GUAP to be 3.5%
> > > > slower
> > > > than
> > > > when disabled.  In most cases, the difference is
> > > > negligible.  The
> > > > main
> > > > performance impact is the mtspr instruction, which is quite
> > > > slow.
> > > > 
> > > > There are a few caveats with this series that could be improved
> > > > upon in
> > > > future.  Right now there is no saving and restoring of the AMR
> > > > value -
> > > > there is no userspace exploitation of the AMR on Radix in
> > > > POWER9,
> > > > but if
> > > > this were to change in future, saving and restoring the value
> > > > would
> > > > be
> > > > necessary.
> > > > 
> > > > No attempt to optimise cases of repeated calls - for example,
> > > > if
> > > > some
> > > > code was repeatedly calling copy_to_user() for small sizes very
> > > > frequently,
> > > > it would be slower than the equivalent of wrapping that code in
> > > > an
> > > > unlock
> > > > and lock and only having to modify the AMR once.
> > > > 
> > > > There are some interesting cases that I've attempted to handle,
> > > > such as if
> > > > the AMR is unlocked (i.e. because a copy_{to_from}_user is in
> > > > progress)...
> > > > 
> > > > - and an exception is taken, the kernel would then be
> > > > running
> > > > with the
> > > > AMR unlocked and freely able to access userspace again.  I
> > > > am
> > > > working
> > > > around this by storing a flag in the PACA to indicate if
> > > > the
> > > > AMR is
> > > > unlocked (to save a costly SPR read), and if so, locking
> > > > the
> > > > AMR in
> > > > the exception entry path and unlocking it on the way out.
> > > > 
> > > > - and gets context switched out, goes into a path that
> > > > locks
> > > > the AMR,
> > > > then context switches back, access will be disabled and
> > > > will
> > > > fault.
> >

Re: [RFC PATCH] lib: Introduce generic __cmpxchg_u64() and use it where needed

2018-10-31 Thread Guenter Roeck


On 10/31/18 4:32 PM, Paul Burton wrote:

(Copying SunRPC & net maintainers.)

Hi Guenter,

On Wed, Oct 31, 2018 at 03:02:53PM -0700, Guenter Roeck wrote:

The alternatives I can see are
- Do not use cmpxchg64() outside architecture code (ie drop its use from
   the offending driver, and keep doing the same whenever the problem comes
   up again).
or
- Introduce something like ARCH_HAS_CMPXCHG64 and use it to determine
   if cmpxchg64 is supported or not.

Any preference ?


My preference would be option 1 - avoiding cmpxchg64() where possible in
generic code. I wouldn't be opposed to the Kconfig option if there are
cases where cmpxchg64() can really help performance though.

The last time I'm aware of this coming up the affected driver was
modified to avoid cmpxchg64() [1].

In this particular case I have no idea why
net/sunrpc/auth_gss/gss_krb5_seal.c is using cmpxchg64() at all. It's
essentially reinventing atomic64_fetch_inc() which is already provided
everywhere via CONFIG_GENERIC_ATOMIC64 & the spinlock approach. At least
for atomic64_* functions the assumption that all access will be
performed using those same functions seems somewhat reasonable.

So how does the below look? Trond?



For my part I agree that this would be a much better solution. The argument
that it is not always absolutely guaranteed that atomics don't wrap doesn't
really hold for me because it looks like they all do. On top of that, there
is an explicit atomic_dec_if_positive() and atomic_fetch_add_unless(),
which to me strongly suggests that they _are_ supposed to wrap.
Given the cost of adding a comparison to each atomic operation to
prevent it from wrapping, anything else would not really make sense to me.

So ... please consider my patch abandoned. Thanks for looking into this!

Guenter


Thanks,
 Paul

[1] https://patchwork.ozlabs.org/cover/891284/

---
diff --git a/include/linux/sunrpc/gss_krb5.h b/include/linux/sunrpc/gss_krb5.h
index 131424cefc6a..02c0412e368c 100644
--- a/include/linux/sunrpc/gss_krb5.h
+++ b/include/linux/sunrpc/gss_krb5.h
@@ -107,8 +107,8 @@ struct krb5_ctx {
u8  Ksess[GSS_KRB5_MAX_KEYLEN]; /* session key */
u8  cksum[GSS_KRB5_MAX_KEYLEN];
s32 endtime;
-   u32 seq_send;
-   u64 seq_send64;
+   atomic_tseq_send;
+   atomic64_t  seq_send64;
struct xdr_netobj   mech_used;
u8  initiator_sign[GSS_KRB5_MAX_KEYLEN];
u8  acceptor_sign[GSS_KRB5_MAX_KEYLEN];
@@ -118,9 +118,6 @@ struct krb5_ctx {
u8  acceptor_integ[GSS_KRB5_MAX_KEYLEN];
  };
  
-extern u32 gss_seq_send_fetch_and_inc(struct krb5_ctx *ctx);

-extern u64 gss_seq_send64_fetch_and_inc(struct krb5_ctx *ctx);
-
  /* The length of the Kerberos GSS token header */
  #define GSS_KRB5_TOK_HDR_LEN  (16)
  
diff --git a/net/sunrpc/auth_gss/gss_krb5_mech.c b/net/sunrpc/auth_gss/gss_krb5_mech.c

index 7f0424dfa8f6..eab71fc7af3e 100644
--- a/net/sunrpc/auth_gss/gss_krb5_mech.c
+++ b/net/sunrpc/auth_gss/gss_krb5_mech.c
@@ -274,6 +274,7 @@ get_key(const void *p, const void *end,
  static int
  gss_import_v1_context(const void *p, const void *end, struct krb5_ctx *ctx)
  {
+   u32 seq_send;
int tmp;
  
  	p = simple_get_bytes(p, end, >initiate, sizeof(ctx->initiate));

@@ -315,9 +316,10 @@ gss_import_v1_context(const void *p, const void *end, 
struct krb5_ctx *ctx)
p = simple_get_bytes(p, end, >endtime, sizeof(ctx->endtime));
if (IS_ERR(p))
goto out_err;
-   p = simple_get_bytes(p, end, >seq_send, sizeof(ctx->seq_send));
+   p = simple_get_bytes(p, end, _send, sizeof(seq_send));
if (IS_ERR(p))
goto out_err;
+   atomic_set(>seq_send, seq_send);
p = simple_get_netobj(p, end, >mech_used);
if (IS_ERR(p))
goto out_err;
@@ -607,6 +609,7 @@ static int
  gss_import_v2_context(const void *p, const void *end, struct krb5_ctx *ctx,
gfp_t gfp_mask)
  {
+   u64 seq_send64;
int keylen;
  
  	p = simple_get_bytes(p, end, >flags, sizeof(ctx->flags));

@@ -617,14 +620,15 @@ gss_import_v2_context(const void *p, const void *end, 
struct krb5_ctx *ctx,
p = simple_get_bytes(p, end, >endtime, sizeof(ctx->endtime));
if (IS_ERR(p))
goto out_err;
-   p = simple_get_bytes(p, end, >seq_send64, sizeof(ctx->seq_send64));
+   p = simple_get_bytes(p, end, _send64, sizeof(seq_send64));
if (IS_ERR(p))
goto out_err;
+   atomic64_set(>seq_send64, seq_send64);
/* set seq_send for use by "older" enctypes */
-   ctx->seq_send = ctx->seq_send64;
-   if (ctx->seq_send64 != ctx->seq_send) {
-   dprintk("%s: seq_send64 %lx, seq_send %x overflow?\n", __func__,
-   (unsigned

Re: [RFC PATCH] lib: Introduce generic __cmpxchg_u64() and use it where needed

2018-10-31 Thread Trond Myklebust

Hi Paul,

On Wed, 2018-10-31 at 23:32 +, Paul Burton wrote:
> (Copying SunRPC & net maintainers.)
> 
> Hi Guenter,
> 
> On Wed, Oct 31, 2018 at 03:02:53PM -0700, Guenter Roeck wrote:
> > The alternatives I can see are
> > - Do not use cmpxchg64() outside architecture code (ie drop its use
> > from
> >   the offending driver, and keep doing the same whenever the
> > problem comes
> >   up again).
> > or
> > - Introduce something like ARCH_HAS_CMPXCHG64 and use it to
> > determine
> >   if cmpxchg64 is supported or not.
> > 
> > Any preference ?
> 
> My preference would be option 1 - avoiding cmpxchg64() where possible
> in
> generic code. I wouldn't be opposed to the Kconfig option if there
> are
> cases where cmpxchg64() can really help performance though.
> 
> The last time I'm aware of this coming up the affected driver was
> modified to avoid cmpxchg64() [1].
> 
> In this particular case I have no idea why
> net/sunrpc/auth_gss/gss_krb5_seal.c is using cmpxchg64() at all. It's
> essentially reinventing atomic64_fetch_inc() which is already
> provided
> everywhere via CONFIG_GENERIC_ATOMIC64 & the spinlock approach. At
> least
> for atomic64_* functions the assumption that all access will be
> performed using those same functions seems somewhat reasonable.
> 
> So how does the below look? Trond?
> 

My one question (and the reason why I went with cmpxchg() in the first
place) would be about the overflow behaviour for atomic_fetch_inc() and
friends. I believe those functions should be OK on x86, so that when we
overflow the counter, it behaves like an unsigned value and wraps back
around.  Is that the case for all architectures?

i.e. are atomic_t/atomic64_t always guaranteed to behave like u32/u64
on increment?

I could not find any documentation that explicitly stated that they
should.

Cheers
  Trond

> Thanks,
> Paul
> 
> [1] https://patchwork.ozlabs.org/cover/891284/
> 
> ---
> diff --git a/include/linux/sunrpc/gss_krb5.h
> b/include/linux/sunrpc/gss_krb5.h
> index 131424cefc6a..02c0412e368c 100644
> --- a/include/linux/sunrpc/gss_krb5.h
> +++ b/include/linux/sunrpc/gss_krb5.h
> @@ -107,8 +107,8 @@ struct krb5_ctx {
>   u8  Ksess[GSS_KRB5_MAX_KEYLEN]; /*
> session key */
>   u8  cksum[GSS_KRB5_MAX_KEYLEN];
>   s32 endtime;
> - u32 seq_send;
> - u64 seq_send64;
> + atomic_tseq_send;
> + atomic64_t  seq_send64;
>   struct xdr_netobj   mech_used;
>   u8  initiator_sign[GSS_KRB5_MAX_KEYLEN];
>   u8  acceptor_sign[GSS_KRB5_MAX_KEYLEN];
> @@ -118,9 +118,6 @@ struct krb5_ctx {
>   u8  acceptor_integ[GSS_KRB5_MAX_KEYLEN];
>  };
>  
> -extern u32 gss_seq_send_fetch_and_inc(struct krb5_ctx *ctx);
> -extern u64 gss_seq_send64_fetch_and_inc(struct krb5_ctx *ctx);
> -
>  /* The length of the Kerberos GSS token header */
>  #define GSS_KRB5_TOK_HDR_LEN (16)
>  
> diff --git a/net/sunrpc/auth_gss/gss_krb5_mech.c
> b/net/sunrpc/auth_gss/gss_krb5_mech.c
> index 7f0424dfa8f6..eab71fc7af3e 100644
> --- a/net/sunrpc/auth_gss/gss_krb5_mech.c
> +++ b/net/sunrpc/auth_gss/gss_krb5_mech.c
> @@ -274,6 +274,7 @@ get_key(const void *p, const void *end,
>  static int
>  gss_import_v1_context(const void *p, const void *end, struct
> krb5_ctx *ctx)
>  {
> + u32 seq_send;
>   int tmp;
>  
>   p = simple_get_bytes(p, end, >initiate, sizeof(ctx-
> >initiate));
> @@ -315,9 +316,10 @@ gss_import_v1_context(const void *p, const void
> *end, struct krb5_ctx *ctx)
>   p = simple_get_bytes(p, end, >endtime, sizeof(ctx-
> >endtime));
>   if (IS_ERR(p))
>   goto out_err;
> - p = simple_get_bytes(p, end, >seq_send, sizeof(ctx-
> >seq_send));
> + p = simple_get_bytes(p, end, _send, sizeof(seq_send));
>   if (IS_ERR(p))
>   goto out_err;
> + atomic_set(>seq_send, seq_send);
>   p = simple_get_netobj(p, end, >mech_used);
>   if (IS_ERR(p))
>   goto out_err;
> @@ -607,6 +609,7 @@ static int
>  gss_import_v2_context(const void *p, const void *end, struct
> krb5_ctx *ctx,
>   gfp_t gfp_mask)
>  {
> + u64 seq_send64;
>   int keylen;
>  
>   p = simple_get_bytes(p, end, >flags, sizeof(ctx->flags));
> @@ -617,14 +620,15 @@ gss_import_v2_context(const void *p, const void
> *end, struct krb5_ctx *ctx,
>   p = simple_get_bytes(p, end, >endtime, sizeof(ctx-
> >endtime));
>   if (IS_ERR(p))
>   goto out_err;
> - p = simple_get_bytes(p, end, >seq_send64, sizeof(ctx-
> >seq_send64));
> + p = simple_get_bytes(p, end, _send64, sizeof(seq_send64));
>   if (IS_ERR(p))
>   goto out_err;
> + atomic64_set(>seq_send64, seq_send64);
>   /* set seq_send for use by "older" enctypes */
> - ctx->seq_send = ctx->seq_send64;
> - if (ctx->seq_send64 != ctx->seq_send) {
> -

Re: [RFC PATCH] lib: Introduce generic __cmpxchg_u64() and use it where needed

2018-10-31 Thread Paul Burton

(Copying SunRPC & net maintainers.)

Hi Guenter,

On Wed, Oct 31, 2018 at 03:02:53PM -0700, Guenter Roeck wrote:
> The alternatives I can see are
> - Do not use cmpxchg64() outside architecture code (ie drop its use from
>   the offending driver, and keep doing the same whenever the problem comes
>   up again).
> or
> - Introduce something like ARCH_HAS_CMPXCHG64 and use it to determine
>   if cmpxchg64 is supported or not.
> 
> Any preference ?

My preference would be option 1 - avoiding cmpxchg64() where possible in
generic code. I wouldn't be opposed to the Kconfig option if there are
cases where cmpxchg64() can really help performance though.

The last time I'm aware of this coming up the affected driver was
modified to avoid cmpxchg64() [1].

In this particular case I have no idea why
net/sunrpc/auth_gss/gss_krb5_seal.c is using cmpxchg64() at all. It's
essentially reinventing atomic64_fetch_inc() which is already provided
everywhere via CONFIG_GENERIC_ATOMIC64 & the spinlock approach. At least
for atomic64_* functions the assumption that all access will be
performed using those same functions seems somewhat reasonable.

So how does the below look? Trond?

Thanks,
Paul

[1] https://patchwork.ozlabs.org/cover/891284/

---
diff --git a/include/linux/sunrpc/gss_krb5.h b/include/linux/sunrpc/gss_krb5.h
index 131424cefc6a..02c0412e368c 100644
--- a/include/linux/sunrpc/gss_krb5.h
+++ b/include/linux/sunrpc/gss_krb5.h
@@ -107,8 +107,8 @@ struct krb5_ctx {
u8  Ksess[GSS_KRB5_MAX_KEYLEN]; /* session key */
u8  cksum[GSS_KRB5_MAX_KEYLEN];
s32 endtime;
-   u32 seq_send;
-   u64 seq_send64;
+   atomic_tseq_send;
+   atomic64_t  seq_send64;
struct xdr_netobj   mech_used;
u8  initiator_sign[GSS_KRB5_MAX_KEYLEN];
u8  acceptor_sign[GSS_KRB5_MAX_KEYLEN];
@@ -118,9 +118,6 @@ struct krb5_ctx {
u8  acceptor_integ[GSS_KRB5_MAX_KEYLEN];
 };
 
-extern u32 gss_seq_send_fetch_and_inc(struct krb5_ctx *ctx);
-extern u64 gss_seq_send64_fetch_and_inc(struct krb5_ctx *ctx);
-
 /* The length of the Kerberos GSS token header */
 #define GSS_KRB5_TOK_HDR_LEN   (16)
 
diff --git a/net/sunrpc/auth_gss/gss_krb5_mech.c 
b/net/sunrpc/auth_gss/gss_krb5_mech.c
index 7f0424dfa8f6..eab71fc7af3e 100644
--- a/net/sunrpc/auth_gss/gss_krb5_mech.c
+++ b/net/sunrpc/auth_gss/gss_krb5_mech.c
@@ -274,6 +274,7 @@ get_key(const void *p, const void *end,
 static int
 gss_import_v1_context(const void *p, const void *end, struct krb5_ctx *ctx)
 {
+   u32 seq_send;
int tmp;
 
p = simple_get_bytes(p, end, >initiate, sizeof(ctx->initiate));
@@ -315,9 +316,10 @@ gss_import_v1_context(const void *p, const void *end, 
struct krb5_ctx *ctx)
p = simple_get_bytes(p, end, >endtime, sizeof(ctx->endtime));
if (IS_ERR(p))
goto out_err;
-   p = simple_get_bytes(p, end, >seq_send, sizeof(ctx->seq_send));
+   p = simple_get_bytes(p, end, _send, sizeof(seq_send));
if (IS_ERR(p))
goto out_err;
+   atomic_set(>seq_send, seq_send);
p = simple_get_netobj(p, end, >mech_used);
if (IS_ERR(p))
goto out_err;
@@ -607,6 +609,7 @@ static int
 gss_import_v2_context(const void *p, const void *end, struct krb5_ctx *ctx,
gfp_t gfp_mask)
 {
+   u64 seq_send64;
int keylen;
 
p = simple_get_bytes(p, end, >flags, sizeof(ctx->flags));
@@ -617,14 +620,15 @@ gss_import_v2_context(const void *p, const void *end, 
struct krb5_ctx *ctx,
p = simple_get_bytes(p, end, >endtime, sizeof(ctx->endtime));
if (IS_ERR(p))
goto out_err;
-   p = simple_get_bytes(p, end, >seq_send64, sizeof(ctx->seq_send64));
+   p = simple_get_bytes(p, end, _send64, sizeof(seq_send64));
if (IS_ERR(p))
goto out_err;
+   atomic64_set(>seq_send64, seq_send64);
/* set seq_send for use by "older" enctypes */
-   ctx->seq_send = ctx->seq_send64;
-   if (ctx->seq_send64 != ctx->seq_send) {
-   dprintk("%s: seq_send64 %lx, seq_send %x overflow?\n", __func__,
-   (unsigned long)ctx->seq_send64, ctx->seq_send);
+   atomic_set(>seq_send, seq_send64);
+   if (seq_send64 != atomic_read(>seq_send)) {
+   dprintk("%s: seq_send64 %llx, seq_send %x overflow?\n", 
__func__,
+   seq_send64, atomic_read(>seq_send));
p = ERR_PTR(-EINVAL);
goto out_err;
}
diff --git a/net/sunrpc/auth_gss/gss_krb5_seal.c 
b/net/sunrpc/auth_gss/gss_krb5_seal.c
index b4adeb06660b..48fe4a591b54 100644
--- a/net/sunrpc/auth_gss/gss_krb5_seal.c
+++ b/net/sunrpc/auth_gss/gss_krb5_seal.c
@@ -123,30 +123,6 @@ setup_token_v2(struct krb5_ctx *ctx, struct xdr_netobj

Re: PIE binaries are no longer mapped below 4 GiB on ppc64le

2018-10-31 Thread Michal Suchánek

On Wed, 31 Oct 2018 19:04:14 -0300
Tulio Magno Quites Machado Filho  wrote:

> Florian Weimer  writes:
> 
> > * Tulio Magno Quites Machado Filho:
> >  
> >> I wonder if this is restricted to linker that Golang uses.
> >> Were you able to reproduce the same problem with Binutils'
> >> linker?  
> >
> > The example is carefully constructed to use the external linker.  It
> > invokes gcc, which then invokes the BFD linker in my case.  
> 
> Indeed. That question was unnecessary.  :-D
> 
> > Based on the relocations, I assume there is only so much the linker
> > can do here.  I'm amazed that it produces an executable at all, let
> > alone one that runs correctly on some kernel versions!  
> 
> Agreed.  That isn't expected to work.  Both the compiler and the
> linker have to generate PIE for it to work.
> 
> > I assume that the Go toolchain simply lacks PIE support on
> > ppc64le.  
> 
> Maybe the support is there, but it doesn't generate PIC by default?
> 
golang has -fPIC IIRC. It does not benefit from the GNU toolchian
synergy of always calling the linker with the correct flags
corresponding to the generated code, though. So when gcc flips the
switch default value golang happily produces incompatible objects.

Also I suspect some pieces of stdlib are not compiled with the flags
you pass in for the build so there are always some objects somewhere
that are not compatible.

Thanks

Michal

Re: PIE binaries are no longer mapped below 4 GiB on ppc64le

2018-10-31 Thread Benjamin Herrenschmidt

On Wed, 2018-10-31 at 18:54 +0100, Florian Weimer wrote:
> 
> It would matter to C code which returns the address of a global variable
> in the main program through and (implicit) int return value.
> 
> The old behavior hid some pointer truncation issues.

Hiding bugs like that is never a good idea..

> > Maybe it would be good idea to generate 64bit relocations on 64bit
> > targets?
> 
> Yes, the Go toolchain definitely needs fixing for PIE.  I don't dispute
> that.

There was never any ABI guarantee that programs would be loaded below
4G... it just *happened*, so that's not per-se an ABI change.

That said, I'm surprised of the choice of address.. I would have rather
moved to above 1TB to benefit from 1T segments...

Nick, Anton, do you know anything about that change ?

Ben.

Re: PIE binaries are no longer mapped below 4 GiB on ppc64le

2018-10-31 Thread Tulio Magno Quites Machado Filho

Florian Weimer  writes:

> * Tulio Magno Quites Machado Filho:
>
>> I wonder if this is restricted to linker that Golang uses.
>> Were you able to reproduce the same problem with Binutils' linker?
>
> The example is carefully constructed to use the external linker.  It
> invokes gcc, which then invokes the BFD linker in my case.

Indeed. That question was unnecessary.  :-D

> Based on the relocations, I assume there is only so much the linker can
> do here.  I'm amazed that it produces an executable at all, let alone
> one that runs correctly on some kernel versions!

Agreed.  That isn't expected to work.  Both the compiler and the linker have
to generate PIE for it to work.

> I assume that the Go toolchain simply lacks PIE support on ppc64le.

Maybe the support is there, but it doesn't generate PIC by default?

-- 
Tulio Magno

Re: [RFC PATCH] lib: Introduce generic __cmpxchg_u64() and use it where needed

2018-10-31 Thread Guenter Roeck

On Wed, Oct 31, 2018 at 09:32:43PM +, Paul Burton wrote:
> Hi Guenter,
> 
> On Wed, Oct 31, 2018 at 12:52:18PM -0700, Guenter Roeck wrote:
> > +/*
> > + * Generic version of __cmpxchg_u64, to be used for cmpxchg64().
> > + * Takes u64 parameters.
> > + */
> > +u64 __cmpxchg_u64(u64 *ptr, u64 old, u64 new)
> > +{
> > +   raw_spinlock_t *lock = lock_addr(ptr);
> > +   unsigned long flags;
> > +   u64 prev;
> > +
> > +   raw_spin_lock_irqsave(lock, flags);
> > +   prev = READ_ONCE(*ptr);
> > +   if (prev == old)
> > +   *ptr = new;
> > +   raw_spin_unlock_irqrestore(lock, flags);
> > +
> > +   return prev;
> > +}
> > +EXPORT_SYMBOL(__cmpxchg_u64);
> 
> This is only going to work if we know that memory modified using
> __cmpxchg_u64() is *always* modified using __cmpxchg_u64(). Without that
> guarantee there's nothing to stop some other CPU writing to *ptr after
> the READ_ONCE() above but before we write new to it.
> 
> As far as I'm aware this is not a guarantee we currently provide, so it
> would mean making that a requirement for cmpxchg64() users & auditing
> them all. That would also leave cmpxchg64() with semantics that differ
> from plain cmpxchg(), and semantics that may surprise people. In my view
> that's probably not worth it, and it would be better to avoid using
> cmpxchg64() on systems that can't properly support it.
> 

Good point. Unfortunately this is also true for the architectures with
similar implementations, ie at least sparc32 (and possibly parisc).

The alternatives I can see are
- Do not use cmpxchg64() outside architecture code (ie drop its use from
  the offending driver, and keep doing the same whenever the problem comes
  up again).
or
- Introduce something like ARCH_HAS_CMPXCHG64 and use it to determine
  if cmpxchg64 is supported or not.

Any preference ?

Thanks,
Guenter

Re: [PATCH v2 2/2] kgdb: Fix kgdb_roundup_cpus() for arches who used smp_call_function()

2018-10-31 Thread Doug Anderson

Hi,

On Wed, Oct 31, 2018 at 11:40 AM Daniel Thompson
 wrote:
>
> On Tue, Oct 30, 2018 at 03:18:43PM -0700, Douglas Anderson wrote:
> > diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
> > index f3cadda45f07..9a3f952de6ed 100644
> > --- a/kernel/debug/debug_core.c
> > +++ b/kernel/debug/debug_core.c
> > @@ -55,6 +55,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >
> >  #include 
> >  #include 
> > @@ -220,6 +221,39 @@ int __weak kgdb_skipexception(int exception, struct 
> > pt_regs *regs)
> >   return 0;
> >  }
> >
> > +/*
> > + * Default (weak) implementation for kgdb_roundup_cpus
> > + */
> > +
> > +static DEFINE_PER_CPU(call_single_data_t, kgdb_roundup_csd);
> > +
> > +void __weak kgdb_call_nmi_hook(void *ignored)
> > +{
> > + kgdb_nmicallback(raw_smp_processor_id(), get_irq_regs());
> > +}
> > +
> > +void __weak kgdb_roundup_cpus(void)
> > +{
> > + call_single_data_t *csd;
> > + int cpu;
> > +
> > + for_each_cpu(cpu, cpu_online_mask) {
> > + csd = _cpu(kgdb_roundup_csd, cpu);
> > + smp_call_function_single_async(cpu, csd);
> > + }
>
> smp_call_function() automatically skips the calling CPU but this code does
> not. It isn't a hard bug since kgdb_nmicallback() does a re-entrancy
> check but I'd still prefer to skip the calling CPU.

I'll incorporate this into the next version.

> As mentioned in another part of the thread we can also add robustness
> by skipping a cpu where csd->flags != 0 (and adding an appropriately
> large comment regarding why). Doing the check directly is abusing
> internal knowledge that smp.c normally keeps to itself so an accessor
> of some kind would be needed.

Sure.  I could add smp_async_func_finished() that just looked like:

int smp_async_func_finished(call_single_data_t *csd)
{
  return !(csd->flags & CSD_FLAG_LOCK);
}

My understanding of all the mutual exclusion / memory barrier concepts
employed by smp.c is pretty weak, though.  I'm hoping that it's safe
to just access the structure and check the bit directly.

...but do you think adding a generic accessor like this is better than
just keeping track of this in kgdb directly?  I could avoid the
accessor by adding a "rounding_up" member to "struct
debuggerinfo_struct" and doing something like this in roundup:

  /* If it didn't round up last time, don't try again */
  if (kgdb_info[cpu].rounding_up)
continue

  kgdb_info[cpu].rounding_up = true
  smp_call_function_single_async(cpu, csd);

...and then in kgdb_nmicallback() I could just add:

  kgdb_info[cpu].rounding_up = false

In that case we're not adding a generic accessor to smp.c that most
people should never use.

I'll wait to hear back from you if you think the accessor is OK.  It
seems like it might be nice not to have to add something to smp.c just
for this one use case.

> > +}
> > +
> > +static void kgdb_generic_roundup_init(void)
> > +{
> > + call_single_data_t *csd;
> > + int cpu;
> > +
> > + for_each_possible_cpu(cpu) {
> > + csd = _cpu(kgdb_roundup_csd, cpu);
> > + csd->func = kgdb_call_nmi_hook;
> > + }
> > +}
>
> I can't help noticing this code is very similar to kgdb_roundup_cpus. Do
> we really gain much from ahead-of-time initializing csd->func?

Oh!  Right...  At first I thought about just trying to put the "csd"
on the stack in kgdb_roundup_cpus() but then I realized that it needed
to persist past the end of kgdb_roundup_cpus().  ...and once I gave up
on the idea of putting it on the stack I decided I needed the init.

...but you're right that I don't really.  The only thing I'm initting
is the function pointer and it totally wouldn't hurt to just init that
over and over again every time kgdb_roundup_cpus() is called.

-Doug

Re: PIE binaries are no longer mapped below 4 GiB on ppc64le

2018-10-31 Thread Tulio Magno Quites Machado Filho

Florian Weimer  writes:

> * Michal Suchánek:
>
>> On Wed, 31 Oct 2018 18:20:56 +0100
>> Florian Weimer  wrote:
>>
>>> And it needs to be built with:
>>> 
>>>   go build -ldflags=-extldflags=-pie extld.go
>>> 
>>> I'm not entirely sure what to make of this, but I'm worried that this
>>> could be a regression that matters to userspace.
>>
>> I encountered the same when trying to build go on ppc64le. I am not
>> familiar with the internals so I just let it be.
>>
>> It does not seem to matter to any other userspace.
>
> It would matter to C code which returns the address of a global variable
> in the main program through and (implicit) int return value.

I wonder if this is restricted to linker that Golang uses.
Were you able to reproduce the same problem with Binutils' linker?

-- 
Tulio Magno

Re: [RFC PATCH] lib: Introduce generic __cmpxchg_u64() and use it where needed

2018-10-31 Thread Paul Burton

Hi Guenter,

On Wed, Oct 31, 2018 at 12:52:18PM -0700, Guenter Roeck wrote:
> +/*
> + * Generic version of __cmpxchg_u64, to be used for cmpxchg64().
> + * Takes u64 parameters.
> + */
> +u64 __cmpxchg_u64(u64 *ptr, u64 old, u64 new)
> +{
> + raw_spinlock_t *lock = lock_addr(ptr);
> + unsigned long flags;
> + u64 prev;
> +
> + raw_spin_lock_irqsave(lock, flags);
> + prev = READ_ONCE(*ptr);
> + if (prev == old)
> + *ptr = new;
> + raw_spin_unlock_irqrestore(lock, flags);
> +
> + return prev;
> +}
> +EXPORT_SYMBOL(__cmpxchg_u64);

This is only going to work if we know that memory modified using
__cmpxchg_u64() is *always* modified using __cmpxchg_u64(). Without that
guarantee there's nothing to stop some other CPU writing to *ptr after
the READ_ONCE() above but before we write new to it.

As far as I'm aware this is not a guarantee we currently provide, so it
would mean making that a requirement for cmpxchg64() users & auditing
them all. That would also leave cmpxchg64() with semantics that differ
from plain cmpxchg(), and semantics that may surprise people. In my view
that's probably not worth it, and it would be better to avoid using
cmpxchg64() on systems that can't properly support it.

For MIPS the problem will go away with the nanoMIPS ISA which includes &
requires LLWP/SCWP (W = word, P = paired) instructions that we can use
to implement cmpxchg64() properly, but of course that won't help older
systems.

Thanks,
Paul

Re: PIE binaries are no longer mapped below 4 GiB on ppc64le

2018-10-31 Thread Florian Weimer

* Tulio Magno Quites Machado Filho:

> Florian Weimer  writes:
>
>> * Michal Suchánek:
>>
>>> On Wed, 31 Oct 2018 18:20:56 +0100
>>> Florian Weimer  wrote:
>>>
 And it needs to be built with:

   go build -ldflags=-extldflags=-pie extld.go

 I'm not entirely sure what to make of this, but I'm worried that this
 could be a regression that matters to userspace.
>>>
>>> I encountered the same when trying to build go on ppc64le. I am not
>>> familiar with the internals so I just let it be.
>>>
>>> It does not seem to matter to any other userspace.
>>
>> It would matter to C code which returns the address of a global variable
>> in the main program through and (implicit) int return value.
>
> I wonder if this is restricted to linker that Golang uses.
> Were you able to reproduce the same problem with Binutils' linker?

The example is carefully constructed to use the external linker.  It
invokes gcc, which then invokes the BFD linker in my case.

Based on the relocations, I assume there is only so much the linker can
do here.  I'm amazed that it produces an executable at all, let alone
one that runs correctly on some kernel versions!  I assume that the Go
toolchain simply lacks PIE support on ppc64le.

Thanks,
Florian

[RFC PATCH] lib: Introduce generic __cmpxchg_u64() and use it where needed

2018-10-31 Thread Guenter Roeck

Some architectures do not or not always support cmpxchg64(). This results
in on/off problems when the function is used in common code. The latest
example is

net/sunrpc/auth_gss/gss_krb5_seal.c: In function 'gss_seq_send64_fetch_and_inc':
net/sunrpc/auth_gss/gss_krb5_seal.c:145:14: error:
implicit declaration of function 'cmpxchg64'

which is seen with some powerpc and mips builds.

Introduce a generic version of __cmpxchg_u64() and use it for affected
architectures.

Fixes: 21924765862a
("SUNRPC: use cmpxchg64() in gss_seq_send64_fetch_and_inc()")
Cc: Arnd Bergmann 
Cc: Trond Myklebust 
Signed-off-by: Guenter Roeck 
---
Couple of questions:
- Is this the right (or an acceptable) approach to fix the problem ?
- Should I split the patch into three, one to introduce __cmpxchg_u64()
  and one per architecture ?
- Who should take the patch (series) ?

 arch/mips/Kconfig  |  1 +
 arch/mips/include/asm/cmpxchg.h|  3 ++
 arch/powerpc/Kconfig   |  1 +
 arch/powerpc/include/asm/cmpxchg.h |  3 ++
 lib/Kconfig|  3 ++
 lib/Makefile   |  2 ++
 lib/cmpxchg64.c| 60 ++
 7 files changed, 73 insertions(+)
 create mode 100644 lib/cmpxchg64.c

diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index 80778b40f8fa..7392a5f4e517 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -18,6 +18,7 @@ config MIPS
select CPU_PM if CPU_IDLE
select DMA_DIRECT_OPS
select GENERIC_ATOMIC64 if !64BIT
+   select GENERIC_CMPXCHG64 if !64BIT && SMP
select GENERIC_CLOCKEVENTS
select GENERIC_CMOS_UPDATE
select GENERIC_CPU_AUTOPROBE
diff --git a/arch/mips/include/asm/cmpxchg.h b/arch/mips/include/asm/cmpxchg.h
index 89e9fb7976fe..ca837b05bf3d 100644
--- a/arch/mips/include/asm/cmpxchg.h
+++ b/arch/mips/include/asm/cmpxchg.h
@@ -206,6 +206,9 @@ static inline unsigned long __cmpxchg(volatile void *ptr, 
unsigned long old,
 #define cmpxchg64_local(ptr, o, n) __cmpxchg64_local_generic((ptr), (o), (n))
 #ifndef CONFIG_SMP
 #define cmpxchg64(ptr, o, n) cmpxchg64_local((ptr), (o), (n))
+#else
+extern u64 __cmpxchg_u64(u64 *p, u64 old, u64 new);
+#define cmpxchg64(ptr, o, n) __cmpxchg_u64((ptr), (o), (n))
 #endif
 #endif
 
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index e84943d24e5c..bd1d99c664c4 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -161,6 +161,7 @@ config PPC
select EDAC_ATOMIC_SCRUB
select EDAC_SUPPORT
select GENERIC_ATOMIC64 if PPC32
+   select GENERIC_CMPXCHG64if PPC32
select GENERIC_CLOCKEVENTS
select GENERIC_CLOCKEVENTS_BROADCASTif SMP
select GENERIC_CMOS_UPDATE
diff --git a/arch/powerpc/include/asm/cmpxchg.h 
b/arch/powerpc/include/asm/cmpxchg.h
index 27183871eb3b..da8be4189731 100644
--- a/arch/powerpc/include/asm/cmpxchg.h
+++ b/arch/powerpc/include/asm/cmpxchg.h
@@ -534,8 +534,11 @@ __cmpxchg_acquire(void *ptr, unsigned long old, unsigned 
long new,
cmpxchg_acquire((ptr), (o), (n));   \
 })
 #else
+extern u64 __cmpxchg_u64(u64 *p, u64 old, u64 new);
+
 #include 
 #define cmpxchg64_local(ptr, o, n) __cmpxchg64_local_generic((ptr), (o), (n))
+#define cmpxchg64(ptr, o, n) __cmpxchg_u64((ptr), (o), (n))
 #endif
 
 #endif /* __KERNEL__ */
diff --git a/lib/Kconfig b/lib/Kconfig
index d1573a16aa92..2b581a70ded2 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -500,6 +500,9 @@ config NLATTR
 config GENERIC_ATOMIC64
bool
 
+config GENERIC_CMPXCHG64
+   bool
+
 config LRU_CACHE
tristate
 
diff --git a/lib/Makefile b/lib/Makefile
index 988949c4fd3a..4646a06ed418 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -172,6 +172,8 @@ obj-$(CONFIG_GENERIC_CSUM) += checksum.o
 
 obj-$(CONFIG_GENERIC_ATOMIC64) += atomic64.o
 
+obj-$(CONFIG_GENERIC_CMPXCHG64) += cmpxchg64.o
+
 obj-$(CONFIG_ATOMIC64_SELFTEST) += atomic64_test.o
 
 obj-$(CONFIG_CPU_RMAP) += cpu_rmap.o
diff --git a/lib/cmpxchg64.c b/lib/cmpxchg64.c
new file mode 100644
index ..239c43d05d00
--- /dev/null
+++ b/lib/cmpxchg64.c
@@ -0,0 +1,60 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Generic implementation of cmpxchg64().
+ * Derived from implementation in arch/sparc/lib/atomic32.c
+ * and from locking code implemented in lib/atomic32.c.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/*
+ * We use a hashed array of spinlocks to provide exclusive access
+ * to each variable. Since this is expected to used on systems
+ * with small numbers of CPUs (<= 4 or so), we use a relatively
+ * small array of 16 spinlocks to avoid wasting too much memory
+ * on the spinlock array.
+ */
+#define NR_LOCKS   16
+
+/* Ensure that each lock is in a separate cacheline */
+static union {
+   raw_spinlock_t lock;
+   char pad[L1_CACHE_BYTES];
+} cmpxchg_lock[NR_LOCKS] __cacheline_aligned_in_smp = {

Re: [PATCH] selftests/powerpc: Fix compilation issue due to asm label

hi Naveen,

On 10/31/18 2:24 PM, Naveen N. Rao wrote:
> Naveen N. Rao wrote:
>> We are using 'dscr_insn' as a label in inline asm to identify if a
>> SIGILL was generated by the mtspr instruction at that point. However,
>> with inline assembly, the compiler is still free to duplicate the asm
>> statement for optimization purposes, which results in the label being
>> defined twice with the error:
>> /tmp/ccerQCql.s:874: Error: symbol `dscr_insn' is already defined
>>
>> With different compiler versions, we may also see:
>> /tmp/ccJzLDlN.o:(.toc+0x0): undefined reference to `dscr_insn'
>>
>> Remove the use of the label in the inline assembly. Instead, just look
>> for the offending instruction in the signal handler.

Cool, I've tested this patch on my mainline tree and selftests are being
able to build again. I also tested the rfi_flush_test selftest as other
tests (ptrace/ and tm/) and everything seems to be normal again.

Thank you!

>> Reported-by: Breno Leitao 
>> Signed-off-by: Naveen N. Rao 

Tested-by: Breno Leitao

Re: [PATCH v2 2/2] kgdb: Fix kgdb_roundup_cpus() for arches who used smp_call_function()

2018-10-31 Thread Daniel Thompson

On Tue, Oct 30, 2018 at 03:18:43PM -0700, Douglas Anderson wrote:
> diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
> index f3cadda45f07..9a3f952de6ed 100644
> --- a/kernel/debug/debug_core.c
> +++ b/kernel/debug/debug_core.c
> @@ -55,6 +55,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  #include 
> @@ -220,6 +221,39 @@ int __weak kgdb_skipexception(int exception, struct 
> pt_regs *regs)
>   return 0;
>  }
>  
> +/*
> + * Default (weak) implementation for kgdb_roundup_cpus
> + */
> +
> +static DEFINE_PER_CPU(call_single_data_t, kgdb_roundup_csd);
> +
> +void __weak kgdb_call_nmi_hook(void *ignored)
> +{
> + kgdb_nmicallback(raw_smp_processor_id(), get_irq_regs());
> +}
> +
> +void __weak kgdb_roundup_cpus(void)
> +{
> + call_single_data_t *csd;
> + int cpu;
> +
> + for_each_cpu(cpu, cpu_online_mask) {
> + csd = _cpu(kgdb_roundup_csd, cpu);
> + smp_call_function_single_async(cpu, csd);
> + }

smp_call_function() automatically skips the calling CPU but this code does
not. It isn't a hard bug since kgdb_nmicallback() does a re-entrancy
check but I'd still prefer to skip the calling CPU.

As mentioned in another part of the thread we can also add robustness
by skipping a cpu where csd->flags != 0 (and adding an appropriately
large comment regarding why). Doing the check directly is abusing
internal knowledge that smp.c normally keeps to itself so an accessor
of some kind would be needed.

> +}
> +
> +static void kgdb_generic_roundup_init(void)
> +{
> + call_single_data_t *csd;
> + int cpu;
> +
> + for_each_possible_cpu(cpu) {
> + csd = _cpu(kgdb_roundup_csd, cpu);
> + csd->func = kgdb_call_nmi_hook;
> + }
> +}

I can't help noticing this code is very similar to kgdb_roundup_cpus. Do
we really gain much from ahead-of-time initializing csd->func?

Daniel.

Re: PIE binaries are no longer mapped below 4 GiB on ppc64le

2018-10-31 Thread Florian Weimer

* Michal Suchánek:

> On Wed, 31 Oct 2018 18:20:56 +0100
> Florian Weimer  wrote:
>
>> We tried to use Go to build PIE binaries, and while the Go toolchain
>> is definitely not ready (it produces text relocations and problematic
>> relocations in general), it exposed what could be an accidental
>> userspace ABI change.
>> 
>> With our 4.10-derived kernel, PIE binaries are mapped below 4 GiB, so
>> relocations like R_PPC64_ADDR16_HA work:
>> 
> ...
>
>> There are fewer mappings because the loader detects a relocation
>> overflow and aborts (“error while loading shared libraries:
>> R_PPC64_ADDR16_HA reloc at 0x000120f0983c for symbol `' out of
>> range”), so I had to recover the mappings externally.  Disabling ASLR
>> does not help.
>> 
> ...
>> 
>> And it needs to be built with:
>> 
>>   go build -ldflags=-extldflags=-pie extld.go
>> 
>> I'm not entirely sure what to make of this, but I'm worried that this
>> could be a regression that matters to userspace.
>
> I encountered the same when trying to build go on ppc64le. I am not
> familiar with the internals so I just let it be.
>
> It does not seem to matter to any other userspace.

It would matter to C code which returns the address of a global variable
in the main program through and (implicit) int return value.

The old behavior hid some pointer truncation issues.

> Maybe it would be good idea to generate 64bit relocations on 64bit
> targets?

Yes, the Go toolchain definitely needs fixing for PIE.  I don't dispute
that.

Thanks,
Florian

Re: PIE binaries are no longer mapped below 4 GiB on ppc64le

2018-10-31 Thread Michal Suchánek

On Wed, 31 Oct 2018 18:20:56 +0100
Florian Weimer  wrote:

> We tried to use Go to build PIE binaries, and while the Go toolchain
> is definitely not ready (it produces text relocations and problematic
> relocations in general), it exposed what could be an accidental
> userspace ABI change.
> 
> With our 4.10-derived kernel, PIE binaries are mapped below 4 GiB, so
> relocations like R_PPC64_ADDR16_HA work:
> 
...

> There are fewer mappings because the loader detects a relocation
> overflow and aborts (“error while loading shared libraries:
> R_PPC64_ADDR16_HA reloc at 0x000120f0983c for symbol `' out of
> range”), so I had to recover the mappings externally.  Disabling ASLR
> does not help.
> 
...
> 
> And it needs to be built with:
> 
>   go build -ldflags=-extldflags=-pie extld.go
> 
> I'm not entirely sure what to make of this, but I'm worried that this
> could be a regression that matters to userspace.

I encountered the same when trying to build go on ppc64le. I am not
familiar with the internals so I just let it be.

It does not seem to matter to any other userspace. Maybe it would be
good idea to generate 64bit relocations on 64bit targets?

Thanks

Michal

Re: [PATCH] selftests/powerpc: Fix compilation issue due to asm label

2018-10-31 Thread Naveen N. Rao


Naveen N. Rao wrote:

We are using 'dscr_insn' as a label in inline asm to identify if a
SIGILL was generated by the mtspr instruction at that point. However,
with inline assembly, the compiler is still free to duplicate the asm
statement for optimization purposes, which results in the label being
defined twice with the error:
/tmp/ccerQCql.s:874: Error: symbol `dscr_insn' is already defined

With different compiler versions, we may also see:
/tmp/ccJzLDlN.o:(.toc+0x0): undefined reference to `dscr_insn'

Remove the use of the label in the inline assembly. Instead, just look
for the offending instruction in the signal handler.

Reported-by: Breno Leitao 
Signed-off-by: Naveen N. Rao 
---


Missed adding:
Fixes: d2bf793237b3aa9c ("selftests/powerpc: Add test to verify rfi flush across a 
system call")

- Naveen

PIE binaries are no longer mapped below 4 GiB on ppc64le

2018-10-31 Thread Florian Weimer

We tried to use Go to build PIE binaries, and while the Go toolchain is
definitely not ready (it produces text relocations and problematic
relocations in general), it exposed what could be an accidental
userspace ABI change.

With our 4.10-derived kernel, PIE binaries are mapped below 4 GiB, so
relocations like R_PPC64_ADDR16_HA work:

21f0-220d r-xp  fd:00 36593493   
/root/extld
220d-220e r--p 001c fd:00 36593493   
/root/extld
220e-2210 rw-p 001d fd:00 36593493   
/root/extld
2210-2212 rw-p  00:00 0
264b-264e rw-p  00:00 0  [heap]
c0-c1 rw-p  00:00 0
c41ffe-c42030 rw-p  00:00 0
3fff8c00-3fff8c03 rw-p  00:00 0
3fff8c03-3fff9000 ---p  00:00 0
3fff9000-3fff9003 rw-p  00:00 0
3fff9003-3fff9400 ---p  00:00 0
3fff9400-3fff9403 rw-p  00:00 0
3fff9403-3fff9800 ---p  00:00 0
3fff9800-3fff9803 rw-p  00:00 0
3fff9803-3fff9c00 ---p  00:00 0
3fff9c00-3fff9c03 rw-p  00:00 0
3fff9c03-3fffa000 ---p  00:00 0
3fffa229-3fffa22d rw-p  00:00 0
3fffa22d-3fffa22e ---p  00:00 0
3fffa22e-3fffa2ae rw-p  00:00 0
3fffa2ae-3fffa2af ---p  00:00 0
3fffa2af-3fffa32f rw-p  00:00 0
3fffa32f-3fffa330 ---p  00:00 0
3fffa330-3fffa3b0 rw-p  00:00 0
3fffa3b0-3fffa3b1 ---p  00:00 0
3fffa3b1-3fffa431 rw-p  00:00 0
3fffa431-3fffa432 ---p  00:00 0
3fffa432-3fffa4bb rw-p  00:00 0
3fffa4bb-3fffa4da r-xp  fd:00 34316081   
/usr/lib64/power9/libc-2.28.so
3fffa4da-3fffa4db r--p 001e fd:00 34316081   
/usr/lib64/power9/libc-2.28.so
3fffa4db-3fffa4dc rw-p 001f fd:00 34316081   
/usr/lib64/power9/libc-2.28.so
3fffa4dc-3fffa4df r-xp  fd:00 34316085   
/usr/lib64/power9/libpthread-2.28.so
3fffa4df-3fffa4e0 r--p 0002 fd:00 34316085   
/usr/lib64/power9/libpthread-2.28.so
3fffa4e0-3fffa4e1 rw-p 0003 fd:00 34316085   
/usr/lib64/power9/libpthread-2.28.so
3fffa4e1-3fffa4e2 rw-p  00:00 0
3fffa4e2-3fffa4e4 r-xp  00:00 0  [vdso]
3fffa4e4-3fffa4e7 r-xp  fd:00 874114 
/usr/lib64/ld-2.28.so
3fffa4e7-3fffa4e8 r--p 0002 fd:00 874114 
/usr/lib64/ld-2.28.so
3fffa4e8-3fffa4e9 rw-p 0003 fd:00 874114 
/usr/lib64/ld-2.28.so
3300-3303 rw-p  00:00 0
[stack]

With a 4.18-derived kernel (with the hashed mm), we get this instead:

120e6-12103 rw-p  fd:00 102447141
/root/extld
12103-12106 rw-p 001c fd:00 102447141
/root/extld
12106-12108 rw-p  00:00 0 
7fffb5b0-7fffb5cf r-xp  fd:00 67169871   
/usr/lib64/power9/libc-2.28.so
7fffb5cf-7fffb5d0 r--p 001e fd:00 67169871   
/usr/lib64/power9/libc-2.28.so
7fffb5d0-7fffb5d1 rw-p 001f fd:00 67169871   
/usr/lib64/power9/libc-2.28.so
7fffb5d1-7fffb5d4 r-xp  fd:00 67169875   
/usr/lib64/power9/libpthread-2.28.so
7fffb5d4-7fffb5d5 r--p 0002 fd:00 67169875   
/usr/lib64/power9/libpthread-2.28.so
7fffb5d5-7fffb5d6 rw-p 0003 fd:00 67169875   
/usr/lib64/power9/libpthread-2.28.so
7fffb5d6-7fffb5d7 r--p  fd:00 67780267   
/etc/ld.so.cache
7fffb5d7-7fffb5d9 r-xp  00:00 0  [vdso]
7fffb5d9-7fffb5dc r-xp  fd:00 1477   
/usr/lib64/ld-2.28.so
7fffb5dc-7fffb5de rw-p 0002 fd:00 1477   
/usr/lib64/ld-2.28.so
7f6c-7f6f rw-p  00:00 0  [stack]

There are fewer mappings because the loader detects a relocation
overflow and aborts (“error while loading shared libraries:
R_PPC64_ADDR16_HA reloc at 0x000120f0983c for symbol `' out of
range”), so I had to recover the mappings externally.  Disabling ASLR
does not help.

The Go program looks like this:

package main

import (
"fmt"
"io/ioutil"
"os"
)

// #include 
import "C"

func main() {
// Force external linking against glibc.
fmt.Printf("%#v\n", C.GoString(C.gnu_get_libc_version()))

maps, err := os.Open("/proc/self/maps")
if err != nil {
panic(err)
}
defer maps.Close()

[PATCH] selftests/powerpc: Fix compilation issue due to asm label

2018-10-31 Thread Naveen N. Rao

We are using 'dscr_insn' as a label in inline asm to identify if a
SIGILL was generated by the mtspr instruction at that point. However,
with inline assembly, the compiler is still free to duplicate the asm
statement for optimization purposes, which results in the label being
defined twice with the error:
/tmp/ccerQCql.s:874: Error: symbol `dscr_insn' is already defined

With different compiler versions, we may also see:
/tmp/ccJzLDlN.o:(.toc+0x0): undefined reference to `dscr_insn'

Remove the use of the label in the inline assembly. Instead, just look
for the offending instruction in the signal handler.

Reported-by: Breno Leitao 
Signed-off-by: Naveen N. Rao 
---
 tools/testing/selftests/powerpc/utils.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/powerpc/utils.c 
b/tools/testing/selftests/powerpc/utils.c
index 43c342845be0..ed62f4153d3e 100644
--- a/tools/testing/selftests/powerpc/utils.c
+++ b/tools/testing/selftests/powerpc/utils.c
@@ -25,7 +25,6 @@
 #include "utils.h"
 
 static char auxv[4096];
-extern unsigned int dscr_insn[];
 
 int read_auxv(char *buf, ssize_t buf_size)
 {
@@ -247,7 +246,8 @@ static void sigill_handler(int signr, siginfo_t *info, void 
*unused)
ucontext_t *ctx = (ucontext_t *)unused;
unsigned long *pc = _NIA(ctx);
 
-   if (*pc == (unsigned long)_insn) {
+   /* mtspr 3,RS to check for move to DSCR below */
+   if ((*((unsigned int *)*pc) & 0xfc1f) == 0x7c0303a6) {
if (!warned++)
printf("WARNING: Skipping over dscr setup. Consider 
running 'ppc64_cpu --dscr=1' manually.\n");
*pc += 4;
@@ -271,5 +271,5 @@ void set_dscr(unsigned long val)
init = 1;
}
 
-   asm volatile("dscr_insn: mtspr %1,%0" : : "r" (val), "i" (SPRN_DSCR));
+   asm volatile("mtspr %1,%0" : : "r" (val), "i" (SPRN_DSCR));
 }
-- 
2.19.1

Re: [PATCH 5/5] powerpc/64s: Document that PPC supports nosmap

2018-10-31 Thread LEROY Christophe

Russell Currey  a écrit :

On Fri, 2018-10-26 at 18:35 +0200, LEROY Christophe wrote:

Why not call our new functionnality SMAP instead of calling it GUAP ?

mpe wasn't a fan of using the same terminology as other architectures.

I don't like too much the word 'guarded' because it means something  
different for powerpc MMUs.

What about something like 'user space access protection' ?

Christophe

Having a separate term does avoid some assumptions about how things
work or are implemented, but sharing compatibility with an existing
parameter is nice.

Personally I don't really care too much about the name.

- Russell

Christophe

Russell Currey  a écrit :

> Signed-off-by: Russell Currey 
> ---
>  Documentation/admin-guide/kernel-parameters.txt | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/Documentation/admin-guide/kernel-parameters.txt
> b/Documentation/admin-guide/kernel-parameters.txt
> index a5ad67d5cb16..8f78e75965f0 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -2764,7 +2764,7 @@
>noexec=on: enable non-executable mappings
> (default)
>noexec=off: disable non-executable mappings
>
> -  nosmap  [X86]
> +  nosmap  [X86,PPC]
>Disable SMAP (Supervisor Mode Access
> Prevention)
>even if it is supported by processor.
>
> --
> 2.19.1

Re: [PATCH v2 2/2] kgdb: Fix kgdb_roundup_cpus() for arches who used smp_call_function()

2018-10-31 Thread Daniel Thompson

On Wed, Oct 31, 2018 at 02:49:26PM +0100, Peter Zijlstra wrote:
> On Tue, Oct 30, 2018 at 03:18:43PM -0700, Douglas Anderson wrote:
> > Looking closely at it, it seems like a really bad idea to be calling
> > local_irq_enable() in kgdb_roundup_cpus().  If nothing else that seems
> > like it could violate spinlock semantics and cause a deadlock.
> > 
> > Instead, let's use a private csd alongside
> > smp_call_function_single_async() to round up the other CPUs.  Using
> > smp_call_function_single_async() doesn't require interrupts to be
> > enabled so we can remove the offending bit of code.
> 
> You might want to mention that the only reason this isn't a deadlock
> itself is because there is a timeout on waiting for the slaves to
> check-in.

dbg_master_lock must be owned to call kgdb_roundup_cpus() so
the calls to smp_call_function_single_async() should never deadlock the
calling CPU unless there has been a previous failure to round up (e.g.
cores that cannot react to the round up signal).

When there is a failure to round up when we resume, there is a window (before
whatever locks that prevented the IPI being serviced are released) during which
the system will deadlock if the debugger is re entered.

I don't think there is any point trying to round up a CPU that did not
previously respond... it should still have an IPI pending. The deadlock
can be eliminated by getting the round up code to avoid CPUs whose csd->flags
are non-zero either by checking the flag in the kgdb code or adding something
like smp_trycall_function_single_async().

Daniel.

Re: [PATCH 0/5] Guarded Userspace Access Prevention on Radix

2018-10-31 Thread LEROY Christophe


Russell Currey  a écrit :


On Fri, 2018-10-26 at 18:29 +0200, LEROY Christophe wrote:

Russell Currey  a écrit :

> Guarded Userspace Access Prevention is a security mechanism that
> prevents
> the kernel from being able to read and write userspace addresses
> outside of
> the allowed paths, most commonly copy_{to/from}_user().
>
> At present, the only CPU that supports this is POWER9, and only
> while using
> the Radix MMU.  Privileged reads and writes cannot access user data
> when
> key 0 of the AMR is set.  This is described in the "Radix Tree
> Translation
> Storage Protection" section of the POWER ISA as of version 3.0.

It is not right that only power9 can support that.


It's true that not only P9 can support it, but there are more
considerations under hash than radix, implementing this for radix is a
first step.


I don't know much about hash, but I was talking about the 8xx which is  
a nohash ppc32. I'll see next week if I can do something with it on  
top of your serie.






The 8xx has mmu access protection registers which can serve the
same
purpose. Today on the 8xx kernel space is group 0 and user space is
group 1. Group 0 is set to "page defined access permission" in
MD_AP
and MI_AP registers, and group 1 is set to "all accesses done with
supervisor rights". By setting group 1 to "user and supervisor
interpretation swapped" we can forbid kernel access to user space
while still allowing user access to it. Then by simply changing
group
1 mode at dedicated places we can lock/unlock kernel access to user
space.

Could you implement something as generic as possible having that in
mind for a future patch ?


I don't think anything in this series is particularly problematic in
relation to future work for hash.  I am interested in doing a hash
implementation in future too.


I think we have to look at that carrefuly to avoid uggly ifdefs

Christophe



- Russell



Christophe

> GUAP code sets key 0 of the AMR (thus disabling accesses of user
> data)
> early during boot, and only ever "unlocks" access prior to certain
> operations, like copy_{to/from}_user(), futex ops, etc.  Setting
> this does
> not prevent unprivileged access, so userspace can operate fine
> while access
> is locked.
>
> There is a performance impact, although I don't consider it
> heavy.  Running
> a worst-case benchmark of a 1GB copy 1 byte at a time (and thus
> constant
> read(1) write(1) syscalls), I found enabling GUAP to be 3.5% slower
> than
> when disabled.  In most cases, the difference is negligible.  The
> main
> performance impact is the mtspr instruction, which is quite slow.
>
> There are a few caveats with this series that could be improved
> upon in
> future.  Right now there is no saving and restoring of the AMR
> value -
> there is no userspace exploitation of the AMR on Radix in POWER9,
> but if
> this were to change in future, saving and restoring the value would
> be
> necessary.
>
> No attempt to optimise cases of repeated calls - for example, if
> some
> code was repeatedly calling copy_to_user() for small sizes very
> frequently,
> it would be slower than the equivalent of wrapping that code in an
> unlock
> and lock and only having to modify the AMR once.
>
> There are some interesting cases that I've attempted to handle,
> such as if
> the AMR is unlocked (i.e. because a copy_{to_from}_user is in
> progress)...
>
> - and an exception is taken, the kernel would then be running
> with the
> AMR unlocked and freely able to access userspace again.  I am
> working
> around this by storing a flag in the PACA to indicate if the
> AMR is
> unlocked (to save a costly SPR read), and if so, locking the
> AMR in
> the exception entry path and unlocking it on the way out.
>
> - and gets context switched out, goes into a path that locks
> the AMR,
> then context switches back, access will be disabled and will
> fault.
> As a result, I context switch the AMR between tasks as if it
> was used
> by userspace like hash (which already implements this).
>
> Another consideration is use of the isync instruction.  Without an
> isync
> following the mtspr instruction, there is no guarantee that the
> change
> takes effect.  The issue is that isync is very slow, and so I tried
> to
> avoid them wherever necessary.  In this series, the only place an
> isync
> gets used is after *unlocking* the AMR, because if an access takes
> place
> and access is still prevented, the kernel will fault.
>
> On the flipside, a slight delay in unlocking caused by skipping an
> isync
> potentially allows a small window of vulnerability.  It is my
> opinion
> that this window is practically impossible to exploit, but if
> someone
> thinks otherwise, please do share.
>
> This series is my first attempt at POWER assembly so all feedback
> is very
> welcome.
>
> The official theme song of this series can be found here:
> https://www.youtube.com/watch?v=QjTrnKAcYjE
>
> Russell Currey (5):
>   powerpc/64s:

Re: [PATCH 1/5] powerpc/64s: Guarded Userspace Access Prevention

2018-10-31 Thread LEROY Christophe


Russell Currey  a écrit :


On Sun, 2018-10-28 at 18:57 +0100, LEROY Christophe wrote:

Russell Currey  a écrit :

> Guarded Userspace Access Prevention (GUAP)  utilises a feature of
> the Radix MMU which disallows read and write access to userspace
> addresses.  By utilising this, the kernel is prevented from
> accessing
> user data from outside of trusted paths that perform proper safety
> checks,
> such as copy_{to/from}_user() and friends.
>
> Userspace access is disabled from early boot and is only enabled
> when:
>
>- exiting the kernel and entering userspace
>- performing an operation like copy_{to/from}_user()
>- context switching to a process that has access enabled
>
> and similarly, access is disabled again when exiting userspace and
> entering
> the kernel.
>
> This feature has a slight performance impact which I roughly
> measured to be
> 3% slower in the worst case (performing 1GB of 1 byte
> read()/write()
> syscalls), and is gated behind the CONFIG_PPC_RADIX_GUAP option for
> performance-critical builds.
>
> This feature can be tested by using the lkdtm driver
> (CONFIG_LKDTM=y) and
> performing the following:
>
>echo ACCESS_USERSPACE > [debugfs]/provoke-crash/DIRECT
>
> if enabled, this should send SIGSEGV to the thread.
>
> Signed-off-by: Russell Currey 

I think this patch should be split in at least two parts:
First part for implementing the generic part, including the changes
to
futex and csum, and a second part implementing the radix part.


I'll see how I go making generic handlers - I am concerned about the
implementation becoming more complex than it needs to be just to
accommodate potential future changes that could end up having different
requirements anyway, rather than something simple that works today.


I think we can do something quite simple. I'll draft something next  
week and send it to get your feedback.





> ---
> Since the previous version of this patchset (named KHRAP) there
> have been
> several changes, some of which include:
>
> - macro naming, suggested by Nick
> - builds should be fixed outside of 64s
> - no longer unlock heading out to userspace
> - removal of unnecessary isyncs
> - more config option testing
> - removal of save/restore
> - use pr_crit() and reword message on fault
>
>  arch/powerpc/include/asm/exception-64e.h |  3 ++
>  arch/powerpc/include/asm/exception-64s.h | 19 +++-
>  arch/powerpc/include/asm/mmu.h   |  7 +++
>  arch/powerpc/include/asm/paca.h  |  3 ++
>  arch/powerpc/include/asm/reg.h   |  1 +
>  arch/powerpc/include/asm/uaccess.h   | 57
> 
>  arch/powerpc/kernel/asm-offsets.c|  1 +
>  arch/powerpc/kernel/dt_cpu_ftrs.c|  4 ++
>  arch/powerpc/kernel/entry_64.S   | 17 ++-
>  arch/powerpc/mm/fault.c  | 12 +
>  arch/powerpc/mm/pgtable-radix.c  |  2 +
>  arch/powerpc/mm/pkeys.c  |  7 ++-
>  arch/powerpc/platforms/Kconfig.cputype   | 15 +++
>  13 files changed, 135 insertions(+), 13 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/exception-64e.h
> b/arch/powerpc/include/asm/exception-64e.h
> index 555e22d5e07f..bf25015834ee 100644
> --- a/arch/powerpc/include/asm/exception-64e.h
> +++ b/arch/powerpc/include/asm/exception-64e.h
> @@ -215,5 +215,8 @@ exc_##label##_book3e:
>  #define RFI_TO_USER
>\
>rfi
>
> +#define UNLOCK_USER_ACCESS(reg)
> +#define LOCK_USER_ACCESS(reg)
> +
>  #endif /* _ASM_POWERPC_EXCEPTION_64E_H */
>
> diff --git a/arch/powerpc/include/asm/exception-64s.h
> b/arch/powerpc/include/asm/exception-64s.h
> index 3b4767ed3ec5..0cac5bd380ca 100644
> --- a/arch/powerpc/include/asm/exception-64s.h
> +++ b/arch/powerpc/include/asm/exception-64s.h
> @@ -264,6 +264,19 @@ BEGIN_FTR_SECTION_NESTED(943)
>\
>std ra,offset(r13); \
>  END_FTR_SECTION_NESTED(ftr,ftr,943)
>
> +#define LOCK_USER_ACCESS(reg)
>\
> +BEGIN_MMU_FTR_SECTION_NESTED(944)
> \
> +  LOAD_REG_IMMEDIATE(reg,AMR_LOCKED); \
> +  mtspr   SPRN_AMR,reg;
> \
> +END_MMU_FTR_SECTION_NESTED(MMU_FTR_RADIX_GUAP,MMU_FTR_RADIX_GUAP,9
> 44)
> +
> +#define UNLOCK_USER_ACCESS(reg)
>\
> +BEGIN_MMU_FTR_SECTION_NESTED(945)
> \
> +  li  reg,0;  \
> +  mtspr   SPRN_AMR,reg;
> \
> +  isync
> \
> +END_MMU_FTR_SECTION_NESTED(MMU_FTR_RADIX_GUAP,MMU_FTR_RADIX_GUAP,9
> 45)
> +
>  #define EXCEPTION_PROLOG_0(area)
> \
>GET_PACA(r13);
> \
>std r9,area+EX_R9(r13); /* save r9 */   \
> @@ -500,7 +513,11 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
>beq 4f; /* if from kernel mode  */
> \
>ACCOUNT_CPU_USER_ENTRY(r13, r9, r10);
>\
>SAVE_PPR(area, r9);
> \
> -4:EXCEPTION_PROLOG_COMMON_2(area)
>\
> +4:lbz

Re: [PATCH 6/9] PCI: consolidate PCI config entry in drivers/pci

Hi Christoph,


On Fri, Oct 19, 2018 at 9:58 PM Masahiro Yamada
 wrote:
>
> On Fri, Oct 19, 2018 at 9:23 PM Russell King - ARM Linux
>  wrote:
>
> > > index a68b34183107..b185794549be 100644
> > > --- a/arch/arm/mach-pxa/Kconfig
> > > +++ b/arch/arm/mach-pxa/Kconfig
> > > @@ -125,7 +125,7 @@ config MACH_ARMCORE
> > >   bool "CompuLab CM-X255/CM-X270 modules"
> > >   select ARCH_HAS_DMA_SET_COHERENT_MASK if PCI
> > >   select IWMMXT
> > > - select MIGHT_HAVE_PCI
> > > + select HAVE_PCI
> > >   select NEED_MACH_IO_H if PCI
> > >   select PXA25x
> > >   select PXA27x
> >
> > This is wrong.  "MIGHT_HAVE_PCI" is _not_ the same as "HAVE_PCI" - we
> > have a bunch of platforms that mandatorily have PCI and these select
> > PCI directly.  "MIGHT_HAVE_PCI" controls the _visibility_ of the PCI
> > menu option, but does not prevent it being selected.  Your patch will
> > cause Kconfig to complain for those which mandatorily have PCI but
> > do not set HAVE_PCI.
>
>
> Good catch!
> But, adding a bunch of 'select HAVE_PCI' along with 'select PCI' is ugly.
>
> Do you have any suggestion?
>
> How about letting CONFIG_ARM to select HAVE_PCI ?
>


I applied 1/9, 3/9, 4/9, 5/9.
(I think 2/9 should be squashed to 9/9)

As Russell pointed out, we need to avoid
the unmet dependency.


Are you planning to send
the updated version for 6/9 through - 9/9 ?

If so, could you please rebase 6/9
so that it is cleanly applicable ?


Thanks.


--
Best Regards
Masahiro Yamada

Re: [PATCH v2 3/3] selftests/powerpc: Skip test instead of failing

2018-10-31 Thread Thiago Jung Bauermann



Breno Leitao  writes:

> Current core-pkey selftest fails if the test runs without privileges to
> write into the core pattern file (/proc/sys/kernel/core_pattern). This
> causes the test to fail and give the impression that the subsystem being
> tested is broken, when, in fact, the test is being executed without the
> proper privileges. This is the current error:
>
>   test: core_pkey
>   tags: git_version:v4.19-3-g9e3363be9bce-dirty
>   Error writing to core_pattern file: Permission denied
>   failure: core_pkey
>
> This patch simply skips this test if it runs without the proper privileges,
> avoiding this undesired failure.
>
> CC: Tyrel Datwyler 
> CC: Thiago Jung Bauermann 
> Signed-off-by: Breno Leitao 

Reviewed-by: Thiago Jung Bauermann 

> ---
>  tools/testing/selftests/powerpc/ptrace/core-pkey.c | 5 +
>  1 file changed, 1 insertion(+), 4 deletions(-)
>
> diff --git a/tools/testing/selftests/powerpc/ptrace/core-pkey.c 
> b/tools/testing/selftests/powerpc/ptrace/core-pkey.c
> index e23e2e199eb4..d5c64fee032d 100644
> --- a/tools/testing/selftests/powerpc/ptrace/core-pkey.c
> +++ b/tools/testing/selftests/powerpc/ptrace/core-pkey.c
> @@ -352,10 +352,7 @@ static int write_core_pattern(const char *core_pattern)
>   FILE *f;
>  
>   f = fopen(core_pattern_file, "w");
> - if (!f) {
> - perror("Error writing to core_pattern file");
> - return TEST_FAIL;
> - }
> + SKIP_IF_MSG(!f, "Try with root privileges");
>  
>   ret = fwrite(core_pattern, 1, len, f);
>   fclose(f);


-- 
Thiago Jung Bauermann
IBM Linux Technology Center

Re: [PATCH v2 2/3] selftests/powerpc: Create a new SKIP_IF macro

2018-10-31 Thread Thiago Jung Bauermann



Breno Leitao  writes:

> This patch creates a new macro that skips a test and prints a message to
> stderr. This is useful to give an idea why the tests is being skipped,
> other than just skipping the test blindly.
>
> Signed-off-by: Breno Leitao 

Reviewed-by: Thiago Jung Bauermann 

> ---
>  tools/testing/selftests/powerpc/include/utils.h | 10 ++
>  1 file changed, 10 insertions(+)
>
> diff --git a/tools/testing/selftests/powerpc/include/utils.h 
> b/tools/testing/selftests/powerpc/include/utils.h
> index da1b963cdb32..486c1782c23e 100644
> --- a/tools/testing/selftests/powerpc/include/utils.h
> +++ b/tools/testing/selftests/powerpc/include/utils.h
> @@ -72,6 +72,16 @@ do {   
> \
>   }   \
>  } while (0)
>  
> +#define SKIP_IF_MSG(x, msg)  \
> +do { \
> + if ((x)) {  \
> + fprintf(stderr, \
> + "[SKIP] Test skipped on line %d: %s\n", \
> +  __LINE__, msg);\
> + return MAGIC_SKIP_RETURN_VALUE; \
> + }   \
> +} while (0)
> +
>  #define _str(s) #s
>  #define str(s) _str(s)


-- 
Thiago Jung Bauermann
IBM Linux Technology Center

Re: [PATCH] selftests: powerpc: Fix warning for security subdir

2018-10-31 Thread Naveen N. Rao


Joel Stanley wrote:

typing 'make' inside tools/testing/selftests/powerpc gave a build
warning:

BUILD_TARGET=tools/testing/selftests/powerpc/security; mkdir -p $BUILD_TARGET; 
make OUTPUT=$BUILD_TARGET -k -C security all
make[1]: Entering directory 'tools/testing/selftests/powerpc/security'
../../lib.mk:20: ../../../../scripts/subarch.include: No such file or directory
make[1]: *** No rule to make target '../../../../scripts/subarch.include'.
make[1]: Failed to remake makefile '../../../../scripts/subarch.include'.

The build is one level deeper than lib.mk thinks it is. Set top_srcdir
to set things straight.

Note that the test program is still built.

Signed-off-by: Joel Stanley 


Thanks.
Tested-by: Naveen N. Rao

Re: [PATCH 5/9] powerpc: PCI_MSI needs PCI

On Sat, Oct 20, 2018 at 12:10 AM Josh Triplett  wrote:
>
> On Fri, Oct 19, 2018 at 02:09:48PM +0200, Christoph Hellwig wrote:
> > Various powerpc boards select the PCI_MSI config option without selecting
> > PCI, resulting in potentially not compilable configurations if the by
> > default enabled PCI option is disabled.  Explicitly select PCI to ensure
> > we always have valid configs.
> [...]
> > --- a/arch/powerpc/platforms/44x/Kconfig
> > +++ b/arch/powerpc/platforms/44x/Kconfig
> > @@ -24,6 +24,7 @@ config BLUESTONE
> >   default n
> >   select PPC44x_SIMPLE
> >   select APM821xx
> > + select PCI
> >   select PCI_MSI
> >   select PPC4xx_MSI
> >   select PPC4xx_PCI_EXPRESS
> > @@ -78,6 +79,7 @@ config KATMAI
> >   select 440SPe
> >   select PCI
> >   select PPC4xx_PCI_EXPRESS
> > + select PCI
> >   select PCI_MSI
>
> This case already had PCI selected a couple of lines above.


Good catch!

I dropped this hunk,
and applied to linux-kbuild.





> >   select PPC4xx_MSI
> >   help
> > @@ -219,6 +221,7 @@ config AKEBONO
> >   select SWIOTLB
> >   select 476FPE
> >   select PPC4xx_PCI_EXPRESS
> > + select PCI
> >   select PCI_MSI
> >   select PPC4xx_HSTA_MSI
> >   select I2C
> > --
> > 2.19.1
> >



-- 
Best Regards
Masahiro Yamada

Re: [PATCH 4/9] powerpc: remove CONFIG_MCA leftovers

On Fri, Oct 19, 2018 at 9:12 PM Christoph Hellwig  wrote:
>
> Signed-off-by: Christoph Hellwig 
> Acked-by: Thomas Gleixner 
> ---

Applied to linux-kbuild.



>  arch/powerpc/Kconfig | 4 
>  drivers/scsi/Kconfig | 6 +++---
>  2 files changed, 3 insertions(+), 7 deletions(-)
>
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index e8c8970248bc..d4e97469a5f0 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -941,10 +941,6 @@ config FSL_GTM
> help
>   Freescale General-purpose Timers support
>
> -# Yes MCA RS/6000s exist but Linux-PPC does not currently support any
> -config MCA
> -   bool
> -
>  # Platforms that what PCI turned unconditionally just do select PCI
>  # in their config node.  Platforms that want to choose at config
>  # time should select PPC_PCI_CHOICE
> diff --git a/drivers/scsi/Kconfig b/drivers/scsi/Kconfig
> index 7c097006c54d..d3734c54aec9 100644
> --- a/drivers/scsi/Kconfig
> +++ b/drivers/scsi/Kconfig
> @@ -535,7 +535,7 @@ config SCSI_HPTIOP
>
>  config SCSI_BUSLOGIC
> tristate "BusLogic SCSI support"
> -   depends on (PCI || ISA || MCA) && SCSI && ISA_DMA_API && VIRT_TO_BUS
> +   depends on (PCI || ISA) && SCSI && ISA_DMA_API && VIRT_TO_BUS
> ---help---
>   This is support for BusLogic MultiMaster and FlashPoint SCSI Host
>   Adapters. Consult the SCSI-HOWTO, available from
> @@ -1142,12 +1142,12 @@ config SCSI_LPFC_DEBUG_FS
>
>  config SCSI_SIM710
> tristate "Simple 53c710 SCSI support (Compaq, NCR machines)"
> -   depends on (EISA || MCA) && SCSI
> +   depends on EISA && SCSI
> select SCSI_SPI_ATTRS
> ---help---
>   This driver is for NCR53c710 based SCSI host adapters.
>
> - It currently supports Compaq EISA cards and NCR MCA cards
> + It currently supports Compaq EISA cards.
>
>  config SCSI_DC395x
> tristate "Tekram DC395(U/UW/F) and DC315(U) SCSI support"
> --
> 2.19.1
>


-- 
Best Regards
Masahiro Yamada

Re: [PATCH 3/9] powerpc: remove CONFIG_PCI_QSPAN

On Fri, Oct 19, 2018 at 9:10 PM Christoph Hellwig  wrote:
>
> This option isn't actually used anywhere.
>
> Signed-off-by: Christoph Hellwig 
> Acked-by: Benjamin Herrenschmidt 
> ---


Applied to linux-kbuild.




>  arch/powerpc/Kconfig | 9 -
>  1 file changed, 9 deletions(-)
>
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index a80669209155..e8c8970248bc 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -955,7 +955,6 @@ config PCI
> bool "PCI support" if PPC_PCI_CHOICE
> default y if !40x && !CPM2 && !PPC_8xx && !PPC_83xx \
> && !PPC_85xx && !PPC_86xx && !GAMECUBE_COMMON
> -   default PCI_QSPAN if PPC_8xx
> select GENERIC_PCI_IOMAP
> help
>   Find out whether your system includes a PCI bus. PCI is the name of
> @@ -969,14 +968,6 @@ config PCI_DOMAINS
>  config PCI_SYSCALL
> def_bool PCI
>
> -config PCI_QSPAN
> -   bool "QSpan PCI"
> -   depends on PPC_8xx
> -   select PPC_I8259
> -   help
> - Say Y here if you have a system based on a Motorola 8xx-series
> - embedded processor with a QSPAN PCI interface, otherwise say N.
> -
>  config PCI_8260
> bool
> depends on PCI && 8260
> --
> 2.19.1
>


-- 
Best Regards
Masahiro Yamada

Re: [PATCH 1/9] aha152x: rename the PCMCIA define

On Fri, Oct 19, 2018 at 9:10 PM Christoph Hellwig  wrote:
>
> We plan to enable building the PCMCIA core and drivers, and the
> non-prefixed PCMCIA name clashes with some arch headers.
>
> Signed-off-by: Christoph Hellwig 
> Acked-by: Thomas Gleixner 
> ---


Applied to linux-kbuild.




>  drivers/scsi/aha152x.c | 14 +++---
>  drivers/scsi/pcmcia/aha152x_core.c |  2 +-
>  2 files changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/scsi/aha152x.c b/drivers/scsi/aha152x.c
> index 4d7b0e0adbf7..301b3cad15f8 100644
> --- a/drivers/scsi/aha152x.c
> +++ b/drivers/scsi/aha152x.c
> @@ -269,7 +269,7 @@ static LIST_HEAD(aha152x_host_list);
>  /* DEFINES */
>
>  /* For PCMCIA cards, always use AUTOCONF */
> -#if defined(PCMCIA) || defined(MODULE)
> +#if defined(AHA152X_PCMCIA) || defined(MODULE)
>  #if !defined(AUTOCONF)
>  #define AUTOCONF
>  #endif
> @@ -297,7 +297,7 @@ CMD_INC_RESID(struct scsi_cmnd *cmd, int inc)
>
>  #define DELAY_DEFAULT 1000
>
> -#if defined(PCMCIA)
> +#if defined(AHA152X_PCMCIA)
>  #define IRQ_MIN 0
>  #define IRQ_MAX 16
>  #else
> @@ -328,7 +328,7 @@ MODULE_AUTHOR("Jürgen Fischer");
>  MODULE_DESCRIPTION(AHA152X_REVID);
>  MODULE_LICENSE("GPL");
>
> -#if !defined(PCMCIA)
> +#if !defined(AHA152X_PCMCIA)
>  #if defined(MODULE)
>  static int io[] = {0, 0};
>  module_param_hw_array(io, int, ioport, NULL, 0);
> @@ -391,7 +391,7 @@ static struct isapnp_device_id id_table[] = {
>  MODULE_DEVICE_TABLE(isapnp, id_table);
>  #endif /* ISAPNP */
>
> -#endif /* !PCMCIA */
> +#endif /* !AHA152X_PCMCIA */
>
>  static struct scsi_host_template aha152x_driver_template;
>
> @@ -863,7 +863,7 @@ void aha152x_release(struct Scsi_Host *shpnt)
> if (shpnt->irq)
> free_irq(shpnt->irq, shpnt);
>
> -#if !defined(PCMCIA)
> +#if !defined(AHA152X_PCMCIA)
> if (shpnt->io_port)
> release_region(shpnt->io_port, IO_RANGE);
>  #endif
> @@ -2924,7 +2924,7 @@ static struct scsi_host_template 
> aha152x_driver_template = {
> .slave_alloc= aha152x_adjust_queue,
>  };
>
> -#if !defined(PCMCIA)
> +#if !defined(AHA152X_PCMCIA)
>  static int setup_count;
>  static struct aha152x_setup setup[2];
>
> @@ -3392,4 +3392,4 @@ static int __init aha152x_setup(char *str)
>  __setup("aha152x=", aha152x_setup);
>  #endif
>
> -#endif /* !PCMCIA */
> +#endif /* !AHA152X_PCMCIA */
> diff --git a/drivers/scsi/pcmcia/aha152x_core.c 
> b/drivers/scsi/pcmcia/aha152x_core.c
> index dba3716511c5..24b89228b241 100644
> --- a/drivers/scsi/pcmcia/aha152x_core.c
> +++ b/drivers/scsi/pcmcia/aha152x_core.c
> @@ -1,3 +1,3 @@
> -#define PCMCIA 1
> +#define AHA152X_PCMCIA 1
>  #define AHA152X_STAT 1
>  #include "aha152x.c"
> --
> 2.19.1
>


-- 
Best Regards
Masahiro Yamada

Re: [PATCH 3/9] powerpc/mm: Remove extern from function definition

Hi Christophe,

On 10/24/18 12:12 PM, LEROY Christophe wrote:
> Breno Leitao  a écrit :
> 
>> hi Christophe,
>>
>> On 10/23/2018 12:38 PM, LEROY Christophe wrote:
>>> Breno Leitao  a écrit :

 This patch removes the keyword from the definition part, while keeps
 it in
 the declaration part.
>>>
>>> I think checkpatch also says that extern should be avoided in
>>> declarations.
>>
>> Thanks for the review. I tried to look at this complain, but I didn't see
>> this behavior on checkpatch.pl from kernel 4.19. I created a commit
>> that adds
>> a new extern prototype and checked the patch. Take a look:
> 
> Use option --strict with checkpatch.pl

Thanks. I fixed this patch and resent a v2 with this fix.

Michael,

If you happen to accept this series, please ignore this patch (3/9) and
use a v2 as its replacement:

https://patchwork.ozlabs.org/patch/991444/

[PATCH v2 3/3] selftests/powerpc: Skip test instead of failing

Current core-pkey selftest fails if the test runs without privileges to
write into the core pattern file (/proc/sys/kernel/core_pattern). This
causes the test to fail and give the impression that the subsystem being
tested is broken, when, in fact, the test is being executed without the
proper privileges. This is the current error:

test: core_pkey
tags: git_version:v4.19-3-g9e3363be9bce-dirty
Error writing to core_pattern file: Permission denied
failure: core_pkey

This patch simply skips this test if it runs without the proper privileges,
avoiding this undesired failure.

CC: Tyrel Datwyler 
CC: Thiago Jung Bauermann 
Signed-off-by: Breno Leitao 
---
 tools/testing/selftests/powerpc/ptrace/core-pkey.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/tools/testing/selftests/powerpc/ptrace/core-pkey.c 
b/tools/testing/selftests/powerpc/ptrace/core-pkey.c
index e23e2e199eb4..d5c64fee032d 100644
--- a/tools/testing/selftests/powerpc/ptrace/core-pkey.c
+++ b/tools/testing/selftests/powerpc/ptrace/core-pkey.c
@@ -352,10 +352,7 @@ static int write_core_pattern(const char *core_pattern)
FILE *f;
 
f = fopen(core_pattern_file, "w");
-   if (!f) {
-   perror("Error writing to core_pattern file");
-   return TEST_FAIL;
-   }
+   SKIP_IF_MSG(!f, "Try with root privileges");
 
ret = fwrite(core_pattern, 1, len, f);
fclose(f);
-- 
2.13.2

[PATCH v2 2/3] selftests/powerpc: Create a new SKIP_IF macro

This patch creates a new macro that skips a test and prints a message to
stderr. This is useful to give an idea why the tests is being skipped,
other than just skipping the test blindly.

Signed-off-by: Breno Leitao 
---
 tools/testing/selftests/powerpc/include/utils.h | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/tools/testing/selftests/powerpc/include/utils.h 
b/tools/testing/selftests/powerpc/include/utils.h
index da1b963cdb32..486c1782c23e 100644
--- a/tools/testing/selftests/powerpc/include/utils.h
+++ b/tools/testing/selftests/powerpc/include/utils.h
@@ -72,6 +72,16 @@ do { 
\
}   \
 } while (0)
 
+#define SKIP_IF_MSG(x, msg)\
+do {   \
+   if ((x)) {  \
+   fprintf(stderr, \
+   "[SKIP] Test skipped on line %d: %s\n", \
+__LINE__, msg);\
+   return MAGIC_SKIP_RETURN_VALUE; \
+   }   \
+} while (0)
+
 #define _str(s) #s
 #define str(s) _str(s)
 
-- 
2.13.2

[PATCH v2 1/3] selftests/powerpc: Allocate base registers

Some ptrace selftests are passing input operands using a constraint that
can allocate any register for the operand, and using these registers on
load/store operations.

If the register allocated by the compiler happens to be zero (r0), it might
cause an invalid memory address access, since load and store operations
consider the content of 0x0 address if the base register is r0, instead of
the content of the r0 register. For example:

r1 := 0xdeadbeef
r0 := 0xdeadbeef

ld r2, 0(1) /* will load into r2 the content of r1 address */
ld r2, 0(0) /* will load into r2 the content of 0x0 */

In order to avoid this possible problem, the inline assembly constraint
should be aware that these registers will be used as a base register, thus,
r0 should not be allocated.

Other than that, this patch removes inline assembly operands that are not
used by the tests.

Signed-off-by: Breno Leitao 
Reviewed-by: Segher Boessenkool 
---
 tools/testing/selftests/powerpc/ptrace/ptrace-gpr.c| 2 +-
 tools/testing/selftests/powerpc/ptrace/ptrace-tm-gpr.c | 4 ++--
 tools/testing/selftests/powerpc/ptrace/ptrace-tm-spd-tar.c | 2 +-
 tools/testing/selftests/powerpc/ptrace/ptrace-tm-spd-vsx.c | 3 +--
 tools/testing/selftests/powerpc/ptrace/ptrace-tm-spr.c | 2 +-
 tools/testing/selftests/powerpc/ptrace/ptrace-tm-tar.c | 2 +-
 tools/testing/selftests/powerpc/ptrace/ptrace-tm-vsx.c | 3 +--
 7 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/tools/testing/selftests/powerpc/ptrace/ptrace-gpr.c 
b/tools/testing/selftests/powerpc/ptrace/ptrace-gpr.c
index 0b4ebcc2f485..ca29fafeed5d 100644
--- a/tools/testing/selftests/powerpc/ptrace/ptrace-gpr.c
+++ b/tools/testing/selftests/powerpc/ptrace/ptrace-gpr.c
@@ -31,7 +31,7 @@ void gpr(void)
ASM_LOAD_GPR_IMMED(gpr_1)
ASM_LOAD_FPR_SINGLE_PRECISION(flt_1)
:
-   : [gpr_1]"i"(GPR_1), [flt_1] "r" ()
+   : [gpr_1]"i"(GPR_1), [flt_1] "b" ()
: "memory", "r6", "r7", "r8", "r9", "r10",
"r11", "r12", "r13", "r14", "r15", "r16", "r17",
"r18", "r19", "r20", "r21", "r22", "r23", "r24",
diff --git a/tools/testing/selftests/powerpc/ptrace/ptrace-tm-gpr.c 
b/tools/testing/selftests/powerpc/ptrace/ptrace-tm-gpr.c
index 59206b96e98a..a08a91594dbe 100644
--- a/tools/testing/selftests/powerpc/ptrace/ptrace-tm-gpr.c
+++ b/tools/testing/selftests/powerpc/ptrace/ptrace-tm-gpr.c
@@ -59,8 +59,8 @@ void tm_gpr(void)
"3: ;"
: [res] "=r" (result), [texasr] "=r" (texasr)
: [gpr_1]"i"(GPR_1), [gpr_2]"i"(GPR_2),
-   [sprn_texasr] "i" (SPRN_TEXASR), [flt_1] "r" (),
-   [flt_2] "r" (), [cptr1] "r" ([1])
+   [sprn_texasr] "i" (SPRN_TEXASR), [flt_1] "b" (),
+   [flt_2] "b" (), [cptr1] "b" ([1])
: "memory", "r7", "r8", "r9", "r10",
"r11", "r12", "r13", "r14", "r15", "r16",
"r17", "r18", "r19", "r20", "r21", "r22",
diff --git a/tools/testing/selftests/powerpc/ptrace/ptrace-tm-spd-tar.c 
b/tools/testing/selftests/powerpc/ptrace/ptrace-tm-spd-tar.c
index b3c061dc9512..f47174746231 100644
--- a/tools/testing/selftests/powerpc/ptrace/ptrace-tm-spd-tar.c
+++ b/tools/testing/selftests/powerpc/ptrace/ptrace-tm-spd-tar.c
@@ -72,7 +72,7 @@ void tm_spd_tar(void)
"3: ;"
 
: [res] "=r" (result), [texasr] "=r" (texasr)
-   : [val] "r" (cptr[1]), [sprn_dscr]"i"(SPRN_DSCR),
+   : [sprn_dscr]"i"(SPRN_DSCR),
[sprn_tar]"i"(SPRN_TAR), [sprn_ppr]"i"(SPRN_PPR),
[sprn_texasr]"i"(SPRN_TEXASR), [tar_1]"i"(TAR_1),
[dscr_1]"i"(DSCR_1), [tar_2]"i"(TAR_2), [dscr_2]"i"(DSCR_2),
diff --git a/tools/testing/selftests/powerpc/ptrace/ptrace-tm-spd-vsx.c 
b/tools/testing/selftests/powerpc/ptrace/ptrace-tm-spd-vsx.c
index 277dade1b382..18a685bf6a09 100644
--- a/tools/testing/selftests/powerpc/ptrace/ptrace-tm-spd-vsx.c
+++ b/tools/testing/selftests/powerpc/ptrace/ptrace-tm-spd-vsx.c
@@ -77,8 +77,7 @@ void tm_spd_vsx(void)
 
"3: ;"
: [res] "=r" (result), [texasr] "=r" (texasr)
-   : [fp_load] "r" (fp_load), [fp_load_ckpt] "r" (fp_load_ckpt),
-   [sprn_texasr] "i"  (SPRN_TEXASR)
+   : [sprn_texasr] "i"  (SPRN_TEXASR)
: "memory", "r0", "r1", "r3", "r4",
"r7", "r8", "r9", "r10", "r11"
);
diff --git a/tools/testing/selftests/powerpc/ptrace/ptrace-tm-spr.c 
b/tools/testing/selftests/powerpc/ptrace/ptrace-tm-spr.c
index 51427a2465f6..ba04999254e3 100644
--- a/tools/testing/selftests/powerpc/ptrace/ptrace-tm-spr.c
+++ b/tools/testing/selftests/powerpc/ptrace/ptrace-tm-spr.c
@@ -74,7 +74,7 @@ void tm_spr(void)
 
"3: ;"
: [tfhar] "=r" (tfhar), [res] "=r" (result),
-   [texasr] "=r" (texasr),

[PATCH v2] powerpc/mm: Remove extern from function definition

Function huge_ptep_set_access_flags() has the 'extern' keyword in the
function definition and also in the function declaration. This causes a
warning in 'sparse' since the 'extern' storage class should not be used
in the function definition.

arch/powerpc/mm/pgtable.c:232:12: warning: function 
'huge_ptep_set_access_flags' with external linkage has definition

This patch removes the keyword from the definition part. It also removes
the extern keyword from the declaration part, since checkpatch --strict
complains about it.

Suggested-by: Christophe Leroy 
Signed-off-by: Breno Leitao 
---
 arch/powerpc/include/asm/hugetlb.h | 6 +++---
 arch/powerpc/mm/pgtable.c  | 6 +++---
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/asm/hugetlb.h 
b/arch/powerpc/include/asm/hugetlb.h
index 383da1ab9e23..98004262bc87 100644
--- a/arch/powerpc/include/asm/hugetlb.h
+++ b/arch/powerpc/include/asm/hugetlb.h
@@ -135,9 +135,9 @@ static inline void huge_ptep_clear_flush(struct 
vm_area_struct *vma,
 }
 
 #define __HAVE_ARCH_HUGE_PTEP_SET_ACCESS_FLAGS
-extern int huge_ptep_set_access_flags(struct vm_area_struct *vma,
- unsigned long addr, pte_t *ptep,
- pte_t pte, int dirty);
+int huge_ptep_set_access_flags(struct vm_area_struct *vma,
+  unsigned long addr, pte_t *ptep,
+  pte_t pte, int dirty);
 
 static inline void arch_clear_hugepage_flags(struct page *page)
 {
diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c
index 010e1c616cb2..1e33dccbd176 100644
--- a/arch/powerpc/mm/pgtable.c
+++ b/arch/powerpc/mm/pgtable.c
@@ -221,9 +221,9 @@ int ptep_set_access_flags(struct vm_area_struct *vma, 
unsigned long address,
 }
 
 #ifdef CONFIG_HUGETLB_PAGE
-extern int huge_ptep_set_access_flags(struct vm_area_struct *vma,
- unsigned long addr, pte_t *ptep,
- pte_t pte, int dirty)
+int huge_ptep_set_access_flags(struct vm_area_struct *vma,
+  unsigned long addr, pte_t *ptep,
+  pte_t pte, int dirty)
 {
 #ifdef HUGETLB_NEED_PRELOAD
/*
-- 
2.13.2

Re: [PATCH v2 2/2] kgdb: Fix kgdb_roundup_cpus() for arches who used smp_call_function()

2018-10-31 Thread Peter Zijlstra

On Tue, Oct 30, 2018 at 03:18:43PM -0700, Douglas Anderson wrote:
> Looking closely at it, it seems like a really bad idea to be calling
> local_irq_enable() in kgdb_roundup_cpus().  If nothing else that seems
> like it could violate spinlock semantics and cause a deadlock.
> 
> Instead, let's use a private csd alongside
> smp_call_function_single_async() to round up the other CPUs.  Using
> smp_call_function_single_async() doesn't require interrupts to be
> enabled so we can remove the offending bit of code.

You might want to mention that the only reason this isn't a deadlock
itself is because there is a timeout on waiting for the slaves to
check-in.

> --- a/kernel/debug/debug_core.c
> +++ b/kernel/debug/debug_core.c
> @@ -55,6 +55,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  #include 
> @@ -220,6 +221,39 @@ int __weak kgdb_skipexception(int exception, struct 
> pt_regs *regs)
>   return 0;
>  }
>  
> +/*
> + * Default (weak) implementation for kgdb_roundup_cpus
> + */
> +
> +static DEFINE_PER_CPU(call_single_data_t, kgdb_roundup_csd);
> +
> +void __weak kgdb_call_nmi_hook(void *ignored)
> +{
> + kgdb_nmicallback(raw_smp_processor_id(), get_irq_regs());
> +}
> +
> +void __weak kgdb_roundup_cpus(void)
> +{
> + call_single_data_t *csd;
> + int cpu;
> +
> + for_each_cpu(cpu, cpu_online_mask) {
> + csd = _cpu(kgdb_roundup_csd, cpu);
> + smp_call_function_single_async(cpu, csd);
> + }
> +}
> +
> +static void kgdb_generic_roundup_init(void)
> +{
> + call_single_data_t *csd;
> + int cpu;
> +
> + for_each_possible_cpu(cpu) {
> + csd = _cpu(kgdb_roundup_csd, cpu);
> + csd->func = kgdb_call_nmi_hook;
> + }
> +}
> +
>  /*
>   * Some architectures need cache flushes when we set/clear a
>   * breakpoint:
> @@ -993,6 +1027,8 @@ int kgdb_register_io_module(struct kgdb_io 
> *new_dbg_io_ops)
>   return -EBUSY;
>   }
>  
> + kgdb_generic_roundup_init();
> +
>   if (new_dbg_io_ops->init) {
>   err = new_dbg_io_ops->init();
>   if (err) {

That's a little bit of overhead for those architectures not using that
generic code; but I suppose we can live with that.

Re: NXP P50XX/e5500: SMP doesn't work anymore with the latest Git kernel

2018-10-31 Thread Christian Zigotzky

Hi Michael,

Many thanks for this good explanation. I will try to learn more about
bisecting. Sometimes the problem is the time. I usually work for a Linux
first level support for our AmigaOne machines. That means my main work
is end user support. Therefore I have to learn more about second and
third level Linux support.

Cheers,
Christian

On 31 October 2018 at 2:20PM, Michael Ellerman wrote:

Christian Zigotzky writes:

Little progress ...

I reverted the following two OF files of the commit 'Merge tag
devicetree-for-4.20' and SMP works! The problematic code is somewhere in
these two files.

a/include/linux/of.h
a/drivers/of/base.c

Hi Christian,

Trying to debug things by reverting like this can work, but it's quite
error prone and is usually only used *after* a bisect has identified the
suspect code, or if a bisect can't work for some reason.

I know you said you'd had trouble bisecting in the past, but this one
should be a good one to practice on.

You already identified that the merge of the devicetree changes was the
problem, ie.

b27186abb37b Merge tag 'devicetree-for-4.20' of
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux

So you do:
$ git show b27186abb37b
commit b27186abb37b7bd19e0ca434f4f425c807dbd708
Merge: 0ef7791e2bfb d061864b89c3
Author: Linus Torvalds
Date: Fri Oct 26 12:09:58 2018 -0700

Merge tag 'devicetree-for-4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux

And that shows you the two commits that were merged 0ef7791e2bfb and
d061864b89c3. If you look at them you see:

$ git log -1 --oneline 0ef7791e2bfb
0ef7791e2bfb Merge branch 'linus' of
git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal

$ git log -1 --oneline d061864b89c3

d061864b89c3 ARM: dt: relicense two DT binding IRQ headers

You can see that the first one is the previous commit on Linus' branch,
ie. an unrelated merge. The 2nd commit is the commit that was on top of
robh's tree, ie. that's the start of the interesting commits for us.

You can also get to that 2nd commit using b27186abb37b^2.

If you look at what came in via Rob's branch with:

$ git log --oneline d061864b89c3
or
$ git log --oneline b27186abb37b^2

You see there's quite a few commits, and in particular there's another
merge:

389d0a8a7af8 Merge branch 'dt/cpu-type-rework' into dt/next

If we log the 2nd parent of that, we see:

$ git log --oneline 389d0a8a7af8^2
4c29e5934f6c microblaze: get cpu node with of_get_cpu_node
a691240e36e3 fbdev: fsl-diu: get cpu node with of_get_cpu_node
651d44f9679c of: use for_each_of_cpu_node iterator
a9a455e854cd iommu: fsl_pamu: use for_each_of_cpu_node iterator
37dc218bed44 edac: cpc925: use for_each_of_cpu_node iterator
76ec23b127cd clk: mvebu: use for_each_of_cpu_node iterator
7de8f4aa2f35 x86: DT: use for_each_of_cpu_node iterator
8cabf5bc1049 SH: use for_each_of_cpu_node iterator
38959a091e4a powerpc: 8xx: get cpu node with of_get_cpu_node
84dbc69a2ff3 powerpc: 4xx: get cpu node with of_get_cpu_node
a94fe366340a powerpc: use for_each_of_cpu_node iterator
5e5abae858b5 openrisc: use for_each_of_cpu_node iterator
1f0fe1f67cef nios2: get cpu node with of_get_cpu_node
5a931a3c80b5 c6x: use for_each_of_cpu_node iterator
de76e70a8d4e arm64: use for_each_of_cpu_node iterator
5af5d40c4015 ARM: shmobile: use for_each_of_cpu_node iterator
07d44f1f82b7 ARM: topology: remove unneeded check for /cpus node
d4866f751edf ARM: use for_each_of_cpu_node iterator
6487c15f1cc9 of: Support matching cpu nodes with no 'reg' property
f1f207e43b8a of: Add cpu node iterator for_each_of_cpu_node()
f6707fd6241e of: make PowerMac cache node search conditional on
CONFIG_PPC_PMAC
6d0a70a284be vsprintf: print OF node name using full_name
a613b26a5013 of: Convert to using %pOFn instead of device_node.name
6901378c799d of/unittest: add printf tests for node name
b610e2ff4622 of/unittest: remove use of node name pointer in overlay high
level test
57361846b52b (tag: v4.19-rc2) Linux 4.19-rc2

So if we think the suspect commit is in there, we would confirm that by
checking out v4.19-rc2 and testing it works. And then checkout out
4c29e5934f6c and testing that it's broken.

Assuming the former worked and the latter was broken, we do:

$ git bisect good v4.19-rc2
$ git bisect bad 4c29e5934f6c

And then just follow the prompts.

One thing to watch out for is hitting an unrelated bug, that can
sometimes derail your bisection.

In this case the bug we're looking for is that CPU 1 isn't onlined
properly. But if the system doesn't boot entirely for example then you
shouldn't mark the commit as bad, instead it's better to skip it. Then
git will choose a different commit for you to test.

Anyway hope that helps.

cheers

On 29 October 2018 at 6:00PM, Christian Zigotzky wrote:

Hello,

I figured out that the problem is in the OF source code of the commit:
Merge tag devicetree-for-4.20.

Re: NXP P50XX/e5500: SMP doesn't work anymore with the latest Git kernel

Christian Zigotzky  writes:

> Little progress ...
>
> I reverted the following two OF files of the commit 'Merge tag 
> devicetree-for-4.20' and SMP works! The problematic code is somewhere in 
> these two files.
>
> a/include/linux/of.h
> a/drivers/of/base.c

Hi Christian,

Trying to debug things by reverting like this can work, but it's quite
error prone and is usually only used *after* a bisect has identified the
suspect code, or if a bisect can't work for some reason.

I know you said you'd had trouble bisecting in the past, but this one
should be a good one to practice on.

You already identified that the merge of the devicetree changes was the
problem, ie. 

  b27186abb37b Merge tag 'devicetree-for-4.20' of 
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux

So you do:
  $ git show b27186abb37b 
  commit b27186abb37b7bd19e0ca434f4f425c807dbd708
  Merge: 0ef7791e2bfb d061864b89c3
  Author: Linus Torvalds 
  Date:   Fri Oct 26 12:09:58 2018 -0700

  Merge tag 'devicetree-for-4.20' of 
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux

And that shows you the two commits that were merged 0ef7791e2bfb and
d061864b89c3. If you look at them you see:

  $ git log -1 --oneline 0ef7791e2bfb
  0ef7791e2bfb Merge branch 'linus' of 
git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal

  $ git log -1 --oneline d061864b89c3
  d061864b89c3 ARM: dt: relicense two DT binding IRQ headers

You can see that the first one is the previous commit on Linus' branch,
ie. an unrelated merge. The 2nd commit is the commit that was on top of
robh's tree, ie. that's the start of the interesting commits for us.

You can also get to that 2nd commit using b27186abb37b^2.

If you look at what came in via Rob's branch with:

  $ git log --oneline d061864b89c3
  or
  $ git log --oneline b27186abb37b^2

You see there's quite a few commits, and in particular there's another
merge:

  389d0a8a7af8 Merge branch 'dt/cpu-type-rework' into dt/next

If we log the 2nd parent of that, we see:

 $ git log --oneline 389d0a8a7af8^2
 4c29e5934f6c microblaze: get cpu node with of_get_cpu_node
 a691240e36e3 fbdev: fsl-diu: get cpu node with of_get_cpu_node
 651d44f9679c of: use for_each_of_cpu_node iterator
 a9a455e854cd iommu: fsl_pamu: use for_each_of_cpu_node iterator
 37dc218bed44 edac: cpc925: use for_each_of_cpu_node iterator
 76ec23b127cd clk: mvebu: use for_each_of_cpu_node iterator
 7de8f4aa2f35 x86: DT: use for_each_of_cpu_node iterator
 8cabf5bc1049 SH: use for_each_of_cpu_node iterator
 38959a091e4a powerpc: 8xx: get cpu node with of_get_cpu_node
 84dbc69a2ff3 powerpc: 4xx: get cpu node with of_get_cpu_node
 a94fe366340a powerpc: use for_each_of_cpu_node iterator
 5e5abae858b5 openrisc: use for_each_of_cpu_node iterator
 1f0fe1f67cef nios2: get cpu node with of_get_cpu_node
 5a931a3c80b5 c6x: use for_each_of_cpu_node iterator
 de76e70a8d4e arm64: use for_each_of_cpu_node iterator
 5af5d40c4015 ARM: shmobile: use for_each_of_cpu_node iterator
 07d44f1f82b7 ARM: topology: remove unneeded check for /cpus node
 d4866f751edf ARM: use for_each_of_cpu_node iterator
 6487c15f1cc9 of: Support matching cpu nodes with no 'reg' property
 f1f207e43b8a of: Add cpu node iterator for_each_of_cpu_node()
 f6707fd6241e of: make PowerMac cache node search conditional on CONFIG_PPC_PMAC
 6d0a70a284be vsprintf: print OF node name using full_name
 a613b26a5013 of: Convert to using %pOFn instead of device_node.name
 6901378c799d of/unittest: add printf tests for node name
 b610e2ff4622 of/unittest: remove use of node name pointer in overlay high 
level test
 57361846b52b (tag: v4.19-rc2) Linux 4.19-rc2

So if we think the suspect commit is in there, we would confirm that by
checking out v4.19-rc2 and testing it works. And then checkout out
4c29e5934f6c and testing that it's broken.

Assuming the former worked and the latter was broken, we do:

 $ git bisect good v4.19-rc2
 $ git bisect bad 4c29e5934f6c 

And then just follow the prompts.

One thing to watch out for is hitting an unrelated bug, that can
sometimes derail your bisection.

In this case the bug we're looking for is that CPU 1 isn't onlined
properly. But if the system doesn't boot entirely for example then you
shouldn't mark the commit as bad, instead it's better to skip it. Then
git will choose a different commit for you to test.

Anyway hope that helps.

cheers

> On 29 October 2018 at 6:00PM, Christian Zigotzky wrote:
>> Hello,
>>
>> I figured out that the problem is in the OF source code of the commit: 
>> Merge tag devicetree-for-4.20. [1]
>>
>> I reverted the following OF files and SMP works!
>>
>> drivers/of/base.c
>> drivers/of/device.c
>> drivers/of/of_mdio.c
>> drivers/of/of_numa.c
>> drivers/of/of_private.h
>> drivers/of/overlay.c
>> drivers/of/platform.c
>> drivers/of/unittest-data/overlay_15.dts
>> drivers/of/unittest-data/tests-overlay.dtsi
>> drivers/of/unittest.c
>> include/linux/of.h
>>
>> Cheers,
>> Christian
>>
>> [1] 
>>

[PATCH] powerpc/32: Add .data..Lubsan_data/.data..Lubsan_type sections explicitly

2018-10-31 Thread Mathieu Malaterre

When both `CONFIG_LD_DEAD_CODE_DATA_ELIMINATION=y` and `CONFIG_UBSAN=y`
are set, link step typically produce numberous warnings about orphan
section:

  + powerpc-linux-gnu-ld -EB -m elf32ppc -Bstatic --orphan-handling=warn 
--build-id --gc-sections -X -o .tmp_vmlinux1 -T 
./arch/powerpc/kernel/vmlinux.lds --who
  le-archive built-in.a --no-whole-archive --start-group lib/lib.a --end-group
  powerpc-linux-gnu-ld: warning: orphan section `.data..Lubsan_data393' from 
`init/main.o' being placed in section `.data..Lubsan_data393'.
  powerpc-linux-gnu-ld: warning: orphan section `.data..Lubsan_data394' from 
`init/main.o' being placed in section `.data..Lubsan_data394'.
  ...
  powerpc-linux-gnu-ld: warning: orphan section `.data..Lubsan_type11' from 
`init/main.o' being placed in section `.data..Lubsan_type11'.
  powerpc-linux-gnu-ld: warning: orphan section `.data..Lubsan_type12' from 
`init/main.o' being placed in section `.data..Lubsan_type12'.
  ...

This commit remove those warnings produced at W=1.

Link: https://www.mail-archive.com/linuxppc-dev@lists.ozlabs.org/msg135407.html
Suggested-by: Nicholas Piggin 
Signed-off-by: Mathieu Malaterre 
---
 arch/powerpc/kernel/vmlinux.lds.S | 4 
 1 file changed, 4 insertions(+)

diff --git a/arch/powerpc/kernel/vmlinux.lds.S 
b/arch/powerpc/kernel/vmlinux.lds.S
index 434581bcd5b4..1148c3c60c3b 100644
--- a/arch/powerpc/kernel/vmlinux.lds.S
+++ b/arch/powerpc/kernel/vmlinux.lds.S
@@ -308,6 +308,10 @@ SECTIONS
 #ifdef CONFIG_PPC32
.data : AT(ADDR(.data) - LOAD_OFFSET) {
DATA_DATA
+#ifdef CONFIG_UBSAN
+   *(.data..Lubsan_data*)
+   *(.data..Lubsan_type*)
+#endif
*(.data.rel*)
*(SDATA_MAIN)
*(.sdata2)
-- 
2.11.0

[PATCH] powerpc/mm: remove const type qualifier from function ‘pud_pfn’

2018-10-31 Thread Mathieu Malaterre

Type qualifier on return type is ignored. Remove warning in W=1:

  arch/powerpc/include/asm/book3s/64/pgtable.h:1268:25: error: type qualifiers 
ignored on function return type [-Werror=ignored-qualifiers]

Signed-off-by: Mathieu Malaterre 
---
 arch/powerpc/include/asm/book3s/64/pgtable.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h 
b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 6c99e846a8c9..2e6ada28da64 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -1304,7 +1304,7 @@ static inline int pgd_devmap(pgd_t pgd)
 }
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 
-static inline const int pud_pfn(pud_t pud)
+static inline int pud_pfn(pud_t pud)
 {
/*
 * Currently all calls to pud_pfn() are gated around a pud_devmap()
-- 
2.11.0

RE: [PATCH 5/6] pci: layerscape: Add the EP mode support.

2018-10-31 Thread Xiaowei Bao



-Original Message-
From: Kishon Vijay Abraham I  
Sent: 2018年10月31日 12:15
To: Xiaowei Bao ; bhelg...@google.com; robh...@kernel.org; 
mark.rutl...@arm.com; shawn...@kernel.org; Leo Li ; 
lorenzo.pieral...@arm.com; a...@arndb.de; gre...@linuxfoundation.org; M.h. Lian 
; Mingkai Hu ; Roy Zang 
; kstew...@linuxfoundation.org; 
cyrille.pitc...@free-electrons.com; pombreda...@nexb.com; 
shawn@rock-chips.com; niklas.cas...@axis.com; linux-...@vger.kernel.org; 
devicet...@vger.kernel.org; linux-ker...@vger.kernel.org; 
linux-arm-ker...@lists.infradead.org; linuxppc-dev@lists.ozlabs.org
Cc: Jiafei Pan 
Subject: Re: [PATCH 5/6] pci: layerscape: Add the EP mode support.

Hi,

On 31/10/18 8:03 AM, Xiaowei Bao wrote:
> 
> 
> -Original Message-
> From: Xiaowei Bao
> Sent: 2018年10月26日 17:19
> To: 'Kishon Vijay Abraham I' ; bhelg...@google.com; 
> robh...@kernel.org; mark.rutl...@arm.com; shawn...@kernel.org; Leo Li 
> ; lorenzo.pieral...@arm.com; a...@arndb.de; 
> gre...@linuxfoundation.org; M.h. Lian ; Mingkai 
> Hu ; Roy Zang ; 
> kstew...@linuxfoundation.org; cyrille.pitc...@free-electrons.com; 
> pombreda...@nexb.com; shawn@rock-chips.com; 
> niklas.cas...@axis.com; linux-...@vger.kernel.org; 
> devicet...@vger.kernel.org; linux-ker...@vger.kernel.org; 
> linux-arm-ker...@lists.infradead.org; linuxppc-dev@lists.ozlabs.org
> Cc: Jiafei Pan 
> Subject: RE: [PATCH 5/6] pci: layerscape: Add the EP mode support.
> 
> 
> 
> -Original Message-
> From: Kishon Vijay Abraham I 
> Sent: 2018年10月26日 13:29
> To: Xiaowei Bao ; bhelg...@google.com; 
> robh...@kernel.org; mark.rutl...@arm.com; shawn...@kernel.org; Leo Li 
> ; lorenzo.pieral...@arm.com; a...@arndb.de; 
> gre...@linuxfoundation.org; M.h. Lian ; Mingkai 
> Hu ; Roy Zang ; 
> kstew...@linuxfoundation.org; cyrille.pitc...@free-electrons.com; 
> pombreda...@nexb.com; shawn@rock-chips.com; 
> niklas.cas...@axis.com; linux-...@vger.kernel.org; 
> devicet...@vger.kernel.org; linux-ker...@vger.kernel.org; 
> linux-arm-ker...@lists.infradead.org; linuxppc-dev@lists.ozlabs.org
> Subject: Re: [PATCH 5/6] pci: layerscape: Add the EP mode support.
> 
> Hi,
> 
> On Thursday 25 October 2018 04:39 PM, Xiaowei Bao wrote:
>> Add the PCIe EP mode support for layerscape platform.
>>
>> Signed-off-by: Xiaowei Bao 
>> ---
>>  drivers/pci/controller/dwc/Makefile|2 +-
>>  drivers/pci/controller/dwc/pci-layerscape-ep.c |  161
>> 
>>  2 files changed, 162 insertions(+), 1 deletions(-)  create mode
>> 100644 drivers/pci/controller/dwc/pci-layerscape-ep.c
>>
>> diff --git a/drivers/pci/controller/dwc/Makefile
>> b/drivers/pci/controller/dwc/Makefile
>> index 5d2ce72..b26d617 100644
>> --- a/drivers/pci/controller/dwc/Makefile
>> +++ b/drivers/pci/controller/dwc/Makefile
>> @@ -8,7 +8,7 @@ obj-$(CONFIG_PCI_EXYNOS) += pci-exynos.o
>>  obj-$(CONFIG_PCI_IMX6) += pci-imx6.o
>>  obj-$(CONFIG_PCIE_SPEAR13XX) += pcie-spear13xx.o
>>  obj-$(CONFIG_PCI_KEYSTONE) += pci-keystone-dw.o pci-keystone.o
>> -obj-$(CONFIG_PCI_LAYERSCAPE) += pci-layerscape.o
>> +obj-$(CONFIG_PCI_LAYERSCAPE) += pci-layerscape.o pci-layerscape-ep.o
>>  obj-$(CONFIG_PCIE_QCOM) += pcie-qcom.o
>>  obj-$(CONFIG_PCIE_ARMADA_8K) += pcie-armada8k.o
>>  obj-$(CONFIG_PCIE_ARTPEC6) += pcie-artpec6.o diff --git 
>> a/drivers/pci/controller/dwc/pci-layerscape-ep.c
>> b/drivers/pci/controller/dwc/pci-layerscape-ep.c
>> new file mode 100644
>> index 000..3b33bbc
>> --- /dev/null
>> +++ b/drivers/pci/controller/dwc/pci-layerscape-ep.c
>> @@ -0,0 +1,161 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * PCIe controller EP driver for Freescale Layerscape SoCs
>> + *
>> + * Copyright (C) 2018 NXP Semiconductor.
>> + *
>> + * Author: Xiaowei Bao   */
>> +
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +
>> +#include "pcie-designware.h"
>> +
>> +#define PCIE_DBI2_OFFSET0x1000  /* DBI2 base address*/
> 
> The base address should come from dt.
>> +
>> +struct ls_pcie_ep {
>> +struct dw_pcie  *pci;
>> +};
>> +
>> +#define to_ls_pcie_ep(x)dev_get_drvdata((x)->dev)
>> +
>> +static bool ls_pcie_is_bridge(struct ls_pcie_ep *pcie) {
>> +struct dw_pcie *pci = pcie->pci;
>> +u32 header_type;
>> +
>> +header_type = ioread8(pci->dbi_base + PCI_HEADER_TYPE);
>> +header_type &= 0x7f;
>> +
>> +return header_type == PCI_HEADER_TYPE_BRIDGE; }
>> +
>> +static int ls_pcie_establish_link(struct dw_pcie *pci) {
>> +return 0;
>> +}
> 
> There should be some way by which EP should tell RC that it is not configured 
> yet. Are there no bits to control LTSSM state initialization or Configuration 
> retry status enabling?
> [Xiaowei Bao] There have not bits to control LTSSM state to tell the RC it is 
> configured. The start link is auto completed.
> [Xiaowei Bao] Hi Kishon, is there any advice?

If there is no HW support, I don't think anything could be done here. This

Re: arch/powerpc/kvm/trace.h:9:0: error: "TRACE_INCLUDE_PATH" redefined

2018-10-31 Thread Christian Zigotzky


Hello,

I compiled the latest Git kernel today. The error 'TRACE_INCLUDE_PATH 
redefined' still exist.


Cheers,
Christian


On 29 October 2018 at 11:22AM, Christian Zigotzky wrote:

Hello,

The latest Git kernel doesn't compile currently because of the 
following error:



christian@christian-virtual-machine:~/Downloads/a$ env LANG=C make 
CROSS_COMPILE=powerpc-linux-gnu- ARCH=powerpc uImage

  CALL    scripts/checksyscalls.sh
  CHK include/generated/compile.h
  CALL    arch/powerpc/kernel/systbl_chk.sh
  CC  arch/powerpc/kvm/powerpc.o
In file included from arch/powerpc/kvm/powerpc.c:51:0:
arch/powerpc/kvm/trace.h:9:0: error: "TRACE_INCLUDE_PATH" redefined 
[-Werror]

 #define TRACE_INCLUDE_PATH .
 ^
In file included from arch/powerpc/kvm/../mm/mmu_decl.h:25:0,
 from arch/powerpc/kvm/powerpc.c:48:
./arch/powerpc/include/asm/trace.h:224:0: note: this is the location 
of the previous definition

 #define TRACE_INCLUDE_PATH asm
 ^
cc1: all warnings being treated as errors
scripts/Makefile.build:305: recipe for target 
'arch/powerpc/kvm/powerpc.o' failed

make[2]: *** [arch/powerpc/kvm/powerpc.o] Error 1
scripts/Makefile.build:546: recipe for target 'arch/powerpc/kvm' failed
make[1]: *** [arch/powerpc/kvm] Error 2
Makefile:1052: recipe for target 'arch/powerpc' failed
make: *** [arch/powerpc] Error 2

---

I deleted the definition of 'TRACE_INCLUDE_PATH' in 
'arch/powerpc/kvm/trace.h'. After that the kernel compiled without any 
problems.


-- Christian

Re: [PATCH 2/2] selftests/powerpc: Skip test instead of failing

Tyrel Datwyler  writes:
> On 10/30/2018 08:16 AM, Michael Ellerman wrote:
>> Thiago Jung Bauermann  writes:
>> 
>>> Breno Leitao  writes:
>>>
 Hi Tyrel,

 On 10/23/2018 05:41 PM, Tyrel Datwyler wrote:
>> --- a/tools/testing/selftests/powerpc/ptrace/core-pkey.c
>> +++ b/tools/testing/selftests/powerpc/ptrace/core-pkey.c
>> @@ -352,17 +352,11 @@ static int write_core_pattern(const char 
>> *core_pattern)
>>  FILE *f;
>>
>>  f = fopen(core_pattern_file, "w");
>> -if (!f) {
>> -perror("Error writing to core_pattern file");
>> -return TEST_FAIL;
>> -}
>> +SKIP_IF(!f);
>>
>>  ret = fwrite(core_pattern, 1, len, f);
>>  fclose(f);
>> -if (ret != len) {
>> -perror("Error writing to core_pattern file");
>> -return TEST_FAIL;
>> -}
>> +SKIP_IF(ret != len);

> If we don't have proper privileges we should fail on the open, right?
> So wouldn't we still want to fail on the write if something goes
> wrong?

 That is a good point. Should the test fail or skip if it is not possible
 to create the infrastructure to run the core test?

 Trying to find the answer in the current test sets, I find tests where
 the self test skips if the test environment is not able to be set up, as
 for example, when a memory allocation fails.

 File: tools/testing/selftests/powerpc/alignment/alignment_handler.c

 ci1 = mmap(NULL, bufsize, PROT_WRITE, MAP_SHARED,
fd, bufsize);
 if ((ci0 == MAP_FAILED) || (ci1 == MAP_FAILED)) {
 printf("\n");
 perror("mmap failed");
 SKIP_IF(1);
 }
>>>
>>> I think TEST_FAIL means the test was able to exercise the feature
>>> and found a problem with it. In this case, the test wasn't able to
>>> exercise the feature so it's not appropriate.
>>>
>>> Ideally, there should be a TEST_ERROR result for a case like this where
>>> an unexpected problem prevented the testcase from exercising the
>>> feature.
>>>
>>> If we're to use the an existing result then I vote for SKIP_IF.
>> 
>> Yeah I agree.
>> 
>> See for example some of the TM tests, which skip if TM is not available.
>> Or the alignment test which skips if it can't open /dev/fb0.
>> 
>> In this case it should print "you need to be root to run this" and then
>> skip.
>
> Agreed that there should be some indicator of why we are skipping the
> test. My original point I was trying to make was that I thought
> skipping a failed open was okay because we are likely not root.
> However, I wasn't sure that a failed write was okay to skip as that
> could be an indicator that something has actually been broken.

Yes I agree. I didn't read your mail closely enough. Have just replied
to it.

cheers

Re: [PATCH 2/2] selftests/powerpc: Skip test instead of failing