Re: clean up and modularize arch dma_mapping interface V2

2017-06-24 Thread Christoph Hellwig
On Wed, Jun 21, 2017 at 12:24:28PM -0700, tndave wrote:
> Thanks for doing this.
> So archs can still have their own definition for dma_set_mask() if 
> HAVE_ARCH_DMA_SET_MASK is y?
> (and similarly for dma_set_coherent_mask() when 
> CONFIG_ARCH_HAS_DMA_SET_COHERENT_MASK is y)
> Any plan to change these?

Yes, those should go away, but I'm not entirely sure how yet.  We'll
need some hook for switching between an IOMMU and a direct mapping
(I guess that's what you want to do for sparc as well?), and I need
to find the best way to do that.  Reimplementing all of dma_set_mask
and dma_set_coherent_mask is something that I want to move away from.
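
For reference, the mechanism in question looks roughly like this today (a
simplified sketch, not a verbatim copy of include/linux/dma-mapping.h): an
architecture that defines HAVE_ARCH_DMA_SET_MASK supplies dma_set_mask()
itself, otherwise a generic fallback along these lines is used.

#ifndef HAVE_ARCH_DMA_SET_MASK
/* Generic fallback used when the architecture does not override it. */
static inline int dma_set_mask(struct device *dev, u64 mask)
{
	if (!dev->dma_mask || !dma_supported(dev, mask))
		return -EIO;
	*dev->dma_mask = mask;
	return 0;
}
#endif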


Re: [PATCH v3 1/6] mm, x86: Add ARCH_HAS_ZONE_DEVICE to Kconfig

2017-06-24 Thread John Hubbard

On 06/23/2017 01:31 AM, Oliver O'Halloran wrote:

Currently ZONE_DEVICE depends on X86_64 and this will get unwieldy as
new architectures (and platforms) get ZONE_DEVICE support. Move to an
arch-selected Kconfig option to save us the trouble.

Cc: linux...@kvack.org
Acked-by: Ingo Molnar 
Acked-by: Balbir Singh 
Signed-off-by: Oliver O'Halloran 
---
v2: Added missing hunk.
v3: No changes
---
  arch/x86/Kconfig | 1 +
  mm/Kconfig   | 6 +-
  2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 0efb4c9497bc..325429a3f32f 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -59,6 +59,7 @@ config X86
select ARCH_HAS_STRICT_KERNEL_RWX
select ARCH_HAS_STRICT_MODULE_RWX
select ARCH_HAS_UBSAN_SANITIZE_ALL
+   select ARCH_HAS_ZONE_DEVICE if X86_64
select ARCH_HAVE_NMI_SAFE_CMPXCHG
select ARCH_MIGHT_HAVE_ACPI_PDC if ACPI
select ARCH_MIGHT_HAVE_PC_PARPORT
diff --git a/mm/Kconfig b/mm/Kconfig
index beb7a455915d..790e52a8a486 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -683,12 +683,16 @@ config IDLE_PAGE_TRACKING
  
  	  See Documentation/vm/idle_page_tracking.txt for more details.
  
+# arch_add_memory() comprehends device memory
+config ARCH_HAS_ZONE_DEVICE
+   bool
+
  config ZONE_DEVICE
bool "Device memory (pmem, etc...) hotplug support"
depends on MEMORY_HOTPLUG
depends on MEMORY_HOTREMOVE
depends on SPARSEMEM_VMEMMAP
-   depends on X86_64 #arch_add_memory() comprehends device memory
+   depends on ARCH_HAS_ZONE_DEVICE
  
  	help
  	  Device memory hotplug support allows for establishing pmem,



Hi Oliver,

+1, this is nice to have, and it behaves as expected on x86_64 with and without HMM, at 
least with the small bit of Kconfig dependency testing I did here.


thanks,
john h


Re: [PATCH v2 0/5] Use ELF_ET_DYN_BASE only for PIE

2017-06-24 Thread Russell King - ARM Linux
On Fri, Jun 23, 2017 at 01:59:55PM -0700, Kees Cook wrote:
> This is v2 (to refresh the 5 patches in -mm) for moving ELF_ET_DYN_BASE
> safely lower. Changes are clarifications in the commit logs (suggested
> by mpe), a compat think-o fix for arm64 (thanks to Ard), and to add
> Rik and mpe's Acks.
> 
> Quoting patch 1/5:
> 
> The ELF_ET_DYN_BASE position was originally intended to keep loaders
> away from ET_EXEC binaries. (For example, running "/lib/ld-linux.so.2
> /bin/cat" might cause the subsequent load of /bin/cat into where the
> loader had been loaded.) With the advent of PIE (ET_DYN binaries with
> an INTERP Program Header), ELF_ET_DYN_BASE continued to be used since
> the kernel was only looking at ET_DYN. However, since ELF_ET_DYN_BASE
> is traditionally set at the top 1/3rd of the TASK_SIZE, a substantial
> portion of the address space is unused.

With existing kernels on ARM:

00010000-00017000 r-xp 00000000 08:01 270810 /bin/cat
00026000-00027000 r--p 00006000 08:01 270810 /bin/cat
00027000-00028000 rw-p 00007000 08:01 270810 /bin/cat
7f661000-7f679000 r-xp 00000000 08:01 281659 /lib/arm-linux-gnueabihf/ld-2.23.so
7f688000-7f689000 r--p 00017000 08:01 281659 /lib/arm-linux-gnueabihf/ld-2.23.so
7f689000-7f68a000 rw-p 00018000 08:01 281659 /lib/arm-linux-gnueabihf/ld-2.23.so

If the loader is loaded at 4MB, this means the size of an ET_EXEC
program is limited to less than 4MB - and distros aren't yet
building everything as PIE on ARM.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.


[PATCH] powerpc/smp: Do not BUG_ON if invalid CPU during kick

2017-06-24 Thread Santosh Sivaraj
During secondary start, we do not need to BUG_ON if an invalid CPU number
is passed. We already print an error if the secondary cannot be started, so
just return an error instead.

Signed-off-by: Santosh Sivaraj 
---
 arch/powerpc/kernel/smp.c| 3 ++-
 arch/powerpc/platforms/cell/smp.c| 3 ++-
 arch/powerpc/platforms/powernv/smp.c | 3 ++-
 arch/powerpc/platforms/pseries/smp.c | 3 ++-
 4 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index df2a416..05bf583 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -112,7 +112,8 @@ int smp_generic_cpu_bootable(unsigned int nr)
 #ifdef CONFIG_PPC64
 int smp_generic_kick_cpu(int nr)
 {
-   BUG_ON(nr < 0 || nr >= NR_CPUS);
+   if (nr < 0 || nr >= NR_CPUS)
+   return -EINVAL;
 
/*
 * The processor is currently spinning, waiting for the
diff --git a/arch/powerpc/platforms/cell/smp.c 
b/arch/powerpc/platforms/cell/smp.c
index 895560f..ee8c535 100644
--- a/arch/powerpc/platforms/cell/smp.c
+++ b/arch/powerpc/platforms/cell/smp.c
@@ -115,7 +115,8 @@ static void smp_cell_setup_cpu(int cpu)
 
 static int smp_cell_kick_cpu(int nr)
 {
-   BUG_ON(nr < 0 || nr >= NR_CPUS);
+   if (nr < 0 || nr >= NR_CPUS)
+   return -EINVAL;
 
if (!smp_startup_cpu(nr))
return -ENOENT;
diff --git a/arch/powerpc/platforms/powernv/smp.c 
b/arch/powerpc/platforms/powernv/smp.c
index c04c87a..292825f 100644
--- a/arch/powerpc/platforms/powernv/smp.c
+++ b/arch/powerpc/platforms/powernv/smp.c
@@ -63,7 +63,8 @@ static int pnv_smp_kick_cpu(int nr)
long rc;
uint8_t status;
 
-   BUG_ON(nr < 0 || nr >= NR_CPUS);
+   if (nr < 0 || nr >= NR_CPUS)
+   return -EINVAL;
 
/*
 * If we already started or OPAL is not supported, we just
diff --git a/arch/powerpc/platforms/pseries/smp.c 
b/arch/powerpc/platforms/pseries/smp.c
index 52ca6b3..c82182a 100644
--- a/arch/powerpc/platforms/pseries/smp.c
+++ b/arch/powerpc/platforms/pseries/smp.c
@@ -151,7 +151,8 @@ static void smp_setup_cpu(int cpu)
 
 static int smp_pSeries_kick_cpu(int nr)
 {
-   BUG_ON(nr < 0 || nr >= NR_CPUS);
+   if (nr < 0 || nr >= NR_CPUS)
+   return -EINVAL;
 
if (!smp_startup_cpu(nr))
return -ENOENT;
-- 
2.9.4
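
For context, the failure path is already handled by the caller, which is why
returning an error is sufficient here. The call site in __cpu_up() looks
roughly like this (paraphrased sketch, not an exact quote of
arch/powerpc/kernel/smp.c):

	/* Paraphrased: a non-zero return from kick_cpu() is already
	 * reported and propagated, so no BUG_ON() is needed below it.
	 */
	rc = smp_ops->kick_cpu(cpu);
	if (rc) {
		pr_err("smp: failed starting cpu %d (rc %d)\n", cpu, rc);
		return rc;
	}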



Re: [PATCH 2/2] selftests/ftrace: Update multiple kprobes test for powerpc

2017-06-24 Thread Masami Hiramatsu
On Sat, 24 Jun 2017 02:30:21 +0900
Masami Hiramatsu  wrote:

> On Thu, 22 Jun 2017 22:33:25 +0530
> "Naveen N. Rao"  wrote:
> 
> > On 2017/06/22 06:07PM, Masami Hiramatsu wrote:
> > > On Thu, 22 Jun 2017 00:20:28 +0530
> > > "Naveen N. Rao"  wrote:
> > > 
> > > > KPROBES_ON_FTRACE is only available on powerpc64le. Update comment to
> > > > clarify this.
> > > > 
> > > > Also, we should use an offset of 8 to ensure that the probe does not
> > > > fall on ftrace location. The current offset of 4 will fall before the
> > > > function local entry point and won't fire, while an offset of 12 or 16
> > > > will fall on ftrace location. Offset 8 is currently guaranteed to not be
> > > > the ftrace location.
> > > 
> > > OK, this part seems good to me.
> > > 
> > > > 
> > > > Finally, do not filter out symbols with a dot. Powerpc ELFv1 uses a dot
> > > > prefix for all functions and this prevents us from testing some of those
> > > > symbols. Furthermore, with the patch to derive event names properly in
> > > > the presence of ':' and '.', such names are accepted by kprobe_events
> > > > and constitute a good test for those symbols.
> > > 
> > > Hmm, the reason why I added such a filter was to avoid symbols including
> > > gcc-generated suffixes like .constprop or .isra etc.
> > 
> > I see.
> > 
> > I do wonder -- is there a problem if we try probing those symbols? On my 
> > local x86 vm, I don't see an issue probing it especially with the 
> > previous patch to enable probing with symbols having a '.' or ':'.
> > 
> > Furthermore, since this is for testing kprobe_events, I feel it is good 
> > to try probing those symbols too to catch any weird errors we may hit.
> 
> Yes, and that is not what this testcase is aiming at. That testcase should
> be a separate one, with correct error handling.

Hi Naveen,

Here is the testcase which I meant above. This may help if there is any
regression related to this specific issue.

Thank you,

-

selftests: ftrace: Add a testcase for kprobe event naming

From: Masami Hiramatsu 

Add a testcase for kprobe event naming. This testcase
checks whether the kprobe events can automatically generate
event names for a normal function and a dot-suffixed function.
It also checks whether the kprobe events can correctly
define a new event with a given event name and group name.

Signed-off-by: Masami Hiramatsu 
---
 .../ftrace/test.d/kprobe/kprobe_eventname.tc   |   28 
 1 file changed, 28 insertions(+)
 create mode 100644 
tools/testing/selftests/ftrace/test.d/kprobe/kprobe_eventname.tc

diff --git a/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_eventname.tc 
b/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_eventname.tc
new file mode 100644
index 000..d259031
--- /dev/null
+++ b/tools/testing/selftests/ftrace/test.d/kprobe/kprobe_eventname.tc
@@ -0,0 +1,28 @@
+#!/bin/sh
+# description: Kprobe event auto/manual naming
+
+disable_events
+echo > kprobe_events
+
+:;: "Add an event on function without name" ;:
+
+FUNC=`grep -m 10 " [tT] [^.]*$" /proc/kallsyms | tail -n 1 | cut -f 3 -d " "`
+echo p $FUNC > kprobe_events
+test -d events/kprobes/p_${FUNC}_0 || exit_failure
+
+:;: "Add an event on function with new name" ;:
+
+echo p:event1 $FUNC > kprobe_events
+test -d events/kprobes/event1 || exit_failure
+
+:;: "Add an event on function with new name and group" ;:
+
+echo p:kprobes2/event2 $FUNC > kprobe_events
+test -d events/kprobes2/event2 || exit_failure
+
+:;: "Add an event on dot function without name" ;:
+
+FUNC=`grep -m 10 " [tT] .*\..*$" /proc/kallsyms | tail -n 1 | cut -f 3 -d " "`
+echo p $FUNC > kprobe_events
+EVENT=`grep $FUNC kprobe_events | cut -f 1 -d " " | cut -f 2 -d:` || exit_failure
+test -d events/$EVENT || exit_failure
-- 
Masami Hiramatsu 


[PATCH v2 7/7] crypto: caam: cleanup CONFIG_64BIT ifdefs when using io{read|write}64

2017-06-24 Thread Horia Geantă
Now that ioread64 and iowrite64 are always available we don't
need the ugly ifdefs to change their implementation when they
are not.

Signed-off-by: Logan Gunthorpe 
Cc: Horia Geantă 
Cc: Dan Douglass 
Cc: Herbert Xu 
Cc: "David S. Miller" 

Updated the patch such that behaviour does not change
from the i.MX workaround point of view.

Signed-off-by: Horia Geantă 
---
 drivers/crypto/caam/regs.h | 33 -
 1 file changed, 4 insertions(+), 29 deletions(-)

diff --git a/drivers/crypto/caam/regs.h b/drivers/crypto/caam/regs.h
index 84d2f838a063..b893ebb24e65 100644
--- a/drivers/crypto/caam/regs.h
+++ b/drivers/crypto/caam/regs.h
@@ -134,50 +134,25 @@ static inline void clrsetbits_32(void __iomem *reg, u32 
clear, u32 set)
  *base + 0x : least-significant 32 bits
  *base + 0x0004 : most-significant 32 bits
  */
-#ifdef CONFIG_64BIT
 static inline void wr_reg64(void __iomem *reg, u64 data)
 {
+#ifndef CONFIG_CRYPTO_DEV_FSL_CAAM_IMX
if (caam_little_end)
iowrite64(data, reg);
else
-   iowrite64be(data, reg);
-}
-
-static inline u64 rd_reg64(void __iomem *reg)
-{
-   if (caam_little_end)
-   return ioread64(reg);
-   else
-   return ioread64be(reg);
-}
-
-#else /* CONFIG_64BIT */
-static inline void wr_reg64(void __iomem *reg, u64 data)
-{
-#ifndef CONFIG_CRYPTO_DEV_FSL_CAAM_IMX
-   if (caam_little_end) {
-   wr_reg32((u32 __iomem *)(reg) + 1, data >> 32);
-   wr_reg32((u32 __iomem *)(reg), data);
-   } else
 #endif
-   {
-   wr_reg32((u32 __iomem *)(reg), data >> 32);
-   wr_reg32((u32 __iomem *)(reg) + 1, data);
-   }
+   iowrite64be(data, reg);
 }
 
 static inline u64 rd_reg64(void __iomem *reg)
 {
 #ifndef CONFIG_CRYPTO_DEV_FSL_CAAM_IMX
if (caam_little_end)
-   return ((u64)rd_reg32((u32 __iomem *)(reg) + 1) << 32 |
-   (u64)rd_reg32((u32 __iomem *)(reg)));
+   return ioread64(reg);
else
 #endif
-   return ((u64)rd_reg32((u32 __iomem *)(reg)) << 32 |
-   (u64)rd_reg32((u32 __iomem *)(reg) + 1));
+   return ioread64be(reg);
 }
-#endif /* CONFIG_64BIT  */
 
 #ifdef CONFIG_ARCH_DMA_ADDR_T_64BIT
 #ifdef CONFIG_SOC_IMX7D
-- 
2.12.0.264.gd6db3f216544
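
For background on the "always available" part: on 32-bit builds the 64-bit
MMIO accessors come from the io-64-nonatomic helper headers, which split the
access into two 32-bit operations. A simplified illustration of the lo-hi
flavour (the name example_lo_hi_ioread64 is made up; the real helpers live in
include/linux/io-64-nonatomic-lo-hi.h and -hi-lo.h):

/* Read the low word first, then the high word, and combine them. */
static inline u64 example_lo_hi_ioread64(void __iomem *addr)
{
	u32 low = ioread32(addr);
	u32 high = ioread32(addr + sizeof(u32));

	return low | ((u64)high << 32);
}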



Re: [PATCH v2 0/5] Use ELF_ET_DYN_BASE only for PIE

2017-06-24 Thread Kees Cook
On Sat, Jun 24, 2017 at 2:11 AM, Russell King - ARM Linux
 wrote:
> On Fri, Jun 23, 2017 at 01:59:55PM -0700, Kees Cook wrote:
>> This is v2 (to refresh the 5 patches in -mm) for moving ELF_ET_DYN_BASE
>> safely lower. Changes are clarifications in the commit logs (suggested
>> by mpe), a compat think-o fix for arm64 (thanks to Ard), and to add
>> Rik and mpe's Acks.
>>
>> Quoting patch 1/5:
>>
>> The ELF_ET_DYN_BASE position was originally intended to keep loaders
>> away from ET_EXEC binaries. (For example, running "/lib/ld-linux.so.2
>> /bin/cat" might cause the subsequent load of /bin/cat into where the
>> loader had been loaded.) With the advent of PIE (ET_DYN binaries with
>> an INTERP Program Header), ELF_ET_DYN_BASE continued to be used since
>> the kernel was only looking at ET_DYN. However, since ELF_ET_DYN_BASE
>> is traditionally set at the top 1/3rd of the TASK_SIZE, a substantial
>> portion of the address space is unused.
>
> With existing kernels on ARM:
>
> 00010000-00017000 r-xp 00000000 08:01 270810 /bin/cat
> 00026000-00027000 r--p 00006000 08:01 270810 /bin/cat
> 00027000-00028000 rw-p 00007000 08:01 270810 /bin/cat
> 7f661000-7f679000 r-xp 00000000 08:01 281659 /lib/arm-linux-gnueabihf/ld-2.23.so
> 7f688000-7f689000 r--p 00017000 08:01 281659 /lib/arm-linux-gnueabihf/ld-2.23.so
> 7f689000-7f68a000 rw-p 00018000 08:01 281659 /lib/arm-linux-gnueabihf/ld-2.23.so
>
> If the loader is loaded at 4MB, this means the size of an ET_EXEC
> program is limited to less than 4MB - and distros aren't yet
> building everything as PIE on ARM.

The loader isn't loaded at 4MB; that's what patch 1 changes: loaders
are moved into the mmap region so they will not collide with either
ET_EXEC or PIE (ET_DYN-with-INTERP).

(After this patch, the name "ELF_ET_DYN_BASE" becomes a bit misleading...)

-Kees

-- 
Kees Cook
Pixel Security


Re: clean up and modularize arch dma_mapping interface V2

2017-06-24 Thread Benjamin Herrenschmidt
On Sat, 2017-06-24 at 09:18 +0200, Christoph Hellwig wrote:
> On Wed, Jun 21, 2017 at 12:24:28PM -0700, tndave wrote:
> > Thanks for doing this.
> > So archs can still have their own definition for dma_set_mask() if 
> > HAVE_ARCH_DMA_SET_MASK is y?
> > (and similarly for dma_set_coherent_mask() when 
> > CONFIG_ARCH_HAS_DMA_SET_COHERENT_MASK is y)
> > Any plan to change these?
> 
> Yes, those should go away, but I'm not entirely sure how yet.  We'll
> need some hook for switching between an IOMMU and a direct mapping
> (I guess that's what you want to do for sparc as well?), and I need
> to find the best way to do that.  Reimplementing all of dma_set_mask
> and dma_set_coherent_mask is something that I want to move away from.

I think we still need to do it. For example, we have a bunch of new "funky"
cases.

We already have the case where we mix the direct and iommu mappings:
on some powerpc platforms that translates into an iommu mapping down at
0 for the 32-bit space and a direct mapping high up in the PCI address
space (which crops the top bits and thus hits memory at 0 onwards).

This type of hybrid layout is needed by some adapters, typically
storage, which want to keep the "coherent" mask at 32-bit but support
64-bit for streaming masks.

Another one we are trying to deal better with at the moment is devices
with DMA addressing limitations. Some GPUs typically (but not only)
have limits that go all across the gamut; I've seen 40 bits,
44 bits and 47 bits... And of course those are "high performance"
adapters, so we don't want to limit them to the comparatively small
iommu mapping with its extra overhead.

At the moment, we're looking at a dma_set_mask() hook that will, for
these guys, re-configure the iommu mapping to create a "compressed"
linear mapping of system memory (ie, skipping the holes we have between
chips on P9, for example) using the largest possible iommu page size
(256M on P8, 1G on P9).

This is made tricky of course because several devices can potentially
share a DMA domain for various platform-specific reasons. And of
course we have no way to figure out what the "common denominator" of
all those devices is before they start doing DMA. A driver can start
before its neighbour is probed, and a driver can start doing DMAs using
the standard 32-bit mapping without ever calling dma_set_mask().

So heuristics ... ugh. Better ideas welcome :-) All that to say that we
are far from being able to get rid of dma_set_mask() custom
implementations (and coherent mask too).
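
To make the shape of such a hook concrete, the idea is something along these
lines (an illustrative sketch only, not code from this series;
pnv_try_setup_bypass_window() is a made-up placeholder for the platform
window/TCE setup described above):

/* Sketch: pick a direct ("bypass") mapping for wide masks, otherwise keep
 * the 32-bit IOMMU window. Error handling and locking omitted.
 */
static int example_platform_dma_set_mask(struct device *dev, u64 dma_mask)
{
	if (!dev->dma_mask || !dma_supported(dev, dma_mask))
		return -EIO;

	if (dma_mask >= DMA_BIT_MASK(64) && pnv_try_setup_bypass_window(dev))
		set_dma_ops(dev, &dma_direct_ops);	/* bypass the IOMMU */
	else
		set_dma_ops(dev, &dma_iommu_ops);	/* 32-bit IOMMU window */

	*dev->dma_mask = dma_mask;
	return 0;
}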

I was tempted at some point to retire the 32-bit iommu mapping
completely, just doing that "linear" thing I mentioned above and
swiotlb for the rest, along with introducing ZONE_DMA32 on powerpc
(with the real 64-bit bypass still around for non-limited devices, but
that's then just extra speed from bypassing the iommu xlate & cache).

But I worry about the impact on those silly adapters that set the coherent
mask to 32 bits to keep their microcode & descriptor ring down in 32-bit
space. I'm not sure how well ZONE_DMA32 behaves in those cases.

Cheers,
Ben.



[PATCH v2] powerpc: Invalidate ERAT on powersave wakeup for POWER9

2017-06-24 Thread Benjamin Herrenschmidt
On POWER9 the ERAT may be incorrect on wakeup from some stop states
that lose state. This causes random segvs and illegal instructions
when these stop states are enabled.

This patch invalidates the ERAT on wakeup on POWER9 to prevent this
from causing a problem.

Signed-off-by: Michael Neuling 
Signed-off-by: Benjamin Herrenschmidt 
---

v2. [BenH] Move to a place before we branch off to KVM if the
   core was in a guest. Also add a comment about the
   SRR1 bit extraction.
---
 arch/powerpc/kernel/exceptions-64s.S | 4 +++-
 arch/powerpc/kernel/idle_book3s.S| 7 +++
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/exceptions-64s.S 
b/arch/powerpc/kernel/exceptions-64s.S
index ae418b85c17c..b4b2c3a344c4 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -99,7 +99,9 @@ EXC_VIRT_NONE(0x4000, 0x100)
 #ifdef CONFIG_PPC_P7_NAP
/*
 * If running native on arch 2.06 or later, check if we are waking up
-* from nap/sleep/winkle, and branch to idle handler.
+* from nap/sleep/winkle, and branch to idle handler. This tests
+* SRR1 bits 46:47. A non-0 value indicates that we are coming from
+* a power saving state.
 */
 #define IDLETEST(n)\
BEGIN_FTR_SECTION ; \
diff --git a/arch/powerpc/kernel/idle_book3s.S 
b/arch/powerpc/kernel/idle_book3s.S
index 4898d676dcae..3fd65739e105 100644
--- a/arch/powerpc/kernel/idle_book3s.S
+++ b/arch/powerpc/kernel/idle_book3s.S
@@ -489,6 +489,13 @@ ALT_FTR_SECTION_END_IFSET(CPU_FTR_ARCH_300)
  */
 pnv_restore_hyp_resource_arch300:
/*
+* Workaround for POWER9, if we lost resources, the ERAT
+* might have been mixed up and needs flushing.
+*/
+   blt cr3,1f
+   PPC_INVALIDATE_ERAT
+1:
+   /*
 * POWER ISA 3. Use PSSCR to determine if we
 * are waking up from deep idle state
 */



Re: Regression in kernel 4.12-rc1 for Powerpc 32 - bisected to commit 3448890c32c3

2017-06-24 Thread Larry Finger

On 06/23/2017 03:29 PM, Al Viro wrote:

On Fri, Jun 23, 2017 at 01:49:16PM -0500, Larry Finger wrote:


BTW, could you try to check what happens if you kill the
if (__builtin_constant_p(n) && (n <= 8))
bits in raw_copy_{to,from}_user()?  The usefulness of those (in 
__copy_from_user()
originally) had always been dubious and the things are simpler without them.
If _that_ turns out to cure breakage, I would be very surprised, though.


Sorry I was gone so long. Installing jessie on this box resulted in a crash
on boot. Lubuntu 14.04 yielded a desktop with a functioning cursor, but
nothing else. Finally, Ubuntu 12.04 resulted in a working system. I hate
Unity, but I guess I'm stuck for now.


Ho-hum...  Jessie is 3.16, so whatever is crashing there, it's something
different...  Ubuntu 12.04 is what, 3.2?


I know how easy it is to screw up a long bisection by booting the wrong
kernel. To help with that problem and to work around the yaconf/yboot nonsense on
the Mac, my /etc/yaconf has always had generic kernel stanzas with only
default, old, and original kernels mentioned. From there I use a local
script to finish a kernel installation by moving the default links to the
old ones and creating the new default links pointing to the current kernel.
With those long-tested scripts, I'm sure that I am booting the one I want.

With the new installation, kernel 4.12-rc6 failed, as did 3448890c with the
backported 46f401c4 added.

Replacing "if (__builtin_constant_p(n) && (n <= 8))" with "if (0)" had no 
effect.


OK, that simplifies things a bit.  Just to make sure we are on the same page:

* f2ed8bebee69 + cherry-pick of 46f401c4 boots (Ubuntu 12.04 userland)
* 3448890c32c3 + cherry-pick of 46f401c4 fails (Ubuntu 12.04 userland), ditto
   with removal of constant-size bits in raw_copy_..._user().  Failure appears
   to be on udev getting EFAULT on some syscalls.
* straight Ubuntu 12.04 works
* jessie crashes on boot.


I made a breakthrough. If I turn off inline copy to/from user for 32-bit ppc
with the following patch, then the system boots:


diff --git a/arch/powerpc/include/asm/uaccess.h 
b/arch/powerpc/include/asm/uaccess.h
index 5c0d8a8cdae5..1e6a8723f497 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -267,12 +267,7 @@ do { 
   \

 extern unsigned long __copy_tofrom_user(void __user *to,
const void __user *from, unsigned long size);

-#ifndef __powerpc64__
-
-#define INLINE_COPY_FROM_USER
-#define INLINE_COPY_TO_USER
-
-#else /* __powerpc64__ */
+#ifdef __powerpc64__

 static inline unsigned long
 raw_copy_in_user(void __user *to, const void __user *from, unsigned long n)

It seems whatever problem I am seeing is in the inline version of 
_copy_to_user() and _copy_from_user() on the 32-bit ppc. The only other 
difference between the two versions is the placement of the __user macro, which 
looks to be wrong in the non-inlined version of _copy_to_user() in 
lib/usercopy.c, but that is the one that works.


To me, this looks like a compiler error. On the PowerBook, 'gcc --version' 
reports "gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3".


I will prepare a proper patch that I will send to you privately. If you agree 
with it, it can be sent through normal channels in time for the release of 4.12.


Larry



Re: [PATCH v2 1/5] binfmt_elf: Use ELF_ET_DYN_BASE only for PIE

2017-06-24 Thread Kees Cook
On Fri, Jun 23, 2017 at 1:59 PM, Kees Cook  wrote:
> For 32-bit tasks when RLIMIT_STACK is set to RLIM_INFINITY, programs
> are loaded below the mmap region. This means they can be made to collide
> (CVE-2017-1000370) or nearly collide (CVE-2017-1000371) with pathological
> stack regions. Lowering ELF_ET_DYN_BASE solves both by moving programs
> above the mmap region in all cases, and will now additionally avoid
> programs falling back to the mmap region by enforcing MAP_FIXED for
> program loads (i.e. if it would have collided with the stack, now it
> will fail to load instead of falling back to the mmap region).

It was pointed out by rmk that I described this inaccurately. I mix up
my own visualization of the address space (above/below in
/proc/$pid/maps) with actual value comparisons (above/below
numerically). This paragraph should read:


For 32-bit tasks when RLIMIT_STACK is set to RLIM_INFINITY, programs
are loaded above the mmap region. This means they can be made to collide
(CVE-2017-1000370) or nearly collide (CVE-2017-1000371) with pathological
stack regions. Lowering ELF_ET_DYN_BASE solves both by moving programs
below the mmap region in all cases, and will now additionally avoid
programs falling back to the mmap region by enforcing MAP_FIXED for
program loads (i.e. if it would have collided with the stack, now it
will fail to load instead of falling back to the mmap region).


Andrew, are you able to manually adjust this commit log in -mm, or
should I resend the patch with this paragraph corrected?

Thanks!

-Kees

-- 
Kees Cook
Pixel Security


[PATCH] powerpc/xive: Silence message about VP block allocation

2017-06-24 Thread Benjamin Herrenschmidt
There is no reason for that message to be pr_info(); it will be printed
every time we start a KVM guest.

Signed-off-by: Benjamin Herrenschmidt 
---

diff --git a/arch/powerpc/sysdev/xive/native.c
b/arch/powerpc/sysdev/xive/native.c
index ab9ecce..0f95476b 100644
--- a/arch/powerpc/sysdev/xive/native.c
+++ b/arch/powerpc/sysdev/xive/native.c
@@ -633,8 +633,8 @@ u32 xive_native_alloc_vp_block(u32 max_vcpus)
if (max_vcpus > (1 << order))
order++;
 
-   pr_info("VP block alloc, for max VCPUs %d use order %d\n",
-   max_vcpus, order);
+   pr_debug("VP block alloc, for max VCPUs %d use order %d\n",
+max_vcpus, order);
 
for (;;) {
rc = opal_xive_alloc_vp_block(order);


Re: [PATCH V6 2/2] powerpc/numa: Update CPU topology when VPHN enabled

2017-06-24 Thread kbuild test robot
Hi Michael,

[auto build test ERROR on powerpc/next]
[also build test ERROR on v4.12-rc6 next-20170623]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Michael-Bringmann/powerpc-hotplug-Ensure-enough-nodes-avail-for-operations/20170621-141803
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc-storcenter_defconfig (attached as .config)
compiler: powerpc-linux-gnu-gcc (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
wget https://raw.githubusercontent.com/01org/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=powerpc 

All errors (new ones prefixed by >>):

   In file included from include/linux/topology.h:35:0,
from include/linux/gfp.h:8,
from include/linux/idr.h:16,
from include/linux/kernfs.h:14,
from include/linux/sysfs.h:15,
from include/linux/kobject.h:21,
from include/linux/of.h:21,
from include/linux/irqdomain.h:34,
from arch/powerpc/include/asm/irq.h:12,
from arch/powerpc/include/asm/prom.h:19,
from arch/powerpc/kernel/cputable.c:22:
>> arch/powerpc/include/asm/topology.h:85:12: error: 'timed_topology_update' 
>> defined but not used [-Werror=unused-function]
static int timed_topology_update(int nsecs)
   ^
   cc1: all warnings being treated as errors

vim +/timed_topology_update +85 arch/powerpc/include/asm/topology.h

79  #endif /* CONFIG_NUMA && CONFIG_PPC_SPLPAR */
80  
81  #if defined(CONFIG_NUMA) && defined(CONFIG_PPC_SPLPAR) && \
82  defined(CONFIG_HOTPLUG_CPU)
83  extern int timed_topology_update(int nsecs);
84  #else
  > 85  static int timed_topology_update(int nsecs)
86  {
87  return 0;
88  }
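
The error itself comes from the fallback stub at line 85 being a plain
"static" function in a header: any file that includes the header without
calling it then trips -Wunused-function, which -Werror turns into a build
failure. The usual fix (a suggestion here, not part of the robot report) is
to make the stub static inline:

static inline int timed_topology_update(int nsecs)
{
	return 0;
}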

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: application/gzip


[PATCH] powerpc/ipic: Support edge on IRQ0

2017-06-24 Thread Scott Wood
External IRQ0 has the same capabilities as the other IRQ1-7 and is
handled by the same register IPIC_SEPNR.  When this register is not
specified for "ack" in "ipic_info", you cannot configure this IRQ as
IRQ_TYPE_EDGE_FALLING.  This oversight was probably due to the
non-contiguous hwirq numbering of IRQ0 in the IPIC.

Signed-off-by: Jurgen Schindele 
[scottwood: Cleaned up commit message and posted as a proper patch]
Signed-off-by: Scott Wood 
---
 arch/powerpc/sysdev/ipic.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/sysdev/ipic.c b/arch/powerpc/sysdev/ipic.c
index f267ee0afc08..16f1edd78c40 100644
--- a/arch/powerpc/sysdev/ipic.c
+++ b/arch/powerpc/sysdev/ipic.c
@@ -315,6 +315,7 @@ static struct ipic_info ipic_info[] = {
.prio_mask = 7,
},
[48] = {
+   .ack= IPIC_SEPNR,
.mask   = IPIC_SEMSR,
.prio   = IPIC_SMPRR_A,
.force  = IPIC_SEFCR,
-- 
2.11.0



Re: [v3] drivers:soc:fsl:qbman:qman.c: Sleep instead of stuck hacking jiffies.

2017-06-24 Thread Scott Wood
On Fri, May 05, 2017 at 07:45:18AM +0200, Karim Eshapa wrote:
> Using msleep() instead of getting stuck in a
> long delay will be more efficient.
> 
> Signed-off-by: Karim Eshapa 
> ---
>  drivers/soc/fsl/qbman/qman.c | 6 +-
>  1 file changed, 1 insertion(+), 5 deletions(-)

Acked-by: Scott Wood 

(though the subject line should be "soc/qman: ...")

Leo, are you going to send this patch (and other qman patches) via
arm-soc?

-Scott


Re: drivers:soc:fsl:qbman:qman.c: Change a comment for an entry check inside drain_mr_fqrni function

2017-06-24 Thread Scott Wood
On Fri, May 05, 2017 at 10:05:56AM +0200, Karim Eshapa wrote:
> Change the comment for an entry check inside function
> drain_mr_fqrni(): sleep for a sufficient period
> of time instead of spinning for many processor cycles.
> 
> Signed-off-by: Karim Eshapa 
> ---
>  drivers/soc/fsl/qbman/qman.c | 25 +
>  1 file changed, 13 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/soc/fsl/qbman/qman.c b/drivers/soc/fsl/qbman/qman.c
> index 18d391e..636a7d7 100644
> --- a/drivers/soc/fsl/qbman/qman.c
> +++ b/drivers/soc/fsl/qbman/qman.c
> @@ -1071,18 +1071,19 @@ static int drain_mr_fqrni(struct qm_portal *p)
>   msg = qm_mr_current(p);
>   if (!msg) {
>   /*
> -  * if MR was full and h/w had other FQRNI entries to produce, we
> -  * need to allow it time to produce those entries once the
> -  * existing entries are consumed. A worst-case situation
> -  * (fully-loaded system) means h/w sequencers may have to do 3-4
> -  * other things before servicing the portal's MR pump, each of
> -  * which (if slow) may take ~50 qman cycles (which is ~200
> -  * processor cycles). So rounding up and then multiplying this
> -  * worst-case estimate by a factor of 10, just to be
> -  * ultra-paranoid, goes as high as 10,000 cycles. NB, we consume
> -  * one entry at a time, so h/w has an opportunity to produce new
> -  * entries well before the ring has been fully consumed, so
> -  * we're being *really* paranoid here.
> +  * if MR was full and h/w had other FQRNI entries to
> +  * produce, we need to allow it time to produce those
> +  * entries once the existing entries are consumed.
> +  * A worst-case situation (fully-loaded system) means
> +  * h/w sequencers may have to do 3-4 other things
> +  * before servicing the portal's MR pump, each of
> +  * which (if slow) may take ~50 qman cycles
> +  * (which is ~200 processor cycles). So sleep with
> +  * 1 ms would be very efficient, after this period
> +  * we can check if there is something produced.
> +  * NB, we consume one entry at a time, so h/w has
> +  * an opportunity to produce new entries well before
> +  * the ring has been fully consumed.

Do you mean "sufficient" here rather than "efficient"?  It's far less
inefficient than what the code was previously doing, but still...

Otherwise, looks good.

-Scott


Re: [PATCH v3 4/6] powerpc/mm: Add devmap support for ppc64

2017-06-24 Thread kbuild test robot
Hi Oliver,

[auto build test ERROR on powerpc/next]
[also build test ERROR on v4.12-rc6 next-20170623]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Oliver-O-Halloran/mm-x86-Add-ARCH_HAS_ZONE_DEVICE-to-Kconfig/20170625-102522
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc-defconfig (attached as .config)
compiler: powerpc64-linux-gnu-gcc (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
wget https://raw.githubusercontent.com/01org/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=powerpc 

All errors (new ones prefixed by >>):

   mm/gup.c: In function '__gup_device_huge_pud':
>> mm/gup.c:1329:14: error: implicit declaration of function 'pud_pfn' 
>> [-Werror=implicit-function-declaration]
 fault_pfn = pud_pfn(pud) + ((addr & ~PUD_MASK) >> PAGE_SHIFT);
 ^~~
   cc1: some warnings being treated as errors

vim +/pud_pfn +1329 mm/gup.c

b59f65fa Kirill A. Shutemov 2017-03-16  1323  
b59f65fa Kirill A. Shutemov 2017-03-16  1324  static int __gup_device_huge_pud(pud_t pud, unsigned long addr,
b59f65fa Kirill A. Shutemov 2017-03-16  1325  		unsigned long end, struct page **pages, int *nr)
b59f65fa Kirill A. Shutemov 2017-03-16  1326  {
b59f65fa Kirill A. Shutemov 2017-03-16  1327  	unsigned long fault_pfn;
b59f65fa Kirill A. Shutemov 2017-03-16  1328  
b59f65fa Kirill A. Shutemov 2017-03-16 @1329  	fault_pfn = pud_pfn(pud) + ((addr & ~PUD_MASK) >> PAGE_SHIFT);
b59f65fa Kirill A. Shutemov 2017-03-16  1330  	return __gup_device_huge(fault_pfn, addr, end, pages, nr);
b59f65fa Kirill A. Shutemov 2017-03-16  1331  }
b59f65fa Kirill A. Shutemov 2017-03-16  1332  #else

:: The code at line 1329 was first introduced by commit
:: b59f65fa076a8eac2ff3a8ab7f8e1705b9fa86cb mm/gup: Implement the dev_pagemap() logic in the generic get_user_pages_fast() function

:: TO: Kirill A. Shutemov 
:: CC: Ingo Molnar 

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: application/gzip


Re: [PATCH v2] powerpc: Invalidate ERAT on powersave wakeup for POWER9

2017-06-24 Thread Nicholas Piggin
On Sat, 24 Jun 2017 12:29:01 -0500
Benjamin Herrenschmidt  wrote:

> On POWER9 the ERAT may be incorrect on wakeup from some stop states
> that lose state. This causes random segvs and illegal instructions
> when these stop states are enabled.
> 
> This patch invalidates the ERAT on wakeup on POWER9 to prevent this
> from causing a problem.
> 
> Signed-off-by: Michael Neuling 
> Signed-off-by: Benjamin Herrenschmidt 
> ---
> 
> v2. [BenH] Move to a place before we branch off to KVM if the
>core was in a guest. Also add a comment about the
>SRR1 bit extraction.

This looks a bit safer to me now (avoiding KVM). My understanding is
the real-mode i- and d-ERAT entries are still valid and usable,
which is why this works.

Reviewed-by: Nicholas Piggin 

> ---
>  arch/powerpc/kernel/exceptions-64s.S | 4 +++-
>  arch/powerpc/kernel/idle_book3s.S| 7 +++
>  2 files changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/kernel/exceptions-64s.S 
> b/arch/powerpc/kernel/exceptions-64s.S
> index ae418b85c17c..b4b2c3a344c4 100644
> --- a/arch/powerpc/kernel/exceptions-64s.S
> +++ b/arch/powerpc/kernel/exceptions-64s.S
> @@ -99,7 +99,9 @@ EXC_VIRT_NONE(0x4000, 0x100)
>  #ifdef CONFIG_PPC_P7_NAP
>   /*
>* If running native on arch 2.06 or later, check if we are waking up
> -  * from nap/sleep/winkle, and branch to idle handler.
> +  * from nap/sleep/winkle, and branch to idle handler. This tests
> +  * SRR1 bits 46:47. A non-0 value indicates that we are coming from
> +  * a power saving state.
>*/
>  #define IDLETEST(n)  \
>   BEGIN_FTR_SECTION ; \
> diff --git a/arch/powerpc/kernel/idle_book3s.S 
> b/arch/powerpc/kernel/idle_book3s.S
> index 4898d676dcae..3fd65739e105 100644
> --- a/arch/powerpc/kernel/idle_book3s.S
> +++ b/arch/powerpc/kernel/idle_book3s.S
> @@ -489,6 +489,13 @@ ALT_FTR_SECTION_END_IFSET(CPU_FTR_ARCH_300)
>   */
>  pnv_restore_hyp_resource_arch300:
>   /*
> +  * Workaround for POWER9, if we lost resources, the ERAT
> +  * might have been mixed up and needs flushing.
> +  */
> + blt cr3,1f
> + PPC_INVALIDATE_ERAT
> +1:
> + /*
>* POWER ISA 3. Use PSSCR to determine if we
>* are waking up from deep idle state
>*/
>