Re: [PATCH v2] slab: Fix nodeid bounds check for non-contiguous node IDs

2014-12-01 Thread Michael Ellerman
On Mon, 2014-12-01 at 16:24 +1100, Paul Mackerras wrote:
> On Mon, Dec 01, 2014 at 04:02:14PM +1100, Michael Ellerman wrote:
> > On Mon, 2014-12-01 at 15:28 +1100, Paul Mackerras wrote:
> > > The bounds check for nodeid in cache_alloc_node gives false
> > > positives on machines where the node IDs are not contiguous, leading
> > > to a panic at boot time.  For example, on a POWER8 machine the node
> > > IDs are typically 0, 1, 16 and 17.  This means that num_online_nodes()
> > > returns 4, so when cache_alloc_node is called with nodeid = 16 the
> > > VM_BUG_ON triggers, like this:
> > ...
> > > 
> > > To fix this, we instead compare the nodeid with MAX_NUMNODES, and
> > > additionally make sure it isn't negative (since nodeid is an int).
> > > The check is there mainly to protect the array dereference in the
> > > get_node() call in the next line, and the array being dereferenced is
> > > of size MAX_NUMNODES.  If the nodeid is in range but invalid (for
> > > example if the node is off-line), the BUG_ON in the next line will
> > > catch that.
> > 
> > When did this break? How come we only just noticed?
> 
> Commit 14e50c6a9bc2, which went into 3.10-rc1.

OK. So a Fixes tag is nice:

Fixes: 14e50c6a9bc2 ("mm: slab: Verify the nodeid passed to cache_alloc_node")

> You'll only notice if you have CONFIG_SLAB=y and CONFIG_DEBUG_VM=y
> and you're running on a machine with discontiguous node IDs.

Right. And we have SLUB=y for all the defconfigs that are likely to hit that.

> > Also needs:
> > 
> > Cc: sta...@vger.kernel.org
> 
> It does.  I remembered that a minute after I sent the patch.

OK. Hopefully one of the slab maintainers will be happy to add it for us when
they merge this?

cheers



___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH REPOST 3/3] powerpc/vphn: move endianness fixing to vphn_unpack_associativity()

2014-12-01 Thread Michael Ellerman
On Fri, 2014-11-28 at 09:39 +0100, Greg Kurz wrote:
> On Fri, 28 Nov 2014 12:49:08 +1100
> Benjamin Herrenschmidt  wrote:
> > In a second pass, we parse that stream, one 16-bit field at a time, and
> > we could do so with a simple loop of be16_to_cpup(foo++). I wouldn't
> > bother with the cast to 32-bit etc... if you encounter a 32-bit case,
> > you just fetch another 16-bit and do value = (old << 16) | new
> > 
> > I think that should lead to something more readable, no ?
> 
> Of course ! This is THE way to go. Thanks Ben ! :)
> 
> And while we're here, I have a question about VPHN_ASSOC_BUFSIZE. The
> H_HOME_NODE_ASSOCIATIVITY spec says that the stream:
> - is at most 64 * 6 = 384 bits long

That's from "Each of the registers R4-R9 ..."

> - may contain 16-bit numbers

"... is divided into 4 fields each 2 bytes long."

> - is padded with "all ones"
> 
> The stream could theoretically contain up to 384 / 16 = 24 domain numbers.

Yes I think that's right, based on:

"The high order bit of each 2 byte field is a length specifier:

1: The associativity domain number is contained in the low order 15 bits of the 
field,"

But then there's also:

"0: The associativity domain number is contained in the low order 15 bits of
the current field concatenated with the 16 bits of the next sequential field)"

> The current code expects no more than 12 domain numbers... and strangely
> seems to correlate the size of the output array to the size of the input
> one as noted in the comment:
> 
>  "6 64-bit registers unpacked into 12 32-bit associativity values"
> 
> My understanding is that the resulting array is be32 only because it is
> supposed to look like the ibm,associativity property from the DT... and
> I could find no clue that this property is limited to 12 values. Have I
> missed something ?

I don't know for sure, but I strongly suspect it's just confused about the two
options above. Probably when it was tested they only ever saw 12 32-bit values,
and so that assumption was allowed to stay in the code.

I'd be quite happy if you wanted to pull the parsing logic out into a separate
file, so we could write some userspace tests of it.
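As a starting point for such a userspace test, the two field formats quoted above could be unpacked roughly like this. It is a minimal sketch, not the existing kernel code: the function name, the macro names and the byte-buffer input are assumptions made for illustration, and it treats a 0xffff field as the "all ones" padding that ends the stream.

```c
#include <stddef.h>
#include <stdint.h>

#define VPHN_STREAM_BYTES 48       /* 6 registers x 8 bytes, per the thread */
#define VPHN_FIELD_UNUSED 0xffffu  /* "all ones" padding (assumed end marker) */
#define VPHN_FIELD_MSB    0x8000u  /* length specifier bit */

/* Read one big-endian 16-bit field from the byte stream. */
static uint16_t read_be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * Hypothetical helper: unpack domain numbers from the 48-byte stream into
 * out[], which has room for out_len entries. Returns how many were stored.
 */
size_t vphn_unpack_sketch(const uint8_t *stream, uint32_t *out, size_t out_len)
{
	const size_t nfields = VPHN_STREAM_BYTES / 2;
	size_t i = 0, n = 0;

	while (i < nfields && n < out_len) {
		uint16_t field = read_be16(stream + 2 * i++);

		if (field == VPHN_FIELD_UNUSED)
			break;	/* padding: no more domain numbers */

		if (field & VPHN_FIELD_MSB) {
			/* High bit 1: the low 15 bits are the domain number. */
			out[n++] = field & 0x7fffu;
		} else {
			/* High bit 0: 15 bits here plus the next 16-bit field. */
			if (i >= nfields)
				break;	/* truncated 32-bit value */
			uint16_t low = read_be16(stream + 2 * i++);
			out[n++] = ((uint32_t)field << 16) | low;
		}
	}
	return n;
}
```

With 24 two-byte fields in the stream, an output array of 24 entries covers the all-16-bit case, which is the case the 12-entry assumption misses.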

cheers




Re: [PATCH v2] slab: Fix nodeid bounds check for non-contiguous node IDs

2014-12-01 Thread Pekka Enberg

On 12/1/14 6:28 AM, Paul Mackerras wrote:

---
v2: include the oops message in the patch description

  mm/slab.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/slab.c b/mm/slab.c
index eb2b2ea..f34e053 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -3076,7 +3076,7 @@ static void *cache_alloc_node(struct kmem_cache *cachep, gfp_t flags,
 	void *obj;
 	int x;
 
-	VM_BUG_ON(nodeid > num_online_nodes());
+	VM_BUG_ON(nodeid < 0 || nodeid >= MAX_NUMNODES);
 	n = get_node(cachep, nodeid);
 	BUG_ON(!n);


Reviewed-by: Pekka Enberg 
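
For anyone following along, the difference between the two checks can be reproduced in userspace with the node IDs from the patch description (0, 1, 16 and 17, so four online nodes). This is only an illustrative sketch: the helper names are made up, and the MAX_NUMNODES value below is a stand-in, since the kernel derives it from the configuration.

```c
#include <stdbool.h>

/* Illustrative value only; the kernel derives this from its config. */
#define MAX_NUMNODES 256

/* Old check: compares the node ID against the *count* of online nodes. */
bool old_check_trips(int nodeid, int num_online)
{
	return nodeid > num_online;
}

/* New check: trips only if nodeid cannot index a MAX_NUMNODES-sized array. */
bool new_check_trips(int nodeid)
{
	return nodeid < 0 || nodeid >= MAX_NUMNODES;
}
```

With four online nodes, old_check_trips(16, 4) is true even though node 16 is valid, while new_check_trips(16) is false. A node ID that is in range but off-line is still left to the BUG_ON(!n) that follows.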


[PATCH v3 0/4] powerpc/mpc85xx: Add FSL QorIQ DPAA B/QMan support to device tree(s)

2014-12-01 Thread Emil Medve
v3: Remove no-map
Adjust alloc-ranges for the 32-/36-bit SoC(s)

v2: Remove some reserved-memory properties
Split the patchset per IP block
Refined patch assignment

Kumar Gala (4):
  powerpc/mpc85xx: Create dts components for the FSL QorIQ DPAA BMan
  powerpc/mpc85xx: Create dts components for the FSL QorIQ DPAA QMan
  powerpc/mpc85xx: Add FSL QorIQ DPAA BMan support to device tree(s)
  powerpc/mpc85xx: Add FSL QorIQ DPAA QMan support to device tree(s)

 arch/powerpc/boot/dts/b4qds.dtsi   |  35 +-
 arch/powerpc/boot/dts/fsl/b4860si-post.dtsi| 129 -
 arch/powerpc/boot/dts/fsl/b4si-post.dtsi   | 180 ++-
 arch/powerpc/boot/dts/fsl/p1023si-post.dtsi|  61 ++-
 arch/powerpc/boot/dts/fsl/p2041si-post.dtsi|   9 +-
 arch/powerpc/boot/dts/fsl/p3041si-post.dtsi|   9 +-
 arch/powerpc/boot/dts/fsl/p4080si-post.dtsi|   9 +-
 arch/powerpc/boot/dts/fsl/p5020si-post.dtsi|   9 +-
 arch/powerpc/boot/dts/fsl/p5040si-post.dtsi|   9 +-
 arch/powerpc/boot/dts/fsl/qoriq-bman1-portals.dtsi |  90 
 arch/powerpc/boot/dts/fsl/qoriq-bman1.dtsi |  41 ++
 arch/powerpc/boot/dts/fsl/qoriq-qman1-portals.dtsi | 101 
 arch/powerpc/boot/dts/fsl/qoriq-qman1.dtsi |  41 ++
 arch/powerpc/boot/dts/fsl/qoriq-qman3.dtsi |  41 ++
 arch/powerpc/boot/dts/fsl/t1040si-post.dtsi| 128 -
 arch/powerpc/boot/dts/fsl/t2081si-post.dtsi| 216 +++-
 arch/powerpc/boot/dts/fsl/t4240si-post.dtsi| 568 -
 arch/powerpc/boot/dts/kmcoge4.dts  |  33 ++
 arch/powerpc/boot/dts/oca4080.dts  |  33 ++
 arch/powerpc/boot/dts/p1023rdb.dts |  36 +-
 arch/powerpc/boot/dts/p2041rdb.dts |  35 +-
 arch/powerpc/boot/dts/p3041ds.dts  |  35 +-
 arch/powerpc/boot/dts/p4080ds.dts  |  35 +-
 arch/powerpc/boot/dts/p5020ds.dts  |  35 +-
 arch/powerpc/boot/dts/p5040ds.dts  |  35 +-
 arch/powerpc/boot/dts/t104xqds.dtsi|  35 +-
 arch/powerpc/boot/dts/t104xrdb.dtsi|  32 ++
 arch/powerpc/boot/dts/t208xqds.dtsi|  35 +-
 arch/powerpc/boot/dts/t208xrdb.dtsi|  33 ++
 arch/powerpc/boot/dts/t4240qds.dts |  35 +-
 arch/powerpc/boot/dts/t4240rdb.dts |  33 ++
 31 files changed, 2134 insertions(+), 22 deletions(-)
 create mode 100644 arch/powerpc/boot/dts/fsl/qoriq-bman1-portals.dtsi
 create mode 100644 arch/powerpc/boot/dts/fsl/qoriq-bman1.dtsi
 create mode 100644 arch/powerpc/boot/dts/fsl/qoriq-qman1-portals.dtsi
 create mode 100644 arch/powerpc/boot/dts/fsl/qoriq-qman1.dtsi
 create mode 100644 arch/powerpc/boot/dts/fsl/qoriq-qman3.dtsi

-- 
2.1.3


[PATCH v3 1/4] powerpc/mpc85xx: Create dts components for the FSL QorIQ DPAA BMan

2014-12-01 Thread Emil Medve
From: Kumar Gala 

Change-Id: I16e63db731e55a3d60d4e147573c1af8718082d3
Signed-off-by: Kumar Gala 
Signed-off-by: Geoff Thorpe 
Signed-off-by: Hai-Ying Wang 
[Emil Medve: Sync with the upstream binding]
Signed-off-by: Emil Medve 
---
 arch/powerpc/boot/dts/fsl/qoriq-bman1-portals.dtsi | 90 ++
 arch/powerpc/boot/dts/fsl/qoriq-bman1.dtsi | 41 ++
 2 files changed, 131 insertions(+)
 create mode 100644 arch/powerpc/boot/dts/fsl/qoriq-bman1-portals.dtsi
 create mode 100644 arch/powerpc/boot/dts/fsl/qoriq-bman1.dtsi

diff --git a/arch/powerpc/boot/dts/fsl/qoriq-bman1-portals.dtsi b/arch/powerpc/boot/dts/fsl/qoriq-bman1-portals.dtsi
new file mode 100644
index 000..5022432
--- /dev/null
+++ b/arch/powerpc/boot/dts/fsl/qoriq-bman1-portals.dtsi
@@ -0,0 +1,90 @@
+/*
+ * QorIQ BMan Portal device tree stub for 10 portals
+ *
+ * Copyright 2011 - 2014 Freescale Semiconductor Inc.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ * * Redistributions of source code must retain the above copyright
+ *  notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *  notice, this list of conditions and the following disclaimer in the
+ *  documentation and/or other materials provided with the distribution.
+ * * Neither the name of Freescale Semiconductor nor the
+ *  names of its contributors may be used to endorse or promote products
+ *  derived from this software without specific prior written permission.
+ *
+ *
+ * ALTERNATIVELY, this software may be distributed under the terms of the
+ * GNU General Public License ("GPL") as published by the Free Software
+ * Foundation, either version 2 of that License or (at your option) any
+ * later version.
+ *
+ * THIS SOFTWARE IS PROVIDED BY Freescale Semiconductor ``AS IS'' AND ANY
+ * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+ * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+ * DISCLAIMED. IN NO EVENT SHALL Freescale Semiconductor BE LIABLE FOR ANY
+ * DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+ * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
+ * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+ * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+&bportals {
+   #address-cells = <1>;
+   #size-cells = <1>;
+   compatible = "simple-bus";
+
+   bman-portal@0 {
+   compatible = "fsl,bman-portal";
+   reg = <0x0 0x4000>, <0x10 0x1000>;
+   interrupts = <105 2 0 0>;
+   };
+   bman-portal@4000 {
+   compatible = "fsl,bman-portal";
+   reg = <0x4000 0x4000>, <0x101000 0x1000>;
+   interrupts = <107 2 0 0>;
+   };
+   bman-portal@8000 {
+   compatible = "fsl,bman-portal";
+   reg = <0x8000 0x4000>, <0x102000 0x1000>;
+   interrupts = <109 2 0 0>;
+   };
+   bman-portal@c000 {
+   compatible = "fsl,bman-portal";
+   reg = <0xc000 0x4000>, <0x103000 0x1000>;
+   interrupts = <111 2 0 0>;
+   };
+   bman-portal@1 {
+   compatible = "fsl,bman-portal";
+   reg = <0x1 0x4000>, <0x104000 0x1000>;
+   interrupts = <113 2 0 0>;
+   };
+   bman-portal@14000 {
+   compatible = "fsl,bman-portal";
+   reg = <0x14000 0x4000>, <0x105000 0x1000>;
+   interrupts = <115 2 0 0>;
+   };
+   bman-portal@18000 {
+   compatible = "fsl,bman-portal";
+   reg = <0x18000 0x4000>, <0x106000 0x1000>;
+   interrupts = <117 2 0 0>;
+   };
+   bman-portal@1c000 {
+   compatible = "fsl,bman-portal";
+   reg = <0x1c000 0x4000>, <0x107000 0x1000>;
+   interrupts = <119 2 0 0>;
+   };
+   bman-portal@2 {
+   compatible = "fsl,bman-portal";
+   reg = <0x2 0x4000>, <0x108000 0x1000>;
+   interrupts = <121 2 0 0>;
+   };
+   bman-portal@24000 {
+   compatible = "fsl,bman-portal";
+   reg = <0x24000 0x4000>, <0x109000 0x1000>;
+   interrupts = <123 2 0 0>;
+   };
+};
diff --git a/arch/powerpc/boot/dts/fsl/qoriq-bman1.dtsi b/arch/powerpc/boot/dts/fsl/qoriq-bman1.dtsi
new file mode 100644
index 000..1adc09f
--- /dev/null
+++ b/arch/powerpc/boot/dts/fsl/qoriq-bman1.dtsi
@@ -0,0 +1,41 @@
+/*
+ * QorIQ BMan device tree stub [ controller @ offset 0x31a000 ]
+ *
+ * Copyright 2011 - 201

[PATCH v3 2/4] powerpc/mpc85xx: Create dts components for the FSL QorIQ DPAA QMan

2014-12-01 Thread Emil Medve
From: Kumar Gala 

Change-Id: I16e63db731e55a3d60d4e147573c1af8718082d3
Signed-off-by: Kumar Gala 
Signed-off-by: Geoff Thorpe 
Signed-off-by: Hai-Ying Wang 
[Emil Medve: Sync with the upstream binding]
Signed-off-by: Emil Medve 
---
 arch/powerpc/boot/dts/fsl/qoriq-qman1-portals.dtsi | 101 +
 arch/powerpc/boot/dts/fsl/qoriq-qman1.dtsi |  41 +
 arch/powerpc/boot/dts/fsl/qoriq-qman3.dtsi |  41 +
 3 files changed, 183 insertions(+)
 create mode 100644 arch/powerpc/boot/dts/fsl/qoriq-qman1-portals.dtsi
 create mode 100644 arch/powerpc/boot/dts/fsl/qoriq-qman1.dtsi
 create mode 100644 arch/powerpc/boot/dts/fsl/qoriq-qman3.dtsi

diff --git a/arch/powerpc/boot/dts/fsl/qoriq-qman1-portals.dtsi b/arch/powerpc/boot/dts/fsl/qoriq-qman1-portals.dtsi
new file mode 100644
index 000..05d51ac
--- /dev/null
+++ b/arch/powerpc/boot/dts/fsl/qoriq-qman1-portals.dtsi
@@ -0,0 +1,101 @@
+/*
+ * QorIQ QMan Portal device tree stub for 10 portals & 15 pool channels
+ *
+ * Copyright 2011 - 2014 Freescale Semiconductor Inc.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ * * Redistributions of source code must retain the above copyright
+ *  notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *  notice, this list of conditions and the following disclaimer in the
+ *  documentation and/or other materials provided with the distribution.
+ * * Neither the name of Freescale Semiconductor nor the
+ *  names of its contributors may be used to endorse or promote products
+ *  derived from this software without specific prior written permission.
+ *
+ *
+ * ALTERNATIVELY, this software may be distributed under the terms of the
+ * GNU General Public License ("GPL") as published by the Free Software
+ * Foundation, either version 2 of that License or (at your option) any
+ * later version.
+ *
+ * THIS SOFTWARE IS PROVIDED BY Freescale Semiconductor ``AS IS'' AND ANY
+ * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+ * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+ * DISCLAIMED. IN NO EVENT SHALL Freescale Semiconductor BE LIABLE FOR ANY
+ * DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
+ * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+ * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
+ * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
+ * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+&qportals {
+   #address-cells = <1>;
+   #size-cells = <1>;
+   compatible = "simple-bus";
+
+   qportal0: qman-portal@0 {
+   compatible = "fsl,qman-portal";
+   reg = <0x0 0x4000>, <0x10 0x1000>;
+   interrupts = <104 2 0 0>;
+   fsl,qman-channel-id = <0x0>;
+   };
+   qportal1: qman-portal@4000 {
+   compatible = "fsl,qman-portal";
+   reg = <0x4000 0x4000>, <0x101000 0x1000>;
+   interrupts = <106 2 0 0>;
+   fsl,qman-channel-id = <1>;
+   };
+   qportal2: qman-portal@8000 {
+   compatible = "fsl,qman-portal";
+   reg = <0x8000 0x4000>, <0x102000 0x1000>;
+   interrupts = <108 2 0 0>;
+   fsl,qman-channel-id = <2>;
+   };
+   qportal3: qman-portal@c000 {
+   compatible = "fsl,qman-portal";
+   reg = <0xc000 0x4000>, <0x103000 0x1000>;
+   interrupts = <110 2 0 0>;
+   fsl,qman-channel-id = <3>;
+   };
+   qportal4: qman-portal@1 {
+   compatible = "fsl,qman-portal";
+   reg = <0x1 0x4000>, <0x104000 0x1000>;
+   interrupts = <112 2 0 0>;
+   fsl,qman-channel-id = <4>;
+   };
+   qportal5: qman-portal@14000 {
+   compatible = "fsl,qman-portal";
+   reg = <0x14000 0x4000>, <0x105000 0x1000>;
+   interrupts = <114 2 0 0>;
+   fsl,qman-channel-id = <5>;
+   };
+   qportal6: qman-portal@18000 {
+   compatible = "fsl,qman-portal";
+   reg = <0x18000 0x4000>, <0x106000 0x1000>;
+   interrupts = <116 2 0 0>;
+   fsl,qman-channel-id = <6>;
+   };
+
+   qportal7: qman-portal@1c000 {
+   compatible = "fsl,qman-portal";
+   reg = <0x1c000 0x4000>, <0x107000 0x1000>;
+   interrupts = <118 2 0 0>;
+   fsl,qman-channel-id = <7>;
+   };
+   qportal8: qman-portal@2 {
+   compatible = "fsl,qman-portal";
+   reg = <0x2 0x4000>

[PATCH v3 4/4] powerpc/mpc85xx: Add FSL QorIQ DPAA QMan support to device tree(s)

2014-12-01 Thread Emil Medve
From: Kumar Gala 

Change-Id: If643fa5ba0a903aef8f5056a2c90ebecc995b760
Signed-off-by: Kumar Gala 
Signed-off-by: Geoff Thorpe 
Signed-off-by: Hai-Ying Wang 
Signed-off-by: Chunhe Lan 
Signed-off-by: Poonam Aggrwal 
[Emil Medve: Sync with the upstream binding]
Signed-off-by: Emil Medve 
---
 arch/powerpc/boot/dts/b4qds.dtsi|  16 ++
 arch/powerpc/boot/dts/fsl/b4860si-post.dtsi |  69 +++
 arch/powerpc/boot/dts/fsl/b4si-post.dtsi|  96 +
 arch/powerpc/boot/dts/fsl/p1023si-post.dtsi |  31 +++
 arch/powerpc/boot/dts/fsl/p2041si-post.dtsi |   3 +
 arch/powerpc/boot/dts/fsl/p3041si-post.dtsi |   3 +
 arch/powerpc/boot/dts/fsl/p4080si-post.dtsi |   3 +
 arch/powerpc/boot/dts/fsl/p5020si-post.dtsi |   3 +
 arch/powerpc/boot/dts/fsl/p5040si-post.dtsi |   3 +
 arch/powerpc/boot/dts/fsl/t1040si-post.dtsi |  68 ++
 arch/powerpc/boot/dts/fsl/t2081si-post.dtsi | 116 +++
 arch/powerpc/boot/dts/fsl/t4240si-post.dtsi | 308 
 arch/powerpc/boot/dts/kmcoge4.dts   |  16 ++
 arch/powerpc/boot/dts/oca4080.dts   |  16 ++
 arch/powerpc/boot/dts/p1023rdb.dts  |  16 ++
 arch/powerpc/boot/dts/p2041rdb.dts  |  16 ++
 arch/powerpc/boot/dts/p3041ds.dts   |  16 ++
 arch/powerpc/boot/dts/p4080ds.dts   |  16 ++
 arch/powerpc/boot/dts/p5020ds.dts   |  16 ++
 arch/powerpc/boot/dts/p5040ds.dts   |  16 ++
 arch/powerpc/boot/dts/t104xqds.dtsi |  16 ++
 arch/powerpc/boot/dts/t104xrdb.dtsi |  16 ++
 arch/powerpc/boot/dts/t208xqds.dtsi |  16 ++
 arch/powerpc/boot/dts/t208xrdb.dtsi |  16 ++
 arch/powerpc/boot/dts/t4240qds.dts  |  16 ++
 arch/powerpc/boot/dts/t4240rdb.dts  |  16 ++
 26 files changed, 943 insertions(+)

diff --git a/arch/powerpc/boot/dts/b4qds.dtsi b/arch/powerpc/boot/dts/b4qds.dtsi
index b30fa5d..542ec19 100644
--- a/arch/powerpc/boot/dts/b4qds.dtsi
+++ b/arch/powerpc/boot/dts/b4qds.dtsi
@@ -115,6 +115,18 @@
size = <0 0x100>;
alignment = <0 0x100>;
};
+   qman_fqd: qman-fqd {
+   compatible = "fsl,qman-fqd";
+   alloc-ranges = <0 0 0x 0x>;
+   size = <0 0x40>;
+   alignment = <0 0x40>;
+   };
+   qman_pfdr: qman-pfdr {
+   compatible = "fsl,qman-pfdr";
+   alloc-ranges = <0 0 0x 0x>;
+   size = <0 0x200>;
+   alignment = <0 0x200>;
+   };
};
 
dcsr: dcsr@f {
@@ -125,6 +137,10 @@
ranges = <0x0 0xf 0xf400 0x200>;
};
 
+   qportals: qman-portals@ff600 {
+   ranges = <0x0 0xf 0xf600 0x200>;
+   };
+
soc: soc@ffe00 {
ranges = <0x 0xf 0xfe00 0x100>;
reg = <0xf 0xfe00 0 0x1000>;
diff --git a/arch/powerpc/boot/dts/fsl/b4860si-post.dtsi b/arch/powerpc/boot/dts/fsl/b4860si-post.dtsi
index 2dd61fa..38e297b 100644
--- a/arch/powerpc/boot/dts/fsl/b4860si-post.dtsi
+++ b/arch/powerpc/boot/dts/fsl/b4860si-post.dtsi
@@ -167,6 +167,75 @@
};
 };
 
+&qportals {
+   qportal14: qman-portal@38000 {
+   compatible = "fsl,qman-portal";
+   reg = <0x38000 0x4000>, <0x100e000 0x1000>;
+   interrupts = <132 0x2 0 0>;
+   fsl,qman-channel-id = <0xe>;
+   };
+   qportal15: qman-portal@3c000 {
+   compatible = "fsl,qman-portal";
+   reg = <0x3c000 0x4000>, <0x100f000 0x1000>;
+   interrupts = <134 0x2 0 0>;
+   fsl,qman-channel-id = <0xf>;
+   };
+   qportal16: qman-portal@4 {
+   compatible = "fsl,qman-portal";
+   reg = <0x4 0x4000>, <0x101 0x1000>;
+   interrupts = <136 0x2 0 0>;
+   fsl,qman-channel-id = <0x10>;
+   };
+   qportal17: qman-portal@44000 {
+   compatible = "fsl,qman-portal";
+   reg = <0x44000 0x4000>, <0x1011000 0x1000>;
+   interrupts = <138 0x2 0 0>;
+   fsl,qman-channel-id = <0x11>;
+   };
+   qportal18: qman-portal@48000 {
+   compatible = "fsl,qman-portal";
+   reg = <0x48000 0x4000>, <0x1012000 0x1000>;
+   interrupts = <140 0x2 0 0>;
+   fsl,qman-channel-id = <0x12>;
+   };
+   qportal19: qman-portal@4c000 {
+   compatible = "fsl,qman-portal";
+   reg = <0x4c000 0x4000>, <0x1013000 0x1000>;
+   interrupts = <142 0x2 0 0>;
+   fsl,qman-channel-id = <0x13>;
+   };
+   qportal20: qman-portal@5 {
+   compatible = "fsl,qman-portal";
+   reg = <0x5 0x4000>, <0x1014000 0x1000>;
+   interrupts = <144 0x2 0

[PATCH v3 3/4] powerpc/mpc85xx: Add FSL QorIQ DPAA BMan support to device tree(s)

2014-12-01 Thread Emil Medve
From: Kumar Gala 

Change-Id: If643fa5ba0a903aef8f5056a2c90ebecc995b760
Signed-off-by: Kumar Gala 
Signed-off-by: Geoff Thorpe 
Signed-off-by: Hai-Ying Wang 
Signed-off-by: Chunhe Lan 
Signed-off-by: Poonam Aggrwal 
[Emil Medve: Sync with the upstream binding]
Signed-off-by: Emil Medve 
---
 arch/powerpc/boot/dts/b4qds.dtsi|  19 +-
 arch/powerpc/boot/dts/fsl/b4860si-post.dtsi |  60 ++-
 arch/powerpc/boot/dts/fsl/b4si-post.dtsi|  84 -
 arch/powerpc/boot/dts/fsl/p1023si-post.dtsi |  30 +++-
 arch/powerpc/boot/dts/fsl/p2041si-post.dtsi |   6 +-
 arch/powerpc/boot/dts/fsl/p3041si-post.dtsi |   6 +-
 arch/powerpc/boot/dts/fsl/p4080si-post.dtsi |   6 +-
 arch/powerpc/boot/dts/fsl/p5020si-post.dtsi |   6 +-
 arch/powerpc/boot/dts/fsl/p5040si-post.dtsi |   6 +-
 arch/powerpc/boot/dts/fsl/t1040si-post.dtsi |  60 ++-
 arch/powerpc/boot/dts/fsl/t2081si-post.dtsi | 100 ++-
 arch/powerpc/boot/dts/fsl/t4240si-post.dtsi | 260 +++-
 arch/powerpc/boot/dts/kmcoge4.dts   |  17 ++
 arch/powerpc/boot/dts/oca4080.dts   |  17 ++
 arch/powerpc/boot/dts/p1023rdb.dts  |  20 ++-
 arch/powerpc/boot/dts/p2041rdb.dts  |  19 +-
 arch/powerpc/boot/dts/p3041ds.dts   |  19 +-
 arch/powerpc/boot/dts/p4080ds.dts   |  19 +-
 arch/powerpc/boot/dts/p5020ds.dts   |  19 +-
 arch/powerpc/boot/dts/p5040ds.dts   |  19 +-
 arch/powerpc/boot/dts/t104xqds.dtsi |  19 +-
 arch/powerpc/boot/dts/t104xrdb.dtsi |  16 ++
 arch/powerpc/boot/dts/t208xqds.dtsi |  19 +-
 arch/powerpc/boot/dts/t208xrdb.dtsi |  17 ++
 arch/powerpc/boot/dts/t4240qds.dts  |  19 +-
 arch/powerpc/boot/dts/t4240rdb.dts  |  17 ++
 26 files changed, 877 insertions(+), 22 deletions(-)

diff --git a/arch/powerpc/boot/dts/b4qds.dtsi b/arch/powerpc/boot/dts/b4qds.dtsi
index 6188583..b30fa5d 100644
--- a/arch/powerpc/boot/dts/b4qds.dtsi
+++ b/arch/powerpc/boot/dts/b4qds.dtsi
@@ -1,7 +1,7 @@
 /*
  * B4420DS Device Tree Source
  *
- * Copyright 2012 Freescale Semiconductor, Inc.
+ * Copyright 2012 - 2014 Freescale Semiconductor, Inc.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions are met:
@@ -104,10 +104,27 @@
device_type = "memory";
};
 
+   reserved-memory {
+   #address-cells = <2>;
+   #size-cells = <2>;
+   ranges;
+
+   bman_fbpr: bman-fbpr {
+   compatible = "fsl,bman-fbpr";
+   alloc-ranges = <0 0 0x 0x>;
+   size = <0 0x100>;
+   alignment = <0 0x100>;
+   };
+   };
+
dcsr: dcsr@f {
ranges = <0x 0xf 0x 0x01052000>;
};
 
+   bportals: bman-portals@ff400 {
+   ranges = <0x0 0xf 0xf400 0x200>;
+   };
+
soc: soc@ffe00 {
ranges = <0x 0xf 0xfe00 0x100>;
reg = <0xf 0xfe00 0 0x1000>;
diff --git a/arch/powerpc/boot/dts/fsl/b4860si-post.dtsi b/arch/powerpc/boot/dts/fsl/b4860si-post.dtsi
index f356ed2..2dd61fa 100644
--- a/arch/powerpc/boot/dts/fsl/b4860si-post.dtsi
+++ b/arch/powerpc/boot/dts/fsl/b4860si-post.dtsi
@@ -1,7 +1,7 @@
 /*
  * B4860 Silicon/SoC Device Tree Source (post include)
  *
- * Copyright 2012 Freescale Semiconductor Inc.
+ * Copyright 2012 - 2014 Freescale Semiconductor Inc.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions are met:
@@ -109,6 +109,64 @@
};
 };
 
+&bportals {
+   bman-portal@38000 {
+   compatible = "fsl,bman-portal";
+   reg = <0x38000 0x4000>, <0x100e000 0x1000>;
+   interrupts = <133 2 0 0>;
+   };
+   bman-portal@3c000 {
+   compatible = "fsl,bman-portal";
+   reg = <0x3c000 0x4000>, <0x100f000 0x1000>;
+   interrupts = <135 2 0 0>;
+   };
+   bman-portal@4 {
+   compatible = "fsl,bman-portal";
+   reg = <0x4 0x4000>, <0x101 0x1000>;
+   interrupts = <137 2 0 0>;
+   };
+   bman-portal@44000 {
+   compatible = "fsl,bman-portal";
+   reg = <0x44000 0x4000>, <0x1011000 0x1000>;
+   interrupts = <139 2 0 0>;
+   };
+   bman-portal@48000 {
+   compatible = "fsl,bman-portal";
+   reg = <0x48000 0x4000>, <0x1012000 0x1000>;
+   interrupts = <141 2 0 0>;
+   };
+   bman-portal@4c000 {
+   compatible = "fsl,bman-portal";
+   reg = <0x4c000 0x4000>, <0x1013000 0x1000>;
+   interrupts = <143 2 0 0>;
+   };
+   bman-portal@5 {
+   compatible = "fsl,bman-portal";
+ 

Re: Build regressions/improvements in v3.18-rc7

2014-12-01 Thread Geert Uytterhoeven
On Mon, Dec 1, 2014 at 11:55 AM, Geert Uytterhoeven
 wrote:
> JFYI, when comparing v3.18-rc7[1]  to v3.18-rc6[3], the summaries are:
>   - build errors: +2/-10

Nothing interesting, just a known powerpc-randconfig issue and one more
truncated relocation in powerpc-allyesconfig[*].

> [1] http://kisskb.ellerman.id.au/kisskb/head/8143/ (262 out of 119 configs)
> [3] http://kisskb.ellerman.id.au/kisskb/head/8117/ (120 out of 119 configs)

[*] Just wondering: Recently m68k had to fix a kernel size limitation (cfr.
commit 486df8bc4627bdfc ("m68k: Increase initial mapping to 8 or 16 MiB
if possible")), as multi_defconfig was no longer bootable. How long
before PPC is forced to fix the toolchain due to kernel bloat? ;-)

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

Re: [RFC PATCH 0/2] powerpc: CR based local atomic operation implementation

2014-12-01 Thread Madhavan Srinivasan
On Friday 28 November 2014 03:39 PM, David Laight wrote:
> From: Madhavan Srinivasan
>> On Thursday 27 November 2014 07:35 PM, David Laight wrote:
>>> From: Madhavan Srinivasan
>>>> This patchset creates the infrastructure to handle the CR based
>>>> local_* atomic operations. Local atomic operations are fast
>>>> and highly reentrant per CPU counters.  Used for percpu
>>>> variable updates. Local atomic operations only guarantee
>>>> variable modification atomicity wrt the CPU which owns the
>>>> data and these need to be executed in a preemption safe way.
>>>
>>> These are usually called 'restartable atomic sequences (RAS)'.
>>>
>>>> Here is the design of the first patch. Since local_* operations
>>>> only need to be atomic to interrupts (IIUC), the patch uses
>>>> one of the Condition Register (CR) fields as a flag variable. When
>>>> entering the local_*, a specific bit in the CR5 field is set
>>>> and on exit, the bit is cleared. CR bit checking is done in the
>>>> interrupt return path. If the CR5[EQ] bit is set and we return
>>>> to the kernel, we reset to the start of the local_* operation.
>>>
>>> I don't claim to be able to read ppc assembler.
>>> But I can't see the code that clears CR5[EQ] for the duration
>>> of the ISR.
>> I use crclr instruction at the end of the code block to clear the bit.
>>
>>> Without it a nested interrupt will go through unwanted paths.
> 
> That crclr looks to be in the ISR exit path, you need one in the
> isr entry path.

In the entry path, I clear the field using the mtcr instruction.

> 
>>>
>>> There are also a lot of 'magic' constants in that assembly code.
>>>
>> All these constants are define in asm/ppc-opcode.h
> 
> I was thinking of the lines like:
> + ori r3,r3,16384

OK, sure. I will add comments.

> This one probably deserves a comment - or something
> +"3:" PPC405_ERR77(0,%2)
> 
>>> I also wonder if it is possible to inspect the interrupted
>>> code to determine the start/end of the RAS block.
>>> (Easiest if you assume that there is a single 'write' instruction
>>> as the last entry in the block.)
>>>
>> So each local_* function also has code in the __ex_table section. IIUC,
>> __ex_table contains two addresses. So if the return address is found in the
>> first column of the __ex_table, use the corresponding address in the
>> second column to continue from.
> 
> That really doesn't scale.
> I don't know how many 1000 address pairs your table will have (and the
> ones in each loadable module), but the search isn't going to be cheap.
> 
> If these sequences are restartable then they can only have one write
> to memory.
> 

Maybe, but I see these issues with the instruction decode path:

1) Decoding an instruction may also cause a fault (in the case of a module),
and handling a fault at this stage, in the interrupt exit path, makes me
nervous.

2) The resulting code, with lots of conditions and branches (for opcode
decode), will be messy and may be a maintenance issue.

But let me think this over again.

Regards
Maddy

> Given your:
>> This patch re-write the current local_* functions to CR5 based one.
>> Base flow for each function is 
>>
>> {
>>  set cr5(eq)
>>  load
>>  ..
>>  store
>>  clear cr5(eq)
>> }
> 
> On ISR entry:
> If an ISR detects cr5(eq) set then look at the returned to instruction.
> If it is 'clear cr5(eq)' do nothing.
> Otherwise read backwards through the code (for a max of (say) 16 instructions)
> searching for the 'set cr5(eq)' and change the return address to be that
> of the instruction following the 'set cr5(eq)'.
> In all cases clear cr5(eq) for the ISR itself (leave the saved value 
> unchanged).
> 
> Then you don't need a table of fault locations.
> 
>   David
> 


Re: Right location in sysfs for dlpar file

2014-12-01 Thread Nathan Fontenot
On 11/26/2014 09:12 PM, Benjamin Herrenschmidt wrote:
> Hi Greg,
> 
> So Nathan is working on a patch series to cleanup and improve our
> "DLPAR" infrastructure which is basically our hotplug mechanism when
> running under the PowerVM (aka pHyp) and KVM hypervisors.

The cleanup to the dlpar infrastructure will move the entire operation
of hotplugging a device to the kernel instead of doing it partially in
userspace and partially in the kernel as is currently done.

> 
> I'll let Nathan give you a bit more details/background and answer
> subsequent question you might have as this is really his area of
> expertise.
> 
> To cut a long story short, we need a sysfs file that allows our
> userspace tools to notify the kernel of hotplug events coming from
> the management console (which talks to userspace daemons using a
> proprietary protocol) to "initiate" the hotplug operations, which in
> turn get dispatched internally in the kernel to the right subsystem
> (memory, cpu, pci, ...) based on the resource type.
> 
> On IRC, Greg suggested /sys/firmware and /sys/hypervisor which both
> look like a reasonable option to me, probably better than dlpar...

For PowerVM systems we need this sysfs file to deliver what is
essentially a binary blob (specifically a rtas error log) to the
kernel. The current patch set is creating /sys/kernel/dlpar. As Ben
mentioned we would like your input on what would be the proper place
to create this file.
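
To make the userspace side concrete, delivering the blob could look roughly like the sketch below. The helper name is hypothetical and the final path is exactly what is still being decided; the only assumption carried over from above is that the whole blob is handed to the kernel in a single write().

```c
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: hand a binary blob to a sysfs-style file in one
 * write() call. Returns 0 on success, -1 on any failure or short write.
 */
int deliver_blob(const char *path, const void *blob, size_t len)
{
	int fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;

	/* One write for the whole blob; a short write counts as an error. */
	ssize_t written = write(fd, blob, len);
	int rc = (written == (ssize_t)len) ? 0 : -1;

	if (close(fd) < 0)
		rc = -1;
	return rc;
}
```

A tool would call something like deliver_blob("/sys/kernel/dlpar", log, log_len) with the current patch set, and only the path string changes if the file moves under /sys/firmware or /sys/hypervisor.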

-Nathan


RE: [RFC PATCH 0/2] powerpc: CR based local atomic operation implementation

2014-12-01 Thread David Laight
From: Madhavan Srinivasan [mailto:ma...@linux.vnet.ibm.com]
...
> >>> I also wonder if it is possible to inspect the interrupted
> >>> code to determine the start/end of the RAS block.
> >>> (Easiest if you assume that there is a single 'write' instruction
> >>> as the last entry in the block.)
> >>>
> >> So each local_* function also has code in the __ex_table section. IIUC,
> >> __ex_table contains two addresses. So if the return address is found in the
> >> first column of the __ex_table, use the corresponding address in the
> >> second column to continue from.
> >
> > That really doesn't scale.
> > I don't know how many thousand address pairs your table will have (and the
> > ones in each loadable module), but the search isn't going to be cheap.
> >
> > If these sequences are restartable then they can only have one write
> > to memory.
> >
> 
> Maybe, but I see these issues with the instruction decode path:
> 
> 1) Decoding an instruction may also cause a fault (in the case of a module),
> and handling a fault at this stage, on the interrupt exit path,
> makes me nervous.

It shouldn't be possible to unload a module that is interrupted by
a hardware interrupt.
An 'invalid' loadable module can cause an oops/panic anyway.

> 2) The resulting code, with a lot of conditions and branches (for opcode
> decode), will be a lot messier and may be a maintenance issue.

You don't need to decode the instructions.
Just look for the two specific instructions used as markers.
This is only really possible with fixed-size instructions.

It might also be that the 'interrupt entry' path is easier to
modify than the 'interrupt exit' one (fewer code paths) and
you just need to modify the 'pc' in the stack frame.
You are only interested in interrupts from kernel space.
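
A rough user-space sketch of the marker idea, assuming fixed 4-byte
instructions and a block whose only store is the last instruction before
the end marker. The function name, MAX_SEQ_INSNS and the array-index model
of the 'pc' are made up for illustration; the two marker values are meant
to be the encodings of crset 22 and crclr 22 but should be treated as
assumptions.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed marker encodings (intended as crset 22 / crclr 22). */
#define MARK_BEGIN	0x4ed6b242u
#define MARK_END	0x4ed6b182u

/* Hypothetical upper bound on the length of a restartable block. */
#define MAX_SEQ_INSNS	8

/*
 * text: instruction words, n: number of words, pc: index of the
 * instruction that was about to execute when the interrupt hit.
 * Returns the index to resume at.
 */
static size_t restart_index(const uint32_t *text, size_t n, size_t pc)
{
	size_t i = pc;
	size_t steps;

	/* At the end marker the store has already been done: carry on. */
	if (pc >= n || text[pc] == MARK_END)
		return pc;

	/* Walk back a few instructions looking for a marker. */
	for (steps = 0; i > 0 && steps < MAX_SEQ_INSNS; steps++) {
		i--;
		if (text[i] == MARK_END)
			return pc;	/* past a finished block */
		if (text[i] == MARK_BEGIN)
			return i;	/* inside a block: restart it */
	}
	return pc;			/* not in a block at all */
}
```

The interrupt path would then only have to write the returned location
back into the saved 'pc', and only for interrupts taken from kernel space.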

David



Re: [PATCH v2] i2c: Driver to expose PowerNV platform i2c busses

2014-12-01 Thread Wolfram Sang

> Suuure, let's create more process and committees, and make sure nothing
> gets done in any reasonable amount of time. Have we gone completely
> insane ?

I did not invent DT bindings. I did not invent that DT is/should be a
hardware description. For me, it is a burden that I (as a subsystem
maintainer for mainly drivers) have to prevent people from using DT for
software configuration (some people use it as a 1:1 mapping for
platform data even.) Since there are no guidelines (probably there can't
be), I developed a set of rules out of experience and when those don't
match I ask for help. Having a different set of rules for
powerpc/arm/... (or server/embedded for that matter) will increase this
burden a lot. People will come and say "But they did it as well..."

> It's getting quite tempting to just throw that driver into powerpc.git

Maybe this is the easiest. Just make sure that MAINTAINERS also point
this driver to you or PowerNV maintainers. And no Ack from me, please.
Then, I can always say "I dunno" if people start asking questions.

> > > + pname = of_get_property(pdev->dev.of_node, "ibm,port-name", NULL);
> > > + if (pname)
> > > + strlcpy(adapter->name, pname, sizeof(adapter->name));
> > > + else
> > > + strlcpy(adapter->name, "opal", sizeof(adapter->name));
> > 
> > ... because I'd like to get an ack from them because of this binding.
> 
> And I don't give a flying crap about what random ARM SOC vendor thinks
> of my powerpc FW interface for a powerpc unique FW interface.

But you are not alone here. If you open the box for giving busses a
configurable name, I can see other people (without FW) wanting this,
too. So, this discussion will come anyhow IMO.




Re: [alsa-devel] [PATCH] ASoC: fsl_ssi: free irq before irq_dispose_mapping()

2014-12-01 Thread Timur Tabi

On 12/01/2014 10:49 AM, Lars-Peter Clausen wrote:


> The driver creates the mapping by calling irq_of_parse_and_map(), so it
> also has to dispose of the mapping. But the easy way out is to simply use
> platform_get_irq() instead of irq_of_parse_and_map(). In this case the
> mapping is not managed by the device but by the OF core, so the device
> does not have to dispose of the mapping.


Is this a problem unique to the SSI driver?  Maybe devm_free_irq() 
should also dispose of the mapping?



Re: [alsa-devel] [PATCH] ASoC: fsl_ssi: free irq before irq_dispose_mapping()

2014-12-01 Thread Lars-Peter Clausen

On 12/01/2014 05:51 PM, Timur Tabi wrote:

> On 12/01/2014 10:49 AM, Lars-Peter Clausen wrote:
>
>> The driver creates the mapping by calling irq_of_parse_and_map(), so it
>> also has to dispose of the mapping. But the easy way out is to simply use
>> platform_get_irq() instead of irq_of_parse_and_map(). In this case the
>> mapping is not managed by the device but by the OF core, so the device
>> does not have to dispose of the mapping.
>
> Is this a problem unique to the SSI driver?  Maybe devm_free_irq() should
> also dispose of the mapping?



If the mapping was not created by the device, the device shouldn't dispose 
it. Mapping and requesting the interrupt are two independent operations.


Re: [alsa-devel] [PATCH] ASoC: fsl_ssi: free irq before irq_dispose_mapping()

2014-12-01 Thread Lars-Peter Clausen

On 12/01/2014 07:50 AM, Markus Pargmann wrote:
[...]


> devm_request_irq() is used by other drivers too, this should not be a
> problem. Looking at the code it seems that irq_dispose_mapping may not
> be necessary with devm_request_irq(). So I think it would be better to
> remove irq_dispose_mapping() instead.


The driver creates the mapping by calling irq_of_parse_and_map(), so it also
has to dispose of the mapping. But the easy way out is to simply use
platform_get_irq() instead of irq_of_parse_and_map(). In this case the mapping
is not managed by the device but by the OF core, so the device does not have
to dispose of the mapping.


- Lars

Re: [PATCH] i2c: mpc: add register documentation to Freescale I2C driver

2014-12-01 Thread Wolfram Sang
On Sat, Nov 29, 2014 at 01:58:42PM -0800, Danielle Costantino wrote:
> i2c: mpc: add register documentation to Freescale I2C driver

This should be in one patch.

> 
> return -ETIMEDOUT for all time-out error conditions

This should be in a separate patch.

> and warn on arbitration lost.

And this should be dropped. "Arbitration lost" is not an error, it
is specified behaviour of I2C.

>  if (!(cmd_err & CSR_MCF)) {
> -dev_dbg(i2c->dev, "unfinished\n");
> +dev_warn(i2c->dev, "unfinished\n");

Are you sure this helps a regular user?




Re: [PATCH] i2c-qoriq: modified compatibility for correct prescaler

2014-12-01 Thread Wolfram Sang

> I saw that this patch was marked as not applicable, but on most qoriq
> devices the pre-scaler is 2 especially for p2020/p2010 devices
> arch/powerpc/boot/dts/fsl/p2020si-post.dtsi

Just for completeness: "Not applicable" given from patchwork of the i2c
subsystem means this patch is not for the i2c subsystem. In this case,
it is for powerpc because it was modifying powerpc dts files only. That
doesn't say anything about the patch itself.




Re: [RFC PATCH 2/2]powerpc: rewrite local_* to use CR5 flag

2014-12-01 Thread Gabriel Paubert
On Thu, Nov 27, 2014 at 05:48:41PM +0530, Madhavan Srinivasan wrote:
> This patch rewrites the current local_* functions as CR5-based ones.
> Base flow for each function is 
> 
> {
>   set cr5(eq)
>   load
>   ..
>   store
>   clear cr5(eq)
> }
> 
> The above set of instructions is followed by a fixup section which points
> to the entry of the function in case of an interrupt in the flow. If the
> interrupt happens after the store, we just continue to the last
> instruction in that block.
> 
> Currently only asm/local.h has been rewritten, and local64 is TODO.
> Also the entire change is only for PPC64.
> 
> Signed-off-by: Madhavan Srinivasan 
> ---
>  arch/powerpc/include/asm/local.h | 306 +++
>  1 file changed, 306 insertions(+)
> 
> diff --git a/arch/powerpc/include/asm/local.h b/arch/powerpc/include/asm/local.h
> index b8da913..a26e5d3 100644
> --- a/arch/powerpc/include/asm/local.h
> +++ b/arch/powerpc/include/asm/local.h
> @@ -11,6 +11,310 @@ typedef struct
>  
>  #define LOCAL_INIT(i){ ATOMIC_LONG_INIT(i) }
>  
> +#ifdef   CONFIG_PPC64
> +
> +static __inline__ long local_read(local_t *l)
> +{
> + long t;
> +
> + __asm__ __volatile__(
> +"1:  crset   22\n"
> +"2:" PPC_LL" %0,0(%1)\n"
> +"3:  crclr   22\n"
> +"4:\n"
> +".section __ex_table,\"a\"\n"
> + PPC_LONG_ALIGN "\n"
> + PPC_LONG "2b,1b\n"
> + PPC_LONG "3b,3b\n"
> +".previous\n"
> + : "=&r" (t)
> + : "r" (&(l->a.counter)));
> +
> + return t;
> +}
> +
> +static __inline__ void local_set(local_t *l, long i)
> +{
> + long t;
> +
> + __asm__ __volatile__(
> +"1:  crset   22\n"
> +"2:" PPC_LL" %0,0(%1)\n"
> +"3:" PPC405_ERR77(0,%2)
> +"4:" PPC_STL" %0,0(%2)\n"
> +"5:  crclr   22\n"
> +"6:\n"
> +".section __ex_table,\"a\"\n"
> + PPC_LONG_ALIGN "\n"
> + PPC_LONG "2b,1b\n"
> + PPC_LONG "3b,1b\n"
> + PPC_LONG "4b,1b\n"
> + PPC_LONG "5b,5b\n"
> +".previous\n"
> + : "=&r" (t)
> + : "r" (&(i)), "r" (&(l->a.counter)));
> +}
> +

Apart from the other comments on bloat (which can very likely be
removed by tracing backwards for a few instructions and dropping
the exception table entries, which are 2 or 4 (64 bit?) times as
large as the instruction sequence), I don't understand at all why
you need these sequences for the local_read and local_set functions.

After all these are single instructions (why do you perform a read before
the write in set when the result of the read is never used?).

I believe read and set are better mapped to access_once (or assign_once
or whatever it's called after the recent discussion on linux-kernel).
You don't even need a memory barrier if it's for a single thread,
so you could get away with a single volatile access to the variables.
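
As a rough illustration of the single volatile access mentioned above,
here is a user-space sketch. The type and function names are made up and
this is not the kernel's local_t API; it only shows that read and set
reduce to one load or one store with no bracketing sequence.

```c
/* Hypothetical stand-in for local_t, for illustration only. */
typedef struct {
	long counter;
} local_sketch_t;

/* One volatile load: the compiler may not cache or split the access. */
static inline long local_sketch_read(const local_sketch_t *l)
{
	return *(const volatile long *)&l->counter;
}

/* One volatile store: no preceding load and no memory barrier. */
static inline void local_sketch_set(local_sketch_t *l, long i)
{
	*(volatile long *)&l->counter = i;
}
```

Since each function is a single aligned access, an interrupt can only
arrive before or after it, never in the middle.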

For the other ones, I think that what you do is correct, except that
the workaround for PPC405 erratum 77 is not needed since this erratum
only affects the stwcx. instruction and the whole point of the patch
is to avoid the use of an l?arx/st?cx. pair.

Regards,
Gabriel

> +static __inline__ void local_add(long i, local_t *l)
> +{
> + long t;
> +
> + __asm__ __volatile__(
> +"1:  crset   22\n"
> +"2:" PPC_LL" %0,0(%2)\n"
> +"3:  add %0,%1,%0\n"
> +"4:" PPC405_ERR77(0,%2)
> +"5:" PPC_STL" %0,0(%2)\n"
> +"6:  crclr   22\n"
> +"7:\n"
> +".section __ex_table,\"a\"\n"
> + PPC_LONG_ALIGN "\n"
> + PPC_LONG "2b,1b\n"
> + PPC_LONG "3b,1b\n"
> + PPC_LONG "4b,1b\n"
> + PPC_LONG "5b,1b\n"
> + PPC_LONG "6b,6b\n"
> +".previous\n"
> + : "=&r" (t)
> + : "r" (i), "r" (&(l->a.counter)));
> +}
> +
> +static __inline__ void local_sub(long i, local_t *l)
> +{
> + long t;
> +
> + __asm__ __volatile__(
> +"1:  crset   22\n"
> +"2:" PPC_LL" %0,0(%2)\n"
> +"3:  subf%0,%1,%0\n"
> +"4:" PPC405_ERR77(0,%2)
> +"5:" PPC_STL" %0,0(%2)\n"
> +"6:  crclr   22\n"
> +"7:\n"
> +".section __ex_table,\"a\"\n"
> + PPC_LONG_ALIGN "\n"
> + PPC_LONG "2b,1b\n"
> + PPC_LONG "3b,1b\n"
> + PPC_LONG "4b,1b\n"
> + PPC_LONG "5b,1b\n"
> + PPC_LONG "6b,6b\n"
> +".previous\n"
> + : "=&r" (t)
> + : "r" (i), "r" (&(l->a.counter)));
> +}
> +
> +static __inline__ long local_add_return(long a, local_t *l)
> +{
> + long t;
> +
> + __asm__ __volatile__(
> +"1:  crset   22\n"
> +"2:" PPC_LL" %0,0(%2)\n"
> +"3:  add %0,%1,%0\n"
> +"4:" PPC405_ERR77(0,%2)
> +"5:" PPC_STL "%0,0(%2)\n"
> +"6:  crclr   22\n"
> +"7:\n"
> +".section __ex_table,\"a\"\n"
> + PPC_LONG_ALIGN "\n"
> + PPC_LONG "2b,1b\n"
> + PPC_LONG "3b,1b\n"
> + PPC_LONG "4b,1b\n"
> + PPC_LONG "5b,1b\n"
> + PPC_LONG "6b,6b\n"
> +".previous\n"
> + : "=&r" (t)
> + : "r" (a), "r" (&(l->a.counter))
> + : "cc", "memory");
> +
> + return t;
> +}
> +
> +
> +#define local_add_negative(a, l) (local_add_return((a), (l)) < 0)
> +
> +static __inline__ long local_sub_return(long a, local_t *l)

Re: [alsa-devel] [PATCH] ASoC: fsl_ssi: free irq before irq_dispose_mapping()

2014-12-01 Thread Timur Tabi

On 12/01/2014 10:49 AM, Lars-Peter Clausen wrote:

> The driver creates the mapping by calling irq_of_parse_and_map(), so it
> also has to dispose the mapping.


I agree with Markus, this does seem weird.  It sounds like you're saying 
that irq_of_parse_and_map() and devm_request_irq() are incompatible.  A 
quick grep shows the following drivers that call both functions:


ata/pata_mpc52xx.c
built-in.o
cpufreq/exynos5440-cpufreq.c
crypto/omap-sham.c
dma/moxart-dma.c
edac/mpc85xx_edac.c
hsi/clients/nokia-modem.c
i2c/busses/i2c-wmt.c
input/serio/apbps2.c
mmc/host/omap_hsmmc.c
mmc/host/moxart-mmc.c
mtd/nand/mpc5121_nfc.c
net/ethernet/arc/emac_main.c
net/ethernet/moxa/moxart_ether.c
pci/host/pcie-rcar.c
pinctrl/samsung/pinctrl-exynos5440.c
pinctrl/samsung/pinctrl-exynos.c
pinctrl/pinctrl-bcm2835.c
spi/spi-bcm2835.c
spi/spi-mpc512x-psc.c
staging/xillybus/xillybus_of.c
thermal/samsung/exynos_tmu.c


Re: [alsa-devel] [PATCH] ASoC: fsl_ssi: free irq before irq_dispose_mapping()

2014-12-01 Thread Mark Brown
On Mon, Dec 01, 2014 at 05:49:56PM +0100, Lars-Peter Clausen wrote:
> On 12/01/2014 07:50 AM, Markus Pargmann wrote:

> >devm_request_irq() is used by other drivers too, this should not be a
> >problem. Looking at the code it seems that irq_dispose_mapping may not
> >be necessary with devm_request_irq(). So I think it would be better to
> >remove irq_dispose_mapping() instead.

> The driver creates the mapping by calling irq_of_parse_and_map(), so it also
> has to dispose the mapping. But the easy way out is to simply use
> platform_get_irq() instead of irq_of_parse_and_map(). In this case the mapping
> is not managed by the device but by the of core, so the device has not to
> dispose the mapping.

It also has the advantage of not being DT specific so providing some
chance that future firmware interfaces can be supported without driver
modification.



Re: [alsa-devel] [PATCH] ASoC: fsl_ssi: free irq before irq_dispose_mapping()

2014-12-01 Thread Lars-Peter Clausen

On 12/01/2014 07:48 PM, Timur Tabi wrote:

> On 12/01/2014 10:49 AM, Lars-Peter Clausen wrote:
>
>> The driver creates the mapping by calling irq_of_parse_and_map(), so it
>> also has to dispose the mapping.
>
> I agree with Markus, this does seem weird.  It sounds like you're saying
> that irq_of_parse_and_map() and devm_request_irq() are incompatible.


They probably are. You have to create the mapping before you request the IRQ,
and if devm_request_irq() is used the IRQ is only freed again after the
remove function of the driver has been called. Yet if a driver creates a
mapping in its probe function it should also dispose of it in its remove
function. So you are stuck with either disposing of the mapping before
freeing the IRQ or leaking the mapping.


My opinion on this is that devices should not create mappings and should 
leave that to the core. This quite easily solves the dilemma.
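
The ordering problem can be modelled in user space with a toy event log.
This is only a sketch under the assumption stated above, namely that
remove() runs before the managed IRQ is released. All names are
hypothetical and none of this is kernel API.

```c
#include <stddef.h>

/* Teardown events in this toy model (names are hypothetical). */
enum teardown_event { EV_DISPOSE_MAPPING, EV_FREE_IRQ };

struct teardown_log {
	enum teardown_event events[4];
	size_t count;
};

static void log_event(struct teardown_log *log, enum teardown_event ev)
{
	if (log->count < sizeof(log->events) / sizeof(log->events[0]))
		log->events[log->count++] = ev;
}

/* Models a remove() that disposes the mapping created in probe(). */
static void model_remove(struct teardown_log *log)
{
	log_event(log, EV_DISPOSE_MAPPING);
}

/* Models the managed-resource release that frees a devm-requested IRQ. */
static void model_devres_release(struct teardown_log *log)
{
	log_event(log, EV_FREE_IRQ);
}

/* Models unbinding: remove() first, managed resources afterwards. */
static void model_unbind(struct teardown_log *log)
{
	model_remove(log);
	model_devres_release(log);
}

/* Returns 1 if the mapping went away while the IRQ was still requested. */
static int disposed_before_free(const struct teardown_log *log)
{
	size_t i;

	for (i = 0; i < log->count; i++) {
		if (log->events[i] == EV_FREE_IRQ)
			return 0;
		if (log->events[i] == EV_DISPOSE_MAPPING)
			return 1;
	}
	return 0;
}
```

In this model a driver that disposes the mapping in remove() always ends
up with the dispose logged ahead of the free, which is the conflict
described above.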



> A quick grep shows the following drivers that call both functions:



Most of these drivers will probably work fine without irq_of_parse_and_map().


> ata/pata_mpc52xx.c
> built-in.o
> cpufreq/exynos5440-cpufreq.c
> crypto/omap-sham.c
> dma/moxart-dma.c
> edac/mpc85xx_edac.c
> hsi/clients/nokia-modem.c
> i2c/busses/i2c-wmt.c
> input/serio/apbps2.c
> mmc/host/omap_hsmmc.c
> mmc/host/moxart-mmc.c
> mtd/nand/mpc5121_nfc.c
> net/ethernet/arc/emac_main.c
> net/ethernet/moxa/moxart_ether.c
> pci/host/pcie-rcar.c
> pinctrl/samsung/pinctrl-exynos5440.c
> pinctrl/samsung/pinctrl-exynos.c
> pinctrl/pinctrl-bcm2835.c
> spi/spi-bcm2835.c
> spi/spi-mpc512x-psc.c
> staging/xillybus/xillybus_of.c
> thermal/samsung/exynos_tmu.c




Re: [alsa-devel] [PATCH] ASoC: fsl_ssi: free irq before irq_dispose_mapping()

2014-12-01 Thread Mark Brown
On Mon, Dec 01, 2014 at 08:39:51PM +0100, Lars-Peter Clausen wrote:
> On 12/01/2014 07:48 PM, Timur Tabi wrote:

> >A quick grep shows the following drivers that call both functions:

> Most of these drivers will probably work fine without irq_of_parse_and_map().

I'd also note that quite a few of these drivers look pretty legacy - a
very large proportion are for old PowerPC hardware, though by no means
all.



Re: [alsa-devel] [PATCH] ASoC: fsl_ssi: free irq before irq_dispose_mapping()

2014-12-01 Thread Arnd Bergmann
On Monday 01 December 2014 19:41:47 Mark Brown wrote:
> On Mon, Dec 01, 2014 at 08:39:51PM +0100, Lars-Peter Clausen wrote:
> > On 12/01/2014 07:48 PM, Timur Tabi wrote:
> 
> > >A quick grep shows the following drivers that call both functions:
> 
> > Most of these drivers will probably work fine without 
> > irq_of_parse_and_map().
> 
> I'd also note that quite a few of these drivers look pretty legacy - a
> very large proportion are for old PowerPC hardware, though by no means
> all.

Right, from the times before we were using platform_device for probing
device tree based devices and they had to map the interrupt themselves.

Some of them, like arch/powerpc/sysdev/fsl_pci.c, seem fine: this one
does not expect to ever destroy a device, and it only unmaps the
interrupt if request_irq fails. drivers/ata/pata_mpc52xx.c on the
other hand seems wrong in the same way as drivers/edac/mpc85xx_edac.c
and sound/soc/fsl/fsl_ssi.c.

All other drivers that call irq_of_parse_and_map and pass that into
devm_request_irq just never unmap, and their interrupts are already
mapped by the platform code, so I think it's not even a leak.

Arnd

Re: [alsa-devel] [PATCH] ASoC: fsl_ssi: free irq before irq_dispose_mapping()

2014-12-01 Thread Timur Tabi

On 12/01/2014 01:56 PM, Arnd Bergmann wrote:

> All other drivers that call irq_of_parse_and_map and pass that into
> devm_request_irq just never unmap, and their interrupts are already
> mapped by the platform code, so I think it's not even a leak.


Does this mean that fsl_ssi.c should not be calling 
irq_of_parse_and_map?  How else should it get the IRQ?


Re: [alsa-devel] [PATCH] ASoC: fsl_ssi: free irq before irq_dispose_mapping()

2014-12-01 Thread Arnd Bergmann
On Monday 01 December 2014 13:59:27 Timur Tabi wrote:
> On 12/01/2014 01:56 PM, Arnd Bergmann wrote:
> > All other drivers that call irq_of_parse_and_map and pass that into
> > devm_request_irq just never unmap, and their interrupts are already
> > mapped by the platform code, so I think it's not even a leak.
> 
> Does this mean that fsl_ssi.c should not be calling 
> irq_of_parse_and_map?  How else should it get the IRQ?

platform_get_irq()

Arnd

Re: [alsa-devel] [PATCH] ASoC: fsl_ssi: free irq before irq_dispose_mapping()

2014-12-01 Thread Timur Tabi

On 12/01/2014 02:01 PM, Arnd Bergmann wrote:

>> Does this mean that fsl_ssi.c should not be calling
>> irq_of_parse_and_map?  How else should it get the IRQ?
>
> platform_get_irq()


Ok, but that function also calls irq_create_of_mapping().  So it still 
appears that the only way to get the IRQ is to map it, but then we can't 
use devm_request_irq().



Re: [alsa-devel] [PATCH] ASoC: fsl_ssi: free irq before irq_dispose_mapping()

2014-12-01 Thread Mark Brown
On Mon, Dec 01, 2014 at 09:01:43PM +0100, Arnd Bergmann wrote:
> On Monday 01 December 2014 13:59:27 Timur Tabi wrote:
> > On 12/01/2014 01:56 PM, Arnd Bergmann wrote:

> > > All other drivers that call irq_of_parse_and_map and pass that into
> > > devm_request_irq just never unmap, and their interrupts are already
> > > mapped by the platform code, so I think it's not even a leak.

> > Does this mean that fsl_ssi.c should not be calling 
> > irq_of_parse_and_map?  How else should it get the IRQ?

> platform_get_irq()

Right, and just to emphasize what we were saying earlier the code was
fine when originally written - both mapping inside platform_get_irq()
and devm_ came along quite a while after the driver was originally
written.



Re: [alsa-devel] [PATCH] ASoC: fsl_ssi: free irq before irq_dispose_mapping()

2014-12-01 Thread Lars-Peter Clausen

On 12/01/2014 09:11 PM, Timur Tabi wrote:

> On 12/01/2014 02:01 PM, Arnd Bergmann wrote:
>
>>> Does this mean that fsl_ssi.c should not be calling
>>> irq_of_parse_and_map?  How else should it get the IRQ?
>>
>> platform_get_irq()
>
> Ok, but that function also calls irq_create_of_mapping().  So it still
> appears that the only way to get the IRQ is to map it, but then we can't use
> devm_request_irq().



Hm... that's new. But it's not really a driver issue anymore if it is done 
in the core. So I guess for now just use platform_get_irq() and ignore the 
other issue.


Re: [alsa-devel] [PATCH] ASoC: fsl_ssi: free irq before irq_dispose_mapping()

2014-12-01 Thread Fabio Estevam
On Mon, Dec 1, 2014 at 6:30 PM, Lars-Peter Clausen  wrote:
> On 12/01/2014 09:11 PM, Timur Tabi wrote:
>>
>> On 12/01/2014 02:01 PM, Arnd Bergmann wrote:

>>>> Does this mean that fsl_ssi.c should not be calling
>>>> irq_of_parse_and_map?  How else should it get the IRQ?
>>>
>>> platform_get_irq()
>>
>>
>> Ok, but that function also calls irq_create_of_mapping().  So it still
>> appears that the only way to get the IRQ is to map it, but then we can't
>> use devm_request_irq().
>>
>
> Hm... that's new. But it's not really a driver issue anymore if it is done
> in the core. So I guess for now just use platform_get_irq() and ignore the
> other issue.

With the suggested changes below, the removal of the driver works fine on a mx6:

root@freescale /$ modprobe   snd-soc-fsl-ssi
root@freescale /$ modprobe snd-soc-imx-wm8962
[  319.517679] input: WM8962 Beep Generator as /devices/soc0/soc/210.aips-bus/21a.i2c/i2c-0/0-001a/input/input7
[  319.543225] imx-wm8962 sound: wm8962 <-> 202c000.ssi mapping ok
root@freescale /$ rmmod  snd-soc-imx-wm8962
root@freescale /$ rmmod   snd-soc-fsl-ssi

 sound/soc/fsl/fsl_ssi.c | 11 ++-
 1 file changed, 2 insertions(+), 9 deletions(-)

diff --git a/sound/soc/fsl/fsl_ssi.c b/sound/soc/fsl/fsl_ssi.c
index 32a31d9..c528f16 100644
--- a/sound/soc/fsl/fsl_ssi.c
+++ b/sound/soc/fsl/fsl_ssi.c
@@ -1361,7 +1361,7 @@ static int fsl_ssi_probe(struct platform_device *pdev)
 		return PTR_ERR(ssi_private->regs);
 	}

-	ssi_private->irq = irq_of_parse_and_map(np, 0);
+	ssi_private->irq = platform_get_irq(pdev, 0);
 	if (!ssi_private->irq) {
 		dev_err(&pdev->dev, "no irq for node %s\n", np->full_name);
 		return -ENXIO;
@@ -1387,7 +1387,7 @@ static int fsl_ssi_probe(struct platform_device *pdev)
 	if (ssi_private->soc->imx) {
 		ret = fsl_ssi_imx_probe(pdev, ssi_private, iomem);
 		if (ret)
-			goto error_irqmap;
+			return ret;
 	}

 	ret = snd_soc_register_component(&pdev->dev, &fsl_ssi_component,
@@ -1458,10 +1458,6 @@ error_asoc_register:
 	if (ssi_private->soc->imx)
 		fsl_ssi_imx_clean(pdev, ssi_private);

-error_irqmap:
-	if (ssi_private->use_dma)
-		irq_dispose_mapping(ssi_private->irq);
-
 	return ret;
 }

@@ -1478,9 +1474,6 @@ static int fsl_ssi_remove(struct platform_device *pdev)
 	if (ssi_private->soc->imx)
 		fsl_ssi_imx_clean(pdev, ssi_private);

-	if (ssi_private->use_dma)
-		irq_dispose_mapping(ssi_private->irq);
-
 	return 0;
 }

-- 
1.9.1

Re: [alsa-devel] [PATCH] ASoC: fsl_ssi: free irq before irq_dispose_mapping()

2014-12-01 Thread Timur Tabi

On 12/01/2014 02:40 PM, Fabio Estevam wrote:

>> Hm... that's new. But it's not really a driver issue anymore if it is done
>> in the core. So I guess for now just use platform_get_irq() and ignore the
>> other issue.
>
> With the suggested changes below, the removal of the driver works fine on a mx6:


Would the mapping continue to exist after the driver is unloaded?  Can 
you try multiple loads/unloads and see if interrupts still work?


Re: [alsa-devel] [PATCH] ASoC: fsl_ssi: free irq before irq_dispose_mapping()

2014-12-01 Thread Fabio Estevam
On Mon, Dec 1, 2014 at 6:42 PM, Timur Tabi  wrote:

> Would the mapping continue to exist after the driver is unloaded?  Can you
> try multiple loads/unloads and see if interrupts still work?

I tried multiple loads/unloads and audio works fine with those changes.

About the ssi irq we have:

- With the ssi driver loaded:
root@freescale /home$ cat /proc/interrupts | grep ssi
 79:  0  0  0  0   GIC  79  202c000.ssi

- After removing the ssi driver:
root@freescale /home$ rmmod   snd-soc-fsl-ssi
root@freescale /home$ cat /proc/interrupts | grep ssi
root@freescale /home$

So it seems to behave properly.

Re: [PATCH v2] slab: Fix nodeid bounds check for non-contiguous node IDs

2014-12-01 Thread David Rientjes
On Mon, 1 Dec 2014, Paul Mackerras wrote:

> The bounds check for nodeid in cache_alloc_node gives false
> positives on machines where the node IDs are not contiguous, leading
> to a panic at boot time.  For example, on a POWER8 machine the node
> IDs are typically 0, 1, 16 and 17.  This means that num_online_nodes()
> returns 4, so when cache_alloc_node is called with nodeid = 16 the
> VM_BUG_ON triggers, like this:
> 
> kernel BUG at /home/paulus/kernel/kvm/mm/slab.c:3079!
> Oops: Exception in kernel mode, sig: 5 [#1]
> SMP NR_CPUS=1024 NUMA PowerNV
> Modules linked in:
> CPU: 0 PID: 0 Comm: swapper Not tainted 3.18.0-rc5-kvm+ #17
> task: c13ba230 ti: c1494000 task.ti: c1494000
> NIP: c0264f6c LR: c0264f5c CTR: 
> REGS: c14979a0 TRAP: 0700   Not tainted  (3.18.0-rc5-kvm+)
> MSR: 92021032   CR: 28000448  XER: 2000
> CFAR: c047e978 SOFTE: 0
> GPR00: c0264f5c c1497c20 c1499d48 0004
> GPR04: 0100 0010 0068 
> GPR08:  0001 082d c0cca5a8
> GPR12: 48000448 cfda 01003bd44ff0 10020578
> GPR16: 01003bd44ff8 01003bd45000 0001 
> GPR20:    0010
> GPR24: c00ffe80 c0c824ec 0068 c00ffe80
> GPR28: 0010 c00ffe80 0010 
> NIP [c0264f6c] .cache_alloc_node+0x6c/0x270
> LR [c0264f5c] .cache_alloc_node+0x5c/0x270
> Call Trace:
> [c1497c20] [c0264f5c] .cache_alloc_node+0x5c/0x270 
> (unreliable)
> [c1497cf0] [c026552c] .kmem_cache_alloc_node_trace+0xdc/0x360
> [c1497dc0] [c0c824ec] .init_list+0x3c/0x128
> [c1497e50] [c0c827b4] .kmem_cache_init+0x1dc/0x258
> [c1497ef0] [c0c54090] .start_kernel+0x2a0/0x568
> [c1497f90] [c0008c6c] start_here_common+0x20/0xa8
> Instruction dump:
> 7c7d1b78 7c962378 4bda4e91 6000 3c620004 38800100 386370d8 48219959
> 6000 7f83e000 7d301026 5529effe <0b09> 393c0010 79291f24 7d3d4a14
> 
> To fix this, we instead compare the nodeid with MAX_NUMNODES, and
> additionally make sure it isn't negative (since nodeid is an int).
> The check is there mainly to protect the array dereference in the
> get_node() call in the next line, and the array being dereferenced is
> of size MAX_NUMNODES.  If the nodeid is in range but invalid (for
> example if the node is off-line), the BUG_ON in the next line will
> catch that.
> 
> Signed-off-by: Paul Mackerras 

Acked-by: David Rientjes 
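
For readers following along, the false positive described in the quoted
changelog can be reproduced with a small user-space sketch. The function
names and the SKETCH_MAX_NUMNODES value are made up, and the old check is
modelled as a comparison against the online-node count, which is an
assumption about its exact form.

```c
#include <stdbool.h>

/* Hypothetical stand-in for MAX_NUMNODES (size of the per-node array). */
#define SKETCH_MAX_NUMNODES 256

/* Old-style check (assumed form): compare the id with the node count. */
static bool old_check_passes(int nodeid, int num_online)
{
	return nodeid <= num_online;
}

/* New-style check: the id only has to be a valid array index. */
static bool new_check_passes(int nodeid)
{
	return nodeid >= 0 && nodeid < SKETCH_MAX_NUMNODES;
}
```

With node IDs 0, 1, 16 and 17 online, the count is 4, so the old form
rejects node 16 while the new form accepts it and still rejects negative
or out-of-range IDs.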

Re: [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t

2014-12-01 Thread Gabriel Paubert
On Thu, Nov 27, 2014 at 05:48:40PM +0530, Madhavan Srinivasan wrote:
> This patch creates the infrastructure to handle the CR-based
> local_* atomic operations. Local atomic operations are fast
> and highly reentrant per-CPU counters, used for percpu
> variable updates. Local atomic operations only guarantee
> variable modification atomicity wrt the CPU which owns the
> data, and they need to be executed in a preemption-safe way.
> 
> Here is the design of this patch. Since local_* operations
> only need to be atomic with respect to interrupts (IIUC), the patch
> uses one of the Condition Register (CR) fields as a flag variable. When
> entering a local_* operation, a specific bit in the CR5 field is set,
> and on exit the bit is cleared. The CR bit is checked in the
> interrupt return path. If the CR5[EQ] bit is set and we are returning
> to the kernel, we restart from the beginning of the local_* operation.
> 
> The reason for this approach is that the l[w/d]arx/st[w/d]cx.
> instruction pair is currently used for local_* operations; these are
> heavy on cycle count and do not support a local variant. So to
> see whether the new implementation helps, I used a modified
> version of Rusty's benchmark code on local_t.
> 
> https://lkml.org/lkml/2008/12/16/450
> 
> Modifications: 
>  - increased the working set size from 1MB to 8MB,
>  - removed cpu_local_inc test.
> 
> Test ran 
> - on POWER8 1S Scale out System 2.0GHz
> - on OPAL v3 with v3.18-rc4 patch kernel as Host
> 
> Here are the values with the patch.
> 
> Time in ns per iteration
> 
>               inc     add     read    add_return
> atomic_long   67      67      18      69
> irqsave/rest  39      39      23      39
> trivalue      39      39      29      49
> local_t       26      26      24      26
> 
> Since CR5 is used as a flag, I have added CFLAGS to keep the compiler
> from using CR5 when building the kernel, and CR5 is zeroed at kernel
> entry.
> 
> Tested the patch in a 
>  - pSeries LPAR, 
>  - Host with patched/unmodified guest kernel 
> 
> To check whether userspace sees any CR5 corruption, I ran a simple
> test which does:
>  - set CR5 field,
>  - while(1)
>    - sleep or gettimeofday
>    - check that the bit is still set
> 
> Signed-off-by: Madhavan Srinivasan 
> ---
> - I would really appreciate feedback on the patchset.
> - Kindly comment if I should try any other benchmark or
> workload to check the numbers.
> - Also, kindly recommend any known stress test for CR.
> 
>  Makefile |   6 ++
>  arch/powerpc/include/asm/exception-64s.h |  21 +-
>  arch/powerpc/kernel/entry_64.S           | 106 ++-
>  arch/powerpc/kernel/exceptions-64s.S |   2 +-
>  arch/powerpc/kernel/head_64.S|   8 +++
>  5 files changed, 138 insertions(+), 5 deletions(-)
> 
> diff --git a/Makefile b/Makefile
> index 00d618b..2e271ad 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -706,6 +706,12 @@ endif
>  
>  KBUILD_CFLAGS   += $(call cc-option, -fno-var-tracking-assignments)
>  
> +ifdef CONFIG_PPC64
> +# We need this flag to force compiler not to use CR5, since
> +# local_t type code is based on this.
> +KBUILD_CFLAGS   += -ffixed-cr5
> +endif
> +
>  ifdef CONFIG_DEBUG_INFO
>  ifdef CONFIG_DEBUG_INFO_SPLIT
>  KBUILD_CFLAGS   += $(call cc-option, -gsplit-dwarf, -g)
> diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
> index 77f52b2..c42919a 100644
> --- a/arch/powerpc/include/asm/exception-64s.h
> +++ b/arch/powerpc/include/asm/exception-64s.h
> @@ -306,7 +306,26 @@ do_kvm_##n:  
> \
>   std r10,0(r1);  /* make stack chain pointer */ \
>   std r0,GPR0(r1);/* save r0 in stackframe*/ \
>   std r10,GPR1(r1);   /* save r1 in stackframe*/ \
> - beq 4f; /* if from kernel mode  */ \
> +BEGIN_FTR_SECTION;  \
> + lis r9,4096;/* Create a mask with HV and PR */ \
> + rldicr  r9,r9,32,31;/* bits, AND with the MSR   */ \
> + mr  r10,r9; /* to check for Hyp state   */ \
> + ori r9,r9,16384;   \
> + and r9,r12,r9; \
> + cmpd    cr3,r10,r9; \
> + beq cr3,66f;/* Jump if we come from Hyp mode*/ \
> + mtcrf   0x04,r10;   /* Clear CR5 if coming from usr */ \

I think you can do better than this; powerpc has a fantastic set
of rotate and mask instructions. If I understand your code correctly,
you can replace it with the following:

rldicl  r10,r12,4,63   /* Extract HV bit to LSB of r10*/
rlwinm  r9,r12,19,0x02 /* Extract PR bit to 2nd to last bit of r9 */
or  r9,r9,r10
cmplwi  cr3,
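
The reply is cut off at this point. As a rough check of the idea only (not of the missing instructions), the sketch below models both sequences in C: the mask-and-compare from the patch and the rotate-and-mask extraction suggested above. It assumes MSR[HV] is bit 60 and MSR[PR] is bit 14 counting from the least significant bit, which is what the constants in the patch work out to, and it reads the third operand of the `or` as r10. The function names are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* MSR bits tested by the quoted code: HV is bit 60, PR is bit 14 (LSB = bit 0). */
#define MSR_HV (UINT64_C(1) << 60)
#define MSR_PR (UINT64_C(1) << 14)

static uint64_t rotl64(uint64_t v, unsigned n) { return (v << n) | (v >> (64 - n)); }
static uint32_t rotl32(uint32_t v, unsigned n) { return (v << n) | (v >> (32 - n)); }

/* Model of the patch: build an HV|PR mask, AND it with the MSR, compare with HV. */
static int from_hyp_mask(uint64_t msr)
{
	uint64_t r9 = UINT64_C(4096) << 16;	/* lis    r9,4096     */
	uint64_t r10;

	r9 <<= 32;				/* rldicr r9,r9,32,31 */
	r10 = r9;				/* mr     r10,r9      */
	r9 |= 16384;				/* ori    r9,r9,16384 */
	r9 &= msr;				/* and    r9,r12,r9   */
	return r10 == r9;			/* cmpd   cr3,r10,r9  */
}

/* Model of the suggestion: HV lands in bit 0, PR in bit 1, then OR them. */
static int from_hyp_rotate(uint64_t msr)
{
	uint64_t r10 = rotl64(msr, 4) & 1;			/* rldicl r10,r12,4,63   */
	uint64_t r9 = rotl32((uint32_t)msr, 19) & 0x02;	/* rlwinm r9,r12,19,0x02 */

	return (r9 | r10) == 1;	/* only HV=1 with PR=0 yields exactly 1 */
}

/* Both models must agree for every HV/PR combination, with unrelated bits set. */
static void check_equivalence(void)
{
	const uint64_t noise = UINT64_C(0x8000000000009032);
	unsigned i;

	for (i = 0; i < 4; i++) {
		uint64_t msr = noise;

		if (i & 1)
			msr |= MSR_HV;
		if (i & 2)
			msr |= MSR_PR;
		assert(from_hyp_mask(msr) == from_hyp_rotate(msr));
	}
}
```

Calling check_equivalence() exercises all four HV/PR combinations; in this model only HV=1 with PR=0 is treated as "coming from Hyp mode" by either sequence.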

Re: [PATCH 02/10] mm: Add p[te|md] protnone helpers for use by NUMA balancing

2014-12-01 Thread Benjamin Herrenschmidt
On Fri, 2014-11-21 at 13:57 +, Mel Gorman wrote:

>  #ifdef CONFIG_NUMA_BALANCING
> +/*
> + * These work without NUMA balancing but the kernel does not care. See the
> + * comment in include/asm-generic/pgtable.h
> + */
> +static inline int pte_protnone(pte_t pte)
> +{
> + return (pte_val(pte) &
> + (_PAGE_PRESENT | _PAGE_USER)) == _PAGE_PRESENT;
> +}

I would add a comment clarifying that this only works for user pages,
ie, this accessor will always return "true" for a kernel page on ppc.
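
To make the point concrete, here is a small user-space model of the quoted test. The bit values are placeholders chosen for illustration, not the real powerpc definitions, and the function name is invented.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit values only; the real definitions live in the powerpc headers. */
#define PAGE_PRESENT UINT64_C(0x1)
#define PAGE_USER    UINT64_C(0x2)

/* Same test as the quoted pte_protnone(): present, but without the user bit. */
static int model_pte_protnone(uint64_t pte_val)
{
	return (pte_val & (PAGE_PRESENT | PAGE_USER)) == PAGE_PRESENT;
}

static void protnone_examples(void)
{
	/* Ordinary user mapping: present and user, so not protnone. */
	assert(!model_pte_protnone(PAGE_PRESENT | PAGE_USER));
	/* User mapping with the user bit cleared: reported as protnone. */
	assert(model_pte_protnone(PAGE_PRESENT));
	/* Kernel mapping (present, never has the user bit): also "true". */
	assert(model_pte_protnone(PAGE_PRESENT | UINT64_C(0x100)));
	/* Not present at all: not protnone. */
	assert(!model_pte_protnone(0));
}
```

In this model a kernel PTE and a protnone user PTE are indistinguishable to the helper, which is why a comment restricting it to user pages seems worthwhile.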

> +static inline int pmd_protnone(pmd_t pmd)
> +{
> + return pte_protnone(pmd_pte(pmd));
> +}
> +
>  static inline int pte_present(pte_t pte)
>  {
>   return pte_val(pte) & _PAGE_NUMA_MASK;
> diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
> index 081d6f4..2e25780 100644
> --- a/arch/x86/include/asm/pgtable.h
> +++ b/arch/x86/include/asm/pgtable.h
> @@ -502,6 +502,22 @@ static inline int pmd_present(pmd_t pmd)
>_PAGE_NUMA);
>  }
>  
> +#ifdef CONFIG_NUMA_BALANCING
> +/*
> + * These work without NUMA balancing but the kernel does not care. See the
> + * comment in include/asm-generic/pgtable.h
> + */
> +static inline int pte_protnone(pte_t pte)
> +{
> + return pte_flags(pte) & _PAGE_PROTNONE;
> +}
> +
> +static inline int pmd_protnone(pmd_t pmd)
> +{
> + return pmd_flags(pmd) & _PAGE_PROTNONE;
> +}
> +#endif /* CONFIG_NUMA_BALANCING */
> +
>  static inline int pmd_none(pmd_t pmd)
>  {
>   /* Only check low word on 32-bit platforms, since it might be
> diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
> index 177d597..d497d08 100644
> --- a/include/asm-generic/pgtable.h
> +++ b/include/asm-generic/pgtable.h
> @@ -688,6 +688,26 @@ static inline int pmd_trans_unstable(pmd_t *pmd)
>  #endif
>  }
>  
> +#ifndef CONFIG_NUMA_BALANCING
> +/*
> + * Technically a PTE can be PROTNONE even when not doing NUMA balancing but
> + * the only case the kernel cares is for NUMA balancing and is only ever set
> + * when the VMA is accessible. For PROT_NONE VMAs, the PTEs are not marked
> + * _PAGE_PROTNONE so by default, implement the helper as "always no". It
> + * is the responsibility of the caller to distinguish between PROT_NONE
> + * protections and NUMA hinting fault protections.
> + */
> +static inline int pte_protnone(pte_t pte)
> +{
> + return 0;
> +}
> +
> +static inline int pmd_protnone(pmd_t pmd)
> +{
> + return 0;
> +}
> +#endif /* CONFIG_NUMA_BALANCING */
> +
>  #ifdef CONFIG_NUMA_BALANCING
>  /*
>   * _PAGE_NUMA distinguishes between an unmapped page table entry, an entry that



Re: [PATCH 03/10] mm: Convert p[te|md]_numa users to p[te|md]_protnone_numa

2014-12-01 Thread Benjamin Herrenschmidt
On Fri, 2014-11-21 at 13:57 +, Mel Gorman wrote:
> Convert existing users of pte_numa and friends to the new helper. Note
> that the kernel is broken after this patch is applied until the other
> page table modifiers are also altered. This patch layout is to make
> review easier.

Aneesh, the removal of the DSISR_PROTFAULT checks, I wonder if we might
break something here ... (I know, I asked for them to be removed :-)

IE, we basically bounce all protection checks to the "normal" VMA
protection checking, so far so good...

But what about the subpage protection stuff ? Will that still work ?

Cheers,
Ben.

> Signed-off-by: Mel Gorman 
> Acked-by: Linus Torvalds 
> Acked-by: Aneesh Kumar 
> ---
>  arch/powerpc/kvm/book3s_hv_rm_mmu.c |  2 +-
>  arch/powerpc/mm/fault.c |  5 -
>  arch/powerpc/mm/pgtable.c   | 11 ---
>  arch/powerpc/mm/pgtable_64.c|  3 ++-
>  arch/x86/mm/gup.c   |  4 ++--
>  include/uapi/linux/mempolicy.h  |  2 +-
>  mm/gup.c| 10 +-
>  mm/huge_memory.c| 16 +++
>  mm/memory.c |  4 ++--
>  mm/mprotect.c   | 39 
> ++---
>  mm/pgtable-generic.c|  2 +-
>  11 files changed, 40 insertions(+), 58 deletions(-)
> 
> diff --git a/arch/powerpc/kvm/book3s_hv_rm_mmu.c 
> b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
> index 084ad54..3e6ad3f 100644
> --- a/arch/powerpc/kvm/book3s_hv_rm_mmu.c
> +++ b/arch/powerpc/kvm/book3s_hv_rm_mmu.c
> @@ -235,7 +235,7 @@ long kvmppc_do_h_enter(struct kvm *kvm, unsigned long 
> flags,
>   pte_size = psize;
>   pte = lookup_linux_pte_and_update(pgdir, hva, writing,
> &pte_size);
> - if (pte_present(pte) && !pte_numa(pte)) {
> + if (pte_present(pte) && !pte_protnone(pte)) {
>   if (writing && !pte_write(pte))
>   /* make the actual HPTE be read-only */
>   ptel = hpte_make_readonly(ptel);
> diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
> index eb79907..b434153 100644
> --- a/arch/powerpc/mm/fault.c
> +++ b/arch/powerpc/mm/fault.c
> @@ -398,8 +398,6 @@ good_area:
>* processors use the same I/D cache coherency mechanism
>* as embedded.
>*/
> - if (error_code & DSISR_PROTFAULT)
> - goto bad_area;
>  #endif /* CONFIG_PPC_STD_MMU */
>  
>   /*
> @@ -423,9 +421,6 @@ good_area:
>   flags |= FAULT_FLAG_WRITE;
>   /* a read */
>   } else {
> - /* protection fault */
> - if (error_code & 0x0800)
> - goto bad_area;
>   if (!(vma->vm_flags & (VM_READ | VM_EXEC | VM_WRITE)))
>   goto bad_area;
>   }
> diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c
> index c90e602..83dfcb5 100644
> --- a/arch/powerpc/mm/pgtable.c
> +++ b/arch/powerpc/mm/pgtable.c
> @@ -172,9 +172,14 @@ static pte_t set_access_flags_filter(pte_t pte, struct 
> vm_area_struct *vma,
>  void set_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
>   pte_t pte)
>  {
> -#ifdef CONFIG_DEBUG_VM
> - WARN_ON(pte_val(*ptep) & _PAGE_PRESENT);
> -#endif
> + /*
> +  * When handling numa faults, we already have the pte marked
> +  * _PAGE_PRESENT, but we can be sure that it is not in hpte.
> +  * Hence we can use set_pte_at for them.
> +  */
> + VM_WARN_ON((pte_val(*ptep) & (_PAGE_PRESENT | _PAGE_USER)) ==
> + (_PAGE_PRESENT | _PAGE_USER));
> +
>   /* Note: mm->context.id might not yet have been assigned as
>* this context might not have been activated yet when this
>* is called.
> diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c
> index 87ff0c1..435ebf7 100644
> --- a/arch/powerpc/mm/pgtable_64.c
> +++ b/arch/powerpc/mm/pgtable_64.c
> @@ -718,7 +718,8 @@ void set_pmd_at(struct mm_struct *mm, unsigned long addr,
>   pmd_t *pmdp, pmd_t pmd)
>  {
>  #ifdef CONFIG_DEBUG_VM
> - WARN_ON(pmd_val(*pmdp) & _PAGE_PRESENT);
> + WARN_ON((pmd_val(*pmdp) & (_PAGE_PRESENT | _PAGE_USER)) ==
> + (_PAGE_PRESENT | _PAGE_USER));
>   assert_spin_locked(&mm->page_table_lock);
>   WARN_ON(!pmd_trans_huge(pmd));
>  #endif
> diff --git a/arch/x86/mm/gup.c b/arch/x86/mm/gup.c
> index 207d9aef..f32e12c 100644
> --- a/arch/x86/mm/gup.c
> +++ b/arch/x86/mm/gup.c
> @@ -84,7 +84,7 @@ static noinline int gup_pte_range(pmd_t pmd, unsigned long 
> addr,
>   struct page *page;
>  
>   /* Similar to the PMD case, NUMA hinting must take slow path */
> - if (pte_numa(pte)) {
> + if (pte_protnone(pte)) {
>   pte_unmap(ptep);
>   return 

Re: [PATCH 03/10] mm: Convert p[te|md]_numa users to p[te|md]_protnone_numa

2014-12-01 Thread Benjamin Herrenschmidt
On Fri, 2014-11-21 at 13:57 +, Mel Gorman wrote:
> void set_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
> pte_t pte)
>  {
> -#ifdef CONFIG_DEBUG_VM
> -   WARN_ON(pte_val(*ptep) & _PAGE_PRESENT);
> -#endif
> +   /*
> +* When handling numa faults, we already have the pte marked
> +* _PAGE_PRESENT, but we can be sure that it is not in hpte.
> +* Hence we can use set_pte_at for them.
> +*/
> +   VM_WARN_ON((pte_val(*ptep) & (_PAGE_PRESENT | _PAGE_USER)) ==
> +   (_PAGE_PRESENT | _PAGE_USER));
> +

How is that going to fare with set_pte_at() called for kernel pages?
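
For reference, a user-space comparison of the old and new debug conditions. The bit values are placeholders and the function names are invented; only the two expressions are taken from the hunk.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_PRESENT UINT64_C(0x1)	/* illustrative values, not the real ones */
#define PAGE_USER    UINT64_C(0x2)

/* Old debug check: warn whenever the PTE being overwritten is present. */
static int old_check_warns(uint64_t old_pte)
{
	return (old_pte & PAGE_PRESENT) != 0;
}

/* New debug check: warn only if the PTE being overwritten is present AND user. */
static int new_check_warns(uint64_t old_pte)
{
	return (old_pte & (PAGE_PRESENT | PAGE_USER)) ==
	       (PAGE_PRESENT | PAGE_USER);
}

static void compare_checks(void)
{
	uint64_t user_pte = PAGE_PRESENT | PAGE_USER;
	uint64_t kernel_pte = PAGE_PRESENT;	/* present, no user bit */

	/* Overwriting a live user PTE is flagged by both versions. */
	assert(old_check_warns(user_pte) && new_check_warns(user_pte));
	/* Overwriting a live kernel PTE is flagged only by the old version. */
	assert(old_check_warns(kernel_pte));
	assert(!new_check_warns(kernel_pte));
}
```

If this model matches the real bit layout, the new condition never fires for kernel mappings, so the debug coverage for them is lost.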

Cheers,
Ben.



Re: [PATCH 2/2] powerpc/mm: don't do tlbie for updatepp request with NO HPTE fault

2014-12-01 Thread Benjamin Herrenschmidt
On Mon, 2014-11-03 at 20:21 +0530, Aneesh Kumar K.V wrote:
> --- a/arch/powerpc/mm/hash_native_64.c
> +++ b/arch/powerpc/mm/hash_native_64.c
> @@ -283,11 +283,11 @@ static long native_hpte_remove(unsigned long hpte_group)
>  
>  static long native_hpte_updatepp(unsigned long slot, unsigned long newpp,
>  unsigned long vpn, int bpsize,
> -int apsize, int ssize, int local)
> +int apsize, int ssize, unsigned long flags)
>  {
> struct hash_pte *hptep = htab_address + slot;
> unsigned long hpte_v, want_v;
> -   int ret = 0;
> +   int ret = 0, local = 0;
>  
> want_v = hpte_encode_avpn(vpn, bpsize, ssize);
>  
> @@ -322,8 +322,15 @@ static long native_hpte_updatepp(unsigned long slot, 
> unsigned long newpp,
> }
> native_unlock_hpte(hptep);
> }
> -   /* Ensure it is out of the tlb too. */
> -   tlbie(vpn, bpsize, apsize, ssize, local);
> +
> +   if (flags & HPTE_LOCAL_UPDATE)
> +   local = 1;
> +   /*
> +* Ensure it is out of the tlb too if it is not a nohpte fault
> +*/
> +   if (!(flags & HPTE_NOHPTE_UPDATE))
> +   tlbie(vpn, bpsize, apsize, ssize, local);
> +
> return ret;
>  }

An additional refinement we discussed that I'd like you to test/measure
is to basically always be local for updatepp unless we have a flag that
forces us not to.

That flag would be set by copro faults only.
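
Roughly, the two policies could be compared like this. HPTE_LOCAL_UPDATE and HPTE_NOHPTE_UPDATE come from the patch, but the values used here are assumed; HPTE_GLOBAL_UPDATE is a made-up name for the "force non-local" flag, and the enum and functions are a user-space sketch only.

```c
#include <assert.h>

#define HPTE_LOCAL_UPDATE	0x1UL	/* name from the patch, value assumed */
#define HPTE_NOHPTE_UPDATE	0x2UL	/* name from the patch, value assumed */
#define HPTE_GLOBAL_UPDATE	0x4UL	/* hypothetical flag for copro faults */

enum tlbie_kind { TLBIE_NONE, TLBIE_LOCAL, TLBIE_GLOBAL };

/* Policy in the quoted hunk: global unless the caller asks for local. */
static enum tlbie_kind patch_policy(unsigned long flags)
{
	if (flags & HPTE_NOHPTE_UPDATE)
		return TLBIE_NONE;
	return (flags & HPTE_LOCAL_UPDATE) ? TLBIE_LOCAL : TLBIE_GLOBAL;
}

/* Suggested refinement: local unless a flag forces a global invalidate. */
static enum tlbie_kind refined_policy(unsigned long flags)
{
	if (flags & HPTE_NOHPTE_UPDATE)
		return TLBIE_NONE;
	return (flags & HPTE_GLOBAL_UPDATE) ? TLBIE_GLOBAL : TLBIE_LOCAL;
}

static void compare_policies(void)
{
	/* No flags: the patch broadcasts, the refinement stays local. */
	assert(patch_policy(0) == TLBIE_GLOBAL);
	assert(refined_policy(0) == TLBIE_LOCAL);
	/* A nohpte fault skips the invalidate under both policies. */
	assert(patch_policy(HPTE_NOHPTE_UPDATE) == TLBIE_NONE);
	assert(refined_policy(HPTE_NOHPTE_UPDATE) == TLBIE_NONE);
	/* Only the hypothetical copro flag makes the refinement go global. */
	assert(refined_policy(HPTE_GLOBAL_UPDATE) == TLBIE_GLOBAL);
}
```

The only behavioural difference in this sketch is the default case, which is the one worth measuring.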

Can you do something on top of this series ?

Cheers,
Ben.



Re: [PATCH 2/2] powerpc/mm: don't do tlbie for updatepp request with NO HPTE fault

2014-12-01 Thread Michael Ellerman
On Mon, 2014-11-03 at 20:21 +0530, Aneesh Kumar K.V wrote:
> updatepp gets called for a nohpte fault, when we find from the linux
> page table that the translation was hashed before. In that case
> we are sure that there is no existing translation, hence we could
> avoid doing tlbie.

We are sure there *was* no existing translation. It's possible that since the
nohpte fault occurred the translation has been loaded into the tlb.

Ben says that's OK, because updatepp is only ever relaxing permissions. But
please add some explanation of that to the changelog - it's not obvious.

> @@ -322,8 +322,15 @@ static long native_hpte_updatepp(unsigned long slot, 
> unsigned long newpp,
>   }
>   native_unlock_hpte(hptep);
>   }
> - /* Ensure it is out of the tlb too. */
> - tlbie(vpn, bpsize, apsize, ssize, local);
> +
> + if (flags & HPTE_LOCAL_UPDATE)
> + local = 1;
> + /*
> +  * Ensure it is out of the tlb too if it is not a nohpte fault
> +  */
> + if (!(flags & HPTE_NOHPTE_UPDATE))
> + tlbie(vpn, bpsize, apsize, ssize, local);
> +
>   return ret;
>  }

The context preceding this hunk includes this comment:

/*
 * We need to invalidate the TLB always because hpte_remove doesn't do
 * a tlb invalidate. If a hash bucket gets full, we "evict" a more/less
 * random entry from it. When we do that we don't invalidate the TLB
 * (hpte_remove) because we assume the old translation is still
 * technically "valid".
 */

Which seems out of sync with the code now.

cheers




Re: [RFC PATCH 1/2]powerpc: foundation code to handle CR5 for local_t

2014-12-01 Thread Scott Wood
On Thu, 2014-11-27 at 17:48 +0530, Madhavan Srinivasan wrote:
> - I really appreciate feedback on the patchset.
> - Kindly comment if I should try with any other benchmark or
> workload to check the numbers.
> - Also, kindly recommend any known stress test for CR
> 
>  Makefile |   6 ++
>  arch/powerpc/include/asm/exception-64s.h |  21 +-
>  arch/powerpc/kernel/entry_64.S   | 106 ++-
>  arch/powerpc/kernel/exceptions-64s.S |   2 +-
>  arch/powerpc/kernel/head_64.S|   8 +++
>  5 files changed, 138 insertions(+), 5 deletions(-)

Patch 2/2 enables this for all PPC64, not just book3s -- so please don't
forget about the book3e exception paths (also MSR[GS] for KVM, but
aren't most if not all the places you're checking for HV mode after KVM
would have taken control?  Or am I missing something about how book3s
KVM works?).

Or, if you don't want to do that, change patch 2/2 to be book3s only and
ifdef-protect the changes to common exception code.

> @@ -224,8 +243,26 @@ syscall_exit:
>  BEGIN_FTR_SECTION
>   stdcx.  r0,0,r1 /* to clear the reservation */
>  END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
> +BEGIN_FTR_SECTION
> + lis r4,4096
> + rldicr  r4,r4,32,31
> + mr  r6,r4
> + ori r4,r4,16384
> + and r4,r8,r4
> + cmpd    cr3,r6,r4
> + beq cr3,65f
> + mtcr    r5
> +FTR_SECTION_ELSE
>   andi.   r6,r8,MSR_PR
> - ld  r4,_LINK(r1)
> + beq 65f
> + mtcr    r5
> + nop
> + nop
> + nop
> + nop
> + nop
> +ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE)
> +65:  ld  r4,_LINK(r1)
>  
>   beq-    1f
>   ACCOUNT_CPU_USER_EXIT(r11, r12)
> @@ -234,7 +271,11 @@ END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
>  1:   ld  r2,GPR2(r1)
>   ld  r1,GPR1(r1)
>   mtlr    r4
> +#ifdef   CONFIG_PPC64
> + mtcrf   0xFB,r5
> +#else
>   mtcr    r5
> +#endif
> +#endif

mtcrf with more than one CRn being updated is expensive on Freescale
chips (and this isn't a book3s-only code path).  Why do you need to do
it twice?  I don't see where either r5 or cr5 are messed with between
the two places...
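
For anyone decoding the masks: in the 8-bit mtcrf field mask the most significant bit selects CR0 and the least significant selects CR7, so 0x04 is CR5 alone and 0xFB is every field except CR5. A small C sketch of that arithmetic (helper names are made up):

```c
#include <assert.h>
#include <stdint.h>

/* Mask bit for CR field n (0..7): CR0 is the most significant of the 8 bits. */
static uint8_t crf_mask(unsigned field)
{
	return (uint8_t)(0x80u >> field);
}

/* Mask selecting every CR field except the given one. */
static uint8_t crf_mask_all_but(unsigned field)
{
	return (uint8_t)(0xFFu & ~(unsigned)crf_mask(field));
}

/* Number of CR fields a given mask updates. */
static unsigned crf_fields_selected(uint8_t fxm)
{
	unsigned count = 0;

	while (fxm) {
		count += fxm & 1u;
		fxm >>= 1;
	}
	return count;
}

static void mtcrf_mask_examples(void)
{
	assert(crf_mask(5) == 0x04);			/* mtcrf 0x04: CR5 only */
	assert(crf_mask_all_but(5) == 0xFB);		/* mtcrf 0xFB: all but CR5 */
	assert(crf_fields_selected(0x04) == 1);
	assert(crf_fields_selected(0xFB) == 7);		/* multi-field update */
}
```

So the 0xFB form is a seven-field update, which is the multi-field case flagged as expensive above.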

-Scott



Re: [PATCH 2/2] powerpc/mm: don't do tlbie for updatepp request with NO HPTE fault

2014-12-01 Thread Benjamin Herrenschmidt
On Mon, 2014-11-03 at 20:21 +0530, Aneesh Kumar K.V wrote:
> updatepp gets called for a nohpte fault, when we find from the linux
> page table that the translation was hashed before. In that case
> we are sure that there is no existing translation, hence we could
> avoid doing tlbie.

You need to test your own stuff together :-)

/home/benh/linux-powerpc-test/arch/powerpc/mm/hugepage-hash64.c: In function '__hash_page_thp':
/home/benh/linux-powerpc-test/arch/powerpc/mm/hugepage-hash64.c:98:17: error: 'local' undeclared (first use in this function)
/home/benh/linux-powerpc-test/arch/powerpc/mm/hugepage-hash64.c:98:17: note: each undeclared identifier is reported only once for each function it appears in

Cheers,
Ben.



Re: Right location in sysfs for dlpar file

2014-12-01 Thread Greg KH
On Mon, Dec 01, 2014 at 09:41:03AM -0600, Nathan Fontenot wrote:
> On 11/26/2014 09:12 PM, Benjamin Herrenschmidt wrote:
> > Hi Greg,
> > 
> > So Nathan is working on a patch series to cleanup and improve our
> > "DLPAR" infrastructure which is basically our hotplug mechanism when
> > running under the PowerVM (aka pHyp) and KVM hypervisors.
> 
> The cleanup to the dlpar infrastructure will move the entire operation
> of hotplugging a device to the kernel instead of doing it partially in
> userspace and partially in the kernel as is currently done.
> 
> > 
> > I'll let Nathan give you a bit more details/background and answer
> > subsequent question you might have as this is really his area of
> > expertise.
> > 
> > To cut a long story short, we need a sysfs file that allows our
> > userspace tools to notify the kernel of hotplug events coming from
> > the management console (which talks to userspace daemons using a
> > proprietary protocol) to "initiate" the hotplug operations, which in
> > turn get dispatched internally in the kernel to the right subsystem
> > (memory, cpu, pci, ...) based on the resource type.
> > 
> > On IRC, Greg suggested /sys/firmware and /sys/hypervisor which both
> > look like a reasonable option to me, probably better than dlpar...
> 
> For PowerVM systems we need this sysfs file to deliver what is
> essentially a binary blob (specifically a rtas error log) to the
> kernel. The current patch set is creating /sys/kernel/dlpar. As Ben
> mentioned we would like your input on what would be the proper place
> to create this file.

And what is the kernel supposed to do with such a binary blob?  Parse
it?  Or pass it to something else?

Anyway, let's see the patches before I guess anything else, that will
determine how things work out best.

thanks,

greg k-h

Re: [PATCH] Documentation: bindings: net: DPAA corenet binding document

2014-12-01 Thread Scott Wood
On Fri, 2014-11-28 at 12:10 +0200, Madalin Bucur wrote:
> Add the device tree binding document for the DPAA corenet node
> and DPAA Ethernet nodes.
> 
> Signed-off-by: Madalin Bucur 
> ---
>  Documentation/devicetree/bindings/net/fsl-dpaa.txt | 31 
> ++
>  1 file changed, 31 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/net/fsl-dpaa.txt
> 
> diff --git a/Documentation/devicetree/bindings/net/fsl-dpaa.txt 
> b/Documentation/devicetree/bindings/net/fsl-dpaa.txt
> new file mode 100644
> index 000..822c668
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/net/fsl-dpaa.txt
> @@ -0,0 +1,31 @@
> +*DPAA corenet
> +
> +The corenet bus containing all DPAA Ethernet nodes.

What does this have to do with corenet?

> +Required property
> + - compatible: string property.  Must include "fsl,dpaa". Can include
> +   also "fsl,-dpaa".

No need for the  part.  As we previously discussed, the only
purpose of this node is backwards compatibility with the U-Boot MAC
address fixup -- if U-Boot doesn't look for the  version, then
don't complicate things.

Though, I can't find where U-Boot references this node.  Are you sure
it's not using the ethernet%d aliases like everything else, in which
case why do we need this node at all?

-Scott



Re: [PATCH 2/2] powerpc/mm: don't do tlbie for updatepp request with NO HPTE fault

2014-12-01 Thread Aneesh Kumar K.V
Benjamin Herrenschmidt  writes:

> On Mon, 2014-11-03 at 20:21 +0530, Aneesh Kumar K.V wrote:
>> updatepp gets called for a nohpte fault, when we find from the linux
>> page table that the translation was hashed before. In that case
>> we are sure that there is no existing translation, hence we could
>> avoid doing tlbie.
>
> You need to test your own stuff together :-)
>
> /home/benh/linux-powerpc-test/arch/powerpc/mm/hugepage-hash64.c: In function '__hash_page_thp':
> /home/benh/linux-powerpc-test/arch/powerpc/mm/hugepage-hash64.c:98:17: error: 'local' undeclared (first use in this function)
> /home/benh/linux-powerpc-test/arch/powerpc/mm/hugepage-hash64.c:98:17: note: each undeclared identifier is reported only once for each function it appears in
>

I will redo that patch. I was not sure in which order these patches would
get pulled in, so all of that was done on top of master, hence the
conflict. I will respin this on top of what you pushed to next and send
only this patch again.


Thanks
-aneesh


Re: [RFC PATCH v1 1/1] powerpc/85xx: Add support for Emerson/Artesyn MVME2500.

2014-12-01 Thread Scott Wood
On Thu, 2014-11-27 at 15:28 +0100, Alessio Igor Bogani wrote:
> Scott,
> 
> On 26 November 2014 at 23:21, Scott Wood  wrote:
> > On Wed, 2014-11-26 at 15:17 +0100, Alessio Igor Bogani wrote:
> >> + lbc: localbus@ffe05000 {
> >> + reg = <0 0xffe05000 0 0x1000>;
> >> +
> >
> > It's not possible to program the LBC with a window of only 0x1000 bytes.
> 
> All similar boards seem to have the same value there. 

I was referring to the final ranges entry:

> + 0x5 0x0 0x0 0xffdf 0x1000>;

The localbus ranges should reflect what was programmed into BRn/ORn.
The smallest size that can be programmed into ORn is 32 KiB.
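
As a quick sanity check for the ranges sizes, something like the following could be used. The 32 KiB minimum is the figure given above; the power-of-two requirement is an extra assumption of this sketch, and the helper name is invented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define LBC_MIN_WINDOW	(32u * 1024u)	/* smallest size programmable into ORn */

/*
 * Returns true if a localbus ranges size could plausibly be programmed.
 * Assumption of this sketch: windows are power-of-two sized.
 */
static bool lbc_window_size_ok(uint32_t size)
{
	bool power_of_two = size != 0 && (size & (size - 1)) == 0;

	return power_of_two && size >= LBC_MIN_WINDOW;
}

static void lbc_window_examples(void)
{
	assert(!lbc_window_size_ok(0x1000));	/* 4 KiB, as in the quoted entry */
	assert(lbc_window_size_ok(0x8000));	/* 32 KiB, the minimum */
	assert(lbc_window_size_ok(0x10000));	/* 64 KiB */
}
```

Under these assumptions the 0x1000 size in the quoted ranges entry is rejected, while 0x8000 is the smallest value accepted.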

> AFAIK 0x1000 is a offset so it stands for 4KB.

It's not an offset.  It's a size in both cases.

> >> +
> >> + serial2: serial@1,0 {
> >> + #cell-index = <2>;
> >> + device_type = "serial";
> >> + compatible = "ns16550";
> >> + reg = <0x1 0x0 0x100>;
> >> + clock-frequency = <1843200>;
> >> + interrupts = <11 2 0 0>;
> >> + };
> >
> > Why do you need cell-index, what connection do these values have to
> > actual hardware (e.g. values written to a register, rather than numbers
> > in a manual), and why did the name change to #cell-index?
> 
> I have used fsl/pq3-duart-0.dtsi as template and #cell-index are used there.

"cell-index" is used there (though I don't know why), not "#cell-index".
The latter string does not appear anywhere in the kernel.

> >> +/include/ "mvme2500.dtsi"
> >
> > Are you going to have more than one .dts using this .dtsi?  If not, why
> > separate this part?
> 
> The pq3-gpio-0.dtsi defines a GPIO controller in this way:
> 
> gpio-controller@f000 {
>  reg = <0xf000 0x100>;
>  [...]
> 
> But MVME2500 board requires a slightly different definition:
> 
>  reg = <0xfc00 0x100>;

The GPIO CCSR registers on a P2010 don't change based on what board you
put it on.  It looks like pq3-gpio-0.dtsi is just wrong, for all chips
that use it.  It should be fixed there.

> Override gpio-controller reg definition included by
> fsl/p2020si-post.dtsi (which includes the above mentioned
> fsl/pq3-gpio-0.dtsi) using mvme2500.dtsi is the only solution I have
> found so far.
> 
> Can you suggest me a better approach, please?

There's no need here, but if you did for some reason need to override
something in -post.dtsi, you could just put it after the include
of the post file.  No need to push the board's fragment into yet another
dtsi. 

> >> diff --git a/arch/powerpc/configs/85xx/mvme2500_defconfig 
> >> b/arch/powerpc/configs/85xx/mvme2500_defconfig
> >> new file mode 100644
> >> index 000..06fe629
> >> --- /dev/null
> >> +++ b/arch/powerpc/configs/85xx/mvme2500_defconfig
> >
> > Why does this board need its own defconfig?
> >
> > If it's just for the address space stuff, maybe it could be a more
> > general mpc85xx_2g_1g_1g_defconfig.  xes_mpc85xx_defconfig uses the same
> > layout (though it's SMP).  Maybe other boards could share it in the
> > future, or users of existing boards might prefer it...
> 
> Sorry for my ignorance, but what are *_defconfigs supposed to provide?
> A barely bootable system (in that case I can pick the config of a
> similar board) or a system with all drivers for devices exposed by its
> device tree?

All drivers.  It's OK to add drivers to existing defconfigs -- though
I'd hesitate to put staging drivers in there, so I guess it can have its
own defconfig as long as it relies on that.

> > Better still would be if we could have address map tweaks be kconfig
> > fragments that get mixed in by the user, with merge_config.sh.
> 
> Personally I would prefer to see something simpler like this:
> 
> %_defconfig: scripts/kconfig/conf
># Grab the platform generic config file (for a SoC family)
>$(Q)$< --defconfig=arch/$(SRCARCH)/configs/mpc$(shell dirname
> $@)_defconfig Kconfig

What is the dirname here trying to do?

> >> +CONFIG_ADVANCED_OPTIONS=y
> >> +CONFIG_LOWMEM_SIZE_BOOL=y
> >> +CONFIG_LOWMEM_SIZE=0x4000
> >> +CONFIG_PAGE_OFFSET_BOOL=y
> >> +CONFIG_PAGE_OFFSET=0x8000
> >> +CONFIG_KERNEL_START_BOOL=y
> >> +CONFIG_TASK_SIZE_BOOL=y
> >> +CONFIG_TASK_SIZE=0x8000
> >
> > > I guess the point here is to avoid using highmem just for the last 256
> > MiB?
> 
> Yes. Can you suggest me a better solution, please?

Not if the performance benefit from getting rid of highmem is worth
carrying this around...  But it would still be good if the board support
were built in the standard defconfig as well.  It's unlikely to get much
build coverage (by people who don't use this board) in a board-specific
defconfig.

> >> +CONFIG_STAGING=y
> >
> > What do you need from staging?
> 
> > CONFIG_VME_USER. It is a staging driver although it doesn't appear in
> > the staging menu.

OK.

-Scott



Re: [V4] powerpc, xmon: Enable HW instruction breakpoint on POWER8

2014-12-01 Thread Anshuman Khandual
On 12/01/2014 11:10 AM, Michael Ellerman wrote:
> On Fri, 2014-28-11 at 04:36:42 UTC, Anshuman Khandual wrote:
>> This patch enables support for hardware instruction breakpoint in
>> xmon on POWER8 platform with the help of a new register called the
>> CIABR (Completed Instruction Address Breakpoint Register). With this
>> patch, a single hardware instruction breakpoint can be added and
>> cleared during any active xmon debug session. The hardware based
>> instruction breakpoint mechanism works correctly with the existing
>> TRAP based instruction breakpoint available on xmon.
>>
>> There are no powerpc CPU with CPU_FTR_IABR feature any more. This
>> patch has re-purposed all the existing IABR related code to work
>> with CIABR register based HW instruction breakpoint.
> 
> OK I think I'm happy with this, I am going to add this to the changelog 
> though:
> 
> This has one odd feature, which is that when we hit a breakpoint xmon
> doesn't tell us we have hit the breakpoint. This is because xmon is
> expecting bp->address == regs->nip. Because CIABR fires on completion
> regs->nip points to the instruction after the breakpoint. We could fix
> that, but it would then confuse other parts of the xmon code which think
> we need to emulate the instruction. [mpe]

Sounds good. Thanks Michael.
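
For the archives, a tiny model of why the hit is not reported. The struct and function names are stand-ins, not the xmon ones, and the completed case assumes a plain 4-byte, non-branching instruction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the xmon breakpoint and the saved registers. */
struct bpt_model  { uint64_t address; };
struct regs_model { uint64_t nip; };

/* The match described in the note: a hit is recognised only if nip == address. */
static bool reports_hit(const struct bpt_model *bp, const struct regs_model *regs)
{
	return bp->address == regs->nip;
}

static void ciabr_example(void)
{
	struct bpt_model bp = { .address = 0x1000 };
	/* Trap-based breakpoint: the exception is taken at the instruction. */
	struct regs_model trap = { .nip = 0x1000 };
	/* CIABR: the instruction has completed, nip has already moved on. */
	struct regs_model completed = { .nip = 0x1004 };

	assert(reports_hit(&bp, &trap));
	assert(!reports_hit(&bp, &completed));
}
```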


Re: Right location in sysfs for dlpar file

2014-12-01 Thread Michael Ellerman
On Mon, 2014-12-01 at 09:41 -0600, Nathan Fontenot wrote:
> On 11/26/2014 09:12 PM, Benjamin Herrenschmidt wrote:
> > Hi Greg,
> > 
> > So Nathan is working on a patch series to cleanup and improve our
> > "DLPAR" infrastructure which is basically our hotplug mechanism when
> > running under the PowerVM (aka pHyp) and KVM hypervisors.
> 
> The cleanup to the dlpar infrastructure will move the entire operation
> of hotplugging a device to the kernel instead of doing it partially in
> userspace and partially in the kernel as is currently done.
> 
...
> 
> For PowerVM systems we need this sysfs file to deliver what is
> essentially a binary blob (specifically a rtas error log) to the
> kernel.

Those two statements don't really agree with each other. ie. "move the entire
operation .. to the kernel", but then we need a sysfs file so userspace can
deliver us a blob?

I think what you mean is that all the actual logic will move into the kernel,
and the only thing userspace will do (on PowerVM) is write the blob to kick off
the process.

On PowerKVM the entire process will be handled in the kernel (after some
additional patches to hook up the rtas event to the hotplug).


As ugly as it is, we already have /proc/rtas, which includes a bunch of files,
including error_log, which is where you can *read* the RTAS error logs from.

So maybe we just extend that, either a new file, or just by making error_log
writable?

It'd be nice to drop all that rtas gunk and move to something cleaner in /sys,
but I don't think we can realistically do that any time soon anyway?

cheers



Re: [PATCH] powerpc/xmon: Cleanup the breakpoint flags

2014-12-01 Thread Anshuman Khandual
On 12/01/2014 11:25 AM, Michael Ellerman wrote:
> Drop BP_IABR_TE, which though used, does not do anything useful. Rename
> BP_IABR to BP_CIABR. Renumber the flags.
> 
> Signed-off-by: Michael Ellerman 
> ---
>  arch/powerpc/xmon/xmon.c | 19 +--
>  1 file changed, 9 insertions(+), 10 deletions(-)
> 
> 
> This is on top of Anshuman's v4 of CIABR breakpoint support.

Looks good to me.


[PATCH v2 1/2] ASoC: fsl_ssi: fix error path in probe

2014-12-01 Thread Jiada Wang
SSI component isn't unregistered if fsl_ssi_debugfs_create() fails
in probe phase.

To fix it, this commit replaces label error_asoc_register with
error_irq.

Signed-off-by: Jiada Wang 
---
 sound/soc/fsl/fsl_ssi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sound/soc/fsl/fsl_ssi.c b/sound/soc/fsl/fsl_ssi.c
index e695517..e19ed39 100644
--- a/sound/soc/fsl/fsl_ssi.c
+++ b/sound/soc/fsl/fsl_ssi.c
@@ -1412,7 +1412,7 @@ static int fsl_ssi_probe(struct platform_device *pdev)
 
ret = fsl_ssi_debugfs_create(&ssi_private->dbg_stats, &pdev->dev);
if (ret)
-   goto error_asoc_register;
+   goto error_irq;
 
/*
 * If codec-handle property is missing from SSI node, we assume
-- 
1.9.3
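
For readers less familiar with goto-based unwinding, the class of bug fixed here can be modelled in a few lines of C. All names below are hypothetical and only mirror the shape of the probe function: jumping to a label that sits too far down the unwind chain skips the cleanup steps above it.

```c
#include <assert.h>

/* Hypothetical resources acquired in order during a probe. */
struct probe_model {
	int clocks_on;
	int component_registered;
};

/*
 * Acquire two resources, then fail in a later step. use_fixed_label selects
 * which unwind label the failure path jumps to.
 */
static int probe_with_failure(struct probe_model *m, int use_fixed_label)
{
	int ret;

	m->clocks_on = 1;
	m->component_registered = 1;

	ret = -1;			/* the later step fails */
	if (use_fixed_label)
		goto err_component;	/* undoes everything acquired so far */
	goto err_clocks;		/* too late in the chain: skips a step */

err_component:
	m->component_registered = 0;
err_clocks:
	m->clocks_on = 0;
	return ret;
}

static void unwind_examples(void)
{
	struct probe_model buggy = { 0, 0 };
	struct probe_model fixed = { 0, 0 };

	assert(probe_with_failure(&buggy, 0) == -1);
	assert(buggy.component_registered == 1);	/* left registered */

	assert(probe_with_failure(&fixed, 1) == -1);
	assert(fixed.component_registered == 0);
	assert(fixed.clocks_on == 0);
}
```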


[PATCH v2 0/2] fsl_ssi misc fix

2014-12-01 Thread Jiada Wang
Hi,

Changes in v2:
- fix error path in probe
- replace irq_of_parse_and_map with platform_get_irq

Changes in v1:
- free IRQ before irq_dispose_mapping

Jiada Wang (2):
  ASoC: fsl_ssi: fix error path in probe
  ASoC: fsl_ssi: use platform_get_irq instead of irq_of_parse_and_map

 sound/soc/fsl/fsl_ssi.c | 13 +++--
 1 file changed, 3 insertions(+), 10 deletions(-)

-- 
1.9.3


[PATCH v2 2/2] ASoC: fsl_ssi: use platform_get_irq instead of irq_of_parse_and_map

2014-12-01 Thread Jiada Wang
Use platform_get_irq as no mapping needs to be done.
By using platform_get_irq, the driver avoids having to dispose of the
IRQ mapping manually when the SSI driver exits.

Signed-off-by: Jiada Wang 
---
 sound/soc/fsl/fsl_ssi.c | 11 ++-
 1 file changed, 2 insertions(+), 9 deletions(-)

diff --git a/sound/soc/fsl/fsl_ssi.c b/sound/soc/fsl/fsl_ssi.c
index e19ed39..a7a9eb8 100644
--- a/sound/soc/fsl/fsl_ssi.c
+++ b/sound/soc/fsl/fsl_ssi.c
@@ -1363,7 +1363,7 @@ static int fsl_ssi_probe(struct platform_device *pdev)
return PTR_ERR(ssi_private->regs);
}
 
-   ssi_private->irq = irq_of_parse_and_map(np, 0);
+   ssi_private->irq = platform_get_irq(pdev, 0);
if (!ssi_private->irq) {
dev_err(&pdev->dev, "no irq for node %s\n", np->full_name);
return -ENXIO;
@@ -1389,7 +1389,7 @@ static int fsl_ssi_probe(struct platform_device *pdev)
if (ssi_private->soc->imx) {
ret = fsl_ssi_imx_probe(pdev, ssi_private, iomem);
if (ret)
-   goto error_irqmap;
+   return ret;
}
 
ret = snd_soc_register_component(&pdev->dev, &fsl_ssi_component,
@@ -1460,10 +1460,6 @@ error_asoc_register:
if (ssi_private->soc->imx)
fsl_ssi_imx_clean(pdev, ssi_private);
 
-error_irqmap:
-   if (ssi_private->use_dma)
-   irq_dispose_mapping(ssi_private->irq);
-
return ret;
 }
 
@@ -1480,9 +1476,6 @@ static int fsl_ssi_remove(struct platform_device *pdev)
if (ssi_private->soc->imx)
fsl_ssi_imx_clean(pdev, ssi_private);
 
-   if (ssi_private->use_dma)
-   irq_dispose_mapping(ssi_private->irq);
-
return 0;
 }
 
-- 
1.9.3
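
One thing that may be worth a second look when reviewing: platform_get_irq() reports failure with a negative errno, whereas irq_of_parse_and_map() returns 0, so the `!ssi_private->irq` test kept in the first hunk may not catch a failure any more. The sketch below only models the two return conventions; fake_get_irq() is a made-up stand-in and nothing here calls the real kernel API.

```c
#include <assert.h>
#include <errno.h>

/* Stand-in for a lookup that returns an IRQ number or a negative errno. */
static int fake_get_irq(int have_irq)
{
	return have_irq ? 42 : -ENXIO;
}

/* Check in the style kept by the hunk: only 0 counts as failure. */
static int check_zero_style(int irq)
{
	return irq ? 0 : -ENXIO;
}

/* Check matching a negative-errno convention. */
static int check_negative_style(int irq)
{
	return irq < 0 ? irq : 0;
}

static void irq_check_examples(void)
{
	int missing = fake_get_irq(0);
	int present = fake_get_irq(1);

	/* A negative error slips past the zero-style check. */
	assert(check_zero_style(missing) == 0);
	assert(check_negative_style(missing) == -ENXIO);
	/* A valid IRQ passes both. */
	assert(check_zero_style(present) == 0);
	assert(check_negative_style(present) == 0);
}
```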


[PATCH V2] powerpc/mm: don't do tlbie for updatepp request with NO HPTE fault

2014-12-01 Thread Aneesh Kumar K.V
updatepp gets called for a nohpte fault when we find from the Linux
page table that the translation was hashed before. In that case
we are sure that there is no existing translation, hence we can
avoid doing tlbie.
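
As a rough user-space sketch of the decision this patch adds to
native_hpte_updatepp(): the two flag values are the ones the patch
defines in mmu-hash64.h, while struct tlbie_decision and
updatepp_tlbie_decision() are hypothetical names used only for
illustration and do not exist in the kernel.

```c
#include <stdbool.h>

/* Flag values as defined by this patch in mmu-hash64.h. */
#define HPTE_LOCAL_UPDATE	0x1
#define HPTE_NOHPTE_UPDATE	0x2

/* Hypothetical helper type, not part of the patch. */
struct tlbie_decision {
	bool flush;	/* issue a tlbie at all? */
	bool local;	/* if flushing, is a local flush sufficient? */
};

/* Hypothetical helper: mirrors the flag tests added to updatepp. */
static struct tlbie_decision updatepp_tlbie_decision(unsigned long flags)
{
	struct tlbie_decision d;

	/* HPTE_LOCAL_UPDATE replaces the old "int local" argument. */
	d.local = (flags & HPTE_LOCAL_UPDATE) != 0;
	/* A nohpte fault means nothing was hashed, so skip the flush. */
	d.flush = !(flags & HPTE_NOHPTE_UPDATE);
	return d;
}
```

With flags == 0 this asks for a global flush, with HPTE_LOCAL_UPDATE
for a local one, and with HPTE_NOHPTE_UPDATE set it asks for no flush
regardless of the local bit.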

Performance numbers:
We use random_access_bench written by Anton.

Kernel with THP disabled and smaller hash page table size.

86.60%  random_access_b  [kernel.kallsyms]    [k] .native_hpte_updatepp
 2.10%  random_access_b  random_access_bench  [.] doit
 1.99%  random_access_b  [kernel.kallsyms]    [k] .do_raw_spin_lock
 1.85%  random_access_b  [kernel.kallsyms]    [k] .native_hpte_insert
 1.26%  random_access_b  [kernel.kallsyms]    [k] .native_flush_hash_range
 1.18%  random_access_b  [kernel.kallsyms]    [k] .__delay
 0.69%  random_access_b  [kernel.kallsyms]    [k] .native_hpte_remove
 0.37%  random_access_b  [kernel.kallsyms]    [k] .clear_user_page
 0.34%  random_access_b  [kernel.kallsyms]    [k] .__hash_page_64K
 0.32%  random_access_b  [kernel.kallsyms]    [k] fast_exception_return
 0.30%  random_access_b  [kernel.kallsyms]    [k] .hash_page_mm

With Fix:

27.54%  random_access_b  random_access_bench  [.] doit
22.90%  random_access_b  [kernel.kallsyms]    [k] .native_hpte_insert
 5.76%  random_access_b  [kernel.kallsyms]    [k] .native_hpte_remove
 5.20%  random_access_b  [kernel.kallsyms]    [k] fast_exception_return
 5.12%  random_access_b  [kernel.kallsyms]    [k] .__hash_page_64K
 4.80%  random_access_b  [kernel.kallsyms]    [k] .hash_page_mm
 3.31%  random_access_b  [kernel.kallsyms]    [k] data_access_common
 1.84%  random_access_b  [kernel.kallsyms]    [k] .trace_hardirqs_on_caller

Signed-off-by: Aneesh Kumar K.V 
---
Changes from V1:
* rebased to next branch of Ben's tree

 arch/powerpc/include/asm/machdep.h        |  2 +-
 arch/powerpc/include/asm/mmu-hash64.h     | 22 ++--
 arch/powerpc/include/asm/tlbflush.h       |  4 +--
 arch/powerpc/kernel/exceptions-64s.S      |  2 ++
 arch/powerpc/mm/hash_low_64.S             | 15 ++-
 arch/powerpc/mm/hash_native_64.c          | 15 ---
 arch/powerpc/mm/hash_utils_64.c           | 44 ---
 arch/powerpc/mm/hugepage-hash64.c         |  8 +++---
 arch/powerpc/mm/hugetlbpage-hash64.c      |  6 ++---
 arch/powerpc/mm/pgtable_64.c              |  7 ++---
 arch/powerpc/platforms/cell/beat_htab.c   |  4 +--
 arch/powerpc/platforms/cell/spu_base.c    |  5 ++--
 arch/powerpc/platforms/cell/spufs/fault.c |  2 +-
 arch/powerpc/platforms/ps3/htab.c         |  2 +-
 arch/powerpc/platforms/pseries/lpar.c     |  2 +-
 drivers/misc/cxl/fault.c                  |  8 --
 16 files changed, 91 insertions(+), 57 deletions(-)

diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h
index e5c0919acca4..c8175a3fe560 100644
--- a/arch/powerpc/include/asm/machdep.h
+++ b/arch/powerpc/include/asm/machdep.h
@@ -42,7 +42,7 @@ struct machdep_calls {
 unsigned long newpp, 
 unsigned long vpn,
 int bpsize, int apsize,
-int ssize, int local);
+int ssize, unsigned long flags);
void(*hpte_updateboltedpp)(unsigned long newpp, 
   unsigned long ea,
   int psize, int ssize);
diff --git a/arch/powerpc/include/asm/mmu-hash64.h b/arch/powerpc/include/asm/mmu-hash64.h
index aeebc94b2bce..4f13c3ed7acf 100644
--- a/arch/powerpc/include/asm/mmu-hash64.h
+++ b/arch/powerpc/include/asm/mmu-hash64.h
@@ -316,27 +316,33 @@ static inline unsigned long hpt_hash(unsigned long vpn,
return hash & 0x7fUL;
 }
 
+#define HPTE_LOCAL_UPDATE  0x1
+#define HPTE_NOHPTE_UPDATE 0x2
+
 extern int __hash_page_4K(unsigned long ea, unsigned long access,
  unsigned long vsid, pte_t *ptep, unsigned long trap,
- unsigned int local, int ssize, int subpage_prot);
+ unsigned long flags, int ssize, int subpage_prot);
 extern int __hash_page_64K(unsigned long ea, unsigned long access,
   unsigned long vsid, pte_t *ptep, unsigned long trap,
-  unsigned int local, int ssize);
+  unsigned long flags, int ssize);
 struct mm_struct;
 unsigned int hash_page_do_lazy_icache(unsigned int pp, pte_t pte, int trap);
-extern int hash_page_mm(struct mm_struct *mm, unsigned long ea, unsigned long access, unsigned long trap);
-extern int hash_page(unsigned lo

Re: [PATCH 2/2] powerpc/mm: don't do tlbie for updatepp request with NO HPTE fault

2014-12-01 Thread Aneesh Kumar K.V
Benjamin Herrenschmidt  writes:

> On Mon, 2014-11-03 at 20:21 +0530, Aneesh Kumar K.V wrote:
>> --- a/arch/powerpc/mm/hash_native_64.c
>> +++ b/arch/powerpc/mm/hash_native_64.c
>> @@ -283,11 +283,11 @@ static long native_hpte_remove(unsigned long hpte_group)
>>  
>>  static long native_hpte_updatepp(unsigned long slot, unsigned long newpp,
>>  unsigned long vpn, int bpsize,
>> -int apsize, int ssize, int local)
>> +int apsize, int ssize, unsigned long flags)
>>  {
>> struct hash_pte *hptep = htab_address + slot;
>> unsigned long hpte_v, want_v;
>> -   int ret = 0;
>> +   int ret = 0, local = 0;
>>  
>> want_v = hpte_encode_avpn(vpn, bpsize, ssize);
>>  
>> @@ -322,8 +322,15 @@ static long native_hpte_updatepp(unsigned long slot, unsigned long newpp,
>> }
>> native_unlock_hpte(hptep);
>> }
>> -   /* Ensure it is out of the tlb too. */
>> -   tlbie(vpn, bpsize, apsize, ssize, local);
>> +
>> +   if (flags & HPTE_LOCAL_UPDATE)
>> +   local = 1;
>> +   /*
>> +* Ensure it is out of the tlb too if it is not a nohpte fault
>> +*/
>> +   if (!(flags & HPTE_NOHPTE_UPDATE))
>> +   tlbie(vpn, bpsize, apsize, ssize, local);
>> +
>> return ret;
>>  }
>
> An additional refinement we discussed that I'd like you to test/measure
> is to basically always be local for updatepp unless we have a flag that
> forces us not to.
>
> That flag would be set by copro faults only.
>
> Can you do something on top of this series ?

Yes. Will try that out.

-aneesh

Re: [PATCH 2/2] powerpc/mm: don't do tlbie for updatepp request with NO HPTE fault

2014-12-01 Thread Aneesh Kumar K.V
Michael Ellerman  writes:

> On Mon, 2014-11-03 at 20:21 +0530, Aneesh Kumar K.V wrote:
>> upatepp get called for a nohpte fault, when we find from the linux
>> page table that the translation was hashed before. In that case
>> we are sure that there is no existing translation, hence we could
>> avoid doing tlbie.
>
> We are sure there *was* no existing translation. It's possible that since the
> nohpte fault occurred the translation has been loaded into the tlb.
>
> Ben says that's OK, because updatepp is only ever relaxing permissions. But
> please add some explanation of that to the changelog - it's not obvious.
>
>> @@ -322,8 +322,15 @@ static long native_hpte_updatepp(unsigned long slot, unsigned long newpp,
>>  }
>>  native_unlock_hpte(hptep);
>>  }
>> -/* Ensure it is out of the tlb too. */
>> -tlbie(vpn, bpsize, apsize, ssize, local);
>> +
>> +if (flags & HPTE_LOCAL_UPDATE)
>> +local = 1;
>> +/*
>> + * Ensure it is out of the tlb too if it is not a nohpte fault
>> + */
>> +if (!(flags & HPTE_NOHPTE_UPDATE))
>> +tlbie(vpn, bpsize, apsize, ssize, local);
>> +
>>  return ret;
>>  }
>
> The context preceding this hunk includes this comment:
>
>   /*
>* We need to invalidate the TLB always because hpte_remove doesn't do
>* a tlb invalidate. If a hash bucket gets full, we "evict" a more/less
>* random entry from it. When we do that we don't invalidate the TLB
>* (hpte_remove) because we assume the old translation is still
>* technically "valid".
>*/
>
> Which seems out of sync with the code now.

The comment is still valid. What it explains is that even if we
didn't find a matching hash pte, we still need to do a tlbie. The
comment doesn't cover the nohpte fault details.

-aneesh

Re: [PATCH 03/10] mm: Convert p[te|md]_numa users to p[te|md]_protnone_numa

2014-12-01 Thread Aneesh Kumar K.V
Benjamin Herrenschmidt  writes:

> On Fri, 2014-11-21 at 13:57 +, Mel Gorman wrote:
>> Convert existing users of pte_numa and friends to the new helper. Note
>> that the kernel is broken after this patch is applied until the other
>> page table modifiers are also altered. This patch layout is to make
>> review easier.
>
> Aneesh, the removal of the DSISR_PROTFAULT checks, I wonder if we might
> break something here ... (I know, I asked for them to be removed :-)
>

That is the reason I converted that to a WARN_ON in a later patch.

> IE, we basically bounce all protection checks to the "normal" VMA
> protection checking, so far so good...
>
> But what about the subpage protection stuff ? Will that still work ?
>

I did look at that before. If subpage access is limited, then when we
take a fault for that subpage we bail out early in hash_page_mm (with
rc = 2). low_hash_fault handles that case directly, so we will not end
up calling do_page_fault.

Now, hash_preload can possibly insert an hpte in the hash page table even
if the access is not allowed by the pte permissions. But I guess even that
is OK, because we will fault again and end up calling hash_page_mm, where
we handle that part correctly.

-aneesh

Re: [PATCH 03/10] mm: Convert p[te|md]_numa users to p[te|md]_protnone_numa

2014-12-01 Thread Aneesh Kumar K.V
Benjamin Herrenschmidt  writes:

> On Fri, 2014-11-21 at 13:57 +, Mel Gorman wrote:
>> void set_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep,
>> pte_t pte)
>>  {
>> -#ifdef CONFIG_DEBUG_VM
>> -   WARN_ON(pte_val(*ptep) & _PAGE_PRESENT);
>> -#endif
>> +   /*
>> +* When handling numa faults, we already have the pte marked
>> +* _PAGE_PRESENT, but we can be sure that it is not in hpte.
>> +* Hence we can use set_pte_at for them.
>> +*/
>> +   VM_WARN_ON((pte_val(*ptep) & (_PAGE_PRESENT | _PAGE_USER)) ==
>> +   (_PAGE_PRESENT | _PAGE_USER));
>> +
>
> How is that going to fare with set_pte_at() called for kernel pages ?
>

Yes, we won't catch those errors now. But is there any other debug
check I could use to catch wrong usage of set_pte_at?
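
To make the gap concrete, here is a small user-space sketch comparing
the old and new conditions from the quoted hunk. The bit values and the
DEMO_/..._warns names are made up for illustration; the real
_PAGE_PRESENT and _PAGE_USER definitions live in the powerpc pte
headers and are not reproduced here.

```c
#include <stdbool.h>

/* Illustrative bit positions only, not the real powerpc pte bits. */
#define DEMO_PAGE_PRESENT	0x1UL
#define DEMO_PAGE_USER		0x2UL

/* Old check: overwriting any present pte triggers the warning. */
static bool old_check_warns(unsigned long old_pte)
{
	return (old_pte & DEMO_PAGE_PRESENT) != 0;
}

/* New check: warn only when PRESENT and USER are both set. */
static bool new_check_warns(unsigned long old_pte)
{
	unsigned long mask = DEMO_PAGE_PRESENT | DEMO_PAGE_USER;

	return (old_pte & mask) == mask;
}
```

A present pte without the user bit, which is what a kernel mapping
looks like in this model, warns under the old check but passes
silently under the new one.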

-aneesh

[PATCH V6 0/9] Add new powerpc specific ELF core notes

2014-12-01 Thread Anshuman Khandual
This patch series adds five new ELF core note sections which can be
used with the existing ptrace requests PTRACE_GETREGSET/PTRACE_SETREGSET to
access various transactional memory and miscellaneous debug register sets on
the powerpc platform.
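
For readers unfamiliar with the regset interface, a minimal sketch of
how a tracer reads one register set with PTRACE_GETREGSET follows. It
assumes Linux with glibc; read_regset() is a hypothetical helper name,
and the note type passed in would be one of the NT_PPC_* values added
by the patches in this series (their names and numbers are not
reproduced here).

```c
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Hypothetical helper: read one register set from a ptrace-stopped tracee.
 * Returns the number of bytes the kernel filled in, or -1 on error with
 * errno set by ptrace().
 */
static long read_regset(pid_t pid, unsigned int note_type, void *buf, size_t len)
{
	struct iovec iov = { .iov_base = buf, .iov_len = len };

	/* The addr argument carries the ELF note type, data points at the iovec. */
	if (ptrace(PTRACE_GETREGSET, pid, (void *)(unsigned long)note_type, &iov) == -1)
		return -1;

	/* The kernel updates iov_len to the amount of data it returned. */
	return (long)iov.iov_len;
}
```

The same iovec convention applies to PTRACE_SETREGSET, with the buffer
holding the values to write.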

Previous versions:
==================
RFC: https://lkml.org/lkml/2014/4/1/292
V1:  https://lkml.org/lkml/2014/4/2/43
V2:  https://lkml.org/lkml/2014/5/5/88
V3:  https://lkml.org/lkml/2014/5/23/486
V4:  https://lkml.org/lkml/2014/11/11/6
V5:  https://lkml.org/lkml/2014/11/25/134

Changes in V6:
--------------
- Added two git ignore patches for powerpc selftests
- Re-formatted all in-code function definitions in kernel-doc format

Changes in V5:
--------------
- Changed flush_tmregs_to_thread so that it does not take self tracing into account
- Dropped the 3rd patch in the series which had merged two functions
- Fixed one build problem for the misc debug register patch
- Accommodated almost all the review comments from Suka on the 6th patch
- Minor changes to the self test program
- Changed commit messages for some of the patches

Changes in V4:
--------------
- Added one test program into the powerpc selftest bucket in this regard
- Split the 2nd patch in the previous series into four different patches
- Accommodated most of the review comments on the previous patch series
- Added a patch to merge functions __switch_to_tm and tm_reclaim_task

Changes in V3:
--------------
- Added two new error paths in every TM related get/set function, for when
  regset support is not present on the system (ENODEV) or when the process
  does not have any transaction active (ENODATA) in the context
- Installed the active hooks for all the newly added regset core note types

Changes in V2:
--------------
- Removed all the power specific ptrace requests corresponding to new NT_PPC_*
  elf core note types. Now all the register sets can be accessed from ptrace
  through PTRACE_GETREGSET/PTRACE_SETREGSET using the individual NT_PPC_* core
  note type instead
- Fixed a couple of attribute values for REGSET_TM_CGPR register set
- Renamed flush_tmreg_to_thread as flush_tmregs_to_thread
- Fixed 32 bit checkpointed GPR support
- Changed commit messages accordingly

Test Result
-----------
The patch series has been verified with both 32 bit and 64 bit builds of the
test program. The result of the selftest (64 bit build) is shown below.

test: tm_ptrace
tags: git_version:v3.18-rc6-8-ge2aa4ce
===Testing TM based PTRACE Interface===
Testing TM specific SPR:
TFHAR: 10001098
TEXASR: de018c01
TFIAR: c0041858
TM ORIG_MSR: 8005f032
TM CH DSCR: a (PASSED)
TM CH TAR: 14 (PASSED)
TM CH PPR: 8 (PASSED)
Testing TM checkpointed GPR:
TM CH NIP: 10001098
TM CH LINK: 1ea0
TM CH CCR: 24000422
TM CH GPR[0]: 0 (PASSED)
TM CH GPR[1]: 1 (PASSED)
TM CH GPR[2]: 2 (PASSED)
TM CH GPR[3]: 3 (PASSED)
TM CH GPR[4]: 4 (PASSED)
TM CH GPR[5]: 5 (PASSED)
TM CH GPR[6]: 6 (PASSED)
TM CH GPR[7]: 7 (PASSED)
TM CH GPR[8]: 8 (PASSED)
TM CH GPR[9]: 9 (PASSED)
TM CH GPR[10]: a (PASSED)
TM CH GPR[11]: b (PASSED)
TM CH GPR[12]: c (PASSED)
TM CH GPR[13]: d (PASSED)
TM CH GPR[14]: e (PASSED)
TM CH GPR[15]: f (PASSED)
TM CH GPR[16]: 0 (PASSED)
TM CH GPR[17]: 1 (PASSED)
TM CH GPR[18]: 2 (PASSED)
TM CH GPR[19]: 3 (PASSED)
TM CH GPR[20]: 4 (PASSED)
TM CH GPR[21]: 5 (PASSED)
TM CH GPR[22]: 6 (PASSED)
TM CH GPR[23]: 7 (PASSED)
TM CH GPR[24]: 8 (PASSED)
TM CH GPR[25]: 9 (PASSED)
TM CH GPR[26]: a (PASSED)
TM CH GPR[27]: b (PASSED)
TM CH GPR[28]: c (PASSED)
TM CH GPR[29]: d (PASSED)
TM CH GPR[30]: e (PASSED)
TM CH GPR[31]: f (PASSED)
Testing TM checkpointed FPR:
TM CH FPSCR: 0
TM CH FPR[0]: 0 (PASSED)
TM CH FPR[1]: 1 (PASSED)
TM CH FPR[2]: 2 (PASSED)
TM CH FPR[3]: 3 (PASSED)
TM CH FPR[4]: 4 (PASSED)
TM CH FPR[5]: 5 (PASSED)
TM CH FPR[6]: 6 (PASSED)
TM CH FPR[7]: 7 (PASSED)
TM CH FPR[8]: 8 (PASSED)
TM CH FPR[9]: 9 (PASSED)
TM CH FPR[10]: a (PASSED)
TM CH FPR[11]: b (PASSED)
TM CH FPR[12]: c (PASSED)
TM CH FPR[13]: d (PASSED)
TM CH FPR[14]: e (PASSED)
TM CH FPR[15]: f (PASSED)
TM CH FPR[16]: 0 (PASSED)
TM CH FPR[17]: 1 (PASSED)
TM CH FPR[18]: 2 (PASSED)
TM CH FPR[19]: 3 (PASSED)
TM CH FPR[20]: 4 (PASSED)
TM CH FPR[21]: 5 (PASSED)
TM CH FPR[22]: 6 (PASSED)
TM CH FPR[23]: 7 (PASSED)
TM CH FPR[24]: 8 (PASSED)
TM CH FPR[25]: 9 (PASSED)
TM CH FPR[26]: a (PASSED)
TM CH FPR[27]: b (PASSED)
TM CH FPR[28]: c (PASSED)
TM CH FPR[29]: d (PASSED)
TM CH FPR[30]: e (PASSED)
TM CH FPR[31]: f (PASSED)
Testing TM running GPR:
TM RN NIP: 100011b0
TM RN LINK: 1ea0
TM RN CCR: 4000422
TM RN GPR[0]: f (PASSED)
TM RN GPR[1]: e (PASSED)
TM RN GPR[2]: d (PASSED)
TM RN GPR[3]: c (PASSED)
TM RN GPR[4]: b (PASSED)
TM RN GPR[5]: a (PASSED)
TM RN GPR[6]: 9 (PASSED)
TM RN GPR[7]: 8 (PASSED)
TM RN GPR[8]: 7 (PASSED)
TM RN GPR[9]: 6 (PASSED)
TM RN GPR[10]: 5 (PASSED)
TM RN GPR[11]: 4 (PASSED)
TM RN GPR[12]: 3 (PASSED)
TM RN GPR[13]: 2 (PASSED)
TM RN GPR[14]: 1 (PASSED)
TM RN GPR[15]: 0 (PASSED)
TM RN GPR[16]: f (PASSED)
TM RN GPR[17]: e (PASSED)
TM RN GP

[PATCH V6 2/9] powerpc, process: Add the function flush_tmregs_to_thread

2014-12-01 Thread Anshuman Khandual
This patch creates a function flush_tmregs_to_thread which
will then be used by subsequent patches in this series. The
function checks for self tracing ptrace interface attempts
while in the TM context and logs an appropriate warning message.

Signed-off-by: Anshuman Khandual 
---
 arch/powerpc/include/asm/switch_to.h |  8 
 arch/powerpc/kernel/process.c        | 20 
 2 files changed, 28 insertions(+)

diff --git a/arch/powerpc/include/asm/switch_to.h b/arch/powerpc/include/asm/switch_to.h
index 58abeda..23752a9 100644
--- a/arch/powerpc/include/asm/switch_to.h
+++ b/arch/powerpc/include/asm/switch_to.h
@@ -82,6 +82,14 @@ static inline void flush_spe_to_thread(struct task_struct *t)
 }
 #endif
 
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+extern void flush_tmregs_to_thread(struct task_struct *);
+#else
+static inline void flush_tmregs_to_thread(struct task_struct *t)
+{
+}
+#endif
+
 static inline void clear_task_ebb(struct task_struct *t)
 {
 #ifdef CONFIG_PPC_BOOK3S_64
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index 923cd2d..0013f24 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -745,6 +745,26 @@ void restore_tm_state(struct pt_regs *regs)
 #define __switch_to_tm(prev)
 #endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
 
+#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
+void flush_tmregs_to_thread(struct task_struct *tsk)
+{
+	/*
+	 * Process self tracing is not yet supported through
+	 * ptrace interface. Ptrace generic code should have
+	 * prevented this from happening in the first place.
+	 * Warn once here with the message, if somehow it
+	 * is attempted.
+	 */
+	WARN_ONCE(tsk == current,
+		"Not expecting ptrace on self: TM regs may be incorrect\n");
+
+	/*
+	 * If task is not current, it should have been flushed
+	 * already to its thread_struct during __switch_to().
+	 */
+}
+#endif
+
 struct task_struct *__switch_to(struct task_struct *prev,
struct task_struct *new)
 {
-- 
1.9.3

[PATCH V6 3/9] powerpc, ptrace: Enable fpr_(get/set) for transactional memory

2014-12-01 Thread Anshuman Khandual
This patch enables fpr_get, which gets the running value of all
the FPR registers, and fpr_set, which sets the running value of
all the FPR registers, to accommodate ptrace interface requests
made while a transaction is active.
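
The selection rule described in the kernel-doc comments of this patch
can be sketched in user space as follows. struct demo_thread,
demo_tm_active() and running_fprs() are hypothetical stand-ins for
thread_struct and MSR_TM_ACTIVE(), and the MSR bit positions (33 for
suspended, 34 for transactional) are an assumption stated here rather
than taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Userspace buffer layout from the kernel-doc: 32 FPRs, then FPSCR. */
struct fpr_regset {
	uint64_t fpr[32];
	uint64_t fpscr;
};

/* Simplified, hypothetical stand-in for the relevant thread_struct state. */
struct demo_thread {
	uint64_t msr;
	struct fpr_regset fp_state;	/* running values, or checkpointed if TM is active */
	struct fpr_regset transact_fp;	/* running values while TM is active */
};

/* Assumed MSR transaction-state bits: suspended (33), transactional (34). */
#define DEMO_MSR_TS_MASK	((1ULL << 33) | (1ULL << 34))

static bool demo_tm_active(uint64_t msr)
{
	return (msr & DEMO_MSR_TS_MASK) != 0;
}

/* Pick the copy that holds the current running FPR values. */
static const struct fpr_regset *running_fprs(const struct demo_thread *t)
{
	return demo_tm_active(t->msr) ? &t->transact_fp : &t->fp_state;
}
```

Both fpr_get and fpr_set apply the same rule; they differ only in the
direction of the copy between this state and the user buffer.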

Signed-off-by: Anshuman Khandual 
---
 arch/powerpc/kernel/ptrace.c | 110 ---
 1 file changed, 104 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/kernel/ptrace.c b/arch/powerpc/kernel/ptrace.c
index f21897b..2de3b2c 100644
--- a/arch/powerpc/kernel/ptrace.c
+++ b/arch/powerpc/kernel/ptrace.c
@@ -357,6 +357,36 @@ static int gpr_set(struct task_struct *target, const struct user_regset *regset,
return ret;
 }
 
+
+/**
+ * fpr_get - get FPR registers
+ * @target:The target task.
+ * @regset:The user regset structure.
+ * @pos:   The buffer position.
+ * @count: Number of bytes to copy.
+ * @kbuf:  Kernel buffer to copy from.
+ * @ubuf:  User buffer to copy into.
+ *
+ * When the transaction is active, 'transact_fp' holds the current running
+ * value of all FPR registers and 'fp_state' holds the last checkpointed
+ * value of all FPR registers for the current transaction. When transaction
+ * is not active 'fp_state' holds the current running state of all the FPR
+ * registers. So this function which returns the current running values of
+ * all the FPR registers, needs to know whether any transaction is active
+ * or not. The userspace interface buffer layout is as follows.
+ *
+ * struct data {
+ * u64 fpr[32];
+ * u64 fpscr;
+ * };
+ *
+ * There are two config options CONFIG_VSX and CONFIG_PPC_TRANSACTIONAL_MEM
+ * which determine the final code in this function. All the combinations of
+ * these two config options are possible except the one below as transactional
+ * memory config pulls in CONFIG_VSX automatically.
+ *
+ * !defined(CONFIG_VSX) && defined(CONFIG_PPC_TRANSACTIONAL_MEM)
+ */
 static int fpr_get(struct task_struct *target, const struct user_regset *regset,
   unsigned int pos, unsigned int count,
   void *kbuf, void __user *ubuf)
@@ -367,22 +397,68 @@ static int fpr_get(struct task_struct *target, const struct user_regset *regset,
 #endif
flush_fp_to_thread(target);
 
-#ifdef CONFIG_VSX
+#if defined(CONFIG_VSX) && defined(CONFIG_PPC_TRANSACTIONAL_MEM)
+   /* copy to local buffer then write that out */
+   if (MSR_TM_ACTIVE(target->thread.regs->msr)) {
+   flush_altivec_to_thread(target);
+   flush_tmregs_to_thread(target);
+   for (i = 0; i < 32 ; i++)
+   buf[i] = target->thread.TS_TRANS_FPR(i);
+   buf[32] = target->thread.transact_fp.fpscr;
+   } else {
+   for (i = 0; i < 32 ; i++)
+   buf[i] = target->thread.TS_FPR(i);
+   buf[32] = target->thread.fp_state.fpscr;
+   }
+   return user_regset_copyout(&pos, &count, &kbuf, &ubuf, buf, 0, -1);
+#endif
+
+#if defined(CONFIG_VSX) && !defined(CONFIG_PPC_TRANSACTIONAL_MEM)
/* copy to local buffer then write that out */
for (i = 0; i < 32 ; i++)
buf[i] = target->thread.TS_FPR(i);
buf[32] = target->thread.fp_state.fpscr;
return user_regset_copyout(&pos, &count, &kbuf, &ubuf, buf, 0, -1);
+#endif
 
-#else
+
+#if !defined(CONFIG_VSX) && !defined(CONFIG_PPC_TRANSACTIONAL_MEM)
BUILD_BUG_ON(offsetof(struct thread_fp_state, fpscr) !=
 offsetof(struct thread_fp_state, fpr[32][0]));
-
return user_regset_copyout(&pos, &count, &kbuf, &ubuf,
   &target->thread.fp_state, 0, -1);
 #endif
 }
 
+/**
+ * fpr_set - set FPR registers
+ * @target:The target task.
+ * @regset:The user regset structure.
+ * @pos:   The buffer position.
+ * @count: Number of bytes to copy.
+ * @kbuf:  Kernel buffer to copy into.
+ * @ubuf:  User buffer to copy from.
+ *
+ * When the transaction is active, 'transact_fp' holds the current running
+ * value of all FPR registers and 'fp_state' holds the last checkpointed
+ * value of all FPR registers for the current transaction. When transaction
+ * is not active 'fp_state' holds the current running state of all the FPR
+ * registers. So this function which sets the current running values of
+ * all the FPR registers, needs to know whether any transaction is active
+ * or not. The userspace interface buffer layout is as follows.
+ *
+ * struct data {
+ * u64 fpr[32];
+ * u64 fpscr;
+ * };
+ *
+ * There are two config options CONFIG_VSX and CONFIG_PPC_TRANSACTIONAL_MEM
+ * which determine the final code in this function. All the combinations of
+ * these two config options are possible except the one below as transactional
+ * memory config pulls in CONFIG_VSX automatically.
+ *
+ * !defined(CONFIG_VSX) && defined(CONFIG_PPC_TRANSACTIONAL_MEM)
+ */
 static int fpr_set(struct task_struct *ta